All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net-next v2 0/8] udp: GRO L4 improvements
@ 2021-03-25 17:23 Paolo Abeni
  2021-03-25 17:24 ` [PATCH net-next v2 1/8] udp: fixup csum for GSO receive slow path Paolo Abeni
                   ` (7 more replies)
  0 siblings, 8 replies; 24+ messages in thread
From: Paolo Abeni @ 2021-03-25 17:23 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Jakub Kicinski, Steffen Klassert,
	Willem de Bruijn, Alexander Lobakin

This series improves the UDP L4 - either 'forward' or 'frag_list' -
co-existence with UDP tunnel GRO, allowing the first to take place
correctly even for encapsulated UDP traffic.

The first for patches are mostly bugfixes, addressing some GRO 
edge-cases when both tunnels and L4 are present, enabled and in use.

The next 3 patches avoid unneeded segmentation when UDP GRO
traffic traverses in the receive path UDP tunnels.

Finally, some self-tests are included, covering the relevant
GRO scenarios.

Even if most patches are actually bugfixes, this series is
targeting net-next, as overall it makes available a new feature.

v1 -> v2:
 - restrict post segmentation csum fixup to the only the relevant pkts
 - use individual 'accept_gso_type' fields instead of whole gso bitmask
   (Willem)
 - use only ipv6 addesses from test range in self-tests (Willem)
 - hopefully clarified most individual patches commit messages

Paolo Abeni (8):
  udp: fixup csum for GSO receive slow path
  udp: skip L4 aggregation for UDP tunnel packets
  udp: properly complete L4 GRO over UDP tunnel packet
  udp: never accept GSO_FRAGLIST packets
  vxlan: allow L4 GRO passthrough
  geneve: allow UDP L4 GRO passthrou
  bareudp: allow UDP L4 GRO passthrou
  selftests: net: add UDP GRO forwarding self-tests

 drivers/net/bareudp.c                     |   1 +
 drivers/net/geneve.c                      |   1 +
 drivers/net/vxlan.c                       |   1 +
 include/linux/udp.h                       |  22 +-
 include/net/udp.h                         |  18 ++
 net/ipv4/udp.c                            |   5 +
 net/ipv4/udp_offload.c                    |  27 ++-
 net/ipv6/udp.c                            |   1 +
 net/ipv6/udp_offload.c                    |   3 +-
 tools/testing/selftests/net/Makefile      |   1 +
 tools/testing/selftests/net/udpgro_fwd.sh | 251 ++++++++++++++++++++++
 11 files changed, 318 insertions(+), 13 deletions(-)
 create mode 100755 tools/testing/selftests/net/udpgro_fwd.sh

-- 
2.26.2


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH net-next v2 1/8] udp: fixup csum for GSO receive slow path
  2021-03-25 17:23 [PATCH net-next v2 0/8] udp: GRO L4 improvements Paolo Abeni
@ 2021-03-25 17:24 ` Paolo Abeni
  2021-03-26 18:30   ` Willem de Bruijn
  2021-03-25 17:24 ` [PATCH net-next v2 2/8] udp: skip L4 aggregation for UDP tunnel packets Paolo Abeni
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 24+ messages in thread
From: Paolo Abeni @ 2021-03-25 17:24 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Jakub Kicinski, Steffen Klassert,
	Willem de Bruijn, Alexander Lobakin

When UDP packets generated locally by a socket with UDP_SEGMENT
traverse the following path:

UDP tunnel(xmit) -> veth (segmentation) -> veth (gro) ->
	UDP tunnel (rx) -> UDP socket (no UDP_GRO)

they are segmented as part of the rx socket receive operation, and
present a CHECKSUM_NONE after segmentation.

Additionally the segmented packets UDP CB still refers to the original
GSO packet len. Overall that causes unexpected/wrong csum validation
errors later in the UDP receive path.

We could possibly address the issue with some additional checks and
csum mangling in the UDP tunnel code. Since the issue affects only
this UDP receive slow path, let's set a suitable csum status there.

v1 -> v2:
 - restrict the csum update to the packets strictly needing them
 - hopefully clarify the commit message and code comments

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 include/net/udp.h | 18 ++++++++++++++++++
 net/ipv4/udp.c    |  2 ++
 net/ipv6/udp.c    |  1 +
 3 files changed, 21 insertions(+)

diff --git a/include/net/udp.h b/include/net/udp.h
index d4d064c592328..7fc735919f4df 100644
--- a/include/net/udp.h
+++ b/include/net/udp.h
@@ -515,6 +515,24 @@ static inline struct sk_buff *udp_rcv_segment(struct sock *sk,
 	return segs;
 }
 
+static inline void udp_post_segment_fix_csum(struct sk_buff *skb)
+{
+	/* UDP-lite can't land here - no GRO */
+	WARN_ON_ONCE(UDP_SKB_CB(skb)->partial_cov);
+
+	/* UDP packets generated with UDP_SEGMENT and traversing:
+	 * UDP tunnel(xmit) -> veth (segmentation) -> veth (gro) -> UDP tunnel (rx)
+	 * land here with CHECKSUM_NONE. Instead of adding another check
+	 * in the tunnel fastpath, we can force valid csums here:
+	 * packets are locally generated and the GRO engine already validated
+	 * the csum.
+	 * Additionally fixup the UDP CB
+	 */
+	UDP_SKB_CB(skb)->cscov = skb->len;
+	if (skb->ip_summed == CHECKSUM_NONE && !skb->csum_valid)
+		skb->csum_valid = 1;
+}
+
 #ifdef CONFIG_BPF_SYSCALL
 struct sk_psock;
 struct proto *udp_bpf_get_proto(struct sock *sk, struct sk_psock *psock);
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 4a0478b17243a..fe85dcf8c0087 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -2178,6 +2178,8 @@ static int udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
 	segs = udp_rcv_segment(sk, skb, true);
 	skb_list_walk_safe(segs, skb, next) {
 		__skb_pull(skb, skb_transport_offset(skb));
+
+		udp_post_segment_fix_csum(skb);
 		ret = udp_queue_rcv_one_skb(sk, skb);
 		if (ret > 0)
 			ip_protocol_deliver_rcu(dev_net(skb->dev), skb, ret);
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index d25e5a9252fdb..fa2f547383925 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -749,6 +749,7 @@ static int udpv6_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
 	skb_list_walk_safe(segs, skb, next) {
 		__skb_pull(skb, skb_transport_offset(skb));
 
+		udp_post_segment_fix_csum(skb);
 		ret = udpv6_queue_rcv_one_skb(sk, skb);
 		if (ret > 0)
 			ip6_protocol_deliver_rcu(dev_net(skb->dev), skb, ret,
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH net-next v2 2/8] udp: skip L4 aggregation for UDP tunnel packets
  2021-03-25 17:23 [PATCH net-next v2 0/8] udp: GRO L4 improvements Paolo Abeni
  2021-03-25 17:24 ` [PATCH net-next v2 1/8] udp: fixup csum for GSO receive slow path Paolo Abeni
@ 2021-03-25 17:24 ` Paolo Abeni
  2021-03-26 18:23   ` Willem de Bruijn
  2021-03-25 17:24 ` [PATCH net-next v2 3/8] udp: properly complete L4 GRO over UDP tunnel packet Paolo Abeni
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 24+ messages in thread
From: Paolo Abeni @ 2021-03-25 17:24 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Jakub Kicinski, Steffen Klassert,
	Willem de Bruijn, Alexander Lobakin

If NETIF_F_GRO_FRAGLIST or NETIF_F_GRO_UDP_FWD are enabled, and there
are UDP tunnels available in the system, udp_gro_receive() could end-up
doing L4 aggregation (either SKB_GSO_UDP_L4 or SKB_GSO_FRAGLIST) at
the outer UDP tunnel level for packets effectively carrying and UDP
tunnel header.

That could cause inner protocol corruption. If e.g. the relevant
packets carry a vxlan header, different vxlan ids will be ignored/
aggregated to the same GSO packet. Inner headers will be ignored, too,
so that e.g. TCP over vxlan push packets will be held in the GRO
engine till the next flush, etc.

Just skip the SKB_GSO_UDP_L4 and SKB_GSO_FRAGLIST code path if the
current packet could land in a UDP tunnel, and let udp_gro_receive()
do GRO via udp_sk(sk)->gro_receive.

The check implemented in this patch is broader than what is strictly
needed, as the existing UDP tunnel could be e.g. configured on top of
a different device: we could end-up skipping GRO at-all for some packets.

Anyhow, the latter is a very thin corner case and covering it would add
quite a bit of complexity.

v1 -> v2:
 - hopefully clarify the commit message

Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
Fixes: 36707061d6ba ("udp: allow forwarding of plain (non-fraglisted) UDP GRO packets")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 net/ipv4/udp_offload.c | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index c5b4b586570fe..25134a3548e99 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -515,21 +515,24 @@ struct sk_buff *udp_gro_receive(struct list_head *head, struct sk_buff *skb,
 	unsigned int off = skb_gro_offset(skb);
 	int flush = 1;
 
+	/* we can do L4 aggregation only if the packet can't land in a tunnel
+	 * otherwise we could corrupt the inner stream
+	 */
 	NAPI_GRO_CB(skb)->is_flist = 0;
-	if (skb->dev->features & NETIF_F_GRO_FRAGLIST)
-		NAPI_GRO_CB(skb)->is_flist = sk ? !udp_sk(sk)->gro_enabled: 1;
+	if (!sk || !udp_sk(sk)->gro_receive) {
+		if (skb->dev->features & NETIF_F_GRO_FRAGLIST)
+			NAPI_GRO_CB(skb)->is_flist = sk ? !udp_sk(sk)->gro_enabled : 1;
 
-	if ((!sk && (skb->dev->features & NETIF_F_GRO_UDP_FWD)) ||
-	    (sk && udp_sk(sk)->gro_enabled) || NAPI_GRO_CB(skb)->is_flist) {
-		pp = call_gro_receive(udp_gro_receive_segment, head, skb);
+		if ((!sk && (skb->dev->features & NETIF_F_GRO_UDP_FWD)) ||
+		    (sk && udp_sk(sk)->gro_enabled) || NAPI_GRO_CB(skb)->is_flist)
+			pp = call_gro_receive(udp_gro_receive_segment, head, skb);
 		return pp;
 	}
 
-	if (!sk || NAPI_GRO_CB(skb)->encap_mark ||
+	if (NAPI_GRO_CB(skb)->encap_mark ||
 	    (uh->check && skb->ip_summed != CHECKSUM_PARTIAL &&
 	     NAPI_GRO_CB(skb)->csum_cnt == 0 &&
-	     !NAPI_GRO_CB(skb)->csum_valid) ||
-	    !udp_sk(sk)->gro_receive)
+	     !NAPI_GRO_CB(skb)->csum_valid))
 		goto out;
 
 	/* mark that this skb passed once through the tunnel gro layer */
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH net-next v2 3/8] udp: properly complete L4 GRO over UDP tunnel packet
  2021-03-25 17:23 [PATCH net-next v2 0/8] udp: GRO L4 improvements Paolo Abeni
  2021-03-25 17:24 ` [PATCH net-next v2 1/8] udp: fixup csum for GSO receive slow path Paolo Abeni
  2021-03-25 17:24 ` [PATCH net-next v2 2/8] udp: skip L4 aggregation for UDP tunnel packets Paolo Abeni
@ 2021-03-25 17:24 ` Paolo Abeni
  2021-03-26 17:51   ` Willem de Bruijn
  2021-03-25 17:24 ` [PATCH net-next v2 4/8] udp: never accept GSO_FRAGLIST packets Paolo Abeni
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 24+ messages in thread
From: Paolo Abeni @ 2021-03-25 17:24 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Jakub Kicinski, Steffen Klassert,
	Willem de Bruijn, Alexander Lobakin

After the previous patch, the stack can do L4 UDP aggregation
on top of a UDP tunnel.

In such scenario, udp{4,6}_gro_complete will be called twice. This function
will enter its is_flist branch immediately, even though that is only
correct on the second call, as GSO_FRAGLIST is only relevant for the
inner packet.

Instead, we need to try first UDP tunnel-based aggregation, if the GRO
packet requires that.

This patch changes udp{4,6}_gro_complete to skip the frag list processing
when while encap_mark == 1, identifying processing of the outer tunnel
header.
Additionally, clears the field in udp_gro_complete() so that we can enter
the frag list path on the next round, for the inner header.

v1 -> v2:
 - hopefully clarified the commit message

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 net/ipv4/udp_offload.c | 8 +++++++-
 net/ipv6/udp_offload.c | 3 ++-
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index 25134a3548e99..54e06b88af69a 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -642,6 +642,11 @@ int udp_gro_complete(struct sk_buff *skb, int nhoff,
 		skb_shinfo(skb)->gso_type = uh->check ? SKB_GSO_UDP_TUNNEL_CSUM
 					: SKB_GSO_UDP_TUNNEL;
 
+		/* clear the encap mark, so that inner frag_list gro_complete
+		 * can take place
+		 */
+		NAPI_GRO_CB(skb)->encap_mark = 0;
+
 		/* Set encapsulation before calling into inner gro_complete()
 		 * functions to make them set up the inner offsets.
 		 */
@@ -665,7 +670,8 @@ INDIRECT_CALLABLE_SCOPE int udp4_gro_complete(struct sk_buff *skb, int nhoff)
 	const struct iphdr *iph = ip_hdr(skb);
 	struct udphdr *uh = (struct udphdr *)(skb->data + nhoff);
 
-	if (NAPI_GRO_CB(skb)->is_flist) {
+	/* do fraglist only if there is no outer UDP encap (or we already processed it) */
+	if (NAPI_GRO_CB(skb)->is_flist && !NAPI_GRO_CB(skb)->encap_mark) {
 		uh->len = htons(skb->len - nhoff);
 
 		skb_shinfo(skb)->gso_type |= (SKB_GSO_FRAGLIST|SKB_GSO_UDP_L4);
diff --git a/net/ipv6/udp_offload.c b/net/ipv6/udp_offload.c
index faa823c242923..b3d9ed96e5ea5 100644
--- a/net/ipv6/udp_offload.c
+++ b/net/ipv6/udp_offload.c
@@ -163,7 +163,8 @@ INDIRECT_CALLABLE_SCOPE int udp6_gro_complete(struct sk_buff *skb, int nhoff)
 	const struct ipv6hdr *ipv6h = ipv6_hdr(skb);
 	struct udphdr *uh = (struct udphdr *)(skb->data + nhoff);
 
-	if (NAPI_GRO_CB(skb)->is_flist) {
+	/* do fraglist only if there is no outer UDP encap (or we already processed it) */
+	if (NAPI_GRO_CB(skb)->is_flist && !NAPI_GRO_CB(skb)->encap_mark) {
 		uh->len = htons(skb->len - nhoff);
 
 		skb_shinfo(skb)->gso_type |= (SKB_GSO_FRAGLIST|SKB_GSO_UDP_L4);
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH net-next v2 4/8] udp: never accept GSO_FRAGLIST packets
  2021-03-25 17:23 [PATCH net-next v2 0/8] udp: GRO L4 improvements Paolo Abeni
                   ` (2 preceding siblings ...)
  2021-03-25 17:24 ` [PATCH net-next v2 3/8] udp: properly complete L4 GRO over UDP tunnel packet Paolo Abeni
@ 2021-03-25 17:24 ` Paolo Abeni
  2021-03-26 18:15   ` Willem de Bruijn
  2021-03-25 17:24 ` [PATCH net-next v2 5/8] vxlan: allow L4 GRO passthrough Paolo Abeni
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 24+ messages in thread
From: Paolo Abeni @ 2021-03-25 17:24 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Jakub Kicinski, Steffen Klassert,
	Willem de Bruijn, Alexander Lobakin

Currently the UDP protocol delivers GSO_FRAGLIST packets to
the sockets without the expected segmentation.

This change addresses the issue introducing and maintaining
a couple of new fields to explicitly accept SKB_GSO_UDP_L4
or GSO_FRAGLIST packets. Additionally updates  udp_unexpected_gso()
accordingly.

UDP sockets enabling UDP_GRO stil keep accept_udp_fraglist
zeroed.

v1 -> v2:
 - use 2 bits instead of a whole GSO bitmask (Willem)

Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 include/linux/udp.h | 16 +++++++++++++---
 net/ipv4/udp.c      |  3 +++
 2 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/include/linux/udp.h b/include/linux/udp.h
index aa84597bdc33c..ae58ff3b6b5b8 100644
--- a/include/linux/udp.h
+++ b/include/linux/udp.h
@@ -51,7 +51,9 @@ struct udp_sock {
 					   * different encapsulation layer set
 					   * this
 					   */
-			 gro_enabled:1;	/* Can accept GRO packets */
+			 gro_enabled:1,	/* Request GRO aggregation */
+			 accept_udp_l4:1,
+			 accept_udp_fraglist:1;
 	/*
 	 * Following member retains the information to create a UDP header
 	 * when the socket is uncorked.
@@ -131,8 +133,16 @@ static inline void udp_cmsg_recv(struct msghdr *msg, struct sock *sk,
 
 static inline bool udp_unexpected_gso(struct sock *sk, struct sk_buff *skb)
 {
-	return !udp_sk(sk)->gro_enabled && skb_is_gso(skb) &&
-	       skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4;
+	if (!skb_is_gso(skb))
+		return false;
+
+	if (skb_shinfo(skb)->gso_type & SKB_GSO_UDP_L4 && !udp_sk(sk)->accept_udp_l4)
+		return true;
+
+	if (skb_shinfo(skb)->gso_type & SKB_GSO_FRAGLIST && !udp_sk(sk)->accept_udp_fraglist)
+		return true;
+
+	return false;
 }
 
 #define udp_portaddr_for_each_entry(__sk, list) \
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index fe85dcf8c0087..c0695ce42dc53 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -2666,9 +2666,12 @@ int udp_lib_setsockopt(struct sock *sk, int level, int optname,
 
 	case UDP_GRO:
 		lock_sock(sk);
+
+		/* when enabling GRO, accept the related GSO packet type */
 		if (valbool)
 			udp_tunnel_encap_enable(sk->sk_socket);
 		up->gro_enabled = valbool;
+		up->accept_udp_l4 = valbool;
 		release_sock(sk);
 		break;
 
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH net-next v2 5/8] vxlan: allow L4 GRO passthrough
  2021-03-25 17:23 [PATCH net-next v2 0/8] udp: GRO L4 improvements Paolo Abeni
                   ` (3 preceding siblings ...)
  2021-03-25 17:24 ` [PATCH net-next v2 4/8] udp: never accept GSO_FRAGLIST packets Paolo Abeni
@ 2021-03-25 17:24 ` Paolo Abeni
  2021-03-25 17:24 ` [PATCH net-next v2 6/8] geneve: allow UDP L4 GRO passthrou Paolo Abeni
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 24+ messages in thread
From: Paolo Abeni @ 2021-03-25 17:24 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Jakub Kicinski, Steffen Klassert,
	Willem de Bruijn, Alexander Lobakin

When passing up an UDP GSO packet with L4 aggregation, there is
no need to segment it at the vxlan level. We can propagate the
packet untouched and let it be segmented later, if needed.

Introduce an helper to allow let the UDP socket to accept any
L4 aggregation and use it in the vxlan driver.

v1 -> v2:
 - updated to use the newly introduced UDP socket 'accept*' fields

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 drivers/net/vxlan.c | 1 +
 include/linux/udp.h | 6 ++++++
 2 files changed, 7 insertions(+)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 7665817f3cb61..39ee1300cdd9d 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -3484,6 +3484,7 @@ static struct socket *vxlan_create_sock(struct net *net, bool ipv6,
 	if (err < 0)
 		return ERR_PTR(err);
 
+	udp_allow_gso(sock->sk);
 	return sock;
 }
 
diff --git a/include/linux/udp.h b/include/linux/udp.h
index ae58ff3b6b5b8..ae66dadd85434 100644
--- a/include/linux/udp.h
+++ b/include/linux/udp.h
@@ -145,6 +145,12 @@ static inline bool udp_unexpected_gso(struct sock *sk, struct sk_buff *skb)
 	return false;
 }
 
+static inline void udp_allow_gso(struct sock *sk)
+{
+	udp_sk(sk)->accept_udp_l4 = 1;
+	udp_sk(sk)->accept_udp_fraglist = 1;
+}
+
 #define udp_portaddr_for_each_entry(__sk, list) \
 	hlist_for_each_entry(__sk, list, __sk_common.skc_portaddr_node)
 
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH net-next v2 6/8] geneve: allow UDP L4 GRO passthrou
  2021-03-25 17:23 [PATCH net-next v2 0/8] udp: GRO L4 improvements Paolo Abeni
                   ` (4 preceding siblings ...)
  2021-03-25 17:24 ` [PATCH net-next v2 5/8] vxlan: allow L4 GRO passthrough Paolo Abeni
@ 2021-03-25 17:24 ` Paolo Abeni
  2021-03-25 17:24 ` [PATCH net-next v2 7/8] bareudp: " Paolo Abeni
  2021-03-25 17:24 ` [PATCH net-next v2 8/8] selftests: net: add UDP GRO forwarding self-tests Paolo Abeni
  7 siblings, 0 replies; 24+ messages in thread
From: Paolo Abeni @ 2021-03-25 17:24 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Jakub Kicinski, Steffen Klassert,
	Willem de Bruijn, Alexander Lobakin

Similar to the previous commit, let even geneve
passthrou the L4 GRO packets

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 drivers/net/geneve.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
index 4ac0373326efd..5d7a2b1469f4c 100644
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -461,6 +461,7 @@ static struct socket *geneve_create_sock(struct net *net, bool ipv6,
 	if (err < 0)
 		return ERR_PTR(err);
 
+	udp_allow_gso(sock->sk);
 	return sock;
 }
 
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH net-next v2 7/8] bareudp: allow UDP L4 GRO passthrou
  2021-03-25 17:23 [PATCH net-next v2 0/8] udp: GRO L4 improvements Paolo Abeni
                   ` (5 preceding siblings ...)
  2021-03-25 17:24 ` [PATCH net-next v2 6/8] geneve: allow UDP L4 GRO passthrou Paolo Abeni
@ 2021-03-25 17:24 ` Paolo Abeni
  2021-03-25 17:24 ` [PATCH net-next v2 8/8] selftests: net: add UDP GRO forwarding self-tests Paolo Abeni
  7 siblings, 0 replies; 24+ messages in thread
From: Paolo Abeni @ 2021-03-25 17:24 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Jakub Kicinski, Steffen Klassert,
	Willem de Bruijn, Alexander Lobakin

Similar to the previous commit, let even geneve
passthrou the L4 GRO packets

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 drivers/net/bareudp.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/bareudp.c b/drivers/net/bareudp.c
index 7511bca9c15ed..edfad93e7b686 100644
--- a/drivers/net/bareudp.c
+++ b/drivers/net/bareudp.c
@@ -218,6 +218,7 @@ static struct socket *bareudp_create_sock(struct net *net, __be16 port)
 	if (err < 0)
 		return ERR_PTR(err);
 
+	udp_allow_gso(sock->sk);
 	return sock;
 }
 
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH net-next v2 8/8] selftests: net: add UDP GRO forwarding self-tests
  2021-03-25 17:23 [PATCH net-next v2 0/8] udp: GRO L4 improvements Paolo Abeni
                   ` (6 preceding siblings ...)
  2021-03-25 17:24 ` [PATCH net-next v2 7/8] bareudp: " Paolo Abeni
@ 2021-03-25 17:24 ` Paolo Abeni
  7 siblings, 0 replies; 24+ messages in thread
From: Paolo Abeni @ 2021-03-25 17:24 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Jakub Kicinski, Steffen Klassert,
	Willem de Bruijn, Alexander Lobakin

Create a bunch of virtual topologies and verify that
NETIF_F_GRO_FRAGLIST or NETIF_F_GRO_UDP_FWD-enabled
devices aggregate the ingress packets as expected.
Additionally check that the aggregate packets are
segmented correctly when landing on a socket

Also test SKB_GSO_FRAGLIST and SKB_GSO_UDP_L4 aggregation
on top of UDP tunnel (vxlan)

v1 -> v2:
 - hopefully clarify the commit message
 - moved the overlay network ipv6 range into the 'documentation'
   reserved range (Willem)

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 tools/testing/selftests/net/Makefile      |   1 +
 tools/testing/selftests/net/udpgro_fwd.sh | 251 ++++++++++++++++++++++
 2 files changed, 252 insertions(+)
 create mode 100755 tools/testing/selftests/net/udpgro_fwd.sh

diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile
index 25f198bec0b25..2d71b283dde36 100644
--- a/tools/testing/selftests/net/Makefile
+++ b/tools/testing/selftests/net/Makefile
@@ -23,6 +23,7 @@ TEST_PROGS += drop_monitor_tests.sh
 TEST_PROGS += vrf_route_leaking.sh
 TEST_PROGS += bareudp.sh
 TEST_PROGS += unicast_extensions.sh
+TEST_PROGS += udpgro_fwd.sh
 TEST_PROGS_EXTENDED := in_netns.sh
 TEST_GEN_FILES =  socket nettest
 TEST_GEN_FILES += psock_fanout psock_tpacket msg_zerocopy reuseport_addr_any
diff --git a/tools/testing/selftests/net/udpgro_fwd.sh b/tools/testing/selftests/net/udpgro_fwd.sh
new file mode 100755
index 0000000000000..a8fa641362828
--- /dev/null
+++ b/tools/testing/selftests/net/udpgro_fwd.sh
@@ -0,0 +1,251 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+
+readonly BASE="ns-$(mktemp -u XXXXXX)"
+readonly SRC=2
+readonly DST=1
+readonly DST_NAT=100
+readonly NS_SRC=$BASE$SRC
+readonly NS_DST=$BASE$DST
+
+# "baremetal" network used for raw UDP traffic
+readonly BM_NET_V4=192.168.1.
+readonly BM_NET_V6=2001:db8::
+
+# "overlay" network used for UDP over UDP tunnel traffic
+readonly OL_NET_V4=172.16.1.
+readonly OL_NET_V6=2001:db8:1::
+readonly NPROCS=`nproc`
+
+cleanup() {
+	local ns
+	local -r jobs="$(jobs -p)"
+	[ -n "${jobs}" ] && kill -1 ${jobs} 2>/dev/null
+
+	for ns in $NS_SRC $NS_DST; do
+		ip netns del $ns 2>/dev/null
+	done
+}
+
+trap cleanup EXIT
+
+create_ns() {
+	local net
+	local ns
+
+	for ns in $NS_SRC $NS_DST; do
+		ip netns add $ns
+		ip -n $ns link set dev lo up
+	done
+
+	ip link add name veth$SRC type veth peer name veth$DST
+
+	for ns in $SRC $DST; do
+		ip link set dev veth$ns netns $BASE$ns
+		ip -n $BASE$ns link set dev veth$ns up
+		ip -n $BASE$ns addr add dev veth$ns $BM_NET_V4$ns/24
+		ip -n $BASE$ns addr add dev veth$ns $BM_NET_V6$ns/64 nodad
+	done
+	ip -n $NS_DST link set veth$DST xdp object ../bpf/xdp_dummy.o section xdp_dummy 2>/dev/null
+}
+
+create_vxlan_endpoint() {
+	local -r netns=$1
+	local -r bm_dev=$2
+	local -r bm_rem_addr=$3
+	local -r vxlan_dev=$4
+	local -r vxlan_id=$5
+	local -r vxlan_port=4789
+
+	ip -n $netns link set dev $bm_dev up
+	ip -n $netns link add dev $vxlan_dev type vxlan id $vxlan_id \
+				dstport $vxlan_port remote $bm_rem_addr
+	ip -n $netns link set dev $vxlan_dev up
+}
+
+create_vxlan_pair() {
+	local ns
+
+	create_ns
+
+	for ns in $SRC $DST; do
+		# note that 3 - $SRC == $DST and 3 - $DST == $SRC
+		create_vxlan_endpoint $BASE$ns veth$ns $BM_NET_V4$((3 - $ns)) vxlan$ns 4
+		ip -n $BASE$ns addr add dev vxlan$ns $OL_NET_V4$ns/24
+	done
+	for ns in $SRC $DST; do
+		create_vxlan_endpoint $BASE$ns veth$ns $BM_NET_V6$((3 - $ns)) vxlan6$ns 6
+		ip -n $BASE$ns addr add dev vxlan6$ns $OL_NET_V6$ns/24 nodad
+	done
+}
+
+is_ipv6() {
+	if [[ $1 =~ .*:.* ]]; then
+		return 0
+	fi
+	return 1
+}
+
+run_test() {
+	local -r msg=$1
+	local -r dst=$2
+	local -r pkts=$3
+	local -r vxpkts=$4
+	local bind=$5
+	local rx_args=""
+	local rx_family="-4"
+	local family=-4
+	local filter=IpInReceives
+	local ipt=iptables
+
+	printf "%-40s" "$msg"
+
+	if is_ipv6 $dst; then
+		# rx program does not support '-6' and implies ipv6 usage by default
+		rx_family=""
+		family=-6
+		filter=Ip6InReceives
+		ipt=ip6tables
+	fi
+
+	rx_args="$rx_family"
+	[ -n "$bind" ] && rx_args="$rx_args -b $bind"
+
+	# send a single GSO packet, segmented in 10 UDP frames.
+	# Always expect 10 UDP frames on RX side as rx socket does
+	# not enable GRO
+	ip netns exec $NS_DST $ipt -A INPUT -p udp --dport 4789
+	ip netns exec $NS_DST $ipt -A INPUT -p udp --dport 8000
+	ip netns exec $NS_DST ./udpgso_bench_rx -C 1000 -R 10 -n 10 -l 1300 $rx_args &
+	local spid=$!
+	sleep 0.1
+	ip netns exec $NS_SRC ./udpgso_bench_tx $family -M 1 -s 13000 -S 1300 -D $dst
+	local retc=$?
+	wait $spid
+	local rets=$?
+	if [ ${rets} -ne 0 ] || [ ${retc} -ne 0 ]; then
+		echo " fail client exit code $retc, server $rets"
+		ret=1
+		return
+	fi
+
+	local rcv=`ip netns exec $NS_DST $ipt"-save" -c | grep 'dport 8000' | \
+							  sed -e 's/\[//' -e 's/:.*//'`
+	if [ $rcv != $pkts ]; then
+		echo " fail - received $rvs packets, expected $pkts"
+		ret=1
+		return
+	fi
+
+	local vxrcv=`ip netns exec $NS_DST $ipt"-save" -c | grep 'dport 4789' | \
+							    sed -e 's/\[//' -e 's/:.*//'`
+
+	# upper net can generate a little noise, allow some tolerance
+	if [ $vxrcv -lt $vxpkts -o $vxrcv -gt $((vxpkts + 3)) ]; then
+		echo " fail - received $vxrcv vxlan packets, expected $vxpkts"
+		ret=1
+		return
+	fi
+	echo " ok"
+}
+
+run_bench() {
+	local -r msg=$1
+	local -r dst=$2
+	local family=-4
+
+	printf "%-40s" "$msg"
+	if [ $NPROCS -lt 2 ]; then
+		echo " skip - needed 2 CPUs found $NPROCS"
+		return
+	fi
+
+	is_ipv6 $dst && family=-6
+
+	# bind the sender and the receiver to different CPUs to try
+	# get reproducible results
+	ip netns exec $NS_DST bash -c "echo 2 > /sys/class/net/veth$DST/queues/rx-0/rps_cpus"
+	ip netns exec $NS_DST taskset 0x2 ./udpgso_bench_rx -C 1000 -R 10  &
+	local spid=$!
+	sleep 0.1
+	ip netns exec $NS_SRC taskset 0x1 ./udpgso_bench_tx $family -l 3 -S 1300 -D $dst
+	local retc=$?
+	wait $spid
+	local rets=$?
+	if [ ${rets} -ne 0 ] || [ ${retc} -ne 0 ]; then
+		echo " fail client exit code $retc, server $rets"
+		ret=1
+		return
+	fi
+}
+
+for family in 4 6; do
+	BM_NET=$BM_NET_V4
+	OL_NET=$OL_NET_V4
+	IPT=iptables
+	SUFFIX=24
+	VXDEV=vxlan
+
+	if [ $family = 6 ]; then
+		BM_NET=$BM_NET_V6
+		OL_NET=$OL_NET_V6
+		SUFFIX="64 nodad"
+		VXDEV=vxlan6
+		IPT=ip6tables
+	fi
+
+	echo "IPv$family"
+
+	create_ns
+	run_test "No GRO" $BM_NET$DST 10 0
+	cleanup
+
+	create_ns
+	ip netns exec $NS_DST ethtool -K veth$DST rx-gro-list on
+	run_test "GRO frag list" $BM_NET$DST 1 0
+	cleanup
+
+	# UDP GRO fwd skips aggregation when find an udp socket with the GRO option
+	# if there is an UDP tunnel in the running system, such lookup happen
+	# take place.
+	# use NAT to circumvent GRO FWD check
+	create_ns
+	ip -n $NS_DST addr add dev veth$DST $BM_NET$DST_NAT/$SUFFIX
+	ip netns exec $NS_DST ethtool -K veth$DST rx-udp-gro-forwarding on
+	ip netns exec $NS_DST $IPT -t nat -I PREROUTING -d $BM_NET$DST_NAT \
+					-j DNAT --to-destination $BM_NET$DST
+	run_test "GRO fwd" $BM_NET$DST_NAT 1 0 $BM_NET$DST
+	cleanup
+
+	create_ns
+	run_bench "UDP fwd perf" $BM_NET$DST
+	ip netns exec $NS_DST ethtool -K veth$DST rx-udp-gro-forwarding on
+	run_bench "UDP GRO fwd perf" $BM_NET$DST
+	cleanup
+
+	create_vxlan_pair
+	ip netns exec $NS_DST ethtool -K veth$DST rx-gro-list on
+	run_test "GRO frag list over UDP tunnel" $OL_NET$DST 1 1
+	cleanup
+
+	# use NAT to circumvent GRO FWD check
+	create_vxlan_pair
+	ip -n $NS_DST addr add dev $VXDEV$DST $OL_NET$DST_NAT/$SUFFIX
+	ip netns exec $NS_DST ethtool -K veth$DST rx-udp-gro-forwarding on
+	ip netns exec $NS_DST $IPT -t nat -I PREROUTING -d $OL_NET$DST_NAT \
+					-j DNAT --to-destination $OL_NET$DST
+
+	# load arp cache before running the test to reduce the amount of
+	# stray traffic on top of the UDP tunnel
+	ip netns exec $NS_SRC ping -q -c 1 $OL_NET$DST_NAT >/dev/null
+	run_test "GRO fwd over UDP tunnel" $OL_NET$DST_NAT 1 1 $OL_NET$DST
+	cleanup
+
+	create_vxlan_pair
+	run_bench "UDP tunnel fwd perf" $OL_NET$DST
+	ip netns exec $NS_DST ethtool -K veth$DST rx-udp-gro-forwarding on
+	run_bench "UDP tunnel GRO fwd perf" $OL_NET$DST
+	cleanup
+done
+
+exit $ret
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH net-next v2 3/8] udp: properly complete L4 GRO over UDP tunnel packet
  2021-03-25 17:24 ` [PATCH net-next v2 3/8] udp: properly complete L4 GRO over UDP tunnel packet Paolo Abeni
@ 2021-03-26 17:51   ` Willem de Bruijn
  0 siblings, 0 replies; 24+ messages in thread
From: Willem de Bruijn @ 2021-03-26 17:51 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Network Development, David S. Miller, Jakub Kicinski,
	Steffen Klassert, Alexander Lobakin

On Thu, Mar 25, 2021 at 1:24 PM Paolo Abeni <pabeni@redhat.com> wrote:
>
> After the previous patch, the stack can do L4 UDP aggregation
> on top of a UDP tunnel.
>
> In such scenario, udp{4,6}_gro_complete will be called twice. This function
> will enter its is_flist branch immediately, even though that is only
> correct on the second call, as GSO_FRAGLIST is only relevant for the
> inner packet.
>
> Instead, we need to try first UDP tunnel-based aggregation, if the GRO
> packet requires that.
>
> This patch changes udp{4,6}_gro_complete to skip the frag list processing
> when while encap_mark == 1, identifying processing of the outer tunnel
> header.
> Additionally, clears the field in udp_gro_complete() so that we can enter
> the frag list path on the next round, for the inner header.
>
> v1 -> v2:
>  - hopefully clarified the commit message
>
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Reviewed-by: Willem de Bruijn <willemb@google.com>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net-next v2 4/8] udp: never accept GSO_FRAGLIST packets
  2021-03-25 17:24 ` [PATCH net-next v2 4/8] udp: never accept GSO_FRAGLIST packets Paolo Abeni
@ 2021-03-26 18:15   ` Willem de Bruijn
  2021-03-29  8:11     ` Paolo Abeni
  0 siblings, 1 reply; 24+ messages in thread
From: Willem de Bruijn @ 2021-03-26 18:15 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Network Development, David S. Miller, Jakub Kicinski,
	Steffen Klassert, Alexander Lobakin

On Thu, Mar 25, 2021 at 1:24 PM Paolo Abeni <pabeni@redhat.com> wrote:
>
> Currently the UDP protocol delivers GSO_FRAGLIST packets to
> the sockets without the expected segmentation.
>
> This change addresses the issue introducing and maintaining
> a couple of new fields to explicitly accept SKB_GSO_UDP_L4
> or GSO_FRAGLIST packets. Additionally updates  udp_unexpected_gso()
> accordingly.
>
> UDP sockets enabling UDP_GRO stil keep accept_udp_fraglist
> zeroed.
>
> v1 -> v2:
>  - use 2 bits instead of a whole GSO bitmask (Willem)
>
> Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>

This looks good to me in principle, thanks for the revision.

I hadn't fully appreciated that gro_enabled implies accept_udp_l4, but
not necessarily vice versa.

It is equivalent to (accept_udp_l4 && !up->gro_receive), right?

Could the extra bit be avoided with

"
+      /* Prefer fraglist GRO unless target is a socket with UDP_GRO,
+       * which requires all but last segments to be of same gso_size,
passed in cmsg */
        if (skb->dev->features & NETIF_F_GRO_FRAGLIST)
-                NAPI_GRO_CB(skb)->is_flist = sk ? !udp_sk(sk)->gro_enabled: 1;
+               NAPI_GRO_CB(skb)->is_flist = sk ?
(!udp_sk(sk)->gro_enabled || udp_sk(sk)->accept_udp_fraglist) : 1;

+     /* Apply transport layer GRO if forwarding is enabled or the
flow lands at a local socket */
       if ((!sk && (skb->dev->features & NETIF_F_GRO_UDP_FWD)) ||
            (sk && udp_sk(sk)->gro_enabled && !up->encap_rcv) ||
NAPI_GRO_CB(skb)->is_flist) {
                pp = call_gro_receive(udp_gro_receive_segment, head, skb);
                return pp;
        }

+      /* Continue with tunnel GRO */
"

.. not that the extra bit matters a lot. And these two conditions with
gro_enabled are not very obvious.

Just a thought.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net-next v2 2/8] udp: skip L4 aggregation for UDP tunnel packets
  2021-03-25 17:24 ` [PATCH net-next v2 2/8] udp: skip L4 aggregation for UDP tunnel packets Paolo Abeni
@ 2021-03-26 18:23   ` Willem de Bruijn
  0 siblings, 0 replies; 24+ messages in thread
From: Willem de Bruijn @ 2021-03-26 18:23 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Network Development, David S. Miller, Jakub Kicinski,
	Steffen Klassert, Alexander Lobakin

On Thu, Mar 25, 2021 at 1:24 PM Paolo Abeni <pabeni@redhat.com> wrote:
>
> If NETIF_F_GRO_FRAGLIST or NETIF_F_GRO_UDP_FWD are enabled, and there
> are UDP tunnels available in the system, udp_gro_receive() could end-up
> doing L4 aggregation (either SKB_GSO_UDP_L4 or SKB_GSO_FRAGLIST) at
> the outer UDP tunnel level for packets effectively carrying and UDP
> tunnel header.
>
> That could cause inner protocol corruption. If e.g. the relevant
> packets carry a vxlan header, different vxlan ids will be ignored/
> aggregated to the same GSO packet. Inner headers will be ignored, too,
> so that e.g. TCP over vxlan push packets will be held in the GRO
> engine till the next flush, etc.
>
> Just skip the SKB_GSO_UDP_L4 and SKB_GSO_FRAGLIST code path if the
> current packet could land in a UDP tunnel, and let udp_gro_receive()
> do GRO via udp_sk(sk)->gro_receive.
>
> The check implemented in this patch is broader than what is strictly
> needed, as the existing UDP tunnel could be e.g. configured on top of
> a different device: we could end-up skipping GRO at-all for some packets.
>
> Anyhow, the latter is a very thin corner case and covering it would add
> quite a bit of complexity.
>
> v1 -> v2:
>  - hopefully clarify the commit message
>
> Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
> Fixes: 36707061d6ba ("udp: allow forwarding of plain (non-fraglisted) UDP GRO packets")
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Reviewed-by: Willem de Bruijn <willemb@google.com>

Key is that udp tunnel GRO must take precedence over transport GRO,
but the way the code is structured, the latter is tried first.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net-next v2 1/8] udp: fixup csum for GSO receive slow path
  2021-03-25 17:24 ` [PATCH net-next v2 1/8] udp: fixup csum for GSO receive slow path Paolo Abeni
@ 2021-03-26 18:30   ` Willem de Bruijn
  2021-03-29 11:25     ` Paolo Abeni
  0 siblings, 1 reply; 24+ messages in thread
From: Willem de Bruijn @ 2021-03-26 18:30 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Network Development, David S. Miller, Jakub Kicinski,
	Steffen Klassert, Alexander Lobakin

On Thu, Mar 25, 2021 at 1:24 PM Paolo Abeni <pabeni@redhat.com> wrote:
>
> When UDP packets generated locally by a socket with UDP_SEGMENT
> traverse the following path:
>
> UDP tunnel(xmit) -> veth (segmentation) -> veth (gro) ->
>         UDP tunnel (rx) -> UDP socket (no UDP_GRO)
>
> they are segmented as part of the rx socket receive operation, and
> present a CHECKSUM_NONE after segmentation.

would be good to capture how this happens, as it was not immediately obvious.

>
> Additionally the segmented packets UDP CB still refers to the original
> GSO packet len. Overall that causes unexpected/wrong csum validation
> errors later in the UDP receive path.
>
> We could possibly address the issue with some additional checks and
> csum mangling in the UDP tunnel code. Since the issue affects only
> this UDP receive slow path, let's set a suitable csum status there.
>
> v1 -> v2:
>  - restrict the csum update to the packets strictly needing them
>  - hopefully clarify the commit message and code comments
>
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>

> +       if (skb->ip_summed == CHECKSUM_NONE && !skb->csum_valid)
> +               skb->csum_valid = 1;

Not entirely obvious is that UDP packets arriving on a device with rx
checksum offload off, i.e., with CHECKSUM_NONE, are not matched by
this test.

I assume that such packets are not coalesced by the GRO layer in the
first place. But I can't immediately spot the reason for it..

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net-next v2 4/8] udp: never accept GSO_FRAGLIST packets
  2021-03-26 18:15   ` Willem de Bruijn
@ 2021-03-29  8:11     ` Paolo Abeni
  2021-03-29 12:31       ` Willem de Bruijn
  0 siblings, 1 reply; 24+ messages in thread
From: Paolo Abeni @ 2021-03-29  8:11 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Network Development, David S. Miller, Jakub Kicinski,
	Steffen Klassert, Alexander Lobakin

On Fri, 2021-03-26 at 14:15 -0400, Willem de Bruijn wrote:
> On Thu, Mar 25, 2021 at 1:24 PM Paolo Abeni <pabeni@redhat.com> wrote:
> > Currently the UDP protocol delivers GSO_FRAGLIST packets to
> > the sockets without the expected segmentation.
> > 
> > This change addresses the issue introducing and maintaining
> > a couple of new fields to explicitly accept SKB_GSO_UDP_L4
> > or GSO_FRAGLIST packets. Additionally updates  udp_unexpected_gso()
> > accordingly.
> > 
> > UDP sockets enabling UDP_GRO stil keep accept_udp_fraglist
> > zeroed.
> > 
> > v1 -> v2:
> >  - use 2 bits instead of a whole GSO bitmask (Willem)
> > 
> > Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
> > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> 
> This looks good to me in principle, thanks for the revision.
> 
> I hadn't fully appreciated that gro_enabled implies accept_udp_l4, but
> not necessarily vice versa.
> 
> It is equivalent to (accept_udp_l4 && !up->gro_receive), right?

In this series, yes. 

> Could the extra bit be avoided with
> 
> "
> +      /* Prefer fraglist GRO unless target is a socket with UDP_GRO,
> +       * which requires all but last segments to be of same gso_size,
> passed in cmsg */
>         if (skb->dev->features & NETIF_F_GRO_FRAGLIST)
> -                NAPI_GRO_CB(skb)->is_flist = sk ? !udp_sk(sk)->gro_enabled: 1;
> +               NAPI_GRO_CB(skb)->is_flist = sk ?
> (!udp_sk(sk)->gro_enabled || udp_sk(sk)->accept_udp_fraglist) : 1;

This is not ovious at all to me.

> +     /* Apply transport layer GRO if forwarding is enabled or the
> flow lands at a local socket */
>        if ((!sk && (skb->dev->features & NETIF_F_GRO_UDP_FWD)) ||
>             (sk && udp_sk(sk)->gro_enabled && !up->encap_rcv) ||
> NAPI_GRO_CB(skb)->is_flist) {
>                 pp = call_gro_receive(udp_gro_receive_segment, head, skb);
>                 return pp;
>         }
> 
> +      /* Continue with tunnel GRO */
> "
> 
> .. not that the extra bit matters a lot. And these two conditions with
> gro_enabled are not very obvious.
> 
> Just a thought.

Overall looks more complex to me. I would keep the extra bit, unless
you have strong opinion.

Side note: I was wondering about a follow-up to simplify the condition:

	if ((!sk && (skb->dev->features & NETIF_F_GRO_UDP_FWD)) ||
             (sk && udp_sk(sk)->gro_enabled && !up->encap_rcv) || NAPI_GRO_CB(skb)->is_flist) {

Since UDP sockets could process (segmenting as needed) unexpected GSO
packets, we could always do 'NETIF_F_GRO_UDP_FWD', when enabled on the
device level. The above becomes:

	if (skb->dev->features & NETIF_F_GRO_UDP_FWD) ||
            (sk && udp_sk(sk)->gro_enabled && !up->encap_rcv) ||
            NAPI_GRO_CB(skb)->is_flist) {

which is hopefully more clear (and simpler). As said, non for this
series anyhow.

Thanks,

Paolo


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net-next v2 1/8] udp: fixup csum for GSO receive slow path
  2021-03-26 18:30   ` Willem de Bruijn
@ 2021-03-29 11:25     ` Paolo Abeni
  2021-03-29 12:28       ` Willem de Bruijn
  0 siblings, 1 reply; 24+ messages in thread
From: Paolo Abeni @ 2021-03-29 11:25 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Network Development, David S. Miller, Jakub Kicinski,
	Steffen Klassert, Alexander Lobakin

On Fri, 2021-03-26 at 14:30 -0400, Willem de Bruijn wrote:
> On Thu, Mar 25, 2021 at 1:24 PM Paolo Abeni <pabeni@redhat.com> wrote:
> > When UDP packets generated locally by a socket with UDP_SEGMENT
> > traverse the following path:
> > 
> > UDP tunnel(xmit) -> veth (segmentation) -> veth (gro) ->
> >         UDP tunnel (rx) -> UDP socket (no UDP_GRO)
> > 
> > they are segmented as part of the rx socket receive operation, and
> > present a CHECKSUM_NONE after segmentation.
> 
> would be good to capture how this happens, as it was not immediately obvious.

The CHECKSUM_PARTIAL is propagated up to the UDP tunnel processing,
where we have:

	__iptunnel_pull_header() -> skb_pull_rcsum() ->
skb_postpull_rcsum() -> __skb_postpull_rcsum() and the latter do the
conversion.

> > Additionally the segmented packets UDP CB still refers to the original
> > GSO packet len. Overall that causes unexpected/wrong csum validation
> > errors later in the UDP receive path.
> > 
> > We could possibly address the issue with some additional checks and
> > csum mangling in the UDP tunnel code. Since the issue affects only
> > this UDP receive slow path, let's set a suitable csum status there.
> > 
> > v1 -> v2:
> >  - restrict the csum update to the packets strictly needing them
> >  - hopefully clarify the commit message and code comments
> > 
> > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> > +       if (skb->ip_summed == CHECKSUM_NONE && !skb->csum_valid)
> > +               skb->csum_valid = 1;
> 
> Not entirely obvious is that UDP packets arriving on a device with rx
> checksum offload off, i.e., with CHECKSUM_NONE, are not matched by
> this test.
> 
> I assume that such packets are not coalesced by the GRO layer in the
> first place. But I can't immediately spot the reason for it..

Packets with CHECKSUM_NONE are actually aggregated by the GRO engine. 

Their checksum is validated by:

udp4_gro_receive -> skb_gro_checksum_validate_zero_check()
	-> __skb_gro_checksum_validate -> __skb_gro_checksum_validate_complete() 

and skb->ip_summed is changed to CHECKSUM_UNNECESSARY by:

__skb_gro_checksum_validate -> skb_gro_incr_csum_unnecessary
	-> __skb_incr_checksum_unnecessary()

and finally to CHECKSUM_PARTIAL by:

udp4_gro_complete() -> udp_gro_complete() -> udp_gro_complete_segment()

Do you prefer I resubmit with some more comments, either in the commit
message or in the code?

Thanks

Paolo

side note: perf probe here is fooled by skb->ip_summed being a bitfield
and does not dump the real value. I had to look at skb-
>__pkt_type_offset[0] instead.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net-next v2 1/8] udp: fixup csum for GSO receive slow path
  2021-03-29 11:25     ` Paolo Abeni
@ 2021-03-29 12:28       ` Willem de Bruijn
  2021-03-29 13:24         ` Paolo Abeni
  0 siblings, 1 reply; 24+ messages in thread
From: Willem de Bruijn @ 2021-03-29 12:28 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Network Development, David S. Miller, Jakub Kicinski,
	Steffen Klassert, Alexander Lobakin

On Mon, Mar 29, 2021 at 7:26 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On Fri, 2021-03-26 at 14:30 -0400, Willem de Bruijn wrote:
> > On Thu, Mar 25, 2021 at 1:24 PM Paolo Abeni <pabeni@redhat.com> wrote:
> > > When UDP packets generated locally by a socket with UDP_SEGMENT
> > > traverse the following path:
> > >
> > > UDP tunnel(xmit) -> veth (segmentation) -> veth (gro) ->
> > >         UDP tunnel (rx) -> UDP socket (no UDP_GRO)
> > >
> > > they are segmented as part of the rx socket receive operation, and
> > > present a CHECKSUM_NONE after segmentation.
> >
> > would be good to capture how this happens, as it was not immediately obvious.
>
> The CHECKSUM_PARTIAL is propagated up to the UDP tunnel processing,
> where we have:
>
>         __iptunnel_pull_header() -> skb_pull_rcsum() ->
> skb_postpull_rcsum() -> __skb_postpull_rcsum() and the latter do the
> conversion.

Please capture this in the commit message.

> > > Additionally the segmented packets UDP CB still refers to the original
> > > GSO packet len. Overall that causes unexpected/wrong csum validation
> > > errors later in the UDP receive path.
> > >
> > > We could possibly address the issue with some additional checks and
> > > csum mangling in the UDP tunnel code. Since the issue affects only
> > > this UDP receive slow path, let's set a suitable csum status there.
> > >
> > > v1 -> v2:
> > >  - restrict the csum update to the packets strictly needing them
> > >  - hopefully clarify the commit message and code comments
> > >
> > > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> > > +       if (skb->ip_summed == CHECKSUM_NONE && !skb->csum_valid)
> > > +               skb->csum_valid = 1;
> >
> > Not entirely obvious is that UDP packets arriving on a device with rx
> > checksum offload off, i.e., with CHECKSUM_NONE, are not matched by
> > this test.
> >
> > I assume that such packets are not coalesced by the GRO layer in the
> > first place. But I can't immediately spot the reason for it..
>
> Packets with CHECKSUM_NONE are actually aggregated by the GRO engine.
>
> Their checksum is validated by:
>
> udp4_gro_receive -> skb_gro_checksum_validate_zero_check()
>         -> __skb_gro_checksum_validate -> __skb_gro_checksum_validate_complete()
>
> and skb->ip_summed is changed to CHECKSUM_UNNECESSARY by:
>
> __skb_gro_checksum_validate -> skb_gro_incr_csum_unnecessary
>         -> __skb_incr_checksum_unnecessary()
>
> and finally to CHECKSUM_PARTIAL by:
>
> udp4_gro_complete() -> udp_gro_complete() -> udp_gro_complete_segment()
>
> Do you prefer I resubmit with some more comments, either in the commit
> message or in the code?

That breaks the checksum-and-copy optimization when delivering to
local sockets. I wonder if that is a regression.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net-next v2 4/8] udp: never accept GSO_FRAGLIST packets
  2021-03-29  8:11     ` Paolo Abeni
@ 2021-03-29 12:31       ` Willem de Bruijn
  2021-03-29 13:29         ` Paolo Abeni
  0 siblings, 1 reply; 24+ messages in thread
From: Willem de Bruijn @ 2021-03-29 12:31 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Network Development, David S. Miller, Jakub Kicinski,
	Steffen Klassert, Alexander Lobakin

On Mon, Mar 29, 2021 at 4:14 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On Fri, 2021-03-26 at 14:15 -0400, Willem de Bruijn wrote:
> > On Thu, Mar 25, 2021 at 1:24 PM Paolo Abeni <pabeni@redhat.com> wrote:
> > > Currently the UDP protocol delivers GSO_FRAGLIST packets to
> > > the sockets without the expected segmentation.
> > >
> > > This change addresses the issue introducing and maintaining
> > > a couple of new fields to explicitly accept SKB_GSO_UDP_L4
> > > or GSO_FRAGLIST packets. Additionally updates  udp_unexpected_gso()
> > > accordingly.
> > >
> > > UDP sockets enabling UDP_GRO stil keep accept_udp_fraglist
> > > zeroed.
> > >
> > > v1 -> v2:
> > >  - use 2 bits instead of a whole GSO bitmask (Willem)
> > >
> > > Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
> > > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> >
> > This looks good to me in principle, thanks for the revision.
> >
> > I hadn't fully appreciated that gro_enabled implies accept_udp_l4, but
> > not necessarily vice versa.
> >
> > It is equivalent to (accept_udp_l4 && !up->gro_receive), right?
>
> In this series, yes.
>
> > Could the extra bit be avoided with
> >
> > "
> > +      /* Prefer fraglist GRO unless target is a socket with UDP_GRO,
> > +       * which requires all but last segments to be of same gso_size,
> > passed in cmsg */
> >         if (skb->dev->features & NETIF_F_GRO_FRAGLIST)
> > -                NAPI_GRO_CB(skb)->is_flist = sk ? !udp_sk(sk)->gro_enabled: 1;
> > +               NAPI_GRO_CB(skb)->is_flist = sk ?
> > (!udp_sk(sk)->gro_enabled || udp_sk(sk)->accept_udp_fraglist) : 1;
>
> This is not ovious at all to me.
>
> > +     /* Apply transport layer GRO if forwarding is enabled or the
> > flow lands at a local socket */
> >        if ((!sk && (skb->dev->features & NETIF_F_GRO_UDP_FWD)) ||
> >             (sk && udp_sk(sk)->gro_enabled && !up->encap_rcv) ||
> > NAPI_GRO_CB(skb)->is_flist) {
> >                 pp = call_gro_receive(udp_gro_receive_segment, head, skb);
> >                 return pp;
> >         }
> >
> > +      /* Continue with tunnel GRO */
> > "
> >
> > .. not that the extra bit matters a lot. And these two conditions with
> > gro_enabled are not very obvious.
> >
> > Just a thought.
>
> Overall looks more complex to me. I would keep the extra bit, unless
> you have strong opinion.

Sounds good.

> Side note: I was wondering about a follow-up to simplify the condition:
>
>         if ((!sk && (skb->dev->features & NETIF_F_GRO_UDP_FWD)) ||
>              (sk && udp_sk(sk)->gro_enabled && !up->encap_rcv) || NAPI_GRO_CB(skb)->is_flist) {
>
> Since UDP sockets could process (segmenting as needed) unexpected GSO
> packets, we could always do 'NETIF_F_GRO_UDP_FWD', when enabled on the
> device level. The above becomes:
>
>         if (skb->dev->features & NETIF_F_GRO_UDP_FWD) ||
>             (sk && udp_sk(sk)->gro_enabled && !up->encap_rcv) ||
>             NAPI_GRO_CB(skb)->is_flist) {
>
> which is hopefully more clear (and simpler). As said, non for this
> series anyhow.

UDP sockets can segment, but it is expensive. In this case I think the
simplification is not worth the possible regression.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net-next v2 1/8] udp: fixup csum for GSO receive slow path
  2021-03-29 12:28       ` Willem de Bruijn
@ 2021-03-29 13:24         ` Paolo Abeni
  2021-03-29 13:52           ` Willem de Bruijn
  0 siblings, 1 reply; 24+ messages in thread
From: Paolo Abeni @ 2021-03-29 13:24 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Network Development, David S. Miller, Jakub Kicinski,
	Steffen Klassert, Alexander Lobakin

On Mon, 2021-03-29 at 08:28 -0400, Willem de Bruijn wrote:
> On Mon, Mar 29, 2021 at 7:26 AM Paolo Abeni <pabeni@redhat.com> wrote:
> > On Fri, 2021-03-26 at 14:30 -0400, Willem de Bruijn wrote:
> > > On Thu, Mar 25, 2021 at 1:24 PM Paolo Abeni <pabeni@redhat.com> wrote:
> > > > When UDP packets generated locally by a socket with UDP_SEGMENT
> > > > traverse the following path:
> > > > 
> > > > UDP tunnel(xmit) -> veth (segmentation) -> veth (gro) ->
> > > >         UDP tunnel (rx) -> UDP socket (no UDP_GRO)
> > > > 
> > > > they are segmented as part of the rx socket receive operation, and
> > > > present a CHECKSUM_NONE after segmentation.
> > > 
> > > would be good to capture how this happens, as it was not immediately obvious.
> > 
> > The CHECKSUM_PARTIAL is propagated up to the UDP tunnel processing,
> > where we have:
> > 
> >         __iptunnel_pull_header() -> skb_pull_rcsum() ->
> > skb_postpull_rcsum() -> __skb_postpull_rcsum() and the latter do the
> > conversion.
> 
> Please capture this in the commit message.

I will do.

> > > > Additionally the segmented packets UDP CB still refers to the original
> > > > GSO packet len. Overall that causes unexpected/wrong csum validation
> > > > errors later in the UDP receive path.
> > > > 
> > > > We could possibly address the issue with some additional checks and
> > > > csum mangling in the UDP tunnel code. Since the issue affects only
> > > > this UDP receive slow path, let's set a suitable csum status there.
> > > > 
> > > > v1 -> v2:
> > > >  - restrict the csum update to the packets strictly needing them
> > > >  - hopefully clarify the commit message and code comments
> > > > 
> > > > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> > > > +       if (skb->ip_summed == CHECKSUM_NONE && !skb->csum_valid)
> > > > +               skb->csum_valid = 1;
> > > 
> > > Not entirely obvious is that UDP packets arriving on a device with rx
> > > checksum offload off, i.e., with CHECKSUM_NONE, are not matched by
> > > this test.
> > > 
> > > I assume that such packets are not coalesced by the GRO layer in the
> > > first place. But I can't immediately spot the reason for it..
> > 
> > Packets with CHECKSUM_NONE are actually aggregated by the GRO engine.
> > 
> > Their checksum is validated by:
> > 
> > udp4_gro_receive -> skb_gro_checksum_validate_zero_check()
> >         -> __skb_gro_checksum_validate -> __skb_gro_checksum_validate_complete()
> > 
> > and skb->ip_summed is changed to CHECKSUM_UNNECESSARY by:
> > 
> > __skb_gro_checksum_validate -> skb_gro_incr_csum_unnecessary
> >         -> __skb_incr_checksum_unnecessary()
> > 
> > and finally to CHECKSUM_PARTIAL by:
> > 
> > udp4_gro_complete() -> udp_gro_complete() -> udp_gro_complete_segment()
> > 
> > Do you prefer I resubmit with some more comments, either in the commit
> > message or in the code?
> 
> That breaks the checksum-and-copy optimization when delivering to
> local sockets. I wonder if that is a regression.

The conversion to CHECKSUM_UNNECESSARY happens since
commit 573e8fca255a27e3573b51f9b183d62641c47a3d.

Even the conversion to CHECKSUM_PARTIAL happens independently from this
series, since commit 6f1c0ea133a6e4a193a7b285efe209664caeea43.

I don't see a regression here ?!?

Thanks!

Paolo


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net-next v2 4/8] udp: never accept GSO_FRAGLIST packets
  2021-03-29 12:31       ` Willem de Bruijn
@ 2021-03-29 13:29         ` Paolo Abeni
  0 siblings, 0 replies; 24+ messages in thread
From: Paolo Abeni @ 2021-03-29 13:29 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Network Development, David S. Miller, Jakub Kicinski,
	Steffen Klassert, Alexander Lobakin

On Mon, 2021-03-29 at 08:31 -0400, Willem de Bruijn wrote:
> On Mon, Mar 29, 2021 at 4:14 AM Paolo Abeni <pabeni@redhat.com> wrote:
> > On Fri, 2021-03-26 at 14:15 -0400, Willem de Bruijn wrote:
> > > On Thu, Mar 25, 2021 at 1:24 PM Paolo Abeni <pabeni@redhat.com> wrote:
> > > > Currently the UDP protocol delivers GSO_FRAGLIST packets to
> > > > the sockets without the expected segmentation.
> > > > 
> > > > This change addresses the issue introducing and maintaining
> > > > a couple of new fields to explicitly accept SKB_GSO_UDP_L4
> > > > or GSO_FRAGLIST packets. Additionally updates  udp_unexpected_gso()
> > > > accordingly.
> > > > 
> > > > UDP sockets enabling UDP_GRO stil keep accept_udp_fraglist
> > > > zeroed.
> > > > 
> > > > v1 -> v2:
> > > >  - use 2 bits instead of a whole GSO bitmask (Willem)
> > > > 
> > > > Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.")
> > > > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> > > 
> > > This looks good to me in principle, thanks for the revision.
> > > 
> > > I hadn't fully appreciated that gro_enabled implies accept_udp_l4, but
> > > not necessarily vice versa.
> > > 
> > > It is equivalent to (accept_udp_l4 && !up->gro_receive), right?
> > 
> > In this series, yes.
> > 
> > > Could the extra bit be avoided with
> > > 
> > > "
> > > +      /* Prefer fraglist GRO unless target is a socket with UDP_GRO,
> > > +       * which requires all but last segments to be of same gso_size,
> > > passed in cmsg */
> > >         if (skb->dev->features & NETIF_F_GRO_FRAGLIST)
> > > -                NAPI_GRO_CB(skb)->is_flist = sk ? !udp_sk(sk)->gro_enabled: 1;
> > > +               NAPI_GRO_CB(skb)->is_flist = sk ?
> > > (!udp_sk(sk)->gro_enabled || udp_sk(sk)->accept_udp_fraglist) : 1;
> > 
> > This is not ovious at all to me.
> > 
> > > +     /* Apply transport layer GRO if forwarding is enabled or the
> > > flow lands at a local socket */
> > >        if ((!sk && (skb->dev->features & NETIF_F_GRO_UDP_FWD)) ||
> > >             (sk && udp_sk(sk)->gro_enabled && !up->encap_rcv) ||
> > > NAPI_GRO_CB(skb)->is_flist) {
> > >                 pp = call_gro_receive(udp_gro_receive_segment, head, skb);
> > >                 return pp;
> > >         }
> > > 
> > > +      /* Continue with tunnel GRO */
> > > "
> > > 
> > > .. not that the extra bit matters a lot. And these two conditions with
> > > gro_enabled are not very obvious.
> > > 
> > > Just a thought.
> > 
> > Overall looks more complex to me. I would keep the extra bit, unless
> > you have strong opinion.
> 
> Sounds good.
> 
> > Side note: I was wondering about a follow-up to simplify the condition:
> > 
> >         if ((!sk && (skb->dev->features & NETIF_F_GRO_UDP_FWD)) ||
> >              (sk && udp_sk(sk)->gro_enabled && !up->encap_rcv) || NAPI_GRO_CB(skb)->is_flist) {
> > 
> > Since UDP sockets could process (segmenting as needed) unexpected GSO
> > packets, we could always do 'NETIF_F_GRO_UDP_FWD', when enabled on the
> > device level. The above becomes:
> > 
> >         if (skb->dev->features & NETIF_F_GRO_UDP_FWD) ||
> >             (sk && udp_sk(sk)->gro_enabled && !up->encap_rcv) ||
> >             NAPI_GRO_CB(skb)->is_flist) {
> > 
> > which is hopefully more clear (and simpler). As said, non for this
> > series anyhow.
> 
> UDP sockets can segment, but it is expensive. In this case I think the
> simplification is not worth the possible regression.

No strong opinion here, I will not do the thing mentioned above.

Thanks!

Paolo


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net-next v2 1/8] udp: fixup csum for GSO receive slow path
  2021-03-29 13:24         ` Paolo Abeni
@ 2021-03-29 13:52           ` Willem de Bruijn
  2021-03-29 15:00             ` Paolo Abeni
  0 siblings, 1 reply; 24+ messages in thread
From: Willem de Bruijn @ 2021-03-29 13:52 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Network Development, David S. Miller, Jakub Kicinski,
	Steffen Klassert, Alexander Lobakin

> > > > > Additionally the segmented packets UDP CB still refers to the original
> > > > > GSO packet len. Overall that causes unexpected/wrong csum validation
> > > > > errors later in the UDP receive path.
> > > > >
> > > > > We could possibly address the issue with some additional checks and
> > > > > csum mangling in the UDP tunnel code. Since the issue affects only
> > > > > this UDP receive slow path, let's set a suitable csum status there.
> > > > >
> > > > > v1 -> v2:
> > > > >  - restrict the csum update to the packets strictly needing them
> > > > >  - hopefully clarify the commit message and code comments
> > > > >
> > > > > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> > > > > +       if (skb->ip_summed == CHECKSUM_NONE && !skb->csum_valid)
> > > > > +               skb->csum_valid = 1;
> > > >
> > > > Not entirely obvious is that UDP packets arriving on a device with rx
> > > > checksum offload off, i.e., with CHECKSUM_NONE, are not matched by
> > > > this test.
> > > >
> > > > I assume that such packets are not coalesced by the GRO layer in the
> > > > first place. But I can't immediately spot the reason for it..
> > >
> > > Packets with CHECKSUM_NONE are actually aggregated by the GRO engine.
> > >
> > > Their checksum is validated by:
> > >
> > > udp4_gro_receive -> skb_gro_checksum_validate_zero_check()
> > >         -> __skb_gro_checksum_validate -> __skb_gro_checksum_validate_complete()
> > >
> > > and skb->ip_summed is changed to CHECKSUM_UNNECESSARY by:
> > >
> > > __skb_gro_checksum_validate -> skb_gro_incr_csum_unnecessary
> > >         -> __skb_incr_checksum_unnecessary()
> > >
> > > and finally to CHECKSUM_PARTIAL by:
> > >
> > > udp4_gro_complete() -> udp_gro_complete() -> udp_gro_complete_segment()
> > >
> > > Do you prefer I resubmit with some more comments, either in the commit
> > > message or in the code?
> >
> > That breaks the checksum-and-copy optimization when delivering to
> > local sockets. I wonder if that is a regression.
>
> The conversion to CHECKSUM_UNNECESSARY happens since
> commit 573e8fca255a27e3573b51f9b183d62641c47a3d.
>
> Even the conversion to CHECKSUM_PARTIAL happens independently from this
> series, since commit 6f1c0ea133a6e4a193a7b285efe209664caeea43.
>
> I don't see a regression here ?!?

I mean that UDP packets with local destination socket and no tunnels
that arrive with CHECKSUM_NONE normally benefit from the
checksum-and-copy optimization in recvmsg() when copying to user.

If those packets are now checksummed during GRO, that voids that
optimization, and the packet payload is now touched twice.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net-next v2 1/8] udp: fixup csum for GSO receive slow path
  2021-03-29 13:52           ` Willem de Bruijn
@ 2021-03-29 15:00             ` Paolo Abeni
  2021-03-29 15:24               ` Willem de Bruijn
  0 siblings, 1 reply; 24+ messages in thread
From: Paolo Abeni @ 2021-03-29 15:00 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Network Development, David S. Miller, Jakub Kicinski,
	Steffen Klassert, Alexander Lobakin

On Mon, 2021-03-29 at 09:52 -0400, Willem de Bruijn wrote:
> > > That breaks the checksum-and-copy optimization when delivering to
> > > local sockets. I wonder if that is a regression.
> > 
> > The conversion to CHECKSUM_UNNECESSARY happens since
> > commit 573e8fca255a27e3573b51f9b183d62641c47a3d.
> > 
> > Even the conversion to CHECKSUM_PARTIAL happens independently from this
> > series, since commit 6f1c0ea133a6e4a193a7b285efe209664caeea43.
> > 
> > I don't see a regression here ?!?
> 
> I mean that UDP packets with local destination socket and no tunnels
> that arrive with CHECKSUM_NONE normally benefit from the
> checksum-and-copy optimization in recvmsg() when copying to user.
> 
> If those packets are now checksummed during GRO, that voids that
> optimization, and the packet payload is now touched twice.

The 'now' part confuses me. Nothing in this patch or this series
changes the processing of CHECKSUM_NONE UDP packets with no tunnel.

I do see checksum validation in the GRO engine for CHECKSUM_NONE UDP
packet prior to this series.

I *think* the checksum-and-copy optimization is lost
since 573e8fca255a27e3573b51f9b183d62641c47a3d.

Regards,

Paolo


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net-next v2 1/8] udp: fixup csum for GSO receive slow path
  2021-03-29 15:00             ` Paolo Abeni
@ 2021-03-29 15:24               ` Willem de Bruijn
  2021-03-29 16:23                 ` Paolo Abeni
  0 siblings, 1 reply; 24+ messages in thread
From: Willem de Bruijn @ 2021-03-29 15:24 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Network Development, David S. Miller, Jakub Kicinski,
	Steffen Klassert, Alexander Lobakin

On Mon, Mar 29, 2021 at 11:01 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On Mon, 2021-03-29 at 09:52 -0400, Willem de Bruijn wrote:
> > > > That breaks the checksum-and-copy optimization when delivering to
> > > > local sockets. I wonder if that is a regression.
> > >
> > > The conversion to CHECKSUM_UNNECESSARY happens since
> > > commit 573e8fca255a27e3573b51f9b183d62641c47a3d.
> > >
> > > Even the conversion to CHECKSUM_PARTIAL happens independently from this
> > > series, since commit 6f1c0ea133a6e4a193a7b285efe209664caeea43.
> > >
> > > I don't see a regression here ?!?
> >
> > I mean that UDP packets with local destination socket and no tunnels
> > that arrive with CHECKSUM_NONE normally benefit from the
> > checksum-and-copy optimization in recvmsg() when copying to user.
> >
> > If those packets are now checksummed during GRO, that voids that
> > optimization, and the packet payload is now touched twice.
>
> The 'now' part confuses me. Nothing in this patch or this series
> changes the processing of CHECKSUM_NONE UDP packets with no tunnel.

Agreed. I did not mean to imply that this patch changes that. I was
responding to

> > > +       if (skb->ip_summed == CHECKSUM_NONE && !skb->csum_valid)
> > > +               skb->csum_valid = 1;
> >
> > Not entirely obvious is that UDP packets arriving on a device with rx
> > checksum offload off, i.e., with CHECKSUM_NONE, are not matched by
> > this test.
> >
> > I assume that such packets are not coalesced by the GRO layer in the
> > first place. But I can't immediately spot the reason for it..

As you point out, such packets will already have had their checksum
verified at this point, so this branch only matches tunneled packets.
That point is just not immediately obvious from the code.

> I do see checksum validation in the GRO engine for CHECKSUM_NONE UDP
> packet prior to this series.
>
> I *think* the checksum-and-copy optimization is lost
> since 573e8fca255a27e3573b51f9b183d62641c47a3d.

Wouldn't this have been introduced with UDP_GRO?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net-next v2 1/8] udp: fixup csum for GSO receive slow path
  2021-03-29 15:24               ` Willem de Bruijn
@ 2021-03-29 16:23                 ` Paolo Abeni
  2021-03-29 22:37                   ` Willem de Bruijn
  0 siblings, 1 reply; 24+ messages in thread
From: Paolo Abeni @ 2021-03-29 16:23 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Network Development, David S. Miller, Jakub Kicinski,
	Steffen Klassert, Alexander Lobakin

On Mon, 2021-03-29 at 11:24 -0400, Willem de Bruijn wrote:
> On Mon, Mar 29, 2021 at 11:01 AM Paolo Abeni <pabeni@redhat.com> wrote:
> > On Mon, 2021-03-29 at 09:52 -0400, Willem de Bruijn wrote:
> > > > +       if (skb->ip_summed == CHECKSUM_NONE && !skb->csum_valid)
> > > > +               skb->csum_valid = 1;
> > > 
> > > Not entirely obvious is that UDP packets arriving on a device with rx
> > > checksum offload off, i.e., with CHECKSUM_NONE, are not matched by
> > > this test.
> > > 
> > > I assume that such packets are not coalesced by the GRO layer in the
> > > first place. But I can't immediately spot the reason for it..
> 
> As you point out, such packets will already have had their checksum
> verified at this point, so this branch only matches tunneled packets.
> That point is just not immediately obvious from the code.

I understand is a matter of comment clarity ?!?

I'll rewrite the related code comment - in udp_post_segment_fix_csum()
- as:

	/* UDP packets generated with UDP_SEGMENT and traversing:
	 *
         * UDP tunnel(xmit) -> veth (segmentation) -> veth (gro) -> UDP tunnel (rx)
	 * 
         * land here with CHECKSUM_NONE, because __iptunnel_pull_header() converts
         * CHECKSUM_PARTIAL into NONE.
	 * SKB_GSO_UDP_L4 or SKB_GSO_FRAGLIST packets with no UDP tunnel will land
	 * here with valid checksum, as the GRO engine validates the UDP csum
	 * before the aggregation and nobody strips such info in between.
	 * Instead of adding another check in the tunnel fastpath, we can force
	 * a valid csum here.
         * Additionally fixup the UDP CB.
         */

Would that be clear enough?

> > I do see checksum validation in the GRO engine for CHECKSUM_NONE UDP
> > packet prior to this series.
> > 
> > I *think* the checksum-and-copy optimization is lost
> > since 573e8fca255a27e3573b51f9b183d62641c47a3d.
> 
> Wouldn't this have been introduced with UDP_GRO?

Uhmm.... looks like the checksum-and-copy optimization has been lost
and recovered a few times. I think the last one
with 9fd1ff5d2ac7181844735806b0a703c942365291, which move the csum
validation before the static branch on udp_encap_needed_key.

Can we agree re-introducing the optimization is independent from this
series?

Thanks!

Paolo



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net-next v2 1/8] udp: fixup csum for GSO receive slow path
  2021-03-29 16:23                 ` Paolo Abeni
@ 2021-03-29 22:37                   ` Willem de Bruijn
  0 siblings, 0 replies; 24+ messages in thread
From: Willem de Bruijn @ 2021-03-29 22:37 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Network Development, David S. Miller, Jakub Kicinski,
	Steffen Klassert, Alexander Lobakin

On Mon, Mar 29, 2021 at 12:24 PM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On Mon, 2021-03-29 at 11:24 -0400, Willem de Bruijn wrote:
> > On Mon, Mar 29, 2021 at 11:01 AM Paolo Abeni <pabeni@redhat.com> wrote:
> > > On Mon, 2021-03-29 at 09:52 -0400, Willem de Bruijn wrote:
> > > > > +       if (skb->ip_summed == CHECKSUM_NONE && !skb->csum_valid)
> > > > > +               skb->csum_valid = 1;
> > > >
> > > > Not entirely obvious is that UDP packets arriving on a device with rx
> > > > checksum offload off, i.e., with CHECKSUM_NONE, are not matched by
> > > > this test.
> > > >
> > > > I assume that such packets are not coalesced by the GRO layer in the
> > > > first place. But I can't immediately spot the reason for it..
> >
> > As you point out, such packets will already have had their checksum
> > verified at this point, so this branch only matches tunneled packets.
> > That point is just not immediately obvious from the code.
>
> I understand is a matter of comment clarity ?!?
>
> I'll rewrite the related code comment - in udp_post_segment_fix_csum()
> - as:
>
>         /* UDP packets generated with UDP_SEGMENT and traversing:
>          *
>          * UDP tunnel(xmit) -> veth (segmentation) -> veth (gro) -> UDP tunnel (rx)
>          *
>          * land here with CHECKSUM_NONE, because __iptunnel_pull_header() converts
>          * CHECKSUM_PARTIAL into NONE.
>          * SKB_GSO_UDP_L4 or SKB_GSO_FRAGLIST packets with no UDP tunnel will land
>          * here with valid checksum, as the GRO engine validates the UDP csum
>          * before the aggregation and nobody strips such info in between.
>          * Instead of adding another check in the tunnel fastpath, we can force
>          * a valid csum here.
>          * Additionally fixup the UDP CB.
>          */
>
> Would that be clear enough?

Definitely. Thanks!

> > > I do see checksum validation in the GRO engine for CHECKSUM_NONE UDP
> > > packet prior to this series.
> > >
> > > I *think* the checksum-and-copy optimization is lost
> > > since 573e8fca255a27e3573b51f9b183d62641c47a3d.
> >
> > Wouldn't this have been introduced with UDP_GRO?
>
> Uhmm.... looks like the checksum-and-copy optimization has been lost
> and recovered a few times. I think the last one
> with 9fd1ff5d2ac7181844735806b0a703c942365291, which move the csum
> validation before the static branch on udp_encap_needed_key.
>
> Can we agree re-introducing the optimization is independent from this
> series?

Yep :)
> Thanks!
>
> Paolo
>
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2021-03-29 22:38 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-25 17:23 [PATCH net-next v2 0/8] udp: GRO L4 improvements Paolo Abeni
2021-03-25 17:24 ` [PATCH net-next v2 1/8] udp: fixup csum for GSO receive slow path Paolo Abeni
2021-03-26 18:30   ` Willem de Bruijn
2021-03-29 11:25     ` Paolo Abeni
2021-03-29 12:28       ` Willem de Bruijn
2021-03-29 13:24         ` Paolo Abeni
2021-03-29 13:52           ` Willem de Bruijn
2021-03-29 15:00             ` Paolo Abeni
2021-03-29 15:24               ` Willem de Bruijn
2021-03-29 16:23                 ` Paolo Abeni
2021-03-29 22:37                   ` Willem de Bruijn
2021-03-25 17:24 ` [PATCH net-next v2 2/8] udp: skip L4 aggregation for UDP tunnel packets Paolo Abeni
2021-03-26 18:23   ` Willem de Bruijn
2021-03-25 17:24 ` [PATCH net-next v2 3/8] udp: properly complete L4 GRO over UDP tunnel packet Paolo Abeni
2021-03-26 17:51   ` Willem de Bruijn
2021-03-25 17:24 ` [PATCH net-next v2 4/8] udp: never accept GSO_FRAGLIST packets Paolo Abeni
2021-03-26 18:15   ` Willem de Bruijn
2021-03-29  8:11     ` Paolo Abeni
2021-03-29 12:31       ` Willem de Bruijn
2021-03-29 13:29         ` Paolo Abeni
2021-03-25 17:24 ` [PATCH net-next v2 5/8] vxlan: allow L4 GRO passthrough Paolo Abeni
2021-03-25 17:24 ` [PATCH net-next v2 6/8] geneve: allow UDP L4 GRO passthrou Paolo Abeni
2021-03-25 17:24 ` [PATCH net-next v2 7/8] bareudp: " Paolo Abeni
2021-03-25 17:24 ` [PATCH net-next v2 8/8] selftests: net: add UDP GRO forwarding self-tests Paolo Abeni

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.