netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH RFC 0/5] Support fraglist GRO/GSO
@ 2019-09-20  4:49 Steffen Klassert
  2019-09-20  4:49 ` [PATCH RFC 1/5] UDP: enable GRO by default Steffen Klassert
                   ` (4 more replies)
  0 siblings, 5 replies; 18+ messages in thread
From: Steffen Klassert @ 2019-09-20  4:49 UTC (permalink / raw)
  To: netdev
  Cc: Steffen Klassert, Willem de Bruijn, Paolo Abeni,
	Subash Abhinov Kasiviswanathan, Marcelo Ricardo Leitner

This patchset adds support to do GRO/GSO by chaining packets
of the same flow at the SKB frag_list pointer. This avoids
the overhead to merge payloads into one big packet, and
on the other end, if GSO is needed it avoids the overhead
of splitting the big packet back to the native form.

Patch 1 Enables UDP GRO by default.

Patch 2 adds netdev feature flags to enable listifyed GRO,
this implements one of the configuration options discussed
at netconf 2019.

Patch 3 adds a netdev software feature set that defaults to off
and assigns the new listifyed GRO feature flag to it.

Patch 4 adds the core infrastructure to do fraglist GRO/GSO.

Patch 5 enables UDP to use fraglist GRO/GSO if configured and no
GRO supported socket is found.

I have only meaningful forwarding performance measurements.
I did some tests for the local receive path with netperf and iperf,
but in this case the sender that generates the packets is the
bottleneck. So the benchmarks are not that meaningful for the
receive path.

Paolo Abeni did some benchmarks of the local receive path for the v2
version of this pachset, results can be found here:

https://www.spinics.net/lists/netdev/msg551158.html

I used my IPsec forwarding test setup for the performance measurements:

           ------------         ------------
        -->| router 1 |-------->| router 2 |--
        |  ------------         ------------  |
        |                                     |
        |       --------------------          |
        --------|Spirent Testcenter|<----------
                --------------------

net-next (September 7th):

Single stream UDP frame size 1460 Bytes: 1.161.000 fps (13.5 Gbps).

----------------------------------------------------------------------

net-next (September 7th) + standard UDP GRO/GSO:

Single stream UDP frame size 1460 Bytes: 1.801.000 fps (21 Gbps).

----------------------------------------------------------------------

net-next (September 7th) + fraglist UDP GRO/GSO:

Single stream UDP frame size 1460 Bytes: 2.860.000 fps (33.4 Gbps).

-----------------------------------------------------------------------

Changes from v1:

- Add IPv6 support.
- Split patchset to enable UDP GRO by default before adding
  fraglist GRO support.
- Mark fraglist GRO packets as CHECKSUM_NONE.
- Take a refcount on the first segment skb when doing fraglist
  segmentation. With this we can use the same error handling
  path as with standard segmentation.

Changes from v2:

- Add a netdev feature flag to configure listifyed GRO.
- Fix UDP GRO enabling for IPv6.
- Fix a rcu_read_lock() imbalance.
- Fix error path in skb_segment_list().

Changes from v3:

- Rename NETIF_F_GRO_LIST to NETIF_F_GRO_FRAGLIST and add
  NETIF_F_GSO_FRAGLIST.
- Move introduction of SKB_GSO_FRAGLIST to patch 2.
- Use udpv6_encap_needed_key instead of udp_encap_needed_key in IPv6.
- Move some missplaced code from patch 5 to patch 1 where it belongs to.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH RFC 1/5] UDP: enable GRO by default.
  2019-09-20  4:49 [PATCH RFC 0/5] Support fraglist GRO/GSO Steffen Klassert
@ 2019-09-20  4:49 ` Steffen Klassert
  2019-09-23 12:53   ` Willem de Bruijn
  2019-09-20  4:49 ` [PATCH RFC 2/5] net: Add fraglist GRO/GSO feature flags Steffen Klassert
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 18+ messages in thread
From: Steffen Klassert @ 2019-09-20  4:49 UTC (permalink / raw)
  To: netdev
  Cc: Steffen Klassert, Willem de Bruijn, Paolo Abeni,
	Subash Abhinov Kasiviswanathan, Marcelo Ricardo Leitner

This patch enables UDP GRO regardless if a GRO capable
socket is present. With this GRO is done by default
for the local input and forwarding path.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 include/net/udp.h      |  2 +-
 net/ipv4/udp_offload.c | 38 ++++++++++++++++----------------------
 net/ipv6/udp_offload.c | 10 ++++++++--
 3 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/include/net/udp.h b/include/net/udp.h
index bad74f780831..44e0e52b585c 100644
--- a/include/net/udp.h
+++ b/include/net/udp.h
@@ -167,7 +167,7 @@ typedef struct sock *(*udp_lookup_t)(struct sk_buff *skb, __be16 sport,
 				     __be16 dport);
 
 struct sk_buff *udp_gro_receive(struct list_head *head, struct sk_buff *skb,
-				struct udphdr *uh, udp_lookup_t lookup);
+				struct udphdr *uh, struct sock *sk);
 int udp_gro_complete(struct sk_buff *skb, int nhoff, udp_lookup_t lookup);
 
 struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index a3908e55ed89..929b12fc7bc5 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -401,36 +401,25 @@ static struct sk_buff *udp_gro_receive_segment(struct list_head *head,
 	return NULL;
 }
 
-INDIRECT_CALLABLE_DECLARE(struct sock *udp6_lib_lookup_skb(struct sk_buff *skb,
-						   __be16 sport, __be16 dport));
 struct sk_buff *udp_gro_receive(struct list_head *head, struct sk_buff *skb,
-				struct udphdr *uh, udp_lookup_t lookup)
+				struct udphdr *uh, struct sock *sk)
 {
 	struct sk_buff *pp = NULL;
 	struct sk_buff *p;
 	struct udphdr *uh2;
 	unsigned int off = skb_gro_offset(skb);
 	int flush = 1;
-	struct sock *sk;
 
-	rcu_read_lock();
-	sk = INDIRECT_CALL_INET(lookup, udp6_lib_lookup_skb,
-				udp4_lib_lookup_skb, skb, uh->source, uh->dest);
-	if (!sk)
-		goto out_unlock;
-
-	if (udp_sk(sk)->gro_enabled) {
+	if (!sk || !udp_sk(sk)->gro_receive) {
 		pp = call_gro_receive(udp_gro_receive_segment, head, skb);
-		rcu_read_unlock();
 		return pp;
 	}
 
 	if (NAPI_GRO_CB(skb)->encap_mark ||
 	    (skb->ip_summed != CHECKSUM_PARTIAL &&
 	     NAPI_GRO_CB(skb)->csum_cnt == 0 &&
-	     !NAPI_GRO_CB(skb)->csum_valid) ||
-	    !udp_sk(sk)->gro_receive)
-		goto out_unlock;
+	     !NAPI_GRO_CB(skb)->csum_valid))
+		goto out;
 
 	/* mark that this skb passed once through the tunnel gro layer */
 	NAPI_GRO_CB(skb)->encap_mark = 1;
@@ -457,8 +446,7 @@ struct sk_buff *udp_gro_receive(struct list_head *head, struct sk_buff *skb,
 	skb_gro_postpull_rcsum(skb, uh, sizeof(struct udphdr));
 	pp = call_gro_receive_sk(udp_sk(sk)->gro_receive, sk, head, skb);
 
-out_unlock:
-	rcu_read_unlock();
+out:
 	skb_gro_flush_final(skb, pp, flush);
 	return pp;
 }
@@ -468,8 +456,10 @@ INDIRECT_CALLABLE_SCOPE
 struct sk_buff *udp4_gro_receive(struct list_head *head, struct sk_buff *skb)
 {
 	struct udphdr *uh = udp_gro_udphdr(skb);
+	struct sk_buff *pp;
+	struct sock *sk;
 
-	if (unlikely(!uh) || !static_branch_unlikely(&udp_encap_needed_key))
+	if (unlikely(!uh))
 		goto flush;
 
 	/* Don't bother verifying checksum if we're going to flush anyway. */
@@ -484,7 +474,11 @@ struct sk_buff *udp4_gro_receive(struct list_head *head, struct sk_buff *skb)
 					     inet_gro_compute_pseudo);
 skip:
 	NAPI_GRO_CB(skb)->is_ipv6 = 0;
-	return udp_gro_receive(head, skb, uh, udp4_lib_lookup_skb);
+	rcu_read_lock();
+	sk = static_branch_unlikely(&udp_encap_needed_key) ? udp4_lib_lookup_skb(skb, uh->source, uh->dest) : NULL;
+	pp = udp_gro_receive(head, skb, uh, sk);
+	rcu_read_unlock();
+	return pp;
 
 flush:
 	NAPI_GRO_CB(skb)->flush = 1;
@@ -517,9 +511,7 @@ int udp_gro_complete(struct sk_buff *skb, int nhoff,
 	rcu_read_lock();
 	sk = INDIRECT_CALL_INET(lookup, udp6_lib_lookup_skb,
 				udp4_lib_lookup_skb, skb, uh->source, uh->dest);
-	if (sk && udp_sk(sk)->gro_enabled) {
-		err = udp_gro_complete_segment(skb);
-	} else if (sk && udp_sk(sk)->gro_complete) {
+	if (sk && udp_sk(sk)->gro_complete) {
 		skb_shinfo(skb)->gso_type = uh->check ? SKB_GSO_UDP_TUNNEL_CSUM
 					: SKB_GSO_UDP_TUNNEL;
 
@@ -529,6 +521,8 @@ int udp_gro_complete(struct sk_buff *skb, int nhoff,
 		skb->encapsulation = 1;
 		err = udp_sk(sk)->gro_complete(sk, skb,
 				nhoff + sizeof(struct udphdr));
+	} else {
+		err = udp_gro_complete_segment(skb);
 	}
 	rcu_read_unlock();
 
diff --git a/net/ipv6/udp_offload.c b/net/ipv6/udp_offload.c
index 64b8f05d6735..435cfbadb6bd 100644
--- a/net/ipv6/udp_offload.c
+++ b/net/ipv6/udp_offload.c
@@ -115,8 +115,10 @@ INDIRECT_CALLABLE_SCOPE
 struct sk_buff *udp6_gro_receive(struct list_head *head, struct sk_buff *skb)
 {
 	struct udphdr *uh = udp_gro_udphdr(skb);
+	struct sk_buff *pp;
+	struct sock *sk;
 
-	if (unlikely(!uh) || !static_branch_unlikely(&udpv6_encap_needed_key))
+	if (unlikely(!uh))
 		goto flush;
 
 	/* Don't bother verifying checksum if we're going to flush anyway. */
@@ -132,7 +134,11 @@ struct sk_buff *udp6_gro_receive(struct list_head *head, struct sk_buff *skb)
 
 skip:
 	NAPI_GRO_CB(skb)->is_ipv6 = 1;
-	return udp_gro_receive(head, skb, uh, udp6_lib_lookup_skb);
+	rcu_read_lock();
+	sk = static_branch_unlikely(&udpv6_encap_needed_key) ? udp6_lib_lookup_skb(skb, uh->source, uh->dest) : NULL;
+	pp = udp_gro_receive(head, skb, uh, sk);
+	rcu_read_unlock();
+	return pp;
 
 flush:
 	NAPI_GRO_CB(skb)->flush = 1;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH RFC 2/5] net: Add fraglist GRO/GSO feature flags
  2019-09-20  4:49 [PATCH RFC 0/5] Support fraglist GRO/GSO Steffen Klassert
  2019-09-20  4:49 ` [PATCH RFC 1/5] UDP: enable GRO by default Steffen Klassert
@ 2019-09-20  4:49 ` Steffen Klassert
  2019-09-20  4:49 ` [PATCH RFC 3/5] net: Add a netdev software feature set that defaults to off Steffen Klassert
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 18+ messages in thread
From: Steffen Klassert @ 2019-09-20  4:49 UTC (permalink / raw)
  To: netdev
  Cc: Steffen Klassert, Willem de Bruijn, Paolo Abeni,
	Subash Abhinov Kasiviswanathan, Marcelo Ricardo Leitner

This adds new Fraglist GRO/GSO feature flags. They will be used
to configure fraglist GRO/GSO what will be implemented with some
followup paches.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 include/linux/netdev_features.h | 6 +++++-
 include/linux/netdevice.h       | 1 +
 include/linux/skbuff.h          | 2 ++
 net/core/ethtool.c              | 1 +
 4 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/include/linux/netdev_features.h b/include/linux/netdev_features.h
index 4b19c544c59a..b239507da2a0 100644
--- a/include/linux/netdev_features.h
+++ b/include/linux/netdev_features.h
@@ -53,8 +53,9 @@ enum {
 	NETIF_F_GSO_ESP_BIT,		/* ... ESP with TSO */
 	NETIF_F_GSO_UDP_BIT,		/* ... UFO, deprecated except tuntap */
 	NETIF_F_GSO_UDP_L4_BIT,		/* ... UDP payload GSO (not UFO) */
+	NETIF_F_GSO_FRAGLIST_BIT,		/* ... Fraglist GSO */
 	/**/NETIF_F_GSO_LAST =		/* last bit, see GSO_MASK */
-		NETIF_F_GSO_UDP_L4_BIT,
+		NETIF_F_GSO_FRAGLIST_BIT,
 
 	NETIF_F_FCOE_CRC_BIT,		/* FCoE CRC32 */
 	NETIF_F_SCTP_CRC_BIT,		/* SCTP checksum offload */
@@ -80,6 +81,7 @@ enum {
 
 	NETIF_F_GRO_HW_BIT,		/* Hardware Generic receive offload */
 	NETIF_F_HW_TLS_RECORD_BIT,	/* Offload TLS record */
+	NETIF_F_GRO_FRAGLIST_BIT,	/* Fraglist GRO */
 
 	/*
 	 * Add your fresh new feature above and remember to update
@@ -150,6 +152,8 @@ enum {
 #define NETIF_F_GSO_UDP_L4	__NETIF_F(GSO_UDP_L4)
 #define NETIF_F_HW_TLS_TX	__NETIF_F(HW_TLS_TX)
 #define NETIF_F_HW_TLS_RX	__NETIF_F(HW_TLS_RX)
+#define NETIF_F_GRO_FRAGLIST	__NETIF_F(GRO_FRAGLIST)
+#define NETIF_F_GSO_FRAGLIST	__NETIF_F(GSO_FRAGLIST)
 
 /* Finds the next feature with the highest number of the range of start till 0.
  */
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index d7d5626002e9..4917cf513bd1 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -4524,6 +4524,7 @@ static inline bool net_gso_ok(netdev_features_t features, int gso_type)
 	BUILD_BUG_ON(SKB_GSO_ESP != (NETIF_F_GSO_ESP >> NETIF_F_GSO_SHIFT));
 	BUILD_BUG_ON(SKB_GSO_UDP != (NETIF_F_GSO_UDP >> NETIF_F_GSO_SHIFT));
 	BUILD_BUG_ON(SKB_GSO_UDP_L4 != (NETIF_F_GSO_UDP_L4 >> NETIF_F_GSO_SHIFT));
+	BUILD_BUG_ON(SKB_GSO_FRAGLIST != (NETIF_F_GSO_FRAGLIST >> NETIF_F_GSO_SHIFT));
 
 	return (features & feature) == feature;
 }
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 028e684fa974..c72540813ea7 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -595,6 +595,8 @@ enum {
 	SKB_GSO_UDP = 1 << 16,
 
 	SKB_GSO_UDP_L4 = 1 << 17,
+
+	SKB_GSO_FRAGLIST = 1 << 18,
 };
 
 #if BITS_PER_LONG > 32
diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index 6288e69e94fc..2eaf94debbf6 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -111,6 +111,7 @@ static const char netdev_features_strings[NETDEV_FEATURE_COUNT][ETH_GSTRING_LEN]
 	[NETIF_F_HW_TLS_RECORD_BIT] =	"tls-hw-record",
 	[NETIF_F_HW_TLS_TX_BIT] =	 "tls-hw-tx-offload",
 	[NETIF_F_HW_TLS_RX_BIT] =	 "tls-hw-rx-offload",
+	[NETIF_F_GRO_FRAGLIST_BIT] =	 "rx-gro-list",
 };
 
 static const char
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH RFC 3/5] net: Add a netdev software feature set that defaults to off.
  2019-09-20  4:49 [PATCH RFC 0/5] Support fraglist GRO/GSO Steffen Klassert
  2019-09-20  4:49 ` [PATCH RFC 1/5] UDP: enable GRO by default Steffen Klassert
  2019-09-20  4:49 ` [PATCH RFC 2/5] net: Add fraglist GRO/GSO feature flags Steffen Klassert
@ 2019-09-20  4:49 ` Steffen Klassert
  2019-09-23 12:38   ` Willem de Bruijn
  2019-09-20  4:49 ` [PATCH RFC 4/5] net: Support GRO/GSO fraglist chaining Steffen Klassert
  2019-09-20  4:49 ` [PATCH RFC 5/5] udp: Support UDP fraglist GRO/GSO Steffen Klassert
  4 siblings, 1 reply; 18+ messages in thread
From: Steffen Klassert @ 2019-09-20  4:49 UTC (permalink / raw)
  To: netdev
  Cc: Steffen Klassert, Willem de Bruijn, Paolo Abeni,
	Subash Abhinov Kasiviswanathan, Marcelo Ricardo Leitner

The previous patch added the NETIF_F_GRO_FRAGLIST feature.
This is a software feature that should default to off.
Current software features default to on, so add a new
feature set that defaults to off.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 include/linux/netdev_features.h | 3 +++
 net/core/dev.c                  | 2 +-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/linux/netdev_features.h b/include/linux/netdev_features.h
index b239507da2a0..34d050bb1ae6 100644
--- a/include/linux/netdev_features.h
+++ b/include/linux/netdev_features.h
@@ -230,6 +230,9 @@ static inline int find_next_netdev_feature(u64 feature, unsigned long start)
 /* changeable features with no special hardware requirements */
 #define NETIF_F_SOFT_FEATURES	(NETIF_F_GSO | NETIF_F_GRO)
 
+/* Changeable features with no special hardware requirements that defaults to off. */
+#define NETIF_F_SOFT_FEATURES_OFF	NETIF_F_GRO_FRAGLIST
+
 #define NETIF_F_VLAN_FEATURES	(NETIF_F_HW_VLAN_CTAG_FILTER | \
 				 NETIF_F_HW_VLAN_CTAG_RX | \
 				 NETIF_F_HW_VLAN_CTAG_TX | \
diff --git a/net/core/dev.c b/net/core/dev.c
index b1afafee3e2a..cc0bbec0f1d7 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -8730,7 +8730,7 @@ int register_netdevice(struct net_device *dev)
 	/* Transfer changeable features to wanted_features and enable
 	 * software offloads (GSO and GRO).
 	 */
-	dev->hw_features |= NETIF_F_SOFT_FEATURES;
+	dev->hw_features |= (NETIF_F_SOFT_FEATURES | NETIF_F_SOFT_FEATURES_OFF);
 	dev->features |= NETIF_F_SOFT_FEATURES;
 
 	if (dev->netdev_ops->ndo_udp_tunnel_add) {
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH RFC 4/5] net: Support GRO/GSO fraglist chaining.
  2019-09-20  4:49 [PATCH RFC 0/5] Support fraglist GRO/GSO Steffen Klassert
                   ` (2 preceding siblings ...)
  2019-09-20  4:49 ` [PATCH RFC 3/5] net: Add a netdev software feature set that defaults to off Steffen Klassert
@ 2019-09-20  4:49 ` Steffen Klassert
  2019-09-20  4:49 ` [PATCH RFC 5/5] udp: Support UDP fraglist GRO/GSO Steffen Klassert
  4 siblings, 0 replies; 18+ messages in thread
From: Steffen Klassert @ 2019-09-20  4:49 UTC (permalink / raw)
  To: netdev
  Cc: Steffen Klassert, Willem de Bruijn, Paolo Abeni,
	Subash Abhinov Kasiviswanathan, Marcelo Ricardo Leitner

This patch adds the core functions to chain/unchain
GSO skbs at the frag_list pointer. This also adds
a new GSO type SKB_GSO_FRAGLIST and a is_flist
flag to napi_gro_cb which indicates that this
flow will be GROed by fraglist chaining.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 include/linux/netdevice.h |   4 +-
 include/linux/skbuff.h    |   2 +
 net/core/dev.c            |   2 +-
 net/core/skbuff.c         | 106 ++++++++++++++++++++++++++++++++++++++
 4 files changed, 112 insertions(+), 2 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 4917cf513bd1..d037e31a1acb 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2306,7 +2306,8 @@ struct napi_gro_cb {
 	/* Number of gro_receive callbacks this packet already went through */
 	u8 recursion_counter:4;
 
-	/* 1 bit hole */
+	/* GRO is done by frag_list pointer chaining. */
+	u8	is_flist:1;
 
 	/* used to support CHECKSUM_COMPLETE for tunneling protocols */
 	__wsum	csum;
@@ -2656,6 +2657,7 @@ struct net_device *dev_get_by_napi_id(unsigned int napi_id);
 int netdev_get_name(struct net *net, char *name, int ifindex);
 int dev_restart(struct net_device *dev);
 int skb_gro_receive(struct sk_buff *p, struct sk_buff *skb);
+int skb_gro_receive_list(struct sk_buff *p, struct sk_buff *skb);
 
 static inline unsigned int skb_gro_offset(const struct sk_buff *skb)
 {
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index c72540813ea7..3d5fd0a0eea7 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -3511,6 +3511,8 @@ void skb_scrub_packet(struct sk_buff *skb, bool xnet);
 bool skb_gso_validate_network_len(const struct sk_buff *skb, unsigned int mtu);
 bool skb_gso_validate_mac_len(const struct sk_buff *skb, unsigned int len);
 struct sk_buff *skb_segment(struct sk_buff *skb, netdev_features_t features);
+struct sk_buff *skb_segment_list(struct sk_buff *skb, netdev_features_t features,
+				 unsigned int offset);
 struct sk_buff *skb_vlan_untag(struct sk_buff *skb);
 int skb_ensure_writable(struct sk_buff *skb, int write_len);
 int __skb_vlan_pop(struct sk_buff *skb, u16 *vlan_tci);
diff --git a/net/core/dev.c b/net/core/dev.c
index cc0bbec0f1d7..f2a66198154d 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3109,7 +3109,7 @@ struct sk_buff *__skb_gso_segment(struct sk_buff *skb,
 
 	segs = skb_mac_gso_segment(skb, features);
 
-	if (unlikely(skb_needs_check(skb, tx_path) && !IS_ERR(segs)))
+	if (segs != skb && unlikely(skb_needs_check(skb, tx_path) && !IS_ERR(segs)))
 		skb_warn_bad_offload(skb);
 
 	return segs;
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 2b40b5a9425b..3ff56677a6fb 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -3638,6 +3638,112 @@ static inline skb_frag_t skb_head_frag_to_page_desc(struct sk_buff *frag_skb)
 	return head_frag;
 }
 
+struct sk_buff *skb_segment_list(struct sk_buff *skb,
+				 netdev_features_t features,
+				 unsigned int offset)
+{
+	struct sk_buff *list_skb = skb_shinfo(skb)->frag_list;
+	unsigned int tnl_hlen = skb_tnl_header_len(skb);
+	unsigned int delta_truesize = 0;
+	unsigned int delta_len = 0;
+	struct sk_buff *tail = NULL;
+	struct sk_buff *nskb;
+
+	skb_push(skb, -skb_network_offset(skb) + offset);
+
+	skb_shinfo(skb)->frag_list = NULL;
+
+	do {
+		nskb = list_skb;
+		list_skb = list_skb->next;
+
+		if (!tail)
+			skb->next = nskb;
+		else
+			tail->next = nskb;
+
+		tail = nskb;
+
+		delta_len += nskb->len;
+		delta_truesize += nskb->truesize;
+
+		skb_push(nskb, -skb_network_offset(nskb) + offset);
+
+		if (!secpath_exists(nskb))
+			__skb_ext_copy(nskb, skb);
+
+		memcpy(nskb->cb, skb->cb, sizeof(skb->cb));
+
+		nskb->ip_summed = skb->ip_summed;
+		nskb->csum_valid = skb->csum_valid;
+		nskb->tstamp = skb->tstamp;
+		nskb->dev = skb->dev;
+		nskb->queue_mapping = skb->queue_mapping;
+
+		nskb->mac_len = skb->mac_len;
+		nskb->mac_header = skb->mac_header;
+		nskb->transport_header = skb->transport_header;
+		nskb->network_header = skb->network_header;
+		skb_dst_copy(nskb, skb);
+
+		skb_headers_offset_update(nskb, skb_headroom(nskb) - skb_headroom(skb));
+		skb_copy_from_linear_data_offset(skb, -tnl_hlen,
+						 nskb->data - tnl_hlen,
+						 offset + tnl_hlen);
+
+		if (skb_needs_linearize(nskb, features) &&
+		    __skb_linearize(nskb))
+			goto err_linearize;
+
+	} while (list_skb);
+
+	skb->truesize = skb->truesize - delta_truesize;
+	skb->data_len = skb->data_len - delta_len;
+	skb->len = skb->len - delta_len;
+
+	skb_gso_reset(skb);
+
+	skb->prev = tail;
+
+	if (skb_needs_linearize(skb, features) &&
+	    __skb_linearize(skb))
+		goto err_linearize;
+
+	skb_get(skb);
+
+	return skb;
+
+err_linearize:
+	kfree_skb_list(skb->next);
+	skb->next = NULL;
+	return ERR_PTR(-ENOMEM);
+}
+EXPORT_SYMBOL_GPL(skb_segment_list);
+
+int skb_gro_receive_list(struct sk_buff *p, struct sk_buff *skb)
+{
+	if (unlikely(p->len + skb->len >= 65536))
+		return -E2BIG;
+
+	if (NAPI_GRO_CB(p)->last == p)
+		skb_shinfo(p)->frag_list = skb;
+	else
+		NAPI_GRO_CB(p)->last->next = skb;
+
+	skb_pull(skb, skb_gro_offset(skb));
+
+	NAPI_GRO_CB(p)->last = skb;
+	NAPI_GRO_CB(p)->count++;
+	p->data_len += skb->len;
+	p->truesize += skb->truesize;
+	p->len += skb->len;
+
+	NAPI_GRO_CB(skb)->same_flow = 1;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(skb_gro_receive_list);
+
 /**
  *	skb_segment - Perform protocol segmentation on skb.
  *	@head_skb: buffer to segment
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH RFC 5/5] udp: Support UDP fraglist GRO/GSO.
  2019-09-20  4:49 [PATCH RFC 0/5] Support fraglist GRO/GSO Steffen Klassert
                   ` (3 preceding siblings ...)
  2019-09-20  4:49 ` [PATCH RFC 4/5] net: Support GRO/GSO fraglist chaining Steffen Klassert
@ 2019-09-20  4:49 ` Steffen Klassert
  2019-09-23 13:01   ` Willem de Bruijn
  4 siblings, 1 reply; 18+ messages in thread
From: Steffen Klassert @ 2019-09-20  4:49 UTC (permalink / raw)
  To: netdev
  Cc: Steffen Klassert, Willem de Bruijn, Paolo Abeni,
	Subash Abhinov Kasiviswanathan, Marcelo Ricardo Leitner

This patch extends UDP GRO to support fraglist GRO/GSO
by using the previously introduced infrastructure.
All UDP packets that are not targeted to a GRO capable
UDP sockets are going to fraglist GRO now (local input
and forward).

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 net/ipv4/udp_offload.c | 56 +++++++++++++++++++++++++++++++++++++++---
 net/ipv6/udp_offload.c |  9 +++++++
 2 files changed, 62 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index 929b12fc7bc5..37daafb06d4c 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -184,6 +184,20 @@ struct sk_buff *skb_udp_tunnel_segment(struct sk_buff *skb,
 }
 EXPORT_SYMBOL(skb_udp_tunnel_segment);
 
+static struct sk_buff *__udp_gso_segment_list(struct sk_buff *skb,
+					      netdev_features_t features)
+{
+	unsigned int mss = skb_shinfo(skb)->gso_size;
+
+	skb = skb_segment_list(skb, features, skb_mac_header_len(skb));
+	if (IS_ERR(skb))
+		return skb;
+
+	udp_hdr(skb)->len = htons(sizeof(struct udphdr) + mss);
+
+	return skb;
+}
+
 struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
 				  netdev_features_t features)
 {
@@ -196,6 +210,9 @@ struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
 	__sum16 check;
 	__be16 newlen;
 
+	if (skb_shinfo(gso_skb)->gso_type & SKB_GSO_FRAGLIST)
+		return __udp_gso_segment_list(gso_skb, features);
+
 	mss = skb_shinfo(gso_skb)->gso_size;
 	if (gso_skb->len <= sizeof(*uh) + mss)
 		return ERR_PTR(-EINVAL);
@@ -354,6 +371,7 @@ static struct sk_buff *udp_gro_receive_segment(struct list_head *head,
 	struct udphdr *uh2;
 	struct sk_buff *p;
 	unsigned int ulen;
+	int ret = 0;
 
 	/* requires non zero csum, for symmetry with GSO */
 	if (!uh->check) {
@@ -369,7 +387,6 @@ static struct sk_buff *udp_gro_receive_segment(struct list_head *head,
 	}
 	/* pull encapsulating udp header */
 	skb_gro_pull(skb, sizeof(struct udphdr));
-	skb_gro_postpull_rcsum(skb, uh, sizeof(struct udphdr));
 
 	list_for_each_entry(p, head, list) {
 		if (!NAPI_GRO_CB(p)->same_flow)
@@ -383,14 +400,35 @@ static struct sk_buff *udp_gro_receive_segment(struct list_head *head,
 			continue;
 		}
 
+		if (NAPI_GRO_CB(skb)->is_flist != NAPI_GRO_CB(p)->is_flist) {
+			NAPI_GRO_CB(skb)->flush = 1;
+			return p;
+		}
+
 		/* Terminate the flow on len mismatch or if it grow "too much".
 		 * Under small packet flood GRO count could elsewhere grow a lot
 		 * leading to excessive truesize values.
 		 * On len mismatch merge the first packet shorter than gso_size,
 		 * otherwise complete the GRO packet.
 		 */
-		if (ulen > ntohs(uh2->len) || skb_gro_receive(p, skb) ||
-		    ulen != ntohs(uh2->len) ||
+		if (ulen > ntohs(uh2->len)) {
+			pp = p;
+		} else {
+			if (NAPI_GRO_CB(skb)->is_flist) {
+				if (!pskb_may_pull(skb, skb_gro_offset(skb))) {
+					NAPI_GRO_CB(skb)->flush = 1;
+					return NULL;
+				}
+				ret = skb_gro_receive_list(p, skb);
+			} else {
+				skb_gro_postpull_rcsum(skb, uh,
+						       sizeof(struct udphdr));
+
+				ret = skb_gro_receive(p, skb);
+			}
+		}
+
+		if (ret || ulen != ntohs(uh2->len) ||
 		    NAPI_GRO_CB(p)->count >= UDP_GRO_CNT_MAX)
 			pp = p;
 
@@ -411,6 +449,9 @@ struct sk_buff *udp_gro_receive(struct list_head *head, struct sk_buff *skb,
 	int flush = 1;
 
 	if (!sk || !udp_sk(sk)->gro_receive) {
+		if (skb->dev->features & NETIF_F_GRO_LIST)
+			NAPI_GRO_CB(skb)->is_flist = sk ? !udp_sk(sk)->gro_enabled: 1;
+
 		pp = call_gro_receive(udp_gro_receive_segment, head, skb);
 		return pp;
 	}
@@ -538,6 +579,15 @@ INDIRECT_CALLABLE_SCOPE int udp4_gro_complete(struct sk_buff *skb, int nhoff)
 	const struct iphdr *iph = ip_hdr(skb);
 	struct udphdr *uh = (struct udphdr *)(skb->data + nhoff);
 
+	if (NAPI_GRO_CB(skb)->is_flist) {
+		uh->len = htons(skb->len - nhoff);
+
+		skb_shinfo(skb)->gso_type |= (SKB_GSO_FRAGLIST|SKB_GSO_UDP_L4);
+		skb_shinfo(skb)->gso_segs = NAPI_GRO_CB(skb)->count;
+
+		return 0;
+	}
+
 	if (uh->check)
 		uh->check = ~udp_v4_check(skb->len - nhoff, iph->saddr,
 					  iph->daddr, 0);
diff --git a/net/ipv6/udp_offload.c b/net/ipv6/udp_offload.c
index 435cfbadb6bd..8836f2b69ef3 100644
--- a/net/ipv6/udp_offload.c
+++ b/net/ipv6/udp_offload.c
@@ -150,6 +150,15 @@ INDIRECT_CALLABLE_SCOPE int udp6_gro_complete(struct sk_buff *skb, int nhoff)
 	const struct ipv6hdr *ipv6h = ipv6_hdr(skb);
 	struct udphdr *uh = (struct udphdr *)(skb->data + nhoff);
 
+	if (NAPI_GRO_CB(skb)->is_flist) {
+		uh->len = htons(skb->len - nhoff);
+
+		skb_shinfo(skb)->gso_type |= (SKB_GSO_FRAGLIST|SKB_GSO_UDP_L4);
+		skb_shinfo(skb)->gso_segs = NAPI_GRO_CB(skb)->count;
+
+		return 0;
+	}
+
 	if (uh->check)
 		uh->check = ~udp_v6_check(skb->len - nhoff, &ipv6h->saddr,
 					  &ipv6h->daddr, 0);
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC 3/5] net: Add a netdev software feature set that defaults to off.
  2019-09-20  4:49 ` [PATCH RFC 3/5] net: Add a netdev software feature set that defaults to off Steffen Klassert
@ 2019-09-23 12:38   ` Willem de Bruijn
  2019-09-30  6:24     ` Steffen Klassert
  0 siblings, 1 reply; 18+ messages in thread
From: Willem de Bruijn @ 2019-09-23 12:38 UTC (permalink / raw)
  To: Steffen Klassert
  Cc: Network Development, Paolo Abeni, Subash Abhinov Kasiviswanathan,
	Marcelo Ricardo Leitner

On Fri, Sep 20, 2019 at 12:49 AM Steffen Klassert
<steffen.klassert@secunet.com> wrote:
>
> The previous patch added the NETIF_F_GRO_FRAGLIST feature.
> This is a software feature that should default to off.
> Current software features default to on, so add a new
> feature set that defaults to off.
>
> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
> ---
>  include/linux/netdev_features.h | 3 +++
>  net/core/dev.c                  | 2 +-
>  2 files changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/netdev_features.h b/include/linux/netdev_features.h
> index b239507da2a0..34d050bb1ae6 100644
> --- a/include/linux/netdev_features.h
> +++ b/include/linux/netdev_features.h
> @@ -230,6 +230,9 @@ static inline int find_next_netdev_feature(u64 feature, unsigned long start)
>  /* changeable features with no special hardware requirements */
>  #define NETIF_F_SOFT_FEATURES  (NETIF_F_GSO | NETIF_F_GRO)
>
> +/* Changeable features with no special hardware requirements that defaults to off. */
> +#define NETIF_F_SOFT_FEATURES_OFF      NETIF_F_GRO_FRAGLIST
> +

NETIF_F_GRO_FRAGLIST is not really a device feature, but a way to
configure which form of UDP GRO to apply.

The UDP GRO benchmarks were largely positive, but not a strict win if
I read Paolo's previous results correctly. Even if enabling to by
default, it probably should come with a sysctl to disable for specific
workloads.

If so, how about a ternary per-netns sysctl {off, on without gro-list,
on with gro-list} instead of configuring through ethtool?

Alternative, the choice between gro-list or not could perhaps be
informed by whether ip_forward is set. I think we discussed that and
it was rejected, but I cannot remember why.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC 1/5] UDP: enable GRO by default.
  2019-09-20  4:49 ` [PATCH RFC 1/5] UDP: enable GRO by default Steffen Klassert
@ 2019-09-23 12:53   ` Willem de Bruijn
  2019-09-23 12:55     ` Willem de Bruijn
  0 siblings, 1 reply; 18+ messages in thread
From: Willem de Bruijn @ 2019-09-23 12:53 UTC (permalink / raw)
  To: Steffen Klassert
  Cc: Network Development, Paolo Abeni, Subash Abhinov Kasiviswanathan,
	Marcelo Ricardo Leitner

On Fri, Sep 20, 2019 at 12:49 AM Steffen Klassert
<steffen.klassert@secunet.com> wrote:
>
> This patch enables UDP GRO regardless if a GRO capable
> socket is present. With this GRO is done by default
> for the local input and forwarding path.
>
> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
>  struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
> diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
> index a3908e55ed89..929b12fc7bc5 100644
> --- a/net/ipv4/udp_offload.c
> +++ b/net/ipv4/udp_offload.c
> @@ -401,36 +401,25 @@ static struct sk_buff *udp_gro_receive_segment(struct list_head *head,
>         return NULL;
>  }
>
> -INDIRECT_CALLABLE_DECLARE(struct sock *udp6_lib_lookup_skb(struct sk_buff *skb,
> -                                                  __be16 sport, __be16 dport));
>  struct sk_buff *udp_gro_receive(struct list_head *head, struct sk_buff *skb,
> -                               struct udphdr *uh, udp_lookup_t lookup)
> +                               struct udphdr *uh, struct sock *sk)
>  {
>         struct sk_buff *pp = NULL;
>         struct sk_buff *p;
>         struct udphdr *uh2;
>         unsigned int off = skb_gro_offset(skb);
>         int flush = 1;
> -       struct sock *sk;
>
> -       rcu_read_lock();
> -       sk = INDIRECT_CALL_INET(lookup, udp6_lib_lookup_skb,
> -                               udp4_lib_lookup_skb, skb, uh->source, uh->dest);
> -       if (!sk)
> -               goto out_unlock;
> -
> -       if (udp_sk(sk)->gro_enabled) {
> +       if (!sk || !udp_sk(sk)->gro_receive) {

Not critical, but the use of sk->gro_enabled and sk->gro_receive to
signal whether sockets are willing to accept large packets or are udp
tunnels, respectively, is subtle and possibly confusing.

Wrappers udp_sock_is_tunnel and udp_sock_accepts_gso could perhaps
help document the logic a bit.

static inline bool udp_sock_is_tunnel(struct udp_sock *up)
{
    return up->gro_receive;
}

And perhaps only pass a non-zero sk to udp_gro_receive if it is a
tunnel and thus skips the new default path:

static inline struct sock *sk = udp4_lookup_tunnel(const struct
sk_buff *skb, __be16 sport, __be16_dport)
{
    struct sock *sk;

    if (!static_branch_unlikely(&udp_encap_needed_key))
      return NULL;

    rcu_read_lock();
    sk = udp4_lib_lookup_skb(skb, source, dest);
    rcu_read_unlock();

    return udp_sock_is_tunnel(udp_sk(sk)) ? sk  : NULL;
}

>                 pp = call_gro_receive(udp_gro_receive_segment, head, skb);
> -               rcu_read_unlock();
>                 return pp;
>         }

Just a suggestion. It may be too verbose as given.

> @@ -468,8 +456,10 @@ INDIRECT_CALLABLE_SCOPE
>  struct sk_buff *udp4_gro_receive(struct list_head *head, struct sk_buff *skb)
>  {
>         struct udphdr *uh = udp_gro_udphdr(skb);
> +       struct sk_buff *pp;
> +       struct sock *sk;
>
> -       if (unlikely(!uh) || !static_branch_unlikely(&udp_encap_needed_key))
> +       if (unlikely(!uh))
>                 goto flush;
>
>         /* Don't bother verifying checksum if we're going to flush anyway. */
> @@ -484,7 +474,11 @@ struct sk_buff *udp4_gro_receive(struct list_head *head, struct sk_buff *skb)
>                                              inet_gro_compute_pseudo);
>  skip:
>         NAPI_GRO_CB(skb)->is_ipv6 = 0;
> -       return udp_gro_receive(head, skb, uh, udp4_lib_lookup_skb);
> +       rcu_read_lock();
> +       sk = static_branch_unlikely(&udp_encap_needed_key) ? udp4_lib_lookup_skb(skb, uh->source, uh->dest) : NULL;
> +       pp = udp_gro_receive(head, skb, uh, sk);
> +       rcu_read_unlock();
> +       return pp;

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC 1/5] UDP: enable GRO by default.
  2019-09-23 12:53   ` Willem de Bruijn
@ 2019-09-23 12:55     ` Willem de Bruijn
  0 siblings, 0 replies; 18+ messages in thread
From: Willem de Bruijn @ 2019-09-23 12:55 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Steffen Klassert, Network Development, Paolo Abeni,
	Subash Abhinov Kasiviswanathan, Marcelo Ricardo Leitner

On Mon, Sep 23, 2019 at 8:53 AM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> On Fri, Sep 20, 2019 at 12:49 AM Steffen Klassert
> <steffen.klassert@secunet.com> wrote:
> >
> > This patch enables UDP GRO regardless if a GRO capable
> > socket is present. With this GRO is done by default
> > for the local input and forwarding path.
> >
> > Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
> >  struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
> > diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
> > index a3908e55ed89..929b12fc7bc5 100644
> > --- a/net/ipv4/udp_offload.c
> > +++ b/net/ipv4/udp_offload.c
> > @@ -401,36 +401,25 @@ static struct sk_buff *udp_gro_receive_segment(struct list_head *head,
> >         return NULL;
> >  }
> >
> > -INDIRECT_CALLABLE_DECLARE(struct sock *udp6_lib_lookup_skb(struct sk_buff *skb,
> > -                                                  __be16 sport, __be16 dport));
> >  struct sk_buff *udp_gro_receive(struct list_head *head, struct sk_buff *skb,
> > -                               struct udphdr *uh, udp_lookup_t lookup)
> > +                               struct udphdr *uh, struct sock *sk)
> >  {
> >         struct sk_buff *pp = NULL;
> >         struct sk_buff *p;
> >         struct udphdr *uh2;
> >         unsigned int off = skb_gro_offset(skb);
> >         int flush = 1;
> > -       struct sock *sk;
> >
> > -       rcu_read_lock();
> > -       sk = INDIRECT_CALL_INET(lookup, udp6_lib_lookup_skb,
> > -                               udp4_lib_lookup_skb, skb, uh->source, uh->dest);
> > -       if (!sk)
> > -               goto out_unlock;
> > -
> > -       if (udp_sk(sk)->gro_enabled) {
> > +       if (!sk || !udp_sk(sk)->gro_receive) {
>
> Not critical, but the use of sk->gro_enabled and sk->gro_receive to
> signal whether sockets are willing to accept large packets or are udp
> tunnels, respectively, is subtle and possibly confusing.
>
> Wrappers udp_sock_is_tunnel and udp_sock_accepts_gso could perhaps
> help document the logic a bit.
>
> static inline bool udp_sock_is_tunnel(struct udp_sock *up)
> {
>     return up->gro_receive;
> }
>
> And perhaps only pass a non-zero sk to udp_gro_receive if it is a
> tunnel and thus skips the new default path:
>
> static inline struct sock *sk = udp4_lookup_tunnel(const struct
> sk_buff *skb, __be16 sport, __be16_dport)
> {
>     struct sock *sk;
>
>     if (!static_branch_unlikely(&udp_encap_needed_key))
>       return NULL;
>
>     rcu_read_lock();
>     sk = udp4_lib_lookup_skb(skb, source, dest);
>     rcu_read_unlock();
>
>     return udp_sock_is_tunnel(udp_sk(sk)) ? sk  : NULL;
> }
>
> >                 pp = call_gro_receive(udp_gro_receive_segment, head, skb);
> > -               rcu_read_unlock();
> >                 return pp;
> >         }
>
> Just a suggestion. It may be too verbose as given.

.. and buggy, as it does not even test for NULL sk ;)

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC 5/5] udp: Support UDP fraglist GRO/GSO.
  2019-09-20  4:49 ` [PATCH RFC 5/5] udp: Support UDP fraglist GRO/GSO Steffen Klassert
@ 2019-09-23 13:01   ` Willem de Bruijn
  2019-09-30  6:30     ` Steffen Klassert
  0 siblings, 1 reply; 18+ messages in thread
From: Willem de Bruijn @ 2019-09-23 13:01 UTC (permalink / raw)
  To: Steffen Klassert
  Cc: Network Development, Paolo Abeni, Subash Abhinov Kasiviswanathan,
	Marcelo Ricardo Leitner

On Fri, Sep 20, 2019 at 12:49 AM Steffen Klassert
<steffen.klassert@secunet.com> wrote:
>
> This patch extends UDP GRO to support fraglist GRO/GSO
> by using the previously introduced infrastructure.
> All UDP packets that are not targeted to a GRO capable
> UDP sockets are going to fraglist GRO now (local input
> and forward).
>
> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

> @@ -538,6 +579,15 @@ INDIRECT_CALLABLE_SCOPE int udp4_gro_complete(struct sk_buff *skb, int nhoff)
>         const struct iphdr *iph = ip_hdr(skb);
>         struct udphdr *uh = (struct udphdr *)(skb->data + nhoff);
>
> +       if (NAPI_GRO_CB(skb)->is_flist) {
> +               uh->len = htons(skb->len - nhoff);
> +
> +               skb_shinfo(skb)->gso_type |= (SKB_GSO_FRAGLIST|SKB_GSO_UDP_L4);
> +               skb_shinfo(skb)->gso_segs = NAPI_GRO_CB(skb)->count;
> +
> +               return 0;
> +       }
> +
>         if (uh->check)
>                 uh->check = ~udp_v4_check(skb->len - nhoff, iph->saddr,
>                                           iph->daddr, 0);
> diff --git a/net/ipv6/udp_offload.c b/net/ipv6/udp_offload.c
> index 435cfbadb6bd..8836f2b69ef3 100644
> --- a/net/ipv6/udp_offload.c
> +++ b/net/ipv6/udp_offload.c
> @@ -150,6 +150,15 @@ INDIRECT_CALLABLE_SCOPE int udp6_gro_complete(struct sk_buff *skb, int nhoff)
>         const struct ipv6hdr *ipv6h = ipv6_hdr(skb);
>         struct udphdr *uh = (struct udphdr *)(skb->data + nhoff);
>
> +       if (NAPI_GRO_CB(skb)->is_flist) {
> +               uh->len = htons(skb->len - nhoff);
> +
> +               skb_shinfo(skb)->gso_type |= (SKB_GSO_FRAGLIST|SKB_GSO_UDP_L4);
> +               skb_shinfo(skb)->gso_segs = NAPI_GRO_CB(skb)->count;
> +
> +               return 0;
> +       }
> +

This is the same logic as in udp4_gro_complete. Can it be deduplicated
in udp_gro_complete?

>         if (uh->check)
>                 uh->check = ~udp_v6_check(skb->len - nhoff, &ipv6h->saddr,
>                                           &ipv6h->daddr, 0);

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC 3/5] net: Add a netdev software feature set that defaults to off.
  2019-09-23 12:38   ` Willem de Bruijn
@ 2019-09-30  6:24     ` Steffen Klassert
  2019-09-30 15:26       ` Willem de Bruijn
  0 siblings, 1 reply; 18+ messages in thread
From: Steffen Klassert @ 2019-09-30  6:24 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Network Development, Paolo Abeni, Subash Abhinov Kasiviswanathan,
	Marcelo Ricardo Leitner

On Mon, Sep 23, 2019 at 08:38:56AM -0400, Willem de Bruijn wrote:
> On Fri, Sep 20, 2019 at 12:49 AM Steffen Klassert
> <steffen.klassert@secunet.com> wrote:
> >
> > diff --git a/include/linux/netdev_features.h b/include/linux/netdev_features.h
> > index b239507da2a0..34d050bb1ae6 100644
> > --- a/include/linux/netdev_features.h
> > +++ b/include/linux/netdev_features.h
> > @@ -230,6 +230,9 @@ static inline int find_next_netdev_feature(u64 feature, unsigned long start)
> >  /* changeable features with no special hardware requirements */
> >  #define NETIF_F_SOFT_FEATURES  (NETIF_F_GSO | NETIF_F_GRO)
> >
> > +/* Changeable features with no special hardware requirements that defaults to off. */
> > +#define NETIF_F_SOFT_FEATURES_OFF      NETIF_F_GRO_FRAGLIST
> > +
> 
> NETIF_F_GRO_FRAGLIST is not really a device feature, but a way to
> configure which form of UDP GRO to apply.

NETIF_F_GRO is also not really a device feature. It is a feature with
no special hardware requirements, as NETIF_F_GRO_FRAGLIST is.
Fraglist GRO is a special way to do GRO and should be configured in the
same way we configure standard GRO.

> 
> The UDP GRO benchmarks were largely positive, but not a strict win if
> I read Paolo's previous results correctly. Even if enabling to by
> default, it probably should come with a sysctl to disable for specific
> workloads.

Maybe we can just keep the default for the local input path
as is and enable GRO as this:

For standard UDP GRO on local input, do GRO only if a GRO enabled
socket is found.

If there is no local socket found and forwarding is enabled,
assume forwarding and do standard GRO.

If fraglist GRO is enabled, do it as default on local input and
forwarding because it is explicitly configured.

Would such a policy make semse?

> 
> If so, how about a ternary per-netns sysctl {off, on without gro-list,
> on with gro-list} instead of configuring through ethtool?

I'd not like to have a global knob to configure this.
On some devices it might make sense to enable fraglist
GRO, but on others not. Also it would be nice if we can
configure both vatiants with the same tool (ethtool).


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC 5/5] udp: Support UDP fraglist GRO/GSO.
  2019-09-23 13:01   ` Willem de Bruijn
@ 2019-09-30  6:30     ` Steffen Klassert
  2019-09-30 15:32       ` Willem de Bruijn
  0 siblings, 1 reply; 18+ messages in thread
From: Steffen Klassert @ 2019-09-30  6:30 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Network Development, Paolo Abeni, Subash Abhinov Kasiviswanathan,
	Marcelo Ricardo Leitner

On Mon, Sep 23, 2019 at 09:01:13AM -0400, Willem de Bruijn wrote:
> On Fri, Sep 20, 2019 at 12:49 AM Steffen Klassert
> <steffen.klassert@secunet.com> wrote:
> >
> > This patch extends UDP GRO to support fraglist GRO/GSO
> > by using the previously introduced infrastructure.
> > All UDP packets that are not targeted to a GRO capable
> > UDP sockets are going to fraglist GRO now (local input
> > and forward).
> >
> > Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
> 
> > @@ -538,6 +579,15 @@ INDIRECT_CALLABLE_SCOPE int udp4_gro_complete(struct sk_buff *skb, int nhoff)
> >         const struct iphdr *iph = ip_hdr(skb);
> >         struct udphdr *uh = (struct udphdr *)(skb->data + nhoff);
> >
> > +       if (NAPI_GRO_CB(skb)->is_flist) {
> > +               uh->len = htons(skb->len - nhoff);
> > +
> > +               skb_shinfo(skb)->gso_type |= (SKB_GSO_FRAGLIST|SKB_GSO_UDP_L4);
> > +               skb_shinfo(skb)->gso_segs = NAPI_GRO_CB(skb)->count;
> > +
> > +               return 0;
> > +       }
> > +
> >         if (uh->check)
> >                 uh->check = ~udp_v4_check(skb->len - nhoff, iph->saddr,
> >                                           iph->daddr, 0);
> > diff --git a/net/ipv6/udp_offload.c b/net/ipv6/udp_offload.c
> > index 435cfbadb6bd..8836f2b69ef3 100644
> > --- a/net/ipv6/udp_offload.c
> > +++ b/net/ipv6/udp_offload.c
> > @@ -150,6 +150,15 @@ INDIRECT_CALLABLE_SCOPE int udp6_gro_complete(struct sk_buff *skb, int nhoff)
> >         const struct ipv6hdr *ipv6h = ipv6_hdr(skb);
> >         struct udphdr *uh = (struct udphdr *)(skb->data + nhoff);
> >
> > +       if (NAPI_GRO_CB(skb)->is_flist) {
> > +               uh->len = htons(skb->len - nhoff);
> > +
> > +               skb_shinfo(skb)->gso_type |= (SKB_GSO_FRAGLIST|SKB_GSO_UDP_L4);
> > +               skb_shinfo(skb)->gso_segs = NAPI_GRO_CB(skb)->count;
> > +
> > +               return 0;
> > +       }
> > +
> 
> This is the same logic as in udp4_gro_complete. Can it be deduplicated
> in udp_gro_complete?

The code below would mess up the checksum then. We did not change
the packets, so the checksum is still correct.

> 
> >         if (uh->check)
> >                 uh->check = ~udp_v6_check(skb->len - nhoff, &ipv6h->saddr,
> >                                           &ipv6h->daddr, 0);

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC 3/5] net: Add a netdev software feature set that defaults to off.
  2019-09-30  6:24     ` Steffen Klassert
@ 2019-09-30 15:26       ` Willem de Bruijn
  2019-10-01  6:18         ` Steffen Klassert
  0 siblings, 1 reply; 18+ messages in thread
From: Willem de Bruijn @ 2019-09-30 15:26 UTC (permalink / raw)
  To: Steffen Klassert
  Cc: Network Development, Paolo Abeni, Subash Abhinov Kasiviswanathan,
	Marcelo Ricardo Leitner

On Mon, Sep 30, 2019 at 2:24 AM Steffen Klassert
<steffen.klassert@secunet.com> wrote:
>
> On Mon, Sep 23, 2019 at 08:38:56AM -0400, Willem de Bruijn wrote:
> > On Fri, Sep 20, 2019 at 12:49 AM Steffen Klassert
> > <steffen.klassert@secunet.com> wrote:
> > >
> > > diff --git a/include/linux/netdev_features.h b/include/linux/netdev_features.h
> > > index b239507da2a0..34d050bb1ae6 100644
> > > --- a/include/linux/netdev_features.h
> > > +++ b/include/linux/netdev_features.h
> > > @@ -230,6 +230,9 @@ static inline int find_next_netdev_feature(u64 feature, unsigned long start)
> > >  /* changeable features with no special hardware requirements */
> > >  #define NETIF_F_SOFT_FEATURES  (NETIF_F_GSO | NETIF_F_GRO)
> > >
> > > +/* Changeable features with no special hardware requirements that defaults to off. */
> > > +#define NETIF_F_SOFT_FEATURES_OFF      NETIF_F_GRO_FRAGLIST
> > > +
> >
> > NETIF_F_GRO_FRAGLIST is not really a device feature, but a way to
> > configure which form of UDP GRO to apply.
>
> NETIF_F_GRO is also not really a device feature. It is a feature with
> no special hardware requirements, as NETIF_F_GRO_FRAGLIST is.
> Fraglist GRO is a special way to do GRO and should be configured in the
> same way we configure standard GRO.
>
> >
> > The UDP GRO benchmarks were largely positive, but not a strict win if
> > I read Paolo's previous results correctly. Even if enabling to by
> > default, it probably should come with a sysctl to disable for specific
> > workloads.
>
> Maybe we can just keep the default for the local input path
> as is and enable GRO as this:
>
> For standard UDP GRO on local input, do GRO only if a GRO enabled
> socket is found.
>
> If there is no local socket found and forwarding is enabled,
> assume forwarding and do standard GRO.
>
> If fraglist GRO is enabled, do it as default on local input and
> forwarding because it is explicitly configured.
>
> Would such a policy make semse?

Making the choice between fraglist or non-fraglist GRO explicitly
configurable sounds great. Per device through ethtool over global
sysctl, too.

My main concern is not this patch, but 1/5 that enables UDP GRO by
default. There should be a way to disable it, at least.

I guess your suggestion is to only enable it with forwarding, which is
unlikely to see a cycle regression. And if there is a latency
regression, disable all GRO to disable UDP GRO.

Instead, how about adding a UDP GRO ethtool feature independent of
forwarding, analogous to fraglist GRO? Then both are explicitly under
admin control. And can be enabled by default (either now, or after
getting more data).

>
> >
> > If so, how about a ternary per-netns sysctl {off, on without gro-list,
> > on with gro-list} instead of configuring through ethtool?
>
> I'd not like to have a global knob to configure this.
> On some devices it might make sense to enable fraglist
> GRO, but on others not. Also it would be nice if we can
> configure both vatiants with the same tool (ethtool).

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC 5/5] udp: Support UDP fraglist GRO/GSO.
  2019-09-30  6:30     ` Steffen Klassert
@ 2019-09-30 15:32       ` Willem de Bruijn
  0 siblings, 0 replies; 18+ messages in thread
From: Willem de Bruijn @ 2019-09-30 15:32 UTC (permalink / raw)
  To: Steffen Klassert
  Cc: Network Development, Paolo Abeni, Subash Abhinov Kasiviswanathan,
	Marcelo Ricardo Leitner

On Mon, Sep 30, 2019 at 2:30 AM Steffen Klassert
<steffen.klassert@secunet.com> wrote:
>
> On Mon, Sep 23, 2019 at 09:01:13AM -0400, Willem de Bruijn wrote:
> > On Fri, Sep 20, 2019 at 12:49 AM Steffen Klassert
> > <steffen.klassert@secunet.com> wrote:
> > >
> > > This patch extends UDP GRO to support fraglist GRO/GSO
> > > by using the previously introduced infrastructure.
> > > All UDP packets that are not targeted to a GRO capable
> > > UDP sockets are going to fraglist GRO now (local input
> > > and forward).
> > >
> > > Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
> >
> > > @@ -538,6 +579,15 @@ INDIRECT_CALLABLE_SCOPE int udp4_gro_complete(struct sk_buff *skb, int nhoff)
> > >         const struct iphdr *iph = ip_hdr(skb);
> > >         struct udphdr *uh = (struct udphdr *)(skb->data + nhoff);
> > >
> > > +       if (NAPI_GRO_CB(skb)->is_flist) {
> > > +               uh->len = htons(skb->len - nhoff);
> > > +
> > > +               skb_shinfo(skb)->gso_type |= (SKB_GSO_FRAGLIST|SKB_GSO_UDP_L4);
> > > +               skb_shinfo(skb)->gso_segs = NAPI_GRO_CB(skb)->count;
> > > +
> > > +               return 0;
> > > +       }
> > > +
> > >         if (uh->check)
> > >                 uh->check = ~udp_v4_check(skb->len - nhoff, iph->saddr,
> > >                                           iph->daddr, 0);
> > > diff --git a/net/ipv6/udp_offload.c b/net/ipv6/udp_offload.c
> > > index 435cfbadb6bd..8836f2b69ef3 100644
> > > --- a/net/ipv6/udp_offload.c
> > > +++ b/net/ipv6/udp_offload.c
> > > @@ -150,6 +150,15 @@ INDIRECT_CALLABLE_SCOPE int udp6_gro_complete(struct sk_buff *skb, int nhoff)
> > >         const struct ipv6hdr *ipv6h = ipv6_hdr(skb);
> > >         struct udphdr *uh = (struct udphdr *)(skb->data + nhoff);
> > >
> > > +       if (NAPI_GRO_CB(skb)->is_flist) {
> > > +               uh->len = htons(skb->len - nhoff);
> > > +
> > > +               skb_shinfo(skb)->gso_type |= (SKB_GSO_FRAGLIST|SKB_GSO_UDP_L4);
> > > +               skb_shinfo(skb)->gso_segs = NAPI_GRO_CB(skb)->count;
> > > +
> > > +               return 0;
> > > +       }
> > > +
> >
> > This is the same logic as in udp4_gro_complete. Can it be deduplicated
> > in udp_gro_complete?
>
> The code below would mess up the checksum then. We did not change
> the packets, so the checksum is still correct.

Uh, right, of course. I guess it's not enough code to create a
separate udp_gro_fraglist_complete helper for. Okay, never mind.

> >
> > >         if (uh->check)
> > >                 uh->check = ~udp_v6_check(skb->len - nhoff, &ipv6h->saddr,
> > >                                           &ipv6h->daddr, 0);

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC 3/5] net: Add a netdev software feature set that defaults to off.
  2019-09-30 15:26       ` Willem de Bruijn
@ 2019-10-01  6:18         ` Steffen Klassert
  2019-10-01 12:43           ` Willem de Bruijn
  0 siblings, 1 reply; 18+ messages in thread
From: Steffen Klassert @ 2019-10-01  6:18 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Network Development, Paolo Abeni, Subash Abhinov Kasiviswanathan,
	Marcelo Ricardo Leitner

On Mon, Sep 30, 2019 at 11:26:55AM -0400, Willem de Bruijn wrote:
> On Mon, Sep 30, 2019 at 2:24 AM Steffen Klassert
> <steffen.klassert@secunet.com> wrote:
> >
> > On Mon, Sep 23, 2019 at 08:38:56AM -0400, Willem de Bruijn wrote:
> > >
> > > The UDP GRO benchmarks were largely positive, but not a strict win if
> > > I read Paolo's previous results correctly. Even if enabling to by
> > > default, it probably should come with a sysctl to disable for specific
> > > workloads.
> >
> > Maybe we can just keep the default for the local input path
> > as is and enable GRO as this:
> >
> > For standard UDP GRO on local input, do GRO only if a GRO enabled
> > socket is found.
> >
> > If there is no local socket found and forwarding is enabled,
> > assume forwarding and do standard GRO.
> >
> > If fraglist GRO is enabled, do it as default on local input and
> > forwarding because it is explicitly configured.
> >
> > Would such a policy make semse?
> 
> Making the choice between fraglist or non-fraglist GRO explicitly
> configurable sounds great. Per device through ethtool over global
> sysctl, too.
> 
> My main concern is not this patch, but 1/5 that enables UDP GRO by
> default. There should be a way to disable it, at least.
> 
> I guess your suggestion is to only enable it with forwarding, which is
> unlikely to see a cycle regression. And if there is a latency
> regression, disable all GRO to disable UDP GRO.

Yes, do GRO only for forwarding or if there is a GRO capable socket.

In this case it can be disabled only by disable all GRO.
It might be a disadvantage, but that's how it is with other
protocols too.

> 
> Instead, how about adding a UDP GRO ethtool feature independent of
> forwarding, analogous to fraglist GRO? Then both are explicitly under
> admin control. And can be enabled by default (either now, or after
> getting more data).

We could add a protocol specific feature, but what would it mean
if UDP GRO is enabled?

Would it be enabled for forwarding, and for local input only if there
is a GRO capable socket? Or would it be enabled even if there
is no GRO capable socket? Same question when UDP GRO is disabled.

Also, what means enabling GRO then? Enable GRO for all protocols
but UDP? Either UDP becomes something special then, or we need
to create protocol specific features for the other protocols
too. Same would apply for fraglist GRO.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC 3/5] net: Add a netdev software feature set that defaults to off.
  2019-10-01  6:18         ` Steffen Klassert
@ 2019-10-01 12:43           ` Willem de Bruijn
  2019-10-02  8:27             ` Steffen Klassert
  0 siblings, 1 reply; 18+ messages in thread
From: Willem de Bruijn @ 2019-10-01 12:43 UTC (permalink / raw)
  To: Steffen Klassert
  Cc: Willem de Bruijn, Network Development, Paolo Abeni,
	Subash Abhinov Kasiviswanathan, Marcelo Ricardo Leitner

On Tue, Oct 1, 2019 at 2:18 AM Steffen Klassert
<steffen.klassert@secunet.com> wrote:
>
> On Mon, Sep 30, 2019 at 11:26:55AM -0400, Willem de Bruijn wrote:
> > On Mon, Sep 30, 2019 at 2:24 AM Steffen Klassert
> > <steffen.klassert@secunet.com> wrote:
> > >
> > > On Mon, Sep 23, 2019 at 08:38:56AM -0400, Willem de Bruijn wrote:
> > > >
> > > > The UDP GRO benchmarks were largely positive, but not a strict win if
> > > > I read Paolo's previous results correctly. Even if enabling to by
> > > > default, it probably should come with a sysctl to disable for specific
> > > > workloads.
> > >
> > > Maybe we can just keep the default for the local input path
> > > as is and enable GRO as this:
> > >
> > > For standard UDP GRO on local input, do GRO only if a GRO enabled
> > > socket is found.
> > >
> > > If there is no local socket found and forwarding is enabled,
> > > assume forwarding and do standard GRO.
> > >
> > > If fraglist GRO is enabled, do it as default on local input and
> > > forwarding because it is explicitly configured.
> > >
> > > Would such a policy make semse?
> >
> > Making the choice between fraglist or non-fraglist GRO explicitly
> > configurable sounds great. Per device through ethtool over global
> > sysctl, too.
> >
> > My main concern is not this patch, but 1/5 that enables UDP GRO by
> > default. There should be a way to disable it, at least.
> >
> > I guess your suggestion is to only enable it with forwarding, which is
> > unlikely to see a cycle regression. And if there is a latency
> > regression, disable all GRO to disable UDP GRO.
>
> Yes, do GRO only for forwarding or if there is a GRO capable socket.
>
> In this case it can be disabled only by disable all GRO.
> It might be a disadvantage, but that's how it is with other
> protocols too.
>
> >
> > Instead, how about adding a UDP GRO ethtool feature independent of
> > forwarding, analogous to fraglist GRO? Then both are explicitly under
> > admin control. And can be enabled by default (either now, or after
> > getting more data).
>
> We could add a protocol specific feature, but what would it mean
> if UDP GRO is enabled?
>
> Would it be enabled for forwarding, and for local input only if there
> is a GRO capable socket? Or would it be enabled even if there
> is no GRO capable socket? Same question when UDP GRO is disabled.

Enable UDP GRO for all traffic if GRO and UDP GRO are set, and only
then. That seems like the easiest to understand behavior to me, and
gives administrators an opt-out for workloads where UDP GRO causes a
regression. We cannot realistically turn off all GRO on a mixed
TCP/UDP workload (like, say, hosting TCP and QUIC).

> Also, what means enabling GRO then? Enable GRO for all protocols
> but UDP? Either UDP becomes something special then,

Yes and true. But it is something special. We don't know whether UDP
GRO is safe to deploy everywhere.

Only enabling it for the forwarding case is more conservative, but
gives no path to enabling it systemwide, is arguably confusing and
still lacks the admin control to turn off in case of unexpected
regressions. I do think that for a time this needs to be configurable
unless you're confident that the forwarding path is such a win that
no plan B is needed. But especially without fraglist, I'm not sure.

> or we need
> to create protocol specific features for the other protocols
> too. Same would apply for fraglist GRO.

We don't need it for other protocols after the fact, but it's a good
question: I don't know how it was enabled for them. Perhaps confidence
was gained based on testing. Or it was enabled for -rc1, no one
complained and stayed turned on. In which case you could do the same.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC 3/5] net: Add a netdev software feature set that defaults to off.
  2019-10-01 12:43           ` Willem de Bruijn
@ 2019-10-02  8:27             ` Steffen Klassert
  2019-10-02 12:32               ` Willem de Bruijn
  0 siblings, 1 reply; 18+ messages in thread
From: Steffen Klassert @ 2019-10-02  8:27 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Network Development, Paolo Abeni, Subash Abhinov Kasiviswanathan,
	Marcelo Ricardo Leitner

On Tue, Oct 01, 2019 at 08:43:05AM -0400, Willem de Bruijn wrote:
> On Tue, Oct 1, 2019 at 2:18 AM Steffen Klassert
> <steffen.klassert@secunet.com> wrote:
> >
> > On Mon, Sep 30, 2019 at 11:26:55AM -0400, Willem de Bruijn wrote:
> > >
> > > Instead, how about adding a UDP GRO ethtool feature independent of
> > > forwarding, analogous to fraglist GRO? Then both are explicitly under
> > > admin control. And can be enabled by default (either now, or after
> > > getting more data).
> >
> > We could add a protocol specific feature, but what would it mean
> > if UDP GRO is enabled?
> >
> > Would it be enabled for forwarding, and for local input only if there
> > is a GRO capable socket? Or would it be enabled even if there
> > is no GRO capable socket? Same question when UDP GRO is disabled.
> 
> Enable UDP GRO for all traffic if GRO and UDP GRO are set, and only
> then. 

But this means that we would need to enable UDP GRO by default then.
Currently, if an application uses a UDP GRO capable socket, it
can expect that it gets GROed packets without doing any additional
configuration. This would change if we disable it by default.
Unfortunately, enabling UDP GRO by default has the biggest
risk because most applications don't use UDP GRO capable sockets.

The most condervative way would be to leave standard GRO as it is.
But on some workloads standard GRO might be preferable, in
particular on forwarding to a NIC that can do UDP segmentation
in hardware.

> That seems like the easiest to understand behavior to me, and
> gives administrators an opt-out for workloads where UDP GRO causes a
> regression. We cannot realistically turn off all GRO on a mixed
> TCP/UDP workload (like, say, hosting TCP and QUIC).
> 
> > Also, what means enabling GRO then? Enable GRO for all protocols
> > but UDP? Either UDP becomes something special then,
> 
> Yes and true. But it is something special. We don't know whether UDP
> GRO is safe to deploy everywhere.
> 
> Only enabling it for the forwarding case is more conservative, but
> gives no path to enabling it systemwide, is arguably confusing and
> still lacks the admin control to turn off in case of unexpected
> regressions. I do think that for a time this needs to be configurable
> unless you're confident that the forwarding path is such a win that
> no plan B is needed. But especially without fraglist, I'm not sure.

On my tests it was a win on forwarding, but there might be
usecases where it is not. I guess the only way to find this out
is to enable is and wait what happens.

I'm a bit hesitating on adding a feature flag that might be only
temporary usefull. In particular on the background of the talk
that Jesse Brandeburg gave on the LPC last year. Maybe you
remember the slide where he showed the output of
ethtool --show-offloads, it filled the whole page.

> 
> > or we need
> > to create protocol specific features for the other protocols
> > too. Same would apply for fraglist GRO.
> 
> We don't need it for other protocols after the fact, but it's a good
> question: I don't know how it was enabled for them. Perhaps confidence
> was gained based on testing. Or it was enabled for -rc1, no one
> complained and stayed turned on. In which case you could do the same.

Maybe we should go that way to enable it and wait whether somebody
complains. A patch to add the feature flag could be prepared
beforehand for that case.

It is easy to make a suboptimal design decision here, so
some more opinions would be helpfull.


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC 3/5] net: Add a netdev software feature set that defaults to off.
  2019-10-02  8:27             ` Steffen Klassert
@ 2019-10-02 12:32               ` Willem de Bruijn
  0 siblings, 0 replies; 18+ messages in thread
From: Willem de Bruijn @ 2019-10-02 12:32 UTC (permalink / raw)
  To: Steffen Klassert
  Cc: Network Development, Paolo Abeni, Subash Abhinov Kasiviswanathan,
	Marcelo Ricardo Leitner

On Wed, Oct 2, 2019 at 4:27 AM Steffen Klassert
<steffen.klassert@secunet.com> wrote:
>
> On Tue, Oct 01, 2019 at 08:43:05AM -0400, Willem de Bruijn wrote:
> > On Tue, Oct 1, 2019 at 2:18 AM Steffen Klassert
> > <steffen.klassert@secunet.com> wrote:
> > >
> > > On Mon, Sep 30, 2019 at 11:26:55AM -0400, Willem de Bruijn wrote:
> > > >
> > > > Instead, how about adding a UDP GRO ethtool feature independent of
> > > > forwarding, analogous to fraglist GRO? Then both are explicitly under
> > > > admin control. And can be enabled by default (either now, or after
> > > > getting more data).
> > >
> > > We could add a protocol specific feature, but what would it mean
> > > if UDP GRO is enabled?
> > >
> > > Would it be enabled for forwarding, and for local input only if there
> > > is a GRO capable socket? Or would it be enabled even if there
> > > is no GRO capable socket? Same question when UDP GRO is disabled.
> >
> > Enable UDP GRO for all traffic if GRO and UDP GRO are set, and only
> > then.
>
> But this means that we would need to enable UDP GRO by default then.

That is what your patch 1/5 does. My concern was that that is a bold
change without an admin opt-out.

> Currently, if an application uses a UDP GRO capable socket, it
> can expect that it gets GROed packets without doing any additional
> configuration. This would change if we disable it by default.
> Unfortunately, enabling UDP GRO by default has the biggest
> risk because most applications don't use UDP GRO capable sockets.
>
> The most condervative way would be to leave standard GRO as it is.
> But on some workloads standard GRO might be preferable, in
> particular on forwarding to a NIC that can do UDP segmentation
> in hardware.
>
> > That seems like the easiest to understand behavior to me, and
> > gives administrators an opt-out for workloads where UDP GRO causes a
> > regression. We cannot realistically turn off all GRO on a mixed
> > TCP/UDP workload (like, say, hosting TCP and QUIC).
> >
> > > Also, what means enabling GRO then? Enable GRO for all protocols
> > > but UDP? Either UDP becomes something special then,
> >
> > Yes and true. But it is something special. We don't know whether UDP
> > GRO is safe to deploy everywhere.
> >
> > Only enabling it for the forwarding case is more conservative, but
> > gives no path to enabling it systemwide, is arguably confusing and
> > still lacks the admin control to turn off in case of unexpected
> > regressions. I do think that for a time this needs to be configurable
> > unless you're confident that the forwarding path is such a win that
> > no plan B is needed. But especially without fraglist, I'm not sure.
>
> On my tests it was a win on forwarding, but there might be
> usecases where it is not. I guess the only way to find this out
> is to enable is and wait what happens.
>
> I'm a bit hesitating on adding a feature flag that might be only
> temporary usefull. In particular on the background of the talk
> that Jesse Brandeburg gave on the LPC last year. Maybe you
> remember the slide where he showed the output of
> ethtool --show-offloads, it filled the whole page.

I was using ethtool -K just yesterday to debug a peculiar mix of
tunneling protocols. And yes, used grep on it ;) But I don't have much
of a problem with this.

But agreed that if default on works in all cases, then it's unnecessary.

> >
> > > or we need
> > > to create protocol specific features for the other protocols
> > > too. Same would apply for fraglist GRO.
> >
> > We don't need it for other protocols after the fact, but it's a good
> > question: I don't know how it was enabled for them. Perhaps confidence
> > was gained based on testing. Or it was enabled for -rc1, no one
> > complained and stayed turned on. In which case you could do the same.
>
> Maybe we should go that way to enable it and wait whether somebody
> complains. A patch to add the feature flag could be prepared
> beforehand for that case.

This early in the cycle, that may work. Yes, it's definitely good to
have the plan B at the ready.

>
> It is easy to make a suboptimal design decision here, so
> some more opinions would be helpfull.

Agreed.

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2019-10-02 12:32 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-20  4:49 [PATCH RFC 0/5] Support fraglist GRO/GSO Steffen Klassert
2019-09-20  4:49 ` [PATCH RFC 1/5] UDP: enable GRO by default Steffen Klassert
2019-09-23 12:53   ` Willem de Bruijn
2019-09-23 12:55     ` Willem de Bruijn
2019-09-20  4:49 ` [PATCH RFC 2/5] net: Add fraglist GRO/GSO feature flags Steffen Klassert
2019-09-20  4:49 ` [PATCH RFC 3/5] net: Add a netdev software feature set that defaults to off Steffen Klassert
2019-09-23 12:38   ` Willem de Bruijn
2019-09-30  6:24     ` Steffen Klassert
2019-09-30 15:26       ` Willem de Bruijn
2019-10-01  6:18         ` Steffen Klassert
2019-10-01 12:43           ` Willem de Bruijn
2019-10-02  8:27             ` Steffen Klassert
2019-10-02 12:32               ` Willem de Bruijn
2019-09-20  4:49 ` [PATCH RFC 4/5] net: Support GRO/GSO fraglist chaining Steffen Klassert
2019-09-20  4:49 ` [PATCH RFC 5/5] udp: Support UDP fraglist GRO/GSO Steffen Klassert
2019-09-23 13:01   ` Willem de Bruijn
2019-09-30  6:30     ` Steffen Klassert
2019-09-30 15:32       ` Willem de Bruijn

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).