* [PATCH net-next 0/4] veth: allow GRO even without XDP
@ 2021-04-09 11:04 Paolo Abeni
  2021-04-09 11:04 ` [PATCH net-next 1/4] veth: use skb_orphan_partial instead of skb_orphan Paolo Abeni
                   ` (4 more replies)
  0 siblings, 5 replies; 14+ messages in thread
From: Paolo Abeni @ 2021-04-09 11:04 UTC
  To: netdev; +Cc: David S. Miller, Jakub Kicinski, Toshiaki Makita, Lorenzo Bianconi

This series allows user-space to enable GRO/NAPI on a veth
device even without attaching an XDP program.

It does not change the default veth behavior (no NAPI, no GRO),
except that, on top of this series, the GRO feature bit will be
effectively off by default on veth devices. Note that currently
the GRO bit is on by default, but GRO never takes place in the
absence of XDP.

On top of this series, setting the GRO feature bit enables NAPI
and allows GRO to take place. The TSO features on the peer
device are preserved.

The main goal is improving UDP forwarding performance for
containers in a typical virtual network setup:

(container) veth -> veth peer -> bridge/ovs -> vxlan -> NIC

Enabling NAPI threaded mode, GRO and the NETIF_F_GRO_UDP_FWD
feature on the veth peer improves the UDP stream performance
by a 2x factor with a non-empty netfilter configuration, with no
measurable overhead for TCP traffic: a heuristic ensures
that TCP will not go through the additional NAPI/GRO layer.
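For reference, a minimal sketch of the peer configuration those numbers
assume - the interface name veth1 is illustrative, and the last knob
requires a kernel exposing the threaded-NAPI sysfs attribute:

	# enable GRO (and, with this series, NAPI) on the veth peer
	ethtool -K veth1 gro on
	# let GRO aggregate forwarded UDP traffic (NETIF_F_GRO_UDP_FWD)
	ethtool -K veth1 rx-udp-gro-forwarding on
	# opt the device into threaded NAPI mode
	echo 1 > /sys/class/net/veth1/threaded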

Some self-tests are added to check the expected behavior in
the default configuration, with XDP and with plain GRO enabled.

Paolo Abeni (4):
  veth: use skb_orphan_partial instead of skb_orphan
  veth: allow enabling NAPI even without XDP
  veth: refine napi usage
  self-tests: add veth tests

 drivers/net/veth.c                   | 152 ++++++++++++++++++++---
 tools/testing/selftests/net/Makefile |   1 +
 tools/testing/selftests/net/veth.sh  | 177 +++++++++++++++++++++++++++
 3 files changed, 316 insertions(+), 14 deletions(-)
 create mode 100755 tools/testing/selftests/net/veth.sh

-- 
2.26.2



* [PATCH net-next 1/4] veth: use skb_orphan_partial instead of skb_orphan
  2021-04-09 11:04 [PATCH net-next 0/4] veth: allow GRO even without XDP Paolo Abeni
@ 2021-04-09 11:04 ` Paolo Abeni
  2021-04-09 11:04 ` [PATCH net-next 2/4] veth: allow enabling NAPI even without XDP Paolo Abeni
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 14+ messages in thread
From: Paolo Abeni @ 2021-04-09 11:04 UTC
  To: netdev; +Cc: David S. Miller, Jakub Kicinski, Toshiaki Makita, Lorenzo Bianconi

As described by commit 9c4c325252c5 ("skbuff: preserve sock
reference when scrubbing the skb."), orphaning a skb
in the TX path will cause out-of-order (OoO) delivery.

Let's use skb_orphan_partial() instead of skb_orphan(), so
that we keep the sk around for queue selection's sake and we
still avoid the problem fixed with commit 4bf9ffa0fb57 ("veth:
Orphan skb before GRO").

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 drivers/net/veth.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 91b73db37555b..ad36e7ed16134 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -674,7 +674,7 @@ static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq,
 	int mac_len, delta, off;
 	struct xdp_buff xdp;
 
-	skb_orphan(skb);
+	skb_orphan_partial(skb);
 
 	rcu_read_lock();
 	xdp_prog = rcu_dereference(rq->xdp_prog);
-- 
2.26.2



* [PATCH net-next 2/4] veth: allow enabling NAPI even without XDP
  2021-04-09 11:04 [PATCH net-next 0/4] veth: allow GRO even without XDP Paolo Abeni
  2021-04-09 11:04 ` [PATCH net-next 1/4] veth: use skb_orphan_partial instead of skb_orphan Paolo Abeni
@ 2021-04-09 11:04 ` Paolo Abeni
  2021-04-09 14:58   ` Toke Høiland-Jørgensen
  2021-04-09 11:04 ` [PATCH net-next 3/4] veth: refine napi usage Paolo Abeni
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 14+ messages in thread
From: Paolo Abeni @ 2021-04-09 11:04 UTC
  To: netdev; +Cc: David S. Miller, Jakub Kicinski, Toshiaki Makita, Lorenzo Bianconi

Currently the veth device has the GRO feature bit set, even if
no GRO aggregation is possible with the default configuration,
as the veth device does not hook into the GRO engine.

Flipping the GRO feature bit from user-space is a no-op, unless
XDP is enabled. In such a scenario GRO could actually take place, but
TSO is forced off on the peer device.

This change allows user-space to really control the GRO feature, with
no need for an XDP program.

The GRO feature bit is now cleared by default - so that there are no
user-visible behavior changes with the default configuration.

When the GRO bit is set, the per-queue NAPI instances are initialized
and registered. On xmit, when napi instances are available, we try
to use them.

Some additional checks are in place to ensure we initialize/delete NAPIs
only when needed in case of overlapping XDP and GRO configuration
changes.
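A brief usage sketch (veth0/veth1 are illustrative names for the two
ends of a veth pair):

	# with this patch the feature bit is really off by default
	ethtool -k veth0 | grep generic-receive-offload    # -> off
	# flipping it initializes and registers the per-queue NAPI
	# instances, with no XDP program and without forcing TSO
	# off on the peer
	ethtool -K veth0 gro on
	ethtool -k veth1 | grep tcp-segmentation-offload   # -> still on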

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 drivers/net/veth.c | 129 ++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 116 insertions(+), 13 deletions(-)

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index ad36e7ed16134..ca44e82d1edeb 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -57,6 +57,7 @@ struct veth_rq_stats {
 
 struct veth_rq {
 	struct napi_struct	xdp_napi;
+	struct napi_struct __rcu *napi; /* points to xdp_napi when the latter is initialized */
 	struct net_device	*dev;
 	struct bpf_prog __rcu	*xdp_prog;
 	struct xdp_mem_info	xdp_mem;
@@ -287,7 +288,7 @@ static netdev_tx_t veth_xmit(struct sk_buff *skb, struct net_device *dev)
 	struct veth_rq *rq = NULL;
 	struct net_device *rcv;
 	int length = skb->len;
-	bool rcv_xdp = false;
+	bool use_napi = false;
 	int rxq;
 
 	rcu_read_lock();
@@ -301,20 +302,24 @@ static netdev_tx_t veth_xmit(struct sk_buff *skb, struct net_device *dev)
 	rxq = skb_get_queue_mapping(skb);
 	if (rxq < rcv->real_num_rx_queues) {
 		rq = &rcv_priv->rq[rxq];
-		rcv_xdp = rcu_access_pointer(rq->xdp_prog);
+
+		/* The napi pointer is available when an XDP program is
+		 * attached or when GRO is enabled
+		 */
+		use_napi = rcu_access_pointer(rq->napi);
 		skb_record_rx_queue(skb, rxq);
 	}
 
 	skb_tx_timestamp(skb);
-	if (likely(veth_forward_skb(rcv, skb, rq, rcv_xdp) == NET_RX_SUCCESS)) {
-		if (!rcv_xdp)
+	if (likely(veth_forward_skb(rcv, skb, rq, use_napi) == NET_RX_SUCCESS)) {
+		if (!use_napi)
 			dev_lstats_add(dev, length);
 	} else {
 drop:
 		atomic64_inc(&priv->dropped);
 	}
 
-	if (rcv_xdp)
+	if (use_napi)
 		__veth_xdp_flush(rq);
 
 	rcu_read_unlock();
@@ -891,7 +896,7 @@ static int veth_poll(struct napi_struct *napi, int budget)
 	return done;
 }
 
-static int veth_napi_add(struct net_device *dev)
+static int __veth_napi_enable(struct net_device *dev)
 {
 	struct veth_priv *priv = netdev_priv(dev);
 	int err, i;
@@ -908,6 +913,7 @@ static int veth_napi_add(struct net_device *dev)
 		struct veth_rq *rq = &priv->rq[i];
 
 		napi_enable(&rq->xdp_napi);
+		rcu_assign_pointer(priv->rq[i].napi, &priv->rq[i].xdp_napi);
 	}
 
 	return 0;
@@ -926,6 +932,7 @@ static void veth_napi_del(struct net_device *dev)
 	for (i = 0; i < dev->real_num_rx_queues; i++) {
 		struct veth_rq *rq = &priv->rq[i];
 
+		rcu_assign_pointer(priv->rq[i].napi, NULL);
 		napi_disable(&rq->xdp_napi);
 		__netif_napi_del(&rq->xdp_napi);
 	}
@@ -939,8 +946,14 @@ static void veth_napi_del(struct net_device *dev)
 	}
 }
 
+static bool veth_gro_requested(const struct net_device *dev)
+{
+	return !!(dev->wanted_features & NETIF_F_GRO);
+}
+
 static int veth_enable_xdp(struct net_device *dev)
 {
+	bool napi_already_on = veth_gro_requested(dev) && (dev->flags & IFF_UP);
 	struct veth_priv *priv = netdev_priv(dev);
 	int err, i;
 
@@ -948,7 +961,8 @@ static int veth_enable_xdp(struct net_device *dev)
 		for (i = 0; i < dev->real_num_rx_queues; i++) {
 			struct veth_rq *rq = &priv->rq[i];
 
-			netif_napi_add(dev, &rq->xdp_napi, veth_poll, NAPI_POLL_WEIGHT);
+			if (!napi_already_on)
+				netif_napi_add(dev, &rq->xdp_napi, veth_poll, NAPI_POLL_WEIGHT);
 			err = xdp_rxq_info_reg(&rq->xdp_rxq, dev, i, rq->xdp_napi.napi_id);
 			if (err < 0)
 				goto err_rxq_reg;
@@ -963,13 +977,25 @@ static int veth_enable_xdp(struct net_device *dev)
 			rq->xdp_mem = rq->xdp_rxq.mem;
 		}
 
-		err = veth_napi_add(dev);
-		if (err)
-			goto err_rxq_reg;
+		if (!napi_already_on) {
+			err = __veth_napi_enable(dev);
+			if (err)
+				goto err_rxq_reg;
+
+			if (!veth_gro_requested(dev)) {
+				/* user-space did not require GRO, but adding XDP
+				 * is supposed to get GRO working
+				 */
+				dev->features |= NETIF_F_GRO;
+				netdev_features_change(dev);
+			}
+		}
 	}
 
-	for (i = 0; i < dev->real_num_rx_queues; i++)
+	for (i = 0; i < dev->real_num_rx_queues; i++) {
 		rcu_assign_pointer(priv->rq[i].xdp_prog, priv->_xdp_prog);
+		rcu_assign_pointer(priv->rq[i].napi, &priv->rq[i].xdp_napi);
+	}
 
 	return 0;
 err_reg_mem:
@@ -979,7 +1005,8 @@ static int veth_enable_xdp(struct net_device *dev)
 		struct veth_rq *rq = &priv->rq[i];
 
 		xdp_rxq_info_unreg(&rq->xdp_rxq);
-		netif_napi_del(&rq->xdp_napi);
+		if (!napi_already_on)
+			netif_napi_del(&rq->xdp_napi);
 	}
 
 	return err;
@@ -992,7 +1019,19 @@ static void veth_disable_xdp(struct net_device *dev)
 
 	for (i = 0; i < dev->real_num_rx_queues; i++)
 		rcu_assign_pointer(priv->rq[i].xdp_prog, NULL);
-	veth_napi_del(dev);
+
+	if (!netif_running(dev) || !veth_gro_requested(dev)) {
+		veth_napi_del(dev);
+
+		/* if user-space did not require GRO, since adding XDP
+		 * enabled it, clear it now
+		 */
+		if (!veth_gro_requested(dev) && netif_running(dev)) {
+			dev->features &= ~NETIF_F_GRO;
+			netdev_features_change(dev);
+		}
+	}
+
 	for (i = 0; i < dev->real_num_rx_queues; i++) {
 		struct veth_rq *rq = &priv->rq[i];
 
@@ -1001,6 +1040,29 @@ static void veth_disable_xdp(struct net_device *dev)
 	}
 }
 
+static int veth_napi_enable(struct net_device *dev)
+{
+	struct veth_priv *priv = netdev_priv(dev);
+	int err, i;
+
+	for (i = 0; i < dev->real_num_rx_queues; i++) {
+		struct veth_rq *rq = &priv->rq[i];
+
+		netif_napi_add(dev, &rq->xdp_napi, veth_poll, NAPI_POLL_WEIGHT);
+	}
+
+	err = __veth_napi_enable(dev);
+	if (err) {
+		for (i = 0; i < dev->real_num_rx_queues; i++) {
+			struct veth_rq *rq = &priv->rq[i];
+
+			netif_napi_del(&rq->xdp_napi);
+		}
+		return err;
+	}
+	return err;
+}
+
 static int veth_open(struct net_device *dev)
 {
 	struct veth_priv *priv = netdev_priv(dev);
@@ -1014,6 +1076,10 @@ static int veth_open(struct net_device *dev)
 		err = veth_enable_xdp(dev);
 		if (err)
 			return err;
+	} else if (veth_gro_requested(dev)) {
+		err = veth_napi_enable(dev);
+		if (err)
+			return err;
 	}
 
 	if (peer->flags & IFF_UP) {
@@ -1035,6 +1101,8 @@ static int veth_close(struct net_device *dev)
 
 	if (priv->_xdp_prog)
 		veth_disable_xdp(dev);
+	else if (veth_gro_requested(dev))
+		veth_napi_del(dev);
 
 	return 0;
 }
@@ -1133,10 +1201,32 @@ static netdev_features_t veth_fix_features(struct net_device *dev,
 		if (peer_priv->_xdp_prog)
 			features &= ~NETIF_F_GSO_SOFTWARE;
 	}
+	if (priv->_xdp_prog)
+		features |= NETIF_F_GRO;
 
 	return features;
 }
 
+static int veth_set_features(struct net_device *dev,
+			     netdev_features_t features)
+{
+	netdev_features_t changed = features ^ dev->features;
+	struct veth_priv *priv = netdev_priv(dev);
+	int err;
+
+	if (!(changed & NETIF_F_GRO) || !(dev->flags & IFF_UP) || priv->_xdp_prog)
+		return 0;
+
+	if (features & NETIF_F_GRO) {
+		err = veth_napi_enable(dev);
+		if (err)
+			return err;
+	} else {
+		veth_napi_del(dev);
+	}
+	return 0;
+}
+
 static void veth_set_rx_headroom(struct net_device *dev, int new_hr)
 {
 	struct veth_priv *peer_priv, *priv = netdev_priv(dev);
@@ -1255,6 +1345,7 @@ static const struct net_device_ops veth_netdev_ops = {
 #endif
 	.ndo_get_iflink		= veth_get_iflink,
 	.ndo_fix_features	= veth_fix_features,
+	.ndo_set_features	= veth_set_features,
 	.ndo_features_check	= passthru_features_check,
 	.ndo_set_rx_headroom	= veth_set_rx_headroom,
 	.ndo_bpf		= veth_xdp,
@@ -1317,6 +1408,13 @@ static int veth_validate(struct nlattr *tb[], struct nlattr *data[],
 
 static struct rtnl_link_ops veth_link_ops;
 
+static void veth_disable_gro(struct net_device *dev)
+{
+	dev->features &= ~NETIF_F_GRO;
+	dev->wanted_features &= ~NETIF_F_GRO;
+	netdev_update_features(dev);
+}
+
 static int veth_newlink(struct net *src_net, struct net_device *dev,
 			struct nlattr *tb[], struct nlattr *data[],
 			struct netlink_ext_ack *extack)
@@ -1389,6 +1487,10 @@ static int veth_newlink(struct net *src_net, struct net_device *dev,
 	if (err < 0)
 		goto err_register_peer;
 
+	/* keep GRO disabled by default to be consistent with the established
+	 * veth behavior
+	 */
+	veth_disable_gro(peer);
 	netif_carrier_off(peer);
 
 	err = rtnl_configure_link(peer, ifmp);
@@ -1426,6 +1528,7 @@ static int veth_newlink(struct net *src_net, struct net_device *dev,
 	priv = netdev_priv(peer);
 	rcu_assign_pointer(priv->peer, dev);
 
+	veth_disable_gro(dev);
 	return 0;
 
 err_register_dev:
-- 
2.26.2



* [PATCH net-next 3/4] veth: refine napi usage
  2021-04-09 11:04 [PATCH net-next 0/4] veth: allow GRO even without XDP Paolo Abeni
  2021-04-09 11:04 ` [PATCH net-next 1/4] veth: use skb_orphan_partial instead of skb_orphan Paolo Abeni
  2021-04-09 11:04 ` [PATCH net-next 2/4] veth: allow enabling NAPI even without XDP Paolo Abeni
@ 2021-04-09 11:04 ` Paolo Abeni
  2021-04-09 14:57   ` Toke Høiland-Jørgensen
  2021-04-09 11:04 ` [PATCH net-next 4/4] self-tests: add veth tests Paolo Abeni
  2021-04-12  0:10 ` [PATCH net-next 0/4] veth: allow GRO even without XDP patchwork-bot+netdevbpf
  4 siblings, 1 reply; 14+ messages in thread
From: Paolo Abeni @ 2021-04-09 11:04 UTC
  To: netdev; +Cc: David S. Miller, Jakub Kicinski, Toshiaki Makita, Lorenzo Bianconi

After the previous patch, when enabling GRO, locally generated
TCP traffic experiences some measurable overhead, as it traverses
the GRO engine without any chance of aggregation.

This change refines the NAPI receive path admission test, to avoid
unnecessary GRO overhead in most scenarios, when GRO is enabled
on a veth peer.

Only skbs that are eligible for aggregation enter the GRO layer;
the others go through the traditional receive path.
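A rough way to observe the heuristic from user-space, borrowing the
nstat-based check used by the self-tests later in this series (the
destination address and tool invocation are illustrative):

	nstat -n          # update the counters history, printing nothing
	./udpgso_bench_tx -4 -s 13000 -S 1300 -M 1 -D 192.168.1.1
	nstat IpInReceives   # ~1 datagram when GRO aggregates, 10 otherwise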

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 drivers/net/veth.c | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index ca44e82d1edeb..85f90f33d437e 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -282,6 +282,25 @@ static int veth_forward_skb(struct net_device *dev, struct sk_buff *skb,
 		netif_rx(skb);
 }
 
+/* return true if the specified skb has chances of GRO aggregation
+ * Don't strive for accuracy, but try to avoid GRO overhead in the most
+ * common scenarios.
+ * When XDP is enabled, all traffic is considered eligible, as the xmit
+ * device has TSO off.
+ * When TSO is enabled on the xmit device, we are likely interested only
+ * in UDP aggregation, explicitly check for that if the skb is suspected
+ * to belong to locally generated UDP traffic - the sock_wfree
+ * destructor is used by UDP, ICMP and XDP sockets.
+ */
+static bool veth_skb_is_eligible_for_gro(const struct net_device *dev,
+					 const struct net_device *rcv,
+					 const struct sk_buff *skb)
+{
+	return !(dev->features & NETIF_F_ALL_TSO) ||
+		(skb->destructor == sock_wfree &&
+		 rcv->features & (NETIF_F_GRO_FRAGLIST | NETIF_F_GRO_UDP_FWD));
+}
+
 static netdev_tx_t veth_xmit(struct sk_buff *skb, struct net_device *dev)
 {
 	struct veth_priv *rcv_priv, *priv = netdev_priv(dev);
@@ -305,8 +324,10 @@ static netdev_tx_t veth_xmit(struct sk_buff *skb, struct net_device *dev)
 
 		/* The napi pointer is available when an XDP program is
 		 * attached or when GRO is enabled
+		 * Don't bother with napi/GRO if the skb can't be aggregated
 		 */
-		use_napi = rcu_access_pointer(rq->napi);
+		use_napi = rcu_access_pointer(rq->napi) &&
+			   veth_skb_is_eligible_for_gro(dev, rcv, skb);
 		skb_record_rx_queue(skb, rxq);
 	}
 
-- 
2.26.2



* [PATCH net-next 4/4] self-tests: add veth tests
  2021-04-09 11:04 [PATCH net-next 0/4] veth: allow GRO even without XDP Paolo Abeni
                   ` (2 preceding siblings ...)
  2021-04-09 11:04 ` [PATCH net-next 3/4] veth: refine napi usage Paolo Abeni
@ 2021-04-09 11:04 ` Paolo Abeni
  2021-04-12  0:10 ` [PATCH net-next 0/4] veth: allow GRO even without XDP patchwork-bot+netdevbpf
  4 siblings, 0 replies; 14+ messages in thread
From: Paolo Abeni @ 2021-04-09 11:04 UTC
  To: netdev; +Cc: David S. Miller, Jakub Kicinski, Toshiaki Makita, Lorenzo Bianconi

Add some basic veth tests that verify the expected flags and
aggregation with different setups (default, XDP, etc.).
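One possible way to run them, assuming an in-tree kernel build (the
script expects the bpf self-tests' xdp_dummy.o one directory up):

	# build the bpf self-tests first, so ../bpf/xdp_dummy.o exists
	make -C tools/testing/selftests/bpf
	# then run the veth tests; root is needed for the netns setup
	cd tools/testing/selftests/net
	sudo ./veth.sh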

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 tools/testing/selftests/net/Makefile |   1 +
 tools/testing/selftests/net/veth.sh  | 177 +++++++++++++++++++++++++++
 2 files changed, 178 insertions(+)
 create mode 100755 tools/testing/selftests/net/veth.sh

diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile
index 2d71b283dde36..f4242a9610883 100644
--- a/tools/testing/selftests/net/Makefile
+++ b/tools/testing/selftests/net/Makefile
@@ -24,6 +24,7 @@ TEST_PROGS += vrf_route_leaking.sh
 TEST_PROGS += bareudp.sh
 TEST_PROGS += unicast_extensions.sh
 TEST_PROGS += udpgro_fwd.sh
+TEST_PROGS += veth.sh
 TEST_PROGS_EXTENDED := in_netns.sh
 TEST_GEN_FILES =  socket nettest
 TEST_GEN_FILES += psock_fanout psock_tpacket msg_zerocopy reuseport_addr_any
diff --git a/tools/testing/selftests/net/veth.sh b/tools/testing/selftests/net/veth.sh
new file mode 100755
index 0000000000000..2fedc0781ce8c
--- /dev/null
+++ b/tools/testing/selftests/net/veth.sh
@@ -0,0 +1,177 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+
+readonly STATS="$(mktemp -p /tmp ns-XXXXXX)"
+readonly BASE=`basename $STATS`
+readonly SRC=2
+readonly DST=1
+readonly DST_NAT=100
+readonly NS_SRC=$BASE$SRC
+readonly NS_DST=$BASE$DST
+
+# "baremetal" network used for raw UDP traffic
+readonly BM_NET_V4=192.168.1.
+readonly BM_NET_V6=2001:db8::
+
+readonly NPROCS=`nproc`
+ret=0
+
+cleanup() {
+	local ns
+	local -r jobs="$(jobs -p)"
+	[ -n "${jobs}" ] && kill -1 ${jobs} 2>/dev/null
+	rm -f $STATS
+
+	for ns in $NS_SRC $NS_DST; do
+		ip netns del $ns 2>/dev/null
+	done
+}
+
+trap cleanup EXIT
+
+create_ns() {
+	local ns
+
+	for ns in $NS_SRC $NS_DST; do
+		ip netns add $ns
+		ip -n $ns link set dev lo up
+	done
+
+	ip link add name veth$SRC type veth peer name veth$DST
+
+	for ns in $SRC $DST; do
+		ip link set dev veth$ns netns $BASE$ns up
+		ip -n $BASE$ns addr add dev veth$ns $BM_NET_V4$ns/24
+		ip -n $BASE$ns addr add dev veth$ns $BM_NET_V6$ns/64 nodad
+	done
+	echo "#kernel" > $BASE
+	chmod go-rw $BASE
+}
+
+__chk_flag() {
+	local msg="$1"
+	local target=$2
+	local expected=$3
+	local flagname=$4
+
+	local flag=`ip netns exec $BASE$target ethtool -k veth$target |\
+		    grep $flagname | awk '{print $2}'`
+
+	printf "%-60s" "$msg"
+	if [ "$flag" = "$expected" ]; then
+		echo " ok "
+	else
+		echo " fail - expected $expected found $flag"
+		ret=1
+	fi
+}
+
+chk_gro_flag() {
+	__chk_flag "$1" $2 $3 generic-receive-offload
+}
+
+chk_tso_flag() {
+	__chk_flag "$1" $2 $3 tcp-segmentation-offload
+}
+
+chk_gro() {
+	local msg="$1"
+	local expected=$2
+
+	ip netns exec $BASE$SRC ping -qc 1 $BM_NET_V4$DST >/dev/null
+	NSTAT_HISTORY=$STATS ip netns exec $NS_DST nstat -n
+
+	printf "%-60s" "$msg"
+	ip netns exec $BASE$DST ./udpgso_bench_rx -C 1000 -R 10 &
+	local spid=$!
+	sleep 0.1
+
+	ip netns exec $NS_SRC ./udpgso_bench_tx -4 -s 13000 -S 1300 -M 1 -D $BM_NET_V4$DST
+	local retc=$?
+	wait $spid
+	local rets=$?
+	if [ ${rets} -ne 0 ] || [ ${retc} -ne 0 ]; then
+		echo " fail client exit code $retc, server $rets"
+		ret=1
+		return
+	fi
+
+	local pkts=`NSTAT_HISTORY=$STATS ip netns exec $NS_DST nstat IpInReceives | \
+		    awk '{print $2}' | tail -n 1`
+	if [ "$pkts" = "$expected" ]; then
+		echo " ok "
+	else
+		echo " fail - got $pkts packets, expected $expected "
+		ret=1
+	fi
+}
+
+if [ ! -f ../bpf/xdp_dummy.o ]; then
+	echo "Missing xdp_dummy helper. Build bpf selftest first"
+	exit -1
+fi
+
+create_ns
+chk_gro_flag "default - gro flag" $SRC off
+chk_gro_flag "        - peer gro flag" $DST off
+chk_tso_flag "        - tso flag" $SRC on
+chk_tso_flag "        - peer tso flag" $DST on
+chk_gro "        - aggregation" 1
+ip netns exec $NS_SRC ethtool -K veth$SRC tx-udp-segmentation off
+chk_gro "        - aggregation with TSO off" 10
+cleanup
+
+create_ns
+ip netns exec $NS_DST ethtool -K veth$DST gro on
+chk_gro_flag "with gro on - gro flag" $DST on
+chk_gro_flag "        - peer gro flag" $SRC off
+chk_tso_flag "        - tso flag" $SRC on
+chk_tso_flag "        - peer tso flag" $DST on
+ip netns exec $NS_SRC ethtool -K veth$SRC tx-udp-segmentation off
+ip netns exec $NS_DST ethtool -K veth$DST rx-udp-gro-forwarding on
+chk_gro "        - aggregation with TSO off" 1
+cleanup
+
+create_ns
+ip -n $NS_DST link set dev veth$DST down
+ip netns exec $NS_DST ethtool -K veth$DST gro on
+chk_gro_flag "with gro enabled on link down - gro flag" $DST on
+chk_gro_flag "        - peer gro flag" $SRC off
+chk_tso_flag "        - tso flag" $SRC on
+chk_tso_flag "        - peer tso flag" $DST on
+ip -n $NS_DST link set dev veth$DST up
+ip netns exec $NS_SRC ethtool -K veth$SRC tx-udp-segmentation off
+ip netns exec $NS_DST ethtool -K veth$DST rx-udp-gro-forwarding on
+chk_gro "        - aggregation with TSO off" 1
+cleanup
+
+create_ns
+ip -n $NS_DST link set dev veth$DST xdp object ../bpf/xdp_dummy.o section xdp_dummy 2>/dev/null
+chk_gro_flag "with xdp attached - gro flag" $DST on
+chk_gro_flag "        - peer gro flag" $SRC off
+chk_tso_flag "        - tso flag" $SRC off
+chk_tso_flag "        - peer tso flag" $DST on
+ip netns exec $NS_DST ethtool -K veth$DST rx-udp-gro-forwarding on
+chk_gro "        - aggregation" 1
+
+
+ip -n $NS_DST link set dev veth$DST down
+ip -n $NS_SRC link set dev veth$SRC down
+chk_gro_flag "        - after dev off, flag" $DST on
+chk_gro_flag "        - peer flag" $SRC off
+
+ip netns exec $NS_DST ethtool -K veth$DST gro on
+ip -n $NS_DST link set dev veth$DST xdp off
+chk_gro_flag "        - after gro on xdp off, gro flag" $DST on
+chk_gro_flag "        - peer gro flag" $SRC off
+chk_tso_flag "        - tso flag" $SRC on
+chk_tso_flag "        - peer tso flag" $DST on
+ip -n $NS_DST link set dev veth$DST up
+ip -n $NS_SRC link set dev veth$SRC up
+chk_gro "        - aggregation" 1
+
+ip netns exec $NS_DST ethtool -K veth$DST gro off
+ip netns exec $NS_SRC ethtool -K veth$SRC tx-udp-segmentation off
+chk_gro "aggregation again with default and TSO off" 10
+
+exit $ret
-- 
2.26.2



* Re: [PATCH net-next 3/4] veth: refine napi usage
  2021-04-09 11:04 ` [PATCH net-next 3/4] veth: refine napi usage Paolo Abeni
@ 2021-04-09 14:57   ` Toke Høiland-Jørgensen
  2021-04-09 15:07     ` Paolo Abeni
  0 siblings, 1 reply; 14+ messages in thread
From: Toke Høiland-Jørgensen @ 2021-04-09 14:57 UTC
  To: Paolo Abeni, netdev
  Cc: David S. Miller, Jakub Kicinski, Toshiaki Makita, Lorenzo Bianconi

Paolo Abeni <pabeni@redhat.com> writes:

> After the previous patch, when enabling GRO, locally generated
> TCP traffic experiences some measurable overhead, as it traverses
> the GRO engine without any chance of aggregation.
>
> This change refines the NAPI receive path admission test, to avoid
> unnecessary GRO overhead in most scenarios, when GRO is enabled
> on a veth peer.
>
> Only skbs that are eligible for aggregation enter the GRO layer;
> the others go through the traditional receive path.
>
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
>  drivers/net/veth.c | 23 ++++++++++++++++++++++-
>  1 file changed, 22 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/veth.c b/drivers/net/veth.c
> index ca44e82d1edeb..85f90f33d437e 100644
> --- a/drivers/net/veth.c
> +++ b/drivers/net/veth.c
> @@ -282,6 +282,25 @@ static int veth_forward_skb(struct net_device *dev, struct sk_buff *skb,
>  		netif_rx(skb);
>  }
>  
> +/* return true if the specified skb has chances of GRO aggregation
> + * Don't strive for accuracy, but try to avoid GRO overhead in the most
> + * common scenarios.
> + * When XDP is enabled, all traffic is considered eligible, as the xmit
> + * device has TSO off.
> + * When TSO is enabled on the xmit device, we are likely interested only
> + * in UDP aggregation, explicitly check for that if the skb is suspected
> + * to belong to locally generated UDP traffic - the sock_wfree
> + * destructor is used by UDP, ICMP and XDP sockets.
> + */
> +static bool veth_skb_is_eligible_for_gro(const struct net_device *dev,
> +					 const struct net_device *rcv,
> +					 const struct sk_buff *skb)
> +{
> +	return !(dev->features & NETIF_F_ALL_TSO) ||
> +		(skb->destructor == sock_wfree &&
> +		 rcv->features & (NETIF_F_GRO_FRAGLIST | NETIF_F_GRO_UDP_FWD));
> +}
> +
>  static netdev_tx_t veth_xmit(struct sk_buff *skb, struct net_device *dev)
>  {
>  	struct veth_priv *rcv_priv, *priv = netdev_priv(dev);
> @@ -305,8 +324,10 @@ static netdev_tx_t veth_xmit(struct sk_buff *skb, struct net_device *dev)
>  
>  		/* The napi pointer is available when an XDP program is
>  		 * attached or when GRO is enabled
> +		 * Don't bother with napi/GRO if the skb can't be aggregated
>  		 */
> -		use_napi = rcu_access_pointer(rq->napi);
> +		use_napi = rcu_access_pointer(rq->napi) &&
> +			   veth_skb_is_eligible_for_gro(dev, rcv, skb);
>  		skb_record_rx_queue(skb, rxq);
>  	}

You just changed the 'xdp_rcv' check to this use_napi, and now you're
conditioning it on GRO eligibility, so doesn't this break XDP if that
was the reason NAPI was turned on in the first place?

-Toke



* Re: [PATCH net-next 2/4] veth: allow enabling NAPI even without XDP
  2021-04-09 11:04 ` [PATCH net-next 2/4] veth: allow enabling NAPI even without XDP Paolo Abeni
@ 2021-04-09 14:58   ` Toke Høiland-Jørgensen
  2021-04-09 15:20     ` Paolo Abeni
  0 siblings, 1 reply; 14+ messages in thread
From: Toke Høiland-Jørgensen @ 2021-04-09 14:58 UTC
  To: Paolo Abeni, netdev
  Cc: David S. Miller, Jakub Kicinski, Toshiaki Makita, Lorenzo Bianconi

Paolo Abeni <pabeni@redhat.com> writes:

> Currently the veth device has the GRO feature bit set, even if
> no GRO aggregation is possible with the default configuration,
> as the veth device does not hook into the GRO engine.
>
> Flipping the GRO feature bit from user-space is a no-op, unless
> XDP is enabled. In such a scenario GRO could actually take place, but
> TSO is forced off on the peer device.
>
> This change allows user-space to really control the GRO feature, with
> no need for an XDP program.
>
> The GRO feature bit is now cleared by default - so that there are no
> user-visible behavior changes with the default configuration.
>
> When the GRO bit is set, the per-queue NAPI instances are initialized
> and registered. On xmit, when napi instances are available, we try
> to use them.

Am I mistaken in thinking that this also makes XDP redirect into a veth
work without having to load an XDP program on the peer device? That's
been a long-outstanding thing we've been meaning to fix, so that would
be awesome! :)

-Toke



* Re: [PATCH net-next 3/4] veth: refine napi usage
  2021-04-09 14:57   ` Toke Høiland-Jørgensen
@ 2021-04-09 15:07     ` Paolo Abeni
  2021-04-09 15:18       ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 14+ messages in thread
From: Paolo Abeni @ 2021-04-09 15:07 UTC
  To: Toke Høiland-Jørgensen, netdev
  Cc: David S. Miller, Jakub Kicinski, Toshiaki Makita, Lorenzo Bianconi

hello,

On Fri, 2021-04-09 at 16:57 +0200, Toke Høiland-Jørgensen wrote:
> Paolo Abeni <pabeni@redhat.com> writes:
> 
> > After the previous patch, when enabling GRO, locally generated
> > TCP traffic experiences some measurable overhead, as it traverses
> > the GRO engine without any chance of aggregation.
> > 
> > This change refines the NAPI receive path admission test, to avoid
> > unnecessary GRO overhead in most scenarios, when GRO is enabled
> > on a veth peer.
> > 
> > Only skbs that are eligible for aggregation enter the GRO layer;
> > the others go through the traditional receive path.
> > 
> > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> > ---
> >  drivers/net/veth.c | 23 ++++++++++++++++++++++-
> >  1 file changed, 22 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/net/veth.c b/drivers/net/veth.c
> > index ca44e82d1edeb..85f90f33d437e 100644
> > --- a/drivers/net/veth.c
> > +++ b/drivers/net/veth.c
> > @@ -282,6 +282,25 @@ static int veth_forward_skb(struct net_device *dev, struct sk_buff *skb,
> >  		netif_rx(skb);
> >  }
> >  
> > +/* return true if the specified skb has chances of GRO aggregation
> > + * Don't strive for accuracy, but try to avoid GRO overhead in the most
> > + * common scenarios.
> > + * When XDP is enabled, all traffic is considered eligible, as the xmit
> > + * device has TSO off.
> > + * When TSO is enabled on the xmit device, we are likely interested only
> > + * in UDP aggregation, explicitly check for that if the skb is suspected
> > + * to belong to locally generated UDP traffic - the sock_wfree
> > + * destructor is used by UDP, ICMP and XDP sockets.
> > + */
> > +static bool veth_skb_is_eligible_for_gro(const struct net_device *dev,
> > +					 const struct net_device *rcv,
> > +					 const struct sk_buff *skb)
> > +{
> > +	return !(dev->features & NETIF_F_ALL_TSO) ||
> > +		(skb->destructor == sock_wfree &&
> > +		 rcv->features & (NETIF_F_GRO_FRAGLIST | NETIF_F_GRO_UDP_FWD));
> > +}
> > +
> >  static netdev_tx_t veth_xmit(struct sk_buff *skb, struct net_device *dev)
> >  {
> >  	struct veth_priv *rcv_priv, *priv = netdev_priv(dev);
> > @@ -305,8 +324,10 @@ static netdev_tx_t veth_xmit(struct sk_buff *skb, struct net_device *dev)
> >  
> >  		/* The napi pointer is available when an XDP program is
> >  		 * attached or when GRO is enabled
> > +		 * Don't bother with napi/GRO if the skb can't be aggregated
> >  		 */
> > -		use_napi = rcu_access_pointer(rq->napi);
> > +		use_napi = rcu_access_pointer(rq->napi) &&
> > +			   veth_skb_is_eligible_for_gro(dev, rcv, skb);
> >  		skb_record_rx_queue(skb, rxq);
> >  	}
> 
> You just changed the 'xdp_rcv' check to this use_napi, and now you're
> conditioning it on GRO eligibility, so doesn't this break XDP if that
> was the reason NAPI was turned on in the first place?

Thank you for the feedback.

If XDP is enabled, TSO is forced off on 'dev'
and veth_skb_is_eligible_for_gro() returns true, so napi/GRO is always
used - there is no functional change when XDP is enabled.
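For instance, this can be double-checked with the same probes the
self-tests in this series use (namespace and device names are
illustrative):

	ip -n $NS_DST link set dev veth1 xdp object ../bpf/xdp_dummy.o section xdp_dummy
	ip netns exec $NS_SRC ethtool -k veth2 | grep tcp-segmentation-offload   # -> off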

Please let me know if the above is more clear, thanks!

Paolo



* Re: [PATCH net-next 3/4] veth: refine napi usage
  2021-04-09 15:07     ` Paolo Abeni
@ 2021-04-09 15:18       ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 14+ messages in thread
From: Toke Høiland-Jørgensen @ 2021-04-09 15:18 UTC
  To: Paolo Abeni, netdev
  Cc: David S. Miller, Jakub Kicinski, Toshiaki Makita, Lorenzo Bianconi

Paolo Abeni <pabeni@redhat.com> writes:

> hello,
>
> On Fri, 2021-04-09 at 16:57 +0200, Toke Høiland-Jørgensen wrote:
>> Paolo Abeni <pabeni@redhat.com> writes:
>> 
>> > After the previous patch, when enabling GRO, locally generated
>> > TCP traffic experiences some measurable overhead, as it traverses
>> > the GRO engine without any chance of aggregation.
>> > 
>> > This change refines the NAPI receive path admission test, to avoid
>> > unnecessary GRO overhead in most scenarios, when GRO is enabled
>> > on a veth peer.
>> > 
>> > Only skbs that are eligible for aggregation enter the GRO layer;
>> > the others go through the traditional receive path.
>> > 
>> > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
>> > ---
>> >  drivers/net/veth.c | 23 ++++++++++++++++++++++-
>> >  1 file changed, 22 insertions(+), 1 deletion(-)
>> > 
>> > diff --git a/drivers/net/veth.c b/drivers/net/veth.c
>> > index ca44e82d1edeb..85f90f33d437e 100644
>> > --- a/drivers/net/veth.c
>> > +++ b/drivers/net/veth.c
>> > @@ -282,6 +282,25 @@ static int veth_forward_skb(struct net_device *dev, struct sk_buff *skb,
>> >  		netif_rx(skb);
>> >  }
>> >  
>> > +/* return true if the specified skb has chances of GRO aggregation
>> > + * Don't strive for accuracy, but try to avoid GRO overhead in the most
>> > + * common scenarios.
>> > + * When XDP is enabled, all traffic is considered eligible, as the xmit
>> > + * device has TSO off.
>> > + * When TSO is enabled on the xmit device, we are likely interested only
>> > + * in UDP aggregation, explicitly check for that if the skb is suspected
>> > + * to belong to locally generated UDP traffic - the sock_wfree
>> > + * destructor is used by UDP, ICMP and XDP sockets.
>> > + */
>> > +static bool veth_skb_is_eligible_for_gro(const struct net_device *dev,
>> > +					 const struct net_device *rcv,
>> > +					 const struct sk_buff *skb)
>> > +{
>> > +	return !(dev->features & NETIF_F_ALL_TSO) ||
>> > +		(skb->destructor == sock_wfree &&
>> > +		 rcv->features & (NETIF_F_GRO_FRAGLIST | NETIF_F_GRO_UDP_FWD));
>> > +}
>> > +
>> >  static netdev_tx_t veth_xmit(struct sk_buff *skb, struct net_device *dev)
>> >  {
>> >  	struct veth_priv *rcv_priv, *priv = netdev_priv(dev);
>> > @@ -305,8 +324,10 @@ static netdev_tx_t veth_xmit(struct sk_buff *skb, struct net_device *dev)
>> >  
>> >  		/* The napi pointer is available when an XDP program is
>> >  		 * attached or when GRO is enabled
>> > +		 * Don't bother with napi/GRO if the skb can't be aggregated
>> >  		 */
>> > -		use_napi = rcu_access_pointer(rq->napi);
>> > +		use_napi = rcu_access_pointer(rq->napi) &&
>> > +			   veth_skb_is_eligible_for_gro(dev, rcv, skb);
>> >  		skb_record_rx_queue(skb, rxq);
>> >  	}
>> 
>> You just changed the 'xdp_rcv' check to this use_napi, and now you're
>> conditioning it on GRO eligibility, so doesn't this break XDP if that
>> was the reason NAPI was turned on in the first place?
>
> Thank you for the feedback.
>
> If XDP is enabled, TSO is forced off on 'dev'
> and veth_skb_is_eligible_for_gro() returns true, so napi/GRO is always
> used - there is no functional change when XDP is enabled.

Ah, right, so it says right there in the comment; sorry for missing
that! :)

-Toke



* Re: [PATCH net-next 2/4] veth: allow enabling NAPI even without XDP
  2021-04-09 14:58   ` Toke Høiland-Jørgensen
@ 2021-04-09 15:20     ` Paolo Abeni
  2021-04-16 15:29       ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 14+ messages in thread
From: Paolo Abeni @ 2021-04-09 15:20 UTC
  To: Toke Høiland-Jørgensen, netdev
  Cc: David S. Miller, Jakub Kicinski, Toshiaki Makita, Lorenzo Bianconi

On Fri, 2021-04-09 at 16:58 +0200, Toke Høiland-Jørgensen wrote:
> Paolo Abeni <pabeni@redhat.com> writes:
> 
> > Currently the veth device has the GRO feature bit set, even if
> > no GRO aggregation is possible with the default configuration,
> > as the veth device does not hook into the GRO engine.
> > 
> > Flipping the GRO feature bit from user-space is a no-op, unless
> > XDP is enabled. In such a scenario GRO could actually take place, but
> > TSO is forced off on the peer device.
> > 
> > This change allows user-space to really control the GRO feature, with
> > no need for an XDP program.
> > 
> > The GRO feature bit is now cleared by default - so that there are no
> > user-visible behavior changes with the default configuration.
> > 
> > When the GRO bit is set, the per-queue NAPI instances are initialized
> > and registered. On xmit, when napi instances are available, we try
> > to use them.
> 
> Am I mistaken in thinking that this also makes XDP redirect into a veth
> work without having to load an XDP program on the peer device? That's
> been a long-outstanding thing we've been meaning to fix, so that would
> be awesome! :)

I have not experimented with that, and I admit gross ignorance WRT this
topic, but AFAICS the needed bits to get XDP redirect working on
veth are the ptr_ring initialization and an available napi instance.

With this patch both are in place when GRO is enabled, so I guess XDP
redirect should work, too (modulo bugs in this untested scenario).

Thanks!

Paolo



* Re: [PATCH net-next 0/4] veth: allow GRO even without XDP
  2021-04-09 11:04 [PATCH net-next 0/4] veth: allow GRO even without XDP Paolo Abeni
                   ` (3 preceding siblings ...)
  2021-04-09 11:04 ` [PATCH net-next 4/4] self-tests: add veth tests Paolo Abeni
@ 2021-04-12  0:10 ` patchwork-bot+netdevbpf
  4 siblings, 0 replies; 14+ messages in thread
From: patchwork-bot+netdevbpf @ 2021-04-12  0:10 UTC
  To: Paolo Abeni; +Cc: netdev, davem, kuba, toshiaki.makita1, lorenzo

Hello:

This series was applied to netdev/net-next.git (refs/heads/master):

On Fri,  9 Apr 2021 13:04:36 +0200 you wrote:
> This series allows user-space to enable GRO/NAPI on a veth
> device even without attaching an XDP program.
> 
> It does not change the default veth behavior (no NAPI, no GRO),
> except that, on top of this series, the GRO feature bit will be
> effectively off by default on veth devices. Note that currently
> the GRO bit is on by default, but GRO never takes place in the
> absence of XDP.
> 
> [...]

Here is the summary with links:
  - [net-next,1/4] veth: use skb_orphan_partial instead of skb_orphan
    https://git.kernel.org/netdev/net-next/c/c75fb320d482
  - [net-next,2/4] veth: allow enabling NAPI even without XDP
    https://git.kernel.org/netdev/net-next/c/d3256efd8e8b
  - [net-next,3/4] veth: refine napi usage
    https://git.kernel.org/netdev/net-next/c/47e550e0105b
  - [net-next,4/4] self-tests: add veth tests
    https://git.kernel.org/netdev/net-next/c/1c3cadbe0242

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html




* Re: [PATCH net-next 2/4] veth: allow enabling NAPI even without XDP
  2021-04-09 15:20     ` Paolo Abeni
@ 2021-04-16 15:29       ` Toke Høiland-Jørgensen
  2021-04-16 17:26         ` Paolo Abeni
  0 siblings, 1 reply; 14+ messages in thread
From: Toke Høiland-Jørgensen @ 2021-04-16 15:29 UTC
  To: Paolo Abeni, netdev
  Cc: David S. Miller, Jakub Kicinski, Toshiaki Makita, Lorenzo Bianconi

Paolo Abeni <pabeni@redhat.com> writes:

> On Fri, 2021-04-09 at 16:58 +0200, Toke Høiland-Jørgensen wrote:
>> Paolo Abeni <pabeni@redhat.com> writes:
>> 
>> > Currently the veth device has the GRO feature bit set, even if
>> > no GRO aggregation is possible with the default configuration,
>> > as the veth device does not hook into the GRO engine.
>> > 
>> > Flipping the GRO feature bit from user-space is a no-op, unless
>> > XDP is enabled. In such a scenario GRO could actually take place, but
>> > TSO is forced off on the peer device.
>> > 
>> > This change allows user-space to really control the GRO feature, with
>> > no need for an XDP program.
>> > 
>> > The GRO feature bit is now cleared by default - so that there are no
>> > user-visible behavior changes with the default configuration.
>> > 
>> > When the GRO bit is set, the per-queue NAPI instances are initialized
>> > and registered. On xmit, when napi instances are available, we try
>> > to use them.
>> 
>> Am I mistaken in thinking that this also makes XDP redirect into a veth
>> work without having to load an XDP program on the peer device? That's
>> been a long-outstanding thing we've been meaning to fix, so that would
>> be awesome! :)
>
> I have not experimented with that, and I admit gross ignorance WRT this
> topic, but AFAICS the needed bits to get XDP redirect working on
> veth are the ptr_ring initialization and an available napi instance.
>
> With this patch both are in place when GRO is enabled, so I guess XDP
> redirect should work, too (modulo bugs in this untested scenario).

OK, finally got around to testing this; it doesn't quite work with just
your patch, because veth_xdp_xmit() still checks for rq->xdp_prog
instead of rq->napi. Fixing this indeed enabled veth to be an
XDP_REDIRECT target without an XDP program loaded on the peer. So yay!
I'll send a followup fixing that check.

So with this we seem to have some nice improvements in both
functionality and performance when GRO is turned on; so any reason why
we shouldn't just flip the default to on?

-Toke



* Re: [PATCH net-next 2/4] veth: allow enabling NAPI even without XDP
  2021-04-16 15:29       ` Toke Høiland-Jørgensen
@ 2021-04-16 17:26         ` Paolo Abeni
  2021-04-16 18:19           ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 14+ messages in thread
From: Paolo Abeni @ 2021-04-16 17:26 UTC
  To: Toke Høiland-Jørgensen, netdev
  Cc: David S. Miller, Jakub Kicinski, Toshiaki Makita, Lorenzo Bianconi

On Fri, 2021-04-16 at 17:29 +0200, Toke Høiland-Jørgensen wrote:
> Paolo Abeni <pabeni@redhat.com> writes:
> 
> > On Fri, 2021-04-09 at 16:58 +0200, Toke Høiland-Jørgensen wrote:
> > > Paolo Abeni <pabeni@redhat.com> writes:
> > > 
> > > > Currently the veth device has the GRO feature bit set, even if
> > > > no GRO aggregation is possible with the default configuration,
> > > > as the veth device does not hook into the GRO engine.
> > > > 
> > > > Flipping the GRO feature bit from user-space is a no-op, unless
> > > XDP is enabled. In such a scenario GRO could actually take place, but
> > > TSO is forced off on the peer device.
> > > 
> > > This change allows user-space to really control the GRO feature, with
> > > no need for an XDP program.
> > > > 
> > > > The GRO feature bit is now cleared by default - so that there are no
> > > > user-visible behavior changes with the default configuration.
> > > > 
> > > > When the GRO bit is set, the per-queue NAPI instances are initialized
> > > > and registered. On xmit, when napi instances are available, we try
> > > > to use them.
> > > 
> > > Am I mistaken in thinking that this also makes XDP redirect into a veth
> > > work without having to load an XDP program on the peer device? That's
> > > been a long-outstanding thing we've been meaning to fix, so that would
> > > be awesome! :)
> > 
> > I have not experimented with that, and I admit gross ignorance WRT this
> > topic, but AFAICS the needed bits to get XDP redirect working on
> > veth are the ptr_ring initialization and an available napi instance.
> > 
> > With this patch both are in place when GRO is enabled, so I guess XDP
> > redirect should work, too (modulo bugs in this untested scenario).
> 
> OK, finally got around to testing this; it doesn't quite work with just
> your patch, because veth_xdp_xmit() still checks for rq->xdp_prog
> instead of rq->napi. Fixing this indeed enabled veth to be an
> XDP_REDIRECT target without an XDP program loaded on the peer. So yay!
> I'll send a followup fixing that check.

Thank you for double checking!

> So with this we seem to have some nice improvements in both
> functionality and performance when GRO is turned on; so any reason why
> we shouldn't just flip the default to on?

Uhmmm... patch 3/4 should avoid the GRO overhead for most cases where
we can't leverage the aggregation benefit, but I'm not 110% sure that
enabling GRO by default will not cause performance regressions in some
scenarios.

If this proves to be always a win we can still change the default
later, I think.

Cheers,

Paolo




* Re: [PATCH net-next 2/4] veth: allow enabling NAPI even without XDP
  2021-04-16 17:26         ` Paolo Abeni
@ 2021-04-16 18:19           ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 14+ messages in thread
From: Toke Høiland-Jørgensen @ 2021-04-16 18:19 UTC
  To: Paolo Abeni, netdev
  Cc: David S. Miller, Jakub Kicinski, Toshiaki Makita, Lorenzo Bianconi

Paolo Abeni <pabeni@redhat.com> writes:

> On Fri, 2021-04-16 at 17:29 +0200, Toke Høiland-Jørgensen wrote:
>> Paolo Abeni <pabeni@redhat.com> writes:
>> 
>> > On Fri, 2021-04-09 at 16:58 +0200, Toke Høiland-Jørgensen wrote:
>> > > Paolo Abeni <pabeni@redhat.com> writes:
>> > > 
>> > > > Currently the veth device has the GRO feature bit set, even if
>> > > > no GRO aggregation is possible with the default configuration,
>> > > > as the veth device does not hook into the GRO engine.
>> > > > 
>> > > > Flipping the GRO feature bit from user-space is a no-op, unless
>> > > XDP is enabled. In such a scenario GRO could actually take place, but
>> > > TSO is forced off on the peer device.
>> > > 
>> > > This change allows user-space to really control the GRO feature, with
>> > > no need for an XDP program.
>> > > > 
>> > > > The GRO feature bit is now cleared by default - so that there are no
>> > > > user-visible behavior changes with the default configuration.
>> > > > 
>> > > > When the GRO bit is set, the per-queue NAPI instances are initialized
>> > > > and registered. On xmit, when napi instances are available, we try
>> > > > to use them.
>> > > 
>> > > Am I mistaken in thinking that this also makes XDP redirect into a veth
>> > > work without having to load an XDP program on the peer device? That's
>> > > been a long-outstanding thing we've been meaning to fix, so that would
>> > > be awesome! :)
>> > 
>> > I have not experimented with that, and I admit gross ignorance WRT this
>> > topic, but AFAICS the needed bits to get XDP redirect working on
>> > veth are the ptr_ring initialization and an available napi instance.
>> > 
>> > With this patch both are in place when GRO is enabled, so I guess XDP
>> > redirect should work, too (modulo bugs in this untested scenario).
>> 
>> OK, finally got around to testing this; it doesn't quite work with just
>> your patch, because veth_xdp_xmit() still checks for rq->xdp_prog
>> instead of rq->napi. Fixing this indeed enabled veth to be an
>> XDP_REDIRECT target without an XDP program loaded on the peer. So yay!
>> I'll send a followup fixing that check.
>
> Thank you for double checking!
>
>> So with this we seem to have some nice improvements in both
>> functionality and performance when GRO is turned on; so any reason why
>> we shouldn't just flip the default to on?
>
> Uhmmm... patch 3/4 should avoid the GRO overhead for most cases where
> we can't leverage the aggregation benefit, but I'm not 110% sure that
> enabling GRO by default will not cause performance regressions in some
> scenarios.
>
> If this proves to be always a win we can still change the default
> later, I think.

Alright, sure, let's hold off on that and revisit once this has had some
more testing :)

-Toke

