netdev.vger.kernel.org archive mirror
* [RFC net-next 00/27] net and/or udp optimisations
@ 2022-04-03 13:06 Pavel Begunkov
  2022-04-03 13:06 ` [PATCH net-next 01/27] sock: deduplicate ->sk_wmem_alloc check Pavel Begunkov
                   ` (27 more replies)
  0 siblings, 28 replies; 30+ messages in thread
From: Pavel Begunkov @ 2022-04-03 13:06 UTC (permalink / raw)
  To: netdev, David S . Miller, Jakub Kicinski
  Cc: Eric Dumazet, Wei Liu, Paul Durrant, Pavel Begunkov

A mix of various net optimisations, which were mostly discovered during UDP
testing. Benchmarked with an io_uring test using 16B UDP/IPv6 over dummy netdev:
2090K vs 2229K tx/s, +6.6%, or in a 4-8% range if not averaging across reboots.
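
For reference, the headline percentage is just the relative difference of
the two throughput figures. A minimal sketch of the arithmetic (the helper
name is made up for illustration):

```c
/* Relative throughput gain in percent between a baseline and a patched run. */
static double gain_percent(double base, double patched)
{
	return (patched - base) / base * 100.0;
}
```

Calling gain_percent(2090.0, 2229.0) gives a value between 6.6 and 6.7.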

1-3 remove extra atomics and barriers from sock_wfree(), mainly benefiting UDP.
4-7 clean up some zerocopy helpers.
8-16 inline ipv6 and generic net paths.
17 is a small performance improvement for TCP zerocopy.
18-27 refactor UDP to shed some more overhead.

Pavel Begunkov (27):
  sock: deduplicate ->sk_wmem_alloc check
  sock: optimise sock_def_write_space send refcounting
  sock: optimise sock_def_write_space barriers
  skbuff: drop zero check from skb_zcopy_set
  skbuff: drop null check from skb_zcopy
  net: xen: set zc flags only when there is ubuf
  skbuff: introduce skb_is_zcopy()
  skbuff: optimise alloc_skb_with_frags()
  net: inline sock_alloc_send_skb
  net: inline part of skb_csum_hwoffload_help
  net: inline skb_zerocopy_iter_dgram
  ipv6: inline ip6_local_out()
  ipv6: help __ip6_finish_output() inlining
  ipv6: refactor ip6_finish_output2()
  net: inline dev_queue_xmit()
  ipv6: partially inline fl6_update_dst()
  tcp: optimise skb_zerocopy_iter_stream()
  net: optimise ipcm6 cookie init
  udp/ipv6: refactor udpv6_sendmsg udplite checks
  udp/ipv6: move pending section of udpv6_sendmsg
  udp/ipv6: prioritise the ip6 path over ip4 checks
  udp/ipv6: optimise udpv6_sendmsg() daddr checks
  udp/ipv6: optimise out daddr reassignment
  udp/ipv6: clean up udpv6_sendmsg's saddr init
  ipv6: refactor opts push in __ip6_make_skb()
  ipv6: improve opt-less __ip6_make_skb()
  ipv6: clean up ip6_setup_cork

 drivers/net/xen-netback/interface.c |   3 +-
 include/linux/netdevice.h           |  27 ++++-
 include/linux/skbuff.h              | 102 +++++++++++++-----
 include/net/ipv6.h                  |  37 ++++---
 include/net/sock.h                  |  10 +-
 net/core/datagram.c                 |   2 -
 net/core/datagram.h                 |  15 ---
 net/core/dev.c                      |  28 ++---
 net/core/skbuff.c                   |  59 ++++-------
 net/core/sock.c                     |  50 +++++++--
 net/ipv4/ip_output.c                |  10 +-
 net/ipv4/tcp.c                      |   5 +-
 net/ipv6/datagram.c                 |   4 +-
 net/ipv6/exthdrs.c                  |  15 ++-
 net/ipv6/ip6_output.c               |  88 ++++++++--------
 net/ipv6/output_core.c              |  12 ---
 net/ipv6/raw.c                      |   8 +-
 net/ipv6/udp.c                      | 158 +++++++++++++---------------
 net/l2tp/l2tp_ip6.c                 |   8 +-
 19 files changed, 339 insertions(+), 302 deletions(-)
 delete mode 100644 net/core/datagram.h

-- 
2.35.1


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH net-next 01/27] sock: deduplicate ->sk_wmem_alloc check
  2022-04-03 13:06 [RFC net-next 00/27] net and/or udp optimisations Pavel Begunkov
@ 2022-04-03 13:06 ` Pavel Begunkov
  2022-04-03 13:06 ` [PATCH net-next 02/27] sock: optimise sock_def_write_space send refcounting Pavel Begunkov
                   ` (26 subsequent siblings)
  27 siblings, 0 replies; 30+ messages in thread
From: Pavel Begunkov @ 2022-04-03 13:06 UTC (permalink / raw)
  To: netdev, David S . Miller, Jakub Kicinski
  Cc: Eric Dumazet, Wei Liu, Paul Durrant, Pavel Begunkov

The main ->sk_wmem_alloc check in sock_def_write_space() almost
completely repeats sock_writeable(), apart from small differences like
rounding, so we should be able to replace the first check with
sock_writeable() and remove the extra call.
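
To illustrate the "small differences like rounding", here is a minimal
userspace sketch of the two predicates. The function names are made up,
and the second one assumes sock_writeable() compares ->sk_wmem_alloc
against half of ->sk_sndbuf with a strict less-than; treat it as a model,
not a copy of the kernel helper.

```c
#include <stdbool.h>

/* Model of the old open-coded test: wmem * 2 <= sndbuf. */
static bool old_write_space_check(unsigned int wmem_alloc, unsigned int sndbuf)
{
	return (wmem_alloc << 1) <= sndbuf;
}

/* Model of sock_writeable() (assumed form): wmem < sndbuf / 2. */
static bool writeable_check(unsigned int wmem_alloc, unsigned int sndbuf)
{
	return wmem_alloc < (sndbuf >> 1);
}
```

The two only disagree right at the boundary: with sndbuf = 200 and
wmem_alloc = 100 the old test passes and the new one does not, and an odd
sndbuf shifts the boundary by the dropped low bit.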

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 net/core/sock.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/net/core/sock.c b/net/core/sock.c
index 1180a0cb0110..f5766d6e27cb 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -3174,15 +3174,14 @@ static void sock_def_write_space(struct sock *sk)
 	/* Do not wake up a writer until he can make "significant"
 	 * progress.  --DaveM
 	 */
-	if ((refcount_read(&sk->sk_wmem_alloc) << 1) <= READ_ONCE(sk->sk_sndbuf)) {
+	if (sock_writeable(sk)) {
 		wq = rcu_dereference(sk->sk_wq);
 		if (skwq_has_sleeper(wq))
 			wake_up_interruptible_sync_poll(&wq->wait, EPOLLOUT |
 						EPOLLWRNORM | EPOLLWRBAND);
 
 		/* Should agree with poll, otherwise some programs break */
-		if (sock_writeable(sk))
-			sk_wake_async(sk, SOCK_WAKE_SPACE, POLL_OUT);
+		sk_wake_async(sk, SOCK_WAKE_SPACE, POLL_OUT);
 	}
 
 	rcu_read_unlock();
-- 
2.35.1



* [PATCH net-next 02/27] sock: optimise sock_def_write_space send refcounting
  2022-04-03 13:06 [RFC net-next 00/27] net and/or udp optimisations Pavel Begunkov
  2022-04-03 13:06 ` [PATCH net-next 01/27] sock: deduplicate ->sk_wmem_alloc check Pavel Begunkov
@ 2022-04-03 13:06 ` Pavel Begunkov
  2022-04-03 13:06 ` [PATCH net-next 03/27] sock: optimise sock_def_write_space barriers Pavel Begunkov
                   ` (25 subsequent siblings)
  27 siblings, 0 replies; 30+ messages in thread
From: Pavel Begunkov @ 2022-04-03 13:06 UTC (permalink / raw)
  To: netdev, David S . Miller, Jakub Kicinski
  Cc: Eric Dumazet, Wei Liu, Paul Durrant, Pavel Begunkov

sock_def_write_space() is extensively used by UDP and there is some
room for optimisation. When sock_wfree() needs to do ->sk_write_space(),
it modifies ->sk_wmem_alloc in two steps. First, it puts all but one
ref and calls ->sk_write_space(), and then it puts the remaining one.
That's needed because the callback relies on ->sk_wmem_alloc having
already been decremented, while something must still keep the socket alive.

The idea behind this patch is to take advantage of SOCK_RCU_FREE and
ensure the socket is not freed by wrapping ->sk_write_space() in an RCU
section. Then we can remove one extra refcount atomic.

Note: not all callbacks might be RCU prepared, so we carve out a
sock_def_write_space() specific path.
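
The difference between the two schemes can be sketched in userspace with
C11 atomics. Everything below (struct toy_sock, the toy_* functions) is a
hypothetical model: it only counts references and does not reproduce RCU
or the real refcount_t API.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy stand-in for a socket: only the write-memory counter is modelled. */
struct toy_sock {
	atomic_uint wmem_alloc;
	unsigned int seen_in_callback;	/* counter value the callback observed */
};

static void toy_write_space(struct toy_sock *sk)
{
	sk->seen_in_callback = atomic_load(&sk->wmem_alloc);
}

/* Two-step scheme: drop len - 1, run the callback, then drop the last 1.
 * The leftover 1 keeps the object alive across the callback. Returns true
 * when the counter reaches zero, i.e. the caller would free the socket.
 */
static bool toy_wfree_two_step(struct toy_sock *sk, unsigned int len)
{
	atomic_fetch_sub(&sk->wmem_alloc, len - 1);		/* atomic #1 */
	toy_write_space(sk);
	return atomic_fetch_sub(&sk->wmem_alloc, 1) == 1;	/* atomic #2 */
}

/* Single-step scheme: drop everything at once, then run the callback.
 * In the kernel this relies on an RCU read section and SOCK_RCU_FREE to
 * keep the socket around; in this model nothing frees it concurrently.
 */
static bool toy_wfree_one_step(struct toy_sock *sk, unsigned int len)
{
	bool last = atomic_fetch_sub(&sk->wmem_alloc, len) == len;

	toy_write_space(sk);
	return last;
}
```

Both variants report the same "last reference" result, but the single-step
one issues one atomic read-modify-write instead of two.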

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 net/core/sock.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/net/core/sock.c b/net/core/sock.c
index f5766d6e27cb..9389bb602c64 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -144,6 +144,8 @@
 static DEFINE_MUTEX(proto_list_mutex);
 static LIST_HEAD(proto_list);
 
+static void sock_def_write_space(struct sock *sk);
+
 /**
  * sk_ns_capable - General socket capability test
  * @sk: Socket to use a capability on or through
@@ -2300,8 +2302,20 @@ void sock_wfree(struct sk_buff *skb)
 {
 	struct sock *sk = skb->sk;
 	unsigned int len = skb->truesize;
+	bool free;
 
 	if (!sock_flag(sk, SOCK_USE_WRITE_QUEUE)) {
+		if (sock_flag(sk, SOCK_RCU_FREE) &&
+		    sk->sk_write_space == sock_def_write_space) {
+			rcu_read_lock();
+			free = refcount_sub_and_test(len, &sk->sk_wmem_alloc);
+			sock_def_write_space(sk);
+			rcu_read_unlock();
+			if (unlikely(free))
+				__sk_free(sk);
+			return;
+		}
+
 		/*
 		 * Keep a reference on sk_wmem_alloc, this will be released
 		 * after sk_write_space() call
-- 
2.35.1



* [PATCH net-next 03/27] sock: optimise sock_def_write_space barriers
  2022-04-03 13:06 [RFC net-next 00/27] net and/or udp optimisations Pavel Begunkov
  2022-04-03 13:06 ` [PATCH net-next 01/27] sock: deduplicate ->sk_wmem_alloc check Pavel Begunkov
  2022-04-03 13:06 ` [PATCH net-next 02/27] sock: optimise sock_def_write_space send refcounting Pavel Begunkov
@ 2022-04-03 13:06 ` Pavel Begunkov
  2022-04-03 13:06 ` [PATCH net-next 04/27] skbuff: drop zero check from skb_zcopy_set Pavel Begunkov
                   ` (24 subsequent siblings)
  27 siblings, 0 replies; 30+ messages in thread
From: Pavel Begunkov @ 2022-04-03 13:06 UTC (permalink / raw)
  To: netdev, David S . Miller, Jakub Kicinski
  Cc: Eric Dumazet, Wei Liu, Paul Durrant, Pavel Begunkov

Now we have a separate path for sock_def_write_space() and can go one
step further. When it's called from sock_wfree() we know that there is a
preceding atomic for putting down ->sk_wmem_alloc. We can use it to
replace smp_mb() with a less expensive smp_mb__after_atomic(). It also
removes an extra RCU read lock/unlock as a small bonus.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 net/core/sock.c | 26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/net/core/sock.c b/net/core/sock.c
index 9389bb602c64..b1a8f47fda55 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -144,6 +144,7 @@
 static DEFINE_MUTEX(proto_list_mutex);
 static LIST_HEAD(proto_list);
 
+static void sock_def_write_space_wfree(struct sock *sk);
 static void sock_def_write_space(struct sock *sk);
 
 /**
@@ -2309,7 +2310,7 @@ void sock_wfree(struct sk_buff *skb)
 		    sk->sk_write_space == sock_def_write_space) {
 			rcu_read_lock();
 			free = refcount_sub_and_test(len, &sk->sk_wmem_alloc);
-			sock_def_write_space(sk);
+			sock_def_write_space_wfree(sk);
 			rcu_read_unlock();
 			if (unlikely(free))
 				__sk_free(sk);
@@ -3201,6 +3202,29 @@ static void sock_def_write_space(struct sock *sk)
 	rcu_read_unlock();
 }
 
+/* An optimised version of sock_def_write_space(), should only be called
+ * for SOCK_RCU_FREE sockets under RCU read section and after putting
+ * ->sk_wmem_alloc.
+ */
+static void sock_def_write_space_wfree(struct sock *sk)
+{
+	/* Do not wake up a writer until he can make "significant"
+	 * progress.  --DaveM
+	 */
+	if (sock_writeable(sk)) {
+		struct socket_wq *wq = rcu_dereference(sk->sk_wq);
+
+		/* rely on refcount_sub from sock_wfree() */
+		smp_mb__after_atomic();
+		if (wq && waitqueue_active(&wq->wait))
+			wake_up_interruptible_sync_poll(&wq->wait, EPOLLOUT |
+						EPOLLWRNORM | EPOLLWRBAND);
+
+		/* Should agree with poll, otherwise some programs break */
+		sk_wake_async(sk, SOCK_WAKE_SPACE, POLL_OUT);
+	}
+}
+
 static void sock_def_destruct(struct sock *sk)
 {
 }
-- 
2.35.1



* [PATCH net-next 04/27] skbuff: drop zero check from skb_zcopy_set
  2022-04-03 13:06 [RFC net-next 00/27] net and/or udp optimisations Pavel Begunkov
                   ` (2 preceding siblings ...)
  2022-04-03 13:06 ` [PATCH net-next 03/27] sock: optimise sock_def_write_space barriers Pavel Begunkov
@ 2022-04-03 13:06 ` Pavel Begunkov
  2022-04-03 13:06 ` [PATCH net-next 05/27] skbuff: drop null check from skb_zcopy Pavel Begunkov
                   ` (23 subsequent siblings)
  27 siblings, 0 replies; 30+ messages in thread
From: Pavel Begunkov @ 2022-04-03 13:06 UTC (permalink / raw)
  To: netdev, David S . Miller, Jakub Kicinski
  Cc: Eric Dumazet, Wei Liu, Paul Durrant, Pavel Begunkov

Only two skb_zcopy_set() callers may pass a null skb, so drop the null
check from the function, where it can't easily be compiled out, and
open-code it in those callers. This will also help with further patches.
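
The pattern, hoisting a rarely needed check out of a hot inline helper and
into the few callers that need it, can be sketched as follows. The toy_*
types and functions are hypothetical stand-ins, not the real skb API.

```c
#include <stddef.h>

struct toy_uarg { int refs; };
struct toy_skb { struct toy_uarg *uarg; };

/* Helper after the change: skb must be non-NULL, only uarg is checked. */
static inline void toy_zcopy_set(struct toy_skb *skb, struct toy_uarg *uarg)
{
	if (uarg && !skb->uarg) {
		uarg->refs++;
		skb->uarg = uarg;
	}
}

/* One of the few callers that may hold a NULL skb open-codes the check. */
static void toy_append_data(struct toy_skb *skb, struct toy_uarg *uarg)
{
	if (skb)
		toy_zcopy_set(skb, uarg);
}
```

Callers that always have a valid skb call toy_zcopy_set() directly and no
longer pay for a branch the compiler cannot prove away.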

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 include/linux/skbuff.h | 2 +-
 net/ipv4/ip_output.c   | 3 ++-
 net/ipv6/ip6_output.c  | 3 ++-
 3 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 3a30cae8b0a5..f5de5c9cc3da 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1679,7 +1679,7 @@ static inline void skb_zcopy_init(struct sk_buff *skb, struct ubuf_info *uarg)
 static inline void skb_zcopy_set(struct sk_buff *skb, struct ubuf_info *uarg,
 				 bool *have_ref)
 {
-	if (skb && uarg && !skb_zcopy(skb)) {
+	if (uarg && !skb_zcopy(skb)) {
 		if (unlikely(have_ref && *have_ref))
 			*have_ref = false;
 		else
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 00b4bf26fd93..f864b8c48e42 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1027,7 +1027,8 @@ static int __ip_append_data(struct sock *sk,
 			paged = true;
 		} else {
 			uarg->zerocopy = 0;
-			skb_zcopy_set(skb, uarg, &extra_uref);
+			if (skb)
+				skb_zcopy_set(skb, uarg, &extra_uref);
 		}
 	}
 
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index e23f058166af..e9b039f56637 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1529,7 +1529,8 @@ static int __ip6_append_data(struct sock *sk,
 			paged = true;
 		} else {
 			uarg->zerocopy = 0;
-			skb_zcopy_set(skb, uarg, &extra_uref);
+			if (skb)
+				skb_zcopy_set(skb, uarg, &extra_uref);
 		}
 	}
 
-- 
2.35.1



* [PATCH net-next 05/27] skbuff: drop null check from skb_zcopy
  2022-04-03 13:06 [RFC net-next 00/27] net and/or udp optimisations Pavel Begunkov
                   ` (3 preceding siblings ...)
  2022-04-03 13:06 ` [PATCH net-next 04/27] skbuff: drop zero check from skb_zcopy_set Pavel Begunkov
@ 2022-04-03 13:06 ` Pavel Begunkov
  2022-04-03 13:06 ` [PATCH net-next 06/27] net: xen: set zc flags only when there is ubuf Pavel Begunkov
                   ` (22 subsequent siblings)
  27 siblings, 0 replies; 30+ messages in thread
From: Pavel Begunkov @ 2022-04-03 13:06 UTC (permalink / raw)
  To: netdev, David S . Miller, Jakub Kicinski
  Cc: Eric Dumazet, Wei Liu, Paul Durrant, Pavel Begunkov

skb_zcopy() is used all around the networking code including generic
paths. Many callers pass only a non-null skb, so we can remove the null
check from the helper and fix up the several callers that would be
affected. It removes extra checks from zerocopy paths but also sheds
some bytes from the binary.

   text    data     bss     dec     hex filename
8521472       0       0 8521472  820700 arch/x86/boot/bzImage
8521056       0       0 8521056  820560 arch/x86/boot/bzImage
delta=416B

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 include/linux/skbuff.h | 2 +-
 net/core/dev.c         | 2 +-
 net/core/skbuff.c      | 3 ++-
 net/ipv4/ip_output.c   | 7 +++++--
 net/ipv4/tcp.c         | 5 ++++-
 net/ipv6/ip6_output.c  | 7 +++++--
 6 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index f5de5c9cc3da..10f94b1909da 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1649,7 +1649,7 @@ static inline struct skb_shared_hwtstamps *skb_hwtstamps(struct sk_buff *skb)
 
 static inline struct ubuf_info *skb_zcopy(struct sk_buff *skb)
 {
-	bool is_zcopy = skb && skb_shinfo(skb)->flags & SKBFL_ZEROCOPY_ENABLE;
+	bool is_zcopy = skb_shinfo(skb)->flags & SKBFL_ZEROCOPY_ENABLE;
 
 	return is_zcopy ? skb_uarg(skb) : NULL;
 }
diff --git a/net/core/dev.c b/net/core/dev.c
index 8a5109479dbe..4842a398f08d 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2286,7 +2286,7 @@ void dev_queue_xmit_nit(struct sk_buff *skb, struct net_device *dev)
 	}
 out_unlock:
 	if (pt_prev) {
-		if (!skb_orphan_frags_rx(skb2, GFP_ATOMIC))
+		if (!skb2 || !skb_orphan_frags_rx(skb2, GFP_ATOMIC))
 			pt_prev->func(skb2, skb->dev, pt_prev, skb->dev);
 		else
 			kfree_skb(skb2);
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 10bde7c6db44..7680314038b4 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -893,7 +893,8 @@ EXPORT_SYMBOL(skb_dump);
  */
 void skb_tx_error(struct sk_buff *skb)
 {
-	skb_zcopy_clear(skb, true);
+	if (skb)
+		skb_zcopy_clear(skb, true);
 }
 EXPORT_SYMBOL(skb_tx_error);
 
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index f864b8c48e42..ab10b1f94669 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1018,10 +1018,13 @@ static int __ip_append_data(struct sock *sk,
 		csummode = CHECKSUM_PARTIAL;
 
 	if (flags & MSG_ZEROCOPY && length && sock_flag(sk, SOCK_ZEROCOPY)) {
-		uarg = msg_zerocopy_realloc(sk, length, skb_zcopy(skb));
+		if (skb)
+			uarg = skb_zcopy(skb);
+		extra_uref = !uarg; /* only ref on new uarg */
+
+		uarg = msg_zerocopy_realloc(sk, length, uarg);
 		if (!uarg)
 			return -ENOBUFS;
-		extra_uref = !skb_zcopy(skb);	/* only ref on new uarg */
 		if (rt->dst.dev->features & NETIF_F_SG &&
 		    csummode == CHECKSUM_PARTIAL) {
 			paged = true;
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index cf18fbcbf123..add71b703520 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1205,7 +1205,10 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
 
 	if (flags & MSG_ZEROCOPY && size && sock_flag(sk, SOCK_ZEROCOPY)) {
 		skb = tcp_write_queue_tail(sk);
-		uarg = msg_zerocopy_realloc(sk, size, skb_zcopy(skb));
+		if (skb)
+			uarg = skb_zcopy(skb);
+
+		uarg = msg_zerocopy_realloc(sk, size, uarg);
 		if (!uarg) {
 			err = -ENOBUFS;
 			goto out_err;
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index e9b039f56637..f1ada6f2af7d 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1520,10 +1520,13 @@ static int __ip6_append_data(struct sock *sk,
 		csummode = CHECKSUM_PARTIAL;
 
 	if (flags & MSG_ZEROCOPY && length && sock_flag(sk, SOCK_ZEROCOPY)) {
-		uarg = msg_zerocopy_realloc(sk, length, skb_zcopy(skb));
+		if (skb)
+			uarg = skb_zcopy(skb);
+		extra_uref = !uarg; /* only ref on new uarg */
+
+		uarg = msg_zerocopy_realloc(sk, length, uarg);
 		if (!uarg)
 			return -ENOBUFS;
-		extra_uref = !skb_zcopy(skb);	/* only ref on new uarg */
 		if (rt->dst.dev->features & NETIF_F_SG &&
 		    csummode == CHECKSUM_PARTIAL) {
 			paged = true;
-- 
2.35.1



* [PATCH net-next 06/27] net: xen: set zc flags only when there is ubuf
  2022-04-03 13:06 [RFC net-next 00/27] net and/or udp optimisations Pavel Begunkov
                   ` (4 preceding siblings ...)
  2022-04-03 13:06 ` [PATCH net-next 05/27] skbuff: drop null check from skb_zcopy Pavel Begunkov
@ 2022-04-03 13:06 ` Pavel Begunkov
  2022-04-03 13:06 ` [PATCH net-next 07/27] skbuff: introduce skb_is_zcopy() Pavel Begunkov
                   ` (21 subsequent siblings)
  27 siblings, 0 replies; 30+ messages in thread
From: Pavel Begunkov @ 2022-04-03 13:06 UTC (permalink / raw)
  To: netdev, David S . Miller, Jakub Kicinski
  Cc: Eric Dumazet, Wei Liu, Paul Durrant, Pavel Begunkov

In preparation for changing zc ubuf invariants, set SKBFL_ZEROCOPY_ENABLE
IFF there is a ubuf set.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 drivers/net/xen-netback/interface.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/xen-netback/interface.c b/drivers/net/xen-netback/interface.c
index fe8e21ad8ed9..0a0c36a38fd4 100644
--- a/drivers/net/xen-netback/interface.c
+++ b/drivers/net/xen-netback/interface.c
@@ -55,7 +55,8 @@
 void xenvif_skb_zerocopy_prepare(struct xenvif_queue *queue,
 				 struct sk_buff *skb)
 {
-	skb_shinfo(skb)->flags |= SKBFL_ZEROCOPY_ENABLE;
+	if (skb_uarg(skb))
+		skb_shinfo(skb)->flags |= SKBFL_ZEROCOPY_ENABLE;
 	atomic_inc(&queue->inflight_packets);
 }
 
-- 
2.35.1



* [PATCH net-next 07/27] skbuff: introduce skb_is_zcopy()
  2022-04-03 13:06 [RFC net-next 00/27] net and/or udp optimisations Pavel Begunkov
                   ` (5 preceding siblings ...)
  2022-04-03 13:06 ` [PATCH net-next 06/27] net: xen: set zc flags only when there is ubuf Pavel Begunkov
@ 2022-04-03 13:06 ` Pavel Begunkov
  2022-04-03 13:06 ` [PATCH net-next 08/27] skbuff: optimise alloc_skb_with_frags() Pavel Begunkov
                   ` (20 subsequent siblings)
  27 siblings, 0 replies; 30+ messages in thread
From: Pavel Begunkov @ 2022-04-03 13:06 UTC (permalink / raw)
  To: netdev, David S . Miller, Jakub Kicinski
  Cc: Eric Dumazet, Wei Liu, Paul Durrant, Pavel Begunkov

Add a new helper function called skb_is_zcopy() for checking an skb's
zerocopy status. Previously we were using skb_zcopy() for that, but it's
slightly heavier and generates extra code. Note: since the previous
patch we should have a ubuf set IFF an skb is marked with
SKBFL_ZEROCOPY_ENABLE, apart from nouarg cases.
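
A minimal userspace sketch of the split between a cheap predicate and a
heavier accessor. The zc_* names and ZC_ENABLE are hypothetical and only
model the flag test and the pointer fetch.

```c
#include <stdbool.h>
#include <stddef.h>

#define ZC_ENABLE 0x1u

struct zc_uarg { int id; };
struct zc_skb {
	unsigned int flags;
	struct zc_uarg *uarg;
};

/* Cheap predicate: a single flag test, no pointer load. */
static inline bool zc_is_zcopy(const struct zc_skb *skb)
{
	return skb->flags & ZC_ENABLE;
}

/* Heavier accessor: flag test plus fetching the uarg pointer. */
static inline struct zc_uarg *zc_zcopy(const struct zc_skb *skb)
{
	return zc_is_zcopy(skb) ? skb->uarg : NULL;
}
```

Call sites that only branch on "is this skb zerocopy?" can use the
predicate and skip the pointer load entirely.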

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 include/linux/skbuff.h | 25 +++++++++++++++----------
 net/core/skbuff.c      | 15 +++++++--------
 2 files changed, 22 insertions(+), 18 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 10f94b1909da..410850832b6a 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1647,11 +1647,14 @@ static inline struct skb_shared_hwtstamps *skb_hwtstamps(struct sk_buff *skb)
 	return &skb_shinfo(skb)->hwtstamps;
 }
 
-static inline struct ubuf_info *skb_zcopy(struct sk_buff *skb)
+static inline bool skb_is_zcopy(struct sk_buff *skb)
 {
-	bool is_zcopy = skb_shinfo(skb)->flags & SKBFL_ZEROCOPY_ENABLE;
+	return skb_shinfo(skb)->flags & SKBFL_ZEROCOPY_ENABLE;
+}
 
-	return is_zcopy ? skb_uarg(skb) : NULL;
+static inline struct ubuf_info *skb_zcopy(struct sk_buff *skb)
+{
+	return skb_is_zcopy(skb) ? skb_uarg(skb) : NULL;
 }
 
 static inline bool skb_zcopy_pure(const struct sk_buff *skb)
@@ -1679,7 +1682,7 @@ static inline void skb_zcopy_init(struct sk_buff *skb, struct ubuf_info *uarg)
 static inline void skb_zcopy_set(struct sk_buff *skb, struct ubuf_info *uarg,
 				 bool *have_ref)
 {
-	if (uarg && !skb_zcopy(skb)) {
+	if (uarg && !skb_is_zcopy(skb)) {
 		if (unlikely(have_ref && *have_ref))
 			*have_ref = false;
 		else
@@ -1723,11 +1726,13 @@ static inline void net_zcopy_put_abort(struct ubuf_info *uarg, bool have_uref)
 /* Release a reference on a zerocopy structure */
 static inline void skb_zcopy_clear(struct sk_buff *skb, bool zerocopy_success)
 {
-	struct ubuf_info *uarg = skb_zcopy(skb);
 
-	if (uarg) {
-		if (!skb_zcopy_is_nouarg(skb))
+	if (skb_is_zcopy(skb)) {
+		if (!skb_zcopy_is_nouarg(skb)) {
+			struct ubuf_info *uarg = skb_zcopy(skb);
+
 			uarg->callback(skb, uarg, zerocopy_success);
+		}
 
 		skb_shinfo(skb)->flags &= ~SKBFL_ALL_ZEROCOPY;
 	}
@@ -3023,7 +3028,7 @@ static inline void skb_orphan(struct sk_buff *skb)
  */
 static inline int skb_orphan_frags(struct sk_buff *skb, gfp_t gfp_mask)
 {
-	if (likely(!skb_zcopy(skb)))
+	if (likely(!skb_is_zcopy(skb)))
 		return 0;
 	if (!skb_zcopy_is_nouarg(skb) &&
 	    skb_uarg(skb)->callback == msg_zerocopy_callback)
@@ -3034,7 +3039,7 @@ static inline int skb_orphan_frags(struct sk_buff *skb, gfp_t gfp_mask)
 /* Frags must be orphaned, even if refcounted, if skb might loop to rx path */
 static inline int skb_orphan_frags_rx(struct sk_buff *skb, gfp_t gfp_mask)
 {
-	if (likely(!skb_zcopy(skb)))
+	if (likely(!skb_is_zcopy(skb)))
 		return 0;
 	return skb_copy_ubufs(skb, gfp_mask);
 }
@@ -3591,7 +3596,7 @@ static inline int skb_add_data(struct sk_buff *skb,
 static inline bool skb_can_coalesce(struct sk_buff *skb, int i,
 				    const struct page *page, int off)
 {
-	if (skb_zcopy(skb))
+	if (skb_is_zcopy(skb))
 		return false;
 	if (i) {
 		const skb_frag_t *frag = &skb_shinfo(skb)->frags[i - 1];
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 7680314038b4..f7842bfdd7ae 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -1350,14 +1350,13 @@ int skb_zerocopy_iter_stream(struct sock *sk, struct sk_buff *skb,
 			     struct msghdr *msg, int len,
 			     struct ubuf_info *uarg)
 {
-	struct ubuf_info *orig_uarg = skb_zcopy(skb);
 	struct iov_iter orig_iter = msg->msg_iter;
 	int err, orig_len = skb->len;
 
 	/* An skb can only point to one uarg. This edge case happens when
 	 * TCP appends to an skb, but zerocopy_realloc triggered a new alloc.
 	 */
-	if (orig_uarg && uarg != orig_uarg)
+	if (skb_is_zcopy(skb) && uarg != skb_zcopy(skb))
 		return -EEXIST;
 
 	err = __zerocopy_sg_from_iter(sk, skb, &msg->msg_iter, len);
@@ -1380,9 +1379,9 @@ EXPORT_SYMBOL_GPL(skb_zerocopy_iter_stream);
 static int skb_zerocopy_clone(struct sk_buff *nskb, struct sk_buff *orig,
 			      gfp_t gfp_mask)
 {
-	if (skb_zcopy(orig)) {
-		if (skb_zcopy(nskb)) {
-			/* !gfp_mask callers are verified to !skb_zcopy(nskb) */
+	if (skb_is_zcopy(orig)) {
+		if (skb_is_zcopy(nskb)) {
+			/* !gfp_mask callers are verified to !skb_is_zcopy(nskb) */
 			if (!gfp_mask) {
 				WARN_ON_ONCE(1);
 				return -ENOMEM;
@@ -1721,8 +1720,8 @@ int pskb_expand_head(struct sk_buff *skb, int nhead, int ntail,
 	if (skb_cloned(skb)) {
 		if (skb_orphan_frags(skb, gfp_mask))
 			goto nofrags;
-		if (skb_zcopy(skb))
-			refcount_inc(&skb_uarg(skb)->refcnt);
+		if (skb_is_zcopy(skb))
+			net_zcopy_get(skb_uarg(skb));
 		for (i = 0; i < skb_shinfo(skb)->nr_frags; i++)
 			skb_frag_ref(skb, i);
 
@@ -3535,7 +3534,7 @@ int skb_shift(struct sk_buff *tgt, struct sk_buff *skb, int shiftlen)
 
 	if (skb_headlen(skb))
 		return 0;
-	if (skb_zcopy(tgt) || skb_zcopy(skb))
+	if (skb_is_zcopy(tgt) || skb_is_zcopy(skb))
 		return 0;
 
 	todo = shiftlen;
-- 
2.35.1



* [PATCH net-next 08/27] skbuff: optimise alloc_skb_with_frags()
  2022-04-03 13:06 [RFC net-next 00/27] net and/or udp optimisations Pavel Begunkov
                   ` (6 preceding siblings ...)
  2022-04-03 13:06 ` [PATCH net-next 07/27] skbuff: introduce skb_is_zcopy() Pavel Begunkov
@ 2022-04-03 13:06 ` Pavel Begunkov
  2022-04-03 13:06 ` [PATCH net-next 09/27] net: inline sock_alloc_send_skb Pavel Begunkov
                   ` (19 subsequent siblings)
  27 siblings, 0 replies; 30+ messages in thread
From: Pavel Begunkov @ 2022-04-03 13:06 UTC (permalink / raw)
  To: netdev, David S . Miller, Jakub Kicinski
  Cc: Eric Dumazet, Wei Liu, Paul Durrant, Pavel Begunkov

Some users of alloc_skb_with_frags(), including UDP, pass zero datalen.
Extract and inline the pure skb alloc part of it. We also avoid
needlessly pre-setting the errcode pointer, along with other small
refactorings.
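
The shape of the split, an inline fast path that returns early for zero
data_len and a separate slow path that attaches frags, can be sketched in
userspace. All toy_* names are hypothetical and malloc() stands in for
the page allocator.

```c
#include <errno.h>
#include <stdlib.h>

struct toy_buf {
	size_t header_len;
	size_t data_len;
	void *frags;
};

/* Slow path: attach frag storage to an existing buffer. On failure it
 * frees the buffer and reports the error, so the caller gets NULL back.
 */
static struct toy_buf *toy_alloc_frags(struct toy_buf *buf, size_t data_len,
				       int *errcode)
{
	buf->frags = malloc(data_len);
	if (!buf->frags) {
		free(buf);
		*errcode = -ENOBUFS;
		return NULL;
	}
	buf->data_len = data_len;
	return buf;
}

/* Inline fast path: callers with data_len == 0 never reach the slow path,
 * and errcode is written only when something actually fails.
 */
static inline struct toy_buf *toy_alloc_with_frags(size_t header_len,
						   size_t data_len,
						   int *errcode)
{
	struct toy_buf *buf = calloc(1, sizeof(*buf));

	if (!buf) {
		*errcode = -ENOBUFS;
		return NULL;
	}
	buf->header_len = header_len;
	if (!data_len)
		return buf;
	return toy_alloc_frags(buf, data_len, errcode);
}
```

With a constant zero data_len at the call site, the compiler can drop the
slow-path call altogether after inlining.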

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 include/linux/skbuff.h | 41 ++++++++++++++++++++++++++++++++++++-----
 net/core/skbuff.c      | 31 ++++++++++++-------------------
 2 files changed, 48 insertions(+), 24 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 410850832b6a..ebc4ad36c3a2 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1300,11 +1300,42 @@ static inline struct sk_buff *alloc_skb(unsigned int size,
 	return __alloc_skb(size, priority, 0, NUMA_NO_NODE);
 }
 
-struct sk_buff *alloc_skb_with_frags(unsigned long header_len,
-				     unsigned long data_len,
-				     int max_page_order,
-				     int *errcode,
-				     gfp_t gfp_mask);
+struct sk_buff *alloc_skb_frags(struct sk_buff *skb,
+				unsigned long data_len,
+				int max_page_order,
+				int *errcode,
+				gfp_t gfp_mask);
+
+/**
+ * alloc_skb_with_frags - allocate skb with page frags
+ *
+ * @header_len: size of linear part
+ * @data_len: needed length in frags
+ * @max_page_order: max page order desired.
+ * @errcode: pointer to error code if any
+ * @gfp_mask: allocation mask
+ *
+ * This can be used to allocate a paged skb, given a maximal order for frags.
+ */
+static inline struct sk_buff *alloc_skb_with_frags(unsigned long header_len,
+						   unsigned long data_len,
+						   int max_page_order,
+						   int *errcode,
+						   gfp_t gfp_mask)
+{
+	struct sk_buff *skb;
+
+	skb = alloc_skb(header_len, gfp_mask);
+	if (unlikely(!skb)) {
+		*errcode = -ENOBUFS;
+		return NULL;
+	}
+
+	if (!data_len)
+		return skb;
+	return alloc_skb_frags(skb, data_len, max_page_order, errcode, gfp_mask);
+}
+
 struct sk_buff *alloc_skb_for_msg(struct sk_buff *first);
 
 /* Layout of fast clones : [skb1][skb2][fclone_ref] */
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index f7842bfdd7ae..2c787d964a60 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -5955,40 +5955,32 @@ int skb_mpls_dec_ttl(struct sk_buff *skb)
 EXPORT_SYMBOL_GPL(skb_mpls_dec_ttl);
 
 /**
- * alloc_skb_with_frags - allocate skb with page frags
+ * alloc_skb_frags - allocate page frags for skb
  *
- * @header_len: size of linear part
+ * @skb: buffer
  * @data_len: needed length in frags
  * @max_page_order: max page order desired.
  * @errcode: pointer to error code if any
  * @gfp_mask: allocation mask
  *
- * This can be used to allocate a paged skb, given a maximal order for frags.
+ * This can be used to allocate pages for skb, given a maximal order for frags.
  */
-struct sk_buff *alloc_skb_with_frags(unsigned long header_len,
-				     unsigned long data_len,
-				     int max_page_order,
-				     int *errcode,
-				     gfp_t gfp_mask)
+struct sk_buff *alloc_skb_frags(struct sk_buff *skb,
+				unsigned long data_len,
+				int max_page_order,
+				int *errcode,
+				gfp_t gfp_mask)
 {
 	int npages = (data_len + (PAGE_SIZE - 1)) >> PAGE_SHIFT;
 	unsigned long chunk;
-	struct sk_buff *skb;
 	struct page *page;
 	int i;
 
-	*errcode = -EMSGSIZE;
 	/* Note this test could be relaxed, if we succeed to allocate
 	 * high order pages...
 	 */
-	if (npages > MAX_SKB_FRAGS)
-		return NULL;
-
-	*errcode = -ENOBUFS;
-	skb = alloc_skb(header_len, gfp_mask);
-	if (!skb)
-		return NULL;
-
+	if (unlikely(npages > MAX_SKB_FRAGS))
+		goto failure;
 	skb->truesize += npages << PAGE_SHIFT;
 
 	for (i = 0; npages > 0; i++) {
@@ -6022,9 +6014,10 @@ struct sk_buff *alloc_skb_with_frags(unsigned long header_len,
 
 failure:
 	kfree_skb(skb);
+	*errcode = -EMSGSIZE;
 	return NULL;
 }
-EXPORT_SYMBOL(alloc_skb_with_frags);
+EXPORT_SYMBOL(alloc_skb_frags);
 
 /* carve out the first off bytes from skb when off < headlen */
 static int pskb_carve_inside_header(struct sk_buff *skb, const u32 off,
-- 
2.35.1



* [PATCH net-next 09/27] net: inline sock_alloc_send_skb
  2022-04-03 13:06 [RFC net-next 00/27] net and/or udp optimisations Pavel Begunkov
                   ` (7 preceding siblings ...)
  2022-04-03 13:06 ` [PATCH net-next 08/27] skbuff: optimise alloc_skb_with_frags() Pavel Begunkov
@ 2022-04-03 13:06 ` Pavel Begunkov
  2022-04-03 13:06 ` [PATCH net-next 10/27] net: inline part of skb_csum_hwoffload_help Pavel Begunkov
                   ` (18 subsequent siblings)
  27 siblings, 0 replies; 30+ messages in thread
From: Pavel Begunkov @ 2022-04-03 13:06 UTC (permalink / raw)
  To: netdev, David S . Miller, Jakub Kicinski
  Cc: Eric Dumazet, Wei Liu, Paul Durrant, Pavel Begunkov

sock_alloc_send_skb() is a simple proxy to another function, so we can
inline it and cut the associated call overhead.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 include/net/sock.h | 10 ++++++++--
 net/core/sock.c    |  7 -------
 2 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index c4b91fc19b9c..9dab633c3caf 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1825,11 +1825,17 @@ int sock_getsockopt(struct socket *sock, int level, int op,
 		    char __user *optval, int __user *optlen);
 int sock_gettstamp(struct socket *sock, void __user *userstamp,
 		   bool timeval, bool time32);
-struct sk_buff *sock_alloc_send_skb(struct sock *sk, unsigned long size,
-				    int noblock, int *errcode);
 struct sk_buff *sock_alloc_send_pskb(struct sock *sk, unsigned long header_len,
 				     unsigned long data_len, int noblock,
 				     int *errcode, int max_page_order);
+
+static inline struct sk_buff *sock_alloc_send_skb(struct sock *sk,
+						  unsigned long size,
+						  int noblock, int *errcode)
+{
+	return sock_alloc_send_pskb(sk, size, 0, noblock, errcode, 0);
+}
+
 void *sock_kmalloc(struct sock *sk, int size, gfp_t priority);
 void sock_kfree_s(struct sock *sk, void *mem, int size);
 void sock_kzfree_s(struct sock *sk, void *mem, int size);
diff --git a/net/core/sock.c b/net/core/sock.c
index b1a8f47fda55..77e37556e0c3 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -2626,13 +2626,6 @@ struct sk_buff *sock_alloc_send_pskb(struct sock *sk, unsigned long header_len,
 }
 EXPORT_SYMBOL(sock_alloc_send_pskb);
 
-struct sk_buff *sock_alloc_send_skb(struct sock *sk, unsigned long size,
-				    int noblock, int *errcode)
-{
-	return sock_alloc_send_pskb(sk, size, 0, noblock, errcode, 0);
-}
-EXPORT_SYMBOL(sock_alloc_send_skb);
-
 int __sock_cmsg_send(struct sock *sk, struct msghdr *msg, struct cmsghdr *cmsg,
 		     struct sockcm_cookie *sockc)
 {
-- 
2.35.1



* [PATCH net-next 10/27] net: inline part of skb_csum_hwoffload_help
  2022-04-03 13:06 [RFC net-next 00/27] net and/or udp optimisations Pavel Begunkov
                   ` (8 preceding siblings ...)
  2022-04-03 13:06 ` [PATCH net-next 09/27] net: inline sock_alloc_send_skb Pavel Begunkov
@ 2022-04-03 13:06 ` Pavel Begunkov
  2022-04-03 13:06 ` [PATCH net-next 11/27] net: inline skb_zerocopy_iter_dgram Pavel Begunkov
                   ` (17 subsequent siblings)
  27 siblings, 0 replies; 30+ messages in thread
From: Pavel Begunkov @ 2022-04-03 13:06 UTC (permalink / raw)
  To: netdev, David S . Miller, Jakub Kicinski
  Cc: Eric Dumazet, Wei Liu, Paul Durrant, Pavel Begunkov

Inline the part of skb_csum_hwoffload_help() that skips checksum help
in the HW-accelerated case (NETIF_F_HW_CSUM set and the skb is not SCTP).

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
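Not part of the patch: a minimal userspace sketch of the split described
above. struct pkt, F_HW_CSUM, F_SCTP_CRC and csum_help_slow() are
hypothetical stand-ins for the kernel types, flags and helpers, and the
IP/IPv6 checksum cases are left out. The cheap feature test sits in a
static inline wrapper, so only the remaining cases pay for a call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for NETIF_F_HW_CSUM / NETIF_F_SCTP_CRC. */
#define F_HW_CSUM	(1u << 0)
#define F_SCTP_CRC	(1u << 1)

/* Hypothetical stand-in for the bits of struct sk_buff used here. */
struct pkt {
	bool is_sctp;
	bool sw_csum_done;
};

/* Counts how often the out-of-line part is entered. */
static int slow_calls;

/* "Header" side: declaration of the out-of-line part. */
int csum_help_slow(struct pkt *p, uint32_t features);

/* Inline part: the common HW-checksum case never leaves this function. */
static inline int csum_help(struct pkt *p, uint32_t features)
{
	if ((features & F_HW_CSUM) && !p->is_sctp)
		return 0;
	return csum_help_slow(p, features);
}

/* ".c" side: everything that is not the common case. */
int csum_help_slow(struct pkt *p, uint32_t features)
{
	assert(p != NULL);
	slow_calls++;

	if (p->is_sctp && (features & F_SCTP_CRC))
		return 0;	/* device computes the SCTP CRC */

	p->sw_csum_done = true;	/* fall back to software */
	return 0;
}
```
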
 include/linux/netdevice.h | 13 ++++++++++---
 net/core/dev.c            | 11 ++++-------
 2 files changed, 14 insertions(+), 10 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index cd7a597c55b1..a4e41f7edc47 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -4699,9 +4699,16 @@ extern u8 netdev_rss_key[NETDEV_RSS_KEY_LEN] __read_mostly;
 void netdev_rss_key_fill(void *buffer, size_t len);
 
 int skb_checksum_help(struct sk_buff *skb);
-int skb_crc32c_csum_help(struct sk_buff *skb);
-int skb_csum_hwoffload_help(struct sk_buff *skb,
-			    const netdev_features_t features);
+int __skb_csum_hwoffload_help(struct sk_buff *skb,
+			      const netdev_features_t features);
+
+static inline int skb_csum_hwoffload_help(struct sk_buff *skb,
+					  const netdev_features_t features)
+{
+	if ((features & NETIF_F_HW_CSUM) && !skb_csum_is_sctp(skb))
+		return 0;
+	return __skb_csum_hwoffload_help(skb, features);
+}
 
 struct sk_buff *__skb_gso_segment(struct sk_buff *skb,
 				  netdev_features_t features, bool tx_path);
diff --git a/net/core/dev.c b/net/core/dev.c
index 4842a398f08d..6044b6124edc 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3233,7 +3233,7 @@ int skb_checksum_help(struct sk_buff *skb)
 }
 EXPORT_SYMBOL(skb_checksum_help);
 
-int skb_crc32c_csum_help(struct sk_buff *skb)
+static inline int skb_crc32c_csum_help(struct sk_buff *skb)
 {
 	__le32 crc32c_csum;
 	int ret = 0, offset, start;
@@ -3572,16 +3572,13 @@ static struct sk_buff *validate_xmit_vlan(struct sk_buff *skb,
 	return skb;
 }
 
-int skb_csum_hwoffload_help(struct sk_buff *skb,
-			    const netdev_features_t features)
+int __skb_csum_hwoffload_help(struct sk_buff *skb,
+			      const netdev_features_t features)
 {
 	if (unlikely(skb_csum_is_sctp(skb)))
 		return !!(features & NETIF_F_SCTP_CRC) ? 0 :
 			skb_crc32c_csum_help(skb);
 
-	if (features & NETIF_F_HW_CSUM)
-		return 0;
-
 	if (features & (NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM)) {
 		switch (skb->csum_offset) {
 		case offsetof(struct tcphdr, check):
@@ -3592,7 +3589,7 @@ int skb_csum_hwoffload_help(struct sk_buff *skb,
 
 	return skb_checksum_help(skb);
 }
-EXPORT_SYMBOL(skb_csum_hwoffload_help);
+EXPORT_SYMBOL(__skb_csum_hwoffload_help);
 
 static struct sk_buff *validate_xmit_skb(struct sk_buff *skb, struct net_device *dev, bool *again)
 {
-- 
2.35.1



* [PATCH net-next 11/27] net: inline skb_zerocopy_iter_dgram
  2022-04-03 13:06 [RFC net-next 00/27] net and/or udp optimisations Pavel Begunkov
                   ` (9 preceding siblings ...)
  2022-04-03 13:06 ` [PATCH net-next 10/27] net: inline part of skb_csum_hwoffload_help Pavel Begunkov
@ 2022-04-03 13:06 ` Pavel Begunkov
  2022-04-03 13:06 ` [PATCH net-next 12/27] ipv6: inline ip6_local_out() Pavel Begunkov
                   ` (16 subsequent siblings)
  27 siblings, 0 replies; 30+ messages in thread
From: Pavel Begunkov @ 2022-04-03 13:06 UTC (permalink / raw)
  To: netdev, David S . Miller, Jakub Kicinski
  Cc: Eric Dumazet, Wei Liu, Paul Durrant, Pavel Begunkov

skb_zerocopy_iter_dgram() is a small proxy function, inline it. For
that, move the __zerocopy_sg_from_iter() declaration into linux/skbuff.h.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 include/linux/skbuff.h | 36 ++++++++++++++++++++++--------------
 net/core/datagram.c    |  2 --
 net/core/datagram.h    | 15 ---------------
 net/core/skbuff.c      |  7 -------
 4 files changed, 22 insertions(+), 38 deletions(-)
 delete mode 100644 net/core/datagram.h

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index ebc4ad36c3a2..93a50ac6b9c4 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -647,20 +647,6 @@ struct ubuf_info {
 int mm_account_pinned_pages(struct mmpin *mmp, size_t size);
 void mm_unaccount_pinned_pages(struct mmpin *mmp);
 
-struct ubuf_info *msg_zerocopy_alloc(struct sock *sk, size_t size);
-struct ubuf_info *msg_zerocopy_realloc(struct sock *sk, size_t size,
-				       struct ubuf_info *uarg);
-
-void msg_zerocopy_put_abort(struct ubuf_info *uarg, bool have_uref);
-
-void msg_zerocopy_callback(struct sk_buff *skb, struct ubuf_info *uarg,
-			   bool success);
-
-int skb_zerocopy_iter_dgram(struct sk_buff *skb, struct msghdr *msg, int len);
-int skb_zerocopy_iter_stream(struct sock *sk, struct sk_buff *skb,
-			     struct msghdr *msg, int len,
-			     struct ubuf_info *uarg);
-
 /* This data is invariant across clones and lives at
  * the end of the header data, ie. at skb->end.
  */
@@ -1670,6 +1656,28 @@ static inline void skb_set_end_offset(struct sk_buff *skb, unsigned int offset)
 }
 #endif
 
+struct ubuf_info *msg_zerocopy_alloc(struct sock *sk, size_t size);
+struct ubuf_info *msg_zerocopy_realloc(struct sock *sk, size_t size,
+				       struct ubuf_info *uarg);
+
+void msg_zerocopy_put_abort(struct ubuf_info *uarg, bool have_uref);
+
+void msg_zerocopy_callback(struct sk_buff *skb, struct ubuf_info *uarg,
+			   bool success);
+
+int __zerocopy_sg_from_iter(struct sock *sk, struct sk_buff *skb,
+			    struct iov_iter *from, size_t length);
+
+static inline int skb_zerocopy_iter_dgram(struct sk_buff *skb,
+					  struct msghdr *msg, int len)
+{
+	return __zerocopy_sg_from_iter(skb->sk, skb, &msg->msg_iter, len);
+}
+
+int skb_zerocopy_iter_stream(struct sock *sk, struct sk_buff *skb,
+			     struct msghdr *msg, int len,
+			     struct ubuf_info *uarg);
+
 /* Internal */
 #define skb_shinfo(SKB)	((struct skb_shared_info *)(skb_end_pointer(SKB)))
 
diff --git a/net/core/datagram.c b/net/core/datagram.c
index ee290776c661..bd78b974baa5 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -62,8 +62,6 @@
 #include <trace/events/skb.h>
 #include <net/busy_poll.h>
 
-#include "datagram.h"
-
 /*
  *	Is a socket 'connection oriented' ?
  */
diff --git a/net/core/datagram.h b/net/core/datagram.h
deleted file mode 100644
index bcfb75bfa3b2..000000000000
--- a/net/core/datagram.h
+++ /dev/null
@@ -1,15 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-
-#ifndef _NET_CORE_DATAGRAM_H_
-#define _NET_CORE_DATAGRAM_H_
-
-#include <linux/types.h>
-
-struct sock;
-struct sk_buff;
-struct iov_iter;
-
-int __zerocopy_sg_from_iter(struct sock *sk, struct sk_buff *skb,
-			    struct iov_iter *from, size_t length);
-
-#endif /* _NET_CORE_DATAGRAM_H_ */
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 2c787d964a60..65ac779eb5cd 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -80,7 +80,6 @@
 #include <linux/user_namespace.h>
 #include <linux/indirect_call_wrapper.h>
 
-#include "datagram.h"
 #include "sock_destructor.h"
 
 struct kmem_cache *skbuff_head_cache __ro_after_init;
@@ -1340,12 +1339,6 @@ void msg_zerocopy_put_abort(struct ubuf_info *uarg, bool have_uref)
 }
 EXPORT_SYMBOL_GPL(msg_zerocopy_put_abort);
 
-int skb_zerocopy_iter_dgram(struct sk_buff *skb, struct msghdr *msg, int len)
-{
-	return __zerocopy_sg_from_iter(skb->sk, skb, &msg->msg_iter, len);
-}
-EXPORT_SYMBOL_GPL(skb_zerocopy_iter_dgram);
-
 int skb_zerocopy_iter_stream(struct sock *sk, struct sk_buff *skb,
 			     struct msghdr *msg, int len,
 			     struct ubuf_info *uarg)
-- 
2.35.1



* [PATCH net-next 12/27] ipv6: inline ip6_local_out()
  2022-04-03 13:06 [RFC net-next 00/27] net and/or udp optimisations Pavel Begunkov
                   ` (10 preceding siblings ...)
  2022-04-03 13:06 ` [PATCH net-next 11/27] net: inline skb_zerocopy_iter_dgram Pavel Begunkov
@ 2022-04-03 13:06 ` Pavel Begunkov
  2022-04-03 13:06 ` [PATCH net-next 13/27] ipv6: help __ip6_finish_output() inlining Pavel Begunkov
                   ` (15 subsequent siblings)
  27 siblings, 0 replies; 30+ messages in thread
From: Pavel Begunkov @ 2022-04-03 13:06 UTC (permalink / raw)
  To: netdev, David S . Miller, Jakub Kicinski
  Cc: Eric Dumazet, Wei Liu, Paul Durrant, Pavel Begunkov

ip6_local_out() is simple, inline it.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 include/net/ipv6.h     | 13 ++++++++++++-
 net/ipv6/output_core.c | 12 ------------
 2 files changed, 12 insertions(+), 13 deletions(-)

diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 213612f1680c..0320bea599c9 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -1074,7 +1074,18 @@ void ip6_protocol_deliver_rcu(struct net *net, struct sk_buff *skb, int nexthdr,
 			      bool have_final);
 
 int __ip6_local_out(struct net *net, struct sock *sk, struct sk_buff *skb);
-int ip6_local_out(struct net *net, struct sock *sk, struct sk_buff *skb);
+
+static inline int ip6_local_out(struct net *net, struct sock *sk,
+				struct sk_buff *skb)
+{
+	int err;
+
+	err = __ip6_local_out(net, sk, skb);
+	if (likely(err == 1))
+		err = dst_output(net, sk, skb);
+
+	return err;
+}
 
 /*
  *	Extension header (options) processing
diff --git a/net/ipv6/output_core.c b/net/ipv6/output_core.c
index 2880dc7d9a49..f657e713561b 100644
--- a/net/ipv6/output_core.c
+++ b/net/ipv6/output_core.c
@@ -151,15 +151,3 @@ int __ip6_local_out(struct net *net, struct sock *sk, struct sk_buff *skb)
 		       dst_output);
 }
 EXPORT_SYMBOL_GPL(__ip6_local_out);
-
-int ip6_local_out(struct net *net, struct sock *sk, struct sk_buff *skb)
-{
-	int err;
-
-	err = __ip6_local_out(net, sk, skb);
-	if (likely(err == 1))
-		err = dst_output(net, sk, skb);
-
-	return err;
-}
-EXPORT_SYMBOL_GPL(ip6_local_out);
-- 
2.35.1



* [PATCH net-next 13/27] ipv6: help __ip6_finish_output() inlining
  2022-04-03 13:06 [RFC net-next 00/27] net and/or udp optimisations Pavel Begunkov
                   ` (11 preceding siblings ...)
  2022-04-03 13:06 ` [PATCH net-next 12/27] ipv6: inline ip6_local_out() Pavel Begunkov
@ 2022-04-03 13:06 ` Pavel Begunkov
  2022-04-03 13:06 ` [PATCH net-next 14/27] ipv6: refactor ip6_finish_output2() Pavel Begunkov
                   ` (14 subsequent siblings)
  27 siblings, 0 replies; 30+ messages in thread
From: Pavel Begunkov @ 2022-04-03 13:06 UTC (permalink / raw)
  To: netdev, David S . Miller, Jakub Kicinski
  Cc: Eric Dumazet, Wei Liu, Paul Durrant, Pavel Begunkov

There are two callers of __ip6_finish_output(), both in
ip6_finish_output(). We can combine the call sites into one and handle
the return code afterwards, which lets the compiler inline
__ip6_finish_output().

Note, the "? : ret" fallback under NET_XMIT_CN only kicks in when
__ip6_finish_output() succeeded, i.e. returned 0. For NET_XMIT_SUCCESS
ret is 0 as well, so that case returns exactly the same result as
before.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
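Not part of the patch: a small userspace check of the argument above.
before() and after() are hypothetical models of the two switch layouts,
with "out" standing in for the return value of __ip6_finish_output().
The constants mirror NET_XMIT_SUCCESS (0) and NET_XMIT_CN (2), and the
kernel's "a ? : b" is spelled "a ? a : b" to stay within standard C.

```c
#include <assert.h>
#include <stddef.h>

enum { XMIT_SUCCESS = 0, XMIT_CN = 2 };

/* Layout before the patch: separate call sites for the two cases. */
static int before(int ret, int out)
{
	switch (ret) {
	case XMIT_SUCCESS:
		return out;
	case XMIT_CN:
		return out ? out : ret;
	default:
		return ret;	/* drop path, not touched by the patch */
	}
}

/* Layout after the patch: SUCCESS falls through to the CN handling. */
static int after(int ret, int out)
{
	switch (ret) {
	case XMIT_SUCCESS:
	case XMIT_CN:
		return out ? out : ret;
	default:
		return ret;	/* drop path, not touched by the patch */
	}
}

/* Walk a few (ret, out) pairs and insist that both layouts agree. */
static void check_layouts(void)
{
	static const int rets[] = { XMIT_SUCCESS, XMIT_CN, 1 };
	static const int outs[] = { 0, 1, 2, -22 };
	size_t i, j;

	for (i = 0; i < sizeof(rets) / sizeof(rets[0]); i++)
		for (j = 0; j < sizeof(outs) / sizeof(outs[0]); j++)
			assert(before(rets[i], outs[j]) ==
			       after(rets[i], outs[j]));
}
```
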
 net/ipv6/ip6_output.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index f1ada6f2af7d..39f3e4bee9e6 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -198,7 +198,6 @@ static int ip6_finish_output(struct net *net, struct sock *sk, struct sk_buff *s
 	ret = BPF_CGROUP_RUN_PROG_INET_EGRESS(sk, skb);
 	switch (ret) {
 	case NET_XMIT_SUCCESS:
-		return __ip6_finish_output(net, sk, skb);
 	case NET_XMIT_CN:
 		return __ip6_finish_output(net, sk, skb) ? : ret;
 	default:
-- 
2.35.1



* [PATCH net-next 14/27] ipv6: refactor ip6_finish_output2()
  2022-04-03 13:06 [RFC net-next 00/27] net and/or udp optimisations Pavel Begunkov
                   ` (12 preceding siblings ...)
  2022-04-03 13:06 ` [PATCH net-next 13/27] ipv6: help __ip6_finish_output() inlining Pavel Begunkov
@ 2022-04-03 13:06 ` Pavel Begunkov
  2022-04-03 13:06 ` [PATCH net-next 15/27] net: inline dev_queue_xmit() Pavel Begunkov
                   ` (13 subsequent siblings)
  27 siblings, 0 replies; 30+ messages in thread
From: Pavel Begunkov @ 2022-04-03 13:06 UTC (permalink / raw)
  To: netdev, David S . Miller, Jakub Kicinski
  Cc: Eric Dumazet, Wei Liu, Paul Durrant, Pavel Begunkov

Move the neigh checks in ip6_finish_output2() under a single slow-path
if, so the hot path doesn't pay for them.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
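Not part of the patch: a userspace model of the control flow change.
struct neigh, NEIGH_ERR, is_err() and xmit() are hypothetical stand-ins
for struct neighbour, ERR_PTR(), IS_ERR() and neigh_output(); "lookup"
and "create" are the values the lookup and __neigh_create() would have
produced. It only shows that both layouts return the same result while
the new one keeps the fallback handling behind one unlikely branch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* GCC/Clang builtin, as used by the kernel's unlikely(). */
#define unlikely(x)	__builtin_expect(!!(x), 0)

struct neigh {
	int id;
};

/* Sentinel playing the role of an ERR_PTR() value. */
static struct neigh neigh_err;
#define NEIGH_ERR	(&neigh_err)

static int is_err(const struct neigh *n)
{
	return n == NEIGH_ERR;
}

/* Stand-in for neigh_output(): just report which entry was used. */
static int xmit(const struct neigh *n)
{
	return n->id;
}

/* Old layout: the success case is nested under the IS_ERR() test. */
static int finish_old(struct neigh *lookup, struct neigh *create)
{
	struct neigh *neigh = lookup;

	if (unlikely(!neigh))
		neigh = create;
	if (!is_err(neigh))
		return xmit(neigh);
	return -EINVAL;
}

/* New layout: creation and error handling share one slow-path block. */
static int finish_new(struct neigh *lookup, struct neigh *create)
{
	struct neigh *neigh = lookup;

	if (unlikely(!neigh || is_err(neigh))) {
		if (unlikely(!neigh)) {
			/* modelled creation yields an entry or NEIGH_ERR */
			assert(create != NULL);
			neigh = create;
		}
		if (is_err(neigh))
			return -EINVAL;
	}
	return xmit(neigh);
}
```
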
 net/ipv6/ip6_output.c | 24 +++++++++++++-----------
 1 file changed, 13 insertions(+), 11 deletions(-)

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 39f3e4bee9e6..4319364a4a8c 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -119,19 +119,21 @@ static int ip6_finish_output2(struct net *net, struct sock *sk, struct sk_buff *
 	rcu_read_lock_bh();
 	nexthop = rt6_nexthop((struct rt6_info *)dst, daddr);
 	neigh = __ipv6_neigh_lookup_noref(dev, nexthop);
-	if (unlikely(!neigh))
-		neigh = __neigh_create(&nd_tbl, nexthop, dev, false);
-	if (!IS_ERR(neigh)) {
-		sock_confirm_neigh(skb, neigh);
-		ret = neigh_output(neigh, skb, false);
-		rcu_read_unlock_bh();
-		return ret;
+
+	if (unlikely(IS_ERR_OR_NULL(neigh))) {
+		if (unlikely(!neigh))
+			neigh = __neigh_create(&nd_tbl, nexthop, dev, false);
+		if (IS_ERR(neigh)) {
+			rcu_read_unlock_bh();
+			IP6_INC_STATS(net, idev, IPSTATS_MIB_OUTNOROUTES);
+			kfree_skb_reason(skb, SKB_DROP_REASON_NEIGH_CREATEFAIL);
+			return -EINVAL;
+		}
 	}
+	sock_confirm_neigh(skb, neigh);
+	ret = neigh_output(neigh, skb, false);
 	rcu_read_unlock_bh();
-
-	IP6_INC_STATS(net, idev, IPSTATS_MIB_OUTNOROUTES);
-	kfree_skb_reason(skb, SKB_DROP_REASON_NEIGH_CREATEFAIL);
-	return -EINVAL;
+	return ret;
 }
 
 static int
-- 
2.35.1



* [PATCH net-next 15/27] net: inline dev_queue_xmit()
  2022-04-03 13:06 [RFC net-next 00/27] net and/or udp optimisations Pavel Begunkov
                   ` (13 preceding siblings ...)
  2022-04-03 13:06 ` [PATCH net-next 14/27] ipv6: refactor ip6_finish_output2() Pavel Begunkov
@ 2022-04-03 13:06 ` Pavel Begunkov
  2022-04-03 13:06 ` [PATCH net-next 16/27] ipv6: partially inline fl6_update_dst() Pavel Begunkov
                   ` (12 subsequent siblings)
  27 siblings, 0 replies; 30+ messages in thread
From: Pavel Begunkov @ 2022-04-03 13:06 UTC (permalink / raw)
  To: netdev, David S . Miller, Jakub Kicinski
  Cc: Eric Dumazet, Wei Liu, Paul Durrant, Pavel Begunkov

Inline dev_queue_xmit() and dev_queue_xmit_accel(). Both are small
proxy functions that do nothing but redirect the control flow to
__dev_queue_xmit().

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 include/linux/netdevice.h | 14 ++++++++++++--
 net/core/dev.c            | 15 ++-------------
 2 files changed, 14 insertions(+), 15 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index a4e41f7edc47..6aca1f3b21ff 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2932,10 +2932,20 @@ u16 dev_pick_tx_zero(struct net_device *dev, struct sk_buff *skb,
 u16 dev_pick_tx_cpu_id(struct net_device *dev, struct sk_buff *skb,
 		       struct net_device *sb_dev);
 
-int dev_queue_xmit(struct sk_buff *skb);
-int dev_queue_xmit_accel(struct sk_buff *skb, struct net_device *sb_dev);
+int __dev_queue_xmit(struct sk_buff *skb, struct net_device *sb_dev);
 int __dev_direct_xmit(struct sk_buff *skb, u16 queue_id);
 
+static inline int dev_queue_xmit(struct sk_buff *skb)
+{
+	return __dev_queue_xmit(skb, NULL);
+}
+
+static inline int dev_queue_xmit_accel(struct sk_buff *skb,
+				       struct net_device *sb_dev)
+{
+	return __dev_queue_xmit(skb, sb_dev);
+}
+
 static inline int dev_direct_xmit(struct sk_buff *skb, u16 queue_id)
 {
 	int ret;
diff --git a/net/core/dev.c b/net/core/dev.c
index 6044b6124edc..ed5459552117 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4084,7 +4084,7 @@ struct netdev_queue *netdev_core_pick_tx(struct net_device *dev,
  *      the BH enable code must have IRQs enabled so that it will not deadlock.
  *          --BLG
  */
-static int __dev_queue_xmit(struct sk_buff *skb, struct net_device *sb_dev)
+int __dev_queue_xmit(struct sk_buff *skb, struct net_device *sb_dev)
 {
 	struct net_device *dev = skb->dev;
 	struct netdev_queue *txq;
@@ -4200,18 +4200,7 @@ static int __dev_queue_xmit(struct sk_buff *skb, struct net_device *sb_dev)
 	rcu_read_unlock_bh();
 	return rc;
 }
-
-int dev_queue_xmit(struct sk_buff *skb)
-{
-	return __dev_queue_xmit(skb, NULL);
-}
-EXPORT_SYMBOL(dev_queue_xmit);
-
-int dev_queue_xmit_accel(struct sk_buff *skb, struct net_device *sb_dev)
-{
-	return __dev_queue_xmit(skb, sb_dev);
-}
-EXPORT_SYMBOL(dev_queue_xmit_accel);
+EXPORT_SYMBOL(__dev_queue_xmit);
 
 int __dev_direct_xmit(struct sk_buff *skb, u16 queue_id)
 {
-- 
2.35.1



* [PATCH net-next 16/27] ipv6: partially inline fl6_update_dst()
  2022-04-03 13:06 [RFC net-next 00/27] net and/or udp optimisations Pavel Begunkov
                   ` (14 preceding siblings ...)
  2022-04-03 13:06 ` [PATCH net-next 15/27] net: inline dev_queue_xmit() Pavel Begunkov
@ 2022-04-03 13:06 ` Pavel Begunkov
  2022-04-03 13:06 ` [PATCH net-next 17/27] tcp: optimise skb_zerocopy_iter_stream() Pavel Begunkov
                   ` (11 subsequent siblings)
  27 siblings, 0 replies; 30+ messages in thread
From: Pavel Begunkov @ 2022-04-03 13:06 UTC (permalink / raw)
  To: netdev, David S . Miller, Jakub Kicinski
  Cc: Eric Dumazet, Wei Liu, Paul Durrant, Pavel Begunkov

fl6_update_dst() doesn't do anything when no opts are passed or there
is no srcrt option. Inline the NULL-checking part.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 include/net/ipv6.h | 15 ++++++++++++---
 net/ipv6/exthdrs.c | 15 ++++++---------
 2 files changed, 18 insertions(+), 12 deletions(-)

diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 0320bea599c9..48a25f663646 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -1114,9 +1114,18 @@ int ipv6_find_hdr(const struct sk_buff *skb, unsigned int *offset, int target,
 
 int ipv6_find_tlv(const struct sk_buff *skb, int offset, int type);
 
-struct in6_addr *fl6_update_dst(struct flowi6 *fl6,
-				const struct ipv6_txoptions *opt,
-				struct in6_addr *orig);
+struct in6_addr *__fl6_update_dst(struct flowi6 *fl6,
+				  const struct ipv6_txoptions *opt,
+				  struct in6_addr *orig);
+
+static inline struct in6_addr *fl6_update_dst(struct flowi6 *fl6,
+					      const struct ipv6_txoptions *opt,
+					      struct in6_addr *orig)
+{
+	if (!opt || !opt->srcrt)
+		return NULL;
+	return __fl6_update_dst(fl6, opt, orig);
+}
 
 /*
  *	socket options (ipv6_sockglue.c)
diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c
index 658d5eabaf7e..0b37b11cd2a9 100644
--- a/net/ipv6/exthdrs.c
+++ b/net/ipv6/exthdrs.c
@@ -1365,8 +1365,8 @@ struct ipv6_txoptions *__ipv6_fixup_options(struct ipv6_txoptions *opt_space,
 EXPORT_SYMBOL_GPL(__ipv6_fixup_options);
 
 /**
- * fl6_update_dst - update flowi destination address with info given
- *                  by srcrt option, if any.
+ * __fl6_update_dst - update flowi destination address with info given
+ *                    by srcrt option.
  *
  * @fl6: flowi6 for which daddr is to be updated
  * @opt: struct ipv6_txoptions in which to look for srcrt opt
@@ -1375,13 +1375,10 @@ EXPORT_SYMBOL_GPL(__ipv6_fixup_options);
  * Returns NULL if no txoptions or no srcrt, otherwise returns orig
  * and initial value of fl6->daddr set in orig
  */
-struct in6_addr *fl6_update_dst(struct flowi6 *fl6,
-				const struct ipv6_txoptions *opt,
-				struct in6_addr *orig)
+struct in6_addr *__fl6_update_dst(struct flowi6 *fl6,
+				  const struct ipv6_txoptions *opt,
+				  struct in6_addr *orig)
 {
-	if (!opt || !opt->srcrt)
-		return NULL;
-
 	*orig = fl6->daddr;
 
 	switch (opt->srcrt->type) {
@@ -1403,4 +1400,4 @@ struct in6_addr *fl6_update_dst(struct flowi6 *fl6,
 
 	return orig;
 }
-EXPORT_SYMBOL_GPL(fl6_update_dst);
+EXPORT_SYMBOL_GPL(__fl6_update_dst);
-- 
2.35.1



* [PATCH net-next 17/27] tcp: optimise skb_zerocopy_iter_stream()
  2022-04-03 13:06 [RFC net-next 00/27] net and/or udp optimisations Pavel Begunkov
                   ` (15 preceding siblings ...)
  2022-04-03 13:06 ` [PATCH net-next 16/27] ipv6: partially inline fl6_update_dst() Pavel Begunkov
@ 2022-04-03 13:06 ` Pavel Begunkov
  2022-04-03 13:06 ` [PATCH net-next 18/27] net: optimise ipcm6 cookie init Pavel Begunkov
                   ` (10 subsequent siblings)
  27 siblings, 0 replies; 30+ messages in thread
From: Pavel Begunkov @ 2022-04-03 13:06 UTC (permalink / raw)
  To: netdev, David S . Miller, Jakub Kicinski
  Cc: Eric Dumazet, Wei Liu, Paul Durrant, Pavel Begunkov

It's expensive to make a copy of the 40B struct iov_iter, to the point
that it was taking 0.2-0.5% of all cycles in my tests. iov_iter_revert()
should be fine as it's a simple case without nested reverts/truncates.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
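Not part of the patch: a userspace model of the two ways to undo a
partial fill. struct iter, iter_advance(), iter_revert() and fill() are
hypothetical stand-ins for struct iov_iter, iov_iter_advance(),
iov_iter_revert() and the zerocopy fill. The point is only that rewinding
by the number of bytes added (len - orig_len) lands on the same state as
restoring a saved copy, without copying the iterator up front.

```c
#include <assert.h>
#include <stddef.h>

struct iter {
	size_t off;	/* bytes already consumed */
	size_t count;	/* bytes still available */
};

static void iter_advance(struct iter *it, size_t n)
{
	assert(n <= it->count);
	it->off += n;
	it->count -= n;
}

static void iter_revert(struct iter *it, size_t n)
{
	assert(n <= it->off);
	it->off -= n;
	it->count += n;
}

/*
 * Consume up to 'want' bytes, but fail once 'limit' bytes were taken.
 * On failure the iterator and *len stay advanced by the partial
 * progress, which is what the caller has to undo.
 */
static int fill(struct iter *it, size_t *len, size_t want, size_t limit)
{
	size_t n = want < limit ? want : limit;

	iter_advance(it, n);
	*len += n;
	return n < want ? -1 : 0;
}

/* Old scheme: snapshot the whole iterator, restore it on error. */
static int fill_undo_by_copy(struct iter *it, size_t *len,
			     size_t want, size_t limit)
{
	struct iter orig_iter = *it;
	size_t orig_len = *len;
	int err = fill(it, len, want, limit);

	if (err) {
		*it = orig_iter;
		*len = orig_len;
	}
	return err;
}

/* New scheme: no snapshot, rewind by the bytes actually added. */
static int fill_undo_by_revert(struct iter *it, size_t *len,
			       size_t want, size_t limit)
{
	size_t orig_len = *len;
	int err = fill(it, len, want, limit);

	if (err) {
		iter_revert(it, *len - orig_len);
		*len = orig_len;
	}
	return err;
}
```
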
 net/core/skbuff.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 65ac779eb5cd..77cbdb02e885 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -1343,7 +1343,6 @@ int skb_zerocopy_iter_stream(struct sock *sk, struct sk_buff *skb,
 			     struct msghdr *msg, int len,
 			     struct ubuf_info *uarg)
 {
-	struct iov_iter orig_iter = msg->msg_iter;
 	int err, orig_len = skb->len;
 
 	/* An skb can only point to one uarg. This edge case happens when
@@ -1357,7 +1356,7 @@ int skb_zerocopy_iter_stream(struct sock *sk, struct sk_buff *skb,
 		struct sock *save_sk = skb->sk;
 
 		/* Streams do not free skb on error. Reset to prev state. */
-		msg->msg_iter = orig_iter;
+		iov_iter_revert(&msg->msg_iter, skb->len - orig_len);
 		skb->sk = sk;
 		___pskb_trim(skb, orig_len);
 		skb->sk = save_sk;
-- 
2.35.1



* [PATCH net-next 18/27] net: optimise ipcm6 cookie init
  2022-04-03 13:06 [RFC net-next 00/27] net and/or udp optimisations Pavel Begunkov
                   ` (16 preceding siblings ...)
  2022-04-03 13:06 ` [PATCH net-next 17/27] tcp: optimise skb_zerocopy_iter_stream() Pavel Begunkov
@ 2022-04-03 13:06 ` Pavel Begunkov
  2022-04-03 13:06 ` [PATCH net-next 19/27] udp/ipv6: refactor udpv6_sendmsg udplite checks Pavel Begunkov
                   ` (9 subsequent siblings)
  27 siblings, 0 replies; 30+ messages in thread
From: Pavel Begunkov @ 2022-04-03 13:06 UTC (permalink / raw)
  To: netdev, David S . Miller, Jakub Kicinski
  Cc: Eric Dumazet, Wei Liu, Paul Durrant, Pavel Begunkov

Users of ipcm6_init() have a somewhat complex post-initialisation
of ->dontfrag and ->tclass. Not only does it add overhead, it also
complicates the code.

First, replace ipcm6_init() with ipcm6_init_sk(). As it might not be an
equivalent change, let's first look at ->dontfrag. The logic was to set
it from cmsg if specified and otherwise fall back to np->dontfrag. Now
it's initialised to np->dontfrag in the beginning and then potentially
overridden by cmsg, which is exactly the same behaviour.

It's a bit more complex with ->tclass, as ip6_datagram_send_ctl() might
set it to -1, which is the default and not a valid value. The solution
here is to skip -1's specified in cmsg, so it's left at the socket
default value, which gets us the old behaviour.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
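Not part of the patch: a userspace model of the equivalence argued
above. struct sk_defaults, struct cookie and struct cmsg_opts are
hypothetical stand-ins for the relevant fields of ipv6_pinfo,
ipcm6_cookie and a parsed cmsg. The model assumes the socket defaults
are non-negative, a cmsg dontfrag is 0 or 1, and a cmsg tclass has
already passed the -1..0xff range check.

```c
#include <assert.h>
#include <stdbool.h>

struct sk_defaults {
	int tclass;
	int dontfrag;
};

struct cookie {
	int tclass;
	int dontfrag;
};

/* has_* == false means the option was not supplied via cmsg. */
struct cmsg_opts {
	bool has_tclass;
	int tclass;
	bool has_dontfrag;
	int dontfrag;
};

/* Old flow: init to -1, apply cmsg, then patch up from the socket. */
static struct cookie resolve_old(const struct sk_defaults *np,
				 const struct cmsg_opts *c)
{
	struct cookie ipc6 = { .tclass = -1, .dontfrag = -1 };

	if (c->has_tclass)
		ipc6.tclass = c->tclass;	/* may store -1 */
	if (c->has_dontfrag)
		ipc6.dontfrag = c->dontfrag;

	/* post-initialisation each sendmsg path had to repeat */
	if (ipc6.tclass < 0)
		ipc6.tclass = np->tclass;
	if (ipc6.dontfrag < 0)
		ipc6.dontfrag = np->dontfrag;
	return ipc6;
}

/* New flow: init from the socket, let cmsg override, skip tclass -1. */
static struct cookie resolve_new(const struct sk_defaults *np,
				 const struct cmsg_opts *c)
{
	struct cookie ipc6 = {
		.tclass = np->tclass,
		.dontfrag = np->dontfrag,
	};

	if (c->has_tclass) {
		assert(c->tclass >= -1 && c->tclass <= 0xff);
		if (c->tclass != -1)
			ipc6.tclass = c->tclass;
	}
	if (c->has_dontfrag)
		ipc6.dontfrag = c->dontfrag;
	return ipc6;
}
```
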
 include/net/ipv6.h    | 9 ---------
 net/ipv6/datagram.c   | 4 ++--
 net/ipv6/ip6_output.c | 2 --
 net/ipv6/raw.c        | 8 +-------
 net/ipv6/udp.c        | 7 +------
 net/l2tp/l2tp_ip6.c   | 8 +-------
 6 files changed, 5 insertions(+), 33 deletions(-)

diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 48a25f663646..2f2d9af58f05 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -352,15 +352,6 @@ struct ipcm6_cookie {
 	struct ipv6_txoptions *opt;
 };
 
-static inline void ipcm6_init(struct ipcm6_cookie *ipc6)
-{
-	*ipc6 = (struct ipcm6_cookie) {
-		.hlimit = -1,
-		.tclass = -1,
-		.dontfrag = -1,
-	};
-}
-
 static inline void ipcm6_init_sk(struct ipcm6_cookie *ipc6,
 				 const struct ipv6_pinfo *np)
 {
diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index 206f66310a88..1b334bc855ae 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -1003,9 +1003,9 @@ int ip6_datagram_send_ctl(struct net *net, struct sock *sk,
 			if (tc < -1 || tc > 0xff)
 				goto exit_f;
 
+			if (tc != -1)
+				ipc6->tclass = tc;
 			err = 0;
-			ipc6->tclass = tc;
-
 			break;
 		    }
 
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 4319364a4a8c..bd5de7a5aa8c 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -2003,8 +2003,6 @@ struct sk_buff *ip6_make_skb(struct sock *sk,
 		ip6_cork_release(cork, &v6_cork);
 		return ERR_PTR(err);
 	}
-	if (ipc6->dontfrag < 0)
-		ipc6->dontfrag = inet6_sk(sk)->dontfrag;
 
 	err = __ip6_append_data(sk, &queue, cork, &v6_cork,
 				&current->task_frag, getfrag, from,
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index c51d5ce3711c..0e0156938968 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -808,7 +808,7 @@ static int rawv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	fl6.flowi6_mark = sk->sk_mark;
 	fl6.flowi6_uid = sk->sk_uid;
 
-	ipcm6_init(&ipc6);
+	ipcm6_init_sk(&ipc6, np);
 	ipc6.sockc.tsflags = sk->sk_tsflags;
 	ipc6.sockc.mark = sk->sk_mark;
 
@@ -920,9 +920,6 @@ static int rawv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	if (hdrincl)
 		fl6.flowi6_flags |= FLOWI_FLAG_KNOWN_NH;
 
-	if (ipc6.tclass < 0)
-		ipc6.tclass = np->tclass;
-
 	fl6.flowlabel = ip6_make_flowinfo(ipc6.tclass, fl6.flowlabel);
 
 	dst = ip6_dst_lookup_flow(sock_net(sk), sk, &fl6, final_p);
@@ -933,9 +930,6 @@ static int rawv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	if (ipc6.hlimit < 0)
 		ipc6.hlimit = ip6_sk_dst_hoplimit(np, &fl6, dst);
 
-	if (ipc6.dontfrag < 0)
-		ipc6.dontfrag = np->dontfrag;
-
 	if (msg->msg_flags&MSG_CONFIRM)
 		goto do_confirm;
 
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 7f0fa9bd9ffe..4b15b37fc8f9 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -1313,7 +1313,7 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	int is_udplite = IS_UDPLITE(sk);
 	int (*getfrag)(void *, char *, int, int, int, struct sk_buff *);
 
-	ipcm6_init(&ipc6);
+	ipcm6_init_sk(&ipc6, np);
 	ipc6.gso_size = READ_ONCE(up->gso_size);
 	ipc6.sockc.tsflags = sk->sk_tsflags;
 	ipc6.sockc.mark = sk->sk_mark;
@@ -1518,9 +1518,6 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 
 	security_sk_classify_flow(sk, flowi6_to_flowi_common(fl6));
 
-	if (ipc6.tclass < 0)
-		ipc6.tclass = np->tclass;
-
 	fl6->flowlabel = ip6_make_flowinfo(ipc6.tclass, fl6->flowlabel);
 
 	dst = ip6_sk_dst_lookup_flow(sk, fl6, final_p, connected);
@@ -1566,8 +1563,6 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	up->pending = AF_INET6;
 
 do_append_data:
-	if (ipc6.dontfrag < 0)
-		ipc6.dontfrag = np->dontfrag;
 	up->len += ulen;
 	err = ip6_append_data(sk, getfrag, msg, ulen, sizeof(struct udphdr),
 			      &ipc6, fl6, (struct rt6_info *)dst,
diff --git a/net/l2tp/l2tp_ip6.c b/net/l2tp/l2tp_ip6.c
index 96f975777438..4459926f5840 100644
--- a/net/l2tp/l2tp_ip6.c
+++ b/net/l2tp/l2tp_ip6.c
@@ -521,7 +521,7 @@ static int l2tp_ip6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	fl6.flowi6_mark = sk->sk_mark;
 	fl6.flowi6_uid = sk->sk_uid;
 
-	ipcm6_init(&ipc6);
+	ipcm6_init_sk(&ipc6, np);
 
 	if (lsa) {
 		if (addr_len < SIN6_LEN_RFC2133)
@@ -608,9 +608,6 @@ static int l2tp_ip6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 
 	security_sk_classify_flow(sk, flowi6_to_flowi_common(&fl6));
 
-	if (ipc6.tclass < 0)
-		ipc6.tclass = np->tclass;
-
 	fl6.flowlabel = ip6_make_flowinfo(ipc6.tclass, fl6.flowlabel);
 
 	dst = ip6_dst_lookup_flow(sock_net(sk), sk, &fl6, final_p);
@@ -622,9 +619,6 @@ static int l2tp_ip6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	if (ipc6.hlimit < 0)
 		ipc6.hlimit = ip6_sk_dst_hoplimit(np, &fl6, dst);
 
-	if (ipc6.dontfrag < 0)
-		ipc6.dontfrag = np->dontfrag;
-
 	if (msg->msg_flags & MSG_CONFIRM)
 		goto do_confirm;
 
-- 
2.35.1



* [PATCH net-next 19/27] udp/ipv6: refactor udpv6_sendmsg udplite checks
  2022-04-03 13:06 [RFC net-next 00/27] net and/or udp optimisations Pavel Begunkov
                   ` (17 preceding siblings ...)
  2022-04-03 13:06 ` [PATCH net-next 18/27] net: optimise ipcm6 cookie init Pavel Begunkov
@ 2022-04-03 13:06 ` Pavel Begunkov
  2022-04-03 13:06 ` [PATCH net-next 20/27] udp/ipv6: move pending section of udpv6_sendmsg Pavel Begunkov
                   ` (8 subsequent siblings)
  27 siblings, 0 replies; 30+ messages in thread
From: Pavel Begunkov @ 2022-04-03 13:06 UTC (permalink / raw)
  To: netdev, David S . Miller, Jakub Kicinski
  Cc: Eric Dumazet, Wei Liu, Paul Durrant, Pavel Begunkov

Don't save the IS_UDPLITE() result in advance, evaluate it only where
it's really needed, so it isn't stored to and loaded from the stack.
Same for resolving the getfrag callback pointer.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 net/ipv6/udp.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 4b15b37fc8f9..588bd7e3ebc1 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -1310,7 +1310,6 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	int ulen = len;
 	int corkreq = READ_ONCE(up->corkflag) || msg->msg_flags&MSG_MORE;
 	int err;
-	int is_udplite = IS_UDPLITE(sk);
 	int (*getfrag)(void *, char *, int, int, int, struct sk_buff *);
 
 	ipcm6_init_sk(&ipc6, np);
@@ -1371,7 +1370,6 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	if (len > INT_MAX - sizeof(struct udphdr))
 		return -EMSGSIZE;
 
-	getfrag  =  is_udplite ?  udplite_getfrag : ip_generic_getfrag;
 	if (up->pending) {
 		if (up->pending == AF_INET)
 			return udp_sendmsg(sk, msg, len);
@@ -1538,6 +1536,7 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	if (!corkreq) {
 		struct sk_buff *skb;
 
+		getfrag = IS_UDPLITE(sk) ? udplite_getfrag : ip_generic_getfrag;
 		skb = ip6_make_skb(sk, getfrag, msg, ulen,
 				   sizeof(struct udphdr), &ipc6,
 				   (struct rt6_info *)dst,
@@ -1564,6 +1563,7 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 
 do_append_data:
 	up->len += ulen;
+	getfrag = IS_UDPLITE(sk) ? udplite_getfrag : ip_generic_getfrag;
 	err = ip6_append_data(sk, getfrag, msg, ulen, sizeof(struct udphdr),
 			      &ipc6, fl6, (struct rt6_info *)dst,
 			      corkreq ? msg->msg_flags|MSG_MORE : msg->msg_flags);
@@ -1594,7 +1594,7 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	 */
 	if (err == -ENOBUFS || test_bit(SOCK_NOSPACE, &sk->sk_socket->flags)) {
 		UDP6_INC_STATS(sock_net(sk),
-			       UDP_MIB_SNDBUFERRORS, is_udplite);
+			       UDP_MIB_SNDBUFERRORS, IS_UDPLITE(sk));
 	}
 	return err;
 
-- 
2.35.1



* [PATCH net-next 20/27] udp/ipv6: move pending section of udpv6_sendmsg
  2022-04-03 13:06 [RFC net-next 00/27] net and/or udp optimisations Pavel Begunkov
                   ` (18 preceding siblings ...)
  2022-04-03 13:06 ` [PATCH net-next 19/27] udp/ipv6: refactor udpv6_sendmsg udplite checks Pavel Begunkov
@ 2022-04-03 13:06 ` Pavel Begunkov
  2022-04-03 13:06 ` [PATCH net-next 21/27] udp/ipv6: prioritise the ip6 path over ip4 checks Pavel Begunkov
                   ` (7 subsequent siblings)
  27 siblings, 0 replies; 30+ messages in thread
From: Pavel Begunkov @ 2022-04-03 13:06 UTC (permalink / raw)
  To: netdev, David S . Miller, Jakub Kicinski
  Cc: Eric Dumazet, Wei Liu, Paul Durrant, Pavel Begunkov

Move the up->pending section of udpv6_sendmsg() to the beginning of the
function. Even though it requires some code duplication for sin6 parsing,
it clearly localises the pending handling in one place, removes an extra
if and, more importantly, prepares the code for further patches.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 net/ipv6/udp.c | 67 ++++++++++++++++++++++++++++++--------------------
 1 file changed, 40 insertions(+), 27 deletions(-)

diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 588bd7e3ebc1..26832be40f31 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -1317,6 +1317,44 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	ipc6.sockc.tsflags = sk->sk_tsflags;
 	ipc6.sockc.mark = sk->sk_mark;
 
+	/* Rough check on arithmetic overflow,
+	   better check is made in ip6_append_data().
+	   */
+	if (unlikely(len > INT_MAX - sizeof(struct udphdr)))
+		return -EMSGSIZE;
+
+	/* There are pending frames. */
+	if (up->pending) {
+		if (up->pending == AF_INET)
+			return udp_sendmsg(sk, msg, len);
+
+		/* Do a quick destination sanity check before corking. */
+		if (sin6) {
+			if (msg->msg_namelen < offsetof(struct sockaddr, sa_data))
+				return -EINVAL;
+			if (sin6->sin6_family == AF_INET6) {
+				if (msg->msg_namelen < SIN6_LEN_RFC2133)
+					return -EINVAL;
+				if (ipv6_addr_any(&sin6->sin6_addr) &&
+				    ipv6_addr_v4mapped(&np->saddr))
+					return -EINVAL;
+			} else if (sin6->sin6_family != AF_UNSPEC) {
+				return -EINVAL;
+			}
+		}
+
+		/* The socket lock must be held while it's corked. */
+		lock_sock(sk);
+		if (unlikely(up->pending != AF_INET6)) {
+			/* Just now it was seen corked, userspace is buggy */
+			err = up->pending ? -EAFNOSUPPORT : -EINVAL;
+			release_sock(sk);
+			return err;
+		}
+		dst = NULL;
+		goto do_append_data;
+	}
+
 	/* destination address check */
 	if (sin6) {
 		if (addr_len < offsetof(struct sockaddr, sa_data))
@@ -1342,12 +1380,11 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 		default:
 			return -EINVAL;
 		}
-	} else if (!up->pending) {
+	} else {
 		if (sk->sk_state != TCP_ESTABLISHED)
 			return -EDESTADDRREQ;
 		daddr = &sk->sk_v6_daddr;
-	} else
-		daddr = NULL;
+	}
 
 	if (daddr) {
 		if (ipv6_addr_v4mapped(daddr)) {
@@ -1364,30 +1401,6 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 		}
 	}
 
-	/* Rough check on arithmetic overflow,
-	   better check is made in ip6_append_data().
-	   */
-	if (len > INT_MAX - sizeof(struct udphdr))
-		return -EMSGSIZE;
-
-	if (up->pending) {
-		if (up->pending == AF_INET)
-			return udp_sendmsg(sk, msg, len);
-		/*
-		 * There are pending frames.
-		 * The socket lock must be held while it's corked.
-		 */
-		lock_sock(sk);
-		if (likely(up->pending)) {
-			if (unlikely(up->pending != AF_INET6)) {
-				release_sock(sk);
-				return -EAFNOSUPPORT;
-			}
-			dst = NULL;
-			goto do_append_data;
-		}
-		release_sock(sk);
-	}
 	ulen += sizeof(struct udphdr);
 
 	memset(fl6, 0, sizeof(*fl6));
-- 
2.35.1



* [PATCH net-next 21/27] udp/ipv6: prioritise the ip6 path over ip4 checks
  2022-04-03 13:06 [RFC net-next 00/27] net and/or udp optimisations Pavel Begunkov
                   ` (19 preceding siblings ...)
  2022-04-03 13:06 ` [PATCH net-next 20/27] udp/ipv6: move pending section of udpv6_sendmsg Pavel Begunkov
@ 2022-04-03 13:06 ` Pavel Begunkov
  2022-04-03 13:06 ` [PATCH net-next 22/27] udp/ipv6: optimise udpv6_sendmsg() daddr checks Pavel Begunkov
                   ` (6 subsequent siblings)
  27 siblings, 0 replies; 30+ messages in thread
From: Pavel Begunkov @ 2022-04-03 13:06 UTC (permalink / raw)
  To: netdev, David S . Miller, Jakub Kicinski
  Cc: Eric Dumazet, Wei Liu, Paul Durrant, Pavel Begunkov

For AF_INET6 sockets we care most about the ipv6 path rather than ip4
mappings, as the latter require some extra hops anyway. Take the AF_INET6
case out of the address parsing switch and add an explicit path for it.
That removes some extra ifs from the path as well as the switch overhead.
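
The restructuring can be sketched in standalone C as below. This is an
illustration under assumptions, not the kernel code: the toy_* names are
invented, and the two length constants are only chosen to mirror
offsetof(struct sockaddr, sa_data) and SIN6_LEN_RFC2133.

```c
#include <assert.h>
#include <stddef.h>

enum toy_family { TOY_AF_UNSPEC = 0, TOY_AF_INET = 2, TOY_AF_INET6 = 10 };
enum toy_path { TOY_PATH_V6, TOY_PATH_V4, TOY_PATH_NO_DADDR, TOY_PATH_EINVAL };

#define TOY_FAMILY_LEN	2	/* bytes needed before the family is valid */
#define TOY_SIN6_LEN	24	/* minimum length of an IPv6 socket address */

/* Before: every address goes through the length check and the switch. */
enum toy_path toy_classify_old(int family, size_t addr_len)
{
	if (addr_len < TOY_FAMILY_LEN)
		return TOY_PATH_EINVAL;

	switch (family) {
	case TOY_AF_INET6:
		if (addr_len < TOY_SIN6_LEN)
			return TOY_PATH_EINVAL;
		return TOY_PATH_V6;
	case TOY_AF_INET:
		return TOY_PATH_V4;
	case TOY_AF_UNSPEC:
		return TOY_PATH_NO_DADDR;
	default:
		return TOY_PATH_EINVAL;
	}
}

/* After: one combined test lets well-formed AF_INET6 skip the switch. */
enum toy_path toy_classify_new(int family, size_t addr_len)
{
	if (addr_len < TOY_SIN6_LEN || family != TOY_AF_INET6) {
		if (addr_len < TOY_FAMILY_LEN)
			return TOY_PATH_EINVAL;

		switch (family) {
		case TOY_AF_INET:
			return TOY_PATH_V4;
		case TOY_AF_UNSPEC:
			return TOY_PATH_NO_DADDR;
		default:	/* includes a too-short AF_INET6 address */
			return TOY_PATH_EINVAL;
		}
	}
	return TOY_PATH_V6;
}

/* Both variants should agree on every family/length combination. */
void toy_check_classify(void)
{
	static const int families[] = {
		TOY_AF_UNSPEC, TOY_AF_INET, TOY_AF_INET6, 99
	};
	size_t i, len;

	for (i = 0; i < sizeof(families) / sizeof(families[0]); i++)
		for (len = 0; len <= 32; len++)
			assert(toy_classify_old(families[i], len) ==
			       toy_classify_new(families[i], len));
}
```

Note that a too-short AF_INET6 address still ends up as an error: it
fails the combined test and then falls into the default case.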

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 net/ipv6/udp.c | 37 +++++++++++++++++--------------------
 1 file changed, 17 insertions(+), 20 deletions(-)

diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 26832be40f31..707e26ed45a4 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -1357,30 +1357,27 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 
 	/* destination address check */
 	if (sin6) {
-		if (addr_len < offsetof(struct sockaddr, sa_data))
-			return -EINVAL;
+		if (addr_len < SIN6_LEN_RFC2133 || sin6->sin6_family != AF_INET6) {
+			if (addr_len < offsetof(struct sockaddr, sa_data))
+				return -EINVAL;
 
-		switch (sin6->sin6_family) {
-		case AF_INET6:
-			if (addr_len < SIN6_LEN_RFC2133)
+			switch (sin6->sin6_family) {
+			case AF_INET:
+				goto do_udp_sendmsg;
+			case AF_UNSPEC:
+				msg->msg_name = sin6 = NULL;
+				msg->msg_namelen = addr_len = 0;
+				goto no_daddr;
+			default:
 				return -EINVAL;
-			daddr = &sin6->sin6_addr;
-			if (ipv6_addr_any(daddr) &&
-			    ipv6_addr_v4mapped(&np->saddr))
-				ipv6_addr_set_v4mapped(htonl(INADDR_LOOPBACK),
-						       daddr);
-			break;
-		case AF_INET:
-			goto do_udp_sendmsg;
-		case AF_UNSPEC:
-			msg->msg_name = sin6 = NULL;
-			msg->msg_namelen = addr_len = 0;
-			daddr = NULL;
-			break;
-		default:
-			return -EINVAL;
+			}
 		}
+
+		daddr = &sin6->sin6_addr;
+		if (ipv6_addr_any(daddr) && ipv6_addr_v4mapped(&np->saddr))
+			ipv6_addr_set_v4mapped(htonl(INADDR_LOOPBACK), daddr);
 	} else {
+no_daddr:
 		if (sk->sk_state != TCP_ESTABLISHED)
 			return -EDESTADDRREQ;
 		daddr = &sk->sk_v6_daddr;
-- 
2.35.1



* [PATCH net-next 22/27] udp/ipv6: optimise udpv6_sendmsg() daddr checks
  2022-04-03 13:06 [RFC net-next 00/27] net and/or udp optimisations Pavel Begunkov
                   ` (20 preceding siblings ...)
  2022-04-03 13:06 ` [PATCH net-next 21/27] udp/ipv6: prioritise the ip6 path over ip4 checks Pavel Begunkov
@ 2022-04-03 13:06 ` Pavel Begunkov
  2022-04-03 13:06 ` [PATCH net-next 23/27] udp/ipv6: optimise out daddr reassignment Pavel Begunkov
                   ` (5 subsequent siblings)
  27 siblings, 0 replies; 30+ messages in thread
From: Pavel Begunkov @ 2022-04-03 13:06 UTC (permalink / raw)
  To: netdev, David S . Miller, Jakub Kicinski
  Cc: Eric Dumazet, Wei Liu, Paul Durrant, Pavel Begunkov

All paths in udpv6_sendmsg() that reach the ipv6_addr_v4mapped() check set
a non-NULL daddr, so we can safely kill the NULL check just before it.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 net/ipv6/udp.c | 23 +++++++++++------------
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 707e26ed45a4..cbb11316a526 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -1383,19 +1383,18 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 		daddr = &sk->sk_v6_daddr;
 	}
 
-	if (daddr) {
-		if (ipv6_addr_v4mapped(daddr)) {
-			struct sockaddr_in sin;
-			sin.sin_family = AF_INET;
-			sin.sin_port = sin6 ? sin6->sin6_port : inet->inet_dport;
-			sin.sin_addr.s_addr = daddr->s6_addr32[3];
-			msg->msg_name = &sin;
-			msg->msg_namelen = sizeof(sin);
+	if (ipv6_addr_v4mapped(daddr)) {
+		struct sockaddr_in sin;
+
+		sin.sin_family = AF_INET;
+		sin.sin_port = sin6 ? sin6->sin6_port : inet->inet_dport;
+		sin.sin_addr.s_addr = daddr->s6_addr32[3];
+		msg->msg_name = &sin;
+		msg->msg_namelen = sizeof(sin);
 do_udp_sendmsg:
-			if (__ipv6_only_sock(sk))
-				return -ENETUNREACH;
-			return udp_sendmsg(sk, msg, len);
-		}
+		if (__ipv6_only_sock(sk))
+			return -ENETUNREACH;
+		return udp_sendmsg(sk, msg, len);
 	}
 
 	ulen += sizeof(struct udphdr);
-- 
2.35.1



* [PATCH net-next 23/27] udp/ipv6: optimise out daddr reassignment
  2022-04-03 13:06 [RFC net-next 00/27] net and/or udp optimisations Pavel Begunkov
                   ` (21 preceding siblings ...)
  2022-04-03 13:06 ` [PATCH net-next 22/27] udp/ipv6: optimise udpv6_sendmsg() daddr checks Pavel Begunkov
@ 2022-04-03 13:06 ` Pavel Begunkov
  2022-04-03 13:06 ` [PATCH net-next 24/27] udp/ipv6: clean up udpv6_sendmsg's saddr init Pavel Begunkov
                   ` (4 subsequent siblings)
  27 siblings, 0 replies; 30+ messages in thread
From: Pavel Begunkov @ 2022-04-03 13:06 UTC (permalink / raw)
  To: netdev, David S . Miller, Jakub Kicinski
  Cc: Eric Dumazet, Wei Liu, Paul Durrant, Pavel Begunkov

Nothing in udpv6_sendmsg() checks where daddr points, so the check
reassigning it to ->sk_v6_daddr looks like a no longer needed artifact
from the past. Remove it.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 net/ipv6/udp.c | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index cbb11316a526..2b5a3ed3f138 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -1417,14 +1417,6 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 			}
 		}
 
-		/*
-		 * Otherwise it will be difficult to maintain
-		 * sk->sk_dst_cache.
-		 */
-		if (sk->sk_state == TCP_ESTABLISHED &&
-		    ipv6_addr_equal(daddr, &sk->sk_v6_daddr))
-			daddr = &sk->sk_v6_daddr;
-
 		if (addr_len >= sizeof(struct sockaddr_in6) &&
 		    sin6->sin6_scope_id &&
 		    __ipv6_addr_needs_scope_id(__ipv6_addr_type(daddr)))
-- 
2.35.1



* [PATCH net-next 24/27] udp/ipv6: clean up udpv6_sendmsg's saddr init
  2022-04-03 13:06 [RFC net-next 00/27] net and/or udp optimisations Pavel Begunkov
                   ` (22 preceding siblings ...)
  2022-04-03 13:06 ` [PATCH net-next 23/27] udp/ipv6: optimise out daddr reassignment Pavel Begunkov
@ 2022-04-03 13:06 ` Pavel Begunkov
  2022-04-03 13:06 ` [PATCH net-next 25/27] ipv6: refactor opts push in __ip6_make_skb() Pavel Begunkov
                   ` (3 subsequent siblings)
  27 siblings, 0 replies; 30+ messages in thread
From: Pavel Begunkov @ 2022-04-03 13:06 UTC (permalink / raw)
  To: netdev, David S . Miller, Jakub Kicinski
  Cc: Eric Dumazet, Wei Liu, Paul Durrant, Pavel Begunkov

We initialise fl6 in udpv6_sendmsg() to zeroes, which sets saddr to the
"any" address. It might then be changed by cmsg, but only to a non-any
address. Afterwards we check again whether it is still set to "any", which
is likely, and if so try to initialise it from the socket saddr.

The result is that fl6->saddr is set to cmsg's saddr if specified and to
inet6_sk(sk)->saddr otherwise. We can achieve the same by pre-setting it
to the socket's saddr and potentially overriding it by cmsg afterwards.

This looks a bit cleaner compared to the conditional init and also
removes extra checks from the path.
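
The equivalence argued above can be checked with a small standalone
sketch. It is an illustration only: addresses are modelled as uint32_t
with 0 standing for "any", a cmsg_saddr of 0 means "no override from
cmsg", and all toy_* names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_ADDR_ANY 0u

/* Before: start from "any", apply cmsg, then fall back to the socket. */
uint32_t toy_saddr_old(uint32_t sk_saddr, uint32_t cmsg_saddr)
{
	uint32_t saddr = TOY_ADDR_ANY;	/* effect of zeroing fl6 */

	if (cmsg_saddr != TOY_ADDR_ANY)	/* cmsg only sets non-any */
		saddr = cmsg_saddr;
	if (saddr == TOY_ADDR_ANY && sk_saddr != TOY_ADDR_ANY)
		saddr = sk_saddr;
	return saddr;
}

/* After: start from the socket saddr and let cmsg override it. */
uint32_t toy_saddr_new(uint32_t sk_saddr, uint32_t cmsg_saddr)
{
	uint32_t saddr = sk_saddr;

	if (cmsg_saddr != TOY_ADDR_ANY)
		saddr = cmsg_saddr;
	return saddr;
}

/* Both variants should pick the same source address in every case. */
void toy_check_saddr(void)
{
	static const uint32_t addrs[] = {
		TOY_ADDR_ANY, 0x0a000001u, 0x0a000002u
	};
	const size_t n = sizeof(addrs) / sizeof(addrs[0]);
	size_t i, j;

	for (i = 0; i < n; i++)
		for (j = 0; j < n; j++)
			assert(toy_saddr_old(addrs[i], addrs[j]) ==
			       toy_saddr_new(addrs[i], addrs[j]));
}
```

The sketch relies on the same premise as the commit message, namely that
cmsg never sets saddr to "any".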

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 net/ipv6/udp.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 2b5a3ed3f138..0b82447629b7 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -1431,14 +1431,15 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 		connected = true;
 	}
 
+	fl6->flowi6_uid = sk->sk_uid;
+	fl6->saddr = np->saddr;
+	fl6->daddr = *daddr;
+
 	if (!fl6->flowi6_oif)
 		fl6->flowi6_oif = sk->sk_bound_dev_if;
-
 	if (!fl6->flowi6_oif)
 		fl6->flowi6_oif = np->sticky_pktinfo.ipi6_ifindex;
 
-	fl6->flowi6_uid = sk->sk_uid;
-
 	if (msg->msg_controllen) {
 		opt = &opt_space;
 		memset(opt, 0, sizeof(struct ipv6_txoptions));
@@ -1473,9 +1474,6 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 
 	fl6->flowi6_proto = sk->sk_protocol;
 	fl6->flowi6_mark = ipc6.sockc.mark;
-	fl6->daddr = *daddr;
-	if (ipv6_addr_any(&fl6->saddr) && !ipv6_addr_any(&np->saddr))
-		fl6->saddr = np->saddr;
 	fl6->fl6_sport = inet->inet_sport;
 
 	if (cgroup_bpf_enabled(CGROUP_UDP6_SENDMSG) && !connected) {
-- 
2.35.1



* [PATCH net-next 25/27] ipv6: refactor opts push in __ip6_make_skb()
  2022-04-03 13:06 [RFC net-next 00/27] net and/or udp optimisations Pavel Begunkov
                   ` (23 preceding siblings ...)
  2022-04-03 13:06 ` [PATCH net-next 24/27] udp/ipv6: clean up udpv6_sendmsg's saddr init Pavel Begunkov
@ 2022-04-03 13:06 ` Pavel Begunkov
  2022-04-03 13:06 ` [PATCH net-next 26/27] ipv6: improve opt-less __ip6_make_skb() Pavel Begunkov
                   ` (2 subsequent siblings)
  27 siblings, 0 replies; 30+ messages in thread
From: Pavel Begunkov @ 2022-04-03 13:06 UTC (permalink / raw)
  To: netdev, David S . Miller, Jakub Kicinski
  Cc: Eric Dumazet, Wei Liu, Paul Durrant, Pavel Begunkov

Don't preload v6_cork->opt before we actually need it; it is likely to be
saved on the stack and read again for no good reason.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 net/ipv6/ip6_output.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index bd5de7a5aa8c..3c37b07cbfae 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1857,7 +1857,6 @@ struct sk_buff *__ip6_make_skb(struct sock *sk,
 	struct ipv6_pinfo *np = inet6_sk(sk);
 	struct net *net = sock_net(sk);
 	struct ipv6hdr *hdr;
-	struct ipv6_txoptions *opt = v6_cork->opt;
 	struct rt6_info *rt = (struct rt6_info *)cork->base.dst;
 	struct flowi6 *fl6 = &cork->fl.u.ip6;
 	unsigned char proto = fl6->flowi6_proto;
@@ -1886,10 +1885,14 @@ struct sk_buff *__ip6_make_skb(struct sock *sk,
 	__skb_pull(skb, skb_network_header_len(skb));
 
 	final_dst = &fl6->daddr;
-	if (opt && opt->opt_flen)
-		ipv6_push_frag_opts(skb, opt, &proto);
-	if (opt && opt->opt_nflen)
-		ipv6_push_nfrag_opts(skb, opt, &proto, &final_dst, &fl6->saddr);
+	if (v6_cork->opt) {
+		struct ipv6_txoptions *opt = v6_cork->opt;
+
+		if (opt->opt_flen)
+			ipv6_push_frag_opts(skb, opt, &proto);
+		if (opt->opt_nflen)
+			ipv6_push_nfrag_opts(skb, opt, &proto, &final_dst, &fl6->saddr);
+	}
 
 	skb_push(skb, sizeof(struct ipv6hdr));
 	skb_reset_network_header(skb);
-- 
2.35.1



* [PATCH net-next 26/27] ipv6: improve opt-less __ip6_make_skb()
  2022-04-03 13:06 [RFC net-next 00/27] net and/or udp optimisations Pavel Begunkov
                   ` (24 preceding siblings ...)
  2022-04-03 13:06 ` [PATCH net-next 25/27] ipv6: refactor opts push in __ip6_make_skb() Pavel Begunkov
@ 2022-04-03 13:06 ` Pavel Begunkov
  2022-04-03 13:06 ` [PATCH net-next 27/27] ipv6: clean up ip6_setup_cork Pavel Begunkov
  2022-04-06  9:44 ` [RFC net-next 00/27] net and/or udp optimisations Eric Dumazet
  27 siblings, 0 replies; 30+ messages in thread
From: Pavel Begunkov @ 2022-04-03 13:06 UTC (permalink / raw)
  To: netdev, David S . Miller, Jakub Kicinski
  Cc: Eric Dumazet, Wei Liu, Paul Durrant, Pavel Begunkov

We do a bit of network header pointer shuffling in __ip6_make_skb(),
expecting that ipv6_push_*frag_opts() might change the layout. Avoid it
and the associated overhead when there are no opts.
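
A simplified standalone model of why the shuffling is a no-op without
opts is sketched below. It is not kernel code: struct toy_skb tracks
offsets only, toy_push() with opt_len stands in for the
ipv6_push_*frag_opts() calls, and it assumes the data pointer already
sits at the network header and that the network header length equals
the fixed IPv6 header size plus the space reserved for options.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_IPV6_HDR_LEN 40

struct toy_skb {
	size_t data;			/* offset of the data pointer */
	size_t network_header;		/* offset of the network header */
	size_t transport_header;	/* offset of the transport header */
};

struct toy_skb toy_skb_init(size_t headroom, size_t opt_len)
{
	struct toy_skb skb = {
		.data = headroom,
		.network_header = headroom,
		.transport_header = headroom + TOY_IPV6_HDR_LEN + opt_len,
	};
	return skb;
}

static void toy_pull(struct toy_skb *skb, size_t len)
{
	skb->data += len;
}

static void toy_push(struct toy_skb *skb, size_t len)
{
	assert(skb->data >= len);
	skb->data -= len;
}

/* Before: always pull past the network header, then push it back. */
void toy_finish_old(struct toy_skb *skb, size_t opt_len)
{
	toy_pull(skb, skb->transport_header - skb->network_header);
	if (opt_len)
		toy_push(skb, opt_len);	/* extension headers */
	toy_push(skb, TOY_IPV6_HDR_LEN);
	skb->network_header = skb->data;
}

/* After: only touch the pointers when there are options to push. */
void toy_finish_new(struct toy_skb *skb, size_t opt_len)
{
	if (opt_len) {
		toy_pull(skb, skb->transport_header - skb->network_header);
		toy_push(skb, opt_len);
		toy_push(skb, TOY_IPV6_HDR_LEN);
		skb->network_header = skb->data;
	}
}

/* Both variants should leave the offsets in the same state. */
void toy_check_finish(void)
{
	static const size_t opt_lens[] = { 0, 8, 24 };
	size_t i;

	for (i = 0; i < sizeof(opt_lens) / sizeof(opt_lens[0]); i++) {
		struct toy_skb a = toy_skb_init(64, opt_lens[i]);
		struct toy_skb b = a;

		toy_finish_old(&a, opt_lens[i]);
		toy_finish_new(&b, opt_lens[i]);
		assert(a.data == b.data);
		assert(a.network_header == b.network_header);
	}
}
```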

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 net/ipv6/ip6_output.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 3c37b07cbfae..f7c092af64f5 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1882,22 +1882,20 @@ struct sk_buff *__ip6_make_skb(struct sock *sk,
 
 	/* Allow local fragmentation. */
 	skb->ignore_df = ip6_sk_ignore_df(sk);
-	__skb_pull(skb, skb_network_header_len(skb));
-
 	final_dst = &fl6->daddr;
 	if (v6_cork->opt) {
 		struct ipv6_txoptions *opt = v6_cork->opt;
 
+		__skb_pull(skb, skb_network_header_len(skb));
 		if (opt->opt_flen)
 			ipv6_push_frag_opts(skb, opt, &proto);
 		if (opt->opt_nflen)
 			ipv6_push_nfrag_opts(skb, opt, &proto, &final_dst, &fl6->saddr);
+		skb_push(skb, sizeof(struct ipv6hdr));
+		skb_reset_network_header(skb);
 	}
 
-	skb_push(skb, sizeof(struct ipv6hdr));
-	skb_reset_network_header(skb);
 	hdr = ipv6_hdr(skb);
-
 	ip6_flow_hdr(hdr, v6_cork->tclass,
 		     ip6_make_flowlabel(net, skb, fl6->flowlabel,
 					ip6_autoflowlabel(net, np), fl6));
-- 
2.35.1



* [PATCH net-next 27/27] ipv6: clean up ip6_setup_cork
  2022-04-03 13:06 [RFC net-next 00/27] net and/or udp optimisations Pavel Begunkov
                   ` (25 preceding siblings ...)
  2022-04-03 13:06 ` [PATCH net-next 26/27] ipv6: improve opt-less __ip6_make_skb() Pavel Begunkov
@ 2022-04-03 13:06 ` Pavel Begunkov
  2022-04-06  9:44 ` [RFC net-next 00/27] net and/or udp optimisations Eric Dumazet
  27 siblings, 0 replies; 30+ messages in thread
From: Pavel Begunkov @ 2022-04-03 13:06 UTC (permalink / raw)
  To: netdev, David S . Miller, Jakub Kicinski
  Cc: Eric Dumazet, Wei Liu, Paul Durrant, Pavel Begunkov

Do a bit of refactoring for ip6_setup_cork(). Cache the xfrm_dst_path()
result so it isn't called twice, reshuffle the ifs so some parts aren't
repeated, and so on.
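
The reshuffled MTU selection can be compared against the old form with a
standalone sketch. It is an illustration under assumptions: struct
toy_route and its fields are invented, "probe" stands for
np->pmtudisc >= IPV6_PMTUDISC_PROBE, dst_mtu models dst_mtu(&rt->dst)
and path_mtu models dst_mtu() of the xfrm_dst_path() result.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_route {
	bool xfrm_tunnel;	/* models the DST_XFRM_TUNNEL flag */
	unsigned int dev_mtu;	/* models rt->dst.dev->mtu */
	unsigned int dst_mtu;	/* models dst_mtu(&rt->dst) */
	unsigned int path_mtu;	/* models dst_mtu(xfrm_dst_path(...)) */
};

/* Before: the probe test is repeated in both tunnel branches. */
unsigned int toy_mtu_old(const struct toy_route *rt, bool probe,
			 unsigned int frag_size)
{
	unsigned int mtu;

	if (rt->xfrm_tunnel)
		mtu = probe ? rt->dev_mtu : rt->dst_mtu;
	else
		mtu = probe ? rt->dev_mtu : rt->path_mtu;
	if (frag_size < mtu) {
		if (frag_size)
			mtu = frag_size;
	}
	return mtu;
}

/* After: test probe once, pick the dst only in the non-probe case. */
unsigned int toy_mtu_new(const struct toy_route *rt, bool probe,
			 unsigned int frag_size)
{
	unsigned int mtu;

	if (!probe)
		mtu = rt->xfrm_tunnel ? rt->dst_mtu : rt->path_mtu;
	else
		mtu = rt->dev_mtu;

	if (frag_size < mtu && frag_size)
		mtu = frag_size;
	return mtu;
}

/* Both variants should agree for every flag and frag_size combination. */
void toy_check_mtu(void)
{
	static const unsigned int frag_sizes[] = { 0, 1280, 9000 };
	int tunnel, probe, i;

	for (tunnel = 0; tunnel <= 1; tunnel++)
		for (probe = 0; probe <= 1; probe++)
			for (i = 0; i < 3; i++) {
				struct toy_route rt = {
					tunnel != 0, 1500, 1400, 1300
				};

				assert(toy_mtu_old(&rt, probe != 0, frag_sizes[i]) ==
				       toy_mtu_new(&rt, probe != 0, frag_sizes[i]));
			}
}
```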

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
 net/ipv6/ip6_output.c | 30 +++++++++++++-----------------
 1 file changed, 13 insertions(+), 17 deletions(-)

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index f7c092af64f5..e10b7f42e493 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1356,15 +1356,13 @@ static int ip6_setup_cork(struct sock *sk, struct inet_cork_full *cork,
 	struct ipv6_pinfo *np = inet6_sk(sk);
 	unsigned int mtu;
 	struct ipv6_txoptions *nopt, *opt = ipc6->opt;
+	struct dst_entry *xrfm_dst;
 
 	/* callers pass dst together with a reference, set it first so
 	 * ip6_cork_release() can put it down even in case of an error.
 	 */
 	cork->base.dst = &rt->dst;
 
-	/*
-	 * setup for corking
-	 */
 	if (opt) {
 		if (WARN_ON(v6_cork->opt))
 			return -EINVAL;
@@ -1397,28 +1395,26 @@ static int ip6_setup_cork(struct sock *sk, struct inet_cork_full *cork,
 	}
 	v6_cork->hop_limit = ipc6->hlimit;
 	v6_cork->tclass = ipc6->tclass;
-	if (rt->dst.flags & DST_XFRM_TUNNEL)
-		mtu = np->pmtudisc >= IPV6_PMTUDISC_PROBE ?
-		      READ_ONCE(rt->dst.dev->mtu) : dst_mtu(&rt->dst);
+
+	xrfm_dst = xfrm_dst_path(&rt->dst);
+	if (dst_allfrag(xrfm_dst))
+		cork->base.flags |= IPCORK_ALLFRAG;
+
+	if (np->pmtudisc < IPV6_PMTUDISC_PROBE)
+		mtu = dst_mtu(rt->dst.flags & DST_XFRM_TUNNEL ? &rt->dst : xrfm_dst);
 	else
-		mtu = np->pmtudisc >= IPV6_PMTUDISC_PROBE ?
-			READ_ONCE(rt->dst.dev->mtu) : dst_mtu(xfrm_dst_path(&rt->dst));
-	if (np->frag_size < mtu) {
-		if (np->frag_size)
-			mtu = np->frag_size;
-	}
+		mtu = READ_ONCE(rt->dst.dev->mtu);
+
+	if (np->frag_size < mtu && np->frag_size)
+		mtu = np->frag_size;
+
 	cork->base.fragsize = mtu;
 	cork->base.gso_size = ipc6->gso_size;
 	cork->base.tx_flags = 0;
 	cork->base.mark = ipc6->sockc.mark;
 	sock_tx_timestamp(sk, ipc6->sockc.tsflags, &cork->base.tx_flags);
-
-	if (dst_allfrag(xfrm_dst_path(&rt->dst)))
-		cork->base.flags |= IPCORK_ALLFRAG;
 	cork->base.length = 0;
-
 	cork->base.transmit_time = ipc6->sockc.transmit_time;
-
 	return 0;
 }
 
-- 
2.35.1



* Re: [RFC net-next 00/27] net and/or udp optimisations
  2022-04-03 13:06 [RFC net-next 00/27] net and/or udp optimisations Pavel Begunkov
                   ` (26 preceding siblings ...)
  2022-04-03 13:06 ` [PATCH net-next 27/27] ipv6: clean up ip6_setup_cork Pavel Begunkov
@ 2022-04-06  9:44 ` Eric Dumazet
  2022-04-11 12:04   ` Pavel Begunkov
  27 siblings, 1 reply; 30+ messages in thread
From: Eric Dumazet @ 2022-04-06  9:44 UTC (permalink / raw)
  To: Pavel Begunkov
  Cc: netdev, David S . Miller, Jakub Kicinski, Wei Liu, Paul Durrant

On Sun, Apr 3, 2022 at 6:08 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
>
> A mix of various net optimisations, which were mostly discovered during UDP
> testing. Benchmarked with an io_uring test using 16B UDP/IPv6 over dummy netdev:
> 2090K vs 2229K tx/s, +6.6%, or in a 4-8% range if not averaging across reboots.
>
> 1-3 removes extra atomics and barriers from sock_wfree() mainly benefitting UDP.
> 4-7 cleans up some zerocopy helpers
> 8-16 do inlining of ipv6 and generic net pathes
> 17 is a small nice performance improvement for TCP zerocopy
> 18-27 refactors UDP to shed some more overhead
>

Please send a smaller series first.

About inlining everything around, make sure to include performance
numbers only for the inline parts.
We can inline everything and make the kernel 4 times bigger.
Synthetic benchmarks will still show improvements, but overall we
add icache cost that is going to hurt latencies.
I vote that you focus on the other parts first.

Thank you.

> Pavel Begunkov (27):
>   sock: deduplicate ->sk_wmem_alloc check
>   sock: optimise sock_def_write_space send refcounting
>   sock: optimise sock_def_write_space barriers
>   skbuff: drop zero check from skb_zcopy_set
>   skbuff: drop null check from skb_zcopy
>   net: xen: set zc flags only when there is ubuf
>   skbuff: introduce skb_is_zcopy()
>   skbuff: optimise alloc_skb_with_frags()
>   net: inline sock_alloc_send_skb
>   net: inline part of skb_csum_hwoffload_help
>   net: inline skb_zerocopy_iter_dgram
>   ipv6: inline ip6_local_out()
>   ipv6: help __ip6_finish_output() inlining
>   ipv6: refactor ip6_finish_output2()
>   net: inline dev_queue_xmit()
>   ipv6: partially inline fl6_update_dst()
>   tcp: optimise skb_zerocopy_iter_stream()
>   net: optimise ipcm6 cookie init
>   udp/ipv6: refactor udpv6_sendmsg udplite checks
>   udp/ipv6: move pending section of udpv6_sendmsg
>   udp/ipv6: prioritise the ip6 path over ip4 checks
>   udp/ipv6: optimise udpv6_sendmsg() daddr checks
>   udp/ipv6: optimise out daddr reassignment
>   udp/ipv6: clean up udpv6_sendmsg's saddr init
>   ipv6: refactor opts push in __ip6_make_skb()
>   ipv6: improve opt-less __ip6_make_skb()
>   ipv6: clean up ip6_setup_cork
>
>  drivers/net/xen-netback/interface.c |   3 +-
>  include/linux/netdevice.h           |  27 ++++-
>  include/linux/skbuff.h              | 102 +++++++++++++-----
>  include/net/ipv6.h                  |  37 ++++---
>  include/net/sock.h                  |  10 +-
>  net/core/datagram.c                 |   2 -
>  net/core/datagram.h                 |  15 ---
>  net/core/dev.c                      |  28 ++---
>  net/core/skbuff.c                   |  59 ++++-------
>  net/core/sock.c                     |  50 +++++++--
>  net/ipv4/ip_output.c                |  10 +-
>  net/ipv4/tcp.c                      |   5 +-
>  net/ipv6/datagram.c                 |   4 +-
>  net/ipv6/exthdrs.c                  |  15 ++-
>  net/ipv6/ip6_output.c               |  88 ++++++++--------
>  net/ipv6/output_core.c              |  12 ---
>  net/ipv6/raw.c                      |   8 +-
>  net/ipv6/udp.c                      | 158 +++++++++++++---------------
>  net/l2tp/l2tp_ip6.c                 |   8 +-
>  19 files changed, 339 insertions(+), 302 deletions(-)
>  delete mode 100644 net/core/datagram.h
>
> --
> 2.35.1
>


* Re: [RFC net-next 00/27] net and/or udp optimisations
  2022-04-06  9:44 ` [RFC net-next 00/27] net and/or udp optimisations Eric Dumazet
@ 2022-04-11 12:04   ` Pavel Begunkov
  0 siblings, 0 replies; 30+ messages in thread
From: Pavel Begunkov @ 2022-04-11 12:04 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: netdev, David S . Miller, Jakub Kicinski, Wei Liu, Paul Durrant

On 4/6/22 10:44, Eric Dumazet wrote:
> On Sun, Apr 3, 2022 at 6:08 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
>>
>> A mix of various net optimisations, which were mostly discovered during UDP
>> testing. Benchmarked with an io_uring test using 16B UDP/IPv6 over dummy netdev:
>> 2090K vs 2229K tx/s, +6.6%, or in a 4-8% range if not averaging across reboots.
>>
>> 1-3 removes extra atomics and barriers from sock_wfree() mainly benefitting UDP.
>> 4-7 cleans up some zerocopy helpers
>> 8-16 do inlining of ipv6 and generic net pathes
>> 17 is a small nice performance improvement for TCP zerocopy
>> 18-27 refactors UDP to shed some more overhead
>>

> Please send a smaller series first.

Apologies for delays. Ok, I'll split it.

> About inlining everything around, make sure to include performance
> numbers only for the inline parts.
> We can inline everything and make the kernel 4 times bigger.
> Synthetic benchmarks will still show improvements, but overall we
> add icache cost that is going to hurt latencies.

I do care about kernel bloating, but I think we can agree that for most
patches inlining is safe. There are 6 such patches (9-12,15,16). Three
of them (9,11,15) only do simple redirecting to another function, and
skb_csum_hwoffload_help() in 10 has only two callers. I think we can
agree that they're safe to inline.

That leaves ip6_local_out(), with ~8 callers and used quite heavily, and
fl6_update_dst() with ~12 users. I don't have exact data, but it appears
not everybody uses ip6 options, and so the function does nothing. At
least that's true for the UDP cases I care about. I think inlining it is
justified. Would you prefer these two to be removed?


> I vote that you focus on the other parts first.
> 
> Thank you.
> 
>> Pavel Begunkov (27):
>>    sock: deduplicate ->sk_wmem_alloc check
>>    sock: optimise sock_def_write_space send refcounting
>>    sock: optimise sock_def_write_space barriers
>>    skbuff: drop zero check from skb_zcopy_set
>>    skbuff: drop null check from skb_zcopy
>>    net: xen: set zc flags only when there is ubuf
>>    skbuff: introduce skb_is_zcopy()
>>    skbuff: optimise alloc_skb_with_frags()
>>    net: inline sock_alloc_send_skb
>>    net: inline part of skb_csum_hwoffload_help
>>    net: inline skb_zerocopy_iter_dgram
>>    ipv6: inline ip6_local_out()
>>    ipv6: help __ip6_finish_output() inlining
>>    ipv6: refactor ip6_finish_output2()
>>    net: inline dev_queue_xmit()
>>    ipv6: partially inline fl6_update_dst()
>>    tcp: optimise skb_zerocopy_iter_stream()
>>    net: optimise ipcm6 cookie init
>>    udp/ipv6: refactor udpv6_sendmsg udplite checks
>>    udp/ipv6: move pending section of udpv6_sendmsg
>>    udp/ipv6: prioritise the ip6 path over ip4 checks
>>    udp/ipv6: optimise udpv6_sendmsg() daddr checks
>>    udp/ipv6: optimise out daddr reassignment
>>    udp/ipv6: clean up udpv6_sendmsg's saddr init
>>    ipv6: refactor opts push in __ip6_make_skb()
>>    ipv6: improve opt-less __ip6_make_skb()
>>    ipv6: clean up ip6_setup_cork
>>
>>   drivers/net/xen-netback/interface.c |   3 +-
>>   include/linux/netdevice.h           |  27 ++++-
>>   include/linux/skbuff.h              | 102 +++++++++++++-----
>>   include/net/ipv6.h                  |  37 ++++---
>>   include/net/sock.h                  |  10 +-
>>   net/core/datagram.c                 |   2 -
>>   net/core/datagram.h                 |  15 ---
>>   net/core/dev.c                      |  28 ++---
>>   net/core/skbuff.c                   |  59 ++++-------
>>   net/core/sock.c                     |  50 +++++++--
>>   net/ipv4/ip_output.c                |  10 +-
>>   net/ipv4/tcp.c                      |   5 +-
>>   net/ipv6/datagram.c                 |   4 +-
>>   net/ipv6/exthdrs.c                  |  15 ++-
>>   net/ipv6/ip6_output.c               |  88 ++++++++--------
>>   net/ipv6/output_core.c              |  12 ---
>>   net/ipv6/raw.c                      |   8 +-
>>   net/ipv6/udp.c                      | 158 +++++++++++++---------------
>>   net/l2tp/l2tp_ip6.c                 |   8 +-
>>   19 files changed, 339 insertions(+), 302 deletions(-)
>>   delete mode 100644 net/core/datagram.h
>>
>> --
>> 2.35.1
>>

-- 
Pavel Begunkov

