From: Pavel Begunkov <asml.silence@gmail.com>
To: netdev@vger.kernel.org, "David S . Miller" <davem@davemloft.net>,
Jakub Kicinski <kuba@kernel.org>
Cc: Eric Dumazet <edumazet@google.com>, Wei Liu <wei.liu@kernel.org>,
Paul Durrant <paul@xen.org>,
Pavel Begunkov <asml.silence@gmail.com>
Subject: [PATCH net-next 02/27] sock: optimise sock_def_write_space send refcounting
Date: Sun, 3 Apr 2022 14:06:14 +0100
Message-ID: <769468f1e09dc13caefaa5cebc3ed1e04f747bcc.1648981570.git.asml.silence@gmail.com>
In-Reply-To: <cover.1648981570.git.asml.silence@gmail.com>

sock_def_write_space() is used extensively by UDP, and there is some
room for optimisation. When sock_wfree() needs to call
->sk_write_space(), it modifies ->sk_wmem_alloc in two steps: first it
puts all but one ref and calls ->sk_write_space(), then it puts the
remaining one. That is needed because the callback relies on
->sk_wmem_alloc having already been reduced, while something still has
to keep the socket alive.
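
For illustration only, the two-step put described above can be modelled
in userspace with C11 atomics. This is a sketch, not kernel code:
struct toy_sock and every toy_* name are made up for the example, and
the "freed" flag stands in for __sk_free().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct sock; only the counter matters here. */
struct toy_sock {
	atomic_uint wmem_alloc;		/* models ->sk_wmem_alloc */
	unsigned int seen_by_cb;	/* counter value the callback observed */
	int atomics;			/* atomic read-modify-write ops issued */
	bool freed;			/* stands in for __sk_free() */
};

/* Models ->sk_write_space(): just records the counter it sees. */
static void toy_write_space(struct toy_sock *sk)
{
	sk->seen_by_cb = atomic_load(&sk->wmem_alloc);
}

/* Subtract n and report whether the counter dropped to zero. */
static bool toy_sub_and_test(struct toy_sock *sk, unsigned int n)
{
	unsigned int old = atomic_fetch_sub(&sk->wmem_alloc, n);

	assert(old >= n);
	sk->atomics++;
	return old == n;
}

/* Two-step release: keep one ref across the callback, then drop it. */
static void toy_wfree_two_step(struct toy_sock *sk, unsigned int len)
{
	assert(len >= 1);
	toy_sub_and_test(sk, len - 1);	/* put all but one ref */
	toy_write_space(sk);		/* callback runs with one ref held */
	if (toy_sub_and_test(sk, 1))	/* put the remaining ref */
		sk->freed = true;
}
```

In this model every release costs two atomic operations, and the
callback always observes a counter that is one higher than the final
value.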

The idea behind this patch is to take advantage of SOCK_RCU_FREE and
ensure the socket is not freed by wrapping ->sk_write_space() in an RCU
read-side section. Then we can drop one of the two refcount atomics.

Note: not all callbacks may be prepared to run under RCU, so we carve
out a sock_def_write_space()-specific path.
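
As a rough single-threaded model of the control flow (not real RCU and
not kernel code), the carve-out can be sketched as below. All model_*
names are hypothetical; rcu_depth merely marks where rcu_read_lock() and
rcu_read_unlock() would sit, and "freed" stands in for __sk_free().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct sock. */
struct model_sock {
	atomic_uint wmem_alloc;		/* models ->sk_wmem_alloc */
	void (*write_space)(struct model_sock *sk);
	bool rcu_free;			/* models SOCK_RCU_FREE */
	int rcu_depth;			/* toy marker for the read-side section */
	unsigned int seen_by_cb;	/* counter value the callback observed */
	bool cb_ran_in_rcu;		/* callback ran inside the section */
	int atomics;			/* atomic read-modify-write ops issued */
	bool freed;			/* stands in for __sk_free() */
};

/* Models sock_def_write_space(): records what it observes. */
static void model_def_write_space(struct model_sock *sk)
{
	sk->seen_by_cb = atomic_load(&sk->wmem_alloc);
	sk->cb_ran_in_rcu = sk->rcu_depth > 0;
}

/* Subtract n and report whether the counter dropped to zero. */
static bool model_sub_and_test(struct model_sock *sk, unsigned int n)
{
	unsigned int old = atomic_fetch_sub(&sk->wmem_alloc, n);

	assert(old >= n);
	sk->atomics++;
	return old == n;
}

static void model_wfree(struct model_sock *sk, unsigned int len)
{
	bool free;

	/* Fast path: only for the default callback on RCU-freed sockets. */
	if (sk->rcu_free && sk->write_space == model_def_write_space) {
		sk->rcu_depth++;		/* rcu_read_lock() */
		free = model_sub_and_test(sk, len);
		model_def_write_space(sk);
		sk->rcu_depth--;		/* rcu_read_unlock() */
		if (free)
			sk->freed = true;
		return;
	}

	/* Fallback: the existing two-step put around the callback. */
	assert(len >= 1);
	model_sub_and_test(sk, len - 1);
	sk->write_space(sk);
	if (model_sub_and_test(sk, 1))
		sk->freed = true;
}
```

The model only shows the accounting: the fast path issues one atomic
operation and the callback sees the fully reduced counter, while any
other callback keeps the two-step behaviour.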
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
---
net/core/sock.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/net/core/sock.c b/net/core/sock.c
index f5766d6e27cb..9389bb602c64 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -144,6 +144,8 @@
 static DEFINE_MUTEX(proto_list_mutex);
 static LIST_HEAD(proto_list);
 
+static void sock_def_write_space(struct sock *sk);
+
 /**
  * sk_ns_capable - General socket capability test
  * @sk: Socket to use a capability on or through
@@ -2300,8 +2302,20 @@ void sock_wfree(struct sk_buff *skb)
 {
 	struct sock *sk = skb->sk;
 	unsigned int len = skb->truesize;
+	bool free;
 
 	if (!sock_flag(sk, SOCK_USE_WRITE_QUEUE)) {
+		if (sock_flag(sk, SOCK_RCU_FREE) &&
+		    sk->sk_write_space == sock_def_write_space) {
+			rcu_read_lock();
+			free = refcount_sub_and_test(len, &sk->sk_wmem_alloc);
+			sock_def_write_space(sk);
+			rcu_read_unlock();
+			if (unlikely(free))
+				__sk_free(sk);
+			return;
+		}
+
 		/*
 		 * Keep a reference on sk_wmem_alloc, this will be released
 		 * after sk_write_space() call
--
2.35.1