From: David Howells <dhowells@redhat.com>
To: netdev@vger.kernel.org
Cc: David Howells <dhowells@redhat.com>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Willem de Bruijn <willemdebruijn.kernel@gmail.com>,
David Ahern <dsahern@kernel.org>,
Matthew Wilcox <willy@infradead.org>,
Al Viro <viro@zeniv.linux.org.uk>,
Christoph Hellwig <hch@infradead.org>,
Jens Axboe <axboe@kernel.dk>, Jeff Layton <jlayton@kernel.org>,
Christian Brauner <brauner@kernel.org>,
Chuck Lever III <chuck.lever@oracle.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-mm@kvack.org
Subject: [PATCH net-next v6 07/18] tcp: Support MSG_SPLICE_PAGES
Date: Tue, 11 Apr 2023 17:08:51 +0100 [thread overview]
Message-ID: <20230411160902.4134381-8-dhowells@redhat.com> (raw)
In-Reply-To: <20230411160902.4134381-1-dhowells@redhat.com>
Make TCP's sendmsg() support MSG_SPLICE_PAGES. This causes pages to be
spliced or copied (if it cannot be spliced) from the source iterator.
This allows ->sendpage() to be replaced by something that can handle
multiple multipage folios in a single transaction.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Eric Dumazet <edumazet@google.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: David Ahern <dsahern@kernel.org>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: netdev@vger.kernel.org
---
Notes:
ver #6)
- Use common helper.
net/ipv4/tcp.c | 43 ++++++++++++++++++++++++++++++++++++-------
1 file changed, 36 insertions(+), 7 deletions(-)
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index fd68d49490f2..0b2213da5aaf 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1221,7 +1221,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
int flags, err, copied = 0;
int mss_now = 0, size_goal, copied_syn = 0;
int process_backlog = 0;
- bool zc = false;
+ int zc = 0;
long timeo;
flags = msg->msg_flags;
@@ -1232,17 +1232,22 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
if (msg->msg_ubuf) {
uarg = msg->msg_ubuf;
net_zcopy_get(uarg);
- zc = sk->sk_route_caps & NETIF_F_SG;
+ if (sk->sk_route_caps & NETIF_F_SG)
+ zc = 1;
} else if (sock_flag(sk, SOCK_ZEROCOPY)) {
uarg = msg_zerocopy_realloc(sk, size, skb_zcopy(skb));
if (!uarg) {
err = -ENOBUFS;
goto out_err;
}
- zc = sk->sk_route_caps & NETIF_F_SG;
- if (!zc)
+ if (sk->sk_route_caps & NETIF_F_SG)
+ zc = MSG_ZEROCOPY;
+ else
uarg_to_msgzc(uarg)->zerocopy = 0;
}
+ } else if (unlikely(msg->msg_flags & MSG_SPLICE_PAGES) && size) {
+ if (sk->sk_route_caps & NETIF_F_SG)
+ zc = MSG_SPLICE_PAGES;
}
if (unlikely(flags & MSG_FASTOPEN || inet_sk(sk)->defer_connect) &&
@@ -1305,7 +1310,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
goto do_error;
while (msg_data_left(msg)) {
- int copy = 0;
+ ssize_t copy = 0;
skb = tcp_write_queue_tail(sk);
if (skb)
@@ -1346,7 +1351,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
if (copy > msg_data_left(msg))
copy = msg_data_left(msg);
- if (!zc) {
+ if (zc == 0) {
bool merge = true;
int i = skb_shinfo(skb)->nr_frags;
struct page_frag *pfrag = sk_page_frag(sk);
@@ -1391,7 +1396,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
page_ref_inc(pfrag->page);
}
pfrag->offset += copy;
- } else {
+ } else if (zc == MSG_ZEROCOPY) {
/* First append to a fragless skb builds initial
* pure zerocopy skb
*/
@@ -1412,6 +1417,30 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
if (err < 0)
goto do_error;
copy = err;
+ } else if (zc == MSG_SPLICE_PAGES) {
+ /* Splice in data if we can; copy if we can't. */
+ if (tcp_downgrade_zcopy_pure(sk, skb))
+ goto wait_for_space;
+ copy = tcp_wmem_schedule(sk, copy);
+ if (!copy)
+ goto wait_for_space;
+
+ err = skb_splice_from_iter(skb, &msg->msg_iter, copy,
+ sk->sk_allocation);
+ if (err < 0) {
+ if (err == -EMSGSIZE) {
+ tcp_mark_push(tp, skb);
+ goto new_segment;
+ }
+ goto do_error;
+ }
+ copy = err;
+
+ if (!(flags & MSG_NO_SHARED_FRAGS))
+ skb_shinfo(skb)->flags |= SKBFL_SHARED_FRAG;
+
+ sk_wmem_queued_add(sk, copy);
+ sk_mem_charge(sk, copy);
}
if (!copied)
next prev parent reply other threads:[~2023-04-11 16:09 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-04-11 16:08 [PATCH net-next v6 00/18] splice, net: Replace sendpage with sendmsg(MSG_SPLICE_PAGES), part 1 David Howells
2023-04-11 16:08 ` [PATCH net-next v6 01/18] net: Declare MSG_SPLICE_PAGES internal sendmsg() flag David Howells
2023-04-13 0:51 ` Al Viro
2023-04-13 4:29 ` Al Viro
2023-04-13 20:39 ` David Howells
2023-04-13 20:49 ` Al Viro
2023-04-13 21:01 ` Al Viro
2023-04-11 16:08 ` [PATCH net-next v6 02/18] mm: Move the page fragment allocator from page_alloc.c into its own file David Howells
2023-04-11 16:08 ` [PATCH net-next v6 03/18] mm: Make the page_frag_cache allocator use multipage folios David Howells
2023-04-11 16:08 ` [PATCH net-next v6 04/18] mm: Make the page_frag_cache allocator use per-cpu David Howells
2023-04-11 16:55 ` Christoph Hellwig
2023-04-12 15:31 ` Christoph Hellwig
2023-04-12 23:12 ` David Howells
2023-04-11 16:08 ` [PATCH net-next v6 05/18] net: Pass max frags into skb_append_pagefrags() David Howells
2023-04-11 16:08 ` [PATCH net-next v6 06/18] net: Add a function to splice pages into an skbuff for MSG_SPLICE_PAGES David Howells
2023-04-13 4:41 ` Al Viro
2023-04-13 21:26 ` How to determine if a page can be spliced into an skbuff, or if it should be copied/rejected? David Howells
2023-04-11 16:08 ` David Howells [this message]
2023-04-11 17:09 ` [PATCH net-next v6 07/18] tcp: Support MSG_SPLICE_PAGES Eric Dumazet
2023-04-11 17:49 ` David Howells
2023-04-11 16:08 ` [PATCH net-next v6 08/18] tcp: Convert do_tcp_sendpages() to use MSG_SPLICE_PAGES David Howells
2023-04-11 16:08 ` [PATCH net-next v6 09/18] tcp_bpf: Inline do_tcp_sendpages as it's now a wrapper around tcp_sendmsg David Howells
2023-04-11 16:08 ` [PATCH net-next v6 10/18] espintcp: Inline do_tcp_sendpages() David Howells
2023-04-11 16:08 ` [PATCH net-next v6 11/18] tls: " David Howells
2023-04-11 16:08 ` [PATCH net-next v6 12/18] siw: " David Howells
2023-04-11 17:22 ` Tom Talpey
2023-04-11 16:08 ` [PATCH net-next v6 13/18] tcp: Fold do_tcp_sendpages() into tcp_sendpage_locked() David Howells
2023-04-11 16:08 ` [PATCH net-next v6 14/18] ip, udp: Support MSG_SPLICE_PAGES David Howells
2023-04-11 16:08 ` [PATCH net-next v6 15/18] ip6, udp6: " David Howells
2023-04-11 16:09 ` [PATCH net-next v6 16/18] udp: Convert udp_sendpage() to use MSG_SPLICE_PAGES David Howells
2023-04-11 16:09 ` [PATCH net-next v6 17/18] ip: Remove ip_append_page() David Howells
2023-04-11 16:09 ` [PATCH net-next v6 18/18] af_unix: Support MSG_SPLICE_PAGES David Howells
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230411160902.4134381-8-dhowells@redhat.com \
--to=dhowells@redhat.com \
--cc=axboe@kernel.dk \
--cc=brauner@kernel.org \
--cc=chuck.lever@oracle.com \
--cc=davem@davemloft.net \
--cc=dsahern@kernel.org \
--cc=edumazet@google.com \
--cc=hch@infradead.org \
--cc=jlayton@kernel.org \
--cc=kuba@kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=torvalds@linux-foundation.org \
--cc=viro@zeniv.linux.org.uk \
--cc=willemdebruijn.kernel@gmail.com \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).