* [RFC PATCH 00/28] splice, net: Replace sendpage with sendmsg(MSG_SPLICE_PAGES)
@ 2023-03-16 15:25 David Howells
  2023-03-16 15:25 ` [RFC PATCH 01/28] net: Declare MSG_SPLICE_PAGES internal sendmsg() flag David Howells
                   ` (27 more replies)
  0 siblings, 28 replies; 81+ messages in thread
From: David Howells @ 2023-03-16 15:25 UTC (permalink / raw)
  To: Matthew Wilcox, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe,
	Jeff Layton, Christian Brauner, Linus Torvalds, netdev,
	linux-fsdevel, linux-kernel, linux-mm

Hi Willy, Dave, et al.,

[NOTE! This patchset is a work in progress and some modules will not
 compile with it.]

I've been looking at how to make pipes handle the splicing in of multipage
folios and also looking to see if I could implement a suggestion from Willy
that pipe_buffers could perhaps hold a list of pages (which could make
splicing simpler - an entire splice segment would go in a single
pipe_buffer).

There are a couple of issues here:

 (1) Gifting/stealing a multipage folio is really tricky.  I think that if
     a multipage folio is gifted, the gift flag should be quietly dropped.
     Userspace has no control over what splice() and vmsplice() will see in
     the pagecache.

 (2) The sendpage op expects to be given a single page and various network
     protocols just attach that to a socket buffer.

This patchset aims to deal with the second by removing the ->sendpage()
operation and replacing it with sendmsg() and a new internal flag
MSG_SPLICE_PAGES.  As sendmsg() takes an I/O iterator, this also affords
the opportunity to pass a slew of pages in one go, rather than one at a
time.

If MSG_SPLICE_PAGES is set, the current implementation requires that the
iterator be ITER_BVEC-type and that the pages can be retained by calling
get_page() on them.  Note that I'm accessing the bvec[] directly, but
should really use iov_iter_extract_pages() which would allow an
ITER_XARRAY-type iterator to be used also.
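
The requirement described above can be modelled in userspace roughly as
follows.  This is only a sketch and is not part of the patchset: the
model_* types and functions are hypothetical stand-ins for struct page,
struct bio_vec and the iterator, and the refcount increment stands in for
get_page().  It walks a bvec-style array one segment at a time, never
crossing a segment boundary in a single step, and takes one reference for
each fragment it attaches.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct page and struct bio_vec. */
struct model_page {
	int refcount;
};

struct model_bvec {
	struct model_page *page;
	size_t offset;		/* Start of the data within the page */
	size_t len;		/* Bytes of data in this segment */
};

/* Hypothetical stand-in for an ITER_BVEC iterator. */
struct model_iter {
	const struct model_bvec *bvec;	/* Current segment */
	size_t nr_segs;			/* Segments left, including the current one */
	size_t seg_offset;		/* Bytes already consumed from the current segment */
};

/* One attached fragment, as a protocol might record it. */
struct model_frag {
	struct model_page *page;
	size_t offset;
	size_t len;
};

/*
 * Attach up to max_bytes from the iterator to frags[].  Each step handles at
 * most the remainder of the current segment.  Every fragment written takes
 * one reference on its page.  Returns the number of fragments written.
 */
static size_t model_splice_pages(struct model_iter *it, size_t max_bytes,
				 struct model_frag *frags, size_t max_frags)
{
	size_t n = 0;

	while (it->nr_segs > 0 && max_bytes > 0 && n < max_frags) {
		const struct model_bvec *bv = it->bvec;
		size_t seg, copy;

		assert(it->seg_offset <= bv->len);
		seg = bv->len - it->seg_offset;
		copy = max_bytes < seg ? max_bytes : seg;

		if (copy > 0) {
			bv->page->refcount++;	/* Model of get_page() */
			frags[n].page = bv->page;
			frags[n].offset = bv->offset + it->seg_offset;
			frags[n].len = copy;
			n++;
		}

		/* Advance the iterator, moving to the next segment if done. */
		it->seg_offset += copy;
		max_bytes -= copy;
		if (it->seg_offset == bv->len) {
			it->bvec++;
			it->nr_segs--;
			it->seg_offset = 0;
		}
	}
	return n;
}
```

The model keeps its position as an offset into the first remaining
segment, much as the TCP patch below combines bv_offset with iov_offset.
Unlike the real code, it does not try to coalesce adjacent fragments.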

The patchset consists of the following parts:

 (1) Define the MSG_SPLICE_PAGES flag.

 (2) Provide a simple allocator that takes pages and splits pieces off them
     on request and returns them with a ref on the page.  Unlike with slab
     memory, the lifetime of the allocated memory is controlled by the page
     refcount.  This allows protocol bits to be included in the same bvec[]
     as the data.

 (3) Implement MSG_SPLICE_PAGES support in TCP.

 (4) Make do_tcp_sendpages() just wrap sendmsg() and then fold it in to its
     various callers.

 (5) Implement MSG_SPLICE_PAGES support in IP and make udp_sendpage() just
     a wrapper around sendmsg().

 (6) Implement MSG_SPLICE_PAGES support in AF_UNIX.

 (7) Implement MSG_SPLICE_PAGES support in AF_ALG and make
     af_alg_sendpage() just a wrapper around sendmsg().

 (8) Rename pipe_to_sendpage() to pipe_to_sendmsg() and make it a wrapper
     around sendmsg().

 (9) Remove sendpage file operation.

(10) Convert siw, ceph, iscsi and tcp_bpf to use sendmsg() instead of
     tcp_sendpage().

(11) Make skb_send_sock() use sendmsg().

(12) Remove AF_ALG's hash_sendpage() as hash_sendmsg() seems to paste the
     page pointers in anyway.

(13) Convert ceph, rds, dlm and sunrpc to use sendmsg().

(14) Remove the sendpage socket operation.

Still to be done are the implementation of MSG_SPLICE_PAGES in AF_TLS,
AF_KCM, AF_SMC and Chelsio-TLS, which I'm going to need help with, and
cleaning up the use of kernel_sendpage() in AF_KCM, AF_SMC and NVMe over
TCP.


I'm wondering about how best to proceed further:

 - Rather than providing a special allocator, should protocols implementing
   MSG_SPLICE_PAGES recognise pages that belong to the slab allocator and
   copy the content of those to the skbuff and only directly attach the
   source page if it's not a slab page?

 - Should MSG_SPLICE_PAGES work with ITER_XARRAY as well as ITER_BVEC?

 - Should MSG_SPLICE_PAGES just be a hint and get ignored if the conditions
   for using it are not met rather than giving an error?

 - Should pages attached to a pipe be pinned (ie. FOLL_PIN) rather than
   simply ref'd (ie. FOLL_GET) so that the DIO issue doesn't occur on
   spliced pages?

 - Similarly, should pages undergoing zerocopy be pinned when attached to
   an skbuff rather than being simply ref'd?  I have a patch to note in the
   bottom two bits of the frag page pointer if they are pinned, ref'd or
   neither.
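
The idea in the last point, recording the cleanup mode in the bottom two
bits of a page pointer, can be sketched in plain C as below.  This is an
illustration only and is not taken from the patch mentioned above: the
enum values, the frag_* helper names and the use of uintptr_t are all
assumptions.  It relies on the pointer being at least 4-byte aligned so
that its two low bits are free.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical encoding of how a fragment's page must be released. */
enum frag_cleanup {
	FRAG_NO_CLEANUP	= 0,	/* Neither pinned nor ref'd */
	FRAG_REFFED	= 1,	/* Holds a ref; release with a put */
	FRAG_PINNED	= 2,	/* Holds a pin; release with an unpin */
};

#define FRAG_CLEANUP_MASK ((uintptr_t)3)

/* Pack a (4-byte aligned) page pointer and a cleanup mode into one word. */
static uintptr_t frag_pack(void *page, enum frag_cleanup how)
{
	uintptr_t v = (uintptr_t)page;

	assert((v & FRAG_CLEANUP_MASK) == 0);	/* Low bits must be free */
	return v | (uintptr_t)how;
}

/* Recover the page pointer by masking off the two tag bits. */
static void *frag_page(uintptr_t packed)
{
	return (void *)(packed & ~FRAG_CLEANUP_MASK);
}

/* Recover the cleanup mode from the two tag bits. */
static enum frag_cleanup frag_how(uintptr_t packed)
{
	return (enum frag_cleanup)(packed & FRAG_CLEANUP_MASK);
}
```

With an encoding like this, the code that releases a fragment can pick
between put, unpin and nothing without any extra per-fragment storage.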


I have tested AF_UNIX splicing - which, surprisingly, seems nearly twice as
fast - TCP splicing, the siw driver (softIWarp RDMA with nfs and cifs),
sunrpc (with nfsd) and UDP (using a patched rxrpc).

I've pushed the patches here also:

	https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=iov-sendpage

David

David Howells (28):
  net: Declare MSG_SPLICE_PAGES internal sendmsg() flag
  Add a special allocator for staging netfs protocol to MSG_SPLICE_PAGES
  tcp: Support MSG_SPLICE_PAGES
  tcp: Convert do_tcp_sendpages() to use MSG_SPLICE_PAGES
  tcp_bpf: Inline do_tcp_sendpages as it's now a wrapper around
    tcp_sendmsg
  espintcp: Inline do_tcp_sendpages()
  tls: Inline do_tcp_sendpages()
  siw: Inline do_tcp_sendpages()
  tcp: Fold do_tcp_sendpages() into tcp_sendpage_locked()
  ip, udp: Support MSG_SPLICE_PAGES
  udp: Convert udp_sendpage() to use MSG_SPLICE_PAGES
  af_unix: Support MSG_SPLICE_PAGES
  crypto: af_alg: Indent the loop in af_alg_sendmsg()
  crypto: af_alg: Support MSG_SPLICE_PAGES
  crypto: af_alg: Convert af_alg_sendpage() to use MSG_SPLICE_PAGES
  splice, net: Use sendmsg(MSG_SPLICE_PAGES) rather than ->sendpage()
  Remove file->f_op->sendpage
  siw: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage to transmit
  ceph: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage
  iscsi: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage
  tcp_bpf: Make tcp_bpf_sendpage() go through
    tcp_bpf_sendmsg(MSG_SPLICE_PAGES)
  net: Use sendmsg(MSG_SPLICE_PAGES) not sendpage in skb_send_sock()
  algif: Remove hash_sendpage*()
  ceph: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage()
  rds: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage
  dlm: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage
  sunrpc: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage
  sock: Remove ->sendpage*() in favour of sendmsg(MSG_SPLICE_PAGES)

 Documentation/networking/scaling.rst     |   4 +-
 crypto/Kconfig                           |   1 +
 crypto/af_alg.c                          | 137 +++++--------
 crypto/algif_aead.c                      |  40 ++--
 crypto/algif_hash.c                      |  66 ------
 crypto/algif_rng.c                       |   2 -
 crypto/algif_skcipher.c                  |  22 +-
 drivers/infiniband/sw/siw/siw_qp_tx.c    | 224 +++++----------------
 drivers/target/iscsi/iscsi_target_util.c |  14 +-
 fs/dlm/lowcomms.c                        |  10 +-
 fs/splice.c                              |  42 ++--
 include/linux/fs.h                       |   3 -
 include/linux/net.h                      |   8 -
 include/linux/socket.h                   |   1 +
 include/linux/splice.h                   |   2 +
 include/linux/zcopy_alloc.h              |  16 ++
 include/net/inet_common.h                |   2 -
 include/net/sock.h                       |   6 -
 include/net/tcp.h                        |   2 -
 include/net/tls.h                        |   2 +-
 mm/Makefile                              |   2 +-
 mm/zcopy_alloc.c                         | 129 ++++++++++++
 net/appletalk/ddp.c                      |   1 -
 net/atm/pvc.c                            |   1 -
 net/atm/svc.c                            |   1 -
 net/ax25/af_ax25.c                       |   1 -
 net/caif/caif_socket.c                   |   2 -
 net/can/bcm.c                            |   1 -
 net/can/isotp.c                          |   1 -
 net/can/j1939/socket.c                   |   1 -
 net/can/raw.c                            |   1 -
 net/ceph/messenger_v1.c                  |  58 ++----
 net/ceph/messenger_v2.c                  |  89 ++-------
 net/core/skbuff.c                        |  49 +++--
 net/core/sock.c                          |  35 +---
 net/dccp/ipv4.c                          |   1 -
 net/dccp/ipv6.c                          |   1 -
 net/ieee802154/socket.c                  |   2 -
 net/ipv4/af_inet.c                       |  21 --
 net/ipv4/ip_output.c                     |  89 ++++++++-
 net/ipv4/tcp.c                           | 244 +++++------------------
 net/ipv4/tcp_bpf.c                       |  72 ++-----
 net/ipv4/tcp_ipv4.c                      |   1 -
 net/ipv4/udp.c                           |  54 -----
 net/ipv4/udp_impl.h                      |   2 -
 net/ipv4/udplite.c                       |   1 -
 net/ipv6/af_inet6.c                      |   3 -
 net/ipv6/raw.c                           |   1 -
 net/ipv6/tcp_ipv6.c                      |   1 -
 net/key/af_key.c                         |   1 -
 net/l2tp/l2tp_ip.c                       |   1 -
 net/l2tp/l2tp_ip6.c                      |   1 -
 net/llc/af_llc.c                         |   1 -
 net/mctp/af_mctp.c                       |   1 -
 net/mptcp/protocol.c                     |   2 -
 net/netlink/af_netlink.c                 |   1 -
 net/netrom/af_netrom.c                   |   1 -
 net/packet/af_packet.c                   |   2 -
 net/phonet/socket.c                      |   2 -
 net/qrtr/af_qrtr.c                       |   1 -
 net/rds/af_rds.c                         |   1 -
 net/rds/tcp_send.c                       |  80 ++++----
 net/rose/af_rose.c                       |   1 -
 net/rxrpc/af_rxrpc.c                     |   1 -
 net/sctp/protocol.c                      |   1 -
 net/socket.c                             |  74 +------
 net/sunrpc/svcsock.c                     |  70 ++-----
 net/sunrpc/xdr.c                         |  24 ++-
 net/tipc/socket.c                        |   3 -
 net/tls/tls_main.c                       |  24 ++-
 net/unix/af_unix.c                       | 223 +++++++--------------
 net/vmw_vsock/af_vsock.c                 |   3 -
 net/x25/af_x25.c                         |   1 -
 net/xdp/xsk.c                            |   1 -
 net/xfrm/espintcp.c                      |  10 +-
 75 files changed, 687 insertions(+), 1313 deletions(-)
 create mode 100644 include/linux/zcopy_alloc.h
 create mode 100644 mm/zcopy_alloc.c



* [RFC PATCH 01/28] net: Declare MSG_SPLICE_PAGES internal sendmsg() flag
  2023-03-16 15:25 [RFC PATCH 00/28] splice, net: Replace sendpage with sendmsg(MSG_SPLICE_PAGES) David Howells
@ 2023-03-16 15:25 ` David Howells
  2023-03-16 15:25 ` [RFC PATCH 02/28] Add a special allocator for staging netfs protocol to MSG_SPLICE_PAGES David Howells
                   ` (26 subsequent siblings)
  27 siblings, 0 replies; 81+ messages in thread
From: David Howells @ 2023-03-16 15:25 UTC (permalink / raw)
  To: Matthew Wilcox, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe,
	Jeff Layton, Christian Brauner, Linus Torvalds, netdev,
	linux-fsdevel, linux-kernel, linux-mm

Declare MSG_SPLICE_PAGES, an internal sendmsg() flag, that hints to a
network protocol that it should splice pages from the source iterator
rather than copying the data if it can.

This is intended as a replacement for the ->sendpage() op, allowing a way
to splice in several multipage folios in one go.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: netdev@vger.kernel.org
---
 include/linux/socket.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/socket.h b/include/linux/socket.h
index 13c3a237b9c9..a67d02da3c54 100644
--- a/include/linux/socket.h
+++ b/include/linux/socket.h
@@ -327,6 +327,7 @@ struct ucred {
 					  */
 
 #define MSG_ZEROCOPY	0x4000000	/* Use user data in kernel path */
+#define MSG_SPLICE_PAGES 0x8000000	/* Splice the pages from the iterator in sendmsg() */
 #define MSG_FASTOPEN	0x20000000	/* Send data in TCP SYN */
 #define MSG_CMSG_CLOEXEC 0x40000000	/* Set close_on_exec for file
 					   descriptor received through



* [RFC PATCH 02/28] Add a special allocator for staging netfs protocol to MSG_SPLICE_PAGES
  2023-03-16 15:25 [RFC PATCH 00/28] splice, net: Replace sendpage with sendmsg(MSG_SPLICE_PAGES) David Howells
  2023-03-16 15:25 ` [RFC PATCH 01/28] net: Declare MSG_SPLICE_PAGES internal sendmsg() flag David Howells
@ 2023-03-16 15:25 ` David Howells
  2023-03-16 17:28   ` Matthew Wilcox
  2023-03-16 18:00   ` David Howells
  2023-03-16 15:25 ` [RFC PATCH 03/28] tcp: Support MSG_SPLICE_PAGES David Howells
                   ` (25 subsequent siblings)
  27 siblings, 2 replies; 81+ messages in thread
From: David Howells @ 2023-03-16 15:25 UTC (permalink / raw)
  To: Matthew Wilcox, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe,
	Jeff Layton, Christian Brauner, Linus Torvalds, netdev,
	linux-fsdevel, linux-kernel, linux-mm, Bernard Metzler,
	Tom Talpey, linux-rdma

If a network protocol sendmsg() sees MSG_SPLICE_PAGES, it expects that the
iterator is of ITER_BVEC type and that all the pages can have refs taken on
them with get_page() and discarded with put_page().  Bits of network
filesystem protocol data, however, are typically contained in slab memory
for which the cleanup method is kfree(), not put_page(), so this doesn't
work.

Provide a simple allocator, zcopy_alloc(), that allocates a page at a time
per-cpu and sequentially breaks off pieces and hands them out with a ref as
it's asked for them.  The caller disposes of the memory it was given by
calling put_page().  When a page is all parcelled out, it is abandoned by
the allocator and another page is obtained.  The page will get cleaned up
when the last skbuff fragment is destroyed.

A helper function, zcopy_memdup(), is provided to call zcopy_alloc() and
copy the data it is given into it.
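
The scheme described above can be modelled in single-threaded userspace C
as follows.  This is a sketch under stated assumptions rather than the
code in the patch: the model_* names are hypothetical, malloc()/free()
stand in for the page allocator, an integer counter stands in for the page
refcount, and there is no per-cpu state or spare page.  It keeps the same
rounding to 8 bytes and the same half-page size limit as the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096u

struct model_page {
	int refcount;
	unsigned char data[MODEL_PAGE_SIZE];
};

struct model_piece {		/* Plays the role of the bio_vec */
	struct model_page *page;
	size_t offset;
	size_t len;
};

struct model_allocator {
	struct model_page *page;	/* Page being carved up, or NULL */
	size_t used;			/* Bytes already handed out from it */
};

static int model_pages_live;	/* Pages allocated and not yet freed */

/* Drop one ref; the page is freed when the last ref goes away. */
static void model_put_page(struct model_page *page)
{
	assert(page->refcount > 0);
	if (--page->refcount == 0) {
		free(page);
		model_pages_live--;
	}
}

/* Hand out @size bytes with a ref on the page.  Returns 0 or -1. */
static int model_zcopy_alloc(struct model_allocator *a, size_t size,
			     struct model_piece *piece)
{
	size_t full_size = (size + 7) & ~(size_t)7;

	if (full_size > MODEL_PAGE_SIZE / 2)
		return -1;

	/* Abandon a page that cannot satisfy this request. */
	if (a->page && MODEL_PAGE_SIZE - a->used < full_size) {
		model_put_page(a->page);
		a->page = NULL;
	}
	if (!a->page) {
		a->page = malloc(sizeof(*a->page));
		if (!a->page)
			return -1;
		a->page->refcount = 1;	/* The allocator's own ref */
		a->used = 0;
		model_pages_live++;
	}

	piece->page = a->page;
	piece->offset = a->used;
	piece->len = size;
	a->used += full_size;
	if (a->used < MODEL_PAGE_SIZE)
		a->page->refcount++;	/* New ref for the caller */
	else
		a->page = NULL;		/* Full: caller inherits our ref */
	return 0;
}

/* Allocate a piece and copy @size bytes from @src into it. */
static int model_zcopy_memdup(struct model_allocator *a, const void *src,
			      size_t size, struct model_piece *piece)
{
	if (model_zcopy_alloc(a, size, piece) < 0)
		return -1;
	memcpy(piece->page->data + piece->offset, src, size);
	return 0;
}
```

A caller releases each piece with model_put_page(piece.page).  A page that
the allocator has abandoned is freed once the last outstanding piece is
released, which is the lifetime rule the commit message describes.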

[!] I'm not sure this is the best way to do things.  A better way might be
    to make the network protocol look at the page and copy it if it's a
    slab object rather than taking a ref on it.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Bernard Metzler <bmt@zurich.ibm.com>
cc: Tom Talpey <tom@talpey.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: linux-rdma@vger.kernel.org
cc: netdev@vger.kernel.org
---
 include/linux/zcopy_alloc.h |  16 +++++
 mm/Makefile                 |   2 +-
 mm/zcopy_alloc.c            | 129 ++++++++++++++++++++++++++++++++++++
 3 files changed, 146 insertions(+), 1 deletion(-)
 create mode 100644 include/linux/zcopy_alloc.h
 create mode 100644 mm/zcopy_alloc.c

diff --git a/include/linux/zcopy_alloc.h b/include/linux/zcopy_alloc.h
new file mode 100644
index 000000000000..8eb205678073
--- /dev/null
+++ b/include/linux/zcopy_alloc.h
@@ -0,0 +1,16 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Defs for zerocopy filler fragment allocator.
+ *
+ * Copyright (C) 2023 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ */
+
+#ifndef _LINUX_ZCOPY_ALLOC_H
+#define _LINUX_ZCOPY_ALLOC_H
+
+struct bio_vec;
+
+int zcopy_alloc(size_t size, struct bio_vec *bvec, gfp_t gfp);
+int zcopy_memdup(size_t size, const void *p, struct bio_vec *bvec, gfp_t gfp);
+
+#endif /* _LINUX_ZCOPY_ALLOC_H */
diff --git a/mm/Makefile b/mm/Makefile
index 8e105e5b3e29..3848f43751ee 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -52,7 +52,7 @@ obj-y			:= filemap.o mempool.o oom_kill.o fadvise.o \
 			   readahead.o swap.o truncate.o vmscan.o shmem.o \
 			   util.o mmzone.o vmstat.o backing-dev.o \
 			   mm_init.o percpu.o slab_common.o \
-			   compaction.o \
+			   compaction.o zcopy_alloc.o \
 			   interval_tree.o list_lru.o workingset.o \
 			   debug.o gup.o mmap_lock.o $(mmu-y)
 
diff --git a/mm/zcopy_alloc.c b/mm/zcopy_alloc.c
new file mode 100644
index 000000000000..7b219392e829
--- /dev/null
+++ b/mm/zcopy_alloc.c
@@ -0,0 +1,129 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Allocator for zerocopy filler fragments
+ *
+ * Copyright (C) 2023 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ *
+ * Provide a facility whereby pieces of bufferage can be allocated for
+ * insertion into bio_vec arrays intended for zerocopying, allowing protocol
+ * stuff to be mixed in with data.
+ *
+ * Unlike objects allocated from the slab, the lifetime of these pieces of
+ * buffer is governed purely by the refcount of the page in which they reside.
+ */
+
+#include <linux/export.h>
+#include <linux/init.h>
+#include <linux/mm.h>
+#include <linux/zcopy_alloc.h>
+#include <linux/bvec.h>
+
+struct zcopy_alloc_info {
+	struct folio	*folio;		/* Page currently being allocated from */
+	struct folio	*spare;		/* Spare page */
+	unsigned int	used;		/* Amount of folio used */
+	spinlock_t	lock;		/* Allocation lock (needs bh-disable) */
+};
+
+static struct zcopy_alloc_info __percpu *zcopy_alloc_info;
+
+static int __init zcopy_alloc_init(void)
+{
+	zcopy_alloc_info = alloc_percpu(struct zcopy_alloc_info);
+	if (!zcopy_alloc_info)
+		panic("Unable to set up zcopy_alloc allocator\n");
+	return 0;
+}
+subsys_initcall(zcopy_alloc_init);
+
+/**
+ * zcopy_alloc - Allocate some memory for use in zerocopy
+ * @size: The amount of memory (maximum 1/2 page).
+ * @bvec: Where to store the details of the memory
+ * @gfp: Allocation flags under which to make an allocation
+ *
+ * Allocate some memory for use with zerocopy where protocol bits have to be
+ * mixed in with spliced/zerocopied data.  Unlike memory allocated from the
+ * slab, this memory's lifetime is purely dependent on the folio's refcount.
+ *
+ * The way it works is that a folio is allocated and pieces are broken off
+ * sequentially and given to the callers with a ref until it no longer has
+ * enough spare space, at which point the allocator's ref is dropped and a new
+ * folio is allocated.  The folio remains in existence until the last ref held
+ * by, say, a sk_buff is discarded and then the page is returned to the
+ * allocator.
+ *
+ * Returns 0 on success and -ENOMEM on allocation failure.  If successful, the
+ * details of the allocated memory are placed in *@bvec.
+ *
+ * The allocated memory should be disposed of with folio_put().
+ */
+int zcopy_alloc(size_t size, struct bio_vec *bvec, gfp_t gfp)
+{
+	struct zcopy_alloc_info *info;
+	struct folio *folio, *spare = NULL;
+	size_t full_size = round_up(size, 8);
+
+	if (WARN_ON_ONCE(full_size > PAGE_SIZE / 2))
+		return -ENOMEM; /* Allocate pages */
+
+try_again:
+	info = get_cpu_ptr(zcopy_alloc_info);
+
+	folio = info->folio;
+	if (folio && folio_size(folio) - info->used < full_size) {
+		folio_put(folio);
+		folio = info->folio = NULL;
+	}
+	if (spare && !info->spare) {
+		info->spare = spare;
+		spare = NULL;
+	}
+	if (!folio && info->spare) {
+		folio = info->folio = info->spare;
+		info->spare = NULL;
+		info->used = 0;
+	}
+	if (folio) {
+		bvec_set_folio(bvec, folio, size, info->used);
+		info->used += full_size;
+		if (info->used < folio_size(folio))
+			folio_get(folio);
+		else
+			info->folio = NULL;
+	}
+
+	put_cpu_ptr(zcopy_alloc_info);
+	if (folio) {
+		if (spare)
+			folio_put(spare);
+		return 0;
+	}
+
+	spare = folio_alloc(gfp, 0);
+	if (!spare)
+		return -ENOMEM;
+	goto try_again;
+}
+EXPORT_SYMBOL(zcopy_alloc);
+
+/**
+ * zcopy_memdup - Allocate some memory for use in zerocopy and fill it
+ * @size: The amount of memory to copy (maximum 1/2 page).
+ * @p: The source data to copy
+ * @bvec: Where to store the details of the memory
+ * @gfp: Allocation flags under which to make an allocation
+ */
+int zcopy_memdup(size_t size, const void *p, struct bio_vec *bvec, gfp_t gfp)
+{
+	void *q;
+
+	if (zcopy_alloc(size, bvec, gfp) < 0)
+		return -ENOMEM;
+
+	q = kmap_local_folio(page_folio(bvec->bv_page), bvec->bv_offset);
+	memcpy(q, p, size);
+	kunmap_local(q);
+	return 0;
+}
+EXPORT_SYMBOL(zcopy_memdup);



* [RFC PATCH 03/28] tcp: Support MSG_SPLICE_PAGES
  2023-03-16 15:25 [RFC PATCH 00/28] splice, net: Replace sendpage with sendmsg(MSG_SPLICE_PAGES) David Howells
  2023-03-16 15:25 ` [RFC PATCH 01/28] net: Declare MSG_SPLICE_PAGES internal sendmsg() flag David Howells
  2023-03-16 15:25 ` [RFC PATCH 02/28] Add a special allocator for staging netfs protocol to MSG_SPLICE_PAGES David Howells
@ 2023-03-16 15:25 ` David Howells
  2023-03-16 18:37   ` Willem de Bruijn
  2023-03-16 18:44   ` David Howells
  2023-03-16 15:25 ` [RFC PATCH 04/28] tcp: Convert do_tcp_sendpages() to use MSG_SPLICE_PAGES David Howells
                   ` (24 subsequent siblings)
  27 siblings, 2 replies; 81+ messages in thread
From: David Howells @ 2023-03-16 15:25 UTC (permalink / raw)
  To: Matthew Wilcox, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe,
	Jeff Layton, Christian Brauner, Linus Torvalds, netdev,
	linux-fsdevel, linux-kernel, linux-mm

Make TCP's sendmsg() support MSG_SPLICE_PAGES.  This causes pages to be
spliced from the source iterator if possible (the iterator must be
ITER_BVEC and the pages must be spliceable).

This allows ->sendpage() to be replaced by something that can handle
multiple multipage folios in a single transaction.
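
As a reading aid, the way the diff below chooses between copying,
MSG_ZEROCOPY and splicing can be modelled in userspace like this.  It is a
sketch and not part of the patch: the function and parameter names are
hypothetical, the two flag values are copied from the socket.h hunk in
patch 01, and the outer "MSG_ZEROCOPY and non-zero size" test is assumed
because it lies outside the hunk context shown.  The zerocopy_armed
parameter stands for "msg_ubuf is set or the socket has SOCK_ZEROCOPY".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Values as they appear in the include/linux/socket.h hunk of patch 01. */
#define MODEL_MSG_ZEROCOPY	0x4000000u
#define MODEL_MSG_SPLICE_PAGES	0x8000000u

/* Mirrors the three values that zc takes in the patch. */
enum model_tx_mode {
	TX_COPY		= 0,	/* Copy data into the skb */
	TX_ZEROCOPY	= 1,	/* MSG_ZEROCOPY path */
	TX_SPLICE	= 2,	/* Splice pages from the iterator */
};

/*
 * Pick the transmit mode.  Returns a model_tx_mode value, or -EINVAL if
 * MSG_SPLICE_PAGES is requested with an iterator that is not ITER_BVEC.
 */
static int model_pick_tx_mode(unsigned int flags, size_t size,
			      bool zerocopy_armed, bool iter_is_bvec,
			      bool route_has_sg)
{
	/* The two flags must occupy distinct bits. */
	assert((MODEL_MSG_ZEROCOPY & MODEL_MSG_SPLICE_PAGES) == 0);

	if ((flags & MODEL_MSG_ZEROCOPY) && size) {
		if (zerocopy_armed && route_has_sg)
			return TX_ZEROCOPY;
		return TX_COPY;
	}
	if ((flags & MODEL_MSG_SPLICE_PAGES) && size) {
		if (!iter_is_bvec)
			return -EINVAL;
		if (route_has_sg)
			return TX_SPLICE;
	}
	return TX_COPY;
}
```

Note that in this reading MSG_ZEROCOPY takes precedence when both flags
are set, and that a route without scatter-gather support quietly falls
back to copying rather than returning an error.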

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Eric Dumazet <edumazet@google.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: netdev@vger.kernel.org
---
 net/ipv4/tcp.c | 59 +++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 53 insertions(+), 6 deletions(-)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 288693981b00..77c0c69208a5 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1220,7 +1220,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
 	int flags, err, copied = 0;
 	int mss_now = 0, size_goal, copied_syn = 0;
 	int process_backlog = 0;
-	bool zc = false;
+	int zc = 0;
 	long timeo;
 
 	flags = msg->msg_flags;
@@ -1231,17 +1231,24 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
 		if (msg->msg_ubuf) {
 			uarg = msg->msg_ubuf;
 			net_zcopy_get(uarg);
-			zc = sk->sk_route_caps & NETIF_F_SG;
+			if (sk->sk_route_caps & NETIF_F_SG)
+				zc = 1;
 		} else if (sock_flag(sk, SOCK_ZEROCOPY)) {
 			uarg = msg_zerocopy_realloc(sk, size, skb_zcopy(skb));
 			if (!uarg) {
 				err = -ENOBUFS;
 				goto out_err;
 			}
-			zc = sk->sk_route_caps & NETIF_F_SG;
-			if (!zc)
+			if (sk->sk_route_caps & NETIF_F_SG)
+				zc = 1;
+			else
 				uarg_to_msgzc(uarg)->zerocopy = 0;
 		}
+	} else if (unlikely(flags & MSG_SPLICE_PAGES) && size) {
+		if (!iov_iter_is_bvec(&msg->msg_iter))
+			return -EINVAL;
+		if (sk->sk_route_caps & NETIF_F_SG)
+			zc = 2;
 	}
 
 	if (unlikely(flags & MSG_FASTOPEN || inet_sk(sk)->defer_connect) &&
@@ -1345,7 +1352,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
 		if (copy > msg_data_left(msg))
 			copy = msg_data_left(msg);
 
-		if (!zc) {
+		if (zc == 0) {
 			bool merge = true;
 			int i = skb_shinfo(skb)->nr_frags;
 			struct page_frag *pfrag = sk_page_frag(sk);
@@ -1390,7 +1397,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
 				page_ref_inc(pfrag->page);
 			}
 			pfrag->offset += copy;
-		} else {
+		} else if (zc == 1)  {
 			/* First append to a fragless skb builds initial
 			 * pure zerocopy skb
 			 */
@@ -1411,6 +1418,46 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
 			if (err < 0)
 				goto do_error;
 			copy = err;
+		} else if (zc == 2) {
+			/* Splice in data. */
+			const struct bio_vec *bv = msg->msg_iter.bvec;
+			size_t seg = iov_iter_single_seg_count(&msg->msg_iter);
+			size_t off = bv->bv_offset + msg->msg_iter.iov_offset;
+			bool can_coalesce;
+			int i = skb_shinfo(skb)->nr_frags;
+
+			if (copy > seg)
+				copy = seg;
+
+			can_coalesce = skb_can_coalesce(skb, i, bv->bv_page, off);
+			if (!can_coalesce && i >= READ_ONCE(sysctl_max_skb_frags)) {
+				tcp_mark_push(tp, skb);
+				goto new_segment;
+			}
+			if (tcp_downgrade_zcopy_pure(sk, skb))
+				goto wait_for_space;
+
+			copy = tcp_wmem_schedule(sk, copy);
+			if (!copy)
+				goto wait_for_space;
+
+			if (can_coalesce) {
+				skb_frag_size_add(&skb_shinfo(skb)->frags[i - 1], copy);
+			} else {
+				get_page(bv->bv_page);
+				skb_fill_page_desc_noacc(skb, i, bv->bv_page, off, copy);
+			}
+			iov_iter_advance(&msg->msg_iter, copy);
+
+			if (!(flags & MSG_NO_SHARED_FRAGS))
+				skb_shinfo(skb)->flags |= SKBFL_SHARED_FRAG;
+
+			skb->len += copy;
+			skb->data_len += copy;
+			skb->truesize += copy;
+			sk_wmem_queued_add(sk, copy);
+			sk_mem_charge(sk, copy);
+
 		}
 
 		if (!copied)



* [RFC PATCH 04/28] tcp: Convert do_tcp_sendpages() to use MSG_SPLICE_PAGES
  2023-03-16 15:25 [RFC PATCH 00/28] splice, net: Replace sendpage with sendmsg(MSG_SPLICE_PAGES) David Howells
                   ` (2 preceding siblings ...)
  2023-03-16 15:25 ` [RFC PATCH 03/28] tcp: Support MSG_SPLICE_PAGES David Howells
@ 2023-03-16 15:25 ` David Howells
  2023-03-16 15:25 ` [RFC PATCH 05/28] tcp_bpf: Inline do_tcp_sendpages as it's now a wrapper around tcp_sendmsg David Howells
                   ` (23 subsequent siblings)
  27 siblings, 0 replies; 81+ messages in thread
From: David Howells @ 2023-03-16 15:25 UTC (permalink / raw)
  To: Matthew Wilcox, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe,
	Jeff Layton, Christian Brauner, Linus Torvalds, netdev,
	linux-fsdevel, linux-kernel, linux-mm

Convert do_tcp_sendpages() to use sendmsg() with MSG_SPLICE_PAGES rather
than directly splicing in the pages itself.  do_tcp_sendpages() can then be
inlined in subsequent patches into its callers.

This allows ->sendpage() to be replaced by something that can handle
multiple multipage folios in a single transaction.
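
The flag handling in the new wrapper can be summarised by the small
userspace model below.  It is a sketch only: the function name is
hypothetical, MSG_SPLICE_PAGES uses the value from patch 01, and the
values given for MSG_MORE and MSG_SENDPAGE_NOTLAST are what
include/linux/socket.h is believed to define at the time of this series
and should be checked against the tree in use.

```c
#include <assert.h>

#define MODEL_MSG_MORE			0x8000u		/* Assumed value */
#define MODEL_MSG_SENDPAGE_NOTLAST	0x20000u	/* Assumed value */
#define MODEL_MSG_SPLICE_PAGES		0x8000000u	/* From patch 01 */

/*
 * Translate sendpage-style flags into the msg_flags that the wrapper
 * passes to sendmsg(): always request splicing, and turn "not the last
 * page" into MSG_MORE.  As in the wrapper, the original flags are kept.
 */
static unsigned int model_sendpage_to_msg_flags(unsigned int flags)
{
	unsigned int msg_flags = flags | MODEL_MSG_SPLICE_PAGES;

	if (flags & MODEL_MSG_SENDPAGE_NOTLAST)
		msg_flags |= MODEL_MSG_MORE;

	assert(msg_flags & MODEL_MSG_SPLICE_PAGES);
	return msg_flags;
}
```

The single page is then described by a one-element bvec, so the sendmsg()
path sees the same data that ->sendpage() would have been given.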

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Eric Dumazet <edumazet@google.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: netdev@vger.kernel.org
---
 net/ipv4/tcp.c | 160 +++----------------------------------------------
 1 file changed, 9 insertions(+), 151 deletions(-)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 77c0c69208a5..7c3acc5673e9 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -971,163 +971,21 @@ static int tcp_wmem_schedule(struct sock *sk, int copy)
 	return min(copy, sk->sk_forward_alloc);
 }
 
-static struct sk_buff *tcp_build_frag(struct sock *sk, int size_goal, int flags,
-				      struct page *page, int offset, size_t *size)
-{
-	struct sk_buff *skb = tcp_write_queue_tail(sk);
-	struct tcp_sock *tp = tcp_sk(sk);
-	bool can_coalesce;
-	int copy, i;
-
-	if (!skb || (copy = size_goal - skb->len) <= 0 ||
-	    !tcp_skb_can_collapse_to(skb)) {
-new_segment:
-		if (!sk_stream_memory_free(sk))
-			return NULL;
-
-		skb = tcp_stream_alloc_skb(sk, 0, sk->sk_allocation,
-					   tcp_rtx_and_write_queues_empty(sk));
-		if (!skb)
-			return NULL;
-
-#ifdef CONFIG_TLS_DEVICE
-		skb->decrypted = !!(flags & MSG_SENDPAGE_DECRYPTED);
-#endif
-		tcp_skb_entail(sk, skb);
-		copy = size_goal;
-	}
-
-	if (copy > *size)
-		copy = *size;
-
-	i = skb_shinfo(skb)->nr_frags;
-	can_coalesce = skb_can_coalesce(skb, i, page, offset);
-	if (!can_coalesce && i >= READ_ONCE(sysctl_max_skb_frags)) {
-		tcp_mark_push(tp, skb);
-		goto new_segment;
-	}
-	if (tcp_downgrade_zcopy_pure(sk, skb))
-		return NULL;
-
-	copy = tcp_wmem_schedule(sk, copy);
-	if (!copy)
-		return NULL;
-
-	if (can_coalesce) {
-		skb_frag_size_add(&skb_shinfo(skb)->frags[i - 1], copy);
-	} else {
-		get_page(page);
-		skb_fill_page_desc_noacc(skb, i, page, offset, copy);
-	}
-
-	if (!(flags & MSG_NO_SHARED_FRAGS))
-		skb_shinfo(skb)->flags |= SKBFL_SHARED_FRAG;
-
-	skb->len += copy;
-	skb->data_len += copy;
-	skb->truesize += copy;
-	sk_wmem_queued_add(sk, copy);
-	sk_mem_charge(sk, copy);
-	WRITE_ONCE(tp->write_seq, tp->write_seq + copy);
-	TCP_SKB_CB(skb)->end_seq += copy;
-	tcp_skb_pcount_set(skb, 0);
-
-	*size = copy;
-	return skb;
-}
-
 ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset,
 			 size_t size, int flags)
 {
-	struct tcp_sock *tp = tcp_sk(sk);
-	int mss_now, size_goal;
-	int err;
-	ssize_t copied;
-	long timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT);
-
-	if (IS_ENABLED(CONFIG_DEBUG_VM) &&
-	    WARN_ONCE(!sendpage_ok(page),
-		      "page must not be a Slab one and have page_count > 0"))
-		return -EINVAL;
-
-	/* Wait for a connection to finish. One exception is TCP Fast Open
-	 * (passive side) where data is allowed to be sent before a connection
-	 * is fully established.
-	 */
-	if (((1 << sk->sk_state) & ~(TCPF_ESTABLISHED | TCPF_CLOSE_WAIT)) &&
-	    !tcp_passive_fastopen(sk)) {
-		err = sk_stream_wait_connect(sk, &timeo);
-		if (err != 0)
-			goto out_err;
-	}
-
-	sk_clear_bit(SOCKWQ_ASYNC_NOSPACE, sk);
-
-	mss_now = tcp_send_mss(sk, &size_goal, flags);
-	copied = 0;
-
-	err = -EPIPE;
-	if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN))
-		goto out_err;
-
-	while (size > 0) {
-		struct sk_buff *skb;
-		size_t copy = size;
-
-		skb = tcp_build_frag(sk, size_goal, flags, page, offset, &copy);
-		if (!skb)
-			goto wait_for_space;
-
-		if (!copied)
-			TCP_SKB_CB(skb)->tcp_flags &= ~TCPHDR_PSH;
-
-		copied += copy;
-		offset += copy;
-		size -= copy;
-		if (!size)
-			goto out;
-
-		if (skb->len < size_goal || (flags & MSG_OOB))
-			continue;
-
-		if (forced_push(tp)) {
-			tcp_mark_push(tp, skb);
-			__tcp_push_pending_frames(sk, mss_now, TCP_NAGLE_PUSH);
-		} else if (skb == tcp_send_head(sk))
-			tcp_push_one(sk, mss_now);
-		continue;
-
-wait_for_space:
-		set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
-		tcp_push(sk, flags & ~MSG_MORE, mss_now,
-			 TCP_NAGLE_PUSH, size_goal);
-
-		err = sk_stream_wait_memory(sk, &timeo);
-		if (err != 0)
-			goto do_error;
+	struct bio_vec bvec;
+	struct msghdr msg = {
+		.msg_flags = flags | MSG_SPLICE_PAGES,
+	};
 
-		mss_now = tcp_send_mss(sk, &size_goal, flags);
-	}
+	bvec_set_page(&bvec, page, size, offset);
+	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
 
-out:
-	if (copied) {
-		tcp_tx_timestamp(sk, sk->sk_tsflags);
-		if (!(flags & MSG_SENDPAGE_NOTLAST))
-			tcp_push(sk, flags, mss_now, tp->nonagle, size_goal);
-	}
-	return copied;
+	if (flags & MSG_SENDPAGE_NOTLAST)
+		msg.msg_flags |= MSG_MORE;
 
-do_error:
-	tcp_remove_empty_skb(sk);
-	if (copied)
-		goto out;
-out_err:
-	/* make sure we wake any epoll edge trigger waiter */
-	if (unlikely(tcp_rtx_and_write_queues_empty(sk) && err == -EAGAIN)) {
-		sk->sk_write_space(sk);
-		tcp_chrono_stop(sk, TCP_CHRONO_SNDBUF_LIMITED);
-	}
-	return sk_stream_error(sk, flags, err);
+	return tcp_sendmsg_locked(sk, &msg, size);
 }
 EXPORT_SYMBOL_GPL(do_tcp_sendpages);
 



* [RFC PATCH 05/28] tcp_bpf: Inline do_tcp_sendpages as it's now a wrapper around tcp_sendmsg
  2023-03-16 15:25 [RFC PATCH 00/28] splice, net: Replace sendpage with sendmsg(MSG_SPLICE_PAGES) David Howells
                   ` (3 preceding siblings ...)
  2023-03-16 15:25 ` [RFC PATCH 04/28] tcp: Convert do_tcp_sendpages() to use MSG_SPLICE_PAGES David Howells
@ 2023-03-16 15:25 ` David Howells
  2023-03-16 15:25 ` [RFC PATCH 06/28] espintcp: Inline do_tcp_sendpages() David Howells
                   ` (22 subsequent siblings)
  27 siblings, 0 replies; 81+ messages in thread
From: David Howells @ 2023-03-16 15:25 UTC (permalink / raw)
  To: Matthew Wilcox, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe,
	Jeff Layton, Christian Brauner, Linus Torvalds, netdev,
	linux-fsdevel, linux-kernel, linux-mm, John Fastabend,
	Jakub Sitnicki, bpf

do_tcp_sendpages() is now just a small wrapper around tcp_sendmsg_locked(),
so inline it.  This is part of replacing ->sendpage() with a call to
sendmsg() with MSG_SPLICE_PAGES set.
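
As a rough illustration of the flag handling each inlined call site now has
to do for itself, here is a small userspace model.  The flag values are
stand-ins chosen for the example (the real definitions are in
include/linux/socket.h) and splice_msg_flags() is a hypothetical helper, not
something this series adds:

```c
/* Illustrative stand-in values, not taken from this series. */
#define SK_MSG_MORE             0x8000u
#define SK_MSG_SENDPAGE_NOTLAST 0x20000u
#define SK_MSG_SPLICE_PAGES     0x8000000u

/*
 * Hypothetical helper: derive the msg_flags for the sendmsg() call from
 * the flags a ->sendpage() caller would have passed.
 */
static unsigned int splice_msg_flags(unsigned int sendpage_flags)
{
	/* Always ask the protocol to splice the pages rather than copy. */
	unsigned int msg_flags = sendpage_flags | SK_MSG_SPLICE_PAGES;

	/* "Not the last page" means more data follows, so hold the push. */
	if (sendpage_flags & SK_MSG_SENDPAGE_NOTLAST)
		msg_flags |= SK_MSG_MORE;

	return msg_flags;
}
```

In the kernel the resulting flags go into a struct msghdr whose iterator is
set up with bvec_set_page() and iov_iter_bvec(), as the diff below does.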

Signed-off-by: David Howells <dhowells@redhat.com>
cc: John Fastabend <john.fastabend@gmail.com>
cc: Jakub Sitnicki <jakub@cloudflare.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: netdev@vger.kernel.org
cc: bpf@vger.kernel.org
---
 net/ipv4/tcp_bpf.c | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index cf26d65ca389..7f17134637eb 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -72,11 +72,13 @@ static int tcp_bpf_push(struct sock *sk, struct sk_msg *msg, u32 apply_bytes,
 {
 	bool apply = apply_bytes;
 	struct scatterlist *sge;
+	struct msghdr msghdr = { .msg_flags = flags | MSG_SPLICE_PAGES, };
 	struct page *page;
 	int size, ret = 0;
 	u32 off;
 
 	while (1) {
+		struct bio_vec bvec;
 		bool has_tx_ulp;
 
 		sge = sk_msg_elem(msg, msg->sg.start);
@@ -88,16 +90,18 @@ static int tcp_bpf_push(struct sock *sk, struct sk_msg *msg, u32 apply_bytes,
 		tcp_rate_check_app_limited(sk);
 retry:
 		has_tx_ulp = tls_sw_has_ctx_tx(sk);
-		if (has_tx_ulp) {
-			flags |= MSG_SENDPAGE_NOPOLICY;
-			ret = kernel_sendpage_locked(sk,
-						     page, off, size, flags);
-		} else {
-			ret = do_tcp_sendpages(sk, page, off, size, flags);
-		}
+		if (has_tx_ulp)
+			msghdr.msg_flags |= MSG_SENDPAGE_NOPOLICY;
 
+		if (flags & MSG_SENDPAGE_NOTLAST)
+			msghdr.msg_flags |= MSG_MORE;
+
+		bvec_set_page(&bvec, page, size, off);
+		iov_iter_bvec(&msghdr.msg_iter, ITER_SOURCE, &bvec, 1, size);
+		ret = tcp_sendmsg_locked(sk, &msghdr, size);
 		if (ret <= 0)
 			return ret;
+
 		if (apply)
 			apply_bytes -= ret;
 		msg->sg.size -= ret;
@@ -398,7 +402,7 @@ static int tcp_bpf_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 	long timeo;
 	int flags;
 
-	/* Don't let internal do_tcp_sendpages() flags through */
+	/* Don't let internal sendpage flags through */
 	flags = (msg->msg_flags & ~MSG_SENDPAGE_DECRYPTED);
 	flags |= MSG_NO_SHARED_FRAGS;
 



* [RFC PATCH 06/28] espintcp: Inline do_tcp_sendpages()
  2023-03-16 15:25 [RFC PATCH 00/28] splice, net: Replace sendpage with sendmsg(MSG_SPLICE_PAGES) David Howells
                   ` (4 preceding siblings ...)
  2023-03-16 15:25 ` [RFC PATCH 05/28] tcp_bpf: Inline do_tcp_sendpages as it's now a wrapper around tcp_sendmsg David Howells
@ 2023-03-16 15:25 ` David Howells
  2023-03-16 15:25 ` [RFC PATCH 07/28] tls: " David Howells
                   ` (21 subsequent siblings)
  27 siblings, 0 replies; 81+ messages in thread
From: David Howells @ 2023-03-16 15:25 UTC (permalink / raw)
  To: Matthew Wilcox, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe,
	Jeff Layton, Christian Brauner, Linus Torvalds, netdev,
	linux-fsdevel, linux-kernel, linux-mm, Steffen Klassert,
	Herbert Xu

do_tcp_sendpages() is now just a small wrapper around tcp_sendmsg_locked(),
so inline it, allowing do_tcp_sendpages() to be removed.  This is part of
replacing ->sendpage() with a call to sendmsg() with MSG_SPLICE_PAGES set.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Steffen Klassert <steffen.klassert@secunet.com>
cc: Herbert Xu <herbert@gondor.apana.org.au>
cc: Eric Dumazet <edumazet@google.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: netdev@vger.kernel.org
---
 net/xfrm/espintcp.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/net/xfrm/espintcp.c b/net/xfrm/espintcp.c
index 872b80188e83..3504925babdb 100644
--- a/net/xfrm/espintcp.c
+++ b/net/xfrm/espintcp.c
@@ -205,14 +205,16 @@ static int espintcp_sendskb_locked(struct sock *sk, struct espintcp_msg *emsg,
 static int espintcp_sendskmsg_locked(struct sock *sk,
 				     struct espintcp_msg *emsg, int flags)
 {
+	struct msghdr msghdr = { .msg_flags = flags | MSG_SPLICE_PAGES, };
 	struct sk_msg *skmsg = &emsg->skmsg;
 	struct scatterlist *sg;
 	int done = 0;
 	int ret;
 
-	flags |= MSG_SENDPAGE_NOTLAST;
+	msghdr.msg_flags |= MSG_SENDPAGE_NOTLAST;
 	sg = &skmsg->sg.data[skmsg->sg.start];
 	do {
+		struct bio_vec bvec;
 		size_t size = sg->length - emsg->offset;
 		int offset = sg->offset + emsg->offset;
 		struct page *p;
@@ -220,11 +222,13 @@ static int espintcp_sendskmsg_locked(struct sock *sk,
 		emsg->offset = 0;
 
 		if (sg_is_last(sg))
-			flags &= ~MSG_SENDPAGE_NOTLAST;
+			msghdr.msg_flags &= ~MSG_SENDPAGE_NOTLAST;
 
 		p = sg_page(sg);
 retry:
-		ret = do_tcp_sendpages(sk, p, offset, size, flags);
+		bvec_set_page(&bvec, p, size, offset);
+		iov_iter_bvec(&msghdr.msg_iter, ITER_SOURCE, &bvec, 1, size);
+		ret = tcp_sendmsg_locked(sk, &msghdr, size);
 		if (ret < 0) {
 			emsg->offset = offset - sg->offset;
 			skmsg->sg.start += done;



* [RFC PATCH 07/28] tls: Inline do_tcp_sendpages()
  2023-03-16 15:25 [RFC PATCH 00/28] splice, net: Replace sendpage with sendmsg(MSG_SPLICE_PAGES) David Howells
                   ` (5 preceding siblings ...)
  2023-03-16 15:25 ` [RFC PATCH 06/28] espintcp: Inline do_tcp_sendpages() David Howells
@ 2023-03-16 15:25 ` David Howells
  2023-03-16 15:25 ` [RFC PATCH 08/28] siw: " David Howells
                   ` (20 subsequent siblings)
  27 siblings, 0 replies; 81+ messages in thread
From: David Howells @ 2023-03-16 15:25 UTC (permalink / raw)
  To: Matthew Wilcox, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe,
	Jeff Layton, Christian Brauner, Linus Torvalds, netdev,
	linux-fsdevel, linux-kernel, linux-mm, Boris Pismenny,
	John Fastabend

do_tcp_sendpages() is now just a small wrapper around tcp_sendmsg_locked(),
so inline it, allowing do_tcp_sendpages() to be removed.  This is part of
replacing ->sendpage() with a call to sendmsg() with MSG_SPLICE_PAGES set.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Boris Pismenny <borisp@nvidia.com>
cc: John Fastabend <john.fastabend@gmail.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: netdev@vger.kernel.org
---
 include/net/tls.h  |  2 +-
 net/tls/tls_main.c | 24 +++++++++++++++---------
 2 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/include/net/tls.h b/include/net/tls.h
index 154949c7b0c8..d31521c36a84 100644
--- a/include/net/tls.h
+++ b/include/net/tls.h
@@ -256,7 +256,7 @@ struct tls_context {
 	struct scatterlist *partially_sent_record;
 	u16 partially_sent_offset;
 
-	bool in_tcp_sendpages;
+	bool splicing_pages;
 	bool pending_open_record_frags;
 
 	struct mutex tx_lock; /* protects partially_sent_* fields and
diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index 3735cb00905d..8802b4f8b652 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -124,7 +124,10 @@ int tls_push_sg(struct sock *sk,
 		u16 first_offset,
 		int flags)
 {
-	int sendpage_flags = flags | MSG_SENDPAGE_NOTLAST;
+	struct bio_vec bvec;
+	struct msghdr msg = {
+		.msg_flags = flags | MSG_SPLICE_PAGES | MSG_SENDPAGE_NOTLAST,
+	};
 	int ret = 0;
 	struct page *p;
 	size_t size;
@@ -133,16 +136,19 @@ int tls_push_sg(struct sock *sk,
 	size = sg->length - offset;
 	offset += sg->offset;
 
-	ctx->in_tcp_sendpages = true;
+	ctx->splicing_pages = true;
 	while (1) {
 		if (sg_is_last(sg))
-			sendpage_flags = flags;
+			msg.msg_flags = flags | MSG_SPLICE_PAGES;
 
 		/* is sending application-limited? */
 		tcp_rate_check_app_limited(sk);
 		p = sg_page(sg);
 retry:
-		ret = do_tcp_sendpages(sk, p, offset, size, sendpage_flags);
+		bvec_set_page(&bvec, p, size, offset);
+		iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
+
+		ret = tcp_sendmsg_locked(sk, &msg, size);
 
 		if (ret != size) {
 			if (ret > 0) {
@@ -154,7 +160,7 @@ int tls_push_sg(struct sock *sk,
 			offset -= sg->offset;
 			ctx->partially_sent_offset = offset;
 			ctx->partially_sent_record = (void *)sg;
-			ctx->in_tcp_sendpages = false;
+			ctx->splicing_pages = false;
 			return ret;
 		}
 
@@ -168,7 +174,7 @@ int tls_push_sg(struct sock *sk,
 		size = sg->length;
 	}
 
-	ctx->in_tcp_sendpages = false;
+	ctx->splicing_pages = false;
 
 	return 0;
 }
@@ -246,11 +252,11 @@ static void tls_write_space(struct sock *sk)
 {
 	struct tls_context *ctx = tls_get_ctx(sk);
 
-	/* If in_tcp_sendpages call lower protocol write space handler
+	/* If splicing_pages call lower protocol write space handler
 	 * to ensure we wake up any waiting operations there. For example
-	 * if do_tcp_sendpages where to call sk_wait_event.
+	 * if splicing pages were to call sk_wait_event.
 	 */
-	if (ctx->in_tcp_sendpages) {
+	if (ctx->splicing_pages) {
 		ctx->sk_write_space(sk);
 		return;
 	}



* [RFC PATCH 08/28] siw: Inline do_tcp_sendpages()
  2023-03-16 15:25 [RFC PATCH 00/28] splice, net: Replace sendpage with sendmsg(MSG_SPLICE_PAGES) David Howells
                   ` (6 preceding siblings ...)
  2023-03-16 15:25 ` [RFC PATCH 07/28] tls: " David Howells
@ 2023-03-16 15:25 ` David Howells
  2023-03-20 10:53   ` Bernard Metzler
  2023-03-20 11:08   ` David Howells
  2023-03-16 15:25 ` [RFC PATCH 09/28] tcp: Fold do_tcp_sendpages() into tcp_sendpage_locked() David Howells
                   ` (19 subsequent siblings)
  27 siblings, 2 replies; 81+ messages in thread
From: David Howells @ 2023-03-16 15:25 UTC (permalink / raw)
  To: Matthew Wilcox, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe,
	Jeff Layton, Christian Brauner, Linus Torvalds, netdev,
	linux-fsdevel, linux-kernel, linux-mm, Bernard Metzler,
	Tom Talpey, linux-rdma

do_tcp_sendpages() is now just a small wrapper around tcp_sendmsg_locked(),
so inline it, allowing do_tcp_sendpages() to be removed.  This is part of
replacing ->sendpage() with a call to sendmsg() with MSG_SPLICE_PAGES set.
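
For reference, the per-page chunking done by siw_tcp_sendpages() can be
modelled in userspace as below.  This is only a sketch: MODEL_PAGE_SIZE,
struct chunk and split_into_page_chunks() are made-up names, and a 4KiB page
size is assumed.  The point is that each send covers just "bytes" of the
current page, not the whole remaining "size":

```c
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u	/* assumption: 4KiB pages */

/* One piece of the transfer, confined to a single page. */
struct chunk {
	size_t page_index;	/* index into the page array */
	size_t offset;		/* offset within that page */
	size_t bytes;		/* length of this piece */
	int last;		/* nonzero if nothing follows this piece */
};

/*
 * Hypothetical helper: split [offset, offset + size) across consecutive
 * pages.  The caller must pass offset < MODEL_PAGE_SIZE.  Returns the
 * number of chunks written to out[], at most max.
 */
static size_t split_into_page_chunks(size_t offset, size_t size,
				     struct chunk *out, size_t max)
{
	size_t n = 0, i = 0;

	while (size && n < max) {
		size_t bytes = MODEL_PAGE_SIZE - offset;

		if (bytes > size)
			bytes = size;

		out[n].page_index = i;
		out[n].offset = offset;
		out[n].bytes = bytes;
		/* Same test the driver uses to drop the NOTLAST flag. */
		out[n].last = (size + offset <= MODEL_PAGE_SIZE);
		n++;

		size -= bytes;
		offset = 0;	/* later pages start at their beginning */
		i++;
	}
	return n;
}
```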

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Bernard Metzler <bmt@zurich.ibm.com>
cc: Tom Talpey <tom@talpey.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: linux-rdma@vger.kernel.org
cc: netdev@vger.kernel.org
---
 drivers/infiniband/sw/siw/siw_qp_tx.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/infiniband/sw/siw/siw_qp_tx.c b/drivers/infiniband/sw/siw/siw_qp_tx.c
index 05052b49107f..8fc179321e2b 100644
--- a/drivers/infiniband/sw/siw/siw_qp_tx.c
+++ b/drivers/infiniband/sw/siw/siw_qp_tx.c
@@ -313,7 +313,7 @@ static int siw_tx_ctrl(struct siw_iwarp_tx *c_tx, struct socket *s,
 }
 
 /*
- * 0copy TCP transmit interface: Use do_tcp_sendpages.
+ * 0copy TCP transmit interface: Use MSG_SPLICE_PAGES.
  *
  * Using sendpage to push page by page appears to be less efficient
  * than using sendmsg, even if data are copied.
@@ -324,20 +324,27 @@ static int siw_tx_ctrl(struct siw_iwarp_tx *c_tx, struct socket *s,
 static int siw_tcp_sendpages(struct socket *s, struct page **page, int offset,
 			     size_t size)
 {
+	struct bio_vec bvec;
+	struct msghdr msg = {
+		.msg_flags = (MSG_SPLICE_PAGES | MSG_MORE | MSG_DONTWAIT |
+			      MSG_SENDPAGE_NOTLAST),
+	};
 	struct sock *sk = s->sk;
-	int i = 0, rv = 0, sent = 0,
-	    flags = MSG_MORE | MSG_DONTWAIT | MSG_SENDPAGE_NOTLAST;
+	int i = 0, rv = 0, sent = 0;
 
 	while (size) {
 		size_t bytes = min_t(size_t, PAGE_SIZE - offset, size);
 
 		if (size + offset <= PAGE_SIZE)
-			flags = MSG_MORE | MSG_DONTWAIT;
+			msg.msg_flags = MSG_SPLICE_PAGES | MSG_MORE | MSG_DONTWAIT;
 
 		tcp_rate_check_app_limited(sk);
+		bvec_set_page(&bvec, page[i], bytes, offset);
+		iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, bytes);
+
try_page_again:
 		lock_sock(sk);
-		rv = do_tcp_sendpages(sk, page[i], offset, bytes, flags);
+		rv = tcp_sendmsg_locked(sk, &msg, bytes);
 		release_sock(sk);
 
 		if (rv > 0) {



* [RFC PATCH 09/28] tcp: Fold do_tcp_sendpages() into tcp_sendpage_locked()
  2023-03-16 15:25 [RFC PATCH 00/28] splice, net: Replace sendpage with sendmsg(MSG_SPLICE_PAGES) David Howells
                   ` (7 preceding siblings ...)
  2023-03-16 15:25 ` [RFC PATCH 08/28] siw: " David Howells
@ 2023-03-16 15:25 ` David Howells
  2023-03-16 15:26 ` [RFC PATCH 10/28] ip, udp: Support MSG_SPLICE_PAGES David Howells
                   ` (18 subsequent siblings)
  27 siblings, 0 replies; 81+ messages in thread
From: David Howells @ 2023-03-16 15:25 UTC (permalink / raw)
  To: Matthew Wilcox, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe,
	Jeff Layton, Christian Brauner, Linus Torvalds, netdev,
	linux-fsdevel, linux-kernel, linux-mm

Fold do_tcp_sendpages() into its last remaining caller,
tcp_sendpage_locked().

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Eric Dumazet <edumazet@google.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: netdev@vger.kernel.org
---
 include/net/tcp.h |  2 --
 net/ipv4/tcp.c    | 21 +++++++--------------
 2 files changed, 7 insertions(+), 16 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index db9f828e9d1e..844bc8e6a714 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -333,8 +333,6 @@ int tcp_sendpage(struct sock *sk, struct page *page, int offset, size_t size,
 		 int flags);
 int tcp_sendpage_locked(struct sock *sk, struct page *page, int offset,
 			size_t size, int flags);
-ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset,
-		 size_t size, int flags);
 int tcp_send_mss(struct sock *sk, int *size_goal, int flags);
 void tcp_push(struct sock *sk, int flags, int mss_now, int nonagle,
 	      int size_goal);
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 7c3acc5673e9..f1454e4497df 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -971,14 +971,19 @@ static int tcp_wmem_schedule(struct sock *sk, int copy)
 	return min(copy, sk->sk_forward_alloc);
 }
 
-ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset,
-			 size_t size, int flags)
+int tcp_sendpage_locked(struct sock *sk, struct page *page, int offset,
+			size_t size, int flags)
 {
 	struct bio_vec bvec;
 	struct msghdr msg = {
 		.msg_flags = flags | MSG_SPLICE_PAGES,
 	};
 
+	if (!(sk->sk_route_caps & NETIF_F_SG))
+		return sock_no_sendpage_locked(sk, page, offset, size, flags);
+
+	tcp_rate_check_app_limited(sk);  /* is sending application-limited? */
+
 	bvec_set_page(&bvec, page, size, offset);
 	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
 
@@ -987,18 +992,6 @@ ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset,
 
 	return tcp_sendmsg_locked(sk, &msg, size);
 }
-EXPORT_SYMBOL_GPL(do_tcp_sendpages);
-
-int tcp_sendpage_locked(struct sock *sk, struct page *page, int offset,
-			size_t size, int flags)
-{
-	if (!(sk->sk_route_caps & NETIF_F_SG))
-		return sock_no_sendpage_locked(sk, page, offset, size, flags);
-
-	tcp_rate_check_app_limited(sk);  /* is sending application-limited? */
-
-	return do_tcp_sendpages(sk, page, offset, size, flags);
-}
 EXPORT_SYMBOL_GPL(tcp_sendpage_locked);
 
 int tcp_sendpage(struct sock *sk, struct page *page, int offset,



* [RFC PATCH 10/28] ip, udp: Support MSG_SPLICE_PAGES
  2023-03-16 15:25 [RFC PATCH 00/28] splice, net: Replace sendpage with sendmsg(MSG_SPLICE_PAGES) David Howells
                   ` (8 preceding siblings ...)
  2023-03-16 15:25 ` [RFC PATCH 09/28] tcp: Fold do_tcp_sendpages() into tcp_sendpage_locked() David Howells
@ 2023-03-16 15:26 ` David Howells
  2023-03-16 15:26 ` [RFC PATCH 11/28] udp: Convert udp_sendpage() to use MSG_SPLICE_PAGES David Howells
                   ` (17 subsequent siblings)
  27 siblings, 0 replies; 81+ messages in thread
From: David Howells @ 2023-03-16 15:26 UTC (permalink / raw)
  To: Matthew Wilcox, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe,
	Jeff Layton, Christian Brauner, Linus Torvalds, netdev,
	linux-fsdevel, linux-kernel, linux-mm, Willem de Bruijn

Make IP/UDP sendmsg() support MSG_SPLICE_PAGES.  This causes pages to be
spliced from the source iterator if possible (the iterator must be
ITER_BVEC and the pages must be spliceable).

This allows ->sendpage() to be replaced by something that can handle
multiple multipage folios in a single transaction.
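
The preconditions checked up front in __ip_append_data() can be summarised
with the following userspace model.  struct splice_request and
check_splice_request() are hypothetical names used only for this sketch; the
three booleans stand in for iov_iter_is_bvec(), inet->hdrincl and the
NETIF_F_SG device feature respectively:

```c
#include <errno.h>
#include <stdbool.h>

/* Stand-in for the state the kernel inspects; not a real kernel type. */
struct splice_request {
	bool iter_is_bvec;	/* source iterator is ITER_BVEC */
	bool hdrincl;		/* socket supplies its own IP header */
	bool dev_has_sg;	/* output device supports scatter-gather */
};

/*
 * Hypothetical helper mirroring the order of the checks in the patch.
 * Returns 0 if splicing may proceed, or a negative errno.
 */
static int check_splice_request(const struct splice_request *req)
{
	if (!req->iter_is_bvec)
		return -EINVAL;
	if (req->hdrincl)
		return -EPERM;
	if (!req->dev_has_sg)
		return -EOPNOTSUPP;
	return 0;
}
```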

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: netdev@vger.kernel.org
---
 net/ipv4/ip_output.c | 89 ++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 86 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 4e4e308c3230..721d7e4343ed 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -977,7 +977,7 @@ static int __ip_append_data(struct sock *sk,
 	int err;
 	int offset = 0;
 	bool zc = false;
-	unsigned int maxfraglen, fragheaderlen, maxnonfragsize;
+	unsigned int maxfraglen, fragheaderlen, maxnonfragsize, xlength;
 	int csummode = CHECKSUM_NONE;
 	struct rtable *rt = (struct rtable *)cork->dst;
 	unsigned int wmem_alloc_delta = 0;
@@ -1017,6 +1017,7 @@ static int __ip_append_data(struct sock *sk,
 	    (!exthdrlen || (rt->dst.dev->features & NETIF_F_HW_ESP_TX_CSUM)))
 		csummode = CHECKSUM_PARTIAL;
 
+	xlength = length;
 	if ((flags & MSG_ZEROCOPY) && length) {
 		struct msghdr *msg = from;
 
@@ -1047,6 +1048,16 @@ static int __ip_append_data(struct sock *sk,
 				skb_zcopy_set(skb, uarg, &extra_uref);
 			}
 		}
+	} else if ((flags & MSG_SPLICE_PAGES) && length) {
+		struct msghdr *msg = from;
+
+		if (!iov_iter_is_bvec(&msg->msg_iter))
+			return -EINVAL;
+		if (inet->hdrincl)
+			return -EPERM;
+		if (!(rt->dst.dev->features & NETIF_F_SG))
+			return -EOPNOTSUPP;
+		xlength = transhdrlen; /* We need an empty buffer to attach stuff to */
 	}
 
 	cork->length += length;
@@ -1074,6 +1085,50 @@ static int __ip_append_data(struct sock *sk,
 			unsigned int alloclen, alloc_extra;
 			unsigned int pagedlen;
 			struct sk_buff *skb_prev;
+
+			if (unlikely(flags & MSG_SPLICE_PAGES)) {
+				skb_prev = skb;
+				fraggap = skb_prev->len - maxfraglen;
+
+				alloclen = fragheaderlen + hh_len + fraggap + 15;
+				skb = sock_wmalloc(sk, alloclen, 1, sk->sk_allocation);
+				if (unlikely(!skb)) {
+					err = -ENOBUFS;
+					goto error;
+				}
+
+				/*
+				 *	Fill in the control structures
+				 */
+				skb->ip_summed = CHECKSUM_NONE;
+				skb->csum = 0;
+				skb_reserve(skb, hh_len);
+
+				/*
+				 *	Find where to start putting bytes.
+				 */
+				skb_put(skb, fragheaderlen + fraggap);
+				skb_reset_network_header(skb);
+				skb->transport_header = (skb->network_header +
+							 fragheaderlen);
+				if (fraggap) {
+					skb->csum = skb_copy_and_csum_bits(
+						skb_prev, maxfraglen,
+						skb_transport_header(skb),
+						fraggap);
+					skb_prev->csum = csum_sub(skb_prev->csum,
+								  skb->csum);
+					pskb_trim_unique(skb_prev, maxfraglen);
+				}
+
+				/*
+				 * Put the packet on the pending queue.
+				 */
+				__skb_queue_tail(&sk->sk_write_queue, skb);
+				continue;
+			}
+			xlength = length;
+
 alloc_new_skb:
 			skb_prev = skb;
 			if (skb_prev)
@@ -1085,7 +1140,7 @@ static int __ip_append_data(struct sock *sk,
 			 * If remaining data exceeds the mtu,
 			 * we know we need more fragment(s).
 			 */
-			datalen = length + fraggap;
+			datalen = xlength + fraggap;
 			if (datalen > mtu - fragheaderlen)
 				datalen = maxfraglen - fragheaderlen;
 			fraglen = datalen + fragheaderlen;
@@ -1099,7 +1154,7 @@ static int __ip_append_data(struct sock *sk,
 			 * because we have no idea what fragment will be
 			 * the last.
 			 */
-			if (datalen == length + fraggap)
+			if (datalen == xlength + fraggap)
 				alloc_extra += rt->dst.trailer_len;
 
 			if ((flags & MSG_MORE) &&
@@ -1206,6 +1261,34 @@ static int __ip_append_data(struct sock *sk,
 				err = -EFAULT;
 				goto error;
 			}
+		} else if (flags & MSG_SPLICE_PAGES) {
+			struct msghdr *msg = from;
+			struct iov_iter *iter = &msg->msg_iter;
+			const struct bio_vec *bv = iter->bvec;
+
+			if (iov_iter_count(iter) <= 0) {
+				err = -EIO;
+				goto error;
+			}
+
+			copy = iov_iter_single_seg_count(&msg->msg_iter);
+
+			err = skb_append_pagefrags(skb, bv->bv_page,
+						   bv->bv_offset + iter->iov_offset,
+						   copy);
+			if (err < 0)
+				goto error;
+
+			if (skb->ip_summed == CHECKSUM_NONE) {
+				__wsum csum;
+				csum = csum_page(bv->bv_page,
+						 bv->bv_offset + iter->iov_offset, copy);
+				skb->csum = csum_block_add(skb->csum, csum, skb->len);
+			}
+
+			iov_iter_advance(iter, copy);
+			skb_len_add(skb, copy);
+			refcount_add(copy, &sk->sk_wmem_alloc);
 		} else if (!zc) {
 			int i = skb_shinfo(skb)->nr_frags;
 



* [RFC PATCH 11/28] udp: Convert udp_sendpage() to use MSG_SPLICE_PAGES
  2023-03-16 15:25 [RFC PATCH 00/28] splice, net: Replace sendpage with sendmsg(MSG_SPLICE_PAGES) David Howells
                   ` (9 preceding siblings ...)
  2023-03-16 15:26 ` [RFC PATCH 10/28] ip, udp: Support MSG_SPLICE_PAGES David Howells
@ 2023-03-16 15:26 ` David Howells
  2023-03-16 15:26 ` [RFC PATCH 12/28] af_unix: Support MSG_SPLICE_PAGES David Howells
                   ` (16 subsequent siblings)
  27 siblings, 0 replies; 81+ messages in thread
From: David Howells @ 2023-03-16 15:26 UTC (permalink / raw)
  To: Matthew Wilcox, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe,
	Jeff Layton, Christian Brauner, Linus Torvalds, netdev,
	linux-fsdevel, linux-kernel, linux-mm, Willem de Bruijn

Convert udp_sendpage() to use sendmsg() with MSG_SPLICE_PAGES rather than
directly splicing in the pages itself.

This allows ->sendpage() to be replaced by something that can handle
multiple multipage folios in a single transaction.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: netdev@vger.kernel.org
---
 net/ipv4/udp.c | 50 +++++++++-----------------------------------------
 1 file changed, 9 insertions(+), 41 deletions(-)

diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index c605d171eb2d..097feb92e215 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1332,52 +1332,20 @@ EXPORT_SYMBOL(udp_sendmsg);
 int udp_sendpage(struct sock *sk, struct page *page, int offset,
 		 size_t size, int flags)
 {
-	struct inet_sock *inet = inet_sk(sk);
-	struct udp_sock *up = udp_sk(sk);
+	struct bio_vec bvec;
+	struct msghdr msg = {
+		.msg_flags = flags | MSG_SPLICE_PAGES | MSG_MORE
+	};
 	int ret;
 
-	if (flags & MSG_SENDPAGE_NOTLAST)
-		flags |= MSG_MORE;
+	bvec_set_page(&bvec, page, size, offset);
+	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
 
-	if (!up->pending) {
-		struct msghdr msg = {	.msg_flags = flags|MSG_MORE };
-
-		/* Call udp_sendmsg to specify destination address which
-		 * sendpage interface can't pass.
-		 * This will succeed only when the socket is connected.
-		 */
-		ret = udp_sendmsg(sk, &msg, 0);
-		if (ret < 0)
-			return ret;
-	}
+	if (flags & MSG_SENDPAGE_NOTLAST)
+		msg.msg_flags |= MSG_MORE;
 
 	lock_sock(sk);
-
-	if (unlikely(!up->pending)) {
-		release_sock(sk);
-
-		net_dbg_ratelimited("cork failed\n");
-		return -EINVAL;
-	}
-
-	ret = ip_append_page(sk, &inet->cork.fl.u.ip4,
-			     page, offset, size, flags);
-	if (ret == -EOPNOTSUPP) {
-		release_sock(sk);
-		return sock_no_sendpage(sk->sk_socket, page, offset,
-					size, flags);
-	}
-	if (ret < 0) {
-		udp_flush_pending_frames(sk);
-		goto out;
-	}
-
-	up->len += size;
-	if (!(READ_ONCE(up->corkflag) || (flags&MSG_MORE)))
-		ret = udp_push_pending_frames(sk);
-	if (!ret)
-		ret = size;
-out:
+	ret = udp_sendmsg(sk, &msg, size);
 	release_sock(sk);
 	return ret;
 }



* [RFC PATCH 12/28] af_unix: Support MSG_SPLICE_PAGES
  2023-03-16 15:25 [RFC PATCH 00/28] splice, net: Replace sendpage with sendmsg(MSG_SPLICE_PAGES) David Howells
                   ` (10 preceding siblings ...)
  2023-03-16 15:26 ` [RFC PATCH 11/28] udp: Convert udp_sendpage() to use MSG_SPLICE_PAGES David Howells
@ 2023-03-16 15:26 ` David Howells
  2023-03-16 15:26 ` [RFC PATCH 13/28] crypto: af_alg: Indent the loop in af_alg_sendmsg() David Howells
                   ` (15 subsequent siblings)
  27 siblings, 0 replies; 81+ messages in thread
From: David Howells @ 2023-03-16 15:26 UTC (permalink / raw)
  To: Matthew Wilcox, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe,
	Jeff Layton, Christian Brauner, Linus Torvalds, netdev,
	linux-fsdevel, linux-kernel, linux-mm

Make AF_UNIX sendmsg() support MSG_SPLICE_PAGES.  If the flag is given and
the source iterator is ITER_BVEC, pages are spliced in from the iterator;
otherwise the data is copied in.

This allows ->sendpage() to be replaced by something that can handle
multiple multipage folios in a single transaction.
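
The segment walk that unix_extract_bvec_to_skb() performs can be sketched in
userspace as follows.  This is a model under assumptions, not the kernel
code: struct model_bvec drops the page pointer, struct model_frag records
what would be handed to skb_append_pagefrags(), and extract_segments() is a
made-up name.  Each fragment is taken from segment i, after skipping "start"
bytes already consumed from the iterator:

```c
#include <stddef.h>

/* Stand-in for struct bio_vec without the page pointer. */
struct model_bvec {
	size_t offset;	/* like bv_offset */
	size_t len;	/* like bv_len */
};

/* What would be attached to the socket buffer as one page fragment. */
struct model_frag {
	size_t seg;	/* index of the source segment */
	size_t offset;	/* offset within that segment's page */
	size_t len;	/* number of bytes taken */
};

/*
 * Hypothetical helper: collect up to maxsize bytes from bv[], skipping
 * the first start bytes.  Returns the byte count collected and stores
 * the number of fragments in *nr_out.
 */
static size_t extract_segments(const struct model_bvec *bv, size_t nr_segs,
			       size_t start, size_t maxsize,
			       struct model_frag *out, size_t max_frags,
			       size_t *nr_out)
{
	size_t total = 0, n = 0, i;

	for (i = 0; i < nr_segs && maxsize > 0 && n < max_frags; i++) {
		size_t len = bv[i].len;

		/* Segment already fully consumed: skip it. */
		if (start >= len) {
			start -= len;
			continue;
		}

		len -= start;
		if (len > maxsize)
			len = maxsize;

		out[n].seg = i;
		out[n].offset = bv[i].offset + start;
		out[n].len = len;
		n++;

		total += len;
		maxsize -= len;
		start = 0;	/* only the first used segment is partial */
	}

	*nr_out = n;
	return total;
}
```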

Signed-off-by: David Howells <dhowells@redhat.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: netdev@vger.kernel.org
---
 net/unix/af_unix.c | 84 +++++++++++++++++++++++++++++++++++++---------
 1 file changed, 68 insertions(+), 16 deletions(-)

diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 347122c3575e..6f3454db9c53 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -2151,6 +2151,44 @@ static int queue_oob(struct socket *sock, struct msghdr *msg, struct sock *other
 }
 #endif
 
+/*
+ * Extract pages from a BVEC-type iterator and add them to the socket buffer.
+ */
+static ssize_t unix_extract_bvec_to_skb(struct sk_buff *skb,
+					struct iov_iter *iter, ssize_t maxsize)
+{
+	const struct bio_vec *bv = iter->bvec;
+	unsigned long start = iter->iov_offset;
+	unsigned int i;
+	ssize_t ret = 0;
+
+	for (i = 0; i < iter->nr_segs; i++) {
+		size_t off, len;
+
+		len = bv[i].bv_len;
+		if (start >= len) {
+			start -= len;
+			continue;
+		}
+
+		len = min_t(size_t, maxsize, len - start);
+		off = bv[i].bv_offset + start;
+
+		if (skb_append_pagefrags(skb, bv[i].bv_page, off, len) < 0)
+			break;
+
+		ret += len;
+		maxsize -= len;
+		if (maxsize <= 0)
+			break;
+		start = 0;
+	}
+
+	if (ret > 0)
+		iov_iter_advance(iter, ret);
+	return ret;
+}
+
 static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg,
 			       size_t len)
 {
@@ -2194,19 +2232,25 @@ static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg,
 	while (sent < len) {
 		size = len - sent;
 
-		/* Keep two messages in the pipe so it schedules better */
-		size = min_t(int, size, (sk->sk_sndbuf >> 1) - 64);
+		if (unlikely(msg->msg_flags & MSG_SPLICE_PAGES)) {
+			skb = sock_alloc_send_pskb(sk, 0, 0,
+						   msg->msg_flags & MSG_DONTWAIT,
+						   &err, 0);
+		} else {
+			/* Keep two messages in the pipe so it schedules better */
+			size = min_t(int, size, (sk->sk_sndbuf >> 1) - 64);
 
-		/* allow fallback to order-0 allocations */
-		size = min_t(int, size, SKB_MAX_HEAD(0) + UNIX_SKB_FRAGS_SZ);
+			/* allow fallback to order-0 allocations */
+			size = min_t(int, size, SKB_MAX_HEAD(0) + UNIX_SKB_FRAGS_SZ);
 
-		data_len = max_t(int, 0, size - SKB_MAX_HEAD(0));
+			data_len = max_t(int, 0, size - SKB_MAX_HEAD(0));
 
-		data_len = min_t(size_t, size, PAGE_ALIGN(data_len));
+			data_len = min_t(size_t, size, PAGE_ALIGN(data_len));
 
-		skb = sock_alloc_send_pskb(sk, size - data_len, data_len,
-					   msg->msg_flags & MSG_DONTWAIT, &err,
-					   get_order(UNIX_SKB_FRAGS_SZ));
+			skb = sock_alloc_send_pskb(sk, size - data_len, data_len,
+						   msg->msg_flags & MSG_DONTWAIT, &err,
+						   get_order(UNIX_SKB_FRAGS_SZ));
+		}
 		if (!skb)
 			goto out_err;
 
@@ -2218,13 +2262,21 @@ static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg,
 		}
 		fds_sent = true;
 
-		skb_put(skb, size - data_len);
-		skb->data_len = data_len;
-		skb->len = size;
-		err = skb_copy_datagram_from_iter(skb, 0, &msg->msg_iter, size);
-		if (err) {
-			kfree_skb(skb);
-			goto out_err;
+		if (unlikely(msg->msg_flags & MSG_SPLICE_PAGES)) {
+			size = unix_extract_bvec_to_skb(skb, &msg->msg_iter, size);
+			skb->data_len += size;
+			skb->len += size;
+			skb->truesize += size;
+			refcount_add(size, &sk->sk_wmem_alloc);
+		} else {
+			skb_put(skb, size - data_len);
+			skb->data_len = data_len;
+			skb->len = size;
+			err = skb_copy_datagram_from_iter(skb, 0, &msg->msg_iter, size);
+			if (err) {
+				kfree_skb(skb);
+				goto out_err;
+			}
 		}
 
 		unix_state_lock(other);



* [RFC PATCH 13/28] crypto: af_alg: Indent the loop in af_alg_sendmsg()
  2023-03-16 15:25 [RFC PATCH 00/28] splice, net: Replace sendpage with sendmsg(MSG_SPLICE_PAGES) David Howells
                   ` (11 preceding siblings ...)
  2023-03-16 15:26 ` [RFC PATCH 12/28] af_unix: Support MSG_SPLICE_PAGES David Howells
@ 2023-03-16 15:26 ` David Howells
  2023-03-16 15:26 ` [RFC PATCH 14/28] crypto: af_alg: Support MSG_SPLICE_PAGES David Howells
                   ` (14 subsequent siblings)
  27 siblings, 0 replies; 81+ messages in thread
From: David Howells @ 2023-03-16 15:26 UTC (permalink / raw)
  To: Matthew Wilcox, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe,
	Jeff Layton, Christian Brauner, Linus Torvalds, netdev,
	linux-fsdevel, linux-kernel, linux-mm, Herbert Xu, linux-crypto

Put the loop in af_alg_sendmsg() into an if-statement so that it is indented.
This makes the next patch easier to review, as that patch adds a second branch
to the if-statement to handle MSG_SPLICE_PAGES.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Herbert Xu <herbert@gondor.apana.org.au>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: linux-crypto@vger.kernel.org
cc: netdev@vger.kernel.org
---
 crypto/af_alg.c | 50 +++++++++++++++++++++++++------------------------
 1 file changed, 26 insertions(+), 24 deletions(-)

diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index 5f7252a5b7b4..feb989b32606 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -1060,35 +1060,37 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
 		if (sgl->cur)
 			sg_unmark_end(sg + sgl->cur - 1);
 
-		do {
-			struct page *pg;
-			unsigned int i = sgl->cur;
+		if (1 /* TODO check MSG_SPLICE_PAGES */) {
+			do {
+				struct page *pg;
+				unsigned int i = sgl->cur;
 
-			plen = min_t(size_t, len, PAGE_SIZE);
+				plen = min_t(size_t, len, PAGE_SIZE);
 
-			pg = alloc_page(GFP_KERNEL);
-			if (!pg) {
-				err = -ENOMEM;
-				goto unlock;
-			}
+				pg = alloc_page(GFP_KERNEL);
+				if (!pg) {
+					err = -ENOMEM;
+					goto unlock;
+				}
 
-			sg_assign_page(sg + i, pg);
+				sg_assign_page(sg + i, pg);
 
-			err = memcpy_from_msg(page_address(sg_page(sg + i)),
-					      msg, plen);
-			if (err) {
-				__free_page(sg_page(sg + i));
-				sg_assign_page(sg + i, NULL);
-				goto unlock;
-			}
+				err = memcpy_from_msg(page_address(sg_page(sg + i)),
+						      msg, plen);
+				if (err) {
+					__free_page(sg_page(sg + i));
+					sg_assign_page(sg + i, NULL);
+					goto unlock;
+				}
 
-			sg[i].length = plen;
-			len -= plen;
-			ctx->used += plen;
-			copied += plen;
-			size -= plen;
-			sgl->cur++;
-		} while (len && sgl->cur < MAX_SGL_ENTS);
+				sg[i].length = plen;
+				len -= plen;
+				ctx->used += plen;
+				copied += plen;
+				size -= plen;
+				sgl->cur++;
+			} while (len && sgl->cur < MAX_SGL_ENTS);
+		}
 
 		if (!size)
 			sg_mark_end(sg + sgl->cur - 1);



* [RFC PATCH 14/28] crypto: af_alg: Support MSG_SPLICE_PAGES
  2023-03-16 15:25 [RFC PATCH 00/28] splice, net: Replace sendpage with sendmsg(MSG_SPLICE_PAGES) David Howells
                   ` (12 preceding siblings ...)
  2023-03-16 15:26 ` [RFC PATCH 13/28] crypto: af_alg: Indent the loop in af_alg_sendmsg() David Howells
@ 2023-03-16 15:26 ` David Howells
  2023-03-16 15:26 ` [RFC PATCH 15/28] crypto: af_alg: Convert af_alg_sendpage() to use MSG_SPLICE_PAGES David Howells
                   ` (13 subsequent siblings)
  27 siblings, 0 replies; 81+ messages in thread
From: David Howells @ 2023-03-16 15:26 UTC (permalink / raw)
  To: Matthew Wilcox, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe,
	Jeff Layton, Christian Brauner, Linus Torvalds, netdev,
	linux-fsdevel, linux-kernel, linux-mm, Herbert Xu, linux-crypto

Make AF_ALG sendmsg() support MSG_SPLICE_PAGES.  This causes pages to be
spliced from the source iterator if possible (the iterator must be
ITER_BVEC and the pages must be spliceable).

This allows ->sendpage() to be replaced by something that can handle
multiple multipage folios in a single transaction.

[!] Note that this makes use of netfs_extract_iter_to_sg() from netfslib.
    This probably needs moving to core code somewhere.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Herbert Xu <herbert@gondor.apana.org.au>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: linux-crypto@vger.kernel.org
cc: netdev@vger.kernel.org
---
 crypto/Kconfig          |  1 +
 crypto/af_alg.c         | 29 +++++++++++++++++++++++++++--
 crypto/algif_aead.c     | 22 +++++++++++-----------
 crypto/algif_skcipher.c |  8 ++++----
 4 files changed, 43 insertions(+), 17 deletions(-)

diff --git a/crypto/Kconfig b/crypto/Kconfig
index 9c86f7045157..8c04ecbb4395 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -1297,6 +1297,7 @@ menu "Userspace interface"
 
 config CRYPTO_USER_API
 	tristate
+	select NETFS_SUPPORT # for netfs_extract_iter_to_sg()
 
 config CRYPTO_USER_API_HASH
 	tristate "Hash algorithms"
diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index feb989b32606..80ab4f6e018c 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -22,6 +22,7 @@
 #include <linux/sched/signal.h>
 #include <linux/security.h>
 #include <linux/string.h>
+#include <linux/netfs.h>
 #include <keys/user-type.h>
 #include <keys/trusted-type.h>
 #include <keys/encrypted-type.h>
@@ -970,6 +971,10 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
 	bool init = false;
 	int err = 0;
 
+	if ((msg->msg_flags & MSG_SPLICE_PAGES) &&
+	    !iov_iter_is_bvec(&msg->msg_iter))
+		return -EINVAL;
+
 	if (msg->msg_controllen) {
 		err = af_alg_cmsg_send(msg, &con);
 		if (err)
@@ -1015,7 +1020,7 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
 	while (size) {
 		struct scatterlist *sg;
 		size_t len = size;
-		size_t plen;
+		ssize_t plen;
 
 		/* use the existing memory in an allocated page */
 		if (ctx->merge) {
@@ -1060,7 +1065,27 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
 		if (sgl->cur)
 			sg_unmark_end(sg + sgl->cur - 1);
 
-		if (1 /* TODO check MSG_SPLICE_PAGES */) {
+		if (msg->msg_flags & MSG_SPLICE_PAGES) {
+			struct sg_table sgtable = {
+				.sgl		= sg,
+				.nents		= sgl->cur,
+				.orig_nents	= sgl->cur,
+			};
+
+			plen = netfs_extract_iter_to_sg(&msg->msg_iter, len,
+							&sgtable, MAX_SGL_ENTS - sgl->cur, 0);
+			if (plen < 0) {
+				err = plen;
+				goto unlock;
+			}
+
+			for (; sgl->cur < sgtable.nents; sgl->cur++)
+				get_page(sg_page(&sg[sgl->cur]));
+			len -= plen;
+			ctx->used += plen;
+			copied += plen;
+			size -= plen;
+		} else {
 			do {
 				struct page *pg;
 				unsigned int i = sgl->cur;
diff --git a/crypto/algif_aead.c b/crypto/algif_aead.c
index 42493b4d8ce4..279eb17a1dfc 100644
--- a/crypto/algif_aead.c
+++ b/crypto/algif_aead.c
@@ -9,8 +9,8 @@
  * The following concept of the memory management is used:
  *
  * The kernel maintains two SGLs, the TX SGL and the RX SGL. The TX SGL is
- * filled by user space with the data submitted via sendpage/sendmsg. Filling
- * up the TX SGL does not cause a crypto operation -- the data will only be
+ * filled by user space with the data submitted via sendmsg. Filling up
+ * the TX SGL does not cause a crypto operation -- the data will only be
  * tracked by the kernel. Upon receipt of one recvmsg call, the caller must
  * provide a buffer which is tracked with the RX SGL.
  *
@@ -113,19 +113,19 @@ static int _aead_recvmsg(struct socket *sock, struct msghdr *msg,
 	}
 
 	/*
-	 * Data length provided by caller via sendmsg/sendpage that has not
-	 * yet been processed.
+	 * Data length provided by caller via sendmsg that has not yet been
+	 * processed.
 	 */
 	used = ctx->used;
 
 	/*
-	 * Make sure sufficient data is present -- note, the same check is
-	 * also present in sendmsg/sendpage. The checks in sendpage/sendmsg
-	 * shall provide an information to the data sender that something is
-	 * wrong, but they are irrelevant to maintain the kernel integrity.
-	 * We need this check here too in case user space decides to not honor
-	 * the error message in sendmsg/sendpage and still call recvmsg. This
-	 * check here protects the kernel integrity.
+	 * Make sure sufficient data is present -- note, the same check is also
+	 * present in sendmsg. The checks in sendmsg shall provide an
+	 * information to the data sender that something is wrong, but they are
+	 * irrelevant to maintain the kernel integrity.  We need this check
+	 * here too in case user space decides to not honor the error message
+	 * in sendmsg and still call recvmsg. This check here protects the
+	 * kernel integrity.
 	 */
 	if (!aead_sufficient_data(sk))
 		return -EINVAL;
diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c
index ee8890ee8f33..021f9ce7e87c 100644
--- a/crypto/algif_skcipher.c
+++ b/crypto/algif_skcipher.c
@@ -9,10 +9,10 @@
  * The following concept of the memory management is used:
  *
  * The kernel maintains two SGLs, the TX SGL and the RX SGL. The TX SGL is
- * filled by user space with the data submitted via sendpage/sendmsg. Filling
- * up the TX SGL does not cause a crypto operation -- the data will only be
- * tracked by the kernel. Upon receipt of one recvmsg call, the caller must
- * provide a buffer which is tracked with the RX SGL.
+ * filled by user space with the data submitted via sendmsg. Filling up the TX
+ * SGL does not cause a crypto operation -- the data will only be tracked by
+ * the kernel. Upon receipt of one recvmsg call, the caller must provide a
+ * buffer which is tracked with the RX SGL.
  *
  * During the processing of the recvmsg operation, the cipher request is
  * allocated and prepared. As part of the recvmsg operation, the processed



* [RFC PATCH 15/28] crypto: af_alg: Convert af_alg_sendpage() to use MSG_SPLICE_PAGES
  2023-03-16 15:25 [RFC PATCH 00/28] splice, net: Replace sendpage with sendmsg(MSG_SPLICE_PAGES) David Howells
                   ` (13 preceding siblings ...)
  2023-03-16 15:26 ` [RFC PATCH 14/28] crypto: af_alg: Support MSG_SPLICE_PAGES David Howells
@ 2023-03-16 15:26 ` David Howells
  2023-03-16 15:26 ` [RFC PATCH 16/28] splice, net: Use sendmsg(MSG_SPLICE_PAGES) rather than ->sendpage() David Howells
                   ` (12 subsequent siblings)
  27 siblings, 0 replies; 81+ messages in thread
From: David Howells @ 2023-03-16 15:26 UTC (permalink / raw)
  To: Matthew Wilcox, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe,
	Jeff Layton, Christian Brauner, Linus Torvalds, netdev,
	linux-fsdevel, linux-kernel, linux-mm, Herbert Xu, linux-crypto

Convert af_alg_sendpage() to use sendmsg() with MSG_SPLICE_PAGES rather
than directly splicing in the pages itself.

This allows ->sendpage() to be replaced by something that can handle
multiple multipage folios in a single transaction.

[!] Note that this makes use of netfs_extract_iter_to_sg() from netfslib.
    This probably needs moving to core code somewhere.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Herbert Xu <herbert@gondor.apana.org.au>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: linux-crypto@vger.kernel.org
cc: netdev@vger.kernel.org
---
 crypto/af_alg.c | 53 +++++++++----------------------------------------
 1 file changed, 9 insertions(+), 44 deletions(-)

diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index 80ab4f6e018c..0e77fce60876 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -1148,53 +1148,18 @@ EXPORT_SYMBOL_GPL(af_alg_sendmsg);
 ssize_t af_alg_sendpage(struct socket *sock, struct page *page,
 			int offset, size_t size, int flags)
 {
-	struct sock *sk = sock->sk;
-	struct alg_sock *ask = alg_sk(sk);
-	struct af_alg_ctx *ctx = ask->private;
-	struct af_alg_tsgl *sgl;
-	int err = -EINVAL;
-
-	if (flags & MSG_SENDPAGE_NOTLAST)
-		flags |= MSG_MORE;
-
-	lock_sock(sk);
-	if (!ctx->more && ctx->used)
-		goto unlock;
-
-	if (!size)
-		goto done;
-
-	if (!af_alg_writable(sk)) {
-		err = af_alg_wait_for_wmem(sk, flags);
-		if (err)
-			goto unlock;
-	}
-
-	err = af_alg_alloc_tsgl(sk);
-	if (err)
-		goto unlock;
-
-	ctx->merge = 0;
-	sgl = list_entry(ctx->tsgl_list.prev, struct af_alg_tsgl, list);
-
-	if (sgl->cur)
-		sg_unmark_end(sgl->sg + sgl->cur - 1);
+	struct bio_vec bvec;
+	struct msghdr msg = {
+		.msg_flags = flags | MSG_SPLICE_PAGES,
+	};
 
-	sg_mark_end(sgl->sg + sgl->cur);
+	bvec_set_page(&bvec, page, size, offset);
+	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
 
-	get_page(page);
-	sg_set_page(sgl->sg + sgl->cur, page, size, offset);
-	sgl->cur++;
-	ctx->used += size;
-
-done:
-	ctx->more = flags & MSG_MORE;
-
-unlock:
-	af_alg_data_wakeup(sk);
-	release_sock(sk);
+	if (flags & MSG_SENDPAGE_NOTLAST)
+		msg.msg_flags |= MSG_MORE;
 
-	return err ?: size;
+	return sock_sendmsg(sock, &msg);
 }
 EXPORT_SYMBOL_GPL(af_alg_sendpage);
 



* [RFC PATCH 16/28] splice, net: Use sendmsg(MSG_SPLICE_PAGES) rather than ->sendpage()
  2023-03-16 15:25 [RFC PATCH 00/28] splice, net: Replace sendpage with sendmsg(MSG_SPLICE_PAGES) David Howells
                   ` (14 preceding siblings ...)
  2023-03-16 15:26 ` [RFC PATCH 15/28] crypto: af_alg: Convert af_alg_sendpage() to use MSG_SPLICE_PAGES David Howells
@ 2023-03-16 15:26 ` David Howells
  2023-03-16 15:26 ` [RFC PATCH 17/28] Remove file->f_op->sendpage David Howells
                   ` (11 subsequent siblings)
  27 siblings, 0 replies; 81+ messages in thread
From: David Howells @ 2023-03-16 15:26 UTC (permalink / raw)
  To: Matthew Wilcox, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe,
	Jeff Layton, Christian Brauner, Linus Torvalds, netdev,
	linux-fsdevel, linux-kernel, linux-mm

Use sendmsg(MSG_SPLICE_PAGES) rather than ->sendpage() to splice data from
a pipe to a socket.  This paves the way for passing in multiple pages at
once from a pipe and the handling of multipage folios.
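
For anyone wanting to exercise this path from userspace, something like the
following should do as a minimal sketch.  It pushes a few bytes through a pipe
into one end of an AF_UNIX stream socketpair with splice(2) and reads them
back from the other end.  The helper name splice_pipe_to_socket() is made up
for illustration and is not part of this series; SPLICE_F_MORE is the flag
that pipe_to_sendmsg() turns into MSG_MORE.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical test helper (not from the patch): splice 'len' bytes from a
 * pipe into a socket and check that they come out of the peer unchanged.
 * Returns 0 on success and -1 on any failure.
 */
int splice_pipe_to_socket(const char *data, size_t len, unsigned int flags)
{
	char buf[256];
	int pfd[2] = { -1, -1 }, sv[2] = { -1, -1 };
	size_t done = 0;
	int ret = -1;

	/* Keep the payload small so that no write or splice can block. */
	assert(len > 0 && len <= sizeof(buf));

	if (pipe(pfd) < 0)
		return -1;
	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		goto out;

	/* Load the pipe with the data to be spliced. */
	if (write(pfd[1], data, len) != (ssize_t)len)
		goto out;

	/* Pipe -> socket: this goes through the socket's ->splice_write(). */
	while (done < len) {
		ssize_t n = splice(pfd[0], NULL, sv[0], NULL, len - done, flags);

		if (n <= 0)
			goto out;
		done += n;
	}

	/* Read the data back from the other end of the socketpair. */
	done = 0;
	while (done < len) {
		ssize_t n = read(sv[1], buf + done, len - done);

		if (n <= 0)
			goto out;
		done += n;
	}

	ret = memcmp(buf, data, len) == 0 ? 0 : -1;
out:
	if (sv[0] >= 0) {
		close(sv[0]);
		close(sv[1]);
	}
	close(pfd[0]);
	close(pfd[1]);
	return ret;
}
```

Calling it once with flags of 0 and once with SPLICE_F_MORE covers both
settings of MSG_MORE; an AF_UNIX stream socket is assumed here only because
it needs no network setup.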

Signed-off-by: David Howells <dhowells@redhat.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: netdev@vger.kernel.org
---
 fs/splice.c            | 42 +++++++++++++++++++++++-------------------
 include/linux/fs.h     |  2 --
 include/linux/splice.h |  2 ++
 net/socket.c           | 26 ++------------------------
 4 files changed, 27 insertions(+), 45 deletions(-)

diff --git a/fs/splice.c b/fs/splice.c
index f46dd1fb367b..23ead122d631 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -32,6 +32,7 @@
 #include <linux/uio.h>
 #include <linux/security.h>
 #include <linux/gfp.h>
+#include <linux/net.h>
 #include <linux/socket.h>
 #include <linux/sched/signal.h>
 
@@ -410,29 +411,32 @@ const struct pipe_buf_operations nosteal_pipe_buf_ops = {
 };
 EXPORT_SYMBOL(nosteal_pipe_buf_ops);
 
+#ifdef CONFIG_NET
 /*
  * Send 'sd->len' bytes to socket from 'sd->file' at position 'sd->pos'
  * using sendpage(). Return the number of bytes sent.
  */
-static int pipe_to_sendpage(struct pipe_inode_info *pipe,
-			    struct pipe_buffer *buf, struct splice_desc *sd)
+static int pipe_to_sendmsg(struct pipe_inode_info *pipe,
+			   struct pipe_buffer *buf, struct splice_desc *sd)
 {
-	struct file *file = sd->u.file;
-	loff_t pos = sd->pos;
-	int more;
-
-	if (!likely(file->f_op->sendpage))
-		return -EINVAL;
+	struct socket *sock = sock_from_file(sd->u.file);
+	struct bio_vec bvec;
+	struct msghdr msg = {
+		.msg_flags = MSG_SPLICE_PAGES,
+	};
 
-	more = (sd->flags & SPLICE_F_MORE) ? MSG_MORE : 0;
+	if (sd->flags & SPLICE_F_MORE)
+		msg.msg_flags |= MSG_MORE;
 
 	if (sd->len < sd->total_len &&
 	    pipe_occupancy(pipe->head, pipe->tail) > 1)
-		more |= MSG_SENDPAGE_NOTLAST;
+		msg.msg_flags |= MSG_MORE;
 
-	return file->f_op->sendpage(file, buf->page, buf->offset,
-				    sd->len, &pos, more);
+	bvec_set_page(&bvec, buf->page, sd->len, buf->offset);
+	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, sd->len);
+	return sock_sendmsg(sock, &msg);
 }
+#endif
 
 static void wakeup_pipe_writers(struct pipe_inode_info *pipe)
 {
@@ -614,7 +618,7 @@ static void splice_from_pipe_end(struct pipe_inode_info *pipe, struct splice_des
  * Description:
  *    This function does little more than loop over the pipe and call
  *    @actor to do the actual moving of a single struct pipe_buffer to
- *    the desired destination. See pipe_to_file, pipe_to_sendpage, or
+ *    the desired destination. See pipe_to_file, pipe_to_sendmsg, or
  *    pipe_to_user.
  *
  */
@@ -795,8 +799,9 @@ iter_file_splice_write(struct pipe_inode_info *pipe, struct file *out,
 
 EXPORT_SYMBOL(iter_file_splice_write);
 
+#ifdef CONFIG_NET
 /**
- * generic_splice_sendpage - splice data from a pipe to a socket
+ * splice_to_socket - splice data from a pipe to a socket
  * @pipe:	pipe to splice from
  * @out:	socket to write to
  * @ppos:	position in @out
@@ -808,13 +813,12 @@ EXPORT_SYMBOL(iter_file_splice_write);
  *    is involved.
  *
  */
-ssize_t generic_splice_sendpage(struct pipe_inode_info *pipe, struct file *out,
-				loff_t *ppos, size_t len, unsigned int flags)
+ssize_t splice_to_socket(struct pipe_inode_info *pipe, struct file *out,
+			 loff_t *ppos, size_t len, unsigned int flags)
 {
-	return splice_from_pipe(pipe, out, ppos, len, flags, pipe_to_sendpage);
+	return splice_from_pipe(pipe, out, ppos, len, flags, pipe_to_sendmsg);
 }
-
-EXPORT_SYMBOL(generic_splice_sendpage);
+#endif
 
 static int warn_unsupported(struct file *file, const char *op)
 {
diff --git a/include/linux/fs.h b/include/linux/fs.h
index c85916e9f7db..f3ccc243851e 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2740,8 +2740,6 @@ extern ssize_t generic_file_splice_read(struct file *, loff_t *,
 		struct pipe_inode_info *, size_t, unsigned int);
 extern ssize_t iter_file_splice_write(struct pipe_inode_info *,
 		struct file *, loff_t *, size_t, unsigned int);
-extern ssize_t generic_splice_sendpage(struct pipe_inode_info *pipe,
-		struct file *out, loff_t *, size_t len, unsigned int flags);
 extern long do_splice_direct(struct file *in, loff_t *ppos, struct file *out,
 		loff_t *opos, size_t len, unsigned int flags);
 
diff --git a/include/linux/splice.h b/include/linux/splice.h
index 8f052c3dae95..e6153feda86c 100644
--- a/include/linux/splice.h
+++ b/include/linux/splice.h
@@ -87,6 +87,8 @@ extern long do_splice(struct file *in, loff_t *off_in,
 
 extern long do_tee(struct file *in, struct file *out, size_t len,
 		   unsigned int flags);
+extern ssize_t splice_to_socket(struct pipe_inode_info *pipe, struct file *out,
+				loff_t *ppos, size_t len, unsigned int flags);
 
 /*
  * for dynamic pipe sizing
diff --git a/net/socket.c b/net/socket.c
index 6bae8ce7059e..1b48a976b8cc 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -57,6 +57,7 @@
 #include <linux/mm.h>
 #include <linux/socket.h>
 #include <linux/file.h>
+#include <linux/splice.h>
 #include <linux/net.h>
 #include <linux/interrupt.h>
 #include <linux/thread_info.h>
@@ -126,8 +127,6 @@ static long compat_sock_ioctl(struct file *file,
 			      unsigned int cmd, unsigned long arg);
 #endif
 static int sock_fasync(int fd, struct file *filp, int on);
-static ssize_t sock_sendpage(struct file *file, struct page *page,
-			     int offset, size_t size, loff_t *ppos, int more);
 static ssize_t sock_splice_read(struct file *file, loff_t *ppos,
 				struct pipe_inode_info *pipe, size_t len,
 				unsigned int flags);
@@ -162,8 +161,7 @@ static const struct file_operations socket_file_ops = {
 	.mmap =		sock_mmap,
 	.release =	sock_close,
 	.fasync =	sock_fasync,
-	.sendpage =	sock_sendpage,
-	.splice_write = generic_splice_sendpage,
+	.splice_write = splice_to_socket,
 	.splice_read =	sock_splice_read,
 	.show_fdinfo =	sock_show_fdinfo,
 };
@@ -1062,26 +1060,6 @@ int kernel_recvmsg(struct socket *sock, struct msghdr *msg,
 }
 EXPORT_SYMBOL(kernel_recvmsg);
 
-static ssize_t sock_sendpage(struct file *file, struct page *page,
-			     int offset, size_t size, loff_t *ppos, int more)
-{
-	struct socket *sock;
-	int flags;
-	int ret;
-
-	sock = file->private_data;
-
-	flags = (file->f_flags & O_NONBLOCK) ? MSG_DONTWAIT : 0;
-	/* more is a combination of MSG_MORE and MSG_SENDPAGE_NOTLAST */
-	flags |= more;
-
-	ret = kernel_sendpage(sock, page, offset, size, flags);
-
-	if (trace_sock_send_length_enabled())
-		call_trace_sock_send_length(sock->sk, ret, 0);
-	return ret;
-}
-
 static ssize_t sock_splice_read(struct file *file, loff_t *ppos,
 				struct pipe_inode_info *pipe, size_t len,
 				unsigned int flags)



* [RFC PATCH 17/28] Remove file->f_op->sendpage
  2023-03-16 15:25 [RFC PATCH 00/28] splice, net: Replace sendpage with sendmsg(MSG_SPLICE_PAGES) David Howells
                   ` (15 preceding siblings ...)
  2023-03-16 15:26 ` [RFC PATCH 16/28] splice, net: Use sendmsg(MSG_SPLICE_PAGES) rather than ->sendpage() David Howells
@ 2023-03-16 15:26 ` David Howells
  2023-03-16 15:26 ` [RFC PATCH 18/28] siw: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage to transmit David Howells
                   ` (10 subsequent siblings)
  27 siblings, 0 replies; 81+ messages in thread
From: David Howells @ 2023-03-16 15:26 UTC (permalink / raw)
  To: Matthew Wilcox, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe,
	Jeff Layton, Christian Brauner, Linus Torvalds, netdev,
	linux-fsdevel, linux-kernel, linux-mm

Remove file->f_op->sendpage as splicing to a socket now calls sendmsg
rather than sendpage.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: netdev@vger.kernel.org
---
 include/linux/fs.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index f3ccc243851e..a9f1b2543d2c 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1773,7 +1773,6 @@ struct file_operations {
 	int (*fsync) (struct file *, loff_t, loff_t, int datasync);
 	int (*fasync) (int, struct file *, int);
 	int (*lock) (struct file *, int, struct file_lock *);
-	ssize_t (*sendpage) (struct file *, struct page *, int, size_t, loff_t *, int);
 	unsigned long (*get_unmapped_area)(struct file *, unsigned long, unsigned long, unsigned long, unsigned long);
 	int (*check_flags)(int);
 	int (*flock) (struct file *, int, struct file_lock *);



* [RFC PATCH 18/28] siw: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage to transmit
  2023-03-16 15:25 [RFC PATCH 00/28] splice, net: Replace sendpage with sendmsg(MSG_SPLICE_PAGES) David Howells
                   ` (16 preceding siblings ...)
  2023-03-16 15:26 ` [RFC PATCH 17/28] Remove file->f_op->sendpage David Howells
@ 2023-03-16 15:26 ` David Howells
  2023-03-20 13:39   ` Bernard Metzler
  2023-03-16 15:26 ` [RFC PATCH 19/28] ceph: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage David Howells
                   ` (9 subsequent siblings)
  27 siblings, 1 reply; 81+ messages in thread
From: David Howells @ 2023-03-16 15:26 UTC (permalink / raw)
  To: Matthew Wilcox, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe,
	Jeff Layton, Christian Brauner, Linus Torvalds, netdev,
	linux-fsdevel, linux-kernel, linux-mm, Bernard Metzler,
	Tom Talpey, linux-rdma

When transmitting data, call down into TCP using a single sendmsg with
MSG_SPLICE_PAGES to indicate that content should be spliced rather than
performing several sendmsg and sendpage calls to transmit header, data
pages and trailer.

To make this work, the data is assembled in a bio_vec array and attached to
a BVEC-type iterator.  The header and trailer (if present) are copied into
memory acquired from zcopy_alloc() which just breaks a page up into small
pieces that can be freed with put_page().
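
As a rough userspace analogy only (this is not the siw code and it copies the
data rather than splicing it), the "one sendmsg for header, data and trailer"
idea looks like the sketch below.  It uses struct iovec where the patch uses
struct bio_vec, and the names send_hdt() and hdt_roundtrip() are invented for
the example.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper: hand header, payload and trailer to the socket in a
 * single sendmsg() call instead of three separate sends.
 */
ssize_t send_hdt(int fd, const void *hdr, size_t hdr_len,
		 const void *data, size_t data_len,
		 const void *trl, size_t trl_len)
{
	struct iovec iov[3] = {
		{ .iov_base = (void *)hdr,  .iov_len = hdr_len  },
		{ .iov_base = (void *)data, .iov_len = data_len },
		{ .iov_base = (void *)trl,  .iov_len = trl_len  },
	};
	struct msghdr msg = { .msg_iov = iov, .msg_iovlen = 3 };

	return sendmsg(fd, &msg, 0);
}

/*
 * Send a three-part message over a socketpair and check that the peer sees
 * the parts back to back.  Returns 0 on success and -1 on failure.
 */
int hdt_roundtrip(void)
{
	static const char hdr[] = "HDR:", data[] = "payload", trl[] = ":CRC";
	static const char expect[] = "HDR:payload:CRC";
	char buf[sizeof(expect)];
	size_t want = sizeof(expect) - 1, got = 0;
	int sv[2], ret = -1;

	/* The three segments must add up to the expected byte stream. */
	assert(want == (sizeof(hdr) - 1) + (sizeof(data) - 1) + (sizeof(trl) - 1));

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;

	if (send_hdt(sv[0], hdr, sizeof(hdr) - 1, data, sizeof(data) - 1,
		     trl, sizeof(trl) - 1) != (ssize_t)want)
		goto out;

	while (got < want) {
		ssize_t n = read(sv[1], buf + got, want - got);

		if (n <= 0)
			goto out;
		got += n;
	}

	ret = memcmp(buf, expect, want) == 0 ? 0 : -1;
out:
	close(sv[0]);
	close(sv[1]);
	return ret;
}
```

The point of the single call is the same in both cases: the transport sees
the whole packet at once rather than a sequence of partial sends.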

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Bernard Metzler <bmt@zurich.ibm.com>
cc: Tom Talpey <tom@talpey.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: linux-rdma@vger.kernel.org
cc: netdev@vger.kernel.org
---
 drivers/infiniband/sw/siw/siw_qp_tx.c | 231 +++++---------------------
 1 file changed, 46 insertions(+), 185 deletions(-)

diff --git a/drivers/infiniband/sw/siw/siw_qp_tx.c b/drivers/infiniband/sw/siw/siw_qp_tx.c
index 8fc179321e2b..ec4f0ac324ce 100644
--- a/drivers/infiniband/sw/siw/siw_qp_tx.c
+++ b/drivers/infiniband/sw/siw/siw_qp_tx.c
@@ -8,6 +8,7 @@
 #include <linux/net.h>
 #include <linux/scatterlist.h>
 #include <linux/highmem.h>
+#include <linux/zcopy_alloc.h>
 #include <net/tcp.h>
 
 #include <rdma/iw_cm.h>
@@ -312,114 +313,8 @@ static int siw_tx_ctrl(struct siw_iwarp_tx *c_tx, struct socket *s,
 	return rv;
 }
 
-/*
- * 0copy TCP transmit interface: Use MSG_SPLICE_PAGES.
- *
- * Using sendpage to push page by page appears to be less efficient
- * than using sendmsg, even if data are copied.
- *
- * A general performance limitation might be the extra four bytes
- * trailer checksum segment to be pushed after user data.
- */
-static int siw_tcp_sendpages(struct socket *s, struct page **page, int offset,
-			     size_t size)
-{
-	struct bio_vec bvec;
-	struct msghdr msg = {
-		.msg_flags = (MSG_SPLICE_PAGES | MSG_MORE | MSG_DONTWAIT |
-			      MSG_SENDPAGE_NOTLAST),
-	};
-	struct sock *sk = s->sk;
-	int i = 0, rv = 0, sent = 0;
-
-	while (size) {
-		size_t bytes = min_t(size_t, PAGE_SIZE - offset, size);
-
-		if (size + offset <= PAGE_SIZE)
-			msg.msg_flags = MSG_SPLICE_PAGES | MSG_MORE | MSG_DONTWAIT;
-
-		tcp_rate_check_app_limited(sk);
-		bvec_set_page(&bvec, page[i], bytes, offset);
-		iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
-
-try_page_again:
-		lock_sock(sk);
-		rv = tcp_sendmsg_locked(sk, &msg, size);
-		release_sock(sk);
-
-		if (rv > 0) {
-			size -= rv;
-			sent += rv;
-			if (rv != bytes) {
-				offset += rv;
-				bytes -= rv;
-				goto try_page_again;
-			}
-			offset = 0;
-		} else {
-			if (rv == -EAGAIN || rv == 0)
-				break;
-			return rv;
-		}
-		i++;
-	}
-	return sent;
-}
-
-/*
- * siw_0copy_tx()
- *
- * Pushes list of pages to TCP socket. If pages from multiple
- * SGE's, all referenced pages of each SGE are pushed in one
- * shot.
- */
-static int siw_0copy_tx(struct socket *s, struct page **page,
-			struct siw_sge *sge, unsigned int offset,
-			unsigned int size)
-{
-	int i = 0, sent = 0, rv;
-	int sge_bytes = min(sge->length - offset, size);
-
-	offset = (sge->laddr + offset) & ~PAGE_MASK;
-
-	while (sent != size) {
-		rv = siw_tcp_sendpages(s, &page[i], offset, sge_bytes);
-		if (rv >= 0) {
-			sent += rv;
-			if (size == sent || sge_bytes > rv)
-				break;
-
-			i += PAGE_ALIGN(sge_bytes + offset) >> PAGE_SHIFT;
-			sge++;
-			sge_bytes = min(sge->length, size - sent);
-			offset = sge->laddr & ~PAGE_MASK;
-		} else {
-			sent = rv;
-			break;
-		}
-	}
-	return sent;
-}
-
 #define MAX_TRAILER (MPA_CRC_SIZE + 4)
 
-static void siw_unmap_pages(struct kvec *iov, unsigned long kmap_mask, int len)
-{
-	int i;
-
-	/*
-	 * Work backwards through the array to honor the kmap_local_page()
-	 * ordering requirements.
-	 */
-	for (i = (len-1); i >= 0; i--) {
-		if (kmap_mask & BIT(i)) {
-			unsigned long addr = (unsigned long)iov[i].iov_base;
-
-			kunmap_local((void *)(addr & PAGE_MASK));
-		}
-	}
-}
-
 /*
  * siw_tx_hdt() tries to push a complete packet to TCP where all
  * packet fragments are referenced by the elements of one iovec.
@@ -439,15 +334,13 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx, struct socket *s)
 {
 	struct siw_wqe *wqe = &c_tx->wqe_active;
 	struct siw_sge *sge = &wqe->sqe.sge[c_tx->sge_idx];
-	struct kvec iov[MAX_ARRAY];
-	struct page *page_array[MAX_ARRAY];
+	struct bio_vec bvec[MAX_ARRAY];
 	struct msghdr msg = { .msg_flags = MSG_DONTWAIT | MSG_EOR };
 
 	int seg = 0, do_crc = c_tx->do_crc, is_kva = 0, rv;
 	unsigned int data_len = c_tx->bytes_unsent, hdr_len = 0, trl_len = 0,
 		     sge_off = c_tx->sge_off, sge_idx = c_tx->sge_idx,
 		     pbl_idx = c_tx->pbl_idx;
-	unsigned long kmap_mask = 0L;
 
 	if (c_tx->state == SIW_SEND_HDR) {
 		if (c_tx->use_sendpage) {
@@ -457,10 +350,12 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx, struct socket *s)
 
 			c_tx->state = SIW_SEND_DATA;
 		} else {
-			iov[0].iov_base =
-				(char *)&c_tx->pkt.ctrl + c_tx->ctrl_sent;
-			iov[0].iov_len = hdr_len =
-				c_tx->ctrl_len - c_tx->ctrl_sent;
+			const void *hdr = (char *)&c_tx->pkt.ctrl + c_tx->ctrl_sent;
+
+			hdr_len = c_tx->ctrl_len - c_tx->ctrl_sent;
+			rv = zcopy_memdup(hdr_len, hdr, &bvec[0], GFP_NOFS);
+			if (rv < 0)
+				goto done;
 			seg = 1;
 		}
 	}
@@ -478,28 +373,9 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx, struct socket *s)
 		} else {
 			is_kva = 1;
 		}
-		if (is_kva && !c_tx->use_sendpage) {
-			/*
-			 * tx from kernel virtual address: either inline data
-			 * or memory region with assigned kernel buffer
-			 */
-			iov[seg].iov_base =
-				(void *)(uintptr_t)(sge->laddr + sge_off);
-			iov[seg].iov_len = sge_len;
-
-			if (do_crc)
-				crypto_shash_update(c_tx->mpa_crc_hd,
-						    iov[seg].iov_base,
-						    sge_len);
-			sge_off += sge_len;
-			data_len -= sge_len;
-			seg++;
-			goto sge_done;
-		}
 
 		while (sge_len) {
 			size_t plen = min((int)PAGE_SIZE - fp_off, sge_len);
-			void *kaddr;
 
 			if (!is_kva) {
 				struct page *p;
@@ -512,33 +388,12 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx, struct socket *s)
 					p = siw_get_upage(mem->umem,
 							  sge->laddr + sge_off);
 				if (unlikely(!p)) {
-					siw_unmap_pages(iov, kmap_mask, seg);
 					wqe->processed -= c_tx->bytes_unsent;
 					rv = -EFAULT;
 					goto done_crc;
 				}
-				page_array[seg] = p;
-
-				if (!c_tx->use_sendpage) {
-					void *kaddr = kmap_local_page(p);
-
-					/* Remember for later kunmap() */
-					kmap_mask |= BIT(seg);
-					iov[seg].iov_base = kaddr + fp_off;
-					iov[seg].iov_len = plen;
-
-					if (do_crc)
-						crypto_shash_update(
-							c_tx->mpa_crc_hd,
-							iov[seg].iov_base,
-							plen);
-				} else if (do_crc) {
-					kaddr = kmap_local_page(p);
-					crypto_shash_update(c_tx->mpa_crc_hd,
-							    kaddr + fp_off,
-							    plen);
-					kunmap_local(kaddr);
-				}
+
+				bvec_set_page(&bvec[seg], p, plen, fp_off);
 			} else {
 				/*
 				 * Cast to an uintptr_t to preserve all 64 bits
@@ -552,12 +407,15 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx, struct socket *s)
 				 * bits on a 64 bit platform and 32 bits on a
 				 * 32 bit platform.
 				 */
-				page_array[seg] = virt_to_page((void *)(va & PAGE_MASK));
-				if (do_crc)
-					crypto_shash_update(
-						c_tx->mpa_crc_hd,
-						(void *)va,
-						plen);
+				bvec_set_virt(&bvec[seg], (void *)va, plen);
+			}
+
+			if (do_crc) {
+				void *kaddr = kmap_local_page(bvec[seg].bv_page);
+				crypto_shash_update(c_tx->mpa_crc_hd,
+						    kaddr + bvec[seg].bv_offset,
+						    bvec[seg].bv_len);
+				kunmap_local(kaddr);
 			}
 
 			sge_len -= plen;
@@ -567,13 +425,12 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx, struct socket *s)
 
 			if (++seg > (int)MAX_ARRAY) {
 				siw_dbg_qp(tx_qp(c_tx), "to many fragments\n");
-				siw_unmap_pages(iov, kmap_mask, seg-1);
 				wqe->processed -= c_tx->bytes_unsent;
 				rv = -EMSGSIZE;
 				goto done_crc;
 			}
 		}
-sge_done:
+
 		/* Update SGE variables at end of SGE */
 		if (sge_off == sge->length &&
 		    (data_len != 0 || wqe->processed < wqe->bytes)) {
@@ -582,15 +439,8 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx, struct socket *s)
 			sge_off = 0;
 		}
 	}
-	/* trailer */
-	if (likely(c_tx->state != SIW_SEND_TRAILER)) {
-		iov[seg].iov_base = &c_tx->trailer.pad[4 - c_tx->pad];
-		iov[seg].iov_len = trl_len = MAX_TRAILER - (4 - c_tx->pad);
-	} else {
-		iov[seg].iov_base = &c_tx->trailer.pad[c_tx->ctrl_sent];
-		iov[seg].iov_len = trl_len = MAX_TRAILER - c_tx->ctrl_sent;
-	}
 
+	/* Set the CRC in the trailer */
 	if (c_tx->pad) {
 		*(u32 *)c_tx->trailer.pad = 0;
 		if (do_crc)
@@ -603,23 +453,31 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx, struct socket *s)
 	else if (do_crc)
 		crypto_shash_final(c_tx->mpa_crc_hd, (u8 *)&c_tx->trailer.crc);
 
-	data_len = c_tx->bytes_unsent;
+	/* Copy the trailer and add it to the output list */
+	if (likely(c_tx->state != SIW_SEND_TRAILER)) {
+		void *trl = &c_tx->trailer.pad[4 - c_tx->pad];
 
-	if (c_tx->use_sendpage) {
-		rv = siw_0copy_tx(s, page_array, &wqe->sqe.sge[c_tx->sge_idx],
-				  c_tx->sge_off, data_len);
-		if (rv == data_len) {
-			rv = kernel_sendmsg(s, &msg, &iov[seg], 1, trl_len);
-			if (rv > 0)
-				rv += data_len;
-			else
-				rv = data_len;
-		}
+		trl_len = MAX_TRAILER - (4 - c_tx->pad);
+		rv = zcopy_memdup(trl_len, trl, &bvec[seg], GFP_NOFS);
+		if (rv < 0)
+			goto done_crc;
 	} else {
-		rv = kernel_sendmsg(s, &msg, iov, seg + 1,
-				    hdr_len + data_len + trl_len);
-		siw_unmap_pages(iov, kmap_mask, seg);
+		void *trl = &c_tx->trailer.pad[c_tx->ctrl_sent];
+
+		trl_len = MAX_TRAILER - c_tx->ctrl_sent;
+		rv = zcopy_memdup(trl_len, trl, &bvec[seg], GFP_NOFS);
+		if (rv < 0)
+			goto done_crc;
 	}
+
+	data_len = c_tx->bytes_unsent;
+
+	if (c_tx->use_sendpage)
+		msg.msg_flags |= MSG_SPLICE_PAGES;
+	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, bvec, seg + 1,
+		      hdr_len + data_len + trl_len);
+	rv = sock_sendmsg(s, &msg);
+
 	if (rv < (int)hdr_len) {
 		/* Not even complete hdr pushed or negative rv */
 		wqe->processed -= data_len;
@@ -680,6 +538,9 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx, struct socket *s)
 	}
 done_crc:
 	c_tx->do_crc = 0;
+	if (c_tx->state == SIW_SEND_HDR)
+		folio_put(page_folio(bvec[0].bv_page));
+	folio_put(page_folio(bvec[seg].bv_page));
 done:
 	return rv;
 }



* [RFC PATCH 19/28] ceph: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage
  2023-03-16 15:25 [RFC PATCH 00/28] splice, net: Replace sendpage with sendmsg(MSG_SPLICE_PAGES) David Howells
                   ` (17 preceding siblings ...)
  2023-03-16 15:26 ` [RFC PATCH 18/28] siw: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage to transmit David Howells
@ 2023-03-16 15:26 ` David Howells
  2023-03-16 15:26 ` [RFC PATCH 20/28] iscsi: " David Howells
                   ` (8 subsequent siblings)
  27 siblings, 0 replies; 81+ messages in thread
From: David Howells @ 2023-03-16 15:26 UTC (permalink / raw)
  To: Matthew Wilcox, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe,
	Jeff Layton, Christian Brauner, Linus Torvalds, netdev,
	linux-fsdevel, linux-kernel, linux-mm, Ilya Dryomov, Xiubo Li,
	ceph-devel

Use sendmsg() and MSG_SPLICE_PAGES rather than sendpage in ceph when
transmitting data.  For the moment, this can only transmit one page at a
time because of the architecture of net/ceph/, but if
write_partial_message_data() can be given a bvec[] at a time by the
iteration code, this would allow pages to be sent in a batch.
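
As a rough illustration of the batching idea, here is a userspace sketch
that gathers several segments from a cursor into one array per call.
Everything in it is a stand-in: struct seg only mimics the shape of a
bio_vec, and struct cursor and gather_batch() are hypothetical names, not
anything from net/ceph/.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a bio_vec; "page" is just an opaque pointer here. */
struct seg {
	const void *page;
	size_t len;
	size_t offset;
};

/* Hypothetical cursor: a list of segments plus the current position. */
struct cursor {
	const struct seg *items;
	size_t nr_items;
	size_t pos;
};

/*
 * Copy up to max segments from the cursor into out[] and report their byte
 * total through *bytes.  Returns the number of segments gathered.  The
 * cursor is advanced, so a later call continues where this one stopped.
 */
static size_t gather_batch(struct cursor *cur, struct seg *out, size_t max,
			   size_t *bytes)
{
	size_t n = 0;

	assert(cur && out && bytes);
	*bytes = 0;
	while (n < max && cur->pos < cur->nr_items) {
		out[n] = cur->items[cur->pos++];
		*bytes += out[n].len;
		n++;
	}
	return n;
}
```

A caller would hand out[] and the byte total to one send call per batch
instead of issuing one call per page.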

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Ilya Dryomov <idryomov@gmail.com>
cc: Xiubo Li <xiubli@redhat.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: ceph-devel@vger.kernel.org
cc: netdev@vger.kernel.org
---
 net/ceph/messenger_v1.c | 58 ++++++++++++++---------------------------
 1 file changed, 19 insertions(+), 39 deletions(-)

diff --git a/net/ceph/messenger_v1.c b/net/ceph/messenger_v1.c
index d664cb1593a7..b2d801a49122 100644
--- a/net/ceph/messenger_v1.c
+++ b/net/ceph/messenger_v1.c
@@ -74,37 +74,6 @@ static int ceph_tcp_sendmsg(struct socket *sock, struct kvec *iov,
 	return r;
 }
 
-/*
- * @more: either or both of MSG_MORE and MSG_SENDPAGE_NOTLAST
- */
-static int ceph_tcp_sendpage(struct socket *sock, struct page *page,
-			     int offset, size_t size, int more)
-{
-	ssize_t (*sendpage)(struct socket *sock, struct page *page,
-			    int offset, size_t size, int flags);
-	int flags = MSG_DONTWAIT | MSG_NOSIGNAL | more;
-	int ret;
-
-	/*
-	 * sendpage cannot properly handle pages with page_count == 0,
-	 * we need to fall back to sendmsg if that's the case.
-	 *
-	 * Same goes for slab pages: skb_can_coalesce() allows
-	 * coalescing neighboring slab objects into a single frag which
-	 * triggers one of hardened usercopy checks.
-	 */
-	if (sendpage_ok(page))
-		sendpage = sock->ops->sendpage;
-	else
-		sendpage = sock_no_sendpage;
-
-	ret = sendpage(sock, page, offset, size, flags);
-	if (ret == -EAGAIN)
-		ret = 0;
-
-	return ret;
-}
-
 static void con_out_kvec_reset(struct ceph_connection *con)
 {
 	BUG_ON(con->v1.out_skip);
@@ -464,7 +433,6 @@ static int write_partial_message_data(struct ceph_connection *con)
 	struct ceph_msg *msg = con->out_msg;
 	struct ceph_msg_data_cursor *cursor = &msg->cursor;
 	bool do_datacrc = !ceph_test_opt(from_msgr(con->msgr), NOCRC);
-	int more = MSG_MORE | MSG_SENDPAGE_NOTLAST;
 	u32 crc;
 
 	dout("%s %p msg %p\n", __func__, con, msg);
@@ -482,6 +450,10 @@ static int write_partial_message_data(struct ceph_connection *con)
 	 */
 	crc = do_datacrc ? le32_to_cpu(msg->footer.data_crc) : 0;
 	while (cursor->total_resid) {
+		struct bio_vec bvec;
+		struct msghdr msghdr = {
+			.msg_flags = MSG_SPLICE_PAGES | MSG_SENDPAGE_NOTLAST,
+		};
 		struct page *page;
 		size_t page_offset;
 		size_t length;
@@ -494,9 +466,12 @@ static int write_partial_message_data(struct ceph_connection *con)
 
 		page = ceph_msg_data_next(cursor, &page_offset, &length);
 		if (length == cursor->total_resid)
-			more = MSG_MORE;
-		ret = ceph_tcp_sendpage(con->sock, page, page_offset, length,
-					more);
+			msghdr.msg_flags |= MSG_MORE;
+
+		bvec_set_page(&bvec, page, length, page_offset);
+		iov_iter_bvec(&msghdr.msg_iter, ITER_SOURCE, &bvec, 1, length);
+
+		ret = sock_sendmsg(con->sock, &msghdr);
 		if (ret <= 0) {
 			if (do_datacrc)
 				msg->footer.data_crc = cpu_to_le32(crc);
@@ -526,7 +501,10 @@ static int write_partial_message_data(struct ceph_connection *con)
  */
 static int write_partial_skip(struct ceph_connection *con)
 {
-	int more = MSG_MORE | MSG_SENDPAGE_NOTLAST;
+	struct bio_vec bvec;
+	struct msghdr msghdr = {
+		.msg_flags = MSG_SPLICE_PAGES | MSG_SENDPAGE_NOTLAST | MSG_MORE,
+	};
 	int ret;
 
 	dout("%s %p %d left\n", __func__, con, con->v1.out_skip);
@@ -534,9 +512,11 @@ static int write_partial_skip(struct ceph_connection *con)
 		size_t size = min(con->v1.out_skip, (int)PAGE_SIZE);
 
 		if (size == con->v1.out_skip)
-			more = MSG_MORE;
-		ret = ceph_tcp_sendpage(con->sock, ceph_zero_page, 0, size,
-					more);
+			msghdr.msg_flags &= ~MSG_SENDPAGE_NOTLAST;
+		bvec_set_page(&bvec, ZERO_PAGE(0), size, 0);
+		iov_iter_bvec(&msghdr.msg_iter, ITER_SOURCE, &bvec, 1, size);
+
+		ret = sock_sendmsg(con->sock, &msghdr);
 		if (ret <= 0)
 			goto out;
 		con->v1.out_skip -= ret;



* [RFC PATCH 20/28] iscsi: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage
  2023-03-16 15:25 [RFC PATCH 00/28] splice, net: Replace sendpage with sendmsg(MSG_SPLICE_PAGES) David Howells
                   ` (18 preceding siblings ...)
  2023-03-16 15:26 ` [RFC PATCH 19/28] ceph: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage David Howells
@ 2023-03-16 15:26 ` David Howells
  2023-03-16 15:26 ` [RFC PATCH 21/28] tcp_bpf: Make tcp_bpf_sendpage() go through tcp_bpf_sendmsg(MSG_SPLICE_PAGES) David Howells
                   ` (7 subsequent siblings)
  27 siblings, 0 replies; 81+ messages in thread
From: David Howells @ 2023-03-16 15:26 UTC (permalink / raw)
  To: Matthew Wilcox, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe,
	Jeff Layton, Christian Brauner, Linus Torvalds, netdev,
	linux-fsdevel, linux-kernel, linux-mm, Martin K. Petersen,
	linux-scsi, target-devel

Use sendmsg() with MSG_SPLICE_PAGES rather than sendpage.  This allows
multiple pages and multipage folios to be passed through.

TODO: iscsit_fe_sendpage_sg() should perhaps set up a bio_vec array for the
entire set of pages it's going to transfer plus two for the header and
trailer and use zcopy_alloc() to allocate the header and trailer - and then
call sendmsg once for the entire message.
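
The layout that the TODO describes might look roughly like the following
userspace sketch.  It only shows how header, data segments and trailer
could share one array of nr_data + 2 entries; struct seg and frame_pdu()
are hypothetical names, and no allocation or socket call is modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for one vector entry (base pointer plus length). */
struct seg {
	const void *base;
	size_t len;
};

/*
 * Lay out header, data segments and trailer in one array so that a single
 * send call could cover the whole PDU.  out[] must have room for
 * nr_data + 2 entries.  Returns the total byte count and stores the
 * number of entries used in *nr_out.
 */
static size_t frame_pdu(const struct seg *hdr, const struct seg *data,
			size_t nr_data, const struct seg *trl,
			struct seg *out, size_t *nr_out)
{
	size_t i, n = 0, total = 0;

	assert(hdr && trl && out && nr_out);
	out[n++] = *hdr;
	for (i = 0; i < nr_data; i++)
		out[n++] = data[i];
	out[n++] = *trl;

	for (i = 0; i < n; i++)
		total += out[i].len;
	*nr_out = n;
	return total;
}
```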

Signed-off-by: David Howells <dhowells@redhat.com>
cc: "Martin K. Petersen" <martin.petersen@oracle.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: linux-scsi@vger.kernel.org
cc: target-devel@vger.kernel.org
cc: netdev@vger.kernel.org
---
 drivers/target/iscsi/iscsi_target_util.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/target/iscsi/iscsi_target_util.c b/drivers/target/iscsi/iscsi_target_util.c
index 26dc8ed3045b..c7d58e41ac3b 100644
--- a/drivers/target/iscsi/iscsi_target_util.c
+++ b/drivers/target/iscsi/iscsi_target_util.c
@@ -1078,6 +1078,8 @@ int iscsit_fe_sendpage_sg(
 	struct iscsit_conn *conn)
 {
 	struct scatterlist *sg = cmd->first_data_sg;
+	struct bio_vec bvec;
+	struct msghdr msghdr = { .msg_flags = MSG_SPLICE_PAGES, };
 	struct kvec iov;
 	u32 tx_hdr_size, data_len;
 	u32 offset = cmd->first_data_sg_off;
@@ -1121,17 +1123,17 @@ int iscsit_fe_sendpage_sg(
 		u32 space = (sg->length - offset);
 		u32 sub_len = min_t(u32, data_len, space);
 send_pg:
-		tx_sent = conn->sock->ops->sendpage(conn->sock,
-					sg_page(sg), sg->offset + offset, sub_len, 0);
+		bvec_set_page(&bvec, sg_page(sg), sub_len, sg->offset + offset);
+		iov_iter_bvec(&msghdr.msg_iter, ITER_SOURCE, &bvec, 1, sub_len);
+
+		tx_sent = conn->sock->ops->sendmsg(conn->sock, &msghdr, sub_len);
 		if (tx_sent != sub_len) {
 			if (tx_sent == -EAGAIN) {
-				pr_err("tcp_sendpage() returned"
-						" -EAGAIN\n");
+				pr_err("sendmsg/splice returned -EAGAIN\n");
 				goto send_pg;
 			}
 
-			pr_err("tcp_sendpage() failure: %d\n",
-					tx_sent);
+			pr_err("sendmsg/splice failure: %d\n", tx_sent);
 			return -1;
 		}
 



* [RFC PATCH 21/28] tcp_bpf: Make tcp_bpf_sendpage() go through tcp_bpf_sendmsg(MSG_SPLICE_PAGES)
  2023-03-16 15:25 [RFC PATCH 00/28] splice, net: Replace sendpage with sendmsg(MSG_SPLICE_PAGES) David Howells
                   ` (19 preceding siblings ...)
  2023-03-16 15:26 ` [RFC PATCH 20/28] iscsi: " David Howells
@ 2023-03-16 15:26 ` David Howells
  2023-03-16 15:26 ` [RFC PATCH 22/28] net: Use sendmsg(MSG_SPLICE_PAGES) not sendpage in skb_send_sock() David Howells
                   ` (6 subsequent siblings)
  27 siblings, 0 replies; 81+ messages in thread
From: David Howells @ 2023-03-16 15:26 UTC (permalink / raw)
  To: Matthew Wilcox, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe,
	Jeff Layton, Christian Brauner, Linus Torvalds, netdev,
	linux-fsdevel, linux-kernel, linux-mm, John Fastabend,
	Jakub Sitnicki, bpf

Translate tcp_bpf_sendpage() calls to tcp_bpf_sendmsg(MSG_SPLICE_PAGES).
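
The flag handling in this translation can be modelled in userspace as
below.  The SK_MSG_* bit values are placeholders chosen for this sketch
only (they are not the kernel's MSG_* values), and the function name is
hypothetical.  The assert() on entry is an extra assumption of the sketch,
not something the patch checks.

```c
#include <assert.h>

/* Placeholder bit values, used only in this sketch. */
#define SK_MSG_MORE		0x1u
#define SK_MSG_NOTLAST		0x2u
#define SK_MSG_SPLICE_PAGES	0x4u

/*
 * Turn sendpage-style flags into sendmsg-style flags: always request
 * splicing, and treat "not the last page" as "more data follows".
 */
static unsigned int sendpage_to_sendmsg_flags(unsigned int flags)
{
	unsigned int out;

	/* Assumption: the internal splice flag never arrives from the caller. */
	assert(!(flags & SK_MSG_SPLICE_PAGES));

	out = flags | SK_MSG_SPLICE_PAGES;
	if (flags & SK_MSG_NOTLAST)
		out |= SK_MSG_MORE;
	return out;
}
```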

Signed-off-by: David Howells <dhowells@redhat.com>
cc: John Fastabend <john.fastabend@gmail.com>
cc: Jakub Sitnicki <jakub@cloudflare.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: bpf@vger.kernel.org
cc: netdev@vger.kernel.org
---
 net/ipv4/tcp_bpf.c | 49 +++++++++-------------------------------------
 1 file changed, 9 insertions(+), 40 deletions(-)

diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index 7f17134637eb..de37a4372437 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -485,49 +485,18 @@ static int tcp_bpf_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 static int tcp_bpf_sendpage(struct sock *sk, struct page *page, int offset,
 			    size_t size, int flags)
 {
-	struct sk_msg tmp, *msg = NULL;
-	int err = 0, copied = 0;
-	struct sk_psock *psock;
-	bool enospc = false;
-
-	psock = sk_psock_get(sk);
-	if (unlikely(!psock))
-		return tcp_sendpage(sk, page, offset, size, flags);
+	struct bio_vec bvec;
+	struct msghdr msg = {
+		.msg_flags = flags | MSG_SPLICE_PAGES,
+	};
 
-	lock_sock(sk);
-	if (psock->cork) {
-		msg = psock->cork;
-	} else {
-		msg = &tmp;
-		sk_msg_init(msg);
-	}
+	bvec_set_page(&bvec, page, size, offset);
+	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
 
-	/* Catch case where ring is full and sendpage is stalled. */
-	if (unlikely(sk_msg_full(msg)))
-		goto out_err;
-
-	sk_msg_page_add(msg, page, size, offset);
-	sk_mem_charge(sk, size);
-	copied = size;
-	if (sk_msg_full(msg))
-		enospc = true;
-	if (psock->cork_bytes) {
-		if (size > psock->cork_bytes)
-			psock->cork_bytes = 0;
-		else
-			psock->cork_bytes -= size;
-		if (psock->cork_bytes && !enospc)
-			goto out_err;
-		/* All cork bytes are accounted, rerun the prog. */
-		psock->eval = __SK_NONE;
-		psock->cork_bytes = 0;
-	}
+	if (flags & MSG_SENDPAGE_NOTLAST)
+		msg.msg_flags |= MSG_MORE;
 
-	err = tcp_bpf_send_verdict(sk, psock, msg, &copied, flags);
-out_err:
-	release_sock(sk);
-	sk_psock_put(sk, psock);
-	return copied ? copied : err;
+	return tcp_bpf_sendmsg(sk, &msg, size);
 }
 
 enum {



* [RFC PATCH 22/28] net: Use sendmsg(MSG_SPLICE_PAGES) not sendpage in skb_send_sock()
  2023-03-16 15:25 [RFC PATCH 00/28] splice, net: Replace sendpage with sendmsg(MSG_SPLICE_PAGES) David Howells
                   ` (20 preceding siblings ...)
  2023-03-16 15:26 ` [RFC PATCH 21/28] tcp_bpf: Make tcp_bpf_sendpage() go through tcp_bpf_sendmsg(MSG_SPLICE_PAGES) David Howells
@ 2023-03-16 15:26 ` David Howells
  2023-03-16 15:26 ` [RFC PATCH 23/28] algif: Remove hash_sendpage*() David Howells
                   ` (5 subsequent siblings)
  27 siblings, 0 replies; 81+ messages in thread
From: David Howells @ 2023-03-16 15:26 UTC (permalink / raw)
  To: Matthew Wilcox, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe,
	Jeff Layton, Christian Brauner, Linus Torvalds, netdev,
	linux-fsdevel, linux-kernel, linux-mm

Use sendmsg() with MSG_SPLICE_PAGES rather than sendpage in
skb_send_sock().  This causes pages to be spliced from the source iterator
if possible (the iterator must be ITER_BVEC and the pages must be
spliceable).

This allows ->sendpage() to be replaced by something that can handle
multiple multipage folios in a single transaction.

Note that this could perhaps be improved to fill out a bvec array with all
the frags and then make a single sendmsg call, possibly sticking the header
on the front also.
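
For readers unfamiliar with the shape of this code, the pattern of one
sender callback driven by a loop that resumes after short sends can be
sketched in userspace as follows.  send_fn, send_all() and the mock sender
are hypothetical; the return convention (bytes sent so far if anything
went out, otherwise the sender's result) is only loosely modelled on
__skb_send_sock().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sender: returns bytes consumed, or <= 0 on failure. */
typedef int (*send_fn)(void *ctx, const unsigned char *buf, size_t len);

/*
 * Push len bytes through send, resuming after short sends.  Returns the
 * number of bytes sent, or the sender's result if nothing went out.
 */
static int send_all(send_fn send, void *ctx, const unsigned char *buf,
		    size_t len)
{
	size_t done = 0;

	assert(send && buf);
	while (done < len) {
		int ret = send(ctx, buf + done, len - done);

		if (ret <= 0)
			return done ? (int)done : ret;
		done += (size_t)ret;
	}
	return (int)done;
}

/* Mock sender for demonstration: accepts at most 3 bytes per call. */
static int send_at_most_3(void *ctx, const unsigned char *buf, size_t len)
{
	unsigned int *calls = ctx;

	(void)buf;
	(*calls)++;
	return len < 3 ? (int)len : 3;
}
```

Because the loop only sees a function pointer, a locked and an unlocked
sender can share it, which is the role the single sendmsg_func argument
plays in the diff.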

Signed-off-by: David Howells <dhowells@redhat.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: netdev@vger.kernel.org
---
 net/core/skbuff.c | 49 ++++++++++++++++++++++++++---------------------
 1 file changed, 27 insertions(+), 22 deletions(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index eb7d33b41e71..9fa333e26b7d 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2927,32 +2927,32 @@ int skb_splice_bits(struct sk_buff *skb, struct sock *sk, unsigned int offset,
 }
 EXPORT_SYMBOL_GPL(skb_splice_bits);
 
-static int sendmsg_unlocked(struct sock *sk, struct msghdr *msg,
-			    struct kvec *vec, size_t num, size_t size)
+static int sendmsg_locked(struct sock *sk, struct msghdr *msg)
 {
 	struct socket *sock = sk->sk_socket;
+	size_t size = msg_data_left(msg);
 
 	if (!sock)
 		return -EINVAL;
-	return kernel_sendmsg(sock, msg, vec, num, size);
+
+	if (!sock->ops->sendmsg_locked)
+		return sock_no_sendmsg_locked(sk, msg, size);
+
+	return sock->ops->sendmsg_locked(sk, msg, size);
 }
 
-static int sendpage_unlocked(struct sock *sk, struct page *page, int offset,
-			     size_t size, int flags)
+static int sendmsg_unlocked(struct sock *sk, struct msghdr *msg)
 {
 	struct socket *sock = sk->sk_socket;
 
 	if (!sock)
 		return -EINVAL;
-	return kernel_sendpage(sock, page, offset, size, flags);
+	return sock_sendmsg(sock, msg);
 }
 
-typedef int (*sendmsg_func)(struct sock *sk, struct msghdr *msg,
-			    struct kvec *vec, size_t num, size_t size);
-typedef int (*sendpage_func)(struct sock *sk, struct page *page, int offset,
-			     size_t size, int flags);
+typedef int (*sendmsg_func)(struct sock *sk, struct msghdr *msg);
 static int __skb_send_sock(struct sock *sk, struct sk_buff *skb, int offset,
-			   int len, sendmsg_func sendmsg, sendpage_func sendpage)
+			   int len, sendmsg_func sendmsg)
 {
 	unsigned int orig_len = len;
 	struct sk_buff *head = skb;
@@ -2972,8 +2972,9 @@ static int __skb_send_sock(struct sock *sk, struct sk_buff *skb, int offset,
 		memset(&msg, 0, sizeof(msg));
 		msg.msg_flags = MSG_DONTWAIT;
 
-		ret = INDIRECT_CALL_2(sendmsg, kernel_sendmsg_locked,
-				      sendmsg_unlocked, sk, &msg, &kv, 1, slen);
+		iov_iter_kvec(&msg.msg_iter, ITER_SOURCE, &kv, 1, slen);
+		ret = INDIRECT_CALL_2(sendmsg, sendmsg_locked,
+				      sendmsg_unlocked, sk, &msg);
 		if (ret <= 0)
 			goto error;
 
@@ -3004,11 +3005,17 @@ static int __skb_send_sock(struct sock *sk, struct sk_buff *skb, int offset,
 		slen = min_t(size_t, len, skb_frag_size(frag) - offset);
 
 		while (slen) {
-			ret = INDIRECT_CALL_2(sendpage, kernel_sendpage_locked,
-					      sendpage_unlocked, sk,
-					      skb_frag_page(frag),
-					      skb_frag_off(frag) + offset,
-					      slen, MSG_DONTWAIT);
+			struct bio_vec bvec;
+			struct msghdr msg = {
+				.msg_flags = MSG_SPLICE_PAGES | MSG_DONTWAIT,
+			};
+
+			bvec_set_page(&bvec, skb_frag_page(frag), slen,
+				      skb_frag_off(frag) + offset);
+			iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, slen);
+
+			ret = INDIRECT_CALL_2(sendmsg, sendmsg_locked,
+					      sendmsg_unlocked, sk, &msg);
 			if (ret <= 0)
 				goto error;
 
@@ -3045,16 +3052,14 @@ static int __skb_send_sock(struct sock *sk, struct sk_buff *skb, int offset,
 int skb_send_sock_locked(struct sock *sk, struct sk_buff *skb, int offset,
 			 int len)
 {
-	return __skb_send_sock(sk, skb, offset, len, kernel_sendmsg_locked,
-			       kernel_sendpage_locked);
+	return __skb_send_sock(sk, skb, offset, len, sendmsg_locked);
 }
 EXPORT_SYMBOL_GPL(skb_send_sock_locked);
 
 /* Send skb data on a socket. Socket must be unlocked. */
 int skb_send_sock(struct sock *sk, struct sk_buff *skb, int offset, int len)
 {
-	return __skb_send_sock(sk, skb, offset, len, sendmsg_unlocked,
-			       sendpage_unlocked);
+	return __skb_send_sock(sk, skb, offset, len, sendmsg_unlocked);
 }
 
 /**



* [RFC PATCH 23/28] algif: Remove hash_sendpage*()
  2023-03-16 15:25 [RFC PATCH 00/28] splice, net: Replace sendpage with sendmsg(MSG_SPLICE_PAGES) David Howells
                   ` (21 preceding siblings ...)
  2023-03-16 15:26 ` [RFC PATCH 22/28] net: Use sendmsg(MSG_SPLICE_PAGES) not sendpage in skb_send_sock() David Howells
@ 2023-03-16 15:26 ` David Howells
  2023-03-17  2:40   ` Herbert Xu
  2023-03-16 15:26 ` [RFC PATCH 24/28] ceph: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage() David Howells
                   ` (4 subsequent siblings)
  27 siblings, 1 reply; 81+ messages in thread
From: David Howells @ 2023-03-16 15:26 UTC (permalink / raw)
  To: Matthew Wilcox, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe,
	Jeff Layton, Christian Brauner, Linus Torvalds, netdev,
	linux-fsdevel, linux-kernel, linux-mm, Herbert Xu, linux-crypto

Remove hash_sendpage*() and use hash_sendmsg() as the latter seems to just
use the source pages directly anyway.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Herbert Xu <herbert@gondor.apana.org.au>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: linux-crypto@vger.kernel.org
cc: netdev@vger.kernel.org
---
 crypto/algif_hash.c | 66 ---------------------------------------------
 1 file changed, 66 deletions(-)

diff --git a/crypto/algif_hash.c b/crypto/algif_hash.c
index 1d017ec5c63c..52f5828a054a 100644
--- a/crypto/algif_hash.c
+++ b/crypto/algif_hash.c
@@ -129,58 +129,6 @@ static int hash_sendmsg(struct socket *sock, struct msghdr *msg,
 	return err ?: copied;
 }
 
-static ssize_t hash_sendpage(struct socket *sock, struct page *page,
-			     int offset, size_t size, int flags)
-{
-	struct sock *sk = sock->sk;
-	struct alg_sock *ask = alg_sk(sk);
-	struct hash_ctx *ctx = ask->private;
-	int err;
-
-	if (flags & MSG_SENDPAGE_NOTLAST)
-		flags |= MSG_MORE;
-
-	lock_sock(sk);
-	sg_init_table(ctx->sgl.sg, 1);
-	sg_set_page(ctx->sgl.sg, page, size, offset);
-
-	if (!(flags & MSG_MORE)) {
-		err = hash_alloc_result(sk, ctx);
-		if (err)
-			goto unlock;
-	} else if (!ctx->more)
-		hash_free_result(sk, ctx);
-
-	ahash_request_set_crypt(&ctx->req, ctx->sgl.sg, ctx->result, size);
-
-	if (!(flags & MSG_MORE)) {
-		if (ctx->more)
-			err = crypto_ahash_finup(&ctx->req);
-		else
-			err = crypto_ahash_digest(&ctx->req);
-	} else {
-		if (!ctx->more) {
-			err = crypto_ahash_init(&ctx->req);
-			err = crypto_wait_req(err, &ctx->wait);
-			if (err)
-				goto unlock;
-		}
-
-		err = crypto_ahash_update(&ctx->req);
-	}
-
-	err = crypto_wait_req(err, &ctx->wait);
-	if (err)
-		goto unlock;
-
-	ctx->more = flags & MSG_MORE;
-
-unlock:
-	release_sock(sk);
-
-	return err ?: size;
-}
-
 static int hash_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
 			int flags)
 {
@@ -285,7 +233,6 @@ static struct proto_ops algif_hash_ops = {
 
 	.release	=	af_alg_release,
 	.sendmsg	=	hash_sendmsg,
-	.sendpage	=	hash_sendpage,
 	.recvmsg	=	hash_recvmsg,
 	.accept		=	hash_accept,
 };
@@ -337,18 +284,6 @@ static int hash_sendmsg_nokey(struct socket *sock, struct msghdr *msg,
 	return hash_sendmsg(sock, msg, size);
 }
 
-static ssize_t hash_sendpage_nokey(struct socket *sock, struct page *page,
-				   int offset, size_t size, int flags)
-{
-	int err;
-
-	err = hash_check_key(sock);
-	if (err)
-		return err;
-
-	return hash_sendpage(sock, page, offset, size, flags);
-}
-
 static int hash_recvmsg_nokey(struct socket *sock, struct msghdr *msg,
 			      size_t ignored, int flags)
 {
@@ -387,7 +322,6 @@ static struct proto_ops algif_hash_ops_nokey = {
 
 	.release	=	af_alg_release,
 	.sendmsg	=	hash_sendmsg_nokey,
-	.sendpage	=	hash_sendpage_nokey,
 	.recvmsg	=	hash_recvmsg_nokey,
 	.accept		=	hash_accept_nokey,
 };



* [RFC PATCH 24/28] ceph: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage()
  2023-03-16 15:25 [RFC PATCH 00/28] splice, net: Replace sendpage with sendmsg(MSG_SPLICE_PAGES) David Howells
                   ` (22 preceding siblings ...)
  2023-03-16 15:26 ` [RFC PATCH 23/28] algif: Remove hash_sendpage*() David Howells
@ 2023-03-16 15:26 ` David Howells
  2023-03-16 15:26 ` [RFC PATCH 25/28] rds: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage David Howells
                   ` (3 subsequent siblings)
  27 siblings, 0 replies; 81+ messages in thread
From: David Howells @ 2023-03-16 15:26 UTC (permalink / raw)
  To: Matthew Wilcox, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe,
	Jeff Layton, Christian Brauner, Linus Torvalds, netdev,
	linux-fsdevel, linux-kernel, linux-mm, Ilya Dryomov, Xiubo Li,
	ceph-devel

Use sendmsg() and MSG_SPLICE_PAGES rather than sendpage in ceph when
transmitting data.  For the moment, this can only transmit one page at a
time because of the architecture of net/ceph/, but if
write_partial_message_data() can be given a bvec[] at a time by the
iteration code, this would allow pages to be sent in a batch.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Ilya Dryomov <idryomov@gmail.com>
cc: Xiubo Li <xiubli@redhat.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: ceph-devel@vger.kernel.org
cc: netdev@vger.kernel.org
---
 net/ceph/messenger_v2.c | 89 +++++++++--------------------------------
 1 file changed, 18 insertions(+), 71 deletions(-)

diff --git a/net/ceph/messenger_v2.c b/net/ceph/messenger_v2.c
index 301a991dc6a6..1637a0c21126 100644
--- a/net/ceph/messenger_v2.c
+++ b/net/ceph/messenger_v2.c
@@ -117,91 +117,38 @@ static int ceph_tcp_recv(struct ceph_connection *con)
 	return ret;
 }
 
-static int do_sendmsg(struct socket *sock, struct iov_iter *it)
-{
-	struct msghdr msg = { .msg_flags = CEPH_MSG_FLAGS };
-	int ret;
-
-	msg.msg_iter = *it;
-	while (iov_iter_count(it)) {
-		ret = sock_sendmsg(sock, &msg);
-		if (ret <= 0) {
-			if (ret == -EAGAIN)
-				ret = 0;
-			return ret;
-		}
-
-		iov_iter_advance(it, ret);
-	}
-
-	WARN_ON(msg_data_left(&msg));
-	return 1;
-}
-
-static int do_try_sendpage(struct socket *sock, struct iov_iter *it)
-{
-	struct msghdr msg = { .msg_flags = CEPH_MSG_FLAGS };
-	struct bio_vec bv;
-	int ret;
-
-	if (WARN_ON(!iov_iter_is_bvec(it)))
-		return -EINVAL;
-
-	while (iov_iter_count(it)) {
-		/* iov_iter_iovec() for ITER_BVEC */
-		bvec_set_page(&bv, it->bvec->bv_page,
-			      min(iov_iter_count(it),
-				  it->bvec->bv_len - it->iov_offset),
-			      it->bvec->bv_offset + it->iov_offset);
-
-		/*
-		 * sendpage cannot properly handle pages with
-		 * page_count == 0, we need to fall back to sendmsg if
-		 * that's the case.
-		 *
-		 * Same goes for slab pages: skb_can_coalesce() allows
-		 * coalescing neighboring slab objects into a single frag
-		 * which triggers one of hardened usercopy checks.
-		 */
-		if (sendpage_ok(bv.bv_page)) {
-			ret = sock->ops->sendpage(sock, bv.bv_page,
-						  bv.bv_offset, bv.bv_len,
-						  CEPH_MSG_FLAGS);
-		} else {
-			iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bv, 1, bv.bv_len);
-			ret = sock_sendmsg(sock, &msg);
-		}
-		if (ret <= 0) {
-			if (ret == -EAGAIN)
-				ret = 0;
-			return ret;
-		}
-
-		iov_iter_advance(it, ret);
-	}
-
-	return 1;
-}
-
 /*
  * Write as much as possible.  The socket is expected to be corked,
  * so we don't bother with MSG_MORE/MSG_SENDPAGE_NOTLAST here.
  *
  * Return:
- *   1 - done, nothing (else) to write
+ *  >0 - done, nothing (else) to write
  *   0 - socket is full, need to wait
  *  <0 - error
  */
 static int ceph_tcp_send(struct ceph_connection *con)
 {
+	struct msghdr msg = {
+		.msg_iter	= con->v2.out_iter,
+		.msg_flags	= CEPH_MSG_FLAGS,
+	};
 	int ret;
 
+	if (WARN_ON(!iov_iter_is_bvec(&con->v2.out_iter)))
+		return -EINVAL;
+
+	if (con->v2.out_iter_sendpage)
+		msg.msg_flags |= MSG_SPLICE_PAGES;
+
 	dout("%s con %p have %zu try_sendpage %d\n", __func__, con,
 	     iov_iter_count(&con->v2.out_iter), con->v2.out_iter_sendpage);
-	if (con->v2.out_iter_sendpage)
-		ret = do_try_sendpage(con->sock, &con->v2.out_iter);
-	else
-		ret = do_sendmsg(con->sock, &con->v2.out_iter);
+
+	ret = sock_sendmsg(con->sock, &msg);
+	if (ret > 0)
+		iov_iter_advance(&con->v2.out_iter, ret);
+	else if (ret == -EAGAIN)
+		ret = 0;
+
 	dout("%s con %p ret %d left %zu\n", __func__, con, ret,
 	     iov_iter_count(&con->v2.out_iter));
 	return ret;



* [RFC PATCH 25/28] rds: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage
  2023-03-16 15:25 [RFC PATCH 00/28] splice, net: Replace sendpage with sendmsg(MSG_SPLICE_PAGES) David Howells
                   ` (23 preceding siblings ...)
  2023-03-16 15:26 ` [RFC PATCH 24/28] ceph: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage() David Howells
@ 2023-03-16 15:26 ` David Howells
  2023-03-16 15:26   ` [Cluster-devel] " David Howells
                   ` (2 subsequent siblings)
  27 siblings, 0 replies; 81+ messages in thread
From: David Howells @ 2023-03-16 15:26 UTC (permalink / raw)
  To: Matthew Wilcox, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe,
	Jeff Layton, Christian Brauner, Linus Torvalds, netdev,
	linux-fsdevel, linux-kernel, linux-mm, Santosh Shilimkar,
	linux-rdma, rds-devel

When transmitting data, call down into TCP using a single sendmsg with
MSG_SPLICE_PAGES to indicate that content should be spliced rather than
performing several sendmsg and sendpage calls to transmit header and data
pages.

To make this work, the data is assembled in a bio_vec array and attached to
a BVEC-type iterator.  The header is copied into memory acquired from
zcopy_alloc() which just breaks a page up into small pieces that can be
freed with put_page().
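
The idea of carving small pieces out of a page and releasing them by
dropping page references can be sketched in userspace like this.  It is a
conceptual model only: the real zcopy_alloc() is not shown in this mail,
and every name below (frag_page, frag_alloc(), frag_page_put()) is
hypothetical.  Alignment of fragments is deliberately ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define FRAG_PAGE_SIZE 4096u	/* stand-in for PAGE_SIZE */

/* One backing "page" shared by several fragments. */
struct frag_page {
	unsigned int refs;	/* live fragments, plus one held by the allocator */
	size_t used;		/* bytes handed out so far */
	unsigned char data[FRAG_PAGE_SIZE];
};

/* Start a new page; the allocator holds the initial reference. */
static struct frag_page *frag_page_new(void)
{
	struct frag_page *pg = calloc(1, sizeof(*pg));

	if (pg)
		pg->refs = 1;
	return pg;
}

/* Carve len bytes off the page, taking a reference; NULL if it won't fit. */
static void *frag_alloc(struct frag_page *pg, size_t len)
{
	void *p;

	assert(pg);
	if (len == 0 || len > FRAG_PAGE_SIZE - pg->used)
		return NULL;
	p = pg->data + pg->used;
	pg->used += len;
	pg->refs++;
	return p;
}

/* Drop one reference; returns 1 if that freed the page, else 0. */
static int frag_page_put(struct frag_page *pg)
{
	assert(pg && pg->refs > 0);
	if (--pg->refs)
		return 0;
	free(pg);
	return 1;
}
```

Each fragment keeps the page alive until its consumer drops the
reference, so the sender does not need to track when the network stack
has finished with a copied header.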

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Santosh Shilimkar <santosh.shilimkar@oracle.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: linux-rdma@vger.kernel.org
cc: rds-devel@oss.oracle.com
cc: netdev@vger.kernel.org
---
 net/rds/tcp_send.c | 80 ++++++++++++++++++++--------------------------
 1 file changed, 35 insertions(+), 45 deletions(-)

diff --git a/net/rds/tcp_send.c b/net/rds/tcp_send.c
index 8c4d1d6e9249..0d6eb85a930d 100644
--- a/net/rds/tcp_send.c
+++ b/net/rds/tcp_send.c
@@ -32,6 +32,7 @@
  */
 #include <linux/kernel.h>
 #include <linux/in.h>
+#include <linux/zcopy_alloc.h>
 #include <net/tcp.h>
 
 #include "rds_single_path.h"
@@ -52,29 +53,24 @@ void rds_tcp_xmit_path_complete(struct rds_conn_path *cp)
 	tcp_sock_set_cork(tc->t_sock->sk, false);
 }
 
-/* the core send_sem serializes this with other xmit and shutdown */
-static int rds_tcp_sendmsg(struct socket *sock, void *data, unsigned int len)
-{
-	struct kvec vec = {
-		.iov_base = data,
-		.iov_len = len,
-	};
-	struct msghdr msg = {
-		.msg_flags = MSG_DONTWAIT | MSG_NOSIGNAL,
-	};
-
-	return kernel_sendmsg(sock, &msg, &vec, 1, vec.iov_len);
-}
-
 /* the core send_sem serializes this with other xmit and shutdown */
 int rds_tcp_xmit(struct rds_connection *conn, struct rds_message *rm,
 		 unsigned int hdr_off, unsigned int sg, unsigned int off)
 {
 	struct rds_conn_path *cp = rm->m_inc.i_conn_path;
 	struct rds_tcp_connection *tc = cp->cp_transport_data;
+	struct msghdr msg = {
+		.msg_flags = MSG_SPLICE_PAGES | MSG_DONTWAIT | MSG_NOSIGNAL,
+	};
+	struct bio_vec *bvec;
+	unsigned int i, size = 0, ix = 0;
+	bool free_hdr = false;
 	int done = 0;
-	int ret = 0;
-	int more;
+	int ret = -ENOMEM;
+
+	bvec = kmalloc_array(1 + sg, sizeof(struct bio_vec), GFP_KERNEL);
+	if (!bvec)
+		goto out;
 
 	if (hdr_off == 0) {
 		/*
@@ -101,41 +97,30 @@ int rds_tcp_xmit(struct rds_connection *conn, struct rds_message *rm,
 		/* see rds_tcp_write_space() */
 		set_bit(SOCK_NOSPACE, &tc->t_sock->sk->sk_socket->flags);
 
-		ret = rds_tcp_sendmsg(tc->t_sock,
-				      (void *)&rm->m_inc.i_hdr + hdr_off,
-				      sizeof(rm->m_inc.i_hdr) - hdr_off);
+		ret = zcopy_memdup(sizeof(rm->m_inc.i_hdr) - hdr_off,
+				   (void *)&rm->m_inc.i_hdr + hdr_off,
+				   &bvec[ix], GFP_KERNEL);
 		if (ret < 0)
 			goto out;
-		done += ret;
-		if (hdr_off + done != sizeof(struct rds_header))
-			goto out;
+		free_hdr = true;
+		size += bvec[ix].bv_len;
+		ix++;
 	}
 
-	more = rm->data.op_nents > 1 ? (MSG_MORE | MSG_SENDPAGE_NOTLAST) : 0;
-	while (sg < rm->data.op_nents) {
-		int flags = MSG_DONTWAIT | MSG_NOSIGNAL | more;
-
-		ret = tc->t_sock->ops->sendpage(tc->t_sock,
-						sg_page(&rm->data.op_sg[sg]),
-						rm->data.op_sg[sg].offset + off,
-						rm->data.op_sg[sg].length - off,
-						flags);
-		rdsdebug("tcp sendpage %p:%u:%u ret %d\n", (void *)sg_page(&rm->data.op_sg[sg]),
-			 rm->data.op_sg[sg].offset + off, rm->data.op_sg[sg].length - off,
-			 ret);
-		if (ret <= 0)
-			break;
-
-		off += ret;
-		done += ret;
-		if (off == rm->data.op_sg[sg].length) {
-			off = 0;
-			sg++;
-		}
-		if (sg == rm->data.op_nents - 1)
-			more = 0;
+	for (i = sg; i < rm->data.op_nents; i++) {
+		bvec_set_page(&bvec[ix],
+			      sg_page(&rm->data.op_sg[i]),
+			      rm->data.op_sg[i].length - off,
+			      rm->data.op_sg[i].offset + off);
+		off = 0;
+		size += bvec[ix].bv_len;
+		ix++;
 	}
 
+	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, bvec, ix, size);
+	ret = sock_sendmsg(tc->t_sock, &msg);
+	rdsdebug("tcp sendmsg-splice %u,%u ret %d\n", ix, size, ret);
+
 out:
 	if (ret <= 0) {
 		/* write_space will hit after EAGAIN, all else fatal */
@@ -158,6 +143,11 @@ int rds_tcp_xmit(struct rds_connection *conn, struct rds_message *rm,
 	}
 	if (done == 0)
 		done = ret;
+	if (bvec) {
+		if (free_hdr)
+			put_page(bvec[0].bv_page);
+		kfree(bvec);
+	}
 	return done;
 }
 


^ permalink raw reply related	[flat|nested] 81+ messages in thread

* [RFC PATCH 26/28] dlm: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage
  2023-03-16 15:25 [RFC PATCH 00/28] splice, net: Replace sendpage with sendmsg(MSG_SPLICE_PAGES) David Howells
@ 2023-03-16 15:26   ` David Howells
  2023-03-16 15:25 ` [RFC PATCH 02/28] Add a special allocator for staging netfs protocol to MSG_SPLICE_PAGES David Howells
                     ` (26 subsequent siblings)
  27 siblings, 0 replies; 81+ messages in thread
From: David Howells @ 2023-03-16 15:26 UTC (permalink / raw)
  To: Matthew Wilcox, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe,
	Jeff Layton, Christian Brauner, Linus Torvalds, netdev,
	linux-fsdevel, linux-kernel, linux-mm, Christine Caulfield,
	David Teigland, cluster-devel

When transmitting data, call down a layer using a single sendmsg with
MSG_SPLICE_PAGES to indicate that content should be spliced rather than using
sendpage.  This allows ->sendpage() to be replaced by something that can
handle multiple multipage folios in a single transaction.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Christine Caulfield <ccaulfie@redhat.com>
cc: David Teigland <teigland@redhat.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: cluster-devel@redhat.com
cc: netdev@vger.kernel.org
---
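A rough userspace analogue of the call-shape change, offered as a sketch
only.  MSG_SPLICE_PAGES is an internal kernel flag and cannot be passed
from userspace, so this example shows just the "describe one fragment as a
single-element vector, then issue one sendmsg()" part.  send_fragment() is
a hypothetical helper name; sendmsg(2), struct msghdr, MSG_DONTWAIT and
MSG_NOSIGNAL are the ordinary Linux socket API.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Send 'len' bytes starting 'offset' bytes into 'buf' with one
 * non-blocking sendmsg() call.  Returns the number of bytes queued, or -1
 * with errno set (EAGAIN if the socket buffer is full).
 */
static ssize_t send_fragment(int fd, char *buf, size_t offset, size_t len)
{
	struct iovec iov = {
		.iov_base = buf + offset,
		.iov_len  = len,
	};
	struct msghdr msg = {
		.msg_iov    = &iov,
		.msg_iovlen = 1,
	};

	assert(buf != NULL);

	return sendmsg(fd, &msg, MSG_DONTWAIT | MSG_NOSIGNAL);
}
```

As with the kernel version, the caller still has to cope with a short
count or EAGAIN and retry from the updated offset.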
 fs/dlm/lowcomms.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index a9b14f81d655..9c0c691b6106 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -1394,8 +1394,11 @@ int dlm_lowcomms_resend_msg(struct dlm_msg *msg)
 /* Send a message */
 static int send_to_sock(struct connection *con)
 {
-	const int msg_flags = MSG_DONTWAIT | MSG_NOSIGNAL;
 	struct writequeue_entry *e;
+	struct bio_vec bvec;
+	struct msghdr msg = {
+		.msg_flags = MSG_SPLICE_PAGES | MSG_DONTWAIT | MSG_NOSIGNAL,
+	};
 	int len, offset, ret;
 
 	spin_lock_bh(&con->writequeue_lock);
@@ -1411,8 +1414,9 @@ static int send_to_sock(struct connection *con)
 	WARN_ON_ONCE(len == 0 && e->users == 0);
 	spin_unlock_bh(&con->writequeue_lock);
 
-	ret = kernel_sendpage(con->sock, e->page, offset, len,
-			      msg_flags);
+	bvec_set_page(&bvec, e->page, len, offset);
+	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, len);
+	ret = sock_sendmsg(con->sock, &msg);
 	trace_dlm_send(con->nodeid, ret);
 	if (ret == -EAGAIN || ret == 0) {
 		lock_sock(con->sock->sk);


^ permalink raw reply related	[flat|nested] 81+ messages in thread

* [RFC PATCH 27/28] sunrpc: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage
  2023-03-16 15:25 [RFC PATCH 00/28] splice, net: Replace sendpage with sendmsg(MSG_SPLICE_PAGES) David Howells
                   ` (25 preceding siblings ...)
  2023-03-16 15:26   ` [Cluster-devel] " David Howells
@ 2023-03-16 15:26 ` David Howells
  2023-03-16 16:17   ` Trond Myklebust
  2023-03-16 16:24   ` David Howells
  2023-03-16 15:26   ` David Howells
  27 siblings, 2 replies; 81+ messages in thread
From: David Howells @ 2023-03-16 15:26 UTC (permalink / raw)
  To: Matthew Wilcox, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe,
	Jeff Layton, Christian Brauner, Linus Torvalds, netdev,
	linux-fsdevel, linux-kernel, linux-mm, Trond Myklebust,
	Anna Schumaker, Chuck Lever, linux-nfs

When transmitting data, call down into TCP using a single sendmsg with
MSG_SPLICE_PAGES to indicate that content should be spliced rather than
performing several sendmsg and sendpage calls to transmit header, data
pages and trailer.

To make this work, the data is assembled in a bio_vec array and attached to
a BVEC-type iterator.  The bio_vec array has two extra slots before the
first data page (for the record marker and the XDR head) and one after the
last (for the tail).  The headers and trailer are copied into memory
acquired from zcopy_alloc(), which just breaks a page up into small pieces
that can be freed with put_page().

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Trond Myklebust <trond.myklebust@hammerspace.com>
cc: Anna Schumaker <anna@kernel.org>
cc: Chuck Lever <chuck.lever@oracle.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: linux-nfs@vger.kernel.org
cc: netdev@vger.kernel.org
---
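A small userspace sketch of the array layout described above, under the
assumption that a plain struct can stand in for struct bio_vec.  struct seg
and the seg_array_*() helpers are hypothetical names for illustration.  The
allocation holds n + 3 entries, but the pointer handed back is advanced by
two so that data slot 0 keeps index 0 while indices -2, -1 and n address
the two header slots and the trailer slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for the kernel's struct bio_vec. */
struct seg {
	const void *base;
	size_t len;
};

/*
 * Allocate n data slots plus two leading header slots and one trailing
 * slot.  The returned pointer addresses data slot 0, so segs[-2], segs[-1]
 * and segs[n] are all valid.  calloc() leaves unused slots with len == 0.
 */
static struct seg *seg_array_alloc(size_t n)
{
	struct seg *raw = calloc(n + 3, sizeof(*raw));

	return raw ? raw + 2 : NULL;
}

/* Undo the +2 adjustment before handing the memory back. */
static void seg_array_free(struct seg *segs)
{
	if (segs)
		free(segs - 2);
}

/* Total bytes described by headers, data slots and trailer together. */
static size_t seg_array_total(const struct seg *segs, size_t n)
{
	const struct seg *p = segs - 2;
	size_t total = 0;

	assert(segs != NULL);

	for (size_t i = 0; i < n + 3; i++)
		total += p[i].len;
	return total;
}
```

Keeping the data slots at their natural indices means code that only walks
slots 0..n-1 needs no change, while the sender can pass (segs - 2, n + 3)
to cover the whole message in one go.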
 net/sunrpc/svcsock.c | 70 ++++++++++++--------------------------------
 net/sunrpc/xdr.c     | 24 ++++++++++++---
 2 files changed, 38 insertions(+), 56 deletions(-)

diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 03a4f5615086..1fa41ddbc40e 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -36,6 +36,7 @@
 #include <linux/skbuff.h>
 #include <linux/file.h>
 #include <linux/freezer.h>
+#include <linux/zcopy_alloc.h>
 #include <net/sock.h>
 #include <net/checksum.h>
 #include <net/ip.h>
@@ -1060,16 +1061,8 @@ static int svc_tcp_recvfrom(struct svc_rqst *rqstp)
 	return 0;	/* record not complete */
 }
 
-static int svc_tcp_send_kvec(struct socket *sock, const struct kvec *vec,
-			      int flags)
-{
-	return kernel_sendpage(sock, virt_to_page(vec->iov_base),
-			       offset_in_page(vec->iov_base),
-			       vec->iov_len, flags);
-}
-
 /*
- * kernel_sendpage() is used exclusively to reduce the number of
+ * MSG_SPLICE_PAGES is used exclusively to reduce the number of
  * copy operations in this path. Therefore the caller must ensure
  * that the pages backing @xdr are unchanging.
  *
@@ -1081,65 +1074,38 @@ static int svc_tcp_sendmsg(struct socket *sock, struct xdr_buf *xdr,
 {
 	const struct kvec *head = xdr->head;
 	const struct kvec *tail = xdr->tail;
-	struct kvec rm = {
-		.iov_base	= &marker,
-		.iov_len	= sizeof(marker),
-	};
 	struct msghdr msg = {
-		.msg_flags	= 0,
+		.msg_flags	= MSG_SPLICE_PAGES,
 	};
-	int ret;
+	int ret, n = xdr_buf_pagecount(xdr), size;
 
 	*sentp = 0;
 	ret = xdr_alloc_bvec(xdr, GFP_KERNEL);
 	if (ret < 0)
 		return ret;
 
-	ret = kernel_sendmsg(sock, &msg, &rm, 1, rm.iov_len);
+	ret = zcopy_memdup(sizeof(marker), &marker, &xdr->bvec[-2], GFP_KERNEL);
 	if (ret < 0)
 		return ret;
-	*sentp += ret;
-	if (ret != rm.iov_len)
-		return -EAGAIN;
 
-	ret = svc_tcp_send_kvec(sock, head, 0);
+	ret = zcopy_memdup(head->iov_len, head->iov_base, &xdr->bvec[-1], GFP_KERNEL);
 	if (ret < 0)
 		return ret;
-	*sentp += ret;
-	if (ret != head->iov_len)
-		goto out;
 
-	if (xdr->page_len) {
-		unsigned int offset, len, remaining;
-		struct bio_vec *bvec;
-
-		bvec = xdr->bvec + (xdr->page_base >> PAGE_SHIFT);
-		offset = offset_in_page(xdr->page_base);
-		remaining = xdr->page_len;
-		while (remaining > 0) {
-			len = min(remaining, bvec->bv_len - offset);
-			ret = kernel_sendpage(sock, bvec->bv_page,
-					      bvec->bv_offset + offset,
-					      len, 0);
-			if (ret < 0)
-				return ret;
-			*sentp += ret;
-			if (ret != len)
-				goto out;
-			remaining -= len;
-			offset = 0;
-			bvec++;
-		}
-	}
+	ret = zcopy_memdup(tail->iov_len, tail->iov_base, &xdr->bvec[n], GFP_KERNEL);
+	if (ret < 0)
+		return ret;
 
-	if (tail->iov_len) {
-		ret = svc_tcp_send_kvec(sock, tail, 0);
-		if (ret < 0)
-			return ret;
-		*sentp += ret;
-	}
+	size = sizeof(marker) + head->iov_len + xdr->page_len + tail->iov_len;
+	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, xdr->bvec - 2, n + 3, size);
 
-out:
+	ret = sock_sendmsg(sock, &msg);
+	if (ret < 0)
+		return ret;
+	if (ret > 0)
+		*sentp = ret;
+	if (ret != size)
+		return -EAGAIN;
 	return 0;
 }
 
diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
index 36835b2f5446..6dff0b4f17b8 100644
--- a/net/sunrpc/xdr.c
+++ b/net/sunrpc/xdr.c
@@ -145,14 +145,19 @@ xdr_alloc_bvec(struct xdr_buf *buf, gfp_t gfp)
 {
 	size_t i, n = xdr_buf_pagecount(buf);
 
-	if (n != 0 && buf->bvec == NULL) {
-		buf->bvec = kmalloc_array(n, sizeof(buf->bvec[0]), gfp);
+	if (buf->bvec == NULL) {
+		/* Allow for two headers and a trailer to be attached */
+		buf->bvec = kmalloc_array(n + 3, sizeof(buf->bvec[0]), gfp);
 		if (!buf->bvec)
 			return -ENOMEM;
+		buf->bvec += 2;
+		buf->bvec[-2].bv_page = NULL;
+		buf->bvec[-1].bv_page = NULL;
 		for (i = 0; i < n; i++) {
 			bvec_set_page(&buf->bvec[i], buf->pages[i], PAGE_SIZE,
 				      0);
 		}
+		buf->bvec[n].bv_page = NULL;
 	}
 	return 0;
 }
@@ -160,8 +165,19 @@ xdr_alloc_bvec(struct xdr_buf *buf, gfp_t gfp)
 void
 xdr_free_bvec(struct xdr_buf *buf)
 {
-	kfree(buf->bvec);
-	buf->bvec = NULL;
+	if (buf->bvec) {
+		size_t n = xdr_buf_pagecount(buf);
+
+		if (buf->bvec[-2].bv_page)
+			put_page(buf->bvec[-2].bv_page);
+		if (buf->bvec[-1].bv_page)
+			put_page(buf->bvec[-1].bv_page);
+		if (buf->bvec[n].bv_page)
+			put_page(buf->bvec[n].bv_page);
+		buf->bvec -= 2;
+		kfree(buf->bvec);
+		buf->bvec = NULL;
+	}
 }
 
 /**


^ permalink raw reply related	[flat|nested] 81+ messages in thread

* [RFC PATCH 28/28] sock: Remove ->sendpage*() in favour of sendmsg(MSG_SPLICE_PAGES)
  2023-03-16 15:25 [RFC PATCH 00/28] splice, net: Replace sendpage with sendmsg(MSG_SPLICE_PAGES) David Howells
  2023-03-16 15:25 ` [RFC PATCH 01/28] net: Declare MSG_SPLICE_PAGES internal sendmsg() flag David Howells
  2023-03-16 15:25 ` [RFC PATCH 02/28] Add a special allocator for staging netfs protocol to MSG_SPLICE_PAGES David Howells
@ 2023-03-16 15:26   ` David Howells
  2023-03-16 15:25 ` [RFC PATCH 04/28] tcp: Convert do_tcp_sendpages() to use MSG_SPLICE_PAGES David Howells
                     ` (24 subsequent siblings)
  27 siblings, 0 replies; 81+ messages in thread
From: David Howells @ 2023-03-16 15:26 UTC (permalink / raw)
  To: Matthew Wilcox, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: linux-doc, virtualization, David Howells, linux-mm, linux-sctp,
	linux-afs, rds-devel, linux-x25, dccp, linux-rdma,
	Christoph Hellwig, Linus Torvalds, linux-arm-msm, linux-can,
	Al Viro, linux-hams, mptcp, Jens Axboe, Christian Brauner,
	netdev, Jeff Layton, linux-kernel, tipc-discussion, linux-crypto,
	linux-fsdevel, bpf, linux-wpan

[!] Note: This is a work in progress.  At the moment, some things (nvme,
    kcm, smc, tls) won't build if this patch is applied.

Remove ->sendpage() and ->sendpage_locked().  sendmsg() with
MSG_SPLICE_PAGES should be used instead.  This allows multiple pages and
multipage folios to be passed through.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: bpf@vger.kernel.org
cc: dccp@vger.kernel.org
cc: linux-afs@lists.infradead.org
cc: linux-arm-msm@vger.kernel.org
cc: linux-can@vger.kernel.org
cc: linux-crypto@vger.kernel.org
cc: linux-doc@vger.kernel.org
cc: linux-hams@vger.kernel.org
cc: linux-kernel@vger.kernel.org
cc: linux-rdma@vger.kernel.org
cc: linux-sctp@vger.kernel.org
cc: linux-wpan@vger.kernel.org
cc: linux-x25@vger.kernel.org
cc: mptcp@lists.linux.dev
cc: netdev@vger.kernel.org
cc: rds-devel@oss.oracle.com
cc: tipc-discussion@lists.sourceforge.net
cc: virtualization@lists.linux-foundation.org
---
 Documentation/networking/scaling.rst |   4 +-
 crypto/af_alg.c                      |  29 ------
 crypto/algif_aead.c                  |  22 +----
 crypto/algif_rng.c                   |   2 -
 crypto/algif_skcipher.c              |  14 ---
 include/linux/net.h                  |   8 --
 include/net/inet_common.h            |   2 -
 include/net/sock.h                   |   6 --
 net/appletalk/ddp.c                  |   1 -
 net/atm/pvc.c                        |   1 -
 net/atm/svc.c                        |   1 -
 net/ax25/af_ax25.c                   |   1 -
 net/caif/caif_socket.c               |   2 -
 net/can/bcm.c                        |   1 -
 net/can/isotp.c                      |   1 -
 net/can/j1939/socket.c               |   1 -
 net/can/raw.c                        |   1 -
 net/core/sock.c                      |  35 +------
 net/dccp/ipv4.c                      |   1 -
 net/dccp/ipv6.c                      |   1 -
 net/ieee802154/socket.c              |   2 -
 net/ipv4/af_inet.c                   |  21 ----
 net/ipv4/tcp.c                       |  36 -------
 net/ipv4/tcp_bpf.c                   |  21 +---
 net/ipv4/tcp_ipv4.c                  |   1 -
 net/ipv4/udp.c                       |  22 -----
 net/ipv4/udp_impl.h                  |   2 -
 net/ipv4/udplite.c                   |   1 -
 net/ipv6/af_inet6.c                  |   3 -
 net/ipv6/raw.c                       |   1 -
 net/ipv6/tcp_ipv6.c                  |   1 -
 net/key/af_key.c                     |   1 -
 net/l2tp/l2tp_ip.c                   |   1 -
 net/l2tp/l2tp_ip6.c                  |   1 -
 net/llc/af_llc.c                     |   1 -
 net/mctp/af_mctp.c                   |   1 -
 net/mptcp/protocol.c                 |   2 -
 net/netlink/af_netlink.c             |   1 -
 net/netrom/af_netrom.c               |   1 -
 net/packet/af_packet.c               |   2 -
 net/phonet/socket.c                  |   2 -
 net/qrtr/af_qrtr.c                   |   1 -
 net/rds/af_rds.c                     |   1 -
 net/rose/af_rose.c                   |   1 -
 net/rxrpc/af_rxrpc.c                 |   1 -
 net/sctp/protocol.c                  |   1 -
 net/socket.c                         |  48 ---------
 net/tipc/socket.c                    |   3 -
 net/unix/af_unix.c                   | 139 ---------------------------
 net/vmw_vsock/af_vsock.c             |   3 -
 net/x25/af_x25.c                     |   1 -
 net/xdp/xsk.c                        |   1 -
 52 files changed, 9 insertions(+), 449 deletions(-)

diff --git a/Documentation/networking/scaling.rst b/Documentation/networking/scaling.rst
index 3d435caa3ef2..92c9fb46d6a2 100644
--- a/Documentation/networking/scaling.rst
+++ b/Documentation/networking/scaling.rst
@@ -269,8 +269,8 @@ a single application thread handles flows with many different flow hashes.
 rps_sock_flow_table is a global flow table that contains the *desired* CPU
 for flows: the CPU that is currently processing the flow in userspace.
 Each table value is a CPU index that is updated during calls to recvmsg
-and sendmsg (specifically, inet_recvmsg(), inet_sendmsg(), inet_sendpage()
-and tcp_splice_read()).
+and sendmsg (specifically, inet_recvmsg(), inet_sendmsg() and
+tcp_splice_read()).
 
 When the scheduler moves a thread to a new CPU while it has outstanding
 receive packets on the old CPU, packets may arrive out of order. To
diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index 0e77fce60876..225c90657f58 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -483,7 +483,6 @@ static const struct proto_ops alg_proto_ops = {
 	.listen		=	sock_no_listen,
 	.shutdown	=	sock_no_shutdown,
 	.mmap		=	sock_no_mmap,
-	.sendpage	=	sock_no_sendpage,
 	.sendmsg	=	sock_no_sendmsg,
 	.recvmsg	=	sock_no_recvmsg,
 
@@ -1135,34 +1134,6 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
 }
 EXPORT_SYMBOL_GPL(af_alg_sendmsg);
 
-/**
- * af_alg_sendpage - sendpage system call handler
- * @sock: socket of connection to user space to write to
- * @page: data to send
- * @offset: offset into page to begin sending
- * @size: length of data
- * @flags: message send/receive flags
- *
- * This is a generic implementation of sendpage to fill ctx->tsgl_list.
- */
-ssize_t af_alg_sendpage(struct socket *sock, struct page *page,
-			int offset, size_t size, int flags)
-{
-	struct bio_vec bvec;
-	struct msghdr msg = {
-		.msg_flags = flags | MSG_SPLICE_PAGES,
-	};
-
-	bvec_set_page(&bvec, page, size, offset);
-	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
-
-	if (flags & MSG_SENDPAGE_NOTLAST)
-		msg.msg_flags |= MSG_MORE;
-
-	return sock_sendmsg(sock, &msg);
-}
-EXPORT_SYMBOL_GPL(af_alg_sendpage);
-
 /**
  * af_alg_free_resources - release resources required for crypto request
  * @areq: Request holding the TX and RX SGL
diff --git a/crypto/algif_aead.c b/crypto/algif_aead.c
index 279eb17a1dfc..b65baefe6123 100644
--- a/crypto/algif_aead.c
+++ b/crypto/algif_aead.c
@@ -9,10 +9,10 @@
  * The following concept of the memory management is used:
  *
  * The kernel maintains two SGLs, the TX SGL and the RX SGL. The TX SGL is
- * filled by user space with the data submitted via sendpage. Filling up
- * the TX SGL does not cause a crypto operation -- the data will only be
- * tracked by the kernel. Upon receipt of one recvmsg call, the caller must
- * provide a buffer which is tracked with the RX SGL.
+ * filled by user space with the data submitted via sendmsg (maybe with
+ * MSG_SPLICE_PAGES).  Filling up the TX SGL does not cause a crypto operation
+ * -- the data will only be tracked by the kernel. Upon receipt of one recvmsg
+ * call, the caller must provide a buffer which is tracked with the RX SGL.
  *
  * During the processing of the recvmsg operation, the cipher request is
  * allocated and prepared. As part of the recvmsg operation, the processed
@@ -368,7 +368,6 @@ static struct proto_ops algif_aead_ops = {
 
 	.release	=	af_alg_release,
 	.sendmsg	=	aead_sendmsg,
-	.sendpage	=	af_alg_sendpage,
 	.recvmsg	=	aead_recvmsg,
 	.poll		=	af_alg_poll,
 };
@@ -420,18 +419,6 @@ static int aead_sendmsg_nokey(struct socket *sock, struct msghdr *msg,
 	return aead_sendmsg(sock, msg, size);
 }
 
-static ssize_t aead_sendpage_nokey(struct socket *sock, struct page *page,
-				       int offset, size_t size, int flags)
-{
-	int err;
-
-	err = aead_check_key(sock);
-	if (err)
-		return err;
-
-	return af_alg_sendpage(sock, page, offset, size, flags);
-}
-
 static int aead_recvmsg_nokey(struct socket *sock, struct msghdr *msg,
 				  size_t ignored, int flags)
 {
@@ -459,7 +446,6 @@ static struct proto_ops algif_aead_ops_nokey = {
 
 	.release	=	af_alg_release,
 	.sendmsg	=	aead_sendmsg_nokey,
-	.sendpage	=	aead_sendpage_nokey,
 	.recvmsg	=	aead_recvmsg_nokey,
 	.poll		=	af_alg_poll,
 };
diff --git a/crypto/algif_rng.c b/crypto/algif_rng.c
index 407408c43730..10c41adac3b1 100644
--- a/crypto/algif_rng.c
+++ b/crypto/algif_rng.c
@@ -174,7 +174,6 @@ static struct proto_ops algif_rng_ops = {
 	.bind		=	sock_no_bind,
 	.accept		=	sock_no_accept,
 	.sendmsg	=	sock_no_sendmsg,
-	.sendpage	=	sock_no_sendpage,
 
 	.release	=	af_alg_release,
 	.recvmsg	=	rng_recvmsg,
@@ -192,7 +191,6 @@ static struct proto_ops __maybe_unused algif_rng_test_ops = {
 	.mmap		=	sock_no_mmap,
 	.bind		=	sock_no_bind,
 	.accept		=	sock_no_accept,
-	.sendpage	=	sock_no_sendpage,
 
 	.release	=	af_alg_release,
 	.recvmsg	=	rng_test_recvmsg,
diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c
index 021f9ce7e87c..b34e20400e80 100644
--- a/crypto/algif_skcipher.c
+++ b/crypto/algif_skcipher.c
@@ -194,7 +194,6 @@ static struct proto_ops algif_skcipher_ops = {
 
 	.release	=	af_alg_release,
 	.sendmsg	=	skcipher_sendmsg,
-	.sendpage	=	af_alg_sendpage,
 	.recvmsg	=	skcipher_recvmsg,
 	.poll		=	af_alg_poll,
 };
@@ -246,18 +245,6 @@ static int skcipher_sendmsg_nokey(struct socket *sock, struct msghdr *msg,
 	return skcipher_sendmsg(sock, msg, size);
 }
 
-static ssize_t skcipher_sendpage_nokey(struct socket *sock, struct page *page,
-				       int offset, size_t size, int flags)
-{
-	int err;
-
-	err = skcipher_check_key(sock);
-	if (err)
-		return err;
-
-	return af_alg_sendpage(sock, page, offset, size, flags);
-}
-
 static int skcipher_recvmsg_nokey(struct socket *sock, struct msghdr *msg,
 				  size_t ignored, int flags)
 {
@@ -285,7 +272,6 @@ static struct proto_ops algif_skcipher_ops_nokey = {
 
 	.release	=	af_alg_release,
 	.sendmsg	=	skcipher_sendmsg_nokey,
-	.sendpage	=	skcipher_sendpage_nokey,
 	.recvmsg	=	skcipher_recvmsg_nokey,
 	.poll		=	af_alg_poll,
 };
diff --git a/include/linux/net.h b/include/linux/net.h
index b73ad8e3c212..e5794968ac9f 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -206,8 +206,6 @@ struct proto_ops {
 				      size_t total_len, int flags);
 	int		(*mmap)	     (struct file *file, struct socket *sock,
 				      struct vm_area_struct * vma);
-	ssize_t		(*sendpage)  (struct socket *sock, struct page *page,
-				      int offset, size_t size, int flags);
 	ssize_t 	(*splice_read)(struct socket *sock,  loff_t *ppos,
 				       struct pipe_inode_info *pipe, size_t len, unsigned int flags);
 	int		(*set_peek_off)(struct sock *sk, int val);
@@ -220,8 +218,6 @@ struct proto_ops {
 				     sk_read_actor_t recv_actor);
 	/* This is different from read_sock(), it reads an entire skb at a time. */
 	int		(*read_skb)(struct sock *sk, skb_read_actor_t recv_actor);
-	int		(*sendpage_locked)(struct sock *sk, struct page *page,
-					   int offset, size_t size, int flags);
 	int		(*sendmsg_locked)(struct sock *sk, struct msghdr *msg,
 					  size_t size);
 	int		(*set_rcvlowat)(struct sock *sk, int val);
@@ -339,10 +335,6 @@ int kernel_connect(struct socket *sock, struct sockaddr *addr, int addrlen,
 		   int flags);
 int kernel_getsockname(struct socket *sock, struct sockaddr *addr);
 int kernel_getpeername(struct socket *sock, struct sockaddr *addr);
-int kernel_sendpage(struct socket *sock, struct page *page, int offset,
-		    size_t size, int flags);
-int kernel_sendpage_locked(struct sock *sk, struct page *page, int offset,
-			   size_t size, int flags);
 int kernel_sock_shutdown(struct socket *sock, enum sock_shutdown_cmd how);
 
 /* Routine returns the IP overhead imposed by a (caller-protected) socket. */
diff --git a/include/net/inet_common.h b/include/net/inet_common.h
index cec453c18f1d..054c3388fa51 100644
--- a/include/net/inet_common.h
+++ b/include/net/inet_common.h
@@ -33,8 +33,6 @@ int inet_accept(struct socket *sock, struct socket *newsock, int flags,
 		bool kern);
 int inet_send_prepare(struct sock *sk);
 int inet_sendmsg(struct socket *sock, struct msghdr *msg, size_t size);
-ssize_t inet_sendpage(struct socket *sock, struct page *page, int offset,
-		      size_t size, int flags);
 int inet_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
 		 int flags);
 int inet_shutdown(struct socket *sock, int how);
diff --git a/include/net/sock.h b/include/net/sock.h
index 573f2bf7e0de..4618cd21e16b 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1265,8 +1265,6 @@ struct proto {
 					   size_t len);
 	int			(*recvmsg)(struct sock *sk, struct msghdr *msg,
 					   size_t len, int flags, int *addr_len);
-	int			(*sendpage)(struct sock *sk, struct page *page,
-					int offset, size_t size, int flags);
 	int			(*bind)(struct sock *sk,
 					struct sockaddr *addr, int addr_len);
 	int			(*bind_add)(struct sock *sk,
@@ -1906,10 +1904,6 @@ int sock_no_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t len);
 int sock_no_recvmsg(struct socket *, struct msghdr *, size_t, int);
 int sock_no_mmap(struct file *file, struct socket *sock,
 		 struct vm_area_struct *vma);
-ssize_t sock_no_sendpage(struct socket *sock, struct page *page, int offset,
-			 size_t size, int flags);
-ssize_t sock_no_sendpage_locked(struct sock *sk, struct page *page,
-				int offset, size_t size, int flags);
 
 /*
  * Functions to fill in entries in struct proto_ops when a protocol
diff --git a/net/appletalk/ddp.c b/net/appletalk/ddp.c
index a06f4d4a6f47..8978fb6212ff 100644
--- a/net/appletalk/ddp.c
+++ b/net/appletalk/ddp.c
@@ -1929,7 +1929,6 @@ static const struct proto_ops atalk_dgram_ops = {
 	.sendmsg	= atalk_sendmsg,
 	.recvmsg	= atalk_recvmsg,
 	.mmap		= sock_no_mmap,
-	.sendpage	= sock_no_sendpage,
 };
 
 static struct notifier_block ddp_notifier = {
diff --git a/net/atm/pvc.c b/net/atm/pvc.c
index 53e7d3f39e26..66d9a9bd5896 100644
--- a/net/atm/pvc.c
+++ b/net/atm/pvc.c
@@ -126,7 +126,6 @@ static const struct proto_ops pvc_proto_ops = {
 	.sendmsg =	vcc_sendmsg,
 	.recvmsg =	vcc_recvmsg,
 	.mmap =		sock_no_mmap,
-	.sendpage =	sock_no_sendpage,
 };
 
 
diff --git a/net/atm/svc.c b/net/atm/svc.c
index 4a02bcaad279..289240fe234e 100644
--- a/net/atm/svc.c
+++ b/net/atm/svc.c
@@ -649,7 +649,6 @@ static const struct proto_ops svc_proto_ops = {
 	.sendmsg =	vcc_sendmsg,
 	.recvmsg =	vcc_recvmsg,
 	.mmap =		sock_no_mmap,
-	.sendpage =	sock_no_sendpage,
 };
 
 
diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c
index d8da400cb4de..5db805d5f74d 100644
--- a/net/ax25/af_ax25.c
+++ b/net/ax25/af_ax25.c
@@ -2022,7 +2022,6 @@ static const struct proto_ops ax25_proto_ops = {
 	.sendmsg	= ax25_sendmsg,
 	.recvmsg	= ax25_recvmsg,
 	.mmap		= sock_no_mmap,
-	.sendpage	= sock_no_sendpage,
 };
 
 /*
diff --git a/net/caif/caif_socket.c b/net/caif/caif_socket.c
index 4eebcc66c19a..9c82698da4f5 100644
--- a/net/caif/caif_socket.c
+++ b/net/caif/caif_socket.c
@@ -976,7 +976,6 @@ static const struct proto_ops caif_seqpacket_ops = {
 	.sendmsg = caif_seqpkt_sendmsg,
 	.recvmsg = caif_seqpkt_recvmsg,
 	.mmap = sock_no_mmap,
-	.sendpage = sock_no_sendpage,
 };
 
 static const struct proto_ops caif_stream_ops = {
@@ -996,7 +995,6 @@ static const struct proto_ops caif_stream_ops = {
 	.sendmsg = caif_stream_sendmsg,
 	.recvmsg = caif_stream_recvmsg,
 	.mmap = sock_no_mmap,
-	.sendpage = sock_no_sendpage,
 };
 
 /* This function is called when a socket is finally destroyed. */
diff --git a/net/can/bcm.c b/net/can/bcm.c
index 27706f6ace34..65a946a36d92 100644
--- a/net/can/bcm.c
+++ b/net/can/bcm.c
@@ -1699,7 +1699,6 @@ static const struct proto_ops bcm_ops = {
 	.sendmsg       = bcm_sendmsg,
 	.recvmsg       = bcm_recvmsg,
 	.mmap          = sock_no_mmap,
-	.sendpage      = sock_no_sendpage,
 };
 
 static struct proto bcm_proto __read_mostly = {
diff --git a/net/can/isotp.c b/net/can/isotp.c
index 9bc344851704..0c3d11c29a2b 100644
--- a/net/can/isotp.c
+++ b/net/can/isotp.c
@@ -1633,7 +1633,6 @@ static const struct proto_ops isotp_ops = {
 	.sendmsg = isotp_sendmsg,
 	.recvmsg = isotp_recvmsg,
 	.mmap = sock_no_mmap,
-	.sendpage = sock_no_sendpage,
 };
 
 static struct proto isotp_proto __read_mostly = {
diff --git a/net/can/j1939/socket.c b/net/can/j1939/socket.c
index 7e90f9e61d9b..2bfe4f79bb67 100644
--- a/net/can/j1939/socket.c
+++ b/net/can/j1939/socket.c
@@ -1301,7 +1301,6 @@ static const struct proto_ops j1939_ops = {
 	.sendmsg = j1939_sk_sendmsg,
 	.recvmsg = j1939_sk_recvmsg,
 	.mmap = sock_no_mmap,
-	.sendpage = sock_no_sendpage,
 };
 
 static struct proto j1939_proto __read_mostly = {
diff --git a/net/can/raw.c b/net/can/raw.c
index f64469b98260..15c79b079184 100644
--- a/net/can/raw.c
+++ b/net/can/raw.c
@@ -962,7 +962,6 @@ static const struct proto_ops raw_ops = {
 	.sendmsg       = raw_sendmsg,
 	.recvmsg       = raw_recvmsg,
 	.mmap          = sock_no_mmap,
-	.sendpage      = sock_no_sendpage,
 };
 
 static struct proto raw_proto __read_mostly = {
diff --git a/net/core/sock.c b/net/core/sock.c
index 341c565dbc26..c2ae77bb2075 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -3223,36 +3223,6 @@ void __receive_sock(struct file *file)
 	}
 }
 
-ssize_t sock_no_sendpage(struct socket *sock, struct page *page, int offset, size_t size, int flags)
-{
-	ssize_t res;
-	struct msghdr msg = {.msg_flags = flags};
-	struct kvec iov;
-	char *kaddr = kmap(page);
-	iov.iov_base = kaddr + offset;
-	iov.iov_len = size;
-	res = kernel_sendmsg(sock, &msg, &iov, 1, size);
-	kunmap(page);
-	return res;
-}
-EXPORT_SYMBOL(sock_no_sendpage);
-
-ssize_t sock_no_sendpage_locked(struct sock *sk, struct page *page,
-				int offset, size_t size, int flags)
-{
-	ssize_t res;
-	struct msghdr msg = {.msg_flags = flags};
-	struct kvec iov;
-	char *kaddr = kmap(page);
-
-	iov.iov_base = kaddr + offset;
-	iov.iov_len = size;
-	res = kernel_sendmsg_locked(sk, &msg, &iov, 1, size);
-	kunmap(page);
-	return res;
-}
-EXPORT_SYMBOL(sock_no_sendpage_locked);
-
 /*
  *	Default Socket Callbacks
  */
@@ -4008,7 +3978,7 @@ static void proto_seq_printf(struct seq_file *seq, struct proto *proto)
 {
 
 	seq_printf(seq, "%-9s %4u %6d  %6ld   %-3s %6u   %-3s  %-10s "
-			"%2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c\n",
+			"%2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c\n",
 		   proto->name,
 		   proto->obj_size,
 		   sock_prot_inuse_get(seq_file_net(seq), proto),
@@ -4029,7 +3999,6 @@ static void proto_seq_printf(struct seq_file *seq, struct proto *proto)
 		   proto_method_implemented(proto->getsockopt),
 		   proto_method_implemented(proto->sendmsg),
 		   proto_method_implemented(proto->recvmsg),
-		   proto_method_implemented(proto->sendpage),
 		   proto_method_implemented(proto->bind),
 		   proto_method_implemented(proto->backlog_rcv),
 		   proto_method_implemented(proto->hash),
@@ -4050,7 +4019,7 @@ static int proto_seq_show(struct seq_file *seq, void *v)
 			   "maxhdr",
 			   "slab",
 			   "module",
-			   "cl co di ac io in de sh ss gs se re sp bi br ha uh gp em\n");
+			   "cl co di ac io in de sh ss gs se re bi br ha uh gp em\n");
 	else
 		proto_seq_printf(seq, list_entry(v, struct proto, node));
 	return 0;
diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c
index b780827f5e0a..ea808de374ea 100644
--- a/net/dccp/ipv4.c
+++ b/net/dccp/ipv4.c
@@ -1008,7 +1008,6 @@ static const struct proto_ops inet_dccp_ops = {
 	.sendmsg	   = inet_sendmsg,
 	.recvmsg	   = sock_common_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = sock_no_sendpage,
 };
 
 static struct inet_protosw dccp_v4_protosw = {
diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c
index b9d7c3dd1cb3..23eb8159e3cd 100644
--- a/net/dccp/ipv6.c
+++ b/net/dccp/ipv6.c
@@ -1085,7 +1085,6 @@ static const struct proto_ops inet6_dccp_ops = {
 	.sendmsg	   = inet_sendmsg,
 	.recvmsg	   = sock_common_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = sock_no_sendpage,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	   = inet6_compat_ioctl,
 #endif
diff --git a/net/ieee802154/socket.c b/net/ieee802154/socket.c
index 1fa2fe041ec0..1238f036117f 100644
--- a/net/ieee802154/socket.c
+++ b/net/ieee802154/socket.c
@@ -426,7 +426,6 @@ static const struct proto_ops ieee802154_raw_ops = {
 	.sendmsg	   = ieee802154_sock_sendmsg,
 	.recvmsg	   = sock_common_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = sock_no_sendpage,
 };
 
 /* DGRAM Sockets (802.15.4 dataframes) */
@@ -990,7 +989,6 @@ static const struct proto_ops ieee802154_dgram_ops = {
 	.sendmsg	   = ieee802154_sock_sendmsg,
 	.recvmsg	   = sock_common_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = sock_no_sendpage,
 };
 
 static void ieee802154_sock_destruct(struct sock *sk)
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 8db6747f892f..869b49933f15 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -827,23 +827,6 @@ int inet_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
 }
 EXPORT_SYMBOL(inet_sendmsg);
 
-ssize_t inet_sendpage(struct socket *sock, struct page *page, int offset,
-		      size_t size, int flags)
-{
-	struct sock *sk = sock->sk;
-	const struct proto *prot;
-
-	if (unlikely(inet_send_prepare(sk)))
-		return -EAGAIN;
-
-	/* IPV6_ADDRFORM can change sk->sk_prot under us. */
-	prot = READ_ONCE(sk->sk_prot);
-	if (prot->sendpage)
-		return prot->sendpage(sk, page, offset, size, flags);
-	return sock_no_sendpage(sock, page, offset, size, flags);
-}
-EXPORT_SYMBOL(inet_sendpage);
-
 INDIRECT_CALLABLE_DECLARE(int udp_recvmsg(struct sock *, struct msghdr *,
 					  size_t, int, int *));
 int inet_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
@@ -1046,12 +1029,10 @@ const struct proto_ops inet_stream_ops = {
 #ifdef CONFIG_MMU
 	.mmap		   = tcp_mmap,
 #endif
-	.sendpage	   = inet_sendpage,
 	.splice_read	   = tcp_splice_read,
 	.read_sock	   = tcp_read_sock,
 	.read_skb	   = tcp_read_skb,
 	.sendmsg_locked    = tcp_sendmsg_locked,
-	.sendpage_locked   = tcp_sendpage_locked,
 	.peek_len	   = tcp_peek_len,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	   = inet_compat_ioctl,
@@ -1080,7 +1061,6 @@ const struct proto_ops inet_dgram_ops = {
 	.read_skb	   = udp_read_skb,
 	.recvmsg	   = inet_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = inet_sendpage,
 	.set_peek_off	   = sk_set_peek_off,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	   = inet_compat_ioctl,
@@ -1111,7 +1091,6 @@ static const struct proto_ops inet_sockraw_ops = {
 	.sendmsg	   = inet_sendmsg,
 	.recvmsg	   = inet_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = inet_sendpage,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	   = inet_compat_ioctl,
 #endif
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index f1454e4497df..26fa387f1084 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -971,42 +971,6 @@ static int tcp_wmem_schedule(struct sock *sk, int copy)
 	return min(copy, sk->sk_forward_alloc);
 }
 
-int tcp_sendpage_locked(struct sock *sk, struct page *page, int offset,
-			size_t size, int flags)
-{
-	struct bio_vec bvec;
-	struct msghdr msg = {
-		.msg_flags = flags | MSG_SPLICE_PAGES,
-	};
-
-	if (!(sk->sk_route_caps & NETIF_F_SG))
-		return sock_no_sendpage_locked(sk, page, offset, size, flags);
-
-	tcp_rate_check_app_limited(sk);  /* is sending application-limited? */
-
-	bvec_set_page(&bvec, page, size, offset);
-	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
-
-	if (flags & MSG_SENDPAGE_NOTLAST)
-		msg.msg_flags |= MSG_MORE;
-
-	return tcp_sendmsg_locked(sk, &msg, size);
-}
-EXPORT_SYMBOL_GPL(tcp_sendpage_locked);
-
-int tcp_sendpage(struct sock *sk, struct page *page, int offset,
-		 size_t size, int flags)
-{
-	int ret;
-
-	lock_sock(sk);
-	ret = tcp_sendpage_locked(sk, page, offset, size, flags);
-	release_sock(sk);
-
-	return ret;
-}
-EXPORT_SYMBOL(tcp_sendpage);
-
 void tcp_free_fastopen_req(struct tcp_sock *tp)
 {
 	if (tp->fastopen_req) {
diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index de37a4372437..ab83cfb9de22 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -482,23 +482,6 @@ static int tcp_bpf_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 	return copied ? copied : err;
 }
 
-static int tcp_bpf_sendpage(struct sock *sk, struct page *page, int offset,
-			    size_t size, int flags)
-{
-	struct bio_vec bvec;
-	struct msghdr msg = {
-		.msg_flags = flags | MSG_SPLICE_PAGES,
-	};
-
-	bvec_set_page(&bvec, page, size, offset);
-	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
-
-	if (flags & MSG_SENDPAGE_NOTLAST)
-		msg.msg_flags |= MSG_MORE;
-
-	return tcp_bpf_sendmsg(sk, &msg, size);
-}
-
 enum {
 	TCP_BPF_IPV4,
 	TCP_BPF_IPV6,
@@ -528,7 +511,6 @@ static void tcp_bpf_rebuild_protos(struct proto prot[TCP_BPF_NUM_CFGS],
 
 	prot[TCP_BPF_TX]			= prot[TCP_BPF_BASE];
 	prot[TCP_BPF_TX].sendmsg		= tcp_bpf_sendmsg;
-	prot[TCP_BPF_TX].sendpage		= tcp_bpf_sendpage;
 
 	prot[TCP_BPF_RX]			= prot[TCP_BPF_BASE];
 	prot[TCP_BPF_RX].recvmsg		= tcp_bpf_recvmsg_parser;
@@ -563,8 +545,7 @@ static int tcp_bpf_assert_proto_ops(struct proto *ops)
 	 * indeed valid assumptions.
 	 */
 	return ops->recvmsg  == tcp_recvmsg &&
-	       ops->sendmsg  == tcp_sendmsg &&
-	       ops->sendpage == tcp_sendpage ? 0 : -ENOTSUPP;
+	       ops->sendmsg  == tcp_sendmsg ? 0 : -ENOTSUPP;
 }
 
 int tcp_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore)
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index ea370afa70ed..5c2e1c1ca329 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -3112,7 +3112,6 @@ struct proto tcp_prot = {
 	.keepalive		= tcp_set_keepalive,
 	.recvmsg		= tcp_recvmsg,
 	.sendmsg		= tcp_sendmsg,
-	.sendpage		= tcp_sendpage,
 	.backlog_rcv		= tcp_v4_do_rcv,
 	.release_cb		= tcp_release_cb,
 	.hash			= inet_hash,
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 097feb92e215..85bd5960f7ef 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1329,27 +1329,6 @@ int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 }
 EXPORT_SYMBOL(udp_sendmsg);
 
-int udp_sendpage(struct sock *sk, struct page *page, int offset,
-		 size_t size, int flags)
-{
-	struct bio_vec bvec;
-	struct msghdr msg = {
-		.msg_flags = flags | MSG_SPLICE_PAGES | MSG_MORE
-	};
-	int ret;
-
-	bvec_set_page(&bvec, page, size, offset);
-	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
-
-	if (flags & MSG_SENDPAGE_NOTLAST)
-		msg.msg_flags |= MSG_MORE;
-
-	lock_sock(sk);
-	ret = udp_sendmsg(sk, &msg, size);
-	release_sock(sk);
-	return ret;
-}
-
 #define UDP_SKB_IS_STATELESS 0x80000000
 
 /* all head states (dst, sk, nf conntrack) except skb extensions are
@@ -2926,7 +2905,6 @@ struct proto udp_prot = {
 	.getsockopt		= udp_getsockopt,
 	.sendmsg		= udp_sendmsg,
 	.recvmsg		= udp_recvmsg,
-	.sendpage		= udp_sendpage,
 	.release_cb		= ip4_datagram_release_cb,
 	.hash			= udp_lib_hash,
 	.unhash			= udp_lib_unhash,
diff --git a/net/ipv4/udp_impl.h b/net/ipv4/udp_impl.h
index 4ba7a88a1b1d..e1ff3a375996 100644
--- a/net/ipv4/udp_impl.h
+++ b/net/ipv4/udp_impl.h
@@ -19,8 +19,6 @@ int udp_getsockopt(struct sock *sk, int level, int optname,
 
 int udp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int flags,
 		int *addr_len);
-int udp_sendpage(struct sock *sk, struct page *page, int offset, size_t size,
-		 int flags);
 void udp_destroy_sock(struct sock *sk);
 
 #ifdef CONFIG_PROC_FS
diff --git a/net/ipv4/udplite.c b/net/ipv4/udplite.c
index e0c9cc39b81e..69870f0afc6c 100644
--- a/net/ipv4/udplite.c
+++ b/net/ipv4/udplite.c
@@ -54,7 +54,6 @@ struct proto 	udplite_prot = {
 	.getsockopt	   = udp_getsockopt,
 	.sendmsg	   = udp_sendmsg,
 	.recvmsg	   = udp_recvmsg,
-	.sendpage	   = udp_sendpage,
 	.hash		   = udp_lib_hash,
 	.unhash		   = udp_lib_unhash,
 	.rehash		   = udp_v4_rehash,
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index 38689bedfce7..769c76d59053 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -695,9 +695,7 @@ const struct proto_ops inet6_stream_ops = {
 #ifdef CONFIG_MMU
 	.mmap		   = tcp_mmap,
 #endif
-	.sendpage	   = inet_sendpage,
 	.sendmsg_locked    = tcp_sendmsg_locked,
-	.sendpage_locked   = tcp_sendpage_locked,
 	.splice_read	   = tcp_splice_read,
 	.read_sock	   = tcp_read_sock,
 	.read_skb	   = tcp_read_skb,
@@ -728,7 +726,6 @@ const struct proto_ops inet6_dgram_ops = {
 	.recvmsg	   = inet6_recvmsg,		/* retpoline's sake */
 	.read_skb	   = udp_read_skb,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = sock_no_sendpage,
 	.set_peek_off	   = sk_set_peek_off,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	   = inet6_compat_ioctl,
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index bac9ba747bde..c6c062678c0e 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -1298,7 +1298,6 @@ const struct proto_ops inet6_sockraw_ops = {
 	.sendmsg	   = inet_sendmsg,		/* ok		*/
 	.recvmsg	   = sock_common_recvmsg,	/* ok		*/
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = sock_no_sendpage,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	   = inet6_compat_ioctl,
 #endif
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 1bf93b61aa06..03ba1e389901 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -2151,7 +2151,6 @@ struct proto tcpv6_prot = {
 	.keepalive		= tcp_set_keepalive,
 	.recvmsg		= tcp_recvmsg,
 	.sendmsg		= tcp_sendmsg,
-	.sendpage		= tcp_sendpage,
 	.backlog_rcv		= tcp_v6_do_rcv,
 	.release_cb		= tcp_release_cb,
 	.hash			= inet6_hash,
diff --git a/net/key/af_key.c b/net/key/af_key.c
index a815f5ab4c49..bf59d42dc697 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -3757,7 +3757,6 @@ static const struct proto_ops pfkey_ops = {
 	.listen		=	sock_no_listen,
 	.shutdown	=	sock_no_shutdown,
 	.mmap		=	sock_no_mmap,
-	.sendpage	=	sock_no_sendpage,
 
 	/* Now the operations that really occur. */
 	.release	=	pfkey_release,
diff --git a/net/l2tp/l2tp_ip.c b/net/l2tp/l2tp_ip.c
index 4db5a554bdbd..d0dcbe3a4cd7 100644
--- a/net/l2tp/l2tp_ip.c
+++ b/net/l2tp/l2tp_ip.c
@@ -625,7 +625,6 @@ static const struct proto_ops l2tp_ip_ops = {
 	.sendmsg	   = inet_sendmsg,
 	.recvmsg	   = sock_common_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = sock_no_sendpage,
 };
 
 static struct inet_protosw l2tp_ip_protosw = {
diff --git a/net/l2tp/l2tp_ip6.c b/net/l2tp/l2tp_ip6.c
index 2478aa60145f..49296ce14a90 100644
--- a/net/l2tp/l2tp_ip6.c
+++ b/net/l2tp/l2tp_ip6.c
@@ -751,7 +751,6 @@ static const struct proto_ops l2tp_ip6_ops = {
 	.sendmsg	   = inet_sendmsg,
 	.recvmsg	   = sock_common_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = sock_no_sendpage,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	   = inet6_compat_ioctl,
 #endif
diff --git a/net/llc/af_llc.c b/net/llc/af_llc.c
index da7fe94bea2e..addd94da2a81 100644
--- a/net/llc/af_llc.c
+++ b/net/llc/af_llc.c
@@ -1230,7 +1230,6 @@ static const struct proto_ops llc_ui_ops = {
 	.sendmsg     = llc_ui_sendmsg,
 	.recvmsg     = llc_ui_recvmsg,
 	.mmap	     = sock_no_mmap,
-	.sendpage    = sock_no_sendpage,
 };
 
 static const char llc_proc_err_msg[] __initconst =
diff --git a/net/mctp/af_mctp.c b/net/mctp/af_mctp.c
index 3150f3f0c872..c6fe2e6b85dd 100644
--- a/net/mctp/af_mctp.c
+++ b/net/mctp/af_mctp.c
@@ -485,7 +485,6 @@ static const struct proto_ops mctp_dgram_ops = {
 	.sendmsg	= mctp_sendmsg,
 	.recvmsg	= mctp_recvmsg,
 	.mmap		= sock_no_mmap,
-	.sendpage	= sock_no_sendpage,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	= mctp_compat_ioctl,
 #endif
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 3ad9c46202fc..ade89b8d0082 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -3816,7 +3816,6 @@ static const struct proto_ops mptcp_stream_ops = {
 	.sendmsg	   = inet_sendmsg,
 	.recvmsg	   = inet_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = inet_sendpage,
 };
 
 static struct inet_protosw mptcp_protosw = {
@@ -3911,7 +3910,6 @@ static const struct proto_ops mptcp_v6_stream_ops = {
 	.sendmsg	   = inet6_sendmsg,
 	.recvmsg	   = inet6_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = inet_sendpage,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	   = inet6_compat_ioctl,
 #endif
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index c64277659753..f70073a3bb49 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -2841,7 +2841,6 @@ static const struct proto_ops netlink_ops = {
 	.sendmsg =	netlink_sendmsg,
 	.recvmsg =	netlink_recvmsg,
 	.mmap =		sock_no_mmap,
-	.sendpage =	sock_no_sendpage,
 };
 
 static const struct net_proto_family netlink_family_ops = {
diff --git a/net/netrom/af_netrom.c b/net/netrom/af_netrom.c
index 5a4cb796150f..eb8ccbd58df7 100644
--- a/net/netrom/af_netrom.c
+++ b/net/netrom/af_netrom.c
@@ -1364,7 +1364,6 @@ static const struct proto_ops nr_proto_ops = {
 	.sendmsg	=	nr_sendmsg,
 	.recvmsg	=	nr_recvmsg,
 	.mmap		=	sock_no_mmap,
-	.sendpage	=	sock_no_sendpage,
 };
 
 static struct notifier_block nr_dev_notifier = {
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index d4e76e2ae153..385bd4982b80 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -4604,7 +4604,6 @@ static const struct proto_ops packet_ops_spkt = {
 	.sendmsg =	packet_sendmsg_spkt,
 	.recvmsg =	packet_recvmsg,
 	.mmap =		sock_no_mmap,
-	.sendpage =	sock_no_sendpage,
 };
 
 static const struct proto_ops packet_ops = {
@@ -4626,7 +4625,6 @@ static const struct proto_ops packet_ops = {
 	.sendmsg =	packet_sendmsg,
 	.recvmsg =	packet_recvmsg,
 	.mmap =		packet_mmap,
-	.sendpage =	sock_no_sendpage,
 };
 
 static const struct net_proto_family packet_family_ops = {
diff --git a/net/phonet/socket.c b/net/phonet/socket.c
index 71e2caf6ab85..a246f7d0a817 100644
--- a/net/phonet/socket.c
+++ b/net/phonet/socket.c
@@ -441,7 +441,6 @@ const struct proto_ops phonet_dgram_ops = {
 	.sendmsg	= pn_socket_sendmsg,
 	.recvmsg	= sock_common_recvmsg,
 	.mmap		= sock_no_mmap,
-	.sendpage	= sock_no_sendpage,
 };
 
 const struct proto_ops phonet_stream_ops = {
@@ -462,7 +461,6 @@ const struct proto_ops phonet_stream_ops = {
 	.sendmsg	= pn_socket_sendmsg,
 	.recvmsg	= sock_common_recvmsg,
 	.mmap		= sock_no_mmap,
-	.sendpage	= sock_no_sendpage,
 };
 EXPORT_SYMBOL(phonet_stream_ops);
 
diff --git a/net/qrtr/af_qrtr.c b/net/qrtr/af_qrtr.c
index 5c2fb992803b..5bb7d680bd5f 100644
--- a/net/qrtr/af_qrtr.c
+++ b/net/qrtr/af_qrtr.c
@@ -1240,7 +1240,6 @@ static const struct proto_ops qrtr_proto_ops = {
 	.shutdown	= sock_no_shutdown,
 	.release	= qrtr_release,
 	.mmap		= sock_no_mmap,
-	.sendpage	= sock_no_sendpage,
 };
 
 static struct proto qrtr_proto = {
diff --git a/net/rds/af_rds.c b/net/rds/af_rds.c
index 3ff6995244e5..01c4cdfef45d 100644
--- a/net/rds/af_rds.c
+++ b/net/rds/af_rds.c
@@ -653,7 +653,6 @@ static const struct proto_ops rds_proto_ops = {
 	.sendmsg =	rds_sendmsg,
 	.recvmsg =	rds_recvmsg,
 	.mmap =		sock_no_mmap,
-	.sendpage =	sock_no_sendpage,
 };
 
 static void rds_sock_destruct(struct sock *sk)
diff --git a/net/rose/af_rose.c b/net/rose/af_rose.c
index ca2b17f32670..49dafe9ac72f 100644
--- a/net/rose/af_rose.c
+++ b/net/rose/af_rose.c
@@ -1496,7 +1496,6 @@ static const struct proto_ops rose_proto_ops = {
 	.sendmsg	=	rose_sendmsg,
 	.recvmsg	=	rose_recvmsg,
 	.mmap		=	sock_no_mmap,
-	.sendpage	=	sock_no_sendpage,
 };
 
 static struct notifier_block rose_dev_notifier = {
diff --git a/net/rxrpc/af_rxrpc.c b/net/rxrpc/af_rxrpc.c
index 102f5cbff91a..182495804f8f 100644
--- a/net/rxrpc/af_rxrpc.c
+++ b/net/rxrpc/af_rxrpc.c
@@ -938,7 +938,6 @@ static const struct proto_ops rxrpc_rpc_ops = {
 	.sendmsg	= rxrpc_sendmsg,
 	.recvmsg	= rxrpc_recvmsg,
 	.mmap		= sock_no_mmap,
-	.sendpage	= sock_no_sendpage,
 };
 
 static struct proto rxrpc_proto = {
diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
index c365df24ad33..acb2d2a69268 100644
--- a/net/sctp/protocol.c
+++ b/net/sctp/protocol.c
@@ -1135,7 +1135,6 @@ static const struct proto_ops inet_seqpacket_ops = {
 	.sendmsg	   = inet_sendmsg,
 	.recvmsg	   = inet_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = sock_no_sendpage,
 };
 
 /* Registration with AF_INET family.  */
diff --git a/net/socket.c b/net/socket.c
index 1b48a976b8cc..130d6ce7f82d 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -3541,54 +3541,6 @@ int kernel_getpeername(struct socket *sock, struct sockaddr *addr)
 }
 EXPORT_SYMBOL(kernel_getpeername);
 
-/**
- *	kernel_sendpage - send a &page through a socket (kernel space)
- *	@sock: socket
- *	@page: page
- *	@offset: page offset
- *	@size: total size in bytes
- *	@flags: flags (MSG_DONTWAIT, ...)
- *
- *	Returns the total amount sent in bytes or an error.
- */
-
-int kernel_sendpage(struct socket *sock, struct page *page, int offset,
-		    size_t size, int flags)
-{
-	if (sock->ops->sendpage) {
-		/* Warn in case the improper page to zero-copy send */
-		WARN_ONCE(!sendpage_ok(page), "improper page for zero-copy send");
-		return sock->ops->sendpage(sock, page, offset, size, flags);
-	}
-	return sock_no_sendpage(sock, page, offset, size, flags);
-}
-EXPORT_SYMBOL(kernel_sendpage);
-
-/**
- *	kernel_sendpage_locked - send a &page through the locked sock (kernel space)
- *	@sk: sock
- *	@page: page
- *	@offset: page offset
- *	@size: total size in bytes
- *	@flags: flags (MSG_DONTWAIT, ...)
- *
- *	Returns the total amount sent in bytes or an error.
- *	Caller must hold @sk.
- */
-
-int kernel_sendpage_locked(struct sock *sk, struct page *page, int offset,
-			   size_t size, int flags)
-{
-	struct socket *sock = sk->sk_socket;
-
-	if (sock->ops->sendpage_locked)
-		return sock->ops->sendpage_locked(sk, page, offset, size,
-						  flags);
-
-	return sock_no_sendpage_locked(sk, page, offset, size, flags);
-}
-EXPORT_SYMBOL(kernel_sendpage_locked);
-
 /**
  *	kernel_sock_shutdown - shut down part of a full-duplex connection (kernel space)
  *	@sock: socket
diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 37edfe10f8c6..d2072fbf3272 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -3375,7 +3375,6 @@ static const struct proto_ops msg_ops = {
 	.sendmsg	= tipc_sendmsg,
 	.recvmsg	= tipc_recvmsg,
 	.mmap		= sock_no_mmap,
-	.sendpage	= sock_no_sendpage
 };
 
 static const struct proto_ops packet_ops = {
@@ -3396,7 +3395,6 @@ static const struct proto_ops packet_ops = {
 	.sendmsg	= tipc_send_packet,
 	.recvmsg	= tipc_recvmsg,
 	.mmap		= sock_no_mmap,
-	.sendpage	= sock_no_sendpage
 };
 
 static const struct proto_ops stream_ops = {
@@ -3417,7 +3415,6 @@ static const struct proto_ops stream_ops = {
 	.sendmsg	= tipc_sendstream,
 	.recvmsg	= tipc_recvstream,
 	.mmap		= sock_no_mmap,
-	.sendpage	= sock_no_sendpage
 };
 
 static const struct net_proto_family tipc_family_ops = {
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 6f3454db9c53..407f449df564 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -758,8 +758,6 @@ static int unix_compat_ioctl(struct socket *sock, unsigned int cmd, unsigned lon
 static int unix_shutdown(struct socket *, int);
 static int unix_stream_sendmsg(struct socket *, struct msghdr *, size_t);
 static int unix_stream_recvmsg(struct socket *, struct msghdr *, size_t, int);
-static ssize_t unix_stream_sendpage(struct socket *, struct page *, int offset,
-				    size_t size, int flags);
 static ssize_t unix_stream_splice_read(struct socket *,  loff_t *ppos,
 				       struct pipe_inode_info *, size_t size,
 				       unsigned int flags);
@@ -852,7 +850,6 @@ static const struct proto_ops unix_stream_ops = {
 	.recvmsg =	unix_stream_recvmsg,
 	.read_skb =	unix_stream_read_skb,
 	.mmap =		sock_no_mmap,
-	.sendpage =	unix_stream_sendpage,
 	.splice_read =	unix_stream_splice_read,
 	.set_peek_off =	unix_set_peek_off,
 	.show_fdinfo =	unix_show_fdinfo,
@@ -878,7 +875,6 @@ static const struct proto_ops unix_dgram_ops = {
 	.read_skb =	unix_read_skb,
 	.recvmsg =	unix_dgram_recvmsg,
 	.mmap =		sock_no_mmap,
-	.sendpage =	sock_no_sendpage,
 	.set_peek_off =	unix_set_peek_off,
 	.show_fdinfo =	unix_show_fdinfo,
 };
@@ -902,7 +898,6 @@ static const struct proto_ops unix_seqpacket_ops = {
 	.sendmsg =	unix_seqpacket_sendmsg,
 	.recvmsg =	unix_seqpacket_recvmsg,
 	.mmap =		sock_no_mmap,
-	.sendpage =	sock_no_sendpage,
 	.set_peek_off =	unix_set_peek_off,
 	.show_fdinfo =	unix_show_fdinfo,
 };
@@ -1839,24 +1834,6 @@ static void maybe_add_creds(struct sk_buff *skb, const struct socket *sock,
 	}
 }
 
-static int maybe_init_creds(struct scm_cookie *scm,
-			    struct socket *socket,
-			    const struct sock *other)
-{
-	int err;
-	struct msghdr msg = { .msg_controllen = 0 };
-
-	err = scm_send(socket, &msg, scm, false);
-	if (err)
-		return err;
-
-	if (unix_passcred_enabled(socket, other)) {
-		scm->pid = get_pid(task_tgid(current));
-		current_uid_gid(&scm->creds.uid, &scm->creds.gid);
-	}
-	return err;
-}
-
 static bool unix_skb_scm_eq(struct sk_buff *skb,
 			    struct scm_cookie *scm)
 {
@@ -2318,122 +2295,6 @@ static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg,
 	return sent ? : err;
 }
 
-static ssize_t unix_stream_sendpage(struct socket *socket, struct page *page,
-				    int offset, size_t size, int flags)
-{
-	int err;
-	bool send_sigpipe = false;
-	bool init_scm = true;
-	struct scm_cookie scm;
-	struct sock *other, *sk = socket->sk;
-	struct sk_buff *skb, *newskb = NULL, *tail = NULL;
-
-	if (flags & MSG_OOB)
-		return -EOPNOTSUPP;
-
-	other = unix_peer(sk);
-	if (!other || sk->sk_state != TCP_ESTABLISHED)
-		return -ENOTCONN;
-
-	if (false) {
-alloc_skb:
-		unix_state_unlock(other);
-		mutex_unlock(&unix_sk(other)->iolock);
-		newskb = sock_alloc_send_pskb(sk, 0, 0, flags & MSG_DONTWAIT,
-					      &err, 0);
-		if (!newskb)
-			goto err;
-	}
-
-	/* we must acquire iolock as we modify already present
-	 * skbs in the sk_receive_queue and mess with skb->len
-	 */
-	err = mutex_lock_interruptible(&unix_sk(other)->iolock);
-	if (err) {
-		err = flags & MSG_DONTWAIT ? -EAGAIN : -ERESTARTSYS;
-		goto err;
-	}
-
-	if (sk->sk_shutdown & SEND_SHUTDOWN) {
-		err = -EPIPE;
-		send_sigpipe = true;
-		goto err_unlock;
-	}
-
-	unix_state_lock(other);
-
-	if (sock_flag(other, SOCK_DEAD) ||
-	    other->sk_shutdown & RCV_SHUTDOWN) {
-		err = -EPIPE;
-		send_sigpipe = true;
-		goto err_state_unlock;
-	}
-
-	if (init_scm) {
-		err = maybe_init_creds(&scm, socket, other);
-		if (err)
-			goto err_state_unlock;
-		init_scm = false;
-	}
-
-	skb = skb_peek_tail(&other->sk_receive_queue);
-	if (tail && tail == skb) {
-		skb = newskb;
-	} else if (!skb || !unix_skb_scm_eq(skb, &scm)) {
-		if (newskb) {
-			skb = newskb;
-		} else {
-			tail = skb;
-			goto alloc_skb;
-		}
-	} else if (newskb) {
-		/* this is fast path, we don't necessarily need to
-		 * call to kfree_skb even though with newskb == NULL
-		 * this - does no harm
-		 */
-		consume_skb(newskb);
-		newskb = NULL;
-	}
-
-	if (skb_append_pagefrags(skb, page, offset, size)) {
-		tail = skb;
-		goto alloc_skb;
-	}
-
-	skb->len += size;
-	skb->data_len += size;
-	skb->truesize += size;
-	refcount_add(size, &sk->sk_wmem_alloc);
-
-	if (newskb) {
-		err = unix_scm_to_skb(&scm, skb, false);
-		if (err)
-			goto err_state_unlock;
-		spin_lock(&other->sk_receive_queue.lock);
-		__skb_queue_tail(&other->sk_receive_queue, newskb);
-		spin_unlock(&other->sk_receive_queue.lock);
-	}
-
-	unix_state_unlock(other);
-	mutex_unlock(&unix_sk(other)->iolock);
-
-	other->sk_data_ready(other);
-	scm_destroy(&scm);
-	return size;
-
-err_state_unlock:
-	unix_state_unlock(other);
-err_unlock:
-	mutex_unlock(&unix_sk(other)->iolock);
-err:
-	kfree_skb(newskb);
-	if (send_sigpipe && !(flags & MSG_NOSIGNAL))
-		send_sig(SIGPIPE, current, 0);
-	if (!init_scm)
-		scm_destroy(&scm);
-	return err;
-}
-
 static int unix_seqpacket_sendmsg(struct socket *sock, struct msghdr *msg,
 				  size_t len)
 {
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 19aea7cba26e..d0e476755cdc 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1271,7 +1271,6 @@ static const struct proto_ops vsock_dgram_ops = {
 	.sendmsg = vsock_dgram_sendmsg,
 	.recvmsg = vsock_dgram_recvmsg,
 	.mmap = sock_no_mmap,
-	.sendpage = sock_no_sendpage,
 };
 
 static int vsock_transport_cancel_pkt(struct vsock_sock *vsk)
@@ -2186,7 +2185,6 @@ static const struct proto_ops vsock_stream_ops = {
 	.sendmsg = vsock_connectible_sendmsg,
 	.recvmsg = vsock_connectible_recvmsg,
 	.mmap = sock_no_mmap,
-	.sendpage = sock_no_sendpage,
 	.set_rcvlowat = vsock_set_rcvlowat,
 };
 
@@ -2208,7 +2206,6 @@ static const struct proto_ops vsock_seqpacket_ops = {
 	.sendmsg = vsock_connectible_sendmsg,
 	.recvmsg = vsock_connectible_recvmsg,
 	.mmap = sock_no_mmap,
-	.sendpage = sock_no_sendpage,
 };
 
 static int vsock_create(struct net *net, struct socket *sock,
diff --git a/net/x25/af_x25.c b/net/x25/af_x25.c
index 5c7ad301d742..0fb5143bec7a 100644
--- a/net/x25/af_x25.c
+++ b/net/x25/af_x25.c
@@ -1757,7 +1757,6 @@ static const struct proto_ops x25_proto_ops = {
 	.sendmsg =	x25_sendmsg,
 	.recvmsg =	x25_recvmsg,
 	.mmap =		sock_no_mmap,
-	.sendpage =	sock_no_sendpage,
 };
 
 static struct packet_type x25_packet_type __read_mostly = {
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 2ac58b282b5e..eff1f0aaa4b5 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -1386,7 +1386,6 @@ static const struct proto_ops xsk_proto_ops = {
 	.sendmsg	= xsk_sendmsg,
 	.recvmsg	= xsk_recvmsg,
 	.mmap		= xsk_mmap,
-	.sendpage	= sock_no_sendpage,
 };
 
 static void xsk_destruct(struct sock *sk)




* [RFC PATCH 28/28] sock: Remove ->sendpage*() in favour of sendmsg(MSG_SPLICE_PAGES)
@ 2023-03-16 15:26   ` David Howells
  0 siblings, 0 replies; 81+ messages in thread
From: David Howells @ 2023-03-16 15:26 UTC (permalink / raw)
  To: Matthew Wilcox, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe,
	Jeff Layton, Christian Brauner, Linus Torvalds, netdev,
	linux-fsdevel, linux-kernel, linux-mm, bpf, dccp, linux-afs,
	linux-arm-msm, linux-can, linux-crypto, linux-doc, linux-hams,
	linux-rdma, linux-sctp, linux-wpan, linux-x25, mptcp, rds-devel,
	tipc-discussion, virtualization

[!] Note: This is a work in progress.  At the moment, some things won't
    build if this patch is applied: nvme, kcm, smc and tls.

Remove ->sendpage() and ->sendpage_locked().  sendmsg() with
MSG_SPLICE_PAGES should be used instead.  This allows multiple pages and
multipage folios to be passed through.
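
As a rough illustration of the "multiple pages" point, here is a userspace
model only, not kernel code.  It contrasts a caller that submits one page per
call, as a ->sendpage() user had to, with a caller that hands the whole array
down in a single call.  Every model_* name and both flag values are
hypothetical stand-ins invented for this sketch: model_bvec merely mimics a
(page, length, offset) triple, and model_sendmsg() stands in for a protocol's
sendmsg() being handed an ITER_BVEC-type iterator.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag values for this model only; not the kernel's. */
enum {
	MODEL_MSG_MORE         = 1u << 0,
	MODEL_MSG_SPLICE_PAGES = 1u << 1,
};

/* Stand-in for struct page. */
struct model_page {
	int id;
};

/* Stand-in for struct bio_vec: a (page, length, offset) triple. */
struct model_bvec {
	struct model_page *page;
	unsigned int len;
	unsigned int offset;
};

/* Stand-in for a msghdr whose iterator is of the ITER_BVEC type. */
struct model_msg {
	unsigned int flags;
	const struct model_bvec *bvec;
	size_t nr_segs;
	size_t count;		/* total bytes described by bvec[] */
};

/* Number of times the pretend sendmsg entry point was entered. */
static unsigned int model_sendmsg_calls;

/* Pretend protocol sendmsg: accepts only spliced pages, returns bytes taken. */
static long model_sendmsg(const struct model_msg *msg)
{
	size_t i, total = 0;

	model_sendmsg_calls++;
	if (!(msg->flags & MODEL_MSG_SPLICE_PAGES))
		return -1;
	for (i = 0; i < msg->nr_segs; i++)
		total += msg->bvec[i].len;
	assert(total == msg->count);
	return (long)total;
}

/* Old shape: one call per page, as a ->sendpage() caller would do. */
static long model_send_per_page(const struct model_bvec *bv, size_t nr)
{
	long sent = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		struct model_msg msg = {
			.flags = MODEL_MSG_SPLICE_PAGES,
			.bvec = &bv[i],
			.nr_segs = 1,
			.count = bv[i].len,
		};
		long ret;

		/* Every page but the last hints that more data follows. */
		if (i + 1 < nr)
			msg.flags |= MODEL_MSG_MORE;
		ret = model_sendmsg(&msg);
		if (ret < 0)
			return ret;
		sent += ret;
	}
	return sent;
}

/* New shape: the whole array goes down in a single call. */
static long model_send_batched(const struct model_bvec *bv, size_t nr)
{
	struct model_msg msg = {
		.flags = MODEL_MSG_SPLICE_PAGES,
		.bvec = bv,
		.nr_segs = nr,
	};
	size_t i;

	for (i = 0; i < nr; i++)
		msg.count += bv[i].len;
	return model_sendmsg(&msg);
}
```

Both helpers report the same byte total for a given array.  The difference is
that the batched one enters model_sendmsg() once rather than once per page,
which is the opportunity the iterator-based interface is meant to open up.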

Signed-off-by: David Howells <dhowells@redhat.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: bpf@vger.kernel.org
cc: dccp@vger.kernel.org
cc: linux-afs@lists.infradead.org
cc: linux-arm-msm@vger.kernel.org
cc: linux-can@vger.kernel.org
cc: linux-crypto@vger.kernel.org
cc: linux-doc@vger.kernel.org
cc: linux-hams@vger.kernel.org
cc: linux-kernel@vger.kernel.org
cc: linux-rdma@vger.kernel.org
cc: linux-sctp@vger.kernel.org
cc: linux-wpan@vger.kernel.org
cc: linux-x25@vger.kernel.org
cc: mptcp@lists.linux.dev
cc: netdev@vger.kernel.org
cc: rds-devel@oss.oracle.com
cc: tipc-discussion@lists.sourceforge.net
cc: virtualization@lists.linux-foundation.org
---
 Documentation/networking/scaling.rst |   4 +-
 crypto/af_alg.c                      |  29 ------
 crypto/algif_aead.c                  |  22 +----
 crypto/algif_rng.c                   |   2 -
 crypto/algif_skcipher.c              |  14 ---
 include/linux/net.h                  |   8 --
 include/net/inet_common.h            |   2 -
 include/net/sock.h                   |   6 --
 net/appletalk/ddp.c                  |   1 -
 net/atm/pvc.c                        |   1 -
 net/atm/svc.c                        |   1 -
 net/ax25/af_ax25.c                   |   1 -
 net/caif/caif_socket.c               |   2 -
 net/can/bcm.c                        |   1 -
 net/can/isotp.c                      |   1 -
 net/can/j1939/socket.c               |   1 -
 net/can/raw.c                        |   1 -
 net/core/sock.c                      |  35 +------
 net/dccp/ipv4.c                      |   1 -
 net/dccp/ipv6.c                      |   1 -
 net/ieee802154/socket.c              |   2 -
 net/ipv4/af_inet.c                   |  21 ----
 net/ipv4/tcp.c                       |  36 -------
 net/ipv4/tcp_bpf.c                   |  21 +---
 net/ipv4/tcp_ipv4.c                  |   1 -
 net/ipv4/udp.c                       |  22 -----
 net/ipv4/udp_impl.h                  |   2 -
 net/ipv4/udplite.c                   |   1 -
 net/ipv6/af_inet6.c                  |   3 -
 net/ipv6/raw.c                       |   1 -
 net/ipv6/tcp_ipv6.c                  |   1 -
 net/key/af_key.c                     |   1 -
 net/l2tp/l2tp_ip.c                   |   1 -
 net/l2tp/l2tp_ip6.c                  |   1 -
 net/llc/af_llc.c                     |   1 -
 net/mctp/af_mctp.c                   |   1 -
 net/mptcp/protocol.c                 |   2 -
 net/netlink/af_netlink.c             |   1 -
 net/netrom/af_netrom.c               |   1 -
 net/packet/af_packet.c               |   2 -
 net/phonet/socket.c                  |   2 -
 net/qrtr/af_qrtr.c                   |   1 -
 net/rds/af_rds.c                     |   1 -
 net/rose/af_rose.c                   |   1 -
 net/rxrpc/af_rxrpc.c                 |   1 -
 net/sctp/protocol.c                  |   1 -
 net/socket.c                         |  48 ---------
 net/tipc/socket.c                    |   3 -
 net/unix/af_unix.c                   | 139 ---------------------------
 net/vmw_vsock/af_vsock.c             |   3 -
 net/x25/af_x25.c                     |   1 -
 net/xdp/xsk.c                        |   1 -
 52 files changed, 9 insertions(+), 449 deletions(-)

diff --git a/Documentation/networking/scaling.rst b/Documentation/networking/scaling.rst
index 3d435caa3ef2..92c9fb46d6a2 100644
--- a/Documentation/networking/scaling.rst
+++ b/Documentation/networking/scaling.rst
@@ -269,8 +269,8 @@ a single application thread handles flows with many different flow hashes.
 rps_sock_flow_table is a global flow table that contains the *desired* CPU
 for flows: the CPU that is currently processing the flow in userspace.
 Each table value is a CPU index that is updated during calls to recvmsg
-and sendmsg (specifically, inet_recvmsg(), inet_sendmsg(), inet_sendpage()
-and tcp_splice_read()).
+and sendmsg (specifically, inet_recvmsg(), inet_sendmsg() and
+tcp_splice_read()).
 
 When the scheduler moves a thread to a new CPU while it has outstanding
 receive packets on the old CPU, packets may arrive out of order. To
diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index 0e77fce60876..225c90657f58 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -483,7 +483,6 @@ static const struct proto_ops alg_proto_ops = {
 	.listen		=	sock_no_listen,
 	.shutdown	=	sock_no_shutdown,
 	.mmap		=	sock_no_mmap,
-	.sendpage	=	sock_no_sendpage,
 	.sendmsg	=	sock_no_sendmsg,
 	.recvmsg	=	sock_no_recvmsg,
 
@@ -1135,34 +1134,6 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
 }
 EXPORT_SYMBOL_GPL(af_alg_sendmsg);
 
-/**
- * af_alg_sendpage - sendpage system call handler
- * @sock: socket of connection to user space to write to
- * @page: data to send
- * @offset: offset into page to begin sending
- * @size: length of data
- * @flags: message send/receive flags
- *
- * This is a generic implementation of sendpage to fill ctx->tsgl_list.
- */
-ssize_t af_alg_sendpage(struct socket *sock, struct page *page,
-			int offset, size_t size, int flags)
-{
-	struct bio_vec bvec;
-	struct msghdr msg = {
-		.msg_flags = flags | MSG_SPLICE_PAGES,
-	};
-
-	bvec_set_page(&bvec, page, size, offset);
-	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
-
-	if (flags & MSG_SENDPAGE_NOTLAST)
-		msg.msg_flags |= MSG_MORE;
-
-	return sock_sendmsg(sock, &msg);
-}
-EXPORT_SYMBOL_GPL(af_alg_sendpage);
-
 /**
  * af_alg_free_resources - release resources required for crypto request
  * @areq: Request holding the TX and RX SGL
diff --git a/crypto/algif_aead.c b/crypto/algif_aead.c
index 279eb17a1dfc..b65baefe6123 100644
--- a/crypto/algif_aead.c
+++ b/crypto/algif_aead.c
@@ -9,10 +9,10 @@
  * The following concept of the memory management is used:
  *
  * The kernel maintains two SGLs, the TX SGL and the RX SGL. The TX SGL is
- * filled by user space with the data submitted via sendpage. Filling up
- * the TX SGL does not cause a crypto operation -- the data will only be
- * tracked by the kernel. Upon receipt of one recvmsg call, the caller must
- * provide a buffer which is tracked with the RX SGL.
+ * filled by user space with the data submitted via sendmsg (maybe with
+ * MSG_SPLICE_PAGES).  Filling up the TX SGL does not cause a crypto operation
+ * -- the data will only be tracked by the kernel. Upon receipt of one recvmsg
+ * call, the caller must provide a buffer which is tracked with the RX SGL.
  *
  * During the processing of the recvmsg operation, the cipher request is
  * allocated and prepared. As part of the recvmsg operation, the processed
@@ -368,7 +368,6 @@ static struct proto_ops algif_aead_ops = {
 
 	.release	=	af_alg_release,
 	.sendmsg	=	aead_sendmsg,
-	.sendpage	=	af_alg_sendpage,
 	.recvmsg	=	aead_recvmsg,
 	.poll		=	af_alg_poll,
 };
@@ -420,18 +419,6 @@ static int aead_sendmsg_nokey(struct socket *sock, struct msghdr *msg,
 	return aead_sendmsg(sock, msg, size);
 }
 
-static ssize_t aead_sendpage_nokey(struct socket *sock, struct page *page,
-				       int offset, size_t size, int flags)
-{
-	int err;
-
-	err = aead_check_key(sock);
-	if (err)
-		return err;
-
-	return af_alg_sendpage(sock, page, offset, size, flags);
-}
-
 static int aead_recvmsg_nokey(struct socket *sock, struct msghdr *msg,
 				  size_t ignored, int flags)
 {
@@ -459,7 +446,6 @@ static struct proto_ops algif_aead_ops_nokey = {
 
 	.release	=	af_alg_release,
 	.sendmsg	=	aead_sendmsg_nokey,
-	.sendpage	=	aead_sendpage_nokey,
 	.recvmsg	=	aead_recvmsg_nokey,
 	.poll		=	af_alg_poll,
 };
diff --git a/crypto/algif_rng.c b/crypto/algif_rng.c
index 407408c43730..10c41adac3b1 100644
--- a/crypto/algif_rng.c
+++ b/crypto/algif_rng.c
@@ -174,7 +174,6 @@ static struct proto_ops algif_rng_ops = {
 	.bind		=	sock_no_bind,
 	.accept		=	sock_no_accept,
 	.sendmsg	=	sock_no_sendmsg,
-	.sendpage	=	sock_no_sendpage,
 
 	.release	=	af_alg_release,
 	.recvmsg	=	rng_recvmsg,
@@ -192,7 +191,6 @@ static struct proto_ops __maybe_unused algif_rng_test_ops = {
 	.mmap		=	sock_no_mmap,
 	.bind		=	sock_no_bind,
 	.accept		=	sock_no_accept,
-	.sendpage	=	sock_no_sendpage,
 
 	.release	=	af_alg_release,
 	.recvmsg	=	rng_test_recvmsg,
diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c
index 021f9ce7e87c..b34e20400e80 100644
--- a/crypto/algif_skcipher.c
+++ b/crypto/algif_skcipher.c
@@ -194,7 +194,6 @@ static struct proto_ops algif_skcipher_ops = {
 
 	.release	=	af_alg_release,
 	.sendmsg	=	skcipher_sendmsg,
-	.sendpage	=	af_alg_sendpage,
 	.recvmsg	=	skcipher_recvmsg,
 	.poll		=	af_alg_poll,
 };
@@ -246,18 +245,6 @@ static int skcipher_sendmsg_nokey(struct socket *sock, struct msghdr *msg,
 	return skcipher_sendmsg(sock, msg, size);
 }
 
-static ssize_t skcipher_sendpage_nokey(struct socket *sock, struct page *page,
-				       int offset, size_t size, int flags)
-{
-	int err;
-
-	err = skcipher_check_key(sock);
-	if (err)
-		return err;
-
-	return af_alg_sendpage(sock, page, offset, size, flags);
-}
-
 static int skcipher_recvmsg_nokey(struct socket *sock, struct msghdr *msg,
 				  size_t ignored, int flags)
 {
@@ -285,7 +272,6 @@ static struct proto_ops algif_skcipher_ops_nokey = {
 
 	.release	=	af_alg_release,
 	.sendmsg	=	skcipher_sendmsg_nokey,
-	.sendpage	=	skcipher_sendpage_nokey,
 	.recvmsg	=	skcipher_recvmsg_nokey,
 	.poll		=	af_alg_poll,
 };
diff --git a/include/linux/net.h b/include/linux/net.h
index b73ad8e3c212..e5794968ac9f 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -206,8 +206,6 @@ struct proto_ops {
 				      size_t total_len, int flags);
 	int		(*mmap)	     (struct file *file, struct socket *sock,
 				      struct vm_area_struct * vma);
-	ssize_t		(*sendpage)  (struct socket *sock, struct page *page,
-				      int offset, size_t size, int flags);
 	ssize_t 	(*splice_read)(struct socket *sock,  loff_t *ppos,
 				       struct pipe_inode_info *pipe, size_t len, unsigned int flags);
 	int		(*set_peek_off)(struct sock *sk, int val);
@@ -220,8 +218,6 @@ struct proto_ops {
 				     sk_read_actor_t recv_actor);
 	/* This is different from read_sock(), it reads an entire skb at a time. */
 	int		(*read_skb)(struct sock *sk, skb_read_actor_t recv_actor);
-	int		(*sendpage_locked)(struct sock *sk, struct page *page,
-					   int offset, size_t size, int flags);
 	int		(*sendmsg_locked)(struct sock *sk, struct msghdr *msg,
 					  size_t size);
 	int		(*set_rcvlowat)(struct sock *sk, int val);
@@ -339,10 +335,6 @@ int kernel_connect(struct socket *sock, struct sockaddr *addr, int addrlen,
 		   int flags);
 int kernel_getsockname(struct socket *sock, struct sockaddr *addr);
 int kernel_getpeername(struct socket *sock, struct sockaddr *addr);
-int kernel_sendpage(struct socket *sock, struct page *page, int offset,
-		    size_t size, int flags);
-int kernel_sendpage_locked(struct sock *sk, struct page *page, int offset,
-			   size_t size, int flags);
 int kernel_sock_shutdown(struct socket *sock, enum sock_shutdown_cmd how);
 
 /* Routine returns the IP overhead imposed by a (caller-protected) socket. */
diff --git a/include/net/inet_common.h b/include/net/inet_common.h
index cec453c18f1d..054c3388fa51 100644
--- a/include/net/inet_common.h
+++ b/include/net/inet_common.h
@@ -33,8 +33,6 @@ int inet_accept(struct socket *sock, struct socket *newsock, int flags,
 		bool kern);
 int inet_send_prepare(struct sock *sk);
 int inet_sendmsg(struct socket *sock, struct msghdr *msg, size_t size);
-ssize_t inet_sendpage(struct socket *sock, struct page *page, int offset,
-		      size_t size, int flags);
 int inet_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
 		 int flags);
 int inet_shutdown(struct socket *sock, int how);
diff --git a/include/net/sock.h b/include/net/sock.h
index 573f2bf7e0de..4618cd21e16b 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1265,8 +1265,6 @@ struct proto {
 					   size_t len);
 	int			(*recvmsg)(struct sock *sk, struct msghdr *msg,
 					   size_t len, int flags, int *addr_len);
-	int			(*sendpage)(struct sock *sk, struct page *page,
-					int offset, size_t size, int flags);
 	int			(*bind)(struct sock *sk,
 					struct sockaddr *addr, int addr_len);
 	int			(*bind_add)(struct sock *sk,
@@ -1906,10 +1904,6 @@ int sock_no_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t len);
 int sock_no_recvmsg(struct socket *, struct msghdr *, size_t, int);
 int sock_no_mmap(struct file *file, struct socket *sock,
 		 struct vm_area_struct *vma);
-ssize_t sock_no_sendpage(struct socket *sock, struct page *page, int offset,
-			 size_t size, int flags);
-ssize_t sock_no_sendpage_locked(struct sock *sk, struct page *page,
-				int offset, size_t size, int flags);
 
 /*
  * Functions to fill in entries in struct proto_ops when a protocol
diff --git a/net/appletalk/ddp.c b/net/appletalk/ddp.c
index a06f4d4a6f47..8978fb6212ff 100644
--- a/net/appletalk/ddp.c
+++ b/net/appletalk/ddp.c
@@ -1929,7 +1929,6 @@ static const struct proto_ops atalk_dgram_ops = {
 	.sendmsg	= atalk_sendmsg,
 	.recvmsg	= atalk_recvmsg,
 	.mmap		= sock_no_mmap,
-	.sendpage	= sock_no_sendpage,
 };
 
 static struct notifier_block ddp_notifier = {
diff --git a/net/atm/pvc.c b/net/atm/pvc.c
index 53e7d3f39e26..66d9a9bd5896 100644
--- a/net/atm/pvc.c
+++ b/net/atm/pvc.c
@@ -126,7 +126,6 @@ static const struct proto_ops pvc_proto_ops = {
 	.sendmsg =	vcc_sendmsg,
 	.recvmsg =	vcc_recvmsg,
 	.mmap =		sock_no_mmap,
-	.sendpage =	sock_no_sendpage,
 };
 
 
diff --git a/net/atm/svc.c b/net/atm/svc.c
index 4a02bcaad279..289240fe234e 100644
--- a/net/atm/svc.c
+++ b/net/atm/svc.c
@@ -649,7 +649,6 @@ static const struct proto_ops svc_proto_ops = {
 	.sendmsg =	vcc_sendmsg,
 	.recvmsg =	vcc_recvmsg,
 	.mmap =		sock_no_mmap,
-	.sendpage =	sock_no_sendpage,
 };
 
 
diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c
index d8da400cb4de..5db805d5f74d 100644
--- a/net/ax25/af_ax25.c
+++ b/net/ax25/af_ax25.c
@@ -2022,7 +2022,6 @@ static const struct proto_ops ax25_proto_ops = {
 	.sendmsg	= ax25_sendmsg,
 	.recvmsg	= ax25_recvmsg,
 	.mmap		= sock_no_mmap,
-	.sendpage	= sock_no_sendpage,
 };
 
 /*
diff --git a/net/caif/caif_socket.c b/net/caif/caif_socket.c
index 4eebcc66c19a..9c82698da4f5 100644
--- a/net/caif/caif_socket.c
+++ b/net/caif/caif_socket.c
@@ -976,7 +976,6 @@ static const struct proto_ops caif_seqpacket_ops = {
 	.sendmsg = caif_seqpkt_sendmsg,
 	.recvmsg = caif_seqpkt_recvmsg,
 	.mmap = sock_no_mmap,
-	.sendpage = sock_no_sendpage,
 };
 
 static const struct proto_ops caif_stream_ops = {
@@ -996,7 +995,6 @@ static const struct proto_ops caif_stream_ops = {
 	.sendmsg = caif_stream_sendmsg,
 	.recvmsg = caif_stream_recvmsg,
 	.mmap = sock_no_mmap,
-	.sendpage = sock_no_sendpage,
 };
 
 /* This function is called when a socket is finally destroyed. */
diff --git a/net/can/bcm.c b/net/can/bcm.c
index 27706f6ace34..65a946a36d92 100644
--- a/net/can/bcm.c
+++ b/net/can/bcm.c
@@ -1699,7 +1699,6 @@ static const struct proto_ops bcm_ops = {
 	.sendmsg       = bcm_sendmsg,
 	.recvmsg       = bcm_recvmsg,
 	.mmap          = sock_no_mmap,
-	.sendpage      = sock_no_sendpage,
 };
 
 static struct proto bcm_proto __read_mostly = {
diff --git a/net/can/isotp.c b/net/can/isotp.c
index 9bc344851704..0c3d11c29a2b 100644
--- a/net/can/isotp.c
+++ b/net/can/isotp.c
@@ -1633,7 +1633,6 @@ static const struct proto_ops isotp_ops = {
 	.sendmsg = isotp_sendmsg,
 	.recvmsg = isotp_recvmsg,
 	.mmap = sock_no_mmap,
-	.sendpage = sock_no_sendpage,
 };
 
 static struct proto isotp_proto __read_mostly = {
diff --git a/net/can/j1939/socket.c b/net/can/j1939/socket.c
index 7e90f9e61d9b..2bfe4f79bb67 100644
--- a/net/can/j1939/socket.c
+++ b/net/can/j1939/socket.c
@@ -1301,7 +1301,6 @@ static const struct proto_ops j1939_ops = {
 	.sendmsg = j1939_sk_sendmsg,
 	.recvmsg = j1939_sk_recvmsg,
 	.mmap = sock_no_mmap,
-	.sendpage = sock_no_sendpage,
 };
 
 static struct proto j1939_proto __read_mostly = {
diff --git a/net/can/raw.c b/net/can/raw.c
index f64469b98260..15c79b079184 100644
--- a/net/can/raw.c
+++ b/net/can/raw.c
@@ -962,7 +962,6 @@ static const struct proto_ops raw_ops = {
 	.sendmsg       = raw_sendmsg,
 	.recvmsg       = raw_recvmsg,
 	.mmap          = sock_no_mmap,
-	.sendpage      = sock_no_sendpage,
 };
 
 static struct proto raw_proto __read_mostly = {
diff --git a/net/core/sock.c b/net/core/sock.c
index 341c565dbc26..c2ae77bb2075 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -3223,36 +3223,6 @@ void __receive_sock(struct file *file)
 	}
 }
 
-ssize_t sock_no_sendpage(struct socket *sock, struct page *page, int offset, size_t size, int flags)
-{
-	ssize_t res;
-	struct msghdr msg = {.msg_flags = flags};
-	struct kvec iov;
-	char *kaddr = kmap(page);
-	iov.iov_base = kaddr + offset;
-	iov.iov_len = size;
-	res = kernel_sendmsg(sock, &msg, &iov, 1, size);
-	kunmap(page);
-	return res;
-}
-EXPORT_SYMBOL(sock_no_sendpage);
-
-ssize_t sock_no_sendpage_locked(struct sock *sk, struct page *page,
-				int offset, size_t size, int flags)
-{
-	ssize_t res;
-	struct msghdr msg = {.msg_flags = flags};
-	struct kvec iov;
-	char *kaddr = kmap(page);
-
-	iov.iov_base = kaddr + offset;
-	iov.iov_len = size;
-	res = kernel_sendmsg_locked(sk, &msg, &iov, 1, size);
-	kunmap(page);
-	return res;
-}
-EXPORT_SYMBOL(sock_no_sendpage_locked);
-
 /*
  *	Default Socket Callbacks
  */
@@ -4008,7 +3978,7 @@ static void proto_seq_printf(struct seq_file *seq, struct proto *proto)
 {
 
 	seq_printf(seq, "%-9s %4u %6d  %6ld   %-3s %6u   %-3s  %-10s "
-			"%2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c\n",
+			"%2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c\n",
 		   proto->name,
 		   proto->obj_size,
 		   sock_prot_inuse_get(seq_file_net(seq), proto),
@@ -4029,7 +3999,6 @@ static void proto_seq_printf(struct seq_file *seq, struct proto *proto)
 		   proto_method_implemented(proto->getsockopt),
 		   proto_method_implemented(proto->sendmsg),
 		   proto_method_implemented(proto->recvmsg),
-		   proto_method_implemented(proto->sendpage),
 		   proto_method_implemented(proto->bind),
 		   proto_method_implemented(proto->backlog_rcv),
 		   proto_method_implemented(proto->hash),
@@ -4050,7 +4019,7 @@ static int proto_seq_show(struct seq_file *seq, void *v)
 			   "maxhdr",
 			   "slab",
 			   "module",
-			   "cl co di ac io in de sh ss gs se re sp bi br ha uh gp em\n");
+			   "cl co di ac io in de sh ss gs se re bi br ha uh gp em\n");
 	else
 		proto_seq_printf(seq, list_entry(v, struct proto, node));
 	return 0;
diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c
index b780827f5e0a..ea808de374ea 100644
--- a/net/dccp/ipv4.c
+++ b/net/dccp/ipv4.c
@@ -1008,7 +1008,6 @@ static const struct proto_ops inet_dccp_ops = {
 	.sendmsg	   = inet_sendmsg,
 	.recvmsg	   = sock_common_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = sock_no_sendpage,
 };
 
 static struct inet_protosw dccp_v4_protosw = {
diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c
index b9d7c3dd1cb3..23eb8159e3cd 100644
--- a/net/dccp/ipv6.c
+++ b/net/dccp/ipv6.c
@@ -1085,7 +1085,6 @@ static const struct proto_ops inet6_dccp_ops = {
 	.sendmsg	   = inet_sendmsg,
 	.recvmsg	   = sock_common_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = sock_no_sendpage,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	   = inet6_compat_ioctl,
 #endif
diff --git a/net/ieee802154/socket.c b/net/ieee802154/socket.c
index 1fa2fe041ec0..1238f036117f 100644
--- a/net/ieee802154/socket.c
+++ b/net/ieee802154/socket.c
@@ -426,7 +426,6 @@ static const struct proto_ops ieee802154_raw_ops = {
 	.sendmsg	   = ieee802154_sock_sendmsg,
 	.recvmsg	   = sock_common_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = sock_no_sendpage,
 };
 
 /* DGRAM Sockets (802.15.4 dataframes) */
@@ -990,7 +989,6 @@ static const struct proto_ops ieee802154_dgram_ops = {
 	.sendmsg	   = ieee802154_sock_sendmsg,
 	.recvmsg	   = sock_common_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = sock_no_sendpage,
 };
 
 static void ieee802154_sock_destruct(struct sock *sk)
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 8db6747f892f..869b49933f15 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -827,23 +827,6 @@ int inet_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
 }
 EXPORT_SYMBOL(inet_sendmsg);
 
-ssize_t inet_sendpage(struct socket *sock, struct page *page, int offset,
-		      size_t size, int flags)
-{
-	struct sock *sk = sock->sk;
-	const struct proto *prot;
-
-	if (unlikely(inet_send_prepare(sk)))
-		return -EAGAIN;
-
-	/* IPV6_ADDRFORM can change sk->sk_prot under us. */
-	prot = READ_ONCE(sk->sk_prot);
-	if (prot->sendpage)
-		return prot->sendpage(sk, page, offset, size, flags);
-	return sock_no_sendpage(sock, page, offset, size, flags);
-}
-EXPORT_SYMBOL(inet_sendpage);
-
 INDIRECT_CALLABLE_DECLARE(int udp_recvmsg(struct sock *, struct msghdr *,
 					  size_t, int, int *));
 int inet_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
@@ -1046,12 +1029,10 @@ const struct proto_ops inet_stream_ops = {
 #ifdef CONFIG_MMU
 	.mmap		   = tcp_mmap,
 #endif
-	.sendpage	   = inet_sendpage,
 	.splice_read	   = tcp_splice_read,
 	.read_sock	   = tcp_read_sock,
 	.read_skb	   = tcp_read_skb,
 	.sendmsg_locked    = tcp_sendmsg_locked,
-	.sendpage_locked   = tcp_sendpage_locked,
 	.peek_len	   = tcp_peek_len,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	   = inet_compat_ioctl,
@@ -1080,7 +1061,6 @@ const struct proto_ops inet_dgram_ops = {
 	.read_skb	   = udp_read_skb,
 	.recvmsg	   = inet_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = inet_sendpage,
 	.set_peek_off	   = sk_set_peek_off,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	   = inet_compat_ioctl,
@@ -1111,7 +1091,6 @@ static const struct proto_ops inet_sockraw_ops = {
 	.sendmsg	   = inet_sendmsg,
 	.recvmsg	   = inet_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = inet_sendpage,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	   = inet_compat_ioctl,
 #endif
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index f1454e4497df..26fa387f1084 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -971,42 +971,6 @@ static int tcp_wmem_schedule(struct sock *sk, int copy)
 	return min(copy, sk->sk_forward_alloc);
 }
 
-int tcp_sendpage_locked(struct sock *sk, struct page *page, int offset,
-			size_t size, int flags)
-{
-	struct bio_vec bvec;
-	struct msghdr msg = {
-		.msg_flags = flags | MSG_SPLICE_PAGES,
-	};
-
-	if (!(sk->sk_route_caps & NETIF_F_SG))
-		return sock_no_sendpage_locked(sk, page, offset, size, flags);
-
-	tcp_rate_check_app_limited(sk);  /* is sending application-limited? */
-
-	bvec_set_page(&bvec, page, size, offset);
-	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
-
-	if (flags & MSG_SENDPAGE_NOTLAST)
-		msg.msg_flags |= MSG_MORE;
-
-	return tcp_sendmsg_locked(sk, &msg, size);
-}
-EXPORT_SYMBOL_GPL(tcp_sendpage_locked);
-
-int tcp_sendpage(struct sock *sk, struct page *page, int offset,
-		 size_t size, int flags)
-{
-	int ret;
-
-	lock_sock(sk);
-	ret = tcp_sendpage_locked(sk, page, offset, size, flags);
-	release_sock(sk);
-
-	return ret;
-}
-EXPORT_SYMBOL(tcp_sendpage);
-
 void tcp_free_fastopen_req(struct tcp_sock *tp)
 {
 	if (tp->fastopen_req) {
diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index de37a4372437..ab83cfb9de22 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -482,23 +482,6 @@ static int tcp_bpf_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 	return copied ? copied : err;
 }
 
-static int tcp_bpf_sendpage(struct sock *sk, struct page *page, int offset,
-			    size_t size, int flags)
-{
-	struct bio_vec bvec;
-	struct msghdr msg = {
-		.msg_flags = flags | MSG_SPLICE_PAGES,
-	};
-
-	bvec_set_page(&bvec, page, size, offset);
-	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
-
-	if (flags & MSG_SENDPAGE_NOTLAST)
-		msg.msg_flags |= MSG_MORE;
-
-	return tcp_bpf_sendmsg(sk, &msg, size);
-}
-
 enum {
 	TCP_BPF_IPV4,
 	TCP_BPF_IPV6,
@@ -528,7 +511,6 @@ static void tcp_bpf_rebuild_protos(struct proto prot[TCP_BPF_NUM_CFGS],
 
 	prot[TCP_BPF_TX]			= prot[TCP_BPF_BASE];
 	prot[TCP_BPF_TX].sendmsg		= tcp_bpf_sendmsg;
-	prot[TCP_BPF_TX].sendpage		= tcp_bpf_sendpage;
 
 	prot[TCP_BPF_RX]			= prot[TCP_BPF_BASE];
 	prot[TCP_BPF_RX].recvmsg		= tcp_bpf_recvmsg_parser;
@@ -563,8 +545,7 @@ static int tcp_bpf_assert_proto_ops(struct proto *ops)
 	 * indeed valid assumptions.
 	 */
 	return ops->recvmsg  == tcp_recvmsg &&
-	       ops->sendmsg  == tcp_sendmsg &&
-	       ops->sendpage == tcp_sendpage ? 0 : -ENOTSUPP;
+	       ops->sendmsg  == tcp_sendmsg ? 0 : -ENOTSUPP;
 }
 
 int tcp_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore)
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index ea370afa70ed..5c2e1c1ca329 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -3112,7 +3112,6 @@ struct proto tcp_prot = {
 	.keepalive		= tcp_set_keepalive,
 	.recvmsg		= tcp_recvmsg,
 	.sendmsg		= tcp_sendmsg,
-	.sendpage		= tcp_sendpage,
 	.backlog_rcv		= tcp_v4_do_rcv,
 	.release_cb		= tcp_release_cb,
 	.hash			= inet_hash,
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 097feb92e215..85bd5960f7ef 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1329,27 +1329,6 @@ int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 }
 EXPORT_SYMBOL(udp_sendmsg);
 
-int udp_sendpage(struct sock *sk, struct page *page, int offset,
-		 size_t size, int flags)
-{
-	struct bio_vec bvec;
-	struct msghdr msg = {
-		.msg_flags = flags | MSG_SPLICE_PAGES | MSG_MORE
-	};
-	int ret;
-
-	bvec_set_page(&bvec, page, size, offset);
-	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
-
-	if (flags & MSG_SENDPAGE_NOTLAST)
-		msg.msg_flags |= MSG_MORE;
-
-	lock_sock(sk);
-	ret = udp_sendmsg(sk, &msg, size);
-	release_sock(sk);
-	return ret;
-}
-
 #define UDP_SKB_IS_STATELESS 0x80000000
 
 /* all head states (dst, sk, nf conntrack) except skb extensions are
@@ -2926,7 +2905,6 @@ struct proto udp_prot = {
 	.getsockopt		= udp_getsockopt,
 	.sendmsg		= udp_sendmsg,
 	.recvmsg		= udp_recvmsg,
-	.sendpage		= udp_sendpage,
 	.release_cb		= ip4_datagram_release_cb,
 	.hash			= udp_lib_hash,
 	.unhash			= udp_lib_unhash,
diff --git a/net/ipv4/udp_impl.h b/net/ipv4/udp_impl.h
index 4ba7a88a1b1d..e1ff3a375996 100644
--- a/net/ipv4/udp_impl.h
+++ b/net/ipv4/udp_impl.h
@@ -19,8 +19,6 @@ int udp_getsockopt(struct sock *sk, int level, int optname,
 
 int udp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int flags,
 		int *addr_len);
-int udp_sendpage(struct sock *sk, struct page *page, int offset, size_t size,
-		 int flags);
 void udp_destroy_sock(struct sock *sk);
 
 #ifdef CONFIG_PROC_FS
diff --git a/net/ipv4/udplite.c b/net/ipv4/udplite.c
index e0c9cc39b81e..69870f0afc6c 100644
--- a/net/ipv4/udplite.c
+++ b/net/ipv4/udplite.c
@@ -54,7 +54,6 @@ struct proto 	udplite_prot = {
 	.getsockopt	   = udp_getsockopt,
 	.sendmsg	   = udp_sendmsg,
 	.recvmsg	   = udp_recvmsg,
-	.sendpage	   = udp_sendpage,
 	.hash		   = udp_lib_hash,
 	.unhash		   = udp_lib_unhash,
 	.rehash		   = udp_v4_rehash,
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index 38689bedfce7..769c76d59053 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -695,9 +695,7 @@ const struct proto_ops inet6_stream_ops = {
 #ifdef CONFIG_MMU
 	.mmap		   = tcp_mmap,
 #endif
-	.sendpage	   = inet_sendpage,
 	.sendmsg_locked    = tcp_sendmsg_locked,
-	.sendpage_locked   = tcp_sendpage_locked,
 	.splice_read	   = tcp_splice_read,
 	.read_sock	   = tcp_read_sock,
 	.read_skb	   = tcp_read_skb,
@@ -728,7 +726,6 @@ const struct proto_ops inet6_dgram_ops = {
 	.recvmsg	   = inet6_recvmsg,		/* retpoline's sake */
 	.read_skb	   = udp_read_skb,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = sock_no_sendpage,
 	.set_peek_off	   = sk_set_peek_off,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	   = inet6_compat_ioctl,
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index bac9ba747bde..c6c062678c0e 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -1298,7 +1298,6 @@ const struct proto_ops inet6_sockraw_ops = {
 	.sendmsg	   = inet_sendmsg,		/* ok		*/
 	.recvmsg	   = sock_common_recvmsg,	/* ok		*/
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = sock_no_sendpage,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	   = inet6_compat_ioctl,
 #endif
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 1bf93b61aa06..03ba1e389901 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -2151,7 +2151,6 @@ struct proto tcpv6_prot = {
 	.keepalive		= tcp_set_keepalive,
 	.recvmsg		= tcp_recvmsg,
 	.sendmsg		= tcp_sendmsg,
-	.sendpage		= tcp_sendpage,
 	.backlog_rcv		= tcp_v6_do_rcv,
 	.release_cb		= tcp_release_cb,
 	.hash			= inet6_hash,
diff --git a/net/key/af_key.c b/net/key/af_key.c
index a815f5ab4c49..bf59d42dc697 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -3757,7 +3757,6 @@ static const struct proto_ops pfkey_ops = {
 	.listen		=	sock_no_listen,
 	.shutdown	=	sock_no_shutdown,
 	.mmap		=	sock_no_mmap,
-	.sendpage	=	sock_no_sendpage,
 
 	/* Now the operations that really occur. */
 	.release	=	pfkey_release,
diff --git a/net/l2tp/l2tp_ip.c b/net/l2tp/l2tp_ip.c
index 4db5a554bdbd..d0dcbe3a4cd7 100644
--- a/net/l2tp/l2tp_ip.c
+++ b/net/l2tp/l2tp_ip.c
@@ -625,7 +625,6 @@ static const struct proto_ops l2tp_ip_ops = {
 	.sendmsg	   = inet_sendmsg,
 	.recvmsg	   = sock_common_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = sock_no_sendpage,
 };
 
 static struct inet_protosw l2tp_ip_protosw = {
diff --git a/net/l2tp/l2tp_ip6.c b/net/l2tp/l2tp_ip6.c
index 2478aa60145f..49296ce14a90 100644
--- a/net/l2tp/l2tp_ip6.c
+++ b/net/l2tp/l2tp_ip6.c
@@ -751,7 +751,6 @@ static const struct proto_ops l2tp_ip6_ops = {
 	.sendmsg	   = inet_sendmsg,
 	.recvmsg	   = sock_common_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = sock_no_sendpage,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	   = inet6_compat_ioctl,
 #endif
diff --git a/net/llc/af_llc.c b/net/llc/af_llc.c
index da7fe94bea2e..addd94da2a81 100644
--- a/net/llc/af_llc.c
+++ b/net/llc/af_llc.c
@@ -1230,7 +1230,6 @@ static const struct proto_ops llc_ui_ops = {
 	.sendmsg     = llc_ui_sendmsg,
 	.recvmsg     = llc_ui_recvmsg,
 	.mmap	     = sock_no_mmap,
-	.sendpage    = sock_no_sendpage,
 };
 
 static const char llc_proc_err_msg[] __initconst =
diff --git a/net/mctp/af_mctp.c b/net/mctp/af_mctp.c
index 3150f3f0c872..c6fe2e6b85dd 100644
--- a/net/mctp/af_mctp.c
+++ b/net/mctp/af_mctp.c
@@ -485,7 +485,6 @@ static const struct proto_ops mctp_dgram_ops = {
 	.sendmsg	= mctp_sendmsg,
 	.recvmsg	= mctp_recvmsg,
 	.mmap		= sock_no_mmap,
-	.sendpage	= sock_no_sendpage,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	= mctp_compat_ioctl,
 #endif
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 3ad9c46202fc..ade89b8d0082 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -3816,7 +3816,6 @@ static const struct proto_ops mptcp_stream_ops = {
 	.sendmsg	   = inet_sendmsg,
 	.recvmsg	   = inet_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = inet_sendpage,
 };
 
 static struct inet_protosw mptcp_protosw = {
@@ -3911,7 +3910,6 @@ static const struct proto_ops mptcp_v6_stream_ops = {
 	.sendmsg	   = inet6_sendmsg,
 	.recvmsg	   = inet6_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = inet_sendpage,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	   = inet6_compat_ioctl,
 #endif
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index c64277659753..f70073a3bb49 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -2841,7 +2841,6 @@ static const struct proto_ops netlink_ops = {
 	.sendmsg =	netlink_sendmsg,
 	.recvmsg =	netlink_recvmsg,
 	.mmap =		sock_no_mmap,
-	.sendpage =	sock_no_sendpage,
 };
 
 static const struct net_proto_family netlink_family_ops = {
diff --git a/net/netrom/af_netrom.c b/net/netrom/af_netrom.c
index 5a4cb796150f..eb8ccbd58df7 100644
--- a/net/netrom/af_netrom.c
+++ b/net/netrom/af_netrom.c
@@ -1364,7 +1364,6 @@ static const struct proto_ops nr_proto_ops = {
 	.sendmsg	=	nr_sendmsg,
 	.recvmsg	=	nr_recvmsg,
 	.mmap		=	sock_no_mmap,
-	.sendpage	=	sock_no_sendpage,
 };
 
 static struct notifier_block nr_dev_notifier = {
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index d4e76e2ae153..385bd4982b80 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -4604,7 +4604,6 @@ static const struct proto_ops packet_ops_spkt = {
 	.sendmsg =	packet_sendmsg_spkt,
 	.recvmsg =	packet_recvmsg,
 	.mmap =		sock_no_mmap,
-	.sendpage =	sock_no_sendpage,
 };
 
 static const struct proto_ops packet_ops = {
@@ -4626,7 +4625,6 @@ static const struct proto_ops packet_ops = {
 	.sendmsg =	packet_sendmsg,
 	.recvmsg =	packet_recvmsg,
 	.mmap =		packet_mmap,
-	.sendpage =	sock_no_sendpage,
 };
 
 static const struct net_proto_family packet_family_ops = {
diff --git a/net/phonet/socket.c b/net/phonet/socket.c
index 71e2caf6ab85..a246f7d0a817 100644
--- a/net/phonet/socket.c
+++ b/net/phonet/socket.c
@@ -441,7 +441,6 @@ const struct proto_ops phonet_dgram_ops = {
 	.sendmsg	= pn_socket_sendmsg,
 	.recvmsg	= sock_common_recvmsg,
 	.mmap		= sock_no_mmap,
-	.sendpage	= sock_no_sendpage,
 };
 
 const struct proto_ops phonet_stream_ops = {
@@ -462,7 +461,6 @@ const struct proto_ops phonet_stream_ops = {
 	.sendmsg	= pn_socket_sendmsg,
 	.recvmsg	= sock_common_recvmsg,
 	.mmap		= sock_no_mmap,
-	.sendpage	= sock_no_sendpage,
 };
 EXPORT_SYMBOL(phonet_stream_ops);
 
diff --git a/net/qrtr/af_qrtr.c b/net/qrtr/af_qrtr.c
index 5c2fb992803b..5bb7d680bd5f 100644
--- a/net/qrtr/af_qrtr.c
+++ b/net/qrtr/af_qrtr.c
@@ -1240,7 +1240,6 @@ static const struct proto_ops qrtr_proto_ops = {
 	.shutdown	= sock_no_shutdown,
 	.release	= qrtr_release,
 	.mmap		= sock_no_mmap,
-	.sendpage	= sock_no_sendpage,
 };
 
 static struct proto qrtr_proto = {
diff --git a/net/rds/af_rds.c b/net/rds/af_rds.c
index 3ff6995244e5..01c4cdfef45d 100644
--- a/net/rds/af_rds.c
+++ b/net/rds/af_rds.c
@@ -653,7 +653,6 @@ static const struct proto_ops rds_proto_ops = {
 	.sendmsg =	rds_sendmsg,
 	.recvmsg =	rds_recvmsg,
 	.mmap =		sock_no_mmap,
-	.sendpage =	sock_no_sendpage,
 };
 
 static void rds_sock_destruct(struct sock *sk)
diff --git a/net/rose/af_rose.c b/net/rose/af_rose.c
index ca2b17f32670..49dafe9ac72f 100644
--- a/net/rose/af_rose.c
+++ b/net/rose/af_rose.c
@@ -1496,7 +1496,6 @@ static const struct proto_ops rose_proto_ops = {
 	.sendmsg	=	rose_sendmsg,
 	.recvmsg	=	rose_recvmsg,
 	.mmap		=	sock_no_mmap,
-	.sendpage	=	sock_no_sendpage,
 };
 
 static struct notifier_block rose_dev_notifier = {
diff --git a/net/rxrpc/af_rxrpc.c b/net/rxrpc/af_rxrpc.c
index 102f5cbff91a..182495804f8f 100644
--- a/net/rxrpc/af_rxrpc.c
+++ b/net/rxrpc/af_rxrpc.c
@@ -938,7 +938,6 @@ static const struct proto_ops rxrpc_rpc_ops = {
 	.sendmsg	= rxrpc_sendmsg,
 	.recvmsg	= rxrpc_recvmsg,
 	.mmap		= sock_no_mmap,
-	.sendpage	= sock_no_sendpage,
 };
 
 static struct proto rxrpc_proto = {
diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
index c365df24ad33..acb2d2a69268 100644
--- a/net/sctp/protocol.c
+++ b/net/sctp/protocol.c
@@ -1135,7 +1135,6 @@ static const struct proto_ops inet_seqpacket_ops = {
 	.sendmsg	   = inet_sendmsg,
 	.recvmsg	   = inet_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = sock_no_sendpage,
 };
 
 /* Registration with AF_INET family.  */
diff --git a/net/socket.c b/net/socket.c
index 1b48a976b8cc..130d6ce7f82d 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -3541,54 +3541,6 @@ int kernel_getpeername(struct socket *sock, struct sockaddr *addr)
 }
 EXPORT_SYMBOL(kernel_getpeername);
 
-/**
- *	kernel_sendpage - send a &page through a socket (kernel space)
- *	@sock: socket
- *	@page: page
- *	@offset: page offset
- *	@size: total size in bytes
- *	@flags: flags (MSG_DONTWAIT, ...)
- *
- *	Returns the total amount sent in bytes or an error.
- */
-
-int kernel_sendpage(struct socket *sock, struct page *page, int offset,
-		    size_t size, int flags)
-{
-	if (sock->ops->sendpage) {
-		/* Warn in case the improper page to zero-copy send */
-		WARN_ONCE(!sendpage_ok(page), "improper page for zero-copy send");
-		return sock->ops->sendpage(sock, page, offset, size, flags);
-	}
-	return sock_no_sendpage(sock, page, offset, size, flags);
-}
-EXPORT_SYMBOL(kernel_sendpage);
-
-/**
- *	kernel_sendpage_locked - send a &page through the locked sock (kernel space)
- *	@sk: sock
- *	@page: page
- *	@offset: page offset
- *	@size: total size in bytes
- *	@flags: flags (MSG_DONTWAIT, ...)
- *
- *	Returns the total amount sent in bytes or an error.
- *	Caller must hold @sk.
- */
-
-int kernel_sendpage_locked(struct sock *sk, struct page *page, int offset,
-			   size_t size, int flags)
-{
-	struct socket *sock = sk->sk_socket;
-
-	if (sock->ops->sendpage_locked)
-		return sock->ops->sendpage_locked(sk, page, offset, size,
-						  flags);
-
-	return sock_no_sendpage_locked(sk, page, offset, size, flags);
-}
-EXPORT_SYMBOL(kernel_sendpage_locked);
-
 /**
  *	kernel_sock_shutdown - shut down part of a full-duplex connection (kernel space)
  *	@sock: socket
diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 37edfe10f8c6..d2072fbf3272 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -3375,7 +3375,6 @@ static const struct proto_ops msg_ops = {
 	.sendmsg	= tipc_sendmsg,
 	.recvmsg	= tipc_recvmsg,
 	.mmap		= sock_no_mmap,
-	.sendpage	= sock_no_sendpage
 };
 
 static const struct proto_ops packet_ops = {
@@ -3396,7 +3395,6 @@ static const struct proto_ops packet_ops = {
 	.sendmsg	= tipc_send_packet,
 	.recvmsg	= tipc_recvmsg,
 	.mmap		= sock_no_mmap,
-	.sendpage	= sock_no_sendpage
 };
 
 static const struct proto_ops stream_ops = {
@@ -3417,7 +3415,6 @@ static const struct proto_ops stream_ops = {
 	.sendmsg	= tipc_sendstream,
 	.recvmsg	= tipc_recvstream,
 	.mmap		= sock_no_mmap,
-	.sendpage	= sock_no_sendpage
 };
 
 static const struct net_proto_family tipc_family_ops = {
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 6f3454db9c53..407f449df564 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -758,8 +758,6 @@ static int unix_compat_ioctl(struct socket *sock, unsigned int cmd, unsigned lon
 static int unix_shutdown(struct socket *, int);
 static int unix_stream_sendmsg(struct socket *, struct msghdr *, size_t);
 static int unix_stream_recvmsg(struct socket *, struct msghdr *, size_t, int);
-static ssize_t unix_stream_sendpage(struct socket *, struct page *, int offset,
-				    size_t size, int flags);
 static ssize_t unix_stream_splice_read(struct socket *,  loff_t *ppos,
 				       struct pipe_inode_info *, size_t size,
 				       unsigned int flags);
@@ -852,7 +850,6 @@ static const struct proto_ops unix_stream_ops = {
 	.recvmsg =	unix_stream_recvmsg,
 	.read_skb =	unix_stream_read_skb,
 	.mmap =		sock_no_mmap,
-	.sendpage =	unix_stream_sendpage,
 	.splice_read =	unix_stream_splice_read,
 	.set_peek_off =	unix_set_peek_off,
 	.show_fdinfo =	unix_show_fdinfo,
@@ -878,7 +875,6 @@ static const struct proto_ops unix_dgram_ops = {
 	.read_skb =	unix_read_skb,
 	.recvmsg =	unix_dgram_recvmsg,
 	.mmap =		sock_no_mmap,
-	.sendpage =	sock_no_sendpage,
 	.set_peek_off =	unix_set_peek_off,
 	.show_fdinfo =	unix_show_fdinfo,
 };
@@ -902,7 +898,6 @@ static const struct proto_ops unix_seqpacket_ops = {
 	.sendmsg =	unix_seqpacket_sendmsg,
 	.recvmsg =	unix_seqpacket_recvmsg,
 	.mmap =		sock_no_mmap,
-	.sendpage =	sock_no_sendpage,
 	.set_peek_off =	unix_set_peek_off,
 	.show_fdinfo =	unix_show_fdinfo,
 };
@@ -1839,24 +1834,6 @@ static void maybe_add_creds(struct sk_buff *skb, const struct socket *sock,
 	}
 }
 
-static int maybe_init_creds(struct scm_cookie *scm,
-			    struct socket *socket,
-			    const struct sock *other)
-{
-	int err;
-	struct msghdr msg = { .msg_controllen = 0 };
-
-	err = scm_send(socket, &msg, scm, false);
-	if (err)
-		return err;
-
-	if (unix_passcred_enabled(socket, other)) {
-		scm->pid = get_pid(task_tgid(current));
-		current_uid_gid(&scm->creds.uid, &scm->creds.gid);
-	}
-	return err;
-}
-
 static bool unix_skb_scm_eq(struct sk_buff *skb,
 			    struct scm_cookie *scm)
 {
@@ -2318,122 +2295,6 @@ static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg,
 	return sent ? : err;
 }
 
-static ssize_t unix_stream_sendpage(struct socket *socket, struct page *page,
-				    int offset, size_t size, int flags)
-{
-	int err;
-	bool send_sigpipe = false;
-	bool init_scm = true;
-	struct scm_cookie scm;
-	struct sock *other, *sk = socket->sk;
-	struct sk_buff *skb, *newskb = NULL, *tail = NULL;
-
-	if (flags & MSG_OOB)
-		return -EOPNOTSUPP;
-
-	other = unix_peer(sk);
-	if (!other || sk->sk_state != TCP_ESTABLISHED)
-		return -ENOTCONN;
-
-	if (false) {
-alloc_skb:
-		unix_state_unlock(other);
-		mutex_unlock(&unix_sk(other)->iolock);
-		newskb = sock_alloc_send_pskb(sk, 0, 0, flags & MSG_DONTWAIT,
-					      &err, 0);
-		if (!newskb)
-			goto err;
-	}
-
-	/* we must acquire iolock as we modify already present
-	 * skbs in the sk_receive_queue and mess with skb->len
-	 */
-	err = mutex_lock_interruptible(&unix_sk(other)->iolock);
-	if (err) {
-		err = flags & MSG_DONTWAIT ? -EAGAIN : -ERESTARTSYS;
-		goto err;
-	}
-
-	if (sk->sk_shutdown & SEND_SHUTDOWN) {
-		err = -EPIPE;
-		send_sigpipe = true;
-		goto err_unlock;
-	}
-
-	unix_state_lock(other);
-
-	if (sock_flag(other, SOCK_DEAD) ||
-	    other->sk_shutdown & RCV_SHUTDOWN) {
-		err = -EPIPE;
-		send_sigpipe = true;
-		goto err_state_unlock;
-	}
-
-	if (init_scm) {
-		err = maybe_init_creds(&scm, socket, other);
-		if (err)
-			goto err_state_unlock;
-		init_scm = false;
-	}
-
-	skb = skb_peek_tail(&other->sk_receive_queue);
-	if (tail && tail == skb) {
-		skb = newskb;
-	} else if (!skb || !unix_skb_scm_eq(skb, &scm)) {
-		if (newskb) {
-			skb = newskb;
-		} else {
-			tail = skb;
-			goto alloc_skb;
-		}
-	} else if (newskb) {
-		/* this is fast path, we don't necessarily need to
-		 * call to kfree_skb even though with newskb == NULL
-		 * this - does no harm
-		 */
-		consume_skb(newskb);
-		newskb = NULL;
-	}
-
-	if (skb_append_pagefrags(skb, page, offset, size)) {
-		tail = skb;
-		goto alloc_skb;
-	}
-
-	skb->len += size;
-	skb->data_len += size;
-	skb->truesize += size;
-	refcount_add(size, &sk->sk_wmem_alloc);
-
-	if (newskb) {
-		err = unix_scm_to_skb(&scm, skb, false);
-		if (err)
-			goto err_state_unlock;
-		spin_lock(&other->sk_receive_queue.lock);
-		__skb_queue_tail(&other->sk_receive_queue, newskb);
-		spin_unlock(&other->sk_receive_queue.lock);
-	}
-
-	unix_state_unlock(other);
-	mutex_unlock(&unix_sk(other)->iolock);
-
-	other->sk_data_ready(other);
-	scm_destroy(&scm);
-	return size;
-
-err_state_unlock:
-	unix_state_unlock(other);
-err_unlock:
-	mutex_unlock(&unix_sk(other)->iolock);
-err:
-	kfree_skb(newskb);
-	if (send_sigpipe && !(flags & MSG_NOSIGNAL))
-		send_sig(SIGPIPE, current, 0);
-	if (!init_scm)
-		scm_destroy(&scm);
-	return err;
-}
-
 static int unix_seqpacket_sendmsg(struct socket *sock, struct msghdr *msg,
 				  size_t len)
 {
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 19aea7cba26e..d0e476755cdc 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1271,7 +1271,6 @@ static const struct proto_ops vsock_dgram_ops = {
 	.sendmsg = vsock_dgram_sendmsg,
 	.recvmsg = vsock_dgram_recvmsg,
 	.mmap = sock_no_mmap,
-	.sendpage = sock_no_sendpage,
 };
 
 static int vsock_transport_cancel_pkt(struct vsock_sock *vsk)
@@ -2186,7 +2185,6 @@ static const struct proto_ops vsock_stream_ops = {
 	.sendmsg = vsock_connectible_sendmsg,
 	.recvmsg = vsock_connectible_recvmsg,
 	.mmap = sock_no_mmap,
-	.sendpage = sock_no_sendpage,
 	.set_rcvlowat = vsock_set_rcvlowat,
 };
 
@@ -2208,7 +2206,6 @@ static const struct proto_ops vsock_seqpacket_ops = {
 	.sendmsg = vsock_connectible_sendmsg,
 	.recvmsg = vsock_connectible_recvmsg,
 	.mmap = sock_no_mmap,
-	.sendpage = sock_no_sendpage,
 };
 
 static int vsock_create(struct net *net, struct socket *sock,
diff --git a/net/x25/af_x25.c b/net/x25/af_x25.c
index 5c7ad301d742..0fb5143bec7a 100644
--- a/net/x25/af_x25.c
+++ b/net/x25/af_x25.c
@@ -1757,7 +1757,6 @@ static const struct proto_ops x25_proto_ops = {
 	.sendmsg =	x25_sendmsg,
 	.recvmsg =	x25_recvmsg,
 	.mmap =		sock_no_mmap,
-	.sendpage =	sock_no_sendpage,
 };
 
 static struct packet_type x25_packet_type __read_mostly = {
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 2ac58b282b5e..eff1f0aaa4b5 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -1386,7 +1386,6 @@ static const struct proto_ops xsk_proto_ops = {
 	.sendmsg	= xsk_sendmsg,
 	.recvmsg	= xsk_recvmsg,
 	.mmap		= xsk_mmap,
-	.sendpage	= sock_no_sendpage,
 };
 
 static void xsk_destruct(struct sock *sk)



* [RFC PATCH 28/28] sock: Remove ->sendpage*() in favour of sendmsg(MSG_SPLICE_PAGES)
@ 2023-03-16 15:26   ` David Howells
  0 siblings, 0 replies; 81+ messages in thread
From: David Howells @ 2023-03-16 15:26 UTC (permalink / raw)
  To: dccp

[!] Note: This is a work in progress.  At the moment, the following won't
    build if this patch is applied: nvme, kcm, smc and tls.

Remove ->sendpage() and ->sendpage_locked().  sendmsg() with
MSG_SPLICE_PAGES should be used instead.  This allows multiple pages and
multipage folios to be passed through.
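
As a rough illustration of the conversion a former sendpage caller needs, the
sketch below mirrors the pattern of the wrappers removed by this patch (e.g.
af_alg_sendpage() and tcp_sendpage_locked()): describe the page in a bio_vec,
point an ITER_BVEC iterator at it and call sendmsg with MSG_SPLICE_PAGES set.
It is a userspace model so that it compiles on its own.  Every model_* type
and function, the MODEL_* flag values and the 16-segment limit are stand-ins
invented for the example; they are not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for kernel types; NOT the kernel definitions. */
struct model_page {
	char data[4096];
};

struct model_bvec {			/* loosely modelled on struct bio_vec */
	struct model_page *bv_page;
	size_t bv_len;
	size_t bv_offset;
};

struct model_msghdr {			/* flags plus an ITER_BVEC-like iterator */
	unsigned int msg_flags;
	const struct model_bvec *bvec;
	size_t nr_segs;
	size_t count;
};

/* Arbitrary illustrative bit values, not the kernel's. */
#define MODEL_MSG_MORE			0x1u
#define MODEL_MSG_SENDPAGE_NOTLAST	0x2u
#define MODEL_MSG_SPLICE_PAGES		0x4u

#define MODEL_MAX_SEGS			16	/* hypothetical batch limit */

/* Hypothetical protocol sendmsg(): takes the segments only if asked to splice. */
static long model_sendmsg(const struct model_msghdr *msg)
{
	size_t total = 0;

	if (!(msg->msg_flags & MODEL_MSG_SPLICE_PAGES))
		return -1;
	for (size_t i = 0; i < msg->nr_segs; i++)
		total += msg->bvec[i].bv_len;
	return total == msg->count ? (long)total : -1;
}

/* One page: the shape of the removed sendpage wrappers. */
static long model_send_page(struct model_page *page, size_t offset,
			    size_t size, unsigned int flags)
{
	struct model_bvec bvec = {
		.bv_page = page, .bv_len = size, .bv_offset = offset,
	};
	struct model_msghdr msg = {
		.msg_flags = flags | MODEL_MSG_SPLICE_PAGES,
		.bvec = &bvec, .nr_segs = 1, .count = size,
	};

	/* "Not the last page" becomes "more data follows". */
	if (flags & MODEL_MSG_SENDPAGE_NOTLAST)
		msg.msg_flags |= MODEL_MSG_MORE;
	return model_sendmsg(&msg);
}

/* Several whole pages in a single call, which sendpage could not express. */
static long model_send_pages(struct model_page **pages, size_t nr)
{
	struct model_bvec bvec[MODEL_MAX_SEGS];
	struct model_msghdr msg = {
		.msg_flags = MODEL_MSG_SPLICE_PAGES,
		.bvec = bvec, .nr_segs = nr,
		.count = nr * sizeof(pages[0]->data),
	};

	assert(pages != NULL);
	if (nr > MODEL_MAX_SEGS)
		return -1;
	for (size_t i = 0; i < nr; i++)
		bvec[i] = (struct model_bvec){
			.bv_page = pages[i],
			.bv_len = sizeof(pages[i]->data),
			.bv_offset = 0,
		};
	return model_sendmsg(&msg);
}
```

The single-page helper shows the one-for-one replacement; the multi-page
helper shows the batching that the iterator-based interface makes possible.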

Signed-off-by: David Howells <dhowells@redhat.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: bpf@vger.kernel.org
cc: dccp@vger.kernel.org
cc: linux-afs@lists.infradead.org
cc: linux-arm-msm@vger.kernel.org
cc: linux-can@vger.kernel.org
cc: linux-crypto@vger.kernel.org
cc: linux-doc@vger.kernel.org
cc: linux-hams@vger.kernel.org
cc: linux-kernel@vger.kernel.org
cc: linux-rdma@vger.kernel.org
cc: linux-sctp@vger.kernel.org
cc: linux-wpan@vger.kernel.org
cc: linux-x25@vger.kernel.org
cc: mptcp@lists.linux.dev
cc: netdev@vger.kernel.org
cc: rds-devel@oss.oracle.com
cc: tipc-discussion@lists.sourceforge.net
cc: virtualization@lists.linux-foundation.org
---
 Documentation/networking/scaling.rst |   4 +-
 crypto/af_alg.c                      |  29 ------
 crypto/algif_aead.c                  |  22 +----
 crypto/algif_rng.c                   |   2 -
 crypto/algif_skcipher.c              |  14 ---
 include/linux/net.h                  |   8 --
 include/net/inet_common.h            |   2 -
 include/net/sock.h                   |   6 --
 net/appletalk/ddp.c                  |   1 -
 net/atm/pvc.c                        |   1 -
 net/atm/svc.c                        |   1 -
 net/ax25/af_ax25.c                   |   1 -
 net/caif/caif_socket.c               |   2 -
 net/can/bcm.c                        |   1 -
 net/can/isotp.c                      |   1 -
 net/can/j1939/socket.c               |   1 -
 net/can/raw.c                        |   1 -
 net/core/sock.c                      |  35 +------
 net/dccp/ipv4.c                      |   1 -
 net/dccp/ipv6.c                      |   1 -
 net/ieee802154/socket.c              |   2 -
 net/ipv4/af_inet.c                   |  21 ----
 net/ipv4/tcp.c                       |  36 -------
 net/ipv4/tcp_bpf.c                   |  21 +---
 net/ipv4/tcp_ipv4.c                  |   1 -
 net/ipv4/udp.c                       |  22 -----
 net/ipv4/udp_impl.h                  |   2 -
 net/ipv4/udplite.c                   |   1 -
 net/ipv6/af_inet6.c                  |   3 -
 net/ipv6/raw.c                       |   1 -
 net/ipv6/tcp_ipv6.c                  |   1 -
 net/key/af_key.c                     |   1 -
 net/l2tp/l2tp_ip.c                   |   1 -
 net/l2tp/l2tp_ip6.c                  |   1 -
 net/llc/af_llc.c                     |   1 -
 net/mctp/af_mctp.c                   |   1 -
 net/mptcp/protocol.c                 |   2 -
 net/netlink/af_netlink.c             |   1 -
 net/netrom/af_netrom.c               |   1 -
 net/packet/af_packet.c               |   2 -
 net/phonet/socket.c                  |   2 -
 net/qrtr/af_qrtr.c                   |   1 -
 net/rds/af_rds.c                     |   1 -
 net/rose/af_rose.c                   |   1 -
 net/rxrpc/af_rxrpc.c                 |   1 -
 net/sctp/protocol.c                  |   1 -
 net/socket.c                         |  48 ---------
 net/tipc/socket.c                    |   3 -
 net/unix/af_unix.c                   | 139 ---------------------------
 net/vmw_vsock/af_vsock.c             |   3 -
 net/x25/af_x25.c                     |   1 -
 net/xdp/xsk.c                        |   1 -
 52 files changed, 9 insertions(+), 449 deletions(-)

diff --git a/Documentation/networking/scaling.rst b/Documentation/networking/scaling.rst
index 3d435caa3ef2..92c9fb46d6a2 100644
--- a/Documentation/networking/scaling.rst
+++ b/Documentation/networking/scaling.rst
@@ -269,8 +269,8 @@ a single application thread handles flows with many different flow hashes.
 rps_sock_flow_table is a global flow table that contains the *desired* CPU
 for flows: the CPU that is currently processing the flow in userspace.
 Each table value is a CPU index that is updated during calls to recvmsg
-and sendmsg (specifically, inet_recvmsg(), inet_sendmsg(), inet_sendpage()
-and tcp_splice_read()).
+and sendmsg (specifically, inet_recvmsg(), inet_sendmsg() and
+tcp_splice_read()).
 
 When the scheduler moves a thread to a new CPU while it has outstanding
 receive packets on the old CPU, packets may arrive out of order. To
diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index 0e77fce60876..225c90657f58 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -483,7 +483,6 @@ static const struct proto_ops alg_proto_ops = {
 	.listen		=	sock_no_listen,
 	.shutdown	=	sock_no_shutdown,
 	.mmap		=	sock_no_mmap,
-	.sendpage	=	sock_no_sendpage,
 	.sendmsg	=	sock_no_sendmsg,
 	.recvmsg	=	sock_no_recvmsg,
 
@@ -1135,34 +1134,6 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
 }
 EXPORT_SYMBOL_GPL(af_alg_sendmsg);
 
-/**
- * af_alg_sendpage - sendpage system call handler
- * @sock: socket of connection to user space to write to
- * @page: data to send
- * @offset: offset into page to begin sending
- * @size: length of data
- * @flags: message send/receive flags
- *
- * This is a generic implementation of sendpage to fill ctx->tsgl_list.
- */
-ssize_t af_alg_sendpage(struct socket *sock, struct page *page,
-			int offset, size_t size, int flags)
-{
-	struct bio_vec bvec;
-	struct msghdr msg = {
-		.msg_flags = flags | MSG_SPLICE_PAGES,
-	};
-
-	bvec_set_page(&bvec, page, size, offset);
-	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
-
-	if (flags & MSG_SENDPAGE_NOTLAST)
-		msg.msg_flags |= MSG_MORE;
-
-	return sock_sendmsg(sock, &msg);
-}
-EXPORT_SYMBOL_GPL(af_alg_sendpage);
-
 /**
  * af_alg_free_resources - release resources required for crypto request
  * @areq: Request holding the TX and RX SGL
diff --git a/crypto/algif_aead.c b/crypto/algif_aead.c
index 279eb17a1dfc..b65baefe6123 100644
--- a/crypto/algif_aead.c
+++ b/crypto/algif_aead.c
@@ -9,10 +9,10 @@
  * The following concept of the memory management is used:
  *
  * The kernel maintains two SGLs, the TX SGL and the RX SGL. The TX SGL is
- * filled by user space with the data submitted via sendpage. Filling up
- * the TX SGL does not cause a crypto operation -- the data will only be
- * tracked by the kernel. Upon receipt of one recvmsg call, the caller must
- * provide a buffer which is tracked with the RX SGL.
+ * filled by user space with the data submitted via sendmsg (possibly with
+ * MSG_SPLICE_PAGES).  Filling up the TX SGL does not cause a crypto operation
+ * -- the data will only be tracked by the kernel. Upon receipt of one recvmsg
+ * call, the caller must provide a buffer which is tracked with the RX SGL.
  *
  * During the processing of the recvmsg operation, the cipher request is
  * allocated and prepared. As part of the recvmsg operation, the processed
@@ -368,7 +368,6 @@ static struct proto_ops algif_aead_ops = {
 
 	.release	=	af_alg_release,
 	.sendmsg	=	aead_sendmsg,
-	.sendpage	=	af_alg_sendpage,
 	.recvmsg	=	aead_recvmsg,
 	.poll		=	af_alg_poll,
 };
@@ -420,18 +419,6 @@ static int aead_sendmsg_nokey(struct socket *sock, struct msghdr *msg,
 	return aead_sendmsg(sock, msg, size);
 }
 
-static ssize_t aead_sendpage_nokey(struct socket *sock, struct page *page,
-				       int offset, size_t size, int flags)
-{
-	int err;
-
-	err = aead_check_key(sock);
-	if (err)
-		return err;
-
-	return af_alg_sendpage(sock, page, offset, size, flags);
-}
-
 static int aead_recvmsg_nokey(struct socket *sock, struct msghdr *msg,
 				  size_t ignored, int flags)
 {
@@ -459,7 +446,6 @@ static struct proto_ops algif_aead_ops_nokey = {
 
 	.release	=	af_alg_release,
 	.sendmsg	=	aead_sendmsg_nokey,
-	.sendpage	=	aead_sendpage_nokey,
 	.recvmsg	=	aead_recvmsg_nokey,
 	.poll		=	af_alg_poll,
 };
diff --git a/crypto/algif_rng.c b/crypto/algif_rng.c
index 407408c43730..10c41adac3b1 100644
--- a/crypto/algif_rng.c
+++ b/crypto/algif_rng.c
@@ -174,7 +174,6 @@ static struct proto_ops algif_rng_ops = {
 	.bind		=	sock_no_bind,
 	.accept		=	sock_no_accept,
 	.sendmsg	=	sock_no_sendmsg,
-	.sendpage	=	sock_no_sendpage,
 
 	.release	=	af_alg_release,
 	.recvmsg	=	rng_recvmsg,
@@ -192,7 +191,6 @@ static struct proto_ops __maybe_unused algif_rng_test_ops = {
 	.mmap		=	sock_no_mmap,
 	.bind		=	sock_no_bind,
 	.accept		=	sock_no_accept,
-	.sendpage	=	sock_no_sendpage,
 
 	.release	=	af_alg_release,
 	.recvmsg	=	rng_test_recvmsg,
diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c
index 021f9ce7e87c..b34e20400e80 100644
--- a/crypto/algif_skcipher.c
+++ b/crypto/algif_skcipher.c
@@ -194,7 +194,6 @@ static struct proto_ops algif_skcipher_ops = {
 
 	.release	=	af_alg_release,
 	.sendmsg	=	skcipher_sendmsg,
-	.sendpage	=	af_alg_sendpage,
 	.recvmsg	=	skcipher_recvmsg,
 	.poll		=	af_alg_poll,
 };
@@ -246,18 +245,6 @@ static int skcipher_sendmsg_nokey(struct socket *sock, struct msghdr *msg,
 	return skcipher_sendmsg(sock, msg, size);
 }
 
-static ssize_t skcipher_sendpage_nokey(struct socket *sock, struct page *page,
-				       int offset, size_t size, int flags)
-{
-	int err;
-
-	err = skcipher_check_key(sock);
-	if (err)
-		return err;
-
-	return af_alg_sendpage(sock, page, offset, size, flags);
-}
-
 static int skcipher_recvmsg_nokey(struct socket *sock, struct msghdr *msg,
 				  size_t ignored, int flags)
 {
@@ -285,7 +272,6 @@ static struct proto_ops algif_skcipher_ops_nokey = {
 
 	.release	=	af_alg_release,
 	.sendmsg	=	skcipher_sendmsg_nokey,
-	.sendpage	=	skcipher_sendpage_nokey,
 	.recvmsg	=	skcipher_recvmsg_nokey,
 	.poll		=	af_alg_poll,
 };
diff --git a/include/linux/net.h b/include/linux/net.h
index b73ad8e3c212..e5794968ac9f 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -206,8 +206,6 @@ struct proto_ops {
 				      size_t total_len, int flags);
 	int		(*mmap)	     (struct file *file, struct socket *sock,
 				      struct vm_area_struct * vma);
-	ssize_t		(*sendpage)  (struct socket *sock, struct page *page,
-				      int offset, size_t size, int flags);
 	ssize_t 	(*splice_read)(struct socket *sock,  loff_t *ppos,
 				       struct pipe_inode_info *pipe, size_t len, unsigned int flags);
 	int		(*set_peek_off)(struct sock *sk, int val);
@@ -220,8 +218,6 @@ struct proto_ops {
 				     sk_read_actor_t recv_actor);
 	/* This is different from read_sock(), it reads an entire skb at a time. */
 	int		(*read_skb)(struct sock *sk, skb_read_actor_t recv_actor);
-	int		(*sendpage_locked)(struct sock *sk, struct page *page,
-					   int offset, size_t size, int flags);
 	int		(*sendmsg_locked)(struct sock *sk, struct msghdr *msg,
 					  size_t size);
 	int		(*set_rcvlowat)(struct sock *sk, int val);
@@ -339,10 +335,6 @@ int kernel_connect(struct socket *sock, struct sockaddr *addr, int addrlen,
 		   int flags);
 int kernel_getsockname(struct socket *sock, struct sockaddr *addr);
 int kernel_getpeername(struct socket *sock, struct sockaddr *addr);
-int kernel_sendpage(struct socket *sock, struct page *page, int offset,
-		    size_t size, int flags);
-int kernel_sendpage_locked(struct sock *sk, struct page *page, int offset,
-			   size_t size, int flags);
 int kernel_sock_shutdown(struct socket *sock, enum sock_shutdown_cmd how);
 
 /* Routine returns the IP overhead imposed by a (caller-protected) socket. */
diff --git a/include/net/inet_common.h b/include/net/inet_common.h
index cec453c18f1d..054c3388fa51 100644
--- a/include/net/inet_common.h
+++ b/include/net/inet_common.h
@@ -33,8 +33,6 @@ int inet_accept(struct socket *sock, struct socket *newsock, int flags,
 		bool kern);
 int inet_send_prepare(struct sock *sk);
 int inet_sendmsg(struct socket *sock, struct msghdr *msg, size_t size);
-ssize_t inet_sendpage(struct socket *sock, struct page *page, int offset,
-		      size_t size, int flags);
 int inet_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
 		 int flags);
 int inet_shutdown(struct socket *sock, int how);
diff --git a/include/net/sock.h b/include/net/sock.h
index 573f2bf7e0de..4618cd21e16b 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1265,8 +1265,6 @@ struct proto {
 					   size_t len);
 	int			(*recvmsg)(struct sock *sk, struct msghdr *msg,
 					   size_t len, int flags, int *addr_len);
-	int			(*sendpage)(struct sock *sk, struct page *page,
-					int offset, size_t size, int flags);
 	int			(*bind)(struct sock *sk,
 					struct sockaddr *addr, int addr_len);
 	int			(*bind_add)(struct sock *sk,
@@ -1906,10 +1904,6 @@ int sock_no_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t len);
 int sock_no_recvmsg(struct socket *, struct msghdr *, size_t, int);
 int sock_no_mmap(struct file *file, struct socket *sock,
 		 struct vm_area_struct *vma);
-ssize_t sock_no_sendpage(struct socket *sock, struct page *page, int offset,
-			 size_t size, int flags);
-ssize_t sock_no_sendpage_locked(struct sock *sk, struct page *page,
-				int offset, size_t size, int flags);
 
 /*
  * Functions to fill in entries in struct proto_ops when a protocol
diff --git a/net/appletalk/ddp.c b/net/appletalk/ddp.c
index a06f4d4a6f47..8978fb6212ff 100644
--- a/net/appletalk/ddp.c
+++ b/net/appletalk/ddp.c
@@ -1929,7 +1929,6 @@ static const struct proto_ops atalk_dgram_ops = {
 	.sendmsg	= atalk_sendmsg,
 	.recvmsg	= atalk_recvmsg,
 	.mmap		= sock_no_mmap,
-	.sendpage	= sock_no_sendpage,
 };
 
 static struct notifier_block ddp_notifier = {
diff --git a/net/atm/pvc.c b/net/atm/pvc.c
index 53e7d3f39e26..66d9a9bd5896 100644
--- a/net/atm/pvc.c
+++ b/net/atm/pvc.c
@@ -126,7 +126,6 @@ static const struct proto_ops pvc_proto_ops = {
 	.sendmsg =	vcc_sendmsg,
 	.recvmsg =	vcc_recvmsg,
 	.mmap =		sock_no_mmap,
-	.sendpage =	sock_no_sendpage,
 };
 
 
diff --git a/net/atm/svc.c b/net/atm/svc.c
index 4a02bcaad279..289240fe234e 100644
--- a/net/atm/svc.c
+++ b/net/atm/svc.c
@@ -649,7 +649,6 @@ static const struct proto_ops svc_proto_ops = {
 	.sendmsg =	vcc_sendmsg,
 	.recvmsg =	vcc_recvmsg,
 	.mmap =		sock_no_mmap,
-	.sendpage =	sock_no_sendpage,
 };
 
 
diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c
index d8da400cb4de..5db805d5f74d 100644
--- a/net/ax25/af_ax25.c
+++ b/net/ax25/af_ax25.c
@@ -2022,7 +2022,6 @@ static const struct proto_ops ax25_proto_ops = {
 	.sendmsg	= ax25_sendmsg,
 	.recvmsg	= ax25_recvmsg,
 	.mmap		= sock_no_mmap,
-	.sendpage	= sock_no_sendpage,
 };
 
 /*
diff --git a/net/caif/caif_socket.c b/net/caif/caif_socket.c
index 4eebcc66c19a..9c82698da4f5 100644
--- a/net/caif/caif_socket.c
+++ b/net/caif/caif_socket.c
@@ -976,7 +976,6 @@ static const struct proto_ops caif_seqpacket_ops = {
 	.sendmsg = caif_seqpkt_sendmsg,
 	.recvmsg = caif_seqpkt_recvmsg,
 	.mmap = sock_no_mmap,
-	.sendpage = sock_no_sendpage,
 };
 
 static const struct proto_ops caif_stream_ops = {
@@ -996,7 +995,6 @@ static const struct proto_ops caif_stream_ops = {
 	.sendmsg = caif_stream_sendmsg,
 	.recvmsg = caif_stream_recvmsg,
 	.mmap = sock_no_mmap,
-	.sendpage = sock_no_sendpage,
 };
 
 /* This function is called when a socket is finally destroyed. */
diff --git a/net/can/bcm.c b/net/can/bcm.c
index 27706f6ace34..65a946a36d92 100644
--- a/net/can/bcm.c
+++ b/net/can/bcm.c
@@ -1699,7 +1699,6 @@ static const struct proto_ops bcm_ops = {
 	.sendmsg       = bcm_sendmsg,
 	.recvmsg       = bcm_recvmsg,
 	.mmap          = sock_no_mmap,
-	.sendpage      = sock_no_sendpage,
 };
 
 static struct proto bcm_proto __read_mostly = {
diff --git a/net/can/isotp.c b/net/can/isotp.c
index 9bc344851704..0c3d11c29a2b 100644
--- a/net/can/isotp.c
+++ b/net/can/isotp.c
@@ -1633,7 +1633,6 @@ static const struct proto_ops isotp_ops = {
 	.sendmsg = isotp_sendmsg,
 	.recvmsg = isotp_recvmsg,
 	.mmap = sock_no_mmap,
-	.sendpage = sock_no_sendpage,
 };
 
 static struct proto isotp_proto __read_mostly = {
diff --git a/net/can/j1939/socket.c b/net/can/j1939/socket.c
index 7e90f9e61d9b..2bfe4f79bb67 100644
--- a/net/can/j1939/socket.c
+++ b/net/can/j1939/socket.c
@@ -1301,7 +1301,6 @@ static const struct proto_ops j1939_ops = {
 	.sendmsg = j1939_sk_sendmsg,
 	.recvmsg = j1939_sk_recvmsg,
 	.mmap = sock_no_mmap,
-	.sendpage = sock_no_sendpage,
 };
 
 static struct proto j1939_proto __read_mostly = {
diff --git a/net/can/raw.c b/net/can/raw.c
index f64469b98260..15c79b079184 100644
--- a/net/can/raw.c
+++ b/net/can/raw.c
@@ -962,7 +962,6 @@ static const struct proto_ops raw_ops = {
 	.sendmsg       = raw_sendmsg,
 	.recvmsg       = raw_recvmsg,
 	.mmap          = sock_no_mmap,
-	.sendpage      = sock_no_sendpage,
 };
 
 static struct proto raw_proto __read_mostly = {
diff --git a/net/core/sock.c b/net/core/sock.c
index 341c565dbc26..c2ae77bb2075 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -3223,36 +3223,6 @@ void __receive_sock(struct file *file)
 	}
 }
 
-ssize_t sock_no_sendpage(struct socket *sock, struct page *page, int offset, size_t size, int flags)
-{
-	ssize_t res;
-	struct msghdr msg = {.msg_flags = flags};
-	struct kvec iov;
-	char *kaddr = kmap(page);
-	iov.iov_base = kaddr + offset;
-	iov.iov_len = size;
-	res = kernel_sendmsg(sock, &msg, &iov, 1, size);
-	kunmap(page);
-	return res;
-}
-EXPORT_SYMBOL(sock_no_sendpage);
-
-ssize_t sock_no_sendpage_locked(struct sock *sk, struct page *page,
-				int offset, size_t size, int flags)
-{
-	ssize_t res;
-	struct msghdr msg = {.msg_flags = flags};
-	struct kvec iov;
-	char *kaddr = kmap(page);
-
-	iov.iov_base = kaddr + offset;
-	iov.iov_len = size;
-	res = kernel_sendmsg_locked(sk, &msg, &iov, 1, size);
-	kunmap(page);
-	return res;
-}
-EXPORT_SYMBOL(sock_no_sendpage_locked);
-
 /*
  *	Default Socket Callbacks
  */
@@ -4008,7 +3978,7 @@ static void proto_seq_printf(struct seq_file *seq, struct proto *proto)
 {
 
 	seq_printf(seq, "%-9s %4u %6d  %6ld   %-3s %6u   %-3s  %-10s "
-			"%2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c\n",
+			"%2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c\n",
 		   proto->name,
 		   proto->obj_size,
 		   sock_prot_inuse_get(seq_file_net(seq), proto),
@@ -4029,7 +3999,6 @@ static void proto_seq_printf(struct seq_file *seq, struct proto *proto)
 		   proto_method_implemented(proto->getsockopt),
 		   proto_method_implemented(proto->sendmsg),
 		   proto_method_implemented(proto->recvmsg),
-		   proto_method_implemented(proto->sendpage),
 		   proto_method_implemented(proto->bind),
 		   proto_method_implemented(proto->backlog_rcv),
 		   proto_method_implemented(proto->hash),
@@ -4050,7 +4019,7 @@ static int proto_seq_show(struct seq_file *seq, void *v)
 			   "maxhdr",
 			   "slab",
 			   "module",
-			   "cl co di ac io in de sh ss gs se re sp bi br ha uh gp em\n");
+			   "cl co di ac io in de sh ss gs se re bi br ha uh gp em\n");
 	else
 		proto_seq_printf(seq, list_entry(v, struct proto, node));
 	return 0;
diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c
index b780827f5e0a..ea808de374ea 100644
--- a/net/dccp/ipv4.c
+++ b/net/dccp/ipv4.c
@@ -1008,7 +1008,6 @@ static const struct proto_ops inet_dccp_ops = {
 	.sendmsg	   = inet_sendmsg,
 	.recvmsg	   = sock_common_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = sock_no_sendpage,
 };
 
 static struct inet_protosw dccp_v4_protosw = {
diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c
index b9d7c3dd1cb3..23eb8159e3cd 100644
--- a/net/dccp/ipv6.c
+++ b/net/dccp/ipv6.c
@@ -1085,7 +1085,6 @@ static const struct proto_ops inet6_dccp_ops = {
 	.sendmsg	   = inet_sendmsg,
 	.recvmsg	   = sock_common_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = sock_no_sendpage,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	   = inet6_compat_ioctl,
 #endif
diff --git a/net/ieee802154/socket.c b/net/ieee802154/socket.c
index 1fa2fe041ec0..1238f036117f 100644
--- a/net/ieee802154/socket.c
+++ b/net/ieee802154/socket.c
@@ -426,7 +426,6 @@ static const struct proto_ops ieee802154_raw_ops = {
 	.sendmsg	   = ieee802154_sock_sendmsg,
 	.recvmsg	   = sock_common_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = sock_no_sendpage,
 };
 
 /* DGRAM Sockets (802.15.4 dataframes) */
@@ -990,7 +989,6 @@ static const struct proto_ops ieee802154_dgram_ops = {
 	.sendmsg	   = ieee802154_sock_sendmsg,
 	.recvmsg	   = sock_common_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = sock_no_sendpage,
 };
 
 static void ieee802154_sock_destruct(struct sock *sk)
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 8db6747f892f..869b49933f15 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -827,23 +827,6 @@ int inet_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
 }
 EXPORT_SYMBOL(inet_sendmsg);
 
-ssize_t inet_sendpage(struct socket *sock, struct page *page, int offset,
-		      size_t size, int flags)
-{
-	struct sock *sk = sock->sk;
-	const struct proto *prot;
-
-	if (unlikely(inet_send_prepare(sk)))
-		return -EAGAIN;
-
-	/* IPV6_ADDRFORM can change sk->sk_prot under us. */
-	prot = READ_ONCE(sk->sk_prot);
-	if (prot->sendpage)
-		return prot->sendpage(sk, page, offset, size, flags);
-	return sock_no_sendpage(sock, page, offset, size, flags);
-}
-EXPORT_SYMBOL(inet_sendpage);
-
 INDIRECT_CALLABLE_DECLARE(int udp_recvmsg(struct sock *, struct msghdr *,
 					  size_t, int, int *));
 int inet_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
@@ -1046,12 +1029,10 @@ const struct proto_ops inet_stream_ops = {
 #ifdef CONFIG_MMU
 	.mmap		   = tcp_mmap,
 #endif
-	.sendpage	   = inet_sendpage,
 	.splice_read	   = tcp_splice_read,
 	.read_sock	   = tcp_read_sock,
 	.read_skb	   = tcp_read_skb,
 	.sendmsg_locked    = tcp_sendmsg_locked,
-	.sendpage_locked   = tcp_sendpage_locked,
 	.peek_len	   = tcp_peek_len,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	   = inet_compat_ioctl,
@@ -1080,7 +1061,6 @@ const struct proto_ops inet_dgram_ops = {
 	.read_skb	   = udp_read_skb,
 	.recvmsg	   = inet_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = inet_sendpage,
 	.set_peek_off	   = sk_set_peek_off,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	   = inet_compat_ioctl,
@@ -1111,7 +1091,6 @@ static const struct proto_ops inet_sockraw_ops = {
 	.sendmsg	   = inet_sendmsg,
 	.recvmsg	   = inet_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = inet_sendpage,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	   = inet_compat_ioctl,
 #endif
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index f1454e4497df..26fa387f1084 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -971,42 +971,6 @@ static int tcp_wmem_schedule(struct sock *sk, int copy)
 	return min(copy, sk->sk_forward_alloc);
 }
 
-int tcp_sendpage_locked(struct sock *sk, struct page *page, int offset,
-			size_t size, int flags)
-{
-	struct bio_vec bvec;
-	struct msghdr msg = {
-		.msg_flags = flags | MSG_SPLICE_PAGES,
-	};
-
-	if (!(sk->sk_route_caps & NETIF_F_SG))
-		return sock_no_sendpage_locked(sk, page, offset, size, flags);
-
-	tcp_rate_check_app_limited(sk);  /* is sending application-limited? */
-
-	bvec_set_page(&bvec, page, size, offset);
-	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
-
-	if (flags & MSG_SENDPAGE_NOTLAST)
-		msg.msg_flags |= MSG_MORE;
-
-	return tcp_sendmsg_locked(sk, &msg, size);
-}
-EXPORT_SYMBOL_GPL(tcp_sendpage_locked);
-
-int tcp_sendpage(struct sock *sk, struct page *page, int offset,
-		 size_t size, int flags)
-{
-	int ret;
-
-	lock_sock(sk);
-	ret = tcp_sendpage_locked(sk, page, offset, size, flags);
-	release_sock(sk);
-
-	return ret;
-}
-EXPORT_SYMBOL(tcp_sendpage);
-
 void tcp_free_fastopen_req(struct tcp_sock *tp)
 {
 	if (tp->fastopen_req) {
diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index de37a4372437..ab83cfb9de22 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -482,23 +482,6 @@ static int tcp_bpf_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 	return copied ? copied : err;
 }
 
-static int tcp_bpf_sendpage(struct sock *sk, struct page *page, int offset,
-			    size_t size, int flags)
-{
-	struct bio_vec bvec;
-	struct msghdr msg = {
-		.msg_flags = flags | MSG_SPLICE_PAGES,
-	};
-
-	bvec_set_page(&bvec, page, size, offset);
-	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
-
-	if (flags & MSG_SENDPAGE_NOTLAST)
-		msg.msg_flags |= MSG_MORE;
-
-	return tcp_bpf_sendmsg(sk, &msg, size);
-}
-
 enum {
 	TCP_BPF_IPV4,
 	TCP_BPF_IPV6,
@@ -528,7 +511,6 @@ static void tcp_bpf_rebuild_protos(struct proto prot[TCP_BPF_NUM_CFGS],
 
 	prot[TCP_BPF_TX]			= prot[TCP_BPF_BASE];
 	prot[TCP_BPF_TX].sendmsg		= tcp_bpf_sendmsg;
-	prot[TCP_BPF_TX].sendpage		= tcp_bpf_sendpage;
 
 	prot[TCP_BPF_RX]			= prot[TCP_BPF_BASE];
 	prot[TCP_BPF_RX].recvmsg		= tcp_bpf_recvmsg_parser;
@@ -563,8 +545,7 @@ static int tcp_bpf_assert_proto_ops(struct proto *ops)
 	 * indeed valid assumptions.
 	 */
 	return ops->recvmsg  == tcp_recvmsg &&
-	       ops->sendmsg  == tcp_sendmsg &&
-	       ops->sendpage == tcp_sendpage ? 0 : -ENOTSUPP;
+	       ops->sendmsg  == tcp_sendmsg ? 0 : -ENOTSUPP;
 }
 
 int tcp_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore)
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index ea370afa70ed..5c2e1c1ca329 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -3112,7 +3112,6 @@ struct proto tcp_prot = {
 	.keepalive		= tcp_set_keepalive,
 	.recvmsg		= tcp_recvmsg,
 	.sendmsg		= tcp_sendmsg,
-	.sendpage		= tcp_sendpage,
 	.backlog_rcv		= tcp_v4_do_rcv,
 	.release_cb		= tcp_release_cb,
 	.hash			= inet_hash,
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 097feb92e215..85bd5960f7ef 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1329,27 +1329,6 @@ int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 }
 EXPORT_SYMBOL(udp_sendmsg);
 
-int udp_sendpage(struct sock *sk, struct page *page, int offset,
-		 size_t size, int flags)
-{
-	struct bio_vec bvec;
-	struct msghdr msg = {
-		.msg_flags = flags | MSG_SPLICE_PAGES | MSG_MORE
-	};
-	int ret;
-
-	bvec_set_page(&bvec, page, size, offset);
-	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
-
-	if (flags & MSG_SENDPAGE_NOTLAST)
-		msg.msg_flags |= MSG_MORE;
-
-	lock_sock(sk);
-	ret = udp_sendmsg(sk, &msg, size);
-	release_sock(sk);
-	return ret;
-}
-
 #define UDP_SKB_IS_STATELESS 0x80000000
 
 /* all head states (dst, sk, nf conntrack) except skb extensions are
@@ -2926,7 +2905,6 @@ struct proto udp_prot = {
 	.getsockopt		= udp_getsockopt,
 	.sendmsg		= udp_sendmsg,
 	.recvmsg		= udp_recvmsg,
-	.sendpage		= udp_sendpage,
 	.release_cb		= ip4_datagram_release_cb,
 	.hash			= udp_lib_hash,
 	.unhash			= udp_lib_unhash,
diff --git a/net/ipv4/udp_impl.h b/net/ipv4/udp_impl.h
index 4ba7a88a1b1d..e1ff3a375996 100644
--- a/net/ipv4/udp_impl.h
+++ b/net/ipv4/udp_impl.h
@@ -19,8 +19,6 @@ int udp_getsockopt(struct sock *sk, int level, int optname,
 
 int udp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int flags,
 		int *addr_len);
-int udp_sendpage(struct sock *sk, struct page *page, int offset, size_t size,
-		 int flags);
 void udp_destroy_sock(struct sock *sk);
 
 #ifdef CONFIG_PROC_FS
diff --git a/net/ipv4/udplite.c b/net/ipv4/udplite.c
index e0c9cc39b81e..69870f0afc6c 100644
--- a/net/ipv4/udplite.c
+++ b/net/ipv4/udplite.c
@@ -54,7 +54,6 @@ struct proto 	udplite_prot = {
 	.getsockopt	   = udp_getsockopt,
 	.sendmsg	   = udp_sendmsg,
 	.recvmsg	   = udp_recvmsg,
-	.sendpage	   = udp_sendpage,
 	.hash		   = udp_lib_hash,
 	.unhash		   = udp_lib_unhash,
 	.rehash		   = udp_v4_rehash,
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index 38689bedfce7..769c76d59053 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -695,9 +695,7 @@ const struct proto_ops inet6_stream_ops = {
 #ifdef CONFIG_MMU
 	.mmap		   = tcp_mmap,
 #endif
-	.sendpage	   = inet_sendpage,
 	.sendmsg_locked    = tcp_sendmsg_locked,
-	.sendpage_locked   = tcp_sendpage_locked,
 	.splice_read	   = tcp_splice_read,
 	.read_sock	   = tcp_read_sock,
 	.read_skb	   = tcp_read_skb,
@@ -728,7 +726,6 @@ const struct proto_ops inet6_dgram_ops = {
 	.recvmsg	   = inet6_recvmsg,		/* retpoline's sake */
 	.read_skb	   = udp_read_skb,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = sock_no_sendpage,
 	.set_peek_off	   = sk_set_peek_off,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	   = inet6_compat_ioctl,
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index bac9ba747bde..c6c062678c0e 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -1298,7 +1298,6 @@ const struct proto_ops inet6_sockraw_ops = {
 	.sendmsg	   = inet_sendmsg,		/* ok		*/
 	.recvmsg	   = sock_common_recvmsg,	/* ok		*/
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = sock_no_sendpage,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	   = inet6_compat_ioctl,
 #endif
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 1bf93b61aa06..03ba1e389901 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -2151,7 +2151,6 @@ struct proto tcpv6_prot = {
 	.keepalive		= tcp_set_keepalive,
 	.recvmsg		= tcp_recvmsg,
 	.sendmsg		= tcp_sendmsg,
-	.sendpage		= tcp_sendpage,
 	.backlog_rcv		= tcp_v6_do_rcv,
 	.release_cb		= tcp_release_cb,
 	.hash			= inet6_hash,
diff --git a/net/key/af_key.c b/net/key/af_key.c
index a815f5ab4c49..bf59d42dc697 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -3757,7 +3757,6 @@ static const struct proto_ops pfkey_ops = {
 	.listen		=	sock_no_listen,
 	.shutdown	=	sock_no_shutdown,
 	.mmap		=	sock_no_mmap,
-	.sendpage	=	sock_no_sendpage,
 
 	/* Now the operations that really occur. */
 	.release	=	pfkey_release,
diff --git a/net/l2tp/l2tp_ip.c b/net/l2tp/l2tp_ip.c
index 4db5a554bdbd..d0dcbe3a4cd7 100644
--- a/net/l2tp/l2tp_ip.c
+++ b/net/l2tp/l2tp_ip.c
@@ -625,7 +625,6 @@ static const struct proto_ops l2tp_ip_ops = {
 	.sendmsg	   = inet_sendmsg,
 	.recvmsg	   = sock_common_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = sock_no_sendpage,
 };
 
 static struct inet_protosw l2tp_ip_protosw = {
diff --git a/net/l2tp/l2tp_ip6.c b/net/l2tp/l2tp_ip6.c
index 2478aa60145f..49296ce14a90 100644
--- a/net/l2tp/l2tp_ip6.c
+++ b/net/l2tp/l2tp_ip6.c
@@ -751,7 +751,6 @@ static const struct proto_ops l2tp_ip6_ops = {
 	.sendmsg	   = inet_sendmsg,
 	.recvmsg	   = sock_common_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = sock_no_sendpage,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	   = inet6_compat_ioctl,
 #endif
diff --git a/net/llc/af_llc.c b/net/llc/af_llc.c
index da7fe94bea2e..addd94da2a81 100644
--- a/net/llc/af_llc.c
+++ b/net/llc/af_llc.c
@@ -1230,7 +1230,6 @@ static const struct proto_ops llc_ui_ops = {
 	.sendmsg     = llc_ui_sendmsg,
 	.recvmsg     = llc_ui_recvmsg,
 	.mmap	     = sock_no_mmap,
-	.sendpage    = sock_no_sendpage,
 };
 
 static const char llc_proc_err_msg[] __initconst =
diff --git a/net/mctp/af_mctp.c b/net/mctp/af_mctp.c
index 3150f3f0c872..c6fe2e6b85dd 100644
--- a/net/mctp/af_mctp.c
+++ b/net/mctp/af_mctp.c
@@ -485,7 +485,6 @@ static const struct proto_ops mctp_dgram_ops = {
 	.sendmsg	= mctp_sendmsg,
 	.recvmsg	= mctp_recvmsg,
 	.mmap		= sock_no_mmap,
-	.sendpage	= sock_no_sendpage,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	= mctp_compat_ioctl,
 #endif
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 3ad9c46202fc..ade89b8d0082 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -3816,7 +3816,6 @@ static const struct proto_ops mptcp_stream_ops = {
 	.sendmsg	   = inet_sendmsg,
 	.recvmsg	   = inet_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = inet_sendpage,
 };
 
 static struct inet_protosw mptcp_protosw = {
@@ -3911,7 +3910,6 @@ static const struct proto_ops mptcp_v6_stream_ops = {
 	.sendmsg	   = inet6_sendmsg,
 	.recvmsg	   = inet6_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = inet_sendpage,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	   = inet6_compat_ioctl,
 #endif
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index c64277659753..f70073a3bb49 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -2841,7 +2841,6 @@ static const struct proto_ops netlink_ops = {
 	.sendmsg =	netlink_sendmsg,
 	.recvmsg =	netlink_recvmsg,
 	.mmap =		sock_no_mmap,
-	.sendpage =	sock_no_sendpage,
 };
 
 static const struct net_proto_family netlink_family_ops = {
diff --git a/net/netrom/af_netrom.c b/net/netrom/af_netrom.c
index 5a4cb796150f..eb8ccbd58df7 100644
--- a/net/netrom/af_netrom.c
+++ b/net/netrom/af_netrom.c
@@ -1364,7 +1364,6 @@ static const struct proto_ops nr_proto_ops = {
 	.sendmsg	=	nr_sendmsg,
 	.recvmsg	=	nr_recvmsg,
 	.mmap		=	sock_no_mmap,
-	.sendpage	=	sock_no_sendpage,
 };
 
 static struct notifier_block nr_dev_notifier = {
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index d4e76e2ae153..385bd4982b80 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -4604,7 +4604,6 @@ static const struct proto_ops packet_ops_spkt = {
 	.sendmsg =	packet_sendmsg_spkt,
 	.recvmsg =	packet_recvmsg,
 	.mmap =		sock_no_mmap,
-	.sendpage =	sock_no_sendpage,
 };
 
 static const struct proto_ops packet_ops = {
@@ -4626,7 +4625,6 @@ static const struct proto_ops packet_ops = {
 	.sendmsg =	packet_sendmsg,
 	.recvmsg =	packet_recvmsg,
 	.mmap =		packet_mmap,
-	.sendpage =	sock_no_sendpage,
 };
 
 static const struct net_proto_family packet_family_ops = {
diff --git a/net/phonet/socket.c b/net/phonet/socket.c
index 71e2caf6ab85..a246f7d0a817 100644
--- a/net/phonet/socket.c
+++ b/net/phonet/socket.c
@@ -441,7 +441,6 @@ const struct proto_ops phonet_dgram_ops = {
 	.sendmsg	= pn_socket_sendmsg,
 	.recvmsg	= sock_common_recvmsg,
 	.mmap		= sock_no_mmap,
-	.sendpage	= sock_no_sendpage,
 };
 
 const struct proto_ops phonet_stream_ops = {
@@ -462,7 +461,6 @@ const struct proto_ops phonet_stream_ops = {
 	.sendmsg	= pn_socket_sendmsg,
 	.recvmsg	= sock_common_recvmsg,
 	.mmap		= sock_no_mmap,
-	.sendpage	= sock_no_sendpage,
 };
 EXPORT_SYMBOL(phonet_stream_ops);
 
diff --git a/net/qrtr/af_qrtr.c b/net/qrtr/af_qrtr.c
index 5c2fb992803b..5bb7d680bd5f 100644
--- a/net/qrtr/af_qrtr.c
+++ b/net/qrtr/af_qrtr.c
@@ -1240,7 +1240,6 @@ static const struct proto_ops qrtr_proto_ops = {
 	.shutdown	= sock_no_shutdown,
 	.release	= qrtr_release,
 	.mmap		= sock_no_mmap,
-	.sendpage	= sock_no_sendpage,
 };
 
 static struct proto qrtr_proto = {
diff --git a/net/rds/af_rds.c b/net/rds/af_rds.c
index 3ff6995244e5..01c4cdfef45d 100644
--- a/net/rds/af_rds.c
+++ b/net/rds/af_rds.c
@@ -653,7 +653,6 @@ static const struct proto_ops rds_proto_ops = {
 	.sendmsg =	rds_sendmsg,
 	.recvmsg =	rds_recvmsg,
 	.mmap =		sock_no_mmap,
-	.sendpage =	sock_no_sendpage,
 };
 
 static void rds_sock_destruct(struct sock *sk)
diff --git a/net/rose/af_rose.c b/net/rose/af_rose.c
index ca2b17f32670..49dafe9ac72f 100644
--- a/net/rose/af_rose.c
+++ b/net/rose/af_rose.c
@@ -1496,7 +1496,6 @@ static const struct proto_ops rose_proto_ops = {
 	.sendmsg	=	rose_sendmsg,
 	.recvmsg	=	rose_recvmsg,
 	.mmap		=	sock_no_mmap,
-	.sendpage	=	sock_no_sendpage,
 };
 
 static struct notifier_block rose_dev_notifier = {
diff --git a/net/rxrpc/af_rxrpc.c b/net/rxrpc/af_rxrpc.c
index 102f5cbff91a..182495804f8f 100644
--- a/net/rxrpc/af_rxrpc.c
+++ b/net/rxrpc/af_rxrpc.c
@@ -938,7 +938,6 @@ static const struct proto_ops rxrpc_rpc_ops = {
 	.sendmsg	= rxrpc_sendmsg,
 	.recvmsg	= rxrpc_recvmsg,
 	.mmap		= sock_no_mmap,
-	.sendpage	= sock_no_sendpage,
 };
 
 static struct proto rxrpc_proto = {
diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
index c365df24ad33..acb2d2a69268 100644
--- a/net/sctp/protocol.c
+++ b/net/sctp/protocol.c
@@ -1135,7 +1135,6 @@ static const struct proto_ops inet_seqpacket_ops = {
 	.sendmsg	   = inet_sendmsg,
 	.recvmsg	   = inet_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = sock_no_sendpage,
 };
 
 /* Registration with AF_INET family.  */
diff --git a/net/socket.c b/net/socket.c
index 1b48a976b8cc..130d6ce7f82d 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -3541,54 +3541,6 @@ int kernel_getpeername(struct socket *sock, struct sockaddr *addr)
 }
 EXPORT_SYMBOL(kernel_getpeername);
 
-/**
- *	kernel_sendpage - send a &page through a socket (kernel space)
- *	@sock: socket
- *	@page: page
- *	@offset: page offset
- *	@size: total size in bytes
- *	@flags: flags (MSG_DONTWAIT, ...)
- *
- *	Returns the total amount sent in bytes or an error.
- */
-
-int kernel_sendpage(struct socket *sock, struct page *page, int offset,
-		    size_t size, int flags)
-{
-	if (sock->ops->sendpage) {
-		/* Warn in case the improper page to zero-copy send */
-		WARN_ONCE(!sendpage_ok(page), "improper page for zero-copy send");
-		return sock->ops->sendpage(sock, page, offset, size, flags);
-	}
-	return sock_no_sendpage(sock, page, offset, size, flags);
-}
-EXPORT_SYMBOL(kernel_sendpage);
-
-/**
- *	kernel_sendpage_locked - send a &page through the locked sock (kernel space)
- *	@sk: sock
- *	@page: page
- *	@offset: page offset
- *	@size: total size in bytes
- *	@flags: flags (MSG_DONTWAIT, ...)
- *
- *	Returns the total amount sent in bytes or an error.
- *	Caller must hold @sk.
- */
-
-int kernel_sendpage_locked(struct sock *sk, struct page *page, int offset,
-			   size_t size, int flags)
-{
-	struct socket *sock = sk->sk_socket;
-
-	if (sock->ops->sendpage_locked)
-		return sock->ops->sendpage_locked(sk, page, offset, size,
-						  flags);
-
-	return sock_no_sendpage_locked(sk, page, offset, size, flags);
-}
-EXPORT_SYMBOL(kernel_sendpage_locked);
-
 /**
  *	kernel_sock_shutdown - shut down part of a full-duplex connection (kernel space)
  *	@sock: socket
diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 37edfe10f8c6..d2072fbf3272 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -3375,7 +3375,6 @@ static const struct proto_ops msg_ops = {
 	.sendmsg	= tipc_sendmsg,
 	.recvmsg	= tipc_recvmsg,
 	.mmap		= sock_no_mmap,
-	.sendpage	= sock_no_sendpage
 };
 
 static const struct proto_ops packet_ops = {
@@ -3396,7 +3395,6 @@ static const struct proto_ops packet_ops = {
 	.sendmsg	= tipc_send_packet,
 	.recvmsg	= tipc_recvmsg,
 	.mmap		= sock_no_mmap,
-	.sendpage	= sock_no_sendpage
 };
 
 static const struct proto_ops stream_ops = {
@@ -3417,7 +3415,6 @@ static const struct proto_ops stream_ops = {
 	.sendmsg	= tipc_sendstream,
 	.recvmsg	= tipc_recvstream,
 	.mmap		= sock_no_mmap,
-	.sendpage	= sock_no_sendpage
 };
 
 static const struct net_proto_family tipc_family_ops = {
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 6f3454db9c53..407f449df564 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -758,8 +758,6 @@ static int unix_compat_ioctl(struct socket *sock, unsigned int cmd, unsigned lon
 static int unix_shutdown(struct socket *, int);
 static int unix_stream_sendmsg(struct socket *, struct msghdr *, size_t);
 static int unix_stream_recvmsg(struct socket *, struct msghdr *, size_t, int);
-static ssize_t unix_stream_sendpage(struct socket *, struct page *, int offset,
-				    size_t size, int flags);
 static ssize_t unix_stream_splice_read(struct socket *,  loff_t *ppos,
 				       struct pipe_inode_info *, size_t size,
 				       unsigned int flags);
@@ -852,7 +850,6 @@ static const struct proto_ops unix_stream_ops = {
 	.recvmsg =	unix_stream_recvmsg,
 	.read_skb =	unix_stream_read_skb,
 	.mmap =		sock_no_mmap,
-	.sendpage =	unix_stream_sendpage,
 	.splice_read =	unix_stream_splice_read,
 	.set_peek_off =	unix_set_peek_off,
 	.show_fdinfo =	unix_show_fdinfo,
@@ -878,7 +875,6 @@ static const struct proto_ops unix_dgram_ops = {
 	.read_skb =	unix_read_skb,
 	.recvmsg =	unix_dgram_recvmsg,
 	.mmap =		sock_no_mmap,
-	.sendpage =	sock_no_sendpage,
 	.set_peek_off =	unix_set_peek_off,
 	.show_fdinfo =	unix_show_fdinfo,
 };
@@ -902,7 +898,6 @@ static const struct proto_ops unix_seqpacket_ops = {
 	.sendmsg =	unix_seqpacket_sendmsg,
 	.recvmsg =	unix_seqpacket_recvmsg,
 	.mmap =		sock_no_mmap,
-	.sendpage =	sock_no_sendpage,
 	.set_peek_off =	unix_set_peek_off,
 	.show_fdinfo =	unix_show_fdinfo,
 };
@@ -1839,24 +1834,6 @@ static void maybe_add_creds(struct sk_buff *skb, const struct socket *sock,
 	}
 }
 
-static int maybe_init_creds(struct scm_cookie *scm,
-			    struct socket *socket,
-			    const struct sock *other)
-{
-	int err;
-	struct msghdr msg = { .msg_controllen = 0 };
-
-	err = scm_send(socket, &msg, scm, false);
-	if (err)
-		return err;
-
-	if (unix_passcred_enabled(socket, other)) {
-		scm->pid = get_pid(task_tgid(current));
-		current_uid_gid(&scm->creds.uid, &scm->creds.gid);
-	}
-	return err;
-}
-
 static bool unix_skb_scm_eq(struct sk_buff *skb,
 			    struct scm_cookie *scm)
 {
@@ -2318,122 +2295,6 @@ static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg,
 	return sent ? : err;
 }
 
-static ssize_t unix_stream_sendpage(struct socket *socket, struct page *page,
-				    int offset, size_t size, int flags)
-{
-	int err;
-	bool send_sigpipe = false;
-	bool init_scm = true;
-	struct scm_cookie scm;
-	struct sock *other, *sk = socket->sk;
-	struct sk_buff *skb, *newskb = NULL, *tail = NULL;
-
-	if (flags & MSG_OOB)
-		return -EOPNOTSUPP;
-
-	other = unix_peer(sk);
-	if (!other || sk->sk_state != TCP_ESTABLISHED)
-		return -ENOTCONN;
-
-	if (false) {
-alloc_skb:
-		unix_state_unlock(other);
-		mutex_unlock(&unix_sk(other)->iolock);
-		newskb = sock_alloc_send_pskb(sk, 0, 0, flags & MSG_DONTWAIT,
-					      &err, 0);
-		if (!newskb)
-			goto err;
-	}
-
-	/* we must acquire iolock as we modify already present
-	 * skbs in the sk_receive_queue and mess with skb->len
-	 */
-	err = mutex_lock_interruptible(&unix_sk(other)->iolock);
-	if (err) {
-		err = flags & MSG_DONTWAIT ? -EAGAIN : -ERESTARTSYS;
-		goto err;
-	}
-
-	if (sk->sk_shutdown & SEND_SHUTDOWN) {
-		err = -EPIPE;
-		send_sigpipe = true;
-		goto err_unlock;
-	}
-
-	unix_state_lock(other);
-
-	if (sock_flag(other, SOCK_DEAD) ||
-	    other->sk_shutdown & RCV_SHUTDOWN) {
-		err = -EPIPE;
-		send_sigpipe = true;
-		goto err_state_unlock;
-	}
-
-	if (init_scm) {
-		err = maybe_init_creds(&scm, socket, other);
-		if (err)
-			goto err_state_unlock;
-		init_scm = false;
-	}
-
-	skb = skb_peek_tail(&other->sk_receive_queue);
-	if (tail && tail == skb) {
-		skb = newskb;
-	} else if (!skb || !unix_skb_scm_eq(skb, &scm)) {
-		if (newskb) {
-			skb = newskb;
-		} else {
-			tail = skb;
-			goto alloc_skb;
-		}
-	} else if (newskb) {
-		/* this is fast path, we don't necessarily need to
-		 * call to kfree_skb even though with newskb == NULL
-		 * this - does no harm
-		 */
-		consume_skb(newskb);
-		newskb = NULL;
-	}
-
-	if (skb_append_pagefrags(skb, page, offset, size)) {
-		tail = skb;
-		goto alloc_skb;
-	}
-
-	skb->len += size;
-	skb->data_len += size;
-	skb->truesize += size;
-	refcount_add(size, &sk->sk_wmem_alloc);
-
-	if (newskb) {
-		err = unix_scm_to_skb(&scm, skb, false);
-		if (err)
-			goto err_state_unlock;
-		spin_lock(&other->sk_receive_queue.lock);
-		__skb_queue_tail(&other->sk_receive_queue, newskb);
-		spin_unlock(&other->sk_receive_queue.lock);
-	}
-
-	unix_state_unlock(other);
-	mutex_unlock(&unix_sk(other)->iolock);
-
-	other->sk_data_ready(other);
-	scm_destroy(&scm);
-	return size;
-
-err_state_unlock:
-	unix_state_unlock(other);
-err_unlock:
-	mutex_unlock(&unix_sk(other)->iolock);
-err:
-	kfree_skb(newskb);
-	if (send_sigpipe && !(flags & MSG_NOSIGNAL))
-		send_sig(SIGPIPE, current, 0);
-	if (!init_scm)
-		scm_destroy(&scm);
-	return err;
-}
-
 static int unix_seqpacket_sendmsg(struct socket *sock, struct msghdr *msg,
 				  size_t len)
 {
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 19aea7cba26e..d0e476755cdc 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1271,7 +1271,6 @@ static const struct proto_ops vsock_dgram_ops = {
 	.sendmsg = vsock_dgram_sendmsg,
 	.recvmsg = vsock_dgram_recvmsg,
 	.mmap = sock_no_mmap,
-	.sendpage = sock_no_sendpage,
 };
 
 static int vsock_transport_cancel_pkt(struct vsock_sock *vsk)
@@ -2186,7 +2185,6 @@ static const struct proto_ops vsock_stream_ops = {
 	.sendmsg = vsock_connectible_sendmsg,
 	.recvmsg = vsock_connectible_recvmsg,
 	.mmap = sock_no_mmap,
-	.sendpage = sock_no_sendpage,
 	.set_rcvlowat = vsock_set_rcvlowat,
 };
 
@@ -2208,7 +2206,6 @@ static const struct proto_ops vsock_seqpacket_ops = {
 	.sendmsg = vsock_connectible_sendmsg,
 	.recvmsg = vsock_connectible_recvmsg,
 	.mmap = sock_no_mmap,
-	.sendpage = sock_no_sendpage,
 };
 
 static int vsock_create(struct net *net, struct socket *sock,
diff --git a/net/x25/af_x25.c b/net/x25/af_x25.c
index 5c7ad301d742..0fb5143bec7a 100644
--- a/net/x25/af_x25.c
+++ b/net/x25/af_x25.c
@@ -1757,7 +1757,6 @@ static const struct proto_ops x25_proto_ops = {
 	.sendmsg =	x25_sendmsg,
 	.recvmsg =	x25_recvmsg,
 	.mmap =		sock_no_mmap,
-	.sendpage =	sock_no_sendpage,
 };
 
 static struct packet_type x25_packet_type __read_mostly = {
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 2ac58b282b5e..eff1f0aaa4b5 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -1386,7 +1386,6 @@ static const struct proto_ops xsk_proto_ops = {
 	.sendmsg	= xsk_sendmsg,
 	.recvmsg	= xsk_recvmsg,
 	.mmap		= xsk_mmap,
-	.sendpage	= sock_no_sendpage,
 };
 
 static void xsk_destruct(struct sock *sk)

* [RFC PATCH 28/28] sock: Remove ->sendpage*() in favour of sendmsg(MSG_SPLICE_PAGES)
@ 2023-03-16 15:26   ` David Howells
  0 siblings, 0 replies; 81+ messages in thread
From: David Howells @ 2023-03-16 15:26 UTC (permalink / raw)
  To: Matthew Wilcox, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni
  Cc: linux-doc, virtualization, David Howells, linux-mm, linux-sctp,
	linux-afs, rds-devel, linux-x25, dccp, linux-rdma,
	Christoph Hellwig, Linus Torvalds, linux-arm-msm, linux-can,
	Al Viro, linux-hams, mptcp, Jens Axboe, Christian Brauner,
	netdev, Jeff Layton, linux-kernel, tipc-discussion, linux-crypto,
	linux-fsdevel, bpf, linux-wpan

[!] Note: This is a work in progress.  At the moment, the following won't
    build if this patch is applied: nvme, kcm, smc and tls.

Remove ->sendpage() and ->sendpage_locked().  sendmsg() with
MSG_SPLICE_PAGES should be used instead.  This allows multiple pages and
multipage folios to be passed through.
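
As a rough illustration of the calling-convention change, here is a small
userspace model (not kernel code).  The old ->sendpage() shape needs one call
per page, whereas a sendmsg()-style call can describe several pages in one
segment array.  Every model_* name and MODEL_* flag value below is made up
for this sketch and does not correspond to any kernel definition.

```c
/* Userspace model only.  The model_* types and MODEL_* flag values are
 * hypothetical stand-ins and are NOT the kernel's definitions.
 */
#include <assert.h>
#include <stddef.h>

#define MODEL_MSG_MORE		0x1u	/* stand-in for MSG_MORE */
#define MODEL_MSG_SPLICE_PAGES	0x2u	/* stand-in for MSG_SPLICE_PAGES */
#define MODEL_MAX_SEGS		8

struct model_page { char data[4096]; };

/* A (page, length, offset) segment descriptor. */
struct model_bvec {
	struct model_page *page;
	size_t len;
	size_t offset;
};

struct model_msg {
	unsigned int flags;
	const struct model_bvec *bvec;	/* segments to transmit */
	size_t nr_segs;
	size_t count;			/* total bytes across all segments */
};

static unsigned int model_calls;	/* how often the "protocol" was entered */

/* Stand-in for a protocol sendmsg(): takes any number of segments per call. */
static long model_sendmsg(const struct model_msg *msg)
{
	size_t i, total = 0;

	assert(msg->flags & MODEL_MSG_SPLICE_PAGES);
	for (i = 0; i < msg->nr_segs; i++)
		total += msg->bvec[i].len;
	assert(total == msg->count);
	model_calls++;
	return (long)total;
}

/* Old shape: one call per page; all but the last carry the "more" hint. */
static long send_page_at_a_time(struct model_page *pages, size_t n, size_t len)
{
	long sent = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		struct model_bvec bv = { &pages[i], len, 0 };
		struct model_msg msg = {
			.flags = MODEL_MSG_SPLICE_PAGES |
				 (i + 1 < n ? MODEL_MSG_MORE : 0),
			.bvec = &bv,
			.nr_segs = 1,
			.count = len,
		};

		sent += model_sendmsg(&msg);
	}
	return sent;
}

/* New shape: a single call that describes every page at once. */
static long send_all_at_once(struct model_page *pages, size_t n, size_t len)
{
	struct model_bvec bv[MODEL_MAX_SEGS];
	struct model_msg msg = { .flags = MODEL_MSG_SPLICE_PAGES };
	size_t i;

	if (n > MODEL_MAX_SEGS)
		return -1;
	for (i = 0; i < n; i++) {
		bv[i] = (struct model_bvec){ &pages[i], len, 0 };
		msg.count += len;
	}
	msg.bvec = bv;
	msg.nr_segs = n;
	return model_sendmsg(&msg);
}
```

In this model, sending four pages enters model_sendmsg() four times through
the first helper and once through the second.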

Signed-off-by: David Howells <dhowells@redhat.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: bpf@vger.kernel.org
cc: dccp@vger.kernel.org
cc: linux-afs@lists.infradead.org
cc: linux-arm-msm@vger.kernel.org
cc: linux-can@vger.kernel.org
cc: linux-crypto@vger.kernel.org
cc: linux-doc@vger.kernel.org
cc: linux-hams@vger.kernel.org
cc: linux-kernel@vger.kernel.org
cc: linux-rdma@vger.kernel.org
cc: linux-sctp@vger.kernel.org
cc: linux-wpan@vger.kernel.org
cc: linux-x25@vger.kernel.org
cc: mptcp@lists.linux.dev
cc: netdev@vger.kernel.org
cc: rds-devel@oss.oracle.com
cc: tipc-discussion@lists.sourceforge.net
cc: virtualization@lists.linux-foundation.org
---
 Documentation/networking/scaling.rst |   4 +-
 crypto/af_alg.c                      |  29 ------
 crypto/algif_aead.c                  |  22 +----
 crypto/algif_rng.c                   |   2 -
 crypto/algif_skcipher.c              |  14 ---
 include/linux/net.h                  |   8 --
 include/net/inet_common.h            |   2 -
 include/net/sock.h                   |   6 --
 net/appletalk/ddp.c                  |   1 -
 net/atm/pvc.c                        |   1 -
 net/atm/svc.c                        |   1 -
 net/ax25/af_ax25.c                   |   1 -
 net/caif/caif_socket.c               |   2 -
 net/can/bcm.c                        |   1 -
 net/can/isotp.c                      |   1 -
 net/can/j1939/socket.c               |   1 -
 net/can/raw.c                        |   1 -
 net/core/sock.c                      |  35 +------
 net/dccp/ipv4.c                      |   1 -
 net/dccp/ipv6.c                      |   1 -
 net/ieee802154/socket.c              |   2 -
 net/ipv4/af_inet.c                   |  21 ----
 net/ipv4/tcp.c                       |  36 -------
 net/ipv4/tcp_bpf.c                   |  21 +---
 net/ipv4/tcp_ipv4.c                  |   1 -
 net/ipv4/udp.c                       |  22 -----
 net/ipv4/udp_impl.h                  |   2 -
 net/ipv4/udplite.c                   |   1 -
 net/ipv6/af_inet6.c                  |   3 -
 net/ipv6/raw.c                       |   1 -
 net/ipv6/tcp_ipv6.c                  |   1 -
 net/key/af_key.c                     |   1 -
 net/l2tp/l2tp_ip.c                   |   1 -
 net/l2tp/l2tp_ip6.c                  |   1 -
 net/llc/af_llc.c                     |   1 -
 net/mctp/af_mctp.c                   |   1 -
 net/mptcp/protocol.c                 |   2 -
 net/netlink/af_netlink.c             |   1 -
 net/netrom/af_netrom.c               |   1 -
 net/packet/af_packet.c               |   2 -
 net/phonet/socket.c                  |   2 -
 net/qrtr/af_qrtr.c                   |   1 -
 net/rds/af_rds.c                     |   1 -
 net/rose/af_rose.c                   |   1 -
 net/rxrpc/af_rxrpc.c                 |   1 -
 net/sctp/protocol.c                  |   1 -
 net/socket.c                         |  48 ---------
 net/tipc/socket.c                    |   3 -
 net/unix/af_unix.c                   | 139 ---------------------------
 net/vmw_vsock/af_vsock.c             |   3 -
 net/x25/af_x25.c                     |   1 -
 net/xdp/xsk.c                        |   1 -
 52 files changed, 9 insertions(+), 449 deletions(-)

diff --git a/Documentation/networking/scaling.rst b/Documentation/networking/scaling.rst
index 3d435caa3ef2..92c9fb46d6a2 100644
--- a/Documentation/networking/scaling.rst
+++ b/Documentation/networking/scaling.rst
@@ -269,8 +269,8 @@ a single application thread handles flows with many different flow hashes.
 rps_sock_flow_table is a global flow table that contains the *desired* CPU
 for flows: the CPU that is currently processing the flow in userspace.
 Each table value is a CPU index that is updated during calls to recvmsg
-and sendmsg (specifically, inet_recvmsg(), inet_sendmsg(), inet_sendpage()
-and tcp_splice_read()).
+and sendmsg (specifically, inet_recvmsg(), inet_sendmsg() and
+tcp_splice_read()).
 
 When the scheduler moves a thread to a new CPU while it has outstanding
 receive packets on the old CPU, packets may arrive out of order. To
diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index 0e77fce60876..225c90657f58 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -483,7 +483,6 @@ static const struct proto_ops alg_proto_ops = {
 	.listen		=	sock_no_listen,
 	.shutdown	=	sock_no_shutdown,
 	.mmap		=	sock_no_mmap,
-	.sendpage	=	sock_no_sendpage,
 	.sendmsg	=	sock_no_sendmsg,
 	.recvmsg	=	sock_no_recvmsg,
 
@@ -1135,34 +1134,6 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
 }
 EXPORT_SYMBOL_GPL(af_alg_sendmsg);
 
-/**
- * af_alg_sendpage - sendpage system call handler
- * @sock: socket of connection to user space to write to
- * @page: data to send
- * @offset: offset into page to begin sending
- * @size: length of data
- * @flags: message send/receive flags
- *
- * This is a generic implementation of sendpage to fill ctx->tsgl_list.
- */
-ssize_t af_alg_sendpage(struct socket *sock, struct page *page,
-			int offset, size_t size, int flags)
-{
-	struct bio_vec bvec;
-	struct msghdr msg = {
-		.msg_flags = flags | MSG_SPLICE_PAGES,
-	};
-
-	bvec_set_page(&bvec, page, size, offset);
-	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
-
-	if (flags & MSG_SENDPAGE_NOTLAST)
-		msg.msg_flags |= MSG_MORE;
-
-	return sock_sendmsg(sock, &msg);
-}
-EXPORT_SYMBOL_GPL(af_alg_sendpage);
-
 /**
  * af_alg_free_resources - release resources required for crypto request
  * @areq: Request holding the TX and RX SGL
diff --git a/crypto/algif_aead.c b/crypto/algif_aead.c
index 279eb17a1dfc..b65baefe6123 100644
--- a/crypto/algif_aead.c
+++ b/crypto/algif_aead.c
@@ -9,10 +9,10 @@
  * The following concept of the memory management is used:
  *
  * The kernel maintains two SGLs, the TX SGL and the RX SGL. The TX SGL is
- * filled by user space with the data submitted via sendpage. Filling up
- * the TX SGL does not cause a crypto operation -- the data will only be
- * tracked by the kernel. Upon receipt of one recvmsg call, the caller must
- * provide a buffer which is tracked with the RX SGL.
+ * filled by user space with the data submitted via sendmsg (maybe with
+ * MSG_SPLICE_PAGES).  Filling up the TX SGL does not cause a crypto operation
+ * -- the data will only be tracked by the kernel. Upon receipt of one recvmsg
+ * call, the caller must provide a buffer which is tracked with the RX SGL.
  *
  * During the processing of the recvmsg operation, the cipher request is
  * allocated and prepared. As part of the recvmsg operation, the processed
@@ -368,7 +368,6 @@ static struct proto_ops algif_aead_ops = {
 
 	.release	=	af_alg_release,
 	.sendmsg	=	aead_sendmsg,
-	.sendpage	=	af_alg_sendpage,
 	.recvmsg	=	aead_recvmsg,
 	.poll		=	af_alg_poll,
 };
@@ -420,18 +419,6 @@ static int aead_sendmsg_nokey(struct socket *sock, struct msghdr *msg,
 	return aead_sendmsg(sock, msg, size);
 }
 
-static ssize_t aead_sendpage_nokey(struct socket *sock, struct page *page,
-				       int offset, size_t size, int flags)
-{
-	int err;
-
-	err = aead_check_key(sock);
-	if (err)
-		return err;
-
-	return af_alg_sendpage(sock, page, offset, size, flags);
-}
-
 static int aead_recvmsg_nokey(struct socket *sock, struct msghdr *msg,
 				  size_t ignored, int flags)
 {
@@ -459,7 +446,6 @@ static struct proto_ops algif_aead_ops_nokey = {
 
 	.release	=	af_alg_release,
 	.sendmsg	=	aead_sendmsg_nokey,
-	.sendpage	=	aead_sendpage_nokey,
 	.recvmsg	=	aead_recvmsg_nokey,
 	.poll		=	af_alg_poll,
 };
diff --git a/crypto/algif_rng.c b/crypto/algif_rng.c
index 407408c43730..10c41adac3b1 100644
--- a/crypto/algif_rng.c
+++ b/crypto/algif_rng.c
@@ -174,7 +174,6 @@ static struct proto_ops algif_rng_ops = {
 	.bind		=	sock_no_bind,
 	.accept		=	sock_no_accept,
 	.sendmsg	=	sock_no_sendmsg,
-	.sendpage	=	sock_no_sendpage,
 
 	.release	=	af_alg_release,
 	.recvmsg	=	rng_recvmsg,
@@ -192,7 +191,6 @@ static struct proto_ops __maybe_unused algif_rng_test_ops = {
 	.mmap		=	sock_no_mmap,
 	.bind		=	sock_no_bind,
 	.accept		=	sock_no_accept,
-	.sendpage	=	sock_no_sendpage,
 
 	.release	=	af_alg_release,
 	.recvmsg	=	rng_test_recvmsg,
diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c
index 021f9ce7e87c..b34e20400e80 100644
--- a/crypto/algif_skcipher.c
+++ b/crypto/algif_skcipher.c
@@ -194,7 +194,6 @@ static struct proto_ops algif_skcipher_ops = {
 
 	.release	=	af_alg_release,
 	.sendmsg	=	skcipher_sendmsg,
-	.sendpage	=	af_alg_sendpage,
 	.recvmsg	=	skcipher_recvmsg,
 	.poll		=	af_alg_poll,
 };
@@ -246,18 +245,6 @@ static int skcipher_sendmsg_nokey(struct socket *sock, struct msghdr *msg,
 	return skcipher_sendmsg(sock, msg, size);
 }
 
-static ssize_t skcipher_sendpage_nokey(struct socket *sock, struct page *page,
-				       int offset, size_t size, int flags)
-{
-	int err;
-
-	err = skcipher_check_key(sock);
-	if (err)
-		return err;
-
-	return af_alg_sendpage(sock, page, offset, size, flags);
-}
-
 static int skcipher_recvmsg_nokey(struct socket *sock, struct msghdr *msg,
 				  size_t ignored, int flags)
 {
@@ -285,7 +272,6 @@ static struct proto_ops algif_skcipher_ops_nokey = {
 
 	.release	=	af_alg_release,
 	.sendmsg	=	skcipher_sendmsg_nokey,
-	.sendpage	=	skcipher_sendpage_nokey,
 	.recvmsg	=	skcipher_recvmsg_nokey,
 	.poll		=	af_alg_poll,
 };
diff --git a/include/linux/net.h b/include/linux/net.h
index b73ad8e3c212..e5794968ac9f 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -206,8 +206,6 @@ struct proto_ops {
 				      size_t total_len, int flags);
 	int		(*mmap)	     (struct file *file, struct socket *sock,
 				      struct vm_area_struct * vma);
-	ssize_t		(*sendpage)  (struct socket *sock, struct page *page,
-				      int offset, size_t size, int flags);
 	ssize_t 	(*splice_read)(struct socket *sock,  loff_t *ppos,
 				       struct pipe_inode_info *pipe, size_t len, unsigned int flags);
 	int		(*set_peek_off)(struct sock *sk, int val);
@@ -220,8 +218,6 @@ struct proto_ops {
 				     sk_read_actor_t recv_actor);
 	/* This is different from read_sock(), it reads an entire skb at a time. */
 	int		(*read_skb)(struct sock *sk, skb_read_actor_t recv_actor);
-	int		(*sendpage_locked)(struct sock *sk, struct page *page,
-					   int offset, size_t size, int flags);
 	int		(*sendmsg_locked)(struct sock *sk, struct msghdr *msg,
 					  size_t size);
 	int		(*set_rcvlowat)(struct sock *sk, int val);
@@ -339,10 +335,6 @@ int kernel_connect(struct socket *sock, struct sockaddr *addr, int addrlen,
 		   int flags);
 int kernel_getsockname(struct socket *sock, struct sockaddr *addr);
 int kernel_getpeername(struct socket *sock, struct sockaddr *addr);
-int kernel_sendpage(struct socket *sock, struct page *page, int offset,
-		    size_t size, int flags);
-int kernel_sendpage_locked(struct sock *sk, struct page *page, int offset,
-			   size_t size, int flags);
 int kernel_sock_shutdown(struct socket *sock, enum sock_shutdown_cmd how);
 
 /* Routine returns the IP overhead imposed by a (caller-protected) socket. */
diff --git a/include/net/inet_common.h b/include/net/inet_common.h
index cec453c18f1d..054c3388fa51 100644
--- a/include/net/inet_common.h
+++ b/include/net/inet_common.h
@@ -33,8 +33,6 @@ int inet_accept(struct socket *sock, struct socket *newsock, int flags,
 		bool kern);
 int inet_send_prepare(struct sock *sk);
 int inet_sendmsg(struct socket *sock, struct msghdr *msg, size_t size);
-ssize_t inet_sendpage(struct socket *sock, struct page *page, int offset,
-		      size_t size, int flags);
 int inet_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
 		 int flags);
 int inet_shutdown(struct socket *sock, int how);
diff --git a/include/net/sock.h b/include/net/sock.h
index 573f2bf7e0de..4618cd21e16b 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1265,8 +1265,6 @@ struct proto {
 					   size_t len);
 	int			(*recvmsg)(struct sock *sk, struct msghdr *msg,
 					   size_t len, int flags, int *addr_len);
-	int			(*sendpage)(struct sock *sk, struct page *page,
-					int offset, size_t size, int flags);
 	int			(*bind)(struct sock *sk,
 					struct sockaddr *addr, int addr_len);
 	int			(*bind_add)(struct sock *sk,
@@ -1906,10 +1904,6 @@ int sock_no_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t len);
 int sock_no_recvmsg(struct socket *, struct msghdr *, size_t, int);
 int sock_no_mmap(struct file *file, struct socket *sock,
 		 struct vm_area_struct *vma);
-ssize_t sock_no_sendpage(struct socket *sock, struct page *page, int offset,
-			 size_t size, int flags);
-ssize_t sock_no_sendpage_locked(struct sock *sk, struct page *page,
-				int offset, size_t size, int flags);
 
 /*
  * Functions to fill in entries in struct proto_ops when a protocol
diff --git a/net/appletalk/ddp.c b/net/appletalk/ddp.c
index a06f4d4a6f47..8978fb6212ff 100644
--- a/net/appletalk/ddp.c
+++ b/net/appletalk/ddp.c
@@ -1929,7 +1929,6 @@ static const struct proto_ops atalk_dgram_ops = {
 	.sendmsg	= atalk_sendmsg,
 	.recvmsg	= atalk_recvmsg,
 	.mmap		= sock_no_mmap,
-	.sendpage	= sock_no_sendpage,
 };
 
 static struct notifier_block ddp_notifier = {
diff --git a/net/atm/pvc.c b/net/atm/pvc.c
index 53e7d3f39e26..66d9a9bd5896 100644
--- a/net/atm/pvc.c
+++ b/net/atm/pvc.c
@@ -126,7 +126,6 @@ static const struct proto_ops pvc_proto_ops = {
 	.sendmsg =	vcc_sendmsg,
 	.recvmsg =	vcc_recvmsg,
 	.mmap =		sock_no_mmap,
-	.sendpage =	sock_no_sendpage,
 };
 
 
diff --git a/net/atm/svc.c b/net/atm/svc.c
index 4a02bcaad279..289240fe234e 100644
--- a/net/atm/svc.c
+++ b/net/atm/svc.c
@@ -649,7 +649,6 @@ static const struct proto_ops svc_proto_ops = {
 	.sendmsg =	vcc_sendmsg,
 	.recvmsg =	vcc_recvmsg,
 	.mmap =		sock_no_mmap,
-	.sendpage =	sock_no_sendpage,
 };
 
 
diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c
index d8da400cb4de..5db805d5f74d 100644
--- a/net/ax25/af_ax25.c
+++ b/net/ax25/af_ax25.c
@@ -2022,7 +2022,6 @@ static const struct proto_ops ax25_proto_ops = {
 	.sendmsg	= ax25_sendmsg,
 	.recvmsg	= ax25_recvmsg,
 	.mmap		= sock_no_mmap,
-	.sendpage	= sock_no_sendpage,
 };
 
 /*
diff --git a/net/caif/caif_socket.c b/net/caif/caif_socket.c
index 4eebcc66c19a..9c82698da4f5 100644
--- a/net/caif/caif_socket.c
+++ b/net/caif/caif_socket.c
@@ -976,7 +976,6 @@ static const struct proto_ops caif_seqpacket_ops = {
 	.sendmsg = caif_seqpkt_sendmsg,
 	.recvmsg = caif_seqpkt_recvmsg,
 	.mmap = sock_no_mmap,
-	.sendpage = sock_no_sendpage,
 };
 
 static const struct proto_ops caif_stream_ops = {
@@ -996,7 +995,6 @@ static const struct proto_ops caif_stream_ops = {
 	.sendmsg = caif_stream_sendmsg,
 	.recvmsg = caif_stream_recvmsg,
 	.mmap = sock_no_mmap,
-	.sendpage = sock_no_sendpage,
 };
 
 /* This function is called when a socket is finally destroyed. */
diff --git a/net/can/bcm.c b/net/can/bcm.c
index 27706f6ace34..65a946a36d92 100644
--- a/net/can/bcm.c
+++ b/net/can/bcm.c
@@ -1699,7 +1699,6 @@ static const struct proto_ops bcm_ops = {
 	.sendmsg       = bcm_sendmsg,
 	.recvmsg       = bcm_recvmsg,
 	.mmap          = sock_no_mmap,
-	.sendpage      = sock_no_sendpage,
 };
 
 static struct proto bcm_proto __read_mostly = {
diff --git a/net/can/isotp.c b/net/can/isotp.c
index 9bc344851704..0c3d11c29a2b 100644
--- a/net/can/isotp.c
+++ b/net/can/isotp.c
@@ -1633,7 +1633,6 @@ static const struct proto_ops isotp_ops = {
 	.sendmsg = isotp_sendmsg,
 	.recvmsg = isotp_recvmsg,
 	.mmap = sock_no_mmap,
-	.sendpage = sock_no_sendpage,
 };
 
 static struct proto isotp_proto __read_mostly = {
diff --git a/net/can/j1939/socket.c b/net/can/j1939/socket.c
index 7e90f9e61d9b..2bfe4f79bb67 100644
--- a/net/can/j1939/socket.c
+++ b/net/can/j1939/socket.c
@@ -1301,7 +1301,6 @@ static const struct proto_ops j1939_ops = {
 	.sendmsg = j1939_sk_sendmsg,
 	.recvmsg = j1939_sk_recvmsg,
 	.mmap = sock_no_mmap,
-	.sendpage = sock_no_sendpage,
 };
 
 static struct proto j1939_proto __read_mostly = {
diff --git a/net/can/raw.c b/net/can/raw.c
index f64469b98260..15c79b079184 100644
--- a/net/can/raw.c
+++ b/net/can/raw.c
@@ -962,7 +962,6 @@ static const struct proto_ops raw_ops = {
 	.sendmsg       = raw_sendmsg,
 	.recvmsg       = raw_recvmsg,
 	.mmap          = sock_no_mmap,
-	.sendpage      = sock_no_sendpage,
 };
 
 static struct proto raw_proto __read_mostly = {
diff --git a/net/core/sock.c b/net/core/sock.c
index 341c565dbc26..c2ae77bb2075 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -3223,36 +3223,6 @@ void __receive_sock(struct file *file)
 	}
 }
 
-ssize_t sock_no_sendpage(struct socket *sock, struct page *page, int offset, size_t size, int flags)
-{
-	ssize_t res;
-	struct msghdr msg = {.msg_flags = flags};
-	struct kvec iov;
-	char *kaddr = kmap(page);
-	iov.iov_base = kaddr + offset;
-	iov.iov_len = size;
-	res = kernel_sendmsg(sock, &msg, &iov, 1, size);
-	kunmap(page);
-	return res;
-}
-EXPORT_SYMBOL(sock_no_sendpage);
-
-ssize_t sock_no_sendpage_locked(struct sock *sk, struct page *page,
-				int offset, size_t size, int flags)
-{
-	ssize_t res;
-	struct msghdr msg = {.msg_flags = flags};
-	struct kvec iov;
-	char *kaddr = kmap(page);
-
-	iov.iov_base = kaddr + offset;
-	iov.iov_len = size;
-	res = kernel_sendmsg_locked(sk, &msg, &iov, 1, size);
-	kunmap(page);
-	return res;
-}
-EXPORT_SYMBOL(sock_no_sendpage_locked);
-
 /*
  *	Default Socket Callbacks
  */
@@ -4008,7 +3978,7 @@ static void proto_seq_printf(struct seq_file *seq, struct proto *proto)
 {
 
 	seq_printf(seq, "%-9s %4u %6d  %6ld   %-3s %6u   %-3s  %-10s "
-			"%2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c\n",
+			"%2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c %2c\n",
 		   proto->name,
 		   proto->obj_size,
 		   sock_prot_inuse_get(seq_file_net(seq), proto),
@@ -4029,7 +3999,6 @@ static void proto_seq_printf(struct seq_file *seq, struct proto *proto)
 		   proto_method_implemented(proto->getsockopt),
 		   proto_method_implemented(proto->sendmsg),
 		   proto_method_implemented(proto->recvmsg),
-		   proto_method_implemented(proto->sendpage),
 		   proto_method_implemented(proto->bind),
 		   proto_method_implemented(proto->backlog_rcv),
 		   proto_method_implemented(proto->hash),
@@ -4050,7 +4019,7 @@ static int proto_seq_show(struct seq_file *seq, void *v)
 			   "maxhdr",
 			   "slab",
 			   "module",
-			   "cl co di ac io in de sh ss gs se re sp bi br ha uh gp em\n");
+			   "cl co di ac io in de sh ss gs se re bi br ha uh gp em\n");
 	else
 		proto_seq_printf(seq, list_entry(v, struct proto, node));
 	return 0;
diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c
index b780827f5e0a..ea808de374ea 100644
--- a/net/dccp/ipv4.c
+++ b/net/dccp/ipv4.c
@@ -1008,7 +1008,6 @@ static const struct proto_ops inet_dccp_ops = {
 	.sendmsg	   = inet_sendmsg,
 	.recvmsg	   = sock_common_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = sock_no_sendpage,
 };
 
 static struct inet_protosw dccp_v4_protosw = {
diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c
index b9d7c3dd1cb3..23eb8159e3cd 100644
--- a/net/dccp/ipv6.c
+++ b/net/dccp/ipv6.c
@@ -1085,7 +1085,6 @@ static const struct proto_ops inet6_dccp_ops = {
 	.sendmsg	   = inet_sendmsg,
 	.recvmsg	   = sock_common_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = sock_no_sendpage,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	   = inet6_compat_ioctl,
 #endif
diff --git a/net/ieee802154/socket.c b/net/ieee802154/socket.c
index 1fa2fe041ec0..1238f036117f 100644
--- a/net/ieee802154/socket.c
+++ b/net/ieee802154/socket.c
@@ -426,7 +426,6 @@ static const struct proto_ops ieee802154_raw_ops = {
 	.sendmsg	   = ieee802154_sock_sendmsg,
 	.recvmsg	   = sock_common_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = sock_no_sendpage,
 };
 
 /* DGRAM Sockets (802.15.4 dataframes) */
@@ -990,7 +989,6 @@ static const struct proto_ops ieee802154_dgram_ops = {
 	.sendmsg	   = ieee802154_sock_sendmsg,
 	.recvmsg	   = sock_common_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = sock_no_sendpage,
 };
 
 static void ieee802154_sock_destruct(struct sock *sk)
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 8db6747f892f..869b49933f15 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -827,23 +827,6 @@ int inet_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
 }
 EXPORT_SYMBOL(inet_sendmsg);
 
-ssize_t inet_sendpage(struct socket *sock, struct page *page, int offset,
-		      size_t size, int flags)
-{
-	struct sock *sk = sock->sk;
-	const struct proto *prot;
-
-	if (unlikely(inet_send_prepare(sk)))
-		return -EAGAIN;
-
-	/* IPV6_ADDRFORM can change sk->sk_prot under us. */
-	prot = READ_ONCE(sk->sk_prot);
-	if (prot->sendpage)
-		return prot->sendpage(sk, page, offset, size, flags);
-	return sock_no_sendpage(sock, page, offset, size, flags);
-}
-EXPORT_SYMBOL(inet_sendpage);
-
 INDIRECT_CALLABLE_DECLARE(int udp_recvmsg(struct sock *, struct msghdr *,
 					  size_t, int, int *));
 int inet_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
@@ -1046,12 +1029,10 @@ const struct proto_ops inet_stream_ops = {
 #ifdef CONFIG_MMU
 	.mmap		   = tcp_mmap,
 #endif
-	.sendpage	   = inet_sendpage,
 	.splice_read	   = tcp_splice_read,
 	.read_sock	   = tcp_read_sock,
 	.read_skb	   = tcp_read_skb,
 	.sendmsg_locked    = tcp_sendmsg_locked,
-	.sendpage_locked   = tcp_sendpage_locked,
 	.peek_len	   = tcp_peek_len,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	   = inet_compat_ioctl,
@@ -1080,7 +1061,6 @@ const struct proto_ops inet_dgram_ops = {
 	.read_skb	   = udp_read_skb,
 	.recvmsg	   = inet_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = inet_sendpage,
 	.set_peek_off	   = sk_set_peek_off,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	   = inet_compat_ioctl,
@@ -1111,7 +1091,6 @@ static const struct proto_ops inet_sockraw_ops = {
 	.sendmsg	   = inet_sendmsg,
 	.recvmsg	   = inet_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = inet_sendpage,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	   = inet_compat_ioctl,
 #endif
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index f1454e4497df..26fa387f1084 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -971,42 +971,6 @@ static int tcp_wmem_schedule(struct sock *sk, int copy)
 	return min(copy, sk->sk_forward_alloc);
 }
 
-int tcp_sendpage_locked(struct sock *sk, struct page *page, int offset,
-			size_t size, int flags)
-{
-	struct bio_vec bvec;
-	struct msghdr msg = {
-		.msg_flags = flags | MSG_SPLICE_PAGES,
-	};
-
-	if (!(sk->sk_route_caps & NETIF_F_SG))
-		return sock_no_sendpage_locked(sk, page, offset, size, flags);
-
-	tcp_rate_check_app_limited(sk);  /* is sending application-limited? */
-
-	bvec_set_page(&bvec, page, size, offset);
-	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
-
-	if (flags & MSG_SENDPAGE_NOTLAST)
-		msg.msg_flags |= MSG_MORE;
-
-	return tcp_sendmsg_locked(sk, &msg, size);
-}
-EXPORT_SYMBOL_GPL(tcp_sendpage_locked);
-
-int tcp_sendpage(struct sock *sk, struct page *page, int offset,
-		 size_t size, int flags)
-{
-	int ret;
-
-	lock_sock(sk);
-	ret = tcp_sendpage_locked(sk, page, offset, size, flags);
-	release_sock(sk);
-
-	return ret;
-}
-EXPORT_SYMBOL(tcp_sendpage);
-
 void tcp_free_fastopen_req(struct tcp_sock *tp)
 {
 	if (tp->fastopen_req) {
diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index de37a4372437..ab83cfb9de22 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -482,23 +482,6 @@ static int tcp_bpf_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 	return copied ? copied : err;
 }
 
-static int tcp_bpf_sendpage(struct sock *sk, struct page *page, int offset,
-			    size_t size, int flags)
-{
-	struct bio_vec bvec;
-	struct msghdr msg = {
-		.msg_flags = flags | MSG_SPLICE_PAGES,
-	};
-
-	bvec_set_page(&bvec, page, size, offset);
-	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
-
-	if (flags & MSG_SENDPAGE_NOTLAST)
-		msg.msg_flags |= MSG_MORE;
-
-	return tcp_bpf_sendmsg(sk, &msg, size);
-}
-
 enum {
 	TCP_BPF_IPV4,
 	TCP_BPF_IPV6,
@@ -528,7 +511,6 @@ static void tcp_bpf_rebuild_protos(struct proto prot[TCP_BPF_NUM_CFGS],
 
 	prot[TCP_BPF_TX]			= prot[TCP_BPF_BASE];
 	prot[TCP_BPF_TX].sendmsg		= tcp_bpf_sendmsg;
-	prot[TCP_BPF_TX].sendpage		= tcp_bpf_sendpage;
 
 	prot[TCP_BPF_RX]			= prot[TCP_BPF_BASE];
 	prot[TCP_BPF_RX].recvmsg		= tcp_bpf_recvmsg_parser;
@@ -563,8 +545,7 @@ static int tcp_bpf_assert_proto_ops(struct proto *ops)
 	 * indeed valid assumptions.
 	 */
 	return ops->recvmsg  == tcp_recvmsg &&
-	       ops->sendmsg  == tcp_sendmsg &&
-	       ops->sendpage == tcp_sendpage ? 0 : -ENOTSUPP;
+	       ops->sendmsg  == tcp_sendmsg ? 0 : -ENOTSUPP;
 }
 
 int tcp_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore)
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index ea370afa70ed..5c2e1c1ca329 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -3112,7 +3112,6 @@ struct proto tcp_prot = {
 	.keepalive		= tcp_set_keepalive,
 	.recvmsg		= tcp_recvmsg,
 	.sendmsg		= tcp_sendmsg,
-	.sendpage		= tcp_sendpage,
 	.backlog_rcv		= tcp_v4_do_rcv,
 	.release_cb		= tcp_release_cb,
 	.hash			= inet_hash,
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 097feb92e215..85bd5960f7ef 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1329,27 +1329,6 @@ int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 }
 EXPORT_SYMBOL(udp_sendmsg);
 
-int udp_sendpage(struct sock *sk, struct page *page, int offset,
-		 size_t size, int flags)
-{
-	struct bio_vec bvec;
-	struct msghdr msg = {
-		.msg_flags = flags | MSG_SPLICE_PAGES | MSG_MORE
-	};
-	int ret;
-
-	bvec_set_page(&bvec, page, size, offset);
-	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
-
-	if (flags & MSG_SENDPAGE_NOTLAST)
-		msg.msg_flags |= MSG_MORE;
-
-	lock_sock(sk);
-	ret = udp_sendmsg(sk, &msg, size);
-	release_sock(sk);
-	return ret;
-}
-
 #define UDP_SKB_IS_STATELESS 0x80000000
 
 /* all head states (dst, sk, nf conntrack) except skb extensions are
@@ -2926,7 +2905,6 @@ struct proto udp_prot = {
 	.getsockopt		= udp_getsockopt,
 	.sendmsg		= udp_sendmsg,
 	.recvmsg		= udp_recvmsg,
-	.sendpage		= udp_sendpage,
 	.release_cb		= ip4_datagram_release_cb,
 	.hash			= udp_lib_hash,
 	.unhash			= udp_lib_unhash,
diff --git a/net/ipv4/udp_impl.h b/net/ipv4/udp_impl.h
index 4ba7a88a1b1d..e1ff3a375996 100644
--- a/net/ipv4/udp_impl.h
+++ b/net/ipv4/udp_impl.h
@@ -19,8 +19,6 @@ int udp_getsockopt(struct sock *sk, int level, int optname,
 
 int udp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int flags,
 		int *addr_len);
-int udp_sendpage(struct sock *sk, struct page *page, int offset, size_t size,
-		 int flags);
 void udp_destroy_sock(struct sock *sk);
 
 #ifdef CONFIG_PROC_FS
diff --git a/net/ipv4/udplite.c b/net/ipv4/udplite.c
index e0c9cc39b81e..69870f0afc6c 100644
--- a/net/ipv4/udplite.c
+++ b/net/ipv4/udplite.c
@@ -54,7 +54,6 @@ struct proto 	udplite_prot = {
 	.getsockopt	   = udp_getsockopt,
 	.sendmsg	   = udp_sendmsg,
 	.recvmsg	   = udp_recvmsg,
-	.sendpage	   = udp_sendpage,
 	.hash		   = udp_lib_hash,
 	.unhash		   = udp_lib_unhash,
 	.rehash		   = udp_v4_rehash,
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index 38689bedfce7..769c76d59053 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -695,9 +695,7 @@ const struct proto_ops inet6_stream_ops = {
 #ifdef CONFIG_MMU
 	.mmap		   = tcp_mmap,
 #endif
-	.sendpage	   = inet_sendpage,
 	.sendmsg_locked    = tcp_sendmsg_locked,
-	.sendpage_locked   = tcp_sendpage_locked,
 	.splice_read	   = tcp_splice_read,
 	.read_sock	   = tcp_read_sock,
 	.read_skb	   = tcp_read_skb,
@@ -728,7 +726,6 @@ const struct proto_ops inet6_dgram_ops = {
 	.recvmsg	   = inet6_recvmsg,		/* retpoline's sake */
 	.read_skb	   = udp_read_skb,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = sock_no_sendpage,
 	.set_peek_off	   = sk_set_peek_off,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	   = inet6_compat_ioctl,
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index bac9ba747bde..c6c062678c0e 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -1298,7 +1298,6 @@ const struct proto_ops inet6_sockraw_ops = {
 	.sendmsg	   = inet_sendmsg,		/* ok		*/
 	.recvmsg	   = sock_common_recvmsg,	/* ok		*/
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = sock_no_sendpage,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	   = inet6_compat_ioctl,
 #endif
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 1bf93b61aa06..03ba1e389901 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -2151,7 +2151,6 @@ struct proto tcpv6_prot = {
 	.keepalive		= tcp_set_keepalive,
 	.recvmsg		= tcp_recvmsg,
 	.sendmsg		= tcp_sendmsg,
-	.sendpage		= tcp_sendpage,
 	.backlog_rcv		= tcp_v6_do_rcv,
 	.release_cb		= tcp_release_cb,
 	.hash			= inet6_hash,
diff --git a/net/key/af_key.c b/net/key/af_key.c
index a815f5ab4c49..bf59d42dc697 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -3757,7 +3757,6 @@ static const struct proto_ops pfkey_ops = {
 	.listen		=	sock_no_listen,
 	.shutdown	=	sock_no_shutdown,
 	.mmap		=	sock_no_mmap,
-	.sendpage	=	sock_no_sendpage,
 
 	/* Now the operations that really occur. */
 	.release	=	pfkey_release,
diff --git a/net/l2tp/l2tp_ip.c b/net/l2tp/l2tp_ip.c
index 4db5a554bdbd..d0dcbe3a4cd7 100644
--- a/net/l2tp/l2tp_ip.c
+++ b/net/l2tp/l2tp_ip.c
@@ -625,7 +625,6 @@ static const struct proto_ops l2tp_ip_ops = {
 	.sendmsg	   = inet_sendmsg,
 	.recvmsg	   = sock_common_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = sock_no_sendpage,
 };
 
 static struct inet_protosw l2tp_ip_protosw = {
diff --git a/net/l2tp/l2tp_ip6.c b/net/l2tp/l2tp_ip6.c
index 2478aa60145f..49296ce14a90 100644
--- a/net/l2tp/l2tp_ip6.c
+++ b/net/l2tp/l2tp_ip6.c
@@ -751,7 +751,6 @@ static const struct proto_ops l2tp_ip6_ops = {
 	.sendmsg	   = inet_sendmsg,
 	.recvmsg	   = sock_common_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = sock_no_sendpage,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	   = inet6_compat_ioctl,
 #endif
diff --git a/net/llc/af_llc.c b/net/llc/af_llc.c
index da7fe94bea2e..addd94da2a81 100644
--- a/net/llc/af_llc.c
+++ b/net/llc/af_llc.c
@@ -1230,7 +1230,6 @@ static const struct proto_ops llc_ui_ops = {
 	.sendmsg     = llc_ui_sendmsg,
 	.recvmsg     = llc_ui_recvmsg,
 	.mmap	     = sock_no_mmap,
-	.sendpage    = sock_no_sendpage,
 };
 
 static const char llc_proc_err_msg[] __initconst =
diff --git a/net/mctp/af_mctp.c b/net/mctp/af_mctp.c
index 3150f3f0c872..c6fe2e6b85dd 100644
--- a/net/mctp/af_mctp.c
+++ b/net/mctp/af_mctp.c
@@ -485,7 +485,6 @@ static const struct proto_ops mctp_dgram_ops = {
 	.sendmsg	= mctp_sendmsg,
 	.recvmsg	= mctp_recvmsg,
 	.mmap		= sock_no_mmap,
-	.sendpage	= sock_no_sendpage,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	= mctp_compat_ioctl,
 #endif
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 3ad9c46202fc..ade89b8d0082 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -3816,7 +3816,6 @@ static const struct proto_ops mptcp_stream_ops = {
 	.sendmsg	   = inet_sendmsg,
 	.recvmsg	   = inet_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = inet_sendpage,
 };
 
 static struct inet_protosw mptcp_protosw = {
@@ -3911,7 +3910,6 @@ static const struct proto_ops mptcp_v6_stream_ops = {
 	.sendmsg	   = inet6_sendmsg,
 	.recvmsg	   = inet6_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = inet_sendpage,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	   = inet6_compat_ioctl,
 #endif
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index c64277659753..f70073a3bb49 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -2841,7 +2841,6 @@ static const struct proto_ops netlink_ops = {
 	.sendmsg =	netlink_sendmsg,
 	.recvmsg =	netlink_recvmsg,
 	.mmap =		sock_no_mmap,
-	.sendpage =	sock_no_sendpage,
 };
 
 static const struct net_proto_family netlink_family_ops = {
diff --git a/net/netrom/af_netrom.c b/net/netrom/af_netrom.c
index 5a4cb796150f..eb8ccbd58df7 100644
--- a/net/netrom/af_netrom.c
+++ b/net/netrom/af_netrom.c
@@ -1364,7 +1364,6 @@ static const struct proto_ops nr_proto_ops = {
 	.sendmsg	=	nr_sendmsg,
 	.recvmsg	=	nr_recvmsg,
 	.mmap		=	sock_no_mmap,
-	.sendpage	=	sock_no_sendpage,
 };
 
 static struct notifier_block nr_dev_notifier = {
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index d4e76e2ae153..385bd4982b80 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -4604,7 +4604,6 @@ static const struct proto_ops packet_ops_spkt = {
 	.sendmsg =	packet_sendmsg_spkt,
 	.recvmsg =	packet_recvmsg,
 	.mmap =		sock_no_mmap,
-	.sendpage =	sock_no_sendpage,
 };
 
 static const struct proto_ops packet_ops = {
@@ -4626,7 +4625,6 @@ static const struct proto_ops packet_ops = {
 	.sendmsg =	packet_sendmsg,
 	.recvmsg =	packet_recvmsg,
 	.mmap =		packet_mmap,
-	.sendpage =	sock_no_sendpage,
 };
 
 static const struct net_proto_family packet_family_ops = {
diff --git a/net/phonet/socket.c b/net/phonet/socket.c
index 71e2caf6ab85..a246f7d0a817 100644
--- a/net/phonet/socket.c
+++ b/net/phonet/socket.c
@@ -441,7 +441,6 @@ const struct proto_ops phonet_dgram_ops = {
 	.sendmsg	= pn_socket_sendmsg,
 	.recvmsg	= sock_common_recvmsg,
 	.mmap		= sock_no_mmap,
-	.sendpage	= sock_no_sendpage,
 };
 
 const struct proto_ops phonet_stream_ops = {
@@ -462,7 +461,6 @@ const struct proto_ops phonet_stream_ops = {
 	.sendmsg	= pn_socket_sendmsg,
 	.recvmsg	= sock_common_recvmsg,
 	.mmap		= sock_no_mmap,
-	.sendpage	= sock_no_sendpage,
 };
 EXPORT_SYMBOL(phonet_stream_ops);
 
diff --git a/net/qrtr/af_qrtr.c b/net/qrtr/af_qrtr.c
index 5c2fb992803b..5bb7d680bd5f 100644
--- a/net/qrtr/af_qrtr.c
+++ b/net/qrtr/af_qrtr.c
@@ -1240,7 +1240,6 @@ static const struct proto_ops qrtr_proto_ops = {
 	.shutdown	= sock_no_shutdown,
 	.release	= qrtr_release,
 	.mmap		= sock_no_mmap,
-	.sendpage	= sock_no_sendpage,
 };
 
 static struct proto qrtr_proto = {
diff --git a/net/rds/af_rds.c b/net/rds/af_rds.c
index 3ff6995244e5..01c4cdfef45d 100644
--- a/net/rds/af_rds.c
+++ b/net/rds/af_rds.c
@@ -653,7 +653,6 @@ static const struct proto_ops rds_proto_ops = {
 	.sendmsg =	rds_sendmsg,
 	.recvmsg =	rds_recvmsg,
 	.mmap =		sock_no_mmap,
-	.sendpage =	sock_no_sendpage,
 };
 
 static void rds_sock_destruct(struct sock *sk)
diff --git a/net/rose/af_rose.c b/net/rose/af_rose.c
index ca2b17f32670..49dafe9ac72f 100644
--- a/net/rose/af_rose.c
+++ b/net/rose/af_rose.c
@@ -1496,7 +1496,6 @@ static const struct proto_ops rose_proto_ops = {
 	.sendmsg	=	rose_sendmsg,
 	.recvmsg	=	rose_recvmsg,
 	.mmap		=	sock_no_mmap,
-	.sendpage	=	sock_no_sendpage,
 };
 
 static struct notifier_block rose_dev_notifier = {
diff --git a/net/rxrpc/af_rxrpc.c b/net/rxrpc/af_rxrpc.c
index 102f5cbff91a..182495804f8f 100644
--- a/net/rxrpc/af_rxrpc.c
+++ b/net/rxrpc/af_rxrpc.c
@@ -938,7 +938,6 @@ static const struct proto_ops rxrpc_rpc_ops = {
 	.sendmsg	= rxrpc_sendmsg,
 	.recvmsg	= rxrpc_recvmsg,
 	.mmap		= sock_no_mmap,
-	.sendpage	= sock_no_sendpage,
 };
 
 static struct proto rxrpc_proto = {
diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
index c365df24ad33..acb2d2a69268 100644
--- a/net/sctp/protocol.c
+++ b/net/sctp/protocol.c
@@ -1135,7 +1135,6 @@ static const struct proto_ops inet_seqpacket_ops = {
 	.sendmsg	   = inet_sendmsg,
 	.recvmsg	   = inet_recvmsg,
 	.mmap		   = sock_no_mmap,
-	.sendpage	   = sock_no_sendpage,
 };
 
 /* Registration with AF_INET family.  */
diff --git a/net/socket.c b/net/socket.c
index 1b48a976b8cc..130d6ce7f82d 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -3541,54 +3541,6 @@ int kernel_getpeername(struct socket *sock, struct sockaddr *addr)
 }
 EXPORT_SYMBOL(kernel_getpeername);
 
-/**
- *	kernel_sendpage - send a &page through a socket (kernel space)
- *	@sock: socket
- *	@page: page
- *	@offset: page offset
- *	@size: total size in bytes
- *	@flags: flags (MSG_DONTWAIT, ...)
- *
- *	Returns the total amount sent in bytes or an error.
- */
-
-int kernel_sendpage(struct socket *sock, struct page *page, int offset,
-		    size_t size, int flags)
-{
-	if (sock->ops->sendpage) {
-		/* Warn in case the improper page to zero-copy send */
-		WARN_ONCE(!sendpage_ok(page), "improper page for zero-copy send");
-		return sock->ops->sendpage(sock, page, offset, size, flags);
-	}
-	return sock_no_sendpage(sock, page, offset, size, flags);
-}
-EXPORT_SYMBOL(kernel_sendpage);
-
-/**
- *	kernel_sendpage_locked - send a &page through the locked sock (kernel space)
- *	@sk: sock
- *	@page: page
- *	@offset: page offset
- *	@size: total size in bytes
- *	@flags: flags (MSG_DONTWAIT, ...)
- *
- *	Returns the total amount sent in bytes or an error.
- *	Caller must hold @sk.
- */
-
-int kernel_sendpage_locked(struct sock *sk, struct page *page, int offset,
-			   size_t size, int flags)
-{
-	struct socket *sock = sk->sk_socket;
-
-	if (sock->ops->sendpage_locked)
-		return sock->ops->sendpage_locked(sk, page, offset, size,
-						  flags);
-
-	return sock_no_sendpage_locked(sk, page, offset, size, flags);
-}
-EXPORT_SYMBOL(kernel_sendpage_locked);
-
 /**
  *	kernel_sock_shutdown - shut down part of a full-duplex connection (kernel space)
  *	@sock: socket
diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 37edfe10f8c6..d2072fbf3272 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -3375,7 +3375,6 @@ static const struct proto_ops msg_ops = {
 	.sendmsg	= tipc_sendmsg,
 	.recvmsg	= tipc_recvmsg,
 	.mmap		= sock_no_mmap,
-	.sendpage	= sock_no_sendpage
 };
 
 static const struct proto_ops packet_ops = {
@@ -3396,7 +3395,6 @@ static const struct proto_ops packet_ops = {
 	.sendmsg	= tipc_send_packet,
 	.recvmsg	= tipc_recvmsg,
 	.mmap		= sock_no_mmap,
-	.sendpage	= sock_no_sendpage
 };
 
 static const struct proto_ops stream_ops = {
@@ -3417,7 +3415,6 @@ static const struct proto_ops stream_ops = {
 	.sendmsg	= tipc_sendstream,
 	.recvmsg	= tipc_recvstream,
 	.mmap		= sock_no_mmap,
-	.sendpage	= sock_no_sendpage
 };
 
 static const struct net_proto_family tipc_family_ops = {
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 6f3454db9c53..407f449df564 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -758,8 +758,6 @@ static int unix_compat_ioctl(struct socket *sock, unsigned int cmd, unsigned lon
 static int unix_shutdown(struct socket *, int);
 static int unix_stream_sendmsg(struct socket *, struct msghdr *, size_t);
 static int unix_stream_recvmsg(struct socket *, struct msghdr *, size_t, int);
-static ssize_t unix_stream_sendpage(struct socket *, struct page *, int offset,
-				    size_t size, int flags);
 static ssize_t unix_stream_splice_read(struct socket *,  loff_t *ppos,
 				       struct pipe_inode_info *, size_t size,
 				       unsigned int flags);
@@ -852,7 +850,6 @@ static const struct proto_ops unix_stream_ops = {
 	.recvmsg =	unix_stream_recvmsg,
 	.read_skb =	unix_stream_read_skb,
 	.mmap =		sock_no_mmap,
-	.sendpage =	unix_stream_sendpage,
 	.splice_read =	unix_stream_splice_read,
 	.set_peek_off =	unix_set_peek_off,
 	.show_fdinfo =	unix_show_fdinfo,
@@ -878,7 +875,6 @@ static const struct proto_ops unix_dgram_ops = {
 	.read_skb =	unix_read_skb,
 	.recvmsg =	unix_dgram_recvmsg,
 	.mmap =		sock_no_mmap,
-	.sendpage =	sock_no_sendpage,
 	.set_peek_off =	unix_set_peek_off,
 	.show_fdinfo =	unix_show_fdinfo,
 };
@@ -902,7 +898,6 @@ static const struct proto_ops unix_seqpacket_ops = {
 	.sendmsg =	unix_seqpacket_sendmsg,
 	.recvmsg =	unix_seqpacket_recvmsg,
 	.mmap =		sock_no_mmap,
-	.sendpage =	sock_no_sendpage,
 	.set_peek_off =	unix_set_peek_off,
 	.show_fdinfo =	unix_show_fdinfo,
 };
@@ -1839,24 +1834,6 @@ static void maybe_add_creds(struct sk_buff *skb, const struct socket *sock,
 	}
 }
 
-static int maybe_init_creds(struct scm_cookie *scm,
-			    struct socket *socket,
-			    const struct sock *other)
-{
-	int err;
-	struct msghdr msg = { .msg_controllen = 0 };
-
-	err = scm_send(socket, &msg, scm, false);
-	if (err)
-		return err;
-
-	if (unix_passcred_enabled(socket, other)) {
-		scm->pid = get_pid(task_tgid(current));
-		current_uid_gid(&scm->creds.uid, &scm->creds.gid);
-	}
-	return err;
-}
-
 static bool unix_skb_scm_eq(struct sk_buff *skb,
 			    struct scm_cookie *scm)
 {
@@ -2318,122 +2295,6 @@ static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg,
 	return sent ? : err;
 }
 
-static ssize_t unix_stream_sendpage(struct socket *socket, struct page *page,
-				    int offset, size_t size, int flags)
-{
-	int err;
-	bool send_sigpipe = false;
-	bool init_scm = true;
-	struct scm_cookie scm;
-	struct sock *other, *sk = socket->sk;
-	struct sk_buff *skb, *newskb = NULL, *tail = NULL;
-
-	if (flags & MSG_OOB)
-		return -EOPNOTSUPP;
-
-	other = unix_peer(sk);
-	if (!other || sk->sk_state != TCP_ESTABLISHED)
-		return -ENOTCONN;
-
-	if (false) {
-alloc_skb:
-		unix_state_unlock(other);
-		mutex_unlock(&unix_sk(other)->iolock);
-		newskb = sock_alloc_send_pskb(sk, 0, 0, flags & MSG_DONTWAIT,
-					      &err, 0);
-		if (!newskb)
-			goto err;
-	}
-
-	/* we must acquire iolock as we modify already present
-	 * skbs in the sk_receive_queue and mess with skb->len
-	 */
-	err = mutex_lock_interruptible(&unix_sk(other)->iolock);
-	if (err) {
-		err = flags & MSG_DONTWAIT ? -EAGAIN : -ERESTARTSYS;
-		goto err;
-	}
-
-	if (sk->sk_shutdown & SEND_SHUTDOWN) {
-		err = -EPIPE;
-		send_sigpipe = true;
-		goto err_unlock;
-	}
-
-	unix_state_lock(other);
-
-	if (sock_flag(other, SOCK_DEAD) ||
-	    other->sk_shutdown & RCV_SHUTDOWN) {
-		err = -EPIPE;
-		send_sigpipe = true;
-		goto err_state_unlock;
-	}
-
-	if (init_scm) {
-		err = maybe_init_creds(&scm, socket, other);
-		if (err)
-			goto err_state_unlock;
-		init_scm = false;
-	}
-
-	skb = skb_peek_tail(&other->sk_receive_queue);
-	if (tail && tail == skb) {
-		skb = newskb;
-	} else if (!skb || !unix_skb_scm_eq(skb, &scm)) {
-		if (newskb) {
-			skb = newskb;
-		} else {
-			tail = skb;
-			goto alloc_skb;
-		}
-	} else if (newskb) {
-		/* this is fast path, we don't necessarily need to
-		 * call to kfree_skb even though with newskb == NULL
-		 * this - does no harm
-		 */
-		consume_skb(newskb);
-		newskb = NULL;
-	}
-
-	if (skb_append_pagefrags(skb, page, offset, size)) {
-		tail = skb;
-		goto alloc_skb;
-	}
-
-	skb->len += size;
-	skb->data_len += size;
-	skb->truesize += size;
-	refcount_add(size, &sk->sk_wmem_alloc);
-
-	if (newskb) {
-		err = unix_scm_to_skb(&scm, skb, false);
-		if (err)
-			goto err_state_unlock;
-		spin_lock(&other->sk_receive_queue.lock);
-		__skb_queue_tail(&other->sk_receive_queue, newskb);
-		spin_unlock(&other->sk_receive_queue.lock);
-	}
-
-	unix_state_unlock(other);
-	mutex_unlock(&unix_sk(other)->iolock);
-
-	other->sk_data_ready(other);
-	scm_destroy(&scm);
-	return size;
-
-err_state_unlock:
-	unix_state_unlock(other);
-err_unlock:
-	mutex_unlock(&unix_sk(other)->iolock);
-err:
-	kfree_skb(newskb);
-	if (send_sigpipe && !(flags & MSG_NOSIGNAL))
-		send_sig(SIGPIPE, current, 0);
-	if (!init_scm)
-		scm_destroy(&scm);
-	return err;
-}
-
 static int unix_seqpacket_sendmsg(struct socket *sock, struct msghdr *msg,
 				  size_t len)
 {
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 19aea7cba26e..d0e476755cdc 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1271,7 +1271,6 @@ static const struct proto_ops vsock_dgram_ops = {
 	.sendmsg = vsock_dgram_sendmsg,
 	.recvmsg = vsock_dgram_recvmsg,
 	.mmap = sock_no_mmap,
-	.sendpage = sock_no_sendpage,
 };
 
 static int vsock_transport_cancel_pkt(struct vsock_sock *vsk)
@@ -2186,7 +2185,6 @@ static const struct proto_ops vsock_stream_ops = {
 	.sendmsg = vsock_connectible_sendmsg,
 	.recvmsg = vsock_connectible_recvmsg,
 	.mmap = sock_no_mmap,
-	.sendpage = sock_no_sendpage,
 	.set_rcvlowat = vsock_set_rcvlowat,
 };
 
@@ -2208,7 +2206,6 @@ static const struct proto_ops vsock_seqpacket_ops = {
 	.sendmsg = vsock_connectible_sendmsg,
 	.recvmsg = vsock_connectible_recvmsg,
 	.mmap = sock_no_mmap,
-	.sendpage = sock_no_sendpage,
 };
 
 static int vsock_create(struct net *net, struct socket *sock,
diff --git a/net/x25/af_x25.c b/net/x25/af_x25.c
index 5c7ad301d742..0fb5143bec7a 100644
--- a/net/x25/af_x25.c
+++ b/net/x25/af_x25.c
@@ -1757,7 +1757,6 @@ static const struct proto_ops x25_proto_ops = {
 	.sendmsg =	x25_sendmsg,
 	.recvmsg =	x25_recvmsg,
 	.mmap =		sock_no_mmap,
-	.sendpage =	sock_no_sendpage,
 };
 
 static struct packet_type x25_packet_type __read_mostly = {
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 2ac58b282b5e..eff1f0aaa4b5 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -1386,7 +1386,6 @@ static const struct proto_ops xsk_proto_ops = {
 	.sendmsg	= xsk_sendmsg,
 	.recvmsg	= xsk_recvmsg,
 	.mmap		= xsk_mmap,
-	.sendpage	= sock_no_sendpage,
 };
 
 static void xsk_destruct(struct sock *sk)

^ permalink raw reply related	[flat|nested] 81+ messages in thread

* Re: [RFC PATCH 28/28] sock: Remove ->sendpage*() in favour of sendmsg(MSG_SPLICE_PAGES)
  2023-03-16 15:26   ` David Howells
  (?)
@ 2023-03-16 15:57     ` Marc Kleine-Budde
  -1 siblings, 0 replies; 81+ messages in thread
From: Marc Kleine-Budde @ 2023-03-16 15:57 UTC (permalink / raw)
  To: David Howells
  Cc: Matthew Wilcox, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Al Viro, Christoph Hellwig, Jens Axboe, Jeff Layton,
	Christian Brauner, Linus Torvalds, netdev, linux-fsdevel,
	linux-kernel, linux-mm, bpf, dccp, linux-afs, linux-arm-msm,
	linux-can, linux-crypto, linux-doc, linux-hams, linux-rdma,
	linux-sctp, linux-wpan, linux-x25, mptcp, rds-devel,
	tipc-discussion, virtualization

On 16.03.2023 15:26:18, David Howells wrote:
> [!] Note: This is a work in progress.  At the moment, some things won't
>     build if this patch is applied.  nvme, kcm, smc, tls.
> 
> Remove ->sendpage() and ->sendpage_locked().  sendmsg() with
> MSG_SPLICE_PAGES should be used instead.  This allows multiple pages and
> multipage folios to be passed through.
> 
> Signed-off-by: David Howells <dhowells@redhat.com>

> cc: linux-can@vger.kernel.org

Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> # for net/can

Marc

-- 
Pengutronix e.K.                 | Marc Kleine-Budde           |
Embedded Linux                   | https://www.pengutronix.de  |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917-5555 |

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [RFC PATCH 27/28] sunrpc: Use sendmsg(MSG_SPLICE_PAGES) rather then sendpage
  2023-03-16 15:26 ` [RFC PATCH 27/28] sunrpc: Use sendmsg(MSG_SPLICE_PAGES) rather then sendpage David Howells
@ 2023-03-16 16:17   ` Trond Myklebust
  2023-03-16 17:10     ` Chuck Lever III
                       ` (2 more replies)
  2023-03-16 16:24   ` David Howells
  1 sibling, 3 replies; 81+ messages in thread
From: Trond Myklebust @ 2023-03-16 16:17 UTC (permalink / raw)
  To: David Howells
  Cc: Matthew Wilcox, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Alexander Viro, Christoph Hellwig, Jens Axboe,
	Jeffrey Layton, Christian Brauner, Linus Torvalds, netdev,
	linux-fsdevel, linux-kernel, linux-mm, Trond Myklebust,
	Anna Schumaker, Charles Edward Lever, linux-nfs



> On Mar 16, 2023, at 11:26, David Howells <dhowells@redhat.com> wrote:
> 
> When transmitting data, call down into TCP using a single sendmsg with
> MSG_SPLICE_PAGES to indicate that content should be spliced rather than
> performing several sendmsg and sendpage calls to transmit header, data
> pages and trailer.
> 
> To make this work, the data is assembled in a bio_vec array and attached to
> a BVEC-type iterator.  The bio_vec array has two extra slots before the
> first for headers and one after the last for a trailer.  The headers and
> trailer are copied into memory acquired from zcopy_alloc() which just
> breaks a page up into small pieces that can be freed with put_page().
> 
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Trond Myklebust <trond.myklebust@hammerspace.com>
> cc: Anna Schumaker <anna@kernel.org>
> cc: Chuck Lever <chuck.lever@oracle.com>
> cc: Jeff Layton <jlayton@kernel.org>
> cc: "David S. Miller" <davem@davemloft.net>
> cc: Eric Dumazet <edumazet@google.com>
> cc: Jakub Kicinski <kuba@kernel.org>
> cc: Paolo Abeni <pabeni@redhat.com>
> cc: Jens Axboe <axboe@kernel.dk>
> cc: Matthew Wilcox <willy@infradead.org>
> cc: linux-nfs@vger.kernel.org
> cc: netdev@vger.kernel.org
> ---
> net/sunrpc/svcsock.c | 70 ++++++++++++--------------------------------
> net/sunrpc/xdr.c     | 24 ++++++++++++---
> 2 files changed, 38 insertions(+), 56 deletions(-)
> 
> diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
> index 03a4f5615086..1fa41ddbc40e 100644
> --- a/net/sunrpc/svcsock.c
> +++ b/net/sunrpc/svcsock.c
> @@ -36,6 +36,7 @@
> #include <linux/skbuff.h>
> #include <linux/file.h>
> #include <linux/freezer.h>
> +#include <linux/zcopy_alloc.h>
> #include <net/sock.h>
> #include <net/checksum.h>
> #include <net/ip.h>
> @@ -1060,16 +1061,8 @@ static int svc_tcp_recvfrom(struct svc_rqst *rqstp)
> return 0; /* record not complete */
> }
> 
> -static int svc_tcp_send_kvec(struct socket *sock, const struct kvec *vec,
> -      int flags)
> -{
> - return kernel_sendpage(sock, virt_to_page(vec->iov_base),
> -       offset_in_page(vec->iov_base),
> -       vec->iov_len, flags);
> -}
> -
> /*
> - * kernel_sendpage() is used exclusively to reduce the number of
> + * MSG_SPLICE_PAGES is used exclusively to reduce the number of
>  * copy operations in this path. Therefore the caller must ensure
>  * that the pages backing @xdr are unchanging.
>  *
> @@ -1081,65 +1074,38 @@ static int svc_tcp_sendmsg(struct socket *sock, struct xdr_buf *xdr,
> {
> const struct kvec *head = xdr->head;
> const struct kvec *tail = xdr->tail;
> - struct kvec rm = {
> - .iov_base = &marker,
> - .iov_len = sizeof(marker),
> - };
> struct msghdr msg = {
> - .msg_flags = 0,
> + .msg_flags = MSG_SPLICE_PAGES,
> };
> - int ret;
> + int ret, n = xdr_buf_pagecount(xdr), size;
> 
> *sentp = 0;
> ret = xdr_alloc_bvec(xdr, GFP_KERNEL);
> if (ret < 0)
> return ret;
> 
> - ret = kernel_sendmsg(sock, &msg, &rm, 1, rm.iov_len);
> + ret = zcopy_memdup(sizeof(marker), &marker, &xdr->bvec[-2], GFP_KERNEL);
> if (ret < 0)
> return ret;
> - *sentp += ret;
> - if (ret != rm.iov_len)
> - return -EAGAIN;
> 
> - ret = svc_tcp_send_kvec(sock, head, 0);
> + ret = zcopy_memdup(head->iov_len, head->iov_base, &xdr->bvec[-1], GFP_KERNEL);
> if (ret < 0)
> return ret;
> - *sentp += ret;
> - if (ret != head->iov_len)
> - goto out;
> 
> - if (xdr->page_len) {
> - unsigned int offset, len, remaining;
> - struct bio_vec *bvec;
> -
> - bvec = xdr->bvec + (xdr->page_base >> PAGE_SHIFT);
> - offset = offset_in_page(xdr->page_base);
> - remaining = xdr->page_len;
> - while (remaining > 0) {
> - len = min(remaining, bvec->bv_len - offset);
> - ret = kernel_sendpage(sock, bvec->bv_page,
> -      bvec->bv_offset + offset,
> -      len, 0);
> - if (ret < 0)
> - return ret;
> - *sentp += ret;
> - if (ret != len)
> - goto out;
> - remaining -= len;
> - offset = 0;
> - bvec++;
> - }
> - }
> + ret = zcopy_memdup(tail->iov_len, tail->iov_base, &xdr->bvec[n], GFP_KERNEL);
> + if (ret < 0)
> + return ret;
> 
> - if (tail->iov_len) {
> - ret = svc_tcp_send_kvec(sock, tail, 0);
> - if (ret < 0)
> - return ret;
> - *sentp += ret;
> - }
> + size = sizeof(marker) + head->iov_len + xdr->page_len + tail->iov_len;
> + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, xdr->bvec - 2, n + 3, size);
> 
> -out:
> + ret = sock_sendmsg(sock, &msg);
> + if (ret < 0)
> + return ret;
> + if (ret > 0)
> + *sentp = ret;
> + if (ret != size)
> + return -EAGAIN;
> return 0;
> }
> 
> diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
> index 36835b2f5446..6dff0b4f17b8 100644
> --- a/net/sunrpc/xdr.c
> +++ b/net/sunrpc/xdr.c
> @@ -145,14 +145,19 @@ xdr_alloc_bvec(struct xdr_buf *buf, gfp_t gfp)
> {
> size_t i, n = xdr_buf_pagecount(buf);
> 
> - if (n != 0 && buf->bvec == NULL) {
> - buf->bvec = kmalloc_array(n, sizeof(buf->bvec[0]), gfp);
> + if (buf->bvec == NULL) {
> + /* Allow for two headers and a trailer to be attached */
> + buf->bvec = kmalloc_array(n + 3, sizeof(buf->bvec[0]), gfp);
> if (!buf->bvec)
> return -ENOMEM;
> + buf->bvec += 2;
> + buf->bvec[-2].bv_page = NULL;
> + buf->bvec[-1].bv_page = NULL;

NACK.

> for (i = 0; i < n; i++) {
> bvec_set_page(&buf->bvec[i], buf->pages[i], PAGE_SIZE,
>      0);
> }
> + buf->bvec[n].bv_page = NULL;
> }
> return 0;
> }
> @@ -160,8 +165,19 @@ xdr_alloc_bvec(struct xdr_buf *buf, gfp_t gfp)
> void
> xdr_free_bvec(struct xdr_buf *buf)
> {
> - kfree(buf->bvec);
> - buf->bvec = NULL;
> + if (buf->bvec) {
> + size_t n = xdr_buf_pagecount(buf);
> +
> + if (buf->bvec[-2].bv_page)
> + put_page(buf->bvec[-2].bv_page);
> + if (buf->bvec[-1].bv_page)
> + put_page(buf->bvec[-1].bv_page);
> + if (buf->bvec[n].bv_page)
> + put_page(buf->bvec[n].bv_page);
> + buf->bvec -= 2;
> + kfree(buf->bvec);
> + buf->bvec = NULL;
> + }
> }
> 
> /**
> 


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [RFC PATCH 27/28] sunrpc: Use sendmsg(MSG_SPLICE_PAGES) rather then sendpage
  2023-03-16 15:26 ` [RFC PATCH 27/28] sunrpc: Use sendmsg(MSG_SPLICE_PAGES) rather then sendpage David Howells
  2023-03-16 16:17   ` Trond Myklebust
@ 2023-03-16 16:24   ` David Howells
  2023-03-16 17:23     ` Trond Myklebust
  2023-03-16 18:06     ` David Howells
  1 sibling, 2 replies; 81+ messages in thread
From: David Howells @ 2023-03-16 16:24 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: dhowells, Matthew Wilcox, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexander Viro, Christoph Hellwig,
	Jens Axboe, Jeffrey Layton, Christian Brauner, Linus Torvalds,
	netdev, linux-fsdevel, linux-kernel, linux-mm, Anna Schumaker,
	Charles Edward Lever, linux-nfs

Trond Myklebust <trondmy@hammerspace.com> wrote:

> > + buf->bvec += 2;
> > + buf->bvec[-2].bv_page = NULL;
> > + buf->bvec[-1].bv_page = NULL;
> 
> NACK.

Can you elaborate?

Is it that you dislike allocating extra slots for protocol bits?  Or just that
the bvec[] is offset by 2?  Or some other reason?

David


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [RFC PATCH 27/28] sunrpc: Use sendmsg(MSG_SPLICE_PAGES) rather then sendpage
  2023-03-16 16:17   ` Trond Myklebust
@ 2023-03-16 17:10     ` Chuck Lever III
  2023-03-16 17:28     ` David Howells
  2023-03-16 21:21     ` David Howells
  2 siblings, 0 replies; 81+ messages in thread
From: Chuck Lever III @ 2023-03-16 17:10 UTC (permalink / raw)
  To: David Howells
  Cc: Trond Myklebust, Matthew Wilcox, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Al Viro, Christoph Hellwig,
	Jens Axboe, Jeffrey Layton, Christian Brauner, Linus Torvalds,
	netdev, linux-fsdevel, linux-kernel, linux-mm, Anna Schumaker,
	Linux NFS Mailing List


Note: this is the first I've seen of this series -- not sure why
I never received any of these patches.

That means I haven't seen the cover letter and do not have any
context for this proposed change.


> On Mar 16, 2023, at 12:17 PM, Trond Myklebust <trondmy@hammerspace.com> wrote:
> 
>> On Mar 16, 2023, at 11:26, David Howells <dhowells@redhat.com> wrote:
>> 
>> When transmitting data, call down into TCP using a single sendmsg with
>> MSG_SPLICE_PAGES to indicate that content should be spliced rather than
>> performing several sendmsg and sendpage calls to transmit header, data
>> pages and trailer.

We've tried combining the sendpages calls in here before. It
results in a significant and measurable performance regression.
See:

da1661b93bf4 ("SUNRPC: Teach server to use xprt_sock_sendmsg for socket sends")

and its subsequent revert:

4a85a6a3320b ("SUNRPC: Handle TCP socket sends with kernel_sendpage() again")


Therefore, this kind of change needs to be accompanied by both
benchmark results and some field testing to convince me it won't
cause harm.

Also, I'd rather see struct xdr_buf changed to /replace/ the
head/pagevec/tail arrangement with bvecs before we do this
kind of overhaul.

And, we have to make certain that this doesn't break operation
with kTLS sockets... do they support MSG_SPLICE_PAGES ?


>> To make this work, the data is assembled in a bio_vec array and attached to
>> a BVEC-type iterator.  The bio_vec array has two extra slots before the
>> first for headers and one after the last for a trailer.  The headers and
>> trailer are copied into memory acquired from zcopy_alloc() which just
>> breaks a page up into small pieces that can be freed with put_page().
>> 
>> Signed-off-by: David Howells <dhowells@redhat.com>
>> cc: Trond Myklebust <trond.myklebust@hammerspace.com>
>> cc: Anna Schumaker <anna@kernel.org>
>> cc: Chuck Lever <chuck.lever@oracle.com>
>> cc: Jeff Layton <jlayton@kernel.org>
>> cc: "David S. Miller" <davem@davemloft.net>
>> cc: Eric Dumazet <edumazet@google.com>
>> cc: Jakub Kicinski <kuba@kernel.org>
>> cc: Paolo Abeni <pabeni@redhat.com>
>> cc: Jens Axboe <axboe@kernel.dk>
>> cc: Matthew Wilcox <willy@infradead.org>
>> cc: linux-nfs@vger.kernel.org
>> cc: netdev@vger.kernel.org
>> ---
>> net/sunrpc/svcsock.c | 70 ++++++++++++--------------------------------
>> net/sunrpc/xdr.c     | 24 ++++++++++++---
>> 2 files changed, 38 insertions(+), 56 deletions(-)
>> 
>> diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
>> index 03a4f5615086..1fa41ddbc40e 100644
>> --- a/net/sunrpc/svcsock.c
>> +++ b/net/sunrpc/svcsock.c
>> @@ -36,6 +36,7 @@
>> #include <linux/skbuff.h>
>> #include <linux/file.h>
>> #include <linux/freezer.h>
>> +#include <linux/zcopy_alloc.h>
>> #include <net/sock.h>
>> #include <net/checksum.h>
>> #include <net/ip.h>
>> @@ -1060,16 +1061,8 @@ static int svc_tcp_recvfrom(struct svc_rqst *rqstp)
>> return 0; /* record not complete */
>> }
>> 
>> -static int svc_tcp_send_kvec(struct socket *sock, const struct kvec *vec,
>> -      int flags)
>> -{
>> - return kernel_sendpage(sock, virt_to_page(vec->iov_base),
>> -       offset_in_page(vec->iov_base),
>> -       vec->iov_len, flags);
>> -}
>> -
>> /*
>> - * kernel_sendpage() is used exclusively to reduce the number of
>> + * MSG_SPLICE_PAGES is used exclusively to reduce the number of
>> * copy operations in this path. Therefore the caller must ensure
>> * that the pages backing @xdr are unchanging.
>> *
>> @@ -1081,65 +1074,38 @@ static int svc_tcp_sendmsg(struct socket *sock, struct xdr_buf *xdr,
>> {
>> const struct kvec *head = xdr->head;
>> const struct kvec *tail = xdr->tail;
>> - struct kvec rm = {
>> - .iov_base = &marker,
>> - .iov_len = sizeof(marker),
>> - };
>> struct msghdr msg = {
>> - .msg_flags = 0,
>> + .msg_flags = MSG_SPLICE_PAGES,
>> };
>> - int ret;
>> + int ret, n = xdr_buf_pagecount(xdr), size;
>> 
>> *sentp = 0;
>> ret = xdr_alloc_bvec(xdr, GFP_KERNEL);
>> if (ret < 0)
>> return ret;
>> 
>> - ret = kernel_sendmsg(sock, &msg, &rm, 1, rm.iov_len);
>> + ret = zcopy_memdup(sizeof(marker), &marker, &xdr->bvec[-2], GFP_KERNEL);
>> if (ret < 0)
>> return ret;
>> - *sentp += ret;
>> - if (ret != rm.iov_len)
>> - return -EAGAIN;
>> 
>> - ret = svc_tcp_send_kvec(sock, head, 0);
>> + ret = zcopy_memdup(head->iov_len, head->iov_base, &xdr->bvec[-1], GFP_KERNEL);
>> if (ret < 0)
>> return ret;
>> - *sentp += ret;
>> - if (ret != head->iov_len)
>> - goto out;
>> 
>> - if (xdr->page_len) {
>> - unsigned int offset, len, remaining;
>> - struct bio_vec *bvec;
>> -
>> - bvec = xdr->bvec + (xdr->page_base >> PAGE_SHIFT);
>> - offset = offset_in_page(xdr->page_base);
>> - remaining = xdr->page_len;
>> - while (remaining > 0) {
>> - len = min(remaining, bvec->bv_len - offset);
>> - ret = kernel_sendpage(sock, bvec->bv_page,
>> -      bvec->bv_offset + offset,
>> -      len, 0);
>> - if (ret < 0)
>> - return ret;
>> - *sentp += ret;
>> - if (ret != len)
>> - goto out;
>> - remaining -= len;
>> - offset = 0;
>> - bvec++;
>> - }
>> - }
>> + ret = zcopy_memdup(tail->iov_len, tail->iov_base, &xdr->bvec[n], GFP_KERNEL);
>> + if (ret < 0)
>> + return ret;
>> 
>> - if (tail->iov_len) {
>> - ret = svc_tcp_send_kvec(sock, tail, 0);
>> - if (ret < 0)
>> - return ret;
>> - *sentp += ret;
>> - }
>> + size = sizeof(marker) + head->iov_len + xdr->page_len + tail->iov_len;
>> + iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, xdr->bvec - 2, n + 3, size);
>> 
>> -out:
>> + ret = sock_sendmsg(sock, &msg);
>> + if (ret < 0)
>> + return ret;
>> + if (ret > 0)
>> + *sentp = ret;
>> + if (ret != size)
>> + return -EAGAIN;
>> return 0;
>> }
>> 
>> diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
>> index 36835b2f5446..6dff0b4f17b8 100644
>> --- a/net/sunrpc/xdr.c
>> +++ b/net/sunrpc/xdr.c
>> @@ -145,14 +145,19 @@ xdr_alloc_bvec(struct xdr_buf *buf, gfp_t gfp)
>> {
>> size_t i, n = xdr_buf_pagecount(buf);
>> 
>> - if (n != 0 && buf->bvec == NULL) {
>> - buf->bvec = kmalloc_array(n, sizeof(buf->bvec[0]), gfp);
>> + if (buf->bvec == NULL) {
>> + /* Allow for two headers and a trailer to be attached */
>> + buf->bvec = kmalloc_array(n + 3, sizeof(buf->bvec[0]), gfp);
>> if (!buf->bvec)
>> return -ENOMEM;
>> + buf->bvec += 2;
>> + buf->bvec[-2].bv_page = NULL;
>> + buf->bvec[-1].bv_page = NULL;
> 
> NACK.
> 
>> for (i = 0; i < n; i++) {
>> bvec_set_page(&buf->bvec[i], buf->pages[i], PAGE_SIZE,
>>     0);
>> }
>> + buf->bvec[n].bv_page = NULL;
>> }
>> return 0;
>> }
>> @@ -160,8 +165,19 @@ xdr_alloc_bvec(struct xdr_buf *buf, gfp_t gfp)
>> void
>> xdr_free_bvec(struct xdr_buf *buf)
>> {
>> - kfree(buf->bvec);
>> - buf->bvec = NULL;
>> + if (buf->bvec) {
>> + size_t n = xdr_buf_pagecount(buf);
>> +
>> + if (buf->bvec[-2].bv_page)
>> + put_page(buf->bvec[-2].bv_page);
>> + if (buf->bvec[-1].bv_page)
>> + put_page(buf->bvec[-1].bv_page);
>> + if (buf->bvec[n].bv_page)
>> + put_page(buf->bvec[n].bv_page);
>> + buf->bvec -= 2;
>> + kfree(buf->bvec);
>> + buf->bvec = NULL;
>> + }
>> }
>> 
>> /**
>> 
> 

--
Chuck Lever



^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [RFC PATCH 27/28] sunrpc: Use sendmsg(MSG_SPLICE_PAGES) rather then sendpage
  2023-03-16 16:24   ` David Howells
@ 2023-03-16 17:23     ` Trond Myklebust
  2023-03-16 18:06     ` David Howells
  1 sibling, 0 replies; 81+ messages in thread
From: Trond Myklebust @ 2023-03-16 17:23 UTC (permalink / raw)
  To: David Howells
  Cc: Matthew Wilcox, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Alexander Viro, Christoph Hellwig, Jens Axboe,
	Jeffrey Layton, Christian Brauner, Linus Torvalds, netdev,
	linux-fsdevel, linux-kernel, linux-mm, Anna Schumaker,
	Charles Edward Lever, linux-nfs



> On Mar 16, 2023, at 12:24, David Howells <dhowells@redhat.com> wrote:
> 
> Trond Myklebust <trondmy@hammerspace.com> wrote:
> 
>>> + buf->bvec += 2;
>>> + buf->bvec[-2].bv_page = NULL;
>>> + buf->bvec[-1].bv_page = NULL;
>> 
>> NACK.
> 
> Can you elaborate?
> 
> Is it that you dislike allocating extra slots for protocol bits?  Or just that
> the bvec[] is offset by 2?  Or some other reason?

1) This is code that is common to the client and the server. Why are we adding
   3 unused bvec slots to every client RPC call?
2) It obfuscates the existence of these bvec slots.
3) knfsd may use splice_direct_to_actor() in order to avoid copying the page
   cache data into private buffers (it just takes a reference to the pages).
   Using MSG_SPLICE_PAGES will presumably require it to protect those pages
   against further writes while the socket is referencing them.
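
For readers trying to picture the layout under discussion, here is a minimal
userspace sketch of an array with two hidden slots before the visible base
and one after the last entry. It is a model under stated assumptions, not the
patch itself: struct model_bvec and the model_*() helpers are hypothetical
names, and plain calloc()/free() stand in for kmalloc_array()/kfree().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct bio_vec; only the length matters here. */
struct model_bvec {
	void *page;
	unsigned int len;
	unsigned int offset;
};

/*
 * Allocate n data slots plus three extra ones (two headers, one trailer)
 * and return a pointer to the first data slot, so that [-2] and [-1] are
 * the header slots and [n] is the trailer slot.
 */
static struct model_bvec *model_alloc_bvec(size_t n)
{
	struct model_bvec *base = calloc(n + 3, sizeof(*base));

	if (!base)
		return NULL;
	return base + 2;
}

/* The pointer handed out is offset by 2, so undo that before freeing. */
static void model_free_bvec(struct model_bvec *bvec)
{
	if (bvec)
		free(bvec - 2);
}

/* Sum the lengths of all n + 3 slots, starting from the hidden first one. */
static size_t model_total_len(const struct model_bvec *bvec, size_t n)
{
	const struct model_bvec *first;
	size_t i, total = 0;

	assert(bvec);
	first = bvec - 2;
	for (i = 0; i < n + 3; i++)
		total += first[i].len;
	return total;
}
```

Nothing in the returned pointer records that it is offset, so every user has
to know about the hidden slots and the "- 2" on free. That is one way to read
the obfuscation concern in point 2.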


_________________________________
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [RFC PATCH 27/28] sunrpc: Use sendmsg(MSG_SPLICE_PAGES) rather then sendpage
  2023-03-16 16:17   ` Trond Myklebust
  2023-03-16 17:10     ` Chuck Lever III
@ 2023-03-16 17:28     ` David Howells
  2023-03-16 17:41       ` Chuck Lever III
  2023-03-16 21:21     ` David Howells
  2 siblings, 1 reply; 81+ messages in thread
From: David Howells @ 2023-03-16 17:28 UTC (permalink / raw)
  To: Chuck Lever III
  Cc: dhowells, Trond Myklebust, Matthew Wilcox, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Al Viro,
	Christoph Hellwig, Jens Axboe, Jeffrey Layton, Christian Brauner,
	Linus Torvalds, netdev, linux-fsdevel, linux-kernel, linux-mm,
	Anna Schumaker, Linux NFS Mailing List

Chuck Lever III <chuck.lever@oracle.com> wrote:

> That means I haven't seen the cover letter and do not have any
> context for this proposed change.

https://lore.kernel.org/linux-fsdevel/20230316152618.711970-1-dhowells@redhat.com/

> We've tried combining the sendpages calls in here before. It
> results in a significant and measurable performance regression.
> See:
> 
> da1661b93bf4 ("SUNRPC: Teach server to use xprt_sock_sendmsg for socket sends")

The commit replaced the use of sendpage with sendmsg, but that took away the
zerocopy aspect of sendpage.  The idea behind MSG_SPLICE_PAGES is that it
allows you to keep that.  I'll have to try reapplying this commit and
adding the MSG_SPLICE_PAGES flag.
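
As a rough illustration of the "one sendmsg for marker, head, pages and
trailer" shape, here is a userspace sketch that gathers the pieces into a
single call. It is only a model: model_send_record() is a hypothetical name,
MSG_SPLICE_PAGES is kernel-internal and not usable from userspace, and an
ordinary userspace sendmsg() still copies the data, so this shows the
gathering, not the zerocopy part.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Send an RPC-style record as marker + head + payload + tail with one
 * sendmsg() call.  Returns the number of bytes queued, or -1 on error.
 */
static ssize_t model_send_record(int fd,
				 const void *head, size_t head_len,
				 const void *payload, size_t payload_len,
				 const void *tail, size_t tail_len)
{
	size_t body = head_len + payload_len + tail_len;
	uint32_t marker;
	struct iovec iov[4];
	struct msghdr msg;

	/* Preconditions are asserted rather than handled, for brevity. */
	assert(fd >= 0);
	assert(body <= 0x7fffffffu);

	/* Record marker: top bit = last fragment, low 31 bits = length. */
	marker = htonl(0x80000000u | (uint32_t)body);

	iov[0].iov_base = &marker;
	iov[0].iov_len = sizeof(marker);
	iov[1].iov_base = (void *)head;
	iov[1].iov_len = head_len;
	iov[2].iov_base = (void *)payload;
	iov[2].iov_len = payload_len;
	iov[3].iov_base = (void *)tail;
	iov[3].iov_len = tail_len;

	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = iov;
	msg.msg_iovlen = 4;

	/* A short return means a partial send; the caller must handle it. */
	return sendmsg(fd, &msg, 0);
}
```

The caller compares the return value with the expected total, much as the
patch compares the sock_sendmsg() result with size before returning -EAGAIN.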

> Therefore, this kind of change needs to be accompanied by both
> benchmark results and some field testing to convince me it won't
> cause harm.

Yep.

> And, we have to make certain that this doesn't break operation
> with kTLS sockets... do they support MSG_SPLICE_PAGES ?

I haven't yet tackled AF_TLS, AF_KCM or AF_SMC as they seem significantly more
complex than TCP and UDP.  I thought I'd get some feedback on what I have
before I tried my hand at those.

David


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [RFC PATCH 02/28] Add a special allocator for staging netfs protocol to MSG_SPLICE_PAGES
  2023-03-16 15:25 ` [RFC PATCH 02/28] Add a special allocator for staging netfs protocol to MSG_SPLICE_PAGES David Howells
@ 2023-03-16 17:28   ` Matthew Wilcox
  2023-03-16 18:00   ` David Howells
  1 sibling, 0 replies; 81+ messages in thread
From: Matthew Wilcox @ 2023-03-16 17:28 UTC (permalink / raw)
  To: David Howells
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Al Viro, Christoph Hellwig, Jens Axboe, Jeff Layton,
	Christian Brauner, Linus Torvalds, netdev, linux-fsdevel,
	linux-kernel, linux-mm, Bernard Metzler, Tom Talpey, linux-rdma

On Thu, Mar 16, 2023 at 03:25:52PM +0000, David Howells wrote:
> If a network protocol sendmsg() sees MSG_SPLICE_PAGES, it expects that the
> iterator is of ITER_BVEC type and that all the pages can have refs taken on
> them with get_page() and discarded with put_page().  Bits of network
> filesystem protocol data, however, are typically contained in slab memory
> for which the cleanup method is kfree(), not put_page(), so this doesn't
> work.
> 
> Provide a simple allocator, zcopy_alloc(), that allocates a page at a time
> per-cpu and sequentially breaks off pieces and hands them out with a ref as
> it's asked for them.  The caller disposes of the memory it was given by
> calling put_page().  When a page is all parcelled out, it is abandoned by
> the allocator and another page is obtained.  The page will get cleaned up
> when the last skbuff fragment is destroyed.

This feels a _lot_ like the page_frag allocator.  Can the two be
unified?
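
For reference, the scheme described in the quoted text can be modelled in
userspace roughly as follows. This is a sketch under assumptions, not
zcopy_alloc() or the page_frag allocator: all model_*() names are
hypothetical, a plain counter stands in for the page refcount, and the
per-cpu aspect is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_PAGE_SIZE 4096u

/* Hypothetical stand-in for a refcounted page. */
struct model_page {
	unsigned int refs;
	unsigned char data[MODEL_PAGE_SIZE];
};

/* Allocator state: the page being parcelled out and the fill offset. */
struct model_frag_alloc {
	struct model_page *page;
	size_t offset;
};

/* Drop one reference; free the page on the last one.  Returns 1 if freed. */
static int model_put_page(struct model_page *page)
{
	assert(page && page->refs > 0);
	if (--page->refs == 0) {
		free(page);
		return 1;
	}
	return 0;
}

/*
 * Break off len bytes from the current page and take a reference on that
 * page for the caller, who later releases it with model_put_page().
 */
static void *model_frag_get(struct model_frag_alloc *a, size_t len,
			    struct model_page **pagep)
{
	void *p;

	if (len == 0 || len > MODEL_PAGE_SIZE)
		return NULL;

	/* Page exhausted: the allocator abandons it by dropping its own ref. */
	if (a->page && a->offset + len > MODEL_PAGE_SIZE) {
		model_put_page(a->page);
		a->page = NULL;
	}
	if (!a->page) {
		a->page = calloc(1, sizeof(*a->page));
		if (!a->page)
			return NULL;
		a->page->refs = 1;	/* the allocator's own reference */
		a->offset = 0;
	}

	p = a->page->data + a->offset;
	a->offset += len;
	a->page->refs++;		/* the caller's reference */
	*pagep = a->page;
	return p;
}
```

An abandoned page stays alive until every fragment holder has dropped its
reference, which mirrors the "cleaned up when the last skbuff fragment is
destroyed" behaviour in the description.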

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [RFC PATCH 27/28] sunrpc: Use sendmsg(MSG_SPLICE_PAGES) rather then sendpage
  2023-03-16 17:28     ` David Howells
@ 2023-03-16 17:41       ` Chuck Lever III
  0 siblings, 0 replies; 81+ messages in thread
From: Chuck Lever III @ 2023-03-16 17:41 UTC (permalink / raw)
  To: David Howells
  Cc: Trond Myklebust, Matthew Wilcox, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Al Viro, Christoph Hellwig,
	Jens Axboe, Jeffrey Layton, Christian Brauner, Linus Torvalds,
	netdev, linux-fsdevel, linux-kernel, linux-mm, Anna Schumaker,
	Linux NFS Mailing List



> On Mar 16, 2023, at 1:28 PM, David Howells <dhowells@redhat.com> wrote:
> 
> Chuck Lever III <chuck.lever@oracle.com> wrote:
> 
>> That means I haven't seen the cover letter and do not have any
>> context for this proposed change.
> 
> https://lore.kernel.org/linux-fsdevel/20230316152618.711970-1-dhowells@redhat.com/
> 
>> We've tried combining the sendpages calls in here before. It
>> results in a significant and measurable performance regression.
>> See:
>> 
>> da1661b93bf4 ("SUNRPC: Teach server to use xprt_sock_sendmsg for socket sends")
> 
> The commit replaced the use of sendpage with sendmsg, but that took away the
> zerocopy aspect of sendpage.  The idea behind MSG_SPLICE_PAGES is that it
> allows you to keep that.  I'll have to try reapplying this commit and
> adding the MSG_SPLICE_PAGES flag.

Note that, as Trond points out, NFSD can handle an NFS READ
request with either a splice actor or by copying through a
vector, depending on what the underlying filesystem can
support and whether we are using a security flavor that
requires stable pages. Grep for RQ_SPLICE_OK.

Eventually we want to make use of iomaps to ensure that
reading areas of a file that are not allocated on disk
does not trigger an extent allocation. Anna is working on
that, but I have no idea what it will look like. We can
talk more at LSF, if you'll both be around.

Also... I find I have to put back the use of MSG_MORE and
friends in here, otherwise kTLS will split each of these
kernel_sendsomething() calls into its own TLS record. This
code is likely going to look different after support for
RPC-with-TLS goes in.


>> Therefore, this kind of change needs to be accompanied by both
>> benchmark results and some field testing to convince me it won't
>> cause harm.
> 
> Yep.
> 
>> And, we have to make certain that this doesn't break operation
>> with kTLS sockets... do they support MSG_SPLICE_PAGES ?
> 
> I haven't yet tackled AF_TLS, AF_KCM or AF_SMC as they seem significantly more
> complex than TCP and UDP.  I thought I'd get some feedback on what I have
> before I tried my hand at those.

OK, I didn't mean AF_TLS, I meant the stuff under net/tls,
which is AF_INET[6] and TCP, but with a ULP in place. It's
got its own sendpage and sendmsg methods that choke when
an unrecognized MSG_ flag is present.

But OK, you're just asking for feedback, so I'll put my red
pencil down.


--
Chuck Lever



^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [RFC PATCH 02/28] Add a special allocator for staging netfs protocol to MSG_SPLICE_PAGES
  2023-03-16 15:25 ` [RFC PATCH 02/28] Add a special allocator for staging netfs protocol to MSG_SPLICE_PAGES David Howells
  2023-03-16 17:28   ` Matthew Wilcox
@ 2023-03-16 18:00   ` David Howells
  1 sibling, 0 replies; 81+ messages in thread
From: David Howells @ 2023-03-16 18:00 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: dhowells, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Al Viro, Christoph Hellwig, Jens Axboe, Jeff Layton,
	Christian Brauner, Linus Torvalds, netdev, linux-fsdevel,
	linux-kernel, linux-mm, Bernard Metzler, Tom Talpey, linux-rdma

Matthew Wilcox <willy@infradead.org> wrote:

> This feels a _lot_ like the page_frag allocator.  Can the two be
> unified?

Looks kind of similar.  I might well be able to use that instead.

David


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [RFC PATCH 27/28] sunrpc: Use sendmsg(MSG_SPLICE_PAGES) rather then sendpage
  2023-03-16 16:24   ` David Howells
  2023-03-16 17:23     ` Trond Myklebust
@ 2023-03-16 18:06     ` David Howells
  2023-03-16 19:01       ` Trond Myklebust
                         ` (2 more replies)
  1 sibling, 3 replies; 81+ messages in thread
From: David Howells @ 2023-03-16 18:06 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: dhowells, Matthew Wilcox, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexander Viro, Christoph Hellwig,
	Jens Axboe, Jeffrey Layton, Christian Brauner, Linus Torvalds,
	netdev, linux-fsdevel, linux-kernel, linux-mm, Anna Schumaker,
	Charles Edward Lever, linux-nfs

Trond Myklebust <trondmy@hammerspace.com> wrote:

> 1) This is code that is common to the client and the server. Why are we
> adding unused 3 bvec slots to every client RPC call?

Fair point, but I'm trying to avoid making four+ sendmsg calls in nfsd rather
than one.

> 2) It obfuscates the existence of these bvec slots.

True, it'd be nice to find a better way to do it.  Question is, can the client
make use of MSG_SPLICE_PAGES also?

> 3) knfsd may use splice_direct_to_actor() in order to avoid copying the page
> cache data into private buffers (it just takes a reference to the
> pages). Using MSG_SPLICE_PAGES will presumably require it to protect those
> pages against further writes while the socket is referencing them.

Upstream sunrpc is using sendpage with TCP.  It already has that issue.
MSG_SPLICE_PAGES is a way of doing sendpage through sendmsg.

David


^ permalink raw reply	[flat|nested] 81+ messages in thread

* RE: [RFC PATCH 03/28] tcp: Support MSG_SPLICE_PAGES
  2023-03-16 15:25 ` [RFC PATCH 03/28] tcp: Support MSG_SPLICE_PAGES David Howells
@ 2023-03-16 18:37   ` Willem de Bruijn
  2023-03-16 18:44   ` David Howells
  1 sibling, 0 replies; 81+ messages in thread
From: Willem de Bruijn @ 2023-03-16 18:37 UTC (permalink / raw)
  To: David Howells, Matthew Wilcox, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni
  Cc: David Howells, Al Viro, Christoph Hellwig, Jens Axboe,
	Jeff Layton, Christian Brauner, Linus Torvalds, netdev,
	linux-fsdevel, linux-kernel, linux-mm

David Howells wrote:
> Make TCP's sendmsg() support MSG_SPLICE_PAGES.  This causes pages to be
> spliced from the source iterator if possible (the iterator must be
> ITER_BVEC and the pages must be spliceable).
> 
> This allows ->sendpage() to be replaced by something that can handle
> multiple multipage folios in a single transaction.
> 
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Eric Dumazet <edumazet@google.com>
> cc: "David S. Miller" <davem@davemloft.net>
> cc: Jakub Kicinski <kuba@kernel.org>
> cc: Paolo Abeni <pabeni@redhat.com>
> cc: Jens Axboe <axboe@kernel.dk>
> cc: Matthew Wilcox <willy@infradead.org>
> cc: netdev@vger.kernel.org
> ---
>  net/ipv4/tcp.c | 59 +++++++++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 53 insertions(+), 6 deletions(-)
> 
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 288693981b00..77c0c69208a5 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -1220,7 +1220,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
>  	int flags, err, copied = 0;
>  	int mss_now = 0, size_goal, copied_syn = 0;
>  	int process_backlog = 0;
> -	bool zc = false;
> +	int zc = 0;
>  	long timeo;
>  
>  	flags = msg->msg_flags;
> @@ -1231,17 +1231,24 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
>  		if (msg->msg_ubuf) {
>  			uarg = msg->msg_ubuf;
>  			net_zcopy_get(uarg);
> -			zc = sk->sk_route_caps & NETIF_F_SG;
> +			if (sk->sk_route_caps & NETIF_F_SG)
> +				zc = 1;
>  		} else if (sock_flag(sk, SOCK_ZEROCOPY)) {
>  			uarg = msg_zerocopy_realloc(sk, size, skb_zcopy(skb));
>  			if (!uarg) {
>  				err = -ENOBUFS;
>  				goto out_err;
>  			}
> -			zc = sk->sk_route_caps & NETIF_F_SG;
> -			if (!zc)
> +			if (sk->sk_route_caps & NETIF_F_SG)
> +				zc = 1;
> +			else
>  				uarg_to_msgzc(uarg)->zerocopy = 0;
>  		}
> +	} else if (unlikely(flags & MSG_SPLICE_PAGES) && size) {
> +		if (!iov_iter_is_bvec(&msg->msg_iter))
> +			return -EINVAL;
> +		if (sk->sk_route_caps & NETIF_F_SG)
> +			zc = 2;
>  	}

The commit message mentions MSG_SPLICE_PAGES as an internal flag.

It can be passed from userspace. The code anticipates that and checks
preconditions.

A side effect is that legacy applications that may already be setting
this bit in the flags now start failing. Most socket types are
historically permissive and simply ignore undefined flags.

With MSG_ZEROCOPY we chose to be extra cautious and added
SOCK_ZEROCOPY, only testing the MSG_ZEROCOPY bit if this socket option
is explicitly enabled. Perhaps more cautious than necessary, but FYI.
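
For illustration, the opt-in gating described above might be modelled in
plain userspace C roughly as follows. This is only a sketch under stated
assumptions: `demo_sock`, `DEMO_MSG_ZEROCOPY` and `demo_want_zerocopy()` are
hypothetical names and the flag value is illustrative; none of this is the
kernel's actual code.

```c
#include <stdbool.h>

/* Illustrative flag bit; the value is an assumption made for this sketch. */
#define DEMO_MSG_ZEROCOPY 0x4000000u

/* Hypothetical stand-in for a socket carrying an opt-in setting. */
struct demo_sock {
	bool zerocopy_opt_in;	/* models an explicitly enabled socket option */
};

/*
 * The flag bit is only examined when the socket has opted in, so a legacy
 * caller that happens to set the bit on an un-opted socket is ignored
 * rather than rejected.
 */
static bool demo_want_zerocopy(const struct demo_sock *sk,
			       unsigned int msg_flags)
{
	if (!sk->zerocopy_opt_in)
		return false;
	return (msg_flags & DEMO_MSG_ZEROCOPY) != 0;
}
```

The point of the design is that the new bit changes behaviour only for
callers that asked for it, which keeps historically permissive flag handling
intact for everyone else.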

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [RFC PATCH 03/28] tcp: Support MSG_SPLICE_PAGES
  2023-03-16 15:25 ` [RFC PATCH 03/28] tcp: Support MSG_SPLICE_PAGES David Howells
  2023-03-16 18:37   ` Willem de Bruijn
@ 2023-03-16 18:44   ` David Howells
  2023-03-16 19:00     ` Willem de Bruijn
  2023-03-21  0:38     ` David Howells
  1 sibling, 2 replies; 81+ messages in thread
From: David Howells @ 2023-03-16 18:44 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: dhowells, Matthew Wilcox, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Al Viro, Christoph Hellwig,
	Jens Axboe, Jeff Layton, Christian Brauner, Linus Torvalds,
	netdev, linux-fsdevel, linux-kernel, linux-mm

Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote:

> The commit message mentions MSG_SPLICE_PAGES as an internal flag.
> 
> It can be passed from userspace. The code anticipates that and checks
> preconditions.

Should I add a separate field in the in-kernel msghdr struct for such internal
flags?  That would also avoid putting an internal flag in the same space as
the uapi flags.

David


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [RFC PATCH 03/28] tcp: Support MSG_SPLICE_PAGES
  2023-03-16 18:44   ` David Howells
@ 2023-03-16 19:00     ` Willem de Bruijn
  2023-03-21  0:38     ` David Howells
  1 sibling, 0 replies; 81+ messages in thread
From: Willem de Bruijn @ 2023-03-16 19:00 UTC (permalink / raw)
  To: David Howells, Willem de Bruijn
  Cc: dhowells, Matthew Wilcox, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Al Viro, Christoph Hellwig,
	Jens Axboe, Jeff Layton, Christian Brauner, Linus Torvalds,
	netdev, linux-fsdevel, linux-kernel, linux-mm

David Howells wrote:
> Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote:
> 
> > The commit message mentions MSG_SPLICE_PAGES as an internal flag.
> > 
> > It can be passed from userspace. The code anticipates that and checks
> > preconditions.
> 
> Should I add a separate field in the in-kernel msghdr struct for such internal
> flags?  That would also avoid putting an internal flag in the same space as
> the uapi flags.

That would work, if no cost to common paths that don't need it.

A not very pretty alternative would be to add an extra arg to each
sendmsg handler that is used only when called from sendpage.

There are a few other internal MSG_.. flags, such as
MSG_SENDPAGE_NOPOLICY. Those are all limited to sendpage, and ignored
in sendmsg, I think. Which would explain why it was clearly safe to
add them.
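
As a rough sketch of the "separate field" idea discussed above, internal
flags could live apart from the uapi flag word so that nothing userspace
passes can ever set them. Everything here is hypothetical and written as
userspace C for illustration: `demo_msghdr`, `msg_internal_flags`,
`DEMO_INT_SPLICE_PAGES` and the helpers are assumed names, not the in-kernel
`struct msghdr` layout.

```c
/* Hypothetical internal-only flag; it lives in its own namespace. */
#define DEMO_INT_SPLICE_PAGES 0x1u

/* Hypothetical, simplified model of an in-kernel message header. */
struct demo_msghdr {
	unsigned int msg_flags;			/* bits that may come from userspace */
	unsigned int msg_internal_flags;	/* bits set by in-kernel callers only */
};

/* Syscall-entry model: user bits are copied, internal bits start clear. */
static void demo_msghdr_from_user(struct demo_msghdr *msg,
				  unsigned int user_flags)
{
	msg->msg_flags = user_flags;
	msg->msg_internal_flags = 0;
}

/* In-kernel caller model: explicitly request splicing. */
static void demo_msghdr_request_splice(struct demo_msghdr *msg)
{
	msg->msg_internal_flags |= DEMO_INT_SPLICE_PAGES;
}

/* Protocol-side check: looks only at the internal field. */
static int demo_wants_splice(const struct demo_msghdr *msg)
{
	return (msg->msg_internal_flags & DEMO_INT_SPLICE_PAGES) != 0;
}
```

The cost concern raised above still applies: the extra field has to be
initialised on every path that builds a message header, including the ones
that never use it.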


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [RFC PATCH 27/28] sunrpc: Use sendmsg(MSG_SPLICE_PAGES) rather then sendpage
  2023-03-16 18:06     ` David Howells
@ 2023-03-16 19:01       ` Trond Myklebust
  2023-03-22 13:10       ` David Howells
  2023-03-22 18:15       ` [RFC PATCH] iov_iter: Add an iterator-of-iterators David Howells
  2 siblings, 0 replies; 81+ messages in thread
From: Trond Myklebust @ 2023-03-16 19:01 UTC (permalink / raw)
  To: David Howells
  Cc: Matthew Wilcox, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Alexander Viro, Christoph Hellwig, Jens Axboe,
	Jeffrey Layton, Christian Brauner, Linus Torvalds, netdev,
	linux-fsdevel, linux-kernel, linux-mm, Anna Schumaker,
	Charles Edward Lever, linux-nfs



> On Mar 16, 2023, at 14:06, David Howells <dhowells@redhat.com> wrote:
> 
> Trond Myklebust <trondmy@hammerspace.com> wrote:
> 
>> 1) This is code that is common to the client and the server. Why are we
>> adding unused 3 bvec slots to every client RPC call?
> 
> Fair point, but I'm trying to avoid making four+ sendmsg calls in nfsd rather
> than one.

Add an enum iter_type for ITER_ITER ? :-)

Otherwise, please just split these functions into one for knfsd and a separate one for the client.

> 
>> 2) It obfuscates the existence of these bvec slots.
> 
> True, it'd be nice to find a better way to do it.  Question is, can the client
> make use of MSG_SPLICE_PAGES also?

The requirement for O_DIRECT support means we get the stable write issues with added extra spicy sauce.

> 
>> 3) knfsd may use splice_direct_to_actor() in order to avoid copying the page
>> cache data into private buffers (it just takes a reference to the
>> pages). Using MSG_SPLICE_PAGES will presumably require it to protect those
>> pages against further writes while the socket is referencing them.
> 
> Upstream sunrpc is using sendpage with TCP.  It already has that issue.
> MSG_SPLICE_PAGES is a way of doing sendpage through sendmsg.

Fair enough. I do seem to remember a schism with the knfsd developers over that issue.

_________________________________
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [RFC PATCH 27/28] sunrpc: Use sendmsg(MSG_SPLICE_PAGES) rather then sendpage
  2023-03-16 16:17   ` Trond Myklebust
  2023-03-16 17:10     ` Chuck Lever III
  2023-03-16 17:28     ` David Howells
@ 2023-03-16 21:21     ` David Howells
  2023-03-17 15:29       ` Chuck Lever III
  2 siblings, 1 reply; 81+ messages in thread
From: David Howells @ 2023-03-16 21:21 UTC (permalink / raw)
  To: Chuck Lever III
  Cc: dhowells, Trond Myklebust, Matthew Wilcox, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Al Viro,
	Christoph Hellwig, Jens Axboe, Jeffrey Layton, Christian Brauner,
	Linus Torvalds, netdev, linux-fsdevel, linux-kernel, linux-mm,
	Anna Schumaker, Linux NFS Mailing List

Chuck Lever III <chuck.lever@oracle.com> wrote:

> Therefore, this kind of change needs to be accompanied by both
> benchmark results and some field testing to convince me it won't
> cause harm.

Btw, what do you use to benchmark NFS performance?

David


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [RFC PATCH 23/28] algif: Remove hash_sendpage*()
  2023-03-16 15:26 ` [RFC PATCH 23/28] algif: Remove hash_sendpage*() David Howells
@ 2023-03-17  2:40   ` Herbert Xu
  2023-03-24 16:47     ` David Howells
  0 siblings, 1 reply; 81+ messages in thread
From: Herbert Xu @ 2023-03-17  2:40 UTC (permalink / raw)
  To: David Howells
  Cc: willy, davem, edumazet, kuba, pabeni, dhowells, viro, hch, axboe,
	jlayton, brauner, torvalds, netdev, linux-fsdevel, linux-kernel,
	linux-mm, linux-crypto

David Howells <dhowells@redhat.com> wrote:
> Remove hash_sendpage*() and use hash_sendmsg() as the latter seems to just
> use the source pages directly anyway.

...

> -       if (!(flags & MSG_MORE)) {
> -               if (ctx->more)
> -                       err = crypto_ahash_finup(&ctx->req);
> -               else
> -                       err = crypto_ahash_digest(&ctx->req);

You've just removed the optimised path from user-space to
finup/digest.  You need to add them back to sendmsg if you
want to eliminate sendpage.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [RFC PATCH 27/28] sunrpc: Use sendmsg(MSG_SPLICE_PAGES) rather then sendpage
  2023-03-16 21:21     ` David Howells
@ 2023-03-17 15:29       ` Chuck Lever III
  0 siblings, 0 replies; 81+ messages in thread
From: Chuck Lever III @ 2023-03-17 15:29 UTC (permalink / raw)
  To: David Howells
  Cc: Trond Myklebust, Matthew Wilcox, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Al Viro, Christoph Hellwig,
	Jens Axboe, Jeffrey Layton, Christian Brauner, Linus Torvalds,
	netdev, linux-fsdevel, linux-kernel, linux-mm, Anna Schumaker,
	Linux NFS Mailing List


> On Mar 16, 2023, at 5:21 PM, David Howells <dhowells@redhat.com> wrote:
> 
> Chuck Lever III <chuck.lever@oracle.com> wrote:
> 
>> Therefore, this kind of change needs to be accompanied by both
>> benchmark results and some field testing to convince me it won't
>> cause harm.
> 
> Btw, what do you use to benchmark NFS performance?

It depends on what I'm trying to observe. I have only a small
handful of systems in my lab, which is why I was not able to
immediately detect the effects of the zero-copy change in my
lab. Daire has a large client cohort on a fast network, so is
able to see the impact of that kind of change quite readily.

A perhaps more interesting question is what kind of tooling
would I use to measure the performance of the proposed change.

The bottom line is whether or not applications on clients can
see a change. NFS client implementations can hide server and
network latency improvement from applications, and RPC-on-TCP
adds palpable latency as well that reduces the efficacy of
server performance optimizations.

For that I might use a multi-threaded fio workload with fixed
record sizes (2KB, 8KB, 128KB, 1MB) and then look at the
throughput numbers and latency distribution for each size.

In a single thread qd=1 test, iozone can show changes in
READ latency pretty clearly, though most folks believe qd=1
tests are junk.

I generally run such tests on 100GbE with a tmpfs or NVMe
export to take filesystem latencies out of the equation,
although that might matter more for WRITE latency if you
can keep your READ workload completely in server memory.

To measure server-side behavior without the effects of the
network or client, NFSD has a built-in trace point,
nfsd:svc_stats_latency, that records the latency in
microseconds of each RPC. Run the above workloads and
record this tracepoint (perhaps with a capture filter to
record only the latency of READ operations).

Then you can post-process the raw latencies to get an average
latency and deviation, or even look at latency distribution
to see if the shape of the outlier curve has changed. I use
awk for this.
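
The post-processing step could equally be sketched in C. The following is a
minimal, assumed example (the thread only says awk is used): it takes
per-RPC latencies in microseconds that have already been extracted from the
trace output and computes the mean and sample variance in a single pass with
Welford's method. The names `latency_summary` and `latency_summarize()` are
made up for this sketch; take the square root of the variance to get the
deviation.

```c
#include <stddef.h>

/* Summary of a set of latency samples, in microseconds. */
struct latency_summary {
	size_t count;
	double mean_us;
	double variance_us2;	/* sample variance (n - 1 denominator) */
};

/*
 * Single-pass mean and variance using Welford's method, which avoids the
 * cancellation problems of summing squares over a large number of samples.
 */
static struct latency_summary latency_summarize(const double *us, size_t n)
{
	struct latency_summary s = { n, 0.0, 0.0 };
	double mean = 0.0, m2 = 0.0;
	size_t i;

	for (i = 0; i < n; i++) {
		double delta = us[i] - mean;

		mean += delta / (double)(i + 1);
		m2 += delta * (us[i] - mean);
	}
	if (n > 0)
		s.mean_us = mean;
	if (n > 1)
		s.variance_us2 = m2 / (double)(n - 1);
	return s;
}
```

Running it once per record size, before and after the change, gives
numbers that can be compared directly.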

[ Sidebar: you can use this tracepoint to track latency
outliers too, but that's another topic. ]

Second, I might try a flame graph study to measure changes in
instruction path length, and also capture an average cycles-
per-byte-read value. Looking at CPU cache misses can often be
a rathole, but flame graphs can surface changes there too.

And lastly, you might want to visit lock_stats to see if
there is any significant change in lock contention. An
unexpected increase in lock contention can often cancel out
gains made in other areas.


My guess is that for the RQ_SPLICE_OK case, the difference
would amount to the elimination of the kernel_sendpage
calls, which are indirect, but not terribly expensive.
Those calls amount to a significant cost only on large I/O.
It might not amount to much relative to the other costs
in the READ path.

So the real purpose here would have to be refactoring to
use bvecs instead of the bespoke xdr_buf structure, and I
would like to see support for bvecs in all of our transports
(looking at you, RDMA) to make this truly worthwhile. I had
started this a while back, but lack of a bvec-based RDMA API
made it less interesting to me. It isn't clear to me yet
whether bvecs or folios should be the replacement for
xdr_buf's head/pages/tail, but I'm a paid-in-full member of
the uneducated rabble.

This might sound like a lot of pushback, but actually I am
open to discussing clean-ups in this area, including the
one you proposed. Just getting a little more careful about
this kind of change as time goes on. And it sounds like you
were already aware of the most recent previous attempt at
this kind of improvement.


--
Chuck Lever



^ permalink raw reply	[flat|nested] 81+ messages in thread

* RE:  [RFC PATCH 08/28] siw: Inline do_tcp_sendpages()
  2023-03-16 15:25 ` [RFC PATCH 08/28] siw: " David Howells
@ 2023-03-20 10:53   ` Bernard Metzler
  2023-03-20 11:08   ` David Howells
  1 sibling, 0 replies; 81+ messages in thread
From: Bernard Metzler @ 2023-03-20 10:53 UTC (permalink / raw)
  To: David Howells; +Cc: David Howells, Tom Talpey, linux-rdma



> -----Original Message-----
> From: David Howells <dhowells@redhat.com>
> Sent: Thursday, 16 March 2023 16:26
> To: Matthew Wilcox <willy@infradead.org>; David S. Miller
> <davem@davemloft.net>; Eric Dumazet <edumazet@google.com>; Jakub Kicinski
> <kuba@kernel.org>; Paolo Abeni <pabeni@redhat.com>
> Cc: David Howells <dhowells@redhat.com>; Al Viro <viro@zeniv.linux.org.uk>;
> Christoph Hellwig <hch@infradead.org>; Jens Axboe <axboe@kernel.dk>; Jeff
> Layton <jlayton@kernel.org>; Christian Brauner <brauner@kernel.org>; Linus
> Torvalds <torvalds@linux-foundation.org>; netdev@vger.kernel.org; linux-
> fsdevel@vger.kernel.org; linux-kernel@vger.kernel.org; linux-mm@kvack.org;
> Bernard Metzler <BMT@zurich.ibm.com>; Tom Talpey <tom@talpey.com>; linux-
> rdma@vger.kernel.org
> Subject: [EXTERNAL] [RFC PATCH 08/28] siw: Inline do_tcp_sendpages()
> 
> do_tcp_sendpages() is now just a small wrapper around tcp_sendmsg_locked(),
> so inline it, allowing do_tcp_sendpages() to be removed.  This is part of
> replacing ->sendpage() with a call to sendmsg() with MSG_SPLICE_PAGES set.
> 
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Bernard Metzler <bmt@zurich.ibm.com>
> cc: Tom Talpey <tom@talpey.com>
> cc: "David S. Miller" <davem@davemloft.net>
> cc: Eric Dumazet <edumazet@google.com>
> cc: Jakub Kicinski <kuba@kernel.org>
> cc: Paolo Abeni <pabeni@redhat.com>
> cc: Jens Axboe <axboe@kernel.dk>
> cc: Matthew Wilcox <willy@infradead.org>
> cc: linux-rdma@vger.kernel.org
> cc: netdev@vger.kernel.org
> ---
>  drivers/infiniband/sw/siw/siw_qp_tx.c | 17 ++++++++++++-----
>  1 file changed, 12 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/infiniband/sw/siw/siw_qp_tx.c
> b/drivers/infiniband/sw/siw/siw_qp_tx.c
> index 05052b49107f..8fc179321e2b 100644
> --- a/drivers/infiniband/sw/siw/siw_qp_tx.c
> +++ b/drivers/infiniband/sw/siw/siw_qp_tx.c
> @@ -313,7 +313,7 @@ static int siw_tx_ctrl(struct siw_iwarp_tx *c_tx,
> struct socket *s,
>  }
> 


Hi David,

many thanks for looking into that!

I reply to a limited audience, expecting limited interest,
basically to the rdma list plus Tom.


>  /*
> - * 0copy TCP transmit interface: Use do_tcp_sendpages.
> + * 0copy TCP transmit interface: Use MSG_SPLICE_PAGES.
>   *
>   * Using sendpage to push page by page appears to be less efficient
>   * than using sendmsg, even if data are copied.

That is an interesting observation. Is efficiency to be read as
CPU load, or throughput on the wire, or both?

Back in the days, I introduced that zcopy path for efficiency
reasons - getting both better throughput and less CPU load.
I looked at both WRITE and READ performance. Using
do_tcp_sendpages() is currently limited to processing work
which is not registered with local completion generation.
Replying to a remote READ request is a typical case. Did
you check with READ?

> @@ -324,20 +324,27 @@ static int siw_tx_ctrl(struct siw_iwarp_tx *c_tx,
> struct socket *s,
>  static int siw_tcp_sendpages(struct socket *s, struct page **page, int
> offset,
>  			     size_t size)
>  {
> +	struct bio_vec bvec;
> +	struct msghdr msg = {
> +		.msg_flags = (MSG_SPLICE_PAGES | MSG_MORE | MSG_DONTWAIT |
> +			      MSG_SENDPAGE_NOTLAST),
> +	};
>  	struct sock *sk = s->sk;
> -	int i = 0, rv = 0, sent = 0,
> -	    flags = MSG_MORE | MSG_DONTWAIT | MSG_SENDPAGE_NOTLAST;
> +	int i = 0, rv = 0, sent = 0;
> 
>  	while (size) {
>  		size_t bytes = min_t(size_t, PAGE_SIZE - offset, size);
> 
>  		if (size + offset <= PAGE_SIZE)
> -			flags = MSG_MORE | MSG_DONTWAIT;
> +			msg.msg_flags = MSG_SPLICE_PAGES | MSG_MORE |
> MSG_DONTWAIT;
> 
>  		tcp_rate_check_app_limited(sk);
> +		bvec_set_page(&bvec, page[i], bytes, offset);
> +		iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
> +
>  try_page_again:
>  		lock_sock(sk);
> -		rv = do_tcp_sendpages(sk, page[i], offset, bytes, flags);
> +		rv = tcp_sendmsg_locked(sk, &msg, size);

Would that tcp_sendmsg_locked() with a msg flagged
MSG_SPLICE_PAGES still have zero copy semantics?


>  		release_sock(sk);
> 
>  		if (rv > 0) {


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [RFC PATCH 08/28] siw: Inline do_tcp_sendpages()
  2023-03-16 15:25 ` [RFC PATCH 08/28] siw: " David Howells
  2023-03-20 10:53   ` Bernard Metzler
@ 2023-03-20 11:08   ` David Howells
  2023-03-20 12:27     ` Bernard Metzler
  2023-03-20 13:13     ` David Howells
  1 sibling, 2 replies; 81+ messages in thread
From: David Howells @ 2023-03-20 11:08 UTC (permalink / raw)
  To: Bernard Metzler; +Cc: dhowells, Tom Talpey, linux-rdma

Bernard Metzler <BMT@zurich.ibm.com> wrote:

> >  /*
> > - * 0copy TCP transmit interface: Use do_tcp_sendpages.
> > + * 0copy TCP transmit interface: Use MSG_SPLICE_PAGES.
> >   *
> >   * Using sendpage to push page by page appears to be less efficient
> >   * than using sendmsg, even if data are copied.
> 
> That is an interesting observation. Is efficiency to be read as
> CPU load, or throughput on the wire, or both?

Um.  The observation in the comment is one you made, not me, according to git
blame.  I merely changed "do_tcp_sendpages" to "MSG_SPLICE_PAGES" in the first
line of the comment.

> Back in the days, I introduced that zcopy path for efficiency
> reasons - getting both better throughput and less CPU load.
> I looked at both WRITE and READ performance. Using
> do_tcp_sendpages() is currently limited to processing work
> which is not registered with local completion generation.
> Replying to a remote READ request is a typical case. Did
> you check with READ?

Ah - you're talking about ksmbd there?  I haven't tested the patch with that.

> > -		rv = do_tcp_sendpages(sk, page[i], offset, bytes, flags);
> > +		rv = tcp_sendmsg_locked(sk, &msg, size);
> 
> Would that tcp_sendmsg_locked() with a msg flagged
> MSG_SPLICE_PAGES still have zero copy semantics?

Yes - though I am considering making it conditional on whether the pages in
the iterator belong to the slab allocator (in which case they get copied) or
not.

David


^ permalink raw reply	[flat|nested] 81+ messages in thread

* RE: [RFC PATCH 08/28] siw: Inline do_tcp_sendpages()
  2023-03-20 11:08   ` David Howells
@ 2023-03-20 12:27     ` Bernard Metzler
  2023-03-20 13:13     ` David Howells
  1 sibling, 0 replies; 81+ messages in thread
From: Bernard Metzler @ 2023-03-20 12:27 UTC (permalink / raw)
  To: David Howells; +Cc: David Howells, Tom Talpey, linux-rdma



> -----Original Message-----
> From: David Howells <dhowells@redhat.com>
> Sent: Monday, 20 March 2023 12:09
> To: Bernard Metzler <BMT@zurich.ibm.com>
> Cc: David Howells <dhowells@redhat.com>; Tom Talpey <tom@talpey.com>;
> linux-rdma@vger.kernel.org
> Subject: [EXTERNAL] Re: [RFC PATCH 08/28] siw: Inline do_tcp_sendpages()
> 
> Bernard Metzler <BMT@zurich.ibm.com> wrote:
> 
> > >  /*
> > > - * 0copy TCP transmit interface: Use do_tcp_sendpages.
> > > + * 0copy TCP transmit interface: Use MSG_SPLICE_PAGES.
> > >   *
> > >   * Using sendpage to push page by page appears to be less efficient
> > >   * than using sendmsg, even if data are copied.
> >
> > That is an interesting observation. Is efficiency to be read as
> > CPU load, or throughput on the wire, or both?
> 
> Um.  The observation in the comment is one you made, not me according to
> git

Haha, yes. So sorry for that. I am getting older ;) I need
to put on some more sanity checks before posting here!


> blame.  I merely changed "do_tcp_sendpages" to "MSG_SPLICE_PAGES" in the
> first
> line of the comment.
> 
> > Back in the days, I introduced that zcopy path for efficiency
> > reasons - getting both better throughput and less CPU load.
> > I looked at both WRITE and READ performance. Using
> > do_tcp_sendpages() is currently limited to processing work
> > which is not registered with local completion generation.
> > Replying to a remote READ request is a typical case. Did
> > you check with READ?
> 
> Ah - you're talking about ksmbd there?  I haven't tested the patch with
> that.

Did you test with both kernel ULPs and user level applications?

> 
> > > -		rv = do_tcp_sendpages(sk, page[i], offset, bytes, flags);
> > > +		rv = tcp_sendmsg_locked(sk, &msg, size);
> >
> > Would that tcp_sendmsg_locked() with a msg flagged
> > MSG_SPLICE_PAGES still have zero copy semantics?
> 
> Yes - though I am considering making it conditional on whether the pages in
> the iterator belong to the slab allocator (in which case they get copied)
> or
> not.

Sounds good to me!
> 
> David


^ permalink raw reply	[flat|nested] 81+ messages in thread

* Re: [RFC PATCH 08/28] siw: Inline do_tcp_sendpages()
  2023-03-20 11:08   ` David Howells
  2023-03-20 12:27     ` Bernard Metzler
@ 2023-03-20 13:13     ` David Howells
  2023-03-20 13:18       ` Bernard Metzler
  1 sibling, 1 reply; 81+ messages in thread
From: David Howells @ 2023-03-20 13:13 UTC (permalink / raw)
  To: Bernard Metzler; +Cc: dhowells, Tom Talpey, linux-rdma

Bernard Metzler <BMT@zurich.ibm.com> wrote:

> > > Back in the days, I introduced that zcopy path for efficiency
> > > reasons - getting both better throughput and less CPU load.
> > > I looked at both WRITE and READ performance. Using
> > > do_tcp_sendpages() is currently limited to processing work
> > > which is not registered with local completion generation.
> > > Replying to a remote READ request is a typical case. Did
> > > you check with READ?
> > 
> > Ah - you're talking about ksmbd there?  I haven't tested the patch with
> > that.
> 
> Did you test with both kernel ULPs and user level applications?

Kernel "ULPs"?

As far as cifs goes, I've tested the fs with large dd commands for the moment,
but that's all.  This post was more to find out how attached people were to
->sendpage() and to see if anyone had any preferences on a couple of things
mentioned in the cover note.  This isn't aimed at the next merge window.

David


^ permalink raw reply	[flat|nested] 81+ messages in thread

* RE: [RFC PATCH 08/28] siw: Inline do_tcp_sendpages()
  2023-03-20 13:13     ` David Howells
@ 2023-03-20 13:18       ` Bernard Metzler
  0 siblings, 0 replies; 81+ messages in thread
From: Bernard Metzler @ 2023-03-20 13:18 UTC (permalink / raw)
  To: David Howells; +Cc: David Howells, Tom Talpey, linux-rdma



> -----Original Message-----
> From: David Howells <dhowells@redhat.com>
> Sent: Monday, 20 March 2023 14:13
> To: Bernard Metzler <BMT@zurich.ibm.com>
> Cc: David Howells <dhowells@redhat.com>; Tom Talpey <tom@talpey.com>;
> linux-rdma@vger.kernel.org
> Subject: [EXTERNAL] Re: [RFC PATCH 08/28] siw: Inline do_tcp_sendpages()
> 
> Bernard Metzler <BMT@zurich.ibm.com> wrote:
> 
> > > > Back in the days, I introduced that zcopy path for efficiency
> > > > reasons - getting both better throughput and less CPU load.
> > > > I looked at both WRITE and READ performance. Using
> > > > do_tcp_sendpages() is currently limited to processing work
> > > > which is not registered with local completion generation.
> > > > Replying to a remote READ request is a typical case. Did
> > > > you check with READ?
> > >
> > > Ah - you're talking about ksmbd there?  I haven't tested the patch with
> > > that.
> >
> > Did you test with both kernel ULPs and user level applications?
> 
> Kernel "ULPs"?

I was trying to refer to kernel applications or clients or
upper layer protocols (ulp, like nfs).

> 
> As far as cifs goes, I've tested the fs with large dd commands for the
> moment,
> but that's all.  This post was more to find out how attached people were to
> ->sendpage() and to see if anyone had any preferences on a couple of things
> mentioned in the cover note.  This isn't aimed at the next merge window.

I like your patches to siw a lot, since it would significantly
simplify the transmit code path.

Thank you,
Bernard.

> 
> David


^ permalink raw reply	[flat|nested] 81+ messages in thread

* RE:  [RFC PATCH 18/28] siw: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage to transmit
  2023-03-16 15:26 ` [RFC PATCH 18/28] siw: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage to transmit David Howells
@ 2023-03-20 13:39   ` Bernard Metzler
  0 siblings, 0 replies; 81+ messages in thread
From: Bernard Metzler @ 2023-03-20 13:39 UTC (permalink / raw)
  To: David Howells; +Cc: David Howells, Tom Talpey, linux-rdma



> -----Original Message-----
> From: David Howells <dhowells@redhat.com>
> Sent: Thursday, 16 March 2023 16:26
> To: Matthew Wilcox <willy@infradead.org>; David S. Miller
> <davem@davemloft.net>; Eric Dumazet <edumazet@google.com>; Jakub Kicinski
> <kuba@kernel.org>; Paolo Abeni <pabeni@redhat.com>
> Cc: David Howells <dhowells@redhat.com>; Al Viro <viro@zeniv.linux.org.uk>;
> Christoph Hellwig <hch@infradead.org>; Jens Axboe <axboe@kernel.dk>; Jeff
> Layton <jlayton@kernel.org>; Christian Brauner <brauner@kernel.org>; Linus
> Torvalds <torvalds@linux-foundation.org>; netdev@vger.kernel.org; linux-
> fsdevel@vger.kernel.org; linux-kernel@vger.kernel.org; linux-mm@kvack.org;
> Bernard Metzler <BMT@zurich.ibm.com>; Tom Talpey <tom@talpey.com>; linux-
> rdma@vger.kernel.org
> Subject: [EXTERNAL] [RFC PATCH 18/28] siw: Use sendmsg(MSG_SPLICE_PAGES)
> rather than sendpage to transmit
> 
> When transmitting data, call down into TCP using a single sendmsg with
> MSG_SPLICE_PAGES to indicate that content should be spliced rather than
> performing several sendmsg and sendpage calls to transmit header, data
> pages and trailer.
> 
> To make this work, the data is assembled in a bio_vec array and attached to
> a BVEC-type iterator.  The header and trailer (if present) are copied into
> memory acquired from zcopy_alloc() which just breaks a page up into small
> pieces that can be freed with put_page().
> 
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Bernard Metzler <bmt@zurich.ibm.com>
> cc: Tom Talpey <tom@talpey.com>
> cc: "David S. Miller" <davem@davemloft.net>
> cc: Eric Dumazet <edumazet@google.com>
> cc: Jakub Kicinski <kuba@kernel.org>
> cc: Paolo Abeni <pabeni@redhat.com>
> cc: Jens Axboe <axboe@kernel.dk>
> cc: Matthew Wilcox <willy@infradead.org>
> cc: linux-rdma@vger.kernel.org
> cc: netdev@vger.kernel.org
> ---
>  drivers/infiniband/sw/siw/siw_qp_tx.c | 231 +++++---------------------
>  1 file changed, 46 insertions(+), 185 deletions(-)
> 
> diff --git a/drivers/infiniband/sw/siw/siw_qp_tx.c
> b/drivers/infiniband/sw/siw/siw_qp_tx.c
> index 8fc179321e2b..ec4f0ac324ce 100644
> --- a/drivers/infiniband/sw/siw/siw_qp_tx.c
> +++ b/drivers/infiniband/sw/siw/siw_qp_tx.c
> @@ -8,6 +8,7 @@
>  #include <linux/net.h>
>  #include <linux/scatterlist.h>
>  #include <linux/highmem.h>
> +#include <linux/zcopy_alloc.h>
>  #include <net/tcp.h>
> 
>  #include <rdma/iw_cm.h>
> @@ -312,114 +313,8 @@ static int siw_tx_ctrl(struct siw_iwarp_tx *c_tx,
> struct socket *s,
>  	return rv;
>  }
> 


This patch looks really great! We would get rid of all the
convoluted page mapping and in-order unmapping mechanics, as
well as the extra 0copy transmit path, making the core siw
transmit function a lot less complex overall.

> -/*
> - * 0copy TCP transmit interface: Use MSG_SPLICE_PAGES.
> - *
> - * Using sendpage to push page by page appears to be less efficient
> - * than using sendmsg, even if data are copied.
> - *
> - * A general performance limitation might be the extra four bytes
> - * trailer checksum segment to be pushed after user data.
> - */
> -static int siw_tcp_sendpages(struct socket *s, struct page **page, int
> offset,
> -			     size_t size)
> -{
> -	struct bio_vec bvec;
> -	struct msghdr msg = {
> -		.msg_flags = (MSG_SPLICE_PAGES | MSG_MORE | MSG_DONTWAIT |
> -			      MSG_SENDPAGE_NOTLAST),
> -	};
> -	struct sock *sk = s->sk;
> -	int i = 0, rv = 0, sent = 0;
> -
> -	while (size) {
> -		size_t bytes = min_t(size_t, PAGE_SIZE - offset, size);
> -
> -		if (size + offset <= PAGE_SIZE)
> -			msg.msg_flags = MSG_SPLICE_PAGES | MSG_MORE |
> MSG_DONTWAIT;
> -
> -		tcp_rate_check_app_limited(sk);
> -		bvec_set_page(&bvec, page[i], bytes, offset);
> -		iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, size);
> -
> -try_page_again:
> -		lock_sock(sk);
> -		rv = tcp_sendmsg_locked(sk, &msg, size);
> -		release_sock(sk);
> -
> -		if (rv > 0) {
> -			size -= rv;
> -			sent += rv;
> -			if (rv != bytes) {
> -				offset += rv;
> -				bytes -= rv;
> -				goto try_page_again;
> -			}
> -			offset = 0;
> -		} else {
> -			if (rv == -EAGAIN || rv == 0)
> -				break;
> -			return rv;
> -		}
> -		i++;
> -	}
> -	return sent;
> -}
> -
> -/*
> - * siw_0copy_tx()
> - *
> - * Pushes list of pages to TCP socket. If pages from multiple
> - * SGE's, all referenced pages of each SGE are pushed in one
> - * shot.
> - */
> -static int siw_0copy_tx(struct socket *s, struct page **page,
> -			struct siw_sge *sge, unsigned int offset,
> -			unsigned int size)
> -{
> -	int i = 0, sent = 0, rv;
> -	int sge_bytes = min(sge->length - offset, size);
> -
> -	offset = (sge->laddr + offset) & ~PAGE_MASK;
> -
> -	while (sent != size) {
> -		rv = siw_tcp_sendpages(s, &page[i], offset, sge_bytes);
> -		if (rv >= 0) {
> -			sent += rv;
> -			if (size == sent || sge_bytes > rv)
> -				break;
> -
> -			i += PAGE_ALIGN(sge_bytes + offset) >> PAGE_SHIFT;
> -			sge++;
> -			sge_bytes = min(sge->length, size - sent);
> -			offset = sge->laddr & ~PAGE_MASK;
> -		} else {
> -			sent = rv;
> -			break;
> -		}
> -	}
> -	return sent;
> -}
> -
>  #define MAX_TRAILER (MPA_CRC_SIZE + 4)
> 
> -static void siw_unmap_pages(struct kvec *iov, unsigned long kmap_mask, int
> len)
> -{
> -	int i;
> -
> -	/*
> -	 * Work backwards through the array to honor the kmap_local_page()
> -	 * ordering requirements.
> -	 */
> -	for (i = (len-1); i >= 0; i--) {
> -		if (kmap_mask & BIT(i)) {
> -			unsigned long addr = (unsigned long)iov[i].iov_base;
> -
> -			kunmap_local((void *)(addr & PAGE_MASK));
> -		}
> -	}
> -}
> -
>  /*
>   * siw_tx_hdt() tries to push a complete packet to TCP where all
>   * packet fragments are referenced by the elements of one iovec.
> @@ -439,15 +334,13 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx,
> struct socket *s)
>  {
>  	struct siw_wqe *wqe = &c_tx->wqe_active;
>  	struct siw_sge *sge = &wqe->sqe.sge[c_tx->sge_idx];
> -	struct kvec iov[MAX_ARRAY];
> -	struct page *page_array[MAX_ARRAY];
> +	struct bio_vec bvec[MAX_ARRAY];
>  	struct msghdr msg = { .msg_flags = MSG_DONTWAIT | MSG_EOR };
> 
>  	int seg = 0, do_crc = c_tx->do_crc, is_kva = 0, rv;
>  	unsigned int data_len = c_tx->bytes_unsent, hdr_len = 0, trl_len = 0,
>  		     sge_off = c_tx->sge_off, sge_idx = c_tx->sge_idx,
>  		     pbl_idx = c_tx->pbl_idx;
> -	unsigned long kmap_mask = 0L;
> 
>  	if (c_tx->state == SIW_SEND_HDR) {
>  		if (c_tx->use_sendpage) {
> @@ -457,10 +350,12 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx,
> struct socket *s)

Can't we now also get rid of that extra hdr handling for zero copy
transmits? If we put all data pages and trailer crc in one bvec
anyway, can't we do that for the hdr as well unconditionally?
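
The question above amounts to treating the header like any other segment
of the vector.  A rough userspace analogy, assuming plain sendmsg() with an
iovec array standing in for the kernel's bio_vec/MSG_SPLICE_PAGES path;
send_frame() is a hypothetical helper for illustration and not siw code:

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Userspace analogy only: push header, payload and trailer with a single
 * sendmsg() call by gathering all three into one iovec array.
 * send_frame() is a made-up name, not a siw or kernel function.
 */
static ssize_t send_frame(int fd,
			  const void *hdr, size_t hdr_len,
			  const void *data, size_t data_len,
			  const void *trl, size_t trl_len)
{
	struct iovec iov[3];
	struct msghdr msg;
	int seg = 0;

	assert(hdr != NULL && trl != NULL);

	/* The header is just the first segment; no separate send path. */
	iov[seg].iov_base = (void *)hdr;
	iov[seg++].iov_len = hdr_len;

	/* The payload segment is optional. */
	if (data_len) {
		iov[seg].iov_base = (void *)data;
		iov[seg++].iov_len = data_len;
	}

	/* The trailer (e.g. a CRC) always goes last. */
	iov[seg].iov_base = (void *)trl;
	iov[seg++].iov_len = trl_len;

	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = iov;
	msg.msg_iovlen = seg;

	return sendmsg(fd, &msg, 0);
}
```

A caller would pass the three pieces in one go, for example
send_frame(fd, hdr, hdr_len, payload, payload_len, crc, 4), and handle a
short return value the same way for all segments.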

> 
>  			c_tx->state = SIW_SEND_DATA;
>  		} else {
> -			iov[0].iov_base =
> -				(char *)&c_tx->pkt.ctrl + c_tx->ctrl_sent;
> -			iov[0].iov_len = hdr_len =
> -				c_tx->ctrl_len - c_tx->ctrl_sent;
> +			const void *hdr = &c_tx->pkt.ctrl + c_tx->ctrl_sent;
> +
> +			hdr_len = c_tx->ctrl_len - c_tx->ctrl_sent;
> +			rv = zcopy_memdup(hdr_len, hdr, &bvec[0], GFP_NOFS);
> +			if (rv < 0)
> +				goto done;
>  			seg = 1;
>  		}
>  	}
> @@ -478,28 +373,9 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx,
> struct socket *s)
>  		} else {
>  			is_kva = 1;
>  		}
> -		if (is_kva && !c_tx->use_sendpage) {
> -			/*
> -			 * tx from kernel virtual address: either inline data
> -			 * or memory region with assigned kernel buffer
> -			 */
> -			iov[seg].iov_base =
> -				(void *)(uintptr_t)(sge->laddr + sge_off);
> -			iov[seg].iov_len = sge_len;
> -
> -			if (do_crc)
> -				crypto_shash_update(c_tx->mpa_crc_hd,
> -						    iov[seg].iov_base,
> -						    sge_len);
> -			sge_off += sge_len;
> -			data_len -= sge_len;
> -			seg++;
> -			goto sge_done;
> -		}
> 
>  		while (sge_len) {
>  			size_t plen = min((int)PAGE_SIZE - fp_off, sge_len);
> -			void *kaddr;
> 
>  			if (!is_kva) {
>  				struct page *p;
> @@ -512,33 +388,12 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx,
> struct socket *s)
>  					p = siw_get_upage(mem->umem,
>  							  sge->laddr + sge_off);
>  				if (unlikely(!p)) {
> -					siw_unmap_pages(iov, kmap_mask, seg);
>  					wqe->processed -= c_tx->bytes_unsent;
>  					rv = -EFAULT;
>  					goto done_crc;
>  				}
> -				page_array[seg] = p;
> -
> -				if (!c_tx->use_sendpage) {
> -					void *kaddr = kmap_local_page(p);
> -
> -					/* Remember for later kunmap() */
> -					kmap_mask |= BIT(seg);
> -					iov[seg].iov_base = kaddr + fp_off;
> -					iov[seg].iov_len = plen;
> -
> -					if (do_crc)
> -						crypto_shash_update(
> -							c_tx->mpa_crc_hd,
> -							iov[seg].iov_base,
> -							plen);
> -				} else if (do_crc) {
> -					kaddr = kmap_local_page(p);
> -					crypto_shash_update(c_tx->mpa_crc_hd,
> -							    kaddr + fp_off,
> -							    plen);
> -					kunmap_local(kaddr);
> -				}
> +
> +				bvec_set_page(&bvec[seg], p, plen, fp_off);
>  			} else {
>  				/*
>  				 * Cast to an uintptr_t to preserve all 64 bits
> @@ -552,12 +407,15 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx,
> struct socket *s)
>  				 * bits on a 64 bit platform and 32 bits on a
>  				 * 32 bit platform.
>  				 */
> -				page_array[seg] = virt_to_page((void *)(va &
> PAGE_MASK));
> -				if (do_crc)
> -					crypto_shash_update(
> -						c_tx->mpa_crc_hd,
> -						(void *)va,
> -						plen);
> +				bvec_set_virt(&bvec[seg], (void *)va, plen);
> +			}
> +
> +			if (do_crc) {
> +				void *kaddr = kmap_local_page(bvec[seg].bv_page);
> +				crypto_shash_update(c_tx->mpa_crc_hd,
> +						    kaddr + bvec[seg].bv_offset,
> +						    bvec[seg].bv_len);
> +				kunmap_local(kaddr);
>  			}
> 
>  			sge_len -= plen;
> @@ -567,13 +425,12 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx,
> struct socket *s)
> 
>  			if (++seg > (int)MAX_ARRAY) {
>  				siw_dbg_qp(tx_qp(c_tx), "to many fragments\n");
> -				siw_unmap_pages(iov, kmap_mask, seg-1);
>  				wqe->processed -= c_tx->bytes_unsent;
>  				rv = -EMSGSIZE;
>  				goto done_crc;
>  			}
>  		}
> -sge_done:
> +
>  		/* Update SGE variables at end of SGE */
>  		if (sge_off == sge->length &&
>  		    (data_len != 0 || wqe->processed < wqe->bytes)) {
> @@ -582,15 +439,8 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx,
> struct socket *s)
>  			sge_off = 0;
>  		}
>  	}
> -	/* trailer */
> -	if (likely(c_tx->state != SIW_SEND_TRAILER)) {
> -		iov[seg].iov_base = &c_tx->trailer.pad[4 - c_tx->pad];
> -		iov[seg].iov_len = trl_len = MAX_TRAILER - (4 - c_tx->pad);
> -	} else {
> -		iov[seg].iov_base = &c_tx->trailer.pad[c_tx->ctrl_sent];
> -		iov[seg].iov_len = trl_len = MAX_TRAILER - c_tx->ctrl_sent;
> -	}
> 
> +	/* Set the CRC in the trailer */
>  	if (c_tx->pad) {
>  		*(u32 *)c_tx->trailer.pad = 0;
>  		if (do_crc)
> @@ -603,23 +453,31 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx,
> struct socket *s)
>  	else if (do_crc)
>  		crypto_shash_final(c_tx->mpa_crc_hd, (u8 *)&c_tx->trailer.crc);
> 
> -	data_len = c_tx->bytes_unsent;
> +	/* Copy the trailer and add it to the output list */
> +	if (likely(c_tx->state != SIW_SEND_TRAILER)) {
> +		void *trl = &c_tx->trailer.pad[4 - c_tx->pad];
> 
> -	if (c_tx->use_sendpage) {
> -		rv = siw_0copy_tx(s, page_array, &wqe->sqe.sge[c_tx->sge_idx],
> -				  c_tx->sge_off, data_len);
> -		if (rv == data_len) {
> -			rv = kernel_sendmsg(s, &msg, &iov[seg], 1, trl_len);
> -			if (rv > 0)
> -				rv += data_len;
> -			else
> -				rv = data_len;
> -		}
> +		trl_len = MAX_TRAILER - (4 - c_tx->pad);
> +		rv = zcopy_memdup(trl_len, trl, &bvec[seg], GFP_NOFS);
> +		if (rv < 0)
> +			goto done_crc;
>  	} else {
> -		rv = kernel_sendmsg(s, &msg, iov, seg + 1,
> -				    hdr_len + data_len + trl_len);
> -		siw_unmap_pages(iov, kmap_mask, seg);
> +		void *trl = &c_tx->trailer.pad[c_tx->ctrl_sent];
> +
> +		trl_len = MAX_TRAILER - c_tx->ctrl_sent;
> +		rv = zcopy_memdup(trl_len, trl, &bvec[seg], GFP_NOFS);
> +		if (rv < 0)
> +			goto done_crc;
>  	}
> +
> +	data_len = c_tx->bytes_unsent;
> +
> +	if (c_tx->use_sendpage)
> +		msg.msg_flags |= MSG_SPLICE_PAGES;
> +	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, bvec, seg + 1,
> +		      hdr_len + data_len + trl_len);
> +	rv = sock_sendmsg(s, &msg);
> +
>  	if (rv < (int)hdr_len) {
>  		/* Not even complete hdr pushed or negative rv */
>  		wqe->processed -= data_len;
> @@ -680,6 +538,9 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx, struct
> socket *s)
>  	}
>  done_crc:
>  	c_tx->do_crc = 0;
> +	if (c_tx->state == SIW_SEND_HDR)
> +		folio_put(page_folio(bvec[0].bv_page));
> +	folio_put(page_folio(bvec[seg].bv_page));
>  done:
>  	return rv;
>  }



* Re: [RFC PATCH 03/28] tcp: Support MSG_SPLICE_PAGES
  2023-03-16 18:44   ` David Howells
  2023-03-16 19:00     ` Willem de Bruijn
@ 2023-03-21  0:38     ` David Howells
  2023-03-21 14:22       ` Willem de Bruijn
  1 sibling, 1 reply; 81+ messages in thread
From: David Howells @ 2023-03-21  0:38 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: dhowells, Matthew Wilcox, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Al Viro, Christoph Hellwig,
	Jens Axboe, Jeff Layton, Christian Brauner, Linus Torvalds,
	netdev, linux-fsdevel, linux-kernel, linux-mm

Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote:

> David Howells wrote:
> > Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote:
> > 
> > > The commit message mentions MSG_SPLICE_PAGES as an internal flag.
> > > 
> > > It can be passed from userspace. The code anticipates that and checks
> > > preconditions.
> > 
> > Should I add a separate field in the in-kernel msghdr struct for such internal
> > flags?  That would also avoid putting an internal flag in the same space as
> > the uapi flags.
> 
> That would work, if no cost to common paths that don't need it.

Actually, it might be tricky.  __ip_append_data() doesn't take a msghdr struct
pointer per se.  The "void *from" argument *might* point to one - but it
depends on seeing a MSG_SPLICE_PAGES or MSG_ZEROCOPY flag, otherwise we don't
know.

Possibly this changes if sendpage goes away.
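
The difficulty with an opaque "from" pointer can be sketched in plain C.
This is a toy model under stated assumptions: struct demo_msghdr,
DEMO_MSG_SPLICE_PAGES and demo_from_to_msg() are made-up names, and the
flag value is arbitrary; it is not the __ip_append_data() code.

```c
#include <assert.h>
#include <stddef.h>

/* Made-up flag value for this sketch only. */
#define DEMO_MSG_SPLICE_PAGES 0x1u

/* Minimal stand-in for a message header. */
struct demo_msghdr {
	size_t count;	/* bytes left to send */
};

/*
 * Toy stand-in for a routine that receives an opaque "from" cookie.
 * Only the flag tells us whether "from" really is a demo_msghdr.
 * Returns the message pointer when it may be used, NULL otherwise.
 */
static struct demo_msghdr *demo_from_to_msg(void *from, unsigned int flags)
{
	assert(from != NULL);

	if (flags & DEMO_MSG_SPLICE_PAGES)
		return from;	/* the caller promised a msghdr */

	return NULL;		/* could be anything: do not dereference */
}
```

Without the flag the callee cannot safely reach any extra field in the
header, which is why a separate internal-flags field is awkward while the
pointer stays untyped.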

> A not very pretty alternative would be to add an extra arg to each
> sendmsg handler that is used only when called from sendpage.
> 
> There are a few other internal MSG_.. flags, such as
> MSG_SENDPAGE_NOPOLICY. Those are all limited to sendpage, and ignored
> in sendmsg, I think. Which would explain why it was clearly safe to
> add them.

Should those be moved across to the internal flags with MSG_SPLICE_PAGES?

David



* Re: [RFC PATCH 03/28] tcp: Support MSG_SPLICE_PAGES
  2023-03-21  0:38     ` David Howells
@ 2023-03-21 14:22       ` Willem de Bruijn
  2023-03-22 13:56         ` [RFC PATCH 0/3] net: Drop size arg from ->sendmsg() and pass msghdr into __ip{,6}_append_data() David Howells
  0 siblings, 1 reply; 81+ messages in thread
From: Willem de Bruijn @ 2023-03-21 14:22 UTC (permalink / raw)
  To: David Howells, Willem de Bruijn
  Cc: dhowells, Matthew Wilcox, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Al Viro, Christoph Hellwig,
	Jens Axboe, Jeff Layton, Christian Brauner, Linus Torvalds,
	netdev, linux-fsdevel, linux-kernel, linux-mm

David Howells wrote:
> Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote:
> 
> > David Howells wrote:
> > > Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote:
> > > 
> > > > The commit message mentions MSG_SPLICE_PAGES as an internal flag.
> > > > 
> > > > It can be passed from userspace. The code anticipates that and checks
> > > > preconditions.
> > > 
> > > Should I add a separate field in the in-kernel msghdr struct for such internal
> > > flags?  That would also avoid putting an internal flag in the same space as
> > > the uapi flags.
> > 
> > That would work, if no cost to common paths that don't need it.
> 
> Actually, it might be tricky.  __ip_append_data() doesn't take a msghdr struct
> pointer per se.  The "void *from" argument *might* point to one - but it
> depends on seeing a MSG_SPLICE_PAGES or MSG_ZEROCOPY flag, otherwise we don't
> know.
> 
> Possibly this changes if sendpage goes away.

Is it sufficient to mask out this bit in tcp_sendmsg_locked and
udp_sendmsg if passed from userspace (and should be ignored), and pass
it through flags to callees like ip_append_data?
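
A minimal sketch of that masking idea, assuming illustrative flag values
and hypothetical helper names (demo_user_entry_flags() and
demo_callee_splices() do not exist in the kernel):

```c
#include <assert.h>

/* Illustrative values only; not the real kernel flag values. */
#define DEMO_MSG_MORE		0x1u	/* stands in for a uapi flag */
#define DEMO_MSG_SPLICE_PAGES	0x2u	/* stands in for an internal flag */
#define DEMO_INTERNAL_MASK	DEMO_MSG_SPLICE_PAGES

/* Entry point reached from userspace: drop internal-only bits. */
static unsigned int demo_user_entry_flags(unsigned int user_flags)
{
	return user_flags & ~DEMO_INTERNAL_MASK;
}

/* Callee further down the chain: acts on whatever flags it is handed. */
static int demo_callee_splices(unsigned int flags)
{
	return (flags & DEMO_MSG_SPLICE_PAGES) != 0;
}
```

In this model a kernel caller passes its flags straight to the callee,
while anything arriving from userspace goes through
demo_user_entry_flags() first, so the internal bit is silently ignored.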
> 
> > A not very pretty alternative would be to add an extra arg to each
> > sendmsg handler that is used only when called from sendpage.
> > 
> > There are a few other internal MSG_.. flags, such as
> > MSG_SENDPAGE_NOPOLICY. Those are all limited to sendpage, and ignored
> > in sendmsg, I think. Which would explain why it was clearly safe to
> > add them.
> 
> Should those be moved across to the internal flags with MSG_SPLICE_PAGES?

I would not include that in this patch series.



* Re: [RFC PATCH 27/28] sunrpc: Use sendmsg(MSG_SPLICE_PAGES) rather then sendpage
  2023-03-16 18:06     ` David Howells
  2023-03-16 19:01       ` Trond Myklebust
@ 2023-03-22 13:10       ` David Howells
  2023-03-22 18:15       ` [RFC PATCH] iov_iter: Add an iterator-of-iterators David Howells
  2 siblings, 0 replies; 81+ messages in thread
From: David Howells @ 2023-03-22 13:10 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: dhowells, Matthew Wilcox, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexander Viro, Christoph Hellwig,
	Jens Axboe, Jeffrey Layton, Christian Brauner, Linus Torvalds,
	netdev, linux-fsdevel, linux-kernel, linux-mm, Anna Schumaker,
	Charles Edward Lever, linux-nfs

Trond Myklebust <trondmy@hammerspace.com> wrote:

> Add an enum iter_type for ITER_ITER ? :-)

Actually, that might not be such a bad idea, now that I've pondered on it some
more.  Give it an array of iterators and add a flag to each iterator to say if
it can be spliced from or not.

Once ITER_PIPE is killed off, advancing and reverting over it should be pretty
straightforward - though each iterator would also need to keep track of how
big it started off as in order that it can be reverted over.
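
The idea can be modelled in a few lines of userspace C.  This is only a
sketch under assumptions: the struct and function names are invented for
illustration, each sub-iterator is reduced to a byte count, and nothing
here reflects an actual ITER_ITER implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One element of the array: a simplified iterator. */
struct demo_sub_iter {
	size_t orig_count;	/* size at creation, needed for revert */
	size_t count;		/* bytes still unconsumed */
	bool   can_splice;	/* may pages be spliced from this one? */
};

/* The iterator of iterators. */
struct demo_iter_iter {
	struct demo_sub_iter *sub;
	size_t nr;
};

/* Total bytes left across all sub-iterators. */
static size_t demo_total_left(const struct demo_iter_iter *ii)
{
	size_t total = 0;

	for (size_t i = 0; i < ii->nr; i++)
		total += ii->sub[i].count;
	return total;
}

/* Consume n bytes, front to back. */
static void demo_advance(struct demo_iter_iter *ii, size_t n)
{
	for (size_t i = 0; i < ii->nr && n; i++) {
		size_t step = ii->sub[i].count < n ? ii->sub[i].count : n;

		ii->sub[i].count -= step;
		n -= step;
	}
}

/*
 * Give back n bytes, back to front.  The stored original size bounds how
 * far each sub-iterator can be wound back.
 */
static void demo_revert(struct demo_iter_iter *ii, size_t n)
{
	for (size_t i = ii->nr; i-- > 0 && n; ) {
		size_t used = ii->sub[i].orig_count - ii->sub[i].count;
		size_t step = used < n ? used : n;

		ii->sub[i].count += step;
		n -= step;
	}
}
```

The can_splice flag is carried per element so that a consumer could decide,
segment by segment, whether to splice or copy.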

David



* [RFC PATCH 0/3] net: Drop size arg from ->sendmsg() and pass msghdr into __ip{,6}_append_data()
  2023-03-21 14:22       ` Willem de Bruijn
@ 2023-03-22 13:56         ` David Howells
  2023-03-22 13:56             ` David Howells
                             ` (3 more replies)
  0 siblings, 4 replies; 81+ messages in thread
From: David Howells @ 2023-03-22 13:56 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: David Howells, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Matthew Wilcox, Jeff Layton, Linus Torvalds, netdev,
	linux-kernel

Hi Willem,

Here's another option for passing MSG_SPLICE_PAGES into sendmsg()[1] without
polluting the flags in msg->msg_flags.  The idea here is to put the flag
into a new field in msghdr, msg_kflags, that holds internal kernel flags
that aren't available to userspace.

What I've done here is:

 (1) Pass msg down to __ip_append_data() and __ip6_append_data() so that
     they can access the extra flags.

 (2) In order to avoid adding extra arguments to these functions and the
     functions in their call chains (such as ip_make_skb()), remove the
     size and flags arguments as these values are redundant if msg is
     passed in.

 (3) msg is then passed into getfrag().  I would like to get rid of the
     "from" argument also in favour of using something in msghdr, but I'm
     not sure how best to do that.

 (4) The size parameter to ->sendmsg() seems to be redundant; indeed
     sock_sendmsg() doesn't actually take it, but rather gets the count
     from msg_iter - so remove this parameter.

     kernel_sendmsg() will still take a size, but it sets it on the
     iterator and then calls sock_sendmsg().

 (5) Protocol sendmsg implementations then extract the length from the
     iterator and the flags from the msghdr.

 (6) Illustrate the addition of msg_kflags and MSG_SPLICE_PAGES.  I think
     that, at some point in the future, some of the other flags could be
     moved from msg_flags to msg_kflags.
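
The shape of points (4) to (6) can be sketched in userspace C.  All names
below carry a demo_ prefix and are assumptions made for illustration; the
flag value is arbitrary and the structures are heavily reduced compared to
the kernel's struct msghdr and struct iov_iter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit for this sketch; the real name and value may differ. */
#define DEMO_KMSG_SPLICE_PAGES 0x1u

struct demo_iter {
	size_t count;			/* bytes remaining in the iterator */
};

struct demo_msghdr {
	struct demo_iter msg_iter;
	unsigned int msg_flags;		/* flags userspace may set */
	unsigned int msg_kflags;	/* internal flags, kernel callers only */
};

/* Length is read from the iterator instead of a separate argument. */
static size_t demo_msg_data_left(const struct demo_msghdr *msg)
{
	return msg->msg_iter.count;
}

/* The internal flag lives in its own field, away from the uapi flags. */
static bool demo_wants_splice(const struct demo_msghdr *msg)
{
	return msg->msg_kflags & DEMO_KMSG_SPLICE_PAGES;
}

/* sendmsg-style handler without a size argument. */
static long demo_sendmsg(struct demo_msghdr *msg)
{
	size_t len = demo_msg_data_left(msg);

	msg->msg_iter.count = 0;	/* pretend everything was sent */
	return (long)len;
}

/* Wrapper that still accepts a size but only stores it on the iterator. */
static long demo_kernel_sendmsg(struct demo_msghdr *msg, size_t size)
{
	assert(msg != NULL);
	msg->msg_iter.count = size;
	return demo_sendmsg(msg);
}
```

Because the handler reads everything from the header, adding a further
internal flag later would not change any function signature in this model.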

David

Link: https://lore.kernel.org/r/20230316152618.711970-1-dhowells@redhat.com/ [1]

David Howells (3):
  net: Drop the size argument from ->sendmsg()
  ip: Make __ip{,6}_append_data() and co. take a msghdr*
  net: Declare MSG_SPLICE_PAGES internal sendmsg() flag

 crypto/af_alg.c                               | 12 +--
 crypto/algif_aead.c                           |  9 +--
 crypto/algif_hash.c                           |  8 +-
 crypto/algif_rng.c                            |  3 +-
 crypto/algif_skcipher.c                       | 10 +--
 drivers/isdn/mISDN/socket.c                   |  3 +-
 .../chelsio/inline_crypto/chtls/chtls.h       |  2 +-
 .../chelsio/inline_crypto/chtls/chtls_io.c    | 15 ++--
 drivers/net/ppp/pppoe.c                       |  4 +-
 drivers/net/tap.c                             |  3 +-
 drivers/net/tun.c                             |  3 +-
 drivers/vhost/net.c                           |  6 +-
 drivers/xen/pvcalls-back.c                    |  2 +-
 drivers/xen/pvcalls-front.c                   |  4 +-
 drivers/xen/pvcalls-front.h                   |  3 +-
 fs/afs/rxrpc.c                                |  8 +-
 include/crypto/if_alg.h                       |  3 +-
 include/linux/lsm_hook_defs.h                 |  3 +-
 include/linux/lsm_hooks.h                     |  1 -
 include/linux/net.h                           |  6 +-
 include/linux/security.h                      |  4 +-
 include/linux/socket.h                        |  3 +
 include/net/af_rxrpc.h                        |  3 +-
 include/net/inet_common.h                     |  2 +-
 include/net/ip.h                              | 24 +++---
 include/net/ipv6.h                            | 22 +++---
 include/net/ping.h                            |  7 +-
 include/net/sock.h                            |  7 +-
 include/net/tcp.h                             |  8 +-
 include/net/udp.h                             |  2 +-
 include/net/udplite.h                         |  4 +-
 net/appletalk/ddp.c                           |  3 +-
 net/atm/common.c                              |  3 +-
 net/atm/common.h                              |  2 +-
 net/ax25/af_ax25.c                            |  4 +-
 net/bluetooth/hci_sock.c                      |  4 +-
 net/bluetooth/iso.c                           |  4 +-
 net/bluetooth/l2cap_sock.c                    |  5 +-
 net/bluetooth/rfcomm/sock.c                   |  7 +-
 net/bluetooth/sco.c                           |  4 +-
 net/caif/caif_socket.c                        | 13 ++--
 net/can/bcm.c                                 |  3 +-
 net/can/isotp.c                               |  3 +-
 net/can/j1939/socket.c                        |  4 +-
 net/can/raw.c                                 |  3 +-
 net/core/sock.c                               |  4 +-
 net/dccp/dccp.h                               |  2 +-
 net/dccp/proto.c                              |  3 +-
 net/ieee802154/socket.c                       | 11 +--
 net/ipv4/af_inet.c                            |  4 +-
 net/ipv4/icmp.c                               | 14 ++--
 net/ipv4/ip_output.c                          | 73 ++++++++++---------
 net/ipv4/ping.c                               | 18 ++---
 net/ipv4/raw.c                                | 23 +++---
 net/ipv4/tcp.c                                | 17 +++--
 net/ipv4/tcp_bpf.c                            |  5 +-
 net/ipv4/tcp_input.c                          |  3 +-
 net/ipv4/udp.c                                | 24 +++---
 net/ipv6/af_inet6.c                           |  7 +-
 net/ipv6/icmp.c                               | 21 ++++--
 net/ipv6/ip6_output.c                         | 57 +++++++--------
 net/ipv6/ping.c                               | 12 +--
 net/ipv6/raw.c                                | 25 +++----
 net/ipv6/udp.c                                | 26 ++++---
 net/ipv6/udp_impl.h                           |  2 +-
 net/iucv/af_iucv.c                            |  4 +-
 net/kcm/kcmsock.c                             |  2 +-
 net/key/af_key.c                              |  3 +-
 net/l2tp/l2tp_ip.c                            |  3 +-
 net/l2tp/l2tp_ip6.c                           |  3 +-
 net/l2tp/l2tp_ppp.c                           |  4 +-
 net/llc/af_llc.c                              |  5 +-
 net/mctp/af_mctp.c                            |  3 +-
 net/mptcp/protocol.c                          |  8 +-
 net/netlink/af_netlink.c                      | 11 +--
 net/netrom/af_netrom.c                        |  3 +-
 net/nfc/llcp_sock.c                           |  7 +-
 net/nfc/rawsock.c                             |  3 +-
 net/packet/af_packet.c                        | 11 +--
 net/phonet/datagram.c                         |  3 +-
 net/phonet/pep.c                              |  3 +-
 net/phonet/socket.c                           |  5 +-
 net/qrtr/af_qrtr.c                            |  4 +-
 net/rds/rds.h                                 |  2 +-
 net/rds/send.c                                |  3 +-
 net/rose/af_rose.c                            |  3 +-
 net/rxrpc/af_rxrpc.c                          |  6 +-
 net/rxrpc/ar-internal.h                       |  2 +-
 net/rxrpc/output.c                            | 22 +++---
 net/rxrpc/rxperf.c                            |  4 +-
 net/rxrpc/sendmsg.c                           | 15 ++--
 net/sctp/socket.c                             |  3 +-
 net/smc/af_smc.c                              |  5 +-
 net/socket.c                                  | 16 ++--
 net/tipc/socket.c                             | 34 ++++-----
 net/tls/tls.h                                 |  4 +-
 net/tls/tls_device.c                          |  5 +-
 net/tls/tls_sw.c                              |  2 +-
 net/unix/af_unix.c                            | 19 +++--
 net/vmw_vsock/af_vsock.c                      | 16 ++--
 net/x25/af_x25.c                              |  3 +-
 net/xdp/xsk.c                                 |  6 +-
 net/xfrm/espintcp.c                           |  8 +-
 security/apparmor/lsm.c                       |  6 +-
 security/security.c                           |  4 +-
 security/selinux/hooks.c                      |  3 +-
 security/smack/smack_lsm.c                    |  4 +-
 security/tomoyo/common.h                      |  3 +-
 security/tomoyo/network.c                     |  4 +-
 security/tomoyo/tomoyo.c                      |  6 +-
 110 files changed, 444 insertions(+), 456 deletions(-)



* [RFC PATCH 1/3] net: Drop the size argument from ->sendmsg()
  2023-03-22 13:56         ` [RFC PATCH 0/3] net: Drop size arg from ->sendmsg() and pass msghdr into __ip{,6}_append_data() David Howells
  2023-03-22 13:56             ` David Howells
@ 2023-03-22 13:56             ` David Howells
  2023-03-22 13:56           ` [RFC PATCH 3/3] net: Declare MSG_SPLICE_PAGES internal sendmsg() flag David Howells
  2023-03-23  1:17           ` [RFC PATCH 0/3] net: Drop size arg from ->sendmsg() and pass msghdr into __ip{,6}_append_data() Willem de Bruijn
  3 siblings, 0 replies; 81+ messages in thread
From: David Howells @ 2023-03-22 13:56 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: kvm, virtualization, David Howells, Eric Dumazet, linux-afs,
	linux-s390, rds-devel, linux-x25, dccp, linux-rdma,
	linux-security-module, Matthew Wilcox, linux-wpan,
	Jakub Kicinski, Paolo Abeni, selinux, linux-arm-msm, apparmor,
	linux-can, xen-devel, linux-hams, mptcp, netdev, Jeff Layton,
	linux-kernel, linux-bluetooth, linux-sctp, tipc-discussion,
	linux-crypto, bpf, Linus Torvalds, David S. Miller

The size argument to ->sendmsg() ought to be redundant as the same
information should be conveyed by msg->msg_iter.count as returned by
msg_data_left().

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Eric Dumazet <edumazet@google.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: netdev@vger.kernel.org
cc: apparmor@lists.ubuntu.com
cc: bpf@vger.kernel.org
cc: dccp@vger.kernel.org
cc: kvm@vger.kernel.org
cc: linux-afs@lists.infradead.org
cc: linux-arm-msm@vger.kernel.org
cc: linux-bluetooth@vger.kernel.org
cc: linux-can@vger.kernel.org
cc: linux-crypto@vger.kernel.org
cc: linux-hams@vger.kernel.org
cc: linux-rdma@vger.kernel.org
cc: linux-s390@vger.kernel.org
cc: linux-sctp@vger.kernel.org
cc: linux-security-module@vger.kernel.org
cc: linux-wpan@vger.kernel.org
cc: linux-x25@vger.kernel.org
cc: mptcp@lists.linux.dev
cc: rds-devel@oss.oracle.com
cc: selinux@vger.kernel.org
cc: tipc-discussion@lists.sourceforge.net
cc: virtualization@lists.linux-foundation.org
cc: xen-devel@lists.xenproject.org
---
 crypto/af_alg.c                               | 12 +++----
 crypto/algif_aead.c                           |  9 +++--
 crypto/algif_hash.c                           |  8 ++---
 crypto/algif_rng.c                            |  3 +-
 crypto/algif_skcipher.c                       | 10 +++---
 drivers/isdn/mISDN/socket.c                   |  3 +-
 .../chelsio/inline_crypto/chtls/chtls.h       |  2 +-
 .../chelsio/inline_crypto/chtls/chtls_io.c    | 15 ++++----
 drivers/net/ppp/pppoe.c                       |  4 +--
 drivers/net/tap.c                             |  3 +-
 drivers/net/tun.c                             |  3 +-
 drivers/vhost/net.c                           |  6 ++--
 drivers/xen/pvcalls-back.c                    |  2 +-
 drivers/xen/pvcalls-front.c                   |  4 +--
 drivers/xen/pvcalls-front.h                   |  3 +-
 fs/afs/rxrpc.c                                |  8 ++---
 include/crypto/if_alg.h                       |  3 +-
 include/linux/lsm_hook_defs.h                 |  3 +-
 include/linux/lsm_hooks.h                     |  1 -
 include/linux/net.h                           |  6 ++--
 include/linux/security.h                      |  4 +--
 include/net/af_rxrpc.h                        |  3 +-
 include/net/inet_common.h                     |  2 +-
 include/net/ipv6.h                            |  2 +-
 include/net/ping.h                            |  2 +-
 include/net/sock.h                            |  7 ++--
 include/net/tcp.h                             |  8 ++---
 include/net/udp.h                             |  2 +-
 net/appletalk/ddp.c                           |  3 +-
 net/atm/common.c                              |  3 +-
 net/atm/common.h                              |  2 +-
 net/ax25/af_ax25.c                            |  4 +--
 net/bluetooth/hci_sock.c                      |  4 +--
 net/bluetooth/iso.c                           |  4 +--
 net/bluetooth/l2cap_sock.c                    |  5 ++-
 net/bluetooth/rfcomm/sock.c                   |  7 ++--
 net/bluetooth/sco.c                           |  4 +--
 net/caif/caif_socket.c                        | 13 +++----
 net/can/bcm.c                                 |  3 +-
 net/can/isotp.c                               |  3 +-
 net/can/j1939/socket.c                        |  4 +--
 net/can/raw.c                                 |  3 +-
 net/core/sock.c                               |  4 +--
 net/dccp/dccp.h                               |  2 +-
 net/dccp/proto.c                              |  3 +-
 net/ieee802154/socket.c                       | 11 +++---
 net/ipv4/af_inet.c                            |  4 +--
 net/ipv4/ping.c                               |  8 +++--
 net/ipv4/raw.c                                |  3 +-
 net/ipv4/tcp.c                                | 17 +++++-----
 net/ipv4/tcp_bpf.c                            |  5 +--
 net/ipv4/tcp_input.c                          |  3 +-
 net/ipv4/udp.c                                |  5 +--
 net/ipv6/af_inet6.c                           |  7 ++--
 net/ipv6/ping.c                               |  5 +--
 net/ipv6/raw.c                                |  3 +-
 net/ipv6/udp.c                                |  7 ++--
 net/ipv6/udp_impl.h                           |  2 +-
 net/iucv/af_iucv.c                            |  4 +--
 net/kcm/kcmsock.c                             |  2 +-
 net/key/af_key.c                              |  3 +-
 net/l2tp/l2tp_ip.c                            |  3 +-
 net/l2tp/l2tp_ip6.c                           |  3 +-
 net/l2tp/l2tp_ppp.c                           |  4 +--
 net/llc/af_llc.c                              |  5 ++-
 net/mctp/af_mctp.c                            |  3 +-
 net/mptcp/protocol.c                          |  8 ++---
 net/netlink/af_netlink.c                      | 11 +++---
 net/netrom/af_netrom.c                        |  3 +-
 net/nfc/llcp_sock.c                           |  7 ++--
 net/nfc/rawsock.c                             |  3 +-
 net/packet/af_packet.c                        | 11 +++---
 net/phonet/datagram.c                         |  3 +-
 net/phonet/pep.c                              |  3 +-
 net/phonet/socket.c                           |  5 ++-
 net/qrtr/af_qrtr.c                            |  4 +--
 net/rds/rds.h                                 |  2 +-
 net/rds/send.c                                |  3 +-
 net/rose/af_rose.c                            |  3 +-
 net/rxrpc/af_rxrpc.c                          |  6 ++--
 net/rxrpc/ar-internal.h                       |  2 +-
 net/rxrpc/output.c                            | 22 ++++++------
 net/rxrpc/rxperf.c                            |  4 +--
 net/rxrpc/sendmsg.c                           | 15 ++++----
 net/sctp/socket.c                             |  3 +-
 net/smc/af_smc.c                              |  5 +--
 net/socket.c                                  | 16 ++++-----
 net/tipc/socket.c                             | 34 +++++++++----------
 net/tls/tls.h                                 |  4 +--
 net/tls/tls_device.c                          |  5 +--
 net/tls/tls_sw.c                              |  2 +-
 net/unix/af_unix.c                            | 19 +++++------
 net/vmw_vsock/af_vsock.c                      | 16 ++++-----
 net/x25/af_x25.c                              |  3 +-
 net/xdp/xsk.c                                 |  6 ++--
 net/xfrm/espintcp.c                           |  8 +++--
 security/apparmor/lsm.c                       |  6 ++--
 security/security.c                           |  4 +--
 security/selinux/hooks.c                      |  3 +-
 security/smack/smack_lsm.c                    |  4 +--
 security/tomoyo/common.h                      |  3 +-
 security/tomoyo/network.c                     |  4 +--
 security/tomoyo/tomoyo.c                      |  6 ++--
 103 files changed, 286 insertions(+), 296 deletions(-)
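
For readers skimming the diff below, the conversion follows two recurring
shapes. Most handlers take a single snapshot of msg_data_left() on entry
where they used to receive the length argument. Handlers that consume the
message in a loop (af_alg_sendmsg(), chtls_sendmsg()) drop their private
"size" countdown and re-query the iterator instead. The sketch below is a
user-space model of both shapes, not kernel code: every model_* name is a
hypothetical stand-in, and the iterator is reduced to a pointer plus a
byte count on the assumption that the remaining length lives in the
iterator.

```c
/* User-space model only: simplified stand-ins, not the kernel types. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct model_iov_iter {
	const char *data;	/* next byte to hand out */
	size_t count;		/* bytes still to be consumed */
};

struct model_msghdr {
	struct model_iov_iter msg_iter;
};

/* Modelled on msg_data_left(): the remaining payload is the iterator count. */
static inline size_t model_msg_data_left(const struct model_msghdr *msg)
{
	return msg->msg_iter.count;
}

/* Hypothetical copy helper: consumes up to len bytes and advances the iterator. */
static size_t model_copy_from_msg(void *dst, struct model_msghdr *msg, size_t len)
{
	if (len > msg->msg_iter.count)
		len = msg->msg_iter.count;
	memcpy(dst, msg->msg_iter.data, len);
	msg->msg_iter.data += len;
	msg->msg_iter.count -= len;
	return len;
}

/* Shape 1: snapshot the length once on entry (datagram-style handlers). */
static int model_dgram_sendmsg(struct model_msghdr *msg, char *pkt, size_t pkt_max)
{
	size_t len = model_msg_data_left(msg);

	if (len > pkt_max)
		return -1;	/* stands in for -EMSGSIZE; nothing is consumed */
	model_copy_from_msg(pkt, msg, len);
	return (int)len;
}

/* Shape 2: loop until the iterator is drained, with no separate size counter. */
static long model_stream_sendmsg(struct model_msghdr *msg, char *out, size_t chunk)
{
	long copied = 0;
	size_t len;

	while ((len = model_msg_data_left(msg))) {
		if (len > chunk)
			len = chunk;
		copied += (long)model_copy_from_msg(out + copied, msg, len);
	}
	assert(model_msg_data_left(msg) == 0);
	return copied;
}
```

In the second shape the loop terminates only if every pass advances the
iterator, which is why the removed "size -= len" lines are safe to drop
only where the copy helper has already consumed those bytes.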

diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index 5f7252a5b7b4..dc49b4e2d719 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -952,19 +952,18 @@ static void af_alg_data_wakeup(struct sock *sk)
  *
  * @sock: socket of connection to user space
  * @msg: message from user space
- * @size: size of message from user space
  * @ivsize: the size of the IV for the cipher operation to verify that the
  *	   user-space-provided IV has the right size
  * Return: the number of copied data upon success, < 0 upon error
  */
-int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
-		   unsigned int ivsize)
+int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, unsigned int ivsize)
 {
 	struct sock *sk = sock->sk;
 	struct alg_sock *ask = alg_sk(sk);
 	struct af_alg_ctx *ctx = ask->private;
 	struct af_alg_tsgl *sgl;
 	struct af_alg_control con = {};
+	size_t len;
 	long copied = 0;
 	bool enc = false;
 	bool init = false;
@@ -1012,9 +1011,8 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
 		ctx->aead_assoclen = con.aead_assoclen;
 	}
 
-	while (size) {
+	while ((len = msg_data_left(msg))) {
 		struct scatterlist *sg;
-		size_t len = size;
 		size_t plen;
 
 		/* use the existing memory in an allocated page */
@@ -1037,7 +1035,6 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
 
 			ctx->used += len;
 			copied += len;
-			size -= len;
 			continue;
 		}
 
@@ -1086,11 +1083,10 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
 			len -= plen;
 			ctx->used += plen;
 			copied += plen;
-			size -= plen;
 			sgl->cur++;
 		} while (len && sgl->cur < MAX_SGL_ENTS);
 
-		if (!size)
+		if (!msg_data_left(msg))
 			sg_mark_end(sg + sgl->cur - 1);
 
 		ctx->merge = plen & (PAGE_SIZE - 1);
diff --git a/crypto/algif_aead.c b/crypto/algif_aead.c
index 42493b4d8ce4..1005c755c4c8 100644
--- a/crypto/algif_aead.c
+++ b/crypto/algif_aead.c
@@ -58,7 +58,7 @@ static inline bool aead_sufficient_data(struct sock *sk)
 	return ctx->used >= ctx->aead_assoclen + (ctx->enc ? 0 : as);
 }
 
-static int aead_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
+static int aead_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct alg_sock *ask = alg_sk(sk);
@@ -68,7 +68,7 @@ static int aead_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
 	struct crypto_aead *tfm = aeadc->aead;
 	unsigned int ivsize = crypto_aead_ivsize(tfm);
 
-	return af_alg_sendmsg(sock, msg, size, ivsize);
+	return af_alg_sendmsg(sock, msg, ivsize);
 }
 
 static int crypto_aead_copy_sgl(struct crypto_sync_skcipher *null_tfm,
@@ -408,8 +408,7 @@ static int aead_check_key(struct socket *sock)
 	return err;
 }
 
-static int aead_sendmsg_nokey(struct socket *sock, struct msghdr *msg,
-				  size_t size)
+static int aead_sendmsg_nokey(struct socket *sock, struct msghdr *msg)
 {
 	int err;
 
@@ -417,7 +416,7 @@ static int aead_sendmsg_nokey(struct socket *sock, struct msghdr *msg,
 	if (err)
 		return err;
 
-	return aead_sendmsg(sock, msg, size);
+	return aead_sendmsg(sock, msg);
 }
 
 static ssize_t aead_sendpage_nokey(struct socket *sock, struct page *page,
diff --git a/crypto/algif_hash.c b/crypto/algif_hash.c
index 1d017ec5c63c..9817adecdf1a 100644
--- a/crypto/algif_hash.c
+++ b/crypto/algif_hash.c
@@ -60,8 +60,7 @@ static void hash_free_result(struct sock *sk, struct hash_ctx *ctx)
 	ctx->result = NULL;
 }
 
-static int hash_sendmsg(struct socket *sock, struct msghdr *msg,
-			size_t ignored)
+static int hash_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	int limit = ALG_MAX_PAGES * PAGE_SIZE;
 	struct sock *sk = sock->sk;
@@ -325,8 +324,7 @@ static int hash_check_key(struct socket *sock)
 	return err;
 }
 
-static int hash_sendmsg_nokey(struct socket *sock, struct msghdr *msg,
-			      size_t size)
+static int hash_sendmsg_nokey(struct socket *sock, struct msghdr *msg)
 {
 	int err;
 
@@ -334,7 +332,7 @@ static int hash_sendmsg_nokey(struct socket *sock, struct msghdr *msg,
 	if (err)
 		return err;
 
-	return hash_sendmsg(sock, msg, size);
+	return hash_sendmsg(sock, msg);
 }
 
 static ssize_t hash_sendpage_nokey(struct socket *sock, struct page *page,
diff --git a/crypto/algif_rng.c b/crypto/algif_rng.c
index 407408c43730..f838be6c2fd7 100644
--- a/crypto/algif_rng.c
+++ b/crypto/algif_rng.c
@@ -130,11 +130,12 @@ static int rng_test_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
 	return ret;
 }
 
-static int rng_test_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int rng_test_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	int err;
 	struct alg_sock *ask = alg_sk(sock->sk);
 	struct rng_ctx *ctx = ask->private;
+	size_t len = msg_data_left(msg);
 
 	lock_sock(sock->sk);
 	if (len > MAXSIZE) {
diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c
index ee8890ee8f33..f5cd9dbbad1b 100644
--- a/crypto/algif_skcipher.c
+++ b/crypto/algif_skcipher.c
@@ -34,8 +34,7 @@
 #include <linux/net.h>
 #include <net/sock.h>
 
-static int skcipher_sendmsg(struct socket *sock, struct msghdr *msg,
-			    size_t size)
+static int skcipher_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct alg_sock *ask = alg_sk(sk);
@@ -44,7 +43,7 @@ static int skcipher_sendmsg(struct socket *sock, struct msghdr *msg,
 	struct crypto_skcipher *tfm = pask->private;
 	unsigned ivsize = crypto_skcipher_ivsize(tfm);
 
-	return af_alg_sendmsg(sock, msg, size, ivsize);
+	return af_alg_sendmsg(sock, msg, ivsize);
 }
 
 static int _skcipher_recvmsg(struct socket *sock, struct msghdr *msg,
@@ -234,8 +233,7 @@ static int skcipher_check_key(struct socket *sock)
 	return err;
 }
 
-static int skcipher_sendmsg_nokey(struct socket *sock, struct msghdr *msg,
-				  size_t size)
+static int skcipher_sendmsg_nokey(struct socket *sock, struct msghdr *msg)
 {
 	int err;
 
@@ -243,7 +241,7 @@ static int skcipher_sendmsg_nokey(struct socket *sock, struct msghdr *msg,
 	if (err)
 		return err;
 
-	return skcipher_sendmsg(sock, msg, size);
+	return skcipher_sendmsg(sock, msg);
 }
 
 static ssize_t skcipher_sendpage_nokey(struct socket *sock, struct page *page,
diff --git a/drivers/isdn/mISDN/socket.c b/drivers/isdn/mISDN/socket.c
index 2776ca5fc33f..4c42d39e994a 100644
--- a/drivers/isdn/mISDN/socket.c
+++ b/drivers/isdn/mISDN/socket.c
@@ -164,10 +164,11 @@ mISDN_sock_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
 }
 
 static int
-mISDN_sock_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+mISDN_sock_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock		*sk = sock->sk;
 	struct sk_buff		*skb;
+	size_t			len = msg_data_left(msg);
 	int			err = -ENOMEM;
 
 	if (*debug & DEBUG_SOCKET)
diff --git a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls.h b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls.h
index 41714203ace8..32077c61273b 100644
--- a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls.h
+++ b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls.h
@@ -565,7 +565,7 @@ void chtls_close(struct sock *sk, long timeout);
 int chtls_disconnect(struct sock *sk, int flags);
 void chtls_shutdown(struct sock *sk, int how);
 void chtls_destroy_sock(struct sock *sk);
-int chtls_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
+int chtls_sendmsg(struct sock *sk, struct msghdr *msg);
 int chtls_recvmsg(struct sock *sk, struct msghdr *msg,
 		  size_t len, int flags, int *addr_len);
 int chtls_sendpage(struct sock *sk, struct page *page,
diff --git a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_io.c b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_io.c
index ae6b17b96bf1..5782267618cf 100644
--- a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_io.c
+++ b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_io.c
@@ -1004,7 +1004,7 @@ static int chtls_proccess_cmsg(struct sock *sk, struct msghdr *msg,
 	return rc;
 }
 
-int chtls_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
+int chtls_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
 	struct chtls_dev *cdev = csk->cdev;
@@ -1058,7 +1058,7 @@ int chtls_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 					tx_skb_finalize(skb);
 			}
 
-			recordsz = size;
+			recordsz = msg_data_left(msg);
 			csk->tlshws.txleft = recordsz;
 			csk->tlshws.type = record_type;
 		}
@@ -1080,8 +1080,8 @@ int chtls_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 								 false);
 			} else {
 				skb = get_tx_skb(sk,
-						 select_size(sk, size, flags,
-							     TX_HEADER_LEN));
+						 select_size(sk, msg_data_left(msg),
+							     flags, TX_HEADER_LEN));
 			}
 			if (unlikely(!skb))
 				goto wait_for_memory;
@@ -1089,8 +1089,8 @@ int chtls_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 			skb->ip_summed = CHECKSUM_UNNECESSARY;
 			copy = mss;
 		}
-		if (copy > size)
-			copy = size;
+		if (copy > msg_data_left(msg))
+			copy = msg_data_left(msg);
 
 		if (skb_tailroom(skb) > 0) {
 			copy = min(copy, skb_tailroom(skb));
@@ -1182,7 +1182,6 @@ int chtls_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 			tx_skb_finalize(skb);
 		tp->write_seq += copy;
 		copied += copy;
-		size -= copy;
 
 		if (is_tls_tx(csk))
 			csk->tlshws.txleft -= copy;
@@ -1191,7 +1190,7 @@ int chtls_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 		    (sk_stream_wspace(sk) < sk_stream_min_wspace(sk)))
 			ULP_SKB_CB(skb)->flags |= ULPCB_FLAG_NO_APPEND;
 
-		if (size == 0)
+		if (msg_data_left(msg) == 0)
 			goto out;
 
 		if (ULP_SKB_CB(skb)->flags & ULPCB_FLAG_NO_APPEND)
diff --git a/drivers/net/ppp/pppoe.c b/drivers/net/ppp/pppoe.c
index ce2cbb5903d7..7ae28a1f528a 100644
--- a/drivers/net/ppp/pppoe.c
+++ b/drivers/net/ppp/pppoe.c
@@ -833,8 +833,7 @@ static int pppoe_ioctl(struct socket *sock, unsigned int cmd,
 	return err;
 }
 
-static int pppoe_sendmsg(struct socket *sock, struct msghdr *m,
-			 size_t total_len)
+static int pppoe_sendmsg(struct socket *sock, struct msghdr *m)
 {
 	struct sk_buff *skb;
 	struct sock *sk = sock->sk;
@@ -843,6 +842,7 @@ static int pppoe_sendmsg(struct socket *sock, struct msghdr *m,
 	struct pppoe_hdr hdr;
 	struct pppoe_hdr *ph;
 	struct net_device *dev;
+	size_t total_len = msg_data_left(m);
 	char *start;
 	int hlen;
 
diff --git a/drivers/net/tap.c b/drivers/net/tap.c
index ce993cc75bf3..2b076d4a1a58 100644
--- a/drivers/net/tap.c
+++ b/drivers/net/tap.c
@@ -1224,8 +1224,7 @@ static int tap_get_user_xdp(struct tap_queue *q, struct xdp_buff *xdp)
 	return err;
 }
 
-static int tap_sendmsg(struct socket *sock, struct msghdr *m,
-		       size_t total_len)
+static int tap_sendmsg(struct socket *sock, struct msghdr *m)
 {
 	struct tap_queue *q = container_of(sock, struct tap_queue, sock);
 	struct tun_msg_ctl *ctl = m->msg_control;
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 4c7f74904c25..b31d696adafd 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -2531,13 +2531,14 @@ static int tun_xdp_one(struct tun_struct *tun,
 	return ret;
 }
 
-static int tun_sendmsg(struct socket *sock, struct msghdr *m, size_t total_len)
+static int tun_sendmsg(struct socket *sock, struct msghdr *m)
 {
 	int ret, i;
 	struct tun_file *tfile = container_of(sock, struct tun_file, socket);
 	struct tun_struct *tun = tun_get(tfile);
 	struct tun_msg_ctl *ctl = m->msg_control;
 	struct xdp_buff *xdp;
+	size_t total_len = msg_data_left(m);
 
 	if (!tun)
 		return -EBADFD;
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 07181cd8d52e..ddf01a21f208 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -476,7 +476,7 @@ static void vhost_tx_batch(struct vhost_net *net,
 
 	msghdr->msg_control = &ctl;
 	msghdr->msg_controllen = sizeof(ctl);
-	err = sock->ops->sendmsg(sock, msghdr, 0);
+	err = sock->ops->sendmsg(sock, msghdr);
 	if (unlikely(err < 0)) {
 		vq_err(&nvq->vq, "Fail to batch sending packets\n");
 
@@ -836,7 +836,7 @@ static void handle_tx_copy(struct vhost_net *net, struct socket *sock)
 				msg.msg_flags &= ~MSG_MORE;
 		}
 
-		err = sock->ops->sendmsg(sock, &msg, len);
+		err = sock->ops->sendmsg(sock, &msg);
 		if (unlikely(err < 0)) {
 			if (err == -EAGAIN || err == -ENOMEM || err == -ENOBUFS) {
 				vhost_discard_vq_desc(vq, 1);
@@ -933,7 +933,7 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock)
 			msg.msg_flags &= ~MSG_MORE;
 		}
 
-		err = sock->ops->sendmsg(sock, &msg, len);
+		err = sock->ops->sendmsg(sock, &msg);
 		if (unlikely(err < 0)) {
 			if (zcopy_used) {
 				if (vq->heads[ubuf->desc].len == VHOST_DMA_IN_PROGRESS)
diff --git a/drivers/xen/pvcalls-back.c b/drivers/xen/pvcalls-back.c
index 1f5219e12cc3..37cfd15b6d9d 100644
--- a/drivers/xen/pvcalls-back.c
+++ b/drivers/xen/pvcalls-back.c
@@ -200,7 +200,7 @@ static bool pvcalls_conn_back_write(struct sock_mapping *map)
 		iov_iter_kvec(&msg.msg_iter, ITER_SOURCE, vec, 2, size);
 	}
 
-	ret = inet_sendmsg(map->sock, &msg, size);
+	ret = inet_sendmsg(map->sock, &msg);
 	if (ret == -EAGAIN) {
 		atomic_inc(&map->write);
 		atomic_inc(&map->io);
diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index d5d589bda243..257d92612371 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -531,10 +531,10 @@ static int __write_ring(struct pvcalls_data_intf *intf,
 	return len;
 }
 
-int pvcalls_front_sendmsg(struct socket *sock, struct msghdr *msg,
-			  size_t len)
+int pvcalls_front_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock_mapping *map;
+	size_t len = msg_data_left(msg);
 	int sent, tot_sent = 0;
 	int count = 0, flags;
 
diff --git a/drivers/xen/pvcalls-front.h b/drivers/xen/pvcalls-front.h
index f694ad77379f..f0c5429604e6 100644
--- a/drivers/xen/pvcalls-front.h
+++ b/drivers/xen/pvcalls-front.h
@@ -14,8 +14,7 @@ int pvcalls_front_accept(struct socket *sock,
 			 struct socket *newsock,
 			 int flags);
 int pvcalls_front_sendmsg(struct socket *sock,
-			  struct msghdr *msg,
-			  size_t len);
+			  struct msghdr *msg);
 int pvcalls_front_recvmsg(struct socket *sock,
 			  struct msghdr *msg,
 			  size_t len,
diff --git a/fs/afs/rxrpc.c b/fs/afs/rxrpc.c
index 7817e2b860e5..95ef04862025 100644
--- a/fs/afs/rxrpc.c
+++ b/fs/afs/rxrpc.c
@@ -367,8 +367,7 @@ void afs_make_call(struct afs_addr_cursor *ac, struct afs_call *call, gfp_t gfp)
 	msg.msg_flags		= MSG_WAITALL | (call->write_iter ? MSG_MORE : 0);
 
 	ret = rxrpc_kernel_send_data(call->net->socket, rxcall,
-				     &msg, call->request_size,
-				     afs_notify_end_request_tx);
+				     &msg, afs_notify_end_request_tx);
 	if (ret < 0)
 		goto error_do_abort;
 
@@ -379,7 +378,6 @@ void afs_make_call(struct afs_addr_cursor *ac, struct afs_call *call, gfp_t gfp)
 
 		ret = rxrpc_kernel_send_data(call->net->socket,
 					     call->rxcall, &msg,
-					     iov_iter_count(&msg.msg_iter),
 					     afs_notify_end_request_tx);
 		*call->write_iter = msg.msg_iter;
 
@@ -834,7 +832,7 @@ void afs_send_empty_reply(struct afs_call *call)
 	msg.msg_controllen	= 0;
 	msg.msg_flags		= 0;
 
-	switch (rxrpc_kernel_send_data(net->socket, call->rxcall, &msg, 0,
+	switch (rxrpc_kernel_send_data(net->socket, call->rxcall, &msg,
 				       afs_notify_end_reply_tx)) {
 	case 0:
 		_leave(" [replied]");
@@ -875,7 +873,7 @@ void afs_send_simple_reply(struct afs_call *call, const void *buf, size_t len)
 	msg.msg_controllen	= 0;
 	msg.msg_flags		= 0;
 
-	n = rxrpc_kernel_send_data(net->socket, call->rxcall, &msg, len,
+	n = rxrpc_kernel_send_data(net->socket, call->rxcall, &msg,
 				   afs_notify_end_reply_tx);
 	if (n >= 0) {
 		/* Success */
diff --git a/include/crypto/if_alg.h b/include/crypto/if_alg.h
index 7e76623f9ec3..bcf0077aae6d 100644
--- a/include/crypto/if_alg.h
+++ b/include/crypto/if_alg.h
@@ -228,8 +228,7 @@ void af_alg_pull_tsgl(struct sock *sk, size_t used, struct scatterlist *dst,
 		      size_t dst_offset);
 void af_alg_wmem_wakeup(struct sock *sk);
 int af_alg_wait_for_data(struct sock *sk, unsigned flags, unsigned min);
-int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
-		   unsigned int ivsize);
+int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, unsigned int ivsize);
 ssize_t af_alg_sendpage(struct socket *sock, struct page *page,
 			int offset, size_t size, int flags);
 void af_alg_free_resources(struct af_alg_async_req *areq);
diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
index 094b76dc7164..b176525025da 100644
--- a/include/linux/lsm_hook_defs.h
+++ b/include/linux/lsm_hook_defs.h
@@ -298,8 +298,7 @@ LSM_HOOK(int, 0, socket_connect, struct socket *sock, struct sockaddr *address,
 	 int addrlen)
 LSM_HOOK(int, 0, socket_listen, struct socket *sock, int backlog)
 LSM_HOOK(int, 0, socket_accept, struct socket *sock, struct socket *newsock)
-LSM_HOOK(int, 0, socket_sendmsg, struct socket *sock, struct msghdr *msg,
-	 int size)
+LSM_HOOK(int, 0, socket_sendmsg, struct socket *sock, struct msghdr *msg)
 LSM_HOOK(int, 0, socket_recvmsg, struct socket *sock, struct msghdr *msg,
 	 int size, int flags)
 LSM_HOOK(int, 0, socket_getsockname, struct socket *sock)
diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index 6e156d2acffc..6f48be80b6bf 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -932,7 +932,6 @@
  *	Check permission before transmitting a message to another socket.
  *	@sock contains the socket structure.
  *	@msg contains the message to be transmitted.
- *	@size contains the size of message.
  *	Return 0 if permission is granted.
  * @socket_recvmsg:
  *	Check permission before receiving a message from a socket.
diff --git a/include/linux/net.h b/include/linux/net.h
index b73ad8e3c212..8adf1328445a 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -192,8 +192,7 @@ struct proto_ops {
 	int		(*getsockopt)(struct socket *sock, int level,
 				      int optname, char __user *optval, int __user *optlen);
 	void		(*show_fdinfo)(struct seq_file *m, struct socket *sock);
-	int		(*sendmsg)   (struct socket *sock, struct msghdr *m,
-				      size_t total_len);
+	int		(*sendmsg)   (struct socket *sock, struct msghdr *m);
 	/* Notes for implementing recvmsg:
 	 * ===============================
 	 * msg->msg_namelen should get updated by the recvmsg handlers
@@ -222,8 +221,7 @@ struct proto_ops {
 	int		(*read_skb)(struct sock *sk, skb_read_actor_t recv_actor);
 	int		(*sendpage_locked)(struct sock *sk, struct page *page,
 					   int offset, size_t size, int flags);
-	int		(*sendmsg_locked)(struct sock *sk, struct msghdr *msg,
-					  size_t size);
+	int		(*sendmsg_locked)(struct sock *sk, struct msghdr *msg);
 	int		(*set_rcvlowat)(struct sock *sk, int val);
 };
 
diff --git a/include/linux/security.h b/include/linux/security.h
index 5984d0d550b4..6c67a4de4a89 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -1436,7 +1436,7 @@ int security_socket_bind(struct socket *sock, struct sockaddr *address, int addr
 int security_socket_connect(struct socket *sock, struct sockaddr *address, int addrlen);
 int security_socket_listen(struct socket *sock, int backlog);
 int security_socket_accept(struct socket *sock, struct socket *newsock);
-int security_socket_sendmsg(struct socket *sock, struct msghdr *msg, int size);
+int security_socket_sendmsg(struct socket *sock, struct msghdr *msg);
 int security_socket_recvmsg(struct socket *sock, struct msghdr *msg,
 			    int size, int flags);
 int security_socket_getsockname(struct socket *sock);
@@ -1538,7 +1538,7 @@ static inline int security_socket_accept(struct socket *sock,
 }
 
 static inline int security_socket_sendmsg(struct socket *sock,
-					  struct msghdr *msg, int size)
+					  struct msghdr *msg)
 {
 	return 0;
 }
diff --git a/include/net/af_rxrpc.h b/include/net/af_rxrpc.h
index ba717eac0229..33f1b8c622e3 100644
--- a/include/net/af_rxrpc.h
+++ b/include/net/af_rxrpc.h
@@ -51,8 +51,7 @@ struct rxrpc_call *rxrpc_kernel_begin_call(struct socket *,
 					   enum rxrpc_interruptibility,
 					   unsigned int);
 int rxrpc_kernel_send_data(struct socket *, struct rxrpc_call *,
-			   struct msghdr *, size_t,
-			   rxrpc_notify_end_tx_t);
+			   struct msghdr *, rxrpc_notify_end_tx_t);
 int rxrpc_kernel_recv_data(struct socket *, struct rxrpc_call *,
 			   struct iov_iter *, size_t *, bool, u32 *, u16 *);
 bool rxrpc_kernel_abort_call(struct socket *, struct rxrpc_call *,
diff --git a/include/net/inet_common.h b/include/net/inet_common.h
index cec453c18f1d..ec798fdd371c 100644
--- a/include/net/inet_common.h
+++ b/include/net/inet_common.h
@@ -32,7 +32,7 @@ int inet_dgram_connect(struct socket *sock, struct sockaddr *uaddr,
 int inet_accept(struct socket *sock, struct socket *newsock, int flags,
 		bool kern);
 int inet_send_prepare(struct sock *sk);
-int inet_sendmsg(struct socket *sock, struct msghdr *msg, size_t size);
+int inet_sendmsg(struct socket *sock, struct msghdr *msg);
 ssize_t inet_sendpage(struct socket *sock, struct page *page, int offset,
 		      size_t size, int flags);
 int inet_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 7332296eca44..f2132311e92b 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -1228,7 +1228,7 @@ int inet6_compat_ioctl(struct socket *sock, unsigned int cmd,
 
 int inet6_hash_connect(struct inet_timewait_death_row *death_row,
 			      struct sock *sk);
-int inet6_sendmsg(struct socket *sock, struct msghdr *msg, size_t size);
+int inet6_sendmsg(struct socket *sock, struct msghdr *msg);
 int inet6_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
 		  int flags);
 
diff --git a/include/net/ping.h b/include/net/ping.h
index 9233ad3de0ad..04814edde8e3 100644
--- a/include/net/ping.h
+++ b/include/net/ping.h
@@ -70,7 +70,7 @@ int  ping_getfrag(void *from, char *to, int offset, int fraglen, int odd,
 
 int  ping_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
 		  int flags, int *addr_len);
-int  ping_common_sendmsg(int family, struct msghdr *msg, size_t len,
+int  ping_common_sendmsg(int family, struct msghdr *msg,
 			 void *user_icmph, size_t icmph_len);
 int  ping_queue_rcv_skb(struct sock *sk, struct sk_buff *skb);
 enum skb_drop_reason ping_rcv(struct sk_buff *skb);
diff --git a/include/net/sock.h b/include/net/sock.h
index 573f2bf7e0de..7a6d06c181b6 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1261,8 +1261,7 @@ struct proto {
 	int			(*compat_ioctl)(struct sock *sk,
 					unsigned int cmd, unsigned long arg);
 #endif
-	int			(*sendmsg)(struct sock *sk, struct msghdr *msg,
-					   size_t len);
+	int			(*sendmsg)(struct sock *sk, struct msghdr *msg);
 	int			(*recvmsg)(struct sock *sk, struct msghdr *msg,
 					   size_t len, int flags, int *addr_len);
 	int			(*sendpage)(struct sock *sk, struct page *page,
@@ -1901,8 +1900,8 @@ int sock_no_getname(struct socket *, struct sockaddr *, int);
 int sock_no_ioctl(struct socket *, unsigned int, unsigned long);
 int sock_no_listen(struct socket *, int);
 int sock_no_shutdown(struct socket *, int);
-int sock_no_sendmsg(struct socket *, struct msghdr *, size_t);
-int sock_no_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t len);
+int sock_no_sendmsg(struct socket *sock, struct msghdr *msg);
+int sock_no_sendmsg_locked(struct sock *sk, struct msghdr *msg);
 int sock_no_recvmsg(struct socket *, struct msghdr *, size_t, int);
 int sock_no_mmap(struct file *file, struct socket *sock,
 		 struct vm_area_struct *vma);
diff --git a/include/net/tcp.h b/include/net/tcp.h
index a0a91a988272..12b228e3d563 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -325,10 +325,10 @@ int tcp_v4_rcv(struct sk_buff *skb);
 
 void tcp_remove_empty_skb(struct sock *sk);
 int tcp_v4_tw_remember_stamp(struct inet_timewait_sock *tw);
-int tcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
-int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size);
+int tcp_sendmsg(struct sock *sk, struct msghdr *msg);
+int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg);
 int tcp_sendmsg_fastopen(struct sock *sk, struct msghdr *msg, int *copied,
-			 size_t size, struct ubuf_info *uarg);
+			 struct ubuf_info *uarg);
 int tcp_sendpage(struct sock *sk, struct page *page, int offset, size_t size,
 		 int flags);
 int tcp_sendpage_locked(struct sock *sk, struct page *page, int offset,
@@ -479,7 +479,7 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst,
 int tcp_disconnect(struct sock *sk, int flags);
 
 void tcp_finish_connect(struct sock *sk, struct sk_buff *skb);
-int tcp_send_rcvq(struct sock *sk, struct msghdr *msg, size_t size);
+int tcp_send_rcvq(struct sock *sk, struct msghdr *msg);
 void inet_sk_rx_dst_set(struct sock *sk, const struct sk_buff *skb);
 
 /* From syncookies.c */
diff --git a/include/net/udp.h b/include/net/udp.h
index de4b528522bb..b9b2ea5af42d 100644
--- a/include/net/udp.h
+++ b/include/net/udp.h
@@ -277,7 +277,7 @@ int udp_get_port(struct sock *sk, unsigned short snum,
 				  const struct sock *));
 int udp_err(struct sk_buff *, u32);
 int udp_abort(struct sock *sk, int err);
-int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len);
+int udp_sendmsg(struct sock *sk, struct msghdr *msg);
 int udp_push_pending_frames(struct sock *sk);
 void udp_flush_pending_frames(struct sock *sk);
 int udp_cmsg_send(struct sock *sk, struct msghdr *msg, u16 *gso_size);
diff --git a/net/appletalk/ddp.c b/net/appletalk/ddp.c
index a06f4d4a6f47..70008c57503f 100644
--- a/net/appletalk/ddp.c
+++ b/net/appletalk/ddp.c
@@ -1566,7 +1566,7 @@ static int ltalk_rcv(struct sk_buff *skb, struct net_device *dev,
 	return 0;
 }
 
-static int atalk_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int atalk_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct atalk_sock *at = at_sk(sk);
@@ -1579,6 +1579,7 @@ static int atalk_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 	struct ddpehdr *ddp;
 	int size, hard_header_len;
 	struct atalk_route *rt, *rt_lo = NULL;
+	size_t len = msg_data_left(msg);
 	int err;
 
 	if (flags & ~(MSG_DONTWAIT|MSG_CMSG_COMPAT))
diff --git a/net/atm/common.c b/net/atm/common.c
index f7019df41c3e..09060644760b 100644
--- a/net/atm/common.c
+++ b/net/atm/common.c
@@ -565,12 +565,13 @@ int vcc_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
 	return copied;
 }
 
-int vcc_sendmsg(struct socket *sock, struct msghdr *m, size_t size)
+int vcc_sendmsg(struct socket *sock, struct msghdr *m)
 {
 	struct sock *sk = sock->sk;
 	DEFINE_WAIT(wait);
 	struct atm_vcc *vcc;
 	struct sk_buff *skb;
+	size_t size = msg_data_left(m);
 	int eff, error;
 
 	lock_sock(sk);
diff --git a/net/atm/common.h b/net/atm/common.h
index a1e56e8de698..6597f8308f03 100644
--- a/net/atm/common.h
+++ b/net/atm/common.h
@@ -16,7 +16,7 @@ int vcc_release(struct socket *sock);
 int vcc_connect(struct socket *sock, int itf, short vpi, int vci);
 int vcc_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
 		int flags);
-int vcc_sendmsg(struct socket *sock, struct msghdr *m, size_t total_len);
+int vcc_sendmsg(struct socket *sock, struct msghdr *m);
 __poll_t vcc_poll(struct file *file, struct socket *sock, poll_table *wait);
 int vcc_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg);
 int vcc_compat_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg);
diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c
index d8da400cb4de..48f96e28f7ea 100644
--- a/net/ax25/af_ax25.c
+++ b/net/ax25/af_ax25.c
@@ -1489,7 +1489,7 @@ static int ax25_getname(struct socket *sock, struct sockaddr *uaddr,
 	return err;
 }
 
-static int ax25_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int ax25_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	DECLARE_SOCKADDR(struct sockaddr_ax25 *, usax, msg->msg_name);
 	struct sock *sk = sock->sk;
@@ -1497,7 +1497,7 @@ static int ax25_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 	struct sk_buff *skb;
 	ax25_digi dtmp, *dp;
 	ax25_cb *ax25;
-	size_t size;
+	size_t size, len = msg_data_left(msg);
 	int lv, err, addr_len = msg->msg_namelen;
 
 	if (msg->msg_flags & ~(MSG_DONTWAIT|MSG_EOR|MSG_CMSG_COMPAT))
diff --git a/net/bluetooth/hci_sock.c b/net/bluetooth/hci_sock.c
index 06581223238c..9d6f713eeac1 100644
--- a/net/bluetooth/hci_sock.c
+++ b/net/bluetooth/hci_sock.c
@@ -1692,8 +1692,7 @@ static int hci_logging_frame(struct sock *sk, struct sk_buff *skb,
 	return err;
 }
 
-static int hci_sock_sendmsg(struct socket *sock, struct msghdr *msg,
-			    size_t len)
+static int hci_sock_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct hci_mgmt_chan *chan;
@@ -1701,6 +1700,7 @@ static int hci_sock_sendmsg(struct socket *sock, struct msghdr *msg,
 	struct sk_buff *skb;
 	int err;
 	const unsigned int flags = msg->msg_flags;
+	size_t len = msg_data_left(msg);
 
 	BT_DBG("sock %p sk %p", sock, sk);
 
diff --git a/net/bluetooth/iso.c b/net/bluetooth/iso.c
index 24444b502e58..6d8863878abc 100644
--- a/net/bluetooth/iso.c
+++ b/net/bluetooth/iso.c
@@ -1031,12 +1031,12 @@ static int iso_sock_getname(struct socket *sock, struct sockaddr *addr,
 	return sizeof(struct sockaddr_iso);
 }
 
-static int iso_sock_sendmsg(struct socket *sock, struct msghdr *msg,
-			    size_t len)
+static int iso_sock_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct iso_conn *conn = iso_pi(sk)->conn;
 	struct sk_buff *skb, **frag;
+	size_t len = msg_data_left(msg);
 	int err;
 
 	BT_DBG("sock %p, sk %p", sock, sk);
diff --git a/net/bluetooth/l2cap_sock.c b/net/bluetooth/l2cap_sock.c
index eebe256104bc..d488aca82037 100644
--- a/net/bluetooth/l2cap_sock.c
+++ b/net/bluetooth/l2cap_sock.c
@@ -1143,8 +1143,7 @@ static int l2cap_sock_setsockopt(struct socket *sock, int level, int optname,
 	return err;
 }
 
-static int l2cap_sock_sendmsg(struct socket *sock, struct msghdr *msg,
-			      size_t len)
+static int l2cap_sock_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct l2cap_chan *chan = l2cap_pi(sk)->chan;
@@ -1169,7 +1168,7 @@ static int l2cap_sock_sendmsg(struct socket *sock, struct msghdr *msg,
 		return err;
 
 	l2cap_chan_lock(chan);
-	err = l2cap_chan_send(chan, msg, len);
+	err = l2cap_chan_send(chan, msg, msg_data_left(msg));
 	l2cap_chan_unlock(chan);
 
 	return err;
diff --git a/net/bluetooth/rfcomm/sock.c b/net/bluetooth/rfcomm/sock.c
index 4397e14ff560..8a0a51b5c3a3 100644
--- a/net/bluetooth/rfcomm/sock.c
+++ b/net/bluetooth/rfcomm/sock.c
@@ -558,8 +558,7 @@ static int rfcomm_sock_getname(struct socket *sock, struct sockaddr *addr, int p
 	return sizeof(struct sockaddr_rc);
 }
 
-static int rfcomm_sock_sendmsg(struct socket *sock, struct msghdr *msg,
-			       size_t len)
+static int rfcomm_sock_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct rfcomm_dlc *d = rfcomm_pi(sk)->dlc;
@@ -586,8 +585,8 @@ static int rfcomm_sock_sendmsg(struct socket *sock, struct msghdr *msg,
 	if (sent)
 		return sent;
 
-	skb = bt_skb_sendmmsg(sk, msg, len, d->mtu, RFCOMM_SKB_HEAD_RESERVE,
-			      RFCOMM_SKB_TAIL_RESERVE);
+	skb = bt_skb_sendmmsg(sk, msg, msg_data_left(msg), d->mtu,
+			      RFCOMM_SKB_HEAD_RESERVE, RFCOMM_SKB_TAIL_RESERVE);
 	if (IS_ERR(skb))
 		return PTR_ERR(skb);
 
diff --git a/net/bluetooth/sco.c b/net/bluetooth/sco.c
index 1111da4e2f2b..8c62c5dc5b57 100644
--- a/net/bluetooth/sco.c
+++ b/net/bluetooth/sco.c
@@ -722,11 +722,11 @@ static int sco_sock_getname(struct socket *sock, struct sockaddr *addr,
 	return sizeof(struct sockaddr_sco);
 }
 
-static int sco_sock_sendmsg(struct socket *sock, struct msghdr *msg,
-			    size_t len)
+static int sco_sock_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct sk_buff *skb;
+	size_t len = msg_data_left(msg);
 	int err;
 
 	BT_DBG("sock %p, sk %p", sock, sk);
diff --git a/net/caif/caif_socket.c b/net/caif/caif_socket.c
index 4eebcc66c19a..827230b3f7c3 100644
--- a/net/caif/caif_socket.c
+++ b/net/caif/caif_socket.c
@@ -510,8 +510,7 @@ static int transmit_skb(struct sk_buff *skb, struct caifsock *cf_sk,
 }
 
 /* Copied from af_unix:unix_dgram_sendmsg, and adapted to CAIF */
-static int caif_seqpkt_sendmsg(struct socket *sock, struct msghdr *msg,
-			       size_t len)
+static int caif_seqpkt_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct caifsock *cf_sk = container_of(sk, struct caifsock, sk);
@@ -520,6 +519,8 @@ static int caif_seqpkt_sendmsg(struct socket *sock, struct msghdr *msg,
 	struct sk_buff *skb = NULL;
 	int noblock;
 	long timeo;
+	size_t len = msg_data_left(msg);
+
 	caif_assert(cf_sk);
 	ret = sock_error(sk);
 	if (ret)
@@ -582,8 +583,7 @@ static int caif_seqpkt_sendmsg(struct socket *sock, struct msghdr *msg,
  * Changed removed permission handling and added waiting for flow on
  * and other minor adaptations.
  */
-static int caif_stream_sendmsg(struct socket *sock, struct msghdr *msg,
-			       size_t len)
+static int caif_stream_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct caifsock *cf_sk = container_of(sk, struct caifsock, sk);
@@ -605,10 +605,7 @@ static int caif_stream_sendmsg(struct socket *sock, struct msghdr *msg,
 	if (unlikely(sk->sk_shutdown & SEND_SHUTDOWN))
 		goto pipe_err;
 
-	while (sent < len) {
-
-		size = len-sent;
-
+	while ((size = msg_data_left(msg))) {
 		if (size > cf_sk->maxframe)
 			size = cf_sk->maxframe;
 
diff --git a/net/can/bcm.c b/net/can/bcm.c
index 27706f6ace34..9baace5e0d71 100644
--- a/net/can/bcm.c
+++ b/net/can/bcm.c
@@ -1287,12 +1287,13 @@ static int bcm_tx_send(struct msghdr *msg, int ifindex, struct sock *sk,
 /*
  * bcm_sendmsg - process BCM commands (opcodes) from the userspace
  */
-static int bcm_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
+static int bcm_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct bcm_sock *bo = bcm_sk(sk);
 	int ifindex = bo->ifindex; /* default ifindex for this bcm_op */
 	struct bcm_msg_head msg_head;
+	size_t size = msg_data_left(msg);
 	int cfsiz;
 	int ret; /* read bytes or error codes as return value */
 
diff --git a/net/can/isotp.c b/net/can/isotp.c
index 9bc344851704..6b5d3ebd6748 100644
--- a/net/can/isotp.c
+++ b/net/can/isotp.c
@@ -914,7 +914,7 @@ static enum hrtimer_restart isotp_txfr_timer_handler(struct hrtimer *hrtimer)
 	return HRTIMER_NORESTART;
 }
 
-static int isotp_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
+static int isotp_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct isotp_sock *so = isotp_sk(sk);
@@ -922,6 +922,7 @@ static int isotp_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
 	struct sk_buff *skb;
 	struct net_device *dev;
 	struct canfd_frame *cf;
+	size_t size = msg_data_left(msg);
 	int ae = (so->opt.flags & CAN_ISOTP_EXTEND_ADDR) ? 1 : 0;
 	int wait_tx_done = (so->opt.flags & CAN_ISOTP_WAIT_TX_DONE) ? 1 : 0;
 	s64 hrtimer_sec = ISOTP_ECHO_TIMEOUT;
diff --git a/net/can/j1939/socket.c b/net/can/j1939/socket.c
index 7e90f9e61d9b..2b009b69e853 100644
--- a/net/can/j1939/socket.c
+++ b/net/can/j1939/socket.c
@@ -1187,12 +1187,12 @@ static int j1939_sk_send_loop(struct j1939_priv *priv,  struct sock *sk,
 	return ret;
 }
 
-static int j1939_sk_sendmsg(struct socket *sock, struct msghdr *msg,
-			    size_t size)
+static int j1939_sk_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct j1939_sock *jsk = j1939_sk(sk);
 	struct j1939_priv *priv;
+	size_t size = msg_data_left(msg);
 	int ifindex;
 	int ret;
 
diff --git a/net/can/raw.c b/net/can/raw.c
index f64469b98260..0c37f1c70685 100644
--- a/net/can/raw.c
+++ b/net/can/raw.c
@@ -814,13 +814,14 @@ static bool raw_bad_txframe(struct raw_sock *ro, struct sk_buff *skb, int mtu)
 	return true;
 }
 
-static int raw_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
+static int raw_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct raw_sock *ro = raw_sk(sk);
 	struct sockcm_cookie sockc;
 	struct sk_buff *skb;
 	struct net_device *dev;
+	size_t size = msg_data_left(msg);
 	int ifindex;
 	int err = -EINVAL;
 
diff --git a/net/core/sock.c b/net/core/sock.c
index c25888795390..4170381356aa 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -3183,13 +3183,13 @@ int sock_no_shutdown(struct socket *sock, int how)
 }
 EXPORT_SYMBOL(sock_no_shutdown);
 
-int sock_no_sendmsg(struct socket *sock, struct msghdr *m, size_t len)
+int sock_no_sendmsg(struct socket *sock, struct msghdr *m)
 {
 	return -EOPNOTSUPP;
 }
 EXPORT_SYMBOL(sock_no_sendmsg);
 
-int sock_no_sendmsg_locked(struct sock *sk, struct msghdr *m, size_t len)
+int sock_no_sendmsg_locked(struct sock *sk, struct msghdr *m)
 {
 	return -EOPNOTSUPP;
 }
diff --git a/net/dccp/dccp.h b/net/dccp/dccp.h
index 9ddc3a9e89e4..3d5d7615ddd8 100644
--- a/net/dccp/dccp.h
+++ b/net/dccp/dccp.h
@@ -293,7 +293,7 @@ int dccp_getsockopt(struct sock *sk, int level, int optname,
 int dccp_setsockopt(struct sock *sk, int level, int optname,
 		    sockptr_t optval, unsigned int optlen);
 int dccp_ioctl(struct sock *sk, int cmd, unsigned long arg);
-int dccp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
+int dccp_sendmsg(struct sock *sk, struct msghdr *msg);
 int dccp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int flags,
 		 int *addr_len);
 void dccp_shutdown(struct sock *sk, int how);
diff --git a/net/dccp/proto.c b/net/dccp/proto.c
index a06b5641287a..6f6623bb1ff8 100644
--- a/net/dccp/proto.c
+++ b/net/dccp/proto.c
@@ -725,12 +725,13 @@ static int dccp_msghdr_parse(struct msghdr *msg, struct sk_buff *skb)
 	return 0;
 }
 
-int dccp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+int dccp_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	const struct dccp_sock *dp = dccp_sk(sk);
 	const int flags = msg->msg_flags;
 	const int noblock = flags & MSG_DONTWAIT;
 	struct sk_buff *skb;
+	size_t len = msg_data_left(msg);
 	int rc, size;
 	long timeo;
 
diff --git a/net/ieee802154/socket.c b/net/ieee802154/socket.c
index 1fa2fe041ec0..70f2948b7946 100644
--- a/net/ieee802154/socket.c
+++ b/net/ieee802154/socket.c
@@ -88,12 +88,11 @@ static int ieee802154_sock_release(struct socket *sock)
 	return 0;
 }
 
-static int ieee802154_sock_sendmsg(struct socket *sock, struct msghdr *msg,
-				   size_t len)
+static int ieee802154_sock_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 
-	return sk->sk_prot->sendmsg(sk, msg, len);
+	return sk->sk_prot->sendmsg(sk, msg);
 }
 
 static int ieee802154_sock_bind(struct socket *sock, struct sockaddr *uaddr,
@@ -238,11 +237,12 @@ static int raw_disconnect(struct sock *sk, int flags)
 	return 0;
 }
 
-static int raw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
+static int raw_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct net_device *dev;
 	unsigned int mtu;
 	struct sk_buff *skb;
+	size_t size = msg_data_left(msg);
 	int hlen, tlen;
 	int err;
 
@@ -605,7 +605,7 @@ static int dgram_disconnect(struct sock *sk, int flags)
 	return 0;
 }
 
-static int dgram_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
+static int dgram_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct net_device *dev;
 	unsigned int mtu;
@@ -614,6 +614,7 @@ static int dgram_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 	struct dgram_sock *ro = dgram_sk(sk);
 	struct ieee802154_addr dst_addr;
 	DECLARE_SOCKADDR(struct sockaddr_ieee802154*, daddr, msg->msg_name);
+	size_t size = msg_data_left(msg);
 	int hlen, tlen;
 	int err;
 
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 940062e08f57..4facfef8bded 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -815,7 +815,7 @@ int inet_send_prepare(struct sock *sk)
 }
 EXPORT_SYMBOL_GPL(inet_send_prepare);
 
-int inet_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
+int inet_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 
@@ -823,7 +823,7 @@ int inet_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
 		return -EAGAIN;
 
 	return INDIRECT_CALL_2(sk->sk_prot->sendmsg, tcp_sendmsg, udp_sendmsg,
-			       sk, msg, size);
+			       sk, msg);
 }
 EXPORT_SYMBOL(inet_sendmsg);
 
diff --git a/net/ipv4/ping.c b/net/ipv4/ping.c
index 409ec2a1f95b..f689f9f530c9 100644
--- a/net/ipv4/ping.c
+++ b/net/ipv4/ping.c
@@ -657,9 +657,10 @@ static int ping_v4_push_pending_frames(struct sock *sk, struct pingfakehdr *pfh,
 	return ip_push_pending_frames(sk, fl4);
 }
 
-int ping_common_sendmsg(int family, struct msghdr *msg, size_t len,
+int ping_common_sendmsg(int family, struct msghdr *msg,
 			void *user_icmph, size_t icmph_len)
 {
+	size_t len = msg_data_left(msg);
 	u8 type, code;
 
 	if (len > 0xFFFF)
@@ -703,7 +704,7 @@ int ping_common_sendmsg(int family, struct msghdr *msg, size_t len,
 }
 EXPORT_SYMBOL_GPL(ping_common_sendmsg);
 
-static int ping_v4_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+static int ping_v4_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct net *net = sock_net(sk);
 	struct flowi4 fl4;
@@ -713,6 +714,7 @@ static int ping_v4_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	struct pingfakehdr pfh;
 	struct rtable *rt = NULL;
 	struct ip_options_data opt_copy;
+	size_t len = msg_data_left(msg);
 	int free = 0;
 	__be32 saddr, daddr, faddr;
 	u8  tos;
@@ -720,7 +722,7 @@ static int ping_v4_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 
 	pr_debug("ping_v4_sendmsg(sk=%p,sk->num=%u)\n", inet, inet->inet_num);
 
-	err = ping_common_sendmsg(AF_INET, msg, len, &user_icmph,
+	err = ping_common_sendmsg(AF_INET, msg, &user_icmph,
 				  sizeof(user_icmph));
 	if (err)
 		return err;
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index 3cf68695b40d..f2859c117796 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -471,7 +471,7 @@ static int raw_getfrag(void *from, char *to, int offset, int len, int odd,
 	return ip_generic_getfrag(rfv->msg, to, offset, len, odd, skb);
 }
 
-static int raw_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+static int raw_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct inet_sock *inet = inet_sk(sk);
 	struct net *net = sock_net(sk);
@@ -485,6 +485,7 @@ static int raw_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	int err;
 	struct ip_options_data opt_copy;
 	struct raw_frag_vec rfv;
+	size_t len = msg_data_left(msg);
 	int hdrincl;
 
 	err = -EMSGSIZE;
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index fd68d49490f2..2a98b104892c 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1166,7 +1166,7 @@ void tcp_free_fastopen_req(struct tcp_sock *tp)
 }
 
 int tcp_sendmsg_fastopen(struct sock *sk, struct msghdr *msg, int *copied,
-			 size_t size, struct ubuf_info *uarg)
+			 struct ubuf_info *uarg)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct inet_sock *inet = inet_sk(sk);
@@ -1186,7 +1186,7 @@ int tcp_sendmsg_fastopen(struct sock *sk, struct msghdr *msg, int *copied,
 	if (unlikely(!tp->fastopen_req))
 		return -ENOBUFS;
 	tp->fastopen_req->data = msg;
-	tp->fastopen_req->size = size;
+	tp->fastopen_req->size = msg_data_left(msg);
 	tp->fastopen_req->uarg = uarg;
 
 	if (inet->defer_connect) {
@@ -1212,12 +1212,13 @@ int tcp_sendmsg_fastopen(struct sock *sk, struct msghdr *msg, int *copied,
 	return err;
 }
 
-int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
+int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct ubuf_info *uarg = NULL;
 	struct sk_buff *skb;
 	struct sockcm_cookie sockc;
+	size_t size = msg_data_left(msg);
 	int flags, err, copied = 0;
 	int mss_now = 0, size_goal, copied_syn = 0;
 	int process_backlog = 0;
@@ -1226,7 +1227,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
 
 	flags = msg->msg_flags;
 
-	if ((flags & MSG_ZEROCOPY) && size) {
+	if ((flags & MSG_ZEROCOPY) && msg_data_left(msg)) {
 		skb = tcp_write_queue_tail(sk);
 
 		if (msg->msg_ubuf) {
@@ -1247,7 +1248,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
 
 	if (unlikely(flags & MSG_FASTOPEN || inet_sk(sk)->defer_connect) &&
 	    !tp->repair) {
-		err = tcp_sendmsg_fastopen(sk, msg, &copied_syn, size, uarg);
+		err = tcp_sendmsg_fastopen(sk, msg, &copied_syn, uarg);
 		if (err == -EINPROGRESS && copied_syn > 0)
 			goto out;
 		else if (err)
@@ -1271,7 +1272,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
 
 	if (unlikely(tp->repair)) {
 		if (tp->repair_queue == TCP_RECV_QUEUE) {
-			copied = tcp_send_rcvq(sk, msg, size);
+			copied = tcp_send_rcvq(sk, msg);
 			goto out_nopush;
 		}
 
@@ -1477,12 +1478,12 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
 }
 EXPORT_SYMBOL_GPL(tcp_sendmsg_locked);
 
-int tcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
+int tcp_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	int ret;
 
 	lock_sock(sk);
-	ret = tcp_sendmsg_locked(sk, msg, size);
+	ret = tcp_sendmsg_locked(sk, msg);
 	release_sock(sk);
 
 	return ret;
diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index ebf917511937..843eb2b6b8d3 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -396,9 +396,10 @@ static int tcp_bpf_send_verdict(struct sock *sk, struct sk_psock *psock,
 	return ret;
 }
 
-static int tcp_bpf_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
+static int tcp_bpf_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct sk_msg tmp, *msg_tx = NULL;
+	size_t size = msg_data_left(msg);
 	int copied = 0, err = 0;
 	struct sk_psock *psock;
 	long timeo;
@@ -410,7 +411,7 @@ static int tcp_bpf_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 
 	psock = sk_psock_get(sk);
 	if (unlikely(!psock))
-		return tcp_sendmsg(sk, msg, size);
+		return tcp_sendmsg(sk, msg);
 
 	lock_sock(sk);
 	timeo = sock_sndtimeo(sk, msg->msg_flags & MSG_DONTWAIT);
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 2b75cd9e2e92..a1c7d834abca 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4948,9 +4948,10 @@ static int __must_check tcp_queue_rcv(struct sock *sk, struct sk_buff *skb,
 	return eaten;
 }
 
-int tcp_send_rcvq(struct sock *sk, struct msghdr *msg, size_t size)
+int tcp_send_rcvq(struct sock *sk, struct msghdr *msg)
 {
 	struct sk_buff *skb;
+	size_t size = msg_data_left(msg);
 	int err = -ENOMEM;
 	int data_len = 0;
 	bool fragstolen;
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index aa32afd871ee..b2ed9d37a362 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1049,13 +1049,14 @@ int udp_cmsg_send(struct sock *sk, struct msghdr *msg, u16 *gso_size)
 }
 EXPORT_SYMBOL_GPL(udp_cmsg_send);
 
-int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+int udp_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct inet_sock *inet = inet_sk(sk);
 	struct udp_sock *up = udp_sk(sk);
 	DECLARE_SOCKADDR(struct sockaddr_in *, usin, msg->msg_name);
 	struct flowi4 fl4_stack;
 	struct flowi4 *fl4;
+	size_t len = msg_data_left(msg);
 	int ulen = len;
 	struct ipcm_cookie ipc;
 	struct rtable *rt = NULL;
@@ -1346,7 +1347,7 @@ int udp_sendpage(struct sock *sk, struct page *page, int offset,
 		 * sendpage interface can't pass.
 		 * This will succeed only when the socket is connected.
 		 */
-		ret = udp_sendmsg(sk, &msg, 0);
+		ret = udp_sendmsg(sk, &msg);
 		if (ret < 0)
 			return ret;
 	}
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index e1b679a590c9..d6b4cfc44e2a 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -636,9 +636,8 @@ int inet6_compat_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg)
 EXPORT_SYMBOL_GPL(inet6_compat_ioctl);
 #endif /* CONFIG_COMPAT */
 
-INDIRECT_CALLABLE_DECLARE(int udpv6_sendmsg(struct sock *, struct msghdr *,
-					    size_t));
-int inet6_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
+INDIRECT_CALLABLE_DECLARE(int udpv6_sendmsg(struct sock *, struct msghdr *));
+int inet6_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	const struct proto *prot;
@@ -649,7 +648,7 @@ int inet6_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
 	/* IPV6_ADDRFORM can change sk->sk_prot under us. */
 	prot = READ_ONCE(sk->sk_prot);
 	return INDIRECT_CALL_2(prot->sendmsg, tcp_sendmsg, udpv6_sendmsg,
-			       sk, msg, size);
+			       sk, msg);
 }
 
 INDIRECT_CALLABLE_DECLARE(int udpv6_recvmsg(struct sock *, struct msghdr *,
diff --git a/net/ipv6/ping.c b/net/ipv6/ping.c
index c4835dbdfcff..54c94b28744f 100644
--- a/net/ipv6/ping.c
+++ b/net/ipv6/ping.c
@@ -59,7 +59,7 @@ static int ping_v6_pre_connect(struct sock *sk, struct sockaddr *uaddr,
 	return BPF_CGROUP_RUN_PROG_INET6_CONNECT_LOCK(sk, uaddr);
 }
 
-static int ping_v6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+static int ping_v6_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct inet_sock *inet = inet_sk(sk);
 	struct ipv6_pinfo *np = inet6_sk(sk);
@@ -73,8 +73,9 @@ static int ping_v6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	struct rt6_info *rt;
 	struct pingfakehdr pfh;
 	struct ipcm6_cookie ipc6;
+	size_t len = msg_data_left(msg);
 
-	err = ping_common_sendmsg(AF_INET6, msg, len, &user_icmph,
+	err = ping_common_sendmsg(AF_INET6, msg, &user_icmph,
 				  sizeof(user_icmph));
 	if (err)
 		return err;
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 6ac2f2690c44..a3437deeeb74 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -735,7 +735,7 @@ static int raw6_getfrag(void *from, char *to, int offset, int len, int odd,
 	return ip_generic_getfrag(rfv->msg, to, offset, len, odd, skb);
 }
 
-static int rawv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+static int rawv6_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct ipv6_txoptions *opt_to_free = NULL;
 	struct ipv6_txoptions opt_space;
@@ -751,6 +751,7 @@ static int rawv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	struct flowi6 fl6;
 	struct ipcm6_cookie ipc6;
 	int addr_len = msg->msg_namelen;
+	size_t len = msg_data_left(msg);
 	int hdrincl;
 	u16 proto;
 	int err;
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index d350e57c4792..80f2eb58ba1a 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -1326,7 +1326,7 @@ static int udp_v6_push_pending_frames(struct sock *sk)
 	return err;
 }
 
-int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+int udpv6_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct ipv6_txoptions opt_space;
 	struct udp_sock *up = udp_sk(sk);
@@ -1343,6 +1343,7 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	struct ipcm6_cookie ipc6;
 	int addr_len = msg->msg_namelen;
 	bool connected = false;
+	size_t len = msg_data_left(msg);
 	int ulen = len;
 	int corkreq = READ_ONCE(up->corkflag) || msg->msg_flags&MSG_MORE;
 	int err;
@@ -1397,7 +1398,7 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 do_udp_sendmsg:
 			if (ipv6_only_sock(sk))
 				return -ENETUNREACH;
-			return udp_sendmsg(sk, msg, len);
+			return udp_sendmsg(sk, msg);
 		}
 	}
 
@@ -1410,7 +1411,7 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	getfrag  =  is_udplite ?  udplite_getfrag : ip_generic_getfrag;
 	if (up->pending) {
 		if (up->pending == AF_INET)
-			return udp_sendmsg(sk, msg, len);
+			return udp_sendmsg(sk, msg);
 		/*
 		 * There are pending frames.
 		 * The socket lock must be held while it's corked.
diff --git a/net/ipv6/udp_impl.h b/net/ipv6/udp_impl.h
index 0590f566379d..c905a5cb34af 100644
--- a/net/ipv6/udp_impl.h
+++ b/net/ipv6/udp_impl.h
@@ -20,7 +20,7 @@ int udpv6_getsockopt(struct sock *sk, int level, int optname,
 		     char __user *optval, int __user *optlen);
 int udpv6_setsockopt(struct sock *sk, int level, int optname, sockptr_t optval,
 		     unsigned int optlen);
-int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len);
+int udpv6_sendmsg(struct sock *sk, struct msghdr *msg);
 int udpv6_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int flags,
 		  int *addr_len);
 void udpv6_destroy_sock(struct sock *sk);
diff --git a/net/iucv/af_iucv.c b/net/iucv/af_iucv.c
index 498a0c35b7bb..d963d245a4e2 100644
--- a/net/iucv/af_iucv.c
+++ b/net/iucv/af_iucv.c
@@ -895,8 +895,7 @@ static int iucv_send_iprm(struct iucv_path *path, struct iucv_message *msg,
 				 (void *) prmdata, 8);
 }
 
-static int iucv_sock_sendmsg(struct socket *sock, struct msghdr *msg,
-			     size_t len)
+static int iucv_sock_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct iucv_sock *iucv = iucv_sk(sk);
@@ -905,6 +904,7 @@ static int iucv_sock_sendmsg(struct socket *sock, struct msghdr *msg,
 	struct sk_buff *skb;
 	struct iucv_message txmsg = {0};
 	struct cmsghdr *cmsg;
+	size_t len = msg_data_left(msg);
 	int cmsg_done;
 	long timeo;
 	char user_id[9];
diff --git a/net/kcm/kcmsock.c b/net/kcm/kcmsock.c
index cfe828bd7fc6..caf13ed1bfeb 100644
--- a/net/kcm/kcmsock.c
+++ b/net/kcm/kcmsock.c
@@ -904,7 +904,7 @@ static ssize_t kcm_sendpage(struct socket *sock, struct page *page,
 	return err;
 }
 
-static int kcm_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int kcm_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct kcm_sock *kcm = kcm_sk(sk);
diff --git a/net/key/af_key.c b/net/key/af_key.c
index a815f5ab4c49..3cde1e0c3119 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -3662,13 +3662,14 @@ static int pfkey_send_migrate(const struct xfrm_selector *sel, u8 dir, u8 type,
 }
 #endif
 
-static int pfkey_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int pfkey_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct sk_buff *skb = NULL;
 	struct sadb_msg *hdr = NULL;
 	int err;
 	struct net *net = sock_net(sk);
+	size_t len = msg_data_left(msg);
 
 	err = -EOPNOTSUPP;
 	if (msg->msg_flags & MSG_OOB)
diff --git a/net/l2tp/l2tp_ip.c b/net/l2tp/l2tp_ip.c
index 4db5a554bdbd..474ce4ae9b63 100644
--- a/net/l2tp/l2tp_ip.c
+++ b/net/l2tp/l2tp_ip.c
@@ -394,13 +394,14 @@ static int l2tp_ip_backlog_recv(struct sock *sk, struct sk_buff *skb)
 /* Userspace will call sendmsg() on the tunnel socket to send L2TP
  * control frames.
  */
-static int l2tp_ip_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+static int l2tp_ip_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct sk_buff *skb;
 	int rc;
 	struct inet_sock *inet = inet_sk(sk);
 	struct rtable *rt = NULL;
 	struct flowi4 *fl4;
+	size_t len = msg_data_left(msg);
 	int connected = 0;
 	__be32 daddr;
 
diff --git a/net/l2tp/l2tp_ip6.c b/net/l2tp/l2tp_ip6.c
index 2478aa60145f..7619afe77855 100644
--- a/net/l2tp/l2tp_ip6.c
+++ b/net/l2tp/l2tp_ip6.c
@@ -488,7 +488,7 @@ static int l2tp_ip6_push_pending_frames(struct sock *sk)
 /* Userspace will call sendmsg() on the tunnel socket to send L2TP
  * control frames.
  */
-static int l2tp_ip6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+static int l2tp_ip6_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct ipv6_txoptions opt_space;
 	DECLARE_SOCKADDR(struct sockaddr_l2tpip6 *, lsa, msg->msg_name);
@@ -500,6 +500,7 @@ static int l2tp_ip6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	struct dst_entry *dst = NULL;
 	struct flowi6 fl6;
 	struct ipcm6_cookie ipc6;
+	size_t len = msg_data_left(msg);
 	int addr_len = msg->msg_namelen;
 	int transhdrlen = 4; /* zero session-id */
 	int ulen;
diff --git a/net/l2tp/l2tp_ppp.c b/net/l2tp/l2tp_ppp.c
index f011af6601c9..ae351f50adff 100644
--- a/net/l2tp/l2tp_ppp.c
+++ b/net/l2tp/l2tp_ppp.c
@@ -262,14 +262,14 @@ static void pppol2tp_recv(struct l2tp_session *session, struct sk_buff *skb, int
  * when a user application does a sendmsg() on the session socket. L2TP and
  * PPP headers must be inserted into the user's data.
  */
-static int pppol2tp_sendmsg(struct socket *sock, struct msghdr *m,
-			    size_t total_len)
+static int pppol2tp_sendmsg(struct socket *sock, struct msghdr *m)
 {
 	struct sock *sk = sock->sk;
 	struct sk_buff *skb;
 	int error;
 	struct l2tp_session *session;
 	struct l2tp_tunnel *tunnel;
+	size_t total_len = msg_data_left(m);
 	int uhlen;
 
 	error = -ENOTCONN;
diff --git a/net/llc/af_llc.c b/net/llc/af_llc.c
index da7fe94bea2e..d10b5ef66c88 100644
--- a/net/llc/af_llc.c
+++ b/net/llc/af_llc.c
@@ -919,12 +919,11 @@ static int llc_ui_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
  *	llc_ui_sendmsg - Transmit data provided by the socket user.
  *	@sock: Socket to transmit data from.
  *	@msg: Various user related information.
- *	@len: Length of data to transmit.
  *
  *	Transmit data provided by the socket user.
  *	Returns non-negative upon success, negative otherwise.
  */
-static int llc_ui_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int llc_ui_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct llc_sock *llc = llc_sk(sk);
@@ -954,7 +953,7 @@ static int llc_ui_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 			goto out;
 	}
 	hdrlen = llc->dev->hard_header_len + llc_ui_header_len(sk, addr);
-	size = hdrlen + len;
+	size = hdrlen + msg_data_left(msg);
 	if (size > llc->dev->mtu)
 		size = llc->dev->mtu;
 	copied = size - hdrlen;
diff --git a/net/mctp/af_mctp.c b/net/mctp/af_mctp.c
index bb4bd0b6a4f7..9ead250f1be3 100644
--- a/net/mctp/af_mctp.c
+++ b/net/mctp/af_mctp.c
@@ -90,7 +90,7 @@ static int mctp_bind(struct socket *sock, struct sockaddr *addr, int addrlen)
 	return rc;
 }
 
-static int mctp_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int mctp_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	DECLARE_SOCKADDR(struct sockaddr_mctp *, addr, msg->msg_name);
 	int rc, addrlen = msg->msg_namelen;
@@ -99,6 +99,7 @@ static int mctp_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 	struct mctp_skb_cb *cb;
 	struct mctp_route *rt;
 	struct sk_buff *skb = NULL;
+	size_t len = msg_data_left(msg);
 	int hlen;
 
 	if (addr) {
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 2d26b9114373..0a58f2dbd3ce 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1663,7 +1663,7 @@ static void mptcp_set_nospace(struct sock *sk)
 static int mptcp_disconnect(struct sock *sk, int flags);
 
 static int mptcp_sendmsg_fastopen(struct sock *sk, struct sock *ssk, struct msghdr *msg,
-				  size_t len, int *copied_syn)
+				  int *copied_syn)
 {
 	unsigned int saved_flags = msg->msg_flags;
 	struct mptcp_sock *msk = mptcp_sk(sk);
@@ -1673,7 +1673,7 @@ static int mptcp_sendmsg_fastopen(struct sock *sk, struct sock *ssk, struct msgh
 	msg->msg_flags |= MSG_DONTWAIT;
 	msk->connect_flags = O_NONBLOCK;
 	msk->fastopening = 1;
-	ret = tcp_sendmsg_fastopen(ssk, msg, copied_syn, len, NULL);
+	ret = tcp_sendmsg_fastopen(ssk, msg, copied_syn, NULL);
 	msk->fastopening = 0;
 	msg->msg_flags = saved_flags;
 	release_sock(ssk);
@@ -1695,7 +1695,7 @@ static int mptcp_sendmsg_fastopen(struct sock *sk, struct sock *ssk, struct msgh
 	return ret;
 }
 
-static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct mptcp_sock *msk = mptcp_sk(sk);
 	struct page_frag *pfrag;
@@ -1714,7 +1714,7 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 			       msg->msg_flags & MSG_FASTOPEN))) {
 		int copied_syn = 0;
 
-		ret = mptcp_sendmsg_fastopen(sk, ssock->sk, msg, len, &copied_syn);
+		ret = mptcp_sendmsg_fastopen(sk, ssock->sk, msg, &copied_syn);
 		copied += copied_syn;
 		if (ret == -EINPROGRESS && copied_syn > 0)
 			goto out;
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 877f1da1a8ac..519487cbfcce 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -1857,7 +1857,7 @@ static void netlink_cmsg_listen_all_nsid(struct sock *sk, struct msghdr *msg,
 		 &NETLINK_CB(skb).nsid);
 }
 
-static int netlink_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int netlink_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct netlink_sock *nlk = nlk_sk(sk);
@@ -1872,7 +1872,7 @@ static int netlink_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 	if (msg->msg_flags & MSG_OOB)
 		return -EOPNOTSUPP;
 
-	if (len == 0) {
+	if (msg_data_left(msg) == 0) {
 		pr_warn_once("Zero length message leads to an empty skb\n");
 		return -ENODATA;
 	}
@@ -1911,10 +1911,10 @@ static int netlink_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 	}
 
 	err = -EMSGSIZE;
-	if (len > sk->sk_sndbuf - 32)
+	if (msg_data_left(msg) > sk->sk_sndbuf - 32)
 		goto out;
 	err = -ENOBUFS;
-	skb = netlink_alloc_large_skb(len, dst_group);
+	skb = netlink_alloc_large_skb(msg_data_left(msg), dst_group);
 	if (skb == NULL)
 		goto out;
 
@@ -1924,7 +1924,8 @@ static int netlink_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 	NETLINK_CB(skb).flags	= netlink_skb_flags;
 
 	err = -EFAULT;
-	if (memcpy_from_msg(skb_put(skb, len), msg, len)) {
+	if (memcpy_from_msg(skb_put(skb, msg_data_left(msg)),
+			    msg, msg_data_left(msg))) {
 		kfree_skb(skb);
 		goto out;
 	}
diff --git a/net/netrom/af_netrom.c b/net/netrom/af_netrom.c
index 5a4cb796150f..d2c65f38c22c 100644
--- a/net/netrom/af_netrom.c
+++ b/net/netrom/af_netrom.c
@@ -1034,7 +1034,7 @@ int nr_rx_frame(struct sk_buff *skb, struct net_device *dev)
 	return 1;
 }
 
-static int nr_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int nr_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct nr_sock *nr = nr_sk(sk);
@@ -1043,6 +1043,7 @@ static int nr_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 	struct sockaddr_ax25 sax;
 	struct sk_buff *skb;
 	unsigned char *asmptr;
+	size_t len = msg_data_left(msg);
 	int size;
 
 	if (msg->msg_flags & ~(MSG_DONTWAIT|MSG_EOR|MSG_CMSG_COMPAT))
diff --git a/net/nfc/llcp_sock.c b/net/nfc/llcp_sock.c
index 77642d18a3b4..70226fc36396 100644
--- a/net/nfc/llcp_sock.c
+++ b/net/nfc/llcp_sock.c
@@ -770,8 +770,7 @@ static int llcp_sock_connect(struct socket *sock, struct sockaddr *_addr,
 	return ret;
 }
 
-static int llcp_sock_sendmsg(struct socket *sock, struct msghdr *msg,
-			     size_t len)
+static int llcp_sock_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct nfc_llcp_sock *llcp_sock = nfc_llcp_sock(sk);
@@ -805,7 +804,7 @@ static int llcp_sock_sendmsg(struct socket *sock, struct msghdr *msg,
 		release_sock(sk);
 
 		return nfc_llcp_send_ui_frame(llcp_sock, addr->dsap, addr->ssap,
-					      msg, len);
+					      msg, msg_data_left(msg));
 	}
 
 	if (sk->sk_state != LLCP_CONNECTED) {
@@ -815,7 +814,7 @@ static int llcp_sock_sendmsg(struct socket *sock, struct msghdr *msg,
 
 	release_sock(sk);
 
-	return nfc_llcp_send_i_frame(llcp_sock, msg, len);
+	return nfc_llcp_send_i_frame(llcp_sock, msg, msg_data_left(msg));
 }
 
 static int llcp_sock_recvmsg(struct socket *sock, struct msghdr *msg,
diff --git a/net/nfc/rawsock.c b/net/nfc/rawsock.c
index 5125392bb68e..d9d54240b2a2 100644
--- a/net/nfc/rawsock.c
+++ b/net/nfc/rawsock.c
@@ -202,11 +202,12 @@ static void rawsock_tx_work(struct work_struct *work)
 	kcov_remote_stop();
 }
 
-static int rawsock_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int rawsock_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct nfc_dev *dev = nfc_rawsock(sk)->dev;
 	struct sk_buff *skb;
+	size_t len = msg_data_left(msg);
 	int rc;
 
 	pr_debug("sock=%p sk=%p len=%zu\n", sock, sk, len);
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 497193f73030..84a95e177260 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -1947,14 +1947,14 @@ static void packet_parse_headers(struct sk_buff *skb, struct socket *sock)
  *	protocol layers and you must therefore supply it with a complete frame
  */
 
-static int packet_sendmsg_spkt(struct socket *sock, struct msghdr *msg,
-			       size_t len)
+static int packet_sendmsg_spkt(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	DECLARE_SOCKADDR(struct sockaddr_pkt *, saddr, msg->msg_name);
 	struct sk_buff *skb = NULL;
 	struct net_device *dev;
 	struct sockcm_cookie sockc;
+	size_t len = msg_data_left(msg);
 	__be16 proto = 0;
 	int err;
 	int extra_len = 0;
@@ -2933,7 +2933,7 @@ static struct sk_buff *packet_alloc_skb(struct sock *sk, size_t prepad,
 	return skb;
 }
 
-static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
+static int packet_snd(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	DECLARE_SOCKADDR(struct sockaddr_ll *, saddr, msg->msg_name);
@@ -2946,6 +2946,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
 	struct virtio_net_hdr vnet_hdr = { 0 };
 	int offset = 0;
 	struct packet_sock *po = pkt_sk(sk);
+	size_t len = msg_data_left(msg);
 	bool has_vnet_hdr = false;
 	int hlen, tlen, linear;
 	int extra_len = 0;
@@ -3093,7 +3094,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
 	return err;
 }
 
-static int packet_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int packet_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct packet_sock *po = pkt_sk(sk);
@@ -3104,7 +3105,7 @@ static int packet_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 	if (data_race(po->tx_ring.pg_vec))
 		return tpacket_snd(po, msg);
 
-	return packet_snd(sock, msg, len);
+	return packet_snd(sock, msg);
 }
 
 /*
diff --git a/net/phonet/datagram.c b/net/phonet/datagram.c
index ff5f49ab236e..4839f7d6785b 100644
--- a/net/phonet/datagram.c
+++ b/net/phonet/datagram.c
@@ -70,10 +70,11 @@ static int pn_init(struct sock *sk)
 	return 0;
 }
 
-static int pn_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+static int pn_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	DECLARE_SOCKADDR(struct sockaddr_pn *, target, msg->msg_name);
 	struct sk_buff *skb;
+	size_t len = msg_data_left(msg);
 	int err;
 
 	if (msg->msg_flags & ~(MSG_DONTWAIT|MSG_EOR|MSG_NOSIGNAL|
diff --git a/net/phonet/pep.c b/net/phonet/pep.c
index 83ea13a50690..5afc99ab9eca 100644
--- a/net/phonet/pep.c
+++ b/net/phonet/pep.c
@@ -1112,10 +1112,11 @@ static int pipe_skb_send(struct sock *sk, struct sk_buff *skb)
 
 }
 
-static int pep_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+static int pep_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct pep_sock *pn = pep_sk(sk);
 	struct sk_buff *skb;
+	size_t len = msg_data_left(msg);
 	long timeo;
 	int flags = msg->msg_flags;
 	int err, done;
diff --git a/net/phonet/socket.c b/net/phonet/socket.c
index 71e2caf6ab85..99cd62f64944 100644
--- a/net/phonet/socket.c
+++ b/net/phonet/socket.c
@@ -414,15 +414,14 @@ static int pn_socket_listen(struct socket *sock, int backlog)
 	return err;
 }
 
-static int pn_socket_sendmsg(struct socket *sock, struct msghdr *m,
-			     size_t total_len)
+static int pn_socket_sendmsg(struct socket *sock, struct msghdr *m)
 {
 	struct sock *sk = sock->sk;
 
 	if (pn_socket_autobind(sock))
 		return -EAGAIN;
 
-	return sk->sk_prot->sendmsg(sk, m, total_len);
+	return sk->sk_prot->sendmsg(sk, m);
 }
 
 const struct proto_ops phonet_dgram_ops = {
diff --git a/net/qrtr/af_qrtr.c b/net/qrtr/af_qrtr.c
index 5c2fb992803b..7c1b908dd479 100644
--- a/net/qrtr/af_qrtr.c
+++ b/net/qrtr/af_qrtr.c
@@ -888,7 +888,7 @@ static int qrtr_bcast_enqueue(struct qrtr_node *node, struct sk_buff *skb,
 	return 0;
 }
 
-static int qrtr_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int qrtr_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	DECLARE_SOCKADDR(struct sockaddr_qrtr *, addr, msg->msg_name);
 	int (*enqueue_fn)(struct qrtr_node *, struct sk_buff *, int,
@@ -898,7 +898,7 @@ static int qrtr_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 	struct sock *sk = sock->sk;
 	struct qrtr_node *node;
 	struct sk_buff *skb;
-	size_t plen;
+	size_t plen, len = msg_data_left(msg);
 	u32 type;
 	int rc;
 
diff --git a/net/rds/rds.h b/net/rds/rds.h
index d35d1fc39807..9e8ecafd5b51 100644
--- a/net/rds/rds.h
+++ b/net/rds/rds.h
@@ -909,7 +909,7 @@ void rds6_inc_info_copy(struct rds_incoming *inc,
 			int flip);
 
 /* send.c */
-int rds_sendmsg(struct socket *sock, struct msghdr *msg, size_t payload_len);
+int rds_sendmsg(struct socket *sock, struct msghdr *msg);
 void rds_send_path_reset(struct rds_conn_path *conn);
 int rds_send_xmit(struct rds_conn_path *cp);
 struct sockaddr_in;
diff --git a/net/rds/send.c b/net/rds/send.c
index 5e57a1581dc6..f588b720e1c3 100644
--- a/net/rds/send.c
+++ b/net/rds/send.c
@@ -1098,7 +1098,7 @@ static int rds_rdma_bytes(struct msghdr *msg, size_t *rdma_bytes)
 	return 0;
 }
 
-int rds_sendmsg(struct socket *sock, struct msghdr *msg, size_t payload_len)
+int rds_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct rds_sock *rs = rds_sk_to_rs(sk);
@@ -1114,6 +1114,7 @@ int rds_sendmsg(struct socket *sock, struct msghdr *msg, size_t payload_len)
 	struct rds_conn_path *cpath;
 	struct in6_addr daddr;
 	__u32 scope_id = 0;
+	size_t payload_len = msg_data_left(msg);
 	size_t rdma_payload_len = 0;
 	bool zcopy = ((msg->msg_flags & MSG_ZEROCOPY) &&
 		      sock_flag(rds_rs_to_sk(rs), SOCK_ZEROCOPY));
diff --git a/net/rose/af_rose.c b/net/rose/af_rose.c
index ca2b17f32670..938ea0716751 100644
--- a/net/rose/af_rose.c
+++ b/net/rose/af_rose.c
@@ -1069,7 +1069,7 @@ int rose_rx_call_request(struct sk_buff *skb, struct net_device *dev, struct ros
 	return 1;
 }
 
-static int rose_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int rose_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct rose_sock *rose = rose_sk(sk);
@@ -1078,6 +1078,7 @@ static int rose_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 	struct full_sockaddr_rose srose;
 	struct sk_buff *skb;
 	unsigned char *asmptr;
+	size_t len = msg_data_left(msg);
 	int n, size, qbit = 0;
 
 	if (msg->msg_flags & ~(MSG_DONTWAIT|MSG_EOR|MSG_CMSG_COMPAT))
diff --git a/net/rxrpc/af_rxrpc.c b/net/rxrpc/af_rxrpc.c
index 102f5cbff91a..bdce6ab30899 100644
--- a/net/rxrpc/af_rxrpc.c
+++ b/net/rxrpc/af_rxrpc.c
@@ -502,13 +502,13 @@ static int rxrpc_connect(struct socket *sock, struct sockaddr *addr,
  *   - sends a call data packet
  *   - may send an abort (abort code in control data)
  */
-static int rxrpc_sendmsg(struct socket *sock, struct msghdr *m, size_t len)
+static int rxrpc_sendmsg(struct socket *sock, struct msghdr *m)
 {
 	struct rxrpc_local *local;
 	struct rxrpc_sock *rx = rxrpc_sk(sock->sk);
 	int ret;
 
-	_enter(",{%d},,%zu", rx->sk.sk_state, len);
+	_enter(",{%d},,%zu", rx->sk.sk_state, msg_data_left(m));
 
 	if (m->msg_flags & MSG_OOB)
 		return -EOPNOTSUPP;
@@ -562,7 +562,7 @@ static int rxrpc_sendmsg(struct socket *sock, struct msghdr *m, size_t len)
 		fallthrough;
 	case RXRPC_SERVER_BOUND:
 	case RXRPC_SERVER_LISTENING:
-		ret = rxrpc_do_sendmsg(rx, m, len);
+		ret = rxrpc_do_sendmsg(rx, m);
 		/* The socket has been unlocked */
 		goto out;
 	default:
diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h
index 67b0a894162d..36738f8f050d 100644
--- a/net/rxrpc/ar-internal.h
+++ b/net/rxrpc/ar-internal.h
@@ -1221,7 +1221,7 @@ struct key *rxrpc_look_up_server_security(struct rxrpc_connection *,
  */
 bool rxrpc_propose_abort(struct rxrpc_call *call, s32 abort_code, int error,
 			 enum rxrpc_abort_reason why);
-int rxrpc_do_sendmsg(struct rxrpc_sock *, struct msghdr *, size_t);
+int rxrpc_do_sendmsg(struct rxrpc_sock *, struct msghdr *);
 
 /*
  * server_key.c
diff --git a/net/rxrpc/output.c b/net/rxrpc/output.c
index 5e53429c6922..0f3ff3455101 100644
--- a/net/rxrpc/output.c
+++ b/net/rxrpc/output.c
@@ -16,9 +16,9 @@
 #include <net/udp.h>
 #include "ar-internal.h"
 
-extern int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len);
+extern int udpv6_sendmsg(struct sock *sk, struct msghdr *msg);
 
-static ssize_t do_udp_sendmsg(struct socket *socket, struct msghdr *msg, size_t len)
+static ssize_t do_udp_sendmsg(struct socket *socket, struct msghdr *msg)
 {
 	struct sockaddr *sa = msg->msg_name;
 	struct sock *sk = socket->sk;
@@ -29,10 +29,10 @@ static ssize_t do_udp_sendmsg(struct socket *socket, struct msghdr *msg, size_t
 				pr_warn("AF_INET6 address on AF_INET socket\n");
 				return -ENOPROTOOPT;
 			}
-			return udpv6_sendmsg(sk, msg, len);
+			return udpv6_sendmsg(sk, msg);
 		}
 	}
-	return udp_sendmsg(sk, msg, len);
+	return udp_sendmsg(sk, msg);
 }
 
 struct rxrpc_abort_buffer {
@@ -232,7 +232,7 @@ int rxrpc_send_ack_packet(struct rxrpc_call *call, struct rxrpc_txbuf *txb)
 	txb->ack.previousPacket	= htonl(call->rx_highest_seq);
 
 	iov_iter_kvec(&msg.msg_iter, WRITE, iov, 1, len);
-	ret = do_udp_sendmsg(conn->local->socket, &msg, len);
+	ret = do_udp_sendmsg(conn->local->socket, &msg);
 	call->peer->last_tx_at = ktime_get_seconds();
 	if (ret < 0) {
 		trace_rxrpc_tx_fail(call->debug_id, serial, ret,
@@ -306,7 +306,7 @@ int rxrpc_send_abort_packet(struct rxrpc_call *call)
 	pkt.whdr.serial = htonl(serial);
 
 	iov_iter_kvec(&msg.msg_iter, WRITE, iov, 1, sizeof(pkt));
-	ret = do_udp_sendmsg(conn->local->socket, &msg, sizeof(pkt));
+	ret = do_udp_sendmsg(conn->local->socket, &msg);
 	conn->peer->last_tx_at = ktime_get_seconds();
 	if (ret < 0)
 		trace_rxrpc_tx_fail(call->debug_id, serial, ret,
@@ -424,7 +424,7 @@ int rxrpc_send_data_packet(struct rxrpc_call *call, struct rxrpc_txbuf *txb)
 	 *     message and update the peer record
 	 */
 	rxrpc_inc_stat(call->rxnet, stat_tx_data_send);
-	ret = do_udp_sendmsg(conn->local->socket, &msg, len);
+	ret = do_udp_sendmsg(conn->local->socket, &msg);
 	conn->peer->last_tx_at = ktime_get_seconds();
 
 	if (ret < 0) {
@@ -497,7 +497,7 @@ int rxrpc_send_data_packet(struct rxrpc_call *call, struct rxrpc_txbuf *txb)
 		ip_sock_set_mtu_discover(conn->local->socket->sk,
 					 IP_PMTUDISC_DONT);
 		rxrpc_inc_stat(call->rxnet, stat_tx_data_send_frag);
-		ret = do_udp_sendmsg(conn->local->socket, &msg, len);
+		ret = do_udp_sendmsg(conn->local->socket, &msg);
 		conn->peer->last_tx_at = ktime_get_seconds();
 
 		ip_sock_set_mtu_discover(conn->local->socket->sk,
@@ -564,7 +564,7 @@ void rxrpc_send_conn_abort(struct rxrpc_connection *conn)
 	whdr.serial = htonl(serial);
 
 	iov_iter_kvec(&msg.msg_iter, WRITE, iov, 2, len);
-	ret = do_udp_sendmsg(conn->local->socket, &msg, len);
+	ret = do_udp_sendmsg(conn->local->socket, &msg);
 	if (ret < 0) {
 		trace_rxrpc_tx_fail(conn->debug_id, serial, ret,
 				    rxrpc_tx_point_conn_abort);
@@ -633,7 +633,7 @@ void rxrpc_reject_packet(struct rxrpc_local *local, struct sk_buff *skb)
 		whdr.flags	&= RXRPC_CLIENT_INITIATED;
 
 		iov_iter_kvec(&msg.msg_iter, WRITE, iov, ioc, size);
-		ret = do_udp_sendmsg(local->socket, &msg, size);
+		ret = do_udp_sendmsg(local->socket, &msg);
 		if (ret < 0)
 			trace_rxrpc_tx_fail(local->debug_id, 0, ret,
 					    rxrpc_tx_point_reject);
@@ -682,7 +682,7 @@ void rxrpc_send_keepalive(struct rxrpc_peer *peer)
 	len = iov[0].iov_len + iov[1].iov_len;
 
 	iov_iter_kvec(&msg.msg_iter, WRITE, iov, 2, len);
-	ret = do_udp_sendmsg(peer->local->socket, &msg, len);
+	ret = do_udp_sendmsg(peer->local->socket, &msg);
 	if (ret < 0)
 		trace_rxrpc_tx_fail(peer->debug_id, 0, ret,
 				    rxrpc_tx_point_version_keepalive);
diff --git a/net/rxrpc/rxperf.c b/net/rxrpc/rxperf.c
index 4a2e90015ca7..0167afb67a7a 100644
--- a/net/rxrpc/rxperf.c
+++ b/net/rxrpc/rxperf.c
@@ -507,7 +507,7 @@ static int rxperf_process_call(struct rxperf_call *call)
 		iov_iter_bvec(&msg.msg_iter, WRITE, &bv, 1, len);
 		msg.msg_flags = MSG_MORE;
 		n = rxrpc_kernel_send_data(rxperf_socket, call->rxcall, &msg,
-					   len, rxperf_notify_end_reply_tx);
+					   rxperf_notify_end_reply_tx);
 		if (n < 0)
 			return n;
 		if (n == 0)
@@ -520,7 +520,7 @@ static int rxperf_process_call(struct rxperf_call *call)
 	iov[0].iov_len	= len;
 	iov_iter_kvec(&msg.msg_iter, WRITE, iov, 1, len);
 	msg.msg_flags = 0;
-	n = rxrpc_kernel_send_data(rxperf_socket, call->rxcall, &msg, len,
+	n = rxrpc_kernel_send_data(rxperf_socket, call->rxcall, &msg,
 				   rxperf_notify_end_reply_tx);
 	if (n >= 0)
 		return 0; /* Success */
diff --git a/net/rxrpc/sendmsg.c b/net/rxrpc/sendmsg.c
index da49fcf1c456..b6ffd8124ced 100644
--- a/net/rxrpc/sendmsg.c
+++ b/net/rxrpc/sendmsg.c
@@ -280,7 +280,7 @@ static void rxrpc_queue_packet(struct rxrpc_sock *rx, struct rxrpc_call *call,
  */
 static int rxrpc_send_data(struct rxrpc_sock *rx,
 			   struct rxrpc_call *call,
-			   struct msghdr *msg, size_t len,
+			   struct msghdr *msg,
 			   rxrpc_notify_end_tx_t notify_end_tx,
 			   bool *_dropped_lock)
 {
@@ -327,9 +327,9 @@ static int rxrpc_send_data(struct rxrpc_sock *rx,
 
 	ret = -EMSGSIZE;
 	if (call->tx_total_len != -1) {
-		if (len - copied > call->tx_total_len)
+		if (msg_data_left(msg) > call->tx_total_len)
 			goto maybe_error;
-		if (!more && len - copied != call->tx_total_len)
+		if (!more && msg_data_left(msg) != call->tx_total_len)
 			goto maybe_error;
 	}
 
@@ -612,7 +612,7 @@ rxrpc_new_client_call_for_sendmsg(struct rxrpc_sock *rx, struct msghdr *msg,
  * - caller holds the socket locked
  * - the socket may be either a client socket or a server socket
  */
-int rxrpc_do_sendmsg(struct rxrpc_sock *rx, struct msghdr *msg, size_t len)
+int rxrpc_do_sendmsg(struct rxrpc_sock *rx, struct msghdr *msg)
 	__releases(&rx->sk.sk_lock.slock)
 {
 	struct rxrpc_call *call;
@@ -723,7 +723,7 @@ int rxrpc_do_sendmsg(struct rxrpc_sock *rx, struct msghdr *msg, size_t len)
 	} else if (p.command != RXRPC_CMD_SEND_DATA) {
 		ret = -EINVAL;
 	} else {
-		ret = rxrpc_send_data(rx, call, msg, len, NULL, &dropped_lock);
+		ret = rxrpc_send_data(rx, call, msg, NULL, &dropped_lock);
 	}
 
 out_put_unlock:
@@ -744,7 +744,6 @@ int rxrpc_do_sendmsg(struct rxrpc_sock *rx, struct msghdr *msg, size_t len)
  * @sock: The socket the call is on
  * @call: The call to send data through
  * @msg: The data to send
- * @len: The amount of data to send
  * @notify_end_tx: Notification that the last packet is queued.
  *
  * Allow a kernel service to send data on a call.  The call must be in an state
@@ -753,7 +752,7 @@ int rxrpc_do_sendmsg(struct rxrpc_sock *rx, struct msghdr *msg, size_t len)
  * more data to come, otherwise this data will end the transmission phase.
  */
 int rxrpc_kernel_send_data(struct socket *sock, struct rxrpc_call *call,
-			   struct msghdr *msg, size_t len,
+			   struct msghdr *msg,
 			   rxrpc_notify_end_tx_t notify_end_tx)
 {
 	bool dropped_lock = false;
@@ -766,7 +765,7 @@ int rxrpc_kernel_send_data(struct socket *sock, struct rxrpc_call *call,
 
 	mutex_lock(&call->user_mutex);
 
-	ret = rxrpc_send_data(rxrpc_sk(sock->sk), call, msg, len,
+	ret = rxrpc_send_data(rxrpc_sk(sock->sk), call, msg,
 			      notify_end_tx, &dropped_lock);
 	if (ret == -ESHUTDOWN)
 		ret = call->error;
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index b91616f819de..da99aab89d82 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -1935,7 +1935,7 @@ static void sctp_sendmsg_update_sinfo(struct sctp_association *asoc,
 	}
 }
 
-static int sctp_sendmsg(struct sock *sk, struct msghdr *msg, size_t msg_len)
+static int sctp_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct sctp_endpoint *ep = sctp_sk(sk)->ep;
 	struct sctp_transport *transport = NULL;
@@ -1943,6 +1943,7 @@ static int sctp_sendmsg(struct sock *sk, struct msghdr *msg, size_t msg_len)
 	struct sctp_association *asoc, *tmp;
 	struct sctp_cmsgs cmsgs;
 	union sctp_addr *daddr;
+	size_t msg_len = msg_data_left(msg);
 	bool new = false;
 	__u16 sflags;
 	int err;
diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index c6b4a62276f6..0e725698ebcd 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -2653,10 +2653,11 @@ static int smc_getname(struct socket *sock, struct sockaddr *addr,
 	return smc->clcsock->ops->getname(smc->clcsock, addr, peer);
 }
 
-static int smc_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int smc_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct smc_sock *smc;
+	size_t len = msg_data_left(msg);
 	int rc;
 
 	smc = smc_sk(sk);
@@ -2681,7 +2682,7 @@ static int smc_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 	}
 
 	if (smc->use_fallback) {
-		rc = smc->clcsock->ops->sendmsg(smc->clcsock, msg, len);
+		rc = smc->clcsock->ops->sendmsg(smc->clcsock, msg);
 	} else {
 		rc = smc_tx_sendmsg(smc, msg, len);
 		SMC_STAT_TX_PAYLOAD(smc, len, rc);
diff --git a/net/socket.c b/net/socket.c
index 73e493da4589..1690e1782bf0 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -708,10 +708,8 @@ void __sock_tx_timestamp(__u16 tsflags, __u8 *tx_flags)
 }
 EXPORT_SYMBOL(__sock_tx_timestamp);
 
-INDIRECT_CALLABLE_DECLARE(int inet_sendmsg(struct socket *, struct msghdr *,
-					   size_t));
-INDIRECT_CALLABLE_DECLARE(int inet6_sendmsg(struct socket *, struct msghdr *,
-					    size_t));
+INDIRECT_CALLABLE_DECLARE(int inet_sendmsg(struct socket *, struct msghdr *));
+INDIRECT_CALLABLE_DECLARE(int inet6_sendmsg(struct socket *, struct msghdr *));
 
 static noinline void call_trace_sock_send_length(struct sock *sk, int ret,
 						 int flags)
@@ -722,8 +720,7 @@ static noinline void call_trace_sock_send_length(struct sock *sk, int ret,
 static inline int sock_sendmsg_nosec(struct socket *sock, struct msghdr *msg)
 {
 	int ret = INDIRECT_CALL_INET(sock->ops->sendmsg, inet6_sendmsg,
-				     inet_sendmsg, sock, msg,
-				     msg_data_left(msg));
+				     inet_sendmsg, sock, msg);
 	BUG_ON(ret == -EIOCBQUEUED);
 
 	if (trace_sock_send_length_enabled())
@@ -741,8 +738,7 @@ static inline int sock_sendmsg_nosec(struct socket *sock, struct msghdr *msg)
  */
 int sock_sendmsg(struct socket *sock, struct msghdr *msg)
 {
-	int err = security_socket_sendmsg(sock, msg,
-					  msg_data_left(msg));
+	int err = security_socket_sendmsg(sock, msg);
 
 	return err ?: sock_sendmsg_nosec(sock, msg);
 }
@@ -787,11 +783,11 @@ int kernel_sendmsg_locked(struct sock *sk, struct msghdr *msg,
 	struct socket *sock = sk->sk_socket;
 
 	if (!sock->ops->sendmsg_locked)
-		return sock_no_sendmsg_locked(sk, msg, size);
+		return sock_no_sendmsg_locked(sk, msg);
 
 	iov_iter_kvec(&msg->msg_iter, ITER_SOURCE, vec, num, size);
 
-	return sock->ops->sendmsg_locked(sk, msg, msg_data_left(msg));
+	return sock->ops->sendmsg_locked(sk, msg);
 }
 EXPORT_SYMBOL(kernel_sendmsg_locked);
 
diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 37edfe10f8c6..bd677e707548 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -156,8 +156,8 @@ static int tipc_sk_leave(struct tipc_sock *tsk);
 static struct tipc_sock *tipc_sk_lookup(struct net *net, u32 portid);
 static int tipc_sk_insert(struct tipc_sock *tsk);
 static void tipc_sk_remove(struct tipc_sock *tsk);
-static int __tipc_sendstream(struct socket *sock, struct msghdr *m, size_t dsz);
-static int __tipc_sendmsg(struct socket *sock, struct msghdr *m, size_t dsz);
+static int __tipc_sendstream(struct socket *sock, struct msghdr *m);
+static int __tipc_sendmsg(struct socket *sock, struct msghdr *m);
 static void tipc_sk_push_backlog(struct tipc_sock *tsk, bool nagle_ack);
 static int tipc_wait_for_connect(struct socket *sock, long *timeo_p);
 
@@ -1385,7 +1385,6 @@ static void tipc_sk_conn_proto_rcv(struct tipc_sock *tsk, struct sk_buff *skb,
  * tipc_sendmsg - send message in connectionless manner
  * @sock: socket structure
  * @m: message to send
- * @dsz: amount of user data to be sent
  *
  * Message must have an destination specified explicitly.
  * Used for SOCK_RDM and SOCK_DGRAM messages,
@@ -1394,20 +1393,19 @@ static void tipc_sk_conn_proto_rcv(struct tipc_sock *tsk, struct sk_buff *skb,
  *
  * Return: the number of bytes sent on success, or errno otherwise
  */
-static int tipc_sendmsg(struct socket *sock,
-			struct msghdr *m, size_t dsz)
+static int tipc_sendmsg(struct socket *sock, struct msghdr *m)
 {
 	struct sock *sk = sock->sk;
 	int ret;
 
 	lock_sock(sk);
-	ret = __tipc_sendmsg(sock, m, dsz);
+	ret = __tipc_sendmsg(sock, m);
 	release_sock(sk);
 
 	return ret;
 }
 
-static int __tipc_sendmsg(struct socket *sock, struct msghdr *m, size_t dlen)
+static int __tipc_sendmsg(struct socket *sock, struct msghdr *m)
 {
 	struct sock *sk = sock->sk;
 	struct net *net = sock_net(sk);
@@ -1420,6 +1418,7 @@ static int __tipc_sendmsg(struct socket *sock, struct msghdr *m, size_t dlen)
 	struct tipc_msg *hdr = &tsk->phdr;
 	struct tipc_socket_addr skaddr;
 	struct sk_buff_head pkts;
+	size_t dlen = msg_data_left(m);
 	int atype, mtu, rc;
 
 	if (unlikely(dlen > TIPC_MAX_USER_MSG_SIZE))
@@ -1535,26 +1534,25 @@ static int __tipc_sendmsg(struct socket *sock, struct msghdr *m, size_t dlen)
  * tipc_sendstream - send stream-oriented data
  * @sock: socket structure
  * @m: data to send
- * @dsz: total length of data to be transmitted
  *
  * Used for SOCK_STREAM data.
  *
  * Return: the number of bytes sent on success (or partial success),
  * or errno if no data sent
  */
-static int tipc_sendstream(struct socket *sock, struct msghdr *m, size_t dsz)
+static int tipc_sendstream(struct socket *sock, struct msghdr *m)
 {
 	struct sock *sk = sock->sk;
 	int ret;
 
 	lock_sock(sk);
-	ret = __tipc_sendstream(sock, m, dsz);
+	ret = __tipc_sendstream(sock, m);
 	release_sock(sk);
 
 	return ret;
 }
 
-static int __tipc_sendstream(struct socket *sock, struct msghdr *m, size_t dlen)
+static int __tipc_sendstream(struct socket *sock, struct msghdr *m)
 {
 	struct sock *sk = sock->sk;
 	DECLARE_SOCKADDR(struct sockaddr_tipc *, dest, m->msg_name);
@@ -1564,6 +1562,7 @@ static int __tipc_sendstream(struct socket *sock, struct msghdr *m, size_t dlen)
 	struct tipc_msg *hdr = &tsk->phdr;
 	struct net *net = sock_net(sk);
 	struct sk_buff *skb;
+	size_t dlen = msg_data_left(m);
 	u32 dnode = tsk_peer_node(tsk);
 	int maxnagle = tsk->maxnagle;
 	int maxpkt = tsk->max_pkt;
@@ -1575,7 +1574,7 @@ static int __tipc_sendstream(struct socket *sock, struct msghdr *m, size_t dlen)
 
 	/* Handle implicit connection setup */
 	if (unlikely(dest && sk->sk_state == TIPC_OPEN)) {
-		rc = __tipc_sendmsg(sock, m, dlen);
+		rc = __tipc_sendmsg(sock, m);
 		if (dlen && dlen == rc) {
 			tsk->peer_caps = tipc_node_get_capabilities(net, dnode);
 			tsk->snt_unacked = tsk_inc(tsk, dlen + msg_hdr_sz(hdr));
@@ -1643,18 +1642,17 @@ static int __tipc_sendstream(struct socket *sock, struct msghdr *m, size_t dlen)
  * tipc_send_packet - send a connection-oriented message
  * @sock: socket structure
  * @m: message to send
- * @dsz: length of data to be transmitted
  *
  * Used for SOCK_SEQPACKET messages.
  *
  * Return: the number of bytes sent on success, or errno otherwise
  */
-static int tipc_send_packet(struct socket *sock, struct msghdr *m, size_t dsz)
+static int tipc_send_packet(struct socket *sock, struct msghdr *m)
 {
-	if (dsz > TIPC_MAX_USER_MSG_SIZE)
+	if (msg_data_left(m) > TIPC_MAX_USER_MSG_SIZE)
 		return -EMSGSIZE;
 
-	return tipc_sendstream(sock, m, dsz);
+	return tipc_sendstream(sock, m);
 }
 
 /* tipc_sk_finish_conn - complete the setup of a connection
@@ -2625,7 +2623,7 @@ static int tipc_connect(struct socket *sock, struct sockaddr *dest,
 		if (!timeout)
 			m.msg_flags = MSG_DONTWAIT;
 
-		res = __tipc_sendmsg(sock, &m, 0);
+		res = __tipc_sendmsg(sock, &m);
 		if ((res < 0) && (res != -EWOULDBLOCK))
 			goto exit;
 
@@ -2781,7 +2779,7 @@ static int tipc_accept(struct socket *sock, struct socket *new_sock, int flags,
 		skb_set_owner_r(buf, new_sk);
 	}
 	iov_iter_kvec(&m.msg_iter, ITER_SOURCE, NULL, 0, 0);
-	__tipc_sendstream(new_sock, &m, 0);
+	__tipc_sendstream(new_sock, &m);
 	release_sock(new_sk);
 exit:
 	release_sock(sk);
diff --git a/net/tls/tls.h b/net/tls/tls.h
index 804c3880d028..a969955ddd7c 100644
--- a/net/tls/tls.h
+++ b/net/tls/tls.h
@@ -96,7 +96,7 @@ int tls_set_sw_offload(struct sock *sk, struct tls_context *ctx, int tx);
 void tls_update_rx_zc_capable(struct tls_context *tls_ctx);
 void tls_sw_strparser_arm(struct sock *sk, struct tls_context *ctx);
 void tls_sw_strparser_done(struct tls_context *tls_ctx);
-int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
+int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg);
 int tls_sw_sendpage_locked(struct sock *sk, struct page *page,
 			   int offset, size_t size, int flags);
 int tls_sw_sendpage(struct sock *sk, struct page *page,
@@ -114,7 +114,7 @@ ssize_t tls_sw_splice_read(struct socket *sock, loff_t *ppos,
 			   struct pipe_inode_info *pipe,
 			   size_t len, unsigned int flags);
 
-int tls_device_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
+int tls_device_sendmsg(struct sock *sk, struct msghdr *msg);
 int tls_device_sendpage(struct sock *sk, struct page *page,
 			int offset, size_t size, int flags);
 int tls_tx_records(struct sock *sk, int flags);
diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
index a7cc4f9faac2..3616dde20a96 100644
--- a/net/tls/tls_device.c
+++ b/net/tls/tls_device.c
@@ -566,7 +566,7 @@ static int tls_push_data(struct sock *sk,
 	return rc;
 }
 
-int tls_device_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
+int tls_device_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	unsigned char record_type = TLS_RECORD_TYPE_DATA;
 	struct tls_context *tls_ctx = tls_get_ctx(sk);
@@ -583,7 +583,8 @@ int tls_device_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 	}
 
 	iter.msg_iter = &msg->msg_iter;
-	rc = tls_push_data(sk, iter, size, msg->msg_flags, record_type, NULL);
+	rc = tls_push_data(sk, iter, msg_data_left(msg), msg->msg_flags,
+			   record_type, NULL);
 
 out:
 	release_sock(sk);
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 635b8bf6b937..17ea9b07a277 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -929,7 +929,7 @@ static int tls_sw_push_pending_record(struct sock *sk, int flags)
 				   &copied, flags);
 }
 
-int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
+int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	long timeo = sock_sndtimeo(sk, msg->msg_flags & MSG_DONTWAIT);
 	struct tls_context *tls_ctx = tls_get_ctx(sk);
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index fb31e8a4409e..37c96a73e6b4 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -756,20 +756,20 @@ static int unix_ioctl(struct socket *, unsigned int, unsigned long);
 static int unix_compat_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg);
 #endif
 static int unix_shutdown(struct socket *, int);
-static int unix_stream_sendmsg(struct socket *, struct msghdr *, size_t);
+static int unix_stream_sendmsg(struct socket *, struct msghdr *);
 static int unix_stream_recvmsg(struct socket *, struct msghdr *, size_t, int);
 static ssize_t unix_stream_sendpage(struct socket *, struct page *, int offset,
 				    size_t size, int flags);
 static ssize_t unix_stream_splice_read(struct socket *,  loff_t *ppos,
 				       struct pipe_inode_info *, size_t size,
 				       unsigned int flags);
-static int unix_dgram_sendmsg(struct socket *, struct msghdr *, size_t);
+static int unix_dgram_sendmsg(struct socket *, struct msghdr *);
 static int unix_dgram_recvmsg(struct socket *, struct msghdr *, size_t, int);
 static int unix_read_skb(struct sock *sk, skb_read_actor_t recv_actor);
 static int unix_stream_read_skb(struct sock *sk, skb_read_actor_t recv_actor);
 static int unix_dgram_connect(struct socket *, struct sockaddr *,
 			      int, int);
-static int unix_seqpacket_sendmsg(struct socket *, struct msghdr *, size_t);
+static int unix_seqpacket_sendmsg(struct socket *, struct msghdr *);
 static int unix_seqpacket_recvmsg(struct socket *, struct msghdr *, size_t,
 				  int);
 
@@ -1888,14 +1888,14 @@ static void scm_stat_del(struct sock *sk, struct sk_buff *skb)
  *	Send AF_UNIX data.
  */
 
-static int unix_dgram_sendmsg(struct socket *sock, struct msghdr *msg,
-			      size_t len)
+static int unix_dgram_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	DECLARE_SOCKADDR(struct sockaddr_un *, sunaddr, msg->msg_name);
 	struct sock *sk = sock->sk, *other = NULL;
 	struct unix_sock *u = unix_sk(sk);
 	struct scm_cookie scm;
 	struct sk_buff *skb;
+	size_t len = msg_data_left(msg);
 	int data_len = 0;
 	int sk_locked;
 	long timeo;
@@ -2157,11 +2157,11 @@ static int queue_oob(struct socket *sock, struct msghdr *msg, struct sock *other
 }
 #endif
 
-static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg,
-			       size_t len)
+static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct sock *other = NULL;
+	size_t len = msg_data_left(msg);
 	int err, size;
 	struct sk_buff *skb;
 	int sent = 0;
@@ -2388,8 +2388,7 @@ static ssize_t unix_stream_sendpage(struct socket *socket, struct page *page,
 	return err;
 }
 
-static int unix_seqpacket_sendmsg(struct socket *sock, struct msghdr *msg,
-				  size_t len)
+static int unix_seqpacket_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	int err;
 	struct sock *sk = sock->sk;
@@ -2404,7 +2403,7 @@ static int unix_seqpacket_sendmsg(struct socket *sock, struct msghdr *msg,
 	if (msg->msg_namelen)
 		msg->msg_namelen = 0;
 
-	return unix_dgram_sendmsg(sock, msg, len);
+	return unix_dgram_sendmsg(sock, msg);
 }
 
 static int unix_seqpacket_recvmsg(struct socket *sock, struct msghdr *msg,
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 19aea7cba26e..20bac3e04abd 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1131,8 +1131,7 @@ static __poll_t vsock_poll(struct file *file, struct socket *sock,
 	return mask;
 }
 
-static int vsock_dgram_sendmsg(struct socket *sock, struct msghdr *msg,
-			       size_t len)
+static int vsock_dgram_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	int err;
 	struct sock *sk;
@@ -1198,7 +1197,7 @@ static int vsock_dgram_sendmsg(struct socket *sock, struct msghdr *msg,
 		goto out;
 	}
 
-	err = transport->dgram_enqueue(vsk, remote_addr, msg, len);
+	err = transport->dgram_enqueue(vsk, remote_addr, msg, msg_data_left(msg));
 
 out:
 	release_sock(sk);
@@ -1737,8 +1736,7 @@ static int vsock_connectible_getsockopt(struct socket *sock,
 	return 0;
 }
 
-static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
-				     size_t len)
+static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk;
 	struct vsock_sock *vsk;
@@ -1794,7 +1792,7 @@ static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
 	if (err < 0)
 		goto out;
 
-	while (total_written < len) {
+	while (msg_data_left(msg)) {
 		ssize_t written;
 
 		add_wait_queue(sk_sleep(sk), &wait);
@@ -1856,10 +1854,10 @@ static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
 
 		if (sk->sk_type == SOCK_SEQPACKET) {
 			written = transport->seqpacket_enqueue(vsk,
-						msg, len - total_written);
+					msg, msg_data_left(msg));
 		} else {
 			written = transport->stream_enqueue(vsk,
-					msg, len - total_written);
+					msg, msg_data_left(msg));
 		}
 
 		if (written < 0) {
@@ -1882,7 +1880,7 @@ static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
 		 * 1) SOCK_STREAM socket.
 		 * 2) SOCK_SEQPACKET socket when whole buffer is sent.
 		 */
-		if (sk->sk_type == SOCK_STREAM || total_written == len)
+		if (sk->sk_type == SOCK_STREAM || !msg_data_left(msg))
 			err = total_written;
 	}
 out:
diff --git a/net/x25/af_x25.c b/net/x25/af_x25.c
index 5c7ad301d742..5b8751669136 100644
--- a/net/x25/af_x25.c
+++ b/net/x25/af_x25.c
@@ -1100,7 +1100,7 @@ int x25_rx_call_request(struct sk_buff *skb, struct x25_neigh *nb,
 	goto out;
 }
 
-static int x25_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int x25_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct x25_sock *x25 = x25_sk(sk);
@@ -1108,6 +1108,7 @@ static int x25_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 	struct sockaddr_x25 sx25;
 	struct sk_buff *skb;
 	unsigned char *asmptr;
+	size_t len = msg_data_left(msg);
 	int noblock = msg->msg_flags & MSG_DONTWAIT;
 	size_t size;
 	int qbit = 0, rc = -EINVAL;
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 2ac58b282b5e..db82e2a287f5 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -629,7 +629,7 @@ static int xsk_check_common(struct xdp_sock *xs)
 	return 0;
 }
 
-static int __xsk_sendmsg(struct socket *sock, struct msghdr *m, size_t total_len)
+static int __xsk_sendmsg(struct socket *sock, struct msghdr *m)
 {
 	bool need_wait = !(m->msg_flags & MSG_DONTWAIT);
 	struct sock *sk = sock->sk;
@@ -663,12 +663,12 @@ static int __xsk_sendmsg(struct socket *sock, struct msghdr *m, size_t total_len
 	return 0;
 }
 
-static int xsk_sendmsg(struct socket *sock, struct msghdr *m, size_t total_len)
+static int xsk_sendmsg(struct socket *sock, struct msghdr *m)
 {
 	int ret;
 
 	rcu_read_lock();
-	ret = __xsk_sendmsg(sock, m, total_len);
+	ret = __xsk_sendmsg(sock, m);
 	rcu_read_unlock();
 
 	return ret;
diff --git a/net/xfrm/espintcp.c b/net/xfrm/espintcp.c
index 872b80188e83..d07faa356347 100644
--- a/net/xfrm/espintcp.c
+++ b/net/xfrm/espintcp.c
@@ -311,13 +311,14 @@ int espintcp_push_skb(struct sock *sk, struct sk_buff *skb)
 }
 EXPORT_SYMBOL_GPL(espintcp_push_skb);
 
-static int espintcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
+static int espintcp_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	long timeo = sock_sndtimeo(sk, msg->msg_flags & MSG_DONTWAIT);
 	struct espintcp_ctx *ctx = espintcp_getctx(sk);
 	struct espintcp_msg *emsg = &ctx->partial;
 	struct iov_iter pfx_iter;
 	struct kvec pfx_iov = {};
+	size_t size = msg_data_left(msg);
 	size_t msglen = size + 2;
 	char buf[2] = {0};
 	int err, end;
@@ -325,7 +326,7 @@ static int espintcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 	if (msg->msg_flags & ~MSG_DONTWAIT)
 		return -EOPNOTSUPP;
 
-	if (size > MAX_ESPINTCP_MSG)
+	if (msg_data_left(msg) > MAX_ESPINTCP_MSG)
 		return -EMSGSIZE;
 
 	if (msg->msg_controllen)
@@ -362,7 +363,8 @@ static int espintcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 	if (err < 0)
 		goto fail;
 
-	err = sk_msg_memcopy_from_iter(sk, &msg->msg_iter, &emsg->skmsg, size);
+	err = sk_msg_memcopy_from_iter(sk, &msg->msg_iter, &emsg->skmsg,
+				       msg_data_left(msg));
 	if (err < 0)
 		goto fail;
 
diff --git a/security/apparmor/lsm.c b/security/apparmor/lsm.c
index d6cc4812ca53..cb220a8e8126 100644
--- a/security/apparmor/lsm.c
+++ b/security/apparmor/lsm.c
@@ -997,10 +997,10 @@ static int aa_sock_msg_perm(const char *op, u32 request, struct socket *sock,
 /**
  * apparmor_socket_sendmsg - check perms before sending msg to another socket
  */
-static int apparmor_socket_sendmsg(struct socket *sock,
-				   struct msghdr *msg, int size)
+static int apparmor_socket_sendmsg(struct socket *sock, struct msghdr *msg)
 {
-	return aa_sock_msg_perm(OP_SENDMSG, AA_MAY_SEND, sock, msg, size);
+	return aa_sock_msg_perm(OP_SENDMSG, AA_MAY_SEND, sock, msg,
+				msg_data_left(msg));
 }
 
 /**
diff --git a/security/security.c b/security/security.c
index cf6cc576736f..faa87f363af8 100644
--- a/security/security.c
+++ b/security/security.c
@@ -2301,9 +2301,9 @@ int security_socket_accept(struct socket *sock, struct socket *newsock)
 	return call_int_hook(socket_accept, 0, sock, newsock);
 }
 
-int security_socket_sendmsg(struct socket *sock, struct msghdr *msg, int size)
+int security_socket_sendmsg(struct socket *sock, struct msghdr *msg)
 {
-	return call_int_hook(socket_sendmsg, 0, sock, msg, size);
+	return call_int_hook(socket_sendmsg, 0, sock, msg);
 }
 
 int security_socket_recvmsg(struct socket *sock, struct msghdr *msg,
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 9a5bdfc21314..ff0d82e6331d 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -4912,8 +4912,7 @@ static int selinux_socket_accept(struct socket *sock, struct socket *newsock)
 	return 0;
 }
 
-static int selinux_socket_sendmsg(struct socket *sock, struct msghdr *msg,
-				  int size)
+static int selinux_socket_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	return sock_has_perm(sock->sk, SOCKET__WRITE);
 }
diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c
index cfcbb748da25..ca30c105f254 100644
--- a/security/smack/smack_lsm.c
+++ b/security/smack/smack_lsm.c
@@ -3730,14 +3730,12 @@ static int smack_unix_may_send(struct socket *sock, struct socket *other)
  * smack_socket_sendmsg - Smack check based on destination host
  * @sock: the socket
  * @msg: the message
- * @size: the size of the message
  *
  * Return 0 if the current subject can write to the destination host.
  * For IPv4 this is only a question if the destination is a single label host.
  * For IPv6 this is a check against the label of the port.
  */
-static int smack_socket_sendmsg(struct socket *sock, struct msghdr *msg,
-				int size)
+static int smack_socket_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sockaddr_in *sip = (struct sockaddr_in *) msg->msg_name;
 #if IS_ENABLED(CONFIG_IPV6)
diff --git a/security/tomoyo/common.h b/security/tomoyo/common.h
index ca285f362705..0841098d966a 100644
--- a/security/tomoyo/common.h
+++ b/security/tomoyo/common.h
@@ -997,8 +997,7 @@ int tomoyo_socket_bind_permission(struct socket *sock, struct sockaddr *addr,
 int tomoyo_socket_connect_permission(struct socket *sock,
 				     struct sockaddr *addr, int addr_len);
 int tomoyo_socket_listen_permission(struct socket *sock);
-int tomoyo_socket_sendmsg_permission(struct socket *sock, struct msghdr *msg,
-				     int size);
+int tomoyo_socket_sendmsg_permission(struct socket *sock, struct msghdr *msg);
 int tomoyo_supervisor(struct tomoyo_request_info *r, const char *fmt, ...)
 	__printf(2, 3);
 int tomoyo_update_domain(struct tomoyo_acl_info *new_entry, const int size,
diff --git a/security/tomoyo/network.c b/security/tomoyo/network.c
index 8dc61335f65e..0315b335cdff 100644
--- a/security/tomoyo/network.c
+++ b/security/tomoyo/network.c
@@ -751,12 +751,10 @@ int tomoyo_socket_bind_permission(struct socket *sock, struct sockaddr *addr,
  *
  * @sock: Pointer to "struct socket".
  * @msg:  Pointer to "struct msghdr".
- * @size: Unused.
  *
  * Returns 0 on success, negative value otherwise.
  */
-int tomoyo_socket_sendmsg_permission(struct socket *sock, struct msghdr *msg,
-				     int size)
+int tomoyo_socket_sendmsg_permission(struct socket *sock, struct msghdr *msg)
 {
 	struct tomoyo_addr_info address;
 	const u8 family = tomoyo_sock_family(sock->sk);
diff --git a/security/tomoyo/tomoyo.c b/security/tomoyo/tomoyo.c
index af04a7b7eb28..72c6f343ffba 100644
--- a/security/tomoyo/tomoyo.c
+++ b/security/tomoyo/tomoyo.c
@@ -489,14 +489,12 @@ static int tomoyo_socket_bind(struct socket *sock, struct sockaddr *addr,
  *
  * @sock: Pointer to "struct socket".
  * @msg:  Pointer to "struct msghdr".
- * @size: Size of message.
  *
  * Returns 0 on success, negative value otherwise.
  */
-static int tomoyo_socket_sendmsg(struct socket *sock, struct msghdr *msg,
-				 int size)
+static int tomoyo_socket_sendmsg(struct socket *sock, struct msghdr *msg)
 {
-	return tomoyo_socket_sendmsg_permission(sock, msg, size);
+	return tomoyo_socket_sendmsg_permission(sock, msg);
 }
 
 struct lsm_blob_sizes tomoyo_blob_sizes __lsm_ro_after_init = {



* [RFC PATCH 1/3] net: Drop the size argument from ->sendmsg()
@ 2023-03-22 13:56             ` David Howells
  0 siblings, 0 replies; 81+ messages in thread
From: David Howells @ 2023-03-22 13:56 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: David Howells, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Matthew Wilcox, Jeff Layton, Linus Torvalds, netdev,
	linux-kernel, apparmor, bpf, dccp, kvm, linux-afs, linux-arm-msm,
	linux-bluetooth, linux-can, linux-crypto, linux-hams, linux-rdma,
	linux-s390, linux-sctp, linux-security-module, linux-wpan,
	linux-x25, mptcp, rds-devel, selinux, tipc-discussion,
	virtualization, xen-devel

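The size argument to ->sendmsg() ought to be redundant, as the same
information should be conveyed by msg->msg_iter.count, which is what
msg_data_left() returns.  Drop the argument and have the implementations
that need the length obtain it from msg_data_left() instead.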
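
As a rough illustration of why the separate size is redundant, here is a
minimal userspace sketch.  It is not kernel code: the mock_* types and
functions are hypothetical stand-ins for struct msghdr, struct iov_iter and
msg_data_left() (which in the kernel wraps iov_iter_count()).  It assumes
that whatever consumes data also advances the iterator, so the remaining
length can always be read back from the message itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel's iov_iter and msghdr. */
struct mock_iov_iter {
	const char *data;	/* next unread byte */
	size_t count;		/* bytes remaining in the iterator */
};

struct mock_msghdr {
	struct mock_iov_iter msg_iter;
};

/* Models msg_data_left(): the remaining length lives in the iterator. */
static size_t mock_msg_data_left(const struct mock_msghdr *msg)
{
	return msg->msg_iter.count;
}

/* Consume up to n bytes, as a copy-from-iterator helper would. */
static size_t mock_iter_advance(struct mock_iov_iter *it, size_t n)
{
	if (n > it->count)
		n = it->count;
	it->data += n;
	it->count -= n;
	return n;
}

/*
 * A sendmsg-style loop with no size parameter: it sends in chunks until
 * the iterator is drained, and returns the total number of bytes consumed.
 */
static size_t mock_sendmsg(struct mock_msghdr *msg, size_t chunk)
{
	size_t sent = 0;

	assert(chunk > 0);	/* a zero chunk would never drain the iterator */
	while (mock_msg_data_left(msg))
		sent += mock_iter_advance(&msg->msg_iter, chunk);
	return sent;
}
```

The loop in mock_sendmsg() has the same shape as the converted loops in this
patch: termination depends only on the iterator being advanced, so a caller
that passed a stale or zero size can no longer disagree with the iterator.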

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Eric Dumazet <edumazet@google.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: netdev@vger.kernel.org
cc: apparmor@lists.ubuntu.com
cc: bpf@vger.kernel.org
cc: dccp@vger.kernel.org
cc: kvm@vger.kernel.org
cc: linux-afs@lists.infradead.org
cc: linux-arm-msm@vger.kernel.org
cc: linux-bluetooth@vger.kernel.org
cc: linux-can@vger.kernel.org
cc: linux-crypto@vger.kernel.org
cc: linux-hams@vger.kernel.org
cc: linux-rdma@vger.kernel.org
cc: linux-s390@vger.kernel.org
cc: linux-sctp@vger.kernel.org
cc: linux-security-module@vger.kernel.org
cc: linux-wpan@vger.kernel.org
cc: linux-x25@vger.kernel.org
cc: mptcp@lists.linux.dev
cc: rds-devel@oss.oracle.com
cc: selinux@vger.kernel.org
cc: tipc-discussion@lists.sourceforge.net
cc: virtualization@lists.linux-foundation.org
cc: xen-devel@lists.xenproject.org
---
 crypto/af_alg.c                               | 12 +++----
 crypto/algif_aead.c                           |  9 +++--
 crypto/algif_hash.c                           |  8 ++---
 crypto/algif_rng.c                            |  3 +-
 crypto/algif_skcipher.c                       | 10 +++---
 drivers/isdn/mISDN/socket.c                   |  3 +-
 .../chelsio/inline_crypto/chtls/chtls.h       |  2 +-
 .../chelsio/inline_crypto/chtls/chtls_io.c    | 15 ++++----
 drivers/net/ppp/pppoe.c                       |  4 +--
 drivers/net/tap.c                             |  3 +-
 drivers/net/tun.c                             |  3 +-
 drivers/vhost/net.c                           |  6 ++--
 drivers/xen/pvcalls-back.c                    |  2 +-
 drivers/xen/pvcalls-front.c                   |  4 +--
 drivers/xen/pvcalls-front.h                   |  3 +-
 fs/afs/rxrpc.c                                |  8 ++---
 include/crypto/if_alg.h                       |  3 +-
 include/linux/lsm_hook_defs.h                 |  3 +-
 include/linux/lsm_hooks.h                     |  1 -
 include/linux/net.h                           |  6 ++--
 include/linux/security.h                      |  4 +--
 include/net/af_rxrpc.h                        |  3 +-
 include/net/inet_common.h                     |  2 +-
 include/net/ipv6.h                            |  2 +-
 include/net/ping.h                            |  2 +-
 include/net/sock.h                            |  7 ++--
 include/net/tcp.h                             |  8 ++---
 include/net/udp.h                             |  2 +-
 net/appletalk/ddp.c                           |  3 +-
 net/atm/common.c                              |  3 +-
 net/atm/common.h                              |  2 +-
 net/ax25/af_ax25.c                            |  4 +--
 net/bluetooth/hci_sock.c                      |  4 +--
 net/bluetooth/iso.c                           |  4 +--
 net/bluetooth/l2cap_sock.c                    |  5 ++-
 net/bluetooth/rfcomm/sock.c                   |  7 ++--
 net/bluetooth/sco.c                           |  4 +--
 net/caif/caif_socket.c                        | 13 +++----
 net/can/bcm.c                                 |  3 +-
 net/can/isotp.c                               |  3 +-
 net/can/j1939/socket.c                        |  4 +--
 net/can/raw.c                                 |  3 +-
 net/core/sock.c                               |  4 +--
 net/dccp/dccp.h                               |  2 +-
 net/dccp/proto.c                              |  3 +-
 net/ieee802154/socket.c                       | 11 +++---
 net/ipv4/af_inet.c                            |  4 +--
 net/ipv4/ping.c                               |  8 +++--
 net/ipv4/raw.c                                |  3 +-
 net/ipv4/tcp.c                                | 17 +++++-----
 net/ipv4/tcp_bpf.c                            |  5 +--
 net/ipv4/tcp_input.c                          |  3 +-
 net/ipv4/udp.c                                |  5 +--
 net/ipv6/af_inet6.c                           |  7 ++--
 net/ipv6/ping.c                               |  5 +--
 net/ipv6/raw.c                                |  3 +-
 net/ipv6/udp.c                                |  7 ++--
 net/ipv6/udp_impl.h                           |  2 +-
 net/iucv/af_iucv.c                            |  4 +--
 net/kcm/kcmsock.c                             |  2 +-
 net/key/af_key.c                              |  3 +-
 net/l2tp/l2tp_ip.c                            |  3 +-
 net/l2tp/l2tp_ip6.c                           |  3 +-
 net/l2tp/l2tp_ppp.c                           |  4 +--
 net/llc/af_llc.c                              |  5 ++-
 net/mctp/af_mctp.c                            |  3 +-
 net/mptcp/protocol.c                          |  8 ++---
 net/netlink/af_netlink.c                      | 11 +++---
 net/netrom/af_netrom.c                        |  3 +-
 net/nfc/llcp_sock.c                           |  7 ++--
 net/nfc/rawsock.c                             |  3 +-
 net/packet/af_packet.c                        | 11 +++---
 net/phonet/datagram.c                         |  3 +-
 net/phonet/pep.c                              |  3 +-
 net/phonet/socket.c                           |  5 ++-
 net/qrtr/af_qrtr.c                            |  4 +--
 net/rds/rds.h                                 |  2 +-
 net/rds/send.c                                |  3 +-
 net/rose/af_rose.c                            |  3 +-
 net/rxrpc/af_rxrpc.c                          |  6 ++--
 net/rxrpc/ar-internal.h                       |  2 +-
 net/rxrpc/output.c                            | 22 ++++++------
 net/rxrpc/rxperf.c                            |  4 +--
 net/rxrpc/sendmsg.c                           | 15 ++++----
 net/sctp/socket.c                             |  3 +-
 net/smc/af_smc.c                              |  5 +--
 net/socket.c                                  | 16 ++++-----
 net/tipc/socket.c                             | 34 +++++++++----------
 net/tls/tls.h                                 |  4 +--
 net/tls/tls_device.c                          |  5 +--
 net/tls/tls_sw.c                              |  2 +-
 net/unix/af_unix.c                            | 19 +++++------
 net/vmw_vsock/af_vsock.c                      | 16 ++++-----
 net/x25/af_x25.c                              |  3 +-
 net/xdp/xsk.c                                 |  6 ++--
 net/xfrm/espintcp.c                           |  8 +++--
 security/apparmor/lsm.c                       |  6 ++--
 security/security.c                           |  4 +--
 security/selinux/hooks.c                      |  3 +-
 security/smack/smack_lsm.c                    |  4 +--
 security/tomoyo/common.h                      |  3 +-
 security/tomoyo/network.c                     |  4 +--
 security/tomoyo/tomoyo.c                      |  6 ++--
 103 files changed, 286 insertions(+), 296 deletions(-)

diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index 5f7252a5b7b4..dc49b4e2d719 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -952,19 +952,18 @@ static void af_alg_data_wakeup(struct sock *sk)
  *
  * @sock: socket of connection to user space
  * @msg: message from user space
- * @size: size of message from user space
  * @ivsize: the size of the IV for the cipher operation to verify that the
  *	   user-space-provided IV has the right size
  * Return: the number of copied data upon success, < 0 upon error
  */
-int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
-		   unsigned int ivsize)
+int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, unsigned int ivsize)
 {
 	struct sock *sk = sock->sk;
 	struct alg_sock *ask = alg_sk(sk);
 	struct af_alg_ctx *ctx = ask->private;
 	struct af_alg_tsgl *sgl;
 	struct af_alg_control con = {};
+	size_t len;
 	long copied = 0;
 	bool enc = false;
 	bool init = false;
@@ -1012,9 +1011,8 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
 		ctx->aead_assoclen = con.aead_assoclen;
 	}
 
-	while (size) {
+	while ((len = msg_data_left(msg))) {
 		struct scatterlist *sg;
-		size_t len = size;
 		size_t plen;
 
 		/* use the existing memory in an allocated page */
@@ -1037,7 +1035,6 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
 
 			ctx->used += len;
 			copied += len;
-			size -= len;
 			continue;
 		}
 
@@ -1086,11 +1083,10 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
 			len -= plen;
 			ctx->used += plen;
 			copied += plen;
-			size -= plen;
 			sgl->cur++;
 		} while (len && sgl->cur < MAX_SGL_ENTS);
 
-		if (!size)
+		if (!msg_data_left(msg))
 			sg_mark_end(sg + sgl->cur - 1);
 
 		ctx->merge = plen & (PAGE_SIZE - 1);
diff --git a/crypto/algif_aead.c b/crypto/algif_aead.c
index 42493b4d8ce4..1005c755c4c8 100644
--- a/crypto/algif_aead.c
+++ b/crypto/algif_aead.c
@@ -58,7 +58,7 @@ static inline bool aead_sufficient_data(struct sock *sk)
 	return ctx->used >= ctx->aead_assoclen + (ctx->enc ? 0 : as);
 }
 
-static int aead_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
+static int aead_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct alg_sock *ask = alg_sk(sk);
@@ -68,7 +68,7 @@ static int aead_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
 	struct crypto_aead *tfm = aeadc->aead;
 	unsigned int ivsize = crypto_aead_ivsize(tfm);
 
-	return af_alg_sendmsg(sock, msg, size, ivsize);
+	return af_alg_sendmsg(sock, msg, ivsize);
 }
 
 static int crypto_aead_copy_sgl(struct crypto_sync_skcipher *null_tfm,
@@ -408,8 +408,7 @@ static int aead_check_key(struct socket *sock)
 	return err;
 }
 
-static int aead_sendmsg_nokey(struct socket *sock, struct msghdr *msg,
-				  size_t size)
+static int aead_sendmsg_nokey(struct socket *sock, struct msghdr *msg)
 {
 	int err;
 
@@ -417,7 +416,7 @@ static int aead_sendmsg_nokey(struct socket *sock, struct msghdr *msg,
 	if (err)
 		return err;
 
-	return aead_sendmsg(sock, msg, size);
+	return aead_sendmsg(sock, msg);
 }
 
 static ssize_t aead_sendpage_nokey(struct socket *sock, struct page *page,
diff --git a/crypto/algif_hash.c b/crypto/algif_hash.c
index 1d017ec5c63c..9817adecdf1a 100644
--- a/crypto/algif_hash.c
+++ b/crypto/algif_hash.c
@@ -60,8 +60,7 @@ static void hash_free_result(struct sock *sk, struct hash_ctx *ctx)
 	ctx->result = NULL;
 }
 
-static int hash_sendmsg(struct socket *sock, struct msghdr *msg,
-			size_t ignored)
+static int hash_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	int limit = ALG_MAX_PAGES * PAGE_SIZE;
 	struct sock *sk = sock->sk;
@@ -325,8 +324,7 @@ static int hash_check_key(struct socket *sock)
 	return err;
 }
 
-static int hash_sendmsg_nokey(struct socket *sock, struct msghdr *msg,
-			      size_t size)
+static int hash_sendmsg_nokey(struct socket *sock, struct msghdr *msg)
 {
 	int err;
 
@@ -334,7 +332,7 @@ static int hash_sendmsg_nokey(struct socket *sock, struct msghdr *msg,
 	if (err)
 		return err;
 
-	return hash_sendmsg(sock, msg, size);
+	return hash_sendmsg(sock, msg);
 }
 
 static ssize_t hash_sendpage_nokey(struct socket *sock, struct page *page,
diff --git a/crypto/algif_rng.c b/crypto/algif_rng.c
index 407408c43730..f838be6c2fd7 100644
--- a/crypto/algif_rng.c
+++ b/crypto/algif_rng.c
@@ -130,11 +130,12 @@ static int rng_test_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
 	return ret;
 }
 
-static int rng_test_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int rng_test_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	int err;
 	struct alg_sock *ask = alg_sk(sock->sk);
 	struct rng_ctx *ctx = ask->private;
+	size_t len = msg_data_left(msg);
 
 	lock_sock(sock->sk);
 	if (len > MAXSIZE) {
diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c
index ee8890ee8f33..f5cd9dbbad1b 100644
--- a/crypto/algif_skcipher.c
+++ b/crypto/algif_skcipher.c
@@ -34,8 +34,7 @@
 #include <linux/net.h>
 #include <net/sock.h>
 
-static int skcipher_sendmsg(struct socket *sock, struct msghdr *msg,
-			    size_t size)
+static int skcipher_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct alg_sock *ask = alg_sk(sk);
@@ -44,7 +43,7 @@ static int skcipher_sendmsg(struct socket *sock, struct msghdr *msg,
 	struct crypto_skcipher *tfm = pask->private;
 	unsigned ivsize = crypto_skcipher_ivsize(tfm);
 
-	return af_alg_sendmsg(sock, msg, size, ivsize);
+	return af_alg_sendmsg(sock, msg, ivsize);
 }
 
 static int _skcipher_recvmsg(struct socket *sock, struct msghdr *msg,
@@ -234,8 +233,7 @@ static int skcipher_check_key(struct socket *sock)
 	return err;
 }
 
-static int skcipher_sendmsg_nokey(struct socket *sock, struct msghdr *msg,
-				  size_t size)
+static int skcipher_sendmsg_nokey(struct socket *sock, struct msghdr *msg)
 {
 	int err;
 
@@ -243,7 +241,7 @@ static int skcipher_sendmsg_nokey(struct socket *sock, struct msghdr *msg,
 	if (err)
 		return err;
 
-	return skcipher_sendmsg(sock, msg, size);
+	return skcipher_sendmsg(sock, msg);
 }
 
 static ssize_t skcipher_sendpage_nokey(struct socket *sock, struct page *page,
diff --git a/drivers/isdn/mISDN/socket.c b/drivers/isdn/mISDN/socket.c
index 2776ca5fc33f..4c42d39e994a 100644
--- a/drivers/isdn/mISDN/socket.c
+++ b/drivers/isdn/mISDN/socket.c
@@ -164,10 +164,11 @@ mISDN_sock_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
 }
 
 static int
-mISDN_sock_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+mISDN_sock_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock		*sk = sock->sk;
 	struct sk_buff		*skb;
+	size_t			len = msg_data_left(msg);
 	int			err = -ENOMEM;
 
 	if (*debug & DEBUG_SOCKET)
diff --git a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls.h b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls.h
index 41714203ace8..32077c61273b 100644
--- a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls.h
+++ b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls.h
@@ -565,7 +565,7 @@ void chtls_close(struct sock *sk, long timeout);
 int chtls_disconnect(struct sock *sk, int flags);
 void chtls_shutdown(struct sock *sk, int how);
 void chtls_destroy_sock(struct sock *sk);
-int chtls_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
+int chtls_sendmsg(struct sock *sk, struct msghdr *msg);
 int chtls_recvmsg(struct sock *sk, struct msghdr *msg,
 		  size_t len, int flags, int *addr_len);
 int chtls_sendpage(struct sock *sk, struct page *page,
diff --git a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_io.c b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_io.c
index ae6b17b96bf1..5782267618cf 100644
--- a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_io.c
+++ b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_io.c
@@ -1004,7 +1004,7 @@ static int chtls_proccess_cmsg(struct sock *sk, struct msghdr *msg,
 	return rc;
 }
 
-int chtls_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
+int chtls_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
 	struct chtls_dev *cdev = csk->cdev;
@@ -1058,7 +1058,7 @@ int chtls_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 					tx_skb_finalize(skb);
 			}
 
-			recordsz = size;
+			recordsz = msg_data_left(msg);
 			csk->tlshws.txleft = recordsz;
 			csk->tlshws.type = record_type;
 		}
@@ -1080,8 +1080,8 @@ int chtls_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 								 false);
 			} else {
 				skb = get_tx_skb(sk,
-						 select_size(sk, size, flags,
-							     TX_HEADER_LEN));
+						 select_size(sk, msg_data_left(msg),
+							     flags, TX_HEADER_LEN));
 			}
 			if (unlikely(!skb))
 				goto wait_for_memory;
@@ -1089,8 +1089,8 @@ int chtls_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 			skb->ip_summed = CHECKSUM_UNNECESSARY;
 			copy = mss;
 		}
-		if (copy > size)
-			copy = size;
+		if (copy > msg_data_left(msg))
+			copy = msg_data_left(msg);
 
 		if (skb_tailroom(skb) > 0) {
 			copy = min(copy, skb_tailroom(skb));
@@ -1182,7 +1182,6 @@ int chtls_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 			tx_skb_finalize(skb);
 		tp->write_seq += copy;
 		copied += copy;
-		size -= copy;
 
 		if (is_tls_tx(csk))
 			csk->tlshws.txleft -= copy;
@@ -1191,7 +1190,7 @@ int chtls_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 		    (sk_stream_wspace(sk) < sk_stream_min_wspace(sk)))
 			ULP_SKB_CB(skb)->flags |= ULPCB_FLAG_NO_APPEND;
 
-		if (size == 0)
+		if (msg_data_left(msg) == 0)
 			goto out;
 
 		if (ULP_SKB_CB(skb)->flags & ULPCB_FLAG_NO_APPEND)
diff --git a/drivers/net/ppp/pppoe.c b/drivers/net/ppp/pppoe.c
index ce2cbb5903d7..7ae28a1f528a 100644
--- a/drivers/net/ppp/pppoe.c
+++ b/drivers/net/ppp/pppoe.c
@@ -833,8 +833,7 @@ static int pppoe_ioctl(struct socket *sock, unsigned int cmd,
 	return err;
 }
 
-static int pppoe_sendmsg(struct socket *sock, struct msghdr *m,
-			 size_t total_len)
+static int pppoe_sendmsg(struct socket *sock, struct msghdr *m)
 {
 	struct sk_buff *skb;
 	struct sock *sk = sock->sk;
@@ -843,6 +842,7 @@ static int pppoe_sendmsg(struct socket *sock, struct msghdr *m,
 	struct pppoe_hdr hdr;
 	struct pppoe_hdr *ph;
 	struct net_device *dev;
+	size_t total_len = msg_data_left(m);
 	char *start;
 	int hlen;
 
diff --git a/drivers/net/tap.c b/drivers/net/tap.c
index ce993cc75bf3..2b076d4a1a58 100644
--- a/drivers/net/tap.c
+++ b/drivers/net/tap.c
@@ -1224,8 +1224,7 @@ static int tap_get_user_xdp(struct tap_queue *q, struct xdp_buff *xdp)
 	return err;
 }
 
-static int tap_sendmsg(struct socket *sock, struct msghdr *m,
-		       size_t total_len)
+static int tap_sendmsg(struct socket *sock, struct msghdr *m)
 {
 	struct tap_queue *q = container_of(sock, struct tap_queue, sock);
 	struct tun_msg_ctl *ctl = m->msg_control;
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 4c7f74904c25..b31d696adafd 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -2531,13 +2531,14 @@ static int tun_xdp_one(struct tun_struct *tun,
 	return ret;
 }
 
-static int tun_sendmsg(struct socket *sock, struct msghdr *m, size_t total_len)
+static int tun_sendmsg(struct socket *sock, struct msghdr *m)
 {
 	int ret, i;
 	struct tun_file *tfile = container_of(sock, struct tun_file, socket);
 	struct tun_struct *tun = tun_get(tfile);
 	struct tun_msg_ctl *ctl = m->msg_control;
 	struct xdp_buff *xdp;
+	size_t total_len = msg_data_left(m);
 
 	if (!tun)
 		return -EBADFD;
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 07181cd8d52e..ddf01a21f208 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -476,7 +476,7 @@ static void vhost_tx_batch(struct vhost_net *net,
 
 	msghdr->msg_control = &ctl;
 	msghdr->msg_controllen = sizeof(ctl);
-	err = sock->ops->sendmsg(sock, msghdr, 0);
+	err = sock->ops->sendmsg(sock, msghdr);
 	if (unlikely(err < 0)) {
 		vq_err(&nvq->vq, "Fail to batch sending packets\n");
 
@@ -836,7 +836,7 @@ static void handle_tx_copy(struct vhost_net *net, struct socket *sock)
 				msg.msg_flags &= ~MSG_MORE;
 		}
 
-		err = sock->ops->sendmsg(sock, &msg, len);
+		err = sock->ops->sendmsg(sock, &msg);
 		if (unlikely(err < 0)) {
 			if (err == -EAGAIN || err == -ENOMEM || err == -ENOBUFS) {
 				vhost_discard_vq_desc(vq, 1);
@@ -933,7 +933,7 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock)
 			msg.msg_flags &= ~MSG_MORE;
 		}
 
-		err = sock->ops->sendmsg(sock, &msg, len);
+		err = sock->ops->sendmsg(sock, &msg);
 		if (unlikely(err < 0)) {
 			if (zcopy_used) {
 				if (vq->heads[ubuf->desc].len == VHOST_DMA_IN_PROGRESS)
diff --git a/drivers/xen/pvcalls-back.c b/drivers/xen/pvcalls-back.c
index 1f5219e12cc3..37cfd15b6d9d 100644
--- a/drivers/xen/pvcalls-back.c
+++ b/drivers/xen/pvcalls-back.c
@@ -200,7 +200,7 @@ static bool pvcalls_conn_back_write(struct sock_mapping *map)
 		iov_iter_kvec(&msg.msg_iter, ITER_SOURCE, vec, 2, size);
 	}
 
-	ret = inet_sendmsg(map->sock, &msg, size);
+	ret = inet_sendmsg(map->sock, &msg);
 	if (ret == -EAGAIN) {
 		atomic_inc(&map->write);
 		atomic_inc(&map->io);
diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index d5d589bda243..257d92612371 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -531,10 +531,10 @@ static int __write_ring(struct pvcalls_data_intf *intf,
 	return len;
 }
 
-int pvcalls_front_sendmsg(struct socket *sock, struct msghdr *msg,
-			  size_t len)
+int pvcalls_front_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock_mapping *map;
+	size_t len = msg_data_left(msg);
 	int sent, tot_sent = 0;
 	int count = 0, flags;
 
diff --git a/drivers/xen/pvcalls-front.h b/drivers/xen/pvcalls-front.h
index f694ad77379f..f0c5429604e6 100644
--- a/drivers/xen/pvcalls-front.h
+++ b/drivers/xen/pvcalls-front.h
@@ -14,8 +14,7 @@ int pvcalls_front_accept(struct socket *sock,
 			 struct socket *newsock,
 			 int flags);
 int pvcalls_front_sendmsg(struct socket *sock,
-			  struct msghdr *msg,
-			  size_t len);
+			  struct msghdr *msg);
 int pvcalls_front_recvmsg(struct socket *sock,
 			  struct msghdr *msg,
 			  size_t len,
diff --git a/fs/afs/rxrpc.c b/fs/afs/rxrpc.c
index 7817e2b860e5..95ef04862025 100644
--- a/fs/afs/rxrpc.c
+++ b/fs/afs/rxrpc.c
@@ -367,8 +367,7 @@ void afs_make_call(struct afs_addr_cursor *ac, struct afs_call *call, gfp_t gfp)
 	msg.msg_flags		= MSG_WAITALL | (call->write_iter ? MSG_MORE : 0);
 
 	ret = rxrpc_kernel_send_data(call->net->socket, rxcall,
-				     &msg, call->request_size,
-				     afs_notify_end_request_tx);
+				     &msg, afs_notify_end_request_tx);
 	if (ret < 0)
 		goto error_do_abort;
 
@@ -379,7 +378,6 @@ void afs_make_call(struct afs_addr_cursor *ac, struct afs_call *call, gfp_t gfp)
 
 		ret = rxrpc_kernel_send_data(call->net->socket,
 					     call->rxcall, &msg,
-					     iov_iter_count(&msg.msg_iter),
 					     afs_notify_end_request_tx);
 		*call->write_iter = msg.msg_iter;
 
@@ -834,7 +832,7 @@ void afs_send_empty_reply(struct afs_call *call)
 	msg.msg_controllen	= 0;
 	msg.msg_flags		= 0;
 
-	switch (rxrpc_kernel_send_data(net->socket, call->rxcall, &msg, 0,
+	switch (rxrpc_kernel_send_data(net->socket, call->rxcall, &msg,
 				       afs_notify_end_reply_tx)) {
 	case 0:
 		_leave(" [replied]");
@@ -875,7 +873,7 @@ void afs_send_simple_reply(struct afs_call *call, const void *buf, size_t len)
 	msg.msg_controllen	= 0;
 	msg.msg_flags		= 0;
 
-	n = rxrpc_kernel_send_data(net->socket, call->rxcall, &msg, len,
+	n = rxrpc_kernel_send_data(net->socket, call->rxcall, &msg,
 				   afs_notify_end_reply_tx);
 	if (n >= 0) {
 		/* Success */
diff --git a/include/crypto/if_alg.h b/include/crypto/if_alg.h
index 7e76623f9ec3..bcf0077aae6d 100644
--- a/include/crypto/if_alg.h
+++ b/include/crypto/if_alg.h
@@ -228,8 +228,7 @@ void af_alg_pull_tsgl(struct sock *sk, size_t used, struct scatterlist *dst,
 		      size_t dst_offset);
 void af_alg_wmem_wakeup(struct sock *sk);
 int af_alg_wait_for_data(struct sock *sk, unsigned flags, unsigned min);
-int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
-		   unsigned int ivsize);
+int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, unsigned int ivsize);
 ssize_t af_alg_sendpage(struct socket *sock, struct page *page,
 			int offset, size_t size, int flags);
 void af_alg_free_resources(struct af_alg_async_req *areq);
diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
index 094b76dc7164..b176525025da 100644
--- a/include/linux/lsm_hook_defs.h
+++ b/include/linux/lsm_hook_defs.h
@@ -298,8 +298,7 @@ LSM_HOOK(int, 0, socket_connect, struct socket *sock, struct sockaddr *address,
 	 int addrlen)
 LSM_HOOK(int, 0, socket_listen, struct socket *sock, int backlog)
 LSM_HOOK(int, 0, socket_accept, struct socket *sock, struct socket *newsock)
-LSM_HOOK(int, 0, socket_sendmsg, struct socket *sock, struct msghdr *msg,
-	 int size)
+LSM_HOOK(int, 0, socket_sendmsg, struct socket *sock, struct msghdr *msg)
 LSM_HOOK(int, 0, socket_recvmsg, struct socket *sock, struct msghdr *msg,
 	 int size, int flags)
 LSM_HOOK(int, 0, socket_getsockname, struct socket *sock)
diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index 6e156d2acffc..6f48be80b6bf 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -932,7 +932,6 @@
  *	Check permission before transmitting a message to another socket.
  *	@sock contains the socket structure.
  *	@msg contains the message to be transmitted.
- *	@size contains the size of message.
  *	Return 0 if permission is granted.
  * @socket_recvmsg:
  *	Check permission before receiving a message from a socket.
diff --git a/include/linux/net.h b/include/linux/net.h
index b73ad8e3c212..8adf1328445a 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -192,8 +192,7 @@ struct proto_ops {
 	int		(*getsockopt)(struct socket *sock, int level,
 				      int optname, char __user *optval, int __user *optlen);
 	void		(*show_fdinfo)(struct seq_file *m, struct socket *sock);
-	int		(*sendmsg)   (struct socket *sock, struct msghdr *m,
-				      size_t total_len);
+	int		(*sendmsg)   (struct socket *sock, struct msghdr *m);
 	/* Notes for implementing recvmsg:
 	 * ===============================
 	 * msg->msg_namelen should get updated by the recvmsg handlers
@@ -222,8 +221,7 @@ struct proto_ops {
 	int		(*read_skb)(struct sock *sk, skb_read_actor_t recv_actor);
 	int		(*sendpage_locked)(struct sock *sk, struct page *page,
 					   int offset, size_t size, int flags);
-	int		(*sendmsg_locked)(struct sock *sk, struct msghdr *msg,
-					  size_t size);
+	int		(*sendmsg_locked)(struct sock *sk, struct msghdr *msg);
 	int		(*set_rcvlowat)(struct sock *sk, int val);
 };
 
diff --git a/include/linux/security.h b/include/linux/security.h
index 5984d0d550b4..6c67a4de4a89 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -1436,7 +1436,7 @@ int security_socket_bind(struct socket *sock, struct sockaddr *address, int addr
 int security_socket_connect(struct socket *sock, struct sockaddr *address, int addrlen);
 int security_socket_listen(struct socket *sock, int backlog);
 int security_socket_accept(struct socket *sock, struct socket *newsock);
-int security_socket_sendmsg(struct socket *sock, struct msghdr *msg, int size);
+int security_socket_sendmsg(struct socket *sock, struct msghdr *msg);
 int security_socket_recvmsg(struct socket *sock, struct msghdr *msg,
 			    int size, int flags);
 int security_socket_getsockname(struct socket *sock);
@@ -1538,7 +1538,7 @@ static inline int security_socket_accept(struct socket *sock,
 }
 
 static inline int security_socket_sendmsg(struct socket *sock,
-					  struct msghdr *msg, int size)
+					  struct msghdr *msg)
 {
 	return 0;
 }
diff --git a/include/net/af_rxrpc.h b/include/net/af_rxrpc.h
index ba717eac0229..33f1b8c622e3 100644
--- a/include/net/af_rxrpc.h
+++ b/include/net/af_rxrpc.h
@@ -51,8 +51,7 @@ struct rxrpc_call *rxrpc_kernel_begin_call(struct socket *,
 					   enum rxrpc_interruptibility,
 					   unsigned int);
 int rxrpc_kernel_send_data(struct socket *, struct rxrpc_call *,
-			   struct msghdr *, size_t,
-			   rxrpc_notify_end_tx_t);
+			   struct msghdr *, rxrpc_notify_end_tx_t);
 int rxrpc_kernel_recv_data(struct socket *, struct rxrpc_call *,
 			   struct iov_iter *, size_t *, bool, u32 *, u16 *);
 bool rxrpc_kernel_abort_call(struct socket *, struct rxrpc_call *,
diff --git a/include/net/inet_common.h b/include/net/inet_common.h
index cec453c18f1d..ec798fdd371c 100644
--- a/include/net/inet_common.h
+++ b/include/net/inet_common.h
@@ -32,7 +32,7 @@ int inet_dgram_connect(struct socket *sock, struct sockaddr *uaddr,
 int inet_accept(struct socket *sock, struct socket *newsock, int flags,
 		bool kern);
 int inet_send_prepare(struct sock *sk);
-int inet_sendmsg(struct socket *sock, struct msghdr *msg, size_t size);
+int inet_sendmsg(struct socket *sock, struct msghdr *msg);
 ssize_t inet_sendpage(struct socket *sock, struct page *page, int offset,
 		      size_t size, int flags);
 int inet_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 7332296eca44..f2132311e92b 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -1228,7 +1228,7 @@ int inet6_compat_ioctl(struct socket *sock, unsigned int cmd,
 
 int inet6_hash_connect(struct inet_timewait_death_row *death_row,
 			      struct sock *sk);
-int inet6_sendmsg(struct socket *sock, struct msghdr *msg, size_t size);
+int inet6_sendmsg(struct socket *sock, struct msghdr *msg);
 int inet6_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
 		  int flags);
 
diff --git a/include/net/ping.h b/include/net/ping.h
index 9233ad3de0ad..04814edde8e3 100644
--- a/include/net/ping.h
+++ b/include/net/ping.h
@@ -70,7 +70,7 @@ int  ping_getfrag(void *from, char *to, int offset, int fraglen, int odd,
 
 int  ping_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
 		  int flags, int *addr_len);
-int  ping_common_sendmsg(int family, struct msghdr *msg, size_t len,
+int  ping_common_sendmsg(int family, struct msghdr *msg,
 			 void *user_icmph, size_t icmph_len);
 int  ping_queue_rcv_skb(struct sock *sk, struct sk_buff *skb);
 enum skb_drop_reason ping_rcv(struct sk_buff *skb);
diff --git a/include/net/sock.h b/include/net/sock.h
index 573f2bf7e0de..7a6d06c181b6 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1261,8 +1261,7 @@ struct proto {
 	int			(*compat_ioctl)(struct sock *sk,
 					unsigned int cmd, unsigned long arg);
 #endif
-	int			(*sendmsg)(struct sock *sk, struct msghdr *msg,
-					   size_t len);
+	int			(*sendmsg)(struct sock *sk, struct msghdr *msg);
 	int			(*recvmsg)(struct sock *sk, struct msghdr *msg,
 					   size_t len, int flags, int *addr_len);
 	int			(*sendpage)(struct sock *sk, struct page *page,
@@ -1901,8 +1900,8 @@ int sock_no_getname(struct socket *, struct sockaddr *, int);
 int sock_no_ioctl(struct socket *, unsigned int, unsigned long);
 int sock_no_listen(struct socket *, int);
 int sock_no_shutdown(struct socket *, int);
-int sock_no_sendmsg(struct socket *, struct msghdr *, size_t);
-int sock_no_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t len);
+int sock_no_sendmsg(struct socket *sock, struct msghdr *msg);
+int sock_no_sendmsg_locked(struct sock *sk, struct msghdr *msg);
 int sock_no_recvmsg(struct socket *, struct msghdr *, size_t, int);
 int sock_no_mmap(struct file *file, struct socket *sock,
 		 struct vm_area_struct *vma);
diff --git a/include/net/tcp.h b/include/net/tcp.h
index a0a91a988272..12b228e3d563 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -325,10 +325,10 @@ int tcp_v4_rcv(struct sk_buff *skb);
 
 void tcp_remove_empty_skb(struct sock *sk);
 int tcp_v4_tw_remember_stamp(struct inet_timewait_sock *tw);
-int tcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
-int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size);
+int tcp_sendmsg(struct sock *sk, struct msghdr *msg);
+int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg);
 int tcp_sendmsg_fastopen(struct sock *sk, struct msghdr *msg, int *copied,
-			 size_t size, struct ubuf_info *uarg);
+			 struct ubuf_info *uarg);
 int tcp_sendpage(struct sock *sk, struct page *page, int offset, size_t size,
 		 int flags);
 int tcp_sendpage_locked(struct sock *sk, struct page *page, int offset,
@@ -479,7 +479,7 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst,
 int tcp_disconnect(struct sock *sk, int flags);
 
 void tcp_finish_connect(struct sock *sk, struct sk_buff *skb);
-int tcp_send_rcvq(struct sock *sk, struct msghdr *msg, size_t size);
+int tcp_send_rcvq(struct sock *sk, struct msghdr *msg);
 void inet_sk_rx_dst_set(struct sock *sk, const struct sk_buff *skb);
 
 /* From syncookies.c */
diff --git a/include/net/udp.h b/include/net/udp.h
index de4b528522bb..b9b2ea5af42d 100644
--- a/include/net/udp.h
+++ b/include/net/udp.h
@@ -277,7 +277,7 @@ int udp_get_port(struct sock *sk, unsigned short snum,
 				  const struct sock *));
 int udp_err(struct sk_buff *, u32);
 int udp_abort(struct sock *sk, int err);
-int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len);
+int udp_sendmsg(struct sock *sk, struct msghdr *msg);
 int udp_push_pending_frames(struct sock *sk);
 void udp_flush_pending_frames(struct sock *sk);
 int udp_cmsg_send(struct sock *sk, struct msghdr *msg, u16 *gso_size);
diff --git a/net/appletalk/ddp.c b/net/appletalk/ddp.c
index a06f4d4a6f47..70008c57503f 100644
--- a/net/appletalk/ddp.c
+++ b/net/appletalk/ddp.c
@@ -1566,7 +1566,7 @@ static int ltalk_rcv(struct sk_buff *skb, struct net_device *dev,
 	return 0;
 }
 
-static int atalk_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int atalk_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct atalk_sock *at = at_sk(sk);
@@ -1579,6 +1579,7 @@ static int atalk_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 	struct ddpehdr *ddp;
 	int size, hard_header_len;
 	struct atalk_route *rt, *rt_lo = NULL;
+	size_t len = msg_data_left(msg);
 	int err;
 
 	if (flags & ~(MSG_DONTWAIT|MSG_CMSG_COMPAT))
diff --git a/net/atm/common.c b/net/atm/common.c
index f7019df41c3e..09060644760b 100644
--- a/net/atm/common.c
+++ b/net/atm/common.c
@@ -565,12 +565,13 @@ int vcc_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
 	return copied;
 }
 
-int vcc_sendmsg(struct socket *sock, struct msghdr *m, size_t size)
+int vcc_sendmsg(struct socket *sock, struct msghdr *m)
 {
 	struct sock *sk = sock->sk;
 	DEFINE_WAIT(wait);
 	struct atm_vcc *vcc;
 	struct sk_buff *skb;
+	size_t size = msg_data_left(m);
 	int eff, error;
 
 	lock_sock(sk);
diff --git a/net/atm/common.h b/net/atm/common.h
index a1e56e8de698..6597f8308f03 100644
--- a/net/atm/common.h
+++ b/net/atm/common.h
@@ -16,7 +16,7 @@ int vcc_release(struct socket *sock);
 int vcc_connect(struct socket *sock, int itf, short vpi, int vci);
 int vcc_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
 		int flags);
-int vcc_sendmsg(struct socket *sock, struct msghdr *m, size_t total_len);
+int vcc_sendmsg(struct socket *sock, struct msghdr *m);
 __poll_t vcc_poll(struct file *file, struct socket *sock, poll_table *wait);
 int vcc_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg);
 int vcc_compat_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg);
diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c
index d8da400cb4de..48f96e28f7ea 100644
--- a/net/ax25/af_ax25.c
+++ b/net/ax25/af_ax25.c
@@ -1489,7 +1489,7 @@ static int ax25_getname(struct socket *sock, struct sockaddr *uaddr,
 	return err;
 }
 
-static int ax25_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int ax25_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	DECLARE_SOCKADDR(struct sockaddr_ax25 *, usax, msg->msg_name);
 	struct sock *sk = sock->sk;
@@ -1497,7 +1497,7 @@ static int ax25_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 	struct sk_buff *skb;
 	ax25_digi dtmp, *dp;
 	ax25_cb *ax25;
-	size_t size;
+	size_t size, len = msg_data_left(msg);
 	int lv, err, addr_len = msg->msg_namelen;
 
 	if (msg->msg_flags & ~(MSG_DONTWAIT|MSG_EOR|MSG_CMSG_COMPAT))
diff --git a/net/bluetooth/hci_sock.c b/net/bluetooth/hci_sock.c
index 06581223238c..9d6f713eeac1 100644
--- a/net/bluetooth/hci_sock.c
+++ b/net/bluetooth/hci_sock.c
@@ -1692,8 +1692,7 @@ static int hci_logging_frame(struct sock *sk, struct sk_buff *skb,
 	return err;
 }
 
-static int hci_sock_sendmsg(struct socket *sock, struct msghdr *msg,
-			    size_t len)
+static int hci_sock_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct hci_mgmt_chan *chan;
@@ -1701,6 +1700,7 @@ static int hci_sock_sendmsg(struct socket *sock, struct msghdr *msg,
 	struct sk_buff *skb;
 	int err;
 	const unsigned int flags = msg->msg_flags;
+	size_t len = msg_data_left(msg);
 
 	BT_DBG("sock %p sk %p", sock, sk);
 
diff --git a/net/bluetooth/iso.c b/net/bluetooth/iso.c
index 24444b502e58..6d8863878abc 100644
--- a/net/bluetooth/iso.c
+++ b/net/bluetooth/iso.c
@@ -1031,12 +1031,12 @@ static int iso_sock_getname(struct socket *sock, struct sockaddr *addr,
 	return sizeof(struct sockaddr_iso);
 }
 
-static int iso_sock_sendmsg(struct socket *sock, struct msghdr *msg,
-			    size_t len)
+static int iso_sock_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct iso_conn *conn = iso_pi(sk)->conn;
 	struct sk_buff *skb, **frag;
+	size_t len = msg_data_left(msg);
 	int err;
 
 	BT_DBG("sock %p, sk %p", sock, sk);
diff --git a/net/bluetooth/l2cap_sock.c b/net/bluetooth/l2cap_sock.c
index eebe256104bc..d488aca82037 100644
--- a/net/bluetooth/l2cap_sock.c
+++ b/net/bluetooth/l2cap_sock.c
@@ -1143,8 +1143,7 @@ static int l2cap_sock_setsockopt(struct socket *sock, int level, int optname,
 	return err;
 }
 
-static int l2cap_sock_sendmsg(struct socket *sock, struct msghdr *msg,
-			      size_t len)
+static int l2cap_sock_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct l2cap_chan *chan = l2cap_pi(sk)->chan;
@@ -1169,7 +1168,7 @@ static int l2cap_sock_sendmsg(struct socket *sock, struct msghdr *msg,
 		return err;
 
 	l2cap_chan_lock(chan);
-	err = l2cap_chan_send(chan, msg, len);
+	err = l2cap_chan_send(chan, msg, msg_data_left(msg));
 	l2cap_chan_unlock(chan);
 
 	return err;
diff --git a/net/bluetooth/rfcomm/sock.c b/net/bluetooth/rfcomm/sock.c
index 4397e14ff560..8a0a51b5c3a3 100644
--- a/net/bluetooth/rfcomm/sock.c
+++ b/net/bluetooth/rfcomm/sock.c
@@ -558,8 +558,7 @@ static int rfcomm_sock_getname(struct socket *sock, struct sockaddr *addr, int p
 	return sizeof(struct sockaddr_rc);
 }
 
-static int rfcomm_sock_sendmsg(struct socket *sock, struct msghdr *msg,
-			       size_t len)
+static int rfcomm_sock_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct rfcomm_dlc *d = rfcomm_pi(sk)->dlc;
@@ -586,8 +585,8 @@ static int rfcomm_sock_sendmsg(struct socket *sock, struct msghdr *msg,
 	if (sent)
 		return sent;
 
-	skb = bt_skb_sendmmsg(sk, msg, len, d->mtu, RFCOMM_SKB_HEAD_RESERVE,
-			      RFCOMM_SKB_TAIL_RESERVE);
+	skb = bt_skb_sendmmsg(sk, msg, msg_data_left(msg), d->mtu,
+			      RFCOMM_SKB_HEAD_RESERVE, RFCOMM_SKB_TAIL_RESERVE);
 	if (IS_ERR(skb))
 		return PTR_ERR(skb);
 
diff --git a/net/bluetooth/sco.c b/net/bluetooth/sco.c
index 1111da4e2f2b..8c62c5dc5b57 100644
--- a/net/bluetooth/sco.c
+++ b/net/bluetooth/sco.c
@@ -722,11 +722,11 @@ static int sco_sock_getname(struct socket *sock, struct sockaddr *addr,
 	return sizeof(struct sockaddr_sco);
 }
 
-static int sco_sock_sendmsg(struct socket *sock, struct msghdr *msg,
-			    size_t len)
+static int sco_sock_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct sk_buff *skb;
+	size_t len = msg_data_left(msg);
 	int err;
 
 	BT_DBG("sock %p, sk %p", sock, sk);
diff --git a/net/caif/caif_socket.c b/net/caif/caif_socket.c
index 4eebcc66c19a..827230b3f7c3 100644
--- a/net/caif/caif_socket.c
+++ b/net/caif/caif_socket.c
@@ -510,8 +510,7 @@ static int transmit_skb(struct sk_buff *skb, struct caifsock *cf_sk,
 }
 
 /* Copied from af_unix:unix_dgram_sendmsg, and adapted to CAIF */
-static int caif_seqpkt_sendmsg(struct socket *sock, struct msghdr *msg,
-			       size_t len)
+static int caif_seqpkt_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct caifsock *cf_sk = container_of(sk, struct caifsock, sk);
@@ -520,6 +519,8 @@ static int caif_seqpkt_sendmsg(struct socket *sock, struct msghdr *msg,
 	struct sk_buff *skb = NULL;
 	int noblock;
 	long timeo;
+	size_t len = msg_data_left(msg);
+
 	caif_assert(cf_sk);
 	ret = sock_error(sk);
 	if (ret)
@@ -582,8 +583,7 @@ static int caif_seqpkt_sendmsg(struct socket *sock, struct msghdr *msg,
  * Changed removed permission handling and added waiting for flow on
  * and other minor adaptations.
  */
-static int caif_stream_sendmsg(struct socket *sock, struct msghdr *msg,
-			       size_t len)
+static int caif_stream_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct caifsock *cf_sk = container_of(sk, struct caifsock, sk);
@@ -605,10 +605,7 @@ static int caif_stream_sendmsg(struct socket *sock, struct msghdr *msg,
 	if (unlikely(sk->sk_shutdown & SEND_SHUTDOWN))
 		goto pipe_err;
 
-	while (sent < len) {
-
-		size = len-sent;
-
+	while ((size = msg_data_left(msg))) {
 		if (size > cf_sk->maxframe)
 			size = cf_sk->maxframe;
 
diff --git a/net/can/bcm.c b/net/can/bcm.c
index 27706f6ace34..9baace5e0d71 100644
--- a/net/can/bcm.c
+++ b/net/can/bcm.c
@@ -1287,12 +1287,13 @@ static int bcm_tx_send(struct msghdr *msg, int ifindex, struct sock *sk,
 /*
  * bcm_sendmsg - process BCM commands (opcodes) from the userspace
  */
-static int bcm_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
+static int bcm_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct bcm_sock *bo = bcm_sk(sk);
 	int ifindex = bo->ifindex; /* default ifindex for this bcm_op */
 	struct bcm_msg_head msg_head;
+	size_t size = msg_data_left(msg);
 	int cfsiz;
 	int ret; /* read bytes or error codes as return value */
 
diff --git a/net/can/isotp.c b/net/can/isotp.c
index 9bc344851704..6b5d3ebd6748 100644
--- a/net/can/isotp.c
+++ b/net/can/isotp.c
@@ -914,7 +914,7 @@ static enum hrtimer_restart isotp_txfr_timer_handler(struct hrtimer *hrtimer)
 	return HRTIMER_NORESTART;
 }
 
-static int isotp_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
+static int isotp_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct isotp_sock *so = isotp_sk(sk);
@@ -922,6 +922,7 @@ static int isotp_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
 	struct sk_buff *skb;
 	struct net_device *dev;
 	struct canfd_frame *cf;
+	size_t size = msg_data_left(msg);
 	int ae = (so->opt.flags & CAN_ISOTP_EXTEND_ADDR) ? 1 : 0;
 	int wait_tx_done = (so->opt.flags & CAN_ISOTP_WAIT_TX_DONE) ? 1 : 0;
 	s64 hrtimer_sec = ISOTP_ECHO_TIMEOUT;
diff --git a/net/can/j1939/socket.c b/net/can/j1939/socket.c
index 7e90f9e61d9b..2b009b69e853 100644
--- a/net/can/j1939/socket.c
+++ b/net/can/j1939/socket.c
@@ -1187,12 +1187,12 @@ static int j1939_sk_send_loop(struct j1939_priv *priv,  struct sock *sk,
 	return ret;
 }
 
-static int j1939_sk_sendmsg(struct socket *sock, struct msghdr *msg,
-			    size_t size)
+static int j1939_sk_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct j1939_sock *jsk = j1939_sk(sk);
 	struct j1939_priv *priv;
+	size_t size = msg_data_left(msg);
 	int ifindex;
 	int ret;
 
diff --git a/net/can/raw.c b/net/can/raw.c
index f64469b98260..0c37f1c70685 100644
--- a/net/can/raw.c
+++ b/net/can/raw.c
@@ -814,13 +814,14 @@ static bool raw_bad_txframe(struct raw_sock *ro, struct sk_buff *skb, int mtu)
 	return true;
 }
 
-static int raw_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
+static int raw_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct raw_sock *ro = raw_sk(sk);
 	struct sockcm_cookie sockc;
 	struct sk_buff *skb;
 	struct net_device *dev;
+	size_t size = msg_data_left(msg);
 	int ifindex;
 	int err = -EINVAL;
 
diff --git a/net/core/sock.c b/net/core/sock.c
index c25888795390..4170381356aa 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -3183,13 +3183,13 @@ int sock_no_shutdown(struct socket *sock, int how)
 }
 EXPORT_SYMBOL(sock_no_shutdown);
 
-int sock_no_sendmsg(struct socket *sock, struct msghdr *m, size_t len)
+int sock_no_sendmsg(struct socket *sock, struct msghdr *m)
 {
 	return -EOPNOTSUPP;
 }
 EXPORT_SYMBOL(sock_no_sendmsg);
 
-int sock_no_sendmsg_locked(struct sock *sk, struct msghdr *m, size_t len)
+int sock_no_sendmsg_locked(struct sock *sk, struct msghdr *m)
 {
 	return -EOPNOTSUPP;
 }
diff --git a/net/dccp/dccp.h b/net/dccp/dccp.h
index 9ddc3a9e89e4..3d5d7615ddd8 100644
--- a/net/dccp/dccp.h
+++ b/net/dccp/dccp.h
@@ -293,7 +293,7 @@ int dccp_getsockopt(struct sock *sk, int level, int optname,
 int dccp_setsockopt(struct sock *sk, int level, int optname,
 		    sockptr_t optval, unsigned int optlen);
 int dccp_ioctl(struct sock *sk, int cmd, unsigned long arg);
-int dccp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
+int dccp_sendmsg(struct sock *sk, struct msghdr *msg);
 int dccp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int flags,
 		 int *addr_len);
 void dccp_shutdown(struct sock *sk, int how);
diff --git a/net/dccp/proto.c b/net/dccp/proto.c
index a06b5641287a..6f6623bb1ff8 100644
--- a/net/dccp/proto.c
+++ b/net/dccp/proto.c
@@ -725,12 +725,13 @@ static int dccp_msghdr_parse(struct msghdr *msg, struct sk_buff *skb)
 	return 0;
 }
 
-int dccp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+int dccp_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	const struct dccp_sock *dp = dccp_sk(sk);
 	const int flags = msg->msg_flags;
 	const int noblock = flags & MSG_DONTWAIT;
 	struct sk_buff *skb;
+	size_t len = msg_data_left(msg);
 	int rc, size;
 	long timeo;
 
diff --git a/net/ieee802154/socket.c b/net/ieee802154/socket.c
index 1fa2fe041ec0..70f2948b7946 100644
--- a/net/ieee802154/socket.c
+++ b/net/ieee802154/socket.c
@@ -88,12 +88,11 @@ static int ieee802154_sock_release(struct socket *sock)
 	return 0;
 }
 
-static int ieee802154_sock_sendmsg(struct socket *sock, struct msghdr *msg,
-				   size_t len)
+static int ieee802154_sock_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 
-	return sk->sk_prot->sendmsg(sk, msg, len);
+	return sk->sk_prot->sendmsg(sk, msg);
 }
 
 static int ieee802154_sock_bind(struct socket *sock, struct sockaddr *uaddr,
@@ -238,11 +237,12 @@ static int raw_disconnect(struct sock *sk, int flags)
 	return 0;
 }
 
-static int raw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
+static int raw_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct net_device *dev;
 	unsigned int mtu;
 	struct sk_buff *skb;
+	size_t size = msg_data_left(msg);
 	int hlen, tlen;
 	int err;
 
@@ -605,7 +605,7 @@ static int dgram_disconnect(struct sock *sk, int flags)
 	return 0;
 }
 
-static int dgram_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
+static int dgram_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct net_device *dev;
 	unsigned int mtu;
@@ -614,6 +614,7 @@ static int dgram_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 	struct dgram_sock *ro = dgram_sk(sk);
 	struct ieee802154_addr dst_addr;
 	DECLARE_SOCKADDR(struct sockaddr_ieee802154*, daddr, msg->msg_name);
+	size_t size = msg_data_left(msg);
 	int hlen, tlen;
 	int err;
 
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 940062e08f57..4facfef8bded 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -815,7 +815,7 @@ int inet_send_prepare(struct sock *sk)
 }
 EXPORT_SYMBOL_GPL(inet_send_prepare);
 
-int inet_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
+int inet_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 
@@ -823,7 +823,7 @@ int inet_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
 		return -EAGAIN;
 
 	return INDIRECT_CALL_2(sk->sk_prot->sendmsg, tcp_sendmsg, udp_sendmsg,
-			       sk, msg, size);
+			       sk, msg);
 }
 EXPORT_SYMBOL(inet_sendmsg);
 
diff --git a/net/ipv4/ping.c b/net/ipv4/ping.c
index 409ec2a1f95b..f689f9f530c9 100644
--- a/net/ipv4/ping.c
+++ b/net/ipv4/ping.c
@@ -657,9 +657,10 @@ static int ping_v4_push_pending_frames(struct sock *sk, struct pingfakehdr *pfh,
 	return ip_push_pending_frames(sk, fl4);
 }
 
-int ping_common_sendmsg(int family, struct msghdr *msg, size_t len,
+int ping_common_sendmsg(int family, struct msghdr *msg,
 			void *user_icmph, size_t icmph_len)
 {
+	size_t len = msg_data_left(msg);
 	u8 type, code;
 
 	if (len > 0xFFFF)
@@ -703,7 +704,7 @@ int ping_common_sendmsg(int family, struct msghdr *msg, size_t len,
 }
 EXPORT_SYMBOL_GPL(ping_common_sendmsg);
 
-static int ping_v4_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+static int ping_v4_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct net *net = sock_net(sk);
 	struct flowi4 fl4;
@@ -713,6 +714,7 @@ static int ping_v4_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	struct pingfakehdr pfh;
 	struct rtable *rt = NULL;
 	struct ip_options_data opt_copy;
+	size_t len = msg_data_left(msg);
 	int free = 0;
 	__be32 saddr, daddr, faddr;
 	u8  tos;
@@ -720,7 +722,7 @@ static int ping_v4_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 
 	pr_debug("ping_v4_sendmsg(sk=%p,sk->num=%u)\n", inet, inet->inet_num);
 
-	err = ping_common_sendmsg(AF_INET, msg, len, &user_icmph,
+	err = ping_common_sendmsg(AF_INET, msg, &user_icmph,
 				  sizeof(user_icmph));
 	if (err)
 		return err;
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index 3cf68695b40d..f2859c117796 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -471,7 +471,7 @@ static int raw_getfrag(void *from, char *to, int offset, int len, int odd,
 	return ip_generic_getfrag(rfv->msg, to, offset, len, odd, skb);
 }
 
-static int raw_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+static int raw_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct inet_sock *inet = inet_sk(sk);
 	struct net *net = sock_net(sk);
@@ -485,6 +485,7 @@ static int raw_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	int err;
 	struct ip_options_data opt_copy;
 	struct raw_frag_vec rfv;
+	size_t len = msg_data_left(msg);
 	int hdrincl;
 
 	err = -EMSGSIZE;
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index fd68d49490f2..2a98b104892c 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1166,7 +1166,7 @@ void tcp_free_fastopen_req(struct tcp_sock *tp)
 }
 
 int tcp_sendmsg_fastopen(struct sock *sk, struct msghdr *msg, int *copied,
-			 size_t size, struct ubuf_info *uarg)
+			 struct ubuf_info *uarg)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct inet_sock *inet = inet_sk(sk);
@@ -1186,7 +1186,7 @@ int tcp_sendmsg_fastopen(struct sock *sk, struct msghdr *msg, int *copied,
 	if (unlikely(!tp->fastopen_req))
 		return -ENOBUFS;
 	tp->fastopen_req->data = msg;
-	tp->fastopen_req->size = size;
+	tp->fastopen_req->size = msg_data_left(msg);
 	tp->fastopen_req->uarg = uarg;
 
 	if (inet->defer_connect) {
@@ -1212,12 +1212,13 @@ int tcp_sendmsg_fastopen(struct sock *sk, struct msghdr *msg, int *copied,
 	return err;
 }
 
-int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
+int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct ubuf_info *uarg = NULL;
 	struct sk_buff *skb;
 	struct sockcm_cookie sockc;
+	size_t size = msg_data_left(msg);
 	int flags, err, copied = 0;
 	int mss_now = 0, size_goal, copied_syn = 0;
 	int process_backlog = 0;
@@ -1226,7 +1227,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
 
 	flags = msg->msg_flags;
 
-	if ((flags & MSG_ZEROCOPY) && size) {
+	if ((flags & MSG_ZEROCOPY) && msg_data_left(msg)) {
 		skb = tcp_write_queue_tail(sk);
 
 		if (msg->msg_ubuf) {
@@ -1247,7 +1248,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
 
 	if (unlikely(flags & MSG_FASTOPEN || inet_sk(sk)->defer_connect) &&
 	    !tp->repair) {
-		err = tcp_sendmsg_fastopen(sk, msg, &copied_syn, size, uarg);
+		err = tcp_sendmsg_fastopen(sk, msg, &copied_syn, uarg);
 		if (err == -EINPROGRESS && copied_syn > 0)
 			goto out;
 		else if (err)
@@ -1271,7 +1272,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
 
 	if (unlikely(tp->repair)) {
 		if (tp->repair_queue == TCP_RECV_QUEUE) {
-			copied = tcp_send_rcvq(sk, msg, size);
+			copied = tcp_send_rcvq(sk, msg);
 			goto out_nopush;
 		}
 
@@ -1477,12 +1478,12 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
 }
 EXPORT_SYMBOL_GPL(tcp_sendmsg_locked);
 
-int tcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
+int tcp_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	int ret;
 
 	lock_sock(sk);
-	ret = tcp_sendmsg_locked(sk, msg, size);
+	ret = tcp_sendmsg_locked(sk, msg);
 	release_sock(sk);
 
 	return ret;
diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index ebf917511937..843eb2b6b8d3 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -396,9 +396,10 @@ static int tcp_bpf_send_verdict(struct sock *sk, struct sk_psock *psock,
 	return ret;
 }
 
-static int tcp_bpf_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
+static int tcp_bpf_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct sk_msg tmp, *msg_tx = NULL;
+	size_t size = msg_data_left(msg);
 	int copied = 0, err = 0;
 	struct sk_psock *psock;
 	long timeo;
@@ -410,7 +411,7 @@ static int tcp_bpf_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 
 	psock = sk_psock_get(sk);
 	if (unlikely(!psock))
-		return tcp_sendmsg(sk, msg, size);
+		return tcp_sendmsg(sk, msg);
 
 	lock_sock(sk);
 	timeo = sock_sndtimeo(sk, msg->msg_flags & MSG_DONTWAIT);
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 2b75cd9e2e92..a1c7d834abca 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4948,9 +4948,10 @@ static int __must_check tcp_queue_rcv(struct sock *sk, struct sk_buff *skb,
 	return eaten;
 }
 
-int tcp_send_rcvq(struct sock *sk, struct msghdr *msg, size_t size)
+int tcp_send_rcvq(struct sock *sk, struct msghdr *msg)
 {
 	struct sk_buff *skb;
+	size_t size = msg_data_left(msg);
 	int err = -ENOMEM;
 	int data_len = 0;
 	bool fragstolen;
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index aa32afd871ee..b2ed9d37a362 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1049,13 +1049,14 @@ int udp_cmsg_send(struct sock *sk, struct msghdr *msg, u16 *gso_size)
 }
 EXPORT_SYMBOL_GPL(udp_cmsg_send);
 
-int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+int udp_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct inet_sock *inet = inet_sk(sk);
 	struct udp_sock *up = udp_sk(sk);
 	DECLARE_SOCKADDR(struct sockaddr_in *, usin, msg->msg_name);
 	struct flowi4 fl4_stack;
 	struct flowi4 *fl4;
+	size_t len = msg_data_left(msg);
 	int ulen = len;
 	struct ipcm_cookie ipc;
 	struct rtable *rt = NULL;
@@ -1346,7 +1347,7 @@ int udp_sendpage(struct sock *sk, struct page *page, int offset,
 		 * sendpage interface can't pass.
 		 * This will succeed only when the socket is connected.
 		 */
-		ret = udp_sendmsg(sk, &msg, 0);
+		ret = udp_sendmsg(sk, &msg);
 		if (ret < 0)
 			return ret;
 	}
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index e1b679a590c9..d6b4cfc44e2a 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -636,9 +636,8 @@ int inet6_compat_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg)
 EXPORT_SYMBOL_GPL(inet6_compat_ioctl);
 #endif /* CONFIG_COMPAT */
 
-INDIRECT_CALLABLE_DECLARE(int udpv6_sendmsg(struct sock *, struct msghdr *,
-					    size_t));
-int inet6_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
+INDIRECT_CALLABLE_DECLARE(int udpv6_sendmsg(struct sock *, struct msghdr *));
+int inet6_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	const struct proto *prot;
@@ -649,7 +648,7 @@ int inet6_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
 	/* IPV6_ADDRFORM can change sk->sk_prot under us. */
 	prot = READ_ONCE(sk->sk_prot);
 	return INDIRECT_CALL_2(prot->sendmsg, tcp_sendmsg, udpv6_sendmsg,
-			       sk, msg, size);
+			       sk, msg);
 }
 
 INDIRECT_CALLABLE_DECLARE(int udpv6_recvmsg(struct sock *, struct msghdr *,
diff --git a/net/ipv6/ping.c b/net/ipv6/ping.c
index c4835dbdfcff..54c94b28744f 100644
--- a/net/ipv6/ping.c
+++ b/net/ipv6/ping.c
@@ -59,7 +59,7 @@ static int ping_v6_pre_connect(struct sock *sk, struct sockaddr *uaddr,
 	return BPF_CGROUP_RUN_PROG_INET6_CONNECT_LOCK(sk, uaddr);
 }
 
-static int ping_v6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+static int ping_v6_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct inet_sock *inet = inet_sk(sk);
 	struct ipv6_pinfo *np = inet6_sk(sk);
@@ -73,8 +73,9 @@ static int ping_v6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	struct rt6_info *rt;
 	struct pingfakehdr pfh;
 	struct ipcm6_cookie ipc6;
+	size_t len = msg_data_left(msg);
 
-	err = ping_common_sendmsg(AF_INET6, msg, len, &user_icmph,
+	err = ping_common_sendmsg(AF_INET6, msg, &user_icmph,
 				  sizeof(user_icmph));
 	if (err)
 		return err;
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 6ac2f2690c44..a3437deeeb74 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -735,7 +735,7 @@ static int raw6_getfrag(void *from, char *to, int offset, int len, int odd,
 	return ip_generic_getfrag(rfv->msg, to, offset, len, odd, skb);
 }
 
-static int rawv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+static int rawv6_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct ipv6_txoptions *opt_to_free = NULL;
 	struct ipv6_txoptions opt_space;
@@ -751,6 +751,7 @@ static int rawv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	struct flowi6 fl6;
 	struct ipcm6_cookie ipc6;
 	int addr_len = msg->msg_namelen;
+	size_t len = msg_data_left(msg);
 	int hdrincl;
 	u16 proto;
 	int err;
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index d350e57c4792..80f2eb58ba1a 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -1326,7 +1326,7 @@ static int udp_v6_push_pending_frames(struct sock *sk)
 	return err;
 }
 
-int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+int udpv6_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct ipv6_txoptions opt_space;
 	struct udp_sock *up = udp_sk(sk);
@@ -1343,6 +1343,7 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	struct ipcm6_cookie ipc6;
 	int addr_len = msg->msg_namelen;
 	bool connected = false;
+	size_t len = msg_data_left(msg);
 	int ulen = len;
 	int corkreq = READ_ONCE(up->corkflag) || msg->msg_flags&MSG_MORE;
 	int err;
@@ -1397,7 +1398,7 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 do_udp_sendmsg:
 			if (ipv6_only_sock(sk))
 				return -ENETUNREACH;
-			return udp_sendmsg(sk, msg, len);
+			return udp_sendmsg(sk, msg);
 		}
 	}
 
@@ -1410,7 +1411,7 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	getfrag  =  is_udplite ?  udplite_getfrag : ip_generic_getfrag;
 	if (up->pending) {
 		if (up->pending == AF_INET)
-			return udp_sendmsg(sk, msg, len);
+			return udp_sendmsg(sk, msg);
 		/*
 		 * There are pending frames.
 		 * The socket lock must be held while it's corked.
diff --git a/net/ipv6/udp_impl.h b/net/ipv6/udp_impl.h
index 0590f566379d..c905a5cb34af 100644
--- a/net/ipv6/udp_impl.h
+++ b/net/ipv6/udp_impl.h
@@ -20,7 +20,7 @@ int udpv6_getsockopt(struct sock *sk, int level, int optname,
 		     char __user *optval, int __user *optlen);
 int udpv6_setsockopt(struct sock *sk, int level, int optname, sockptr_t optval,
 		     unsigned int optlen);
-int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len);
+int udpv6_sendmsg(struct sock *sk, struct msghdr *msg);
 int udpv6_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int flags,
 		  int *addr_len);
 void udpv6_destroy_sock(struct sock *sk);
diff --git a/net/iucv/af_iucv.c b/net/iucv/af_iucv.c
index 498a0c35b7bb..d963d245a4e2 100644
--- a/net/iucv/af_iucv.c
+++ b/net/iucv/af_iucv.c
@@ -895,8 +895,7 @@ static int iucv_send_iprm(struct iucv_path *path, struct iucv_message *msg,
 				 (void *) prmdata, 8);
 }
 
-static int iucv_sock_sendmsg(struct socket *sock, struct msghdr *msg,
-			     size_t len)
+static int iucv_sock_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct iucv_sock *iucv = iucv_sk(sk);
@@ -905,6 +904,7 @@ static int iucv_sock_sendmsg(struct socket *sock, struct msghdr *msg,
 	struct sk_buff *skb;
 	struct iucv_message txmsg = {0};
 	struct cmsghdr *cmsg;
+	size_t len = msg_data_left(msg);
 	int cmsg_done;
 	long timeo;
 	char user_id[9];
diff --git a/net/kcm/kcmsock.c b/net/kcm/kcmsock.c
index cfe828bd7fc6..caf13ed1bfeb 100644
--- a/net/kcm/kcmsock.c
+++ b/net/kcm/kcmsock.c
@@ -904,7 +904,7 @@ static ssize_t kcm_sendpage(struct socket *sock, struct page *page,
 	return err;
 }
 
-static int kcm_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int kcm_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct kcm_sock *kcm = kcm_sk(sk);
diff --git a/net/key/af_key.c b/net/key/af_key.c
index a815f5ab4c49..3cde1e0c3119 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -3662,13 +3662,14 @@ static int pfkey_send_migrate(const struct xfrm_selector *sel, u8 dir, u8 type,
 }
 #endif
 
-static int pfkey_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int pfkey_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct sk_buff *skb = NULL;
 	struct sadb_msg *hdr = NULL;
 	int err;
 	struct net *net = sock_net(sk);
+	size_t len = msg_data_left(msg);
 
 	err = -EOPNOTSUPP;
 	if (msg->msg_flags & MSG_OOB)
diff --git a/net/l2tp/l2tp_ip.c b/net/l2tp/l2tp_ip.c
index 4db5a554bdbd..474ce4ae9b63 100644
--- a/net/l2tp/l2tp_ip.c
+++ b/net/l2tp/l2tp_ip.c
@@ -394,13 +394,14 @@ static int l2tp_ip_backlog_recv(struct sock *sk, struct sk_buff *skb)
 /* Userspace will call sendmsg() on the tunnel socket to send L2TP
  * control frames.
  */
-static int l2tp_ip_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+static int l2tp_ip_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct sk_buff *skb;
 	int rc;
 	struct inet_sock *inet = inet_sk(sk);
 	struct rtable *rt = NULL;
 	struct flowi4 *fl4;
+	size_t len = msg_data_left(msg);
 	int connected = 0;
 	__be32 daddr;
 
diff --git a/net/l2tp/l2tp_ip6.c b/net/l2tp/l2tp_ip6.c
index 2478aa60145f..7619afe77855 100644
--- a/net/l2tp/l2tp_ip6.c
+++ b/net/l2tp/l2tp_ip6.c
@@ -488,7 +488,7 @@ static int l2tp_ip6_push_pending_frames(struct sock *sk)
 /* Userspace will call sendmsg() on the tunnel socket to send L2TP
  * control frames.
  */
-static int l2tp_ip6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+static int l2tp_ip6_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct ipv6_txoptions opt_space;
 	DECLARE_SOCKADDR(struct sockaddr_l2tpip6 *, lsa, msg->msg_name);
@@ -500,6 +500,7 @@ static int l2tp_ip6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	struct dst_entry *dst = NULL;
 	struct flowi6 fl6;
 	struct ipcm6_cookie ipc6;
+	size_t len = msg_data_left(msg);
 	int addr_len = msg->msg_namelen;
 	int transhdrlen = 4; /* zero session-id */
 	int ulen;
diff --git a/net/l2tp/l2tp_ppp.c b/net/l2tp/l2tp_ppp.c
index f011af6601c9..ae351f50adff 100644
--- a/net/l2tp/l2tp_ppp.c
+++ b/net/l2tp/l2tp_ppp.c
@@ -262,14 +262,14 @@ static void pppol2tp_recv(struct l2tp_session *session, struct sk_buff *skb, int
  * when a user application does a sendmsg() on the session socket. L2TP and
  * PPP headers must be inserted into the user's data.
  */
-static int pppol2tp_sendmsg(struct socket *sock, struct msghdr *m,
-			    size_t total_len)
+static int pppol2tp_sendmsg(struct socket *sock, struct msghdr *m)
 {
 	struct sock *sk = sock->sk;
 	struct sk_buff *skb;
 	int error;
 	struct l2tp_session *session;
 	struct l2tp_tunnel *tunnel;
+	size_t total_len = msg_data_left(m);
 	int uhlen;
 
 	error = -ENOTCONN;
diff --git a/net/llc/af_llc.c b/net/llc/af_llc.c
index da7fe94bea2e..d10b5ef66c88 100644
--- a/net/llc/af_llc.c
+++ b/net/llc/af_llc.c
@@ -919,12 +919,11 @@ static int llc_ui_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
  *	llc_ui_sendmsg - Transmit data provided by the socket user.
  *	@sock: Socket to transmit data from.
  *	@msg: Various user related information.
- *	@len: Length of data to transmit.
  *
  *	Transmit data provided by the socket user.
  *	Returns non-negative upon success, negative otherwise.
  */
-static int llc_ui_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int llc_ui_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct llc_sock *llc = llc_sk(sk);
@@ -954,7 +953,7 @@ static int llc_ui_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 			goto out;
 	}
 	hdrlen = llc->dev->hard_header_len + llc_ui_header_len(sk, addr);
-	size = hdrlen + len;
+	size = hdrlen + msg_data_left(msg);
 	if (size > llc->dev->mtu)
 		size = llc->dev->mtu;
 	copied = size - hdrlen;
diff --git a/net/mctp/af_mctp.c b/net/mctp/af_mctp.c
index bb4bd0b6a4f7..9ead250f1be3 100644
--- a/net/mctp/af_mctp.c
+++ b/net/mctp/af_mctp.c
@@ -90,7 +90,7 @@ static int mctp_bind(struct socket *sock, struct sockaddr *addr, int addrlen)
 	return rc;
 }
 
-static int mctp_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int mctp_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	DECLARE_SOCKADDR(struct sockaddr_mctp *, addr, msg->msg_name);
 	int rc, addrlen = msg->msg_namelen;
@@ -99,6 +99,7 @@ static int mctp_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 	struct mctp_skb_cb *cb;
 	struct mctp_route *rt;
 	struct sk_buff *skb = NULL;
+	size_t len = msg_data_left(msg);
 	int hlen;
 
 	if (addr) {
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 2d26b9114373..0a58f2dbd3ce 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1663,7 +1663,7 @@ static void mptcp_set_nospace(struct sock *sk)
 static int mptcp_disconnect(struct sock *sk, int flags);
 
 static int mptcp_sendmsg_fastopen(struct sock *sk, struct sock *ssk, struct msghdr *msg,
-				  size_t len, int *copied_syn)
+				  int *copied_syn)
 {
 	unsigned int saved_flags = msg->msg_flags;
 	struct mptcp_sock *msk = mptcp_sk(sk);
@@ -1673,7 +1673,7 @@ static int mptcp_sendmsg_fastopen(struct sock *sk, struct sock *ssk, struct msgh
 	msg->msg_flags |= MSG_DONTWAIT;
 	msk->connect_flags = O_NONBLOCK;
 	msk->fastopening = 1;
-	ret = tcp_sendmsg_fastopen(ssk, msg, copied_syn, len, NULL);
+	ret = tcp_sendmsg_fastopen(ssk, msg, copied_syn, NULL);
 	msk->fastopening = 0;
 	msg->msg_flags = saved_flags;
 	release_sock(ssk);
@@ -1695,7 +1695,7 @@ static int mptcp_sendmsg_fastopen(struct sock *sk, struct sock *ssk, struct msgh
 	return ret;
 }
 
-static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct mptcp_sock *msk = mptcp_sk(sk);
 	struct page_frag *pfrag;
@@ -1714,7 +1714,7 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 			       msg->msg_flags & MSG_FASTOPEN))) {
 		int copied_syn = 0;
 
-		ret = mptcp_sendmsg_fastopen(sk, ssock->sk, msg, len, &copied_syn);
+		ret = mptcp_sendmsg_fastopen(sk, ssock->sk, msg, &copied_syn);
 		copied += copied_syn;
 		if (ret == -EINPROGRESS && copied_syn > 0)
 			goto out;
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 877f1da1a8ac..519487cbfcce 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -1857,7 +1857,7 @@ static void netlink_cmsg_listen_all_nsid(struct sock *sk, struct msghdr *msg,
 		 &NETLINK_CB(skb).nsid);
 }
 
-static int netlink_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int netlink_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct netlink_sock *nlk = nlk_sk(sk);
@@ -1872,7 +1872,7 @@ static int netlink_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 	if (msg->msg_flags & MSG_OOB)
 		return -EOPNOTSUPP;
 
-	if (len == 0) {
+	if (msg_data_left(msg) == 0) {
 		pr_warn_once("Zero length message leads to an empty skb\n");
 		return -ENODATA;
 	}
@@ -1911,10 +1911,10 @@ static int netlink_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 	}
 
 	err = -EMSGSIZE;
-	if (len > sk->sk_sndbuf - 32)
+	if (msg_data_left(msg) > sk->sk_sndbuf - 32)
 		goto out;
 	err = -ENOBUFS;
-	skb = netlink_alloc_large_skb(len, dst_group);
+	skb = netlink_alloc_large_skb(msg_data_left(msg), dst_group);
 	if (skb == NULL)
 		goto out;
 
@@ -1924,7 +1924,8 @@ static int netlink_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 	NETLINK_CB(skb).flags	= netlink_skb_flags;
 
 	err = -EFAULT;
-	if (memcpy_from_msg(skb_put(skb, len), msg, len)) {
+	if (memcpy_from_msg(skb_put(skb, msg_data_left(msg)),
+			    msg, msg_data_left(msg))) {
 		kfree_skb(skb);
 		goto out;
 	}
diff --git a/net/netrom/af_netrom.c b/net/netrom/af_netrom.c
index 5a4cb796150f..d2c65f38c22c 100644
--- a/net/netrom/af_netrom.c
+++ b/net/netrom/af_netrom.c
@@ -1034,7 +1034,7 @@ int nr_rx_frame(struct sk_buff *skb, struct net_device *dev)
 	return 1;
 }
 
-static int nr_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int nr_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct nr_sock *nr = nr_sk(sk);
@@ -1043,6 +1043,7 @@ static int nr_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 	struct sockaddr_ax25 sax;
 	struct sk_buff *skb;
 	unsigned char *asmptr;
+	size_t len = msg_data_left(msg);
 	int size;
 
 	if (msg->msg_flags & ~(MSG_DONTWAIT|MSG_EOR|MSG_CMSG_COMPAT))
diff --git a/net/nfc/llcp_sock.c b/net/nfc/llcp_sock.c
index 77642d18a3b4..70226fc36396 100644
--- a/net/nfc/llcp_sock.c
+++ b/net/nfc/llcp_sock.c
@@ -770,8 +770,7 @@ static int llcp_sock_connect(struct socket *sock, struct sockaddr *_addr,
 	return ret;
 }
 
-static int llcp_sock_sendmsg(struct socket *sock, struct msghdr *msg,
-			     size_t len)
+static int llcp_sock_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct nfc_llcp_sock *llcp_sock = nfc_llcp_sock(sk);
@@ -805,7 +804,7 @@ static int llcp_sock_sendmsg(struct socket *sock, struct msghdr *msg,
 		release_sock(sk);
 
 		return nfc_llcp_send_ui_frame(llcp_sock, addr->dsap, addr->ssap,
-					      msg, len);
+					      msg, msg_data_left(msg));
 	}
 
 	if (sk->sk_state != LLCP_CONNECTED) {
@@ -815,7 +814,7 @@ static int llcp_sock_sendmsg(struct socket *sock, struct msghdr *msg,
 
 	release_sock(sk);
 
-	return nfc_llcp_send_i_frame(llcp_sock, msg, len);
+	return nfc_llcp_send_i_frame(llcp_sock, msg, msg_data_left(msg));
 }
 
 static int llcp_sock_recvmsg(struct socket *sock, struct msghdr *msg,
diff --git a/net/nfc/rawsock.c b/net/nfc/rawsock.c
index 5125392bb68e..d9d54240b2a2 100644
--- a/net/nfc/rawsock.c
+++ b/net/nfc/rawsock.c
@@ -202,11 +202,12 @@ static void rawsock_tx_work(struct work_struct *work)
 	kcov_remote_stop();
 }
 
-static int rawsock_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int rawsock_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct nfc_dev *dev = nfc_rawsock(sk)->dev;
 	struct sk_buff *skb;
+	size_t len = msg_data_left(msg);
 	int rc;
 
 	pr_debug("sock=%p sk=%p len=%zu\n", sock, sk, len);
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 497193f73030..84a95e177260 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -1947,14 +1947,14 @@ static void packet_parse_headers(struct sk_buff *skb, struct socket *sock)
  *	protocol layers and you must therefore supply it with a complete frame
  */
 
-static int packet_sendmsg_spkt(struct socket *sock, struct msghdr *msg,
-			       size_t len)
+static int packet_sendmsg_spkt(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	DECLARE_SOCKADDR(struct sockaddr_pkt *, saddr, msg->msg_name);
 	struct sk_buff *skb = NULL;
 	struct net_device *dev;
 	struct sockcm_cookie sockc;
+	size_t len = msg_data_left(msg);
 	__be16 proto = 0;
 	int err;
 	int extra_len = 0;
@@ -2933,7 +2933,7 @@ static struct sk_buff *packet_alloc_skb(struct sock *sk, size_t prepad,
 	return skb;
 }
 
-static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
+static int packet_snd(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	DECLARE_SOCKADDR(struct sockaddr_ll *, saddr, msg->msg_name);
@@ -2946,6 +2946,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
 	struct virtio_net_hdr vnet_hdr = { 0 };
 	int offset = 0;
 	struct packet_sock *po = pkt_sk(sk);
+	size_t len = msg_data_left(msg);
 	bool has_vnet_hdr = false;
 	int hlen, tlen, linear;
 	int extra_len = 0;
@@ -3093,7 +3094,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
 	return err;
 }
 
-static int packet_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int packet_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct packet_sock *po = pkt_sk(sk);
@@ -3104,7 +3105,7 @@ static int packet_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 	if (data_race(po->tx_ring.pg_vec))
 		return tpacket_snd(po, msg);
 
-	return packet_snd(sock, msg, len);
+	return packet_snd(sock, msg);
 }
 
 /*
diff --git a/net/phonet/datagram.c b/net/phonet/datagram.c
index ff5f49ab236e..4839f7d6785b 100644
--- a/net/phonet/datagram.c
+++ b/net/phonet/datagram.c
@@ -70,10 +70,11 @@ static int pn_init(struct sock *sk)
 	return 0;
 }
 
-static int pn_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+static int pn_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	DECLARE_SOCKADDR(struct sockaddr_pn *, target, msg->msg_name);
 	struct sk_buff *skb;
+	size_t len = msg_data_left(msg);
 	int err;
 
 	if (msg->msg_flags & ~(MSG_DONTWAIT|MSG_EOR|MSG_NOSIGNAL|
diff --git a/net/phonet/pep.c b/net/phonet/pep.c
index 83ea13a50690..5afc99ab9eca 100644
--- a/net/phonet/pep.c
+++ b/net/phonet/pep.c
@@ -1112,10 +1112,11 @@ static int pipe_skb_send(struct sock *sk, struct sk_buff *skb)
 
 }
 
-static int pep_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+static int pep_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct pep_sock *pn = pep_sk(sk);
 	struct sk_buff *skb;
+	size_t len = msg_data_left(msg);
 	long timeo;
 	int flags = msg->msg_flags;
 	int err, done;
diff --git a/net/phonet/socket.c b/net/phonet/socket.c
index 71e2caf6ab85..99cd62f64944 100644
--- a/net/phonet/socket.c
+++ b/net/phonet/socket.c
@@ -414,15 +414,14 @@ static int pn_socket_listen(struct socket *sock, int backlog)
 	return err;
 }
 
-static int pn_socket_sendmsg(struct socket *sock, struct msghdr *m,
-			     size_t total_len)
+static int pn_socket_sendmsg(struct socket *sock, struct msghdr *m)
 {
 	struct sock *sk = sock->sk;
 
 	if (pn_socket_autobind(sock))
 		return -EAGAIN;
 
-	return sk->sk_prot->sendmsg(sk, m, total_len);
+	return sk->sk_prot->sendmsg(sk, m);
 }
 
 const struct proto_ops phonet_dgram_ops = {
diff --git a/net/qrtr/af_qrtr.c b/net/qrtr/af_qrtr.c
index 5c2fb992803b..7c1b908dd479 100644
--- a/net/qrtr/af_qrtr.c
+++ b/net/qrtr/af_qrtr.c
@@ -888,7 +888,7 @@ static int qrtr_bcast_enqueue(struct qrtr_node *node, struct sk_buff *skb,
 	return 0;
 }
 
-static int qrtr_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int qrtr_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	DECLARE_SOCKADDR(struct sockaddr_qrtr *, addr, msg->msg_name);
 	int (*enqueue_fn)(struct qrtr_node *, struct sk_buff *, int,
@@ -898,7 +898,7 @@ static int qrtr_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 	struct sock *sk = sock->sk;
 	struct qrtr_node *node;
 	struct sk_buff *skb;
-	size_t plen;
+	size_t plen, len = msg_data_left(msg);
 	u32 type;
 	int rc;
 
diff --git a/net/rds/rds.h b/net/rds/rds.h
index d35d1fc39807..9e8ecafd5b51 100644
--- a/net/rds/rds.h
+++ b/net/rds/rds.h
@@ -909,7 +909,7 @@ void rds6_inc_info_copy(struct rds_incoming *inc,
 			int flip);
 
 /* send.c */
-int rds_sendmsg(struct socket *sock, struct msghdr *msg, size_t payload_len);
+int rds_sendmsg(struct socket *sock, struct msghdr *msg);
 void rds_send_path_reset(struct rds_conn_path *conn);
 int rds_send_xmit(struct rds_conn_path *cp);
 struct sockaddr_in;
diff --git a/net/rds/send.c b/net/rds/send.c
index 5e57a1581dc6..f588b720e1c3 100644
--- a/net/rds/send.c
+++ b/net/rds/send.c
@@ -1098,7 +1098,7 @@ static int rds_rdma_bytes(struct msghdr *msg, size_t *rdma_bytes)
 	return 0;
 }
 
-int rds_sendmsg(struct socket *sock, struct msghdr *msg, size_t payload_len)
+int rds_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct rds_sock *rs = rds_sk_to_rs(sk);
@@ -1114,6 +1114,7 @@ int rds_sendmsg(struct socket *sock, struct msghdr *msg, size_t payload_len)
 	struct rds_conn_path *cpath;
 	struct in6_addr daddr;
 	__u32 scope_id = 0;
+	size_t payload_len = msg_data_left(msg);
 	size_t rdma_payload_len = 0;
 	bool zcopy = ((msg->msg_flags & MSG_ZEROCOPY) &&
 		      sock_flag(rds_rs_to_sk(rs), SOCK_ZEROCOPY));
diff --git a/net/rose/af_rose.c b/net/rose/af_rose.c
index ca2b17f32670..938ea0716751 100644
--- a/net/rose/af_rose.c
+++ b/net/rose/af_rose.c
@@ -1069,7 +1069,7 @@ int rose_rx_call_request(struct sk_buff *skb, struct net_device *dev, struct ros
 	return 1;
 }
 
-static int rose_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int rose_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct rose_sock *rose = rose_sk(sk);
@@ -1078,6 +1078,7 @@ static int rose_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 	struct full_sockaddr_rose srose;
 	struct sk_buff *skb;
 	unsigned char *asmptr;
+	size_t len = msg_data_left(msg);
 	int n, size, qbit = 0;
 
 	if (msg->msg_flags & ~(MSG_DONTWAIT|MSG_EOR|MSG_CMSG_COMPAT))
diff --git a/net/rxrpc/af_rxrpc.c b/net/rxrpc/af_rxrpc.c
index 102f5cbff91a..bdce6ab30899 100644
--- a/net/rxrpc/af_rxrpc.c
+++ b/net/rxrpc/af_rxrpc.c
@@ -502,13 +502,13 @@ static int rxrpc_connect(struct socket *sock, struct sockaddr *addr,
  *   - sends a call data packet
  *   - may send an abort (abort code in control data)
  */
-static int rxrpc_sendmsg(struct socket *sock, struct msghdr *m, size_t len)
+static int rxrpc_sendmsg(struct socket *sock, struct msghdr *m)
 {
 	struct rxrpc_local *local;
 	struct rxrpc_sock *rx = rxrpc_sk(sock->sk);
 	int ret;
 
-	_enter(",{%d},,%zu", rx->sk.sk_state, len);
+	_enter(",{%d},,%zu", rx->sk.sk_state, msg_data_left(m));
 
 	if (m->msg_flags & MSG_OOB)
 		return -EOPNOTSUPP;
@@ -562,7 +562,7 @@ static int rxrpc_sendmsg(struct socket *sock, struct msghdr *m, size_t len)
 		fallthrough;
 	case RXRPC_SERVER_BOUND:
 	case RXRPC_SERVER_LISTENING:
-		ret = rxrpc_do_sendmsg(rx, m, len);
+		ret = rxrpc_do_sendmsg(rx, m);
 		/* The socket has been unlocked */
 		goto out;
 	default:
diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h
index 67b0a894162d..36738f8f050d 100644
--- a/net/rxrpc/ar-internal.h
+++ b/net/rxrpc/ar-internal.h
@@ -1221,7 +1221,7 @@ struct key *rxrpc_look_up_server_security(struct rxrpc_connection *,
  */
 bool rxrpc_propose_abort(struct rxrpc_call *call, s32 abort_code, int error,
 			 enum rxrpc_abort_reason why);
-int rxrpc_do_sendmsg(struct rxrpc_sock *, struct msghdr *, size_t);
+int rxrpc_do_sendmsg(struct rxrpc_sock *, struct msghdr *);
 
 /*
  * server_key.c
diff --git a/net/rxrpc/output.c b/net/rxrpc/output.c
index 5e53429c6922..0f3ff3455101 100644
--- a/net/rxrpc/output.c
+++ b/net/rxrpc/output.c
@@ -16,9 +16,9 @@
 #include <net/udp.h>
 #include "ar-internal.h"
 
-extern int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len);
+extern int udpv6_sendmsg(struct sock *sk, struct msghdr *msg);
 
-static ssize_t do_udp_sendmsg(struct socket *socket, struct msghdr *msg, size_t len)
+static ssize_t do_udp_sendmsg(struct socket *socket, struct msghdr *msg)
 {
 	struct sockaddr *sa = msg->msg_name;
 	struct sock *sk = socket->sk;
@@ -29,10 +29,10 @@ static ssize_t do_udp_sendmsg(struct socket *socket, struct msghdr *msg, size_t
 				pr_warn("AF_INET6 address on AF_INET socket\n");
 				return -ENOPROTOOPT;
 			}
-			return udpv6_sendmsg(sk, msg, len);
+			return udpv6_sendmsg(sk, msg);
 		}
 	}
-	return udp_sendmsg(sk, msg, len);
+	return udp_sendmsg(sk, msg);
 }
 
 struct rxrpc_abort_buffer {
@@ -232,7 +232,7 @@ int rxrpc_send_ack_packet(struct rxrpc_call *call, struct rxrpc_txbuf *txb)
 	txb->ack.previousPacket	= htonl(call->rx_highest_seq);
 
 	iov_iter_kvec(&msg.msg_iter, WRITE, iov, 1, len);
-	ret = do_udp_sendmsg(conn->local->socket, &msg, len);
+	ret = do_udp_sendmsg(conn->local->socket, &msg);
 	call->peer->last_tx_at = ktime_get_seconds();
 	if (ret < 0) {
 		trace_rxrpc_tx_fail(call->debug_id, serial, ret,
@@ -306,7 +306,7 @@ int rxrpc_send_abort_packet(struct rxrpc_call *call)
 	pkt.whdr.serial = htonl(serial);
 
 	iov_iter_kvec(&msg.msg_iter, WRITE, iov, 1, sizeof(pkt));
-	ret = do_udp_sendmsg(conn->local->socket, &msg, sizeof(pkt));
+	ret = do_udp_sendmsg(conn->local->socket, &msg);
 	conn->peer->last_tx_at = ktime_get_seconds();
 	if (ret < 0)
 		trace_rxrpc_tx_fail(call->debug_id, serial, ret,
@@ -424,7 +424,7 @@ int rxrpc_send_data_packet(struct rxrpc_call *call, struct rxrpc_txbuf *txb)
 	 *     message and update the peer record
 	 */
 	rxrpc_inc_stat(call->rxnet, stat_tx_data_send);
-	ret = do_udp_sendmsg(conn->local->socket, &msg, len);
+	ret = do_udp_sendmsg(conn->local->socket, &msg);
 	conn->peer->last_tx_at = ktime_get_seconds();
 
 	if (ret < 0) {
@@ -497,7 +497,7 @@ int rxrpc_send_data_packet(struct rxrpc_call *call, struct rxrpc_txbuf *txb)
 		ip_sock_set_mtu_discover(conn->local->socket->sk,
 					 IP_PMTUDISC_DONT);
 		rxrpc_inc_stat(call->rxnet, stat_tx_data_send_frag);
-		ret = do_udp_sendmsg(conn->local->socket, &msg, len);
+		ret = do_udp_sendmsg(conn->local->socket, &msg);
 		conn->peer->last_tx_at = ktime_get_seconds();
 
 		ip_sock_set_mtu_discover(conn->local->socket->sk,
@@ -564,7 +564,7 @@ void rxrpc_send_conn_abort(struct rxrpc_connection *conn)
 	whdr.serial = htonl(serial);
 
 	iov_iter_kvec(&msg.msg_iter, WRITE, iov, 2, len);
-	ret = do_udp_sendmsg(conn->local->socket, &msg, len);
+	ret = do_udp_sendmsg(conn->local->socket, &msg);
 	if (ret < 0) {
 		trace_rxrpc_tx_fail(conn->debug_id, serial, ret,
 				    rxrpc_tx_point_conn_abort);
@@ -633,7 +633,7 @@ void rxrpc_reject_packet(struct rxrpc_local *local, struct sk_buff *skb)
 		whdr.flags	&= RXRPC_CLIENT_INITIATED;
 
 		iov_iter_kvec(&msg.msg_iter, WRITE, iov, ioc, size);
-		ret = do_udp_sendmsg(local->socket, &msg, size);
+		ret = do_udp_sendmsg(local->socket, &msg);
 		if (ret < 0)
 			trace_rxrpc_tx_fail(local->debug_id, 0, ret,
 					    rxrpc_tx_point_reject);
@@ -682,7 +682,7 @@ void rxrpc_send_keepalive(struct rxrpc_peer *peer)
 	len = iov[0].iov_len + iov[1].iov_len;
 
 	iov_iter_kvec(&msg.msg_iter, WRITE, iov, 2, len);
-	ret = do_udp_sendmsg(peer->local->socket, &msg, len);
+	ret = do_udp_sendmsg(peer->local->socket, &msg);
 	if (ret < 0)
 		trace_rxrpc_tx_fail(peer->debug_id, 0, ret,
 				    rxrpc_tx_point_version_keepalive);
diff --git a/net/rxrpc/rxperf.c b/net/rxrpc/rxperf.c
index 4a2e90015ca7..0167afb67a7a 100644
--- a/net/rxrpc/rxperf.c
+++ b/net/rxrpc/rxperf.c
@@ -507,7 +507,7 @@ static int rxperf_process_call(struct rxperf_call *call)
 		iov_iter_bvec(&msg.msg_iter, WRITE, &bv, 1, len);
 		msg.msg_flags = MSG_MORE;
 		n = rxrpc_kernel_send_data(rxperf_socket, call->rxcall, &msg,
-					   len, rxperf_notify_end_reply_tx);
+					   rxperf_notify_end_reply_tx);
 		if (n < 0)
 			return n;
 		if (n == 0)
@@ -520,7 +520,7 @@ static int rxperf_process_call(struct rxperf_call *call)
 	iov[0].iov_len	= len;
 	iov_iter_kvec(&msg.msg_iter, WRITE, iov, 1, len);
 	msg.msg_flags = 0;
-	n = rxrpc_kernel_send_data(rxperf_socket, call->rxcall, &msg, len,
+	n = rxrpc_kernel_send_data(rxperf_socket, call->rxcall, &msg,
 				   rxperf_notify_end_reply_tx);
 	if (n >= 0)
 		return 0; /* Success */
diff --git a/net/rxrpc/sendmsg.c b/net/rxrpc/sendmsg.c
index da49fcf1c456..b6ffd8124ced 100644
--- a/net/rxrpc/sendmsg.c
+++ b/net/rxrpc/sendmsg.c
@@ -280,7 +280,7 @@ static void rxrpc_queue_packet(struct rxrpc_sock *rx, struct rxrpc_call *call,
  */
 static int rxrpc_send_data(struct rxrpc_sock *rx,
 			   struct rxrpc_call *call,
-			   struct msghdr *msg, size_t len,
+			   struct msghdr *msg,
 			   rxrpc_notify_end_tx_t notify_end_tx,
 			   bool *_dropped_lock)
 {
@@ -327,9 +327,9 @@ static int rxrpc_send_data(struct rxrpc_sock *rx,
 
 	ret = -EMSGSIZE;
 	if (call->tx_total_len != -1) {
-		if (len - copied > call->tx_total_len)
+		if (msg_data_left(msg) > call->tx_total_len)
 			goto maybe_error;
-		if (!more && len - copied != call->tx_total_len)
+		if (!more && msg_data_left(msg) != call->tx_total_len)
 			goto maybe_error;
 	}
 
@@ -612,7 +612,7 @@ rxrpc_new_client_call_for_sendmsg(struct rxrpc_sock *rx, struct msghdr *msg,
  * - caller holds the socket locked
  * - the socket may be either a client socket or a server socket
  */
-int rxrpc_do_sendmsg(struct rxrpc_sock *rx, struct msghdr *msg, size_t len)
+int rxrpc_do_sendmsg(struct rxrpc_sock *rx, struct msghdr *msg)
 	__releases(&rx->sk.sk_lock.slock)
 {
 	struct rxrpc_call *call;
@@ -723,7 +723,7 @@ int rxrpc_do_sendmsg(struct rxrpc_sock *rx, struct msghdr *msg, size_t len)
 	} else if (p.command != RXRPC_CMD_SEND_DATA) {
 		ret = -EINVAL;
 	} else {
-		ret = rxrpc_send_data(rx, call, msg, len, NULL, &dropped_lock);
+		ret = rxrpc_send_data(rx, call, msg, NULL, &dropped_lock);
 	}
 
 out_put_unlock:
@@ -744,7 +744,6 @@ int rxrpc_do_sendmsg(struct rxrpc_sock *rx, struct msghdr *msg, size_t len)
  * @sock: The socket the call is on
  * @call: The call to send data through
  * @msg: The data to send
- * @len: The amount of data to send
  * @notify_end_tx: Notification that the last packet is queued.
  *
  * Allow a kernel service to send data on a call.  The call must be in an state
@@ -753,7 +752,7 @@ int rxrpc_do_sendmsg(struct rxrpc_sock *rx, struct msghdr *msg, size_t len)
  * more data to come, otherwise this data will end the transmission phase.
  */
 int rxrpc_kernel_send_data(struct socket *sock, struct rxrpc_call *call,
-			   struct msghdr *msg, size_t len,
+			   struct msghdr *msg,
 			   rxrpc_notify_end_tx_t notify_end_tx)
 {
 	bool dropped_lock = false;
@@ -766,7 +765,7 @@ int rxrpc_kernel_send_data(struct socket *sock, struct rxrpc_call *call,
 
 	mutex_lock(&call->user_mutex);
 
-	ret = rxrpc_send_data(rxrpc_sk(sock->sk), call, msg, len,
+	ret = rxrpc_send_data(rxrpc_sk(sock->sk), call, msg,
 			      notify_end_tx, &dropped_lock);
 	if (ret == -ESHUTDOWN)
 		ret = call->error;
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index b91616f819de..da99aab89d82 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -1935,7 +1935,7 @@ static void sctp_sendmsg_update_sinfo(struct sctp_association *asoc,
 	}
 }
 
-static int sctp_sendmsg(struct sock *sk, struct msghdr *msg, size_t msg_len)
+static int sctp_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct sctp_endpoint *ep = sctp_sk(sk)->ep;
 	struct sctp_transport *transport = NULL;
@@ -1943,6 +1943,7 @@ static int sctp_sendmsg(struct sock *sk, struct msghdr *msg, size_t msg_len)
 	struct sctp_association *asoc, *tmp;
 	struct sctp_cmsgs cmsgs;
 	union sctp_addr *daddr;
+	size_t msg_len = msg_data_left(msg);
 	bool new = false;
 	__u16 sflags;
 	int err;
diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index c6b4a62276f6..0e725698ebcd 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -2653,10 +2653,11 @@ static int smc_getname(struct socket *sock, struct sockaddr *addr,
 	return smc->clcsock->ops->getname(smc->clcsock, addr, peer);
 }
 
-static int smc_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int smc_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct smc_sock *smc;
+	size_t len = msg_data_left(msg);
 	int rc;
 
 	smc = smc_sk(sk);
@@ -2681,7 +2682,7 @@ static int smc_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 	}
 
 	if (smc->use_fallback) {
-		rc = smc->clcsock->ops->sendmsg(smc->clcsock, msg, len);
+		rc = smc->clcsock->ops->sendmsg(smc->clcsock, msg);
 	} else {
 		rc = smc_tx_sendmsg(smc, msg, len);
 		SMC_STAT_TX_PAYLOAD(smc, len, rc);
diff --git a/net/socket.c b/net/socket.c
index 73e493da4589..1690e1782bf0 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -708,10 +708,8 @@ void __sock_tx_timestamp(__u16 tsflags, __u8 *tx_flags)
 }
 EXPORT_SYMBOL(__sock_tx_timestamp);
 
-INDIRECT_CALLABLE_DECLARE(int inet_sendmsg(struct socket *, struct msghdr *,
-					   size_t));
-INDIRECT_CALLABLE_DECLARE(int inet6_sendmsg(struct socket *, struct msghdr *,
-					    size_t));
+INDIRECT_CALLABLE_DECLARE(int inet_sendmsg(struct socket *, struct msghdr *));
+INDIRECT_CALLABLE_DECLARE(int inet6_sendmsg(struct socket *, struct msghdr *));
 
 static noinline void call_trace_sock_send_length(struct sock *sk, int ret,
 						 int flags)
@@ -722,8 +720,7 @@ static noinline void call_trace_sock_send_length(struct sock *sk, int ret,
 static inline int sock_sendmsg_nosec(struct socket *sock, struct msghdr *msg)
 {
 	int ret = INDIRECT_CALL_INET(sock->ops->sendmsg, inet6_sendmsg,
-				     inet_sendmsg, sock, msg,
-				     msg_data_left(msg));
+				     inet_sendmsg, sock, msg);
 	BUG_ON(ret == -EIOCBQUEUED);
 
 	if (trace_sock_send_length_enabled())
@@ -741,8 +738,7 @@ static inline int sock_sendmsg_nosec(struct socket *sock, struct msghdr *msg)
  */
 int sock_sendmsg(struct socket *sock, struct msghdr *msg)
 {
-	int err = security_socket_sendmsg(sock, msg,
-					  msg_data_left(msg));
+	int err = security_socket_sendmsg(sock, msg);
 
 	return err ?: sock_sendmsg_nosec(sock, msg);
 }
@@ -787,11 +783,11 @@ int kernel_sendmsg_locked(struct sock *sk, struct msghdr *msg,
 	struct socket *sock = sk->sk_socket;
 
 	if (!sock->ops->sendmsg_locked)
-		return sock_no_sendmsg_locked(sk, msg, size);
+		return sock_no_sendmsg_locked(sk, msg);
 
 	iov_iter_kvec(&msg->msg_iter, ITER_SOURCE, vec, num, size);
 
-	return sock->ops->sendmsg_locked(sk, msg, msg_data_left(msg));
+	return sock->ops->sendmsg_locked(sk, msg);
 }
 EXPORT_SYMBOL(kernel_sendmsg_locked);
 
diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 37edfe10f8c6..bd677e707548 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -156,8 +156,8 @@ static int tipc_sk_leave(struct tipc_sock *tsk);
 static struct tipc_sock *tipc_sk_lookup(struct net *net, u32 portid);
 static int tipc_sk_insert(struct tipc_sock *tsk);
 static void tipc_sk_remove(struct tipc_sock *tsk);
-static int __tipc_sendstream(struct socket *sock, struct msghdr *m, size_t dsz);
-static int __tipc_sendmsg(struct socket *sock, struct msghdr *m, size_t dsz);
+static int __tipc_sendstream(struct socket *sock, struct msghdr *m);
+static int __tipc_sendmsg(struct socket *sock, struct msghdr *m);
 static void tipc_sk_push_backlog(struct tipc_sock *tsk, bool nagle_ack);
 static int tipc_wait_for_connect(struct socket *sock, long *timeo_p);
 
@@ -1385,7 +1385,6 @@ static void tipc_sk_conn_proto_rcv(struct tipc_sock *tsk, struct sk_buff *skb,
  * tipc_sendmsg - send message in connectionless manner
  * @sock: socket structure
  * @m: message to send
- * @dsz: amount of user data to be sent
  *
  * Message must have an destination specified explicitly.
  * Used for SOCK_RDM and SOCK_DGRAM messages,
@@ -1394,20 +1393,19 @@ static void tipc_sk_conn_proto_rcv(struct tipc_sock *tsk, struct sk_buff *skb,
  *
  * Return: the number of bytes sent on success, or errno otherwise
  */
-static int tipc_sendmsg(struct socket *sock,
-			struct msghdr *m, size_t dsz)
+static int tipc_sendmsg(struct socket *sock, struct msghdr *m)
 {
 	struct sock *sk = sock->sk;
 	int ret;
 
 	lock_sock(sk);
-	ret = __tipc_sendmsg(sock, m, dsz);
+	ret = __tipc_sendmsg(sock, m);
 	release_sock(sk);
 
 	return ret;
 }
 
-static int __tipc_sendmsg(struct socket *sock, struct msghdr *m, size_t dlen)
+static int __tipc_sendmsg(struct socket *sock, struct msghdr *m)
 {
 	struct sock *sk = sock->sk;
 	struct net *net = sock_net(sk);
@@ -1420,6 +1418,7 @@ static int __tipc_sendmsg(struct socket *sock, struct msghdr *m, size_t dlen)
 	struct tipc_msg *hdr = &tsk->phdr;
 	struct tipc_socket_addr skaddr;
 	struct sk_buff_head pkts;
+	size_t dlen = msg_data_left(m);
 	int atype, mtu, rc;
 
 	if (unlikely(dlen > TIPC_MAX_USER_MSG_SIZE))
@@ -1535,26 +1534,25 @@ static int __tipc_sendmsg(struct socket *sock, struct msghdr *m, size_t dlen)
  * tipc_sendstream - send stream-oriented data
  * @sock: socket structure
  * @m: data to send
- * @dsz: total length of data to be transmitted
  *
  * Used for SOCK_STREAM data.
  *
  * Return: the number of bytes sent on success (or partial success),
  * or errno if no data sent
  */
-static int tipc_sendstream(struct socket *sock, struct msghdr *m, size_t dsz)
+static int tipc_sendstream(struct socket *sock, struct msghdr *m)
 {
 	struct sock *sk = sock->sk;
 	int ret;
 
 	lock_sock(sk);
-	ret = __tipc_sendstream(sock, m, dsz);
+	ret = __tipc_sendstream(sock, m);
 	release_sock(sk);
 
 	return ret;
 }
 
-static int __tipc_sendstream(struct socket *sock, struct msghdr *m, size_t dlen)
+static int __tipc_sendstream(struct socket *sock, struct msghdr *m)
 {
 	struct sock *sk = sock->sk;
 	DECLARE_SOCKADDR(struct sockaddr_tipc *, dest, m->msg_name);
@@ -1564,6 +1562,7 @@ static int __tipc_sendstream(struct socket *sock, struct msghdr *m, size_t dlen)
 	struct tipc_msg *hdr = &tsk->phdr;
 	struct net *net = sock_net(sk);
 	struct sk_buff *skb;
+	size_t dlen = msg_data_left(m);
 	u32 dnode = tsk_peer_node(tsk);
 	int maxnagle = tsk->maxnagle;
 	int maxpkt = tsk->max_pkt;
@@ -1575,7 +1574,7 @@ static int __tipc_sendstream(struct socket *sock, struct msghdr *m, size_t dlen)
 
 	/* Handle implicit connection setup */
 	if (unlikely(dest && sk->sk_state == TIPC_OPEN)) {
-		rc = __tipc_sendmsg(sock, m, dlen);
+		rc = __tipc_sendmsg(sock, m);
 		if (dlen && dlen == rc) {
 			tsk->peer_caps = tipc_node_get_capabilities(net, dnode);
 			tsk->snt_unacked = tsk_inc(tsk, dlen + msg_hdr_sz(hdr));
@@ -1643,18 +1642,17 @@ static int __tipc_sendstream(struct socket *sock, struct msghdr *m, size_t dlen)
  * tipc_send_packet - send a connection-oriented message
  * @sock: socket structure
  * @m: message to send
- * @dsz: length of data to be transmitted
  *
  * Used for SOCK_SEQPACKET messages.
  *
  * Return: the number of bytes sent on success, or errno otherwise
  */
-static int tipc_send_packet(struct socket *sock, struct msghdr *m, size_t dsz)
+static int tipc_send_packet(struct socket *sock, struct msghdr *m)
 {
-	if (dsz > TIPC_MAX_USER_MSG_SIZE)
+	if (msg_data_left(m) > TIPC_MAX_USER_MSG_SIZE)
 		return -EMSGSIZE;
 
-	return tipc_sendstream(sock, m, dsz);
+	return tipc_sendstream(sock, m);
 }
 
 /* tipc_sk_finish_conn - complete the setup of a connection
@@ -2625,7 +2623,7 @@ static int tipc_connect(struct socket *sock, struct sockaddr *dest,
 		if (!timeout)
 			m.msg_flags = MSG_DONTWAIT;
 
-		res = __tipc_sendmsg(sock, &m, 0);
+		res = __tipc_sendmsg(sock, &m);
 		if ((res < 0) && (res != -EWOULDBLOCK))
 			goto exit;
 
@@ -2781,7 +2779,7 @@ static int tipc_accept(struct socket *sock, struct socket *new_sock, int flags,
 		skb_set_owner_r(buf, new_sk);
 	}
 	iov_iter_kvec(&m.msg_iter, ITER_SOURCE, NULL, 0, 0);
-	__tipc_sendstream(new_sock, &m, 0);
+	__tipc_sendstream(new_sock, &m);
 	release_sock(new_sk);
 exit:
 	release_sock(sk);
diff --git a/net/tls/tls.h b/net/tls/tls.h
index 804c3880d028..a969955ddd7c 100644
--- a/net/tls/tls.h
+++ b/net/tls/tls.h
@@ -96,7 +96,7 @@ int tls_set_sw_offload(struct sock *sk, struct tls_context *ctx, int tx);
 void tls_update_rx_zc_capable(struct tls_context *tls_ctx);
 void tls_sw_strparser_arm(struct sock *sk, struct tls_context *ctx);
 void tls_sw_strparser_done(struct tls_context *tls_ctx);
-int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
+int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg);
 int tls_sw_sendpage_locked(struct sock *sk, struct page *page,
 			   int offset, size_t size, int flags);
 int tls_sw_sendpage(struct sock *sk, struct page *page,
@@ -114,7 +114,7 @@ ssize_t tls_sw_splice_read(struct socket *sock, loff_t *ppos,
 			   struct pipe_inode_info *pipe,
 			   size_t len, unsigned int flags);
 
-int tls_device_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
+int tls_device_sendmsg(struct sock *sk, struct msghdr *msg);
 int tls_device_sendpage(struct sock *sk, struct page *page,
 			int offset, size_t size, int flags);
 int tls_tx_records(struct sock *sk, int flags);
diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
index a7cc4f9faac2..3616dde20a96 100644
--- a/net/tls/tls_device.c
+++ b/net/tls/tls_device.c
@@ -566,7 +566,7 @@ static int tls_push_data(struct sock *sk,
 	return rc;
 }
 
-int tls_device_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
+int tls_device_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	unsigned char record_type = TLS_RECORD_TYPE_DATA;
 	struct tls_context *tls_ctx = tls_get_ctx(sk);
@@ -583,7 +583,8 @@ int tls_device_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 	}
 
 	iter.msg_iter = &msg->msg_iter;
-	rc = tls_push_data(sk, iter, size, msg->msg_flags, record_type, NULL);
+	rc = tls_push_data(sk, iter, msg_data_left(msg), msg->msg_flags,
+			   record_type, NULL);
 
 out:
 	release_sock(sk);
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 635b8bf6b937..17ea9b07a277 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -929,7 +929,7 @@ static int tls_sw_push_pending_record(struct sock *sk, int flags)
 				   &copied, flags);
 }
 
-int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
+int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	long timeo = sock_sndtimeo(sk, msg->msg_flags & MSG_DONTWAIT);
 	struct tls_context *tls_ctx = tls_get_ctx(sk);
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index fb31e8a4409e..37c96a73e6b4 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -756,20 +756,20 @@ static int unix_ioctl(struct socket *, unsigned int, unsigned long);
 static int unix_compat_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg);
 #endif
 static int unix_shutdown(struct socket *, int);
-static int unix_stream_sendmsg(struct socket *, struct msghdr *, size_t);
+static int unix_stream_sendmsg(struct socket *, struct msghdr *);
 static int unix_stream_recvmsg(struct socket *, struct msghdr *, size_t, int);
 static ssize_t unix_stream_sendpage(struct socket *, struct page *, int offset,
 				    size_t size, int flags);
 static ssize_t unix_stream_splice_read(struct socket *,  loff_t *ppos,
 				       struct pipe_inode_info *, size_t size,
 				       unsigned int flags);
-static int unix_dgram_sendmsg(struct socket *, struct msghdr *, size_t);
+static int unix_dgram_sendmsg(struct socket *, struct msghdr *);
 static int unix_dgram_recvmsg(struct socket *, struct msghdr *, size_t, int);
 static int unix_read_skb(struct sock *sk, skb_read_actor_t recv_actor);
 static int unix_stream_read_skb(struct sock *sk, skb_read_actor_t recv_actor);
 static int unix_dgram_connect(struct socket *, struct sockaddr *,
 			      int, int);
-static int unix_seqpacket_sendmsg(struct socket *, struct msghdr *, size_t);
+static int unix_seqpacket_sendmsg(struct socket *, struct msghdr *);
 static int unix_seqpacket_recvmsg(struct socket *, struct msghdr *, size_t,
 				  int);
 
@@ -1888,14 +1888,14 @@ static void scm_stat_del(struct sock *sk, struct sk_buff *skb)
  *	Send AF_UNIX data.
  */
 
-static int unix_dgram_sendmsg(struct socket *sock, struct msghdr *msg,
-			      size_t len)
+static int unix_dgram_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	DECLARE_SOCKADDR(struct sockaddr_un *, sunaddr, msg->msg_name);
 	struct sock *sk = sock->sk, *other = NULL;
 	struct unix_sock *u = unix_sk(sk);
 	struct scm_cookie scm;
 	struct sk_buff *skb;
+	size_t len = msg_data_left(msg);
 	int data_len = 0;
 	int sk_locked;
 	long timeo;
@@ -2157,11 +2157,11 @@ static int queue_oob(struct socket *sock, struct msghdr *msg, struct sock *other
 }
 #endif
 
-static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg,
-			       size_t len)
+static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct sock *other = NULL;
+	size_t len = msg_data_left(msg);
 	int err, size;
 	struct sk_buff *skb;
 	int sent = 0;
@@ -2388,8 +2388,7 @@ static ssize_t unix_stream_sendpage(struct socket *socket, struct page *page,
 	return err;
 }
 
-static int unix_seqpacket_sendmsg(struct socket *sock, struct msghdr *msg,
-				  size_t len)
+static int unix_seqpacket_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	int err;
 	struct sock *sk = sock->sk;
@@ -2404,7 +2403,7 @@ static int unix_seqpacket_sendmsg(struct socket *sock, struct msghdr *msg,
 	if (msg->msg_namelen)
 		msg->msg_namelen = 0;
 
-	return unix_dgram_sendmsg(sock, msg, len);
+	return unix_dgram_sendmsg(sock, msg);
 }
 
 static int unix_seqpacket_recvmsg(struct socket *sock, struct msghdr *msg,
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 19aea7cba26e..20bac3e04abd 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1131,8 +1131,7 @@ static __poll_t vsock_poll(struct file *file, struct socket *sock,
 	return mask;
 }
 
-static int vsock_dgram_sendmsg(struct socket *sock, struct msghdr *msg,
-			       size_t len)
+static int vsock_dgram_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	int err;
 	struct sock *sk;
@@ -1198,7 +1197,7 @@ static int vsock_dgram_sendmsg(struct socket *sock, struct msghdr *msg,
 		goto out;
 	}
 
-	err = transport->dgram_enqueue(vsk, remote_addr, msg, len);
+	err = transport->dgram_enqueue(vsk, remote_addr, msg, msg_data_left(msg));
 
 out:
 	release_sock(sk);
@@ -1737,8 +1736,7 @@ static int vsock_connectible_getsockopt(struct socket *sock,
 	return 0;
 }
 
-static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
-				     size_t len)
+static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk;
 	struct vsock_sock *vsk;
@@ -1794,7 +1792,7 @@ static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
 	if (err < 0)
 		goto out;
 
-	while (total_written < len) {
+	while (msg_data_left(msg)) {
 		ssize_t written;
 
 		add_wait_queue(sk_sleep(sk), &wait);
@@ -1856,10 +1854,10 @@ static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
 
 		if (sk->sk_type == SOCK_SEQPACKET) {
 			written = transport->seqpacket_enqueue(vsk,
-						msg, len - total_written);
+					msg, msg_data_left(msg));
 		} else {
 			written = transport->stream_enqueue(vsk,
-					msg, len - total_written);
+					msg, msg_data_left(msg));
 		}
 
 		if (written < 0) {
@@ -1882,7 +1880,7 @@ static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
 		 * 1) SOCK_STREAM socket.
 		 * 2) SOCK_SEQPACKET socket when whole buffer is sent.
 		 */
-		if (sk->sk_type == SOCK_STREAM || total_written == len)
+		if (sk->sk_type == SOCK_STREAM || !msg_data_left(msg))
 			err = total_written;
 	}
 out:
diff --git a/net/x25/af_x25.c b/net/x25/af_x25.c
index 5c7ad301d742..5b8751669136 100644
--- a/net/x25/af_x25.c
+++ b/net/x25/af_x25.c
@@ -1100,7 +1100,7 @@ int x25_rx_call_request(struct sk_buff *skb, struct x25_neigh *nb,
 	goto out;
 }
 
-static int x25_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int x25_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct x25_sock *x25 = x25_sk(sk);
@@ -1108,6 +1108,7 @@ static int x25_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 	struct sockaddr_x25 sx25;
 	struct sk_buff *skb;
 	unsigned char *asmptr;
+	size_t len = msg_data_left(msg);
 	int noblock = msg->msg_flags & MSG_DONTWAIT;
 	size_t size;
 	int qbit = 0, rc = -EINVAL;
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 2ac58b282b5e..db82e2a287f5 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -629,7 +629,7 @@ static int xsk_check_common(struct xdp_sock *xs)
 	return 0;
 }
 
-static int __xsk_sendmsg(struct socket *sock, struct msghdr *m, size_t total_len)
+static int __xsk_sendmsg(struct socket *sock, struct msghdr *m)
 {
 	bool need_wait = !(m->msg_flags & MSG_DONTWAIT);
 	struct sock *sk = sock->sk;
@@ -663,12 +663,12 @@ static int __xsk_sendmsg(struct socket *sock, struct msghdr *m, size_t total_len
 	return 0;
 }
 
-static int xsk_sendmsg(struct socket *sock, struct msghdr *m, size_t total_len)
+static int xsk_sendmsg(struct socket *sock, struct msghdr *m)
 {
 	int ret;
 
 	rcu_read_lock();
-	ret = __xsk_sendmsg(sock, m, total_len);
+	ret = __xsk_sendmsg(sock, m);
 	rcu_read_unlock();
 
 	return ret;
diff --git a/net/xfrm/espintcp.c b/net/xfrm/espintcp.c
index 872b80188e83..d07faa356347 100644
--- a/net/xfrm/espintcp.c
+++ b/net/xfrm/espintcp.c
@@ -311,13 +311,14 @@ int espintcp_push_skb(struct sock *sk, struct sk_buff *skb)
 }
 EXPORT_SYMBOL_GPL(espintcp_push_skb);
 
-static int espintcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
+static int espintcp_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	long timeo = sock_sndtimeo(sk, msg->msg_flags & MSG_DONTWAIT);
 	struct espintcp_ctx *ctx = espintcp_getctx(sk);
 	struct espintcp_msg *emsg = &ctx->partial;
 	struct iov_iter pfx_iter;
 	struct kvec pfx_iov = {};
+	size_t size = msg_data_left(msg);
 	size_t msglen = size + 2;
 	char buf[2] = {0};
 	int err, end;
@@ -325,7 +326,7 @@ static int espintcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 	if (msg->msg_flags & ~MSG_DONTWAIT)
 		return -EOPNOTSUPP;
 
-	if (size > MAX_ESPINTCP_MSG)
+	if (msg_data_left(msg) > MAX_ESPINTCP_MSG)
 		return -EMSGSIZE;
 
 	if (msg->msg_controllen)
@@ -362,7 +363,8 @@ static int espintcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 	if (err < 0)
 		goto fail;
 
-	err = sk_msg_memcopy_from_iter(sk, &msg->msg_iter, &emsg->skmsg, size);
+	err = sk_msg_memcopy_from_iter(sk, &msg->msg_iter, &emsg->skmsg,
+				       msg_data_left(msg));
 	if (err < 0)
 		goto fail;
 
diff --git a/security/apparmor/lsm.c b/security/apparmor/lsm.c
index d6cc4812ca53..cb220a8e8126 100644
--- a/security/apparmor/lsm.c
+++ b/security/apparmor/lsm.c
@@ -997,10 +997,10 @@ static int aa_sock_msg_perm(const char *op, u32 request, struct socket *sock,
 /**
  * apparmor_socket_sendmsg - check perms before sending msg to another socket
  */
-static int apparmor_socket_sendmsg(struct socket *sock,
-				   struct msghdr *msg, int size)
+static int apparmor_socket_sendmsg(struct socket *sock, struct msghdr *msg)
 {
-	return aa_sock_msg_perm(OP_SENDMSG, AA_MAY_SEND, sock, msg, size);
+	return aa_sock_msg_perm(OP_SENDMSG, AA_MAY_SEND, sock, msg,
+				msg_data_left(msg));
 }
 
 /**
diff --git a/security/security.c b/security/security.c
index cf6cc576736f..faa87f363af8 100644
--- a/security/security.c
+++ b/security/security.c
@@ -2301,9 +2301,9 @@ int security_socket_accept(struct socket *sock, struct socket *newsock)
 	return call_int_hook(socket_accept, 0, sock, newsock);
 }
 
-int security_socket_sendmsg(struct socket *sock, struct msghdr *msg, int size)
+int security_socket_sendmsg(struct socket *sock, struct msghdr *msg)
 {
-	return call_int_hook(socket_sendmsg, 0, sock, msg, size);
+	return call_int_hook(socket_sendmsg, 0, sock, msg);
 }
 
 int security_socket_recvmsg(struct socket *sock, struct msghdr *msg,
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 9a5bdfc21314..ff0d82e6331d 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -4912,8 +4912,7 @@ static int selinux_socket_accept(struct socket *sock, struct socket *newsock)
 	return 0;
 }
 
-static int selinux_socket_sendmsg(struct socket *sock, struct msghdr *msg,
-				  int size)
+static int selinux_socket_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	return sock_has_perm(sock->sk, SOCKET__WRITE);
 }
diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c
index cfcbb748da25..ca30c105f254 100644
--- a/security/smack/smack_lsm.c
+++ b/security/smack/smack_lsm.c
@@ -3730,14 +3730,12 @@ static int smack_unix_may_send(struct socket *sock, struct socket *other)
  * smack_socket_sendmsg - Smack check based on destination host
  * @sock: the socket
  * @msg: the message
- * @size: the size of the message
  *
  * Return 0 if the current subject can write to the destination host.
  * For IPv4 this is only a question if the destination is a single label host.
  * For IPv6 this is a check against the label of the port.
  */
-static int smack_socket_sendmsg(struct socket *sock, struct msghdr *msg,
-				int size)
+static int smack_socket_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sockaddr_in *sip = (struct sockaddr_in *) msg->msg_name;
 #if IS_ENABLED(CONFIG_IPV6)
diff --git a/security/tomoyo/common.h b/security/tomoyo/common.h
index ca285f362705..0841098d966a 100644
--- a/security/tomoyo/common.h
+++ b/security/tomoyo/common.h
@@ -997,8 +997,7 @@ int tomoyo_socket_bind_permission(struct socket *sock, struct sockaddr *addr,
 int tomoyo_socket_connect_permission(struct socket *sock,
 				     struct sockaddr *addr, int addr_len);
 int tomoyo_socket_listen_permission(struct socket *sock);
-int tomoyo_socket_sendmsg_permission(struct socket *sock, struct msghdr *msg,
-				     int size);
+int tomoyo_socket_sendmsg_permission(struct socket *sock, struct msghdr *msg);
 int tomoyo_supervisor(struct tomoyo_request_info *r, const char *fmt, ...)
 	__printf(2, 3);
 int tomoyo_update_domain(struct tomoyo_acl_info *new_entry, const int size,
diff --git a/security/tomoyo/network.c b/security/tomoyo/network.c
index 8dc61335f65e..0315b335cdff 100644
--- a/security/tomoyo/network.c
+++ b/security/tomoyo/network.c
@@ -751,12 +751,10 @@ int tomoyo_socket_bind_permission(struct socket *sock, struct sockaddr *addr,
  *
  * @sock: Pointer to "struct socket".
  * @msg:  Pointer to "struct msghdr".
- * @size: Unused.
  *
  * Returns 0 on success, negative value otherwise.
  */
-int tomoyo_socket_sendmsg_permission(struct socket *sock, struct msghdr *msg,
-				     int size)
+int tomoyo_socket_sendmsg_permission(struct socket *sock, struct msghdr *msg)
 {
 	struct tomoyo_addr_info address;
 	const u8 family = tomoyo_sock_family(sock->sk);
diff --git a/security/tomoyo/tomoyo.c b/security/tomoyo/tomoyo.c
index af04a7b7eb28..72c6f343ffba 100644
--- a/security/tomoyo/tomoyo.c
+++ b/security/tomoyo/tomoyo.c
@@ -489,14 +489,12 @@ static int tomoyo_socket_bind(struct socket *sock, struct sockaddr *addr,
  *
  * @sock: Pointer to "struct socket".
  * @msg:  Pointer to "struct msghdr".
- * @size: Size of message.
  *
  * Returns 0 on success, negative value otherwise.
  */
-static int tomoyo_socket_sendmsg(struct socket *sock, struct msghdr *msg,
-				 int size)
+static int tomoyo_socket_sendmsg(struct socket *sock, struct msghdr *msg)
 {
-	return tomoyo_socket_sendmsg_permission(sock, msg, size);
+	return tomoyo_socket_sendmsg_permission(sock, msg);
 }
 
 struct lsm_blob_sizes tomoyo_blob_sizes __lsm_ro_after_init = {



* [RFC PATCH 1/3] net: Drop the size argument from ->sendmsg()
@ 2023-03-22 13:56             ` David Howells
From: David Howells @ 2023-03-22 13:56 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: kvm, virtualization, David Howells, Eric Dumazet, linux-afs,
	linux-s390, rds-devel, linux-x25, dccp, linux-rdma,
	linux-security-module, Matthew Wilcox, linux-wpan,
	Jakub Kicinski, Paolo Abeni, selinux, linux-arm-msm, apparmor,
	linux-can, xen-devel, linux-hams, mptcp, netdev, Jeff Layton,
	linux-kernel, linux-bluetooth, linux-sctp, tipc-discussion,
	linux-crypto

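The size argument to ->sendmsg() ought to be redundant, as the same
information should already be conveyed by msg->msg_iter.count, which is
what msg_data_left() returns.  Drop the argument and have the
implementations that need the length obtain it from msg_data_left()
instead.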
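
As a rough illustration of the conversion pattern (not code from this patch),
the sketch below uses userspace mock-ups of struct iov_iter and struct msghdr
that keep only the count field.  The demo_* functions and DEMO_MAX_MSG_SIZE
are hypothetical names invented for the example; only msg_data_left() mirrors
a real kernel helper, and here it is reimplemented against the mock types.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Mock-ups (assumption): the real kernel structs carry far more state. */
struct iov_iter {
	size_t count;		/* bytes remaining in the iterator */
};

struct msghdr {
	struct iov_iter msg_iter;
};

/* Same idea as the kernel helper: report what is left in msg->msg_iter. */
static inline size_t msg_data_left(const struct msghdr *msg)
{
	return msg->msg_iter.count;
}

#define DEMO_MAX_MSG_SIZE 66000	/* hypothetical protocol limit */

/* Old style: the caller passes the length alongside the message. */
int demo_sendmsg_old(struct msghdr *msg, size_t len)
{
	(void)msg;		/* length came from the extra argument */
	if (len > DEMO_MAX_MSG_SIZE)
		return -EMSGSIZE;
	return (int)len;	/* pretend everything was sent */
}

/* New style: the length is derived from the iterator inside the callee. */
int demo_sendmsg_new(struct msghdr *msg)
{
	size_t len = msg_data_left(msg);

	if (len > DEMO_MAX_MSG_SIZE)
		return -EMSGSIZE;
	return (int)len;	/* pretend everything was sent */
}
```

The two variants agree whenever the caller's len equals msg->msg_iter.count,
which is the assumption the patch relies on.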

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Eric Dumazet <edumazet@google.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: netdev@vger.kernel.org
cc: apparmor@lists.ubuntu.com
cc: bpf@vger.kernel.org
cc: dccp@vger.kernel.org
cc: kvm@vger.kernel.org
cc: linux-afs@lists.infradead.org
cc: linux-arm-msm@vger.kernel.org
cc: linux-bluetooth@vger.kernel.org
cc: linux-can@vger.kernel.org
cc: linux-crypto@vger.kernel.org
cc: linux-hams@vger.kernel.org
cc: linux-rdma@vger.kernel.org
cc: linux-s390@vger.kernel.org
cc: linux-sctp@vger.kernel.org
cc: linux-security-module@vger.kernel.org
cc: linux-wpan@vger.kernel.org
cc: linux-x25@vger.kernel.org
cc: mptcp@lists.linux.dev
cc: rds-devel@oss.oracle.com
cc: selinux@vger.kernel.org
cc: tipc-discussion@lists.sourceforge.net
cc: virtualization@lists.linux-foundation.org
cc: xen-devel@lists.xenproject.org
---
 crypto/af_alg.c                               | 12 +++----
 crypto/algif_aead.c                           |  9 +++--
 crypto/algif_hash.c                           |  8 ++---
 crypto/algif_rng.c                            |  3 +-
 crypto/algif_skcipher.c                       | 10 +++---
 drivers/isdn/mISDN/socket.c                   |  3 +-
 .../chelsio/inline_crypto/chtls/chtls.h       |  2 +-
 .../chelsio/inline_crypto/chtls/chtls_io.c    | 15 ++++----
 drivers/net/ppp/pppoe.c                       |  4 +--
 drivers/net/tap.c                             |  3 +-
 drivers/net/tun.c                             |  3 +-
 drivers/vhost/net.c                           |  6 ++--
 drivers/xen/pvcalls-back.c                    |  2 +-
 drivers/xen/pvcalls-front.c                   |  4 +--
 drivers/xen/pvcalls-front.h                   |  3 +-
 fs/afs/rxrpc.c                                |  8 ++---
 include/crypto/if_alg.h                       |  3 +-
 include/linux/lsm_hook_defs.h                 |  3 +-
 include/linux/lsm_hooks.h                     |  1 -
 include/linux/net.h                           |  6 ++--
 include/linux/security.h                      |  4 +--
 include/net/af_rxrpc.h                        |  3 +-
 include/net/inet_common.h                     |  2 +-
 include/net/ipv6.h                            |  2 +-
 include/net/ping.h                            |  2 +-
 include/net/sock.h                            |  7 ++--
 include/net/tcp.h                             |  8 ++---
 include/net/udp.h                             |  2 +-
 net/appletalk/ddp.c                           |  3 +-
 net/atm/common.c                              |  3 +-
 net/atm/common.h                              |  2 +-
 net/ax25/af_ax25.c                            |  4 +--
 net/bluetooth/hci_sock.c                      |  4 +--
 net/bluetooth/iso.c                           |  4 +--
 net/bluetooth/l2cap_sock.c                    |  5 ++-
 net/bluetooth/rfcomm/sock.c                   |  7 ++--
 net/bluetooth/sco.c                           |  4 +--
 net/caif/caif_socket.c                        | 13 +++----
 net/can/bcm.c                                 |  3 +-
 net/can/isotp.c                               |  3 +-
 net/can/j1939/socket.c                        |  4 +--
 net/can/raw.c                                 |  3 +-
 net/core/sock.c                               |  4 +--
 net/dccp/dccp.h                               |  2 +-
 net/dccp/proto.c                              |  3 +-
 net/ieee802154/socket.c                       | 11 +++---
 net/ipv4/af_inet.c                            |  4 +--
 net/ipv4/ping.c                               |  8 +++--
 net/ipv4/raw.c                                |  3 +-
 net/ipv4/tcp.c                                | 17 +++++-----
 net/ipv4/tcp_bpf.c                            |  5 +--
 net/ipv4/tcp_input.c                          |  3 +-
 net/ipv4/udp.c                                |  5 +--
 net/ipv6/af_inet6.c                           |  7 ++--
 net/ipv6/ping.c                               |  5 +--
 net/ipv6/raw.c                                |  3 +-
 net/ipv6/udp.c                                |  7 ++--
 net/ipv6/udp_impl.h                           |  2 +-
 net/iucv/af_iucv.c                            |  4 +--
 net/kcm/kcmsock.c                             |  2 +-
 net/key/af_key.c                              |  3 +-
 net/l2tp/l2tp_ip.c                            |  3 +-
 net/l2tp/l2tp_ip6.c                           |  3 +-
 net/l2tp/l2tp_ppp.c                           |  4 +--
 net/llc/af_llc.c                              |  5 ++-
 net/mctp/af_mctp.c                            |  3 +-
 net/mptcp/protocol.c                          |  8 ++---
 net/netlink/af_netlink.c                      | 11 +++---
 net/netrom/af_netrom.c                        |  3 +-
 net/nfc/llcp_sock.c                           |  7 ++--
 net/nfc/rawsock.c                             |  3 +-
 net/packet/af_packet.c                        | 11 +++---
 net/phonet/datagram.c                         |  3 +-
 net/phonet/pep.c                              |  3 +-
 net/phonet/socket.c                           |  5 ++-
 net/qrtr/af_qrtr.c                            |  4 +--
 net/rds/rds.h                                 |  2 +-
 net/rds/send.c                                |  3 +-
 net/rose/af_rose.c                            |  3 +-
 net/rxrpc/af_rxrpc.c                          |  6 ++--
 net/rxrpc/ar-internal.h                       |  2 +-
 net/rxrpc/output.c                            | 22 ++++++------
 net/rxrpc/rxperf.c                            |  4 +--
 net/rxrpc/sendmsg.c                           | 15 ++++----
 net/sctp/socket.c                             |  3 +-
 net/smc/af_smc.c                              |  5 +--
 net/socket.c                                  | 16 ++++-----
 net/tipc/socket.c                             | 34 +++++++++----------
 net/tls/tls.h                                 |  4 +--
 net/tls/tls_device.c                          |  5 +--
 net/tls/tls_sw.c                              |  2 +-
 net/unix/af_unix.c                            | 19 +++++------
 net/vmw_vsock/af_vsock.c                      | 16 ++++-----
 net/x25/af_x25.c                              |  3 +-
 net/xdp/xsk.c                                 |  6 ++--
 net/xfrm/espintcp.c                           |  8 +++--
 security/apparmor/lsm.c                       |  6 ++--
 security/security.c                           |  4 +--
 security/selinux/hooks.c                      |  3 +-
 security/smack/smack_lsm.c                    |  4 +--
 security/tomoyo/common.h                      |  3 +-
 security/tomoyo/network.c                     |  4 +--
 security/tomoyo/tomoyo.c                      |  6 ++--
 103 files changed, 286 insertions(+), 296 deletions(-)
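
Note on the conversion pattern used throughout this diff: handlers that used
to receive an explicit size now read the remaining length from the message
iterator via msg_data_left(), which in the kernel is a thin wrapper around
iov_iter_count(&msg->msg_iter). The following is only a userspace sketch of
that idea, not kernel code. The struct definitions are cut-down stand-ins, and
mock_consume(), send_old() and send_new() are hypothetical names made up for
illustration. It also assumes the old size argument always equalled the
iterator count on entry, which callers passing 0 (such as the vhost batch
path) did not do.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down userspace stand-ins for the kernel types (assumption: only the
 * byte count matters for this illustration). */
struct iov_iter { size_t count; };
struct msghdr { struct iov_iter msg_iter; };

static size_t iov_iter_count(const struct iov_iter *i)
{
	return i->count;
}

/* Mirrors the kernel helper: bytes still to be consumed from the message. */
static size_t msg_data_left(const struct msghdr *msg)
{
	return iov_iter_count(&msg->msg_iter);
}

/* Hypothetical helper: pretend to copy up to n bytes out of the iterator. */
static size_t mock_consume(struct msghdr *msg, size_t n)
{
	size_t left = msg_data_left(msg);

	if (n > left)
		n = left;
	msg->msg_iter.count -= n;
	return n;
}

/* Old style: a separate size is tracked and decremented next to the iterator. */
static size_t send_old(struct msghdr *msg, size_t size, size_t chunk)
{
	size_t copied = 0;

	assert(chunk > 0);
	while (size) {
		size_t n = mock_consume(msg, chunk < size ? chunk : size);

		if (!n)
			break;	/* iterator ran dry before size did */
		copied += n;
		size -= n;
	}
	return copied;
}

/* New style: the iterator is the single source of truth for what is left. */
static size_t send_new(struct msghdr *msg, size_t chunk)
{
	size_t copied = 0, len;

	assert(chunk > 0);
	while ((len = msg_data_left(msg)))
		copied += mock_consume(msg, chunk < len ? chunk : len);
	return copied;
}
```

With the size derived from the iterator, there is no second counter that can
drift out of step with msg_iter, which is why the "size -= len" lines simply
disappear in the hunks below.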

diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index 5f7252a5b7b4..dc49b4e2d719 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -952,19 +952,18 @@ static void af_alg_data_wakeup(struct sock *sk)
  *
  * @sock: socket of connection to user space
  * @msg: message from user space
- * @size: size of message from user space
  * @ivsize: the size of the IV for the cipher operation to verify that the
  *	   user-space-provided IV has the right size
  * Return: the number of copied data upon success, < 0 upon error
  */
-int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
-		   unsigned int ivsize)
+int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, unsigned int ivsize)
 {
 	struct sock *sk = sock->sk;
 	struct alg_sock *ask = alg_sk(sk);
 	struct af_alg_ctx *ctx = ask->private;
 	struct af_alg_tsgl *sgl;
 	struct af_alg_control con = {};
+	size_t len;
 	long copied = 0;
 	bool enc = false;
 	bool init = false;
@@ -1012,9 +1011,8 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
 		ctx->aead_assoclen = con.aead_assoclen;
 	}
 
-	while (size) {
+	while ((len = msg_data_left(msg))) {
 		struct scatterlist *sg;
-		size_t len = size;
 		size_t plen;
 
 		/* use the existing memory in an allocated page */
@@ -1037,7 +1035,6 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
 
 			ctx->used += len;
 			copied += len;
-			size -= len;
 			continue;
 		}
 
@@ -1086,11 +1083,10 @@ int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
 			len -= plen;
 			ctx->used += plen;
 			copied += plen;
-			size -= plen;
 			sgl->cur++;
 		} while (len && sgl->cur < MAX_SGL_ENTS);
 
-		if (!size)
+		if (!msg_data_left(msg))
 			sg_mark_end(sg + sgl->cur - 1);
 
 		ctx->merge = plen & (PAGE_SIZE - 1);
diff --git a/crypto/algif_aead.c b/crypto/algif_aead.c
index 42493b4d8ce4..1005c755c4c8 100644
--- a/crypto/algif_aead.c
+++ b/crypto/algif_aead.c
@@ -58,7 +58,7 @@ static inline bool aead_sufficient_data(struct sock *sk)
 	return ctx->used >= ctx->aead_assoclen + (ctx->enc ? 0 : as);
 }
 
-static int aead_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
+static int aead_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct alg_sock *ask = alg_sk(sk);
@@ -68,7 +68,7 @@ static int aead_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
 	struct crypto_aead *tfm = aeadc->aead;
 	unsigned int ivsize = crypto_aead_ivsize(tfm);
 
-	return af_alg_sendmsg(sock, msg, size, ivsize);
+	return af_alg_sendmsg(sock, msg, ivsize);
 }
 
 static int crypto_aead_copy_sgl(struct crypto_sync_skcipher *null_tfm,
@@ -408,8 +408,7 @@ static int aead_check_key(struct socket *sock)
 	return err;
 }
 
-static int aead_sendmsg_nokey(struct socket *sock, struct msghdr *msg,
-				  size_t size)
+static int aead_sendmsg_nokey(struct socket *sock, struct msghdr *msg)
 {
 	int err;
 
@@ -417,7 +416,7 @@ static int aead_sendmsg_nokey(struct socket *sock, struct msghdr *msg,
 	if (err)
 		return err;
 
-	return aead_sendmsg(sock, msg, size);
+	return aead_sendmsg(sock, msg);
 }
 
 static ssize_t aead_sendpage_nokey(struct socket *sock, struct page *page,
diff --git a/crypto/algif_hash.c b/crypto/algif_hash.c
index 1d017ec5c63c..9817adecdf1a 100644
--- a/crypto/algif_hash.c
+++ b/crypto/algif_hash.c
@@ -60,8 +60,7 @@ static void hash_free_result(struct sock *sk, struct hash_ctx *ctx)
 	ctx->result = NULL;
 }
 
-static int hash_sendmsg(struct socket *sock, struct msghdr *msg,
-			size_t ignored)
+static int hash_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	int limit = ALG_MAX_PAGES * PAGE_SIZE;
 	struct sock *sk = sock->sk;
@@ -325,8 +324,7 @@ static int hash_check_key(struct socket *sock)
 	return err;
 }
 
-static int hash_sendmsg_nokey(struct socket *sock, struct msghdr *msg,
-			      size_t size)
+static int hash_sendmsg_nokey(struct socket *sock, struct msghdr *msg)
 {
 	int err;
 
@@ -334,7 +332,7 @@ static int hash_sendmsg_nokey(struct socket *sock, struct msghdr *msg,
 	if (err)
 		return err;
 
-	return hash_sendmsg(sock, msg, size);
+	return hash_sendmsg(sock, msg);
 }
 
 static ssize_t hash_sendpage_nokey(struct socket *sock, struct page *page,
diff --git a/crypto/algif_rng.c b/crypto/algif_rng.c
index 407408c43730..f838be6c2fd7 100644
--- a/crypto/algif_rng.c
+++ b/crypto/algif_rng.c
@@ -130,11 +130,12 @@ static int rng_test_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
 	return ret;
 }
 
-static int rng_test_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int rng_test_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	int err;
 	struct alg_sock *ask = alg_sk(sock->sk);
 	struct rng_ctx *ctx = ask->private;
+	size_t len = msg_data_left(msg);
 
 	lock_sock(sock->sk);
 	if (len > MAXSIZE) {
diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c
index ee8890ee8f33..f5cd9dbbad1b 100644
--- a/crypto/algif_skcipher.c
+++ b/crypto/algif_skcipher.c
@@ -34,8 +34,7 @@
 #include <linux/net.h>
 #include <net/sock.h>
 
-static int skcipher_sendmsg(struct socket *sock, struct msghdr *msg,
-			    size_t size)
+static int skcipher_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct alg_sock *ask = alg_sk(sk);
@@ -44,7 +43,7 @@ static int skcipher_sendmsg(struct socket *sock, struct msghdr *msg,
 	struct crypto_skcipher *tfm = pask->private;
 	unsigned ivsize = crypto_skcipher_ivsize(tfm);
 
-	return af_alg_sendmsg(sock, msg, size, ivsize);
+	return af_alg_sendmsg(sock, msg, ivsize);
 }
 
 static int _skcipher_recvmsg(struct socket *sock, struct msghdr *msg,
@@ -234,8 +233,7 @@ static int skcipher_check_key(struct socket *sock)
 	return err;
 }
 
-static int skcipher_sendmsg_nokey(struct socket *sock, struct msghdr *msg,
-				  size_t size)
+static int skcipher_sendmsg_nokey(struct socket *sock, struct msghdr *msg)
 {
 	int err;
 
@@ -243,7 +241,7 @@ static int skcipher_sendmsg_nokey(struct socket *sock, struct msghdr *msg,
 	if (err)
 		return err;
 
-	return skcipher_sendmsg(sock, msg, size);
+	return skcipher_sendmsg(sock, msg);
 }
 
 static ssize_t skcipher_sendpage_nokey(struct socket *sock, struct page *page,
diff --git a/drivers/isdn/mISDN/socket.c b/drivers/isdn/mISDN/socket.c
index 2776ca5fc33f..4c42d39e994a 100644
--- a/drivers/isdn/mISDN/socket.c
+++ b/drivers/isdn/mISDN/socket.c
@@ -164,10 +164,11 @@ mISDN_sock_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
 }
 
 static int
-mISDN_sock_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+mISDN_sock_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock		*sk = sock->sk;
 	struct sk_buff		*skb;
+	size_t			len = msg_data_left(msg);
 	int			err = -ENOMEM;
 
 	if (*debug & DEBUG_SOCKET)
diff --git a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls.h b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls.h
index 41714203ace8..32077c61273b 100644
--- a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls.h
+++ b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls.h
@@ -565,7 +565,7 @@ void chtls_close(struct sock *sk, long timeout);
 int chtls_disconnect(struct sock *sk, int flags);
 void chtls_shutdown(struct sock *sk, int how);
 void chtls_destroy_sock(struct sock *sk);
-int chtls_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
+int chtls_sendmsg(struct sock *sk, struct msghdr *msg);
 int chtls_recvmsg(struct sock *sk, struct msghdr *msg,
 		  size_t len, int flags, int *addr_len);
 int chtls_sendpage(struct sock *sk, struct page *page,
diff --git a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_io.c b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_io.c
index ae6b17b96bf1..5782267618cf 100644
--- a/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_io.c
+++ b/drivers/net/ethernet/chelsio/inline_crypto/chtls/chtls_io.c
@@ -1004,7 +1004,7 @@ static int chtls_proccess_cmsg(struct sock *sk, struct msghdr *msg,
 	return rc;
 }
 
-int chtls_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
+int chtls_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct chtls_sock *csk = rcu_dereference_sk_user_data(sk);
 	struct chtls_dev *cdev = csk->cdev;
@@ -1058,7 +1058,7 @@ int chtls_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 					tx_skb_finalize(skb);
 			}
 
-			recordsz = size;
+			recordsz = msg_data_left(msg);
 			csk->tlshws.txleft = recordsz;
 			csk->tlshws.type = record_type;
 		}
@@ -1080,8 +1080,8 @@ int chtls_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 								 false);
 			} else {
 				skb = get_tx_skb(sk,
-						 select_size(sk, size, flags,
-							     TX_HEADER_LEN));
+						 select_size(sk, msg_data_left(msg),
+							     flags, TX_HEADER_LEN));
 			}
 			if (unlikely(!skb))
 				goto wait_for_memory;
@@ -1089,8 +1089,8 @@ int chtls_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 			skb->ip_summed = CHECKSUM_UNNECESSARY;
 			copy = mss;
 		}
-		if (copy > size)
-			copy = size;
+		if (copy > msg_data_left(msg))
+			copy = msg_data_left(msg);
 
 		if (skb_tailroom(skb) > 0) {
 			copy = min(copy, skb_tailroom(skb));
@@ -1182,7 +1182,6 @@ int chtls_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 			tx_skb_finalize(skb);
 		tp->write_seq += copy;
 		copied += copy;
-		size -= copy;
 
 		if (is_tls_tx(csk))
 			csk->tlshws.txleft -= copy;
@@ -1191,7 +1190,7 @@ int chtls_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 		    (sk_stream_wspace(sk) < sk_stream_min_wspace(sk)))
 			ULP_SKB_CB(skb)->flags |= ULPCB_FLAG_NO_APPEND;
 
-		if (size == 0)
+		if (msg_data_left(msg) == 0)
 			goto out;
 
 		if (ULP_SKB_CB(skb)->flags & ULPCB_FLAG_NO_APPEND)
diff --git a/drivers/net/ppp/pppoe.c b/drivers/net/ppp/pppoe.c
index ce2cbb5903d7..7ae28a1f528a 100644
--- a/drivers/net/ppp/pppoe.c
+++ b/drivers/net/ppp/pppoe.c
@@ -833,8 +833,7 @@ static int pppoe_ioctl(struct socket *sock, unsigned int cmd,
 	return err;
 }
 
-static int pppoe_sendmsg(struct socket *sock, struct msghdr *m,
-			 size_t total_len)
+static int pppoe_sendmsg(struct socket *sock, struct msghdr *m)
 {
 	struct sk_buff *skb;
 	struct sock *sk = sock->sk;
@@ -843,6 +842,7 @@ static int pppoe_sendmsg(struct socket *sock, struct msghdr *m,
 	struct pppoe_hdr hdr;
 	struct pppoe_hdr *ph;
 	struct net_device *dev;
+	size_t total_len = msg_data_left(m);
 	char *start;
 	int hlen;
 
diff --git a/drivers/net/tap.c b/drivers/net/tap.c
index ce993cc75bf3..2b076d4a1a58 100644
--- a/drivers/net/tap.c
+++ b/drivers/net/tap.c
@@ -1224,8 +1224,7 @@ static int tap_get_user_xdp(struct tap_queue *q, struct xdp_buff *xdp)
 	return err;
 }
 
-static int tap_sendmsg(struct socket *sock, struct msghdr *m,
-		       size_t total_len)
+static int tap_sendmsg(struct socket *sock, struct msghdr *m)
 {
 	struct tap_queue *q = container_of(sock, struct tap_queue, sock);
 	struct tun_msg_ctl *ctl = m->msg_control;
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 4c7f74904c25..b31d696adafd 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -2531,13 +2531,14 @@ static int tun_xdp_one(struct tun_struct *tun,
 	return ret;
 }
 
-static int tun_sendmsg(struct socket *sock, struct msghdr *m, size_t total_len)
+static int tun_sendmsg(struct socket *sock, struct msghdr *m)
 {
 	int ret, i;
 	struct tun_file *tfile = container_of(sock, struct tun_file, socket);
 	struct tun_struct *tun = tun_get(tfile);
 	struct tun_msg_ctl *ctl = m->msg_control;
 	struct xdp_buff *xdp;
+	size_t total_len = msg_data_left(m);
 
 	if (!tun)
 		return -EBADFD;
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 07181cd8d52e..ddf01a21f208 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -476,7 +476,7 @@ static void vhost_tx_batch(struct vhost_net *net,
 
 	msghdr->msg_control = &ctl;
 	msghdr->msg_controllen = sizeof(ctl);
-	err = sock->ops->sendmsg(sock, msghdr, 0);
+	err = sock->ops->sendmsg(sock, msghdr);
 	if (unlikely(err < 0)) {
 		vq_err(&nvq->vq, "Fail to batch sending packets\n");
 
@@ -836,7 +836,7 @@ static void handle_tx_copy(struct vhost_net *net, struct socket *sock)
 				msg.msg_flags &= ~MSG_MORE;
 		}
 
-		err = sock->ops->sendmsg(sock, &msg, len);
+		err = sock->ops->sendmsg(sock, &msg);
 		if (unlikely(err < 0)) {
 			if (err == -EAGAIN || err == -ENOMEM || err == -ENOBUFS) {
 				vhost_discard_vq_desc(vq, 1);
@@ -933,7 +933,7 @@ static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock)
 			msg.msg_flags &= ~MSG_MORE;
 		}
 
-		err = sock->ops->sendmsg(sock, &msg, len);
+		err = sock->ops->sendmsg(sock, &msg);
 		if (unlikely(err < 0)) {
 			if (zcopy_used) {
 				if (vq->heads[ubuf->desc].len == VHOST_DMA_IN_PROGRESS)
diff --git a/drivers/xen/pvcalls-back.c b/drivers/xen/pvcalls-back.c
index 1f5219e12cc3..37cfd15b6d9d 100644
--- a/drivers/xen/pvcalls-back.c
+++ b/drivers/xen/pvcalls-back.c
@@ -200,7 +200,7 @@ static bool pvcalls_conn_back_write(struct sock_mapping *map)
 		iov_iter_kvec(&msg.msg_iter, ITER_SOURCE, vec, 2, size);
 	}
 
-	ret = inet_sendmsg(map->sock, &msg, size);
+	ret = inet_sendmsg(map->sock, &msg);
 	if (ret == -EAGAIN) {
 		atomic_inc(&map->write);
 		atomic_inc(&map->io);
diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index d5d589bda243..257d92612371 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -531,10 +531,10 @@ static int __write_ring(struct pvcalls_data_intf *intf,
 	return len;
 }
 
-int pvcalls_front_sendmsg(struct socket *sock, struct msghdr *msg,
-			  size_t len)
+int pvcalls_front_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock_mapping *map;
+	size_t len = msg_data_left(msg);
 	int sent, tot_sent = 0;
 	int count = 0, flags;
 
diff --git a/drivers/xen/pvcalls-front.h b/drivers/xen/pvcalls-front.h
index f694ad77379f..f0c5429604e6 100644
--- a/drivers/xen/pvcalls-front.h
+++ b/drivers/xen/pvcalls-front.h
@@ -14,8 +14,7 @@ int pvcalls_front_accept(struct socket *sock,
 			 struct socket *newsock,
 			 int flags);
 int pvcalls_front_sendmsg(struct socket *sock,
-			  struct msghdr *msg,
-			  size_t len);
+			  struct msghdr *msg);
 int pvcalls_front_recvmsg(struct socket *sock,
 			  struct msghdr *msg,
 			  size_t len,
diff --git a/fs/afs/rxrpc.c b/fs/afs/rxrpc.c
index 7817e2b860e5..95ef04862025 100644
--- a/fs/afs/rxrpc.c
+++ b/fs/afs/rxrpc.c
@@ -367,8 +367,7 @@ void afs_make_call(struct afs_addr_cursor *ac, struct afs_call *call, gfp_t gfp)
 	msg.msg_flags		= MSG_WAITALL | (call->write_iter ? MSG_MORE : 0);
 
 	ret = rxrpc_kernel_send_data(call->net->socket, rxcall,
-				     &msg, call->request_size,
-				     afs_notify_end_request_tx);
+				     &msg, afs_notify_end_request_tx);
 	if (ret < 0)
 		goto error_do_abort;
 
@@ -379,7 +378,6 @@ void afs_make_call(struct afs_addr_cursor *ac, struct afs_call *call, gfp_t gfp)
 
 		ret = rxrpc_kernel_send_data(call->net->socket,
 					     call->rxcall, &msg,
-					     iov_iter_count(&msg.msg_iter),
 					     afs_notify_end_request_tx);
 		*call->write_iter = msg.msg_iter;
 
@@ -834,7 +832,7 @@ void afs_send_empty_reply(struct afs_call *call)
 	msg.msg_controllen	= 0;
 	msg.msg_flags		= 0;
 
-	switch (rxrpc_kernel_send_data(net->socket, call->rxcall, &msg, 0,
+	switch (rxrpc_kernel_send_data(net->socket, call->rxcall, &msg,
 				       afs_notify_end_reply_tx)) {
 	case 0:
 		_leave(" [replied]");
@@ -875,7 +873,7 @@ void afs_send_simple_reply(struct afs_call *call, const void *buf, size_t len)
 	msg.msg_controllen	= 0;
 	msg.msg_flags		= 0;
 
-	n = rxrpc_kernel_send_data(net->socket, call->rxcall, &msg, len,
+	n = rxrpc_kernel_send_data(net->socket, call->rxcall, &msg,
 				   afs_notify_end_reply_tx);
 	if (n >= 0) {
 		/* Success */
diff --git a/include/crypto/if_alg.h b/include/crypto/if_alg.h
index 7e76623f9ec3..bcf0077aae6d 100644
--- a/include/crypto/if_alg.h
+++ b/include/crypto/if_alg.h
@@ -228,8 +228,7 @@ void af_alg_pull_tsgl(struct sock *sk, size_t used, struct scatterlist *dst,
 		      size_t dst_offset);
 void af_alg_wmem_wakeup(struct sock *sk);
 int af_alg_wait_for_data(struct sock *sk, unsigned flags, unsigned min);
-int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, size_t size,
-		   unsigned int ivsize);
+int af_alg_sendmsg(struct socket *sock, struct msghdr *msg, unsigned int ivsize);
 ssize_t af_alg_sendpage(struct socket *sock, struct page *page,
 			int offset, size_t size, int flags);
 void af_alg_free_resources(struct af_alg_async_req *areq);
diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
index 094b76dc7164..b176525025da 100644
--- a/include/linux/lsm_hook_defs.h
+++ b/include/linux/lsm_hook_defs.h
@@ -298,8 +298,7 @@ LSM_HOOK(int, 0, socket_connect, struct socket *sock, struct sockaddr *address,
 	 int addrlen)
 LSM_HOOK(int, 0, socket_listen, struct socket *sock, int backlog)
 LSM_HOOK(int, 0, socket_accept, struct socket *sock, struct socket *newsock)
-LSM_HOOK(int, 0, socket_sendmsg, struct socket *sock, struct msghdr *msg,
-	 int size)
+LSM_HOOK(int, 0, socket_sendmsg, struct socket *sock, struct msghdr *msg)
 LSM_HOOK(int, 0, socket_recvmsg, struct socket *sock, struct msghdr *msg,
 	 int size, int flags)
 LSM_HOOK(int, 0, socket_getsockname, struct socket *sock)
diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
index 6e156d2acffc..6f48be80b6bf 100644
--- a/include/linux/lsm_hooks.h
+++ b/include/linux/lsm_hooks.h
@@ -932,7 +932,6 @@
  *	Check permission before transmitting a message to another socket.
  *	@sock contains the socket structure.
  *	@msg contains the message to be transmitted.
- *	@size contains the size of message.
  *	Return 0 if permission is granted.
  * @socket_recvmsg:
  *	Check permission before receiving a message from a socket.
diff --git a/include/linux/net.h b/include/linux/net.h
index b73ad8e3c212..8adf1328445a 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -192,8 +192,7 @@ struct proto_ops {
 	int		(*getsockopt)(struct socket *sock, int level,
 				      int optname, char __user *optval, int __user *optlen);
 	void		(*show_fdinfo)(struct seq_file *m, struct socket *sock);
-	int		(*sendmsg)   (struct socket *sock, struct msghdr *m,
-				      size_t total_len);
+	int		(*sendmsg)   (struct socket *sock, struct msghdr *m);
 	/* Notes for implementing recvmsg:
 	 * ===============================
 	 * msg->msg_namelen should get updated by the recvmsg handlers
@@ -222,8 +221,7 @@ struct proto_ops {
 	int		(*read_skb)(struct sock *sk, skb_read_actor_t recv_actor);
 	int		(*sendpage_locked)(struct sock *sk, struct page *page,
 					   int offset, size_t size, int flags);
-	int		(*sendmsg_locked)(struct sock *sk, struct msghdr *msg,
-					  size_t size);
+	int		(*sendmsg_locked)(struct sock *sk, struct msghdr *msg);
 	int		(*set_rcvlowat)(struct sock *sk, int val);
 };
 
diff --git a/include/linux/security.h b/include/linux/security.h
index 5984d0d550b4..6c67a4de4a89 100644
--- a/include/linux/security.h
+++ b/include/linux/security.h
@@ -1436,7 +1436,7 @@ int security_socket_bind(struct socket *sock, struct sockaddr *address, int addr
 int security_socket_connect(struct socket *sock, struct sockaddr *address, int addrlen);
 int security_socket_listen(struct socket *sock, int backlog);
 int security_socket_accept(struct socket *sock, struct socket *newsock);
-int security_socket_sendmsg(struct socket *sock, struct msghdr *msg, int size);
+int security_socket_sendmsg(struct socket *sock, struct msghdr *msg);
 int security_socket_recvmsg(struct socket *sock, struct msghdr *msg,
 			    int size, int flags);
 int security_socket_getsockname(struct socket *sock);
@@ -1538,7 +1538,7 @@ static inline int security_socket_accept(struct socket *sock,
 }
 
 static inline int security_socket_sendmsg(struct socket *sock,
-					  struct msghdr *msg, int size)
+					  struct msghdr *msg)
 {
 	return 0;
 }
diff --git a/include/net/af_rxrpc.h b/include/net/af_rxrpc.h
index ba717eac0229..33f1b8c622e3 100644
--- a/include/net/af_rxrpc.h
+++ b/include/net/af_rxrpc.h
@@ -51,8 +51,7 @@ struct rxrpc_call *rxrpc_kernel_begin_call(struct socket *,
 					   enum rxrpc_interruptibility,
 					   unsigned int);
 int rxrpc_kernel_send_data(struct socket *, struct rxrpc_call *,
-			   struct msghdr *, size_t,
-			   rxrpc_notify_end_tx_t);
+			   struct msghdr *, rxrpc_notify_end_tx_t);
 int rxrpc_kernel_recv_data(struct socket *, struct rxrpc_call *,
 			   struct iov_iter *, size_t *, bool, u32 *, u16 *);
 bool rxrpc_kernel_abort_call(struct socket *, struct rxrpc_call *,
diff --git a/include/net/inet_common.h b/include/net/inet_common.h
index cec453c18f1d..ec798fdd371c 100644
--- a/include/net/inet_common.h
+++ b/include/net/inet_common.h
@@ -32,7 +32,7 @@ int inet_dgram_connect(struct socket *sock, struct sockaddr *uaddr,
 int inet_accept(struct socket *sock, struct socket *newsock, int flags,
 		bool kern);
 int inet_send_prepare(struct sock *sk);
-int inet_sendmsg(struct socket *sock, struct msghdr *msg, size_t size);
+int inet_sendmsg(struct socket *sock, struct msghdr *msg);
 ssize_t inet_sendpage(struct socket *sock, struct page *page, int offset,
 		      size_t size, int flags);
 int inet_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 7332296eca44..f2132311e92b 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -1228,7 +1228,7 @@ int inet6_compat_ioctl(struct socket *sock, unsigned int cmd,
 
 int inet6_hash_connect(struct inet_timewait_death_row *death_row,
 			      struct sock *sk);
-int inet6_sendmsg(struct socket *sock, struct msghdr *msg, size_t size);
+int inet6_sendmsg(struct socket *sock, struct msghdr *msg);
 int inet6_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
 		  int flags);
 
diff --git a/include/net/ping.h b/include/net/ping.h
index 9233ad3de0ad..04814edde8e3 100644
--- a/include/net/ping.h
+++ b/include/net/ping.h
@@ -70,7 +70,7 @@ int  ping_getfrag(void *from, char *to, int offset, int fraglen, int odd,
 
 int  ping_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
 		  int flags, int *addr_len);
-int  ping_common_sendmsg(int family, struct msghdr *msg, size_t len,
+int  ping_common_sendmsg(int family, struct msghdr *msg,
 			 void *user_icmph, size_t icmph_len);
 int  ping_queue_rcv_skb(struct sock *sk, struct sk_buff *skb);
 enum skb_drop_reason ping_rcv(struct sk_buff *skb);
diff --git a/include/net/sock.h b/include/net/sock.h
index 573f2bf7e0de..7a6d06c181b6 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1261,8 +1261,7 @@ struct proto {
 	int			(*compat_ioctl)(struct sock *sk,
 					unsigned int cmd, unsigned long arg);
 #endif
-	int			(*sendmsg)(struct sock *sk, struct msghdr *msg,
-					   size_t len);
+	int			(*sendmsg)(struct sock *sk, struct msghdr *msg);
 	int			(*recvmsg)(struct sock *sk, struct msghdr *msg,
 					   size_t len, int flags, int *addr_len);
 	int			(*sendpage)(struct sock *sk, struct page *page,
@@ -1901,8 +1900,8 @@ int sock_no_getname(struct socket *, struct sockaddr *, int);
 int sock_no_ioctl(struct socket *, unsigned int, unsigned long);
 int sock_no_listen(struct socket *, int);
 int sock_no_shutdown(struct socket *, int);
-int sock_no_sendmsg(struct socket *, struct msghdr *, size_t);
-int sock_no_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t len);
+int sock_no_sendmsg(struct socket *sock, struct msghdr *msg);
+int sock_no_sendmsg_locked(struct sock *sk, struct msghdr *msg);
 int sock_no_recvmsg(struct socket *, struct msghdr *, size_t, int);
 int sock_no_mmap(struct file *file, struct socket *sock,
 		 struct vm_area_struct *vma);
diff --git a/include/net/tcp.h b/include/net/tcp.h
index a0a91a988272..12b228e3d563 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -325,10 +325,10 @@ int tcp_v4_rcv(struct sk_buff *skb);
 
 void tcp_remove_empty_skb(struct sock *sk);
 int tcp_v4_tw_remember_stamp(struct inet_timewait_sock *tw);
-int tcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
-int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size);
+int tcp_sendmsg(struct sock *sk, struct msghdr *msg);
+int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg);
 int tcp_sendmsg_fastopen(struct sock *sk, struct msghdr *msg, int *copied,
-			 size_t size, struct ubuf_info *uarg);
+			 struct ubuf_info *uarg);
 int tcp_sendpage(struct sock *sk, struct page *page, int offset, size_t size,
 		 int flags);
 int tcp_sendpage_locked(struct sock *sk, struct page *page, int offset,
@@ -479,7 +479,7 @@ struct sk_buff *tcp_make_synack(const struct sock *sk, struct dst_entry *dst,
 int tcp_disconnect(struct sock *sk, int flags);
 
 void tcp_finish_connect(struct sock *sk, struct sk_buff *skb);
-int tcp_send_rcvq(struct sock *sk, struct msghdr *msg, size_t size);
+int tcp_send_rcvq(struct sock *sk, struct msghdr *msg);
 void inet_sk_rx_dst_set(struct sock *sk, const struct sk_buff *skb);
 
 /* From syncookies.c */
diff --git a/include/net/udp.h b/include/net/udp.h
index de4b528522bb..b9b2ea5af42d 100644
--- a/include/net/udp.h
+++ b/include/net/udp.h
@@ -277,7 +277,7 @@ int udp_get_port(struct sock *sk, unsigned short snum,
 				  const struct sock *));
 int udp_err(struct sk_buff *, u32);
 int udp_abort(struct sock *sk, int err);
-int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len);
+int udp_sendmsg(struct sock *sk, struct msghdr *msg);
 int udp_push_pending_frames(struct sock *sk);
 void udp_flush_pending_frames(struct sock *sk);
 int udp_cmsg_send(struct sock *sk, struct msghdr *msg, u16 *gso_size);
diff --git a/net/appletalk/ddp.c b/net/appletalk/ddp.c
index a06f4d4a6f47..70008c57503f 100644
--- a/net/appletalk/ddp.c
+++ b/net/appletalk/ddp.c
@@ -1566,7 +1566,7 @@ static int ltalk_rcv(struct sk_buff *skb, struct net_device *dev,
 	return 0;
 }
 
-static int atalk_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int atalk_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct atalk_sock *at = at_sk(sk);
@@ -1579,6 +1579,7 @@ static int atalk_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 	struct ddpehdr *ddp;
 	int size, hard_header_len;
 	struct atalk_route *rt, *rt_lo = NULL;
+	size_t len = msg_data_left(msg);
 	int err;
 
 	if (flags & ~(MSG_DONTWAIT|MSG_CMSG_COMPAT))
diff --git a/net/atm/common.c b/net/atm/common.c
index f7019df41c3e..09060644760b 100644
--- a/net/atm/common.c
+++ b/net/atm/common.c
@@ -565,12 +565,13 @@ int vcc_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
 	return copied;
 }
 
-int vcc_sendmsg(struct socket *sock, struct msghdr *m, size_t size)
+int vcc_sendmsg(struct socket *sock, struct msghdr *m)
 {
 	struct sock *sk = sock->sk;
 	DEFINE_WAIT(wait);
 	struct atm_vcc *vcc;
 	struct sk_buff *skb;
+	size_t size = msg_data_left(m);
 	int eff, error;
 
 	lock_sock(sk);
diff --git a/net/atm/common.h b/net/atm/common.h
index a1e56e8de698..6597f8308f03 100644
--- a/net/atm/common.h
+++ b/net/atm/common.h
@@ -16,7 +16,7 @@ int vcc_release(struct socket *sock);
 int vcc_connect(struct socket *sock, int itf, short vpi, int vci);
 int vcc_recvmsg(struct socket *sock, struct msghdr *msg, size_t size,
 		int flags);
-int vcc_sendmsg(struct socket *sock, struct msghdr *m, size_t total_len);
+int vcc_sendmsg(struct socket *sock, struct msghdr *m);
 __poll_t vcc_poll(struct file *file, struct socket *sock, poll_table *wait);
 int vcc_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg);
 int vcc_compat_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg);
diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c
index d8da400cb4de..48f96e28f7ea 100644
--- a/net/ax25/af_ax25.c
+++ b/net/ax25/af_ax25.c
@@ -1489,7 +1489,7 @@ static int ax25_getname(struct socket *sock, struct sockaddr *uaddr,
 	return err;
 }
 
-static int ax25_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int ax25_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	DECLARE_SOCKADDR(struct sockaddr_ax25 *, usax, msg->msg_name);
 	struct sock *sk = sock->sk;
@@ -1497,7 +1497,7 @@ static int ax25_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 	struct sk_buff *skb;
 	ax25_digi dtmp, *dp;
 	ax25_cb *ax25;
-	size_t size;
+	size_t size, len = msg_data_left(msg);
 	int lv, err, addr_len = msg->msg_namelen;
 
 	if (msg->msg_flags & ~(MSG_DONTWAIT|MSG_EOR|MSG_CMSG_COMPAT))
diff --git a/net/bluetooth/hci_sock.c b/net/bluetooth/hci_sock.c
index 06581223238c..9d6f713eeac1 100644
--- a/net/bluetooth/hci_sock.c
+++ b/net/bluetooth/hci_sock.c
@@ -1692,8 +1692,7 @@ static int hci_logging_frame(struct sock *sk, struct sk_buff *skb,
 	return err;
 }
 
-static int hci_sock_sendmsg(struct socket *sock, struct msghdr *msg,
-			    size_t len)
+static int hci_sock_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct hci_mgmt_chan *chan;
@@ -1701,6 +1700,7 @@ static int hci_sock_sendmsg(struct socket *sock, struct msghdr *msg,
 	struct sk_buff *skb;
 	int err;
 	const unsigned int flags = msg->msg_flags;
+	size_t len = msg_data_left(msg);
 
 	BT_DBG("sock %p sk %p", sock, sk);
 
diff --git a/net/bluetooth/iso.c b/net/bluetooth/iso.c
index 24444b502e58..6d8863878abc 100644
--- a/net/bluetooth/iso.c
+++ b/net/bluetooth/iso.c
@@ -1031,12 +1031,12 @@ static int iso_sock_getname(struct socket *sock, struct sockaddr *addr,
 	return sizeof(struct sockaddr_iso);
 }
 
-static int iso_sock_sendmsg(struct socket *sock, struct msghdr *msg,
-			    size_t len)
+static int iso_sock_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct iso_conn *conn = iso_pi(sk)->conn;
 	struct sk_buff *skb, **frag;
+	size_t len = msg_data_left(msg);
 	int err;
 
 	BT_DBG("sock %p, sk %p", sock, sk);
diff --git a/net/bluetooth/l2cap_sock.c b/net/bluetooth/l2cap_sock.c
index eebe256104bc..d488aca82037 100644
--- a/net/bluetooth/l2cap_sock.c
+++ b/net/bluetooth/l2cap_sock.c
@@ -1143,8 +1143,7 @@ static int l2cap_sock_setsockopt(struct socket *sock, int level, int optname,
 	return err;
 }
 
-static int l2cap_sock_sendmsg(struct socket *sock, struct msghdr *msg,
-			      size_t len)
+static int l2cap_sock_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct l2cap_chan *chan = l2cap_pi(sk)->chan;
@@ -1169,7 +1168,7 @@ static int l2cap_sock_sendmsg(struct socket *sock, struct msghdr *msg,
 		return err;
 
 	l2cap_chan_lock(chan);
-	err = l2cap_chan_send(chan, msg, len);
+	err = l2cap_chan_send(chan, msg, msg_data_left(msg));
 	l2cap_chan_unlock(chan);
 
 	return err;
diff --git a/net/bluetooth/rfcomm/sock.c b/net/bluetooth/rfcomm/sock.c
index 4397e14ff560..8a0a51b5c3a3 100644
--- a/net/bluetooth/rfcomm/sock.c
+++ b/net/bluetooth/rfcomm/sock.c
@@ -558,8 +558,7 @@ static int rfcomm_sock_getname(struct socket *sock, struct sockaddr *addr, int p
 	return sizeof(struct sockaddr_rc);
 }
 
-static int rfcomm_sock_sendmsg(struct socket *sock, struct msghdr *msg,
-			       size_t len)
+static int rfcomm_sock_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct rfcomm_dlc *d = rfcomm_pi(sk)->dlc;
@@ -586,8 +585,8 @@ static int rfcomm_sock_sendmsg(struct socket *sock, struct msghdr *msg,
 	if (sent)
 		return sent;
 
-	skb = bt_skb_sendmmsg(sk, msg, len, d->mtu, RFCOMM_SKB_HEAD_RESERVE,
-			      RFCOMM_SKB_TAIL_RESERVE);
+	skb = bt_skb_sendmmsg(sk, msg, msg_data_left(msg), d->mtu,
+			      RFCOMM_SKB_HEAD_RESERVE, RFCOMM_SKB_TAIL_RESERVE);
 	if (IS_ERR(skb))
 		return PTR_ERR(skb);
 
diff --git a/net/bluetooth/sco.c b/net/bluetooth/sco.c
index 1111da4e2f2b..8c62c5dc5b57 100644
--- a/net/bluetooth/sco.c
+++ b/net/bluetooth/sco.c
@@ -722,11 +722,11 @@ static int sco_sock_getname(struct socket *sock, struct sockaddr *addr,
 	return sizeof(struct sockaddr_sco);
 }
 
-static int sco_sock_sendmsg(struct socket *sock, struct msghdr *msg,
-			    size_t len)
+static int sco_sock_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct sk_buff *skb;
+	size_t len = msg_data_left(msg);
 	int err;
 
 	BT_DBG("sock %p, sk %p", sock, sk);
diff --git a/net/caif/caif_socket.c b/net/caif/caif_socket.c
index 4eebcc66c19a..827230b3f7c3 100644
--- a/net/caif/caif_socket.c
+++ b/net/caif/caif_socket.c
@@ -510,8 +510,7 @@ static int transmit_skb(struct sk_buff *skb, struct caifsock *cf_sk,
 }
 
 /* Copied from af_unix:unix_dgram_sendmsg, and adapted to CAIF */
-static int caif_seqpkt_sendmsg(struct socket *sock, struct msghdr *msg,
-			       size_t len)
+static int caif_seqpkt_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct caifsock *cf_sk = container_of(sk, struct caifsock, sk);
@@ -520,6 +519,8 @@ static int caif_seqpkt_sendmsg(struct socket *sock, struct msghdr *msg,
 	struct sk_buff *skb = NULL;
 	int noblock;
 	long timeo;
+	size_t len = msg_data_left(msg);
+
 	caif_assert(cf_sk);
 	ret = sock_error(sk);
 	if (ret)
@@ -582,8 +583,7 @@ static int caif_seqpkt_sendmsg(struct socket *sock, struct msghdr *msg,
  * Changed removed permission handling and added waiting for flow on
  * and other minor adaptations.
  */
-static int caif_stream_sendmsg(struct socket *sock, struct msghdr *msg,
-			       size_t len)
+static int caif_stream_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct caifsock *cf_sk = container_of(sk, struct caifsock, sk);
@@ -605,10 +605,7 @@ static int caif_stream_sendmsg(struct socket *sock, struct msghdr *msg,
 	if (unlikely(sk->sk_shutdown & SEND_SHUTDOWN))
 		goto pipe_err;
 
-	while (sent < len) {
-
-		size = len-sent;
-
+	while ((size = msg_data_left(msg))) {
 		if (size > cf_sk->maxframe)
 			size = cf_sk->maxframe;
 
diff --git a/net/can/bcm.c b/net/can/bcm.c
index 27706f6ace34..9baace5e0d71 100644
--- a/net/can/bcm.c
+++ b/net/can/bcm.c
@@ -1287,12 +1287,13 @@ static int bcm_tx_send(struct msghdr *msg, int ifindex, struct sock *sk,
 /*
  * bcm_sendmsg - process BCM commands (opcodes) from the userspace
  */
-static int bcm_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
+static int bcm_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct bcm_sock *bo = bcm_sk(sk);
 	int ifindex = bo->ifindex; /* default ifindex for this bcm_op */
 	struct bcm_msg_head msg_head;
+	size_t size = msg_data_left(msg);
 	int cfsiz;
 	int ret; /* read bytes or error codes as return value */
 
diff --git a/net/can/isotp.c b/net/can/isotp.c
index 9bc344851704..6b5d3ebd6748 100644
--- a/net/can/isotp.c
+++ b/net/can/isotp.c
@@ -914,7 +914,7 @@ static enum hrtimer_restart isotp_txfr_timer_handler(struct hrtimer *hrtimer)
 	return HRTIMER_NORESTART;
 }
 
-static int isotp_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
+static int isotp_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct isotp_sock *so = isotp_sk(sk);
@@ -922,6 +922,7 @@ static int isotp_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
 	struct sk_buff *skb;
 	struct net_device *dev;
 	struct canfd_frame *cf;
+	size_t size = msg_data_left(msg);
 	int ae = (so->opt.flags & CAN_ISOTP_EXTEND_ADDR) ? 1 : 0;
 	int wait_tx_done = (so->opt.flags & CAN_ISOTP_WAIT_TX_DONE) ? 1 : 0;
 	s64 hrtimer_sec = ISOTP_ECHO_TIMEOUT;
diff --git a/net/can/j1939/socket.c b/net/can/j1939/socket.c
index 7e90f9e61d9b..2b009b69e853 100644
--- a/net/can/j1939/socket.c
+++ b/net/can/j1939/socket.c
@@ -1187,12 +1187,12 @@ static int j1939_sk_send_loop(struct j1939_priv *priv,  struct sock *sk,
 	return ret;
 }
 
-static int j1939_sk_sendmsg(struct socket *sock, struct msghdr *msg,
-			    size_t size)
+static int j1939_sk_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct j1939_sock *jsk = j1939_sk(sk);
 	struct j1939_priv *priv;
+	size_t size = msg_data_left(msg);
 	int ifindex;
 	int ret;
 
diff --git a/net/can/raw.c b/net/can/raw.c
index f64469b98260..0c37f1c70685 100644
--- a/net/can/raw.c
+++ b/net/can/raw.c
@@ -814,13 +814,14 @@ static bool raw_bad_txframe(struct raw_sock *ro, struct sk_buff *skb, int mtu)
 	return true;
 }
 
-static int raw_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
+static int raw_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct raw_sock *ro = raw_sk(sk);
 	struct sockcm_cookie sockc;
 	struct sk_buff *skb;
 	struct net_device *dev;
+	size_t size = msg_data_left(msg);
 	int ifindex;
 	int err = -EINVAL;
 
diff --git a/net/core/sock.c b/net/core/sock.c
index c25888795390..4170381356aa 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -3183,13 +3183,13 @@ int sock_no_shutdown(struct socket *sock, int how)
 }
 EXPORT_SYMBOL(sock_no_shutdown);
 
-int sock_no_sendmsg(struct socket *sock, struct msghdr *m, size_t len)
+int sock_no_sendmsg(struct socket *sock, struct msghdr *m)
 {
 	return -EOPNOTSUPP;
 }
 EXPORT_SYMBOL(sock_no_sendmsg);
 
-int sock_no_sendmsg_locked(struct sock *sk, struct msghdr *m, size_t len)
+int sock_no_sendmsg_locked(struct sock *sk, struct msghdr *m)
 {
 	return -EOPNOTSUPP;
 }
diff --git a/net/dccp/dccp.h b/net/dccp/dccp.h
index 9ddc3a9e89e4..3d5d7615ddd8 100644
--- a/net/dccp/dccp.h
+++ b/net/dccp/dccp.h
@@ -293,7 +293,7 @@ int dccp_getsockopt(struct sock *sk, int level, int optname,
 int dccp_setsockopt(struct sock *sk, int level, int optname,
 		    sockptr_t optval, unsigned int optlen);
 int dccp_ioctl(struct sock *sk, int cmd, unsigned long arg);
-int dccp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
+int dccp_sendmsg(struct sock *sk, struct msghdr *msg);
 int dccp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int flags,
 		 int *addr_len);
 void dccp_shutdown(struct sock *sk, int how);
diff --git a/net/dccp/proto.c b/net/dccp/proto.c
index a06b5641287a..6f6623bb1ff8 100644
--- a/net/dccp/proto.c
+++ b/net/dccp/proto.c
@@ -725,12 +725,13 @@ static int dccp_msghdr_parse(struct msghdr *msg, struct sk_buff *skb)
 	return 0;
 }
 
-int dccp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+int dccp_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	const struct dccp_sock *dp = dccp_sk(sk);
 	const int flags = msg->msg_flags;
 	const int noblock = flags & MSG_DONTWAIT;
 	struct sk_buff *skb;
+	size_t len = msg_data_left(msg);
 	int rc, size;
 	long timeo;
 
diff --git a/net/ieee802154/socket.c b/net/ieee802154/socket.c
index 1fa2fe041ec0..70f2948b7946 100644
--- a/net/ieee802154/socket.c
+++ b/net/ieee802154/socket.c
@@ -88,12 +88,11 @@ static int ieee802154_sock_release(struct socket *sock)
 	return 0;
 }
 
-static int ieee802154_sock_sendmsg(struct socket *sock, struct msghdr *msg,
-				   size_t len)
+static int ieee802154_sock_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 
-	return sk->sk_prot->sendmsg(sk, msg, len);
+	return sk->sk_prot->sendmsg(sk, msg);
 }
 
 static int ieee802154_sock_bind(struct socket *sock, struct sockaddr *uaddr,
@@ -238,11 +237,12 @@ static int raw_disconnect(struct sock *sk, int flags)
 	return 0;
 }
 
-static int raw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
+static int raw_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct net_device *dev;
 	unsigned int mtu;
 	struct sk_buff *skb;
+	size_t size = msg_data_left(msg);
 	int hlen, tlen;
 	int err;
 
@@ -605,7 +605,7 @@ static int dgram_disconnect(struct sock *sk, int flags)
 	return 0;
 }
 
-static int dgram_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
+static int dgram_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct net_device *dev;
 	unsigned int mtu;
@@ -614,6 +614,7 @@ static int dgram_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 	struct dgram_sock *ro = dgram_sk(sk);
 	struct ieee802154_addr dst_addr;
 	DECLARE_SOCKADDR(struct sockaddr_ieee802154*, daddr, msg->msg_name);
+	size_t size = msg_data_left(msg);
 	int hlen, tlen;
 	int err;
 
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 940062e08f57..4facfef8bded 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -815,7 +815,7 @@ int inet_send_prepare(struct sock *sk)
 }
 EXPORT_SYMBOL_GPL(inet_send_prepare);
 
-int inet_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
+int inet_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 
@@ -823,7 +823,7 @@ int inet_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
 		return -EAGAIN;
 
 	return INDIRECT_CALL_2(sk->sk_prot->sendmsg, tcp_sendmsg, udp_sendmsg,
-			       sk, msg, size);
+			       sk, msg);
 }
 EXPORT_SYMBOL(inet_sendmsg);
 
diff --git a/net/ipv4/ping.c b/net/ipv4/ping.c
index 409ec2a1f95b..f689f9f530c9 100644
--- a/net/ipv4/ping.c
+++ b/net/ipv4/ping.c
@@ -657,9 +657,10 @@ static int ping_v4_push_pending_frames(struct sock *sk, struct pingfakehdr *pfh,
 	return ip_push_pending_frames(sk, fl4);
 }
 
-int ping_common_sendmsg(int family, struct msghdr *msg, size_t len,
+int ping_common_sendmsg(int family, struct msghdr *msg,
 			void *user_icmph, size_t icmph_len)
 {
+	size_t len = msg_data_left(msg);
 	u8 type, code;
 
 	if (len > 0xFFFF)
@@ -703,7 +704,7 @@ int ping_common_sendmsg(int family, struct msghdr *msg, size_t len,
 }
 EXPORT_SYMBOL_GPL(ping_common_sendmsg);
 
-static int ping_v4_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+static int ping_v4_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct net *net = sock_net(sk);
 	struct flowi4 fl4;
@@ -713,6 +714,7 @@ static int ping_v4_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	struct pingfakehdr pfh;
 	struct rtable *rt = NULL;
 	struct ip_options_data opt_copy;
+	size_t len = msg_data_left(msg);
 	int free = 0;
 	__be32 saddr, daddr, faddr;
 	u8  tos;
@@ -720,7 +722,7 @@ static int ping_v4_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 
 	pr_debug("ping_v4_sendmsg(sk=%p,sk->num=%u)\n", inet, inet->inet_num);
 
-	err = ping_common_sendmsg(AF_INET, msg, len, &user_icmph,
+	err = ping_common_sendmsg(AF_INET, msg, &user_icmph,
 				  sizeof(user_icmph));
 	if (err)
 		return err;
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index 3cf68695b40d..f2859c117796 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -471,7 +471,7 @@ static int raw_getfrag(void *from, char *to, int offset, int len, int odd,
 	return ip_generic_getfrag(rfv->msg, to, offset, len, odd, skb);
 }
 
-static int raw_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+static int raw_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct inet_sock *inet = inet_sk(sk);
 	struct net *net = sock_net(sk);
@@ -485,6 +485,7 @@ static int raw_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	int err;
 	struct ip_options_data opt_copy;
 	struct raw_frag_vec rfv;
+	size_t len = msg_data_left(msg);
 	int hdrincl;
 
 	err = -EMSGSIZE;
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index fd68d49490f2..2a98b104892c 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1166,7 +1166,7 @@ void tcp_free_fastopen_req(struct tcp_sock *tp)
 }
 
 int tcp_sendmsg_fastopen(struct sock *sk, struct msghdr *msg, int *copied,
-			 size_t size, struct ubuf_info *uarg)
+			 struct ubuf_info *uarg)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct inet_sock *inet = inet_sk(sk);
@@ -1186,7 +1186,7 @@ int tcp_sendmsg_fastopen(struct sock *sk, struct msghdr *msg, int *copied,
 	if (unlikely(!tp->fastopen_req))
 		return -ENOBUFS;
 	tp->fastopen_req->data = msg;
-	tp->fastopen_req->size = size;
+	tp->fastopen_req->size = msg_data_left(msg);
 	tp->fastopen_req->uarg = uarg;
 
 	if (inet->defer_connect) {
@@ -1212,12 +1212,13 @@ int tcp_sendmsg_fastopen(struct sock *sk, struct msghdr *msg, int *copied,
 	return err;
 }
 
-int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
+int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct ubuf_info *uarg = NULL;
 	struct sk_buff *skb;
 	struct sockcm_cookie sockc;
+	size_t size = msg_data_left(msg);
 	int flags, err, copied = 0;
 	int mss_now = 0, size_goal, copied_syn = 0;
 	int process_backlog = 0;
@@ -1226,7 +1227,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
 
 	flags = msg->msg_flags;
 
-	if ((flags & MSG_ZEROCOPY) && size) {
+	if ((flags & MSG_ZEROCOPY) && msg_data_left(msg)) {
 		skb = tcp_write_queue_tail(sk);
 
 		if (msg->msg_ubuf) {
@@ -1247,7 +1248,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
 
 	if (unlikely(flags & MSG_FASTOPEN || inet_sk(sk)->defer_connect) &&
 	    !tp->repair) {
-		err = tcp_sendmsg_fastopen(sk, msg, &copied_syn, size, uarg);
+		err = tcp_sendmsg_fastopen(sk, msg, &copied_syn, uarg);
 		if (err == -EINPROGRESS && copied_syn > 0)
 			goto out;
 		else if (err)
@@ -1271,7 +1272,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
 
 	if (unlikely(tp->repair)) {
 		if (tp->repair_queue == TCP_RECV_QUEUE) {
-			copied = tcp_send_rcvq(sk, msg, size);
+			copied = tcp_send_rcvq(sk, msg);
 			goto out_nopush;
 		}
 
@@ -1477,12 +1478,12 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)
 }
 EXPORT_SYMBOL_GPL(tcp_sendmsg_locked);
 
-int tcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
+int tcp_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	int ret;
 
 	lock_sock(sk);
-	ret = tcp_sendmsg_locked(sk, msg, size);
+	ret = tcp_sendmsg_locked(sk, msg);
 	release_sock(sk);
 
 	return ret;
diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c
index ebf917511937..843eb2b6b8d3 100644
--- a/net/ipv4/tcp_bpf.c
+++ b/net/ipv4/tcp_bpf.c
@@ -396,9 +396,10 @@ static int tcp_bpf_send_verdict(struct sock *sk, struct sk_psock *psock,
 	return ret;
 }
 
-static int tcp_bpf_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
+static int tcp_bpf_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct sk_msg tmp, *msg_tx = NULL;
+	size_t size = msg_data_left(msg);
 	int copied = 0, err = 0;
 	struct sk_psock *psock;
 	long timeo;
@@ -410,7 +411,7 @@ static int tcp_bpf_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 
 	psock = sk_psock_get(sk);
 	if (unlikely(!psock))
-		return tcp_sendmsg(sk, msg, size);
+		return tcp_sendmsg(sk, msg);
 
 	lock_sock(sk);
 	timeo = sock_sndtimeo(sk, msg->msg_flags & MSG_DONTWAIT);
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 2b75cd9e2e92..a1c7d834abca 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4948,9 +4948,10 @@ static int __must_check tcp_queue_rcv(struct sock *sk, struct sk_buff *skb,
 	return eaten;
 }
 
-int tcp_send_rcvq(struct sock *sk, struct msghdr *msg, size_t size)
+int tcp_send_rcvq(struct sock *sk, struct msghdr *msg)
 {
 	struct sk_buff *skb;
+	size_t size = msg_data_left(msg);
 	int err = -ENOMEM;
 	int data_len = 0;
 	bool fragstolen;
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index aa32afd871ee..b2ed9d37a362 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1049,13 +1049,14 @@ int udp_cmsg_send(struct sock *sk, struct msghdr *msg, u16 *gso_size)
 }
 EXPORT_SYMBOL_GPL(udp_cmsg_send);
 
-int udp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+int udp_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct inet_sock *inet = inet_sk(sk);
 	struct udp_sock *up = udp_sk(sk);
 	DECLARE_SOCKADDR(struct sockaddr_in *, usin, msg->msg_name);
 	struct flowi4 fl4_stack;
 	struct flowi4 *fl4;
+	size_t len = msg_data_left(msg);
 	int ulen = len;
 	struct ipcm_cookie ipc;
 	struct rtable *rt = NULL;
@@ -1346,7 +1347,7 @@ int udp_sendpage(struct sock *sk, struct page *page, int offset,
 		 * sendpage interface can't pass.
 		 * This will succeed only when the socket is connected.
 		 */
-		ret = udp_sendmsg(sk, &msg, 0);
+		ret = udp_sendmsg(sk, &msg);
 		if (ret < 0)
 			return ret;
 	}
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index e1b679a590c9..d6b4cfc44e2a 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -636,9 +636,8 @@ int inet6_compat_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg)
 EXPORT_SYMBOL_GPL(inet6_compat_ioctl);
 #endif /* CONFIG_COMPAT */
 
-INDIRECT_CALLABLE_DECLARE(int udpv6_sendmsg(struct sock *, struct msghdr *,
-					    size_t));
-int inet6_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
+INDIRECT_CALLABLE_DECLARE(int udpv6_sendmsg(struct sock *, struct msghdr *));
+int inet6_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	const struct proto *prot;
@@ -649,7 +648,7 @@ int inet6_sendmsg(struct socket *sock, struct msghdr *msg, size_t size)
 	/* IPV6_ADDRFORM can change sk->sk_prot under us. */
 	prot = READ_ONCE(sk->sk_prot);
 	return INDIRECT_CALL_2(prot->sendmsg, tcp_sendmsg, udpv6_sendmsg,
-			       sk, msg, size);
+			       sk, msg);
 }
 
 INDIRECT_CALLABLE_DECLARE(int udpv6_recvmsg(struct sock *, struct msghdr *,
diff --git a/net/ipv6/ping.c b/net/ipv6/ping.c
index c4835dbdfcff..54c94b28744f 100644
--- a/net/ipv6/ping.c
+++ b/net/ipv6/ping.c
@@ -59,7 +59,7 @@ static int ping_v6_pre_connect(struct sock *sk, struct sockaddr *uaddr,
 	return BPF_CGROUP_RUN_PROG_INET6_CONNECT_LOCK(sk, uaddr);
 }
 
-static int ping_v6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+static int ping_v6_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct inet_sock *inet = inet_sk(sk);
 	struct ipv6_pinfo *np = inet6_sk(sk);
@@ -73,8 +73,9 @@ static int ping_v6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	struct rt6_info *rt;
 	struct pingfakehdr pfh;
 	struct ipcm6_cookie ipc6;
+	size_t len = msg_data_left(msg);
 
-	err = ping_common_sendmsg(AF_INET6, msg, len, &user_icmph,
+	err = ping_common_sendmsg(AF_INET6, msg, &user_icmph,
 				  sizeof(user_icmph));
 	if (err)
 		return err;
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 6ac2f2690c44..a3437deeeb74 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -735,7 +735,7 @@ static int raw6_getfrag(void *from, char *to, int offset, int len, int odd,
 	return ip_generic_getfrag(rfv->msg, to, offset, len, odd, skb);
 }
 
-static int rawv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+static int rawv6_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct ipv6_txoptions *opt_to_free = NULL;
 	struct ipv6_txoptions opt_space;
@@ -751,6 +751,7 @@ static int rawv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	struct flowi6 fl6;
 	struct ipcm6_cookie ipc6;
 	int addr_len = msg->msg_namelen;
+	size_t len = msg_data_left(msg);
 	int hdrincl;
 	u16 proto;
 	int err;
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index d350e57c4792..80f2eb58ba1a 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -1326,7 +1326,7 @@ static int udp_v6_push_pending_frames(struct sock *sk)
 	return err;
 }
 
-int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+int udpv6_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct ipv6_txoptions opt_space;
 	struct udp_sock *up = udp_sk(sk);
@@ -1343,6 +1343,7 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	struct ipcm6_cookie ipc6;
 	int addr_len = msg->msg_namelen;
 	bool connected = false;
+	size_t len = msg_data_left(msg);
 	int ulen = len;
 	int corkreq = READ_ONCE(up->corkflag) || msg->msg_flags&MSG_MORE;
 	int err;
@@ -1397,7 +1398,7 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 do_udp_sendmsg:
 			if (ipv6_only_sock(sk))
 				return -ENETUNREACH;
-			return udp_sendmsg(sk, msg, len);
+			return udp_sendmsg(sk, msg);
 		}
 	}
 
@@ -1410,7 +1411,7 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	getfrag  =  is_udplite ?  udplite_getfrag : ip_generic_getfrag;
 	if (up->pending) {
 		if (up->pending == AF_INET)
-			return udp_sendmsg(sk, msg, len);
+			return udp_sendmsg(sk, msg);
 		/*
 		 * There are pending frames.
 		 * The socket lock must be held while it's corked.
diff --git a/net/ipv6/udp_impl.h b/net/ipv6/udp_impl.h
index 0590f566379d..c905a5cb34af 100644
--- a/net/ipv6/udp_impl.h
+++ b/net/ipv6/udp_impl.h
@@ -20,7 +20,7 @@ int udpv6_getsockopt(struct sock *sk, int level, int optname,
 		     char __user *optval, int __user *optlen);
 int udpv6_setsockopt(struct sock *sk, int level, int optname, sockptr_t optval,
 		     unsigned int optlen);
-int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len);
+int udpv6_sendmsg(struct sock *sk, struct msghdr *msg);
 int udpv6_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int flags,
 		  int *addr_len);
 void udpv6_destroy_sock(struct sock *sk);
diff --git a/net/iucv/af_iucv.c b/net/iucv/af_iucv.c
index 498a0c35b7bb..d963d245a4e2 100644
--- a/net/iucv/af_iucv.c
+++ b/net/iucv/af_iucv.c
@@ -895,8 +895,7 @@ static int iucv_send_iprm(struct iucv_path *path, struct iucv_message *msg,
 				 (void *) prmdata, 8);
 }
 
-static int iucv_sock_sendmsg(struct socket *sock, struct msghdr *msg,
-			     size_t len)
+static int iucv_sock_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct iucv_sock *iucv = iucv_sk(sk);
@@ -905,6 +904,7 @@ static int iucv_sock_sendmsg(struct socket *sock, struct msghdr *msg,
 	struct sk_buff *skb;
 	struct iucv_message txmsg = {0};
 	struct cmsghdr *cmsg;
+	size_t len = msg_data_left(msg);
 	int cmsg_done;
 	long timeo;
 	char user_id[9];
diff --git a/net/kcm/kcmsock.c b/net/kcm/kcmsock.c
index cfe828bd7fc6..caf13ed1bfeb 100644
--- a/net/kcm/kcmsock.c
+++ b/net/kcm/kcmsock.c
@@ -904,7 +904,7 @@ static ssize_t kcm_sendpage(struct socket *sock, struct page *page,
 	return err;
 }
 
-static int kcm_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int kcm_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct kcm_sock *kcm = kcm_sk(sk);
diff --git a/net/key/af_key.c b/net/key/af_key.c
index a815f5ab4c49..3cde1e0c3119 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -3662,13 +3662,14 @@ static int pfkey_send_migrate(const struct xfrm_selector *sel, u8 dir, u8 type,
 }
 #endif
 
-static int pfkey_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int pfkey_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct sk_buff *skb = NULL;
 	struct sadb_msg *hdr = NULL;
 	int err;
 	struct net *net = sock_net(sk);
+	size_t len = msg_data_left(msg);
 
 	err = -EOPNOTSUPP;
 	if (msg->msg_flags & MSG_OOB)
diff --git a/net/l2tp/l2tp_ip.c b/net/l2tp/l2tp_ip.c
index 4db5a554bdbd..474ce4ae9b63 100644
--- a/net/l2tp/l2tp_ip.c
+++ b/net/l2tp/l2tp_ip.c
@@ -394,13 +394,14 @@ static int l2tp_ip_backlog_recv(struct sock *sk, struct sk_buff *skb)
 /* Userspace will call sendmsg() on the tunnel socket to send L2TP
  * control frames.
  */
-static int l2tp_ip_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+static int l2tp_ip_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct sk_buff *skb;
 	int rc;
 	struct inet_sock *inet = inet_sk(sk);
 	struct rtable *rt = NULL;
 	struct flowi4 *fl4;
+	size_t len = msg_data_left(msg);
 	int connected = 0;
 	__be32 daddr;
 
diff --git a/net/l2tp/l2tp_ip6.c b/net/l2tp/l2tp_ip6.c
index 2478aa60145f..7619afe77855 100644
--- a/net/l2tp/l2tp_ip6.c
+++ b/net/l2tp/l2tp_ip6.c
@@ -488,7 +488,7 @@ static int l2tp_ip6_push_pending_frames(struct sock *sk)
 /* Userspace will call sendmsg() on the tunnel socket to send L2TP
  * control frames.
  */
-static int l2tp_ip6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+static int l2tp_ip6_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct ipv6_txoptions opt_space;
 	DECLARE_SOCKADDR(struct sockaddr_l2tpip6 *, lsa, msg->msg_name);
@@ -500,6 +500,7 @@ static int l2tp_ip6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 	struct dst_entry *dst = NULL;
 	struct flowi6 fl6;
 	struct ipcm6_cookie ipc6;
+	size_t len = msg_data_left(msg);
 	int addr_len = msg->msg_namelen;
 	int transhdrlen = 4; /* zero session-id */
 	int ulen;
diff --git a/net/l2tp/l2tp_ppp.c b/net/l2tp/l2tp_ppp.c
index f011af6601c9..ae351f50adff 100644
--- a/net/l2tp/l2tp_ppp.c
+++ b/net/l2tp/l2tp_ppp.c
@@ -262,14 +262,14 @@ static void pppol2tp_recv(struct l2tp_session *session, struct sk_buff *skb, int
  * when a user application does a sendmsg() on the session socket. L2TP and
  * PPP headers must be inserted into the user's data.
  */
-static int pppol2tp_sendmsg(struct socket *sock, struct msghdr *m,
-			    size_t total_len)
+static int pppol2tp_sendmsg(struct socket *sock, struct msghdr *m)
 {
 	struct sock *sk = sock->sk;
 	struct sk_buff *skb;
 	int error;
 	struct l2tp_session *session;
 	struct l2tp_tunnel *tunnel;
+	size_t total_len = msg_data_left(m);
 	int uhlen;
 
 	error = -ENOTCONN;
diff --git a/net/llc/af_llc.c b/net/llc/af_llc.c
index da7fe94bea2e..d10b5ef66c88 100644
--- a/net/llc/af_llc.c
+++ b/net/llc/af_llc.c
@@ -919,12 +919,11 @@ static int llc_ui_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
  *	llc_ui_sendmsg - Transmit data provided by the socket user.
  *	@sock: Socket to transmit data from.
  *	@msg: Various user related information.
- *	@len: Length of data to transmit.
  *
  *	Transmit data provided by the socket user.
  *	Returns non-negative upon success, negative otherwise.
  */
-static int llc_ui_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int llc_ui_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct llc_sock *llc = llc_sk(sk);
@@ -954,7 +953,7 @@ static int llc_ui_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 			goto out;
 	}
 	hdrlen = llc->dev->hard_header_len + llc_ui_header_len(sk, addr);
-	size = hdrlen + len;
+	size = hdrlen + msg_data_left(msg);
 	if (size > llc->dev->mtu)
 		size = llc->dev->mtu;
 	copied = size - hdrlen;
diff --git a/net/mctp/af_mctp.c b/net/mctp/af_mctp.c
index bb4bd0b6a4f7..9ead250f1be3 100644
--- a/net/mctp/af_mctp.c
+++ b/net/mctp/af_mctp.c
@@ -90,7 +90,7 @@ static int mctp_bind(struct socket *sock, struct sockaddr *addr, int addrlen)
 	return rc;
 }
 
-static int mctp_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int mctp_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	DECLARE_SOCKADDR(struct sockaddr_mctp *, addr, msg->msg_name);
 	int rc, addrlen = msg->msg_namelen;
@@ -99,6 +99,7 @@ static int mctp_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 	struct mctp_skb_cb *cb;
 	struct mctp_route *rt;
 	struct sk_buff *skb = NULL;
+	size_t len = msg_data_left(msg);
 	int hlen;
 
 	if (addr) {
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 2d26b9114373..0a58f2dbd3ce 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1663,7 +1663,7 @@ static void mptcp_set_nospace(struct sock *sk)
 static int mptcp_disconnect(struct sock *sk, int flags);
 
 static int mptcp_sendmsg_fastopen(struct sock *sk, struct sock *ssk, struct msghdr *msg,
-				  size_t len, int *copied_syn)
+				  int *copied_syn)
 {
 	unsigned int saved_flags = msg->msg_flags;
 	struct mptcp_sock *msk = mptcp_sk(sk);
@@ -1673,7 +1673,7 @@ static int mptcp_sendmsg_fastopen(struct sock *sk, struct sock *ssk, struct msgh
 	msg->msg_flags |= MSG_DONTWAIT;
 	msk->connect_flags = O_NONBLOCK;
 	msk->fastopening = 1;
-	ret = tcp_sendmsg_fastopen(ssk, msg, copied_syn, len, NULL);
+	ret = tcp_sendmsg_fastopen(ssk, msg, copied_syn, NULL);
 	msk->fastopening = 0;
 	msg->msg_flags = saved_flags;
 	release_sock(ssk);
@@ -1695,7 +1695,7 @@ static int mptcp_sendmsg_fastopen(struct sock *sk, struct sock *ssk, struct msgh
 	return ret;
 }
 
-static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct mptcp_sock *msk = mptcp_sk(sk);
 	struct page_frag *pfrag;
@@ -1714,7 +1714,7 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 			       msg->msg_flags & MSG_FASTOPEN))) {
 		int copied_syn = 0;
 
-		ret = mptcp_sendmsg_fastopen(sk, ssock->sk, msg, len, &copied_syn);
+		ret = mptcp_sendmsg_fastopen(sk, ssock->sk, msg, &copied_syn);
 		copied += copied_syn;
 		if (ret == -EINPROGRESS && copied_syn > 0)
 			goto out;
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 877f1da1a8ac..519487cbfcce 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -1857,7 +1857,7 @@ static void netlink_cmsg_listen_all_nsid(struct sock *sk, struct msghdr *msg,
 		 &NETLINK_CB(skb).nsid);
 }
 
-static int netlink_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int netlink_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct netlink_sock *nlk = nlk_sk(sk);
@@ -1872,7 +1872,7 @@ static int netlink_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 	if (msg->msg_flags & MSG_OOB)
 		return -EOPNOTSUPP;
 
-	if (len == 0) {
+	if (msg_data_left(msg) == 0) {
 		pr_warn_once("Zero length message leads to an empty skb\n");
 		return -ENODATA;
 	}
@@ -1911,10 +1911,10 @@ static int netlink_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 	}
 
 	err = -EMSGSIZE;
-	if (len > sk->sk_sndbuf - 32)
+	if (msg_data_left(msg) > sk->sk_sndbuf - 32)
 		goto out;
 	err = -ENOBUFS;
-	skb = netlink_alloc_large_skb(len, dst_group);
+	skb = netlink_alloc_large_skb(msg_data_left(msg), dst_group);
 	if (skb == NULL)
 		goto out;
 
@@ -1924,7 +1924,8 @@ static int netlink_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 	NETLINK_CB(skb).flags	= netlink_skb_flags;
 
 	err = -EFAULT;
-	if (memcpy_from_msg(skb_put(skb, len), msg, len)) {
+	if (memcpy_from_msg(skb_put(skb, msg_data_left(msg)),
+			    msg, msg_data_left(msg))) {
 		kfree_skb(skb);
 		goto out;
 	}
diff --git a/net/netrom/af_netrom.c b/net/netrom/af_netrom.c
index 5a4cb796150f..d2c65f38c22c 100644
--- a/net/netrom/af_netrom.c
+++ b/net/netrom/af_netrom.c
@@ -1034,7 +1034,7 @@ int nr_rx_frame(struct sk_buff *skb, struct net_device *dev)
 	return 1;
 }
 
-static int nr_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int nr_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct nr_sock *nr = nr_sk(sk);
@@ -1043,6 +1043,7 @@ static int nr_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 	struct sockaddr_ax25 sax;
 	struct sk_buff *skb;
 	unsigned char *asmptr;
+	size_t len = msg_data_left(msg);
 	int size;
 
 	if (msg->msg_flags & ~(MSG_DONTWAIT|MSG_EOR|MSG_CMSG_COMPAT))
diff --git a/net/nfc/llcp_sock.c b/net/nfc/llcp_sock.c
index 77642d18a3b4..70226fc36396 100644
--- a/net/nfc/llcp_sock.c
+++ b/net/nfc/llcp_sock.c
@@ -770,8 +770,7 @@ static int llcp_sock_connect(struct socket *sock, struct sockaddr *_addr,
 	return ret;
 }
 
-static int llcp_sock_sendmsg(struct socket *sock, struct msghdr *msg,
-			     size_t len)
+static int llcp_sock_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct nfc_llcp_sock *llcp_sock = nfc_llcp_sock(sk);
@@ -805,7 +804,7 @@ static int llcp_sock_sendmsg(struct socket *sock, struct msghdr *msg,
 		release_sock(sk);
 
 		return nfc_llcp_send_ui_frame(llcp_sock, addr->dsap, addr->ssap,
-					      msg, len);
+					      msg, msg_data_left(msg));
 	}
 
 	if (sk->sk_state != LLCP_CONNECTED) {
@@ -815,7 +814,7 @@ static int llcp_sock_sendmsg(struct socket *sock, struct msghdr *msg,
 
 	release_sock(sk);
 
-	return nfc_llcp_send_i_frame(llcp_sock, msg, len);
+	return nfc_llcp_send_i_frame(llcp_sock, msg, msg_data_left(msg));
 }
 
 static int llcp_sock_recvmsg(struct socket *sock, struct msghdr *msg,
diff --git a/net/nfc/rawsock.c b/net/nfc/rawsock.c
index 5125392bb68e..d9d54240b2a2 100644
--- a/net/nfc/rawsock.c
+++ b/net/nfc/rawsock.c
@@ -202,11 +202,12 @@ static void rawsock_tx_work(struct work_struct *work)
 	kcov_remote_stop();
 }
 
-static int rawsock_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int rawsock_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct nfc_dev *dev = nfc_rawsock(sk)->dev;
 	struct sk_buff *skb;
+	size_t len = msg_data_left(msg);
 	int rc;
 
 	pr_debug("sock=%p sk=%p len=%zu\n", sock, sk, len);
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 497193f73030..84a95e177260 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -1947,14 +1947,14 @@ static void packet_parse_headers(struct sk_buff *skb, struct socket *sock)
  *	protocol layers and you must therefore supply it with a complete frame
  */
 
-static int packet_sendmsg_spkt(struct socket *sock, struct msghdr *msg,
-			       size_t len)
+static int packet_sendmsg_spkt(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	DECLARE_SOCKADDR(struct sockaddr_pkt *, saddr, msg->msg_name);
 	struct sk_buff *skb = NULL;
 	struct net_device *dev;
 	struct sockcm_cookie sockc;
+	size_t len = msg_data_left(msg);
 	__be16 proto = 0;
 	int err;
 	int extra_len = 0;
@@ -2933,7 +2933,7 @@ static struct sk_buff *packet_alloc_skb(struct sock *sk, size_t prepad,
 	return skb;
 }
 
-static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
+static int packet_snd(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	DECLARE_SOCKADDR(struct sockaddr_ll *, saddr, msg->msg_name);
@@ -2946,6 +2946,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
 	struct virtio_net_hdr vnet_hdr = { 0 };
 	int offset = 0;
 	struct packet_sock *po = pkt_sk(sk);
+	size_t len = msg_data_left(msg);
 	bool has_vnet_hdr = false;
 	int hlen, tlen, linear;
 	int extra_len = 0;
@@ -3093,7 +3094,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
 	return err;
 }
 
-static int packet_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int packet_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct packet_sock *po = pkt_sk(sk);
@@ -3104,7 +3105,7 @@ static int packet_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 	if (data_race(po->tx_ring.pg_vec))
 		return tpacket_snd(po, msg);
 
-	return packet_snd(sock, msg, len);
+	return packet_snd(sock, msg);
 }
 
 /*
diff --git a/net/phonet/datagram.c b/net/phonet/datagram.c
index ff5f49ab236e..4839f7d6785b 100644
--- a/net/phonet/datagram.c
+++ b/net/phonet/datagram.c
@@ -70,10 +70,11 @@ static int pn_init(struct sock *sk)
 	return 0;
 }
 
-static int pn_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+static int pn_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	DECLARE_SOCKADDR(struct sockaddr_pn *, target, msg->msg_name);
 	struct sk_buff *skb;
+	size_t len = msg_data_left(msg);
 	int err;
 
 	if (msg->msg_flags & ~(MSG_DONTWAIT|MSG_EOR|MSG_NOSIGNAL|
diff --git a/net/phonet/pep.c b/net/phonet/pep.c
index 83ea13a50690..5afc99ab9eca 100644
--- a/net/phonet/pep.c
+++ b/net/phonet/pep.c
@@ -1112,10 +1112,11 @@ static int pipe_skb_send(struct sock *sk, struct sk_buff *skb)
 
 }
 
-static int pep_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
+static int pep_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct pep_sock *pn = pep_sk(sk);
 	struct sk_buff *skb;
+	size_t len = msg_data_left(msg);
 	long timeo;
 	int flags = msg->msg_flags;
 	int err, done;
diff --git a/net/phonet/socket.c b/net/phonet/socket.c
index 71e2caf6ab85..99cd62f64944 100644
--- a/net/phonet/socket.c
+++ b/net/phonet/socket.c
@@ -414,15 +414,14 @@ static int pn_socket_listen(struct socket *sock, int backlog)
 	return err;
 }
 
-static int pn_socket_sendmsg(struct socket *sock, struct msghdr *m,
-			     size_t total_len)
+static int pn_socket_sendmsg(struct socket *sock, struct msghdr *m)
 {
 	struct sock *sk = sock->sk;
 
 	if (pn_socket_autobind(sock))
 		return -EAGAIN;
 
-	return sk->sk_prot->sendmsg(sk, m, total_len);
+	return sk->sk_prot->sendmsg(sk, m);
 }
 
 const struct proto_ops phonet_dgram_ops = {
diff --git a/net/qrtr/af_qrtr.c b/net/qrtr/af_qrtr.c
index 5c2fb992803b..7c1b908dd479 100644
--- a/net/qrtr/af_qrtr.c
+++ b/net/qrtr/af_qrtr.c
@@ -888,7 +888,7 @@ static int qrtr_bcast_enqueue(struct qrtr_node *node, struct sk_buff *skb,
 	return 0;
 }
 
-static int qrtr_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int qrtr_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	DECLARE_SOCKADDR(struct sockaddr_qrtr *, addr, msg->msg_name);
 	int (*enqueue_fn)(struct qrtr_node *, struct sk_buff *, int,
@@ -898,7 +898,7 @@ static int qrtr_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 	struct sock *sk = sock->sk;
 	struct qrtr_node *node;
 	struct sk_buff *skb;
-	size_t plen;
+	size_t plen, len = msg_data_left(msg);
 	u32 type;
 	int rc;
 
diff --git a/net/rds/rds.h b/net/rds/rds.h
index d35d1fc39807..9e8ecafd5b51 100644
--- a/net/rds/rds.h
+++ b/net/rds/rds.h
@@ -909,7 +909,7 @@ void rds6_inc_info_copy(struct rds_incoming *inc,
 			int flip);
 
 /* send.c */
-int rds_sendmsg(struct socket *sock, struct msghdr *msg, size_t payload_len);
+int rds_sendmsg(struct socket *sock, struct msghdr *msg);
 void rds_send_path_reset(struct rds_conn_path *conn);
 int rds_send_xmit(struct rds_conn_path *cp);
 struct sockaddr_in;
diff --git a/net/rds/send.c b/net/rds/send.c
index 5e57a1581dc6..f588b720e1c3 100644
--- a/net/rds/send.c
+++ b/net/rds/send.c
@@ -1098,7 +1098,7 @@ static int rds_rdma_bytes(struct msghdr *msg, size_t *rdma_bytes)
 	return 0;
 }
 
-int rds_sendmsg(struct socket *sock, struct msghdr *msg, size_t payload_len)
+int rds_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct rds_sock *rs = rds_sk_to_rs(sk);
@@ -1114,6 +1114,7 @@ int rds_sendmsg(struct socket *sock, struct msghdr *msg, size_t payload_len)
 	struct rds_conn_path *cpath;
 	struct in6_addr daddr;
 	__u32 scope_id = 0;
+	size_t payload_len = msg_data_left(msg);
 	size_t rdma_payload_len = 0;
 	bool zcopy = ((msg->msg_flags & MSG_ZEROCOPY) &&
 		      sock_flag(rds_rs_to_sk(rs), SOCK_ZEROCOPY));
diff --git a/net/rose/af_rose.c b/net/rose/af_rose.c
index ca2b17f32670..938ea0716751 100644
--- a/net/rose/af_rose.c
+++ b/net/rose/af_rose.c
@@ -1069,7 +1069,7 @@ int rose_rx_call_request(struct sk_buff *skb, struct net_device *dev, struct ros
 	return 1;
 }
 
-static int rose_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int rose_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct rose_sock *rose = rose_sk(sk);
@@ -1078,6 +1078,7 @@ static int rose_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 	struct full_sockaddr_rose srose;
 	struct sk_buff *skb;
 	unsigned char *asmptr;
+	size_t len = msg_data_left(msg);
 	int n, size, qbit = 0;
 
 	if (msg->msg_flags & ~(MSG_DONTWAIT|MSG_EOR|MSG_CMSG_COMPAT))
diff --git a/net/rxrpc/af_rxrpc.c b/net/rxrpc/af_rxrpc.c
index 102f5cbff91a..bdce6ab30899 100644
--- a/net/rxrpc/af_rxrpc.c
+++ b/net/rxrpc/af_rxrpc.c
@@ -502,13 +502,13 @@ static int rxrpc_connect(struct socket *sock, struct sockaddr *addr,
  *   - sends a call data packet
  *   - may send an abort (abort code in control data)
  */
-static int rxrpc_sendmsg(struct socket *sock, struct msghdr *m, size_t len)
+static int rxrpc_sendmsg(struct socket *sock, struct msghdr *m)
 {
 	struct rxrpc_local *local;
 	struct rxrpc_sock *rx = rxrpc_sk(sock->sk);
 	int ret;
 
-	_enter(",{%d},,%zu", rx->sk.sk_state, len);
+	_enter(",{%d},,%zu", rx->sk.sk_state, msg_data_left(m));
 
 	if (m->msg_flags & MSG_OOB)
 		return -EOPNOTSUPP;
@@ -562,7 +562,7 @@ static int rxrpc_sendmsg(struct socket *sock, struct msghdr *m, size_t len)
 		fallthrough;
 	case RXRPC_SERVER_BOUND:
 	case RXRPC_SERVER_LISTENING:
-		ret = rxrpc_do_sendmsg(rx, m, len);
+		ret = rxrpc_do_sendmsg(rx, m);
 		/* The socket has been unlocked */
 		goto out;
 	default:
diff --git a/net/rxrpc/ar-internal.h b/net/rxrpc/ar-internal.h
index 67b0a894162d..36738f8f050d 100644
--- a/net/rxrpc/ar-internal.h
+++ b/net/rxrpc/ar-internal.h
@@ -1221,7 +1221,7 @@ struct key *rxrpc_look_up_server_security(struct rxrpc_connection *,
  */
 bool rxrpc_propose_abort(struct rxrpc_call *call, s32 abort_code, int error,
 			 enum rxrpc_abort_reason why);
-int rxrpc_do_sendmsg(struct rxrpc_sock *, struct msghdr *, size_t);
+int rxrpc_do_sendmsg(struct rxrpc_sock *, struct msghdr *);
 
 /*
  * server_key.c
diff --git a/net/rxrpc/output.c b/net/rxrpc/output.c
index 5e53429c6922..0f3ff3455101 100644
--- a/net/rxrpc/output.c
+++ b/net/rxrpc/output.c
@@ -16,9 +16,9 @@
 #include <net/udp.h>
 #include "ar-internal.h"
 
-extern int udpv6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len);
+extern int udpv6_sendmsg(struct sock *sk, struct msghdr *msg);
 
-static ssize_t do_udp_sendmsg(struct socket *socket, struct msghdr *msg, size_t len)
+static ssize_t do_udp_sendmsg(struct socket *socket, struct msghdr *msg)
 {
 	struct sockaddr *sa = msg->msg_name;
 	struct sock *sk = socket->sk;
@@ -29,10 +29,10 @@ static ssize_t do_udp_sendmsg(struct socket *socket, struct msghdr *msg, size_t
 				pr_warn("AF_INET6 address on AF_INET socket\n");
 				return -ENOPROTOOPT;
 			}
-			return udpv6_sendmsg(sk, msg, len);
+			return udpv6_sendmsg(sk, msg);
 		}
 	}
-	return udp_sendmsg(sk, msg, len);
+	return udp_sendmsg(sk, msg);
 }
 
 struct rxrpc_abort_buffer {
@@ -232,7 +232,7 @@ int rxrpc_send_ack_packet(struct rxrpc_call *call, struct rxrpc_txbuf *txb)
 	txb->ack.previousPacket	= htonl(call->rx_highest_seq);
 
 	iov_iter_kvec(&msg.msg_iter, WRITE, iov, 1, len);
-	ret = do_udp_sendmsg(conn->local->socket, &msg, len);
+	ret = do_udp_sendmsg(conn->local->socket, &msg);
 	call->peer->last_tx_at = ktime_get_seconds();
 	if (ret < 0) {
 		trace_rxrpc_tx_fail(call->debug_id, serial, ret,
@@ -306,7 +306,7 @@ int rxrpc_send_abort_packet(struct rxrpc_call *call)
 	pkt.whdr.serial = htonl(serial);
 
 	iov_iter_kvec(&msg.msg_iter, WRITE, iov, 1, sizeof(pkt));
-	ret = do_udp_sendmsg(conn->local->socket, &msg, sizeof(pkt));
+	ret = do_udp_sendmsg(conn->local->socket, &msg);
 	conn->peer->last_tx_at = ktime_get_seconds();
 	if (ret < 0)
 		trace_rxrpc_tx_fail(call->debug_id, serial, ret,
@@ -424,7 +424,7 @@ int rxrpc_send_data_packet(struct rxrpc_call *call, struct rxrpc_txbuf *txb)
 	 *     message and update the peer record
 	 */
 	rxrpc_inc_stat(call->rxnet, stat_tx_data_send);
-	ret = do_udp_sendmsg(conn->local->socket, &msg, len);
+	ret = do_udp_sendmsg(conn->local->socket, &msg);
 	conn->peer->last_tx_at = ktime_get_seconds();
 
 	if (ret < 0) {
@@ -497,7 +497,7 @@ int rxrpc_send_data_packet(struct rxrpc_call *call, struct rxrpc_txbuf *txb)
 		ip_sock_set_mtu_discover(conn->local->socket->sk,
 					 IP_PMTUDISC_DONT);
 		rxrpc_inc_stat(call->rxnet, stat_tx_data_send_frag);
-		ret = do_udp_sendmsg(conn->local->socket, &msg, len);
+		ret = do_udp_sendmsg(conn->local->socket, &msg);
 		conn->peer->last_tx_at = ktime_get_seconds();
 
 		ip_sock_set_mtu_discover(conn->local->socket->sk,
@@ -564,7 +564,7 @@ void rxrpc_send_conn_abort(struct rxrpc_connection *conn)
 	whdr.serial = htonl(serial);
 
 	iov_iter_kvec(&msg.msg_iter, WRITE, iov, 2, len);
-	ret = do_udp_sendmsg(conn->local->socket, &msg, len);
+	ret = do_udp_sendmsg(conn->local->socket, &msg);
 	if (ret < 0) {
 		trace_rxrpc_tx_fail(conn->debug_id, serial, ret,
 				    rxrpc_tx_point_conn_abort);
@@ -633,7 +633,7 @@ void rxrpc_reject_packet(struct rxrpc_local *local, struct sk_buff *skb)
 		whdr.flags	&= RXRPC_CLIENT_INITIATED;
 
 		iov_iter_kvec(&msg.msg_iter, WRITE, iov, ioc, size);
-		ret = do_udp_sendmsg(local->socket, &msg, size);
+		ret = do_udp_sendmsg(local->socket, &msg);
 		if (ret < 0)
 			trace_rxrpc_tx_fail(local->debug_id, 0, ret,
 					    rxrpc_tx_point_reject);
@@ -682,7 +682,7 @@ void rxrpc_send_keepalive(struct rxrpc_peer *peer)
 	len = iov[0].iov_len + iov[1].iov_len;
 
 	iov_iter_kvec(&msg.msg_iter, WRITE, iov, 2, len);
-	ret = do_udp_sendmsg(peer->local->socket, &msg, len);
+	ret = do_udp_sendmsg(peer->local->socket, &msg);
 	if (ret < 0)
 		trace_rxrpc_tx_fail(peer->debug_id, 0, ret,
 				    rxrpc_tx_point_version_keepalive);
diff --git a/net/rxrpc/rxperf.c b/net/rxrpc/rxperf.c
index 4a2e90015ca7..0167afb67a7a 100644
--- a/net/rxrpc/rxperf.c
+++ b/net/rxrpc/rxperf.c
@@ -507,7 +507,7 @@ static int rxperf_process_call(struct rxperf_call *call)
 		iov_iter_bvec(&msg.msg_iter, WRITE, &bv, 1, len);
 		msg.msg_flags = MSG_MORE;
 		n = rxrpc_kernel_send_data(rxperf_socket, call->rxcall, &msg,
-					   len, rxperf_notify_end_reply_tx);
+					   rxperf_notify_end_reply_tx);
 		if (n < 0)
 			return n;
 		if (n == 0)
@@ -520,7 +520,7 @@ static int rxperf_process_call(struct rxperf_call *call)
 	iov[0].iov_len	= len;
 	iov_iter_kvec(&msg.msg_iter, WRITE, iov, 1, len);
 	msg.msg_flags = 0;
-	n = rxrpc_kernel_send_data(rxperf_socket, call->rxcall, &msg, len,
+	n = rxrpc_kernel_send_data(rxperf_socket, call->rxcall, &msg,
 				   rxperf_notify_end_reply_tx);
 	if (n >= 0)
 		return 0; /* Success */
diff --git a/net/rxrpc/sendmsg.c b/net/rxrpc/sendmsg.c
index da49fcf1c456..b6ffd8124ced 100644
--- a/net/rxrpc/sendmsg.c
+++ b/net/rxrpc/sendmsg.c
@@ -280,7 +280,7 @@ static void rxrpc_queue_packet(struct rxrpc_sock *rx, struct rxrpc_call *call,
  */
 static int rxrpc_send_data(struct rxrpc_sock *rx,
 			   struct rxrpc_call *call,
-			   struct msghdr *msg, size_t len,
+			   struct msghdr *msg,
 			   rxrpc_notify_end_tx_t notify_end_tx,
 			   bool *_dropped_lock)
 {
@@ -327,9 +327,9 @@ static int rxrpc_send_data(struct rxrpc_sock *rx,
 
 	ret = -EMSGSIZE;
 	if (call->tx_total_len != -1) {
-		if (len - copied > call->tx_total_len)
+		if (msg_data_left(msg) > call->tx_total_len)
 			goto maybe_error;
-		if (!more && len - copied != call->tx_total_len)
+		if (!more && msg_data_left(msg) != call->tx_total_len)
 			goto maybe_error;
 	}
 
@@ -612,7 +612,7 @@ rxrpc_new_client_call_for_sendmsg(struct rxrpc_sock *rx, struct msghdr *msg,
  * - caller holds the socket locked
  * - the socket may be either a client socket or a server socket
  */
-int rxrpc_do_sendmsg(struct rxrpc_sock *rx, struct msghdr *msg, size_t len)
+int rxrpc_do_sendmsg(struct rxrpc_sock *rx, struct msghdr *msg)
 	__releases(&rx->sk.sk_lock.slock)
 {
 	struct rxrpc_call *call;
@@ -723,7 +723,7 @@ int rxrpc_do_sendmsg(struct rxrpc_sock *rx, struct msghdr *msg, size_t len)
 	} else if (p.command != RXRPC_CMD_SEND_DATA) {
 		ret = -EINVAL;
 	} else {
-		ret = rxrpc_send_data(rx, call, msg, len, NULL, &dropped_lock);
+		ret = rxrpc_send_data(rx, call, msg, NULL, &dropped_lock);
 	}
 
 out_put_unlock:
@@ -744,7 +744,6 @@ int rxrpc_do_sendmsg(struct rxrpc_sock *rx, struct msghdr *msg, size_t len)
  * @sock: The socket the call is on
  * @call: The call to send data through
  * @msg: The data to send
- * @len: The amount of data to send
  * @notify_end_tx: Notification that the last packet is queued.
  *
  * Allow a kernel service to send data on a call.  The call must be in an state
@@ -753,7 +752,7 @@ int rxrpc_do_sendmsg(struct rxrpc_sock *rx, struct msghdr *msg, size_t len)
  * more data to come, otherwise this data will end the transmission phase.
  */
 int rxrpc_kernel_send_data(struct socket *sock, struct rxrpc_call *call,
-			   struct msghdr *msg, size_t len,
+			   struct msghdr *msg,
 			   rxrpc_notify_end_tx_t notify_end_tx)
 {
 	bool dropped_lock = false;
@@ -766,7 +765,7 @@ int rxrpc_kernel_send_data(struct socket *sock, struct rxrpc_call *call,
 
 	mutex_lock(&call->user_mutex);
 
-	ret = rxrpc_send_data(rxrpc_sk(sock->sk), call, msg, len,
+	ret = rxrpc_send_data(rxrpc_sk(sock->sk), call, msg,
 			      notify_end_tx, &dropped_lock);
 	if (ret == -ESHUTDOWN)
 		ret = call->error;
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index b91616f819de..da99aab89d82 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -1935,7 +1935,7 @@ static void sctp_sendmsg_update_sinfo(struct sctp_association *asoc,
 	}
 }
 
-static int sctp_sendmsg(struct sock *sk, struct msghdr *msg, size_t msg_len)
+static int sctp_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	struct sctp_endpoint *ep = sctp_sk(sk)->ep;
 	struct sctp_transport *transport = NULL;
@@ -1943,6 +1943,7 @@ static int sctp_sendmsg(struct sock *sk, struct msghdr *msg, size_t msg_len)
 	struct sctp_association *asoc, *tmp;
 	struct sctp_cmsgs cmsgs;
 	union sctp_addr *daddr;
+	size_t msg_len = msg_data_left(msg);
 	bool new = false;
 	__u16 sflags;
 	int err;
diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index c6b4a62276f6..0e725698ebcd 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -2653,10 +2653,11 @@ static int smc_getname(struct socket *sock, struct sockaddr *addr,
 	return smc->clcsock->ops->getname(smc->clcsock, addr, peer);
 }
 
-static int smc_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int smc_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct smc_sock *smc;
+	size_t len = msg_data_left(msg);
 	int rc;
 
 	smc = smc_sk(sk);
@@ -2681,7 +2682,7 @@ static int smc_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 	}
 
 	if (smc->use_fallback) {
-		rc = smc->clcsock->ops->sendmsg(smc->clcsock, msg, len);
+		rc = smc->clcsock->ops->sendmsg(smc->clcsock, msg);
 	} else {
 		rc = smc_tx_sendmsg(smc, msg, len);
 		SMC_STAT_TX_PAYLOAD(smc, len, rc);
diff --git a/net/socket.c b/net/socket.c
index 73e493da4589..1690e1782bf0 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -708,10 +708,8 @@ void __sock_tx_timestamp(__u16 tsflags, __u8 *tx_flags)
 }
 EXPORT_SYMBOL(__sock_tx_timestamp);
 
-INDIRECT_CALLABLE_DECLARE(int inet_sendmsg(struct socket *, struct msghdr *,
-					   size_t));
-INDIRECT_CALLABLE_DECLARE(int inet6_sendmsg(struct socket *, struct msghdr *,
-					    size_t));
+INDIRECT_CALLABLE_DECLARE(int inet_sendmsg(struct socket *, struct msghdr *));
+INDIRECT_CALLABLE_DECLARE(int inet6_sendmsg(struct socket *, struct msghdr *));
 
 static noinline void call_trace_sock_send_length(struct sock *sk, int ret,
 						 int flags)
@@ -722,8 +720,7 @@ static noinline void call_trace_sock_send_length(struct sock *sk, int ret,
 static inline int sock_sendmsg_nosec(struct socket *sock, struct msghdr *msg)
 {
 	int ret = INDIRECT_CALL_INET(sock->ops->sendmsg, inet6_sendmsg,
-				     inet_sendmsg, sock, msg,
-				     msg_data_left(msg));
+				     inet_sendmsg, sock, msg);
 	BUG_ON(ret == -EIOCBQUEUED);
 
 	if (trace_sock_send_length_enabled())
@@ -741,8 +738,7 @@ static inline int sock_sendmsg_nosec(struct socket *sock, struct msghdr *msg)
  */
 int sock_sendmsg(struct socket *sock, struct msghdr *msg)
 {
-	int err = security_socket_sendmsg(sock, msg,
-					  msg_data_left(msg));
+	int err = security_socket_sendmsg(sock, msg);
 
 	return err ?: sock_sendmsg_nosec(sock, msg);
 }
@@ -787,11 +783,11 @@ int kernel_sendmsg_locked(struct sock *sk, struct msghdr *msg,
 	struct socket *sock = sk->sk_socket;
 
 	if (!sock->ops->sendmsg_locked)
-		return sock_no_sendmsg_locked(sk, msg, size);
+		return sock_no_sendmsg_locked(sk, msg);
 
 	iov_iter_kvec(&msg->msg_iter, ITER_SOURCE, vec, num, size);
 
-	return sock->ops->sendmsg_locked(sk, msg, msg_data_left(msg));
+	return sock->ops->sendmsg_locked(sk, msg);
 }
 EXPORT_SYMBOL(kernel_sendmsg_locked);
 
diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 37edfe10f8c6..bd677e707548 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -156,8 +156,8 @@ static int tipc_sk_leave(struct tipc_sock *tsk);
 static struct tipc_sock *tipc_sk_lookup(struct net *net, u32 portid);
 static int tipc_sk_insert(struct tipc_sock *tsk);
 static void tipc_sk_remove(struct tipc_sock *tsk);
-static int __tipc_sendstream(struct socket *sock, struct msghdr *m, size_t dsz);
-static int __tipc_sendmsg(struct socket *sock, struct msghdr *m, size_t dsz);
+static int __tipc_sendstream(struct socket *sock, struct msghdr *m);
+static int __tipc_sendmsg(struct socket *sock, struct msghdr *m);
 static void tipc_sk_push_backlog(struct tipc_sock *tsk, bool nagle_ack);
 static int tipc_wait_for_connect(struct socket *sock, long *timeo_p);
 
@@ -1385,7 +1385,6 @@ static void tipc_sk_conn_proto_rcv(struct tipc_sock *tsk, struct sk_buff *skb,
  * tipc_sendmsg - send message in connectionless manner
  * @sock: socket structure
  * @m: message to send
- * @dsz: amount of user data to be sent
  *
  * Message must have an destination specified explicitly.
  * Used for SOCK_RDM and SOCK_DGRAM messages,
@@ -1394,20 +1393,19 @@ static void tipc_sk_conn_proto_rcv(struct tipc_sock *tsk, struct sk_buff *skb,
  *
  * Return: the number of bytes sent on success, or errno otherwise
  */
-static int tipc_sendmsg(struct socket *sock,
-			struct msghdr *m, size_t dsz)
+static int tipc_sendmsg(struct socket *sock, struct msghdr *m)
 {
 	struct sock *sk = sock->sk;
 	int ret;
 
 	lock_sock(sk);
-	ret = __tipc_sendmsg(sock, m, dsz);
+	ret = __tipc_sendmsg(sock, m);
 	release_sock(sk);
 
 	return ret;
 }
 
-static int __tipc_sendmsg(struct socket *sock, struct msghdr *m, size_t dlen)
+static int __tipc_sendmsg(struct socket *sock, struct msghdr *m)
 {
 	struct sock *sk = sock->sk;
 	struct net *net = sock_net(sk);
@@ -1420,6 +1418,7 @@ static int __tipc_sendmsg(struct socket *sock, struct msghdr *m, size_t dlen)
 	struct tipc_msg *hdr = &tsk->phdr;
 	struct tipc_socket_addr skaddr;
 	struct sk_buff_head pkts;
+	size_t dlen = msg_data_left(m);
 	int atype, mtu, rc;
 
 	if (unlikely(dlen > TIPC_MAX_USER_MSG_SIZE))
@@ -1535,26 +1534,25 @@ static int __tipc_sendmsg(struct socket *sock, struct msghdr *m, size_t dlen)
  * tipc_sendstream - send stream-oriented data
  * @sock: socket structure
  * @m: data to send
- * @dsz: total length of data to be transmitted
  *
  * Used for SOCK_STREAM data.
  *
  * Return: the number of bytes sent on success (or partial success),
  * or errno if no data sent
  */
-static int tipc_sendstream(struct socket *sock, struct msghdr *m, size_t dsz)
+static int tipc_sendstream(struct socket *sock, struct msghdr *m)
 {
 	struct sock *sk = sock->sk;
 	int ret;
 
 	lock_sock(sk);
-	ret = __tipc_sendstream(sock, m, dsz);
+	ret = __tipc_sendstream(sock, m);
 	release_sock(sk);
 
 	return ret;
 }
 
-static int __tipc_sendstream(struct socket *sock, struct msghdr *m, size_t dlen)
+static int __tipc_sendstream(struct socket *sock, struct msghdr *m)
 {
 	struct sock *sk = sock->sk;
 	DECLARE_SOCKADDR(struct sockaddr_tipc *, dest, m->msg_name);
@@ -1564,6 +1562,7 @@ static int __tipc_sendstream(struct socket *sock, struct msghdr *m, size_t dlen)
 	struct tipc_msg *hdr = &tsk->phdr;
 	struct net *net = sock_net(sk);
 	struct sk_buff *skb;
+	size_t dlen = msg_data_left(m);
 	u32 dnode = tsk_peer_node(tsk);
 	int maxnagle = tsk->maxnagle;
 	int maxpkt = tsk->max_pkt;
@@ -1575,7 +1574,7 @@ static int __tipc_sendstream(struct socket *sock, struct msghdr *m, size_t dlen)
 
 	/* Handle implicit connection setup */
 	if (unlikely(dest && sk->sk_state == TIPC_OPEN)) {
-		rc = __tipc_sendmsg(sock, m, dlen);
+		rc = __tipc_sendmsg(sock, m);
 		if (dlen && dlen == rc) {
 			tsk->peer_caps = tipc_node_get_capabilities(net, dnode);
 			tsk->snt_unacked = tsk_inc(tsk, dlen + msg_hdr_sz(hdr));
@@ -1643,18 +1642,17 @@ static int __tipc_sendstream(struct socket *sock, struct msghdr *m, size_t dlen)
  * tipc_send_packet - send a connection-oriented message
  * @sock: socket structure
  * @m: message to send
- * @dsz: length of data to be transmitted
  *
  * Used for SOCK_SEQPACKET messages.
  *
  * Return: the number of bytes sent on success, or errno otherwise
  */
-static int tipc_send_packet(struct socket *sock, struct msghdr *m, size_t dsz)
+static int tipc_send_packet(struct socket *sock, struct msghdr *m)
 {
-	if (dsz > TIPC_MAX_USER_MSG_SIZE)
+	if (msg_data_left(m) > TIPC_MAX_USER_MSG_SIZE)
 		return -EMSGSIZE;
 
-	return tipc_sendstream(sock, m, dsz);
+	return tipc_sendstream(sock, m);
 }
 
 /* tipc_sk_finish_conn - complete the setup of a connection
@@ -2625,7 +2623,7 @@ static int tipc_connect(struct socket *sock, struct sockaddr *dest,
 		if (!timeout)
 			m.msg_flags = MSG_DONTWAIT;
 
-		res = __tipc_sendmsg(sock, &m, 0);
+		res = __tipc_sendmsg(sock, &m);
 		if ((res < 0) && (res != -EWOULDBLOCK))
 			goto exit;
 
@@ -2781,7 +2779,7 @@ static int tipc_accept(struct socket *sock, struct socket *new_sock, int flags,
 		skb_set_owner_r(buf, new_sk);
 	}
 	iov_iter_kvec(&m.msg_iter, ITER_SOURCE, NULL, 0, 0);
-	__tipc_sendstream(new_sock, &m, 0);
+	__tipc_sendstream(new_sock, &m);
 	release_sock(new_sk);
 exit:
 	release_sock(sk);
diff --git a/net/tls/tls.h b/net/tls/tls.h
index 804c3880d028..a969955ddd7c 100644
--- a/net/tls/tls.h
+++ b/net/tls/tls.h
@@ -96,7 +96,7 @@ int tls_set_sw_offload(struct sock *sk, struct tls_context *ctx, int tx);
 void tls_update_rx_zc_capable(struct tls_context *tls_ctx);
 void tls_sw_strparser_arm(struct sock *sk, struct tls_context *ctx);
 void tls_sw_strparser_done(struct tls_context *tls_ctx);
-int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
+int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg);
 int tls_sw_sendpage_locked(struct sock *sk, struct page *page,
 			   int offset, size_t size, int flags);
 int tls_sw_sendpage(struct sock *sk, struct page *page,
@@ -114,7 +114,7 @@ ssize_t tls_sw_splice_read(struct socket *sock, loff_t *ppos,
 			   struct pipe_inode_info *pipe,
 			   size_t len, unsigned int flags);
 
-int tls_device_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
+int tls_device_sendmsg(struct sock *sk, struct msghdr *msg);
 int tls_device_sendpage(struct sock *sk, struct page *page,
 			int offset, size_t size, int flags);
 int tls_tx_records(struct sock *sk, int flags);
diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
index a7cc4f9faac2..3616dde20a96 100644
--- a/net/tls/tls_device.c
+++ b/net/tls/tls_device.c
@@ -566,7 +566,7 @@ static int tls_push_data(struct sock *sk,
 	return rc;
 }
 
-int tls_device_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
+int tls_device_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	unsigned char record_type = TLS_RECORD_TYPE_DATA;
 	struct tls_context *tls_ctx = tls_get_ctx(sk);
@@ -583,7 +583,8 @@ int tls_device_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 	}
 
 	iter.msg_iter = &msg->msg_iter;
-	rc = tls_push_data(sk, iter, size, msg->msg_flags, record_type, NULL);
+	rc = tls_push_data(sk, iter, msg_data_left(msg), msg->msg_flags,
+			   record_type, NULL);
 
 out:
 	release_sock(sk);
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 635b8bf6b937..17ea9b07a277 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -929,7 +929,7 @@ static int tls_sw_push_pending_record(struct sock *sk, int flags)
 				   &copied, flags);
 }
 
-int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
+int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	long timeo = sock_sndtimeo(sk, msg->msg_flags & MSG_DONTWAIT);
 	struct tls_context *tls_ctx = tls_get_ctx(sk);
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index fb31e8a4409e..37c96a73e6b4 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -756,20 +756,20 @@ static int unix_ioctl(struct socket *, unsigned int, unsigned long);
 static int unix_compat_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg);
 #endif
 static int unix_shutdown(struct socket *, int);
-static int unix_stream_sendmsg(struct socket *, struct msghdr *, size_t);
+static int unix_stream_sendmsg(struct socket *, struct msghdr *);
 static int unix_stream_recvmsg(struct socket *, struct msghdr *, size_t, int);
 static ssize_t unix_stream_sendpage(struct socket *, struct page *, int offset,
 				    size_t size, int flags);
 static ssize_t unix_stream_splice_read(struct socket *,  loff_t *ppos,
 				       struct pipe_inode_info *, size_t size,
 				       unsigned int flags);
-static int unix_dgram_sendmsg(struct socket *, struct msghdr *, size_t);
+static int unix_dgram_sendmsg(struct socket *, struct msghdr *);
 static int unix_dgram_recvmsg(struct socket *, struct msghdr *, size_t, int);
 static int unix_read_skb(struct sock *sk, skb_read_actor_t recv_actor);
 static int unix_stream_read_skb(struct sock *sk, skb_read_actor_t recv_actor);
 static int unix_dgram_connect(struct socket *, struct sockaddr *,
 			      int, int);
-static int unix_seqpacket_sendmsg(struct socket *, struct msghdr *, size_t);
+static int unix_seqpacket_sendmsg(struct socket *, struct msghdr *);
 static int unix_seqpacket_recvmsg(struct socket *, struct msghdr *, size_t,
 				  int);
 
@@ -1888,14 +1888,14 @@ static void scm_stat_del(struct sock *sk, struct sk_buff *skb)
  *	Send AF_UNIX data.
  */
 
-static int unix_dgram_sendmsg(struct socket *sock, struct msghdr *msg,
-			      size_t len)
+static int unix_dgram_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	DECLARE_SOCKADDR(struct sockaddr_un *, sunaddr, msg->msg_name);
 	struct sock *sk = sock->sk, *other = NULL;
 	struct unix_sock *u = unix_sk(sk);
 	struct scm_cookie scm;
 	struct sk_buff *skb;
+	size_t len = msg_data_left(msg);
 	int data_len = 0;
 	int sk_locked;
 	long timeo;
@@ -2157,11 +2157,11 @@ static int queue_oob(struct socket *sock, struct msghdr *msg, struct sock *other
 }
 #endif
 
-static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg,
-			       size_t len)
+static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct sock *other = NULL;
+	size_t len = msg_data_left(msg);
 	int err, size;
 	struct sk_buff *skb;
 	int sent = 0;
@@ -2388,8 +2388,7 @@ static ssize_t unix_stream_sendpage(struct socket *socket, struct page *page,
 	return err;
 }
 
-static int unix_seqpacket_sendmsg(struct socket *sock, struct msghdr *msg,
-				  size_t len)
+static int unix_seqpacket_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	int err;
 	struct sock *sk = sock->sk;
@@ -2404,7 +2403,7 @@ static int unix_seqpacket_sendmsg(struct socket *sock, struct msghdr *msg,
 	if (msg->msg_namelen)
 		msg->msg_namelen = 0;
 
-	return unix_dgram_sendmsg(sock, msg, len);
+	return unix_dgram_sendmsg(sock, msg);
 }
 
 static int unix_seqpacket_recvmsg(struct socket *sock, struct msghdr *msg,
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 19aea7cba26e..20bac3e04abd 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1131,8 +1131,7 @@ static __poll_t vsock_poll(struct file *file, struct socket *sock,
 	return mask;
 }
 
-static int vsock_dgram_sendmsg(struct socket *sock, struct msghdr *msg,
-			       size_t len)
+static int vsock_dgram_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	int err;
 	struct sock *sk;
@@ -1198,7 +1197,7 @@ static int vsock_dgram_sendmsg(struct socket *sock, struct msghdr *msg,
 		goto out;
 	}
 
-	err = transport->dgram_enqueue(vsk, remote_addr, msg, len);
+	err = transport->dgram_enqueue(vsk, remote_addr, msg, msg_data_left(msg));
 
 out:
 	release_sock(sk);
@@ -1737,8 +1736,7 @@ static int vsock_connectible_getsockopt(struct socket *sock,
 	return 0;
 }
 
-static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
-				     size_t len)
+static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk;
 	struct vsock_sock *vsk;
@@ -1794,7 +1792,7 @@ static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
 	if (err < 0)
 		goto out;
 
-	while (total_written < len) {
+	while (msg_data_left(msg)) {
 		ssize_t written;
 
 		add_wait_queue(sk_sleep(sk), &wait);
@@ -1856,10 +1854,10 @@ static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
 
 		if (sk->sk_type == SOCK_SEQPACKET) {
 			written = transport->seqpacket_enqueue(vsk,
-						msg, len - total_written);
+					msg, msg_data_left(msg));
 		} else {
 			written = transport->stream_enqueue(vsk,
-					msg, len - total_written);
+					msg, msg_data_left(msg));
 		}
 
 		if (written < 0) {
@@ -1882,7 +1880,7 @@ static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
 		 * 1) SOCK_STREAM socket.
 		 * 2) SOCK_SEQPACKET socket when whole buffer is sent.
 		 */
-		if (sk->sk_type == SOCK_STREAM || total_written == len)
+		if (sk->sk_type == SOCK_STREAM || !msg_data_left(msg))
 			err = total_written;
 	}
 out:
diff --git a/net/x25/af_x25.c b/net/x25/af_x25.c
index 5c7ad301d742..5b8751669136 100644
--- a/net/x25/af_x25.c
+++ b/net/x25/af_x25.c
@@ -1100,7 +1100,7 @@ int x25_rx_call_request(struct sk_buff *skb, struct x25_neigh *nb,
 	goto out;
 }
 
-static int x25_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
+static int x25_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sock *sk = sock->sk;
 	struct x25_sock *x25 = x25_sk(sk);
@@ -1108,6 +1108,7 @@ static int x25_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
 	struct sockaddr_x25 sx25;
 	struct sk_buff *skb;
 	unsigned char *asmptr;
+	size_t len = msg_data_left(msg);
 	int noblock = msg->msg_flags & MSG_DONTWAIT;
 	size_t size;
 	int qbit = 0, rc = -EINVAL;
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 2ac58b282b5e..db82e2a287f5 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -629,7 +629,7 @@ static int xsk_check_common(struct xdp_sock *xs)
 	return 0;
 }
 
-static int __xsk_sendmsg(struct socket *sock, struct msghdr *m, size_t total_len)
+static int __xsk_sendmsg(struct socket *sock, struct msghdr *m)
 {
 	bool need_wait = !(m->msg_flags & MSG_DONTWAIT);
 	struct sock *sk = sock->sk;
@@ -663,12 +663,12 @@ static int __xsk_sendmsg(struct socket *sock, struct msghdr *m, size_t total_len
 	return 0;
 }
 
-static int xsk_sendmsg(struct socket *sock, struct msghdr *m, size_t total_len)
+static int xsk_sendmsg(struct socket *sock, struct msghdr *m)
 {
 	int ret;
 
 	rcu_read_lock();
-	ret = __xsk_sendmsg(sock, m, total_len);
+	ret = __xsk_sendmsg(sock, m);
 	rcu_read_unlock();
 
 	return ret;
diff --git a/net/xfrm/espintcp.c b/net/xfrm/espintcp.c
index 872b80188e83..d07faa356347 100644
--- a/net/xfrm/espintcp.c
+++ b/net/xfrm/espintcp.c
@@ -311,13 +311,14 @@ int espintcp_push_skb(struct sock *sk, struct sk_buff *skb)
 }
 EXPORT_SYMBOL_GPL(espintcp_push_skb);
 
-static int espintcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
+static int espintcp_sendmsg(struct sock *sk, struct msghdr *msg)
 {
 	long timeo = sock_sndtimeo(sk, msg->msg_flags & MSG_DONTWAIT);
 	struct espintcp_ctx *ctx = espintcp_getctx(sk);
 	struct espintcp_msg *emsg = &ctx->partial;
 	struct iov_iter pfx_iter;
 	struct kvec pfx_iov = {};
+	size_t size = msg_data_left(msg);
 	size_t msglen = size + 2;
 	char buf[2] = {0};
 	int err, end;
@@ -325,7 +326,7 @@ static int espintcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 	if (msg->msg_flags & ~MSG_DONTWAIT)
 		return -EOPNOTSUPP;
 
-	if (size > MAX_ESPINTCP_MSG)
+	if (msg_data_left(msg) > MAX_ESPINTCP_MSG)
 		return -EMSGSIZE;
 
 	if (msg->msg_controllen)
@@ -362,7 +363,8 @@ static int espintcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 	if (err < 0)
 		goto fail;
 
-	err = sk_msg_memcopy_from_iter(sk, &msg->msg_iter, &emsg->skmsg, size);
+	err = sk_msg_memcopy_from_iter(sk, &msg->msg_iter, &emsg->skmsg,
+				       msg_data_left(msg));
 	if (err < 0)
 		goto fail;
 
diff --git a/security/apparmor/lsm.c b/security/apparmor/lsm.c
index d6cc4812ca53..cb220a8e8126 100644
--- a/security/apparmor/lsm.c
+++ b/security/apparmor/lsm.c
@@ -997,10 +997,10 @@ static int aa_sock_msg_perm(const char *op, u32 request, struct socket *sock,
 /**
  * apparmor_socket_sendmsg - check perms before sending msg to another socket
  */
-static int apparmor_socket_sendmsg(struct socket *sock,
-				   struct msghdr *msg, int size)
+static int apparmor_socket_sendmsg(struct socket *sock, struct msghdr *msg)
 {
-	return aa_sock_msg_perm(OP_SENDMSG, AA_MAY_SEND, sock, msg, size);
+	return aa_sock_msg_perm(OP_SENDMSG, AA_MAY_SEND, sock, msg,
+				msg_data_left(msg));
 }
 
 /**
diff --git a/security/security.c b/security/security.c
index cf6cc576736f..faa87f363af8 100644
--- a/security/security.c
+++ b/security/security.c
@@ -2301,9 +2301,9 @@ int security_socket_accept(struct socket *sock, struct socket *newsock)
 	return call_int_hook(socket_accept, 0, sock, newsock);
 }
 
-int security_socket_sendmsg(struct socket *sock, struct msghdr *msg, int size)
+int security_socket_sendmsg(struct socket *sock, struct msghdr *msg)
 {
-	return call_int_hook(socket_sendmsg, 0, sock, msg, size);
+	return call_int_hook(socket_sendmsg, 0, sock, msg);
 }
 
 int security_socket_recvmsg(struct socket *sock, struct msghdr *msg,
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 9a5bdfc21314..ff0d82e6331d 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -4912,8 +4912,7 @@ static int selinux_socket_accept(struct socket *sock, struct socket *newsock)
 	return 0;
 }
 
-static int selinux_socket_sendmsg(struct socket *sock, struct msghdr *msg,
-				  int size)
+static int selinux_socket_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	return sock_has_perm(sock->sk, SOCKET__WRITE);
 }
diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c
index cfcbb748da25..ca30c105f254 100644
--- a/security/smack/smack_lsm.c
+++ b/security/smack/smack_lsm.c
@@ -3730,14 +3730,12 @@ static int smack_unix_may_send(struct socket *sock, struct socket *other)
  * smack_socket_sendmsg - Smack check based on destination host
  * @sock: the socket
  * @msg: the message
- * @size: the size of the message
  *
  * Return 0 if the current subject can write to the destination host.
  * For IPv4 this is only a question if the destination is a single label host.
  * For IPv6 this is a check against the label of the port.
  */
-static int smack_socket_sendmsg(struct socket *sock, struct msghdr *msg,
-				int size)
+static int smack_socket_sendmsg(struct socket *sock, struct msghdr *msg)
 {
 	struct sockaddr_in *sip = (struct sockaddr_in *) msg->msg_name;
 #if IS_ENABLED(CONFIG_IPV6)
diff --git a/security/tomoyo/common.h b/security/tomoyo/common.h
index ca285f362705..0841098d966a 100644
--- a/security/tomoyo/common.h
+++ b/security/tomoyo/common.h
@@ -997,8 +997,7 @@ int tomoyo_socket_bind_permission(struct socket *sock, struct sockaddr *addr,
 int tomoyo_socket_connect_permission(struct socket *sock,
 				     struct sockaddr *addr, int addr_len);
 int tomoyo_socket_listen_permission(struct socket *sock);
-int tomoyo_socket_sendmsg_permission(struct socket *sock, struct msghdr *msg,
-				     int size);
+int tomoyo_socket_sendmsg_permission(struct socket *sock, struct msghdr *msg);
 int tomoyo_supervisor(struct tomoyo_request_info *r, const char *fmt, ...)
 	__printf(2, 3);
 int tomoyo_update_domain(struct tomoyo_acl_info *new_entry, const int size,
diff --git a/security/tomoyo/network.c b/security/tomoyo/network.c
index 8dc61335f65e..0315b335cdff 100644
--- a/security/tomoyo/network.c
+++ b/security/tomoyo/network.c
@@ -751,12 +751,10 @@ int tomoyo_socket_bind_permission(struct socket *sock, struct sockaddr *addr,
  *
  * @sock: Pointer to "struct socket".
  * @msg:  Pointer to "struct msghdr".
- * @size: Unused.
  *
  * Returns 0 on success, negative value otherwise.
  */
-int tomoyo_socket_sendmsg_permission(struct socket *sock, struct msghdr *msg,
-				     int size)
+int tomoyo_socket_sendmsg_permission(struct socket *sock, struct msghdr *msg)
 {
 	struct tomoyo_addr_info address;
 	const u8 family = tomoyo_sock_family(sock->sk);
diff --git a/security/tomoyo/tomoyo.c b/security/tomoyo/tomoyo.c
index af04a7b7eb28..72c6f343ffba 100644
--- a/security/tomoyo/tomoyo.c
+++ b/security/tomoyo/tomoyo.c
@@ -489,14 +489,12 @@ static int tomoyo_socket_bind(struct socket *sock, struct sockaddr *addr,
  *
  * @sock: Pointer to "struct socket".
  * @msg:  Pointer to "struct msghdr".
- * @size: Size of message.
  *
  * Returns 0 on success, negative value otherwise.
  */
-static int tomoyo_socket_sendmsg(struct socket *sock, struct msghdr *msg,
-				 int size)
+static int tomoyo_socket_sendmsg(struct socket *sock, struct msghdr *msg)
 {
-	return tomoyo_socket_sendmsg_permission(sock, msg, size);
+	return tomoyo_socket_sendmsg_permission(sock, msg);
 }
 
 struct lsm_blob_sizes tomoyo_blob_sizes __lsm_ro_after_init = {


* [RFC PATCH 2/3] ip: Make __ip{,6}_append_data() and co. take a msghdr*
  2023-03-22 13:56         ` [RFC PATCH 0/3] net: Drop size arg from ->sendmsg() and pass msghdr into __ip{,6}_append_data() David Howells
  2023-03-22 13:56             ` David Howells
@ 2023-03-22 13:56           ` David Howells
  2023-03-22 17:25             ` kernel test robot
                               ` (3 more replies)
  2023-03-22 13:56           ` [RFC PATCH 3/3] net: Declare MSG_SPLICE_PAGES internal sendmsg() flag David Howells
  2023-03-23  1:17           ` [RFC PATCH 0/3] net: Drop size arg from ->sendmsg() and pass msghdr into __ip{,6}_append_data() Willem de Bruijn
  3 siblings, 4 replies; 81+ messages in thread
From: David Howells @ 2023-03-22 13:56 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: David Howells, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Matthew Wilcox, Jeff Layton, Linus Torvalds, netdev,
	linux-kernel, David Ahern

In order to pass an extra internal flag to indicate that sendmsg() should
splice pages rather than copying them, pass a struct msghdr pointer into
various paths that lead to __ip_append_data() and __ip6_append_data() and
thence into getfrag().  The flag can then be stashed in the msghdr struct
in a new field to avoid polluting the msg_flags field with non-UAPI flags.

Passing msghdr around like this allows the length and flags arguments to
__ip*_append_data() to be eliminated (the values can be obtained from the
msghdr and its iterator).  Unfortunately, the "from" parameter can't be
eliminated so easily, as the icmp routines in particular rely on it.

The getfrag function pointer is formalised as ip_getfrag_t by typedef.
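
As a rough userspace illustration of the calling convention described
above (not kernel code): the sketch below models a getfrag callback that
receives the msghdr as its first argument, and an append routine that reads
the length and flags from the msghdr instead of taking them as parameters.
Every name ending in _model, the MODEL_MSG_MORE constant and the simplified
iterator are hypothetical stand-ins, and the sk_buff argument is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel definitions. */
struct iter_model {
	const char *data;	/* next byte to copy */
	size_t count;		/* bytes remaining */
};

struct msghdr_model {
	unsigned int msg_flags;
	struct iter_model msg_iter;
};

#define MODEL_MSG_MORE 0x8000u	/* illustrative value only */

/* Same idea as the ip_getfrag_t typedef: msg comes first, then "from". */
typedef int (*getfrag_model_t)(struct msghdr_model *msg, void *from, char *to,
			       int offset, int len);

/* Models msg_data_left(): the bytes still held by the iterator. */
static size_t model_data_left(const struct msghdr_model *msg)
{
	return msg->msg_iter.count;
}

/* Generic callback: pulls from the iterator and ignores "from". */
static int generic_getfrag_model(struct msghdr_model *msg, void *from, char *to,
				 int offset, int len)
{
	(void)from;
	(void)offset;
	if ((size_t)len > msg->msg_iter.count)
		return -1;
	memcpy(to, msg->msg_iter.data, (size_t)len);
	msg->msg_iter.data += len;
	msg->msg_iter.count -= (size_t)len;
	return 0;
}

/* Length and flags come from msg, so neither is a parameter any more. */
static int append_data_model(struct msghdr_model *msg, getfrag_model_t getfrag,
			     void *from, char *out, size_t outlen, int *more)
{
	size_t length;

	assert(msg && getfrag && out && more);
	length = model_data_left(msg);
	*more = (msg->msg_flags & MODEL_MSG_MORE) != 0;
	if (length > outlen)
		return -1;
	if (getfrag(msg, from, out, 0, (int)length) < 0)
		return -1;
	return (int)length;
}
```

A caller that only has iterator data passes NULL for "from"; a caller with
private state (as the icmp routines have) would pass that state in "from" and
could ignore the iterator entirely.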

This requires the following additional changes:

 (1) __ip_append_data() and __ip6_append_data() add transhdrlen onto the
     data length inside the functions rather than it being included in
     msg_data_left().

 (2) A few places, such as icmp_glue_bits(), have to create a msghdr they
     didn't need before in order to pass in flags and length.  They also
     need to cheat a bit and stash the length in msg->msg_iter.count - even
     though they don't actually use the iterator.

 (3) udp_sendmsg() OR's MSG_MORE into msg->msg_flags if the corkflag is
     set.  Separate flags don't then need to be passed in to
     ip_append_data().  Ditto udpv6_sendmsg().
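
The three changes listed above can be modelled in userspace roughly as
follows.  This is only a sketch under stated assumptions: the _model types
and helpers and the MODEL_* constants are hypothetical stand-ins rather than
kernel definitions, and nothing here touches a real socket.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the kernel structures. */
struct iter_model {
	size_t count;		/* bytes remaining in the iterator */
};

struct msghdr_model {
	unsigned int msg_flags;
	struct iter_model msg_iter;
};

#define MODEL_MSG_DONTWAIT 0x40u	/* illustrative values only */
#define MODEL_MSG_MORE     0x8000u

/* (1) The callee adds the transport header length onto the data length. */
static size_t total_length_model(const struct msghdr_model *msg,
				 size_t transhdrlen)
{
	assert(msg);
	return msg->msg_iter.count + transhdrlen;
}

/*
 * (2) A caller with no real iterator builds a stub msghdr that carries only
 * the flags and the data length.
 */
static struct msghdr_model stub_msghdr_model(size_t data_len)
{
	struct msghdr_model msg = {
		.msg_flags	= MODEL_MSG_DONTWAIT,
		.msg_iter.count	= data_len,
	};

	return msg;
}

/*
 * (3) Fold the socket's cork state into msg_flags so no separate flags
 * argument is needed; returns nonzero if corking is requested either way.
 */
static int fold_cork_model(struct msghdr_model *msg, int corkflag)
{
	assert(msg);
	if (corkflag)
		msg->msg_flags |= MODEL_MSG_MORE;
	return (msg->msg_flags & MODEL_MSG_MORE) != 0;
}
```

With a stub built for 100 bytes of data and an 8-byte transport header,
total_length_model() yields 108, which corresponds to the data_len + head_len
sum that icmp_push_reply() previously computed at the call site.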

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
cc: David Ahern <dsahern@kernel.org>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Matthew Wilcox <willy@infradead.org>
cc: netdev@vger.kernel.org
---
 include/net/ip.h      | 24 ++++++--------
 include/net/ipv6.h    | 20 ++++++------
 include/net/ping.h    |  5 ++-
 include/net/udplite.h |  4 +--
 net/ipv4/icmp.c       | 14 +++++----
 net/ipv4/ip_output.c  | 73 ++++++++++++++++++++++---------------------
 net/ipv4/ping.c       | 10 +++---
 net/ipv4/raw.c        | 20 ++++++------
 net/ipv4/udp.c        | 19 ++++++-----
 net/ipv6/icmp.c       | 21 ++++++++-----
 net/ipv6/ip6_output.c | 57 +++++++++++++++------------------
 net/ipv6/ping.c       |  7 ++---
 net/ipv6/raw.c        | 22 ++++++-------
 net/ipv6/udp.c        | 19 ++++++-----
 14 files changed, 155 insertions(+), 160 deletions(-)

diff --git a/include/net/ip.h b/include/net/ip.h
index c3fffaa92d6e..152553bd9ad4 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -211,15 +211,13 @@ int ip_local_out(struct net *net, struct sock *sk, struct sk_buff *skb);
 int __ip_queue_xmit(struct sock *sk, struct sk_buff *skb, struct flowi *fl,
 		    __u8 tos);
 void ip_init(void);
-int ip_append_data(struct sock *sk, struct flowi4 *fl4,
-		   int getfrag(void *from, char *to, int offset, int len,
-			       int odd, struct sk_buff *skb),
-		   void *from, int len, int protolen,
-		   struct ipcm_cookie *ipc,
-		   struct rtable **rt,
-		   unsigned int flags);
-int ip_generic_getfrag(void *from, char *to, int offset, int len, int odd,
-		       struct sk_buff *skb);
+typedef int (*ip_getfrag_t)(struct msghdr *msg, void *from, char *to,
+			    int offset, int len, int odd, struct sk_buff *skb);
+int ip_append_data(struct sock *sk, struct flowi4 *fl4, struct msghdr *msg,
+		   ip_getfrag_t getfrag, void *from, int protolen,
+		   struct ipcm_cookie *ipc, struct rtable **rt);
+int ip_generic_getfrag(struct msghdr *msg, void *from, char *to,
+		       int offset, int len, int odd, struct sk_buff *skb);
 ssize_t ip_append_page(struct sock *sk, struct flowi4 *fl4, struct page *page,
 		       int offset, size_t size, int flags);
 struct sk_buff *__ip_make_skb(struct sock *sk, struct flowi4 *fl4,
@@ -228,12 +226,10 @@ struct sk_buff *__ip_make_skb(struct sock *sk, struct flowi4 *fl4,
 int ip_send_skb(struct net *net, struct sk_buff *skb);
 int ip_push_pending_frames(struct sock *sk, struct flowi4 *fl4);
 void ip_flush_pending_frames(struct sock *sk);
-struct sk_buff *ip_make_skb(struct sock *sk, struct flowi4 *fl4,
-			    int getfrag(void *from, char *to, int offset,
-					int len, int odd, struct sk_buff *skb),
-			    void *from, int length, int transhdrlen,
+struct sk_buff *ip_make_skb(struct sock *sk, struct flowi4 *fl4, struct msghdr *msg,
+			    ip_getfrag_t getfrag, int transhdrlen,
 			    struct ipcm_cookie *ipc, struct rtable **rtp,
-			    struct inet_cork *cork, unsigned int flags);
+			    struct inet_cork *cork);
 
 int ip_queue_xmit(struct sock *sk, struct sk_buff *skb, struct flowi *fl);
 
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index f2132311e92b..bec2ecf31076 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -1094,12 +1094,13 @@ int ip6_xmit(const struct sock *sk, struct sk_buff *skb, struct flowi6 *fl6,
 
 int ip6_find_1stfragopt(struct sk_buff *skb, u8 **nexthdr);
 
-int ip6_append_data(struct sock *sk,
-		    int getfrag(void *from, char *to, int offset, int len,
-				int odd, struct sk_buff *skb),
-		    void *from, size_t length, int transhdrlen,
+typedef int (*ip_getfrag_t)(struct msghdr *msg, void *from, char *to,
+			    int offset, int len, int odd, struct sk_buff *skb);
+
+int ip6_append_data(struct sock *sk, struct msghdr *msg,
+		    ip_getfrag_t getfrag, void *from, int transhdrlen,
 		    struct ipcm6_cookie *ipc6, struct flowi6 *fl6,
-		    struct rt6_info *rt, unsigned int flags);
+		    struct rt6_info *rt);
 
 int ip6_push_pending_frames(struct sock *sk);
 
@@ -1110,12 +1111,9 @@ int ip6_send_skb(struct sk_buff *skb);
 struct sk_buff *__ip6_make_skb(struct sock *sk, struct sk_buff_head *queue,
 			       struct inet_cork_full *cork,
 			       struct inet6_cork *v6_cork);
-struct sk_buff *ip6_make_skb(struct sock *sk,
-			     int getfrag(void *from, char *to, int offset,
-					 int len, int odd, struct sk_buff *skb),
-			     void *from, size_t length, int transhdrlen,
-			     struct ipcm6_cookie *ipc6,
-			     struct rt6_info *rt, unsigned int flags,
+struct sk_buff *ip6_make_skb(struct sock *sk, struct msghdr *msg,
+			     ip_getfrag_t getfrag, void *from, int transhdrlen,
+			     struct ipcm6_cookie *ipc6, struct rt6_info *rt,
 			     struct inet_cork_full *cork);
 
 static inline struct sk_buff *ip6_finish_skb(struct sock *sk)
diff --git a/include/net/ping.h b/include/net/ping.h
index 04814edde8e3..cfa7cbeb5ebc 100644
--- a/include/net/ping.h
+++ b/include/net/ping.h
@@ -52,7 +52,6 @@ extern struct pingv6_ops pingv6_ops;
 
 struct pingfakehdr {
 	struct icmphdr icmph;
-	struct msghdr *msg;
 	sa_family_t family;
 	__wsum wcheck;
 };
@@ -65,8 +64,8 @@ int  ping_init_sock(struct sock *sk);
 void ping_close(struct sock *sk, long timeout);
 int  ping_bind(struct sock *sk, struct sockaddr *uaddr, int addr_len);
 void ping_err(struct sk_buff *skb, int offset, u32 info);
-int  ping_getfrag(void *from, char *to, int offset, int fraglen, int odd,
-		  struct sk_buff *);
+int  ping_getfrag(struct msghdr *msg, void *from, char *to,
+		  int offset, int fraglen, int odd, struct sk_buff *skb);
 
 int  ping_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
 		  int flags, int *addr_len);
diff --git a/include/net/udplite.h b/include/net/udplite.h
index 299c14ce2bb9..13ffb096154f 100644
--- a/include/net/udplite.h
+++ b/include/net/udplite.h
@@ -18,10 +18,10 @@ extern struct udp_table		udplite_table;
 /*
  *	Checksum computation is all in software, hence simpler getfrag.
  */
-static __inline__ int udplite_getfrag(void *from, char *to, int  offset,
+static __inline__ int udplite_getfrag(struct msghdr *msg,
+				      void *from, char *to, int  offset,
 				      int len, int odd, struct sk_buff *skb)
 {
-	struct msghdr *msg = from;
 	return copy_from_iter_full(to, len, &msg->msg_iter) ? 0 : -EFAULT;
 }
 
diff --git a/net/ipv4/icmp.c b/net/ipv4/icmp.c
index 8cebb476b3ab..5496cd50285a 100644
--- a/net/ipv4/icmp.c
+++ b/net/ipv4/icmp.c
@@ -344,8 +344,8 @@ void icmp_out_count(struct net *net, unsigned char type)
  *	Checksum each fragment, and on the first include the headers and final
  *	checksum.
  */
-static int icmp_glue_bits(void *from, char *to, int offset, int len, int odd,
-			  struct sk_buff *skb)
+static int icmp_glue_bits(struct msghdr *msg, void *from, char *to,
+			  int offset, int len, int odd, struct sk_buff *skb)
 {
 	struct icmp_bxm *icmp_param = from;
 	__wsum csum;
@@ -366,11 +366,13 @@ static void icmp_push_reply(struct sock *sk,
 			    struct ipcm_cookie *ipc, struct rtable **rt)
 {
 	struct sk_buff *skb;
+	struct msghdr msg = {
+		.msg_flags	= MSG_DONTWAIT,
+		.msg_iter.count	= icmp_param->data_len,
+	};
 
-	if (ip_append_data(sk, fl4, icmp_glue_bits, icmp_param,
-			   icmp_param->data_len+icmp_param->head_len,
-			   icmp_param->head_len,
-			   ipc, rt, MSG_DONTWAIT) < 0) {
+	if (ip_append_data(sk, fl4, &msg, icmp_glue_bits, icmp_param,
+			   icmp_param->head_len, ipc, rt) < 0) {
 		__ICMP_INC_STATS(sock_net(sk), ICMP_MIB_OUTERRORS);
 		ip_flush_pending_frames(sk);
 	} else if ((skb = skb_peek(&sk->sk_write_queue)) != NULL) {
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index cb04dbad9ea4..46ab2ea25764 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -929,10 +929,9 @@ int ip_do_fragment(struct net *net, struct sock *sk, struct sk_buff *skb,
 EXPORT_SYMBOL(ip_do_fragment);
 
 int
-ip_generic_getfrag(void *from, char *to, int offset, int len, int odd, struct sk_buff *skb)
+ip_generic_getfrag(struct msghdr *msg, void *from, char *to,
+		   int offset, int len, int odd, struct sk_buff *skb)
 {
-	struct msghdr *msg = from;
-
 	if (skb->ip_summed == CHECKSUM_PARTIAL) {
 		if (!copy_from_iter_full(to, len, &msg->msg_iter))
 			return -EFAULT;
@@ -959,13 +958,12 @@ csum_page(struct page *page, int offset, int copy)
 
 static int __ip_append_data(struct sock *sk,
 			    struct flowi4 *fl4,
+			    struct msghdr *msg,
 			    struct sk_buff_head *queue,
 			    struct inet_cork *cork,
 			    struct page_frag *pfrag,
-			    int getfrag(void *from, char *to, int offset,
-					int len, int odd, struct sk_buff *skb),
-			    void *from, int length, int transhdrlen,
-			    unsigned int flags)
+			    ip_getfrag_t getfrag,
+			    void *from, int transhdrlen)
 {
 	struct inet_sock *inet = inet_sk(sk);
 	struct ubuf_info *uarg = NULL;
@@ -978,6 +976,7 @@ static int __ip_append_data(struct sock *sk,
 	int err;
 	int offset = 0;
 	bool zc = false;
+	unsigned int length = msg_data_left(msg) + transhdrlen;
 	unsigned int maxfraglen, fragheaderlen, maxnonfragsize;
 	int csummode = CHECKSUM_NONE;
 	struct rtable *rt = (struct rtable *)cork->dst;
@@ -1014,11 +1013,11 @@ static int __ip_append_data(struct sock *sk,
 	if (transhdrlen &&
 	    length + fragheaderlen <= mtu &&
 	    rt->dst.dev->features & (NETIF_F_HW_CSUM | NETIF_F_IP_CSUM) &&
-	    (!(flags & MSG_MORE) || cork->gso_size) &&
+	    (!(msg->msg_flags & MSG_MORE) || cork->gso_size) &&
 	    (!exthdrlen || (rt->dst.dev->features & NETIF_F_HW_ESP_TX_CSUM)))
 		csummode = CHECKSUM_PARTIAL;
 
-	if ((flags & MSG_ZEROCOPY) && length) {
+	if ((msg->msg_flags & MSG_ZEROCOPY) && length) {
 		struct msghdr *msg = from;
 
 		if (getfrag == ip_generic_getfrag && msg->msg_ubuf) {
@@ -1103,7 +1102,7 @@ static int __ip_append_data(struct sock *sk,
 			if (datalen == length + fraggap)
 				alloc_extra += rt->dst.trailer_len;
 
-			if ((flags & MSG_MORE) &&
+			if ((msg->msg_flags & MSG_MORE) &&
 			    !(rt->dst.dev->features&NETIF_F_SG))
 				alloclen = mtu;
 			else if (!paged &&
@@ -1119,7 +1118,7 @@ static int __ip_append_data(struct sock *sk,
 
 			if (transhdrlen) {
 				skb = sock_alloc_send_skb(sk, alloclen,
-						(flags & MSG_DONTWAIT), &err);
+						(msg->msg_flags & MSG_DONTWAIT), &err);
 			} else {
 				skb = NULL;
 				if (refcount_read(&sk->sk_wmem_alloc) + wmem_alloc_delta <=
@@ -1159,7 +1158,8 @@ static int __ip_append_data(struct sock *sk,
 			}
 
 			copy = datalen - transhdrlen - fraggap - pagedlen;
-			if (copy > 0 && getfrag(from, data + transhdrlen, offset, copy, fraggap, skb) < 0) {
+			if (copy > 0 && getfrag(msg, from, data + transhdrlen,
+						offset, copy, fraggap, skb) < 0) {
 				err = -EFAULT;
 				kfree_skb(skb);
 				goto error;
@@ -1178,7 +1178,7 @@ static int __ip_append_data(struct sock *sk,
 			tskey = 0;
 			skb_zcopy_set(skb, uarg, &extra_uref);
 
-			if ((flags & MSG_CONFIRM) && !skb_prev)
+			if ((msg->msg_flags & MSG_CONFIRM) && !skb_prev)
 				skb_set_dst_pending_confirm(skb, 1);
 
 			/*
@@ -1201,8 +1201,8 @@ static int __ip_append_data(struct sock *sk,
 			unsigned int off;
 
 			off = skb->len;
-			if (getfrag(from, skb_put(skb, copy),
-					offset, copy, off, skb) < 0) {
+			if (getfrag(msg, from, skb_put(skb, copy),
+				    offset, copy, off, skb) < 0) {
 				__skb_trim(skb, off);
 				err = -EFAULT;
 				goto error;
@@ -1227,7 +1227,7 @@ static int __ip_append_data(struct sock *sk,
 				get_page(pfrag->page);
 			}
 			copy = min_t(int, copy, pfrag->size - pfrag->offset);
-			if (getfrag(from,
+			if (getfrag(msg, from,
 				    page_address(pfrag->page) + pfrag->offset,
 				    offset, copy, skb->len, skb) < 0)
 				goto error_efault;
@@ -1320,17 +1320,14 @@ static int ip_setup_cork(struct sock *sk, struct inet_cork *cork,
  *
  *	LATER: length must be adjusted by pad at tail, when it is required.
  */
-int ip_append_data(struct sock *sk, struct flowi4 *fl4,
-		   int getfrag(void *from, char *to, int offset, int len,
-			       int odd, struct sk_buff *skb),
-		   void *from, int length, int transhdrlen,
-		   struct ipcm_cookie *ipc, struct rtable **rtp,
-		   unsigned int flags)
+int ip_append_data(struct sock *sk, struct flowi4 *fl4, struct msghdr *msg,
+		   ip_getfrag_t getfrag, void *from, int transhdrlen,
+		   struct ipcm_cookie *ipc, struct rtable **rtp)
 {
 	struct inet_sock *inet = inet_sk(sk);
 	int err;
 
-	if (flags&MSG_PROBE)
+	if (msg->msg_flags & MSG_PROBE)
 		return 0;
 
 	if (skb_queue_empty(&sk->sk_write_queue)) {
@@ -1341,9 +1338,9 @@ int ip_append_data(struct sock *sk, struct flowi4 *fl4,
 		transhdrlen = 0;
 	}
 
-	return __ip_append_data(sk, fl4, &sk->sk_write_queue, &inet->cork.base,
-				sk_page_frag(sk), getfrag,
-				from, length, transhdrlen, flags);
+	return __ip_append_data(sk, fl4, msg, &sk->sk_write_queue,
+				&inet->cork.base, sk_page_frag(sk),
+				getfrag, from, transhdrlen);
 }
 
 ssize_t	ip_append_page(struct sock *sk, struct flowi4 *fl4, struct page *page,
@@ -1629,16 +1626,16 @@ void ip_flush_pending_frames(struct sock *sk)
 
 struct sk_buff *ip_make_skb(struct sock *sk,
 			    struct flowi4 *fl4,
-			    int getfrag(void *from, char *to, int offset,
-					int len, int odd, struct sk_buff *skb),
-			    void *from, int length, int transhdrlen,
+			    struct msghdr *msg,
+			    ip_getfrag_t getfrag,
+			    int transhdrlen,
 			    struct ipcm_cookie *ipc, struct rtable **rtp,
-			    struct inet_cork *cork, unsigned int flags)
+			    struct inet_cork *cork)
 {
 	struct sk_buff_head queue;
 	int err;
 
-	if (flags & MSG_PROBE)
+	if (msg->msg_flags & MSG_PROBE)
 		return NULL;
 
 	__skb_queue_head_init(&queue);
@@ -1650,9 +1647,9 @@ struct sk_buff *ip_make_skb(struct sock *sk,
 	if (err)
 		return ERR_PTR(err);
 
-	err = __ip_append_data(sk, fl4, &queue, cork,
+	err = __ip_append_data(sk, fl4, msg, &queue, cork,
 			       &current->task_frag, getfrag,
-			       from, length, transhdrlen, flags);
+			       msg, transhdrlen);
 	if (err) {
 		__ip_flush_pending_frames(sk, &queue, cork);
 		return ERR_PTR(err);
@@ -1664,7 +1661,7 @@ struct sk_buff *ip_make_skb(struct sock *sk,
 /*
  *	Fetch data from kernel space and fill in checksum if needed.
  */
-static int ip_reply_glue_bits(void *dptr, char *to, int offset,
+static int ip_reply_glue_bits(struct msghdr *msg, void *dptr, char *to, int offset,
 			      int len, int odd, struct sk_buff *skb)
 {
 	__wsum csum;
@@ -1690,6 +1687,10 @@ void ip_send_unicast_reply(struct sock *sk, struct sk_buff *skb,
 	struct rtable *rt = skb_rtable(skb);
 	struct net *net = sock_net(sk);
 	struct sk_buff *nskb;
+	struct msghdr msg = {
+		.msg_flags	= MSG_DONTWAIT,
+		.msg_iter.count	= len,
+	};
 	int err;
 	int oif;
 
@@ -1730,8 +1731,8 @@ void ip_send_unicast_reply(struct sock *sk, struct sk_buff *skb,
 	sk->sk_bound_dev_if = arg->bound_dev_if;
 	sk->sk_sndbuf = READ_ONCE(sysctl_wmem_default);
 	ipc.sockc.mark = fl4.flowi4_mark;
-	err = ip_append_data(sk, &fl4, ip_reply_glue_bits, arg->iov->iov_base,
-			     len, 0, &ipc, &rt, MSG_DONTWAIT);
+	err = ip_append_data(sk, &fl4, &msg, ip_reply_glue_bits, arg->iov->iov_base,
+			     0, &ipc, &rt);
 	if (unlikely(err)) {
 		ip_flush_pending_frames(sk);
 		goto out;
diff --git a/net/ipv4/ping.c b/net/ipv4/ping.c
index f689f9f530c9..e93e0a8849cb 100644
--- a/net/ipv4/ping.c
+++ b/net/ipv4/ping.c
@@ -617,13 +617,13 @@ EXPORT_SYMBOL_GPL(ping_err);
  *	starting from the payload.
  */
 
-int ping_getfrag(void *from, char *to,
+int ping_getfrag(struct msghdr *msg, void *from, char *to,
 		 int offset, int fraglen, int odd, struct sk_buff *skb)
 {
 	struct pingfakehdr *pfh = from;
 
 	if (!csum_and_copy_from_iter_full(to, fraglen, &pfh->wcheck,
-					  &pfh->msg->msg_iter))
+					  &msg->msg_iter))
 		return -EFAULT;
 
 #if IS_ENABLED(CONFIG_IPV6)
@@ -832,13 +832,11 @@ static int ping_v4_sendmsg(struct sock *sk, struct msghdr *msg)
 	pfh.icmph.checksum = 0;
 	pfh.icmph.un.echo.id = inet->inet_sport;
 	pfh.icmph.un.echo.sequence = user_icmph.un.echo.sequence;
-	pfh.msg = msg;
 	pfh.wcheck = 0;
 	pfh.family = AF_INET;
 
-	err = ip_append_data(sk, &fl4, ping_getfrag, &pfh, len,
-			     sizeof(struct icmphdr), &ipc, &rt,
-			     msg->msg_flags);
+	err = ip_append_data(sk, &fl4, msg, ping_getfrag, &pfh,
+			     sizeof(struct icmphdr), &ipc, &rt);
 	if (err)
 		ip_flush_pending_frames(sk);
 	else
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index f2859c117796..504045163f86 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -77,7 +77,6 @@
 #include <linux/uio.h>
 
 struct raw_frag_vec {
-	struct msghdr *msg;
 	union {
 		struct icmphdr icmph;
 		char c[1];
@@ -420,7 +419,8 @@ static int raw_send_hdrinc(struct sock *sk, struct flowi4 *fl4,
 	return err;
 }
 
-static int raw_probe_proto_opt(struct raw_frag_vec *rfv, struct flowi4 *fl4)
+static int raw_probe_proto_opt(struct msghdr *msg, struct raw_frag_vec *rfv,
+			       struct flowi4 *fl4)
 {
 	int err;
 
@@ -430,7 +430,7 @@ static int raw_probe_proto_opt(struct raw_frag_vec *rfv, struct flowi4 *fl4)
 	/* We only need the first two bytes. */
 	rfv->hlen = 2;
 
-	err = memcpy_from_msg(rfv->hdr.c, rfv->msg, rfv->hlen);
+	err = memcpy_from_msg(rfv->hdr.c, msg, rfv->hlen);
 	if (err)
 		return err;
 
@@ -440,8 +440,8 @@ static int raw_probe_proto_opt(struct raw_frag_vec *rfv, struct flowi4 *fl4)
 	return 0;
 }
 
-static int raw_getfrag(void *from, char *to, int offset, int len, int odd,
-		       struct sk_buff *skb)
+static int raw_getfrag(struct msghdr *msg, void *from, char *to,
+		       int offset, int len, int odd, struct sk_buff *skb)
 {
 	struct raw_frag_vec *rfv = from;
 
@@ -468,7 +468,7 @@ static int raw_getfrag(void *from, char *to, int offset, int len, int odd,
 
 	offset -= rfv->hlen;
 
-	return ip_generic_getfrag(rfv->msg, to, offset, len, odd, skb);
+	return ip_generic_getfrag(msg, NULL, to, offset, len, odd, skb);
 }
 
 static int raw_sendmsg(struct sock *sk, struct msghdr *msg)
@@ -608,10 +608,9 @@ static int raw_sendmsg(struct sock *sk, struct msghdr *msg)
 			   daddr, saddr, 0, 0, sk->sk_uid);
 
 	if (!hdrincl) {
-		rfv.msg = msg;
 		rfv.hlen = 0;
 
-		err = raw_probe_proto_opt(&rfv, &fl4);
+		err = raw_probe_proto_opt(msg, &rfv, &fl4);
 		if (err)
 			goto done;
 	}
@@ -640,9 +639,8 @@ static int raw_sendmsg(struct sock *sk, struct msghdr *msg)
 		if (!ipc.addr)
 			ipc.addr = fl4.daddr;
 		lock_sock(sk);
-		err = ip_append_data(sk, &fl4, raw_getfrag,
-				     &rfv, len, 0,
-				     &ipc, &rt, msg->msg_flags);
+		err = ip_append_data(sk, &fl4, msg, raw_getfrag,
+				     &rfv, 0, &ipc, &rt);
 		if (err)
 			ip_flush_pending_frames(sk);
 		else if (!(msg->msg_flags & MSG_MORE)) {
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index b2ed9d37a362..bb2e2e98c94c 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1066,11 +1066,16 @@ int udp_sendmsg(struct sock *sk, struct msghdr *msg)
 	__be16 dport;
 	u8  tos;
 	int err, is_udplite = IS_UDPLITE(sk);
-	int corkreq = READ_ONCE(up->corkflag) || msg->msg_flags&MSG_MORE;
-	int (*getfrag)(void *, char *, int, int, int, struct sk_buff *);
+	bool corkreq = READ_ONCE(up->corkflag);
+	ip_getfrag_t getfrag;
 	struct sk_buff *skb;
 	struct ip_options_data opt_copy;
 
+	if (corkreq)
+		msg->msg_flags |= MSG_MORE;
+	else
+		corkreq = msg->msg_flags & MSG_MORE;
+
 	if (len > 0xFFFF)
 		return -EMSGSIZE;
 
@@ -1258,9 +1263,8 @@ int udp_sendmsg(struct sock *sk, struct msghdr *msg)
 	if (!corkreq) {
 		struct inet_cork cork;
 
-		skb = ip_make_skb(sk, fl4, getfrag, msg, ulen,
-				  sizeof(struct udphdr), &ipc, &rt,
-				  &cork, msg->msg_flags);
+		skb = ip_make_skb(sk, fl4, msg, getfrag,
+				  sizeof(struct udphdr), &ipc, &rt, &cork);
 		err = PTR_ERR(skb);
 		if (!IS_ERR_OR_NULL(skb))
 			err = udp_send_skb(skb, fl4, &cork);
@@ -1289,9 +1293,8 @@ int udp_sendmsg(struct sock *sk, struct msghdr *msg)
 
 do_append_data:
 	up->len += ulen;
-	err = ip_append_data(sk, fl4, getfrag, msg, ulen,
-			     sizeof(struct udphdr), &ipc, &rt,
-			     corkreq ? msg->msg_flags|MSG_MORE : msg->msg_flags);
+	err = ip_append_data(sk, fl4, msg, getfrag, NULL,
+			     sizeof(struct udphdr), &ipc, &rt);
 	if (err)
 		udp_flush_pending_frames(sk);
 	else if (!corkreq)
diff --git a/net/ipv6/icmp.c b/net/ipv6/icmp.c
index 1f53f2a74480..92d94943bbee 100644
--- a/net/ipv6/icmp.c
+++ b/net/ipv6/icmp.c
@@ -313,7 +313,8 @@ struct icmpv6_msg {
 	uint8_t		type;
 };
 
-static int icmpv6_getfrag(void *from, char *to, int offset, int len, int odd, struct sk_buff *skb)
+static int icmpv6_getfrag(struct msghdr *_msg, void *from, char *to,
+			  int offset, int len, int odd, struct sk_buff *skb)
 {
 	struct icmpv6_msg *msg = (struct icmpv6_msg *) from;
 	struct sk_buff *org_skb = msg->skb;
@@ -453,6 +454,7 @@ void icmp6_send(struct sk_buff *skb, u8 type, u8 code, __u32 info,
 	struct flowi6 fl6;
 	struct icmpv6_msg msg;
 	struct ipcm6_cookie ipc6;
+	struct msghdr msghdr;
 	int iif = 0;
 	int addr_type = 0;
 	int len;
@@ -606,14 +608,15 @@ void icmp6_send(struct sk_buff *skb, u8 type, u8 code, __u32 info,
 		goto out_dst_release;
 	}
 
+	msghdr.msg_iter.count = len;
+	msghdr.msg_flags = MSG_DONTWAIT;
+
 	rcu_read_lock();
 	idev = __in6_dev_get(skb->dev);
 
-	if (ip6_append_data(sk, icmpv6_getfrag, &msg,
-			    len + sizeof(struct icmp6hdr),
+	if (ip6_append_data(sk, &msghdr, icmpv6_getfrag, &msg,
 			    sizeof(struct icmp6hdr),
-			    &ipc6, &fl6, (struct rt6_info *)dst,
-			    MSG_DONTWAIT)) {
+			    &ipc6, &fl6, (struct rt6_info *)dst)) {
 		ICMP6_INC_STATS(net, idev, ICMP6_MIB_OUTERRORS);
 		ip6_flush_pending_frames(sk);
 	} else {
@@ -718,6 +721,7 @@ static enum skb_drop_reason icmpv6_echo_reply(struct sk_buff *skb)
 	struct icmpv6_msg msg;
 	struct dst_entry *dst;
 	struct ipcm6_cookie ipc6;
+	struct msghdr msghdr;
 	u32 mark = IP6_REPLY_MARK(net, skb->mark);
 	SKB_DR(reason);
 	bool acast;
@@ -796,10 +800,11 @@ static enum skb_drop_reason icmpv6_echo_reply(struct sk_buff *skb)
 		if (!icmp_build_probe(skb, (struct icmphdr *)&tmp_hdr))
 			goto out_dst_release;
 
-	if (ip6_append_data(sk, icmpv6_getfrag, &msg,
-			    skb->len + sizeof(struct icmp6hdr),
+	msghdr.msg_iter.count	= skb->len;
+	msghdr.msg_flags	= MSG_DONTWAIT;
+	if (ip6_append_data(sk, &msghdr, icmpv6_getfrag, &msg,
 			    sizeof(struct icmp6hdr), &ipc6, &fl6,
-			    (struct rt6_info *)dst, MSG_DONTWAIT)) {
+			    (struct rt6_info *)dst)) {
 		__ICMP6_INC_STATS(net, idev, ICMP6_MIB_OUTERRORS);
 		ip6_flush_pending_frames(sk);
 	} else {
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index e5ed39a3c65f..171a026d1dca 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1462,13 +1462,13 @@ static int ip6_setup_cork(struct sock *sk, struct inet_cork_full *cork,
 
 static int __ip6_append_data(struct sock *sk,
 			     struct sk_buff_head *queue,
+			     struct msghdr *msg,
 			     struct inet_cork_full *cork_full,
 			     struct inet6_cork *v6_cork,
 			     struct page_frag *pfrag,
-			     int getfrag(void *from, char *to, int offset,
-					 int len, int odd, struct sk_buff *skb),
-			     void *from, size_t length, int transhdrlen,
-			     unsigned int flags, struct ipcm6_cookie *ipc6)
+			     ip_getfrag_t getfrag,
+			     void *from, int transhdrlen,
+			     struct ipcm6_cookie *ipc6)
 {
 	struct sk_buff *skb, *skb_prev = NULL;
 	struct inet_cork *cork = &cork_full->base;
@@ -1488,6 +1488,7 @@ static int __ip6_append_data(struct sock *sk,
 	int csummode = CHECKSUM_NONE;
 	unsigned int maxnonfragsize, headersize;
 	unsigned int wmem_alloc_delta = 0;
+	size_t length = msg_data_left(msg) + transhdrlen;
 	bool paged, extra_uref = false;
 
 	skb = skb_peek_tail(queue);
@@ -1555,11 +1556,11 @@ static int __ip6_append_data(struct sock *sk,
 	if (transhdrlen && sk->sk_protocol == IPPROTO_UDP &&
 	    headersize == sizeof(struct ipv6hdr) &&
 	    length <= mtu - headersize &&
-	    (!(flags & MSG_MORE) || cork->gso_size) &&
+	    (!(msg->msg_flags & MSG_MORE) || cork->gso_size) &&
 	    rt->dst.dev->features & (NETIF_F_IPV6_CSUM | NETIF_F_HW_CSUM))
 		csummode = CHECKSUM_PARTIAL;
 
-	if ((flags & MSG_ZEROCOPY) && length) {
+	if ((msg->msg_flags & MSG_ZEROCOPY) && length) {
 		struct msghdr *msg = from;
 
 		if (getfrag == ip_generic_getfrag && msg->msg_ubuf) {
@@ -1659,7 +1660,7 @@ static int __ip6_append_data(struct sock *sk,
 			 */
 			alloc_extra += sizeof(struct frag_hdr);
 
-			if ((flags & MSG_MORE) &&
+			if ((msg->msg_flags & MSG_MORE) &&
 			    !(rt->dst.dev->features&NETIF_F_SG))
 				alloclen = mtu;
 			else if (!paged &&
@@ -1689,7 +1690,7 @@ static int __ip6_append_data(struct sock *sk,
 			}
 			if (transhdrlen) {
 				skb = sock_alloc_send_skb(sk, alloclen,
-						(flags & MSG_DONTWAIT), &err);
+						(msg->msg_flags & MSG_DONTWAIT), &err);
 			} else {
 				skb = NULL;
 				if (refcount_read(&sk->sk_wmem_alloc) + wmem_alloc_delta <=
@@ -1729,7 +1730,7 @@ static int __ip6_append_data(struct sock *sk,
 				pskb_trim_unique(skb_prev, maxfraglen);
 			}
 			if (copy > 0 &&
-			    getfrag(from, data + transhdrlen, offset,
+			    getfrag(msg, from, data + transhdrlen, offset,
 				    copy, fraggap, skb) < 0) {
 				err = -EFAULT;
 				kfree_skb(skb);
@@ -1749,7 +1750,7 @@ static int __ip6_append_data(struct sock *sk,
 			tskey = 0;
 			skb_zcopy_set(skb, uarg, &extra_uref);
 
-			if ((flags & MSG_CONFIRM) && !skb_prev)
+			if ((msg->msg_flags & MSG_CONFIRM) && !skb_prev)
 				skb_set_dst_pending_confirm(skb, 1);
 
 			/*
@@ -1772,8 +1773,8 @@ static int __ip6_append_data(struct sock *sk,
 			unsigned int off;
 
 			off = skb->len;
-			if (getfrag(from, skb_put(skb, copy),
-						offset, copy, off, skb) < 0) {
+			if (getfrag(msg, from, skb_put(skb, copy),
+				    offset, copy, off, skb) < 0) {
 				__skb_trim(skb, off);
 				err = -EFAULT;
 				goto error;
@@ -1798,7 +1799,7 @@ static int __ip6_append_data(struct sock *sk,
 				get_page(pfrag->page);
 			}
 			copy = min_t(int, copy, pfrag->size - pfrag->offset);
-			if (getfrag(from,
+			if (getfrag(msg, from,
 				    page_address(pfrag->page) + pfrag->offset,
 				    offset, copy, skb->len, skb) < 0)
 				goto error_efault;
@@ -1832,19 +1833,17 @@ static int __ip6_append_data(struct sock *sk,
 	return err;
 }
 
-int ip6_append_data(struct sock *sk,
-		    int getfrag(void *from, char *to, int offset, int len,
-				int odd, struct sk_buff *skb),
-		    void *from, size_t length, int transhdrlen,
+int ip6_append_data(struct sock *sk, struct msghdr *msg,
+		    ip_getfrag_t getfrag, void *from, int transhdrlen,
 		    struct ipcm6_cookie *ipc6, struct flowi6 *fl6,
-		    struct rt6_info *rt, unsigned int flags)
+		    struct rt6_info *rt)
 {
 	struct inet_sock *inet = inet_sk(sk);
 	struct ipv6_pinfo *np = inet6_sk(sk);
 	int exthdrlen;
 	int err;
 
-	if (flags&MSG_PROBE)
+	if (msg->msg_flags & MSG_PROBE)
 		return 0;
 	if (skb_queue_empty(&sk->sk_write_queue)) {
 		/*
@@ -1858,15 +1857,14 @@ int ip6_append_data(struct sock *sk,
 
 		inet->cork.fl.u.ip6 = *fl6;
 		exthdrlen = (ipc6->opt ? ipc6->opt->opt_flen : 0);
-		length += exthdrlen;
 		transhdrlen += exthdrlen;
 	} else {
 		transhdrlen = 0;
 	}
 
-	return __ip6_append_data(sk, &sk->sk_write_queue, &inet->cork,
+	return __ip6_append_data(sk, &sk->sk_write_queue, msg, &inet->cork,
 				 &np->cork, sk_page_frag(sk), getfrag,
-				 from, length, transhdrlen, flags, ipc6);
+				 from, transhdrlen, ipc6);
 }
 EXPORT_SYMBOL_GPL(ip6_append_data);
 
@@ -2029,19 +2027,17 @@ void ip6_flush_pending_frames(struct sock *sk)
 }
 EXPORT_SYMBOL_GPL(ip6_flush_pending_frames);
 
-struct sk_buff *ip6_make_skb(struct sock *sk,
-			     int getfrag(void *from, char *to, int offset,
-					 int len, int odd, struct sk_buff *skb),
-			     void *from, size_t length, int transhdrlen,
+struct sk_buff *ip6_make_skb(struct sock *sk, struct msghdr *msg,
+			     ip_getfrag_t getfrag, void *from, int transhdrlen,
 			     struct ipcm6_cookie *ipc6, struct rt6_info *rt,
-			     unsigned int flags, struct inet_cork_full *cork)
+			     struct inet_cork_full *cork)
 {
 	struct inet6_cork v6_cork;
 	struct sk_buff_head queue;
 	int exthdrlen = (ipc6->opt ? ipc6->opt->opt_flen : 0);
 	int err;
 
-	if (flags & MSG_PROBE) {
+	if (msg->msg_flags & MSG_PROBE) {
 		dst_release(&rt->dst);
 		return NULL;
 	}
@@ -2060,10 +2056,9 @@ struct sk_buff *ip6_make_skb(struct sock *sk,
 	if (ipc6->dontfrag < 0)
 		ipc6->dontfrag = inet6_sk(sk)->dontfrag;
 
-	err = __ip6_append_data(sk, &queue, cork, &v6_cork,
+	err = __ip6_append_data(sk, &queue, msg, cork, &v6_cork,
 				&current->task_frag, getfrag, from,
-				length + exthdrlen, transhdrlen + exthdrlen,
-				flags, ipc6);
+				transhdrlen + exthdrlen, ipc6);
 	if (err) {
 		__ip6_flush_pending_frames(sk, &queue, cork, &v6_cork);
 		return ERR_PTR(err);
diff --git a/net/ipv6/ping.c b/net/ipv6/ping.c
index 54c94b28744f..0380d3230814 100644
--- a/net/ipv6/ping.c
+++ b/net/ipv6/ping.c
@@ -166,17 +166,16 @@ static int ping_v6_sendmsg(struct sock *sk, struct msghdr *msg)
 	pfh.icmph.checksum = 0;
 	pfh.icmph.un.echo.id = inet->inet_sport;
 	pfh.icmph.un.echo.sequence = user_icmph.icmp6_sequence;
-	pfh.msg = msg;
 	pfh.wcheck = 0;
 	pfh.family = AF_INET6;
 
 	if (ipc6.hlimit < 0)
 		ipc6.hlimit = ip6_sk_dst_hoplimit(np, &fl6, dst);
 
+	msg->msg_flags = MSG_DONTWAIT;
 	lock_sock(sk);
-	err = ip6_append_data(sk, ping_getfrag, &pfh, len,
-			      sizeof(struct icmp6hdr), &ipc6, &fl6, rt,
-			      MSG_DONTWAIT);
+	err = ip6_append_data(sk, msg, ping_getfrag, &pfh,
+			      sizeof(struct icmp6hdr), &ipc6, &fl6, rt);
 
 	if (err) {
 		ICMP6_INC_STATS(sock_net(sk), rt->rt6i_idev,
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index a3437deeeb74..2affd7589939 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -678,18 +678,18 @@ static int rawv6_send_hdrinc(struct sock *sk, struct msghdr *msg, int length,
 }
 
 struct raw6_frag_vec {
-	struct msghdr *msg;
 	int hlen;
 	char c[4];
 };
 
-static int rawv6_probe_proto_opt(struct raw6_frag_vec *rfv, struct flowi6 *fl6)
+static int rawv6_probe_proto_opt(struct raw6_frag_vec *rfv, struct flowi6 *fl6,
+				 struct msghdr *msg)
 {
 	int err = 0;
 	switch (fl6->flowi6_proto) {
 	case IPPROTO_ICMPV6:
 		rfv->hlen = 2;
-		err = memcpy_from_msg(rfv->c, rfv->msg, rfv->hlen);
+		err = memcpy_from_msg(rfv->c, msg, rfv->hlen);
 		if (!err) {
 			fl6->fl6_icmp_type = rfv->c[0];
 			fl6->fl6_icmp_code = rfv->c[1];
@@ -697,15 +697,15 @@ static int rawv6_probe_proto_opt(struct raw6_frag_vec *rfv, struct flowi6 *fl6)
 		break;
 	case IPPROTO_MH:
 		rfv->hlen = 4;
-		err = memcpy_from_msg(rfv->c, rfv->msg, rfv->hlen);
+		err = memcpy_from_msg(rfv->c, msg, rfv->hlen);
 		if (!err)
 			fl6->fl6_mh_type = rfv->c[2];
 	}
 	return err;
 }
 
-static int raw6_getfrag(void *from, char *to, int offset, int len, int odd,
-		       struct sk_buff *skb)
+static int raw6_getfrag(struct msghdr *msg, void *from, char *to,
+			int offset, int len, int odd, struct sk_buff *skb)
 {
 	struct raw6_frag_vec *rfv = from;
 
@@ -732,7 +732,7 @@ static int raw6_getfrag(void *from, char *to, int offset, int len, int odd,
 
 	offset -= rfv->hlen;
 
-	return ip_generic_getfrag(rfv->msg, to, offset, len, odd, skb);
+	return ip_generic_getfrag(msg, NULL, to, offset, len, odd, skb);
 }
 
 static int rawv6_sendmsg(struct sock *sk, struct msghdr *msg)
@@ -868,9 +868,8 @@ static int rawv6_sendmsg(struct sock *sk, struct msghdr *msg)
 	fl6.flowi6_mark = ipc6.sockc.mark;
 
 	if (!hdrincl) {
-		rfv.msg = msg;
 		rfv.hlen = 0;
-		err = rawv6_probe_proto_opt(&rfv, &fl6);
+		err = rawv6_probe_proto_opt(&rfv, &fl6, msg);
 		if (err)
 			goto out;
 	}
@@ -919,9 +918,8 @@ static int rawv6_sendmsg(struct sock *sk, struct msghdr *msg)
 	else {
 		ipc6.opt = opt;
 		lock_sock(sk);
-		err = ip6_append_data(sk, raw6_getfrag, &rfv,
-			len, 0, &ipc6, &fl6, (struct rt6_info *)dst,
-			msg->msg_flags);
+		err = ip6_append_data(sk, msg, raw6_getfrag, &rfv,
+				      0, &ipc6, &fl6, (struct rt6_info *)dst);
 
 		if (err)
 			ip6_flush_pending_frames(sk);
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 80f2eb58ba1a..5bb67739bc0d 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -1345,10 +1345,15 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg)
 	bool connected = false;
 	size_t len = msg_data_left(msg);
 	int ulen = len;
-	int corkreq = READ_ONCE(up->corkflag) || msg->msg_flags&MSG_MORE;
+	int corkreq = READ_ONCE(up->corkflag);
 	int err;
 	int is_udplite = IS_UDPLITE(sk);
-	int (*getfrag)(void *, char *, int, int, int, struct sk_buff *);
+	ip_getfrag_t getfrag;
+
+	if (corkreq)
+		msg->msg_flags |= MSG_MORE;
+	else
+		corkreq = msg->msg_flags & MSG_MORE;
 
 	ipcm6_init(&ipc6);
 	ipc6.gso_size = READ_ONCE(up->gso_size);
@@ -1578,10 +1583,9 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg)
 	if (!corkreq) {
 		struct sk_buff *skb;
 
-		skb = ip6_make_skb(sk, getfrag, msg, ulen,
+		skb = ip6_make_skb(sk, msg, getfrag, NULL,
 				   sizeof(struct udphdr), &ipc6,
-				   (struct rt6_info *)dst,
-				   msg->msg_flags, &cork);
+				   (struct rt6_info *)dst, &cork);
 		err = PTR_ERR(skb);
 		if (!IS_ERR_OR_NULL(skb))
 			err = udp_v6_send_skb(skb, fl6, &cork.base);
@@ -1606,9 +1610,8 @@ int udpv6_sendmsg(struct sock *sk, struct msghdr *msg)
 	if (ipc6.dontfrag < 0)
 		ipc6.dontfrag = np->dontfrag;
 	up->len += ulen;
-	err = ip6_append_data(sk, getfrag, msg, ulen, sizeof(struct udphdr),
-			      &ipc6, fl6, (struct rt6_info *)dst,
-			      corkreq ? msg->msg_flags|MSG_MORE : msg->msg_flags);
+	err = ip6_append_data(sk, msg, getfrag, NULL, sizeof(struct udphdr),
+			      &ipc6, fl6, (struct rt6_info *)dst);
 	if (err)
 		udp_v6_flush_pending_frames(sk);
 	else if (!corkreq)



* [RFC PATCH 3/3] net: Declare MSG_SPLICE_PAGES internal sendmsg() flag
  2023-03-22 13:56         ` [RFC PATCH 0/3] net: Drop size arg from ->sendmsg() and pass msghdr into __ip{,6}_append_data() David Howells
  2023-03-22 13:56             ` David Howells
  2023-03-22 13:56           ` [RFC PATCH 2/3] ip: Make __ip{,6}_append_data() and co. take a msghdr* David Howells
@ 2023-03-22 13:56           ` David Howells
  2023-03-23  1:17           ` [RFC PATCH 0/3] net: Drop size arg from ->sendmsg() and pass msghdr into __ip{,6}_append_data() Willem de Bruijn
  3 siblings, 0 replies; 81+ messages in thread
From: David Howells @ 2023-03-22 13:56 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: David Howells, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Matthew Wilcox, Jeff Layton, Linus Torvalds, netdev,
	linux-kernel, Jens Axboe

Declare MSG_SPLICE_PAGES, an internal sendmsg() flag, that hints to a
network protocol that it should splice pages from the source iterator
rather than copying the data if it can.  This is set in msg->msg_kflags,
not msg->msg_flags, thereby isolating it from the UAPI.

This is intended as a replacement for the ->sendpage() op, providing a way
to splice in several multipage folios in one go.
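
As a rough illustration of why the flag lives in a separate word, here is a
minimal userspace sketch, not kernel code.  The struct mock_msghdr type, the
MOCK_* names and the mock_wants_splice() helper are all hypothetical stand-ins
invented for this example; only the 0x00000001 value and the msg_flags /
msg_kflags split come from the patch, and MSG_OOB is assumed to have its usual
UAPI value of 1.

```c
#include <stdbool.h>

/* Userspace stand-in for the two flag words of the kernel's struct msghdr. */
struct mock_msghdr {
	unsigned int msg_flags;		/* flags visible through the UAPI */
	unsigned int msg_kflags;	/* kernel-internal flags (new in this patch) */
};

/* Assumed to match the UAPI value of MSG_OOB. */
#define MOCK_MSG_OOB		0x00000001
/* Same numeric value the patch gives MSG_SPLICE_PAGES. */
#define MOCK_MSG_SPLICE_PAGES	0x00000001

/*
 * Hypothetical helper: a protocol would test the internal word only, so a
 * userspace-supplied bit with the same value in msg_flags cannot turn
 * splicing on.
 */
static bool mock_wants_splice(const struct mock_msghdr *msg)
{
	return (msg->msg_kflags & MOCK_MSG_SPLICE_PAGES) != 0;
}
```

In this model a caller that passes MSG_OOB in msg_flags leaves
mock_wants_splice() false; only in-kernel code that sets msg_kflags makes it
true, even though both bits share the value 0x1.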

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
cc: "David S. Miller" <davem@davemloft.net>
cc: Eric Dumazet <edumazet@google.com>
cc: Jakub Kicinski <kuba@kernel.org>
cc: Paolo Abeni <pabeni@redhat.com>
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: netdev@vger.kernel.org
---
 include/linux/socket.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/linux/socket.h b/include/linux/socket.h
index 13c3a237b9c9..229f54484d3c 100644
--- a/include/linux/socket.h
+++ b/include/linux/socket.h
@@ -72,6 +72,7 @@ struct msghdr {
 	bool		msg_control_is_user : 1;
 	bool		msg_get_inq : 1;/* return INQ after receive */
 	unsigned int	msg_flags;	/* flags on received message */
+	unsigned int	msg_kflags;	/* Kernel internal flags */
 	__kernel_size_t	msg_controllen;	/* ancillary data buffer length */
 	struct kiocb	*msg_iocb;	/* ptr to iocb for async requests */
 	struct ubuf_info *msg_ubuf;
@@ -337,6 +338,8 @@ struct ucred {
 #define MSG_CMSG_COMPAT	0		/* We never have 32 bit fixups */
 #endif
 
+/* Flags for msghdr::msg_kflags (all internal to the kernel) */
+#define MSG_SPLICE_PAGES 0x00000001	/* Splice the pages from the iterator in sendmsg() */
 
 /* Setsockoptions(2) level. Thanks to BSD these must match IPPROTO_xxx */
 #define SOL_IP		0



* RE: [RFC,1/3] net: Drop the size argument from ->sendmsg()
  2023-03-22 13:56             ` David Howells
  (?)
  (?)
@ 2023-03-22 14:13             ` bluez.test.bot
  -1 siblings, 0 replies; 81+ messages in thread
From: bluez.test.bot @ 2023-03-22 14:13 UTC (permalink / raw)
  To: linux-bluetooth, dhowells

[-- Attachment #1: Type: text/plain, Size: 908 bytes --]

This is an automated email; please do not reply to it.

Dear Submitter,

Thank you for submitting the patches to the linux bluetooth mailing list.
While preparing the CI tests, the patches you submitted couldn't be applied to the current HEAD of the repository.

----- Output -----

error: patch failed: drivers/xen/pvcalls-back.c:200
error: drivers/xen/pvcalls-back.c: patch does not apply
error: patch failed: net/rxrpc/rxperf.c:507
error: net/rxrpc/rxperf.c: patch does not apply
error: patch failed: net/smc/af_smc.c:2653
error: net/smc/af_smc.c: patch does not apply
error: patch failed: net/tipc/socket.c:2781
error: net/tipc/socket.c: patch does not apply
error: patch failed: net/xdp/xsk.c:629
error: net/xdp/xsk.c: patch does not apply
hint: Use 'git am --show-current-patch' to see the failed patch

Please resolve the issue and submit the patches again.


---
Regards,
Linux Bluetooth



* Re: [RFC PATCH 2/3] ip: Make __ip{,6}_append_data() and co. take a msghdr*
  2023-03-22 13:56           ` [RFC PATCH 2/3] ip: Make __ip{,6}_append_data() and co. take a msghdr* David Howells
@ 2023-03-22 17:25             ` kernel test robot
  2023-03-22 22:12             ` kernel test robot
                               ` (2 subsequent siblings)
  3 siblings, 0 replies; 81+ messages in thread
From: kernel test robot @ 2023-03-22 17:25 UTC (permalink / raw)
  To: David Howells; +Cc: oe-kbuild-all

Hi David,

[FYI, it's a private test report for your RFC patch.]
[auto build test WARNING on linus/master]
[also build test WARNING on v6.3-rc3]
[cannot apply to herbert-cryptodev-2.6/master herbert-crypto-2.6/master bluetooth-next/master bluetooth/master next-20230322]
[If your patch was applied to the wrong git tree, kindly drop us a note.
When submitting a patch, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/David-Howells/ip-Make-__ip-6-_append_data-and-co-take-a-msghdr/20230322-225741
patch link:    https://lore.kernel.org/r/20230322135612.3265850-3-dhowells%40redhat.com
patch subject: [RFC PATCH 2/3] ip: Make __ip{,6}_append_data() and co. take a msghdr*
config: s390-allyesconfig (https://download.01.org/0day-ci/archive/20230323/202303230109.SDnCF6Xq-lkp@intel.com/config)
compiler: s390-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/eecac0727821eaf716a8600550bf68f21ead4b87
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review David-Howells/ip-Make-__ip-6-_append_data-and-co-take-a-msghdr/20230322-225741
        git checkout eecac0727821eaf716a8600550bf68f21ead4b87
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=s390 olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=s390 SHELL=/bin/bash net/l2tp/

If you fix the issue, kindly add the following tags where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202303230109.SDnCF6Xq-lkp@intel.com/

All warnings (new ones prefixed by >>):

   net/l2tp/l2tp_ip6.c: In function 'l2tp_ip6_sendmsg':
   net/l2tp/l2tp_ip6.c:635:35: error: passing argument 2 of 'ip6_append_data' from incompatible pointer type [-Werror=incompatible-pointer-types]
     635 |         err = ip6_append_data(sk, ip_generic_getfrag, msg,
         |                                   ^~~~~~~~~~~~~~~~~~
         |                                   |
         |                                   int (*)(struct msghdr *, void *, char *, int,  int,  int,  struct sk_buff *)
   In file included from include/net/inetpeer.h:16,
                    from include/net/route.h:24,
                    from include/net/ip.h:30,
                    from net/l2tp/l2tp_ip6.c:18:
   include/net/ipv6.h:1100:53: note: expected 'struct msghdr *' but argument is of type 'int (*)(struct msghdr *, void *, char *, int,  int,  int,  struct sk_buff *)'
    1100 | int ip6_append_data(struct sock *sk, struct msghdr *msg,
         |                                      ~~~~~~~~~~~~~~~^~~
   net/l2tp/l2tp_ip6.c:635:55: error: passing argument 3 of 'ip6_append_data' from incompatible pointer type [-Werror=incompatible-pointer-types]
     635 |         err = ip6_append_data(sk, ip_generic_getfrag, msg,
         |                                                       ^~~
         |                                                       |
         |                                                       struct msghdr *
   include/net/ipv6.h:1101:34: note: expected 'ip_getfrag_t' but argument is of type 'struct msghdr *'
    1101 |                     ip_getfrag_t getfrag, void *from, int transhdrlen,
         |                     ~~~~~~~~~~~~~^~~~~~~
>> net/l2tp/l2tp_ip6.c:636:31: warning: passing argument 4 of 'ip6_append_data' makes pointer from integer without a cast [-Wint-conversion]
     636 |                               ulen, transhdrlen, &ipc6,
         |                               ^~~~
         |                               |
         |                               int
   include/net/ipv6.h:1101:49: note: expected 'void *' but argument is of type 'int'
    1101 |                     ip_getfrag_t getfrag, void *from, int transhdrlen,
         |                                           ~~~~~~^~~~
   net/l2tp/l2tp_ip6.c:635:15: error: too many arguments to function 'ip6_append_data'
     635 |         err = ip6_append_data(sk, ip_generic_getfrag, msg,
         |               ^~~~~~~~~~~~~~~
   include/net/ipv6.h:1100:5: note: declared here
    1100 | int ip6_append_data(struct sock *sk, struct msghdr *msg,
         |     ^~~~~~~~~~~~~~~
   cc1: some warnings being treated as errors
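
The diagnostics above all stem from a caller still using the old argument
order (getfrag, from, length, ...) against the new prototype (msg, getfrag,
from, transhdrlen, ...).  The sketch below models the new ordering in
userspace; it is not the kernel code.  Every mock_* name is hypothetical, the
sk/ipc6/fl6/rt arguments are omitted for brevity, and the data_left field is
an assumed stand-in for what msg_data_left(msg) would return.

```c
#include <stddef.h>

/* Stand-in for struct msghdr; data_left models msg_data_left(msg). */
struct mock_msghdr {
	size_t data_left;
	unsigned int msg_flags;
};

struct mock_sk_buff;	/* opaque, never dereferenced here */

/* Mirrors the seven-argument getfrag signature quoted in the diagnostics. */
typedef int (*mock_getfrag_t)(struct mock_msghdr *msg, void *from, char *to,
			      int offset, int len, int odd,
			      struct mock_sk_buff *skb);

/* Trivial getfrag that copies nothing and reports success. */
static int mock_generic_getfrag(struct mock_msghdr *msg, void *from, char *to,
				int offset, int len, int odd,
				struct mock_sk_buff *skb)
{
	(void)msg; (void)from; (void)to;
	(void)offset; (void)len; (void)odd; (void)skb;
	return 0;
}

/*
 * Models the reordered entry point: msg comes before getfrag, and the
 * length is derived from the message rather than passed in.  Returns the
 * derived length, or -1 if getfrag fails.
 */
static long mock_ip6_append_data(struct mock_msghdr *msg,
				 mock_getfrag_t getfrag, void *from,
				 int transhdrlen)
{
	size_t length = msg->data_left + (size_t)transhdrlen;

	if (getfrag(msg, from, NULL, 0, 0, 0, NULL) < 0)
		return -1;
	return (long)length;
}
```

With a 100-byte message and transhdrlen of 4, calling
mock_ip6_append_data(&m, mock_generic_getfrag, NULL, 4) yields 104, matching
the ulen = len + transhdrlen value that the old l2tp caller computed by hand.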


vim +/ip6_append_data +636 net/l2tp/l2tp_ip6.c

a32e0eec7042b2 Chris Elston         2012-04-29  487  
a32e0eec7042b2 Chris Elston         2012-04-29  488  /* Userspace will call sendmsg() on the tunnel socket to send L2TP
a32e0eec7042b2 Chris Elston         2012-04-29  489   * control frames.
a32e0eec7042b2 Chris Elston         2012-04-29  490   */
cee416e2c19501 David Howells        2023-03-22  491  static int l2tp_ip6_sendmsg(struct sock *sk, struct msghdr *msg)
a32e0eec7042b2 Chris Elston         2012-04-29  492  {
a32e0eec7042b2 Chris Elston         2012-04-29  493  	struct ipv6_txoptions opt_space;
342dfc306fb321 Steffen Hurrle       2014-01-17  494  	DECLARE_SOCKADDR(struct sockaddr_l2tpip6 *, lsa, msg->msg_name);
a32e0eec7042b2 Chris Elston         2012-04-29  495  	struct in6_addr *daddr, *final_p, final;
a32e0eec7042b2 Chris Elston         2012-04-29  496  	struct ipv6_pinfo *np = inet6_sk(sk);
45f6fad84cc305 Eric Dumazet         2015-11-29  497  	struct ipv6_txoptions *opt_to_free = NULL;
a32e0eec7042b2 Chris Elston         2012-04-29  498  	struct ipv6_txoptions *opt = NULL;
a32e0eec7042b2 Chris Elston         2012-04-29  499  	struct ip6_flowlabel *flowlabel = NULL;
a32e0eec7042b2 Chris Elston         2012-04-29  500  	struct dst_entry *dst = NULL;
a32e0eec7042b2 Chris Elston         2012-04-29  501  	struct flowi6 fl6;
26879da58711aa Wei Wang             2016-05-02  502  	struct ipcm6_cookie ipc6;
cee416e2c19501 David Howells        2023-03-22  503  	size_t len = msg_data_left(msg);
a32e0eec7042b2 Chris Elston         2012-04-29  504  	int addr_len = msg->msg_namelen;
a32e0eec7042b2 Chris Elston         2012-04-29  505  	int transhdrlen = 4; /* zero session-id */
f638a84afef3df Wang Yufen           2022-06-07  506  	int ulen;
a32e0eec7042b2 Chris Elston         2012-04-29  507  	int err;
a32e0eec7042b2 Chris Elston         2012-04-29  508  
a32e0eec7042b2 Chris Elston         2012-04-29  509  	/* Rough check on arithmetic overflow,
20dcb1107ab1a3 Tom Parkin           2020-07-22  510  	 * better check is made in ip6_append_data().
a32e0eec7042b2 Chris Elston         2012-04-29  511  	 */
f638a84afef3df Wang Yufen           2022-06-07  512  	if (len > INT_MAX - transhdrlen)
a32e0eec7042b2 Chris Elston         2012-04-29  513  		return -EMSGSIZE;
f638a84afef3df Wang Yufen           2022-06-07  514  	ulen = len + transhdrlen;
a32e0eec7042b2 Chris Elston         2012-04-29  515  
a32e0eec7042b2 Chris Elston         2012-04-29  516  	/* Mirror BSD error message compatibility */
a32e0eec7042b2 Chris Elston         2012-04-29  517  	if (msg->msg_flags & MSG_OOB)
a32e0eec7042b2 Chris Elston         2012-04-29  518  		return -EOPNOTSUPP;
a32e0eec7042b2 Chris Elston         2012-04-29  519  
20dcb1107ab1a3 Tom Parkin           2020-07-22  520  	/* Get and verify the address */
a32e0eec7042b2 Chris Elston         2012-04-29  521  	memset(&fl6, 0, sizeof(fl6));
a32e0eec7042b2 Chris Elston         2012-04-29  522  
a32e0eec7042b2 Chris Elston         2012-04-29  523  	fl6.flowi6_mark = sk->sk_mark;
e2d118a1cb5e60 Lorenzo Colitti      2016-11-04  524  	fl6.flowi6_uid = sk->sk_uid;
a32e0eec7042b2 Chris Elston         2012-04-29  525  
b515430ac9c25d Willem de Bruijn     2018-07-06  526  	ipcm6_init(&ipc6);
26879da58711aa Wei Wang             2016-05-02  527  
a32e0eec7042b2 Chris Elston         2012-04-29  528  	if (lsa) {
a32e0eec7042b2 Chris Elston         2012-04-29  529  		if (addr_len < SIN6_LEN_RFC2133)
a32e0eec7042b2 Chris Elston         2012-04-29  530  			return -EINVAL;
a32e0eec7042b2 Chris Elston         2012-04-29  531  
a32e0eec7042b2 Chris Elston         2012-04-29  532  		if (lsa->l2tp_family && lsa->l2tp_family != AF_INET6)
a32e0eec7042b2 Chris Elston         2012-04-29  533  			return -EAFNOSUPPORT;
a32e0eec7042b2 Chris Elston         2012-04-29  534  
a32e0eec7042b2 Chris Elston         2012-04-29  535  		daddr = &lsa->l2tp_addr;
a32e0eec7042b2 Chris Elston         2012-04-29  536  		if (np->sndflow) {
a32e0eec7042b2 Chris Elston         2012-04-29  537  			fl6.flowlabel = lsa->l2tp_flowinfo & IPV6_FLOWINFO_MASK;
a32e0eec7042b2 Chris Elston         2012-04-29  538  			if (fl6.flowlabel & IPV6_FLOWLABEL_MASK) {
a32e0eec7042b2 Chris Elston         2012-04-29  539  				flowlabel = fl6_sock_lookup(sk, fl6.flowlabel);
59c820b2317f0f Willem de Bruijn     2019-07-07  540  				if (IS_ERR(flowlabel))
a32e0eec7042b2 Chris Elston         2012-04-29  541  					return -EINVAL;
a32e0eec7042b2 Chris Elston         2012-04-29  542  			}
a32e0eec7042b2 Chris Elston         2012-04-29  543  		}
a32e0eec7042b2 Chris Elston         2012-04-29  544  
20dcb1107ab1a3 Tom Parkin           2020-07-22  545  		/* Otherwise it will be difficult to maintain
a32e0eec7042b2 Chris Elston         2012-04-29  546  		 * sk->sk_dst_cache.
a32e0eec7042b2 Chris Elston         2012-04-29  547  		 */
a32e0eec7042b2 Chris Elston         2012-04-29  548  		if (sk->sk_state == TCP_ESTABLISHED &&
efe4208f47f907 Eric Dumazet         2013-10-03  549  		    ipv6_addr_equal(daddr, &sk->sk_v6_daddr))
efe4208f47f907 Eric Dumazet         2013-10-03  550  			daddr = &sk->sk_v6_daddr;
a32e0eec7042b2 Chris Elston         2012-04-29  551  
a32e0eec7042b2 Chris Elston         2012-04-29  552  		if (addr_len >= sizeof(struct sockaddr_in6) &&
a32e0eec7042b2 Chris Elston         2012-04-29  553  		    lsa->l2tp_scope_id &&
a32e0eec7042b2 Chris Elston         2012-04-29  554  		    ipv6_addr_type(daddr) & IPV6_ADDR_LINKLOCAL)
a32e0eec7042b2 Chris Elston         2012-04-29  555  			fl6.flowi6_oif = lsa->l2tp_scope_id;
a32e0eec7042b2 Chris Elston         2012-04-29  556  	} else {
a32e0eec7042b2 Chris Elston         2012-04-29  557  		if (sk->sk_state != TCP_ESTABLISHED)
a32e0eec7042b2 Chris Elston         2012-04-29  558  			return -EDESTADDRREQ;
a32e0eec7042b2 Chris Elston         2012-04-29  559  
efe4208f47f907 Eric Dumazet         2013-10-03  560  		daddr = &sk->sk_v6_daddr;
a32e0eec7042b2 Chris Elston         2012-04-29  561  		fl6.flowlabel = np->flow_label;
a32e0eec7042b2 Chris Elston         2012-04-29  562  	}
a32e0eec7042b2 Chris Elston         2012-04-29  563  
a32e0eec7042b2 Chris Elston         2012-04-29  564  	if (fl6.flowi6_oif == 0)
ff0094030f146b Eric Dumazet         2022-05-13  565  		fl6.flowi6_oif = READ_ONCE(sk->sk_bound_dev_if);
a32e0eec7042b2 Chris Elston         2012-04-29  566  
a32e0eec7042b2 Chris Elston         2012-04-29  567  	if (msg->msg_controllen) {
a32e0eec7042b2 Chris Elston         2012-04-29  568  		opt = &opt_space;
a32e0eec7042b2 Chris Elston         2012-04-29  569  		memset(opt, 0, sizeof(struct ipv6_txoptions));
a32e0eec7042b2 Chris Elston         2012-04-29  570  		opt->tot_len = sizeof(struct ipv6_txoptions);
26879da58711aa Wei Wang             2016-05-02  571  		ipc6.opt = opt;
a32e0eec7042b2 Chris Elston         2012-04-29  572  
5fdaa88dfefa87 Willem de Bruijn     2018-07-06  573  		err = ip6_datagram_send_ctl(sock_net(sk), sk, msg, &fl6, &ipc6);
a32e0eec7042b2 Chris Elston         2012-04-29  574  		if (err < 0) {
a32e0eec7042b2 Chris Elston         2012-04-29  575  			fl6_sock_release(flowlabel);
a32e0eec7042b2 Chris Elston         2012-04-29  576  			return err;
a32e0eec7042b2 Chris Elston         2012-04-29  577  		}
a32e0eec7042b2 Chris Elston         2012-04-29  578  		if ((fl6.flowlabel & IPV6_FLOWLABEL_MASK) && !flowlabel) {
a32e0eec7042b2 Chris Elston         2012-04-29  579  			flowlabel = fl6_sock_lookup(sk, fl6.flowlabel);
59c820b2317f0f Willem de Bruijn     2019-07-07  580  			if (IS_ERR(flowlabel))
a32e0eec7042b2 Chris Elston         2012-04-29  581  				return -EINVAL;
a32e0eec7042b2 Chris Elston         2012-04-29  582  		}
a32e0eec7042b2 Chris Elston         2012-04-29  583  		if (!(opt->opt_nflen | opt->opt_flen))
a32e0eec7042b2 Chris Elston         2012-04-29  584  			opt = NULL;
a32e0eec7042b2 Chris Elston         2012-04-29  585  	}
a32e0eec7042b2 Chris Elston         2012-04-29  586  
45f6fad84cc305 Eric Dumazet         2015-11-29  587  	if (!opt) {
45f6fad84cc305 Eric Dumazet         2015-11-29  588  		opt = txopt_get(np);
45f6fad84cc305 Eric Dumazet         2015-11-29  589  		opt_to_free = opt;
45f6fad84cc305 Eric Dumazet         2015-11-29  590  	}
a32e0eec7042b2 Chris Elston         2012-04-29  591  	if (flowlabel)
a32e0eec7042b2 Chris Elston         2012-04-29  592  		opt = fl6_merge_options(&opt_space, flowlabel, opt);
a32e0eec7042b2 Chris Elston         2012-04-29  593  	opt = ipv6_fixup_options(&opt_space, opt);
26879da58711aa Wei Wang             2016-05-02  594  	ipc6.opt = opt;
a32e0eec7042b2 Chris Elston         2012-04-29  595  
a32e0eec7042b2 Chris Elston         2012-04-29  596  	fl6.flowi6_proto = sk->sk_protocol;
a32e0eec7042b2 Chris Elston         2012-04-29  597  	if (!ipv6_addr_any(daddr))
a32e0eec7042b2 Chris Elston         2012-04-29  598  		fl6.daddr = *daddr;
a32e0eec7042b2 Chris Elston         2012-04-29  599  	else
a32e0eec7042b2 Chris Elston         2012-04-29  600  		fl6.daddr.s6_addr[15] = 0x1; /* :: means loopback (BSD'ism) */
a32e0eec7042b2 Chris Elston         2012-04-29  601  	if (ipv6_addr_any(&fl6.saddr) && !ipv6_addr_any(&np->saddr))
a32e0eec7042b2 Chris Elston         2012-04-29  602  		fl6.saddr = np->saddr;
a32e0eec7042b2 Chris Elston         2012-04-29  603  
a32e0eec7042b2 Chris Elston         2012-04-29  604  	final_p = fl6_update_dst(&fl6, opt, &final);
a32e0eec7042b2 Chris Elston         2012-04-29  605  
a32e0eec7042b2 Chris Elston         2012-04-29  606  	if (!fl6.flowi6_oif && ipv6_addr_is_multicast(&fl6.daddr))
a32e0eec7042b2 Chris Elston         2012-04-29  607  		fl6.flowi6_oif = np->mcast_oif;
a32e0eec7042b2 Chris Elston         2012-04-29  608  	else if (!fl6.flowi6_oif)
a32e0eec7042b2 Chris Elston         2012-04-29  609  		fl6.flowi6_oif = np->ucast_oif;
a32e0eec7042b2 Chris Elston         2012-04-29  610  
3df98d79215ace Paul Moore           2020-09-27  611  	security_sk_classify_flow(sk, flowi6_to_flowi_common(&fl6));
a32e0eec7042b2 Chris Elston         2012-04-29  612  
38b7097b55b6cf Hannes Frederic Sowa 2016-06-11  613  	if (ipc6.tclass < 0)
38b7097b55b6cf Hannes Frederic Sowa 2016-06-11  614  		ipc6.tclass = np->tclass;
38b7097b55b6cf Hannes Frederic Sowa 2016-06-11  615  
38b7097b55b6cf Hannes Frederic Sowa 2016-06-11  616  	fl6.flowlabel = ip6_make_flowinfo(ipc6.tclass, fl6.flowlabel);
38b7097b55b6cf Hannes Frederic Sowa 2016-06-11  617  
c4e85f73afb638 Sabrina Dubroca      2019-12-04  618  	dst = ip6_dst_lookup_flow(sock_net(sk), sk, &fl6, final_p);
a32e0eec7042b2 Chris Elston         2012-04-29  619  	if (IS_ERR(dst)) {
a32e0eec7042b2 Chris Elston         2012-04-29  620  		err = PTR_ERR(dst);
a32e0eec7042b2 Chris Elston         2012-04-29  621  		goto out;
a32e0eec7042b2 Chris Elston         2012-04-29  622  	}
a32e0eec7042b2 Chris Elston         2012-04-29  623  
26879da58711aa Wei Wang             2016-05-02  624  	if (ipc6.hlimit < 0)
26879da58711aa Wei Wang             2016-05-02  625  		ipc6.hlimit = ip6_sk_dst_hoplimit(np, &fl6, dst);
a32e0eec7042b2 Chris Elston         2012-04-29  626  
26879da58711aa Wei Wang             2016-05-02  627  	if (ipc6.dontfrag < 0)
26879da58711aa Wei Wang             2016-05-02  628  		ipc6.dontfrag = np->dontfrag;
a32e0eec7042b2 Chris Elston         2012-04-29  629  
a32e0eec7042b2 Chris Elston         2012-04-29  630  	if (msg->msg_flags & MSG_CONFIRM)
a32e0eec7042b2 Chris Elston         2012-04-29  631  		goto do_confirm;
a32e0eec7042b2 Chris Elston         2012-04-29  632  
a32e0eec7042b2 Chris Elston         2012-04-29  633  back_from_confirm:
a32e0eec7042b2 Chris Elston         2012-04-29  634  	lock_sock(sk);
f69e6d131f5dac Al Viro              2014-11-24  635  	err = ip6_append_data(sk, ip_generic_getfrag, msg,
26879da58711aa Wei Wang             2016-05-02 @636  			      ulen, transhdrlen, &ipc6,
a32e0eec7042b2 Chris Elston         2012-04-29  637  			      &fl6, (struct rt6_info *)dst,
5fdaa88dfefa87 Willem de Bruijn     2018-07-06  638  			      msg->msg_flags);
a32e0eec7042b2 Chris Elston         2012-04-29  639  	if (err)
a32e0eec7042b2 Chris Elston         2012-04-29  640  		ip6_flush_pending_frames(sk);
a32e0eec7042b2 Chris Elston         2012-04-29  641  	else if (!(msg->msg_flags & MSG_MORE))
a32e0eec7042b2 Chris Elston         2012-04-29  642  		err = l2tp_ip6_push_pending_frames(sk);
a32e0eec7042b2 Chris Elston         2012-04-29  643  	release_sock(sk);
a32e0eec7042b2 Chris Elston         2012-04-29  644  done:
a32e0eec7042b2 Chris Elston         2012-04-29  645  	dst_release(dst);
a32e0eec7042b2 Chris Elston         2012-04-29  646  out:
a32e0eec7042b2 Chris Elston         2012-04-29  647  	fl6_sock_release(flowlabel);
45f6fad84cc305 Eric Dumazet         2015-11-29  648  	txopt_put(opt_to_free);
a32e0eec7042b2 Chris Elston         2012-04-29  649  
a32e0eec7042b2 Chris Elston         2012-04-29  650  	return err < 0 ? err : len;
a32e0eec7042b2 Chris Elston         2012-04-29  651  
a32e0eec7042b2 Chris Elston         2012-04-29  652  do_confirm:
0dec879f636f11 Julian Anastasov     2017-02-06  653  	if (msg->msg_flags & MSG_PROBE)
0dec879f636f11 Julian Anastasov     2017-02-06  654  		dst_confirm_neigh(dst, &fl6.daddr);
a32e0eec7042b2 Chris Elston         2012-04-29  655  	if (!(msg->msg_flags & MSG_PROBE) || len)
a32e0eec7042b2 Chris Elston         2012-04-29  656  		goto back_from_confirm;
a32e0eec7042b2 Chris Elston         2012-04-29  657  	err = 0;
a32e0eec7042b2 Chris Elston         2012-04-29  658  	goto done;
a32e0eec7042b2 Chris Elston         2012-04-29  659  }
a32e0eec7042b2 Chris Elston         2012-04-29  660  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests


* [RFC PATCH] iov_iter: Add an iterator-of-iterators
  2023-03-16 18:06     ` David Howells
  2023-03-16 19:01       ` Trond Myklebust
  2023-03-22 13:10       ` David Howells
@ 2023-03-22 18:15       ` David Howells
  2023-03-22 18:47         ` Trond Myklebust
  2023-03-22 18:49         ` Matthew Wilcox
  2 siblings, 2 replies; 81+ messages in thread
From: David Howells @ 2023-03-22 18:15 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: dhowells, Matthew Wilcox, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Alexander Viro, Christoph Hellwig,
	Jens Axboe, Jeffrey Layton, Christian Brauner, Linus Torvalds,
	netdev, linux-fsdevel, linux-kernel, linux-mm, Anna Schumaker,
	Charles Edward Lever, linux-nfs

Trond Myklebust <trondmy@hammerspace.com> wrote:

> Add an enum iter_type for ITER_ITER ? :-)

Well, you asked for it...  It's actually fairly straightforward once
ITER_PIPE is removed.

---
iov_iter: Add an iterator-of-iterators

Provide an I/O iterator that takes an array of iterators and iterates over
them in turn.  Then make the sunrpc service code (and thus nfsd) use it.

In this particular instance, the svc_tcp_sendmsg() sets up an array of
three iterators: one for the marker+header, one for the body and one
optional one for the tail, then sets msg_iter to be an
iterator-of-iterators across them.
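
As a rough illustration of the idea only (this is not the kernel code), the
userspace sketch below walks an array of flat buffers in the same way the
iterlist copy loops walk their sub-iterators: drain the current one, step to
the next, and keep a running total.  The names flat_iter, iter_list,
flat_copy_from() and list_copy_from() are made up for this example and are
assumptions, not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for one flat source iterator. */
struct flat_iter {
	const char *base;	/* start of the backing buffer */
	size_t offset;		/* bytes already consumed */
	size_t count;		/* bytes remaining */
};

/* Hypothetical stand-in for an ITER_ITERLIST-style cursor. */
struct iter_list {
	struct flat_iter *cur;	/* current sub-iterator */
	size_t nr_segs;		/* sub-iterators left, including cur */
	size_t count;		/* total bytes remaining across all of them */
};

/* Copy up to @bytes out of a single flat iterator and advance it. */
static size_t flat_copy_from(void *dst, size_t bytes, struct flat_iter *it)
{
	size_t n = bytes < it->count ? bytes : it->count;

	memcpy(dst, it->base + it->offset, n);
	it->offset += n;
	it->count -= n;
	return n;
}

/* Copy up to @bytes by draining the sub-iterators in turn. */
static size_t list_copy_from(void *dst, size_t bytes, struct iter_list *il)
{
	char *out = dst;
	size_t copied = 0;

	while (bytes && il->count) {
		size_t n;

		/* count and nr_segs must agree with the sub-iterators */
		assert(il->nr_segs > 0);
		if (il->cur->count == 0) {
			/* current sub-iterator is exhausted: step to the next */
			il->cur++;
			il->nr_segs--;
			continue;
		}
		n = flat_copy_from(out, bytes, il->cur);
		out += n;
		copied += n;
		bytes -= n;
		il->count -= n;
	}
	return copied;
}
```

With three sub-iterators standing in for marker+header, body and tail, a
single list_copy_from() call can cross the boundaries between them without
the caller knowing where they are.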

Signed-off-by: David Howells <dhowells@redhat.com>
---
 include/linux/uio.h  |   19 +++-
 lib/iov_iter.c       |  233 +++++++++++++++++++++++++++++++++++++++++++++++++--
 net/sunrpc/svcsock.c |   29 +++---
 3 files changed, 258 insertions(+), 23 deletions(-)

diff --git a/include/linux/uio.h b/include/linux/uio.h
index 74598426edb4..321381d3d616 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -27,6 +27,7 @@ enum iter_type {
 	ITER_XARRAY,
 	ITER_DISCARD,
 	ITER_UBUF,
+	ITER_ITERLIST,
 };
 
 #define ITER_SOURCE	1	// == WRITE
@@ -43,17 +44,17 @@ struct iov_iter {
 	bool nofault;
 	bool data_source;
 	bool user_backed;
-	union {
-		size_t iov_offset;
-		int last_offset;
-	};
+	bool spliceable;
+	size_t iov_offset;
 	size_t count;
+	size_t orig_count;
 	union {
 		const struct iovec *iov;
 		const struct kvec *kvec;
 		const struct bio_vec *bvec;
 		struct xarray *xarray;
 		void __user *ubuf;
+		struct iov_iter *iterlist;
 	};
 	union {
 		unsigned long nr_segs;
@@ -104,6 +105,11 @@ static inline bool iov_iter_is_xarray(const struct iov_iter *i)
 	return iov_iter_type(i) == ITER_XARRAY;
 }
 
+static inline bool iov_iter_is_iterlist(const struct iov_iter *i)
+{
+	return iov_iter_type(i) == ITER_ITERLIST;
+}
+
 static inline unsigned char iov_iter_rw(const struct iov_iter *i)
 {
 	return i->data_source ? WRITE : READ;
@@ -238,6 +244,8 @@ void iov_iter_bvec(struct iov_iter *i, unsigned int direction, const struct bio_
 void iov_iter_discard(struct iov_iter *i, unsigned int direction, size_t count);
 void iov_iter_xarray(struct iov_iter *i, unsigned int direction, struct xarray *xarray,
 		     loff_t start, size_t count);
+void iov_iter_iterlist(struct iov_iter *i, unsigned int direction, struct iov_iter *iterlist,
+		       unsigned long nr_segs, size_t count);
 ssize_t iov_iter_get_pages(struct iov_iter *i, struct page **pages,
 		size_t maxsize, unsigned maxpages, size_t *start,
 		iov_iter_extraction_t extraction_flags);
@@ -345,7 +353,8 @@ static inline void iov_iter_ubuf(struct iov_iter *i, unsigned int direction,
 		.user_backed = true,
 		.data_source = direction,
 		.ubuf = buf,
-		.count = count
+		.count = count,
+		.orig_count = count,
 	};
 }
 /* Flags for iov_iter_get/extract_pages*() */
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index fad95e4cf372..34ce3b958b6c 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -282,7 +282,8 @@ void iov_iter_init(struct iov_iter *i, unsigned int direction,
 		.iov = iov,
 		.nr_segs = nr_segs,
 		.iov_offset = 0,
-		.count = count
+		.count = count,
+		.orig_count = count,
 	};
 }
 EXPORT_SYMBOL(iov_iter_init);
@@ -364,6 +365,26 @@ size_t _copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
 	if (WARN_ON_ONCE(!i->data_source))
 		return 0;
 
+	if (unlikely(iov_iter_is_iterlist(i))) {
+		size_t copied = 0;
+
+		while (bytes && i->count) {
+			size_t part = min(bytes, i->iterlist->count), n = 0;
+
+			if (part > 0)
+				n = _copy_from_iter(addr, part, i->iterlist);
+			addr += n;
+			copied += n;
+			bytes -= n;
+			i->count -= n;
+			if (n < part || !bytes)
+				break;
+			i->iterlist++;
+			i->nr_segs--;
+		}
+		return copied;
+	}
+
 	if (user_backed_iter(i))
 		might_fault();
 	iterate_and_advance(i, bytes, base, len, off,
@@ -380,6 +401,27 @@ size_t _copy_from_iter_nocache(void *addr, size_t bytes, struct iov_iter *i)
 	if (WARN_ON_ONCE(!i->data_source))
 		return 0;
 
+	if (unlikely(iov_iter_is_iterlist(i))) {
+		size_t copied = 0;
+
+		while (bytes && i->count) {
+			size_t part = min(bytes, i->iterlist->count), n = 0;
+
+			if (part > 0)
+				n = _copy_from_iter_nocache(addr, part,
+							    i->iterlist);
+			addr += n;
+			copied += n;
+			bytes -= n;
+			i->count -= n;
+			if (n < part || !bytes)
+				break;
+			i->iterlist++;
+			i->nr_segs--;
+		}
+		return copied;
+	}
+
 	iterate_and_advance(i, bytes, base, len, off,
 		__copy_from_user_inatomic_nocache(addr + off, base, len),
 		memcpy(addr + off, base, len)
@@ -411,6 +453,27 @@ size_t _copy_from_iter_flushcache(void *addr, size_t bytes, struct iov_iter *i)
 	if (WARN_ON_ONCE(!i->data_source))
 		return 0;
 
+	if (unlikely(iov_iter_is_iterlist(i))) {
+		size_t copied = 0;
+
+		while (bytes && i->count) {
+			size_t part = min(bytes, i->iterlist->count), n = 0;
+
+			if (part > 0)
+				n = _copy_from_iter_flushcache(addr, part,
+							       i->iterlist);
+			addr += n;
+			copied += n;
+			bytes -= n;
+			i->count -= n;
+			if (n < part || !bytes)
+				break;
+			i->iterlist++;
+			i->nr_segs--;
+		}
+		return copied;
+	}
+
 	iterate_and_advance(i, bytes, base, len, off,
 		__copy_from_user_flushcache(addr + off, base, len),
 		memcpy_flushcache(addr + off, base, len)
@@ -514,7 +577,31 @@ EXPORT_SYMBOL(iov_iter_zero);
 size_t copy_page_from_iter_atomic(struct page *page, unsigned offset, size_t bytes,
 				  struct iov_iter *i)
 {
-	char *kaddr = kmap_atomic(page), *p = kaddr + offset;
+	char *kaddr, *p;
+
+	if (unlikely(iov_iter_is_iterlist(i))) {
+		size_t copied = 0;
+
+		while (bytes && i->count) {
+			size_t part = min(bytes, i->iterlist->count), n = 0;
+
+			if (part > 0)
+				n = copy_page_from_iter_atomic(page, offset, part,
+							       i->iterlist);
+			offset += n;
+			copied += n;
+			bytes -= n;
+			i->count -= n;
+			if (n < part || !bytes)
+				break;
+			i->iterlist++;
+			i->nr_segs--;
+		}
+		return copied;
+	}
+
+	kaddr = kmap_atomic(page);
+	p = kaddr + offset;
 	if (!page_copy_sane(page, offset, bytes)) {
 		kunmap_atomic(kaddr);
 		return 0;
@@ -585,19 +672,49 @@ void iov_iter_advance(struct iov_iter *i, size_t size)
 		iov_iter_bvec_advance(i, size);
 	} else if (iov_iter_is_discard(i)) {
 		i->count -= size;
+	} else if (iov_iter_is_iterlist(i)) {
+		i->count -= size;
+		for (;;) {
+			size_t part = min(size, i->iterlist->count);
+
+			if (part > 0)
+				iov_iter_advance(i->iterlist, part);
+			size -= part;
+			if (!size)
+				break;
+			i->iterlist++;
+			i->nr_segs--;
+		}
 	}
 }
 EXPORT_SYMBOL(iov_iter_advance);
 
+static void iov_iter_revert_iterlist(struct iov_iter *i, size_t unroll)
+{
+	for (;;) {
+		size_t part = min(unroll, i->iterlist->orig_count - i->iterlist->count);
+
+		if (part > 0)
+			iov_iter_revert(i->iterlist, part);
+		unroll -= part;
+		if (!unroll)
+			break;
+		i->iterlist--;
+		i->nr_segs++;
+	}
+}
+
 void iov_iter_revert(struct iov_iter *i, size_t unroll)
 {
 	if (!unroll)
 		return;
-	if (WARN_ON(unroll > MAX_RW_COUNT))
+	if (WARN_ON(unroll > i->orig_count - i->count))
 		return;
 	i->count += unroll;
 	if (unlikely(iov_iter_is_discard(i)))
 		return;
+	if (unlikely(iov_iter_is_iterlist(i)))
+		return iov_iter_revert_iterlist(i, unroll);
 	if (unroll <= i->iov_offset) {
 		i->iov_offset -= unroll;
 		return;
@@ -641,6 +758,8 @@ EXPORT_SYMBOL(iov_iter_revert);
  */
 size_t iov_iter_single_seg_count(const struct iov_iter *i)
 {
+	if (iov_iter_is_iterlist(i))
+		i = i->iterlist;
 	if (i->nr_segs > 1) {
 		if (likely(iter_is_iovec(i) || iov_iter_is_kvec(i)))
 			return min(i->count, i->iov->iov_len - i->iov_offset);
@@ -662,7 +781,8 @@ void iov_iter_kvec(struct iov_iter *i, unsigned int direction,
 		.kvec = kvec,
 		.nr_segs = nr_segs,
 		.iov_offset = 0,
-		.count = count
+		.count = count,
+		.orig_count = count,
 	};
 }
 EXPORT_SYMBOL(iov_iter_kvec);
@@ -678,7 +798,8 @@ void iov_iter_bvec(struct iov_iter *i, unsigned int direction,
 		.bvec = bvec,
 		.nr_segs = nr_segs,
 		.iov_offset = 0,
-		.count = count
+		.count = count,
+		.orig_count = count,
 	};
 }
 EXPORT_SYMBOL(iov_iter_bvec);
@@ -706,6 +827,7 @@ void iov_iter_xarray(struct iov_iter *i, unsigned int direction,
 		.xarray = xarray,
 		.xarray_start = start,
 		.count = count,
+		.orig_count = count,
 		.iov_offset = 0
 	};
 }
@@ -727,11 +849,47 @@ void iov_iter_discard(struct iov_iter *i, unsigned int direction, size_t count)
 		.iter_type = ITER_DISCARD,
 		.data_source = false,
 		.count = count,
+		.orig_count = count,
 		.iov_offset = 0
 	};
 }
 EXPORT_SYMBOL(iov_iter_discard);
 
+/**
+ * iov_iter_iterlist - Initialise an I/O iterator that is a list of iterators
+ * @iter: The iterator to initialise.
+ * @direction: The direction of the transfer.
+ * @iterlist: The list of iterators
+ * @nr_segs: The number of elements in the list
+ * @count: The size of the I/O buffer in bytes.
+ *
+ * Set up an I/O iterator that walks an array of other iterators in turn.
+ * It's only available as a source iterator (for WRITE); every iterator in
+ * the list must also be a source and none of them can be ITER_ITERLIST type.
+ */
+void iov_iter_iterlist(struct iov_iter *iter, unsigned int direction,
+		       struct iov_iter *iterlist, unsigned long nr_segs,
+		       size_t count)
+{
+	unsigned long i;
+
+	BUG_ON(direction != WRITE);
+	for (i = 0; i < nr_segs; i++) {
+		BUG_ON(iterlist[i].iter_type == ITER_ITERLIST);
+		BUG_ON(!iterlist[i].data_source);
+	}
+
+	*iter = (struct iov_iter){
+		.iter_type	= ITER_ITERLIST,
+		.data_source	= true,
+		.count		= count,
+		.orig_count	= count,
+		.iterlist	= iterlist,
+		.nr_segs	= nr_segs,
+	};
+}
+EXPORT_SYMBOL(iov_iter_iterlist);
+
 static bool iov_iter_aligned_iovec(const struct iov_iter *i, unsigned addr_mask,
 				   unsigned len_mask)
 {
@@ -879,6 +1037,15 @@ unsigned long iov_iter_alignment(const struct iov_iter *i)
 	if (iov_iter_is_xarray(i))
 		return (i->xarray_start + i->iov_offset) | i->count;
 
+	if (iov_iter_is_iterlist(i)) {
+		unsigned long align = 0;
+		unsigned int j;
+
+		for (j = 0; j < i->nr_segs; j++)
+			align |= iov_iter_alignment(&i->iterlist[j]);
+		return align;
+	}
+
 	return 0;
 }
 EXPORT_SYMBOL(iov_iter_alignment);
@@ -1078,6 +1245,18 @@ static ssize_t __iov_iter_get_pages_alloc(struct iov_iter *i,
 	}
 	if (iov_iter_is_xarray(i))
 		return iter_xarray_get_pages(i, pages, maxsize, maxpages, start);
+	if (iov_iter_is_iterlist(i)) {
+		ssize_t size;
+
+		while (!i->iterlist->count) {
+			i->iterlist++;
+			i->nr_segs--;
+		}
+		size = __iov_iter_get_pages_alloc(i->iterlist, pages, maxsize, maxpages,
+						  start, extraction_flags);
+		i->count -= size;
+		return size;
+	}
 	return -EFAULT;
 }
 
@@ -1126,6 +1305,31 @@ ssize_t iov_iter_get_pages_alloc2(struct iov_iter *i,
 }
 EXPORT_SYMBOL(iov_iter_get_pages_alloc2);
 
+static size_t csum_and_copy_from_iterlist(void *addr, size_t bytes, __wsum *csum,
+					  struct iov_iter *i)
+{
+	size_t copied = 0, n;
+
+	while (i->count && i->nr_segs) {
+		struct iov_iter *j = i->iterlist;
+
+		if (j->count == 0) {
+			i->iterlist++;
+			i->nr_segs--;
+			continue;
+		}
+
+		n = csum_and_copy_from_iter(addr, bytes - copied, csum, j);
+		addr += n;
+		copied += n;
+		i->count -= n;
+		if (n == 0)
+			break;
+	}
+
+	return copied;
+}
+
 size_t csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum,
 			       struct iov_iter *i)
 {
@@ -1133,6 +1337,8 @@ size_t csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum,
 	sum = *csum;
 	if (WARN_ON_ONCE(!i->data_source))
 		return 0;
+	if (iov_iter_is_iterlist(i))
+		return csum_and_copy_from_iterlist(addr, bytes, csum, i);
 
 	iterate_and_advance(i, bytes, base, len, off, ({
 		next = csum_and_copy_from_user(base, addr + off, len);
@@ -1236,6 +1442,21 @@ static int bvec_npages(const struct iov_iter *i, int maxpages)
 	return npages;
 }
 
+static int iterlist_npages(const struct iov_iter *i, int maxpages)
+{
+	ssize_t size = i->count;
+	const struct iov_iter *p;
+	int npages = 0;
+
+	for (p = i->iterlist; size; p++) {
+		size -= p->count;
+		npages += iov_iter_npages(p, maxpages - npages);
+		if (unlikely(npages >= maxpages))
+			return maxpages;
+	}
+	return npages;
+}
+
 int iov_iter_npages(const struct iov_iter *i, int maxpages)
 {
 	if (unlikely(!i->count))
@@ -1255,6 +1476,8 @@ int iov_iter_npages(const struct iov_iter *i, int maxpages)
 		int npages = DIV_ROUND_UP(offset + i->count, PAGE_SIZE);
 		return min(npages, maxpages);
 	}
+	if (iov_iter_is_iterlist(i))
+		return iterlist_npages(i, maxpages);
 	return 0;
 }
 EXPORT_SYMBOL(iov_iter_npages);
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 1d0f0f764e16..030a1fa5171b 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -1073,11 +1073,13 @@ static int svc_tcp_sendmsg(struct socket *sock, struct xdr_buf *xdr,
 {
 	const struct kvec *head = xdr->head;
 	const struct kvec *tail = xdr->tail;
+	struct iov_iter iters[3];
+	struct bio_vec head_bv, tail_bv;
 	struct msghdr msg = {
-		.msg_flags	= MSG_SPLICE_PAGES,
+		.msg_flags	= 0, //MSG_SPLICE_PAGES,
 	};
-	void *m, *h, *t;
-	int ret, n = xdr_buf_pagecount(xdr), size;
+	void *m, *t;
+	int ret, n = 2, size;
 
 	*sentp = 0;
 	ret = xdr_alloc_bvec(xdr, GFP_KERNEL);
@@ -1089,27 +1091,28 @@ static int svc_tcp_sendmsg(struct socket *sock, struct xdr_buf *xdr,
 	if (!m)
 		return -ENOMEM;
 
-	h = m + sizeof(marker);
-	t = h + head->iov_len;
+	memcpy(m, &marker, sizeof(marker));
+	if (head->iov_len)
+		memcpy(m + sizeof(marker), head->iov_base, head->iov_len);
+	bvec_set_virt(&head_bv, m, sizeof(marker) + head->iov_len);
+	iov_iter_bvec(&iters[0], ITER_SOURCE, &head_bv, 1,
+		      sizeof(marker) + head->iov_len);
 
-	bvec_set_virt(&xdr->bvec[-1], m, sizeof(marker) + head->iov_len);
-	n++;
+	iov_iter_bvec(&iters[1], ITER_SOURCE, xdr->bvec,
+		      xdr_buf_pagecount(xdr), xdr->page_len);
 
 	if (tail->iov_len) {
 		t = page_frag_alloc(NULL, tail->iov_len, GFP_KERNEL);
 		if (!t)
 			return -ENOMEM;
-		bvec_set_virt(&xdr->bvec[n],  t, tail->iov_len);
 		memcpy(t, tail->iov_base, tail->iov_len);
+		bvec_set_virt(&tail_bv,  t, tail->iov_len);
+		iov_iter_bvec(&iters[2], ITER_SOURCE, &tail_bv, 1, tail->iov_len);
 		n++;
 	}
 
-	memcpy(m, &marker, sizeof(marker));
-	if (head->iov_len)
-		memcpy(h, head->iov_base, head->iov_len);
-
 	size = sizeof(marker) + head->iov_len + xdr->page_len + tail->iov_len;
-	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, xdr->bvec - 1, n, size);
+	iov_iter_iterlist(&msg.msg_iter, ITER_SOURCE, iters, n, size);
 
 	ret = sock_sendmsg(sock, &msg);
 	if (ret < 0)

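The revert path in the patch above depends on the new orig_count field: each
sub-iterator has consumed orig_count - count bytes, so a revert can walk
backwards through the list and give back that many bytes at each step.  A
minimal userspace sketch of that bookkeeping follows; sub_iter, list_iter,
list_advance() and list_revert() are hypothetical names for illustration
and only the counters are modelled, not the data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sub-iterator: only the byte counters are modelled. */
struct sub_iter {
	size_t orig_count;	/* size at initialisation */
	size_t count;		/* bytes remaining */
};

/* Hypothetical list cursor over an array of sub-iterators. */
struct list_iter {
	struct sub_iter *cur;	/* current sub-iterator */
	size_t orig_count;	/* total size at initialisation */
	size_t count;		/* total bytes remaining */
};

/* Consume @size bytes, stepping forward over exhausted sub-iterators. */
static void list_advance(struct list_iter *il, size_t size)
{
	assert(size <= il->count);
	il->count -= size;
	while (size) {
		size_t part = size < il->cur->count ? size : il->cur->count;

		il->cur->count -= part;
		size -= part;
		if (size)
			il->cur++;
	}
}

/* Give back @unroll bytes; refuse to unroll more than was consumed. */
static int list_revert(struct list_iter *il, size_t unroll)
{
	if (unroll > il->orig_count - il->count)
		return -1;
	il->count += unroll;
	while (unroll) {
		size_t consumed = il->cur->orig_count - il->cur->count;
		size_t part = unroll < consumed ? unroll : consumed;

		il->cur->count += part;
		unroll -= part;
		if (unroll)
			il->cur--;
	}
	return 0;
}
```

The bound check mirrors the patch's switch from MAX_RW_COUNT to
orig_count - count: without knowing how much was consumed, the backwards
walk would have no safe stopping point.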


* Re: [RFC PATCH] iov_iter: Add an iterator-of-iterators
  2023-03-22 18:15       ` [RFC PATCH] iov_iter: Add an iterator-of-iterators David Howells
@ 2023-03-22 18:47         ` Trond Myklebust
  2023-03-22 18:49         ` Matthew Wilcox
  1 sibling, 0 replies; 81+ messages in thread
From: Trond Myklebust @ 2023-03-22 18:47 UTC (permalink / raw)
  To: David Howells
  Cc: Matthew Wilcox, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Alexander Viro, Christoph Hellwig, Jens Axboe,
	Jeffrey Layton, Christian Brauner, Linus Torvalds, netdev,
	linux-fsdevel, linux-kernel, linux-mm, Anna Schumaker,
	Charles Edward Lever, linux-nfs



> On Mar 22, 2023, at 14:15, David Howells <dhowells@redhat.com> wrote:
> 
> Trond Myklebust <trondmy@hammerspace.com> wrote:
> 
>> Add an enum iter_type for ITER_ITER ? :-)
> 
> Well, you asked for it...  It's actually fairly straightforward once
> ITER_PIPE is removed.
> 
> ---
> iov_iter: Add an iterator-of-iterators
> 
> Provide an I/O iterator that takes an array of iterators and iterates over
> them in turn.  Then make the sunrpc service code (and thus nfsd) use it.
> 
> In this particular instance, the svc_tcp_sendmsg() sets up an array of
> three iterators: one for the marker+header, one for the body and one
> optional one for the tail, then sets msg_iter to be an
> iterator-of-iterators across them.

Cool! This is something that can be used on the receive side as well, so very useful. I can imagine it might also open up a few more use cases for ITER_XARRAY.

Thanks!
  Trond

_________________________________
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com



* Re: [RFC PATCH] iov_iter: Add an iterator-of-iterators
  2023-03-22 18:15       ` [RFC PATCH] iov_iter: Add an iterator-of-iterators David Howells
  2023-03-22 18:47         ` Trond Myklebust
@ 2023-03-22 18:49         ` Matthew Wilcox
  1 sibling, 0 replies; 81+ messages in thread
From: Matthew Wilcox @ 2023-03-22 18:49 UTC (permalink / raw)
  To: David Howells
  Cc: Trond Myklebust, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Alexander Viro, Christoph Hellwig, Jens Axboe,
	Jeffrey Layton, Christian Brauner, Linus Torvalds, netdev,
	linux-fsdevel, linux-kernel, linux-mm, Anna Schumaker,
	Charles Edward Lever, linux-nfs

On Wed, Mar 22, 2023 at 06:15:45PM +0000, David Howells wrote:
> @@ -43,17 +44,17 @@ struct iov_iter {
>  	bool nofault;
>  	bool data_source;
>  	bool user_backed;
> -	union {
> -		size_t iov_offset;
> -		int last_offset;
> -	};
> +	bool spliceable;

We're now up to five u8s in a row here (iter_type, nofault, data_source,
user_backed, spliceable).  Is it time to turn some/all of them into:

	bool nofault:1;
	bool data_source:1;
	bool user_backed:1;
	bool spliceable:1;

You can't take the address of them then, but I don't believe we do that
anywhere.
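
To make the trade-off concrete, here is a small standalone sketch comparing
the two layouts.  The struct names and helper functions are invented for the
example, and the exact sizes are implementation-defined, so treat the
numbers as typical rather than guaranteed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical layout with one byte per flag, as in the patch. */
struct flags_as_bools {
	unsigned char iter_type;
	bool nofault;
	bool data_source;
	bool user_backed;
	bool spliceable;
};

/* Hypothetical layout with the four flags packed into bitfields. */
struct flags_as_bits {
	unsigned char iter_type;
	bool nofault:1;
	bool data_source:1;
	bool user_backed:1;
	bool spliceable:1;
};

static size_t bools_size(void)
{
	return sizeof(struct flags_as_bools);
}

static size_t bits_size(void)
{
	return sizeof(struct flags_as_bits);
}

/* Bitfields still read and write like ordinary bool members... */
static bool bits_roundtrip(void)
{
	struct flags_as_bits f = { 0 };

	f.data_source = true;
	f.spliceable = true;
	/* ...but "&f.data_source" would be rejected by the compiler. */
	return f.data_source && f.spliceable && !f.nofault && !f.user_backed;
}
```

On common ABIs the bitfield variant is a couple of bytes against five, at
the cost of read-modify-write accesses and no addressable members.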



* Re: [RFC PATCH 2/3] ip: Make __ip{,6}_append_data() and co. take a msghdr*
  2023-03-22 13:56           ` [RFC PATCH 2/3] ip: Make __ip{,6}_append_data() and co. take a msghdr* David Howells
  2023-03-22 17:25             ` kernel test robot
@ 2023-03-22 22:12             ` kernel test robot
  2023-03-23  1:25             ` kernel test robot
  2023-03-23  1:25             ` kernel test robot
  3 siblings, 0 replies; 81+ messages in thread
From: kernel test robot @ 2023-03-22 22:12 UTC (permalink / raw)
  To: David Howells; +Cc: llvm, oe-kbuild-all

Hi David,

[FYI, it's a private test report for your RFC patch.]
[auto build test ERROR on linus/master]
[also build test ERROR on v6.3-rc3]
[cannot apply to herbert-cryptodev-2.6/master herbert-crypto-2.6/master bluetooth-next/master bluetooth/master next-20230322]
[If your patch is applied to the wrong git tree, kindly drop us a note.
When submitting a patch, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/David-Howells/ip-Make-__ip-6-_append_data-and-co-take-a-msghdr/20230322-225741
patch link:    https://lore.kernel.org/r/20230322135612.3265850-3-dhowells%40redhat.com
patch subject: [RFC PATCH 2/3] ip: Make __ip{,6}_append_data() and co. take a msghdr*
config: i386-randconfig-a013 (https://download.01.org/0day-ci/archive/20230323/202303230649.hnY3j3oN-lkp@intel.com/config)
compiler: clang version 14.0.6 (https://github.com/llvm/llvm-project f28c006a5895fc0e329fe15fead81e37457cb1d1)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/eecac0727821eaf716a8600550bf68f21ead4b87
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review David-Howells/ip-Make-__ip-6-_append_data-and-co-take-a-msghdr/20230322-225741
        git checkout eecac0727821eaf716a8600550bf68f21ead4b87
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=i386 olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=i386 SHELL=/bin/bash net/l2tp/

If you fix the issue, kindly add the following tags where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202303230649.hnY3j3oN-lkp@intel.com/

All errors (new ones prefixed by >>):

>> net/l2tp/l2tp_ip6.c:638:10: error: too many arguments to function call, expected 8, have 9
                                 msg->msg_flags);
                                 ^~~~~~~~~~~~~~
   include/net/ipv6.h:1100:5: note: 'ip6_append_data' declared here
   int ip6_append_data(struct sock *sk, struct msghdr *msg,
       ^
   1 error generated.


vim +638 net/l2tp/l2tp_ip6.c

a32e0eec7042b2 Chris Elston         2012-04-29  487  
a32e0eec7042b2 Chris Elston         2012-04-29  488  /* Userspace will call sendmsg() on the tunnel socket to send L2TP
a32e0eec7042b2 Chris Elston         2012-04-29  489   * control frames.
a32e0eec7042b2 Chris Elston         2012-04-29  490   */
cee416e2c19501 David Howells        2023-03-22  491  static int l2tp_ip6_sendmsg(struct sock *sk, struct msghdr *msg)
a32e0eec7042b2 Chris Elston         2012-04-29  492  {
a32e0eec7042b2 Chris Elston         2012-04-29  493  	struct ipv6_txoptions opt_space;
342dfc306fb321 Steffen Hurrle       2014-01-17  494  	DECLARE_SOCKADDR(struct sockaddr_l2tpip6 *, lsa, msg->msg_name);
a32e0eec7042b2 Chris Elston         2012-04-29  495  	struct in6_addr *daddr, *final_p, final;
a32e0eec7042b2 Chris Elston         2012-04-29  496  	struct ipv6_pinfo *np = inet6_sk(sk);
45f6fad84cc305 Eric Dumazet         2015-11-29  497  	struct ipv6_txoptions *opt_to_free = NULL;
a32e0eec7042b2 Chris Elston         2012-04-29  498  	struct ipv6_txoptions *opt = NULL;
a32e0eec7042b2 Chris Elston         2012-04-29  499  	struct ip6_flowlabel *flowlabel = NULL;
a32e0eec7042b2 Chris Elston         2012-04-29  500  	struct dst_entry *dst = NULL;
a32e0eec7042b2 Chris Elston         2012-04-29  501  	struct flowi6 fl6;
26879da58711aa Wei Wang             2016-05-02  502  	struct ipcm6_cookie ipc6;
cee416e2c19501 David Howells        2023-03-22  503  	size_t len = msg_data_left(msg);
a32e0eec7042b2 Chris Elston         2012-04-29  504  	int addr_len = msg->msg_namelen;
a32e0eec7042b2 Chris Elston         2012-04-29  505  	int transhdrlen = 4; /* zero session-id */
f638a84afef3df Wang Yufen           2022-06-07  506  	int ulen;
a32e0eec7042b2 Chris Elston         2012-04-29  507  	int err;
a32e0eec7042b2 Chris Elston         2012-04-29  508  
a32e0eec7042b2 Chris Elston         2012-04-29  509  	/* Rough check on arithmetic overflow,
20dcb1107ab1a3 Tom Parkin           2020-07-22  510  	 * better check is made in ip6_append_data().
a32e0eec7042b2 Chris Elston         2012-04-29  511  	 */
f638a84afef3df Wang Yufen           2022-06-07  512  	if (len > INT_MAX - transhdrlen)
a32e0eec7042b2 Chris Elston         2012-04-29  513  		return -EMSGSIZE;
f638a84afef3df Wang Yufen           2022-06-07  514  	ulen = len + transhdrlen;
a32e0eec7042b2 Chris Elston         2012-04-29  515  
a32e0eec7042b2 Chris Elston         2012-04-29  516  	/* Mirror BSD error message compatibility */
a32e0eec7042b2 Chris Elston         2012-04-29  517  	if (msg->msg_flags & MSG_OOB)
a32e0eec7042b2 Chris Elston         2012-04-29  518  		return -EOPNOTSUPP;
a32e0eec7042b2 Chris Elston         2012-04-29  519  
20dcb1107ab1a3 Tom Parkin           2020-07-22  520  	/* Get and verify the address */
a32e0eec7042b2 Chris Elston         2012-04-29  521  	memset(&fl6, 0, sizeof(fl6));
a32e0eec7042b2 Chris Elston         2012-04-29  522  
a32e0eec7042b2 Chris Elston         2012-04-29  523  	fl6.flowi6_mark = sk->sk_mark;
e2d118a1cb5e60 Lorenzo Colitti      2016-11-04  524  	fl6.flowi6_uid = sk->sk_uid;
a32e0eec7042b2 Chris Elston         2012-04-29  525  
b515430ac9c25d Willem de Bruijn     2018-07-06  526  	ipcm6_init(&ipc6);
26879da58711aa Wei Wang             2016-05-02  527  
a32e0eec7042b2 Chris Elston         2012-04-29  528  	if (lsa) {
a32e0eec7042b2 Chris Elston         2012-04-29  529  		if (addr_len < SIN6_LEN_RFC2133)
a32e0eec7042b2 Chris Elston         2012-04-29  530  			return -EINVAL;
a32e0eec7042b2 Chris Elston         2012-04-29  531  
a32e0eec7042b2 Chris Elston         2012-04-29  532  		if (lsa->l2tp_family && lsa->l2tp_family != AF_INET6)
a32e0eec7042b2 Chris Elston         2012-04-29  533  			return -EAFNOSUPPORT;
a32e0eec7042b2 Chris Elston         2012-04-29  534  
a32e0eec7042b2 Chris Elston         2012-04-29  535  		daddr = &lsa->l2tp_addr;
a32e0eec7042b2 Chris Elston         2012-04-29  536  		if (np->sndflow) {
a32e0eec7042b2 Chris Elston         2012-04-29  537  			fl6.flowlabel = lsa->l2tp_flowinfo & IPV6_FLOWINFO_MASK;
a32e0eec7042b2 Chris Elston         2012-04-29  538  			if (fl6.flowlabel & IPV6_FLOWLABEL_MASK) {
a32e0eec7042b2 Chris Elston         2012-04-29  539  				flowlabel = fl6_sock_lookup(sk, fl6.flowlabel);
59c820b2317f0f Willem de Bruijn     2019-07-07  540  				if (IS_ERR(flowlabel))
a32e0eec7042b2 Chris Elston         2012-04-29  541  					return -EINVAL;
a32e0eec7042b2 Chris Elston         2012-04-29  542  			}
a32e0eec7042b2 Chris Elston         2012-04-29  543  		}
a32e0eec7042b2 Chris Elston         2012-04-29  544  
20dcb1107ab1a3 Tom Parkin           2020-07-22  545  		/* Otherwise it will be difficult to maintain
a32e0eec7042b2 Chris Elston         2012-04-29  546  		 * sk->sk_dst_cache.
a32e0eec7042b2 Chris Elston         2012-04-29  547  		 */
a32e0eec7042b2 Chris Elston         2012-04-29  548  		if (sk->sk_state == TCP_ESTABLISHED &&
efe4208f47f907 Eric Dumazet         2013-10-03  549  		    ipv6_addr_equal(daddr, &sk->sk_v6_daddr))
efe4208f47f907 Eric Dumazet         2013-10-03  550  			daddr = &sk->sk_v6_daddr;
a32e0eec7042b2 Chris Elston         2012-04-29  551  
a32e0eec7042b2 Chris Elston         2012-04-29  552  		if (addr_len >= sizeof(struct sockaddr_in6) &&
a32e0eec7042b2 Chris Elston         2012-04-29  553  		    lsa->l2tp_scope_id &&
a32e0eec7042b2 Chris Elston         2012-04-29  554  		    ipv6_addr_type(daddr) & IPV6_ADDR_LINKLOCAL)
a32e0eec7042b2 Chris Elston         2012-04-29  555  			fl6.flowi6_oif = lsa->l2tp_scope_id;
a32e0eec7042b2 Chris Elston         2012-04-29  556  	} else {
a32e0eec7042b2 Chris Elston         2012-04-29  557  		if (sk->sk_state != TCP_ESTABLISHED)
a32e0eec7042b2 Chris Elston         2012-04-29  558  			return -EDESTADDRREQ;
a32e0eec7042b2 Chris Elston         2012-04-29  559  
efe4208f47f907 Eric Dumazet         2013-10-03  560  		daddr = &sk->sk_v6_daddr;
a32e0eec7042b2 Chris Elston         2012-04-29  561  		fl6.flowlabel = np->flow_label;
a32e0eec7042b2 Chris Elston         2012-04-29  562  	}
a32e0eec7042b2 Chris Elston         2012-04-29  563  
a32e0eec7042b2 Chris Elston         2012-04-29  564  	if (fl6.flowi6_oif == 0)
ff0094030f146b Eric Dumazet         2022-05-13  565  		fl6.flowi6_oif = READ_ONCE(sk->sk_bound_dev_if);
a32e0eec7042b2 Chris Elston         2012-04-29  566  
a32e0eec7042b2 Chris Elston         2012-04-29  567  	if (msg->msg_controllen) {
a32e0eec7042b2 Chris Elston         2012-04-29  568  		opt = &opt_space;
a32e0eec7042b2 Chris Elston         2012-04-29  569  		memset(opt, 0, sizeof(struct ipv6_txoptions));
a32e0eec7042b2 Chris Elston         2012-04-29  570  		opt->tot_len = sizeof(struct ipv6_txoptions);
26879da58711aa Wei Wang             2016-05-02  571  		ipc6.opt = opt;
a32e0eec7042b2 Chris Elston         2012-04-29  572  
5fdaa88dfefa87 Willem de Bruijn     2018-07-06  573  		err = ip6_datagram_send_ctl(sock_net(sk), sk, msg, &fl6, &ipc6);
a32e0eec7042b2 Chris Elston         2012-04-29  574  		if (err < 0) {
a32e0eec7042b2 Chris Elston         2012-04-29  575  			fl6_sock_release(flowlabel);
a32e0eec7042b2 Chris Elston         2012-04-29  576  			return err;
a32e0eec7042b2 Chris Elston         2012-04-29  577  		}
a32e0eec7042b2 Chris Elston         2012-04-29  578  		if ((fl6.flowlabel & IPV6_FLOWLABEL_MASK) && !flowlabel) {
a32e0eec7042b2 Chris Elston         2012-04-29  579  			flowlabel = fl6_sock_lookup(sk, fl6.flowlabel);
59c820b2317f0f Willem de Bruijn     2019-07-07  580  			if (IS_ERR(flowlabel))
a32e0eec7042b2 Chris Elston         2012-04-29  581  				return -EINVAL;
a32e0eec7042b2 Chris Elston         2012-04-29  582  		}
a32e0eec7042b2 Chris Elston         2012-04-29  583  		if (!(opt->opt_nflen | opt->opt_flen))
a32e0eec7042b2 Chris Elston         2012-04-29  584  			opt = NULL;
a32e0eec7042b2 Chris Elston         2012-04-29  585  	}
a32e0eec7042b2 Chris Elston         2012-04-29  586  
45f6fad84cc305 Eric Dumazet         2015-11-29  587  	if (!opt) {
45f6fad84cc305 Eric Dumazet         2015-11-29  588  		opt = txopt_get(np);
45f6fad84cc305 Eric Dumazet         2015-11-29  589  		opt_to_free = opt;
45f6fad84cc305 Eric Dumazet         2015-11-29  590  	}
a32e0eec7042b2 Chris Elston         2012-04-29  591  	if (flowlabel)
a32e0eec7042b2 Chris Elston         2012-04-29  592  		opt = fl6_merge_options(&opt_space, flowlabel, opt);
a32e0eec7042b2 Chris Elston         2012-04-29  593  	opt = ipv6_fixup_options(&opt_space, opt);
26879da58711aa Wei Wang             2016-05-02  594  	ipc6.opt = opt;
a32e0eec7042b2 Chris Elston         2012-04-29  595  
a32e0eec7042b2 Chris Elston         2012-04-29  596  	fl6.flowi6_proto = sk->sk_protocol;
a32e0eec7042b2 Chris Elston         2012-04-29  597  	if (!ipv6_addr_any(daddr))
a32e0eec7042b2 Chris Elston         2012-04-29  598  		fl6.daddr = *daddr;
a32e0eec7042b2 Chris Elston         2012-04-29  599  	else
a32e0eec7042b2 Chris Elston         2012-04-29  600  		fl6.daddr.s6_addr[15] = 0x1; /* :: means loopback (BSD'ism) */
a32e0eec7042b2 Chris Elston         2012-04-29  601  	if (ipv6_addr_any(&fl6.saddr) && !ipv6_addr_any(&np->saddr))
a32e0eec7042b2 Chris Elston         2012-04-29  602  		fl6.saddr = np->saddr;
a32e0eec7042b2 Chris Elston         2012-04-29  603  
a32e0eec7042b2 Chris Elston         2012-04-29  604  	final_p = fl6_update_dst(&fl6, opt, &final);
a32e0eec7042b2 Chris Elston         2012-04-29  605  
a32e0eec7042b2 Chris Elston         2012-04-29  606  	if (!fl6.flowi6_oif && ipv6_addr_is_multicast(&fl6.daddr))
a32e0eec7042b2 Chris Elston         2012-04-29  607  		fl6.flowi6_oif = np->mcast_oif;
a32e0eec7042b2 Chris Elston         2012-04-29  608  	else if (!fl6.flowi6_oif)
a32e0eec7042b2 Chris Elston         2012-04-29  609  		fl6.flowi6_oif = np->ucast_oif;
a32e0eec7042b2 Chris Elston         2012-04-29  610  
3df98d79215ace Paul Moore           2020-09-27  611  	security_sk_classify_flow(sk, flowi6_to_flowi_common(&fl6));
a32e0eec7042b2 Chris Elston         2012-04-29  612  
38b7097b55b6cf Hannes Frederic Sowa 2016-06-11  613  	if (ipc6.tclass < 0)
38b7097b55b6cf Hannes Frederic Sowa 2016-06-11  614  		ipc6.tclass = np->tclass;
38b7097b55b6cf Hannes Frederic Sowa 2016-06-11  615  
38b7097b55b6cf Hannes Frederic Sowa 2016-06-11  616  	fl6.flowlabel = ip6_make_flowinfo(ipc6.tclass, fl6.flowlabel);
38b7097b55b6cf Hannes Frederic Sowa 2016-06-11  617  
c4e85f73afb638 Sabrina Dubroca      2019-12-04  618  	dst = ip6_dst_lookup_flow(sock_net(sk), sk, &fl6, final_p);
a32e0eec7042b2 Chris Elston         2012-04-29  619  	if (IS_ERR(dst)) {
a32e0eec7042b2 Chris Elston         2012-04-29  620  		err = PTR_ERR(dst);
a32e0eec7042b2 Chris Elston         2012-04-29  621  		goto out;
a32e0eec7042b2 Chris Elston         2012-04-29  622  	}
a32e0eec7042b2 Chris Elston         2012-04-29  623  
26879da58711aa Wei Wang             2016-05-02  624  	if (ipc6.hlimit < 0)
26879da58711aa Wei Wang             2016-05-02  625  		ipc6.hlimit = ip6_sk_dst_hoplimit(np, &fl6, dst);
a32e0eec7042b2 Chris Elston         2012-04-29  626  
26879da58711aa Wei Wang             2016-05-02  627  	if (ipc6.dontfrag < 0)
26879da58711aa Wei Wang             2016-05-02  628  		ipc6.dontfrag = np->dontfrag;
a32e0eec7042b2 Chris Elston         2012-04-29  629  
a32e0eec7042b2 Chris Elston         2012-04-29  630  	if (msg->msg_flags & MSG_CONFIRM)
a32e0eec7042b2 Chris Elston         2012-04-29  631  		goto do_confirm;
a32e0eec7042b2 Chris Elston         2012-04-29  632  
a32e0eec7042b2 Chris Elston         2012-04-29  633  back_from_confirm:
a32e0eec7042b2 Chris Elston         2012-04-29  634  	lock_sock(sk);
f69e6d131f5dac Al Viro              2014-11-24  635  	err = ip6_append_data(sk, ip_generic_getfrag, msg,
26879da58711aa Wei Wang             2016-05-02  636  			      ulen, transhdrlen, &ipc6,
a32e0eec7042b2 Chris Elston         2012-04-29  637  			      &fl6, (struct rt6_info *)dst,
5fdaa88dfefa87 Willem de Bruijn     2018-07-06 @638  			      msg->msg_flags);
a32e0eec7042b2 Chris Elston         2012-04-29  639  	if (err)
a32e0eec7042b2 Chris Elston         2012-04-29  640  		ip6_flush_pending_frames(sk);
a32e0eec7042b2 Chris Elston         2012-04-29  641  	else if (!(msg->msg_flags & MSG_MORE))
a32e0eec7042b2 Chris Elston         2012-04-29  642  		err = l2tp_ip6_push_pending_frames(sk);
a32e0eec7042b2 Chris Elston         2012-04-29  643  	release_sock(sk);
a32e0eec7042b2 Chris Elston         2012-04-29  644  done:
a32e0eec7042b2 Chris Elston         2012-04-29  645  	dst_release(dst);
a32e0eec7042b2 Chris Elston         2012-04-29  646  out:
a32e0eec7042b2 Chris Elston         2012-04-29  647  	fl6_sock_release(flowlabel);
45f6fad84cc305 Eric Dumazet         2015-11-29  648  	txopt_put(opt_to_free);
a32e0eec7042b2 Chris Elston         2012-04-29  649  
a32e0eec7042b2 Chris Elston         2012-04-29  650  	return err < 0 ? err : len;
a32e0eec7042b2 Chris Elston         2012-04-29  651  
a32e0eec7042b2 Chris Elston         2012-04-29  652  do_confirm:
0dec879f636f11 Julian Anastasov     2017-02-06  653  	if (msg->msg_flags & MSG_PROBE)
0dec879f636f11 Julian Anastasov     2017-02-06  654  		dst_confirm_neigh(dst, &fl6.daddr);
a32e0eec7042b2 Chris Elston         2012-04-29  655  	if (!(msg->msg_flags & MSG_PROBE) || len)
a32e0eec7042b2 Chris Elston         2012-04-29  656  		goto back_from_confirm;
a32e0eec7042b2 Chris Elston         2012-04-29  657  	err = 0;
a32e0eec7042b2 Chris Elston         2012-04-29  658  	goto done;
a32e0eec7042b2 Chris Elston         2012-04-29  659  }
a32e0eec7042b2 Chris Elston         2012-04-29  660  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests


* RE: [RFC,1/3] net: Drop the size argument from ->sendmsg()
  2023-03-22 13:56             ` David Howells
                               ` (2 preceding siblings ...)
  (?)
@ 2023-03-23  1:11             ` bluez.test.bot
  -1 siblings, 0 replies; 81+ messages in thread
From: bluez.test.bot @ 2023-03-23  1:11 UTC (permalink / raw)
  To: linux-bluetooth, dhowells


This is an automated email; please do not reply to it.

Dear submitter,

Thank you for submitting the patches to the Linux Bluetooth mailing list.
These are the CI test results for your patch series:
PW Link: https://patchwork.kernel.org/project/bluetooth/list/?series=732753

---Test result---

Test Summary:
CheckPatch                    FAIL      18.11 seconds
GitLint                       PASS      0.26 seconds
SubjectPrefix                 FAIL      0.40 seconds
BuildKernel                   PASS      31.50 seconds
CheckAllWarning               PASS      34.46 seconds
CheckSparse                   WARNING   39.28 seconds
CheckSmatch                   WARNING   108.13 seconds
BuildKernel32                 PASS      30.32 seconds
TestRunnerSetup               PASS      433.80 seconds
TestRunner_l2cap-tester       PASS      15.85 seconds
TestRunner_iso-tester         PASS      15.61 seconds
TestRunner_bnep-tester        PASS      5.07 seconds
TestRunner_mgmt-tester        PASS      103.41 seconds
TestRunner_rfcomm-tester      PASS      8.14 seconds
TestRunner_sco-tester         PASS      7.43 seconds
TestRunner_ioctl-tester       PASS      8.59 seconds
TestRunner_mesh-tester        PASS      6.36 seconds
TestRunner_smp-tester         PASS      7.37 seconds
TestRunner_userchan-tester    PASS      5.26 seconds
IncrementalBuild              PASS      28.86 seconds

Details
##############################
Test: CheckPatch - FAIL
Desc: Run checkpatch.pl script
Output:
[RFC,1/3] net: Drop the size argument from ->sendmsg()
WARNING: Unnecessary space before function pointer arguments
#729: FILE: include/linux/net.h:195:
+	int		(*sendmsg)   (struct socket *sock, struct msghdr *m);

WARNING: function definition argument 'struct rxrpc_sock *' should also have an identifier name
#2195: FILE: net/rxrpc/ar-internal.h:1224:
+int rxrpc_do_sendmsg(struct rxrpc_sock *, struct msghdr *);

WARNING: function definition argument 'struct msghdr *' should also have an identifier name
#2195: FILE: net/rxrpc/ar-internal.h:1224:
+int rxrpc_do_sendmsg(struct rxrpc_sock *, struct msghdr *);

total: 0 errors, 3 warnings, 2169 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

/github/workspace/src/src/13184097.patch has style problems, please review.

NOTE: Ignored message types: UNKNOWN_COMMIT_ID

NOTE: If any of the errors are false positives, please report
      them to the maintainer, see CHECKPATCH in MAINTAINERS.
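
For reference, the three warnings above are purely stylistic. A minimal sketch of declarations that should satisfy them, assuming a cut-down stand-in struct (demo_proto_ops is hypothetical) and assumed parameter names (rx, msg) that do not come from the patch:

```c
/* Forward declarations so the fragment compiles on its own. */
struct socket;
struct msghdr;
struct rxrpc_sock;

/* Cut-down, hypothetical stand-in for the ops table in include/linux/net.h. */
struct demo_proto_ops {
	/* No whitespace between the declarator and its argument list. */
	int (*sendmsg)(struct socket *sock, struct msghdr *m);
};

/* Every parameter carries an identifier; "rx" and "msg" are assumed names. */
int rxrpc_do_sendmsg(struct rxrpc_sock *rx, struct msghdr *msg);
```

Neither change affects the generated code; they only bring the declarations in line with what checkpatch expects.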


##############################
Test: SubjectPrefix - FAIL
Desc: Check subject contains "Bluetooth" prefix
Output:
"Bluetooth: " prefix is not specified in the subject
##############################
Test: CheckSparse - WARNING
Desc: Run sparse tool with linux kernel
Output:
net/bluetooth/sco.c: note: in included file:
./include/net/bluetooth/hci_core.h:148:35: warning: array of flexible structures
##############################
Test: CheckSmatch - WARNING
Desc: Run smatch tool with source
Output:
net/bluetooth/sco.c: note: in included file:
./include/net/bluetooth/hci_core.h:148:35: warning: array of flexible structures


---
Regards,
Linux Bluetooth



* RE: [RFC PATCH 0/3] net: Drop size arg from ->sendmsg() and pass msghdr into __ip{,6}_append_data()
  2023-03-22 13:56         ` [RFC PATCH 0/3] net: Drop size arg from ->sendmsg() and pass msghdr into __ip{,6}_append_data() David Howells
                             ` (2 preceding siblings ...)
  2023-03-22 13:56           ` [RFC PATCH 3/3] net: Declare MSG_SPLICE_PAGES internal sendmsg() flag David Howells
@ 2023-03-23  1:17           ` Willem de Bruijn
  3 siblings, 0 replies; 81+ messages in thread
From: Willem de Bruijn @ 2023-03-23  1:17 UTC (permalink / raw)
  To: David Howells, Willem de Bruijn
  Cc: David Howells, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Matthew Wilcox, Jeff Layton, Linus Torvalds, netdev,
	linux-kernel

David Howells wrote:
> Hi Willem,
> 
> Here's another option to passing MSG_SPLICE_PAGES into sendmsg()[1] without
> polluting the flags in msg->msg_flags.  The idea here is to put the flag
> into a new field in msghdr, msg_kflags, that holds internal kernel flags
> that aren't available to userspace.
> 
> What I've done here is:
> 
>  (1) Pass msg down to __ip_append_data() and __ip6_append_data() so that
>      they can access the extra flags.
> 
>  (2) In order to avoid adding extra arguments to these functions and the
>      functions in their call chains (such as ip_make_skb()), remove the
>      size and flags arguments as these values are redundant if msg is
>      passed in.
> 
>  (3) msg is then passed into getfrag().  I would like to get rid of the
>      "from" argument also in favour of using something in msghdr, but I'm
>      not sure how best to do that.
> 
>  (4) The size parameter to ->sendmsg() seems to be redundant; indeed
>      sock_sendmsg() doesn't actually take it, but rather gets the count
>      from msg_iter - so remove this parameter.
> 
>      kernel_sendmsg() will still take a size, but it sets it on the
>      iterator and then calls sock_sendmsg().
> 
>  (5) Protocol sendmsg implementations then extract the length and the flags
>      from the iterator.
> 
>  (6) Illustrate the addition of msg_kflags and MSG_SPLICE_PAGES.  I think
>      that, at some point in the future, some of the other flags could be
>      moved from msg_flags to msg_kflags.
> 
> David
> 
> Link: https://lore.kernel.org/r/20230316152618.711970-1-dhowells@redhat.com/ [1]
> 
> David Howells (3):
>   net: Drop the size argument from ->sendmsg()
>   ip: Make __ip{,6}_append_data() and co. take a msghdr*
>   net: Declare MSG_SPLICE_PAGES internal sendmsg() flag
> 
>  crypto/af_alg.c                               | 12 +--
>  crypto/algif_aead.c                           |  9 +--
>  crypto/algif_hash.c                           |  8 +-
>  crypto/algif_rng.c                            |  3 +-
>  crypto/algif_skcipher.c                       | 10 +--
>  drivers/isdn/mISDN/socket.c                   |  3 +-
>  .../chelsio/inline_crypto/chtls/chtls.h       |  2 +-
>  .../chelsio/inline_crypto/chtls/chtls_io.c    | 15 ++--
>  drivers/net/ppp/pppoe.c                       |  4 +-
>  drivers/net/tap.c                             |  3 +-
>  drivers/net/tun.c                             |  3 +-
>  drivers/vhost/net.c                           |  6 +-
>  drivers/xen/pvcalls-back.c                    |  2 +-
>  drivers/xen/pvcalls-front.c                   |  4 +-
>  drivers/xen/pvcalls-front.h                   |  3 +-
>  fs/afs/rxrpc.c                                |  8 +-
>  include/crypto/if_alg.h                       |  3 +-
>  include/linux/lsm_hook_defs.h                 |  3 +-
>  include/linux/lsm_hooks.h                     |  1 -
>  include/linux/net.h                           |  6 +-
>  include/linux/security.h                      |  4 +-
>  include/linux/socket.h                        |  3 +
>  include/net/af_rxrpc.h                        |  3 +-
>  include/net/inet_common.h                     |  2 +-
>  include/net/ip.h                              | 24 +++---
>  include/net/ipv6.h                            | 22 +++---
>  include/net/ping.h                            |  7 +-
>  include/net/sock.h                            |  7 +-
>  include/net/tcp.h                             |  8 +-
>  include/net/udp.h                             |  2 +-
>  include/net/udplite.h                         |  4 +-
>  net/appletalk/ddp.c                           |  3 +-
>  net/atm/common.c                              |  3 +-
>  net/atm/common.h                              |  2 +-
>  net/ax25/af_ax25.c                            |  4 +-
>  net/bluetooth/hci_sock.c                      |  4 +-
>  net/bluetooth/iso.c                           |  4 +-
>  net/bluetooth/l2cap_sock.c                    |  5 +-
>  net/bluetooth/rfcomm/sock.c                   |  7 +-
>  net/bluetooth/sco.c                           |  4 +-
>  net/caif/caif_socket.c                        | 13 ++--
>  net/can/bcm.c                                 |  3 +-
>  net/can/isotp.c                               |  3 +-
>  net/can/j1939/socket.c                        |  4 +-
>  net/can/raw.c                                 |  3 +-
>  net/core/sock.c                               |  4 +-
>  net/dccp/dccp.h                               |  2 +-
>  net/dccp/proto.c                              |  3 +-
>  net/ieee802154/socket.c                       | 11 +--
>  net/ipv4/af_inet.c                            |  4 +-
>  net/ipv4/icmp.c                               | 14 ++--
>  net/ipv4/ip_output.c                          | 73 ++++++++++---------
>  net/ipv4/ping.c                               | 18 ++---
>  net/ipv4/raw.c                                | 23 +++---
>  net/ipv4/tcp.c                                | 17 +++--
>  net/ipv4/tcp_bpf.c                            |  5 +-
>  net/ipv4/tcp_input.c                          |  3 +-
>  net/ipv4/udp.c                                | 24 +++---
>  net/ipv6/af_inet6.c                           |  7 +-
>  net/ipv6/icmp.c                               | 21 ++++--
>  net/ipv6/ip6_output.c                         | 57 +++++++--------
>  net/ipv6/ping.c                               | 12 +--
>  net/ipv6/raw.c                                | 25 +++----
>  net/ipv6/udp.c                                | 26 ++++---
>  net/ipv6/udp_impl.h                           |  2 +-
>  net/iucv/af_iucv.c                            |  4 +-
>  net/kcm/kcmsock.c                             |  2 +-
>  net/key/af_key.c                              |  3 +-
>  net/l2tp/l2tp_ip.c                            |  3 +-
>  net/l2tp/l2tp_ip6.c                           |  3 +-
>  net/l2tp/l2tp_ppp.c                           |  4 +-
>  net/llc/af_llc.c                              |  5 +-
>  net/mctp/af_mctp.c                            |  3 +-
>  net/mptcp/protocol.c                          |  8 +-
>  net/netlink/af_netlink.c                      | 11 +--
>  net/netrom/af_netrom.c                        |  3 +-
>  net/nfc/llcp_sock.c                           |  7 +-
>  net/nfc/rawsock.c                             |  3 +-
>  net/packet/af_packet.c                        | 11 +--
>  net/phonet/datagram.c                         |  3 +-
>  net/phonet/pep.c                              |  3 +-
>  net/phonet/socket.c                           |  5 +-
>  net/qrtr/af_qrtr.c                            |  4 +-
>  net/rds/rds.h                                 |  2 +-
>  net/rds/send.c                                |  3 +-
>  net/rose/af_rose.c                            |  3 +-
>  net/rxrpc/af_rxrpc.c                          |  6 +-
>  net/rxrpc/ar-internal.h                       |  2 +-
>  net/rxrpc/output.c                            | 22 +++---
>  net/rxrpc/rxperf.c                            |  4 +-
>  net/rxrpc/sendmsg.c                           | 15 ++--
>  net/sctp/socket.c                             |  3 +-
>  net/smc/af_smc.c                              |  5 +-
>  net/socket.c                                  | 16 ++--
>  net/tipc/socket.c                             | 34 ++++-----
>  net/tls/tls.h                                 |  4 +-
>  net/tls/tls_device.c                          |  5 +-
>  net/tls/tls_sw.c                              |  2 +-
>  net/unix/af_unix.c                            | 19 +++--
>  net/vmw_vsock/af_vsock.c                      | 16 ++--
>  net/x25/af_x25.c                              |  3 +-
>  net/xdp/xsk.c                                 |  6 +-
>  net/xfrm/espintcp.c                           |  8 +-
>  security/apparmor/lsm.c                       |  6 +-
>  security/security.c                           |  4 +-
>  security/selinux/hooks.c                      |  3 +-
>  security/smack/smack_lsm.c                    |  4 +-
>  security/tomoyo/common.h                      |  3 +-
>  security/tomoyo/network.c                     |  4 +-
>  security/tomoyo/tomoyo.c                      |  6 +-
>  110 files changed, 444 insertions(+), 456 deletions(-)

That's a significant code change if only for this purpose.

If this bit is undefined and ignored by all socket families today,
masking it out in sock_sendmsg should be enough to start using it
safely as an internal flag.
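
A minimal userspace model of that masking idea, as a sketch only. The bit value, the demo_* names and the split into a syscall-facing and an in-kernel entry point are all assumptions made for illustration; none of them is taken from the patches, and where exactly the mask would sit is left open here.

```c
/* Hypothetical bit value, chosen for illustration only. */
#define DEMO_MSG_SPLICE_PAGES	0x8000000u
/* Assumed set of bits that must never be accepted from userspace. */
#define DEMO_MSG_INTERNAL_FLAGS	(DEMO_MSG_SPLICE_PAGES)

/* Cut-down stand-in for struct msghdr. */
struct demo_msghdr {
	unsigned int msg_flags;
};

/* Stand-in for a protocol ->sendmsg(): returns 1 if it would splice pages. */
static int demo_proto_sendmsg(const struct demo_msghdr *msg)
{
	return (msg->msg_flags & DEMO_MSG_SPLICE_PAGES) ? 1 : 0;
}

/* Path reached from the sendmsg() syscall: drop internal bits first. */
static int demo_user_sendmsg(struct demo_msghdr *msg)
{
	msg->msg_flags &= ~DEMO_MSG_INTERNAL_FLAGS;
	return demo_proto_sendmsg(msg);
}

/* Path for in-kernel callers such as splice: flags pass through untouched. */
static int demo_kernel_sendmsg(struct demo_msghdr *msg)
{
	return demo_proto_sendmsg(msg);
}
```

With this arrangement a userspace caller that sets the bit gets ordinary copy semantics, while an in-kernel caller can still request splicing, and no new field in msghdr is needed.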


* Re: [RFC PATCH 2/3] ip: Make __ip{,6}_append_data() and co. take a msghdr*
  2023-03-22 13:56           ` [RFC PATCH 2/3] ip: Make __ip{,6}_append_data() and co. take a msghdr* David Howells
  2023-03-22 17:25             ` kernel test robot
  2023-03-22 22:12             ` kernel test robot
@ 2023-03-23  1:25             ` kernel test robot
  2023-03-23  1:25             ` kernel test robot
  3 siblings, 0 replies; 81+ messages in thread
From: kernel test robot @ 2023-03-23  1:25 UTC (permalink / raw)
  To: David Howells; +Cc: oe-kbuild-all

Hi David,

[FYI, it's a private test report for your RFC patch.]
[auto build test ERROR on linus/master]
[also build test ERROR on v6.3-rc3]
[cannot apply to herbert-cryptodev-2.6/master herbert-crypto-2.6/master bluetooth-next/master bluetooth/master next-20230322]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting a patch, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/David-Howells/ip-Make-__ip-6-_append_data-and-co-take-a-msghdr/20230322-225741
patch link:    https://lore.kernel.org/r/20230322135612.3265850-3-dhowells%40redhat.com
patch subject: [RFC PATCH 2/3] ip: Make __ip{,6}_append_data() and co. take a msghdr*
config: i386-debian-10.3 (https://download.01.org/0day-ci/archive/20230323/202303230954.Qtw5rHvX-lkp@intel.com/config)
compiler: gcc-11 (Debian 11.3.0-8) 11.3.0
reproduce (this is a W=1 build):
        # https://github.com/intel-lab-lkp/linux/commit/eecac0727821eaf716a8600550bf68f21ead4b87
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review David-Howells/ip-Make-__ip-6-_append_data-and-co-take-a-msghdr/20230322-225741
        git checkout eecac0727821eaf716a8600550bf68f21ead4b87
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        make W=1 O=build_dir ARCH=i386 olddefconfig
        make W=1 O=build_dir ARCH=i386 SHELL=/bin/bash lib// net/

If you fix the issue, kindly add the following tags where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202303230954.Qtw5rHvX-lkp@intel.com/

All errors (new ones prefixed by >>):

   net/l2tp/l2tp_ip6.c: In function 'l2tp_ip6_sendmsg':
>> net/l2tp/l2tp_ip6.c:635:35: error: passing argument 2 of 'ip6_append_data' from incompatible pointer type [-Werror=incompatible-pointer-types]
     635 |         err = ip6_append_data(sk, ip_generic_getfrag, msg,
         |                                   ^~~~~~~~~~~~~~~~~~
         |                                   |
         |                                   int (*)(struct msghdr *, void *, char *, int,  int,  int,  struct sk_buff *)
   In file included from include/net/inetpeer.h:16,
                    from include/net/route.h:24,
                    from include/net/ip.h:30,
                    from net/l2tp/l2tp_ip6.c:18:
   include/net/ipv6.h:1100:53: note: expected 'struct msghdr *' but argument is of type 'int (*)(struct msghdr *, void *, char *, int,  int,  int,  struct sk_buff *)'
    1100 | int ip6_append_data(struct sock *sk, struct msghdr *msg,
         |                                      ~~~~~~~~~~~~~~~^~~
   net/l2tp/l2tp_ip6.c:635:55: error: passing argument 3 of 'ip6_append_data' from incompatible pointer type [-Werror=incompatible-pointer-types]
     635 |         err = ip6_append_data(sk, ip_generic_getfrag, msg,
         |                                                       ^~~
         |                                                       |
         |                                                       struct msghdr *
   In file included from include/net/inetpeer.h:16,
                    from include/net/route.h:24,
                    from include/net/ip.h:30,
                    from net/l2tp/l2tp_ip6.c:18:
   include/net/ipv6.h:1101:34: note: expected 'ip_getfrag_t' but argument is of type 'struct msghdr *'
    1101 |                     ip_getfrag_t getfrag, void *from, int transhdrlen,
         |                     ~~~~~~~~~~~~~^~~~~~~
   net/l2tp/l2tp_ip6.c:636:31: warning: passing argument 4 of 'ip6_append_data' makes pointer from integer without a cast [-Wint-conversion]
     636 |                               ulen, transhdrlen, &ipc6,
         |                               ^~~~
         |                               |
         |                               int
   In file included from include/net/inetpeer.h:16,
                    from include/net/route.h:24,
                    from include/net/ip.h:30,
                    from net/l2tp/l2tp_ip6.c:18:
   include/net/ipv6.h:1101:49: note: expected 'void *' but argument is of type 'int'
    1101 |                     ip_getfrag_t getfrag, void *from, int transhdrlen,
         |                                           ~~~~~~^~~~
>> net/l2tp/l2tp_ip6.c:635:15: error: too many arguments to function 'ip6_append_data'
     635 |         err = ip6_append_data(sk, ip_generic_getfrag, msg,
         |               ^~~~~~~~~~~~~~~
   In file included from include/net/inetpeer.h:16,
                    from include/net/route.h:24,
                    from include/net/ip.h:30,
                    from net/l2tp/l2tp_ip6.c:18:
   include/net/ipv6.h:1100:5: note: declared here
    1100 | int ip6_append_data(struct sock *sk, struct msghdr *msg,
         |     ^~~~~~~~~~~~~~~
   cc1: some warnings being treated as errors


vim +/ip6_append_data +635 net/l2tp/l2tp_ip6.c

a32e0eec7042b2 Chris Elston         2012-04-29  487  
a32e0eec7042b2 Chris Elston         2012-04-29  488  /* Userspace will call sendmsg() on the tunnel socket to send L2TP
a32e0eec7042b2 Chris Elston         2012-04-29  489   * control frames.
a32e0eec7042b2 Chris Elston         2012-04-29  490   */
cee416e2c19501 David Howells        2023-03-22  491  static int l2tp_ip6_sendmsg(struct sock *sk, struct msghdr *msg)
a32e0eec7042b2 Chris Elston         2012-04-29  492  {
a32e0eec7042b2 Chris Elston         2012-04-29  493  	struct ipv6_txoptions opt_space;
342dfc306fb321 Steffen Hurrle       2014-01-17  494  	DECLARE_SOCKADDR(struct sockaddr_l2tpip6 *, lsa, msg->msg_name);
a32e0eec7042b2 Chris Elston         2012-04-29  495  	struct in6_addr *daddr, *final_p, final;
a32e0eec7042b2 Chris Elston         2012-04-29  496  	struct ipv6_pinfo *np = inet6_sk(sk);
45f6fad84cc305 Eric Dumazet         2015-11-29  497  	struct ipv6_txoptions *opt_to_free = NULL;
a32e0eec7042b2 Chris Elston         2012-04-29  498  	struct ipv6_txoptions *opt = NULL;
a32e0eec7042b2 Chris Elston         2012-04-29  499  	struct ip6_flowlabel *flowlabel = NULL;
a32e0eec7042b2 Chris Elston         2012-04-29  500  	struct dst_entry *dst = NULL;
a32e0eec7042b2 Chris Elston         2012-04-29  501  	struct flowi6 fl6;
26879da58711aa Wei Wang             2016-05-02  502  	struct ipcm6_cookie ipc6;
cee416e2c19501 David Howells        2023-03-22  503  	size_t len = msg_data_left(msg);
a32e0eec7042b2 Chris Elston         2012-04-29  504  	int addr_len = msg->msg_namelen;
a32e0eec7042b2 Chris Elston         2012-04-29  505  	int transhdrlen = 4; /* zero session-id */
f638a84afef3df Wang Yufen           2022-06-07  506  	int ulen;
a32e0eec7042b2 Chris Elston         2012-04-29  507  	int err;
a32e0eec7042b2 Chris Elston         2012-04-29  508  
a32e0eec7042b2 Chris Elston         2012-04-29  509  	/* Rough check on arithmetic overflow,
20dcb1107ab1a3 Tom Parkin           2020-07-22  510  	 * better check is made in ip6_append_data().
a32e0eec7042b2 Chris Elston         2012-04-29  511  	 */
f638a84afef3df Wang Yufen           2022-06-07  512  	if (len > INT_MAX - transhdrlen)
a32e0eec7042b2 Chris Elston         2012-04-29  513  		return -EMSGSIZE;
f638a84afef3df Wang Yufen           2022-06-07  514  	ulen = len + transhdrlen;
a32e0eec7042b2 Chris Elston         2012-04-29  515  
a32e0eec7042b2 Chris Elston         2012-04-29  516  	/* Mirror BSD error message compatibility */
a32e0eec7042b2 Chris Elston         2012-04-29  517  	if (msg->msg_flags & MSG_OOB)
a32e0eec7042b2 Chris Elston         2012-04-29  518  		return -EOPNOTSUPP;
a32e0eec7042b2 Chris Elston         2012-04-29  519  
20dcb1107ab1a3 Tom Parkin           2020-07-22  520  	/* Get and verify the address */
a32e0eec7042b2 Chris Elston         2012-04-29  521  	memset(&fl6, 0, sizeof(fl6));
a32e0eec7042b2 Chris Elston         2012-04-29  522  
a32e0eec7042b2 Chris Elston         2012-04-29  523  	fl6.flowi6_mark = sk->sk_mark;
e2d118a1cb5e60 Lorenzo Colitti      2016-11-04  524  	fl6.flowi6_uid = sk->sk_uid;
a32e0eec7042b2 Chris Elston         2012-04-29  525  
b515430ac9c25d Willem de Bruijn     2018-07-06  526  	ipcm6_init(&ipc6);
26879da58711aa Wei Wang             2016-05-02  527  
a32e0eec7042b2 Chris Elston         2012-04-29  528  	if (lsa) {
a32e0eec7042b2 Chris Elston         2012-04-29  529  		if (addr_len < SIN6_LEN_RFC2133)
a32e0eec7042b2 Chris Elston         2012-04-29  530  			return -EINVAL;
a32e0eec7042b2 Chris Elston         2012-04-29  531  
a32e0eec7042b2 Chris Elston         2012-04-29  532  		if (lsa->l2tp_family && lsa->l2tp_family != AF_INET6)
a32e0eec7042b2 Chris Elston         2012-04-29  533  			return -EAFNOSUPPORT;
a32e0eec7042b2 Chris Elston         2012-04-29  534  
a32e0eec7042b2 Chris Elston         2012-04-29  535  		daddr = &lsa->l2tp_addr;
a32e0eec7042b2 Chris Elston         2012-04-29  536  		if (np->sndflow) {
a32e0eec7042b2 Chris Elston         2012-04-29  537  			fl6.flowlabel = lsa->l2tp_flowinfo & IPV6_FLOWINFO_MASK;
a32e0eec7042b2 Chris Elston         2012-04-29  538  			if (fl6.flowlabel & IPV6_FLOWLABEL_MASK) {
a32e0eec7042b2 Chris Elston         2012-04-29  539  				flowlabel = fl6_sock_lookup(sk, fl6.flowlabel);
59c820b2317f0f Willem de Bruijn     2019-07-07  540  				if (IS_ERR(flowlabel))
a32e0eec7042b2 Chris Elston         2012-04-29  541  					return -EINVAL;
a32e0eec7042b2 Chris Elston         2012-04-29  542  			}
a32e0eec7042b2 Chris Elston         2012-04-29  543  		}
a32e0eec7042b2 Chris Elston         2012-04-29  544  
20dcb1107ab1a3 Tom Parkin           2020-07-22  545  		/* Otherwise it will be difficult to maintain
a32e0eec7042b2 Chris Elston         2012-04-29  546  		 * sk->sk_dst_cache.
a32e0eec7042b2 Chris Elston         2012-04-29  547  		 */
a32e0eec7042b2 Chris Elston         2012-04-29  548  		if (sk->sk_state == TCP_ESTABLISHED &&
efe4208f47f907 Eric Dumazet         2013-10-03  549  		    ipv6_addr_equal(daddr, &sk->sk_v6_daddr))
efe4208f47f907 Eric Dumazet         2013-10-03  550  			daddr = &sk->sk_v6_daddr;
a32e0eec7042b2 Chris Elston         2012-04-29  551  
a32e0eec7042b2 Chris Elston         2012-04-29  552  		if (addr_len >= sizeof(struct sockaddr_in6) &&
a32e0eec7042b2 Chris Elston         2012-04-29  553  		    lsa->l2tp_scope_id &&
a32e0eec7042b2 Chris Elston         2012-04-29  554  		    ipv6_addr_type(daddr) & IPV6_ADDR_LINKLOCAL)
a32e0eec7042b2 Chris Elston         2012-04-29  555  			fl6.flowi6_oif = lsa->l2tp_scope_id;
a32e0eec7042b2 Chris Elston         2012-04-29  556  	} else {
a32e0eec7042b2 Chris Elston         2012-04-29  557  		if (sk->sk_state != TCP_ESTABLISHED)
a32e0eec7042b2 Chris Elston         2012-04-29  558  			return -EDESTADDRREQ;
a32e0eec7042b2 Chris Elston         2012-04-29  559  
efe4208f47f907 Eric Dumazet         2013-10-03  560  		daddr = &sk->sk_v6_daddr;
a32e0eec7042b2 Chris Elston         2012-04-29  561  		fl6.flowlabel = np->flow_label;
a32e0eec7042b2 Chris Elston         2012-04-29  562  	}
a32e0eec7042b2 Chris Elston         2012-04-29  563  
a32e0eec7042b2 Chris Elston         2012-04-29  564  	if (fl6.flowi6_oif == 0)
ff0094030f146b Eric Dumazet         2022-05-13  565  		fl6.flowi6_oif = READ_ONCE(sk->sk_bound_dev_if);
a32e0eec7042b2 Chris Elston         2012-04-29  566  
a32e0eec7042b2 Chris Elston         2012-04-29  567  	if (msg->msg_controllen) {
a32e0eec7042b2 Chris Elston         2012-04-29  568  		opt = &opt_space;
a32e0eec7042b2 Chris Elston         2012-04-29  569  		memset(opt, 0, sizeof(struct ipv6_txoptions));
a32e0eec7042b2 Chris Elston         2012-04-29  570  		opt->tot_len = sizeof(struct ipv6_txoptions);
26879da58711aa Wei Wang             2016-05-02  571  		ipc6.opt = opt;
a32e0eec7042b2 Chris Elston         2012-04-29  572  
5fdaa88dfefa87 Willem de Bruijn     2018-07-06  573  		err = ip6_datagram_send_ctl(sock_net(sk), sk, msg, &fl6, &ipc6);
a32e0eec7042b2 Chris Elston         2012-04-29  574  		if (err < 0) {
a32e0eec7042b2 Chris Elston         2012-04-29  575  			fl6_sock_release(flowlabel);
a32e0eec7042b2 Chris Elston         2012-04-29  576  			return err;
a32e0eec7042b2 Chris Elston         2012-04-29  577  		}
a32e0eec7042b2 Chris Elston         2012-04-29  578  		if ((fl6.flowlabel & IPV6_FLOWLABEL_MASK) && !flowlabel) {
a32e0eec7042b2 Chris Elston         2012-04-29  579  			flowlabel = fl6_sock_lookup(sk, fl6.flowlabel);
59c820b2317f0f Willem de Bruijn     2019-07-07  580  			if (IS_ERR(flowlabel))
a32e0eec7042b2 Chris Elston         2012-04-29  581  				return -EINVAL;
a32e0eec7042b2 Chris Elston         2012-04-29  582  		}
a32e0eec7042b2 Chris Elston         2012-04-29  583  		if (!(opt->opt_nflen | opt->opt_flen))
a32e0eec7042b2 Chris Elston         2012-04-29  584  			opt = NULL;
a32e0eec7042b2 Chris Elston         2012-04-29  585  	}
a32e0eec7042b2 Chris Elston         2012-04-29  586  
45f6fad84cc305 Eric Dumazet         2015-11-29  587  	if (!opt) {
45f6fad84cc305 Eric Dumazet         2015-11-29  588  		opt = txopt_get(np);
45f6fad84cc305 Eric Dumazet         2015-11-29  589  		opt_to_free = opt;
45f6fad84cc305 Eric Dumazet         2015-11-29  590  	}
a32e0eec7042b2 Chris Elston         2012-04-29  591  	if (flowlabel)
a32e0eec7042b2 Chris Elston         2012-04-29  592  		opt = fl6_merge_options(&opt_space, flowlabel, opt);
a32e0eec7042b2 Chris Elston         2012-04-29  593  	opt = ipv6_fixup_options(&opt_space, opt);
26879da58711aa Wei Wang             2016-05-02  594  	ipc6.opt = opt;
a32e0eec7042b2 Chris Elston         2012-04-29  595  
a32e0eec7042b2 Chris Elston         2012-04-29  596  	fl6.flowi6_proto = sk->sk_protocol;
a32e0eec7042b2 Chris Elston         2012-04-29  597  	if (!ipv6_addr_any(daddr))
a32e0eec7042b2 Chris Elston         2012-04-29  598  		fl6.daddr = *daddr;
a32e0eec7042b2 Chris Elston         2012-04-29  599  	else
a32e0eec7042b2 Chris Elston         2012-04-29  600  		fl6.daddr.s6_addr[15] = 0x1; /* :: means loopback (BSD'ism) */
a32e0eec7042b2 Chris Elston         2012-04-29  601  	if (ipv6_addr_any(&fl6.saddr) && !ipv6_addr_any(&np->saddr))
a32e0eec7042b2 Chris Elston         2012-04-29  602  		fl6.saddr = np->saddr;
a32e0eec7042b2 Chris Elston         2012-04-29  603  
a32e0eec7042b2 Chris Elston         2012-04-29  604  	final_p = fl6_update_dst(&fl6, opt, &final);
a32e0eec7042b2 Chris Elston         2012-04-29  605  
a32e0eec7042b2 Chris Elston         2012-04-29  606  	if (!fl6.flowi6_oif && ipv6_addr_is_multicast(&fl6.daddr))
a32e0eec7042b2 Chris Elston         2012-04-29  607  		fl6.flowi6_oif = np->mcast_oif;
a32e0eec7042b2 Chris Elston         2012-04-29  608  	else if (!fl6.flowi6_oif)
a32e0eec7042b2 Chris Elston         2012-04-29  609  		fl6.flowi6_oif = np->ucast_oif;
a32e0eec7042b2 Chris Elston         2012-04-29  610  
3df98d79215ace Paul Moore           2020-09-27  611  	security_sk_classify_flow(sk, flowi6_to_flowi_common(&fl6));
a32e0eec7042b2 Chris Elston         2012-04-29  612  
38b7097b55b6cf Hannes Frederic Sowa 2016-06-11  613  	if (ipc6.tclass < 0)
38b7097b55b6cf Hannes Frederic Sowa 2016-06-11  614  		ipc6.tclass = np->tclass;
38b7097b55b6cf Hannes Frederic Sowa 2016-06-11  615  
38b7097b55b6cf Hannes Frederic Sowa 2016-06-11  616  	fl6.flowlabel = ip6_make_flowinfo(ipc6.tclass, fl6.flowlabel);
38b7097b55b6cf Hannes Frederic Sowa 2016-06-11  617  
c4e85f73afb638 Sabrina Dubroca      2019-12-04  618  	dst = ip6_dst_lookup_flow(sock_net(sk), sk, &fl6, final_p);
a32e0eec7042b2 Chris Elston         2012-04-29  619  	if (IS_ERR(dst)) {
a32e0eec7042b2 Chris Elston         2012-04-29  620  		err = PTR_ERR(dst);
a32e0eec7042b2 Chris Elston         2012-04-29  621  		goto out;
a32e0eec7042b2 Chris Elston         2012-04-29  622  	}
a32e0eec7042b2 Chris Elston         2012-04-29  623  
26879da58711aa Wei Wang             2016-05-02  624  	if (ipc6.hlimit < 0)
26879da58711aa Wei Wang             2016-05-02  625  		ipc6.hlimit = ip6_sk_dst_hoplimit(np, &fl6, dst);
a32e0eec7042b2 Chris Elston         2012-04-29  626  
26879da58711aa Wei Wang             2016-05-02  627  	if (ipc6.dontfrag < 0)
26879da58711aa Wei Wang             2016-05-02  628  		ipc6.dontfrag = np->dontfrag;
a32e0eec7042b2 Chris Elston         2012-04-29  629  
a32e0eec7042b2 Chris Elston         2012-04-29  630  	if (msg->msg_flags & MSG_CONFIRM)
a32e0eec7042b2 Chris Elston         2012-04-29  631  		goto do_confirm;
a32e0eec7042b2 Chris Elston         2012-04-29  632  
a32e0eec7042b2 Chris Elston         2012-04-29  633  back_from_confirm:
a32e0eec7042b2 Chris Elston         2012-04-29  634  	lock_sock(sk);
f69e6d131f5dac Al Viro              2014-11-24 @635  	err = ip6_append_data(sk, ip_generic_getfrag, msg,
26879da58711aa Wei Wang             2016-05-02  636  			      ulen, transhdrlen, &ipc6,
a32e0eec7042b2 Chris Elston         2012-04-29  637  			      &fl6, (struct rt6_info *)dst,
5fdaa88dfefa87 Willem de Bruijn     2018-07-06  638  			      msg->msg_flags);
a32e0eec7042b2 Chris Elston         2012-04-29  639  	if (err)
a32e0eec7042b2 Chris Elston         2012-04-29  640  		ip6_flush_pending_frames(sk);
a32e0eec7042b2 Chris Elston         2012-04-29  641  	else if (!(msg->msg_flags & MSG_MORE))
a32e0eec7042b2 Chris Elston         2012-04-29  642  		err = l2tp_ip6_push_pending_frames(sk);
a32e0eec7042b2 Chris Elston         2012-04-29  643  	release_sock(sk);
a32e0eec7042b2 Chris Elston         2012-04-29  644  done:
a32e0eec7042b2 Chris Elston         2012-04-29  645  	dst_release(dst);
a32e0eec7042b2 Chris Elston         2012-04-29  646  out:
a32e0eec7042b2 Chris Elston         2012-04-29  647  	fl6_sock_release(flowlabel);
45f6fad84cc305 Eric Dumazet         2015-11-29  648  	txopt_put(opt_to_free);
a32e0eec7042b2 Chris Elston         2012-04-29  649  
a32e0eec7042b2 Chris Elston         2012-04-29  650  	return err < 0 ? err : len;
a32e0eec7042b2 Chris Elston         2012-04-29  651  
a32e0eec7042b2 Chris Elston         2012-04-29  652  do_confirm:
0dec879f636f11 Julian Anastasov     2017-02-06  653  	if (msg->msg_flags & MSG_PROBE)
0dec879f636f11 Julian Anastasov     2017-02-06  654  		dst_confirm_neigh(dst, &fl6.daddr);
a32e0eec7042b2 Chris Elston         2012-04-29  655  	if (!(msg->msg_flags & MSG_PROBE) || len)
a32e0eec7042b2 Chris Elston         2012-04-29  656  		goto back_from_confirm;
a32e0eec7042b2 Chris Elston         2012-04-29  657  	err = 0;
a32e0eec7042b2 Chris Elston         2012-04-29  658  	goto done;
a32e0eec7042b2 Chris Elston         2012-04-29  659  }
a32e0eec7042b2 Chris Elston         2012-04-29  660  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests


* Re: [RFC PATCH 2/3] ip: Make __ip{,6}_append_data() and co. take a msghdr*
  2023-03-22 13:56           ` [RFC PATCH 2/3] ip: Make __ip{,6}_append_data() and co. take a msghdr* David Howells
                               ` (2 preceding siblings ...)
  2023-03-23  1:25             ` kernel test robot
@ 2023-03-23  1:25             ` kernel test robot
  3 siblings, 0 replies; 81+ messages in thread
From: kernel test robot @ 2023-03-23  1:25 UTC (permalink / raw)
  To: David Howells; +Cc: oe-kbuild-all

Hi David,

[FYI, it's a private test report for your RFC patch.]
[auto build test ERROR on linus/master]
[also build test ERROR on v6.3-rc3]
[cannot apply to herbert-cryptodev-2.6/master herbert-crypto-2.6/master bluetooth-next/master bluetooth/master next-20230322]
[If your patch is applied to the wrong git tree, kindly drop us a note.
When submitting a patch, we suggest using '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/David-Howells/ip-Make-__ip-6-_append_data-and-co-take-a-msghdr/20230322-225741
patch link:    https://lore.kernel.org/r/20230322135612.3265850-3-dhowells%40redhat.com
patch subject: [RFC PATCH 2/3] ip: Make __ip{,6}_append_data() and co. take a msghdr*
config: x86_64-rhel-8.3 (https://download.01.org/0day-ci/archive/20230323/202303230918.CpQpgPsA-lkp@intel.com/config)
compiler: gcc-11 (Debian 11.3.0-8) 11.3.0
reproduce (this is a W=1 build):
        # https://github.com/intel-lab-lkp/linux/commit/eecac0727821eaf716a8600550bf68f21ead4b87
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review David-Howells/ip-Make-__ip-6-_append_data-and-co-take-a-msghdr/20230322-225741
        git checkout eecac0727821eaf716a8600550bf68f21ead4b87
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        make W=1 O=build_dir ARCH=x86_64 olddefconfig
        make W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash lib// net/

If you fix the issue, kindly add the following tags where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202303230918.CpQpgPsA-lkp@intel.com/

All errors (new ones prefixed by >>):

   net/l2tp/l2tp_ip6.c: In function 'l2tp_ip6_sendmsg':
>> net/l2tp/l2tp_ip6.c:635:35: error: passing argument 2 of 'ip6_append_data' from incompatible pointer type [-Werror=incompatible-pointer-types]
     635 |         err = ip6_append_data(sk, ip_generic_getfrag, msg,
         |                                   ^~~~~~~~~~~~~~~~~~
         |                                   |
         |                                   int (*)(struct msghdr *, void *, char *, int,  int,  int,  struct sk_buff *)
   In file included from include/net/inetpeer.h:16,
                    from include/net/route.h:24,
                    from include/net/ip.h:30,
                    from net/l2tp/l2tp_ip6.c:18:
   include/net/ipv6.h:1100:53: note: expected 'struct msghdr *' but argument is of type 'int (*)(struct msghdr *, void *, char *, int,  int,  int,  struct sk_buff *)'
    1100 | int ip6_append_data(struct sock *sk, struct msghdr *msg,
         |                                      ~~~~~~~~~~~~~~~^~~
   net/l2tp/l2tp_ip6.c:635:55: error: passing argument 3 of 'ip6_append_data' from incompatible pointer type [-Werror=incompatible-pointer-types]
     635 |         err = ip6_append_data(sk, ip_generic_getfrag, msg,
         |                                                       ^~~
         |                                                       |
         |                                                       struct msghdr *
   In file included from include/net/inetpeer.h:16,
                    from include/net/route.h:24,
                    from include/net/ip.h:30,
                    from net/l2tp/l2tp_ip6.c:18:
   include/net/ipv6.h:1101:34: note: expected 'ip_getfrag_t' but argument is of type 'struct msghdr *'
    1101 |                     ip_getfrag_t getfrag, void *from, int transhdrlen,
         |                     ~~~~~~~~~~~~~^~~~~~~
   net/l2tp/l2tp_ip6.c:636:31: warning: passing argument 4 of 'ip6_append_data' makes pointer from integer without a cast [-Wint-conversion]
     636 |                               ulen, transhdrlen, &ipc6,
         |                               ^~~~
         |                               |
         |                               int
   In file included from include/net/inetpeer.h:16,
                    from include/net/route.h:24,
                    from include/net/ip.h:30,
                    from net/l2tp/l2tp_ip6.c:18:
   include/net/ipv6.h:1101:49: note: expected 'void *' but argument is of type 'int'
    1101 |                     ip_getfrag_t getfrag, void *from, int transhdrlen,
         |                                           ~~~~~~^~~~
>> net/l2tp/l2tp_ip6.c:635:15: error: too many arguments to function 'ip6_append_data'
     635 |         err = ip6_append_data(sk, ip_generic_getfrag, msg,
         |               ^~~~~~~~~~~~~~~
   In file included from include/net/inetpeer.h:16,
                    from include/net/route.h:24,
                    from include/net/ip.h:30,
                    from net/l2tp/l2tp_ip6.c:18:
   include/net/ipv6.h:1100:5: note: declared here
    1100 | int ip6_append_data(struct sock *sk, struct msghdr *msg,
         |     ^~~~~~~~~~~~~~~
   cc1: some warnings being treated as errors


vim +/ip6_append_data +635 net/l2tp/l2tp_ip6.c

a32e0eec7042b2 Chris Elston         2012-04-29  487  
a32e0eec7042b2 Chris Elston         2012-04-29  488  /* Userspace will call sendmsg() on the tunnel socket to send L2TP
a32e0eec7042b2 Chris Elston         2012-04-29  489   * control frames.
a32e0eec7042b2 Chris Elston         2012-04-29  490   */
cee416e2c19501 David Howells        2023-03-22  491  static int l2tp_ip6_sendmsg(struct sock *sk, struct msghdr *msg)
a32e0eec7042b2 Chris Elston         2012-04-29  492  {
a32e0eec7042b2 Chris Elston         2012-04-29  493  	struct ipv6_txoptions opt_space;
342dfc306fb321 Steffen Hurrle       2014-01-17  494  	DECLARE_SOCKADDR(struct sockaddr_l2tpip6 *, lsa, msg->msg_name);
a32e0eec7042b2 Chris Elston         2012-04-29  495  	struct in6_addr *daddr, *final_p, final;
a32e0eec7042b2 Chris Elston         2012-04-29  496  	struct ipv6_pinfo *np = inet6_sk(sk);
45f6fad84cc305 Eric Dumazet         2015-11-29  497  	struct ipv6_txoptions *opt_to_free = NULL;
a32e0eec7042b2 Chris Elston         2012-04-29  498  	struct ipv6_txoptions *opt = NULL;
a32e0eec7042b2 Chris Elston         2012-04-29  499  	struct ip6_flowlabel *flowlabel = NULL;
a32e0eec7042b2 Chris Elston         2012-04-29  500  	struct dst_entry *dst = NULL;
a32e0eec7042b2 Chris Elston         2012-04-29  501  	struct flowi6 fl6;
26879da58711aa Wei Wang             2016-05-02  502  	struct ipcm6_cookie ipc6;
cee416e2c19501 David Howells        2023-03-22  503  	size_t len = msg_data_left(msg);
a32e0eec7042b2 Chris Elston         2012-04-29  504  	int addr_len = msg->msg_namelen;
a32e0eec7042b2 Chris Elston         2012-04-29  505  	int transhdrlen = 4; /* zero session-id */
f638a84afef3df Wang Yufen           2022-06-07  506  	int ulen;
a32e0eec7042b2 Chris Elston         2012-04-29  507  	int err;
a32e0eec7042b2 Chris Elston         2012-04-29  508  
a32e0eec7042b2 Chris Elston         2012-04-29  509  	/* Rough check on arithmetic overflow,
20dcb1107ab1a3 Tom Parkin           2020-07-22  510  	 * better check is made in ip6_append_data().
a32e0eec7042b2 Chris Elston         2012-04-29  511  	 */
f638a84afef3df Wang Yufen           2022-06-07  512  	if (len > INT_MAX - transhdrlen)
a32e0eec7042b2 Chris Elston         2012-04-29  513  		return -EMSGSIZE;
f638a84afef3df Wang Yufen           2022-06-07  514  	ulen = len + transhdrlen;
a32e0eec7042b2 Chris Elston         2012-04-29  515  
a32e0eec7042b2 Chris Elston         2012-04-29  516  	/* Mirror BSD error message compatibility */
a32e0eec7042b2 Chris Elston         2012-04-29  517  	if (msg->msg_flags & MSG_OOB)
a32e0eec7042b2 Chris Elston         2012-04-29  518  		return -EOPNOTSUPP;
a32e0eec7042b2 Chris Elston         2012-04-29  519  
20dcb1107ab1a3 Tom Parkin           2020-07-22  520  	/* Get and verify the address */
a32e0eec7042b2 Chris Elston         2012-04-29  521  	memset(&fl6, 0, sizeof(fl6));
a32e0eec7042b2 Chris Elston         2012-04-29  522  
a32e0eec7042b2 Chris Elston         2012-04-29  523  	fl6.flowi6_mark = sk->sk_mark;
e2d118a1cb5e60 Lorenzo Colitti      2016-11-04  524  	fl6.flowi6_uid = sk->sk_uid;
a32e0eec7042b2 Chris Elston         2012-04-29  525  
b515430ac9c25d Willem de Bruijn     2018-07-06  526  	ipcm6_init(&ipc6);
26879da58711aa Wei Wang             2016-05-02  527  
a32e0eec7042b2 Chris Elston         2012-04-29  528  	if (lsa) {
a32e0eec7042b2 Chris Elston         2012-04-29  529  		if (addr_len < SIN6_LEN_RFC2133)
a32e0eec7042b2 Chris Elston         2012-04-29  530  			return -EINVAL;
a32e0eec7042b2 Chris Elston         2012-04-29  531  
a32e0eec7042b2 Chris Elston         2012-04-29  532  		if (lsa->l2tp_family && lsa->l2tp_family != AF_INET6)
a32e0eec7042b2 Chris Elston         2012-04-29  533  			return -EAFNOSUPPORT;
a32e0eec7042b2 Chris Elston         2012-04-29  534  
a32e0eec7042b2 Chris Elston         2012-04-29  535  		daddr = &lsa->l2tp_addr;
a32e0eec7042b2 Chris Elston         2012-04-29  536  		if (np->sndflow) {
a32e0eec7042b2 Chris Elston         2012-04-29  537  			fl6.flowlabel = lsa->l2tp_flowinfo & IPV6_FLOWINFO_MASK;
a32e0eec7042b2 Chris Elston         2012-04-29  538  			if (fl6.flowlabel & IPV6_FLOWLABEL_MASK) {
a32e0eec7042b2 Chris Elston         2012-04-29  539  				flowlabel = fl6_sock_lookup(sk, fl6.flowlabel);
59c820b2317f0f Willem de Bruijn     2019-07-07  540  				if (IS_ERR(flowlabel))
a32e0eec7042b2 Chris Elston         2012-04-29  541  					return -EINVAL;
a32e0eec7042b2 Chris Elston         2012-04-29  542  			}
a32e0eec7042b2 Chris Elston         2012-04-29  543  		}
a32e0eec7042b2 Chris Elston         2012-04-29  544  
20dcb1107ab1a3 Tom Parkin           2020-07-22  545  		/* Otherwise it will be difficult to maintain
a32e0eec7042b2 Chris Elston         2012-04-29  546  		 * sk->sk_dst_cache.
a32e0eec7042b2 Chris Elston         2012-04-29  547  		 */
a32e0eec7042b2 Chris Elston         2012-04-29  548  		if (sk->sk_state == TCP_ESTABLISHED &&
efe4208f47f907 Eric Dumazet         2013-10-03  549  		    ipv6_addr_equal(daddr, &sk->sk_v6_daddr))
efe4208f47f907 Eric Dumazet         2013-10-03  550  			daddr = &sk->sk_v6_daddr;
a32e0eec7042b2 Chris Elston         2012-04-29  551  
a32e0eec7042b2 Chris Elston         2012-04-29  552  		if (addr_len >= sizeof(struct sockaddr_in6) &&
a32e0eec7042b2 Chris Elston         2012-04-29  553  		    lsa->l2tp_scope_id &&
a32e0eec7042b2 Chris Elston         2012-04-29  554  		    ipv6_addr_type(daddr) & IPV6_ADDR_LINKLOCAL)
a32e0eec7042b2 Chris Elston         2012-04-29  555  			fl6.flowi6_oif = lsa->l2tp_scope_id;
a32e0eec7042b2 Chris Elston         2012-04-29  556  	} else {
a32e0eec7042b2 Chris Elston         2012-04-29  557  		if (sk->sk_state != TCP_ESTABLISHED)
a32e0eec7042b2 Chris Elston         2012-04-29  558  			return -EDESTADDRREQ;
a32e0eec7042b2 Chris Elston         2012-04-29  559  
efe4208f47f907 Eric Dumazet         2013-10-03  560  		daddr = &sk->sk_v6_daddr;
a32e0eec7042b2 Chris Elston         2012-04-29  561  		fl6.flowlabel = np->flow_label;
a32e0eec7042b2 Chris Elston         2012-04-29  562  	}
a32e0eec7042b2 Chris Elston         2012-04-29  563  
a32e0eec7042b2 Chris Elston         2012-04-29  564  	if (fl6.flowi6_oif == 0)
ff0094030f146b Eric Dumazet         2022-05-13  565  		fl6.flowi6_oif = READ_ONCE(sk->sk_bound_dev_if);
a32e0eec7042b2 Chris Elston         2012-04-29  566  
a32e0eec7042b2 Chris Elston         2012-04-29  567  	if (msg->msg_controllen) {
a32e0eec7042b2 Chris Elston         2012-04-29  568  		opt = &opt_space;
a32e0eec7042b2 Chris Elston         2012-04-29  569  		memset(opt, 0, sizeof(struct ipv6_txoptions));
a32e0eec7042b2 Chris Elston         2012-04-29  570  		opt->tot_len = sizeof(struct ipv6_txoptions);
26879da58711aa Wei Wang             2016-05-02  571  		ipc6.opt = opt;
a32e0eec7042b2 Chris Elston         2012-04-29  572  
5fdaa88dfefa87 Willem de Bruijn     2018-07-06  573  		err = ip6_datagram_send_ctl(sock_net(sk), sk, msg, &fl6, &ipc6);
a32e0eec7042b2 Chris Elston         2012-04-29  574  		if (err < 0) {
a32e0eec7042b2 Chris Elston         2012-04-29  575  			fl6_sock_release(flowlabel);
a32e0eec7042b2 Chris Elston         2012-04-29  576  			return err;
a32e0eec7042b2 Chris Elston         2012-04-29  577  		}
a32e0eec7042b2 Chris Elston         2012-04-29  578  		if ((fl6.flowlabel & IPV6_FLOWLABEL_MASK) && !flowlabel) {
a32e0eec7042b2 Chris Elston         2012-04-29  579  			flowlabel = fl6_sock_lookup(sk, fl6.flowlabel);
59c820b2317f0f Willem de Bruijn     2019-07-07  580  			if (IS_ERR(flowlabel))
a32e0eec7042b2 Chris Elston         2012-04-29  581  				return -EINVAL;
a32e0eec7042b2 Chris Elston         2012-04-29  582  		}
a32e0eec7042b2 Chris Elston         2012-04-29  583  		if (!(opt->opt_nflen | opt->opt_flen))
a32e0eec7042b2 Chris Elston         2012-04-29  584  			opt = NULL;
a32e0eec7042b2 Chris Elston         2012-04-29  585  	}
a32e0eec7042b2 Chris Elston         2012-04-29  586  
45f6fad84cc305 Eric Dumazet         2015-11-29  587  	if (!opt) {
45f6fad84cc305 Eric Dumazet         2015-11-29  588  		opt = txopt_get(np);
45f6fad84cc305 Eric Dumazet         2015-11-29  589  		opt_to_free = opt;
45f6fad84cc305 Eric Dumazet         2015-11-29  590  	}
a32e0eec7042b2 Chris Elston         2012-04-29  591  	if (flowlabel)
a32e0eec7042b2 Chris Elston         2012-04-29  592  		opt = fl6_merge_options(&opt_space, flowlabel, opt);
a32e0eec7042b2 Chris Elston         2012-04-29  593  	opt = ipv6_fixup_options(&opt_space, opt);
26879da58711aa Wei Wang             2016-05-02  594  	ipc6.opt = opt;
a32e0eec7042b2 Chris Elston         2012-04-29  595  
a32e0eec7042b2 Chris Elston         2012-04-29  596  	fl6.flowi6_proto = sk->sk_protocol;
a32e0eec7042b2 Chris Elston         2012-04-29  597  	if (!ipv6_addr_any(daddr))
a32e0eec7042b2 Chris Elston         2012-04-29  598  		fl6.daddr = *daddr;
a32e0eec7042b2 Chris Elston         2012-04-29  599  	else
a32e0eec7042b2 Chris Elston         2012-04-29  600  		fl6.daddr.s6_addr[15] = 0x1; /* :: means loopback (BSD'ism) */
a32e0eec7042b2 Chris Elston         2012-04-29  601  	if (ipv6_addr_any(&fl6.saddr) && !ipv6_addr_any(&np->saddr))
a32e0eec7042b2 Chris Elston         2012-04-29  602  		fl6.saddr = np->saddr;
a32e0eec7042b2 Chris Elston         2012-04-29  603  
a32e0eec7042b2 Chris Elston         2012-04-29  604  	final_p = fl6_update_dst(&fl6, opt, &final);
a32e0eec7042b2 Chris Elston         2012-04-29  605  
a32e0eec7042b2 Chris Elston         2012-04-29  606  	if (!fl6.flowi6_oif && ipv6_addr_is_multicast(&fl6.daddr))
a32e0eec7042b2 Chris Elston         2012-04-29  607  		fl6.flowi6_oif = np->mcast_oif;
a32e0eec7042b2 Chris Elston         2012-04-29  608  	else if (!fl6.flowi6_oif)
a32e0eec7042b2 Chris Elston         2012-04-29  609  		fl6.flowi6_oif = np->ucast_oif;
a32e0eec7042b2 Chris Elston         2012-04-29  610  
3df98d79215ace Paul Moore           2020-09-27  611  	security_sk_classify_flow(sk, flowi6_to_flowi_common(&fl6));
a32e0eec7042b2 Chris Elston         2012-04-29  612  
38b7097b55b6cf Hannes Frederic Sowa 2016-06-11  613  	if (ipc6.tclass < 0)
38b7097b55b6cf Hannes Frederic Sowa 2016-06-11  614  		ipc6.tclass = np->tclass;
38b7097b55b6cf Hannes Frederic Sowa 2016-06-11  615  
38b7097b55b6cf Hannes Frederic Sowa 2016-06-11  616  	fl6.flowlabel = ip6_make_flowinfo(ipc6.tclass, fl6.flowlabel);
38b7097b55b6cf Hannes Frederic Sowa 2016-06-11  617  
c4e85f73afb638 Sabrina Dubroca      2019-12-04  618  	dst = ip6_dst_lookup_flow(sock_net(sk), sk, &fl6, final_p);
a32e0eec7042b2 Chris Elston         2012-04-29  619  	if (IS_ERR(dst)) {
a32e0eec7042b2 Chris Elston         2012-04-29  620  		err = PTR_ERR(dst);
a32e0eec7042b2 Chris Elston         2012-04-29  621  		goto out;
a32e0eec7042b2 Chris Elston         2012-04-29  622  	}
a32e0eec7042b2 Chris Elston         2012-04-29  623  
26879da58711aa Wei Wang             2016-05-02  624  	if (ipc6.hlimit < 0)
26879da58711aa Wei Wang             2016-05-02  625  		ipc6.hlimit = ip6_sk_dst_hoplimit(np, &fl6, dst);
a32e0eec7042b2 Chris Elston         2012-04-29  626  
26879da58711aa Wei Wang             2016-05-02  627  	if (ipc6.dontfrag < 0)
26879da58711aa Wei Wang             2016-05-02  628  		ipc6.dontfrag = np->dontfrag;
a32e0eec7042b2 Chris Elston         2012-04-29  629  
a32e0eec7042b2 Chris Elston         2012-04-29  630  	if (msg->msg_flags & MSG_CONFIRM)
a32e0eec7042b2 Chris Elston         2012-04-29  631  		goto do_confirm;
a32e0eec7042b2 Chris Elston         2012-04-29  632  
a32e0eec7042b2 Chris Elston         2012-04-29  633  back_from_confirm:
a32e0eec7042b2 Chris Elston         2012-04-29  634  	lock_sock(sk);
f69e6d131f5dac Al Viro              2014-11-24 @635  	err = ip6_append_data(sk, ip_generic_getfrag, msg,
26879da58711aa Wei Wang             2016-05-02  636  			      ulen, transhdrlen, &ipc6,
a32e0eec7042b2 Chris Elston         2012-04-29  637  			      &fl6, (struct rt6_info *)dst,
5fdaa88dfefa87 Willem de Bruijn     2018-07-06  638  			      msg->msg_flags);
a32e0eec7042b2 Chris Elston         2012-04-29  639  	if (err)
a32e0eec7042b2 Chris Elston         2012-04-29  640  		ip6_flush_pending_frames(sk);
a32e0eec7042b2 Chris Elston         2012-04-29  641  	else if (!(msg->msg_flags & MSG_MORE))
a32e0eec7042b2 Chris Elston         2012-04-29  642  		err = l2tp_ip6_push_pending_frames(sk);
a32e0eec7042b2 Chris Elston         2012-04-29  643  	release_sock(sk);
a32e0eec7042b2 Chris Elston         2012-04-29  644  done:
a32e0eec7042b2 Chris Elston         2012-04-29  645  	dst_release(dst);
a32e0eec7042b2 Chris Elston         2012-04-29  646  out:
a32e0eec7042b2 Chris Elston         2012-04-29  647  	fl6_sock_release(flowlabel);
45f6fad84cc305 Eric Dumazet         2015-11-29  648  	txopt_put(opt_to_free);
a32e0eec7042b2 Chris Elston         2012-04-29  649  
a32e0eec7042b2 Chris Elston         2012-04-29  650  	return err < 0 ? err : len;
a32e0eec7042b2 Chris Elston         2012-04-29  651  
a32e0eec7042b2 Chris Elston         2012-04-29  652  do_confirm:
0dec879f636f11 Julian Anastasov     2017-02-06  653  	if (msg->msg_flags & MSG_PROBE)
0dec879f636f11 Julian Anastasov     2017-02-06  654  		dst_confirm_neigh(dst, &fl6.daddr);
a32e0eec7042b2 Chris Elston         2012-04-29  655  	if (!(msg->msg_flags & MSG_PROBE) || len)
a32e0eec7042b2 Chris Elston         2012-04-29  656  		goto back_from_confirm;
a32e0eec7042b2 Chris Elston         2012-04-29  657  	err = 0;
a32e0eec7042b2 Chris Elston         2012-04-29  658  	goto done;
a32e0eec7042b2 Chris Elston         2012-04-29  659  }
a32e0eec7042b2 Chris Elston         2012-04-29  660  

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests


* Re: [RFC PATCH 23/28] algif: Remove hash_sendpage*()
  2023-03-17  2:40   ` Herbert Xu
@ 2023-03-24 16:47     ` David Howells
  2023-03-25  6:00       ` Herbert Xu
  2023-03-25  7:44       ` David Howells
  0 siblings, 2 replies; 81+ messages in thread
From: David Howells @ 2023-03-24 16:47 UTC (permalink / raw)
  To: Herbert Xu
  Cc: dhowells, willy, davem, edumazet, kuba, pabeni, viro, hch, axboe,
	jlayton, brauner, torvalds, netdev, linux-fsdevel, linux-kernel,
	linux-mm, linux-crypto

Herbert Xu <herbert@gondor.apana.org.au> wrote:

> David Howells <dhowells@redhat.com> wrote:
> > Remove hash_sendpage*() and use hash_sendmsg() as the latter seems to just
> > use the source pages directly anyway.
> 
> ...
> 
> > -       if (!(flags & MSG_MORE)) {
> > -               if (ctx->more)
> > -                       err = crypto_ahash_finup(&ctx->req);
> > -               else
> > -                       err = crypto_ahash_digest(&ctx->req);
> 
> You've just removed the optimised path from user-space to
> finup/digest.  You need to add them back to sendmsg if you
> want to eliminate sendpage.

I must be missing something, I think.  What's particularly optimal about the
code in hash_sendpage() but not hash_sendmsg()?  Is it that the former uses
finup/digest, but the latter only does update+final?

Also, looking at:

	if (!ctx->more) {
		if ((msg->msg_flags & MSG_MORE))
			hash_free_result(sk, ctx);

how is ctx->more meant to be interpreted?  I'm guessing it means that we're
continuing the previous op.  But why do we need to free any old result if
MSG_MORE is set, but not if it isn't?

David



* Re: [RFC PATCH 23/28] algif: Remove hash_sendpage*()
  2023-03-24 16:47     ` David Howells
@ 2023-03-25  6:00       ` Herbert Xu
  2023-03-25  7:44       ` David Howells
  1 sibling, 0 replies; 81+ messages in thread
From: Herbert Xu @ 2023-03-25  6:00 UTC (permalink / raw)
  To: David Howells
  Cc: willy, davem, edumazet, kuba, pabeni, viro, hch, axboe, jlayton,
	brauner, torvalds, netdev, linux-fsdevel, linux-kernel, linux-mm,
	linux-crypto

On Fri, Mar 24, 2023 at 04:47:50PM +0000, David Howells wrote:
>
> I must be missing something, I think.  What's particularly optimal about the
> code in hash_sendpage() but not hash_sendmsg()?  Is it that the former uses
> finup/digest, but the latter only does update+final?

A lot of hardware hashes can't perform partial updates, so they
will always fall back to software unless you use finup/digest.
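
To make the finup/digest point concrete, here is a minimal userspace sketch
of the operation choice, not the kernel code itself.  The !MSG_MORE branch
mirrors the hash_sendpage() lines quoted earlier in the thread; the MSG_MORE
branch (init on the first chunk, then update) is an assumption about the rest
of that function, and enum hash_op and hash_pick_op() are hypothetical names
used only for illustration.

```c
#include <stdbool.h>

/*
 * Hypothetical names for illustration only: the kernel calls the
 * crypto_ahash_*() functions directly rather than returning an enum.
 */
enum hash_op {
	HASH_OP_DIGEST,		/* whole message in one request */
	HASH_OP_FINUP,		/* last chunk: update + final in one request */
	HASH_OP_INIT_UPDATE,	/* first chunk of several (assumed branch) */
	HASH_OP_UPDATE,		/* middle chunk of several (assumed branch) */
};

/*
 * more_pending: an earlier send left the hash open (ctx->more).
 * msg_more:     the caller set MSG_MORE on this send.
 */
static enum hash_op hash_pick_op(bool more_pending, bool msg_more)
{
	/* No more data coming: finish in a single request. */
	if (!msg_more)
		return more_pending ? HASH_OP_FINUP : HASH_OP_DIGEST;

	/* More data coming: keep the hash open with a partial update. */
	return more_pending ? HASH_OP_UPDATE : HASH_OP_INIT_UPDATE;
}
```

In this sketch a single send without MSG_MORE maps to one digest request,
which is the case that, per the explanation above, does not depend on
partial updates being available.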

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


* Re: [RFC PATCH 23/28] algif: Remove hash_sendpage*()
  2023-03-24 16:47     ` David Howells
  2023-03-25  6:00       ` Herbert Xu
@ 2023-03-25  7:44       ` David Howells
  2023-03-25  9:21         ` Herbert Xu
  1 sibling, 1 reply; 81+ messages in thread
From: David Howells @ 2023-03-25  7:44 UTC (permalink / raw)
  To: Herbert Xu
  Cc: dhowells, willy, davem, edumazet, kuba, pabeni, viro, hch, axboe,
	jlayton, brauner, torvalds, netdev, linux-fsdevel, linux-kernel,
	linux-mm, linux-crypto

Herbert Xu <herbert@gondor.apana.org.au> wrote:

> > I must be missing something, I think.  What's particularly optimal about the
> > code in hash_sendpage() but not hash_sendmsg()?  Is it that the former uses
> > finup/digest, but the latter only does update+final?
> 
> A lot of hardware hashes can't perform partial updates, so they
> will always fall back to software unless you use finup/digest.

Okay.  Btw, how much of a hard limit is ALG_MAX_PAGES?  Multipage folios can
exceed the current limit (16 pages, 64K) in size.  Is it just to prevent too
much memory being pinned at once?

David



* Re: [RFC PATCH 23/28] algif: Remove hash_sendpage*()
  2023-03-25  7:44       ` David Howells
@ 2023-03-25  9:21         ` Herbert Xu
  0 siblings, 0 replies; 81+ messages in thread
From: Herbert Xu @ 2023-03-25  9:21 UTC (permalink / raw)
  To: David Howells
  Cc: willy, davem, edumazet, kuba, pabeni, viro, hch, axboe, jlayton,
	brauner, torvalds, netdev, linux-fsdevel, linux-kernel, linux-mm,
	linux-crypto

On Sat, Mar 25, 2023 at 07:44:14AM +0000, David Howells wrote:
>
> Okay.  Btw, how much of a hard limit is ALG_MAX_PAGES?  Multipage folios can
> exceed the current limit (16 pages, 64K) in size.  Is it just to prevent too
> much memory being pinned at once?

Yes, we don't want user-space to be able to pin an unlimited
amount of memory.
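
For illustration only, the limit arithmetic discussed above can be sketched
in plain C. This assumes 4 KiB pages and takes the limit of 16 pages (64K)
from the figures quoted in this thread. The sketch_* names are hypothetical
helpers, not kernel symbols, and this is not how af_alg itself is written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Assumptions for illustration only: 4 KiB pages and a limit of 16 pages,
 * as quoted in the thread (16 pages, 64K).  Everything with a sketch_ or
 * SKETCH_ prefix is hypothetical and is not a kernel symbol.
 */
#define SKETCH_PAGE_SIZE      4096u
#define SKETCH_ALG_MAX_PAGES  16u

/* Number of pages in a folio of the given order (2^order pages). */
static size_t sketch_folio_pages(unsigned int order)
{
	return (size_t)1 << order;
}

/* Upper bound on the bytes that can be pinned at once under the limit. */
static size_t sketch_max_pinned_bytes(void)
{
	return (size_t)SKETCH_ALG_MAX_PAGES * SKETCH_PAGE_SIZE;
}

/* True if a whole folio of this order fits within the page limit. */
static bool sketch_folio_fits(unsigned int order)
{
	return sketch_folio_pages(order) <= SKETCH_ALG_MAX_PAGES;
}
```

Under these assumptions an order-4 folio (16 pages) just fits, while an
order-5 folio (32 pages, 128K) exceeds the limit and would presumably have
to be handled in pieces of at most 16 pages each.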

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

end of thread, other threads:[~2023-03-25  9:21 UTC | newest]

Thread overview: 81+ messages
2023-03-16 15:25 [RFC PATCH 00/28] splice, net: Replace sendpage with sendmsg(MSG_SPLICE_PAGES) David Howells
2023-03-16 15:25 ` [RFC PATCH 01/28] net: Declare MSG_SPLICE_PAGES internal sendmsg() flag David Howells
2023-03-16 15:25 ` [RFC PATCH 02/28] Add a special allocator for staging netfs protocol to MSG_SPLICE_PAGES David Howells
2023-03-16 17:28   ` Matthew Wilcox
2023-03-16 18:00   ` David Howells
2023-03-16 15:25 ` [RFC PATCH 03/28] tcp: Support MSG_SPLICE_PAGES David Howells
2023-03-16 18:37   ` Willem de Bruijn
2023-03-16 18:44   ` David Howells
2023-03-16 19:00     ` Willem de Bruijn
2023-03-21  0:38     ` David Howells
2023-03-21 14:22       ` Willem de Bruijn
2023-03-22 13:56         ` [RFC PATCH 0/3] net: Drop size arg from ->sendmsg() and pass msghdr into __ip{,6}_append_data() David Howells
2023-03-22 13:56           ` [RFC PATCH 1/3] net: Drop the size argument from ->sendmsg() David Howells
2023-03-22 13:56             ` David Howells
2023-03-22 13:56             ` David Howells
2023-03-22 14:13             ` [RFC,1/3] " bluez.test.bot
2023-03-23  1:11             ` bluez.test.bot
2023-03-22 13:56           ` [RFC PATCH 2/3] ip: Make __ip{,6}_append_data() and co. take a msghdr* David Howells
2023-03-22 17:25             ` kernel test robot
2023-03-22 22:12             ` kernel test robot
2023-03-23  1:25             ` kernel test robot
2023-03-23  1:25             ` kernel test robot
2023-03-22 13:56           ` [RFC PATCH 3/3] net: Declare MSG_SPLICE_PAGES internal sendmsg() flag David Howells
2023-03-23  1:17           ` [RFC PATCH 0/3] net: Drop size arg from ->sendmsg() and pass msghdr into __ip{,6}_append_data() Willem de Bruijn
2023-03-16 15:25 ` [RFC PATCH 04/28] tcp: Convert do_tcp_sendpages() to use MSG_SPLICE_PAGES David Howells
2023-03-16 15:25 ` [RFC PATCH 05/28] tcp_bpf: Inline do_tcp_sendpages as it's now a wrapper around tcp_sendmsg David Howells
2023-03-16 15:25 ` [RFC PATCH 06/28] espintcp: Inline do_tcp_sendpages() David Howells
2023-03-16 15:25 ` [RFC PATCH 07/28] tls: " David Howells
2023-03-16 15:25 ` [RFC PATCH 08/28] siw: " David Howells
2023-03-20 10:53   ` Bernard Metzler
2023-03-20 11:08   ` David Howells
2023-03-20 12:27     ` Bernard Metzler
2023-03-20 13:13     ` David Howells
2023-03-20 13:18       ` Bernard Metzler
2023-03-16 15:25 ` [RFC PATCH 09/28] tcp: Fold do_tcp_sendpages() into tcp_sendpage_locked() David Howells
2023-03-16 15:26 ` [RFC PATCH 10/28] ip, udp: Support MSG_SPLICE_PAGES David Howells
2023-03-16 15:26 ` [RFC PATCH 11/28] udp: Convert udp_sendpage() to use MSG_SPLICE_PAGES David Howells
2023-03-16 15:26 ` [RFC PATCH 12/28] af_unix: Support MSG_SPLICE_PAGES David Howells
2023-03-16 15:26 ` [RFC PATCH 13/28] crypto: af_alg: Indent the loop in af_alg_sendmsg() David Howells
2023-03-16 15:26 ` [RFC PATCH 14/28] crypto: af_alg: Support MSG_SPLICE_PAGES David Howells
2023-03-16 15:26 ` [RFC PATCH 15/28] crypto: af_alg: Convert af_alg_sendpage() to use MSG_SPLICE_PAGES David Howells
2023-03-16 15:26 ` [RFC PATCH 16/28] splice, net: Use sendmsg(MSG_SPLICE_PAGES) rather than ->sendpage() David Howells
2023-03-16 15:26 ` [RFC PATCH 17/28] Remove file->f_op->sendpage David Howells
2023-03-16 15:26 ` [RFC PATCH 18/28] siw: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage to transmit David Howells
2023-03-20 13:39   ` Bernard Metzler
2023-03-16 15:26 ` [RFC PATCH 19/28] ceph: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage David Howells
2023-03-16 15:26 ` [RFC PATCH 20/28] iscsi: " David Howells
2023-03-16 15:26 ` [RFC PATCH 21/28] tcp_bpf: Make tcp_bpf_sendpage() go through tcp_bpf_sendmsg(MSG_SPLICE_PAGES) David Howells
2023-03-16 15:26 ` [RFC PATCH 22/28] net: Use sendmsg(MSG_SPLICE_PAGES) not sendpage in skb_send_sock() David Howells
2023-03-16 15:26 ` [RFC PATCH 23/28] algif: Remove hash_sendpage*() David Howells
2023-03-17  2:40   ` Herbert Xu
2023-03-24 16:47     ` David Howells
2023-03-25  6:00       ` Herbert Xu
2023-03-25  7:44       ` David Howells
2023-03-25  9:21         ` Herbert Xu
2023-03-16 15:26 ` [RFC PATCH 24/28] ceph: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage() David Howells
2023-03-16 15:26 ` [RFC PATCH 25/28] rds: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage David Howells
2023-03-16 15:26 ` [RFC PATCH 26/28] dlm: " David Howells
2023-03-16 15:26   ` [Cluster-devel] " David Howells
2023-03-16 15:26 ` [RFC PATCH 27/28] sunrpc: Use sendmsg(MSG_SPLICE_PAGES) rather then sendpage David Howells
2023-03-16 16:17   ` Trond Myklebust
2023-03-16 17:10     ` Chuck Lever III
2023-03-16 17:28     ` David Howells
2023-03-16 17:41       ` Chuck Lever III
2023-03-16 21:21     ` David Howells
2023-03-17 15:29       ` Chuck Lever III
2023-03-16 16:24   ` David Howells
2023-03-16 17:23     ` Trond Myklebust
2023-03-16 18:06     ` David Howells
2023-03-16 19:01       ` Trond Myklebust
2023-03-22 13:10       ` David Howells
2023-03-22 18:15       ` [RFC PATCH] iov_iter: Add an iterator-of-iterators David Howells
2023-03-22 18:47         ` Trond Myklebust
2023-03-22 18:49         ` Matthew Wilcox
2023-03-16 15:26 ` [RFC PATCH 28/28] sock: Remove ->sendpage*() in favour of sendmsg(MSG_SPLICE_PAGES) David Howells
2023-03-16 15:26   ` David Howells
2023-03-16 15:26   ` David Howells
2023-03-16 15:26   ` David Howells
2023-03-16 15:57   ` Marc Kleine-Budde
2023-03-16 15:57     ` Marc Kleine-Budde
2023-03-16 15:57     ` Marc Kleine-Budde
