[1/4] inet: Add skb_copy_datagram_iter
diff mbox series

Message ID E1XlZWY-0003HZ-Ef@gondolin.me.apana.org.au
State New, archived
Headers show
Series
  • [1/4] inet: Add skb_copy_datagram_iter
Related show

Commit Message

Herbert Xu Nov. 4, 2014, 8:31 a.m. UTC
This patch adds skb_copy_datagram_iter, which is identical to
skb_copy_datagram_iovec except that it operates on iov_iter
instead of iovec.

Eventually all users of skb_copy_datagram_iovec should switch
over to iov_iter and then we can remove skb_copy_datagram_iovec.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
---

 include/linux/skbuff.h |    3 +
 net/core/datagram.c    |   82 +++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 85 insertions(+)

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Comments

Al Viro Nov. 4, 2014, 2:32 p.m. UTC | #1
On Tue, Nov 04, 2014 at 04:31:34PM +0800, Herbert Xu wrote:
> This patch adds skb_copy_datagram_iter, which is identical to
> skb_copy_datagram_iovec except that it operates on iov_iter
> instead of iovec.
> 
> Eventually all users of skb_copy_datagram_iovec should switch
> over to iov_iter and then we can remove skb_copy_datagram_iovec.

Too noisy, IMO.  How about skb_copy_datagram_msg() first?  The fewer
places have to even think of iovec or iov_iter, the better...
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Al Viro Nov. 4, 2014, 2:35 p.m. UTC | #2
On Tue, Nov 04, 2014 at 02:32:00PM +0000, Al Viro wrote:
> On Tue, Nov 04, 2014 at 04:31:34PM +0800, Herbert Xu wrote:
> > This patch adds skb_copy_datagram_iter, which is identical to
> > skb_copy_datagram_iovec except that it operates on iov_iter
> > instead of iovec.
> > 
> > Eventually all users of skb_copy_datagram_iovec should switch
> > over to iov_iter and then we can remove skb_copy_datagram_iovec.
> 
> Too noisy, IMO.  How about skb_copy_datagram_msg() first?  The fewer
> places have to even think of iovec or iov_iter, the better...

PS: "too noisy" is about turning every callsite of skb_copy_datagram_iovec
into that of skb_copy_datagram_iter; the helper itself is just fine.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Herbert Xu Nov. 4, 2014, 2:42 p.m. UTC | #3
On Tue, Nov 04, 2014 at 02:32:00PM +0000, Al Viro wrote:
>
> Too noisy, IMO.  How about skb_copy_datagram_msg() first?  The fewer
> places have to even think of iovec or iov_iter, the better...

We have places like tcp ucopy and tun that do not have msghdr.
So doing skb_copy_datagram_msg means that we'd have to create
a fake msghdr wrapper around them.  The point is not everything
comes in via sendmsg/recvmsg.

What is your motivation for hiding iov/iov_iter? Do you plan to
change their API at some future point?

Cheers,
Herbert Xu Nov. 4, 2014, 2:44 p.m. UTC | #4
On Tue, Nov 04, 2014 at 02:35:36PM +0000, Al Viro wrote:
> On Tue, Nov 04, 2014 at 02:32:00PM +0000, Al Viro wrote:
>
> > Too noisy, IMO.  How about skb_copy_datagram_msg() first?  The fewer
> > places have to even think of iovec or iov_iter, the better...
> 
> PS: "too noisy" is about turning every callsite of skb_copy_datagram_iovec
> into that of skb_copy_datagram_iter; the helper itself is just fine.

Hmm if that is your concern then I don't see how skb_copy_datagram_msg
changes things as you'd still have to convert every existing caller
of skb_copy_datagram_iovec.  Colour me confused.

Cheers,
Al Viro Nov. 4, 2014, 2:52 p.m. UTC | #5
On Tue, Nov 04, 2014 at 10:44:16PM +0800, Herbert Xu wrote:
> On Tue, Nov 04, 2014 at 02:35:36PM +0000, Al Viro wrote:
> > On Tue, Nov 04, 2014 at 02:32:00PM +0000, Al Viro wrote:
> >
> > > Too noisy, IMO.  How about skb_copy_datagram_msg() first?  The fewer
> > > places have to even think of iovec or iov_iter, the better...
> > 
> > PS: "too noisy" is about turning every callsite of skb_copy_datagram_iovec
> > into that of skb_copy_datagram_iter; the helper itself is just fine.
> 
> Hmm if that is your concern then I don't see how skb_copy_datagram_msg
> changes things as you'd still have to convert every existing caller
> of skb_copy_datagram_iovec.  Colour me confused.

Fewer places having to even think of iovec/iov_iter...
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Herbert Xu Nov. 4, 2014, 2:55 p.m. UTC | #6
On Tue, Nov 04, 2014 at 02:52:22PM +0000, Al Viro wrote:
>
> > Hmm if that is your concern then I don't see how skb_copy_datagram_msg
> > changes things as you'd still have to convert every existing caller
> > of skb_copy_datagram_iovec.  Colour me confused.
> 
> Fewer places having to even think of iovec/iov_iter...

Well it's the difference between

	skb_copy_datagram_iter(..., &kmsghdr->iov_iter, ...)

and

	skb_copy_datagram_msg(..., kmsghdr, ...)

Heck we could even make skb_copy_datagram_msg an inline wrapper
around skb_copy_datagram_iter if you like.

Anyway, the point is that not everything comes with a kmsghdr.

Cheers,
Al Viro Nov. 4, 2014, 3:13 p.m. UTC | #7
On Tue, Nov 04, 2014 at 10:42:58PM +0800, Herbert Xu wrote:
> On Tue, Nov 04, 2014 at 02:32:00PM +0000, Al Viro wrote:
> >
> > Too noisy, IMO.  How about skb_copy_datagram_msg() first?  The fewer
> > places have to even think of iovec or iov_iter, the better...
> 
> We have places like tcp ucopy and tun that do not have msghdr.
> So doing skb_copy_datagram_msg means that we'd have to create
> a fake msghdr wrapper around them.  The point is not everything
> comes in via sendmsg/recvmsg.

I'm certainly not suggesting it as a primitive.

> What is your motivation for hiding iov/iov_iter? Do you plan to
> change their API at some future point?

Think of it that way: every sendmsg/recvmsg path leading to memcpy_fromiovec
and its friends (including the open-coded ones) would need to be changed
at some point.  Assuming we do not end up passing struct iov_iter * as
an extra argument through a fairly large part of net/* (and that would
be prohibitively hard and messy, not to mention the effects on the stack
footprint, etc.), the most obvious strategy is to have that thing passed
where msg_iov/msg_iovlen are - in struct msghdr.  *IF* we go that way,
it makes a whole lot of sense to start with a bunch of cleanups that
will make sense on their own (most of callers of skb_copy_datagram_iovec
do look like skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied); might
as well give it an inlined helper) and will reduce the amount of places
where ->msg_iov is used.  With such cleanups standing on their own and
being splittable from the rest of the queue.  And leaving us with fewer
places in code that deal with ->msg_iov and need to be dealt with.

Please, look through my yesterday posting upthread.  Outline of the
proposed strategy is there...

FWIW, this is from the beginning of April queue - rebased to current,
but very likely incomplete.  Variant taking iov_iter would come later
and yes, it would replace the ..._iovec() one as primitive.  With much
fewer places to worry about.

commit 8241142acab3451239029085286b717ca30aac33
Author: Al Viro <viro@zeniv.linux.org.uk>
Date:   Sun Apr 6 18:41:28 2014 -0400

    new helper: skb_copy_datagram_msg()
    
    Absolute majority of skb_copy_datagram_iovec() callers (49 out of 56)
    are passing it msg->msg_iov as iovec.  Provide a trivial wrapper that
    takes msg as argument instead of iovec.
    
    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

diff --git a/drivers/isdn/mISDN/socket.c b/drivers/isdn/mISDN/socket.c
index 1be8228..dcbd858 100644
--- a/drivers/isdn/mISDN/socket.c
+++ b/drivers/isdn/mISDN/socket.c
@@ -163,7 +163,7 @@ mISDN_sock_recvmsg(struct kiocb *iocb, struct socket *sock,
 	memcpy(skb_push(skb, MISDN_HEADER_LEN), mISDN_HEAD_P(skb),
 	       MISDN_HEADER_LEN);
 
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msg(skb, 0, msg, copied);
 
 	mISDN_sock_cmsg(sk, msg, skb);
 
diff --git a/drivers/net/ppp/pppoe.c b/drivers/net/ppp/pppoe.c
index 6c9c16d..443cbbf 100644
--- a/drivers/net/ppp/pppoe.c
+++ b/drivers/net/ppp/pppoe.c
@@ -981,7 +981,7 @@ static int pppoe_recvmsg(struct kiocb *iocb, struct socket *sock,
 
 	if (skb) {
 		total_len = min_t(size_t, total_len, skb->len);
-		error = skb_copy_datagram_iovec(skb, 0, m->msg_iov, total_len);
+		error = skb_copy_datagram_msg(skb, 0, m, total_len);
 		if (error == 0) {
 			consume_skb(skb);
 			return total_len;
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 6c8b6f6..379ab46 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2662,6 +2662,12 @@ unsigned int skb_gso_transport_seglen(const struct sk_buff *skb);
 struct sk_buff *skb_segment(struct sk_buff *skb, netdev_features_t features);
 struct sk_buff *skb_vlan_untag(struct sk_buff *skb);
 
+static inline int skb_copy_datagram_msg(const struct sk_buff *from, int offset,
+			    struct msghdr *msg, int size)
+{
+	return skb_copy_datagram_iovec(from, offset, msg->msg_iov, size);
+}
+
 struct skb_checksum_ops {
 	__wsum (*update)(const void *mem, int len, __wsum wsum);
 	__wsum (*combine)(__wsum csum, __wsum csum2, int offset, int len);
diff --git a/net/appletalk/ddp.c b/net/appletalk/ddp.c
index c00897f..425942d 100644
--- a/net/appletalk/ddp.c
+++ b/net/appletalk/ddp.c
@@ -1758,7 +1758,7 @@ static int atalk_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr
 		copied = size;
 		msg->msg_flags |= MSG_TRUNC;
 	}
-	err = skb_copy_datagram_iovec(skb, offset, msg->msg_iov, copied);
+	err = skb_copy_datagram_msg(skb, offset, msg, copied);
 
 	if (!err && msg->msg_name) {
 		DECLARE_SOCKADDR(struct sockaddr_at *, sat, msg->msg_name);
diff --git a/net/atm/common.c b/net/atm/common.c
index 6a76515..9cd1cca 100644
--- a/net/atm/common.c
+++ b/net/atm/common.c
@@ -554,7 +554,7 @@ int vcc_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
 		msg->msg_flags |= MSG_TRUNC;
 	}
 
-	error = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	error = skb_copy_datagram_msg(skb, 0, msg, copied);
 	if (error)
 		return error;
 	sock_recv_ts_and_drops(msg, sk, skb);
diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c
index c35c3f4..f4f835e19 100644
--- a/net/ax25/af_ax25.c
+++ b/net/ax25/af_ax25.c
@@ -1634,7 +1634,7 @@ static int ax25_recvmsg(struct kiocb *iocb, struct socket *sock,
 		msg->msg_flags |= MSG_TRUNC;
 	}
 
-	skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	skb_copy_datagram_msg(skb, 0, msg, copied);
 
 	if (msg->msg_name) {
 		ax25_digi digi;
diff --git a/net/bluetooth/af_bluetooth.c b/net/bluetooth/af_bluetooth.c
index 339c74a..0a7cc56 100644
--- a/net/bluetooth/af_bluetooth.c
+++ b/net/bluetooth/af_bluetooth.c
@@ -237,7 +237,7 @@ int bt_sock_recvmsg(struct kiocb *iocb, struct socket *sock,
 	}
 
 	skb_reset_transport_header(skb);
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msg(skb, 0, msg, copied);
 	if (err == 0) {
 		sock_recv_ts_and_drops(msg, sk, skb);
 
@@ -328,7 +328,7 @@ int bt_sock_stream_recvmsg(struct kiocb *iocb, struct socket *sock,
 		}
 
 		chunk = min_t(unsigned int, skb->len, size);
-		if (skb_copy_datagram_iovec(skb, 0, msg->msg_iov, chunk)) {
+		if (skb_copy_datagram_msg(skb, 0, msg, chunk)) {
 			skb_queue_head(&sk->sk_receive_queue, skb);
 			if (!copied)
 				copied = -EFAULT;
diff --git a/net/bluetooth/hci_sock.c b/net/bluetooth/hci_sock.c
index 115f149..29e1ec7 100644
--- a/net/bluetooth/hci_sock.c
+++ b/net/bluetooth/hci_sock.c
@@ -878,7 +878,7 @@ static int hci_sock_recvmsg(struct kiocb *iocb, struct socket *sock,
 	}
 
 	skb_reset_transport_header(skb);
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msg(skb, 0, msg, copied);
 
 	switch (hci_pi(sk)->channel) {
 	case HCI_CHANNEL_RAW:
diff --git a/net/caif/caif_socket.c b/net/caif/caif_socket.c
index 43f750e..fbcd156 100644
--- a/net/caif/caif_socket.c
+++ b/net/caif/caif_socket.c
@@ -293,7 +293,7 @@ static int caif_seqpkt_recvmsg(struct kiocb *iocb, struct socket *sock,
 		copylen = len;
 	}
 
-	ret = skb_copy_datagram_iovec(skb, 0, m->msg_iov, copylen);
+	ret = skb_copy_datagram_msg(skb, 0, m, copylen);
 	if (ret)
 		goto out_free;
 
diff --git a/net/core/sock.c b/net/core/sock.c
index 15e0c67..ac56dd0 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -2457,7 +2457,7 @@ int sock_recv_errqueue(struct sock *sk, struct msghdr *msg, int len,
 		msg->msg_flags |= MSG_TRUNC;
 		copied = len;
 	}
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msg(skb, 0, msg, copied);
 	if (err)
 		goto out_free_skb;
 
diff --git a/net/dccp/proto.c b/net/dccp/proto.c
index 5ab6627..8e6ae94 100644
--- a/net/dccp/proto.c
+++ b/net/dccp/proto.c
@@ -896,7 +896,7 @@ verify_sock_status:
 		else if (len < skb->len)
 			msg->msg_flags |= MSG_TRUNC;
 
-		if (skb_copy_datagram_iovec(skb, 0, msg->msg_iov, len)) {
+		if (skb_copy_datagram_msg(skb, 0, msg, len)) {
 			/* Exception. Bailout! */
 			len = -EFAULT;
 			break;
diff --git a/net/ieee802154/dgram.c b/net/ieee802154/dgram.c
index ef2ad8a..fc9193e 100644
--- a/net/ieee802154/dgram.c
+++ b/net/ieee802154/dgram.c
@@ -324,7 +324,7 @@ static int dgram_recvmsg(struct kiocb *iocb, struct sock *sk,
 	}
 
 	/* FIXME: skip headers if necessary ?! */
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msg(skb, 0, msg, copied);
 	if (err)
 		goto done;
 
diff --git a/net/ieee802154/raw.c b/net/ieee802154/raw.c
index 9d1f648..73a4d53 100644
--- a/net/ieee802154/raw.c
+++ b/net/ieee802154/raw.c
@@ -195,7 +195,7 @@ static int raw_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 		copied = len;
 	}
 
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msg(skb, 0, msg, copied);
 	if (err)
 		goto done;
 
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index c373a9a..21894df 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -424,7 +424,7 @@ int ip_recv_error(struct sock *sk, struct msghdr *msg, int len, int *addr_len)
 		msg->msg_flags |= MSG_TRUNC;
 		copied = len;
 	}
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msg(skb, 0, msg, copied);
 	if (err)
 		goto out_free_skb;
 
diff --git a/net/ipv4/ping.c b/net/ipv4/ping.c
index 57f7c98..736236c 100644
--- a/net/ipv4/ping.c
+++ b/net/ipv4/ping.c
@@ -875,7 +875,7 @@ int ping_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 	}
 
 	/* Don't bother checking the checksum */
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msg(skb, 0, msg, copied);
 	if (err)
 		goto done;
 
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index 739db31..ee8fa4b 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -718,7 +718,7 @@ static int raw_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 		copied = len;
 	}
 
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msg(skb, 0, msg, copied);
 	if (err)
 		goto done;
 
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 39ec0c3..c239f47 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1377,7 +1377,7 @@ static int tcp_peek_sndq(struct sock *sk, struct msghdr *msg, int len)
 	/* XXX -- need to support SO_PEEK_OFF */
 
 	skb_queue_walk(&sk->sk_write_queue, skb) {
-		err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, skb->len);
+		err = skb_copy_datagram_msg(skb, 0, msg, skb->len);
 		if (err)
 			break;
 
@@ -1833,8 +1833,7 @@ do_prequeue:
 		}
 
 		if (!(flags & MSG_TRUNC)) {
-			err = skb_copy_datagram_iovec(skb, offset,
-						      msg->msg_iov, used);
+			err = skb_copy_datagram_msg(skb, offset, msg, used);
 			if (err) {
 				/* Exception. Bailout! */
 				if (!copied)
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index cd0db54..d7266f7 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1281,8 +1281,8 @@ try_again:
 	}
 
 	if (skb_csum_unnecessary(skb))
-		err = skb_copy_datagram_iovec(skb, sizeof(struct udphdr),
-					      msg->msg_iov, copied);
+		err = skb_copy_datagram_msg(skb, sizeof(struct udphdr),
+					      msg, copied);
 	else {
 		err = skb_copy_and_csum_datagram_iovec(skb,
 						       sizeof(struct udphdr),
diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index 2cdc383..5c6996e 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -351,7 +351,7 @@ int ipv6_recv_error(struct sock *sk, struct msghdr *msg, int len, int *addr_len)
 		msg->msg_flags |= MSG_TRUNC;
 		copied = len;
 	}
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msg(skb, 0, msg, copied);
 	if (err)
 		goto out_free_skb;
 
@@ -445,7 +445,7 @@ int ipv6_recv_rxpmtu(struct sock *sk, struct msghdr *msg, int len,
 		msg->msg_flags |= MSG_TRUNC;
 		copied = len;
 	}
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msg(skb, 0, msg, copied);
 	if (err)
 		goto out_free_skb;
 
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 896af88..f642598 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -486,11 +486,11 @@ static int rawv6_recvmsg(struct kiocb *iocb, struct sock *sk,
 	}
 
 	if (skb_csum_unnecessary(skb)) {
-		err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+		err = skb_copy_datagram_msg(skb, 0, msg, copied);
 	} else if (msg->msg_flags&MSG_TRUNC) {
 		if (__skb_checksum_complete(skb))
 			goto csum_copy_err;
-		err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+		err = skb_copy_datagram_msg(skb, 0, msg, copied);
 	} else {
 		err = skb_copy_and_csum_datagram_iovec(skb, 0, msg->msg_iov);
 		if (err == -EINVAL)
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index f6ba535..5f68cd72 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -424,8 +424,8 @@ try_again:
 	}
 
 	if (skb_csum_unnecessary(skb))
-		err = skb_copy_datagram_iovec(skb, sizeof(struct udphdr),
-					      msg->msg_iov, copied);
+		err = skb_copy_datagram_msg(skb, sizeof(struct udphdr),
+					      msg, copied);
 	else {
 		err = skb_copy_and_csum_datagram_iovec(skb, sizeof(struct udphdr), msg->msg_iov);
 		if (err == -EINVAL)
diff --git a/net/ipx/af_ipx.c b/net/ipx/af_ipx.c
index 91729b8..8b7ca1c 100644
--- a/net/ipx/af_ipx.c
+++ b/net/ipx/af_ipx.c
@@ -1805,7 +1805,7 @@ static int ipx_recvmsg(struct kiocb *iocb, struct socket *sock,
 		msg->msg_flags |= MSG_TRUNC;
 	}
 
-	rc = skb_copy_datagram_iovec(skb, sizeof(struct ipxhdr), msg->msg_iov,
+	rc = skb_copy_datagram_msg(skb, sizeof(struct ipxhdr), msg,
 				     copied);
 	if (rc)
 		goto out_free;
diff --git a/net/irda/af_irda.c b/net/irda/af_irda.c
index 3f3a6cb..3f1a37b 100644
--- a/net/irda/af_irda.c
+++ b/net/irda/af_irda.c
@@ -1394,7 +1394,7 @@ static int irda_recvmsg_dgram(struct kiocb *iocb, struct socket *sock,
 		copied = size;
 		msg->msg_flags |= MSG_TRUNC;
 	}
-	skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	skb_copy_datagram_msg(skb, 0, msg, copied);
 
 	skb_free_datagram(sk, skb);
 
diff --git a/net/iucv/af_iucv.c b/net/iucv/af_iucv.c
index a089b6b..057b564 100644
--- a/net/iucv/af_iucv.c
+++ b/net/iucv/af_iucv.c
@@ -1355,7 +1355,7 @@ static int iucv_sock_recvmsg(struct kiocb *iocb, struct socket *sock,
 		sk->sk_shutdown = sk->sk_shutdown | RCV_SHUTDOWN;
 
 	cskb = skb;
-	if (skb_copy_datagram_iovec(cskb, offset, msg->msg_iov, copied)) {
+	if (skb_copy_datagram_msg(cskb, offset, msg, copied)) {
 		if (!(flags & MSG_PEEK))
 			skb_queue_head(&sk->sk_receive_queue, skb);
 		return -EFAULT;
diff --git a/net/key/af_key.c b/net/key/af_key.c
index 1847ec4..e588309 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -3654,7 +3654,7 @@ static int pfkey_recvmsg(struct kiocb *kiocb,
 	}
 
 	skb_reset_transport_header(skb);
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msg(skb, 0, msg, copied);
 	if (err)
 		goto out_free;
 
diff --git a/net/l2tp/l2tp_ip.c b/net/l2tp/l2tp_ip.c
index 369a982..a6cc1fe 100644
--- a/net/l2tp/l2tp_ip.c
+++ b/net/l2tp/l2tp_ip.c
@@ -528,7 +528,7 @@ static int l2tp_ip_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *m
 		copied = len;
 	}
 
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msg(skb, 0, msg, copied);
 	if (err)
 		goto done;
 
diff --git a/net/l2tp/l2tp_ip6.c b/net/l2tp/l2tp_ip6.c
index 0edb263..2177b96 100644
--- a/net/l2tp/l2tp_ip6.c
+++ b/net/l2tp/l2tp_ip6.c
@@ -672,7 +672,7 @@ static int l2tp_ip6_recvmsg(struct kiocb *iocb, struct sock *sk,
 		copied = len;
 	}
 
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msg(skb, 0, msg, copied);
 	if (err)
 		goto done;
 
diff --git a/net/l2tp/l2tp_ppp.c b/net/l2tp/l2tp_ppp.c
index b704a93..c559bcd 100644
--- a/net/l2tp/l2tp_ppp.c
+++ b/net/l2tp/l2tp_ppp.c
@@ -208,7 +208,7 @@ static int pppol2tp_recvmsg(struct kiocb *iocb, struct socket *sock,
 	else if (len < skb->len)
 		msg->msg_flags |= MSG_TRUNC;
 
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, len);
+	err = skb_copy_datagram_msg(skb, 0, msg, len);
 	if (likely(err == 0))
 		err = len;
 
diff --git a/net/llc/af_llc.c b/net/llc/af_llc.c
index bb9cbc1..8fa230b 100644
--- a/net/llc/af_llc.c
+++ b/net/llc/af_llc.c
@@ -819,8 +819,8 @@ static int llc_ui_recvmsg(struct kiocb *iocb, struct socket *sock,
 			used = len;
 
 		if (!(flags & MSG_TRUNC)) {
-			int rc = skb_copy_datagram_iovec(skb, offset,
-							 msg->msg_iov, used);
+			int rc = skb_copy_datagram_msg(skb, offset,
+							 msg, used);
 			if (rc) {
 				/* Exception. Bailout! */
 				if (!copied)
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index f1de72d..580b794 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -2401,7 +2401,7 @@ static int netlink_recvmsg(struct kiocb *kiocb, struct socket *sock,
 	}
 
 	skb_reset_transport_header(data_skb);
-	err = skb_copy_datagram_iovec(data_skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msg(data_skb, 0, msg, copied);
 
 	if (msg->msg_name) {
 		DECLARE_SOCKADDR(struct sockaddr_nl *, addr, msg->msg_name);
diff --git a/net/netrom/af_netrom.c b/net/netrom/af_netrom.c
index 1b06a1f..7e13f6a 100644
--- a/net/netrom/af_netrom.c
+++ b/net/netrom/af_netrom.c
@@ -1167,7 +1167,7 @@ static int nr_recvmsg(struct kiocb *iocb, struct socket *sock,
 		msg->msg_flags |= MSG_TRUNC;
 	}
 
-	er = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	er = skb_copy_datagram_msg(skb, 0, msg, copied);
 	if (er < 0) {
 		skb_free_datagram(sk, skb);
 		release_sock(sk);
diff --git a/net/nfc/llcp_sock.c b/net/nfc/llcp_sock.c
index 51f077a..83bc785 100644
--- a/net/nfc/llcp_sock.c
+++ b/net/nfc/llcp_sock.c
@@ -832,7 +832,7 @@ static int llcp_sock_recvmsg(struct kiocb *iocb, struct socket *sock,
 	copied = min_t(unsigned int, rlen, len);
 
 	cskb = skb;
-	if (skb_copy_datagram_iovec(cskb, 0, msg->msg_iov, copied)) {
+	if (skb_copy_datagram_msg(cskb, 0, msg, copied)) {
 		if (!(flags & MSG_PEEK))
 			skb_queue_head(&sk->sk_receive_queue, skb);
 		return -EFAULT;
diff --git a/net/nfc/rawsock.c b/net/nfc/rawsock.c
index 11c3544..9d7d2b7 100644
--- a/net/nfc/rawsock.c
+++ b/net/nfc/rawsock.c
@@ -269,7 +269,7 @@ static int rawsock_recvmsg(struct kiocb *iocb, struct socket *sock,
 		copied = len;
 	}
 
-	rc = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	rc = skb_copy_datagram_msg(skb, 0, msg, copied);
 
 	skb_free_datagram(sk, skb);
 
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 87d20f4..4cd13d8 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2953,7 +2953,7 @@ static int packet_recvmsg(struct kiocb *iocb, struct socket *sock,
 		msg->msg_flags |= MSG_TRUNC;
 	}
 
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msg(skb, 0, msg, copied);
 	if (err)
 		goto out_free;
 
diff --git a/net/phonet/datagram.c b/net/phonet/datagram.c
index 290352c..0918bc2 100644
--- a/net/phonet/datagram.c
+++ b/net/phonet/datagram.c
@@ -150,7 +150,7 @@ static int pn_recvmsg(struct kiocb *iocb, struct sock *sk,
 		copylen = len;
 	}
 
-	rval = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copylen);
+	rval = skb_copy_datagram_msg(skb, 0, msg, copylen);
 	if (rval) {
 		rval = -EFAULT;
 		goto out;
diff --git a/net/phonet/pep.c b/net/phonet/pep.c
index 70a547e..44b2123 100644
--- a/net/phonet/pep.c
+++ b/net/phonet/pep.c
@@ -1296,7 +1296,7 @@ copy:
 	else
 		len = skb->len;
 
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, len);
+	err = skb_copy_datagram_msg(skb, 0, msg, len);
 	if (!err)
 		err = (flags & MSG_TRUNC) ? skb->len : len;
 
diff --git a/net/rose/af_rose.c b/net/rose/af_rose.c
index a85c1a0..9b600c2 100644
--- a/net/rose/af_rose.c
+++ b/net/rose/af_rose.c
@@ -1249,7 +1249,7 @@ static int rose_recvmsg(struct kiocb *iocb, struct socket *sock,
 		msg->msg_flags |= MSG_TRUNC;
 	}
 
-	skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	skb_copy_datagram_msg(skb, 0, msg, copied);
 
 	if (msg->msg_name) {
 		struct sockaddr_rose *srose;
diff --git a/net/rxrpc/ar-recvmsg.c b/net/rxrpc/ar-recvmsg.c
index e9aaa65..4575485 100644
--- a/net/rxrpc/ar-recvmsg.c
+++ b/net/rxrpc/ar-recvmsg.c
@@ -180,7 +180,7 @@ int rxrpc_recvmsg(struct kiocb *iocb, struct socket *sock,
 		if (copy > len - copied)
 			copy = len - copied;
 
-		ret = skb_copy_datagram_iovec(skb, offset, msg->msg_iov, copy);
+		ret = skb_copy_datagram_msg(skb, offset, msg, copy);
 
 		if (ret < 0)
 			goto copy_error;
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 634a2ab..2120292 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -2095,7 +2095,7 @@ static int sctp_recvmsg(struct kiocb *iocb, struct sock *sk,
 	if (copied > len)
 		copied = len;
 
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msg(skb, 0, msg, copied);
 
 	event = sctp_skb2event(skb);
 
diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 51bddc2..f726eaa 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -1372,8 +1372,8 @@ restart:
 			sz = buf_len;
 			m->msg_flags |= MSG_TRUNC;
 		}
-		res = skb_copy_datagram_iovec(buf, msg_hdr_sz(msg),
-					      m->msg_iov, sz);
+		res = skb_copy_datagram_msg(buf, msg_hdr_sz(msg),
+					      m, sz);
 		if (res)
 			goto exit;
 		res = sz;
@@ -1473,8 +1473,8 @@ restart:
 		needed = (buf_len - sz_copied);
 		sz_to_copy = (sz <= needed) ? sz : needed;
 
-		res = skb_copy_datagram_iovec(buf, msg_hdr_sz(msg) + offset,
-					      m->msg_iov, sz_to_copy);
+		res = skb_copy_datagram_msg(buf, msg_hdr_sz(msg) + offset,
+					      m, sz_to_copy);
 		if (res)
 			goto exit;
 
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index e968843..350771a 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1825,7 +1825,7 @@ static int unix_dgram_recvmsg(struct kiocb *iocb, struct socket *sock,
 	else if (size < skb->len - skip)
 		msg->msg_flags |= MSG_TRUNC;
 
-	err = skb_copy_datagram_iovec(skb, skip, msg->msg_iov, size);
+	err = skb_copy_datagram_msg(skb, skip, msg, size);
 	if (err)
 		goto out_free;
 
@@ -2030,8 +2030,8 @@ again:
 		}
 
 		chunk = min_t(unsigned int, unix_skb_len(skb) - skip, size);
-		if (skb_copy_datagram_iovec(skb, UNIXCB(skb).consumed + skip,
-					    msg->msg_iov, chunk)) {
+		if (skb_copy_datagram_msg(skb, UNIXCB(skb).consumed + skip,
+					    msg, chunk)) {
 			if (copied == 0)
 				copied = -EFAULT;
 			break;
diff --git a/net/vmw_vsock/vmci_transport.c b/net/vmw_vsock/vmci_transport.c
index 9bb63ff..a57ddef 100644
--- a/net/vmw_vsock/vmci_transport.c
+++ b/net/vmw_vsock/vmci_transport.c
@@ -1773,8 +1773,7 @@ static int vmci_transport_dgram_dequeue(struct kiocb *kiocb,
 	}
 
 	/* Place the datagram payload in the user's iovec. */
-	err = skb_copy_datagram_iovec(skb, sizeof(*dg), msg->msg_iov,
-		payload_len);
+	err = skb_copy_datagram_msg(skb, sizeof(*dg), msg, payload_len);
 	if (err)
 		goto out;
 
diff --git a/net/x25/af_x25.c b/net/x25/af_x25.c
index 5ad4418..59e785b 100644
--- a/net/x25/af_x25.c
+++ b/net/x25/af_x25.c
@@ -1335,7 +1335,7 @@ static int x25_recvmsg(struct kiocb *iocb, struct socket *sock,
 	/* Currently, each datagram always contains a complete record */
 	msg->msg_flags |= MSG_EOR;
 
-	rc = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	rc = skb_copy_datagram_msg(skb, 0, msg, copied);
 	if (rc)
 		goto out_free_dgram;
 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Herbert Xu Nov. 5, 2014, 2:22 a.m. UTC | #8
On Tue, Nov 04, 2014 at 03:13:55PM +0000, Al Viro wrote:
> 
> Think of it that way: every sendmsg/recvmsg path leading to memcpy_fromiovec
> and its friends (including the open-coded ones) would need to be changed
> at some point.  Assuming we do not end up passing struct iov_iter * as
> an extra argument through a fairly large part of net/* (and that would
> be prohibitively hard and messy, not to mention the effects on the stack
> footprint, etc.), the most obvious strategy is to have that thing passed
> where msg_iov/msg_iovlen are - in struct msghdr.  *IF* we go that way,
> it makes a whole lot of sense to start with a bunch of cleanups that
> will make sense on their own (most of callers of skb_copy_datagram_iovec
> do look like skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied); might
> as well give it an inlined helper) and will reduce the amount of places
> where ->msg_iov is used.  With such cleanups standing on their own and
> being splittable from the rest of the queue.  And leaving us with fewer
> places in code that deal with ->msg_iov and need to be dealt with.

I think your solution is great.  However, I don't see how my four
patches impede in anyway the work that you're doing.  I presume
your first patch will make skb_copy_datagram_msg just a wrapper
around skb_copy_datagram_iovec.

Since I'm not removing skb_copy_datagram_iovec (it can't be removed
until all users are gone) you can still do that and when you're
ready to switch over to iov_iter you can just move the wrapper over
to skb_copy_datagram_iter.  No?

Cheers,
David Miller Nov. 5, 2014, 3:27 a.m. UTC | #9
From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Wed, 5 Nov 2014 10:22:51 +0800

> On Tue, Nov 04, 2014 at 03:13:55PM +0000, Al Viro wrote:
>> 
>> Think of it that way: every sendmsg/recvmsg path leading to memcpy_fromiovec
>> and its friends (including the open-coded ones) would need to be changed
>> at some point.  Assuming we do not end up passing struct iov_iter * as
>> an extra argument through a fairly large part of net/* (and that would
>> be prohibitively hard and messy, not to mention the effects on the stack
>> footprint, etc.), the most obvious strategy is to have that thing passed
>> where msg_iov/msg_iovlen are - in struct msghdr.  *IF* we go that way,
>> it makes a whole lot of sense to start with a bunch of cleanups that
>> will make sense on their own (most of callers of skb_copy_datagram_iovec
>> do look like skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied); might
>> as well give it an inlined helper) and will reduce the amount of places
>> where ->msg_iov is used.  With such cleanups standing on their own and
>> being splittable from the rest of the queue.  And leaving us with fewer
>> places in code that deal with ->msg_iov and need to be dealt with.
> 
> I think your solution is great.  However, I don't see how my four
> patches impede in anyway the work that you're doing.  I presume
> your first patch will make skb_copy_datagram_msg just a wrapper
> around skb_copy_datagram_iovec.
> 
> Since I'm not removing skb_copy_datagram_iovec (it can't be removed
> until all users are gone) you can still do that and when you're
> ready to switch over to iov_iter you can just move the wrapper over
> to skb_copy_datagram_iter.  No?

Agreed, I think both efforts can proceed in parallel.

Al, is this the helper you are talking about?

diff --git a/drivers/isdn/mISDN/socket.c b/drivers/isdn/mISDN/socket.c
index 1be8228..a08057d 100644
--- a/drivers/isdn/mISDN/socket.c
+++ b/drivers/isdn/mISDN/socket.c
@@ -163,7 +163,7 @@ mISDN_sock_recvmsg(struct kiocb *iocb, struct socket *sock,
 	memcpy(skb_push(skb, MISDN_HEADER_LEN), mISDN_HEAD_P(skb),
 	       MISDN_HEADER_LEN);
 
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msghdr(skb, msg, copied);
 
 	mISDN_sock_cmsg(sk, msg, skb);
 
diff --git a/drivers/net/ppp/pppoe.c b/drivers/net/ppp/pppoe.c
index 6c9c16d..25234d9 100644
--- a/drivers/net/ppp/pppoe.c
+++ b/drivers/net/ppp/pppoe.c
@@ -981,7 +981,7 @@ static int pppoe_recvmsg(struct kiocb *iocb, struct socket *sock,
 
 	if (skb) {
 		total_len = min_t(size_t, total_len, skb->len);
-		error = skb_copy_datagram_iovec(skb, 0, m->msg_iov, total_len);
+		error = skb_copy_datagram_msghdr(skb, m, total_len);
 		if (error == 0) {
 			consume_skb(skb);
 			return total_len;
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 5ad9675..19fe8cc 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -17,6 +17,7 @@
 #include <linux/kernel.h>
 #include <linux/kmemcheck.h>
 #include <linux/compiler.h>
+#include <linux/socket.h>
 #include <linux/time.h>
 #include <linux/bug.h>
 #include <linux/cache.h>
@@ -2637,6 +2638,11 @@ unsigned int datagram_poll(struct file *file, struct socket *sock,
 			   struct poll_table_struct *wait);
 int skb_copy_datagram_iovec(const struct sk_buff *from, int offset,
 			    struct iovec *to, int size);
+static inline int skb_copy_datagram_msghdr(const struct sk_buff *from,
+					   struct msghdr *msg, int size)
+{
+	return skb_copy_datagram_iovec(from, 0, msg->msg_iov, size);
+}
 int skb_copy_and_csum_datagram_iovec(struct sk_buff *skb, int hlen,
 				     struct iovec *iov);
 int skb_copy_datagram_from_iovec(struct sk_buff *skb, int offset,
diff --git a/net/atm/common.c b/net/atm/common.c
index 6a76515..7e42bbe 100644
--- a/net/atm/common.c
+++ b/net/atm/common.c
@@ -554,7 +554,7 @@ int vcc_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
 		msg->msg_flags |= MSG_TRUNC;
 	}
 
-	error = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	error = skb_copy_datagram_msghdr(skb, msg, copied);
 	if (error)
 		return error;
 	sock_recv_ts_and_drops(msg, sk, skb);
diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c
index c35c3f4..a91075c 100644
--- a/net/ax25/af_ax25.c
+++ b/net/ax25/af_ax25.c
@@ -1634,7 +1634,7 @@ static int ax25_recvmsg(struct kiocb *iocb, struct socket *sock,
 		msg->msg_flags |= MSG_TRUNC;
 	}
 
-	skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	skb_copy_datagram_msghdr(skb, msg, copied);
 
 	if (msg->msg_name) {
 		ax25_digi digi;
diff --git a/net/bluetooth/af_bluetooth.c b/net/bluetooth/af_bluetooth.c
index 339c74a..a68dd75 100644
--- a/net/bluetooth/af_bluetooth.c
+++ b/net/bluetooth/af_bluetooth.c
@@ -237,7 +237,7 @@ int bt_sock_recvmsg(struct kiocb *iocb, struct socket *sock,
 	}
 
 	skb_reset_transport_header(skb);
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msghdr(skb, msg, copied);
 	if (err == 0) {
 		sock_recv_ts_and_drops(msg, sk, skb);
 
@@ -328,7 +328,7 @@ int bt_sock_stream_recvmsg(struct kiocb *iocb, struct socket *sock,
 		}
 
 		chunk = min_t(unsigned int, skb->len, size);
-		if (skb_copy_datagram_iovec(skb, 0, msg->msg_iov, chunk)) {
+		if (skb_copy_datagram_msghdr(skb, msg, chunk)) {
 			skb_queue_head(&sk->sk_receive_queue, skb);
 			if (!copied)
 				copied = -EFAULT;
diff --git a/net/bluetooth/hci_sock.c b/net/bluetooth/hci_sock.c
index 115f149..45d4fba 100644
--- a/net/bluetooth/hci_sock.c
+++ b/net/bluetooth/hci_sock.c
@@ -878,7 +878,7 @@ static int hci_sock_recvmsg(struct kiocb *iocb, struct socket *sock,
 	}
 
 	skb_reset_transport_header(skb);
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msghdr(skb, msg, copied);
 
 	switch (hci_pi(sk)->channel) {
 	case HCI_CHANNEL_RAW:
diff --git a/net/caif/caif_socket.c b/net/caif/caif_socket.c
index 43f750e..67e63b6 100644
--- a/net/caif/caif_socket.c
+++ b/net/caif/caif_socket.c
@@ -293,7 +293,7 @@ static int caif_seqpkt_recvmsg(struct kiocb *iocb, struct socket *sock,
 		copylen = len;
 	}
 
-	ret = skb_copy_datagram_iovec(skb, 0, m->msg_iov, copylen);
+	ret = skb_copy_datagram_msghdr(skb, m, copylen);
 	if (ret)
 		goto out_free;
 
diff --git a/net/core/sock.c b/net/core/sock.c
index 15e0c67..220c791 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -2457,7 +2457,7 @@ int sock_recv_errqueue(struct sock *sk, struct msghdr *msg, int len,
 		msg->msg_flags |= MSG_TRUNC;
 		copied = len;
 	}
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msghdr(skb, msg, copied);
 	if (err)
 		goto out_free_skb;
 
diff --git a/net/dccp/proto.c b/net/dccp/proto.c
index 5ab6627..7ccf58f 100644
--- a/net/dccp/proto.c
+++ b/net/dccp/proto.c
@@ -896,7 +896,7 @@ verify_sock_status:
 		else if (len < skb->len)
 			msg->msg_flags |= MSG_TRUNC;
 
-		if (skb_copy_datagram_iovec(skb, 0, msg->msg_iov, len)) {
+		if (skb_copy_datagram_msghdr(skb, msg, len)) {
 			/* Exception. Bailout! */
 			len = -EFAULT;
 			break;
diff --git a/net/ieee802154/dgram.c b/net/ieee802154/dgram.c
index ef2ad8a..7017055 100644
--- a/net/ieee802154/dgram.c
+++ b/net/ieee802154/dgram.c
@@ -324,7 +324,7 @@ static int dgram_recvmsg(struct kiocb *iocb, struct sock *sk,
 	}
 
 	/* FIXME: skip headers if necessary ?! */
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msghdr(skb, msg, copied);
 	if (err)
 		goto done;
 
diff --git a/net/ieee802154/raw.c b/net/ieee802154/raw.c
index 9d1f648..5dd893a 100644
--- a/net/ieee802154/raw.c
+++ b/net/ieee802154/raw.c
@@ -195,7 +195,7 @@ static int raw_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 		copied = len;
 	}
 
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msghdr(skb, msg, copied);
 	if (err)
 		goto done;
 
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index c373a9a..d643882 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -424,7 +424,7 @@ int ip_recv_error(struct sock *sk, struct msghdr *msg, int len, int *addr_len)
 		msg->msg_flags |= MSG_TRUNC;
 		copied = len;
 	}
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msghdr(skb, msg, copied);
 	if (err)
 		goto out_free_skb;
 
diff --git a/net/ipv4/ping.c b/net/ipv4/ping.c
index 57f7c98..a6e0197 100644
--- a/net/ipv4/ping.c
+++ b/net/ipv4/ping.c
@@ -875,7 +875,7 @@ int ping_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 	}
 
 	/* Don't bother checking the checksum */
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msghdr(skb, msg, copied);
 	if (err)
 		goto done;
 
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index 739db31..2f4bb30 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -718,7 +718,7 @@ static int raw_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 		copied = len;
 	}
 
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msghdr(skb, msg, copied);
 	if (err)
 		goto done;
 
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 39ec0c3..3638679 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1377,7 +1377,7 @@ static int tcp_peek_sndq(struct sock *sk, struct msghdr *msg, int len)
 	/* XXX -- need to support SO_PEEK_OFF */
 
 	skb_queue_walk(&sk->sk_write_queue, skb) {
-		err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, skb->len);
+		err = skb_copy_datagram_msghdr(skb, msg, skb->len);
 		if (err)
 			break;
 
diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index 2cdc383..4bd84fd 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -351,7 +351,7 @@ int ipv6_recv_error(struct sock *sk, struct msghdr *msg, int len, int *addr_len)
 		msg->msg_flags |= MSG_TRUNC;
 		copied = len;
 	}
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msghdr(skb, msg, copied);
 	if (err)
 		goto out_free_skb;
 
@@ -445,7 +445,7 @@ int ipv6_recv_rxpmtu(struct sock *sk, struct msghdr *msg, int len,
 		msg->msg_flags |= MSG_TRUNC;
 		copied = len;
 	}
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msghdr(skb, msg, copied);
 	if (err)
 		goto out_free_skb;
 
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 075a0fb..5d37aa1 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -486,11 +486,11 @@ static int rawv6_recvmsg(struct kiocb *iocb, struct sock *sk,
 	}
 
 	if (skb_csum_unnecessary(skb)) {
-		err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+		err = skb_copy_datagram_msghdr(skb, msg, copied);
 	} else if (msg->msg_flags&MSG_TRUNC) {
 		if (__skb_checksum_complete(skb))
 			goto csum_copy_err;
-		err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+		err = skb_copy_datagram_msghdr(skb, msg, copied);
 	} else {
 		err = skb_copy_and_csum_datagram_iovec(skb, 0, msg->msg_iov);
 		if (err == -EINVAL)
diff --git a/net/irda/af_irda.c b/net/irda/af_irda.c
index 92fafd4..54db6dc 100644
--- a/net/irda/af_irda.c
+++ b/net/irda/af_irda.c
@@ -1396,7 +1396,7 @@ static int irda_recvmsg_dgram(struct kiocb *iocb, struct socket *sock,
 		copied = size;
 		msg->msg_flags |= MSG_TRUNC;
 	}
-	skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	skb_copy_datagram_msghdr(skb, msg, copied);
 
 	skb_free_datagram(sk, skb);
 
diff --git a/net/key/af_key.c b/net/key/af_key.c
index 1847ec4..f09a848 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -3654,7 +3654,7 @@ static int pfkey_recvmsg(struct kiocb *kiocb,
 	}
 
 	skb_reset_transport_header(skb);
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msghdr(skb, msg, copied);
 	if (err)
 		goto out_free;
 
diff --git a/net/l2tp/l2tp_ip.c b/net/l2tp/l2tp_ip.c
index 369a982..2913198 100644
--- a/net/l2tp/l2tp_ip.c
+++ b/net/l2tp/l2tp_ip.c
@@ -528,7 +528,7 @@ static int l2tp_ip_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *m
 		copied = len;
 	}
 
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msghdr(skb, msg, copied);
 	if (err)
 		goto done;
 
diff --git a/net/l2tp/l2tp_ip6.c b/net/l2tp/l2tp_ip6.c
index 0edb263..8613881 100644
--- a/net/l2tp/l2tp_ip6.c
+++ b/net/l2tp/l2tp_ip6.c
@@ -672,7 +672,7 @@ static int l2tp_ip6_recvmsg(struct kiocb *iocb, struct sock *sk,
 		copied = len;
 	}
 
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msghdr(skb, msg, copied);
 	if (err)
 		goto done;
 
diff --git a/net/l2tp/l2tp_ppp.c b/net/l2tp/l2tp_ppp.c
index b704a93..5f3c0f5 100644
--- a/net/l2tp/l2tp_ppp.c
+++ b/net/l2tp/l2tp_ppp.c
@@ -208,7 +208,7 @@ static int pppol2tp_recvmsg(struct kiocb *iocb, struct socket *sock,
 	else if (len < skb->len)
 		msg->msg_flags |= MSG_TRUNC;
 
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, len);
+	err = skb_copy_datagram_msghdr(skb, msg, len);
 	if (likely(err == 0))
 		err = len;
 
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index f1de72d..123cffd 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -2401,7 +2401,7 @@ static int netlink_recvmsg(struct kiocb *kiocb, struct socket *sock,
 	}
 
 	skb_reset_transport_header(data_skb);
-	err = skb_copy_datagram_iovec(data_skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msghdr(data_skb, msg, copied);
 
 	if (msg->msg_name) {
 		DECLARE_SOCKADDR(struct sockaddr_nl *, addr, msg->msg_name);
diff --git a/net/netrom/af_netrom.c b/net/netrom/af_netrom.c
index 1b06a1f..6bc3556 100644
--- a/net/netrom/af_netrom.c
+++ b/net/netrom/af_netrom.c
@@ -1167,7 +1167,7 @@ static int nr_recvmsg(struct kiocb *iocb, struct socket *sock,
 		msg->msg_flags |= MSG_TRUNC;
 	}
 
-	er = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	er = skb_copy_datagram_msghdr(skb, msg, copied);
 	if (er < 0) {
 		skb_free_datagram(sk, skb);
 		release_sock(sk);
diff --git a/net/nfc/llcp_sock.c b/net/nfc/llcp_sock.c
index 51f077a..e962c07 100644
--- a/net/nfc/llcp_sock.c
+++ b/net/nfc/llcp_sock.c
@@ -832,7 +832,7 @@ static int llcp_sock_recvmsg(struct kiocb *iocb, struct socket *sock,
 	copied = min_t(unsigned int, rlen, len);
 
 	cskb = skb;
-	if (skb_copy_datagram_iovec(cskb, 0, msg->msg_iov, copied)) {
+	if (skb_copy_datagram_msghdr(cskb, msg, copied)) {
 		if (!(flags & MSG_PEEK))
 			skb_queue_head(&sk->sk_receive_queue, skb);
 		return -EFAULT;
diff --git a/net/nfc/rawsock.c b/net/nfc/rawsock.c
index 11c3544..4467b2c 100644
--- a/net/nfc/rawsock.c
+++ b/net/nfc/rawsock.c
@@ -269,7 +269,7 @@ static int rawsock_recvmsg(struct kiocb *iocb, struct socket *sock,
 		copied = len;
 	}
 
-	rc = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	rc = skb_copy_datagram_msghdr(skb, msg, copied);
 
 	skb_free_datagram(sk, skb);
 
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 87d20f4..390b776 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2953,7 +2953,7 @@ static int packet_recvmsg(struct kiocb *iocb, struct socket *sock,
 		msg->msg_flags |= MSG_TRUNC;
 	}
 
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msghdr(skb, msg, copied);
 	if (err)
 		goto out_free;
 
diff --git a/net/phonet/datagram.c b/net/phonet/datagram.c
index 290352c..b4835bc 100644
--- a/net/phonet/datagram.c
+++ b/net/phonet/datagram.c
@@ -150,7 +150,7 @@ static int pn_recvmsg(struct kiocb *iocb, struct sock *sk,
 		copylen = len;
 	}
 
-	rval = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copylen);
+	rval = skb_copy_datagram_msghdr(skb, msg, copylen);
 	if (rval) {
 		rval = -EFAULT;
 		goto out;
diff --git a/net/phonet/pep.c b/net/phonet/pep.c
index 70a547e..f544658 100644
--- a/net/phonet/pep.c
+++ b/net/phonet/pep.c
@@ -1296,7 +1296,7 @@ copy:
 	else
 		len = skb->len;
 
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, len);
+	err = skb_copy_datagram_msghdr(skb, msg, len);
 	if (!err)
 		err = (flags & MSG_TRUNC) ? skb->len : len;
 
diff --git a/net/rose/af_rose.c b/net/rose/af_rose.c
index a85c1a0..b660504 100644
--- a/net/rose/af_rose.c
+++ b/net/rose/af_rose.c
@@ -1249,7 +1249,7 @@ static int rose_recvmsg(struct kiocb *iocb, struct socket *sock,
 		msg->msg_flags |= MSG_TRUNC;
 	}
 
-	skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	skb_copy_datagram_msghdr(skb, msg, copied);
 
 	if (msg->msg_name) {
 		struct sockaddr_rose *srose;
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 634a2ab..0fca34c 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -2095,7 +2095,7 @@ static int sctp_recvmsg(struct kiocb *iocb, struct sock *sk,
 	if (copied > len)
 		copied = len;
 
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msghdr(skb, msg, copied);
 
 	event = sctp_skb2event(skb);
 
diff --git a/net/x25/af_x25.c b/net/x25/af_x25.c
index 5ad4418..fad0702 100644
--- a/net/x25/af_x25.c
+++ b/net/x25/af_x25.c
@@ -1335,7 +1335,7 @@ static int x25_recvmsg(struct kiocb *iocb, struct socket *sock,
 	/* Currently, each datagram always contains a complete record */
 	msg->msg_flags |= MSG_EOR;
 
-	rc = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	rc = skb_copy_datagram_msghdr(skb, msg, copied);
 	if (rc)
 		goto out_free_dgram;
 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Al Viro Nov. 5, 2014, 3:55 a.m. UTC | #10
On Tue, Nov 04, 2014 at 10:27:27PM -0500, David Miller wrote:

> Al, is this the helper you are talking about?

Mostly, except that I kept it 4-argument (and used skb_copy_datagram_msg()
for name).  Matter of taste - the ones you've missed because of that are

net/appletalk/ddp.c:1761:	err = skb_copy_datagram_iovec(skb, offset, msg->msg_iov, copied);
net/ipv4/tcp.c:1836:			err = skb_copy_datagram_iovec(skb, offset,
net/ipv4/udp.c:1284:		err = skb_copy_datagram_iovec(skb, sizeof(struct udphdr),
net/ipv6/udp.c:427:			err = skb_copy_datagram_iovec(skb, sizeof(struct udphdr),
net/ipx/af_ipx.c:1808:	rc = skb_copy_datagram_iovec(skb, sizeof(struct ipxhdr), msg->msg_iov,
net/iucv/af_iucv.c:1358:	if (skb_copy_datagram_iovec(cskb, offset, msg->msg_iov, copied)) {
net/llc/af_llc.c:822:			int rc = skb_copy_datagram_iovec(skb, offset,
net/rxrpc/ar-recvmsg.c:183:		ret = skb_copy_datagram_iovec(skb, offset, msg->msg_iov, copy);
net/tipc/socket.c:1375:		res = skb_copy_datagram_iovec(buf, msg_hdr_sz(msg),
net/tipc/socket.c:1476:		res = skb_copy_datagram_iovec(buf, msg_hdr_sz(msg) + offset,
net/unix/af_unix.c:1828:	err = skb_copy_datagram_iovec(skb, skip, msg->msg_iov, size);
net/unix/af_unix.c:2033:		if (skb_copy_datagram_iovec(skb, UNIXCB(skb).consumed + skip,
net/vmw_vsock/vmci_transport.c:1776:	err = skb_copy_datagram_iovec(skb, sizeof(*dg), msg->msg_iov,

and back then I decided that 13 more converted instances might be worth keeping
it in 4-argument form...

What do you think of the trick with user_msghdr, BTW?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Al Viro Nov. 5, 2014, 4:12 a.m. UTC | #11
On Wed, Nov 05, 2014 at 03:55:36AM +0000, Al Viro wrote:
> What do you think of the trick with user_msghdr, BTW?

	PS: where do you prefer the branches to be based off?
git://git.kernel.org/pub/scm/linux/kernel/git/davem/net#master, mainline,
something else?  I can certainly do that as patches over email, the
question is what's best used as base...  FWIW, the analysis I've posted
was in 3.18-rc3 and it looks like it ought to be valid in net#master
as well.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
David Miller Nov. 5, 2014, 8:24 p.m. UTC | #12
Herbert, please provide a cover letter for this series, and the most recent
version of patch #2 gets various rejects when I try to apply it to net-next.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
David Miller Nov. 5, 2014, 8:50 p.m. UTC | #13
From: Al Viro <viro@ZenIV.linux.org.uk>
Date: Wed, 5 Nov 2014 03:55:36 +0000

> On Tue, Nov 04, 2014 at 10:27:27PM -0500, David Miller wrote:
> 
>> Al, is this the helper you are talking about?
> 
> Mostly, except that I kept it 4-argument (and used skb_copy_datagram_msg()
> for name).  Matter of taste - the ones you've missed because of that are
 ...
> and back then I decided that 13 more converted instances might be worth keeping
> it in 4-argument form...

Ok, fixed up patch below:

> What do you think of the trick with user_msghdr, BTW?

I think we can get away with it if, as you say, we don't export a 'msghdr'
from any uapi headers.

And indeed, double checking, it's purely a linux/socket.h thing.

If this patch is OK, mind if I toss it into net-next Al?

diff --git a/drivers/isdn/mISDN/socket.c b/drivers/isdn/mISDN/socket.c
index 1be8228..dcbd858 100644
--- a/drivers/isdn/mISDN/socket.c
+++ b/drivers/isdn/mISDN/socket.c
@@ -163,7 +163,7 @@ mISDN_sock_recvmsg(struct kiocb *iocb, struct socket *sock,
 	memcpy(skb_push(skb, MISDN_HEADER_LEN), mISDN_HEAD_P(skb),
 	       MISDN_HEADER_LEN);
 
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msg(skb, 0, msg, copied);
 
 	mISDN_sock_cmsg(sk, msg, skb);
 
diff --git a/drivers/net/ppp/pppoe.c b/drivers/net/ppp/pppoe.c
index 6c9c16d..443cbbf 100644
--- a/drivers/net/ppp/pppoe.c
+++ b/drivers/net/ppp/pppoe.c
@@ -981,7 +981,7 @@ static int pppoe_recvmsg(struct kiocb *iocb, struct socket *sock,
 
 	if (skb) {
 		total_len = min_t(size_t, total_len, skb->len);
-		error = skb_copy_datagram_iovec(skb, 0, m->msg_iov, total_len);
+		error = skb_copy_datagram_msg(skb, 0, m, total_len);
 		if (error == 0) {
 			consume_skb(skb);
 			return total_len;
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 5ad9675..31cdb7e 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -21,6 +21,7 @@
 #include <linux/bug.h>
 #include <linux/cache.h>
 #include <linux/rbtree.h>
+#include <linux/socket.h>
 
 #include <linux/atomic.h>
 #include <asm/types.h>
@@ -2637,6 +2638,11 @@ unsigned int datagram_poll(struct file *file, struct socket *sock,
 			   struct poll_table_struct *wait);
 int skb_copy_datagram_iovec(const struct sk_buff *from, int offset,
 			    struct iovec *to, int size);
+static inline int skb_copy_datagram_msg(const struct sk_buff *from, int offset,
+					struct msghdr *msg, int size)
+{
+	return skb_copy_datagram_iovec(from, offset, msg->msg_iov, size);
+}
 int skb_copy_and_csum_datagram_iovec(struct sk_buff *skb, int hlen,
 				     struct iovec *iov);
 int skb_copy_datagram_from_iovec(struct sk_buff *skb, int offset,
diff --git a/net/appletalk/ddp.c b/net/appletalk/ddp.c
index c00897f..425942d 100644
--- a/net/appletalk/ddp.c
+++ b/net/appletalk/ddp.c
@@ -1758,7 +1758,7 @@ static int atalk_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr
 		copied = size;
 		msg->msg_flags |= MSG_TRUNC;
 	}
-	err = skb_copy_datagram_iovec(skb, offset, msg->msg_iov, copied);
+	err = skb_copy_datagram_msg(skb, offset, msg, copied);
 
 	if (!err && msg->msg_name) {
 		DECLARE_SOCKADDR(struct sockaddr_at *, sat, msg->msg_name);
diff --git a/net/atm/common.c b/net/atm/common.c
index 6a76515..9cd1cca 100644
--- a/net/atm/common.c
+++ b/net/atm/common.c
@@ -554,7 +554,7 @@ int vcc_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
 		msg->msg_flags |= MSG_TRUNC;
 	}
 
-	error = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	error = skb_copy_datagram_msg(skb, 0, msg, copied);
 	if (error)
 		return error;
 	sock_recv_ts_and_drops(msg, sk, skb);
diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c
index c35c3f4..f4f835e 100644
--- a/net/ax25/af_ax25.c
+++ b/net/ax25/af_ax25.c
@@ -1634,7 +1634,7 @@ static int ax25_recvmsg(struct kiocb *iocb, struct socket *sock,
 		msg->msg_flags |= MSG_TRUNC;
 	}
 
-	skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	skb_copy_datagram_msg(skb, 0, msg, copied);
 
 	if (msg->msg_name) {
 		ax25_digi digi;
diff --git a/net/bluetooth/af_bluetooth.c b/net/bluetooth/af_bluetooth.c
index 339c74a..0a7cc56 100644
--- a/net/bluetooth/af_bluetooth.c
+++ b/net/bluetooth/af_bluetooth.c
@@ -237,7 +237,7 @@ int bt_sock_recvmsg(struct kiocb *iocb, struct socket *sock,
 	}
 
 	skb_reset_transport_header(skb);
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msg(skb, 0, msg, copied);
 	if (err == 0) {
 		sock_recv_ts_and_drops(msg, sk, skb);
 
@@ -328,7 +328,7 @@ int bt_sock_stream_recvmsg(struct kiocb *iocb, struct socket *sock,
 		}
 
 		chunk = min_t(unsigned int, skb->len, size);
-		if (skb_copy_datagram_iovec(skb, 0, msg->msg_iov, chunk)) {
+		if (skb_copy_datagram_msg(skb, 0, msg, chunk)) {
 			skb_queue_head(&sk->sk_receive_queue, skb);
 			if (!copied)
 				copied = -EFAULT;
diff --git a/net/bluetooth/hci_sock.c b/net/bluetooth/hci_sock.c
index 115f149..29e1ec7 100644
--- a/net/bluetooth/hci_sock.c
+++ b/net/bluetooth/hci_sock.c
@@ -878,7 +878,7 @@ static int hci_sock_recvmsg(struct kiocb *iocb, struct socket *sock,
 	}
 
 	skb_reset_transport_header(skb);
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msg(skb, 0, msg, copied);
 
 	switch (hci_pi(sk)->channel) {
 	case HCI_CHANNEL_RAW:
diff --git a/net/caif/caif_socket.c b/net/caif/caif_socket.c
index 43f750e..fbcd156 100644
--- a/net/caif/caif_socket.c
+++ b/net/caif/caif_socket.c
@@ -293,7 +293,7 @@ static int caif_seqpkt_recvmsg(struct kiocb *iocb, struct socket *sock,
 		copylen = len;
 	}
 
-	ret = skb_copy_datagram_iovec(skb, 0, m->msg_iov, copylen);
+	ret = skb_copy_datagram_msg(skb, 0, m, copylen);
 	if (ret)
 		goto out_free;
 
diff --git a/net/core/sock.c b/net/core/sock.c
index 15e0c67..ac56dd0 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -2457,7 +2457,7 @@ int sock_recv_errqueue(struct sock *sk, struct msghdr *msg, int len,
 		msg->msg_flags |= MSG_TRUNC;
 		copied = len;
 	}
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msg(skb, 0, msg, copied);
 	if (err)
 		goto out_free_skb;
 
diff --git a/net/dccp/proto.c b/net/dccp/proto.c
index 5ab6627..8e6ae94 100644
--- a/net/dccp/proto.c
+++ b/net/dccp/proto.c
@@ -896,7 +896,7 @@ verify_sock_status:
 		else if (len < skb->len)
 			msg->msg_flags |= MSG_TRUNC;
 
-		if (skb_copy_datagram_iovec(skb, 0, msg->msg_iov, len)) {
+		if (skb_copy_datagram_msg(skb, 0, msg, len)) {
 			/* Exception. Bailout! */
 			len = -EFAULT;
 			break;
diff --git a/net/ieee802154/dgram.c b/net/ieee802154/dgram.c
index ef2ad8a..fc9193e 100644
--- a/net/ieee802154/dgram.c
+++ b/net/ieee802154/dgram.c
@@ -324,7 +324,7 @@ static int dgram_recvmsg(struct kiocb *iocb, struct sock *sk,
 	}
 
 	/* FIXME: skip headers if necessary ?! */
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msg(skb, 0, msg, copied);
 	if (err)
 		goto done;
 
diff --git a/net/ieee802154/raw.c b/net/ieee802154/raw.c
index 9d1f648..73a4d53 100644
--- a/net/ieee802154/raw.c
+++ b/net/ieee802154/raw.c
@@ -195,7 +195,7 @@ static int raw_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 		copied = len;
 	}
 
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msg(skb, 0, msg, copied);
 	if (err)
 		goto done;
 
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index c373a9a..21894df 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -424,7 +424,7 @@ int ip_recv_error(struct sock *sk, struct msghdr *msg, int len, int *addr_len)
 		msg->msg_flags |= MSG_TRUNC;
 		copied = len;
 	}
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msg(skb, 0, msg, copied);
 	if (err)
 		goto out_free_skb;
 
diff --git a/net/ipv4/ping.c b/net/ipv4/ping.c
index 57f7c98..736236c 100644
--- a/net/ipv4/ping.c
+++ b/net/ipv4/ping.c
@@ -875,7 +875,7 @@ int ping_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 	}
 
 	/* Don't bother checking the checksum */
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msg(skb, 0, msg, copied);
 	if (err)
 		goto done;
 
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index 739db31..ee8fa4b 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -718,7 +718,7 @@ static int raw_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 		copied = len;
 	}
 
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msg(skb, 0, msg, copied);
 	if (err)
 		goto done;
 
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 39ec0c3..c239f47 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1377,7 +1377,7 @@ static int tcp_peek_sndq(struct sock *sk, struct msghdr *msg, int len)
 	/* XXX -- need to support SO_PEEK_OFF */
 
 	skb_queue_walk(&sk->sk_write_queue, skb) {
-		err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, skb->len);
+		err = skb_copy_datagram_msg(skb, 0, msg, skb->len);
 		if (err)
 			break;
 
@@ -1833,8 +1833,7 @@ do_prequeue:
 		}
 
 		if (!(flags & MSG_TRUNC)) {
-			err = skb_copy_datagram_iovec(skb, offset,
-						      msg->msg_iov, used);
+			err = skb_copy_datagram_msg(skb, offset, msg, used);
 			if (err) {
 				/* Exception. Bailout! */
 				if (!copied)
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 3f001db..df19027 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1281,8 +1281,8 @@ try_again:
 	}
 
 	if (skb_csum_unnecessary(skb))
-		err = skb_copy_datagram_iovec(skb, sizeof(struct udphdr),
-					      msg->msg_iov, copied);
+		err = skb_copy_datagram_msg(skb, sizeof(struct udphdr),
+					    msg, copied);
 	else {
 		err = skb_copy_and_csum_datagram_iovec(skb,
 						       sizeof(struct udphdr),
diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index 2cdc383..5c6996e 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -351,7 +351,7 @@ int ipv6_recv_error(struct sock *sk, struct msghdr *msg, int len, int *addr_len)
 		msg->msg_flags |= MSG_TRUNC;
 		copied = len;
 	}
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msg(skb, 0, msg, copied);
 	if (err)
 		goto out_free_skb;
 
@@ -445,7 +445,7 @@ int ipv6_recv_rxpmtu(struct sock *sk, struct msghdr *msg, int len,
 		msg->msg_flags |= MSG_TRUNC;
 		copied = len;
 	}
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msg(skb, 0, msg, copied);
 	if (err)
 		goto out_free_skb;
 
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 075a0fb..0cbcf98 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -486,11 +486,11 @@ static int rawv6_recvmsg(struct kiocb *iocb, struct sock *sk,
 	}
 
 	if (skb_csum_unnecessary(skb)) {
-		err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+		err = skb_copy_datagram_msg(skb, 0, msg, copied);
 	} else if (msg->msg_flags&MSG_TRUNC) {
 		if (__skb_checksum_complete(skb))
 			goto csum_copy_err;
-		err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+		err = skb_copy_datagram_msg(skb, 0, msg, copied);
 	} else {
 		err = skb_copy_and_csum_datagram_iovec(skb, 0, msg->msg_iov);
 		if (err == -EINVAL)
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index f6ba535..9b68092 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -424,8 +424,8 @@ try_again:
 	}
 
 	if (skb_csum_unnecessary(skb))
-		err = skb_copy_datagram_iovec(skb, sizeof(struct udphdr),
-					      msg->msg_iov, copied);
+		err = skb_copy_datagram_msg(skb, sizeof(struct udphdr),
+					    msg, copied);
 	else {
 		err = skb_copy_and_csum_datagram_iovec(skb, sizeof(struct udphdr), msg->msg_iov);
 		if (err == -EINVAL)
diff --git a/net/ipx/af_ipx.c b/net/ipx/af_ipx.c
index 313ef46..a0c7536 100644
--- a/net/ipx/af_ipx.c
+++ b/net/ipx/af_ipx.c
@@ -1805,8 +1805,7 @@ static int ipx_recvmsg(struct kiocb *iocb, struct socket *sock,
 		msg->msg_flags |= MSG_TRUNC;
 	}
 
-	rc = skb_copy_datagram_iovec(skb, sizeof(struct ipxhdr), msg->msg_iov,
-				     copied);
+	rc = skb_copy_datagram_msg(skb, sizeof(struct ipxhdr), msg, copied);
 	if (rc)
 		goto out_free;
 	if (skb->tstamp.tv64)
diff --git a/net/irda/af_irda.c b/net/irda/af_irda.c
index 92fafd4..980bc26 100644
--- a/net/irda/af_irda.c
+++ b/net/irda/af_irda.c
@@ -1396,7 +1396,7 @@ static int irda_recvmsg_dgram(struct kiocb *iocb, struct socket *sock,
 		copied = size;
 		msg->msg_flags |= MSG_TRUNC;
 	}
-	skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	skb_copy_datagram_msg(skb, 0, msg, copied);
 
 	skb_free_datagram(sk, skb);
 
diff --git a/net/iucv/af_iucv.c b/net/iucv/af_iucv.c
index a089b6b..057b564 100644
--- a/net/iucv/af_iucv.c
+++ b/net/iucv/af_iucv.c
@@ -1355,7 +1355,7 @@ static int iucv_sock_recvmsg(struct kiocb *iocb, struct socket *sock,
 		sk->sk_shutdown = sk->sk_shutdown | RCV_SHUTDOWN;
 
 	cskb = skb;
-	if (skb_copy_datagram_iovec(cskb, offset, msg->msg_iov, copied)) {
+	if (skb_copy_datagram_msg(cskb, offset, msg, copied)) {
 		if (!(flags & MSG_PEEK))
 			skb_queue_head(&sk->sk_receive_queue, skb);
 		return -EFAULT;
diff --git a/net/key/af_key.c b/net/key/af_key.c
index 1847ec4..e588309 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -3654,7 +3654,7 @@ static int pfkey_recvmsg(struct kiocb *kiocb,
 	}
 
 	skb_reset_transport_header(skb);
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msg(skb, 0, msg, copied);
 	if (err)
 		goto out_free;
 
diff --git a/net/l2tp/l2tp_ip.c b/net/l2tp/l2tp_ip.c
index 369a982..a6cc1fe 100644
--- a/net/l2tp/l2tp_ip.c
+++ b/net/l2tp/l2tp_ip.c
@@ -528,7 +528,7 @@ static int l2tp_ip_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *m
 		copied = len;
 	}
 
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msg(skb, 0, msg, copied);
 	if (err)
 		goto done;
 
diff --git a/net/l2tp/l2tp_ip6.c b/net/l2tp/l2tp_ip6.c
index 0edb263..2177b96 100644
--- a/net/l2tp/l2tp_ip6.c
+++ b/net/l2tp/l2tp_ip6.c
@@ -672,7 +672,7 @@ static int l2tp_ip6_recvmsg(struct kiocb *iocb, struct sock *sk,
 		copied = len;
 	}
 
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msg(skb, 0, msg, copied);
 	if (err)
 		goto done;
 
diff --git a/net/l2tp/l2tp_ppp.c b/net/l2tp/l2tp_ppp.c
index b704a93..c559bcd 100644
--- a/net/l2tp/l2tp_ppp.c
+++ b/net/l2tp/l2tp_ppp.c
@@ -208,7 +208,7 @@ static int pppol2tp_recvmsg(struct kiocb *iocb, struct socket *sock,
 	else if (len < skb->len)
 		msg->msg_flags |= MSG_TRUNC;
 
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, len);
+	err = skb_copy_datagram_msg(skb, 0, msg, len);
 	if (likely(err == 0))
 		err = len;
 
diff --git a/net/llc/af_llc.c b/net/llc/af_llc.c
index bb9cbc1..af66266 100644
--- a/net/llc/af_llc.c
+++ b/net/llc/af_llc.c
@@ -819,8 +819,7 @@ static int llc_ui_recvmsg(struct kiocb *iocb, struct socket *sock,
 			used = len;
 
 		if (!(flags & MSG_TRUNC)) {
-			int rc = skb_copy_datagram_iovec(skb, offset,
-							 msg->msg_iov, used);
+			int rc = skb_copy_datagram_msg(skb, offset, msg, used);
 			if (rc) {
 				/* Exception. Bailout! */
 				if (!copied)
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index f1de72d..580b794 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -2401,7 +2401,7 @@ static int netlink_recvmsg(struct kiocb *kiocb, struct socket *sock,
 	}
 
 	skb_reset_transport_header(data_skb);
-	err = skb_copy_datagram_iovec(data_skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msg(data_skb, 0, msg, copied);
 
 	if (msg->msg_name) {
 		DECLARE_SOCKADDR(struct sockaddr_nl *, addr, msg->msg_name);
diff --git a/net/netrom/af_netrom.c b/net/netrom/af_netrom.c
index 1b06a1f..7e13f6a 100644
--- a/net/netrom/af_netrom.c
+++ b/net/netrom/af_netrom.c
@@ -1167,7 +1167,7 @@ static int nr_recvmsg(struct kiocb *iocb, struct socket *sock,
 		msg->msg_flags |= MSG_TRUNC;
 	}
 
-	er = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	er = skb_copy_datagram_msg(skb, 0, msg, copied);
 	if (er < 0) {
 		skb_free_datagram(sk, skb);
 		release_sock(sk);
diff --git a/net/nfc/llcp_sock.c b/net/nfc/llcp_sock.c
index 51f077a..83bc785 100644
--- a/net/nfc/llcp_sock.c
+++ b/net/nfc/llcp_sock.c
@@ -832,7 +832,7 @@ static int llcp_sock_recvmsg(struct kiocb *iocb, struct socket *sock,
 	copied = min_t(unsigned int, rlen, len);
 
 	cskb = skb;
-	if (skb_copy_datagram_iovec(cskb, 0, msg->msg_iov, copied)) {
+	if (skb_copy_datagram_msg(cskb, 0, msg, copied)) {
 		if (!(flags & MSG_PEEK))
 			skb_queue_head(&sk->sk_receive_queue, skb);
 		return -EFAULT;
diff --git a/net/nfc/rawsock.c b/net/nfc/rawsock.c
index 11c3544..9d7d2b7 100644
--- a/net/nfc/rawsock.c
+++ b/net/nfc/rawsock.c
@@ -269,7 +269,7 @@ static int rawsock_recvmsg(struct kiocb *iocb, struct socket *sock,
 		copied = len;
 	}
 
-	rc = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	rc = skb_copy_datagram_msg(skb, 0, msg, copied);
 
 	skb_free_datagram(sk, skb);
 
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 87d20f4..4cd13d8 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2953,7 +2953,7 @@ static int packet_recvmsg(struct kiocb *iocb, struct socket *sock,
 		msg->msg_flags |= MSG_TRUNC;
 	}
 
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msg(skb, 0, msg, copied);
 	if (err)
 		goto out_free;
 
diff --git a/net/phonet/datagram.c b/net/phonet/datagram.c
index 290352c..0918bc2 100644
--- a/net/phonet/datagram.c
+++ b/net/phonet/datagram.c
@@ -150,7 +150,7 @@ static int pn_recvmsg(struct kiocb *iocb, struct sock *sk,
 		copylen = len;
 	}
 
-	rval = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copylen);
+	rval = skb_copy_datagram_msg(skb, 0, msg, copylen);
 	if (rval) {
 		rval = -EFAULT;
 		goto out;
diff --git a/net/phonet/pep.c b/net/phonet/pep.c
index 70a547e..44b2123 100644
--- a/net/phonet/pep.c
+++ b/net/phonet/pep.c
@@ -1296,7 +1296,7 @@ copy:
 	else
 		len = skb->len;
 
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, len);
+	err = skb_copy_datagram_msg(skb, 0, msg, len);
 	if (!err)
 		err = (flags & MSG_TRUNC) ? skb->len : len;
 
diff --git a/net/rose/af_rose.c b/net/rose/af_rose.c
index a85c1a0..9b600c2 100644
--- a/net/rose/af_rose.c
+++ b/net/rose/af_rose.c
@@ -1249,7 +1249,7 @@ static int rose_recvmsg(struct kiocb *iocb, struct socket *sock,
 		msg->msg_flags |= MSG_TRUNC;
 	}
 
-	skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	skb_copy_datagram_msg(skb, 0, msg, copied);
 
 	if (msg->msg_name) {
 		struct sockaddr_rose *srose;
diff --git a/net/rxrpc/ar-recvmsg.c b/net/rxrpc/ar-recvmsg.c
index e9aaa65..4575485 100644
--- a/net/rxrpc/ar-recvmsg.c
+++ b/net/rxrpc/ar-recvmsg.c
@@ -180,7 +180,7 @@ int rxrpc_recvmsg(struct kiocb *iocb, struct socket *sock,
 		if (copy > len - copied)
 			copy = len - copied;
 
-		ret = skb_copy_datagram_iovec(skb, offset, msg->msg_iov, copy);
+		ret = skb_copy_datagram_msg(skb, offset, msg, copy);
 
 		if (ret < 0)
 			goto copy_error;
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 634a2ab..2120292 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -2095,7 +2095,7 @@ static int sctp_recvmsg(struct kiocb *iocb, struct sock *sk,
 	if (copied > len)
 		copied = len;
 
-	err = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	err = skb_copy_datagram_msg(skb, 0, msg, copied);
 
 	event = sctp_skb2event(skb);
 
diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index ad8a1a1..591bbfa 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -1372,8 +1372,7 @@ restart:
 			sz = buf_len;
 			m->msg_flags |= MSG_TRUNC;
 		}
-		res = skb_copy_datagram_iovec(buf, msg_hdr_sz(msg),
-					      m->msg_iov, sz);
+		res = skb_copy_datagram_msg(buf, msg_hdr_sz(msg), m, sz);
 		if (res)
 			goto exit;
 		res = sz;
@@ -1473,8 +1472,8 @@ restart:
 		needed = (buf_len - sz_copied);
 		sz_to_copy = (sz <= needed) ? sz : needed;
 
-		res = skb_copy_datagram_iovec(buf, msg_hdr_sz(msg) + offset,
-					      m->msg_iov, sz_to_copy);
+		res = skb_copy_datagram_msg(buf, msg_hdr_sz(msg) + offset,
+					    m, sz_to_copy);
 		if (res)
 			goto exit;
 
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index e968843..5eee625 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1825,7 +1825,7 @@ static int unix_dgram_recvmsg(struct kiocb *iocb, struct socket *sock,
 	else if (size < skb->len - skip)
 		msg->msg_flags |= MSG_TRUNC;
 
-	err = skb_copy_datagram_iovec(skb, skip, msg->msg_iov, size);
+	err = skb_copy_datagram_msg(skb, skip, msg, size);
 	if (err)
 		goto out_free;
 
@@ -2030,8 +2030,8 @@ again:
 		}
 
 		chunk = min_t(unsigned int, unix_skb_len(skb) - skip, size);
-		if (skb_copy_datagram_iovec(skb, UNIXCB(skb).consumed + skip,
-					    msg->msg_iov, chunk)) {
+		if (skb_copy_datagram_msg(skb, UNIXCB(skb).consumed + skip,
+					  msg, chunk)) {
 			if (copied == 0)
 				copied = -EFAULT;
 			break;
diff --git a/net/vmw_vsock/vmci_transport.c b/net/vmw_vsock/vmci_transport.c
index 9bb63ff..a57ddef 100644
--- a/net/vmw_vsock/vmci_transport.c
+++ b/net/vmw_vsock/vmci_transport.c
@@ -1773,8 +1773,7 @@ static int vmci_transport_dgram_dequeue(struct kiocb *kiocb,
 	}
 
 	/* Place the datagram payload in the user's iovec. */
-	err = skb_copy_datagram_iovec(skb, sizeof(*dg), msg->msg_iov,
-		payload_len);
+	err = skb_copy_datagram_msg(skb, sizeof(*dg), msg, payload_len);
 	if (err)
 		goto out;
 
diff --git a/net/x25/af_x25.c b/net/x25/af_x25.c
index 5ad4418..59e785b 100644
--- a/net/x25/af_x25.c
+++ b/net/x25/af_x25.c
@@ -1335,7 +1335,7 @@ static int x25_recvmsg(struct kiocb *iocb, struct socket *sock,
 	/* Currently, each datagram always contains a complete record */
 	msg->msg_flags |= MSG_EOR;
 
-	rc = skb_copy_datagram_iovec(skb, 0, msg->msg_iov, copied);
+	rc = skb_copy_datagram_msg(skb, 0, msg, copied);
 	if (rc)
 		goto out_free_dgram;
 
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
David Miller Nov. 5, 2014, 8:51 p.m. UTC | #14
From: Al Viro <viro@ZenIV.linux.org.uk>
Date: Wed, 5 Nov 2014 04:12:32 +0000

> On Wed, Nov 05, 2014 at 03:55:36AM +0000, Al Viro wrote:
>> What do you think of the trick with user_msghdr, BTW?
> 
> 	PS: where do you prefer the branches to be based off?
> git://git.kernel.org/pub/scm/linux/kernel/git/davem/net#master, mainline,
> something else?  I can certainly do that as patches over email, the
> question is what's best used as base...  FWIW, the analysis I've posted
> was in 3.18-rc3 and it looks like it ought to be valid in net#master
> as well.

Let's work against net-next, ie:

git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next#master

I can integrate, your, mine, and Herbert's changes all into the same
place.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Al Viro Nov. 5, 2014, 9:07 p.m. UTC | #15
On Wed, Nov 05, 2014 at 03:50:54PM -0500, David Miller wrote:
> I think we can get away with it if, as you say, we don't export a 'msghdr'
> from any uapi headers.

OK.  I'm about halfway through the review of struct msghdr instances in the
current tree right now, will post user_msghdr patch once I'm done.  Already
found a dumb bug in o2net_send_tcp_msg() while doing that - broken by me
back in 3.15 ;-/  Will send a fix to Linus in an hour or so...

> And indeed, double checking, it's purely a linux/socket.h thing.
> 
> If this patch is OK, mind if I toss it into net-next Al?

Sure, no problem - AFAICS, the only real difference from rebase of April one
I've quoted upthread is that you add include of socket.h into skbuff.h; the
rest of the differences is pure whitespace noise.

Ping me when you put it there, OK?  I'll rebase the rest of old stuff on
top of it (similar helpers, mostly).
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
David Miller Nov. 5, 2014, 9:57 p.m. UTC | #16
From: Al Viro <viro@ZenIV.linux.org.uk>
Date: Wed, 5 Nov 2014 21:07:45 +0000

> Ping me when you put it there, OK?  I'll rebase the rest of old stuff on
> top of it (similar helpers, mostly).

I just pushed it into net-next, thanks Al.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Al Viro Nov. 6, 2014, 3:25 a.m. UTC | #17
On Wed, Nov 05, 2014 at 04:57:19PM -0500, David Miller wrote:
> From: Al Viro <viro@ZenIV.linux.org.uk>
> Date: Wed, 5 Nov 2014 21:07:45 +0000
> 
> > Ping me when you put it there, OK?  I'll rebase the rest of old stuff on
> > top of it (similar helpers, mostly).
> 
> I just pushed it into net-next, thanks Al.

OK, I've taken the beginning of the old queue on top of net-next; it's
in git://git.kernel.org//pub/scm/linux/kernel/git/viro/vfs.git iov_iter-net.

>From the quick look at the remaining ->msg_iov users:

	* I'll need to add several iov_iter primitives - counterparts of
checksum.h stuff (copy_and_csum_{from,to}_iter(), maybe some more).  Not
a big deal, I'll do that tomorrow.  That will give us a clean iov_iter-based
counterpart of skb_copy_and_csum_datagram_iovec().

	* a new helper: zerocopy_sg_from_iter().  I have it, actually,
but I'd rather not step on Herbert's toes - it's too close to the areas
his series will touch, so that's probably for when his series goes in.
It will be needed for complete macvtap conversion...

	* why doesn't verify_iovec() use rw_copy_check_uvector()?  The only
real differences I see is that (a) you do allocation in callers (same as
rw_copy_check_uvector() would've done), (b) you return EMSGSIZE in case of
too long vector, while rw_copy_check_uvector() returns EINVAL in that case
and (c) you don't do access_ok().  The last one is described as optimization,
but for iov_iter primitives it's a serious PITA - for iovec-backed instances
they are using __copy_from_user()/__copy_to_user(), etc.
	It certainly would be nice to have the same code doing all copying
of iovecs from userland - readv/writev/aio/sendmsg/recvmsg/etc.  Am I
missing something subtle semantical difference in there?  EMSGSIZE vs EINVAL
is trivial (we can lift that check into the callers, if nothing else), but
I could miss something more interesting...

	* various getfrag will need to grow iov_iter-based counterparts,
but ip_append_output() needs no changes, AFAICS.

	* crypto stuff will be easy to convert - iov_iter_get_pages()
would suffice for a primitive

	* there's some really weird stuff in there.  Just what is this
static int raw_probe_proto_opt(struct flowi4 *fl4, struct msghdr *msg)
{
        struct iovec *iov;
        u8 __user *type = NULL;
        u8 __user *code = NULL;
        int probed = 0;
        unsigned int i;

        if (!msg->msg_iov)
                return 0;

        for (i = 0; i < msg->msg_iovlen; i++) {
                iov = &msg->msg_iov[i];
                if (!iov)
                        continue;
trying to do?  "If non-NULL pointer + i somehow happened to be NULL, skip it
and try to use the same pointer + i + 1"?  Huh?  Had been that way since
the function first went in back in 2004 ("[IPV4] XFRM: probe icmp type/code
when sending packets via raw socket.", according to historical tree)...

	* rds, bluetooth and vsock are doing something odd; need to RTFS some
more.

	* not sure I understand what TIPC is doing - does it prohibit too
short first segment of ->msg_iov?  net/tipc/socket.c:dest_name_check() looks
odd _and_ potentially racy - we read the same data twice and hope our checks
still apply.  I asked TIPC folks about that race back in April, but it
looks like that fell through the cracks...

Overall, so far it looks more or less feasible - other than the missing csum
primitives, current mm/iov_iter.c should suffice.  I have _not_ seriously
looked into sendpage yet; that might very well require some more.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Herbert Xu Nov. 6, 2014, 8:23 a.m. UTC | #18
On Wed, Nov 05, 2014 at 03:24:10PM -0500, David Miller wrote:
> 
> Herbert, please provide a cover letter for this series, and the most recent
> version of patch #2 gets various rejects when I try to apply it to net-next.

Sure, I'll regenerate them.  However, while doing so I noticed that
a number of my patches on tun/macvtap that you have previously set
as accepted are missing from net-next.  Could this be why you got
the rejects?

For example, this patch wasn't in net-next when I just did a pull.

https://patchwork.ozlabs.org/patch/405966/

Cheers,
Jon Maloy Nov. 6, 2014, 9:50 a.m. UTC | #19
> -----Original Message-----
> From: netdev-owner@vger.kernel.org [mailto:netdev-
> owner@vger.kernel.org] On Behalf Of Al Viro
> Sent: November-06-14 4:26 AM
> To: David Miller
> Cc: herbert@gondor.apana.org.au; netdev@vger.kernel.org; linux-
> kernel@vger.kernel.org; bcrl@kvack.org
> Subject: Re: [PATCH 1/4] inet: Add skb_copy_datagram_iter
> 
> On Wed, Nov 05, 2014 at 04:57:19PM -0500, David Miller wrote:
> > From: Al Viro <viro@ZenIV.linux.org.uk>
> > Date: Wed, 5 Nov 2014 21:07:45 +0000
> >
> > > Ping me when you put it there, OK?  I'll rebase the rest of old
> > > stuff on top of it (similar helpers, mostly).
> >
> > I just pushed it into net-next, thanks Al.
> 
> OK, I've taken the beginning of the old queue on top of net-next; it's in
> git://git.kernel.org//pub/scm/linux/kernel/git/viro/vfs.git iov_iter-net.
> 
> From the quick look at the remaining ->msg_iov users:
> 
> 	* I'll need to add several iov_iter primitives - counterparts of
> checksum.h stuff (copy_and_csum_{from,to}_iter(), maybe some more).
> Not a big deal, I'll do that tomorrow.  That will give us a clean iov_iter-based
> counterpart of skb_copy_and_csum_datagram_iovec().
> 
> 	* a new helper: zerocopy_sg_from_iter().  I have it, actually, but I'd
> rather not step on Herbert's toes - it's too close to the areas his series will
> touch, so that's probably for when his series goes in.
> It will be needed for complete macvtap conversion...
> 
> 	* why doesn't verify_iovec() use rw_copy_check_uvector()?  The
> only real differences I see is that (a) you do allocation in callers (same as
> rw_copy_check_uvector() would've done), (b) you return EMSGSIZE in case
> of too long vector, while rw_copy_check_uvector() returns EINVAL in that
> case and (c) you don't do access_ok().  The last one is described as
> optimization, but for iov_iter primitives it's a serious PITA - for iovec-backed
> instances they are using __copy_from_user()/__copy_to_user(), etc.
> 	It certainly would be nice to have the same code doing all copying of
> iovecs from userland - readv/writev/aio/sendmsg/recvmsg/etc.  Am I
> missing something subtle semantical difference in there?  EMSGSIZE vs
> EINVAL is trivial (we can lift that check into the callers, if nothing else), but I
> could miss something more interesting...
> 
> 	* various getfrag will need to grow iov_iter-based counterparts, but
> ip_append_output() needs no changes, AFAICS.
> 
> 	* crypto stuff will be easy to convert - iov_iter_get_pages() would
> suffice for a primitive
> 
> 	* there's some really weird stuff in there.  Just what is this static int
> raw_probe_proto_opt(struct flowi4 *fl4, struct msghdr *msg) {
>         struct iovec *iov;
>         u8 __user *type = NULL;
>         u8 __user *code = NULL;
>         int probed = 0;
>         unsigned int i;
> 
>         if (!msg->msg_iov)
>                 return 0;
> 
>         for (i = 0; i < msg->msg_iovlen; i++) {
>                 iov = &msg->msg_iov[i];
>                 if (!iov)
>                         continue;
> trying to do?  "If non-NULL pointer + i somehow happened to be NULL, skip it
> and try to use the same pointer + i + 1"?  Huh?  Had been that way since the
> function first went in back in 2004 ("[IPV4] XFRM: probe icmp type/code
> when sending packets via raw socket.", according to historical tree)...
> 
> 	* rds, bluetooth and vsock are doing something odd; need to RTFS
> some more.
> 
> 	* not sure I understand what TIPC is doing - does it prohibit too short
> first segment of ->msg_iov?  

Yes, that is the purpose. It was needed in early versions of TIPC, which was
using TIPC itself, instead of netlink, as carrier of configuration commands.
This option is long gone, and we can safely remove those checks.  I'll
post a patch.

///jon

net/tipc/socket.c:dest_name_check() looks
> odd _and_ potentially racy - we read the same data twice and hope our
> checks still apply.  I asked TIPC folks about that race back in April, but it looks
> like that fell through the cracks...
> 
> Overall, so far it looks more or less feasible - other than the missing csum
> primitives, current mm/iov_iter.c should suffice.  I have _not_ seriously
> looked into sendpage yet; that might very well require some more.
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in the body
> of a message to majordomo@vger.kernel.org More majordomo info at
> http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
David Miller Nov. 6, 2014, 5:25 p.m. UTC | #20
From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Thu, 6 Nov 2014 16:23:38 +0800

> On Wed, Nov 05, 2014 at 03:24:10PM -0500, David Miller wrote:
>> 
>> Herbert, please provide a cover letter for this series, and the most recent
>> version of patch #2 gets various rejects when I try to apply it to net-next.
> 
> Sure, I'll regenerate them.  However, while doing so I noticed that
> a number of my patches on tun/macvtap that you have previously set
> as accepted are missing from net-next.  Could this be why you got
> the rejects?

Those were bug fixes so went into plain 'net', they will show up next
time I do a merge and I will deal with the conflicts, if any.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Herbert Xu Nov. 7, 2014, 1:59 a.m. UTC | #21
On Thu, Nov 06, 2014 at 12:25:00PM -0500, David Miller wrote:
> From: Herbert Xu <herbert@gondor.apana.org.au>
> Date: Thu, 6 Nov 2014 16:23:38 +0800
> 
> > On Wed, Nov 05, 2014 at 03:24:10PM -0500, David Miller wrote:
> >> 
> >> Herbert, please provide a cover letter for this series, and the most recent
> >> version of patch #2 gets various rejects when I try to apply it to net-next.
> > 
> > Sure, I'll regenerate them.  However, while doing so I noticed that
> > a number of my patches on tun/macvtap that you have previously set
> > as accepted are missing from net-next.  Could this be why you got
> > the rejects?
> 
> Those were bug fixes so went into plain 'net', they will show up next
> time I do a merge and I will deal with the conflicts, if any.

I see.  In that case it might be best to wait until those fixes hit
net-next first before applying these patches as otherwise Stephen will
get hit with some nasty merge conflicts.

I'll repost them for RFC with the problems that Al pointed out fixed
in the mean time.

Cheers,
David Miller Nov. 7, 2014, 3:13 a.m. UTC | #22
From: Herbert Xu <herbert@gondor.apana.org.au>
Date: Fri, 7 Nov 2014 09:59:44 +0800

> In that case it might be best to wait until those fixes hit net-next
> first before applying these patches as otherwise Stephen will get
> hit with some nasty merge conflicts.

I just merged net into net-next, so this barrier should no longer
be present.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Herbert Xu Nov. 7, 2014, 1:21 p.m. UTC | #23
Hi Dave:

This patch series adds the helper skb_copy_datagram_iter, which
is meant to replace both skb_copy_datagram_iovec and its evil
twin skb_copy_datagram_const_iovec.

It then converts tun and macvtap over to the new helper and finally
removes skb_copy_datagram_const_iovec which is only used by tun
and macvtap.

The copy_to_iter return value issue pointed out by Al has now been
fixed.

Cheers,
David Miller Nov. 7, 2014, 9:48 p.m. UTC | #24
From: Al Viro <viro@ZenIV.linux.org.uk>
Date: Thu, 6 Nov 2014 03:25:34 +0000

> OK, I've taken the beginning of the old queue on top of net-next; it's
> in git://git.kernel.org//pub/scm/linux/kernel/git/viro/vfs.git iov_iter-net.

What I see in there looks good.   I wonder if we can somehow make msghdr
pointer args const... but this is not so important now.

Some minor coding style nits, comments:

	/* Like
	 * this.
	 */

and for multi-line function calls in the networking, align the second
and subsequent lines at the first column after the openning parenthesis
of the first line.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
David Miller Nov. 7, 2014, 9:52 p.m. UTC | #25
From: Al Viro <viro@ZenIV.linux.org.uk>
Date: Thu, 6 Nov 2014 03:25:34 +0000

> 	* a new helper: zerocopy_sg_from_iter().  I have it, actually,
> but I'd rather not step on Herbert's toes - it's too close to the areas
> his series will touch, so that's probably for when his series goes in.
> It will be needed for complete macvtap conversion...

Just a heads up, his series is applied to net-next.

> 	* why doesn't verify_iovec() use rw_copy_check_uvector()?  The only
> real differences I see is that (a) you do allocation in callers (same as
> rw_copy_check_uvector() would've done), (b) you return EMSGSIZE in case of
> too long vector, while rw_copy_check_uvector() returns EINVAL in that case
> and (c) you don't do access_ok().  The last one is described as optimization,
> but for iov_iter primitives it's a serious PITA - for iovec-backed instances
> they are using __copy_from_user()/__copy_to_user(), etc.

The answer is that nobody knew abuot it and looked, that's why.

> 	It certainly would be nice to have the same code doing all copying
> of iovecs from userland - readv/writev/aio/sendmsg/recvmsg/etc.  Am I
> missing something subtle semantical difference in there?  EMSGSIZE vs EINVAL
> is trivial (we can lift that check into the callers, if nothing else), but
> I could miss something more interesting...

We also need compat counterparts.

> 	* various getfrag will need to grow iov_iter-based counterparts,
> but ip_append_output() needs no changes, AFAICS.

Right.

> 	* there's some really weird stuff in there.  Just what is this
> static int raw_probe_proto_opt(struct flowi4 *fl4, struct msghdr *msg)
> {
>         struct iovec *iov;
>         u8 __user *type = NULL;
>         u8 __user *code = NULL;
>         int probed = 0;
>         unsigned int i;
> 
>         if (!msg->msg_iov)
>                 return 0;
> 
>         for (i = 0; i < msg->msg_iovlen; i++) {
>                 iov = &msg->msg_iov[i];
>                 if (!iov)
>                         continue;
> trying to do?  "If non-NULL pointer + i somehow happened to be NULL, skip it
> and try to use the same pointer + i + 1"?  Huh?  Had been that way since
> the function first went in back in 2004 ("[IPV4] XFRM: probe icmp type/code
> when sending packets via raw socket.", according to historical tree)...

This is probably just bogus, because this address-of will never evaluate to
NULL.

> 	* rds, bluetooth and vsock are doing something odd; need to RTFS some
> more.

It is not surprising.... :-/
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Al Viro Nov. 7, 2014, 10:11 p.m. UTC | #26
On Fri, Nov 07, 2014 at 04:48:59PM -0500, David Miller wrote:
> From: Al Viro <viro@ZenIV.linux.org.uk>
> Date: Thu, 6 Nov 2014 03:25:34 +0000
> 
> > OK, I've taken the beginning of the old queue on top of net-next; it's
> > in git://git.kernel.org//pub/scm/linux/kernel/git/viro/vfs.git iov_iter-net.
> 
> What I see in there looks good.   I wonder if we can somehow make msghdr
> pointer args const... but this is not so important now.
> 
> Some minor coding style nits, comments:
> 
> 	/* Like
> 	 * this.
> 	 */
> 
> and for multi-line function calls in the networking, align the second
> and subsequent lines at the first column after the openning parenthesis
> of the first line.

OK...  I played with csum side of things a bit, and I've noticed something
really nasty - iov_iter primitives all assume that iovec has been validated,
_including_ access_ok() on all ranges.  That allows us to use __copy_...()
in primitives, and on the read/write/readv/writev/aio side of things we have
that validation done when we copy iovec from userland (or set a single-element
iovec over the userland-supplied range, as in read(2)/write(2)).

net/* primitives, OTOH, do access_ok() themselves - syscalls do _not_ check
access_ok() on iovec from untrusted source and rely on the low-level stuff
to do such checks.

Result: regular IO syscalls on sockets (i.e. not recvmsg/sendmsg, usual
read/write) do these checks (at least) twice and use of copy_from_iter()
in ->recvmsg() opens quite a nasty hole - one can call recvmsg(2) with
kernel address in ->msg_iov[0].base and have such an instance of ->recvmsg()
stomp on the kernel memory.  At the very least, with Herbert's patches
we need to validate that somewhere on the way to tun and macvtap recvmsg
instances.  We can do that right there, and as a stopgap measure it might
be a good idea.  However, it's not a sane long-term solution.

We could, of course, add those access_ok() in mm/iov_iter.c and drop them
from fs/read_write.c and fs/aio.c, but I really don't see the point - why
not do that along with the checks we do in verify_iovec() anyway?  And drop
them from memcpy_fromiovec() et.al.

I'm looking through the tree right now; so far it looks like we can just
move those suckers to the point where we validate iovec and lose them
from low-level iovec and csum copying completely.  I still haven't finished
tracing all possible paths for address to arrive at the points where we
currently check that stuff, but so far it looks very doable.

Comments?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Al Viro Nov. 7, 2014, 10:31 p.m. UTC | #27
On Fri, Nov 07, 2014 at 10:11:14PM +0000, Al Viro wrote:

> I'm looking through the tree right now; so far it looks like we can just
> move those suckers to the point where we validate iovec and lose them
> from low-level iovec and csum copying completely.  I still haven't finished
> tracing all possible paths for address to arrive at the points where we
> currently check that stuff, but so far it looks very doable.

BTW, csum side of that is also chock-full of duplicate access_ok() -
e.g. generic csum_and_copy_from_user() checks before calling
csum_partial_copy_from_user().  And generic instance of that is using
__copy_from_user(), all right, but a _lot_ of non-default instances
repeat that access_ok().
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Al Viro Nov. 7, 2014, 10:35 p.m. UTC | #28
On Fri, Nov 07, 2014 at 10:31:53PM +0000, Al Viro wrote:
> On Fri, Nov 07, 2014 at 10:11:14PM +0000, Al Viro wrote:
> 
> > I'm looking through the tree right now; so far it looks like we can just
> > move those suckers to the point where we validate iovec and lose them
> > from low-level iovec and csum copying completely.  I still haven't finished
> > tracing all possible paths for address to arrive at the points where we
> > currently check that stuff, but so far it looks very doable.
> 
> BTW, csum side of that is also chock-full of duplicate access_ok() -
> e.g. generic csum_and_copy_from_user() checks before calling
> csum_partial_copy_from_user().  And generic instance of that is using
> __copy_from_user(), all right, but a _lot_ of non-default instances
> repeat that access_ok().

While we are at it: here's the default csum_and_copy_to_user()
static __inline__ __wsum csum_and_copy_to_user
(const void *src, void __user *dst, int len, __wsum sum, int *err_ptr)
{
        sum = csum_partial(src, len, sum);

        if (access_ok(VERIFY_WRITE, dst, len)) {
                if (copy_to_user(dst, src, len) == 0)
                        return sum;
        }
        if (len)
                *err_ptr = -EFAULT;

        return (__force __wsum)-1; /* invalid checksum */
}

Note that we do that access_ok() and follow it with copy_to_user() on exact
same range, i.e. repeat the same damn check...
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Al Viro Nov. 7, 2014, 11:42 p.m. UTC | #29
On Fri, Nov 07, 2014 at 10:11:14PM +0000, Al Viro wrote:

> I'm looking through the tree right now; so far it looks like we can just
> move those suckers to the point where we validate iovec and lose them
> from low-level iovec and csum copying completely.  I still haven't finished
> tracing all possible paths for address to arrive at the points where we
> currently check that stuff, but so far it looks very doable.

Definitely doable.  The only remaining interesting part is drivers/vhost
with the stuff it puts in vq->iov[].  If we can guarantee that it satisfies
the sanity checks (access_ok() and size-related ones), we are done -
making verify_iovec() use rw_copy_check_uvector() (and verify_compat_iov()
use compat_rw_copy_check_uvector()) will suffice to guarantee that none of
	csum_partial_copy_fromiovecend
	memcpy_fromiovec
	memcpy_toiovec
	memcpy_toiovecend
	memcpy_fromiovecend
	skb_copy_datagram_iovec
	skb_copy_datagram_iter
	skb_copy_datagram_from_iter
	zerocopy_sg_from_iter
	skb_copy_and_csum_datagram
	skb_copy_and_csum_datagram_iovec
	csum_and_copy_from_user
	csum_and_copy_to_user
	csum_partial_copy_from_user
will ever see an address that doesn't satisfy access_ok() checks.  And
having looked at the data flow...  we definitely want to do those checks
on intake of iovec - as it is, we usually repeat them quite a few times
for the same iovec segment, and we practically never end up _not_ doing them
for some segment of iovec, unless we hit a failure exit before we get around
to copying any data at all.

I'll finish RTFS drivers/vhost and if it turns out to be OK I'll post the
series moving those checks to the moment of copying iovec from userland,
so that kernel-side we could always rely on ->msg_iov elements having been
verified.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Herbert Xu Nov. 8, 2014, 2:21 a.m. UTC | #30
On Fri, Nov 07, 2014 at 11:42:53PM +0000, Al Viro wrote:
> I'll finish RTFS drivers/vhost and if it turns out to be OK I'll post the
> series moving those checks to the moment of copying iovec from userland,
> so that kernel-side we could always rely on ->msg_iov elements having been
> verified.

Great thanks!

Dave, please hold off on my skb_copy_datagram_iter series until
this verify_iovec change is added.

Cheers,
Al Viro Nov. 9, 2014, 9:19 p.m. UTC | #31
[Michael Cc'd]

On Fri, Nov 07, 2014 at 11:42:53PM +0000, Al Viro wrote:

> I'll finish RTFS drivers/vhost and if it turns out to be OK I'll post the
> series moving those checks to the moment of copying iovec from userland,
> so that kernel-side we could always rely on ->msg_iov elements having been
> verified.

Two questions:
1) does sparc64 access_ok() need to differ for 32bit and 64bit tasks?
AFAICS, x86 and ppc just check that address is OK for 64bit process -
if a 32bit process passes the kernel an address that would be valid
for 64bit process, but not for 32bit one, we just get a pagefault in
__copy_from_user() and friends.  No kernel objects are going to have
a virtual address in that range, so access_ok() doesn't bother preventing
such access attempts there...

2) shouldn't vhost_dev_cleanup() stop the worker thread before doing anything
else?  AFAICS, we do parts of vhost_dev teardown while the thread is
still running; granted, we keep dev->mm pinned down until after it stops
(or we would be _really_ screwed), but is it safe to do all those fput()s, etc.
while it's still running?  Michael?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
David Miller Nov. 10, 2014, 5:20 a.m. UTC | #32
From: Al Viro <viro@ZenIV.linux.org.uk>
Date: Sun, 9 Nov 2014 21:19:08 +0000

> 1) does sparc64 access_ok() need to differ for 32bit and 64bit tasks?

sparc64 will just fault no matter what kind of task it is.

It is impossible for a user task to generate a reference to
a kernel virtual address, as kernel and user accesses each
go via a separate address space identifier.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Al Viro Nov. 10, 2014, 6:58 a.m. UTC | #33
On Mon, Nov 10, 2014 at 12:20:20AM -0500, David Miller wrote:
> From: Al Viro <viro@ZenIV.linux.org.uk>
> Date: Sun, 9 Nov 2014 21:19:08 +0000
> 
> > 1) does sparc64 access_ok() need to differ for 32bit and 64bit tasks?
> 
> sparc64 will just fault no matter what kind of task it is.
> 
> It is impossible for a user task to generate a reference to
> a kernel virtual address, as kernel and user accesses each
> go via a separate address space identifier.

Sure, but why do we have access_ok() there at all?  I.e. why not just have
it constant 1?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
David Miller Nov. 10, 2014, 7:30 a.m. UTC | #34
From: Al Viro <viro@ZenIV.linux.org.uk>
Date: Mon, 10 Nov 2014 06:58:17 +0000

> On Mon, Nov 10, 2014 at 12:20:20AM -0500, David Miller wrote:
>> From: Al Viro <viro@ZenIV.linux.org.uk>
>> Date: Sun, 9 Nov 2014 21:19:08 +0000
>> 
>> > 1) does sparc64 access_ok() need to differ for 32bit and 64bit tasks?
>> 
>> sparc64 will just fault no matter what kind of task it is.
>> 
>> It is impossible for a user task to generate a reference to
>> a kernel virtual address, as kernel and user accesses each
>> go via a separate address space identifier.
> 
> Sure, but why do we have access_ok() there at all?  I.e. why not just have
> it constant 1?

Since access_ok() is in fact constant 1 on sparc64, where we use it,
does it really matter?
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Al Viro Nov. 10, 2014, 9:09 a.m. UTC | #35
On Mon, Nov 10, 2014 at 02:30:00AM -0500, David Miller wrote:
> From: Al Viro <viro@ZenIV.linux.org.uk>
> Date: Mon, 10 Nov 2014 06:58:17 +0000
> 
> > On Mon, Nov 10, 2014 at 12:20:20AM -0500, David Miller wrote:
> >> From: Al Viro <viro@ZenIV.linux.org.uk>
> >> Date: Sun, 9 Nov 2014 21:19:08 +0000
> >> 
> >> > 1) does sparc64 access_ok() need to differ for 32bit and 64bit tasks?
> >> 
> >> sparc64 will just fault no matter what kind of task it is.
> >> 
> >> It is impossible for a user task to generate a reference to
> >> a kernel virtual address, as kernel and user accesses each
> >> go via a separate address space identifier.
> > 
> > Sure, but why do we have access_ok() there at all?  I.e. why not just have
> > it constant 1?
> 
> Since access_ok() is in fact constant 1 on sparc64, where we use it,
> does it really matter?

*blink*

My apologies - I've got confused by the maze of twisty includes, all alike.
Right you are; in biarch case it *doesn't* depend on 32bit vs. 64bit.
STACK_TOP-using one is sparc32 variant where we obviously don't have
biarch at all.

Anyway, the series switching to {compat_,}rw_copy_check_uvector() and
getting rid of duplicate checks is in vfs.git#iov_iter-net.  Warning:
it's almost completely untested.  It survives boot, ssh into it and
runltp -f syscalls (no regressions), but that's about it.  BTW, what's
the usual regression suite used for net/* stuff?

3 commits in there, on top of net-next#master; head should be at 555126.
There's a bunch of fairly obvious followups becoming possible after that,
but let's keep those separate...
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
Michael S. Tsirkin Nov. 10, 2014, 10:14 a.m. UTC | #36
On Sun, Nov 09, 2014 at 09:19:08PM +0000, Al Viro wrote:
> [Michael Cc'd]
> 
> On Fri, Nov 07, 2014 at 11:42:53PM +0000, Al Viro wrote:
> 
> > I'll finish RTFS drivers/vhost and if it turns out to be OK I'll post the
> > series moving those checks to the moment of copying iovec from userland,
> > so that kernel-side we could always rely on ->msg_iov elements having been
> > verified.
> 
> Two questions:
> 1) does sparc64 access_ok() need to differ for 32bit and 64bit tasks?
> AFAICS, x86 and ppc just check that address is OK for 64bit process -
> if a 32bit process passes the kernel an address that would be valid
> for 64bit process, but not for 32bit one, we just get a pagefault in
> __copy_from_user() and friends.  No kernel objects are going to have
> a virtual address in that range, so access_ok() doesn't bother preventing
> such access attempts there...
> 
> 2) shouldn't vhost_dev_cleanup() stop the worker thread before doing anything
> else?
>  AFAICS, we do parts of vhost_dev teardown while the thread is
> still running; granted, we keep dev->mm pinned down until after it stops
> (or we would be _really_ screwed), but is it safe to do all those fput()s, etc.
> while it's still running?  Michael?

Before invoking vhost_dev_cleanup,
the caller for vhost-net (vhost_net_release) sets private data to NULL
(using vhost_net_stop_vq) which guarantees thread will do nothing at all.
vhost scsi does it in vhost_scsi_clear_endpoint.
David Miller Nov. 10, 2014, 4:18 p.m. UTC | #37
From: Al Viro <viro@ZenIV.linux.org.uk>
Date: Mon, 10 Nov 2014 09:09:55 +0000

> BTW, what's the usual regression suite used for net/* stuff?

There's a regression suite? :-)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Patch
diff mbox series

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 6c8b6f6..5ff7054 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -148,6 +148,7 @@ 
 struct net_device;
 struct scatterlist;
 struct pipe_inode_info;
+struct iov_iter;
 
 #if defined(CONFIG_NF_CONNTRACK) || defined(CONFIG_NF_CONNTRACK_MODULE)
 struct nf_conntrack {
@@ -2641,6 +2642,8 @@  int zerocopy_sg_from_iovec(struct sk_buff *skb, const struct iovec *frm,
 int skb_copy_datagram_const_iovec(const struct sk_buff *from, int offset,
 				  const struct iovec *to, int to_offset,
 				  int size);
+int skb_copy_datagram_iter(const struct sk_buff *from, int offset,
+			   struct iov_iter *to, int size);
 void skb_free_datagram(struct sock *sk, struct sk_buff *skb);
 void skb_free_datagram_locked(struct sock *sk, struct sk_buff *skb);
 int skb_kill_datagram(struct sock *sk, struct sk_buff *skb, unsigned int flags);
diff --git a/net/core/datagram.c b/net/core/datagram.c
index fdbc9a8..45a9d4d 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -49,6 +49,7 @@ 
 #include <linux/spinlock.h>
 #include <linux/slab.h>
 #include <linux/pagemap.h>
+#include <linux/uio.h>
 
 #include <net/protocol.h>
 #include <linux/skbuff.h>
@@ -482,6 +483,87 @@  fault:
 EXPORT_SYMBOL(skb_copy_datagram_const_iovec);
 
 /**
+ *	skb_copy_datagram_iter - Copy a datagram to an iovec iterator.
+ *	@skb: buffer to copy
+ *	@offset: offset in the buffer to start copying from
+ *	@to: iovec iterator to copy to
+ *	@len: amount of data to copy from buffer to iovec
+ */
+int skb_copy_datagram_iter(const struct sk_buff *skb, int offset,
+			   struct iov_iter *to, int len)
+{
+	int start = skb_headlen(skb);
+	int i, copy = start - offset;
+	struct sk_buff *frag_iter;
+
+	trace_skb_copy_datagram_iovec(skb, len);
+
+	/* Copy header. */
+	if (copy > 0) {
+		if (copy > len)
+			copy = len;
+		if (copy_to_iter(skb->data + offset, copy, to))
+			goto fault;
+		if ((len -= copy) == 0)
+			return 0;
+		offset += copy;
+	}
+
+	/* Copy paged appendix. Hmm... why does this look so complicated? */
+	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
+		int end;
+		const skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
+
+		WARN_ON(start > offset + len);
+
+		end = start + skb_frag_size(frag);
+		if ((copy = end - offset) > 0) {
+			int err;
+			u8  *vaddr;
+			struct page *page = skb_frag_page(frag);
+
+			if (copy > len)
+				copy = len;
+			vaddr = kmap(page);
+			err = copy_to_iter(vaddr + frag->page_offset +
+					   offset - start, copy, to);
+			kunmap(page);
+			if (err)
+				goto fault;
+			if (!(len -= copy))
+				return 0;
+			offset += copy;
+		}
+		start = end;
+	}
+
+	skb_walk_frags(skb, frag_iter) {
+		int end;
+
+		WARN_ON(start > offset + len);
+
+		end = start + frag_iter->len;
+		if ((copy = end - offset) > 0) {
+			if (copy > len)
+				copy = len;
+			if (skb_copy_datagram_iter(frag_iter, offset - start,
+						   to, copy))
+				goto fault;
+			if ((len -= copy) == 0)
+				return 0;
+			offset += copy;
+		}
+		start = end;
+	}
+	if (!len)
+		return 0;
+
+fault:
+	return -EFAULT;
+}
+EXPORT_SYMBOL(skb_copy_datagram_iter);
+
+/**
  *	skb_copy_datagram_from_iovec - Copy a datagram from an iovec.
  *	@skb: buffer to copy
  *	@offset: offset in the buffer to start copying to