All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC] situation with csum_and_copy_... API
@ 2014-11-18  8:47 Al Viro
  2014-11-18 19:40 ` [patches][RFC] " Al Viro
  2014-11-18 20:49 ` [RFC] " Linus Torvalds
  0 siblings, 2 replies; 133+ messages in thread
From: Al Viro @ 2014-11-18  8:47 UTC (permalink / raw)
  To: netdev; +Cc: Linus Torvalds, David Miller, linux-kernel

I've spent the last weekend crawling through the copy-and-calculate-csum
primitives on all architectures.  It came up during iov_iter work; below
are the results of that review and, in the end, several questions about
the short-term tactics of a change that needs to be done for iov_iter-net.

The API apparently consists of 3 functions - present and exported on all
architectures:
	1) csum_and_copy_from_user()
	2) csum_and_copy_to_user()
	3) csum_partial_copy_nocheck().
There are very few places _using_ those, all but one in net stack (the only
exception is in drivers/net).
The minimal implementations would be

__wsum csum_and_copy_from_user(const void __user *src, void *dst, int len,
			       __wsum sum, int *err_ptr)
{
	if (unlikely(copy_from_user(dst, src, len) < 0)) {
		*err_ptr = -EFAULT;
		return <whatever>;
	}
	return csum_partial(dst, len, sum);
}

__wsum csum_and_copy_to_user(const void *src, void __user *dst, int len,
			     __wsum sum, int *err_ptr)
{
	sum = csum_partial(src, len, sum);
	if (unlikely(copy_to_user(dst, src, len) < 0)) {
		*err_ptr = -EFAULT;
		return <whatever>;
	}
	return sum;
}

__wsum csum_partial_copy_nocheck(const void *src, void *dst, int len,
				 __wsum sum)
{
	memcpy(dst, src, len);
	return csum_partial(dst, len, sum);
}

Note that we are *not* guaranteed that *err_ptr is touched in case of success
- not all architectures zero it in that case.  Callers must (and do) zero it
before the call of csum_and_copy_{from,to}_user().  Furthermore, return value
in case of error is undefined.  On the "from" side result is really "anything
whatsoever", on the "to" side most of the architectures end up returning ~0U.
However, even that is not guaranteed - e.g. avr32 and xtensa might return 0
in that case.  All existing callers discard the return value in case of error -
most of them by jumping out of the scope of variable it's assigned to.  The
only exception is skb_copy_and_csum_datagram(); in that case return value is
discarded by caller of skb_copy_and_csum_datagram() itself (also by jumping
out of scope of the variable it had been put into).

Generally, error in copying from userland ends up zeroing the rest of
destination.  However, there are exceptions (which might be considered bugs)
*and* callers discard all the copied data in case of error anyway.  There
are several call chains:
	1) from skb_add_data(): calls skb_trim() immediately after the failure.
	2) from skb_do_copy_data_nocache() called from skb_add_data_nocache():
__skb_trim() in skb_add_data_nocache()
	3) from skb_do_copy_data_nocache() called from
skb_copy_to_page_nocache(): we do not increase skb->len until after successful
copying.
	4) from skb_copy_to_page(): we do not increase skb->len until after
successful copying.
	5) from csum_partial_copy_fromiovecend() from ip_generic_getfrag()
or ping_getfrag(): those are passed as getfrag callback to ip_make_skb(),
ip_append_data() or ip6_append_data().  ip_make_skb() and ip_append_data()
just pass it to __ip_append_data(), so we are left with two fairly similar
functions to analyse.
	* __ip_append_data() has 3 places where getfrag() is called directly and
one where it's passed to ip_ufo_append_data().  Direct ones either free skb, or
do __skb_trim() or do not increment skb->len on error.  ip_ufo_append_data()
passes it further to skb_append_datato_frags(), where we do not increment skb->len
and friends in case of getfrag() failure.
	* ip6_append_data() the situation is identical, with ip6_ufo_append_data()
in place of ip_ufo_append_data().

So for all in-tree users of that sucker we are guaranteed to discard the whole
thing in case of error and this zeroing the tail is pointless.

Note also that the order of src and dest is opposite to normal for memcpy-like
functions.  Compared to the rest of the ugliness it's trivial, but it's still
not nice.

IMO the calling conventions are atrocious.

And then there are architecture-specific warts.  A relatively minor one is that
csum_partial_copy_nocheck() has a slightly saner name on some architectures -
there it's called csum_partial_copy().  Of course, those architectures have
#define csum_partial_copy_nocheck csum_partial_copy...  Worse ones are related
to csum_and_copy_from_user() - on everything other than ppc64 it is an
inlined wrapper around csum_partial_copy_from_user(), which is what's actually
exported.  On ppc64 csum_partial_copy_from_user() simply doesn't exist.
The _only_ difference between it and csum_and_copy_from_user() is that the
wrapper does access_ok() check.  However, some architectures repeat that
access_ok() in csum_partial_copy_from_user() (and on x86 we have
#define csum_and_copy_from_user csum_partial_copy_from_user, with no
access_ok() in the wrapper).  As it is, it's architecture-dependent whether
csum_partial_copy_from_user() does or does not access_ok(); on
alpha, frv, m32r, mn10300, parisc, s390, score and x86 it does, on the
rest it doesn't.

To make it even more fun, converting verify_iovec() and its compat equivalent
to use of {,compat_}rw_copy_check_uvector() would guarantee that all those
access_ok() are redundant to start with, both on send and receive side of
things.  Note, BTW, that for read()/write()/readv()/writev() it's already
redundant.  Moreover, we have a very good reason to do that anyway - conversion
of sendmsg/recvmsg to iov_iter primitives would pretty much require that,
since copy_{to,from}_iter() assume that iovec behind the iov_iter has been
validated wrt access_ok().

I do have a patch doing just that; the question is what to do with csum-and-copy
primitives.  Originally I planned to simply strip those access_ok() from those
(both the explicit calls and use of copy_from_user() where we ought to use
__copy_from_user(), etc.), but that's not nice to potential out-of-tree callers
of those suckers.  If any of those exist and manage to cope with the wonderful
calling conventions, that is.  As it is, we have the total of 4 callers of
csum_and_copy_from_user() and 2 callers of csum_and_copy_to_user(), all in
networking code.  Do we care about potential out-of-tree users existing and
getting screwed by such change?  Davem, Linus?

Alternatively, we could introduce __csum_and_copy_{from,to}_user() that would
_not_ do access_ok() (and wouldn't be required to do zeroing the tail, etc.),
convert the existing callers to that and leave csum_and_copy_{to,from}_user()
as architecture-independent wrappers around those - "do access_ok(),
then try to call __csum_and... variant, then fall back to dumb implementation
if that fails".  For almost all architectures csum_and_partial_copy_from_user()
would get stripped of access_ok() and renamed to __csum_and_copy_from_user();
for ppc64 we'd define __csum_and_copy_from_user(src,dst,len,sum,errp) as
csum_and_partial_copy_generic(src,dst,len,sum,errp,NULL) - they already
have that kind of code structure.  This variant still buggers the out-of-tree
code using csum_and_partial_copy_from_user() directly, but users of
csum_and_copy_{from,to}_user() are left intact, such modules were already
broken on ppc64 *and* breakage is of obvious "it won't link" kind.

Comments, suggestions?

PS: there are some really amusing brainos - e.g. mn10300 csum_and_copy_to_user()
starts with this:
	missing = copy_to_user(dst, src, len);
	if (missing) {
		memset(dst + len - missing, 0, missing);
		*err_ptr = -EFAULT;
	}
that's right, if copy_to_user() has returned non-zero, try to do
memset() on userland addresses it has failed to write into...

^ permalink raw reply	[flat|nested] 133+ messages in thread

* [patches][RFC] situation with csum_and_copy_... API
  2014-11-18  8:47 [RFC] situation with csum_and_copy_... API Al Viro
@ 2014-11-18 19:40 ` Al Viro
  2014-11-18 19:41   ` [PATCH 1/5] separate kernel- and userland-side msghdr Al Viro
                     ` (5 more replies)
  2014-11-18 20:49 ` [RFC] " Linus Torvalds
  1 sibling, 6 replies; 133+ messages in thread
From: Al Viro @ 2014-11-18 19:40 UTC (permalink / raw)
  To: netdev; +Cc: Linus Torvalds, David Miller, linux-kernel

On Tue, Nov 18, 2014 at 08:47:45AM +0000, Al Viro wrote:

> I do have a patch doing just that; the question is what to do with csum-and-copy
> primitives.  Originally I planned to simply strip those access_ok() from those
> (both the explicit calls and use of copy_from_user() where we ought to use
> __copy_from_user(), etc.), but that's not nice to potential out-of-tree callers
> of those suckers.  If any of those exist and manage to cope with the wonderful
> calling conventions, that is.  As it is, we have the total of 4 callers of
> csum_and_copy_from_user() and 2 callers of csum_and_copy_to_user(), all in
> networking code.  Do we care about potential out-of-tree users existing and
> getting screwed by such change?  Davem, Linus?

FWIW, the beginning of series in question follows; removal of those
access_ok() is 3/5.  The series is longer than that (see vfs.git#iov_iter-net
for a bit more, and there's more stuff in local queue still too much in flux
to push them out), but all the stuff relevant to validating iovecs on
sendmsg/recvmsg and getting rid of excessive access_ok() is in the first 5
commits.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* [PATCH 1/5] separate kernel- and userland-side msghdr
  2014-11-18 19:40 ` [patches][RFC] " Al Viro
@ 2014-11-18 19:41   ` Al Viro
  2014-11-18 19:42   ` [PATCH 2/5] {compat_,}verify_iovec(): switch to generic copying of iovecs Al Viro
                     ` (4 subsequent siblings)
  5 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-11-18 19:41 UTC (permalink / raw)
  To: netdev; +Cc: Linus Torvalds, David Miller, linux-kernel

Kernel-side struct msghdr is (currently) using the same layout as
userland one, but it's not a one-to-one copy - even without considering
32bit compat issues, we have msg_iov, msg_name and msg_control copied
to kernel[1].  It's fairly localized, so we get away with a few functions
where that knowledge is needed (and we could shrink that set even
more).  Pretty much everything deals with the kernel-side variant and
the few places that want userland one just use a bunch of force-casts
to paper over the differences.

The thing is, kernel-side definition of struct msghdr is *not* exposed
in include/uapi - libc doesn't see it, etc.  So we can add struct user_msghdr,
with proper annotations and let the few places that ever deal with those
beasts use it for userland pointers.  Saner typechecking aside, that will
allow to change the layout of kernel-side msghdr - e.g. replace
msg_iov/msg_iovlen there with struct iov_iter, getting rid of the need
to modify the iovec as we copy data to/from it, etc.

We could introduce kernel_msghdr instead, but that would create much more
noise - the absolute majority of the instances would need to have the
type switched to kernel_msghdr and definition of struct msghdr in
include/linux/socket.h is not going to be seen by userland anyway.

This commit just introduces user_msghdr and switches the few places that
are dealing with userland-side msghdr to it.

[1] actually, it's even trickier than that - we copy msg_control for
sendmsg, but keep the userland address on recvmsg.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 arch/arm/kernel/sys_oabi-compat.c |    4 ++--
 include/linux/socket.h            |   16 +++++++++++++---
 include/linux/syscalls.h          |    6 +++---
 net/compat.c                      |    4 ++--
 net/socket.c                      |   29 ++++++++++++++++-------------
 5 files changed, 36 insertions(+), 23 deletions(-)

diff --git a/arch/arm/kernel/sys_oabi-compat.c b/arch/arm/kernel/sys_oabi-compat.c
index e90a314..b83f3b7 100644
--- a/arch/arm/kernel/sys_oabi-compat.c
+++ b/arch/arm/kernel/sys_oabi-compat.c
@@ -400,7 +400,7 @@ asmlinkage long sys_oabi_sendto(int fd, void __user *buff,
 	return sys_sendto(fd, buff, len, flags, addr, addrlen);
 }
 
-asmlinkage long sys_oabi_sendmsg(int fd, struct msghdr __user *msg, unsigned flags)
+asmlinkage long sys_oabi_sendmsg(int fd, struct user_msghdr __user *msg, unsigned flags)
 {
 	struct sockaddr __user *addr;
 	int msg_namelen;
@@ -446,7 +446,7 @@ asmlinkage long sys_oabi_socketcall(int call, unsigned long __user *args)
 		break;
 	case SYS_SENDMSG:
 		if (copy_from_user(a, args, 3 * sizeof(long)) == 0)
-			r = sys_oabi_sendmsg(a[0], (struct msghdr __user *)a[1], a[2]);
+			r = sys_oabi_sendmsg(a[0], (struct user_msghdr __user *)a[1], a[2]);
 		break;
 	default:
 		r = sys_socketcall(call, args);
diff --git a/include/linux/socket.h b/include/linux/socket.h
index bb9b836..51bd666 100644
--- a/include/linux/socket.h
+++ b/include/linux/socket.h
@@ -53,10 +53,20 @@ struct msghdr {
 	__kernel_size_t	msg_controllen;	/* ancillary data buffer length */
 	unsigned int	msg_flags;	/* flags on received message */
 };
+ 
+struct user_msghdr {
+	void		__user *msg_name;	/* ptr to socket address structure */
+	int		msg_namelen;		/* size of socket address structure */
+	struct iovec	__user *msg_iov;	/* scatter/gather array */
+	__kernel_size_t	msg_iovlen;		/* # elements in msg_iov */
+	void		__user *msg_control;	/* ancillary data */
+	__kernel_size_t	msg_controllen;		/* ancillary data buffer length */
+	unsigned int	msg_flags;		/* flags on received message */
+};
 
 /* For recvmmsg/sendmmsg */
 struct mmsghdr {
-	struct msghdr   msg_hdr;
+	struct user_msghdr  msg_hdr;
 	unsigned int        msg_len;
 };
 
@@ -319,8 +329,8 @@ extern int put_cmsg(struct msghdr*, int level, int type, int len, void *data);
 struct timespec;
 
 /* The __sys_...msg variants allow MSG_CMSG_COMPAT */
-extern long __sys_recvmsg(int fd, struct msghdr __user *msg, unsigned flags);
-extern long __sys_sendmsg(int fd, struct msghdr __user *msg, unsigned flags);
+extern long __sys_recvmsg(int fd, struct user_msghdr __user *msg, unsigned flags);
+extern long __sys_sendmsg(int fd, struct user_msghdr __user *msg, unsigned flags);
 extern int __sys_recvmmsg(int fd, struct mmsghdr __user *mmsg, unsigned int vlen,
 			  unsigned int flags, struct timespec *timeout);
 extern int __sys_sendmmsg(int fd, struct mmsghdr __user *mmsg,
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index bda9b81..c9afdc7 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -25,7 +25,7 @@ struct linux_dirent64;
 struct list_head;
 struct mmap_arg_struct;
 struct msgbuf;
-struct msghdr;
+struct user_msghdr;
 struct mmsghdr;
 struct msqid_ds;
 struct new_utsname;
@@ -601,13 +601,13 @@ asmlinkage long sys_getpeername(int, struct sockaddr __user *, int __user *);
 asmlinkage long sys_send(int, void __user *, size_t, unsigned);
 asmlinkage long sys_sendto(int, void __user *, size_t, unsigned,
 				struct sockaddr __user *, int);
-asmlinkage long sys_sendmsg(int fd, struct msghdr __user *msg, unsigned flags);
+asmlinkage long sys_sendmsg(int fd, struct user_msghdr __user *msg, unsigned flags);
 asmlinkage long sys_sendmmsg(int fd, struct mmsghdr __user *msg,
 			     unsigned int vlen, unsigned flags);
 asmlinkage long sys_recv(int, void __user *, size_t, unsigned);
 asmlinkage long sys_recvfrom(int, void __user *, size_t, unsigned,
 				struct sockaddr __user *, int __user *);
-asmlinkage long sys_recvmsg(int fd, struct msghdr __user *msg, unsigned flags);
+asmlinkage long sys_recvmsg(int fd, struct user_msghdr __user *msg, unsigned flags);
 asmlinkage long sys_recvmmsg(int fd, struct mmsghdr __user *msg,
 			     unsigned int vlen, unsigned flags,
 			     struct timespec __user *timeout);
diff --git a/net/compat.c b/net/compat.c
index bc8aeef..562e920 100644
--- a/net/compat.c
+++ b/net/compat.c
@@ -740,7 +740,7 @@ COMPAT_SYSCALL_DEFINE3(sendmsg, int, fd, struct compat_msghdr __user *, msg, uns
 {
 	if (flags & MSG_CMSG_COMPAT)
 		return -EINVAL;
-	return __sys_sendmsg(fd, (struct msghdr __user *)msg, flags | MSG_CMSG_COMPAT);
+	return __sys_sendmsg(fd, (struct user_msghdr __user *)msg, flags | MSG_CMSG_COMPAT);
 }
 
 COMPAT_SYSCALL_DEFINE4(sendmmsg, int, fd, struct compat_mmsghdr __user *, mmsg,
@@ -756,7 +756,7 @@ COMPAT_SYSCALL_DEFINE3(recvmsg, int, fd, struct compat_msghdr __user *, msg, uns
 {
 	if (flags & MSG_CMSG_COMPAT)
 		return -EINVAL;
-	return __sys_recvmsg(fd, (struct msghdr __user *)msg, flags | MSG_CMSG_COMPAT);
+	return __sys_recvmsg(fd, (struct user_msghdr __user *)msg, flags | MSG_CMSG_COMPAT);
 }
 
 COMPAT_SYSCALL_DEFINE4(recv, int, fd, void __user *, buf, compat_size_t, len, unsigned int, flags)
diff --git a/net/socket.c b/net/socket.c
index fe20c31..0ae8147 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -1989,8 +1989,11 @@ struct used_address {
 };
 
 static int copy_msghdr_from_user(struct msghdr *kmsg,
-				 struct msghdr __user *umsg)
+				 struct user_msghdr __user *umsg)
 {
+	/* We are relying on the (currently) identical layouts.  Once
+	 * the kernel-side changes, this place will need to be updated
+	 */
 	if (copy_from_user(kmsg, umsg, sizeof(struct msghdr)))
 		return -EFAULT;
 
@@ -2005,7 +2008,7 @@ static int copy_msghdr_from_user(struct msghdr *kmsg,
 	return 0;
 }
 
-static int ___sys_sendmsg(struct socket *sock, struct msghdr __user *msg,
+static int ___sys_sendmsg(struct socket *sock, struct user_msghdr __user *msg,
 			 struct msghdr *msg_sys, unsigned int flags,
 			 struct used_address *used_address)
 {
@@ -2123,7 +2126,7 @@ out:
  *	BSD sendmsg interface
  */
 
-long __sys_sendmsg(int fd, struct msghdr __user *msg, unsigned flags)
+long __sys_sendmsg(int fd, struct user_msghdr __user *msg, unsigned flags)
 {
 	int fput_needed, err;
 	struct msghdr msg_sys;
@@ -2140,7 +2143,7 @@ out:
 	return err;
 }
 
-SYSCALL_DEFINE3(sendmsg, int, fd, struct msghdr __user *, msg, unsigned int, flags)
+SYSCALL_DEFINE3(sendmsg, int, fd, struct user_msghdr __user *, msg, unsigned int, flags)
 {
 	if (flags & MSG_CMSG_COMPAT)
 		return -EINVAL;
@@ -2177,7 +2180,7 @@ int __sys_sendmmsg(int fd, struct mmsghdr __user *mmsg, unsigned int vlen,
 
 	while (datagrams < vlen) {
 		if (MSG_CMSG_COMPAT & flags) {
-			err = ___sys_sendmsg(sock, (struct msghdr __user *)compat_entry,
+			err = ___sys_sendmsg(sock, (struct user_msghdr __user *)compat_entry,
 					     &msg_sys, flags, &used_address);
 			if (err < 0)
 				break;
@@ -2185,7 +2188,7 @@ int __sys_sendmmsg(int fd, struct mmsghdr __user *mmsg, unsigned int vlen,
 			++compat_entry;
 		} else {
 			err = ___sys_sendmsg(sock,
-					     (struct msghdr __user *)entry,
+					     (struct user_msghdr __user *)entry,
 					     &msg_sys, flags, &used_address);
 			if (err < 0)
 				break;
@@ -2215,7 +2218,7 @@ SYSCALL_DEFINE4(sendmmsg, int, fd, struct mmsghdr __user *, mmsg,
 	return __sys_sendmmsg(fd, mmsg, vlen, flags);
 }
 
-static int ___sys_recvmsg(struct socket *sock, struct msghdr __user *msg,
+static int ___sys_recvmsg(struct socket *sock, struct user_msghdr __user *msg,
 			 struct msghdr *msg_sys, unsigned int flags, int nosec)
 {
 	struct compat_msghdr __user *msg_compat =
@@ -2311,7 +2314,7 @@ out:
  *	BSD recvmsg interface
  */
 
-long __sys_recvmsg(int fd, struct msghdr __user *msg, unsigned flags)
+long __sys_recvmsg(int fd, struct user_msghdr __user *msg, unsigned flags)
 {
 	int fput_needed, err;
 	struct msghdr msg_sys;
@@ -2328,7 +2331,7 @@ out:
 	return err;
 }
 
-SYSCALL_DEFINE3(recvmsg, int, fd, struct msghdr __user *, msg,
+SYSCALL_DEFINE3(recvmsg, int, fd, struct user_msghdr __user *, msg,
 		unsigned int, flags)
 {
 	if (flags & MSG_CMSG_COMPAT)
@@ -2373,7 +2376,7 @@ int __sys_recvmmsg(int fd, struct mmsghdr __user *mmsg, unsigned int vlen,
 		 * No need to ask LSM for more than the first datagram.
 		 */
 		if (MSG_CMSG_COMPAT & flags) {
-			err = ___sys_recvmsg(sock, (struct msghdr __user *)compat_entry,
+			err = ___sys_recvmsg(sock, (struct user_msghdr __user *)compat_entry,
 					     &msg_sys, flags & ~MSG_WAITFORONE,
 					     datagrams);
 			if (err < 0)
@@ -2382,7 +2385,7 @@ int __sys_recvmmsg(int fd, struct mmsghdr __user *mmsg, unsigned int vlen,
 			++compat_entry;
 		} else {
 			err = ___sys_recvmsg(sock,
-					     (struct msghdr __user *)entry,
+					     (struct user_msghdr __user *)entry,
 					     &msg_sys, flags & ~MSG_WAITFORONE,
 					     datagrams);
 			if (err < 0)
@@ -2571,13 +2574,13 @@ SYSCALL_DEFINE2(socketcall, int, call, unsigned long __user *, args)
 				   (int __user *)a[4]);
 		break;
 	case SYS_SENDMSG:
-		err = sys_sendmsg(a0, (struct msghdr __user *)a1, a[2]);
+		err = sys_sendmsg(a0, (struct user_msghdr __user *)a1, a[2]);
 		break;
 	case SYS_SENDMMSG:
 		err = sys_sendmmsg(a0, (struct mmsghdr __user *)a1, a[2], a[3]);
 		break;
 	case SYS_RECVMSG:
-		err = sys_recvmsg(a0, (struct msghdr __user *)a1, a[2]);
+		err = sys_recvmsg(a0, (struct user_msghdr __user *)a1, a[2]);
 		break;
 	case SYS_RECVMMSG:
 		err = sys_recvmmsg(a0, (struct mmsghdr __user *)a1, a[2], a[3],
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 2/5] {compat_,}verify_iovec(): switch to generic copying of iovecs
  2014-11-18 19:40 ` [patches][RFC] " Al Viro
  2014-11-18 19:41   ` [PATCH 1/5] separate kernel- and userland-side msghdr Al Viro
@ 2014-11-18 19:42   ` Al Viro
  2014-11-18 19:42   ` [PATCH 3/5] remove a bunch of now-pointless access_ok() in net Al Viro
                     ` (3 subsequent siblings)
  5 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-11-18 19:42 UTC (permalink / raw)
  To: netdev; +Cc: Linus Torvalds, David Miller, linux-kernel

use {compat_,}rw_copy_check_uvector().  As the result, we are
guaranteed that all iovecs seen in ->msg_iov by ->sendmsg()
and ->recvmsg() will pass access_ok().  The next commit removes
now redundant checks in callees...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 net/compat.c     |   51 +++++++++++++++------------------------------------
 net/core/iovec.c |   37 ++++++++++++++-----------------------
 net/socket.c     |   38 ++++++++------------------------------
 3 files changed, 37 insertions(+), 89 deletions(-)

diff --git a/net/compat.c b/net/compat.c
index 562e920..7b4b6ad 100644
--- a/net/compat.c
+++ b/net/compat.c
@@ -31,33 +31,6 @@
 #include <asm/uaccess.h>
 #include <net/compat.h>
 
-static inline int iov_from_user_compat_to_kern(struct iovec *kiov,
-					  struct compat_iovec __user *uiov32,
-					  int niov)
-{
-	int tot_len = 0;
-
-	while (niov > 0) {
-		compat_uptr_t buf;
-		compat_size_t len;
-
-		if (get_user(len, &uiov32->iov_len) ||
-		    get_user(buf, &uiov32->iov_base))
-			return -EFAULT;
-
-		if (len > INT_MAX - tot_len)
-			len = INT_MAX - tot_len;
-
-		tot_len += len;
-		kiov->iov_base = compat_ptr(buf);
-		kiov->iov_len = (__kernel_size_t) len;
-		uiov32++;
-		kiov++;
-		niov--;
-	}
-	return tot_len;
-}
-
 int get_compat_msghdr(struct msghdr *kmsg, struct compat_msghdr __user *umsg)
 {
 	compat_uptr_t tmp1, tmp2, tmp3;
@@ -80,13 +53,15 @@ int get_compat_msghdr(struct msghdr *kmsg, struct compat_msghdr __user *umsg)
 }
 
 /* I've named the args so it is easy to tell whose space the pointers are in. */
-int verify_compat_iovec(struct msghdr *kern_msg, struct iovec *kern_iov,
+int verify_compat_iovec(struct msghdr *kern_msg, struct iovec *iov,
 		   struct sockaddr_storage *kern_address, int mode)
 {
-	int tot_len;
+	struct compat_iovec __user *p;
+	struct iovec *res;
+	int err;
 
 	if (kern_msg->msg_name && kern_msg->msg_namelen) {
-		if (mode == VERIFY_READ) {
+		if (mode == WRITE) {
 			int err = move_addr_to_kernel(kern_msg->msg_name,
 						      kern_msg->msg_namelen,
 						      kern_address);
@@ -99,13 +74,17 @@ int verify_compat_iovec(struct msghdr *kern_msg, struct iovec *kern_iov,
 		kern_msg->msg_namelen = 0;
 	}
 
-	tot_len = iov_from_user_compat_to_kern(kern_iov,
-					  (struct compat_iovec __user *)kern_msg->msg_iov,
-					  kern_msg->msg_iovlen);
-	if (tot_len >= 0)
-		kern_msg->msg_iov = kern_iov;
+	if (kern_msg->msg_iovlen > UIO_MAXIOV)
+		return -EMSGSIZE;
 
-	return tot_len;
+	p = (struct compat_iovec __user *)kern_msg->msg_iov;
+	err = compat_rw_copy_check_uvector(mode, p, kern_msg->msg_iovlen,
+					   UIO_FASTIOV, iov, &res);
+	if (err >= 0)
+		kern_msg->msg_iov = res;
+	else if (res != iov)
+		kfree(res);
+	return err;
 }
 
 /* Bleech... */
diff --git a/net/core/iovec.c b/net/core/iovec.c
index e1ec45a..86beeea 100644
--- a/net/core/iovec.c
+++ b/net/core/iovec.c
@@ -37,13 +37,13 @@
 
 int verify_iovec(struct msghdr *m, struct iovec *iov, struct sockaddr_storage *address, int mode)
 {
-	int size, ct, err;
+	struct iovec *res;
+	int err;
 
 	if (m->msg_name && m->msg_namelen) {
-		if (mode == VERIFY_READ) {
-			void __user *namep;
-			namep = (void __user __force *) m->msg_name;
-			err = move_addr_to_kernel(namep, m->msg_namelen,
+		if (mode == WRITE) {
+			void __user *namep = (void __user __force *)m->msg_name;
+			int err = move_addr_to_kernel(namep, m->msg_namelen,
 						  address);
 			if (err < 0)
 				return err;
@@ -53,24 +53,15 @@ int verify_iovec(struct msghdr *m, struct iovec *iov, struct sockaddr_storage *a
 		m->msg_name = NULL;
 		m->msg_namelen = 0;
 	}
-
-	size = m->msg_iovlen * sizeof(struct iovec);
-	if (copy_from_user(iov, (void __user __force *) m->msg_iov, size))
-		return -EFAULT;
-
-	m->msg_iov = iov;
-	err = 0;
-
-	for (ct = 0; ct < m->msg_iovlen; ct++) {
-		size_t len = iov[ct].iov_len;
-
-		if (len > INT_MAX - err) {
-			len = INT_MAX - err;
-			iov[ct].iov_len = len;
-		}
-		err += len;
-	}
-
+	if (m->msg_iovlen > UIO_MAXIOV)
+		return -EMSGSIZE;
+
+	err = rw_copy_check_uvector(mode, (void __user __force *)m->msg_iov,
+				    m->msg_iovlen, UIO_FASTIOV, iov, &res);
+	if (err >= 0)
+		m->msg_iov = res;
+	else if (res != iov)
+		kfree(res);
 	return err;
 }
 
diff --git a/net/socket.c b/net/socket.c
index 0ae8147..59020f0 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -2032,24 +2032,14 @@ static int ___sys_sendmsg(struct socket *sock, struct user_msghdr __user *msg,
 			return err;
 	}
 
-	if (msg_sys->msg_iovlen > UIO_FASTIOV) {
-		err = -EMSGSIZE;
-		if (msg_sys->msg_iovlen > UIO_MAXIOV)
-			goto out;
-		err = -ENOMEM;
-		iov = kmalloc(msg_sys->msg_iovlen * sizeof(struct iovec),
-			      GFP_KERNEL);
-		if (!iov)
-			goto out;
-	}
-
 	/* This will also move the address data into kernel space */
-	if (MSG_CMSG_COMPAT & flags) {
-		err = verify_compat_iovec(msg_sys, iov, &address, VERIFY_READ);
-	} else
-		err = verify_iovec(msg_sys, iov, &address, VERIFY_READ);
+	if (MSG_CMSG_COMPAT & flags)
+		err = verify_compat_iovec(msg_sys, iovstack, &address, WRITE);
+	else
+		err = verify_iovec(msg_sys, iovstack, &address, WRITE);
 	if (err < 0)
 		goto out_freeiov;
+	iov = msg_sys->msg_iov;
 	total_len = err;
 
 	err = -ENOBUFS;
@@ -2118,7 +2108,6 @@ out_freectl:
 out_freeiov:
 	if (iov != iovstack)
 		kfree(iov);
-out:
 	return err;
 }
 
@@ -2244,28 +2233,18 @@ static int ___sys_recvmsg(struct socket *sock, struct user_msghdr __user *msg,
 			return err;
 	}
 
-	if (msg_sys->msg_iovlen > UIO_FASTIOV) {
-		err = -EMSGSIZE;
-		if (msg_sys->msg_iovlen > UIO_MAXIOV)
-			goto out;
-		err = -ENOMEM;
-		iov = kmalloc(msg_sys->msg_iovlen * sizeof(struct iovec),
-			      GFP_KERNEL);
-		if (!iov)
-			goto out;
-	}
-
 	/* Save the user-mode address (verify_iovec will change the
 	 * kernel msghdr to use the kernel address space)
 	 */
 	uaddr = (__force void __user *)msg_sys->msg_name;
 	uaddr_len = COMPAT_NAMELEN(msg);
 	if (MSG_CMSG_COMPAT & flags)
-		err = verify_compat_iovec(msg_sys, iov, &addr, VERIFY_WRITE);
+		err = verify_compat_iovec(msg_sys, iovstack, &addr, READ);
 	else
-		err = verify_iovec(msg_sys, iov, &addr, VERIFY_WRITE);
+		err = verify_iovec(msg_sys, iovstack, &addr, READ);
 	if (err < 0)
 		goto out_freeiov;
+	iov = msg_sys->msg_iov;
 	total_len = err;
 
 	cmsg_ptr = (unsigned long)msg_sys->msg_control;
@@ -2306,7 +2285,6 @@ static int ___sys_recvmsg(struct socket *sock, struct user_msghdr __user *msg,
 out_freeiov:
 	if (iov != iovstack)
 		kfree(iov);
-out:
 	return err;
 }
 
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 3/5] remove a bunch of now-pointless access_ok() in net
  2014-11-18 19:40 ` [patches][RFC] " Al Viro
  2014-11-18 19:41   ` [PATCH 1/5] separate kernel- and userland-side msghdr Al Viro
  2014-11-18 19:42   ` [PATCH 2/5] {compat_,}verify_iovec(): switch to generic copying of iovecs Al Viro
@ 2014-11-18 19:42   ` Al Viro
  2014-11-18 19:43   ` [PATCH 4/5] bury skb_copy_to_page() Al Viro
                     ` (2 subsequent siblings)
  5 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-11-18 19:42 UTC (permalink / raw)
  To: netdev; +Cc: Linus Torvalds, David Miller, linux-kernel

The following set of functions
	skb_add_data_nocache
	skb_copy_to_page_nocache
	skb_do_copy_data_nocache
	skb_copy_to_page
	skb_add_data
	csum_and_copy_from_user
	skb_copy_and_csum_datagram
	csum_and_copy_to_user
	memcpy_fromiovec
	memcpy_toiovec
	memcpy_toiovecend
	memcpy_fromiovecend
are never given a userland range that would not satisfy access_ok().

Proof:

1) skb_add_data_nocache() and skb_copy_to_page_nocache() are called only
by tcp_sendmsg() and given a range covered by ->msg_iov[].

2) skb_do_copy_data_nocache() is called only by skb_add_data_nocache() and
skb_copy_to_page_nocache() and range comes from their arguments.

3) skb_copy_to_page() is never called at all (dead code since 3.0; killed in
the next commit).

4) skb_add_data() is called by rxrpc_send_data() and tcp_send_syn_data(),
both passing it a range covered by ->msg_iov[].

5) all callers of csum_partial_copy_fromiovecend() are giving it iovecs from
->msg_iov[].  Proof: it is called by ip_generic_getfrag() and ping_getfrag(),
both are called only as callbacks by ip_append_data(), ip6_append_data() and
ip_make_skb() and argument passed to those callbacks in all such invocations
is ->msg_iov.

6) csum_and_copy_from_user() is called by skb_add_data(), skb_copy_to_page(),
skb_do_copy_data_nocache() and csum_partial_copy_fromiovecend().  In all cases
the range is covered by ->msg_iov[]

7) skb_copy_and_csum_datagram_iovec() is always getting an iovec from ->msg_iov.
Proof: it is called by tcp_copy_to_iovec(), which gives it tp->ucopy.iov,
and by several recvmsg instances (udp, raw, raw6) which give it ->msg_iov.
But tp->ucopy.iov is initialized only by ->msg_iov.

8) skb_copy_and_csum_datagram() is called only by itself (for fragments)
and by skb_copy_and_csum_datagram_iovec().  The range is covered by the range
passed to caller (in the first case) or by an iovec passed to
skb_copy_and_csum_datagram_iovec().

9) csum_and_copy_to_user() is called only by skb_copy_and_csum_datagram().
Range is covered by the range given to caller...

10) skb_copy_datagram_iovec() is always getting an iovec that would pass
access_ok() on all elements.  Proof: cases when ->msg_iov or ->ucopy.iov are
passed are trivial.  Other than those, we have
	* ppp_read() (single-element iovec, range passed to ->read() has been
validated by caller)
	* skb_copy_and_csum_datagram_iovec() (see (7)), itself (covered by
the ranges in array given to its caller)
	* rds_tcp_inc_copy_to_user(), which is called only as
->inc_copy_to_user(), which is always given ->msg_iov.

11) aside of the callers of memcpy_toiovec() that immediately pass it ->msg_iov,
there are 3 call sites: one in __qp_memcpy_from_queue() (when called from
qp_memcpy_from_queue_iov()) and two in skb_copy_datagram_iovec().  The latter
is OK due to (10), the former has the call chain coming through
vmci_qpair_dequev() and vmci_transport_stream_dequeue(), which is called as
->stream_dequeue(), which is always given ->msg_iov.
Types in vmw_vmci blow, film at 11...

12) memcpy_toiovecend() is always given a subset of something we'd just given
to ->recvmsg() in ->msg_iov.

13) most of the memcpy_fromiovec() callers are explicitly passing it ->msg_iov.
There are few exceptions:
	* l2cap_skbuff_fromiovec().  Called only as bluetooth
->memcpy_fromiovec(), which always gets ->msg_iov as argument.
	* __qp_memcpy_to_queue(), from qp_memcpy_to_queue_iov(), from
vmci_qpair_enquev(), from vmci_transport_stream_enqueue(), which is always
called as ->stream_enqueue(), which always gets ->msg_iov.  Don't ask me what
I think of vmware...
	* ipxrtr_route_packet(), which is always given ->msg_iov by its caller.
	* vmci_transport_dgram_enqueue(), which is always called as
->dgram_enqueue(), which always gets ->msg_iov.
	* vhost get_indirect(), which is passing it iovec filled by
translate_desc().  Ranges are subsets of those that had been validated by
vq_memory_access_ok() back when we did vhost_set_memory().

14) zerocopy_sg_from_iovec() always gets a validated iovec.
Proof: callers are {tun,macvtap}_get_user(), which are called either from
->aio_write() (and given iovec validated by caller of ->aio_write()), or
from ->sendmsg() (and given ->msg_iov).

15) skb_copy_datagram_from_iovec() always gets an validated iovec.
Proof: for callers in macvtap and tun, same as in (14).  Ones in net/unix
and net/packet are given ->msg_iov.  Other than those, there's one
in zerocopy_sg_from_iovec() (see (14)) and one in skb_copy_datagram_from_iovec()
itself (fragments handling) - that one gets the iovec its caller was given.

16) callers of memcpy_fromiovecend() are
	* {macvtap,tun}_get_user().  Same as (14).
	* skb_copy_datagram_from_iovec().  See (15).
	* raw_send_hdrinc(), raw6_send_hdrinv().  Both get ->msg_iov as argument
from their callers and pass it to memcpy_fromiovecend().
	* sctp_user_addto_chunk().  Ditto.
	* tipc_msg_build().  Again, iovec argument is always ->msg_iov.
	* ip_generic_getfrag() and udplite_getfrag().  Same as (5).
	* vhost_scsi_handle_vq() - that one gets vq->iov as iovec, and
vq->iov is filled ultimately by translate_desc().  Validated by
vq_memory_access_ok() back when we did vhost_set_memory().

And anything that might end up in ->msg_iov[] has to pass access_ok().  It's
trivial
for kernel_{send,recv}msg() users (there we are under set_fs(KERNEL_DS)), it's
verified by rw_copy_check_uvector() in sendmsg()/recvmsg()/sendmmsg()/recvmmsg()
and in the only place where we call ->sendmsg()/->recvmsg() not via net/socket.c
helpers (drivers/vhost/net.c) they are getting vq->iov.  As mentioned above,
this one is guaranteed to pass the checks since it's filled by translate_desc(),
and ranges it fills are subsets of ranges that had been validated when we
did vhost_set_memory().

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 arch/alpha/lib/csum_partial_copy.c      |    5 ----
 arch/frv/lib/checksum.c                 |    2 +-
 arch/m32r/lib/csum_partial_copy.c       |    2 +-
 arch/mips/include/asm/checksum.h        |   30 +++++++--------------
 arch/mn10300/lib/checksum.c             |    4 +--
 arch/parisc/include/asm/checksum.h      |    2 +-
 arch/parisc/lib/checksum.c              |    2 +-
 arch/powerpc/lib/checksum_wrappers_64.c |    6 ++---
 arch/s390/include/asm/checksum.h        |    2 +-
 arch/score/include/asm/checksum.h       |    2 +-
 arch/score/lib/checksum_copy.c          |    2 +-
 arch/sh/include/asm/checksum_32.h       |    8 +-----
 arch/sparc/include/asm/checksum_32.h    |   43 ++++++++++++++-----------------
 arch/x86/include/asm/checksum_32.h      |   17 ++++--------
 arch/x86/lib/csum-wrappers_64.c         |    3 ---
 arch/x86/um/asm/checksum.h              |    2 +-
 arch/x86/um/asm/checksum_32.h           |   12 ++++-----
 arch/xtensa/include/asm/checksum.h      |    8 +-----
 include/linux/skbuff.h                  |    2 +-
 include/net/checksum.h                  |   14 +++-------
 include/net/sock.h                      |    7 +++--
 lib/iovec.c                             |    8 +++---
 net/core/iovec.c                        |    6 ++---
 23 files changed, 67 insertions(+), 122 deletions(-)

diff --git a/arch/alpha/lib/csum_partial_copy.c b/arch/alpha/lib/csum_partial_copy.c
index 5675dca..1ee80e8 100644
--- a/arch/alpha/lib/csum_partial_copy.c
+++ b/arch/alpha/lib/csum_partial_copy.c
@@ -338,11 +338,6 @@ csum_partial_copy_from_user(const void __user *src, void *dst, int len,
 	unsigned long doff = 7 & (unsigned long) dst;
 
 	if (len) {
-		if (!access_ok(VERIFY_READ, src, len)) {
-			if (errp) *errp = -EFAULT;
-			memset(dst, 0, len);
-			return sum;
-		}
 		if (!doff) {
 			if (!soff)
 				checksum = csum_partial_cfu_aligned(
diff --git a/arch/frv/lib/checksum.c b/arch/frv/lib/checksum.c
index 44e16d5..f3fd6cd 100644
--- a/arch/frv/lib/checksum.c
+++ b/arch/frv/lib/checksum.c
@@ -140,7 +140,7 @@ csum_partial_copy_from_user(const void __user *src, void *dst,
 	if (csum_err)
 		*csum_err = 0;
 
-	rem = copy_from_user(dst, src, len);
+	rem = __copy_from_user(dst, src, len);
 	if (rem != 0) {
 		if (csum_err)
 			*csum_err = -EFAULT;
diff --git a/arch/m32r/lib/csum_partial_copy.c b/arch/m32r/lib/csum_partial_copy.c
index 5596f3d..f307c53 100644
--- a/arch/m32r/lib/csum_partial_copy.c
+++ b/arch/m32r/lib/csum_partial_copy.c
@@ -47,7 +47,7 @@ csum_partial_copy_from_user (const void __user *src, void *dst,
 {
 	int missing;
 
-	missing = copy_from_user(dst, src, len);
+	missing = __copy_from_user(dst, src, len);
 	if (missing) {
 		memset(dst + len - missing, 0, missing);
 		*err_ptr = -EFAULT;
diff --git a/arch/mips/include/asm/checksum.h b/arch/mips/include/asm/checksum.h
index 3418c51..80f48c4 100644
--- a/arch/mips/include/asm/checksum.h
+++ b/arch/mips/include/asm/checksum.h
@@ -59,13 +59,7 @@ static inline
 __wsum csum_and_copy_from_user(const void __user *src, void *dst,
 			       int len, __wsum sum, int *err_ptr)
 {
-	if (access_ok(VERIFY_READ, src, len))
-		return csum_partial_copy_from_user(src, dst, len, sum,
-						   err_ptr);
-	if (len)
-		*err_ptr = -EFAULT;
-
-	return sum;
+	return csum_partial_copy_from_user(src, dst, len, sum, err_ptr);
 }
 
 /*
@@ -77,20 +71,14 @@ __wsum csum_and_copy_to_user(const void *src, void __user *dst, int len,
 			     __wsum sum, int *err_ptr)
 {
 	might_fault();
-	if (access_ok(VERIFY_WRITE, dst, len)) {
-		if (segment_eq(get_fs(), get_ds()))
-			return __csum_partial_copy_kernel(src,
-							  (__force void *)dst,
-							  len, sum, err_ptr);
-		else
-			return __csum_partial_copy_to_user(src,
-							   (__force void *)dst,
-							   len, sum, err_ptr);
-	}
-	if (len)
-		*err_ptr = -EFAULT;
-
-	return (__force __wsum)-1; /* invalid checksum */
+	if (segment_eq(get_fs(), get_ds()))
+		return __csum_partial_copy_kernel(src,
+						  (__force void *)dst,
+						  len, sum, err_ptr);
+	else
+		return __csum_partial_copy_to_user(src,
+						   (__force void *)dst,
+						   len, sum, err_ptr);
 }
 
 /*
diff --git a/arch/mn10300/lib/checksum.c b/arch/mn10300/lib/checksum.c
index b6580f5..d080d5e 100644
--- a/arch/mn10300/lib/checksum.c
+++ b/arch/mn10300/lib/checksum.c
@@ -73,7 +73,7 @@ __wsum csum_partial_copy_from_user(const void *src, void *dst,
 {
 	int missing;
 
-	missing = copy_from_user(dst, src, len);
+	missing = __copy_from_user(dst, src, len);
 	if (missing) {
 		memset(dst + len - missing, 0, missing);
 		*err_ptr = -EFAULT;
@@ -89,7 +89,7 @@ __wsum csum_and_copy_to_user(const void *src, void *dst,
 {
 	int missing;
 
-	missing = copy_to_user(dst, src, len);
+	missing = __copy_to_user(dst, src, len);
 	if (missing) {
 		memset(dst + len - missing, 0, missing);
 		*err_ptr = -EFAULT;
diff --git a/arch/parisc/include/asm/checksum.h b/arch/parisc/include/asm/checksum.h
index c84b2fc..6ca9a6f 100644
--- a/arch/parisc/include/asm/checksum.h
+++ b/arch/parisc/include/asm/checksum.h
@@ -198,7 +198,7 @@ static __inline__ __wsum csum_and_copy_to_user(const void *src,
 	/* code stolen from include/asm-mips64 */
 	sum = csum_partial(src, len, sum);
 	 
-	if (copy_to_user(dst, src, len)) {
+	if (__copy_to_user(dst, src, len)) {
 		*err_ptr = -EFAULT;
 		return (__force __wsum)-1;
 	}
diff --git a/arch/parisc/lib/checksum.c b/arch/parisc/lib/checksum.c
index ae66d31..e52fdba 100644
--- a/arch/parisc/lib/checksum.c
+++ b/arch/parisc/lib/checksum.c
@@ -138,7 +138,7 @@ __wsum csum_partial_copy_from_user(const void __user *src,
 {
 	int missing;
 
-	missing = copy_from_user(dst, src, len);
+	missing = __copy_from_user(dst, src, len);
 	if (missing) {
 		memset(dst + len - missing, 0, missing);
 		*err_ptr = -EFAULT;
diff --git a/arch/powerpc/lib/checksum_wrappers_64.c b/arch/powerpc/lib/checksum_wrappers_64.c
index 08e3a33..4c783c8 100644
--- a/arch/powerpc/lib/checksum_wrappers_64.c
+++ b/arch/powerpc/lib/checksum_wrappers_64.c
@@ -37,7 +37,7 @@ __wsum csum_and_copy_from_user(const void __user *src, void *dst,
 		goto out;
 	}
 
-	if (unlikely((len < 0) || !access_ok(VERIFY_READ, src, len))) {
+	if (unlikely(len < 0)) {
 		*err_ptr = -EFAULT;
 		csum = (__force unsigned int)sum;
 		goto out;
@@ -78,7 +78,7 @@ __wsum csum_and_copy_to_user(const void *src, void __user *dst, int len,
 		goto out;
 	}
 
-	if (unlikely((len < 0) || !access_ok(VERIFY_WRITE, dst, len))) {
+	if (unlikely(len < 0)) {
 		*err_ptr = -EFAULT;
 		csum = -1; /* invalid checksum */
 		goto out;
@@ -90,7 +90,7 @@ __wsum csum_and_copy_to_user(const void *src, void __user *dst, int len,
 	if (unlikely(*err_ptr)) {
 		csum = csum_partial(src, len, sum);
 
-		if (copy_to_user(dst, src, len)) {
+		if (__copy_to_user(dst, src, len)) {
 			*err_ptr = -EFAULT;
 			csum = -1; /* invalid checksum */
 		}
diff --git a/arch/s390/include/asm/checksum.h b/arch/s390/include/asm/checksum.h
index 7403648..fdc9532 100644
--- a/arch/s390/include/asm/checksum.h
+++ b/arch/s390/include/asm/checksum.h
@@ -51,7 +51,7 @@ csum_partial_copy_from_user(const void __user *src, void *dst,
                                           int len, __wsum sum,
                                           int *err_ptr)
 {
-	if (unlikely(copy_from_user(dst, src, len)))
+	if (unlikely(__copy_from_user(dst, src, len)))
 		*err_ptr = -EFAULT;
 	return csum_partial(dst, len, sum);
 }
diff --git a/arch/score/include/asm/checksum.h b/arch/score/include/asm/checksum.h
index 961bd64..d783634 100644
--- a/arch/score/include/asm/checksum.h
+++ b/arch/score/include/asm/checksum.h
@@ -36,7 +36,7 @@ __wsum csum_and_copy_to_user(const void *src, void __user *dst, int len,
 			__wsum sum, int *err_ptr)
 {
 	sum = csum_partial(src, len, sum);
-	if (copy_to_user(dst, src, len)) {
+	if (__copy_to_user(dst, src, len)) {
 		*err_ptr = -EFAULT;
 		return (__force __wsum) -1; /* invalid checksum */
 	}
diff --git a/arch/score/lib/checksum_copy.c b/arch/score/lib/checksum_copy.c
index 9b770b3..7d6b9b4 100644
--- a/arch/score/lib/checksum_copy.c
+++ b/arch/score/lib/checksum_copy.c
@@ -42,7 +42,7 @@ unsigned int csum_partial_copy_from_user(const char *src, char *dst,
 {
 	int missing;
 
-	missing = copy_from_user(dst, src, len);
+	missing = __copy_from_user(dst, src, len);
 	if (missing) {
 		memset(dst + len - missing, 0, missing);
 		*err_ptr = -EFAULT;
diff --git a/arch/sh/include/asm/checksum_32.h b/arch/sh/include/asm/checksum_32.h
index 14b7ac2..9749277 100644
--- a/arch/sh/include/asm/checksum_32.h
+++ b/arch/sh/include/asm/checksum_32.h
@@ -203,13 +203,7 @@ static inline __wsum csum_and_copy_to_user(const void *src,
 					   int len, __wsum sum,
 					   int *err_ptr)
 {
-	if (access_ok(VERIFY_WRITE, dst, len))
-		return csum_partial_copy_generic((__force const void *)src,
+	return csum_partial_copy_generic((__force const void *)src,
 						dst, len, sum, NULL, err_ptr);
-
-	if (len)
-		*err_ptr = -EFAULT;
-
-	return (__force __wsum)-1; /* invalid checksum */
 }
 #endif /* __ASM_SH_CHECKSUM_H */
diff --git a/arch/sparc/include/asm/checksum_32.h b/arch/sparc/include/asm/checksum_32.h
index 426b238..be2e295 100644
--- a/arch/sparc/include/asm/checksum_32.h
+++ b/arch/sparc/include/asm/checksum_32.h
@@ -86,30 +86,25 @@ static inline __wsum
 csum_partial_copy_to_user(const void *src, void __user *dst, int len,
 			  __wsum sum, int *err)
 {
-	if (!access_ok (VERIFY_WRITE, dst, len)) {
-		*err = -EFAULT;
-		return sum;
-	} else {
-		register unsigned long ret asm("o0") = (unsigned long)src;
-		register char __user *d asm("o1") = dst;
-		register int l asm("g1") = len;
-		register __wsum s asm("g7") = sum;
-
-		__asm__ __volatile__ (
-		".section __ex_table,#alloc\n\t"
-		".align 4\n\t"
-		".word 1f,1\n\t"
-		".previous\n"
-		"1:\n\t"
-		"call __csum_partial_copy_sparc_generic\n\t"
-		" st %8, [%%sp + 64]\n"
-		: "=&r" (ret), "=&r" (d), "=&r" (l), "=&r" (s)
-		: "0" (ret), "1" (d), "2" (l), "3" (s), "r" (err)
-		: "o2", "o3", "o4", "o5", "o7",
-		  "g2", "g3", "g4", "g5",
-		  "cc", "memory");
-		return (__force __wsum)ret;
-	}
+	register unsigned long ret asm("o0") = (unsigned long)src;
+	register char __user *d asm("o1") = dst;
+	register int l asm("g1") = len;
+	register __wsum s asm("g7") = sum;
+
+	__asm__ __volatile__ (
+	".section __ex_table,#alloc\n\t"
+	".align 4\n\t"
+	".word 1f,1\n\t"
+	".previous\n"
+	"1:\n\t"
+	"call __csum_partial_copy_sparc_generic\n\t"
+	" st %8, [%%sp + 64]\n"
+	: "=&r" (ret), "=&r" (d), "=&r" (l), "=&r" (s)
+	: "0" (ret), "1" (d), "2" (l), "3" (s), "r" (err)
+	: "o2", "o3", "o4", "o5", "o7",
+	  "g2", "g3", "g4", "g5",
+	  "cc", "memory");
+	return (__force __wsum)ret;
 }
 
 #define HAVE_CSUM_COPY_USER
diff --git a/arch/x86/include/asm/checksum_32.h b/arch/x86/include/asm/checksum_32.h
index f50de69..4d96e08 100644
--- a/arch/x86/include/asm/checksum_32.h
+++ b/arch/x86/include/asm/checksum_32.h
@@ -185,18 +185,11 @@ static inline __wsum csum_and_copy_to_user(const void *src,
 	__wsum ret;
 
 	might_sleep();
-	if (access_ok(VERIFY_WRITE, dst, len)) {
-		stac();
-		ret = csum_partial_copy_generic(src, (__force void *)dst,
-						len, sum, NULL, err_ptr);
-		clac();
-		return ret;
-	}
-
-	if (len)
-		*err_ptr = -EFAULT;
-
-	return (__force __wsum)-1; /* invalid checksum */
+	stac();
+	ret = csum_partial_copy_generic(src, (__force void *)dst,
+					len, sum, NULL, err_ptr);
+	clac();
+	return ret;
 }
 
 #endif /* _ASM_X86_CHECKSUM_32_H */
diff --git a/arch/x86/lib/csum-wrappers_64.c b/arch/x86/lib/csum-wrappers_64.c
index 7609e0e..b6b5626 100644
--- a/arch/x86/lib/csum-wrappers_64.c
+++ b/arch/x86/lib/csum-wrappers_64.c
@@ -26,9 +26,6 @@ csum_partial_copy_from_user(const void __user *src, void *dst,
 	might_sleep();
 	*errp = 0;
 
-	if (!likely(access_ok(VERIFY_READ, src, len)))
-		goto out_err;
-
 	/*
 	 * Why 6, not 7? To handle odd addresses aligned we
 	 * would need to do considerable complications to fix the
diff --git a/arch/x86/um/asm/checksum.h b/arch/x86/um/asm/checksum.h
index 4b181b7..3a5166c 100644
--- a/arch/x86/um/asm/checksum.h
+++ b/arch/x86/um/asm/checksum.h
@@ -46,7 +46,7 @@ static __inline__
 __wsum csum_partial_copy_from_user(const void __user *src, void *dst,
 					 int len, __wsum sum, int *err_ptr)
 {
-	if (copy_from_user(dst, src, len)) {
+	if (__copy_from_user(dst, src, len)) {
 		*err_ptr = -EFAULT;
 		return (__force __wsum)-1;
 	}
diff --git a/arch/x86/um/asm/checksum_32.h b/arch/x86/um/asm/checksum_32.h
index ab77b6f..e047748 100644
--- a/arch/x86/um/asm/checksum_32.h
+++ b/arch/x86/um/asm/checksum_32.h
@@ -43,14 +43,12 @@ static __inline__ __wsum csum_and_copy_to_user(const void *src,
 						     void __user *dst,
 						     int len, __wsum sum, int *err_ptr)
 {
-	if (access_ok(VERIFY_WRITE, dst, len)) {
-		if (copy_to_user(dst, src, len)) {
-			*err_ptr = -EFAULT;
-			return (__force __wsum)-1;
-		}
-
-		return csum_partial(src, len, sum);
+	if (__copy_to_user(dst, src, len)) {
+		*err_ptr = -EFAULT;
+		return (__force __wsum)-1;
 	}
+	return csum_partial(src, len, sum);
+}
 
 	if (len)
 		*err_ptr = -EFAULT;
diff --git a/arch/xtensa/include/asm/checksum.h b/arch/xtensa/include/asm/checksum.h
index 0593de6..e2fd018 100644
--- a/arch/xtensa/include/asm/checksum.h
+++ b/arch/xtensa/include/asm/checksum.h
@@ -245,12 +245,6 @@ static __inline__ __wsum csum_and_copy_to_user(const void *src,
 					       void __user *dst, int len,
 					       __wsum sum, int *err_ptr)
 {
-	if (access_ok(VERIFY_WRITE, dst, len))
-		return csum_partial_copy_generic(src,dst,len,sum,NULL,err_ptr);
-
-	if (len)
-		*err_ptr = -EFAULT;
-
-	return (__force __wsum)-1; /* invalid checksum */
+	return csum_partial_copy_generic(src,dst,len,sum,NULL,err_ptr);
 }
 #endif
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 73c370e..df99075 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2483,7 +2483,7 @@ static inline int skb_add_data(struct sk_buff *skb,
 			skb->csum = csum_block_add(skb->csum, csum, off);
 			return 0;
 		}
-	} else if (!copy_from_user(skb_put(skb, copy), from, copy))
+	} else if (!__copy_from_user(skb_put(skb, copy), from, copy))
 		return 0;
 
 	__skb_trim(skb, off);
diff --git a/include/net/checksum.h b/include/net/checksum.h
index 6465bae..3ed85c8 100644
--- a/include/net/checksum.h
+++ b/include/net/checksum.h
@@ -30,13 +30,7 @@ static inline
 __wsum csum_and_copy_from_user (const void __user *src, void *dst,
 				      int len, __wsum sum, int *err_ptr)
 {
-	if (access_ok(VERIFY_READ, src, len))
-		return csum_partial_copy_from_user(src, dst, len, sum, err_ptr);
-
-	if (len)
-		*err_ptr = -EFAULT;
-
-	return sum;
+	return csum_partial_copy_from_user(src, dst, len, sum, err_ptr);
 }
 #endif
 
@@ -46,10 +40,8 @@ static __inline__ __wsum csum_and_copy_to_user
 {
 	sum = csum_partial(src, len, sum);
 
-	if (access_ok(VERIFY_WRITE, dst, len)) {
-		if (copy_to_user(dst, src, len) == 0)
-			return sum;
-	}
+	if (__copy_to_user(dst, src, len) == 0)
+		return sum;
 	if (len)
 		*err_ptr = -EFAULT;
 
diff --git a/include/net/sock.h b/include/net/sock.h
index 83a669f..94e0ead 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1842,10 +1842,9 @@ static inline int skb_do_copy_data_nocache(struct sock *sk, struct sk_buff *skb,
 			return err;
 		skb->csum = csum_block_add(skb->csum, csum, offset);
 	} else if (sk->sk_route_caps & NETIF_F_NOCACHE_COPY) {
-		if (!access_ok(VERIFY_READ, from, copy) ||
-		    __copy_from_user_nocache(to, from, copy))
+		if (__copy_from_user_nocache(to, from, copy))
 			return -EFAULT;
-	} else if (copy_from_user(to, from, copy))
+	} else if (__copy_from_user(to, from, copy))
 		return -EFAULT;
 
 	return 0;
@@ -1896,7 +1895,7 @@ static inline int skb_copy_to_page(struct sock *sk, char __user *from,
 		if (err)
 			return err;
 		skb->csum = csum_block_add(skb->csum, csum, skb->len);
-	} else if (copy_from_user(page_address(page) + off, from, copy))
+	} else if (__copy_from_user(page_address(page) + off, from, copy))
 		return -EFAULT;
 
 	skb->len	     += copy;
diff --git a/lib/iovec.c b/lib/iovec.c
index df3abd1..021d53f 100644
--- a/lib/iovec.c
+++ b/lib/iovec.c
@@ -13,7 +13,7 @@ int memcpy_fromiovec(unsigned char *kdata, struct iovec *iov, int len)
 	while (len > 0) {
 		if (iov->iov_len) {
 			int copy = min_t(unsigned int, len, iov->iov_len);
-			if (copy_from_user(kdata, iov->iov_base, copy))
+			if (__copy_from_user(kdata, iov->iov_base, copy))
 				return -EFAULT;
 			len -= copy;
 			kdata += copy;
@@ -38,7 +38,7 @@ int memcpy_toiovec(struct iovec *iov, unsigned char *kdata, int len)
 	while (len > 0) {
 		if (iov->iov_len) {
 			int copy = min_t(unsigned int, iov->iov_len, len);
-			if (copy_to_user(iov->iov_base, kdata, copy))
+			if (__copy_to_user(iov->iov_base, kdata, copy))
 				return -EFAULT;
 			kdata += copy;
 			len -= copy;
@@ -67,7 +67,7 @@ int memcpy_toiovecend(const struct iovec *iov, unsigned char *kdata,
 			continue;
 		}
 		copy = min_t(unsigned int, iov->iov_len - offset, len);
-		if (copy_to_user(iov->iov_base + offset, kdata, copy))
+		if (__copy_to_user(iov->iov_base + offset, kdata, copy))
 			return -EFAULT;
 		offset = 0;
 		kdata += copy;
@@ -100,7 +100,7 @@ int memcpy_fromiovecend(unsigned char *kdata, const struct iovec *iov,
 		int copy = min_t(unsigned int, len, iov->iov_len - offset);
 
 		offset = 0;
-		if (copy_from_user(kdata, base, copy))
+		if (__copy_from_user(kdata, base, copy))
 			return -EFAULT;
 		len -= copy;
 		kdata += copy;
diff --git a/net/core/iovec.c b/net/core/iovec.c
index 86beeea..4a35b21 100644
--- a/net/core/iovec.c
+++ b/net/core/iovec.c
@@ -97,7 +97,7 @@ int csum_partial_copy_fromiovecend(unsigned char *kdata, struct iovec *iov,
 
 			/* iov component is too short ... */
 			if (par_len > copy) {
-				if (copy_from_user(kdata, base, copy))
+				if (__copy_from_user(kdata, base, copy))
 					goto out_fault;
 				kdata += copy;
 				base += copy;
@@ -110,7 +110,7 @@ int csum_partial_copy_fromiovecend(unsigned char *kdata, struct iovec *iov,
 							 partial_cnt, csum);
 				goto out;
 			}
-			if (copy_from_user(kdata, base, par_len))
+			if (__copy_from_user(kdata, base, par_len))
 				goto out_fault;
 			csum = csum_partial(kdata - partial_cnt, 4, csum);
 			kdata += par_len;
@@ -124,7 +124,7 @@ int csum_partial_copy_fromiovecend(unsigned char *kdata, struct iovec *iov,
 			partial_cnt = copy % 4;
 			if (partial_cnt) {
 				copy -= partial_cnt;
-				if (copy_from_user(kdata + copy, base + copy,
+				if (__copy_from_user(kdata + copy, base + copy,
 						partial_cnt))
 					goto out_fault;
 			}
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 4/5] bury skb_copy_to_page()
  2014-11-18 19:40 ` [patches][RFC] " Al Viro
                     ` (2 preceding siblings ...)
  2014-11-18 19:42   ` [PATCH 3/5] remove a bunch of now-pointless access_ok() in net Al Viro
@ 2014-11-18 19:43   ` Al Viro
  2014-11-18 19:43   ` [PATCH 5/5] fold verify_iovec() into copy_msghdr_from_user() Al Viro
  2014-11-19 20:25   ` [patches][RFC] situation with csum_and_copy_... API David Miller
  5 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-11-18 19:43 UTC (permalink / raw)
  To: netdev; +Cc: Linus Torvalds, David Miller, linux-kernel

no callers since 3.0

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 include/net/sock.h |   23 -----------------------
 1 file changed, 23 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index 94e0ead..d7d43e9 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1883,29 +1883,6 @@ static inline int skb_copy_to_page_nocache(struct sock *sk, char __user *from,
 	return 0;
 }
 
-static inline int skb_copy_to_page(struct sock *sk, char __user *from,
-				   struct sk_buff *skb, struct page *page,
-				   int off, int copy)
-{
-	if (skb->ip_summed == CHECKSUM_NONE) {
-		int err = 0;
-		__wsum csum = csum_and_copy_from_user(from,
-						     page_address(page) + off,
-							    copy, 0, &err);
-		if (err)
-			return err;
-		skb->csum = csum_block_add(skb->csum, csum, skb->len);
-	} else if (__copy_from_user(page_address(page) + off, from, copy))
-		return -EFAULT;
-
-	skb->len	     += copy;
-	skb->data_len	     += copy;
-	skb->truesize	     += copy;
-	sk->sk_wmem_queued   += copy;
-	sk_mem_charge(sk, copy);
-	return 0;
-}
-
 /**
  * sk_wmem_alloc_get - returns write allocations
  * @sk: socket
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 5/5] fold verify_iovec() into copy_msghdr_from_user()
  2014-11-18 19:40 ` [patches][RFC] " Al Viro
                     ` (3 preceding siblings ...)
  2014-11-18 19:43   ` [PATCH 4/5] bury skb_copy_to_page() Al Viro
@ 2014-11-18 19:43   ` Al Viro
  2014-11-19 20:25   ` [patches][RFC] situation with csum_and_copy_... API David Miller
  5 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-11-18 19:43 UTC (permalink / raw)
  To: netdev; +Cc: Linus Torvalds, David Miller, linux-kernel

... and do the same on the compat side of things.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 include/linux/socket.h |    1 -
 include/net/compat.h   |    5 ++-
 net/compat.c           |   52 ++++++++++++---------------
 net/core/iovec.c       |   38 --------------------
 net/socket.c           |   93 +++++++++++++++++++++++++++---------------------
 5 files changed, 77 insertions(+), 112 deletions(-)

diff --git a/include/linux/socket.h b/include/linux/socket.h
index 51bd666..de52228 100644
--- a/include/linux/socket.h
+++ b/include/linux/socket.h
@@ -322,7 +322,6 @@ extern int csum_partial_copy_fromiovecend(unsigned char *kdata,
 extern unsigned long iov_pages(const struct iovec *iov, int offset,
 			       unsigned long nr_segs);
 
-extern int verify_iovec(struct msghdr *m, struct iovec *iov, struct sockaddr_storage *address, int mode);
 extern int move_addr_to_kernel(void __user *uaddr, int ulen, struct sockaddr_storage *kaddr);
 extern int put_cmsg(struct msghdr*, int level, int type, int len, void *data);
 
diff --git a/include/net/compat.h b/include/net/compat.h
index 3b603b1..42a9c84 100644
--- a/include/net/compat.h
+++ b/include/net/compat.h
@@ -40,9 +40,8 @@ int compat_sock_get_timestampns(struct sock *, struct timespec __user *);
 #define compat_mmsghdr	mmsghdr
 #endif /* defined(CONFIG_COMPAT) */
 
-int get_compat_msghdr(struct msghdr *, struct compat_msghdr __user *);
-int verify_compat_iovec(struct msghdr *, struct iovec *,
-			struct sockaddr_storage *, int);
+ssize_t get_compat_msghdr(struct msghdr *, struct compat_msghdr __user *,
+		      struct sockaddr __user **, struct iovec **);
 asmlinkage long compat_sys_sendmsg(int, struct compat_msghdr __user *,
 				   unsigned int);
 asmlinkage long compat_sys_sendmmsg(int, struct compat_mmsghdr __user *,
diff --git a/net/compat.c b/net/compat.c
index 7b4b6ad..062f157 100644
--- a/net/compat.c
+++ b/net/compat.c
@@ -31,14 +31,18 @@
 #include <asm/uaccess.h>
 #include <net/compat.h>
 
-int get_compat_msghdr(struct msghdr *kmsg, struct compat_msghdr __user *umsg)
+ssize_t get_compat_msghdr(struct msghdr *kmsg,
+			  struct compat_msghdr __user *umsg,
+			  struct sockaddr __user **save_addr,
+			  struct iovec **iov)
 {
-	compat_uptr_t tmp1, tmp2, tmp3;
+	compat_uptr_t uaddr, uiov, tmp3;
+	ssize_t err;
 
 	if (!access_ok(VERIFY_READ, umsg, sizeof(*umsg)) ||
-	    __get_user(tmp1, &umsg->msg_name) ||
+	    __get_user(uaddr, &umsg->msg_name) ||
 	    __get_user(kmsg->msg_namelen, &umsg->msg_namelen) ||
-	    __get_user(tmp2, &umsg->msg_iov) ||
+	    __get_user(uiov, &umsg->msg_iov) ||
 	    __get_user(kmsg->msg_iovlen, &umsg->msg_iovlen) ||
 	    __get_user(tmp3, &umsg->msg_control) ||
 	    __get_user(kmsg->msg_controllen, &umsg->msg_controllen) ||
@@ -46,44 +50,32 @@ int get_compat_msghdr(struct msghdr *kmsg, struct compat_msghdr __user *umsg)
 		return -EFAULT;
 	if (kmsg->msg_namelen > sizeof(struct sockaddr_storage))
 		kmsg->msg_namelen = sizeof(struct sockaddr_storage);
-	kmsg->msg_name = compat_ptr(tmp1);
-	kmsg->msg_iov = compat_ptr(tmp2);
 	kmsg->msg_control = compat_ptr(tmp3);
-	return 0;
-}
 
-/* I've named the args so it is easy to tell whose space the pointers are in. */
-int verify_compat_iovec(struct msghdr *kern_msg, struct iovec *iov,
-		   struct sockaddr_storage *kern_address, int mode)
-{
-	struct compat_iovec __user *p;
-	struct iovec *res;
-	int err;
+	if (save_addr)
+		*save_addr = compat_ptr(uaddr);
 
-	if (kern_msg->msg_name && kern_msg->msg_namelen) {
-		if (mode == WRITE) {
-			int err = move_addr_to_kernel(kern_msg->msg_name,
-						      kern_msg->msg_namelen,
-						      kern_address);
+	if (uaddr && kmsg->msg_namelen) {
+		if (!save_addr) {
+			err = move_addr_to_kernel(compat_ptr(uaddr),
+						  kmsg->msg_namelen,
+						  kmsg->msg_name);
 			if (err < 0)
 				return err;
 		}
-		kern_msg->msg_name = kern_address;
 	} else {
-		kern_msg->msg_name = NULL;
-		kern_msg->msg_namelen = 0;
+		kmsg->msg_name = NULL;
+		kmsg->msg_namelen = 0;
 	}
 
-	if (kern_msg->msg_iovlen > UIO_MAXIOV)
+	if (kmsg->msg_iovlen > UIO_MAXIOV)
 		return -EMSGSIZE;
 
-	p = (struct compat_iovec __user *)kern_msg->msg_iov;
-	err = compat_rw_copy_check_uvector(mode, p, kern_msg->msg_iovlen,
-					   UIO_FASTIOV, iov, &res);
+	err = compat_rw_copy_check_uvector(save_addr ? READ : WRITE,
+					   compat_ptr(uiov), kmsg->msg_iovlen,
+					   UIO_FASTIOV, *iov, iov);
 	if (err >= 0)
-		kern_msg->msg_iov = res;
-	else if (res != iov)
-		kfree(res);
+		kmsg->msg_iov = *iov;
 	return err;
 }
 
diff --git a/net/core/iovec.c b/net/core/iovec.c
index 4a35b21..fa3eed9 100644
--- a/net/core/iovec.c
+++ b/net/core/iovec.c
@@ -28,44 +28,6 @@
 #include <net/sock.h>
 
 /*
- *	Verify iovec. The caller must ensure that the iovec is big enough
- *	to hold the message iovec.
- *
- *	Save time not doing access_ok. copy_*_user will make this work
- *	in any case.
- */
-
-int verify_iovec(struct msghdr *m, struct iovec *iov, struct sockaddr_storage *address, int mode)
-{
-	struct iovec *res;
-	int err;
-
-	if (m->msg_name && m->msg_namelen) {
-		if (mode == WRITE) {
-			void __user *namep = (void __user __force *)m->msg_name;
-			int err = move_addr_to_kernel(namep, m->msg_namelen,
-						  address);
-			if (err < 0)
-				return err;
-		}
-		m->msg_name = address;
-	} else {
-		m->msg_name = NULL;
-		m->msg_namelen = 0;
-	}
-	if (m->msg_iovlen > UIO_MAXIOV)
-		return -EMSGSIZE;
-
-	err = rw_copy_check_uvector(mode, (void __user __force *)m->msg_iov,
-				    m->msg_iovlen, UIO_FASTIOV, iov, &res);
-	if (err >= 0)
-		m->msg_iov = res;
-	else if (res != iov)
-		kfree(res);
-	return err;
-}
-
-/*
  *	And now for the all-in-one: copy and checksum from a user iovec
  *	directly to a datagram
  *	Calls to csum_partial but the last must be in 32 bit chunks
diff --git a/net/socket.c b/net/socket.c
index 59020f0..ee3ee39 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -1988,16 +1988,26 @@ struct used_address {
 	unsigned int name_len;
 };
 
-static int copy_msghdr_from_user(struct msghdr *kmsg,
-				 struct user_msghdr __user *umsg)
+static ssize_t copy_msghdr_from_user(struct msghdr *kmsg,
+				     struct user_msghdr __user *umsg,
+				     struct sockaddr __user **save_addr,
+				     struct iovec **iov)
 {
-	/* We are relying on the (currently) identical layouts.  Once
-	 * the kernel-side changes, this place will need to be updated
-	 */
-	if (copy_from_user(kmsg, umsg, sizeof(struct msghdr)))
+	struct sockaddr __user *uaddr;
+	struct iovec __user *uiov;
+	ssize_t err;
+
+	if (!access_ok(VERIFY_READ, umsg, sizeof(*umsg)) ||
+	    __get_user(uaddr, &umsg->msg_name) ||
+	    __get_user(kmsg->msg_namelen, &umsg->msg_namelen) ||
+	    __get_user(uiov, &umsg->msg_iov) ||
+	    __get_user(kmsg->msg_iovlen, &umsg->msg_iovlen) ||
+	    __get_user(kmsg->msg_control, &umsg->msg_control) ||
+	    __get_user(kmsg->msg_controllen, &umsg->msg_controllen) ||
+	    __get_user(kmsg->msg_flags, &umsg->msg_flags))
 		return -EFAULT;
 
-	if (kmsg->msg_name == NULL)
+	if (!uaddr)
 		kmsg->msg_namelen = 0;
 
 	if (kmsg->msg_namelen < 0)
@@ -2005,7 +2015,31 @@ static int copy_msghdr_from_user(struct msghdr *kmsg,
 
 	if (kmsg->msg_namelen > sizeof(struct sockaddr_storage))
 		kmsg->msg_namelen = sizeof(struct sockaddr_storage);
-	return 0;
+
+	if (save_addr)
+		*save_addr = uaddr;
+
+	if (uaddr && kmsg->msg_namelen) {
+		if (!save_addr) {
+			err = move_addr_to_kernel(uaddr, kmsg->msg_namelen,
+						  kmsg->msg_name);
+			if (err < 0)
+				return err;
+		}
+	} else {
+		kmsg->msg_name = NULL;
+		kmsg->msg_namelen = 0;
+	}
+
+	if (kmsg->msg_iovlen > UIO_MAXIOV)
+		return -EMSGSIZE;
+
+	err = rw_copy_check_uvector(save_addr ? READ : WRITE,
+				    uiov, kmsg->msg_iovlen,
+				    UIO_FASTIOV, *iov, iov);
+	if (err >= 0)
+		kmsg->msg_iov = *iov;
+	return err;
 }
 
 static int ___sys_sendmsg(struct socket *sock, struct user_msghdr __user *msg,
@@ -2020,26 +2054,17 @@ static int ___sys_sendmsg(struct socket *sock, struct user_msghdr __user *msg,
 	    __attribute__ ((aligned(sizeof(__kernel_size_t))));
 	/* 20 is size of ipv6_pktinfo */
 	unsigned char *ctl_buf = ctl;
-	int err, ctl_len, total_len;
+	int ctl_len, total_len;
+	ssize_t err;
 
-	err = -EFAULT;
-	if (MSG_CMSG_COMPAT & flags) {
-		if (get_compat_msghdr(msg_sys, msg_compat))
-			return -EFAULT;
-	} else {
-		err = copy_msghdr_from_user(msg_sys, msg);
-		if (err)
-			return err;
-	}
+	msg_sys->msg_name = &address;
 
-	/* This will also move the address data into kernel space */
 	if (MSG_CMSG_COMPAT & flags)
-		err = verify_compat_iovec(msg_sys, iovstack, &address, WRITE);
+		err = get_compat_msghdr(msg_sys, msg_compat, NULL, &iov);
 	else
-		err = verify_iovec(msg_sys, iovstack, &address, WRITE);
+		err = copy_msghdr_from_user(msg_sys, msg, NULL, &iov);
 	if (err < 0)
 		goto out_freeiov;
-	iov = msg_sys->msg_iov;
 	total_len = err;
 
 	err = -ENOBUFS;
@@ -2215,36 +2240,24 @@ static int ___sys_recvmsg(struct socket *sock, struct user_msghdr __user *msg,
 	struct iovec iovstack[UIO_FASTIOV];
 	struct iovec *iov = iovstack;
 	unsigned long cmsg_ptr;
-	int err, total_len, len;
+	int total_len, len;
+	ssize_t err;
 
 	/* kernel mode address */
 	struct sockaddr_storage addr;
 
 	/* user mode address pointers */
 	struct sockaddr __user *uaddr;
-	int __user *uaddr_len;
+	int __user *uaddr_len = COMPAT_NAMELEN(msg);
 
-	if (MSG_CMSG_COMPAT & flags) {
-		if (get_compat_msghdr(msg_sys, msg_compat))
-			return -EFAULT;
-	} else {
-		err = copy_msghdr_from_user(msg_sys, msg);
-		if (err)
-			return err;
-	}
+	msg_sys->msg_name = &addr;
 
-	/* Save the user-mode address (verify_iovec will change the
-	 * kernel msghdr to use the kernel address space)
-	 */
-	uaddr = (__force void __user *)msg_sys->msg_name;
-	uaddr_len = COMPAT_NAMELEN(msg);
 	if (MSG_CMSG_COMPAT & flags)
-		err = verify_compat_iovec(msg_sys, iovstack, &addr, READ);
+		err = get_compat_msghdr(msg_sys, msg_compat, &uaddr, &iov);
 	else
-		err = verify_iovec(msg_sys, iovstack, &addr, READ);
+		err = copy_msghdr_from_user(msg_sys, msg, &uaddr, &iov);
 	if (err < 0)
 		goto out_freeiov;
-	iov = msg_sys->msg_iov;
 	total_len = err;
 
 	cmsg_ptr = (unsigned long)msg_sys->msg_control;
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* Re: [RFC] situation with csum_and_copy_... API
  2014-11-18  8:47 [RFC] situation with csum_and_copy_... API Al Viro
  2014-11-18 19:40 ` [patches][RFC] " Al Viro
@ 2014-11-18 20:49 ` Linus Torvalds
  2014-11-18 21:23   ` Al Viro
  1 sibling, 1 reply; 133+ messages in thread
From: Linus Torvalds @ 2014-11-18 20:49 UTC (permalink / raw)
  To: Al Viro; +Cc: Network Development, David Miller, Linux Kernel Mailing List

On Tue, Nov 18, 2014 at 12:47 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> The minimal implementations would be
>
> __wsum csum_and_copy_from_user(const void __user *src, void *dst, int len,
>                                __wsum sum, int *err_ptr)
> {
>         if (unlikely(copy_from_user(dst, src, len) < 0)) {

No. That "< 0" should be "!= 0". The user copy functions return a
positive value of how many bytes they *failed* to copy.

> Note that we are *not* guaranteed that *err_ptr is touched in case of success
> - not all architectures zero it in that case.  Callers must (and do) zero it
> before the call of csum_and_copy_{from,to}_user().  Furthermore, return value
> in case of error is undefined.

Yeah, easy to misuse.

> IMO the calling conventions are atrocious.

Yeah, not pretty. At the same time, the pain of changing what seems to
work might not be worth it.

And quite frankly, I *detest* your patch 3/5.

"access_ok()" isn't that expensive, and removing them as unnecessary
is fraught with errors. We've had several cases of "oops, we used
__get_user() in a loop, because it generates much better code, but
we'd forgotten to do access_ok(), so now people can read kernel data".

I really think that

 (a) using "__get_user/__put_user/__memcpy_from/to_user" should be
avoided unless there is a clear and present performance issue

 (b) when that performance issue is clear and unmistakable, you should
generally have the "access_ok()" check *locally* to the use of unsafe
ops, so that you can *locally* see that it's safe.

 (c) if there is some upper-level check (ie the normal read/write
paths that make *sure* the addresses are fine), and there is some huge
performance issue that means that the local checks would be a problem,
then we should have a big comment about exactly where the checks are
done.

Your 3/5 violates pretty much all of these, imho.

The rest of the patches I have nothing against. I'm a bit worried that
this is stuff that can easily get stupid thinkos (ie exactly due to
things like the argument order thing), and I wonder how much it buys
us, but at least it seems to generally remove more lines than it adds,
and cleans some stuff up, so I'm not against it.

                    Linus

                 Linus

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC] situation with csum_and_copy_... API
  2014-11-18 20:49 ` [RFC] " Linus Torvalds
@ 2014-11-18 21:23   ` Al Viro
  2014-11-18 21:39     ` Linus Torvalds
  2014-11-19 20:31     ` David Miller
  0 siblings, 2 replies; 133+ messages in thread
From: Al Viro @ 2014-11-18 21:23 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Network Development, David Miller, Linux Kernel Mailing List

On Tue, Nov 18, 2014 at 12:49:13PM -0800, Linus Torvalds wrote:
> On Tue, Nov 18, 2014 at 12:47 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> > The minimal implementations would be
> >
> > __wsum csum_and_copy_from_user(const void __user *src, void *dst, int len,
> >                                __wsum sum, int *err_ptr)
> > {
> >         if (unlikely(copy_from_user(dst, src, len) < 0)) {
> 
> No. That "< 0" should be "!= 0". The user copy functions return a
> positive value of how many bytes they *failed* to copy.

D'oh...  Yes, indeed - sorry about the braino.

> > IMO the calling conventions are atrocious.
> 
> Yeah, not pretty. At the same time, the pain of changing what seems to
> work might not be worth it.
> 
> And quite frankly, I *detest* your patch 3/5.
> 
> "access_ok()" isn't that expensive, and removing them as unnecessary
> is fraught with errors. We've had several cases of "oops, we used
> __get_user() in a loop, because it generates much better code, but
> we'd forgotten to do access_ok(), so now people can read kernel data".

OK...  If netdev folks can live with that for now, I've no problem with
dropping 3/5.  However, I really think we need a variant of csum-and-copy
that would _not_ bother with access_ok() longer term.  That can wait, though...

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC] situation with csum_and_copy_... API
  2014-11-18 21:23   ` Al Viro
@ 2014-11-18 21:39     ` Linus Torvalds
  2014-11-19 20:31     ` David Miller
  1 sibling, 0 replies; 133+ messages in thread
From: Linus Torvalds @ 2014-11-18 21:39 UTC (permalink / raw)
  To: Al Viro; +Cc: Network Development, David Miller, Linux Kernel Mailing List

On Tue, Nov 18, 2014 at 1:23 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> OK...  If netdev folks can live with that for now, I've no problem with
> dropping 3/5.  However, I really think we need a variant of csum-and-copy
> that would _not_ bother with access_ok() longer term.  That can wait, though...

iirc,access_ok() ends up being something like three or four
instructions. It's generally not very painful.

The main reason to ever use the "__" functions is for __get_user() or
__put_user() in a loop (or when doing multiple ones when
loading/storing a struct or a signal stack frame or whatever), and
then the real issue ends up being that the __get_user() is really
often just a single instruction, while "get_user()" is a function call
that does that access_ok().

So then moving the access_ok() outside the loop, or to above the
structure load, can make a *big* deal.  But removing the access_ok()
entirely tends to not be a huge issue - it's just that you want to do
it *once*, instead of doing it over-and-over.

There might be some case you really want to remove it from the
function entirely so that the "access_ok()" is no longer close to the
accesses it checks, but I really think you want to have a goof
performance case for it, and a comment about it.

And no, we haven't really always followed those rules.

Btw, these days, on x86, we actually have a bigger issue with the
whole STAC/CLAC thing:  it's a good safety measure, but doing
STAC/CLAC for each access is painful. So "__get_user()" and
"__put_user()" sadly aren't the single-instruction things they used to
be any more. The code sequences that really want tight code (think the
'struct stat' copying and the like) sadly cannot get it as it is..

                   Linus

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [patches][RFC] situation with csum_and_copy_... API
  2014-11-18 19:40 ` [patches][RFC] " Al Viro
                     ` (4 preceding siblings ...)
  2014-11-18 19:43   ` [PATCH 5/5] fold verify_iovec() into copy_msghdr_from_user() Al Viro
@ 2014-11-19 20:25   ` David Miller
  5 siblings, 0 replies; 133+ messages in thread
From: David Miller @ 2014-11-19 20:25 UTC (permalink / raw)
  To: viro; +Cc: netdev, torvalds, linux-kernel

From: Al Viro <viro@ZenIV.linux.org.uk>
Date: Tue, 18 Nov 2014 19:40:53 +0000

> On Tue, Nov 18, 2014 at 08:47:45AM +0000, Al Viro wrote:
> 
>> I do have a patch doing just that; the question is what to do with csum-and-copy
>> primitives.  Originally I planned to simply strip those access_ok() from those
>> (both the explicit calls and use of copy_from_user() where we ought to use
>> __copy_from_user(), etc.), but that's not nice to potential out-of-tree callers
>> of those suckers.  If any of those exist and manage to cope with the wonderful
>> calling conventions, that is.  As it is, we have the total of 4 callers of
>> csum_and_copy_from_user() and 2 callers of csum_and_copy_to_user(), all in
>> networking code.  Do we care about potential out-of-tree users existing and
>> getting screwed by such change?  Davem, Linus?
> 
> FWIW, the beginning of series in question follows; removal of those
> access_ok() is 3/5.  The series is longer than that (see vfs.git#iov_iter-net
> for a bit more, and there's more stuff in local queue still too much in flux
> to push them out), but all the stuff relevant to validating iovecs on
> sendmsg/recvmsg and getting rid of excessive access_ok() is in the first 5
> commits.

Al I really like this series, especially patch #2.

Sorry for taking so long to review this, I just wanted to make sure we
got this right.

Can you give me a pull request for just these 5 patches?  Then feel free
to post the next batch for review, I'm eager to see it as are others.

Thanks!

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC] situation with csum_and_copy_... API
  2014-11-18 21:23   ` Al Viro
  2014-11-18 21:39     ` Linus Torvalds
@ 2014-11-19 20:31     ` David Miller
  2014-11-19 20:40       ` Linus Torvalds
  1 sibling, 1 reply; 133+ messages in thread
From: David Miller @ 2014-11-19 20:31 UTC (permalink / raw)
  To: viro; +Cc: torvalds, netdev, linux-kernel

From: Al Viro <viro@ZenIV.linux.org.uk>
Date: Tue, 18 Nov 2014 21:23:07 +0000

> On Tue, Nov 18, 2014 at 12:49:13PM -0800, Linus Torvalds wrote:
>> "access_ok()" isn't that expensive, and removing them as unnecessary
>> is fraught with errors. We've had several cases of "oops, we used
>> __get_user() in a loop, because it generates much better code, but
>> we'd forgotten to do access_ok(), so now people can read kernel data".
> 
> OK...  If netdev folks can live with that for now, I've no problem with
> dropping 3/5.  However, I really think we need a variant of csum-and-copy
> that would _not_ bother with access_ok() longer term.  That can wait, though...

I think because of the way Al verifies things at the top level, and
how we structure access to these msg->msg_iov so strictly, these cases
of access_ok() really can safely go.

But that is just my opinion, and yes I do acknowledge that we've had
serious holes in this area in the past.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC] situation with csum_and_copy_... API
  2014-11-19 20:31     ` David Miller
@ 2014-11-19 20:40       ` Linus Torvalds
  2014-11-19 21:17         ` Al Viro
  2014-11-19 21:17         ` David Miller
  0 siblings, 2 replies; 133+ messages in thread
From: Linus Torvalds @ 2014-11-19 20:40 UTC (permalink / raw)
  To: David Miller; +Cc: Al Viro, Network Development, Linux Kernel Mailing List

On Wed, Nov 19, 2014 at 12:31 PM, David Miller <davem@davemloft.net> wrote:
>
> But that is just my opinion, and yes I do acknowledge that we've had
> serious holes in this area in the past.

The serious holes have generally been exactly in the "upper layers
already check" camp, and then it turns out that some odd ioctl or
other thing ends up doing something odd and interesting.

If Al has actual performance profiles showing that the access_ok() is
a real problem, then fine. As a low-level optimization, I agree with
it. But not as a "let's just drop them, and make the security rules be
non-local and subtle, and require people to know the details of the
whole call-chain".

Seeing a "__get_user()" and just being able to glance up in the same
function and seeing the "access_ok()" is just a good safety net. And
means that people don't have to waste time thinking about or looking
for where the hell the security net really is.

                    Linus

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC] situation with csum_and_copy_... API
  2014-11-19 20:40       ` Linus Torvalds
@ 2014-11-19 21:17         ` Al Viro
  2014-11-19 21:17         ` David Miller
  1 sibling, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-11-19 21:17 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: David Miller, Network Development, Linux Kernel Mailing List

On Wed, Nov 19, 2014 at 12:40:53PM -0800, Linus Torvalds wrote:
> On Wed, Nov 19, 2014 at 12:31 PM, David Miller <davem@davemloft.net> wrote:
> >
> > But that is just my opinion, and yes I do acknowledge that we've had
> > serious holes in this area in the past.
> 
> The serious holes have generally been exactly in the "upper layers
> already check" camp, and then it turns out that some odd ioctl or
> other thing ends up doing something odd and interesting.
> 
> If Al has actual performance profiles showing that the access_ok() is
> a real problem, then fine. As a low-level optimization, I agree with
> it. But not as a "let's just drop them, and make the security rules be
> non-local and subtle, and require people to know the details of the
> whole call-chain".
> 
> Seeing a "__get_user()" and just being able to glance up in the same
> function and seeing the "access_ok()" is just a good safety net. And
> means that people don't have to waste time thinking about or looking
> for where the hell the security net really is.

Umm...  It's not quite that bad - the thing is, for iov_iter-net series
we'll need copy_and_csum_{to,from}_iter() anyway and for iovec-backed
iov_iter instances we already ask the iovec to be validated wrt access_ok().
And this validation (already done by rw_copy_check_uvector()) is going to
be next to iov_iter_init() setting the iov_iter up.

Moreover, I'm planning to take iov_iter_init() into rw_copy_check_uvector().
That way setting ->iov is done from the same function that has just checked
all ranges.  If you look at the callers, you'll see that almost all of them
are directly followed by iov_iter_init() and folding it in is a fairly
obvious cleanup.

The thing is, with ..._iter() variants added we are left with no other
in-tree callers of csum_and_copy_{to,from}_user().  I agree that dropping
those access_ok() is too early at this point - the analysis of paths
by which a range can reach them is scary right now.  Moreover, it mostly
parallels the changes later in the series - ones that propagate a pointer
to iov_iter put into msdghdr in place of ->msg_iov/->msg_iovlen down to
the new primitives.  _After_ those steps the analysis (see the horrors in
commit message of 3/5) becomes trivial.

So whether we end up removing those access_ok() or not, this is not the
time to do so.  It still might make sense in the end, but not right now.

Frankly, my preference would be to provide __csum_and_copy_...() that do
not bother with access_ok() (and do so consistently between the architectures),
and do not bother with zeroing or trying to do an accurate csum in case of
error.  With uniform implementation of csum_and_copy_...() that does
access_ok(), tries to call __csum_... variant and, in case of failure,
does __copy_..._user(), zeroes the tail in the "from" one and calculates
the csum by source of destination - whichever's kernel-side.  I.e. do what
ppc64 is doing.  That way we get obviously safe csum_and_copy_.._user(),
and consistent __ counterparts directly used by ..._iter() primitives.
Quite a bit of complexity becomes possible to remove from asm code,
while we are at it - zeroing isn't the worst of it, contortions needed to
calculate the csum accurately in error case are often nastier.  So much
that e.g. arm doesn't even bother trying.

Again, this is a separate work - I agree with you regarding the overhead
being a non-issue for existing callers, and if somebody tries e.g. to send
64K from 32K-element vector of 2-byte ranges they'll have a _lot_ of other
overhead that will drown that of those access_ok().  In normal cases the
price of copying the data itself is going to swamp that of access_ok(),
of course.

IOW, consider the access_ok() changes withdrawn for now.  It's too early
in the series for them and the only reason to pull them that high in
ordering had been the fear of overhead.  Which is very unlikely to be
an issue.  They won't come back (if they come back at all) until the
proof of correctness becomes absolutely trivial.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC] situation with csum_and_copy_... API
  2014-11-19 20:40       ` Linus Torvalds
  2014-11-19 21:17         ` Al Viro
@ 2014-11-19 21:17         ` David Miller
  2014-11-19 21:30           ` Al Viro
  1 sibling, 1 reply; 133+ messages in thread
From: David Miller @ 2014-11-19 21:17 UTC (permalink / raw)
  To: torvalds; +Cc: viro, netdev, linux-kernel

From: Linus Torvalds <torvalds@linux-foundation.org>
Date: Wed, 19 Nov 2014 12:40:53 -0800

> On Wed, Nov 19, 2014 at 12:31 PM, David Miller <davem@davemloft.net> wrote:
>>
>> But that is just my opinion, and yes I do acknowledge that we've had
>> serious holes in this area in the past.
> 
> The serious holes have generally been exactly in the "upper layers
> already check" camp, and then it turns out that some odd ioctl or
> other thing ends up doing something odd and interesting.
> 
> If Al has actual performance profiles showing that the access_ok() is
> a real problem, then fine. As a low-level optimization, I agree with
> it. But not as a "let's just drop them, and make the security rules be
> non-local and subtle, and require people to know the details of the
> whole call-chain".
> 
> Seeing a "__get_user()" and just being able to glance up in the same
> function and seeing the "access_ok()" is just a good safety net. And
> means that people don't have to waste time thinking about or looking
> for where the hell the security net really is.

Fair enough.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC] situation with csum_and_copy_... API
  2014-11-19 21:17         ` David Miller
@ 2014-11-19 21:30           ` Al Viro
  2014-11-19 21:53             ` David Miller
  0 siblings, 1 reply; 133+ messages in thread
From: Al Viro @ 2014-11-19 21:30 UTC (permalink / raw)
  To: David Miller; +Cc: torvalds, netdev, linux-kernel

On Wed, Nov 19, 2014 at 04:17:44PM -0500, David Miller wrote:
> > Seeing a "__get_user()" and just being able to glance up in the same
> > function and seeing the "access_ok()" is just a good safety net. And
> > means that people don't have to waste time thinking about or looking
> > for where the hell the security net really is.
> 
> Fair enough.

OK, with 3/5 dropped 4/5 get a trivial conflict (removal of function in 4/5
vs. change in it in 3/5).  With that dealt with, the sucker is in
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-davem

Shortlog:
Al Viro (4):
      separate kernel- and userland-side msghdr
      {compat_,}verify_iovec(): switch to generic copying of iovecs
      fold verify_iovec() into copy_msghdr_from_user()
      bury skb_copy_to_page()

Diffstat:
 arch/arm/kernel/sys_oabi-compat.c |    4 +--
 include/linux/socket.h            |   17 +++++++---
 include/linux/syscalls.h          |    6 ++--
 include/net/compat.h              |    5 ++-
 include/net/sock.h                |   23 -------------
 net/compat.c                      |   83 +++++++++++++++-------------------------------
 net/core/iovec.c                  |   47 --------------------------
 net/socket.c                      |  140 +++++++++++++++++++++++++++++++++++++-----------------------------------------
 8 files changed, 114 insertions(+), 211 deletions(-)

I'll post more for review after I finally get some sleep - up for bloody 27
hours by now ;-/

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC] situation with csum_and_copy_... API
  2014-11-19 21:30           ` Al Viro
@ 2014-11-19 21:53             ` David Miller
  2014-11-20 21:47               ` Al Viro
  0 siblings, 1 reply; 133+ messages in thread
From: David Miller @ 2014-11-19 21:53 UTC (permalink / raw)
  To: viro; +Cc: torvalds, netdev, linux-kernel

From: Al Viro <viro@ZenIV.linux.org.uk>
Date: Wed, 19 Nov 2014 21:30:07 +0000

> On Wed, Nov 19, 2014 at 04:17:44PM -0500, David Miller wrote:
>> > Seeing a "__get_user()" and just being able to glance up in the same
>> > function and seeing the "access_ok()" is just a good safety net. And
>> > means that people don't have to waste time thinking about or looking
>> > for where the hell the security net really is.
>> 
>> Fair enough.
> 
> OK, with 3/5 dropped 4/5 get a trivial conflict (removal of function in 4/5
> vs. change in it in 3/5).  With that dealt with, the sucker is in
> git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-davem
> 
> Shortlog:
> Al Viro (4):
>       separate kernel- and userland-side msghdr
>       {compat_,}verify_iovec(): switch to generic copying of iovecs
>       fold verify_iovec() into copy_msghdr_from_user()
>       bury skb_copy_to_page()

Pulled, thanks Al.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC] situation with csum_and_copy_... API
  2014-11-19 21:53             ` David Miller
@ 2014-11-20 21:47               ` Al Viro
  2014-11-20 21:55                 ` Eric Dumazet
                                   ` (2 more replies)
  0 siblings, 3 replies; 133+ messages in thread
From: Al Viro @ 2014-11-20 21:47 UTC (permalink / raw)
  To: David Miller
  Cc: torvalds, netdev, linux-kernel, target-devel,
	Nicholas A. Bellinger, Christoph Hellwig

On Wed, Nov 19, 2014 at 04:53:40PM -0500, David Miller wrote:

> Pulled, thanks Al.

Umm...  Not in net-next.git#master...  Anyway, the next portion is in
vfs.git#iov_iter-net right now; I'll post it on netdev once I get some
sleep.

It's getting close to really interesting parts.  Right now the main obstacle
is in iscsit_do_rx_data/iscsit_do_tx_data; what happens there is reuse of
iovec if kernel_sendmsg() gives a short write - it tries to send again, with
the same iovec and decremented length.  Ditto on RX side (with kernel_recvmsg(),
obviously).

As far as I can see, these retries on the send side are simply broken -
normally we are talking to TCP sockets there and tcp_sendmsg() does *not*
modify iovec in normal case.  IOW, if you get 8K sent out of 80K, the next
time it'll try to send 72K - already sent piece + 64K following it, etc.

Could target-devel folks tell how realistic those resends are, in the
first place?  Both with TX and RX sides...  Is there any sane limit on
iovec size there, etc.

Note that while conversion to iov_iter will provide a very simple solution
(iovec remains unchanged, iterator advances and we just need to avoid
reinitializing it for subsequent iterations in those loops), it won't solve
the problem in older kernels; that code had been there since 2011 and
iov_iter conversion is far too invasive for -stable.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC] situation with csum_and_copy_... API
  2014-11-20 21:47               ` Al Viro
@ 2014-11-20 21:55                 ` Eric Dumazet
  2014-11-20 22:25                   ` Al Viro
  2014-11-20 23:23                 ` David Miller
  2014-11-21  4:17                 ` Nicholas A. Bellinger
  2 siblings, 1 reply; 133+ messages in thread
From: Eric Dumazet @ 2014-11-20 21:55 UTC (permalink / raw)
  To: Al Viro
  Cc: David Miller, torvalds, netdev, linux-kernel, target-devel,
	Nicholas A. Bellinger, Christoph Hellwig

On Thu, 2014-11-20 at 21:47 +0000, Al Viro wrote:

> As far as I can see, these retries on the send side are simply broken -
> normally we are talking to TCP sockets there and tcp_sendmsg() does *not*
> modify iovec in normal case. 

Arg... I sent this morning something doing this (against net-next tree)

Is it a problem ?

Or can we consider FASTOPEN being not normal case ? ;)

https://patchwork.ozlabs.org/patch/412776/




^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC] situation with csum_and_copy_... API
  2014-11-20 21:55                 ` Eric Dumazet
@ 2014-11-20 22:25                   ` Al Viro
  2014-11-20 22:53                     ` Eric Dumazet
  0 siblings, 1 reply; 133+ messages in thread
From: Al Viro @ 2014-11-20 22:25 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David Miller, torvalds, netdev, linux-kernel, target-devel,
	Nicholas A. Bellinger, Christoph Hellwig

On Thu, Nov 20, 2014 at 01:55:42PM -0800, Eric Dumazet wrote:
> On Thu, 2014-11-20 at 21:47 +0000, Al Viro wrote:
> 
> > As far as I can see, these retries on the send side are simply broken -
> > normally we are talking to TCP sockets there and tcp_sendmsg() does *not*
> > modify iovec in normal case. 
> 
> Arg... I sent this morning something doing this (against net-next tree)
> 
> Is it a problem ?

Yes, it is.  You are breaking several _other_ kernel_sendmsg() users.
They are already slightly broken, but that'll make breakage much more
common.

Please, don't - the right thing to do is to have iov_iter in msghdr
(we already have the kernel and userland ones with different types and
we do not assume their layouts to be identical - currently they are,
but it's easy to change), keep iovec constant in all cases and advance
->msg_iter.  Also in all cases.

Note that direct manipulations of what's currently in ->msg_iov are
wrong - all those loops over vector elements, etc., belong in low-level
primitives.  The main missing ones right now are csum_and_copy_{from,to}_iter()
- I have those in local queue, but I'm still trying to get a reasonably
clean mm/iov_iter.c without ridiculous amounts of boilerplating.  A bit more
massage is needed there...

Seriously, take a look at vfs.git#iov_iter-net; it's preparations for the
one that'll introduce ->msg_iter.  Right now that branch has local iov_iter
declared and initialized in several ->sendmsg() and ->recvmsg() instances and
fed to primitives that work with it; after the conversion it'll be in
msg->msg_iter and it will be initialized by sock_sendmsg()/sock_recvmsg().

The tricky part is how to get through that without temporary breaking the
existing sendmsg/recvmsg users in the kernel *and* without a patch size from
hell.  I more or less see how to carve the remaining steps into
reasonably-sized chunks; iscsi is one of the tricky ones and it, AFAICS,
is genuinely broken in mainline and will need fixes that can go into -stable.

And no, your solution doesn't work.  Sorry.  You'll break e.g. smb_send_kvec()
that way.  ceph_tcp_sendmsg() as well, IIRC.


^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC] situation with csum_and_copy_... API
  2014-11-20 22:25                   ` Al Viro
@ 2014-11-20 22:53                     ` Eric Dumazet
  2014-11-21  8:49                       ` Al Viro
  0 siblings, 1 reply; 133+ messages in thread
From: Eric Dumazet @ 2014-11-20 22:53 UTC (permalink / raw)
  To: Al Viro
  Cc: David Miller, torvalds, netdev, linux-kernel, target-devel,
	Nicholas A. Bellinger, Christoph Hellwig

On Thu, 2014-11-20 at 22:25 +0000, Al Viro wrote:

> Yes, it is.  You are breaking several _other_ kernel_sendmsg() users.
> They are already slightly broken, but that'll make breakage much more
> common.
> 
> Please, don't - the right thing to do is to have iov_iter in msghdr
> (we already have the kernel and userland ones with different types and
> we do not assume their layouts to be identical - currently they are,
> but it's easy to change), keep iovec constant in all cases and advance
> ->msg_iter.  Also in all cases.
> 
> Note that direct manipulations of what's currently in ->msg_iov are
> wrong - all those loops over vector elements, etc., belong in low-level
> primitives.  The main missing ones right now are csum_and_copy_{from,to}_iter()
> - I have those in local queue, but I'm still trying to get a reasonably
> clean mm/iov_iter.c without ridiculous amounts of boilerplating.  A bit more
> massage is needed there...
> 
> Seriously, take a look at vfs.git#iov_iter-net; it's preparations for the
> one that'll introduce ->msg_iter.  Right now that branch has local iov_iter
> declared and initialized in several ->sendmsg() and ->recvmsg() instances and
> fed to primitives that work with it; after the conversion it'll be in
> msg->msg_iter and it will be initialized by sock_sendmsg()/sock_recvmsg().
> 
> The tricky part is how to get through that without temporary breaking the
> existing sendmsg/recvmsg users in the kernel *and* without a patch size from
> hell.  I more or less see how to carve the remaining steps into
> reasonably-sized chunks; iscsi is one of the tricky ones and it, AFAICS,
> is genuinely broken in mainline and will need fixes that can go into -stable.
> 
> And no, your solution doesn't work.  Sorry.  You'll break e.g. smb_send_kvec()
> that way.  ceph_tcp_sendmsg() as well, IIRC.

Nowhere in tcp_sendmsg() the iov had const qualifier.

If it was declared as const, this discussion would not happen,
we would know we are not allowed to modify it.

iov_iter is nice, but not a single time it is used in net/ 

Please make sure to add const where appropriate.

Thanks.



^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC] situation with csum_and_copy_... API
  2014-11-20 21:47               ` Al Viro
  2014-11-20 21:55                 ` Eric Dumazet
@ 2014-11-20 23:23                 ` David Miller
  2014-11-21 17:26                   ` David Miller
  2014-11-21  4:17                 ` Nicholas A. Bellinger
  2 siblings, 1 reply; 133+ messages in thread
From: David Miller @ 2014-11-20 23:23 UTC (permalink / raw)
  To: viro; +Cc: torvalds, netdev, linux-kernel, target-devel, nab, hch

From: Al Viro <viro@ZenIV.linux.org.uk>
Date: Thu, 20 Nov 2014 21:47:53 +0000

> On Wed, Nov 19, 2014 at 04:53:40PM -0500, David Miller wrote:
> 
>> Pulled, thanks Al.
> 
> Umm...  Not in net-next.git#master...

Sorry, I may have done my usual: pull at the office compute, fire up
build testing, go home without pushing back out to kernel.org

:-/

I'll check first thing tomorrow morning.


^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC] situation with csum_and_copy_... API
  2014-11-20 21:47               ` Al Viro
  2014-11-20 21:55                 ` Eric Dumazet
  2014-11-20 23:23                 ` David Miller
@ 2014-11-21  4:17                 ` Nicholas A. Bellinger
  2 siblings, 0 replies; 133+ messages in thread
From: Nicholas A. Bellinger @ 2014-11-21  4:17 UTC (permalink / raw)
  To: Al Viro
  Cc: David Miller, torvalds, netdev, linux-kernel, target-devel,
	Christoph Hellwig

Hi Al & Co,

On Thu, 2014-11-20 at 21:47 +0000, Al Viro wrote:
> On Wed, Nov 19, 2014 at 04:53:40PM -0500, David Miller wrote:
> 
> > Pulled, thanks Al.
> 
> Umm...  Not in net-next.git#master...  Anyway, the next portion is in
> vfs.git#iov_iter-net right now; I'll post it on netdev once I get some
> sleep.
> 

Thanks for your detailed analysis + work on this.

> It's getting close to really interesting parts.  Right now the main obstacle
> is in iscsit_do_rx_data/iscsit_do_tx_data; what happens there is reuse of
> iovec if kernel_sendmsg() gives a short write - it tries to send again, with
> the same iovec and decremented length.  Ditto on RX side (with kernel_recvmsg(),
> obviously).
> 
> As far as I can see, these retries on the send side are simply broken -
> normally we are talking to TCP sockets there and tcp_sendmsg() does *not*
> modify iovec in normal case.  IOW, if you get 8K sent out of 80K, the next
> time it'll try to send 72K - already sent piece + 64K following it, etc.
> 

AFAIK, short writes have not been actively getting triggered.

This is likely due to iscsit_do_tx_data() being used for sending 48 byte
PDU header, and small payloads in ISCSI_OP_LOGIN_RSP, ISCSI_OP_TEXT_RSP,
and ISCSI_OP_NOOP_IN control PDUs. 

All bulk data READ payloads are sent via iscsit_fe_sendpage_sg() and
only use iscsit_do_tx_data() for leading PDU header.

On the receive side, kernel_recvmsg() is called with MSG_WAITALL that
has been masking this bug..

> Could target-devel folks tell how realistic those resends are, in the
> first place?  Both with TX and RX sides...  Is there any sane limit on
> iovec size there, etc.
> 

Of the three control type PDU using this codepath, the transfer lengths
are currently limited to <= 32K + header across 2 kvecs.  The simplest
fix would probably be to fail the connection when send/recv returns a
value other than requested transfer length for these special cases.

For correctly handling short writes with your new work, what's the
preferred way to do this..?

--nab


^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC] situation with csum_and_copy_... API
  2014-11-20 22:53                     ` Eric Dumazet
@ 2014-11-21  8:49                       ` Al Viro
  2014-11-21 15:01                         ` Eric Dumazet
                                           ` (2 more replies)
  0 siblings, 3 replies; 133+ messages in thread
From: Al Viro @ 2014-11-21  8:49 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David Miller, torvalds, netdev, linux-kernel, target-devel,
	Nicholas A. Bellinger, Christoph Hellwig

On Thu, Nov 20, 2014 at 02:53:55PM -0800, Eric Dumazet wrote:
> > And no, your solution doesn't work.  Sorry.  You'll break e.g. smb_send_kvec()
> > that way.  ceph_tcp_sendmsg() as well, IIRC.
> 
> Nowhere in tcp_sendmsg() the iov had const qualifier.

*nod*

> If it was declared as const, this discussion would not happen,
> we would know we are not allowed to modify it.
>
> iov_iter is nice, but not a single time it is used in net/ 
> 
> Please make sure to add const where appropriate.

The situation is a fairly long-standing mess.  Take a look at e.g.
do_sock_write() - what we have there is
static ssize_t do_sock_write(struct msghdr *msg, struct kiocb *iocb,
                        struct file *file, const struct iovec *iov,
                        unsigned long nr_segs)
...
        msg->msg_iov = (struct iovec *)iov;
...
        return __sock_sendmsg(iocb, sock, msg, size);

That's where const is stripped off.  The reason why memcpy_fromiovec()
is modifying iovec is that it we obviously want subsequent calls to keep
picking new and new parts.  And on sendmsg(2) path iovec is discarded
after the method call anyway, so we get away with "let's do whatever's
more convenient internally, caller won't care anyway".

Unfortunately, it means that other callers (kernel_sendmsg() users, that is)
end up with very unpleasant situation - they can't even predict if iovec
will be fully drained, partially drained or left completely unchanged.
It's protocol-dependent *and* it depends on the codepath taken in ->sendmsg().
Note that "partially drained" is also a real-world case - e.g.
ping_v4_sendmsg() ends up draining the first sizeof(struct icmphdr) and leaves
the rest there.

In some cases it doesn't hurt much - throwaway struct iovec or a short array
of such which is fed to sendmsg and immediately discarded.  The rest either
relies on assumption about the behaviour for (known) protocol, which end
up being incorrect more often than not, or grumbles, makes a throwaway copy
of its iovec array and after the call either drains the original itself or
advances an iov_iter pointing to it.  There's quite a collection of unhappy
comments in those callers about this situation.  On the recvmsg side the
picture is the same.

We would be better off with iov_iter passed to __sock_{send,recv}msg() (as
a part of struct msghdr, instead of ->msg_iov/->msg_iovlen) and always
advanced to match the amount of data actually picked from it.  With iovec
behind it remaining constant.  That would work just as well as the current
variant for sendmsg(2)/recvmsg(2)/etc., be a lot more convenient for
kernel_{send,recv}msg() callers and would allow a lot of other fun stuff.

The problem, of course, is how to get through the conversion without a huge
patch from hell.  I'm reasonably certain that we can do that on the method
side of things - right now the amount of places aware of ->msg_iov in that
branch is much lower than in mainline.  The fun part is keeping the users
of kernel_{send,recv}msg() from breaking while we do the rest of transition.
The ones that use throwaway iovecs are fine - they don't assume any particular
behaviour wrt iovec draining.  A bunch of those will be able to use new
warranties later on, but those are separate patches that should go after the
switch to new rules.  Besides those we have the following:
	iscsit_do_tx_data().  Sends over TCP, assumes iovec drained.
	iscsit_do_rx_data().  Recieves over TCP, assumes iovec drained.
	smb_send_kvec().  Sends, assumes iovec unchanged.
	rxrpc_reject_packets().  Sends, assumes iovec unchanged.
	ceph_tcp_sendmsg().  Sends, assumes iovec unchanged.
The last three are not a problem, obviously - they are not quite correct
right now, but with those changes we get no regressions and the closer
we are to complete conversion, the better off they are.  Which leaves
us with iscsit_do_[rt]x_data().  

For tcp_sendmsg() conversion we need skb_add_data_nocache() and
skb_copy_to_page_nocache() variants that would take iov_iter as data
source.  Then the loop over iovec members and while (seglen > 0) loop
inside it get conflated (with iov_iter_count() instead of seglen).

Another thing is tcp_sendmsg_fastopen() and tcp_send_rcvq().  The latter
should just use copy_from_iter() instead of memcpy_from_iovec(), the former
is dealt with by making tcp_send_syn_data() use the same copy_from_iter()
instead of memcpy_from_iovecend().

All of that depends on ->msg_iter being already introduced and it doesn't break
iscsi_do_tx_data() any worse than it's currently broken.  Moreover, right after
it we'll be able to fix iscsi_do_tx_data() for good, simply by stopping to mess
with ->msg_iter after the first pass through the loop in there.

skb_add_data_nocache() and skb_copy_to_page_nocache() are both wrappers for
skb_do_copy_data_nocache(), which is used only by those two *and* they are
used only by tcp_sendmsg().  So there's no other code to be disrupted by
changes in those.  What skb_do_copy_data_nocache() is doing is a mix of
copy_from_user() (trivial - we just use copy_from_iter() instead),
csum_and_copy_from_user() (new primitive - csum_and_copy_from_iter()) and
__copy_from_user_nocache() (also a new primitive, easily added, but we
are really getting to the point where the amount of boilerplate in iov_iter.c
becomes painful).  Incidentally, xip_file_write() could benefit from the last
one as well, giving us full ->write_iter() for those...

As for the recvmsg() part of story, we'll need a new helper there as well -
csum_and_copy_to_iter(), for use in skb_copy_and_csum_datagram_iovec()
analogue that would take iov_iter.  Which will very shortly replace the
iovec one.  We'll need tp->ucopy.msg instead of tp->ucopy.iov for that;
that can be done as the first step, actually (ucopy.msg is always
->msg_iov of some msghdr with sufficiently long lifetime).  That +
introduction of helpers leaves us with moderately-sized patch converting
tcp_recvmsg() to new semantics.  And unfortunately the same patch will have
to include a (fairly simple) chunk in iscsi_do_rx_data() - same "don't
mess with ->msg_iter on subsequent iterations" thing.

At that point we'll have all kernel_{send,recv}msg() users happy with the
new semantics and are free to switch ->sendmsg()/->recvmsg() instances to
use of iov_iter primitives, not worrying about messing the callers up.
>From that point it's a smooth sailing - a bunch of iovec helpers will become
dead code shortly after that, etc.

One more moderately interesting spot will be around AF_ALG stuff - we'll need
to teach iov_iter_get_pages{,_alloc}() to do the right thing for kvec-backed
iterators.  Not hard to do, fortunately.

Overall, I think I have the whole series plotted in enough details to be
reasonably certain we can pull it off.  Right now I'm dealing with
mm/iov_iter.c stuff; the amount of boilerplate source is already high enough
and with those extra primitives it'll get really unpleasant.

What we need there is something templates-like, as much as I hate C++, and
I'm still not happy with what I have at the moment...  Hopefully I'll get
that in more or less tolerable form today.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC] situation with csum_and_copy_... API
  2014-11-21  8:49                       ` Al Viro
@ 2014-11-21 15:01                         ` Eric Dumazet
  2014-11-21 17:42                         ` David Laight
  2014-11-22  3:27                         ` Al Viro
  2 siblings, 0 replies; 133+ messages in thread
From: Eric Dumazet @ 2014-11-21 15:01 UTC (permalink / raw)
  To: Al Viro
  Cc: David Miller, torvalds, netdev, linux-kernel, target-devel,
	Nicholas A. Bellinger, Christoph Hellwig

On Fri, 2014-11-21 at 08:49 +0000, Al Viro wrote:

> Another thing is tcp_sendmsg_fastopen() and tcp_send_rcvq().  The latter
> should just use copy_from_iter() instead of memcpy_from_iovec(), the former
> is dealt with by making tcp_send_syn_data() use the same copy_from_iter()
> instead of memcpy_from_iovecend().

Well, another problem I already mentioned is that tcp_send_rcvq() does a
single alloc_skb() with @size directly coming from user space. This
certainly can try allocation of dozen of Megabytes.

Not good.




^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC] situation with csum_and_copy_... API
  2014-11-20 23:23                 ` David Miller
@ 2014-11-21 17:26                   ` David Miller
  2014-11-22  4:28                     ` Al Viro
  0 siblings, 1 reply; 133+ messages in thread
From: David Miller @ 2014-11-21 17:26 UTC (permalink / raw)
  To: viro; +Cc: torvalds, netdev, linux-kernel, target-devel, nab, hch

From: David Miller <davem@davemloft.net>
Date: Thu, 20 Nov 2014 18:23:39 -0500 (EST)

> From: Al Viro <viro@ZenIV.linux.org.uk>
> Date: Thu, 20 Nov 2014 21:47:53 +0000
> 
>> On Wed, Nov 19, 2014 at 04:53:40PM -0500, David Miller wrote:
>> 
>>> Pulled, thanks Al.
>> 
>> Umm...  Not in net-next.git#master...
> 
> Sorry, I may have done my usual: pull at the office compute, fire up
> build testing, go home without pushing back out to kernel.org
> 
> :-/
> 
> I'll check first thing tomorrow morning.

I've resolved this now, sorry for the inconvenience Al.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* RE: [RFC] situation with csum_and_copy_... API
  2014-11-21  8:49                       ` Al Viro
  2014-11-21 15:01                         ` Eric Dumazet
@ 2014-11-21 17:42                         ` David Laight
  2014-11-21 19:39                           ` Al Viro
  2014-11-22  3:27                         ` Al Viro
  2 siblings, 1 reply; 133+ messages in thread
From: David Laight @ 2014-11-21 17:42 UTC (permalink / raw)
  To: 'Al Viro', Eric Dumazet
  Cc: David Miller, torvalds, netdev, linux-kernel, target-devel,
	Nicholas A. Bellinger, Christoph Hellwig

From: Al Viro
...
> We would be better off with iov_iter passed to __sock_{send,recv}msg() (as
> a part of struct msghdr, instead of ->msg_iov/->msg_iovlen) and always
> advanced to match the amount of data actually picked from it.  With iovec
> behind it remaining constant.  That would work just as well as the current
> variant for sendmsg(2)/recvmsg(2)/etc., be a lot more convenient for
> kernel_{send,recv}msg() callers and would allow a lot of other fun stuff.

Callers of kernel_send/recvmsg() could easily be using a wrapper
function that creates the 'msghdr'.
When the want to send the remaining part of a buffer the old iterator
will no longer be available - just the original iov and the required offset.

So it would be useful if the iterator could be initialised to a byte
offset down the iov[].

Are there any current code paths where the iov[] is modified but
ends up being something other than 'the remaining data'?
If not then code can check whether iov[0].len has changed, and
skip the 'advance' is it has.

(I've an out-of-tree driver that assumes the iov[] isn't changed.)

	David




^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC] situation with csum_and_copy_... API
  2014-11-21 17:42                         ` David Laight
@ 2014-11-21 19:39                           ` Al Viro
  2014-11-21 19:40                             ` Linus Torvalds
  2014-11-24 10:03                             ` David Laight
  0 siblings, 2 replies; 133+ messages in thread
From: Al Viro @ 2014-11-21 19:39 UTC (permalink / raw)
  To: David Laight
  Cc: Eric Dumazet, David Miller, torvalds, netdev, linux-kernel,
	target-devel, Nicholas A. Bellinger, Christoph Hellwig

On Fri, Nov 21, 2014 at 05:42:55PM +0000, David Laight wrote:

> Callers of kernel_send/recvmsg() could easily be using a wrapper
> function that creates the 'msghdr'.
> When the want to send the remaining part of a buffer the old iterator
> will no longer be available - just the original iov and the required offset.

Er...  So why not copy a struct iov_iter to/from msg->msg_iter, then?
It's not as it had been particulary large - 5 words isn't much...

I'm not at all sure that _anything_ has valid reasons for draining iovecs.
Maintaining a struct iov_iter and modifying it is easy and actually faster...

Right now the main examples outside of net/* are due to unfortunate
limitations of ->sendmsg() - until now it had no way to be told that
desired data starts at offset.  With ->msg_iter it obviously becomes
possible...

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC] situation with csum_and_copy_... API
  2014-11-21 19:39                           ` Al Viro
@ 2014-11-21 19:40                             ` Linus Torvalds
  2014-11-24 10:03                             ` David Laight
  1 sibling, 0 replies; 133+ messages in thread
From: Linus Torvalds @ 2014-11-21 19:40 UTC (permalink / raw)
  To: Al Viro
  Cc: David Laight, Eric Dumazet, David Miller, netdev, linux-kernel,
	target-devel, Nicholas A. Bellinger, Christoph Hellwig

On Fri, Nov 21, 2014 at 11:39 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> I'm not at all sure that _anything_ has valid reasons for draining iovecs.
> Maintaining a struct iov_iter and modifying it is easy and actually faster...

For new code, I agree. But the whole "draining iovec's" is a fairly
common old model.

                  Linus

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC] situation with csum_and_copy_... API
  2014-11-21  8:49                       ` Al Viro
  2014-11-21 15:01                         ` Eric Dumazet
  2014-11-21 17:42                         ` David Laight
@ 2014-11-22  3:27                         ` Al Viro
  2014-11-22  3:36                           ` Al Viro
  2014-11-24 10:27                           ` David Laight
  2 siblings, 2 replies; 133+ messages in thread
From: Al Viro @ 2014-11-22  3:27 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: David Miller, netdev, linux-kernel, target-devel,
	Nicholas A. Bellinger, Christoph Hellwig, Eric Dumazet

On Fri, Nov 21, 2014 at 08:49:56AM +0000, Al Viro wrote:

> Overall, I think I have the whole series plotted in enough details to be
> reasonably certain we can pull it off.  Right now I'm dealing with
> mm/iov_iter.c stuff; the amount of boilerplate source is already high enough
> and with those extra primitives it'll get really unpleasant.
> 
> What we need there is something templates-like, as much as I hate C++, and
> I'm still not happy with what I have at the moment...  Hopefully I'll get
> that in more or less tolerable form today.

Folks, I would really like comments on the patch below.  It's an attempt
to reduce the amount of boilerplate code in mm/iov_iter.c; no new primitives
added, just trying to reduce the amount of duplication in there.  I'm not
too fond of the way it currently looks, to put it mildly.  It seems to
work, it's reasonably straightforward and it even generates slightly better
code than before, but I would _very_ welcome any tricks that would allow to
make it not so tasteless.  I like the effect on line count (+124-358), but...

It defines two iterators (for iovec-backed and bvec-backed ones) and converts
a bunch of primitives to those.  The last argument is an expression evaluated
for a bunch of ranges; for bvec one it's void, for iovec - size_t; if it
evaluates to non-0, we treat it as read/write/whatever short by that many
bytes and do not proceed any further.

Any suggestions are welcome.

diff --git a/mm/iov_iter.c b/mm/iov_iter.c
index eafcf60..611af2bd 100644
--- a/mm/iov_iter.c
+++ b/mm/iov_iter.c
@@ -4,11 +4,75 @@
 #include <linux/slab.h>
 #include <linux/vmalloc.h>
 
+#define iterate_iovec(i, n, buf, len, move, STEP) {	\
+	const struct iovec *iov = i->iov;		\
+	size_t skip = i->iov_offset;			\
+	size_t left;					\
+	size_t wanted = n;				\
+	buf = iov->iov_base + skip;			\
+	len = min(n, iov->iov_len - skip);		\
+	left = STEP;					\
+	len -= left;					\
+	skip += len;					\
+	n -= len;					\
+	while (unlikely(!left && n)) {			\
+		iov++;					\
+		buf = iov->iov_base;			\
+		len = min(n, iov->iov_len);		\
+		left = STEP;				\
+		len -= left;				\
+		skip = len;				\
+		n -= len;				\
+	}						\
+	n = wanted - n;					\
+	if (move) {					\
+		if (skip == iov->iov_len) {		\
+			iov++;				\
+			skip = 0;			\
+		}					\
+		i->count -= n;				\
+		i->nr_segs -= iov - i->iov;		\
+		i->iov = iov;				\
+		i->iov_offset = skip;			\
+	}						\
+}
+
+#define iterate_bvec(i, n, page, off, len, move, STEP) {\
+	const struct bio_vec *bvec = i->bvec;		\
+	size_t skip = i->iov_offset;			\
+	size_t wanted = n;				\
+	page = bvec->bv_page;				\
+	off = bvec->bv_offset + skip;	 		\
+	len = min_t(size_t, n, bvec->bv_len - skip);	\
+	STEP;						\
+	skip += len;					\
+	n -= len;					\
+	while (unlikely(n)) {				\
+		bvec++;					\
+		page = bvec->bv_page;			\
+		off = bvec->bv_offset;			\
+		len = min_t(size_t, n, bvec->bv_len);	\
+		STEP;					\
+		skip = len;				\
+		n -= len;				\
+	}						\
+	n = wanted;					\
+	if (move) {					\
+		if (skip == bvec->bv_len) {		\
+			bvec++;				\
+			skip = 0;			\
+		}					\
+		i->count -= n;				\
+		i->nr_segs -= bvec - i->bvec;		\
+		i->bvec = bvec;				\
+		i->iov_offset = skip;			\
+	}						\
+}
+
 static size_t copy_to_iter_iovec(void *from, size_t bytes, struct iov_iter *i)
 {
-	size_t skip, copy, left, wanted;
-	const struct iovec *iov;
 	char __user *buf;
+	size_t len;
 
 	if (unlikely(bytes > i->count))
 		bytes = i->count;
@@ -16,44 +80,15 @@ static size_t copy_to_iter_iovec(void *from, size_t bytes, struct iov_iter *i)
 	if (unlikely(!bytes))
 		return 0;
 
-	wanted = bytes;
-	iov = i->iov;
-	skip = i->iov_offset;
-	buf = iov->iov_base + skip;
-	copy = min(bytes, iov->iov_len - skip);
-
-	left = __copy_to_user(buf, from, copy);
-	copy -= left;
-	skip += copy;
-	from += copy;
-	bytes -= copy;
-	while (unlikely(!left && bytes)) {
-		iov++;
-		buf = iov->iov_base;
-		copy = min(bytes, iov->iov_len);
-		left = __copy_to_user(buf, from, copy);
-		copy -= left;
-		skip = copy;
-		from += copy;
-		bytes -= copy;
-	}
-
-	if (skip == iov->iov_len) {
-		iov++;
-		skip = 0;
-	}
-	i->count -= wanted - bytes;
-	i->nr_segs -= iov - i->iov;
-	i->iov = iov;
-	i->iov_offset = skip;
-	return wanted - bytes;
+	iterate_iovec(i, bytes, buf, len, true,
+			__copy_to_user(buf, (from += len) - len, len))
+	return bytes;
 }
 
 static size_t copy_from_iter_iovec(void *to, size_t bytes, struct iov_iter *i)
 {
-	size_t skip, copy, left, wanted;
-	const struct iovec *iov;
 	char __user *buf;
+	size_t len;
 
 	if (unlikely(bytes > i->count))
 		bytes = i->count;
@@ -61,37 +96,9 @@ static size_t copy_from_iter_iovec(void *to, size_t bytes, struct iov_iter *i)
 	if (unlikely(!bytes))
 		return 0;
 
-	wanted = bytes;
-	iov = i->iov;
-	skip = i->iov_offset;
-	buf = iov->iov_base + skip;
-	copy = min(bytes, iov->iov_len - skip);
-
-	left = __copy_from_user(to, buf, copy);
-	copy -= left;
-	skip += copy;
-	to += copy;
-	bytes -= copy;
-	while (unlikely(!left && bytes)) {
-		iov++;
-		buf = iov->iov_base;
-		copy = min(bytes, iov->iov_len);
-		left = __copy_from_user(to, buf, copy);
-		copy -= left;
-		skip = copy;
-		to += copy;
-		bytes -= copy;
-	}
-
-	if (skip == iov->iov_len) {
-		iov++;
-		skip = 0;
-	}
-	i->count -= wanted - bytes;
-	i->nr_segs -= iov - i->iov;
-	i->iov = iov;
-	i->iov_offset = skip;
-	return wanted - bytes;
+	iterate_iovec(i, bytes, buf, len, true,
+			__copy_from_user((to += len) - len, buf, len))
+	return bytes;
 }
 
 static size_t copy_page_to_iter_iovec(struct page *page, size_t offset, size_t bytes,
@@ -256,134 +263,6 @@ done:
 	return wanted - bytes;
 }
 
-static size_t zero_iovec(size_t bytes, struct iov_iter *i)
-{
-	size_t skip, copy, left, wanted;
-	const struct iovec *iov;
-	char __user *buf;
-
-	if (unlikely(bytes > i->count))
-		bytes = i->count;
-
-	if (unlikely(!bytes))
-		return 0;
-
-	wanted = bytes;
-	iov = i->iov;
-	skip = i->iov_offset;
-	buf = iov->iov_base + skip;
-	copy = min(bytes, iov->iov_len - skip);
-
-	left = __clear_user(buf, copy);
-	copy -= left;
-	skip += copy;
-	bytes -= copy;
-
-	while (unlikely(!left && bytes)) {
-		iov++;
-		buf = iov->iov_base;
-		copy = min(bytes, iov->iov_len);
-		left = __clear_user(buf, copy);
-		copy -= left;
-		skip = copy;
-		bytes -= copy;
-	}
-
-	if (skip == iov->iov_len) {
-		iov++;
-		skip = 0;
-	}
-	i->count -= wanted - bytes;
-	i->nr_segs -= iov - i->iov;
-	i->iov = iov;
-	i->iov_offset = skip;
-	return wanted - bytes;
-}
-
-static size_t __iovec_copy_from_user_inatomic(char *vaddr,
-			const struct iovec *iov, size_t base, size_t bytes)
-{
-	size_t copied = 0, left = 0;
-
-	while (bytes) {
-		char __user *buf = iov->iov_base + base;
-		int copy = min(bytes, iov->iov_len - base);
-
-		base = 0;
-		left = __copy_from_user_inatomic(vaddr, buf, copy);
-		copied += copy;
-		bytes -= copy;
-		vaddr += copy;
-		iov++;
-
-		if (unlikely(left))
-			break;
-	}
-	return copied - left;
-}
-
-/*
- * Copy as much as we can into the page and return the number of bytes which
- * were successfully copied.  If a fault is encountered then return the number of
- * bytes which were copied.
- */
-static size_t copy_from_user_atomic_iovec(struct page *page,
-		struct iov_iter *i, unsigned long offset, size_t bytes)
-{
-	char *kaddr;
-	size_t copied;
-
-	kaddr = kmap_atomic(page);
-	if (likely(i->nr_segs == 1)) {
-		int left;
-		char __user *buf = i->iov->iov_base + i->iov_offset;
-		left = __copy_from_user_inatomic(kaddr + offset, buf, bytes);
-		copied = bytes - left;
-	} else {
-		copied = __iovec_copy_from_user_inatomic(kaddr + offset,
-						i->iov, i->iov_offset, bytes);
-	}
-	kunmap_atomic(kaddr);
-
-	return copied;
-}
-
-static void advance_iovec(struct iov_iter *i, size_t bytes)
-{
-	BUG_ON(i->count < bytes);
-
-	if (likely(i->nr_segs == 1)) {
-		i->iov_offset += bytes;
-		i->count -= bytes;
-	} else {
-		const struct iovec *iov = i->iov;
-		size_t base = i->iov_offset;
-		unsigned long nr_segs = i->nr_segs;
-
-		/*
-		 * The !iov->iov_len check ensures we skip over unlikely
-		 * zero-length segments (without overruning the iovec).
-		 */
-		while (bytes || unlikely(i->count && !iov->iov_len)) {
-			int copy;
-
-			copy = min(bytes, iov->iov_len - base);
-			BUG_ON(!i->count || i->count < copy);
-			i->count -= copy;
-			bytes -= copy;
-			base += copy;
-			if (iov->iov_len == base) {
-				iov++;
-				nr_segs--;
-				base = 0;
-			}
-		}
-		i->iov = iov;
-		i->iov_offset = base;
-		i->nr_segs = nr_segs;
-	}
-}
-
 /*
  * Fault in the first iovec of the given iov_iter, to a maximum length
  * of bytes. Returns 0 on success, or non-zero if the memory could not be
@@ -557,8 +436,8 @@ static void memzero_page(struct page *page, size_t offset, size_t len)
 
 static size_t copy_to_iter_bvec(void *from, size_t bytes, struct iov_iter *i)
 {
-	size_t skip, copy, wanted;
-	const struct bio_vec *bvec;
+	struct page *page;
+	size_t off, len;
 
 	if (unlikely(bytes > i->count))
 		bytes = i->count;
@@ -566,38 +445,15 @@ static size_t copy_to_iter_bvec(void *from, size_t bytes, struct iov_iter *i)
 	if (unlikely(!bytes))
 		return 0;
 
-	wanted = bytes;
-	bvec = i->bvec;
-	skip = i->iov_offset;
-	copy = min_t(size_t, bytes, bvec->bv_len - skip);
-
-	memcpy_to_page(bvec->bv_page, skip + bvec->bv_offset, from, copy);
-	skip += copy;
-	from += copy;
-	bytes -= copy;
-	while (bytes) {
-		bvec++;
-		copy = min(bytes, (size_t)bvec->bv_len);
-		memcpy_to_page(bvec->bv_page, bvec->bv_offset, from, copy);
-		skip = copy;
-		from += copy;
-		bytes -= copy;
-	}
-	if (skip == bvec->bv_len) {
-		bvec++;
-		skip = 0;
-	}
-	i->count -= wanted - bytes;
-	i->nr_segs -= bvec - i->bvec;
-	i->bvec = bvec;
-	i->iov_offset = skip;
-	return wanted - bytes;
+	iterate_bvec(i, bytes, page, off, len, true,
+		     memcpy_from_page((from += len) - len, page, off, len))
+	return bytes;
 }
 
 static size_t copy_from_iter_bvec(void *to, size_t bytes, struct iov_iter *i)
 {
-	size_t skip, copy, wanted;
-	const struct bio_vec *bvec;
+	struct page *page;
+	size_t off, len;
 
 	if (unlikely(bytes > i->count))
 		bytes = i->count;
@@ -605,35 +461,9 @@ static size_t copy_from_iter_bvec(void *to, size_t bytes, struct iov_iter *i)
 	if (unlikely(!bytes))
 		return 0;
 
-	wanted = bytes;
-	bvec = i->bvec;
-	skip = i->iov_offset;
-
-	copy = min(bytes, bvec->bv_len - skip);
-
-	memcpy_from_page(to, bvec->bv_page, bvec->bv_offset + skip, copy);
-
-	to += copy;
-	skip += copy;
-	bytes -= copy;
-
-	while (bytes) {
-		bvec++;
-		copy = min(bytes, (size_t)bvec->bv_len);
-		memcpy_from_page(to, bvec->bv_page, bvec->bv_offset, copy);
-		skip = copy;
-		to += copy;
-		bytes -= copy;
-	}
-	if (skip == bvec->bv_len) {
-		bvec++;
-		skip = 0;
-	}
-	i->count -= wanted;
-	i->nr_segs -= bvec - i->bvec;
-	i->bvec = bvec;
-	i->iov_offset = skip;
-	return wanted;
+	iterate_bvec(i, bytes, page, off, len, true,
+		     memcpy_to_page(page, off, (to += len) - len, len))
+	return bytes;
 }
 
 static size_t copy_page_to_iter_bvec(struct page *page, size_t offset,
@@ -654,101 +484,6 @@ static size_t copy_page_from_iter_bvec(struct page *page, size_t offset,
 	return wanted;
 }
 
-static size_t zero_bvec(size_t bytes, struct iov_iter *i)
-{
-	size_t skip, copy, wanted;
-	const struct bio_vec *bvec;
-
-	if (unlikely(bytes > i->count))
-		bytes = i->count;
-
-	if (unlikely(!bytes))
-		return 0;
-
-	wanted = bytes;
-	bvec = i->bvec;
-	skip = i->iov_offset;
-	copy = min_t(size_t, bytes, bvec->bv_len - skip);
-
-	memzero_page(bvec->bv_page, skip + bvec->bv_offset, copy);
-	skip += copy;
-	bytes -= copy;
-	while (bytes) {
-		bvec++;
-		copy = min(bytes, (size_t)bvec->bv_len);
-		memzero_page(bvec->bv_page, bvec->bv_offset, copy);
-		skip = copy;
-		bytes -= copy;
-	}
-	if (skip == bvec->bv_len) {
-		bvec++;
-		skip = 0;
-	}
-	i->count -= wanted - bytes;
-	i->nr_segs -= bvec - i->bvec;
-	i->bvec = bvec;
-	i->iov_offset = skip;
-	return wanted - bytes;
-}
-
-static size_t copy_from_user_bvec(struct page *page,
-		struct iov_iter *i, unsigned long offset, size_t bytes)
-{
-	char *kaddr;
-	size_t left;
-	const struct bio_vec *bvec;
-	size_t base = i->iov_offset;
-
-	kaddr = kmap_atomic(page);
-	for (left = bytes, bvec = i->bvec; left; bvec++, base = 0) {
-		size_t copy = min(left, bvec->bv_len - base);
-		if (!bvec->bv_len)
-			continue;
-		memcpy_from_page(kaddr + offset, bvec->bv_page,
-				 bvec->bv_offset + base, copy);
-		offset += copy;
-		left -= copy;
-	}
-	kunmap_atomic(kaddr);
-	return bytes;
-}
-
-static void advance_bvec(struct iov_iter *i, size_t bytes)
-{
-	BUG_ON(i->count < bytes);
-
-	if (likely(i->nr_segs == 1)) {
-		i->iov_offset += bytes;
-		i->count -= bytes;
-	} else {
-		const struct bio_vec *bvec = i->bvec;
-		size_t base = i->iov_offset;
-		unsigned long nr_segs = i->nr_segs;
-
-		/*
-		 * The !iov->iov_len check ensures we skip over unlikely
-		 * zero-length segments (without overruning the iovec).
-		 */
-		while (bytes || unlikely(i->count && !bvec->bv_len)) {
-			int copy;
-
-			copy = min(bytes, bvec->bv_len - base);
-			BUG_ON(!i->count || i->count < copy);
-			i->count -= copy;
-			bytes -= copy;
-			base += copy;
-			if (bvec->bv_len == base) {
-				bvec++;
-				nr_segs--;
-				base = 0;
-			}
-		}
-		i->bvec = bvec;
-		i->iov_offset = base;
-		i->nr_segs = nr_segs;
-	}
-}
-
 static unsigned long alignment_bvec(const struct iov_iter *i)
 {
 	const struct bio_vec *bvec = i->bvec;
@@ -876,30 +611,61 @@ EXPORT_SYMBOL(copy_from_iter);
 
 size_t iov_iter_zero(size_t bytes, struct iov_iter *i)
 {
+	if (unlikely(bytes > i->count))
+		bytes = i->count;
+
+	if (unlikely(!bytes))
+		return 0;
+
 	if (i->type & ITER_BVEC) {
-		return zero_bvec(bytes, i);
+		struct page *page;
+		size_t off, len;
+		iterate_bvec(i, bytes, page, off, len, true,
+				memzero_page(page, off, len))
 	} else {
-		return zero_iovec(bytes, i);
+		char __user *buf;
+		size_t len;
+		iterate_iovec(i, bytes, buf, len, true,
+				__clear_user(buf, len))
 	}
+	return bytes;
 }
 EXPORT_SYMBOL(iov_iter_zero);
 
 size_t iov_iter_copy_from_user_atomic(struct page *page,
 		struct iov_iter *i, unsigned long offset, size_t bytes)
 {
-	if (i->type & ITER_BVEC)
-		return copy_from_user_bvec(page, i, offset, bytes);
-	else
-		return copy_from_user_atomic_iovec(page, i, offset, bytes);
+	char *kaddr = kmap_atomic(page), *p = kaddr + offset;
+	if (i->type & ITER_BVEC) {
+		struct page *page;
+		size_t off, len;
+		iterate_bvec(i, bytes, page, off, len, false,
+			     memcpy_from_page((p += len) - len, page, off, len))
+	} else {
+		char __user *buf;
+		size_t len;
+		iterate_iovec(i, bytes, buf, len, false,
+			      __copy_from_user_inatomic((p += len) - len,
+							buf, len))
+	}
+	kunmap_atomic(kaddr);
+	return bytes;
 }
 EXPORT_SYMBOL(iov_iter_copy_from_user_atomic);
 
 void iov_iter_advance(struct iov_iter *i, size_t size)
 {
-	if (i->type & ITER_BVEC)
-		advance_bvec(i, size);
-	else
-		advance_iovec(i, size);
+	if (i->type & ITER_BVEC) {
+		struct page *page;
+		size_t off, len;
+		iterate_bvec(i, size, page, off, len, true,
+				(void)0)
+	} else {
+		char __user *buf;
+		size_t len;
+		iterate_iovec(i, size, buf, len, true,
+				0)
+	}
 }
 EXPORT_SYMBOL(iov_iter_advance);
 

^ permalink raw reply related	[flat|nested] 133+ messages in thread

* Re: [RFC] situation with csum_and_copy_... API
  2014-11-22  3:27                         ` Al Viro
@ 2014-11-22  3:36                           ` Al Viro
  2014-11-24 10:27                           ` David Laight
  1 sibling, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-11-22  3:36 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: David Miller, netdev, linux-kernel, target-devel,
	Nicholas A. Bellinger, Christoph Hellwig, Eric Dumazet

On Sat, Nov 22, 2014 at 03:27:18AM +0000, Al Viro wrote:
  
> @@ -566,38 +445,15 @@ static size_t copy_to_iter_bvec(void *from, size_t bytes, struct iov_iter *i)
[snip]
> +	iterate_bvec(i, bytes, page, off, len, true,
> +		     memcpy_from_page((from += len) - len, page, off, len))

should be
+		     memcpy_to_page(page, off, (from += len) - len, len))

and

> @@ -605,35 +461,9 @@ static size_t copy_from_iter_bvec(void *to, size_t bytes, struct iov_iter *i)
[snip]
> +	iterate_bvec(i, bytes, page, off, len, true,
> +		     memcpy_to_page(page, off, (to += len) - len, len))

should be
+		     memcpy_from_page((to += len) - len, page, off, len))

Sorry, wrong delta.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC] situation with csum_and_copy_... API
  2014-11-21 17:26                   ` David Miller
@ 2014-11-22  4:28                     ` Al Viro
  2014-11-22  4:29                       ` [PATCH 01/17] new helper: skb_copy_and_csum_datagram_msg() Al Viro
                                         ` (18 more replies)
  0 siblings, 19 replies; 133+ messages in thread
From: Al Viro @ 2014-11-22  4:28 UTC (permalink / raw)
  To: David Miller; +Cc: torvalds, netdev, linux-kernel, target-devel, nab, hch

	OK, here's the next bunch.  Sorry about the delay, iov_iter.c stuff
took most of the day (and it's not included in this pile).  Please, review.
Al Viro (17):
      new helper: skb_copy_and_csum_datagram_msg()
      new helper: memcpy_from_msg()
      switch ipxrtr_route_packet() from iovec to msghdr
      new helper: memcpy_to_msg()
      switch drivers/net/tun.c to ->read_iter()
      switch macvtap to ->read_iter()
      new helpers: skb_copy_datagram_from_iter() and zerocopy_sg_from_iter()
      {macvtap,tun}_get_user(): switch to iov_iter
      kill zerocopy_sg_from_iovec()
      switch AF_PACKET and AF_UNIX to skb_copy_datagram_from_iter()
      switch sctp_user_addto_chunk() and sctp_datamsg_from_user() to passing iov_iter
      tipc_sendmsg(): pass msghdr instead of its ->msg_iov
      tipc_msg_build(): pass msghdr instead of its ->msg_iov
      vmci_transport: switch ->enqeue_dgram, ->enqueue_stream and ->dequeue_stream to msghdr
      [atm] switch vcc_sendmsg() to copy_from_iter()
      rds: switch ->inc_copy_to_user() to passing iov_iter
      rds: switch rds_message_copy_from_user() to iov_iter

Patches themselves are in followups...

^ permalink raw reply	[flat|nested] 133+ messages in thread

* [PATCH 01/17] new helper: skb_copy_and_csum_datagram_msg()
  2014-11-22  4:28                     ` Al Viro
@ 2014-11-22  4:29                       ` Al Viro
  2014-11-22  4:30                       ` [PATCH 02/17] new helper: memcpy_from_msg() Al Viro
                                         ` (17 subsequent siblings)
  18 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-11-22  4:29 UTC (permalink / raw)
  To: David Miller; +Cc: torvalds, netdev, linux-kernel, target-devel, nab, hch

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 include/linux/skbuff.h |    5 +++++
 net/ipv4/udp.c         |    5 ++---
 net/ipv6/raw.c         |    2 +-
 net/ipv6/udp.c         |    2 +-
 4 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 73c370e..4fc4024 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2651,6 +2651,11 @@ static inline int skb_copy_datagram_msg(const struct sk_buff *from, int offset,
 }
 int skb_copy_and_csum_datagram_iovec(struct sk_buff *skb, int hlen,
 				     struct iovec *iov);
+static inline int skb_copy_and_csum_datagram_msg(struct sk_buff *skb, int hlen,
+			    struct msghdr *msg)
+{
+	return skb_copy_and_csum_datagram_iovec(skb, hlen, msg->msg_iov);
+}
 int skb_copy_datagram_from_iovec(struct sk_buff *skb, int offset,
 				 const struct iovec *from, int from_offset,
 				 int len);
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 4a16b91..b2d6068 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1284,9 +1284,8 @@ try_again:
 		err = skb_copy_datagram_msg(skb, sizeof(struct udphdr),
 					    msg, copied);
 	else {
-		err = skb_copy_and_csum_datagram_iovec(skb,
-						       sizeof(struct udphdr),
-						       msg->msg_iov);
+		err = skb_copy_and_csum_datagram_msg(skb, sizeof(struct udphdr),
+						     msg);
 
 		if (err == -EINVAL)
 			goto csum_copy_err;
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 0cbcf98..8baa53e 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -492,7 +492,7 @@ static int rawv6_recvmsg(struct kiocb *iocb, struct sock *sk,
 			goto csum_copy_err;
 		err = skb_copy_datagram_msg(skb, 0, msg, copied);
 	} else {
-		err = skb_copy_and_csum_datagram_iovec(skb, 0, msg->msg_iov);
+		err = skb_copy_and_csum_datagram_msg(skb, 0, msg);
 		if (err == -EINVAL)
 			goto csum_copy_err;
 	}
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 0ba3de4..961565a 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -427,7 +427,7 @@ try_again:
 		err = skb_copy_datagram_msg(skb, sizeof(struct udphdr),
 					    msg, copied);
 	else {
-		err = skb_copy_and_csum_datagram_iovec(skb, sizeof(struct udphdr), msg->msg_iov);
+		err = skb_copy_and_csum_datagram_msg(skb, sizeof(struct udphdr), msg);
 		if (err == -EINVAL)
 			goto csum_copy_err;
 	}
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 02/17] new helper: memcpy_from_msg()
  2014-11-22  4:28                     ` Al Viro
  2014-11-22  4:29                       ` [PATCH 01/17] new helper: skb_copy_and_csum_datagram_msg() Al Viro
@ 2014-11-22  4:30                       ` Al Viro
  2014-11-22  4:30                       ` [PATCH 03/17] switch ipxrtr_route_packet() from iovec to msghdr Al Viro
                                         ` (16 subsequent siblings)
  18 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-11-22  4:30 UTC (permalink / raw)
  To: David Miller; +Cc: torvalds, netdev, linux-kernel, target-devel, nab, hch


Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 crypto/algif_skcipher.c     |   10 +++++-----
 drivers/isdn/mISDN/socket.c |    2 +-
 drivers/net/ppp/pppoe.c     |    2 +-
 include/linux/skbuff.h      |    5 +++++
 include/net/sctp/sm.h       |    2 +-
 net/appletalk/ddp.c         |    2 +-
 net/ax25/af_ax25.c          |    2 +-
 net/bluetooth/hci_sock.c    |    2 +-
 net/bluetooth/mgmt.c        |    2 +-
 net/bluetooth/rfcomm/sock.c |    2 +-
 net/bluetooth/sco.c         |    2 +-
 net/caif/caif_socket.c      |    4 ++--
 net/can/bcm.c               |   19 ++++++++-----------
 net/can/raw.c               |    2 +-
 net/dccp/proto.c            |    2 +-
 net/decnet/af_decnet.c      |    2 +-
 net/ieee802154/dgram.c      |    2 +-
 net/ieee802154/raw.c        |    2 +-
 net/ipv4/ping.c             |    2 +-
 net/ipv4/tcp_input.c        |    2 +-
 net/irda/af_irda.c          |    6 +++---
 net/iucv/af_iucv.c          |    2 +-
 net/key/af_key.c            |    2 +-
 net/l2tp/l2tp_ip.c          |    2 +-
 net/l2tp/l2tp_ppp.c         |    3 +--
 net/llc/af_llc.c            |    2 +-
 net/netlink/af_netlink.c    |    2 +-
 net/netrom/af_netrom.c      |    2 +-
 net/nfc/llcp_commands.c     |    4 ++--
 net/nfc/rawsock.c           |    2 +-
 net/packet/af_packet.c      |    5 ++---
 net/phonet/datagram.c       |    2 +-
 net/phonet/pep.c            |    2 +-
 net/rose/af_rose.c          |    2 +-
 net/sctp/sm_make_chunk.c    |    4 ++--
 net/x25/af_x25.c            |    2 +-
 36 files changed, 57 insertions(+), 57 deletions(-)

diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c
index 83187f4..c3b482b 100644
--- a/crypto/algif_skcipher.c
+++ b/crypto/algif_skcipher.c
@@ -298,9 +298,9 @@ static int skcipher_sendmsg(struct kiocb *unused, struct socket *sock,
 			len = min_t(unsigned long, len,
 				    PAGE_SIZE - sg->offset - sg->length);
 
-			err = memcpy_fromiovec(page_address(sg_page(sg)) +
-					       sg->offset + sg->length,
-					       msg->msg_iov, len);
+			err = memcpy_from_msg(page_address(sg_page(sg)) +
+					      sg->offset + sg->length,
+					      msg, len);
 			if (err)
 				goto unlock;
 
@@ -337,8 +337,8 @@ static int skcipher_sendmsg(struct kiocb *unused, struct socket *sock,
 			if (!sg_page(sg + i))
 				goto unlock;
 
-			err = memcpy_fromiovec(page_address(sg_page(sg + i)),
-					       msg->msg_iov, plen);
+			err = memcpy_from_msg(page_address(sg_page(sg + i)),
+					      msg, plen);
 			if (err) {
 				__free_page(sg_page(sg + i));
 				sg_assign_page(sg + i, NULL);
diff --git a/drivers/isdn/mISDN/socket.c b/drivers/isdn/mISDN/socket.c
index dcbd858..84b3592 100644
--- a/drivers/isdn/mISDN/socket.c
+++ b/drivers/isdn/mISDN/socket.c
@@ -203,7 +203,7 @@ mISDN_sock_sendmsg(struct kiocb *iocb, struct socket *sock,
 	if (!skb)
 		goto done;
 
-	if (memcpy_fromiovec(skb_put(skb, len), msg->msg_iov, len)) {
+	if (memcpy_from_msg(skb_put(skb, len), msg, len)) {
 		err = -EFAULT;
 		goto done;
 	}
diff --git a/drivers/net/ppp/pppoe.c b/drivers/net/ppp/pppoe.c
index 443cbbf..d2408a5 100644
--- a/drivers/net/ppp/pppoe.c
+++ b/drivers/net/ppp/pppoe.c
@@ -869,7 +869,7 @@ static int pppoe_sendmsg(struct kiocb *iocb, struct socket *sock,
 	ph = (struct pppoe_hdr *)skb_put(skb, total_len + sizeof(struct pppoe_hdr));
 	start = (char *)&ph->tag[0];
 
-	error = memcpy_fromiovec(start, m->msg_iov, total_len);
+	error = memcpy_from_msg(start, m, total_len);
 	if (error < 0) {
 		kfree_skb(skb);
 		goto end;
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 4fc4024..cb7fa2b 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2684,6 +2684,11 @@ unsigned int skb_gso_transport_seglen(const struct sk_buff *skb);
 struct sk_buff *skb_segment(struct sk_buff *skb, netdev_features_t features);
 struct sk_buff *skb_vlan_untag(struct sk_buff *skb);
 
+static inline int memcpy_from_msg(void *data, struct msghdr *msg, int len)
+{
+	return memcpy_fromiovec(data, msg->msg_iov, len);
+}
+
 struct skb_checksum_ops {
 	__wsum (*update)(const void *mem, int len, __wsum wsum);
 	__wsum (*combine)(__wsum csum, __wsum csum2, int offset, int len);
diff --git a/include/net/sctp/sm.h b/include/net/sctp/sm.h
index 72a31db..487ef34 100644
--- a/include/net/sctp/sm.h
+++ b/include/net/sctp/sm.h
@@ -219,7 +219,7 @@ struct sctp_chunk *sctp_make_abort_no_data(const struct sctp_association *,
 				      const struct sctp_chunk *,
 				      __u32 tsn);
 struct sctp_chunk *sctp_make_abort_user(const struct sctp_association *,
-					const struct msghdr *, size_t msg_len);
+					struct msghdr *, size_t msg_len);
 struct sctp_chunk *sctp_make_abort_violation(const struct sctp_association *,
 				   const struct sctp_chunk *,
 				   const __u8 *,
diff --git a/net/appletalk/ddp.c b/net/appletalk/ddp.c
index 425942d..0d0766e 100644
--- a/net/appletalk/ddp.c
+++ b/net/appletalk/ddp.c
@@ -1659,7 +1659,7 @@ static int atalk_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr
 
 	SOCK_DEBUG(sk, "SK %p: Copy user data (%Zd bytes).\n", sk, len);
 
-	err = memcpy_fromiovec(skb_put(skb, len), msg->msg_iov, len);
+	err = memcpy_from_msg(skb_put(skb, len), msg, len);
 	if (err) {
 		kfree_skb(skb);
 		err = -EFAULT;
diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c
index f4f835e19..ca049a7 100644
--- a/net/ax25/af_ax25.c
+++ b/net/ax25/af_ax25.c
@@ -1549,7 +1549,7 @@ static int ax25_sendmsg(struct kiocb *iocb, struct socket *sock,
 	skb_reserve(skb, size - len);
 
 	/* User data follows immediately after the AX.25 data */
-	if (memcpy_fromiovec(skb_put(skb, len), msg->msg_iov, len)) {
+	if (memcpy_from_msg(skb_put(skb, len), msg, len)) {
 		err = -EFAULT;
 		kfree_skb(skb);
 		goto out;
diff --git a/net/bluetooth/hci_sock.c b/net/bluetooth/hci_sock.c
index 5e2cd25..2c245fd 100644
--- a/net/bluetooth/hci_sock.c
+++ b/net/bluetooth/hci_sock.c
@@ -947,7 +947,7 @@ static int hci_sock_sendmsg(struct kiocb *iocb, struct socket *sock,
 	if (!skb)
 		goto done;
 
-	if (memcpy_fromiovec(skb_put(skb, len), msg->msg_iov, len)) {
+	if (memcpy_from_msg(skb_put(skb, len), msg, len)) {
 		err = -EFAULT;
 		goto drop;
 	}
diff --git a/net/bluetooth/mgmt.c b/net/bluetooth/mgmt.c
index 9c4daf7..824eef7 100644
--- a/net/bluetooth/mgmt.c
+++ b/net/bluetooth/mgmt.c
@@ -5726,7 +5726,7 @@ int mgmt_control(struct sock *sk, struct msghdr *msg, size_t msglen)
 	if (!buf)
 		return -ENOMEM;
 
-	if (memcpy_fromiovec(buf, msg->msg_iov, msglen)) {
+	if (memcpy_from_msg(buf, msg, msglen)) {
 		err = -EFAULT;
 		goto done;
 	}
diff --git a/net/bluetooth/rfcomm/sock.c b/net/bluetooth/rfcomm/sock.c
index 8bbbb5e..2348176 100644
--- a/net/bluetooth/rfcomm/sock.c
+++ b/net/bluetooth/rfcomm/sock.c
@@ -588,7 +588,7 @@ static int rfcomm_sock_sendmsg(struct kiocb *iocb, struct socket *sock,
 		}
 		skb_reserve(skb, RFCOMM_SKB_HEAD_RESERVE);
 
-		err = memcpy_fromiovec(skb_put(skb, size), msg->msg_iov, size);
+		err = memcpy_from_msg(skb_put(skb, size), msg, size);
 		if (err) {
 			kfree_skb(skb);
 			if (sent == 0)
diff --git a/net/bluetooth/sco.c b/net/bluetooth/sco.c
index 7ee9e4a..30e5ea3 100644
--- a/net/bluetooth/sco.c
+++ b/net/bluetooth/sco.c
@@ -285,7 +285,7 @@ static int sco_send_frame(struct sock *sk, struct msghdr *msg, int len)
 	if (!skb)
 		return err;
 
-	if (memcpy_fromiovec(skb_put(skb, len), msg->msg_iov, len)) {
+	if (memcpy_from_msg(skb_put(skb, len), msg, len)) {
 		kfree_skb(skb);
 		return -EFAULT;
 	}
diff --git a/net/caif/caif_socket.c b/net/caif/caif_socket.c
index fbcd156..230f140 100644
--- a/net/caif/caif_socket.c
+++ b/net/caif/caif_socket.c
@@ -566,7 +566,7 @@ static int caif_seqpkt_sendmsg(struct kiocb *kiocb, struct socket *sock,
 
 	skb_reserve(skb, cf_sk->headroom);
 
-	ret = memcpy_fromiovec(skb_put(skb, len), msg->msg_iov, len);
+	ret = memcpy_from_msg(skb_put(skb, len), msg, len);
 
 	if (ret)
 		goto err;
@@ -641,7 +641,7 @@ static int caif_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
 		 */
 		size = min_t(int, size, skb_tailroom(skb));
 
-		err = memcpy_fromiovec(skb_put(skb, size), msg->msg_iov, size);
+		err = memcpy_from_msg(skb_put(skb, size), msg, size);
 		if (err) {
 			kfree_skb(skb);
 			goto out_err;
diff --git a/net/can/bcm.c b/net/can/bcm.c
index dcb75c0..b9a1f71 100644
--- a/net/can/bcm.c
+++ b/net/can/bcm.c
@@ -858,8 +858,7 @@ static int bcm_tx_setup(struct bcm_msg_head *msg_head, struct msghdr *msg,
 
 		/* update can_frames content */
 		for (i = 0; i < msg_head->nframes; i++) {
-			err = memcpy_fromiovec((u8 *)&op->frames[i],
-					       msg->msg_iov, CFSIZ);
+			err = memcpy_from_msg((u8 *)&op->frames[i], msg, CFSIZ);
 
 			if (op->frames[i].can_dlc > 8)
 				err = -EINVAL;
@@ -894,8 +893,7 @@ static int bcm_tx_setup(struct bcm_msg_head *msg_head, struct msghdr *msg,
 			op->frames = &op->sframe;
 
 		for (i = 0; i < msg_head->nframes; i++) {
-			err = memcpy_fromiovec((u8 *)&op->frames[i],
-					       msg->msg_iov, CFSIZ);
+			err = memcpy_from_msg((u8 *)&op->frames[i], msg, CFSIZ);
 
 			if (op->frames[i].can_dlc > 8)
 				err = -EINVAL;
@@ -1024,9 +1022,8 @@ static int bcm_rx_setup(struct bcm_msg_head *msg_head, struct msghdr *msg,
 
 		if (msg_head->nframes) {
 			/* update can_frames content */
-			err = memcpy_fromiovec((u8 *)op->frames,
-					       msg->msg_iov,
-					       msg_head->nframes * CFSIZ);
+			err = memcpy_from_msg((u8 *)op->frames, msg,
+					      msg_head->nframes * CFSIZ);
 			if (err < 0)
 				return err;
 
@@ -1072,8 +1069,8 @@ static int bcm_rx_setup(struct bcm_msg_head *msg_head, struct msghdr *msg,
 		}
 
 		if (msg_head->nframes) {
-			err = memcpy_fromiovec((u8 *)op->frames, msg->msg_iov,
-					       msg_head->nframes * CFSIZ);
+			err = memcpy_from_msg((u8 *)op->frames, msg,
+					      msg_head->nframes * CFSIZ);
 			if (err < 0) {
 				if (op->frames != &op->sframe)
 					kfree(op->frames);
@@ -1209,7 +1206,7 @@ static int bcm_tx_send(struct msghdr *msg, int ifindex, struct sock *sk)
 
 	can_skb_reserve(skb);
 
-	err = memcpy_fromiovec(skb_put(skb, CFSIZ), msg->msg_iov, CFSIZ);
+	err = memcpy_from_msg(skb_put(skb, CFSIZ), msg, CFSIZ);
 	if (err < 0) {
 		kfree_skb(skb);
 		return err;
@@ -1285,7 +1282,7 @@ static int bcm_sendmsg(struct kiocb *iocb, struct socket *sock,
 
 	/* read message head information */
 
-	ret = memcpy_fromiovec((u8 *)&msg_head, msg->msg_iov, MHSIZ);
+	ret = memcpy_from_msg((u8 *)&msg_head, msg, MHSIZ);
 	if (ret < 0)
 		return ret;
 
diff --git a/net/can/raw.c b/net/can/raw.c
index 081e81f..0e4004f 100644
--- a/net/can/raw.c
+++ b/net/can/raw.c
@@ -703,7 +703,7 @@ static int raw_sendmsg(struct kiocb *iocb, struct socket *sock,
 	can_skb_reserve(skb);
 	can_skb_prv(skb)->ifindex = dev->ifindex;
 
-	err = memcpy_fromiovec(skb_put(skb, size), msg->msg_iov, size);
+	err = memcpy_from_msg(skb_put(skb, size), msg, size);
 	if (err < 0)
 		goto free_skb;
 
diff --git a/net/dccp/proto.c b/net/dccp/proto.c
index 8e6ae94..19f0387 100644
--- a/net/dccp/proto.c
+++ b/net/dccp/proto.c
@@ -781,7 +781,7 @@ int dccp_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 		goto out_release;
 
 	skb_reserve(skb, sk->sk_prot->max_header);
-	rc = memcpy_fromiovec(skb_put(skb, len), msg->msg_iov, len);
+	rc = memcpy_from_msg(skb_put(skb, len), msg, len);
 	if (rc != 0)
 		goto out_discard;
 
diff --git a/net/decnet/af_decnet.c b/net/decnet/af_decnet.c
index 25733d5..e2e2e3c 100644
--- a/net/decnet/af_decnet.c
+++ b/net/decnet/af_decnet.c
@@ -2032,7 +2032,7 @@ static int dn_sendmsg(struct kiocb *iocb, struct socket *sock,
 
 		skb_reserve(skb, 64 + DN_MAX_NSP_DATA_HEADER);
 
-		if (memcpy_fromiovec(skb_put(skb, len), msg->msg_iov, len)) {
+		if (memcpy_from_msg(skb_put(skb, len), msg, len)) {
 			err = -EFAULT;
 			goto out;
 		}
diff --git a/net/ieee802154/dgram.c b/net/ieee802154/dgram.c
index b8555ec..2c7a93e 100644
--- a/net/ieee802154/dgram.c
+++ b/net/ieee802154/dgram.c
@@ -276,7 +276,7 @@ static int dgram_sendmsg(struct kiocb *iocb, struct sock *sk,
 	if (err < 0)
 		goto out_skb;
 
-	err = memcpy_fromiovec(skb_put(skb, size), msg->msg_iov, size);
+	err = memcpy_from_msg(skb_put(skb, size), msg, size);
 	if (err < 0)
 		goto out_skb;
 
diff --git a/net/ieee802154/raw.c b/net/ieee802154/raw.c
index 21c3894..61e9d29 100644
--- a/net/ieee802154/raw.c
+++ b/net/ieee802154/raw.c
@@ -150,7 +150,7 @@ static int raw_sendmsg(struct kiocb *iocb, struct sock *sk,
 	skb_reset_mac_header(skb);
 	skb_reset_network_header(skb);
 
-	err = memcpy_fromiovec(skb_put(skb, size), msg->msg_iov, size);
+	err = memcpy_from_msg(skb_put(skb, size), msg, size);
 	if (err < 0)
 		goto out_skb;
 
diff --git a/net/ipv4/ping.c b/net/ipv4/ping.c
index ce2920f..ef8f6ee 100644
--- a/net/ipv4/ping.c
+++ b/net/ipv4/ping.c
@@ -660,7 +660,7 @@ int ping_common_sendmsg(int family, struct msghdr *msg, size_t len,
 	 *	Fetch the ICMP header provided by the userland.
 	 *	iovec is modified! The ICMP header is consumed.
 	 */
-	if (memcpy_fromiovec(user_icmph, msg->msg_iov, icmph_len))
+	if (memcpy_from_msg(user_icmph, msg, icmph_len))
 		return -EFAULT;
 
 	if (family == AF_INET) {
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index d91436b..2e5a1e07 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4368,7 +4368,7 @@ int tcp_send_rcvq(struct sock *sk, struct msghdr *msg, size_t size)
 	if (tcp_try_rmem_schedule(sk, skb, skb->truesize))
 		goto err_free;
 
-	if (memcpy_fromiovec(skb_put(skb, size), msg->msg_iov, size))
+	if (memcpy_from_msg(skb_put(skb, size), msg, size))
 		goto err_free;
 
 	TCP_SKB_CB(skb)->seq = tcp_sk(sk)->rcv_nxt;
diff --git a/net/irda/af_irda.c b/net/irda/af_irda.c
index e8c4090..9052462 100644
--- a/net/irda/af_irda.c
+++ b/net/irda/af_irda.c
@@ -1319,7 +1319,7 @@ static int irda_sendmsg(struct kiocb *iocb, struct socket *sock,
 	skb_reserve(skb, self->max_header_size + 16);
 	skb_reset_transport_header(skb);
 	skb_put(skb, len);
-	err = memcpy_fromiovec(skb_transport_header(skb), msg->msg_iov, len);
+	err = memcpy_from_msg(skb_transport_header(skb), msg, len);
 	if (err) {
 		kfree_skb(skb);
 		goto out_err;
@@ -1569,7 +1569,7 @@ static int irda_sendmsg_dgram(struct kiocb *iocb, struct socket *sock,
 
 	pr_debug("%s(), appending user data\n", __func__);
 	skb_put(skb, len);
-	err = memcpy_fromiovec(skb_transport_header(skb), msg->msg_iov, len);
+	err = memcpy_from_msg(skb_transport_header(skb), msg, len);
 	if (err) {
 		kfree_skb(skb);
 		goto out;
@@ -1678,7 +1678,7 @@ static int irda_sendmsg_ultra(struct kiocb *iocb, struct socket *sock,
 
 	pr_debug("%s(), appending user data\n", __func__);
 	skb_put(skb, len);
-	err = memcpy_fromiovec(skb_transport_header(skb), msg->msg_iov, len);
+	err = memcpy_from_msg(skb_transport_header(skb), msg, len);
 	if (err) {
 		kfree_skb(skb);
 		goto out;
diff --git a/net/iucv/af_iucv.c b/net/iucv/af_iucv.c
index 057b564..1cd3f81 100644
--- a/net/iucv/af_iucv.c
+++ b/net/iucv/af_iucv.c
@@ -1122,7 +1122,7 @@ static int iucv_sock_sendmsg(struct kiocb *iocb, struct socket *sock,
 	}
 	if (iucv->transport == AF_IUCV_TRANS_HIPER)
 		skb_reserve(skb, sizeof(struct af_iucv_trans_hdr) + ETH_HLEN);
-	if (memcpy_fromiovec(skb_put(skb, len), msg->msg_iov, len)) {
+	if (memcpy_from_msg(skb_put(skb, len), msg, len)) {
 		err = -EFAULT;
 		goto fail;
 	}
diff --git a/net/key/af_key.c b/net/key/af_key.c
index e588309..f8ac939 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -3611,7 +3611,7 @@ static int pfkey_sendmsg(struct kiocb *kiocb,
 		goto out;
 
 	err = -EFAULT;
-	if (memcpy_fromiovec(skb_put(skb,len), msg->msg_iov, len))
+	if (memcpy_from_msg(skb_put(skb,len), msg, len))
 		goto out;
 
 	hdr = pfkey_get_base_msg(skb, &err);
diff --git a/net/l2tp/l2tp_ip.c b/net/l2tp/l2tp_ip.c
index a6cc1fe..05dfc8a 100644
--- a/net/l2tp/l2tp_ip.c
+++ b/net/l2tp/l2tp_ip.c
@@ -441,7 +441,7 @@ static int l2tp_ip_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *m
 	*((__be32 *) skb_put(skb, 4)) = 0;
 
 	/* Copy user data into skb */
-	rc = memcpy_fromiovec(skb_put(skb, len), msg->msg_iov, len);
+	rc = memcpy_from_msg(skb_put(skb, len), msg, len);
 	if (rc < 0) {
 		kfree_skb(skb);
 		goto error;
diff --git a/net/l2tp/l2tp_ppp.c b/net/l2tp/l2tp_ppp.c
index c559bcd..cc7a828 100644
--- a/net/l2tp/l2tp_ppp.c
+++ b/net/l2tp/l2tp_ppp.c
@@ -346,8 +346,7 @@ static int pppol2tp_sendmsg(struct kiocb *iocb, struct socket *sock, struct msgh
 	skb_put(skb, 2);
 
 	/* Copy user data into skb */
-	error = memcpy_fromiovec(skb_put(skb, total_len), m->msg_iov,
-				 total_len);
+	error = memcpy_from_msg(skb_put(skb, total_len), m, total_len);
 	if (error < 0) {
 		kfree_skb(skb);
 		goto error_put_sess_tun;
diff --git a/net/llc/af_llc.c b/net/llc/af_llc.c
index af662669..2c0b83c 100644
--- a/net/llc/af_llc.c
+++ b/net/llc/af_llc.c
@@ -921,7 +921,7 @@ static int llc_ui_sendmsg(struct kiocb *iocb, struct socket *sock,
 	skb->dev      = llc->dev;
 	skb->protocol = llc_proto_type(addr->sllc_arphrd);
 	skb_reserve(skb, hdrlen);
-	rc = memcpy_fromiovec(skb_put(skb, copied), msg->msg_iov, copied);
+	rc = memcpy_from_msg(skb_put(skb, copied), msg, copied);
 	if (rc)
 		goto out;
 	if (sk->sk_type == SOCK_DGRAM || addr->sllc_ua) {
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index e1aad6e..63aa5c8 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -2325,7 +2325,7 @@ static int netlink_sendmsg(struct kiocb *kiocb, struct socket *sock,
 	NETLINK_CB(skb).flags	= netlink_skb_flags;
 
 	err = -EFAULT;
-	if (memcpy_fromiovec(skb_put(skb, len), msg->msg_iov, len)) {
+	if (memcpy_from_msg(skb_put(skb, len), msg, len)) {
 		kfree_skb(skb);
 		goto out;
 	}
diff --git a/net/netrom/af_netrom.c b/net/netrom/af_netrom.c
index 7e13f6a..69f1d5e 100644
--- a/net/netrom/af_netrom.c
+++ b/net/netrom/af_netrom.c
@@ -1113,7 +1113,7 @@ static int nr_sendmsg(struct kiocb *iocb, struct socket *sock,
 	skb_put(skb, len);
 
 	/* User data follows immediately after the NET/ROM transport header */
-	if (memcpy_fromiovec(skb_transport_header(skb), msg->msg_iov, len)) {
+	if (memcpy_from_msg(skb_transport_header(skb), msg, len)) {
 		kfree_skb(skb);
 		err = -EFAULT;
 		goto out;
diff --git a/net/nfc/llcp_commands.c b/net/nfc/llcp_commands.c
index a3ad69a..c4da0c2 100644
--- a/net/nfc/llcp_commands.c
+++ b/net/nfc/llcp_commands.c
@@ -665,7 +665,7 @@ int nfc_llcp_send_i_frame(struct nfc_llcp_sock *sock,
 	if (msg_data == NULL)
 		return -ENOMEM;
 
-	if (memcpy_fromiovec(msg_data, msg->msg_iov, len)) {
+	if (memcpy_from_msg(msg_data, msg, len)) {
 		kfree(msg_data);
 		return -EFAULT;
 	}
@@ -731,7 +731,7 @@ int nfc_llcp_send_ui_frame(struct nfc_llcp_sock *sock, u8 ssap, u8 dsap,
 	if (msg_data == NULL)
 		return -ENOMEM;
 
-	if (memcpy_fromiovec(msg_data, msg->msg_iov, len)) {
+	if (memcpy_from_msg(msg_data, msg, len)) {
 		kfree(msg_data);
 		return -EFAULT;
 	}
diff --git a/net/nfc/rawsock.c b/net/nfc/rawsock.c
index 9d7d2b7..373e138 100644
--- a/net/nfc/rawsock.c
+++ b/net/nfc/rawsock.c
@@ -231,7 +231,7 @@ static int rawsock_sendmsg(struct kiocb *iocb, struct socket *sock,
 	if (skb == NULL)
 		return rc;
 
-	rc = memcpy_fromiovec(skb_put(skb, len), msg->msg_iov, len);
+	rc = memcpy_from_msg(skb_put(skb, len), msg, len);
 	if (rc < 0) {
 		kfree_skb(skb);
 		return rc;
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 4cd13d8..be79208 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -1676,7 +1676,7 @@ retry:
 			if (len < hhlen)
 				skb_reset_network_header(skb);
 		}
-		err = memcpy_fromiovec(skb_put(skb, len), msg->msg_iov, len);
+		err = memcpy_from_msg(skb_put(skb, len), msg, len);
 		if (err)
 			goto out_free;
 		goto retry;
@@ -2438,8 +2438,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
 
 		len -= vnet_hdr_len;
 
-		err = memcpy_fromiovec((void *)&vnet_hdr, msg->msg_iov,
-				       vnet_hdr_len);
+		err = memcpy_from_msg((void *)&vnet_hdr, msg, vnet_hdr_len);
 		if (err < 0)
 			goto out_unlock;
 
diff --git a/net/phonet/datagram.c b/net/phonet/datagram.c
index 0918bc2..26054b4 100644
--- a/net/phonet/datagram.c
+++ b/net/phonet/datagram.c
@@ -109,7 +109,7 @@ static int pn_sendmsg(struct kiocb *iocb, struct sock *sk,
 		return err;
 	skb_reserve(skb, MAX_PHONET_HEADER);
 
-	err = memcpy_fromiovec((void *)skb_put(skb, len), msg->msg_iov, len);
+	err = memcpy_from_msg((void *)skb_put(skb, len), msg, len);
 	if (err < 0) {
 		kfree_skb(skb);
 		return err;
diff --git a/net/phonet/pep.c b/net/phonet/pep.c
index 9cd069d..5d3f2b7 100644
--- a/net/phonet/pep.c
+++ b/net/phonet/pep.c
@@ -1141,7 +1141,7 @@ static int pep_sendmsg(struct kiocb *iocb, struct sock *sk,
 		return err;
 
 	skb_reserve(skb, MAX_PHONET_HEADER + 3 + pn->aligned);
-	err = memcpy_fromiovec(skb_put(skb, len), msg->msg_iov, len);
+	err = memcpy_from_msg(skb_put(skb, len), msg, len);
 	if (err < 0)
 		goto outfree;
 
diff --git a/net/rose/af_rose.c b/net/rose/af_rose.c
index 9b600c2..43bac7c 100644
--- a/net/rose/af_rose.c
+++ b/net/rose/af_rose.c
@@ -1121,7 +1121,7 @@ static int rose_sendmsg(struct kiocb *iocb, struct socket *sock,
 	skb_reset_transport_header(skb);
 	skb_put(skb, len);
 
-	err = memcpy_fromiovec(skb_transport_header(skb), msg->msg_iov, len);
+	err = memcpy_from_msg(skb_transport_header(skb), msg, len);
 	if (err) {
 		kfree_skb(skb);
 		return err;
diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
index 9f32741..e49bcce 100644
--- a/net/sctp/sm_make_chunk.c
+++ b/net/sctp/sm_make_chunk.c
@@ -1001,7 +1001,7 @@ no_mem:
 
 /* Helper to create ABORT with a SCTP_ERROR_USER_ABORT error.  */
 struct sctp_chunk *sctp_make_abort_user(const struct sctp_association *asoc,
-					const struct msghdr *msg,
+					struct msghdr *msg,
 					size_t paylen)
 {
 	struct sctp_chunk *retval;
@@ -1018,7 +1018,7 @@ struct sctp_chunk *sctp_make_abort_user(const struct sctp_association *asoc,
 		if (!payload)
 			goto err_payload;
 
-		err = memcpy_fromiovec(payload, msg->msg_iov, paylen);
+		err = memcpy_from_msg(payload, msg, paylen);
 		if (err < 0)
 			goto err_copy;
 	}
diff --git a/net/x25/af_x25.c b/net/x25/af_x25.c
index 59e785b..d9149b6 100644
--- a/net/x25/af_x25.c
+++ b/net/x25/af_x25.c
@@ -1170,7 +1170,7 @@ static int x25_sendmsg(struct kiocb *iocb, struct socket *sock,
 	skb_reset_transport_header(skb);
 	skb_put(skb, len);
 
-	rc = memcpy_fromiovec(skb_transport_header(skb), msg->msg_iov, len);
+	rc = memcpy_from_msg(skb_transport_header(skb), msg, len);
 	if (rc)
 		goto out_kfree_skb;
 
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 03/17] switch ipxrtr_route_packet() from iovec to msghdr
  2014-11-22  4:28                     ` Al Viro
  2014-11-22  4:29                       ` [PATCH 01/17] new helper: skb_copy_and_csum_datagram_msg() Al Viro
  2014-11-22  4:30                       ` [PATCH 02/17] new helper: memcpy_from_msg() Al Viro
@ 2014-11-22  4:30                       ` Al Viro
  2014-11-22  4:31                       ` [PATCH 04/17] new helper: memcpy_to_msg() Al Viro
                                         ` (15 subsequent siblings)
  18 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-11-22  4:30 UTC (permalink / raw)
  To: David Miller; +Cc: torvalds, netdev, linux-kernel, target-devel, nab, hch

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 include/net/ipx.h   |    2 +-
 net/ipx/af_ipx.c    |    3 +--
 net/ipx/ipx_route.c |    4 ++--
 3 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/include/net/ipx.h b/include/net/ipx.h
index 320f47b..e5cff68 100644
--- a/include/net/ipx.h
+++ b/include/net/ipx.h
@@ -150,7 +150,7 @@ int ipxrtr_add_route(__be32 network, struct ipx_interface *intrfc,
 		     unsigned char *node);
 void ipxrtr_del_routes(struct ipx_interface *intrfc);
 int ipxrtr_route_packet(struct sock *sk, struct sockaddr_ipx *usipx,
-			struct iovec *iov, size_t len, int noblock);
+			struct msghdr *msg, size_t len, int noblock);
 int ipxrtr_route_skb(struct sk_buff *skb);
 struct ipx_route *ipxrtr_lookup(__be32 net);
 int ipxrtr_ioctl(unsigned int cmd, void __user *arg);
diff --git a/net/ipx/af_ipx.c b/net/ipx/af_ipx.c
index a0c7536..36f7990 100644
--- a/net/ipx/af_ipx.c
+++ b/net/ipx/af_ipx.c
@@ -1745,8 +1745,7 @@ static int ipx_sendmsg(struct kiocb *iocb, struct socket *sock,
 		memcpy(usipx->sipx_node, ipxs->dest_addr.node, IPX_NODE_LEN);
 	}
 
-	rc = ipxrtr_route_packet(sk, usipx, msg->msg_iov, len,
-				 flags & MSG_DONTWAIT);
+	rc = ipxrtr_route_packet(sk, usipx, msg, len, flags & MSG_DONTWAIT);
 	if (rc >= 0)
 		rc = len;
 out:
diff --git a/net/ipx/ipx_route.c b/net/ipx/ipx_route.c
index 67e7ad3..3e2a32a 100644
--- a/net/ipx/ipx_route.c
+++ b/net/ipx/ipx_route.c
@@ -165,7 +165,7 @@ int ipxrtr_route_skb(struct sk_buff *skb)
  * Route an outgoing frame from a socket.
  */
 int ipxrtr_route_packet(struct sock *sk, struct sockaddr_ipx *usipx,
-			struct iovec *iov, size_t len, int noblock)
+			struct msghdr *msg, size_t len, int noblock)
 {
 	struct sk_buff *skb;
 	struct ipx_sock *ipxs = ipx_sk(sk);
@@ -229,7 +229,7 @@ int ipxrtr_route_packet(struct sock *sk, struct sockaddr_ipx *usipx,
 	memcpy(ipx->ipx_dest.node, usipx->sipx_node, IPX_NODE_LEN);
 	ipx->ipx_dest.sock		= usipx->sipx_port;
 
-	rc = memcpy_fromiovec(skb_put(skb, len), iov, len);
+	rc = memcpy_from_msg(skb_put(skb, len), msg, len);
 	if (rc) {
 		kfree_skb(skb);
 		goto out_put;
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 04/17] new helper: memcpy_to_msg()
  2014-11-22  4:28                     ` Al Viro
                                         ` (2 preceding siblings ...)
  2014-11-22  4:30                       ` [PATCH 03/17] switch ipxrtr_route_packet() from iovec to msghdr Al Viro
@ 2014-11-22  4:31                       ` Al Viro
  2014-11-22  4:32                       ` [PATCH 05/17] switch drivers/net/tun.c to ->read_iter() Al Viro
                                         ` (14 subsequent siblings)
  18 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-11-22  4:31 UTC (permalink / raw)
  To: David Miller; +Cc: torvalds, netdev, linux-kernel, target-devel, nab, hch

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 crypto/algif_hash.c    |    2 +-
 include/linux/skbuff.h |    5 +++++
 net/caif/caif_socket.c |    2 +-
 net/can/bcm.c          |    2 +-
 net/can/raw.c          |    2 +-
 net/decnet/af_decnet.c |    2 +-
 net/ipv4/tcp.c         |    2 +-
 net/irda/af_irda.c     |    2 +-
 net/packet/af_packet.c |    3 +--
 9 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/crypto/algif_hash.c b/crypto/algif_hash.c
index 8502462..35c93ff 100644
--- a/crypto/algif_hash.c
+++ b/crypto/algif_hash.c
@@ -174,7 +174,7 @@ static int hash_recvmsg(struct kiocb *unused, struct socket *sock,
 			goto unlock;
 	}
 
-	err = memcpy_toiovec(msg->msg_iov, ctx->result, len);
+	err = memcpy_to_msg(msg, ctx->result, len);
 
 unlock:
 	release_sock(sk);
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index cb7fa2b..18ce42e 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2689,6 +2689,11 @@ static inline int memcpy_from_msg(void *data, struct msghdr *msg, int len)
 	return memcpy_fromiovec(data, msg->msg_iov, len);
 }
 
+static inline int memcpy_to_msg(struct msghdr *msg, void *data, int len)
+{
+	return memcpy_toiovec(msg->msg_iov, data, len);
+}
+
 struct skb_checksum_ops {
 	__wsum (*update)(const void *mem, int len, __wsum wsum);
 	__wsum (*combine)(__wsum csum, __wsum csum2, int offset, int len);
diff --git a/net/caif/caif_socket.c b/net/caif/caif_socket.c
index 230f140..ac618b0 100644
--- a/net/caif/caif_socket.c
+++ b/net/caif/caif_socket.c
@@ -418,7 +418,7 @@ unlock:
 		}
 		release_sock(sk);
 		chunk = min_t(unsigned int, skb->len, size);
-		if (memcpy_toiovec(msg->msg_iov, skb->data, chunk)) {
+		if (memcpy_to_msg(msg, skb->data, chunk)) {
 			skb_queue_head(&sk->sk_receive_queue, skb);
 			if (copied == 0)
 				copied = -EFAULT;
diff --git a/net/can/bcm.c b/net/can/bcm.c
index b9a1f71..0167118 100644
--- a/net/can/bcm.c
+++ b/net/can/bcm.c
@@ -1555,7 +1555,7 @@ static int bcm_recvmsg(struct kiocb *iocb, struct socket *sock,
 	if (skb->len < size)
 		size = skb->len;
 
-	err = memcpy_toiovec(msg->msg_iov, skb->data, size);
+	err = memcpy_to_msg(msg, skb->data, size);
 	if (err < 0) {
 		skb_free_datagram(sk, skb);
 		return err;
diff --git a/net/can/raw.c b/net/can/raw.c
index 0e4004f..dfdcffb 100644
--- a/net/can/raw.c
+++ b/net/can/raw.c
@@ -750,7 +750,7 @@ static int raw_recvmsg(struct kiocb *iocb, struct socket *sock,
 	else
 		size = skb->len;
 
-	err = memcpy_toiovec(msg->msg_iov, skb->data, size);
+	err = memcpy_to_msg(msg, skb->data, size);
 	if (err < 0) {
 		skb_free_datagram(sk, skb);
 		return err;
diff --git a/net/decnet/af_decnet.c b/net/decnet/af_decnet.c
index e2e2e3c..8102286 100644
--- a/net/decnet/af_decnet.c
+++ b/net/decnet/af_decnet.c
@@ -1760,7 +1760,7 @@ static int dn_recvmsg(struct kiocb *iocb, struct socket *sock,
 		if ((chunk + copied) > size)
 			chunk = size - copied;
 
-		if (memcpy_toiovec(msg->msg_iov, skb->data, chunk)) {
+		if (memcpy_to_msg(msg, skb->data, chunk)) {
 			rv = -EFAULT;
 			break;
 		}
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index c239f47..435443b 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1349,7 +1349,7 @@ static int tcp_recv_urg(struct sock *sk, struct msghdr *msg, int len, int flags)
 
 		if (len > 0) {
 			if (!(flags & MSG_TRUNC))
-				err = memcpy_toiovec(msg->msg_iov, &c, 1);
+				err = memcpy_to_msg(msg, &c, 1);
 			len = 1;
 		} else
 			msg->msg_flags |= MSG_TRUNC;
diff --git a/net/irda/af_irda.c b/net/irda/af_irda.c
index 9052462..568edc7 100644
--- a/net/irda/af_irda.c
+++ b/net/irda/af_irda.c
@@ -1466,7 +1466,7 @@ static int irda_recvmsg_stream(struct kiocb *iocb, struct socket *sock,
 		}
 
 		chunk = min_t(unsigned int, skb->len, size);
-		if (memcpy_toiovec(msg->msg_iov, skb->data, chunk)) {
+		if (memcpy_to_msg(msg, skb->data, chunk)) {
 			skb_queue_head(&sk->sk_receive_queue, skb);
 			if (copied == 0)
 				copied = -EFAULT;
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index be79208..692f049 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2936,8 +2936,7 @@ static int packet_recvmsg(struct kiocb *iocb, struct socket *sock,
 			vnet_hdr.flags = VIRTIO_NET_HDR_F_DATA_VALID;
 		} /* else everything is zero */
 
-		err = memcpy_toiovec(msg->msg_iov, (void *)&vnet_hdr,
-				     vnet_hdr_len);
+		err = memcpy_to_msg(msg, (void *)&vnet_hdr, vnet_hdr_len);
 		if (err < 0)
 			goto out_free;
 	}
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 05/17] switch drivers/net/tun.c to ->read_iter()
  2014-11-22  4:28                     ` Al Viro
                                         ` (3 preceding siblings ...)
  2014-11-22  4:31                       ` [PATCH 04/17] new helper: memcpy_to_msg() Al Viro
@ 2014-11-22  4:32                       ` Al Viro
  2014-11-22  4:32                       ` [PATCH 06/17] switch macvtap " Al Viro
                                         ` (13 subsequent siblings)
  18 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-11-22  4:32 UTC (permalink / raw)
  To: David Miller; +Cc: torvalds, netdev, linux-kernel, target-devel, nab, hch

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 drivers/net/tun.c |   40 +++++++++++++++-------------------------
 1 file changed, 15 insertions(+), 25 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index ac53a73..405dfdf 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1339,18 +1339,17 @@ done:
 }
 
 static ssize_t tun_do_read(struct tun_struct *tun, struct tun_file *tfile,
-			   const struct iovec *iv, unsigned long segs,
-			   ssize_t len, int noblock)
+			   struct iov_iter *to,
+			   int noblock)
 {
 	struct sk_buff *skb;
-	ssize_t ret = 0;
+	ssize_t ret;
 	int peeked, err, off = 0;
-	struct iov_iter iter;
 
 	tun_debug(KERN_INFO, tun, "tun_do_read\n");
 
-	if (!len)
-		return ret;
+	if (!iov_iter_count(to))
+		return 0;
 
 	if (tun->dev->reg_state != NETREG_REGISTERED)
 		return -EIO;
@@ -1359,37 +1358,27 @@ static ssize_t tun_do_read(struct tun_struct *tun, struct tun_file *tfile,
 	skb = __skb_recv_datagram(tfile->socket.sk, noblock ? MSG_DONTWAIT : 0,
 				  &peeked, &off, &err);
 	if (!skb)
-		return ret;
+		return 0;
 
-	iov_iter_init(&iter, READ, iv, segs, len);
-	ret = tun_put_user(tun, tfile, skb, &iter);
+	ret = tun_put_user(tun, tfile, skb, to);
 	kfree_skb(skb);
 
 	return ret;
 }
 
-static ssize_t tun_chr_aio_read(struct kiocb *iocb, const struct iovec *iv,
-			    unsigned long count, loff_t pos)
+static ssize_t tun_chr_read_iter(struct kiocb *iocb, struct iov_iter *to)
 {
 	struct file *file = iocb->ki_filp;
 	struct tun_file *tfile = file->private_data;
 	struct tun_struct *tun = __tun_get(tfile);
-	ssize_t len, ret;
+	ssize_t len = iov_iter_count(to), ret;
 
 	if (!tun)
 		return -EBADFD;
-	len = iov_length(iv, count);
-	if (len < 0) {
-		ret = -EINVAL;
-		goto out;
-	}
-
-	ret = tun_do_read(tun, tfile, iv, count, len,
-			  file->f_flags & O_NONBLOCK);
+	ret = tun_do_read(tun, tfile, to, file->f_flags & O_NONBLOCK);
 	ret = min_t(ssize_t, ret, len);
 	if (ret > 0)
 		iocb->ki_pos = ret;
-out:
 	tun_put(tun);
 	return ret;
 }
@@ -1471,6 +1460,7 @@ static int tun_recvmsg(struct kiocb *iocb, struct socket *sock,
 {
 	struct tun_file *tfile = container_of(sock, struct tun_file, socket);
 	struct tun_struct *tun = __tun_get(tfile);
+	struct iov_iter to;
 	int ret;
 
 	if (!tun)
@@ -1485,8 +1475,8 @@ static int tun_recvmsg(struct kiocb *iocb, struct socket *sock,
 					 SOL_PACKET, TUN_TX_TIMESTAMP);
 		goto out;
 	}
-	ret = tun_do_read(tun, tfile, m->msg_iov, m->msg_iovlen, total_len,
-			  flags & MSG_DONTWAIT);
+	iov_iter_init(&to, READ, m->msg_iov, m->msg_iovlen, total_len);
+	ret = tun_do_read(tun, tfile, &to, flags & MSG_DONTWAIT);
 	if (ret > total_len) {
 		m->msg_flags |= MSG_TRUNC;
 		ret = flags & MSG_TRUNC ? ret : total_len;
@@ -2242,8 +2232,8 @@ static int tun_chr_show_fdinfo(struct seq_file *m, struct file *f)
 static const struct file_operations tun_fops = {
 	.owner	= THIS_MODULE,
 	.llseek = no_llseek,
-	.read  = do_sync_read,
-	.aio_read  = tun_chr_aio_read,
+	.read  = new_sync_read,
+	.read_iter  = tun_chr_read_iter,
 	.write = do_sync_write,
 	.aio_write = tun_chr_aio_write,
 	.poll	= tun_chr_poll,
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 06/17] switch macvtap to ->read_iter()
  2014-11-22  4:28                     ` Al Viro
                                         ` (4 preceding siblings ...)
  2014-11-22  4:32                       ` [PATCH 05/17] switch drivers/net/tun.c to ->read_iter() Al Viro
@ 2014-11-22  4:32                       ` Al Viro
  2014-11-23 23:29                         ` Ben Hutchings
  2014-11-22  4:33                       ` [PATCH 07/17] new helpers: skb_copy_datagram_from_iter() and zerocopy_sg_from_iter() Al Viro
                                         ` (12 subsequent siblings)
  18 siblings, 1 reply; 133+ messages in thread
From: Al Viro @ 2014-11-22  4:32 UTC (permalink / raw)
  To: David Miller; +Cc: torvalds, netdev, linux-kernel, target-devel, nab, hch

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 drivers/net/macvtap.c |   39 ++++++++++++++++-----------------------
 1 file changed, 16 insertions(+), 23 deletions(-)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index cea99d4..cdd820f 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -829,16 +829,17 @@ done:
 }
 
 static ssize_t macvtap_do_read(struct macvtap_queue *q,
-			       const struct iovec *iv, unsigned long segs,
-			       unsigned long len,
+			       struct iov_iter *to,
 			       int noblock)
 {
 	DEFINE_WAIT(wait);
 	struct sk_buff *skb;
 	ssize_t ret = 0;
-	struct iov_iter iter;
 
-	while (len) {
+	if (!iov_iter_count(to))
+		return 0;
+
+	while (1) {
 		if (!noblock)
 			prepare_to_wait(sk_sleep(&q->sk), &wait,
 					TASK_INTERRUPTIBLE);
@@ -856,37 +857,27 @@ static ssize_t macvtap_do_read(struct macvtap_queue *q,
 			}
 			/* Nothing to read, let's sleep */
 			schedule();
-			continue;
 		}
-		iov_iter_init(&iter, READ, iv, segs, len);
-		ret = macvtap_put_user(q, skb, &iter);
+	}
+	if (skb) {
+		ret = macvtap_put_user(q, skb, to);
 		kfree_skb(skb);
-		break;
 	}
-
 	if (!noblock)
 		finish_wait(sk_sleep(&q->sk), &wait);
 	return ret;
 }
 
-static ssize_t macvtap_aio_read(struct kiocb *iocb, const struct iovec *iv,
-				unsigned long count, loff_t pos)
+static ssize_t macvtap_read_iter(struct kiocb *iocb, struct iov_iter *to)
 {
 	struct file *file = iocb->ki_filp;
 	struct macvtap_queue *q = file->private_data;
-	ssize_t len, ret = 0;
+	ssize_t len = iov_iter_count(to), ret;
 
-	len = iov_length(iv, count);
-	if (len < 0) {
-		ret = -EINVAL;
-		goto out;
-	}
-
-	ret = macvtap_do_read(q, iv, count, len, file->f_flags & O_NONBLOCK);
+	ret = macvtap_do_read(q, to, file->f_flags & O_NONBLOCK);
 	ret = min_t(ssize_t, ret, len);
 	if (ret > 0)
 		iocb->ki_pos = ret;
-out:
 	return ret;
 }
 
@@ -1087,7 +1078,8 @@ static const struct file_operations macvtap_fops = {
 	.owner		= THIS_MODULE,
 	.open		= macvtap_open,
 	.release	= macvtap_release,
-	.aio_read	= macvtap_aio_read,
+	.read		= new_sync_read,
+	.read_iter	= macvtap_read_iter,
 	.aio_write	= macvtap_aio_write,
 	.poll		= macvtap_poll,
 	.llseek		= no_llseek,
@@ -1110,11 +1102,12 @@ static int macvtap_recvmsg(struct kiocb *iocb, struct socket *sock,
 			   int flags)
 {
 	struct macvtap_queue *q = container_of(sock, struct macvtap_queue, sock);
+	struct iov_iter to;
 	int ret;
 	if (flags & ~(MSG_DONTWAIT|MSG_TRUNC))
 		return -EINVAL;
-	ret = macvtap_do_read(q, m->msg_iov, m->msg_iovlen, total_len,
-			  flags & MSG_DONTWAIT);
+	iov_iter_init(&to, READ, m->msg_iov, m->msg_iovlen, total_len);
+	ret = macvtap_do_read(q, &to, flags & MSG_DONTWAIT);
 	if (ret > total_len) {
 		m->msg_flags |= MSG_TRUNC;
 		ret = flags & MSG_TRUNC ? ret : total_len;
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 07/17] new helpers: skb_copy_datagram_from_iter() and zerocopy_sg_from_iter()
  2014-11-22  4:28                     ` Al Viro
                                         ` (5 preceding siblings ...)
  2014-11-22  4:32                       ` [PATCH 06/17] switch macvtap " Al Viro
@ 2014-11-22  4:33                       ` Al Viro
  2014-11-24  0:02                         ` Ben Hutchings
  2014-11-22  4:33                       ` [PATCH 08/17] {macvtap,tun}_get_user(): switch to iov_iter Al Viro
                                         ` (11 subsequent siblings)
  18 siblings, 1 reply; 133+ messages in thread
From: Al Viro @ 2014-11-22  4:33 UTC (permalink / raw)
  To: David Miller; +Cc: torvalds, netdev, linux-kernel, target-devel, nab, hch

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 include/linux/skbuff.h |    3 ++
 net/core/datagram.c    |  115 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 118 insertions(+)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 18ce42e..ce69d48 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2659,10 +2659,13 @@ static inline int skb_copy_and_csum_datagram_msg(struct sk_buff *skb, int hlen,
 int skb_copy_datagram_from_iovec(struct sk_buff *skb, int offset,
 				 const struct iovec *from, int from_offset,
 				 int len);
+int skb_copy_datagram_from_iter(struct sk_buff *skb, int offset,
+				 struct iov_iter *from, int len);
 int zerocopy_sg_from_iovec(struct sk_buff *skb, const struct iovec *frm,
 			   int offset, size_t count);
 int skb_copy_datagram_iter(const struct sk_buff *from, int offset,
 			   struct iov_iter *to, int size);
+int zerocopy_sg_from_iter(struct sk_buff *skb, struct iov_iter *frm);
 void skb_free_datagram(struct sock *sk, struct sk_buff *skb);
 void skb_free_datagram_locked(struct sock *sk, struct sk_buff *skb);
 int skb_kill_datagram(struct sock *sk, struct sk_buff *skb, unsigned int flags);
diff --git a/net/core/datagram.c b/net/core/datagram.c
index 26391a3..34d82f8 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -572,6 +572,77 @@ fault:
 }
 EXPORT_SYMBOL(skb_copy_datagram_from_iovec);
 
+int skb_copy_datagram_from_iter(struct sk_buff *skb, int offset,
+				 struct iov_iter *from,
+				 int len)
+{
+	int start = skb_headlen(skb);
+	int i, copy = start - offset;
+	struct sk_buff *frag_iter;
+
+	/* Copy header. */
+	if (copy > 0) {
+		if (copy > len)
+			copy = len;
+		if (copy_from_iter(skb->data + offset, copy, from) != copy)
+			goto fault;
+		if ((len -= copy) == 0)
+			return 0;
+		offset += copy;
+	}
+
+	/* Copy paged appendix. Hmm... why does this look so complicated? */
+	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
+		int end;
+		const skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
+
+		WARN_ON(start > offset + len);
+
+		end = start + skb_frag_size(frag);
+		if ((copy = end - offset) > 0) {
+			size_t copied;
+			if (copy > len)
+				copy = len;
+			copied = copy_page_from_iter(skb_frag_page(frag),
+					  frag->page_offset + offset - start,
+					  copy, from);
+			if (copied != copy)
+				goto fault;
+
+			if (!(len -= copy))
+				return 0;
+			offset += copy;
+		}
+		start = end;
+	}
+
+	skb_walk_frags(skb, frag_iter) {
+		int end;
+
+		WARN_ON(start > offset + len);
+
+		end = start + frag_iter->len;
+		if ((copy = end - offset) > 0) {
+			if (copy > len)
+				copy = len;
+			if (skb_copy_datagram_from_iter(frag_iter,
+							offset - start,
+							from, copy))
+				goto fault;
+			if ((len -= copy) == 0)
+				return 0;
+			offset += copy;
+		}
+		start = end;
+	}
+	if (!len)
+		return 0;
+
+fault:
+	return -EFAULT;
+}
+EXPORT_SYMBOL(skb_copy_datagram_from_iter);
+
 /**
  *	zerocopy_sg_from_iovec - Build a zerocopy datagram from an iovec
  *	@skb: buffer to copy
@@ -643,6 +714,50 @@ int zerocopy_sg_from_iovec(struct sk_buff *skb, const struct iovec *from,
 }
 EXPORT_SYMBOL(zerocopy_sg_from_iovec);
 
+int zerocopy_sg_from_iter(struct sk_buff *skb, struct iov_iter *from)
+{
+	int len = iov_iter_count(from);
+	int copy = min_t(int, skb_headlen(skb), len);
+	int i = 0;
+
+	/* copy up to skb headlen */
+	if (skb_copy_datagram_from_iter(skb, 0, from, copy))
+		return -EFAULT;
+
+	while (iov_iter_count(from)) {
+		struct page *pages[MAX_SKB_FRAGS];
+		size_t start;
+		ssize_t copied;
+		unsigned long truesize;
+		int n = 0;
+
+		copied = iov_iter_get_pages(from, pages, ~0U, MAX_SKB_FRAGS, &start);
+		if (copied < 0)
+			return -EFAULT;
+
+		truesize = DIV_ROUND_UP(copied + start, PAGE_SIZE) * PAGE_SIZE;
+		skb->data_len += copied;
+		skb->len += copied;
+		skb->truesize += truesize;
+		atomic_add(truesize, &skb->sk->sk_wmem_alloc);
+		while (copied) {
+			int off = start;
+			int size = min_t(int, copied, PAGE_SIZE - off);
+			start = 0;
+			if (i < MAX_SKB_FRAGS)
+				skb_fill_page_desc(skb, i, pages[n], off, size);
+			else
+				put_page(pages[n]);
+			copied -= size;
+			i++, n++;
+		}
+		if (i > MAX_SKB_FRAGS)
+			return -EMSGSIZE;
+	}
+	return 0;
+}
+EXPORT_SYMBOL(zerocopy_sg_from_iter);
+
 static int skb_copy_and_csum_datagram(const struct sk_buff *skb, int offset,
 				      u8 __user *to, int len,
 				      __wsum *csump)
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 08/17] {macvtap,tun}_get_user(): switch to iov_iter
  2014-11-22  4:28                     ` Al Viro
                                         ` (6 preceding siblings ...)
  2014-11-22  4:33                       ` [PATCH 07/17] new helpers: skb_copy_datagram_from_iter() and zerocopy_sg_from_iter() Al Viro
@ 2014-11-22  4:33                       ` Al Viro
  2014-11-24  0:27                         ` Ben Hutchings
  2014-11-22  4:34                       ` [PATCH 09/17] kill zerocopy_sg_from_iovec() Al Viro
                                         ` (10 subsequent siblings)
  18 siblings, 1 reply; 133+ messages in thread
From: Al Viro @ 2014-11-22  4:33 UTC (permalink / raw)
  To: David Miller; +Cc: torvalds, netdev, linux-kernel, target-devel, nab, hch

allows to switch macvtap and tun from ->aio_write() to ->write_iter()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 drivers/net/macvtap.c |   43 ++++++++++++++++++++-----------------------
 drivers/net/tun.c     |   43 +++++++++++++++++++++++--------------------
 2 files changed, 43 insertions(+), 43 deletions(-)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index cdd820f..2bf08c6 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -640,12 +640,12 @@ static void macvtap_skb_to_vnet_hdr(const struct sk_buff *skb,
 
 /* Get packet from user space buffer */
 static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
-				const struct iovec *iv, unsigned long total_len,
-				size_t count, int noblock)
+				struct iov_iter *from, int noblock)
 {
 	int good_linear = SKB_MAX_HEAD(NET_IP_ALIGN);
 	struct sk_buff *skb;
 	struct macvlan_dev *vlan;
+	unsigned long total_len = iov_iter_count(from);
 	unsigned long len = total_len;
 	int err;
 	struct virtio_net_hdr vnet_hdr = { 0 };
@@ -653,6 +653,7 @@ static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
 	int copylen = 0;
 	bool zerocopy = false;
 	size_t linear;
+	ssize_t n;
 
 	if (q->flags & IFF_VNET_HDR) {
 		vnet_hdr_len = q->vnet_hdr_sz;
@@ -662,10 +663,11 @@ static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
 			goto err;
 		len -= vnet_hdr_len;
 
-		err = memcpy_fromiovecend((void *)&vnet_hdr, iv, 0,
-					   sizeof(vnet_hdr));
-		if (err < 0)
+		err = -EFAULT;
+		n = copy_from_iter(&vnet_hdr, sizeof(vnet_hdr), from);
+		if (n != sizeof(vnet_hdr))
 			goto err;
+		iov_iter_advance(from, vnet_hdr_len - sizeof(vnet_hdr));
 		if ((vnet_hdr.flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) &&
 		     vnet_hdr.csum_start + vnet_hdr.csum_offset + 2 >
 							vnet_hdr.hdr_len)
@@ -680,17 +682,15 @@ static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
 	if (unlikely(len < ETH_HLEN))
 		goto err;
 
-	err = -EMSGSIZE;
-	if (unlikely(count > UIO_MAXIOV))
-		goto err;
-
 	if (m && m->msg_control && sock_flag(&q->sk, SOCK_ZEROCOPY)) {
+		struct iov_iter i;
 		copylen = vnet_hdr.hdr_len ? vnet_hdr.hdr_len : GOODCOPY_LEN;
 		if (copylen > good_linear)
 			copylen = good_linear;
 		linear = copylen;
-		if (iov_pages(iv, vnet_hdr_len + copylen, count)
-		    <= MAX_SKB_FRAGS)
+		i = *from;
+		iov_iter_advance(&i, copylen);
+		if (iov_iter_npages(&i, INT_MAX) <= MAX_SKB_FRAGS)
 			zerocopy = true;
 	}
 
@@ -708,10 +708,9 @@ static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
 		goto err;
 
 	if (zerocopy)
-		err = zerocopy_sg_from_iovec(skb, iv, vnet_hdr_len, count);
+		err = zerocopy_sg_from_iter(skb, from);
 	else {
-		err = skb_copy_datagram_from_iovec(skb, 0, iv, vnet_hdr_len,
-						   len);
+		err = skb_copy_datagram_from_iter(skb, 0, from, len);
 		if (!err && m && m->msg_control) {
 			struct ubuf_info *uarg = m->msg_control;
 			uarg->callback(uarg, false);
@@ -764,16 +763,12 @@ err:
 	return err;
 }
 
-static ssize_t macvtap_aio_write(struct kiocb *iocb, const struct iovec *iv,
-				 unsigned long count, loff_t pos)
+static ssize_t macvtap_write_iter(struct kiocb *iocb, struct iov_iter *from)
 {
 	struct file *file = iocb->ki_filp;
-	ssize_t result = -ENOLINK;
 	struct macvtap_queue *q = file->private_data;
 
-	result = macvtap_get_user(q, NULL, iv, iov_length(iv, count), count,
-				  file->f_flags & O_NONBLOCK);
-	return result;
+	return macvtap_get_user(q, NULL, from, file->f_flags & O_NONBLOCK);
 }
 
 /* Put packet to the user space buffer */
@@ -1079,8 +1074,9 @@ static const struct file_operations macvtap_fops = {
 	.open		= macvtap_open,
 	.release	= macvtap_release,
 	.read		= new_sync_read,
+	.write		= new_sync_write,
 	.read_iter	= macvtap_read_iter,
-	.aio_write	= macvtap_aio_write,
+	.write_iter	= macvtap_write_iter,
 	.poll		= macvtap_poll,
 	.llseek		= no_llseek,
 	.unlocked_ioctl	= macvtap_ioctl,
@@ -1093,8 +1089,9 @@ static int macvtap_sendmsg(struct kiocb *iocb, struct socket *sock,
 			   struct msghdr *m, size_t total_len)
 {
 	struct macvtap_queue *q = container_of(sock, struct macvtap_queue, sock);
-	return macvtap_get_user(q, m, m->msg_iov, total_len, m->msg_iovlen,
-			    m->msg_flags & MSG_DONTWAIT);
+	struct iov_iter from;
+	iov_iter_init(&from, WRITE, m->msg_iov, m->msg_iovlen, total_len);
+	return macvtap_get_user(q, m, &from, m->msg_flags & MSG_DONTWAIT);
 }
 
 static int macvtap_recvmsg(struct kiocb *iocb, struct socket *sock,
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 405dfdf..4781be0 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1012,28 +1012,29 @@ static struct sk_buff *tun_alloc_skb(struct tun_file *tfile,
 
 /* Get packet from user space buffer */
 static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
-			    void *msg_control, const struct iovec *iv,
-			    size_t total_len, size_t count, int noblock)
+			    void *msg_control, struct iov_iter *from,
+			    int noblock)
 {
 	struct tun_pi pi = { 0, cpu_to_be16(ETH_P_IP) };
 	struct sk_buff *skb;
+	size_t total_len = iov_iter_count(from);
 	size_t len = total_len, align = NET_SKB_PAD, linear;
 	struct virtio_net_hdr gso = { 0 };
 	int good_linear;
-	int offset = 0;
 	int copylen;
 	bool zerocopy = false;
 	int err;
 	u32 rxhash;
+	ssize_t n;
 
 	if (!(tun->flags & TUN_NO_PI)) {
 		if (len < sizeof(pi))
 			return -EINVAL;
 		len -= sizeof(pi);
 
-		if (memcpy_fromiovecend((void *)&pi, iv, 0, sizeof(pi)))
+		n = copy_from_iter(&pi, sizeof(pi), from);
+		if (n != sizeof(pi))
 			return -EFAULT;
-		offset += sizeof(pi);
 	}
 
 	if (tun->flags & TUN_VNET_HDR) {
@@ -1041,7 +1042,8 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
 			return -EINVAL;
 		len -= tun->vnet_hdr_sz;
 
-		if (memcpy_fromiovecend((void *)&gso, iv, offset, sizeof(gso)))
+		n = copy_from_iter(&gso, sizeof(gso), from);
+		if (n != sizeof(gso))
 			return -EFAULT;
 
 		if ((gso.flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) &&
@@ -1050,7 +1052,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
 
 		if (gso.hdr_len > len)
 			return -EINVAL;
-		offset += tun->vnet_hdr_sz;
+		iov_iter_advance(from, tun->vnet_hdr_sz);
 	}
 
 	if ((tun->flags & TUN_TYPE_MASK) == TUN_TAP_DEV) {
@@ -1063,6 +1065,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
 	good_linear = SKB_MAX_HEAD(align);
 
 	if (msg_control) {
+		struct iov_iter i = *from;
 		/* There are 256 bytes to be copied in skb, so there is
 		 * enough room for skb expand head in case it is used.
 		 * The rest of the buffer is mapped from userspace.
@@ -1071,7 +1074,8 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
 		if (copylen > good_linear)
 			copylen = good_linear;
 		linear = copylen;
-		if (iov_pages(iv, offset + copylen, count) <= MAX_SKB_FRAGS)
+		iov_iter_advance(&i, copylen);
+		if (iov_iter_npages(&i, INT_MAX) <= MAX_SKB_FRAGS)
 			zerocopy = true;
 	}
 
@@ -1091,9 +1095,9 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
 	}
 
 	if (zerocopy)
-		err = zerocopy_sg_from_iovec(skb, iv, offset, count);
+		err = zerocopy_sg_from_iter(skb, from);
 	else {
-		err = skb_copy_datagram_from_iovec(skb, 0, iv, offset, len);
+		err = skb_copy_datagram_from_iter(skb, 0, from, len);
 		if (!err && msg_control) {
 			struct ubuf_info *uarg = msg_control;
 			uarg->callback(uarg, false);
@@ -1207,8 +1211,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
 	return total_len;
 }
 
-static ssize_t tun_chr_aio_write(struct kiocb *iocb, const struct iovec *iv,
-			      unsigned long count, loff_t pos)
+static ssize_t tun_chr_write_iter(struct kiocb *iocb, struct iov_iter *from)
 {
 	struct file *file = iocb->ki_filp;
 	struct tun_struct *tun = tun_get(file);
@@ -1218,10 +1221,7 @@ static ssize_t tun_chr_aio_write(struct kiocb *iocb, const struct iovec *iv,
 	if (!tun)
 		return -EBADFD;
 
-	tun_debug(KERN_INFO, tun, "tun_chr_write %ld\n", count);
-
-	result = tun_get_user(tun, tfile, NULL, iv, iov_length(iv, count),
-			      count, file->f_flags & O_NONBLOCK);
+	result = tun_get_user(tun, tfile, NULL, from, file->f_flags & O_NONBLOCK);
 
 	tun_put(tun);
 	return result;
@@ -1445,11 +1445,14 @@ static int tun_sendmsg(struct kiocb *iocb, struct socket *sock,
 	int ret;
 	struct tun_file *tfile = container_of(sock, struct tun_file, socket);
 	struct tun_struct *tun = __tun_get(tfile);
+	struct iov_iter from;
 
 	if (!tun)
 		return -EBADFD;
-	ret = tun_get_user(tun, tfile, m->msg_control, m->msg_iov, total_len,
-			   m->msg_iovlen, m->msg_flags & MSG_DONTWAIT);
+
+	iov_iter_init(&from, WRITE, m->msg_iov, m->msg_iovlen, total_len);
+	ret = tun_get_user(tun, tfile, m->msg_control, &from,
+			   m->msg_flags & MSG_DONTWAIT);
 	tun_put(tun);
 	return ret;
 }
@@ -2233,9 +2236,9 @@ static const struct file_operations tun_fops = {
 	.owner	= THIS_MODULE,
 	.llseek = no_llseek,
 	.read  = new_sync_read,
+	.write = new_sync_write,
 	.read_iter  = tun_chr_read_iter,
-	.write = do_sync_write,
-	.aio_write = tun_chr_aio_write,
+	.write_iter = tun_chr_write_iter,
 	.poll	= tun_chr_poll,
 	.unlocked_ioctl	= tun_chr_ioctl,
 #ifdef CONFIG_COMPAT
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 09/17] kill zerocopy_sg_from_iovec()
  2014-11-22  4:28                     ` Al Viro
                                         ` (7 preceding siblings ...)
  2014-11-22  4:33                       ` [PATCH 08/17] {macvtap,tun}_get_user(): switch to iov_iter Al Viro
@ 2014-11-22  4:34                       ` Al Viro
  2014-11-22  4:35                       ` [PATCH 10/17] switch AF_PACKET and AF_UNIX to skb_copy_datagram_from_iter() Al Viro
                                         ` (9 subsequent siblings)
  18 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-11-22  4:34 UTC (permalink / raw)
  To: David Miller; +Cc: torvalds, netdev, linux-kernel, target-devel, nab, hch

no users left

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 include/linux/skbuff.h |    2 --
 net/core/datagram.c    |   65 ++----------------------------------------------
 2 files changed, 2 insertions(+), 65 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index ce69d48..fa11bbd 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2661,8 +2661,6 @@ int skb_copy_datagram_from_iovec(struct sk_buff *skb, int offset,
 				 int len);
 int skb_copy_datagram_from_iter(struct sk_buff *skb, int offset,
 				 struct iov_iter *from, int len);
-int zerocopy_sg_from_iovec(struct sk_buff *skb, const struct iovec *frm,
-			   int offset, size_t count);
 int skb_copy_datagram_iter(const struct sk_buff *from, int offset,
 			   struct iov_iter *to, int size);
 int zerocopy_sg_from_iter(struct sk_buff *skb, struct iov_iter *frm);
diff --git a/net/core/datagram.c b/net/core/datagram.c
index 34d82f8..5f8b90d 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -644,76 +644,15 @@ fault:
 EXPORT_SYMBOL(skb_copy_datagram_from_iter);
 
 /**
- *	zerocopy_sg_from_iovec - Build a zerocopy datagram from an iovec
+ *	zerocopy_sg_from_iter - Build a zerocopy datagram from an iov_iter
  *	@skb: buffer to copy
- *	@from: io vector to copy from
- *	@offset: offset in the io vector to start copying from
- *	@count: amount of vectors to copy to buffer from
+ *	@from: the source to copy from
  *
  *	The function will first copy up to headlen, and then pin the userspace
  *	pages and build frags through them.
  *
  *	Returns 0, -EFAULT or -EMSGSIZE.
- *	Note: the iovec is not modified during the copy
  */
-int zerocopy_sg_from_iovec(struct sk_buff *skb, const struct iovec *from,
-				  int offset, size_t count)
-{
-	int len = iov_length(from, count) - offset;
-	int copy = min_t(int, skb_headlen(skb), len);
-	int size;
-	int i = 0;
-
-	/* copy up to skb headlen */
-	if (skb_copy_datagram_from_iovec(skb, 0, from, offset, copy))
-		return -EFAULT;
-
-	if (len == copy)
-		return 0;
-
-	offset += copy;
-	while (count--) {
-		struct page *page[MAX_SKB_FRAGS];
-		int num_pages;
-		unsigned long base;
-		unsigned long truesize;
-
-		/* Skip over from offset and copied */
-		if (offset >= from->iov_len) {
-			offset -= from->iov_len;
-			++from;
-			continue;
-		}
-		len = from->iov_len - offset;
-		base = (unsigned long)from->iov_base + offset;
-		size = ((base & ~PAGE_MASK) + len + ~PAGE_MASK) >> PAGE_SHIFT;
-		if (i + size > MAX_SKB_FRAGS)
-			return -EMSGSIZE;
-		num_pages = get_user_pages_fast(base, size, 0, &page[i]);
-		if (num_pages != size) {
-			release_pages(&page[i], num_pages, 0);
-			return -EFAULT;
-		}
-		truesize = size * PAGE_SIZE;
-		skb->data_len += len;
-		skb->len += len;
-		skb->truesize += truesize;
-		atomic_add(truesize, &skb->sk->sk_wmem_alloc);
-		while (len) {
-			int off = base & ~PAGE_MASK;
-			int size = min_t(int, len, PAGE_SIZE - off);
-			skb_fill_page_desc(skb, i, page[i], off, size);
-			base += size;
-			len -= size;
-			i++;
-		}
-		offset = 0;
-		++from;
-	}
-	return 0;
-}
-EXPORT_SYMBOL(zerocopy_sg_from_iovec);
-
 int zerocopy_sg_from_iter(struct sk_buff *skb, struct iov_iter *from)
 {
 	int len = iov_iter_count(from);
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 10/17] switch AF_PACKET and AF_UNIX to skb_copy_datagram_from_iter()
  2014-11-22  4:28                     ` Al Viro
                                         ` (8 preceding siblings ...)
  2014-11-22  4:34                       ` [PATCH 09/17] kill zerocopy_sg_from_iovec() Al Viro
@ 2014-11-22  4:35                       ` Al Viro
  2014-11-22  4:36                       ` PATCH 11/17] switch sctp_user_addto_chunk() and sctp_datamsg_from_user() to passing iov_iter Al Viro
                                         ` (8 subsequent siblings)
  18 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-11-22  4:35 UTC (permalink / raw)
  To: David Miller; +Cc: torvalds, netdev, linux-kernel, target-devel, nab, hch

... and kill skb_copy_datagram_iovec()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 include/linux/skbuff.h |    3 --
 net/core/datagram.c    |   88 ++----------------------------------------------
 net/packet/af_packet.c |   11 ++++--
 net/unix/af_unix.c     |   11 ++++--
 4 files changed, 18 insertions(+), 95 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index fa11bbd..67f659e 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2656,9 +2656,6 @@ static inline int skb_copy_and_csum_datagram_msg(struct sk_buff *skb, int hlen,
 {
 	return skb_copy_and_csum_datagram_iovec(skb, hlen, msg->msg_iov);
 }
-int skb_copy_datagram_from_iovec(struct sk_buff *skb, int offset,
-				 const struct iovec *from, int from_offset,
-				 int len);
 int skb_copy_datagram_from_iter(struct sk_buff *skb, int offset,
 				 struct iov_iter *from, int len);
 int skb_copy_datagram_iter(const struct sk_buff *from, int offset,
diff --git a/net/core/datagram.c b/net/core/datagram.c
index 5f8b90d..836f76c 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -480,98 +480,14 @@ short_copy:
 EXPORT_SYMBOL(skb_copy_datagram_iter);
 
 /**
- *	skb_copy_datagram_from_iovec - Copy a datagram from an iovec.
+ *	skb_copy_datagram_from_iter - Copy a datagram from an iov_iter.
  *	@skb: buffer to copy
  *	@offset: offset in the buffer to start copying to
- *	@from: io vector to copy to
- *	@from_offset: offset in the io vector to start copying from
+ *	@from: the copy source
  *	@len: amount of data to copy to buffer from iovec
  *
  *	Returns 0 or -EFAULT.
- *	Note: the iovec is not modified during the copy.
  */
-int skb_copy_datagram_from_iovec(struct sk_buff *skb, int offset,
-				 const struct iovec *from, int from_offset,
-				 int len)
-{
-	int start = skb_headlen(skb);
-	int i, copy = start - offset;
-	struct sk_buff *frag_iter;
-
-	/* Copy header. */
-	if (copy > 0) {
-		if (copy > len)
-			copy = len;
-		if (memcpy_fromiovecend(skb->data + offset, from, from_offset,
-					copy))
-			goto fault;
-		if ((len -= copy) == 0)
-			return 0;
-		offset += copy;
-		from_offset += copy;
-	}
-
-	/* Copy paged appendix. Hmm... why does this look so complicated? */
-	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
-		int end;
-		const skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
-
-		WARN_ON(start > offset + len);
-
-		end = start + skb_frag_size(frag);
-		if ((copy = end - offset) > 0) {
-			int err;
-			u8  *vaddr;
-			struct page *page = skb_frag_page(frag);
-
-			if (copy > len)
-				copy = len;
-			vaddr = kmap(page);
-			err = memcpy_fromiovecend(vaddr + frag->page_offset +
-						  offset - start,
-						  from, from_offset, copy);
-			kunmap(page);
-			if (err)
-				goto fault;
-
-			if (!(len -= copy))
-				return 0;
-			offset += copy;
-			from_offset += copy;
-		}
-		start = end;
-	}
-
-	skb_walk_frags(skb, frag_iter) {
-		int end;
-
-		WARN_ON(start > offset + len);
-
-		end = start + frag_iter->len;
-		if ((copy = end - offset) > 0) {
-			if (copy > len)
-				copy = len;
-			if (skb_copy_datagram_from_iovec(frag_iter,
-							 offset - start,
-							 from,
-							 from_offset,
-							 copy))
-				goto fault;
-			if ((len -= copy) == 0)
-				return 0;
-			offset += copy;
-			from_offset += copy;
-		}
-		start = end;
-	}
-	if (!len)
-		return 0;
-
-fault:
-	return -EFAULT;
-}
-EXPORT_SYMBOL(skb_copy_datagram_from_iovec);
-
 int skb_copy_datagram_from_iter(struct sk_buff *skb, int offset,
 				 struct iov_iter *from,
 				 int len)
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 692f049..83bae39 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2400,6 +2400,10 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
 	unsigned short gso_type = 0;
 	int hlen, tlen;
 	int extra_len = 0;
+	struct iov_iter from;
+	ssize_t n;
+
+	iov_iter_init(&from, WRITE, msg->msg_iov, msg->msg_iovlen, len);
 
 	/*
 	 *	Get and verify the address.
@@ -2438,8 +2442,9 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
 
 		len -= vnet_hdr_len;
 
-		err = memcpy_from_msg((void *)&vnet_hdr, msg, vnet_hdr_len);
-		if (err < 0)
+		err = -EFAULT;
+		n = copy_from_iter(&vnet_hdr, vnet_hdr_len, &from);
+		if (n != vnet_hdr_len)
 			goto out_unlock;
 
 		if ((vnet_hdr.flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) &&
@@ -2504,7 +2509,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
 		goto out_free;
 
 	/* Returns -EFAULT on error */
-	err = skb_copy_datagram_from_iovec(skb, offset, msg->msg_iov, 0, len);
+	err = skb_copy_datagram_from_iter(skb, offset, &from, len);
 	if (err)
 		goto out_free;
 
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 5eee625..4450d62 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1459,6 +1459,9 @@ static int unix_dgram_sendmsg(struct kiocb *kiocb, struct socket *sock,
 	struct scm_cookie tmp_scm;
 	int max_level;
 	int data_len = 0;
+	struct iov_iter from;
+
+	iov_iter_init(&from, WRITE, msg->msg_iov, msg->msg_iovlen, len);
 
 	if (NULL == siocb->scm)
 		siocb->scm = &tmp_scm;
@@ -1516,7 +1519,7 @@ static int unix_dgram_sendmsg(struct kiocb *kiocb, struct socket *sock,
 	skb_put(skb, len - data_len);
 	skb->data_len = data_len;
 	skb->len = len;
-	err = skb_copy_datagram_from_iovec(skb, 0, msg->msg_iov, 0, len);
+	err = skb_copy_datagram_from_iter(skb, 0, &from, len);
 	if (err)
 		goto out_free;
 
@@ -1638,6 +1641,9 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
 	bool fds_sent = false;
 	int max_level;
 	int data_len;
+	struct iov_iter from;
+
+	iov_iter_init(&from, WRITE, msg->msg_iov, msg->msg_iovlen, len);
 
 	if (NULL == siocb->scm)
 		siocb->scm = &tmp_scm;
@@ -1694,8 +1700,7 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
 		skb_put(skb, size - data_len);
 		skb->data_len = data_len;
 		skb->len = size;
-		err = skb_copy_datagram_from_iovec(skb, 0, msg->msg_iov,
-						   sent, size);
+		err = skb_copy_datagram_from_iter(skb, 0, &from, size);
 		if (err) {
 			kfree_skb(skb);
 			goto out_err;
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* PATCH 11/17] switch sctp_user_addto_chunk() and sctp_datamsg_from_user() to passing iov_iter
  2014-11-22  4:28                     ` Al Viro
                                         ` (9 preceding siblings ...)
  2014-11-22  4:35                       ` [PATCH 10/17] switch AF_PACKET and AF_UNIX to skb_copy_datagram_from_iter() Al Viro
@ 2014-11-22  4:36                       ` Al Viro
  2014-11-22  4:36                       ` [PATCH 12/17] tipc_sendmsg(): pass msghdr instead of its ->msg_iov Al Viro
                                         ` (7 subsequent siblings)
  18 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-11-22  4:36 UTC (permalink / raw)
  To: David Miller; +Cc: torvalds, netdev, linux-kernel, target-devel, nab, hch

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 include/net/sctp/structs.h |    6 +++---
 net/sctp/chunk.c           |    9 ++++-----
 net/sctp/sm_make_chunk.c   |   16 ++++++++--------
 net/sctp/socket.c          |    5 ++++-
 4 files changed, 19 insertions(+), 17 deletions(-)

diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index 806e3b5..2bb2fcf 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -531,7 +531,7 @@ struct sctp_datamsg {
 
 struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *,
 					    struct sctp_sndrcvinfo *,
-					    struct msghdr *, int len);
+					    struct iov_iter *);
 void sctp_datamsg_free(struct sctp_datamsg *);
 void sctp_datamsg_put(struct sctp_datamsg *);
 void sctp_chunk_fail(struct sctp_chunk *, int error);
@@ -647,8 +647,8 @@ struct sctp_chunk {
 
 void sctp_chunk_hold(struct sctp_chunk *);
 void sctp_chunk_put(struct sctp_chunk *);
-int sctp_user_addto_chunk(struct sctp_chunk *chunk, int off, int len,
-			  struct iovec *data);
+int sctp_user_addto_chunk(struct sctp_chunk *chunk, int len,
+			  struct iov_iter *from);
 void sctp_chunk_free(struct sctp_chunk *);
 void  *sctp_addto_chunk(struct sctp_chunk *, int len, const void *data);
 struct sctp_chunk *sctp_chunkify(struct sk_buff *,
diff --git a/net/sctp/chunk.c b/net/sctp/chunk.c
index 158701d..a338091 100644
--- a/net/sctp/chunk.c
+++ b/net/sctp/chunk.c
@@ -164,7 +164,7 @@ static void sctp_datamsg_assign(struct sctp_datamsg *msg, struct sctp_chunk *chu
  */
 struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *asoc,
 					    struct sctp_sndrcvinfo *sinfo,
-					    struct msghdr *msgh, int msg_len)
+					    struct iov_iter *from)
 {
 	int max, whole, i, offset, over, err;
 	int len, first_len;
@@ -172,6 +172,7 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *asoc,
 	struct sctp_chunk *chunk;
 	struct sctp_datamsg *msg;
 	struct list_head *pos, *temp;
+	size_t msg_len = iov_iter_count(from);
 	__u8 frag;
 
 	msg = sctp_datamsg_new(GFP_KERNEL);
@@ -279,12 +280,10 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *asoc,
 			goto errout;
 		}
 
-		err = sctp_user_addto_chunk(chunk, offset, len, msgh->msg_iov);
+		err = sctp_user_addto_chunk(chunk, len, from);
 		if (err < 0)
 			goto errout_chunk_free;
 
-		offset += len;
-
 		/* Put the chunk->skb back into the form expected by send.  */
 		__skb_pull(chunk->skb, (__u8 *)chunk->chunk_hdr
 			   - (__u8 *)chunk->skb->data);
@@ -317,7 +316,7 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *asoc,
 			goto errout;
 		}
 
-		err = sctp_user_addto_chunk(chunk, offset, over, msgh->msg_iov);
+		err = sctp_user_addto_chunk(chunk, over, from);
 
 		/* Put the chunk->skb back into the form expected by send.  */
 		__skb_pull(chunk->skb, (__u8 *)chunk->chunk_hdr
diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
index e49bcce..e49e231 100644
--- a/net/sctp/sm_make_chunk.c
+++ b/net/sctp/sm_make_chunk.c
@@ -1491,26 +1491,26 @@ static void *sctp_addto_chunk_fixed(struct sctp_chunk *chunk,
  * chunk is not big enough.
  * Returns a kernel err value.
  */
-int sctp_user_addto_chunk(struct sctp_chunk *chunk, int off, int len,
-			  struct iovec *data)
+int sctp_user_addto_chunk(struct sctp_chunk *chunk, int len,
+			  struct iov_iter *from)
 {
-	__u8 *target;
-	int err = 0;
+	void *target;
+	ssize_t copied;
 
 	/* Make room in chunk for data.  */
 	target = skb_put(chunk->skb, len);
 
 	/* Copy data (whole iovec) into chunk */
-	if ((err = memcpy_fromiovecend(target, data, off, len)))
-		goto out;
+	copied = copy_from_iter(target, len, from);
+	if (copied != len)
+		return -EFAULT;
 
 	/* Adjust the chunk length field.  */
 	chunk->chunk_hdr->length =
 		htons(ntohs(chunk->chunk_hdr->length) + len);
 	chunk->chunk_end = skb_tail_pointer(chunk->skb);
 
-out:
-	return err;
+	return 0;
 }
 
 /* Helper function to assign a TSN if needed.  This assumes that both
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 2120292..7e866b7 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -1609,6 +1609,9 @@ static int sctp_sendmsg(struct kiocb *iocb, struct sock *sk,
 	__u16 sinfo_flags = 0;
 	long timeo;
 	int err;
+	struct iov_iter from;
+
+	iov_iter_init(&from, WRITE, msg->msg_iov, msg->msg_iovlen, msg_len);
 
 	err = 0;
 	sp = sctp_sk(sk);
@@ -1947,7 +1950,7 @@ static int sctp_sendmsg(struct kiocb *iocb, struct sock *sk,
 	}
 
 	/* Break the message into multiple chunks of maximum size. */
-	datamsg = sctp_datamsg_from_user(asoc, sinfo, msg, msg_len);
+	datamsg = sctp_datamsg_from_user(asoc, sinfo, &from);
 	if (IS_ERR(datamsg)) {
 		err = PTR_ERR(datamsg);
 		goto out_free;
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 12/17] tipc_sendmsg(): pass msghdr instead of its ->msg_iov
  2014-11-22  4:28                     ` Al Viro
                                         ` (10 preceding siblings ...)
  2014-11-22  4:36                       ` PATCH 11/17] switch sctp_user_addto_chunk() and sctp_datamsg_from_user() to passing iov_iter Al Viro
@ 2014-11-22  4:36                       ` Al Viro
  2014-11-22  4:37                       ` [PATCH 13/17] tipc_msg_build(): " Al Viro
                                         ` (6 subsequent siblings)
  18 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-11-22  4:36 UTC (permalink / raw)
  To: David Miller; +Cc: torvalds, netdev, linux-kernel, target-devel, nab, hch

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 net/tipc/socket.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 591bbfa..8c94ec4 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -692,7 +692,7 @@ static unsigned int tipc_poll(struct file *file, struct socket *sock,
  * tipc_sendmcast - send multicast message
  * @sock: socket structure
  * @seq: destination address
- * @iov: message data to send
+ * @msg: message to send
  * @dsz: total length of message data
  * @timeo: timeout to wait for wakeup
  *
@@ -700,7 +700,7 @@ static unsigned int tipc_poll(struct file *file, struct socket *sock,
  * Returns the number of bytes sent on success, or errno
  */
 static int tipc_sendmcast(struct  socket *sock, struct tipc_name_seq *seq,
-			  struct iovec *iov, size_t dsz, long timeo)
+			  struct msghdr *msg, size_t dsz, long timeo)
 {
 	struct sock *sk = sock->sk;
 	struct tipc_msg *mhdr = &tipc_sk(sk)->phdr;
@@ -719,7 +719,7 @@ static int tipc_sendmcast(struct  socket *sock, struct tipc_name_seq *seq,
 
 new_mtu:
 	mtu = tipc_bclink_get_mtu();
-	rc = tipc_msg_build(mhdr, iov, 0, dsz, mtu, &buf);
+	rc = tipc_msg_build(mhdr, msg->msg_iov, 0, dsz, mtu, &buf);
 	if (unlikely(rc < 0))
 		return rc;
 
@@ -943,7 +943,7 @@ static int tipc_sendmsg(struct kiocb *iocb, struct socket *sock,
 	timeo = sock_sndtimeo(sk, m->msg_flags & MSG_DONTWAIT);
 
 	if (dest->addrtype == TIPC_ADDR_MCAST) {
-		rc = tipc_sendmcast(sock, seq, iov, dsz, timeo);
+		rc = tipc_sendmcast(sock, seq, m, dsz, timeo);
 		goto exit;
 	} else if (dest->addrtype == TIPC_ADDR_NAME) {
 		u32 type = dest->addr.name.name.type;
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 13/17] tipc_msg_build(): pass msghdr instead of its ->msg_iov
  2014-11-22  4:28                     ` Al Viro
                                         ` (11 preceding siblings ...)
  2014-11-22  4:36                       ` [PATCH 12/17] tipc_sendmsg(): pass msghdr instead of its ->msg_iov Al Viro
@ 2014-11-22  4:37                       ` Al Viro
  2014-11-22  4:37                       ` [PATCH 14/17] vmci_transport: switch ->enqeue_dgram, ->enqueue_stream and ->dequeue_stream to msghdr Al Viro
                                         ` (5 subsequent siblings)
  18 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-11-22  4:37 UTC (permalink / raw)
  To: David Miller; +Cc: torvalds, netdev, linux-kernel, target-devel, nab, hch

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 net/tipc/msg.c    |    8 ++++----
 net/tipc/msg.h    |    2 +-
 net/tipc/socket.c |    7 +++----
 3 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/net/tipc/msg.c b/net/tipc/msg.c
index ec18076..9155496 100644
--- a/net/tipc/msg.c
+++ b/net/tipc/msg.c
@@ -162,14 +162,14 @@ err:
 /**
  * tipc_msg_build - create buffer chain containing specified header and data
  * @mhdr: Message header, to be prepended to data
- * @iov: User data
+ * @m: User message
  * @offset: Posision in iov to start copying from
  * @dsz: Total length of user data
  * @pktmax: Max packet size that can be used
  * @chain: Buffer or chain of buffers to be returned to caller
  * Returns message data size or errno: -ENOMEM, -EFAULT
  */
-int tipc_msg_build(struct tipc_msg *mhdr, struct iovec const *iov,
+int tipc_msg_build(struct tipc_msg *mhdr, struct msghdr *m,
 		   int offset, int dsz, int pktmax , struct sk_buff **chain)
 {
 	int mhsz = msg_hdr_sz(mhdr);
@@ -194,7 +194,7 @@ int tipc_msg_build(struct tipc_msg *mhdr, struct iovec const *iov,
 		skb_copy_to_linear_data(buf, mhdr, mhsz);
 		pktpos = buf->data + mhsz;
 		TIPC_SKB_CB(buf)->chain_sz = 1;
-		if (!dsz || !memcpy_fromiovecend(pktpos, iov, offset, dsz))
+		if (!dsz || !memcpy_fromiovecend(pktpos, m->msg_iov, offset, dsz))
 			return dsz;
 		rc = -EFAULT;
 		goto error;
@@ -223,7 +223,7 @@ int tipc_msg_build(struct tipc_msg *mhdr, struct iovec const *iov,
 		if (drem < pktrem)
 			pktrem = drem;
 
-		if (memcpy_fromiovecend(pktpos, iov, offset, pktrem)) {
+		if (memcpy_fromiovecend(pktpos, m->msg_iov, offset, pktrem)) {
 			rc = -EFAULT;
 			goto error;
 		}
diff --git a/net/tipc/msg.h b/net/tipc/msg.h
index 0ea7b69..d7d2ba2 100644
--- a/net/tipc/msg.h
+++ b/net/tipc/msg.h
@@ -743,7 +743,7 @@ bool tipc_msg_bundle(struct sk_buff *bbuf, struct sk_buff *buf, u32 mtu);
 
 bool tipc_msg_make_bundle(struct sk_buff **buf, u32 mtu, u32 dnode);
 
-int tipc_msg_build(struct tipc_msg *mhdr, struct iovec const *iov,
+int tipc_msg_build(struct tipc_msg *mhdr, struct msghdr *m,
 		   int offset, int dsz, int mtu , struct sk_buff **chain);
 
 struct sk_buff *tipc_msg_reassemble(struct sk_buff *chain);
diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 8c94ec4..7ad2a93 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -719,7 +719,7 @@ static int tipc_sendmcast(struct  socket *sock, struct tipc_name_seq *seq,
 
 new_mtu:
 	mtu = tipc_bclink_get_mtu();
-	rc = tipc_msg_build(mhdr, msg->msg_iov, 0, dsz, mtu, &buf);
+	rc = tipc_msg_build(mhdr, msg, 0, dsz, mtu, &buf);
 	if (unlikely(rc < 0))
 		return rc;
 
@@ -897,7 +897,6 @@ static int tipc_sendmsg(struct kiocb *iocb, struct socket *sock,
 	struct sock *sk = sock->sk;
 	struct tipc_sock *tsk = tipc_sk(sk);
 	struct tipc_msg *mhdr = &tsk->phdr;
-	struct iovec *iov = m->msg_iov;
 	u32 dnode, dport;
 	struct sk_buff *buf;
 	struct tipc_name_seq *seq = &dest->addr.nameseq;
@@ -974,7 +973,7 @@ static int tipc_sendmsg(struct kiocb *iocb, struct socket *sock,
 
 new_mtu:
 	mtu = tipc_node_get_mtu(dnode, tsk->ref);
-	rc = tipc_msg_build(mhdr, iov, 0, dsz, mtu, &buf);
+	rc = tipc_msg_build(mhdr, m, 0, dsz, mtu, &buf);
 	if (rc < 0)
 		goto exit;
 
@@ -1086,7 +1085,7 @@ static int tipc_send_stream(struct kiocb *iocb, struct socket *sock,
 next:
 	mtu = tsk->max_pkt;
 	send = min_t(uint, dsz - sent, TIPC_MAX_USER_MSG_SIZE);
-	rc = tipc_msg_build(mhdr, m->msg_iov, sent, send, mtu, &buf);
+	rc = tipc_msg_build(mhdr, m, sent, send, mtu, &buf);
 	if (unlikely(rc < 0))
 		goto exit;
 	do {
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 14/17] vmci_transport: switch ->enqeue_dgram, ->enqueue_stream and ->dequeue_stream to msghdr
  2014-11-22  4:28                     ` Al Viro
                                         ` (12 preceding siblings ...)
  2014-11-22  4:37                       ` [PATCH 13/17] tipc_msg_build(): " Al Viro
@ 2014-11-22  4:37                       ` Al Viro
  2014-11-22  4:38                       ` [PATCH 15/17] [atm] switch vcc_sendmsg() to copy_from_iter() Al Viro
                                         ` (4 subsequent siblings)
  18 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-11-22  4:37 UTC (permalink / raw)
  To: David Miller; +Cc: torvalds, netdev, linux-kernel, target-devel, nab, hch

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 include/net/af_vsock.h         |    6 +++---
 net/vmw_vsock/af_vsock.c       |    6 +++---
 net/vmw_vsock/vmci_transport.c |   14 +++++++-------
 3 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
index 4282778..0d87674 100644
--- a/include/net/af_vsock.h
+++ b/include/net/af_vsock.h
@@ -103,14 +103,14 @@ struct vsock_transport {
 	int (*dgram_dequeue)(struct kiocb *kiocb, struct vsock_sock *vsk,
 			     struct msghdr *msg, size_t len, int flags);
 	int (*dgram_enqueue)(struct vsock_sock *, struct sockaddr_vm *,
-			     struct iovec *, size_t len);
+			     struct msghdr *, size_t len);
 	bool (*dgram_allow)(u32 cid, u32 port);
 
 	/* STREAM. */
 	/* TODO: stream_bind() */
-	ssize_t (*stream_dequeue)(struct vsock_sock *, struct iovec *,
+	ssize_t (*stream_dequeue)(struct vsock_sock *, struct msghdr *,
 				  size_t len, int flags);
-	ssize_t (*stream_enqueue)(struct vsock_sock *, struct iovec *,
+	ssize_t (*stream_enqueue)(struct vsock_sock *, struct msghdr *,
 				  size_t len);
 	s64 (*stream_has_data)(struct vsock_sock *);
 	s64 (*stream_has_space)(struct vsock_sock *);
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 85d232b..1d0e39c 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1013,7 +1013,7 @@ static int vsock_dgram_sendmsg(struct kiocb *kiocb, struct socket *sock,
 		goto out;
 	}
 
-	err = transport->dgram_enqueue(vsk, remote_addr, msg->msg_iov, len);
+	err = transport->dgram_enqueue(vsk, remote_addr, msg, len);
 
 out:
 	release_sock(sk);
@@ -1617,7 +1617,7 @@ static int vsock_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
 		 */
 
 		written = transport->stream_enqueue(
-				vsk, msg->msg_iov,
+				vsk, msg,
 				len - total_written);
 		if (written < 0) {
 			err = -ENOMEM;
@@ -1739,7 +1739,7 @@ vsock_stream_recvmsg(struct kiocb *kiocb,
 				break;
 
 			read = transport->stream_dequeue(
-					vsk, msg->msg_iov,
+					vsk, msg,
 					len - copied, flags);
 			if (read < 0) {
 				err = -ENOMEM;
diff --git a/net/vmw_vsock/vmci_transport.c b/net/vmw_vsock/vmci_transport.c
index a57ddef..c1c0389 100644
--- a/net/vmw_vsock/vmci_transport.c
+++ b/net/vmw_vsock/vmci_transport.c
@@ -1697,7 +1697,7 @@ static int vmci_transport_dgram_bind(struct vsock_sock *vsk,
 static int vmci_transport_dgram_enqueue(
 	struct vsock_sock *vsk,
 	struct sockaddr_vm *remote_addr,
-	struct iovec *iov,
+	struct msghdr *msg,
 	size_t len)
 {
 	int err;
@@ -1714,7 +1714,7 @@ static int vmci_transport_dgram_enqueue(
 	if (!dg)
 		return -ENOMEM;
 
-	memcpy_fromiovec(VMCI_DG_PAYLOAD(dg), iov, len);
+	memcpy_from_msg(VMCI_DG_PAYLOAD(dg), msg, len);
 
 	dg->dst = vmci_make_handle(remote_addr->svm_cid,
 				   remote_addr->svm_port);
@@ -1835,22 +1835,22 @@ static int vmci_transport_connect(struct vsock_sock *vsk)
 
 static ssize_t vmci_transport_stream_dequeue(
 	struct vsock_sock *vsk,
-	struct iovec *iov,
+	struct msghdr *msg,
 	size_t len,
 	int flags)
 {
 	if (flags & MSG_PEEK)
-		return vmci_qpair_peekv(vmci_trans(vsk)->qpair, iov, len, 0);
+		return vmci_qpair_peekv(vmci_trans(vsk)->qpair, msg->msg_iov, len, 0);
 	else
-		return vmci_qpair_dequev(vmci_trans(vsk)->qpair, iov, len, 0);
+		return vmci_qpair_dequev(vmci_trans(vsk)->qpair, msg->msg_iov, len, 0);
 }
 
 static ssize_t vmci_transport_stream_enqueue(
 	struct vsock_sock *vsk,
-	struct iovec *iov,
+	struct msghdr *msg,
 	size_t len)
 {
-	return vmci_qpair_enquev(vmci_trans(vsk)->qpair, iov, len, 0);
+	return vmci_qpair_enquev(vmci_trans(vsk)->qpair, msg->msg_iov, len, 0);
 }
 
 static s64 vmci_transport_stream_has_data(struct vsock_sock *vsk)
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 15/17] [atm] switch vcc_sendmsg() to copy_from_iter()
  2014-11-22  4:28                     ` Al Viro
                                         ` (13 preceding siblings ...)
  2014-11-22  4:37                       ` [PATCH 14/17] vmci_transport: switch ->enqeue_dgram, ->enqueue_stream and ->dequeue_stream to msghdr Al Viro
@ 2014-11-22  4:38                       ` Al Viro
  2014-11-22  4:38                       ` [PATCH 16/17] rds: switch ->inc_copy_to_user() to passing iov_iter Al Viro
                                         ` (3 subsequent siblings)
  18 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-11-22  4:38 UTC (permalink / raw)
  To: David Miller; +Cc: torvalds, netdev, linux-kernel, target-devel, nab, hch

... and make it handle multi-segment iovecs - deals with that
"fix this later" issue for free.  A bit of shame, really - it
had been there since 2.3.15pre3 when the whole thing went into the
tree, practically a historical artefact by now...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 net/atm/common.c |   17 ++++++-----------
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/net/atm/common.c b/net/atm/common.c
index 9cd1cca..f591129 100644
--- a/net/atm/common.c
+++ b/net/atm/common.c
@@ -570,15 +570,16 @@ int vcc_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
 }
 
 int vcc_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *m,
-		size_t total_len)
+		size_t size)
 {
 	struct sock *sk = sock->sk;
 	DEFINE_WAIT(wait);
 	struct atm_vcc *vcc;
 	struct sk_buff *skb;
 	int eff, error;
-	const void __user *buff;
-	int size;
+	struct iov_iter from;
+
+	iov_iter_init(&from, WRITE, m->msg_iov, m->msg_iovlen, size);
 
 	lock_sock(sk);
 	if (sock->state != SS_CONNECTED) {
@@ -589,12 +590,6 @@ int vcc_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *m,
 		error = -EISCONN;
 		goto out;
 	}
-	if (m->msg_iovlen != 1) {
-		error = -ENOSYS; /* fix this later @@@ */
-		goto out;
-	}
-	buff = m->msg_iov->iov_base;
-	size = m->msg_iov->iov_len;
 	vcc = ATM_SD(sock);
 	if (test_bit(ATM_VF_RELEASED, &vcc->flags) ||
 	    test_bit(ATM_VF_CLOSE, &vcc->flags) ||
@@ -607,7 +602,7 @@ int vcc_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *m,
 		error = 0;
 		goto out;
 	}
-	if (size < 0 || size > vcc->qos.txtp.max_sdu) {
+	if (size > vcc->qos.txtp.max_sdu) {
 		error = -EMSGSIZE;
 		goto out;
 	}
@@ -639,7 +634,7 @@ int vcc_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *m,
 		goto out;
 	skb->dev = NULL; /* for paths shared with net_device interfaces */
 	ATM_SKB(skb)->atm_options = vcc->atm_options;
-	if (copy_from_user(skb_put(skb, size), buff, size)) {
+	if (copy_from_iter(skb_put(skb, size), size, &from) != size) {
 		kfree_skb(skb);
 		error = -EFAULT;
 		goto out;
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 16/17] rds: switch ->inc_copy_to_user() to passing iov_iter
  2014-11-22  4:28                     ` Al Viro
                                         ` (14 preceding siblings ...)
  2014-11-22  4:38                       ` [PATCH 15/17] [atm] switch vcc_sendmsg() to copy_from_iter() Al Viro
@ 2014-11-22  4:38                       ` Al Viro
  2014-11-22  4:39                       ` [PATCH 17/17] rds: switch rds_message_copy_from_user() to iov_iter Al Viro
                                         ` (2 subsequent siblings)
  18 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-11-22  4:38 UTC (permalink / raw)
  To: David Miller; +Cc: torvalds, netdev, linux-kernel, target-devel, nab, hch

instances get considerably simpler from that...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 net/rds/ib.h       |    3 +--
 net/rds/ib_recv.c  |   37 +++++++++++--------------------------
 net/rds/iw.h       |    3 +--
 net/rds/iw_recv.c  |   37 +++++++++++--------------------------
 net/rds/message.c  |   35 ++++++++---------------------------
 net/rds/rds.h      |    6 ++----
 net/rds/recv.c     |    5 +++--
 net/rds/tcp.h      |    3 +--
 net/rds/tcp_recv.c |   38 +++++++++-----------------------------
 9 files changed, 47 insertions(+), 120 deletions(-)

diff --git a/net/rds/ib.h b/net/rds/ib.h
index 7280ab8..c36d713 100644
--- a/net/rds/ib.h
+++ b/net/rds/ib.h
@@ -316,8 +316,7 @@ int rds_ib_recv_alloc_caches(struct rds_ib_connection *ic);
 void rds_ib_recv_free_caches(struct rds_ib_connection *ic);
 void rds_ib_recv_refill(struct rds_connection *conn, int prefill);
 void rds_ib_inc_free(struct rds_incoming *inc);
-int rds_ib_inc_copy_to_user(struct rds_incoming *inc, struct iovec *iov,
-			     size_t size);
+int rds_ib_inc_copy_to_user(struct rds_incoming *inc, struct iov_iter *to);
 void rds_ib_recv_cq_comp_handler(struct ib_cq *cq, void *context);
 void rds_ib_recv_tasklet_fn(unsigned long data);
 void rds_ib_recv_init_ring(struct rds_ib_connection *ic);
diff --git a/net/rds/ib_recv.c b/net/rds/ib_recv.c
index d67de45..1b981a4 100644
--- a/net/rds/ib_recv.c
+++ b/net/rds/ib_recv.c
@@ -472,15 +472,12 @@ static struct list_head *rds_ib_recv_cache_get(struct rds_ib_refill_cache *cache
 	return head;
 }
 
-int rds_ib_inc_copy_to_user(struct rds_incoming *inc, struct iovec *first_iov,
-			    size_t size)
+int rds_ib_inc_copy_to_user(struct rds_incoming *inc, struct iov_iter *to)
 {
 	struct rds_ib_incoming *ibinc;
 	struct rds_page_frag *frag;
-	struct iovec *iov = first_iov;
 	unsigned long to_copy;
 	unsigned long frag_off = 0;
-	unsigned long iov_off = 0;
 	int copied = 0;
 	int ret;
 	u32 len;
@@ -489,37 +486,25 @@ int rds_ib_inc_copy_to_user(struct rds_incoming *inc, struct iovec *first_iov,
 	frag = list_entry(ibinc->ii_frags.next, struct rds_page_frag, f_item);
 	len = be32_to_cpu(inc->i_hdr.h_len);
 
-	while (copied < size && copied < len) {
+	while (iov_iter_count(to) && copied < len) {
 		if (frag_off == RDS_FRAG_SIZE) {
 			frag = list_entry(frag->f_item.next,
 					  struct rds_page_frag, f_item);
 			frag_off = 0;
 		}
-		while (iov_off == iov->iov_len) {
-			iov_off = 0;
-			iov++;
-		}
-
-		to_copy = min(iov->iov_len - iov_off, RDS_FRAG_SIZE - frag_off);
-		to_copy = min_t(size_t, to_copy, size - copied);
+		to_copy = min_t(unsigned long, iov_iter_count(to),
+				RDS_FRAG_SIZE - frag_off);
 		to_copy = min_t(unsigned long, to_copy, len - copied);
 
-		rdsdebug("%lu bytes to user [%p, %zu] + %lu from frag "
-			 "[%p, %u] + %lu\n",
-			 to_copy, iov->iov_base, iov->iov_len, iov_off,
-			 sg_page(&frag->f_sg), frag->f_sg.offset, frag_off);
-
 		/* XXX needs + offset for multiple recvs per page */
-		ret = rds_page_copy_to_user(sg_page(&frag->f_sg),
-					    frag->f_sg.offset + frag_off,
-					    iov->iov_base + iov_off,
-					    to_copy);
-		if (ret) {
-			copied = ret;
-			break;
-		}
+		rds_stats_add(s_copy_to_user, to_copy);
+		ret = copy_page_to_iter(sg_page(&frag->f_sg),
+					frag->f_sg.offset + frag_off,
+					to_copy,
+					to);
+		if (ret != to_copy)
+			return -EFAULT;
 
-		iov_off += to_copy;
 		frag_off += to_copy;
 		copied += to_copy;
 	}
diff --git a/net/rds/iw.h b/net/rds/iw.h
index 04ce3b1..cbe6674 100644
--- a/net/rds/iw.h
+++ b/net/rds/iw.h
@@ -325,8 +325,7 @@ int rds_iw_recv(struct rds_connection *conn);
 int rds_iw_recv_refill(struct rds_connection *conn, gfp_t kptr_gfp,
 		       gfp_t page_gfp, int prefill);
 void rds_iw_inc_free(struct rds_incoming *inc);
-int rds_iw_inc_copy_to_user(struct rds_incoming *inc, struct iovec *iov,
-			     size_t size);
+int rds_iw_inc_copy_to_user(struct rds_incoming *inc, struct iov_iter *to);
 void rds_iw_recv_cq_comp_handler(struct ib_cq *cq, void *context);
 void rds_iw_recv_tasklet_fn(unsigned long data);
 void rds_iw_recv_init_ring(struct rds_iw_connection *ic);
diff --git a/net/rds/iw_recv.c b/net/rds/iw_recv.c
index aa8bf67..a66d179 100644
--- a/net/rds/iw_recv.c
+++ b/net/rds/iw_recv.c
@@ -303,15 +303,12 @@ void rds_iw_inc_free(struct rds_incoming *inc)
 	BUG_ON(atomic_read(&rds_iw_allocation) < 0);
 }
 
-int rds_iw_inc_copy_to_user(struct rds_incoming *inc, struct iovec *first_iov,
-			    size_t size)
+int rds_iw_inc_copy_to_user(struct rds_incoming *inc, struct iov_iter *to)
 {
 	struct rds_iw_incoming *iwinc;
 	struct rds_page_frag *frag;
-	struct iovec *iov = first_iov;
 	unsigned long to_copy;
 	unsigned long frag_off = 0;
-	unsigned long iov_off = 0;
 	int copied = 0;
 	int ret;
 	u32 len;
@@ -320,37 +317,25 @@ int rds_iw_inc_copy_to_user(struct rds_incoming *inc, struct iovec *first_iov,
 	frag = list_entry(iwinc->ii_frags.next, struct rds_page_frag, f_item);
 	len = be32_to_cpu(inc->i_hdr.h_len);
 
-	while (copied < size && copied < len) {
+	while (iov_iter_count(to) && copied < len) {
 		if (frag_off == RDS_FRAG_SIZE) {
 			frag = list_entry(frag->f_item.next,
 					  struct rds_page_frag, f_item);
 			frag_off = 0;
 		}
-		while (iov_off == iov->iov_len) {
-			iov_off = 0;
-			iov++;
-		}
-
-		to_copy = min(iov->iov_len - iov_off, RDS_FRAG_SIZE - frag_off);
-		to_copy = min_t(size_t, to_copy, size - copied);
+		to_copy = min_t(unsigned long, iov_iter_count(to),
+				RDS_FRAG_SIZE - frag_off);
 		to_copy = min_t(unsigned long, to_copy, len - copied);
 
-		rdsdebug("%lu bytes to user [%p, %zu] + %lu from frag "
-			 "[%p, %lu] + %lu\n",
-			 to_copy, iov->iov_base, iov->iov_len, iov_off,
-			 frag->f_page, frag->f_offset, frag_off);
-
 		/* XXX needs + offset for multiple recvs per page */
-		ret = rds_page_copy_to_user(frag->f_page,
-					    frag->f_offset + frag_off,
-					    iov->iov_base + iov_off,
-					    to_copy);
-		if (ret) {
-			copied = ret;
-			break;
-		}
+		rds_stats_add(s_copy_to_user, to_copy);
+		ret = copy_page_to_iter(frag->f_page,
+					frag->f_offset + frag_off,
+					to_copy,
+					to);
+		if (ret != to_copy)
+			return -EFAULT;
 
-		iov_off += to_copy;
 		frag_off += to_copy;
 		copied += to_copy;
 	}
diff --git a/net/rds/message.c b/net/rds/message.c
index aba232f..7a546e0 100644
--- a/net/rds/message.c
+++ b/net/rds/message.c
@@ -325,14 +325,11 @@ out:
 	return ret;
 }
 
-int rds_message_inc_copy_to_user(struct rds_incoming *inc,
-				 struct iovec *first_iov, size_t size)
+int rds_message_inc_copy_to_user(struct rds_incoming *inc, struct iov_iter *to)
 {
 	struct rds_message *rm;
-	struct iovec *iov;
 	struct scatterlist *sg;
 	unsigned long to_copy;
-	unsigned long iov_off;
 	unsigned long vec_off;
 	int copied;
 	int ret;
@@ -341,36 +338,20 @@ int rds_message_inc_copy_to_user(struct rds_incoming *inc,
 	rm = container_of(inc, struct rds_message, m_inc);
 	len = be32_to_cpu(rm->m_inc.i_hdr.h_len);
 
-	iov = first_iov;
-	iov_off = 0;
 	sg = rm->data.op_sg;
 	vec_off = 0;
 	copied = 0;
 
-	while (copied < size && copied < len) {
-		while (iov_off == iov->iov_len) {
-			iov_off = 0;
-			iov++;
-		}
-
-		to_copy = min(iov->iov_len - iov_off, sg->length - vec_off);
-		to_copy = min_t(size_t, to_copy, size - copied);
+	while (iov_iter_count(to) && copied < len) {
+		to_copy = min(iov_iter_count(to), sg->length - vec_off);
 		to_copy = min_t(unsigned long, to_copy, len - copied);
 
-		rdsdebug("copying %lu bytes to user iov [%p, %zu] + %lu to "
-			 "sg [%p, %u, %u] + %lu\n",
-			 to_copy, iov->iov_base, iov->iov_len, iov_off,
-			 sg_page(sg), sg->offset, sg->length, vec_off);
-
-		ret = rds_page_copy_to_user(sg_page(sg), sg->offset + vec_off,
-					    iov->iov_base + iov_off,
-					    to_copy);
-		if (ret) {
-			copied = ret;
-			break;
-		}
+		rds_stats_add(s_copy_to_user, to_copy);
+		ret = copy_page_to_iter(sg_page(sg), sg->offset + vec_off,
+					to_copy, to);
+		if (ret != to_copy)
+			return -EFAULT;
 
-		iov_off += to_copy;
 		vec_off += to_copy;
 		copied += to_copy;
 
diff --git a/net/rds/rds.h b/net/rds/rds.h
index 48f8ffc..b22dad9 100644
--- a/net/rds/rds.h
+++ b/net/rds/rds.h
@@ -431,8 +431,7 @@ struct rds_transport {
 	int (*xmit_rdma)(struct rds_connection *conn, struct rm_rdma_op *op);
 	int (*xmit_atomic)(struct rds_connection *conn, struct rm_atomic_op *op);
 	int (*recv)(struct rds_connection *conn);
-	int (*inc_copy_to_user)(struct rds_incoming *inc, struct iovec *iov,
-				size_t size);
+	int (*inc_copy_to_user)(struct rds_incoming *inc, struct iov_iter *to);
 	void (*inc_free)(struct rds_incoming *inc);
 
 	int (*cm_handle_connect)(struct rdma_cm_id *cm_id,
@@ -667,8 +666,7 @@ int rds_message_add_extension(struct rds_header *hdr,
 int rds_message_next_extension(struct rds_header *hdr,
 			       unsigned int *pos, void *buf, unsigned int *buflen);
 int rds_message_add_rdma_dest_extension(struct rds_header *hdr, u32 r_key, u32 offset);
-int rds_message_inc_copy_to_user(struct rds_incoming *inc,
-				 struct iovec *first_iov, size_t size);
+int rds_message_inc_copy_to_user(struct rds_incoming *inc, struct iov_iter *to);
 void rds_message_inc_free(struct rds_incoming *inc);
 void rds_message_addref(struct rds_message *rm);
 void rds_message_put(struct rds_message *rm);
diff --git a/net/rds/recv.c b/net/rds/recv.c
index bd82522..47d7b10 100644
--- a/net/rds/recv.c
+++ b/net/rds/recv.c
@@ -404,6 +404,7 @@ int rds_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
 	int ret = 0, nonblock = msg_flags & MSG_DONTWAIT;
 	DECLARE_SOCKADDR(struct sockaddr_in *, sin, msg->msg_name);
 	struct rds_incoming *inc = NULL;
+	struct iov_iter to;
 
 	/* udp_recvmsg()->sock_recvtimeo() gets away without locking too.. */
 	timeo = sock_rcvtimeo(sk, nonblock);
@@ -449,8 +450,8 @@ int rds_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
 		rdsdebug("copying inc %p from %pI4:%u to user\n", inc,
 			 &inc->i_conn->c_faddr,
 			 ntohs(inc->i_hdr.h_sport));
-		ret = inc->i_conn->c_trans->inc_copy_to_user(inc, msg->msg_iov,
-							     size);
+		iov_iter_init(&to, READ, msg->msg_iov, msg->msg_iovlen, size);
+		ret = inc->i_conn->c_trans->inc_copy_to_user(inc, &to);
 		if (ret < 0)
 			break;
 
diff --git a/net/rds/tcp.h b/net/rds/tcp.h
index 6563749..0dbdd37 100644
--- a/net/rds/tcp.h
+++ b/net/rds/tcp.h
@@ -69,8 +69,7 @@ void rds_tcp_recv_exit(void);
 void rds_tcp_data_ready(struct sock *sk);
 int rds_tcp_recv(struct rds_connection *conn);
 void rds_tcp_inc_free(struct rds_incoming *inc);
-int rds_tcp_inc_copy_to_user(struct rds_incoming *inc, struct iovec *iov,
-			     size_t size);
+int rds_tcp_inc_copy_to_user(struct rds_incoming *inc, struct iov_iter *to);
 
 /* tcp_send.c */
 void rds_tcp_xmit_prepare(struct rds_connection *conn);
diff --git a/net/rds/tcp_recv.c b/net/rds/tcp_recv.c
index 9ae6e0a..fbc5ef8 100644
--- a/net/rds/tcp_recv.c
+++ b/net/rds/tcp_recv.c
@@ -59,50 +59,30 @@ void rds_tcp_inc_free(struct rds_incoming *inc)
 /*
  * this is pretty lame, but, whatever.
  */
-int rds_tcp_inc_copy_to_user(struct rds_incoming *inc, struct iovec *first_iov,
-			     size_t size)
+int rds_tcp_inc_copy_to_user(struct rds_incoming *inc, struct iov_iter *to)
 {
 	struct rds_tcp_incoming *tinc;
-	struct iovec *iov, tmp;
 	struct sk_buff *skb;
-	unsigned long to_copy, skb_off;
 	int ret = 0;
 
-	if (size == 0)
+	if (!iov_iter_count(to))
 		goto out;
 
 	tinc = container_of(inc, struct rds_tcp_incoming, ti_inc);
-	iov = first_iov;
-	tmp = *iov;
 
 	skb_queue_walk(&tinc->ti_skb_list, skb) {
-		skb_off = 0;
-		while (skb_off < skb->len) {
-			while (tmp.iov_len == 0) {
-				iov++;
-				tmp = *iov;
-			}
-
-			to_copy = min(tmp.iov_len, size);
+		unsigned long to_copy, skb_off;
+		for (skb_off = 0; skb_off < skb->len; skb_off += to_copy) {
+			to_copy = iov_iter_count(to);
 			to_copy = min(to_copy, skb->len - skb_off);
 
-			rdsdebug("ret %d size %zu skb %p skb_off %lu "
-				 "skblen %d iov_base %p iov_len %zu cpy %lu\n",
-				 ret, size, skb, skb_off, skb->len,
-				 tmp.iov_base, tmp.iov_len, to_copy);
-
-			/* modifies tmp as it copies */
-			if (skb_copy_datagram_iovec(skb, skb_off, &tmp,
-						    to_copy)) {
-				ret = -EFAULT;
-				goto out;
-			}
+			if (skb_copy_datagram_iter(skb, skb_off, to, to_copy))
+				return -EFAULT;
 
 			rds_stats_add(s_copy_to_user, to_copy);
-			size -= to_copy;
 			ret += to_copy;
-			skb_off += to_copy;
-			if (size == 0)
+
+			if (!iov_iter_count(to))
 				goto out;
 		}
 	}
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 17/17] rds: switch rds_message_copy_from_user() to iov_iter
  2014-11-22  4:28                     ` Al Viro
                                         ` (15 preceding siblings ...)
  2014-11-22  4:38                       ` [PATCH 16/17] rds: switch ->inc_copy_to_user() to passing iov_iter Al Viro
@ 2014-11-22  4:39                       ` Al Viro
  2014-11-24  2:00                         ` Ben Hutchings
  2014-11-22  7:24                       ` [RFC] situation with csum_and_copy_... API David Miller
  2014-11-22 17:48                       ` [RFC] situation with csum_and_copy_... API Linus Torvalds
  18 siblings, 1 reply; 133+ messages in thread
From: Al Viro @ 2014-11-22  4:39 UTC (permalink / raw)
  To: David Miller; +Cc: torvalds, netdev, linux-kernel, target-devel, nab, hch

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 net/rds/message.c |   42 ++++++++++++------------------------------
 net/rds/rds.h     |    3 +--
 net/rds/send.c    |    4 +++-
 3 files changed, 16 insertions(+), 33 deletions(-)

diff --git a/net/rds/message.c b/net/rds/message.c
index 7a546e0..ff22022 100644
--- a/net/rds/message.c
+++ b/net/rds/message.c
@@ -264,64 +264,46 @@ struct rds_message *rds_message_map_pages(unsigned long *page_addrs, unsigned in
 	return rm;
 }
 
-int rds_message_copy_from_user(struct rds_message *rm, struct iovec *first_iov,
-					       size_t total_len)
+int rds_message_copy_from_user(struct rds_message *rm, struct iov_iter *from)
 {
 	unsigned long to_copy;
-	unsigned long iov_off;
 	unsigned long sg_off;
-	struct iovec *iov;
 	struct scatterlist *sg;
 	int ret = 0;
 
-	rm->m_inc.i_hdr.h_len = cpu_to_be32(total_len);
+	rm->m_inc.i_hdr.h_len = cpu_to_be32(iov_iter_count(from));
 
 	/*
 	 * now allocate and copy in the data payload.
 	 */
 	sg = rm->data.op_sg;
-	iov = first_iov;
-	iov_off = 0;
 	sg_off = 0; /* Dear gcc, sg->page will be null from kzalloc. */
 
-	while (total_len) {
+	while (iov_iter_count(from)) {
 		if (!sg_page(sg)) {
-			ret = rds_page_remainder_alloc(sg, total_len,
+			ret = rds_page_remainder_alloc(sg, iov_iter_count(from),
 						       GFP_HIGHUSER);
 			if (ret)
-				goto out;
+				return ret;
 			rm->data.op_nents++;
 			sg_off = 0;
 		}
 
-		while (iov_off == iov->iov_len) {
-			iov_off = 0;
-			iov++;
-		}
-
-		to_copy = min(iov->iov_len - iov_off, sg->length - sg_off);
-		to_copy = min_t(size_t, to_copy, total_len);
-
-		rdsdebug("copying %lu bytes from user iov [%p, %zu] + %lu to "
-			 "sg [%p, %u, %u] + %lu\n",
-			 to_copy, iov->iov_base, iov->iov_len, iov_off,
-			 (void *)sg_page(sg), sg->offset, sg->length, sg_off);
+		to_copy = min_t(unsigned long, iov_iter_count(from),
+				sg->length - sg_off);
 
-		ret = rds_page_copy_from_user(sg_page(sg), sg->offset + sg_off,
-					      iov->iov_base + iov_off,
-					      to_copy);
-		if (ret)
-			goto out;
+		rds_stats_add(s_copy_from_user, to_copy);
+		ret = copy_page_from_iter(sg_page(sg), sg->offset + sg_off,
+					  to_copy, from);
+		if (ret != to_copy)
+			return -EFAULT;
 
-		iov_off += to_copy;
-		total_len -= to_copy;
 		sg_off += to_copy;
 
 		if (sg_off == sg->length)
 			sg++;
 	}
 
-out:
 	return ret;
 }
 
diff --git a/net/rds/rds.h b/net/rds/rds.h
index b22dad9..c2a5eef 100644
--- a/net/rds/rds.h
+++ b/net/rds/rds.h
@@ -656,8 +656,7 @@ rds_conn_connecting(struct rds_connection *conn)
 /* message.c */
 struct rds_message *rds_message_alloc(unsigned int nents, gfp_t gfp);
 struct scatterlist *rds_message_alloc_sgs(struct rds_message *rm, int nents);
-int rds_message_copy_from_user(struct rds_message *rm, struct iovec *first_iov,
-					       size_t total_len);
+int rds_message_copy_from_user(struct rds_message *rm, struct iov_iter *from);
 struct rds_message *rds_message_map_pages(unsigned long *page_addrs, unsigned int total_len);
 void rds_message_populate_header(struct rds_header *hdr, __be16 sport,
 				 __be16 dport, u64 seq);
diff --git a/net/rds/send.c b/net/rds/send.c
index 0a64541..4de62ea 100644
--- a/net/rds/send.c
+++ b/net/rds/send.c
@@ -934,7 +934,9 @@ int rds_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
 	int queued = 0, allocated_mr = 0;
 	int nonblock = msg->msg_flags & MSG_DONTWAIT;
 	long timeo = sock_sndtimeo(sk, nonblock);
+	struct iov_iter from;
 
+	iov_iter_init(&from, WRITE, msg->msg_iov, msg->msg_iovlen, payload_len);
 	/* Mirror Linux UDP mirror of BSD error message compatibility */
 	/* XXX: Perhaps MSG_MORE someday */
 	if (msg->msg_flags & ~(MSG_DONTWAIT | MSG_CMSG_COMPAT)) {
@@ -982,7 +984,7 @@ int rds_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
 			ret = -ENOMEM;
 			goto out;
 		}
-		ret = rds_message_copy_from_user(rm, msg->msg_iov, payload_len);
+		ret = rds_message_copy_from_user(rm, &from);
 		if (ret)
 			goto out;
 	}
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* Re: [RFC] situation with csum_and_copy_... API
  2014-11-22  4:28                     ` Al Viro
                                         ` (16 preceding siblings ...)
  2014-11-22  4:39                       ` [PATCH 17/17] rds: switch rds_message_copy_from_user() to iov_iter Al Viro
@ 2014-11-22  7:24                       ` David Miller
  2014-11-25  2:40                         ` Al Viro
  2014-11-22 17:48                       ` [RFC] situation with csum_and_copy_... API Linus Torvalds
  18 siblings, 1 reply; 133+ messages in thread
From: David Miller @ 2014-11-22  7:24 UTC (permalink / raw)
  To: viro; +Cc: torvalds, netdev, linux-kernel, target-devel, nab, hch

From: Al Viro <viro@ZenIV.linux.org.uk>
Date: Sat, 22 Nov 2014 04:28:57 +0000

> 	OK, here's the next bunch.  Sorry about the delay, iov_iter.c stuff
> took most of the day (and it's not included in this pile).  Please, review.

I read over this stuff twice and this series looks fine to me.

Since this is the weekend... maybe wait until Monday for other feedback
then give me a pull request?

Thanks Al.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC] situation with csum_and_copy_... API
  2014-11-22  4:28                     ` Al Viro
                                         ` (17 preceding siblings ...)
  2014-11-22  7:24                       ` [RFC] situation with csum_and_copy_... API David Miller
@ 2014-11-22 17:48                       ` Linus Torvalds
  18 siblings, 0 replies; 133+ messages in thread
From: Linus Torvalds @ 2014-11-22 17:48 UTC (permalink / raw)
  To: Al Viro
  Cc: David Miller, Network Development, Linux Kernel Mailing List,
	target-devel, Nicholas A. Bellinger, Christoph Hellwig

On Fri, Nov 21, 2014 at 8:28 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>         OK, here's the next bunch.

Looks like good patches to me. Not that I actually _tested_ it, or
even have a good test-case (yeah, that "historical" ATM fix? I don't
think anybody cares ;), but it all seemed sane.

              Linus

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 06/17] switch macvtap to ->read_iter()
  2014-11-22  4:32                       ` [PATCH 06/17] switch macvtap " Al Viro
@ 2014-11-23 23:29                         ` Ben Hutchings
  0 siblings, 0 replies; 133+ messages in thread
From: Ben Hutchings @ 2014-11-23 23:29 UTC (permalink / raw)
  To: Al Viro
  Cc: David Miller, torvalds, netdev, linux-kernel, target-devel, nab, hch

[-- Attachment #1: Type: text/plain, Size: 1533 bytes --]

On Sat, 2014-11-22 at 04:32 +0000, Al Viro wrote:
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> ---
>  drivers/net/macvtap.c |   39 ++++++++++++++++-----------------------
>  1 file changed, 16 insertions(+), 23 deletions(-)
> 
> diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
> index cea99d4..cdd820f 100644
> --- a/drivers/net/macvtap.c
> +++ b/drivers/net/macvtap.c
> @@ -829,16 +829,17 @@ done:
>  }
>  
>  static ssize_t macvtap_do_read(struct macvtap_queue *q,
> -			       const struct iovec *iv, unsigned long segs,
> -			       unsigned long len,
> +			       struct iov_iter *to,
>  			       int noblock)
>  {
>  	DEFINE_WAIT(wait);
>  	struct sk_buff *skb;
>  	ssize_t ret = 0;
> -	struct iov_iter iter;
>  
> -	while (len) {
> +	if (!iov_iter_count(to))
> +		return 0;
> +
> +	while (1) {
>  		if (!noblock)
>  			prepare_to_wait(sk_sleep(&q->sk), &wait,
>  					TASK_INTERRUPTIBLE);
> @@ -856,37 +857,27 @@ static ssize_t macvtap_do_read(struct macvtap_queue *q,
>  			}
>  			/* Nothing to read, let's sleep */
>  			schedule();
> -			continue;
>  		}
> -		iov_iter_init(&iter, READ, iv, segs, len);
> -		ret = macvtap_put_user(q, skb, &iter);
> +	}
> +	if (skb) {
> +		ret = macvtap_put_user(q, skb, to);
>  		kfree_skb(skb);
> -		break;
[...]

You need to leave this break at the bottom of the loop body, or change
it to:

	do {
		...
	} while (!skb);

Ben.

-- 
Ben Hutchings
Never put off till tomorrow what you can avoid all together.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 811 bytes --]

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 07/17] new helpers: skb_copy_datagram_from_iter() and zerocopy_sg_from_iter()
  2014-11-22  4:33                       ` [PATCH 07/17] new helpers: skb_copy_datagram_from_iter() and zerocopy_sg_from_iter() Al Viro
@ 2014-11-24  0:02                         ` Ben Hutchings
  2014-11-24  0:29                           ` Ben Hutchings
  2014-11-24  5:34                           ` Jason Wang
  0 siblings, 2 replies; 133+ messages in thread
From: Ben Hutchings @ 2014-11-24  0:02 UTC (permalink / raw)
  To: Al Viro
  Cc: David Miller, torvalds, netdev, linux-kernel, target-devel, nab, hch

[-- Attachment #1: Type: text/plain, Size: 4179 bytes --]

On Sat, 2014-11-22 at 04:33 +0000, Al Viro wrote:
[...]
> --- a/net/core/datagram.c
> +++ b/net/core/datagram.c
> @@ -572,6 +572,77 @@ fault:
>  }
>  EXPORT_SYMBOL(skb_copy_datagram_from_iovec);
>  

Missing kernel-doc.

> +int skb_copy_datagram_from_iter(struct sk_buff *skb, int offset,
> +				 struct iov_iter *from,
> +				 int len)
> +{
> +	int start = skb_headlen(skb);
> +	int i, copy = start - offset;
> +	struct sk_buff *frag_iter;
> +
> +	/* Copy header. */
> +	if (copy > 0) {
> +		if (copy > len)
> +			copy = len;
> +		if (copy_from_iter(skb->data + offset, copy, from) != copy)
> +			goto fault;
> +		if ((len -= copy) == 0)
> +			return 0;
> +		offset += copy;
> +	}
> +
> +	/* Copy paged appendix. Hmm... why does this look so complicated? */
> +	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
> +		int end;
> +		const skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
> +
> +		WARN_ON(start > offset + len);
> +
> +		end = start + skb_frag_size(frag);
> +		if ((copy = end - offset) > 0) {
> +			size_t copied;

Blank line needed after a declaration.

> +			if (copy > len)
> +				copy = len;
> +			copied = copy_page_from_iter(skb_frag_page(frag),
> +					  frag->page_offset + offset - start,
> +					  copy, from);
> +			if (copied != copy)
> +				goto fault;
> +
> +			if (!(len -= copy))
> +				return 0;

The other two instances of this condition are written as:

			if ((len -= copy) == 0)

Similarly in skb_copy_bits().

> +			offset += copy;
> +		}
> +		start = end;
> +	}
> +
> +	skb_walk_frags(skb, frag_iter) {
> +		int end;
> +
> +		WARN_ON(start > offset + len);
> +
> +		end = start + frag_iter->len;
> +		if ((copy = end - offset) > 0) {
> +			if (copy > len)
> +				copy = len;
> +			if (skb_copy_datagram_from_iter(frag_iter,
> +							offset - start,
> +							from, copy))
> +				goto fault;
> +			if ((len -= copy) == 0)
> +				return 0;
> +			offset += copy;
> +		}
> +		start = end;
> +	}
> +	if (!len)
> +		return 0;
> +
> +fault:
> +	return -EFAULT;
> +}
> +EXPORT_SYMBOL(skb_copy_datagram_from_iter);
> +
>  /**
>   *	zerocopy_sg_from_iovec - Build a zerocopy datagram from an iovec
>   *	@skb: buffer to copy
> @@ -643,6 +714,50 @@ int zerocopy_sg_from_iovec(struct sk_buff *skb, const struct iovec *from,
>  }
>  EXPORT_SYMBOL(zerocopy_sg_from_iovec);
>  

Missing kernel-doc.

> +int zerocopy_sg_from_iter(struct sk_buff *skb, struct iov_iter *from)
> +{
> +	int len = iov_iter_count(from);
> +	int copy = min_t(int, skb_headlen(skb), len);
> +	int i = 0;
> +
> +	/* copy up to skb headlen */
> +	if (skb_copy_datagram_from_iter(skb, 0, from, copy))
> +		return -EFAULT;
> +
> +	while (iov_iter_count(from)) {
> +		struct page *pages[MAX_SKB_FRAGS];
> +		size_t start;
> +		ssize_t copied;
> +		unsigned long truesize;
> +		int n = 0;
> +
> +		copied = iov_iter_get_pages(from, pages, ~0U, MAX_SKB_FRAGS, &start);
> +		if (copied < 0)
> +			return -EFAULT;
> +
> +		truesize = DIV_ROUND_UP(copied + start, PAGE_SIZE) * PAGE_SIZE;

PAGE_ALIGN(copied + start) ?

> +		skb->data_len += copied;
> +		skb->len += copied;
> +		skb->truesize += truesize;
> +		atomic_add(truesize, &skb->sk->sk_wmem_alloc);
> +		while (copied) {
> +			int off = start;

This variable seems redundant.  Can't we use start directly and move the
'start = 0' to the bottom of the loop?

> +			int size = min_t(int, copied, PAGE_SIZE - off);
> +			start = 0;
> +			if (i < MAX_SKB_FRAGS)
> +				skb_fill_page_desc(skb, i, pages[n], off, size);
> +			else
> +				put_page(pages[n]);

Why is this condition needed, given we told iov_iter_get_pages() to
limit to MAX_SKB_FRAGS pages?

> +			copied -= size;
> +			i++, n++;
> +		}
> +		if (i > MAX_SKB_FRAGS)
> +			return -EMSGSIZE;

Same here.

Ben.

> +	}
> +	return 0;
> +}
> +EXPORT_SYMBOL(zerocopy_sg_from_iter);
> +
>  static int skb_copy_and_csum_datagram(const struct sk_buff *skb, int offset,
>  				      u8 __user *to, int len,
>  				      __wsum *csump)

-- 
Ben Hutchings
Never put off till tomorrow what you can avoid all together.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 811 bytes --]

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 08/17] {macvtap,tun}_get_user(): switch to iov_iter
  2014-11-22  4:33                       ` [PATCH 08/17] {macvtap,tun}_get_user(): switch to iov_iter Al Viro
@ 2014-11-24  0:27                         ` Ben Hutchings
  2014-11-24  1:06                           ` Ben Hutchings
  2014-11-24 10:15                           ` Al Viro
  0 siblings, 2 replies; 133+ messages in thread
From: Ben Hutchings @ 2014-11-24  0:27 UTC (permalink / raw)
  To: Al Viro
  Cc: David Miller, torvalds, netdev, linux-kernel, target-devel, nab, hch

[-- Attachment #1: Type: text/plain, Size: 6827 bytes --]

On Sat, 2014-11-22 at 04:33 +0000, Al Viro wrote:
> allows to switch macvtap and tun from ->aio_write() to ->write_iter()
> 
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> ---
>  drivers/net/macvtap.c |   43 ++++++++++++++++++++-----------------------
>  drivers/net/tun.c     |   43 +++++++++++++++++++++++--------------------
>  2 files changed, 43 insertions(+), 43 deletions(-)
> 
> diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
> index cdd820f..2bf08c6 100644
> --- a/drivers/net/macvtap.c
> +++ b/drivers/net/macvtap.c
> @@ -640,12 +640,12 @@ static void macvtap_skb_to_vnet_hdr(const struct sk_buff *skb,
>  
>  /* Get packet from user space buffer */
>  static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
> -				const struct iovec *iv, unsigned long total_len,
> -				size_t count, int noblock)
> +				struct iov_iter *from, int noblock)
>  {
>  	int good_linear = SKB_MAX_HEAD(NET_IP_ALIGN);
>  	struct sk_buff *skb;
>  	struct macvlan_dev *vlan;
> +	unsigned long total_len = iov_iter_count(from);
>  	unsigned long len = total_len;
>  	int err;
>  	struct virtio_net_hdr vnet_hdr = { 0 };
> @@ -653,6 +653,7 @@ static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
>  	int copylen = 0;
>  	bool zerocopy = false;
>  	size_t linear;
> +	ssize_t n;
>  
>  	if (q->flags & IFF_VNET_HDR) {
>  		vnet_hdr_len = q->vnet_hdr_sz;
> @@ -662,10 +663,11 @@ static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
>  			goto err;
>  		len -= vnet_hdr_len;
>  
> -		err = memcpy_fromiovecend((void *)&vnet_hdr, iv, 0,
> -					   sizeof(vnet_hdr));
> -		if (err < 0)
> +		err = -EFAULT;
> +		n = copy_from_iter(&vnet_hdr, sizeof(vnet_hdr), from);
> +		if (n != sizeof(vnet_hdr))
>  			goto err;
> +		iov_iter_advance(from, vnet_hdr_len - sizeof(vnet_hdr));
>  		if ((vnet_hdr.flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) &&
>  		     vnet_hdr.csum_start + vnet_hdr.csum_offset + 2 >
>  							vnet_hdr.hdr_len)
> @@ -680,17 +682,15 @@ static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
>  	if (unlikely(len < ETH_HLEN))
>  		goto err;
>  
> -	err = -EMSGSIZE;
> -	if (unlikely(count > UIO_MAXIOV))
> -		goto err;
> -
>  	if (m && m->msg_control && sock_flag(&q->sk, SOCK_ZEROCOPY)) {
> +		struct iov_iter i;

Blank line needed after a declaration.

>  		copylen = vnet_hdr.hdr_len ? vnet_hdr.hdr_len : GOODCOPY_LEN;
>  		if (copylen > good_linear)
>  			copylen = good_linear;
>  		linear = copylen;
> -		if (iov_pages(iv, vnet_hdr_len + copylen, count)
> -		    <= MAX_SKB_FRAGS)
> +		i = *from;
> +		iov_iter_advance(&i, copylen);
> +		if (iov_iter_npages(&i, INT_MAX) <= MAX_SKB_FRAGS)

The maxpages argument should be MAX_SKB_FRAGS + 1 as we don't need the
exact number.

>  			zerocopy = true;
>  	}
>  
> @@ -708,10 +708,9 @@ static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
>  		goto err;
>  
>  	if (zerocopy)
> -		err = zerocopy_sg_from_iovec(skb, iv, vnet_hdr_len, count);
> +		err = zerocopy_sg_from_iter(skb, from);
>  	else {
> -		err = skb_copy_datagram_from_iovec(skb, 0, iv, vnet_hdr_len,
> -						   len);
> +		err = skb_copy_datagram_from_iter(skb, 0, from, len);

Does skb_copy_datagram_from_iter() really need a len parameter?  Here it
is equal to iov_iter_count(from).

[...]
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -1012,28 +1012,29 @@ static struct sk_buff *tun_alloc_skb(struct tun_file *tfile,
>  
>  /* Get packet from user space buffer */
>  static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
> -			    void *msg_control, const struct iovec *iv,
> -			    size_t total_len, size_t count, int noblock)
> +			    void *msg_control, struct iov_iter *from,
> +			    int noblock)
>  {
>  	struct tun_pi pi = { 0, cpu_to_be16(ETH_P_IP) };
>  	struct sk_buff *skb;
> +	size_t total_len = iov_iter_count(from);
>  	size_t len = total_len, align = NET_SKB_PAD, linear;
>  	struct virtio_net_hdr gso = { 0 };
>  	int good_linear;
> -	int offset = 0;
>  	int copylen;
>  	bool zerocopy = false;
>  	int err;
>  	u32 rxhash;
> +	ssize_t n;
>  
>  	if (!(tun->flags & TUN_NO_PI)) {
>  		if (len < sizeof(pi))
>  			return -EINVAL;
>  		len -= sizeof(pi);
>  
> -		if (memcpy_fromiovecend((void *)&pi, iv, 0, sizeof(pi)))
> +		n = copy_from_iter(&pi, sizeof(pi), from);
> +		if (n != sizeof(pi))
>  			return -EFAULT;
> -		offset += sizeof(pi);
>  	}
>  
>  	if (tun->flags & TUN_VNET_HDR) {
> @@ -1041,7 +1042,8 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
>  			return -EINVAL;
>  		len -= tun->vnet_hdr_sz;
>  
> -		if (memcpy_fromiovecend((void *)&gso, iv, offset, sizeof(gso)))
> +		n = copy_from_iter(&gso, sizeof(gso), from);
> +		if (n != sizeof(gso))
>  			return -EFAULT;
>  
>  		if ((gso.flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) &&
> @@ -1050,7 +1052,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
>  
>  		if (gso.hdr_len > len)
>  			return -EINVAL;
> -		offset += tun->vnet_hdr_sz;
> +		iov_iter_advance(from, tun->vnet_hdr_sz);

                                        tun->vnet_hdr_sz - sizeof(gso)

>  	}
>  
>  	if ((tun->flags & TUN_TYPE_MASK) == TUN_TAP_DEV) {
> @@ -1063,6 +1065,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
>  	good_linear = SKB_MAX_HEAD(align);
>  
>  	if (msg_control) {
> +		struct iov_iter i = *from;

Blank line needed after a declaration.

>  		/* There are 256 bytes to be copied in skb, so there is
>  		 * enough room for skb expand head in case it is used.
>  		 * The rest of the buffer is mapped from userspace.
> @@ -1071,7 +1074,8 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
>  		if (copylen > good_linear)
>  			copylen = good_linear;
>  		linear = copylen;
> -		if (iov_pages(iv, offset + copylen, count) <= MAX_SKB_FRAGS)
> +		iov_iter_advance(&i, copylen);
> +		if (iov_iter_npages(&i, INT_MAX) <= MAX_SKB_FRAGS)

Again, the maxpages argument should be MAX_SKB_FRAGS + 1.

>  			zerocopy = true;
>  	}
>  
> @@ -1091,9 +1095,9 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
>  	}
>  
>  	if (zerocopy)
> -		err = zerocopy_sg_from_iovec(skb, iv, offset, count);
> +		err = zerocopy_sg_from_iter(skb, from);
>  	else {
> -		err = skb_copy_datagram_from_iovec(skb, 0, iv, offset, len);
> +		err = skb_copy_datagram_from_iter(skb, 0, from, len);
[...]

Again len is equal to iov_iter_count(from), so I think that parameter is
redundant.

Ben.

-- 
Ben Hutchings
Absolutum obsoletum. (If it works, it's out of date.) - Stafford Beer

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 811 bytes --]

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 07/17] new helpers: skb_copy_datagram_from_iter() and zerocopy_sg_from_iter()
  2014-11-24  0:02                         ` Ben Hutchings
@ 2014-11-24  0:29                           ` Ben Hutchings
  2014-11-24  5:34                           ` Jason Wang
  1 sibling, 0 replies; 133+ messages in thread
From: Ben Hutchings @ 2014-11-24  0:29 UTC (permalink / raw)
  To: Al Viro
  Cc: David Miller, torvalds, netdev, linux-kernel, target-devel, nab, hch

[-- Attachment #1: Type: text/plain, Size: 514 bytes --]

On Mon, 2014-11-24 at 00:02 +0000, Ben Hutchings wrote:
> On Sat, 2014-11-22 at 04:33 +0000, Al Viro wrote:
> [...]
> > --- a/net/core/datagram.c
> > +++ b/net/core/datagram.c
> > @@ -572,6 +572,77 @@ fault:
> >  }
> >  EXPORT_SYMBOL(skb_copy_datagram_from_iovec);
> >  
> 
> Missing kernel-doc.
[...]

Never mind, I can see that patches 9 and 10 recycle the _iovec
functions' kernel-doc comments.

Ben.

-- 
Ben Hutchings
Absolutum obsoletum. (If it works, it's out of date.) - Stafford Beer

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 811 bytes --]

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 08/17] {macvtap,tun}_get_user(): switch to iov_iter
  2014-11-24  0:27                         ` Ben Hutchings
@ 2014-11-24  1:06                           ` Ben Hutchings
  2014-11-24 10:15                           ` Al Viro
  1 sibling, 0 replies; 133+ messages in thread
From: Ben Hutchings @ 2014-11-24  1:06 UTC (permalink / raw)
  To: Al Viro
  Cc: David Miller, torvalds, netdev, linux-kernel, target-devel, nab, hch

[-- Attachment #1: Type: text/plain, Size: 641 bytes --]

On Mon, 2014-11-24 at 00:27 +0000, Ben Hutchings wrote:
> On Sat, 2014-11-22 at 04:33 +0000, Al Viro wrote:
[...]
> Does skb_copy_datagram_from_iter() really need a len parameter?  Here it
> is equal to iov_iter_count(from).
[...]
> Again len is equal to iov_iter_count(from), so I think that parameter is
> redundant.

Having read further patches, I see that unix_stream_sendmsg() is the one
exception where the length parameter is different.  But maybe the common
case (len = iter_iov_count(iov)) deserves a wrapper function?

Ben.

-- 
Ben Hutchings
Absolutum obsoletum. (If it works, it's out of date.) - Stafford Beer

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 811 bytes --]

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 17/17] rds: switch rds_message_copy_from_user() to iov_iter
  2014-11-22  4:39                       ` [PATCH 17/17] rds: switch rds_message_copy_from_user() to iov_iter Al Viro
@ 2014-11-24  2:00                         ` Ben Hutchings
  2014-11-24 10:17                           ` Al Viro
  0 siblings, 1 reply; 133+ messages in thread
From: Ben Hutchings @ 2014-11-24  2:00 UTC (permalink / raw)
  To: Al Viro
  Cc: David Miller, torvalds, netdev, linux-kernel, target-devel, nab, hch

[-- Attachment #1: Type: text/plain, Size: 412 bytes --]

On Sat, 2014-11-22 at 04:39 +0000, Al Viro wrote:
[...]
> -		ret = rds_page_copy_from_user(sg_page(sg), sg->offset + sg_off,
> -					      iov->iov_base + iov_off,
> -					      to_copy);
[...]

It looks like rds_page_copy{,_from,_to}_user() are all unused after this
change, so you could delete them.

Ben.

-- 
Ben Hutchings
Absolutum obsoletum. (If it works, it's out of date.) - Stafford Beer

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 811 bytes --]

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 07/17] new helpers: skb_copy_datagram_from_iter() and zerocopy_sg_from_iter()
  2014-11-24  0:02                         ` Ben Hutchings
  2014-11-24  0:29                           ` Ben Hutchings
@ 2014-11-24  5:34                           ` Jason Wang
  2014-11-24 10:03                             ` Al Viro
  1 sibling, 1 reply; 133+ messages in thread
From: Jason Wang @ 2014-11-24  5:34 UTC (permalink / raw)
  To: Ben Hutchings, Al Viro
  Cc: David Miller, torvalds, netdev, linux-kernel, target-devel, nab, hch

On 11/24/2014 08:02 AM, Ben Hutchings wrote:
> On Sat, 2014-11-22 at 04:33 +0000, Al Viro wrote:
> [...]
>> --- a/net/core/datagram.c
>> +++ b/net/core/datagram.c
>> @@ -572,6 +572,77 @@ fault:
>>  }
>>  EXPORT_SYMBOL(skb_copy_datagram_from_iovec);
>>  
> Missing kernel-doc.
>
>> +int skb_copy_datagram_from_iter(struct sk_buff *skb, int offset,
>> +				 struct iov_iter *from,
>> +				 int len)
>> +{
>> +	int start = skb_headlen(skb);
>> +	int i, copy = start - offset;
>> +	struct sk_buff *frag_iter;
>> +
>> +	/* Copy header. */
>> +	if (copy > 0) {
>> +		if (copy > len)
>> +			copy = len;
>> +		if (copy_from_iter(skb->data + offset, copy, from) != copy)
>> +			goto fault;
>> +		if ((len -= copy) == 0)
>> +			return 0;
>> +		offset += copy;
>> +	}
>> +
>> +	/* Copy paged appendix. Hmm... why does this look so complicated? */
>> +	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
>> +		int end;
>> +		const skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
>> +
>> +		WARN_ON(start > offset + len);
>> +
>> +		end = start + skb_frag_size(frag);
>> +		if ((copy = end - offset) > 0) {
>> +			size_t copied;
> Blank line needed after a declaration.
>
>> +			if (copy > len)
>> +				copy = len;
>> +			copied = copy_page_from_iter(skb_frag_page(frag),
>> +					  frag->page_offset + offset - start,
>> +					  copy, from);
>> +			if (copied != copy)
>> +				goto fault;
>> +
>> +			if (!(len -= copy))
>> +				return 0;
> The other two instances of this condition are written as:
>
> 			if ((len -= copy) == 0)
>
> Similarly in skb_copy_bits().
>
>> +			offset += copy;
>> +		}
>> +		start = end;
>> +	}
>> +
>> +	skb_walk_frags(skb, frag_iter) {
>> +		int end;
>> +
>> +		WARN_ON(start > offset + len);
>> +
>> +		end = start + frag_iter->len;
>> +		if ((copy = end - offset) > 0) {
>> +			if (copy > len)
>> +				copy = len;
>> +			if (skb_copy_datagram_from_iter(frag_iter,
>> +							offset - start,
>> +							from, copy))
>> +				goto fault;
>> +			if ((len -= copy) == 0)
>> +				return 0;
>> +			offset += copy;
>> +		}
>> +		start = end;
>> +	}
>> +	if (!len)
>> +		return 0;
>> +
>> +fault:
>> +	return -EFAULT;
>> +}
>> +EXPORT_SYMBOL(skb_copy_datagram_from_iter);
>> +
>>  /**
>>   *	zerocopy_sg_from_iovec - Build a zerocopy datagram from an iovec
>>   *	@skb: buffer to copy
>> @@ -643,6 +714,50 @@ int zerocopy_sg_from_iovec(struct sk_buff *skb, const struct iovec *from,
>>  }
>>  EXPORT_SYMBOL(zerocopy_sg_from_iovec);
>>  
> Missing kernel-doc.
>
>> +int zerocopy_sg_from_iter(struct sk_buff *skb, struct iov_iter *from)
>> +{
>> +	int len = iov_iter_count(from);
>> +	int copy = min_t(int, skb_headlen(skb), len);
>> +	int i = 0;
>> +
>> +	/* copy up to skb headlen */
>> +	if (skb_copy_datagram_from_iter(skb, 0, from, copy))
>> +		return -EFAULT;
>> +
>> +	while (iov_iter_count(from)) {
>> +		struct page *pages[MAX_SKB_FRAGS];
>> +		size_t start;
>> +		ssize_t copied;
>> +		unsigned long truesize;
>> +		int n = 0;
>> +
>> +		copied = iov_iter_get_pages(from, pages, ~0U, MAX_SKB_FRAGS, &start);
>> +		if (copied < 0)
>> +			return -EFAULT;
>> +
>> +		truesize = DIV_ROUND_UP(copied + start, PAGE_SIZE) * PAGE_SIZE;
> PAGE_ALIGN(copied + start) ?
>
>> +		skb->data_len += copied;
>> +		skb->len += copied;
>> +		skb->truesize += truesize;
>> +		atomic_add(truesize, &skb->sk->sk_wmem_alloc);
>> +		while (copied) {
>> +			int off = start;
> This variable seems redundant.  Can't we use start directly and move the
> 'start = 0' to the bottom of the loop?
>
>> +			int size = min_t(int, copied, PAGE_SIZE - off);
>> +			start = 0;
>> +			if (i < MAX_SKB_FRAGS)
>> +				skb_fill_page_desc(skb, i, pages[n], off, size);
>> +			else
>> +				put_page(pages[n]);
> Why is this condition needed, given we told iov_iter_get_pages() to
> limit to MAX_SKB_FRAGS pages?

We don't want to send truncated packets and there's no other way to put
those pages since it was not in the frag array.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* RE: [RFC] situation with csum_and_copy_... API
  2014-11-21 19:39                           ` Al Viro
  2014-11-21 19:40                             ` Linus Torvalds
@ 2014-11-24 10:03                             ` David Laight
  1 sibling, 0 replies; 133+ messages in thread
From: David Laight @ 2014-11-24 10:03 UTC (permalink / raw)
  To: 'Al Viro'
  Cc: Eric Dumazet, David Miller, torvalds, netdev, linux-kernel,
	target-devel, Nicholas A. Bellinger, Christoph Hellwig

From: Al Viro
> On Fri, Nov 21, 2014 at 05:42:55PM +0000, David Laight wrote:
> 
> > Callers of kernel_send/recvmsg() could easily be using a wrapper
> > function that creates the 'msghdr'.
> > When the want to send the remaining part of a buffer the old iterator
> > will no longer be available - just the original iov and the required offset.
> 
> Er...  So why not copy a struct iov_iter to/from msg->msg_iter, then?
> It's not as it had been particulary large - 5 words isn't much...
> 
> I'm not at all sure that _anything_ has valid reasons for draining iovecs.
> Maintaining a struct iov_iter and modifying it is easy and actually faster...
> 
> Right now the main examples outside of net/* are due to unfortunate
> limitations of ->sendmsg() - until now it had no way to be told that
> desired data starts at offset.  With ->msg_iter it obviously becomes
> possible...

It may well be easier for code that only has to run in a new kernel.
But for code that has run in old kernels as well you still need
to modify the iov[].

This is also true of userspace - there is no way of completing
a partial transfer without modifying the iov[] to allow for the
partial transfer.

Note that I'm not suggesting that any of the 'write' functions
should ever modify an iov[].

	David




^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 07/17] new helpers: skb_copy_datagram_from_iter() and zerocopy_sg_from_iter()
  2014-11-24  5:34                           ` Jason Wang
@ 2014-11-24 10:03                             ` Al Viro
  0 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-11-24 10:03 UTC (permalink / raw)
  To: Jason Wang
  Cc: Ben Hutchings, David Miller, torvalds, netdev, linux-kernel,
	target-devel, nab, hch

On Mon, Nov 24, 2014 at 01:34:30PM +0800, Jason Wang wrote:
> >> +		copied = iov_iter_get_pages(from, pages, ~0U, MAX_SKB_FRAGS, &start);

> > Why is this condition needed, given we told iov_iter_get_pages() to
> > limit to MAX_SKB_FRAGS pages?
> 
> We don't want to send truncated packets and there's no other way to put
> those pages since it was not in the frag array.

No, his point is that it could never happen.  It could, actually - what's
confusing here (and that's inherited from zerocopy_from_iovec()) is
that 'i' is a lousy name for that variable.  It's actually "how many fragments
have we already put there?" and it is not reset when we go into the next
iteration of outer loop.

FWIW, I've just renamed it into 'frag', put
	if (frag == MAX_SKB_FRAGS)
		return -EMSGSIZE;
*before* iov_iter_get_pages(), passing MAX_SKB_FRAGS - frag as the
limit on number of pages in that call.  Voila - logics with put_page()
disappears and the inner loop is less obfuscated.

There was another bug in that function - iov_iter_get_pages() does *not*
advance the iterator; the caller needs to do iov_iter_advance() itself.
Also fixed...

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 08/17] {macvtap,tun}_get_user(): switch to iov_iter
  2014-11-24  0:27                         ` Ben Hutchings
  2014-11-24  1:06                           ` Ben Hutchings
@ 2014-11-24 10:15                           ` Al Viro
  1 sibling, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-11-24 10:15 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: David Miller, torvalds, netdev, linux-kernel, target-devel, nab, hch

On Mon, Nov 24, 2014 at 12:27:42AM +0000, Ben Hutchings wrote:
> >  		copylen = vnet_hdr.hdr_len ? vnet_hdr.hdr_len : GOODCOPY_LEN;
> >  		if (copylen > good_linear)
> >  			copylen = good_linear;
> >  		linear = copylen;
> > -		if (iov_pages(iv, vnet_hdr_len + copylen, count)
> > -		    <= MAX_SKB_FRAGS)
> > +		i = *from;
> > +		iov_iter_advance(&i, copylen);
> > +		if (iov_iter_npages(&i, INT_MAX) <= MAX_SKB_FRAGS)
> 
> The maxpages argument should be MAX_SKB_FRAGS + 1 as we don't need the
> exact number.

In principle, that's true, but...  Do we really care?  It only buys you
anything if you have a monstrously fragmented iovec.  And if you end up
spending too considerable amount of time in that loop in iov_iter_npages,
you are by definition on the slow path - it *will* fail, since we do not
even try to merge adjacent iovec segments.  Never had...

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH 17/17] rds: switch rds_message_copy_from_user() to iov_iter
  2014-11-24  2:00                         ` Ben Hutchings
@ 2014-11-24 10:17                           ` Al Viro
  0 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-11-24 10:17 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: David Miller, torvalds, netdev, linux-kernel, target-devel, nab, hch

On Mon, Nov 24, 2014 at 02:00:18AM +0000, Ben Hutchings wrote:
> On Sat, 2014-11-22 at 04:39 +0000, Al Viro wrote:
> [...]
> > -		ret = rds_page_copy_from_user(sg_page(sg), sg->offset + sg_off,
> > -					      iov->iov_base + iov_off,
> > -					      to_copy);
> [...]
> 
> It looks like rds_page_copy{,_from,_to}_user() are all unused after this
> change, so you could delete them.

Maybe.  Or maybe add a saner wrapper for rds_page_copy_from_user()...  Separate
story, anyway.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* RE: [RFC] situation with csum_and_copy_... API
  2014-11-22  3:27                         ` Al Viro
  2014-11-22  3:36                           ` Al Viro
@ 2014-11-24 10:27                           ` David Laight
  1 sibling, 0 replies; 133+ messages in thread
From: David Laight @ 2014-11-24 10:27 UTC (permalink / raw)
  To: 'Al Viro', Linus Torvalds
  Cc: David Miller, netdev, linux-kernel, target-devel,
	Nicholas A. Bellinger, Christoph Hellwig, Eric Dumazet

From: Al Viro
> On Fri, Nov 21, 2014 at 08:49:56AM +0000, Al Viro wrote:
> 
> > Overall, I think I have the whole series plotted in enough details to be
> > reasonably certain we can pull it off.  Right now I'm dealing with
> > mm/iov_iter.c stuff; the amount of boilerplate source is already high enough
> > and with those extra primitives it'll get really unpleasant.
> >
> > What we need there is something templates-like, as much as I hate C++, and
> > I'm still not happy with what I have at the moment...  Hopefully I'll get
> > that in more or less tolerable form today.
> 
> Folks, I would really like comments on the patch below.  It's an attempt
> to reduce the amount of boilerplate code in mm/iov_iter.c; no new primitives
> added, just trying to reduce the amount of duplication in there.  I'm not
> too fond of the way it currently looks, to put it mildly.  It seems to
> work, it's reasonably straightforward and it even generates slightly better
> code than before, but I would _very_ welcome any tricks that would allow to
> make it not so tasteless.  I like the effect on line count (+124-358), but...
> 
> It defines two iterators (for iovec-backed and bvec-backed ones) and converts
> a bunch of primitives to those.  The last argument is an expression evaluated
> for a bunch of ranges; for bvec one it's void, for iovec - size_t; if it
> evaluates to non-0, we treat it as read/write/whatever short by that many
> bytes and do not proceed any further.
> 
> Any suggestions are welcome.
> 
> diff --git a/mm/iov_iter.c b/mm/iov_iter.c
> index eafcf60..611af2bd 100644
> --- a/mm/iov_iter.c
> +++ b/mm/iov_iter.c
> @@ -4,11 +4,75 @@
>  #include <linux/slab.h>
>  #include <linux/vmalloc.h>
> 
> +#define iterate_iovec(i, n, buf, len, move, STEP) {	\
> +	const struct iovec *iov = i->iov;		\
> +	size_t skip = i->iov_offset;			\
> +	size_t left;					\
> +	size_t wanted = n;				\

All these variable names probably want (at least one) _ prefix
just in case they match the names of arguments.

> +	buf = iov->iov_base + skip;			\
> +	len = min(n, iov->iov_len - skip);		\

You are assigning to parameters, this can get confusing.
Unless these are return values this probably doesn't make sense.
Might be better to 'pass by reference', the generated code is
likely to be the same - but it is clearer to the reader.

> +	left = STEP;					\
> +	len -= left;					\
> +	skip += len;					\
> +	n -= len;					\

Again, a parameter is modified.

> +	while (unlikely(!left && n)) {			\
> +		iov++;					\
> +		buf = iov->iov_base;			\
> +		len = min(n, iov->iov_len);		\
> +		left = STEP;				\
> +		len -= left;				\
> +		skip = len;				\
> +		n -= len;				\
> +	}						\
> +	n = wanted - n;					\
> +	if (move) {					\
> +		if (skip == iov->iov_len) {		\
> +			iov++;				\
> +			skip = 0;			\
> +		}					\
> +		i->count -= n;				\
> +		i->nr_segs -= iov - i->iov;		\
> +		i->iov = iov;				\
> +		i->iov_offset = skip;			\
> +	}						\
> +}
...
> +	iterate_iovec(i, bytes, buf, len, true,
> +			__copy_to_user(buf, (from += len) - len, len))
> +	return bytes;

Using 'buf' and 'len' in __copy_to_user() is very non-obvious.
It might be better if they were fixed names, maybe _ioiter_buf and _ioiter_len.

If this code can use the gcc extension that allows #defines that contain {}
to return values, the it would be better as:

	bytes_left = iterate_iovec(i, bytes, true,
		__copy_to_user(_ioiter_buf, (from += _ioiter_len) - _ioiter_len,
			_ioiter_len);
	return bytes_left;

Possibly also two wrapper #defines that supply the true/false parameter.

	David




^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [RFC] situation with csum_and_copy_... API
  2014-11-22  7:24                       ` [RFC] situation with csum_and_copy_... API David Miller
@ 2014-11-25  2:40                         ` Al Viro
  2014-11-25 14:02                           ` [PATCH v2 01/17] new helper: skb_copy_and_csum_datagram_msg() Al Viro
                                             ` (16 more replies)
  0 siblings, 17 replies; 133+ messages in thread
From: Al Viro @ 2014-11-25  2:40 UTC (permalink / raw)
  To: David Miller; +Cc: torvalds, netdev, linux-kernel, target-devel, nab, hch

On Sat, Nov 22, 2014 at 02:24:44AM -0500, David Miller wrote:
> From: Al Viro <viro@ZenIV.linux.org.uk>
> Date: Sat, 22 Nov 2014 04:28:57 +0000
> 
> > 	OK, here's the next bunch.  Sorry about the delay, iov_iter.c stuff
> > took most of the day (and it's not included in this pile).  Please, review.
> 
> I read over this stuff twice and this series looks fine to me.
> 
> Since this is the weekend... maybe wait until Monday for other feedback
> then give me a pull request?

I'll probably repost the patches with updates folded in first...

FWIW, the current situation is
	* all but one ->recvmsg() instances switched to iov_iter primitives
(the exception is one of the AF_ALG instances and I know what to do with it).
Consequently, they do not drain iovecs anymore and simply advance ->msg_iter.
	* kvec-based iov_iter work just fine without set_fs(KERNEL_DS).  And
so do ->recvmsg() instances that had received such iov_iter.
	* memcpy_to_iovec() and skb_copy_datagram_iovec() are gone - no users
left.

I'm about to start on ->sendmsg() side of the things.  The interesting
question is whether tcp_send_syn_data() is doing the right thing if it
runs into EFAULT when copying iovec from userland.

As it is, it gives up on the skb it has allocated and falls back to normal
handshake.  I can duplicate that behaviour, all right, but why not simply
do skb_trim(syn_data, <actually copied>) and do fallback only if nothing
got copied at all?  Is there any problem with that?

The reason why I'm asking is that it's easier to just use copy_from_iter()
and let the damn thing advance.  Not a lot of pain to preserve the original
iov_iter (all 5 words of it) and copy it back in case of failure, but I'd
rather understood what's wrong with simpler approach...

Anyway, the current branch is in vfs.git#iov_iter-net.  Current balance is
about -0.5KLoC and that's not counting the simplifications that will become
possible in callers of kernel_sendmg/kernel_recvmsg...  I'll post the beginning
of that queue (the same 17 commits in the beginning) later tonight and wait
for review...

^ permalink raw reply	[flat|nested] 133+ messages in thread

* [PATCH v2 01/17] new helper: skb_copy_and_csum_datagram_msg()
  2014-11-25  2:40                         ` Al Viro
@ 2014-11-25 14:02                           ` Al Viro
  2014-11-25 19:28                             ` David Miller
  2014-11-25 14:02                           ` [PATCH v2 02/17] new helper: memcpy_from_msg() Al Viro
                                             ` (15 subsequent siblings)
  16 siblings, 1 reply; 133+ messages in thread
From: Al Viro @ 2014-11-25 14:02 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 include/linux/skbuff.h |    5 +++++
 net/ipv4/udp.c         |    5 ++---
 net/ipv6/raw.c         |    2 +-
 net/ipv6/udp.c         |    2 +-
 4 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 78c299f..cbe4b20 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2651,6 +2651,11 @@ static inline int skb_copy_datagram_msg(const struct sk_buff *from, int offset,
 }
 int skb_copy_and_csum_datagram_iovec(struct sk_buff *skb, int hlen,
 				     struct iovec *iov);
+static inline int skb_copy_and_csum_datagram_msg(struct sk_buff *skb, int hlen,
+			    struct msghdr *msg)
+{
+	return skb_copy_and_csum_datagram_iovec(skb, hlen, msg->msg_iov);
+}
 int skb_copy_datagram_from_iovec(struct sk_buff *skb, int offset,
 				 const struct iovec *from, int from_offset,
 				 int len);
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index 4a16b91..b2d6068 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1284,9 +1284,8 @@ try_again:
 		err = skb_copy_datagram_msg(skb, sizeof(struct udphdr),
 					    msg, copied);
 	else {
-		err = skb_copy_and_csum_datagram_iovec(skb,
-						       sizeof(struct udphdr),
-						       msg->msg_iov);
+		err = skb_copy_and_csum_datagram_msg(skb, sizeof(struct udphdr),
+						     msg);
 
 		if (err == -EINVAL)
 			goto csum_copy_err;
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 0cbcf98..8baa53e 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -492,7 +492,7 @@ static int rawv6_recvmsg(struct kiocb *iocb, struct sock *sk,
 			goto csum_copy_err;
 		err = skb_copy_datagram_msg(skb, 0, msg, copied);
 	} else {
-		err = skb_copy_and_csum_datagram_iovec(skb, 0, msg->msg_iov);
+		err = skb_copy_and_csum_datagram_msg(skb, 0, msg);
 		if (err == -EINVAL)
 			goto csum_copy_err;
 	}
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index dbc0b04..7cfb5d74 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -428,7 +428,7 @@ try_again:
 		err = skb_copy_datagram_msg(skb, sizeof(struct udphdr),
 					    msg, copied);
 	else {
-		err = skb_copy_and_csum_datagram_iovec(skb, sizeof(struct udphdr), msg->msg_iov);
+		err = skb_copy_and_csum_datagram_msg(skb, sizeof(struct udphdr), msg);
 		if (err == -EINVAL)
 			goto csum_copy_err;
 	}
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH v2 02/17] new helper: memcpy_from_msg()
  2014-11-25  2:40                         ` Al Viro
  2014-11-25 14:02                           ` [PATCH v2 01/17] new helper: skb_copy_and_csum_datagram_msg() Al Viro
@ 2014-11-25 14:02                           ` Al Viro
  2014-11-25 14:02                           ` [PATCH v2 03/17] switch ipxrtr_route_packet() from iovec to msghdr Al Viro
                                             ` (14 subsequent siblings)
  16 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-11-25 14:02 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 crypto/algif_skcipher.c     |   10 +++++-----
 drivers/isdn/mISDN/socket.c |    2 +-
 drivers/net/ppp/pppoe.c     |    2 +-
 include/linux/skbuff.h      |    5 +++++
 include/net/sctp/sm.h       |    2 +-
 net/appletalk/ddp.c         |    2 +-
 net/ax25/af_ax25.c          |    2 +-
 net/bluetooth/hci_sock.c    |    2 +-
 net/bluetooth/mgmt.c        |    2 +-
 net/bluetooth/rfcomm/sock.c |    2 +-
 net/bluetooth/sco.c         |    2 +-
 net/caif/caif_socket.c      |    4 ++--
 net/can/bcm.c               |   19 ++++++++-----------
 net/can/raw.c               |    2 +-
 net/dccp/proto.c            |    2 +-
 net/decnet/af_decnet.c      |    2 +-
 net/ieee802154/dgram.c      |    2 +-
 net/ieee802154/raw.c        |    2 +-
 net/ipv4/ping.c             |    2 +-
 net/ipv4/tcp_input.c        |    2 +-
 net/irda/af_irda.c          |    6 +++---
 net/iucv/af_iucv.c          |    2 +-
 net/key/af_key.c            |    2 +-
 net/l2tp/l2tp_ip.c          |    2 +-
 net/l2tp/l2tp_ppp.c         |    3 +--
 net/llc/af_llc.c            |    2 +-
 net/netlink/af_netlink.c    |    2 +-
 net/netrom/af_netrom.c      |    2 +-
 net/nfc/llcp_commands.c     |    4 ++--
 net/nfc/rawsock.c           |    2 +-
 net/packet/af_packet.c      |    5 ++---
 net/phonet/datagram.c       |    2 +-
 net/phonet/pep.c            |    2 +-
 net/rose/af_rose.c          |    2 +-
 net/sctp/sm_make_chunk.c    |    4 ++--
 net/x25/af_x25.c            |    2 +-
 36 files changed, 57 insertions(+), 57 deletions(-)

diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c
index 83187f4..c3b482b 100644
--- a/crypto/algif_skcipher.c
+++ b/crypto/algif_skcipher.c
@@ -298,9 +298,9 @@ static int skcipher_sendmsg(struct kiocb *unused, struct socket *sock,
 			len = min_t(unsigned long, len,
 				    PAGE_SIZE - sg->offset - sg->length);
 
-			err = memcpy_fromiovec(page_address(sg_page(sg)) +
-					       sg->offset + sg->length,
-					       msg->msg_iov, len);
+			err = memcpy_from_msg(page_address(sg_page(sg)) +
+					      sg->offset + sg->length,
+					      msg, len);
 			if (err)
 				goto unlock;
 
@@ -337,8 +337,8 @@ static int skcipher_sendmsg(struct kiocb *unused, struct socket *sock,
 			if (!sg_page(sg + i))
 				goto unlock;
 
-			err = memcpy_fromiovec(page_address(sg_page(sg + i)),
-					       msg->msg_iov, plen);
+			err = memcpy_from_msg(page_address(sg_page(sg + i)),
+					      msg, plen);
 			if (err) {
 				__free_page(sg_page(sg + i));
 				sg_assign_page(sg + i, NULL);
diff --git a/drivers/isdn/mISDN/socket.c b/drivers/isdn/mISDN/socket.c
index dcbd858..84b3592 100644
--- a/drivers/isdn/mISDN/socket.c
+++ b/drivers/isdn/mISDN/socket.c
@@ -203,7 +203,7 @@ mISDN_sock_sendmsg(struct kiocb *iocb, struct socket *sock,
 	if (!skb)
 		goto done;
 
-	if (memcpy_fromiovec(skb_put(skb, len), msg->msg_iov, len)) {
+	if (memcpy_from_msg(skb_put(skb, len), msg, len)) {
 		err = -EFAULT;
 		goto done;
 	}
diff --git a/drivers/net/ppp/pppoe.c b/drivers/net/ppp/pppoe.c
index 443cbbf..d2408a5 100644
--- a/drivers/net/ppp/pppoe.c
+++ b/drivers/net/ppp/pppoe.c
@@ -869,7 +869,7 @@ static int pppoe_sendmsg(struct kiocb *iocb, struct socket *sock,
 	ph = (struct pppoe_hdr *)skb_put(skb, total_len + sizeof(struct pppoe_hdr));
 	start = (char *)&ph->tag[0];
 
-	error = memcpy_fromiovec(start, m->msg_iov, total_len);
+	error = memcpy_from_msg(start, m, total_len);
 	if (error < 0) {
 		kfree_skb(skb);
 		goto end;
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index cbe4b20..97dc5f8 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2687,6 +2687,11 @@ int skb_ensure_writable(struct sk_buff *skb, int write_len);
 int skb_vlan_pop(struct sk_buff *skb);
 int skb_vlan_push(struct sk_buff *skb, __be16 vlan_proto, u16 vlan_tci);
 
+static inline int memcpy_from_msg(void *data, struct msghdr *msg, int len)
+{
+	return memcpy_fromiovec(data, msg->msg_iov, len);
+}
+
 struct skb_checksum_ops {
 	__wsum (*update)(const void *mem, int len, __wsum wsum);
 	__wsum (*combine)(__wsum csum, __wsum csum2, int offset, int len);
diff --git a/include/net/sctp/sm.h b/include/net/sctp/sm.h
index 72a31db..487ef34 100644
--- a/include/net/sctp/sm.h
+++ b/include/net/sctp/sm.h
@@ -219,7 +219,7 @@ struct sctp_chunk *sctp_make_abort_no_data(const struct sctp_association *,
 				      const struct sctp_chunk *,
 				      __u32 tsn);
 struct sctp_chunk *sctp_make_abort_user(const struct sctp_association *,
-					const struct msghdr *, size_t msg_len);
+					struct msghdr *, size_t msg_len);
 struct sctp_chunk *sctp_make_abort_violation(const struct sctp_association *,
 				   const struct sctp_chunk *,
 				   const __u8 *,
diff --git a/net/appletalk/ddp.c b/net/appletalk/ddp.c
index 425942d..0d0766e 100644
--- a/net/appletalk/ddp.c
+++ b/net/appletalk/ddp.c
@@ -1659,7 +1659,7 @@ static int atalk_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr
 
 	SOCK_DEBUG(sk, "SK %p: Copy user data (%Zd bytes).\n", sk, len);
 
-	err = memcpy_fromiovec(skb_put(skb, len), msg->msg_iov, len);
+	err = memcpy_from_msg(skb_put(skb, len), msg, len);
 	if (err) {
 		kfree_skb(skb);
 		err = -EFAULT;
diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c
index f4f835e19..ca049a7 100644
--- a/net/ax25/af_ax25.c
+++ b/net/ax25/af_ax25.c
@@ -1549,7 +1549,7 @@ static int ax25_sendmsg(struct kiocb *iocb, struct socket *sock,
 	skb_reserve(skb, size - len);
 
 	/* User data follows immediately after the AX.25 data */
-	if (memcpy_fromiovec(skb_put(skb, len), msg->msg_iov, len)) {
+	if (memcpy_from_msg(skb_put(skb, len), msg, len)) {
 		err = -EFAULT;
 		kfree_skb(skb);
 		goto out;
diff --git a/net/bluetooth/hci_sock.c b/net/bluetooth/hci_sock.c
index 5e2cd25..2c245fd 100644
--- a/net/bluetooth/hci_sock.c
+++ b/net/bluetooth/hci_sock.c
@@ -947,7 +947,7 @@ static int hci_sock_sendmsg(struct kiocb *iocb, struct socket *sock,
 	if (!skb)
 		goto done;
 
-	if (memcpy_fromiovec(skb_put(skb, len), msg->msg_iov, len)) {
+	if (memcpy_from_msg(skb_put(skb, len), msg, len)) {
 		err = -EFAULT;
 		goto drop;
 	}
diff --git a/net/bluetooth/mgmt.c b/net/bluetooth/mgmt.c
index cbeef5f..f3e4a16 100644
--- a/net/bluetooth/mgmt.c
+++ b/net/bluetooth/mgmt.c
@@ -5767,7 +5767,7 @@ int mgmt_control(struct sock *sk, struct msghdr *msg, size_t msglen)
 	if (!buf)
 		return -ENOMEM;
 
-	if (memcpy_fromiovec(buf, msg->msg_iov, msglen)) {
+	if (memcpy_from_msg(buf, msg, msglen)) {
 		err = -EFAULT;
 		goto done;
 	}
diff --git a/net/bluetooth/rfcomm/sock.c b/net/bluetooth/rfcomm/sock.c
index 8bbbb5e..2348176 100644
--- a/net/bluetooth/rfcomm/sock.c
+++ b/net/bluetooth/rfcomm/sock.c
@@ -588,7 +588,7 @@ static int rfcomm_sock_sendmsg(struct kiocb *iocb, struct socket *sock,
 		}
 		skb_reserve(skb, RFCOMM_SKB_HEAD_RESERVE);
 
-		err = memcpy_fromiovec(skb_put(skb, size), msg->msg_iov, size);
+		err = memcpy_from_msg(skb_put(skb, size), msg, size);
 		if (err) {
 			kfree_skb(skb);
 			if (sent == 0)
diff --git a/net/bluetooth/sco.c b/net/bluetooth/sco.c
index 7ee9e4a..30e5ea3 100644
--- a/net/bluetooth/sco.c
+++ b/net/bluetooth/sco.c
@@ -285,7 +285,7 @@ static int sco_send_frame(struct sock *sk, struct msghdr *msg, int len)
 	if (!skb)
 		return err;
 
-	if (memcpy_fromiovec(skb_put(skb, len), msg->msg_iov, len)) {
+	if (memcpy_from_msg(skb_put(skb, len), msg, len)) {
 		kfree_skb(skb);
 		return -EFAULT;
 	}
diff --git a/net/caif/caif_socket.c b/net/caif/caif_socket.c
index fbcd156..230f140 100644
--- a/net/caif/caif_socket.c
+++ b/net/caif/caif_socket.c
@@ -566,7 +566,7 @@ static int caif_seqpkt_sendmsg(struct kiocb *kiocb, struct socket *sock,
 
 	skb_reserve(skb, cf_sk->headroom);
 
-	ret = memcpy_fromiovec(skb_put(skb, len), msg->msg_iov, len);
+	ret = memcpy_from_msg(skb_put(skb, len), msg, len);
 
 	if (ret)
 		goto err;
@@ -641,7 +641,7 @@ static int caif_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
 		 */
 		size = min_t(int, size, skb_tailroom(skb));
 
-		err = memcpy_fromiovec(skb_put(skb, size), msg->msg_iov, size);
+		err = memcpy_from_msg(skb_put(skb, size), msg, size);
 		if (err) {
 			kfree_skb(skb);
 			goto out_err;
diff --git a/net/can/bcm.c b/net/can/bcm.c
index dcb75c0..b9a1f71 100644
--- a/net/can/bcm.c
+++ b/net/can/bcm.c
@@ -858,8 +858,7 @@ static int bcm_tx_setup(struct bcm_msg_head *msg_head, struct msghdr *msg,
 
 		/* update can_frames content */
 		for (i = 0; i < msg_head->nframes; i++) {
-			err = memcpy_fromiovec((u8 *)&op->frames[i],
-					       msg->msg_iov, CFSIZ);
+			err = memcpy_from_msg((u8 *)&op->frames[i], msg, CFSIZ);
 
 			if (op->frames[i].can_dlc > 8)
 				err = -EINVAL;
@@ -894,8 +893,7 @@ static int bcm_tx_setup(struct bcm_msg_head *msg_head, struct msghdr *msg,
 			op->frames = &op->sframe;
 
 		for (i = 0; i < msg_head->nframes; i++) {
-			err = memcpy_fromiovec((u8 *)&op->frames[i],
-					       msg->msg_iov, CFSIZ);
+			err = memcpy_from_msg((u8 *)&op->frames[i], msg, CFSIZ);
 
 			if (op->frames[i].can_dlc > 8)
 				err = -EINVAL;
@@ -1024,9 +1022,8 @@ static int bcm_rx_setup(struct bcm_msg_head *msg_head, struct msghdr *msg,
 
 		if (msg_head->nframes) {
 			/* update can_frames content */
-			err = memcpy_fromiovec((u8 *)op->frames,
-					       msg->msg_iov,
-					       msg_head->nframes * CFSIZ);
+			err = memcpy_from_msg((u8 *)op->frames, msg,
+					      msg_head->nframes * CFSIZ);
 			if (err < 0)
 				return err;
 
@@ -1072,8 +1069,8 @@ static int bcm_rx_setup(struct bcm_msg_head *msg_head, struct msghdr *msg,
 		}
 
 		if (msg_head->nframes) {
-			err = memcpy_fromiovec((u8 *)op->frames, msg->msg_iov,
-					       msg_head->nframes * CFSIZ);
+			err = memcpy_from_msg((u8 *)op->frames, msg,
+					      msg_head->nframes * CFSIZ);
 			if (err < 0) {
 				if (op->frames != &op->sframe)
 					kfree(op->frames);
@@ -1209,7 +1206,7 @@ static int bcm_tx_send(struct msghdr *msg, int ifindex, struct sock *sk)
 
 	can_skb_reserve(skb);
 
-	err = memcpy_fromiovec(skb_put(skb, CFSIZ), msg->msg_iov, CFSIZ);
+	err = memcpy_from_msg(skb_put(skb, CFSIZ), msg, CFSIZ);
 	if (err < 0) {
 		kfree_skb(skb);
 		return err;
@@ -1285,7 +1282,7 @@ static int bcm_sendmsg(struct kiocb *iocb, struct socket *sock,
 
 	/* read message head information */
 
-	ret = memcpy_fromiovec((u8 *)&msg_head, msg->msg_iov, MHSIZ);
+	ret = memcpy_from_msg((u8 *)&msg_head, msg, MHSIZ);
 	if (ret < 0)
 		return ret;
 
diff --git a/net/can/raw.c b/net/can/raw.c
index 081e81f..0e4004f 100644
--- a/net/can/raw.c
+++ b/net/can/raw.c
@@ -703,7 +703,7 @@ static int raw_sendmsg(struct kiocb *iocb, struct socket *sock,
 	can_skb_reserve(skb);
 	can_skb_prv(skb)->ifindex = dev->ifindex;
 
-	err = memcpy_fromiovec(skb_put(skb, size), msg->msg_iov, size);
+	err = memcpy_from_msg(skb_put(skb, size), msg, size);
 	if (err < 0)
 		goto free_skb;
 
diff --git a/net/dccp/proto.c b/net/dccp/proto.c
index 8e6ae94..19f0387 100644
--- a/net/dccp/proto.c
+++ b/net/dccp/proto.c
@@ -781,7 +781,7 @@ int dccp_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 		goto out_release;
 
 	skb_reserve(skb, sk->sk_prot->max_header);
-	rc = memcpy_fromiovec(skb_put(skb, len), msg->msg_iov, len);
+	rc = memcpy_from_msg(skb_put(skb, len), msg, len);
 	if (rc != 0)
 		goto out_discard;
 
diff --git a/net/decnet/af_decnet.c b/net/decnet/af_decnet.c
index 25733d5..e2e2e3c 100644
--- a/net/decnet/af_decnet.c
+++ b/net/decnet/af_decnet.c
@@ -2032,7 +2032,7 @@ static int dn_sendmsg(struct kiocb *iocb, struct socket *sock,
 
 		skb_reserve(skb, 64 + DN_MAX_NSP_DATA_HEADER);
 
-		if (memcpy_fromiovec(skb_put(skb, len), msg->msg_iov, len)) {
+		if (memcpy_from_msg(skb_put(skb, len), msg, len)) {
 			err = -EFAULT;
 			goto out;
 		}
diff --git a/net/ieee802154/dgram.c b/net/ieee802154/dgram.c
index b8555ec..2c7a93e 100644
--- a/net/ieee802154/dgram.c
+++ b/net/ieee802154/dgram.c
@@ -276,7 +276,7 @@ static int dgram_sendmsg(struct kiocb *iocb, struct sock *sk,
 	if (err < 0)
 		goto out_skb;
 
-	err = memcpy_fromiovec(skb_put(skb, size), msg->msg_iov, size);
+	err = memcpy_from_msg(skb_put(skb, size), msg, size);
 	if (err < 0)
 		goto out_skb;
 
diff --git a/net/ieee802154/raw.c b/net/ieee802154/raw.c
index 21c3894..61e9d29 100644
--- a/net/ieee802154/raw.c
+++ b/net/ieee802154/raw.c
@@ -150,7 +150,7 @@ static int raw_sendmsg(struct kiocb *iocb, struct sock *sk,
 	skb_reset_mac_header(skb);
 	skb_reset_network_header(skb);
 
-	err = memcpy_fromiovec(skb_put(skb, size), msg->msg_iov, size);
+	err = memcpy_from_msg(skb_put(skb, size), msg, size);
 	if (err < 0)
 		goto out_skb;
 
diff --git a/net/ipv4/ping.c b/net/ipv4/ping.c
index ce2920f..ef8f6ee 100644
--- a/net/ipv4/ping.c
+++ b/net/ipv4/ping.c
@@ -660,7 +660,7 @@ int ping_common_sendmsg(int family, struct msghdr *msg, size_t len,
 	 *	Fetch the ICMP header provided by the userland.
 	 *	iovec is modified! The ICMP header is consumed.
 	 */
-	if (memcpy_fromiovec(user_icmph, msg->msg_iov, icmph_len))
+	if (memcpy_from_msg(user_icmph, msg, icmph_len))
 		return -EFAULT;
 
 	if (family == AF_INET) {
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index d22a31f..69de1a1 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4368,7 +4368,7 @@ int tcp_send_rcvq(struct sock *sk, struct msghdr *msg, size_t size)
 	if (tcp_try_rmem_schedule(sk, skb, skb->truesize))
 		goto err_free;
 
-	if (memcpy_fromiovec(skb_put(skb, size), msg->msg_iov, size))
+	if (memcpy_from_msg(skb_put(skb, size), msg, size))
 		goto err_free;
 
 	TCP_SKB_CB(skb)->seq = tcp_sk(sk)->rcv_nxt;
diff --git a/net/irda/af_irda.c b/net/irda/af_irda.c
index e8c4090..9052462 100644
--- a/net/irda/af_irda.c
+++ b/net/irda/af_irda.c
@@ -1319,7 +1319,7 @@ static int irda_sendmsg(struct kiocb *iocb, struct socket *sock,
 	skb_reserve(skb, self->max_header_size + 16);
 	skb_reset_transport_header(skb);
 	skb_put(skb, len);
-	err = memcpy_fromiovec(skb_transport_header(skb), msg->msg_iov, len);
+	err = memcpy_from_msg(skb_transport_header(skb), msg, len);
 	if (err) {
 		kfree_skb(skb);
 		goto out_err;
@@ -1569,7 +1569,7 @@ static int irda_sendmsg_dgram(struct kiocb *iocb, struct socket *sock,
 
 	pr_debug("%s(), appending user data\n", __func__);
 	skb_put(skb, len);
-	err = memcpy_fromiovec(skb_transport_header(skb), msg->msg_iov, len);
+	err = memcpy_from_msg(skb_transport_header(skb), msg, len);
 	if (err) {
 		kfree_skb(skb);
 		goto out;
@@ -1678,7 +1678,7 @@ static int irda_sendmsg_ultra(struct kiocb *iocb, struct socket *sock,
 
 	pr_debug("%s(), appending user data\n", __func__);
 	skb_put(skb, len);
-	err = memcpy_fromiovec(skb_transport_header(skb), msg->msg_iov, len);
+	err = memcpy_from_msg(skb_transport_header(skb), msg, len);
 	if (err) {
 		kfree_skb(skb);
 		goto out;
diff --git a/net/iucv/af_iucv.c b/net/iucv/af_iucv.c
index 057b564..1cd3f81 100644
--- a/net/iucv/af_iucv.c
+++ b/net/iucv/af_iucv.c
@@ -1122,7 +1122,7 @@ static int iucv_sock_sendmsg(struct kiocb *iocb, struct socket *sock,
 	}
 	if (iucv->transport == AF_IUCV_TRANS_HIPER)
 		skb_reserve(skb, sizeof(struct af_iucv_trans_hdr) + ETH_HLEN);
-	if (memcpy_fromiovec(skb_put(skb, len), msg->msg_iov, len)) {
+	if (memcpy_from_msg(skb_put(skb, len), msg, len)) {
 		err = -EFAULT;
 		goto fail;
 	}
diff --git a/net/key/af_key.c b/net/key/af_key.c
index e588309..f8ac939 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -3611,7 +3611,7 @@ static int pfkey_sendmsg(struct kiocb *kiocb,
 		goto out;
 
 	err = -EFAULT;
-	if (memcpy_fromiovec(skb_put(skb,len), msg->msg_iov, len))
+	if (memcpy_from_msg(skb_put(skb,len), msg, len))
 		goto out;
 
 	hdr = pfkey_get_base_msg(skb, &err);
diff --git a/net/l2tp/l2tp_ip.c b/net/l2tp/l2tp_ip.c
index a6cc1fe..05dfc8a 100644
--- a/net/l2tp/l2tp_ip.c
+++ b/net/l2tp/l2tp_ip.c
@@ -441,7 +441,7 @@ static int l2tp_ip_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *m
 	*((__be32 *) skb_put(skb, 4)) = 0;
 
 	/* Copy user data into skb */
-	rc = memcpy_fromiovec(skb_put(skb, len), msg->msg_iov, len);
+	rc = memcpy_from_msg(skb_put(skb, len), msg, len);
 	if (rc < 0) {
 		kfree_skb(skb);
 		goto error;
diff --git a/net/l2tp/l2tp_ppp.c b/net/l2tp/l2tp_ppp.c
index c559bcd..cc7a828 100644
--- a/net/l2tp/l2tp_ppp.c
+++ b/net/l2tp/l2tp_ppp.c
@@ -346,8 +346,7 @@ static int pppol2tp_sendmsg(struct kiocb *iocb, struct socket *sock, struct msgh
 	skb_put(skb, 2);
 
 	/* Copy user data into skb */
-	error = memcpy_fromiovec(skb_put(skb, total_len), m->msg_iov,
-				 total_len);
+	error = memcpy_from_msg(skb_put(skb, total_len), m, total_len);
 	if (error < 0) {
 		kfree_skb(skb);
 		goto error_put_sess_tun;
diff --git a/net/llc/af_llc.c b/net/llc/af_llc.c
index af662669..2c0b83c 100644
--- a/net/llc/af_llc.c
+++ b/net/llc/af_llc.c
@@ -921,7 +921,7 @@ static int llc_ui_sendmsg(struct kiocb *iocb, struct socket *sock,
 	skb->dev      = llc->dev;
 	skb->protocol = llc_proto_type(addr->sllc_arphrd);
 	skb_reserve(skb, hdrlen);
-	rc = memcpy_fromiovec(skb_put(skb, copied), msg->msg_iov, copied);
+	rc = memcpy_from_msg(skb_put(skb, copied), msg, copied);
 	if (rc)
 		goto out;
 	if (sk->sk_type == SOCK_DGRAM || addr->sllc_ua) {
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index e1aad6e..63aa5c8 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -2325,7 +2325,7 @@ static int netlink_sendmsg(struct kiocb *kiocb, struct socket *sock,
 	NETLINK_CB(skb).flags	= netlink_skb_flags;
 
 	err = -EFAULT;
-	if (memcpy_fromiovec(skb_put(skb, len), msg->msg_iov, len)) {
+	if (memcpy_from_msg(skb_put(skb, len), msg, len)) {
 		kfree_skb(skb);
 		goto out;
 	}
diff --git a/net/netrom/af_netrom.c b/net/netrom/af_netrom.c
index 7e13f6a..69f1d5e 100644
--- a/net/netrom/af_netrom.c
+++ b/net/netrom/af_netrom.c
@@ -1113,7 +1113,7 @@ static int nr_sendmsg(struct kiocb *iocb, struct socket *sock,
 	skb_put(skb, len);
 
 	/* User data follows immediately after the NET/ROM transport header */
-	if (memcpy_fromiovec(skb_transport_header(skb), msg->msg_iov, len)) {
+	if (memcpy_from_msg(skb_transport_header(skb), msg, len)) {
 		kfree_skb(skb);
 		err = -EFAULT;
 		goto out;
diff --git a/net/nfc/llcp_commands.c b/net/nfc/llcp_commands.c
index a3ad69a..c4da0c2 100644
--- a/net/nfc/llcp_commands.c
+++ b/net/nfc/llcp_commands.c
@@ -665,7 +665,7 @@ int nfc_llcp_send_i_frame(struct nfc_llcp_sock *sock,
 	if (msg_data == NULL)
 		return -ENOMEM;
 
-	if (memcpy_fromiovec(msg_data, msg->msg_iov, len)) {
+	if (memcpy_from_msg(msg_data, msg, len)) {
 		kfree(msg_data);
 		return -EFAULT;
 	}
@@ -731,7 +731,7 @@ int nfc_llcp_send_ui_frame(struct nfc_llcp_sock *sock, u8 ssap, u8 dsap,
 	if (msg_data == NULL)
 		return -ENOMEM;
 
-	if (memcpy_fromiovec(msg_data, msg->msg_iov, len)) {
+	if (memcpy_from_msg(msg_data, msg, len)) {
 		kfree(msg_data);
 		return -EFAULT;
 	}
diff --git a/net/nfc/rawsock.c b/net/nfc/rawsock.c
index 9d7d2b7..373e138 100644
--- a/net/nfc/rawsock.c
+++ b/net/nfc/rawsock.c
@@ -231,7 +231,7 @@ static int rawsock_sendmsg(struct kiocb *iocb, struct socket *sock,
 	if (skb == NULL)
 		return rc;
 
-	rc = memcpy_fromiovec(skb_put(skb, len), msg->msg_iov, len);
+	rc = memcpy_from_msg(skb_put(skb, len), msg, len);
 	if (rc < 0) {
 		kfree_skb(skb);
 		return rc;
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 58af5802..07ac950 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -1676,7 +1676,7 @@ retry:
 			if (len < hhlen)
 				skb_reset_network_header(skb);
 		}
-		err = memcpy_fromiovec(skb_put(skb, len), msg->msg_iov, len);
+		err = memcpy_from_msg(skb_put(skb, len), msg, len);
 		if (err)
 			goto out_free;
 		goto retry;
@@ -2446,8 +2446,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
 
 		len -= vnet_hdr_len;
 
-		err = memcpy_fromiovec((void *)&vnet_hdr, msg->msg_iov,
-				       vnet_hdr_len);
+		err = memcpy_from_msg((void *)&vnet_hdr, msg, vnet_hdr_len);
 		if (err < 0)
 			goto out_unlock;
 
diff --git a/net/phonet/datagram.c b/net/phonet/datagram.c
index 0918bc2..26054b4 100644
--- a/net/phonet/datagram.c
+++ b/net/phonet/datagram.c
@@ -109,7 +109,7 @@ static int pn_sendmsg(struct kiocb *iocb, struct sock *sk,
 		return err;
 	skb_reserve(skb, MAX_PHONET_HEADER);
 
-	err = memcpy_fromiovec((void *)skb_put(skb, len), msg->msg_iov, len);
+	err = memcpy_from_msg((void *)skb_put(skb, len), msg, len);
 	if (err < 0) {
 		kfree_skb(skb);
 		return err;
diff --git a/net/phonet/pep.c b/net/phonet/pep.c
index 9cd069d..5d3f2b7 100644
--- a/net/phonet/pep.c
+++ b/net/phonet/pep.c
@@ -1141,7 +1141,7 @@ static int pep_sendmsg(struct kiocb *iocb, struct sock *sk,
 		return err;
 
 	skb_reserve(skb, MAX_PHONET_HEADER + 3 + pn->aligned);
-	err = memcpy_fromiovec(skb_put(skb, len), msg->msg_iov, len);
+	err = memcpy_from_msg(skb_put(skb, len), msg, len);
 	if (err < 0)
 		goto outfree;
 
diff --git a/net/rose/af_rose.c b/net/rose/af_rose.c
index 9b600c2..43bac7c 100644
--- a/net/rose/af_rose.c
+++ b/net/rose/af_rose.c
@@ -1121,7 +1121,7 @@ static int rose_sendmsg(struct kiocb *iocb, struct socket *sock,
 	skb_reset_transport_header(skb);
 	skb_put(skb, len);
 
-	err = memcpy_fromiovec(skb_transport_header(skb), msg->msg_iov, len);
+	err = memcpy_from_msg(skb_transport_header(skb), msg, len);
 	if (err) {
 		kfree_skb(skb);
 		return err;
diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
index 9f32741..e49bcce 100644
--- a/net/sctp/sm_make_chunk.c
+++ b/net/sctp/sm_make_chunk.c
@@ -1001,7 +1001,7 @@ no_mem:
 
 /* Helper to create ABORT with a SCTP_ERROR_USER_ABORT error.  */
 struct sctp_chunk *sctp_make_abort_user(const struct sctp_association *asoc,
-					const struct msghdr *msg,
+					struct msghdr *msg,
 					size_t paylen)
 {
 	struct sctp_chunk *retval;
@@ -1018,7 +1018,7 @@ struct sctp_chunk *sctp_make_abort_user(const struct sctp_association *asoc,
 		if (!payload)
 			goto err_payload;
 
-		err = memcpy_fromiovec(payload, msg->msg_iov, paylen);
+		err = memcpy_from_msg(payload, msg, paylen);
 		if (err < 0)
 			goto err_copy;
 	}
diff --git a/net/x25/af_x25.c b/net/x25/af_x25.c
index 59e785b..d9149b6 100644
--- a/net/x25/af_x25.c
+++ b/net/x25/af_x25.c
@@ -1170,7 +1170,7 @@ static int x25_sendmsg(struct kiocb *iocb, struct socket *sock,
 	skb_reset_transport_header(skb);
 	skb_put(skb, len);
 
-	rc = memcpy_fromiovec(skb_transport_header(skb), msg->msg_iov, len);
+	rc = memcpy_from_msg(skb_transport_header(skb), msg, len);
 	if (rc)
 		goto out_kfree_skb;
 
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH v2 03/17] switch ipxrtr_route_packet() from iovec to msghdr
  2014-11-25  2:40                         ` Al Viro
  2014-11-25 14:02                           ` [PATCH v2 01/17] new helper: skb_copy_and_csum_datagram_msg() Al Viro
  2014-11-25 14:02                           ` [PATCH v2 02/17] new helper: memcpy_from_msg() Al Viro
@ 2014-11-25 14:02                           ` Al Viro
  2014-11-25 14:02                           ` [PATCH v2 04/17] new helper: memcpy_to_msg() Al Viro
                                             ` (13 subsequent siblings)
  16 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-11-25 14:02 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 include/net/ipx.h   |    2 +-
 net/ipx/af_ipx.c    |    3 +--
 net/ipx/ipx_route.c |    4 ++--
 3 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/include/net/ipx.h b/include/net/ipx.h
index 320f47b..e5cff68 100644
--- a/include/net/ipx.h
+++ b/include/net/ipx.h
@@ -150,7 +150,7 @@ int ipxrtr_add_route(__be32 network, struct ipx_interface *intrfc,
 		     unsigned char *node);
 void ipxrtr_del_routes(struct ipx_interface *intrfc);
 int ipxrtr_route_packet(struct sock *sk, struct sockaddr_ipx *usipx,
-			struct iovec *iov, size_t len, int noblock);
+			struct msghdr *msg, size_t len, int noblock);
 int ipxrtr_route_skb(struct sk_buff *skb);
 struct ipx_route *ipxrtr_lookup(__be32 net);
 int ipxrtr_ioctl(unsigned int cmd, void __user *arg);
diff --git a/net/ipx/af_ipx.c b/net/ipx/af_ipx.c
index 97dc432..f11ad1d 100644
--- a/net/ipx/af_ipx.c
+++ b/net/ipx/af_ipx.c
@@ -1745,8 +1745,7 @@ static int ipx_sendmsg(struct kiocb *iocb, struct socket *sock,
 		memcpy(usipx->sipx_node, ipxs->dest_addr.node, IPX_NODE_LEN);
 	}
 
-	rc = ipxrtr_route_packet(sk, usipx, msg->msg_iov, len,
-				 flags & MSG_DONTWAIT);
+	rc = ipxrtr_route_packet(sk, usipx, msg, len, flags & MSG_DONTWAIT);
 	if (rc >= 0)
 		rc = len;
 out:
diff --git a/net/ipx/ipx_route.c b/net/ipx/ipx_route.c
index 67e7ad3..3e2a32a 100644
--- a/net/ipx/ipx_route.c
+++ b/net/ipx/ipx_route.c
@@ -165,7 +165,7 @@ int ipxrtr_route_skb(struct sk_buff *skb)
  * Route an outgoing frame from a socket.
  */
 int ipxrtr_route_packet(struct sock *sk, struct sockaddr_ipx *usipx,
-			struct iovec *iov, size_t len, int noblock)
+			struct msghdr *msg, size_t len, int noblock)
 {
 	struct sk_buff *skb;
 	struct ipx_sock *ipxs = ipx_sk(sk);
@@ -229,7 +229,7 @@ int ipxrtr_route_packet(struct sock *sk, struct sockaddr_ipx *usipx,
 	memcpy(ipx->ipx_dest.node, usipx->sipx_node, IPX_NODE_LEN);
 	ipx->ipx_dest.sock		= usipx->sipx_port;
 
-	rc = memcpy_fromiovec(skb_put(skb, len), iov, len);
+	rc = memcpy_from_msg(skb_put(skb, len), msg, len);
 	if (rc) {
 		kfree_skb(skb);
 		goto out_put;
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH v2 04/17] new helper: memcpy_to_msg()
  2014-11-25  2:40                         ` Al Viro
                                             ` (2 preceding siblings ...)
  2014-11-25 14:02                           ` [PATCH v2 03/17] switch ipxrtr_route_packet() from iovec to msghdr Al Viro
@ 2014-11-25 14:02                           ` Al Viro
  2014-11-25 14:02                           ` [PATCH v2 05/17] switch drivers/net/tun.c to ->read_iter() Al Viro
                                             ` (12 subsequent siblings)
  16 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-11-25 14:02 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 crypto/algif_hash.c    |    2 +-
 include/linux/skbuff.h |    5 +++++
 net/caif/caif_socket.c |    2 +-
 net/can/bcm.c          |    2 +-
 net/can/raw.c          |    2 +-
 net/decnet/af_decnet.c |    2 +-
 net/ipv4/tcp.c         |    2 +-
 net/irda/af_irda.c     |    2 +-
 net/packet/af_packet.c |    3 +--
 9 files changed, 13 insertions(+), 9 deletions(-)

diff --git a/crypto/algif_hash.c b/crypto/algif_hash.c
index 8502462..35c93ff 100644
--- a/crypto/algif_hash.c
+++ b/crypto/algif_hash.c
@@ -174,7 +174,7 @@ static int hash_recvmsg(struct kiocb *unused, struct socket *sock,
 			goto unlock;
 	}
 
-	err = memcpy_toiovec(msg->msg_iov, ctx->result, len);
+	err = memcpy_to_msg(msg, ctx->result, len);
 
 unlock:
 	release_sock(sk);
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 97dc5f8..d048347 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2692,6 +2692,11 @@ static inline int memcpy_from_msg(void *data, struct msghdr *msg, int len)
 	return memcpy_fromiovec(data, msg->msg_iov, len);
 }
 
+static inline int memcpy_to_msg(struct msghdr *msg, void *data, int len)
+{
+	return memcpy_toiovec(msg->msg_iov, data, len);
+}
+
 struct skb_checksum_ops {
 	__wsum (*update)(const void *mem, int len, __wsum wsum);
 	__wsum (*combine)(__wsum csum, __wsum csum2, int offset, int len);
diff --git a/net/caif/caif_socket.c b/net/caif/caif_socket.c
index 230f140..ac618b0 100644
--- a/net/caif/caif_socket.c
+++ b/net/caif/caif_socket.c
@@ -418,7 +418,7 @@ unlock:
 		}
 		release_sock(sk);
 		chunk = min_t(unsigned int, skb->len, size);
-		if (memcpy_toiovec(msg->msg_iov, skb->data, chunk)) {
+		if (memcpy_to_msg(msg, skb->data, chunk)) {
 			skb_queue_head(&sk->sk_receive_queue, skb);
 			if (copied == 0)
 				copied = -EFAULT;
diff --git a/net/can/bcm.c b/net/can/bcm.c
index b9a1f71..0167118 100644
--- a/net/can/bcm.c
+++ b/net/can/bcm.c
@@ -1555,7 +1555,7 @@ static int bcm_recvmsg(struct kiocb *iocb, struct socket *sock,
 	if (skb->len < size)
 		size = skb->len;
 
-	err = memcpy_toiovec(msg->msg_iov, skb->data, size);
+	err = memcpy_to_msg(msg, skb->data, size);
 	if (err < 0) {
 		skb_free_datagram(sk, skb);
 		return err;
diff --git a/net/can/raw.c b/net/can/raw.c
index 0e4004f..dfdcffb 100644
--- a/net/can/raw.c
+++ b/net/can/raw.c
@@ -750,7 +750,7 @@ static int raw_recvmsg(struct kiocb *iocb, struct socket *sock,
 	else
 		size = skb->len;
 
-	err = memcpy_toiovec(msg->msg_iov, skb->data, size);
+	err = memcpy_to_msg(msg, skb->data, size);
 	if (err < 0) {
 		skb_free_datagram(sk, skb);
 		return err;
diff --git a/net/decnet/af_decnet.c b/net/decnet/af_decnet.c
index e2e2e3c..8102286 100644
--- a/net/decnet/af_decnet.c
+++ b/net/decnet/af_decnet.c
@@ -1760,7 +1760,7 @@ static int dn_recvmsg(struct kiocb *iocb, struct socket *sock,
 		if ((chunk + copied) > size)
 			chunk = size - copied;
 
-		if (memcpy_toiovec(msg->msg_iov, skb->data, chunk)) {
+		if (memcpy_to_msg(msg, skb->data, chunk)) {
 			rv = -EFAULT;
 			break;
 		}
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index c239f47..435443b 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1349,7 +1349,7 @@ static int tcp_recv_urg(struct sock *sk, struct msghdr *msg, int len, int flags)
 
 		if (len > 0) {
 			if (!(flags & MSG_TRUNC))
-				err = memcpy_toiovec(msg->msg_iov, &c, 1);
+				err = memcpy_to_msg(msg, &c, 1);
 			len = 1;
 		} else
 			msg->msg_flags |= MSG_TRUNC;
diff --git a/net/irda/af_irda.c b/net/irda/af_irda.c
index 9052462..568edc7 100644
--- a/net/irda/af_irda.c
+++ b/net/irda/af_irda.c
@@ -1466,7 +1466,7 @@ static int irda_recvmsg_stream(struct kiocb *iocb, struct socket *sock,
 		}
 
 		chunk = min_t(unsigned int, skb->len, size);
-		if (memcpy_toiovec(msg->msg_iov, skb->data, chunk)) {
+		if (memcpy_to_msg(msg, skb->data, chunk)) {
 			skb_queue_head(&sk->sk_receive_queue, skb);
 			if (copied == 0)
 				copied = -EFAULT;
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 07ac950..108d7f3 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2949,8 +2949,7 @@ static int packet_recvmsg(struct kiocb *iocb, struct socket *sock,
 			vnet_hdr.flags = VIRTIO_NET_HDR_F_DATA_VALID;
 		} /* else everything is zero */
 
-		err = memcpy_toiovec(msg->msg_iov, (void *)&vnet_hdr,
-				     vnet_hdr_len);
+		err = memcpy_to_msg(msg, (void *)&vnet_hdr, vnet_hdr_len);
 		if (err < 0)
 			goto out_free;
 	}
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH v2 05/17] switch drivers/net/tun.c to ->read_iter()
  2014-11-25  2:40                         ` Al Viro
                                             ` (3 preceding siblings ...)
  2014-11-25 14:02                           ` [PATCH v2 04/17] new helper: memcpy_to_msg() Al Viro
@ 2014-11-25 14:02                           ` Al Viro
  2014-11-25 14:02                           ` [PATCH v2 06/17] switch macvtap " Al Viro
                                             ` (11 subsequent siblings)
  16 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-11-25 14:02 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 drivers/net/tun.c |   40 +++++++++++++++-------------------------
 1 file changed, 15 insertions(+), 25 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index ac53a73..405dfdf 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1339,18 +1339,17 @@ done:
 }
 
 static ssize_t tun_do_read(struct tun_struct *tun, struct tun_file *tfile,
-			   const struct iovec *iv, unsigned long segs,
-			   ssize_t len, int noblock)
+			   struct iov_iter *to,
+			   int noblock)
 {
 	struct sk_buff *skb;
-	ssize_t ret = 0;
+	ssize_t ret;
 	int peeked, err, off = 0;
-	struct iov_iter iter;
 
 	tun_debug(KERN_INFO, tun, "tun_do_read\n");
 
-	if (!len)
-		return ret;
+	if (!iov_iter_count(to))
+		return 0;
 
 	if (tun->dev->reg_state != NETREG_REGISTERED)
 		return -EIO;
@@ -1359,37 +1358,27 @@ static ssize_t tun_do_read(struct tun_struct *tun, struct tun_file *tfile,
 	skb = __skb_recv_datagram(tfile->socket.sk, noblock ? MSG_DONTWAIT : 0,
 				  &peeked, &off, &err);
 	if (!skb)
-		return ret;
+		return 0;
 
-	iov_iter_init(&iter, READ, iv, segs, len);
-	ret = tun_put_user(tun, tfile, skb, &iter);
+	ret = tun_put_user(tun, tfile, skb, to);
 	kfree_skb(skb);
 
 	return ret;
 }
 
-static ssize_t tun_chr_aio_read(struct kiocb *iocb, const struct iovec *iv,
-			    unsigned long count, loff_t pos)
+static ssize_t tun_chr_read_iter(struct kiocb *iocb, struct iov_iter *to)
 {
 	struct file *file = iocb->ki_filp;
 	struct tun_file *tfile = file->private_data;
 	struct tun_struct *tun = __tun_get(tfile);
-	ssize_t len, ret;
+	ssize_t len = iov_iter_count(to), ret;
 
 	if (!tun)
 		return -EBADFD;
-	len = iov_length(iv, count);
-	if (len < 0) {
-		ret = -EINVAL;
-		goto out;
-	}
-
-	ret = tun_do_read(tun, tfile, iv, count, len,
-			  file->f_flags & O_NONBLOCK);
+	ret = tun_do_read(tun, tfile, to, file->f_flags & O_NONBLOCK);
 	ret = min_t(ssize_t, ret, len);
 	if (ret > 0)
 		iocb->ki_pos = ret;
-out:
 	tun_put(tun);
 	return ret;
 }
@@ -1471,6 +1460,7 @@ static int tun_recvmsg(struct kiocb *iocb, struct socket *sock,
 {
 	struct tun_file *tfile = container_of(sock, struct tun_file, socket);
 	struct tun_struct *tun = __tun_get(tfile);
+	struct iov_iter to;
 	int ret;
 
 	if (!tun)
@@ -1485,8 +1475,8 @@ static int tun_recvmsg(struct kiocb *iocb, struct socket *sock,
 					 SOL_PACKET, TUN_TX_TIMESTAMP);
 		goto out;
 	}
-	ret = tun_do_read(tun, tfile, m->msg_iov, m->msg_iovlen, total_len,
-			  flags & MSG_DONTWAIT);
+	iov_iter_init(&to, READ, m->msg_iov, m->msg_iovlen, total_len);
+	ret = tun_do_read(tun, tfile, &to, flags & MSG_DONTWAIT);
 	if (ret > total_len) {
 		m->msg_flags |= MSG_TRUNC;
 		ret = flags & MSG_TRUNC ? ret : total_len;
@@ -2242,8 +2232,8 @@ static int tun_chr_show_fdinfo(struct seq_file *m, struct file *f)
 static const struct file_operations tun_fops = {
 	.owner	= THIS_MODULE,
 	.llseek = no_llseek,
-	.read  = do_sync_read,
-	.aio_read  = tun_chr_aio_read,
+	.read  = new_sync_read,
+	.read_iter  = tun_chr_read_iter,
 	.write = do_sync_write,
 	.aio_write = tun_chr_aio_write,
 	.poll	= tun_chr_poll,
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH v2 06/17] switch macvtap to ->read_iter()
  2014-11-25  2:40                         ` Al Viro
                                             ` (4 preceding siblings ...)
  2014-11-25 14:02                           ` [PATCH v2 05/17] switch drivers/net/tun.c to ->read_iter() Al Viro
@ 2014-11-25 14:02                           ` Al Viro
  2014-11-25 14:02                           ` [PATCH v2 07/17] new helpers: skb_copy_datagram_from_iter() and zerocopy_sg_from_iter() Al Viro
                                             ` (10 subsequent siblings)
  16 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-11-25 14:02 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 drivers/net/macvtap.c |   61 ++++++++++++++++++++++---------------------------
 1 file changed, 27 insertions(+), 34 deletions(-)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index 42a80d3..8f80045 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -831,64 +831,55 @@ done:
 }
 
 static ssize_t macvtap_do_read(struct macvtap_queue *q,
-			       const struct iovec *iv, unsigned long segs,
-			       unsigned long len,
+			       struct iov_iter *to,
 			       int noblock)
 {
 	DEFINE_WAIT(wait);
 	struct sk_buff *skb;
 	ssize_t ret = 0;
-	struct iov_iter iter;
 
-	while (len) {
+	if (!iov_iter_count(to))
+		return 0;
+
+	while (1) {
 		if (!noblock)
 			prepare_to_wait(sk_sleep(&q->sk), &wait,
 					TASK_INTERRUPTIBLE);
 
 		/* Read frames from the queue */
 		skb = skb_dequeue(&q->sk.sk_receive_queue);
-		if (!skb) {
-			if (noblock) {
-				ret = -EAGAIN;
-				break;
-			}
-			if (signal_pending(current)) {
-				ret = -ERESTARTSYS;
-				break;
-			}
-			/* Nothing to read, let's sleep */
-			schedule();
-			continue;
+		if (skb)
+			break;
+		if (noblock) {
+			ret = -EAGAIN;
+			break;
 		}
-		iov_iter_init(&iter, READ, iv, segs, len);
-		ret = macvtap_put_user(q, skb, &iter);
+		if (signal_pending(current)) {
+			ret = -ERESTARTSYS;
+			break;
+		}
+		/* Nothing to read, let's sleep */
+		schedule();
+	}
+	if (skb) {
+		ret = macvtap_put_user(q, skb, to);
 		kfree_skb(skb);
-		break;
 	}
-
 	if (!noblock)
 		finish_wait(sk_sleep(&q->sk), &wait);
 	return ret;
 }
 
-static ssize_t macvtap_aio_read(struct kiocb *iocb, const struct iovec *iv,
-				unsigned long count, loff_t pos)
+static ssize_t macvtap_read_iter(struct kiocb *iocb, struct iov_iter *to)
 {
 	struct file *file = iocb->ki_filp;
 	struct macvtap_queue *q = file->private_data;
-	ssize_t len, ret = 0;
+	ssize_t len = iov_iter_count(to), ret;
 
-	len = iov_length(iv, count);
-	if (len < 0) {
-		ret = -EINVAL;
-		goto out;
-	}
-
-	ret = macvtap_do_read(q, iv, count, len, file->f_flags & O_NONBLOCK);
+	ret = macvtap_do_read(q, to, file->f_flags & O_NONBLOCK);
 	ret = min_t(ssize_t, ret, len);
 	if (ret > 0)
 		iocb->ki_pos = ret;
-out:
 	return ret;
 }
 
@@ -1089,7 +1080,8 @@ static const struct file_operations macvtap_fops = {
 	.owner		= THIS_MODULE,
 	.open		= macvtap_open,
 	.release	= macvtap_release,
-	.aio_read	= macvtap_aio_read,
+	.read		= new_sync_read,
+	.read_iter	= macvtap_read_iter,
 	.aio_write	= macvtap_aio_write,
 	.poll		= macvtap_poll,
 	.llseek		= no_llseek,
@@ -1112,11 +1104,12 @@ static int macvtap_recvmsg(struct kiocb *iocb, struct socket *sock,
 			   int flags)
 {
 	struct macvtap_queue *q = container_of(sock, struct macvtap_queue, sock);
+	struct iov_iter to;
 	int ret;
 	if (flags & ~(MSG_DONTWAIT|MSG_TRUNC))
 		return -EINVAL;
-	ret = macvtap_do_read(q, m->msg_iov, m->msg_iovlen, total_len,
-			  flags & MSG_DONTWAIT);
+	iov_iter_init(&to, READ, m->msg_iov, m->msg_iovlen, total_len);
+	ret = macvtap_do_read(q, &to, flags & MSG_DONTWAIT);
 	if (ret > total_len) {
 		m->msg_flags |= MSG_TRUNC;
 		ret = flags & MSG_TRUNC ? ret : total_len;
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH v2 07/17] new helpers: skb_copy_datagram_from_iter() and zerocopy_sg_from_iter()
  2014-11-25  2:40                         ` Al Viro
                                             ` (5 preceding siblings ...)
  2014-11-25 14:02                           ` [PATCH v2 06/17] switch macvtap " Al Viro
@ 2014-11-25 14:02                           ` Al Viro
  2014-11-25 14:02                           ` [PATCH v2 08/17] {macvtap,tun}_get_user(): switch to iov_iter Al Viro
                                             ` (9 subsequent siblings)
  16 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-11-25 14:02 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 include/linux/skbuff.h |    3 ++
 net/core/datagram.c    |  116 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 119 insertions(+)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index d048347..a01cd9a 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2659,10 +2659,13 @@ static inline int skb_copy_and_csum_datagram_msg(struct sk_buff *skb, int hlen,
 int skb_copy_datagram_from_iovec(struct sk_buff *skb, int offset,
 				 const struct iovec *from, int from_offset,
 				 int len);
+int skb_copy_datagram_from_iter(struct sk_buff *skb, int offset,
+				 struct iov_iter *from, int len);
 int zerocopy_sg_from_iovec(struct sk_buff *skb, const struct iovec *frm,
 			   int offset, size_t count);
 int skb_copy_datagram_iter(const struct sk_buff *from, int offset,
 			   struct iov_iter *to, int size);
+int zerocopy_sg_from_iter(struct sk_buff *skb, struct iov_iter *frm);
 void skb_free_datagram(struct sock *sk, struct sk_buff *skb);
 void skb_free_datagram_locked(struct sock *sk, struct sk_buff *skb);
 int skb_kill_datagram(struct sock *sk, struct sk_buff *skb, unsigned int flags);
diff --git a/net/core/datagram.c b/net/core/datagram.c
index 26391a3..3ea0384 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -572,6 +572,78 @@ fault:
 }
 EXPORT_SYMBOL(skb_copy_datagram_from_iovec);
 
+int skb_copy_datagram_from_iter(struct sk_buff *skb, int offset,
+				 struct iov_iter *from,
+				 int len)
+{
+	int start = skb_headlen(skb);
+	int i, copy = start - offset;
+	struct sk_buff *frag_iter;
+
+	/* Copy header. */
+	if (copy > 0) {
+		if (copy > len)
+			copy = len;
+		if (copy_from_iter(skb->data + offset, copy, from) != copy)
+			goto fault;
+		if ((len -= copy) == 0)
+			return 0;
+		offset += copy;
+	}
+
+	/* Copy paged appendix. Hmm... why does this look so complicated? */
+	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
+		int end;
+		const skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
+
+		WARN_ON(start > offset + len);
+
+		end = start + skb_frag_size(frag);
+		if ((copy = end - offset) > 0) {
+			size_t copied;
+
+			if (copy > len)
+				copy = len;
+			copied = copy_page_from_iter(skb_frag_page(frag),
+					  frag->page_offset + offset - start,
+					  copy, from);
+			if (copied != copy)
+				goto fault;
+
+			if (!(len -= copy))
+				return 0;
+			offset += copy;
+		}
+		start = end;
+	}
+
+	skb_walk_frags(skb, frag_iter) {
+		int end;
+
+		WARN_ON(start > offset + len);
+
+		end = start + frag_iter->len;
+		if ((copy = end - offset) > 0) {
+			if (copy > len)
+				copy = len;
+			if (skb_copy_datagram_from_iter(frag_iter,
+							offset - start,
+							from, copy))
+				goto fault;
+			if ((len -= copy) == 0)
+				return 0;
+			offset += copy;
+		}
+		start = end;
+	}
+	if (!len)
+		return 0;
+
+fault:
+	return -EFAULT;
+}
+EXPORT_SYMBOL(skb_copy_datagram_from_iter);
+
 /**
  *	zerocopy_sg_from_iovec - Build a zerocopy datagram from an iovec
  *	@skb: buffer to copy
@@ -643,6 +715,50 @@ int zerocopy_sg_from_iovec(struct sk_buff *skb, const struct iovec *from,
 }
 EXPORT_SYMBOL(zerocopy_sg_from_iovec);
 
+int zerocopy_sg_from_iter(struct sk_buff *skb, struct iov_iter *from)
+{
+	int len = iov_iter_count(from);
+	int copy = min_t(int, skb_headlen(skb), len);
+	int frag = 0;
+
+	/* copy up to skb headlen */
+	if (skb_copy_datagram_from_iter(skb, 0, from, copy))
+		return -EFAULT;
+
+	while (iov_iter_count(from)) {
+		struct page *pages[MAX_SKB_FRAGS];
+		size_t start;
+		ssize_t copied;
+		unsigned long truesize;
+		int n = 0;
+
+		if (frag == MAX_SKB_FRAGS)
+			return -EMSGSIZE;
+
+		copied = iov_iter_get_pages(from, pages, ~0U,
+					    MAX_SKB_FRAGS - frag, &start);
+		if (copied < 0)
+			return -EFAULT;
+
+		iov_iter_advance(from, copied);
+
+		truesize = PAGE_ALIGN(copied + start);
+		skb->data_len += copied;
+		skb->len += copied;
+		skb->truesize += truesize;
+		atomic_add(truesize, &skb->sk->sk_wmem_alloc);
+		while (copied) {
+			int size = min_t(int, copied, PAGE_SIZE - start);
+			skb_fill_page_desc(skb, frag++, pages[n], start, size);
+			start = 0;
+			copied -= size;
+			n++;
+		}
+	}
+	return 0;
+}
+EXPORT_SYMBOL(zerocopy_sg_from_iter);
+
 static int skb_copy_and_csum_datagram(const struct sk_buff *skb, int offset,
 				      u8 __user *to, int len,
 				      __wsum *csump)
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH v2 08/17] {macvtap,tun}_get_user(): switch to iov_iter
  2014-11-25  2:40                         ` Al Viro
                                             ` (6 preceding siblings ...)
  2014-11-25 14:02                           ` [PATCH v2 07/17] new helpers: skb_copy_datagram_from_iter() and zerocopy_sg_from_iter() Al Viro
@ 2014-11-25 14:02                           ` Al Viro
  2015-02-03 10:10                             ` Michael S. Tsirkin
  2014-11-25 14:02                           ` [PATCH v2 09/17] kill zerocopy_sg_from_iovec() Al Viro
                                             ` (8 subsequent siblings)
  16 siblings, 1 reply; 133+ messages in thread
From: Al Viro @ 2014-11-25 14:02 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

allows to switch macvtap and tun from ->aio_write() to ->write_iter()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 drivers/net/macvtap.c |   44 +++++++++++++++++++++-----------------------
 drivers/net/tun.c     |   44 ++++++++++++++++++++++++--------------------
 2 files changed, 45 insertions(+), 43 deletions(-)

diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index 8f80045..22b4cf2 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -640,12 +640,12 @@ static void macvtap_skb_to_vnet_hdr(const struct sk_buff *skb,
 
 /* Get packet from user space buffer */
 static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
-				const struct iovec *iv, unsigned long total_len,
-				size_t count, int noblock)
+				struct iov_iter *from, int noblock)
 {
 	int good_linear = SKB_MAX_HEAD(NET_IP_ALIGN);
 	struct sk_buff *skb;
 	struct macvlan_dev *vlan;
+	unsigned long total_len = iov_iter_count(from);
 	unsigned long len = total_len;
 	int err;
 	struct virtio_net_hdr vnet_hdr = { 0 };
@@ -653,6 +653,7 @@ static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
 	int copylen = 0;
 	bool zerocopy = false;
 	size_t linear;
+	ssize_t n;
 
 	if (q->flags & IFF_VNET_HDR) {
 		vnet_hdr_len = q->vnet_hdr_sz;
@@ -662,10 +663,11 @@ static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
 			goto err;
 		len -= vnet_hdr_len;
 
-		err = memcpy_fromiovecend((void *)&vnet_hdr, iv, 0,
-					   sizeof(vnet_hdr));
-		if (err < 0)
+		err = -EFAULT;
+		n = copy_from_iter(&vnet_hdr, sizeof(vnet_hdr), from);
+		if (n != sizeof(vnet_hdr))
 			goto err;
+		iov_iter_advance(from, vnet_hdr_len - sizeof(vnet_hdr));
 		if ((vnet_hdr.flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) &&
 		     vnet_hdr.csum_start + vnet_hdr.csum_offset + 2 >
 							vnet_hdr.hdr_len)
@@ -680,17 +682,16 @@ static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
 	if (unlikely(len < ETH_HLEN))
 		goto err;
 
-	err = -EMSGSIZE;
-	if (unlikely(count > UIO_MAXIOV))
-		goto err;
-
 	if (m && m->msg_control && sock_flag(&q->sk, SOCK_ZEROCOPY)) {
+		struct iov_iter i;
+
 		copylen = vnet_hdr.hdr_len ? vnet_hdr.hdr_len : GOODCOPY_LEN;
 		if (copylen > good_linear)
 			copylen = good_linear;
 		linear = copylen;
-		if (iov_pages(iv, vnet_hdr_len + copylen, count)
-		    <= MAX_SKB_FRAGS)
+		i = *from;
+		iov_iter_advance(&i, copylen);
+		if (iov_iter_npages(&i, INT_MAX) <= MAX_SKB_FRAGS)
 			zerocopy = true;
 	}
 
@@ -708,10 +709,9 @@ static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
 		goto err;
 
 	if (zerocopy)
-		err = zerocopy_sg_from_iovec(skb, iv, vnet_hdr_len, count);
+		err = zerocopy_sg_from_iter(skb, from);
 	else {
-		err = skb_copy_datagram_from_iovec(skb, 0, iv, vnet_hdr_len,
-						   len);
+		err = skb_copy_datagram_from_iter(skb, 0, from, len);
 		if (!err && m && m->msg_control) {
 			struct ubuf_info *uarg = m->msg_control;
 			uarg->callback(uarg, false);
@@ -764,16 +764,12 @@ err:
 	return err;
 }
 
-static ssize_t macvtap_aio_write(struct kiocb *iocb, const struct iovec *iv,
-				 unsigned long count, loff_t pos)
+static ssize_t macvtap_write_iter(struct kiocb *iocb, struct iov_iter *from)
 {
 	struct file *file = iocb->ki_filp;
-	ssize_t result = -ENOLINK;
 	struct macvtap_queue *q = file->private_data;
 
-	result = macvtap_get_user(q, NULL, iv, iov_length(iv, count), count,
-				  file->f_flags & O_NONBLOCK);
-	return result;
+	return macvtap_get_user(q, NULL, from, file->f_flags & O_NONBLOCK);
 }
 
 /* Put packet to the user space buffer */
@@ -1081,8 +1077,9 @@ static const struct file_operations macvtap_fops = {
 	.open		= macvtap_open,
 	.release	= macvtap_release,
 	.read		= new_sync_read,
+	.write		= new_sync_write,
 	.read_iter	= macvtap_read_iter,
-	.aio_write	= macvtap_aio_write,
+	.write_iter	= macvtap_write_iter,
 	.poll		= macvtap_poll,
 	.llseek		= no_llseek,
 	.unlocked_ioctl	= macvtap_ioctl,
@@ -1095,8 +1092,9 @@ static int macvtap_sendmsg(struct kiocb *iocb, struct socket *sock,
 			   struct msghdr *m, size_t total_len)
 {
 	struct macvtap_queue *q = container_of(sock, struct macvtap_queue, sock);
-	return macvtap_get_user(q, m, m->msg_iov, total_len, m->msg_iovlen,
-			    m->msg_flags & MSG_DONTWAIT);
+	struct iov_iter from;
+	iov_iter_init(&from, WRITE, m->msg_iov, m->msg_iovlen, total_len);
+	return macvtap_get_user(q, m, &from, m->msg_flags & MSG_DONTWAIT);
 }
 
 static int macvtap_recvmsg(struct kiocb *iocb, struct socket *sock,
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 405dfdf..4b743c6 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1012,28 +1012,29 @@ static struct sk_buff *tun_alloc_skb(struct tun_file *tfile,
 
 /* Get packet from user space buffer */
 static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
-			    void *msg_control, const struct iovec *iv,
-			    size_t total_len, size_t count, int noblock)
+			    void *msg_control, struct iov_iter *from,
+			    int noblock)
 {
 	struct tun_pi pi = { 0, cpu_to_be16(ETH_P_IP) };
 	struct sk_buff *skb;
+	size_t total_len = iov_iter_count(from);
 	size_t len = total_len, align = NET_SKB_PAD, linear;
 	struct virtio_net_hdr gso = { 0 };
 	int good_linear;
-	int offset = 0;
 	int copylen;
 	bool zerocopy = false;
 	int err;
 	u32 rxhash;
+	ssize_t n;
 
 	if (!(tun->flags & TUN_NO_PI)) {
 		if (len < sizeof(pi))
 			return -EINVAL;
 		len -= sizeof(pi);
 
-		if (memcpy_fromiovecend((void *)&pi, iv, 0, sizeof(pi)))
+		n = copy_from_iter(&pi, sizeof(pi), from);
+		if (n != sizeof(pi))
 			return -EFAULT;
-		offset += sizeof(pi);
 	}
 
 	if (tun->flags & TUN_VNET_HDR) {
@@ -1041,7 +1042,8 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
 			return -EINVAL;
 		len -= tun->vnet_hdr_sz;
 
-		if (memcpy_fromiovecend((void *)&gso, iv, offset, sizeof(gso)))
+		n = copy_from_iter(&gso, sizeof(gso), from);
+		if (n != sizeof(gso))
 			return -EFAULT;
 
 		if ((gso.flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) &&
@@ -1050,7 +1052,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
 
 		if (gso.hdr_len > len)
 			return -EINVAL;
-		offset += tun->vnet_hdr_sz;
+		iov_iter_advance(from, tun->vnet_hdr_sz);
 	}
 
 	if ((tun->flags & TUN_TYPE_MASK) == TUN_TAP_DEV) {
@@ -1063,6 +1065,8 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
 	good_linear = SKB_MAX_HEAD(align);
 
 	if (msg_control) {
+		struct iov_iter i = *from;
+
 		/* There are 256 bytes to be copied in skb, so there is
 		 * enough room for skb expand head in case it is used.
 		 * The rest of the buffer is mapped from userspace.
@@ -1071,7 +1075,8 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
 		if (copylen > good_linear)
 			copylen = good_linear;
 		linear = copylen;
-		if (iov_pages(iv, offset + copylen, count) <= MAX_SKB_FRAGS)
+		iov_iter_advance(&i, copylen);
+		if (iov_iter_npages(&i, INT_MAX) <= MAX_SKB_FRAGS)
 			zerocopy = true;
 	}
 
@@ -1091,9 +1096,9 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
 	}
 
 	if (zerocopy)
-		err = zerocopy_sg_from_iovec(skb, iv, offset, count);
+		err = zerocopy_sg_from_iter(skb, from);
 	else {
-		err = skb_copy_datagram_from_iovec(skb, 0, iv, offset, len);
+		err = skb_copy_datagram_from_iter(skb, 0, from, len);
 		if (!err && msg_control) {
 			struct ubuf_info *uarg = msg_control;
 			uarg->callback(uarg, false);
@@ -1207,8 +1212,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
 	return total_len;
 }
 
-static ssize_t tun_chr_aio_write(struct kiocb *iocb, const struct iovec *iv,
-			      unsigned long count, loff_t pos)
+static ssize_t tun_chr_write_iter(struct kiocb *iocb, struct iov_iter *from)
 {
 	struct file *file = iocb->ki_filp;
 	struct tun_struct *tun = tun_get(file);
@@ -1218,10 +1222,7 @@ static ssize_t tun_chr_aio_write(struct kiocb *iocb, const struct iovec *iv,
 	if (!tun)
 		return -EBADFD;
 
-	tun_debug(KERN_INFO, tun, "tun_chr_write %ld\n", count);
-
-	result = tun_get_user(tun, tfile, NULL, iv, iov_length(iv, count),
-			      count, file->f_flags & O_NONBLOCK);
+	result = tun_get_user(tun, tfile, NULL, from, file->f_flags & O_NONBLOCK);
 
 	tun_put(tun);
 	return result;
@@ -1445,11 +1446,14 @@ static int tun_sendmsg(struct kiocb *iocb, struct socket *sock,
 	int ret;
 	struct tun_file *tfile = container_of(sock, struct tun_file, socket);
 	struct tun_struct *tun = __tun_get(tfile);
+	struct iov_iter from;
 
 	if (!tun)
 		return -EBADFD;
-	ret = tun_get_user(tun, tfile, m->msg_control, m->msg_iov, total_len,
-			   m->msg_iovlen, m->msg_flags & MSG_DONTWAIT);
+
+	iov_iter_init(&from, WRITE, m->msg_iov, m->msg_iovlen, total_len);
+	ret = tun_get_user(tun, tfile, m->msg_control, &from,
+			   m->msg_flags & MSG_DONTWAIT);
 	tun_put(tun);
 	return ret;
 }
@@ -2233,9 +2237,9 @@ static const struct file_operations tun_fops = {
 	.owner	= THIS_MODULE,
 	.llseek = no_llseek,
 	.read  = new_sync_read,
+	.write = new_sync_write,
 	.read_iter  = tun_chr_read_iter,
-	.write = do_sync_write,
-	.aio_write = tun_chr_aio_write,
+	.write_iter = tun_chr_write_iter,
 	.poll	= tun_chr_poll,
 	.unlocked_ioctl	= tun_chr_ioctl,
 #ifdef CONFIG_COMPAT
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH v2 09/17] kill zerocopy_sg_from_iovec()
  2014-11-25  2:40                         ` Al Viro
                                             ` (7 preceding siblings ...)
  2014-11-25 14:02                           ` [PATCH v2 08/17] {macvtap,tun}_get_user(): switch to iov_iter Al Viro
@ 2014-11-25 14:02                           ` Al Viro
  2014-11-25 14:02                           ` [PATCH v2 10/17] switch AF_PACKET and AF_UNIX to skb_copy_datagram_from_iter() Al Viro
                                             ` (7 subsequent siblings)
  16 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-11-25 14:02 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

no users left

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 include/linux/skbuff.h |    2 --
 net/core/datagram.c    |   65 ++----------------------------------------------
 2 files changed, 2 insertions(+), 65 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index a01cd9a..178cdbd 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2661,8 +2661,6 @@ int skb_copy_datagram_from_iovec(struct sk_buff *skb, int offset,
 				 int len);
 int skb_copy_datagram_from_iter(struct sk_buff *skb, int offset,
 				 struct iov_iter *from, int len);
-int zerocopy_sg_from_iovec(struct sk_buff *skb, const struct iovec *frm,
-			   int offset, size_t count);
 int skb_copy_datagram_iter(const struct sk_buff *from, int offset,
 			   struct iov_iter *to, int size);
 int zerocopy_sg_from_iter(struct sk_buff *skb, struct iov_iter *frm);
diff --git a/net/core/datagram.c b/net/core/datagram.c
index 3ea0384..c4d832e 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -645,76 +645,15 @@ fault:
 EXPORT_SYMBOL(skb_copy_datagram_from_iter);
 
 /**
- *	zerocopy_sg_from_iovec - Build a zerocopy datagram from an iovec
+ *	zerocopy_sg_from_iter - Build a zerocopy datagram from an iov_iter
  *	@skb: buffer to copy
- *	@from: io vector to copy from
- *	@offset: offset in the io vector to start copying from
- *	@count: amount of vectors to copy to buffer from
+ *	@from: the source to copy from
  *
  *	The function will first copy up to headlen, and then pin the userspace
  *	pages and build frags through them.
  *
  *	Returns 0, -EFAULT or -EMSGSIZE.
- *	Note: the iovec is not modified during the copy
  */
-int zerocopy_sg_from_iovec(struct sk_buff *skb, const struct iovec *from,
-				  int offset, size_t count)
-{
-	int len = iov_length(from, count) - offset;
-	int copy = min_t(int, skb_headlen(skb), len);
-	int size;
-	int i = 0;
-
-	/* copy up to skb headlen */
-	if (skb_copy_datagram_from_iovec(skb, 0, from, offset, copy))
-		return -EFAULT;
-
-	if (len == copy)
-		return 0;
-
-	offset += copy;
-	while (count--) {
-		struct page *page[MAX_SKB_FRAGS];
-		int num_pages;
-		unsigned long base;
-		unsigned long truesize;
-
-		/* Skip over from offset and copied */
-		if (offset >= from->iov_len) {
-			offset -= from->iov_len;
-			++from;
-			continue;
-		}
-		len = from->iov_len - offset;
-		base = (unsigned long)from->iov_base + offset;
-		size = ((base & ~PAGE_MASK) + len + ~PAGE_MASK) >> PAGE_SHIFT;
-		if (i + size > MAX_SKB_FRAGS)
-			return -EMSGSIZE;
-		num_pages = get_user_pages_fast(base, size, 0, &page[i]);
-		if (num_pages != size) {
-			release_pages(&page[i], num_pages, 0);
-			return -EFAULT;
-		}
-		truesize = size * PAGE_SIZE;
-		skb->data_len += len;
-		skb->len += len;
-		skb->truesize += truesize;
-		atomic_add(truesize, &skb->sk->sk_wmem_alloc);
-		while (len) {
-			int off = base & ~PAGE_MASK;
-			int size = min_t(int, len, PAGE_SIZE - off);
-			skb_fill_page_desc(skb, i, page[i], off, size);
-			base += size;
-			len -= size;
-			i++;
-		}
-		offset = 0;
-		++from;
-	}
-	return 0;
-}
-EXPORT_SYMBOL(zerocopy_sg_from_iovec);
-
 int zerocopy_sg_from_iter(struct sk_buff *skb, struct iov_iter *from)
 {
 	int len = iov_iter_count(from);
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH v2 10/17] switch AF_PACKET and AF_UNIX to skb_copy_datagram_from_iter()
  2014-11-25  2:40                         ` Al Viro
                                             ` (8 preceding siblings ...)
  2014-11-25 14:02                           ` [PATCH v2 09/17] kill zerocopy_sg_from_iovec() Al Viro
@ 2014-11-25 14:02                           ` Al Viro
  2014-11-25 14:02                           ` [PATCH v2 11/17] switch sctp_user_addto_chunk() and sctp_datamsg_from_user() to passing iov_iter Al Viro
                                             ` (6 subsequent siblings)
  16 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-11-25 14:02 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

... and kill skb_copy_datagram_iovec()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 include/linux/skbuff.h |    3 --
 net/core/datagram.c    |   88 ++----------------------------------------------
 net/packet/af_packet.c |   11 ++++--
 net/unix/af_unix.c     |   11 ++++--
 4 files changed, 18 insertions(+), 95 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 178cdbd..7691ad5 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2656,9 +2656,6 @@ static inline int skb_copy_and_csum_datagram_msg(struct sk_buff *skb, int hlen,
 {
 	return skb_copy_and_csum_datagram_iovec(skb, hlen, msg->msg_iov);
 }
-int skb_copy_datagram_from_iovec(struct sk_buff *skb, int offset,
-				 const struct iovec *from, int from_offset,
-				 int len);
 int skb_copy_datagram_from_iter(struct sk_buff *skb, int offset,
 				 struct iov_iter *from, int len);
 int skb_copy_datagram_iter(const struct sk_buff *from, int offset,
diff --git a/net/core/datagram.c b/net/core/datagram.c
index c4d832e..b6e303b 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -480,98 +480,14 @@ short_copy:
 EXPORT_SYMBOL(skb_copy_datagram_iter);
 
 /**
- *	skb_copy_datagram_from_iovec - Copy a datagram from an iovec.
+ *	skb_copy_datagram_from_iter - Copy a datagram from an iov_iter.
  *	@skb: buffer to copy
  *	@offset: offset in the buffer to start copying to
- *	@from: io vector to copy to
- *	@from_offset: offset in the io vector to start copying from
+ *	@from: the copy source
  *	@len: amount of data to copy to buffer from iovec
  *
  *	Returns 0 or -EFAULT.
- *	Note: the iovec is not modified during the copy.
  */
-int skb_copy_datagram_from_iovec(struct sk_buff *skb, int offset,
-				 const struct iovec *from, int from_offset,
-				 int len)
-{
-	int start = skb_headlen(skb);
-	int i, copy = start - offset;
-	struct sk_buff *frag_iter;
-
-	/* Copy header. */
-	if (copy > 0) {
-		if (copy > len)
-			copy = len;
-		if (memcpy_fromiovecend(skb->data + offset, from, from_offset,
-					copy))
-			goto fault;
-		if ((len -= copy) == 0)
-			return 0;
-		offset += copy;
-		from_offset += copy;
-	}
-
-	/* Copy paged appendix. Hmm... why does this look so complicated? */
-	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
-		int end;
-		const skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
-
-		WARN_ON(start > offset + len);
-
-		end = start + skb_frag_size(frag);
-		if ((copy = end - offset) > 0) {
-			int err;
-			u8  *vaddr;
-			struct page *page = skb_frag_page(frag);
-
-			if (copy > len)
-				copy = len;
-			vaddr = kmap(page);
-			err = memcpy_fromiovecend(vaddr + frag->page_offset +
-						  offset - start,
-						  from, from_offset, copy);
-			kunmap(page);
-			if (err)
-				goto fault;
-
-			if (!(len -= copy))
-				return 0;
-			offset += copy;
-			from_offset += copy;
-		}
-		start = end;
-	}
-
-	skb_walk_frags(skb, frag_iter) {
-		int end;
-
-		WARN_ON(start > offset + len);
-
-		end = start + frag_iter->len;
-		if ((copy = end - offset) > 0) {
-			if (copy > len)
-				copy = len;
-			if (skb_copy_datagram_from_iovec(frag_iter,
-							 offset - start,
-							 from,
-							 from_offset,
-							 copy))
-				goto fault;
-			if ((len -= copy) == 0)
-				return 0;
-			offset += copy;
-			from_offset += copy;
-		}
-		start = end;
-	}
-	if (!len)
-		return 0;
-
-fault:
-	return -EFAULT;
-}
-EXPORT_SYMBOL(skb_copy_datagram_from_iovec);
-
 int skb_copy_datagram_from_iter(struct sk_buff *skb, int offset,
 				 struct iov_iter *from,
 				 int len)
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 108d7f3..dfb148e 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2408,6 +2408,10 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
 	unsigned short gso_type = 0;
 	int hlen, tlen;
 	int extra_len = 0;
+	struct iov_iter from;
+	ssize_t n;
+
+	iov_iter_init(&from, WRITE, msg->msg_iov, msg->msg_iovlen, len);
 
 	/*
 	 *	Get and verify the address.
@@ -2446,8 +2450,9 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
 
 		len -= vnet_hdr_len;
 
-		err = memcpy_from_msg((void *)&vnet_hdr, msg, vnet_hdr_len);
-		if (err < 0)
+		err = -EFAULT;
+		n = copy_from_iter(&vnet_hdr, vnet_hdr_len, &from);
+		if (n != vnet_hdr_len)
 			goto out_unlock;
 
 		if ((vnet_hdr.flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) &&
@@ -2517,7 +2522,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
 	}
 
 	/* Returns -EFAULT on error */
-	err = skb_copy_datagram_from_iovec(skb, offset, msg->msg_iov, 0, len);
+	err = skb_copy_datagram_from_iter(skb, offset, &from, len);
 	if (err)
 		goto out_free;
 
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 5eee625..4450d62 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1459,6 +1459,9 @@ static int unix_dgram_sendmsg(struct kiocb *kiocb, struct socket *sock,
 	struct scm_cookie tmp_scm;
 	int max_level;
 	int data_len = 0;
+	struct iov_iter from;
+
+	iov_iter_init(&from, WRITE, msg->msg_iov, msg->msg_iovlen, len);
 
 	if (NULL == siocb->scm)
 		siocb->scm = &tmp_scm;
@@ -1516,7 +1519,7 @@ static int unix_dgram_sendmsg(struct kiocb *kiocb, struct socket *sock,
 	skb_put(skb, len - data_len);
 	skb->data_len = data_len;
 	skb->len = len;
-	err = skb_copy_datagram_from_iovec(skb, 0, msg->msg_iov, 0, len);
+	err = skb_copy_datagram_from_iter(skb, 0, &from, len);
 	if (err)
 		goto out_free;
 
@@ -1638,6 +1641,9 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
 	bool fds_sent = false;
 	int max_level;
 	int data_len;
+	struct iov_iter from;
+
+	iov_iter_init(&from, WRITE, msg->msg_iov, msg->msg_iovlen, len);
 
 	if (NULL == siocb->scm)
 		siocb->scm = &tmp_scm;
@@ -1694,8 +1700,7 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
 		skb_put(skb, size - data_len);
 		skb->data_len = data_len;
 		skb->len = size;
-		err = skb_copy_datagram_from_iovec(skb, 0, msg->msg_iov,
-						   sent, size);
+		err = skb_copy_datagram_from_iter(skb, 0, &from, size);
 		if (err) {
 			kfree_skb(skb);
 			goto out_err;
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH v2 11/17] switch sctp_user_addto_chunk() and sctp_datamsg_from_user() to passing iov_iter
  2014-11-25  2:40                         ` Al Viro
                                             ` (9 preceding siblings ...)
  2014-11-25 14:02                           ` [PATCH v2 10/17] switch AF_PACKET and AF_UNIX to skb_copy_datagram_from_iter() Al Viro
@ 2014-11-25 14:02                           ` Al Viro
  2014-11-25 14:02                           ` [PATCH v2 12/17] tipc_sendmsg(): pass msghdr instead of its ->msg_iov Al Viro
                                             ` (5 subsequent siblings)
  16 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-11-25 14:02 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 include/net/sctp/structs.h |    6 +++---
 net/sctp/chunk.c           |    9 ++++-----
 net/sctp/sm_make_chunk.c   |   16 ++++++++--------
 net/sctp/socket.c          |    5 ++++-
 4 files changed, 19 insertions(+), 17 deletions(-)

diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index 806e3b5..2bb2fcf 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -531,7 +531,7 @@ struct sctp_datamsg {
 
 struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *,
 					    struct sctp_sndrcvinfo *,
-					    struct msghdr *, int len);
+					    struct iov_iter *);
 void sctp_datamsg_free(struct sctp_datamsg *);
 void sctp_datamsg_put(struct sctp_datamsg *);
 void sctp_chunk_fail(struct sctp_chunk *, int error);
@@ -647,8 +647,8 @@ struct sctp_chunk {
 
 void sctp_chunk_hold(struct sctp_chunk *);
 void sctp_chunk_put(struct sctp_chunk *);
-int sctp_user_addto_chunk(struct sctp_chunk *chunk, int off, int len,
-			  struct iovec *data);
+int sctp_user_addto_chunk(struct sctp_chunk *chunk, int len,
+			  struct iov_iter *from);
 void sctp_chunk_free(struct sctp_chunk *);
 void  *sctp_addto_chunk(struct sctp_chunk *, int len, const void *data);
 struct sctp_chunk *sctp_chunkify(struct sk_buff *,
diff --git a/net/sctp/chunk.c b/net/sctp/chunk.c
index 158701d..a338091 100644
--- a/net/sctp/chunk.c
+++ b/net/sctp/chunk.c
@@ -164,7 +164,7 @@ static void sctp_datamsg_assign(struct sctp_datamsg *msg, struct sctp_chunk *chu
  */
 struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *asoc,
 					    struct sctp_sndrcvinfo *sinfo,
-					    struct msghdr *msgh, int msg_len)
+					    struct iov_iter *from)
 {
 	int max, whole, i, offset, over, err;
 	int len, first_len;
@@ -172,6 +172,7 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *asoc,
 	struct sctp_chunk *chunk;
 	struct sctp_datamsg *msg;
 	struct list_head *pos, *temp;
+	size_t msg_len = iov_iter_count(from);
 	__u8 frag;
 
 	msg = sctp_datamsg_new(GFP_KERNEL);
@@ -279,12 +280,10 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *asoc,
 			goto errout;
 		}
 
-		err = sctp_user_addto_chunk(chunk, offset, len, msgh->msg_iov);
+		err = sctp_user_addto_chunk(chunk, len, from);
 		if (err < 0)
 			goto errout_chunk_free;
 
-		offset += len;
-
 		/* Put the chunk->skb back into the form expected by send.  */
 		__skb_pull(chunk->skb, (__u8 *)chunk->chunk_hdr
 			   - (__u8 *)chunk->skb->data);
@@ -317,7 +316,7 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *asoc,
 			goto errout;
 		}
 
-		err = sctp_user_addto_chunk(chunk, offset, over, msgh->msg_iov);
+		err = sctp_user_addto_chunk(chunk, over, from);
 
 		/* Put the chunk->skb back into the form expected by send.  */
 		__skb_pull(chunk->skb, (__u8 *)chunk->chunk_hdr
diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
index e49bcce..e49e231 100644
--- a/net/sctp/sm_make_chunk.c
+++ b/net/sctp/sm_make_chunk.c
@@ -1491,26 +1491,26 @@ static void *sctp_addto_chunk_fixed(struct sctp_chunk *chunk,
  * chunk is not big enough.
  * Returns a kernel err value.
  */
-int sctp_user_addto_chunk(struct sctp_chunk *chunk, int off, int len,
-			  struct iovec *data)
+int sctp_user_addto_chunk(struct sctp_chunk *chunk, int len,
+			  struct iov_iter *from)
 {
-	__u8 *target;
-	int err = 0;
+	void *target;
+	ssize_t copied;
 
 	/* Make room in chunk for data.  */
 	target = skb_put(chunk->skb, len);
 
 	/* Copy data (whole iovec) into chunk */
-	if ((err = memcpy_fromiovecend(target, data, off, len)))
-		goto out;
+	copied = copy_from_iter(target, len, from);
+	if (copied != len)
+		return -EFAULT;
 
 	/* Adjust the chunk length field.  */
 	chunk->chunk_hdr->length =
 		htons(ntohs(chunk->chunk_hdr->length) + len);
 	chunk->chunk_end = skb_tail_pointer(chunk->skb);
 
-out:
-	return err;
+	return 0;
 }
 
 /* Helper function to assign a TSN if needed.  This assumes that both
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 85e0b65..0397ac9 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -1609,6 +1609,9 @@ static int sctp_sendmsg(struct kiocb *iocb, struct sock *sk,
 	__u16 sinfo_flags = 0;
 	long timeo;
 	int err;
+	struct iov_iter from;
+
+	iov_iter_init(&from, WRITE, msg->msg_iov, msg->msg_iovlen, msg_len);
 
 	err = 0;
 	sp = sctp_sk(sk);
@@ -1947,7 +1950,7 @@ static int sctp_sendmsg(struct kiocb *iocb, struct sock *sk,
 	}
 
 	/* Break the message into multiple chunks of maximum size. */
-	datamsg = sctp_datamsg_from_user(asoc, sinfo, msg, msg_len);
+	datamsg = sctp_datamsg_from_user(asoc, sinfo, &from);
 	if (IS_ERR(datamsg)) {
 		err = PTR_ERR(datamsg);
 		goto out_free;
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH v2 12/17] tipc_sendmsg(): pass msghdr instead of its ->msg_iov
  2014-11-25  2:40                         ` Al Viro
                                             ` (10 preceding siblings ...)
  2014-11-25 14:02                           ` [PATCH v2 11/17] switch sctp_user_addto_chunk() and sctp_datamsg_from_user() to passing iov_iter Al Viro
@ 2014-11-25 14:02                           ` Al Viro
  2014-11-25 14:02                           ` [PATCH v2 13/17] tipc_msg_build(): " Al Viro
                                             ` (4 subsequent siblings)
  16 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-11-25 14:02 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 net/tipc/socket.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index e918091..2d4b2fa 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -700,7 +700,7 @@ static unsigned int tipc_poll(struct file *file, struct socket *sock,
  * tipc_sendmcast - send multicast message
  * @sock: socket structure
  * @seq: destination address
- * @iov: message data to send
+ * @msg: message to send
  * @dsz: total length of message data
  * @timeo: timeout to wait for wakeup
  *
@@ -708,7 +708,7 @@ static unsigned int tipc_poll(struct file *file, struct socket *sock,
  * Returns the number of bytes sent on success, or errno
  */
 static int tipc_sendmcast(struct  socket *sock, struct tipc_name_seq *seq,
-			  struct iovec *iov, size_t dsz, long timeo)
+			  struct msghdr *msg, size_t dsz, long timeo)
 {
 	struct sock *sk = sock->sk;
 	struct tipc_msg *mhdr = &tipc_sk(sk)->phdr;
@@ -727,7 +727,7 @@ static int tipc_sendmcast(struct  socket *sock, struct tipc_name_seq *seq,
 
 new_mtu:
 	mtu = tipc_bclink_get_mtu();
-	rc = tipc_msg_build(mhdr, iov, 0, dsz, mtu, &buf);
+	rc = tipc_msg_build(mhdr, msg->msg_iov, 0, dsz, mtu, &buf);
 	if (unlikely(rc < 0))
 		return rc;
 
@@ -951,7 +951,7 @@ static int tipc_sendmsg(struct kiocb *iocb, struct socket *sock,
 	timeo = sock_sndtimeo(sk, m->msg_flags & MSG_DONTWAIT);
 
 	if (dest->addrtype == TIPC_ADDR_MCAST) {
-		rc = tipc_sendmcast(sock, seq, iov, dsz, timeo);
+		rc = tipc_sendmcast(sock, seq, m, dsz, timeo);
 		goto exit;
 	} else if (dest->addrtype == TIPC_ADDR_NAME) {
 		u32 type = dest->addr.name.name.type;
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH v2 13/17] tipc_msg_build(): pass msghdr instead of its ->msg_iov
  2014-11-25  2:40                         ` Al Viro
                                             ` (11 preceding siblings ...)
  2014-11-25 14:02                           ` [PATCH v2 12/17] tipc_sendmsg(): pass msghdr instead of its ->msg_iov Al Viro
@ 2014-11-25 14:02                           ` Al Viro
  2014-11-25 14:02                           ` [PATCH v2 14/17] vmci_transport: switch ->enqeue_dgram, ->enqueue_stream and ->dequeue_stream to msghdr Al Viro
                                             ` (3 subsequent siblings)
  16 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-11-25 14:02 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 net/tipc/msg.c    |    8 ++++----
 net/tipc/msg.h    |    2 +-
 net/tipc/socket.c |    7 +++----
 3 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/net/tipc/msg.c b/net/tipc/msg.c
index ec18076..9155496 100644
--- a/net/tipc/msg.c
+++ b/net/tipc/msg.c
@@ -162,14 +162,14 @@ err:
 /**
  * tipc_msg_build - create buffer chain containing specified header and data
  * @mhdr: Message header, to be prepended to data
- * @iov: User data
+ * @m: User message
  * @offset: Posision in iov to start copying from
  * @dsz: Total length of user data
  * @pktmax: Max packet size that can be used
  * @chain: Buffer or chain of buffers to be returned to caller
  * Returns message data size or errno: -ENOMEM, -EFAULT
  */
-int tipc_msg_build(struct tipc_msg *mhdr, struct iovec const *iov,
+int tipc_msg_build(struct tipc_msg *mhdr, struct msghdr *m,
 		   int offset, int dsz, int pktmax , struct sk_buff **chain)
 {
 	int mhsz = msg_hdr_sz(mhdr);
@@ -194,7 +194,7 @@ int tipc_msg_build(struct tipc_msg *mhdr, struct iovec const *iov,
 		skb_copy_to_linear_data(buf, mhdr, mhsz);
 		pktpos = buf->data + mhsz;
 		TIPC_SKB_CB(buf)->chain_sz = 1;
-		if (!dsz || !memcpy_fromiovecend(pktpos, iov, offset, dsz))
+		if (!dsz || !memcpy_fromiovecend(pktpos, m->msg_iov, offset, dsz))
 			return dsz;
 		rc = -EFAULT;
 		goto error;
@@ -223,7 +223,7 @@ int tipc_msg_build(struct tipc_msg *mhdr, struct iovec const *iov,
 		if (drem < pktrem)
 			pktrem = drem;
 
-		if (memcpy_fromiovecend(pktpos, iov, offset, pktrem)) {
+		if (memcpy_fromiovecend(pktpos, m->msg_iov, offset, pktrem)) {
 			rc = -EFAULT;
 			goto error;
 		}
diff --git a/net/tipc/msg.h b/net/tipc/msg.h
index 0ea7b69..d7d2ba2 100644
--- a/net/tipc/msg.h
+++ b/net/tipc/msg.h
@@ -743,7 +743,7 @@ bool tipc_msg_bundle(struct sk_buff *bbuf, struct sk_buff *buf, u32 mtu);
 
 bool tipc_msg_make_bundle(struct sk_buff **buf, u32 mtu, u32 dnode);
 
-int tipc_msg_build(struct tipc_msg *mhdr, struct iovec const *iov,
+int tipc_msg_build(struct tipc_msg *mhdr, struct msghdr *m,
 		   int offset, int dsz, int mtu , struct sk_buff **chain);
 
 struct sk_buff *tipc_msg_reassemble(struct sk_buff *chain);
diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 2d4b2fa..b3415a7 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -727,7 +727,7 @@ static int tipc_sendmcast(struct  socket *sock, struct tipc_name_seq *seq,
 
 new_mtu:
 	mtu = tipc_bclink_get_mtu();
-	rc = tipc_msg_build(mhdr, msg->msg_iov, 0, dsz, mtu, &buf);
+	rc = tipc_msg_build(mhdr, msg, 0, dsz, mtu, &buf);
 	if (unlikely(rc < 0))
 		return rc;
 
@@ -905,7 +905,6 @@ static int tipc_sendmsg(struct kiocb *iocb, struct socket *sock,
 	struct sock *sk = sock->sk;
 	struct tipc_sock *tsk = tipc_sk(sk);
 	struct tipc_msg *mhdr = &tsk->phdr;
-	struct iovec *iov = m->msg_iov;
 	u32 dnode, dport;
 	struct sk_buff *buf;
 	struct tipc_name_seq *seq = &dest->addr.nameseq;
@@ -982,7 +981,7 @@ static int tipc_sendmsg(struct kiocb *iocb, struct socket *sock,
 
 new_mtu:
 	mtu = tipc_node_get_mtu(dnode, tsk->ref);
-	rc = tipc_msg_build(mhdr, iov, 0, dsz, mtu, &buf);
+	rc = tipc_msg_build(mhdr, m, 0, dsz, mtu, &buf);
 	if (rc < 0)
 		goto exit;
 
@@ -1094,7 +1093,7 @@ static int tipc_send_stream(struct kiocb *iocb, struct socket *sock,
 next:
 	mtu = tsk->max_pkt;
 	send = min_t(uint, dsz - sent, TIPC_MAX_USER_MSG_SIZE);
-	rc = tipc_msg_build(mhdr, m->msg_iov, sent, send, mtu, &buf);
+	rc = tipc_msg_build(mhdr, m, sent, send, mtu, &buf);
 	if (unlikely(rc < 0))
 		goto exit;
 	do {
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH v2 14/17] vmci_transport: switch ->enqeue_dgram, ->enqueue_stream and ->dequeue_stream to msghdr
  2014-11-25  2:40                         ` Al Viro
                                             ` (12 preceding siblings ...)
  2014-11-25 14:02                           ` [PATCH v2 13/17] tipc_msg_build(): " Al Viro
@ 2014-11-25 14:02                           ` Al Viro
  2014-11-25 14:02                           ` [PATCH v2 15/17] [atm] switch vcc_sendmsg() to copy_from_iter() Al Viro
                                             ` (2 subsequent siblings)
  16 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-11-25 14:02 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 include/net/af_vsock.h         |    6 +++---
 net/vmw_vsock/af_vsock.c       |    6 +++---
 net/vmw_vsock/vmci_transport.c |   14 +++++++-------
 3 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
index 4282778..0d87674 100644
--- a/include/net/af_vsock.h
+++ b/include/net/af_vsock.h
@@ -103,14 +103,14 @@ struct vsock_transport {
 	int (*dgram_dequeue)(struct kiocb *kiocb, struct vsock_sock *vsk,
 			     struct msghdr *msg, size_t len, int flags);
 	int (*dgram_enqueue)(struct vsock_sock *, struct sockaddr_vm *,
-			     struct iovec *, size_t len);
+			     struct msghdr *, size_t len);
 	bool (*dgram_allow)(u32 cid, u32 port);
 
 	/* STREAM. */
 	/* TODO: stream_bind() */
-	ssize_t (*stream_dequeue)(struct vsock_sock *, struct iovec *,
+	ssize_t (*stream_dequeue)(struct vsock_sock *, struct msghdr *,
 				  size_t len, int flags);
-	ssize_t (*stream_enqueue)(struct vsock_sock *, struct iovec *,
+	ssize_t (*stream_enqueue)(struct vsock_sock *, struct msghdr *,
 				  size_t len);
 	s64 (*stream_has_data)(struct vsock_sock *);
 	s64 (*stream_has_space)(struct vsock_sock *);
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 85d232b..1d0e39c 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1013,7 +1013,7 @@ static int vsock_dgram_sendmsg(struct kiocb *kiocb, struct socket *sock,
 		goto out;
 	}
 
-	err = transport->dgram_enqueue(vsk, remote_addr, msg->msg_iov, len);
+	err = transport->dgram_enqueue(vsk, remote_addr, msg, len);
 
 out:
 	release_sock(sk);
@@ -1617,7 +1617,7 @@ static int vsock_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
 		 */
 
 		written = transport->stream_enqueue(
-				vsk, msg->msg_iov,
+				vsk, msg,
 				len - total_written);
 		if (written < 0) {
 			err = -ENOMEM;
@@ -1739,7 +1739,7 @@ vsock_stream_recvmsg(struct kiocb *kiocb,
 				break;
 
 			read = transport->stream_dequeue(
-					vsk, msg->msg_iov,
+					vsk, msg,
 					len - copied, flags);
 			if (read < 0) {
 				err = -ENOMEM;
diff --git a/net/vmw_vsock/vmci_transport.c b/net/vmw_vsock/vmci_transport.c
index a57ddef..c1c0389 100644
--- a/net/vmw_vsock/vmci_transport.c
+++ b/net/vmw_vsock/vmci_transport.c
@@ -1697,7 +1697,7 @@ static int vmci_transport_dgram_bind(struct vsock_sock *vsk,
 static int vmci_transport_dgram_enqueue(
 	struct vsock_sock *vsk,
 	struct sockaddr_vm *remote_addr,
-	struct iovec *iov,
+	struct msghdr *msg,
 	size_t len)
 {
 	int err;
@@ -1714,7 +1714,7 @@ static int vmci_transport_dgram_enqueue(
 	if (!dg)
 		return -ENOMEM;
 
-	memcpy_fromiovec(VMCI_DG_PAYLOAD(dg), iov, len);
+	memcpy_from_msg(VMCI_DG_PAYLOAD(dg), msg, len);
 
 	dg->dst = vmci_make_handle(remote_addr->svm_cid,
 				   remote_addr->svm_port);
@@ -1835,22 +1835,22 @@ static int vmci_transport_connect(struct vsock_sock *vsk)
 
 static ssize_t vmci_transport_stream_dequeue(
 	struct vsock_sock *vsk,
-	struct iovec *iov,
+	struct msghdr *msg,
 	size_t len,
 	int flags)
 {
 	if (flags & MSG_PEEK)
-		return vmci_qpair_peekv(vmci_trans(vsk)->qpair, iov, len, 0);
+		return vmci_qpair_peekv(vmci_trans(vsk)->qpair, msg->msg_iov, len, 0);
 	else
-		return vmci_qpair_dequev(vmci_trans(vsk)->qpair, iov, len, 0);
+		return vmci_qpair_dequev(vmci_trans(vsk)->qpair, msg->msg_iov, len, 0);
 }
 
 static ssize_t vmci_transport_stream_enqueue(
 	struct vsock_sock *vsk,
-	struct iovec *iov,
+	struct msghdr *msg,
 	size_t len)
 {
-	return vmci_qpair_enquev(vmci_trans(vsk)->qpair, iov, len, 0);
+	return vmci_qpair_enquev(vmci_trans(vsk)->qpair, msg->msg_iov, len, 0);
 }
 
 static s64 vmci_transport_stream_has_data(struct vsock_sock *vsk)
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH v2 15/17] [atm] switch vcc_sendmsg() to copy_from_iter()
  2014-11-25  2:40                         ` Al Viro
                                             ` (13 preceding siblings ...)
  2014-11-25 14:02                           ` [PATCH v2 14/17] vmci_transport: switch ->enqeue_dgram, ->enqueue_stream and ->dequeue_stream to msghdr Al Viro
@ 2014-11-25 14:02                           ` Al Viro
  2014-11-25 14:02                           ` [PATCH v2 16/17] rds: switch ->inc_copy_to_user() to passing iov_iter Al Viro
  2014-11-25 14:02                           ` [PATCH v2 17/17] rds: switch rds_message_copy_from_user() to iov_iter Al Viro
  16 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-11-25 14:02 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

... and make it handle multi-segment iovecs - deals with that
"fix this later" issue for free.  A bit of shame, really - it
had been there since 2.3.15pre3 when the whole thing went into the
tree, practically a historical artefact by now...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 net/atm/common.c |   17 ++++++-----------
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/net/atm/common.c b/net/atm/common.c
index 9cd1cca..f591129 100644
--- a/net/atm/common.c
+++ b/net/atm/common.c
@@ -570,15 +570,16 @@ int vcc_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
 }
 
 int vcc_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *m,
-		size_t total_len)
+		size_t size)
 {
 	struct sock *sk = sock->sk;
 	DEFINE_WAIT(wait);
 	struct atm_vcc *vcc;
 	struct sk_buff *skb;
 	int eff, error;
-	const void __user *buff;
-	int size;
+	struct iov_iter from;
+
+	iov_iter_init(&from, WRITE, m->msg_iov, m->msg_iovlen, size);
 
 	lock_sock(sk);
 	if (sock->state != SS_CONNECTED) {
@@ -589,12 +590,6 @@ int vcc_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *m,
 		error = -EISCONN;
 		goto out;
 	}
-	if (m->msg_iovlen != 1) {
-		error = -ENOSYS; /* fix this later @@@ */
-		goto out;
-	}
-	buff = m->msg_iov->iov_base;
-	size = m->msg_iov->iov_len;
 	vcc = ATM_SD(sock);
 	if (test_bit(ATM_VF_RELEASED, &vcc->flags) ||
 	    test_bit(ATM_VF_CLOSE, &vcc->flags) ||
@@ -607,7 +602,7 @@ int vcc_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *m,
 		error = 0;
 		goto out;
 	}
-	if (size < 0 || size > vcc->qos.txtp.max_sdu) {
+	if (size > vcc->qos.txtp.max_sdu) {
 		error = -EMSGSIZE;
 		goto out;
 	}
@@ -639,7 +634,7 @@ int vcc_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *m,
 		goto out;
 	skb->dev = NULL; /* for paths shared with net_device interfaces */
 	ATM_SKB(skb)->atm_options = vcc->atm_options;
-	if (copy_from_user(skb_put(skb, size), buff, size)) {
+	if (copy_from_iter(skb_put(skb, size), size, &from) != size) {
 		kfree_skb(skb);
 		error = -EFAULT;
 		goto out;
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH v2 16/17] rds: switch ->inc_copy_to_user() to passing iov_iter
  2014-11-25  2:40                         ` Al Viro
                                             ` (14 preceding siblings ...)
  2014-11-25 14:02                           ` [PATCH v2 15/17] [atm] switch vcc_sendmsg() to copy_from_iter() Al Viro
@ 2014-11-25 14:02                           ` Al Viro
  2014-11-25 14:02                           ` [PATCH v2 17/17] rds: switch rds_message_copy_from_user() to iov_iter Al Viro
  16 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-11-25 14:02 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

instances get considerably simpler from that...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 net/rds/ib.h       |    3 +--
 net/rds/ib_recv.c  |   37 +++++++++++--------------------------
 net/rds/iw.h       |    3 +--
 net/rds/iw_recv.c  |   37 +++++++++++--------------------------
 net/rds/message.c  |   35 ++++++++---------------------------
 net/rds/rds.h      |    6 ++----
 net/rds/recv.c     |    5 +++--
 net/rds/tcp.h      |    3 +--
 net/rds/tcp_recv.c |   38 +++++++++-----------------------------
 9 files changed, 47 insertions(+), 120 deletions(-)

diff --git a/net/rds/ib.h b/net/rds/ib.h
index 7280ab8..c36d713 100644
--- a/net/rds/ib.h
+++ b/net/rds/ib.h
@@ -316,8 +316,7 @@ int rds_ib_recv_alloc_caches(struct rds_ib_connection *ic);
 void rds_ib_recv_free_caches(struct rds_ib_connection *ic);
 void rds_ib_recv_refill(struct rds_connection *conn, int prefill);
 void rds_ib_inc_free(struct rds_incoming *inc);
-int rds_ib_inc_copy_to_user(struct rds_incoming *inc, struct iovec *iov,
-			     size_t size);
+int rds_ib_inc_copy_to_user(struct rds_incoming *inc, struct iov_iter *to);
 void rds_ib_recv_cq_comp_handler(struct ib_cq *cq, void *context);
 void rds_ib_recv_tasklet_fn(unsigned long data);
 void rds_ib_recv_init_ring(struct rds_ib_connection *ic);
diff --git a/net/rds/ib_recv.c b/net/rds/ib_recv.c
index d67de45..1b981a4 100644
--- a/net/rds/ib_recv.c
+++ b/net/rds/ib_recv.c
@@ -472,15 +472,12 @@ static struct list_head *rds_ib_recv_cache_get(struct rds_ib_refill_cache *cache
 	return head;
 }
 
-int rds_ib_inc_copy_to_user(struct rds_incoming *inc, struct iovec *first_iov,
-			    size_t size)
+int rds_ib_inc_copy_to_user(struct rds_incoming *inc, struct iov_iter *to)
 {
 	struct rds_ib_incoming *ibinc;
 	struct rds_page_frag *frag;
-	struct iovec *iov = first_iov;
 	unsigned long to_copy;
 	unsigned long frag_off = 0;
-	unsigned long iov_off = 0;
 	int copied = 0;
 	int ret;
 	u32 len;
@@ -489,37 +486,25 @@ int rds_ib_inc_copy_to_user(struct rds_incoming *inc, struct iovec *first_iov,
 	frag = list_entry(ibinc->ii_frags.next, struct rds_page_frag, f_item);
 	len = be32_to_cpu(inc->i_hdr.h_len);
 
-	while (copied < size && copied < len) {
+	while (iov_iter_count(to) && copied < len) {
 		if (frag_off == RDS_FRAG_SIZE) {
 			frag = list_entry(frag->f_item.next,
 					  struct rds_page_frag, f_item);
 			frag_off = 0;
 		}
-		while (iov_off == iov->iov_len) {
-			iov_off = 0;
-			iov++;
-		}
-
-		to_copy = min(iov->iov_len - iov_off, RDS_FRAG_SIZE - frag_off);
-		to_copy = min_t(size_t, to_copy, size - copied);
+		to_copy = min_t(unsigned long, iov_iter_count(to),
+				RDS_FRAG_SIZE - frag_off);
 		to_copy = min_t(unsigned long, to_copy, len - copied);
 
-		rdsdebug("%lu bytes to user [%p, %zu] + %lu from frag "
-			 "[%p, %u] + %lu\n",
-			 to_copy, iov->iov_base, iov->iov_len, iov_off,
-			 sg_page(&frag->f_sg), frag->f_sg.offset, frag_off);
-
 		/* XXX needs + offset for multiple recvs per page */
-		ret = rds_page_copy_to_user(sg_page(&frag->f_sg),
-					    frag->f_sg.offset + frag_off,
-					    iov->iov_base + iov_off,
-					    to_copy);
-		if (ret) {
-			copied = ret;
-			break;
-		}
+		rds_stats_add(s_copy_to_user, to_copy);
+		ret = copy_page_to_iter(sg_page(&frag->f_sg),
+					frag->f_sg.offset + frag_off,
+					to_copy,
+					to);
+		if (ret != to_copy)
+			return -EFAULT;
 
-		iov_off += to_copy;
 		frag_off += to_copy;
 		copied += to_copy;
 	}
diff --git a/net/rds/iw.h b/net/rds/iw.h
index 04ce3b1..cbe6674 100644
--- a/net/rds/iw.h
+++ b/net/rds/iw.h
@@ -325,8 +325,7 @@ int rds_iw_recv(struct rds_connection *conn);
 int rds_iw_recv_refill(struct rds_connection *conn, gfp_t kptr_gfp,
 		       gfp_t page_gfp, int prefill);
 void rds_iw_inc_free(struct rds_incoming *inc);
-int rds_iw_inc_copy_to_user(struct rds_incoming *inc, struct iovec *iov,
-			     size_t size);
+int rds_iw_inc_copy_to_user(struct rds_incoming *inc, struct iov_iter *to);
 void rds_iw_recv_cq_comp_handler(struct ib_cq *cq, void *context);
 void rds_iw_recv_tasklet_fn(unsigned long data);
 void rds_iw_recv_init_ring(struct rds_iw_connection *ic);
diff --git a/net/rds/iw_recv.c b/net/rds/iw_recv.c
index aa8bf67..a66d179 100644
--- a/net/rds/iw_recv.c
+++ b/net/rds/iw_recv.c
@@ -303,15 +303,12 @@ void rds_iw_inc_free(struct rds_incoming *inc)
 	BUG_ON(atomic_read(&rds_iw_allocation) < 0);
 }
 
-int rds_iw_inc_copy_to_user(struct rds_incoming *inc, struct iovec *first_iov,
-			    size_t size)
+int rds_iw_inc_copy_to_user(struct rds_incoming *inc, struct iov_iter *to)
 {
 	struct rds_iw_incoming *iwinc;
 	struct rds_page_frag *frag;
-	struct iovec *iov = first_iov;
 	unsigned long to_copy;
 	unsigned long frag_off = 0;
-	unsigned long iov_off = 0;
 	int copied = 0;
 	int ret;
 	u32 len;
@@ -320,37 +317,25 @@ int rds_iw_inc_copy_to_user(struct rds_incoming *inc, struct iovec *first_iov,
 	frag = list_entry(iwinc->ii_frags.next, struct rds_page_frag, f_item);
 	len = be32_to_cpu(inc->i_hdr.h_len);
 
-	while (copied < size && copied < len) {
+	while (iov_iter_count(to) && copied < len) {
 		if (frag_off == RDS_FRAG_SIZE) {
 			frag = list_entry(frag->f_item.next,
 					  struct rds_page_frag, f_item);
 			frag_off = 0;
 		}
-		while (iov_off == iov->iov_len) {
-			iov_off = 0;
-			iov++;
-		}
-
-		to_copy = min(iov->iov_len - iov_off, RDS_FRAG_SIZE - frag_off);
-		to_copy = min_t(size_t, to_copy, size - copied);
+		to_copy = min_t(unsigned long, iov_iter_count(to),
+				RDS_FRAG_SIZE - frag_off);
 		to_copy = min_t(unsigned long, to_copy, len - copied);
 
-		rdsdebug("%lu bytes to user [%p, %zu] + %lu from frag "
-			 "[%p, %lu] + %lu\n",
-			 to_copy, iov->iov_base, iov->iov_len, iov_off,
-			 frag->f_page, frag->f_offset, frag_off);
-
 		/* XXX needs + offset for multiple recvs per page */
-		ret = rds_page_copy_to_user(frag->f_page,
-					    frag->f_offset + frag_off,
-					    iov->iov_base + iov_off,
-					    to_copy);
-		if (ret) {
-			copied = ret;
-			break;
-		}
+		rds_stats_add(s_copy_to_user, to_copy);
+		ret = copy_page_to_iter(frag->f_page,
+					frag->f_offset + frag_off,
+					to_copy,
+					to);
+		if (ret != to_copy)
+			return -EFAULT;
 
-		iov_off += to_copy;
 		frag_off += to_copy;
 		copied += to_copy;
 	}
diff --git a/net/rds/message.c b/net/rds/message.c
index aba232f..7a546e0 100644
--- a/net/rds/message.c
+++ b/net/rds/message.c
@@ -325,14 +325,11 @@ out:
 	return ret;
 }
 
-int rds_message_inc_copy_to_user(struct rds_incoming *inc,
-				 struct iovec *first_iov, size_t size)
+int rds_message_inc_copy_to_user(struct rds_incoming *inc, struct iov_iter *to)
 {
 	struct rds_message *rm;
-	struct iovec *iov;
 	struct scatterlist *sg;
 	unsigned long to_copy;
-	unsigned long iov_off;
 	unsigned long vec_off;
 	int copied;
 	int ret;
@@ -341,36 +338,20 @@ int rds_message_inc_copy_to_user(struct rds_incoming *inc,
 	rm = container_of(inc, struct rds_message, m_inc);
 	len = be32_to_cpu(rm->m_inc.i_hdr.h_len);
 
-	iov = first_iov;
-	iov_off = 0;
 	sg = rm->data.op_sg;
 	vec_off = 0;
 	copied = 0;
 
-	while (copied < size && copied < len) {
-		while (iov_off == iov->iov_len) {
-			iov_off = 0;
-			iov++;
-		}
-
-		to_copy = min(iov->iov_len - iov_off, sg->length - vec_off);
-		to_copy = min_t(size_t, to_copy, size - copied);
+	while (iov_iter_count(to) && copied < len) {
+		to_copy = min(iov_iter_count(to), sg->length - vec_off);
 		to_copy = min_t(unsigned long, to_copy, len - copied);
 
-		rdsdebug("copying %lu bytes to user iov [%p, %zu] + %lu to "
-			 "sg [%p, %u, %u] + %lu\n",
-			 to_copy, iov->iov_base, iov->iov_len, iov_off,
-			 sg_page(sg), sg->offset, sg->length, vec_off);
-
-		ret = rds_page_copy_to_user(sg_page(sg), sg->offset + vec_off,
-					    iov->iov_base + iov_off,
-					    to_copy);
-		if (ret) {
-			copied = ret;
-			break;
-		}
+		rds_stats_add(s_copy_to_user, to_copy);
+		ret = copy_page_to_iter(sg_page(sg), sg->offset + vec_off,
+					to_copy, to);
+		if (ret != to_copy)
+			return -EFAULT;
 
-		iov_off += to_copy;
 		vec_off += to_copy;
 		copied += to_copy;
 
diff --git a/net/rds/rds.h b/net/rds/rds.h
index 48f8ffc..b22dad9 100644
--- a/net/rds/rds.h
+++ b/net/rds/rds.h
@@ -431,8 +431,7 @@ struct rds_transport {
 	int (*xmit_rdma)(struct rds_connection *conn, struct rm_rdma_op *op);
 	int (*xmit_atomic)(struct rds_connection *conn, struct rm_atomic_op *op);
 	int (*recv)(struct rds_connection *conn);
-	int (*inc_copy_to_user)(struct rds_incoming *inc, struct iovec *iov,
-				size_t size);
+	int (*inc_copy_to_user)(struct rds_incoming *inc, struct iov_iter *to);
 	void (*inc_free)(struct rds_incoming *inc);
 
 	int (*cm_handle_connect)(struct rdma_cm_id *cm_id,
@@ -667,8 +666,7 @@ int rds_message_add_extension(struct rds_header *hdr,
 int rds_message_next_extension(struct rds_header *hdr,
 			       unsigned int *pos, void *buf, unsigned int *buflen);
 int rds_message_add_rdma_dest_extension(struct rds_header *hdr, u32 r_key, u32 offset);
-int rds_message_inc_copy_to_user(struct rds_incoming *inc,
-				 struct iovec *first_iov, size_t size);
+int rds_message_inc_copy_to_user(struct rds_incoming *inc, struct iov_iter *to);
 void rds_message_inc_free(struct rds_incoming *inc);
 void rds_message_addref(struct rds_message *rm);
 void rds_message_put(struct rds_message *rm);
diff --git a/net/rds/recv.c b/net/rds/recv.c
index bd82522..47d7b10 100644
--- a/net/rds/recv.c
+++ b/net/rds/recv.c
@@ -404,6 +404,7 @@ int rds_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
 	int ret = 0, nonblock = msg_flags & MSG_DONTWAIT;
 	DECLARE_SOCKADDR(struct sockaddr_in *, sin, msg->msg_name);
 	struct rds_incoming *inc = NULL;
+	struct iov_iter to;
 
 	/* udp_recvmsg()->sock_recvtimeo() gets away without locking too.. */
 	timeo = sock_rcvtimeo(sk, nonblock);
@@ -449,8 +450,8 @@ int rds_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
 		rdsdebug("copying inc %p from %pI4:%u to user\n", inc,
 			 &inc->i_conn->c_faddr,
 			 ntohs(inc->i_hdr.h_sport));
-		ret = inc->i_conn->c_trans->inc_copy_to_user(inc, msg->msg_iov,
-							     size);
+		iov_iter_init(&to, READ, msg->msg_iov, msg->msg_iovlen, size);
+		ret = inc->i_conn->c_trans->inc_copy_to_user(inc, &to);
 		if (ret < 0)
 			break;
 
diff --git a/net/rds/tcp.h b/net/rds/tcp.h
index 6563749..0dbdd37 100644
--- a/net/rds/tcp.h
+++ b/net/rds/tcp.h
@@ -69,8 +69,7 @@ void rds_tcp_recv_exit(void);
 void rds_tcp_data_ready(struct sock *sk);
 int rds_tcp_recv(struct rds_connection *conn);
 void rds_tcp_inc_free(struct rds_incoming *inc);
-int rds_tcp_inc_copy_to_user(struct rds_incoming *inc, struct iovec *iov,
-			     size_t size);
+int rds_tcp_inc_copy_to_user(struct rds_incoming *inc, struct iov_iter *to);
 
 /* tcp_send.c */
 void rds_tcp_xmit_prepare(struct rds_connection *conn);
diff --git a/net/rds/tcp_recv.c b/net/rds/tcp_recv.c
index 9ae6e0a..fbc5ef8 100644
--- a/net/rds/tcp_recv.c
+++ b/net/rds/tcp_recv.c
@@ -59,50 +59,30 @@ void rds_tcp_inc_free(struct rds_incoming *inc)
 /*
  * this is pretty lame, but, whatever.
  */
-int rds_tcp_inc_copy_to_user(struct rds_incoming *inc, struct iovec *first_iov,
-			     size_t size)
+int rds_tcp_inc_copy_to_user(struct rds_incoming *inc, struct iov_iter *to)
 {
 	struct rds_tcp_incoming *tinc;
-	struct iovec *iov, tmp;
 	struct sk_buff *skb;
-	unsigned long to_copy, skb_off;
 	int ret = 0;
 
-	if (size == 0)
+	if (!iov_iter_count(to))
 		goto out;
 
 	tinc = container_of(inc, struct rds_tcp_incoming, ti_inc);
-	iov = first_iov;
-	tmp = *iov;
 
 	skb_queue_walk(&tinc->ti_skb_list, skb) {
-		skb_off = 0;
-		while (skb_off < skb->len) {
-			while (tmp.iov_len == 0) {
-				iov++;
-				tmp = *iov;
-			}
-
-			to_copy = min(tmp.iov_len, size);
+		unsigned long to_copy, skb_off;
+		for (skb_off = 0; skb_off < skb->len; skb_off += to_copy) {
+			to_copy = iov_iter_count(to);
 			to_copy = min(to_copy, skb->len - skb_off);
 
-			rdsdebug("ret %d size %zu skb %p skb_off %lu "
-				 "skblen %d iov_base %p iov_len %zu cpy %lu\n",
-				 ret, size, skb, skb_off, skb->len,
-				 tmp.iov_base, tmp.iov_len, to_copy);
-
-			/* modifies tmp as it copies */
-			if (skb_copy_datagram_iovec(skb, skb_off, &tmp,
-						    to_copy)) {
-				ret = -EFAULT;
-				goto out;
-			}
+			if (skb_copy_datagram_iter(skb, skb_off, to, to_copy))
+				return -EFAULT;
 
 			rds_stats_add(s_copy_to_user, to_copy);
-			size -= to_copy;
 			ret += to_copy;
-			skb_off += to_copy;
-			if (size == 0)
+
+			if (!iov_iter_count(to))
 				goto out;
 		}
 	}
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH v2 17/17] rds: switch rds_message_copy_from_user() to iov_iter
  2014-11-25  2:40                         ` Al Viro
                                             ` (15 preceding siblings ...)
  2014-11-25 14:02                           ` [PATCH v2 16/17] rds: switch ->inc_copy_to_user() to passing iov_iter Al Viro
@ 2014-11-25 14:02                           ` Al Viro
  16 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-11-25 14:02 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 net/rds/message.c |   42 ++++++++++++------------------------------
 net/rds/rds.h     |    3 +--
 net/rds/send.c    |    4 +++-
 3 files changed, 16 insertions(+), 33 deletions(-)

diff --git a/net/rds/message.c b/net/rds/message.c
index 7a546e0..ff22022 100644
--- a/net/rds/message.c
+++ b/net/rds/message.c
@@ -264,64 +264,46 @@ struct rds_message *rds_message_map_pages(unsigned long *page_addrs, unsigned in
 	return rm;
 }
 
-int rds_message_copy_from_user(struct rds_message *rm, struct iovec *first_iov,
-					       size_t total_len)
+int rds_message_copy_from_user(struct rds_message *rm, struct iov_iter *from)
 {
 	unsigned long to_copy;
-	unsigned long iov_off;
 	unsigned long sg_off;
-	struct iovec *iov;
 	struct scatterlist *sg;
 	int ret = 0;
 
-	rm->m_inc.i_hdr.h_len = cpu_to_be32(total_len);
+	rm->m_inc.i_hdr.h_len = cpu_to_be32(iov_iter_count(from));
 
 	/*
 	 * now allocate and copy in the data payload.
 	 */
 	sg = rm->data.op_sg;
-	iov = first_iov;
-	iov_off = 0;
 	sg_off = 0; /* Dear gcc, sg->page will be null from kzalloc. */
 
-	while (total_len) {
+	while (iov_iter_count(from)) {
 		if (!sg_page(sg)) {
-			ret = rds_page_remainder_alloc(sg, total_len,
+			ret = rds_page_remainder_alloc(sg, iov_iter_count(from),
 						       GFP_HIGHUSER);
 			if (ret)
-				goto out;
+				return ret;
 			rm->data.op_nents++;
 			sg_off = 0;
 		}
 
-		while (iov_off == iov->iov_len) {
-			iov_off = 0;
-			iov++;
-		}
-
-		to_copy = min(iov->iov_len - iov_off, sg->length - sg_off);
-		to_copy = min_t(size_t, to_copy, total_len);
-
-		rdsdebug("copying %lu bytes from user iov [%p, %zu] + %lu to "
-			 "sg [%p, %u, %u] + %lu\n",
-			 to_copy, iov->iov_base, iov->iov_len, iov_off,
-			 (void *)sg_page(sg), sg->offset, sg->length, sg_off);
+		to_copy = min_t(unsigned long, iov_iter_count(from),
+				sg->length - sg_off);
 
-		ret = rds_page_copy_from_user(sg_page(sg), sg->offset + sg_off,
-					      iov->iov_base + iov_off,
-					      to_copy);
-		if (ret)
-			goto out;
+		rds_stats_add(s_copy_from_user, to_copy);
+		ret = copy_page_from_iter(sg_page(sg), sg->offset + sg_off,
+					  to_copy, from);
+		if (ret != to_copy)
+			return -EFAULT;
 
-		iov_off += to_copy;
-		total_len -= to_copy;
 		sg_off += to_copy;
 
 		if (sg_off == sg->length)
 			sg++;
 	}
 
-out:
 	return ret;
 }
 
diff --git a/net/rds/rds.h b/net/rds/rds.h
index b22dad9..c2a5eef 100644
--- a/net/rds/rds.h
+++ b/net/rds/rds.h
@@ -656,8 +656,7 @@ rds_conn_connecting(struct rds_connection *conn)
 /* message.c */
 struct rds_message *rds_message_alloc(unsigned int nents, gfp_t gfp);
 struct scatterlist *rds_message_alloc_sgs(struct rds_message *rm, int nents);
-int rds_message_copy_from_user(struct rds_message *rm, struct iovec *first_iov,
-					       size_t total_len);
+int rds_message_copy_from_user(struct rds_message *rm, struct iov_iter *from);
 struct rds_message *rds_message_map_pages(unsigned long *page_addrs, unsigned int total_len);
 void rds_message_populate_header(struct rds_header *hdr, __be16 sport,
 				 __be16 dport, u64 seq);
diff --git a/net/rds/send.c b/net/rds/send.c
index 0a64541..4de62ea 100644
--- a/net/rds/send.c
+++ b/net/rds/send.c
@@ -934,7 +934,9 @@ int rds_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
 	int queued = 0, allocated_mr = 0;
 	int nonblock = msg->msg_flags & MSG_DONTWAIT;
 	long timeo = sock_sndtimeo(sk, nonblock);
+	struct iov_iter from;
 
+	iov_iter_init(&from, WRITE, msg->msg_iov, msg->msg_iovlen, payload_len);
 	/* Mirror Linux UDP mirror of BSD error message compatibility */
 	/* XXX: Perhaps MSG_MORE someday */
 	if (msg->msg_flags & ~(MSG_DONTWAIT | MSG_CMSG_COMPAT)) {
@@ -982,7 +984,7 @@ int rds_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
 			ret = -ENOMEM;
 			goto out;
 		}
-		ret = rds_message_copy_from_user(rm, msg->msg_iov, payload_len);
+		ret = rds_message_copy_from_user(rm, &from);
 		if (ret)
 			goto out;
 	}
-- 
1.7.10.4


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* Re: [PATCH v2 01/17] new helper: skb_copy_and_csum_datagram_msg()
  2014-11-25 14:02                           ` [PATCH v2 01/17] new helper: skb_copy_and_csum_datagram_msg() Al Viro
@ 2014-11-25 19:28                             ` David Miller
  2014-11-25 20:59                               ` Al Viro
  0 siblings, 1 reply; 133+ messages in thread
From: David Miller @ 2014-11-25 19:28 UTC (permalink / raw)
  To: viro; +Cc: netdev, linux-kernel


Al, this series looks fine to me, do you have a tree I can pull
it from?

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH v2 01/17] new helper: skb_copy_and_csum_datagram_msg()
  2014-11-25 19:28                             ` David Miller
@ 2014-11-25 20:59                               ` Al Viro
  2014-11-26 17:27                                 ` David Miller
  0 siblings, 1 reply; 133+ messages in thread
From: Al Viro @ 2014-11-25 20:59 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

On Tue, Nov 25, 2014 at 02:28:20PM -0500, David Miller wrote:
> 
> Al, this series looks fine to me, do you have a tree I can pull
> it from?

Er...  Same as the last time?

git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-davem

and there's more fun stuff in #iov_iter-net in the same tree, but that's
the next batch...

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH v2 01/17] new helper: skb_copy_and_csum_datagram_msg()
  2014-11-25 20:59                               ` Al Viro
@ 2014-11-26 17:27                                 ` David Miller
  2014-12-05  5:56                                   ` the next chunk of iov_iter-net stuff for review Al Viro
  0 siblings, 1 reply; 133+ messages in thread
From: David Miller @ 2014-11-26 17:27 UTC (permalink / raw)
  To: viro; +Cc: netdev, linux-kernel

From: Al Viro <viro@ZenIV.linux.org.uk>
Date: Tue, 25 Nov 2014 20:59:11 +0000

> On Tue, Nov 25, 2014 at 02:28:20PM -0500, David Miller wrote:
>> 
>> Al, this series looks fine to me, do you have a tree I can pull
>> it from?
> 
> Er...  Same as the last time?
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-davem
> 
> and there's more fun stuff in #iov_iter-net in the same tree, but that's
> the next batch...

Pulled, thanks Al.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* the next chunk of iov_iter-net stuff for review
  2014-11-26 17:27                                 ` David Miller
@ 2014-12-05  5:56                                   ` Al Viro
  2014-12-05  5:58                                     ` [PATCH 01/12] raw.c: stick msghdr into raw_frag_vec Al Viro
                                                       ` (12 more replies)
  0 siblings, 13 replies; 133+ messages in thread
From: Al Viro @ 2014-12-05  5:56 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

	OK, here's the tentative next batch (covers most of the ->recvmsg()
side of conversion).  That's on top of merge of net-next#master with
vfs#iov_iter (the latter had been posted earlier today, Cc'd to netdev among
other places).  This series corresponds to vfs#for-davem.  Review and comments
would be very welcome...

Shortlog:
Al Viro (12):
      raw.c: stick msghdr into raw_frag_vec
      ipv6 equivalent of "ipv4: Avoid reading user iov twice after raw_probe_proto_opt"
      ip_generic_getfrag, udplite_getfrag: switch to passing msghdr
      switch tcp_sock->ucopy from iovec (ucopy.iov) to msghdr (ucopy.msg)
      switch l2cap ->memcpy_fromiovec() to msghdr
      vmci: propagate msghdr all way down to __qp_memcpy_from_queue()
      put iov_iter into msghdr
      first fruits - kill l2cap ->memcpy_fromiovec()
      switch memcpy_to_msg() and skb_copy{,_and_csum}_datagram_msg() to primitives
      ppp_read(): switch to skb_copy_datagram_iter()
      skb_copy_datagram_iovec() can die
      bury memcpy_toiovec()
Diffstat:
 crypto/algif_hash.c                      |   4 +-
 crypto/algif_skcipher.c                  |   4 +-
 drivers/misc/vmw_vmci/vmci_queue_pair.c  |  17 ++--
 drivers/net/macvtap.c                    |   8 +-
 drivers/net/ppp/ppp_generic.c            |   4 +-
 drivers/net/tun.c                        |   8 +-
 drivers/target/iscsi/iscsi_target_util.c |  12 ++-
 drivers/vhost/net.c                      |   8 +-
 fs/afs/rxrpc.c                           |  14 ++--
 include/linux/skbuff.h                   |  22 ++---
 include/linux/socket.h                   |   3 +-
 include/linux/tcp.h                      |   2 +-
 include/linux/uio.h                      |   1 -
 include/linux/vmw_vmci_api.h             |   5 +-
 include/net/bluetooth/l2cap.h            |  29 -------
 include/net/udplite.h                    |   4 +-
 lib/iovec.c                              |  25 ------
 net/atm/common.c                         |   5 +-
 net/bluetooth/6lowpan.c                  |   7 +-
 net/bluetooth/a2mp.c                     |   4 +-
 net/bluetooth/l2cap_core.c               |   7 +-
 net/bluetooth/l2cap_sock.c               |   8 --
 net/bluetooth/smp.c                      |   5 +-
 net/caif/caif_socket.c                   |   2 +-
 net/compat.c                             |  10 ++-
 net/core/datagram.c                      | 138 +++++--------------------------
 net/ipv4/ip_output.c                     |   8 +-
 net/ipv4/ping.c                          |   3 +-
 net/ipv4/raw.c                           |  11 +--
 net/ipv4/tcp.c                           |   8 +-
 net/ipv4/tcp_input.c                     |   7 +-
 net/ipv4/tcp_output.c                    |   2 +-
 net/ipv4/udp.c                           |   4 +-
 net/ipv6/ping.c                          |   3 +-
 net/ipv6/raw.c                           | 115 +++++++++++++-------------
 net/ipv6/udp.c                           |   2 +-
 net/l2tp/l2tp_ip6.c                      |   2 +-
 net/netlink/af_netlink.c                 |   2 +-
 net/packet/af_packet.c                   |   7 +-
 net/rds/recv.c                           |   7 +-
 net/rds/send.c                           |   4 +-
 net/rxrpc/ar-output.c                    |   8 +-
 net/sctp/socket.c                        |   5 +-
 net/socket.c                             |  27 +++---
 net/tipc/msg.c                           |   4 +-
 net/tipc/socket.c                        |   4 +-
 net/unix/af_unix.c                       |  10 +--
 net/vmw_vsock/vmci_transport.c           |   7 +-
 48 files changed, 204 insertions(+), 402 deletions(-)

^ permalink raw reply	[flat|nested] 133+ messages in thread

* [PATCH 01/12] raw.c: stick msghdr into raw_frag_vec
  2014-12-05  5:56                                   ` the next chunk of iov_iter-net stuff for review Al Viro
@ 2014-12-05  5:58                                     ` Al Viro
  2014-12-05  5:58                                     ` [PATCH 02/12] ipv6 equivalent of "ipv4: Avoid reading user iov twice after raw_probe_proto_opt" Al Viro
                                                       ` (11 subsequent siblings)
  12 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-12-05  5:58 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

we'll want access to ->msg_iter

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 net/ipv4/raw.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index 43385a9..5c901eb 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -82,7 +82,7 @@
 #include <linux/uio.h>
 
 struct raw_frag_vec {
-	struct iovec *iov;
+	struct msghdr *msg;
 	union {
 		struct icmphdr icmph;
 		char c[1];
@@ -440,7 +440,7 @@ static int raw_probe_proto_opt(struct raw_frag_vec *rfv, struct flowi4 *fl4)
 	/* We only need the first two bytes. */
 	rfv->hlen = 2;
 
-	err = memcpy_fromiovec(rfv->hdr.c, rfv->iov, rfv->hlen);
+	err = memcpy_from_msg(rfv->hdr.c, rfv->msg, rfv->hlen);
 	if (err)
 		return err;
 
@@ -478,7 +478,7 @@ static int raw_getfrag(void *from, char *to, int offset, int len, int odd,
 
 	offset -= rfv->hlen;
 
-	return ip_generic_getfrag(rfv->iov, to, offset, len, odd, skb);
+	return ip_generic_getfrag(rfv->msg->msg_iov, to, offset, len, odd, skb);
 }
 
 static int raw_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
@@ -600,7 +600,7 @@ static int raw_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 			   daddr, saddr, 0, 0);
 
 	if (!inet->hdrincl) {
-		rfv.iov = msg->msg_iov;
+		rfv.msg = msg;
 		rfv.hlen = 0;
 
 		err = raw_probe_proto_opt(&rfv, &fl4);
-- 
2.1.3


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 02/12] ipv6 equivalent of "ipv4: Avoid reading user iov twice after raw_probe_proto_opt"
  2014-12-05  5:56                                   ` the next chunk of iov_iter-net stuff for review Al Viro
  2014-12-05  5:58                                     ` [PATCH 01/12] raw.c: stick msghdr into raw_frag_vec Al Viro
@ 2014-12-05  5:58                                     ` Al Viro
  2014-12-05  5:58                                     ` [PATCH 03/12] ip_generic_getfrag, udplite_getfrag: switch to passing msghdr Al Viro
                                                       ` (10 subsequent siblings)
  12 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-12-05  5:58 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 net/ipv6/raw.c | 112 ++++++++++++++++++++++++++++-----------------------------
 1 file changed, 56 insertions(+), 56 deletions(-)

diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 8baa53e..942f67b 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -672,65 +672,62 @@ error:
 	return err;
 }
 
-static int rawv6_probe_proto_opt(struct flowi6 *fl6, struct msghdr *msg)
+struct raw6_frag_vec {
+	struct msghdr *msg;
+	int hlen;
+	char c[4];
+};
+
+static int rawv6_probe_proto_opt(struct raw6_frag_vec *rfv, struct flowi6 *fl6)
 {
-	struct iovec *iov;
-	u8 __user *type = NULL;
-	u8 __user *code = NULL;
-	u8 len = 0;
-	int probed = 0;
-	int i;
-
-	if (!msg->msg_iov)
-		return 0;
+	int err = 0;
+	switch (fl6->flowi6_proto) {
+	case IPPROTO_ICMPV6:
+		rfv->hlen = 2;
+		err = memcpy_from_msg(rfv->c, rfv->msg, rfv->hlen);
+		if (!err) {
+			fl6->fl6_icmp_type = rfv->c[0];
+			fl6->fl6_icmp_code = rfv->c[1];
+		}
+		break;
+	case IPPROTO_MH:
+		rfv->hlen = 4;
+		err = memcpy_from_msg(rfv->c, rfv->msg, rfv->hlen);
+		if (!err)
+			fl6->fl6_mh_type = rfv->c[2];
+	}
+	return err;
+}
 
-	for (i = 0; i < msg->msg_iovlen; i++) {
-		iov = &msg->msg_iov[i];
-		if (!iov)
-			continue;
+static int raw6_getfrag(void *from, char *to, int offset, int len, int odd,
+		       struct sk_buff *skb)
+{
+	struct raw6_frag_vec *rfv = from;
 
-		switch (fl6->flowi6_proto) {
-		case IPPROTO_ICMPV6:
-			/* check if one-byte field is readable or not. */
-			if (iov->iov_base && iov->iov_len < 1)
-				break;
-
-			if (!type) {
-				type = iov->iov_base;
-				/* check if code field is readable or not. */
-				if (iov->iov_len > 1)
-					code = type + 1;
-			} else if (!code)
-				code = iov->iov_base;
-
-			if (type && code) {
-				if (get_user(fl6->fl6_icmp_type, type) ||
-				    get_user(fl6->fl6_icmp_code, code))
-					return -EFAULT;
-				probed = 1;
-			}
-			break;
-		case IPPROTO_MH:
-			if (iov->iov_base && iov->iov_len < 1)
-				break;
-			/* check if type field is readable or not. */
-			if (iov->iov_len > 2 - len) {
-				u8 __user *p = iov->iov_base;
-				if (get_user(fl6->fl6_mh_type, &p[2 - len]))
-					return -EFAULT;
-				probed = 1;
-			} else
-				len += iov->iov_len;
+	if (offset < rfv->hlen) {
+		int copy = min(rfv->hlen - offset, len);
 
-			break;
-		default:
-			probed = 1;
-			break;
-		}
-		if (probed)
-			break;
+		if (skb->ip_summed == CHECKSUM_PARTIAL)
+			memcpy(to, rfv->c + offset, copy);
+		else
+			skb->csum = csum_block_add(
+				skb->csum,
+				csum_partial_copy_nocheck(rfv->c + offset,
+							  to, copy, 0),
+				odd);
+
+		odd = 0;
+		offset += copy;
+		to += copy;
+		len -= copy;
+
+		if (!len)
+			return 0;
 	}
-	return 0;
+
+	offset -= rfv->hlen;
+
+	return ip_generic_getfrag(rfv->msg->msg_iov, to, offset, len, odd, skb);
 }
 
 static int rawv6_sendmsg(struct kiocb *iocb, struct sock *sk,
@@ -745,6 +742,7 @@ static int rawv6_sendmsg(struct kiocb *iocb, struct sock *sk,
 	struct ipv6_txoptions *opt = NULL;
 	struct ip6_flowlabel *flowlabel = NULL;
 	struct dst_entry *dst = NULL;
+	struct raw6_frag_vec rfv;
 	struct flowi6 fl6;
 	int addr_len = msg->msg_namelen;
 	int hlimit = -1;
@@ -848,7 +846,9 @@ static int rawv6_sendmsg(struct kiocb *iocb, struct sock *sk,
 	opt = ipv6_fixup_options(&opt_space, opt);
 
 	fl6.flowi6_proto = proto;
-	err = rawv6_probe_proto_opt(&fl6, msg);
+	rfv.msg = msg;
+	rfv.hlen = 0;
+	err = rawv6_probe_proto_opt(&rfv, &fl6);
 	if (err)
 		goto out;
 
@@ -889,7 +889,7 @@ back_from_confirm:
 		err = rawv6_send_hdrinc(sk, msg->msg_iov, len, &fl6, &dst, msg->msg_flags);
 	else {
 		lock_sock(sk);
-		err = ip6_append_data(sk, ip_generic_getfrag, msg->msg_iov,
+		err = ip6_append_data(sk, raw6_getfrag, &rfv,
 			len, 0, hlimit, tclass, opt, &fl6, (struct rt6_info *)dst,
 			msg->msg_flags, dontfrag);
 
-- 
2.1.3


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 03/12] ip_generic_getfrag, udplite_getfrag: switch to passing msghdr
  2014-12-05  5:56                                   ` the next chunk of iov_iter-net stuff for review Al Viro
  2014-12-05  5:58                                     ` [PATCH 01/12] raw.c: stick msghdr into raw_frag_vec Al Viro
  2014-12-05  5:58                                     ` [PATCH 02/12] ipv6 equivalent of "ipv4: Avoid reading user iov twice after raw_probe_proto_opt" Al Viro
@ 2014-12-05  5:58                                     ` Al Viro
  2014-12-05  5:58                                     ` [PATCH 04/12] switch tcp_sock->ucopy from iovec (ucopy.iov) to msghdr (ucopy.msg) Al Viro
                                                       ` (9 subsequent siblings)
  12 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-12-05  5:58 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 include/net/udplite.h | 3 ++-
 net/ipv4/ip_output.c  | 6 +++---
 net/ipv4/raw.c        | 2 +-
 net/ipv4/udp.c        | 4 ++--
 net/ipv6/raw.c        | 2 +-
 net/ipv6/udp.c        | 2 +-
 net/l2tp/l2tp_ip6.c   | 2 +-
 7 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/include/net/udplite.h b/include/net/udplite.h
index 9a28a51..d5baaba 100644
--- a/include/net/udplite.h
+++ b/include/net/udplite.h
@@ -19,7 +19,8 @@ extern struct udp_table		udplite_table;
 static __inline__ int udplite_getfrag(void *from, char *to, int  offset,
 				      int len, int odd, struct sk_buff *skb)
 {
-	return memcpy_fromiovecend(to, (struct iovec *) from, offset, len);
+	struct msghdr *msg = from;
+	return memcpy_fromiovecend(to, msg->msg_iov, offset, len);
 }
 
 /* Designate sk as UDP-Lite socket */
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 4a929ad..cdedcf1 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -752,14 +752,14 @@ EXPORT_SYMBOL(ip_fragment);
 int
 ip_generic_getfrag(void *from, char *to, int offset, int len, int odd, struct sk_buff *skb)
 {
-	struct iovec *iov = from;
+	struct msghdr *msg = from;
 
 	if (skb->ip_summed == CHECKSUM_PARTIAL) {
-		if (memcpy_fromiovecend(to, iov, offset, len) < 0)
+		if (memcpy_fromiovecend(to, msg->msg_iov, offset, len) < 0)
 			return -EFAULT;
 	} else {
 		__wsum csum = 0;
-		if (csum_partial_copy_fromiovecend(to, iov, offset, len, &csum) < 0)
+		if (csum_partial_copy_fromiovecend(to, msg->msg_iov, offset, len, &csum) < 0)
 			return -EFAULT;
 		skb->csum = csum_block_add(skb->csum, csum, odd);
 	}
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index 5c901eb..5d83bd2 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -478,7 +478,7 @@ static int raw_getfrag(void *from, char *to, int offset, int len, int odd,
 
 	offset -= rfv->hlen;
 
-	return ip_generic_getfrag(rfv->msg->msg_iov, to, offset, len, odd, skb);
+	return ip_generic_getfrag(rfv->msg, to, offset, len, odd, skb);
 }
 
 static int raw_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index b2d6068..0cac250 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1036,7 +1036,7 @@ back_from_confirm:
 
 	/* Lockless fast path for the non-corking case. */
 	if (!corkreq) {
-		skb = ip_make_skb(sk, fl4, getfrag, msg->msg_iov, ulen,
+		skb = ip_make_skb(sk, fl4, getfrag, msg, ulen,
 				  sizeof(struct udphdr), &ipc, &rt,
 				  msg->msg_flags);
 		err = PTR_ERR(skb);
@@ -1067,7 +1067,7 @@ back_from_confirm:
 
 do_append_data:
 	up->len += ulen;
-	err = ip_append_data(sk, fl4, getfrag, msg->msg_iov, ulen,
+	err = ip_append_data(sk, fl4, getfrag, msg, ulen,
 			     sizeof(struct udphdr), &ipc, &rt,
 			     corkreq ? msg->msg_flags|MSG_MORE : msg->msg_flags);
 	if (err)
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 942f67b..11a9283 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -727,7 +727,7 @@ static int raw6_getfrag(void *from, char *to, int offset, int len, int odd,
 
 	offset -= rfv->hlen;
 
-	return ip_generic_getfrag(rfv->msg->msg_iov, to, offset, len, odd, skb);
+	return ip_generic_getfrag(rfv->msg, to, offset, len, odd, skb);
 }
 
 static int rawv6_sendmsg(struct kiocb *iocb, struct sock *sk,
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 7cfb5d74..5a164b6 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -1299,7 +1299,7 @@ do_append_data:
 		dontfrag = np->dontfrag;
 	up->len += ulen;
 	getfrag  =  is_udplite ?  udplite_getfrag : ip_generic_getfrag;
-	err = ip6_append_data(sk, getfrag, msg->msg_iov, ulen,
+	err = ip6_append_data(sk, getfrag, msg, ulen,
 		sizeof(struct udphdr), hlimit, tclass, opt, &fl6,
 		(struct rt6_info *)dst,
 		corkreq ? msg->msg_flags|MSG_MORE : msg->msg_flags, dontfrag);
diff --git a/net/l2tp/l2tp_ip6.c b/net/l2tp/l2tp_ip6.c
index 2177b96..8611f1b 100644
--- a/net/l2tp/l2tp_ip6.c
+++ b/net/l2tp/l2tp_ip6.c
@@ -619,7 +619,7 @@ static int l2tp_ip6_sendmsg(struct kiocb *iocb, struct sock *sk,
 
 back_from_confirm:
 	lock_sock(sk);
-	err = ip6_append_data(sk, ip_generic_getfrag, msg->msg_iov,
+	err = ip6_append_data(sk, ip_generic_getfrag, msg,
 			      ulen, transhdrlen, hlimit, tclass, opt,
 			      &fl6, (struct rt6_info *)dst,
 			      msg->msg_flags, dontfrag);
-- 
2.1.3


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 04/12] switch tcp_sock->ucopy from iovec (ucopy.iov) to msghdr (ucopy.msg)
  2014-12-05  5:56                                   ` the next chunk of iov_iter-net stuff for review Al Viro
                                                       ` (2 preceding siblings ...)
  2014-12-05  5:58                                     ` [PATCH 03/12] ip_generic_getfrag, udplite_getfrag: switch to passing msghdr Al Viro
@ 2014-12-05  5:58                                     ` Al Viro
  2014-12-05  5:58                                     ` [PATCH 05/12] switch l2cap ->memcpy_fromiovec() to msghdr Al Viro
                                                       ` (8 subsequent siblings)
  12 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-12-05  5:58 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 include/linux/tcp.h  | 2 +-
 net/ipv4/tcp.c       | 2 +-
 net/ipv4/tcp_input.c | 7 +++----
 3 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index f566b85..5d9cc9c 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -162,7 +162,7 @@ struct tcp_sock {
 	struct {
 		struct sk_buff_head	prequeue;
 		struct task_struct	*task;
-		struct iovec		*iov;
+		struct msghdr		*msg;
 		int			memory;
 		int			len;
 	} ucopy;
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index dc13a36..4a96f37 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1729,7 +1729,7 @@ int tcp_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 			if (!user_recv && !(flags & (MSG_TRUNC | MSG_PEEK))) {
 				user_recv = current;
 				tp->ucopy.task = user_recv;
-				tp->ucopy.iov = msg->msg_iov;
+				tp->ucopy.msg = msg;
 			}
 
 			tp->ucopy.len = len;
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 69de1a1..075ab4d 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4421,7 +4421,7 @@ static void tcp_data_queue(struct sock *sk, struct sk_buff *skb)
 			__set_current_state(TASK_RUNNING);
 
 			local_bh_enable();
-			if (!skb_copy_datagram_iovec(skb, 0, tp->ucopy.iov, chunk)) {
+			if (!skb_copy_datagram_msg(skb, 0, tp->ucopy.msg, chunk)) {
 				tp->ucopy.len -= chunk;
 				tp->copied_seq += chunk;
 				eaten = (chunk == skb->len);
@@ -4941,10 +4941,9 @@ static int tcp_copy_to_iovec(struct sock *sk, struct sk_buff *skb, int hlen)
 
 	local_bh_enable();
 	if (skb_csum_unnecessary(skb))
-		err = skb_copy_datagram_iovec(skb, hlen, tp->ucopy.iov, chunk);
+		err = skb_copy_datagram_msg(skb, hlen, tp->ucopy.msg, chunk);
 	else
-		err = skb_copy_and_csum_datagram_iovec(skb, hlen,
-						       tp->ucopy.iov);
+		err = skb_copy_and_csum_datagram_msg(skb, hlen, tp->ucopy.msg);
 
 	if (!err) {
 		tp->ucopy.len -= chunk;
-- 
2.1.3


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 05/12] switch l2cap ->memcpy_fromiovec() to msghdr
  2014-12-05  5:56                                   ` the next chunk of iov_iter-net stuff for review Al Viro
                                                       ` (3 preceding siblings ...)
  2014-12-05  5:58                                     ` [PATCH 04/12] switch tcp_sock->ucopy from iovec (ucopy.iov) to msghdr (ucopy.msg) Al Viro
@ 2014-12-05  5:58                                     ` Al Viro
  2014-12-05  5:58                                     ` [PATCH 06/12] vmci: propagate msghdr all way down to __qp_memcpy_from_queue() Al Viro
                                                       ` (7 subsequent siblings)
  12 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-12-05  5:58 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

it'll die soon enough - now that kvec-backed iov_iter works regardless
of set_fs(), both instances will become copy_from_iter() as soon as
we introduce ->msg_iter...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 include/net/bluetooth/l2cap.h | 6 +++---
 net/bluetooth/l2cap_core.c    | 4 ++--
 net/bluetooth/l2cap_sock.c    | 4 ++--
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/net/bluetooth/l2cap.h b/include/net/bluetooth/l2cap.h
index 061e648..4e23674 100644
--- a/include/net/bluetooth/l2cap.h
+++ b/include/net/bluetooth/l2cap.h
@@ -608,7 +608,7 @@ struct l2cap_ops {
 					       unsigned long len, int nb);
 	int			(*memcpy_fromiovec) (struct l2cap_chan *chan,
 						     unsigned char *kdata,
-						     struct iovec *iov,
+						     struct msghdr *msg,
 						     int len);
 };
 
@@ -905,13 +905,13 @@ static inline long l2cap_chan_no_get_sndtimeo(struct l2cap_chan *chan)
 
 static inline int l2cap_chan_no_memcpy_fromiovec(struct l2cap_chan *chan,
 						 unsigned char *kdata,
-						 struct iovec *iov,
+						 struct msghdr *msg,
 						 int len)
 {
 	/* Following is safe since for compiler definitions of kvec and
 	 * iovec are identical, yielding the same in-core layout and alignment
 	 */
-	struct kvec *vec = (struct kvec *)iov;
+	struct kvec *vec = (struct kvec *)msg->msg_iov;
 
 	while (len > 0) {
 		if (vec->iov_len) {
diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c
index 8e12731..5201d61 100644
--- a/net/bluetooth/l2cap_core.c
+++ b/net/bluetooth/l2cap_core.c
@@ -2097,7 +2097,7 @@ static inline int l2cap_skbuff_fromiovec(struct l2cap_chan *chan,
 	int sent = 0;
 
 	if (chan->ops->memcpy_fromiovec(chan, skb_put(skb, count),
-					msg->msg_iov, count))
+					msg, count))
 		return -EFAULT;
 
 	sent += count;
@@ -2118,7 +2118,7 @@ static inline int l2cap_skbuff_fromiovec(struct l2cap_chan *chan,
 		*frag = tmp;
 
 		if (chan->ops->memcpy_fromiovec(chan, skb_put(*frag, count),
-						msg->msg_iov, count))
+						msg, count))
 			return -EFAULT;
 
 		sent += count;
diff --git a/net/bluetooth/l2cap_sock.c b/net/bluetooth/l2cap_sock.c
index b0efb72..205b298 100644
--- a/net/bluetooth/l2cap_sock.c
+++ b/net/bluetooth/l2cap_sock.c
@@ -1338,9 +1338,9 @@ static struct sk_buff *l2cap_sock_alloc_skb_cb(struct l2cap_chan *chan,
 
 static int l2cap_sock_memcpy_fromiovec_cb(struct l2cap_chan *chan,
 					  unsigned char *kdata,
-					  struct iovec *iov, int len)
+					  struct msghdr *msg, int len)
 {
-	return memcpy_fromiovec(kdata, iov, len);
+	return memcpy_from_msg(kdata, msg, len);
 }
 
 static void l2cap_sock_ready_cb(struct l2cap_chan *chan)
-- 
2.1.3


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 06/12] vmci: propagate msghdr all way down to __qp_memcpy_from_queue()
  2014-12-05  5:56                                   ` the next chunk of iov_iter-net stuff for review Al Viro
                                                       ` (4 preceding siblings ...)
  2014-12-05  5:58                                     ` [PATCH 05/12] switch l2cap ->memcpy_fromiovec() to msghdr Al Viro
@ 2014-12-05  5:58                                     ` Al Viro
  2014-12-05  5:58                                     ` [PATCH 07/12] put iov_iter into msghdr Al Viro
                                                       ` (6 subsequent siblings)
  12 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-12-05  5:58 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

... and switch it to memcpy_to_msg()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 drivers/misc/vmw_vmci/vmci_queue_pair.c | 17 +++++++++--------
 include/linux/vmw_vmci_api.h            |  5 +++--
 net/vmw_vsock/vmci_transport.c          |  4 ++--
 3 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/drivers/misc/vmw_vmci/vmci_queue_pair.c b/drivers/misc/vmw_vmci/vmci_queue_pair.c
index 1b7b303..7aaaf51 100644
--- a/drivers/misc/vmw_vmci/vmci_queue_pair.c
+++ b/drivers/misc/vmw_vmci/vmci_queue_pair.c
@@ -27,6 +27,7 @@
 #include <linux/uio.h>
 #include <linux/wait.h>
 #include <linux/vmalloc.h>
+#include <linux/skbuff.h>
 
 #include "vmci_handle_array.h"
 #include "vmci_queue_pair.h"
@@ -429,11 +430,11 @@ static int __qp_memcpy_from_queue(void *dest,
 			to_copy = size - bytes_copied;
 
 		if (is_iovec) {
-			struct iovec *iov = (struct iovec *)dest;
+			struct msghdr *msg = dest;
 			int err;
 
 			/* The iovec will track bytes_copied internally. */
-			err = memcpy_toiovec(iov, (u8 *)va + page_offset,
+			err = memcpy_to_msg(msg, (u8 *)va + page_offset,
 					     to_copy);
 			if (err != 0) {
 				if (kernel_if->host)
@@ -3264,13 +3265,13 @@ EXPORT_SYMBOL_GPL(vmci_qpair_enquev);
  * of bytes dequeued or < 0 on error.
  */
 ssize_t vmci_qpair_dequev(struct vmci_qp *qpair,
-			  void *iov,
+			  struct msghdr *msg,
 			  size_t iov_size,
 			  int buf_type)
 {
 	ssize_t result;
 
-	if (!qpair || !iov)
+	if (!qpair)
 		return VMCI_ERROR_INVALID_ARGS;
 
 	qp_lock(qpair);
@@ -3279,7 +3280,7 @@ ssize_t vmci_qpair_dequev(struct vmci_qp *qpair,
 		result = qp_dequeue_locked(qpair->produce_q,
 					   qpair->consume_q,
 					   qpair->consume_q_size,
-					   iov, iov_size,
+					   msg, iov_size,
 					   qp_memcpy_from_queue_iov,
 					   true);
 
@@ -3308,13 +3309,13 @@ EXPORT_SYMBOL_GPL(vmci_qpair_dequev);
  * of bytes peeked or < 0 on error.
  */
 ssize_t vmci_qpair_peekv(struct vmci_qp *qpair,
-			 void *iov,
+			 struct msghdr *msg,
 			 size_t iov_size,
 			 int buf_type)
 {
 	ssize_t result;
 
-	if (!qpair || !iov)
+	if (!qpair)
 		return VMCI_ERROR_INVALID_ARGS;
 
 	qp_lock(qpair);
@@ -3323,7 +3324,7 @@ ssize_t vmci_qpair_peekv(struct vmci_qp *qpair,
 		result = qp_dequeue_locked(qpair->produce_q,
 					   qpair->consume_q,
 					   qpair->consume_q_size,
-					   iov, iov_size,
+					   msg, iov_size,
 					   qp_memcpy_from_queue_iov,
 					   false);
 
diff --git a/include/linux/vmw_vmci_api.h b/include/linux/vmw_vmci_api.h
index 023430e..5691f75 100644
--- a/include/linux/vmw_vmci_api.h
+++ b/include/linux/vmw_vmci_api.h
@@ -24,6 +24,7 @@
 #define VMCI_KERNEL_API_VERSION_2 2
 #define VMCI_KERNEL_API_VERSION   VMCI_KERNEL_API_VERSION_2
 
+struct msghdr;
 typedef void (vmci_device_shutdown_fn) (void *device_registration,
 					void *user_data);
 
@@ -75,8 +76,8 @@ ssize_t vmci_qpair_peek(struct vmci_qp *qpair, void *buf, size_t buf_size,
 ssize_t vmci_qpair_enquev(struct vmci_qp *qpair,
 			  void *iov, size_t iov_size, int mode);
 ssize_t vmci_qpair_dequev(struct vmci_qp *qpair,
-			  void *iov, size_t iov_size, int mode);
-ssize_t vmci_qpair_peekv(struct vmci_qp *qpair, void *iov, size_t iov_size,
+			  struct msghdr *msg, size_t iov_size, int mode);
+ssize_t vmci_qpair_peekv(struct vmci_qp *qpair, struct msghdr *msg, size_t iov_size,
 			 int mode);
 
 #endif /* !__VMW_VMCI_API_H__ */
diff --git a/net/vmw_vsock/vmci_transport.c b/net/vmw_vsock/vmci_transport.c
index c1c0389..20a0ba3 100644
--- a/net/vmw_vsock/vmci_transport.c
+++ b/net/vmw_vsock/vmci_transport.c
@@ -1840,9 +1840,9 @@ static ssize_t vmci_transport_stream_dequeue(
 	int flags)
 {
 	if (flags & MSG_PEEK)
-		return vmci_qpair_peekv(vmci_trans(vsk)->qpair, msg->msg_iov, len, 0);
+		return vmci_qpair_peekv(vmci_trans(vsk)->qpair, msg, len, 0);
 	else
-		return vmci_qpair_dequev(vmci_trans(vsk)->qpair, msg->msg_iov, len, 0);
+		return vmci_qpair_dequev(vmci_trans(vsk)->qpair, msg, len, 0);
 }
 
 static ssize_t vmci_transport_stream_enqueue(
-- 
2.1.3


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 07/12] put iov_iter into msghdr
  2014-12-05  5:56                                   ` the next chunk of iov_iter-net stuff for review Al Viro
                                                       ` (5 preceding siblings ...)
  2014-12-05  5:58                                     ` [PATCH 06/12] vmci: propagate msghdr all way down to __qp_memcpy_from_queue() Al Viro
@ 2014-12-05  5:58                                     ` Al Viro
  2014-12-05  5:58                                     ` [PATCH 08/12] first fruits - kill l2cap ->memcpy_fromiovec() Al Viro
                                                       ` (5 subsequent siblings)
  12 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-12-05  5:58 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

Note that the code _using_ ->msg_iter at that point will be very
unhappy with anything other than unshifted iovec-backed iov_iter.
We still need to convert users to proper primitives.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 crypto/algif_hash.c            |  4 ++--
 crypto/algif_skcipher.c        |  4 ++--
 drivers/net/macvtap.c          |  8 ++------
 drivers/net/tun.c              |  8 ++------
 drivers/vhost/net.c            |  8 +++-----
 fs/afs/rxrpc.c                 | 14 ++++++--------
 include/linux/skbuff.h         | 16 ++++++++++------
 include/linux/socket.h         |  3 +--
 include/net/bluetooth/l2cap.h  |  2 +-
 include/net/udplite.h          |  3 ++-
 net/atm/common.c               |  5 +----
 net/bluetooth/6lowpan.c        |  6 +++---
 net/bluetooth/a2mp.c           |  3 +--
 net/bluetooth/smp.c            |  3 +--
 net/caif/caif_socket.c         |  2 +-
 net/compat.c                   | 10 ++++++----
 net/ipv4/ip_output.c           |  6 ++++--
 net/ipv4/ping.c                |  3 ++-
 net/ipv4/raw.c                 |  3 ++-
 net/ipv4/tcp.c                 |  6 +++---
 net/ipv4/tcp_output.c          |  2 +-
 net/ipv6/ping.c                |  3 ++-
 net/ipv6/raw.c                 |  3 ++-
 net/netlink/af_netlink.c       |  2 +-
 net/packet/af_packet.c         |  7 ++-----
 net/rds/recv.c                 |  7 ++++---
 net/rds/send.c                 |  4 +---
 net/rxrpc/ar-output.c          |  8 +++-----
 net/sctp/socket.c              |  5 +----
 net/socket.c                   | 27 ++++++++++++---------------
 net/tipc/msg.c                 |  4 ++--
 net/tipc/socket.c              |  4 ++--
 net/unix/af_unix.c             | 10 ++--------
 net/vmw_vsock/vmci_transport.c |  3 ++-
 34 files changed, 92 insertions(+), 114 deletions(-)

diff --git a/crypto/algif_hash.c b/crypto/algif_hash.c
index 35c93ff..83cd2cc 100644
--- a/crypto/algif_hash.c
+++ b/crypto/algif_hash.c
@@ -42,7 +42,7 @@ static int hash_sendmsg(struct kiocb *unused, struct socket *sock,
 	struct alg_sock *ask = alg_sk(sk);
 	struct hash_ctx *ctx = ask->private;
 	unsigned long iovlen;
-	struct iovec *iov;
+	const struct iovec *iov;
 	long copied = 0;
 	int err;
 
@@ -58,7 +58,7 @@ static int hash_sendmsg(struct kiocb *unused, struct socket *sock,
 
 	ctx->more = 0;
 
-	for (iov = msg->msg_iov, iovlen = msg->msg_iovlen; iovlen > 0;
+	for (iov = msg->msg_iter.iov, iovlen = msg->msg_iter.nr_segs; iovlen > 0;
 	     iovlen--, iov++) {
 		unsigned long seglen = iov->iov_len;
 		char __user *from = iov->iov_base;
diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c
index c3b482b..4f45dab 100644
--- a/crypto/algif_skcipher.c
+++ b/crypto/algif_skcipher.c
@@ -429,13 +429,13 @@ static int skcipher_recvmsg(struct kiocb *unused, struct socket *sock,
 	struct skcipher_sg_list *sgl;
 	struct scatterlist *sg;
 	unsigned long iovlen;
-	struct iovec *iov;
+	const struct iovec *iov;
 	int err = -EAGAIN;
 	int used;
 	long copied = 0;
 
 	lock_sock(sk);
-	for (iov = msg->msg_iov, iovlen = msg->msg_iovlen; iovlen > 0;
+	for (iov = msg->msg_iter.iov, iovlen = msg->msg_iter.nr_segs; iovlen > 0;
 	     iovlen--, iov++) {
 		unsigned long seglen = iov->iov_len;
 		char __user *from = iov->iov_base;
diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index 22b4cf2..4fb1222 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -1092,9 +1092,7 @@ static int macvtap_sendmsg(struct kiocb *iocb, struct socket *sock,
 			   struct msghdr *m, size_t total_len)
 {
 	struct macvtap_queue *q = container_of(sock, struct macvtap_queue, sock);
-	struct iov_iter from;
-	iov_iter_init(&from, WRITE, m->msg_iov, m->msg_iovlen, total_len);
-	return macvtap_get_user(q, m, &from, m->msg_flags & MSG_DONTWAIT);
+	return macvtap_get_user(q, m, &m->msg_iter, m->msg_flags & MSG_DONTWAIT);
 }
 
 static int macvtap_recvmsg(struct kiocb *iocb, struct socket *sock,
@@ -1102,12 +1100,10 @@ static int macvtap_recvmsg(struct kiocb *iocb, struct socket *sock,
 			   int flags)
 {
 	struct macvtap_queue *q = container_of(sock, struct macvtap_queue, sock);
-	struct iov_iter to;
 	int ret;
 	if (flags & ~(MSG_DONTWAIT|MSG_TRUNC))
 		return -EINVAL;
-	iov_iter_init(&to, READ, m->msg_iov, m->msg_iovlen, total_len);
-	ret = macvtap_do_read(q, &to, flags & MSG_DONTWAIT);
+	ret = macvtap_do_read(q, &m->msg_iter, flags & MSG_DONTWAIT);
 	if (ret > total_len) {
 		m->msg_flags |= MSG_TRUNC;
 		ret = flags & MSG_TRUNC ? ret : total_len;
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 6d44da1..8d8bede 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1446,13 +1446,11 @@ static int tun_sendmsg(struct kiocb *iocb, struct socket *sock,
 	int ret;
 	struct tun_file *tfile = container_of(sock, struct tun_file, socket);
 	struct tun_struct *tun = __tun_get(tfile);
-	struct iov_iter from;
 
 	if (!tun)
 		return -EBADFD;
 
-	iov_iter_init(&from, WRITE, m->msg_iov, m->msg_iovlen, total_len);
-	ret = tun_get_user(tun, tfile, m->msg_control, &from,
+	ret = tun_get_user(tun, tfile, m->msg_control, &m->msg_iter,
 			   m->msg_flags & MSG_DONTWAIT);
 	tun_put(tun);
 	return ret;
@@ -1464,7 +1462,6 @@ static int tun_recvmsg(struct kiocb *iocb, struct socket *sock,
 {
 	struct tun_file *tfile = container_of(sock, struct tun_file, socket);
 	struct tun_struct *tun = __tun_get(tfile);
-	struct iov_iter to;
 	int ret;
 
 	if (!tun)
@@ -1479,8 +1476,7 @@ static int tun_recvmsg(struct kiocb *iocb, struct socket *sock,
 					 SOL_PACKET, TUN_TX_TIMESTAMP);
 		goto out;
 	}
-	iov_iter_init(&to, READ, m->msg_iov, m->msg_iovlen, total_len);
-	ret = tun_do_read(tun, tfile, &to, flags & MSG_DONTWAIT);
+	ret = tun_do_read(tun, tfile, &m->msg_iter, flags & MSG_DONTWAIT);
 	if (ret > total_len) {
 		m->msg_flags |= MSG_TRUNC;
 		ret = flags & MSG_TRUNC ? ret : total_len;
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 8dae2f7..9f06e70 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -342,7 +342,6 @@ static void handle_tx(struct vhost_net *net)
 		.msg_namelen = 0,
 		.msg_control = NULL,
 		.msg_controllen = 0,
-		.msg_iov = vq->iov,
 		.msg_flags = MSG_DONTWAIT,
 	};
 	size_t len, total_len = 0;
@@ -396,8 +395,8 @@ static void handle_tx(struct vhost_net *net)
 		}
 		/* Skip header. TODO: support TSO. */
 		s = move_iovec_hdr(vq->iov, nvq->hdr, hdr_size, out);
-		msg.msg_iovlen = out;
 		len = iov_length(vq->iov, out);
+		iov_iter_init(&msg.msg_iter, WRITE, vq->iov, out, len);
 		/* Sanity check */
 		if (!len) {
 			vq_err(vq, "Unexpected header len for TX: "
@@ -562,7 +561,6 @@ static void handle_rx(struct vhost_net *net)
 		.msg_namelen = 0,
 		.msg_control = NULL, /* FIXME: get and handle RX aux data. */
 		.msg_controllen = 0,
-		.msg_iov = vq->iov,
 		.msg_flags = MSG_DONTWAIT,
 	};
 	struct virtio_net_hdr_mrg_rxbuf hdr = {
@@ -600,7 +598,7 @@ static void handle_rx(struct vhost_net *net)
 			break;
 		/* On overrun, truncate and discard */
 		if (unlikely(headcount > UIO_MAXIOV)) {
-			msg.msg_iovlen = 1;
+			iov_iter_init(&msg.msg_iter, READ, vq->iov, 1, 1);
 			err = sock->ops->recvmsg(NULL, sock, &msg,
 						 1, MSG_DONTWAIT | MSG_TRUNC);
 			pr_debug("Discarded rx packet: len %zd\n", sock_len);
@@ -626,7 +624,7 @@ static void handle_rx(struct vhost_net *net)
 			/* Copy the header for use in VIRTIO_NET_F_MRG_RXBUF:
 			 * needed because recvmsg can modify msg_iov. */
 			copy_iovec_hdr(vq->iov, nvq->hdr, sock_hlen, in);
-		msg.msg_iovlen = in;
+		iov_iter_init(&msg.msg_iter, READ, vq->iov, in, sock_len);
 		err = sock->ops->recvmsg(NULL, sock, &msg,
 					 sock_len, MSG_DONTWAIT | MSG_TRUNC);
 		/* Userspace might have consumed the packet meanwhile:
diff --git a/fs/afs/rxrpc.c b/fs/afs/rxrpc.c
index 03a3beb..06e14bf 100644
--- a/fs/afs/rxrpc.c
+++ b/fs/afs/rxrpc.c
@@ -306,8 +306,8 @@ static int afs_send_pages(struct afs_call *call, struct msghdr *msg,
 
 			_debug("- range %u-%u%s",
 			       offset, to, msg->msg_flags ? " [more]" : "");
-			msg->msg_iov = (struct iovec *) iov;
-			msg->msg_iovlen = 1;
+			iov_iter_init(&msg->msg_iter, WRITE,
+				      (struct iovec *) iov, 1, to - offset);
 
 			/* have to change the state *before* sending the last
 			 * packet as RxRPC might give us the reply before it
@@ -384,8 +384,8 @@ int afs_make_call(struct in_addr *addr, struct afs_call *call, gfp_t gfp,
 
 	msg.msg_name		= NULL;
 	msg.msg_namelen		= 0;
-	msg.msg_iov		= (struct iovec *) iov;
-	msg.msg_iovlen		= 1;
+	iov_iter_init(&msg.msg_iter, WRITE, (struct iovec *)iov, 1,
+		      call->request_size);
 	msg.msg_control		= NULL;
 	msg.msg_controllen	= 0;
 	msg.msg_flags		= (call->send_pages ? MSG_MORE : 0);
@@ -778,8 +778,7 @@ void afs_send_empty_reply(struct afs_call *call)
 	iov[0].iov_len		= 0;
 	msg.msg_name		= NULL;
 	msg.msg_namelen		= 0;
-	msg.msg_iov		= iov;
-	msg.msg_iovlen		= 0;
+	iov_iter_init(&msg.msg_iter, WRITE, iov, 0, 0);	/* WTF? */
 	msg.msg_control		= NULL;
 	msg.msg_controllen	= 0;
 	msg.msg_flags		= 0;
@@ -815,8 +814,7 @@ void afs_send_simple_reply(struct afs_call *call, const void *buf, size_t len)
 	iov[0].iov_len		= len;
 	msg.msg_name		= NULL;
 	msg.msg_namelen		= 0;
-	msg.msg_iov		= iov;
-	msg.msg_iovlen		= 1;
+	iov_iter_init(&msg.msg_iter, WRITE, iov, 1, len);
 	msg.msg_control		= NULL;
 	msg.msg_controllen	= 0;
 	msg.msg_flags		= 0;
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 7691ad5..4199dfa 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2644,22 +2644,24 @@ unsigned int datagram_poll(struct file *file, struct socket *sock,
 			   struct poll_table_struct *wait);
 int skb_copy_datagram_iovec(const struct sk_buff *from, int offset,
 			    struct iovec *to, int size);
+int skb_copy_datagram_iter(const struct sk_buff *from, int offset,
+			   struct iov_iter *to, int size);
 static inline int skb_copy_datagram_msg(const struct sk_buff *from, int offset,
 					struct msghdr *msg, int size)
 {
-	return skb_copy_datagram_iovec(from, offset, msg->msg_iov, size);
+	/* XXX: stripping const */
+	return skb_copy_datagram_iovec(from, offset, (struct iovec *)msg->msg_iter.iov, size);
 }
 int skb_copy_and_csum_datagram_iovec(struct sk_buff *skb, int hlen,
 				     struct iovec *iov);
 static inline int skb_copy_and_csum_datagram_msg(struct sk_buff *skb, int hlen,
 			    struct msghdr *msg)
 {
-	return skb_copy_and_csum_datagram_iovec(skb, hlen, msg->msg_iov);
+	/* XXX: stripping const */
+	return skb_copy_and_csum_datagram_iovec(skb, hlen, (struct iovec *)msg->msg_iter.iov);
 }
 int skb_copy_datagram_from_iter(struct sk_buff *skb, int offset,
 				 struct iov_iter *from, int len);
-int skb_copy_datagram_iter(const struct sk_buff *from, int offset,
-			   struct iov_iter *to, int size);
 int zerocopy_sg_from_iter(struct sk_buff *skb, struct iov_iter *frm);
 void skb_free_datagram(struct sock *sk, struct sk_buff *skb);
 void skb_free_datagram_locked(struct sock *sk, struct sk_buff *skb);
@@ -2687,12 +2689,14 @@ int skb_vlan_push(struct sk_buff *skb, __be16 vlan_proto, u16 vlan_tci);
 
 static inline int memcpy_from_msg(void *data, struct msghdr *msg, int len)
 {
-	return memcpy_fromiovec(data, msg->msg_iov, len);
+	/* XXX: stripping const */
+	return memcpy_fromiovec(data, (struct iovec *)msg->msg_iter.iov, len);
 }
 
 static inline int memcpy_to_msg(struct msghdr *msg, void *data, int len)
 {
-	return memcpy_toiovec(msg->msg_iov, data, len);
+	/* XXX: stripping const */
+	return memcpy_toiovec((struct iovec *)msg->msg_iter.iov, data, len);
 }
 
 struct skb_checksum_ops {
diff --git a/include/linux/socket.h b/include/linux/socket.h
index de52228..048d6d6 100644
--- a/include/linux/socket.h
+++ b/include/linux/socket.h
@@ -47,8 +47,7 @@ struct linger {
 struct msghdr {
 	void		*msg_name;	/* ptr to socket address structure */
 	int		msg_namelen;	/* size of socket address structure */
-	struct iovec	*msg_iov;	/* scatter/gather array */
-	__kernel_size_t	msg_iovlen;	/* # elements in msg_iov */
+	struct iov_iter	msg_iter;	/* data */
 	void		*msg_control;	/* ancillary data */
 	__kernel_size_t	msg_controllen;	/* ancillary data buffer length */
 	unsigned int	msg_flags;	/* flags on received message */
diff --git a/include/net/bluetooth/l2cap.h b/include/net/bluetooth/l2cap.h
index 4e23674..bca6fc0 100644
--- a/include/net/bluetooth/l2cap.h
+++ b/include/net/bluetooth/l2cap.h
@@ -911,7 +911,7 @@ static inline int l2cap_chan_no_memcpy_fromiovec(struct l2cap_chan *chan,
 	/* Following is safe since for compiler definitions of kvec and
 	 * iovec are identical, yielding the same in-core layout and alignment
 	 */
-	struct kvec *vec = (struct kvec *)msg->msg_iov;
+	struct kvec *vec = (struct kvec *)msg->msg_iter.iov;
 
 	while (len > 0) {
 		if (vec->iov_len) {
diff --git a/include/net/udplite.h b/include/net/udplite.h
index d5baaba..ae7c8d1 100644
--- a/include/net/udplite.h
+++ b/include/net/udplite.h
@@ -20,7 +20,8 @@ static __inline__ int udplite_getfrag(void *from, char *to, int  offset,
 				      int len, int odd, struct sk_buff *skb)
 {
 	struct msghdr *msg = from;
-	return memcpy_fromiovecend(to, msg->msg_iov, offset, len);
+	/* XXX: stripping const */
+	return memcpy_fromiovecend(to, (struct iovec *)msg->msg_iter.iov, offset, len);
 }
 
 /* Designate sk as UDP-Lite socket */
diff --git a/net/atm/common.c b/net/atm/common.c
index f591129..b84057e 100644
--- a/net/atm/common.c
+++ b/net/atm/common.c
@@ -577,9 +577,6 @@ int vcc_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *m,
 	struct atm_vcc *vcc;
 	struct sk_buff *skb;
 	int eff, error;
-	struct iov_iter from;
-
-	iov_iter_init(&from, WRITE, m->msg_iov, m->msg_iovlen, size);
 
 	lock_sock(sk);
 	if (sock->state != SS_CONNECTED) {
@@ -634,7 +631,7 @@ int vcc_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *m,
 		goto out;
 	skb->dev = NULL; /* for paths shared with net_device interfaces */
 	ATM_SKB(skb)->atm_options = vcc->atm_options;
-	if (copy_from_iter(skb_put(skb, size), size, &from) != size) {
+	if (copy_from_iter(skb_put(skb, size), size, &m->msg_iter) != size) {
 		kfree_skb(skb);
 		error = -EFAULT;
 		goto out;
diff --git a/net/bluetooth/6lowpan.c b/net/bluetooth/6lowpan.c
index bdcaefd..d8c67a5 100644
--- a/net/bluetooth/6lowpan.c
+++ b/net/bluetooth/6lowpan.c
@@ -537,12 +537,12 @@ static int send_pkt(struct l2cap_chan *chan, struct sk_buff *skb,
 	 */
 	chan->data = skb;
 
-	memset(&msg, 0, sizeof(msg));
-	msg.msg_iov = (struct iovec *) &iv;
-	msg.msg_iovlen = 1;
 	iv.iov_base = skb->data;
 	iv.iov_len = skb->len;
 
+	memset(&msg, 0, sizeof(msg));
+	iov_iter_init(&msg.msg_iter, WRITE, (struct iovec *) &iv, 1, skb->len);
+
 	err = l2cap_chan_send(chan, &msg, skb->len);
 	if (err > 0) {
 		netdev->stats.tx_bytes += err;
diff --git a/net/bluetooth/a2mp.c b/net/bluetooth/a2mp.c
index 5dcade5..716d2a3 100644
--- a/net/bluetooth/a2mp.c
+++ b/net/bluetooth/a2mp.c
@@ -60,8 +60,7 @@ void a2mp_send(struct amp_mgr *mgr, u8 code, u8 ident, u16 len, void *data)
 
 	memset(&msg, 0, sizeof(msg));
 
-	msg.msg_iov = (struct iovec *) &iv;
-	msg.msg_iovlen = 1;
+	iov_iter_init(&msg.msg_iter, WRITE, (struct iovec *)&iv, 1, total_len);
 
 	l2cap_chan_send(chan, &msg, total_len);
 
diff --git a/net/bluetooth/smp.c b/net/bluetooth/smp.c
index 069b76e..21f555b 100644
--- a/net/bluetooth/smp.c
+++ b/net/bluetooth/smp.c
@@ -268,8 +268,7 @@ static void smp_send_cmd(struct l2cap_conn *conn, u8 code, u16 len, void *data)
 
 	memset(&msg, 0, sizeof(msg));
 
-	msg.msg_iov = (struct iovec *) &iv;
-	msg.msg_iovlen = 2;
+	iov_iter_init(&msg.msg_iter, WRITE, (struct iovec *)iv, 2, 1 + len);
 
 	l2cap_chan_send(chan, &msg, 1 + len);
 
diff --git a/net/caif/caif_socket.c b/net/caif/caif_socket.c
index ac618b0..769b185 100644
--- a/net/caif/caif_socket.c
+++ b/net/caif/caif_socket.c
@@ -535,7 +535,7 @@ static int caif_seqpkt_sendmsg(struct kiocb *kiocb, struct socket *sock,
 		goto err;
 
 	ret = -EINVAL;
-	if (unlikely(msg->msg_iov->iov_base == NULL))
+	if (unlikely(msg->msg_iter.iov->iov_base == NULL))
 		goto err;
 	noblock = msg->msg_flags & MSG_DONTWAIT;
 
diff --git a/net/compat.c b/net/compat.c
index 062f157..3236b41 100644
--- a/net/compat.c
+++ b/net/compat.c
@@ -37,13 +37,14 @@ ssize_t get_compat_msghdr(struct msghdr *kmsg,
 			  struct iovec **iov)
 {
 	compat_uptr_t uaddr, uiov, tmp3;
+	compat_size_t nr_segs;
 	ssize_t err;
 
 	if (!access_ok(VERIFY_READ, umsg, sizeof(*umsg)) ||
 	    __get_user(uaddr, &umsg->msg_name) ||
 	    __get_user(kmsg->msg_namelen, &umsg->msg_namelen) ||
 	    __get_user(uiov, &umsg->msg_iov) ||
-	    __get_user(kmsg->msg_iovlen, &umsg->msg_iovlen) ||
+	    __get_user(nr_segs, &umsg->msg_iovlen) ||
 	    __get_user(tmp3, &umsg->msg_control) ||
 	    __get_user(kmsg->msg_controllen, &umsg->msg_controllen) ||
 	    __get_user(kmsg->msg_flags, &umsg->msg_flags))
@@ -68,14 +69,15 @@ ssize_t get_compat_msghdr(struct msghdr *kmsg,
 		kmsg->msg_namelen = 0;
 	}
 
-	if (kmsg->msg_iovlen > UIO_MAXIOV)
+	if (nr_segs > UIO_MAXIOV)
 		return -EMSGSIZE;
 
 	err = compat_rw_copy_check_uvector(save_addr ? READ : WRITE,
-					   compat_ptr(uiov), kmsg->msg_iovlen,
+					   compat_ptr(uiov), nr_segs,
 					   UIO_FASTIOV, *iov, iov);
 	if (err >= 0)
-		kmsg->msg_iov = *iov;
+		iov_iter_init(&kmsg->msg_iter, save_addr ? READ : WRITE,
+			      *iov, nr_segs, err);
 	return err;
 }
 
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index cdedcf1..b50861b2 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -755,11 +755,13 @@ ip_generic_getfrag(void *from, char *to, int offset, int len, int odd, struct sk
 	struct msghdr *msg = from;
 
 	if (skb->ip_summed == CHECKSUM_PARTIAL) {
-		if (memcpy_fromiovecend(to, msg->msg_iov, offset, len) < 0)
+		/* XXX: stripping const */
+		if (memcpy_fromiovecend(to, (struct iovec *)msg->msg_iter.iov, offset, len) < 0)
 			return -EFAULT;
 	} else {
 		__wsum csum = 0;
-		if (csum_partial_copy_fromiovecend(to, msg->msg_iov, offset, len, &csum) < 0)
+		/* XXX: stripping const */
+		if (csum_partial_copy_fromiovecend(to, (struct iovec *)msg->msg_iter.iov, offset, len, &csum) < 0)
 			return -EFAULT;
 		skb->csum = csum_block_add(skb->csum, csum, odd);
 	}
diff --git a/net/ipv4/ping.c b/net/ipv4/ping.c
index 8dd4ae0..c0d82f7 100644
--- a/net/ipv4/ping.c
+++ b/net/ipv4/ping.c
@@ -811,7 +811,8 @@ back_from_confirm:
 	pfh.icmph.checksum = 0;
 	pfh.icmph.un.echo.id = inet->inet_sport;
 	pfh.icmph.un.echo.sequence = user_icmph.un.echo.sequence;
-	pfh.iov = msg->msg_iov;
+	/* XXX: stripping const */
+	pfh.iov = (struct iovec *)msg->msg_iter.iov;
 	pfh.wcheck = 0;
 	pfh.family = AF_INET;
 
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index 5d83bd2..0bb68df 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -625,7 +625,8 @@ static int raw_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 back_from_confirm:
 
 	if (inet->hdrincl)
-		err = raw_send_hdrinc(sk, &fl4, msg->msg_iov, len,
+		/* XXX: stripping const */
+		err = raw_send_hdrinc(sk, &fl4, (struct iovec *)msg->msg_iter.iov, len,
 				      &rt, msg->msg_flags);
 
 	 else {
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 4a96f37..54ba620 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1085,7 +1085,7 @@ static int tcp_sendmsg_fastopen(struct sock *sk, struct msghdr *msg,
 int tcp_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 		size_t size)
 {
-	struct iovec *iov;
+	const struct iovec *iov;
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct sk_buff *skb;
 	int iovlen, flags, err, copied = 0;
@@ -1136,8 +1136,8 @@ int tcp_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 	mss_now = tcp_send_mss(sk, &size_goal, flags);
 
 	/* Ok commence sending. */
-	iovlen = msg->msg_iovlen;
-	iov = msg->msg_iov;
+	iovlen = msg->msg_iter.nr_segs;
+	iov = msg->msg_iter.iov;
 	copied = 0;
 
 	err = -EPIPE;
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index f5bd4bd..3e225b0 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -3050,7 +3050,7 @@ static int tcp_send_syn_data(struct sock *sk, struct sk_buff *syn)
 	syn_data->ip_summed = CHECKSUM_PARTIAL;
 	memcpy(syn_data->cb, syn->cb, sizeof(syn->cb));
 	if (unlikely(memcpy_fromiovecend(skb_put(syn_data, space),
-					 fo->data->msg_iov, 0, space))) {
+					 fo->data->msg_iter.iov, 0, space))) {
 		kfree_skb(syn_data);
 		goto fallback;
 	}
diff --git a/net/ipv6/ping.c b/net/ipv6/ping.c
index 5b7a1ed..2d31483 100644
--- a/net/ipv6/ping.c
+++ b/net/ipv6/ping.c
@@ -163,7 +163,8 @@ int ping_v6_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 	pfh.icmph.checksum = 0;
 	pfh.icmph.un.echo.id = inet->inet_sport;
 	pfh.icmph.un.echo.sequence = user_icmph.icmp6_sequence;
-	pfh.iov = msg->msg_iov;
+	/* XXX: stripping const */
+	pfh.iov = (struct iovec *)msg->msg_iter.iov;
 	pfh.wcheck = 0;
 	pfh.family = AF_INET6;
 
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 11a9283..ee25631 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -886,7 +886,8 @@ static int rawv6_sendmsg(struct kiocb *iocb, struct sock *sk,
 
 back_from_confirm:
 	if (inet->hdrincl)
-		err = rawv6_send_hdrinc(sk, msg->msg_iov, len, &fl6, &dst, msg->msg_flags);
+		/* XXX: stripping const */
+		err = rawv6_send_hdrinc(sk, (struct iovec *)msg->msg_iter.iov, len, &fl6, &dst, msg->msg_flags);
 	else {
 		lock_sock(sk);
 		err = ip6_append_data(sk, raw6_getfrag, &rfv,
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 63aa5c8..cc9bcf0 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -2305,7 +2305,7 @@ static int netlink_sendmsg(struct kiocb *kiocb, struct socket *sock,
 	}
 
 	if (netlink_tx_is_mmaped(sk) &&
-	    msg->msg_iov->iov_base == NULL) {
+	    msg->msg_iter.iov->iov_base == NULL) {
 		err = netlink_mmap_sendmsg(sk, msg, dst_portid, dst_group,
 					   siocb);
 		goto out;
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index efa8445..ed2e620 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2408,11 +2408,8 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
 	unsigned short gso_type = 0;
 	int hlen, tlen;
 	int extra_len = 0;
-	struct iov_iter from;
 	ssize_t n;
 
-	iov_iter_init(&from, WRITE, msg->msg_iov, msg->msg_iovlen, len);
-
 	/*
 	 *	Get and verify the address.
 	 */
@@ -2451,7 +2448,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
 		len -= vnet_hdr_len;
 
 		err = -EFAULT;
-		n = copy_from_iter(&vnet_hdr, vnet_hdr_len, &from);
+		n = copy_from_iter(&vnet_hdr, vnet_hdr_len, &msg->msg_iter);
 		if (n != vnet_hdr_len)
 			goto out_unlock;
 
@@ -2522,7 +2519,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
 	}
 
 	/* Returns -EFAULT on error */
-	err = skb_copy_datagram_from_iter(skb, offset, &from, len);
+	err = skb_copy_datagram_from_iter(skb, offset, &msg->msg_iter, len);
 	if (err)
 		goto out_free;
 
diff --git a/net/rds/recv.c b/net/rds/recv.c
index 47d7b10..f9ec1ac 100644
--- a/net/rds/recv.c
+++ b/net/rds/recv.c
@@ -404,7 +404,6 @@ int rds_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
 	int ret = 0, nonblock = msg_flags & MSG_DONTWAIT;
 	DECLARE_SOCKADDR(struct sockaddr_in *, sin, msg->msg_name);
 	struct rds_incoming *inc = NULL;
-	struct iov_iter to;
 
 	/* udp_recvmsg()->sock_recvtimeo() gets away without locking too.. */
 	timeo = sock_rcvtimeo(sk, nonblock);
@@ -415,6 +414,7 @@ int rds_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
 		goto out;
 
 	while (1) {
+		struct iov_iter save;
 		/* If there are pending notifications, do those - and nothing else */
 		if (!list_empty(&rs->rs_notify_queue)) {
 			ret = rds_notify_queue_get(rs, msg);
@@ -450,8 +450,8 @@ int rds_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
 		rdsdebug("copying inc %p from %pI4:%u to user\n", inc,
 			 &inc->i_conn->c_faddr,
 			 ntohs(inc->i_hdr.h_sport));
-		iov_iter_init(&to, READ, msg->msg_iov, msg->msg_iovlen, size);
-		ret = inc->i_conn->c_trans->inc_copy_to_user(inc, &to);
+		save = msg->msg_iter;
+		ret = inc->i_conn->c_trans->inc_copy_to_user(inc, &msg->msg_iter);
 		if (ret < 0)
 			break;
 
@@ -464,6 +464,7 @@ int rds_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
 			rds_inc_put(inc);
 			inc = NULL;
 			rds_stats_inc(s_recv_deliver_raced);
+			msg->msg_iter = save;
 			continue;
 		}
 
diff --git a/net/rds/send.c b/net/rds/send.c
index 4de62ea..40a5629a 100644
--- a/net/rds/send.c
+++ b/net/rds/send.c
@@ -934,9 +934,7 @@ int rds_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
 	int queued = 0, allocated_mr = 0;
 	int nonblock = msg->msg_flags & MSG_DONTWAIT;
 	long timeo = sock_sndtimeo(sk, nonblock);
-	struct iov_iter from;
 
-	iov_iter_init(&from, WRITE, msg->msg_iov, msg->msg_iovlen, payload_len);
 	/* Mirror Linux UDP mirror of BSD error message compatibility */
 	/* XXX: Perhaps MSG_MORE someday */
 	if (msg->msg_flags & ~(MSG_DONTWAIT | MSG_CMSG_COMPAT)) {
@@ -984,7 +982,7 @@ int rds_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
 			ret = -ENOMEM;
 			goto out;
 		}
-		ret = rds_message_copy_from_user(rm, &from);
+		ret = rds_message_copy_from_user(rm, &msg->msg_iter);
 		if (ret)
 			goto out;
 	}
diff --git a/net/rxrpc/ar-output.c b/net/rxrpc/ar-output.c
index 0b4b9a7..86e0f10 100644
--- a/net/rxrpc/ar-output.c
+++ b/net/rxrpc/ar-output.c
@@ -531,14 +531,12 @@ static int rxrpc_send_data(struct kiocb *iocb,
 	struct rxrpc_skb_priv *sp;
 	unsigned char __user *from;
 	struct sk_buff *skb;
-	struct iovec *iov;
+	const struct iovec *iov;
 	struct sock *sk = &rx->sk;
 	long timeo;
 	bool more;
 	int ret, ioc, segment, copied;
 
-	_enter(",,,{%zu},%zu", msg->msg_iovlen, len);
-
 	timeo = sock_sndtimeo(sk, msg->msg_flags & MSG_DONTWAIT);
 
 	/* this should be in poll */
@@ -547,8 +545,8 @@ static int rxrpc_send_data(struct kiocb *iocb,
 	if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN))
 		return -EPIPE;
 
-	iov = msg->msg_iov;
-	ioc = msg->msg_iovlen - 1;
+	iov = msg->msg_iter.iov;
+	ioc = msg->msg_iter.nr_segs - 1;
 	from = iov->iov_base;
 	segment = iov->iov_len;
 	iov++;
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 0397ac9..c92f96cd 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -1609,9 +1609,6 @@ static int sctp_sendmsg(struct kiocb *iocb, struct sock *sk,
 	__u16 sinfo_flags = 0;
 	long timeo;
 	int err;
-	struct iov_iter from;
-
-	iov_iter_init(&from, WRITE, msg->msg_iov, msg->msg_iovlen, msg_len);
 
 	err = 0;
 	sp = sctp_sk(sk);
@@ -1950,7 +1947,7 @@ static int sctp_sendmsg(struct kiocb *iocb, struct sock *sk,
 	}
 
 	/* Break the message into multiple chunks of maximum size. */
-	datamsg = sctp_datamsg_from_user(asoc, sinfo, &from);
+	datamsg = sctp_datamsg_from_user(asoc, sinfo, &msg->msg_iter);
 	if (IS_ERR(datamsg)) {
 		err = PTR_ERR(datamsg);
 		goto out_free;
diff --git a/net/socket.c b/net/socket.c
index ee3ee39..46571ee 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -691,8 +691,7 @@ int kernel_sendmsg(struct socket *sock, struct msghdr *msg,
 	 * the following is safe, since for compiler definitions of kvec and
 	 * iovec are identical, yielding the same in-core layout and alignment
 	 */
-	msg->msg_iov = (struct iovec *)vec;
-	msg->msg_iovlen = num;
+	iov_iter_init(&msg->msg_iter, WRITE, (struct iovec *)vec, num, size);
 	result = sock_sendmsg(sock, msg, size);
 	set_fs(oldfs);
 	return result;
@@ -855,7 +854,7 @@ int kernel_recvmsg(struct socket *sock, struct msghdr *msg,
 	 * the following is safe, since for compiler definitions of kvec and
 	 * iovec are identical, yielding the same in-core layout and alignment
 	 */
-	msg->msg_iov = (struct iovec *)vec, msg->msg_iovlen = num;
+	iov_iter_init(&msg->msg_iter, READ, (struct iovec *)vec, num, size);
 	result = sock_recvmsg(sock, msg, size, flags);
 	set_fs(oldfs);
 	return result;
@@ -915,8 +914,7 @@ static ssize_t do_sock_read(struct msghdr *msg, struct kiocb *iocb,
 	msg->msg_namelen = 0;
 	msg->msg_control = NULL;
 	msg->msg_controllen = 0;
-	msg->msg_iov = (struct iovec *)iov;
-	msg->msg_iovlen = nr_segs;
+	iov_iter_init(&msg->msg_iter, READ, iov, nr_segs, size);
 	msg->msg_flags = (file->f_flags & O_NONBLOCK) ? MSG_DONTWAIT : 0;
 
 	return __sock_recvmsg(iocb, sock, msg, size, msg->msg_flags);
@@ -955,8 +953,7 @@ static ssize_t do_sock_write(struct msghdr *msg, struct kiocb *iocb,
 	msg->msg_namelen = 0;
 	msg->msg_control = NULL;
 	msg->msg_controllen = 0;
-	msg->msg_iov = (struct iovec *)iov;
-	msg->msg_iovlen = nr_segs;
+	iov_iter_init(&msg->msg_iter, WRITE, iov, nr_segs, size);
 	msg->msg_flags = (file->f_flags & O_NONBLOCK) ? MSG_DONTWAIT : 0;
 	if (sock->type == SOCK_SEQPACKET)
 		msg->msg_flags |= MSG_EOR;
@@ -1800,8 +1797,7 @@ SYSCALL_DEFINE6(sendto, int, fd, void __user *, buff, size_t, len,
 	iov.iov_base = buff;
 	iov.iov_len = len;
 	msg.msg_name = NULL;
-	msg.msg_iov = &iov;
-	msg.msg_iovlen = 1;
+	iov_iter_init(&msg.msg_iter, WRITE, &iov, 1, len);
 	msg.msg_control = NULL;
 	msg.msg_controllen = 0;
 	msg.msg_namelen = 0;
@@ -1858,10 +1854,9 @@ SYSCALL_DEFINE6(recvfrom, int, fd, void __user *, ubuf, size_t, size,
 
 	msg.msg_control = NULL;
 	msg.msg_controllen = 0;
-	msg.msg_iovlen = 1;
-	msg.msg_iov = &iov;
 	iov.iov_len = size;
 	iov.iov_base = ubuf;
+	iov_iter_init(&msg.msg_iter, READ, &iov, 1, size);
 	/* Save some cycles and don't copy the address if not needed */
 	msg.msg_name = addr ? (struct sockaddr *)&address : NULL;
 	/* We assume all kernel code knows the size of sockaddr_storage */
@@ -1995,13 +1990,14 @@ static ssize_t copy_msghdr_from_user(struct msghdr *kmsg,
 {
 	struct sockaddr __user *uaddr;
 	struct iovec __user *uiov;
+	size_t nr_segs;
 	ssize_t err;
 
 	if (!access_ok(VERIFY_READ, umsg, sizeof(*umsg)) ||
 	    __get_user(uaddr, &umsg->msg_name) ||
 	    __get_user(kmsg->msg_namelen, &umsg->msg_namelen) ||
 	    __get_user(uiov, &umsg->msg_iov) ||
-	    __get_user(kmsg->msg_iovlen, &umsg->msg_iovlen) ||
+	    __get_user(nr_segs, &umsg->msg_iovlen) ||
 	    __get_user(kmsg->msg_control, &umsg->msg_control) ||
 	    __get_user(kmsg->msg_controllen, &umsg->msg_controllen) ||
 	    __get_user(kmsg->msg_flags, &umsg->msg_flags))
@@ -2031,14 +2027,15 @@ static ssize_t copy_msghdr_from_user(struct msghdr *kmsg,
 		kmsg->msg_namelen = 0;
 	}
 
-	if (kmsg->msg_iovlen > UIO_MAXIOV)
+	if (nr_segs > UIO_MAXIOV)
 		return -EMSGSIZE;
 
 	err = rw_copy_check_uvector(save_addr ? READ : WRITE,
-				    uiov, kmsg->msg_iovlen,
+				    uiov, nr_segs,
 				    UIO_FASTIOV, *iov, iov);
 	if (err >= 0)
-		kmsg->msg_iov = *iov;
+		iov_iter_init(&kmsg->msg_iter, save_addr ? READ : WRITE,
+			      *iov, nr_segs, err);
 	return err;
 }
 
diff --git a/net/tipc/msg.c b/net/tipc/msg.c
index 5b06597..a687b30 100644
--- a/net/tipc/msg.c
+++ b/net/tipc/msg.c
@@ -194,7 +194,7 @@ int tipc_msg_build(struct tipc_msg *mhdr, struct msghdr *m, int offset,
 		__skb_queue_tail(list, skb);
 		skb_copy_to_linear_data(skb, mhdr, mhsz);
 		pktpos = skb->data + mhsz;
-		if (!dsz || !memcpy_fromiovecend(pktpos, m->msg_iov, offset,
+		if (!dsz || !memcpy_fromiovecend(pktpos, m->msg_iter.iov, offset,
 						 dsz))
 			return dsz;
 		rc = -EFAULT;
@@ -224,7 +224,7 @@ int tipc_msg_build(struct tipc_msg *mhdr, struct msghdr *m, int offset,
 		if (drem < pktrem)
 			pktrem = drem;
 
-		if (memcpy_fromiovecend(pktpos, m->msg_iov, offset, pktrem)) {
+		if (memcpy_fromiovecend(pktpos, m->msg_iter.iov, offset, pktrem)) {
 			rc = -EFAULT;
 			goto error;
 		}
diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 9658d9b..4e78a7d 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -850,9 +850,9 @@ static int dest_name_check(struct sockaddr_tipc *dest, struct msghdr *m)
 	if (likely(dest->addr.name.name.type != TIPC_CFG_SRV))
 		return -EACCES;
 
-	if (!m->msg_iovlen || (m->msg_iov[0].iov_len < sizeof(hdr)))
+	if (!m->msg_iter.nr_segs || (m->msg_iter.iov[0].iov_len < sizeof(hdr)))
 		return -EMSGSIZE;
-	if (copy_from_user(&hdr, m->msg_iov[0].iov_base, sizeof(hdr)))
+	if (copy_from_user(&hdr, m->msg_iter.iov[0].iov_base, sizeof(hdr)))
 		return -EFAULT;
 	if ((ntohs(hdr.tcm_type) & 0xC000) && (!capable(CAP_NET_ADMIN)))
 		return -EACCES;
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 4450d62..8e1b102 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1459,9 +1459,6 @@ static int unix_dgram_sendmsg(struct kiocb *kiocb, struct socket *sock,
 	struct scm_cookie tmp_scm;
 	int max_level;
 	int data_len = 0;
-	struct iov_iter from;
-
-	iov_iter_init(&from, WRITE, msg->msg_iov, msg->msg_iovlen, len);
 
 	if (NULL == siocb->scm)
 		siocb->scm = &tmp_scm;
@@ -1519,7 +1516,7 @@ static int unix_dgram_sendmsg(struct kiocb *kiocb, struct socket *sock,
 	skb_put(skb, len - data_len);
 	skb->data_len = data_len;
 	skb->len = len;
-	err = skb_copy_datagram_from_iter(skb, 0, &from, len);
+	err = skb_copy_datagram_from_iter(skb, 0, &msg->msg_iter, len);
 	if (err)
 		goto out_free;
 
@@ -1641,9 +1638,6 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
 	bool fds_sent = false;
 	int max_level;
 	int data_len;
-	struct iov_iter from;
-
-	iov_iter_init(&from, WRITE, msg->msg_iov, msg->msg_iovlen, len);
 
 	if (NULL == siocb->scm)
 		siocb->scm = &tmp_scm;
@@ -1700,7 +1694,7 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
 		skb_put(skb, size - data_len);
 		skb->data_len = data_len;
 		skb->len = size;
-		err = skb_copy_datagram_from_iter(skb, 0, &from, size);
+		err = skb_copy_datagram_from_iter(skb, 0, &msg->msg_iter, size);
 		if (err) {
 			kfree_skb(skb);
 			goto out_err;
diff --git a/net/vmw_vsock/vmci_transport.c b/net/vmw_vsock/vmci_transport.c
index 20a0ba3..02d2e52 100644
--- a/net/vmw_vsock/vmci_transport.c
+++ b/net/vmw_vsock/vmci_transport.c
@@ -1850,7 +1850,8 @@ static ssize_t vmci_transport_stream_enqueue(
 	struct msghdr *msg,
 	size_t len)
 {
-	return vmci_qpair_enquev(vmci_trans(vsk)->qpair, msg->msg_iov, len, 0);
+	/* XXX: stripping const */
+	return vmci_qpair_enquev(vmci_trans(vsk)->qpair, (struct iovec *)msg->msg_iter.iov, len, 0);
 }
 
 static s64 vmci_transport_stream_has_data(struct vsock_sock *vsk)
-- 
2.1.3


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 08/12] first fruits - kill l2cap ->memcpy_fromiovec()
  2014-12-05  5:56                                   ` the next chunk of iov_iter-net stuff for review Al Viro
                                                       ` (6 preceding siblings ...)
  2014-12-05  5:58                                     ` [PATCH 07/12] put iov_iter into msghdr Al Viro
@ 2014-12-05  5:58                                     ` Al Viro
  2014-12-05  5:58                                     ` [PATCH 09/12] switch memcpy_to_msg() and skb_copy{,_and_csum}_datagram_msg() to primitives Al Viro
                                                       ` (4 subsequent siblings)
  12 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-12-05  5:58 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

Just use copy_from_iter().  That's what this method is trying to do
in all cases, in a very convoluted fashion.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 include/net/bluetooth/l2cap.h | 29 -----------------------------
 net/bluetooth/6lowpan.c       |  3 +--
 net/bluetooth/a2mp.c          |  3 +--
 net/bluetooth/l2cap_core.c    |  7 +++----
 net/bluetooth/l2cap_sock.c    |  8 --------
 net/bluetooth/smp.c           |  4 +---
 6 files changed, 6 insertions(+), 48 deletions(-)

diff --git a/include/net/bluetooth/l2cap.h b/include/net/bluetooth/l2cap.h
index bca6fc0..692f786 100644
--- a/include/net/bluetooth/l2cap.h
+++ b/include/net/bluetooth/l2cap.h
@@ -606,10 +606,6 @@ struct l2cap_ops {
 	struct sk_buff		*(*alloc_skb) (struct l2cap_chan *chan,
 					       unsigned long hdr_len,
 					       unsigned long len, int nb);
-	int			(*memcpy_fromiovec) (struct l2cap_chan *chan,
-						     unsigned char *kdata,
-						     struct msghdr *msg,
-						     int len);
 };
 
 struct l2cap_conn {
@@ -903,31 +899,6 @@ static inline long l2cap_chan_no_get_sndtimeo(struct l2cap_chan *chan)
 	return 0;
 }
 
-static inline int l2cap_chan_no_memcpy_fromiovec(struct l2cap_chan *chan,
-						 unsigned char *kdata,
-						 struct msghdr *msg,
-						 int len)
-{
-	/* Following is safe since for compiler definitions of kvec and
-	 * iovec are identical, yielding the same in-core layout and alignment
-	 */
-	struct kvec *vec = (struct kvec *)msg->msg_iter.iov;
-
-	while (len > 0) {
-		if (vec->iov_len) {
-			int copy = min_t(unsigned int, len, vec->iov_len);
-			memcpy(kdata, vec->iov_base, copy);
-			len -= copy;
-			kdata += copy;
-			vec->iov_base += copy;
-			vec->iov_len -= copy;
-		}
-		vec++;
-	}
-
-	return 0;
-}
-
 extern bool disable_ertm;
 
 int l2cap_init_sockets(void);
diff --git a/net/bluetooth/6lowpan.c b/net/bluetooth/6lowpan.c
index d8c67a5..76617be 100644
--- a/net/bluetooth/6lowpan.c
+++ b/net/bluetooth/6lowpan.c
@@ -541,7 +541,7 @@ static int send_pkt(struct l2cap_chan *chan, struct sk_buff *skb,
 	iv.iov_len = skb->len;
 
 	memset(&msg, 0, sizeof(msg));
-	iov_iter_init(&msg.msg_iter, WRITE, (struct iovec *) &iv, 1, skb->len);
+	iov_iter_kvec(&msg.msg_iter, WRITE | ITER_KVEC, &iv, 1, skb->len);
 
 	err = l2cap_chan_send(chan, &msg, skb->len);
 	if (err > 0) {
@@ -1050,7 +1050,6 @@ static const struct l2cap_ops bt_6lowpan_chan_ops = {
 	.suspend		= chan_suspend_cb,
 	.get_sndtimeo		= chan_get_sndtimeo_cb,
 	.alloc_skb		= chan_alloc_skb_cb,
-	.memcpy_fromiovec	= l2cap_chan_no_memcpy_fromiovec,
 
 	.teardown		= l2cap_chan_no_teardown,
 	.defer			= l2cap_chan_no_defer,
diff --git a/net/bluetooth/a2mp.c b/net/bluetooth/a2mp.c
index 716d2a3..cedfbda 100644
--- a/net/bluetooth/a2mp.c
+++ b/net/bluetooth/a2mp.c
@@ -60,7 +60,7 @@ void a2mp_send(struct amp_mgr *mgr, u8 code, u8 ident, u16 len, void *data)
 
 	memset(&msg, 0, sizeof(msg));
 
-	iov_iter_init(&msg.msg_iter, WRITE, (struct iovec *)&iv, 1, total_len);
+	iov_iter_kvec(&msg.msg_iter, WRITE | ITER_KVEC, &iv, 1, total_len);
 
 	l2cap_chan_send(chan, &msg, total_len);
 
@@ -719,7 +719,6 @@ static const struct l2cap_ops a2mp_chan_ops = {
 	.resume = l2cap_chan_no_resume,
 	.set_shutdown = l2cap_chan_no_set_shutdown,
 	.get_sndtimeo = l2cap_chan_no_get_sndtimeo,
-	.memcpy_fromiovec = l2cap_chan_no_memcpy_fromiovec,
 };
 
 static struct l2cap_chan *a2mp_chan_open(struct l2cap_conn *conn, bool locked)
diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c
index 5201d61..1754040 100644
--- a/net/bluetooth/l2cap_core.c
+++ b/net/bluetooth/l2cap_core.c
@@ -2096,8 +2096,7 @@ static inline int l2cap_skbuff_fromiovec(struct l2cap_chan *chan,
 	struct sk_buff **frag;
 	int sent = 0;
 
-	if (chan->ops->memcpy_fromiovec(chan, skb_put(skb, count),
-					msg, count))
+	if (copy_from_iter(skb_put(skb, count), count, &msg->msg_iter) != count)
 		return -EFAULT;
 
 	sent += count;
@@ -2117,8 +2116,8 @@ static inline int l2cap_skbuff_fromiovec(struct l2cap_chan *chan,
 
 		*frag = tmp;
 
-		if (chan->ops->memcpy_fromiovec(chan, skb_put(*frag, count),
-						msg, count))
+		if (copy_from_iter(skb_put(*frag, count), count,
+				   &msg->msg_iter) != count)
 			return -EFAULT;
 
 		sent += count;
diff --git a/net/bluetooth/l2cap_sock.c b/net/bluetooth/l2cap_sock.c
index 205b298..f65caf4 100644
--- a/net/bluetooth/l2cap_sock.c
+++ b/net/bluetooth/l2cap_sock.c
@@ -1336,13 +1336,6 @@ static struct sk_buff *l2cap_sock_alloc_skb_cb(struct l2cap_chan *chan,
 	return skb;
 }
 
-static int l2cap_sock_memcpy_fromiovec_cb(struct l2cap_chan *chan,
-					  unsigned char *kdata,
-					  struct msghdr *msg, int len)
-{
-	return memcpy_from_msg(kdata, msg, len);
-}
-
 static void l2cap_sock_ready_cb(struct l2cap_chan *chan)
 {
 	struct sock *sk = chan->data;
@@ -1427,7 +1420,6 @@ static const struct l2cap_ops l2cap_chan_ops = {
 	.set_shutdown		= l2cap_sock_set_shutdown_cb,
 	.get_sndtimeo		= l2cap_sock_get_sndtimeo_cb,
 	.alloc_skb		= l2cap_sock_alloc_skb_cb,
-	.memcpy_fromiovec	= l2cap_sock_memcpy_fromiovec_cb,
 };
 
 static void l2cap_sock_destruct(struct sock *sk)
diff --git a/net/bluetooth/smp.c b/net/bluetooth/smp.c
index 21f555b..de7dc75 100644
--- a/net/bluetooth/smp.c
+++ b/net/bluetooth/smp.c
@@ -268,7 +268,7 @@ static void smp_send_cmd(struct l2cap_conn *conn, u8 code, u16 len, void *data)
 
 	memset(&msg, 0, sizeof(msg));
 
-	iov_iter_init(&msg.msg_iter, WRITE, (struct iovec *)iv, 2, 1 + len);
+	iov_iter_kvec(&msg.msg_iter, WRITE | ITER_KVEC, iv, 2, 1 + len);
 
 	l2cap_chan_send(chan, &msg, 1 + len);
 
@@ -1629,7 +1629,6 @@ static const struct l2cap_ops smp_chan_ops = {
 	.suspend		= l2cap_chan_no_suspend,
 	.set_shutdown		= l2cap_chan_no_set_shutdown,
 	.get_sndtimeo		= l2cap_chan_no_get_sndtimeo,
-	.memcpy_fromiovec	= l2cap_chan_no_memcpy_fromiovec,
 };
 
 static inline struct l2cap_chan *smp_new_conn_cb(struct l2cap_chan *pchan)
@@ -1678,7 +1677,6 @@ static const struct l2cap_ops smp_root_chan_ops = {
 	.resume			= l2cap_chan_no_resume,
 	.set_shutdown		= l2cap_chan_no_set_shutdown,
 	.get_sndtimeo		= l2cap_chan_no_get_sndtimeo,
-	.memcpy_fromiovec	= l2cap_chan_no_memcpy_fromiovec,
 };
 
 int smp_register(struct hci_dev *hdev)
-- 
2.1.3


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 09/12] switch memcpy_to_msg() and skb_copy{,_and_csum}_datagram_msg() to primitives
  2014-12-05  5:56                                   ` the next chunk of iov_iter-net stuff for review Al Viro
                                                       ` (7 preceding siblings ...)
  2014-12-05  5:58                                     ` [PATCH 08/12] first fruits - kill l2cap ->memcpy_fromiovec() Al Viro
@ 2014-12-05  5:58                                     ` Al Viro
  2014-12-05  5:58                                     ` [PATCH 10/12] ppp_read(): switch to skb_copy_datagram_iter() Al Viro
                                                       ` (3 subsequent siblings)
  12 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-12-05  5:58 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

... making both non-draining.  That means that tcp_recvmsg() becomes
non-draining.  And _that_ would break iscsit_do_rx_data() unless we
	a) make sure tcp_recvmsg() is uniformly non-draining (it is)
	b) make sure it copes with arbitrary (including shifted)
iov_iter (it does, all it uses is iov_iter primitives)
	c) make iscsit_do_rx_data() initialize ->msg_iter only once.

Fortunately, (c) is doable with minimal work and we are rid of one
the two places where kernel send/recvmsg users would be unhappy with
non-draining behaviour.

Actually, that makes all but one of ->recvmsg() instances iov_iter-clean.
The exception is skcipher_recvmsg() and it also isn't hard to convert
to primitives (iov_iter_get_pages() is needed there).  That'll wait
a bit - there's some interplay with ->sendmsg() path for that one.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 drivers/target/iscsi/iscsi_target_util.c | 12 +++----
 include/linux/skbuff.h                   | 16 +++-------
 net/core/datagram.c                      | 54 +++++++++++---------------------
 3 files changed, 28 insertions(+), 54 deletions(-)

diff --git a/drivers/target/iscsi/iscsi_target_util.c b/drivers/target/iscsi/iscsi_target_util.c
index ce87ce9..7c6a95b 100644
--- a/drivers/target/iscsi/iscsi_target_util.c
+++ b/drivers/target/iscsi/iscsi_target_util.c
@@ -1326,21 +1326,19 @@ static int iscsit_do_rx_data(
 	struct iscsi_conn *conn,
 	struct iscsi_data_count *count)
 {
-	int data = count->data_length, rx_loop = 0, total_rx = 0, iov_len;
-	struct kvec *iov_p;
+	int data = count->data_length, rx_loop = 0, total_rx = 0;
 	struct msghdr msg;
 
 	if (!conn || !conn->sock || !conn->conn_ops)
 		return -1;
 
 	memset(&msg, 0, sizeof(struct msghdr));
-
-	iov_p = count->iov;
-	iov_len	= count->iov_count;
+	iov_iter_kvec(&msg.msg_iter, READ | ITER_KVEC,
+		      count->iov, count->iov_count, data);
 
 	while (total_rx < data) {
-		rx_loop = kernel_recvmsg(conn->sock, &msg, iov_p, iov_len,
-					(data - total_rx), MSG_WAITALL);
+		rx_loop = sock_recvmsg(conn->sock, &msg,
+				      (data - total_rx), MSG_WAITALL);
 		if (rx_loop <= 0) {
 			pr_debug("rx_loop: %d total_rx: %d\n",
 				rx_loop, total_rx);
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 4199dfa..45ac95c 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2649,17 +2649,10 @@ int skb_copy_datagram_iter(const struct sk_buff *from, int offset,
 static inline int skb_copy_datagram_msg(const struct sk_buff *from, int offset,
 					struct msghdr *msg, int size)
 {
-	/* XXX: stripping const */
-	return skb_copy_datagram_iovec(from, offset, (struct iovec *)msg->msg_iter.iov, size);
-}
-int skb_copy_and_csum_datagram_iovec(struct sk_buff *skb, int hlen,
-				     struct iovec *iov);
-static inline int skb_copy_and_csum_datagram_msg(struct sk_buff *skb, int hlen,
-			    struct msghdr *msg)
-{
-	/* XXX: stripping const */
-	return skb_copy_and_csum_datagram_iovec(skb, hlen, (struct iovec *)msg->msg_iter.iov);
+	return skb_copy_datagram_iter(from, offset, &msg->msg_iter, size);
 }
+int skb_copy_and_csum_datagram_msg(struct sk_buff *skb, int hlen,
+				   struct msghdr *msg);
 int skb_copy_datagram_from_iter(struct sk_buff *skb, int offset,
 				 struct iov_iter *from, int len);
 int zerocopy_sg_from_iter(struct sk_buff *skb, struct iov_iter *frm);
@@ -2695,8 +2688,7 @@ static inline int memcpy_from_msg(void *data, struct msghdr *msg, int len)
 
 static inline int memcpy_to_msg(struct msghdr *msg, void *data, int len)
 {
-	/* XXX: stripping const */
-	return memcpy_toiovec((struct iovec *)msg->msg_iter.iov, data, len);
+	return copy_to_iter(data, len, &msg->msg_iter) == len ? 0 : -EFAULT;
 }
 
 struct skb_checksum_ops {
diff --git a/net/core/datagram.c b/net/core/datagram.c
index b6e303b..41075ed 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -615,27 +615,25 @@ int zerocopy_sg_from_iter(struct sk_buff *skb, struct iov_iter *from)
 EXPORT_SYMBOL(zerocopy_sg_from_iter);
 
 static int skb_copy_and_csum_datagram(const struct sk_buff *skb, int offset,
-				      u8 __user *to, int len,
+				      struct iov_iter *to, int len,
 				      __wsum *csump)
 {
 	int start = skb_headlen(skb);
 	int i, copy = start - offset;
 	struct sk_buff *frag_iter;
 	int pos = 0;
+	int n;
 
 	/* Copy header. */
 	if (copy > 0) {
-		int err = 0;
 		if (copy > len)
 			copy = len;
-		*csump = csum_and_copy_to_user(skb->data + offset, to, copy,
-					       *csump, &err);
-		if (err)
+		n = csum_and_copy_to_iter(skb->data + offset, copy, csump, to);
+		if (n != copy)
 			goto fault;
 		if ((len -= copy) == 0)
 			return 0;
 		offset += copy;
-		to += copy;
 		pos = copy;
 	}
 
@@ -647,26 +645,22 @@ static int skb_copy_and_csum_datagram(const struct sk_buff *skb, int offset,
 
 		end = start + skb_frag_size(frag);
 		if ((copy = end - offset) > 0) {
-			__wsum csum2;
-			int err = 0;
-			u8  *vaddr;
+			__wsum csum2 = 0;
 			struct page *page = skb_frag_page(frag);
+			u8  *vaddr = kmap(page);
 
 			if (copy > len)
 				copy = len;
-			vaddr = kmap(page);
-			csum2 = csum_and_copy_to_user(vaddr +
-							frag->page_offset +
-							offset - start,
-						      to, copy, 0, &err);
+			n = csum_and_copy_to_iter(vaddr + frag->page_offset +
+						  offset - start, copy,
+						  &csum2, to);
 			kunmap(page);
-			if (err)
+			if (n != copy)
 				goto fault;
 			*csump = csum_block_add(*csump, csum2, pos);
 			if (!(len -= copy))
 				return 0;
 			offset += copy;
-			to += copy;
 			pos += copy;
 		}
 		start = end;
@@ -691,7 +685,6 @@ static int skb_copy_and_csum_datagram(const struct sk_buff *skb, int offset,
 			if ((len -= copy) == 0)
 				return 0;
 			offset += copy;
-			to += copy;
 			pos += copy;
 		}
 		start = end;
@@ -744,20 +737,19 @@ __sum16 __skb_checksum_complete(struct sk_buff *skb)
 EXPORT_SYMBOL(__skb_checksum_complete);
 
 /**
- *	skb_copy_and_csum_datagram_iovec - Copy and checksum skb to user iovec.
+ *	skb_copy_and_csum_datagram_msg - Copy and checksum skb to user iovec.
  *	@skb: skbuff
  *	@hlen: hardware length
- *	@iov: io vector
+ *	@msg: destination
  *
  *	Caller _must_ check that skb will fit to this iovec.
  *
  *	Returns: 0       - success.
  *		 -EINVAL - checksum failure.
- *		 -EFAULT - fault during copy. Beware, in this case iovec
- *			   can be modified!
+ *		 -EFAULT - fault during copy.
  */
-int skb_copy_and_csum_datagram_iovec(struct sk_buff *skb,
-				     int hlen, struct iovec *iov)
+int skb_copy_and_csum_datagram_msg(struct sk_buff *skb,
+				   int hlen, struct msghdr *msg)
 {
 	__wsum csum;
 	int chunk = skb->len - hlen;
@@ -765,28 +757,20 @@ int skb_copy_and_csum_datagram_iovec(struct sk_buff *skb,
 	if (!chunk)
 		return 0;
 
-	/* Skip filled elements.
-	 * Pretty silly, look at memcpy_toiovec, though 8)
-	 */
-	while (!iov->iov_len)
-		iov++;
-
-	if (iov->iov_len < chunk) {
+	if (iov_iter_count(&msg->msg_iter) < chunk) {
 		if (__skb_checksum_complete(skb))
 			goto csum_error;
-		if (skb_copy_datagram_iovec(skb, hlen, iov, chunk))
+		if (skb_copy_datagram_msg(skb, hlen, msg, chunk))
 			goto fault;
 	} else {
 		csum = csum_partial(skb->data, hlen, skb->csum);
-		if (skb_copy_and_csum_datagram(skb, hlen, iov->iov_base,
+		if (skb_copy_and_csum_datagram(skb, hlen, &msg->msg_iter,
 					       chunk, &csum))
 			goto fault;
 		if (csum_fold(csum))
 			goto csum_error;
 		if (unlikely(skb->ip_summed == CHECKSUM_COMPLETE))
 			netdev_rx_csum_fault(skb->dev);
-		iov->iov_len -= chunk;
-		iov->iov_base += chunk;
 	}
 	return 0;
 csum_error:
@@ -794,7 +778,7 @@ csum_error:
 fault:
 	return -EFAULT;
 }
-EXPORT_SYMBOL(skb_copy_and_csum_datagram_iovec);
+EXPORT_SYMBOL(skb_copy_and_csum_datagram_msg);
 
 /**
  * 	datagram_poll - generic datagram poll
-- 
2.1.3


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 10/12] ppp_read(): switch to skb_copy_datagram_iter()
  2014-12-05  5:56                                   ` the next chunk of iov_iter-net stuff for review Al Viro
                                                       ` (8 preceding siblings ...)
  2014-12-05  5:58                                     ` [PATCH 09/12] switch memcpy_to_msg() and skb_copy{,_and_csum}_datagram_msg() to primitives Al Viro
@ 2014-12-05  5:58                                     ` Al Viro
  2014-12-05  5:58                                     ` [PATCH 11/12] skb_copy_datagram_iovec() can die Al Viro
                                                       ` (2 subsequent siblings)
  12 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-12-05  5:58 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 drivers/net/ppp/ppp_generic.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ppp/ppp_generic.c b/drivers/net/ppp/ppp_generic.c
index 794a473..af034db 100644
--- a/drivers/net/ppp/ppp_generic.c
+++ b/drivers/net/ppp/ppp_generic.c
@@ -417,6 +417,7 @@ static ssize_t ppp_read(struct file *file, char __user *buf,
 	ssize_t ret;
 	struct sk_buff *skb = NULL;
 	struct iovec iov;
+	struct iov_iter to;
 
 	ret = count;
 
@@ -462,7 +463,8 @@ static ssize_t ppp_read(struct file *file, char __user *buf,
 	ret = -EFAULT;
 	iov.iov_base = buf;
 	iov.iov_len = count;
-	if (skb_copy_datagram_iovec(skb, 0, &iov, skb->len))
+	iov_iter_init(&to, READ, &iov, 1, count);
+	if (skb_copy_datagram_iter(skb, 0, &to, skb->len))
 		goto outf;
 	ret = skb->len;
 
-- 
2.1.3


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 11/12] skb_copy_datagram_iovec() can die
  2014-12-05  5:56                                   ` the next chunk of iov_iter-net stuff for review Al Viro
                                                       ` (9 preceding siblings ...)
  2014-12-05  5:58                                     ` [PATCH 10/12] ppp_read(): switch to skb_copy_datagram_iter() Al Viro
@ 2014-12-05  5:58                                     ` Al Viro
  2014-12-05  5:58                                     ` [PATCH 12/12] bury memcpy_toiovec() Al Viro
  2014-12-09 20:07                                     ` the next chunk of iov_iter-net stuff for review David Miller
  12 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-12-05  5:58 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

no callers other than itself.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 include/linux/skbuff.h |  2 --
 net/core/datagram.c    | 84 --------------------------------------------------
 2 files changed, 86 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 45ac95c..f676e54 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2642,8 +2642,6 @@ struct sk_buff *skb_recv_datagram(struct sock *sk, unsigned flags, int noblock,
 				  int *err);
 unsigned int datagram_poll(struct file *file, struct socket *sock,
 			   struct poll_table_struct *wait);
-int skb_copy_datagram_iovec(const struct sk_buff *from, int offset,
-			    struct iovec *to, int size);
 int skb_copy_datagram_iter(const struct sk_buff *from, int offset,
 			   struct iov_iter *to, int size);
 static inline int skb_copy_datagram_msg(const struct sk_buff *from, int offset,
diff --git a/net/core/datagram.c b/net/core/datagram.c
index 41075ed..df493d6 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -310,90 +310,6 @@ int skb_kill_datagram(struct sock *sk, struct sk_buff *skb, unsigned int flags)
 EXPORT_SYMBOL(skb_kill_datagram);
 
 /**
- *	skb_copy_datagram_iovec - Copy a datagram to an iovec.
- *	@skb: buffer to copy
- *	@offset: offset in the buffer to start copying from
- *	@to: io vector to copy to
- *	@len: amount of data to copy from buffer to iovec
- *
- *	Note: the iovec is modified during the copy.
- */
-int skb_copy_datagram_iovec(const struct sk_buff *skb, int offset,
-			    struct iovec *to, int len)
-{
-	int start = skb_headlen(skb);
-	int i, copy = start - offset;
-	struct sk_buff *frag_iter;
-
-	trace_skb_copy_datagram_iovec(skb, len);
-
-	/* Copy header. */
-	if (copy > 0) {
-		if (copy > len)
-			copy = len;
-		if (memcpy_toiovec(to, skb->data + offset, copy))
-			goto fault;
-		if ((len -= copy) == 0)
-			return 0;
-		offset += copy;
-	}
-
-	/* Copy paged appendix. Hmm... why does this look so complicated? */
-	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
-		int end;
-		const skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
-
-		WARN_ON(start > offset + len);
-
-		end = start + skb_frag_size(frag);
-		if ((copy = end - offset) > 0) {
-			int err;
-			u8  *vaddr;
-			struct page *page = skb_frag_page(frag);
-
-			if (copy > len)
-				copy = len;
-			vaddr = kmap(page);
-			err = memcpy_toiovec(to, vaddr + frag->page_offset +
-					     offset - start, copy);
-			kunmap(page);
-			if (err)
-				goto fault;
-			if (!(len -= copy))
-				return 0;
-			offset += copy;
-		}
-		start = end;
-	}
-
-	skb_walk_frags(skb, frag_iter) {
-		int end;
-
-		WARN_ON(start > offset + len);
-
-		end = start + frag_iter->len;
-		if ((copy = end - offset) > 0) {
-			if (copy > len)
-				copy = len;
-			if (skb_copy_datagram_iovec(frag_iter,
-						    offset - start,
-						    to, copy))
-				goto fault;
-			if ((len -= copy) == 0)
-				return 0;
-			offset += copy;
-		}
-		start = end;
-	}
-	if (!len)
-		return 0;
-
-fault:
-	return -EFAULT;
-}
-EXPORT_SYMBOL(skb_copy_datagram_iovec);
-
-/**
  *	skb_copy_datagram_iter - Copy a datagram to an iovec iterator.
  *	@skb: buffer to copy
  *	@offset: offset in the buffer to start copying from
-- 
2.1.3


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 12/12] bury memcpy_toiovec()
  2014-12-05  5:56                                   ` the next chunk of iov_iter-net stuff for review Al Viro
                                                       ` (10 preceding siblings ...)
  2014-12-05  5:58                                     ` [PATCH 11/12] skb_copy_datagram_iovec() can die Al Viro
@ 2014-12-05  5:58                                     ` Al Viro
  2014-12-09 20:07                                     ` the next chunk of iov_iter-net stuff for review David Miller
  12 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-12-05  5:58 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

no users left

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 include/linux/uio.h |  1 -
 lib/iovec.c         | 25 -------------------------
 2 files changed, 26 deletions(-)

diff --git a/include/linux/uio.h b/include/linux/uio.h
index bd8569a..a41e252 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -131,7 +131,6 @@ size_t csum_and_copy_to_iter(void *addr, size_t bytes, __wsum *csum, struct iov_
 size_t csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum, struct iov_iter *i);
 
 int memcpy_fromiovec(unsigned char *kdata, struct iovec *iov, int len);
-int memcpy_toiovec(struct iovec *iov, unsigned char *kdata, int len);
 int memcpy_fromiovecend(unsigned char *kdata, const struct iovec *iov,
 			int offset, int len);
 int memcpy_toiovecend(const struct iovec *v, unsigned char *kdata,
diff --git a/lib/iovec.c b/lib/iovec.c
index df3abd1..2d99cb4 100644
--- a/lib/iovec.c
+++ b/lib/iovec.c
@@ -29,31 +29,6 @@ EXPORT_SYMBOL(memcpy_fromiovec);
 
 /*
  *	Copy kernel to iovec. Returns -EFAULT on error.
- *
- *	Note: this modifies the original iovec.
- */
-
-int memcpy_toiovec(struct iovec *iov, unsigned char *kdata, int len)
-{
-	while (len > 0) {
-		if (iov->iov_len) {
-			int copy = min_t(unsigned int, iov->iov_len, len);
-			if (copy_to_user(iov->iov_base, kdata, copy))
-				return -EFAULT;
-			kdata += copy;
-			len -= copy;
-			iov->iov_len -= copy;
-			iov->iov_base += copy;
-		}
-		iov++;
-	}
-
-	return 0;
-}
-EXPORT_SYMBOL(memcpy_toiovec);
-
-/*
- *	Copy kernel to iovec. Returns -EFAULT on error.
  */
 
 int memcpy_toiovecend(const struct iovec *iov, unsigned char *kdata,
-- 
2.1.3


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* Re: the next chunk of iov_iter-net stuff for review
  2014-12-05  5:56                                   ` the next chunk of iov_iter-net stuff for review Al Viro
                                                       ` (11 preceding siblings ...)
  2014-12-05  5:58                                     ` [PATCH 12/12] bury memcpy_toiovec() Al Viro
@ 2014-12-09 20:07                                     ` David Miller
  2014-12-09 21:04                                       ` Al Viro
  12 siblings, 1 reply; 133+ messages in thread
From: David Miller @ 2014-12-09 20:07 UTC (permalink / raw)
  To: viro; +Cc: netdev, linux-kernel

From: Al Viro <viro@ZenIV.linux.org.uk>
Date: Fri, 5 Dec 2014 05:56:23 +0000

> 	OK, here's the tentative next batch (covers most of the ->recvmsg()
> side of conversion).  That's on top of merge of net-next#master with
> vfs#iov_iter (the latter had been posted earlier today, Cc'd to netdev among
> other places).  This series corresponds to vfs#for-davem.  Review and comments
> would be very welcome...

Al, what's the state of this?  The iov_iter rewrite had some changes
recently.

Also, never assume I just know what GIT url to pull from, always
explicitly state where you want me to pull changes from.

Thanks.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: the next chunk of iov_iter-net stuff for review
  2014-12-09 20:07                                     ` the next chunk of iov_iter-net stuff for review David Miller
@ 2014-12-09 21:04                                       ` Al Viro
  2014-12-09 21:17                                         ` David Miller
  0 siblings, 1 reply; 133+ messages in thread
From: Al Viro @ 2014-12-09 21:04 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

On Tue, Dec 09, 2014 at 03:07:32PM -0500, David Miller wrote:
> From: Al Viro <viro@ZenIV.linux.org.uk>
> Date: Fri, 5 Dec 2014 05:56:23 +0000
> 
> > 	OK, here's the tentative next batch (covers most of the ->recvmsg()
> > side of conversion).  That's on top of merge of net-next#master with
> > vfs#iov_iter (the latter had been posted earlier today, Cc'd to netdev among
> > other places).  This series corresponds to vfs#for-davem.  Review and comments
> > would be very welcome...
> 
> Al, what's the state of this?  The iov_iter rewrite had some changes
> recently.

Well, I've got no comments whatsoever on the networking side of things;
might mean that everything's fine, might mean that everyone had been too
polite to say it's shite, might be something in between...  FWIW, changes
in iov_iter do not affect any of those patches, so the stuff posted
for review is unchanged by those.

> Also, never assume I just know what GIT url to pull from, always
> explicitly state where you want me to pull changes from.

Sure, but that was just a call for review.  _If_ you think that those patches
are OK, the URL is
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-davem
(and it pulls the fixed variant of iov_iter, otherwise it's exactly the same
as when I posted those).

FWIW, this stuff seems to work here, but I'd really like to hear at least
something before asking to pull.  As it is, your mail is the first reply of
any kind...

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: the next chunk of iov_iter-net stuff for review
  2014-12-09 21:04                                       ` Al Viro
@ 2014-12-09 21:17                                         ` David Miller
  2014-12-09 21:23                                           ` Al Viro
  0 siblings, 1 reply; 133+ messages in thread
From: David Miller @ 2014-12-09 21:17 UTC (permalink / raw)
  To: viro; +Cc: netdev, linux-kernel

From: Al Viro <viro@ZenIV.linux.org.uk>
Date: Tue, 9 Dec 2014 21:04:14 +0000

> Well, I've got no comments whatsoever on the networking side of things;

I've been away for more than 10 days, so I personally haven't had time
to review any of it.

Why don't you respin and I'll try to allocate some review time myself?

Thanks.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: the next chunk of iov_iter-net stuff for review
  2014-12-09 21:17                                         ` David Miller
@ 2014-12-09 21:23                                           ` Al Viro
  2014-12-09 21:37                                             ` David Miller
  0 siblings, 1 reply; 133+ messages in thread
From: Al Viro @ 2014-12-09 21:23 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

On Tue, Dec 09, 2014 at 04:17:12PM -0500, David Miller wrote:
> From: Al Viro <viro@ZenIV.linux.org.uk>
> Date: Tue, 9 Dec 2014 21:04:14 +0000
> 
> > Well, I've got no comments whatsoever on the networking side of things;
> 
> I've been away for more than 10 days, so I personally haven't had time
> to review any of it.
> 
> Why don't you respin and I'll try to allocate some review time myself?

Umm...  Do you want a rebase on top of current net-next/master?

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: the next chunk of iov_iter-net stuff for review
  2014-12-09 21:23                                           ` Al Viro
@ 2014-12-09 21:37                                             ` David Miller
  2014-12-09 22:49                                               ` Al Viro
  0 siblings, 1 reply; 133+ messages in thread
From: David Miller @ 2014-12-09 21:37 UTC (permalink / raw)
  To: viro; +Cc: netdev, linux-kernel

From: Al Viro <viro@ZenIV.linux.org.uk>
Date: Tue, 9 Dec 2014 21:23:37 +0000

> On Tue, Dec 09, 2014 at 04:17:12PM -0500, David Miller wrote:
>> From: Al Viro <viro@ZenIV.linux.org.uk>
>> Date: Tue, 9 Dec 2014 21:04:14 +0000
>> 
>> > Well, I've got no comments whatsoever on the networking side of things;
>> 
>> I've been away for more than 10 days, so I personally haven't had time
>> to review any of it.
>> 
>> Why don't you respin and I'll try to allocate some review time myself?
> 
> Umm...  Do you want a rebase on top of current net-next/master?

Sure, why not?

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: the next chunk of iov_iter-net stuff for review
  2014-12-09 21:37                                             ` David Miller
@ 2014-12-09 22:49                                               ` Al Viro
  2014-12-09 22:50                                                 ` [PATCH 01/25] iov_iter.c: macros for iterating over iov_iter Al Viro
                                                                   ` (26 more replies)
  0 siblings, 27 replies; 133+ messages in thread
From: Al Viro @ 2014-12-09 22:49 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

On Tue, Dec 09, 2014 at 04:37:58PM -0500, David Miller wrote:
> From: Al Viro <viro@ZenIV.linux.org.uk>
> Date: Tue, 9 Dec 2014 21:23:37 +0000
> 
> > On Tue, Dec 09, 2014 at 04:17:12PM -0500, David Miller wrote:
> >> From: Al Viro <viro@ZenIV.linux.org.uk>
> >> Date: Tue, 9 Dec 2014 21:04:14 +0000
> >> 
> >> > Well, I've got no comments whatsoever on the networking side of things;
> >> 
> >> I've been away for more than 10 days, so I personally haven't had time
> >> to review any of it.
> >> 
> >> Why don't you respin and I'll try to allocate some review time myself?
> > 
> > Umm...  Do you want a rebase on top of current net-next/master?
> 
> Sure, why not?

OK, rebase is in
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-davem-2,
individual patches - in followups (first 13 - iov_iter, then 12 net-related)

^ permalink raw reply	[flat|nested] 133+ messages in thread

* [PATCH 01/25] iov_iter.c: macros for iterating over iov_iter
  2014-12-09 22:49                                               ` Al Viro
@ 2014-12-09 22:50                                                 ` Al Viro
  2014-12-09 22:50                                                 ` [PATCH 02/25] iov_iter.c: iterate_and_advance Al Viro
                                                                   ` (25 subsequent siblings)
  26 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-12-09 22:50 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

iterate_all_kinds(iter, size, ident, step_iovec, step_bvec)
iterates through the ranges covered by iter (up to size bytes total),
repeating step_iovec or step_bvec for each of those.  ident is
declared in expansion of that thing, either as struct iovec or
struct bvec, and it contains the range we are currently looking
at.  step_bvec should be a void expression, step_iovec - a size_t
one, with non-zero meaning "stop here, that many bytes from this
range left".  In the end, the amount actually handled is stored
in size.

iov_iter_copy_from_user_atomic() and iov_iter_alignment() converted
to it.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 mm/iov_iter.c | 212 ++++++++++++++++++++++++----------------------------------
 1 file changed, 86 insertions(+), 126 deletions(-)

diff --git a/mm/iov_iter.c b/mm/iov_iter.c
index e34a3cb..798fcb4 100644
--- a/mm/iov_iter.c
+++ b/mm/iov_iter.c
@@ -4,6 +4,72 @@
 #include <linux/slab.h>
 #include <linux/vmalloc.h>
 
+#define iterate_iovec(i, n, __v, __p, skip, STEP) {	\
+	size_t left;					\
+	size_t wanted = n;				\
+	__p = i->iov;					\
+	__v.iov_len = min(n, __p->iov_len - skip);	\
+	if (likely(__v.iov_len)) {			\
+		__v.iov_base = __p->iov_base + skip;	\
+		left = (STEP);				\
+		__v.iov_len -= left;			\
+		skip += __v.iov_len;			\
+		n -= __v.iov_len;			\
+	} else {					\
+		left = 0;				\
+	}						\
+	while (unlikely(!left && n)) {			\
+		__p++;					\
+		__v.iov_len = min(n, __p->iov_len);	\
+		if (unlikely(!__v.iov_len))		\
+			continue;			\
+		__v.iov_base = __p->iov_base;		\
+		left = (STEP);				\
+		__v.iov_len -= left;			\
+		skip = __v.iov_len;			\
+		n -= __v.iov_len;			\
+	}						\
+	n = wanted - n;					\
+}
+
+#define iterate_bvec(i, n, __v, __p, skip, STEP) {	\
+	size_t wanted = n;				\
+	__p = i->bvec;					\
+	__v.bv_len = min_t(size_t, n, __p->bv_len - skip);	\
+	if (likely(__v.bv_len)) {			\
+		__v.bv_page = __p->bv_page;		\
+		__v.bv_offset = __p->bv_offset + skip; 	\
+		(void)(STEP);				\
+		skip += __v.bv_len;			\
+		n -= __v.bv_len;			\
+	}						\
+	while (unlikely(n)) {				\
+		__p++;					\
+		__v.bv_len = min_t(size_t, n, __p->bv_len);	\
+		if (unlikely(!__v.bv_len))		\
+			continue;			\
+		__v.bv_page = __p->bv_page;		\
+		__v.bv_offset = __p->bv_offset;		\
+		(void)(STEP);				\
+		skip = __v.bv_len;			\
+		n -= __v.bv_len;			\
+	}						\
+	n = wanted;					\
+}
+
+#define iterate_all_kinds(i, n, v, I, B) {			\
+	size_t skip = i->iov_offset;				\
+	if (unlikely(i->type & ITER_BVEC)) {			\
+		const struct bio_vec *bvec;			\
+		struct bio_vec v;				\
+		iterate_bvec(i, n, v, bvec, skip, (B))		\
+	} else {						\
+		const struct iovec *iov;			\
+		struct iovec v;					\
+		iterate_iovec(i, n, v, iov, skip, (I))		\
+	}							\
+}
+
 static size_t copy_to_iter_iovec(void *from, size_t bytes, struct iov_iter *i)
 {
 	size_t skip, copy, left, wanted;
@@ -300,54 +366,6 @@ static size_t zero_iovec(size_t bytes, struct iov_iter *i)
 	return wanted - bytes;
 }
 
-static size_t __iovec_copy_from_user_inatomic(char *vaddr,
-			const struct iovec *iov, size_t base, size_t bytes)
-{
-	size_t copied = 0, left = 0;
-
-	while (bytes) {
-		char __user *buf = iov->iov_base + base;
-		int copy = min(bytes, iov->iov_len - base);
-
-		base = 0;
-		left = __copy_from_user_inatomic(vaddr, buf, copy);
-		copied += copy;
-		bytes -= copy;
-		vaddr += copy;
-		iov++;
-
-		if (unlikely(left))
-			break;
-	}
-	return copied - left;
-}
-
-/*
- * Copy as much as we can into the page and return the number of bytes which
- * were successfully copied.  If a fault is encountered then return the number of
- * bytes which were copied.
- */
-static size_t copy_from_user_atomic_iovec(struct page *page,
-		struct iov_iter *i, unsigned long offset, size_t bytes)
-{
-	char *kaddr;
-	size_t copied;
-
-	kaddr = kmap_atomic(page);
-	if (likely(i->nr_segs == 1)) {
-		int left;
-		char __user *buf = i->iov->iov_base + i->iov_offset;
-		left = __copy_from_user_inatomic(kaddr + offset, buf, bytes);
-		copied = bytes - left;
-	} else {
-		copied = __iovec_copy_from_user_inatomic(kaddr + offset,
-						i->iov, i->iov_offset, bytes);
-	}
-	kunmap_atomic(kaddr);
-
-	return copied;
-}
-
 static void advance_iovec(struct iov_iter *i, size_t bytes)
 {
 	BUG_ON(i->count < bytes);
@@ -404,30 +422,6 @@ int iov_iter_fault_in_readable(struct iov_iter *i, size_t bytes)
 }
 EXPORT_SYMBOL(iov_iter_fault_in_readable);
 
-static unsigned long alignment_iovec(const struct iov_iter *i)
-{
-	const struct iovec *iov = i->iov;
-	unsigned long res;
-	size_t size = i->count;
-	size_t n;
-
-	if (!size)
-		return 0;
-
-	res = (unsigned long)iov->iov_base + i->iov_offset;
-	n = iov->iov_len - i->iov_offset;
-	if (n >= size)
-		return res | size;
-	size -= n;
-	res |= n;
-	while (size > (++iov)->iov_len) {
-		res |= (unsigned long)iov->iov_base | iov->iov_len;
-		size -= iov->iov_len;
-	}
-	res |= (unsigned long)iov->iov_base | size;
-	return res;
-}
-
 void iov_iter_init(struct iov_iter *i, int direction,
 			const struct iovec *iov, unsigned long nr_segs,
 			size_t count)
@@ -691,28 +685,6 @@ static size_t zero_bvec(size_t bytes, struct iov_iter *i)
 	return wanted - bytes;
 }
 
-static size_t copy_from_user_bvec(struct page *page,
-		struct iov_iter *i, unsigned long offset, size_t bytes)
-{
-	char *kaddr;
-	size_t left;
-	const struct bio_vec *bvec;
-	size_t base = i->iov_offset;
-
-	kaddr = kmap_atomic(page);
-	for (left = bytes, bvec = i->bvec; left; bvec++, base = 0) {
-		size_t copy = min(left, bvec->bv_len - base);
-		if (!bvec->bv_len)
-			continue;
-		memcpy_from_page(kaddr + offset, bvec->bv_page,
-				 bvec->bv_offset + base, copy);
-		offset += copy;
-		left -= copy;
-	}
-	kunmap_atomic(kaddr);
-	return bytes;
-}
-
 static void advance_bvec(struct iov_iter *i, size_t bytes)
 {
 	BUG_ON(i->count < bytes);
@@ -749,30 +721,6 @@ static void advance_bvec(struct iov_iter *i, size_t bytes)
 	}
 }
 
-static unsigned long alignment_bvec(const struct iov_iter *i)
-{
-	const struct bio_vec *bvec = i->bvec;
-	unsigned long res;
-	size_t size = i->count;
-	size_t n;
-
-	if (!size)
-		return 0;
-
-	res = bvec->bv_offset + i->iov_offset;
-	n = bvec->bv_len - i->iov_offset;
-	if (n >= size)
-		return res | size;
-	size -= n;
-	res |= n;
-	while (size > (++bvec)->bv_len) {
-		res |= bvec->bv_offset | bvec->bv_len;
-		size -= bvec->bv_len;
-	}
-	res |= bvec->bv_offset | size;
-	return res;
-}
-
 static ssize_t get_pages_bvec(struct iov_iter *i,
 		   struct page **pages, size_t maxsize, unsigned maxpages,
 		   size_t *start)
@@ -887,10 +835,15 @@ EXPORT_SYMBOL(iov_iter_zero);
 size_t iov_iter_copy_from_user_atomic(struct page *page,
 		struct iov_iter *i, unsigned long offset, size_t bytes)
 {
-	if (i->type & ITER_BVEC)
-		return copy_from_user_bvec(page, i, offset, bytes);
-	else
-		return copy_from_user_atomic_iovec(page, i, offset, bytes);
+	char *kaddr = kmap_atomic(page), *p = kaddr + offset;
+	iterate_all_kinds(i, bytes, v,
+		__copy_from_user_inatomic((p += v.iov_len) - v.iov_len,
+					  v.iov_base, v.iov_len),
+		memcpy_from_page((p += v.bv_len) - v.bv_len, v.bv_page,
+				 v.bv_offset, v.bv_len)
+	)
+	kunmap_atomic(kaddr);
+	return bytes;
 }
 EXPORT_SYMBOL(iov_iter_copy_from_user_atomic);
 
@@ -919,10 +872,17 @@ EXPORT_SYMBOL(iov_iter_single_seg_count);
 
 unsigned long iov_iter_alignment(const struct iov_iter *i)
 {
-	if (i->type & ITER_BVEC)
-		return alignment_bvec(i);
-	else
-		return alignment_iovec(i);
+	unsigned long res = 0;
+	size_t size = i->count;
+
+	if (!size)
+		return 0;
+
+	iterate_all_kinds(i, size, v,
+		(res |= (unsigned long)v.iov_base | v.iov_len, 0),
+		res |= v.bv_offset | v.bv_len
+	)
+	return res;
 }
 EXPORT_SYMBOL(iov_iter_alignment);
 
-- 
2.1.3


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 02/25] iov_iter.c: iterate_and_advance
  2014-12-09 22:49                                               ` Al Viro
  2014-12-09 22:50                                                 ` [PATCH 01/25] iov_iter.c: macros for iterating over iov_iter Al Viro
@ 2014-12-09 22:50                                                 ` Al Viro
  2014-12-09 22:50                                                 ` [PATCH 03/25] iov_iter.c: convert iov_iter_npages() to iterate_all_kinds Al Viro
                                                                   ` (24 subsequent siblings)
  26 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-12-09 22:50 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

same as iterate_all_kinds, but iterator is moved to the position past
the last byte we'd handled.

iov_iter_advance() converted to it

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 mm/iov_iter.c | 104 ++++++++++++++++------------------------------------------
 1 file changed, 28 insertions(+), 76 deletions(-)

diff --git a/mm/iov_iter.c b/mm/iov_iter.c
index 798fcb4..e91bf0a 100644
--- a/mm/iov_iter.c
+++ b/mm/iov_iter.c
@@ -70,6 +70,33 @@
 	}							\
 }
 
+#define iterate_and_advance(i, n, v, I, B) {			\
+	size_t skip = i->iov_offset;				\
+	if (unlikely(i->type & ITER_BVEC)) {			\
+		const struct bio_vec *bvec;			\
+		struct bio_vec v;				\
+		iterate_bvec(i, n, v, bvec, skip, (B))		\
+		if (skip == bvec->bv_len) {			\
+			bvec++;					\
+			skip = 0;				\
+		}						\
+		i->nr_segs -= bvec - i->bvec;			\
+		i->bvec = bvec;					\
+	} else {						\
+		const struct iovec *iov;			\
+		struct iovec v;					\
+		iterate_iovec(i, n, v, iov, skip, (I))		\
+		if (skip == iov->iov_len) {			\
+			iov++;					\
+			skip = 0;				\
+		}						\
+		i->nr_segs -= iov - i->iov;			\
+		i->iov = iov;					\
+	}							\
+	i->count -= n;						\
+	i->iov_offset = skip;					\
+}
+
 static size_t copy_to_iter_iovec(void *from, size_t bytes, struct iov_iter *i)
 {
 	size_t skip, copy, left, wanted;
@@ -366,42 +393,6 @@ static size_t zero_iovec(size_t bytes, struct iov_iter *i)
 	return wanted - bytes;
 }
 
-static void advance_iovec(struct iov_iter *i, size_t bytes)
-{
-	BUG_ON(i->count < bytes);
-
-	if (likely(i->nr_segs == 1)) {
-		i->iov_offset += bytes;
-		i->count -= bytes;
-	} else {
-		const struct iovec *iov = i->iov;
-		size_t base = i->iov_offset;
-		unsigned long nr_segs = i->nr_segs;
-
-		/*
-		 * The !iov->iov_len check ensures we skip over unlikely
-		 * zero-length segments (without overruning the iovec).
-		 */
-		while (bytes || unlikely(i->count && !iov->iov_len)) {
-			int copy;
-
-			copy = min(bytes, iov->iov_len - base);
-			BUG_ON(!i->count || i->count < copy);
-			i->count -= copy;
-			bytes -= copy;
-			base += copy;
-			if (iov->iov_len == base) {
-				iov++;
-				nr_segs--;
-				base = 0;
-			}
-		}
-		i->iov = iov;
-		i->iov_offset = base;
-		i->nr_segs = nr_segs;
-	}
-}
-
 /*
  * Fault in the first iovec of the given iov_iter, to a maximum length
  * of bytes. Returns 0 on success, or non-zero if the memory could not be
@@ -685,42 +676,6 @@ static size_t zero_bvec(size_t bytes, struct iov_iter *i)
 	return wanted - bytes;
 }
 
-static void advance_bvec(struct iov_iter *i, size_t bytes)
-{
-	BUG_ON(i->count < bytes);
-
-	if (likely(i->nr_segs == 1)) {
-		i->iov_offset += bytes;
-		i->count -= bytes;
-	} else {
-		const struct bio_vec *bvec = i->bvec;
-		size_t base = i->iov_offset;
-		unsigned long nr_segs = i->nr_segs;
-
-		/*
-		 * The !iov->iov_len check ensures we skip over unlikely
-		 * zero-length segments (without overruning the iovec).
-		 */
-		while (bytes || unlikely(i->count && !bvec->bv_len)) {
-			int copy;
-
-			copy = min(bytes, bvec->bv_len - base);
-			BUG_ON(!i->count || i->count < copy);
-			i->count -= copy;
-			bytes -= copy;
-			base += copy;
-			if (bvec->bv_len == base) {
-				bvec++;
-				nr_segs--;
-				base = 0;
-			}
-		}
-		i->bvec = bvec;
-		i->iov_offset = base;
-		i->nr_segs = nr_segs;
-	}
-}
-
 static ssize_t get_pages_bvec(struct iov_iter *i,
 		   struct page **pages, size_t maxsize, unsigned maxpages,
 		   size_t *start)
@@ -849,10 +804,7 @@ EXPORT_SYMBOL(iov_iter_copy_from_user_atomic);
 
 void iov_iter_advance(struct iov_iter *i, size_t size)
 {
-	if (i->type & ITER_BVEC)
-		advance_bvec(i, size);
-	else
-		advance_iovec(i, size);
+	iterate_and_advance(i, size, v, 0, 0)
 }
 EXPORT_SYMBOL(iov_iter_advance);
 
-- 
2.1.3


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 03/25] iov_iter.c: convert iov_iter_npages() to iterate_all_kinds
  2014-12-09 22:49                                               ` Al Viro
  2014-12-09 22:50                                                 ` [PATCH 01/25] iov_iter.c: macros for iterating over iov_iter Al Viro
  2014-12-09 22:50                                                 ` [PATCH 02/25] iov_iter.c: iterate_and_advance Al Viro
@ 2014-12-09 22:50                                                 ` Al Viro
  2014-12-09 22:50                                                 ` [PATCH 04/25] iov_iter.c: convert iov_iter_get_pages() " Al Viro
                                                                   ` (23 subsequent siblings)
  26 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-12-09 22:50 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 mm/iov_iter.c | 73 ++++++++++++++++-------------------------------------------
 1 file changed, 19 insertions(+), 54 deletions(-)

diff --git a/mm/iov_iter.c b/mm/iov_iter.c
index e91bf0a..bc666e7 100644
--- a/mm/iov_iter.c
+++ b/mm/iov_iter.c
@@ -493,32 +493,6 @@ static ssize_t get_pages_alloc_iovec(struct iov_iter *i,
 	return (res == n ? len : res * PAGE_SIZE) - *start;
 }
 
-static int iov_iter_npages_iovec(const struct iov_iter *i, int maxpages)
-{
-	size_t offset = i->iov_offset;
-	size_t size = i->count;
-	const struct iovec *iov = i->iov;
-	int npages = 0;
-	int n;
-
-	for (n = 0; size && n < i->nr_segs; n++, iov++) {
-		unsigned long addr = (unsigned long)iov->iov_base + offset;
-		size_t len = iov->iov_len - offset;
-		offset = 0;
-		if (unlikely(!len))	/* empty segment */
-			continue;
-		if (len > size)
-			len = size;
-		npages += (addr + len + PAGE_SIZE - 1) / PAGE_SIZE
-			  - addr / PAGE_SIZE;
-		if (npages >= maxpages)	/* don't bother going further */
-			return maxpages;
-		size -= len;
-		offset = 0;
-	}
-	return min(npages, maxpages);
-}
-
 static void memcpy_from_page(char *to, struct page *page, size_t offset, size_t len)
 {
 	char *from = kmap_atomic(page);
@@ -715,30 +689,6 @@ static ssize_t get_pages_alloc_bvec(struct iov_iter *i,
 	return len;
 }
 
-static int iov_iter_npages_bvec(const struct iov_iter *i, int maxpages)
-{
-	size_t offset = i->iov_offset;
-	size_t size = i->count;
-	const struct bio_vec *bvec = i->bvec;
-	int npages = 0;
-	int n;
-
-	for (n = 0; size && n < i->nr_segs; n++, bvec++) {
-		size_t len = bvec->bv_len - offset;
-		offset = 0;
-		if (unlikely(!len))	/* empty segment */
-			continue;
-		if (len > size)
-			len = size;
-		npages++;
-		if (npages >= maxpages)	/* don't bother going further */
-			return maxpages;
-		size -= len;
-		offset = 0;
-	}
-	return min(npages, maxpages);
-}
-
 size_t copy_page_to_iter(struct page *page, size_t offset, size_t bytes,
 			 struct iov_iter *i)
 {
@@ -862,9 +812,24 @@ EXPORT_SYMBOL(iov_iter_get_pages_alloc);
 
 int iov_iter_npages(const struct iov_iter *i, int maxpages)
 {
-	if (i->type & ITER_BVEC)
-		return iov_iter_npages_bvec(i, maxpages);
-	else
-		return iov_iter_npages_iovec(i, maxpages);
+	size_t size = i->count;
+	int npages = 0;
+
+	if (!size)
+		return 0;
+
+	iterate_all_kinds(i, size, v, ({
+		unsigned long p = (unsigned long)v.iov_base;
+		npages += DIV_ROUND_UP(p + v.iov_len, PAGE_SIZE)
+			- p / PAGE_SIZE;
+		if (npages >= maxpages)
+			return maxpages;
+	0;}),({
+		npages++;
+		if (npages >= maxpages)
+			return maxpages;
+	})
+	)
+	return npages;
 }
 EXPORT_SYMBOL(iov_iter_npages);
-- 
2.1.3


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 04/25] iov_iter.c: convert iov_iter_get_pages() to iterate_all_kinds
  2014-12-09 22:49                                               ` Al Viro
                                                                   ` (2 preceding siblings ...)
  2014-12-09 22:50                                                 ` [PATCH 03/25] iov_iter.c: convert iov_iter_npages() to iterate_all_kinds Al Viro
@ 2014-12-09 22:50                                                 ` Al Viro
  2014-12-09 22:50                                                 ` [PATCH 05/25] iov_iter.c: convert iov_iter_get_pages_alloc() " Al Viro
                                                                   ` (22 subsequent siblings)
  26 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-12-09 22:50 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 mm/iov_iter.c | 78 +++++++++++++++++++++--------------------------------------
 1 file changed, 28 insertions(+), 50 deletions(-)

diff --git a/mm/iov_iter.c b/mm/iov_iter.c
index bc666e7..75e29ef 100644
--- a/mm/iov_iter.c
+++ b/mm/iov_iter.c
@@ -428,34 +428,6 @@ void iov_iter_init(struct iov_iter *i, int direction,
 }
 EXPORT_SYMBOL(iov_iter_init);
 
-static ssize_t get_pages_iovec(struct iov_iter *i,
-		   struct page **pages, size_t maxsize, unsigned maxpages,
-		   size_t *start)
-{
-	size_t offset = i->iov_offset;
-	const struct iovec *iov = i->iov;
-	size_t len;
-	unsigned long addr;
-	int n;
-	int res;
-
-	len = iov->iov_len - offset;
-	if (len > i->count)
-		len = i->count;
-	if (len > maxsize)
-		len = maxsize;
-	addr = (unsigned long)iov->iov_base + offset;
-	len += *start = addr & (PAGE_SIZE - 1);
-	if (len > maxpages * PAGE_SIZE)
-		len = maxpages * PAGE_SIZE;
-	addr &= ~(PAGE_SIZE - 1);
-	n = (len + PAGE_SIZE - 1) / PAGE_SIZE;
-	res = get_user_pages_fast(addr, n, (i->type & WRITE) != WRITE, pages);
-	if (unlikely(res < 0))
-		return res;
-	return (res == n ? len : res * PAGE_SIZE) - *start;
-}
-
 static ssize_t get_pages_alloc_iovec(struct iov_iter *i,
 		   struct page ***pages, size_t maxsize,
 		   size_t *start)
@@ -650,24 +622,6 @@ static size_t zero_bvec(size_t bytes, struct iov_iter *i)
 	return wanted - bytes;
 }
 
-static ssize_t get_pages_bvec(struct iov_iter *i,
-		   struct page **pages, size_t maxsize, unsigned maxpages,
-		   size_t *start)
-{
-	const struct bio_vec *bvec = i->bvec;
-	size_t len = bvec->bv_len - i->iov_offset;
-	if (len > i->count)
-		len = i->count;
-	if (len > maxsize)
-		len = maxsize;
-	/* can't be more than PAGE_SIZE */
-	*start = bvec->bv_offset + i->iov_offset;
-
-	get_page(*pages = bvec->bv_page);
-
-	return len;
-}
-
 static ssize_t get_pages_alloc_bvec(struct iov_iter *i,
 		   struct page ***pages, size_t maxsize,
 		   size_t *start)
@@ -792,10 +746,34 @@ ssize_t iov_iter_get_pages(struct iov_iter *i,
 		   struct page **pages, size_t maxsize, unsigned maxpages,
 		   size_t *start)
 {
-	if (i->type & ITER_BVEC)
-		return get_pages_bvec(i, pages, maxsize, maxpages, start);
-	else
-		return get_pages_iovec(i, pages, maxsize, maxpages, start);
+	if (maxsize > i->count)
+		maxsize = i->count;
+
+	if (!maxsize)
+		return 0;
+
+	iterate_all_kinds(i, maxsize, v, ({
+		unsigned long addr = (unsigned long)v.iov_base;
+		size_t len = v.iov_len + (*start = addr & (PAGE_SIZE - 1));
+		int n;
+		int res;
+
+		if (len > maxpages * PAGE_SIZE)
+			len = maxpages * PAGE_SIZE;
+		addr &= ~(PAGE_SIZE - 1);
+		n = DIV_ROUND_UP(len, PAGE_SIZE);
+		res = get_user_pages_fast(addr, n, (i->type & WRITE) != WRITE, pages);
+		if (unlikely(res < 0))
+			return res;
+		return (res == n ? len : res * PAGE_SIZE) - *start;
+	0;}),({
+		/* can't be more than PAGE_SIZE */
+		*start = v.bv_offset;
+		get_page(*pages = v.bv_page);
+		return v.bv_len;
+	})
+	)
+	return 0;
 }
 EXPORT_SYMBOL(iov_iter_get_pages);
 
-- 
2.1.3


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 05/25] iov_iter.c: convert iov_iter_get_pages_alloc() to iterate_all_kinds
  2014-12-09 22:49                                               ` Al Viro
                                                                   ` (3 preceding siblings ...)
  2014-12-09 22:50                                                 ` [PATCH 04/25] iov_iter.c: convert iov_iter_get_pages() " Al Viro
@ 2014-12-09 22:50                                                 ` Al Viro
  2014-12-09 22:50                                                 ` [PATCH 06/25] iov_iter.c: convert iov_iter_zero() to iterate_and_advance Al Viro
                                                                   ` (21 subsequent siblings)
  26 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-12-09 22:50 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 mm/iov_iter.c | 107 ++++++++++++++++++++++++----------------------------------
 1 file changed, 45 insertions(+), 62 deletions(-)

diff --git a/mm/iov_iter.c b/mm/iov_iter.c
index 75e29ef..3214b9b 100644
--- a/mm/iov_iter.c
+++ b/mm/iov_iter.c
@@ -428,43 +428,6 @@ void iov_iter_init(struct iov_iter *i, int direction,
 }
 EXPORT_SYMBOL(iov_iter_init);
 
-static ssize_t get_pages_alloc_iovec(struct iov_iter *i,
-		   struct page ***pages, size_t maxsize,
-		   size_t *start)
-{
-	size_t offset = i->iov_offset;
-	const struct iovec *iov = i->iov;
-	size_t len;
-	unsigned long addr;
-	void *p;
-	int n;
-	int res;
-
-	len = iov->iov_len - offset;
-	if (len > i->count)
-		len = i->count;
-	if (len > maxsize)
-		len = maxsize;
-	addr = (unsigned long)iov->iov_base + offset;
-	len += *start = addr & (PAGE_SIZE - 1);
-	addr &= ~(PAGE_SIZE - 1);
-	n = (len + PAGE_SIZE - 1) / PAGE_SIZE;
-	
-	p = kmalloc(n * sizeof(struct page *), GFP_KERNEL);
-	if (!p)
-		p = vmalloc(n * sizeof(struct page *));
-	if (!p)
-		return -ENOMEM;
-
-	res = get_user_pages_fast(addr, n, (i->type & WRITE) != WRITE, p);
-	if (unlikely(res < 0)) {
-		kvfree(p);
-		return res;
-	}
-	*pages = p;
-	return (res == n ? len : res * PAGE_SIZE) - *start;
-}
-
 static void memcpy_from_page(char *to, struct page *page, size_t offset, size_t len)
 {
 	char *from = kmap_atomic(page);
@@ -622,27 +585,6 @@ static size_t zero_bvec(size_t bytes, struct iov_iter *i)
 	return wanted - bytes;
 }
 
-static ssize_t get_pages_alloc_bvec(struct iov_iter *i,
-		   struct page ***pages, size_t maxsize,
-		   size_t *start)
-{
-	const struct bio_vec *bvec = i->bvec;
-	size_t len = bvec->bv_len - i->iov_offset;
-	if (len > i->count)
-		len = i->count;
-	if (len > maxsize)
-		len = maxsize;
-	*start = bvec->bv_offset + i->iov_offset;
-
-	*pages = kmalloc(sizeof(struct page *), GFP_KERNEL);
-	if (!*pages)
-		return -ENOMEM;
-
-	get_page(**pages = bvec->bv_page);
-
-	return len;
-}
-
 size_t copy_page_to_iter(struct page *page, size_t offset, size_t bytes,
 			 struct iov_iter *i)
 {
@@ -777,14 +719,55 @@ ssize_t iov_iter_get_pages(struct iov_iter *i,
 }
 EXPORT_SYMBOL(iov_iter_get_pages);
 
+static struct page **get_pages_array(size_t n)
+{
+	struct page **p = kmalloc(n * sizeof(struct page *), GFP_KERNEL);
+	if (!p)
+		p = vmalloc(n * sizeof(struct page *));
+	return p;
+}
+
 ssize_t iov_iter_get_pages_alloc(struct iov_iter *i,
 		   struct page ***pages, size_t maxsize,
 		   size_t *start)
 {
-	if (i->type & ITER_BVEC)
-		return get_pages_alloc_bvec(i, pages, maxsize, start);
-	else
-		return get_pages_alloc_iovec(i, pages, maxsize, start);
+	struct page **p;
+
+	if (maxsize > i->count)
+		maxsize = i->count;
+
+	if (!maxsize)
+		return 0;
+
+	iterate_all_kinds(i, maxsize, v, ({
+		unsigned long addr = (unsigned long)v.iov_base;
+		size_t len = v.iov_len + (*start = addr & (PAGE_SIZE - 1));
+		int n;
+		int res;
+
+		addr &= ~(PAGE_SIZE - 1);
+		n = DIV_ROUND_UP(len, PAGE_SIZE);
+		p = get_pages_array(n);
+		if (!p)
+			return -ENOMEM;
+		res = get_user_pages_fast(addr, n, (i->type & WRITE) != WRITE, p);
+		if (unlikely(res < 0)) {
+			kvfree(p);
+			return res;
+		}
+		*pages = p;
+		return (res == n ? len : res * PAGE_SIZE) - *start;
+	0;}),({
+		/* can't be more than PAGE_SIZE */
+		*start = v.bv_offset;
+		*pages = p = get_pages_array(1);
+		if (!p)
+			return -ENOMEM;
+		get_page(*p = v.bv_page);
+		return v.bv_len;
+	})
+	)
+	return 0;
 }
 EXPORT_SYMBOL(iov_iter_get_pages_alloc);
 
-- 
2.1.3


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 06/25] iov_iter.c: convert iov_iter_zero() to iterate_and_advance
  2014-12-09 22:49                                               ` Al Viro
                                                                   ` (4 preceding siblings ...)
  2014-12-09 22:50                                                 ` [PATCH 05/25] iov_iter.c: convert iov_iter_get_pages_alloc() " Al Viro
@ 2014-12-09 22:50                                                 ` Al Viro
  2014-12-09 22:50                                                 ` [PATCH 07/25] iov_iter.c: get rid of bvec_copy_page_{to,from}_iter() Al Viro
                                                                   ` (20 subsequent siblings)
  26 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-12-09 22:50 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 mm/iov_iter.c | 98 ++++++++---------------------------------------------------
 1 file changed, 12 insertions(+), 86 deletions(-)

diff --git a/mm/iov_iter.c b/mm/iov_iter.c
index 3214b9b..39ad713 100644
--- a/mm/iov_iter.c
+++ b/mm/iov_iter.c
@@ -349,50 +349,6 @@ done:
 	return wanted - bytes;
 }
 
-static size_t zero_iovec(size_t bytes, struct iov_iter *i)
-{
-	size_t skip, copy, left, wanted;
-	const struct iovec *iov;
-	char __user *buf;
-
-	if (unlikely(bytes > i->count))
-		bytes = i->count;
-
-	if (unlikely(!bytes))
-		return 0;
-
-	wanted = bytes;
-	iov = i->iov;
-	skip = i->iov_offset;
-	buf = iov->iov_base + skip;
-	copy = min(bytes, iov->iov_len - skip);
-
-	left = __clear_user(buf, copy);
-	copy -= left;
-	skip += copy;
-	bytes -= copy;
-
-	while (unlikely(!left && bytes)) {
-		iov++;
-		buf = iov->iov_base;
-		copy = min(bytes, iov->iov_len);
-		left = __clear_user(buf, copy);
-		copy -= left;
-		skip = copy;
-		bytes -= copy;
-	}
-
-	if (skip == iov->iov_len) {
-		iov++;
-		skip = 0;
-	}
-	i->count -= wanted - bytes;
-	i->nr_segs -= iov - i->iov;
-	i->iov = iov;
-	i->iov_offset = skip;
-	return wanted - bytes;
-}
-
 /*
  * Fault in the first iovec of the given iov_iter, to a maximum length
  * of bytes. Returns 0 on success, or non-zero if the memory could not be
@@ -548,43 +504,6 @@ static size_t copy_page_from_iter_bvec(struct page *page, size_t offset,
 	return wanted;
 }
 
-static size_t zero_bvec(size_t bytes, struct iov_iter *i)
-{
-	size_t skip, copy, wanted;
-	const struct bio_vec *bvec;
-
-	if (unlikely(bytes > i->count))
-		bytes = i->count;
-
-	if (unlikely(!bytes))
-		return 0;
-
-	wanted = bytes;
-	bvec = i->bvec;
-	skip = i->iov_offset;
-	copy = min_t(size_t, bytes, bvec->bv_len - skip);
-
-	memzero_page(bvec->bv_page, skip + bvec->bv_offset, copy);
-	skip += copy;
-	bytes -= copy;
-	while (bytes) {
-		bvec++;
-		copy = min(bytes, (size_t)bvec->bv_len);
-		memzero_page(bvec->bv_page, bvec->bv_offset, copy);
-		skip = copy;
-		bytes -= copy;
-	}
-	if (skip == bvec->bv_len) {
-		bvec++;
-		skip = 0;
-	}
-	i->count -= wanted - bytes;
-	i->nr_segs -= bvec - i->bvec;
-	i->bvec = bvec;
-	i->iov_offset = skip;
-	return wanted - bytes;
-}
-
 size_t copy_page_to_iter(struct page *page, size_t offset, size_t bytes,
 			 struct iov_iter *i)
 {
@@ -625,11 +544,18 @@ EXPORT_SYMBOL(copy_from_iter);
 
 size_t iov_iter_zero(size_t bytes, struct iov_iter *i)
 {
-	if (i->type & ITER_BVEC) {
-		return zero_bvec(bytes, i);
-	} else {
-		return zero_iovec(bytes, i);
-	}
+	if (unlikely(bytes > i->count))
+		bytes = i->count;
+
+	if (unlikely(!bytes))
+		return 0;
+
+	iterate_and_advance(i, bytes, v,
+		__clear_user(v.iov_base, v.iov_len),
+		memzero_page(v.bv_page, v.bv_offset, v.bv_len)
+	)
+
+	return bytes;
 }
 EXPORT_SYMBOL(iov_iter_zero);
 
-- 
2.1.3


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 07/25] iov_iter.c: get rid of bvec_copy_page_{to,from}_iter()
  2014-12-09 22:49                                               ` Al Viro
                                                                   ` (5 preceding siblings ...)
  2014-12-09 22:50                                                 ` [PATCH 06/25] iov_iter.c: convert iov_iter_zero() to iterate_and_advance Al Viro
@ 2014-12-09 22:50                                                 ` Al Viro
  2014-12-09 22:50                                                 ` [PATCH 08/25] iov_iter.c: convert copy_from_iter() to iterate_and_advance Al Viro
                                                                   ` (19 subsequent siblings)
  26 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-12-09 22:50 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

Just have copy_page_{to,from}_iter() fall back to kmap_atomic +
copy_{to,from}_iter() + kunmap_atomic() in ITER_BVEC case.  As
the matter of fact, that's what we want to do for any iov_iter
kind that isn't blocking - e.g. ITER_KVEC will also go that way
once we recognize it on iov_iter.c primitives level

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 mm/iov_iter.c | 60 ++++++++++++++++++++++++-----------------------------------
 1 file changed, 24 insertions(+), 36 deletions(-)

diff --git a/mm/iov_iter.c b/mm/iov_iter.c
index 39ad713..17b7144 100644
--- a/mm/iov_iter.c
+++ b/mm/iov_iter.c
@@ -486,30 +486,33 @@ static size_t copy_from_iter_bvec(void *to, size_t bytes, struct iov_iter *i)
 	return wanted;
 }
 
-static size_t copy_page_to_iter_bvec(struct page *page, size_t offset,
-					size_t bytes, struct iov_iter *i)
+size_t copy_to_iter(void *addr, size_t bytes, struct iov_iter *i)
 {
-	void *kaddr = kmap_atomic(page);
-	size_t wanted = copy_to_iter_bvec(kaddr + offset, bytes, i);
-	kunmap_atomic(kaddr);
-	return wanted;
+	if (i->type & ITER_BVEC)
+		return copy_to_iter_bvec(addr, bytes, i);
+	else
+		return copy_to_iter_iovec(addr, bytes, i);
 }
+EXPORT_SYMBOL(copy_to_iter);
 
-static size_t copy_page_from_iter_bvec(struct page *page, size_t offset,
-					size_t bytes, struct iov_iter *i)
+size_t copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
 {
-	void *kaddr = kmap_atomic(page);
-	size_t wanted = copy_from_iter_bvec(kaddr + offset, bytes, i);
-	kunmap_atomic(kaddr);
-	return wanted;
+	if (i->type & ITER_BVEC)
+		return copy_from_iter_bvec(addr, bytes, i);
+	else
+		return copy_from_iter_iovec(addr, bytes, i);
 }
+EXPORT_SYMBOL(copy_from_iter);
 
 size_t copy_page_to_iter(struct page *page, size_t offset, size_t bytes,
 			 struct iov_iter *i)
 {
-	if (i->type & ITER_BVEC)
-		return copy_page_to_iter_bvec(page, offset, bytes, i);
-	else
+	if (i->type & (ITER_BVEC|ITER_KVEC)) {
+		void *kaddr = kmap_atomic(page);
+		size_t wanted = copy_to_iter(kaddr + offset, bytes, i);
+		kunmap_atomic(kaddr);
+		return wanted;
+	} else
 		return copy_page_to_iter_iovec(page, offset, bytes, i);
 }
 EXPORT_SYMBOL(copy_page_to_iter);
@@ -517,31 +520,16 @@ EXPORT_SYMBOL(copy_page_to_iter);
 size_t copy_page_from_iter(struct page *page, size_t offset, size_t bytes,
 			 struct iov_iter *i)
 {
-	if (i->type & ITER_BVEC)
-		return copy_page_from_iter_bvec(page, offset, bytes, i);
-	else
+	if (i->type & ITER_BVEC) {
+		void *kaddr = kmap_atomic(page);
+		size_t wanted = copy_from_iter(kaddr + offset, bytes, i);
+		kunmap_atomic(kaddr);
+		return wanted;
+	} else
 		return copy_page_from_iter_iovec(page, offset, bytes, i);
 }
 EXPORT_SYMBOL(copy_page_from_iter);
 
-size_t copy_to_iter(void *addr, size_t bytes, struct iov_iter *i)
-{
-	if (i->type & ITER_BVEC)
-		return copy_to_iter_bvec(addr, bytes, i);
-	else
-		return copy_to_iter_iovec(addr, bytes, i);
-}
-EXPORT_SYMBOL(copy_to_iter);
-
-size_t copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
-{
-	if (i->type & ITER_BVEC)
-		return copy_from_iter_bvec(addr, bytes, i);
-	else
-		return copy_from_iter_iovec(addr, bytes, i);
-}
-EXPORT_SYMBOL(copy_from_iter);
-
 size_t iov_iter_zero(size_t bytes, struct iov_iter *i)
 {
 	if (unlikely(bytes > i->count))
-- 
2.1.3


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 08/25] iov_iter.c: convert copy_from_iter() to iterate_and_advance
  2014-12-09 22:49                                               ` Al Viro
                                                                   ` (6 preceding siblings ...)
  2014-12-09 22:50                                                 ` [PATCH 07/25] iov_iter.c: get rid of bvec_copy_page_{to,from}_iter() Al Viro
@ 2014-12-09 22:50                                                 ` Al Viro
  2014-12-09 22:50                                                 ` [PATCH 09/25] iov_iter.c: convert copy_to_iter() " Al Viro
                                                                   ` (18 subsequent siblings)
  26 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-12-09 22:50 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 mm/iov_iter.c | 106 +++++++++-------------------------------------------------
 1 file changed, 15 insertions(+), 91 deletions(-)

diff --git a/mm/iov_iter.c b/mm/iov_iter.c
index 17b7144..791429d 100644
--- a/mm/iov_iter.c
+++ b/mm/iov_iter.c
@@ -142,51 +142,6 @@ static size_t copy_to_iter_iovec(void *from, size_t bytes, struct iov_iter *i)
 	return wanted - bytes;
 }
 
-static size_t copy_from_iter_iovec(void *to, size_t bytes, struct iov_iter *i)
-{
-	size_t skip, copy, left, wanted;
-	const struct iovec *iov;
-	char __user *buf;
-
-	if (unlikely(bytes > i->count))
-		bytes = i->count;
-
-	if (unlikely(!bytes))
-		return 0;
-
-	wanted = bytes;
-	iov = i->iov;
-	skip = i->iov_offset;
-	buf = iov->iov_base + skip;
-	copy = min(bytes, iov->iov_len - skip);
-
-	left = __copy_from_user(to, buf, copy);
-	copy -= left;
-	skip += copy;
-	to += copy;
-	bytes -= copy;
-	while (unlikely(!left && bytes)) {
-		iov++;
-		buf = iov->iov_base;
-		copy = min(bytes, iov->iov_len);
-		left = __copy_from_user(to, buf, copy);
-		copy -= left;
-		skip = copy;
-		to += copy;
-		bytes -= copy;
-	}
-
-	if (skip == iov->iov_len) {
-		iov++;
-		skip = 0;
-	}
-	i->count -= wanted - bytes;
-	i->nr_segs -= iov - i->iov;
-	i->iov = iov;
-	i->iov_offset = skip;
-	return wanted - bytes;
-}
-
 static size_t copy_page_to_iter_iovec(struct page *page, size_t offset, size_t bytes,
 			 struct iov_iter *i)
 {
@@ -444,48 +399,6 @@ static size_t copy_to_iter_bvec(void *from, size_t bytes, struct iov_iter *i)
 	return wanted - bytes;
 }
 
-static size_t copy_from_iter_bvec(void *to, size_t bytes, struct iov_iter *i)
-{
-	size_t skip, copy, wanted;
-	const struct bio_vec *bvec;
-
-	if (unlikely(bytes > i->count))
-		bytes = i->count;
-
-	if (unlikely(!bytes))
-		return 0;
-
-	wanted = bytes;
-	bvec = i->bvec;
-	skip = i->iov_offset;
-
-	copy = min(bytes, bvec->bv_len - skip);
-
-	memcpy_from_page(to, bvec->bv_page, bvec->bv_offset + skip, copy);
-
-	to += copy;
-	skip += copy;
-	bytes -= copy;
-
-	while (bytes) {
-		bvec++;
-		copy = min(bytes, (size_t)bvec->bv_len);
-		memcpy_from_page(to, bvec->bv_page, bvec->bv_offset, copy);
-		skip = copy;
-		to += copy;
-		bytes -= copy;
-	}
-	if (skip == bvec->bv_len) {
-		bvec++;
-		skip = 0;
-	}
-	i->count -= wanted;
-	i->nr_segs -= bvec - i->bvec;
-	i->bvec = bvec;
-	i->iov_offset = skip;
-	return wanted;
-}
-
 size_t copy_to_iter(void *addr, size_t bytes, struct iov_iter *i)
 {
 	if (i->type & ITER_BVEC)
@@ -497,10 +410,21 @@ EXPORT_SYMBOL(copy_to_iter);
 
 size_t copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
 {
-	if (i->type & ITER_BVEC)
-		return copy_from_iter_bvec(addr, bytes, i);
-	else
-		return copy_from_iter_iovec(addr, bytes, i);
+	char *to = addr;
+	if (unlikely(bytes > i->count))
+		bytes = i->count;
+
+	if (unlikely(!bytes))
+		return 0;
+
+	iterate_and_advance(i, bytes, v,
+		__copy_from_user((to += v.iov_len) - v.iov_len, v.iov_base,
+				 v.iov_len),
+		memcpy_from_page((to += v.bv_len) - v.bv_len, v.bv_page,
+				 v.bv_offset, v.bv_len)
+	)
+
+	return bytes;
 }
 EXPORT_SYMBOL(copy_from_iter);
 
-- 
2.1.3


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 09/25] iov_iter.c: convert copy_to_iter() to iterate_and_advance
  2014-12-09 22:49                                               ` Al Viro
                                                                   ` (7 preceding siblings ...)
  2014-12-09 22:50                                                 ` [PATCH 08/25] iov_iter.c: convert copy_from_iter() to iterate_and_advance Al Viro
@ 2014-12-09 22:50                                                 ` Al Viro
  2014-12-09 22:50                                                 ` [PATCH 10/25] iov_iter.c: handle ITER_KVEC directly Al Viro
                                                                   ` (17 subsequent siblings)
  26 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-12-09 22:50 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 mm/iov_iter.c | 91 ++++++-----------------------------------------------------
 1 file changed, 9 insertions(+), 82 deletions(-)

diff --git a/mm/iov_iter.c b/mm/iov_iter.c
index 791429d..66665449 100644
--- a/mm/iov_iter.c
+++ b/mm/iov_iter.c
@@ -97,51 +97,6 @@
 	i->iov_offset = skip;					\
 }
 
-static size_t copy_to_iter_iovec(void *from, size_t bytes, struct iov_iter *i)
-{
-	size_t skip, copy, left, wanted;
-	const struct iovec *iov;
-	char __user *buf;
-
-	if (unlikely(bytes > i->count))
-		bytes = i->count;
-
-	if (unlikely(!bytes))
-		return 0;
-
-	wanted = bytes;
-	iov = i->iov;
-	skip = i->iov_offset;
-	buf = iov->iov_base + skip;
-	copy = min(bytes, iov->iov_len - skip);
-
-	left = __copy_to_user(buf, from, copy);
-	copy -= left;
-	skip += copy;
-	from += copy;
-	bytes -= copy;
-	while (unlikely(!left && bytes)) {
-		iov++;
-		buf = iov->iov_base;
-		copy = min(bytes, iov->iov_len);
-		left = __copy_to_user(buf, from, copy);
-		copy -= left;
-		skip = copy;
-		from += copy;
-		bytes -= copy;
-	}
-
-	if (skip == iov->iov_len) {
-		iov++;
-		skip = 0;
-	}
-	i->count -= wanted - bytes;
-	i->nr_segs -= iov - i->iov;
-	i->iov = iov;
-	i->iov_offset = skip;
-	return wanted - bytes;
-}
-
 static size_t copy_page_to_iter_iovec(struct page *page, size_t offset, size_t bytes,
 			 struct iov_iter *i)
 {
@@ -360,51 +315,23 @@ static void memzero_page(struct page *page, size_t offset, size_t len)
 	kunmap_atomic(addr);
 }
 
-static size_t copy_to_iter_bvec(void *from, size_t bytes, struct iov_iter *i)
+size_t copy_to_iter(void *addr, size_t bytes, struct iov_iter *i)
 {
-	size_t skip, copy, wanted;
-	const struct bio_vec *bvec;
-
+	char *from = addr;
 	if (unlikely(bytes > i->count))
 		bytes = i->count;
 
 	if (unlikely(!bytes))
 		return 0;
 
-	wanted = bytes;
-	bvec = i->bvec;
-	skip = i->iov_offset;
-	copy = min_t(size_t, bytes, bvec->bv_len - skip);
-
-	memcpy_to_page(bvec->bv_page, skip + bvec->bv_offset, from, copy);
-	skip += copy;
-	from += copy;
-	bytes -= copy;
-	while (bytes) {
-		bvec++;
-		copy = min(bytes, (size_t)bvec->bv_len);
-		memcpy_to_page(bvec->bv_page, bvec->bv_offset, from, copy);
-		skip = copy;
-		from += copy;
-		bytes -= copy;
-	}
-	if (skip == bvec->bv_len) {
-		bvec++;
-		skip = 0;
-	}
-	i->count -= wanted - bytes;
-	i->nr_segs -= bvec - i->bvec;
-	i->bvec = bvec;
-	i->iov_offset = skip;
-	return wanted - bytes;
-}
+	iterate_and_advance(i, bytes, v,
+		__copy_to_user(v.iov_base, (from += v.iov_len) - v.iov_len,
+			       v.iov_len),
+		memcpy_to_page(v.bv_page, v.bv_offset,
+			       (from += v.bv_len) - v.bv_len, v.bv_len)
+	)
 
-size_t copy_to_iter(void *addr, size_t bytes, struct iov_iter *i)
-{
-	if (i->type & ITER_BVEC)
-		return copy_to_iter_bvec(addr, bytes, i);
-	else
-		return copy_to_iter_iovec(addr, bytes, i);
+	return bytes;
 }
 EXPORT_SYMBOL(copy_to_iter);
 
-- 
2.1.3


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 10/25] iov_iter.c: handle ITER_KVEC directly
  2014-12-09 22:49                                               ` Al Viro
                                                                   ` (8 preceding siblings ...)
  2014-12-09 22:50                                                 ` [PATCH 09/25] iov_iter.c: convert copy_to_iter() " Al Viro
@ 2014-12-09 22:50                                                 ` Al Viro
  2014-12-09 22:50                                                 ` [PATCH 11/25] csum_and_copy_..._iter() Al Viro
                                                                   ` (16 subsequent siblings)
  26 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-12-09 22:50 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

... without bothering with copy_..._user()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 include/linux/uio.h |  1 +
 mm/iov_iter.c       | 82 ++++++++++++++++++++++++++++++++++++++++++++---------
 2 files changed, 70 insertions(+), 13 deletions(-)

diff --git a/include/linux/uio.h b/include/linux/uio.h
index 9b15814..6e16945 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -31,6 +31,7 @@ struct iov_iter {
 	size_t count;
 	union {
 		const struct iovec *iov;
+		const struct kvec *kvec;
 		const struct bio_vec *bvec;
 	};
 	unsigned long nr_segs;
diff --git a/mm/iov_iter.c b/mm/iov_iter.c
index 66665449..1618e37 100644
--- a/mm/iov_iter.c
+++ b/mm/iov_iter.c
@@ -32,6 +32,29 @@
 	n = wanted - n;					\
 }
 
+#define iterate_kvec(i, n, __v, __p, skip, STEP) {	\
+	size_t wanted = n;				\
+	__p = i->kvec;					\
+	__v.iov_len = min(n, __p->iov_len - skip);	\
+	if (likely(__v.iov_len)) {			\
+		__v.iov_base = __p->iov_base + skip;	\
+		(void)(STEP);				\
+		skip += __v.iov_len;			\
+		n -= __v.iov_len;			\
+	}						\
+	while (unlikely(n)) {				\
+		__p++;					\
+		__v.iov_len = min(n, __p->iov_len);	\
+		if (unlikely(!__v.iov_len))		\
+			continue;			\
+		__v.iov_base = __p->iov_base;		\
+		(void)(STEP);				\
+		skip = __v.iov_len;			\
+		n -= __v.iov_len;			\
+	}						\
+	n = wanted;					\
+}
+
 #define iterate_bvec(i, n, __v, __p, skip, STEP) {	\
 	size_t wanted = n;				\
 	__p = i->bvec;					\
@@ -57,12 +80,16 @@
 	n = wanted;					\
 }
 
-#define iterate_all_kinds(i, n, v, I, B) {			\
+#define iterate_all_kinds(i, n, v, I, B, K) {			\
 	size_t skip = i->iov_offset;				\
 	if (unlikely(i->type & ITER_BVEC)) {			\
 		const struct bio_vec *bvec;			\
 		struct bio_vec v;				\
 		iterate_bvec(i, n, v, bvec, skip, (B))		\
+	} else if (unlikely(i->type & ITER_KVEC)) {		\
+		const struct kvec *kvec;			\
+		struct kvec v;					\
+		iterate_kvec(i, n, v, kvec, skip, (K))		\
 	} else {						\
 		const struct iovec *iov;			\
 		struct iovec v;					\
@@ -70,7 +97,7 @@
 	}							\
 }
 
-#define iterate_and_advance(i, n, v, I, B) {			\
+#define iterate_and_advance(i, n, v, I, B, K) {			\
 	size_t skip = i->iov_offset;				\
 	if (unlikely(i->type & ITER_BVEC)) {			\
 		const struct bio_vec *bvec;			\
@@ -82,6 +109,16 @@
 		}						\
 		i->nr_segs -= bvec - i->bvec;			\
 		i->bvec = bvec;					\
+	} else if (unlikely(i->type & ITER_KVEC)) {		\
+		const struct kvec *kvec;			\
+		struct kvec v;					\
+		iterate_kvec(i, n, v, kvec, skip, (K))		\
+		if (skip == kvec->iov_len) {			\
+			kvec++;					\
+			skip = 0;				\
+		}						\
+		i->nr_segs -= kvec - i->kvec;			\
+		i->kvec = kvec;					\
 	} else {						\
 		const struct iovec *iov;			\
 		struct iovec v;					\
@@ -270,7 +307,7 @@ done:
  */
 int iov_iter_fault_in_readable(struct iov_iter *i, size_t bytes)
 {
-	if (!(i->type & ITER_BVEC)) {
+	if (!(i->type & (ITER_BVEC|ITER_KVEC))) {
 		char __user *buf = i->iov->iov_base + i->iov_offset;
 		bytes = min(bytes, i->iov->iov_len - i->iov_offset);
 		return fault_in_pages_readable(buf, bytes);
@@ -284,10 +321,14 @@ void iov_iter_init(struct iov_iter *i, int direction,
 			size_t count)
 {
 	/* It will get better.  Eventually... */
-	if (segment_eq(get_fs(), KERNEL_DS))
+	if (segment_eq(get_fs(), KERNEL_DS)) {
 		direction |= ITER_KVEC;
-	i->type = direction;
-	i->iov = iov;
+		i->type = direction;
+		i->kvec = (struct kvec *)iov;
+	} else {
+		i->type = direction;
+		i->iov = iov;
+	}
 	i->nr_segs = nr_segs;
 	i->iov_offset = 0;
 	i->count = count;
@@ -328,7 +369,8 @@ size_t copy_to_iter(void *addr, size_t bytes, struct iov_iter *i)
 		__copy_to_user(v.iov_base, (from += v.iov_len) - v.iov_len,
 			       v.iov_len),
 		memcpy_to_page(v.bv_page, v.bv_offset,
-			       (from += v.bv_len) - v.bv_len, v.bv_len)
+			       (from += v.bv_len) - v.bv_len, v.bv_len),
+		memcpy(v.iov_base, (from += v.iov_len) - v.iov_len, v.iov_len)
 	)
 
 	return bytes;
@@ -348,7 +390,8 @@ size_t copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
 		__copy_from_user((to += v.iov_len) - v.iov_len, v.iov_base,
 				 v.iov_len),
 		memcpy_from_page((to += v.bv_len) - v.bv_len, v.bv_page,
-				 v.bv_offset, v.bv_len)
+				 v.bv_offset, v.bv_len),
+		memcpy((to += v.iov_len) - v.iov_len, v.iov_base, v.iov_len)
 	)
 
 	return bytes;
@@ -371,7 +414,7 @@ EXPORT_SYMBOL(copy_page_to_iter);
 size_t copy_page_from_iter(struct page *page, size_t offset, size_t bytes,
 			 struct iov_iter *i)
 {
-	if (i->type & ITER_BVEC) {
+	if (i->type & (ITER_BVEC|ITER_KVEC)) {
 		void *kaddr = kmap_atomic(page);
 		size_t wanted = copy_from_iter(kaddr + offset, bytes, i);
 		kunmap_atomic(kaddr);
@@ -391,7 +434,8 @@ size_t iov_iter_zero(size_t bytes, struct iov_iter *i)
 
 	iterate_and_advance(i, bytes, v,
 		__clear_user(v.iov_base, v.iov_len),
-		memzero_page(v.bv_page, v.bv_offset, v.bv_len)
+		memzero_page(v.bv_page, v.bv_offset, v.bv_len),
+		memset(v.iov_base, 0, v.iov_len)
 	)
 
 	return bytes;
@@ -406,7 +450,8 @@ size_t iov_iter_copy_from_user_atomic(struct page *page,
 		__copy_from_user_inatomic((p += v.iov_len) - v.iov_len,
 					  v.iov_base, v.iov_len),
 		memcpy_from_page((p += v.bv_len) - v.bv_len, v.bv_page,
-				 v.bv_offset, v.bv_len)
+				 v.bv_offset, v.bv_len),
+		memcpy((p += v.iov_len) - v.iov_len, v.iov_base, v.iov_len)
 	)
 	kunmap_atomic(kaddr);
 	return bytes;
@@ -415,7 +460,7 @@ EXPORT_SYMBOL(iov_iter_copy_from_user_atomic);
 
 void iov_iter_advance(struct iov_iter *i, size_t size)
 {
-	iterate_and_advance(i, size, v, 0, 0)
+	iterate_and_advance(i, size, v, 0, 0, 0)
 }
 EXPORT_SYMBOL(iov_iter_advance);
 
@@ -443,7 +488,8 @@ unsigned long iov_iter_alignment(const struct iov_iter *i)
 
 	iterate_all_kinds(i, size, v,
 		(res |= (unsigned long)v.iov_base | v.iov_len, 0),
-		res |= v.bv_offset | v.bv_len
+		res |= v.bv_offset | v.bv_len,
+		res |= (unsigned long)v.iov_base | v.iov_len
 	)
 	return res;
 }
@@ -478,6 +524,8 @@ ssize_t iov_iter_get_pages(struct iov_iter *i,
 		*start = v.bv_offset;
 		get_page(*pages = v.bv_page);
 		return v.bv_len;
+	}),({
+		return -EFAULT;
 	})
 	)
 	return 0;
@@ -530,6 +578,8 @@ ssize_t iov_iter_get_pages_alloc(struct iov_iter *i,
 			return -ENOMEM;
 		get_page(*p = v.bv_page);
 		return v.bv_len;
+	}),({
+		return -EFAULT;
 	})
 	)
 	return 0;
@@ -554,6 +604,12 @@ int iov_iter_npages(const struct iov_iter *i, int maxpages)
 		npages++;
 		if (npages >= maxpages)
 			return maxpages;
+	}),({
+		unsigned long p = (unsigned long)v.iov_base;
+		npages += DIV_ROUND_UP(p + v.iov_len, PAGE_SIZE)
+			- p / PAGE_SIZE;
+		if (npages >= maxpages)
+			return maxpages;
 	})
 	)
 	return npages;
-- 
2.1.3


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 11/25] csum_and_copy_..._iter()
  2014-12-09 22:49                                               ` Al Viro
                                                                   ` (9 preceding siblings ...)
  2014-12-09 22:50                                                 ` [PATCH 10/25] iov_iter.c: handle ITER_KVEC directly Al Viro
@ 2014-12-09 22:50                                                 ` Al Viro
  2014-12-09 22:50                                                 ` [PATCH 12/25] new helper: iov_iter_kvec() Al Viro
                                                                   ` (15 subsequent siblings)
  26 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-12-09 22:50 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 include/linux/uio.h |  2 ++
 mm/iov_iter.c       | 89 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 91 insertions(+)

diff --git a/include/linux/uio.h b/include/linux/uio.h
index 6e16945..28ed2d9 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -124,6 +124,8 @@ static inline void iov_iter_reexpand(struct iov_iter *i, size_t count)
 {
 	i->count = count;
 }
+size_t csum_and_copy_to_iter(void *addr, size_t bytes, __wsum *csum, struct iov_iter *i);
+size_t csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum, struct iov_iter *i);
 
 int memcpy_fromiovec(unsigned char *kdata, struct iovec *iov, int len);
 int memcpy_toiovec(struct iovec *iov, unsigned char *kdata, int len);
diff --git a/mm/iov_iter.c b/mm/iov_iter.c
index 1618e37..1d2cdeb 100644
--- a/mm/iov_iter.c
+++ b/mm/iov_iter.c
@@ -3,6 +3,7 @@
 #include <linux/pagemap.h>
 #include <linux/slab.h>
 #include <linux/vmalloc.h>
+#include <net/checksum.h>
 
 #define iterate_iovec(i, n, __v, __p, skip, STEP) {	\
 	size_t left;					\
@@ -586,6 +587,94 @@ ssize_t iov_iter_get_pages_alloc(struct iov_iter *i,
 }
 EXPORT_SYMBOL(iov_iter_get_pages_alloc);
 
+size_t csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum,
+			       struct iov_iter *i)
+{
+	char *to = addr;
+	__wsum sum, next;
+	size_t off = 0;
+	if (unlikely(bytes > i->count))
+		bytes = i->count;
+
+	if (unlikely(!bytes))
+		return 0;
+
+	sum = *csum;
+	iterate_and_advance(i, bytes, v, ({
+		int err = 0;
+		next = csum_and_copy_from_user(v.iov_base, 
+					       (to += v.iov_len) - v.iov_len,
+					       v.iov_len, 0, &err);
+		if (!err) {
+			sum = csum_block_add(sum, next, off);
+			off += v.iov_len;
+		}
+		err ? v.iov_len : 0;
+	}), ({
+		char *p = kmap_atomic(v.bv_page);
+		next = csum_partial_copy_nocheck(p + v.bv_offset,
+						 (to += v.bv_len) - v.bv_len,
+						 v.bv_len, 0);
+		kunmap_atomic(p);
+		sum = csum_block_add(sum, next, off);
+		off += v.bv_len;
+	}),({
+		next = csum_partial_copy_nocheck(v.iov_base,
+						 (to += v.iov_len) - v.iov_len,
+						 v.iov_len, 0);
+		sum = csum_block_add(sum, next, off);
+		off += v.iov_len;
+	})
+	)
+	*csum = sum;
+	return bytes;
+}
+EXPORT_SYMBOL(csum_and_copy_from_iter);
+
+size_t csum_and_copy_to_iter(void *addr, size_t bytes, __wsum *csum,
+			     struct iov_iter *i)
+{
+	char *from = addr;
+	__wsum sum, next;
+	size_t off = 0;
+	if (unlikely(bytes > i->count))
+		bytes = i->count;
+
+	if (unlikely(!bytes))
+		return 0;
+
+	sum = *csum;
+	iterate_and_advance(i, bytes, v, ({
+		int err = 0;
+		next = csum_and_copy_to_user((from += v.iov_len) - v.iov_len,
+					     v.iov_base, 
+					     v.iov_len, 0, &err);
+		if (!err) {
+			sum = csum_block_add(sum, next, off);
+			off += v.iov_len;
+		}
+		err ? v.iov_len : 0;
+	}), ({
+		char *p = kmap_atomic(v.bv_page);
+		next = csum_partial_copy_nocheck((from += v.bv_len) - v.bv_len,
+						 p + v.bv_offset,
+						 v.bv_len, 0);
+		kunmap_atomic(p);
+		sum = csum_block_add(sum, next, off);
+		off += v.bv_len;
+	}),({
+		next = csum_partial_copy_nocheck((from += v.iov_len) - v.iov_len,
+						 v.iov_base,
+						 v.iov_len, 0);
+		sum = csum_block_add(sum, next, off);
+		off += v.iov_len;
+	})
+	)
+	*csum = sum;
+	return bytes;
+}
+EXPORT_SYMBOL(csum_and_copy_to_iter);
+
 int iov_iter_npages(const struct iov_iter *i, int maxpages)
 {
 	size_t size = i->count;
-- 
2.1.3


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 12/25] new helper: iov_iter_kvec()
  2014-12-09 22:49                                               ` Al Viro
                                                                   ` (10 preceding siblings ...)
  2014-12-09 22:50                                                 ` [PATCH 11/25] csum_and_copy_..._iter() Al Viro
@ 2014-12-09 22:50                                                 ` Al Viro
  2014-12-09 22:50                                                 ` [PATCH 13/25] copy_from_iter_nocache() Al Viro
                                                                   ` (14 subsequent siblings)
  26 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-12-09 22:50 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

initialization of kvec-backed iov_iter

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 include/linux/uio.h |  2 ++
 mm/iov_iter.c       | 13 +++++++++++++
 2 files changed, 15 insertions(+)

diff --git a/include/linux/uio.h b/include/linux/uio.h
index 28ed2d9..c567655 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -87,6 +87,8 @@ size_t iov_iter_zero(size_t bytes, struct iov_iter *);
 unsigned long iov_iter_alignment(const struct iov_iter *i);
 void iov_iter_init(struct iov_iter *i, int direction, const struct iovec *iov,
 			unsigned long nr_segs, size_t count);
+void iov_iter_kvec(struct iov_iter *i, int direction, const struct kvec *iov,
+			unsigned long nr_segs, size_t count);
 ssize_t iov_iter_get_pages(struct iov_iter *i, struct page **pages,
 			size_t maxsize, unsigned maxpages, size_t *start);
 ssize_t iov_iter_get_pages_alloc(struct iov_iter *i, struct page ***pages,
diff --git a/mm/iov_iter.c b/mm/iov_iter.c
index 1d2cdeb..88c052e 100644
--- a/mm/iov_iter.c
+++ b/mm/iov_iter.c
@@ -479,6 +479,19 @@ size_t iov_iter_single_seg_count(const struct iov_iter *i)
 }
 EXPORT_SYMBOL(iov_iter_single_seg_count);
 
+void iov_iter_kvec(struct iov_iter *i, int direction,
+			const struct kvec *iov, unsigned long nr_segs,
+			size_t count)
+{
+	BUG_ON(!(direction & ITER_KVEC));
+	i->type = direction;
+	i->kvec = (struct kvec *)iov;
+	i->nr_segs = nr_segs;
+	i->iov_offset = 0;
+	i->count = count;
+}
+EXPORT_SYMBOL(iov_iter_kvec);
+
 unsigned long iov_iter_alignment(const struct iov_iter *i)
 {
 	unsigned long res = 0;
-- 
2.1.3


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 13/25] copy_from_iter_nocache()
  2014-12-09 22:49                                               ` Al Viro
                                                                   ` (11 preceding siblings ...)
  2014-12-09 22:50                                                 ` [PATCH 12/25] new helper: iov_iter_kvec() Al Viro
@ 2014-12-09 22:50                                                 ` Al Viro
  2014-12-09 22:50                                                 ` [PATCH 14/25] raw.c: stick msghdr into raw_frag_vec Al Viro
                                                                   ` (13 subsequent siblings)
  26 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-12-09 22:50 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

BTW, do we want memcpy_nocache()?

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 include/linux/uio.h |  1 +
 mm/iov_iter.c       | 21 +++++++++++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/include/linux/uio.h b/include/linux/uio.h
index c567655..bd8569a 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -83,6 +83,7 @@ size_t copy_page_from_iter(struct page *page, size_t offset, size_t bytes,
 			 struct iov_iter *i);
 size_t copy_to_iter(void *addr, size_t bytes, struct iov_iter *i);
 size_t copy_from_iter(void *addr, size_t bytes, struct iov_iter *i);
+size_t copy_from_iter_nocache(void *addr, size_t bytes, struct iov_iter *i);
 size_t iov_iter_zero(size_t bytes, struct iov_iter *);
 unsigned long iov_iter_alignment(const struct iov_iter *i);
 void iov_iter_init(struct iov_iter *i, int direction, const struct iovec *iov,
diff --git a/mm/iov_iter.c b/mm/iov_iter.c
index 88c052e..a1599ca 100644
--- a/mm/iov_iter.c
+++ b/mm/iov_iter.c
@@ -399,6 +399,27 @@ size_t copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
 }
 EXPORT_SYMBOL(copy_from_iter);
 
+size_t copy_from_iter_nocache(void *addr, size_t bytes, struct iov_iter *i)
+{
+	char *to = addr;
+	if (unlikely(bytes > i->count))
+		bytes = i->count;
+
+	if (unlikely(!bytes))
+		return 0;
+
+	iterate_and_advance(i, bytes, v,
+		__copy_from_user_nocache((to += v.iov_len) - v.iov_len,
+					 v.iov_base, v.iov_len),
+		memcpy_from_page((to += v.bv_len) - v.bv_len, v.bv_page,
+				 v.bv_offset, v.bv_len),
+		memcpy((to += v.iov_len) - v.iov_len, v.iov_base, v.iov_len)
+	)
+
+	return bytes;
+}
+EXPORT_SYMBOL(copy_from_iter_nocache);
+
 size_t copy_page_to_iter(struct page *page, size_t offset, size_t bytes,
 			 struct iov_iter *i)
 {
-- 
2.1.3


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 14/25] raw.c: stick msghdr into raw_frag_vec
  2014-12-09 22:49                                               ` Al Viro
                                                                   ` (12 preceding siblings ...)
  2014-12-09 22:50                                                 ` [PATCH 13/25] copy_from_iter_nocache() Al Viro
@ 2014-12-09 22:50                                                 ` Al Viro
  2014-12-09 22:50                                                 ` [PATCH 15/25] ipv6 equivalent of "ipv4: Avoid reading user iov twice after raw_probe_proto_opt" Al Viro
                                                                   ` (12 subsequent siblings)
  26 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-12-09 22:50 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

we'll want access to ->msg_iter

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 net/ipv4/raw.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index 43385a9..5c901eb 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -82,7 +82,7 @@
 #include <linux/uio.h>
 
 struct raw_frag_vec {
-	struct iovec *iov;
+	struct msghdr *msg;
 	union {
 		struct icmphdr icmph;
 		char c[1];
@@ -440,7 +440,7 @@ static int raw_probe_proto_opt(struct raw_frag_vec *rfv, struct flowi4 *fl4)
 	/* We only need the first two bytes. */
 	rfv->hlen = 2;
 
-	err = memcpy_fromiovec(rfv->hdr.c, rfv->iov, rfv->hlen);
+	err = memcpy_from_msg(rfv->hdr.c, rfv->msg, rfv->hlen);
 	if (err)
 		return err;
 
@@ -478,7 +478,7 @@ static int raw_getfrag(void *from, char *to, int offset, int len, int odd,
 
 	offset -= rfv->hlen;
 
-	return ip_generic_getfrag(rfv->iov, to, offset, len, odd, skb);
+	return ip_generic_getfrag(rfv->msg->msg_iov, to, offset, len, odd, skb);
 }
 
 static int raw_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
@@ -600,7 +600,7 @@ static int raw_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 			   daddr, saddr, 0, 0);
 
 	if (!inet->hdrincl) {
-		rfv.iov = msg->msg_iov;
+		rfv.msg = msg;
 		rfv.hlen = 0;
 
 		err = raw_probe_proto_opt(&rfv, &fl4);
-- 
2.1.3


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 15/25] ipv6 equivalent of "ipv4: Avoid reading user iov twice after raw_probe_proto_opt"
  2014-12-09 22:49                                               ` Al Viro
                                                                   ` (13 preceding siblings ...)
  2014-12-09 22:50                                                 ` [PATCH 14/25] raw.c: stick msghdr into raw_frag_vec Al Viro
@ 2014-12-09 22:50                                                 ` Al Viro
  2014-12-09 22:50                                                 ` [PATCH 16/25] ip_generic_getfrag, udplite_getfrag: switch to passing msghdr Al Viro
                                                                   ` (11 subsequent siblings)
  26 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-12-09 22:50 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 net/ipv6/raw.c | 112 ++++++++++++++++++++++++++++-----------------------------
 1 file changed, 56 insertions(+), 56 deletions(-)

diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 8baa53e..942f67b 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -672,65 +672,62 @@ error:
 	return err;
 }
 
-static int rawv6_probe_proto_opt(struct flowi6 *fl6, struct msghdr *msg)
+struct raw6_frag_vec {
+	struct msghdr *msg;
+	int hlen;
+	char c[4];
+};
+
+static int rawv6_probe_proto_opt(struct raw6_frag_vec *rfv, struct flowi6 *fl6)
 {
-	struct iovec *iov;
-	u8 __user *type = NULL;
-	u8 __user *code = NULL;
-	u8 len = 0;
-	int probed = 0;
-	int i;
-
-	if (!msg->msg_iov)
-		return 0;
+	int err = 0;
+	switch (fl6->flowi6_proto) {
+	case IPPROTO_ICMPV6:
+		rfv->hlen = 2;
+		err = memcpy_from_msg(rfv->c, rfv->msg, rfv->hlen);
+		if (!err) {
+			fl6->fl6_icmp_type = rfv->c[0];
+			fl6->fl6_icmp_code = rfv->c[1];
+		}
+		break;
+	case IPPROTO_MH:
+		rfv->hlen = 4;
+		err = memcpy_from_msg(rfv->c, rfv->msg, rfv->hlen);
+		if (!err)
+			fl6->fl6_mh_type = rfv->c[2];
+	}
+	return err;
+}
 
-	for (i = 0; i < msg->msg_iovlen; i++) {
-		iov = &msg->msg_iov[i];
-		if (!iov)
-			continue;
+static int raw6_getfrag(void *from, char *to, int offset, int len, int odd,
+		       struct sk_buff *skb)
+{
+	struct raw6_frag_vec *rfv = from;
 
-		switch (fl6->flowi6_proto) {
-		case IPPROTO_ICMPV6:
-			/* check if one-byte field is readable or not. */
-			if (iov->iov_base && iov->iov_len < 1)
-				break;
-
-			if (!type) {
-				type = iov->iov_base;
-				/* check if code field is readable or not. */
-				if (iov->iov_len > 1)
-					code = type + 1;
-			} else if (!code)
-				code = iov->iov_base;
-
-			if (type && code) {
-				if (get_user(fl6->fl6_icmp_type, type) ||
-				    get_user(fl6->fl6_icmp_code, code))
-					return -EFAULT;
-				probed = 1;
-			}
-			break;
-		case IPPROTO_MH:
-			if (iov->iov_base && iov->iov_len < 1)
-				break;
-			/* check if type field is readable or not. */
-			if (iov->iov_len > 2 - len) {
-				u8 __user *p = iov->iov_base;
-				if (get_user(fl6->fl6_mh_type, &p[2 - len]))
-					return -EFAULT;
-				probed = 1;
-			} else
-				len += iov->iov_len;
+	if (offset < rfv->hlen) {
+		int copy = min(rfv->hlen - offset, len);
 
-			break;
-		default:
-			probed = 1;
-			break;
-		}
-		if (probed)
-			break;
+		if (skb->ip_summed == CHECKSUM_PARTIAL)
+			memcpy(to, rfv->c + offset, copy);
+		else
+			skb->csum = csum_block_add(
+				skb->csum,
+				csum_partial_copy_nocheck(rfv->c + offset,
+							  to, copy, 0),
+				odd);
+
+		odd = 0;
+		offset += copy;
+		to += copy;
+		len -= copy;
+
+		if (!len)
+			return 0;
 	}
-	return 0;
+
+	offset -= rfv->hlen;
+
+	return ip_generic_getfrag(rfv->msg->msg_iov, to, offset, len, odd, skb);
 }
 
 static int rawv6_sendmsg(struct kiocb *iocb, struct sock *sk,
@@ -745,6 +742,7 @@ static int rawv6_sendmsg(struct kiocb *iocb, struct sock *sk,
 	struct ipv6_txoptions *opt = NULL;
 	struct ip6_flowlabel *flowlabel = NULL;
 	struct dst_entry *dst = NULL;
+	struct raw6_frag_vec rfv;
 	struct flowi6 fl6;
 	int addr_len = msg->msg_namelen;
 	int hlimit = -1;
@@ -848,7 +846,9 @@ static int rawv6_sendmsg(struct kiocb *iocb, struct sock *sk,
 	opt = ipv6_fixup_options(&opt_space, opt);
 
 	fl6.flowi6_proto = proto;
-	err = rawv6_probe_proto_opt(&fl6, msg);
+	rfv.msg = msg;
+	rfv.hlen = 0;
+	err = rawv6_probe_proto_opt(&rfv, &fl6);
 	if (err)
 		goto out;
 
@@ -889,7 +889,7 @@ back_from_confirm:
 		err = rawv6_send_hdrinc(sk, msg->msg_iov, len, &fl6, &dst, msg->msg_flags);
 	else {
 		lock_sock(sk);
-		err = ip6_append_data(sk, ip_generic_getfrag, msg->msg_iov,
+		err = ip6_append_data(sk, raw6_getfrag, &rfv,
 			len, 0, hlimit, tclass, opt, &fl6, (struct rt6_info *)dst,
 			msg->msg_flags, dontfrag);
 
-- 
2.1.3


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 16/25] ip_generic_getfrag, udplite_getfrag: switch to passing msghdr
  2014-12-09 22:49                                               ` Al Viro
                                                                   ` (14 preceding siblings ...)
  2014-12-09 22:50                                                 ` [PATCH 15/25] ipv6 equivalent of "ipv4: Avoid reading user iov twice after raw_probe_proto_opt" Al Viro
@ 2014-12-09 22:50                                                 ` Al Viro
  2014-12-09 22:50                                                 ` [PATCH 17/25] switch tcp_sock->ucopy from iovec (ucopy.iov) to msghdr (ucopy.msg) Al Viro
                                                                   ` (10 subsequent siblings)
  26 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-12-09 22:50 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 include/net/udplite.h | 3 ++-
 net/ipv4/ip_output.c  | 6 +++---
 net/ipv4/raw.c        | 2 +-
 net/ipv4/udp.c        | 4 ++--
 net/ipv6/raw.c        | 2 +-
 net/ipv6/udp.c        | 2 +-
 net/l2tp/l2tp_ip6.c   | 2 +-
 7 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/include/net/udplite.h b/include/net/udplite.h
index 9a28a51..d5baaba 100644
--- a/include/net/udplite.h
+++ b/include/net/udplite.h
@@ -19,7 +19,8 @@ extern struct udp_table		udplite_table;
 static __inline__ int udplite_getfrag(void *from, char *to, int  offset,
 				      int len, int odd, struct sk_buff *skb)
 {
-	return memcpy_fromiovecend(to, (struct iovec *) from, offset, len);
+	struct msghdr *msg = from;
+	return memcpy_fromiovecend(to, msg->msg_iov, offset, len);
 }
 
 /* Designate sk as UDP-Lite socket */
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 4a929ad..cdedcf1 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -752,14 +752,14 @@ EXPORT_SYMBOL(ip_fragment);
 int
 ip_generic_getfrag(void *from, char *to, int offset, int len, int odd, struct sk_buff *skb)
 {
-	struct iovec *iov = from;
+	struct msghdr *msg = from;
 
 	if (skb->ip_summed == CHECKSUM_PARTIAL) {
-		if (memcpy_fromiovecend(to, iov, offset, len) < 0)
+		if (memcpy_fromiovecend(to, msg->msg_iov, offset, len) < 0)
 			return -EFAULT;
 	} else {
 		__wsum csum = 0;
-		if (csum_partial_copy_fromiovecend(to, iov, offset, len, &csum) < 0)
+		if (csum_partial_copy_fromiovecend(to, msg->msg_iov, offset, len, &csum) < 0)
 			return -EFAULT;
 		skb->csum = csum_block_add(skb->csum, csum, odd);
 	}
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index 5c901eb..5d83bd2 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -478,7 +478,7 @@ static int raw_getfrag(void *from, char *to, int offset, int len, int odd,
 
 	offset -= rfv->hlen;
 
-	return ip_generic_getfrag(rfv->msg->msg_iov, to, offset, len, odd, skb);
+	return ip_generic_getfrag(rfv->msg, to, offset, len, odd, skb);
 }
 
 static int raw_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index dd8e006..13b4dcf8 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1049,7 +1049,7 @@ back_from_confirm:
 
 	/* Lockless fast path for the non-corking case. */
 	if (!corkreq) {
-		skb = ip_make_skb(sk, fl4, getfrag, msg->msg_iov, ulen,
+		skb = ip_make_skb(sk, fl4, getfrag, msg, ulen,
 				  sizeof(struct udphdr), &ipc, &rt,
 				  msg->msg_flags);
 		err = PTR_ERR(skb);
@@ -1080,7 +1080,7 @@ back_from_confirm:
 
 do_append_data:
 	up->len += ulen;
-	err = ip_append_data(sk, fl4, getfrag, msg->msg_iov, ulen,
+	err = ip_append_data(sk, fl4, getfrag, msg, ulen,
 			     sizeof(struct udphdr), &ipc, &rt,
 			     corkreq ? msg->msg_flags|MSG_MORE : msg->msg_flags);
 	if (err)
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 942f67b..11a9283 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -727,7 +727,7 @@ static int raw6_getfrag(void *from, char *to, int offset, int len, int odd,
 
 	offset -= rfv->hlen;
 
-	return ip_generic_getfrag(rfv->msg->msg_iov, to, offset, len, odd, skb);
+	return ip_generic_getfrag(rfv->msg, to, offset, len, odd, skb);
 }
 
 static int rawv6_sendmsg(struct kiocb *iocb, struct sock *sk,
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 7f964322..189dc4a 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -1312,7 +1312,7 @@ do_append_data:
 		dontfrag = np->dontfrag;
 	up->len += ulen;
 	getfrag  =  is_udplite ?  udplite_getfrag : ip_generic_getfrag;
-	err = ip6_append_data(sk, getfrag, msg->msg_iov, ulen,
+	err = ip6_append_data(sk, getfrag, msg, ulen,
 		sizeof(struct udphdr), hlimit, tclass, opt, &fl6,
 		(struct rt6_info *)dst,
 		corkreq ? msg->msg_flags|MSG_MORE : msg->msg_flags, dontfrag);
diff --git a/net/l2tp/l2tp_ip6.c b/net/l2tp/l2tp_ip6.c
index 2177b96..8611f1b 100644
--- a/net/l2tp/l2tp_ip6.c
+++ b/net/l2tp/l2tp_ip6.c
@@ -619,7 +619,7 @@ static int l2tp_ip6_sendmsg(struct kiocb *iocb, struct sock *sk,
 
 back_from_confirm:
 	lock_sock(sk);
-	err = ip6_append_data(sk, ip_generic_getfrag, msg->msg_iov,
+	err = ip6_append_data(sk, ip_generic_getfrag, msg,
 			      ulen, transhdrlen, hlimit, tclass, opt,
 			      &fl6, (struct rt6_info *)dst,
 			      msg->msg_flags, dontfrag);
-- 
2.1.3


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 17/25] switch tcp_sock->ucopy from iovec (ucopy.iov) to msghdr (ucopy.msg)
  2014-12-09 22:49                                               ` Al Viro
                                                                   ` (15 preceding siblings ...)
  2014-12-09 22:50                                                 ` [PATCH 16/25] ip_generic_getfrag, udplite_getfrag: switch to passing msghdr Al Viro
@ 2014-12-09 22:50                                                 ` Al Viro
  2014-12-09 22:50                                                 ` [PATCH 18/25] switch l2cap ->memcpy_fromiovec() to msghdr Al Viro
                                                                   ` (9 subsequent siblings)
  26 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-12-09 22:50 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 include/linux/tcp.h  | 2 +-
 net/ipv4/tcp.c       | 2 +-
 net/ipv4/tcp_input.c | 7 +++----
 3 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index f566b85..5d9cc9c 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -162,7 +162,7 @@ struct tcp_sock {
 	struct {
 		struct sk_buff_head	prequeue;
 		struct task_struct	*task;
-		struct iovec		*iov;
+		struct msghdr		*msg;
 		int			memory;
 		int			len;
 	} ucopy;
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index dc13a36..4a96f37 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1729,7 +1729,7 @@ int tcp_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 			if (!user_recv && !(flags & (MSG_TRUNC | MSG_PEEK))) {
 				user_recv = current;
 				tp->ucopy.task = user_recv;
-				tp->ucopy.iov = msg->msg_iov;
+				tp->ucopy.msg = msg;
 			}
 
 			tp->ucopy.len = len;
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 69de1a1..075ab4d 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4421,7 +4421,7 @@ static void tcp_data_queue(struct sock *sk, struct sk_buff *skb)
 			__set_current_state(TASK_RUNNING);
 
 			local_bh_enable();
-			if (!skb_copy_datagram_iovec(skb, 0, tp->ucopy.iov, chunk)) {
+			if (!skb_copy_datagram_msg(skb, 0, tp->ucopy.msg, chunk)) {
 				tp->ucopy.len -= chunk;
 				tp->copied_seq += chunk;
 				eaten = (chunk == skb->len);
@@ -4941,10 +4941,9 @@ static int tcp_copy_to_iovec(struct sock *sk, struct sk_buff *skb, int hlen)
 
 	local_bh_enable();
 	if (skb_csum_unnecessary(skb))
-		err = skb_copy_datagram_iovec(skb, hlen, tp->ucopy.iov, chunk);
+		err = skb_copy_datagram_msg(skb, hlen, tp->ucopy.msg, chunk);
 	else
-		err = skb_copy_and_csum_datagram_iovec(skb, hlen,
-						       tp->ucopy.iov);
+		err = skb_copy_and_csum_datagram_msg(skb, hlen, tp->ucopy.msg);
 
 	if (!err) {
 		tp->ucopy.len -= chunk;
-- 
2.1.3


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 18/25] switch l2cap ->memcpy_fromiovec() to msghdr
  2014-12-09 22:49                                               ` Al Viro
                                                                   ` (16 preceding siblings ...)
  2014-12-09 22:50                                                 ` [PATCH 17/25] switch tcp_sock->ucopy from iovec (ucopy.iov) to msghdr (ucopy.msg) Al Viro
@ 2014-12-09 22:50                                                 ` Al Viro
  2014-12-09 22:50                                                 ` [PATCH 19/25] vmci: propagate msghdr all way down to __qp_memcpy_from_queue() Al Viro
                                                                   ` (8 subsequent siblings)
  26 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-12-09 22:50 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

it'll die soon enough - now that kvec-backed iov_iter works regardless
of set_fs(), both instances will become copy_from_iter() as soon as
we introduce ->msg_iter...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 include/net/bluetooth/l2cap.h | 6 +++---
 net/bluetooth/l2cap_core.c    | 4 ++--
 net/bluetooth/l2cap_sock.c    | 4 ++--
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/net/bluetooth/l2cap.h b/include/net/bluetooth/l2cap.h
index 061e648..4e23674 100644
--- a/include/net/bluetooth/l2cap.h
+++ b/include/net/bluetooth/l2cap.h
@@ -608,7 +608,7 @@ struct l2cap_ops {
 					       unsigned long len, int nb);
 	int			(*memcpy_fromiovec) (struct l2cap_chan *chan,
 						     unsigned char *kdata,
-						     struct iovec *iov,
+						     struct msghdr *msg,
 						     int len);
 };
 
@@ -905,13 +905,13 @@ static inline long l2cap_chan_no_get_sndtimeo(struct l2cap_chan *chan)
 
 static inline int l2cap_chan_no_memcpy_fromiovec(struct l2cap_chan *chan,
 						 unsigned char *kdata,
-						 struct iovec *iov,
+						 struct msghdr *msg,
 						 int len)
 {
 	/* Following is safe since for compiler definitions of kvec and
 	 * iovec are identical, yielding the same in-core layout and alignment
 	 */
-	struct kvec *vec = (struct kvec *)iov;
+	struct kvec *vec = (struct kvec *)msg->msg_iov;
 
 	while (len > 0) {
 		if (vec->iov_len) {
diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c
index 8e12731..5201d61 100644
--- a/net/bluetooth/l2cap_core.c
+++ b/net/bluetooth/l2cap_core.c
@@ -2097,7 +2097,7 @@ static inline int l2cap_skbuff_fromiovec(struct l2cap_chan *chan,
 	int sent = 0;
 
 	if (chan->ops->memcpy_fromiovec(chan, skb_put(skb, count),
-					msg->msg_iov, count))
+					msg, count))
 		return -EFAULT;
 
 	sent += count;
@@ -2118,7 +2118,7 @@ static inline int l2cap_skbuff_fromiovec(struct l2cap_chan *chan,
 		*frag = tmp;
 
 		if (chan->ops->memcpy_fromiovec(chan, skb_put(*frag, count),
-						msg->msg_iov, count))
+						msg, count))
 			return -EFAULT;
 
 		sent += count;
diff --git a/net/bluetooth/l2cap_sock.c b/net/bluetooth/l2cap_sock.c
index b0efb72..205b298 100644
--- a/net/bluetooth/l2cap_sock.c
+++ b/net/bluetooth/l2cap_sock.c
@@ -1338,9 +1338,9 @@ static struct sk_buff *l2cap_sock_alloc_skb_cb(struct l2cap_chan *chan,
 
 static int l2cap_sock_memcpy_fromiovec_cb(struct l2cap_chan *chan,
 					  unsigned char *kdata,
-					  struct iovec *iov, int len)
+					  struct msghdr *msg, int len)
 {
-	return memcpy_fromiovec(kdata, iov, len);
+	return memcpy_from_msg(kdata, msg, len);
 }
 
 static void l2cap_sock_ready_cb(struct l2cap_chan *chan)
-- 
2.1.3


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 19/25] vmci: propagate msghdr all way down to __qp_memcpy_from_queue()
  2014-12-09 22:49                                               ` Al Viro
                                                                   ` (17 preceding siblings ...)
  2014-12-09 22:50                                                 ` [PATCH 18/25] switch l2cap ->memcpy_fromiovec() to msghdr Al Viro
@ 2014-12-09 22:50                                                 ` Al Viro
  2014-12-09 22:50                                                 ` [PATCH 20/25] put iov_iter into msghdr Al Viro
                                                                   ` (7 subsequent siblings)
  26 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-12-09 22:50 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

... and switch it to memcpy_to_msg()

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 drivers/misc/vmw_vmci/vmci_queue_pair.c | 17 +++++++++--------
 include/linux/vmw_vmci_api.h            |  5 +++--
 net/vmw_vsock/vmci_transport.c          |  4 ++--
 3 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/drivers/misc/vmw_vmci/vmci_queue_pair.c b/drivers/misc/vmw_vmci/vmci_queue_pair.c
index 1b7b303..7aaaf51 100644
--- a/drivers/misc/vmw_vmci/vmci_queue_pair.c
+++ b/drivers/misc/vmw_vmci/vmci_queue_pair.c
@@ -27,6 +27,7 @@
 #include <linux/uio.h>
 #include <linux/wait.h>
 #include <linux/vmalloc.h>
+#include <linux/skbuff.h>
 
 #include "vmci_handle_array.h"
 #include "vmci_queue_pair.h"
@@ -429,11 +430,11 @@ static int __qp_memcpy_from_queue(void *dest,
 			to_copy = size - bytes_copied;
 
 		if (is_iovec) {
-			struct iovec *iov = (struct iovec *)dest;
+			struct msghdr *msg = dest;
 			int err;
 
 			/* The iovec will track bytes_copied internally. */
-			err = memcpy_toiovec(iov, (u8 *)va + page_offset,
+			err = memcpy_to_msg(msg, (u8 *)va + page_offset,
 					     to_copy);
 			if (err != 0) {
 				if (kernel_if->host)
@@ -3264,13 +3265,13 @@ EXPORT_SYMBOL_GPL(vmci_qpair_enquev);
  * of bytes dequeued or < 0 on error.
  */
 ssize_t vmci_qpair_dequev(struct vmci_qp *qpair,
-			  void *iov,
+			  struct msghdr *msg,
 			  size_t iov_size,
 			  int buf_type)
 {
 	ssize_t result;
 
-	if (!qpair || !iov)
+	if (!qpair)
 		return VMCI_ERROR_INVALID_ARGS;
 
 	qp_lock(qpair);
@@ -3279,7 +3280,7 @@ ssize_t vmci_qpair_dequev(struct vmci_qp *qpair,
 		result = qp_dequeue_locked(qpair->produce_q,
 					   qpair->consume_q,
 					   qpair->consume_q_size,
-					   iov, iov_size,
+					   msg, iov_size,
 					   qp_memcpy_from_queue_iov,
 					   true);
 
@@ -3308,13 +3309,13 @@ EXPORT_SYMBOL_GPL(vmci_qpair_dequev);
  * of bytes peeked or < 0 on error.
  */
 ssize_t vmci_qpair_peekv(struct vmci_qp *qpair,
-			 void *iov,
+			 struct msghdr *msg,
 			 size_t iov_size,
 			 int buf_type)
 {
 	ssize_t result;
 
-	if (!qpair || !iov)
+	if (!qpair)
 		return VMCI_ERROR_INVALID_ARGS;
 
 	qp_lock(qpair);
@@ -3323,7 +3324,7 @@ ssize_t vmci_qpair_peekv(struct vmci_qp *qpair,
 		result = qp_dequeue_locked(qpair->produce_q,
 					   qpair->consume_q,
 					   qpair->consume_q_size,
-					   iov, iov_size,
+					   msg, iov_size,
 					   qp_memcpy_from_queue_iov,
 					   false);
 
diff --git a/include/linux/vmw_vmci_api.h b/include/linux/vmw_vmci_api.h
index 023430e..5691f75 100644
--- a/include/linux/vmw_vmci_api.h
+++ b/include/linux/vmw_vmci_api.h
@@ -24,6 +24,7 @@
 #define VMCI_KERNEL_API_VERSION_2 2
 #define VMCI_KERNEL_API_VERSION   VMCI_KERNEL_API_VERSION_2
 
+struct msghdr;
 typedef void (vmci_device_shutdown_fn) (void *device_registration,
 					void *user_data);
 
@@ -75,8 +76,8 @@ ssize_t vmci_qpair_peek(struct vmci_qp *qpair, void *buf, size_t buf_size,
 ssize_t vmci_qpair_enquev(struct vmci_qp *qpair,
 			  void *iov, size_t iov_size, int mode);
 ssize_t vmci_qpair_dequev(struct vmci_qp *qpair,
-			  void *iov, size_t iov_size, int mode);
-ssize_t vmci_qpair_peekv(struct vmci_qp *qpair, void *iov, size_t iov_size,
+			  struct msghdr *msg, size_t iov_size, int mode);
+ssize_t vmci_qpair_peekv(struct vmci_qp *qpair, struct msghdr *msg, size_t iov_size,
 			 int mode);
 
 #endif /* !__VMW_VMCI_API_H__ */
diff --git a/net/vmw_vsock/vmci_transport.c b/net/vmw_vsock/vmci_transport.c
index c1c0389..20a0ba3 100644
--- a/net/vmw_vsock/vmci_transport.c
+++ b/net/vmw_vsock/vmci_transport.c
@@ -1840,9 +1840,9 @@ static ssize_t vmci_transport_stream_dequeue(
 	int flags)
 {
 	if (flags & MSG_PEEK)
-		return vmci_qpair_peekv(vmci_trans(vsk)->qpair, msg->msg_iov, len, 0);
+		return vmci_qpair_peekv(vmci_trans(vsk)->qpair, msg, len, 0);
 	else
-		return vmci_qpair_dequev(vmci_trans(vsk)->qpair, msg->msg_iov, len, 0);
+		return vmci_qpair_dequev(vmci_trans(vsk)->qpair, msg, len, 0);
 }
 
 static ssize_t vmci_transport_stream_enqueue(
-- 
2.1.3


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 20/25] put iov_iter into msghdr
  2014-12-09 22:49                                               ` Al Viro
                                                                   ` (18 preceding siblings ...)
  2014-12-09 22:50                                                 ` [PATCH 19/25] vmci: propagate msghdr all way down to __qp_memcpy_from_queue() Al Viro
@ 2014-12-09 22:50                                                 ` Al Viro
  2014-12-09 22:50                                                 ` [PATCH 21/25] first fruits - kill l2cap ->memcpy_fromiovec() Al Viro
                                                                   ` (6 subsequent siblings)
  26 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-12-09 22:50 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

Note that the code _using_ ->msg_iter at that point will be very
unhappy with anything other than unshifted iovec-backed iov_iter.
We still need to convert users to proper primitives.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 crypto/algif_hash.c            |  4 ++--
 crypto/algif_skcipher.c        |  4 ++--
 drivers/net/macvtap.c          |  8 ++------
 drivers/net/tun.c              |  8 ++------
 drivers/vhost/net.c            |  8 +++-----
 fs/afs/rxrpc.c                 | 14 ++++++--------
 include/linux/skbuff.h         | 16 ++++++++++------
 include/linux/socket.h         |  3 +--
 include/net/bluetooth/l2cap.h  |  2 +-
 include/net/udplite.h          |  3 ++-
 net/atm/common.c               |  5 +----
 net/bluetooth/6lowpan.c        |  6 +++---
 net/bluetooth/a2mp.c           |  3 +--
 net/bluetooth/smp.c            |  3 +--
 net/caif/caif_socket.c         |  2 +-
 net/compat.c                   | 10 ++++++----
 net/ipv4/ip_output.c           |  6 ++++--
 net/ipv4/ping.c                |  3 ++-
 net/ipv4/raw.c                 |  3 ++-
 net/ipv4/tcp.c                 |  6 +++---
 net/ipv4/tcp_output.c          |  2 +-
 net/ipv6/ping.c                |  3 ++-
 net/ipv6/raw.c                 |  3 ++-
 net/netlink/af_netlink.c       |  2 +-
 net/packet/af_packet.c         |  7 ++-----
 net/rds/recv.c                 |  7 ++++---
 net/rds/send.c                 |  4 +---
 net/rxrpc/ar-output.c          |  8 +++-----
 net/sctp/socket.c              |  5 +----
 net/socket.c                   | 27 ++++++++++++---------------
 net/tipc/msg.c                 |  4 ++--
 net/unix/af_unix.c             | 10 ++--------
 net/vmw_vsock/vmci_transport.c |  3 ++-
 33 files changed, 90 insertions(+), 112 deletions(-)

diff --git a/crypto/algif_hash.c b/crypto/algif_hash.c
index 35c93ff..83cd2cc 100644
--- a/crypto/algif_hash.c
+++ b/crypto/algif_hash.c
@@ -42,7 +42,7 @@ static int hash_sendmsg(struct kiocb *unused, struct socket *sock,
 	struct alg_sock *ask = alg_sk(sk);
 	struct hash_ctx *ctx = ask->private;
 	unsigned long iovlen;
-	struct iovec *iov;
+	const struct iovec *iov;
 	long copied = 0;
 	int err;
 
@@ -58,7 +58,7 @@ static int hash_sendmsg(struct kiocb *unused, struct socket *sock,
 
 	ctx->more = 0;
 
-	for (iov = msg->msg_iov, iovlen = msg->msg_iovlen; iovlen > 0;
+	for (iov = msg->msg_iter.iov, iovlen = msg->msg_iter.nr_segs; iovlen > 0;
 	     iovlen--, iov++) {
 		unsigned long seglen = iov->iov_len;
 		char __user *from = iov->iov_base;
diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c
index c3b482b..4f45dab 100644
--- a/crypto/algif_skcipher.c
+++ b/crypto/algif_skcipher.c
@@ -429,13 +429,13 @@ static int skcipher_recvmsg(struct kiocb *unused, struct socket *sock,
 	struct skcipher_sg_list *sgl;
 	struct scatterlist *sg;
 	unsigned long iovlen;
-	struct iovec *iov;
+	const struct iovec *iov;
 	int err = -EAGAIN;
 	int used;
 	long copied = 0;
 
 	lock_sock(sk);
-	for (iov = msg->msg_iov, iovlen = msg->msg_iovlen; iovlen > 0;
+	for (iov = msg->msg_iter.iov, iovlen = msg->msg_iter.nr_segs; iovlen > 0;
 	     iovlen--, iov++) {
 		unsigned long seglen = iov->iov_len;
 		char __user *from = iov->iov_base;
diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
index ba1e5db..2c157cc 100644
--- a/drivers/net/macvtap.c
+++ b/drivers/net/macvtap.c
@@ -1095,9 +1095,7 @@ static int macvtap_sendmsg(struct kiocb *iocb, struct socket *sock,
 			   struct msghdr *m, size_t total_len)
 {
 	struct macvtap_queue *q = container_of(sock, struct macvtap_queue, sock);
-	struct iov_iter from;
-	iov_iter_init(&from, WRITE, m->msg_iov, m->msg_iovlen, total_len);
-	return macvtap_get_user(q, m, &from, m->msg_flags & MSG_DONTWAIT);
+	return macvtap_get_user(q, m, &m->msg_iter, m->msg_flags & MSG_DONTWAIT);
 }
 
 static int macvtap_recvmsg(struct kiocb *iocb, struct socket *sock,
@@ -1105,12 +1103,10 @@ static int macvtap_recvmsg(struct kiocb *iocb, struct socket *sock,
 			   int flags)
 {
 	struct macvtap_queue *q = container_of(sock, struct macvtap_queue, sock);
-	struct iov_iter to;
 	int ret;
 	if (flags & ~(MSG_DONTWAIT|MSG_TRUNC))
 		return -EINVAL;
-	iov_iter_init(&to, READ, m->msg_iov, m->msg_iovlen, total_len);
-	ret = macvtap_do_read(q, &to, flags & MSG_DONTWAIT);
+	ret = macvtap_do_read(q, &m->msg_iter, flags & MSG_DONTWAIT);
 	if (ret > total_len) {
 		m->msg_flags |= MSG_TRUNC;
 		ret = flags & MSG_TRUNC ? ret : total_len;
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 9c58286..f3e992e 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1449,13 +1449,11 @@ static int tun_sendmsg(struct kiocb *iocb, struct socket *sock,
 	int ret;
 	struct tun_file *tfile = container_of(sock, struct tun_file, socket);
 	struct tun_struct *tun = __tun_get(tfile);
-	struct iov_iter from;
 
 	if (!tun)
 		return -EBADFD;
 
-	iov_iter_init(&from, WRITE, m->msg_iov, m->msg_iovlen, total_len);
-	ret = tun_get_user(tun, tfile, m->msg_control, &from,
+	ret = tun_get_user(tun, tfile, m->msg_control, &m->msg_iter,
 			   m->msg_flags & MSG_DONTWAIT);
 	tun_put(tun);
 	return ret;
@@ -1467,7 +1465,6 @@ static int tun_recvmsg(struct kiocb *iocb, struct socket *sock,
 {
 	struct tun_file *tfile = container_of(sock, struct tun_file, socket);
 	struct tun_struct *tun = __tun_get(tfile);
-	struct iov_iter to;
 	int ret;
 
 	if (!tun)
@@ -1482,8 +1479,7 @@ static int tun_recvmsg(struct kiocb *iocb, struct socket *sock,
 					 SOL_PACKET, TUN_TX_TIMESTAMP);
 		goto out;
 	}
-	iov_iter_init(&to, READ, m->msg_iov, m->msg_iovlen, total_len);
-	ret = tun_do_read(tun, tfile, &to, flags & MSG_DONTWAIT);
+	ret = tun_do_read(tun, tfile, &m->msg_iter, flags & MSG_DONTWAIT);
 	if (ret > total_len) {
 		m->msg_flags |= MSG_TRUNC;
 		ret = flags & MSG_TRUNC ? ret : total_len;
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 8dae2f7..9f06e70 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -342,7 +342,6 @@ static void handle_tx(struct vhost_net *net)
 		.msg_namelen = 0,
 		.msg_control = NULL,
 		.msg_controllen = 0,
-		.msg_iov = vq->iov,
 		.msg_flags = MSG_DONTWAIT,
 	};
 	size_t len, total_len = 0;
@@ -396,8 +395,8 @@ static void handle_tx(struct vhost_net *net)
 		}
 		/* Skip header. TODO: support TSO. */
 		s = move_iovec_hdr(vq->iov, nvq->hdr, hdr_size, out);
-		msg.msg_iovlen = out;
 		len = iov_length(vq->iov, out);
+		iov_iter_init(&msg.msg_iter, WRITE, vq->iov, out, len);
 		/* Sanity check */
 		if (!len) {
 			vq_err(vq, "Unexpected header len for TX: "
@@ -562,7 +561,6 @@ static void handle_rx(struct vhost_net *net)
 		.msg_namelen = 0,
 		.msg_control = NULL, /* FIXME: get and handle RX aux data. */
 		.msg_controllen = 0,
-		.msg_iov = vq->iov,
 		.msg_flags = MSG_DONTWAIT,
 	};
 	struct virtio_net_hdr_mrg_rxbuf hdr = {
@@ -600,7 +598,7 @@ static void handle_rx(struct vhost_net *net)
 			break;
 		/* On overrun, truncate and discard */
 		if (unlikely(headcount > UIO_MAXIOV)) {
-			msg.msg_iovlen = 1;
+			iov_iter_init(&msg.msg_iter, READ, vq->iov, 1, 1);
 			err = sock->ops->recvmsg(NULL, sock, &msg,
 						 1, MSG_DONTWAIT | MSG_TRUNC);
 			pr_debug("Discarded rx packet: len %zd\n", sock_len);
@@ -626,7 +624,7 @@ static void handle_rx(struct vhost_net *net)
 			/* Copy the header for use in VIRTIO_NET_F_MRG_RXBUF:
 			 * needed because recvmsg can modify msg_iov. */
 			copy_iovec_hdr(vq->iov, nvq->hdr, sock_hlen, in);
-		msg.msg_iovlen = in;
+		iov_iter_init(&msg.msg_iter, READ, vq->iov, in, sock_len);
 		err = sock->ops->recvmsg(NULL, sock, &msg,
 					 sock_len, MSG_DONTWAIT | MSG_TRUNC);
 		/* Userspace might have consumed the packet meanwhile:
diff --git a/fs/afs/rxrpc.c b/fs/afs/rxrpc.c
index 03a3beb..06e14bf 100644
--- a/fs/afs/rxrpc.c
+++ b/fs/afs/rxrpc.c
@@ -306,8 +306,8 @@ static int afs_send_pages(struct afs_call *call, struct msghdr *msg,
 
 			_debug("- range %u-%u%s",
 			       offset, to, msg->msg_flags ? " [more]" : "");
-			msg->msg_iov = (struct iovec *) iov;
-			msg->msg_iovlen = 1;
+			iov_iter_init(&msg->msg_iter, WRITE,
+				      (struct iovec *) iov, 1, to - offset);
 
 			/* have to change the state *before* sending the last
 			 * packet as RxRPC might give us the reply before it
@@ -384,8 +384,8 @@ int afs_make_call(struct in_addr *addr, struct afs_call *call, gfp_t gfp,
 
 	msg.msg_name		= NULL;
 	msg.msg_namelen		= 0;
-	msg.msg_iov		= (struct iovec *) iov;
-	msg.msg_iovlen		= 1;
+	iov_iter_init(&msg.msg_iter, WRITE, (struct iovec *)iov, 1,
+		      call->request_size);
 	msg.msg_control		= NULL;
 	msg.msg_controllen	= 0;
 	msg.msg_flags		= (call->send_pages ? MSG_MORE : 0);
@@ -778,8 +778,7 @@ void afs_send_empty_reply(struct afs_call *call)
 	iov[0].iov_len		= 0;
 	msg.msg_name		= NULL;
 	msg.msg_namelen		= 0;
-	msg.msg_iov		= iov;
-	msg.msg_iovlen		= 0;
+	iov_iter_init(&msg.msg_iter, WRITE, iov, 0, 0);	/* WTF? */
 	msg.msg_control		= NULL;
 	msg.msg_controllen	= 0;
 	msg.msg_flags		= 0;
@@ -815,8 +814,7 @@ void afs_send_simple_reply(struct afs_call *call, const void *buf, size_t len)
 	iov[0].iov_len		= len;
 	msg.msg_name		= NULL;
 	msg.msg_namelen		= 0;
-	msg.msg_iov		= iov;
-	msg.msg_iovlen		= 1;
+	iov_iter_init(&msg.msg_iter, WRITE, iov, 1, len);
 	msg.msg_control		= NULL;
 	msg.msg_controllen	= 0;
 	msg.msg_flags		= 0;
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index ef64cec..52cf1bd 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2646,22 +2646,24 @@ unsigned int datagram_poll(struct file *file, struct socket *sock,
 			   struct poll_table_struct *wait);
 int skb_copy_datagram_iovec(const struct sk_buff *from, int offset,
 			    struct iovec *to, int size);
+int skb_copy_datagram_iter(const struct sk_buff *from, int offset,
+			   struct iov_iter *to, int size);
 static inline int skb_copy_datagram_msg(const struct sk_buff *from, int offset,
 					struct msghdr *msg, int size)
 {
-	return skb_copy_datagram_iovec(from, offset, msg->msg_iov, size);
+	/* XXX: stripping const */
+	return skb_copy_datagram_iovec(from, offset, (struct iovec *)msg->msg_iter.iov, size);
 }
 int skb_copy_and_csum_datagram_iovec(struct sk_buff *skb, int hlen,
 				     struct iovec *iov);
 static inline int skb_copy_and_csum_datagram_msg(struct sk_buff *skb, int hlen,
 			    struct msghdr *msg)
 {
-	return skb_copy_and_csum_datagram_iovec(skb, hlen, msg->msg_iov);
+	/* XXX: stripping const */
+	return skb_copy_and_csum_datagram_iovec(skb, hlen, (struct iovec *)msg->msg_iter.iov);
 }
 int skb_copy_datagram_from_iter(struct sk_buff *skb, int offset,
 				 struct iov_iter *from, int len);
-int skb_copy_datagram_iter(const struct sk_buff *from, int offset,
-			   struct iov_iter *to, int size);
 int zerocopy_sg_from_iter(struct sk_buff *skb, struct iov_iter *frm);
 void skb_free_datagram(struct sock *sk, struct sk_buff *skb);
 void skb_free_datagram_locked(struct sock *sk, struct sk_buff *skb);
@@ -2689,12 +2691,14 @@ int skb_vlan_push(struct sk_buff *skb, __be16 vlan_proto, u16 vlan_tci);
 
 static inline int memcpy_from_msg(void *data, struct msghdr *msg, int len)
 {
-	return memcpy_fromiovec(data, msg->msg_iov, len);
+	/* XXX: stripping const */
+	return memcpy_fromiovec(data, (struct iovec *)msg->msg_iter.iov, len);
 }
 
 static inline int memcpy_to_msg(struct msghdr *msg, void *data, int len)
 {
-	return memcpy_toiovec(msg->msg_iov, data, len);
+	/* XXX: stripping const */
+	return memcpy_toiovec((struct iovec *)msg->msg_iter.iov, data, len);
 }
 
 struct skb_checksum_ops {
diff --git a/include/linux/socket.h b/include/linux/socket.h
index de52228..048d6d6 100644
--- a/include/linux/socket.h
+++ b/include/linux/socket.h
@@ -47,8 +47,7 @@ struct linger {
 struct msghdr {
 	void		*msg_name;	/* ptr to socket address structure */
 	int		msg_namelen;	/* size of socket address structure */
-	struct iovec	*msg_iov;	/* scatter/gather array */
-	__kernel_size_t	msg_iovlen;	/* # elements in msg_iov */
+	struct iov_iter	msg_iter;	/* data */
 	void		*msg_control;	/* ancillary data */
 	__kernel_size_t	msg_controllen;	/* ancillary data buffer length */
 	unsigned int	msg_flags;	/* flags on received message */
diff --git a/include/net/bluetooth/l2cap.h b/include/net/bluetooth/l2cap.h
index 4e23674..bca6fc0 100644
--- a/include/net/bluetooth/l2cap.h
+++ b/include/net/bluetooth/l2cap.h
@@ -911,7 +911,7 @@ static inline int l2cap_chan_no_memcpy_fromiovec(struct l2cap_chan *chan,
 	/* Following is safe since for compiler definitions of kvec and
 	 * iovec are identical, yielding the same in-core layout and alignment
 	 */
-	struct kvec *vec = (struct kvec *)msg->msg_iov;
+	struct kvec *vec = (struct kvec *)msg->msg_iter.iov;
 
 	while (len > 0) {
 		if (vec->iov_len) {
diff --git a/include/net/udplite.h b/include/net/udplite.h
index d5baaba..ae7c8d1 100644
--- a/include/net/udplite.h
+++ b/include/net/udplite.h
@@ -20,7 +20,8 @@ static __inline__ int udplite_getfrag(void *from, char *to, int  offset,
 				      int len, int odd, struct sk_buff *skb)
 {
 	struct msghdr *msg = from;
-	return memcpy_fromiovecend(to, msg->msg_iov, offset, len);
+	/* XXX: stripping const */
+	return memcpy_fromiovecend(to, (struct iovec *)msg->msg_iter.iov, offset, len);
 }
 
 /* Designate sk as UDP-Lite socket */
diff --git a/net/atm/common.c b/net/atm/common.c
index f591129..b84057e 100644
--- a/net/atm/common.c
+++ b/net/atm/common.c
@@ -577,9 +577,6 @@ int vcc_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *m,
 	struct atm_vcc *vcc;
 	struct sk_buff *skb;
 	int eff, error;
-	struct iov_iter from;
-
-	iov_iter_init(&from, WRITE, m->msg_iov, m->msg_iovlen, size);
 
 	lock_sock(sk);
 	if (sock->state != SS_CONNECTED) {
@@ -634,7 +631,7 @@ int vcc_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *m,
 		goto out;
 	skb->dev = NULL; /* for paths shared with net_device interfaces */
 	ATM_SKB(skb)->atm_options = vcc->atm_options;
-	if (copy_from_iter(skb_put(skb, size), size, &from) != size) {
+	if (copy_from_iter(skb_put(skb, size), size, &m->msg_iter) != size) {
 		kfree_skb(skb);
 		error = -EFAULT;
 		goto out;
diff --git a/net/bluetooth/6lowpan.c b/net/bluetooth/6lowpan.c
index bdcaefd..d8c67a5 100644
--- a/net/bluetooth/6lowpan.c
+++ b/net/bluetooth/6lowpan.c
@@ -537,12 +537,12 @@ static int send_pkt(struct l2cap_chan *chan, struct sk_buff *skb,
 	 */
 	chan->data = skb;
 
-	memset(&msg, 0, sizeof(msg));
-	msg.msg_iov = (struct iovec *) &iv;
-	msg.msg_iovlen = 1;
 	iv.iov_base = skb->data;
 	iv.iov_len = skb->len;
 
+	memset(&msg, 0, sizeof(msg));
+	iov_iter_init(&msg.msg_iter, WRITE, (struct iovec *) &iv, 1, skb->len);
+
 	err = l2cap_chan_send(chan, &msg, skb->len);
 	if (err > 0) {
 		netdev->stats.tx_bytes += err;
diff --git a/net/bluetooth/a2mp.c b/net/bluetooth/a2mp.c
index 5dcade5..716d2a3 100644
--- a/net/bluetooth/a2mp.c
+++ b/net/bluetooth/a2mp.c
@@ -60,8 +60,7 @@ void a2mp_send(struct amp_mgr *mgr, u8 code, u8 ident, u16 len, void *data)
 
 	memset(&msg, 0, sizeof(msg));
 
-	msg.msg_iov = (struct iovec *) &iv;
-	msg.msg_iovlen = 1;
+	iov_iter_init(&msg.msg_iter, WRITE, (struct iovec *)&iv, 1, total_len);
 
 	l2cap_chan_send(chan, &msg, total_len);
 
diff --git a/net/bluetooth/smp.c b/net/bluetooth/smp.c
index 069b76e..21f555b 100644
--- a/net/bluetooth/smp.c
+++ b/net/bluetooth/smp.c
@@ -268,8 +268,7 @@ static void smp_send_cmd(struct l2cap_conn *conn, u8 code, u16 len, void *data)
 
 	memset(&msg, 0, sizeof(msg));
 
-	msg.msg_iov = (struct iovec *) &iv;
-	msg.msg_iovlen = 2;
+	iov_iter_init(&msg.msg_iter, WRITE, (struct iovec *)iv, 2, 1 + len);
 
 	l2cap_chan_send(chan, &msg, 1 + len);
 
diff --git a/net/caif/caif_socket.c b/net/caif/caif_socket.c
index ac618b0..769b185 100644
--- a/net/caif/caif_socket.c
+++ b/net/caif/caif_socket.c
@@ -535,7 +535,7 @@ static int caif_seqpkt_sendmsg(struct kiocb *kiocb, struct socket *sock,
 		goto err;
 
 	ret = -EINVAL;
-	if (unlikely(msg->msg_iov->iov_base == NULL))
+	if (unlikely(msg->msg_iter.iov->iov_base == NULL))
 		goto err;
 	noblock = msg->msg_flags & MSG_DONTWAIT;
 
diff --git a/net/compat.c b/net/compat.c
index 062f157..3236b41 100644
--- a/net/compat.c
+++ b/net/compat.c
@@ -37,13 +37,14 @@ ssize_t get_compat_msghdr(struct msghdr *kmsg,
 			  struct iovec **iov)
 {
 	compat_uptr_t uaddr, uiov, tmp3;
+	compat_size_t nr_segs;
 	ssize_t err;
 
 	if (!access_ok(VERIFY_READ, umsg, sizeof(*umsg)) ||
 	    __get_user(uaddr, &umsg->msg_name) ||
 	    __get_user(kmsg->msg_namelen, &umsg->msg_namelen) ||
 	    __get_user(uiov, &umsg->msg_iov) ||
-	    __get_user(kmsg->msg_iovlen, &umsg->msg_iovlen) ||
+	    __get_user(nr_segs, &umsg->msg_iovlen) ||
 	    __get_user(tmp3, &umsg->msg_control) ||
 	    __get_user(kmsg->msg_controllen, &umsg->msg_controllen) ||
 	    __get_user(kmsg->msg_flags, &umsg->msg_flags))
@@ -68,14 +69,15 @@ ssize_t get_compat_msghdr(struct msghdr *kmsg,
 		kmsg->msg_namelen = 0;
 	}
 
-	if (kmsg->msg_iovlen > UIO_MAXIOV)
+	if (nr_segs > UIO_MAXIOV)
 		return -EMSGSIZE;
 
 	err = compat_rw_copy_check_uvector(save_addr ? READ : WRITE,
-					   compat_ptr(uiov), kmsg->msg_iovlen,
+					   compat_ptr(uiov), nr_segs,
 					   UIO_FASTIOV, *iov, iov);
 	if (err >= 0)
-		kmsg->msg_iov = *iov;
+		iov_iter_init(&kmsg->msg_iter, save_addr ? READ : WRITE,
+			      *iov, nr_segs, err);
 	return err;
 }
 
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index cdedcf1..b50861b2 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -755,11 +755,13 @@ ip_generic_getfrag(void *from, char *to, int offset, int len, int odd, struct sk
 	struct msghdr *msg = from;
 
 	if (skb->ip_summed == CHECKSUM_PARTIAL) {
-		if (memcpy_fromiovecend(to, msg->msg_iov, offset, len) < 0)
+		/* XXX: stripping const */
+		if (memcpy_fromiovecend(to, (struct iovec *)msg->msg_iter.iov, offset, len) < 0)
 			return -EFAULT;
 	} else {
 		__wsum csum = 0;
-		if (csum_partial_copy_fromiovecend(to, msg->msg_iov, offset, len, &csum) < 0)
+		/* XXX: stripping const */
+		if (csum_partial_copy_fromiovecend(to, (struct iovec *)msg->msg_iter.iov, offset, len, &csum) < 0)
 			return -EFAULT;
 		skb->csum = csum_block_add(skb->csum, csum, odd);
 	}
diff --git a/net/ipv4/ping.c b/net/ipv4/ping.c
index 8dd4ae0..c0d82f7 100644
--- a/net/ipv4/ping.c
+++ b/net/ipv4/ping.c
@@ -811,7 +811,8 @@ back_from_confirm:
 	pfh.icmph.checksum = 0;
 	pfh.icmph.un.echo.id = inet->inet_sport;
 	pfh.icmph.un.echo.sequence = user_icmph.un.echo.sequence;
-	pfh.iov = msg->msg_iov;
+	/* XXX: stripping const */
+	pfh.iov = (struct iovec *)msg->msg_iter.iov;
 	pfh.wcheck = 0;
 	pfh.family = AF_INET;
 
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index 5d83bd2..0bb68df 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -625,7 +625,8 @@ static int raw_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 back_from_confirm:
 
 	if (inet->hdrincl)
-		err = raw_send_hdrinc(sk, &fl4, msg->msg_iov, len,
+		/* XXX: stripping const */
+		err = raw_send_hdrinc(sk, &fl4, (struct iovec *)msg->msg_iter.iov, len,
 				      &rt, msg->msg_flags);
 
 	 else {
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 4a96f37..54ba620 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1085,7 +1085,7 @@ static int tcp_sendmsg_fastopen(struct sock *sk, struct msghdr *msg,
 int tcp_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 		size_t size)
 {
-	struct iovec *iov;
+	const struct iovec *iov;
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct sk_buff *skb;
 	int iovlen, flags, err, copied = 0;
@@ -1136,8 +1136,8 @@ int tcp_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 	mss_now = tcp_send_mss(sk, &size_goal, flags);
 
 	/* Ok commence sending. */
-	iovlen = msg->msg_iovlen;
-	iov = msg->msg_iov;
+	iovlen = msg->msg_iter.nr_segs;
+	iov = msg->msg_iter.iov;
 	copied = 0;
 
 	err = -EPIPE;
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index f5bd4bd..3e225b0 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -3050,7 +3050,7 @@ static int tcp_send_syn_data(struct sock *sk, struct sk_buff *syn)
 	syn_data->ip_summed = CHECKSUM_PARTIAL;
 	memcpy(syn_data->cb, syn->cb, sizeof(syn->cb));
 	if (unlikely(memcpy_fromiovecend(skb_put(syn_data, space),
-					 fo->data->msg_iov, 0, space))) {
+					 fo->data->msg_iter.iov, 0, space))) {
 		kfree_skb(syn_data);
 		goto fallback;
 	}
diff --git a/net/ipv6/ping.c b/net/ipv6/ping.c
index 5b7a1ed..2d31483 100644
--- a/net/ipv6/ping.c
+++ b/net/ipv6/ping.c
@@ -163,7 +163,8 @@ int ping_v6_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 	pfh.icmph.checksum = 0;
 	pfh.icmph.un.echo.id = inet->inet_sport;
 	pfh.icmph.un.echo.sequence = user_icmph.icmp6_sequence;
-	pfh.iov = msg->msg_iov;
+	/* XXX: stripping const */
+	pfh.iov = (struct iovec *)msg->msg_iter.iov;
 	pfh.wcheck = 0;
 	pfh.family = AF_INET6;
 
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 11a9283..ee25631 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -886,7 +886,8 @@ static int rawv6_sendmsg(struct kiocb *iocb, struct sock *sk,
 
 back_from_confirm:
 	if (inet->hdrincl)
-		err = rawv6_send_hdrinc(sk, msg->msg_iov, len, &fl6, &dst, msg->msg_flags);
+		/* XXX: stripping const */
+		err = rawv6_send_hdrinc(sk, (struct iovec *)msg->msg_iter.iov, len, &fl6, &dst, msg->msg_flags);
 	else {
 		lock_sock(sk);
 		err = ip6_append_data(sk, raw6_getfrag, &rfv,
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 63aa5c8..cc9bcf0 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -2305,7 +2305,7 @@ static int netlink_sendmsg(struct kiocb *kiocb, struct socket *sock,
 	}
 
 	if (netlink_tx_is_mmaped(sk) &&
-	    msg->msg_iov->iov_base == NULL) {
+	    msg->msg_iter.iov->iov_base == NULL) {
 		err = netlink_mmap_sendmsg(sk, msg, dst_portid, dst_group,
 					   siocb);
 		goto out;
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index efa8445..ed2e620 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2408,11 +2408,8 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
 	unsigned short gso_type = 0;
 	int hlen, tlen;
 	int extra_len = 0;
-	struct iov_iter from;
 	ssize_t n;
 
-	iov_iter_init(&from, WRITE, msg->msg_iov, msg->msg_iovlen, len);
-
 	/*
 	 *	Get and verify the address.
 	 */
@@ -2451,7 +2448,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
 		len -= vnet_hdr_len;
 
 		err = -EFAULT;
-		n = copy_from_iter(&vnet_hdr, vnet_hdr_len, &from);
+		n = copy_from_iter(&vnet_hdr, vnet_hdr_len, &msg->msg_iter);
 		if (n != vnet_hdr_len)
 			goto out_unlock;
 
@@ -2522,7 +2519,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
 	}
 
 	/* Returns -EFAULT on error */
-	err = skb_copy_datagram_from_iter(skb, offset, &from, len);
+	err = skb_copy_datagram_from_iter(skb, offset, &msg->msg_iter, len);
 	if (err)
 		goto out_free;
 
diff --git a/net/rds/recv.c b/net/rds/recv.c
index 47d7b10..f9ec1ac 100644
--- a/net/rds/recv.c
+++ b/net/rds/recv.c
@@ -404,7 +404,6 @@ int rds_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
 	int ret = 0, nonblock = msg_flags & MSG_DONTWAIT;
 	DECLARE_SOCKADDR(struct sockaddr_in *, sin, msg->msg_name);
 	struct rds_incoming *inc = NULL;
-	struct iov_iter to;
 
 	/* udp_recvmsg()->sock_recvtimeo() gets away without locking too.. */
 	timeo = sock_rcvtimeo(sk, nonblock);
@@ -415,6 +414,7 @@ int rds_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
 		goto out;
 
 	while (1) {
+		struct iov_iter save;
 		/* If there are pending notifications, do those - and nothing else */
 		if (!list_empty(&rs->rs_notify_queue)) {
 			ret = rds_notify_queue_get(rs, msg);
@@ -450,8 +450,8 @@ int rds_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
 		rdsdebug("copying inc %p from %pI4:%u to user\n", inc,
 			 &inc->i_conn->c_faddr,
 			 ntohs(inc->i_hdr.h_sport));
-		iov_iter_init(&to, READ, msg->msg_iov, msg->msg_iovlen, size);
-		ret = inc->i_conn->c_trans->inc_copy_to_user(inc, &to);
+		save = msg->msg_iter;
+		ret = inc->i_conn->c_trans->inc_copy_to_user(inc, &msg->msg_iter);
 		if (ret < 0)
 			break;
 
@@ -464,6 +464,7 @@ int rds_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
 			rds_inc_put(inc);
 			inc = NULL;
 			rds_stats_inc(s_recv_deliver_raced);
+			msg->msg_iter = save;
 			continue;
 		}
 
diff --git a/net/rds/send.c b/net/rds/send.c
index 4de62ea..40a5629a 100644
--- a/net/rds/send.c
+++ b/net/rds/send.c
@@ -934,9 +934,7 @@ int rds_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
 	int queued = 0, allocated_mr = 0;
 	int nonblock = msg->msg_flags & MSG_DONTWAIT;
 	long timeo = sock_sndtimeo(sk, nonblock);
-	struct iov_iter from;
 
-	iov_iter_init(&from, WRITE, msg->msg_iov, msg->msg_iovlen, payload_len);
 	/* Mirror Linux UDP mirror of BSD error message compatibility */
 	/* XXX: Perhaps MSG_MORE someday */
 	if (msg->msg_flags & ~(MSG_DONTWAIT | MSG_CMSG_COMPAT)) {
@@ -984,7 +982,7 @@ int rds_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
 			ret = -ENOMEM;
 			goto out;
 		}
-		ret = rds_message_copy_from_user(rm, &from);
+		ret = rds_message_copy_from_user(rm, &msg->msg_iter);
 		if (ret)
 			goto out;
 	}
diff --git a/net/rxrpc/ar-output.c b/net/rxrpc/ar-output.c
index 0b4b9a7..86e0f10 100644
--- a/net/rxrpc/ar-output.c
+++ b/net/rxrpc/ar-output.c
@@ -531,14 +531,12 @@ static int rxrpc_send_data(struct kiocb *iocb,
 	struct rxrpc_skb_priv *sp;
 	unsigned char __user *from;
 	struct sk_buff *skb;
-	struct iovec *iov;
+	const struct iovec *iov;
 	struct sock *sk = &rx->sk;
 	long timeo;
 	bool more;
 	int ret, ioc, segment, copied;
 
-	_enter(",,,{%zu},%zu", msg->msg_iovlen, len);
-
 	timeo = sock_sndtimeo(sk, msg->msg_flags & MSG_DONTWAIT);
 
 	/* this should be in poll */
@@ -547,8 +545,8 @@ static int rxrpc_send_data(struct kiocb *iocb,
 	if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN))
 		return -EPIPE;
 
-	iov = msg->msg_iov;
-	ioc = msg->msg_iovlen - 1;
+	iov = msg->msg_iter.iov;
+	ioc = msg->msg_iter.nr_segs - 1;
 	from = iov->iov_base;
 	segment = iov->iov_len;
 	iov++;
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 0397ac9..c92f96cd 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -1609,9 +1609,6 @@ static int sctp_sendmsg(struct kiocb *iocb, struct sock *sk,
 	__u16 sinfo_flags = 0;
 	long timeo;
 	int err;
-	struct iov_iter from;
-
-	iov_iter_init(&from, WRITE, msg->msg_iov, msg->msg_iovlen, msg_len);
 
 	err = 0;
 	sp = sctp_sk(sk);
@@ -1950,7 +1947,7 @@ static int sctp_sendmsg(struct kiocb *iocb, struct sock *sk,
 	}
 
 	/* Break the message into multiple chunks of maximum size. */
-	datamsg = sctp_datamsg_from_user(asoc, sinfo, &from);
+	datamsg = sctp_datamsg_from_user(asoc, sinfo, &msg->msg_iter);
 	if (IS_ERR(datamsg)) {
 		err = PTR_ERR(datamsg);
 		goto out_free;
diff --git a/net/socket.c b/net/socket.c
index f676ac4..8809afc 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -689,8 +689,7 @@ int kernel_sendmsg(struct socket *sock, struct msghdr *msg,
 	 * the following is safe, since for compiler definitions of kvec and
 	 * iovec are identical, yielding the same in-core layout and alignment
 	 */
-	msg->msg_iov = (struct iovec *)vec;
-	msg->msg_iovlen = num;
+	iov_iter_init(&msg->msg_iter, WRITE, (struct iovec *)vec, num, size);
 	result = sock_sendmsg(sock, msg, size);
 	set_fs(oldfs);
 	return result;
@@ -853,7 +852,7 @@ int kernel_recvmsg(struct socket *sock, struct msghdr *msg,
 	 * the following is safe, since for compiler definitions of kvec and
 	 * iovec are identical, yielding the same in-core layout and alignment
 	 */
-	msg->msg_iov = (struct iovec *)vec, msg->msg_iovlen = num;
+	iov_iter_init(&msg->msg_iter, READ, (struct iovec *)vec, num, size);
 	result = sock_recvmsg(sock, msg, size, flags);
 	set_fs(oldfs);
 	return result;
@@ -913,8 +912,7 @@ static ssize_t do_sock_read(struct msghdr *msg, struct kiocb *iocb,
 	msg->msg_namelen = 0;
 	msg->msg_control = NULL;
 	msg->msg_controllen = 0;
-	msg->msg_iov = (struct iovec *)iov;
-	msg->msg_iovlen = nr_segs;
+	iov_iter_init(&msg->msg_iter, READ, iov, nr_segs, size);
 	msg->msg_flags = (file->f_flags & O_NONBLOCK) ? MSG_DONTWAIT : 0;
 
 	return __sock_recvmsg(iocb, sock, msg, size, msg->msg_flags);
@@ -953,8 +951,7 @@ static ssize_t do_sock_write(struct msghdr *msg, struct kiocb *iocb,
 	msg->msg_namelen = 0;
 	msg->msg_control = NULL;
 	msg->msg_controllen = 0;
-	msg->msg_iov = (struct iovec *)iov;
-	msg->msg_iovlen = nr_segs;
+	iov_iter_init(&msg->msg_iter, WRITE, iov, nr_segs, size);
 	msg->msg_flags = (file->f_flags & O_NONBLOCK) ? MSG_DONTWAIT : 0;
 	if (sock->type == SOCK_SEQPACKET)
 		msg->msg_flags |= MSG_EOR;
@@ -1798,8 +1795,7 @@ SYSCALL_DEFINE6(sendto, int, fd, void __user *, buff, size_t, len,
 	iov.iov_base = buff;
 	iov.iov_len = len;
 	msg.msg_name = NULL;
-	msg.msg_iov = &iov;
-	msg.msg_iovlen = 1;
+	iov_iter_init(&msg.msg_iter, WRITE, &iov, 1, len);
 	msg.msg_control = NULL;
 	msg.msg_controllen = 0;
 	msg.msg_namelen = 0;
@@ -1856,10 +1852,9 @@ SYSCALL_DEFINE6(recvfrom, int, fd, void __user *, ubuf, size_t, size,
 
 	msg.msg_control = NULL;
 	msg.msg_controllen = 0;
-	msg.msg_iovlen = 1;
-	msg.msg_iov = &iov;
 	iov.iov_len = size;
 	iov.iov_base = ubuf;
+	iov_iter_init(&msg.msg_iter, READ, &iov, 1, size);
 	/* Save some cycles and don't copy the address if not needed */
 	msg.msg_name = addr ? (struct sockaddr *)&address : NULL;
 	/* We assume all kernel code knows the size of sockaddr_storage */
@@ -1993,13 +1988,14 @@ static ssize_t copy_msghdr_from_user(struct msghdr *kmsg,
 {
 	struct sockaddr __user *uaddr;
 	struct iovec __user *uiov;
+	size_t nr_segs;
 	ssize_t err;
 
 	if (!access_ok(VERIFY_READ, umsg, sizeof(*umsg)) ||
 	    __get_user(uaddr, &umsg->msg_name) ||
 	    __get_user(kmsg->msg_namelen, &umsg->msg_namelen) ||
 	    __get_user(uiov, &umsg->msg_iov) ||
-	    __get_user(kmsg->msg_iovlen, &umsg->msg_iovlen) ||
+	    __get_user(nr_segs, &umsg->msg_iovlen) ||
 	    __get_user(kmsg->msg_control, &umsg->msg_control) ||
 	    __get_user(kmsg->msg_controllen, &umsg->msg_controllen) ||
 	    __get_user(kmsg->msg_flags, &umsg->msg_flags))
@@ -2029,14 +2025,15 @@ static ssize_t copy_msghdr_from_user(struct msghdr *kmsg,
 		kmsg->msg_namelen = 0;
 	}
 
-	if (kmsg->msg_iovlen > UIO_MAXIOV)
+	if (nr_segs > UIO_MAXIOV)
 		return -EMSGSIZE;
 
 	err = rw_copy_check_uvector(save_addr ? READ : WRITE,
-				    uiov, kmsg->msg_iovlen,
+				    uiov, nr_segs,
 				    UIO_FASTIOV, *iov, iov);
 	if (err >= 0)
-		kmsg->msg_iov = *iov;
+		iov_iter_init(&kmsg->msg_iter, save_addr ? READ : WRITE,
+			      *iov, nr_segs, err);
 	return err;
 }
 
diff --git a/net/tipc/msg.c b/net/tipc/msg.c
index 5b06597..a687b30 100644
--- a/net/tipc/msg.c
+++ b/net/tipc/msg.c
@@ -194,7 +194,7 @@ int tipc_msg_build(struct tipc_msg *mhdr, struct msghdr *m, int offset,
 		__skb_queue_tail(list, skb);
 		skb_copy_to_linear_data(skb, mhdr, mhsz);
 		pktpos = skb->data + mhsz;
-		if (!dsz || !memcpy_fromiovecend(pktpos, m->msg_iov, offset,
+		if (!dsz || !memcpy_fromiovecend(pktpos, m->msg_iter.iov, offset,
 						 dsz))
 			return dsz;
 		rc = -EFAULT;
@@ -224,7 +224,7 @@ int tipc_msg_build(struct tipc_msg *mhdr, struct msghdr *m, int offset,
 		if (drem < pktrem)
 			pktrem = drem;
 
-		if (memcpy_fromiovecend(pktpos, m->msg_iov, offset, pktrem)) {
+		if (memcpy_fromiovecend(pktpos, m->msg_iter.iov, offset, pktrem)) {
 			rc = -EFAULT;
 			goto error;
 		}
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 4450d62..8e1b102 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1459,9 +1459,6 @@ static int unix_dgram_sendmsg(struct kiocb *kiocb, struct socket *sock,
 	struct scm_cookie tmp_scm;
 	int max_level;
 	int data_len = 0;
-	struct iov_iter from;
-
-	iov_iter_init(&from, WRITE, msg->msg_iov, msg->msg_iovlen, len);
 
 	if (NULL == siocb->scm)
 		siocb->scm = &tmp_scm;
@@ -1519,7 +1516,7 @@ static int unix_dgram_sendmsg(struct kiocb *kiocb, struct socket *sock,
 	skb_put(skb, len - data_len);
 	skb->data_len = data_len;
 	skb->len = len;
-	err = skb_copy_datagram_from_iter(skb, 0, &from, len);
+	err = skb_copy_datagram_from_iter(skb, 0, &msg->msg_iter, len);
 	if (err)
 		goto out_free;
 
@@ -1641,9 +1638,6 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
 	bool fds_sent = false;
 	int max_level;
 	int data_len;
-	struct iov_iter from;
-
-	iov_iter_init(&from, WRITE, msg->msg_iov, msg->msg_iovlen, len);
 
 	if (NULL == siocb->scm)
 		siocb->scm = &tmp_scm;
@@ -1700,7 +1694,7 @@ static int unix_stream_sendmsg(struct kiocb *kiocb, struct socket *sock,
 		skb_put(skb, size - data_len);
 		skb->data_len = data_len;
 		skb->len = size;
-		err = skb_copy_datagram_from_iter(skb, 0, &from, size);
+		err = skb_copy_datagram_from_iter(skb, 0, &msg->msg_iter, size);
 		if (err) {
 			kfree_skb(skb);
 			goto out_err;
diff --git a/net/vmw_vsock/vmci_transport.c b/net/vmw_vsock/vmci_transport.c
index 20a0ba3..02d2e52 100644
--- a/net/vmw_vsock/vmci_transport.c
+++ b/net/vmw_vsock/vmci_transport.c
@@ -1850,7 +1850,8 @@ static ssize_t vmci_transport_stream_enqueue(
 	struct msghdr *msg,
 	size_t len)
 {
-	return vmci_qpair_enquev(vmci_trans(vsk)->qpair, msg->msg_iov, len, 0);
+	/* XXX: stripping const */
+	return vmci_qpair_enquev(vmci_trans(vsk)->qpair, (struct iovec *)msg->msg_iter.iov, len, 0);
 }
 
 static s64 vmci_transport_stream_has_data(struct vsock_sock *vsk)
-- 
2.1.3


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 21/25] first fruits - kill l2cap ->memcpy_fromiovec()
  2014-12-09 22:49                                               ` Al Viro
                                                                   ` (19 preceding siblings ...)
  2014-12-09 22:50                                                 ` [PATCH 20/25] put iov_iter into msghdr Al Viro
@ 2014-12-09 22:50                                                 ` Al Viro
  2014-12-09 22:50                                                 ` [PATCH 22/25] switch memcpy_to_msg() and skb_copy{,_and_csum}_datagram_msg() to primitives Al Viro
                                                                   ` (5 subsequent siblings)
  26 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-12-09 22:50 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

Just use copy_from_iter().  That's what this method is trying to do
in all cases, in a very convoluted fashion.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 include/net/bluetooth/l2cap.h | 29 -----------------------------
 net/bluetooth/6lowpan.c       |  3 +--
 net/bluetooth/a2mp.c          |  3 +--
 net/bluetooth/l2cap_core.c    |  7 +++----
 net/bluetooth/l2cap_sock.c    |  8 --------
 net/bluetooth/smp.c           |  4 +---
 6 files changed, 6 insertions(+), 48 deletions(-)

diff --git a/include/net/bluetooth/l2cap.h b/include/net/bluetooth/l2cap.h
index bca6fc0..692f786 100644
--- a/include/net/bluetooth/l2cap.h
+++ b/include/net/bluetooth/l2cap.h
@@ -606,10 +606,6 @@ struct l2cap_ops {
 	struct sk_buff		*(*alloc_skb) (struct l2cap_chan *chan,
 					       unsigned long hdr_len,
 					       unsigned long len, int nb);
-	int			(*memcpy_fromiovec) (struct l2cap_chan *chan,
-						     unsigned char *kdata,
-						     struct msghdr *msg,
-						     int len);
 };
 
 struct l2cap_conn {
@@ -903,31 +899,6 @@ static inline long l2cap_chan_no_get_sndtimeo(struct l2cap_chan *chan)
 	return 0;
 }
 
-static inline int l2cap_chan_no_memcpy_fromiovec(struct l2cap_chan *chan,
-						 unsigned char *kdata,
-						 struct msghdr *msg,
-						 int len)
-{
-	/* Following is safe since for compiler definitions of kvec and
-	 * iovec are identical, yielding the same in-core layout and alignment
-	 */
-	struct kvec *vec = (struct kvec *)msg->msg_iter.iov;
-
-	while (len > 0) {
-		if (vec->iov_len) {
-			int copy = min_t(unsigned int, len, vec->iov_len);
-			memcpy(kdata, vec->iov_base, copy);
-			len -= copy;
-			kdata += copy;
-			vec->iov_base += copy;
-			vec->iov_len -= copy;
-		}
-		vec++;
-	}
-
-	return 0;
-}
-
 extern bool disable_ertm;
 
 int l2cap_init_sockets(void);
diff --git a/net/bluetooth/6lowpan.c b/net/bluetooth/6lowpan.c
index d8c67a5..76617be 100644
--- a/net/bluetooth/6lowpan.c
+++ b/net/bluetooth/6lowpan.c
@@ -541,7 +541,7 @@ static int send_pkt(struct l2cap_chan *chan, struct sk_buff *skb,
 	iv.iov_len = skb->len;
 
 	memset(&msg, 0, sizeof(msg));
-	iov_iter_init(&msg.msg_iter, WRITE, (struct iovec *) &iv, 1, skb->len);
+	iov_iter_kvec(&msg.msg_iter, WRITE | ITER_KVEC, &iv, 1, skb->len);
 
 	err = l2cap_chan_send(chan, &msg, skb->len);
 	if (err > 0) {
@@ -1050,7 +1050,6 @@ static const struct l2cap_ops bt_6lowpan_chan_ops = {
 	.suspend		= chan_suspend_cb,
 	.get_sndtimeo		= chan_get_sndtimeo_cb,
 	.alloc_skb		= chan_alloc_skb_cb,
-	.memcpy_fromiovec	= l2cap_chan_no_memcpy_fromiovec,
 
 	.teardown		= l2cap_chan_no_teardown,
 	.defer			= l2cap_chan_no_defer,
diff --git a/net/bluetooth/a2mp.c b/net/bluetooth/a2mp.c
index 716d2a3..cedfbda 100644
--- a/net/bluetooth/a2mp.c
+++ b/net/bluetooth/a2mp.c
@@ -60,7 +60,7 @@ void a2mp_send(struct amp_mgr *mgr, u8 code, u8 ident, u16 len, void *data)
 
 	memset(&msg, 0, sizeof(msg));
 
-	iov_iter_init(&msg.msg_iter, WRITE, (struct iovec *)&iv, 1, total_len);
+	iov_iter_kvec(&msg.msg_iter, WRITE | ITER_KVEC, &iv, 1, total_len);
 
 	l2cap_chan_send(chan, &msg, total_len);
 
@@ -719,7 +719,6 @@ static const struct l2cap_ops a2mp_chan_ops = {
 	.resume = l2cap_chan_no_resume,
 	.set_shutdown = l2cap_chan_no_set_shutdown,
 	.get_sndtimeo = l2cap_chan_no_get_sndtimeo,
-	.memcpy_fromiovec = l2cap_chan_no_memcpy_fromiovec,
 };
 
 static struct l2cap_chan *a2mp_chan_open(struct l2cap_conn *conn, bool locked)
diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c
index 5201d61..1754040 100644
--- a/net/bluetooth/l2cap_core.c
+++ b/net/bluetooth/l2cap_core.c
@@ -2096,8 +2096,7 @@ static inline int l2cap_skbuff_fromiovec(struct l2cap_chan *chan,
 	struct sk_buff **frag;
 	int sent = 0;
 
-	if (chan->ops->memcpy_fromiovec(chan, skb_put(skb, count),
-					msg, count))
+	if (copy_from_iter(skb_put(skb, count), count, &msg->msg_iter) != count)
 		return -EFAULT;
 
 	sent += count;
@@ -2117,8 +2116,8 @@ static inline int l2cap_skbuff_fromiovec(struct l2cap_chan *chan,
 
 		*frag = tmp;
 
-		if (chan->ops->memcpy_fromiovec(chan, skb_put(*frag, count),
-						msg, count))
+		if (copy_from_iter(skb_put(*frag, count), count,
+				   &msg->msg_iter) != count)
 			return -EFAULT;
 
 		sent += count;
diff --git a/net/bluetooth/l2cap_sock.c b/net/bluetooth/l2cap_sock.c
index 205b298..f65caf4 100644
--- a/net/bluetooth/l2cap_sock.c
+++ b/net/bluetooth/l2cap_sock.c
@@ -1336,13 +1336,6 @@ static struct sk_buff *l2cap_sock_alloc_skb_cb(struct l2cap_chan *chan,
 	return skb;
 }
 
-static int l2cap_sock_memcpy_fromiovec_cb(struct l2cap_chan *chan,
-					  unsigned char *kdata,
-					  struct msghdr *msg, int len)
-{
-	return memcpy_from_msg(kdata, msg, len);
-}
-
 static void l2cap_sock_ready_cb(struct l2cap_chan *chan)
 {
 	struct sock *sk = chan->data;
@@ -1427,7 +1420,6 @@ static const struct l2cap_ops l2cap_chan_ops = {
 	.set_shutdown		= l2cap_sock_set_shutdown_cb,
 	.get_sndtimeo		= l2cap_sock_get_sndtimeo_cb,
 	.alloc_skb		= l2cap_sock_alloc_skb_cb,
-	.memcpy_fromiovec	= l2cap_sock_memcpy_fromiovec_cb,
 };
 
 static void l2cap_sock_destruct(struct sock *sk)
diff --git a/net/bluetooth/smp.c b/net/bluetooth/smp.c
index 21f555b..de7dc75 100644
--- a/net/bluetooth/smp.c
+++ b/net/bluetooth/smp.c
@@ -268,7 +268,7 @@ static void smp_send_cmd(struct l2cap_conn *conn, u8 code, u16 len, void *data)
 
 	memset(&msg, 0, sizeof(msg));
 
-	iov_iter_init(&msg.msg_iter, WRITE, (struct iovec *)iv, 2, 1 + len);
+	iov_iter_kvec(&msg.msg_iter, WRITE | ITER_KVEC, iv, 2, 1 + len);
 
 	l2cap_chan_send(chan, &msg, 1 + len);
 
@@ -1629,7 +1629,6 @@ static const struct l2cap_ops smp_chan_ops = {
 	.suspend		= l2cap_chan_no_suspend,
 	.set_shutdown		= l2cap_chan_no_set_shutdown,
 	.get_sndtimeo		= l2cap_chan_no_get_sndtimeo,
-	.memcpy_fromiovec	= l2cap_chan_no_memcpy_fromiovec,
 };
 
 static inline struct l2cap_chan *smp_new_conn_cb(struct l2cap_chan *pchan)
@@ -1678,7 +1677,6 @@ static const struct l2cap_ops smp_root_chan_ops = {
 	.resume			= l2cap_chan_no_resume,
 	.set_shutdown		= l2cap_chan_no_set_shutdown,
 	.get_sndtimeo		= l2cap_chan_no_get_sndtimeo,
-	.memcpy_fromiovec	= l2cap_chan_no_memcpy_fromiovec,
 };
 
 int smp_register(struct hci_dev *hdev)
-- 
2.1.3


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 22/25] switch memcpy_to_msg() and skb_copy{,_and_csum}_datagram_msg() to primitives
  2014-12-09 22:49                                               ` Al Viro
                                                                   ` (20 preceding siblings ...)
  2014-12-09 22:50                                                 ` [PATCH 21/25] first fruits - kill l2cap ->memcpy_fromiovec() Al Viro
@ 2014-12-09 22:50                                                 ` Al Viro
  2014-12-09 22:50                                                 ` [PATCH 23/25] ppp_read(): switch to skb_copy_datagram_iter() Al Viro
                                                                   ` (4 subsequent siblings)
  26 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-12-09 22:50 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

... making both non-draining.  That means that tcp_recvmsg() becomes
non-draining.  And _that_ would break iscsit_do_rx_data() unless we
	a) make sure tcp_recvmsg() is uniformly non-draining (it is)
	b) make sure it copes with arbitrary (including shifted)
iov_iter (it does, all it uses is iov_iter primitives)
	c) make iscsit_do_rx_data() initialize ->msg_iter only once.

Fortunately, (c) is doable with minimal work and we are rid of one
the two places where kernel send/recvmsg users would be unhappy with
non-draining behaviour.

Actually, that makes all but one of ->recvmsg() instances iov_iter-clean.
The exception is skcipher_recvmsg() and it also isn't hard to convert
to primitives (iov_iter_get_pages() is needed there).  That'll wait
a bit - there's some interplay with ->sendmsg() path for that one.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 drivers/target/iscsi/iscsi_target_util.c | 12 +++----
 include/linux/skbuff.h                   | 16 +++-------
 net/core/datagram.c                      | 54 +++++++++++---------------------
 3 files changed, 28 insertions(+), 54 deletions(-)

diff --git a/drivers/target/iscsi/iscsi_target_util.c b/drivers/target/iscsi/iscsi_target_util.c
index ce87ce9..7c6a95b 100644
--- a/drivers/target/iscsi/iscsi_target_util.c
+++ b/drivers/target/iscsi/iscsi_target_util.c
@@ -1326,21 +1326,19 @@ static int iscsit_do_rx_data(
 	struct iscsi_conn *conn,
 	struct iscsi_data_count *count)
 {
-	int data = count->data_length, rx_loop = 0, total_rx = 0, iov_len;
-	struct kvec *iov_p;
+	int data = count->data_length, rx_loop = 0, total_rx = 0;
 	struct msghdr msg;
 
 	if (!conn || !conn->sock || !conn->conn_ops)
 		return -1;
 
 	memset(&msg, 0, sizeof(struct msghdr));
-
-	iov_p = count->iov;
-	iov_len	= count->iov_count;
+	iov_iter_kvec(&msg.msg_iter, READ | ITER_KVEC,
+		      count->iov, count->iov_count, data);
 
 	while (total_rx < data) {
-		rx_loop = kernel_recvmsg(conn->sock, &msg, iov_p, iov_len,
-					(data - total_rx), MSG_WAITALL);
+		rx_loop = sock_recvmsg(conn->sock, &msg,
+				      (data - total_rx), MSG_WAITALL);
 		if (rx_loop <= 0) {
 			pr_debug("rx_loop: %d total_rx: %d\n",
 				rx_loop, total_rx);
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 52cf1bd..4902f2d 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2651,17 +2651,10 @@ int skb_copy_datagram_iter(const struct sk_buff *from, int offset,
 static inline int skb_copy_datagram_msg(const struct sk_buff *from, int offset,
 					struct msghdr *msg, int size)
 {
-	/* XXX: stripping const */
-	return skb_copy_datagram_iovec(from, offset, (struct iovec *)msg->msg_iter.iov, size);
-}
-int skb_copy_and_csum_datagram_iovec(struct sk_buff *skb, int hlen,
-				     struct iovec *iov);
-static inline int skb_copy_and_csum_datagram_msg(struct sk_buff *skb, int hlen,
-			    struct msghdr *msg)
-{
-	/* XXX: stripping const */
-	return skb_copy_and_csum_datagram_iovec(skb, hlen, (struct iovec *)msg->msg_iter.iov);
+	return skb_copy_datagram_iter(from, offset, &msg->msg_iter, size);
 }
+int skb_copy_and_csum_datagram_msg(struct sk_buff *skb, int hlen,
+				   struct msghdr *msg);
 int skb_copy_datagram_from_iter(struct sk_buff *skb, int offset,
 				 struct iov_iter *from, int len);
 int zerocopy_sg_from_iter(struct sk_buff *skb, struct iov_iter *frm);
@@ -2697,8 +2690,7 @@ static inline int memcpy_from_msg(void *data, struct msghdr *msg, int len)
 
 static inline int memcpy_to_msg(struct msghdr *msg, void *data, int len)
 {
-	/* XXX: stripping const */
-	return memcpy_toiovec((struct iovec *)msg->msg_iter.iov, data, len);
+	return copy_to_iter(data, len, &msg->msg_iter) == len ? 0 : -EFAULT;
 }
 
 struct skb_checksum_ops {
diff --git a/net/core/datagram.c b/net/core/datagram.c
index b6e303b..41075ed 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -615,27 +615,25 @@ int zerocopy_sg_from_iter(struct sk_buff *skb, struct iov_iter *from)
 EXPORT_SYMBOL(zerocopy_sg_from_iter);
 
 static int skb_copy_and_csum_datagram(const struct sk_buff *skb, int offset,
-				      u8 __user *to, int len,
+				      struct iov_iter *to, int len,
 				      __wsum *csump)
 {
 	int start = skb_headlen(skb);
 	int i, copy = start - offset;
 	struct sk_buff *frag_iter;
 	int pos = 0;
+	int n;
 
 	/* Copy header. */
 	if (copy > 0) {
-		int err = 0;
 		if (copy > len)
 			copy = len;
-		*csump = csum_and_copy_to_user(skb->data + offset, to, copy,
-					       *csump, &err);
-		if (err)
+		n = csum_and_copy_to_iter(skb->data + offset, copy, csump, to);
+		if (n != copy)
 			goto fault;
 		if ((len -= copy) == 0)
 			return 0;
 		offset += copy;
-		to += copy;
 		pos = copy;
 	}
 
@@ -647,26 +645,22 @@ static int skb_copy_and_csum_datagram(const struct sk_buff *skb, int offset,
 
 		end = start + skb_frag_size(frag);
 		if ((copy = end - offset) > 0) {
-			__wsum csum2;
-			int err = 0;
-			u8  *vaddr;
+			__wsum csum2 = 0;
 			struct page *page = skb_frag_page(frag);
+			u8  *vaddr = kmap(page);
 
 			if (copy > len)
 				copy = len;
-			vaddr = kmap(page);
-			csum2 = csum_and_copy_to_user(vaddr +
-							frag->page_offset +
-							offset - start,
-						      to, copy, 0, &err);
+			n = csum_and_copy_to_iter(vaddr + frag->page_offset +
+						  offset - start, copy,
+						  &csum2, to);
 			kunmap(page);
-			if (err)
+			if (n != copy)
 				goto fault;
 			*csump = csum_block_add(*csump, csum2, pos);
 			if (!(len -= copy))
 				return 0;
 			offset += copy;
-			to += copy;
 			pos += copy;
 		}
 		start = end;
@@ -691,7 +685,6 @@ static int skb_copy_and_csum_datagram(const struct sk_buff *skb, int offset,
 			if ((len -= copy) == 0)
 				return 0;
 			offset += copy;
-			to += copy;
 			pos += copy;
 		}
 		start = end;
@@ -744,20 +737,19 @@ __sum16 __skb_checksum_complete(struct sk_buff *skb)
 EXPORT_SYMBOL(__skb_checksum_complete);
 
 /**
- *	skb_copy_and_csum_datagram_iovec - Copy and checksum skb to user iovec.
+ *	skb_copy_and_csum_datagram_msg - Copy and checksum skb to user iovec.
  *	@skb: skbuff
  *	@hlen: hardware length
- *	@iov: io vector
+ *	@msg: destination
  *
  *	Caller _must_ check that skb will fit to this iovec.
  *
  *	Returns: 0       - success.
  *		 -EINVAL - checksum failure.
- *		 -EFAULT - fault during copy. Beware, in this case iovec
- *			   can be modified!
+ *		 -EFAULT - fault during copy.
  */
-int skb_copy_and_csum_datagram_iovec(struct sk_buff *skb,
-				     int hlen, struct iovec *iov)
+int skb_copy_and_csum_datagram_msg(struct sk_buff *skb,
+				   int hlen, struct msghdr *msg)
 {
 	__wsum csum;
 	int chunk = skb->len - hlen;
@@ -765,28 +757,20 @@ int skb_copy_and_csum_datagram_iovec(struct sk_buff *skb,
 	if (!chunk)
 		return 0;
 
-	/* Skip filled elements.
-	 * Pretty silly, look at memcpy_toiovec, though 8)
-	 */
-	while (!iov->iov_len)
-		iov++;
-
-	if (iov->iov_len < chunk) {
+	if (iov_iter_count(&msg->msg_iter) < chunk) {
 		if (__skb_checksum_complete(skb))
 			goto csum_error;
-		if (skb_copy_datagram_iovec(skb, hlen, iov, chunk))
+		if (skb_copy_datagram_msg(skb, hlen, msg, chunk))
 			goto fault;
 	} else {
 		csum = csum_partial(skb->data, hlen, skb->csum);
-		if (skb_copy_and_csum_datagram(skb, hlen, iov->iov_base,
+		if (skb_copy_and_csum_datagram(skb, hlen, &msg->msg_iter,
 					       chunk, &csum))
 			goto fault;
 		if (csum_fold(csum))
 			goto csum_error;
 		if (unlikely(skb->ip_summed == CHECKSUM_COMPLETE))
 			netdev_rx_csum_fault(skb->dev);
-		iov->iov_len -= chunk;
-		iov->iov_base += chunk;
 	}
 	return 0;
 csum_error:
@@ -794,7 +778,7 @@ csum_error:
 fault:
 	return -EFAULT;
 }
-EXPORT_SYMBOL(skb_copy_and_csum_datagram_iovec);
+EXPORT_SYMBOL(skb_copy_and_csum_datagram_msg);
 
 /**
  * 	datagram_poll - generic datagram poll
-- 
2.1.3


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 23/25] ppp_read(): switch to skb_copy_datagram_iter()
  2014-12-09 22:49                                               ` Al Viro
                                                                   ` (21 preceding siblings ...)
  2014-12-09 22:50                                                 ` [PATCH 22/25] switch memcpy_to_msg() and skb_copy{,_and_csum}_datagram_msg() to primitives Al Viro
@ 2014-12-09 22:50                                                 ` Al Viro
  2014-12-09 22:50                                                 ` [PATCH 24/25] skb_copy_datagram_iovec() can die Al Viro
                                                                   ` (3 subsequent siblings)
  26 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-12-09 22:50 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 drivers/net/ppp/ppp_generic.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ppp/ppp_generic.c b/drivers/net/ppp/ppp_generic.c
index 794a473..af034db 100644
--- a/drivers/net/ppp/ppp_generic.c
+++ b/drivers/net/ppp/ppp_generic.c
@@ -417,6 +417,7 @@ static ssize_t ppp_read(struct file *file, char __user *buf,
 	ssize_t ret;
 	struct sk_buff *skb = NULL;
 	struct iovec iov;
+	struct iov_iter to;
 
 	ret = count;
 
@@ -462,7 +463,8 @@ static ssize_t ppp_read(struct file *file, char __user *buf,
 	ret = -EFAULT;
 	iov.iov_base = buf;
 	iov.iov_len = count;
-	if (skb_copy_datagram_iovec(skb, 0, &iov, skb->len))
+	iov_iter_init(&to, READ, &iov, 1, count);
+	if (skb_copy_datagram_iter(skb, 0, &to, skb->len))
 		goto outf;
 	ret = skb->len;
 
-- 
2.1.3


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 24/25] skb_copy_datagram_iovec() can die
  2014-12-09 22:49                                               ` Al Viro
                                                                   ` (22 preceding siblings ...)
  2014-12-09 22:50                                                 ` [PATCH 23/25] ppp_read(): switch to skb_copy_datagram_iter() Al Viro
@ 2014-12-09 22:50                                                 ` Al Viro
  2014-12-09 22:50                                                 ` [PATCH 25/25] bury memcpy_toiovec() Al Viro
                                                                   ` (2 subsequent siblings)
  26 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-12-09 22:50 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

no callers other than itself.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 include/linux/skbuff.h |  2 --
 net/core/datagram.c    | 84 --------------------------------------------------
 2 files changed, 86 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 4902f2d..ab0bc43 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2644,8 +2644,6 @@ struct sk_buff *skb_recv_datagram(struct sock *sk, unsigned flags, int noblock,
 				  int *err);
 unsigned int datagram_poll(struct file *file, struct socket *sock,
 			   struct poll_table_struct *wait);
-int skb_copy_datagram_iovec(const struct sk_buff *from, int offset,
-			    struct iovec *to, int size);
 int skb_copy_datagram_iter(const struct sk_buff *from, int offset,
 			   struct iov_iter *to, int size);
 static inline int skb_copy_datagram_msg(const struct sk_buff *from, int offset,
diff --git a/net/core/datagram.c b/net/core/datagram.c
index 41075ed..df493d6 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -310,90 +310,6 @@ int skb_kill_datagram(struct sock *sk, struct sk_buff *skb, unsigned int flags)
 EXPORT_SYMBOL(skb_kill_datagram);
 
 /**
- *	skb_copy_datagram_iovec - Copy a datagram to an iovec.
- *	@skb: buffer to copy
- *	@offset: offset in the buffer to start copying from
- *	@to: io vector to copy to
- *	@len: amount of data to copy from buffer to iovec
- *
- *	Note: the iovec is modified during the copy.
- */
-int skb_copy_datagram_iovec(const struct sk_buff *skb, int offset,
-			    struct iovec *to, int len)
-{
-	int start = skb_headlen(skb);
-	int i, copy = start - offset;
-	struct sk_buff *frag_iter;
-
-	trace_skb_copy_datagram_iovec(skb, len);
-
-	/* Copy header. */
-	if (copy > 0) {
-		if (copy > len)
-			copy = len;
-		if (memcpy_toiovec(to, skb->data + offset, copy))
-			goto fault;
-		if ((len -= copy) == 0)
-			return 0;
-		offset += copy;
-	}
-
-	/* Copy paged appendix. Hmm... why does this look so complicated? */
-	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
-		int end;
-		const skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
-
-		WARN_ON(start > offset + len);
-
-		end = start + skb_frag_size(frag);
-		if ((copy = end - offset) > 0) {
-			int err;
-			u8  *vaddr;
-			struct page *page = skb_frag_page(frag);
-
-			if (copy > len)
-				copy = len;
-			vaddr = kmap(page);
-			err = memcpy_toiovec(to, vaddr + frag->page_offset +
-					     offset - start, copy);
-			kunmap(page);
-			if (err)
-				goto fault;
-			if (!(len -= copy))
-				return 0;
-			offset += copy;
-		}
-		start = end;
-	}
-
-	skb_walk_frags(skb, frag_iter) {
-		int end;
-
-		WARN_ON(start > offset + len);
-
-		end = start + frag_iter->len;
-		if ((copy = end - offset) > 0) {
-			if (copy > len)
-				copy = len;
-			if (skb_copy_datagram_iovec(frag_iter,
-						    offset - start,
-						    to, copy))
-				goto fault;
-			if ((len -= copy) == 0)
-				return 0;
-			offset += copy;
-		}
-		start = end;
-	}
-	if (!len)
-		return 0;
-
-fault:
-	return -EFAULT;
-}
-EXPORT_SYMBOL(skb_copy_datagram_iovec);
-
-/**
  *	skb_copy_datagram_iter - Copy a datagram to an iovec iterator.
  *	@skb: buffer to copy
  *	@offset: offset in the buffer to start copying from
-- 
2.1.3


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* [PATCH 25/25] bury memcpy_toiovec()
  2014-12-09 22:49                                               ` Al Viro
                                                                   ` (23 preceding siblings ...)
  2014-12-09 22:50                                                 ` [PATCH 24/25] skb_copy_datagram_iovec() can die Al Viro
@ 2014-12-09 22:50                                                 ` Al Viro
  2014-12-09 23:13                                                 ` the next chunk of iov_iter-net stuff for review Al Viro
  2014-12-10 18:25                                                 ` David Miller
  26 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-12-09 22:50 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

From: Al Viro <viro@zeniv.linux.org.uk>

no users left

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 include/linux/uio.h |  1 -
 lib/iovec.c         | 25 -------------------------
 2 files changed, 26 deletions(-)

diff --git a/include/linux/uio.h b/include/linux/uio.h
index bd8569a..a41e252 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -131,7 +131,6 @@ size_t csum_and_copy_to_iter(void *addr, size_t bytes, __wsum *csum, struct iov_
 size_t csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum, struct iov_iter *i);
 
 int memcpy_fromiovec(unsigned char *kdata, struct iovec *iov, int len);
-int memcpy_toiovec(struct iovec *iov, unsigned char *kdata, int len);
 int memcpy_fromiovecend(unsigned char *kdata, const struct iovec *iov,
 			int offset, int len);
 int memcpy_toiovecend(const struct iovec *v, unsigned char *kdata,
diff --git a/lib/iovec.c b/lib/iovec.c
index df3abd1..2d99cb4 100644
--- a/lib/iovec.c
+++ b/lib/iovec.c
@@ -29,31 +29,6 @@ EXPORT_SYMBOL(memcpy_fromiovec);
 
 /*
  *	Copy kernel to iovec. Returns -EFAULT on error.
- *
- *	Note: this modifies the original iovec.
- */
-
-int memcpy_toiovec(struct iovec *iov, unsigned char *kdata, int len)
-{
-	while (len > 0) {
-		if (iov->iov_len) {
-			int copy = min_t(unsigned int, iov->iov_len, len);
-			if (copy_to_user(iov->iov_base, kdata, copy))
-				return -EFAULT;
-			kdata += copy;
-			len -= copy;
-			iov->iov_len -= copy;
-			iov->iov_base += copy;
-		}
-		iov++;
-	}
-
-	return 0;
-}
-EXPORT_SYMBOL(memcpy_toiovec);
-
-/*
- *	Copy kernel to iovec. Returns -EFAULT on error.
  */
 
 int memcpy_toiovecend(const struct iovec *iov, unsigned char *kdata,
-- 
2.1.3


^ permalink raw reply related	[flat|nested] 133+ messages in thread

* Re: the next chunk of iov_iter-net stuff for review
  2014-12-09 22:49                                               ` Al Viro
                                                                   ` (24 preceding siblings ...)
  2014-12-09 22:50                                                 ` [PATCH 25/25] bury memcpy_toiovec() Al Viro
@ 2014-12-09 23:13                                                 ` Al Viro
  2014-12-10 18:25                                                 ` David Miller
  26 siblings, 0 replies; 133+ messages in thread
From: Al Viro @ 2014-12-09 23:13 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-kernel

On Tue, Dec 09, 2014 at 10:49:28PM +0000, Al Viro wrote:

> OK, rebase is in
> git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-davem-2,
> individual patches - in followups (first 13 - iov_iter, then 12 net-related)

PS: the pile next after that one is also there, in 
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git iov_iter-net-2
but that one contains several things I'd *really* like to hear comments
about (asked about all of them on netdev, no reply so far).  Let's sort this
one out first...

Al, diving back into VFS-related queues...

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: the next chunk of iov_iter-net stuff for review
  2014-12-09 22:49                                               ` Al Viro
                                                                   ` (25 preceding siblings ...)
  2014-12-09 23:13                                                 ` the next chunk of iov_iter-net stuff for review Al Viro
@ 2014-12-10 18:25                                                 ` David Miller
  26 siblings, 0 replies; 133+ messages in thread
From: David Miller @ 2014-12-10 18:25 UTC (permalink / raw)
  To: viro; +Cc: netdev, linux-kernel

From: Al Viro <viro@ZenIV.linux.org.uk>
Date: Tue, 9 Dec 2014 22:49:28 +0000

> OK, rebase is in
> git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git for-davem-2,
> individual patches - in followups (first 13 - iov_iter, then 12 net-related)

All looks good to me, pulled, thanks Al!

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH v2 08/17] {macvtap,tun}_get_user(): switch to iov_iter
  2014-11-25 14:02                           ` [PATCH v2 08/17] {macvtap,tun}_get_user(): switch to iov_iter Al Viro
@ 2015-02-03 10:10                             ` Michael S. Tsirkin
  2015-02-03 14:27                               ` Al Viro
  0 siblings, 1 reply; 133+ messages in thread
From: Michael S. Tsirkin @ 2015-02-03 10:10 UTC (permalink / raw)
  To: Al Viro; +Cc: David Miller, netdev, linux-kernel

On Tue, Nov 25, 2014 at 02:02:22PM +0000, Al Viro wrote:
> From: Al Viro <viro@zeniv.linux.org.uk>
> 
> allows to switch macvtap and tun from ->aio_write() to ->write_iter()
> 
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> ---
>  drivers/net/macvtap.c |   44 +++++++++++++++++++++-----------------------
>  drivers/net/tun.c     |   44 ++++++++++++++++++++++++--------------------
>  2 files changed, 45 insertions(+), 43 deletions(-)
> 
> diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c
> index 8f80045..22b4cf2 100644
> --- a/drivers/net/macvtap.c
> +++ b/drivers/net/macvtap.c
> @@ -640,12 +640,12 @@ static void macvtap_skb_to_vnet_hdr(const struct sk_buff *skb,
>  
>  /* Get packet from user space buffer */
>  static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
> -				const struct iovec *iv, unsigned long total_len,
> -				size_t count, int noblock)
> +				struct iov_iter *from, int noblock)
>  {
>  	int good_linear = SKB_MAX_HEAD(NET_IP_ALIGN);
>  	struct sk_buff *skb;
>  	struct macvlan_dev *vlan;
> +	unsigned long total_len = iov_iter_count(from);
>  	unsigned long len = total_len;
>  	int err;
>  	struct virtio_net_hdr vnet_hdr = { 0 };
> @@ -653,6 +653,7 @@ static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
>  	int copylen = 0;
>  	bool zerocopy = false;
>  	size_t linear;
> +	ssize_t n;
>  
>  	if (q->flags & IFF_VNET_HDR) {
>  		vnet_hdr_len = q->vnet_hdr_sz;
> @@ -662,10 +663,11 @@ static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
>  			goto err;
>  		len -= vnet_hdr_len;
>  
> -		err = memcpy_fromiovecend((void *)&vnet_hdr, iv, 0,
> -					   sizeof(vnet_hdr));
> -		if (err < 0)
> +		err = -EFAULT;
> +		n = copy_from_iter(&vnet_hdr, sizeof(vnet_hdr), from);
> +		if (n != sizeof(vnet_hdr))
>  			goto err;
> +		iov_iter_advance(from, vnet_hdr_len - sizeof(vnet_hdr));
>  		if ((vnet_hdr.flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) &&
>  		     vnet_hdr.csum_start + vnet_hdr.csum_offset + 2 >
>  							vnet_hdr.hdr_len)
> @@ -680,17 +682,16 @@ static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
>  	if (unlikely(len < ETH_HLEN))
>  		goto err;
>  
> -	err = -EMSGSIZE;
> -	if (unlikely(count > UIO_MAXIOV))
> -		goto err;
> -
>  	if (m && m->msg_control && sock_flag(&q->sk, SOCK_ZEROCOPY)) {
> +		struct iov_iter i;
> +
>  		copylen = vnet_hdr.hdr_len ? vnet_hdr.hdr_len : GOODCOPY_LEN;
>  		if (copylen > good_linear)
>  			copylen = good_linear;
>  		linear = copylen;
> -		if (iov_pages(iv, vnet_hdr_len + copylen, count)
> -		    <= MAX_SKB_FRAGS)
> +		i = *from;
> +		iov_iter_advance(&i, copylen);
> +		if (iov_iter_npages(&i, INT_MAX) <= MAX_SKB_FRAGS)
>  			zerocopy = true;
>  	}
>  
> @@ -708,10 +709,9 @@ static ssize_t macvtap_get_user(struct macvtap_queue *q, struct msghdr *m,
>  		goto err;
>  
>  	if (zerocopy)
> -		err = zerocopy_sg_from_iovec(skb, iv, vnet_hdr_len, count);
> +		err = zerocopy_sg_from_iter(skb, from);
>  	else {
> -		err = skb_copy_datagram_from_iovec(skb, 0, iv, vnet_hdr_len,
> -						   len);
> +		err = skb_copy_datagram_from_iter(skb, 0, from, len);
>  		if (!err && m && m->msg_control) {
>  			struct ubuf_info *uarg = m->msg_control;
>  			uarg->callback(uarg, false);
> @@ -764,16 +764,12 @@ err:
>  	return err;
>  }
>  
> -static ssize_t macvtap_aio_write(struct kiocb *iocb, const struct iovec *iv,
> -				 unsigned long count, loff_t pos)
> +static ssize_t macvtap_write_iter(struct kiocb *iocb, struct iov_iter *from)
>  {
>  	struct file *file = iocb->ki_filp;
> -	ssize_t result = -ENOLINK;
>  	struct macvtap_queue *q = file->private_data;
>  
> -	result = macvtap_get_user(q, NULL, iv, iov_length(iv, count), count,
> -				  file->f_flags & O_NONBLOCK);
> -	return result;
> +	return macvtap_get_user(q, NULL, from, file->f_flags & O_NONBLOCK);
>  }
>  
>  /* Put packet to the user space buffer */
> @@ -1081,8 +1077,9 @@ static const struct file_operations macvtap_fops = {
>  	.open		= macvtap_open,
>  	.release	= macvtap_release,
>  	.read		= new_sync_read,
> +	.write		= new_sync_write,
>  	.read_iter	= macvtap_read_iter,
> -	.aio_write	= macvtap_aio_write,
> +	.write_iter	= macvtap_write_iter,
>  	.poll		= macvtap_poll,
>  	.llseek		= no_llseek,
>  	.unlocked_ioctl	= macvtap_ioctl,
> @@ -1095,8 +1092,9 @@ static int macvtap_sendmsg(struct kiocb *iocb, struct socket *sock,
>  			   struct msghdr *m, size_t total_len)
>  {
>  	struct macvtap_queue *q = container_of(sock, struct macvtap_queue, sock);
> -	return macvtap_get_user(q, m, m->msg_iov, total_len, m->msg_iovlen,
> -			    m->msg_flags & MSG_DONTWAIT);
> +	struct iov_iter from;
> +	iov_iter_init(&from, WRITE, m->msg_iov, m->msg_iovlen, total_len);
> +	return macvtap_get_user(q, m, &from, m->msg_flags & MSG_DONTWAIT);
>  }
>  
>  static int macvtap_recvmsg(struct kiocb *iocb, struct socket *sock,
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index 405dfdf..4b743c6 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -1012,28 +1012,29 @@ static struct sk_buff *tun_alloc_skb(struct tun_file *tfile,
>  
>  /* Get packet from user space buffer */
>  static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
> -			    void *msg_control, const struct iovec *iv,
> -			    size_t total_len, size_t count, int noblock)
> +			    void *msg_control, struct iov_iter *from,
> +			    int noblock)
>  {
>  	struct tun_pi pi = { 0, cpu_to_be16(ETH_P_IP) };
>  	struct sk_buff *skb;
> +	size_t total_len = iov_iter_count(from);
>  	size_t len = total_len, align = NET_SKB_PAD, linear;
>  	struct virtio_net_hdr gso = { 0 };
>  	int good_linear;
> -	int offset = 0;
>  	int copylen;
>  	bool zerocopy = false;
>  	int err;
>  	u32 rxhash;
> +	ssize_t n;
>  
>  	if (!(tun->flags & TUN_NO_PI)) {
>  		if (len < sizeof(pi))
>  			return -EINVAL;
>  		len -= sizeof(pi);
>  
> -		if (memcpy_fromiovecend((void *)&pi, iv, 0, sizeof(pi)))
> +		n = copy_from_iter(&pi, sizeof(pi), from);
> +		if (n != sizeof(pi))
>  			return -EFAULT;
> -		offset += sizeof(pi);
>  	}
>  
>  	if (tun->flags & TUN_VNET_HDR) {
> @@ -1041,7 +1042,8 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
>  			return -EINVAL;
>  		len -= tun->vnet_hdr_sz;
>  
> -		if (memcpy_fromiovecend((void *)&gso, iv, offset, sizeof(gso)))
> +		n = copy_from_iter(&gso, sizeof(gso), from);
> +		if (n != sizeof(gso))
>  			return -EFAULT;
>  
>  		if ((gso.flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) &&
> @@ -1050,7 +1052,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
>  
>  		if (gso.hdr_len > len)
>  			return -EINVAL;
> -		offset += tun->vnet_hdr_sz;
> +		iov_iter_advance(from, tun->vnet_hdr_sz);
>  	}
>  
>  	if ((tun->flags & TUN_TYPE_MASK) == TUN_TAP_DEV) {


Hmm does copy_from_iter actually modify the iovec?
If so, won't this break aio on tun/macvtap, by
reversing the effect of
commit 6f26c9a7555e5bcca3560919db9b852015077dae
    tun: fix tun_chr_aio_write so that aio works
?


Maybe we should change iovec_iter to avoid modifying the
underlying iovec?



> @@ -1063,6 +1065,8 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
>  	good_linear = SKB_MAX_HEAD(align);
>  
>  	if (msg_control) {
> +		struct iov_iter i = *from;
> +
>  		/* There are 256 bytes to be copied in skb, so there is
>  		 * enough room for skb expand head in case it is used.
>  		 * The rest of the buffer is mapped from userspace.
> @@ -1071,7 +1075,8 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
>  		if (copylen > good_linear)
>  			copylen = good_linear;
>  		linear = copylen;
> -		if (iov_pages(iv, offset + copylen, count) <= MAX_SKB_FRAGS)
> +		iov_iter_advance(&i, copylen);
> +		if (iov_iter_npages(&i, INT_MAX) <= MAX_SKB_FRAGS)
>  			zerocopy = true;
>  	}
>  
> @@ -1091,9 +1096,9 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
>  	}
>  
>  	if (zerocopy)
> -		err = zerocopy_sg_from_iovec(skb, iv, offset, count);
> +		err = zerocopy_sg_from_iter(skb, from);
>  	else {
> -		err = skb_copy_datagram_from_iovec(skb, 0, iv, offset, len);
> +		err = skb_copy_datagram_from_iter(skb, 0, from, len);
>  		if (!err && msg_control) {
>  			struct ubuf_info *uarg = msg_control;
>  			uarg->callback(uarg, false);
> @@ -1207,8 +1212,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
>  	return total_len;
>  }
>  
> -static ssize_t tun_chr_aio_write(struct kiocb *iocb, const struct iovec *iv,
> -			      unsigned long count, loff_t pos)
> +static ssize_t tun_chr_write_iter(struct kiocb *iocb, struct iov_iter *from)
>  {
>  	struct file *file = iocb->ki_filp;
>  	struct tun_struct *tun = tun_get(file);
> @@ -1218,10 +1222,7 @@ static ssize_t tun_chr_aio_write(struct kiocb *iocb, const struct iovec *iv,
>  	if (!tun)
>  		return -EBADFD;
>  
> -	tun_debug(KERN_INFO, tun, "tun_chr_write %ld\n", count);
> -
> -	result = tun_get_user(tun, tfile, NULL, iv, iov_length(iv, count),
> -			      count, file->f_flags & O_NONBLOCK);
> +	result = tun_get_user(tun, tfile, NULL, from, file->f_flags & O_NONBLOCK);
>  
>  	tun_put(tun);
>  	return result;
> @@ -1445,11 +1446,14 @@ static int tun_sendmsg(struct kiocb *iocb, struct socket *sock,
>  	int ret;
>  	struct tun_file *tfile = container_of(sock, struct tun_file, socket);
>  	struct tun_struct *tun = __tun_get(tfile);
> +	struct iov_iter from;
>  
>  	if (!tun)
>  		return -EBADFD;
> -	ret = tun_get_user(tun, tfile, m->msg_control, m->msg_iov, total_len,
> -			   m->msg_iovlen, m->msg_flags & MSG_DONTWAIT);
> +
> +	iov_iter_init(&from, WRITE, m->msg_iov, m->msg_iovlen, total_len);
> +	ret = tun_get_user(tun, tfile, m->msg_control, &from,
> +			   m->msg_flags & MSG_DONTWAIT);
>  	tun_put(tun);
>  	return ret;
>  }
> @@ -2233,9 +2237,9 @@ static const struct file_operations tun_fops = {
>  	.owner	= THIS_MODULE,
>  	.llseek = no_llseek,
>  	.read  = new_sync_read,
> +	.write = new_sync_write,
>  	.read_iter  = tun_chr_read_iter,
> -	.write = do_sync_write,
> -	.aio_write = tun_chr_aio_write,
> +	.write_iter = tun_chr_write_iter,
>  	.poll	= tun_chr_poll,
>  	.unlocked_ioctl	= tun_chr_ioctl,
>  #ifdef CONFIG_COMPAT
> -- 
> 1.7.10.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH v2 08/17] {macvtap,tun}_get_user(): switch to iov_iter
  2015-02-03 10:10                             ` Michael S. Tsirkin
@ 2015-02-03 14:27                               ` Al Viro
  2015-02-03 15:19                                 ` Michael S. Tsirkin
  0 siblings, 1 reply; 133+ messages in thread
From: Al Viro @ 2015-02-03 14:27 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: David Miller, netdev, linux-kernel

On Tue, Feb 03, 2015 at 12:10:44PM +0200, Michael S. Tsirkin wrote:

> Hmm does copy_from_iter actually modify the iovec?
> If so, won't this break aio on tun/macvtap, by
> reversing the effect of
> commit 6f26c9a7555e5bcca3560919db9b852015077dae
>     tun: fix tun_chr_aio_write so that aio works
> ?
> 
> 
> Maybe we should change iovec_iter to avoid modifying the
> underlying iovec?

iov_iter never changes the underlying iovec (or kvec, or bvec).
iter->iov_offset changes as you go and once you have consumed an
entire iovec element ->iov is incremented to point to the next one
(and ->iov_offset is reset to 0 at that point).  *Contents* of
iter->iov is never modified.

^ permalink raw reply	[flat|nested] 133+ messages in thread

* Re: [PATCH v2 08/17] {macvtap,tun}_get_user(): switch to iov_iter
  2015-02-03 14:27                               ` Al Viro
@ 2015-02-03 15:19                                 ` Michael S. Tsirkin
  0 siblings, 0 replies; 133+ messages in thread
From: Michael S. Tsirkin @ 2015-02-03 15:19 UTC (permalink / raw)
  To: Al Viro; +Cc: David Miller, netdev, linux-kernel

On Tue, Feb 03, 2015 at 02:27:28PM +0000, Al Viro wrote:
> On Tue, Feb 03, 2015 at 12:10:44PM +0200, Michael S. Tsirkin wrote:
> 
> > Hmm does copy_from_iter actually modify the iovec?
> > If so, won't this break aio on tun/macvtap, by
> > reversing the effect of
> > commit 6f26c9a7555e5bcca3560919db9b852015077dae
> >     tun: fix tun_chr_aio_write so that aio works
> > ?
> > 
> > 
> > Maybe we should change iovec_iter to avoid modifying the
> > underlying iovec?
> 
> iov_iter never changes the underlying iovec (or kvec, or bvec).
> iter->iov_offset changes as you go and once you have consumed an
> entire iovec element ->iov is incremented to point to the next one
> (and ->iov_offset is reset to 0 at that point).  *Contents* of
> iter->iov is never modified.

I see, I think I misread the code.
Thanks for the clarification.

-- 
MST

^ permalink raw reply	[flat|nested] 133+ messages in thread

end of thread, other threads:[~2015-02-03 15:19 UTC | newest]

Thread overview: 133+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-11-18  8:47 [RFC] situation with csum_and_copy_... API Al Viro
2014-11-18 19:40 ` [patches][RFC] " Al Viro
2014-11-18 19:41   ` [PATCH 1/5] separate kernel- and userland-side msghdr Al Viro
2014-11-18 19:42   ` [PATCH 2/5] {compat_,}verify_iovec(): switch to generic copying of iovecs Al Viro
2014-11-18 19:42   ` [PATCH 3/5] remove a bunch of now-pointless access_ok() in net Al Viro
2014-11-18 19:43   ` [PATCH 4/5] bury skb_copy_to_page() Al Viro
2014-11-18 19:43   ` [PATCH 5/5] fold verify_iovec() into copy_msghdr_from_user() Al Viro
2014-11-19 20:25   ` [patches][RFC] situation with csum_and_copy_... API David Miller
2014-11-18 20:49 ` [RFC] " Linus Torvalds
2014-11-18 21:23   ` Al Viro
2014-11-18 21:39     ` Linus Torvalds
2014-11-19 20:31     ` David Miller
2014-11-19 20:40       ` Linus Torvalds
2014-11-19 21:17         ` Al Viro
2014-11-19 21:17         ` David Miller
2014-11-19 21:30           ` Al Viro
2014-11-19 21:53             ` David Miller
2014-11-20 21:47               ` Al Viro
2014-11-20 21:55                 ` Eric Dumazet
2014-11-20 22:25                   ` Al Viro
2014-11-20 22:53                     ` Eric Dumazet
2014-11-21  8:49                       ` Al Viro
2014-11-21 15:01                         ` Eric Dumazet
2014-11-21 17:42                         ` David Laight
2014-11-21 19:39                           ` Al Viro
2014-11-21 19:40                             ` Linus Torvalds
2014-11-24 10:03                             ` David Laight
2014-11-22  3:27                         ` Al Viro
2014-11-22  3:36                           ` Al Viro
2014-11-24 10:27                           ` David Laight
2014-11-20 23:23                 ` David Miller
2014-11-21 17:26                   ` David Miller
2014-11-22  4:28                     ` Al Viro
2014-11-22  4:29                       ` [PATCH 01/17] new helper: skb_copy_and_csum_datagram_msg() Al Viro
2014-11-22  4:30                       ` [PATCH 02/17] new helper: memcpy_from_msg() Al Viro
2014-11-22  4:30                       ` [PATCH 03/17] switch ipxrtr_route_packet() from iovec to msghdr Al Viro
2014-11-22  4:31                       ` [PATCH 04/17] new helper: memcpy_to_msg() Al Viro
2014-11-22  4:32                       ` [PATCH 05/17] switch drivers/net/tun.c to ->read_iter() Al Viro
2014-11-22  4:32                       ` [PATCH 06/17] switch macvtap " Al Viro
2014-11-23 23:29                         ` Ben Hutchings
2014-11-22  4:33                       ` [PATCH 07/17] new helpers: skb_copy_datagram_from_iter() and zerocopy_sg_from_iter() Al Viro
2014-11-24  0:02                         ` Ben Hutchings
2014-11-24  0:29                           ` Ben Hutchings
2014-11-24  5:34                           ` Jason Wang
2014-11-24 10:03                             ` Al Viro
2014-11-22  4:33                       ` [PATCH 08/17] {macvtap,tun}_get_user(): switch to iov_iter Al Viro
2014-11-24  0:27                         ` Ben Hutchings
2014-11-24  1:06                           ` Ben Hutchings
2014-11-24 10:15                           ` Al Viro
2014-11-22  4:34                       ` [PATCH 09/17] kill zerocopy_sg_from_iovec() Al Viro
2014-11-22  4:35                       ` [PATCH 10/17] switch AF_PACKET and AF_UNIX to skb_copy_datagram_from_iter() Al Viro
2014-11-22  4:36                       ` PATCH 11/17] switch sctp_user_addto_chunk() and sctp_datamsg_from_user() to passing iov_iter Al Viro
2014-11-22  4:36                       ` [PATCH 12/17] tipc_sendmsg(): pass msghdr instead of its ->msg_iov Al Viro
2014-11-22  4:37                       ` [PATCH 13/17] tipc_msg_build(): " Al Viro
2014-11-22  4:37                       ` [PATCH 14/17] vmci_transport: switch ->enqeue_dgram, ->enqueue_stream and ->dequeue_stream to msghdr Al Viro
2014-11-22  4:38                       ` [PATCH 15/17] [atm] switch vcc_sendmsg() to copy_from_iter() Al Viro
2014-11-22  4:38                       ` [PATCH 16/17] rds: switch ->inc_copy_to_user() to passing iov_iter Al Viro
2014-11-22  4:39                       ` [PATCH 17/17] rds: switch rds_message_copy_from_user() to iov_iter Al Viro
2014-11-24  2:00                         ` Ben Hutchings
2014-11-24 10:17                           ` Al Viro
2014-11-22  7:24                       ` [RFC] situation with csum_and_copy_... API David Miller
2014-11-25  2:40                         ` Al Viro
2014-11-25 14:02                           ` [PATCH v2 01/17] new helper: skb_copy_and_csum_datagram_msg() Al Viro
2014-11-25 19:28                             ` David Miller
2014-11-25 20:59                               ` Al Viro
2014-11-26 17:27                                 ` David Miller
2014-12-05  5:56                                   ` the next chunk of iov_iter-net stuff for review Al Viro
2014-12-05  5:58                                     ` [PATCH 01/12] raw.c: stick msghdr into raw_frag_vec Al Viro
2014-12-05  5:58                                     ` [PATCH 02/12] ipv6 equivalent of "ipv4: Avoid reading user iov twice after raw_probe_proto_opt" Al Viro
2014-12-05  5:58                                     ` [PATCH 03/12] ip_generic_getfrag, udplite_getfrag: switch to passing msghdr Al Viro
2014-12-05  5:58                                     ` [PATCH 04/12] switch tcp_sock->ucopy from iovec (ucopy.iov) to msghdr (ucopy.msg) Al Viro
2014-12-05  5:58                                     ` [PATCH 05/12] switch l2cap ->memcpy_fromiovec() to msghdr Al Viro
2014-12-05  5:58                                     ` [PATCH 06/12] vmci: propagate msghdr all way down to __qp_memcpy_from_queue() Al Viro
2014-12-05  5:58                                     ` [PATCH 07/12] put iov_iter into msghdr Al Viro
2014-12-05  5:58                                     ` [PATCH 08/12] first fruits - kill l2cap ->memcpy_fromiovec() Al Viro
2014-12-05  5:58                                     ` [PATCH 09/12] switch memcpy_to_msg() and skb_copy{,_and_csum}_datagram_msg() to primitives Al Viro
2014-12-05  5:58                                     ` [PATCH 10/12] ppp_read(): switch to skb_copy_datagram_iter() Al Viro
2014-12-05  5:58                                     ` [PATCH 11/12] skb_copy_datagram_iovec() can die Al Viro
2014-12-05  5:58                                     ` [PATCH 12/12] bury memcpy_toiovec() Al Viro
2014-12-09 20:07                                     ` the next chunk of iov_iter-net stuff for review David Miller
2014-12-09 21:04                                       ` Al Viro
2014-12-09 21:17                                         ` David Miller
2014-12-09 21:23                                           ` Al Viro
2014-12-09 21:37                                             ` David Miller
2014-12-09 22:49                                               ` Al Viro
2014-12-09 22:50                                                 ` [PATCH 01/25] iov_iter.c: macros for iterating over iov_iter Al Viro
2014-12-09 22:50                                                 ` [PATCH 02/25] iov_iter.c: iterate_and_advance Al Viro
2014-12-09 22:50                                                 ` [PATCH 03/25] iov_iter.c: convert iov_iter_npages() to iterate_all_kinds Al Viro
2014-12-09 22:50                                                 ` [PATCH 04/25] iov_iter.c: convert iov_iter_get_pages() " Al Viro
2014-12-09 22:50                                                 ` [PATCH 05/25] iov_iter.c: convert iov_iter_get_pages_alloc() " Al Viro
2014-12-09 22:50                                                 ` [PATCH 06/25] iov_iter.c: convert iov_iter_zero() to iterate_and_advance Al Viro
2014-12-09 22:50                                                 ` [PATCH 07/25] iov_iter.c: get rid of bvec_copy_page_{to,from}_iter() Al Viro
2014-12-09 22:50                                                 ` [PATCH 08/25] iov_iter.c: convert copy_from_iter() to iterate_and_advance Al Viro
2014-12-09 22:50                                                 ` [PATCH 09/25] iov_iter.c: convert copy_to_iter() " Al Viro
2014-12-09 22:50                                                 ` [PATCH 10/25] iov_iter.c: handle ITER_KVEC directly Al Viro
2014-12-09 22:50                                                 ` [PATCH 11/25] csum_and_copy_..._iter() Al Viro
2014-12-09 22:50                                                 ` [PATCH 12/25] new helper: iov_iter_kvec() Al Viro
2014-12-09 22:50                                                 ` [PATCH 13/25] copy_from_iter_nocache() Al Viro
2014-12-09 22:50                                                 ` [PATCH 14/25] raw.c: stick msghdr into raw_frag_vec Al Viro
2014-12-09 22:50                                                 ` [PATCH 15/25] ipv6 equivalent of "ipv4: Avoid reading user iov twice after raw_probe_proto_opt" Al Viro
2014-12-09 22:50                                                 ` [PATCH 16/25] ip_generic_getfrag, udplite_getfrag: switch to passing msghdr Al Viro
2014-12-09 22:50                                                 ` [PATCH 17/25] switch tcp_sock->ucopy from iovec (ucopy.iov) to msghdr (ucopy.msg) Al Viro
2014-12-09 22:50                                                 ` [PATCH 18/25] switch l2cap ->memcpy_fromiovec() to msghdr Al Viro
2014-12-09 22:50                                                 ` [PATCH 19/25] vmci: propagate msghdr all way down to __qp_memcpy_from_queue() Al Viro
2014-12-09 22:50                                                 ` [PATCH 20/25] put iov_iter into msghdr Al Viro
2014-12-09 22:50                                                 ` [PATCH 21/25] first fruits - kill l2cap ->memcpy_fromiovec() Al Viro
2014-12-09 22:50                                                 ` [PATCH 22/25] switch memcpy_to_msg() and skb_copy{,_and_csum}_datagram_msg() to primitives Al Viro
2014-12-09 22:50                                                 ` [PATCH 23/25] ppp_read(): switch to skb_copy_datagram_iter() Al Viro
2014-12-09 22:50                                                 ` [PATCH 24/25] skb_copy_datagram_iovec() can die Al Viro
2014-12-09 22:50                                                 ` [PATCH 25/25] bury memcpy_toiovec() Al Viro
2014-12-09 23:13                                                 ` the next chunk of iov_iter-net stuff for review Al Viro
2014-12-10 18:25                                                 ` David Miller
2014-11-25 14:02                           ` [PATCH v2 02/17] new helper: memcpy_from_msg() Al Viro
2014-11-25 14:02                           ` [PATCH v2 03/17] switch ipxrtr_route_packet() from iovec to msghdr Al Viro
2014-11-25 14:02                           ` [PATCH v2 04/17] new helper: memcpy_to_msg() Al Viro
2014-11-25 14:02                           ` [PATCH v2 05/17] switch drivers/net/tun.c to ->read_iter() Al Viro
2014-11-25 14:02                           ` [PATCH v2 06/17] switch macvtap " Al Viro
2014-11-25 14:02                           ` [PATCH v2 07/17] new helpers: skb_copy_datagram_from_iter() and zerocopy_sg_from_iter() Al Viro
2014-11-25 14:02                           ` [PATCH v2 08/17] {macvtap,tun}_get_user(): switch to iov_iter Al Viro
2015-02-03 10:10                             ` Michael S. Tsirkin
2015-02-03 14:27                               ` Al Viro
2015-02-03 15:19                                 ` Michael S. Tsirkin
2014-11-25 14:02                           ` [PATCH v2 09/17] kill zerocopy_sg_from_iovec() Al Viro
2014-11-25 14:02                           ` [PATCH v2 10/17] switch AF_PACKET and AF_UNIX to skb_copy_datagram_from_iter() Al Viro
2014-11-25 14:02                           ` [PATCH v2 11/17] switch sctp_user_addto_chunk() and sctp_datamsg_from_user() to passing iov_iter Al Viro
2014-11-25 14:02                           ` [PATCH v2 12/17] tipc_sendmsg(): pass msghdr instead of its ->msg_iov Al Viro
2014-11-25 14:02                           ` [PATCH v2 13/17] tipc_msg_build(): " Al Viro
2014-11-25 14:02                           ` [PATCH v2 14/17] vmci_transport: switch ->enqeue_dgram, ->enqueue_stream and ->dequeue_stream to msghdr Al Viro
2014-11-25 14:02                           ` [PATCH v2 15/17] [atm] switch vcc_sendmsg() to copy_from_iter() Al Viro
2014-11-25 14:02                           ` [PATCH v2 16/17] rds: switch ->inc_copy_to_user() to passing iov_iter Al Viro
2014-11-25 14:02                           ` [PATCH v2 17/17] rds: switch rds_message_copy_from_user() to iov_iter Al Viro
2014-11-22 17:48                       ` [RFC] situation with csum_and_copy_... API Linus Torvalds
2014-11-21  4:17                 ` Nicholas A. Bellinger

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.