linux-block.vger.kernel.org archive mirror
* [PATCH 00/29] RFC: iov_iter: Switch to using an ops table
@ 2020-11-21 14:13 David Howells
  2020-11-21 14:13 ` [PATCH 01/29] iov_iter: Switch to using a table of operations David Howells
                   ` (31 more replies)
  0 siblings, 32 replies; 55+ messages in thread
From: David Howells @ 2020-11-21 14:13 UTC (permalink / raw)
  To: Pavel Begunkov, Matthew Wilcox, Jens Axboe, Alexander Viro
  Cc: dhowells, Linus Torvalds, linux-fsdevel, linux-block, linux-kernel


Hi Pavel, Willy, Jens, Al,

I had a go at switching the iov_iter code away from using a type bitmask to
using an ops table, to get rid of the if-if-if-if chains that are all over
the place.  After I pushed it, someone pointed me at Pavel's two patches.

I have another iterator class that I want to add, which would lengthen the
if-if-if-if chains further.  A lot of the time, there's a conditional clause
at the beginning of a function that just jumps off to a type-specific
handler or rejects the operation for that type.  An ops table can point
straight at that handler instead.

As far as I can tell, there's no difference in performance in most cases,
though AFS-based kernel compiles appear to take less time (down from 3m20
to 2m50), which might make sense as AFS uses iterators a lot - but there
are too many variables there for it to be a good benchmark (I'm dealing
with a remote server, for a start).

Can someone recommend a good way to benchmark this properly?  The problem
is that the difference this makes relative to the amount of time taken to
actually do I/O is tiny.

I've tried TCP transfers using the following sink program:

	#include <stdio.h>
	#include <stdlib.h>
	#include <string.h>
	#include <fcntl.h>
	#include <unistd.h>
	#include <sys/socket.h>
	#include <netinet/in.h>
	#define OSERROR(X, Y) do { if ((long)(X) == -1) { perror(Y); exit(1); } } while(0)
	static unsigned char buffer[512 * 1024] __attribute__((aligned(4096)));
	int main(int argc, char *argv[])
	{
		struct sockaddr_in sin = { .sin_family = AF_INET, .sin_port = htons(5555) };
		int sfd, afd;
		sfd = socket(AF_INET, SOCK_STREAM, 0);
		OSERROR(sfd, "socket");
		OSERROR(bind(sfd, (struct sockaddr *)&sin, sizeof(sin)), "bind");
		OSERROR(listen(sfd, 1), "listen");
		for (;;) {
			afd = accept(sfd, NULL, NULL);
			if (afd != -1) {
				while (read(afd, buffer, sizeof(buffer)) > 0) {}
				close(afd);
			}
		}
	}

and send program:

	#include <stdio.h>
	#include <stdlib.h>
	#include <string.h>
	#include <fcntl.h>
	#include <unistd.h>
	#include <netdb.h>
	#include <sys/socket.h>
	#include <netinet/in.h>
	#include <sys/stat.h>
	#include <sys/sendfile.h>
	#define OSERROR(X, Y) do { if ((long)(X) == -1) { perror(Y); exit(1); } } while(0)
	static unsigned char buffer[512*1024] __attribute__((aligned(4096)));
	int main(int argc, char *argv[])
	{
		struct sockaddr_in sin = { .sin_family = AF_INET, .sin_port = htons(5555) };
		struct hostent *h;
		ssize_t size, r, o;
		int cfd;
		if (argc != 3) {
			fprintf(stderr, "tcp-gen <server> <size>\n");
			exit(2);
		}
		size = strtoul(argv[2], NULL, 0);
		if (size <= 0) {
			fprintf(stderr, "Bad size\n");
			exit(2);
		}
		h = gethostbyname(argv[1]);
		if (!h) {
			fprintf(stderr, "%s: %s\n", argv[1], hstrerror(h_errno));
			exit(3);
		}
		if (!h->h_addr_list[0]) {
			fprintf(stderr, "%s: No addresses\n", argv[1]);
			exit(3);
		}
		memcpy(&sin.sin_addr, h->h_addr_list[0], h->h_length);
		cfd = socket(AF_INET, SOCK_STREAM, 0);
		OSERROR(cfd, "socket");
		OSERROR(connect(cfd, (struct sockaddr *)&sin, sizeof(sin)), "connect");
		do {
			r = size > sizeof(buffer) ? sizeof(buffer) : size;
			size -= r;
			o = 0;
			do {
				ssize_t w = write(cfd, buffer + o, r - o);
				OSERROR(w, "write");
				o += w;
			} while (o < r);
		} while (size > 0);
		OSERROR(close(cfd), "close/c");
		return 0;
	}

since the socket interface uses iterators.  It seems to show no difference.
One side note, though: I've been doing 10GiB same-machine transfers, and
they take either ~2.5s or ~0.87s and rarely anything in between, with or
without these patches, alternating apparently at random between the two
times.

The patches can be found here:

	https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=iov-ops

David
---
David Howells (29):
      iov_iter: Switch to using a table of operations
      iov_iter: Split copy_page_to_iter()
      iov_iter: Split iov_iter_fault_in_readable
      iov_iter: Split the iterate_and_advance() macro
      iov_iter: Split copy_to_iter()
      iov_iter: Split copy_mc_to_iter()
      iov_iter: Split copy_from_iter()
      iov_iter: Split the iterate_all_kinds() macro
      iov_iter: Split copy_from_iter_full()
      iov_iter: Split copy_from_iter_nocache()
      iov_iter: Split copy_from_iter_flushcache()
      iov_iter: Split copy_from_iter_full_nocache()
      iov_iter: Split copy_page_from_iter()
      iov_iter: Split iov_iter_zero()
      iov_iter: Split copy_from_user_atomic()
      iov_iter: Split iov_iter_advance()
      iov_iter: Split iov_iter_revert()
      iov_iter: Split iov_iter_single_seg_count()
      iov_iter: Split iov_iter_alignment()
      iov_iter: Split iov_iter_gap_alignment()
      iov_iter: Split iov_iter_get_pages()
      iov_iter: Split iov_iter_get_pages_alloc()
      iov_iter: Split csum_and_copy_from_iter()
      iov_iter: Split csum_and_copy_from_iter_full()
      iov_iter: Split csum_and_copy_to_iter()
      iov_iter: Split iov_iter_npages()
      iov_iter: Split dup_iter()
      iov_iter: Split iov_iter_for_each_range()
      iov_iter: Remove iterate_all_kinds() and iterate_and_advance()


 lib/iov_iter.c | 1440 +++++++++++++++++++++++++++++++-----------------
 1 file changed, 934 insertions(+), 506 deletions(-)




* [PATCH 01/29] iov_iter: Switch to using a table of operations
  2020-11-21 14:13 [PATCH 00/29] RFC: iov_iter: Switch to using an ops table David Howells
@ 2020-11-21 14:13 ` David Howells
  2020-11-21 14:31   ` Pavel Begunkov
                     ` (7 more replies)
  2020-11-21 14:13 ` [PATCH 02/29] iov_iter: Split copy_page_to_iter() David Howells
                   ` (30 subsequent siblings)
  31 siblings, 8 replies; 55+ messages in thread
From: David Howells @ 2020-11-21 14:13 UTC (permalink / raw)
  To: Pavel Begunkov, Matthew Wilcox, Jens Axboe, Alexander Viro
  Cc: dhowells, Linus Torvalds, linux-fsdevel, linux-block, linux-kernel

Switch to using a table of operations.  In a future patch the individual
methods will be split up by type.  For the moment, however, the ops tables
just jump directly to the old functions - which are now static.  Inline
wrappers are provided to jump through the hooks.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/io_uring.c       |    2 
 include/linux/uio.h |  241 ++++++++++++++++++++++++++++++++++--------
 lib/iov_iter.c      |  293 +++++++++++++++++++++++++++++++++++++++------------
 3 files changed, 422 insertions(+), 114 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 4ead291b2976..baa78f58ae5c 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -3192,7 +3192,7 @@ static void io_req_map_rw(struct io_kiocb *req, const struct iovec *iovec,
 	rw->free_iovec = iovec;
 	rw->bytes_done = 0;
 	/* can only be fixed buffers, no need to do anything */
-	if (iter->type == ITER_BVEC)
+	if (iov_iter_is_bvec(iter))
 		return;
 	if (!iovec) {
 		unsigned iov_off = 0;
diff --git a/include/linux/uio.h b/include/linux/uio.h
index 72d88566694e..45ee087f8c43 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -32,9 +32,10 @@ struct iov_iter {
 	 * Bit 1 is the BVEC_FLAG_NO_REF bit, set if type is a bvec and
 	 * the caller isn't expecting to drop a page reference when done.
 	 */
-	unsigned int type;
+	unsigned int flags;
 	size_t iov_offset;
 	size_t count;
+	const struct iov_iter_ops *ops;
 	union {
 		const struct iovec *iov;
 		const struct kvec *kvec;
@@ -50,9 +51,63 @@ struct iov_iter {
 	};
 };
 
+void iov_iter_init(struct iov_iter *i, unsigned int direction, const struct iovec *iov,
+			unsigned long nr_segs, size_t count);
+void iov_iter_kvec(struct iov_iter *i, unsigned int direction, const struct kvec *kvec,
+			unsigned long nr_segs, size_t count);
+void iov_iter_bvec(struct iov_iter *i, unsigned int direction, const struct bio_vec *bvec,
+			unsigned long nr_segs, size_t count);
+void iov_iter_pipe(struct iov_iter *i, unsigned int direction, struct pipe_inode_info *pipe,
+			size_t count);
+void iov_iter_discard(struct iov_iter *i, unsigned int direction, size_t count);
+
+struct iov_iter_ops {
+	enum iter_type type;
+	size_t (*copy_from_user_atomic)(struct page *page, struct iov_iter *i,
+					unsigned long offset, size_t bytes);
+	void (*advance)(struct iov_iter *i, size_t bytes);
+	void (*revert)(struct iov_iter *i, size_t bytes);
+	int (*fault_in_readable)(struct iov_iter *i, size_t bytes);
+	size_t (*single_seg_count)(const struct iov_iter *i);
+	size_t (*copy_page_to_iter)(struct page *page, size_t offset, size_t bytes,
+				    struct iov_iter *i);
+	size_t (*copy_page_from_iter)(struct page *page, size_t offset, size_t bytes,
+				      struct iov_iter *i);
+	size_t (*copy_to_iter)(const void *addr, size_t bytes, struct iov_iter *i);
+	size_t (*copy_from_iter)(void *addr, size_t bytes, struct iov_iter *i);
+	bool (*copy_from_iter_full)(void *addr, size_t bytes, struct iov_iter *i);
+	size_t (*copy_from_iter_nocache)(void *addr, size_t bytes, struct iov_iter *i);
+	bool (*copy_from_iter_full_nocache)(void *addr, size_t bytes, struct iov_iter *i);
+#ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
+	size_t (*copy_from_iter_flushcache)(void *addr, size_t bytes, struct iov_iter *i);
+#endif
+#ifdef CONFIG_ARCH_HAS_COPY_MC
+	size_t (*copy_mc_to_iter)(const void *addr, size_t bytes, struct iov_iter *i);
+#endif
+	size_t (*csum_and_copy_to_iter)(const void *addr, size_t bytes, void *csump,
+					struct iov_iter *i);
+	size_t (*csum_and_copy_from_iter)(void *addr, size_t bytes, __wsum *csum,
+					  struct iov_iter *i);
+	bool (*csum_and_copy_from_iter_full)(void *addr, size_t bytes, __wsum *csum,
+					     struct iov_iter *i);
+
+	size_t (*zero)(size_t bytes, struct iov_iter *i);
+	unsigned long (*alignment)(const struct iov_iter *i);
+	unsigned long (*gap_alignment)(const struct iov_iter *i);
+	ssize_t (*get_pages)(struct iov_iter *i, struct page **pages,
+			     size_t maxsize, unsigned maxpages, size_t *start);
+	ssize_t (*get_pages_alloc)(struct iov_iter *i, struct page ***pages,
+				   size_t maxsize, size_t *start);
+	int (*npages)(const struct iov_iter *i, int maxpages);
+	const void *(*dup_iter)(struct iov_iter *new, struct iov_iter *old, gfp_t flags);
+	int (*for_each_range)(struct iov_iter *i, size_t bytes,
+			      int (*f)(struct kvec *vec, void *context),
+			      void *context);
+};
+
 static inline enum iter_type iov_iter_type(const struct iov_iter *i)
 {
-	return i->type & ~(READ | WRITE);
+	return i->ops->type;
 }
 
 static inline bool iter_is_iovec(const struct iov_iter *i)
@@ -82,7 +137,7 @@ static inline bool iov_iter_is_discard(const struct iov_iter *i)
 
 static inline unsigned char iov_iter_rw(const struct iov_iter *i)
 {
-	return i->type & (READ | WRITE);
+	return i->flags & (READ | WRITE);
 }
 
 /*
@@ -111,22 +166,71 @@ static inline struct iovec iov_iter_iovec(const struct iov_iter *iter)
 	};
 }
 
-size_t iov_iter_copy_from_user_atomic(struct page *page,
-		struct iov_iter *i, unsigned long offset, size_t bytes);
-void iov_iter_advance(struct iov_iter *i, size_t bytes);
-void iov_iter_revert(struct iov_iter *i, size_t bytes);
-int iov_iter_fault_in_readable(struct iov_iter *i, size_t bytes);
-size_t iov_iter_single_seg_count(const struct iov_iter *i);
+static inline
+size_t iov_iter_copy_from_user_atomic(struct page *page, struct iov_iter *i,
+				      unsigned long offset, size_t bytes)
+{
+	return i->ops->copy_from_user_atomic(page, i, offset, bytes);
+}
+static inline
+void iov_iter_advance(struct iov_iter *i, size_t bytes)
+{
+	return i->ops->advance(i, bytes);
+}
+static inline
+void iov_iter_revert(struct iov_iter *i, size_t bytes)
+{
+	return i->ops->revert(i, bytes);
+}
+static inline
+int iov_iter_fault_in_readable(struct iov_iter *i, size_t bytes)
+{
+	return i->ops->fault_in_readable(i, bytes);
+}
+static inline
+size_t iov_iter_single_seg_count(const struct iov_iter *i)
+{
+	return i->ops->single_seg_count(i);
+}
+
+static inline
 size_t copy_page_to_iter(struct page *page, size_t offset, size_t bytes,
-			 struct iov_iter *i);
+				       struct iov_iter *i)
+{
+	return i->ops->copy_page_to_iter(page, offset, bytes, i);
+}
+static inline
 size_t copy_page_from_iter(struct page *page, size_t offset, size_t bytes,
-			 struct iov_iter *i);
+					 struct iov_iter *i)
+{
+	return i->ops->copy_page_from_iter(page, offset, bytes, i);
+}
 
-size_t _copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i);
-size_t _copy_from_iter(void *addr, size_t bytes, struct iov_iter *i);
-bool _copy_from_iter_full(void *addr, size_t bytes, struct iov_iter *i);
-size_t _copy_from_iter_nocache(void *addr, size_t bytes, struct iov_iter *i);
-bool _copy_from_iter_full_nocache(void *addr, size_t bytes, struct iov_iter *i);
+static __always_inline __must_check
+size_t _copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
+{
+	return i->ops->copy_to_iter(addr, bytes, i);
+}
+static __always_inline __must_check
+size_t _copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
+{
+	return i->ops->copy_from_iter(addr, bytes, i);
+}
+static __always_inline __must_check
+bool _copy_from_iter_full(void *addr, size_t bytes, struct iov_iter *i)
+{
+	return i->ops->copy_from_iter_full(addr, bytes, i);
+}
+static __always_inline __must_check
+size_t _copy_from_iter_nocache(void *addr, size_t bytes, struct iov_iter *i)
+{
+	return i->ops->copy_from_iter_nocache(addr, bytes, i);
+}
+static __always_inline __must_check
+bool _copy_from_iter_full_nocache(void *addr, size_t bytes, struct iov_iter *i)
+{
+	return i->ops->copy_from_iter_full_nocache(addr, bytes, i);
+}
 
 static __always_inline __must_check
 size_t copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
@@ -173,23 +277,21 @@ bool copy_from_iter_full_nocache(void *addr, size_t bytes, struct iov_iter *i)
 		return _copy_from_iter_full_nocache(addr, bytes, i);
 }
 
-#ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
 /*
  * Note, users like pmem that depend on the stricter semantics of
  * copy_from_iter_flushcache() than copy_from_iter_nocache() must check for
  * IS_ENABLED(CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE) before assuming that the
  * destination is flushed from the cache on return.
  */
-size_t _copy_from_iter_flushcache(void *addr, size_t bytes, struct iov_iter *i);
-#else
-#define _copy_from_iter_flushcache _copy_from_iter_nocache
-#endif
-
-#ifdef CONFIG_ARCH_HAS_COPY_MC
-size_t _copy_mc_to_iter(const void *addr, size_t bytes, struct iov_iter *i);
+static __always_inline __must_check
+size_t _copy_from_iter_flushcache(void *addr, size_t bytes, struct iov_iter *i)
+{
+#ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
+	return i->ops->copy_from_iter_flushcache(addr, bytes, i);
 #else
-#define _copy_mc_to_iter _copy_to_iter
+	return i->ops->copy_from_iter_nocache(addr, bytes, i);
 #endif
+}
 
 static __always_inline __must_check
 size_t copy_from_iter_flushcache(void *addr, size_t bytes, struct iov_iter *i)
@@ -200,6 +302,16 @@ size_t copy_from_iter_flushcache(void *addr, size_t bytes, struct iov_iter *i)
 		return _copy_from_iter_flushcache(addr, bytes, i);
 }
 
+static __always_inline __must_check
+size_t _copy_mc_to_iter(void *addr, size_t bytes, struct iov_iter *i)
+{
+#ifdef CONFIG_ARCH_HAS_COPY_MC
+	return i->ops->copy_mc_to_iter(addr, bytes, i);
+#else
+	return i->ops->copy_to_iter(addr, bytes, i);
+#endif
+}
+
 static __always_inline __must_check
 size_t copy_mc_to_iter(void *addr, size_t bytes, struct iov_iter *i)
 {
@@ -209,25 +321,47 @@ size_t copy_mc_to_iter(void *addr, size_t bytes, struct iov_iter *i)
 		return _copy_mc_to_iter(addr, bytes, i);
 }
 
-size_t iov_iter_zero(size_t bytes, struct iov_iter *);
-unsigned long iov_iter_alignment(const struct iov_iter *i);
-unsigned long iov_iter_gap_alignment(const struct iov_iter *i);
-void iov_iter_init(struct iov_iter *i, unsigned int direction, const struct iovec *iov,
-			unsigned long nr_segs, size_t count);
-void iov_iter_kvec(struct iov_iter *i, unsigned int direction, const struct kvec *kvec,
-			unsigned long nr_segs, size_t count);
-void iov_iter_bvec(struct iov_iter *i, unsigned int direction, const struct bio_vec *bvec,
-			unsigned long nr_segs, size_t count);
-void iov_iter_pipe(struct iov_iter *i, unsigned int direction, struct pipe_inode_info *pipe,
-			size_t count);
-void iov_iter_discard(struct iov_iter *i, unsigned int direction, size_t count);
+static inline
+size_t iov_iter_zero(size_t bytes, struct iov_iter *i)
+{
+	return i->ops->zero(bytes, i);
+}
+static inline
+unsigned long iov_iter_alignment(const struct iov_iter *i)
+{
+	return i->ops->alignment(i);
+}
+static inline
+unsigned long iov_iter_gap_alignment(const struct iov_iter *i)
+{
+	return i->ops->gap_alignment(i);
+}
+
+static inline
 ssize_t iov_iter_get_pages(struct iov_iter *i, struct page **pages,
-			size_t maxsize, unsigned maxpages, size_t *start);
+			size_t maxsize, unsigned maxpages, size_t *start)
+{
+	return i->ops->get_pages(i, pages, maxsize, maxpages, start);
+}
+
+static inline
 ssize_t iov_iter_get_pages_alloc(struct iov_iter *i, struct page ***pages,
-			size_t maxsize, size_t *start);
-int iov_iter_npages(const struct iov_iter *i, int maxpages);
+			size_t maxsize, size_t *start)
+{
+	return i->ops->get_pages_alloc(i, pages, maxsize, start);
+}
+
+static inline
+int iov_iter_npages(const struct iov_iter *i, int maxpages)
+{
+	return i->ops->npages(i, maxpages);
+}
 
-const void *dup_iter(struct iov_iter *new, struct iov_iter *old, gfp_t flags);
+static inline
+const void *dup_iter(struct iov_iter *new, struct iov_iter *old, gfp_t flags)
+{
+	return old->ops->dup_iter(new, old, flags);
+}
 
 static inline size_t iov_iter_count(const struct iov_iter *i)
 {
@@ -260,9 +394,22 @@ static inline void iov_iter_reexpand(struct iov_iter *i, size_t count)
 {
 	i->count = count;
 }
-size_t csum_and_copy_to_iter(const void *addr, size_t bytes, void *csump, struct iov_iter *i);
-size_t csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum, struct iov_iter *i);
-bool csum_and_copy_from_iter_full(void *addr, size_t bytes, __wsum *csum, struct iov_iter *i);
+
+static inline
+size_t csum_and_copy_to_iter(const void *addr, size_t bytes, void *csump, struct iov_iter *i)
+{
+	return i->ops->csum_and_copy_to_iter(addr, bytes, csump, i);
+}
+static inline
+size_t csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum, struct iov_iter *i)
+{
+	return i->ops->csum_and_copy_from_iter(addr, bytes, csum, i);
+}
+static inline
+bool csum_and_copy_from_iter_full(void *addr, size_t bytes, __wsum *csum, struct iov_iter *i)
+{
+	return i->ops->csum_and_copy_from_iter_full(addr, bytes, csum, i);
+}
 size_t hash_and_copy_to_iter(const void *addr, size_t bytes, void *hashp,
 		struct iov_iter *i);
 
@@ -278,8 +425,12 @@ ssize_t __import_iovec(int type, const struct iovec __user *uvec,
 int import_single_range(int type, void __user *buf, size_t len,
 		 struct iovec *iov, struct iov_iter *i);
 
+static inline
 int iov_iter_for_each_range(struct iov_iter *i, size_t bytes,
 			    int (*f)(struct kvec *vec, void *context),
-			    void *context);
+			    void *context)
+{
+	return i->ops->for_each_range(i, bytes, f, context);
+}
 
 #endif
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 1635111c5bd2..e403d524c797 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -13,6 +13,12 @@
 #include <linux/scatterlist.h>
 #include <linux/instrumented.h>
 
+static const struct iov_iter_ops iovec_iter_ops;
+static const struct iov_iter_ops kvec_iter_ops;
+static const struct iov_iter_ops bvec_iter_ops;
+static const struct iov_iter_ops pipe_iter_ops;
+static const struct iov_iter_ops discard_iter_ops;
+
 #define PIPE_PARANOIA /* for now */
 
 #define iterate_iovec(i, n, __v, __p, skip, STEP) {	\
@@ -81,15 +87,15 @@
 #define iterate_all_kinds(i, n, v, I, B, K) {			\
 	if (likely(n)) {					\
 		size_t skip = i->iov_offset;			\
-		if (unlikely(i->type & ITER_BVEC)) {		\
+		if (unlikely(iov_iter_type(i) & ITER_BVEC)) {		\
 			struct bio_vec v;			\
 			struct bvec_iter __bi;			\
 			iterate_bvec(i, n, v, __bi, skip, (B))	\
-		} else if (unlikely(i->type & ITER_KVEC)) {	\
+		} else if (unlikely(iov_iter_type(i) & ITER_KVEC)) {	\
 			const struct kvec *kvec;		\
 			struct kvec v;				\
 			iterate_kvec(i, n, v, kvec, skip, (K))	\
-		} else if (unlikely(i->type & ITER_DISCARD)) {	\
+		} else if (unlikely(iov_iter_type(i) & ITER_DISCARD)) {	\
 		} else {					\
 			const struct iovec *iov;		\
 			struct iovec v;				\
@@ -103,7 +109,7 @@
 		n = i->count;					\
 	if (i->count) {						\
 		size_t skip = i->iov_offset;			\
-		if (unlikely(i->type & ITER_BVEC)) {		\
+		if (unlikely(iov_iter_type(i) & ITER_BVEC)) {		\
 			const struct bio_vec *bvec = i->bvec;	\
 			struct bio_vec v;			\
 			struct bvec_iter __bi;			\
@@ -111,7 +117,7 @@
 			i->bvec = __bvec_iter_bvec(i->bvec, __bi);	\
 			i->nr_segs -= i->bvec - bvec;		\
 			skip = __bi.bi_bvec_done;		\
-		} else if (unlikely(i->type & ITER_KVEC)) {	\
+		} else if (unlikely(iov_iter_type(i) & ITER_KVEC)) {	\
 			const struct kvec *kvec;		\
 			struct kvec v;				\
 			iterate_kvec(i, n, v, kvec, skip, (K))	\
@@ -121,7 +127,7 @@
 			}					\
 			i->nr_segs -= kvec - i->kvec;		\
 			i->kvec = kvec;				\
-		} else if (unlikely(i->type & ITER_DISCARD)) {	\
+		} else if (unlikely(iov_iter_type(i) & ITER_DISCARD)) {	\
 			skip += n;				\
 		} else {					\
 			const struct iovec *iov;		\
@@ -427,14 +433,14 @@ static size_t copy_page_to_iter_pipe(struct page *page, size_t offset, size_t by
  * Return 0 on success, or non-zero if the memory could not be accessed (i.e.
  * because it is an invalid address).
  */
-int iov_iter_fault_in_readable(struct iov_iter *i, size_t bytes)
+static int xxx_fault_in_readable(struct iov_iter *i, size_t bytes)
 {
 	size_t skip = i->iov_offset;
 	const struct iovec *iov;
 	int err;
 	struct iovec v;
 
-	if (!(i->type & (ITER_BVEC|ITER_KVEC))) {
+	if (!(iov_iter_type(i) & (ITER_BVEC|ITER_KVEC))) {
 		iterate_iovec(i, bytes, v, iov, skip, ({
 			err = fault_in_pages_readable(v.iov_base, v.iov_len);
 			if (unlikely(err))
@@ -443,7 +449,6 @@ int iov_iter_fault_in_readable(struct iov_iter *i, size_t bytes)
 	}
 	return 0;
 }
-EXPORT_SYMBOL(iov_iter_fault_in_readable);
 
 void iov_iter_init(struct iov_iter *i, unsigned int direction,
 			const struct iovec *iov, unsigned long nr_segs,
@@ -454,10 +459,12 @@ void iov_iter_init(struct iov_iter *i, unsigned int direction,
 
 	/* It will get better.  Eventually... */
 	if (uaccess_kernel()) {
-		i->type = ITER_KVEC | direction;
+		i->ops = &kvec_iter_ops;
+		i->flags = direction;
 		i->kvec = (struct kvec *)iov;
 	} else {
-		i->type = ITER_IOVEC | direction;
+		i->ops = &iovec_iter_ops;
+		i->flags = direction;
 		i->iov = iov;
 	}
 	i->nr_segs = nr_segs;
@@ -625,7 +632,7 @@ static size_t csum_and_copy_to_pipe_iter(const void *addr, size_t bytes,
 	return bytes;
 }
 
-size_t _copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
+static size_t xxx_copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
 {
 	const char *from = addr;
 	if (unlikely(iov_iter_is_pipe(i)))
@@ -641,7 +648,6 @@ size_t _copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
 
 	return bytes;
 }
-EXPORT_SYMBOL(_copy_to_iter);
 
 #ifdef CONFIG_ARCH_HAS_COPY_MC
 static int copyout_mc(void __user *to, const void *from, size_t n)
@@ -723,7 +729,7 @@ static size_t copy_mc_pipe_to_iter(const void *addr, size_t bytes,
  *   Compare to copy_to_iter() where only ITER_IOVEC attempts might return
  *   a short copy.
  */
-size_t _copy_mc_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
+static size_t xxx_copy_mc_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
 {
 	const char *from = addr;
 	unsigned long rem, curr_addr, s_addr = (unsigned long) addr;
@@ -757,10 +763,9 @@ size_t _copy_mc_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
 
 	return bytes;
 }
-EXPORT_SYMBOL_GPL(_copy_mc_to_iter);
 #endif /* CONFIG_ARCH_HAS_COPY_MC */
 
-size_t _copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
+static size_t xxx_copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
 {
 	char *to = addr;
 	if (unlikely(iov_iter_is_pipe(i))) {
@@ -778,9 +783,8 @@ size_t _copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
 
 	return bytes;
 }
-EXPORT_SYMBOL(_copy_from_iter);
 
-bool _copy_from_iter_full(void *addr, size_t bytes, struct iov_iter *i)
+static bool xxx_copy_from_iter_full(void *addr, size_t bytes, struct iov_iter *i)
 {
 	char *to = addr;
 	if (unlikely(iov_iter_is_pipe(i))) {
@@ -805,9 +809,8 @@ bool _copy_from_iter_full(void *addr, size_t bytes, struct iov_iter *i)
 	iov_iter_advance(i, bytes);
 	return true;
 }
-EXPORT_SYMBOL(_copy_from_iter_full);
 
-size_t _copy_from_iter_nocache(void *addr, size_t bytes, struct iov_iter *i)
+static size_t xxx_copy_from_iter_nocache(void *addr, size_t bytes, struct iov_iter *i)
 {
 	char *to = addr;
 	if (unlikely(iov_iter_is_pipe(i))) {
@@ -824,7 +827,6 @@ size_t _copy_from_iter_nocache(void *addr, size_t bytes, struct iov_iter *i)
 
 	return bytes;
 }
-EXPORT_SYMBOL(_copy_from_iter_nocache);
 
 #ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
 /**
@@ -841,7 +843,7 @@ EXPORT_SYMBOL(_copy_from_iter_nocache);
  * bypass the cache for the ITER_IOVEC case, and on some archs may use
  * instructions that strand dirty-data in the cache.
  */
-size_t _copy_from_iter_flushcache(void *addr, size_t bytes, struct iov_iter *i)
+static size_t xxx_copy_from_iter_flushcache(void *addr, size_t bytes, struct iov_iter *i)
 {
 	char *to = addr;
 	if (unlikely(iov_iter_is_pipe(i))) {
@@ -859,10 +861,9 @@ size_t _copy_from_iter_flushcache(void *addr, size_t bytes, struct iov_iter *i)
 
 	return bytes;
 }
-EXPORT_SYMBOL_GPL(_copy_from_iter_flushcache);
 #endif
 
-bool _copy_from_iter_full_nocache(void *addr, size_t bytes, struct iov_iter *i)
+static bool xxx_copy_from_iter_full_nocache(void *addr, size_t bytes, struct iov_iter *i)
 {
 	char *to = addr;
 	if (unlikely(iov_iter_is_pipe(i))) {
@@ -884,7 +885,6 @@ bool _copy_from_iter_full_nocache(void *addr, size_t bytes, struct iov_iter *i)
 	iov_iter_advance(i, bytes);
 	return true;
 }
-EXPORT_SYMBOL(_copy_from_iter_full_nocache);
 
 static inline bool page_copy_sane(struct page *page, size_t offset, size_t n)
 {
@@ -910,12 +910,12 @@ static inline bool page_copy_sane(struct page *page, size_t offset, size_t n)
 	return false;
 }
 
-size_t copy_page_to_iter(struct page *page, size_t offset, size_t bytes,
+static size_t xxx_copy_page_to_iter(struct page *page, size_t offset, size_t bytes,
 			 struct iov_iter *i)
 {
 	if (unlikely(!page_copy_sane(page, offset, bytes)))
 		return 0;
-	if (i->type & (ITER_BVEC|ITER_KVEC)) {
+	if (iov_iter_type(i) & (ITER_BVEC|ITER_KVEC)) {
 		void *kaddr = kmap_atomic(page);
 		size_t wanted = copy_to_iter(kaddr + offset, bytes, i);
 		kunmap_atomic(kaddr);
@@ -927,9 +927,8 @@ size_t copy_page_to_iter(struct page *page, size_t offset, size_t bytes,
 	else
 		return copy_page_to_iter_pipe(page, offset, bytes, i);
 }
-EXPORT_SYMBOL(copy_page_to_iter);
 
-size_t copy_page_from_iter(struct page *page, size_t offset, size_t bytes,
+static size_t xxx_copy_page_from_iter(struct page *page, size_t offset, size_t bytes,
 			 struct iov_iter *i)
 {
 	if (unlikely(!page_copy_sane(page, offset, bytes)))
@@ -938,15 +937,14 @@ size_t copy_page_from_iter(struct page *page, size_t offset, size_t bytes,
 		WARN_ON(1);
 		return 0;
 	}
-	if (i->type & (ITER_BVEC|ITER_KVEC)) {
+	if (iov_iter_type(i) & (ITER_BVEC|ITER_KVEC)) {
 		void *kaddr = kmap_atomic(page);
-		size_t wanted = _copy_from_iter(kaddr + offset, bytes, i);
+		size_t wanted = xxx_copy_from_iter(kaddr + offset, bytes, i);
 		kunmap_atomic(kaddr);
 		return wanted;
 	} else
 		return copy_page_from_iter_iovec(page, offset, bytes, i);
 }
-EXPORT_SYMBOL(copy_page_from_iter);
 
 static size_t pipe_zero(size_t bytes, struct iov_iter *i)
 {
@@ -975,7 +973,7 @@ static size_t pipe_zero(size_t bytes, struct iov_iter *i)
 	return bytes;
 }
 
-size_t iov_iter_zero(size_t bytes, struct iov_iter *i)
+static size_t xxx_zero(size_t bytes, struct iov_iter *i)
 {
 	if (unlikely(iov_iter_is_pipe(i)))
 		return pipe_zero(bytes, i);
@@ -987,9 +985,8 @@ size_t iov_iter_zero(size_t bytes, struct iov_iter *i)
 
 	return bytes;
 }
-EXPORT_SYMBOL(iov_iter_zero);
 
-size_t iov_iter_copy_from_user_atomic(struct page *page,
+static size_t xxx_copy_from_user_atomic(struct page *page,
 		struct iov_iter *i, unsigned long offset, size_t bytes)
 {
 	char *kaddr = kmap_atomic(page), *p = kaddr + offset;
@@ -1011,7 +1008,6 @@ size_t iov_iter_copy_from_user_atomic(struct page *page,
 	kunmap_atomic(kaddr);
 	return bytes;
 }
-EXPORT_SYMBOL(iov_iter_copy_from_user_atomic);
 
 static inline void pipe_truncate(struct iov_iter *i)
 {
@@ -1067,7 +1063,7 @@ static void pipe_advance(struct iov_iter *i, size_t size)
 	pipe_truncate(i);
 }
 
-void iov_iter_advance(struct iov_iter *i, size_t size)
+static void xxx_advance(struct iov_iter *i, size_t size)
 {
 	if (unlikely(iov_iter_is_pipe(i))) {
 		pipe_advance(i, size);
@@ -1079,9 +1075,8 @@ void iov_iter_advance(struct iov_iter *i, size_t size)
 	}
 	iterate_and_advance(i, size, v, 0, 0, 0)
 }
-EXPORT_SYMBOL(iov_iter_advance);
 
-void iov_iter_revert(struct iov_iter *i, size_t unroll)
+static void xxx_revert(struct iov_iter *i, size_t unroll)
 {
 	if (!unroll)
 		return;
@@ -1147,12 +1142,11 @@ void iov_iter_revert(struct iov_iter *i, size_t unroll)
 		}
 	}
 }
-EXPORT_SYMBOL(iov_iter_revert);
 
 /*
  * Return the count of just the current iov_iter segment.
  */
-size_t iov_iter_single_seg_count(const struct iov_iter *i)
+static size_t xxx_single_seg_count(const struct iov_iter *i)
 {
 	if (unlikely(iov_iter_is_pipe(i)))
 		return i->count;	// it is a silly place, anyway
@@ -1165,14 +1159,14 @@ size_t iov_iter_single_seg_count(const struct iov_iter *i)
 	else
 		return min(i->count, i->iov->iov_len - i->iov_offset);
 }
-EXPORT_SYMBOL(iov_iter_single_seg_count);
 
 void iov_iter_kvec(struct iov_iter *i, unsigned int direction,
-			const struct kvec *kvec, unsigned long nr_segs,
-			size_t count)
+		   const struct kvec *kvec, unsigned long nr_segs,
+		   size_t count)
 {
 	WARN_ON(direction & ~(READ | WRITE));
-	i->type = ITER_KVEC | (direction & (READ | WRITE));
+	i->ops = &kvec_iter_ops;
+	i->flags = direction & (READ | WRITE);
 	i->kvec = kvec;
 	i->nr_segs = nr_segs;
 	i->iov_offset = 0;
@@ -1185,7 +1179,8 @@ void iov_iter_bvec(struct iov_iter *i, unsigned int direction,
 			size_t count)
 {
 	WARN_ON(direction & ~(READ | WRITE));
-	i->type = ITER_BVEC | (direction & (READ | WRITE));
+	i->ops = &bvec_iter_ops;
+	i->flags = direction & (READ | WRITE);
 	i->bvec = bvec;
 	i->nr_segs = nr_segs;
 	i->iov_offset = 0;
@@ -1199,7 +1194,8 @@ void iov_iter_pipe(struct iov_iter *i, unsigned int direction,
 {
 	BUG_ON(direction != READ);
 	WARN_ON(pipe_full(pipe->head, pipe->tail, pipe->ring_size));
-	i->type = ITER_PIPE | READ;
+	i->ops = &pipe_iter_ops;
+	i->flags = READ;
 	i->pipe = pipe;
 	i->head = pipe->head;
 	i->iov_offset = 0;
@@ -1220,13 +1216,14 @@ EXPORT_SYMBOL(iov_iter_pipe);
 void iov_iter_discard(struct iov_iter *i, unsigned int direction, size_t count)
 {
 	BUG_ON(direction != READ);
-	i->type = ITER_DISCARD | READ;
+	i->ops = &discard_iter_ops;
+	i->flags = READ;
 	i->count = count;
 	i->iov_offset = 0;
 }
 EXPORT_SYMBOL(iov_iter_discard);
 
-unsigned long iov_iter_alignment(const struct iov_iter *i)
+static unsigned long xxx_alignment(const struct iov_iter *i)
 {
 	unsigned long res = 0;
 	size_t size = i->count;
@@ -1245,9 +1242,8 @@ unsigned long iov_iter_alignment(const struct iov_iter *i)
 	)
 	return res;
 }
-EXPORT_SYMBOL(iov_iter_alignment);
 
-unsigned long iov_iter_gap_alignment(const struct iov_iter *i)
+static unsigned long xxx_gap_alignment(const struct iov_iter *i)
 {
 	unsigned long res = 0;
 	size_t size = i->count;
@@ -1267,7 +1263,6 @@ unsigned long iov_iter_gap_alignment(const struct iov_iter *i)
 		);
 	return res;
 }
-EXPORT_SYMBOL(iov_iter_gap_alignment);
 
 static inline ssize_t __pipe_get_pages(struct iov_iter *i,
 				size_t maxsize,
@@ -1313,7 +1308,7 @@ static ssize_t pipe_get_pages(struct iov_iter *i,
 	return __pipe_get_pages(i, min(maxsize, capacity), pages, iter_head, start);
 }
 
-ssize_t iov_iter_get_pages(struct iov_iter *i,
+static ssize_t xxx_get_pages(struct iov_iter *i,
 		   struct page **pages, size_t maxsize, unsigned maxpages,
 		   size_t *start)
 {
@@ -1352,7 +1347,6 @@ ssize_t iov_iter_get_pages(struct iov_iter *i,
 	)
 	return 0;
 }
-EXPORT_SYMBOL(iov_iter_get_pages);
 
 static struct page **get_pages_array(size_t n)
 {
@@ -1392,7 +1386,7 @@ static ssize_t pipe_get_pages_alloc(struct iov_iter *i,
 	return n;
 }
 
-ssize_t iov_iter_get_pages_alloc(struct iov_iter *i,
+static ssize_t xxx_get_pages_alloc(struct iov_iter *i,
 		   struct page ***pages, size_t maxsize,
 		   size_t *start)
 {
@@ -1439,9 +1433,8 @@ ssize_t iov_iter_get_pages_alloc(struct iov_iter *i,
 	)
 	return 0;
 }
-EXPORT_SYMBOL(iov_iter_get_pages_alloc);
 
-size_t csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum,
+static size_t xxx_csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum,
 			       struct iov_iter *i)
 {
 	char *to = addr;
@@ -1478,9 +1471,8 @@ size_t csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum,
 	*csum = sum;
 	return bytes;
 }
-EXPORT_SYMBOL(csum_and_copy_from_iter);
 
-bool csum_and_copy_from_iter_full(void *addr, size_t bytes, __wsum *csum,
+static bool xxx_csum_and_copy_from_iter_full(void *addr, size_t bytes, __wsum *csum,
 			       struct iov_iter *i)
 {
 	char *to = addr;
@@ -1520,9 +1512,8 @@ bool csum_and_copy_from_iter_full(void *addr, size_t bytes, __wsum *csum,
 	iov_iter_advance(i, bytes);
 	return true;
 }
-EXPORT_SYMBOL(csum_and_copy_from_iter_full);
 
-size_t csum_and_copy_to_iter(const void *addr, size_t bytes, void *csump,
+static size_t xxx_csum_and_copy_to_iter(const void *addr, size_t bytes, void *csump,
 			     struct iov_iter *i)
 {
 	const char *from = addr;
@@ -1564,7 +1555,6 @@ size_t csum_and_copy_to_iter(const void *addr, size_t bytes, void *csump,
 	*csum = sum;
 	return bytes;
 }
-EXPORT_SYMBOL(csum_and_copy_to_iter);
 
 size_t hash_and_copy_to_iter(const void *addr, size_t bytes, void *hashp,
 		struct iov_iter *i)
@@ -1585,7 +1575,7 @@ size_t hash_and_copy_to_iter(const void *addr, size_t bytes, void *hashp,
 }
 EXPORT_SYMBOL(hash_and_copy_to_iter);
 
-int iov_iter_npages(const struct iov_iter *i, int maxpages)
+static int xxx_npages(const struct iov_iter *i, int maxpages)
 {
 	size_t size = i->count;
 	int npages = 0;
@@ -1628,9 +1618,8 @@ int iov_iter_npages(const struct iov_iter *i, int maxpages)
 	)
 	return npages;
 }
-EXPORT_SYMBOL(iov_iter_npages);
 
-const void *dup_iter(struct iov_iter *new, struct iov_iter *old, gfp_t flags)
+static const void *xxx_dup_iter(struct iov_iter *new, struct iov_iter *old, gfp_t flags)
 {
 	*new = *old;
 	if (unlikely(iov_iter_is_pipe(new))) {
@@ -1649,7 +1638,6 @@ const void *dup_iter(struct iov_iter *new, struct iov_iter *old, gfp_t flags)
 				   new->nr_segs * sizeof(struct iovec),
 				   flags);
 }
-EXPORT_SYMBOL(dup_iter);
 
 static int copy_compat_iovec_from_user(struct iovec *iov,
 		const struct iovec __user *uvec, unsigned long nr_segs)
@@ -1826,7 +1814,7 @@ int import_single_range(int rw, void __user *buf, size_t len,
 }
 EXPORT_SYMBOL(import_single_range);
 
-int iov_iter_for_each_range(struct iov_iter *i, size_t bytes,
+static int xxx_for_each_range(struct iov_iter *i, size_t bytes,
 			    int (*f)(struct kvec *vec, void *context),
 			    void *context)
 {
@@ -1846,4 +1834,173 @@ int iov_iter_for_each_range(struct iov_iter *i, size_t bytes,
 	)
 	return err;
 }
-EXPORT_SYMBOL(iov_iter_for_each_range);
+
+static const struct iov_iter_ops iovec_iter_ops = {
+	.type				= ITER_IOVEC,
+	.copy_from_user_atomic		= xxx_copy_from_user_atomic,
+	.advance			= xxx_advance,
+	.revert				= xxx_revert,
+	.fault_in_readable		= xxx_fault_in_readable,
+	.single_seg_count		= xxx_single_seg_count,
+	.copy_page_to_iter		= xxx_copy_page_to_iter,
+	.copy_page_from_iter		= xxx_copy_page_from_iter,
+	.copy_to_iter			= xxx_copy_to_iter,
+	.copy_from_iter			= xxx_copy_from_iter,
+	.copy_from_iter_full		= xxx_copy_from_iter_full,
+	.copy_from_iter_nocache		= xxx_copy_from_iter_nocache,
+	.copy_from_iter_full_nocache	= xxx_copy_from_iter_full_nocache,
+#ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
+	.copy_from_iter_flushcache	= xxx_copy_from_iter_flushcache,
+#endif
+#ifdef CONFIG_ARCH_HAS_COPY_MC
+	.copy_mc_to_iter		= xxx_copy_mc_to_iter,
+#endif
+	.csum_and_copy_to_iter		= xxx_csum_and_copy_to_iter,
+	.csum_and_copy_from_iter	= xxx_csum_and_copy_from_iter,
+	.csum_and_copy_from_iter_full	= xxx_csum_and_copy_from_iter_full,
+
+	.zero				= xxx_zero,
+	.alignment			= xxx_alignment,
+	.gap_alignment			= xxx_gap_alignment,
+	.get_pages			= xxx_get_pages,
+	.get_pages_alloc		= xxx_get_pages_alloc,
+	.npages				= xxx_npages,
+	.dup_iter			= xxx_dup_iter,
+	.for_each_range			= xxx_for_each_range,
+};
+
+static const struct iov_iter_ops kvec_iter_ops = {
+	.type				= ITER_KVEC,
+	.copy_from_user_atomic		= xxx_copy_from_user_atomic,
+	.advance			= xxx_advance,
+	.revert				= xxx_revert,
+	.fault_in_readable		= xxx_fault_in_readable,
+	.single_seg_count		= xxx_single_seg_count,
+	.copy_page_to_iter		= xxx_copy_page_to_iter,
+	.copy_page_from_iter		= xxx_copy_page_from_iter,
+	.copy_to_iter			= xxx_copy_to_iter,
+	.copy_from_iter			= xxx_copy_from_iter,
+	.copy_from_iter_full		= xxx_copy_from_iter_full,
+	.copy_from_iter_nocache		= xxx_copy_from_iter_nocache,
+	.copy_from_iter_full_nocache	= xxx_copy_from_iter_full_nocache,
+#ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
+	.copy_from_iter_flushcache	= xxx_copy_from_iter_flushcache,
+#endif
+#ifdef CONFIG_ARCH_HAS_COPY_MC
+	.copy_mc_to_iter		= xxx_copy_mc_to_iter,
+#endif
+	.csum_and_copy_to_iter		= xxx_csum_and_copy_to_iter,
+	.csum_and_copy_from_iter	= xxx_csum_and_copy_from_iter,
+	.csum_and_copy_from_iter_full	= xxx_csum_and_copy_from_iter_full,
+
+	.zero				= xxx_zero,
+	.alignment			= xxx_alignment,
+	.gap_alignment			= xxx_gap_alignment,
+	.get_pages			= xxx_get_pages,
+	.get_pages_alloc		= xxx_get_pages_alloc,
+	.npages				= xxx_npages,
+	.dup_iter			= xxx_dup_iter,
+	.for_each_range			= xxx_for_each_range,
+};
+
+static const struct iov_iter_ops bvec_iter_ops = {
+	.type				= ITER_BVEC,
+	.copy_from_user_atomic		= xxx_copy_from_user_atomic,
+	.advance			= xxx_advance,
+	.revert				= xxx_revert,
+	.fault_in_readable		= xxx_fault_in_readable,
+	.single_seg_count		= xxx_single_seg_count,
+	.copy_page_to_iter		= xxx_copy_page_to_iter,
+	.copy_page_from_iter		= xxx_copy_page_from_iter,
+	.copy_to_iter			= xxx_copy_to_iter,
+	.copy_from_iter			= xxx_copy_from_iter,
+	.copy_from_iter_full		= xxx_copy_from_iter_full,
+	.copy_from_iter_nocache		= xxx_copy_from_iter_nocache,
+	.copy_from_iter_full_nocache	= xxx_copy_from_iter_full_nocache,
+#ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
+	.copy_from_iter_flushcache	= xxx_copy_from_iter_flushcache,
+#endif
+#ifdef CONFIG_ARCH_HAS_COPY_MC
+	.copy_mc_to_iter		= xxx_copy_mc_to_iter,
+#endif
+	.csum_and_copy_to_iter		= xxx_csum_and_copy_to_iter,
+	.csum_and_copy_from_iter	= xxx_csum_and_copy_from_iter,
+	.csum_and_copy_from_iter_full	= xxx_csum_and_copy_from_iter_full,
+
+	.zero				= xxx_zero,
+	.alignment			= xxx_alignment,
+	.gap_alignment			= xxx_gap_alignment,
+	.get_pages			= xxx_get_pages,
+	.get_pages_alloc		= xxx_get_pages_alloc,
+	.npages				= xxx_npages,
+	.dup_iter			= xxx_dup_iter,
+	.for_each_range			= xxx_for_each_range,
+};
+
+static const struct iov_iter_ops pipe_iter_ops = {
+	.type				= ITER_PIPE,
+	.copy_from_user_atomic		= xxx_copy_from_user_atomic,
+	.advance			= xxx_advance,
+	.revert				= xxx_revert,
+	.fault_in_readable		= xxx_fault_in_readable,
+	.single_seg_count		= xxx_single_seg_count,
+	.copy_page_to_iter		= xxx_copy_page_to_iter,
+	.copy_page_from_iter		= xxx_copy_page_from_iter,
+	.copy_to_iter			= xxx_copy_to_iter,
+	.copy_from_iter			= xxx_copy_from_iter,
+	.copy_from_iter_full		= xxx_copy_from_iter_full,
+	.copy_from_iter_nocache		= xxx_copy_from_iter_nocache,
+	.copy_from_iter_full_nocache	= xxx_copy_from_iter_full_nocache,
+#ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
+	.copy_from_iter_flushcache	= xxx_copy_from_iter_flushcache,
+#endif
+#ifdef CONFIG_ARCH_HAS_COPY_MC
+	.copy_mc_to_iter		= xxx_copy_mc_to_iter,
+#endif
+	.csum_and_copy_to_iter		= xxx_csum_and_copy_to_iter,
+	.csum_and_copy_from_iter	= xxx_csum_and_copy_from_iter,
+	.csum_and_copy_from_iter_full	= xxx_csum_and_copy_from_iter_full,
+
+	.zero				= xxx_zero,
+	.alignment			= xxx_alignment,
+	.gap_alignment			= xxx_gap_alignment,
+	.get_pages			= xxx_get_pages,
+	.get_pages_alloc		= xxx_get_pages_alloc,
+	.npages				= xxx_npages,
+	.dup_iter			= xxx_dup_iter,
+	.for_each_range			= xxx_for_each_range,
+};
+
+static const struct iov_iter_ops discard_iter_ops = {
+	.type				= ITER_DISCARD,
+	.copy_from_user_atomic		= xxx_copy_from_user_atomic,
+	.advance			= xxx_advance,
+	.revert				= xxx_revert,
+	.fault_in_readable		= xxx_fault_in_readable,
+	.single_seg_count		= xxx_single_seg_count,
+	.copy_page_to_iter		= xxx_copy_page_to_iter,
+	.copy_page_from_iter		= xxx_copy_page_from_iter,
+	.copy_to_iter			= xxx_copy_to_iter,
+	.copy_from_iter			= xxx_copy_from_iter,
+	.copy_from_iter_full		= xxx_copy_from_iter_full,
+	.copy_from_iter_nocache		= xxx_copy_from_iter_nocache,
+	.copy_from_iter_full_nocache	= xxx_copy_from_iter_full_nocache,
+#ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
+	.copy_from_iter_flushcache	= xxx_copy_from_iter_flushcache,
+#endif
+#ifdef CONFIG_ARCH_HAS_COPY_MC
+	.copy_mc_to_iter		= xxx_copy_mc_to_iter,
+#endif
+	.csum_and_copy_to_iter		= xxx_csum_and_copy_to_iter,
+	.csum_and_copy_from_iter	= xxx_csum_and_copy_from_iter,
+	.csum_and_copy_from_iter_full	= xxx_csum_and_copy_from_iter_full,
+
+	.zero				= xxx_zero,
+	.alignment			= xxx_alignment,
+	.gap_alignment			= xxx_gap_alignment,
+	.get_pages			= xxx_get_pages,
+	.get_pages_alloc		= xxx_get_pages_alloc,
+	.npages				= xxx_npages,
+	.dup_iter			= xxx_dup_iter,
+	.for_each_range			= xxx_for_each_range,
+};



^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 02/29] iov_iter: Split copy_page_to_iter()
  2020-11-21 14:13 [PATCH 00/29] RFC: iov_iter: Switch to using an ops table David Howells
  2020-11-21 14:13 ` [PATCH 01/29] iov_iter: Switch to using a table of operations David Howells
@ 2020-11-21 14:13 ` David Howells
  2020-11-21 14:13 ` [PATCH 03/29] iov_iter: Split iov_iter_fault_in_readable David Howells
                   ` (29 subsequent siblings)
  31 siblings, 0 replies; 55+ messages in thread
From: David Howells @ 2020-11-21 14:13 UTC (permalink / raw)
  To: Pavel Begunkov, Matthew Wilcox, Jens Axboe, Alexander Viro
  Cc: dhowells, Linus Torvalds, linux-fsdevel, linux-block, linux-kernel

Split copy_page_to_iter() by type.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 lib/iov_iter.c |   44 +++++++++++++++++++++++++-------------------
 1 file changed, 25 insertions(+), 19 deletions(-)

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index e403d524c797..fee8e99fbb9c 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -19,6 +19,8 @@ static const struct iov_iter_ops bvec_iter_ops;
 static const struct iov_iter_ops pipe_iter_ops;
 static const struct iov_iter_ops discard_iter_ops;
 
+static inline bool page_copy_sane(struct page *page, size_t offset, size_t n);
+
 #define PIPE_PARANOIA /* for now */
 
 #define iterate_iovec(i, n, __v, __p, skip, STEP) {	\
@@ -167,7 +169,7 @@ static int copyin(void *to, const void __user *from, size_t n)
 	return n;
 }
 
-static size_t copy_page_to_iter_iovec(struct page *page, size_t offset, size_t bytes,
+static size_t iovec_copy_page_to_iter(struct page *page, size_t offset, size_t bytes,
 			 struct iov_iter *i)
 {
 	size_t skip, copy, left, wanted;
@@ -175,6 +177,8 @@ static size_t copy_page_to_iter_iovec(struct page *page, size_t offset, size_t b
 	char __user *buf;
 	void *kaddr, *from;
 
+	if (unlikely(!page_copy_sane(page, offset, bytes)))
+		return 0;
 	if (unlikely(bytes > i->count))
 		bytes = i->count;
 
@@ -378,7 +382,7 @@ static bool sanity(const struct iov_iter *i)
 #define sanity(i) true
 #endif
 
-static size_t copy_page_to_iter_pipe(struct page *page, size_t offset, size_t bytes,
+static size_t pipe_copy_page_to_iter(struct page *page, size_t offset, size_t bytes,
 			 struct iov_iter *i)
 {
 	struct pipe_inode_info *pipe = i->pipe;
@@ -388,6 +392,8 @@ static size_t copy_page_to_iter_pipe(struct page *page, size_t offset, size_t by
 	unsigned int i_head = i->head;
 	size_t off;
 
+	if (unlikely(!page_copy_sane(page, offset, bytes)))
+		return 0;
 	if (unlikely(bytes > i->count))
 		bytes = i->count;
 
@@ -910,22 +916,22 @@ static inline bool page_copy_sane(struct page *page, size_t offset, size_t n)
 	return false;
 }
 
-static size_t xxx_copy_page_to_iter(struct page *page, size_t offset, size_t bytes,
+static size_t bkvec_copy_page_to_iter(struct page *page, size_t offset, size_t bytes,
 			 struct iov_iter *i)
 {
-	if (unlikely(!page_copy_sane(page, offset, bytes)))
-		return 0;
-	if (iov_iter_type(i) & (ITER_BVEC|ITER_KVEC)) {
+	size_t wanted = 0;
+	if (likely(page_copy_sane(page, offset, bytes))) {
 		void *kaddr = kmap_atomic(page);
-		size_t wanted = copy_to_iter(kaddr + offset, bytes, i);
+		wanted = copy_to_iter(kaddr + offset, bytes, i);
 		kunmap_atomic(kaddr);
-		return wanted;
-	} else if (unlikely(iov_iter_is_discard(i)))
-		return bytes;
-	else if (likely(!iov_iter_is_pipe(i)))
-		return copy_page_to_iter_iovec(page, offset, bytes, i);
-	else
-		return copy_page_to_iter_pipe(page, offset, bytes, i);
+	}
+	return wanted;
+}
+
+static size_t discard_copy_page_to_iter(struct page *page, size_t offset, size_t bytes,
+					struct iov_iter *i)
+{
+	return bytes;
 }
 
 static size_t xxx_copy_page_from_iter(struct page *page, size_t offset, size_t bytes,
@@ -1842,7 +1848,7 @@ static const struct iov_iter_ops iovec_iter_ops = {
 	.revert				= xxx_revert,
 	.fault_in_readable		= xxx_fault_in_readable,
 	.single_seg_count		= xxx_single_seg_count,
-	.copy_page_to_iter		= xxx_copy_page_to_iter,
+	.copy_page_to_iter		= iovec_copy_page_to_iter,
 	.copy_page_from_iter		= xxx_copy_page_from_iter,
 	.copy_to_iter			= xxx_copy_to_iter,
 	.copy_from_iter			= xxx_copy_from_iter,
@@ -1876,7 +1882,7 @@ static const struct iov_iter_ops kvec_iter_ops = {
 	.revert				= xxx_revert,
 	.fault_in_readable		= xxx_fault_in_readable,
 	.single_seg_count		= xxx_single_seg_count,
-	.copy_page_to_iter		= xxx_copy_page_to_iter,
+	.copy_page_to_iter		= bkvec_copy_page_to_iter,
 	.copy_page_from_iter		= xxx_copy_page_from_iter,
 	.copy_to_iter			= xxx_copy_to_iter,
 	.copy_from_iter			= xxx_copy_from_iter,
@@ -1910,7 +1916,7 @@ static const struct iov_iter_ops bvec_iter_ops = {
 	.revert				= xxx_revert,
 	.fault_in_readable		= xxx_fault_in_readable,
 	.single_seg_count		= xxx_single_seg_count,
-	.copy_page_to_iter		= xxx_copy_page_to_iter,
+	.copy_page_to_iter		= bkvec_copy_page_to_iter,
 	.copy_page_from_iter		= xxx_copy_page_from_iter,
 	.copy_to_iter			= xxx_copy_to_iter,
 	.copy_from_iter			= xxx_copy_from_iter,
@@ -1944,7 +1950,7 @@ static const struct iov_iter_ops pipe_iter_ops = {
 	.revert				= xxx_revert,
 	.fault_in_readable		= xxx_fault_in_readable,
 	.single_seg_count		= xxx_single_seg_count,
-	.copy_page_to_iter		= xxx_copy_page_to_iter,
+	.copy_page_to_iter		= pipe_copy_page_to_iter,
 	.copy_page_from_iter		= xxx_copy_page_from_iter,
 	.copy_to_iter			= xxx_copy_to_iter,
 	.copy_from_iter			= xxx_copy_from_iter,
@@ -1978,7 +1984,7 @@ static const struct iov_iter_ops discard_iter_ops = {
 	.revert				= xxx_revert,
 	.fault_in_readable		= xxx_fault_in_readable,
 	.single_seg_count		= xxx_single_seg_count,
-	.copy_page_to_iter		= xxx_copy_page_to_iter,
+	.copy_page_to_iter		= discard_copy_page_to_iter,
 	.copy_page_from_iter		= xxx_copy_page_from_iter,
 	.copy_to_iter			= xxx_copy_to_iter,
 	.copy_from_iter			= xxx_copy_from_iter,




* [PATCH 03/29] iov_iter: Split iov_iter_fault_in_readable
  2020-11-21 14:13 [PATCH 00/29] RFC: iov_iter: Switch to using an ops table David Howells
  2020-11-21 14:13 ` [PATCH 01/29] iov_iter: Switch to using a table of operations David Howells
  2020-11-21 14:13 ` [PATCH 02/29] iov_iter: Split copy_page_to_iter() David Howells
@ 2020-11-21 14:13 ` David Howells
  2020-11-21 14:13 ` [PATCH 04/29] iov_iter: Split the iterate_and_advance() macro David Howells
                   ` (28 subsequent siblings)
  31 siblings, 0 replies; 55+ messages in thread
From: David Howells @ 2020-11-21 14:13 UTC (permalink / raw)
  To: Pavel Begunkov, Matthew Wilcox, Jens Axboe, Alexander Viro
  Cc: dhowells, Linus Torvalds, linux-fsdevel, linux-block, linux-kernel

Split iov_iter_fault_in_readable() by type.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 lib/iov_iter.c |   29 ++++++++++++++++-------------
 1 file changed, 16 insertions(+), 13 deletions(-)

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index fee8e99fbb9c..280b5c9c9a9c 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -439,20 +439,23 @@ static size_t pipe_copy_page_to_iter(struct page *page, size_t offset, size_t by
  * Return 0 on success, or non-zero if the memory could not be accessed (i.e.
  * because it is an invalid address).
  */
-static int xxx_fault_in_readable(struct iov_iter *i, size_t bytes)
+static int iovec_fault_in_readable(struct iov_iter *i, size_t bytes)
 {
 	size_t skip = i->iov_offset;
 	const struct iovec *iov;
 	int err;
 	struct iovec v;
 
-	if (!(iov_iter_type(i) & (ITER_BVEC|ITER_KVEC))) {
-		iterate_iovec(i, bytes, v, iov, skip, ({
-			err = fault_in_pages_readable(v.iov_base, v.iov_len);
-			if (unlikely(err))
-			return err;
-		0;}))
-	}
+	iterate_iovec(i, bytes, v, iov, skip, ({
+		err = fault_in_pages_readable(v.iov_base, v.iov_len);
+		if (unlikely(err))
+		return err;
+	0;}))
+	return 0;
+}
+
+static int no_fault_in_readable(struct iov_iter *i, size_t bytes)
+{
 	return 0;
 }
 
@@ -1846,7 +1849,7 @@ static const struct iov_iter_ops iovec_iter_ops = {
 	.copy_from_user_atomic		= xxx_copy_from_user_atomic,
 	.advance			= xxx_advance,
 	.revert				= xxx_revert,
-	.fault_in_readable		= xxx_fault_in_readable,
+	.fault_in_readable		= iovec_fault_in_readable,
 	.single_seg_count		= xxx_single_seg_count,
 	.copy_page_to_iter		= iovec_copy_page_to_iter,
 	.copy_page_from_iter		= xxx_copy_page_from_iter,
@@ -1880,7 +1883,7 @@ static const struct iov_iter_ops kvec_iter_ops = {
 	.copy_from_user_atomic		= xxx_copy_from_user_atomic,
 	.advance			= xxx_advance,
 	.revert				= xxx_revert,
-	.fault_in_readable		= xxx_fault_in_readable,
+	.fault_in_readable		= no_fault_in_readable,
 	.single_seg_count		= xxx_single_seg_count,
 	.copy_page_to_iter		= bkvec_copy_page_to_iter,
 	.copy_page_from_iter		= xxx_copy_page_from_iter,
@@ -1914,7 +1917,7 @@ static const struct iov_iter_ops bvec_iter_ops = {
 	.copy_from_user_atomic		= xxx_copy_from_user_atomic,
 	.advance			= xxx_advance,
 	.revert				= xxx_revert,
-	.fault_in_readable		= xxx_fault_in_readable,
+	.fault_in_readable		= no_fault_in_readable,
 	.single_seg_count		= xxx_single_seg_count,
 	.copy_page_to_iter		= bkvec_copy_page_to_iter,
 	.copy_page_from_iter		= xxx_copy_page_from_iter,
@@ -1948,7 +1951,7 @@ static const struct iov_iter_ops pipe_iter_ops = {
 	.copy_from_user_atomic		= xxx_copy_from_user_atomic,
 	.advance			= xxx_advance,
 	.revert				= xxx_revert,
-	.fault_in_readable		= xxx_fault_in_readable,
+	.fault_in_readable		= no_fault_in_readable,
 	.single_seg_count		= xxx_single_seg_count,
 	.copy_page_to_iter		= pipe_copy_page_to_iter,
 	.copy_page_from_iter		= xxx_copy_page_from_iter,
@@ -1982,7 +1985,7 @@ static const struct iov_iter_ops discard_iter_ops = {
 	.copy_from_user_atomic		= xxx_copy_from_user_atomic,
 	.advance			= xxx_advance,
 	.revert				= xxx_revert,
-	.fault_in_readable		= xxx_fault_in_readable,
+	.fault_in_readable		= no_fault_in_readable,
 	.single_seg_count		= xxx_single_seg_count,
 	.copy_page_to_iter		= discard_copy_page_to_iter,
 	.copy_page_from_iter		= xxx_copy_page_from_iter,




* [PATCH 04/29] iov_iter: Split the iterate_and_advance() macro
  2020-11-21 14:13 [PATCH 00/29] RFC: iov_iter: Switch to using an ops table David Howells
                   ` (2 preceding siblings ...)
  2020-11-21 14:13 ` [PATCH 03/29] iov_iter: Split iov_iter_fault_in_readable David Howells
@ 2020-11-21 14:13 ` David Howells
  2020-11-21 14:14 ` [PATCH 05/29] iov_iter: Split copy_to_iter() David Howells
                   ` (27 subsequent siblings)
  31 siblings, 0 replies; 55+ messages in thread
From: David Howells @ 2020-11-21 14:13 UTC (permalink / raw)
  To: Pavel Begunkov, Matthew Wilcox, Jens Axboe, Alexander Viro
  Cc: dhowells, Linus Torvalds, linux-fsdevel, linux-block, linux-kernel

Split the iterate_and_advance() macro into iovec, bvec, kvec and discard
variants.  The macro doesn't handle pipes, so no pipe variant is needed.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 lib/iov_iter.c |   62 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 62 insertions(+)

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 280b5c9c9a9c..a221e7771201 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -147,6 +147,68 @@ static inline bool page_copy_sane(struct page *page, size_t offset, size_t n);
 	}							\
 }
 
+#define iterate_and_advance_iovec(i, n, v, CMD) {		\
+	if (unlikely(i->count < n))				\
+		n = i->count;					\
+	if (i->count) {						\
+		size_t skip = i->iov_offset;			\
+		const struct iovec *iov;			\
+		struct iovec v;					\
+		iterate_iovec(i, n, v, iov, skip, (CMD))	\
+			if (skip == iov->iov_len) {		\
+				iov++;				\
+				skip = 0;			\
+			}					\
+		i->nr_segs -= iov - i->iov;			\
+		i->iov = iov;					\
+		i->count -= n;					\
+		i->iov_offset = skip;				\
+	}							\
+}
+
+#define iterate_and_advance_bvec(i, n, v, CMD) {		\
+	if (unlikely(i->count < n))				\
+		n = i->count;					\
+	if (i->count) {						\
+		size_t skip = i->iov_offset;				\
+		const struct bio_vec *bvec = i->bvec;			\
+		struct bio_vec v;					\
+		struct bvec_iter __bi;					\
+		iterate_bvec(i, n, v, __bi, skip, (CMD))		\
+			i->bvec = __bvec_iter_bvec(i->bvec, __bi);	\
+		i->nr_segs -= i->bvec - bvec;				\
+		skip = __bi.bi_bvec_done;				\
+		i->count -= n;						\
+		i->iov_offset = skip;					\
+	}								\
+}
+
+#define iterate_and_advance_kvec(i, n, v, CMD) {		\
+	if (unlikely(i->count < n))				\
+		n = i->count;					\
+	if (i->count) {						\
+		size_t skip = i->iov_offset;			\
+		const struct kvec *kvec;			\
+		struct kvec v;					\
+		iterate_kvec(i, n, v, kvec, skip, (CMD))	\
+			if (skip == kvec->iov_len) {		\
+				kvec++;				\
+				skip = 0;			\
+			}					\
+		i->nr_segs -= kvec - i->kvec;			\
+		i->kvec = kvec;					\
+		i->count -= n;					\
+		i->iov_offset = skip;				\
+	}							\
+}
+
+#define iterate_and_advance_discard(i, n) {			\
+	if (unlikely(i->count < n))				\
+		n = i->count;					\
+	i->count -= n;						\
+	i->iov_offset += n;					\
+}
+
 static int copyout(void __user *to, const void *from, size_t n)
 {
 	if (should_fail_usercopy())




* [PATCH 05/29] iov_iter: Split copy_to_iter()
  2020-11-21 14:13 [PATCH 00/29] RFC: iov_iter: Switch to using an ops table David Howells
                   ` (3 preceding siblings ...)
  2020-11-21 14:13 ` [PATCH 04/29] iov_iter: Split the iterate_and_advance() macro David Howells
@ 2020-11-21 14:14 ` David Howells
  2020-11-21 14:14 ` [PATCH 06/29] iov_iter: Split copy_mc_to_iter() David Howells
                   ` (26 subsequent siblings)
  31 siblings, 0 replies; 55+ messages in thread
From: David Howells @ 2020-11-21 14:14 UTC (permalink / raw)
  To: Pavel Begunkov, Matthew Wilcox, Jens Axboe, Alexander Viro
  Cc: dhowells, Linus Torvalds, linux-fsdevel, linux-block, linux-kernel

Split copy_to_iter() by type.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 lib/iov_iter.c |   47 +++++++++++++++++++++++++++++++----------------
 1 file changed, 31 insertions(+), 16 deletions(-)

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index a221e7771201..0865e0b6eee9 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -634,7 +634,7 @@ static size_t push_pipe(struct iov_iter *i, size_t size,
 	return size - left;
 }
 
-static size_t copy_pipe_to_iter(const void *addr, size_t bytes,
+static size_t pipe_copy_to_iter(const void *addr, size_t bytes,
 				struct iov_iter *i)
 {
 	struct pipe_inode_info *pipe = i->pipe;
@@ -703,20 +703,35 @@ static size_t csum_and_copy_to_pipe_iter(const void *addr, size_t bytes,
 	return bytes;
 }
 
-static size_t xxx_copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
+static size_t iovec_copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
 {
 	const char *from = addr;
-	if (unlikely(iov_iter_is_pipe(i)))
-		return copy_pipe_to_iter(addr, bytes, i);
-	if (iter_is_iovec(i))
-		might_fault();
-	iterate_and_advance(i, bytes, v,
-		copyout(v.iov_base, (from += v.iov_len) - v.iov_len, v.iov_len),
+	might_fault();
+	iterate_and_advance_iovec(i, bytes, v,
+		copyout(v.iov_base, (from += v.iov_len) - v.iov_len, v.iov_len));
+	return bytes;
+}
+
+static size_t bvec_copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
+{
+	const char *from = addr;
+	iterate_and_advance_bvec(i, bytes, v,
 		memcpy_to_page(v.bv_page, v.bv_offset,
-			       (from += v.bv_len) - v.bv_len, v.bv_len),
-		memcpy(v.iov_base, (from += v.iov_len) - v.iov_len, v.iov_len)
-	)
+			       (from += v.bv_len) - v.bv_len, v.bv_len));
+	return bytes;
+}
 
+static size_t kvec_copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
+{
+	const char *from = addr;
+	iterate_and_advance_kvec(i, bytes, v,
+		memcpy(v.iov_base, (from += v.iov_len) - v.iov_len, v.iov_len));
+	return bytes;
+}
+
+static size_t discard_copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
+{
+	iterate_and_advance_discard(i, bytes);
 	return bytes;
 }
 
@@ -1915,7 +1930,7 @@ static const struct iov_iter_ops iovec_iter_ops = {
 	.single_seg_count		= xxx_single_seg_count,
 	.copy_page_to_iter		= iovec_copy_page_to_iter,
 	.copy_page_from_iter		= xxx_copy_page_from_iter,
-	.copy_to_iter			= xxx_copy_to_iter,
+	.copy_to_iter			= iovec_copy_to_iter,
 	.copy_from_iter			= xxx_copy_from_iter,
 	.copy_from_iter_full		= xxx_copy_from_iter_full,
 	.copy_from_iter_nocache		= xxx_copy_from_iter_nocache,
@@ -1949,7 +1964,7 @@ static const struct iov_iter_ops kvec_iter_ops = {
 	.single_seg_count		= xxx_single_seg_count,
 	.copy_page_to_iter		= bkvec_copy_page_to_iter,
 	.copy_page_from_iter		= xxx_copy_page_from_iter,
-	.copy_to_iter			= xxx_copy_to_iter,
+	.copy_to_iter			= kvec_copy_to_iter,
 	.copy_from_iter			= xxx_copy_from_iter,
 	.copy_from_iter_full		= xxx_copy_from_iter_full,
 	.copy_from_iter_nocache		= xxx_copy_from_iter_nocache,
@@ -1983,7 +1998,7 @@ static const struct iov_iter_ops bvec_iter_ops = {
 	.single_seg_count		= xxx_single_seg_count,
 	.copy_page_to_iter		= bkvec_copy_page_to_iter,
 	.copy_page_from_iter		= xxx_copy_page_from_iter,
-	.copy_to_iter			= xxx_copy_to_iter,
+	.copy_to_iter			= bvec_copy_to_iter,
 	.copy_from_iter			= xxx_copy_from_iter,
 	.copy_from_iter_full		= xxx_copy_from_iter_full,
 	.copy_from_iter_nocache		= xxx_copy_from_iter_nocache,
@@ -2017,7 +2032,7 @@ static const struct iov_iter_ops pipe_iter_ops = {
 	.single_seg_count		= xxx_single_seg_count,
 	.copy_page_to_iter		= pipe_copy_page_to_iter,
 	.copy_page_from_iter		= xxx_copy_page_from_iter,
-	.copy_to_iter			= xxx_copy_to_iter,
+	.copy_to_iter			= pipe_copy_to_iter,
 	.copy_from_iter			= xxx_copy_from_iter,
 	.copy_from_iter_full		= xxx_copy_from_iter_full,
 	.copy_from_iter_nocache		= xxx_copy_from_iter_nocache,
@@ -2051,7 +2066,7 @@ static const struct iov_iter_ops discard_iter_ops = {
 	.single_seg_count		= xxx_single_seg_count,
 	.copy_page_to_iter		= discard_copy_page_to_iter,
 	.copy_page_from_iter		= xxx_copy_page_from_iter,
-	.copy_to_iter			= xxx_copy_to_iter,
+	.copy_to_iter			= discard_copy_to_iter,
 	.copy_from_iter			= xxx_copy_from_iter,
 	.copy_from_iter_full		= xxx_copy_from_iter_full,
 	.copy_from_iter_nocache		= xxx_copy_from_iter_nocache,




* [PATCH 06/29] iov_iter: Split copy_mc_to_iter()
  2020-11-21 14:13 [PATCH 00/29] RFC: iov_iter: Switch to using an ops table David Howells
                   ` (4 preceding siblings ...)
  2020-11-21 14:14 ` [PATCH 05/29] iov_iter: Split copy_to_iter() David Howells
@ 2020-11-21 14:14 ` David Howells
  2020-11-21 14:14 ` [PATCH 07/29] iov_iter: Split copy_from_iter() David Howells
                   ` (25 subsequent siblings)
  31 siblings, 0 replies; 55+ messages in thread
From: David Howells @ 2020-11-21 14:14 UTC (permalink / raw)
  To: Pavel Begunkov, Matthew Wilcox, Jens Axboe, Alexander Viro
  Cc: dhowells, Linus Torvalds, linux-fsdevel, linux-block, linux-kernel

Split copy_mc_to_iter() by type.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 lib/iov_iter.c |   54 +++++++++++++++++++++++++++++++++---------------------
 1 file changed, 33 insertions(+), 21 deletions(-)

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 0865e0b6eee9..7c1d92f7d020 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -758,7 +758,7 @@ static unsigned long copy_mc_to_page(struct page *page, size_t offset,
 	return ret;
 }
 
-static size_t copy_mc_pipe_to_iter(const void *addr, size_t bytes,
+static size_t pipe_copy_mc_to_iter(const void *addr, size_t bytes,
 				struct iov_iter *i)
 {
 	struct pipe_inode_info *pipe = i->pipe;
@@ -815,18 +815,23 @@ static size_t copy_mc_pipe_to_iter(const void *addr, size_t bytes,
  *   Compare to copy_to_iter() where only ITER_IOVEC attempts might return
  *   a short copy.
  */
-static size_t xxx_copy_mc_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
+static size_t iovec_copy_mc_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
 {
 	const char *from = addr;
-	unsigned long rem, curr_addr, s_addr = (unsigned long) addr;
 
-	if (unlikely(iov_iter_is_pipe(i)))
-		return copy_mc_pipe_to_iter(addr, bytes, i);
-	if (iter_is_iovec(i))
-		might_fault();
-	iterate_and_advance(i, bytes, v,
+	might_fault();
+	iterate_and_advance_iovec(i, bytes, v,
 		copyout_mc(v.iov_base, (from += v.iov_len) - v.iov_len,
-			   v.iov_len),
+			   v.iov_len));
+	return bytes;
+}
+
+static size_t bvec_copy_mc_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
+{
+	const char *from = addr;
+	unsigned long rem, curr_addr, s_addr = (unsigned long) addr;
+
+	iterate_and_advance_bvec(i, bytes, v,
 		({
 		rem = copy_mc_to_page(v.bv_page, v.bv_offset,
 				      (from += v.bv_len) - v.bv_len, v.bv_len);
@@ -835,18 +840,25 @@ static size_t xxx_copy_mc_to_iter(const void *addr, size_t bytes, struct iov_ite
 			bytes = curr_addr - s_addr - rem;
 			return bytes;
 		}
-		}),
-		({
-		rem = copy_mc_to_kernel(v.iov_base, (from += v.iov_len)
-					- v.iov_len, v.iov_len);
+		}))
+	return bytes;
+}
+
+static size_t kvec_copy_mc_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
+{
+	const char *from = addr;
+	unsigned long rem, curr_addr, s_addr = (unsigned long) addr;
+
+	iterate_and_advance_kvec(i, bytes, v, ({
+		rem = copy_mc_to_kernel(v.iov_base,
+					(from += v.iov_len) - v.iov_len,
+					v.iov_len);
 		if (rem) {
 			curr_addr = (unsigned long) from;
 			bytes = curr_addr - s_addr - rem;
 			return bytes;
 		}
-		})
-	)
-
+		}));
 	return bytes;
 }
 #endif /* CONFIG_ARCH_HAS_COPY_MC */
@@ -1939,7 +1951,7 @@ static const struct iov_iter_ops iovec_iter_ops = {
 	.copy_from_iter_flushcache	= xxx_copy_from_iter_flushcache,
 #endif
 #ifdef CONFIG_ARCH_HAS_COPY_MC
-	.copy_mc_to_iter		= xxx_copy_mc_to_iter,
+	.copy_mc_to_iter		= iovec_copy_mc_to_iter,
 #endif
 	.csum_and_copy_to_iter		= xxx_csum_and_copy_to_iter,
 	.csum_and_copy_from_iter	= xxx_csum_and_copy_from_iter,
@@ -1973,7 +1985,7 @@ static const struct iov_iter_ops kvec_iter_ops = {
 	.copy_from_iter_flushcache	= xxx_copy_from_iter_flushcache,
 #endif
 #ifdef CONFIG_ARCH_HAS_COPY_MC
-	.copy_mc_to_iter		= xxx_copy_mc_to_iter,
+	.copy_mc_to_iter		= kvec_copy_mc_to_iter,
 #endif
 	.csum_and_copy_to_iter		= xxx_csum_and_copy_to_iter,
 	.csum_and_copy_from_iter	= xxx_csum_and_copy_from_iter,
@@ -2007,7 +2019,7 @@ static const struct iov_iter_ops bvec_iter_ops = {
 	.copy_from_iter_flushcache	= xxx_copy_from_iter_flushcache,
 #endif
 #ifdef CONFIG_ARCH_HAS_COPY_MC
-	.copy_mc_to_iter		= xxx_copy_mc_to_iter,
+	.copy_mc_to_iter		= bvec_copy_mc_to_iter,
 #endif
 	.csum_and_copy_to_iter		= xxx_csum_and_copy_to_iter,
 	.csum_and_copy_from_iter	= xxx_csum_and_copy_from_iter,
@@ -2041,7 +2053,7 @@ static const struct iov_iter_ops pipe_iter_ops = {
 	.copy_from_iter_flushcache	= xxx_copy_from_iter_flushcache,
 #endif
 #ifdef CONFIG_ARCH_HAS_COPY_MC
-	.copy_mc_to_iter		= xxx_copy_mc_to_iter,
+	.copy_mc_to_iter		= pipe_copy_mc_to_iter,
 #endif
 	.csum_and_copy_to_iter		= xxx_csum_and_copy_to_iter,
 	.csum_and_copy_from_iter	= xxx_csum_and_copy_from_iter,
@@ -2075,7 +2087,7 @@ static const struct iov_iter_ops discard_iter_ops = {
 	.copy_from_iter_flushcache	= xxx_copy_from_iter_flushcache,
 #endif
 #ifdef CONFIG_ARCH_HAS_COPY_MC
-	.copy_mc_to_iter		= xxx_copy_mc_to_iter,
+	.copy_mc_to_iter		= discard_copy_to_iter,
 #endif
 	.csum_and_copy_to_iter		= xxx_csum_and_copy_to_iter,
 	.csum_and_copy_from_iter	= xxx_csum_and_copy_from_iter,




* [PATCH 07/29] iov_iter: Split copy_from_iter()
  2020-11-21 14:13 [PATCH 00/29] RFC: iov_iter: Switch to using an ops table David Howells
                   ` (5 preceding siblings ...)
  2020-11-21 14:14 ` [PATCH 06/29] iov_iter: Split copy_mc_to_iter() David Howells
@ 2020-11-21 14:14 ` David Howells
  2020-11-21 14:14 ` [PATCH 08/29] iov_iter: Split the iterate_all_kinds() macro David Howells
                   ` (24 subsequent siblings)
  31 siblings, 0 replies; 55+ messages in thread
From: David Howells @ 2020-11-21 14:14 UTC (permalink / raw)
  To: Pavel Begunkov, Matthew Wilcox, Jens Axboe, Alexander Viro
  Cc: dhowells, Linus Torvalds, linux-fsdevel, linux-block, linux-kernel

Split copy_from_iter() by type.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 lib/iov_iter.c |   50 ++++++++++++++++++++++++++++++++------------------
 1 file changed, 32 insertions(+), 18 deletions(-)

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 7c1d92f7d020..5b18dfe0dcc7 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -863,22 +863,36 @@ static size_t kvec_copy_mc_to_iter(const void *addr, size_t bytes, struct iov_it
 }
 #endif /* CONFIG_ARCH_HAS_COPY_MC */
 
-static size_t xxx_copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
+static size_t iovec_copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
 {
 	char *to = addr;
-	if (unlikely(iov_iter_is_pipe(i))) {
-		WARN_ON(1);
-		return 0;
-	}
-	if (iter_is_iovec(i))
-		might_fault();
-	iterate_and_advance(i, bytes, v,
-		copyin((to += v.iov_len) - v.iov_len, v.iov_base, v.iov_len),
+	might_fault();
+	iterate_and_advance_iovec(i, bytes, v,
+		copyin((to += v.iov_len) - v.iov_len, v.iov_base, v.iov_len));
+
+	return bytes;
+}
+
+static size_t bvec_copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
+{
+	char *to = addr;
+	iterate_and_advance_bvec(i, bytes, v,
 		memcpy_from_page((to += v.bv_len) - v.bv_len, v.bv_page,
-				 v.bv_offset, v.bv_len),
-		memcpy((to += v.iov_len) - v.iov_len, v.iov_base, v.iov_len)
-	)
+				 v.bv_offset, v.bv_len));
+	return bytes;
+}
+
+static size_t kvec_copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
+{
+	char *to = addr;
+	iterate_and_advance_kvec(i, bytes, v,
+		memcpy((to += v.iov_len) - v.iov_len, v.iov_base, v.iov_len));
+	return bytes;
+}
 
+static size_t no_copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
+{
+	WARN_ON(1);
 	return bytes;
 }
 
@@ -1037,7 +1051,7 @@ static size_t xxx_copy_page_from_iter(struct page *page, size_t offset, size_t b
 	}
 	if (iov_iter_type(i) & (ITER_BVEC|ITER_KVEC)) {
 		void *kaddr = kmap_atomic(page);
-		size_t wanted = xxx_copy_from_iter(kaddr + offset, bytes, i);
+		size_t wanted = copy_from_iter(kaddr + offset, bytes, i);
 		kunmap_atomic(kaddr);
 		return wanted;
 	} else
@@ -1943,7 +1957,7 @@ static const struct iov_iter_ops iovec_iter_ops = {
 	.copy_page_to_iter		= iovec_copy_page_to_iter,
 	.copy_page_from_iter		= xxx_copy_page_from_iter,
 	.copy_to_iter			= iovec_copy_to_iter,
-	.copy_from_iter			= xxx_copy_from_iter,
+	.copy_from_iter			= iovec_copy_from_iter,
 	.copy_from_iter_full		= xxx_copy_from_iter_full,
 	.copy_from_iter_nocache		= xxx_copy_from_iter_nocache,
 	.copy_from_iter_full_nocache	= xxx_copy_from_iter_full_nocache,
@@ -1977,7 +1991,7 @@ static const struct iov_iter_ops kvec_iter_ops = {
 	.copy_page_to_iter		= bkvec_copy_page_to_iter,
 	.copy_page_from_iter		= xxx_copy_page_from_iter,
 	.copy_to_iter			= kvec_copy_to_iter,
-	.copy_from_iter			= xxx_copy_from_iter,
+	.copy_from_iter			= kvec_copy_from_iter,
 	.copy_from_iter_full		= xxx_copy_from_iter_full,
 	.copy_from_iter_nocache		= xxx_copy_from_iter_nocache,
 	.copy_from_iter_full_nocache	= xxx_copy_from_iter_full_nocache,
@@ -2011,7 +2025,7 @@ static const struct iov_iter_ops bvec_iter_ops = {
 	.copy_page_to_iter		= bkvec_copy_page_to_iter,
 	.copy_page_from_iter		= xxx_copy_page_from_iter,
 	.copy_to_iter			= bvec_copy_to_iter,
-	.copy_from_iter			= xxx_copy_from_iter,
+	.copy_from_iter			= bvec_copy_from_iter,
 	.copy_from_iter_full		= xxx_copy_from_iter_full,
 	.copy_from_iter_nocache		= xxx_copy_from_iter_nocache,
 	.copy_from_iter_full_nocache	= xxx_copy_from_iter_full_nocache,
@@ -2045,7 +2059,7 @@ static const struct iov_iter_ops pipe_iter_ops = {
 	.copy_page_to_iter		= pipe_copy_page_to_iter,
 	.copy_page_from_iter		= xxx_copy_page_from_iter,
 	.copy_to_iter			= pipe_copy_to_iter,
-	.copy_from_iter			= xxx_copy_from_iter,
+	.copy_from_iter			= no_copy_from_iter,
 	.copy_from_iter_full		= xxx_copy_from_iter_full,
 	.copy_from_iter_nocache		= xxx_copy_from_iter_nocache,
 	.copy_from_iter_full_nocache	= xxx_copy_from_iter_full_nocache,
@@ -2079,7 +2093,7 @@ static const struct iov_iter_ops discard_iter_ops = {
 	.copy_page_to_iter		= discard_copy_page_to_iter,
 	.copy_page_from_iter		= xxx_copy_page_from_iter,
 	.copy_to_iter			= discard_copy_to_iter,
-	.copy_from_iter			= xxx_copy_from_iter,
+	.copy_from_iter			= no_copy_from_iter,
 	.copy_from_iter_full		= xxx_copy_from_iter_full,
 	.copy_from_iter_nocache		= xxx_copy_from_iter_nocache,
 	.copy_from_iter_full_nocache	= xxx_copy_from_iter_full_nocache,




* [PATCH 08/29] iov_iter: Split the iterate_all_kinds() macro
  2020-11-21 14:13 [PATCH 00/29] RFC: iov_iter: Switch to using an ops table David Howells
                   ` (6 preceding siblings ...)
  2020-11-21 14:14 ` [PATCH 07/29] iov_iter: Split copy_from_iter() David Howells
@ 2020-11-21 14:14 ` David Howells
  2020-11-21 14:14 ` [PATCH 09/29] iov_iter: Split copy_from_iter_full() David Howells
                   ` (23 subsequent siblings)
  31 siblings, 0 replies; 55+ messages in thread
From: David Howells @ 2020-11-21 14:14 UTC (permalink / raw)
  To: Pavel Begunkov, Matthew Wilcox, Jens Axboe, Alexander Viro
  Cc: dhowells, Linus Torvalds, linux-fsdevel, linux-block, linux-kernel

Split the iterate_all_kinds() macro into iovec, bvec and kvec variants.
The macro doesn't handle pipes, and the discard case is a no-op that can
be open-coded directly at the call sites.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 lib/iov_iter.c |   27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 5b18dfe0dcc7..934193627540 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -106,6 +106,33 @@ static inline bool page_copy_sane(struct page *page, size_t offset, size_t n);
 	}							\
 }
 
+#define iterate_over_iovec(i, n, v, CMD) {			\
+	if (likely(n)) {					\
+		size_t skip = i->iov_offset;			\
+		const struct iovec *iov;			\
+		struct iovec v;					\
+		iterate_iovec(i, n, v, iov, skip, (CMD))	\
+	}							\
+}
+
+#define iterate_over_bvec(i, n, v, CMD) {			\
+	if (likely(n)) {					\
+		size_t skip = i->iov_offset;			\
+		struct bio_vec v;				\
+		struct bvec_iter __bi;				\
+		iterate_bvec(i, n, v, __bi, skip, (CMD))	\
+	}							\
+}
+
+#define iterate_over_kvec(i, n, v, CMD) {			\
+	if (likely(n)) {					\
+		size_t skip = i->iov_offset;			\
+		const struct kvec *kvec;			\
+		struct kvec v;					\
+		iterate_kvec(i, n, v, kvec, skip, (CMD))	\
+	}							\
+}
+
 #define iterate_and_advance(i, n, v, I, B, K) {			\
 	if (unlikely(i->count < n))				\
 		n = i->count;					\




* [PATCH 09/29] iov_iter: Split copy_from_iter_full()
  2020-11-21 14:13 [PATCH 00/29] RFC: iov_iter: Switch to using an ops table David Howells
                   ` (7 preceding siblings ...)
  2020-11-21 14:14 ` [PATCH 08/29] iov_iter: Split the iterate_all_kinds() macro David Howells
@ 2020-11-21 14:14 ` David Howells
  2020-11-21 14:14 ` [PATCH 10/29] iov_iter: Split copy_from_iter_nocache() David Howells
                   ` (22 subsequent siblings)
  31 siblings, 0 replies; 55+ messages in thread
From: David Howells @ 2020-11-21 14:14 UTC (permalink / raw)
  To: Pavel Begunkov, Matthew Wilcox, Jens Axboe, Alexander Viro
  Cc: dhowells, Linus Torvalds, linux-fsdevel, linux-block, linux-kernel

Split copy_from_iter_full() by type.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 lib/iov_iter.c |   59 +++++++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 41 insertions(+), 18 deletions(-)

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 934193627540..3dba665a1ee9 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -923,32 +923,55 @@ static size_t no_copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
 	return bytes;
 }
 
-static bool xxx_copy_from_iter_full(void *addr, size_t bytes, struct iov_iter *i)
+static bool iovec_copy_from_iter_full(void *addr, size_t bytes, struct iov_iter *i)
 {
 	char *to = addr;
-	if (unlikely(iov_iter_is_pipe(i))) {
-		WARN_ON(1);
-		return false;
-	}
+
 	if (unlikely(i->count < bytes))
 		return false;
 
-	if (iter_is_iovec(i))
-		might_fault();
-	iterate_all_kinds(i, bytes, v, ({
+	might_fault();
+	iterate_over_iovec(i, bytes, v, ({
 		if (copyin((to += v.iov_len) - v.iov_len,
-				      v.iov_base, v.iov_len))
+			   v.iov_base, v.iov_len))
 			return false;
-		0;}),
+		0;}));
+	iov_iter_advance(i, bytes);
+	return true;
+}
+
+static bool bvec_copy_from_iter_full(void *addr, size_t bytes, struct iov_iter *i)
+{
+	char *to = addr;
+
+	if (unlikely(i->count < bytes))
+		return false;
+	iterate_over_bvec(i, bytes, v,
 		memcpy_from_page((to += v.bv_len) - v.bv_len, v.bv_page,
-				 v.bv_offset, v.bv_len),
-		memcpy((to += v.iov_len) - v.iov_len, v.iov_base, v.iov_len)
-	)
+				 v.bv_offset, v.bv_len));
+	iov_iter_advance(i, bytes);
+	return true;
+}
+
+static bool kvec_copy_from_iter_full(void *addr, size_t bytes, struct iov_iter *i)
+{
+	char *to = addr;
 
+	if (unlikely(i->count < bytes))
+		return false;
+
+	iterate_over_kvec(i, bytes, v,
+	       memcpy((to += v.iov_len) - v.iov_len, v.iov_base, v.iov_len));
 	iov_iter_advance(i, bytes);
 	return true;
 }
 
+static bool no_copy_from_iter_full(void *addr, size_t bytes, struct iov_iter *i)
+{
+	WARN_ON(1);
+	return false;
+}
+
 static size_t xxx_copy_from_iter_nocache(void *addr, size_t bytes, struct iov_iter *i)
 {
 	char *to = addr;
@@ -1985,7 +2008,7 @@ static const struct iov_iter_ops iovec_iter_ops = {
 	.copy_page_from_iter		= xxx_copy_page_from_iter,
 	.copy_to_iter			= iovec_copy_to_iter,
 	.copy_from_iter			= iovec_copy_from_iter,
-	.copy_from_iter_full		= xxx_copy_from_iter_full,
+	.copy_from_iter_full		= iovec_copy_from_iter_full,
 	.copy_from_iter_nocache		= xxx_copy_from_iter_nocache,
 	.copy_from_iter_full_nocache	= xxx_copy_from_iter_full_nocache,
 #ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
@@ -2019,7 +2042,7 @@ static const struct iov_iter_ops kvec_iter_ops = {
 	.copy_page_from_iter		= xxx_copy_page_from_iter,
 	.copy_to_iter			= kvec_copy_to_iter,
 	.copy_from_iter			= kvec_copy_from_iter,
-	.copy_from_iter_full		= xxx_copy_from_iter_full,
+	.copy_from_iter_full		= kvec_copy_from_iter_full,
 	.copy_from_iter_nocache		= xxx_copy_from_iter_nocache,
 	.copy_from_iter_full_nocache	= xxx_copy_from_iter_full_nocache,
 #ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
@@ -2053,7 +2076,7 @@ static const struct iov_iter_ops bvec_iter_ops = {
 	.copy_page_from_iter		= xxx_copy_page_from_iter,
 	.copy_to_iter			= bvec_copy_to_iter,
 	.copy_from_iter			= bvec_copy_from_iter,
-	.copy_from_iter_full		= xxx_copy_from_iter_full,
+	.copy_from_iter_full		= bvec_copy_from_iter_full,
 	.copy_from_iter_nocache		= xxx_copy_from_iter_nocache,
 	.copy_from_iter_full_nocache	= xxx_copy_from_iter_full_nocache,
 #ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
@@ -2087,7 +2110,7 @@ static const struct iov_iter_ops pipe_iter_ops = {
 	.copy_page_from_iter		= xxx_copy_page_from_iter,
 	.copy_to_iter			= pipe_copy_to_iter,
 	.copy_from_iter			= no_copy_from_iter,
-	.copy_from_iter_full		= xxx_copy_from_iter_full,
+	.copy_from_iter_full		= no_copy_from_iter_full,
 	.copy_from_iter_nocache		= xxx_copy_from_iter_nocache,
 	.copy_from_iter_full_nocache	= xxx_copy_from_iter_full_nocache,
 #ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
@@ -2121,7 +2144,7 @@ static const struct iov_iter_ops discard_iter_ops = {
 	.copy_page_from_iter		= xxx_copy_page_from_iter,
 	.copy_to_iter			= discard_copy_to_iter,
 	.copy_from_iter			= no_copy_from_iter,
-	.copy_from_iter_full		= xxx_copy_from_iter_full,
+	.copy_from_iter_full		= no_copy_from_iter_full,
 	.copy_from_iter_nocache		= xxx_copy_from_iter_nocache,
 	.copy_from_iter_full_nocache	= xxx_copy_from_iter_full_nocache,
 #ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE




* [PATCH 10/29] iov_iter: Split copy_from_iter_nocache()
  2020-11-21 14:13 [PATCH 00/29] RFC: iov_iter: Switch to using an ops table David Howells
                   ` (8 preceding siblings ...)
  2020-11-21 14:14 ` [PATCH 09/29] iov_iter: Split copy_from_iter_full() David Howells
@ 2020-11-21 14:14 ` David Howells
  2020-11-21 14:14 ` [PATCH 11/29] iov_iter: Split copy_from_iter_flushcache() David Howells
                   ` (21 subsequent siblings)
  31 siblings, 0 replies; 55+ messages in thread
From: David Howells @ 2020-11-21 14:14 UTC (permalink / raw)
  To: Pavel Begunkov, Matthew Wilcox, Jens Axboe, Alexander Viro
  Cc: dhowells, Linus Torvalds, linux-fsdevel, linux-block, linux-kernel

Split copy_from_iter_nocache() by type.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 lib/iov_iter.c |   38 +++++++++++++++++++++++---------------
 1 file changed, 23 insertions(+), 15 deletions(-)

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 3dba665a1ee9..c57c2171f730 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -972,21 +972,29 @@ static bool no_copy_from_iter_full(void *addr, size_t bytes, struct iov_iter *i)
 	return false;
 }
 
-static size_t xxx_copy_from_iter_nocache(void *addr, size_t bytes, struct iov_iter *i)
+static size_t iovec_copy_from_iter_nocache(void *addr, size_t bytes, struct iov_iter *i)
 {
 	char *to = addr;
-	if (unlikely(iov_iter_is_pipe(i))) {
-		WARN_ON(1);
-		return 0;
-	}
-	iterate_and_advance(i, bytes, v,
+	iterate_and_advance_iovec(i, bytes, v,
 		__copy_from_user_inatomic_nocache((to += v.iov_len) - v.iov_len,
-					 v.iov_base, v.iov_len),
+						  v.iov_base, v.iov_len));
+	return bytes;
+}
+
+static size_t bvec_copy_from_iter_nocache(void *addr, size_t bytes, struct iov_iter *i)
+{
+	char *to = addr;
+	iterate_and_advance_bvec(i, bytes, v,
 		memcpy_from_page((to += v.bv_len) - v.bv_len, v.bv_page,
-				 v.bv_offset, v.bv_len),
-		memcpy((to += v.iov_len) - v.iov_len, v.iov_base, v.iov_len)
-	)
+				 v.bv_offset, v.bv_len));
+	return bytes;
+}
 
+static size_t kvec_copy_from_iter_nocache(void *addr, size_t bytes, struct iov_iter *i)
+{
+	char *to = addr;
+	iterate_and_advance_kvec(i, bytes, v,
+		memcpy((to += v.iov_len) - v.iov_len, v.iov_base, v.iov_len));
 	return bytes;
 }
 
@@ -2009,7 +2017,7 @@ static const struct iov_iter_ops iovec_iter_ops = {
 	.copy_to_iter			= iovec_copy_to_iter,
 	.copy_from_iter			= iovec_copy_from_iter,
 	.copy_from_iter_full		= iovec_copy_from_iter_full,
-	.copy_from_iter_nocache		= xxx_copy_from_iter_nocache,
+	.copy_from_iter_nocache		= iovec_copy_from_iter_nocache,
 	.copy_from_iter_full_nocache	= xxx_copy_from_iter_full_nocache,
 #ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
 	.copy_from_iter_flushcache	= xxx_copy_from_iter_flushcache,
@@ -2043,7 +2051,7 @@ static const struct iov_iter_ops kvec_iter_ops = {
 	.copy_to_iter			= kvec_copy_to_iter,
 	.copy_from_iter			= kvec_copy_from_iter,
 	.copy_from_iter_full		= kvec_copy_from_iter_full,
-	.copy_from_iter_nocache		= xxx_copy_from_iter_nocache,
+	.copy_from_iter_nocache		= kvec_copy_from_iter_nocache,
 	.copy_from_iter_full_nocache	= xxx_copy_from_iter_full_nocache,
 #ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
 	.copy_from_iter_flushcache	= xxx_copy_from_iter_flushcache,
@@ -2077,7 +2085,7 @@ static const struct iov_iter_ops bvec_iter_ops = {
 	.copy_to_iter			= bvec_copy_to_iter,
 	.copy_from_iter			= bvec_copy_from_iter,
 	.copy_from_iter_full		= bvec_copy_from_iter_full,
-	.copy_from_iter_nocache		= xxx_copy_from_iter_nocache,
+	.copy_from_iter_nocache		= bvec_copy_from_iter_nocache,
 	.copy_from_iter_full_nocache	= xxx_copy_from_iter_full_nocache,
 #ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
 	.copy_from_iter_flushcache	= xxx_copy_from_iter_flushcache,
@@ -2111,7 +2119,7 @@ static const struct iov_iter_ops pipe_iter_ops = {
 	.copy_to_iter			= pipe_copy_to_iter,
 	.copy_from_iter			= no_copy_from_iter,
 	.copy_from_iter_full		= no_copy_from_iter_full,
-	.copy_from_iter_nocache		= xxx_copy_from_iter_nocache,
+	.copy_from_iter_nocache		= no_copy_from_iter,
 	.copy_from_iter_full_nocache	= xxx_copy_from_iter_full_nocache,
 #ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
 	.copy_from_iter_flushcache	= xxx_copy_from_iter_flushcache,
@@ -2145,7 +2153,7 @@ static const struct iov_iter_ops discard_iter_ops = {
 	.copy_to_iter			= discard_copy_to_iter,
 	.copy_from_iter			= no_copy_from_iter,
 	.copy_from_iter_full		= no_copy_from_iter_full,
-	.copy_from_iter_nocache		= xxx_copy_from_iter_nocache,
+	.copy_from_iter_nocache		= no_copy_from_iter,
 	.copy_from_iter_full_nocache	= xxx_copy_from_iter_full_nocache,
 #ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
 	.copy_from_iter_flushcache	= xxx_copy_from_iter_flushcache,




* [PATCH 11/29] iov_iter: Split copy_from_iter_flushcache()
  2020-11-21 14:13 [PATCH 00/29] RFC: iov_iter: Switch to using an ops table David Howells
                   ` (9 preceding siblings ...)
  2020-11-21 14:14 ` [PATCH 10/29] iov_iter: Split copy_from_iter_nocache() David Howells
@ 2020-11-21 14:14 ` David Howells
  2020-11-21 14:14 ` [PATCH 12/29] iov_iter: Split copy_from_iter_full_nocache() David Howells
                   ` (20 subsequent siblings)
  31 siblings, 0 replies; 55+ messages in thread
From: David Howells @ 2020-11-21 14:14 UTC (permalink / raw)
  To: Pavel Begunkov, Matthew Wilcox, Jens Axboe, Alexander Viro
  Cc: dhowells, Linus Torvalds, linux-fsdevel, linux-block, linux-kernel

Split copy_from_iter_flushcache() by type.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 lib/iov_iter.c |   42 +++++++++++++++++++++++++-----------------
 1 file changed, 25 insertions(+), 17 deletions(-)

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index c57c2171f730..6b4739d7dd9a 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1000,7 +1000,7 @@ static size_t kvec_copy_from_iter_nocache(void *addr, size_t bytes, struct iov_i
 
 #ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
 /**
- * _copy_from_iter_flushcache - write destination through cpu cache
+ * copy_from_iter_flushcache - write destination through cpu cache
  * @addr: destination kernel address
  * @bytes: total transfer length
  * @iter: source iterator
@@ -1013,22 +1013,30 @@ static size_t kvec_copy_from_iter_nocache(void *addr, size_t bytes, struct iov_i
  * bypass the cache for the ITER_IOVEC case, and on some archs may use
  * instructions that strand dirty-data in the cache.
  */
-static size_t xxx_copy_from_iter_flushcache(void *addr, size_t bytes, struct iov_iter *i)
+static size_t iovec_copy_from_iter_flushcache(void *addr, size_t bytes, struct iov_iter *i)
 {
 	char *to = addr;
-	if (unlikely(iov_iter_is_pipe(i))) {
-		WARN_ON(1);
-		return 0;
-	}
-	iterate_and_advance(i, bytes, v,
+	iterate_and_advance_iovec(i, bytes, v,
 		__copy_from_user_flushcache((to += v.iov_len) - v.iov_len,
-					 v.iov_base, v.iov_len),
+					    v.iov_base, v.iov_len));
+	return bytes;
+}
+
+static size_t bvec_copy_from_iter_flushcache(void *addr, size_t bytes, struct iov_iter *i)
+{
+	char *to = addr;
+	iterate_and_advance_bvec(i, bytes, v,
 		memcpy_page_flushcache((to += v.bv_len) - v.bv_len, v.bv_page,
-				 v.bv_offset, v.bv_len),
-		memcpy_flushcache((to += v.iov_len) - v.iov_len, v.iov_base,
-			v.iov_len)
-	)
+				 v.bv_offset, v.bv_len));
+	return bytes;
+}
 
+static size_t kvec_copy_from_iter_flushcache(void *addr, size_t bytes, struct iov_iter *i)
+{
+	char *to = addr;
+	iterate_and_advance_kvec(i, bytes, v,
+		memcpy_flushcache((to += v.iov_len) - v.iov_len, v.iov_base,
+			v.iov_len));
 	return bytes;
 }
 #endif
@@ -2020,7 +2028,7 @@ static const struct iov_iter_ops iovec_iter_ops = {
 	.copy_from_iter_nocache		= iovec_copy_from_iter_nocache,
 	.copy_from_iter_full_nocache	= xxx_copy_from_iter_full_nocache,
 #ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
-	.copy_from_iter_flushcache	= xxx_copy_from_iter_flushcache,
+	.copy_from_iter_flushcache	= iovec_copy_from_iter_flushcache,
 #endif
 #ifdef CONFIG_ARCH_HAS_COPY_MC
 	.copy_mc_to_iter		= iovec_copy_mc_to_iter,
@@ -2054,7 +2062,7 @@ static const struct iov_iter_ops kvec_iter_ops = {
 	.copy_from_iter_nocache		= kvec_copy_from_iter_nocache,
 	.copy_from_iter_full_nocache	= xxx_copy_from_iter_full_nocache,
 #ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
-	.copy_from_iter_flushcache	= xxx_copy_from_iter_flushcache,
+	.copy_from_iter_flushcache	= kvec_copy_from_iter_flushcache,
 #endif
 #ifdef CONFIG_ARCH_HAS_COPY_MC
 	.copy_mc_to_iter		= kvec_copy_mc_to_iter,
@@ -2088,7 +2096,7 @@ static const struct iov_iter_ops bvec_iter_ops = {
 	.copy_from_iter_nocache		= bvec_copy_from_iter_nocache,
 	.copy_from_iter_full_nocache	= xxx_copy_from_iter_full_nocache,
 #ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
-	.copy_from_iter_flushcache	= xxx_copy_from_iter_flushcache,
+	.copy_from_iter_flushcache	= bvec_copy_from_iter_flushcache,
 #endif
 #ifdef CONFIG_ARCH_HAS_COPY_MC
 	.copy_mc_to_iter		= bvec_copy_mc_to_iter,
@@ -2122,7 +2130,7 @@ static const struct iov_iter_ops pipe_iter_ops = {
 	.copy_from_iter_nocache		= no_copy_from_iter,
 	.copy_from_iter_full_nocache	= xxx_copy_from_iter_full_nocache,
 #ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
-	.copy_from_iter_flushcache	= xxx_copy_from_iter_flushcache,
+	.copy_from_iter_flushcache	= no_copy_from_iter,
 #endif
 #ifdef CONFIG_ARCH_HAS_COPY_MC
 	.copy_mc_to_iter		= pipe_copy_mc_to_iter,
@@ -2156,7 +2164,7 @@ static const struct iov_iter_ops discard_iter_ops = {
 	.copy_from_iter_nocache		= no_copy_from_iter,
 	.copy_from_iter_full_nocache	= xxx_copy_from_iter_full_nocache,
 #ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
-	.copy_from_iter_flushcache	= xxx_copy_from_iter_flushcache,
+	.copy_from_iter_flushcache	= no_copy_from_iter,
 #endif
 #ifdef CONFIG_ARCH_HAS_COPY_MC
 	.copy_mc_to_iter		= discard_copy_to_iter,




* [PATCH 12/29] iov_iter: Split copy_from_iter_full_nocache()
  2020-11-21 14:13 [PATCH 00/29] RFC: iov_iter: Switch to using an ops table David Howells
                   ` (10 preceding siblings ...)
  2020-11-21 14:14 ` [PATCH 11/29] iov_iter: Split copy_from_iter_flushcache() David Howells
@ 2020-11-21 14:14 ` David Howells
  2020-11-21 14:15 ` [PATCH 13/29] iov_iter: Split copy_page_from_iter() David Howells
                   ` (19 subsequent siblings)
  31 siblings, 0 replies; 55+ messages in thread
From: David Howells @ 2020-11-21 14:14 UTC (permalink / raw)
  To: Pavel Begunkov, Matthew Wilcox, Jens Axboe, Alexander Viro
  Cc: dhowells, Linus Torvalds, linux-fsdevel, linux-block, linux-kernel

Split copy_from_iter_full_nocache() by type.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 lib/iov_iter.c |   44 +++++++++++++++++++++++++++++---------------
 1 file changed, 29 insertions(+), 15 deletions(-)

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 6b4739d7dd9a..544e532e3e9f 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1041,25 +1041,39 @@ static size_t kvec_copy_from_iter_flushcache(void *addr, size_t bytes, struct io
 }
 #endif
 
-static bool xxx_copy_from_iter_full_nocache(void *addr, size_t bytes, struct iov_iter *i)
+static bool iovec_copy_from_iter_full_nocache(void *addr, size_t bytes, struct iov_iter *i)
 {
 	char *to = addr;
-	if (unlikely(iov_iter_is_pipe(i))) {
-		WARN_ON(1);
-		return false;
-	}
 	if (unlikely(i->count < bytes))
 		return false;
-	iterate_all_kinds(i, bytes, v, ({
+	iterate_over_iovec(i, bytes, v, ({
 		if (__copy_from_user_inatomic_nocache((to += v.iov_len) - v.iov_len,
 					     v.iov_base, v.iov_len))
 			return false;
-		0;}),
+		0;}));
+	iov_iter_advance(i, bytes);
+	return true;
+}
+
+static bool bvec_copy_from_iter_full_nocache(void *addr, size_t bytes, struct iov_iter *i)
+{
+	char *to = addr;
+	if (unlikely(i->count < bytes))
+		return false;
+	iterate_over_bvec(i, bytes, v,
 		memcpy_from_page((to += v.bv_len) - v.bv_len, v.bv_page,
-				 v.bv_offset, v.bv_len),
-		memcpy((to += v.iov_len) - v.iov_len, v.iov_base, v.iov_len)
-	)
+				 v.bv_offset, v.bv_len));
+	iov_iter_advance(i, bytes);
+	return true;
+}
 
+static bool kvec_copy_from_iter_full_nocache(void *addr, size_t bytes, struct iov_iter *i)
+{
+	char *to = addr;
+	if (unlikely(i->count < bytes))
+		return false;
+	iterate_over_kvec(i, bytes, v,
+		memcpy((to += v.iov_len) - v.iov_len, v.iov_base, v.iov_len));
 	iov_iter_advance(i, bytes);
 	return true;
 }
@@ -2026,7 +2040,7 @@ static const struct iov_iter_ops iovec_iter_ops = {
 	.copy_from_iter			= iovec_copy_from_iter,
 	.copy_from_iter_full		= iovec_copy_from_iter_full,
 	.copy_from_iter_nocache		= iovec_copy_from_iter_nocache,
-	.copy_from_iter_full_nocache	= xxx_copy_from_iter_full_nocache,
+	.copy_from_iter_full_nocache	= iovec_copy_from_iter_full_nocache,
 #ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
 	.copy_from_iter_flushcache	= iovec_copy_from_iter_flushcache,
 #endif
@@ -2060,7 +2074,7 @@ static const struct iov_iter_ops kvec_iter_ops = {
 	.copy_from_iter			= kvec_copy_from_iter,
 	.copy_from_iter_full		= kvec_copy_from_iter_full,
 	.copy_from_iter_nocache		= kvec_copy_from_iter_nocache,
-	.copy_from_iter_full_nocache	= xxx_copy_from_iter_full_nocache,
+	.copy_from_iter_full_nocache	= kvec_copy_from_iter_full_nocache,
 #ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
 	.copy_from_iter_flushcache	= kvec_copy_from_iter_flushcache,
 #endif
@@ -2094,7 +2108,7 @@ static const struct iov_iter_ops bvec_iter_ops = {
 	.copy_from_iter			= bvec_copy_from_iter,
 	.copy_from_iter_full		= bvec_copy_from_iter_full,
 	.copy_from_iter_nocache		= bvec_copy_from_iter_nocache,
-	.copy_from_iter_full_nocache	= xxx_copy_from_iter_full_nocache,
+	.copy_from_iter_full_nocache	= bvec_copy_from_iter_full_nocache,
 #ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
 	.copy_from_iter_flushcache	= bvec_copy_from_iter_flushcache,
 #endif
@@ -2128,7 +2142,7 @@ static const struct iov_iter_ops pipe_iter_ops = {
 	.copy_from_iter			= no_copy_from_iter,
 	.copy_from_iter_full		= no_copy_from_iter_full,
 	.copy_from_iter_nocache		= no_copy_from_iter,
-	.copy_from_iter_full_nocache	= xxx_copy_from_iter_full_nocache,
+	.copy_from_iter_full_nocache	= no_copy_from_iter_full,
 #ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
 	.copy_from_iter_flushcache	= no_copy_from_iter,
 #endif
@@ -2162,7 +2176,7 @@ static const struct iov_iter_ops discard_iter_ops = {
 	.copy_from_iter			= no_copy_from_iter,
 	.copy_from_iter_full		= no_copy_from_iter_full,
 	.copy_from_iter_nocache		= no_copy_from_iter,
-	.copy_from_iter_full_nocache	= xxx_copy_from_iter_full_nocache,
+	.copy_from_iter_full_nocache	= no_copy_from_iter_full,
 #ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
 	.copy_from_iter_flushcache	= no_copy_from_iter,
 #endif




* [PATCH 13/29] iov_iter: Split copy_page_from_iter()
  2020-11-21 14:13 [PATCH 00/29] RFC: iov_iter: Switch to using an ops table David Howells
                   ` (11 preceding siblings ...)
  2020-11-21 14:14 ` [PATCH 12/29] iov_iter: Split copy_from_iter_full_nocache() David Howells
@ 2020-11-21 14:15 ` David Howells
  2020-11-21 14:15 ` [PATCH 14/29] iov_iter: Split iov_iter_zero() David Howells
                   ` (18 subsequent siblings)
  31 siblings, 0 replies; 55+ messages in thread
From: David Howells @ 2020-11-21 14:15 UTC (permalink / raw)
  To: Pavel Begunkov, Matthew Wilcox, Jens Axboe, Alexander Viro
  Cc: dhowells, Linus Torvalds, linux-fsdevel, linux-block, linux-kernel

Split copy_page_from_iter() by type.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 lib/iov_iter.c |   39 +++++++++++++++++++++------------------
 1 file changed, 21 insertions(+), 18 deletions(-)

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 544e532e3e9f..54029aeab3ec 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -344,7 +344,7 @@ static size_t iovec_copy_page_to_iter(struct page *page, size_t offset, size_t b
 	return wanted - bytes;
 }
 
-static size_t copy_page_from_iter_iovec(struct page *page, size_t offset, size_t bytes,
+static size_t iovec_copy_page_from_iter(struct page *page, size_t offset, size_t bytes,
 			 struct iov_iter *i)
 {
 	size_t skip, copy, left, wanted;
@@ -352,6 +352,8 @@ static size_t copy_page_from_iter_iovec(struct page *page, size_t offset, size_t
 	char __user *buf;
 	void *kaddr, *to;
 
+	if (unlikely(!page_copy_sane(page, offset, bytes)))
+		return 0;
 	if (unlikely(bytes > i->count))
 		bytes = i->count;
 
@@ -1120,22 +1122,23 @@ static size_t discard_copy_page_to_iter(struct page *page, size_t offset, size_t
 	return bytes;
 }
 
-static size_t xxx_copy_page_from_iter(struct page *page, size_t offset, size_t bytes,
+static size_t bkvec_copy_page_from_iter(struct page *page, size_t offset, size_t bytes,
 			 struct iov_iter *i)
 {
-	if (unlikely(!page_copy_sane(page, offset, bytes)))
-		return 0;
-	if (unlikely(iov_iter_is_pipe(i) || iov_iter_is_discard(i))) {
-		WARN_ON(1);
-		return 0;
-	}
-	if (iov_iter_type(i) & (ITER_BVEC|ITER_KVEC)) {
+	size_t wanted = 0;
+	if (likely(page_copy_sane(page, offset, bytes))) {
 		void *kaddr = kmap_atomic(page);
-		size_t wanted = copy_from_iter(kaddr + offset, bytes, i);
+		wanted = copy_from_iter(kaddr + offset, bytes, i);
 		kunmap_atomic(kaddr);
-		return wanted;
-	} else
-		return copy_page_from_iter_iovec(page, offset, bytes, i);
+	}
+	return wanted;
+}
+
+static size_t no_copy_page_from_iter(struct page *page, size_t offset, size_t bytes,
+				     struct iov_iter *i)
+{
+	WARN_ON(1);
+	return 0;
 }
 
 static size_t pipe_zero(size_t bytes, struct iov_iter *i)
@@ -2035,7 +2038,7 @@ static const struct iov_iter_ops iovec_iter_ops = {
 	.fault_in_readable		= iovec_fault_in_readable,
 	.single_seg_count		= xxx_single_seg_count,
 	.copy_page_to_iter		= iovec_copy_page_to_iter,
-	.copy_page_from_iter		= xxx_copy_page_from_iter,
+	.copy_page_from_iter		= iovec_copy_page_from_iter,
 	.copy_to_iter			= iovec_copy_to_iter,
 	.copy_from_iter			= iovec_copy_from_iter,
 	.copy_from_iter_full		= iovec_copy_from_iter_full,
@@ -2069,7 +2072,7 @@ static const struct iov_iter_ops kvec_iter_ops = {
 	.fault_in_readable		= no_fault_in_readable,
 	.single_seg_count		= xxx_single_seg_count,
 	.copy_page_to_iter		= bkvec_copy_page_to_iter,
-	.copy_page_from_iter		= xxx_copy_page_from_iter,
+	.copy_page_from_iter		= bkvec_copy_page_from_iter,
 	.copy_to_iter			= kvec_copy_to_iter,
 	.copy_from_iter			= kvec_copy_from_iter,
 	.copy_from_iter_full		= kvec_copy_from_iter_full,
@@ -2103,7 +2106,7 @@ static const struct iov_iter_ops bvec_iter_ops = {
 	.fault_in_readable		= no_fault_in_readable,
 	.single_seg_count		= xxx_single_seg_count,
 	.copy_page_to_iter		= bkvec_copy_page_to_iter,
-	.copy_page_from_iter		= xxx_copy_page_from_iter,
+	.copy_page_from_iter		= bkvec_copy_page_from_iter,
 	.copy_to_iter			= bvec_copy_to_iter,
 	.copy_from_iter			= bvec_copy_from_iter,
 	.copy_from_iter_full		= bvec_copy_from_iter_full,
@@ -2137,7 +2140,7 @@ static const struct iov_iter_ops pipe_iter_ops = {
 	.fault_in_readable		= no_fault_in_readable,
 	.single_seg_count		= xxx_single_seg_count,
 	.copy_page_to_iter		= pipe_copy_page_to_iter,
-	.copy_page_from_iter		= xxx_copy_page_from_iter,
+	.copy_page_from_iter		= no_copy_page_from_iter,
 	.copy_to_iter			= pipe_copy_to_iter,
 	.copy_from_iter			= no_copy_from_iter,
 	.copy_from_iter_full		= no_copy_from_iter_full,
@@ -2171,7 +2174,7 @@ static const struct iov_iter_ops discard_iter_ops = {
 	.fault_in_readable		= no_fault_in_readable,
 	.single_seg_count		= xxx_single_seg_count,
 	.copy_page_to_iter		= discard_copy_page_to_iter,
-	.copy_page_from_iter		= xxx_copy_page_from_iter,
+	.copy_page_from_iter		= no_copy_page_from_iter,
 	.copy_to_iter			= discard_copy_to_iter,
 	.copy_from_iter			= no_copy_from_iter,
 	.copy_from_iter_full		= no_copy_from_iter_full,



^ permalink raw reply related	[flat|nested] 55+ messages in thread
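[Editorial note] The dispatch change in the patch above — replacing the iov_iter_type() if-chain with one handler per type plus a no_copy_page_from_iter() stub for types that cannot be copied from — can be sketched in userspace. This is a minimal illustration only: `miter`, `miter_ops` and the function names are made-up stand-ins, not the kernel's structures.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for struct iov_iter and struct iov_iter_ops */
struct miter;

struct miter_ops {
	size_t (*copy_from)(struct miter *i, void *dst, size_t bytes);
};

struct miter {
	const struct miter_ops *ops;
	const char *buf;	/* backing data for the copyable type */
	size_t count;		/* bytes remaining */
};

/* iovec/kvec-like type: actually copies data and consumes the count */
static size_t mem_copy_from(struct miter *i, void *dst, size_t bytes)
{
	if (bytes > i->count)
		bytes = i->count;
	memcpy(dst, i->buf, bytes);
	i->buf += bytes;
	i->count -= bytes;
	return bytes;
}

/* pipe/discard-like type: copying from it is invalid, so the table
 * points at a rejecting stub, mirroring no_copy_page_from_iter() */
static size_t no_copy_from(struct miter *i, void *dst, size_t bytes)
{
	(void)i; (void)dst; (void)bytes;
	return 0;
}

static const struct miter_ops mem_ops = { .copy_from = mem_copy_from };
static const struct miter_ops no_ops  = { .copy_from = no_copy_from };

/* the caller no longer tests the iterator type: one indirect call */
size_t copy_from_miter(struct miter *i, void *dst, size_t bytes)
{
	return i->ops->copy_from(i, dst, bytes);
}
```

The type check moves from every call site into the table initialiser, which is the whole point of the series.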

* [PATCH 14/29] iov_iter: Split iov_iter_zero()
  2020-11-21 14:13 [PATCH 00/29] RFC: iov_iter: Switch to using an ops table David Howells
                   ` (12 preceding siblings ...)
  2020-11-21 14:15 ` [PATCH 13/29] iov_iter: Split copy_page_from_iter() David Howells
@ 2020-11-21 14:15 ` David Howells
  2020-11-21 14:15 ` [PATCH 15/29] iov_iter: Split copy_from_user_atomic() David Howells
                   ` (17 subsequent siblings)
  31 siblings, 0 replies; 55+ messages in thread
From: David Howells @ 2020-11-21 14:15 UTC (permalink / raw)
  To: Pavel Begunkov, Matthew Wilcox, Jens Axboe, Alexander Viro
  Cc: dhowells, Linus Torvalds, linux-fsdevel, linux-block, linux-kernel

Split iov_iter_zero() by type.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 lib/iov_iter.c |   40 +++++++++++++++++++++++++++-------------
 1 file changed, 27 insertions(+), 13 deletions(-)

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 54029aeab3ec..9a167f53ecff 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1168,16 +1168,30 @@ static size_t pipe_zero(size_t bytes, struct iov_iter *i)
 	return bytes;
 }
 
-static size_t xxx_zero(size_t bytes, struct iov_iter *i)
+static size_t iovec_zero(size_t bytes, struct iov_iter *i)
 {
-	if (unlikely(iov_iter_is_pipe(i)))
-		return pipe_zero(bytes, i);
-	iterate_and_advance(i, bytes, v,
-		clear_user(v.iov_base, v.iov_len),
-		memzero_page(v.bv_page, v.bv_offset, v.bv_len),
-		memset(v.iov_base, 0, v.iov_len)
-	)
+	iterate_and_advance_iovec(i, bytes, v,
+		clear_user(v.iov_base, v.iov_len));
+	return bytes;
+}
 
+static size_t bvec_zero(size_t bytes, struct iov_iter *i)
+{
+	iterate_and_advance_bvec(i, bytes, v,
+		memzero_page(v.bv_page, v.bv_offset, v.bv_len));
+	return bytes;
+}
+
+static size_t kvec_zero(size_t bytes, struct iov_iter *i)
+{
+	iterate_and_advance_kvec(i, bytes, v,
+		memset(v.iov_base, 0, v.iov_len));
+	return bytes;
+}
+
+static size_t discard_zero(size_t bytes, struct iov_iter *i)
+{
+	iterate_and_advance_discard(i, bytes);
 	return bytes;
 }
 
@@ -2054,7 +2068,7 @@ static const struct iov_iter_ops iovec_iter_ops = {
 	.csum_and_copy_from_iter	= xxx_csum_and_copy_from_iter,
 	.csum_and_copy_from_iter_full	= xxx_csum_and_copy_from_iter_full,
 
-	.zero				= xxx_zero,
+	.zero				= iovec_zero,
 	.alignment			= xxx_alignment,
 	.gap_alignment			= xxx_gap_alignment,
 	.get_pages			= xxx_get_pages,
@@ -2088,7 +2102,7 @@ static const struct iov_iter_ops kvec_iter_ops = {
 	.csum_and_copy_from_iter	= xxx_csum_and_copy_from_iter,
 	.csum_and_copy_from_iter_full	= xxx_csum_and_copy_from_iter_full,
 
-	.zero				= xxx_zero,
+	.zero				= kvec_zero,
 	.alignment			= xxx_alignment,
 	.gap_alignment			= xxx_gap_alignment,
 	.get_pages			= xxx_get_pages,
@@ -2122,7 +2136,7 @@ static const struct iov_iter_ops bvec_iter_ops = {
 	.csum_and_copy_from_iter	= xxx_csum_and_copy_from_iter,
 	.csum_and_copy_from_iter_full	= xxx_csum_and_copy_from_iter_full,
 
-	.zero				= xxx_zero,
+	.zero				= bvec_zero,
 	.alignment			= xxx_alignment,
 	.gap_alignment			= xxx_gap_alignment,
 	.get_pages			= xxx_get_pages,
@@ -2156,7 +2170,7 @@ static const struct iov_iter_ops pipe_iter_ops = {
 	.csum_and_copy_from_iter	= xxx_csum_and_copy_from_iter,
 	.csum_and_copy_from_iter_full	= xxx_csum_and_copy_from_iter_full,
 
-	.zero				= xxx_zero,
+	.zero				= pipe_zero,
 	.alignment			= xxx_alignment,
 	.gap_alignment			= xxx_gap_alignment,
 	.get_pages			= xxx_get_pages,
@@ -2190,7 +2204,7 @@ static const struct iov_iter_ops discard_iter_ops = {
 	.csum_and_copy_from_iter	= xxx_csum_and_copy_from_iter,
 	.csum_and_copy_from_iter_full	= xxx_csum_and_copy_from_iter_full,
 
-	.zero				= xxx_zero,
+	.zero				= discard_zero,
 	.alignment			= xxx_alignment,
 	.gap_alignment			= xxx_gap_alignment,
 	.get_pages			= xxx_get_pages,



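[Editorial note] The split in the patch above gives each type its own zeroing strategy: segment-backed types clear their segments, while a discard iterator only consumes its count. A rough userspace sketch of those two strategies, with `zseg` as a made-up segment type (not the kernel's kvec/bio_vec):

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical segment descriptor for illustration */
struct zseg {
	char *base;
	size_t len;
};

/* kvec/bvec-like zero: walk the segments, memset each in turn */
size_t seg_zero(const struct zseg *seg, size_t nsegs, size_t bytes)
{
	size_t done = 0;

	for (size_t k = 0; k < nsegs && done < bytes; k++) {
		size_t n = seg[k].len;

		if (n > bytes - done)
			n = bytes - done;
		memset(seg[k].base, 0, n);
		done += n;
	}
	return done;
}

/* discard-like zero: nothing backs the iterator, only consume the count */
size_t discard_like_zero(size_t *count, size_t bytes)
{
	if (bytes > *count)
		bytes = *count;
	*count -= bytes;
	return bytes;
}
```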

* [PATCH 15/29] iov_iter: Split copy_from_user_atomic()
  2020-11-21 14:13 [PATCH 00/29] RFC: iov_iter: Switch to using an ops table David Howells
                   ` (13 preceding siblings ...)
  2020-11-21 14:15 ` [PATCH 14/29] iov_iter: Split iov_iter_zero() David Howells
@ 2020-11-21 14:15 ` David Howells
  2020-11-21 14:15 ` [PATCH 16/29] iov_iter: Split iov_iter_advance() David Howells
                   ` (16 subsequent siblings)
  31 siblings, 0 replies; 55+ messages in thread
From: David Howells @ 2020-11-21 14:15 UTC (permalink / raw)
  To: Pavel Begunkov, Matthew Wilcox, Jens Axboe, Alexander Viro
  Cc: dhowells, Linus Torvalds, linux-fsdevel, linux-block, linux-kernel

Split copy_from_user_atomic() by type.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 lib/iov_iter.c |   53 ++++++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 40 insertions(+), 13 deletions(-)

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 9a167f53ecff..a626d41fef72 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1195,7 +1195,7 @@ static size_t discard_zero(size_t bytes, struct iov_iter *i)
 	return bytes;
 }
 
-static size_t xxx_copy_from_user_atomic(struct page *page,
+static size_t iovec_copy_from_user_atomic(struct page *page,
 		struct iov_iter *i, unsigned long offset, size_t bytes)
 {
 	char *kaddr = kmap_atomic(page), *p = kaddr + offset;
@@ -1203,21 +1203,48 @@ static size_t xxx_copy_from_user_atomic(struct page *page,
 		kunmap_atomic(kaddr);
 		return 0;
 	}
-	if (unlikely(iov_iter_is_pipe(i) || iov_iter_is_discard(i))) {
+	iterate_over_iovec(i, bytes, v,
+		copyin((p += v.iov_len) - v.iov_len, v.iov_base, v.iov_len));
+	kunmap_atomic(kaddr);
+	return bytes;
+}
+
+static size_t bvec_copy_from_user_atomic(struct page *page,
+		struct iov_iter *i, unsigned long offset, size_t bytes)
+{
+	char *kaddr = kmap_atomic(page), *p = kaddr + offset;
+	if (unlikely(!page_copy_sane(page, offset, bytes))) {
 		kunmap_atomic(kaddr);
-		WARN_ON(1);
 		return 0;
 	}
-	iterate_all_kinds(i, bytes, v,
-		copyin((p += v.iov_len) - v.iov_len, v.iov_base, v.iov_len),
+	iterate_over_bvec(i, bytes, v,
 		memcpy_from_page((p += v.bv_len) - v.bv_len, v.bv_page,
-				 v.bv_offset, v.bv_len),
-		memcpy((p += v.iov_len) - v.iov_len, v.iov_base, v.iov_len)
-	)
+				 v.bv_offset, v.bv_len));
 	kunmap_atomic(kaddr);
 	return bytes;
 }
 
+static size_t kvec_copy_from_user_atomic(struct page *page,
+		struct iov_iter *i, unsigned long offset, size_t bytes)
+{
+	char *kaddr = kmap_atomic(page), *p = kaddr + offset;
+	if (unlikely(!page_copy_sane(page, offset, bytes))) {
+		kunmap_atomic(kaddr);
+		return 0;
+	}
+	iterate_over_kvec(i, bytes, v,
+		memcpy((p += v.iov_len) - v.iov_len, v.iov_base, v.iov_len));
+	kunmap_atomic(kaddr);
+	return bytes;
+}
+
+static size_t no_copy_from_user_atomic(struct page *page,
+		struct iov_iter *i, unsigned long offset, size_t bytes)
+{
+	WARN_ON(1);
+	return 0;
+}
+
 static inline void pipe_truncate(struct iov_iter *i)
 {
 	struct pipe_inode_info *pipe = i->pipe;
@@ -2046,7 +2073,7 @@ static int xxx_for_each_range(struct iov_iter *i, size_t bytes,
 
 static const struct iov_iter_ops iovec_iter_ops = {
 	.type				= ITER_IOVEC,
-	.copy_from_user_atomic		= xxx_copy_from_user_atomic,
+	.copy_from_user_atomic		= iovec_copy_from_user_atomic,
 	.advance			= xxx_advance,
 	.revert				= xxx_revert,
 	.fault_in_readable		= iovec_fault_in_readable,
@@ -2080,7 +2107,7 @@ static const struct iov_iter_ops iovec_iter_ops = {
 
 static const struct iov_iter_ops kvec_iter_ops = {
 	.type				= ITER_KVEC,
-	.copy_from_user_atomic		= xxx_copy_from_user_atomic,
+	.copy_from_user_atomic		= kvec_copy_from_user_atomic,
 	.advance			= xxx_advance,
 	.revert				= xxx_revert,
 	.fault_in_readable		= no_fault_in_readable,
@@ -2114,7 +2141,7 @@ static const struct iov_iter_ops kvec_iter_ops = {
 
 static const struct iov_iter_ops bvec_iter_ops = {
 	.type				= ITER_BVEC,
-	.copy_from_user_atomic		= xxx_copy_from_user_atomic,
+	.copy_from_user_atomic		= bvec_copy_from_user_atomic,
 	.advance			= xxx_advance,
 	.revert				= xxx_revert,
 	.fault_in_readable		= no_fault_in_readable,
@@ -2148,7 +2175,7 @@ static const struct iov_iter_ops bvec_iter_ops = {
 
 static const struct iov_iter_ops pipe_iter_ops = {
 	.type				= ITER_PIPE,
-	.copy_from_user_atomic		= xxx_copy_from_user_atomic,
+	.copy_from_user_atomic		= no_copy_from_user_atomic,
 	.advance			= xxx_advance,
 	.revert				= xxx_revert,
 	.fault_in_readable		= no_fault_in_readable,
@@ -2182,7 +2209,7 @@ static const struct iov_iter_ops pipe_iter_ops = {
 
 static const struct iov_iter_ops discard_iter_ops = {
 	.type				= ITER_DISCARD,
-	.copy_from_user_atomic		= xxx_copy_from_user_atomic,
+	.copy_from_user_atomic		= no_copy_from_user_atomic,
 	.advance			= xxx_advance,
 	.revert				= xxx_revert,
 	.fault_in_readable		= no_fault_in_readable,




* [PATCH 16/29] iov_iter: Split iov_iter_advance()
  2020-11-21 14:13 [PATCH 00/29] RFC: iov_iter: Switch to using an ops table David Howells
                   ` (14 preceding siblings ...)
  2020-11-21 14:15 ` [PATCH 15/29] iov_iter: Split copy_from_user_atomic() David Howells
@ 2020-11-21 14:15 ` David Howells
  2020-11-21 14:15 ` [PATCH 17/29] iov_iter: Split iov_iter_revert() David Howells
                   ` (15 subsequent siblings)
  31 siblings, 0 replies; 55+ messages in thread
From: David Howells @ 2020-11-21 14:15 UTC (permalink / raw)
  To: Pavel Begunkov, Matthew Wilcox, Jens Axboe, Alexander Viro
  Cc: dhowells, Linus Torvalds, linux-fsdevel, linux-block, linux-kernel

Split iov_iter_advance() by type.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 lib/iov_iter.c |   37 ++++++++++++++++++++++---------------
 1 file changed, 22 insertions(+), 15 deletions(-)

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index a626d41fef72..9859b4b8a116 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1299,17 +1299,24 @@ static void pipe_advance(struct iov_iter *i, size_t size)
 	pipe_truncate(i);
 }
 
-static void xxx_advance(struct iov_iter *i, size_t size)
+static void iovec_advance(struct iov_iter *i, size_t size)
 {
-	if (unlikely(iov_iter_is_pipe(i))) {
-		pipe_advance(i, size);
-		return;
-	}
-	if (unlikely(iov_iter_is_discard(i))) {
-		i->count -= size;
-		return;
-	}
-	iterate_and_advance(i, size, v, 0, 0, 0)
+	iterate_and_advance_iovec(i, size, v, 0)
+}
+
+static void bvec_iov_advance(struct iov_iter *i, size_t size)
+{
+	iterate_and_advance_bvec(i, size, v, 0)
+}
+
+static void kvec_advance(struct iov_iter *i, size_t size)
+{
+	iterate_and_advance_kvec(i, size, v, 0)
+}
+
+static void discard_advance(struct iov_iter *i, size_t size)
+{
+	i->count -= size;
 }
 
 static void xxx_revert(struct iov_iter *i, size_t unroll)
@@ -2074,7 +2081,7 @@ static int xxx_for_each_range(struct iov_iter *i, size_t bytes,
 static const struct iov_iter_ops iovec_iter_ops = {
 	.type				= ITER_IOVEC,
 	.copy_from_user_atomic		= iovec_copy_from_user_atomic,
-	.advance			= xxx_advance,
+	.advance			= iovec_advance,
 	.revert				= xxx_revert,
 	.fault_in_readable		= iovec_fault_in_readable,
 	.single_seg_count		= xxx_single_seg_count,
@@ -2108,7 +2115,7 @@ static const struct iov_iter_ops iovec_iter_ops = {
 static const struct iov_iter_ops kvec_iter_ops = {
 	.type				= ITER_KVEC,
 	.copy_from_user_atomic		= kvec_copy_from_user_atomic,
-	.advance			= xxx_advance,
+	.advance			= kvec_advance,
 	.revert				= xxx_revert,
 	.fault_in_readable		= no_fault_in_readable,
 	.single_seg_count		= xxx_single_seg_count,
@@ -2142,7 +2149,7 @@ static const struct iov_iter_ops kvec_iter_ops = {
 static const struct iov_iter_ops bvec_iter_ops = {
 	.type				= ITER_BVEC,
 	.copy_from_user_atomic		= bvec_copy_from_user_atomic,
-	.advance			= xxx_advance,
+	.advance			= bvec_iov_advance,
 	.revert				= xxx_revert,
 	.fault_in_readable		= no_fault_in_readable,
 	.single_seg_count		= xxx_single_seg_count,
@@ -2176,7 +2183,7 @@ static const struct iov_iter_ops bvec_iter_ops = {
 static const struct iov_iter_ops pipe_iter_ops = {
 	.type				= ITER_PIPE,
 	.copy_from_user_atomic		= no_copy_from_user_atomic,
-	.advance			= xxx_advance,
+	.advance			= pipe_advance,
 	.revert				= xxx_revert,
 	.fault_in_readable		= no_fault_in_readable,
 	.single_seg_count		= xxx_single_seg_count,
@@ -2210,7 +2217,7 @@ static const struct iov_iter_ops pipe_iter_ops = {
 static const struct iov_iter_ops discard_iter_ops = {
 	.type				= ITER_DISCARD,
 	.copy_from_user_atomic		= no_copy_from_user_atomic,
-	.advance			= xxx_advance,
+	.advance			= discard_advance,
 	.revert				= xxx_revert,
 	.fault_in_readable		= no_fault_in_readable,
 	.single_seg_count		= xxx_single_seg_count,



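[Editorial note] The advance split above separates the segment walk (what iterate_and_advance_*() does for iovec/kvec/bvec) from discard_advance(), which only decrements the count. A simplified userspace model of the segment walk — `a_iter` and its fields are illustrative stand-ins, not the kernel's iterator:

```c
#include <stddef.h>

/* Hypothetical iterator over an array of segment lengths */
struct a_iter {
	const size_t *seg;	/* current segment's length */
	size_t nr_segs;		/* segments left, including the current one */
	size_t iov_offset;	/* bytes consumed within the current segment */
	size_t count;		/* bytes left overall */
};

void seg_advance(struct a_iter *i, size_t size)
{
	if (size > i->count)		/* clamp, as the kernel does */
		size = i->count;
	i->count -= size;

	while (size) {
		size_t left = *i->seg - i->iov_offset;

		if (size < left) {	/* stays inside this segment */
			i->iov_offset += size;
			return;
		}
		size -= left;		/* step to the next segment */
		i->seg++;
		i->nr_segs--;
		i->iov_offset = 0;
	}
}
```

discard_advance() needs none of this: with no backing segments, `i->count -= size` is the whole operation, which is exactly what the split makes visible.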

* [PATCH 17/29] iov_iter: Split iov_iter_revert()
  2020-11-21 14:13 [PATCH 00/29] RFC: iov_iter: Switch to using an ops table David Howells
                   ` (15 preceding siblings ...)
  2020-11-21 14:15 ` [PATCH 16/29] iov_iter: Split iov_iter_advance() David Howells
@ 2020-11-21 14:15 ` David Howells
  2020-11-21 14:15 ` [PATCH 18/29] iov_iter: Split iov_iter_single_seg_count() David Howells
                   ` (14 subsequent siblings)
  31 siblings, 0 replies; 55+ messages in thread
From: David Howells @ 2020-11-21 14:15 UTC (permalink / raw)
  To: Pavel Begunkov, Matthew Wilcox, Jens Axboe, Alexander Viro
  Cc: dhowells, Linus Torvalds, linux-fsdevel, linux-block, linux-kernel

Split iov_iter_revert() by type.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 lib/iov_iter.c |  132 ++++++++++++++++++++++++++++++++++----------------------
 1 file changed, 79 insertions(+), 53 deletions(-)

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 9859b4b8a116..b8e3da20547e 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1319,71 +1319,97 @@ static void discard_advance(struct iov_iter *i, size_t size)
 	i->count -= size;
 }
 
-static void xxx_revert(struct iov_iter *i, size_t unroll)
+static void iovec_kvec_revert(struct iov_iter *i, size_t unroll)
 {
+	const struct iovec *iov = i->iov;
 	if (!unroll)
 		return;
 	if (WARN_ON(unroll > MAX_RW_COUNT))
 		return;
 	i->count += unroll;
-	if (unlikely(iov_iter_is_pipe(i))) {
-		struct pipe_inode_info *pipe = i->pipe;
-		unsigned int p_mask = pipe->ring_size - 1;
-		unsigned int i_head = i->head;
-		size_t off = i->iov_offset;
-		while (1) {
-			struct pipe_buffer *b = &pipe->bufs[i_head & p_mask];
-			size_t n = off - b->offset;
-			if (unroll < n) {
-				off -= unroll;
-				break;
-			}
-			unroll -= n;
-			if (!unroll && i_head == i->start_head) {
-				off = 0;
-				break;
-			}
-			i_head--;
-			b = &pipe->bufs[i_head & p_mask];
-			off = b->offset + b->len;
-		}
-		i->iov_offset = off;
-		i->head = i_head;
-		pipe_truncate(i);
+	if (unroll <= i->iov_offset) {
+		i->iov_offset -= unroll;
 		return;
 	}
-	if (unlikely(iov_iter_is_discard(i)))
+	unroll -= i->iov_offset;
+	while (1) {
+		size_t n = (--iov)->iov_len;
+		i->nr_segs++;
+		if (unroll <= n) {
+			i->iov = iov;
+			i->iov_offset = n - unroll;
+			return;
+		}
+		unroll -= n;
+	}
+}
+
+static void bvec_revert(struct iov_iter *i, size_t unroll)
+{
+	const struct bio_vec *bvec = i->bvec;
+
+	if (!unroll)
 		return;
+	if (WARN_ON(unroll > MAX_RW_COUNT))
+		return;
+	i->count += unroll;
 	if (unroll <= i->iov_offset) {
 		i->iov_offset -= unroll;
 		return;
 	}
 	unroll -= i->iov_offset;
-	if (iov_iter_is_bvec(i)) {
-		const struct bio_vec *bvec = i->bvec;
-		while (1) {
-			size_t n = (--bvec)->bv_len;
-			i->nr_segs++;
-			if (unroll <= n) {
-				i->bvec = bvec;
-				i->iov_offset = n - unroll;
-				return;
-			}
-			unroll -= n;
+	while (1) {
+		size_t n = (--bvec)->bv_len;
+		i->nr_segs++;
+		if (unroll <= n) {
+			i->bvec = bvec;
+			i->iov_offset = n - unroll;
+			return;
 		}
-	} else { /* same logics for iovec and kvec */
-		const struct iovec *iov = i->iov;
-		while (1) {
-			size_t n = (--iov)->iov_len;
-			i->nr_segs++;
-			if (unroll <= n) {
-				i->iov = iov;
-				i->iov_offset = n - unroll;
-				return;
-			}
-			unroll -= n;
+		unroll -= n;
+	}
+}
+
+static void pipe_revert(struct iov_iter *i, size_t unroll)
+{
+	struct pipe_inode_info *pipe = i->pipe;
+	unsigned int p_mask = pipe->ring_size - 1;
+	unsigned int i_head = i->head;
+	size_t off = i->iov_offset;
+
+	if (!unroll)
+		return;
+	if (WARN_ON(unroll > MAX_RW_COUNT))
+		return;
+
+	while (1) {
+		struct pipe_buffer *b = &pipe->bufs[i_head & p_mask];
+		size_t n = off - b->offset;
+		if (unroll < n) {
+			off -= unroll;
+			break;
+		}
+		unroll -= n;
+		if (!unroll && i_head == i->start_head) {
+			off = 0;
+			break;
 		}
+		i_head--;
+		b = &pipe->bufs[i_head & p_mask];
+		off = b->offset + b->len;
 	}
+	i->iov_offset = off;
+	i->head = i_head;
+	pipe_truncate(i);
+}
+
+static void discard_revert(struct iov_iter *i, size_t unroll)
+{
+	if (!unroll)
+		return;
+	if (WARN_ON(unroll > MAX_RW_COUNT))
+		return;
+	i->count += unroll;
 }
 
 /*
@@ -2082,7 +2108,7 @@ static const struct iov_iter_ops iovec_iter_ops = {
 	.type				= ITER_IOVEC,
 	.copy_from_user_atomic		= iovec_copy_from_user_atomic,
 	.advance			= iovec_advance,
-	.revert				= xxx_revert,
+	.revert				= iovec_kvec_revert,
 	.fault_in_readable		= iovec_fault_in_readable,
 	.single_seg_count		= xxx_single_seg_count,
 	.copy_page_to_iter		= iovec_copy_page_to_iter,
@@ -2116,7 +2142,7 @@ static const struct iov_iter_ops kvec_iter_ops = {
 	.type				= ITER_KVEC,
 	.copy_from_user_atomic		= kvec_copy_from_user_atomic,
 	.advance			= kvec_advance,
-	.revert				= xxx_revert,
+	.revert				= iovec_kvec_revert,
 	.fault_in_readable		= no_fault_in_readable,
 	.single_seg_count		= xxx_single_seg_count,
 	.copy_page_to_iter		= bkvec_copy_page_to_iter,
@@ -2150,7 +2176,7 @@ static const struct iov_iter_ops bvec_iter_ops = {
 	.type				= ITER_BVEC,
 	.copy_from_user_atomic		= bvec_copy_from_user_atomic,
 	.advance			= bvec_iov_advance,
-	.revert				= xxx_revert,
+	.revert				= bvec_revert,
 	.fault_in_readable		= no_fault_in_readable,
 	.single_seg_count		= xxx_single_seg_count,
 	.copy_page_to_iter		= bkvec_copy_page_to_iter,
@@ -2184,7 +2210,7 @@ static const struct iov_iter_ops pipe_iter_ops = {
 	.type				= ITER_PIPE,
 	.copy_from_user_atomic		= no_copy_from_user_atomic,
 	.advance			= pipe_advance,
-	.revert				= xxx_revert,
+	.revert				= pipe_revert,
 	.fault_in_readable		= no_fault_in_readable,
 	.single_seg_count		= xxx_single_seg_count,
 	.copy_page_to_iter		= pipe_copy_page_to_iter,
@@ -2218,7 +2244,7 @@ static const struct iov_iter_ops discard_iter_ops = {
 	.type				= ITER_DISCARD,
 	.copy_from_user_atomic		= no_copy_from_user_atomic,
 	.advance			= discard_advance,
-	.revert				= xxx_revert,
+	.revert				= discard_revert,
 	.fault_in_readable		= no_fault_in_readable,
 	.single_seg_count		= xxx_single_seg_count,
 	.copy_page_to_iter		= discard_copy_page_to_iter,



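[Editorial note] The walk-back loop shared by iovec_kvec_revert() and bvec_revert() above has the same shape for both: undo within the current segment if possible, otherwise step backwards over whole segments until the remainder fits. A minimal userspace sketch of that loop, omitting the MAX_RW_COUNT sanity check; `seg_iter` is a made-up stand-in:

```c
#include <stddef.h>

/* Hypothetical iterator over an array of segment lengths */
struct seg_iter {
	const size_t *seg;	/* points at the current segment's length */
	size_t nr_segs;		/* segments remaining from 'seg' onward */
	size_t iov_offset;	/* bytes already consumed in *seg */
	size_t count;		/* bytes remaining overall */
};

void seg_revert(struct seg_iter *i, size_t unroll)
{
	if (!unroll)
		return;
	i->count += unroll;

	/* easy case: still inside the current segment */
	if (unroll <= i->iov_offset) {
		i->iov_offset -= unroll;
		return;
	}
	unroll -= i->iov_offset;

	/* otherwise step back whole segments until the remainder fits */
	for (;;) {
		size_t n = *(--i->seg);

		i->nr_segs++;
		if (unroll <= n) {
			i->iov_offset = n - unroll;
			return;
		}
		unroll -= n;
	}
}
```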

* [PATCH 18/29] iov_iter: Split iov_iter_single_seg_count()
  2020-11-21 14:13 [PATCH 00/29] RFC: iov_iter: Switch to using an ops table David Howells
                   ` (16 preceding siblings ...)
  2020-11-21 14:15 ` [PATCH 17/29] iov_iter: Split iov_iter_revert() David Howells
@ 2020-11-21 14:15 ` David Howells
  2020-11-21 14:15 ` [PATCH 19/29] iov_iter: Split iov_iter_alignment() David Howells
                   ` (13 subsequent siblings)
  31 siblings, 0 replies; 55+ messages in thread
From: David Howells @ 2020-11-21 14:15 UTC (permalink / raw)
  To: Pavel Begunkov, Matthew Wilcox, Jens Axboe, Alexander Viro
  Cc: dhowells, Linus Torvalds, linux-fsdevel, linux-block, linux-kernel

Split iov_iter_single_seg_count() by type.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 lib/iov_iter.c |   31 ++++++++++++++++++-------------
 1 file changed, 18 insertions(+), 13 deletions(-)

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index b8e3da20547e..90291188ace5 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1415,18 +1415,23 @@ static void discard_revert(struct iov_iter *i, size_t unroll)
 /*
  * Return the count of just the current iov_iter segment.
  */
-static size_t xxx_single_seg_count(const struct iov_iter *i)
+static size_t iovec_kvec_single_seg_count(const struct iov_iter *i)
 {
-	if (unlikely(iov_iter_is_pipe(i)))
-		return i->count;	// it is a silly place, anyway
 	if (i->nr_segs == 1)
 		return i->count;
-	if (unlikely(iov_iter_is_discard(i)))
+	return min(i->count, i->iov->iov_len - i->iov_offset);
+}
+
+static size_t bvec_single_seg_count(const struct iov_iter *i)
+{
+	if (i->nr_segs == 1)
 		return i->count;
-	else if (iov_iter_is_bvec(i))
-		return min(i->count, i->bvec->bv_len - i->iov_offset);
-	else
-		return min(i->count, i->iov->iov_len - i->iov_offset);
+	return min(i->count, i->bvec->bv_len - i->iov_offset);
+}
+
+static size_t simple_single_seg_count(const struct iov_iter *i)
+{
+	return i->count;
 }
 
 void iov_iter_kvec(struct iov_iter *i, unsigned int direction,
@@ -2110,7 +2115,7 @@ static const struct iov_iter_ops iovec_iter_ops = {
 	.advance			= iovec_advance,
 	.revert				= iovec_kvec_revert,
 	.fault_in_readable		= iovec_fault_in_readable,
-	.single_seg_count		= xxx_single_seg_count,
+	.single_seg_count		= iovec_kvec_single_seg_count,
 	.copy_page_to_iter		= iovec_copy_page_to_iter,
 	.copy_page_from_iter		= iovec_copy_page_from_iter,
 	.copy_to_iter			= iovec_copy_to_iter,
@@ -2144,7 +2149,7 @@ static const struct iov_iter_ops kvec_iter_ops = {
 	.advance			= kvec_advance,
 	.revert				= iovec_kvec_revert,
 	.fault_in_readable		= no_fault_in_readable,
-	.single_seg_count		= xxx_single_seg_count,
+	.single_seg_count		= iovec_kvec_single_seg_count,
 	.copy_page_to_iter		= bkvec_copy_page_to_iter,
 	.copy_page_from_iter		= bkvec_copy_page_from_iter,
 	.copy_to_iter			= kvec_copy_to_iter,
@@ -2178,7 +2183,7 @@ static const struct iov_iter_ops bvec_iter_ops = {
 	.advance			= bvec_iov_advance,
 	.revert				= bvec_revert,
 	.fault_in_readable		= no_fault_in_readable,
-	.single_seg_count		= xxx_single_seg_count,
+	.single_seg_count		= bvec_single_seg_count,
 	.copy_page_to_iter		= bkvec_copy_page_to_iter,
 	.copy_page_from_iter		= bkvec_copy_page_from_iter,
 	.copy_to_iter			= bvec_copy_to_iter,
@@ -2212,7 +2217,7 @@ static const struct iov_iter_ops pipe_iter_ops = {
 	.advance			= pipe_advance,
 	.revert				= pipe_revert,
 	.fault_in_readable		= no_fault_in_readable,
-	.single_seg_count		= xxx_single_seg_count,
+	.single_seg_count		= simple_single_seg_count,
 	.copy_page_to_iter		= pipe_copy_page_to_iter,
 	.copy_page_from_iter		= no_copy_page_from_iter,
 	.copy_to_iter			= pipe_copy_to_iter,
@@ -2246,7 +2251,7 @@ static const struct iov_iter_ops discard_iter_ops = {
 	.advance			= discard_advance,
 	.revert				= discard_revert,
 	.fault_in_readable		= no_fault_in_readable,
-	.single_seg_count		= xxx_single_seg_count,
+	.single_seg_count		= simple_single_seg_count,
 	.copy_page_to_iter		= discard_copy_page_to_iter,
 	.copy_page_from_iter		= no_copy_page_from_iter,
 	.copy_to_iter			= discard_copy_to_iter,



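[Editorial note] The single_seg_count split above reduces to two behaviours: pipe/discard report the whole count (a single-segment question is meaningless for them), while segment-backed types report what is left of the current segment, capped by the total. Sketched in userspace with a made-up `sc_iter`:

```c
#include <stddef.h>

static size_t min_sz(size_t a, size_t b)
{
	return a < b ? a : b;
}

/* Hypothetical stand-in for the fields the real functions consult */
struct sc_iter {
	size_t nr_segs;
	size_t count;		/* total bytes remaining */
	size_t seg_len;		/* length of the current segment */
	size_t iov_offset;	/* consumed within the current segment */
};

/* iovec/kvec/bvec-like */
size_t seg_single_seg_count(const struct sc_iter *i)
{
	if (i->nr_segs == 1)
		return i->count;
	return min_sz(i->count, i->seg_len - i->iov_offset);
}

/* pipe/discard-like: the count is all there is */
size_t simple_single_seg_count(const struct sc_iter *i)
{
	return i->count;
}
```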

* [PATCH 19/29] iov_iter: Split iov_iter_alignment()
  2020-11-21 14:13 [PATCH 00/29] RFC: iov_iter: Switch to using an ops table David Howells
                   ` (17 preceding siblings ...)
  2020-11-21 14:15 ` [PATCH 18/29] iov_iter: Split iov_iter_single_seg_count() David Howells
@ 2020-11-21 14:15 ` David Howells
  2020-11-21 14:15 ` [PATCH 20/29] iov_iter: Split iov_iter_gap_alignment() David Howells
                   ` (12 subsequent siblings)
  31 siblings, 0 replies; 55+ messages in thread
From: David Howells @ 2020-11-21 14:15 UTC (permalink / raw)
  To: Pavel Begunkov, Matthew Wilcox, Jens Axboe, Alexander Viro
  Cc: dhowells, Linus Torvalds, linux-fsdevel, linux-block, linux-kernel

Split iov_iter_alignment() by type.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 lib/iov_iter.c |   59 ++++++++++++++++++++++++++++++++++++++++----------------
 1 file changed, 42 insertions(+), 17 deletions(-)

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 90291188ace5..d2a66e951995 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1497,26 +1497,51 @@ void iov_iter_discard(struct iov_iter *i, unsigned int direction, size_t count)
 }
 EXPORT_SYMBOL(iov_iter_discard);
 
-static unsigned long xxx_alignment(const struct iov_iter *i)
+static unsigned long iovec_alignment(const struct iov_iter *i)
 {
 	unsigned long res = 0;
 	size_t size = i->count;
 
-	if (unlikely(iov_iter_is_pipe(i))) {
-		unsigned int p_mask = i->pipe->ring_size - 1;
+	iterate_over_iovec(i, size, v,
+		(res |= (unsigned long)v.iov_base | v.iov_len, 0));
+	return res;
+}
 
-		if (size && i->iov_offset && allocated(&i->pipe->bufs[i->head & p_mask]))
-			return size | i->iov_offset;
-		return size;
-	}
-	iterate_all_kinds(i, size, v,
-		(res |= (unsigned long)v.iov_base | v.iov_len, 0),
-		res |= v.bv_offset | v.bv_len,
-		res |= (unsigned long)v.iov_base | v.iov_len
-	)
+static unsigned long bvec_alignment(const struct iov_iter *i)
+{
+	unsigned long res = 0;
+	size_t size = i->count;
+
+	iterate_over_bvec(i, size, v,
+		res |= v.bv_offset | v.bv_len);
 	return res;
 }
 
+static unsigned long kvec_alignment(const struct iov_iter *i)
+{
+	unsigned long res = 0;
+	size_t size = i->count;
+
+	iterate_over_kvec(i, size, v,
+		res |= (unsigned long)v.iov_base | v.iov_len);
+	return res;
+}
+
+static unsigned long pipe_alignment(const struct iov_iter *i)
+{
+	size_t size = i->count;
+	unsigned int p_mask = i->pipe->ring_size - 1;
+
+	if (size && i->iov_offset && allocated(&i->pipe->bufs[i->head & p_mask]))
+		return size | i->iov_offset;
+	return size;
+}
+
+static unsigned long no_alignment(const struct iov_iter *i)
+{
+	return 0;
+}
+
 static unsigned long xxx_gap_alignment(const struct iov_iter *i)
 {
 	unsigned long res = 0;
@@ -2134,7 +2159,7 @@ static const struct iov_iter_ops iovec_iter_ops = {
 	.csum_and_copy_from_iter_full	= xxx_csum_and_copy_from_iter_full,
 
 	.zero				= iovec_zero,
-	.alignment			= xxx_alignment,
+	.alignment			= iovec_alignment,
 	.gap_alignment			= xxx_gap_alignment,
 	.get_pages			= xxx_get_pages,
 	.get_pages_alloc		= xxx_get_pages_alloc,
@@ -2168,7 +2193,7 @@ static const struct iov_iter_ops kvec_iter_ops = {
 	.csum_and_copy_from_iter_full	= xxx_csum_and_copy_from_iter_full,
 
 	.zero				= kvec_zero,
-	.alignment			= xxx_alignment,
+	.alignment			= kvec_alignment,
 	.gap_alignment			= xxx_gap_alignment,
 	.get_pages			= xxx_get_pages,
 	.get_pages_alloc		= xxx_get_pages_alloc,
@@ -2202,7 +2227,7 @@ static const struct iov_iter_ops bvec_iter_ops = {
 	.csum_and_copy_from_iter_full	= xxx_csum_and_copy_from_iter_full,
 
 	.zero				= bvec_zero,
-	.alignment			= xxx_alignment,
+	.alignment			= bvec_alignment,
 	.gap_alignment			= xxx_gap_alignment,
 	.get_pages			= xxx_get_pages,
 	.get_pages_alloc		= xxx_get_pages_alloc,
@@ -2236,7 +2261,7 @@ static const struct iov_iter_ops pipe_iter_ops = {
 	.csum_and_copy_from_iter_full	= xxx_csum_and_copy_from_iter_full,
 
 	.zero				= pipe_zero,
-	.alignment			= xxx_alignment,
+	.alignment			= pipe_alignment,
 	.gap_alignment			= xxx_gap_alignment,
 	.get_pages			= xxx_get_pages,
 	.get_pages_alloc		= xxx_get_pages_alloc,
@@ -2270,7 +2295,7 @@ static const struct iov_iter_ops discard_iter_ops = {
 	.csum_and_copy_from_iter_full	= xxx_csum_and_copy_from_iter_full,
 
 	.zero				= discard_zero,
-	.alignment			= xxx_alignment,
+	.alignment			= no_alignment,
 	.gap_alignment			= xxx_gap_alignment,
 	.get_pages			= xxx_get_pages,
 	.get_pages_alloc		= xxx_get_pages_alloc,



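[Editorial note] The iovec/kvec/bvec alignment functions above all use the same trick: OR every segment's base address and length into an accumulator, so any low bit set in the result marks an alignment the iterator cannot guarantee. A self-contained userspace model, with `aseg` as an illustrative segment type:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical segment descriptor for illustration */
struct aseg {
	uintptr_t base;		/* segment start address */
	size_t len;		/* segment length */
};

unsigned long segs_alignment(const struct aseg *seg, size_t nsegs)
{
	unsigned long res = 0;

	/* an unaligned base or an odd-sized length anywhere in the
	 * iterator pollutes the low bits of the result */
	for (size_t k = 0; k < nsegs; k++)
		res |= (unsigned long)seg[k].base | seg[k].len;
	return res;
}
```

A caller can then test `res & (want - 1)` for a power-of-two `want` to decide whether, say, direct I/O alignment constraints hold across every segment at once.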

* [PATCH 20/29] iov_iter: Split iov_iter_gap_alignment()
  2020-11-21 14:13 [PATCH 00/29] RFC: iov_iter: Switch to using an ops table David Howells
                   ` (18 preceding siblings ...)
  2020-11-21 14:15 ` [PATCH 19/29] iov_iter: Split iov_iter_alignment() David Howells
@ 2020-11-21 14:15 ` David Howells
  2020-11-21 14:16 ` [PATCH 21/29] iov_iter: Split iov_iter_get_pages() David Howells
                   ` (11 subsequent siblings)
  31 siblings, 0 replies; 55+ messages in thread
From: David Howells @ 2020-11-21 14:15 UTC (permalink / raw)
  To: Pavel Begunkov, Matthew Wilcox, Jens Axboe, Alexander Viro
  Cc: dhowells, Linus Torvalds, linux-fsdevel, linux-block, linux-kernel

Split iov_iter_gap_alignment() by type.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 lib/iov_iter.c |   50 ++++++++++++++++++++++++++++++++++----------------
 1 file changed, 34 insertions(+), 16 deletions(-)

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index d2a66e951995..5744ddec854f 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1542,27 +1542,45 @@ static unsigned long no_alignment(const struct iov_iter *i)
 	return 0;
 }
 
-static unsigned long xxx_gap_alignment(const struct iov_iter *i)
+static unsigned long iovec_gap_alignment(const struct iov_iter *i)
 {
 	unsigned long res = 0;
 	size_t size = i->count;
 
-	if (unlikely(iov_iter_is_pipe(i) || iov_iter_is_discard(i))) {
-		WARN_ON(1);
-		return ~0U;
-	}
-
-	iterate_all_kinds(i, size, v,
+	iterate_over_iovec(i, size, v,
 		(res |= (!res ? 0 : (unsigned long)v.iov_base) |
-			(size != v.iov_len ? size : 0), 0),
+			(size != v.iov_len ? size : 0), 0));
+	return res;
+}
+
+static unsigned long bvec_gap_alignment(const struct iov_iter *i)
+{
+	unsigned long res = 0;
+	size_t size = i->count;
+
+	iterate_over_bvec(i, size, v,
 		(res |= (!res ? 0 : (unsigned long)v.bv_offset) |
-			(size != v.bv_len ? size : 0)),
+			(size != v.bv_len ? size : 0)));
+	return res;
+}
+
+static unsigned long kvec_gap_alignment(const struct iov_iter *i)
+{
+	unsigned long res = 0;
+	size_t size = i->count;
+
+	iterate_over_kvec(i, size, v,
 		(res |= (!res ? 0 : (unsigned long)v.iov_base) |
-			(size != v.iov_len ? size : 0))
-		);
+			(size != v.iov_len ? size : 0)));
 	return res;
 }
 
+static unsigned long no_gap_alignment(const struct iov_iter *i)
+{
+	WARN_ON(1);
+	return ~0U;
+}
+
 static inline ssize_t __pipe_get_pages(struct iov_iter *i,
 				size_t maxsize,
 				struct page **pages,
@@ -2160,7 +2178,7 @@ static const struct iov_iter_ops iovec_iter_ops = {
 
 	.zero				= iovec_zero,
 	.alignment			= iovec_alignment,
-	.gap_alignment			= xxx_gap_alignment,
+	.gap_alignment			= iovec_gap_alignment,
 	.get_pages			= xxx_get_pages,
 	.get_pages_alloc		= xxx_get_pages_alloc,
 	.npages				= xxx_npages,
@@ -2194,7 +2212,7 @@ static const struct iov_iter_ops kvec_iter_ops = {
 
 	.zero				= kvec_zero,
 	.alignment			= kvec_alignment,
-	.gap_alignment			= xxx_gap_alignment,
+	.gap_alignment			= kvec_gap_alignment,
 	.get_pages			= xxx_get_pages,
 	.get_pages_alloc		= xxx_get_pages_alloc,
 	.npages				= xxx_npages,
@@ -2228,7 +2246,7 @@ static const struct iov_iter_ops bvec_iter_ops = {
 
 	.zero				= bvec_zero,
 	.alignment			= bvec_alignment,
-	.gap_alignment			= xxx_gap_alignment,
+	.gap_alignment			= bvec_gap_alignment,
 	.get_pages			= xxx_get_pages,
 	.get_pages_alloc		= xxx_get_pages_alloc,
 	.npages				= xxx_npages,
@@ -2262,7 +2280,7 @@ static const struct iov_iter_ops pipe_iter_ops = {
 
 	.zero				= pipe_zero,
 	.alignment			= pipe_alignment,
-	.gap_alignment			= xxx_gap_alignment,
+	.gap_alignment			= no_gap_alignment,
 	.get_pages			= xxx_get_pages,
 	.get_pages_alloc		= xxx_get_pages_alloc,
 	.npages				= xxx_npages,
@@ -2296,7 +2314,7 @@ static const struct iov_iter_ops discard_iter_ops = {
 
 	.zero				= discard_zero,
 	.alignment			= no_alignment,
-	.gap_alignment			= xxx_gap_alignment,
+	.gap_alignment			= no_gap_alignment,
 	.get_pages			= xxx_get_pages,
 	.get_pages_alloc		= xxx_get_pages_alloc,
 	.npages				= xxx_npages,



* [PATCH 21/29] iov_iter: Split iov_iter_get_pages()
  2020-11-21 14:13 [PATCH 00/29] RFC: iov_iter: Switch to using an ops table David Howells
                   ` (19 preceding siblings ...)
  2020-11-21 14:15 ` [PATCH 20/29] iov_iter: Split iov_iter_gap_alignment() David Howells
@ 2020-11-21 14:16 ` David Howells
  2020-11-21 14:16 ` [PATCH 22/29] iov_iter: Split iov_iter_get_pages_alloc() David Howells
                   ` (10 subsequent siblings)
  31 siblings, 0 replies; 55+ messages in thread
From: David Howells @ 2020-11-21 14:16 UTC (permalink / raw)
  To: Pavel Begunkov, Matthew Wilcox, Jens Axboe, Alexander Viro
  Cc: dhowells, Linus Torvalds, linux-fsdevel, linux-block, linux-kernel

Split iov_iter_get_pages() by type.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 lib/iov_iter.c |   46 +++++++++++++++++++++++++++++-----------------
 1 file changed, 29 insertions(+), 17 deletions(-)

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 5744ddec854f..a2de201b947f 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1611,6 +1611,8 @@ static ssize_t pipe_get_pages(struct iov_iter *i,
 	unsigned int iter_head, npages;
 	size_t capacity;
 
+	if (maxsize > i->count)
+		maxsize = i->count;
 	if (!maxsize)
 		return 0;
 
@@ -1625,19 +1627,14 @@ static ssize_t pipe_get_pages(struct iov_iter *i,
 	return __pipe_get_pages(i, min(maxsize, capacity), pages, iter_head, start);
 }
 
-static ssize_t xxx_get_pages(struct iov_iter *i,
+static ssize_t iovec_get_pages(struct iov_iter *i,
 		   struct page **pages, size_t maxsize, unsigned maxpages,
 		   size_t *start)
 {
 	if (maxsize > i->count)
 		maxsize = i->count;
 
-	if (unlikely(iov_iter_is_pipe(i)))
-		return pipe_get_pages(i, pages, maxsize, maxpages, start);
-	if (unlikely(iov_iter_is_discard(i)))
-		return -EFAULT;
-
-	iterate_all_kinds(i, maxsize, v, ({
+	iterate_over_iovec(i, maxsize, v, ({
 		unsigned long addr = (unsigned long)v.iov_base;
 		size_t len = v.iov_len + (*start = addr & (PAGE_SIZE - 1));
 		int n;
@@ -1653,18 +1650,33 @@ static ssize_t xxx_get_pages(struct iov_iter *i,
 		if (unlikely(res < 0))
 			return res;
 		return (res == n ? len : res * PAGE_SIZE) - *start;
-	0;}),({
+	0;}));
+	return 0;
+}
+
+static ssize_t bvec_get_pages(struct iov_iter *i,
+		   struct page **pages, size_t maxsize, unsigned maxpages,
+		   size_t *start)
+{
+	if (maxsize > i->count)
+		maxsize = i->count;
+
+	iterate_over_bvec(i, maxsize, v, ({
 		/* can't be more than PAGE_SIZE */
 		*start = v.bv_offset;
 		get_page(*pages = v.bv_page);
 		return v.bv_len;
-	}),({
-		return -EFAULT;
-	})
-	)
+	}));
 	return 0;
 }
 
+static ssize_t no_get_pages(struct iov_iter *i,
+		   struct page **pages, size_t maxsize, unsigned maxpages,
+		   size_t *start)
+{
+	return -EFAULT;
+}
+
 static struct page **get_pages_array(size_t n)
 {
 	return kvmalloc_array(n, sizeof(struct page *), GFP_KERNEL);
@@ -2179,7 +2191,7 @@ static const struct iov_iter_ops iovec_iter_ops = {
 	.zero				= iovec_zero,
 	.alignment			= iovec_alignment,
 	.gap_alignment			= iovec_gap_alignment,
-	.get_pages			= xxx_get_pages,
+	.get_pages			= iovec_get_pages,
 	.get_pages_alloc		= xxx_get_pages_alloc,
 	.npages				= xxx_npages,
 	.dup_iter			= xxx_dup_iter,
@@ -2213,7 +2225,7 @@ static const struct iov_iter_ops kvec_iter_ops = {
 	.zero				= kvec_zero,
 	.alignment			= kvec_alignment,
 	.gap_alignment			= kvec_gap_alignment,
-	.get_pages			= xxx_get_pages,
+	.get_pages			= no_get_pages,
 	.get_pages_alloc		= xxx_get_pages_alloc,
 	.npages				= xxx_npages,
 	.dup_iter			= xxx_dup_iter,
@@ -2247,7 +2259,7 @@ static const struct iov_iter_ops bvec_iter_ops = {
 	.zero				= bvec_zero,
 	.alignment			= bvec_alignment,
 	.gap_alignment			= bvec_gap_alignment,
-	.get_pages			= xxx_get_pages,
+	.get_pages			= bvec_get_pages,
 	.get_pages_alloc		= xxx_get_pages_alloc,
 	.npages				= xxx_npages,
 	.dup_iter			= xxx_dup_iter,
@@ -2281,7 +2293,7 @@ static const struct iov_iter_ops pipe_iter_ops = {
 	.zero				= pipe_zero,
 	.alignment			= pipe_alignment,
 	.gap_alignment			= no_gap_alignment,
-	.get_pages			= xxx_get_pages,
+	.get_pages			= pipe_get_pages,
 	.get_pages_alloc		= xxx_get_pages_alloc,
 	.npages				= xxx_npages,
 	.dup_iter			= xxx_dup_iter,
@@ -2315,7 +2327,7 @@ static const struct iov_iter_ops discard_iter_ops = {
 	.zero				= discard_zero,
 	.alignment			= no_alignment,
 	.gap_alignment			= no_gap_alignment,
-	.get_pages			= xxx_get_pages,
+	.get_pages			= no_get_pages,
 	.get_pages_alloc		= xxx_get_pages_alloc,
 	.npages				= xxx_npages,
 	.dup_iter			= xxx_dup_iter,




* [PATCH 22/29] iov_iter: Split iov_iter_get_pages_alloc()
  2020-11-21 14:13 [PATCH 00/29] RFC: iov_iter: Switch to using an ops table David Howells
                   ` (20 preceding siblings ...)
  2020-11-21 14:16 ` [PATCH 21/29] iov_iter: Split iov_iter_get_pages() David Howells
@ 2020-11-21 14:16 ` David Howells
  2020-11-21 14:16 ` [PATCH 23/29] iov_iter: Split csum_and_copy_from_iter() David Howells
                   ` (9 subsequent siblings)
  31 siblings, 0 replies; 55+ messages in thread
From: David Howells @ 2020-11-21 14:16 UTC (permalink / raw)
  To: Pavel Begunkov, Matthew Wilcox, Jens Axboe, Alexander Viro
  Cc: dhowells, Linus Torvalds, linux-fsdevel, linux-block, linux-kernel

Split iov_iter_get_pages_alloc() by type.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 lib/iov_iter.c |   48 +++++++++++++++++++++++++++++++-----------------
 1 file changed, 31 insertions(+), 17 deletions(-)

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index a2de201b947f..a038bfbbbd53 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1690,6 +1690,8 @@ static ssize_t pipe_get_pages_alloc(struct iov_iter *i,
 	unsigned int iter_head, npages;
 	ssize_t n;
 
+	if (maxsize > i->count)
+		maxsize = i->count;
 	if (!maxsize)
 		return 0;
 
@@ -1715,7 +1717,7 @@ static ssize_t pipe_get_pages_alloc(struct iov_iter *i,
 	return n;
 }
 
-static ssize_t xxx_get_pages_alloc(struct iov_iter *i,
+static ssize_t iovec_get_pages_alloc(struct iov_iter *i,
 		   struct page ***pages, size_t maxsize,
 		   size_t *start)
 {
@@ -1724,12 +1726,7 @@ static ssize_t xxx_get_pages_alloc(struct iov_iter *i,
 	if (maxsize > i->count)
 		maxsize = i->count;
 
-	if (unlikely(iov_iter_is_pipe(i)))
-		return pipe_get_pages_alloc(i, pages, maxsize, start);
-	if (unlikely(iov_iter_is_discard(i)))
-		return -EFAULT;
-
-	iterate_all_kinds(i, maxsize, v, ({
+	iterate_over_iovec(i, maxsize, v, ({
 		unsigned long addr = (unsigned long)v.iov_base;
 		size_t len = v.iov_len + (*start = addr & (PAGE_SIZE - 1));
 		int n;
@@ -1748,7 +1745,20 @@ static ssize_t xxx_get_pages_alloc(struct iov_iter *i,
 		}
 		*pages = p;
 		return (res == n ? len : res * PAGE_SIZE) - *start;
-	0;}),({
+	0;}));
+	return 0;
+}
+
+static ssize_t bvec_get_pages_alloc(struct iov_iter *i,
+		   struct page ***pages, size_t maxsize,
+		   size_t *start)
+{
+	struct page **p;
+
+	if (maxsize > i->count)
+		maxsize = i->count;
+
+	iterate_over_bvec(i, maxsize, v, ({
 		/* can't be more than PAGE_SIZE */
 		*start = v.bv_offset;
 		*pages = p = get_pages_array(1);
@@ -1756,13 +1766,17 @@ static ssize_t xxx_get_pages_alloc(struct iov_iter *i,
 			return -ENOMEM;
 		get_page(*p = v.bv_page);
 		return v.bv_len;
-	}),({
-		return -EFAULT;
-	})
-	)
+	}));
 	return 0;
 }
 
+static ssize_t no_get_pages_alloc(struct iov_iter *i,
+		   struct page ***pages, size_t maxsize,
+		   size_t *start)
+{
+	return -EFAULT;
+}
+
 static size_t xxx_csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum,
 			       struct iov_iter *i)
 {
@@ -2192,7 +2206,7 @@ static const struct iov_iter_ops iovec_iter_ops = {
 	.alignment			= iovec_alignment,
 	.gap_alignment			= iovec_gap_alignment,
 	.get_pages			= iovec_get_pages,
-	.get_pages_alloc		= xxx_get_pages_alloc,
+	.get_pages_alloc		= iovec_get_pages_alloc,
 	.npages				= xxx_npages,
 	.dup_iter			= xxx_dup_iter,
 	.for_each_range			= xxx_for_each_range,
@@ -2226,7 +2240,7 @@ static const struct iov_iter_ops kvec_iter_ops = {
 	.alignment			= kvec_alignment,
 	.gap_alignment			= kvec_gap_alignment,
 	.get_pages			= no_get_pages,
-	.get_pages_alloc		= xxx_get_pages_alloc,
+	.get_pages_alloc		= no_get_pages_alloc,
 	.npages				= xxx_npages,
 	.dup_iter			= xxx_dup_iter,
 	.for_each_range			= xxx_for_each_range,
@@ -2260,7 +2274,7 @@ static const struct iov_iter_ops bvec_iter_ops = {
 	.alignment			= bvec_alignment,
 	.gap_alignment			= bvec_gap_alignment,
 	.get_pages			= bvec_get_pages,
-	.get_pages_alloc		= xxx_get_pages_alloc,
+	.get_pages_alloc		= bvec_get_pages_alloc,
 	.npages				= xxx_npages,
 	.dup_iter			= xxx_dup_iter,
 	.for_each_range			= xxx_for_each_range,
@@ -2294,7 +2308,7 @@ static const struct iov_iter_ops pipe_iter_ops = {
 	.alignment			= pipe_alignment,
 	.gap_alignment			= no_gap_alignment,
 	.get_pages			= pipe_get_pages,
-	.get_pages_alloc		= xxx_get_pages_alloc,
+	.get_pages_alloc		= pipe_get_pages_alloc,
 	.npages				= xxx_npages,
 	.dup_iter			= xxx_dup_iter,
 	.for_each_range			= xxx_for_each_range,
@@ -2328,7 +2342,7 @@ static const struct iov_iter_ops discard_iter_ops = {
 	.alignment			= no_alignment,
 	.gap_alignment			= no_gap_alignment,
 	.get_pages			= no_get_pages,
-	.get_pages_alloc		= xxx_get_pages_alloc,
+	.get_pages_alloc		= no_get_pages_alloc,
 	.npages				= xxx_npages,
 	.dup_iter			= xxx_dup_iter,
 	.for_each_range			= xxx_for_each_range,




* [PATCH 23/29] iov_iter: Split csum_and_copy_from_iter()
  2020-11-21 14:13 [PATCH 00/29] RFC: iov_iter: Switch to using an ops table David Howells
                   ` (21 preceding siblings ...)
  2020-11-21 14:16 ` [PATCH 22/29] iov_iter: Split iov_iter_get_pages_alloc() David Howells
@ 2020-11-21 14:16 ` David Howells
  2020-11-21 14:16 ` [PATCH 24/29] iov_iter: Split csum_and_copy_from_iter_full() David Howells
                   ` (8 subsequent siblings)
  31 siblings, 0 replies; 55+ messages in thread
From: David Howells @ 2020-11-21 14:16 UTC (permalink / raw)
  To: Pavel Begunkov, Matthew Wilcox, Jens Axboe, Alexander Viro
  Cc: dhowells, Linus Torvalds, linux-fsdevel, linux-block, linux-kernel

Split csum_and_copy_from_iter() by type.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 lib/iov_iter.c |   56 +++++++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 41 insertions(+), 15 deletions(-)

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index a038bfbbbd53..1f596cffddf9 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1777,18 +1777,14 @@ static ssize_t no_get_pages_alloc(struct iov_iter *i,
 	return -EFAULT;
 }
 
-static size_t xxx_csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum,
+static size_t iovec_csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum,
 			       struct iov_iter *i)
 {
 	char *to = addr;
 	__wsum sum, next;
 	size_t off = 0;
 	sum = *csum;
-	if (unlikely(iov_iter_is_pipe(i) || iov_iter_is_discard(i))) {
-		WARN_ON(1);
-		return 0;
-	}
-	iterate_and_advance(i, bytes, v, ({
+	iterate_and_advance_iovec(i, bytes, v, ({
 		next = csum_and_copy_from_user(v.iov_base,
 					       (to += v.iov_len) - v.iov_len,
 					       v.iov_len);
@@ -1797,24 +1793,54 @@ static size_t xxx_csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum
 			off += v.iov_len;
 		}
 		next ? 0 : v.iov_len;
-	}), ({
+	}));
+	*csum = sum;
+	return bytes;
+}
+
+static size_t bvec_csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum,
+			       struct iov_iter *i)
+{
+	char *to = addr;
+	__wsum sum;
+	size_t off = 0;
+	sum = *csum;
+	iterate_and_advance_bvec(i, bytes, v, ({
 		char *p = kmap_atomic(v.bv_page);
 		sum = csum_and_memcpy((to += v.bv_len) - v.bv_len,
 				      p + v.bv_offset, v.bv_len,
 				      sum, off);
 		kunmap_atomic(p);
 		off += v.bv_len;
-	}),({
+	}));
+	*csum = sum;
+	return bytes;
+}
+
+static size_t kvec_csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum,
+			       struct iov_iter *i)
+{
+	char *to = addr;
+	__wsum sum;
+	size_t off = 0;
+	sum = *csum;
+	iterate_and_advance_kvec(i, bytes, v, ({
 		sum = csum_and_memcpy((to += v.iov_len) - v.iov_len,
 				      v.iov_base, v.iov_len,
 				      sum, off);
 		off += v.iov_len;
-	})
-	)
+	}));
 	*csum = sum;
 	return bytes;
 }
 
+static size_t no_csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum,
+			       struct iov_iter *i)
+{
+	WARN_ON(1);
+	return 0;
+}
+
 static bool xxx_csum_and_copy_from_iter_full(void *addr, size_t bytes, __wsum *csum,
 			       struct iov_iter *i)
 {
@@ -2199,7 +2225,7 @@ static const struct iov_iter_ops iovec_iter_ops = {
 	.copy_mc_to_iter		= iovec_copy_mc_to_iter,
 #endif
 	.csum_and_copy_to_iter		= xxx_csum_and_copy_to_iter,
-	.csum_and_copy_from_iter	= xxx_csum_and_copy_from_iter,
+	.csum_and_copy_from_iter	= iovec_csum_and_copy_from_iter,
 	.csum_and_copy_from_iter_full	= xxx_csum_and_copy_from_iter_full,
 
 	.zero				= iovec_zero,
@@ -2233,7 +2259,7 @@ static const struct iov_iter_ops kvec_iter_ops = {
 	.copy_mc_to_iter		= kvec_copy_mc_to_iter,
 #endif
 	.csum_and_copy_to_iter		= xxx_csum_and_copy_to_iter,
-	.csum_and_copy_from_iter	= xxx_csum_and_copy_from_iter,
+	.csum_and_copy_from_iter	= kvec_csum_and_copy_from_iter,
 	.csum_and_copy_from_iter_full	= xxx_csum_and_copy_from_iter_full,
 
 	.zero				= kvec_zero,
@@ -2267,7 +2293,7 @@ static const struct iov_iter_ops bvec_iter_ops = {
 	.copy_mc_to_iter		= bvec_copy_mc_to_iter,
 #endif
 	.csum_and_copy_to_iter		= xxx_csum_and_copy_to_iter,
-	.csum_and_copy_from_iter	= xxx_csum_and_copy_from_iter,
+	.csum_and_copy_from_iter	= bvec_csum_and_copy_from_iter,
 	.csum_and_copy_from_iter_full	= xxx_csum_and_copy_from_iter_full,
 
 	.zero				= bvec_zero,
@@ -2301,7 +2327,7 @@ static const struct iov_iter_ops pipe_iter_ops = {
 	.copy_mc_to_iter		= pipe_copy_mc_to_iter,
 #endif
 	.csum_and_copy_to_iter		= xxx_csum_and_copy_to_iter,
-	.csum_and_copy_from_iter	= xxx_csum_and_copy_from_iter,
+	.csum_and_copy_from_iter	= no_csum_and_copy_from_iter,
 	.csum_and_copy_from_iter_full	= xxx_csum_and_copy_from_iter_full,
 
 	.zero				= pipe_zero,
@@ -2335,7 +2361,7 @@ static const struct iov_iter_ops discard_iter_ops = {
 	.copy_mc_to_iter		= discard_copy_to_iter,
 #endif
 	.csum_and_copy_to_iter		= xxx_csum_and_copy_to_iter,
-	.csum_and_copy_from_iter	= xxx_csum_and_copy_from_iter,
+	.csum_and_copy_from_iter	= no_csum_and_copy_from_iter,
 	.csum_and_copy_from_iter_full	= xxx_csum_and_copy_from_iter_full,
 
 	.zero				= discard_zero,




* [PATCH 24/29] iov_iter: Split csum_and_copy_from_iter_full()
  2020-11-21 14:13 [PATCH 00/29] RFC: iov_iter: Switch to using an ops table David Howells
                   ` (22 preceding siblings ...)
  2020-11-21 14:16 ` [PATCH 23/29] iov_iter: Split csum_and_copy_from_iter() David Howells
@ 2020-11-21 14:16 ` David Howells
  2020-11-21 14:16 ` [PATCH 25/29] iov_iter: Split csum_and_copy_to_iter() David Howells
                   ` (7 subsequent siblings)
  31 siblings, 0 replies; 55+ messages in thread
From: David Howells @ 2020-11-21 14:16 UTC (permalink / raw)
  To: Pavel Begunkov, Matthew Wilcox, Jens Axboe, Alexander Viro
  Cc: dhowells, Linus Torvalds, linux-fsdevel, linux-block, linux-kernel

Split csum_and_copy_from_iter_full() by type.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 lib/iov_iter.c |   62 ++++++++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 47 insertions(+), 15 deletions(-)

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 1f596cffddf9..8820a9e72815 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1841,20 +1841,16 @@ static size_t no_csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum,
 	return 0;
 }
 
-static bool xxx_csum_and_copy_from_iter_full(void *addr, size_t bytes, __wsum *csum,
+static bool iovec_csum_and_copy_from_iter_full(void *addr, size_t bytes, __wsum *csum,
 			       struct iov_iter *i)
 {
 	char *to = addr;
 	__wsum sum, next;
 	size_t off = 0;
 	sum = *csum;
-	if (unlikely(iov_iter_is_pipe(i) || iov_iter_is_discard(i))) {
-		WARN_ON(1);
-		return false;
-	}
 	if (unlikely(i->count < bytes))
 		return false;
-	iterate_all_kinds(i, bytes, v, ({
+	iterate_over_iovec(i, bytes, v, ({
 		next = csum_and_copy_from_user(v.iov_base,
 					       (to += v.iov_len) - v.iov_len,
 					       v.iov_len);
@@ -1863,25 +1859,61 @@ static bool xxx_csum_and_copy_from_iter_full(void *addr, size_t bytes, __wsum *c
 		sum = csum_block_add(sum, next, off);
 		off += v.iov_len;
 		0;
-	}), ({
+	}));
+	*csum = sum;
+	iov_iter_advance(i, bytes);
+	return true;
+}
+
+static bool bvec_csum_and_copy_from_iter_full(void *addr, size_t bytes, __wsum *csum,
+			       struct iov_iter *i)
+{
+	char *to = addr;
+	__wsum sum;
+	size_t off = 0;
+	sum = *csum;
+	if (unlikely(i->count < bytes))
+		return false;
+	iterate_over_bvec(i, bytes, v, ({
 		char *p = kmap_atomic(v.bv_page);
 		sum = csum_and_memcpy((to += v.bv_len) - v.bv_len,
 				      p + v.bv_offset, v.bv_len,
 				      sum, off);
 		kunmap_atomic(p);
 		off += v.bv_len;
-	}),({
+	}));
+	*csum = sum;
+	iov_iter_advance(i, bytes);
+	return true;
+}
+
+static bool kvec_csum_and_copy_from_iter_full(void *addr, size_t bytes, __wsum *csum,
+			       struct iov_iter *i)
+{
+	char *to = addr;
+	__wsum sum;
+	size_t off = 0;
+	sum = *csum;
+	if (unlikely(i->count < bytes))
+		return false;
+	iterate_over_kvec(i, bytes, v, ({
 		sum = csum_and_memcpy((to += v.iov_len) - v.iov_len,
 				      v.iov_base, v.iov_len,
 				      sum, off);
 		off += v.iov_len;
-	})
-	)
+	}));
 	*csum = sum;
 	iov_iter_advance(i, bytes);
 	return true;
 }
 
+static bool no_csum_and_copy_from_iter_full(void *addr, size_t bytes, __wsum *csum,
+			       struct iov_iter *i)
+{
+	WARN_ON(1);
+	return false;
+}
+
 static size_t xxx_csum_and_copy_to_iter(const void *addr, size_t bytes, void *csump,
 			     struct iov_iter *i)
 {
@@ -2226,7 +2258,7 @@ static const struct iov_iter_ops iovec_iter_ops = {
 #endif
 	.csum_and_copy_to_iter		= xxx_csum_and_copy_to_iter,
 	.csum_and_copy_from_iter	= iovec_csum_and_copy_from_iter,
-	.csum_and_copy_from_iter_full	= xxx_csum_and_copy_from_iter_full,
+	.csum_and_copy_from_iter_full	= iovec_csum_and_copy_from_iter_full,
 
 	.zero				= iovec_zero,
 	.alignment			= iovec_alignment,
@@ -2260,7 +2292,7 @@ static const struct iov_iter_ops kvec_iter_ops = {
 #endif
 	.csum_and_copy_to_iter		= xxx_csum_and_copy_to_iter,
 	.csum_and_copy_from_iter	= kvec_csum_and_copy_from_iter,
-	.csum_and_copy_from_iter_full	= xxx_csum_and_copy_from_iter_full,
+	.csum_and_copy_from_iter_full	= kvec_csum_and_copy_from_iter_full,
 
 	.zero				= kvec_zero,
 	.alignment			= kvec_alignment,
@@ -2294,7 +2326,7 @@ static const struct iov_iter_ops bvec_iter_ops = {
 #endif
 	.csum_and_copy_to_iter		= xxx_csum_and_copy_to_iter,
 	.csum_and_copy_from_iter	= bvec_csum_and_copy_from_iter,
-	.csum_and_copy_from_iter_full	= xxx_csum_and_copy_from_iter_full,
+	.csum_and_copy_from_iter_full	= bvec_csum_and_copy_from_iter_full,
 
 	.zero				= bvec_zero,
 	.alignment			= bvec_alignment,
@@ -2328,7 +2360,7 @@ static const struct iov_iter_ops pipe_iter_ops = {
 #endif
 	.csum_and_copy_to_iter		= xxx_csum_and_copy_to_iter,
 	.csum_and_copy_from_iter	= no_csum_and_copy_from_iter,
-	.csum_and_copy_from_iter_full	= xxx_csum_and_copy_from_iter_full,
+	.csum_and_copy_from_iter_full	= no_csum_and_copy_from_iter_full,
 
 	.zero				= pipe_zero,
 	.alignment			= pipe_alignment,
@@ -2362,7 +2394,7 @@ static const struct iov_iter_ops discard_iter_ops = {
 #endif
 	.csum_and_copy_to_iter		= xxx_csum_and_copy_to_iter,
 	.csum_and_copy_from_iter	= no_csum_and_copy_from_iter,
-	.csum_and_copy_from_iter_full	= xxx_csum_and_copy_from_iter_full,
+	.csum_and_copy_from_iter_full	= no_csum_and_copy_from_iter_full,
 
 	.zero				= discard_zero,
 	.alignment			= no_alignment,




* [PATCH 25/29] iov_iter: Split csum_and_copy_to_iter()
  2020-11-21 14:13 [PATCH 00/29] RFC: iov_iter: Switch to using an ops table David Howells
                   ` (23 preceding siblings ...)
  2020-11-21 14:16 ` [PATCH 24/29] iov_iter: Split csum_and_copy_from_iter_full() David Howells
@ 2020-11-21 14:16 ` David Howells
  2020-11-21 14:16 ` [PATCH 26/29] iov_iter: Split iov_iter_npages() David Howells
                   ` (6 subsequent siblings)
  31 siblings, 0 replies; 55+ messages in thread
From: David Howells @ 2020-11-21 14:16 UTC (permalink / raw)
  To: Pavel Begunkov, Matthew Wilcox, Jens Axboe, Alexander Viro
  Cc: dhowells, Linus Torvalds, linux-fsdevel, linux-block, linux-kernel

Split csum_and_copy_to_iter() by type.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 lib/iov_iter.c |   68 ++++++++++++++++++++++++++++++++++++++++----------------
 1 file changed, 48 insertions(+), 20 deletions(-)

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 8820a9e72815..2f8019e3b09a 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -698,14 +698,15 @@ static __wsum csum_and_memcpy(void *to, const void *from, size_t len,
 	return csum_block_add(sum, next, off);
 }
 
-static size_t csum_and_copy_to_pipe_iter(const void *addr, size_t bytes,
-				__wsum *csum, struct iov_iter *i)
+static size_t pipe_csum_and_copy_to_iter(const void *addr, size_t bytes,
+				void *csump, struct iov_iter *i)
 {
 	struct pipe_inode_info *pipe = i->pipe;
 	unsigned int p_mask = pipe->ring_size - 1;
 	unsigned int i_head;
 	size_t n, r;
 	size_t off = 0;
+	__wsum *csum = csump;
 	__wsum sum = *csum;
 
 	if (!sanity(i))
@@ -1914,7 +1915,7 @@ static bool no_csum_and_copy_from_iter_full(void *addr, size_t bytes, __wsum *cs
 	return false;
 }
 
-static size_t xxx_csum_and_copy_to_iter(const void *addr, size_t bytes, void *csump,
+static size_t iovec_csum_and_copy_to_iter(const void *addr, size_t bytes, void *csump,
 			     struct iov_iter *i)
 {
 	const char *from = addr;
@@ -1922,15 +1923,8 @@ static size_t xxx_csum_and_copy_to_iter(const void *addr, size_t bytes, void *cs
 	__wsum sum, next;
 	size_t off = 0;
 
-	if (unlikely(iov_iter_is_pipe(i)))
-		return csum_and_copy_to_pipe_iter(addr, bytes, csum, i);
-
 	sum = *csum;
-	if (unlikely(iov_iter_is_discard(i))) {
-		WARN_ON(1);	/* for now */
-		return 0;
-	}
-	iterate_and_advance(i, bytes, v, ({
+	iterate_and_advance_iovec(i, bytes, v, ({
 		next = csum_and_copy_to_user((from += v.iov_len) - v.iov_len,
 					     v.iov_base,
 					     v.iov_len);
@@ -1939,24 +1933,58 @@ static size_t xxx_csum_and_copy_to_iter(const void *addr, size_t bytes, void *cs
 			off += v.iov_len;
 		}
 		next ? 0 : v.iov_len;
-	}), ({
+	}));
+	*csum = sum;
+	return bytes;
+}
+
+static size_t bvec_csum_and_copy_to_iter(const void *addr, size_t bytes, void *csump,
+			     struct iov_iter *i)
+{
+	const char *from = addr;
+	__wsum *csum = csump;
+	__wsum sum;
+	size_t off = 0;
+
+	sum = *csum;
+	iterate_and_advance_bvec(i, bytes, v, ({
 		char *p = kmap_atomic(v.bv_page);
 		sum = csum_and_memcpy(p + v.bv_offset,
 				      (from += v.bv_len) - v.bv_len,
 				      v.bv_len, sum, off);
 		kunmap_atomic(p);
 		off += v.bv_len;
-	}),({
+	}));
+	*csum = sum;
+	return bytes;
+}
+
+static size_t kvec_csum_and_copy_to_iter(const void *addr, size_t bytes, void *csump,
+			     struct iov_iter *i)
+{
+	const char *from = addr;
+	__wsum *csum = csump;
+	__wsum sum;
+	size_t off = 0;
+
+	sum = *csum;
+	iterate_and_advance_kvec(i, bytes, v, ({
 		sum = csum_and_memcpy(v.iov_base,
 				     (from += v.iov_len) - v.iov_len,
 				     v.iov_len, sum, off);
 		off += v.iov_len;
-	})
-	)
+	}));
 	*csum = sum;
 	return bytes;
 }
 
+static size_t discard_csum_and_copy_to_iter(const void *addr, size_t bytes, void *csump,
+			     struct iov_iter *i)
+{
+	WARN_ON(1);	/* for now */
+	return 0;
+}
+
 size_t hash_and_copy_to_iter(const void *addr, size_t bytes, void *hashp,
 		struct iov_iter *i)
 {
@@ -2256,7 +2284,7 @@ static const struct iov_iter_ops iovec_iter_ops = {
 #ifdef CONFIG_ARCH_HAS_COPY_MC
 	.copy_mc_to_iter		= iovec_copy_mc_to_iter,
 #endif
-	.csum_and_copy_to_iter		= xxx_csum_and_copy_to_iter,
+	.csum_and_copy_to_iter		= iovec_csum_and_copy_to_iter,
 	.csum_and_copy_from_iter	= iovec_csum_and_copy_from_iter,
 	.csum_and_copy_from_iter_full	= iovec_csum_and_copy_from_iter_full,
 
@@ -2290,7 +2318,7 @@ static const struct iov_iter_ops kvec_iter_ops = {
 #ifdef CONFIG_ARCH_HAS_COPY_MC
 	.copy_mc_to_iter		= kvec_copy_mc_to_iter,
 #endif
-	.csum_and_copy_to_iter		= xxx_csum_and_copy_to_iter,
+	.csum_and_copy_to_iter		= kvec_csum_and_copy_to_iter,
 	.csum_and_copy_from_iter	= kvec_csum_and_copy_from_iter,
 	.csum_and_copy_from_iter_full	= kvec_csum_and_copy_from_iter_full,
 
@@ -2324,7 +2352,7 @@ static const struct iov_iter_ops bvec_iter_ops = {
 #ifdef CONFIG_ARCH_HAS_COPY_MC
 	.copy_mc_to_iter		= bvec_copy_mc_to_iter,
 #endif
-	.csum_and_copy_to_iter		= xxx_csum_and_copy_to_iter,
+	.csum_and_copy_to_iter		= bvec_csum_and_copy_to_iter,
 	.csum_and_copy_from_iter	= bvec_csum_and_copy_from_iter,
 	.csum_and_copy_from_iter_full	= bvec_csum_and_copy_from_iter_full,
 
@@ -2358,7 +2386,7 @@ static const struct iov_iter_ops pipe_iter_ops = {
 #ifdef CONFIG_ARCH_HAS_COPY_MC
 	.copy_mc_to_iter		= pipe_copy_mc_to_iter,
 #endif
-	.csum_and_copy_to_iter		= xxx_csum_and_copy_to_iter,
+	.csum_and_copy_to_iter		= pipe_csum_and_copy_to_iter,
 	.csum_and_copy_from_iter	= no_csum_and_copy_from_iter,
 	.csum_and_copy_from_iter_full	= no_csum_and_copy_from_iter_full,
 
@@ -2392,7 +2420,7 @@ static const struct iov_iter_ops discard_iter_ops = {
 #ifdef CONFIG_ARCH_HAS_COPY_MC
 	.copy_mc_to_iter		= discard_copy_to_iter,
 #endif
-	.csum_and_copy_to_iter		= xxx_csum_and_copy_to_iter,
+	.csum_and_copy_to_iter		= discard_csum_and_copy_to_iter,
 	.csum_and_copy_from_iter	= no_csum_and_copy_from_iter,
 	.csum_and_copy_from_iter_full	= no_csum_and_copy_from_iter_full,
 




* [PATCH 26/29] iov_iter: Split iov_iter_npages()
  2020-11-21 14:13 [PATCH 00/29] RFC: iov_iter: Switch to using an ops table David Howells
                   ` (24 preceding siblings ...)
  2020-11-21 14:16 ` [PATCH 25/29] iov_iter: Split csum_and_copy_to_iter() David Howells
@ 2020-11-21 14:16 ` David Howells
  2020-11-21 14:16 ` [PATCH 27/29] iov_iter: Split dup_iter() David Howells
                   ` (5 subsequent siblings)
  31 siblings, 0 replies; 55+ messages in thread
From: David Howells @ 2020-11-21 14:16 UTC (permalink / raw)
  To: Pavel Begunkov, Matthew Wilcox, Jens Axboe, Alexander Viro
  Cc: dhowells, Linus Torvalds, linux-fsdevel, linux-block, linux-kernel

Split iov_iter_npages() by type.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 lib/iov_iter.c |   84 ++++++++++++++++++++++++++++++++++++++------------------
 1 file changed, 57 insertions(+), 27 deletions(-)

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 2f8019e3b09a..d8ef6c81c55f 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -2004,50 +2004,80 @@ size_t hash_and_copy_to_iter(const void *addr, size_t bytes, void *hashp,
 }
 EXPORT_SYMBOL(hash_and_copy_to_iter);
 
-static int xxx_npages(const struct iov_iter *i, int maxpages)
+static int iovec_npages(const struct iov_iter *i, int maxpages)
 {
 	size_t size = i->count;
 	int npages = 0;
 
 	if (!size)
 		return 0;
-	if (unlikely(iov_iter_is_discard(i)))
-		return 0;
-
-	if (unlikely(iov_iter_is_pipe(i))) {
-		struct pipe_inode_info *pipe = i->pipe;
-		unsigned int iter_head;
-		size_t off;
-
-		if (!sanity(i))
-			return 0;
-
-		data_start(i, &iter_head, &off);
-		/* some of this one + all after this one */
-		npages = pipe_space_for_user(iter_head, pipe->tail, pipe);
-		if (npages >= maxpages)
-			return maxpages;
-	} else iterate_all_kinds(i, size, v, ({
+	iterate_over_iovec(i, size, v, ({
 		unsigned long p = (unsigned long)v.iov_base;
 		npages += DIV_ROUND_UP(p + v.iov_len, PAGE_SIZE)
 			- p / PAGE_SIZE;
 		if (npages >= maxpages)
 			return maxpages;
-	0;}),({
+	0;}));
+	return npages;
+}
+
+static int bvec_npages(const struct iov_iter *i, int maxpages)
+{
+	size_t size = i->count;
+	int npages = 0;
+
+	if (!size)
+		return 0;
+	iterate_over_bvec(i, size, v, ({
 		npages++;
 		if (npages >= maxpages)
 			return maxpages;
-	}),({
+	}));
+	return npages;
+}
+
+static int kvec_npages(const struct iov_iter *i, int maxpages)
+{
+	size_t size = i->count;
+	int npages = 0;
+
+	if (!size)
+		return 0;
+	iterate_over_kvec(i, size, v, ({
 		unsigned long p = (unsigned long)v.iov_base;
 		npages += DIV_ROUND_UP(p + v.iov_len, PAGE_SIZE)
 			- p / PAGE_SIZE;
 		if (npages >= maxpages)
 			return maxpages;
-	})
-	)
+	}));
 	return npages;
 }
 
+static int pipe_npages(const struct iov_iter *i, int maxpages)
+{
+	struct pipe_inode_info *pipe = i->pipe;
+	size_t size = i->count, off;
+	unsigned int iter_head;
+	int npages = 0;
+
+	if (!size)
+		return 0;
+	if (!sanity(i))
+		return 0;
+
+	data_start(i, &iter_head, &off);
+	/* some of this one + all after this one */
+	npages = pipe_space_for_user(iter_head, pipe->tail, pipe);
+	if (npages >= maxpages)
+		return maxpages;
+	return npages;
+}
+
+static int discard_npages(const struct iov_iter *i, int maxpages)
+{
+	return 0;
+}
+
 static const void *xxx_dup_iter(struct iov_iter *new, struct iov_iter *old, gfp_t flags)
 {
 	*new = *old;
@@ -2293,7 +2323,7 @@ static const struct iov_iter_ops iovec_iter_ops = {
 	.gap_alignment			= iovec_gap_alignment,
 	.get_pages			= iovec_get_pages,
 	.get_pages_alloc		= iovec_get_pages_alloc,
-	.npages				= xxx_npages,
+	.npages				= iovec_npages,
 	.dup_iter			= xxx_dup_iter,
 	.for_each_range			= xxx_for_each_range,
 };
@@ -2327,7 +2357,7 @@ static const struct iov_iter_ops kvec_iter_ops = {
 	.gap_alignment			= kvec_gap_alignment,
 	.get_pages			= no_get_pages,
 	.get_pages_alloc		= no_get_pages_alloc,
-	.npages				= xxx_npages,
+	.npages				= kvec_npages,
 	.dup_iter			= xxx_dup_iter,
 	.for_each_range			= xxx_for_each_range,
 };
@@ -2361,7 +2391,7 @@ static const struct iov_iter_ops bvec_iter_ops = {
 	.gap_alignment			= bvec_gap_alignment,
 	.get_pages			= bvec_get_pages,
 	.get_pages_alloc		= bvec_get_pages_alloc,
-	.npages				= xxx_npages,
+	.npages				= bvec_npages,
 	.dup_iter			= xxx_dup_iter,
 	.for_each_range			= xxx_for_each_range,
 };
@@ -2395,7 +2425,7 @@ static const struct iov_iter_ops pipe_iter_ops = {
 	.gap_alignment			= no_gap_alignment,
 	.get_pages			= pipe_get_pages,
 	.get_pages_alloc		= pipe_get_pages_alloc,
-	.npages				= xxx_npages,
+	.npages				= pipe_npages,
 	.dup_iter			= xxx_dup_iter,
 	.for_each_range			= xxx_for_each_range,
 };
@@ -2429,7 +2459,7 @@ static const struct iov_iter_ops discard_iter_ops = {
 	.gap_alignment			= no_gap_alignment,
 	.get_pages			= no_get_pages,
 	.get_pages_alloc		= no_get_pages_alloc,
-	.npages				= xxx_npages,
+	.npages				= discard_npages,
 	.dup_iter			= xxx_dup_iter,
 	.for_each_range			= xxx_for_each_range,
 };



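For readers skimming the series, the shape of this change can be sketched outside the kernel: the old if-if-if chain in iov_iter_npages() becomes a single indirect call through a per-type ops table. The following is a hedged, user-space toy — struct toy_iter, struct toy_iter_ops and the simplified page arithmetic are illustrative, not the kernel API (the real iovec_npages() also accounts for each segment's base-address page offset):

```c
#include <assert.h>
#include <stddef.h>

struct toy_iter;

struct toy_iter_ops {
	int (*npages)(const struct toy_iter *i, int maxpages);
};

struct toy_iter {
	const struct toy_iter_ops *ops;
	size_t count;		/* bytes remaining in the iterator */
	size_t page_size;
};

/* One callback per iterator class; no type checks at the call site. */
static int vec_npages(const struct toy_iter *i, int maxpages)
{
	int npages = (int)((i->count + i->page_size - 1) / i->page_size);
	return npages >= maxpages ? maxpages : npages;
}

static int discard_npages(const struct toy_iter *i, int maxpages)
{
	(void)i; (void)maxpages;
	return 0;		/* a discard iterator never pins pages */
}

static const struct toy_iter_ops vec_ops = { .npages = vec_npages };
static const struct toy_iter_ops discard_ops = { .npages = discard_npages };

/* The generic entry point is now a one-line indirect call. */
static int toy_iter_npages(const struct toy_iter *i, int maxpages)
{
	return i->ops->npages(i, maxpages);
}
```

Adding a new iterator class then means adding one ops table and its callbacks, rather than lengthening every if-chain.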
^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 27/29] iov_iter: Split dup_iter()
  2020-11-21 14:13 [PATCH 00/29] RFC: iov_iter: Switch to using an ops table David Howells
                   ` (25 preceding siblings ...)
  2020-11-21 14:16 ` [PATCH 26/29] iov_iter: Split iov_iter_npages() David Howells
@ 2020-11-21 14:16 ` David Howells
  2020-11-21 14:17 ` [PATCH 28/29] iov_iter: Split iov_iter_for_each_range() David Howells
                   ` (4 subsequent siblings)
  31 siblings, 0 replies; 55+ messages in thread
From: David Howells @ 2020-11-21 14:16 UTC (permalink / raw)
  To: Pavel Begunkov, Matthew Wilcox, Jens Axboe, Alexander Viro
  Cc: dhowells, Linus Torvalds, linux-fsdevel, linux-block, linux-kernel

Split dup_iter() by type.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 lib/iov_iter.c |   49 +++++++++++++++++++++++++++++--------------------
 1 file changed, 29 insertions(+), 20 deletions(-)

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index d8ef6c81c55f..ca0e94596eda 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -2078,26 +2078,35 @@ static int discard_npages(const struct iov_iter *i, int maxpages)
 	return 0;
 }
 
-static const void *xxx_dup_iter(struct iov_iter *new, struct iov_iter *old, gfp_t flags)
+static const void *iovec_kvec_dup_iter(struct iov_iter *new, struct iov_iter *old, gfp_t flags)
 {
 	*new = *old;
-	if (unlikely(iov_iter_is_pipe(new))) {
-		WARN_ON(1);
-		return NULL;
-	}
-	if (unlikely(iov_iter_is_discard(new)))
-		return NULL;
-	if (iov_iter_is_bvec(new))
-		return new->bvec = kmemdup(new->bvec,
-				    new->nr_segs * sizeof(struct bio_vec),
-				    flags);
-	else
-		/* iovec and kvec have identical layout */
-		return new->iov = kmemdup(new->iov,
-				   new->nr_segs * sizeof(struct iovec),
+	/* iovec and kvec have identical layout */
+	return new->iov = kmemdup(new->iov,
+				  new->nr_segs * sizeof(struct iovec),
+				  flags);
+}
+
+static const void *bvec_dup_iter(struct iov_iter *new, struct iov_iter *old, gfp_t flags)
+{
+	*new = *old;
+	return new->bvec = kmemdup(new->bvec,
+				   new->nr_segs * sizeof(struct bio_vec),
 				   flags);
 }
 
+static const void *discard_dup_iter(struct iov_iter *new, struct iov_iter *old, gfp_t flags)
+{
+	*new = *old;
+	return NULL;
+}
+
+static const void *no_dup_iter(struct iov_iter *new, struct iov_iter *old, gfp_t flags)
+{
+	WARN_ON(1);
+	return NULL;
+}
+
 static int copy_compat_iovec_from_user(struct iovec *iov,
 		const struct iovec __user *uvec, unsigned long nr_segs)
 {
@@ -2324,7 +2333,7 @@ static const struct iov_iter_ops iovec_iter_ops = {
 	.get_pages			= iovec_get_pages,
 	.get_pages_alloc		= iovec_get_pages_alloc,
 	.npages				= iovec_npages,
-	.dup_iter			= xxx_dup_iter,
+	.dup_iter			= iovec_kvec_dup_iter,
 	.for_each_range			= xxx_for_each_range,
 };
 
@@ -2358,7 +2367,7 @@ static const struct iov_iter_ops kvec_iter_ops = {
 	.get_pages			= no_get_pages,
 	.get_pages_alloc		= no_get_pages_alloc,
 	.npages				= kvec_npages,
-	.dup_iter			= xxx_dup_iter,
+	.dup_iter			= iovec_kvec_dup_iter,
 	.for_each_range			= xxx_for_each_range,
 };
 
@@ -2392,7 +2401,7 @@ static const struct iov_iter_ops bvec_iter_ops = {
 	.get_pages			= bvec_get_pages,
 	.get_pages_alloc		= bvec_get_pages_alloc,
 	.npages				= bvec_npages,
-	.dup_iter			= xxx_dup_iter,
+	.dup_iter			= bvec_dup_iter,
 	.for_each_range			= xxx_for_each_range,
 };
 
@@ -2426,7 +2435,7 @@ static const struct iov_iter_ops pipe_iter_ops = {
 	.get_pages			= pipe_get_pages,
 	.get_pages_alloc		= pipe_get_pages_alloc,
 	.npages				= pipe_npages,
-	.dup_iter			= xxx_dup_iter,
+	.dup_iter			= no_dup_iter,
 	.for_each_range			= xxx_for_each_range,
 };
 
@@ -2460,6 +2469,6 @@ static const struct iov_iter_ops discard_iter_ops = {
 	.get_pages			= no_get_pages,
 	.get_pages_alloc		= no_get_pages_alloc,
 	.npages				= discard_npages,
-	.dup_iter			= xxx_dup_iter,
+	.dup_iter			= discard_dup_iter,
 	.for_each_range			= xxx_for_each_range,
 };



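The split dup_iter() variants above share one shape: copy the iterator head by value, then duplicate the segment array so the new iterator owns its own segments. A hedged user-space sketch of that shape — struct seg and toy_dup_iter() are illustrative stand-ins, and malloc() plus memcpy() stands in for the kernel's kmemdup():

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct seg { void *base; size_t len; };

struct toy_iter {
	size_t nr_segs;
	const struct seg *segs;
};

/* Copy the iterator head by value, then duplicate the segment array so
 * the copy owns its own storage -- the shape shared by
 * iovec_kvec_dup_iter() and bvec_dup_iter() above. */
static const void *toy_dup_iter(struct toy_iter *dst, const struct toy_iter *old)
{
	void *dup;

	*dst = *old;
	dup = malloc(old->nr_segs * sizeof(struct seg));
	if (!dup)
		return NULL;	/* NULL tells the caller the dup failed */
	memcpy(dup, old->segs, old->nr_segs * sizeof(struct seg));
	dst->segs = dup;
	return dup;
}
```

Pipe and discard iterators have no segment array to duplicate, which is why they get the trivial no_dup_iter()/discard_dup_iter() callbacks instead.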

* [PATCH 28/29] iov_iter: Split iov_iter_for_each_range()
  2020-11-21 14:13 [PATCH 00/29] RFC: iov_iter: Switch to using an ops table David Howells
                   ` (26 preceding siblings ...)
  2020-11-21 14:16 ` [PATCH 27/29] iov_iter: Split dup_iter() David Howells
@ 2020-11-21 14:17 ` David Howells
  2020-11-21 14:17 ` [PATCH 29/29] iov_iter: Remove iterate_all_kinds() and iterate_and_advance() David Howells
                   ` (3 subsequent siblings)
  31 siblings, 0 replies; 55+ messages in thread
From: David Howells @ 2020-11-21 14:17 UTC (permalink / raw)
  To: Pavel Begunkov, Matthew Wilcox, Jens Axboe, Alexander Viro
  Cc: dhowells, Linus Torvalds, linux-fsdevel, linux-block, linux-kernel

Split iov_iter_for_each_range() by type.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 lib/iov_iter.c |   41 +++++++++++++++++++++++++++++++----------
 1 file changed, 31 insertions(+), 10 deletions(-)

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index ca0e94596eda..db798966823e 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -2282,7 +2282,7 @@ int import_single_range(int rw, void __user *buf, size_t len,
 }
 EXPORT_SYMBOL(import_single_range);
 
-static int xxx_for_each_range(struct iov_iter *i, size_t bytes,
+static int bvec_for_each_range(struct iov_iter *i, size_t bytes,
 			    int (*f)(struct kvec *vec, void *context),
 			    void *context)
 {
@@ -2291,18 +2291,39 @@ static int xxx_for_each_range(struct iov_iter *i, size_t bytes,
 	if (!bytes)
 		return 0;
 
-	iterate_all_kinds(i, bytes, v, -EINVAL, ({
+	iterate_over_bvec(i, bytes, v, ({
 		w.iov_base = kmap(v.bv_page) + v.bv_offset;
 		w.iov_len = v.bv_len;
 		err = f(&w, context);
 		kunmap(v.bv_page);
-		err;}), ({
+		err;
+	}));
+	return err;
+}
+
+static int kvec_for_each_range(struct iov_iter *i, size_t bytes,
+			    int (*f)(struct kvec *vec, void *context),
+			    void *context)
+{
+	struct kvec w;
+	int err = -EINVAL;
+	if (!bytes)
+		return 0;
+
+	iterate_over_kvec(i, bytes, v, ({
 		w = v;
-		err = f(&w, context);})
-	)
+		err = f(&w, context);
+	}));
 	return err;
 }
 
+static int no_for_each_range(struct iov_iter *i, size_t bytes,
+			    int (*f)(struct kvec *vec, void *context),
+			    void *context)
+{
+	return !bytes ? 0 : -EINVAL;
+}
+
 static const struct iov_iter_ops iovec_iter_ops = {
 	.type				= ITER_IOVEC,
 	.copy_from_user_atomic		= iovec_copy_from_user_atomic,
@@ -2334,7 +2355,7 @@ static const struct iov_iter_ops iovec_iter_ops = {
 	.get_pages_alloc		= iovec_get_pages_alloc,
 	.npages				= iovec_npages,
 	.dup_iter			= iovec_kvec_dup_iter,
-	.for_each_range			= xxx_for_each_range,
+	.for_each_range			= no_for_each_range,
 };
 
 static const struct iov_iter_ops kvec_iter_ops = {
@@ -2368,7 +2389,7 @@ static const struct iov_iter_ops kvec_iter_ops = {
 	.get_pages_alloc		= no_get_pages_alloc,
 	.npages				= kvec_npages,
 	.dup_iter			= iovec_kvec_dup_iter,
-	.for_each_range			= xxx_for_each_range,
+	.for_each_range			= kvec_for_each_range,
 };
 
 static const struct iov_iter_ops bvec_iter_ops = {
@@ -2402,7 +2423,7 @@ static const struct iov_iter_ops bvec_iter_ops = {
 	.get_pages_alloc		= bvec_get_pages_alloc,
 	.npages				= bvec_npages,
 	.dup_iter			= bvec_dup_iter,
-	.for_each_range			= xxx_for_each_range,
+	.for_each_range			= bvec_for_each_range,
 };
 
 static const struct iov_iter_ops pipe_iter_ops = {
@@ -2436,7 +2457,7 @@ static const struct iov_iter_ops pipe_iter_ops = {
 	.get_pages_alloc		= pipe_get_pages_alloc,
 	.npages				= pipe_npages,
 	.dup_iter			= no_dup_iter,
-	.for_each_range			= xxx_for_each_range,
+	.for_each_range			= no_for_each_range,
 };
 
 static const struct iov_iter_ops discard_iter_ops = {
@@ -2470,5 +2491,5 @@ static const struct iov_iter_ops discard_iter_ops = {
 	.get_pages_alloc		= no_get_pages_alloc,
 	.npages				= discard_npages,
 	.dup_iter			= discard_dup_iter,
-	.for_each_range			= xxx_for_each_range,
+	.for_each_range			= no_for_each_range,
 };



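kvec_for_each_range() above trims each segment to the remaining byte budget and hands it to the callback; the bvec variant additionally kmaps each page first. A hedged sketch of just the walking logic in plain user-space C — toy_for_each_range() and struct kv are illustrative names, and -1 stands in for -EINVAL:

```c
#include <assert.h>
#include <stddef.h>

struct kv { void *base; size_t len; };

/* Walk at most `bytes` of the segments, handing each (possibly trimmed)
 * piece to f(); stop early on a non-zero return.  This mirrors the shape
 * of kvec_for_each_range() without the iterator plumbing. */
static int toy_for_each_range(const struct kv *segs, size_t nr, size_t bytes,
			      int (*f)(struct kv *v, void *ctx), void *ctx)
{
	size_t n;
	int err = -1;	/* stands in for -EINVAL when nothing is walked */

	if (!bytes)
		return 0;
	for (n = 0; n < nr && bytes; n++) {
		struct kv w = segs[n];

		if (w.len > bytes)
			w.len = bytes;
		bytes -= w.len;
		err = f(&w, ctx);
		if (err)
			break;
	}
	return err;
}

/* Example callback: accumulate the number of bytes visited. */
static int sum_len(struct kv *v, void *ctx)
{
	*(size_t *)ctx += v->len;
	return 0;
}
```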

* [PATCH 29/29] iov_iter: Remove iterate_all_kinds() and iterate_and_advance()
  2020-11-21 14:13 [PATCH 00/29] RFC: iov_iter: Switch to using an ops table David Howells
                   ` (27 preceding siblings ...)
  2020-11-21 14:17 ` [PATCH 28/29] iov_iter: Split iov_iter_for_each_range() David Howells
@ 2020-11-21 14:17 ` David Howells
  2020-11-21 14:34 ` [PATCH 00/29] RFC: iov_iter: Switch to using an ops table Pavel Begunkov
                   ` (2 subsequent siblings)
  31 siblings, 0 replies; 55+ messages in thread
From: David Howells @ 2020-11-21 14:17 UTC (permalink / raw)
  To: Pavel Begunkov, Matthew Wilcox, Jens Axboe, Alexander Viro
  Cc: dhowells, Linus Torvalds, linux-fsdevel, linux-block, linux-kernel

Remove iterate_all_kinds() and iterate_and_advance() as they're no longer
used, all of their callers having been split by iterator type.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 lib/iov_iter.c |   61 --------------------------------------------------------
 1 file changed, 61 deletions(-)

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index db798966823e..ba6b60c45103 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -86,26 +86,6 @@ static inline bool page_copy_sane(struct page *page, size_t offset, size_t n);
 	}						\
 }
 
-#define iterate_all_kinds(i, n, v, I, B, K) {			\
-	if (likely(n)) {					\
-		size_t skip = i->iov_offset;			\
-		if (unlikely(iov_iter_type(i) & ITER_BVEC)) {		\
-			struct bio_vec v;			\
-			struct bvec_iter __bi;			\
-			iterate_bvec(i, n, v, __bi, skip, (B))	\
-		} else if (unlikely(iov_iter_type(i) & ITER_KVEC)) {	\
-			const struct kvec *kvec;		\
-			struct kvec v;				\
-			iterate_kvec(i, n, v, kvec, skip, (K))	\
-		} else if (unlikely(iov_iter_type(i) & ITER_DISCARD)) {	\
-		} else {					\
-			const struct iovec *iov;		\
-			struct iovec v;				\
-			iterate_iovec(i, n, v, iov, skip, (I))	\
-		}						\
-	}							\
-}
-
 #define iterate_over_iovec(i, n, v, CMD) {			\
 	if (likely(n)) {					\
 		size_t skip = i->iov_offset;			\
@@ -133,47 +113,6 @@ static inline bool page_copy_sane(struct page *page, size_t offset, size_t n);
 	}							\
 }
 
-#define iterate_and_advance(i, n, v, I, B, K) {			\
-	if (unlikely(i->count < n))				\
-		n = i->count;					\
-	if (i->count) {						\
-		size_t skip = i->iov_offset;			\
-		if (unlikely(iov_iter_type(i) & ITER_BVEC)) {		\
-			const struct bio_vec *bvec = i->bvec;	\
-			struct bio_vec v;			\
-			struct bvec_iter __bi;			\
-			iterate_bvec(i, n, v, __bi, skip, (B))	\
-			i->bvec = __bvec_iter_bvec(i->bvec, __bi);	\
-			i->nr_segs -= i->bvec - bvec;		\
-			skip = __bi.bi_bvec_done;		\
-		} else if (unlikely(iov_iter_type(i) & ITER_KVEC)) {	\
-			const struct kvec *kvec;		\
-			struct kvec v;				\
-			iterate_kvec(i, n, v, kvec, skip, (K))	\
-			if (skip == kvec->iov_len) {		\
-				kvec++;				\
-				skip = 0;			\
-			}					\
-			i->nr_segs -= kvec - i->kvec;		\
-			i->kvec = kvec;				\
-		} else if (unlikely(iov_iter_type(i) & ITER_DISCARD)) {	\
-			skip += n;				\
-		} else {					\
-			const struct iovec *iov;		\
-			struct iovec v;				\
-			iterate_iovec(i, n, v, iov, skip, (I))	\
-			if (skip == iov->iov_len) {		\
-				iov++;				\
-				skip = 0;			\
-			}					\
-			i->nr_segs -= iov - i->iov;		\
-			i->iov = iov;				\
-		}						\
-		i->count -= n;					\
-		i->iov_offset = skip;				\
-	}							\
-}
-
 #define iterate_and_advance_iovec(i, n, v, CMD) {		\
 	if (unlikely(i->count < n))				\
 		n = i->count;					\




* Re: [PATCH 01/29] iov_iter: Switch to using a table of operations
  2020-11-21 14:13 ` [PATCH 01/29] iov_iter: Switch to using a table of operations David Howells
@ 2020-11-21 14:31   ` Pavel Begunkov
  2020-11-23 23:21     ` Pavel Begunkov
  2020-11-21 18:21   ` Linus Torvalds
                     ` (6 subsequent siblings)
  7 siblings, 1 reply; 55+ messages in thread
From: Pavel Begunkov @ 2020-11-21 14:31 UTC (permalink / raw)
  To: David Howells, Matthew Wilcox, Jens Axboe, Alexander Viro
  Cc: Linus Torvalds, linux-fsdevel, linux-block, linux-kernel

On 21/11/2020 14:13, David Howells wrote:
> Switch to using a table of operations.  In a future patch the individual
> methods will be split up by type.  For the moment, however, the ops tables
> just jump directly to the old functions - which are now static.  Inline
> wrappers are provided to jump through the hooks.
> 
> Signed-off-by: David Howells <dhowells@redhat.com>
> ---
> 
>  fs/io_uring.c       |    2 
>  include/linux/uio.h |  241 ++++++++++++++++++++++++++++++++++--------
>  lib/iov_iter.c      |  293 +++++++++++++++++++++++++++++++++++++++------------
>  3 files changed, 422 insertions(+), 114 deletions(-)
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 4ead291b2976..baa78f58ae5c 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -3192,7 +3192,7 @@ static void io_req_map_rw(struct io_kiocb *req, const struct iovec *iovec,
>  	rw->free_iovec = iovec;
>  	rw->bytes_done = 0;
>  	/* can only be fixed buffers, no need to do anything */
> -	if (iter->type == ITER_BVEC)
> +	if (iov_iter_is_bvec(iter))

Could you split out this io_uring change and send it for 5.10?
Or I can do it for you if you wish.

>  		return;
>  	if (!iovec) {
>  		unsigned iov_off = 0;
> diff --git a/include/linux/uio.h b/include/linux/uio.h
> index 72d88566694e..45ee087f8c43 100644
> --- a/include/linux/uio.h
> +++ b/include/linux/uio.h
> @@ -32,9 +32,10 @@ struct iov_iter {
>  	 * Bit 1 is the BVEC_FLAG_NO_REF bit, set if type is a bvec and
>  	 * the caller isn't expecting to drop a page reference when done.
>  	 */
> -	unsigned int type;
> +	unsigned int flags;
>  	size_t iov_offset;
>  	size_t count;
> +	const struct iov_iter_ops *ops;
>  	union {
>  		const struct iovec *iov;
>  		const struct kvec *kvec;
> @@ -50,9 +51,63 @@ struct iov_iter {
>  	};
>  };
>  
> +void iov_iter_init(struct iov_iter *i, unsigned int direction, const struct iovec *iov,
> +			unsigned long nr_segs, size_t count);
> +void iov_iter_kvec(struct iov_iter *i, unsigned int direction, const struct kvec *kvec,
> +			unsigned long nr_segs, size_t count);
> +void iov_iter_bvec(struct iov_iter *i, unsigned int direction, const struct bio_vec *bvec,
> +			unsigned long nr_segs, size_t count);
> +void iov_iter_pipe(struct iov_iter *i, unsigned int direction, struct pipe_inode_info *pipe,
> +			size_t count);
> +void iov_iter_discard(struct iov_iter *i, unsigned int direction, size_t count);
> +
> +struct iov_iter_ops {
> +	enum iter_type type;
> +	size_t (*copy_from_user_atomic)(struct page *page, struct iov_iter *i,
> +					unsigned long offset, size_t bytes);
> +	void (*advance)(struct iov_iter *i, size_t bytes);
> +	void (*revert)(struct iov_iter *i, size_t bytes);
> +	int (*fault_in_readable)(struct iov_iter *i, size_t bytes);
> +	size_t (*single_seg_count)(const struct iov_iter *i);
> +	size_t (*copy_page_to_iter)(struct page *page, size_t offset, size_t bytes,
> +				    struct iov_iter *i);
> +	size_t (*copy_page_from_iter)(struct page *page, size_t offset, size_t bytes,
> +				      struct iov_iter *i);
> +	size_t (*copy_to_iter)(const void *addr, size_t bytes, struct iov_iter *i);
> +	size_t (*copy_from_iter)(void *addr, size_t bytes, struct iov_iter *i);
> +	bool (*copy_from_iter_full)(void *addr, size_t bytes, struct iov_iter *i);
> +	size_t (*copy_from_iter_nocache)(void *addr, size_t bytes, struct iov_iter *i);
> +	bool (*copy_from_iter_full_nocache)(void *addr, size_t bytes, struct iov_iter *i);
> +#ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
> +	size_t (*copy_from_iter_flushcache)(void *addr, size_t bytes, struct iov_iter *i);
> +#endif
> +#ifdef CONFIG_ARCH_HAS_COPY_MC
> +	size_t (*copy_mc_to_iter)(const void *addr, size_t bytes, struct iov_iter *i);
> +#endif
> +	size_t (*csum_and_copy_to_iter)(const void *addr, size_t bytes, void *csump,
> +					struct iov_iter *i);
> +	size_t (*csum_and_copy_from_iter)(void *addr, size_t bytes, __wsum *csum,
> +					  struct iov_iter *i);
> +	bool (*csum_and_copy_from_iter_full)(void *addr, size_t bytes, __wsum *csum,
> +					     struct iov_iter *i);
> +
> +	size_t (*zero)(size_t bytes, struct iov_iter *i);
> +	unsigned long (*alignment)(const struct iov_iter *i);
> +	unsigned long (*gap_alignment)(const struct iov_iter *i);
> +	ssize_t (*get_pages)(struct iov_iter *i, struct page **pages,
> +			     size_t maxsize, unsigned maxpages, size_t *start);
> +	ssize_t (*get_pages_alloc)(struct iov_iter *i, struct page ***pages,
> +				   size_t maxsize, size_t *start);
> +	int (*npages)(const struct iov_iter *i, int maxpages);
> +	const void *(*dup_iter)(struct iov_iter *new, struct iov_iter *old, gfp_t flags);
> +	int (*for_each_range)(struct iov_iter *i, size_t bytes,
> +			      int (*f)(struct kvec *vec, void *context),
> +			      void *context);
> +};
> +
>  static inline enum iter_type iov_iter_type(const struct iov_iter *i)
>  {
> -	return i->type & ~(READ | WRITE);
> +	return i->ops->type;
>  }
>  
>  static inline bool iter_is_iovec(const struct iov_iter *i)
> @@ -82,7 +137,7 @@ static inline bool iov_iter_is_discard(const struct iov_iter *i)
>  
>  static inline unsigned char iov_iter_rw(const struct iov_iter *i)
>  {
> -	return i->type & (READ | WRITE);
> +	return i->flags & (READ | WRITE);
>  }
>  
>  /*
> @@ -111,22 +166,71 @@ static inline struct iovec iov_iter_iovec(const struct iov_iter *iter)
>  	};
>  }
>  
> -size_t iov_iter_copy_from_user_atomic(struct page *page,
> -		struct iov_iter *i, unsigned long offset, size_t bytes);
> -void iov_iter_advance(struct iov_iter *i, size_t bytes);
> -void iov_iter_revert(struct iov_iter *i, size_t bytes);
> -int iov_iter_fault_in_readable(struct iov_iter *i, size_t bytes);
> -size_t iov_iter_single_seg_count(const struct iov_iter *i);
> +static inline
> +size_t iov_iter_copy_from_user_atomic(struct page *page, struct iov_iter *i,
> +				      unsigned long offset, size_t bytes)
> +{
> +	return i->ops->copy_from_user_atomic(page, i, offset, bytes);
> +}
> +static inline
> +void iov_iter_advance(struct iov_iter *i, size_t bytes)
> +{
> +	return i->ops->advance(i, bytes);
> +}
> +static inline
> +void iov_iter_revert(struct iov_iter *i, size_t bytes)
> +{
> +	return i->ops->revert(i, bytes);
> +}
> +static inline
> +int iov_iter_fault_in_readable(struct iov_iter *i, size_t bytes)
> +{
> +	return i->ops->fault_in_readable(i, bytes);
> +}
> +static inline
> +size_t iov_iter_single_seg_count(const struct iov_iter *i)
> +{
> +	return i->ops->single_seg_count(i);
> +}
> +
> +static inline
>  size_t copy_page_to_iter(struct page *page, size_t offset, size_t bytes,
> -			 struct iov_iter *i);
> +				       struct iov_iter *i)
> +{
> +	return i->ops->copy_page_to_iter(page, offset, bytes, i);
> +}
> +static inline
>  size_t copy_page_from_iter(struct page *page, size_t offset, size_t bytes,
> -			 struct iov_iter *i);
> +					 struct iov_iter *i)
> +{
> +	return i->ops->copy_page_from_iter(page, offset, bytes, i);
> +}
>  
> -size_t _copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i);
> -size_t _copy_from_iter(void *addr, size_t bytes, struct iov_iter *i);
> -bool _copy_from_iter_full(void *addr, size_t bytes, struct iov_iter *i);
> -size_t _copy_from_iter_nocache(void *addr, size_t bytes, struct iov_iter *i);
> -bool _copy_from_iter_full_nocache(void *addr, size_t bytes, struct iov_iter *i);
> +static __always_inline __must_check
> +size_t _copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
> +{
> +	return i->ops->copy_to_iter(addr, bytes, i);
> +}
> +static __always_inline __must_check
> +size_t _copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
> +{
> +	return i->ops->copy_from_iter(addr, bytes, i);
> +}
> +static __always_inline __must_check
> +bool _copy_from_iter_full(void *addr, size_t bytes, struct iov_iter *i)
> +{
> +	return i->ops->copy_from_iter_full(addr, bytes, i);
> +}
> +static __always_inline __must_check
> +size_t _copy_from_iter_nocache(void *addr, size_t bytes, struct iov_iter *i)
> +{
> +	return i->ops->copy_from_iter_nocache(addr, bytes, i);
> +}
> +static __always_inline __must_check
> +bool _copy_from_iter_full_nocache(void *addr, size_t bytes, struct iov_iter *i)
> +{
> +	return i->ops->copy_from_iter_full_nocache(addr, bytes, i);
> +}
>  
>  static __always_inline __must_check
>  size_t copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
> @@ -173,23 +277,21 @@ bool copy_from_iter_full_nocache(void *addr, size_t bytes, struct iov_iter *i)
>  		return _copy_from_iter_full_nocache(addr, bytes, i);
>  }
>  
> -#ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
>  /*
>   * Note, users like pmem that depend on the stricter semantics of
>   * copy_from_iter_flushcache() than copy_from_iter_nocache() must check for
>   * IS_ENABLED(CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE) before assuming that the
>   * destination is flushed from the cache on return.
>   */
> -size_t _copy_from_iter_flushcache(void *addr, size_t bytes, struct iov_iter *i);
> -#else
> -#define _copy_from_iter_flushcache _copy_from_iter_nocache
> -#endif
> -
> -#ifdef CONFIG_ARCH_HAS_COPY_MC
> -size_t _copy_mc_to_iter(const void *addr, size_t bytes, struct iov_iter *i);
> +static __always_inline __must_check
> +size_t _copy_from_iter_flushcache(void *addr, size_t bytes, struct iov_iter *i)
> +{
> +#ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
> +	return i->ops->copy_from_iter_flushcache(addr, bytes, i);
>  #else
> -#define _copy_mc_to_iter _copy_to_iter
> +	return i->ops->copy_from_iter_nocache(addr, bytes, i);
>  #endif
> +}
>  
>  static __always_inline __must_check
>  size_t copy_from_iter_flushcache(void *addr, size_t bytes, struct iov_iter *i)
> @@ -200,6 +302,16 @@ size_t copy_from_iter_flushcache(void *addr, size_t bytes, struct iov_iter *i)
>  		return _copy_from_iter_flushcache(addr, bytes, i);
>  }
>  
> +static __always_inline __must_check
> +size_t _copy_mc_to_iter(void *addr, size_t bytes, struct iov_iter *i)
> +{
> +#ifdef CONFIG_ARCH_HAS_COPY_MC
> +	return i->ops->copy_mc_to_iter(addr, bytes, i);
> +#else
> +	return i->ops->copy_to_iter(addr, bytes, i);
> +#endif
> +}
> +
>  static __always_inline __must_check
>  size_t copy_mc_to_iter(void *addr, size_t bytes, struct iov_iter *i)
>  {
> @@ -209,25 +321,47 @@ size_t copy_mc_to_iter(void *addr, size_t bytes, struct iov_iter *i)
>  		return _copy_mc_to_iter(addr, bytes, i);
>  }
>  
> -size_t iov_iter_zero(size_t bytes, struct iov_iter *);
> -unsigned long iov_iter_alignment(const struct iov_iter *i);
> -unsigned long iov_iter_gap_alignment(const struct iov_iter *i);
> -void iov_iter_init(struct iov_iter *i, unsigned int direction, const struct iovec *iov,
> -			unsigned long nr_segs, size_t count);
> -void iov_iter_kvec(struct iov_iter *i, unsigned int direction, const struct kvec *kvec,
> -			unsigned long nr_segs, size_t count);
> -void iov_iter_bvec(struct iov_iter *i, unsigned int direction, const struct bio_vec *bvec,
> -			unsigned long nr_segs, size_t count);
> -void iov_iter_pipe(struct iov_iter *i, unsigned int direction, struct pipe_inode_info *pipe,
> -			size_t count);
> -void iov_iter_discard(struct iov_iter *i, unsigned int direction, size_t count);
> +static inline
> +size_t iov_iter_zero(size_t bytes, struct iov_iter *i)
> +{
> +	return i->ops->zero(bytes, i);
> +}
> +static inline
> +unsigned long iov_iter_alignment(const struct iov_iter *i)
> +{
> +	return i->ops->alignment(i);
> +}
> +static inline
> +unsigned long iov_iter_gap_alignment(const struct iov_iter *i)
> +{
> +	return i->ops->gap_alignment(i);
> +}
> +
> +static inline
>  ssize_t iov_iter_get_pages(struct iov_iter *i, struct page **pages,
> -			size_t maxsize, unsigned maxpages, size_t *start);
> +			size_t maxsize, unsigned maxpages, size_t *start)
> +{
> +	return i->ops->get_pages(i, pages, maxsize, maxpages, start);
> +}
> +
> +static inline
>  ssize_t iov_iter_get_pages_alloc(struct iov_iter *i, struct page ***pages,
> -			size_t maxsize, size_t *start);
> -int iov_iter_npages(const struct iov_iter *i, int maxpages);
> +			size_t maxsize, size_t *start)
> +{
> +	return i->ops->get_pages_alloc(i, pages, maxsize, start);
> +}
> +
> +static inline
> +int iov_iter_npages(const struct iov_iter *i, int maxpages)
> +{
> +	return i->ops->npages(i, maxpages);
> +}
>  
> -const void *dup_iter(struct iov_iter *new, struct iov_iter *old, gfp_t flags);
> +static inline
> +const void *dup_iter(struct iov_iter *new, struct iov_iter *old, gfp_t flags)
> +{
> +	return old->ops->dup_iter(new, old, flags);
> +}
>  
>  static inline size_t iov_iter_count(const struct iov_iter *i)
>  {
> @@ -260,9 +394,22 @@ static inline void iov_iter_reexpand(struct iov_iter *i, size_t count)
>  {
>  	i->count = count;
>  }
> -size_t csum_and_copy_to_iter(const void *addr, size_t bytes, void *csump, struct iov_iter *i);
> -size_t csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum, struct iov_iter *i);
> -bool csum_and_copy_from_iter_full(void *addr, size_t bytes, __wsum *csum, struct iov_iter *i);
> +
> +static inline
> +size_t csum_and_copy_to_iter(const void *addr, size_t bytes, void *csump, struct iov_iter *i)
> +{
> +	return i->ops->csum_and_copy_to_iter(addr, bytes, csump, i);
> +}
> +static inline
> +size_t csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum, struct iov_iter *i)
> +{
> +	return i->ops->csum_and_copy_from_iter(addr, bytes, csum, i);
> +}
> +static inline
> +bool csum_and_copy_from_iter_full(void *addr, size_t bytes, __wsum *csum, struct iov_iter *i)
> +{
> +	return i->ops->csum_and_copy_from_iter_full(addr, bytes, csum, i);
> +}
>  size_t hash_and_copy_to_iter(const void *addr, size_t bytes, void *hashp,
>  		struct iov_iter *i);
>  
> @@ -278,8 +425,12 @@ ssize_t __import_iovec(int type, const struct iovec __user *uvec,
>  int import_single_range(int type, void __user *buf, size_t len,
>  		 struct iovec *iov, struct iov_iter *i);
>  
> +static inline
>  int iov_iter_for_each_range(struct iov_iter *i, size_t bytes,
>  			    int (*f)(struct kvec *vec, void *context),
> -			    void *context);
> +			    void *context)
> +{
> +	return i->ops->for_each_range(i, bytes, f, context);
> +}
>  
>  #endif
> diff --git a/lib/iov_iter.c b/lib/iov_iter.c
> index 1635111c5bd2..e403d524c797 100644
> --- a/lib/iov_iter.c
> +++ b/lib/iov_iter.c
> @@ -13,6 +13,12 @@
>  #include <linux/scatterlist.h>
>  #include <linux/instrumented.h>
>  
> +static const struct iov_iter_ops iovec_iter_ops;
> +static const struct iov_iter_ops kvec_iter_ops;
> +static const struct iov_iter_ops bvec_iter_ops;
> +static const struct iov_iter_ops pipe_iter_ops;
> +static const struct iov_iter_ops discard_iter_ops;
> +
>  #define PIPE_PARANOIA /* for now */
>  
>  #define iterate_iovec(i, n, __v, __p, skip, STEP) {	\
> @@ -81,15 +87,15 @@
>  #define iterate_all_kinds(i, n, v, I, B, K) {			\
>  	if (likely(n)) {					\
>  		size_t skip = i->iov_offset;			\
> -		if (unlikely(i->type & ITER_BVEC)) {		\
> +		if (unlikely(iov_iter_type(i) & ITER_BVEC)) {		\
>  			struct bio_vec v;			\
>  			struct bvec_iter __bi;			\
>  			iterate_bvec(i, n, v, __bi, skip, (B))	\
> -		} else if (unlikely(i->type & ITER_KVEC)) {	\
> +		} else if (unlikely(iov_iter_type(i) & ITER_KVEC)) {	\
>  			const struct kvec *kvec;		\
>  			struct kvec v;				\
>  			iterate_kvec(i, n, v, kvec, skip, (K))	\
> -		} else if (unlikely(i->type & ITER_DISCARD)) {	\
> +		} else if (unlikely(iov_iter_type(i) & ITER_DISCARD)) {	\
>  		} else {					\
>  			const struct iovec *iov;		\
>  			struct iovec v;				\
> @@ -103,7 +109,7 @@
>  		n = i->count;					\
>  	if (i->count) {						\
>  		size_t skip = i->iov_offset;			\
> -		if (unlikely(i->type & ITER_BVEC)) {		\
> +		if (unlikely(iov_iter_type(i) & ITER_BVEC)) {		\
>  			const struct bio_vec *bvec = i->bvec;	\
>  			struct bio_vec v;			\
>  			struct bvec_iter __bi;			\
> @@ -111,7 +117,7 @@
>  			i->bvec = __bvec_iter_bvec(i->bvec, __bi);	\
>  			i->nr_segs -= i->bvec - bvec;		\
>  			skip = __bi.bi_bvec_done;		\
> -		} else if (unlikely(i->type & ITER_KVEC)) {	\
> +		} else if (unlikely(iov_iter_type(i) & ITER_KVEC)) {	\
>  			const struct kvec *kvec;		\
>  			struct kvec v;				\
>  			iterate_kvec(i, n, v, kvec, skip, (K))	\
> @@ -121,7 +127,7 @@
>  			}					\
>  			i->nr_segs -= kvec - i->kvec;		\
>  			i->kvec = kvec;				\
> -		} else if (unlikely(i->type & ITER_DISCARD)) {	\
> +		} else if (unlikely(iov_iter_type(i) & ITER_DISCARD)) {	\
>  			skip += n;				\
>  		} else {					\
>  			const struct iovec *iov;		\
> @@ -427,14 +433,14 @@ static size_t copy_page_to_iter_pipe(struct page *page, size_t offset, size_t by
>   * Return 0 on success, or non-zero if the memory could not be accessed (i.e.
>   * because it is an invalid address).
>   */
> -int iov_iter_fault_in_readable(struct iov_iter *i, size_t bytes)
> +static int xxx_fault_in_readable(struct iov_iter *i, size_t bytes)
>  {
>  	size_t skip = i->iov_offset;
>  	const struct iovec *iov;
>  	int err;
>  	struct iovec v;
>  
> -	if (!(i->type & (ITER_BVEC|ITER_KVEC))) {
> +	if (!(iov_iter_type(i) & (ITER_BVEC|ITER_KVEC))) {
>  		iterate_iovec(i, bytes, v, iov, skip, ({
>  			err = fault_in_pages_readable(v.iov_base, v.iov_len);
>  			if (unlikely(err))
> @@ -443,7 +449,6 @@ int iov_iter_fault_in_readable(struct iov_iter *i, size_t bytes)
>  	}
>  	return 0;
>  }
> -EXPORT_SYMBOL(iov_iter_fault_in_readable);
>  
>  void iov_iter_init(struct iov_iter *i, unsigned int direction,
>  			const struct iovec *iov, unsigned long nr_segs,
> @@ -454,10 +459,12 @@ void iov_iter_init(struct iov_iter *i, unsigned int direction,
>  
>  	/* It will get better.  Eventually... */
>  	if (uaccess_kernel()) {
> -		i->type = ITER_KVEC | direction;
> +		i->ops = &kvec_iter_ops;
> +		i->flags = direction;
>  		i->kvec = (struct kvec *)iov;
>  	} else {
> -		i->type = ITER_IOVEC | direction;
> +		i->ops = &iovec_iter_ops;
> +		i->flags = direction;
>  		i->iov = iov;
>  	}
>  	i->nr_segs = nr_segs;
> @@ -625,7 +632,7 @@ static size_t csum_and_copy_to_pipe_iter(const void *addr, size_t bytes,
>  	return bytes;
>  }
>  
> -size_t _copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
> +static size_t xxx_copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
>  {
>  	const char *from = addr;
>  	if (unlikely(iov_iter_is_pipe(i)))
> @@ -641,7 +648,6 @@ size_t _copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
>  
>  	return bytes;
>  }
> -EXPORT_SYMBOL(_copy_to_iter);
>  
>  #ifdef CONFIG_ARCH_HAS_COPY_MC
>  static int copyout_mc(void __user *to, const void *from, size_t n)
> @@ -723,7 +729,7 @@ static size_t copy_mc_pipe_to_iter(const void *addr, size_t bytes,
>   *   Compare to copy_to_iter() where only ITER_IOVEC attempts might return
>   *   a short copy.
>   */
> -size_t _copy_mc_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
> +static size_t xxx_copy_mc_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
>  {
>  	const char *from = addr;
>  	unsigned long rem, curr_addr, s_addr = (unsigned long) addr;
> @@ -757,10 +763,9 @@ size_t _copy_mc_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
>  
>  	return bytes;
>  }
> -EXPORT_SYMBOL_GPL(_copy_mc_to_iter);
>  #endif /* CONFIG_ARCH_HAS_COPY_MC */
>  
> -size_t _copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
> +static size_t xxx_copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
>  {
>  	char *to = addr;
>  	if (unlikely(iov_iter_is_pipe(i))) {
> @@ -778,9 +783,8 @@ size_t _copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
>  
>  	return bytes;
>  }
> -EXPORT_SYMBOL(_copy_from_iter);
>  
> -bool _copy_from_iter_full(void *addr, size_t bytes, struct iov_iter *i)
> +static bool xxx_copy_from_iter_full(void *addr, size_t bytes, struct iov_iter *i)
>  {
>  	char *to = addr;
>  	if (unlikely(iov_iter_is_pipe(i))) {
> @@ -805,9 +809,8 @@ bool _copy_from_iter_full(void *addr, size_t bytes, struct iov_iter *i)
>  	iov_iter_advance(i, bytes);
>  	return true;
>  }
> -EXPORT_SYMBOL(_copy_from_iter_full);
>  
> -size_t _copy_from_iter_nocache(void *addr, size_t bytes, struct iov_iter *i)
> +static size_t xxx_copy_from_iter_nocache(void *addr, size_t bytes, struct iov_iter *i)
>  {
>  	char *to = addr;
>  	if (unlikely(iov_iter_is_pipe(i))) {
> @@ -824,7 +827,6 @@ size_t _copy_from_iter_nocache(void *addr, size_t bytes, struct iov_iter *i)
>  
>  	return bytes;
>  }
> -EXPORT_SYMBOL(_copy_from_iter_nocache);
>  
>  #ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
>  /**
> @@ -841,7 +843,7 @@ EXPORT_SYMBOL(_copy_from_iter_nocache);
>   * bypass the cache for the ITER_IOVEC case, and on some archs may use
>   * instructions that strand dirty-data in the cache.
>   */
> -size_t _copy_from_iter_flushcache(void *addr, size_t bytes, struct iov_iter *i)
> +static size_t xxx_copy_from_iter_flushcache(void *addr, size_t bytes, struct iov_iter *i)
>  {
>  	char *to = addr;
>  	if (unlikely(iov_iter_is_pipe(i))) {
> @@ -859,10 +861,9 @@ size_t _copy_from_iter_flushcache(void *addr, size_t bytes, struct iov_iter *i)
>  
>  	return bytes;
>  }
> -EXPORT_SYMBOL_GPL(_copy_from_iter_flushcache);
>  #endif
>  
> -bool _copy_from_iter_full_nocache(void *addr, size_t bytes, struct iov_iter *i)
> +static bool xxx_copy_from_iter_full_nocache(void *addr, size_t bytes, struct iov_iter *i)
>  {
>  	char *to = addr;
>  	if (unlikely(iov_iter_is_pipe(i))) {
> @@ -884,7 +885,6 @@ bool _copy_from_iter_full_nocache(void *addr, size_t bytes, struct iov_iter *i)
>  	iov_iter_advance(i, bytes);
>  	return true;
>  }
> -EXPORT_SYMBOL(_copy_from_iter_full_nocache);
>  
>  static inline bool page_copy_sane(struct page *page, size_t offset, size_t n)
>  {
> @@ -910,12 +910,12 @@ static inline bool page_copy_sane(struct page *page, size_t offset, size_t n)
>  	return false;
>  }
>  
> -size_t copy_page_to_iter(struct page *page, size_t offset, size_t bytes,
> +static size_t xxx_copy_page_to_iter(struct page *page, size_t offset, size_t bytes,
>  			 struct iov_iter *i)
>  {
>  	if (unlikely(!page_copy_sane(page, offset, bytes)))
>  		return 0;
> -	if (i->type & (ITER_BVEC|ITER_KVEC)) {
> +	if (iov_iter_type(i) & (ITER_BVEC|ITER_KVEC)) {
>  		void *kaddr = kmap_atomic(page);
>  		size_t wanted = copy_to_iter(kaddr + offset, bytes, i);
>  		kunmap_atomic(kaddr);
> @@ -927,9 +927,8 @@ size_t copy_page_to_iter(struct page *page, size_t offset, size_t bytes,
>  	else
>  		return copy_page_to_iter_pipe(page, offset, bytes, i);
>  }
> -EXPORT_SYMBOL(copy_page_to_iter);
>  
> -size_t copy_page_from_iter(struct page *page, size_t offset, size_t bytes,
> +static size_t xxx_copy_page_from_iter(struct page *page, size_t offset, size_t bytes,
>  			 struct iov_iter *i)
>  {
>  	if (unlikely(!page_copy_sane(page, offset, bytes)))
> @@ -938,15 +937,14 @@ size_t copy_page_from_iter(struct page *page, size_t offset, size_t bytes,
>  		WARN_ON(1);
>  		return 0;
>  	}
> -	if (i->type & (ITER_BVEC|ITER_KVEC)) {
> +	if (iov_iter_type(i) & (ITER_BVEC|ITER_KVEC)) {
>  		void *kaddr = kmap_atomic(page);
> -		size_t wanted = _copy_from_iter(kaddr + offset, bytes, i);
> +		size_t wanted = xxx_copy_from_iter(kaddr + offset, bytes, i);
>  		kunmap_atomic(kaddr);
>  		return wanted;
>  	} else
>  		return copy_page_from_iter_iovec(page, offset, bytes, i);
>  }
> -EXPORT_SYMBOL(copy_page_from_iter);
>  
>  static size_t pipe_zero(size_t bytes, struct iov_iter *i)
>  {
> @@ -975,7 +973,7 @@ static size_t pipe_zero(size_t bytes, struct iov_iter *i)
>  	return bytes;
>  }
>  
> -size_t iov_iter_zero(size_t bytes, struct iov_iter *i)
> +static size_t xxx_zero(size_t bytes, struct iov_iter *i)
>  {
>  	if (unlikely(iov_iter_is_pipe(i)))
>  		return pipe_zero(bytes, i);
> @@ -987,9 +985,8 @@ size_t iov_iter_zero(size_t bytes, struct iov_iter *i)
>  
>  	return bytes;
>  }
> -EXPORT_SYMBOL(iov_iter_zero);
>  
> -size_t iov_iter_copy_from_user_atomic(struct page *page,
> +static size_t xxx_copy_from_user_atomic(struct page *page,
>  		struct iov_iter *i, unsigned long offset, size_t bytes)
>  {
>  	char *kaddr = kmap_atomic(page), *p = kaddr + offset;
> @@ -1011,7 +1008,6 @@ size_t iov_iter_copy_from_user_atomic(struct page *page,
>  	kunmap_atomic(kaddr);
>  	return bytes;
>  }
> -EXPORT_SYMBOL(iov_iter_copy_from_user_atomic);
>  
>  static inline void pipe_truncate(struct iov_iter *i)
>  {
> @@ -1067,7 +1063,7 @@ static void pipe_advance(struct iov_iter *i, size_t size)
>  	pipe_truncate(i);
>  }
>  
> -void iov_iter_advance(struct iov_iter *i, size_t size)
> +static void xxx_advance(struct iov_iter *i, size_t size)
>  {
>  	if (unlikely(iov_iter_is_pipe(i))) {
>  		pipe_advance(i, size);
> @@ -1079,9 +1075,8 @@ void iov_iter_advance(struct iov_iter *i, size_t size)
>  	}
>  	iterate_and_advance(i, size, v, 0, 0, 0)
>  }
> -EXPORT_SYMBOL(iov_iter_advance);
>  
> -void iov_iter_revert(struct iov_iter *i, size_t unroll)
> +static void xxx_revert(struct iov_iter *i, size_t unroll)
>  {
>  	if (!unroll)
>  		return;
> @@ -1147,12 +1142,11 @@ void iov_iter_revert(struct iov_iter *i, size_t unroll)
>  		}
>  	}
>  }
> -EXPORT_SYMBOL(iov_iter_revert);
>  
>  /*
>   * Return the count of just the current iov_iter segment.
>   */
> -size_t iov_iter_single_seg_count(const struct iov_iter *i)
> +static size_t xxx_single_seg_count(const struct iov_iter *i)
>  {
>  	if (unlikely(iov_iter_is_pipe(i)))
>  		return i->count;	// it is a silly place, anyway
> @@ -1165,14 +1159,14 @@ size_t iov_iter_single_seg_count(const struct iov_iter *i)
>  	else
>  		return min(i->count, i->iov->iov_len - i->iov_offset);
>  }
> -EXPORT_SYMBOL(iov_iter_single_seg_count);
>  
>  void iov_iter_kvec(struct iov_iter *i, unsigned int direction,
> -			const struct kvec *kvec, unsigned long nr_segs,
> -			size_t count)
> +		   const struct kvec *kvec, unsigned long nr_segs,
> +		   size_t count)
>  {
>  	WARN_ON(direction & ~(READ | WRITE));
> -	i->type = ITER_KVEC | (direction & (READ | WRITE));
> +	i->ops = &kvec_iter_ops;
> +	i->flags = direction & (READ | WRITE);
>  	i->kvec = kvec;
>  	i->nr_segs = nr_segs;
>  	i->iov_offset = 0;
> @@ -1185,7 +1179,8 @@ void iov_iter_bvec(struct iov_iter *i, unsigned int direction,
>  			size_t count)
>  {
>  	WARN_ON(direction & ~(READ | WRITE));
> -	i->type = ITER_BVEC | (direction & (READ | WRITE));
> +	i->ops = &bvec_iter_ops;
> +	i->flags = direction & (READ | WRITE);
>  	i->bvec = bvec;
>  	i->nr_segs = nr_segs;
>  	i->iov_offset = 0;
> @@ -1199,7 +1194,8 @@ void iov_iter_pipe(struct iov_iter *i, unsigned int direction,
>  {
>  	BUG_ON(direction != READ);
>  	WARN_ON(pipe_full(pipe->head, pipe->tail, pipe->ring_size));
> -	i->type = ITER_PIPE | READ;
> +	i->ops = &pipe_iter_ops;
> +	i->flags = READ;
>  	i->pipe = pipe;
>  	i->head = pipe->head;
>  	i->iov_offset = 0;
> @@ -1220,13 +1216,14 @@ EXPORT_SYMBOL(iov_iter_pipe);
>  void iov_iter_discard(struct iov_iter *i, unsigned int direction, size_t count)
>  {
>  	BUG_ON(direction != READ);
> -	i->type = ITER_DISCARD | READ;
> +	i->ops = &discard_iter_ops;
> +	i->flags = READ;
>  	i->count = count;
>  	i->iov_offset = 0;
>  }
>  EXPORT_SYMBOL(iov_iter_discard);
>  
> -unsigned long iov_iter_alignment(const struct iov_iter *i)
> +static unsigned long xxx_alignment(const struct iov_iter *i)
>  {
>  	unsigned long res = 0;
>  	size_t size = i->count;
> @@ -1245,9 +1242,8 @@ unsigned long iov_iter_alignment(const struct iov_iter *i)
>  	)
>  	return res;
>  }
> -EXPORT_SYMBOL(iov_iter_alignment);
>  
> -unsigned long iov_iter_gap_alignment(const struct iov_iter *i)
> +static unsigned long xxx_gap_alignment(const struct iov_iter *i)
>  {
>  	unsigned long res = 0;
>  	size_t size = i->count;
> @@ -1267,7 +1263,6 @@ unsigned long iov_iter_gap_alignment(const struct iov_iter *i)
>  		);
>  	return res;
>  }
> -EXPORT_SYMBOL(iov_iter_gap_alignment);
>  
>  static inline ssize_t __pipe_get_pages(struct iov_iter *i,
>  				size_t maxsize,
> @@ -1313,7 +1308,7 @@ static ssize_t pipe_get_pages(struct iov_iter *i,
>  	return __pipe_get_pages(i, min(maxsize, capacity), pages, iter_head, start);
>  }
>  
> -ssize_t iov_iter_get_pages(struct iov_iter *i,
> +static ssize_t xxx_get_pages(struct iov_iter *i,
>  		   struct page **pages, size_t maxsize, unsigned maxpages,
>  		   size_t *start)
>  {
> @@ -1352,7 +1347,6 @@ ssize_t iov_iter_get_pages(struct iov_iter *i,
>  	)
>  	return 0;
>  }
> -EXPORT_SYMBOL(iov_iter_get_pages);
>  
>  static struct page **get_pages_array(size_t n)
>  {
> @@ -1392,7 +1386,7 @@ static ssize_t pipe_get_pages_alloc(struct iov_iter *i,
>  	return n;
>  }
>  
> -ssize_t iov_iter_get_pages_alloc(struct iov_iter *i,
> +static ssize_t xxx_get_pages_alloc(struct iov_iter *i,
>  		   struct page ***pages, size_t maxsize,
>  		   size_t *start)
>  {
> @@ -1439,9 +1433,8 @@ ssize_t iov_iter_get_pages_alloc(struct iov_iter *i,
>  	)
>  	return 0;
>  }
> -EXPORT_SYMBOL(iov_iter_get_pages_alloc);
>  
> -size_t csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum,
> +static size_t xxx_csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum,
>  			       struct iov_iter *i)
>  {
>  	char *to = addr;
> @@ -1478,9 +1471,8 @@ size_t csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum,
>  	*csum = sum;
>  	return bytes;
>  }
> -EXPORT_SYMBOL(csum_and_copy_from_iter);
>  
> -bool csum_and_copy_from_iter_full(void *addr, size_t bytes, __wsum *csum,
> +static bool xxx_csum_and_copy_from_iter_full(void *addr, size_t bytes, __wsum *csum,
>  			       struct iov_iter *i)
>  {
>  	char *to = addr;
> @@ -1520,9 +1512,8 @@ bool csum_and_copy_from_iter_full(void *addr, size_t bytes, __wsum *csum,
>  	iov_iter_advance(i, bytes);
>  	return true;
>  }
> -EXPORT_SYMBOL(csum_and_copy_from_iter_full);
>  
> -size_t csum_and_copy_to_iter(const void *addr, size_t bytes, void *csump,
> +static size_t xxx_csum_and_copy_to_iter(const void *addr, size_t bytes, void *csump,
>  			     struct iov_iter *i)
>  {
>  	const char *from = addr;
> @@ -1564,7 +1555,6 @@ size_t csum_and_copy_to_iter(const void *addr, size_t bytes, void *csump,
>  	*csum = sum;
>  	return bytes;
>  }
> -EXPORT_SYMBOL(csum_and_copy_to_iter);
>  
>  size_t hash_and_copy_to_iter(const void *addr, size_t bytes, void *hashp,
>  		struct iov_iter *i)
> @@ -1585,7 +1575,7 @@ size_t hash_and_copy_to_iter(const void *addr, size_t bytes, void *hashp,
>  }
>  EXPORT_SYMBOL(hash_and_copy_to_iter);
>  
> -int iov_iter_npages(const struct iov_iter *i, int maxpages)
> +static int xxx_npages(const struct iov_iter *i, int maxpages)
>  {
>  	size_t size = i->count;
>  	int npages = 0;
> @@ -1628,9 +1618,8 @@ int iov_iter_npages(const struct iov_iter *i, int maxpages)
>  	)
>  	return npages;
>  }
> -EXPORT_SYMBOL(iov_iter_npages);
>  
> -const void *dup_iter(struct iov_iter *new, struct iov_iter *old, gfp_t flags)
> +static const void *xxx_dup_iter(struct iov_iter *new, struct iov_iter *old, gfp_t flags)
>  {
>  	*new = *old;
>  	if (unlikely(iov_iter_is_pipe(new))) {
> @@ -1649,7 +1638,6 @@ const void *dup_iter(struct iov_iter *new, struct iov_iter *old, gfp_t flags)
>  				   new->nr_segs * sizeof(struct iovec),
>  				   flags);
>  }
> -EXPORT_SYMBOL(dup_iter);
>  
>  static int copy_compat_iovec_from_user(struct iovec *iov,
>  		const struct iovec __user *uvec, unsigned long nr_segs)
> @@ -1826,7 +1814,7 @@ int import_single_range(int rw, void __user *buf, size_t len,
>  }
>  EXPORT_SYMBOL(import_single_range);
>  
> -int iov_iter_for_each_range(struct iov_iter *i, size_t bytes,
> +static int xxx_for_each_range(struct iov_iter *i, size_t bytes,
>  			    int (*f)(struct kvec *vec, void *context),
>  			    void *context)
>  {
> @@ -1846,4 +1834,173 @@ int iov_iter_for_each_range(struct iov_iter *i, size_t bytes,
>  	)
>  	return err;
>  }
> -EXPORT_SYMBOL(iov_iter_for_each_range);
> +
> +static const struct iov_iter_ops iovec_iter_ops = {
> +	.type				= ITER_IOVEC,
> +	.copy_from_user_atomic		= xxx_copy_from_user_atomic,
> +	.advance			= xxx_advance,
> +	.revert				= xxx_revert,
> +	.fault_in_readable		= xxx_fault_in_readable,
> +	.single_seg_count		= xxx_single_seg_count,
> +	.copy_page_to_iter		= xxx_copy_page_to_iter,
> +	.copy_page_from_iter		= xxx_copy_page_from_iter,
> +	.copy_to_iter			= xxx_copy_to_iter,
> +	.copy_from_iter			= xxx_copy_from_iter,
> +	.copy_from_iter_full		= xxx_copy_from_iter_full,
> +	.copy_from_iter_nocache		= xxx_copy_from_iter_nocache,
> +	.copy_from_iter_full_nocache	= xxx_copy_from_iter_full_nocache,
> +#ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
> +	.copy_from_iter_flushcache	= xxx_copy_from_iter_flushcache,
> +#endif
> +#ifdef CONFIG_ARCH_HAS_COPY_MC
> +	.copy_mc_to_iter		= xxx_copy_mc_to_iter,
> +#endif
> +	.csum_and_copy_to_iter		= xxx_csum_and_copy_to_iter,
> +	.csum_and_copy_from_iter	= xxx_csum_and_copy_from_iter,
> +	.csum_and_copy_from_iter_full	= xxx_csum_and_copy_from_iter_full,
> +
> +	.zero				= xxx_zero,
> +	.alignment			= xxx_alignment,
> +	.gap_alignment			= xxx_gap_alignment,
> +	.get_pages			= xxx_get_pages,
> +	.get_pages_alloc		= xxx_get_pages_alloc,
> +	.npages				= xxx_npages,
> +	.dup_iter			= xxx_dup_iter,
> +	.for_each_range			= xxx_for_each_range,
> +};
> +
> +static const struct iov_iter_ops kvec_iter_ops = {
> +	.type				= ITER_KVEC,
> +	.copy_from_user_atomic		= xxx_copy_from_user_atomic,
> +	.advance			= xxx_advance,
> +	.revert				= xxx_revert,
> +	.fault_in_readable		= xxx_fault_in_readable,
> +	.single_seg_count		= xxx_single_seg_count,
> +	.copy_page_to_iter		= xxx_copy_page_to_iter,
> +	.copy_page_from_iter		= xxx_copy_page_from_iter,
> +	.copy_to_iter			= xxx_copy_to_iter,
> +	.copy_from_iter			= xxx_copy_from_iter,
> +	.copy_from_iter_full		= xxx_copy_from_iter_full,
> +	.copy_from_iter_nocache		= xxx_copy_from_iter_nocache,
> +	.copy_from_iter_full_nocache	= xxx_copy_from_iter_full_nocache,
> +#ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
> +	.copy_from_iter_flushcache	= xxx_copy_from_iter_flushcache,
> +#endif
> +#ifdef CONFIG_ARCH_HAS_COPY_MC
> +	.copy_mc_to_iter		= xxx_copy_mc_to_iter,
> +#endif
> +	.csum_and_copy_to_iter		= xxx_csum_and_copy_to_iter,
> +	.csum_and_copy_from_iter	= xxx_csum_and_copy_from_iter,
> +	.csum_and_copy_from_iter_full	= xxx_csum_and_copy_from_iter_full,
> +
> +	.zero				= xxx_zero,
> +	.alignment			= xxx_alignment,
> +	.gap_alignment			= xxx_gap_alignment,
> +	.get_pages			= xxx_get_pages,
> +	.get_pages_alloc		= xxx_get_pages_alloc,
> +	.npages				= xxx_npages,
> +	.dup_iter			= xxx_dup_iter,
> +	.for_each_range			= xxx_for_each_range,
> +};
> +
> +static const struct iov_iter_ops bvec_iter_ops = {
> +	.type				= ITER_BVEC,
> +	.copy_from_user_atomic		= xxx_copy_from_user_atomic,
> +	.advance			= xxx_advance,
> +	.revert				= xxx_revert,
> +	.fault_in_readable		= xxx_fault_in_readable,
> +	.single_seg_count		= xxx_single_seg_count,
> +	.copy_page_to_iter		= xxx_copy_page_to_iter,
> +	.copy_page_from_iter		= xxx_copy_page_from_iter,
> +	.copy_to_iter			= xxx_copy_to_iter,
> +	.copy_from_iter			= xxx_copy_from_iter,
> +	.copy_from_iter_full		= xxx_copy_from_iter_full,
> +	.copy_from_iter_nocache		= xxx_copy_from_iter_nocache,
> +	.copy_from_iter_full_nocache	= xxx_copy_from_iter_full_nocache,
> +#ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
> +	.copy_from_iter_flushcache	= xxx_copy_from_iter_flushcache,
> +#endif
> +#ifdef CONFIG_ARCH_HAS_COPY_MC
> +	.copy_mc_to_iter		= xxx_copy_mc_to_iter,
> +#endif
> +	.csum_and_copy_to_iter		= xxx_csum_and_copy_to_iter,
> +	.csum_and_copy_from_iter	= xxx_csum_and_copy_from_iter,
> +	.csum_and_copy_from_iter_full	= xxx_csum_and_copy_from_iter_full,
> +
> +	.zero				= xxx_zero,
> +	.alignment			= xxx_alignment,
> +	.gap_alignment			= xxx_gap_alignment,
> +	.get_pages			= xxx_get_pages,
> +	.get_pages_alloc		= xxx_get_pages_alloc,
> +	.npages				= xxx_npages,
> +	.dup_iter			= xxx_dup_iter,
> +	.for_each_range			= xxx_for_each_range,
> +};
> +
> +static const struct iov_iter_ops pipe_iter_ops = {
> +	.type				= ITER_PIPE,
> +	.copy_from_user_atomic		= xxx_copy_from_user_atomic,
> +	.advance			= xxx_advance,
> +	.revert				= xxx_revert,
> +	.fault_in_readable		= xxx_fault_in_readable,
> +	.single_seg_count		= xxx_single_seg_count,
> +	.copy_page_to_iter		= xxx_copy_page_to_iter,
> +	.copy_page_from_iter		= xxx_copy_page_from_iter,
> +	.copy_to_iter			= xxx_copy_to_iter,
> +	.copy_from_iter			= xxx_copy_from_iter,
> +	.copy_from_iter_full		= xxx_copy_from_iter_full,
> +	.copy_from_iter_nocache		= xxx_copy_from_iter_nocache,
> +	.copy_from_iter_full_nocache	= xxx_copy_from_iter_full_nocache,
> +#ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
> +	.copy_from_iter_flushcache	= xxx_copy_from_iter_flushcache,
> +#endif
> +#ifdef CONFIG_ARCH_HAS_COPY_MC
> +	.copy_mc_to_iter		= xxx_copy_mc_to_iter,
> +#endif
> +	.csum_and_copy_to_iter		= xxx_csum_and_copy_to_iter,
> +	.csum_and_copy_from_iter	= xxx_csum_and_copy_from_iter,
> +	.csum_and_copy_from_iter_full	= xxx_csum_and_copy_from_iter_full,
> +
> +	.zero				= xxx_zero,
> +	.alignment			= xxx_alignment,
> +	.gap_alignment			= xxx_gap_alignment,
> +	.get_pages			= xxx_get_pages,
> +	.get_pages_alloc		= xxx_get_pages_alloc,
> +	.npages				= xxx_npages,
> +	.dup_iter			= xxx_dup_iter,
> +	.for_each_range			= xxx_for_each_range,
> +};
> +
> +static const struct iov_iter_ops discard_iter_ops = {
> +	.type				= ITER_DISCARD,
> +	.copy_from_user_atomic		= xxx_copy_from_user_atomic,
> +	.advance			= xxx_advance,
> +	.revert				= xxx_revert,
> +	.fault_in_readable		= xxx_fault_in_readable,
> +	.single_seg_count		= xxx_single_seg_count,
> +	.copy_page_to_iter		= xxx_copy_page_to_iter,
> +	.copy_page_from_iter		= xxx_copy_page_from_iter,
> +	.copy_to_iter			= xxx_copy_to_iter,
> +	.copy_from_iter			= xxx_copy_from_iter,
> +	.copy_from_iter_full		= xxx_copy_from_iter_full,
> +	.copy_from_iter_nocache		= xxx_copy_from_iter_nocache,
> +	.copy_from_iter_full_nocache	= xxx_copy_from_iter_full_nocache,
> +#ifdef CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE
> +	.copy_from_iter_flushcache	= xxx_copy_from_iter_flushcache,
> +#endif
> +#ifdef CONFIG_ARCH_HAS_COPY_MC
> +	.copy_mc_to_iter		= xxx_copy_mc_to_iter,
> +#endif
> +	.csum_and_copy_to_iter		= xxx_csum_and_copy_to_iter,
> +	.csum_and_copy_from_iter	= xxx_csum_and_copy_from_iter,
> +	.csum_and_copy_from_iter_full	= xxx_csum_and_copy_from_iter_full,
> +
> +	.zero				= xxx_zero,
> +	.alignment			= xxx_alignment,
> +	.gap_alignment			= xxx_gap_alignment,
> +	.get_pages			= xxx_get_pages,
> +	.get_pages_alloc		= xxx_get_pages_alloc,
> +	.npages				= xxx_npages,
> +	.dup_iter			= xxx_dup_iter,
> +	.for_each_range			= xxx_for_each_range,
> +};
> 
> 

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 00/29] RFC: iov_iter: Switch to using an ops table
  2020-11-21 14:13 [PATCH 00/29] RFC: iov_iter: Switch to using an ops table David Howells
                   ` (28 preceding siblings ...)
  2020-11-21 14:17 ` [PATCH 29/29] iov_iter: Remove iterate_all_kinds() and iterate_and_advance() David Howells
@ 2020-11-21 14:34 ` Pavel Begunkov
  2020-11-21 18:23 ` Linus Torvalds
  2020-12-11  3:24 ` Matthew Wilcox
  31 siblings, 0 replies; 55+ messages in thread
From: Pavel Begunkov @ 2020-11-21 14:34 UTC (permalink / raw)
  To: David Howells, Matthew Wilcox, Jens Axboe, Alexander Viro
  Cc: Linus Torvalds, linux-fsdevel, linux-block, linux-kernel

On 21/11/2020 14:13, David Howells wrote:
> 
> Hi Pavel, Willy, Jens, Al,
> 
> I had a go switching the iov_iter stuff away from using a type bitmask to
> using an ops table to get rid of the if-if-if-if chains that are all over
> the place.  After I pushed it, someone pointed me at Pavel's two patches.
> 
> I have another iterator class that I want to add - which would lengthen the
> if-if-if-if chains.  A lot of the time, there's a conditional clause at the
> beginning of a function that just jumps off to a type-specific handler or
> to reject the operation for that type.  An ops table can just point to that
> instead.
> 
> As far as I can tell, there's no difference in performance in most cases,
> though doing AFS-based kernel compiles appears to take less time (down from
> 3m20 to 2m50), which might make sense as that uses iterators a lot - but
> there are too many variables in that for that to be a good benchmark (I'm
> dealing with a remote server, for a start).
> 
> Can someone recommend a good way to benchmark this properly?  The problem
> is that the difference this makes relative to the amount of time taken to
> actually do I/O is tiny.

I see enough iov_iter overhead running fio's t/io_uring with null_blk.
Not sure whether it'll help you, but it's worth a try.

> 
> I've tried TCP transfers using the following sink program:
> 
> 	#include <stdio.h>
> 	#include <stdlib.h>
> 	#include <string.h>
> 	#include <fcntl.h>
> 	#include <unistd.h>
> 	#include <netinet/in.h>
> 	#define OSERROR(X, Y) do { if ((long)(X) == -1) { perror(Y); exit(1); } } while(0)
> 	static unsigned char buffer[512 * 1024] __attribute__((aligned(4096)));
> 	int main(int argc, char *argv[])
> 	{
> 		struct sockaddr_in sin = { .sin_family = AF_INET, .sin_port = htons(5555) };
> 		int sfd, afd;
> 		sfd = socket(AF_INET, SOCK_STREAM, 0);
> 		OSERROR(sfd, "socket");
> 		OSERROR(bind(sfd, (struct sockaddr *)&sin, sizeof(sin)), "bind");
> 		OSERROR(listen(sfd, 1), "listen");
> 		for (;;) {
> 			afd = accept(sfd, NULL, NULL);
> 			if (afd != -1) {
> 				while (read(afd, buffer, sizeof(buffer)) > 0) {}
> 				close(afd);
> 			}
> 		}
> 	}
> 
> and send program:
> 
> 	#include <stdio.h>
> 	#include <stdlib.h>
> 	#include <string.h>
> 	#include <fcntl.h>
> 	#include <unistd.h>
> 	#include <netdb.h>
> 	#include <netinet/in.h>
> 	#include <sys/stat.h>
> 	#include <sys/sendfile.h>
> 	#define OSERROR(X, Y) do { if ((long)(X) == -1) { perror(Y); exit(1); } } while(0)
> 	static unsigned char buffer[512*1024] __attribute__((aligned(4096)));
> 	int main(int argc, char *argv[])
> 	{
> 		struct sockaddr_in sin = { .sin_family = AF_INET, .sin_port = htons(5555) };
> 		struct hostent *h;
> 		ssize_t size, r, o;
> 		int cfd;
> 		if (argc != 3) {
> 			fprintf(stderr, "tcp-gen <server> <size>\n");
> 			exit(2);
> 		}
> 		size = strtoul(argv[2], NULL, 0);
> 		if (size <= 0) {
> 			fprintf(stderr, "Bad size\n");
> 			exit(2);
> 		}
> 		h = gethostbyname(argv[1]);
> 		if (!h) {
> 			fprintf(stderr, "%s: %s\n", argv[1], hstrerror(h_errno));
> 			exit(3);
> 		}
> 		if (!h->h_addr_list[0]) {
> 			fprintf(stderr, "%s: No addresses\n", argv[1]);
> 			exit(3);
> 		}
> 		memcpy(&sin.sin_addr, h->h_addr_list[0], h->h_length);
> 		cfd = socket(AF_INET, SOCK_STREAM, 0);
> 		OSERROR(cfd, "socket");
> 		OSERROR(connect(cfd, (struct sockaddr *)&sin, sizeof(sin)), "connect");
> 		do {
> 			r = size > sizeof(buffer) ? sizeof(buffer) : size;
> 			size -= r;
> 			o = 0;
> 			do {
> 				ssize_t w = write(cfd, buffer + o, r - o);
> 				OSERROR(w, "write");
> 				o += w;
> 			} while (o < r);
> 		} while (size > 0);
> 		OSERROR(close(cfd), "close/c");
> 		return 0;
> 	}
> 
> since the socket interface uses iterators.  It seems to show no difference.
> One side note, though: I've been doing 10GiB same-machine transfers, and it
> takes either ~2.5s or ~0.87s and rarely in between, with or without these
> patches, alternating apparently randomly between the two times.
> 
> The patches can be found here:
> 
> 	https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=iov-ops
> 
> David
> ---
> David Howells (29):
>       iov_iter: Switch to using a table of operations
>       iov_iter: Split copy_page_to_iter()
>       iov_iter: Split iov_iter_fault_in_readable
>       iov_iter: Split the iterate_and_advance() macro
>       iov_iter: Split copy_to_iter()
>       iov_iter: Split copy_mc_to_iter()
>       iov_iter: Split copy_from_iter()
>       iov_iter: Split the iterate_all_kinds() macro
>       iov_iter: Split copy_from_iter_full()
>       iov_iter: Split copy_from_iter_nocache()
>       iov_iter: Split copy_from_iter_flushcache()
>       iov_iter: Split copy_from_iter_full_nocache()
>       iov_iter: Split copy_page_from_iter()
>       iov_iter: Split iov_iter_zero()
>       iov_iter: Split copy_from_user_atomic()
>       iov_iter: Split iov_iter_advance()
>       iov_iter: Split iov_iter_revert()
>       iov_iter: Split iov_iter_single_seg_count()
>       iov_iter: Split iov_iter_alignment()
>       iov_iter: Split iov_iter_gap_alignment()
>       iov_iter: Split iov_iter_get_pages()
>       iov_iter: Split iov_iter_get_pages_alloc()
>       iov_iter: Split csum_and_copy_from_iter()
>       iov_iter: Split csum_and_copy_from_iter_full()
>       iov_iter: Split csum_and_copy_to_iter()
>       iov_iter: Split iov_iter_npages()
>       iov_iter: Split dup_iter()
>       iov_iter: Split iov_iter_for_each_range()
>       iov_iter: Remove iterate_all_kinds() and iterate_and_advance()
> 
> 
>  lib/iov_iter.c | 1440 +++++++++++++++++++++++++++++++-----------------
>  1 file changed, 934 insertions(+), 506 deletions(-)
> 
> 

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 01/29] iov_iter: Switch to using a table of operations
  2020-11-21 14:13 ` [PATCH 01/29] iov_iter: Switch to using a table of operations David Howells
  2020-11-21 14:31   ` Pavel Begunkov
@ 2020-11-21 18:21   ` Linus Torvalds
  2020-12-11  1:30     ` Al Viro
  2020-11-22 13:33   ` David Howells
                     ` (5 subsequent siblings)
  7 siblings, 1 reply; 55+ messages in thread
From: Linus Torvalds @ 2020-11-21 18:21 UTC (permalink / raw)
  To: David Howells
  Cc: Pavel Begunkov, Matthew Wilcox, Jens Axboe, Alexander Viro,
	linux-fsdevel, linux-block, Linux Kernel Mailing List

On Sat, Nov 21, 2020 at 6:13 AM David Howells <dhowells@redhat.com> wrote:
>
> Switch to using a table of operations.  In a future patch the individual
> methods will be split up by type.  For the moment, however, the ops tables
> just jump directly to the old functions - which are now static.  Inline
> wrappers are provided to jump through the hooks.

So I think conceptually this is the right thing to do, but I have a
couple of worries:

 - do we really need all those different versions? I'm thinking of the
"iter_full" versions in particular. I think the iter_full versions
could just be wrappers that call the regular iter thing and verify the
end result is full (and revert if not). No?

 - I don't like the xxx_iter_op naming - even as a temporary thing.

   Please don't use "xxx" as a placeholder. It's not a great grep
pattern, it's not really descriptive, and we've literally had issues
with things being marked as spam when you use that. So it's about the
worst pattern to use.

   Use "anycase" - or something like that - which is descriptive and
greps much better (ie not a single hit for that pattern in the kernel
either before or after).

 - I worry a bit about the indirect call overhead and spectre v2.

   So yeah, it would be good to have benchmarks to make sure this
doesn't regress for some simple case.

Other than those things, my initial reaction is "this does seem cleaner".

Al?

              Linus

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 00/29] RFC: iov_iter: Switch to using an ops table
  2020-11-21 14:13 [PATCH 00/29] RFC: iov_iter: Switch to using an ops table David Howells
                   ` (29 preceding siblings ...)
  2020-11-21 14:34 ` [PATCH 00/29] RFC: iov_iter: Switch to using an ops table Pavel Begunkov
@ 2020-11-21 18:23 ` Linus Torvalds
  2020-12-11  3:24 ` Matthew Wilcox
  31 siblings, 0 replies; 55+ messages in thread
From: Linus Torvalds @ 2020-11-21 18:23 UTC (permalink / raw)
  To: David Howells
  Cc: Pavel Begunkov, Matthew Wilcox, Jens Axboe, Alexander Viro,
	linux-fsdevel, linux-block, Linux Kernel Mailing List

On Sat, Nov 21, 2020 at 6:13 AM David Howells <dhowells@redhat.com> wrote:
>
> Can someone recommend a good way to benchmark this properly?  The problem
> is that the difference this makes relative to the amount of time taken to
> actually do I/O is tiny.

Maybe try /dev/zero -> /dev/null to try a load where the IO itself is
cheap. Or vmsplice to /dev/null?

         Linus
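Linus's /dev/zero -> /dev/null suggestion is a one-liner; the zero device's read path goes through the iterator code (iov_iter_zero() on current kernels, as far as I can tell), while the "I/O" itself costs almost nothing, so iterator overhead is less diluted. Sizes and flags below are illustrative; time it with `time` and compare kernels over several runs.

```shell
# Stream 1 GiB of zeroes into the bit bucket. The devices are nearly
# free, so per-call iterator overhead dominates more than in real I/O.
dd if=/dev/zero of=/dev/null bs=1M count=1024
```

Repeating with a small block size (e.g. bs=4k) raises the syscall rate and amplifies any per-call dispatch cost.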

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 01/29] iov_iter: Switch to using a table of operations
  2020-11-21 14:13 ` [PATCH 01/29] iov_iter: Switch to using a table of operations David Howells
  2020-11-21 14:31   ` Pavel Begunkov
  2020-11-21 18:21   ` Linus Torvalds
@ 2020-11-22 13:33   ` David Howells
  2020-11-22 13:58     ` David Laight
  2020-11-22 19:22     ` Linus Torvalds
  2020-11-22 22:46   ` David Laight
                     ` (4 subsequent siblings)
  7 siblings, 2 replies; 55+ messages in thread
From: David Howells @ 2020-11-22 13:33 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: dhowells, Pavel Begunkov, Matthew Wilcox, Jens Axboe,
	Alexander Viro, linux-fsdevel, linux-block,
	Linux Kernel Mailing List

Linus Torvalds <torvalds@linux-foundation.org> wrote:

>  - I worry a bit about the indirect call overhead and spectre v2.

I don't know enough about how spectre v2 works to say if this would be a
problem for the ops-table approach, but wouldn't it also affect the chain of
conditional branches that we currently use, since it's branch-prediction
based?

David


^ permalink raw reply	[flat|nested] 55+ messages in thread

* RE: [PATCH 01/29] iov_iter: Switch to using a table of operations
  2020-11-22 13:33   ` David Howells
@ 2020-11-22 13:58     ` David Laight
  2020-11-22 19:22     ` Linus Torvalds
  1 sibling, 0 replies; 55+ messages in thread
From: David Laight @ 2020-11-22 13:58 UTC (permalink / raw)
  To: 'David Howells', Linus Torvalds
  Cc: Pavel Begunkov, Matthew Wilcox, Jens Axboe, Alexander Viro,
	linux-fsdevel, linux-block, Linux Kernel Mailing List

From: David Howells
> Sent: 22 November 2020 13:33
> 
> Linus Torvalds <torvalds@linux-foundation.org> wrote:
> 
> >  - I worry a bit about the indirect call overhead and spectre v2.
> 
> I don't know enough about how spectre v2 works to say if this would be a
> problem for the ops-table approach, but wouldn't it also affect the chain of
> conditional branches that we currently use, since it's branch-prediction
> based?

The advantage of the 'chain of branches' is that it can be converted
into a 'tree of branches' because the values are all separate bits.

So as well as putting the (expected) common one first, you can do:
	if (likely(a & (A | B))) {
		if (a & A) {
			code for A;
		} else {
			code for B;
		}
	} else ...
So you get better control over the branch sequence.
(Hopefully the compiler doesn't change the logic.
I want a dumb compiler that (mostly) compiles what I write!)

Part of the difficulty is deciding the common case.
There'll always be a benchmark that exercises an uncommon case.

Adding an indirect call does let you do things like adding
ITER_IOVEC_SINGLE and ITER_KVEC_SINGLE that are used in the
common case of a single buffer fragment.
That might be a measurable gain.

It is also possible to optimise the common case to a direct
call (or even inline code) and use an indirect call for
everything else.
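A minimal sketch of that hybrid, assuming invented names (iter_ops, copy_iovec and the ITER_* constants are illustrative only, not the kernel's actual iov_iter API):

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative sketch only; all names here are invented. */
enum iter_type { ITER_IOVEC, ITER_KVEC, ITER_BVEC };

struct iter;
struct iter_ops {
	size_t (*copy)(struct iter *i, void *buf, size_t len);
};

struct iter {
	enum iter_type type;
	const struct iter_ops *ops;
};

/* Stand-in type-specific handler; a real one would copy data. */
static size_t copy_iovec(struct iter *i, void *buf, size_t len)
{
	(void)i; (void)buf;
	return len;
}

static size_t iter_copy(struct iter *i, void *buf, size_t len)
{
	/* Common case: a well-predicted conditional branch and a
	 * direct (inlinable) call - no retpoline involved. */
	if (i->type == ITER_IOVEC)
		return copy_iovec(i, buf, len);
	/* Everything else pays for the indirect call. */
	return i->ops->copy(i, buf, len);
}
```

The common case then costs one predictable branch, while the uncommon types still go through the table.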

	David



^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 01/29] iov_iter: Switch to using a table of operations
  2020-11-22 13:33   ` David Howells
  2020-11-22 13:58     ` David Laight
@ 2020-11-22 19:22     ` Linus Torvalds
  2020-11-22 22:34       ` David Laight
  1 sibling, 1 reply; 55+ messages in thread
From: Linus Torvalds @ 2020-11-22 19:22 UTC (permalink / raw)
  To: David Howells
  Cc: Pavel Begunkov, Matthew Wilcox, Jens Axboe, Alexander Viro,
	linux-fsdevel, linux-block, Linux Kernel Mailing List

On Sun, Nov 22, 2020 at 5:33 AM David Howells <dhowells@redhat.com> wrote:
>
> I don't know enough about how spectre v2 works to say if this would be a
> problem for the ops-table approach, but wouldn't it also affect the chain of
> conditional branches that we currently use, since it's branch-prediction
> based?

No, regular conditional branches aren't a problem. Yes, they may
mispredict, but outside of a few very rare cases that we handle
specially, that's not an issue.

Why? Because they always mispredict to one or the other side, so the
code flow may be mis-predicted, but it is fairly controlled.

In contrast, an indirect jump can mispredict the target, and branch
_anywhere_, and the attack vectors can poison the BTB (branch target
buffer), so our mitigation for that is that every single indirect
branch isn't predicted at all (using "retpoline").

So a conditional branch takes zero cycles when predicted (and most
will predict quite well). And as David Laight pointed out, a compiler
can also turn a series of conditional branches into a tree, which
means that N conditional branches basically only need log2(N)
conditionals executed.
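As a concrete sketch of such a tree (the T_* type bits here are invented for illustration, not the actual iov_iter flag values), four mutually exclusive type bits can be resolved in at most two conditional branches:

```c
#include <assert.h>

/* Illustrative sketch only; these bit values are invented. */
#define T_IOVEC 0x1
#define T_KVEC  0x2
#define T_BVEC  0x4
#define T_PIPE  0x8

/* Branch tree: first split the four cases in half, then pick
 * within each half, so any type is classified in at most two
 * branches instead of up to three with a linear if-else chain. */
static int classify(unsigned int type)
{
	if (type & (T_IOVEC | T_KVEC)) {
		if (type & T_IOVEC)
			return 0;	/* iovec */
		return 1;		/* kvec */
	}
	if (type & T_BVEC)
		return 2;		/* bvec */
	return 3;			/* pipe */
}
```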

In contrast, with retpoline in place, an indirect branch will
basically always take something like 25-30 cycles, because it always
mispredicts.

End result:

 - well-predicted conditional branches are basically free (apart from
code layout issues)

 - even with average prediction, a series of conditional branches has
to be fairly long for it to be worse than an indirect branch

 - only completely unpredictable conditional branches end up basically
losing, and even then you probably need more than one. And while
completely unpredictable conditional branches do exist, they are
pretty rare.

The other side of the coin, of course, is

 - often this is not measurable anyway.

 - code cleanliness is important

 - not everything needs retpolines and the expensive indirect branches.

So this is not in any way "indirect branches are bad". It's more of a
"indirect branches really aren't necessarily better than a couple of
conditionals, and _may_ be much worse".

For example, look at this gcc bugzilla:

    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86952

which basically is about the compiler generating a jump table (i.e. a
single indirect branch) vs a series of conditional branches. With
retpoline, the cross-over point is basically when you need to have
over 10 conditional branches - and because of the log2(N) behavior,
that's around a thousand cases!

(But this depends hugely on microarchitectural details).

             Linus

^ permalink raw reply	[flat|nested] 55+ messages in thread

* RE: [PATCH 01/29] iov_iter: Switch to using a table of operations
  2020-11-22 19:22     ` Linus Torvalds
@ 2020-11-22 22:34       ` David Laight
  0 siblings, 0 replies; 55+ messages in thread
From: David Laight @ 2020-11-22 22:34 UTC (permalink / raw)
  To: 'Linus Torvalds', David Howells
  Cc: Pavel Begunkov, Matthew Wilcox, Jens Axboe, Alexander Viro,
	linux-fsdevel, linux-block, Linux Kernel Mailing List

From: Linus Torvalds
> Sent: 22 November 2020 19:22
> Subject: Re: [PATCH 01/29] iov_iter: Switch to using a table of operations
> 
> On Sun, Nov 22, 2020 at 5:33 AM David Howells <dhowells@redhat.com> wrote:
> >
> > I don't know enough about how spectre v2 works to say if this would be a
> > problem for the ops-table approach, but wouldn't it also affect the chain of
> > conditional branches that we currently use, since it's branch-prediction
> > based?
> 
> No, regular conditional branches aren't a problem. Yes, they may
> mispredict, but outside of a few very rare cases that we handle
> specially, that's not an issue.
> 
> Why? Because they always mispredict to one or the other side, so the
> code flow may be mis-predicted, but it is fairly controlled.
> 
> In contrast, an indirect jump can mispredict the target, and branch
> _anywhere_, and the attack vectors can poison the BTB (branch target
> buffer), so our mitigation for that is that every single indirect
> branch isn't predicted at all (using "retpoline").
> 
> So a conditional branch takes zero cycles when predicted (and most
> will predict quite well). And as David Laight pointed out a compiler
> can also turn a series of conditional branches into a tree, means that
> N conditional branches basically only needs log2(N) conditionals
> executed.

The compiler can convert a switch statement into a branch tree.
But I don't think it can convert the 'if chain' in the current code
to one.

There is also the problem that some x86 CPUs can't predict branches
if too many of them land in the same cache line (or similar).

> In contrast, with retpoline in place, an indirect branch will
> basically always take something like 25-30 cycles, because it always
> mispredicts.

I also wonder whether a retpoline trashes the return-stack optimisation.
(If that is ever really a significant gain for real functions.)
 
...
> So this is not in any way "indirect branches are bad". It's more of a
> "indirect branches really aren't necessarily better than a couple of
> conditionals, and _may_ be much worse".

Even without retpolines, the jump table is likely to take a data-cache
miss (and maybe a TLB miss) unless you are running hot-cache.
That is probably an extra cache miss on top of the I-cache ones.
It is even worse if you end up with the jump table near the code,
since the data cache line and TLB entry might never be shared.

So a very short switch statement is likely to be better as
conditional jumps anyway.

> For example, look at this gcc bugzilla:
> 
>     https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86952
> 
> which basically is about the compiler generating a jump table (is a
> single indirect branch) vs a series of conditional branches. With
> retpoline, the cross-over point is basically when you need to have
> over 10 conditional branches - and because of the log2(N) behavior,
> that's around a thousand cases!

That was a hot-cache test.
Cold-cache is likely to favour the retpoline a little sooner.
(And the retpoline (probably) won't be (much) worse than the
mis-predicted indirect jump.)

I do wonder how much of the kernel actually runs hot-cache?
Except for parts that explicitly run things in bursts.

	David


^ permalink raw reply	[flat|nested] 55+ messages in thread

* RE: [PATCH 01/29] iov_iter: Switch to using a table of operations
  2020-11-21 14:13 ` [PATCH 01/29] iov_iter: Switch to using a table of operations David Howells
                     ` (2 preceding siblings ...)
  2020-11-22 13:33   ` David Howells
@ 2020-11-22 22:46   ` David Laight
  2020-11-23  8:05   ` Christoph Hellwig
                     ` (3 subsequent siblings)
  7 siblings, 0 replies; 55+ messages in thread
From: David Laight @ 2020-11-22 22:46 UTC (permalink / raw)
  To: 'David Howells',
	Pavel Begunkov, Matthew Wilcox, Jens Axboe, Alexander Viro
  Cc: Linus Torvalds, linux-fsdevel, linux-block, linux-kernel

From: David Howells
> Sent: 21 November 2020 14:14
> 
> Switch to using a table of operations.  In a future patch the individual
> methods will be split up by type.  For the moment, however, the ops tables
> just jump directly to the old functions - which are now static.  Inline
> wrappers are provided to jump through the hooks.

I was wondering if you could use a bit of 'cpp magic'
so that the call sites would be:
	ITER_CALL(iter, action)(arg_list);

which might expand to:
	iter->action(arg_list);
in the function-table case.
But it could also be an if-chain:
	if (iter->type & foo)
		foo_action(args);
	else ...
with foo_action() being inlined.

If there is enough symmetry it might make the code easier to read.
Although I'm not sure what happens to 'iterate_all_kinds'.
OTOH that is already unreadable.
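A rough sketch of what such a macro might look like (USE_OPS_TABLE, foo_action and bar_action are all hypothetical names; the real call sites would carry the iterator-specific argument lists):

```c
#include <assert.h>

/* Illustrative sketch only; every name here is invented. */
#define FOO 0x1
#define BAR 0x2

struct iter;
typedef int (*action_fn)(struct iter *, int);

struct iter {
	unsigned int type;	/* FOO or BAR */
	action_fn action;	/* used only by the ops-table variant */
};

static int foo_action(struct iter *i, int arg) { (void)i; return arg + 1; }
static int bar_action(struct iter *i, int arg) { (void)i; return arg + 2; }

#ifdef USE_OPS_TABLE
/* Function-table case: a plain indirect call. */
#define ITER_CALL(it, act, ...) ((it)->act((it), __VA_ARGS__))
#else
/* If-chain case: the compiler can inline foo_action()/bar_action(). */
#define ITER_CALL(it, act, ...)					\
	((it)->type & FOO ? foo_##act((it), __VA_ARGS__)	\
			  : bar_##act((it), __VA_ARGS__))
#endif
```

Either definition leaves the call sites identical, which is the symmetry the suggestion relies on.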

	David


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 01/29] iov_iter: Switch to using a table of operations
  2020-11-21 14:13 ` [PATCH 01/29] iov_iter: Switch to using a table of operations David Howells
                     ` (3 preceding siblings ...)
  2020-11-22 22:46   ` David Laight
@ 2020-11-23  8:05   ` Christoph Hellwig
  2020-11-23 10:31   ` David Howells
                     ` (2 subsequent siblings)
  7 siblings, 0 replies; 55+ messages in thread
From: Christoph Hellwig @ 2020-11-23  8:05 UTC (permalink / raw)
  To: David Howells
  Cc: Pavel Begunkov, Matthew Wilcox, Jens Axboe, Alexander Viro,
	Linus Torvalds, linux-fsdevel, linux-block, linux-kernel

On Sat, Nov 21, 2020 at 02:13:30PM +0000, David Howells wrote:
> Switch to using a table of operations.  In a future patch the individual
> methods will be split up by type.  For the moment, however, the ops tables
> just jump directly to the old functions - which are now static.  Inline
> wrappers are provided to jump through the hooks.
> 
> Signed-off-by: David Howells <dhowells@redhat.com>

Please run performance tests.  I think the indirect calls could totally
wreck things like high performance direct I/O, especially using io_uring
on x86.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 01/29] iov_iter: Switch to using a table of operations
  2020-11-21 14:13 ` [PATCH 01/29] iov_iter: Switch to using a table of operations David Howells
                     ` (4 preceding siblings ...)
  2020-11-23  8:05   ` Christoph Hellwig
@ 2020-11-23 10:31   ` David Howells
  2020-11-23 23:42     ` Pavel Begunkov
  2020-11-24 12:50     ` David Howells
  2020-11-23 11:14   ` David Howells
       [not found]   ` <20201203064536.GE27350@xsang-OptiPlex-9020>
  7 siblings, 2 replies; 55+ messages in thread
From: David Howells @ 2020-11-23 10:31 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: dhowells, Pavel Begunkov, Matthew Wilcox, Jens Axboe,
	Alexander Viro, Linus Torvalds, linux-fsdevel, linux-block,
	linux-kernel

Christoph Hellwig <hch@infradead.org> wrote:

> Please run performance tests.  I think the indirect calls could totally
> wreck things like high performance direct I/O, especially using io_uring
> on x86.

Here's an initial test using fio and null_blk.  I left null_blk in its default
configuration and used the following command line:

fio --ioengine=libaio --direct=1 --gtod_reduce=1 --name=readtest --filename=/dev/nullb0 --bs=4k --iodepth=128 --time_based --runtime=120 --readwrite=randread --iodepth_low=96 --iodepth_batch=16 --numjobs=4

I borrowed some of the parameters from an email I found online, so I'm not
sure if they're that useful.

I tried three different sets of patches: none, just the first (which adds the
jump table without getting rid of the conditional branches), and all of them.

I'm not sure which stats are of particular interest here, so I took the two
summary stats from the output of fio and also added together the "issued rwts:
total=a,b,c,d" from each test thread (only the first of which is non-zero).

The CPU is an Intel(R) Core(TM) i3-4170 CPU @ 3.70GHz, so 4 single-thread
cores, and 16G of RAM.  No virtualisation is involved.

Unpatched:

   READ: bw=4109MiB/s (4308MB/s), 1025MiB/s-1029MiB/s (1074MB/s-1079MB/s), io=482GiB (517GB), run=120001-120001msec
   READ: bw=4097MiB/s (4296MB/s), 1020MiB/s-1029MiB/s (1070MB/s-1079MB/s), io=480GiB (516GB), run=120001-120001msec
   READ: bw=4113MiB/s (4312MB/s), 1025MiB/s-1031MiB/s (1075MB/s-1082MB/s), io=482GiB (517GB), run=120001-120001msec
   READ: bw=4125MiB/s (4325MB/s), 1028MiB/s-1033MiB/s (1078MB/s-1084MB/s), io=483GiB (519GB), run=120001-120001msec

  nullb0: ios=126017326/0, merge=53/0, ticks=3538817/0, in_queue=3538817, util=100.00%
  nullb0: ios=125655193/0, merge=55/0, ticks=3548157/0, in_queue=3548157, util=100.00%
  nullb0: ios=126133014/0, merge=58/0, ticks=3545621/0, in_queue=3545621, util=100.00%
  nullb0: ios=126512562/0, merge=57/0, ticks=3531600/0, in_queue=3531600, util=100.00%

  sum issued rwts = 126224632
  sum issued rwts = 125861368
  sum issued rwts = 126340344
  sum issued rwts = 126718648

Just first patch:

   READ: bw=4106MiB/s (4306MB/s), 1023MiB/s-1030MiB/s (1073MB/s-1080MB/s), io=481GiB (517GB), run=120001-120001msec
   READ: bw=4126MiB/s (4327MB/s), 1029MiB/s-1034MiB/s (1079MB/s-1084MB/s), io=484GiB (519GB), run=120001-120001msec
   READ: bw=4109MiB/s (4308MB/s), 1025MiB/s-1029MiB/s (1075MB/s-1079MB/s), io=481GiB (517GB), run=120001-120001msec
   READ: bw=4097MiB/s (4296MB/s), 1023MiB/s-1025MiB/s (1073MB/s-1074MB/s), io=480GiB (516GB), run=120001-120001msec

  nullb0: ios=125939152/0, merge=62/0, ticks=3534917/0, in_queue=3534917, util=100.00%
  nullb0: ios=126554181/0, merge=61/0, ticks=3532067/0, in_queue=3532067, util=100.00%
  nullb0: ios=126012346/0, merge=54/0, ticks=3530504/0, in_queue=3530504, util=100.00%
  nullb0: ios=125653775/0, merge=54/0, ticks=3537438/0, in_queue=3537438, util=100.00%

  sum issued rwts = 126144952
  sum issued rwts = 126765368
  sum issued rwts = 126215928
  sum issued rwts = 125864120

All patches:
  nullb0: ios=10477062/0, merge=2/0, ticks=284992/0, in_queue=284992, util=95.87%
  nullb0: ios=10405246/0, merge=2/0, ticks=291886/0, in_queue=291886, util=99.82%
  nullb0: ios=10425583/0, merge=1/0, ticks=291699/0, in_queue=291699, util=99.22%
  nullb0: ios=10438845/0, merge=3/0, ticks=292445/0, in_queue=292445, util=99.31%

   READ: bw=4118MiB/s (4318MB/s), 1028MiB/s-1032MiB/s (1078MB/s-1082MB/s), io=483GiB (518GB), run=120001-120001msec
   READ: bw=4109MiB/s (4308MB/s), 1024MiB/s-1030MiB/s (1073MB/s-1080MB/s), io=481GiB (517GB), run=120001-120001msec
   READ: bw=4108MiB/s (4308MB/s), 1026MiB/s-1029MiB/s (1076MB/s-1079MB/s), io=481GiB (517GB), run=120001-120001msec
   READ: bw=4112MiB/s (4312MB/s), 1025MiB/s-1031MiB/s (1075MB/s-1081MB/s), io=482GiB (517GB), run=120001-120001msec

  nullb0: ios=126282410/0, merge=58/0, ticks=3557384/0, in_queue=3557384, util=100.00%
  nullb0: ios=126004837/0, merge=67/0, ticks=3565235/0, in_queue=3565235, util=100.00%
  nullb0: ios=125988876/0, merge=59/0, ticks=3563026/0, in_queue=3563026, util=100.00%
  nullb0: ios=126118279/0, merge=57/0, ticks=3566122/0, in_queue=3566122, util=100.00%

  sum issued rwts = 126494904
  sum issued rwts = 126214200
  sum issued rwts = 126198200
  sum issued rwts = 126328312


David


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 01/29] iov_iter: Switch to using a table of operations
  2020-11-21 14:13 ` [PATCH 01/29] iov_iter: Switch to using a table of operations David Howells
                     ` (5 preceding siblings ...)
  2020-11-23 10:31   ` David Howells
@ 2020-11-23 11:14   ` David Howells
       [not found]   ` <20201203064536.GE27350@xsang-OptiPlex-9020>
  7 siblings, 0 replies; 55+ messages in thread
From: David Howells @ 2020-11-23 11:14 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: dhowells, Pavel Begunkov, Matthew Wilcox, Jens Axboe,
	Alexander Viro, Linus Torvalds, linux-fsdevel, linux-block,
	linux-kernel

David Howells <dhowells@redhat.com> wrote:

> I tried three different sets of patches: none, just the first (which adds the
> jump table without getting rid of the conditional branches), and all of them.

And, I forgot to mention, I ran each test four times and then interleaved the
result lines for that set.

David


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 01/29] iov_iter: Switch to using a table of operations
  2020-11-21 14:31   ` Pavel Begunkov
@ 2020-11-23 23:21     ` Pavel Begunkov
  0 siblings, 0 replies; 55+ messages in thread
From: Pavel Begunkov @ 2020-11-23 23:21 UTC (permalink / raw)
  To: David Howells
  Cc: Matthew Wilcox, Jens Axboe, Alexander Viro, Linus Torvalds,
	linux-fsdevel, linux-block, linux-kernel

On 21/11/2020 14:31, Pavel Begunkov wrote:
> On 21/11/2020 14:13, David Howells wrote:
>> Switch to using a table of operations.  In a future patch the individual
>> methods will be split up by type.  For the moment, however, the ops tables
>> just jump directly to the old functions - which are now static.  Inline
>> wrappers are provided to jump through the hooks.
>>
>> Signed-off-by: David Howells <dhowells@redhat.com>
>> ---
>>
>>  fs/io_uring.c       |    2 
>>  include/linux/uio.h |  241 ++++++++++++++++++++++++++++++++++--------
>>  lib/iov_iter.c      |  293 +++++++++++++++++++++++++++++++++++++++------------
>>  3 files changed, 422 insertions(+), 114 deletions(-)
>>
>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>> index 4ead291b2976..baa78f58ae5c 100644
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -3192,7 +3192,7 @@ static void io_req_map_rw(struct io_kiocb *req, const struct iovec *iovec,
>>  	rw->free_iovec = iovec;
>>  	rw->bytes_done = 0;
>>  	/* can only be fixed buffers, no need to do anything */
>> -	if (iter->type == ITER_BVEC)
>> +	if (iov_iter_is_bvec(iter))
> 
> Could you split this io_uring change and send for 5.10?
> Or I can do it for you if you wish.

FYI, I stole this chunk with right attributes. It should go through
io_uring 5.10, so shouldn't be a problem if you just drop it.

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 01/29] iov_iter: Switch to using a table of operations
  2020-11-23 10:31   ` David Howells
@ 2020-11-23 23:42     ` Pavel Begunkov
  2020-11-24 12:50     ` David Howells
  1 sibling, 0 replies; 55+ messages in thread
From: Pavel Begunkov @ 2020-11-23 23:42 UTC (permalink / raw)
  To: David Howells, Christoph Hellwig
  Cc: Matthew Wilcox, Jens Axboe, Alexander Viro, Linus Torvalds,
	linux-fsdevel, linux-block, linux-kernel

On 23/11/2020 10:31, David Howells wrote:
> Christoph Hellwig <hch@infradead.org> wrote:
> 
>> Please run performance tests.  I think the indirect calls could totally
>> wreck things like high performance direct I/O, especially using io_uring
>> on x86.
> 
> Here's an initial test using fio and null_blk.  I left null_blk in its default
> configuration and used the following command line:

I'd prefer something along the lines of no_sched=1 submit_queues=$(nproc) to reduce overhead.

> 
> fio --ioengine=libaio --direct=1 --gtod_reduce=1 --name=readtest --filename=/dev/nullb0 --bs=4k --iodepth=128 --time_based --runtime=120 --readwrite=randread --iodepth_low=96 --iodepth_batch=16 --numjobs=4

fio is relatively heavy; I'd suggest trying fio/t/io_uring with nullblk.

> [snipped quoted benchmark results]

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 01/29] iov_iter: Switch to using a table of operations
  2020-11-23 10:31   ` David Howells
  2020-11-23 23:42     ` Pavel Begunkov
@ 2020-11-24 12:50     ` David Howells
  2020-11-24 15:30       ` Jens Axboe
  2020-11-27 17:14       ` David Howells
  1 sibling, 2 replies; 55+ messages in thread
From: David Howells @ 2020-11-24 12:50 UTC (permalink / raw)
  To: Pavel Begunkov
  Cc: dhowells, Christoph Hellwig, Matthew Wilcox, Jens Axboe,
	Alexander Viro, Linus Torvalds, linux-fsdevel, linux-block,
	linux-kernel

Pavel Begunkov <asml.silence@gmail.com> wrote:

> fio is relatively heavy; I'd suggest trying fio/t/io_uring with nullblk.

no patches:

IOPS=885152, IOS/call=25/25, inflight=64 (64)
IOPS=890400, IOS/call=25/25, inflight=32 (32)
IOPS=890656, IOS/call=25/25, inflight=64 (64)
IOPS=896096, IOS/call=25/25, inflight=96 (96)
IOPS=876256, IOS/call=25/25, inflight=128 (128)
IOPS=905056, IOS/call=25/25, inflight=128 (128)
IOPS=882912, IOS/call=25/25, inflight=96 (96)
IOPS=887392, IOS/call=25/25, inflight=64 (32)
IOPS=897152, IOS/call=25/25, inflight=128 (128)
IOPS=871392, IOS/call=25/25, inflight=32 (32)
IOPS=865088, IOS/call=25/25, inflight=96 (96)
IOPS=880032, IOS/call=25/25, inflight=32 (32)
IOPS=905376, IOS/call=25/25, inflight=96 (96)
IOPS=898016, IOS/call=25/25, inflight=128 (128)
IOPS=885792, IOS/call=25/25, inflight=64 (64)
IOPS=897632, IOS/call=25/25, inflight=96 (96)

first patch only:

IOPS=876640, IOS/call=25/25, inflight=64 (64)
IOPS=878208, IOS/call=25/25, inflight=64 (64)
IOPS=884000, IOS/call=25/25, inflight=64 (64)
IOPS=900864, IOS/call=25/25, inflight=64 (64)
IOPS=878496, IOS/call=25/25, inflight=64 (64)
IOPS=870944, IOS/call=25/25, inflight=32 (32)
IOPS=900672, IOS/call=25/25, inflight=32 (32)
IOPS=882368, IOS/call=25/25, inflight=128 (128)
IOPS=877120, IOS/call=25/25, inflight=128 (128)
IOPS=861856, IOS/call=25/25, inflight=64 (64)
IOPS=892896, IOS/call=25/25, inflight=96 (96)
IOPS=875808, IOS/call=25/25, inflight=128 (128)
IOPS=887808, IOS/call=25/25, inflight=32 (80)
IOPS=889984, IOS/call=25/25, inflight=128 (128)

all patches:

IOPS=872192, IOS/call=25/25, inflight=96 (96)
IOPS=887360, IOS/call=25/25, inflight=32 (32)
IOPS=894432, IOS/call=25/25, inflight=128 (128)
IOPS=884640, IOS/call=25/25, inflight=32 (32)
IOPS=886784, IOS/call=25/25, inflight=32 (32)
IOPS=884160, IOS/call=25/25, inflight=96 (96)
IOPS=886944, IOS/call=25/25, inflight=96 (96)
IOPS=903360, IOS/call=25/25, inflight=128 (128)
IOPS=887744, IOS/call=25/25, inflight=64 (64)
IOPS=891072, IOS/call=25/25, inflight=32 (32)
IOPS=900512, IOS/call=25/25, inflight=128 (128)
IOPS=888544, IOS/call=25/25, inflight=128 (128)
IOPS=877312, IOS/call=25/25, inflight=128 (128)
IOPS=895008, IOS/call=25/25, inflight=128 (128)
IOPS=889376, IOS/call=25/25, inflight=128 (128)

David


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 01/29] iov_iter: Switch to using a table of operations
  2020-11-24 12:50     ` David Howells
@ 2020-11-24 15:30       ` Jens Axboe
  2020-11-27 17:14       ` David Howells
  1 sibling, 0 replies; 55+ messages in thread
From: Jens Axboe @ 2020-11-24 15:30 UTC (permalink / raw)
  To: David Howells, Pavel Begunkov
  Cc: Christoph Hellwig, Matthew Wilcox, Alexander Viro,
	Linus Torvalds, linux-fsdevel, linux-block, linux-kernel

On 11/24/20 5:50 AM, David Howells wrote:
> Pavel Begunkov <asml.silence@gmail.com> wrote:
> 
>> fio is relatively heavy; I'd suggest trying fio/t/io_uring with nullblk.
> 
> no patches:

Here's what I get. nullb0 using blk-mq, and submit_queues==NPROC.
iostats and merging disabled, using 8k bs for t/io_uring to ensure we
have > 1 segment. Everything pinned to the same CPU to ensure
reproducibility and stability. Kernel has CONFIG_RETPOLINE enabled.

5.10-rc5:
IOPS=2453184, IOS/call=32/31, inflight=128 (128)
IOPS=2435648, IOS/call=32/32, inflight=64 (64)
IOPS=2448544, IOS/call=32/31, inflight=96 (96)
IOPS=2439584, IOS/call=32/31, inflight=128 (128)
IOPS=2454176, IOS/call=32/32, inflight=32 (32)

5.10-rc5+all patches
IOPS=2304224, IOS/call=32/32, inflight=64 (64)
IOPS=2309216, IOS/call=32/32, inflight=32 (32)
IOPS=2305376, IOS/call=32/31, inflight=128 (128)
IOPS=2300544, IOS/call=32/32, inflight=128 (128)
IOPS=2301728, IOS/call=32/32, inflight=32 (32)

which looks to be around a 6% drop.

Using actual hardware instead of just null_blk:

5.10-rc5:
IOPS=854163, IOS/call=31/31, inflight=101 (101)
IOPS=855495, IOS/call=31/31, inflight=117 (117)
IOPS=856118, IOS/call=31/31, inflight=100 (100)
IOPS=855863, IOS/call=31/31, inflight=113 (113)
IOPS=856282, IOS/call=31/31, inflight=116 (116)

5.10-rc5+all patches
IOPS=833391, IOS/call=31/31, inflight=100 (100)
IOPS=838342, IOS/call=31/31, inflight=100 (100)
IOPS=839921, IOS/call=31/31, inflight=105 (105)
IOPS=841607, IOS/call=31/31, inflight=123 (123)
IOPS=843625, IOS/call=31/31, inflight=107 (107)

which looks to be around 2-3%, but we're also running at a much
slower rate (830K vs ~2.3M).

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 01/29] iov_iter: Switch to using a table of operations
  2020-11-24 12:50     ` David Howells
  2020-11-24 15:30       ` Jens Axboe
@ 2020-11-27 17:14       ` David Howells
  1 sibling, 0 replies; 55+ messages in thread
From: David Howells @ 2020-11-27 17:14 UTC (permalink / raw)
  To: Jens Axboe
  Cc: dhowells, Pavel Begunkov, Christoph Hellwig, Matthew Wilcox,
	Alexander Viro, Linus Torvalds, linux-fsdevel, linux-block,
	linux-kernel

Jens Axboe <axboe@kernel.dk> wrote:

> which looks to be around a 6% drop.

That's quite a lot.

> which looks to be around 2-3%, but we're also running at a much
> slower rate (830K vs ~2.3M).

That's still a lot.

Thanks for having a look!

David


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [iov_iter] 9bd0e337c6: will-it-scale.per_process_ops -4.8% regression
       [not found]   ` <20201203064536.GE27350@xsang-OptiPlex-9020>
@ 2020-12-03 17:47     ` Linus Torvalds
  2020-12-03 17:50       ` Jens Axboe
  2020-12-04 11:50     ` David Howells
  2020-12-04 11:51     ` David Howells
  2 siblings, 1 reply; 55+ messages in thread
From: Linus Torvalds @ 2020-12-03 17:47 UTC (permalink / raw)
  To: kernel test robot
  Cc: David Howells, lkp, kernel test robot, Huang, Ying, Feng Tang,
	zhengjun.xing, Pavel Begunkov, Matthew Wilcox, Jens Axboe,
	Alexander Viro, linux-fsdevel, linux-block,
	Linux Kernel Mailing List

On Wed, Dec 2, 2020 at 10:31 PM kernel test robot <oliver.sang@intel.com> wrote:
>
> FYI, we noticed a -4.8% regression of will-it-scale.per_process_ops due to commit:

Ok, I guess that's bigger than expected, but the profile data does
show how bad the indirect branches are.

There's both a "direct" cost of them:

>       0.55 ą 14%      +0.3        0.87 ą 15%  perf-profile.children.cycles-pp.__x86_retpoline_rax
>       0.12 ą 14%      +0.1        0.19 ą 14%  perf-profile.self.cycles-pp.__x86_indirect_thunk_rax
>       0.43 ą 14%      +0.3        0.68 ą 15%  perf-profile.self.cycles-pp.__x86_retpoline_rax

The actual retpoline profile costs themselves do not add up to 4%, but
I think that's because the indirect costs are higher: the branch
mis-predicts basically make everything run slower for a while as the
OoO engine needs to restart.

So the global cost then shows up in CPU and branch miss stats, where
the IPC goes down (which is the same thing as saying that CPI goes
up):

>  1.741e+08           +42.3%  2.476e+08        perf-stat.i.branch-misses
>       0.74            -3.9%       0.71        perf-stat.overall.ipc
>       1.35            +4.1%       1.41        perf-stat.overall.cpi

which is why it ends up being so costly even if the retpoline overhead
itself is "only" just under 1%.

           Linus

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [iov_iter] 9bd0e337c6: will-it-scale.per_process_ops -4.8% regression
  2020-12-03 17:47     ` [iov_iter] 9bd0e337c6: will-it-scale.per_process_ops -4.8% regression Linus Torvalds
@ 2020-12-03 17:50       ` Jens Axboe
  0 siblings, 0 replies; 55+ messages in thread
From: Jens Axboe @ 2020-12-03 17:50 UTC (permalink / raw)
  To: Linus Torvalds, kernel test robot
  Cc: David Howells, lkp, kernel test robot, Huang, Ying, Feng Tang,
	zhengjun.xing, Pavel Begunkov, Matthew Wilcox, Alexander Viro,
	linux-fsdevel, linux-block, Linux Kernel Mailing List

On 12/3/20 10:47 AM, Linus Torvalds wrote:
> On Wed, Dec 2, 2020 at 10:31 PM kernel test robot <oliver.sang@intel.com> wrote:
>>
>> FYI, we noticed a -4.8% regression of will-it-scale.per_process_ops due to commit:
> 
> Ok, I guess that's bigger than expected, but the profile data does
> show how bad the indirect branches are.

It's also in the same range (3-6%) as the microbenchmarks I ran and posted.
So at least there's correlation there too.

-- 
Jens Axboe



* Re: [iov_iter] 9bd0e337c6: will-it-scale.per_process_ops -4.8% regression
       [not found]   ` <20201203064536.GE27350@xsang-OptiPlex-9020>
  2020-12-03 17:47     ` [iov_iter] 9bd0e337c6: will-it-scale.per_process_ops -4.8% regression Linus Torvalds
@ 2020-12-04 11:50     ` David Howells
  2020-12-04 11:51     ` David Howells
  2 siblings, 0 replies; 55+ messages in thread
From: David Howells @ 2020-12-04 11:50 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: dhowells, kernel test robot, lkp, kernel test robot, Huang, Ying,
	Feng Tang, zhengjun.xing, Pavel Begunkov, Matthew Wilcox,
	Jens Axboe, Alexander Viro, linux-fsdevel, linux-block,
	Linux Kernel Mailing List

Linus Torvalds <torvalds@linux-foundation.org> wrote:

> > FYI, we noticed a -4.8% regression of will-it-scale.per_process_ops due to commit:
> 
> Ok, I guess that's bigger than expected, 

Note that it appears to be testing just the first patch and not the whole
series:

| commit: 9bd0e337c633aed3e8ec3c7397b7ae0b8436f163 ("[PATCH 01/29] iov_iter: Switch to using a table of operations")

that just adds an indirection table without taking away any of the conditional
branching.  It seems quite likely, though, that even if you add all the other
patches, you won't get back enough to make it worth it.
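
As a rough sketch of why a small per-call cost is so visible in this pwrite workload (illustrative code, not the kernel's): a generic_perform_write-style loop makes several iterator calls per page copied, so any dispatch overhead is multiplied by the number of pages.

```c
#include <stddef.h>

/* Hypothetical ops table; the counter just tallies how many
 * indirect calls the loop makes. */
struct iter_ops {
	void (*fault_in)(size_t *calls);
	void (*copy_atomic)(size_t *calls);
	void (*advance)(size_t *calls);
};

static void count_call(size_t *calls) { (*calls)++; }

static const struct iter_ops iovec_ops = {
	.fault_in    = count_call,
	.copy_atomic = count_call,
	.advance     = count_call,
};

/* Returns how many indirect calls a write of 'len' bytes makes. */
static size_t write_loop_dispatches(const struct iter_ops *ops, size_t len)
{
	const size_t page_size = 4096;
	size_t calls = 0, done;

	for (done = 0; done < len; done += page_size) {
		ops->fault_in(&calls);    /* one indirect call */
		ops->copy_atomic(&calls); /* another */
		ops->advance(&calls);     /* and another */
	}
	return calls;
}
```

Three indirect calls per page, each behind a retpoline, is the sort of tight-loop overhead the will-it-scale numbers above would reflect.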

David



* Re: [iov_iter] 9bd0e337c6: will-it-scale.per_process_ops -4.8% regression
       [not found]   ` <20201203064536.GE27350@xsang-OptiPlex-9020>
  2020-12-03 17:47     ` [iov_iter] 9bd0e337c6: will-it-scale.per_process_ops -4.8% regression Linus Torvalds
  2020-12-04 11:50     ` David Howells
@ 2020-12-04 11:51     ` David Howells
  2020-12-07 13:10       ` Oliver Sang
  2020-12-07 13:20       ` David Howells
  2 siblings, 2 replies; 55+ messages in thread
From: David Howells @ 2020-12-04 11:51 UTC (permalink / raw)
  To: kernel test robot
  Cc: dhowells, lkp, lkp, ying.huang, feng.tang, zhengjun.xing,
	Pavel Begunkov, Matthew Wilcox, Jens Axboe, Alexander Viro,
	Linus Torvalds, linux-fsdevel, linux-block, linux-kernel

kernel test robot <oliver.sang@intel.com> wrote:

> FYI, we noticed a -4.8% regression of will-it-scale.per_process_ops due to commit:
> 
> 
> commit: 9bd0e337c633aed3e8ec3c7397b7ae0b8436f163 ("[PATCH 01/29] iov_iter: Switch to using a table of operations")

Out of interest, would it be possible for you to run this on the tail of the
series on the same hardware?

Thanks,
David



* Re: [iov_iter] 9bd0e337c6: will-it-scale.per_process_ops -4.8% regression
  2020-12-04 11:51     ` David Howells
@ 2020-12-07 13:10       ` Oliver Sang
  2020-12-07 13:20       ` David Howells
  1 sibling, 0 replies; 55+ messages in thread
From: Oliver Sang @ 2020-12-07 13:10 UTC (permalink / raw)
  To: David Howells
  Cc: lkp, lkp, ying.huang, feng.tang, zhengjun.xing, Pavel Begunkov,
	Matthew Wilcox, Jens Axboe, Alexander Viro, Linus Torvalds,
	linux-fsdevel, linux-block, linux-kernel

Hi David,

On Fri, Dec 04, 2020 at 11:51:48AM +0000, David Howells wrote:
> kernel test robot <oliver.sang@intel.com> wrote:
> 
> > FYI, we noticed a -4.8% regression of will-it-scale.per_process_ops due to commit:
> > 
> > 
> > commit: 9bd0e337c633aed3e8ec3c7397b7ae0b8436f163 ("[PATCH 01/29] iov_iter: Switch to using a table of operations")
> 
> Out of interest, would it be possible for you to run this on the tail of the
> series on the same hardware?

Sorry for the late reply. Below are the results with the tail of the series added:
* ded69a6991fe0 (linux-review/David-Howells/RFC-iov_iter-Switch-to-using-an-ops-table/20201121-222344) iov_iter: Remove iterate_all_kinds() and iterate_and_advance()

=========================================================================================
compiler/cpufreq_governor/kconfig/mode/nr_task/rootfs/tbox_group/test/testcase/ucode:
  gcc-9/performance/x86_64-rhel-8.3/process/50%/debian-10.4-x86_64-20200603.cgz/lkp-ivb-2ep1/pwrite1/will-it-scale/0x42e

commit: 
  27bba9c532a8d21050b94224ffd310ad0058c353
  9bd0e337c633aed3e8ec3c7397b7ae0b8436f163
  ded69a6991fe0094f36d96bf1ace2a9636428676

27bba9c532a8d210 9bd0e337c633aed3e8ec3c7397b ded69a6991fe0094f36d96bf1ac 
---------------- --------------------------- --------------------------- 
         %stddev     %change         %stddev     %change         %stddev
             \          |                \          |                \  
  28443113            -4.8%   27064036            -4.8%   27084904        will-it-scale.24.processes
   1185129            -4.8%    1127667            -4.8%    1128537        will-it-scale.per_process_ops
  28443113            -4.8%   27064036            -4.8%   27084904        will-it-scale.workload
     13.84            +1.0%      13.98            +0.3%      13.89        boot-time.dhcp
      1251 ±  9%     -17.2%       1035 ± 10%      -9.1%       1137 ±  5%  slabinfo.dmaengine-unmap-16.active_objs
      1251 ±  9%     -17.2%       1035 ± 10%      -9.1%       1137 ±  5%  slabinfo.dmaengine-unmap-16.num_objs
      1052 ±  6%      -1.1%       1041 ±  5%     -13.4%     911.75 ± 10%  slabinfo.task_group.active_objs
      1052 ±  6%      -1.1%       1041 ±  5%     -13.4%     911.75 ± 10%  slabinfo.task_group.num_objs
     31902 ±  5%      -5.6%      30124 ±  7%      -8.3%      29265 ±  4%  slabinfo.vm_area_struct.active_objs
     32163 ±  5%      -5.4%      30441 ±  6%      -8.0%      29602 ±  4%  slabinfo.vm_area_struct.num_objs
     73.46 ± 48%     -59.7%      29.59 ±100%    -100.0%       0.00        sched_debug.cfs_rq:/.MIN_vruntime.avg
      2386 ± 23%     -40.5%       1420 ±100%    -100.0%       0.00        sched_debug.cfs_rq:/.MIN_vruntime.max
    393.92 ± 33%     -48.5%     202.85 ±100%    -100.0%       0.00        sched_debug.cfs_rq:/.MIN_vruntime.stddev
     73.46 ± 48%     -59.7%      29.60 ±100%    -100.0%       0.00        sched_debug.cfs_rq:/.max_vruntime.avg
      2386 ± 23%     -40.5%       1420 ±100%    -100.0%       0.00        sched_debug.cfs_rq:/.max_vruntime.max
    393.92 ± 33%     -48.5%     202.94 ±100%    -100.0%       0.00        sched_debug.cfs_rq:/.max_vruntime.stddev
      0.00 ±  9%     -13.5%       0.00 ±  3%      -2.9%       0.00 ± 13%  sched_debug.cpu.next_balance.stddev
    -18.50           +33.5%     -24.70           -41.9%     -10.75        sched_debug.cpu.nr_uninterruptible.min
    411.75 ± 58%     +76.8%     728.00 ± 32%     +59.2%     655.50 ± 50%  numa-vmstat.node0.nr_active_anon
     34304 ±  2%     -35.6%      22103 ± 48%      +8.6%      37243 ± 26%  numa-vmstat.node0.nr_anon_pages
     36087 ±  2%     -31.0%      24915 ± 43%      +7.0%      38606 ± 27%  numa-vmstat.node0.nr_inactive_anon
      2233 ± 51%     +60.4%       3582 ±  7%      -7.7%       2062 ± 51%  numa-vmstat.node0.nr_shmem
    411.75 ± 58%     +76.8%     728.00 ± 32%     +59.2%     655.50 ± 50%  numa-vmstat.node0.nr_zone_active_anon
     36087 ±  2%     -31.0%      24915 ± 43%      +7.0%      38606 ± 27%  numa-vmstat.node0.nr_zone_inactive_anon
     24265 ±  3%     +51.3%      36707 ± 29%     -12.2%      21315 ± 47%  numa-vmstat.node1.nr_anon_pages
     25441 ±  2%     +44.9%      36858 ± 29%      -9.9%      22912 ± 47%  numa-vmstat.node1.nr_inactive_anon
    537.25 ± 20%     +22.8%     659.50 ± 10%     +14.5%     615.00 ± 21%  numa-vmstat.node1.nr_page_table_pages
     25441 ±  2%     +44.9%      36858 ± 29%      -9.9%      22912 ± 47%  numa-vmstat.node1.nr_zone_inactive_anon
      1649 ± 58%     +76.7%       2913 ± 32%     +59.0%       2621 ± 50%  numa-meminfo.node0.Active
      1649 ± 58%     +76.7%       2913 ± 32%     +59.0%       2621 ± 50%  numa-meminfo.node0.Active(anon)
    137223 ±  2%     -35.6%      88410 ± 48%      +8.6%     148973 ± 26%  numa-meminfo.node0.AnonPages
    164997 ±  9%     -28.4%     118095 ± 42%      +6.9%     176340 ± 23%  numa-meminfo.node0.AnonPages.max
    144353 ±  2%     -31.0%      99656 ± 43%      +7.0%     154424 ± 27%  numa-meminfo.node0.Inactive
    144353 ±  2%     -31.0%      99656 ± 43%      +7.0%     154424 ± 27%  numa-meminfo.node0.Inactive(anon)
      8937 ± 51%     +60.3%      14328 ±  7%      -7.7%       8251 ± 51%  numa-meminfo.node0.Shmem
     97072 ±  3%     +51.3%     146858 ± 29%     -12.2%      85274 ± 47%  numa-meminfo.node1.AnonPages
    127410 ±  5%     +43.2%     182468 ± 16%      -1.9%     124986 ± 42%  numa-meminfo.node1.AnonPages.max
    101822 ±  2%     +44.9%     147521 ± 29%      -9.9%      91738 ± 47%  numa-meminfo.node1.Inactive
    101822 ±  2%     +44.9%     147521 ± 29%      -9.9%      91738 ± 47%  numa-meminfo.node1.Inactive(anon)
      2148 ± 20%     +22.9%       2639 ± 10%     +14.5%       2460 ± 21%  numa-meminfo.node1.PageTables
     24623 ±  5%     -18.0%      20184 ± 15%      -6.9%      22929 ± 15%  softirqs.CPU0.RCU
     15977 ±  9%     +34.4%      21477 ± 22%     +54.7%      24711 ± 15%  softirqs.CPU13.RCU
     30680 ± 40%     -56.2%      13431 ± 60%     -70.8%       8966 ± 44%  softirqs.CPU13.SCHED
     28877 ± 10%     -30.6%      20051 ± 15%     -24.2%      21887 ± 13%  softirqs.CPU19.RCU
      5693 ± 31%    +402.3%      28595 ± 22%    +154.6%      14496 ± 46%  softirqs.CPU19.SCHED
      5753 ± 14%    +141.4%      13886 ± 87%    +172.2%      15657 ± 51%  softirqs.CPU2.SCHED
      7252 ± 79%    +239.9%      24653 ± 48%    +189.1%      20968 ± 44%  softirqs.CPU23.SCHED
     42479           -24.7%      31999 ± 39%     -25.9%      31488 ± 27%  softirqs.CPU26.SCHED
     21142 ± 15%     -26.5%      15533 ± 11%      +5.6%      22317 ± 17%  softirqs.CPU27.RCU
     20776 ± 38%     -50.5%      10290 ± 58%      +4.7%      21748 ± 35%  softirqs.CPU3.SCHED
     26618 ± 11%     -35.3%      17214 ±  6%     -33.5%      17689 ±  5%  softirqs.CPU37.RCU
     10894 ± 48%    +175.5%      30012 ± 34%    +237.2%      36734 ± 10%  softirqs.CPU37.SCHED
     17015 ±  4%     +39.2%      23681 ±  7%      +9.9%      18707 ± 21%  softirqs.CPU43.RCU
     29682 ± 10%     -17.6%      24446 ± 23%     -18.9%      24062 ±  9%  softirqs.CPU6.RCU
     21953 ± 20%      +9.7%      24079 ± 24%     -18.3%      17943 ± 23%  softirqs.CPU7.RCU
      3431 ± 89%     -85.1%     512.25 ±109%     -93.6%     220.75 ± 32%  interrupts.38:PCI-MSI.2621444-edge.eth0-TxRx-3
    348.50 ± 62%    +152.7%     880.75 ± 27%     -30.1%     243.50 ± 44%  interrupts.40:PCI-MSI.2621446-edge.eth0-TxRx-5
     50948            -0.6%      50655            +7.1%      54590 ±  6%  interrupts.CAL:Function_call_interrupts
      2579 ± 26%     +32.3%       3412 ± 43%     +58.3%       4082 ± 27%  interrupts.CPU0.NMI:Non-maskable_interrupts
      2579 ± 26%     +32.3%       3412 ± 43%     +58.3%       4082 ± 27%  interrupts.CPU0.PMI:Performance_monitoring_interrupts
    296.75            -3.4%     286.75 ±  7%     -38.2%     183.50 ± 40%  interrupts.CPU1.RES:Rescheduling_interrupts
    737.25            +8.7%     801.75 ± 13%     +92.5%       1419 ± 73%  interrupts.CPU11.CAL:Function_call_interrupts
      1697 ± 63%     -53.1%     796.75 ± 13%     -55.7%     751.50        interrupts.CPU13.CAL:Function_call_interrupts
     89.75 ± 36%    +220.3%     287.50 ± 20%    +195.3%     265.00 ± 10%  interrupts.CPU13.RES:Rescheduling_interrupts
    745.75 ±  3%    +104.6%       1526 ± 69%     +52.7%       1138 ± 61%  interrupts.CPU19.CAL:Function_call_interrupts
    293.00 ±  5%     -60.0%     117.25 ± 47%     -24.1%     222.25 ± 22%  interrupts.CPU19.RES:Rescheduling_interrupts
    778.50 ±  9%    +123.7%       1741 ± 64%      +3.3%     804.50 ± 10%  interrupts.CPU22.CAL:Function_call_interrupts
    670.00 ± 22%     +40.2%     939.50 ± 49%     +84.6%       1236 ± 63%  interrupts.CPU23.CAL:Function_call_interrupts
    283.50 ±  7%     -47.7%     148.25 ± 64%     -38.9%     173.25 ± 38%  interrupts.CPU23.RES:Rescheduling_interrupts
      6450 ± 29%     -38.0%       4000 ±  4%      +8.2%       6977 ± 29%  interrupts.CPU24.NMI:Non-maskable_interrupts
      6450 ± 29%     -38.0%       4000 ±  4%      +8.2%       6977 ± 29%  interrupts.CPU24.PMI:Performance_monitoring_interrupts
      2505 ± 24%    +100.2%       5015 ± 45%    +166.6%       6679 ± 26%  interrupts.CPU25.NMI:Non-maskable_interrupts
      2505 ± 24%    +100.2%       5015 ± 45%    +166.6%       6679 ± 26%  interrupts.CPU25.PMI:Performance_monitoring_interrupts
      2012 ± 56%     -57.6%     852.75 ±  6%     -48.0%       1047 ± 35%  interrupts.CPU26.CAL:Function_call_interrupts
     71.50 ± 12%     +73.4%     124.00 ± 72%    +106.3%     147.50 ± 49%  interrupts.CPU26.RES:Rescheduling_interrupts
      4198 ± 54%      +5.7%       4438 ± 51%     +41.8%       5952 ± 40%  interrupts.CPU27.NMI:Non-maskable_interrupts
      4198 ± 54%      +5.7%       4438 ± 51%     +41.8%       5952 ± 40%  interrupts.CPU27.PMI:Performance_monitoring_interrupts
    184.25 ± 37%     -47.9%      96.00 ± 49%      -6.5%     172.25 ± 27%  interrupts.CPU27.RES:Rescheduling_interrupts
      0.50 ±100%  +64250.0%     321.75 ±170%    +500.0%       3.00 ±115%  interrupts.CPU28.TLB:TLB_shootdowns
      3431 ± 89%     -85.1%     512.25 ±109%     -93.6%     220.75 ± 32%  interrupts.CPU29.38:PCI-MSI.2621444-edge.eth0-TxRx-3
      5982 ± 40%     -21.5%       4695 ± 46%     -35.1%       3881 ± 64%  interrupts.CPU3.NMI:Non-maskable_interrupts
      5982 ± 40%     -21.5%       4695 ± 46%     -35.1%       3881 ± 64%  interrupts.CPU3.PMI:Performance_monitoring_interrupts
    348.50 ± 62%    +152.7%     880.75 ± 27%     -30.1%     243.50 ± 44%  interrupts.CPU31.40:PCI-MSI.2621446-edge.eth0-TxRx-5
    156.50 ± 51%     -51.3%      76.25 ± 59%      +9.1%     170.75 ± 48%  interrupts.CPU33.RES:Rescheduling_interrupts
    883.50 ± 18%     -23.8%     673.25 ± 22%      -2.2%     863.75 ± 12%  interrupts.CPU36.CAL:Function_call_interrupts
      7492 ± 13%     -45.6%       4073 ± 63%     -40.2%       4483 ± 27%  interrupts.CPU37.NMI:Non-maskable_interrupts
      7492 ± 13%     -45.6%       4073 ± 63%     -40.2%       4483 ± 27%  interrupts.CPU37.PMI:Performance_monitoring_interrupts
    250.50 ± 19%     -52.5%     119.00 ± 50%     -76.0%      60.00 ± 49%  interrupts.CPU37.RES:Rescheduling_interrupts
    772.50 ±  2%      +2.0%     787.75 ± 10%    +346.2%       3447 ±127%  interrupts.CPU40.CAL:Function_call_interrupts
      4688 ± 27%     +63.5%       7667 ± 15%     +14.0%       5345 ± 38%  interrupts.CPU40.NMI:Non-maskable_interrupts
      4688 ± 27%     +63.5%       7667 ± 15%     +14.0%       5345 ± 38%  interrupts.CPU40.PMI:Performance_monitoring_interrupts
     96.75 ± 92%    +135.1%     227.50 ± 22%     +29.5%     125.25 ± 46%  interrupts.CPU43.RES:Rescheduling_interrupts
      2932 ± 36%     +73.4%       5084 ± 21%     +24.7%       3656 ± 55%  interrupts.CPU47.NMI:Non-maskable_interrupts
      2932 ± 36%     +73.4%       5084 ± 21%     +24.7%       3656 ± 55%  interrupts.CPU47.PMI:Performance_monitoring_interrupts
     57.50 ± 78%    +250.4%     201.50 ± 42%    +251.7%     202.25 ± 17%  interrupts.CPU47.RES:Rescheduling_interrupts
      4207 ± 61%     +86.0%       7827 ± 11%     +48.7%       6258 ± 33%  interrupts.CPU8.NMI:Non-maskable_interrupts
      4207 ± 61%     +86.0%       7827 ± 11%     +48.7%       6258 ± 33%  interrupts.CPU8.PMI:Performance_monitoring_interrupts
      0.18 ± 60%     -36.2%       0.11 ±  9%     -39.0%       0.11 ±  4%  perf-stat.i.MPKI
 1.089e+10            -2.3%  1.064e+10            -4.8%  1.036e+10        perf-stat.i.branch-instructions
      1.62            +0.7        2.34            +0.8        2.40        perf-stat.i.branch-miss-rate%
 1.741e+08           +42.3%  2.476e+08           +42.2%  2.475e+08        perf-stat.i.branch-misses
      2.70            -0.1        2.65 ±  6%      +0.2        2.95 ±  3%  perf-stat.i.cache-miss-rate%
   5228328            +4.0%    5436325 ±  8%      -4.5%    4992245 ±  2%  perf-stat.i.cache-references
      1.36            +3.3%       1.41            +5.5%       1.44        perf-stat.i.cpi
     52.10            +0.9%      52.55            +1.8%      53.04        perf-stat.i.cpu-migrations
 1.233e+08 ±  3%      -7.1%  1.146e+08            +1.6%  1.253e+08 ± 11%  perf-stat.i.dTLB-load-misses
  2.38e+10            -3.3%  2.302e+10            -4.5%  2.273e+10        perf-stat.i.dTLB-loads
  57501510            -4.9%   54711717            -4.6%   54852849        perf-stat.i.dTLB-store-misses
 1.828e+10            -3.7%  1.761e+10            -4.3%   1.75e+10        perf-stat.i.dTLB-stores
     98.97            -2.9       96.02 ±  2%     -29.3       69.69        perf-stat.i.iTLB-load-miss-rate%
  29795797 ±  4%      -5.0%   28320171            -5.2%   28254639        perf-stat.i.iTLB-load-misses
    299268 ±  2%    +298.1%    1191476 ± 50%   +4062.6%   12457396 ±  4%  perf-stat.i.iTLB-loads
 5.335e+10            -3.7%  5.138e+10            -5.7%  5.029e+10        perf-stat.i.instructions
      0.74            -3.7%       0.71            -5.7%       0.70        perf-stat.i.ipc
      0.20 ±  8%     +12.1%       0.23            +2.7%       0.21 ±  9%  perf-stat.i.major-faults
      1104            -3.2%       1069            -4.5%       1055        perf-stat.i.metric.M/sec
     66981            +4.3%      69845 ±  6%     +10.1%      73725 ±  4%  perf-stat.i.node-load-misses
     84278 ±  2%      +7.2%      90313 ±  6%      +9.8%      92543 ±  5%  perf-stat.i.node-loads
     72308            +2.3%      73975 ±  2%      +1.5%      73361        perf-stat.i.node-stores
      0.10            +7.9%       0.11 ±  8%      +1.3%       0.10 ±  3%  perf-stat.overall.MPKI
      1.60            +0.7        2.33            +0.8        2.39        perf-stat.overall.branch-miss-rate%
      3.60 ±  6%      -0.1        3.45 ±  7%      +0.3        3.88 ±  2%  perf-stat.overall.cache-miss-rate%
      1.35            +4.1%       1.41            +6.2%       1.44        perf-stat.overall.cpi
     99.00            -3.0       95.98 ±  2%     -29.6       69.42        perf-stat.overall.iTLB-load-miss-rate%
      0.74            -3.9%       0.71            -5.9%       0.70        perf-stat.overall.ipc
    567203            +1.0%     572789            -1.2%     560464        perf-stat.overall.path-length
 1.085e+10            -2.3%   1.06e+10            -4.8%  1.033e+10        perf-stat.ps.branch-instructions
 1.735e+08           +42.3%  2.468e+08           +42.2%  2.467e+08        perf-stat.ps.branch-misses
   5216268            +4.0%    5422673 ±  8%      -4.5%    4979211 ±  2%  perf-stat.ps.cache-references
     51.99            +0.8%      52.43            +1.8%      52.92        perf-stat.ps.cpu-migrations
 1.229e+08 ±  3%      -7.1%  1.142e+08            +1.6%  1.249e+08 ± 12%  perf-stat.ps.dTLB-load-misses
 2.372e+10            -3.3%  2.294e+10            -4.5%  2.266e+10        perf-stat.ps.dTLB-loads
  57306258            -4.9%   54525679            -4.6%   54668669        perf-stat.ps.dTLB-store-misses
 1.822e+10            -3.7%  1.755e+10            -4.3%  1.744e+10        perf-stat.ps.dTLB-stores
  29695158 ±  4%      -5.0%   28224049            -5.2%   28159995        perf-stat.ps.iTLB-load-misses
    298257 ±  2%    +298.1%    1187498 ± 50%   +4061.6%   12412241 ±  4%  perf-stat.ps.iTLB-loads
 5.317e+10            -3.7%   5.12e+10            -5.7%  5.012e+10        perf-stat.ps.instructions
      0.20 ±  7%     +12.0%       0.23 ±  2%      +3.0%       0.21 ±  8%  perf-stat.ps.major-faults
     66882            +4.3%      69726 ±  6%     +10.1%      73651 ±  4%  perf-stat.ps.node-load-misses
     84325 ±  2%      +7.1%      90306 ±  6%      +9.7%      92489 ±  5%  perf-stat.ps.node-loads
 1.613e+13            -3.9%   1.55e+13            -5.9%  1.518e+13        perf-stat.total.instructions
      8.00 ± 14%      -8.0        0.00            -8.0        0.00        perf-profile.calltrace.cycles-pp.iov_iter_copy_from_user_atomic.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.new_sync_write
      7.38 ± 14%      -7.4        0.00            -7.4        0.00        perf-profile.calltrace.cycles-pp.copyin.iov_iter_copy_from_user_atomic.generic_perform_write.__generic_file_write_iter.generic_file_write_iter
      7.27 ± 14%      -7.3        0.00            -7.3        0.00        perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.copyin.iov_iter_copy_from_user_atomic.generic_perform_write.__generic_file_write_iter
      6.71 ± 12%      -0.7        5.98 ± 13%      -0.7        6.03 ± 10%  perf-profile.calltrace.cycles-pp.syscall_return_via_sysret.__libc_pwrite
      4.93 ± 12%      -0.6        4.29 ± 14%      -0.5        4.40 ± 11%  perf-profile.calltrace.cycles-pp.shmem_getpage_gfp.shmem_write_begin.generic_perform_write.__generic_file_write_iter.generic_file_write_iter
      5.81 ± 13%      -0.6        5.22 ± 14%      -0.6        5.17 ± 11%  perf-profile.calltrace.cycles-pp.shmem_write_begin.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.new_sync_write
      3.50 ± 14%      -0.5        3.03 ± 13%      -0.4        3.13 ± 11%  perf-profile.calltrace.cycles-pp.shmem_write_end.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.new_sync_write
      0.69 ± 14%      -0.4        0.29 ±100%      -0.5        0.14 ±173%  perf-profile.calltrace.cycles-pp.up_write.generic_file_write_iter.new_sync_write.vfs_write.ksys_pwrite64
      3.44 ± 12%      -0.4        3.06 ± 14%      -0.4        3.05 ± 12%  perf-profile.calltrace.cycles-pp.find_lock_entry.shmem_getpage_gfp.shmem_write_begin.generic_perform_write.__generic_file_write_iter
      0.62 ± 15%      -0.3        0.30 ±101%      -0.2        0.43 ± 59%  perf-profile.calltrace.cycles-pp.unlock_page.shmem_write_end.generic_perform_write.__generic_file_write_iter.generic_file_write_iter
      0.85 ±  8%      -0.2        0.66 ± 15%      -0.1        0.71 ± 10%  perf-profile.calltrace.cycles-pp.__fget_light.ksys_pwrite64.do_syscall_64.entry_SYSCALL_64_after_hwframe.__libc_pwrite
      0.84 ± 14%      -0.1        0.71 ± 14%      -0.1        0.72 ±  8%  perf-profile.calltrace.cycles-pp.set_page_dirty.shmem_write_end.generic_perform_write.__generic_file_write_iter.generic_file_write_iter
      0.91 ± 11%      -0.1        0.79 ± 12%      -0.1        0.82 ± 10%  perf-profile.calltrace.cycles-pp.file_update_time.__generic_file_write_iter.generic_file_write_iter.new_sync_write.vfs_write
      0.68 ± 15%      -0.1        0.58 ± 13%      -0.1        0.57 ±  9%  perf-profile.calltrace.cycles-pp.page_mapping.set_page_dirty.shmem_write_end.generic_perform_write.__generic_file_write_iter
      0.00            +0.0        0.00            +1.0        1.02 ± 11%  perf-profile.calltrace.cycles-pp.__get_user_nocheck_1.iovec_fault_in_readable.generic_perform_write.__generic_file_write_iter.generic_file_write_iter
      0.00            +0.0        0.00            +1.2        1.17 ±  9%  perf-profile.calltrace.cycles-pp.iovec_advance.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.new_sync_write
      0.00            +0.0        0.00            +2.1        2.13 ± 11%  perf-profile.calltrace.cycles-pp.iovec_fault_in_readable.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.new_sync_write
      0.00            +0.0        0.00            +6.8        6.85 ± 10%  perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.copyin.iovec_copy_from_user_atomic.generic_perform_write.__generic_file_write_iter
      0.00            +0.0        0.00            +6.9        6.95 ± 10%  perf-profile.calltrace.cycles-pp.copyin.iovec_copy_from_user_atomic.generic_perform_write.__generic_file_write_iter.generic_file_write_iter
      0.00            +0.0        0.00            +8.2        8.17 ± 10%  perf-profile.calltrace.cycles-pp.iovec_copy_from_user_atomic.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.new_sync_write
      0.00            +1.0        1.01 ± 13%      +0.0        0.00        perf-profile.calltrace.cycles-pp.__get_user_nocheck_1.xxx_fault_in_readable.generic_perform_write.__generic_file_write_iter.generic_file_write_iter
      0.00            +1.4        1.42 ± 12%      +0.0        0.00        perf-profile.calltrace.cycles-pp.xxx_advance.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.new_sync_write
      0.00            +2.1        2.15 ± 13%      +0.0        0.00        perf-profile.calltrace.cycles-pp.xxx_fault_in_readable.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.new_sync_write
      0.00            +6.8        6.82 ± 13%      +0.0        0.00        perf-profile.calltrace.cycles-pp.copy_user_enhanced_fast_string.copyin.xxx_copy_from_user_atomic.generic_perform_write.__generic_file_write_iter
      0.00            +6.9        6.92 ± 13%      +0.0        0.00        perf-profile.calltrace.cycles-pp.copyin.xxx_copy_from_user_atomic.generic_perform_write.__generic_file_write_iter.generic_file_write_iter
      0.00            +8.1        8.09 ± 14%      +0.0        0.00        perf-profile.calltrace.cycles-pp.xxx_copy_from_user_atomic.generic_perform_write.__generic_file_write_iter.generic_file_write_iter.new_sync_write
      8.03 ± 14%      -8.0        0.00            -8.0        0.00        perf-profile.children.cycles-pp.iov_iter_copy_from_user_atomic
      7.55 ± 12%      -0.8        6.75 ± 13%      -0.8        6.79 ± 10%  perf-profile.children.cycles-pp.syscall_return_via_sysret
      4.99 ± 12%      -0.6        4.34 ± 14%      -0.5        4.45 ± 11%  perf-profile.children.cycles-pp.shmem_getpage_gfp
      5.84 ± 13%      -0.6        5.22 ± 14%      -0.6        5.20 ± 11%  perf-profile.children.cycles-pp.shmem_write_begin
      3.53 ± 13%      -0.5        3.07 ± 13%      -0.4        3.17 ± 11%  perf-profile.children.cycles-pp.shmem_write_end
      3.48 ± 12%      -0.4        3.09 ± 14%      -0.4        3.09 ± 12%  perf-profile.children.cycles-pp.find_lock_entry
      0.85 ±  8%      -0.2        0.66 ± 15%      -0.1        0.71 ± 10%  perf-profile.children.cycles-pp.__fget_light
      0.69 ± 14%      -0.2        0.52 ± 15%      -0.2        0.48 ±  9%  perf-profile.children.cycles-pp.up_write
      0.62 ± 13%      -0.2        0.46 ± 14%      -0.2        0.47 ± 12%  perf-profile.children.cycles-pp.apparmor_file_permission
      0.86 ± 14%      -0.1        0.74 ± 14%      -0.1        0.74 ±  8%  perf-profile.children.cycles-pp.set_page_dirty
      0.94 ± 11%      -0.1        0.82 ± 13%      -0.1        0.85 ± 10%  perf-profile.children.cycles-pp.file_update_time
      0.51 ± 12%      -0.1        0.40 ± 14%      +0.0        0.52 ± 11%  perf-profile.children.cycles-pp.balance_dirty_pages_ratelimited
      0.71 ± 15%      -0.1        0.60 ± 13%      -0.1        0.60 ±  9%  perf-profile.children.cycles-pp.page_mapping
      0.55 ± 12%      -0.1        0.47 ± 12%      -0.0        0.50 ±  9%  perf-profile.children.cycles-pp.current_time
      0.62 ± 14%      -0.1        0.55 ± 13%      -0.1        0.56 ± 13%  perf-profile.children.cycles-pp.unlock_page
      0.24 ± 13%      -0.0        0.20 ± 16%      -0.0        0.22 ± 12%  perf-profile.children.cycles-pp.timestamp_truncate
      0.18 ± 11%      -0.0        0.14 ± 15%      -0.0        0.18 ± 12%  perf-profile.children.cycles-pp.file_remove_privs
      0.42 ± 13%      -0.0        0.39 ± 14%      -0.1        0.36 ± 13%  perf-profile.children.cycles-pp.testcase
      0.00            +0.0        0.00            +1.2        1.18 ±  9%  perf-profile.children.cycles-pp.iovec_advance
      0.00            +0.0        0.00            +2.2        2.21 ± 11%  perf-profile.children.cycles-pp.iovec_fault_in_readable
      0.00            +0.0        0.00            +8.2        8.20 ± 10%  perf-profile.children.cycles-pp.iovec_copy_from_user_atomic
      0.21 ± 17%      +0.1        0.28 ± 16%      +0.1        0.29 ± 10%  perf-profile.children.cycles-pp.__x86_indirect_thunk_rax
      0.55 ± 14%      +0.3        0.87 ± 15%      +0.3        0.89 ± 13%  perf-profile.children.cycles-pp.__x86_retpoline_rax
      0.00            +1.4        1.42 ± 12%      +0.0        0.00        perf-profile.children.cycles-pp.xxx_advance
      0.00            +2.2        2.22 ± 13%      +0.0        0.00        perf-profile.children.cycles-pp.xxx_fault_in_readable
      0.00            +8.1        8.12 ± 14%      +0.0        0.00        perf-profile.children.cycles-pp.xxx_copy_from_user_atomic
      7.52 ± 12%      -0.8        6.72 ± 13%      -0.8        6.77 ± 10%  perf-profile.self.cycles-pp.syscall_return_via_sysret
      1.02 ± 16%      -0.2        0.82 ± 12%      -0.1        0.92 ± 10%  perf-profile.self.cycles-pp.shmem_getpage_gfp
      0.82 ±  8%      -0.2        0.63 ± 15%      -0.1        0.68 ± 10%  perf-profile.self.cycles-pp.__fget_light
      0.66 ± 14%      -0.2        0.49 ± 15%      -0.2        0.46 ±  8%  perf-profile.self.cycles-pp.up_write
      0.54 ± 15%      -0.2        0.39 ± 14%      -0.1        0.40 ± 12%  perf-profile.self.cycles-pp.apparmor_file_permission
      0.59 ± 13%      -0.1        0.46 ± 13%      -0.1        0.45 ±  9%  perf-profile.self.cycles-pp.ksys_pwrite64
      0.50 ± 12%      -0.1        0.40 ± 13%      -0.0        0.47 ± 12%  perf-profile.self.cycles-pp.balance_dirty_pages_ratelimited
      0.67 ± 15%      -0.1        0.57 ± 12%      -0.1        0.57 ±  9%  perf-profile.self.cycles-pp.page_mapping
      0.71 ± 17%      -0.1        0.63 ± 13%      -0.1        0.60 ± 14%  perf-profile.self.cycles-pp.security_file_permission
      0.24 ± 15%      -0.0        0.19 ± 15%      -0.0        0.22 ± 12%  perf-profile.self.cycles-pp.timestamp_truncate
      0.20 ± 13%      -0.0        0.17 ± 12%      -0.0        0.18 ± 10%  perf-profile.self.cycles-pp.current_time
      0.00            +0.0        0.00            +1.1        1.05 ±  9%  perf-profile.self.cycles-pp.iovec_advance
      0.00            +0.0        0.00            +1.2        1.17 ± 12%  perf-profile.self.cycles-pp.iovec_fault_in_readable
      0.00            +0.0        0.00            +1.2        1.19 ± 10%  perf-profile.self.cycles-pp.iovec_copy_from_user_atomic
      0.82 ± 15%      +0.0        0.83 ± 12%      -0.1        0.71 ± 10%  perf-profile.self.cycles-pp.shmem_write_begin
      0.12 ± 14%      +0.1        0.19 ± 14%      +0.1        0.20 ±  7%  perf-profile.self.cycles-pp.__x86_indirect_thunk_rax
      0.43 ± 14%      +0.3        0.68 ± 15%      +0.3        0.69 ± 15%  perf-profile.self.cycles-pp.__x86_retpoline_rax
      0.00            +1.1        1.14 ± 15%      +0.0        0.00        perf-profile.self.cycles-pp.xxx_copy_from_user_atomic
      0.00            +1.2        1.21 ± 12%      +0.0        0.00        perf-profile.self.cycles-pp.xxx_fault_in_readable
      0.00            +1.3        1.28 ± 12%      +0.0        0.00        perf-profile.self.cycles-pp.xxx_advance

> 
> Thanks,
> David
> 


* Re: [iov_iter] 9bd0e337c6: will-it-scale.per_process_ops -4.8% regression
  2020-12-04 11:51     ` David Howells
  2020-12-07 13:10       ` Oliver Sang
@ 2020-12-07 13:20       ` David Howells
  1 sibling, 0 replies; 55+ messages in thread
From: David Howells @ 2020-12-07 13:20 UTC (permalink / raw)
  To: Oliver Sang
  Cc: dhowells, lkp, lkp, ying.huang, feng.tang, zhengjun.xing,
	Pavel Begunkov, Matthew Wilcox, Jens Axboe, Alexander Viro,
	Linus Torvalds, linux-fsdevel, linux-block, linux-kernel

Oliver Sang <oliver.sang@intel.com> wrote:

> > Out of interest, would it be possible for you to run this on the tail of the
> > series on the same hardware?
> 
> sorry for late. below is the result adding the tail of the series:
> * ded69a6991fe0 (linux-review/David-Howells/RFC-iov_iter-Switch-to-using-an-ops-table/20201121-222344) iov_iter: Remove iterate_all_kinds() and iterate_and_advance()

Thanks very much for doing that!

David



* Re: [PATCH 01/29] iov_iter: Switch to using a table of operations
  2020-11-21 18:21   ` Linus Torvalds
@ 2020-12-11  1:30     ` Al Viro
  0 siblings, 0 replies; 55+ messages in thread
From: Al Viro @ 2020-12-11  1:30 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: David Howells, Pavel Begunkov, Matthew Wilcox, Jens Axboe,
	linux-fsdevel, linux-block, Linux Kernel Mailing List

On Sat, Nov 21, 2020 at 10:21:17AM -0800, Linus Torvalds wrote:
> So I think conceptually this is the right thing to do, but I have a
> couple of worries:
> 
>  - do we really need all those different versions? I'm thinking
> "iter_full" versions in particular.  I think the iter_full versions
> could just be wrappers that call the regular iter thing and verify the
> end result is full (and revert if not). No?

Umm...  Not sure - iov_iter_revert() is not exactly light.  OTOH, it's
on a slow path...  Other variants:
	* save a local copy, run the normal variant on the iter, then copy
the saved one back on failure
	* make a local copy, run the normal variant on _that_, then
copy it back on success.

Note that the entire thing is 5 words, and we end up reading all of
them anyway, so I wouldn't bet which variant ends up being faster -
that would need testing to compare.

I would certainly like to get rid of the duplication there, especially
if we are going to add copy_to_iter_full() and friends (there are
use cases for those).
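
The save-and-restore variant is cheap precisely because the iterator is only a handful of words. A minimal userspace sketch of that idea (toy structure and names, not the kernel iov_iter):

```c
#include <assert.h>
#include <string.h>

/* Toy model of an iterator: a few words of state, cheap to copy by value. */
struct toy_iter {
	const char *base;
	size_t offset;
	size_t count;
};

/* Partial-copy primitive: may copy fewer than 'bytes' (here: capped by count). */
static size_t toy_copy_from_iter(void *addr, size_t bytes, struct toy_iter *i)
{
	size_t n = bytes < i->count ? bytes : i->count;

	memcpy(addr, i->base + i->offset, n);
	i->offset += n;
	i->count -= n;
	return n;
}

/* "Full" wrapper: save the iterator, run the normal variant, restore on a
 * short copy instead of walking segments backwards with a revert. */
static int toy_copy_from_iter_full(void *addr, size_t bytes, struct toy_iter *i)
{
	struct toy_iter saved = *i;	/* cheap: whole struct is a few words */

	if (toy_copy_from_iter(addr, bytes, i) == bytes)
		return 1;
	*i = saved;			/* failure: roll back by assignment */
	return 0;
}
```

Copying the whole struct up front sidesteps revert logic entirely, at the price of always touching all of it even on the success path.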

>  - I worry a bit about the indirect call overhead and spectre v2.
> 
>    So yeah, it would be good to have benchmarks to make sure this
> doesn't regress for some simple case.
> 
> Other than those things, my initial reaction is "this does seem cleaner".

It does seem cleaner, all right, but that stuff is on fairly hot paths.
And I didn't want to mix the overhead of indirect calls into the picture,
so it turned into cascades of ifs with rather vile macros to keep the
size down.

It looks like the cost of indirects is noticeable.  OTOH, there are
other iov_iter patches floating around, hopefully getting better
code generation.  Let's see how much those give and, if they win
considerably more than those several percent, revisit this series.
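
For illustration, the two dispatch shapes being compared look roughly like this in a userspace toy (all names invented; the kernel code is considerably hairier):

```c
#include <assert.h>
#include <stddef.h>

struct toy_iter;

struct toy_iter_ops {
	size_t (*advance)(struct toy_iter *i, size_t n);
};

enum toy_type { TOY_IOVEC, TOY_KVEC };

struct toy_iter {
	enum toy_type type;		/* used by the if-chain style */
	const struct toy_iter_ops *ops;	/* used by the ops-table style */
	size_t count;
};

static size_t iovec_advance(struct toy_iter *i, size_t n)
{
	if (n > i->count)
		n = i->count;
	i->count -= n;
	return n;
}

static size_t kvec_advance(struct toy_iter *i, size_t n)
{
	if (n > i->count)
		n = i->count;
	i->count -= n;
	return n;
}

/* Style 1: a cascade of type tests at every entry point. */
static size_t advance_ifchain(struct toy_iter *i, size_t n)
{
	if (i->type == TOY_IOVEC)
		return iovec_advance(i, n);
	else if (i->type == TOY_KVEC)
		return kvec_advance(i, n);
	return 0;
}

static const struct toy_iter_ops iovec_ops = { .advance = iovec_advance };

/* Style 2: pick the table once at init; each call is then one indirect
 * jump, which is exactly where the retpoline/Spectre-v2 cost comes in. */
static size_t advance_ops(struct toy_iter *i, size_t n)
{
	return i->ops->advance(i, n);
}
```

The if-chain pays a few predictable branches per call; the ops table pays one indirect call, which retpolines make expensive on affected CPUs.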


* Re: [PATCH 00/29] RFC: iov_iter: Switch to using an ops table
  2020-11-21 14:13 [PATCH 00/29] RFC: iov_iter: Switch to using an ops table David Howells
                   ` (30 preceding siblings ...)
  2020-11-21 18:23 ` Linus Torvalds
@ 2020-12-11  3:24 ` Matthew Wilcox
  31 siblings, 0 replies; 55+ messages in thread
From: Matthew Wilcox @ 2020-12-11  3:24 UTC (permalink / raw)
  To: David Howells
  Cc: Pavel Begunkov, Jens Axboe, Alexander Viro, Linus Torvalds,
	linux-fsdevel, linux-block, linux-kernel

On Sat, Nov 21, 2020 at 02:13:21PM +0000, David Howells wrote:
> I had a go switching the iov_iter stuff away from using a type bitmask to
> using an ops table to get rid of the if-if-if-if chains that are all over
> the place.  After I pushed it, someone pointed me at Pavel's two patches.
> 
> I have another iterator class that I want to add - which would lengthen the
> if-if-if-if chains.  A lot of the time, there's a conditional clause at the
> beginning of a function that just jumps off to a type-specific handler or
> to reject the operation for that type.  An ops table can just point to that
> instead.

So, given the performance problem, how about turning this inside out?

struct iov_step {
	union {
		void *kaddr;
		void __user *uaddr;
	};
	unsigned int len;
	bool user_addr;
	bool kmap;
	struct page *page;
};

bool iov_iterate(struct iov_step *step, struct iov_iter *i, size_t max)
{
	/* Unmap whatever the previous step mapped */
	if (step->page)
		kunmap(step->page);
	else if (step->kmap)
		kunmap_atomic(step->kaddr);

	if (max == 0)
		return false;

	if (i->type & ITER_IOVEC) {
		step->user_addr = true;
		step->uaddr = i->iov->iov_base + i->iov_offset;
		step->len = min(i->iov->iov_len - i->iov_offset, max);
		return true;
	}
	if (i->type & ITER_BVEC) {
		... get the page ...
	} else if (i->type & ITER_KVEC) {
		... get the page ...
	} else ...

	kmap or kmap_atomic as appropriate ...
	...set kaddr & len ...

	return true;
}

size_t copy_from_iter(void *addr, size_t bytes, struct iov_iter *i)
{
	struct iov_step step = {};
	size_t copied = 0;

	while (iov_iterate(&step, i, bytes - copied)) {
		if (step.user_addr)
			copy_from_user(addr + copied, step.uaddr, step.len);
		else
			memcpy(addr + copied, step.kaddr, step.len);
		copied += step.len;
	}
	return copied;
}
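
A hedged userspace analogue of the step-based shape above, with kmap and the user/kernel split elided (all names invented for illustration):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct toy_kvec { const void *base; size_t len; };

struct toy_iter {
	const struct toy_kvec *kvec;
	size_t nr_segs;
	size_t seg;	/* current segment */
	size_t offset;	/* offset within current segment */
};

struct toy_step { const void *kaddr; size_t len; };

/* Yield the next contiguous chunk, at most 'max' bytes; false when done. */
static bool toy_iterate(struct toy_step *step, struct toy_iter *i, size_t max)
{
	/* Skip any segments that are fully consumed. */
	while (i->seg < i->nr_segs && i->offset == i->kvec[i->seg].len) {
		i->seg++;
		i->offset = 0;
	}
	if (max == 0 || i->seg >= i->nr_segs)
		return false;

	step->kaddr = (const char *)i->kvec[i->seg].base + i->offset;
	step->len = i->kvec[i->seg].len - i->offset;
	if (step->len > max)
		step->len = max;
	i->offset += step->len;
	return true;
}

/* The copy routine collapses to a flat loop: no per-type branches here,
 * all the type knowledge lives in the iterate step. */
static size_t toy_copy_from_iter(void *addr, size_t bytes, struct toy_iter *i)
{
	struct toy_step step;
	size_t copied = 0;

	while (toy_iterate(&step, i, bytes - copied)) {
		memcpy((char *)addr + copied, step.kaddr, step.len);
		copied += step.len;
	}
	return copied;
}
```

The attraction of this inversion is that the hot copy loop contains no dispatch at all; the type-specific work is amortised to once per segment rather than once per operation.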



end of thread, other threads:[~2020-12-11  3:25 UTC | newest]

Thread overview: 55+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-21 14:13 [PATCH 00/29] RFC: iov_iter: Switch to using an ops table David Howells
2020-11-21 14:13 ` [PATCH 01/29] iov_iter: Switch to using a table of operations David Howells
2020-11-21 14:31   ` Pavel Begunkov
2020-11-23 23:21     ` Pavel Begunkov
2020-11-21 18:21   ` Linus Torvalds
2020-12-11  1:30     ` Al Viro
2020-11-22 13:33   ` David Howells
2020-11-22 13:58     ` David Laight
2020-11-22 19:22     ` Linus Torvalds
2020-11-22 22:34       ` David Laight
2020-11-22 22:46   ` David Laight
2020-11-23  8:05   ` Christoph Hellwig
2020-11-23 10:31   ` David Howells
2020-11-23 23:42     ` Pavel Begunkov
2020-11-24 12:50     ` David Howells
2020-11-24 15:30       ` Jens Axboe
2020-11-27 17:14       ` David Howells
2020-11-23 11:14   ` David Howells
     [not found]   ` <20201203064536.GE27350@xsang-OptiPlex-9020>
2020-12-03 17:47     ` [iov_iter] 9bd0e337c6: will-it-scale.per_process_ops -4.8% regression Linus Torvalds
2020-12-03 17:50       ` Jens Axboe
2020-12-04 11:50     ` David Howells
2020-12-04 11:51     ` David Howells
2020-12-07 13:10       ` Oliver Sang
2020-12-07 13:20       ` David Howells
2020-11-21 14:13 ` [PATCH 02/29] iov_iter: Split copy_page_to_iter() David Howells
2020-11-21 14:13 ` [PATCH 03/29] iov_iter: Split iov_iter_fault_in_readable David Howells
2020-11-21 14:13 ` [PATCH 04/29] iov_iter: Split the iterate_and_advance() macro David Howells
2020-11-21 14:14 ` [PATCH 05/29] iov_iter: Split copy_to_iter() David Howells
2020-11-21 14:14 ` [PATCH 06/29] iov_iter: Split copy_mc_to_iter() David Howells
2020-11-21 14:14 ` [PATCH 07/29] iov_iter: Split copy_from_iter() David Howells
2020-11-21 14:14 ` [PATCH 08/29] iov_iter: Split the iterate_all_kinds() macro David Howells
2020-11-21 14:14 ` [PATCH 09/29] iov_iter: Split copy_from_iter_full() David Howells
2020-11-21 14:14 ` [PATCH 10/29] iov_iter: Split copy_from_iter_nocache() David Howells
2020-11-21 14:14 ` [PATCH 11/29] iov_iter: Split copy_from_iter_flushcache() David Howells
2020-11-21 14:14 ` [PATCH 12/29] iov_iter: Split copy_from_iter_full_nocache() David Howells
2020-11-21 14:15 ` [PATCH 13/29] iov_iter: Split copy_page_from_iter() David Howells
2020-11-21 14:15 ` [PATCH 14/29] iov_iter: Split iov_iter_zero() David Howells
2020-11-21 14:15 ` [PATCH 15/29] iov_iter: Split copy_from_user_atomic() David Howells
2020-11-21 14:15 ` [PATCH 16/29] iov_iter: Split iov_iter_advance() David Howells
2020-11-21 14:15 ` [PATCH 17/29] iov_iter: Split iov_iter_revert() David Howells
2020-11-21 14:15 ` [PATCH 18/29] iov_iter: Split iov_iter_single_seg_count() David Howells
2020-11-21 14:15 ` [PATCH 19/29] iov_iter: Split iov_iter_alignment() David Howells
2020-11-21 14:15 ` [PATCH 20/29] iov_iter: Split iov_iter_gap_alignment() David Howells
2020-11-21 14:16 ` [PATCH 21/29] iov_iter: Split iov_iter_get_pages() David Howells
2020-11-21 14:16 ` [PATCH 22/29] iov_iter: Split iov_iter_get_pages_alloc() David Howells
2020-11-21 14:16 ` [PATCH 23/29] iov_iter: Split csum_and_copy_from_iter() David Howells
2020-11-21 14:16 ` [PATCH 24/29] iov_iter: Split csum_and_copy_from_iter_full() David Howells
2020-11-21 14:16 ` [PATCH 25/29] iov_iter: Split csum_and_copy_to_iter() David Howells
2020-11-21 14:16 ` [PATCH 26/29] iov_iter: Split iov_iter_npages() David Howells
2020-11-21 14:16 ` [PATCH 27/29] iov_iter: Split dup_iter() David Howells
2020-11-21 14:17 ` [PATCH 28/29] iov_iter: Split iov_iter_for_each_range() David Howells
2020-11-21 14:17 ` [PATCH 29/29] iov_iter: Remove iterate_all_kinds() and iterate_and_advance() David Howells
2020-11-21 14:34 ` [PATCH 00/29] RFC: iov_iter: Switch to using an ops table Pavel Begunkov
2020-11-21 18:23 ` Linus Torvalds
2020-12-11  3:24 ` Matthew Wilcox
