From: David Howells <dhowells@redhat.com>
To: Steve French <smfrench@gmail.com>, Al Viro <viro@zeniv.linux.org.uk>
Cc: dhowells@redhat.com, Shyam Prasad N <nspmangalore@gmail.com>,
	Rohith Surabattula <rohiths.msft@gmail.com>,
	Tom Talpey <tom@talpey.com>,
	Christoph Hellwig <hch@infradead.org>,
	Matthew Wilcox <willy@infradead.org>,
	Jeff Layton <jlayton@kernel.org>,
	linux-cifs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: [RFC PATCH 1/9] netfs: Add a function to extract a UBUF or IOVEC into a BVEC iterator
Date: Fri, 28 Oct 2022 16:55:52 +0100
Message-ID: <166697255265.61150.6289490555867717077.stgit@warthog.procyon.org.uk>
In-Reply-To: <166697254399.61150.1256557652599252121.stgit@warthog.procyon.org.uk>
Add a function to extract the pages from a user-space supplied iterator
(UBUF- or IOVEC-type) into a BVEC-type iterator, pinning the pages as we
go.

This is useful in three situations:

 (1) A userspace thread may have a sibling that unmaps or remaps the
     process's VM during the operation, changing the assignment of the
     pages and potentially causing an error. Pinning the pages keeps
     some pages around, even if this occurs; further, we find out at the
     point of extraction if EFAULT is going to be incurred.

 (2) Pages might get swapped out/discarded if not pinned, so we want to pin
     them to avoid the reload causing a deadlock due to a DIO from/to an
     mmapped region on the same file.

 (3) The iterator may get passed to sendmsg() by the filesystem. If a
     fault occurs, we may get a short write to a TCP stream that's then
     tricky to recover from.

We assume that other types of iterator (e.g. BVEC-, KVEC- and XARRAY-type)
are constructed only by kernel internals and that the pages are pinned in
those cases.

DISCARD- and PIPE-type iterators aren't DIO'able.

Signed-off-by: David Howells <dhowells@redhat.com>
---
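Note on the per-page arithmetic: the inner loop of
netfs_extract_user_iter() turns one extracted extent (a byte count plus the
offset into the first page) into one bvec element per page. The following
is a minimal userspace sketch of that arithmetic only, not kernel code.
struct seg, split_extent() and SKETCH_PAGE_SIZE are hypothetical names
invented for illustration, and a 4096-byte page is assumed.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumption: 4KiB pages */

/* Hypothetical stand-in for struct bio_vec, minus the page pointer. */
struct seg {
	size_t offset;	/* offset of the data within its page */
	size_t len;	/* number of data bytes in that page */
};

/*
 * Split an extent of @nbytes that begins @start bytes into its first page
 * into per-page segments, mirroring the for-loop in the patch.  Returns
 * the number of segments written, or 0 if @max elements are not enough.
 */
static size_t split_extent(size_t start, size_t nbytes,
			   struct seg *segs, size_t max)
{
	/* Like "ret += start": measure from the start of the first page. */
	size_t remaining = nbytes + start;
	size_t nsegs = (remaining + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
	size_t i;

	assert(start < SKETCH_PAGE_SIZE);
	if (nsegs > max)
		return 0;

	for (i = 0; i < nsegs; i++) {
		size_t len = remaining > SKETCH_PAGE_SIZE ?
			SKETCH_PAGE_SIZE : remaining;

		segs[i].offset = start;
		segs[i].len = len - start;
		remaining -= len;
		start = 0;	/* only the first page has a leading offset */
	}
	return nsegs;
}
```

Only the first segment carries a non-zero offset, and the segment lengths
sum to the byte count that was extracted.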
fs/netfs/Makefile | 1 +
fs/netfs/iterator.c | 96 +++++++++++++++++++++++++++++++++++++++++++++++++
include/linux/netfs.h | 2 +
3 files changed, 99 insertions(+)
create mode 100644 fs/netfs/iterator.c
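
Note on the shared allocation: the patch places the temporary page-pointer
list at the tail of the bvec array's own storage and then fills the bvec
elements from index 0 upwards. The following userspace sketch illustrates
why that ordering is safe when the wide element is at least as large as a
pointer. struct wide and expand_in_place() are hypothetical names used only
for this illustration; the kernel code uses struct bio_vec and kvmalloc().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct bio_vec. */
struct wide {
	void *ptr;
	unsigned int len;
	unsigned int offset;
};

_Static_assert(sizeof(struct wide) >= sizeof(void *),
	       "wide elements must not be smaller than a pointer");

/*
 * Allocate @n wide elements, stage the @n source pointers in the tail of
 * that same buffer, then expand them into wide elements working 0->last.
 * Element i ends at byte (i + 1) * sizeof(struct wide), which never lies
 * beyond the end of staged pointer i, so only pointers that have already
 * been read get overwritten.  The caller frees the result with free().
 */
static struct wide *expand_in_place(void *const *src, size_t n)
{
	struct wide *out;
	void **tail;
	size_t i;

	if (n == 0)
		return NULL;
	out = malloc(n * sizeof(*out));
	if (!out)
		return NULL;

	/* The pointer list occupies the last n * sizeof(void *) bytes. */
	tail = (void **)((char *)out + n * sizeof(*out) - n * sizeof(*tail));
	memcpy(tail, src, n * sizeof(*tail));

	for (i = 0; i < n; i++) {
		void *p = tail[i];	/* read before element i is written */

		out[i].ptr = p;
		out[i].len = 0;
		out[i].offset = 0;
	}
	return out;
}
```

Walking from last to first instead would overwrite staged pointers that had
not yet been read.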
diff --git a/fs/netfs/Makefile b/fs/netfs/Makefile
index f684c0cd1ec5..386d6fb92793 100644
--- a/fs/netfs/Makefile
+++ b/fs/netfs/Makefile
@@ -3,6 +3,7 @@
 netfs-y := \
 	buffered_read.o \
 	io.o \
+	iterator.o \
 	main.o \
 	objects.o
diff --git a/fs/netfs/iterator.c b/fs/netfs/iterator.c
new file mode 100644
index 000000000000..6875c6c94466
--- /dev/null
+++ b/fs/netfs/iterator.c
@@ -0,0 +1,96 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/* Iterator helpers.
+ *
+ * Copyright (C) 2022 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ */
+
+#include <linux/export.h>
+#include <linux/slab.h>
+#include <linux/uio.h>
+#include <linux/netfs.h>
+#include "internal.h"
+
+/**
+ * netfs_extract_user_iter - Extract the pages from a user iterator into a bvec
+ * @orig: The original iterator
+ * @orig_len: The amount of iterator to copy
+ * @new: The iterator to be set up
+ *
+ * Extract the page fragments from the given amount of the source iterator and
+ * build up a second iterator that refers to all of those bits. This allows
+ * the original iterator to be disposed of.
+ *
+ * On success, the number of elements in the bvec is returned and the original
+ * iterator will have been advanced by the amount extracted.
+ */
+ssize_t netfs_extract_user_iter(struct iov_iter *orig, size_t orig_len,
+				struct iov_iter *new)
+{
+	struct bio_vec *bv = NULL;
+	struct page **pages;
+	unsigned int cur_npages;
+	unsigned int max_pages;
+	unsigned int npages = 0;
+	unsigned int i;
+	size_t count = orig_len;
+	ssize_t ret;
+	size_t bv_size, pg_size;
+	size_t start;
+	size_t len;
+
+	if (WARN_ON_ONCE(!iter_is_ubuf(orig) && !iter_is_iovec(orig)))
+		return -EIO;
+
+	max_pages = iov_iter_npages(orig, INT_MAX);
+	bv_size = array_size(max_pages, sizeof(*bv));
+	bv = kvmalloc(bv_size, GFP_KERNEL);
+	if (!bv)
+		return -ENOMEM;
+
+	/* Put the page list at the end of the bvec list storage.  bvec
+	 * elements are larger than page pointers, so as long as we work
+	 * 0->last, we should be fine.
+	 */
+	pg_size = array_size(max_pages, sizeof(*pages));
+	pages = (void *)bv + bv_size - pg_size;
+
+	while (count && npages < max_pages) {
+		ret = iov_iter_get_pages2(orig, pages, count, max_pages - npages,
+					  &start);
+		if (ret < 0) {
+			pr_err("Couldn't get user pages (rc=%zd)\n", ret);
+			break;
+		}
+
+		if (ret > count) {
+			pr_err("get_pages rc=%zd more than %zu\n", ret, count);
+			break;
+		}
+
+		count -= ret;
+		ret += start;
+		cur_npages = DIV_ROUND_UP(ret, PAGE_SIZE);
+
+		if (npages + cur_npages > max_pages) {
+			pr_err("Out of bvec array capacity (%u vs %u)\n",
+			       npages + cur_npages, max_pages);
+			break;
+		}
+
+		for (i = 0; i < cur_npages; i++) {
+			len = ret > PAGE_SIZE ? PAGE_SIZE : ret;
+			bv[npages + i].bv_page	 = *pages++;
+			bv[npages + i].bv_offset = start;
+			bv[npages + i].bv_len	 = len - start;
+			ret -= len;
+			start = 0;
+		}
+
+		npages += cur_npages;
+	}
+
+	iov_iter_bvec(new, iov_iter_rw(orig), bv, npages, orig_len - count);
+	return npages;
+}
+EXPORT_SYMBOL(netfs_extract_user_iter);
diff --git a/include/linux/netfs.h b/include/linux/netfs.h
index f2402ddeafbf..5f6ad0246946 100644
--- a/include/linux/netfs.h
+++ b/include/linux/netfs.h
@@ -288,6 +288,8 @@ void netfs_get_subrequest(struct netfs_io_subrequest *subreq,
 void netfs_put_subrequest(struct netfs_io_subrequest *subreq,
 			  bool was_async, enum netfs_sreq_ref_trace what);
 void netfs_stats_show(struct seq_file *);
+ssize_t netfs_extract_user_iter(struct iov_iter *orig, size_t orig_len,
+				struct iov_iter *new);
 
 /**
  * netfs_inode - Get the netfs inode context from the inode