All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCHSET v4 0/8] Turn single segment imports into ITER_UBUF
@ 2023-03-28 17:36 Jens Axboe
  2023-03-28 17:36 ` [PATCH 1/8] iov_iter: teach iov_iter_iovec() to deal with ITER_UBUF Jens Axboe
                   ` (7 more replies)
  0 siblings, 8 replies; 27+ messages in thread
From: Jens Axboe @ 2023-03-28 17:36 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: torvalds, brauner, viro

Hi,

ather than repeat the same blurb again, see the v2 posting here:

https://lore.kernel.org/linux-fsdevel/20230327180449.87382-1-axboe@kernel.dk/

tldr - turn single segment iovecs into ITER_UBUF rather than ITER_IOVEC,
because they are more efficient.

Main addition since v3 is being careful checking users of iov_iter,
and there are two odd ones - infiniband and sound. I've added a
prep patch to be able to grab the segment count of a user backed
iterator, which sound and IB can then use.

I'm not convinced the sound patch is useful at all, since it LOOKS
like it requires nr_segments > 1 anyway. I'll chat with the folks
on that side to be sure. We may get away with just EINVAL for
this case.

IB is pretty straight forward, just make it deal with ITER_UBUF
as a single segment thing.

-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH 1/8] iov_iter: teach iov_iter_iovec() to deal with ITER_UBUF
  2023-03-28 17:36 [PATCHSET v4 0/8] Turn single segment imports into ITER_UBUF Jens Axboe
@ 2023-03-28 17:36 ` Jens Axboe
  2023-03-28 17:36 ` [PATCH 2/8] iov_iter: add iovec_nr_user_vecs() helper Jens Axboe
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 27+ messages in thread
From: Jens Axboe @ 2023-03-28 17:36 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: torvalds, brauner, viro, Jens Axboe

Even if we're returning an iovec, we can trivially fill it in with the
details from an ITER_UBUF as well. This enables loops that assume
ITER_IOVEC to deal with ITER_UBUF transparently.

This is done in preparation for automatically importing single segment
iovecs as ITER_UBUF.

Suggested-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 include/linux/uio.h | 26 +++++++++++++++++++++-----
 1 file changed, 21 insertions(+), 5 deletions(-)

diff --git a/include/linux/uio.h b/include/linux/uio.h
index 27e3fd942960..3b4403efcce1 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -143,13 +143,29 @@ static inline size_t iov_length(const struct iovec *iov, unsigned long nr_segs)
 	return ret;
 }
 
+/*
+ * Don't assume we're called with ITER_IOVEC, enable usage of ITER_UBUF
+ * as well by simply filling in the iovec.
+ */
 static inline struct iovec iov_iter_iovec(const struct iov_iter *iter)
 {
-	return (struct iovec) {
-		.iov_base = iter->iov->iov_base + iter->iov_offset,
-		.iov_len = min(iter->count,
-			       iter->iov->iov_len - iter->iov_offset),
-	};
+	if (WARN_ON_ONCE(!iter->user_backed)) {
+		return (struct iovec) {
+			.iov_base = NULL,
+			.iov_len = 0
+		};
+	} else if (iter_is_ubuf(iter)) {
+		return (struct iovec) {
+			.iov_base = iter->ubuf + iter->iov_offset,
+			.iov_len = iter->count
+		};
+	} else {
+		return (struct iovec) {
+			.iov_base = iter->iov->iov_base + iter->iov_offset,
+			.iov_len = min(iter->count,
+				       iter->iov->iov_len - iter->iov_offset),
+		};
+	}
 }
 
 size_t copy_page_from_iter_atomic(struct page *page, unsigned offset,
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 2/8] iov_iter: add iovec_nr_user_vecs() helper
  2023-03-28 17:36 [PATCHSET v4 0/8] Turn single segment imports into ITER_UBUF Jens Axboe
  2023-03-28 17:36 ` [PATCH 1/8] iov_iter: teach iov_iter_iovec() to deal with ITER_UBUF Jens Axboe
@ 2023-03-28 17:36 ` Jens Axboe
  2023-03-28 18:42   ` Al Viro
  2023-03-28 17:36 ` [PATCH 3/8] snd: move mapping an iov_iter to user bufs into a helper Jens Axboe
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 27+ messages in thread
From: Jens Axboe @ 2023-03-28 17:36 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: torvalds, brauner, viro, Jens Axboe

This returns the number of user segments in an iov_iter. The input can
either be an ITER_IOVEC, where it'll return the number of iovecs. Or it
can be an ITER_UBUF, in which case the number of segments is always 1.

Outside of those two, no user backed iterators exist. Just return 0 for
those.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 include/linux/uio.h | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/include/linux/uio.h b/include/linux/uio.h
index 3b4403efcce1..8ba4d61e9e9b 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -168,6 +168,15 @@ static inline struct iovec iov_iter_iovec(const struct iov_iter *iter)
 	}
 }
 
+static inline int iovec_nr_user_vecs(const struct iov_iter *iter)
+{
+	if (iter_is_ubuf(iter))
+		return 1;
+	else if (iter->iter_type == ITER_IOVEC)
+		return iter->nr_segs;
+	return 0;
+}
+
 size_t copy_page_from_iter_atomic(struct page *page, unsigned offset,
 				  size_t bytes, struct iov_iter *i);
 void iov_iter_advance(struct iov_iter *i, size_t bytes);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 3/8] snd: move mapping an iov_iter to user bufs into a helper
  2023-03-28 17:36 [PATCHSET v4 0/8] Turn single segment imports into ITER_UBUF Jens Axboe
  2023-03-28 17:36 ` [PATCH 1/8] iov_iter: teach iov_iter_iovec() to deal with ITER_UBUF Jens Axboe
  2023-03-28 17:36 ` [PATCH 2/8] iov_iter: add iovec_nr_user_vecs() helper Jens Axboe
@ 2023-03-28 17:36 ` Jens Axboe
  2023-03-28 17:36 ` [PATCH 4/8] snd: make snd_map_bufs() deal with ITER_UBUF Jens Axboe
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 27+ messages in thread
From: Jens Axboe @ 2023-03-28 17:36 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: torvalds, brauner, viro, Jens Axboe

snd_pcm_{readv,writev} both do the same mapping of a struct iov_iter
into an array of buffers. Move this into a helper.

No functional changes intended in this patch.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 sound/core/pcm_native.c | 55 ++++++++++++++++++++++-------------------
 1 file changed, 30 insertions(+), 25 deletions(-)

diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
index 331380c2438b..925d96343148 100644
--- a/sound/core/pcm_native.c
+++ b/sound/core/pcm_native.c
@@ -3512,13 +3512,36 @@ static ssize_t snd_pcm_write(struct file *file, const char __user *buf,
 	return result;
 }
 
+static void __user **snd_map_bufs(struct snd_pcm_runtime *runtime,
+				  struct iov_iter *iter,
+				  snd_pcm_uframes_t *frames, int max_segs)
+{
+	void __user **bufs;
+	unsigned long i;
+
+	if (!iter->user_backed)
+		return ERR_PTR(-EFAULT);
+	if (!iter->nr_segs)
+		return ERR_PTR(-EINVAL);
+	if (iter->nr_segs > max_segs || iter->nr_segs != runtime->channels)
+		return ERR_PTR(-EINVAL);
+	if (!frame_aligned(runtime, iter->iov->iov_len))
+		return ERR_PTR(-EINVAL);
+	bufs = kmalloc_array(iter->nr_segs, sizeof(void *), GFP_KERNEL);
+	if (bufs == NULL)
+		return ERR_PTR(-ENOMEM);
+	for (i = 0; i < iter->nr_segs; ++i)
+		bufs[i] = iter->iov[i].iov_base;
+	*frames = bytes_to_samples(runtime, iter->iov->iov_len);
+	return bufs;
+}
+
 static ssize_t snd_pcm_readv(struct kiocb *iocb, struct iov_iter *to)
 {
 	struct snd_pcm_file *pcm_file;
 	struct snd_pcm_substream *substream;
 	struct snd_pcm_runtime *runtime;
 	snd_pcm_sframes_t result;
-	unsigned long i;
 	void __user **bufs;
 	snd_pcm_uframes_t frames;
 
@@ -3530,18 +3553,9 @@ static ssize_t snd_pcm_readv(struct kiocb *iocb, struct iov_iter *to)
 	if (runtime->state == SNDRV_PCM_STATE_OPEN ||
 	    runtime->state == SNDRV_PCM_STATE_DISCONNECTED)
 		return -EBADFD;
-	if (!iter_is_iovec(to))
-		return -EINVAL;
-	if (to->nr_segs > 1024 || to->nr_segs != runtime->channels)
-		return -EINVAL;
-	if (!frame_aligned(runtime, to->iov->iov_len))
-		return -EINVAL;
-	frames = bytes_to_samples(runtime, to->iov->iov_len);
-	bufs = kmalloc_array(to->nr_segs, sizeof(void *), GFP_KERNEL);
-	if (bufs == NULL)
-		return -ENOMEM;
-	for (i = 0; i < to->nr_segs; ++i)
-		bufs[i] = to->iov[i].iov_base;
+	bufs = snd_map_bufs(runtime, to, &frames, 1024);
+	if (IS_ERR(bufs))
+		return PTR_ERR(bufs);
 	result = snd_pcm_lib_readv(substream, bufs, frames);
 	if (result > 0)
 		result = frames_to_bytes(runtime, result);
@@ -3555,7 +3569,6 @@ static ssize_t snd_pcm_writev(struct kiocb *iocb, struct iov_iter *from)
 	struct snd_pcm_substream *substream;
 	struct snd_pcm_runtime *runtime;
 	snd_pcm_sframes_t result;
-	unsigned long i;
 	void __user **bufs;
 	snd_pcm_uframes_t frames;
 
@@ -3567,17 +3580,9 @@ static ssize_t snd_pcm_writev(struct kiocb *iocb, struct iov_iter *from)
 	if (runtime->state == SNDRV_PCM_STATE_OPEN ||
 	    runtime->state == SNDRV_PCM_STATE_DISCONNECTED)
 		return -EBADFD;
-	if (!iter_is_iovec(from))
-		return -EINVAL;
-	if (from->nr_segs > 128 || from->nr_segs != runtime->channels ||
-	    !frame_aligned(runtime, from->iov->iov_len))
-		return -EINVAL;
-	frames = bytes_to_samples(runtime, from->iov->iov_len);
-	bufs = kmalloc_array(from->nr_segs, sizeof(void *), GFP_KERNEL);
-	if (bufs == NULL)
-		return -ENOMEM;
-	for (i = 0; i < from->nr_segs; ++i)
-		bufs[i] = from->iov[i].iov_base;
+	bufs = snd_map_bufs(runtime, from, &frames, 128);
+	if (IS_ERR(bufs))
+		return PTR_ERR(bufs);
 	result = snd_pcm_lib_writev(substream, bufs, frames);
 	if (result > 0)
 		result = frames_to_bytes(runtime, result);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 4/8] snd: make snd_map_bufs() deal with ITER_UBUF
  2023-03-28 17:36 [PATCHSET v4 0/8] Turn single segment imports into ITER_UBUF Jens Axboe
                   ` (2 preceding siblings ...)
  2023-03-28 17:36 ` [PATCH 3/8] snd: move mapping an iov_iter to user bufs into a helper Jens Axboe
@ 2023-03-28 17:36 ` Jens Axboe
  2023-03-28 17:50   ` Linus Torvalds
  2023-03-28 17:36 ` [PATCH 5/8] IB/hfi1: make hfi1_write_iter() deal with ITER_UBUF iov_iter Jens Axboe
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 27+ messages in thread
From: Jens Axboe @ 2023-03-28 17:36 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: torvalds, brauner, viro, Jens Axboe

This probably doesn't make any sense, as it's reliant on passing in
different things in multiple segments. Most likely we can just make
this go away as it's already checking for ITER_IOVEC upon entry, and
it looks like nr_segments == 2 is the smallest legal value. IOW, any
attempt to readv/writev with 1 segment would fail with -EINVAL already.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 sound/core/pcm_native.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
index 925d96343148..05913d68923a 100644
--- a/sound/core/pcm_native.c
+++ b/sound/core/pcm_native.c
@@ -3516,23 +3516,28 @@ static void __user **snd_map_bufs(struct snd_pcm_runtime *runtime,
 				  struct iov_iter *iter,
 				  snd_pcm_uframes_t *frames, int max_segs)
 {
+	int nr_segs = iovec_nr_user_vecs(iter);
 	void __user **bufs;
+	struct iovec iov;
 	unsigned long i;
 
 	if (!iter->user_backed)
 		return ERR_PTR(-EFAULT);
-	if (!iter->nr_segs)
+	if (!nr_segs)
 		return ERR_PTR(-EINVAL);
-	if (iter->nr_segs > max_segs || iter->nr_segs != runtime->channels)
+	if (nr_segs > max_segs || nr_segs != runtime->channels)
 		return ERR_PTR(-EINVAL);
-	if (!frame_aligned(runtime, iter->iov->iov_len))
+	iov = iov_iter_iovec(iter);
+	if (!frame_aligned(runtime, iov.iov_len))
 		return ERR_PTR(-EINVAL);
-	bufs = kmalloc_array(iter->nr_segs, sizeof(void *), GFP_KERNEL);
+	bufs = kmalloc_array(nr_segs, sizeof(void *), GFP_KERNEL);
 	if (bufs == NULL)
 		return ERR_PTR(-ENOMEM);
-	for (i = 0; i < iter->nr_segs; ++i)
+	bufs[0] = iov.iov_base;
+	/* we know it's an ITER_IOVEC is nr_segs > 1 */
+	for (i = 1; i < nr_segs; ++i)
 		bufs[i] = iter->iov[i].iov_base;
-	*frames = bytes_to_samples(runtime, iter->iov->iov_len);
+	*frames = bytes_to_samples(runtime, iov.iov_len);
 	return bufs;
 }
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 5/8] IB/hfi1: make hfi1_write_iter() deal with ITER_UBUF iov_iter
  2023-03-28 17:36 [PATCHSET v4 0/8] Turn single segment imports into ITER_UBUF Jens Axboe
                   ` (3 preceding siblings ...)
  2023-03-28 17:36 ` [PATCH 4/8] snd: make snd_map_bufs() deal with ITER_UBUF Jens Axboe
@ 2023-03-28 17:36 ` Jens Axboe
  2023-03-28 18:43   ` Linus Torvalds
  2023-03-28 17:36 ` [PATCH 6/8] IB/qib: make qib_write_iter() " Jens Axboe
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 27+ messages in thread
From: Jens Axboe @ 2023-03-28 17:36 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: torvalds, brauner, viro, Jens Axboe

Don't assume that a user backed iterator is always of the type
ITER_IOVEC. Handle the single segment case separately, then we can
use the same logic for ITER_UBUF and ITER_IOVEC.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 drivers/infiniband/hw/hfi1/file_ops.c | 42 ++++++++++++++++-----------
 1 file changed, 25 insertions(+), 17 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/file_ops.c b/drivers/infiniband/hw/hfi1/file_ops.c
index b1d6ca7e9708..f52f57c30429 100644
--- a/drivers/infiniband/hw/hfi1/file_ops.c
+++ b/drivers/infiniband/hw/hfi1/file_ops.c
@@ -262,11 +262,17 @@ static ssize_t hfi1_write_iter(struct kiocb *kiocb, struct iov_iter *from)
 	struct hfi1_user_sdma_pkt_q *pq;
 	struct hfi1_user_sdma_comp_q *cq = fd->cq;
 	int done = 0, reqs = 0;
-	unsigned long dim = from->nr_segs;
+	unsigned long dim;
 	int idx;
 
 	if (!HFI1_CAP_IS_KSET(SDMA))
 		return -EINVAL;
+	if (!from->user_backed)
+		return -EFAULT;
+	dim = iovec_nr_user_vecs(from);
+	if (!dim)
+		return -EINVAL;
+
 	idx = srcu_read_lock(&fd->pq_srcu);
 	pq = srcu_dereference(fd->pq, &fd->pq_srcu);
 	if (!cq || !pq) {
@@ -274,11 +280,6 @@ static ssize_t hfi1_write_iter(struct kiocb *kiocb, struct iov_iter *from)
 		return -EIO;
 	}
 
-	if (!iter_is_iovec(from) || !dim) {
-		srcu_read_unlock(&fd->pq_srcu, idx);
-		return -EINVAL;
-	}
-
 	trace_hfi1_sdma_request(fd->dd, fd->uctxt->ctxt, fd->subctxt, dim);
 
 	if (atomic_read(&pq->n_reqs) == pq->n_max_reqs) {
@@ -286,20 +287,27 @@ static ssize_t hfi1_write_iter(struct kiocb *kiocb, struct iov_iter *from)
 		return -ENOSPC;
 	}
 
-	while (dim) {
-		int ret;
+	if (dim == 1) {
+		struct iovec iov = iov_iter_iovec(from);
 		unsigned long count = 0;
 
-		ret = hfi1_user_sdma_process_request(
-			fd, (struct iovec *)(from->iov + done),
-			dim, &count);
-		if (ret) {
-			reqs = ret;
-			break;
+		reqs = hfi1_user_sdma_process_request(fd, &iov, 1, &count);
+	} else {
+		while (dim) {
+			int ret;
+			unsigned long count = 0;
+
+			ret = hfi1_user_sdma_process_request(fd,
+					(struct iovec *)(from->iov + done),
+					dim, &count);
+			if (ret) {
+				reqs = ret;
+				break;
+			}
+			dim -= count;
+			done += count;
+			reqs++;
 		}
-		dim -= count;
-		done += count;
-		reqs++;
 	}
 
 	srcu_read_unlock(&fd->pq_srcu, idx);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 6/8] IB/qib: make qib_write_iter() deal with ITER_UBUF iov_iter
  2023-03-28 17:36 [PATCHSET v4 0/8] Turn single segment imports into ITER_UBUF Jens Axboe
                   ` (4 preceding siblings ...)
  2023-03-28 17:36 ` [PATCH 5/8] IB/hfi1: make hfi1_write_iter() deal with ITER_UBUF iov_iter Jens Axboe
@ 2023-03-28 17:36 ` Jens Axboe
  2023-03-28 17:36 ` [PATCH 7/8] iov_iter: convert import_single_range() to ITER_UBUF Jens Axboe
  2023-03-28 17:36 ` [PATCH 8/8] iov_iter: import single vector iovecs as ITER_UBUF Jens Axboe
  7 siblings, 0 replies; 27+ messages in thread
From: Jens Axboe @ 2023-03-28 17:36 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: torvalds, brauner, viro, Jens Axboe

Don't assume that a user backed iterator is always of the type
ITER_IOVEC. Handle the single segment case separately, then we can
use the same logic for ITER_UBUF and ITER_IOVEC.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 drivers/infiniband/hw/qib/qib_file_ops.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/qib/qib_file_ops.c b/drivers/infiniband/hw/qib/qib_file_ops.c
index 80fe92a21f96..577d972ba048 100644
--- a/drivers/infiniband/hw/qib/qib_file_ops.c
+++ b/drivers/infiniband/hw/qib/qib_file_ops.c
@@ -2244,10 +2244,18 @@ static ssize_t qib_write_iter(struct kiocb *iocb, struct iov_iter *from)
 	struct qib_filedata *fp = iocb->ki_filp->private_data;
 	struct qib_ctxtdata *rcd = ctxt_fp(iocb->ki_filp);
 	struct qib_user_sdma_queue *pq = fp->pq;
+	int nr_segs = iovec_nr_user_vecs(from);
 
-	if (!iter_is_iovec(from) || !from->nr_segs || !pq)
+	if (!from->user_backed)
+		return -EFAULT;
+	if (!nr_segs || !pq)
 		return -EINVAL;
 
+	if (nr_segs == 1) {
+		struct iovec iov = iov_iter_iovec(from);
+		return qib_user_sdma_writev(rcd, pq, &iov, 1);
+	}
+
 	return qib_user_sdma_writev(rcd, pq, from->iov, from->nr_segs);
 }
 
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 7/8] iov_iter: convert import_single_range() to ITER_UBUF
  2023-03-28 17:36 [PATCHSET v4 0/8] Turn single segment imports into ITER_UBUF Jens Axboe
                   ` (5 preceding siblings ...)
  2023-03-28 17:36 ` [PATCH 6/8] IB/qib: make qib_write_iter() " Jens Axboe
@ 2023-03-28 17:36 ` Jens Axboe
  2023-03-28 17:36 ` [PATCH 8/8] iov_iter: import single vector iovecs as ITER_UBUF Jens Axboe
  7 siblings, 0 replies; 27+ messages in thread
From: Jens Axboe @ 2023-03-28 17:36 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: torvalds, brauner, viro, Jens Axboe

Since we're just importing a single vector, we don't have to turn it
into an ITER_IOVEC. Instead turn it into an ITER_UBUF, which is cheaper
to iterate.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 lib/iov_iter.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 274014e4eafe..fc82cc42ffe6 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1866,9 +1866,7 @@ int import_single_range(int rw, void __user *buf, size_t len,
 	if (unlikely(!access_ok(buf, len)))
 		return -EFAULT;
 
-	iov->iov_base = buf;
-	iov->iov_len = len;
-	iov_iter_init(i, rw, iov, 1, len);
+	iov_iter_ubuf(i, rw, buf, len);
 	return 0;
 }
 EXPORT_SYMBOL(import_single_range);
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 8/8] iov_iter: import single vector iovecs as ITER_UBUF
  2023-03-28 17:36 [PATCHSET v4 0/8] Turn single segment imports into ITER_UBUF Jens Axboe
                   ` (6 preceding siblings ...)
  2023-03-28 17:36 ` [PATCH 7/8] iov_iter: convert import_single_range() to ITER_UBUF Jens Axboe
@ 2023-03-28 17:36 ` Jens Axboe
  7 siblings, 0 replies; 27+ messages in thread
From: Jens Axboe @ 2023-03-28 17:36 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: torvalds, brauner, viro, Jens Axboe

Add a special case to __import_iovec(), which imports a single segment
iovec as an ITER_UBUF rather than an ITER_IOVEC. ITER_UBUF is cheaper
to iterate than ITER_IOVEC, and for a single segment iovec, there's no
point in using a segmented iterator.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 lib/iov_iter.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index fc82cc42ffe6..63cf9997bd50 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1780,6 +1780,30 @@ struct iovec *iovec_from_user(const struct iovec __user *uvec,
 	return iov;
 }
 
+/*
+ * Single segment iovec supplied by the user, import it as ITER_UBUF.
+ */
+static ssize_t __import_iovec_ubuf(int type, const struct iovec __user *uvec,
+				   struct iovec **iovp, struct iov_iter *i,
+				   bool compat)
+{
+	struct iovec *iov = *iovp;
+	ssize_t ret;
+
+	if (compat)
+		ret = copy_compat_iovec_from_user(iov, uvec, 1);
+	else
+		ret = copy_iovec_from_user(iov, uvec, 1);
+	if (unlikely(ret))
+		return ret;
+
+	ret = import_ubuf(type, iov->iov_base, iov->iov_len, i);
+	if (unlikely(ret))
+		return ret;
+	*iovp = NULL;
+	return i->count;
+}
+
 ssize_t __import_iovec(int type, const struct iovec __user *uvec,
 		 unsigned nr_segs, unsigned fast_segs, struct iovec **iovp,
 		 struct iov_iter *i, bool compat)
@@ -1788,6 +1812,9 @@ ssize_t __import_iovec(int type, const struct iovec __user *uvec,
 	unsigned long seg;
 	struct iovec *iov;
 
+	if (nr_segs == 1)
+		return __import_iovec_ubuf(type, uvec, iovp, i, compat);
+
 	iov = iovec_from_user(uvec, nr_segs, fast_segs, *iovp, compat);
 	if (IS_ERR(iov)) {
 		*iovp = NULL;
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH 4/8] snd: make snd_map_bufs() deal with ITER_UBUF
  2023-03-28 17:36 ` [PATCH 4/8] snd: make snd_map_bufs() deal with ITER_UBUF Jens Axboe
@ 2023-03-28 17:50   ` Linus Torvalds
  2023-03-28 17:52     ` Jens Axboe
  0 siblings, 1 reply; 27+ messages in thread
From: Linus Torvalds @ 2023-03-28 17:50 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-fsdevel, brauner, viro

On Tue, Mar 28, 2023 at 10:36 AM Jens Axboe <axboe@kernel.dk> wrote:
>
> @@ -3516,23 +3516,28 @@ static void __user **snd_map_bufs(struct snd_pcm_runtime *runtime,
>                                   struct iov_iter *iter,
>                                   snd_pcm_uframes_t *frames, int max_segs)
>  {
> +       int nr_segs = iovec_nr_user_vecs(iter);

This has a WARN_ON_ONCE() for !user_backed, but then..

>         void __user **bufs;
> +       struct iovec iov;
>         unsigned long i;
>
>         if (!iter->user_backed)
>                 return ERR_PTR(-EFAULT);

here the code tries to deal with it.

So I think the two should probably be switched around.

                 Linus

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 4/8] snd: make snd_map_bufs() deal with ITER_UBUF
  2023-03-28 17:50   ` Linus Torvalds
@ 2023-03-28 17:52     ` Jens Axboe
  2023-03-28 18:52       ` Al Viro
  0 siblings, 1 reply; 27+ messages in thread
From: Jens Axboe @ 2023-03-28 17:52 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-fsdevel, brauner, viro

On 3/28/23 11:50 AM, Linus Torvalds wrote:
> On Tue, Mar 28, 2023 at 10:36 AM Jens Axboe <axboe@kernel.dk> wrote:
>>
>> @@ -3516,23 +3516,28 @@ static void __user **snd_map_bufs(struct snd_pcm_runtime *runtime,
>>                                   struct iov_iter *iter,
>>                                   snd_pcm_uframes_t *frames, int max_segs)
>>  {
>> +       int nr_segs = iovec_nr_user_vecs(iter);
> 
> This has a WARN_ON_ONCE() for !user_backed, but then..
> 
>>         void __user **bufs;
>> +       struct iovec iov;
>>         unsigned long i;
>>
>>         if (!iter->user_backed)
>>                 return ERR_PTR(-EFAULT);
> 
> here the code tries to deal with it.
> 
> So I think the two should probably be switched around.

True, it was actually like that before I refactored it to include
that common helper. I'll swap them around, thanks.

-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/8] iov_iter: add iovec_nr_user_vecs() helper
  2023-03-28 17:36 ` [PATCH 2/8] iov_iter: add iovec_nr_user_vecs() helper Jens Axboe
@ 2023-03-28 18:42   ` Al Viro
  2023-03-28 18:45     ` Linus Torvalds
  2023-03-28 19:27     ` Jens Axboe
  0 siblings, 2 replies; 27+ messages in thread
From: Al Viro @ 2023-03-28 18:42 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-fsdevel, torvalds, brauner

On Tue, Mar 28, 2023 at 11:36:07AM -0600, Jens Axboe wrote:
> This returns the number of user segments in an iov_iter. The input can
> either be an ITER_IOVEC, where it'll return the number of iovecs. Or it
> can be an ITER_UBUF, in which case the number of segments is always 1.
> 
> Outside of those two, no user backed iterators exist. Just return 0 for
> those.

Umm...  Why not set ->nr_segs to 1 in iov_iter_ubuf() instead?  Note that
it won't be more costly; that part of struct iov_iter (8 bytes at offset 40
on amd64) is *not* left uninitialized - zero gets stored there.  That way
you'll get constant 1 stored there, which is just as cheap...

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 5/8] IB/hfi1: make hfi1_write_iter() deal with ITER_UBUF iov_iter
  2023-03-28 17:36 ` [PATCH 5/8] IB/hfi1: make hfi1_write_iter() deal with ITER_UBUF iov_iter Jens Axboe
@ 2023-03-28 18:43   ` Linus Torvalds
  2023-03-28 18:55     ` Matthew Wilcox
                       ` (2 more replies)
  0 siblings, 3 replies; 27+ messages in thread
From: Linus Torvalds @ 2023-03-28 18:43 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-fsdevel, brauner, viro

On Tue, Mar 28, 2023 at 10:36 AM Jens Axboe <axboe@kernel.dk> wrote:
>
> Don't assume that a user backed iterator is always of the type
> ITER_IOVEC. Handle the single segment case separately, then we can
> use the same logic for ITER_UBUF and ITER_IOVEC.

Ugh. This is ugly.

Yes,. the original code is ugly too, but this makes it worse.

You have that helper for "give me the number of iovecs" and that just
works automatically with the ITER_UBUF case. But this code (and the
sound driver code in the previous patch), really lso wants a helper to
just return the 'iov' array.

And I think you should just do exactly that. The problem with
'iov_iter_iovec()' is that it doesn't return the array, it just
returns the first entry, so it's unusable for this case, and then you
have all these special "do something else for the single-entry
situation" cases.

And iov_iter_iovec() actually tries to be nice and clever and add the
iov_offset, so that you can actually do the proper iov_iter_advance()
on it etc, but again, this is not what any of this code wants, it just
wants the raw iov array, and the base will always be zero, because
this code just doesn't *work* on the iter level, and never advances
the iterator, it just advances the array index.

And the thing is, I think you could easily just add a

   const struct iovec *iov_iter_iovec_array(iter);

helper that just always returns a valid array of iov's.

For a ITER_IOV, it would just return the raw iov pointer.

And for a ITER_UBUF, we could either

 (a) just always pass in a single-entry auto iov that gets filled in
and the pointer to it returned

 (b) be *really* clever (or ugly, depending on how you want to see
it), and do something like this:

        --- a/include/linux/uio.h
        +++ b/include/linux/uio.h
        @@ -49,14 +49,23 @@ struct iov_iter {
                        size_t iov_offset;
                        int last_offset;
                };
        -       size_t count;
        -       union {
        -               const struct iovec *iov;
        -               const struct kvec *kvec;
        -               const struct bio_vec *bvec;
        -               struct xarray *xarray;
        -               struct pipe_inode_info *pipe;
        -               void __user *ubuf;
        +
        +       /*
        +        * This has the same layout as 'struct iovec'!
        +        * In particular, the ITER_UBUF form can create
        +        * a single-entry 'struct iovec' by casting the
        +        * address of the 'ubuf' member to that.
        +        */
        +       struct {
        +               union {
        +                       const struct iovec *iov;
        +                       const struct kvec *kvec;
        +                       const struct bio_vec *bvec;
        +                       struct xarray *xarray;
        +                       struct pipe_inode_info *pipe;
        +                       void __user *ubuf;
        +               };
        +               size_t count;
                };
                union {
                        unsigned long nr_segs;

and if you accept the above, then you can do

   #define iter_ubuf_to_iov(iter) ((const struct iovec *)&(iter)->ubuf)

which I will admit is not *pretty*, but it's kind of clever, I think.

So now you can trivially turn a user-backed iov_iter into the related
'struct iovec *' by just doing

   #define iov_iter_iovec_array(iter) \
     ((iter)->type == ITER_UBUF ? iter_ubuf_to_iov(iter) : (iter)->iov)

or something like that.

And no, the above is NOT AT ALL TESTED. Caveat emptor.

And if you go blind from looking at that patch, I will not accept
responsibility.

              Linus

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/8] iov_iter: add iovec_nr_user_vecs() helper
  2023-03-28 18:42   ` Al Viro
@ 2023-03-28 18:45     ` Linus Torvalds
  2023-03-28 19:27     ` Jens Axboe
  1 sibling, 0 replies; 27+ messages in thread
From: Linus Torvalds @ 2023-03-28 18:45 UTC (permalink / raw)
  To: Al Viro; +Cc: Jens Axboe, linux-fsdevel, brauner

On Tue, Mar 28, 2023 at 11:42 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> Umm...  Why not set ->nr_segs to 1 in iov_iter_ubuf() instead?  Note that
> it won't be more costly; that part of struct iov_iter (8 bytes at offset 40
> on amd64) is *not* left uninitialized - zero gets stored there.  That way
> you'll get constant 1 stored there, which is just as cheap...

Ack. And with my suggestion to embed a 'struct iov' in the 'struct
iov_iter' for the ITER_UBUF case (see previous email, try not to
barf), we really end up with a pretty cheap "ITER_UBUF can be used
almost as-is for any ITER_IOV case".

                 Linus

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 4/8] snd: make snd_map_bufs() deal with ITER_UBUF
  2023-03-28 17:52     ` Jens Axboe
@ 2023-03-28 18:52       ` Al Viro
  2023-03-28 19:28         ` Jens Axboe
  0 siblings, 1 reply; 27+ messages in thread
From: Al Viro @ 2023-03-28 18:52 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Linus Torvalds, linux-fsdevel, brauner, Takashi Iwai

On Tue, Mar 28, 2023 at 11:52:10AM -0600, Jens Axboe wrote:
> On 3/28/23 11:50 AM, Linus Torvalds wrote:
> > On Tue, Mar 28, 2023 at 10:36 AM Jens Axboe <axboe@kernel.dk> wrote:
> >>
> >> @@ -3516,23 +3516,28 @@ static void __user **snd_map_bufs(struct snd_pcm_runtime *runtime,
> >>                                   struct iov_iter *iter,
> >>                                   snd_pcm_uframes_t *frames, int max_segs)
> >>  {
> >> +       int nr_segs = iovec_nr_user_vecs(iter);
> > 
> > This has a WARN_ON_ONCE() for !user_backed, but then..
> > 
> >>         void __user **bufs;
> >> +       struct iovec iov;
> >>         unsigned long i;
> >>
> >>         if (!iter->user_backed)
> >>                 return ERR_PTR(-EFAULT);
> > 
> > here the code tries to deal with it.
> > 
> > So I think the two should probably be switched around.
> 
> True, it was actually like that before I refactored it to include
> that common helper. I'll swap them around, thanks.

Umm...  That looks really weird - if nothing else, it seems that this
thing quietly ignores the ->iov_len on all but the first iovec.

Might make sense to ask ALSA folks what the hell is going on there;
it's readv()/writev() on pcm device, and it looks like userland ABI
is really perverted here... ;-/

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 5/8] IB/hfi1: make hfi1_write_iter() deal with ITER_UBUF iov_iter
  2023-03-28 18:43   ` Linus Torvalds
@ 2023-03-28 18:55     ` Matthew Wilcox
  2023-03-28 19:05       ` Linus Torvalds
  2023-03-28 19:30     ` Jens Axboe
  2023-03-28 20:38     ` Al Viro
  2 siblings, 1 reply; 27+ messages in thread
From: Matthew Wilcox @ 2023-03-28 18:55 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jens Axboe, linux-fsdevel, brauner, viro

On Tue, Mar 28, 2023 at 11:43:34AM -0700, Linus Torvalds wrote:
>         -       size_t count;
>         -       union {
>         -               const struct iovec *iov;
>         -               const struct kvec *kvec;
>         -               const struct bio_vec *bvec;
>         -               struct xarray *xarray;
>         -               struct pipe_inode_info *pipe;
>         -               void __user *ubuf;
>         +
>         +       /*
>         +        * This has the same layout as 'struct iovec'!
>         +        * In particular, the ITER_UBUF form can create
>         +        * a single-entry 'struct iovec' by casting the
>         +        * address of the 'ubuf' member to that.
>         +        */
>         +       struct {
>         +               union {
>         +                       const struct iovec *iov;
>         +                       const struct kvec *kvec;
>         +                       const struct bio_vec *bvec;
>         +                       struct xarray *xarray;
>         +                       struct pipe_inode_info *pipe;
>         +                       void __user *ubuf;
>         +               };
>         +               size_t count;
>                 };
>                 union {
>                         unsigned long nr_segs;
> 
> and if you accept the above, then you can do
> 
>    #define iter_ubuf_to_iov(iter) ((const struct iovec *)&(iter)->ubuf)
> 
> which I will admit is not *pretty*, but it's kind of clever, I think.

I think it'll annoy gcc, and particularly the randstruct plugin.
How about:

	union {
		struct iovec ubuf;
		struct {
			const struct iovec *iov;
			size_t count; /* Also valid for subsequent types */
		};
		const struct kvec *kvec;
		const struct bio_vec *bvec;
		struct xarray *xarray;
		struct pipe_inode_info *pipe;
	}


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 5/8] IB/hfi1: make hfi1_write_iter() deal with ITER_UBUF iov_iter
  2023-03-28 18:55     ` Matthew Wilcox
@ 2023-03-28 19:05       ` Linus Torvalds
  2023-03-28 19:16         ` Linus Torvalds
  0 siblings, 1 reply; 27+ messages in thread
From: Linus Torvalds @ 2023-03-28 19:05 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Jens Axboe, linux-fsdevel, brauner, viro

On Tue, Mar 28, 2023 at 11:55 AM Matthew Wilcox <willy@infradead.org> wrote:
>
> I think it'll annoy gcc, and particularly the randstruct plugin.

No, randstruct doesn't go change any normal data structure layout on its own.

You have to actively mark things for randstruct, or they have to be
pure function pointer ones.

But it's not like adding a 'struct iovec' explicitly to the members
just as extra "code documentation" would be wrong.

I don't think it really helps, though, since you have to have that
other explicit structure there anyway to get the member names right.

So I don't hate your version, but I don't think it really helps either.

             Linus

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 5/8] IB/hfi1: make hfi1_write_iter() deal with ITER_UBUF iov_iter
  2023-03-28 19:05       ` Linus Torvalds
@ 2023-03-28 19:16         ` Linus Torvalds
  2023-03-28 21:21           ` Jens Axboe
  0 siblings, 1 reply; 27+ messages in thread
From: Linus Torvalds @ 2023-03-28 19:16 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Jens Axboe, linux-fsdevel, brauner, viro

On Tue, Mar 28, 2023 at 12:05 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> But it's not like adding a 'struct iovec' explicitly to the members
> just as extra "code documentation" would be wrong.
>
> I don't think it really helps, though, since you have to have that
> other explicit structure there anyway to get the member names right.

Actually, thinking a bit more about it, adding a

    const struct iovec xyzzy;

member might be a good idea just to avoid a cast. Then that
iter_ubuf_to_iov() macro becomes just

   #define iter_ubuf_to_iov(iter) (&(iter)->xyzzy)

and that looks much nicer (plus still acts kind of as a "code comment"
to clarify things).

                Linus

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/8] iov_iter: add iovec_nr_user_vecs() helper
  2023-03-28 18:42   ` Al Viro
  2023-03-28 18:45     ` Linus Torvalds
@ 2023-03-28 19:27     ` Jens Axboe
  1 sibling, 0 replies; 27+ messages in thread
From: Jens Axboe @ 2023-03-28 19:27 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-fsdevel, torvalds, brauner

On 3/28/23 12:42 PM, Al Viro wrote:
> On Tue, Mar 28, 2023 at 11:36:07AM -0600, Jens Axboe wrote:
>> This returns the number of user segments in an iov_iter. The input can
>> either be an ITER_IOVEC, where it'll return the number of iovecs. Or it
>> can be an ITER_UBUF, in which case the number of segments is always 1.
>>
>> Outside of those two, no user backed iterators exist. Just return 0 for
>> those.
> 
> Umm...  Why not set ->nr_segs to 1 in iov_iter_ubuf() instead?  Note that
> it won't be more costly; that part of struct iov_iter (8 bytes at offset 40
> on amd64) is *not* left uninitialized - zero gets stored there.  That way
> you'll get constant 1 stored there, which is just as cheap...

Good point, let's have a prep patch that does that too.

-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 4/8] snd: make snd_map_bufs() deal with ITER_UBUF
  2023-03-28 18:52       ` Al Viro
@ 2023-03-28 19:28         ` Jens Axboe
  0 siblings, 0 replies; 27+ messages in thread
From: Jens Axboe @ 2023-03-28 19:28 UTC (permalink / raw)
  To: Al Viro; +Cc: Linus Torvalds, linux-fsdevel, brauner, Takashi Iwai

On 3/28/23 12:52 PM, Al Viro wrote:
> On Tue, Mar 28, 2023 at 11:52:10AM -0600, Jens Axboe wrote:
>> On 3/28/23 11:50 AM, Linus Torvalds wrote:
>>> On Tue, Mar 28, 2023 at 10:36 AM Jens Axboe <axboe@kernel.dk> wrote:
>>>>
>>>> @@ -3516,23 +3516,28 @@ static void __user **snd_map_bufs(struct snd_pcm_runtime *runtime,
>>>>                                   struct iov_iter *iter,
>>>>                                   snd_pcm_uframes_t *frames, int max_segs)
>>>>  {
>>>> +       int nr_segs = iovec_nr_user_vecs(iter);
>>>
>>> This has a WARN_ON_ONCE() for !user_backed, but then..
>>>
>>>>         void __user **bufs;
>>>> +       struct iovec iov;
>>>>         unsigned long i;
>>>>
>>>>         if (!iter->user_backed)
>>>>                 return ERR_PTR(-EFAULT);
>>>
>>> here the code tries to deal with it.
>>>
>>> So I think the two should probably be switched around.
>>
>> True, it was actually like that before I refactored it to include
>> that common helper. I'll swap them around, thanks.
> 
> Umm...  That looks really weird - if nothing else, it seems that this
> thing quietly ignores the ->iov_len on all but the first iovec.

I agree, but this is how it currently works...

> Might make sense to ask ALSA folks what the hell is going on there;
> it's readv()/writev() on pcm device, and it looks like userland ABI
> is really perverted here... ;-/

I have sent them email separately to confirm that the only cases
that makes sense here is nr_segs >= 2. But the ABI is what it is,
however horrible it may be :/

-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 5/8] IB/hfi1: make hfi1_write_iter() deal with ITER_UBUF iov_iter
  2023-03-28 18:43   ` Linus Torvalds
  2023-03-28 18:55     ` Matthew Wilcox
@ 2023-03-28 19:30     ` Jens Axboe
  2023-03-28 20:38     ` Al Viro
  2 siblings, 0 replies; 27+ messages in thread
From: Jens Axboe @ 2023-03-28 19:30 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: linux-fsdevel, brauner, viro

On 3/28/23 12:43?PM, Linus Torvalds wrote:
> On Tue, Mar 28, 2023 at 10:36?AM Jens Axboe <axboe@kernel.dk> wrote:
>>
>> Don't assume that a user backed iterator is always of the type
>> ITER_IOVEC. Handle the single segment case separately, then we can
>> use the same logic for ITER_UBUF and ITER_IOVEC.
> 
> Ugh. This is ugly.
> 
> Yes,. the original code is ugly too, but this makes it worse.

Hah I know, I did feel dirty writing that patch... The existing code is
pretty ugly as-is, but it sure didn't get better.

> You have that helper for "give me the number of iovecs" and that just
> works automatically with the ITER_UBUF case. But this code (and the
> sound driver code in the previous patch), really lso wants a helper to
> just return the 'iov' array.
> 
> And I think you should just do exactly that. The problem with
> 'iov_iter_iovec()' is that it doesn't return the array, it just
> returns the first entry, so it's unusable for this case, and then you
> have all these special "do something else for the single-entry
> situation" cases.
> 
> And iov_iter_iovec() actually tries to be nice and clever and add the
> iov_offset, so that you can actually do the proper iov_iter_advance()
> on it etc, but again, this is not what any of this code wants, it just
> wants the raw iov array, and the base will always be zero, because
> this code just doesn't *work* on the iter level, and never advances
> the iterator, it just advances the array index.
> 
> And the thing is, I think you could easily just add a
> 
>    const struct iovec *iov_iter_iovec_array(iter);
> 
> helper that just always returns a valid array of iov's.
> 
> For a ITER_IOV, it would just return the raw iov pointer.
> 
> And for a ITER_UBUF, we could either
> 
>  (a) just always pass in a single-entry auto iov that gets filled in
> and the pointer to it returned
> 
>  (b) be *really* clever (or ugly, depending on how you want to see
> it), and do something like this:
> 
>         --- a/include/linux/uio.h
>         +++ b/include/linux/uio.h
>         @@ -49,14 +49,23 @@ struct iov_iter {
>                         size_t iov_offset;
>                         int last_offset;
>                 };
>         -       size_t count;
>         -       union {
>         -               const struct iovec *iov;
>         -               const struct kvec *kvec;
>         -               const struct bio_vec *bvec;
>         -               struct xarray *xarray;
>         -               struct pipe_inode_info *pipe;
>         -               void __user *ubuf;
>         +
>         +       /*
>         +        * This has the same layout as 'struct iovec'!
>         +        * In particular, the ITER_UBUF form can create
>         +        * a single-entry 'struct iovec' by casting the
>         +        * address of the 'ubuf' member to that.
>         +        */
>         +       struct {
>         +               union {
>         +                       const struct iovec *iov;
>         +                       const struct kvec *kvec;
>         +                       const struct bio_vec *bvec;
>         +                       struct xarray *xarray;
>         +                       struct pipe_inode_info *pipe;
>         +                       void __user *ubuf;
>         +               };
>         +               size_t count;
>                 };
>                 union {
>                         unsigned long nr_segs;
> 
> and if you accept the above, then you can do
> 
>    #define iter_ubuf_to_iov(iter) ((const struct iovec *)&(iter)->ubuf)
> 
> which I will admit is not *pretty*, but it's kind of clever, I think.
> 
> So now you can trivially turn a user-backed iov_iter into the related
> 'struct iovec *' by just doing
> 
>    #define iov_iter_iovec_array(iter) \
>      ((iter)->type == ITER_UBUF ? iter_ubuf_to_iov(iter) : (iter)->iov)
> 
> or something like that.
> 
> And no, the above is NOT AT ALL TESTED. Caveat emptor.
> 
> And if you go blind from looking at that patch, I will not accept
> responsibility.

I pondered something like that too, but balked at adding to iov_iter and
then didn't pursue that any further.

But bundled nicely, it should work out quite fine in the union. So I
like the suggestion, and then just return a pointer to the vec rather
than the copy, unifying the two cases.

Thanks for the review and suggestion, I'll make that change.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 5/8] IB/hfi1: make hfi1_write_iter() deal with ITER_UBUF iov_iter
  2023-03-28 18:43   ` Linus Torvalds
  2023-03-28 18:55     ` Matthew Wilcox
  2023-03-28 19:30     ` Jens Axboe
@ 2023-03-28 20:38     ` Al Viro
  2023-03-28 20:46       ` Jens Axboe
  2023-03-28 22:06       ` Linus Torvalds
  2 siblings, 2 replies; 27+ messages in thread
From: Al Viro @ 2023-03-28 20:38 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jens Axboe, linux-fsdevel, brauner

On Tue, Mar 28, 2023 at 11:43:34AM -0700, Linus Torvalds wrote:

> And if you go blind from looking at that patch, I will not accept
> responsibility.

... and all that, er, cleverness - for the sake of a single piece of shit
driver for piece of shit hardware *and* equally beautiful ABI.

Is it really worth bothering with?  And if anyone has doubts about the
inturdprize kwality of the ABI in question, I suggest taking a look at
hfi1_user_sdma_process_request() - that's where the horrors are.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 5/8] IB/hfi1: make hfi1_write_iter() deal with ITER_UBUF iov_iter
  2023-03-28 20:38     ` Al Viro
@ 2023-03-28 20:46       ` Jens Axboe
  2023-03-28 22:06       ` Linus Torvalds
  1 sibling, 0 replies; 27+ messages in thread
From: Jens Axboe @ 2023-03-28 20:46 UTC (permalink / raw)
  To: Al Viro, Linus Torvalds; +Cc: linux-fsdevel, brauner

On 3/28/23 2:38 PM, Al Viro wrote:
> On Tue, Mar 28, 2023 at 11:43:34AM -0700, Linus Torvalds wrote:
> 
>> And if you go blind from looking at that patch, I will not accept
>> responsibility.
> 
> ... and all that, er, cleverness - for the sake of a single piece of shit
> driver for piece of shit hardware *and* equally beautiful ABI.
> 
> Is it really worth bothering with?  And if anyone has doubts about the
> inturdprize kwality of the ABI in question, I suggest taking a look at
> hfi1_user_sdma_process_request() - that's where the horrors are.

It is horrible code, I only go as far as that very function before
deciding that I wasn't going to be touching it as part of this.

I do like the cleverness of the overlay, the only practical downside
I see are things that _assign_ to eg iter->iovec after setting it
up. That would lead to some, ehm, interesting side effects.

-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 5/8] IB/hfi1: make hfi1_write_iter() deal with ITER_UBUF iov_iter
  2023-03-28 19:16         ` Linus Torvalds
@ 2023-03-28 21:21           ` Jens Axboe
  2023-03-28 21:38             ` Jens Axboe
  0 siblings, 1 reply; 27+ messages in thread
From: Jens Axboe @ 2023-03-28 21:21 UTC (permalink / raw)
  To: Linus Torvalds, Matthew Wilcox; +Cc: linux-fsdevel, brauner, viro

On 3/28/23 1:16?PM, Linus Torvalds wrote:
> On Tue, Mar 28, 2023 at 12:05?PM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>>
>> But it's not like adding a 'struct iovec' explicitly to the members
>> just as extra "code documentation" would be wrong.
>>
>> I don't think it really helps, though, since you have to have that
>> other explicit structure there anyway to get the member names right.
> 
> Actually, thinking a bit more about it, adding a
> 
>     const struct iovec xyzzy;
> 
> member might be a good idea just to avoid a cast. Then that
> iter_ubuf_to_iov() macro becomes just
> 
>    #define iter_ubuf_to_iov(iter) (&(iter)->xyzzy)
> 
> and that looks much nicer (plus still acts kind of as a "code comment"
> to clarify things).

I went down this path, and it _mostly_ worked out. You can view the
series here, I'll send it out when I've actually tested it:

https://git.kernel.dk/cgit/linux-block/log/?h=iter-ubuf

A few mental notes I made along the way:

- The IB/sound changes are now just replacing an inappropriate
  iter_is_iovec() with iter->user_backed. That's nice and simple.

- The iov_iter_iovec() case becomes a bit simpler. Or so I thought,
  because we still need to add in the offset so we can't just use
  out embedded iovec for that. The above branch is just using the
  iovec, but I don't think this is right.

- Looks like it exposed a block bug, where the copy in
  bio_alloc_map_data() was obvious garbage but happened to work
  before.

I'm still inclined to favor this approach over the previous, even if the
IB driver is a pile of garbage and lighting it a bit more on fire would
not really hurt.

Opinions? Or do you want me to just send it out for easier reading


-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 5/8] IB/hfi1: make hfi1_write_iter() deal with ITER_UBUF iov_iter
  2023-03-28 21:21           ` Jens Axboe
@ 2023-03-28 21:38             ` Jens Axboe
  2023-03-28 21:51               ` Jens Axboe
  0 siblings, 1 reply; 27+ messages in thread
From: Jens Axboe @ 2023-03-28 21:38 UTC (permalink / raw)
  To: Linus Torvalds, Matthew Wilcox; +Cc: linux-fsdevel, brauner, viro

On 3/28/23 3:21?PM, Jens Axboe wrote:
> On 3/28/23 1:16?PM, Linus Torvalds wrote:
>> On Tue, Mar 28, 2023 at 12:05?PM Linus Torvalds
>> <torvalds@linux-foundation.org> wrote:
>>>
>>> But it's not like adding a 'struct iovec' explicitly to the members
>>> just as extra "code documentation" would be wrong.
>>>
>>> I don't think it really helps, though, since you have to have that
>>> other explicit structure there anyway to get the member names right.
>>
>> Actually, thinking a bit more about it, adding a
>>
>>     const struct iovec xyzzy;
>>
>> member might be a good idea just to avoid a cast. Then that
>> iter_ubuf_to_iov() macro becomes just
>>
>>    #define iter_ubuf_to_iov(iter) (&(iter)->xyzzy)
>>
>> and that looks much nicer (plus still acts kind of as a "code comment"
>> to clarify things).
> 
> I went down this path, and it _mostly_ worked out. You can view the
> series here, I'll send it out when I've actually tested it:
> 
> https://git.kernel.dk/cgit/linux-block/log/?h=iter-ubuf
> 
> A few mental notes I made along the way:
> 
> - The IB/sound changes are now just replacing an inappropriate
>   iter_is_iovec() with iter->user_backed. That's nice and simple.
> 
> - The iov_iter_iovec() case becomes a bit simpler. Or so I thought,
>   because we still need to add in the offset so we can't just use
>   out embedded iovec for that. The above branch is just using the
>   iovec, but I don't think this is right.
> 
> - Looks like it exposed a block bug, where the copy in
>   bio_alloc_map_data() was obvious garbage but happened to work
>   before.
> 
> I'm still inclined to favor this approach over the previous, even if the
> IB driver is a pile of garbage and lighting it a bit more on fire would
> not really hurt.
> 
> Opinions? Or do you want me to just send it out for easier reading

While cleaning up that stuff, we only have a few users of iov_iter_iovec().
Why don't we just kill them off and the helper too? That drops that
part of it and it kind of works out nicely beyond that.


diff --git a/fs/read_write.c b/fs/read_write.c
index 7a2ff6157eda..fb932d0997d4 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -749,15 +749,15 @@ static ssize_t do_loop_readv_writev(struct file *filp, struct iov_iter *iter,
 		return -EOPNOTSUPP;
 
 	while (iov_iter_count(iter)) {
-		struct iovec iovec = iov_iter_iovec(iter);
+		const struct iovec *iov = iter->iov;
 		ssize_t nr;
 
 		if (type == READ) {
-			nr = filp->f_op->read(filp, iovec.iov_base,
-					      iovec.iov_len, ppos);
+			nr = filp->f_op->read(filp, iov->iov_base,
+					      iov->iov_len, ppos);
 		} else {
-			nr = filp->f_op->write(filp, iovec.iov_base,
-					       iovec.iov_len, ppos);
+			nr = filp->f_op->write(filp, iov->iov_base,
+					       iov->iov_len, ppos);
 		}
 
 		if (nr < 0) {
@@ -766,7 +766,7 @@ static ssize_t do_loop_readv_writev(struct file *filp, struct iov_iter *iter,
 			break;
 		}
 		ret += nr;
-		if (nr != iovec.iov_len)
+		if (nr != iov->iov_len)
 			break;
 		iov_iter_advance(iter, nr);
 	}
diff --git a/io_uring/rw.c b/io_uring/rw.c
index 4c233910e200..585461a6f6a0 100644
--- a/io_uring/rw.c
+++ b/io_uring/rw.c
@@ -454,7 +454,8 @@ static ssize_t loop_rw_iter(int ddir, struct io_rw *rw, struct iov_iter *iter)
 			iovec.iov_base = iter->ubuf + iter->iov_offset;
 			iovec.iov_len = iov_iter_count(iter);
 		} else if (!iov_iter_is_bvec(iter)) {
-			iovec = iov_iter_iovec(iter);
+			iovec.iov_base = iter->iov->iov_base;
+			iovec.iov_len = iter->iov->iov_len;
 		} else {
 			iovec.iov_base = u64_to_user_ptr(rw->addr);
 			iovec.iov_len = rw->len;
diff --git a/mm/madvise.c b/mm/madvise.c
index 340125d08c03..0701a3bd530b 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -1456,7 +1456,8 @@ SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
 		size_t, vlen, int, behavior, unsigned int, flags)
 {
 	ssize_t ret;
-	struct iovec iovstack[UIO_FASTIOV], iovec;
+	struct iovec iovstack[UIO_FASTIOV];
+	const struct iovec *iovec;
 	struct iovec *iov = iovstack;
 	struct iov_iter iter;
 	struct task_struct *task;
@@ -1503,12 +1504,12 @@ SYSCALL_DEFINE5(process_madvise, int, pidfd, const struct iovec __user *, vec,
 	total_len = iov_iter_count(&iter);
 
 	while (iov_iter_count(&iter)) {
-		iovec = iov_iter_iovec(&iter);
-		ret = do_madvise(mm, (unsigned long)iovec.iov_base,
-					iovec.iov_len, behavior);
+		iovec = iter.iov;
+		ret = do_madvise(mm, (unsigned long)iovec->iov_base,
+					iovec->iov_len, behavior);
 		if (ret < 0)
 			break;
-		iov_iter_advance(&iter, iovec.iov_len);
+		iov_iter_advance(&iter, iovec->iov_len);
 	}
 
 	ret = (total_len - iov_iter_count(&iter)) ? : ret;

-- 
Jens Axboe


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH 5/8] IB/hfi1: make hfi1_write_iter() deal with ITER_UBUF iov_iter
  2023-03-28 21:38             ` Jens Axboe
@ 2023-03-28 21:51               ` Jens Axboe
  0 siblings, 0 replies; 27+ messages in thread
From: Jens Axboe @ 2023-03-28 21:51 UTC (permalink / raw)
  To: Linus Torvalds, Matthew Wilcox; +Cc: linux-fsdevel, brauner, viro

On 3/28/23 3:38 PM, Jens Axboe wrote:
> On 3/28/23 3:21?PM, Jens Axboe wrote:
>> On 3/28/23 1:16?PM, Linus Torvalds wrote:
>>> On Tue, Mar 28, 2023 at 12:05?PM Linus Torvalds
>>> <torvalds@linux-foundation.org> wrote:
>>>>
>>>> But it's not like adding a 'struct iovec' explicitly to the members
>>>> just as extra "code documentation" would be wrong.
>>>>
>>>> I don't think it really helps, though, since you have to have that
>>>> other explicit structure there anyway to get the member names right.
>>>
>>> Actually, thinking a bit more about it, adding a
>>>
>>>     const struct iovec xyzzy;
>>>
>>> member might be a good idea just to avoid a cast. Then that
>>> iter_ubuf_to_iov() macro becomes just
>>>
>>>    #define iter_ubuf_to_iov(iter) (&(iter)->xyzzy)
>>>
>>> and that looks much nicer (plus still acts kind of as a "code comment"
>>> to clarify things).
>>
>> I went down this path, and it _mostly_ worked out. You can view the
>> series here, I'll send it out when I've actually tested it:
>>
>> https://git.kernel.dk/cgit/linux-block/log/?h=iter-ubuf
>>
>> A few mental notes I made along the way:
>>
>> - The IB/sound changes are now just replacing an inappropriate
>>   iter_is_iovec() with iter->user_backed. That's nice and simple.
>>
>> - The iov_iter_iovec() case becomes a bit simpler. Or so I thought,
>>   because we still need to add in the offset so we can't just use
>>   out embedded iovec for that. The above branch is just using the
>>   iovec, but I don't think this is right.
>>
>> - Looks like it exposed a block bug, where the copy in
>>   bio_alloc_map_data() was obvious garbage but happened to work
>>   before.
>>
>> I'm still inclined to favor this approach over the previous, even if the
>> IB driver is a pile of garbage and lighting it a bit more on fire would
>> not really hurt.
>>
>> Opinions? Or do you want me to just send it out for easier reading
> 
> While cleaning up that stuff, we only have a few users of iov_iter_iovec().
> Why don't we just kill them off and the helper too? That drops that
> part of it and it kind of works out nicely beyond that.

Ugh that won't work obviously, as we can't factor in per-vec
offsets... So it has to be a copy.

-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 5/8] IB/hfi1: make hfi1_write_iter() deal with ITER_UBUF iov_iter
  2023-03-28 20:38     ` Al Viro
  2023-03-28 20:46       ` Jens Axboe
@ 2023-03-28 22:06       ` Linus Torvalds
  1 sibling, 0 replies; 27+ messages in thread
From: Linus Torvalds @ 2023-03-28 22:06 UTC (permalink / raw)
  To: Al Viro; +Cc: Jens Axboe, linux-fsdevel, brauner

On Tue, Mar 28, 2023 at 1:38 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> ... and all that, er, cleverness - for the sake of a single piece of shit
> driver for piece of shit hardware *and* equally beautiful ABI.

Now, I wish we didn't have any of those 'walk the iov{} array', but
sadly we do, and it's not just a single case.

It's also that pcm_native.c case, it's the qib rdma driver.

So if we didn't have people walking the iov[] array, I'd hate to add this.

But considering that we *do* have people walking the iov[] array, I'd
rather unify the two user-mode cases than have them do the whole "do
two different things for the ITER_UBUF vs ITER_IOV case".

> Is it really worth bothering with?  And if anyone has doubts about the
> inturdprize kwality of the ABI in question, I suggest taking a look at
> hfi1_user_sdma_process_request() - that's where the horrors are.

Yes. I started out my email to Jens by suggesting that instead of
passing down the iov[] pointer, he should just pass down the iter
instead.

And then I looked at that code and went "yeah, no way do I want to touch it".

Which then got me to that "could we at least *unify* these two cases".

                    Linus

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2023-03-28 22:06 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-28 17:36 [PATCHSET v4 0/8] Turn single segment imports into ITER_UBUF Jens Axboe
2023-03-28 17:36 ` [PATCH 1/8] iov_iter: teach iov_iter_iovec() to deal with ITER_UBUF Jens Axboe
2023-03-28 17:36 ` [PATCH 2/8] iov_iter: add iovec_nr_user_vecs() helper Jens Axboe
2023-03-28 18:42   ` Al Viro
2023-03-28 18:45     ` Linus Torvalds
2023-03-28 19:27     ` Jens Axboe
2023-03-28 17:36 ` [PATCH 3/8] snd: move mapping an iov_iter to user bufs into a helper Jens Axboe
2023-03-28 17:36 ` [PATCH 4/8] snd: make snd_map_bufs() deal with ITER_UBUF Jens Axboe
2023-03-28 17:50   ` Linus Torvalds
2023-03-28 17:52     ` Jens Axboe
2023-03-28 18:52       ` Al Viro
2023-03-28 19:28         ` Jens Axboe
2023-03-28 17:36 ` [PATCH 5/8] IB/hfi1: make hfi1_write_iter() deal with ITER_UBUF iov_iter Jens Axboe
2023-03-28 18:43   ` Linus Torvalds
2023-03-28 18:55     ` Matthew Wilcox
2023-03-28 19:05       ` Linus Torvalds
2023-03-28 19:16         ` Linus Torvalds
2023-03-28 21:21           ` Jens Axboe
2023-03-28 21:38             ` Jens Axboe
2023-03-28 21:51               ` Jens Axboe
2023-03-28 19:30     ` Jens Axboe
2023-03-28 20:38     ` Al Viro
2023-03-28 20:46       ` Jens Axboe
2023-03-28 22:06       ` Linus Torvalds
2023-03-28 17:36 ` [PATCH 6/8] IB/qib: make qib_write_iter() " Jens Axboe
2023-03-28 17:36 ` [PATCH 7/8] iov_iter: convert import_single_range() to ITER_UBUF Jens Axboe
2023-03-28 17:36 ` [PATCH 8/8] iov_iter: import single vector iovecs as ITER_UBUF Jens Axboe

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.