All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] io_uring: use ITER_UBUF
@ 2022-11-07 17:56 Keith Busch
  2022-11-07 17:56 ` [PATCH 1/4] iov: add import_ubuf() Keith Busch
                   ` (4 more replies)
  0 siblings, 5 replies; 9+ messages in thread
From: Keith Busch @ 2022-11-07 17:56 UTC (permalink / raw)
  To: viro, axboe, io-uring; +Cc: asml.silence, linux-fsdevel, Keith Busch

From: Keith Busch <kbusch@kernel.org>

ITER_UBUF is a more efficient representation when using single vector
buffers, providing small optimizations in the fast path. Most of this
series came from Jens; I just ported them forward to the current release
and tested against various filesystems and devices.

Usage for this new iter type has been extensively exercised via
read/write syscall interface for some time now, so I don't expect
surprises from supporting this with io_uring. There are, however, a
couple difference between the two interfaces:

  1. io_uring will always prefer using the _iter versions of read/write
     callbacks if file_operations implement both, where as the generic
     syscalls will use .read/.write (if implemented) for non-vectored IO.
 
  2. io_uring will use the ITER_UBUF representation for single vector
     readv/writev, but the generic syscalls currently uses ITER_IOVEC for
     these.

That should mean, then, the only potential areas for problem are for
file_operations that implement both .read/.read_iter or
.write/.write_iter. Fortunately there are very few that do that, and I
found only two of them that won't readily work: qib_file_ops, and
snd_pcm_f_ops. The former is already broken with io_uring before this
series, and the latter's vectored read/write only works with ITER_IOVEC,
so that will break, but I don't think anyone is using io_uring to talk
to a sound card driver.

Jens Axboe (3):
  iov: add import_ubuf()
  io_uring: switch network send/recv to ITER_UBUF
  io_uring: use ubuf for single range imports for read/write

Keith Busch (1):
  iov_iter: move iter_ubuf check inside restore WARN

 include/linux/uio.h |  1 +
 io_uring/net.c      | 13 ++++---------
 io_uring/rw.c       |  9 ++++++---
 lib/iov_iter.c      | 15 +++++++++++++--
 4 files changed, 24 insertions(+), 14 deletions(-)

-- 
2.30.2


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/4] iov: add import_ubuf()
  2022-11-07 17:56 [PATCH 0/4] io_uring: use ITER_UBUF Keith Busch
@ 2022-11-07 17:56 ` Keith Busch
  2022-11-08  6:55   ` Christoph Hellwig
  2022-11-07 17:56 ` [PATCH 2/4] io_uring: switch network send/recv to ITER_UBUF Keith Busch
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 9+ messages in thread
From: Keith Busch @ 2022-11-07 17:56 UTC (permalink / raw)
  To: viro, axboe, io-uring; +Cc: asml.silence, linux-fsdevel

From: Jens Axboe <axboe@kernel.dk>

Like import_single_range(), but for ITER_UBUF.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 include/linux/uio.h |  1 +
 lib/iov_iter.c      | 11 +++++++++++
 2 files changed, 12 insertions(+)

diff --git a/include/linux/uio.h b/include/linux/uio.h
index 2e3134b14ffd..27575495c006 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -337,6 +337,7 @@ ssize_t __import_iovec(int type, const struct iovec __user *uvec,
 		 struct iov_iter *i, bool compat);
 int import_single_range(int type, void __user *buf, size_t len,
 		 struct iovec *iov, struct iov_iter *i);
+int import_ubuf(int type, void __user *buf, size_t len, struct iov_iter *i);
 
 static inline void iov_iter_ubuf(struct iov_iter *i, unsigned int direction,
 			void __user *buf, size_t count)
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index c3ca28ca68a6..07adf18e5e40 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1855,6 +1855,17 @@ int import_single_range(int rw, void __user *buf, size_t len,
 }
 EXPORT_SYMBOL(import_single_range);
 
+int import_ubuf(int rw, void __user *buf, size_t len, struct iov_iter *i)
+{
+	if (len > MAX_RW_COUNT)
+		len = MAX_RW_COUNT;
+	if (unlikely(!access_ok(buf, len)))
+		return -EFAULT;
+
+	iov_iter_ubuf(i, rw, buf, len);
+	return 0;
+}
+
 /**
  * iov_iter_restore() - Restore a &struct iov_iter to the same state as when
  *     iov_iter_save_state() was called.
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/4] io_uring: switch network send/recv to ITER_UBUF
  2022-11-07 17:56 [PATCH 0/4] io_uring: use ITER_UBUF Keith Busch
  2022-11-07 17:56 ` [PATCH 1/4] iov: add import_ubuf() Keith Busch
@ 2022-11-07 17:56 ` Keith Busch
  2022-11-07 17:56 ` [PATCH 3/4] io_uring: use ubuf for single range imports for read/write Keith Busch
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Keith Busch @ 2022-11-07 17:56 UTC (permalink / raw)
  To: viro, axboe, io-uring; +Cc: asml.silence, linux-fsdevel, Keith Busch

From: Jens Axboe <axboe@kernel.dk>

This is more efficient than ITER_IOVEC.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
[merged to 6.1]
Signed-off-by: Keith Busch <kbusch@kernel.org>
---
 io_uring/net.c | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/io_uring/net.c b/io_uring/net.c
index 9a07e79cc0e6..12c68b5ec62d 100644
--- a/io_uring/net.c
+++ b/io_uring/net.c
@@ -170,7 +170,7 @@ static int io_setup_async_msg(struct io_kiocb *req,
 	if (async_msg->msg.msg_name)
 		async_msg->msg.msg_name = &async_msg->addr;
 	/* if were using fast_iov, set it to the new one */
-	if (!kmsg->free_iov) {
+	if (iter_is_iovec(&kmsg->msg.msg_iter) && !kmsg->free_iov) {
 		size_t fast_idx = kmsg->msg.msg_iter.iov - kmsg->fast_iov;
 		async_msg->msg.msg_iter.iov = &async_msg->fast_iov[fast_idx];
 	}
@@ -333,7 +333,6 @@ int io_send(struct io_kiocb *req, unsigned int issue_flags)
 	struct sockaddr_storage __address;
 	struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg);
 	struct msghdr msg;
-	struct iovec iov;
 	struct socket *sock;
 	unsigned flags;
 	int min_ret = 0;
@@ -367,7 +366,7 @@ int io_send(struct io_kiocb *req, unsigned int issue_flags)
 	if (unlikely(!sock))
 		return -ENOTSOCK;
 
-	ret = import_single_range(WRITE, sr->buf, sr->len, &iov, &msg.msg_iter);
+	ret = import_ubuf(WRITE, sr->buf, sr->len, &msg.msg_iter);
 	if (unlikely(ret))
 		return ret;
 
@@ -752,10 +751,7 @@ int io_recvmsg(struct io_kiocb *req, unsigned int issue_flags)
 			}
 		}
 
-		kmsg->fast_iov[0].iov_base = buf;
-		kmsg->fast_iov[0].iov_len = len;
-		iov_iter_init(&kmsg->msg.msg_iter, READ, kmsg->fast_iov, 1,
-				len);
+		iov_iter_ubuf(&kmsg->msg.msg_iter, READ, buf, len);
 	}
 
 	flags = sr->msg_flags;
@@ -824,7 +820,6 @@ int io_recv(struct io_kiocb *req, unsigned int issue_flags)
 	struct io_sr_msg *sr = io_kiocb_to_cmd(req, struct io_sr_msg);
 	struct msghdr msg;
 	struct socket *sock;
-	struct iovec iov;
 	unsigned int cflags;
 	unsigned flags;
 	int ret, min_ret = 0;
@@ -849,7 +844,7 @@ int io_recv(struct io_kiocb *req, unsigned int issue_flags)
 		sr->buf = buf;
 	}
 
-	ret = import_single_range(READ, sr->buf, len, &iov, &msg.msg_iter);
+	ret = import_ubuf(READ, sr->buf, len, &msg.msg_iter);
 	if (unlikely(ret))
 		goto out_free;
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 3/4] io_uring: use ubuf for single range imports for read/write
  2022-11-07 17:56 [PATCH 0/4] io_uring: use ITER_UBUF Keith Busch
  2022-11-07 17:56 ` [PATCH 1/4] iov: add import_ubuf() Keith Busch
  2022-11-07 17:56 ` [PATCH 2/4] io_uring: switch network send/recv to ITER_UBUF Keith Busch
@ 2022-11-07 17:56 ` Keith Busch
  2022-11-07 17:56 ` [PATCH 4/4] iov_iter: move iter_ubuf check inside restore WARN Keith Busch
  2022-11-08  6:54 ` [PATCH 0/4] io_uring: use ITER_UBUF Christoph Hellwig
  4 siblings, 0 replies; 9+ messages in thread
From: Keith Busch @ 2022-11-07 17:56 UTC (permalink / raw)
  To: viro, axboe, io-uring; +Cc: asml.silence, linux-fsdevel, Keith Busch

From: Jens Axboe <axboe@kernel.dk>

This is more efficient than ITER_IOVEC.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
[merge to 6.1, random fixes]
Signed-off-by: Keith Busch <kbusch@kernel.org>
---
 io_uring/rw.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/io_uring/rw.c b/io_uring/rw.c
index 1ce065709724..19551b5e8088 100644
--- a/io_uring/rw.c
+++ b/io_uring/rw.c
@@ -391,7 +391,7 @@ static struct iovec *__io_import_iovec(int ddir, struct io_kiocb *req,
 			rw->len = sqe_len;
 		}
 
-		ret = import_single_range(ddir, buf, sqe_len, s->fast_iov, iter);
+		ret = import_ubuf(ddir, buf, sqe_len, iter);
 		if (ret)
 			return ERR_PTR(ret);
 		return NULL;
@@ -450,7 +450,10 @@ static ssize_t loop_rw_iter(int ddir, struct io_rw *rw, struct iov_iter *iter)
 		struct iovec iovec;
 		ssize_t nr;
 
-		if (!iov_iter_is_bvec(iter)) {
+		if (iter_is_ubuf(iter)) {
+			iovec.iov_base = iter->ubuf + iter->iov_offset;
+			iovec.iov_len = iov_iter_count(iter);
+		} else if (!iov_iter_is_bvec(iter)) {
 			iovec = iov_iter_iovec(iter);
 		} else {
 			iovec.iov_base = u64_to_user_ptr(rw->addr);
@@ -495,7 +498,7 @@ static void io_req_map_rw(struct io_kiocb *req, const struct iovec *iovec,
 	io->free_iovec = iovec;
 	io->bytes_done = 0;
 	/* can only be fixed buffers, no need to do anything */
-	if (iov_iter_is_bvec(iter))
+	if (iov_iter_is_bvec(iter) || iter_is_ubuf(iter))
 		return;
 	if (!iovec) {
 		unsigned iov_off = 0;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 4/4] iov_iter: move iter_ubuf check inside restore WARN
  2022-11-07 17:56 [PATCH 0/4] io_uring: use ITER_UBUF Keith Busch
                   ` (2 preceding siblings ...)
  2022-11-07 17:56 ` [PATCH 3/4] io_uring: use ubuf for single range imports for read/write Keith Busch
@ 2022-11-07 17:56 ` Keith Busch
  2022-11-08  6:54 ` [PATCH 0/4] io_uring: use ITER_UBUF Christoph Hellwig
  4 siblings, 0 replies; 9+ messages in thread
From: Keith Busch @ 2022-11-07 17:56 UTC (permalink / raw)
  To: viro, axboe, io-uring; +Cc: asml.silence, linux-fsdevel, Keith Busch

From: Keith Busch <kbusch@kernel.org>

io_uring is using iter_ubuf types for single vector requests. We expect
state restore may happen for this type now, and it is already handled
correctly, so move the check inside the warning to suppress it.

Signed-off-by: Keith Busch <kbusch@kernel.org>
---
 lib/iov_iter.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 07adf18e5e40..aa192a386bd7 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -1880,8 +1880,8 @@ int import_ubuf(int rw, void __user *buf, size_t len, struct iov_iter *i)
  */
 void iov_iter_restore(struct iov_iter *i, struct iov_iter_state *state)
 {
-	if (WARN_ON_ONCE(!iov_iter_is_bvec(i) && !iter_is_iovec(i)) &&
-			 !iov_iter_is_kvec(i) && !iter_is_ubuf(i))
+	if (WARN_ON_ONCE(!iov_iter_is_bvec(i) && !iter_is_iovec(i) &&
+			 !iter_is_ubuf(i)) && !iov_iter_is_kvec(i))
 		return;
 	i->iov_offset = state->iov_offset;
 	i->count = state->count;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 0/4] io_uring: use ITER_UBUF
  2022-11-07 17:56 [PATCH 0/4] io_uring: use ITER_UBUF Keith Busch
                   ` (3 preceding siblings ...)
  2022-11-07 17:56 ` [PATCH 4/4] iov_iter: move iter_ubuf check inside restore WARN Keith Busch
@ 2022-11-08  6:54 ` Christoph Hellwig
  2022-11-08 20:25   ` Keith Busch
  4 siblings, 1 reply; 9+ messages in thread
From: Christoph Hellwig @ 2022-11-08  6:54 UTC (permalink / raw)
  To: Keith Busch
  Cc: viro, axboe, io-uring, asml.silence, linux-fsdevel, Keith Busch

On Mon, Nov 07, 2022 at 09:56:06AM -0800, Keith Busch wrote:
>   1. io_uring will always prefer using the _iter versions of read/write
>      callbacks if file_operations implement both, where as the generic
>      syscalls will use .read/.write (if implemented) for non-vectored IO.

There are very few file operations that have both, and for those
the difference matters, e.g. the strange vectors semantics for the
sound code.  I would strongly suggest to mirror what the normal
read/write path does here.

>   2. io_uring will use the ITER_UBUF representation for single vector
>      readv/writev, but the generic syscalls currently uses ITER_IOVEC for
>      these.

Same here.  It might be woth to use ITER_UBUF for single vector
readv/writev, but this should be the same for all interfaces.  I'd
suggest to drop this for now and do a separate series with careful
review from Al for this.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/4] iov: add import_ubuf()
  2022-11-07 17:56 ` [PATCH 1/4] iov: add import_ubuf() Keith Busch
@ 2022-11-08  6:55   ` Christoph Hellwig
  2022-11-08 16:05     ` Keith Busch
  0 siblings, 1 reply; 9+ messages in thread
From: Christoph Hellwig @ 2022-11-08  6:55 UTC (permalink / raw)
  To: Keith Busch; +Cc: viro, axboe, io-uring, asml.silence, linux-fsdevel

On Mon, Nov 07, 2022 at 09:56:07AM -0800, Keith Busch wrote:
> From: Jens Axboe <axboe@kernel.dk>
> 
> Like import_single_range(), but for ITER_UBUF.

So what is the argument for not simplify switching
import_single_range to always do a ITER_UBUF?  Maybe there is a reason
against that, but it should be clearly stated here.

> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>

Now that this went through your hands it also needs your signoff.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/4] iov: add import_ubuf()
  2022-11-08  6:55   ` Christoph Hellwig
@ 2022-11-08 16:05     ` Keith Busch
  0 siblings, 0 replies; 9+ messages in thread
From: Keith Busch @ 2022-11-08 16:05 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Keith Busch, viro, axboe, io-uring, asml.silence, linux-fsdevel

On Mon, Nov 07, 2022 at 10:55:49PM -0800, Christoph Hellwig wrote:
> On Mon, Nov 07, 2022 at 09:56:07AM -0800, Keith Busch wrote:
> > From: Jens Axboe <axboe@kernel.dk>
> > 
> > Like import_single_range(), but for ITER_UBUF.
> 
> So what is the argument for not simplify switching
> import_single_range to always do a ITER_UBUF?  Maybe there is a reason
> against that, but it should be clearly stated here.

That may be a good idea if everyone uses the more efficient iter, but I
thought it'd be safer to keep them separate. There are just a few
import_single_range() users that expect the result be ITER_IOVEC. It
will take some extra time on my side to make sure that kind of change
won't break anything.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 0/4] io_uring: use ITER_UBUF
  2022-11-08  6:54 ` [PATCH 0/4] io_uring: use ITER_UBUF Christoph Hellwig
@ 2022-11-08 20:25   ` Keith Busch
  0 siblings, 0 replies; 9+ messages in thread
From: Keith Busch @ 2022-11-08 20:25 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Keith Busch, viro, axboe, io-uring, asml.silence, linux-fsdevel

On Mon, Nov 07, 2022 at 10:54:06PM -0800, Christoph Hellwig wrote:
> On Mon, Nov 07, 2022 at 09:56:06AM -0800, Keith Busch wrote:
> >   1. io_uring will always prefer using the _iter versions of read/write
> >      callbacks if file_operations implement both, where as the generic
> >      syscalls will use .read/.write (if implemented) for non-vectored IO.
> 
> There are very few file operations that have both, and for those
> the difference matters, e.g. the strange vectors semantics for the
> sound code. 
 
Yes, thankfully there are not many. Other than the two mentioned
file_operations, the only other fops I find implementing both are
'null_ops' and 'zero_ops'; those are fine. And one other implements
just .write/.write_iter: trace_events_user.c, which is also fine.

> I would strongly suggest to mirror what the normal
> read/write path does here.

I don't think we can change that now. io_uring has always used the
.{read,write}_iter callbacks if available ever since it introduced
non-vectored read/write (3a6820f2bb8a0). Altering the io_uring op's ABI
to align with the read/write syscalls seems risky.

But I don't think there are any real use cases affected by this series
anyway.

> >   2. io_uring will use the ITER_UBUF representation for single vector
> >      readv/writev, but the generic syscalls currently uses ITER_IOVEC for
> >      these.
> 
> Same here.  It might be woth to use ITER_UBUF for single vector
> readv/writev, but this should be the same for all interfaces.  I'd
> suggest to drop this for now and do a separate series with careful
> review from Al for this.

I feel like that's a worthy longer term goal, but I'll start looking
into it now.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2022-11-08 20:25 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-07 17:56 [PATCH 0/4] io_uring: use ITER_UBUF Keith Busch
2022-11-07 17:56 ` [PATCH 1/4] iov: add import_ubuf() Keith Busch
2022-11-08  6:55   ` Christoph Hellwig
2022-11-08 16:05     ` Keith Busch
2022-11-07 17:56 ` [PATCH 2/4] io_uring: switch network send/recv to ITER_UBUF Keith Busch
2022-11-07 17:56 ` [PATCH 3/4] io_uring: use ubuf for single range imports for read/write Keith Busch
2022-11-07 17:56 ` [PATCH 4/4] iov_iter: move iter_ubuf check inside restore WARN Keith Busch
2022-11-08  6:54 ` [PATCH 0/4] io_uring: use ITER_UBUF Christoph Hellwig
2022-11-08 20:25   ` Keith Busch

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.