All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCHSET RFC 0/4] Support passing fds between chain links
@ 2020-03-03 23:50 Jens Axboe
  2020-03-03 23:50 ` [PATCH 1/4] io_uring: add end-of-bits marker and build time verify it Jens Axboe
                   ` (4 more replies)
  0 siblings, 5 replies; 10+ messages in thread
From: Jens Axboe @ 2020-03-03 23:50 UTC (permalink / raw)
  To: io-uring; +Cc: jlayton

One of the fabled features with chains has long been the desire to
support things like:

<open fileX><read from fileX><close fileX>

in a single chain. This currently doesn't work, since the read/close
depends on what file descriptor we get on open.

This is very much a RFC patchset, but it allows the read/close above
to set their fd to a magic value, IOSQE_FD_LAST_OPEN. If set to this
value, the file descriptor will be inherited from the last open in
that chain. If there are no opens in the chain, the IO is simply
errored. Only a single magic fd value is supported, so if the chain
needs to operate on two of them, care needs to be taken to ensure
things are correct. Consider for example the desire to open fileX
and read from it, and write that to another file. You could do that
ala:

<open fileX><read from fileX><close fileX><open fileY><write to fileY>
	<close fileY>

and have that work, but you cannot open both files first, then read/write
and then close. I don't think that necessarily poses a problem, and
I'd rather not get into fd nesting and things like that. Open to input
here, of course.

Another concern here is that we currently error linked IO if it doesn't
match what was asked for, a prime example being short reads. For a
basic chain of open/read/close, the close doesn't really care if the read
is short or not. It's only if we have further links in the chain that
depend on the read length that this is a problem.

Anyway, with this, prep handlers can't look at ->file as it may not be
valid yet. Only close and read/write do that, from a quick glance, and
there are two prep patches to split that a bit (2 and 3). Patch 1 is just
a basic prep patch as well, patch 4 is the functional part.

I added a small 'orc' (open-read-close) test program in the fd-pass
branch of liburing:

https://git.kernel.dk/cgit/liburing/plain/test/orc.c?h=fd-pass

as an example use case.

-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 1/4] io_uring: add end-of-bits marker and build time verify it
  2020-03-03 23:50 [PATCHSET RFC 0/4] Support passing fds between chain links Jens Axboe
@ 2020-03-03 23:50 ` Jens Axboe
  2020-03-03 23:50 ` [PATCH 2/4] io_uring: move CLOSE req->file checking into handler Jens Axboe
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Jens Axboe @ 2020-03-03 23:50 UTC (permalink / raw)
  To: io-uring; +Cc: jlayton, Jens Axboe

Not easy to tell if we're going over the size of bits we can shove
in req->flags, so add an end-of-bits marker and a BUILD_BUG_ON()
check for it.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io_uring.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 3df54dad9eae..0464efbeba25 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -511,6 +511,9 @@ enum {
 	REQ_F_OVERFLOW_BIT,
 	REQ_F_POLLED_BIT,
 	REQ_F_BUFFER_SELECTED_BIT,
+
+	/* not a real bit, just to check we're not overflowing the space */
+	__REQ_F_LAST_BIT,
 };
 
 enum {
@@ -8011,6 +8014,7 @@ static int __init io_uring_init(void)
 	BUILD_BUG_SQE_ELEM(44, __s32,  splice_fd_in);
 
 	BUILD_BUG_ON(ARRAY_SIZE(io_op_defs) != IORING_OP_LAST);
+	BUILD_BUG_ON(__REQ_F_LAST_BIT >= 8 * sizeof(int));
 	req_cachep = KMEM_CACHE(io_kiocb, SLAB_HWCACHE_ALIGN | SLAB_PANIC);
 	return 0;
 };
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 2/4] io_uring: move CLOSE req->file checking into handler
  2020-03-03 23:50 [PATCHSET RFC 0/4] Support passing fds between chain links Jens Axboe
  2020-03-03 23:50 ` [PATCH 1/4] io_uring: add end-of-bits marker and build time verify it Jens Axboe
@ 2020-03-03 23:50 ` Jens Axboe
  2020-03-04 13:07   ` Pavel Begunkov
  2020-03-03 23:50 ` [PATCH 3/4] io_uring: move read/write side file based prep into op handler Jens Axboe
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 10+ messages in thread
From: Jens Axboe @ 2020-03-03 23:50 UTC (permalink / raw)
  To: io-uring; +Cc: jlayton, Jens Axboe

In preparation for not needing req->file in on the prep side at all.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io_uring.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 0464efbeba25..9d5e49a39dba 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -3367,10 +3367,6 @@ static int io_close_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 		return -EBADF;
 
 	req->close.fd = READ_ONCE(sqe->fd);
-	if (req->file->f_op == &io_uring_fops ||
-	    req->close.fd == req->ctx->ring_fd)
-		return -EBADF;
-
 	return 0;
 }
 
@@ -3400,6 +3396,10 @@ static int io_close(struct io_kiocb *req, bool force_nonblock)
 {
 	int ret;
 
+	if (req->file->f_op == &io_uring_fops ||
+	    req->close.fd == req->ctx->ring_fd)
+		return -EBADF;
+
 	req->close.put_file = NULL;
 	ret = __close_fd_get_file(req->close.fd, &req->close.put_file);
 	if (ret < 0)
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 3/4] io_uring: move read/write side file based prep into op handler
  2020-03-03 23:50 [PATCHSET RFC 0/4] Support passing fds between chain links Jens Axboe
  2020-03-03 23:50 ` [PATCH 1/4] io_uring: add end-of-bits marker and build time verify it Jens Axboe
  2020-03-03 23:50 ` [PATCH 2/4] io_uring: move CLOSE req->file checking into handler Jens Axboe
@ 2020-03-03 23:50 ` Jens Axboe
  2020-03-03 23:50 ` [PATCH 4/4] io_uring: test patch for fd passing Jens Axboe
  2020-03-04  0:43 ` [PATCHSET RFC 0/4] Support passing fds between chain links Jeff Layton
  4 siblings, 0 replies; 10+ messages in thread
From: Jens Axboe @ 2020-03-03 23:50 UTC (permalink / raw)
  To: io-uring; +Cc: jlayton, Jens Axboe

In preparation for not needing req->file in on the prep side at all.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io_uring.c | 72 ++++++++++++++++++++++++++++++---------------------
 1 file changed, 43 insertions(+), 29 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 9d5e49a39dba..8044dec4e793 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2030,37 +2030,19 @@ static bool io_file_supports_async(struct file *file)
 	return false;
 }
 
-static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe,
-		      bool force_nonblock)
+static int io_prep_rw(struct io_kiocb *req, bool force_nonblock)
 {
-	struct io_ring_ctx *ctx = req->ctx;
 	struct kiocb *kiocb = &req->rw.kiocb;
-	unsigned ioprio;
-	int ret;
 
 	if (S_ISREG(file_inode(req->file)->i_mode))
 		req->flags |= REQ_F_ISREG;
 
-	kiocb->ki_pos = READ_ONCE(sqe->off);
 	if (kiocb->ki_pos == -1 && !(req->file->f_mode & FMODE_STREAM)) {
 		req->flags |= REQ_F_CUR_POS;
 		kiocb->ki_pos = req->file->f_pos;
 	}
 	kiocb->ki_hint = ki_hint_validate(file_write_hint(kiocb->ki_filp));
-	kiocb->ki_flags = iocb_flags(kiocb->ki_filp);
-	ret = kiocb_set_rw_flags(kiocb, READ_ONCE(sqe->rw_flags));
-	if (unlikely(ret))
-		return ret;
-
-	ioprio = READ_ONCE(sqe->ioprio);
-	if (ioprio) {
-		ret = ioprio_check_cap(ioprio);
-		if (ret)
-			return ret;
-
-		kiocb->ki_ioprio = ioprio;
-	} else
-		kiocb->ki_ioprio = get_current_ioprio();
+	kiocb->ki_flags |= iocb_flags(kiocb->ki_filp);
 
 	/* don't allow async punt if RWF_NOWAIT was requested */
 	if ((kiocb->ki_flags & IOCB_NOWAIT) ||
@@ -2070,7 +2052,7 @@ static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 	if (force_nonblock)
 		kiocb->ki_flags |= IOCB_NOWAIT;
 
-	if (ctx->flags & IORING_SETUP_IOPOLL) {
+	if (req->ctx->flags & IORING_SETUP_IOPOLL) {
 		if (!(kiocb->ki_flags & IOCB_DIRECT) ||
 		    !kiocb->ki_filp->f_op->iopoll)
 			return -EOPNOTSUPP;
@@ -2084,6 +2066,30 @@ static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 		kiocb->ki_complete = io_complete_rw;
 	}
 
+	return 0;
+}
+
+static int io_sqe_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+{
+	struct kiocb *kiocb = &req->rw.kiocb;
+	unsigned ioprio;
+	int ret;
+
+	kiocb->ki_pos = READ_ONCE(sqe->off);
+	ret = kiocb_set_rw_flags(kiocb, READ_ONCE(sqe->rw_flags));
+	if (unlikely(ret))
+		return ret;
+
+	ioprio = READ_ONCE(sqe->ioprio);
+	if (ioprio) {
+		ret = ioprio_check_cap(ioprio);
+		if (ret)
+			return ret;
+
+		kiocb->ki_ioprio = ioprio;
+	} else
+		kiocb->ki_ioprio = get_current_ioprio();
+
 	req->rw.addr = READ_ONCE(sqe->addr);
 	req->rw.len = READ_ONCE(sqe->len);
 	/* we own ->private, reuse it for the buffer index  / buffer ID */
@@ -2487,13 +2493,10 @@ static int io_read_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 	struct iov_iter iter;
 	ssize_t ret;
 
-	ret = io_prep_rw(req, sqe, force_nonblock);
+	ret = io_sqe_prep_rw(req, sqe);
 	if (ret)
 		return ret;
 
-	if (unlikely(!(req->file->f_mode & FMODE_READ)))
-		return -EBADF;
-
 	/* either don't need iovec imported or already have it */
 	if (!req->io || req->flags & REQ_F_NEED_CLEANUP)
 		return 0;
@@ -2518,6 +2521,13 @@ static int io_read(struct io_kiocb *req, bool force_nonblock)
 	size_t iov_count;
 	ssize_t io_size, ret;
 
+	if (unlikely(!(req->file->f_mode & FMODE_READ)))
+		return -EBADF;
+
+	ret = io_prep_rw(req, force_nonblock);
+	if (ret)
+		return ret;
+
 	ret = io_import_iovec(READ, req, &iovec, &iter, !force_nonblock);
 	if (ret < 0)
 		return ret;
@@ -2576,13 +2586,10 @@ static int io_write_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 	struct iov_iter iter;
 	ssize_t ret;
 
-	ret = io_prep_rw(req, sqe, force_nonblock);
+	ret = io_sqe_prep_rw(req, sqe);
 	if (ret)
 		return ret;
 
-	if (unlikely(!(req->file->f_mode & FMODE_WRITE)))
-		return -EBADF;
-
 	/* either don't need iovec imported or already have it */
 	if (!req->io || req->flags & REQ_F_NEED_CLEANUP)
 		return 0;
@@ -2607,6 +2614,13 @@ static int io_write(struct io_kiocb *req, bool force_nonblock)
 	size_t iov_count;
 	ssize_t ret, io_size;
 
+	if (unlikely(!(req->file->f_mode & FMODE_WRITE)))
+		return -EBADF;
+
+	ret = io_prep_rw(req, force_nonblock);
+	if (ret)
+		return ret;
+
 	ret = io_import_iovec(WRITE, req, &iovec, &iter, !force_nonblock);
 	if (ret < 0)
 		return ret;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH 4/4] io_uring: test patch for fd passing
  2020-03-03 23:50 [PATCHSET RFC 0/4] Support passing fds between chain links Jens Axboe
                   ` (2 preceding siblings ...)
  2020-03-03 23:50 ` [PATCH 3/4] io_uring: move read/write side file based prep into op handler Jens Axboe
@ 2020-03-03 23:50 ` Jens Axboe
  2020-03-04 13:13   ` Pavel Begunkov
  2020-03-04  0:43 ` [PATCHSET RFC 0/4] Support passing fds between chain links Jeff Layton
  4 siblings, 1 reply; 10+ messages in thread
From: Jens Axboe @ 2020-03-03 23:50 UTC (permalink / raw)
  To: io-uring; +Cc: jlayton, Jens Axboe

This allows a chain link to set ->fd to IOSQE_FD_LAST_OPEN, which will
then be turned into the results from the last open (or accept) request
in the chain. If no open has been done, this isn't valid and the request
will be errored with -EBADF.

With this, we can define chains of open+read+close, where the read and
close part can work on the fd instantiated by the open request.

Totally a work in progress, POC so far.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io_uring.c                 | 56 ++++++++++++++++++++++++++++++++---
 include/uapi/linux/io_uring.h |  6 ++++
 2 files changed, 58 insertions(+), 4 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 8044dec4e793..63a1a4f01e7e 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -511,6 +511,7 @@ enum {
 	REQ_F_OVERFLOW_BIT,
 	REQ_F_POLLED_BIT,
 	REQ_F_BUFFER_SELECTED_BIT,
+	REQ_F_OPEN_FD_BIT,
 
 	/* not a real bit, just to check we're not overflowing the space */
 	__REQ_F_LAST_BIT,
@@ -562,6 +563,8 @@ enum {
 	REQ_F_POLLED		= BIT(REQ_F_POLLED_BIT),
 	/* buffer already selected */
 	REQ_F_BUFFER_SELECTED	= BIT(REQ_F_BUFFER_SELECTED_BIT),
+	/* use chain previous open fd */
+	REQ_F_OPEN_FD		= BIT(REQ_F_OPEN_FD_BIT),
 };
 
 struct async_poll {
@@ -600,6 +603,9 @@ struct io_kiocb {
 	bool				needs_fixed_file;
 	u8				opcode;
 
+	int			last_open_fd;
+	struct file		*last_open_file;
+
 	struct io_ring_ctx	*ctx;
 	struct list_head	list;
 	unsigned int		flags;
@@ -1344,7 +1350,7 @@ static inline void io_put_file(struct io_kiocb *req, struct file *file,
 {
 	if (fixed)
 		percpu_ref_put(&req->ctx->file_data->refs);
-	else
+	else if (!(req->flags & REQ_F_OPEN_FD))
 		fput(file);
 }
 
@@ -1487,6 +1493,14 @@ static void io_req_link_next(struct io_kiocb *req, struct io_kiocb **nxtptr)
 		list_del_init(&req->link_list);
 		if (!list_empty(&nxt->link_list))
 			nxt->flags |= REQ_F_LINK;
+		if (nxt->flags & REQ_F_OPEN_FD) {
+			WARN_ON_ONCE(nxt->file);
+			if (unlikely(!req->last_open_file))
+				nxt->flags |= REQ_F_FAIL_LINK;
+			nxt->last_open_file = req->last_open_file;
+			nxt->last_open_fd = req->last_open_fd;
+			nxt->file = req->last_open_file;
+		}
 		*nxtptr = nxt;
 		break;
 	}
@@ -2965,8 +2979,8 @@ static int io_openat2_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 
 static int io_openat2(struct io_kiocb *req, bool force_nonblock)
 {
+	struct file *file = NULL;
 	struct open_flags op;
-	struct file *file;
 	int ret;
 
 	if (force_nonblock)
@@ -2991,8 +3005,12 @@ static int io_openat2(struct io_kiocb *req, bool force_nonblock)
 err:
 	putname(req->open.filename);
 	req->flags &= ~REQ_F_NEED_CLEANUP;
-	if (ret < 0)
+	if (ret < 0) {
 		req_set_fail_links(req);
+	} else if (req->flags & REQ_F_LINK) {
+		req->last_open_file = file;
+		req->last_open_fd = ret;
+	}
 	io_cqring_add_event(req, ret);
 	io_put_req(req);
 	return 0;
@@ -3410,6 +3428,14 @@ static int io_close(struct io_kiocb *req, bool force_nonblock)
 {
 	int ret;
 
+	if (req->flags & REQ_F_OPEN_FD) {
+		if (req->close.fd != IOSQE_FD_LAST_OPEN)
+			return -EBADF;
+		req->close.fd = req->last_open_fd;
+		req->last_open_file = NULL;
+		req->last_open_fd = -1;
+	}
+
 	if (req->file->f_op == &io_uring_fops ||
 	    req->close.fd == req->ctx->ring_fd)
 		return -EBADF;
@@ -3962,8 +3988,14 @@ static int __io_accept(struct io_kiocb *req, bool force_nonblock)
 		return -EAGAIN;
 	if (ret == -ERESTARTSYS)
 		ret = -EINTR;
-	if (ret < 0)
+	if (ret < 0) {
 		req_set_fail_links(req);
+	} else if (req->flags & REQ_F_LINK) {
+		rcu_read_lock();
+		req->last_open_file = fcheck_files(current->files, ret);
+		rcu_read_unlock();
+		req->last_open_fd = ret;
+	}
 	io_cqring_add_event(req, ret);
 	io_put_req(req);
 	return 0;
@@ -4999,6 +5031,9 @@ static int io_issue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 	struct io_ring_ctx *ctx = req->ctx;
 	int ret;
 
+	if ((req->flags & REQ_F_OPEN_FD) && !req->file)
+		return -EBADF;
+
 	switch (req->opcode) {
 	case IORING_OP_NOP:
 		ret = io_nop(req);
@@ -5335,6 +5370,14 @@ static int io_req_set_file(struct io_submit_state *state, struct io_kiocb *req,
 		return 0;
 
 	fixed = (flags & IOSQE_FIXED_FILE);
+	if (fd == IOSQE_FD_LAST_OPEN) {
+		if (fixed)
+			return -EBADF;
+		req->flags |= REQ_F_OPEN_FD;
+		req->file = NULL;
+		return 0;
+	}
+
 	if (unlikely(!fixed && req->needs_fixed_file))
 		return -EBADF;
 
@@ -5646,6 +5689,9 @@ static bool io_submit_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 		trace_io_uring_link(ctx, req, head);
 		list_add_tail(&req->link_list, &head->link_list);
 
+		req->last_open_fd = -1;
+		req->last_open_file = NULL;
+
 		/* last request of a link, enqueue the link */
 		if (!(sqe_flags & (IOSQE_IO_LINK|IOSQE_IO_HARDLINK))) {
 			io_queue_link_head(head);
@@ -5658,6 +5704,8 @@ static bool io_submit_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 		}
 		if (sqe_flags & (IOSQE_IO_LINK|IOSQE_IO_HARDLINK)) {
 			req->flags |= REQ_F_LINK;
+			req->last_open_fd = -1;
+			req->last_open_file = NULL;
 			INIT_LIST_HEAD(&req->link_list);
 			ret = io_req_defer_prep(req, sqe);
 			if (ret)
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 5eea0cfafb59..8206f075dd26 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -84,6 +84,12 @@ enum {
 /* select buffer from sqe->buf_group */
 #define IOSQE_BUFFER_SELECT	(1U << IOSQE_BUFFER_SELECT_BIT)
 
+/*
+ * 'magic' ->fd values don't point to a real fd, but rather define how fds
+ * can be inherited through links in a chain
+ */
+#define IOSQE_FD_LAST_OPEN	(-4096)	/* previous result is fd */
+
 /*
  * io_uring_setup() flags
  */
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCHSET RFC 0/4] Support passing fds between chain links
  2020-03-03 23:50 [PATCHSET RFC 0/4] Support passing fds between chain links Jens Axboe
                   ` (3 preceding siblings ...)
  2020-03-03 23:50 ` [PATCH 4/4] io_uring: test patch for fd passing Jens Axboe
@ 2020-03-04  0:43 ` Jeff Layton
  4 siblings, 0 replies; 10+ messages in thread
From: Jeff Layton @ 2020-03-04  0:43 UTC (permalink / raw)
  To: Jens Axboe, io-uring

On Tue, 2020-03-03 at 16:50 -0700, Jens Axboe wrote:
> One of the fabled features with chains has long been the desire to
> support things like:
> 
> <open fileX><read from fileX><close fileX>
> 
> in a single chain. This currently doesn't work, since the read/close
> depends on what file descriptor we get on open.
> 
> This is very much a RFC patchset, but it allows the read/close above
> to set their fd to a magic value, IOSQE_FD_LAST_OPEN. If set to this
> value, the file descriptor will be inherited from the last open in
> that chain. If there are no opens in the chain, the IO is simply
> errored. Only a single magic fd value is supported, so if the chain
> needs to operate on two of them, care needs to be taken to ensure
> things are correct. Consider for example the desire to open fileX
> and read from it, and write that to another file. You could do that
> ala:
> 
> <open fileX><read from fileX><close fileX><open fileY><write to fileY>
> 	<close fileY>
> 
> and have that work, but you cannot open both files first, then read/write
> and then close. I don't think that necessarily poses a problem, and
> I'd rather not get into fd nesting and things like that. Open to input
> here, of course.
> 

Nice work!

I think this is a fine interface for about 90% of the use cases that I
can think of for this.

It would have to be expanded if we ever did want to be able to multiplex
several fds though. Might we ever need to do a splice or copy_file_range
between two files like this? It's at least worth considering how we
might do that in the future.

One idea might be to give each chain a fixed-size table (8 or so?) and
then add a way to basically dup a private reference of the LAST_OPEN
fd+file into a particular slot.

Then you could just define constants like IOSQE_FD_SLOT_1..8 and then
you have a way to deal with more than one "ephemeral" open at a time.
Those dup'ed entries would be implicitly closed when the chain returns.

Then, you could do:

    <open><dup><close>

...and just work with the private slot descriptor from then on in the
chain.

> 
> Another concern here is that we currently error linked IO if it doesn't
> match what was asked for, a prime example being short reads. For a
> basic chain of open/read/close, the close doesn't really care if the read
> is short or not. It's only if we have further links in the chain that
> depend on the read length that this is a problem.
> 

Ok, so a short read is considered an error and you stop processing the
chain? That seems like a reasonable thing to do here. The idea is to do
this for speed, so erroring out when things don't go as planned seems
fine to me.


> Anyway, with this, prep handlers can't look at ->file as it may not be
> valid yet. Only close and read/write do that, from a quick glance, and
> there are two prep patches to split that a bit (2 and 3). Patch 1 is just
> a basic prep patch as well, patch 4 is the functional part.
> 
> I added a small 'orc' (open-read-close) test program in the fd-pass
> branch of liburing:
> 
> https://git.kernel.dk/cgit/liburing/plain/test/orc.c?h=fd-pass
> 
> as an example use case.
> 
-- 
Jeff Layton <jlayton@kernel.org>


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/4] io_uring: move CLOSE req->file checking into handler
  2020-03-03 23:50 ` [PATCH 2/4] io_uring: move CLOSE req->file checking into handler Jens Axboe
@ 2020-03-04 13:07   ` Pavel Begunkov
  2020-03-04 17:47     ` Jens Axboe
  0 siblings, 1 reply; 10+ messages in thread
From: Pavel Begunkov @ 2020-03-04 13:07 UTC (permalink / raw)
  To: Jens Axboe, io-uring; +Cc: jlayton

On 04/03/2020 02:50, Jens Axboe wrote:
> In preparation for not needing req->file in on the prep side at all.
> 
> Signed-off-by: Jens Axboe <axboe@kernel.dk>
> ---
>  fs/io_uring.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 0464efbeba25..9d5e49a39dba 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -3367,10 +3367,6 @@ static int io_close_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
>  		return -EBADF;
>  
>  	req->close.fd = READ_ONCE(sqe->fd);
> -	if (req->file->f_op == &io_uring_fops ||
> -	    req->close.fd == req->ctx->ring_fd)
> -		return -EBADF;
> -
>  	return 0;
>  }
>  
> @@ -3400,6 +3396,10 @@ static int io_close(struct io_kiocb *req, bool force_nonblock)
>  {
>  	int ret;
>  
> +	if (req->file->f_op == &io_uring_fops ||
> +	    req->close.fd == req->ctx->ring_fd)
> +		return -EBADF;
> +

@ring_fd's and @ring_file's lifetimes are bound by call to io_submit_sqes(), and
they're undefined outside. For the same reason both of them should be used with
ctx->uring_lock hold.


>  	req->close.put_file = NULL;
>  	ret = __close_fd_get_file(req->close.fd, &req->close.put_file);
>  	if (ret < 0)
> 

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 4/4] io_uring: test patch for fd passing
  2020-03-03 23:50 ` [PATCH 4/4] io_uring: test patch for fd passing Jens Axboe
@ 2020-03-04 13:13   ` Pavel Begunkov
  2020-03-04 17:48     ` Jens Axboe
  0 siblings, 1 reply; 10+ messages in thread
From: Pavel Begunkov @ 2020-03-04 13:13 UTC (permalink / raw)
  To: Jens Axboe, io-uring; +Cc: jlayton

On 04/03/2020 02:50, Jens Axboe wrote:
> This allows a chain link to set ->fd to IOSQE_FD_LAST_OPEN, which will
> then be turned into the results from the last open (or accept) request
> in the chain. If no open has been done, this isn't valid and the request
> will be errored with -EBADF.
> 
> With this, we can define chains of open+read+close, where the read and
> close part can work on the fd instantiated by the open request.
> 
> Totally a work in progress, POC so far.

I'm concerned of having and supporting all these IOSQE flavours. Isn't it the
thing that can be potentially done with eBPF?

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 2/4] io_uring: move CLOSE req->file checking into handler
  2020-03-04 13:07   ` Pavel Begunkov
@ 2020-03-04 17:47     ` Jens Axboe
  0 siblings, 0 replies; 10+ messages in thread
From: Jens Axboe @ 2020-03-04 17:47 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring; +Cc: jlayton

On 3/4/20 6:07 AM, Pavel Begunkov wrote:
> On 04/03/2020 02:50, Jens Axboe wrote:
>> In preparation for not needing req->file in on the prep side at all.
>>
>> Signed-off-by: Jens Axboe <axboe@kernel.dk>
>> ---
>>  fs/io_uring.c | 8 ++++----
>>  1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/fs/io_uring.c b/fs/io_uring.c
>> index 0464efbeba25..9d5e49a39dba 100644
>> --- a/fs/io_uring.c
>> +++ b/fs/io_uring.c
>> @@ -3367,10 +3367,6 @@ static int io_close_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
>>  		return -EBADF;
>>  
>>  	req->close.fd = READ_ONCE(sqe->fd);
>> -	if (req->file->f_op == &io_uring_fops ||
>> -	    req->close.fd == req->ctx->ring_fd)
>> -		return -EBADF;
>> -
>>  	return 0;
>>  }
>>  
>> @@ -3400,6 +3396,10 @@ static int io_close(struct io_kiocb *req, bool force_nonblock)
>>  {
>>  	int ret;
>>  
>> +	if (req->file->f_op == &io_uring_fops ||
>> +	    req->close.fd == req->ctx->ring_fd)
>> +		return -EBADF;
>> +
> 
> @ring_fd's and @ring_file's lifetimes are bound by call to io_submit_sqes(), and
> they're undefined outside. For the same reason both of them should be used with
> ctx->uring_lock hold.

I've fixed this by splitting the check.


-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH 4/4] io_uring: test patch for fd passing
  2020-03-04 13:13   ` Pavel Begunkov
@ 2020-03-04 17:48     ` Jens Axboe
  0 siblings, 0 replies; 10+ messages in thread
From: Jens Axboe @ 2020-03-04 17:48 UTC (permalink / raw)
  To: Pavel Begunkov, io-uring; +Cc: jlayton

On 3/4/20 6:13 AM, Pavel Begunkov wrote:
> On 04/03/2020 02:50, Jens Axboe wrote:
>> This allows a chain link to set ->fd to IOSQE_FD_LAST_OPEN, which will
>> then be turned into the results from the last open (or accept) request
>> in the chain. If no open has been done, this isn't valid and the request
>> will be errored with -EBADF.
>>
>> With this, we can define chains of open+read+close, where the read and
>> close part can work on the fd instantiated by the open request.
>>
>> Totally a work in progress, POC so far.
> 
> I'm concerned of having and supporting all these IOSQE flavours. Isn't it the
> thing that can be potentially done with eBPF?

It could totally be done with BPF, but honestly I'd love to avoid the
extra dependency.

But in talks with Josh, I really like his suggestion on being able to ask
for a specific fd instead. I've reworked it to sit on top of that instead,
I'll post a new series soon.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2020-03-04 17:48 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-03 23:50 [PATCHSET RFC 0/4] Support passing fds between chain links Jens Axboe
2020-03-03 23:50 ` [PATCH 1/4] io_uring: add end-of-bits marker and build time verify it Jens Axboe
2020-03-03 23:50 ` [PATCH 2/4] io_uring: move CLOSE req->file checking into handler Jens Axboe
2020-03-04 13:07   ` Pavel Begunkov
2020-03-04 17:47     ` Jens Axboe
2020-03-03 23:50 ` [PATCH 3/4] io_uring: move read/write side file based prep into op handler Jens Axboe
2020-03-03 23:50 ` [PATCH 4/4] io_uring: test patch for fd passing Jens Axboe
2020-03-04 13:13   ` Pavel Begunkov
2020-03-04 17:48     ` Jens Axboe
2020-03-04  0:43 ` [PATCHSET RFC 0/4] Support passing fds between chain links Jeff Layton

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.