All of lore.kernel.org
 help / color / mirror / Atom feed
* [WIP PATCH] io_uring: Support opening a file into the fixed-file table
@ 2020-07-14 21:08 Josh Triplett
  2020-07-14 21:16 ` [WIP PATCH] liburing: Support IORING_OP_OPENAT2_FIXED_FILE Josh Triplett
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Josh Triplett @ 2020-07-14 21:08 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring

Add a new operation IORING_OP_OPENAT2_FIXED_FILE, which opens a file
into the fixed-file table rather than installing a file descriptor.
Using a new operation avoids having an IOSQE flag that almost all
operations will need to ignore; io_openat2_fixed_file also has
substantially different control-flow than io_openat2, and it can avoid
requiring the file table if not needed for the dirfd.

(This intentionally does not use the IOSQE_FIXED_FILE flag, because
semantically, IOSQE_FIXED_FILE for openat2 should mean to interpret the
dirfd as a fixed-file-table index, and that would be useful future
behavior for both IORING_OP_OPENAT2 and IORING_OP_OPENAT2_FIXED_FILE.)

Create a new io_sqe_files_add_new function to add a single new file to
the fixed-file table. This function returns -EBUSY if attempting to
overwrite an existing file.

Provide a new field to pass along the fixed-file-table index for an
open-like operation; future operations such as
IORING_OP_ACCEPT_FIXED_FILE can use the same index.

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
---

(Should this check for and reject open flags like O_CLOEXEC that only
affect the file descriptor?)

I've tested this (and I'll send my liburing patch momentarily), and it
works fine if you do the open in one batch and operate on the fixed-file
in another batch. As discussed via Twitter, opening and operating on a
file in the same batch will require changing other operations to obtain
their fixed-file entries later, post-prep.

It might make sense to do and test that for one operation at a time, and
add a .late_fixed_file flag to the operation definition for operations
that support that.

It might also make sense to have the prep for
IORING_OP_OPENAT2_FIXED_FILE stick an indication in the fixed-file table
that there *will* be a file there later, perhaps an
ERR_PTR(-EINPROGRESS), and make sure there isn't one already, to detect
potential errors earlier and to let the prep for other operations
confirm that there *will* be a file; on the other hand, that would mean
there's an invalid non-NULL file pointer in the fixed file table, which
seems potentially error-prone if any operation ever forgets that.

The other next step would be to add an IORING_OP_CLOSE_FIXED_FILE
(separate from the existing CLOSE op) that removes an entry currently in
the fixed file table and calls fput on it. (With some care, that
*should* be possible even for an entry that was originally registered
from a file descriptor.)

And finally, we should have an IORING_OP_FIXED_FILE_TO_FD operation,
which calls get_unused_fd_flags (with specified flags to allow for
O_CLOEXEC) and then fd_install. That allows opening a file via io_uring,
operating on it via the ring, but then also operating on it via other
syscalls (or inheriting it or anything else you can do with a file
descriptor).

 fs/io_uring.c                 | 90 ++++++++++++++++++++++++++++++++++-
 include/uapi/linux/io_uring.h |  6 ++-
 2 files changed, 94 insertions(+), 2 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 9fd7e69696c3..df6f017ef8e8 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -425,6 +425,7 @@ struct io_sr_msg {
 struct io_open {
 	struct file			*file;
 	int				dfd;
+	u32				open_fixed_idx;
 	struct filename			*filename;
 	struct open_how			how;
 	unsigned long			nofile;
@@ -878,6 +879,10 @@ static const struct io_op_def io_op_defs[] = {
 		.hash_reg_file		= 1,
 		.unbound_nonreg_file	= 1,
 	},
+	[IORING_OP_OPENAT2_FIXED_FILE] = {
+		.file_table		= 1,
+		.needs_fs		= 1,
+	},
 };
 
 static void io_wq_submit_work(struct io_wq_work **workptr);
@@ -886,6 +891,9 @@ static void io_put_req(struct io_kiocb *req);
 static void __io_double_put_req(struct io_kiocb *req);
 static struct io_kiocb *io_prep_linked_timeout(struct io_kiocb *req);
 static void io_queue_linked_timeout(struct io_kiocb *req);
+static int io_sqe_files_add_new(struct io_ring_ctx *ctx,
+				u32 index,
+				struct file *file);
 static int __io_sqe_files_update(struct io_ring_ctx *ctx,
 				 struct io_uring_files_update *ip,
 				 unsigned nr_args);
@@ -3060,10 +3068,48 @@ static int io_openat2_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 					len);
 	if (ret)
 		return ret;
+	req->open.open_fixed_idx = READ_ONCE(sqe->open_fixed_idx);
 
 	return __io_openat_prep(req, sqe);
 }
 
+static int io_openat2_fixed_file(struct io_kiocb *req, bool force_nonblock)
+{
+	struct io_open *open = &req->open;
+	struct open_flags op;
+	struct file *file;
+	int ret;
+
+	if (force_nonblock) {
+		/* only need file table for an actual valid fd */
+		if (open->dfd == -1 || open->dfd == AT_FDCWD)
+			req->flags |= REQ_F_NO_FILE_TABLE;
+		return -EAGAIN;
+	}
+
+	ret = build_open_flags(&open->how, &op);
+	if (ret)
+		goto err;
+
+	file = do_filp_open(open->dfd, open->filename, &op);
+	if (IS_ERR(file)) {
+		ret = PTR_ERR(file);
+	} else {
+		fsnotify_open(file);
+		ret = io_sqe_files_add_new(req->ctx, open->open_fixed_idx, file);
+		if (ret)
+			fput(file);
+	}
+err:
+	putname(open->filename);
+	req->flags &= ~REQ_F_NEED_CLEANUP;
+	if (ret < 0)
+		req_set_fail_links(req);
+	io_cqring_add_event(req, ret);
+	io_put_req(req);
+	return 0;
+}
+
 static int io_openat2(struct io_kiocb *req, bool force_nonblock)
 {
 	struct open_flags op;
@@ -5048,6 +5094,7 @@ static int io_req_defer_prep(struct io_kiocb *req,
 		ret = io_madvise_prep(req, sqe);
 		break;
 	case IORING_OP_OPENAT2:
+	case IORING_OP_OPENAT2_FIXED_FILE:
 		ret = io_openat2_prep(req, sqe);
 		break;
 	case IORING_OP_EPOLL_CTL:
@@ -5135,6 +5182,7 @@ static void io_cleanup_req(struct io_kiocb *req)
 		break;
 	case IORING_OP_OPENAT:
 	case IORING_OP_OPENAT2:
+	case IORING_OP_OPENAT2_FIXED_FILE:
 		break;
 	case IORING_OP_SPLICE:
 	case IORING_OP_TEE:
@@ -5329,12 +5377,17 @@ static int io_issue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 		ret = io_madvise(req, force_nonblock);
 		break;
 	case IORING_OP_OPENAT2:
+	case IORING_OP_OPENAT2_FIXED_FILE:
 		if (sqe) {
 			ret = io_openat2_prep(req, sqe);
 			if (ret)
 				break;
 		}
-		ret = io_openat2(req, force_nonblock);
+		if (req->opcode == IORING_OP_OPENAT2) {
+			ret = io_openat2(req, force_nonblock);
+		} else {
+			ret = io_openat2_fixed_file(req, force_nonblock);
+		}
 		break;
 	case IORING_OP_EPOLL_CTL:
 		if (sqe) {
@@ -6791,6 +6844,41 @@ static int io_queue_file_removal(struct fixed_file_data *data,
 	return 0;
 }
 
+/*
+ * Add a single new file in an empty entry of the fixed file table. Does not
+ * allow overwriting an existing entry; returns -EBUSY in that case.
+ */
+static int io_sqe_files_add_new(struct io_ring_ctx *ctx,
+				u32 index,
+				struct file *file)
+{
+	struct fixed_file_table *table;
+	u32 i;
+	int err;
+
+	if (unlikely(index > ctx->nr_user_files))
+		return -EINVAL;
+	i = array_index_nospec(index, ctx->nr_user_files);
+	table = &ctx->file_data->table[i >> IORING_FILE_TABLE_SHIFT];
+	index = i & IORING_FILE_TABLE_MASK;
+	if (unlikely(table->files[index]))
+		return -EBUSY;
+	/*
+	 * Don't allow io_uring instances to be registered. If UNIX isn't
+	 * enabled, then this causes a reference cycle and this instance can
+	 * never get freed. If UNIX is enabled we'll handle it just fine, but
+	 * there's still no point in allowing a ring fd as it doesn't support
+	 * regular read/write anyway.
+	 */
+	if (unlikely(file->f_op == &io_uring_fops))
+		return -EBADF;
+	err = io_sqe_file_register(ctx, file, i);
+	if (err)
+		return err;
+	table->files[index] = file;
+	return 0;
+}
+
 static int __io_sqe_files_update(struct io_ring_ctx *ctx,
 				 struct io_uring_files_update *up,
 				 unsigned nr_args)
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 7843742b8b74..95f107e6f65e 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -54,7 +54,10 @@ struct io_uring_sqe {
 			} __attribute__((packed));
 			/* personality to use, if used */
 			__u16	personality;
-			__s32	splice_fd_in;
+			union {
+				__s32	splice_fd_in;
+				__s32	open_fixed_idx;
+			};
 		};
 		__u64	__pad2[3];
 	};
@@ -130,6 +133,7 @@ enum {
 	IORING_OP_PROVIDE_BUFFERS,
 	IORING_OP_REMOVE_BUFFERS,
 	IORING_OP_TEE,
+	IORING_OP_OPENAT2_FIXED_FILE,
 
 	/* this goes last, obviously */
 	IORING_OP_LAST,
-- 
2.28.0.rc0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [WIP PATCH] liburing: Support IORING_OP_OPENAT2_FIXED_FILE
  2020-07-14 21:08 [WIP PATCH] io_uring: Support opening a file into the fixed-file table Josh Triplett
@ 2020-07-14 21:16 ` Josh Triplett
  2020-07-14 22:59 ` [WIP PATCH] io_uring: Support opening a file into the fixed-file table Clay Harris
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 10+ messages in thread
From: Josh Triplett @ 2020-07-14 21:16 UTC (permalink / raw)
  To: Jens Axboe; +Cc: io-uring

[-- Attachment #1: Type: text/plain, Size: 2121 bytes --]

Update io_uring.h to match the corresponding kernel changes.

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
---

I've tested this, and it works. The test I'd propose is "open two files,
read bytes from both files, test if they have the expected values", to
make sure each operation operates on the right file. I've attached a
draft of that test that submits the read operations in a separate batch,
which passes.

 src/include/liburing.h          | 10 ++++++++++
 src/include/liburing/io_uring.h |  6 +++++-
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/src/include/liburing.h b/src/include/liburing.h
index 0505a4f..0352837 100644
--- a/src/include/liburing.h
+++ b/src/include/liburing.h
@@ -412,6 +412,16 @@ static inline void io_uring_prep_openat2(struct io_uring_sqe *sqe, int dfd,
 				(uint64_t) (uintptr_t) how);
 }
 
+static inline void io_uring_prep_openat2_fixed_file(struct io_uring_sqe *sqe,
+                                                    int dfd, const char *path,
+                                                    struct open_how *how,
+                                                    uint32_t index)
+{
+        io_uring_prep_rw(IORING_OP_OPENAT2_FIXED_FILE, sqe, dfd, path,
+                                sizeof(*how), (uint64_t) (uintptr_t) how);
+        sqe->open_fixed_idx = index;
+}
+
 struct epoll_event;
 static inline void io_uring_prep_epoll_ctl(struct io_uring_sqe *sqe, int epfd,
 					   int fd, int op,
diff --git a/src/include/liburing/io_uring.h b/src/include/liburing/io_uring.h
index d39b45f..0d2c41b 100644
--- a/src/include/liburing/io_uring.h
+++ b/src/include/liburing/io_uring.h
@@ -59,7 +59,10 @@ struct io_uring_sqe {
 			} __attribute__((packed));
 			/* personality to use, if used */
 			__u16	personality;
-			__s32	splice_fd_in;
+			union {
+				__s32	splice_fd_in;
+				__s32	open_fixed_idx;
+			};
 		};
 		__u64	__pad2[3];
 	};
@@ -135,6 +138,7 @@ enum {
 	IORING_OP_PROVIDE_BUFFERS,
 	IORING_OP_REMOVE_BUFFERS,
 	IORING_OP_TEE,
+	IORING_OP_OPENAT2_FIXED_FILE,
 
 	/* this goes last, obviously */
 	IORING_OP_LAST,
-- 
2.28.0.rc0


[-- Attachment #2: open-fixed-file.c --]
[-- Type: text/x-csrc, Size: 3084 bytes --]

/* SPDX-License-Identifier: MIT */
/*
 * Description: test open with fixed-file support
 *
 */
#include <errno.h>
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>
#include <fcntl.h>

#include "liburing.h"

static int create_file(const char *file, size_t size, int byte)
{
	ssize_t ret;
	char *buf;
	int fd;

	buf = malloc(size);
	memset(buf, byte, size);

	fd = open(file, O_WRONLY | O_CREAT, 0644);
	if (fd < 0) {
		perror("open file");
		return 1;
	}
	ret = write(fd, buf, size);
	close(fd);
	return ret != size;
}

static int test_openat2_fixed_file(struct io_uring *ring, const char *path1, const char *path2)
{
	struct io_uring_cqe *cqe;
	struct io_uring_sqe *sqe;
	struct open_how how = { .flags = O_RDONLY };
	int ret;
	int fds[2] = { -1, -1 };
	unsigned char byte1 = 0, byte2 = 0;
	int i;

	ret = io_uring_register_files(ring, fds, 2);

	sqe = io_uring_get_sqe(ring);
	io_uring_prep_openat2_fixed_file(sqe, -1, path1, &how, 0);
	sqe->flags |= IOSQE_IO_LINK;

	sqe = io_uring_get_sqe(ring);
	io_uring_prep_openat2_fixed_file(sqe, -1, path2, &how, 1);
	sqe->flags |= IOSQE_IO_LINK;

	ret = io_uring_submit(ring);
	if (ret != 2) {
		fprintf(stderr, "sqe submit failed: %d\n", ret);
		goto err;
	}

	for (i = 0; i < 2; i++) {
		ret = io_uring_wait_cqe(ring, &cqe);
		if (ret < 0) {
			fprintf(stderr, "wait completion %d\n", ret);
			goto err;
		}

		ret = cqe->res;
		if (ret < 0) {
			fprintf(stderr, "operation %d failed: %d\n", i, ret);
			goto err;
		}
		io_uring_cqe_seen(ring, cqe);
	}

	sqe = io_uring_get_sqe(ring);
	io_uring_prep_read(sqe, 0, &byte1, 1, 0);
	sqe->flags |= IOSQE_FIXED_FILE;

	sqe = io_uring_get_sqe(ring);
	io_uring_prep_read(sqe, 1, &byte2, 1, 0);
	sqe->flags |= IOSQE_FIXED_FILE;

	ret = io_uring_submit(ring);
	if (ret != 2) {
		fprintf(stderr, "sqe submit failed: %d\n", ret);
		goto err;
	}

	for (i = 0; i < 2; i++) {
		ret = io_uring_wait_cqe(ring, &cqe);
		if (ret < 0) {
			fprintf(stderr, "wait completion %d\n", ret);
			goto err;
		}

		ret = cqe->res;
		if (ret < 0) {
			fprintf(stderr, "operation %d failed: %d\n", i, ret);
			goto err;
		}
		io_uring_cqe_seen(ring, cqe);
	}

	if (byte1 != 0x11 || byte2 != 0x22) {
		fprintf(stderr, "bytes did not have expected values: %x %x\n",
			(unsigned)byte1, (unsigned)byte2);
		ret = -1;
		goto err;
	}

err:
	return ret;
}

int main(int argc, char *argv[])
{
	struct io_uring ring;
	const char *path1, *path2;
	int ret;

	ret = io_uring_queue_init(8, &ring, 0);
	if (ret) {
		fprintf(stderr, "ring setup failed\n");
		return 1;
	}

	path1 = "/tmp/.open.close.1";
	path2 = "/tmp/.open.close.2";

	if (create_file(path1, 4096, 0x11) || create_file(path2, 4096, 0x22)) {
		fprintf(stderr, "file create failed\n");
		return 1;
	}

	ret = test_openat2_fixed_file(&ring, path1, path2);
	if (ret < 0) {
		if (ret == -EINVAL) {
			fprintf(stdout, "openat2 not supported, skipping\n");
			goto done;
		}
		fprintf(stderr, "test_openat2 failed: %d\n", ret);
		goto err;
	}

done:
	unlink(path1);
	unlink(path2);
	return 0;
err:
	unlink(path1);
	unlink(path2);
	return 1;
}


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [WIP PATCH] io_uring: Support opening a file into the fixed-file table
  2020-07-14 21:08 [WIP PATCH] io_uring: Support opening a file into the fixed-file table Josh Triplett
  2020-07-14 21:16 ` [WIP PATCH] liburing: Support IORING_OP_OPENAT2_FIXED_FILE Josh Triplett
@ 2020-07-14 22:59 ` Clay Harris
  2020-07-15  0:42   ` Josh Triplett
  2020-07-15  2:32   ` Clay Harris
  2020-07-15 16:07 ` Jens Axboe
  2020-07-15 19:54 ` Pavel Begunkov
  3 siblings, 2 replies; 10+ messages in thread
From: Clay Harris @ 2020-07-14 22:59 UTC (permalink / raw)
  To: Josh Triplett; +Cc: Jens Axboe, io-uring

On Tue, Jul 14 2020 at 14:08:26 -0700, Josh Triplett quoth thus:

> Add a new operation IORING_OP_OPENAT2_FIXED_FILE, which opens a file
> into the fixed-file table rather than installing a file descriptor.
> Using a new operation avoids having an IOSQE flag that almost all
> operations will need to ignore; io_openat2_fixed_file also has
> substantially different control-flow than io_openat2, and it can avoid
> requiring the file table if not needed for the dirfd.
> 
> (This intentionally does not use the IOSQE_FIXED_FILE flag, because
> semantically, IOSQE_FIXED_FILE for openat2 should mean to interpret the
> dirfd as a fixed-file-table index, and that would be useful future
> behavior for both IORING_OP_OPENAT2 and IORING_OP_OPENAT2_FIXED_FILE.)
> 
> Create a new io_sqe_files_add_new function to add a single new file to
> the fixed-file table. This function returns -EBUSY if attempting to
> overwrite an existing file.
> 
> Provide a new field to pass along the fixed-file-table index for an
> open-like operation; future operations such as
> IORING_OP_ACCEPT_FIXED_FILE can use the same index.
> 
> Signed-off-by: Josh Triplett <josh@joshtriplett.org>
> ---
> 
> (Should this check for and reject open flags like O_CLOEXEC that only
> affect the file descriptor?)
> 
> I've tested this (and I'll send my liburing patch momentarily), and it
> works fine if you do the open in one batch and operate on the fixed-file
> in another batch. As discussed via Twitter, opening and operating on a
> file in the same batch will require changing other operations to obtain
> their fixed-file entries later, post-prep.
> 
> It might make sense to do and test that for one operation at a time, and
> add a .late_fixed_file flag to the operation definition for operations
> that support that.
> 
> It might also make sense to have the prep for
> IORING_OP_OPENAT2_FIXED_FILE stick an indication in the fixed-file table
> that there *will* be a file there later, perhaps an
> ERR_PTR(-EINPROGRESS), and make sure there isn't one already, to detect
> potential errors earlier and to let the prep for other operations
> confirm that there *will* be a file; on the other hand, that would mean
> there's an invalid non-NULL file pointer in the fixed file table, which
> seems potentially error-prone if any operation ever forgets that.
> 
> The other next step would be to add an IORING_OP_CLOSE_FIXED_FILE
> (separate from the existing CLOSE op) that removes an entry currently in
> the fixed file table and calls fput on it. (With some care, that
> *should* be possible even for an entry that was originally registered
> from a file descriptor.)
> 
> And finally, we should have an IORING_OP_FIXED_FILE_TO_FD operation,
> which calls get_unused_fd_flags (with specified flags to allow for
> O_CLOEXEC) and then fd_install. That allows opening a file via io_uring,
> operating on it via the ring, but then also operating on it via other
> syscalls (or inheriting it or anything else you can do with a file
> descriptor).

I'd been thinking about fixed file operations previous to your post,
so I'm happy to see this work.

I see IORING_OP_FIXED_FILE_TO_FD as a dup() function from fixed file
to process descriptor space.  It would be nice if it would take
parameters to select the functionality of dup, dup2, dup3, F_DUPFD,
and F_DUPFD_CLOEXEC.  As I recall, O_CLOFORK is on its way from
Posix-land, so I'd think there will also be something like
F_DUPFD_CLOFORK coming.

> 

It would be useful if IORING_REGISTER_xxx_UPDATE would accept a
placeholder value to ask the kernel not to mess with that index.
I think AT_FDCWD would be a good choice.

>  fs/io_uring.c                 | 90 ++++++++++++++++++++++++++++++++++-
>  include/uapi/linux/io_uring.h |  6 ++-
>  2 files changed, 94 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 9fd7e69696c3..df6f017ef8e8 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -425,6 +425,7 @@ struct io_sr_msg {
>  struct io_open {
>  	struct file			*file;
>  	int				dfd;
> +	u32				open_fixed_idx;
>  	struct filename			*filename;
>  	struct open_how			how;
>  	unsigned long			nofile;
> @@ -878,6 +879,10 @@ static const struct io_op_def io_op_defs[] = {
>  		.hash_reg_file		= 1,
>  		.unbound_nonreg_file	= 1,
>  	},
> +	[IORING_OP_OPENAT2_FIXED_FILE] = {
> +		.file_table		= 1,
> +		.needs_fs		= 1,
> +	},
>  };
>  
>  static void io_wq_submit_work(struct io_wq_work **workptr);
> @@ -886,6 +891,9 @@ static void io_put_req(struct io_kiocb *req);
>  static void __io_double_put_req(struct io_kiocb *req);
>  static struct io_kiocb *io_prep_linked_timeout(struct io_kiocb *req);
>  static void io_queue_linked_timeout(struct io_kiocb *req);
> +static int io_sqe_files_add_new(struct io_ring_ctx *ctx,
> +				u32 index,
> +				struct file *file);
>  static int __io_sqe_files_update(struct io_ring_ctx *ctx,
>  				 struct io_uring_files_update *ip,
>  				 unsigned nr_args);
> @@ -3060,10 +3068,48 @@ static int io_openat2_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
>  					len);
>  	if (ret)
>  		return ret;
> +	req->open.open_fixed_idx = READ_ONCE(sqe->open_fixed_idx);
>  
>  	return __io_openat_prep(req, sqe);
>  }
>  
> +static int io_openat2_fixed_file(struct io_kiocb *req, bool force_nonblock)
> +{
> +	struct io_open *open = &req->open;
> +	struct open_flags op;
> +	struct file *file;
> +	int ret;
> +
> +	if (force_nonblock) {
> +		/* only need file table for an actual valid fd */
> +		if (open->dfd == -1 || open->dfd == AT_FDCWD)
> +			req->flags |= REQ_F_NO_FILE_TABLE;
> +		return -EAGAIN;
> +	}
> +
> +	ret = build_open_flags(&open->how, &op);
> +	if (ret)
> +		goto err;
> +
> +	file = do_filp_open(open->dfd, open->filename, &op);
> +	if (IS_ERR(file)) {
> +		ret = PTR_ERR(file);
> +	} else {
> +		fsnotify_open(file);
> +		ret = io_sqe_files_add_new(req->ctx, open->open_fixed_idx, file);
> +		if (ret)
> +			fput(file);
> +	}
> +err:
> +	putname(open->filename);
> +	req->flags &= ~REQ_F_NEED_CLEANUP;
> +	if (ret < 0)
> +		req_set_fail_links(req);
> +	io_cqring_add_event(req, ret);
> +	io_put_req(req);
> +	return 0;
> +}
> +
>  static int io_openat2(struct io_kiocb *req, bool force_nonblock)
>  {
>  	struct open_flags op;
> @@ -5048,6 +5094,7 @@ static int io_req_defer_prep(struct io_kiocb *req,
>  		ret = io_madvise_prep(req, sqe);
>  		break;
>  	case IORING_OP_OPENAT2:
> +	case IORING_OP_OPENAT2_FIXED_FILE:
>  		ret = io_openat2_prep(req, sqe);
>  		break;
>  	case IORING_OP_EPOLL_CTL:
> @@ -5135,6 +5182,7 @@ static void io_cleanup_req(struct io_kiocb *req)
>  		break;
>  	case IORING_OP_OPENAT:
>  	case IORING_OP_OPENAT2:
> +	case IORING_OP_OPENAT2_FIXED_FILE:
>  		break;
>  	case IORING_OP_SPLICE:
>  	case IORING_OP_TEE:
> @@ -5329,12 +5377,17 @@ static int io_issue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
>  		ret = io_madvise(req, force_nonblock);
>  		break;
>  	case IORING_OP_OPENAT2:
> +	case IORING_OP_OPENAT2_FIXED_FILE:
>  		if (sqe) {
>  			ret = io_openat2_prep(req, sqe);
>  			if (ret)
>  				break;
>  		}
> -		ret = io_openat2(req, force_nonblock);
> +		if (req->opcode == IORING_OP_OPENAT2) {
> +			ret = io_openat2(req, force_nonblock);
> +		} else {
> +			ret = io_openat2_fixed_file(req, force_nonblock);
> +		}
>  		break;
>  	case IORING_OP_EPOLL_CTL:
>  		if (sqe) {
> @@ -6791,6 +6844,41 @@ static int io_queue_file_removal(struct fixed_file_data *data,
>  	return 0;
>  }
>  
> +/*
> + * Add a single new file in an empty entry of the fixed file table. Does not
> + * allow overwriting an existing entry; returns -EBUSY in that case.
> + */
> +static int io_sqe_files_add_new(struct io_ring_ctx *ctx,
> +				u32 index,
> +				struct file *file)
> +{
> +	struct fixed_file_table *table;
> +	u32 i;
> +	int err;
> +
> +	if (unlikely(index > ctx->nr_user_files))
> +		return -EINVAL;
> +	i = array_index_nospec(index, ctx->nr_user_files);
> +	table = &ctx->file_data->table[i >> IORING_FILE_TABLE_SHIFT];
> +	index = i & IORING_FILE_TABLE_MASK;
> +	if (unlikely(table->files[index]))
> +		return -EBUSY;
> +	/*
> +	 * Don't allow io_uring instances to be registered. If UNIX isn't
> +	 * enabled, then this causes a reference cycle and this instance can
> +	 * never get freed. If UNIX is enabled we'll handle it just fine, but
> +	 * there's still no point in allowing a ring fd as it doesn't support
> +	 * regular read/write anyway.
> +	 */
> +	if (unlikely(file->f_op == &io_uring_fops))
> +		return -EBADF;
> +	err = io_sqe_file_register(ctx, file, i);
> +	if (err)
> +		return err;
> +	table->files[index] = file;
> +	return 0;
> +}
> +
>  static int __io_sqe_files_update(struct io_ring_ctx *ctx,
>  				 struct io_uring_files_update *up,
>  				 unsigned nr_args)
> diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
> index 7843742b8b74..95f107e6f65e 100644
> --- a/include/uapi/linux/io_uring.h
> +++ b/include/uapi/linux/io_uring.h
> @@ -54,7 +54,10 @@ struct io_uring_sqe {
>  			} __attribute__((packed));
>  			/* personality to use, if used */
>  			__u16	personality;
> -			__s32	splice_fd_in;
> +			union {
> +				__s32	splice_fd_in;
> +				__s32	open_fixed_idx;
> +			};
>  		};
>  		__u64	__pad2[3];
>  	};
> @@ -130,6 +133,7 @@ enum {
>  	IORING_OP_PROVIDE_BUFFERS,
>  	IORING_OP_REMOVE_BUFFERS,
>  	IORING_OP_TEE,
> +	IORING_OP_OPENAT2_FIXED_FILE,
>  
>  	/* this goes last, obviously */
>  	IORING_OP_LAST,
> -- 
> 2.28.0.rc0

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [WIP PATCH] io_uring: Support opening a file into the fixed-file table
  2020-07-14 22:59 ` [WIP PATCH] io_uring: Support opening a file into the fixed-file table Clay Harris
@ 2020-07-15  0:42   ` Josh Triplett
  2020-07-15  2:32   ` Clay Harris
  1 sibling, 0 replies; 10+ messages in thread
From: Josh Triplett @ 2020-07-15  0:42 UTC (permalink / raw)
  To: Clay Harris; +Cc: Jens Axboe, io-uring

On Tue, Jul 14, 2020 at 05:59:05PM -0500, Clay Harris wrote:
> I see IORING_OP_FIXED_FILE_TO_FD as a dup() function from fixed file
> to process descriptor space.

Exactly.

> It would be nice if it would take
> parameters to select the functionality of dup, dup2, dup3, F_DUPFD,
> and F_DUPFD_CLOEXEC.  As I recall, O_CLOFORK is on its way from
> Posix-land, so I'd think there will also be something like
> F_DUPFD_CLOFORK coming.

We should certainly have any applicable file-descriptor flags, yes. And
I'd expect to have three primary modes: "give me any unused file
descriptor", "give me this exact file descriptor (closing it if open)",
and "give me this exact file descriptor (erroring if it's already
taken)".

> It would be useful if IORING_REGISTER_xxx_UPDATE would accept a
> placeholder value to ask the kernel not to mess with that index.
> I think AT_FDCWD would be a good choice.

It does accept -1 for that exact purpose.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [WIP PATCH] io_uring: Support opening a file into the fixed-file table
  2020-07-14 22:59 ` [WIP PATCH] io_uring: Support opening a file into the fixed-file table Clay Harris
  2020-07-15  0:42   ` Josh Triplett
@ 2020-07-15  2:32   ` Clay Harris
  2020-07-15 21:04     ` josh
  1 sibling, 1 reply; 10+ messages in thread
From: Clay Harris @ 2020-07-15  2:32 UTC (permalink / raw)
  To: Josh Triplett; +Cc: Jens Axboe, io-uring

On Tue, Jul 14 2020 at 17:59:05 -0500, Clay Harris quoth thus:

> On Tue, Jul 14 2020 at 14:08:26 -0700, Josh Triplett quoth thus:
> 
> > The other next step would be to add an IORING_OP_CLOSE_FIXED_FILE
> > (separate from the existing CLOSE op) that removes an entry currently in
> > the fixed file table and calls fput on it. (With some care, that
> > *should* be possible even for an entry that was originally registered
> > from a file descriptor.)

I'm curious why you wouldn't use IOSQE_FIXED_FILE here.  I understand
your reasoning for OPEN_FIXED_FILE, but using the flag with the existing
CLOSE OP seems like a perfect fit.  To me, this looks like a suboptimal
precedent that would result in defining a new opcode for every request
which could accept either a fixed file or process descriptor.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [WIP PATCH] io_uring: Support opening a file into the fixed-file table
  2020-07-14 21:08 [WIP PATCH] io_uring: Support opening a file into the fixed-file table Josh Triplett
  2020-07-14 21:16 ` [WIP PATCH] liburing: Support IORING_OP_OPENAT2_FIXED_FILE Josh Triplett
  2020-07-14 22:59 ` [WIP PATCH] io_uring: Support opening a file into the fixed-file table Clay Harris
@ 2020-07-15 16:07 ` Jens Axboe
  2020-07-15 19:54 ` Pavel Begunkov
  3 siblings, 0 replies; 10+ messages in thread
From: Jens Axboe @ 2020-07-15 16:07 UTC (permalink / raw)
  To: Josh Triplett; +Cc: io-uring

On 7/14/20 3:08 PM, Josh Triplett wrote:
> Add a new operation IORING_OP_OPENAT2_FIXED_FILE, which opens a file
> into the fixed-file table rather than installing a file descriptor.
> Using a new operation avoids having an IOSQE flag that almost all
> operations will need to ignore; io_openat2_fixed_file also has
> substantially different control-flow than io_openat2, and it can avoid
> requiring the file table if not needed for the dirfd.
> 
> (This intentionally does not use the IOSQE_FIXED_FILE flag, because
> semantically, IOSQE_FIXED_FILE for openat2 should mean to interpret the
> dirfd as a fixed-file-table index, and that would be useful future
> behavior for both IORING_OP_OPENAT2 and IORING_OP_OPENAT2_FIXED_FILE.)
> 
> Create a new io_sqe_files_add_new function to add a single new file to
> the fixed-file table. This function returns -EBUSY if attempting to
> overwrite an existing file.
> 
> Provide a new field to pass along the fixed-file-table index for an
> open-like operation; future operations such as
> IORING_OP_ACCEPT_FIXED_FILE can use the same index.

I like this, I think it's really nifty! Private fds are fast fds, and
not only does this allow links to propagate the fds nicely, it also
enables you go avoid the expensive fget/fput for system calls if you
stay within the realm of io_uring for the requests that you are doing.

We do need to preface this with a cleanup that moves the file assignment
out of the prep side of the op handling and into the main part of it
instead. That'll fix those issues associated with needing to do two
bundles in your test case, it could all just be linked at that point.

Some of this is repeats of what we discussed outside of the list emails,
repeating it here for the general audience as well.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [WIP PATCH] io_uring: Support opening a file into the fixed-file table
  2020-07-14 21:08 [WIP PATCH] io_uring: Support opening a file into the fixed-file table Josh Triplett
                   ` (2 preceding siblings ...)
  2020-07-15 16:07 ` Jens Axboe
@ 2020-07-15 19:54 ` Pavel Begunkov
  2020-07-15 20:46   ` Josh Triplett
  3 siblings, 1 reply; 10+ messages in thread
From: Pavel Begunkov @ 2020-07-15 19:54 UTC (permalink / raw)
  To: Josh Triplett, Jens Axboe; +Cc: io-uring

On 15/07/2020 00:08, Josh Triplett wrote:
> Add a new operation IORING_OP_OPENAT2_FIXED_FILE, which opens a file
> into the fixed-file table rather than installing a file descriptor.
> Using a new operation avoids having an IOSQE flag that almost all
> operations will need to ignore; io_openat2_fixed_file also has
> substantially different control-flow than io_openat2, and it can avoid
> requiring the file table if not needed for the dirfd.
> 
> (This intentionally does not use the IOSQE_FIXED_FILE flag, because
> semantically, IOSQE_FIXED_FILE for openat2 should mean to interpret the
> dirfd as a fixed-file-table index, and that would be useful future
> behavior for both IORING_OP_OPENAT2 and IORING_OP_OPENAT2_FIXED_FILE.)
> 
> Create a new io_sqe_files_add_new function to add a single new file to
> the fixed-file table. This function returns -EBUSY if attempting to
> overwrite an existing file.
> 
> Provide a new field to pass along the fixed-file-table index for an
> open-like operation; future operations such as
> IORING_OP_ACCEPT_FIXED_FILE can use the same index.
> 
> Signed-off-by: Josh Triplett <josh@joshtriplett.org>
> ---
> 
> (Should this check for and reject open flags like O_CLOEXEC that only
> affect the file descriptor?)
> 
> I've tested this (and I'll send my liburing patch momentarily), and it
> works fine if you do the open in one batch and operate on the fixed-file
> in another batch. As discussed via Twitter, opening and operating on a
> file in the same batch will require changing other operations to obtain
> their fixed-file entries later, post-prep.
> 
> It might make sense to do and test that for one operation at a time, and
> add a .late_fixed_file flag to the operation definition for operations
> that support that.
> 
> It might also make sense to have the prep for
> IORING_OP_OPENAT2_FIXED_FILE stick an indication in the fixed-file table
> that there *will* be a file there later, perhaps an
> ERR_PTR(-EINPROGRESS), and make sure there isn't one already, to detect
> potential errors earlier and to let the prep for other operations
> confirm that there *will* be a file; on the other hand, that would mean
> there's an invalid non-NULL file pointer in the fixed file table, which
> seems potentially error-prone if any operation ever forgets that.
> 
> The other next step would be to add an IORING_OP_CLOSE_FIXED_FILE
> (separate from the existing CLOSE op) that removes an entry currently in
> the fixed file table and calls fput on it. (With some care, that
> *should* be possible even for an entry that was originally registered
> from a file descriptor.)
> 
> And finally, we should have an IORING_OP_FIXED_FILE_TO_FD operation,
> which calls get_unused_fd_flags (with specified flags to allow for
> O_CLOEXEC) and then fd_install. That allows opening a file via io_uring,
> operating on it via the ring, but then also operating on it via other
> syscalls (or inheriting it or anything else you can do with a file
> descriptor).
> 
>  fs/io_uring.c                 | 90 ++++++++++++++++++++++++++++++++++-
>  include/uapi/linux/io_uring.h |  6 ++-
>  2 files changed, 94 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/io_uring.c b/fs/io_uring.c
> index 9fd7e69696c3..df6f017ef8e8 100644
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -425,6 +425,7 @@ struct io_sr_msg {
>  struct io_open {
>  	struct file			*file;
>  	int				dfd;
> +	u32				open_fixed_idx;
>  	struct filename			*filename;
>  	struct open_how			how;
>  	unsigned long			nofile;
> @@ -878,6 +879,10 @@ static const struct io_op_def io_op_defs[] = {
>  		.hash_reg_file		= 1,
>  		.unbound_nonreg_file	= 1,
>  	},
> +	[IORING_OP_OPENAT2_FIXED_FILE] = {
> +		.file_table		= 1,
> +		.needs_fs		= 1,
> +	},
>  };
>  
>  static void io_wq_submit_work(struct io_wq_work **workptr);
> @@ -886,6 +891,9 @@ static void io_put_req(struct io_kiocb *req);
>  static void __io_double_put_req(struct io_kiocb *req);
>  static struct io_kiocb *io_prep_linked_timeout(struct io_kiocb *req);
>  static void io_queue_linked_timeout(struct io_kiocb *req);
> +static int io_sqe_files_add_new(struct io_ring_ctx *ctx,
> +				u32 index,
> +				struct file *file);
>  static int __io_sqe_files_update(struct io_ring_ctx *ctx,
>  				 struct io_uring_files_update *ip,
>  				 unsigned nr_args);
> @@ -3060,10 +3068,48 @@ static int io_openat2_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
>  					len);
>  	if (ret)
>  		return ret;
> +	req->open.open_fixed_idx = READ_ONCE(sqe->open_fixed_idx);
>  
>  	return __io_openat_prep(req, sqe);
>  }
>  
> +static int io_openat2_fixed_file(struct io_kiocb *req, bool force_nonblock)
> +{

How about having it in io_openat2()? There are almost identical, that would be
just a couple of if's.

> +	struct io_open *open = &req->open;
> +	struct open_flags op;
> +	struct file *file;
> +	int ret;
> +
> +	if (force_nonblock) {
> +		/* only need file table for an actual valid fd */
> +		if (open->dfd == -1 || open->dfd == AT_FDCWD)
> +			req->flags |= REQ_F_NO_FILE_TABLE;
> +		return -EAGAIN;
> +	}
> +
> +	ret = build_open_flags(&open->how, &op);
> +	if (ret)
> +		goto err;
> +
> +	file = do_filp_open(open->dfd, open->filename, &op);
> +	if (IS_ERR(file)) {
> +		ret = PTR_ERR(file);
> +	} else {
> +		fsnotify_open(file);
> +		ret = io_sqe_files_add_new(req->ctx, open->open_fixed_idx, file);
> +		if (ret)
> +			fput(file);
> +	}
> +err:
> +	putname(open->filename);
> +	req->flags &= ~REQ_F_NEED_CLEANUP;
> +	if (ret < 0)
> +		req_set_fail_links(req);
> +	io_cqring_add_event(req, ret);
> +	io_put_req(req);

These 2 lines are better to be replace with (since 5.9):

io_req_complete(req, ret);

> +	return 0;
> +}
> +
>  static int io_openat2(struct io_kiocb *req, bool force_nonblock)
>  {
>  	struct open_flags op;
> @@ -5048,6 +5094,7 @@ static int io_req_defer_prep(struct io_kiocb *req,
>  		ret = io_madvise_prep(req, sqe);
>  		break;
>  	case IORING_OP_OPENAT2:
> +	case IORING_OP_OPENAT2_FIXED_FILE:
>  		ret = io_openat2_prep(req, sqe);
>  		break;
>  	case IORING_OP_EPOLL_CTL:
> @@ -5135,6 +5182,7 @@ static void io_cleanup_req(struct io_kiocb *req)
>  		break;
>  	case IORING_OP_OPENAT:
>  	case IORING_OP_OPENAT2:
> +	case IORING_OP_OPENAT2_FIXED_FILE:

These OPENAT cases weren't doing anything, so were killed,
as should be this line.

>  		break;
>  	case IORING_OP_SPLICE:
>  	case IORING_OP_TEE:
> @@ -5329,12 +5377,17 @@ static int io_issue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
>  		ret = io_madvise(req, force_nonblock);
>  		break;
>  	case IORING_OP_OPENAT2:
> +	case IORING_OP_OPENAT2_FIXED_FILE:
>  		if (sqe) {
>  			ret = io_openat2_prep(req, sqe);
>  			if (ret)
>  				break;
>  		}
> -		ret = io_openat2(req, force_nonblock);
> +		if (req->opcode == IORING_OP_OPENAT2) {
> +			ret = io_openat2(req, force_nonblock);
> +		} else {
> +			ret = io_openat2_fixed_file(req, force_nonblock);
> +		}

We don't need all these brackets for one liners

>  		break;
>  	case IORING_OP_EPOLL_CTL:
>  		if (sqe) {
> @@ -6791,6 +6844,41 @@ static int io_queue_file_removal(struct fixed_file_data *data,
>  	return 0;
>  }
>  
> +/*
> + * Add a single new file in an empty entry of the fixed file table. Does not
> + * allow overwriting an existing entry; returns -EBUSY in that case.
> + */
> +static int io_sqe_files_add_new(struct io_ring_ctx *ctx,
> +				u32 index,
> +				struct file *file)
> +{
> +	struct fixed_file_table *table;
> +	u32 i;
> +	int err;
> +
> +	if (unlikely(index > ctx->nr_user_files))
> +		return -EINVAL;
> +	i = array_index_nospec(index, ctx->nr_user_files);
> +	table = &ctx->file_data->table[i >> IORING_FILE_TABLE_SHIFT];
> +	index = i & IORING_FILE_TABLE_MASK;
> +	if (unlikely(table->files[index]))
> +		return -EBUSY;
> +	/*
> +	 * Don't allow io_uring instances to be registered. If UNIX isn't
> +	 * enabled, then this causes a reference cycle and this instance can
> +	 * never get freed. If UNIX is enabled we'll handle it just fine, but
> +	 * there's still no point in allowing a ring fd as it doesn't support
> +	 * regular read/write anyway.
> +	 */
> +	if (unlikely(file->f_op == &io_uring_fops))
> +		return -EBADF;
> +	err = io_sqe_file_register(ctx, file, i);
> +	if (err)
> +		return err;
> +	table->files[index] = file;
> +	return 0;
> +}
> +
>  static int __io_sqe_files_update(struct io_ring_ctx *ctx,
>  				 struct io_uring_files_update *up,
>  				 unsigned nr_args)
> diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
> index 7843742b8b74..95f107e6f65e 100644
> --- a/include/uapi/linux/io_uring.h
> +++ b/include/uapi/linux/io_uring.h
> @@ -54,7 +54,10 @@ struct io_uring_sqe {
>  			} __attribute__((packed));
>  			/* personality to use, if used */
>  			__u16	personality;
> -			__s32	splice_fd_in;
> +			union {
> +				__s32	splice_fd_in;
> +				__s32	open_fixed_idx;
> +			};
>  		};
>  		__u64	__pad2[3];
>  	};
> @@ -130,6 +133,7 @@ enum {
>  	IORING_OP_PROVIDE_BUFFERS,
>  	IORING_OP_REMOVE_BUFFERS,
>  	IORING_OP_TEE,
> +	IORING_OP_OPENAT2_FIXED_FILE,

I think, it's better to reuse IORING_OP_OPENAT2.
E.g. fixed version if "open_fixed_idx != 0" or something similar.

>  
>  	/* this goes last, obviously */
>  	IORING_OP_LAST,
> 

-- 
Pavel Begunkov

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [WIP PATCH] io_uring: Support opening a file into the fixed-file table
  2020-07-15 19:54 ` Pavel Begunkov
@ 2020-07-15 20:46   ` Josh Triplett
  2020-07-15 20:54     ` Jens Axboe
  0 siblings, 1 reply; 10+ messages in thread
From: Josh Triplett @ 2020-07-15 20:46 UTC (permalink / raw)
  To: Pavel Begunkov; +Cc: Jens Axboe, io-uring

On Wed, Jul 15, 2020 at 10:54:21PM +0300, Pavel Begunkov wrote:
> On 15/07/2020 00:08, Josh Triplett wrote:
> > +static int io_openat2_fixed_file(struct io_kiocb *req, bool force_nonblock)
> > +{
> 
> How about having it in io_openat2()? There are almost identical, that would be
> just a couple of if's.

It's a little more complex than that. It would require making three
pieces of io_openat2 conditional, and the control flow would be
intertwined with whether the file descriptor (from get_unused_fd_flags)
has been initialized. It's absolutely *doable*, but the resulting
complexity didn't seem worth it.

> > +	struct io_open *open = &req->open;
> > +	struct open_flags op;
> > +	struct file *file;
> > +	int ret;
> > +
> > +	if (force_nonblock) {
> > +		/* only need file table for an actual valid fd */
> > +		if (open->dfd == -1 || open->dfd == AT_FDCWD)
> > +			req->flags |= REQ_F_NO_FILE_TABLE;
> > +		return -EAGAIN;
> > +	}
> > +
> > +	ret = build_open_flags(&open->how, &op);
> > +	if (ret)
> > +		goto err;
> > +
> > +	file = do_filp_open(open->dfd, open->filename, &op);
> > +	if (IS_ERR(file)) {
> > +		ret = PTR_ERR(file);
> > +	} else {
> > +		fsnotify_open(file);
> > +		ret = io_sqe_files_add_new(req->ctx, open->open_fixed_idx, file);
> > +		if (ret)
> > +			fput(file);
> > +	}
> > +err:
> > +	putname(open->filename);
> > +	req->flags &= ~REQ_F_NEED_CLEANUP;
> > +	if (ret < 0)
> > +		req_set_fail_links(req);
> > +	io_cqring_add_event(req, ret);
> > +	io_put_req(req);
> 
> These 2 lines are better to be replace with (since 5.9):
> 
> io_req_complete(req, ret);

This was directly copied from the same code in io_openat2.

> > +	return 0;
> > +}
> > +
> >  static int io_openat2(struct io_kiocb *req, bool force_nonblock)
> >  {
> >  	struct open_flags op;
> > @@ -5048,6 +5094,7 @@ static int io_req_defer_prep(struct io_kiocb *req,
> >  		ret = io_madvise_prep(req, sqe);
> >  		break;
> >  	case IORING_OP_OPENAT2:
> > +	case IORING_OP_OPENAT2_FIXED_FILE:
> >  		ret = io_openat2_prep(req, sqe);
> >  		break;
> >  	case IORING_OP_EPOLL_CTL:
> > @@ -5135,6 +5182,7 @@ static void io_cleanup_req(struct io_kiocb *req)
> >  		break;
> >  	case IORING_OP_OPENAT:
> >  	case IORING_OP_OPENAT2:
> > +	case IORING_OP_OPENAT2_FIXED_FILE:
> 
> These OPENAT cases weren't doing anything, so were killed,
> as should be this line.

Makes sense.

> >  		break;
> >  	case IORING_OP_SPLICE:
> >  	case IORING_OP_TEE:
> > @@ -5329,12 +5377,17 @@ static int io_issue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
> >  		ret = io_madvise(req, force_nonblock);
> >  		break;
> >  	case IORING_OP_OPENAT2:
> > +	case IORING_OP_OPENAT2_FIXED_FILE:
> >  		if (sqe) {
> >  			ret = io_openat2_prep(req, sqe);
> >  			if (ret)
> >  				break;
> >  		}
> > -		ret = io_openat2(req, force_nonblock);
> > +		if (req->opcode == IORING_OP_OPENAT2) {
> > +			ret = io_openat2(req, force_nonblock);
> > +		} else {
> > +			ret = io_openat2_fixed_file(req, force_nonblock);
> > +		}
> 
> We don't need all these brackets for one liners

Habit; most codebases I work on *always* include the braces.

> > --- a/include/uapi/linux/io_uring.h
> > +++ b/include/uapi/linux/io_uring.h
> > @@ -54,7 +54,10 @@ struct io_uring_sqe {
> >  			} __attribute__((packed));
> >  			/* personality to use, if used */
> >  			__u16	personality;
> > -			__s32	splice_fd_in;
> > +			union {
> > +				__s32	splice_fd_in;
> > +				__s32	open_fixed_idx;
> > +			};
> >  		};
> >  		__u64	__pad2[3];
> >  	};
> > @@ -130,6 +133,7 @@ enum {
> >  	IORING_OP_PROVIDE_BUFFERS,
> >  	IORING_OP_REMOVE_BUFFERS,
> >  	IORING_OP_TEE,
> > +	IORING_OP_OPENAT2_FIXED_FILE,
> 
> I think, it's better to reuse IORING_OP_OPENAT2.
> E.g. fixed version if "open_fixed_idx != 0" or something similar.

As far as I can tell, nothing in today's handling of IORING_OP_OPENAT2
will give an EINVAL for splice_fd_in!=0. It might be possible to arrange
for it to be the same op via a new flag or something else that the
current IORING_OP_OPENAT2 wil reject (and if it's a flag or similar, all
other operations that don't involve opening a file would also need to
reject it).

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [WIP PATCH] io_uring: Support opening a file into the fixed-file table
  2020-07-15 20:46   ` Josh Triplett
@ 2020-07-15 20:54     ` Jens Axboe
  0 siblings, 0 replies; 10+ messages in thread
From: Jens Axboe @ 2020-07-15 20:54 UTC (permalink / raw)
  To: Josh Triplett, Pavel Begunkov; +Cc: io-uring

On 7/15/20 2:46 PM, Josh Triplett wrote:
>>> +	file = do_filp_open(open->dfd, open->filename, &op);
>>> +	if (IS_ERR(file)) {
>>> +		ret = PTR_ERR(file);
>>> +	} else {
>>> +		fsnotify_open(file);
>>> +		ret = io_sqe_files_add_new(req->ctx, open->open_fixed_idx, file);
>>> +		if (ret)
>>> +			fput(file);
>>> +	}
>>> +err:
>>> +	putname(open->filename);
>>> +	req->flags &= ~REQ_F_NEED_CLEANUP;
>>> +	if (ret < 0)
>>> +		req_set_fail_links(req);
>>> +	io_cqring_add_event(req, ret);
>>> +	io_put_req(req);
>>
>> These 2 lines are better to be replace with (since 5.9):
>>
>> io_req_complete(req, ret);
> 
> This was directly copied from the same code in io_openat2.

You're probably using current -git or something like that, the patch
would be best against for-5.9/io_uring - that's what's queued up for
5.9, and it does use io_req_complete() consistently throughout.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [WIP PATCH] io_uring: Support opening a file into the fixed-file table
  2020-07-15  2:32   ` Clay Harris
@ 2020-07-15 21:04     ` josh
  0 siblings, 0 replies; 10+ messages in thread
From: josh @ 2020-07-15 21:04 UTC (permalink / raw)
  To: bugs; +Cc: axboe, io-uring, josh

Clay Harris wrote:
> On Tue, Jul 14 2020 at 14:08:26 -0700, Josh Triplett quoth thus:
> > The other next step would be to add an IORING_OP_CLOSE_FIXED_FILE
> > (separate from the existing CLOSE op) that removes an entry currently in
> > the fixed file table and calls fput on it. (With some care, that
> > *should* be possible even for an entry that was originally registered
> > from a file descriptor.)
>
> I'm curious why you wouldn't use IOSQE_FIXED_FILE here.  I understand
> your reasoning for OPEN_FIXED_FILE, but using the flag with the existing
> CLOSE OP seems like a perfect fit.  To me, this looks like a suboptimal
> precedent that would result in defining a new opcode for every request
> which could accept either a fixed file or process descriptor.

We absolutely could use IORING_OP_CLOSE with IOSQE_FIXED_FILE as the
userspace interface. It would take a bit of refactoring to make that
work, since unlike other cases of IOSQE_FIXED_FILE we don't just want a
struct file in both cases and instead we need the file descriptor (to
close it) in the non-fixed-file case, but I agree that that would make
sense as the userspace interface.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2020-07-15 21:04 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-14 21:08 [WIP PATCH] io_uring: Support opening a file into the fixed-file table Josh Triplett
2020-07-14 21:16 ` [WIP PATCH] liburing: Support IORING_OP_OPENAT2_FIXED_FILE Josh Triplett
2020-07-14 22:59 ` [WIP PATCH] io_uring: Support opening a file into the fixed-file table Clay Harris
2020-07-15  0:42   ` Josh Triplett
2020-07-15  2:32   ` Clay Harris
2020-07-15 21:04     ` josh
2020-07-15 16:07 ` Jens Axboe
2020-07-15 19:54 ` Pavel Begunkov
2020-07-15 20:46   ` Josh Triplett
2020-07-15 20:54     ` Jens Axboe

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.