All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jens Axboe <axboe@kernel.dk>
To: Jann Horn <jannh@google.com>
Cc: linux-block@vger.kernel.org,
	"David S. Miller" <davem@davemloft.net>,
	Network Development <netdev@vger.kernel.org>
Subject: Re: [PATCH 1/3] io_uring: add support for async work inheriting files table
Date: Thu, 24 Oct 2019 18:52:34 -0600	[thread overview]
Message-ID: <4a920143-d899-8811-a767-d114dba1e4e3@kernel.dk> (raw)
In-Reply-To: <947c74b9-e828-e190-19fc-449c72a20798@kernel.dk>

On 10/24/19 6:35 PM, Jens Axboe wrote:
> On 10/24/19 5:13 PM, Jann Horn wrote:
>> On Fri, Oct 25, 2019 at 12:04 AM Jens Axboe <axboe@kernel.dk> wrote:
>>> On 10/24/19 2:31 PM, Jann Horn wrote:
>>>> On Thu, Oct 24, 2019 at 9:41 PM Jens Axboe <axboe@kernel.dk> wrote:
>>>>> On 10/18/19 12:50 PM, Jann Horn wrote:
>>>>>> On Fri, Oct 18, 2019 at 8:16 PM Jens Axboe <axboe@kernel.dk> wrote:
>>>>>>> On 10/18/19 12:06 PM, Jann Horn wrote:
>>>>>>>> But actually, by the way: Is this whole files_struct thing creating a
>>>>>>>> reference loop? The files_struct has a reference to the uring file,
>>>>>>>> and the uring file has ACCEPT work that has a reference to the
>>>>>>>> files_struct. If the task gets killed and the accept work blocks, the
>>>>>>>> entire files_struct will stay alive, right?
>>>>>>>
>>>>>>> Yes, for the lifetime of the request, it does create a loop. So if the
>>>>>>> application goes away, I think you're right, the files_struct will stay.
>>>>>>> And so will the io_uring, for that matter, as we depend on the closing
>>>>>>> of the files to do the final reap.
>>>>>>>
>>>>>>> Hmm, not sure how best to handle that, to be honest. We need some way to
>>>>>>> break the loop, if the request never finishes.
>>>>>>
>>>>>> A wacky and dubious approach would be to, instead of taking a
>>>>>> reference to the files_struct, abuse f_op->flush() to synchronously
>>>>>> flush out pending requests with references to the files_struct... But
>>>>>> it's probably a bad idea, given that in f_op->flush(), you can't
>>>>>> easily tell which files_struct the close is coming from. I suppose you
>>>>>> could keep a list of (fdtable, fd) pairs through which ACCEPT requests
>>>>>> have come in and then let f_op->flush() probe whether the file
>>>>>> pointers are gone from them...
>>>>>
>>>>> Got back to this after finishing the io-wq stuff, which we need for the
>>>>> cancel.
>>>>>
>>>>> Here's an updated patch:
>>>>>
>>>>> http://git.kernel.dk/cgit/linux-block/commit/?h=for-5.5/io_uring-test&id=1ea847edc58d6a54ca53001ad0c656da57257570
>>>>>
>>>>> that seems to work for me (lightly tested), we correctly find and cancel
>>>>> work that is holding on to the file table.
>>>>>
>>>>> The full series sits on top of my for-5.5/io_uring-wq branch, and can be
>>>>> viewed here:
>>>>>
>>>>> http://git.kernel.dk/cgit/linux-block/log/?h=for-5.5/io_uring-test
>>>>>
>>>>> Let me know what you think!
>>>>
>>>> Ah, I didn't realize that the second argument to f_op->flush is a
>>>> pointer to the files_struct. That's neat.
>>>>
>>>>
>>>> Security: There is no guarantee that ->flush() will run after the last
>>>> io_uring_enter() finishes. You can race like this, with threads A and
>>>> B in one process and C in another one:
>>>>
>>>> A: sends uring fd to C via unix domain socket
>>>> A: starts syscall io_uring_enter(fd, ...)
>>>> A: calls fdget(fd), takes reference to file
>>>> B: starts syscall close(fd)
>>>> B: fd table entry is removed
>>>> B: f_op->flush is invoked and finds no pending transactions
>>>> B: syscall close() returns
>>>> A: continues io_uring_enter(), grabbing current->files
>>>> A: io_uring_enter() returns
>>>> A and B: exit
>>>> worker: use-after-free access to files_struct
>>>>
>>>> I think the solution to this would be (unless you're fine with adding
>>>> some broad global read-write mutex) something like this in
>>>> __io_queue_sqe(), where "fd" and "f" are the variables from
>>>> io_uring_enter(), plumbed through the stack somehow:
>>>>
>>>> if (req->flags & REQ_F_NEED_FILES) {
>>>>      rcu_read_lock();
>>>>      spin_lock_irq(&ctx->inflight_lock);
>>>>      if (fcheck(fd) == f) {
>>>>        list_add(&req->inflight_list,
>>>>          &ctx->inflight_list);
>>>>        req->work.files = current->files;
>>>>        ret = 0;
>>>>      } else {
>>>>        ret = -EBADF;
>>>>      }
>>>>      spin_unlock_irq(&ctx->inflight_lock);
>>>>      rcu_read_unlock();
>>>>      if (ret)
>>>>        goto put_req;
>>>> }
>>>
>>> First of all, thanks for the thorough look at this! We already have f
>>> available here, it's req->file. And we just made a copy of the sqe, so
>>> we have sqe->fd available as well. I fixed this up.
>>
>> sqe->fd is the file descriptor we're doing I/O on, not the file
>> descriptor of the uring file, right? Same thing for req->file. This
>> check only detects whether the fd we're doing I/O on was closed, which
>> is irrelevant.
> 
> Duh yes, I'm an idiot. Easily fixable, I'll update this for the ring fd.

Incremental:

diff --git a/fs/io_uring.c b/fs/io_uring.c
index ec9dadfa90d2..4d94886a3d13 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -262,11 +262,13 @@ struct io_ring_ctx {
 
 struct sqe_submit {
 	const struct io_uring_sqe	*sqe;
+	struct file			*ring_file;
 	unsigned short			index;
 	bool				has_user : 1;
 	bool				in_async : 1;
 	bool				needs_fixed_file : 1;
 	u32				sequence;
+	int				ring_fd;
 };
 
 /*
@@ -2329,14 +2331,13 @@ static int io_req_set_file(struct io_ring_ctx *ctx, const struct sqe_submit *s,
 	return 0;
 }
 
-static int io_grab_files(struct io_ring_ctx *ctx, struct io_kiocb *req,
-			 struct io_uring_sqe *sqe)
+static int io_grab_files(struct io_ring_ctx *ctx, struct io_kiocb *req)
 {
 	int ret = -EBADF;
 
 	rcu_read_lock();
 	spin_lock_irq(&ctx->inflight_lock);
-	if (fcheck(sqe->fd) == req->file) {
+	if (fcheck(req->submit.ring_fd) == req->submit.ring_file) {
 		list_add(&req->inflight_entry, &ctx->inflight_list);
 		req->work.files = current->files;
 		ret = 0;
@@ -2367,7 +2368,7 @@ static int __io_queue_sqe(struct io_ring_ctx *ctx, struct io_kiocb *req,
 			s->sqe = sqe_copy;
 			memcpy(&req->submit, s, sizeof(*s));
 			if (req->flags & REQ_F_NEED_FILES) {
-				ret = io_grab_files(ctx, req, sqe_copy);
+				ret = io_grab_files(ctx, req);
 				if (ret) {
 					kfree(sqe_copy);
 					goto err;
@@ -2585,6 +2586,7 @@ static bool io_get_sqring(struct io_ring_ctx *ctx, struct sqe_submit *s)
 
 	head = READ_ONCE(sq_array[head & ctx->sq_mask]);
 	if (head < ctx->sq_entries) {
+		s->ring_file = NULL;
 		s->index = head;
 		s->sqe = &ctx->sq_sqes[head];
 		s->sequence = ctx->cached_sq_head;
@@ -2782,7 +2784,8 @@ static int io_sq_thread(void *data)
 	return 0;
 }
 
-static int io_ring_submit(struct io_ring_ctx *ctx, unsigned int to_submit)
+static int io_ring_submit(struct io_ring_ctx *ctx, unsigned int to_submit,
+			  struct file *ring_file, int ring_fd)
 {
 	struct io_submit_state state, *statep = NULL;
 	struct io_kiocb *link = NULL;
@@ -2824,9 +2827,11 @@ static int io_ring_submit(struct io_ring_ctx *ctx, unsigned int to_submit)
 		}
 
 out:
+		s.ring_file = ring_file;
 		s.has_user = true;
 		s.in_async = false;
 		s.needs_fixed_file = false;
+		s.ring_fd = ring_fd;
 		submit++;
 		trace_io_uring_submit_sqe(ctx, true, false);
 		io_submit_sqe(ctx, &s, statep, &link);
@@ -3828,10 +3833,9 @@ static int io_uring_flush(struct file *file, void *data)
 {
 	struct io_ring_ctx *ctx = file->private_data;
 
+	io_uring_cancel_files(ctx, data);
 	if (fatal_signal_pending(current) || (current->flags & PF_EXITING))
 		io_wq_cancel_all(ctx->io_wq);
-	else
-		io_uring_cancel_files(ctx, data);
 	return 0;
 }
 
@@ -3903,7 +3907,7 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
 		to_submit = min(to_submit, ctx->sq_entries);
 
 		mutex_lock(&ctx->uring_lock);
-		submitted = io_ring_submit(ctx, to_submit);
+		submitted = io_ring_submit(ctx, to_submit, f.file, fd);
 		mutex_unlock(&ctx->uring_lock);
 	}
 	if (flags & IORING_ENTER_GETEVENTS) {

-- 
Jens Axboe


  reply	other threads:[~2019-10-25  0:52 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-17 21:28 [PATCHSET] io_uring: add support for accept(4) Jens Axboe
2019-10-17 21:28 ` [PATCH 1/3] io_uring: add support for async work inheriting files table Jens Axboe
2019-10-18  2:41   ` Jann Horn
2019-10-18 14:01     ` Jens Axboe
2019-10-18 14:34       ` Jann Horn
2019-10-18 14:37         ` Jens Axboe
2019-10-18 14:40           ` Jann Horn
2019-10-18 14:43             ` Jens Axboe
2019-10-18 14:52               ` Jann Horn
2019-10-18 15:00                 ` Jens Axboe
2019-10-18 15:54                   ` Jens Axboe
2019-10-18 16:20                     ` Jann Horn
2019-10-18 16:36                       ` Jens Axboe
2019-10-18 17:05                         ` Jens Axboe
2019-10-18 18:06                           ` Jann Horn
2019-10-18 18:16                             ` Jens Axboe
2019-10-18 18:50                               ` Jann Horn
2019-10-24 19:41                                 ` Jens Axboe
2019-10-24 20:31                                   ` Jann Horn
2019-10-24 22:04                                     ` Jens Axboe
2019-10-24 22:09                                       ` Jens Axboe
2019-10-24 23:13                                       ` Jann Horn
2019-10-25  0:35                                         ` Jens Axboe
2019-10-25  0:52                                           ` Jens Axboe [this message]
2019-10-23 12:04   ` Wolfgang Bumiller
2019-10-23 14:11     ` Jens Axboe
2019-10-17 21:28 ` [PATCH 2/3] net: add __sys_accept4_file() helper Jens Axboe
2019-10-17 21:28 ` [PATCH 3/3] io_uring: add support for IORING_OP_ACCEPT Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4a920143-d899-8811-a767-d114dba1e4e3@kernel.dk \
    --to=axboe@kernel.dk \
    --cc=davem@davemloft.net \
    --cc=jannh@google.com \
    --cc=linux-block@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.