From: Jann Horn <jannh@google.com>
To: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org,
"David S. Miller" <davem@davemloft.net>,
Network Development <netdev@vger.kernel.org>
Subject: Re: [PATCH 1/3] io_uring: add support for async work inheriting files table
Date: Thu, 24 Oct 2019 22:31:07 +0200
Message-ID: <CAG48ez0K_wtHA4DSWjz4TjohHkMTGo2pTpDVMZPQWD2gtrqZJw@mail.gmail.com>
In-Reply-To: <c3fb07d4-223c-8835-5c22-68367e957a4f@kernel.dk>
On Thu, Oct 24, 2019 at 9:41 PM Jens Axboe <axboe@kernel.dk> wrote:
> On 10/18/19 12:50 PM, Jann Horn wrote:
> > On Fri, Oct 18, 2019 at 8:16 PM Jens Axboe <axboe@kernel.dk> wrote:
> >> On 10/18/19 12:06 PM, Jann Horn wrote:
> >>> But actually, by the way: Is this whole files_struct thing creating a
> >>> reference loop? The files_struct has a reference to the uring file,
> >>> and the uring file has ACCEPT work that has a reference to the
> >>> files_struct. If the task gets killed and the accept work blocks, the
> >>> entire files_struct will stay alive, right?
> >>
> >> Yes, for the lifetime of the request, it does create a loop. So if the
> >> application goes away, I think you're right, the files_struct will stay.
> >> And so will the io_uring, for that matter, as we depend on the closing
> >> of the files to do the final reap.
> >>
> >> Hmm, not sure how best to handle that, to be honest. We need some way to
> >> break the loop, if the request never finishes.
> >
> > A wacky and dubious approach would be to, instead of taking a
> > reference to the files_struct, abuse f_op->flush() to synchronously
> > flush out pending requests with references to the files_struct... But
> > it's probably a bad idea, given that in f_op->flush(), you can't
> > easily tell which files_struct the close is coming from. I suppose you
> > could keep a list of (fdtable, fd) pairs through which ACCEPT requests
> > have come in and then let f_op->flush() probe whether the file
> > pointers are gone from them...
>
> Got back to this after finishing the io-wq stuff, which we need for the
> cancel.
>
> Here's an updated patch:
>
> http://git.kernel.dk/cgit/linux-block/commit/?h=for-5.5/io_uring-test&id=1ea847edc58d6a54ca53001ad0c656da57257570
>
> that seems to work for me (lightly tested), we correctly find and cancel
> work that is holding on to the file table.
>
> The full series sits on top of my for-5.5/io_uring-wq branch, and can be
> viewed here:
>
> http://git.kernel.dk/cgit/linux-block/log/?h=for-5.5/io_uring-test
>
> Let me know what you think!

Ah, I didn't realize that the second argument to f_op->flush is a
pointer to the files_struct. That's neat.

Security: There is no guarantee that ->flush() will run after the last
io_uring_enter() finishes. You can race like this, with threads A and
B in one process and C in another one:

  A: sends uring fd to C via unix domain socket
  A: starts syscall io_uring_enter(fd, ...)
  A: calls fdget(fd), takes reference to file
  B: starts syscall close(fd)
  B: fd table entry is removed
  B: f_op->flush is invoked and finds no pending transactions
  B: syscall close() returns
  A: continues io_uring_enter(), grabbing current->files
  A: io_uring_enter() returns
  A and B: exit
  worker: use-after-free access to files_struct

I think the solution to this would be (unless you're fine with adding
some broad global read-write mutex) something like this in
__io_queue_sqe(), where "fd" and "f" are the variables from
io_uring_enter(), plumbed through the stack somehow:

	if (req->flags & REQ_F_NEED_FILES) {
		rcu_read_lock();
		spin_lock_irq(&ctx->inflight_lock);
		if (fcheck(fd) == f) {
			list_add(&req->inflight_list,
				 &ctx->inflight_list);
			req->work.files = current->files;
			ret = 0;
		} else {
			ret = -EBADF;
		}
		spin_unlock_irq(&ctx->inflight_lock);
		rcu_read_unlock();
		if (ret)
			goto put_req;
	}

Minor note: If a process uses dup() to duplicate the uring fd, then
closes the duplicated fd, that will cause work cancellations - but I
guess that's fine?

Style nit: I find it a bit confusing to name both the list head and
the list member heads "inflight_list". Maybe name them "inflight_list"
and "inflight_entry", or something like that?

Correctness: Why is the wait in io_uring_flush() TASK_INTERRUPTIBLE?
Shouldn't it be TASK_UNINTERRUPTIBLE? If someone sends a signal to the
task while it's at that schedule(), it's just going to loop back
around and retry what it was doing already, right?

Security + Correctness: If there is more than one io_wqe, it seems to
me that io_uring_flush() calls io_wq_cancel_work(), which calls
io_wqe_cancel_work(), which may return IO_WQ_CANCEL_OK if the first
request it looks at is pending. In that case, io_wq_cancel_work() will
immediately return, and io_uring_flush() will also immediately return.
It looks like any other requests will continue running?