From: Jens Axboe <axboe@kernel.dk>
To: Jann Horn <jannh@google.com>
Cc: linux-block@vger.kernel.org,
"David S. Miller" <davem@davemloft.net>,
Network Development <netdev@vger.kernel.org>
Subject: Re: [PATCH 1/3] io_uring: add support for async work inheriting files table
Date: Fri, 18 Oct 2019 10:36:26 -0600
Message-ID: <20b44cc0-87b1-7bf8-d20e-f6131da9d130@kernel.dk>
In-Reply-To: <CAG48ez12pteHyZasU8Smup-0Mn3BWNMCVjybd1jvXsPrJ7OmYg@mail.gmail.com>
On 10/18/19 10:20 AM, Jann Horn wrote:
> On Fri, Oct 18, 2019 at 5:55 PM Jens Axboe <axboe@kernel.dk> wrote:
>> On 10/18/19 9:00 AM, Jens Axboe wrote:
>>> On 10/18/19 8:52 AM, Jann Horn wrote:
>>>> On Fri, Oct 18, 2019 at 4:43 PM Jens Axboe <axboe@kernel.dk> wrote:
>>>>>
>>>>> On 10/18/19 8:40 AM, Jann Horn wrote:
>>>>>> On Fri, Oct 18, 2019 at 4:37 PM Jens Axboe <axboe@kernel.dk> wrote:
>>>>>>>
>>>>>>> On 10/18/19 8:34 AM, Jann Horn wrote:
>>>>>>>> On Fri, Oct 18, 2019 at 4:01 PM Jens Axboe <axboe@kernel.dk> wrote:
>>>>>>>>> On 10/17/19 8:41 PM, Jann Horn wrote:
>>>>>>>>>> On Fri, Oct 18, 2019 at 4:01 AM Jens Axboe <axboe@kernel.dk> wrote:
>>>>>>>>>>> This is in preparation for adding opcodes that need to modify files
>>>>>>>>>>> in a process file table, either adding new ones or closing old ones.
>>>>>>>> [...]
>>>>>>>>> Updated patch1:
>>>>>>>>>
>>>>>>>>> http://git.kernel.dk/cgit/linux-block/commit/?h=for-5.5/io_uring-test&id=df6caac708dae8ee9a74c9016e479b02ad78d436
>>>>>>>>
>>>>>>>> I don't understand what you're doing with old_files in there. In the
>>>>>>>> "s->files && !old_files" branch, "current->files = s->files" happens
>>>>>>>> without holding task_lock(), but current->files and s->files are also
>>>>>>>> the same already at that point anyway. And what's the intent behind
>>>>>>>> assigning stuff to old_files inside the loop? Isn't that going to
>>>>>>>> cause the workqueue to keep a modified current->files beyond the
>>>>>>>> runtime of the work?
>>>>>>>
>>>>>>> I simply forgot to remove the old block, it should only have this one:
>>>>>>>
>>>>>>> if (s->files && s->files != cur_files) {
>>>>>>>         task_lock(current);
>>>>>>>         current->files = s->files;
>>>>>>>         task_unlock(current);
>>>>>>>         if (cur_files)
>>>>>>>                 put_files_struct(cur_files);
>>>>>>>         cur_files = s->files;
>>>>>>> }
>>>>>>
>>>>>> Don't you still need a put_files_struct() in the case where "s->files
>>>>>> == cur_files"?
>>>>>
>>>>> I want to hold on to the files for as long as I can, to avoid unnecessary
>>>>> shuffling of it. But I take it your worry here is that we'll be calling
>>>>> something that manipulates ->files? Nothing should do that, unless
>>>>> s->files is set. We didn't hide the workqueue ->files[] before this
>>>>> change either.
>>>>
>>>> No, my worry is that the refcount of the files_struct is left too
>>>> high. From what I can tell, the "do" loop in io_sq_wq_submit_work()
>>>> iterates over multiple instances of struct sqe_submit. If there are
>>>> two sqe_submit instances with the same ->files (each holding a
>>>> reference from the get_files_struct() in __io_queue_sqe()), then:
>>>>
>>>> When processing the first sqe_submit instance, current->files and
>>>> cur_files are set to $user_files.
>>>> When processing the second sqe_submit instance, nothing happens
>>>> (s->files == cur_files).
>>>> After the loop, at the end of the function, put_files_struct() is
>>>> called once on $user_files.
>>>>
>>>> So get_files_struct() has been called twice, but put_files_struct()
>>>> has only been called once. That leaves the refcount too high, and by
>>>> repeating this, an attacker can make the refcount wrap around and then
>>>> cause a use-after-free.
>>>
>>> Ah now I see what you are getting at, yes that's clearly a bug! I wonder
>>> how we best safely can batch the drops. We can track the number of times
>>> we've used the same files, and do atomic_sub_and_test() in a
>>> put_files_struct_many() type addition. But that would leave us open to
>>> the issue you describe, where someone could maliciously overflow the
>>> files ref count.
>>>
>>> Probably not worth over-optimizing, as long as we can avoid the
>>> current->files task lock/unlock and shuffle.
>>>
>>> I'll update the patch.
>>
>> Alright, this incremental on top should do it. And full updated patch
>> here:
>>
>> http://git.kernel.dk/cgit/linux-block/commit/?h=for-5.5/io_uring-test&id=40449c5a3d3b16796fa13e9469c69d62986e961c
>>
>> Let me know what you think.
>
> Ignoring the locking elision, basically the logic is now this:
>
> static void io_sq_wq_submit_work(struct work_struct *work)
> {
>   struct io_kiocb *req = container_of(work, struct io_kiocb, work);
>   struct files_struct *cur_files = NULL, *old_files;
>   [...]
>   old_files = current->files;
>   [...]
>   do {
>     struct sqe_submit *s = &req->submit;
>     [...]
>     if (cur_files)
>       /* drop cur_files reference; borrow lifetime must
>        * end before here */
>       put_files_struct(cur_files);
>     /* move reference ownership to cur_files */
>     cur_files = s->files;
>     if (cur_files) {
>       task_lock(current);
>       /* current->files borrows reference from cur_files;
>        * existing borrow from previous loop ends here */
>       current->files = cur_files;
>       task_unlock(current);
>     }
>
>     [call __io_submit_sqe()]
>     [...]
>   } while (req);
>   [...]
>   /* existing borrow ends here */
>   task_lock(current);
>   current->files = old_files;
>   task_unlock(current);
>   if (cur_files)
>     /* drop cur_files reference; borrow lifetime must
>      * end before here */
>     put_files_struct(cur_files);
> }
>
> If you run two iterations of this loop, with a first element that has
> a ->files pointer and a second element that doesn't, then in the
> second run through the loop, the reference to the files_struct will be
> dropped while current->files still points to it; current->files is
> only reset after the loop has ended. If someone accesses
> current->files through procfs directly after that, AFAICS you'd get a
> use-after-free.

Amazing how this is still broken. You are right, and it's especially
annoying since that's exactly the case I originally talked about (not
flipping current->files if we don't have to). I just did it wrong, so
we'll leave a dangling pointer in ->files.
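
To make the failure concrete, here is a minimal userspace model of the
loop as summarized above. It is only a sketch: files_model, submit_model
and batch_leaves_dangling() are hypothetical stand-ins, the refcount is a
plain int, and nothing is actually freed, so it only reports whether the
modeled current->files ever points at an object whose count reached zero.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct files_struct: only a reference count. */
struct files_model {
	int refs;
};

/* Hypothetical stand-in for struct sqe_submit; files may be NULL. */
struct submit_model {
	struct files_model *files;
};

/*
 * Replays the quoted loop logic. Returns true if, while a request would
 * be running, the modeled current->files points at a files_model whose
 * reference count has already dropped to zero.
 */
static bool batch_leaves_dangling(struct submit_model *subs, size_t n,
				  struct files_model *old_files)
{
	struct files_model *current_files = old_files;	/* current->files */
	struct files_model *cur_files = NULL;
	bool dangling = false;
	size_t i;

	for (i = 0; i < n; i++) {
		if (cur_files)
			cur_files->refs--;	/* put_files_struct(cur_files) */
		cur_files = subs[i].files;	/* take over the sqe's reference */
		if (cur_files)
			current_files = cur_files;	/* borrow for current->files */

		/* the request would be submitted here */
		if (current_files != old_files && current_files->refs == 0)
			dangling = true;
	}

	current_files = old_files;	/* restored only after the loop */
	if (cur_files)
		cur_files->refs--;
	return dangling;
}
```

Feeding it a batch whose first entry has ->files set and whose second
entry has none reproduces the case described above: the count drops to
zero in the second iteration while the modeled current->files still
points at it.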

By far the most common case is that if one sqe has a files table it
needs to attach, then any others that also have files will use the same
one. So I want to optimize for the case where we only flip
current->files once when we first see the files, and once more when
we're done with the loop.

Let me see if I can get this right...
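
One way the loop could be arranged, as a userspace sketch only and not
the actual patch: keep one reference in cur_files for as long as
current->files borrows it, drop the extra reference right away when a
later sqe carries the same files, and restore the old pointer before the
final put. All names here (files_model, submit_model, run_batch, flips)
are hypothetical, and the store that would sit under task_lock(current)
is a plain assignment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct files_struct: only a reference count. */
struct files_model {
	int refs;
};

/* Hypothetical stand-in for struct sqe_submit; files may be NULL. */
struct submit_model {
	struct files_model *files;
};

static struct files_model *current_files;	/* models current->files */
static int flips;				/* stores to current_files */

static void get_files(struct files_model *f)
{
	f->refs++;
}

static void put_files(struct files_model *f)
{
	assert(f->refs > 0);
	f->refs--;
}

static void set_current_files(struct files_model *f)
{
	/* The kernel would take task_lock(current) around this store. */
	current_files = f;
	flips++;
}

/* Each entry with files != NULL is assumed to hold one reference. */
static void run_batch(struct submit_model *subs, size_t n)
{
	struct files_model *old_files = current_files;
	struct files_model *cur_files = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		struct files_model *f = subs[i].files;

		if (f && f != cur_files) {
			/* New files: flip first, then drop the old reference. */
			set_current_files(f);
			if (cur_files)
				put_files(cur_files);
			cur_files = f;
		} else if (f) {
			/* Same files again: cur_files already pins them. */
			put_files(f);
		}
		/* An entry without files leaves current_files untouched. */

		/* Invariant: whatever current_files borrows is still held. */
		assert(!cur_files ||
		       (current_files == cur_files && cur_files->refs > 0));
	}

	if (cur_files) {
		set_current_files(old_files);	/* end the borrow first */
		put_files(cur_files);		/* then drop the reference */
	}
}
```

With a batch of three entries where the first and third share one files
table and the second has none, this model flips the pointer twice in
total and every get_files() is matched by a put_files().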
--
Jens Axboe