linux-block.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jens Axboe <axboe@kernel.dk>
To: Jann Horn <jannh@google.com>
Cc: linux-block@vger.kernel.org,
	"David S. Miller" <davem@davemloft.net>,
	Network Development <netdev@vger.kernel.org>
Subject: Re: [PATCH 1/3] io_uring: add support for async work inheriting files table
Date: Fri, 18 Oct 2019 08:01:20 -0600	[thread overview]
Message-ID: <0fb9d9a0-6251-c4bd-71b0-6e34c6a1aab8@kernel.dk> (raw)
In-Reply-To: <CAG48ez0G2y0JS9=S2KmePO3xq-5DuzgovrLFiX4TJL-G897LCA@mail.gmail.com>

On 10/17/19 8:41 PM, Jann Horn wrote:
> On Fri, Oct 18, 2019 at 4:01 AM Jens Axboe <axboe@kernel.dk> wrote:
>> This is in preparation for adding opcodes that need to modify files
>> in a process file table, either adding new ones or closing old ones.
> 
> Closing old ones would be tricky. Basically if you call
> get_files_struct() while you're between an fdget()/fdput() pair (e.g.
> from sys_io_uring_enter()), you're not allowed to use that
> files_struct reference to replace or close existing FDs through that
> reference. (Or more accurately, if you go through fdget() with
> files_struct refcount 1, you must not replace/close FDs in there in
> any way until you've passed the corresponding fdput().)
> 
> You can avoid that if you ensure that you never use fdget()/fdput() in
> the relevant places, only fget()/fput().

That's a good point, I didn't think the closing aspect through when
writing that changelog. File addition is the most interesting aspect,
obviously, and the only part that I care about in this patch set. I'll
change the wording.

>> If an opcode needs this, it must set REQ_F_NEED_FILES in the request
>> structure. If work that needs to get punted to async context have this
>> set, they will grab a reference to the process file table. When the
>> work is completed, the reference is dropped again.
> [...]
>> @@ -2220,6 +2223,10 @@ static void io_sq_wq_submit_work(struct work_struct *work)
>>                                  set_fs(USER_DS);
>>                          }
>>                  }
>> +               if (s->files && !old_files) {
>> +                       old_files = current->files;
>> +                       current->files = s->files;
>> +               }
> 
> AFAIK e.g. stuff like proc_fd_link() in procfs can concurrently call
> get_files_struct() even on kernel tasks, so you should take the
> task_lock(current) while fiddling with the ->files pointer.

Fixed up, thanks!

> Also, maybe I'm too tired to read this correctly, but it seems like
> when io_sq_wq_submit_work() is processing multiple elements with
> ->files pointers, this part will only consume a reference to the first
> one?

Like the mm, we should only have the one file table. But there's no
reason to not handle this properly, I've amended the commit to properly
swap so it works for any number of file tables.

>>                  if (!ret) {
>>                          s->has_user = cur_mm != NULL;
>> @@ -2312,6 +2319,11 @@ static void io_sq_wq_submit_work(struct work_struct *work)
>>                  unuse_mm(cur_mm);
>>                  mmput(cur_mm);
>>          }
>> +       if (old_files) {
>> +               struct files_struct *files = current->files;
>> +               current->files = old_files;
>> +               put_files_struct(files);
>> +       }
> 
> And then here the first files_struct reference is dropped, and the
> rest of them leak?

Fixed with the above change.

>> @@ -2413,6 +2425,8 @@ static int __io_queue_sqe(struct io_ring_ctx *ctx, struct io_kiocb *req,
>>
>>                          s->sqe = sqe_copy;
>>                          memcpy(&req->submit, s, sizeof(*s));
>> +                       if (req->flags & REQ_F_NEED_FILES)
>> +                               req->submit.files = get_files_struct(current);
> 
> Stupid question: How does this interact with sqpoll mode? In that
> case, this function is running on a kernel thread that isn't sharing
> the application's files_struct, right?

Not a stupid question! It doesn't work with sqpoll. We need to be
entered on behalf of the task, and we never see that with sqpoll (except
if NEED_WAKE is set in flags).

For now I'll just forbid it explicitly in io_accept(), just like we do
for IORING_SETUP_IOPOLL.

Updated patch1:

http://git.kernel.dk/cgit/linux-block/commit/?h=for-5.5/io_uring-test&id=df6caac708dae8ee9a74c9016e479b02ad78d436

and patch 3:

http://git.kernel.dk/cgit/linux-block/commit/?h=for-5.5/io_uring-test&id=442bb35fc4f8f28c29ea220475c45babb44ee49c

-- 
Jens Axboe


  reply	other threads:[~2019-10-18 14:01 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-17 21:28 [PATCHSET] io_uring: add support for accept(4) Jens Axboe
2019-10-17 21:28 ` [PATCH 1/3] io_uring: add support for async work inheriting files table Jens Axboe
2019-10-18  2:41   ` Jann Horn
2019-10-18 14:01     ` Jens Axboe [this message]
2019-10-18 14:34       ` Jann Horn
2019-10-18 14:37         ` Jens Axboe
2019-10-18 14:40           ` Jann Horn
2019-10-18 14:43             ` Jens Axboe
2019-10-18 14:52               ` Jann Horn
2019-10-18 15:00                 ` Jens Axboe
2019-10-18 15:54                   ` Jens Axboe
2019-10-18 16:20                     ` Jann Horn
2019-10-18 16:36                       ` Jens Axboe
2019-10-18 17:05                         ` Jens Axboe
2019-10-18 18:06                           ` Jann Horn
2019-10-18 18:16                             ` Jens Axboe
2019-10-18 18:50                               ` Jann Horn
2019-10-24 19:41                                 ` Jens Axboe
2019-10-24 20:31                                   ` Jann Horn
2019-10-24 22:04                                     ` Jens Axboe
2019-10-24 22:09                                       ` Jens Axboe
2019-10-24 23:13                                       ` Jann Horn
2019-10-25  0:35                                         ` Jens Axboe
2019-10-25  0:52                                           ` Jens Axboe
2019-10-23 12:04   ` Wolfgang Bumiller
2019-10-23 14:11     ` Jens Axboe
2019-10-17 21:28 ` [PATCH 2/3] net: add __sys_accept4_file() helper Jens Axboe
2019-10-17 21:28 ` [PATCH 3/3] io_uring: add support for IORING_OP_ACCEPT Jens Axboe

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=0fb9d9a0-6251-c4bd-71b0-6e34c6a1aab8@kernel.dk \
    --to=axboe@kernel.dk \
    --cc=davem@davemloft.net \
    --cc=jannh@google.com \
    --cc=linux-block@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).