From: Stefan Metzmacher <metze@samba.org>
To: Pavel Begunkov <asml.silence@gmail.com>,
	Jens Axboe <axboe@kernel.dk>,
	io-uring@vger.kernel.org
Subject: Re: IORING_SETUP_ATTACH_WQ (was Re: [PATCH 1/3] io_uring: fix invalid ctx->sq_thread_idle)
Date: Thu, 11 Mar 2021 13:02:20 +0100
Message-ID: <b994e9c6-8642-20a0-61d6-f7943e151e76@samba.org>
In-Reply-To: <5efea46e-8dce-3d6b-99e4-9ee9a111d8a6@samba.org>


On 11.03.21 at 12:46, Stefan Metzmacher wrote:
> 
> On 11.03.21 at 12:18, Pavel Begunkov wrote:
>> On 10/03/2021 13:56, Stefan Metzmacher wrote:
>>>
>>> Hi Pavel,
>>>
>>> I wondered about the exact same change this morning, while researching
>>> the IORING_SETUP_ATTACH_WQ behavior :-)
>>>
>>> It still seems to me that IORING_SETUP_ATTACH_WQ changed over time.
>>> As you introduced that flag, can you summarize its behavior (and changes)
>>> over time (over the releases)?
>>
>> Not sure I remember the story in detail, but from the beginning it was
>> for io-wq sharing only, then it was expanded to SQPOLL as well. Now it's
>> only about SQPOLL sharing, because of the recent io-wq changes that made
>> it per-task and shared by default.
>>
>> In all cases it should be checking the passed-in file, which should retain
>> the old behaviour of failing setup if the flag is set but wq_fd is not valid.
> 
> Thanks, that's what I also found so far, see below for more findings.
> 
>>>
>>> I'm wondering if ctx->sq_creds is really the only thing we need to take care of.
>>
>> io-wq is not affected by IORING_SETUP_ATTACH_WQ. It's per-task and mimics
>> all the resources of the creator (at the moment of io-wq creation). Off
>> the ATTACH_WQ topic, but that almost matches what it has been before, and
>> with the unshare bit dropped, it should be exactly the same.
>>
>> Regarding SQPOLL, it was always using resources of the first task, so
>> those are just taken from it, and not only particular ones like
>> mm/files but all of them, like fork does, so it should be safer.
>>
>> Creds are just a special case because of that personality stuff, at least
>> if we add back iowq unshare handling.
>>
>>>
>>> Do we know about existing users of IORING_SETUP_ATTACH_WQ and their use case?
>>
>> Have no clue.
>>
>>> As mm, files and other things may differ now between sqe producer and the sq_thread.
>>
>> It was always using mm/files of the ctx creator's task, aka ctx->sqo_task,
>> but right, for the sharing case those may differ between ctxs, so it looks
>> like a regression to me.
> 
> Good. I'll try to explore a possible way out below.
> 
> Ok, I'm continuing the thread here (just pasting the mail I already started to write :-)
> 
> I did some more research regarding IORING_SETUP_ATTACH_WQ in 5.12.
> 
> The current logic in io_sq_offload_create() is this:
> 
> +       /* Retain compatibility with failing for an invalid attach attempt */
> +       if ((ctx->flags & (IORING_SETUP_ATTACH_WQ | IORING_SETUP_SQPOLL)) ==
> +                               IORING_SETUP_ATTACH_WQ) {
> +               struct fd f;
> +
> +               f = fdget(p->wq_fd);
> +               if (!f.file)
> +                       return -ENXIO;
> +               if (f.file->f_op != &io_uring_fops) {
> +                       fdput(f);
> +                       return -EINVAL;
> +               }
> +               fdput(f);
> +       }
> 
> That means that IORING_SETUP_ATTACH_WQ (without IORING_SETUP_SQPOLL) is completely
> ignored (except that we still simulate the -ENXIO and -EINVAL cases), correct?
> (You already agreed on that above :-)
> 
> The reason for this is that io_wq is no longer maintained per io_ring_ctx,
> but is instead global per io_uring_task,
> which means each userspace thread (or the sq_thread) has its own io_uring_task and
> thus its own io_wq.
> 
> Regarding the IORING_SETUP_SQPOLL|IORING_SETUP_ATTACH_WQ case we still allow attaching
> to the sq_thread of a different io_ring_ctx. The sq_thread runs in the context of
> the io_uring_setup() syscall that created it. We used to switch current->mm, current->files
> and other things before calling __io_sq_thread(), but we no longer do that.
> This seems to be a security problem to me, as it's now possible for the attached
> io_ring_ctx to issue sqes that copy the whole address space of the donor into
> a registered fixed file of the attached process.
> 
> As we already ignore IORING_SETUP_ATTACH_WQ without IORING_SETUP_SQPOLL, what about
> ignoring it as well if the attaching task uses different ->mm, ->files, ...?
> So IORING_SETUP_ATTACH_WQ would only have any effect if the task calling io_uring_setup()
> runs in the same context (except for the creds) as the existing sq_thread, which means it would work
> if multiple userspace threads of the same userspace process want to share the sq_thread and its
> io_wq. Everything else would be stupid (similar to the unshare() cases).
> But as this has worked before, we would just silently ignore IORING_SETUP_ATTACH_WQ if
> we find a context mismatch and let io_uring_setup() silently create a new sq_thread.

Or we completely ignore IORING_SETUP_ATTACH_WQ (except for the error cases).

Then we can implement a new IORING_SETUP_ATTACH_SQ with new semantics,
so that the existing sq_thread will be used as is and both sides know what it means to them.
We would also add a new IORING_REGISTER_RESTRICTIONS/IORING_RESTRICTION_ALLOW_SQ_ATTACHMENTS
which prepares the first io_ring_ctx to allow others to attach.
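
To make the opt-in idea concrete, here is a rough userspace model, not kernel
code. Everything in it is hypothetical and made up for illustration: struct
ring_model, attach_sq_model() and the -EPERM return value are assumptions,
only the flag names come from the proposal above.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for io_ring_ctx; not the kernel structure. */
struct ring_model {
	/* Would be set via the proposed ALLOW_SQ_ATTACHMENTS restriction. */
	bool allow_sq_attach;
	/* Identifies the sq_thread that serves this ring. */
	int sq_thread_id;
};

/*
 * Model of the proposed IORING_SETUP_ATTACH_SQ semantics: the new ring
 * reuses the donor's sq_thread only if the donor explicitly opted in.
 * Returns 0 on success; -EPERM (an assumed error code) otherwise.
 */
static int attach_sq_model(struct ring_model *new_ring,
			   const struct ring_model *donor)
{
	if (donor == NULL || !donor->allow_sq_attach)
		return -EPERM;

	new_ring->sq_thread_id = donor->sq_thread_id;
	return 0;
}
```

The point of the model is that the donor decides: without its opt-in the
attach fails loudly instead of being silently ignored.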

Would that make sense?
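
For reference, the flag test from the quoted io_sq_offload_create() hunk can be
tried in userspace. This is only a sketch: the SETUP_* names and the helper
attach_is_validate_only() are my own, and the bit values are what I believe
include/uapi/linux/io_uring.h uses for IORING_SETUP_SQPOLL and
IORING_SETUP_ATTACH_WQ, redefined locally so it builds without kernel headers.

```c
#include <stdbool.h>

/* Assumed to mirror IORING_SETUP_SQPOLL and IORING_SETUP_ATTACH_WQ. */
#define SETUP_SQPOLL    (1U << 1)
#define SETUP_ATTACH_WQ (1U << 5)

/*
 * True when ATTACH_WQ is set and SQPOLL is not, i.e. the case where the
 * quoted hunk only validates wq_fd and otherwise ignores the flag.
 * Other setup bits do not influence the result.
 */
static bool attach_is_validate_only(unsigned int flags)
{
	return (flags & (SETUP_ATTACH_WQ | SETUP_SQPOLL)) == SETUP_ATTACH_WQ;
}
```

With ATTACH_WQ alone the helper returns true; with SQPOLL|ATTACH_WQ, SQPOLL
alone, or neither flag it returns false.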

metze
