linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jens Axboe <axboe@kernel.dk>
To: "Eric W. Biederman" <ebiederm@xmission.com>,
	Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Al Viro <viro@zeniv.linux.org.uk>
Subject: Re: [PATCHSET v3 0/4] fs: Support for LOOKUP_NONBLOCK / RESOLVE_NONBLOCK (Insufficiently faking current?)
Date: Mon, 15 Feb 2021 14:09:19 -0700	[thread overview]
Message-ID: <e3335211-83f2-5305-9601-663cc2a73427@kernel.dk> (raw)
In-Reply-To: <e9ba3d6c-ee1f-6491-e7a9-56f4d7a167a3@kernel.dk>

On 2/15/21 11:24 AM, Jens Axboe wrote:
> On 2/15/21 11:07 AM, Eric W. Biederman wrote:
>> Linus Torvalds <torvalds@linux-foundation.org> writes:
>>
>>> On Sun, Feb 14, 2021 at 8:38 AM Jens Axboe <axboe@kernel.dk> wrote:
>>>>
>>>>> Similarly it looks like opening of "/dev/tty" fails to
>>>>> return the tty of the caller but instead fails because
>>>>> io-wq threads don't have a tty.
>>>>
>>>> I've got a patch queued up for 5.12 that clears ->fs and ->files for the
>>>> thread if not explicitly inherited, and I'm working on similarly
>>>> proactively catching these cases that could potentially be problematic.
>>>
>>> Well, the /dev/tty case still needs fixing somehow.
>>>
>>> Opening /dev/tty actually depends on current->signal, and if it is
>>> NULL it will fall back on the first VT console instead (I think).
>>>
>>> I wonder if it should do the same thing /proc/self does..
>>
>> Would there be any downside of making the io-wq kernel threads be per
>> process instead of per user?
>>
>> I can see a lower probability of a thread already existing.  Are there
>> other downsides I am missing?
>>
>> The upside would be that all of the issues of have we copied enough
>> should go away, as the io-wq thread would then behave like another user
>> space thread.  To handle posix setresuid() and friends it looks like
>> current_cred would need to be copied but I can't think of anything else.
> 
> I really like that idea. Do we currently have a way of creating a thread
> internally, akin to what would happen if the same task did pthread_create?
> That'd ensure that we have everything we need, without actively needing to
> map the request types, or find future issues of "we also need this bit".
> It'd work fine for the 'need new worker' case too, if one goes to sleep.
> We'd just 'fork' off that child.
> 
> Would require some restructuring of io-wq, but at the end of it, it'd
> be a simpler solution.

I was intrigued enough that I tried to wire this up. If we can pull this
off, then it would take a great weight off my shoulders as there would
be no more worries on identity.

Here's a branch that's got a set of patches that actually work, though
it's a bit of a hack in spots. Notes:

- Forked worker initially crashed, since it's an actual user thread and
  bombed on deref of kernel structures. Expectedly. That's what the
  horrible kernel_clone_args->io_wq hack is working around for now.
  Obviously not the final solution, but helped move things along so
  I could actually test this.

- Shared io-wq helpers need indexing for task, right now this isn't
  done. But that's not hard to do.

- Idle thread reaping isn't done yet, so they persist until the
  context goes away.

- task_work fallback needs a bit of love. Currently we fallback to
  the io-wq manager thread for handling that, but a) manager is gone,
  and b) the new workers are now threads and go away as well when
  the original task goes away. None of the three fallback sites need
  task context, so likely solution here is just punt it to system_wq.
  Not the hot path, obviously, we're exiting.

- Personality registration is broken, it's just Good Enough to compile.

Probably a few more items that escape me right now. As long as you
don't hit the fallback cases, it appears to work fine for me. And
the diffstat is pretty good to:

 fs/io-wq.c                 | 418 +++++++++++--------------------------
 fs/io-wq.h                 |  10 +-
 fs/io_uring.c              | 314 +++-------------------------
 fs/proc/self.c             |   7 -
 fs/proc/thread_self.c      |   7 -
 include/linux/io_uring.h   |  19 --
 include/linux/sched.h      |   3 +
 include/linux/sched/task.h |   1 +
 kernel/fork.c              |   2 +
 9 files changed, 161 insertions(+), 620 deletions(-)

as it gets rid of _all_ the 'grab this or that piece' that we're
tracking.

WIP series here:

https://git.kernel.dk/cgit/linux-block/log/?h=io_uring-worker

-- 
Jens Axboe


  reply	other threads:[~2021-02-15 21:10 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-14 19:13 [PATCHSET v3 0/4] fs: Support for LOOKUP_NONBLOCK / RESOLVE_NONBLOCK Jens Axboe
2020-12-14 19:13 ` [PATCH 1/4] fs: make unlazy_walk() error handling consistent Jens Axboe
2020-12-14 19:13 ` [PATCH 2/4] fs: add support for LOOKUP_NONBLOCK Jens Axboe
2020-12-15 12:24   ` Matthew Wilcox
2020-12-15 15:29     ` Jens Axboe
2020-12-15 15:33       ` Matthew Wilcox
2020-12-15 15:37         ` Jens Axboe
2020-12-15 16:08           ` Jens Axboe
2020-12-15 16:14             ` Jens Axboe
2020-12-15 18:29             ` Linus Torvalds
2020-12-15 18:44               ` Jens Axboe
2020-12-15 18:47                 ` Linus Torvalds
2020-12-15 19:03                   ` Jens Axboe
2020-12-15 19:32                     ` Linus Torvalds
2020-12-15 19:38                       ` Jens Axboe
2020-12-16  2:36   ` Al Viro
2020-12-16  3:30     ` Jens Axboe
2020-12-16  2:43   ` Al Viro
2020-12-16  3:32     ` Jens Axboe
2020-12-14 19:13 ` [PATCH 3/4] fs: expose LOOKUP_NONBLOCK through openat2() RESOLVE_NONBLOCK Jens Axboe
2020-12-15 22:25   ` Dave Chinner
2020-12-15 22:31     ` Linus Torvalds
2020-12-15 23:25       ` Jens Axboe
2020-12-16  2:37   ` Al Viro
2020-12-16  3:39     ` Linus Torvalds
2020-12-14 19:13 ` [PATCH 4/4] io_uring: enable LOOKUP_NONBLOCK path resolution for filename lookups Jens Axboe
2020-12-15  3:06 ` [PATCHSET v3 0/4] fs: Support for LOOKUP_NONBLOCK / RESOLVE_NONBLOCK Linus Torvalds
2020-12-15  3:18   ` Jens Axboe
2020-12-15  6:11 ` Al Viro
2020-12-15 15:29   ` Jens Axboe
2021-01-04  5:31 ` Al Viro
2021-01-04 14:43   ` Jens Axboe
2021-01-04 16:54     ` Al Viro
2021-01-04 17:03       ` Jens Axboe
     [not found] ` <m1lfbrwrgq.fsf@fess.ebiederm.org>
2021-02-14 16:38   ` [PATCHSET v3 0/4] fs: Support for LOOKUP_NONBLOCK / RESOLVE_NONBLOCK (Insufficiently faking current?) Jens Axboe
2021-02-14 20:30     ` Linus Torvalds
2021-02-14 21:24       ` Al Viro
2021-02-15 18:07       ` Eric W. Biederman
2021-02-15 18:24         ` Jens Axboe
2021-02-15 21:09           ` Jens Axboe [this message]
2021-02-15 22:41             ` Eric W. Biederman
2021-02-16  2:41               ` Jens Axboe
2021-02-17  1:18                 ` Jens Axboe
2021-02-17  1:26                   ` Jens Axboe
2021-02-17  3:11                     ` Jens Axboe
2021-02-15 17:56     ` Eric W. Biederman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e3335211-83f2-5305-9601-663cc2a73427@kernel.dk \
    --to=axboe@kernel.dk \
    --cc=ebiederm@xmission.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).