All of lore.kernel.org
 help / color / mirror / Atom feed
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Al Viro <viro@zeniv.linux.org.uk>
Cc: Jens Axboe <axboe@kernel.dk>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [PATCH 1/2] fs: add support for LOOKUP_NONBLOCK
Date: Fri, 11 Dec 2020 09:35:45 -0800	[thread overview]
Message-ID: <CAHk-=whgSe_ZmHTjpVw5VJ6zrmKxR4oH3hx=ZvQf4w4h4rhVCQ@mail.gmail.com> (raw)
In-Reply-To: <20201211172054.GX3579531@ZenIV.linux.org.uk>

On Fri, Dec 11, 2020 at 9:21 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> Explain, please.  What's the difference between blocking in a lookup and
> blocking in truncate?  Either your call site is fine with a potentially
> long sleep, or it is not; I don't understand what makes one source of
> that behaviour different from another.

So I'm not Jens, and I don't know exactly what io_uring loads he's
looking at, but the reason I'm interested in this is that this is very
much not the first time this has come up.

The big difference between filename lookup and truncate is that one is
very common indeed, and the other isn't.

Sure, something like truncate happens. And it might even be a huge
deal and very critical for some load. But realistically, I don't think
I've ever seen a load where if it's important, and you can do it
asynchronously, you couldn't just start a thread for it (particularly
a kthread).

> "Fast path" in context like "we can't sleep here, but often enough we
> won't need to; here's a function that will bail out rather than blocking,
> let's call that and go through offload to helper thread in rare case
> when it does bail out" does make sense; what you are proposing to do
> here is rather different and AFAICS saying "that's my fast path" is
> meaningless here.

The fast path context here is not "we can't sleep here".

No, the fast-path context here is "we want highest performance here",
with the understanding that there are other things to be done. The
existing code already simply starts a kernel thread for the open - not
because it "can't sleep", but because of that "I want to get this
operation started, but there are other things I want to start too".

And in that context, it's not about "can't sleep". It's about "if we
already have the data in a very fast cache, then doing this
asynchronously with a thread is SLOWER than just doing it directly".

In particular it's not about correctness: doing it synchronously or
asynchronously are both "equally correct". You get the same answer in
the end. It's purely about that "if we can do it really quickly, it's
better to just do it".

Which gets me back to the first part: this has come up before. Tux2
used to want to do _exactly_ this same thing.

But what has happened is that
 (a) we now have a RCU lookup that is an almost exact match for this
and
 (b) we now have a generic interface for user space to use it in the
form of io_uring

So this is not about "you have to get it right".  In fact, if it was,
the RCU lookup model would be the wrong thing, because the RCU name
lookup is optimistic, and will fail for a variety of reasons.

Bo, this is literally about "threads and synchronization is a real
overhead, so if you care about performance, you don't actually want to
use them if you can do the operation so fast that the thread and
synchronization overhead is a real issue".

Which is why LOOKUP_RCU is such a good match.

And while Tux was never very successful because it was so limited and
so special, io_uring really looks like it could be the interface to
make a lot of performance-sensitive people happy. And getting that
"low-latency cached behaviour vs bigger operations that might need
lots of locks or IO" balance right would be a very good thing, I
suspect.

            Linus

  reply	other threads:[~2020-12-11 19:22 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-10 20:01 [PATCHSET 0/2] fs: Support for LOOKUP_NONBLOCK / RESOLVE_NONBLOCK Jens Axboe
2020-12-10 20:01 ` [PATCH 1/2] fs: add support for LOOKUP_NONBLOCK Jens Axboe
2020-12-10 20:53   ` Linus Torvalds
2020-12-10 21:06     ` Jens Axboe
2020-12-11  2:45       ` Al Viro
2020-12-11 16:05         ` Jens Axboe
2020-12-11 17:20           ` Al Viro
2020-12-11 17:35             ` Linus Torvalds [this message]
2020-12-11 18:50             ` Jens Axboe
2020-12-11 21:51               ` Al Viro
2020-12-11 23:47                 ` Jens Axboe
2020-12-11 17:33           ` Matthew Wilcox
2020-12-11 18:55             ` Jens Axboe
2020-12-11  2:35   ` Al Viro
2020-12-11 15:57     ` Jens Axboe
2020-12-11 17:21       ` Linus Torvalds
2020-12-11 17:29         ` Al Viro
2020-12-11 17:38           ` Al Viro
2020-12-11 17:44           ` Linus Torvalds
2020-12-11 21:46           ` Jens Axboe
2020-12-10 20:01 ` [PATCH 2/2] fs: expose LOOKUP_NONBLOCK through openat2() RESOLVE_NONBLOCK Jens Axboe
2020-12-10 22:29   ` Dave Chinner
2020-12-10 23:12     ` Jens Axboe
2020-12-10 23:29     ` Linus Torvalds
2020-12-11  0:58       ` Dave Chinner
2020-12-11  1:01         ` Linus Torvalds
2020-12-11  3:45           ` Dave Chinner
2020-12-11 18:07             ` Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAHk-=whgSe_ZmHTjpVw5VJ6zrmKxR4oH3hx=ZvQf4w4h4rhVCQ@mail.gmail.com' \
    --to=torvalds@linux-foundation.org \
    --cc=axboe@kernel.dk \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.