linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dmitry Kadashev <dkadashev@gmail.com>
To: Al Viro <viro@zeniv.linux.org.uk>
Cc: Jens Axboe <axboe@kernel.dk>,
	Christian Brauner <christian.brauner@ubuntu.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	io-uring <io-uring@vger.kernel.org>
Subject: Re: [PATCH 05/14] namei: prepare do_mkdirat for refactoring
Date: Wed, 21 Jul 2021 17:02:37 +0700	[thread overview]
Message-ID: <CAOKbgA6JAGioiqMu1CQbTe-Wpb2_HDszRaz+rf=qna-oieF75A@mail.gmail.com> (raw)
In-Reply-To: <YPbV5wnUNw3SsSfI@zeniv-ca.linux.org.uk>

.On Tue, Jul 20, 2021 at 8:58 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Tue, Jul 20, 2021 at 01:59:29PM +0700, Dmitry Kadashev wrote:
>
> > > This is the wrong way to go.  Really.  Look at it that way - LOOKUP_REVAL
> > > is the final stage of escalation; if we had to go there, there's no
> > > point being optimistic about the last dcache lookup, nevermind trying
> > > to retry the parent pathwalk if we fail with -ESTALE doing it.
> > >
> > > I'm not saying that it's something worth optimizing for; the problem
> > > is different - the logics makes no sense whatsoever that way.  It's
> > > a matter of reader's cycles wasted on "what the fuck are we trying
> > > to do here?", not the CPU cycles wasted on execution.
> > >
> > > While we are at it, it makes no sense for filename_parentat() and its
> > > ilk to go for RCU and normal if it's been given LOOKUP_REVAL - I mean,
> > > look at the sequence of calls in there.  And try to make sense of
> > > it.  Especially of the "OK, RCU attempt told us to sod off and try normal;
> > > here, let's call path_parentat() with LOOKUP_REVAL for flags and if it
> > > says -ESTALE, call it again with exact same arguments" part.
> > >
> > > Seriously, look at that from the point of view of somebody who tries
> > > to make sense of the entire thing
> >
> > OK, let me try to venture down that "change the way ESTALE retries are
> > done completely" path. The problem here is I'm not familiar with the
> > code enough to be sure the conversion is 1-to-1 (i.e. that we can't get
> > ESTALE from somewhere unexpected), and that retries are open-coded in
> > quite a few places it seems. Anyway, I'll try and dig in and come back
> > with either an RFC patch or some questions. Thanks for the feedback, Al.
>
> I'd try to look at the primitives that go through RCU/normal/REVAL series.
> They are all in fs/namei.c; filename_lookup(), filename_parentat(),
> do_filp_open() and do_filo_open_root().  The latter pair almost certainly
> is fine as-is.
>
> retry_estale() crap is limited to user_path_at/user_path_at_empty users,
> along with some filename_parentat() ones.
>
> There we follow that series with something that might give us ESTALE,
> and if it does, we want to repeat the whole thing in REVAL mode.
>
> OTOH, there are callers (and fairly similar ones, at that - look at e.g.
> AF_UNIX bind doing mknod) where we don't have that kind of logics.
>
> Question 1: which of those are lacking retry_estale(), even though they
> might arguably need it?  Note that e.g. AF_UNIX bind uses kern_path_create(),
> so we need to look at all callchains leading to those, not just the ones
> in fs/namei.c guts.
>
> If most of those really want retry_estale, we'd be better off if we took
> the REVAL fallback out of filename_lookup() and filename_parentat()
> and turned massaged the users from
>         do rcu/normal/reval lookups
>         if failed, fuck off
>         do other work
>         if it fails with ESTALE
>                 do rcu/reval/reval (yes, really)
>                 if failed, fuck off
>                 do other work
> into
>         do rcu/normal lookups
>         if not failed
>                 do other work
>         if something (including initial lookup) failed with ESTALE
>                 repeat the entire thing with LOOKUP_REVAL in the mix
> possibly with a helper function involved.
> For the ones that need retry_estale that's a win; for the rest it'd
> be boilerplate (that's basically the ones where "do other work" never
> fails with ESTALE).
>
> Question 2: how are "need retry_estale"/"fine just with ESTALE fallback
> in filename_{lookup,parentat}()" cases are distributed?
>
> If the majority is in "need retry_estale" class, then something similar
> to what's been outlined above would probably be a decent solution.
>
> Otherwise we'll need wrappers equivalent to current behaviour, and that's
> where it can get unpleasant - at which level in call chain do we put
> that wrapper?  Sure, we can add filename_lookup_as_it_fucking_used_to_be().
> Except that it's not called directly by those "don't need retry_estale"
> users, so we'd need to provide such counterparts for them as well ;-/
>
> IOW, we need the call tree for filename_lookup()/filename_parentat(),
> with leaves (originators of call chain) marked with "does that user
> do retry_estale?" (and tracked far back for the answer to depend only
> upon the call site - if an intermediate can come from both kinds
> of places, we need to track back to its callers).
>
> Then we'll be able to see at which levels do we want those "as it used
> to behave" wrappers...
>
> If you want to dig around, that would probably be a reasonable place to
> start.

Thanks for the pointers! I am happy to dig into this. I can't spend much
time per week on this though, so it will take some time.

-- 
Dmitry Kadashev

  reply	other threads:[~2021-07-21 10:24 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-07-15 10:35 [PATCH 00/14] namei: clean up retry logic in various do_* functions Dmitry Kadashev
2021-07-15 10:35 ` [PATCH 01/14] namei: prepare do_rmdir for refactoring Dmitry Kadashev
2021-07-15 19:49   ` Al Viro
2021-07-20  6:52     ` Dmitry Kadashev
2021-07-15 10:35 ` [PATCH 02/14] namei: clean up do_rmdir retry logic Dmitry Kadashev
2021-07-15 20:02   ` Al Viro
2021-07-15 10:35 ` [PATCH 03/14] namei: prepare do_unlinkat for refactoring Dmitry Kadashev
2021-07-15 10:35 ` [PATCH 04/14] namei: clean up do_unlinkat retry logic Dmitry Kadashev
2021-07-15 10:35 ` [PATCH 05/14] namei: prepare do_mkdirat for refactoring Dmitry Kadashev
2021-07-15 20:17   ` Al Viro
2021-07-20  6:59     ` Dmitry Kadashev
2021-07-20 13:55       ` Al Viro
2021-07-21 10:02         ` Dmitry Kadashev [this message]
2021-07-15 10:35 ` [PATCH 06/14] namei: clean up do_mkdirat retry logic Dmitry Kadashev
2021-07-15 10:35 ` [PATCH 07/14] namei: prepare do_mknodat for refactoring Dmitry Kadashev
2021-07-15 10:35 ` [PATCH 08/14] namei: clean up do_mknodat retry logic Dmitry Kadashev
2021-07-15 10:35 ` [PATCH 09/14] namei: prepare do_symlinkat for refactoring Dmitry Kadashev
2021-07-15 10:35 ` [PATCH 10/14] namei: clean up do_symlinkat retry logic Dmitry Kadashev
2021-07-15 10:35 ` [PATCH 11/14] namei: prepare do_linkat for refactoring Dmitry Kadashev
2021-07-15 10:35 ` [PATCH 12/14] namei: clean up do_linkat retry logic Dmitry Kadashev
2021-07-15 10:35 ` [PATCH 13/14] namei: prepare do_renameat2 for refactoring Dmitry Kadashev
2021-07-15 10:36 ` [PATCH 14/14] namei: clean up do_renameat2 retry logic Dmitry Kadashev
2021-07-15 10:39 ` [PATCH 00/14] namei: clean up retry logic in various do_* functions Dmitry Kadashev

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAOKbgA6JAGioiqMu1CQbTe-Wpb2_HDszRaz+rf=qna-oieF75A@mail.gmail.com' \
    --to=dkadashev@gmail.com \
    --cc=axboe@kernel.dk \
    --cc=christian.brauner@ubuntu.com \
    --cc=io-uring@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).