All of lore.kernel.org
 help / color / mirror / Atom feed
From: Konstantin Khlebnikov <koct9i@gmail.com>
To: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Konstantin Khlebnikov <khlebnikov@yandex-team.ru>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] proc/sysctl: drop unregistered stale dentries as soon as possible
Date: Thu, 9 Feb 2017 10:36:15 +0300	[thread overview]
Message-ID: <CALYGNiMU1a_P+uMspAfj6UYbsy1ahq_ab_OPZpKFF_cUhSKT1A@mail.gmail.com> (raw)
In-Reply-To: <20170209035307.GK13195@ZenIV.linux.org.uk>

On Thu, Feb 9, 2017 at 6:53 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> On Wed, Feb 08, 2017 at 01:48:04PM -0800, Andrew Morton wrote:
>
>> > This patch detects stale dentry in proc_sys_compare and pretends that
>> > it has matching name - revalidation will kill it and lookup restarts.
>> > As a result each stale dentry will be seen only once and will not
>> > contaminate hash endlessly.
>> >
>>
>> What are "stale" dentries?  Unused dentries?  If so, why doesn't the
>> creation of a new dentry immediately invalidate the old dentry with a
>> matching path?  What do other filesystems do to prevent this issue?
>
> The whole point is that it's *NOT* a matching path.  Currently ->d_compare()
> for /proc/sys tries to make sure that sysctl getting unregistered means
> that no extra references will be added to dentries of the stuff we are
> trying to kick out.  If it's getting unregistered, ->d_compare() won't be
> seeing them and from that point on dentry refcount can only go down -
> no new lookups will increase it.
>
> This kludge tries to have _any_ lookup in the same hash chain pick the
> first dentry of such stuff, no matter what name/parent it has.  Then
> it relies upon ->d_revalidate() refusing to accept that sucker, so that
> it gets unhashed and we (hopefully) repeat the lookup.
>
> This is complete garbage.  Lookups won't be repeated indefinitely -
> if there are several such dentries in the hash chain we search, syscall
> will end up failing with ESTALE on thus buggered ->d_compare(), even though
> none of those dentries are anywhere near the path we are trying to resolve.
> No other filesystem attempts that kind of insanity, and for a good reason.
>
> The problem it tries to address is that sysctl unregistration doesn't
> unhash the now-stale dentries.  Before the unregistration we kept them
> even with refcount 0, until memory pressure evicts the suckers.  After
> unregistration we make sure that refcount reaching 0 will cause the
> instant eviction.  The problem is with the case when they had refcount 0
> to start with.  Then the eviction rule does not get triggered - it would have
> happened when dropping the last reference, but we don't have any.
>
> The kludge proposed in that patch is nowhere near being a sane way to deal
> with that.  Having ->d_compare() notice such dentries and quietly kick
> them out would be borderline saner, but
>         * it's a potentially blocking operation and ->d_compare() is called
> in non-blocking contexts, including deep under rcu_read_lock().
>         * it's done when walking a hash chain; having that chain modified
> by ->d_compare() itself would require some modifications of callers and
> those are very hot codepaths.
>
> I agree that the problem is real, but this is no way to deal with it.
> What we want is something along the lines of d_prune_aliases() done for all
> inodes corresponding to given sysctl.  Done just before erase_header()
> in start_unregistering().  That would require maintaining the list of
> inodes in question (e.g. anchored in ctl_table_header) and a bit of care
> in traversing it (use of igrab(), etc.)
>
> In the current form - NAK.  Sorry.

Ok, Thank you. I've expected that this fix isn't sane,

Maybe we could minimize changes for now. For example: keep these
stale dentries in memory but silently unhash them in ->d_compare().
Memory processure and reclaimer will kill them later.

  reply	other threads:[~2017-02-09  7:37 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-02-08 10:48 [PATCH] proc/sysctl: drop unregistered stale dentries as soon as possible Konstantin Khlebnikov
2017-02-08 21:48 ` Andrew Morton
2017-02-09  3:53   ` Al Viro
2017-02-09  7:36     ` Konstantin Khlebnikov [this message]
2017-02-09  8:40       ` Al Viro
2017-02-10  7:35         ` [PATCH] proc/sysctl: prune stale dentries during unregistering Konstantin Khlebnikov
2017-02-10  7:47           ` Al Viro
2017-02-10  7:54             ` Konstantin Khlebnikov
2017-02-13  9:54               ` Eric W. Biederman
2017-02-18 18:55           ` Konstantin Khlebnikov
2017-02-19  8:42             ` Al Viro
2017-02-21  1:41               ` [REVIEW][PATCH] proc/sysctl: Don't grab i_lock under sysctl_lock Eric W. Biederman
2017-02-21  8:40                 ` Konstantin Khlebnikov
2017-02-21 19:29                   ` Eric W. Biederman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CALYGNiMU1a_P+uMspAfj6UYbsy1ahq_ab_OPZpKFF_cUhSKT1A@mail.gmail.com \
    --to=koct9i@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=ebiederm@xmission.com \
    --cc=khlebnikov@yandex-team.ru \
    --cc=linux-kernel@vger.kernel.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.