linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Linus Torvalds <torvalds@linux-foundation.org>
To: NeilBrown <neilb@suse.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 2/3] Improve fairness when locking the per-superblock s_anon list
Date: Thu, 9 Nov 2017 11:52:48 -0800	[thread overview]
Message-ID: <CA+55aFwxFCi5mDEadCf8xxK-LQ-aXUhd1Ox6m51vsXt3uexFpw@mail.gmail.com> (raw)
In-Reply-To: <151019772760.30101.8513274540570798315.stgit@noble>

On Wed, Nov 8, 2017 at 7:22 PM, NeilBrown <neilb@suse.com> wrote:
> bit-spin-locks, as used for dcache hash chains, are not fair.
> This is not a problem for the dcache hash table as different CPUs are
> likely to access different entries in the hash table so high contention
> is not expected.
> However anonymous dentryies (created by NFSD) all live on a single hash
> chain "s_anon" and the bitlock on this can be highly contended, resulting
> in soft-lockup warnings.

This really seems idiotic.

Let me rephrase this commit message so that you can see why I think it's wrong:

 "NFSd uses a single hash chain for all dentries, which can cause
horrible lock contention, in ways that the normal hashing does not.

  This horrendous contention can cause the machine to have bad latency
spikes, which can cause soft lockup warnings.

  Instead of just fixing the bad design decision that causes this
horrible contention, we'll just special-case this code and use an
additional lock that hides the problem by serializing the actual
contending access".

Honestly, looking at the code, the whole s_anon thing seems entirely
broken. There doesn't even seem to be much reason for it. In pretty
much all cases, we could just hash the damn dentry,

The only reason for actually having s_anon seems to be that we want
some per-superblock list of these unconnected dentries for
shrink_dcache_for_umount().

Everything else would actually be *much* happier with just having the
dentry on the regular hash table. It would entirely get rid of this
stupid performance problem, and it would actually simplify all the
code elsewhere, because it would remove special cases like this

                if (unlikely(IS_ROOT(dentry)))
                        b = &dentry->d_sb->s_anon;
                else
                        b = d_hash(dentry->d_name.hash);

and just turn them into

                b = d_hash(dentry->d_name.hash);

so I really wonder if we could just get rid of s_anon entirely.

Yes, getting rid of s_anon might involve crazy things like "let's just
walk all the dentries at umount time", but honestly, that sounds
preferable. Especially if we can just then do something like

 - set a special flag in the superblock if we ever use __d_obtain_alias()

 - only scan all the dentries on umount if that flag is set.

Hmm?

The other alternative would be to just try to make the bitlocks
fairer. I would find that much less distasteful than this nasty hack.

I could imagine, for example, just turning the bitlocks into "spinlock
on hashed array". That's basically what we did with the page lock,
which _used_ to be basically a blocking bitlock, and where there was
no possible way to not have contention on some pages.

We could afford to have two bits for the bitlock (lock and contention)
to make something like that more viable.

But I really get the feeling that the problem here is not the locking
primitive, but that whole s_anon decision. And my gut feeling is that
it really should be fixable.

Because wouldn't it be much nicer if the horrible nfsd contention just
went away rather than be worked around?

But maybe I'm missing some _other_ reason why s_anon has to be its own
separate list?

          Linus

  reply	other threads:[~2017-11-09 19:52 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-11-09  3:22 [PATCH 0/3] Three VFS patch resends NeilBrown
2017-11-09  3:22 ` [PATCH 1/3] VFS: use synchronize_rcu_expedited() in namespace_unlock() NeilBrown
2017-11-09  3:22 ` [PATCH 2/3] Improve fairness when locking the per-superblock s_anon list NeilBrown
2017-11-09 19:52   ` Linus Torvalds [this message]
2017-11-09 20:50     ` Al Viro
2017-11-09 23:09       ` NeilBrown
2017-11-09 23:19         ` Al Viro
2017-11-10  0:02       ` Linus Torvalds
2017-11-10  8:50     ` Christoph Hellwig
2017-11-09  3:22 ` [PATCH 3/3] VFS: close race between getcwd() and d_move() NeilBrown
2017-11-09 11:41   ` Nikolay Borisov
2017-11-09 13:08     ` Matthew Wilcox
2017-11-09 16:02       ` Nikolay Borisov
2017-11-09 20:23   ` Linus Torvalds
2017-11-09 22:14     ` NeilBrown
2017-11-10  1:40       ` Linus Torvalds
2017-11-10  4:45         ` NeilBrown
2017-11-10 19:52           ` Linus Torvalds
2017-11-10 20:53           ` Al Viro
2017-11-21 23:50             ` Al Viro
2017-11-22  1:31               ` NeilBrown

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CA+55aFwxFCi5mDEadCf8xxK-LQ-aXUhd1Ox6m51vsXt3uexFpw@mail.gmail.com \
    --to=torvalds@linux-foundation.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=neilb@suse.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).