From: Al Viro <viro@ZenIV.linux.org.uk>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Dave Hansen <dave.hansen@intel.com>,
"Chen, Tim C" <tim.c.chen@intel.com>,
Ingo Molnar <mingo@redhat.com>, Davidlohr Bueso <dbueso@suse.de>,
"Peter Zijlstra (Intel)" <peterz@infradead.org>,
Jason Low <jason.low2@hp.com>,
Michel Lespinasse <walken@google.com>,
"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
Waiman Long <waiman.long@hp.com>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: performance delta after VFS i_mutex=>i_rwsem conversion
Date: Mon, 6 Jun 2016 22:15:23 +0100
Message-ID: <20160606211522.GF14480@ZenIV.linux.org.uk>
In-Reply-To: <CA+55aFxH_7wjo_BgUPK5iomWedE2=DaUZVX-yruHOWEk7OTiHQ@mail.gmail.com>

On Mon, Jun 06, 2016 at 01:46:23PM -0700, Linus Torvalds wrote:
> So my gut feel is that we do want to have the same heuristics for
> rwsems and mutexes (well, modulo possible actual semantic differences
> due to the whole shared-vs-exclusive issues).
>
> And I also suspect that the mutexes have gotten a lot more performance
> tuning done on them, so it's likely the correct thing to try to make
> the rwsem match the mutex code rather than the other way around.
>
> I think we had Jason and Davidlohr do mutex work last year, let's see
> if they agree on that "yes, the mutex case is the likely more tuned
> case" feeling.
>
> The fact that your performance improves when you do that obviously
> then also validates the assumption that the mutex spinning is the
> better optimized one.

FWIW, there's another fun issue on ramfs - dcache_readdir() is doing an
obscene amount of grabbing/releasing ->d_lock, and once you take the external
serialization out, parallel getdents load hits contention on *that*.
In spades. And unlike mutex (or rwsem exclusive), contention on ->d_lock
chews a lot of cycles. The root cause is the use of cursors - we not only
move them more than we ought to (we do that on each entry reported, rather
than once before return from dcache_readdir()), we also can't traverse the
real list entries (which remain nice and stable; another low-hanging fruit is
pointless grabbing of ->d_lock on those) without ->d_lock on parent.

I think I have a kinda-sorta solution, but it has a problem. What I want
to do is
	* list_move() only once per dcache_readdir()
	* ->d_lock taken for that and only for that.
	* list_move() itself surrounded with write_seqcount_{begin,end} on
	  some seqcount
	* traversal to the next real entry done under rcu_read_lock in a
	  seqretry loop.

The only problem is where to put that seqcount (unsigned int, really).
->i_dir_seq is an obvious candidate, but that'll need careful profiling
on getdents/lookup mixes...
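
To make the shape of that scheme concrete, here is a rough userspace sketch, not kernel code. It assumes C11 atomics as a stand-in for the seqcount, and it does not model RCU at all (nodes are never freed). Every name in it (struct dir, struct node, cursor_move_after(), next_real_entry()) is made up for illustration, and the comment about ->d_lock only marks where the lock would sit.

```c
/* Userspace sketch only: C11 atomics stand in for the kernel seqcount. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct node {
	_Atomic(struct node *) next;	/* followed by lockless readers */
	struct node *prev;		/* touched by the writer only */
	bool is_cursor;			/* cursors are skipped by readers */
	int id;
};

struct dir {
	struct node head;		/* sentinel of a circular list */
	atomic_uint seq;		/* hypothetical stand-in for the seqcount */
};

static void dir_init(struct dir *d)
{
	atomic_init(&d->head.next, &d->head);
	d->head.prev = &d->head;
	d->head.is_cursor = false;
	d->head.id = -1;
	atomic_init(&d->seq, 0);
}

static void node_add_tail(struct dir *d, struct node *n)
{
	n->prev = d->head.prev;
	atomic_init(&n->next, &d->head);
	atomic_store(&d->head.prev->next, n);
	d->head.prev = n;
}

/*
 * Writer side: a single list move, bracketed by the counter.
 * The caller is assumed to hold the lock that serializes writers
 * (the role ->d_lock on the parent would play).
 */
static void cursor_move_after(struct dir *d, struct node *cursor,
			      struct node *pos)
{
	struct node *cn, *pn;

	if (pos == cursor)
		return;
	atomic_fetch_add(&d->seq, 1);	/* odd: move in progress */
	cn = atomic_load(&cursor->next);
	atomic_store(&cursor->prev->next, cn);	/* unlink cursor */
	cn->prev = cursor->prev;
	pn = atomic_load(&pos->next);
	atomic_store(&cursor->next, pn);	/* relink after pos */
	cursor->prev = pos;
	pn->prev = cursor;
	atomic_store(&pos->next, cursor);
	atomic_fetch_add(&d->seq, 1);	/* even: move finished */
}

/*
 * Reader side: walk to the next non-cursor entry without any lock,
 * retrying if a move overlapped the walk. Returns NULL at end of list.
 */
static struct node *next_real_entry(struct dir *d, struct node *from)
{
	for (;;) {
		unsigned int start = atomic_load(&d->seq);
		struct node *p;

		if (start & 1)
			continue;	/* writer active, try again */
		p = atomic_load(&from->next);
		while (p != &d->head && p->is_cursor)
			p = atomic_load(&p->next);
		if (atomic_load(&d->seq) == start)
			return p == &d->head ? NULL : p;
	}
}
```

Readers never write to shared state here, so a parallel getdents load would only bounce the cacheline holding the counter when a cursor actually moves. Whether that holds up in practice is exactly the profiling question above.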