linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ian Kent <raven@themaw.net>
To: Fox Chen <foxhlchen@gmail.com>
Cc: akpm@linux-foundation.org, dhowells@redhat.com,
	Greg KH <gregkh@linuxfoundation.org>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	miklos@szeredi.hu, ricklind@linux.vnet.ibm.com,
	sfr@canb.auug.org.au, Tejun Heo <tj@kernel.org>,
	viro@zeniv.linux.org.uk
Subject: Re: [PATCH v2 0/6] kernfs: proposed locking and concurrency improvement
Date: Thu, 17 Dec 2020 12:46:13 +0800	[thread overview]
Message-ID: <da4f730bbbb20c0920599ca5afc316e2c092b7d8.camel@themaw.net> (raw)
In-Reply-To: <efb7469c7bad2f6458c9a537b8e3623e7c303c21.camel@themaw.net>

On Tue, 2020-12-15 at 20:59 +0800, Ian Kent wrote:
> On Tue, 2020-12-15 at 16:33 +0800, Fox Chen wrote:
> > On Mon, Dec 14, 2020 at 9:30 PM Ian Kent <raven@themaw.net> wrote:
> > > On Mon, 2020-12-14 at 14:14 +0800, Fox Chen wrote:
> > > > On Sun, Dec 13, 2020 at 11:46 AM Ian Kent <raven@themaw.net>
> > > > wrote:
> > > > > On Fri, 2020-12-11 at 10:17 +0800, Ian Kent wrote:
> > > > > > On Fri, 2020-12-11 at 10:01 +0800, Ian Kent wrote:
> > > > > > > > For the patches, there is a mutex_lock in kn-
> > > > > > > > >attr_mutex, 
> > > > > > > > as
> > > > > > > > Tejun
> > > > > > > > mentioned here
> > > > > > > > (
> > > > > > > > https://lore.kernel.org/lkml/X8fe0cmu+aq1gi7O@mtj.duckdns.org/
> > > > > > > > ),
> > > > > > > > maybe a global
> > > > > > > > rwsem for kn->iattr will be better??
> > > > > > > 
> > > > > > > I wasn't sure about that, IIRC a spin lock could be used
> > > > > > > around
> > > > > > > the
> > > > > > > initial check and checked again at the end which would
> > > > > > > probably
> > > > > > > have
> > > > > > > been much faster but much less conservative and a bit
> > > > > > > more
> > > > > > > ugly
> > > > > > > so
> > > > > > > I just went the conservative path since there was so much
> > > > > > > change
> > > > > > > already.
> > > > > > 
> > > > > > Sorry, I hadn't looked at Tejun's reply yet and TBH didn't
> > > > > > remember
> > > > > > it.
> > > > > > 
> > > > > > Based on what Tejun said it sounds like that needs work.
> > > > > 
> > > > > Those attribute handling patches were meant to allow taking
> > > > > the
> > > > > rw
> > > > > sem read lock instead of the write lock for
> > > > > kernfs_refresh_inode()
> > > > > updates, with the added locking to protect the inode
> > > > > attributes
> > > > > update since it's called from the VFS both with and without
> > > > > the
> > > > > inode lock.
> > > > 
> > > > Oh, understood. I was asking also because lock on kn-
> > > > >attr_mutex
> > > > drags
> > > > concurrent performance.
> > > > 
> > > > > Looking around it looks like kernfs_iattrs() is called from
> > > > > multiple
> > > > > places without a node database lock at all.
> > > > > 
> > > > > I'm thinking that, to keep my proposed change straight
> > > > > forward
> > > > > and on topic, I should just leave kernfs_refresh_inode()
> > > > > taking
> > > > > the node db write lock for now and consider the attributes
> > > > > handling
> > > > > as a separate change. Once that's done we could reconsider
> > > > > what's
> > > > > needed to use the node db read lock in
> > > > > kernfs_refresh_inode().
> > > > 
> > > > You meant taking write lock of kernfs_rwsem for
> > > > kernfs_refresh_inode()??
> > > > It may be a lot slower in my benchmark, let me test it.
> > > 
> > > Yes, but make sure the write lock of kernfs_rwsem is being taken
> > > not the read lock.
> > > 
> > > That's a mistake I had initially?
> > > 
> > > Still, that attributes handling is, I think, sufficient to
> > > warrant
> > > a separate change since it looks like it might need work, the
> > > kernfs
> > > node db probably should be kept stable for those attribute
> > > updates
> > > but equally the existence of an instantiated dentry might
> > > mitigate
> > > the it.
> > > 
> > > Some people might just know whether it's ok or not but I would
> > > like
> > > to check the callers to work out what's going on.
> > > 
> > > In any case it's academic if GCH isn't willing to consider the
> > > series
> > > for review and possible merge.
> > > 
> > Hi Ian
> > 
> > I removed kn->attr_mutex and changed read lock to write lock for
> > kernfs_refresh_inode
> > 
> > down_write(&kernfs_rwsem);
> > kernfs_refresh_inode(kn, inode);
> > up_write(&kernfs_rwsem);
> > 
> > 
> > Unfortunate, changes in this way make things worse,  my benchmark
> > runs
> > 100% slower than upstream sysfs.  :(
> > open+read+close a sysfs file concurrently took 1000us. (Currently,
> > sysfs with a big mutex kernfs_mutex only takes ~500us
> > for one open+read+close operation concurrently)
> 
> Right, so it does need attention nowish.
> 
> I'll have a look at it in a while, I really need to get a new autofs
> release out, and there are quite a few changes, and testing is seeing
> a number of errors, some old, some newly introduced. It's proving
> difficult.

I've taken a breather for the autofs testing and had a look at this.

I think my original analysis of this was wrong.

Could you try this patch please.
I'm not sure how much difference it will make but, in principle,
it's much the same as the previous approach except it doesn't
increase the kernfs node struct size or mess with the other
attribute handling code.

Note, this is not even compile tested.

kernfs: use kernfs read lock in .getattr() and .permission()

From: Ian Kent <raven@themaw.net>

From Documenation/filesystems.rst and (slightly outdated) comments
in fs/attr.c the inode i_rwsem is used for attribute handling.

This lock satisfies the requirememnts needed to reduce lock contention,
namely a per-object lock needs to be used rather than a file system
global lock with the kernfs node db held stable for read operations.

In particular it should reduce lock contention seen when calling the
kernfs .permission() method.

The inode methods .getattr() and .permission() do not hold the inode
i_rwsem lock when called as they are usually read operations. Also
the .permission() method checks for rcu-walk mode and returns -ECHILD
to the VFS if it is set. So the i_rwsem lock can be used in
kernfs_iop_getattr() and kernfs_iop_permission() to protect the inode
update done by kernfs_refresh_inode(). Using this lock allows the
kernfs node db write lock in these functions to be changed to a read
lock.

Signed-off-by: Ian Kent <raven@themaw.net>
---
 fs/kernfs/inode.c |   12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/fs/kernfs/inode.c b/fs/kernfs/inode.c
index ddaf18198935..568037e9efe9 100644
--- a/fs/kernfs/inode.c
+++ b/fs/kernfs/inode.c
@@ -189,9 +189,11 @@ int kernfs_iop_getattr(const struct path *path, struct kstat *stat,
 	struct inode *inode = d_inode(path->dentry);
 	struct kernfs_node *kn = inode->i_private;
 
-	down_write(&kernfs_rwsem);
+	inode_lock(inode);
+	down_read(&kernfs_rwsem);
 	kernfs_refresh_inode(kn, inode);
-	up_write(&kernfs_rwsem);
+	up_read(&kernfs_rwsem);
+	inode_unlock(inode);
 
 	generic_fillattr(inode, stat);
 	return 0;
@@ -281,9 +283,11 @@ int kernfs_iop_permission(struct inode *inode, int mask)
 
 	kn = inode->i_private;
 
-	down_write(&kernfs_rwsem);
+	inode_lock(inode);
+	down_read(&kernfs_rwsem);
 	kernfs_refresh_inode(kn, inode);
-	up_write(&kernfs_rwsem);
+	up_read(&kernfs_rwsem);
+	inode_unlock(inode);
 
 	return generic_permission(inode, mask);
 }


  reply	other threads:[~2020-12-17  4:47 UTC|newest]

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-17  7:37 [PATCH v2 0/6] kernfs: proposed locking and concurrency improvement Ian Kent
2020-06-17  7:37 ` [PATCH v2 1/6] kernfs: switch kernfs to use an rwsem Ian Kent
2020-06-17  7:37 ` [PATCH v2 2/6] kernfs: move revalidate to be near lookup Ian Kent
2020-06-17  7:37 ` [PATCH v2 3/6] kernfs: improve kernfs path resolution Ian Kent
2020-06-17  7:38 ` [PATCH v2 4/6] kernfs: use revision to identify directory node changes Ian Kent
2020-06-17  7:38 ` [PATCH v2 5/6] kernfs: refactor attr locking Ian Kent
2020-06-17  7:38 ` [PATCH v2 6/6] kernfs: make attr_mutex a local kernfs node lock Ian Kent
2020-06-19 15:38 ` [PATCH v2 0/6] kernfs: proposed locking and concurrency improvement Tejun Heo
2020-06-19 20:41   ` Rick Lindsley
2020-06-19 22:23     ` Tejun Heo
2020-06-20  2:44       ` Rick Lindsley
2020-06-22 17:53         ` Tejun Heo
2020-06-22 21:22           ` Rick Lindsley
2020-06-23 23:13             ` Tejun Heo
2020-06-24  9:04               ` Rick Lindsley
2020-06-24  9:27                 ` Greg Kroah-Hartman
2020-06-24 13:19                 ` Tejun Heo
2020-06-25  8:15               ` Ian Kent
2020-06-25  9:43                 ` Greg Kroah-Hartman
2020-06-26  0:19                   ` Ian Kent
2020-06-21  4:55       ` Ian Kent
2020-06-22 17:48         ` Tejun Heo
2020-06-22 18:03           ` Greg Kroah-Hartman
2020-06-22 21:27             ` Rick Lindsley
2020-06-23  5:21               ` Greg Kroah-Hartman
2020-06-23  5:09             ` Ian Kent
2020-06-23  6:02               ` Greg Kroah-Hartman
2020-06-23  8:01                 ` Ian Kent
2020-06-23  8:29                   ` Ian Kent
2020-06-23 11:49                   ` Greg Kroah-Hartman
2020-06-23  9:33                 ` Rick Lindsley
2020-06-23 11:45                   ` Greg Kroah-Hartman
2020-06-23 22:55                     ` Rick Lindsley
2020-06-23 11:51                   ` Ian Kent
2020-06-21  3:21   ` Ian Kent
2020-12-10 16:44 ` Fox Chen
2020-12-11  2:01   ` [PATCH " Ian Kent
2020-12-11  2:17     ` Ian Kent
2020-12-13  3:46       ` Ian Kent
2020-12-14  6:14         ` Fox Chen
2020-12-14 13:30           ` Ian Kent
2020-12-15  8:33             ` Fox Chen
2020-12-15 12:59               ` Ian Kent
2020-12-17  4:46                 ` Ian Kent [this message]
2020-12-17  8:54                   ` Fox Chen
2020-12-17 10:09                     ` Ian Kent
2020-12-17 11:09                       ` Ian Kent
2020-12-17 11:48                         ` Ian Kent
2020-12-17 15:14                           ` Tejun Heo
2020-12-18  7:36                             ` Ian Kent
2020-12-18  8:01                               ` Fox Chen
2020-12-18 11:21                                 ` Ian Kent
2020-12-18 13:20                                   ` Fox Chen
2020-12-19  0:53                                     ` Ian Kent
2020-12-19  7:47                                       ` Fox Chen
2020-12-22  2:17                                         ` Ian Kent
2020-12-18 14:59                               ` Tejun Heo
2020-12-19  7:08                                 ` Ian Kent
2020-12-19 16:23                                   ` Tejun Heo
2020-12-19 23:52                                     ` Ian Kent
2020-12-20  1:37                                       ` Ian Kent
2020-12-21  9:28                                       ` Fox Chen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=da4f730bbbb20c0920599ca5afc316e2c092b7d8.camel@themaw.net \
    --to=raven@themaw.net \
    --cc=akpm@linux-foundation.org \
    --cc=dhowells@redhat.com \
    --cc=foxhlchen@gmail.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    --cc=ricklind@linux.vnet.ibm.com \
    --cc=sfr@canb.auug.org.au \
    --cc=tj@kernel.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).