All of lore.kernel.org
 help / color / mirror / Atom feed
From: "J. Bruce Fields" <bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
To: NeilBrown <neilb-l3A5Bk7waGM@public.gmane.org>
Cc: Al Viro <viro-3bDd1+5oDREiFSDQTTA3OLVCufUGDwFn@public.gmane.org>,
	Nick Piggin <npiggin-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
	Nick Piggin <npiggin-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org>,
	linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Subject: Re: [PATCH] fs/dcache: allow __d_obtain_alias() to return unhashed dentries
Date: Fri, 29 Jun 2012 16:10:34 -0400	[thread overview]
Message-ID: <20120629201034.GA17103@fieldses.org> (raw)
In-Reply-To: <20120628135927.GA6406-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>

On Thu, Jun 28, 2012 at 09:59:27AM -0400, J. Bruce Fields wrote:
> Coming back to this now, just trying to review the
> filehandle-lookup/dcache interactions:
> 
> On Fri, Mar 11, 2011 at 03:07:49PM +1100, NeilBrown wrote:
> > 1/ Originally DCACHE_DISCONNECTED didn't really mean much - it's presence
> >    was only a hint, its absence was a strong statement.
> >    If the flag is set, the dentry might not be linked to the root.
> >    If it is clear, it definitely is link through to the root.
> >    However I think it was used with stronger intent than that.
> > 
> >    Now it seems to mean a little bit more:  If it is set and the dentry
> >    is hashed, then it must be on the sb->s_anon list.
> 
> The code that makes that assumption is __d_shrink (which does the work
> of d_drop)--it uses DCACHE_DISCONECTED to decide which hash chain to
> lock.
> 
> I can't find any basis for that assumption.  The only code that clears
> DCACHE_DISCONNECTED is in expfs.c, and it isn't done at the same time as
> hashing.  Am I missing something?
> 
> >    This is a significant
> >    which I never noticed (I haven't been watching).  Originally a
> >    disconnected dentry would be attached (and hashed) to its parent.  Then
> >    that parent would get its own parent and so on until it was attached all
> >    the way to the root.  Only then would be start clearing
> >    DCACHE_DISCONNECTED.  It seems we must clear it sooner now... I wonder if
> >    that is correct.
> 
> It looks wrong to me:
> 
> If we clear DCACHE_DISCONNECTED too early, then we risk a filehandle
> lookup thinking the dentry is OK to use.  That could mean for example
> trying to rename across directories that don't have any ancestor
> relationship to each other in the dcache yet.
> 
> So we need to wait to clear DCACHE_DISCONNECTED until we *know* the
> dentry's parents go all the way back to the root.  As you say, that's
> what the current code does.
> 
> But that means DCACHE_DISCONNECTED dentries can be hashed to their
> parents, and __d_shrink can be handed such dentries and then get the
> locking wrong.
> 
> It looks like this bug might originate with Nick Piggin's ceb5bdc2d246
> "fs: dcache per-bucket dcache hash locking"?  There's no discussion in
> the changelog, so probably it was just based on an unexamined assumption
> about DCACHE_DISCONNECTED.
> 
> I wonder if an IS_ROOT() test could replace the DCACHE_DISCONNECTED test
> in __d_shrink(), or if we need another flag, or ?

Bah, sorry, and I only just noticed that you already said as much later
and did the IS_ROOT() thing in your patch.

Anyway, here's just that one change with a slightly more painstaking
changelog.

--b.

commit b1fa644355122627424fe2240a9fc60cbef4c349
Author: J. Bruce Fields <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
Date:   Thu Jun 28 12:10:55 2012 -0400

    dcache: use IS_ROOT to decide where dentry is hashed
    
    Every hashed dentry is either hashed in the dentry_hashtable, or a
    superblock's s_anon list.
    
    __d_shrink assumes it can determine which is the case by checking
    DCACHE_DISCONNECTED; this is not true.
    
    It is true that when DCACHE_DISCONNECTED is cleared, the dentry is not
    only hashed on dentry_hashtable, but is fully connected to its parents
    back to the root.
    
    But the converse is *not* true: fs/exportfs/expfs.c:reconnect_path()
    attempts to connect a directory (found by filehandle lookup) back to
    root by ascending to parents and performing lookups one at a time.  It
    does not clear DCACHE_DISCONNECTED until its done, and that is not at
    all an atomic process.
    
    In particular, it is possible for DCACHE_DISCONNECTED to be set on a
    dentry which is hashed on the dentry_hashtable.
    
    Instead, use IS_ROOT() to check which hash chain a dentry is on.  This
    *does* work:
    
    Dentries are hashed only by:
    
    	- d_obtain_alias, which adds an IS_ROOT() dentry to sb_anon.
    
    	- __d_rehash, called by _d_rehash: hashes to the dentry's
    	  parent, and all callers of _d_rehash appear to have d_parent
    	  set to a "real" parent.
    	- __d_rehash, called by __d_move: rehashes the moved dentry to
    	  hash chain determined by target, and assigns target's d_parent
    	  to its d_parent, before dropping the dentry's d_lock.
    
    Therefore I believe it's safe for a holder of a dentry's d_lock to
    assume that it is hashed on sb_anon if and only if IS_ROOT(dentry) is
    true.
    
    I believe the incorrect assumption about DCACHE_DISCONNECTED was
    originally introduced by ceb5bdc2d246 "fs: dcache per-bucket dcache hash
    locking".
    
    Cc: Neil Brown <neilb-l3A5Bk7waGM@public.gmane.org>
    Cc: Nick Piggin <npiggin-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org>
    Signed-off-by: J. Bruce Fields <bfields-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>

diff --git a/fs/dcache.c b/fs/dcache.c
index 87c2da7..b2b382c 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -410,7 +410,7 @@ static void __d_shrink(struct dentry *dentry)
 {
 	if (!d_unhashed(dentry)) {
 		struct hlist_bl_head *b;
-		if (unlikely(dentry->d_flags & DCACHE_DISCONNECTED))
+		if (unlikely(IS_ROOT(dentry->d_flags)))
 			b = &dentry->d_sb->s_anon;
 		else
 			b = d_hash(dentry->d_parent, dentry->d_name.hash);
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

WARNING: multiple messages have this Message-ID (diff)
From: "J. Bruce Fields" <bfields@fieldses.org>
To: NeilBrown <neilb@suse.de>
Cc: Al Viro <viro@ZenIV.linux.org.uk>,
	Nick Piggin <npiggin@gmail.com>, Nick Piggin <npiggin@kernel.dk>,
	linux-nfs@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH] fs/dcache: allow __d_obtain_alias() to return unhashed dentries
Date: Fri, 29 Jun 2012 16:10:34 -0400	[thread overview]
Message-ID: <20120629201034.GA17103@fieldses.org> (raw)
In-Reply-To: <20120628135927.GA6406@fieldses.org>

On Thu, Jun 28, 2012 at 09:59:27AM -0400, J. Bruce Fields wrote:
> Coming back to this now, just trying to review the
> filehandle-lookup/dcache interactions:
> 
> On Fri, Mar 11, 2011 at 03:07:49PM +1100, NeilBrown wrote:
> > 1/ Originally DCACHE_DISCONNECTED didn't really mean much - it's presence
> >    was only a hint, its absence was a strong statement.
> >    If the flag is set, the dentry might not be linked to the root.
> >    If it is clear, it definitely is link through to the root.
> >    However I think it was used with stronger intent than that.
> > 
> >    Now it seems to mean a little bit more:  If it is set and the dentry
> >    is hashed, then it must be on the sb->s_anon list.
> 
> The code that makes that assumption is __d_shrink (which does the work
> of d_drop)--it uses DCACHE_DISCONECTED to decide which hash chain to
> lock.
> 
> I can't find any basis for that assumption.  The only code that clears
> DCACHE_DISCONNECTED is in expfs.c, and it isn't done at the same time as
> hashing.  Am I missing something?
> 
> >    This is a significant
> >    which I never noticed (I haven't been watching).  Originally a
> >    disconnected dentry would be attached (and hashed) to its parent.  Then
> >    that parent would get its own parent and so on until it was attached all
> >    the way to the root.  Only then would be start clearing
> >    DCACHE_DISCONNECTED.  It seems we must clear it sooner now... I wonder if
> >    that is correct.
> 
> It looks wrong to me:
> 
> If we clear DCACHE_DISCONNECTED too early, then we risk a filehandle
> lookup thinking the dentry is OK to use.  That could mean for example
> trying to rename across directories that don't have any ancestor
> relationship to each other in the dcache yet.
> 
> So we need to wait to clear DCACHE_DISCONNECTED until we *know* the
> dentry's parents go all the way back to the root.  As you say, that's
> what the current code does.
> 
> But that means DCACHE_DISCONNECTED dentries can be hashed to their
> parents, and __d_shrink can be handed such dentries and then get the
> locking wrong.
> 
> It looks like this bug might originate with Nick Piggin's ceb5bdc2d246
> "fs: dcache per-bucket dcache hash locking"?  There's no discussion in
> the changelog, so probably it was just based on an unexamined assumption
> about DCACHE_DISCONNECTED.
> 
> I wonder if an IS_ROOT() test could replace the DCACHE_DISCONNECTED test
> in __d_shrink(), or if we need another flag, or ?

Bah, sorry, and I only just noticed that you already said as much later
and did the IS_ROOT() thing in your patch.

Anyway, here's just that one change with a slightly more painstaking
changelog.

--b.

commit b1fa644355122627424fe2240a9fc60cbef4c349
Author: J. Bruce Fields <bfields@redhat.com>
Date:   Thu Jun 28 12:10:55 2012 -0400

    dcache: use IS_ROOT to decide where dentry is hashed
    
    Every hashed dentry is either hashed in the dentry_hashtable, or a
    superblock's s_anon list.
    
    __d_shrink assumes it can determine which is the case by checking
    DCACHE_DISCONNECTED; this is not true.
    
    It is true that when DCACHE_DISCONNECTED is cleared, the dentry is not
    only hashed on dentry_hashtable, but is fully connected to its parents
    back to the root.
    
    But the converse is *not* true: fs/exportfs/expfs.c:reconnect_path()
    attempts to connect a directory (found by filehandle lookup) back to
    root by ascending to parents and performing lookups one at a time.  It
    does not clear DCACHE_DISCONNECTED until its done, and that is not at
    all an atomic process.
    
    In particular, it is possible for DCACHE_DISCONNECTED to be set on a
    dentry which is hashed on the dentry_hashtable.
    
    Instead, use IS_ROOT() to check which hash chain a dentry is on.  This
    *does* work:
    
    Dentries are hashed only by:
    
    	- d_obtain_alias, which adds an IS_ROOT() dentry to sb_anon.
    
    	- __d_rehash, called by _d_rehash: hashes to the dentry's
    	  parent, and all callers of _d_rehash appear to have d_parent
    	  set to a "real" parent.
    	- __d_rehash, called by __d_move: rehashes the moved dentry to
    	  hash chain determined by target, and assigns target's d_parent
    	  to its d_parent, before dropping the dentry's d_lock.
    
    Therefore I believe it's safe for a holder of a dentry's d_lock to
    assume that it is hashed on sb_anon if and only if IS_ROOT(dentry) is
    true.
    
    I believe the incorrect assumption about DCACHE_DISCONNECTED was
    originally introduced by ceb5bdc2d246 "fs: dcache per-bucket dcache hash
    locking".
    
    Cc: Neil Brown <neilb@suse.de>
    Cc: Nick Piggin <npiggin@kernel.dk>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>

diff --git a/fs/dcache.c b/fs/dcache.c
index 87c2da7..b2b382c 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -410,7 +410,7 @@ static void __d_shrink(struct dentry *dentry)
 {
 	if (!d_unhashed(dentry)) {
 		struct hlist_bl_head *b;
-		if (unlikely(dentry->d_flags & DCACHE_DISCONNECTED))
+		if (unlikely(IS_ROOT(dentry->d_flags)))
 			b = &dentry->d_sb->s_anon;
 		else
 			b = d_hash(dentry->d_parent, dentry->d_name.hash);

  parent reply	other threads:[~2012-06-29 20:10 UTC|newest]

Thread overview: 73+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-11-12 18:43 lifetime of DCACHE_DISCONECTED dentries J. Bruce Fields
2010-11-13 11:53 ` Nick Piggin
2010-11-13 11:53   ` Nick Piggin
2010-11-15 17:48   ` J. Bruce Fields
2010-11-15 17:48     ` J. Bruce Fields
     [not found]     ` <20101115174837.GB10044-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2010-11-16  6:45       ` Nick Piggin
2010-11-16  6:45         ` Nick Piggin
2010-11-29  3:56         ` Nick Piggin
2010-11-29  3:56           ` Nick Piggin
2010-11-29 19:32           ` J. Bruce Fields
2010-11-29 19:32             ` J. Bruce Fields
     [not found]             ` <20101129193248.GA9897-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2010-11-30  1:00               ` Nick Piggin
2010-11-30  1:00                 ` Nick Piggin
     [not found]                 ` <AANLkTikwzDJ_q65==uxDsAhp3h8bU7Rkt7U9gVgRAK0D-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-11-30 18:39                   ` J. Bruce Fields
2010-11-30 18:39                     ` J. Bruce Fields
2010-12-03 22:33                   ` [PATCH] nfsd4: allow __d_obtain_alias() to return unhashed dentries J. Bruce Fields
2010-12-03 22:33                     ` J. Bruce Fields
2010-12-13  5:19                     ` Nick Piggin
2010-12-13  5:19                       ` Nick Piggin
2010-12-14 22:01                       ` J. Bruce Fields
2010-12-14 22:01                         ` J. Bruce Fields
     [not found]                         ` <20101214220102.GM24828-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2010-12-17 17:53                           ` [PATCH] fs/dcache: use standard list macro for d_find_alias J. Bruce Fields
2010-12-17 17:53                             ` J. Bruce Fields
2010-12-17 18:00                           ` [PATCH 2/2] fs/dcache: allow __d_obtain_alias() to return unhashed dentries J. Bruce Fields
2010-12-17 18:00                             ` J. Bruce Fields
2010-12-18  2:01                             ` Nick Piggin
2010-12-18  2:01                               ` Nick Piggin
2010-12-18 16:16                               ` J. Bruce Fields
2010-12-18 16:16                                 ` J. Bruce Fields
     [not found]                                 ` <20101218161609.GA22150-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2010-12-19 14:53                                   ` Nick Piggin
2010-12-19 14:53                                     ` Nick Piggin
     [not found]                                     ` <AANLkTingRv_gtRSctGzMfYrKg02M_sKj97HSQPRm_mA_-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-12-27 23:46                                       ` [PATCH] " J. Bruce Fields
2010-12-27 23:46                                         ` J. Bruce Fields
     [not found]                                         ` <20101227234641.GA22248-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2011-01-18 20:45                                           ` J. Bruce Fields
2011-01-18 20:45                                             ` J. Bruce Fields
     [not found]                                             ` <20110118204509.GA10903-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2011-01-18 22:02                                               ` Nick Piggin
2011-01-18 22:02                                                 ` Nick Piggin
     [not found]                                                 ` <AANLkTikL2CDSWQJ1QH_Y4G-j70Vd=VesNMMnYTmMGHC9-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2011-01-18 22:08                                                   ` J. Bruce Fields
2011-01-18 22:08                                                     ` J. Bruce Fields
     [not found]                                                     ` <20110118220817.GF10903-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2011-03-08 18:13                                                       ` J. Bruce Fields
2011-03-08 18:13                                                         ` J. Bruce Fields
     [not found]                                                         ` <20110308181320.GA15566-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2011-03-10 10:58                                                           ` Al Viro
2011-03-10 10:58                                                             ` Al Viro
     [not found]                                                             ` <20110310105821.GE22723-3bDd1+5oDREiFSDQTTA3OLVCufUGDwFn@public.gmane.org>
2011-03-11  4:07                                                               ` NeilBrown
2011-03-11  4:07                                                                 ` NeilBrown
     [not found]                                                                 ` <20110311150749.2fa2be66-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2012-02-14 17:03                                                                   ` J. Bruce Fields
2012-02-14 17:03                                                                     ` J. Bruce Fields
     [not found]                                                                     ` <20120214170300.GA4309-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2012-02-15 16:56                                                                       ` J. Bruce Fields
2012-02-15 16:56                                                                         ` J. Bruce Fields
     [not found]                                                                         ` <20120215165633.GE12490-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2012-02-16  3:06                                                                           ` NeilBrown
2012-02-16  3:06                                                                             ` NeilBrown
     [not found]                                                                             ` <20120216140603.08cb4900-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2012-02-16 11:51                                                                               ` J. Bruce Fields
2012-02-16 11:51                                                                                 ` J. Bruce Fields
     [not found]                                                                                 ` <20120216115133.GA20279-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2012-02-16 16:08                                                                                   ` J. Bruce Fields
2012-02-16 16:08                                                                                     ` J. Bruce Fields
2012-02-16 22:30                                                                             ` J. Bruce Fields
     [not found]                                                                               ` <20120216223011.GA23997-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2012-02-17 16:34                                                                                 ` Peng Tao
2012-02-17 16:34                                                                                   ` Peng Tao
2012-03-13 20:55                                                                                   ` J. Bruce Fields
2012-03-13 20:55                                                                                     ` J. Bruce Fields
2012-03-13 20:58                                                                                     ` [PATCH 1/2] vfs: stop d_splice_alias creating directory aliases J. Bruce Fields
2012-03-13 20:58                                                                                     ` [PATCH 2/2] vfs: remove unused __d_splice_alias argument J. Bruce Fields
2012-02-20  2:55                                                                                 ` [PATCH] fs/dcache: allow __d_obtain_alias() to return unhashed dentries NeilBrown
2012-02-20  2:55                                                                                   ` NeilBrown
     [not found]                                                                                   ` <20120220135537.3078e20b-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>
2012-02-29 23:10                                                                                     ` J. Bruce Fields
2012-02-29 23:10                                                                                       ` J. Bruce Fields
2012-06-28 13:59                                                                   ` J. Bruce Fields
2012-06-28 13:59                                                                     ` J. Bruce Fields
     [not found]                                                                     ` <20120628135927.GA6406-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2012-06-29 20:10                                                                       ` J. Bruce Fields [this message]
2012-06-29 20:10                                                                         ` J. Bruce Fields
     [not found]                                                                         ` <20120629201034.GA17103-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2012-06-29 20:29                                                                           ` J. Bruce Fields
2012-06-29 20:29                                                                             ` J. Bruce Fields
2012-07-01 23:15                                                                             ` NeilBrown

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120629201034.GA17103@fieldses.org \
    --to=bfields-uc3wqj2krung9huczpvpmw@public.gmane.org \
    --cc=linux-fsdevel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org \
    --cc=neilb-l3A5Bk7waGM@public.gmane.org \
    --cc=npiggin-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org \
    --cc=npiggin-tSWWG44O7X1aa/9Udqfwiw@public.gmane.org \
    --cc=viro-3bDd1+5oDREiFSDQTTA3OLVCufUGDwFn@public.gmane.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.