linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Andreas Dilger <adilger@clusterfs.com>
To: Linus Torvalds <torvalds@transmeta.com>
Cc: viro@parcelfarce.linux.theplanet.co.uk,
	Frank Cusack <fcusack@fcusack.com>,
	Trond Myklebust <trond.myklebust@fys.uio.no>,
	marcelo@conectiva.com.br, lkml <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] nfs_unlink() race (was: nfs_refresh_inode: inode number mismatch)
Date: Wed, 11 Jun 2003 00:16:58 -0600	[thread overview]
Message-ID: <20030611001658.K12274@schatzie.adilger.int> (raw)
In-Reply-To: <Pine.LNX.4.44.0306102226300.17133-100000@home.transmeta.com>; from torvalds@transmeta.com on Tue, Jun 10, 2003 at 10:30:10PM -0700

On Jun 10, 2003  22:30 -0700, Linus Torvalds wrote:
> On Wed, 11 Jun 2003 viro@parcelfarce.linux.theplanet.co.uk wrote:
> > Two different dentries for the same file is obviously not a problem...
> 
> It _is_ a problem. It does the wrong thing on any subsequent directory
> operation (move or unlink). 
> 
> Multiple aliased dentries have never been ok, unless the filesystem 
> explicitly handles them and invalidates them (ie ntfs/fat kind of things).

Amusingly, we hit _exactly_ this same problem a couple of days ago
in Lustre because we were trying to use RW locks instead of the single
directory i_sem to improve large (10M files) directory lookup performance,
and it is hard to hit and even harder to detect.

We've gone back to using i_sem to protect the actual dentry instantiation
in lookup for now (we have a separate DLM for doing RW locking of the
directory for create/rename/unlink), but will be looking at ways to allow
this in the filesystem again at some time in the future.  Allowing RW
locking for local filesystem lookups would probably also be a win.

My 5-minute theory to fix this is to re-check the inode dentry alias list
when you are doing d_instantiate() (with dcache lock held) for another
dentry with the same hash (and further the same name if the hashes match),
but that would behave poorly when there are large numbers of links to a
single file.  It _might_ be safe to check the first N entries on the
i_dentry list, where N ~= num_cpus, assuming that we could only have N
racy dentry lookups going at one time.  Even so, the boost of allowing
multiple parallel lookups for huge directories probably beats out the
extra searching slowdown for the uncommon many-links-to-inode case.

Cheers, Andreas
--
Andreas Dilger
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/


  reply	other threads:[~2003-06-11  6:03 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2003-06-03 23:54 nfs_refresh_inode: inode number mismatch Frank Cusack
2003-06-04 14:19 ` Trond Myklebust
2003-06-04 21:20   ` Frank Cusack
2003-06-04 21:28     ` Trond Myklebust
2003-06-09 13:51       ` [PATCH] nfs_unlink() race (was: nfs_refresh_inode: inode number mismatch) Frank Cusack
2003-06-09 13:55         ` Frank Cusack
2003-06-09 15:53         ` Linus Torvalds
2003-06-09 16:40           ` Trond Myklebust
2003-06-09 20:46             ` Frank Cusack
2003-06-10  0:01               ` Trond Myklebust
2003-06-11  0:54         ` viro
2003-06-11  1:28           ` Frank Cusack
2003-06-11  1:47             ` viro
2003-06-11  2:32               ` Frank Cusack
2003-06-11  2:37                 ` viro
2003-06-11  1:59           ` Trond Myklebust
2003-06-11  2:27             ` viro
2003-06-11  2:43               ` Frank Cusack
2003-06-11  2:50                 ` Frank Cusack
2003-06-11  3:00                 ` viro
2003-06-11  7:22                   ` [PATCH] NFS sillyrename fixes (was: [PATCH] nfs_unlink() race) Frank Cusack
2003-06-13  0:19                     ` [PATCH] NFS sillyrename fixes take 3 Frank Cusack
2003-06-11  3:00               ` [PATCH] nfs_unlink() race (was: nfs_refresh_inode: inode number mismatch) Trond Myklebust
2003-06-11  5:30           ` Linus Torvalds
2003-06-11  6:16             ` Andreas Dilger [this message]
2003-06-11 12:33             ` Alan Cox
2003-06-11 15:08               ` Linus Torvalds
2003-06-11 15:51                 ` Alan Cox
2003-06-11 16:11                   ` Linus Torvalds
2003-06-11 16:21                     ` Alan Cox
2003-06-11 16:31                       ` Linus Torvalds
2003-06-11 16:34                         ` viro
2003-06-11 17:22                         ` Alan Cox
2003-06-11 17:37                           ` Trond Myklebust
2003-06-11 17:47                             ` Trond Myklebust
2003-06-12 21:59                               ` Jan Harkes
     [not found]                             ` <16103.29804.198545.680701@charged.uio.no>
2003-06-11 22:24                               ` Frank Cusack
2003-06-11 23:16                                 ` Trond Myklebust
2003-06-11 23:35                                   ` [PATCH] nfs_unlink() race Frank Cusack
2003-06-05  9:11     ` nfs_refresh_inode: inode number mismatch Adrian Cox
2003-06-05  9:13       ` Russell King
2003-06-05 13:51         ` Trond Myklebust

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20030611001658.K12274@schatzie.adilger.int \
    --to=adilger@clusterfs.com \
    --cc=fcusack@fcusack.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=marcelo@conectiva.com.br \
    --cc=torvalds@transmeta.com \
    --cc=trond.myklebust@fys.uio.no \
    --cc=viro@parcelfarce.linux.theplanet.co.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).