linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Frank Cusack <fcusack@fcusack.com>
To: lkml <linux-kernel@vger.kernel.org>, trond.myklebust@fys.uio.no
Subject: nfs_refresh_inode: inode number mismatch
Date: Tue, 3 Jun 2003 16:54:38 -0700	[thread overview]
Message-ID: <20030603165438.A24791@google.com> (raw)

Hi,

[Previously sent to nfs@sourceforge with no response]

I'm using a frankenstein kernel, 2.4.21-rc3 with some -ac bits,
and 2.5.69 NFS+RPC backported to it.  Like the CITI kernel (for krb5),
but a little more aggressive on the bits backported.  For the purpose
of this email, I think the code I have questions with is similar or even
identical from 2.4.21->2.5.69.  I can reproduce this problem on a RH
2.4.20-9smp kernel.

Consider these two shells running on the same machine:

	    1				    2

	cd /nfs				cd /nfs
	mkdir t
	echo foo > t/foo
	less t/foo
	 [less waits for input]
					rm -rf t
	'v'
	 [vi tries to access tmp/foo]

At this point, fs/nfs/inode.c:__nfs_refresh_inode() prints the "inode
number mismatch" error.  AFAICT, this is just noise, but the noise is
driving me crazy. :-)

Now, if sequence 2 is run on a different machine, there is no error!
So that hints to me that the local cache just needs to be cleared,
perhaps in nfs_rmdir() or maybe in nfs_unlink()/nfs_safe_remove().
I've tried a few things, but I'm not familiar enough with the code
and am making slow progress.  I can suppress this error by testing
for 'unlinked but open' in __nfs_refresh_inode:

        if (NFS_FILEID(inode) != fattr->fileid) {
		if (inode->i_nlink)	/* quiet if inode DNE anymore */
			printk(...)
	}

Do you think this is safe?  Some minimal logs:

kernel: NFS: dentry_delete(t/.nfs01c7d70600000001, 2)	| renamed file
kernel: NFS: delete_inode(e/29873926)			| unlink of renamed foo

kernel: NFS: refresh_inode(e/29873923 ct=1 info=0x6)	| accessing t/
kernel: nfs_refresh_inode: inode number mismatch
kernel: expected (0xe/0x1c7d703), got (0xe/0xe63bc2)
kernel: NFS: dentry_delete(fsstress/t, 0)
kernel: NFS: delete_inode(e/29873923)

and then access calls beginning at the root.  I apologize for the likely
uselessness of the above logs.  I can email some annotated logs if desired,
but the problem is very easy to reproduce, so I'll hold off for now.

This problem only exists for nfsv3.  This problem doesn't occur if there
is a third process also holding foo open (note that the directory does
get removed, just no kernel error when trying to access it).

The 2.2 kernel doesn't have this problem, because (apparently) it doesn't
allow you to unlink a .nfsXXX file while it's open (and therefore you
cannot remove the dir).

Which made me look around (2.5.69):  In nfs_silly_rename(), the new
dentry (sdentry) gets a d_count of 1.  Doesn't this indicate that no
one is holding this file open?  (which then tells nfs_unlink() to just
call nfs_safe_remove() rather than nfs_silly_rename())  Is that really
desirable?  Even if I set the d_count to match what the previous
dentry->d_count had, and avoid calling dput(sdentry), on the next run
through nfs_unlink() the d_count is 1 and it just goes to nfs_safe_remove().
I think that clearly, I don't understand what the d_move() is for.
(My guess is to avoid nfs_async_unlink() getting passed a dentry which
we are actually about to get rid of, but I haven't wrapped my head around
the dcache yet.)

Then I noticed that the DCACHE_NFSFS_RENAMED seems a little racy.
nfs_async_unlink() sets this and when the call completes,
nfs_complete_unlink() resets it.  So while it's being deleted, if an
rm -rf quickly picks up the .nfs name before the async unlink returns,
it won't get removed.  But if the nfs call completes first, it does
get removed.  Is the intention just to prevent removal of the .nfs
file until the old file is removed on the server?  What's the benefit
of this?

So, even with that error message quieted, fsstress reports lots of
inode mismatches.  I am in the process of trying to piece together a
simple reproducible sequence of NFS calls.

This is against a netapp server, although I can't see how the server would
matter.

Thanks for any advice, guidance, or hopefully fixes!  BTW, I'm interested
to hear what tools folks use to stress the NFS client.

/fc

             reply	other threads:[~2003-06-03 23:41 UTC|newest]

Thread overview: 53+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2003-06-03 23:54 Frank Cusack [this message]
2003-06-04 14:19 ` nfs_refresh_inode: inode number mismatch Trond Myklebust
2003-06-04 21:20   ` Frank Cusack
2003-06-04 21:28     ` Trond Myklebust
2003-06-09 13:51       ` [PATCH] nfs_unlink() race (was: nfs_refresh_inode: inode number mismatch) Frank Cusack
2003-06-09 13:55         ` Frank Cusack
2003-06-09 15:53         ` Linus Torvalds
2003-06-09 16:40           ` Trond Myklebust
2003-06-09 20:46             ` Frank Cusack
2003-06-10  0:01               ` Trond Myklebust
2003-06-11  0:54         ` viro
2003-06-11  1:28           ` Frank Cusack
2003-06-11  1:47             ` viro
2003-06-11  2:32               ` Frank Cusack
2003-06-11  2:37                 ` viro
2003-06-11  1:59           ` Trond Myklebust
2003-06-11  2:27             ` viro
2003-06-11  2:43               ` Frank Cusack
2003-06-11  2:50                 ` Frank Cusack
2003-06-11  3:00                 ` viro
2003-06-11  7:22                   ` [PATCH] NFS sillyrename fixes (was: [PATCH] nfs_unlink() race) Frank Cusack
2003-06-13  0:19                     ` [PATCH] NFS sillyrename fixes take 3 Frank Cusack
2003-06-11  3:00               ` [PATCH] nfs_unlink() race (was: nfs_refresh_inode: inode number mismatch) Trond Myklebust
2003-06-11  5:30           ` Linus Torvalds
2003-06-11  6:16             ` Andreas Dilger
2003-06-11 12:33             ` Alan Cox
2003-06-11 15:08               ` Linus Torvalds
2003-06-11 15:51                 ` Alan Cox
2003-06-11 16:11                   ` Linus Torvalds
2003-06-11 16:21                     ` Alan Cox
2003-06-11 16:31                       ` Linus Torvalds
2003-06-11 16:34                         ` viro
2003-06-11 17:22                         ` Alan Cox
2003-06-11 17:37                           ` Trond Myklebust
2003-06-11 17:47                             ` Trond Myklebust
2003-06-12 21:59                               ` Jan Harkes
     [not found]                             ` <16103.29804.198545.680701@charged.uio.no>
2003-06-11 22:24                               ` Frank Cusack
2003-06-11 23:16                                 ` Trond Myklebust
2003-06-11 23:35                                   ` [PATCH] nfs_unlink() race Frank Cusack
2003-06-05  9:11     ` nfs_refresh_inode: inode number mismatch Adrian Cox
2003-06-05  9:13       ` Russell King
2003-06-05 13:51         ` Trond Myklebust
  -- strict thread matches above, loose matches on Subject: below --
2001-07-17  0:24 Marco d'Itri
2001-07-17  9:44 ` Trond Myklebust
2001-07-18 22:25   ` Marco d'Itri
2001-07-19 11:00   ` Trond Myklebust
2001-02-22 22:30 Scott A McConnell
2001-02-22 21:59 ` Russell King
2001-02-23  9:30   ` Trond Myklebust
2001-02-08  1:13 Jun Sun
2001-02-08  1:22 ` Neil Brown
2001-02-08  8:08   ` Russell King
2001-02-09  0:02     ` Jun Sun

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20030603165438.A24791@google.com \
    --to=fcusack@fcusack.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=trond.myklebust@fys.uio.no \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).