linux-kernel.vger.kernel.org archive mirror
* nfs_refresh_inode: inode number mismatch
@ 2001-02-08  1:13 Jun Sun
  2001-02-08  1:22 ` Neil Brown
  0 siblings, 1 reply; 18+ messages in thread
From: Jun Sun @ 2001-02-08  1:13 UTC
  To: linux-kernel


This is a weird problem that I am looking at right now.  It seems to indicate
a bug in the NFS server.

I have a MIPS machine that boots from an NFS root fs hosted on a Red Hat 6.2
workstation.  Everything works fine, except that after a few reboots I start
to see error messages like the following:

Freeing unused kernel memory: 24k freed
INIT: version 2.77 booting
nfs_refresh_inode: inode number mismatch
expected (0x308/0x28b3d2), got (0x308/0x12b91b)
INIT: Entering runlevel: 3
sh-2.03# 

Restarting the NFS server on the host does not get rid of the messages.
Rebooting the host does help.

I traced the network packets, and the server appears to return a wrong fileid
in a "write reply" message.  Below is a segment of the extracted packet trace.
It shows the NFS server returning a fileid that differs from the one it
returned earlier to the client for the same handle.  The confusing part is
that the server answers the first write request, and a couple of other
requests, correctly, but then fails on the second write and returns a wrong
fileid.
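
The hex pairs in the kernel message and the decimal fileid values in a trace are easier to compare once both are in the same base. A minimal sketch, assuming the message keeps the "expected (DEV/INO), got (DEV/INO)" shape quoted above; the names `nfs_id` and `parse_mismatch` are hypothetical helpers, not kernel code:

```c
#include <stdio.h>

/* Device/fileid pair as printed by the client, e.g. "0x308/0x28b3d2". */
struct nfs_id {
    unsigned long dev;
    unsigned long fileid;
};

/*
 * Parse one "expected (DEV/INO), got (DEV/INO)" line.
 * Returns 0 on success, -1 if the line does not have that shape.
 */
static int parse_mismatch(const char *line, struct nfs_id *expected,
                          struct nfs_id *got)
{
    /* %lx accepts an optional 0x prefix, so the values parse as printed. */
    int n = sscanf(line, "expected (%lx/%lx), got (%lx/%lx)",
                   &expected->dev, &expected->fileid,
                   &got->dev, &got->fileid);
    return n == 4 ? 0 : -1;
}
```

For the message quoted above, 0x28b3d2 is 2667474 and 0x12b91b is 1227035 in decimal, which are the two fileid values that appear in the trace below.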

In my particular setup, it seems only certain files (inodes) tend to get
screwed up.

Does anybody have an idea as to what is wrong here?

Please cc your reply to my email address.  TIA.


Jun

------------------
round 3:

case 1:

2177 lookup:
        ioctl.save

2178 lookup reply:
        fileid: 2667474
        handle:
cabaebfed2b32800e6ab2800080300000803000054c21100b2302b0c00000000

2181 write:
        offset:0
        total count: 60
        handle:
cabaebfed2b32800e6ab2800080300000803000054c21100b2302b0c00000000

2182 write reply:
        fileid: 2667474
        size: 60

2183 setattr:
        handle:
cabaebfed2b32800e6ab2800080300000803000054c21100b2302b0c00000000

2184 setattr reply:
        fileid: 2667474

2185 write:
        handle:
cabaebfed2b32800e6ab2800080300000803000054c21100b2302b0c00000000

2186 write reply:
        fileid: 1227035
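
The handle in this trace can be read as a sequence of 32-bit little-endian words. A minimal sketch of that reading follows; the handle layout is private to the server, so treating bytes 4-7 as the inode number and bytes 12-15 as the device is an assumption drawn only from this trace, and `fh_word_le` is a hypothetical helper:

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Read the 32-bit little-endian word that starts at byte offset `off`
 * of a hex-encoded file handle. Returns 0 on success, -1 on bad input.
 */
static int fh_word_le(const char *hex, size_t off, uint32_t *out)
{
    size_t len = strlen(hex);
    uint32_t v = 0;

    for (size_t i = 0; i < 4; i++) {
        unsigned int byte;

        /* Each byte needs two hex characters inside the string. */
        if (2 * (off + i) + 2 > len)
            return -1;
        if (sscanf(hex + 2 * (off + i), "%2x", &byte) != 1)
            return -1;
        v |= (uint32_t)byte << (8 * i);
    }
    *out = v;
    return 0;
}
```

Applied to the handle above, offset 4 gives 0x28b3d2 (2667474) and offset 12 gives 0x308, the same pair the client printed as "expected", while the last write reply carries fileid 1227035 for that same handle.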

* nfs_refresh_inode: inode number mismatch
@ 2001-02-22 22:30 Scott A McConnell
  2001-02-22 21:59 ` Russell King
  0 siblings, 1 reply; 18+ messages in thread
From: Scott A McConnell @ 2001-02-22 22:30 UTC
  To: linux-kernel

I am getting the following NFS errors/warnings:

VFS: Mounted root (nfs filesystem).
Freeing unused kernel memory: 196k freed
nfs_refresh_inode: inode number mismatch
expected (0x806/0x6246a), got (0x806/0x62b48)
                ^/var/run/utmp       ^/var/log/wtmp
nfs_refresh_inode: inode number mismatch
expected (0x806/0x62b48), got (0x806/0x6246a)
nfs_refresh_inode: inode number mismatch
expected (0x806/0x6246a), got (0x806/0x62b48)
nfs_refresh_inode: inode number mismatch
expected (0x806/0x62b4f), got (0x806/0x6246a)
                ^/var/run/inetd.pid
nfs_refresh_inode: inode number mismatch
expected (0x806/0x6246a), got (0x806/0x62b48)
nfs_refresh_inode: inode number mismatch
expected (0x806/0x62b48), got (0x806/0x6246a)
nfs_refresh_inode: inode number mismatch
expected (0x806/0x62b48), got (0x806/0x6246a)
nfs_refresh_inode: inode number mismatch
expected (0x806/0x6246a), got (0x806/0x62b48)
nfs_refresh_inode: inode number mismatch
expected (0x806/0x42d60), got (0x806/0x42d5f)
nfs_refresh_inode: inode number mismatch
expected (0x806/0x6246a), got (0x806/0x62b48)
nfs_refresh_inode: inode number mismatch
expected (0x806/0x6246a), got (0x806/0x62b48)
nfs_refresh_inode: inode number mismatch
expected (0x806/0x6246a), got (0x806/0x62b48)

I am running RedHat Linux version 2.2.16-3 on my PC and Hardhat Linux
version 2.4.0-test5 on my MIPS board. Any thoughts or suggestions?

I saw a discussion start on the ARM list along these lines but I never
saw a solution.

Please CC me at samcconn@cotw.com

Thanks,
Scott



* nfs_refresh_inode: inode number mismatch
@ 2001-07-17  0:24 Marco d'Itri
  2001-07-17  9:44 ` Trond Myklebust
  0 siblings, 1 reply; 18+ messages in thread
From: Marco d'Itri @ 2001-07-17  0:24 UTC
  To: linux-kernel

Jul 18 00:15:07 newsserver kernel: nfs_refresh_inode: inode number mismatch
Jul 18 00:15:07 newsserver kernel: expected (0x3b30ac75/0x48d5), got (0x3b30ac75/0x8d04)

I've got a flood of these messages while talking to a Procom NAS.
Should I worry? Upgrade/patch the kernel? Yell at procom tech support?


Linux newsserver 2.4.5 #1 Fri Jun 22 18:18:56 CEST 2001 i686 unknown

192.168.139.11:/news_store on /shared/archive type nfs (rw,noatime,rsize=8192,wsize=8192,udp,nfsvers=3,addr=192.168.139.11)


-- 
ciao,
Marco

* nfs_refresh_inode: inode number mismatch
@ 2003-06-03 23:54 Frank Cusack
  2003-06-04 14:19 ` Trond Myklebust
  0 siblings, 1 reply; 18+ messages in thread
From: Frank Cusack @ 2003-06-03 23:54 UTC
  To: lkml, trond.myklebust

Hi,

[Previously sent to nfs@sourceforge with no response]

I'm using a frankenstein kernel, 2.4.21-rc3 with some -ac bits,
and 2.5.69 NFS+RPC backported to it.  Like the CITI kernel (for krb5),
but a little more aggressive on the bits backported.  For the purpose
of this email, I think the code I have questions about is similar or even
identical from 2.4.21 through 2.5.69.  I can reproduce this problem on a RH
2.4.20-9smp kernel.

Consider these two shells running on the same machine:

	    1				    2

	cd /nfs				cd /nfs
	mkdir t
	echo foo > t/foo
	less t/foo
	 [less waits for input]
					rm -rf t
	'v'
	 [vi tries to access t/foo]

At this point, fs/nfs/inode.c:__nfs_refresh_inode() prints the "inode
number mismatch" error.  AFAICT, this is just noise, but the noise is
driving me crazy. :-)

Now, if sequence 2 is run on a different machine, there is no error!
So that hints to me that the local cache just needs to be cleared,
perhaps in nfs_rmdir() or maybe in nfs_unlink()/nfs_safe_remove().
I've tried a few things, but I'm not familiar enough with the code
and am making slow progress.  I can suppress this error by testing
for 'unlinked but open' in __nfs_refresh_inode:

	if (NFS_FILEID(inode) != fattr->fileid) {
		if (inode->i_nlink)	/* stay quiet if the inode no longer exists */
			printk(...);
	}

Do you think this is safe?  Some minimal logs:

kernel: NFS: dentry_delete(t/.nfs01c7d70600000001, 2)	| renamed file
kernel: NFS: delete_inode(e/29873926)			| unlink of renamed foo

kernel: NFS: refresh_inode(e/29873923 ct=1 info=0x6)	| accessing t/
kernel: nfs_refresh_inode: inode number mismatch
kernel: expected (0xe/0x1c7d703), got (0xe/0xe63bc2)
kernel: NFS: dentry_delete(fsstress/t, 0)
kernel: NFS: delete_inode(e/29873923)

and then access calls beginning at the root.  I apologize for the likely
uselessness of the above logs.  I can email some annotated logs if desired,
but the problem is very easy to reproduce, so I'll hold off for now.
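
The numbers in these logs line up once the bases are matched: 0x1c7d703 is 29873923 (the t/ directory in refresh_inode), and the first eight hex digits of .nfs01c7d70600000001 are 0x01c7d706, which is 29873926 (the inode in the first delete_inode). A minimal sketch of that decoding, assuming the name is ".nfs" followed by two 8-digit hex fields as it appears in this log; the field widths may differ on other kernels or architectures, and `parse_silly_name` is a hypothetical helper:

```c
#include <stdio.h>
#include <string.h>

/*
 * Split a name of the form ".nfs" + 8 hex digits + 8 hex digits into its
 * two numbers. Returns 0 on success, -1 if the name has another shape.
 */
static int parse_silly_name(const char *name, unsigned long *fileid,
                            unsigned long *counter)
{
    if (strncmp(name, ".nfs", 4) != 0 || strlen(name) != 4 + 16)
        return -1;
    /* Reject anything that is not a plain hex digit before scanning. */
    if (strspn(name + 4, "0123456789abcdefABCDEF") != 16)
        return -1;
    if (sscanf(name + 4, "%8lx%8lx", fileid, counter) != 2)
        return -1;
    return 0;
}
```

For the name in the log this yields fileid 29873926 and counter 1.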

This problem only exists for nfsv3.  This problem doesn't occur if there
is a third process also holding foo open (note that the directory does
get removed, just no kernel error when trying to access it).

The 2.2 kernel doesn't have this problem, because (apparently) it doesn't
allow you to unlink a .nfsXXX file while it's open (and therefore you
cannot remove the dir).

Which made me look around (2.5.69):  In nfs_silly_rename(), the new
dentry (sdentry) gets a d_count of 1.  Doesn't this indicate that no
one is holding this file open?  (which then tells nfs_unlink() to just
call nfs_safe_remove() rather than nfs_silly_rename())  Is that really
desirable?  Even if I set the d_count to match what the previous
dentry->d_count had, and avoid calling dput(sdentry), on the next run
through nfs_unlink() the d_count is 1 and it just goes to nfs_safe_remove().
Clearly, I don't understand what the d_move() is for.
(My guess is to avoid nfs_async_unlink() getting passed a dentry which
we are actually about to get rid of, but I haven't wrapped my head around
the dcache yet.)

Then I noticed that the DCACHE_NFSFS_RENAMED handling seems a little racy.
nfs_async_unlink() sets this and when the call completes,
nfs_complete_unlink() resets it.  So while it's being deleted, if an
rm -rf quickly picks up the .nfs name before the async unlink returns,
it won't get removed.  But if the nfs call completes first, it does
get removed.  Is the intention just to prevent removal of the .nfs
file until the old file is removed on the server?  What's the benefit
of this?
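
The ordering described above can be written down as a toy user-space model. It is not kernel code: `toy_dentry`, `TOY_RENAMED` and the three functions are hypothetical stand-ins, and the -EBUSY return for the refused removal is an assumption of the model rather than something stated in this message.

```c
#include <errno.h>

#define TOY_RENAMED 0x1   /* hypothetical stand-in for DCACHE_NFSFS_RENAMED */

struct toy_dentry {
    unsigned int flags;
};

/* Stand-in for nfs_async_unlink(): mark the dentry while the call is in flight. */
static void toy_async_unlink_start(struct toy_dentry *d)
{
    d->flags |= TOY_RENAMED;
}

/* Stand-in for nfs_complete_unlink(): the call has returned, clear the mark. */
static void toy_async_unlink_done(struct toy_dentry *d)
{
    d->flags &= ~TOY_RENAMED;
}

/* A concurrent removal of the .nfs name: refused while the mark is set. */
static int toy_remove(const struct toy_dentry *d)
{
    return (d->flags & TOY_RENAMED) ? -EBUSY : 0;
}
```

In this model the outcome of the rm depends only on whether it runs before or after toy_async_unlink_done(), which is the window the paragraph above calls racy.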

So, even with that error message quieted, fsstress reports lots of
inode mismatches.  I am in the process of trying to piece together a
simple reproducible sequence of NFS calls.

This is against a netapp server, although I can't see how the server would
matter.

Thanks for any advice, guidance, or hopefully fixes!  BTW, I'm interested
to hear what tools folks use to stress the NFS client.

/fc


Thread overview: 18+ messages
2001-02-08  1:13 nfs_refresh_inode: inode number mismatch Jun Sun
2001-02-08  1:22 ` Neil Brown
2001-02-08  8:08   ` Russell King
2001-02-09  0:02     ` Jun Sun
2001-02-22 22:30 Scott A McConnell
2001-02-22 21:59 ` Russell King
2001-02-23  9:30   ` Trond Myklebust
2001-07-17  0:24 Marco d'Itri
2001-07-17  9:44 ` Trond Myklebust
2001-07-18 22:25   ` Marco d'Itri
2001-07-19 11:00   ` Trond Myklebust
2003-06-03 23:54 Frank Cusack
2003-06-04 14:19 ` Trond Myklebust
2003-06-04 21:20   ` Frank Cusack
2003-06-04 21:28     ` Trond Myklebust
2003-06-05  9:11     ` Adrian Cox
2003-06-05  9:13       ` Russell King
2003-06-05 13:51         ` Trond Myklebust
