linux-kernel.vger.kernel.org archive mirror
* nfs_refresh_inode: inode number mismatch
@ 2001-07-17  0:24 Marco d'Itri
  2001-07-17  9:44 ` Trond Myklebust
  0 siblings, 1 reply; 18+ messages in thread
From: Marco d'Itri @ 2001-07-17  0:24 UTC (permalink / raw)
  To: linux-kernel

Jul 18 00:15:07 newsserver kernel: nfs_refresh_inode: inode number mismatch
Jul 18 00:15:07 newsserver kernel: expected (0x3b30ac75/0x48d5), got (0x3b30ac75/0x8d04)

I've got a flood of these messages while talking to a Procom NAS.
Should I worry? Upgrade/patch the kernel? Yell at Procom tech support?


Linux newsserver 2.4.5 #1 Fri Jun 22 18:18:56 CEST 2001 i686 unknown

192.168.139.11:/news_store on /shared/archive type nfs (rw,noatime,rsize=8192,wsize=8192,udp,nfsvers=3,addr=192.168.139.11)


-- 
ciao,
Marco


* Re: nfs_refresh_inode: inode number mismatch
  2001-07-17  0:24 nfs_refresh_inode: inode number mismatch Marco d'Itri
@ 2001-07-17  9:44 ` Trond Myklebust
  2001-07-18 22:25   ` Marco d'Itri
  2001-07-19 11:00   ` Trond Myklebust
  0 siblings, 2 replies; 18+ messages in thread
From: Trond Myklebust @ 2001-07-17  9:44 UTC (permalink / raw)
  To: Marco d'Itri; +Cc: linux-kernel

>>>>> " " == Marco d'Itri <md@Linux.IT> writes:

     > Jul 18 00:15:07 newsserver kernel: nfs_refresh_inode: inode number mismatch
     > Jul 18 00:15:07 newsserver kernel: expected (0x3b30ac75/0x48d5), got (0x3b30ac75/0x8d04)

     > I've got a flood of these messages while talking to a Procom
     > NAS.  Should I worry? Upgrade/patch the kernel? Yell at
     > Procom tech support?

Have you applied any extra patches to NFS? I remember one of my
patches (available from my WWW-page, but clearly marked experimental)
was generating these messages gratuitously.

If, on the other hand, you're using a clean kernel, I'd look into what
the server is doing. It sounds like it's doing the same thing that the
userland `nfs-server' does: namely to recycle filehandles after a file
gets deleted...

Cheers,
  Trond


* Re: nfs_refresh_inode: inode number mismatch
  2001-07-17  9:44 ` Trond Myklebust
@ 2001-07-18 22:25   ` Marco d'Itri
  2001-07-19 11:00   ` Trond Myklebust
  1 sibling, 0 replies; 18+ messages in thread
From: Marco d'Itri @ 2001-07-18 22:25 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: linux-kernel

On Jul 17, Trond Myklebust <trond.myklebust@fys.uio.no> wrote:

 >     > Jul 18 00:15:07 newsserver kernel: nfs_refresh_inode: inode number mismatch
 >     > Jul 18 00:15:07 newsserver kernel: expected (0x3b30ac75/0x48d5), got (0x3b30ac75/0x8d04)

 >     > I've got a flood of these messages while talking to a Procom
 >     > NAS.  Should I worry? Upgrade/patch the kernel? Yell at
 >     > Procom tech support?

 >Have you applied any extra patches to NFS? I remember one of my
No, the kernel is plain unpatched 2.4.5.

 >If, on the other hand, you're using a clean kernel, I'd look into what
 >the server is doing. It sounds like it's doing the same thing that the
 >userland `nfs-server' does: namely to recycle filehandles after a file
 >gets deleted...
Anything specific I can tell their tech support?

Can I ignore these messages, or do I risk data corruption?

-- 
ciao,
Marco


* Re: nfs_refresh_inode: inode number mismatch
  2001-07-17  9:44 ` Trond Myklebust
  2001-07-18 22:25   ` Marco d'Itri
@ 2001-07-19 11:00   ` Trond Myklebust
  1 sibling, 0 replies; 18+ messages in thread
From: Trond Myklebust @ 2001-07-19 11:00 UTC (permalink / raw)
  To: Marco d'Itri; +Cc: Linux Kernel

>>>>> " " == Marco d'Itri <md@Linux.IT> writes:

     > On Jul 17, Trond Myklebust <trond.myklebust@fys.uio.no> wrote:
    >> > Jul 18 00:15:07 newsserver kernel: nfs_refresh_inode: inode number mismatch
    >> > Jul 18 00:15:07 newsserver kernel: expected (0x3b30ac75/0x48d5), got (0x3b30ac75/0x8d04)

    >> If, on the other hand, you're using a clean kernel, I'd look
    >> into what the server is doing. It sounds like it's doing the
    >> same thing that the userland `nfs-server' does: namely to
    >> recycle filehandles after a file gets deleted...
     > Anything specific I can tell their tech support?

     > Can I ignore these messages, or do I risk data corruption?

There's always a small danger of data corruption, since the NFS client
can't rely on the file handle actually being a pointer to the file we
expect.

Try 2.4.6 first though, as a couple of fixes were implemented there
that should reduce the frequency of such messages. Basically we ensure
that an inode is removed from the cache when we believe that it has
been deleted.

A proper fix, though, would be for the server to implement filehandles
that are unique as per RFC1813...
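
A toy model of the point above, written in user-space C. It is only a
sketch: the struct and function names (toy_server, toy_fh, toy_getattr)
are made up for illustration, and the per-slot generation counter is an
assumption about one common way to keep handles unique, not something
stated in this thread. With the generation check off, an old handle
silently resolves to whatever file now occupies the slot, which is the
situation that makes the client see a different fileid than it cached.

```c
#include <stdint.h>

#define SLOTS 4

/* Toy server: each slot holds one file; fileid 0 means the slot is free. */
struct toy_server {
    uint64_t fileid[SLOTS];      /* fileid currently stored in each slot */
    uint32_t generation[SLOTS];  /* bumped every time a slot is (re)used */
    uint64_t next_fileid;        /* last fileid handed out */
};

/* A handle as a client would cache it (hypothetical layout). */
struct toy_fh {
    uint32_t slot;
    uint32_t generation;
};

/* Create a file in the first free slot and return its handle. */
static struct toy_fh toy_create(struct toy_server *s)
{
    for (uint32_t i = 0; i < SLOTS; i++) {
        if (s->fileid[i] == 0) {
            s->fileid[i] = ++s->next_fileid;
            s->generation[i]++;
            return (struct toy_fh){ .slot = i, .generation = s->generation[i] };
        }
    }
    return (struct toy_fh){ .slot = SLOTS, .generation = 0 };  /* table full */
}

/* Delete the file; the slot becomes available for reuse. */
static void toy_remove(struct toy_server *s, struct toy_fh fh)
{
    s->fileid[fh.slot] = 0;
}

/*
 * GETATTR-like lookup. Returns the fileid, or 0 for "stale handle".
 * With check_generation == 0 the server behaves like one that
 * recycles handles after a delete.
 */
static uint64_t toy_getattr(const struct toy_server *s, struct toy_fh fh,
                            int check_generation)
{
    if (fh.slot >= SLOTS || s->fileid[fh.slot] == 0)
        return 0;
    if (check_generation && s->generation[fh.slot] != fh.generation)
        return 0;
    return s->fileid[fh.slot];
}

/* Client-side comparison: a real answer that differs from the cached fileid. */
static int fileid_mismatch(uint64_t cached, uint64_t returned)
{
    return returned != 0 && cached != returned;
}
```

Creating a file, removing it, and creating another one reuses slot 0.
The old handle then returns the new file's fileid when the generation
is ignored, and a stale-handle result when it is checked.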

Cheers,
  Trond


* Re: nfs_refresh_inode: inode number mismatch
  2003-06-05  9:13       ` Russell King
@ 2003-06-05 13:51         ` Trond Myklebust
  0 siblings, 0 replies; 18+ messages in thread
From: Trond Myklebust @ 2003-06-05 13:51 UTC (permalink / raw)
  To: Russell King; +Cc: Adrian Cox, Frank Cusack, linux-kernel

>>>>> " " == Russell King <rmk@arm.linux.org.uk> writes:

     > If this is the case, you need to ensure that you don't reboot
     > the client before the server's XID cache times out the XID
     > numbers.  For Linux knfsd, that's around 2 minutes.

Note that older versions of knfsd didn't time out their replay cache
at all...

Cheers,
  Trond


* Re: nfs_refresh_inode: inode number mismatch
  2003-06-05  9:11     ` Adrian Cox
@ 2003-06-05  9:13       ` Russell King
  2003-06-05 13:51         ` Trond Myklebust
  0 siblings, 1 reply; 18+ messages in thread
From: Russell King @ 2003-06-05  9:13 UTC (permalink / raw)
  To: Adrian Cox; +Cc: Frank Cusack, trond.myklebust, linux-kernel

On Thu, Jun 05, 2003 at 10:11:20AM +0100, Adrian Cox wrote:
> There's a very common cause on embedded boards that don't have
> real-time clocks. Without a clock the client uses the same XID on every
> run, leading to lots of these messages. Is your clock broken?

BTDT.

If this is the case, you need to ensure that you don't reboot the client
before the server's XID cache times out the XID numbers.  For Linux knfsd,
that's around 2 minutes.

-- 
Russell King (rmk@arm.linux.org.uk)                The developer of ARM Linux
             http://www.arm.linux.org.uk/personal/aboutme.html



* Re: nfs_refresh_inode: inode number mismatch
  2003-06-04 21:20   ` Frank Cusack
  2003-06-04 21:28     ` Trond Myklebust
@ 2003-06-05  9:11     ` Adrian Cox
  2003-06-05  9:13       ` Russell King
  1 sibling, 1 reply; 18+ messages in thread
From: Adrian Cox @ 2003-06-05  9:11 UTC (permalink / raw)
  To: Frank Cusack; +Cc: trond.myklebust, linux-kernel

On Wed, 4 Jun 2003 14:20:47 -0700
"Frank Cusack" <fcusack@fcusack.com> wrote:

> On Wed, Jun 04, 2003 at 04:19:38PM +0200, Trond Myklebust wrote:
> > >>>>> " " == Frank Cusack <fcusack@fcusack.com> writes:
> >      > At this point, fs/nfs/inode.c:__nfs_refresh_inode() prints
> >      > the "inode number mismatch" error.  AFAICT, this is just
> >      > noise, but the noise is driving me crazy. :-)
> > 
> > Inode number mismatch points to either an obvious server error
> > (it is not providing unique filehandles) or corruption of the fattr
> > struct that was passed to nfs_refresh_inode().

There's a very common cause on embedded boards that don't have
real-time clocks. Without a clock the client uses the same XID on every
run, leading to lots of these messages. Is your clock broken?

- Adrian Cox
http://www.humboldt.co.uk/


* Re: nfs_refresh_inode: inode number mismatch
  2003-06-04 21:20   ` Frank Cusack
@ 2003-06-04 21:28     ` Trond Myklebust
  2003-06-05  9:11     ` Adrian Cox
  1 sibling, 0 replies; 18+ messages in thread
From: Trond Myklebust @ 2003-06-04 21:28 UTC (permalink / raw)
  To: Frank Cusack; +Cc: lkml

>>>>> " " == Frank Cusack <fcusack@fcusack.com> writes:

     > Could you take another look at the specific case I cited?  At
     > the time I try to access the file, the path to it no longer
     > exists.  No information on this file should exist.

I cannot duplicate.

Cheers,
  Trond


* Re: nfs_refresh_inode: inode number mismatch
  2003-06-04 14:19 ` Trond Myklebust
@ 2003-06-04 21:20   ` Frank Cusack
  2003-06-04 21:28     ` Trond Myklebust
  2003-06-05  9:11     ` Adrian Cox
  0 siblings, 2 replies; 18+ messages in thread
From: Frank Cusack @ 2003-06-04 21:20 UTC (permalink / raw)
  To: Trond Myklebust; +Cc: lkml

On Wed, Jun 04, 2003 at 04:19:38PM +0200, Trond Myklebust wrote:
> >>>>> " " == Frank Cusack <fcusack@fcusack.com> writes:
>      > At this point, fs/nfs/inode.c:__nfs_refresh_inode() prints the
>      > "inode number mismatch" error.  AFAICT, this is just noise, but
>      > the noise is driving me crazy. :-)
> 
> Inode number mismatch points to either an obvious server error (it
> is not providing unique filehandles) or corruption of the fattr struct
> that was passed to nfs_refresh_inode().

Clearly it's not the former.  No way a netapp filer is going to have
this problem.  I can't imagine *any* nfs server having this problem.

Could you take another look at the specific case I cited?  At the time
I try to access the file, the path to it no longer exists.  No information
on this file should exist.

/fc


* Re: nfs_refresh_inode: inode number mismatch
  2003-06-03 23:54 Frank Cusack
@ 2003-06-04 14:19 ` Trond Myklebust
  2003-06-04 21:20   ` Frank Cusack
  0 siblings, 1 reply; 18+ messages in thread
From: Trond Myklebust @ 2003-06-04 14:19 UTC (permalink / raw)
  To: Frank Cusack; +Cc: lkml, trond.myklebust

>>>>> " " == Frank Cusack <fcusack@fcusack.com> writes:

     > Hi, [Previously sent to nfs@sourceforge with no response]

     > I'm using a frankenstein kernel, 2.4.21-rc3 with some -ac bits,
     > and 2.5.69 NFS+RPC backported to it.  Like the CITI kernel (for
     > krb5), but a little more aggressive on the bits backported.
     > For the purpose of this email, I think the code I have
     > questions with is similar or even identical from
     > 2.4.21->2.5.69.  I can reproduce this problem on a RH
     > 2.4.20-9smp kernel.

     > Consider these two shells running on the same machine:

     > 	    1				    2

     > 	cd /nfs				cd /nfs
     > 	mkdir t
     > 	echo foo > t/foo
     > 	less t/foo
     > 	 [less waits for input]
     > 					rm -rf t
     > 	'v'
     > 	 [vi tries to access t/foo]

     > At this point, fs/nfs/inode.c:__nfs_refresh_inode() prints the
     > "inode number mismatch" error.  AFAICT, this is just noise, but
     > the noise is driving me crazy. :-)

Inode number mismatch points to either an obvious server error (it
is not providing unique filehandles) or corruption of the fattr struct
that was passed to nfs_refresh_inode().

Cheers,
  Trond


* nfs_refresh_inode: inode number mismatch
@ 2003-06-03 23:54 Frank Cusack
  2003-06-04 14:19 ` Trond Myklebust
  0 siblings, 1 reply; 18+ messages in thread
From: Frank Cusack @ 2003-06-03 23:54 UTC (permalink / raw)
  To: lkml, trond.myklebust

Hi,

[Previously sent to nfs@sourceforge with no response]

I'm using a frankenstein kernel, 2.4.21-rc3 with some -ac bits,
and 2.5.69 NFS+RPC backported to it.  Like the CITI kernel (for krb5),
but a little more aggressive on the bits backported.  For the purpose
of this email, I think the code I have questions with is similar or even
identical from 2.4.21->2.5.69.  I can reproduce this problem on a RH
2.4.20-9smp kernel.

Consider these two shells running on the same machine:

	    1				    2

	cd /nfs				cd /nfs
	mkdir t
	echo foo > t/foo
	less t/foo
	 [less waits for input]
					rm -rf t
	'v'
	 [vi tries to access t/foo]

At this point, fs/nfs/inode.c:__nfs_refresh_inode() prints the "inode
number mismatch" error.  AFAICT, this is just noise, but the noise is
driving me crazy. :-)

Now, if sequence 2 is run on a different machine, there is no error!
So that hints to me that the local cache just needs to be cleared,
perhaps in nfs_rmdir() or maybe in nfs_unlink()/nfs_safe_remove().
I've tried a few things, but I'm not familiar enough with the code
and am making slow progress.  I can suppress this error by testing
for 'unlinked but open' in __nfs_refresh_inode:

	if (NFS_FILEID(inode) != fattr->fileid) {
		if (inode->i_nlink)	/* quiet if inode DNE anymore */
			printk(...);
	}
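
For readers who want to poke at the logic outside the kernel, here is a
user-space sketch of the same decision. The types and names (toy_inode,
toy_fattr, toy_refresh) are stand-ins I made up, not kernel structures,
and the sketch says nothing about whether the change is safe. It only
shows that the mismatch is still detected and that just the message is
suppressed once the link count has dropped to zero.

```c
#include <stdint.h>

/* Stand-in for the cached inode: the fileid we expect, plus its link count. */
struct toy_inode {
    uint64_t fileid;
    unsigned int nlink;
};

/* Stand-in for the attributes returned by the server. */
struct toy_fattr {
    uint64_t fileid;
};

enum refresh_result {
    REFRESH_OK,               /* fileids agree */
    REFRESH_MISMATCH_LOGGED,  /* mismatch, message would be printed */
    REFRESH_MISMATCH_QUIET    /* mismatch, but inode is already unlinked */
};

/* Mirrors the nested if above: compare fileids, then gate the message on nlink. */
static enum refresh_result toy_refresh(const struct toy_inode *inode,
                                       const struct toy_fattr *fattr)
{
    if (inode->fileid == fattr->fileid)
        return REFRESH_OK;
    if (inode->nlink)
        return REFRESH_MISMATCH_LOGGED;
    return REFRESH_MISMATCH_QUIET;
}
```

Feeding it the pair from the log in this message (0x1c7d703 expected,
0xe63bc2 received) gives REFRESH_MISMATCH_QUIET when nlink is 0 and
REFRESH_MISMATCH_LOGGED otherwise.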

Do you think this is safe?  Some minimal logs:

kernel: NFS: dentry_delete(t/.nfs01c7d70600000001, 2)	| renamed file
kernel: NFS: delete_inode(e/29873926)			| unlink of renamed foo

kernel: NFS: refresh_inode(e/29873923 ct=1 info=0x6)	| accessing t/
kernel: nfs_refresh_inode: inode number mismatch
kernel: expected (0xe/0x1c7d703), got (0xe/0xe63bc2)
kernel: NFS: dentry_delete(fsstress/t, 0)
kernel: NFS: delete_inode(e/29873923)

and then access calls beginning at the root.  I apologize for the likely
uselessness of the above logs.  I can email some annotated logs if desired,
but the problem is very easy to reproduce, so I'll hold off for now.

This problem only exists for nfsv3.  This problem doesn't occur if there
is a third process also holding foo open (note that the directory does
get removed, just no kernel error when trying to access it).

The 2.2 kernel doesn't have this problem, because (apparently) it doesn't
allow you to unlink a .nfsXXX file while it's open (and therefore you
cannot remove the dir).

Which made me look around (2.5.69):  In nfs_silly_rename(), the new
dentry (sdentry) gets a d_count of 1.  Doesn't this indicate that no
one is holding this file open?  (which then tells nfs_unlink() to just
call nfs_safe_remove() rather than nfs_silly_rename())  Is that really
desirable?  Even if I set the d_count to match what the previous
dentry->d_count had, and avoid calling dput(sdentry), on the next run
through nfs_unlink() the d_count is 1 and it just goes to nfs_safe_remove().
I think that clearly, I don't understand what the d_move() is for.
(My guess is to avoid nfs_async_unlink() getting passed a dentry which
we are actually about to get rid of, but I haven't wrapped my head around
the dcache yet.)

Then I noticed that the DCACHE_NFSFS_RENAMED seems a little racy.
nfs_async_unlink() sets this and when the call completes,
nfs_complete_unlink() resets it.  So while it's being deleted, if an
rm -rf quickly picks up the .nfs name before the async unlink returns,
it won't get removed.  But if the nfs call completes first, it does
get removed.  Is the intention just to prevent removal of the .nfs
file until the old file is removed on the server?  What's the benefit
of this?

So, even with that error message quieted, fsstress reports lots of
inode mismatches.  I am in the process of trying to piece together a
simple reproducible sequence of NFS calls.

This is against a netapp server, although I can't see how the server would
matter.

Thanks for any advice, guidance, or hopefully fixes!  BTW, I'm interested
to hear what tools folks use to stress the NFS client.

/fc


* Re: nfs_refresh_inode: inode number mismatch
  2001-02-22 21:59 ` Russell King
@ 2001-02-23  9:30   ` Trond Myklebust
  0 siblings, 0 replies; 18+ messages in thread
From: Trond Myklebust @ 2001-02-23  9:30 UTC (permalink / raw)
  To: Russell King; +Cc: Scott A McConnell, linux-kernel

>>>>> " " == Russell King <rmk@arm.linux.org.uk> writes:

     > Scott A McConnell writes:
    >> I am running RedHat Linux version 2.2.16-3 on my PC and Hardhat
    >> Linux version 2.4.0-test5 on my MIPS board. Any thoughts or
    >> suggestions?
    >>
    >> I saw a discussion start on the ARM list along these lines but
    >> I never saw a solution.

     > The problem is partly caused by the NFS server indefinitely
     > caching NFS request XIDs to responses, and the NFS client not
     > having a way to generate a random initial XID.  (thus, for each
     > reboot, it starts at the same XID number).

That shouldn't be true in the latest kernels. knfsd should normally
cache requests for no longer than 2 minutes with the changes made by
Neil following your bugreport.

Cheers,
   Trond


* nfs_refresh_inode: inode number mismatch
@ 2001-02-22 22:30 Scott A McConnell
  2001-02-22 21:59 ` Russell King
  0 siblings, 1 reply; 18+ messages in thread
From: Scott A McConnell @ 2001-02-22 22:30 UTC (permalink / raw)
  To: linux-kernel

I am getting NFS errors/warnings

VFS: Mounted root (nfs filesystem).
Freeing unused kernel memory: 196k freed
nfs_refresh_inode: inode number mismatch
expected (0x806/0x6246a), got (0x806/0x62b48)
                     ^/var/run/utmp
^/var/log/wtmp                        **************
nfs_refresh_inode: inode number mismatch
expected (0x806/0x62b48), got (0x806/0x6246a)
nfs_refresh_inode: inode number mismatch
expected (0x806/0x6246a), got (0x806/0x62b48)
nfs_refresh_inode: inode number mismatch
expected (0x806/0x62b4f), got (0x806/0x6246a)

^/var/run/inetd.pid
*****************
nfs_refresh_inode: inode number mismatch
expected (0x806/0x6246a), got (0x806/0x62b48)
nfs_refresh_inode: inode number mismatch
expected (0x806/0x62b48), got (0x806/0x6246a)
nfs_refresh_inode: inode number mismatch
expected (0x806/0x62b48), got (0x806/0x6246a)
nfs_refresh_inode: inode number mismatch
expected (0x806/0x6246a), got (0x806/0x62b48)
nfs_refresh_inode: inode number mismatch
expected (0x806/0x42d60), got (0x806/0x42d5f)
nfs_refresh_inode: inode number mismatch
expected (0x806/0x6246a), got (0x806/0x62b48)
nfs_refresh_inode: inode number mismatch
expected (0x806/0x6246a), got (0x806/0x62b48)
nfs_refresh_inode: inode number mismatch
expected (0x806/0x6246a), got (0x806/0x62b48)

I am running  RedHat Linux version 2.2.16-3 on  my PC and  Hardhat Linux
version 2.4.0-test5 on my MIPS board. Any thoughts or suggestions?

I saw a discussion start on the ARM list along these lines but I never
saw a solution.

Please CC me at samcconn@cotw.com

Thanks,
Scott




* Re: nfs_refresh_inode: inode number mismatch
  2001-02-22 22:30 Scott A McConnell
@ 2001-02-22 21:59 ` Russell King
  2001-02-23  9:30   ` Trond Myklebust
  0 siblings, 1 reply; 18+ messages in thread
From: Russell King @ 2001-02-22 21:59 UTC (permalink / raw)
  To: Scott A McConnell; +Cc: linux-kernel

Scott A McConnell writes:
> I am running  RedHat Linux version 2.2.16-3 on  my PC and  Hardhat Linux
> version 2.4.0-test5 on my MIPS board. Any thoughts or suggestions?
> 
> I saw a discussion start on the ARM list along these lines but I never
> saw a solution.

The problem is partly caused by the NFS server indefinitely caching NFS
request XIDs to responses, and the NFS client not having a way to generate
a random initial XID.  (thus, for each reboot, it starts at the same XID
number).

Upgrade your NFS server to kernel 2.2.18, and don't reboot more than once
in a 2 minute window.
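
To make the mechanism concrete, here is a deliberately simplified model
of a reply cache keyed on XID. It is a sketch, not knfsd code: the names
(toy_cache, toy_serve) are invented, the cache is keyed on the XID alone
(an assumption for brevity; a real server would look at more than that),
and the reply is reduced to a single fileid. A ttl of 0 models the "cache
forever" behaviour, and 120 models the roughly 2 minute timeout.

```c
#include <stdint.h>

#define CACHE_SLOTS 8

struct toy_entry {
    int used;
    uint32_t xid;
    long stamp;             /* time the reply was cached, in seconds */
    uint64_t reply_fileid;  /* the cached reply, reduced to a fileid */
};

struct toy_cache {
    struct toy_entry e[CACHE_SLOTS];
    long ttl;               /* seconds; <= 0 means entries never expire */
};

/* Return 1 and fill *out if a live entry for this XID exists. */
static int toy_cache_lookup(const struct toy_cache *c, uint32_t xid,
                            long now, uint64_t *out)
{
    const struct toy_entry *e = &c->e[xid % CACHE_SLOTS];

    if (!e->used || e->xid != xid)
        return 0;
    if (c->ttl > 0 && now - e->stamp >= c->ttl)
        return 0;           /* entry has timed out */
    *out = e->reply_fileid;
    return 1;
}

/* Remember the reply for this XID, overwriting whatever shared its slot. */
static void toy_cache_insert(struct toy_cache *c, uint32_t xid,
                             long now, uint64_t fileid)
{
    struct toy_entry *e = &c->e[xid % CACHE_SLOTS];

    e->used = 1;
    e->xid = xid;
    e->stamp = now;
    e->reply_fileid = fileid;
}

/*
 * Serve one request: replay the cached reply if the XID is known,
 * otherwise answer with the real fileid and cache that answer.
 */
static uint64_t toy_serve(struct toy_cache *c, uint32_t xid, long now,
                          uint64_t real_fileid)
{
    uint64_t cached;

    if (toy_cache_lookup(c, xid, now, &cached))
        return cached;
    toy_cache_insert(c, xid, now, real_fileid);
    return real_fileid;
}
```

If a client that always starts at XID 1 reboots and asks about a
different file, the never-expiring cache replays the old fileid no
matter how long the client waits. The timed cache does the same only
while the entry is younger than 120 seconds, which is where the
"don't reboot more than once in 2 minutes" advice comes from.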

--
Russell King (rmk@arm.linux.org.uk)                The developer of ARM Linux
             http://www.arm.linux.org.uk/personal/aboutme.html



* Re: nfs_refresh_inode: inode number mismatch
  2001-02-08  8:08   ` Russell King
@ 2001-02-09  0:02     ` Jun Sun
  0 siblings, 0 replies; 18+ messages in thread
From: Jun Sun @ 2001-02-09  0:02 UTC (permalink / raw)
  To: Russell King; +Cc: Neil Brown, linux-kernel

Russell King wrote:
> 
> Neil Brown writes:
> > On Wednesday February 7, jsun@mvista.com wrote:
> > > This is a weird problem that I am looking at right now.  It seems to indicate a
> > > bug in the nfs server.
> > >
> > > I have a MIPS machine that boots from a NFS root fs hosted on a redhat 6.2
> > > workstation.  Everything works fine except that after a few reboots I start to
> > > see the error messages like the following:
> >
> > What version of Linux?  If it is less than 2.2.18, then an upgrade
> > will help you a lot.
> >
> > If it is >= 2.2.18, I will look some more.
> 
> Note that you need to upgrade the server, not the client.  Also, make sure
> you don't reboot the client more than once in a 2 minute time window.

My server was 2.2.14.  I upgraded it to 2.2.18.  It appears that the problem
is gone, although it will probably take a while to be sure.

I do find the "no more than once in 2 minutes" requirement amusing ... :-)

Jun
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/


* Re: nfs_refresh_inode: inode number mismatch
  2001-02-08  1:22 ` Neil Brown
@ 2001-02-08  8:08   ` Russell King
  2001-02-09  0:02     ` Jun Sun
  0 siblings, 1 reply; 18+ messages in thread
From: Russell King @ 2001-02-08  8:08 UTC (permalink / raw)
  To: Neil Brown; +Cc: Jun Sun, linux-kernel

Neil Brown writes:
> On Wednesday February 7, jsun@mvista.com wrote:
> > This is a weird problem that I am looking at right now.  It seems to indicate a
> > bug in the nfs server.
> > 
> > I have a MIPS machine that boots from a NFS root fs hosted on a redhat 6.2
> > workstation.  Everything works fine except that after a few reboots I start to
> > see the error messages like the following:
> 
> What version of Linux?  If it is less than 2.2.18, then an upgrade
> will help you a lot.
> 
> If it is >= 2.2.18, I will look some more.

Note that you need to upgrade the server, not the client.  Also, make sure
you don't reboot the client more than once in a 2 minute time window.
--
Russell King (rmk@arm.linux.org.uk)                The developer of ARM Linux
             http://www.arm.linux.org.uk/personal/aboutme.html


* Re: nfs_refresh_inode: inode number mismatch
  2001-02-08  1:13 Jun Sun
@ 2001-02-08  1:22 ` Neil Brown
  2001-02-08  8:08   ` Russell King
  0 siblings, 1 reply; 18+ messages in thread
From: Neil Brown @ 2001-02-08  1:22 UTC (permalink / raw)
  To: Jun Sun; +Cc: linux-kernel

On Wednesday February 7, jsun@mvista.com wrote:
> 
> This is a weird problem that I am looking at right now.  It seems to indicate a
> bug in the nfs server.
> 
> I have a MIPS machine that boots from a NFS root fs hosted on a redhat 6.2
> workstation.  Everything works fine except that after a few reboots I start to
> see the error messages like the following:

What version of Linux?  If it is less than 2.2.18, then an upgrade
will help you a lot.

If it is >= 2.2.18, I will look some more.

NeilBrown

* nfs_refresh_inode: inode number mismatch
@ 2001-02-08  1:13 Jun Sun
  2001-02-08  1:22 ` Neil Brown
  0 siblings, 1 reply; 18+ messages in thread
From: Jun Sun @ 2001-02-08  1:13 UTC (permalink / raw)
  To: linux-kernel


This is a weird problem that I am looking at right now.  It seems to indicate a
bug in the nfs server.

I have a MIPS machine that boots from a NFS root fs hosted on a redhat 6.2
workstation.  Everything works fine except that after a few reboots I start to
see the error messages like the following:

Freeing unused kernel memory: 24k freed
INIT: version 2.77 booting
nfs_refresh_inode: inode number mismatch
expected (0x308/0x28b3d2), got (0x308/0x12b91b)
INIT: Entering runlevel: 3
sh-2.03# 

Restarting the nfs server on the host does not get rid of the messages. 
Things will get better if I reboot the host.

I traced the network packets, and the server is clearly returning the wrong
fileid in the "write reply" message.  Below is a segment of the extracted
packet trace: the nfs server returns a different fileid for the same handle
it returned earlier to the client.  The confusing part is that the server
answers the first write request, and a couple of other requests, correctly,
but then returns a wrong fileid for the second write.

In my particular setup, it seems only certain files (inodes) tend to get
screwed up.

Does anybody have an idea as to what is wrong here?

Please cc your reply to my email address.  TIA.


Jun

------------------
round 3:

case 1:

2177 lookup:
        ioctl.save

2178 lookup reply:
        fileid: 2667474
        handle:
cabaebfed2b32800e6ab2800080300000803000054c21100b2302b0c00000000

2181 write:
        offset:0
        total count: 60
        handle:
cabaebfed2b32800e6ab2800080300000803000054c21100b2302b0c00000000

2182 write reply:
        fileid: 2667474
        size: 60

2183 setattr:
        handle:
cabaebfed2b32800e6ab2800080300000803000054c21100b2302b0c00000000

2184 setattr reply:
        fileid: 2667474

2185 write:
        handle:
cabaebfed2b32800e6ab2800080300000803000054c21100b2302b0c00000000

2186 write reply:
        fileid: 1227035
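
The fileids in this trace are the same numbers as in the boot log
above, just in decimal: 0x28b3d2 is 2667474 and 0x12b91b is 1227035.
A small helper for pulling the numbers out of such a log line might
look like the sketch below. The struct and function names are my own,
and calling the first field "fs" is an assumption; the thread only
confirms that the second field of each pair is the fileid.

```c
#include <stdio.h>
#include <string.h>

/* The two (first field / fileid) pairs from one "expected ..., got ..." line. */
struct mismatch {
    unsigned long long exp_fs, exp_fileid;
    unsigned long long got_fs, got_fileid;
};

/*
 * Parse a line such as
 *   "expected (0x308/0x28b3d2), got (0x308/0x12b91b)"
 * optionally preceded by a syslog prefix. Returns 1 on success, 0 otherwise.
 */
static int parse_mismatch(const char *line, struct mismatch *m)
{
    const char *p = strstr(line, "expected (");

    if (p == NULL)
        return 0;
    /* The literal "0x" in the format consumes the prefix; %llx reads the digits. */
    return sscanf(p, "expected (0x%llx/0x%llx), got (0x%llx/0x%llx)",
                  &m->exp_fs, &m->exp_fileid,
                  &m->got_fs, &m->got_fileid) == 4;
}
```

Printing exp_fileid and got_fileid with %llu gives values that can be
compared directly against the "fileid:" fields of a packet trace.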