* nfs4_reclaim_open_state: Lock reclaim failed!
@ 2018-08-29  9:09 Harald Dunkel
  2018-08-29  9:13 ` Harald Dunkel
  2018-08-31 15:41 ` Olga Kornievskaia
  0 siblings, 2 replies; 10+ messages in thread
From: Harald Dunkel @ 2018-08-29  9:09 UTC (permalink / raw)
  To: linux-nfs

Hi folks,

dmesg -T shows me a large list of messages

:
[Wed Aug 29 11:13:02 2018] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
[Wed Aug 29 11:13:02 2018] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
[Wed Aug 29 11:13:02 2018] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
[Wed Aug 29 11:13:03 2018] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
[Wed Aug 29 11:13:03 2018] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
[Wed Aug 29 11:13:04 2018] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
[Wed Aug 29 11:13:05 2018] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
:

with a modification date approx. 9 minutes in the future. NFS server and
client have the correct time and are in sync, afaict. How come?


Kernel is 4.9.88-1+deb9u1 on the client, 4.16.5-1~bpo9+1 on the NFS server.


Every helpful comment is highly appreciated.
Harri

* Re: nfs4_reclaim_open_state: Lock reclaim failed!
  2018-08-29  9:09 nfs4_reclaim_open_state: Lock reclaim failed! Harald Dunkel
@ 2018-08-29  9:13 ` Harald Dunkel
  2018-08-31 11:49   ` Jeff Layton
  2018-08-31 15:41 ` Olga Kornievskaia
  1 sibling, 1 reply; 10+ messages in thread
From: Harald Dunkel @ 2018-08-29  9:13 UTC (permalink / raw)
  To: linux-nfs

PS:

On 8/29/18 11:09 AM, Harald Dunkel wrote:
> Hi folks,
> 
> dmesg -T shows me a large list of messages
> 
> :
> [Wed Aug 29 11:13:02 2018] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> [Wed Aug 29 11:13:02 2018] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> [Wed Aug 29 11:13:02 2018] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> [Wed Aug 29 11:13:03 2018] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> [Wed Aug 29 11:13:03 2018] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> [Wed Aug 29 11:13:04 2018] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> [Wed Aug 29 11:13:05 2018] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> :
> 
> with a modification date appr 9 minutes in the future. NFS server and

It's a "timestamp", of course. Sorry.

> client have the correct time and are in sync, afaict. How comes?
> 
> 
> Kernel is 4.9.88-1+deb9u1 on the client, 4.16.5-1~bpo9+1 on the NFS server.
> 
> 
> Every helpful comment is highly appreciated.
> Harri
> 

* Re: nfs4_reclaim_open_state: Lock reclaim failed!
  2018-08-29  9:13 ` Harald Dunkel
@ 2018-08-31 11:49   ` Jeff Layton
  2018-09-03  8:34     ` Harald Dunkel
  0 siblings, 1 reply; 10+ messages in thread
From: Jeff Layton @ 2018-08-31 11:49 UTC (permalink / raw)
  To: Harald Dunkel, linux-nfs

On Wed, 2018-08-29 at 11:13 +0200, Harald Dunkel wrote:
> PS:
> 
> On 8/29/18 11:09 AM, Harald Dunkel wrote:
> > Hi folks,
> > 
> > dmesg -T shows me a large list of messages
> > 
> > :
> > [Wed Aug 29 11:13:02 2018] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> > [Wed Aug 29 11:13:02 2018] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> > [Wed Aug 29 11:13:02 2018] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> > [Wed Aug 29 11:13:03 2018] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> > [Wed Aug 29 11:13:03 2018] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> > [Wed Aug 29 11:13:04 2018] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> > [Wed Aug 29 11:13:05 2018] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> > :
> > 
> > with a modification date appr 9 minutes in the future. NFS server and
> 
> Its a "timestamp", of course. Sorry.
> 
> > client have the correct time and are in sync, afaict. How comes?
> > 
> > 
> > Kernel is 4.9.88-1+deb9u1 on the client, 4.16.5-1~bpo9+1 on the NFS server.
> > 
> > 
> > Every helpful comment is highly appreciated.
> > Harri
> > 
> 
> 

Hi Harald,

Usually this means that the client and server have gotten out of sync
(possibly due to a server reboot): the client tried to reclaim the
state it held before, but that reclaim failed.

Determining why that happened is difficult from the info you have
here. Is your server being restarted regularly? What version of NFS are
you using to mount?

v4.9 is pretty old at this point as well, so you may want to try a newer
kernel on the client and see if it behaves better.
-- 
Jeff Layton <jlayton@kernel.org>

* Re: nfs4_reclaim_open_state: Lock reclaim failed!
  2018-08-29  9:09 nfs4_reclaim_open_state: Lock reclaim failed! Harald Dunkel
  2018-08-29  9:13 ` Harald Dunkel
@ 2018-08-31 15:41 ` Olga Kornievskaia
  2018-09-03  7:48   ` Harald Dunkel
  1 sibling, 1 reply; 10+ messages in thread
From: Olga Kornievskaia @ 2018-08-31 15:41 UTC (permalink / raw)
  To: harald.dunkel; +Cc: linux-nfs

On Wed, Aug 29, 2018 at 5:16 AM Harald Dunkel <harald.dunkel@aixigo.de> wrote:
>
> Hi folks,
>
> dmesg -T shows me a large list of messages
>
> :
> [Wed Aug 29 11:13:02 2018] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> [Wed Aug 29 11:13:02 2018] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> [Wed Aug 29 11:13:02 2018] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> [Wed Aug 29 11:13:03 2018] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> [Wed Aug 29 11:13:03 2018] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> [Wed Aug 29 11:13:04 2018] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> [Wed Aug 29 11:13:05 2018] NFS: nfs4_reclaim_open_state: Lock reclaim failed!
> :
>
> with a modification date appr 9 minutes in the future. NFS server and
> client have the correct time and are in sync, afaict. How comes?

Is your question about why the timestamp is wrong, or are you
asking about the errors being logged? Jeff provided some info about
the latter. However, if you are asking about the timestamps,
then my guess would be that your hardware clock and your system clock
might be out of sync. Compare the output of "sudo hwclock --show" with
what "date" shows. Also check the dmesg manual page: for the "-T"
option there is a warning saying that the timestamps might not be
accurate.
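
To make the "dmesg -T" caveat concrete, here is a minimal Python sketch of
the arithmetic as I understand it: the kernel stamps each message with
seconds since boot, and the human-readable time is obtained by adding that
stamp to an estimated boot time (current time minus uptime). If the clock
behind the log stamps has drifted from the clock behind the uptime, the
result is skewed. The function name and the sample numbers are invented for
illustration and are not taken from dmesg itself.

```python
from datetime import datetime, timedelta


def dmesg_wall_time(now, uptime_s, log_ts_s):
    """Approximate how a human-readable kernel log timestamp is derived.

    now:      current wall-clock time
    uptime_s: seconds since boot as reported by the system
    log_ts_s: the raw seconds-since-boot stamp on the kernel message
    """
    boot_time = now - timedelta(seconds=uptime_s)
    return boot_time + timedelta(seconds=log_ts_s)


now = datetime(2018, 8, 29, 11, 4, 0)
uptime_s = 1_000_000.0

# Clocks agree: a message logged "now" is shown with the current time.
in_sync = dmesg_wall_time(now, uptime_s, log_ts_s=1_000_000.0)

# Hypothetical drift: the log clock is 540 s ahead of the uptime clock,
# so a message logged "now" appears 9 minutes in the future.
drifted = dmesg_wall_time(now, uptime_s, log_ts_s=1_000_540.0)

print(in_sync)
print(drifted)
```

In this model a skewed timestamp says nothing about NTP or the hardware
clock being wrong at the moment the message is read.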

>
>
> Kernel is 4.9.88-1+deb9u1 on the client, 4.16.5-1~bpo9+1 on the NFS server.
>
>
> Every helpful comment is highly appreciated.
> Harri

* Re: nfs4_reclaim_open_state: Lock reclaim failed!
  2018-08-31 15:41 ` Olga Kornievskaia
@ 2018-09-03  7:48   ` Harald Dunkel
  2018-09-03 13:15     ` Olga Kornievskaia
  0 siblings, 1 reply; 10+ messages in thread
From: Harald Dunkel @ 2018-09-03  7:48 UTC (permalink / raw)
  To: Olga Kornievskaia; +Cc: linux-nfs

Hi Olga,

On 8/31/18 5:41 PM, Olga Kornievskaia wrote:
> 
> Is your question about how come the timestamp is wrong or are you
> asking about the errors being logged. Jeff provided some info about
> the latter piece. However, if you are asking about the timestamps,
> then my guess would be that your hardware clock and your system clock
> might be output sync. Check the "sudo hwclock --show" to what your
> "date" shows. Also check the manual page that say for this option
> "dmesg -T" there is a warming saying that the timestamps might not be
> accurate.
> 

date and hwclock show the same time (now, after a reboot). I am not sure
why NFS should use the hwclock time for the logfile entries.


Regards
Harri

* Re: nfs4_reclaim_open_state: Lock reclaim failed!
  2018-08-31 11:49   ` Jeff Layton
@ 2018-09-03  8:34     ` Harald Dunkel
  2018-09-03  9:32       ` Jeff Layton
  0 siblings, 1 reply; 10+ messages in thread
From: Harald Dunkel @ 2018-09-03  8:34 UTC (permalink / raw)
  To: Jeff Layton, linux-nfs

Hi Jeff,

On 8/31/18 1:49 PM, Jeff Layton wrote:
> 
> Hi Harald,
> 
> Usually this means that the client and server have gotten out of sync
> (possibly due to a server reboot), the client has tried to reclaim the
> state it held before but that reclaim failed.
> 

Is this supposed to happen on a server reboot? BTW, all Linux
clients are run with a kernel command line like

	nfs.nfs4_unique_id=6dcc70d4-7481-45b8-a3af-4fef4ea175d0

Each client has its own uuid, of course, hardwired at install time
in the grub configuration.

> Determining why that happened is is difficult from the info you have
> here. Is your server being restarted regularly? What version of NFS are
> you using to mount?
> 

No, usually we have uptimes of several months for the NFS servers.
It's NFS4 (4.2):

# grep -i nfs /proc/mounts
nfsd /proc/fs/nfsd nfsd rw,relatime 0 0
nfs-data:/space/data /data nfs4 rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=172.19.96.122,local_lock=none,addr=172.19.96.205 0 0
nfs-data:/space/home /home nfs4 rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=172.19.96.122,local_lock=none,addr=172.19.96.205 0 0
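
For surveying many clients, the negotiated version can be pulled out of
/proc/mounts programmatically. The following is only a sketch: the function
name is made up, and the sample text is copied from the output above.

```python
def nfs_mount_options(mounts_text):
    """Return {mount_point: options_dict} for nfs/nfs4 entries."""
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, fs type, options, dump, pass
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        options = {}
        for item in fields[3].split(","):
            # Flags such as "rw" or "hard" have no "=value" part.
            key, _, value = item.partition("=")
            options[key] = value
        result[fields[1]] = options
    return result


sample = (
    "nfsd /proc/fs/nfsd nfsd rw,relatime 0 0\n"
    "nfs-data:/space/data /data nfs4 rw,relatime,vers=4.2,rsize=1048576,"
    "wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,"
    "sec=sys,clientaddr=172.19.96.122,local_lock=none,addr=172.19.96.205 0 0\n"
)

for mount_point, opts in nfs_mount_options(sample).items():
    print(mount_point, opts["vers"], opts["proto"])
```

On a live client the input would be the contents of /proc/mounts instead of
the sample string.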

> v4.9 is pretty old at this point as well, you may want to try a newer
> kernel on the client and see if it behaves better.
> 

I am bound to the versions included in Debian 9. Currently it is
kernel 4.9.110-3+deb9u4 on both client and server. Not to mention
that we are also running hosts with Solaris 10 and 11, AIX 6.1 and
7.1, RedHat EL 5 to 7. NFS has to be rock-solid for our needs. It's
difficult to move to a newer kernel for some trial and error.

Would you recommend sticking with NFS 4(.0) or NFS 3, avoiding the
new code in NFS 4.{1,2}? Which NFS version in 4.9 or another LTS
kernel is best suited for production use?


Regards
Harri

* Re: nfs4_reclaim_open_state: Lock reclaim failed!
  2018-09-03  8:34     ` Harald Dunkel
@ 2018-09-03  9:32       ` Jeff Layton
  2018-09-04  7:31         ` Harald Dunkel
  0 siblings, 1 reply; 10+ messages in thread
From: Jeff Layton @ 2018-09-03  9:32 UTC (permalink / raw)
  To: Harald Dunkel, linux-nfs

On Mon, 2018-09-03 at 10:34 +0200, Harald Dunkel wrote:
> Hi Jeff,
> 
> On 8/31/18 1:49 PM, Jeff Layton wrote:
> > 
> > Hi Harald,
> > 
> > Usually this means that the client and server have gotten out of sync
> > (possibly due to a server reboot), the client has tried to reclaim the
> > state it held before but that reclaim failed.
> > 
> 
> Is this supposed to happen on a server reboot? BTW, all Linux
> clients are run with a kernel command line like
> 
> 	nfs.nfs4_unique_id=6dcc70d4-7481-45b8-a3af-4fef4ea175d0
> 
> Each client has its own uuid, of course, hardwired at install time
> in the grub configuration.
> 

Yes, typically a server reboot will cause the client to reclaim its
state. If the server isn't restarting, then you probably have a situation
where the client and server have gotten out of sync in some fashion, and
the client is realizing it and attempting to reclaim its state.

One thing that could (potentially) cause this is a nfs4_unique_id
collision. You might want to survey your clients and ensure that there
aren't any.
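
One possible way to run such a survey, sketched in Python. It assumes the
value has already been collected from each client (for example by reading
/sys/module/nfs/parameters/nfs4_unique_id on every host); the helper names
and the hostA/hostB/hostC inventory are hypothetical.

```python
from collections import defaultdict

UNIQUE_ID_PATH = "/sys/module/nfs/parameters/nfs4_unique_id"


def read_local_unique_id(path=UNIQUE_ID_PATH):
    """Read this host's nfs4_unique_id; return "" if unset or unreadable."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return ""


def find_collisions(ids_by_host):
    """Return {unique_id: [hosts]} for every id shared by two or more hosts.

    Hosts with an empty id are skipped here, since an empty value means
    that no id was configured on that host.
    """
    hosts_by_id = defaultdict(list)
    for host, unique_id in ids_by_host.items():
        if unique_id:
            hosts_by_id[unique_id].append(host)
    return {uid: sorted(hosts)
            for uid, hosts in hosts_by_id.items() if len(hosts) > 1}


# Hypothetical inventory gathered from the clients.
inventory = {
    "hostA": "6dcc70d4-7481-45b8-a3af-4fef4ea175d0",
    "hostB": "6dcc70d4-7481-45b8-a3af-4fef4ea175d0",  # cloned install
    "hostC": "0f0e0d0c-0000-4000-8000-000000000001",
    "il06": "",                                       # nothing configured
}

for uid, hosts in find_collisions(inventory).items():
    print(f"{uid} is shared by: {', '.join(hosts)}")
```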

> > Determining why that happened is is difficult from the info you have
> > here. Is your server being restarted regularly? What version of NFS are
> > you using to mount?
> > 
> 
> No, usually we have uptimes of several months for the NFServers.
> Its NFS4 (4.2):
> 
> # grep -i nfs /proc/mounts
> nfsd /proc/fs/nfsd nfsd rw,relatime 0 0
> nfs-data:/space/data /data nfs4 rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=172.19.96.122,local_lock=none,addr=172.19.96.205 0 0
> nfs-data:/space/home /home nfs4 rw,relatime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=172.19.96.122,local_lock=none,addr=172.19.96.205 0 0
> 
> > v4.9 is pretty old at this point as well, you may want to try a newer
> > kernel on the client and see if it behaves better.
> > 
> 
> I am bound to the versions included in Debian 9. Currently it is
> kernel 4.9.110-3+deb9u4 on both client and server. Not to mention
> that we are also running hosts with Solaris 10 and 11, AIX 6.1 and
> 7.1, RedHat EL 5 to 7. NFS has to be rock-solid for our needs. Its
> difficult to move to a newer kernel for some trial and error.
> 

Pity -- a newer client would help rule out bugs that have already
been fixed but whose fixes weren't backported to stable.

> Would you recommend to stick with NFS 4(.0) or NFS 3, avoiding the
> new code in NFS 4.{1,2}? Which NFS version in 4.9 or another LTS
> kernel suits best for production use?
> 

v4.1+ are fine (in general) for production, but there are always bugs.

I probably wouldn't make any changes until you have a clearer idea of
why your clients are going into reclaim. One idea might be to sniff NFS
traffic and see if you can suss out what's triggering that series of
events.

-- 
Jeff Layton <jlayton@kernel.org>

* Re: nfs4_reclaim_open_state: Lock reclaim failed!
  2018-09-03  7:48   ` Harald Dunkel
@ 2018-09-03 13:15     ` Olga Kornievskaia
  0 siblings, 0 replies; 10+ messages in thread
From: Olga Kornievskaia @ 2018-09-03 13:15 UTC (permalink / raw)
  To: harald.dunkel; +Cc: linux-nfs

On Mon, Sep 3, 2018 at 3:49 AM Harald Dunkel <harald.dunkel@aixigo.de> wrote:
>
> Hi Olga,
>
> On 8/31/18 5:41 PM, Olga Kornievskaia wrote:
> >
> > Is your question about how come the timestamp is wrong or are you
> > asking about the errors being logged. Jeff provided some info about
> > the latter piece. However, if you are asking about the timestamps,
> > then my guess would be that your hardware clock and your system clock
> > might be output sync. Check the "sudo hwclock --show" to what your
> > "date" shows. Also check the manual page that say for this option
> > "dmesg -T" there is a warming saying that the timestamps might not be
> > accurate.
> >
>
> date and hwclock show the same time (now, after a reboot).

Were they the same before that?

>  I am not sure why NFS should use the hwclock time for the logfile entries.

Timestamps have nothing to do with NFS.

* Re: nfs4_reclaim_open_state: Lock reclaim failed!
  2018-09-03  9:32       ` Jeff Layton
@ 2018-09-04  7:31         ` Harald Dunkel
  2018-09-04 12:11           ` Jeff Layton
  0 siblings, 1 reply; 10+ messages in thread
From: Harald Dunkel @ 2018-09-04  7:31 UTC (permalink / raw)
  To: Jeff Layton, linux-nfs

Hi Jeff,

On 9/3/18 11:32 AM, Jeff Layton wrote:
> 
> Yes, typically a server reboot will cause the client to reclaim its
> state. If the server isn't restarting then you probably have a situation
> where the client and server have gotten out of sync in some fashion, the
> client is realizing it and attempting to reclaim its state.
> 
> One thing that could (potentially) cause this is a nfs4_unique_id
> collision. You might want to survey your clients and ensure that there
> aren't any.
> 

/sys/module/nfs/parameters/nfs4_unique_id tells me that the default
is an empty string. That's hard to believe. I had expected the default
to be derived from the MAC address of eth0 or something like that.

All explicitly defined nfs4_unique_id are unique, I checked (on the
Linux hosts). il06 (the NFS client here) and 4 other ancient servers
*were* running with the default "unique" id. My fault.

>> Would you recommend to stick with NFS 4(.0) or NFS 3, avoiding the
>> new code in NFS 4.{1,2}? Which NFS version in 4.9 or another LTS
>> kernel suits best for production use?
>>
> 
> v4.1+ are fine (in general) for production, but there are always bugs.
> 

How do NFS version numbers on client and Linux server affect each
other? AIX 7.1 (just as an example) supports just "nfs" and "nfs4",
not "nfs4.1" or "nfs4.2". Will the AIX clients benefit from the bug
fixes included in Linux's NFS 4.1+?


Regards
Harri

* Re: nfs4_reclaim_open_state: Lock reclaim failed!
  2018-09-04  7:31         ` Harald Dunkel
@ 2018-09-04 12:11           ` Jeff Layton
  0 siblings, 0 replies; 10+ messages in thread
From: Jeff Layton @ 2018-09-04 12:11 UTC (permalink / raw)
  To: Harald Dunkel, linux-nfs

On Tue, 2018-09-04 at 09:31 +0200, Harald Dunkel wrote:
> Hi Jeff,
> 
> On 9/3/18 11:32 AM, Jeff Layton wrote:
> > 
> > Yes, typically a server reboot will cause the client to reclaim its
> > state. If the server isn't restarting then you probably have a situation
> > where the client and server have gotten out of sync in some fashion, the
> > client is realizing it and attempting to reclaim its state.
> > 
> > One thing that could (potentially) cause this is a nfs4_unique_id
> > collision. You might want to survey your clients and ensure that there
> > aren't any.
> > 
> 
> /sys/module/nfs/parameters/nfs4_unique_id tells me that the default
> is an empty string. Thats hard to believe. I had expected the default
> is derived from the mac address of eth0 or something like this. ???
> 
> All explicitly defined nfs4_unique_id are unique, I checked (on the
> Linux hosts). il06 (the NFS client here) and 4 other ancient servers
> *were* running with the default "unique" id. My fault.
> 

In general, the long-form clientid needs to be unique for each client.

nfs4_unique_id is just a uniquifier for the long-form client ID string.
If it's blank then it'll just use the current nodename (hostname)
without one. If you have uniquifier collisions among hosts with
different hostnames, then you'll still get different strings. See
nfs4_init_uniform_client_string() in the kernel sources for details.
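
A rough Python model of that behaviour, to show which combinations collide.
The exact string layout used below ("Linux NFSv<major>.<minor> " followed by
"<uniquifier>/<nodename>" or just "<nodename>") is an assumption on my part
and may differ between kernel versions; the kernel function named above is
the authority. The hostname il07 is hypothetical.

```python
def uniform_client_string(nodename, uniquifier="", major=4, minor=2):
    """Model of the long-form client ID string (layout is an assumption)."""
    prefix = f"Linux NFSv{major}.{minor} "
    if uniquifier:
        # A uniquifier is set: it is combined with the nodename.
        return prefix + f"{uniquifier}/{nodename}"
    # Blank uniquifier: only the nodename distinguishes clients.
    return prefix + nodename


same_uuid = "6dcc70d4-7481-45b8-a3af-4fef4ea175d0"

# Same uniquifier, different hostnames: the strings still differ.
a = uniform_client_string("il06", same_uuid)
b = uniform_client_string("il07", same_uuid)
print(a != b)

# Blank uniquifier, identical hostnames: the strings collide.
c = uniform_client_string("il06")
d = uniform_client_string("il06")
print(c == d)
```

Under this model, hosts without a uniquifier only clash if they also share
a hostname.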

> > > Would you recommend to stick with NFS 4(.0) or NFS 3, avoiding the
> > > new code in NFS 4.{1,2}? Which NFS version in 4.9 or another LTS
> > > kernel suits best for production use?
> > > 
> > 
> > v4.1+ are fine (in general) for production, but there are always bugs.
> > 
> 
> How do NFS version numbers on client and Linux server affect each
> other? AIX 7.1 (just as an example) supports just "nfs" and "nfs4",
> not "nfs4.1" or "nfs4.2". Will the AIX clients benefit from the bug
> fixes included in Linux' nfs 4.1+?
> 

I'm not sure -- it really depends on whether AIX supports v4.1+.

In general, the server will "advertise" what versions it supports and
the Linux client will negotiate to the highest supported version. I'm
not sure what other clients will do.
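
As a simplified illustration of "highest supported version", the outcome can
be modeled as the maximum of the versions both sides support. This is only a
sketch of the end result, not of the actual mount-time exchange, and the
version list for the AIX-like client is an assumption based on the "nfs" and
"nfs4" types mentioned above.

```python
def negotiate_version(client_versions, server_versions):
    """Return the highest (major, minor) both sides support, or None."""
    common = set(client_versions) & set(server_versions)
    return max(common) if common else None


server = [(3, 0), (4, 0), (4, 1), (4, 2)]
linux_client = [(3, 0), (4, 0), (4, 1), (4, 2)]
aix_like_client = [(3, 0), (4, 0)]  # assumed: no 4.1 or 4.2 support

print(negotiate_version(linux_client, server))
print(negotiate_version(aix_like_client, server))
```

In this model a client that lacks v4.1+ ends up on v4.0 regardless of what
the server offers.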

-- 
Jeff Layton <jlayton@kernel.org>
