* 4.6, 4.7 slow nfs export with more than one client.
@ 2016-09-05  4:55 Oleg Drokin
  2016-09-06 14:30 ` Jeff Layton
  0 siblings, 1 reply; 10+ messages in thread
From: Oleg Drokin @ 2016-09-05  4:55 UTC (permalink / raw)
  To: linux-nfs; +Cc: Jeff Layton

Hello!

   I have a somewhat mysterious problem with my nfs test rig that I suspect is something
   stupid I am missing, but I cannot figure it out and would appreciate any help.

   NFS server is Fedora23 with 4.6.7-200.fc23.x86_64 as the kernel.
   Clients are a bunch of 4.8-rc5 nodes, nfsroot.
   If I only start one of them, all is fine; if I start all 9 or 10, then suddenly all
   operations grind to a halt (nfs-wise). On the NFS server side there's very little load.

   I hit this (or something similar) back in June, when testing 4.6-rcs (and the server
   was running 4.4.something I believe), and back then after some mucking around
   I set:
net.core.rmem_default=268435456
net.core.wmem_default=268435456
net.core.rmem_max=268435456
net.core.wmem_max=268435456

   and while I have no idea why, that helped, so I stopped looking into it completely.

   Fast forward to now: I am back at the same problem, and the workaround above
   no longer helps.

   I also have a bunch of "NFSD: client 192.168.10.191 testing state ID with incorrect client ID"
   messages in my logs (I also had these in June; I tried disabling NFS 4.2 and 4.1 and
   that did not help).

   So anyway, I discovered nfsdcltrack and such, and I noticed that whenever
   the kernel calls it, it is always with the same hex id of
   4c696e7578204e465376342e32206c6f63616c686f7374
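
   (Decoding that hex id, e.g. with xxd if it happens to be installed, gives the same
   client name string that shows up in the sqlite output below):

$ echo 4c696e7578204e465376342e32206c6f63616c686f7374 | xxd -r -p
Linux NFSv4.2 localhost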

   Naturally, if I try to list the contents of the sqlite file, I get:
sqlite> select * from clients;
Linux NFSv4.2 localhost|1473049735|1
sqlite> select * from clients;
Linux NFSv4.2 localhost|1473049736|1
sqlite> select * from clients;
Linux NFSv4.2 localhost|1473049737|1
sqlite> select * from clients;
Linux NFSv4.2 localhost|1473049751|1
sqlite> select * from clients;
Linux NFSv4.2 localhost|1473049752|1
sqlite> 

   (the number keeps changing), so it looks like client id detection broke somehow?

   These same clients (and a bunch more) also mount another nfs server (for crashdump
   purposes) that is centos7-based; there everything is detected correctly
   and performance is OK. The select shows:
sqlite> select * from clients;
Linux NFSv4.0 192.168.10.219/192.168.10.1 tcp|1472868376|0
Linux NFSv4.0 192.168.10.218/192.168.10.1 tcp|1472868376|0
Linux NFSv4.0 192.168.10.210/192.168.10.1 tcp|1472868384|0
Linux NFSv4.0 192.168.10.221/192.168.10.1 tcp|1472868387|0
Linux NFSv4.0 192.168.10.220/192.168.10.1 tcp|1472868388|0
Linux NFSv4.0 192.168.10.211/192.168.10.1 tcp|1472868389|0
Linux NFSv4.0 192.168.10.222/192.168.10.1 tcp|1473035496|0
Linux NFSv4.0 192.168.10.217/192.168.10.1 tcp|1473035500|0
Linux NFSv4.0 192.168.10.216/192.168.10.1 tcp|1473035501|0
Linux NFSv4.0 192.168.10.224/192.168.10.1 tcp|1473035520|0
Linux NFSv4.0 192.168.10.226/192.168.10.1 tcp|1473045789|0
Linux NFSv4.0 192.168.10.227/192.168.10.1 tcp|1473045789|0
Linux NFSv4.1 fedora1.localnet|1473046045|1
Linux NFSv4.1 fedora-1-3.localnet|1473046139|1
Linux NFSv4.1 fedora-2-4.localnet|1473046229|1
Linux NFSv4.1 fedora-1-1.localnet|1473046244|1
Linux NFSv4.1 fedora-1-4.localnet|1473046251|1
Linux NFSv4.1 fedora-2-1.localnet|1473046342|1
Linux NFSv4.1 fedora-1-2.localnet|1473046498|1
Linux NFSv4.1 fedora-2-3.localnet|1473046524|1
Linux NFSv4.1 fedora-2-2.localnet|1473046689|1
sqlite> 

  (the first, nameless bunch are the centos7 nfsroot clients; the fedora* bunch are
  the ones on 4.8-rc5).
  If I try to mount the Fedora23 server from one of the centos7 clients, the record
  does not appear in the output either.

   Now, while the theory that "aha, it's nfs 4.2 that is broken with Fedora23"
   might look plausible, I have another Fedora23 server that is mounted by
   yet another (single) client, and there things seem to be fine:
sqlite> select * from clients;
Linux NFSv4.2 xbmc.localnet|1471825025|1


   So with all of that in the picture, I wonder what it is I am doing wrong on just
   this server?

   Thanks.

Bye,
    Oleg


* Re: 4.6, 4.7 slow nfs export with more than one client.
  2016-09-05  4:55 4.6, 4.7 slow nfs export with more than one client Oleg Drokin
@ 2016-09-06 14:30 ` Jeff Layton
  2016-09-06 14:58   ` Oleg Drokin
  0 siblings, 1 reply; 10+ messages in thread
From: Jeff Layton @ 2016-09-06 14:30 UTC (permalink / raw)
  To: Oleg Drokin, linux-nfs

On Mon, 2016-09-05 at 00:55 -0400, Oleg Drokin wrote:
> Hello!
> 
>    I have a somewhat mysterious problem with my nfs test rig that I suspect is something
>    stupid I am missing, but I cannot figure it out and would appreciate any help.
> 
>    NFS server is Fedora23 with 4.6.7-200.fc23.x86_64 as the kernel.
>    Clients are a bunch of 4.8-rc5 nodes, nfsroot.
>    If I only start one of them, all is fine, if I start all 9 or 10, then suddenly all
>    operations ground to a half (nfs-wise). NFS server side there's very little load.
> 
>    I hit this (or something similar) back in June, when testing 4.6-rcs (and the server
>    was running 4.4.something I believe), and back then after some mucking around
>    I set:
> net.core.rmem_default=268435456
> net.core.wmem_default=268435456
> net.core.rmem_max=268435456
> net.core.wmem_max=268435456
> 
>    and while no idea why, that helped, so I stopped looking into it completely.
> 
>    Now fast forward to now, I am back at the same problem and the workaround above
>    does not help anymore.
> 
>    I also have a bunch of "NFSD: client 192.168.10.191 testing state ID with incorrect client ID"
>    in my logs (also had in June. Tried to disable nfs 4.2 and 4.1 and that did not
>    help).
> 
>    So anyway I discovered the nfsdcltrack and such and I noticed that whenever
>    the kernel calls it, it's always with the same hexid of
>    4c696e7578204e465376342e32206c6f63616c686f7374
> 
>    NAturally if I try to list the content of the sqlite file, I get:
> sqlite> select * from clients;
> Linux NFSv4.2 localhost|1473049735|1
> sqlite> select * from clients;
> Linux NFSv4.2 localhost|1473049736|1
> sqlite> select * from clients;
> Linux NFSv4.2 localhost|1473049737|1
> sqlite> select * from clients;
> Linux NFSv4.2 localhost|1473049751|1
> sqlite> select * from clients;
> Linux NFSv4.2 localhost|1473049752|1
> sqlite> 
> 

Well, not exactly. It sounds like the clients are all using the same
long-form clientid string. The server sees that and tosses out any
state that was previously established by the earlier client, because it
assumes that the client rebooted.

The easiest way to work around this is to use the nfs4_unique_id nfs.ko
module parameter on the clients to give them each a unique string id. That
should prevent the collisions.
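
A minimal sketch of what that could look like for nfsroot clients, assuming you can
edit each node's boot line (the value below is just a placeholder; anything that is
unique and stable per node will do):

    # appended to the kernel command line of each diskless client,
    # with a different string on every node:
    nfs.nfs4_unique_id=fedora-1-1

(On the kernel command line the parameter takes the module-prefixed "nfs." form;
on clients with local storage it can instead go into a modprobe.d options line.)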

>    (the number keeps changing), so it looks like client id detection broke somehow?
> 
>    These same clients (and a bunch more) also mount another nfs server (for crashdump
>    purposes) that is centos7-based, there everything is detected correctly
>    and performance is ok. The select shows:
> sqlite> select * from clients;
> Linux NFSv4.0 192.168.10.219/192.168.10.1 tcp|1472868376|0
> Linux NFSv4.0 192.168.10.218/192.168.10.1 tcp|1472868376|0
> Linux NFSv4.0 192.168.10.210/192.168.10.1 tcp|1472868384|0
> Linux NFSv4.0 192.168.10.221/192.168.10.1 tcp|1472868387|0
> Linux NFSv4.0 192.168.10.220/192.168.10.1 tcp|1472868388|0
> Linux NFSv4.0 192.168.10.211/192.168.10.1 tcp|1472868389|0
> Linux NFSv4.0 192.168.10.222/192.168.10.1 tcp|1473035496|0
> Linux NFSv4.0 192.168.10.217/192.168.10.1 tcp|1473035500|0
> Linux NFSv4.0 192.168.10.216/192.168.10.1 tcp|1473035501|0
> Linux NFSv4.0 192.168.10.224/192.168.10.1 tcp|1473035520|0
> Linux NFSv4.0 192.168.10.226/192.168.10.1 tcp|1473045789|0
> Linux NFSv4.0 192.168.10.227/192.168.10.1 tcp|1473045789|0
> Linux NFSv4.1 fedora1.localnet|1473046045|1
> Linux NFSv4.1 fedora-1-3.localnet|1473046139|1
> Linux NFSv4.1 fedora-2-4.localnet|1473046229|1
> Linux NFSv4.1 fedora-1-1.localnet|1473046244|1
> Linux NFSv4.1 fedora-1-4.localnet|1473046251|1
> Linux NFSv4.1 fedora-2-1.localnet|1473046342|1
> Linux NFSv4.1 fedora-1-2.localnet|1473046498|1
> Linux NFSv4.1 fedora-2-3.localnet|1473046524|1
> Linux NFSv4.1 fedora-2-2.localnet|1473046689|1
> sqlite> 
> 
>   (the first nameless bunch is centos7 nfsroot clients, fedora* bunch are
>   the ones on 4.8-rc5).
>   If I try to mount the Fedora23 server from one of the centos7 clients, the record
>   does not appear in the output either.
> 
>    Now, while a theory that "aha, it's nfs 4.2 that is broken with Fedora23"
>    might look possible, I have another Fedora23 server that is mounted by
>    yet another (single) client and there things seems to be fine:
> sqlite> select * from clients;
> Linux NFSv4.2 xbmc.localnet|1471825025|1
> 
> 
>    So with all of that in the picture, I wonder what is it I am doing wrong just on
>    this server?
> 
>    Thanks.
> 
> Bye,
>     Oleg
-- 
Jeff Layton <jlayton@poochiereds.net>
-- 
Jeff Layton <jlayton@redhat.com>


* Re: 4.6, 4.7 slow nfs export with more than one client.
  2016-09-06 14:30 ` Jeff Layton
@ 2016-09-06 14:58   ` Oleg Drokin
  2016-09-06 15:18     ` Jeff Layton
  0 siblings, 1 reply; 10+ messages in thread
From: Oleg Drokin @ 2016-09-06 14:58 UTC (permalink / raw)
  To: Jeff Layton; +Cc: linux-nfs


On Sep 6, 2016, at 10:30 AM, Jeff Layton wrote:

> On Mon, 2016-09-05 at 00:55 -0400, Oleg Drokin wrote:
>> Hello!
>> 
>>    I have a somewhat mysterious problem with my nfs test rig that I suspect is something
>>    stupid I am missing, but I cannot figure it out and would appreciate any help.
>> 
>>    NFS server is Fedora23 with 4.6.7-200.fc23.x86_64 as the kernel.
>>    Clients are a bunch of 4.8-rc5 nodes, nfsroot.
>>    If I only start one of them, all is fine, if I start all 9 or 10, then suddenly all
>>    operations ground to a half (nfs-wise). NFS server side there's very little load.
>> 
>>    I hit this (or something similar) back in June, when testing 4.6-rcs (and the server
>>    was running 4.4.something I believe), and back then after some mucking around
>>    I set:
>> net.core.rmem_default=268435456
>> net.core.wmem_default=268435456
>> net.core.rmem_max=268435456
>> net.core.wmem_max=268435456
>> 
>>    and while no idea why, that helped, so I stopped looking into it completely.
>> 
>>    Now fast forward to now, I am back at the same problem and the workaround above
>>    does not help anymore.
>> 
>>    I also have a bunch of "NFSD: client 192.168.10.191 testing state ID with incorrect client ID"
>>    in my logs (also had in June. Tried to disable nfs 4.2 and 4.1 and that did not
>>    help).
>> 
>>    So anyway I discovered the nfsdcltrack and such and I noticed that whenever
>>    the kernel calls it, it's always with the same hexid of
>>    4c696e7578204e465376342e32206c6f63616c686f7374
>> 
>>    NAturally if I try to list the content of the sqlite file, I get:
>> sqlite> select * from clients;
>> Linux NFSv4.2 localhost|1473049735|1
>> sqlite> select * from clients;
>> Linux NFSv4.2 localhost|1473049736|1
>> sqlite> select * from clients;
>> Linux NFSv4.2 localhost|1473049737|1
>> sqlite> select * from clients;
>> Linux NFSv4.2 localhost|1473049751|1
>> sqlite> select * from clients;
>> Linux NFSv4.2 localhost|1473049752|1
>> sqlite> 
>> 
> 
> Well, not exactly. It sounds like the clients are all using the same
> long-form clientid string. The server sees that and tosses out any
> state that was previously established by the earlier client, because it
> assumes that the client rebooted.
> 
> The easiest way to work around this is to use the nfs4_unique_id nfs.ko
> module parm on the clients to give them each a unique string id. That
> should prevent the collisions.

Hm, but it did work OK in the past.
What determines the unique id now by default?
The clients do each start with a different IP address, for one, so that
seems to be a much better proxy for a unique id
(or local ip/server ip, as in the centos7 case) than whatever the local
hostname happens to be at some random point in time during boot
(where it might not be set yet, apparently).

> 
>>    (the number keeps changing), so it looks like client id detection broke somehow?
>> 
>>    These same clients (and a bunch more) also mount another nfs server (for crashdump
>>    purposes) that is centos7-based, there everything is detected correctly
>>    and performance is ok. The select shows:
>> sqlite> select * from clients;
>> Linux NFSv4.0 192.168.10.219/192.168.10.1 tcp|1472868376|0
>> Linux NFSv4.0 192.168.10.218/192.168.10.1 tcp|1472868376|0
>> Linux NFSv4.0 192.168.10.210/192.168.10.1 tcp|1472868384|0
>> Linux NFSv4.0 192.168.10.221/192.168.10.1 tcp|1472868387|0
>> Linux NFSv4.0 192.168.10.220/192.168.10.1 tcp|1472868388|0
>> Linux NFSv4.0 192.168.10.211/192.168.10.1 tcp|1472868389|0
>> Linux NFSv4.0 192.168.10.222/192.168.10.1 tcp|1473035496|0
>> Linux NFSv4.0 192.168.10.217/192.168.10.1 tcp|1473035500|0
>> Linux NFSv4.0 192.168.10.216/192.168.10.1 tcp|1473035501|0
>> Linux NFSv4.0 192.168.10.224/192.168.10.1 tcp|1473035520|0
>> Linux NFSv4.0 192.168.10.226/192.168.10.1 tcp|1473045789|0
>> Linux NFSv4.0 192.168.10.227/192.168.10.1 tcp|1473045789|0
>> Linux NFSv4.1 fedora1.localnet|1473046045|1
>> Linux NFSv4.1 fedora-1-3.localnet|1473046139|1
>> Linux NFSv4.1 fedora-2-4.localnet|1473046229|1
>> Linux NFSv4.1 fedora-1-1.localnet|1473046244|1
>> Linux NFSv4.1 fedora-1-4.localnet|1473046251|1
>> Linux NFSv4.1 fedora-2-1.localnet|1473046342|1
>> Linux NFSv4.1 fedora-1-2.localnet|1473046498|1
>> Linux NFSv4.1 fedora-2-3.localnet|1473046524|1
>> Linux NFSv4.1 fedora-2-2.localnet|1473046689|1
>> sqlite> 
>> 
>>   (the first nameless bunch is centos7 nfsroot clients, fedora* bunch are
>>   the ones on 4.8-rc5).
>>   If I try to mount the Fedora23 server from one of the centos7 clients, the record
>>   does not appear in the output either.
>> 
>>    Now, while a theory that "aha, it's nfs 4.2 that is broken with Fedora23"
>>    might look possible, I have another Fedora23 server that is mounted by
>>    yet another (single) client and there things seems to be fine:
>> sqlite> select * from clients;
>> Linux NFSv4.2 xbmc.localnet|1471825025|1
>> 
>> 
>>    So with all of that in the picture, I wonder what is it I am doing wrong just on
>>    this server?
>> 
>>    Thanks.
>> 
>> Bye,
>>     Oleg
> -- 
> Jeff Layton <jlayton@poochiereds.net>
> -- 
> Jeff Layton <jlayton@redhat.com>



* Re: 4.6, 4.7 slow nfs export with more than one client.
  2016-09-06 14:58   ` Oleg Drokin
@ 2016-09-06 15:18     ` Jeff Layton
  2016-09-06 15:47       ` Oleg Drokin
  0 siblings, 1 reply; 10+ messages in thread
From: Jeff Layton @ 2016-09-06 15:18 UTC (permalink / raw)
  To: Oleg Drokin; +Cc: linux-nfs

On Tue, 2016-09-06 at 10:58 -0400, Oleg Drokin wrote:
> On Sep 6, 2016, at 10:30 AM, Jeff Layton wrote:
> 
> > 
> > On Mon, 2016-09-05 at 00:55 -0400, Oleg Drokin wrote:
> > > 
> > > Hello!
> > > 
> > >    I have a somewhat mysterious problem with my nfs test rig that I suspect is something
> > >    stupid I am missing, but I cannot figure it out and would appreciate any help.
> > > 
> > >    NFS server is Fedora23 with 4.6.7-200.fc23.x86_64 as the kernel.
> > >    Clients are a bunch of 4.8-rc5 nodes, nfsroot.
> > >    If I only start one of them, all is fine, if I start all 9 or 10, then suddenly all
> > >    operations ground to a half (nfs-wise). NFS server side there's very little load.
> > > 
> > >    I hit this (or something similar) back in June, when testing 4.6-rcs (and the server
> > >    was running 4.4.something I believe), and back then after some mucking around
> > >    I set:
> > > net.core.rmem_default=268435456
> > > net.core.wmem_default=268435456
> > > net.core.rmem_max=268435456
> > > net.core.wmem_max=268435456
> > > 
> > >    and while no idea why, that helped, so I stopped looking into it completely.
> > > 
> > >    Now fast forward to now, I am back at the same problem and the workaround above
> > >    does not help anymore.
> > > 
> > >    I also have a bunch of "NFSD: client 192.168.10.191 testing state ID with incorrect client ID"
> > >    in my logs (also had in June. Tried to disable nfs 4.2 and 4.1 and that did not
> > >    help).
> > > 
> > >    So anyway I discovered the nfsdcltrack and such and I noticed that whenever
> > >    the kernel calls it, it's always with the same hexid of
> > >    4c696e7578204e465376342e32206c6f63616c686f7374
> > > 
> > >    NAturally if I try to list the content of the sqlite file, I get:
> > > sqlite> select * from clients;
> > > Linux NFSv4.2 localhost|1473049735|1
> > > sqlite> select * from clients;
> > > Linux NFSv4.2 localhost|1473049736|1
> > > sqlite> select * from clients;
> > > Linux NFSv4.2 localhost|1473049737|1
> > > sqlite> select * from clients;
> > > Linux NFSv4.2 localhost|1473049751|1
> > > sqlite> select * from clients;
> > > Linux NFSv4.2 localhost|1473049752|1
> > > sqlite> 
> > > 
> > 
> > Well, not exactly. It sounds like the clients are all using the same
> > long-form clientid string. The server sees that and tosses out any
> > state that was previously established by the earlier client, because it
> > assumes that the client rebooted.
> > 
> > The easiest way to work around this is to use the nfs4_unique_id nfs.ko
> > module parm on the clients to give them each a unique string id. That
> > should prevent the collisions.
> 
> Hm, but it did work ok in the past.
> What determines the unique id now by default?
> The clients do start with a different ip address for one, so that
> seems to make that a much more good proxy for unique id
> (or local ip/server ip as is in case of centos7) than whatever local
> hostname is at any random point in time during boot
> (where it might not be set yet apparently).
> 

The v4.1+ clientid is (by default) determined entirely from the
hostname.

IP addresses are a poor choice given that they can easily change for
clients that have them dynamically assigned. That's the main reason
that v4.0 behaves differently here. The big problems there really come
into play with NFSv4 migration. See this RFC draft for the gory
details:

    https://tools.ietf.org/html/draft-ietf-nfsv4-migration-issues-10


> > 
> > 
> > > 
> > >    (the number keeps changing), so it looks like client id detection broke somehow?
> > > 
> > >    These same clients (and a bunch more) also mount another nfs server (for crashdump
> > >    purposes) that is centos7-based, there everything is detected correctly
> > >    and performance is ok. The select shows:
> > > sqlite> select * from clients;
> > > Linux NFSv4.0 192.168.10.219/192.168.10.1 tcp|1472868376|0
> > > Linux NFSv4.0 192.168.10.218/192.168.10.1 tcp|1472868376|0
> > > Linux NFSv4.0 192.168.10.210/192.168.10.1 tcp|1472868384|0
> > > Linux NFSv4.0 192.168.10.221/192.168.10.1 tcp|1472868387|0
> > > Linux NFSv4.0 192.168.10.220/192.168.10.1 tcp|1472868388|0
> > > Linux NFSv4.0 192.168.10.211/192.168.10.1 tcp|1472868389|0
> > > Linux NFSv4.0 192.168.10.222/192.168.10.1 tcp|1473035496|0
> > > Linux NFSv4.0 192.168.10.217/192.168.10.1 tcp|1473035500|0
> > > Linux NFSv4.0 192.168.10.216/192.168.10.1 tcp|1473035501|0
> > > Linux NFSv4.0 192.168.10.224/192.168.10.1 tcp|1473035520|0
> > > Linux NFSv4.0 192.168.10.226/192.168.10.1 tcp|1473045789|0
> > > Linux NFSv4.0 192.168.10.227/192.168.10.1 tcp|1473045789|0
> > > Linux NFSv4.1 fedora1.localnet|1473046045|1
> > > Linux NFSv4.1 fedora-1-3.localnet|1473046139|1
> > > Linux NFSv4.1 fedora-2-4.localnet|1473046229|1
> > > Linux NFSv4.1 fedora-1-1.localnet|1473046244|1
> > > Linux NFSv4.1 fedora-1-4.localnet|1473046251|1
> > > Linux NFSv4.1 fedora-2-1.localnet|1473046342|1
> > > Linux NFSv4.1 fedora-1-2.localnet|1473046498|1
> > > Linux NFSv4.1 fedora-2-3.localnet|1473046524|1
> > > Linux NFSv4.1 fedora-2-2.localnet|1473046689|1
> > > sqlite> 
> > > 
> > >   (the first nameless bunch is centos7 nfsroot clients, fedora* bunch are
> > >   the ones on 4.8-rc5).
> > >   If I try to mount the Fedora23 server from one of the centos7 clients, the record
> > >   does not appear in the output either.
> > > 
> > >    Now, while a theory that "aha, it's nfs 4.2 that is broken with Fedora23"
> > >    might look possible, I have another Fedora23 server that is mounted by
> > >    yet another (single) client and there things seems to be fine:
> > > sqlite> select * from clients;
> > > Linux NFSv4.2 xbmc.localnet|1471825025|1
> > > 
> > > 
> > >    So with all of that in the picture, I wonder what is it I am doing wrong just on
> > >    this server?
> > > 
> > >    Thanks.
> > > 
> > > Bye,
> > >     Oleg
> > -- 
> > Jeff Layton <jlayton@poochiereds.net>
> > -- 
> > Jeff Layton <jlayton@redhat.com>
> 
-- 
Jeff Layton <jlayton@redhat.com>


* Re: 4.6, 4.7 slow nfs export with more than one client.
  2016-09-06 15:18     ` Jeff Layton
@ 2016-09-06 15:47       ` Oleg Drokin
  2016-09-06 16:00         ` Jeff Layton
  2016-09-06 16:38         ` Chuck Lever
  0 siblings, 2 replies; 10+ messages in thread
From: Oleg Drokin @ 2016-09-06 15:47 UTC (permalink / raw)
  To: Jeff Layton; +Cc: linux-nfs


On Sep 6, 2016, at 11:18 AM, Jeff Layton wrote:

> On Tue, 2016-09-06 at 10:58 -0400, Oleg Drokin wrote:
>> On Sep 6, 2016, at 10:30 AM, Jeff Layton wrote:
>> 
>>> 
>>> On Mon, 2016-09-05 at 00:55 -0400, Oleg Drokin wrote:
>>>> 
>>>> Hello!
>>>> 
>>>>    I have a somewhat mysterious problem with my nfs test rig that I suspect is something
>>>>    stupid I am missing, but I cannot figure it out and would appreciate any help.
>>>> 
>>>>    NFS server is Fedora23 with 4.6.7-200.fc23.x86_64 as the kernel.
>>>>    Clients are a bunch of 4.8-rc5 nodes, nfsroot.
>>>>    If I only start one of them, all is fine, if I start all 9 or 10, then suddenly all
>>>>    operations ground to a half (nfs-wise). NFS server side there's very little load.
>>>> 
>>>>    I hit this (or something similar) back in June, when testing 4.6-rcs (and the server
>>>>    was running 4.4.something I believe), and back then after some mucking around
>>>>    I set:
>>>> net.core.rmem_default=268435456
>>>> net.core.wmem_default=268435456
>>>> net.core.rmem_max=268435456
>>>> net.core.wmem_max=268435456
>>>> 
>>>>    and while no idea why, that helped, so I stopped looking into it completely.
>>>> 
>>>>    Now fast forward to now, I am back at the same problem and the workaround above
>>>>    does not help anymore.
>>>> 
>>>>    I also have a bunch of "NFSD: client 192.168.10.191 testing state ID with incorrect client ID"
>>>>    in my logs (also had in June. Tried to disable nfs 4.2 and 4.1 and that did not
>>>>    help).
>>>> 
>>>>    So anyway I discovered the nfsdcltrack and such and I noticed that whenever
>>>>    the kernel calls it, it's always with the same hexid of
>>>>    4c696e7578204e465376342e32206c6f63616c686f7374
>>>> 
>>>>    NAturally if I try to list the content of the sqlite file, I get:
>>>> sqlite> select * from clients;
>>>> Linux NFSv4.2 localhost|1473049735|1
>>>> sqlite> select * from clients;
>>>> Linux NFSv4.2 localhost|1473049736|1
>>>> sqlite> select * from clients;
>>>> Linux NFSv4.2 localhost|1473049737|1
>>>> sqlite> select * from clients;
>>>> Linux NFSv4.2 localhost|1473049751|1
>>>> sqlite> select * from clients;
>>>> Linux NFSv4.2 localhost|1473049752|1
>>>> sqlite> 
>>>> 
>>> 
>>> Well, not exactly. It sounds like the clients are all using the same
>>> long-form clientid string. The server sees that and tosses out any
>>> state that was previously established by the earlier client, because it
>>> assumes that the client rebooted.
>>> 
>>> The easiest way to work around this is to use the nfs4_unique_id nfs.ko
>>> module parm on the clients to give them each a unique string id. That
>>> should prevent the collisions.
>> 
>> Hm, but it did work ok in the past.
>> What determines the unique id now by default?
>> The clients do start with a different ip address for one, so that
>> seems to make that a much more good proxy for unique id
>> (or local ip/server ip as is in case of centos7) than whatever local
>> hostname is at any random point in time during boot
>> (where it might not be set yet apparently).
>> 
> 
> The v4.1+ clientid is (by default) determined entirely from the
> hostname.
> 
> IP addresses are a poor choice given that they can easily change for
> clients that have them dynamically assigned. That's the main reason
> that v4.0 behaves differently here. The big problems there really come
> into play with NFSv4 migration. See this RFC draft for the gory
> details:
> 
>     https://tools.ietf.org/html/draft-ietf-nfsv4-migration-issues-10

Duh, so "ip addresses are unreliable, let's use something even less
reliable". hostname is also dynamic in a bunch of cases, btw.
Worst of all, there are very many valid cases where nfs might be mounted
before hostname is set (or do you regard that as a bug in the environment
and I should just file a ticket in Fedora bugzilla?)

Looking over the draft, the two cases are:
what if the client reboots and how do we reclaim its state ASAP, and
what if there is a server migration but the same client.

The second case is trivial as long as the client id stays constant no matter
which server you connect to, and it could be any number of constant identifiers,
random or not.

On the other hand, the rebooted client is more interesting. Of course there's
also lease expiration (that's what we do in Lustre too: if the client dies,
it is eventually expired, but if we talk to it and it does not reply, we kick
it out as well, and that has a much shorter timeout, so it is not as disruptive).

Cannot some more unique identifier be used by default?
Say, "the MAC address of the primary interface, whatever that happens to be";
in that case, as long as your client remains on the same physical box
(and the network card has not changed), you should be fine.
I guess there are other ways.
Ideally, the kernel would offer an API (maybe there is one already, but I cannot
find it) that could be queried for a unique id like that (with inputs from MAC
addresses, identifiable serial numbers, and such).



* Re: 4.6, 4.7 slow nfs export with more than one client.
  2016-09-06 15:47       ` Oleg Drokin
@ 2016-09-06 16:00         ` Jeff Layton
  2016-09-06 16:29           ` Oleg Drokin
  2016-09-06 16:38         ` Chuck Lever
  1 sibling, 1 reply; 10+ messages in thread
From: Jeff Layton @ 2016-09-06 16:00 UTC (permalink / raw)
  To: Oleg Drokin; +Cc: linux-nfs

On Tue, 2016-09-06 at 11:47 -0400, Oleg Drokin wrote:
> On Sep 6, 2016, at 11:18 AM, Jeff Layton wrote:
> 
> > 
> > On Tue, 2016-09-06 at 10:58 -0400, Oleg Drokin wrote:
> > > 
> > > On Sep 6, 2016, at 10:30 AM, Jeff Layton wrote:
> > > 
> > > > 
> > > > 
> > > > On Mon, 2016-09-05 at 00:55 -0400, Oleg Drokin wrote:
> > > > > 
> > > > > 
> > > > > Hello!
> > > > > 
> > > > >    I have a somewhat mysterious problem with my nfs test rig that I suspect is something
> > > > >    stupid I am missing, but I cannot figure it out and would appreciate any help.
> > > > > 
> > > > >    NFS server is Fedora23 with 4.6.7-200.fc23.x86_64 as the kernel.
> > > > >    Clients are a bunch of 4.8-rc5 nodes, nfsroot.
> > > > >    If I only start one of them, all is fine, if I start all 9 or 10, then suddenly all
> > > > >    operations ground to a half (nfs-wise). NFS server side there's very little load.
> > > > > 
> > > > >    I hit this (or something similar) back in June, when testing 4.6-rcs (and the server
> > > > >    was running 4.4.something I believe), and back then after some mucking around
> > > > >    I set:
> > > > > net.core.rmem_default=268435456
> > > > > net.core.wmem_default=268435456
> > > > > net.core.rmem_max=268435456
> > > > > net.core.wmem_max=268435456
> > > > > 
> > > > >    and while no idea why, that helped, so I stopped looking into it completely.
> > > > > 
> > > > >    Now fast forward to now, I am back at the same problem and the workaround above
> > > > >    does not help anymore.
> > > > > 
> > > > >    I also have a bunch of "NFSD: client 192.168.10.191 testing state ID with incorrect client ID"
> > > > >    in my logs (also had in June. Tried to disable nfs 4.2 and 4.1 and that did not
> > > > >    help).
> > > > > 
> > > > >    So anyway I discovered the nfsdcltrack and such and I noticed that whenever
> > > > >    the kernel calls it, it's always with the same hexid of
> > > > >    4c696e7578204e465376342e32206c6f63616c686f7374
> > > > > 
> > > > >    NAturally if I try to list the content of the sqlite file, I get:
> > > > > sqlite> select * from clients;
> > > > > Linux NFSv4.2 localhost|1473049735|1
> > > > > sqlite> select * from clients;
> > > > > Linux NFSv4.2 localhost|1473049736|1
> > > > > sqlite> select * from clients;
> > > > > Linux NFSv4.2 localhost|1473049737|1
> > > > > sqlite> select * from clients;
> > > > > Linux NFSv4.2 localhost|1473049751|1
> > > > > sqlite> select * from clients;
> > > > > Linux NFSv4.2 localhost|1473049752|1
> > > > > sqlite> 
> > > > > 
> > > > 
> > > > Well, not exactly. It sounds like the clients are all using the same
> > > > long-form clientid string. The server sees that and tosses out any
> > > > state that was previously established by the earlier client, because it
> > > > assumes that the client rebooted.
> > > > 
> > > > The easiest way to work around this is to use the nfs4_unique_id nfs.ko
> > > > module parm on the clients to give them each a unique string id. That
> > > > should prevent the collisions.
> > > 
> > > Hm, but it did work ok in the past.
> > > What determines the unique id now by default?
> > > The clients do start with a different ip address for one, so that
> > > seems to make that a much more good proxy for unique id
> > > (or local ip/server ip as is in case of centos7) than whatever local
> > > hostname is at any random point in time during boot
> > > (where it might not be set yet apparently).
> > > 
> > 
> > The v4.1+ clientid is (by default) determined entirely from the
> > hostname.
> > 
> > IP addresses are a poor choice given that they can easily change for
> > clients that have them dynamically assigned. That's the main reason
> > that v4.0 behaves differently here. The big problems there really come
> > into play with NFSv4 migration. See this RFC draft for the gory
> > details:
> > 
> >     https://tools.ietf.org/html/draft-ietf-nfsv4-migration-issues-10
> 
> Duh, so "ip addresses are unreliable, let's use something even less
> reliable". hostname is also dynamic in a bunch of cases, btw.
> Worst of all, there are very many valid cases where nfs might be mounted
> before hostname is set (or do you regard that as a bug in the environment
> and I should just file a ticket in Fedora bugzilla?)
> 
> Looking over the draft, the two cases are:
> what if client reboots, how do we reclaim state ASAP and
> what if there is server migration, but same client.
> 
> The second case is trivial as long as the client id stays constant no matter
> what server you connect to and might be any number of constant identifiers,
> be it random, or not.
> 
> On the other hand the rebooted client is more interesting. Of course there's
> also a lease expiration (that's what we do in Lustre too, if the client dies,
> it'll be expired eventually, but also if we talk to it and it does not reply,
> we kick it out as well, and this has a much shorter timeout, so not as disruptive).
> 
> Cannot some more unique identifier be used by default?
> Say "mac address of the primary interface, whatever that happens to be",
> in that case as long as your client remains on the same physical box
> (and the network card has not changed), you should be fine.
> I guess there are other ways.
> Ideally, kernel would offer an API (might be there is already, but I cannot find it)
> that could be queried for a unique id like that (with inputs from mac addresses,
> various serial numbers identifiable and such).
> 

Shrug...feel free to propose a better scheme for generating unique ids
if you can think of one. Unfortunately, there are always cases when
these mechanisms for getting a persistent+unique id break down.

That's the reason that nfs provides an interface to allow setting a
uniquifier from userland via a module parameter.

Cheers,
-- 
Jeff Layton <jlayton@redhat.com>


* Re: 4.6, 4.7 slow nfs export with more than one client.
  2016-09-06 16:00         ` Jeff Layton
@ 2016-09-06 16:29           ` Oleg Drokin
  2016-09-06 22:51             ` Jeff Layton
  0 siblings, 1 reply; 10+ messages in thread
From: Oleg Drokin @ 2016-09-06 16:29 UTC (permalink / raw)
  To: Jeff Layton; +Cc: linux-nfs


On Sep 6, 2016, at 12:00 PM, Jeff Layton wrote:
> 
> Shrug...feel free to propose a better scheme for generating unique ids
> if you can think of one. Unfortunately, there are always cases when
> these mechanisms for getting a persistent+unique id break down.
> 
> That's the reason that nfs provides an interface to allow setting a
> uniquifier from userland via module param.

Fair enough, though I guess a module parameter is not really a super convenient
place for it when you run off nfs-root.
I guess you were not really enticed by the MAC address idea either ;)

Anyway, thank you very much for your help, now I have some better
idea of what's going on and what to try next.


* Re: 4.6, 4.7 slow nfs export with more than one client.
  2016-09-06 15:47       ` Oleg Drokin
  2016-09-06 16:00         ` Jeff Layton
@ 2016-09-06 16:38         ` Chuck Lever
  2016-09-06 18:52           ` Oleg Drokin
  1 sibling, 1 reply; 10+ messages in thread
From: Chuck Lever @ 2016-09-06 16:38 UTC (permalink / raw)
  To: Oleg Drokin; +Cc: Jeff Layton, Linux NFS Mailing List


> On Sep 6, 2016, at 11:47 AM, Oleg Drokin <green@linuxhacker.ru> wrote:
> 
> 
> On Sep 6, 2016, at 11:18 AM, Jeff Layton wrote:
> 
>> On Tue, 2016-09-06 at 10:58 -0400, Oleg Drokin wrote:
>>> On Sep 6, 2016, at 10:30 AM, Jeff Layton wrote:
>>> 
>>>> 
>>>> On Mon, 2016-09-05 at 00:55 -0400, Oleg Drokin wrote:
>>>>> 
>>>>> Hello!
>>>>> 
>>>>>   I have a somewhat mysterious problem with my nfs test rig that I suspect is something
>>>>>   stupid I am missing, but I cannot figure it out and would appreciate any help.
>>>>> 
>>>>>   NFS server is Fedora23 with 4.6.7-200.fc23.x86_64 as the kernel.
>>>>>   Clients are a bunch of 4.8-rc5 nodes, nfsroot.
>>>>>   If I only start one of them, all is fine, if I start all 9 or 10, then suddenly all
>>>>>   operations ground to a half (nfs-wise). NFS server side there's very little load.
>>>>> 
>>>>>   I hit this (or something similar) back in June, when testing 4.6-rcs (and the server
>>>>>   was running 4.4.something I believe), and back then after some mucking around
>>>>>   I set:
>>>>> net.core.rmem_default=268435456
>>>>> net.core.wmem_default=268435456
>>>>> net.core.rmem_max=268435456
>>>>> net.core.wmem_max=268435456
>>>>> 
>>>>>   and while no idea why, that helped, so I stopped looking into it completely.
>>>>> 
>>>>>   Now fast forward to now, I am back at the same problem and the workaround above
>>>>>   does not help anymore.
>>>>> 
>>>>>   I also have a bunch of "NFSD: client 192.168.10.191 testing state ID with incorrect client ID"
>>>>>   in my logs (also had in June. Tried to disable nfs 4.2 and 4.1 and that did not
>>>>>   help).
>>>>> 
>>>>>   So anyway I discovered the nfsdcltrack and such and I noticed that whenever
>>>>>   the kernel calls it, it's always with the same hexid of
>>>>>   4c696e7578204e465376342e32206c6f63616c686f7374
>>>>> 
>>>>>   NAturally if I try to list the content of the sqlite file, I get:
>>>>> sqlite> select * from clients;
>>>>> Linux NFSv4.2 localhost|1473049735|1
>>>>> sqlite> select * from clients;
>>>>> Linux NFSv4.2 localhost|1473049736|1
>>>>> sqlite> select * from clients;
>>>>> Linux NFSv4.2 localhost|1473049737|1
>>>>> sqlite> select * from clients;
>>>>> Linux NFSv4.2 localhost|1473049751|1
>>>>> sqlite> select * from clients;
>>>>> Linux NFSv4.2 localhost|1473049752|1
>>>>> sqlite> 
>>>>> 
>>>> 
>>>> Well, not exactly. It sounds like the clients are all using the same
>>>> long-form clientid string. The server sees that and tosses out any
>>>> state that was previously established by the earlier client, because it
>>>> assumes that the client rebooted.
>>>> 
>>>> The easiest way to work around this is to use the nfs4_unique_id nfs.ko
>>>> module parm on the clients to give them each a unique string id. That
>>>> should prevent the collisions.
>>> 
>>> Hm, but it did work ok in the past.
>>> What determines the unique id now by default?
>>> The clients do start with a different ip address for one, so that
>>> seems to make that a much more good proxy for unique id
>>> (or local ip/server ip as is in case of centos7) than whatever local
>>> hostname is at any random point in time during boot
>>> (where it might not be set yet apparently).
>>> 
>> 
>> The v4.1+ clientid is (by default) determined entirely from the
>> hostname.
>> 
>> IP addresses are a poor choice given that they can easily change for
>> clients that have them dynamically assigned. That's the main reason
>> that v4.0 behaves differently here. The big problems there really come
>> into play with NFSv4 migration. See this RFC draft for the gory
>> details:
>> 
>>    https://tools.ietf.org/html/draft-ietf-nfsv4-migration-issues-10
> 
> Duh, so "ip addresses are unreliable, let's use something even less
> reliable". hostname is also dynamic in a bunch of cases, btw.
> Worst of all, there are very many valid cases where nfs might be mounted
> before hostname is set (or do you regard that as a bug in the environment
> and I should just file a ticket in Fedora bugzilla?)

That's a bug IMO. How can network activity be done before the host is
properly configured? If the host has an IP address, it can perform
a reverse-lookup and find out the matching hostname and use that.

At any rate, if NFS needs the hostname set before performing a mount,
that dependency should be added to the O/S's start-up logic.


> Looking over the draft, the two cases are:
> what if client reboots, how do we reclaim state ASAP and
> what if there is server migration, but same client.
> 
> The second case is trivial as long as the client id stays constant no matter
> what server you connect to and might be any number of constant identifiers,
> be it random, or not.
> 
> On the other hand the rebooted client is more interesting. Of course there's
> also a lease expiration (that's what we do in Lustre too, if the client dies,
> it'll be expired eventually, but also if we talk to it and it does not reply,
> we kick it out as well, and this has a much shorter timeout, so not as disruptive).
> 
> Cannot some more unique identifier be used by default?

There is no good way to do this. We picked a way that works in many
convenient cases, and provided a mechanism for setting a unique ID
in the cases where the default behavior does not work. That's the
best that can be done.

Ideally, we would want O/S installation to generate a random value
(say, a UUID) and store that persistently on the client to use as
its client ID. A diskless client does not have persistent storage,
however.
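
For a client that does have local storage, a rough sketch of that kind of install-time
provisioning, reusing the nfs4_unique_id parameter mentioned earlier (the file path and
snippet name here are made up for illustration, and this assumes uuidgen is available):

    # run once at installation time:
    uuidgen > /etc/nfs4-uniquifier
    echo "options nfs nfs4_unique_id=$(cat /etc/nfs4-uniquifier)" > /etc/modprobe.d/nfs-unique-id.conf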


> Say "mac address of the primary interface, whatever that happens to be",
> in that case as long as your client remains on the same physical box
> (and the network card has not changed), you should be fine.

That has all the same caveats as using hostname or IP address. Given that
Linux is notoriously bad about the "ordering" of hardware devices after
a reboot, it's difficult to claim that this would be more reliable than
using a hostname.


> I guess there are other ways.
> Ideally, kernel would offer an API (might be there is already, but I cannot find it)
> that could be queried for a unique id like that (with inputs from mac addresses,
> various serial numbers identifiable and such).

The IESG had some trouble with that; namely that (if I recall correctly)
it makes it possible for an attacker to see that serial number on the
wire, and to track that host and its MACs and PRNG.

We carefully considered all of this when authoring that document. And,
implementations of NFSv4 are free to use whatever they like in that
client ID. The text in that document is a suggestion, not a normative
requirement.

--
Chuck Lever





* Re: 4.6, 4.7 slow nfs export with more than one client.
  2016-09-06 16:38         ` Chuck Lever
@ 2016-09-06 18:52           ` Oleg Drokin
  0 siblings, 0 replies; 10+ messages in thread
From: Oleg Drokin @ 2016-09-06 18:52 UTC (permalink / raw)
  To: Chuck Lever; +Cc: Jeff Layton, Linux NFS Mailing List


On Sep 6, 2016, at 12:38 PM, Chuck Lever wrote:

> 
>> On Sep 6, 2016, at 11:47 AM, Oleg Drokin <green@linuxhacker.ru> wrote:
>> 
>> 
>> On Sep 6, 2016, at 11:18 AM, Jeff Layton wrote:
>> 
>>> On Tue, 2016-09-06 at 10:58 -0400, Oleg Drokin wrote:
>>>> On Sep 6, 2016, at 10:30 AM, Jeff Layton wrote:
>>>> 
>>>>> 
>>>>> On Mon, 2016-09-05 at 00:55 -0400, Oleg Drokin wrote:
>>>>>> 
>>>>>> Hello!
>>>>>> 
>>>>>>  I have a somewhat mysterious problem with my nfs test rig that I suspect is something
>>>>>>  stupid I am missing, but I cannot figure it out and would appreciate any help.
>>>>>> 
>>>>>>  NFS server is Fedora23 with 4.6.7-200.fc23.x86_64 as the kernel.
>>>>>>  Clients are a bunch of 4.8-rc5 nodes, nfsroot.
>>>>>>  If I only start one of them, all is fine, if I start all 9 or 10, then suddenly all
>>>>>>  operations ground to a half (nfs-wise). NFS server side there's very little load.
>>>>>> 
>>>>>>  I hit this (or something similar) back in June, when testing 4.6-rcs (and the server
>>>>>>  was running 4.4.something I believe), and back then after some mucking around
>>>>>>  I set:
>>>>>> net.core.rmem_default=268435456
>>>>>> net.core.wmem_default=268435456
>>>>>> net.core.rmem_max=268435456
>>>>>> net.core.wmem_max=268435456
>>>>>> 
>>>>>>  and while no idea why, that helped, so I stopped looking into it completely.
>>>>>> 
>>>>>>  Now fast forward to now, I am back at the same problem and the workaround above
>>>>>>  does not help anymore.
>>>>>> 
>>>>>>  I also have a bunch of "NFSD: client 192.168.10.191 testing state ID with incorrect client ID"
>>>>>>  in my logs (also had in June. Tried to disable nfs 4.2 and 4.1 and that did not
>>>>>>  help).
>>>>>> 
>>>>>>  So anyway I discovered the nfsdcltrack and such and I noticed that whenever
>>>>>>  the kernel calls it, it's always with the same hexid of
>>>>>>  4c696e7578204e465376342e32206c6f63616c686f7374
>>>>>> 
>>>>>>  NAturally if I try to list the content of the sqlite file, I get:
>>>>>> sqlite> select * from clients;
>>>>>> Linux NFSv4.2 localhost|1473049735|1
>>>>>> sqlite> select * from clients;
>>>>>> Linux NFSv4.2 localhost|1473049736|1
>>>>>> sqlite> select * from clients;
>>>>>> Linux NFSv4.2 localhost|1473049737|1
>>>>>> sqlite> select * from clients;
>>>>>> Linux NFSv4.2 localhost|1473049751|1
>>>>>> sqlite> select * from clients;
>>>>>> Linux NFSv4.2 localhost|1473049752|1
>>>>>> sqlite> 
>>>>>> 
>>>>> 
>>>>> Well, not exactly. It sounds like the clients are all using the same
>>>>> long-form clientid string. The server sees that and tosses out any
>>>>> state that was previously established by the earlier client, because it
>>>>> assumes that the client rebooted.
>>>>> 
>>>>> The easiest way to work around this is to use the nfs4_unique_id nfs.ko
>>>>> module parm on the clients to give them each a unique string id. That
>>>>> should prevent the collisions.
>>>> 
>>>> Hm, but it did work ok in the past.
>>>> What determines the unique id now by default?
>>>> The clients do start with a different ip address for one, so that
>>>> seems to make that a much more good proxy for unique id
>>>> (or local ip/server ip as is in case of centos7) than whatever local
>>>> hostname is at any random point in time during boot
>>>> (where it might not be set yet apparently).
>>>> 
>>> 
>>> The v4.1+ clientid is (by default) determined entirely from the
>>> hostname.
>>> 
>>> IP addresses are a poor choice given that they can easily change for
>>> clients that have them dynamically assigned. That's the main reason
>>> that v4.0 behaves differently here. The big problems there really come
>>> into play with NFSv4 migration. See this RFC draft for the gory
>>> details:
>>> 
>>>   https://tools.ietf.org/html/draft-ietf-nfsv4-migration-issues-10
>> 
>> Duh, so "ip addresses are unreliable, let's use something even less
>> reliable". hostname is also dynamic in a bunch of cases, btw.
>> Worst of all, there are very many valid cases where nfs might be mounted
>> before hostname is set (or do you regard that as a bug in the environment
>> and I should just file a ticket in Fedora bugzilla?)
> 
> That's a bug IMO. How can network activity be done before the host is
> properly configured? If the host has an IP address, it can perform
> a reverse-lookup and find out the matching hostname and use that.
> 
> At any rate, if NFS needs the hostname set before performing a mount,
> that dependency should be added to the O/S's start-up logic.

I guess so.
Since the later startup sets the hostname somehow, there's no reason this
logic cannot be brought into the initramfs either.
Though it's also a bit strange that dhcpd does not seem to supply the
hostname from the host declaration like one would imagine it would.

Anyway, for the record, doing this in dhcpd.conf seems to avert
the immediate problem for me
(so whoever finds this thread might use the same trick; it does require reverse
DNS lookups to return a sensible value):
get-lease-hostnames true;
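
(Explicitly setting option host-name inside each host declaration might be another
way to get there with ISC dhcpd, assuming the client's initramfs applies the
DHCP-supplied name; I have not tried it, and the values below are just placeholders):

host fedora-1-1 {
    hardware ethernet 52:54:00:00:00:01;
    fixed-address 192.168.10.231;
    option host-name "fedora-1-1";
}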

>> Looking over the draft, the two cases are:
>> what if client reboots, how do we reclaim state ASAP and
>> what if there is server migration, but same client.
>> 
>> The second case is trivial as long as the client id stays constant no matter
>> what server you connect to and might be any number of constant identifiers,
>> be it random, or not.
>> 
>> On the other hand the rebooted client is more interesting. Of course there's
>> also a lease expiration (that's what we do in Lustre too, if the client dies,
>> it'll be expired eventually, but also if we talk to it and it does not reply,
>> we kick it out as well, and this has a much shorter timeout, so not as disruptive).
>> 
>> Cannot some more unique identifier be used by default?
> 
> There is no good way to do this. We picked a way that works in many
> convenient cases, and provided a mechanism for setting a unique ID
> in the cases where the default behavior does not work. That's the
> best that can be done.
> 
> Ideally, we would want O/S installation to generate a random value
> (say, a UUID) and store that persistently on the client to use as
> its client ID. A diskless client does not have persistent storage,
> however.

True; while there's machine-id(5), diskless clients cannot really access it
before getting onto the network first, and in the case of nfsroot, that means
mounting before it's available.

I just suspect you traded one problem for a very similar one, though.
Imagine that you have truly dynamic IP addresses (and hostnames that depend on them).
Then if the IP address changes, so does the hostname. This is all fine while the
client is alive, since the clientid stays constant across such a mount, but
on reboot, once the IP address (and corresponding hostname) are lost, you
are back to square one: the rebooted client cannot be mapped back onto the
existing lease.
The extra complication (the same as with the old IP-based scheme) is that once you
lose your IP and somebody else gets it and happens to mount the same server, you
will get two clients with the same id.
Now if both of those clients are fairly active, tracking down why performance has
suddenly dropped might be less than trivial.
Hopefully migration is not compromised by this either (when the wrong client with
the same client id reaches the other server first on migration).

All of this makes me wonder: is lease expiration such an expensive and long thing?
Cannot you just make it lighter-weight? Potentially that would be easier than
having robust persistent node ids for all cases.

>> Say "mac address of the primary interface, whatever that happens to be",
>> in that case as long as your client remains on the same physical box
>> (and the network card has not changed), you should be fine.
> 
> That has all the same caveats as using hostname or IP address. Given that
> Linux is notoriously bad about the "ordering" of hardware devices after
> a reboot, it's difficult to claim that this would be more reliable than
> using a hostname.

Yes, I guess this can be a real problem.
On the other hand, diskless clients typically have only one "primary" interface
anyway: the one that gets the IP address first.

>> I guess there are other ways.
>> Ideally, kernel would offer an API (might be there is already, but I cannot find it)
>> that could be queried for a unique id like that (with inputs from mac addresses,
>> various serial numbers identifiable and such).
> 
> The IESG had some trouble with that; namely that (if I recall correctly)
> it makes it possible for an attacker to see that serial number on the
> wire, tracking that host and its MACs and PRNG.

Yes, tracking is a problem. On the other hand, MAC addresses are visible on the local
subnet anyway, and your current scheme is the same: you still give away a
"unique" id, it's just only exposed if the host happens to be an nfs client,
so some degree of tracking is still possible. I.e.,
if I have a constant hostname (not a diskless host) and I mount nfs,
an attacker that can see that nfs traffic now knows my IP address,
possibly my MAC address, and can potentially get info about other network
interfaces, and about PRNGs too.

> We carefully considered all of this when authoring that document. And,
> implementations of NFSv4 are free to use whatever they like in that
> client ID. The text in that document is a suggestion, not a normative
> requirement.

Yes, I understand that.
I guess I was just too disappointed by a working configuration suddenly breaking
to see how the new scheme is an improvement over the old one ;)

Thanks.

Bye,
    Oleg


* Re: 4.6, 4.7 slow nfs export with more than one client.
  2016-09-06 16:29           ` Oleg Drokin
@ 2016-09-06 22:51             ` Jeff Layton
  0 siblings, 0 replies; 10+ messages in thread
From: Jeff Layton @ 2016-09-06 22:51 UTC (permalink / raw)
  To: Oleg Drokin; +Cc: linux-nfs

On Tue, 2016-09-06 at 12:29 -0400, Oleg Drokin wrote:
> On Sep 6, 2016, at 12:00 PM, Jeff Layton wrote:
> > 
> > 
> > Shrug...feel free to propose a better scheme for generating unique
> > ids
> > if you can think of one. Unfortunately, there are always cases when
> > these mechanisms for getting a persistent+unique id break down.
> > 
> > That's the reason that nfs provides an interface to allow setting a
> > uniquifier from userland via module param.
> 
> Fair enough, though I guess module parameter is not really a super
> convenient place
> for it when you run off nfs-root.
> I guess you were not really enticed with the mac address idea too ;)
> 

Which MAC address? It's not always a given that the host will initialize
them in the same order when you have multiple interfaces, even if the
hardware doesn't change out from under you.

This is just one of those classic "punt the solution to userland" sorts
of problems that are really hard to fix in the kernel in a way that
will always work.

> Anyway, thank you very much for your help, now I have some better
> idea of what's going on and what to try next.

No problem. Diagnosing these sorts of problems can be pretty difficult.

Note too that earlier versions of the client had a length limitation on
this string, which could cause similar problems even when the hostnames
are already set, if the names were very long and only differed at the
end. Commit 873e385116b2cc5c7daca8f51881371fadb90970 fixes that
problem.
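
As an aside, if you want to double-check which uniquifier a running client ended up
with, the module parameter should also be readable through sysfs (assuming your
kernel exports it there):

# cat /sys/module/nfs/parameters/nfs4_unique_id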

-- 
Jeff Layton <jlayton@redhat.com>


Thread overview: 10 messages
2016-09-05  4:55 4.6, 4.7 slow nfs export with more than one client Oleg Drokin
2016-09-06 14:30 ` Jeff Layton
2016-09-06 14:58   ` Oleg Drokin
2016-09-06 15:18     ` Jeff Layton
2016-09-06 15:47       ` Oleg Drokin
2016-09-06 16:00         ` Jeff Layton
2016-09-06 16:29           ` Oleg Drokin
2016-09-06 22:51             ` Jeff Layton
2016-09-06 16:38         ` Chuck Lever
2016-09-06 18:52           ` Oleg Drokin
