LTP nfslock01 test failing on NFS v3 (lockd: cannot monitor 10.0.0.2)

* LTP nfslock01 test failing on NFS v3 (lockd: cannot monitor 10.0.0.2)
@ 2022-01-18 15:26 ` Petr Vorel
  0 siblings, 0 replies; 16+ messages in thread
From: Petr Vorel @ 2022-01-18 15:26 UTC (permalink / raw)
  To: linux-nfs
  Cc: J. Bruce Fields, Chuck Lever, Trond Myklebust, Anna Schumaker,
	Neil Brown, Steve Dickson, Nikita Yushchenko, ltp

Hi all,

this is a test failure posted by Nikita Yushchenko [1]. LTP NFS test nfslock01
looks to be failing on NFS v3:

"not unsharing /var makes AF_UNIX socket for host's rpcbind to become available
inside ltpns. Then, at nfs3 mount time, kernel creates an instance of lockd for
ltpns, and ports for that instance leak to host's rpcbind and overwrite ports
for lockd already active for root namespace. This breaks nfs3 file locking."

This error had been hidden; it shows up only with an extra patch from Nikita [2].
Because the patch has not been merged yet, feel free to use my LTP fork branch
nfs_flock/fail-on-error, which contains this patch plus strace debugging [3], if
you want to verify it yourself:

# PATH="/opt/ltp/testcases/bin:$PATH" /opt/ltp/testcases/bin/nfslock01 -t tcp -v 3
...
nfslock01 1 TINFO: initialize 'lhost' 'ltp_ns_veth2' interface
nfslock01 1 TINFO: add local addr 10.0.0.2/24
nfslock01 1 TINFO: add local addr fd00:1:1:1::2/64
nfslock01 1 TINFO: initialize 'rhost' 'ltp_ns_veth1' interface
nfslock01 1 TINFO: add remote addr 10.0.0.1/24
nfslock01 1 TINFO: add remote addr fd00:1:1:1::1/64
nfslock01 1 TINFO: Network config (local -- remote):
nfslock01 1 TINFO: ltp_ns_veth2 -- ltp_ns_veth1
nfslock01 1 TINFO: 10.0.0.2/24 -- 10.0.0.1/24
nfslock01 1 TINFO: fd00:1:1:1::2/64 -- fd00:1:1:1::1/64
nfslock01 1 TINFO: timeout per run is 0h 5m 0s
nfslock01 1 TINFO: setup NFSv3, socket type tcp
nfslock01 1 TINFO: Mounting NFS: mount -v -t nfs -o proto=tcp,vers=3 10.0.0.2:/tmp/LTP_nfslock01.PAYCDFih75/3/tcp /tmp/LTP_nfslock01.PAYCDFih75/3/0
nfslock01 1 TINFO: creating test files
nfslock01 1 TINFO: Testing locking
nfslock01 1 TINFO: locking 'flock_idata' file and writing data
nfslock01 1 TINFO: waiting for pids: 2022 2023
execve("/opt/ltp/testcases/bin/nfs_flock", ["nfs_flock", "0", "flock_idata"], 0x7ffd4dae5880 /* 206 vars */execve("/opt/ltp/testcases/bin/nfs_flock", ["nfs_flock", "1", "flock_idata"], 0x7ffee8d52690 /* 206 vars */) = 0
brk(NULL)                               = 0x555ad67cc000
...
openat(AT_FDCWD, "flock_idata", O_RDWR) = 3
) = 3
fcntl(3, F_SETLKW, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=64, l_len=64}fcntl(3, F_SETLKW, {l_type=F_WRLCK, l_whence=SEEK_SET, l_start=0, l_len=64}) = -1 ENOLCK (No locks available)
newfstatat(1, "", {st_mode=S_IFCHR|0600, st_rdev=makedev(0x88, 0x1), ...}, AT_EMPTY_PATH) = 0
brk(NULL)                               = 0x55aefc2d5000
brk(0x55aefc2f6000)                     = 0x55aefc2f6000
write(1, "failed in writeb_lock, Errno = 3"..., 34failed in writeb_lock, Errno = 37
) = 34
exit_group(1)                           = ?
+++ exited with 1 +++
) = -1 ENOLCK (No locks available)
newfstatat(1, "", {st_mode=S_IFCHR|0600, st_rdev=makedev(0x88, 0x1), ...}, AT_EMPTY_PATH) = 0
brk(NULL)                               = 0x555ad67cc000
brk(0x555ad67ed000)                     = 0x555ad67ed000
write(1, "failed in writeb_lock, Errno = 3"..., 34failed in writeb_lock, Errno = 37
) = 34
exit_group(1)                           = ?
+++ exited with 1 +++
nfslock01 1 TFAIL: nfs_lock process failed
...

dmesg shows "lockd: cannot monitor 10.0.0.2"; the test fails with ENOLCK in
fcntl(fd, F_SETLKW, &lock), where lock.l_whence is SEEK_SET.
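
To try the same kind of lock request outside LTP, a minimal Python sketch could look like the following. The helper name try_write_lock is made up for illustration and is not part of LTP. fcntl.lockf() with LOCK_EX and without LOCK_NB issues a blocking fcntl() write-lock request on the given byte range. On a healthy filesystem it should return None; on an NFSv3 mount in the broken state I would expect 'ENOLCK', but that expectation is an assumption and was not verified with this script.

```python
import errno
import fcntl
import os


def try_write_lock(path, start=0, length=64):
    """Take a blocking write lock on a byte range of an existing file.

    Returns None on success, or the symbolic errno name (e.g. 'ENOLCK')
    if the lock request fails.
    """
    fd = os.open(path, os.O_RDWR)
    try:
        # LOCK_EX without LOCK_NB: blocking exclusive (write) lock on
        # bytes [start, start + length), offset relative to SEEK_SET.
        fcntl.lockf(fd, fcntl.LOCK_EX, length, start, os.SEEK_SET)
        # Release the range again so repeated calls do not interfere.
        fcntl.lockf(fd, fcntl.LOCK_UN, length, start, os.SEEK_SET)
        return None
    except OSError as e:
        return errno.errorcode.get(e.errno, str(e.errno))
    finally:
        os.close(fd)
```

Calling try_write_lock("flock_idata", 0, 64) and try_write_lock("flock_idata", 64, 64) from the mounted directory mirrors the two byte ranges seen in the strace output.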

Running other NFS versions (-v 4 or -v 4.1 or -v 4.2) works ok.
I tested only on TCP, because UDP was recently disabled by default.

I found this behaviour on various kernels (openSUSE 5.16, Debian: 5.16, 5.10,
SLES 5.14 and 5.3 - both heavily patched).

Is it a bug in lockd or in a test? Is there some limitation on v3?

Kind regards,
Petr

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: LTP nfslock01 test failing on NFS v3 (lockd: cannot monitor 10.0.0.2)
  2022-01-18 15:26 ` [LTP] " Petr Vorel
@ 2022-01-18 15:51   ` Nikita Yushchenko via ltp
  -1 siblings, 0 replies; 16+ messages in thread
From: Nikita Yushchenko @ 2022-01-18 15:51 UTC (permalink / raw)
  To: Petr Vorel, linux-nfs
  Cc: J. Bruce Fields, Chuck Lever, Trond Myklebust, Anna Schumaker,
	Neil Brown, Steve Dickson, ltp, kernel

18.01.2022 18:26, Petr Vorel wrote:
> Hi all,
> 
> this is a test failure posted by Nikita Yushchenko [1]. LTP NFS test nfslock01
> looks to be failing on NFS v3:
> 
> "not unsharing /var makes AF_UNIX socket for host's rpcbind to become available
> inside ltpns. Then, at nfs3 mount time, kernel creates an instance of lockd for
> ltpns, and ports for that instance leak to host's rpcbind and overwrite ports
> for lockd already active for root namespace. This breaks nfs3 file locking."

What exactly happens is:

Test runs 'mount' in non-root netns, trying to mount a directory from root netns of the same host via nfsv3

(Part of) call chain inside the kernel

nfs_try_get_tree()
  nfs3_create_server()
   nfs_create_server()
    nfs_init_server()
     nfs_start_lockd()
      nlmclnt_init()
       lockd_up()
        svc_bind()
         svc_rpcb_setup()
          rpcb_create_local()

... and at this point it tries AF_UNIX connection to /var/run/rpcbind.sock

AF_UNIX is not netns-aware, so it connects to the host's rpcbind and overwrites the ports
registered there by the lockd instance for the root namespace. From this point on, that lockd
instance is no longer reachable (it still listens, but nobody can learn its ports). Thus NFS
locks don't work.
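
A pathname AF_UNIX socket is found through the filesystem, so whether it is reachable depends on which paths are visible, not on the network namespace. A small userspace probe along these lines (a sketch only; the function name can_reach_unix_socket is hypothetical) can be run from inside the test namespace to see whether the host's rpcbind socket is still reachable there:

```python
import socket


def can_reach_unix_socket(path, timeout=1.0):
    """Return True if a stream connection to the AF_UNIX socket at `path` succeeds."""
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    s.settimeout(timeout)
    try:
        s.connect(path)
        return True
    except OSError:
        # Missing path, no listener, permission problem, timeout, ...
        return False
    finally:
        s.close()
```

For example, can_reach_unix_socket("/var/run/rpcbind.sock") returning True inside the non-root netns would be consistent with the behaviour described above. This only checks that something accepts connections on that path; it does not speak the rpcbind protocol.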

I'm not sure what is the correct behavior here.

Maybe rpcb_create_local() should detect that it is not in the root netns and, in that case, only
try an AF_INET connection to localhost.

Maybe it should not try AF_UNIX at all. Are there any realistic cases where rpcbind is accessible
via AF_UNIX only?
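
The two options can be written down as a toy decision function. This is a userspace model in Python, not kernel code; the policy names and the function rpcbind_transports are invented for illustration, and "current" assumes that AF_UNIX is tried first with AF_INET as the fallback.

```python
def rpcbind_transports(in_root_netns, policy="current"):
    """Return the ordered list of transports to try for the local rpcbind.

    policy:
      "current"        - assumed existing behaviour: AF_UNIX first, then AF_INET
      "root_only_unix" - AF_UNIX only in the root netns, AF_INET otherwise
      "never_unix"     - never use AF_UNIX
    """
    if policy == "current":
        return ["AF_UNIX", "AF_INET"]
    if policy == "root_only_unix":
        return ["AF_UNIX", "AF_INET"] if in_root_netns else ["AF_INET"]
    if policy == "never_unix":
        return ["AF_INET"]
    raise ValueError("unknown policy: %r" % (policy,))
```

With "root_only_unix", a lockd started for a non-root netns would only ever talk to an rpcbind listening on localhost inside that netns, which is the property the first option is after.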

Nikita

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: LTP nfslock01 test failing on NFS v3 (lockd: cannot monitor 10.0.0.2)
  2022-01-18 15:26 ` [LTP] " Petr Vorel
@ 2022-01-18 22:11   ` NeilBrown
  -1 siblings, 0 replies; 16+ messages in thread
From: NeilBrown @ 2022-01-18 22:11 UTC (permalink / raw)
  To: Petr Vorel
  Cc: linux-nfs, J. Bruce Fields, Chuck Lever, Trond Myklebust,
	Anna Schumaker, Steve Dickson, Nikita Yushchenko, ltp

On Wed, 19 Jan 2022, Petr Vorel wrote:
> Hi all,
> 
> this is a test failure posted by Nikita Yushchenko [1]. LTP NFS test nfslock01
> looks to be failing on NFS v3:
> 
> "not unsharing /var makes AF_UNIX socket for host's rpcbind to become available
> inside ltpns. Then, at nfs3 mount time, kernel creates an instance of lockd for
> ltpns, and ports for that instance leak to host's rpcbind and overwrite ports
> for lockd already active for root namespace. This breaks nfs3 file locking."

"not unsharing /var" ....  can this be fixed by simply unsharing /var?
Or is that not simple?

One could easily argue that RPCBIND_SOCK_PATHNAME in the kernel should be
changed to "/run/rpcbind.sock".  Does this test suite unshare /run?
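
Whether the two pathnames differ in practice depends on the system: where /var/run is a symlink to /run, both names resolve to the same socket. A quick userspace check could look like this (a sketch; the helper name same_socket_file is hypothetical):

```python
import os


def same_socket_file(path_a, path_b):
    """Return True if both paths exist and resolve to the same filesystem object."""
    try:
        return os.path.samefile(path_a, path_b)
    except OSError:
        # At least one of the paths does not exist or cannot be accessed.
        return False
```

same_socket_file("/var/run/rpcbind.sock", "/run/rpcbind.sock") run both on the host and inside the test namespace would show whether hiding one of the directories also hides the other.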

BTW, your email contains [1], [2], etc which suggests there are links
somewhere - but there aren't.

NeilBrown

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: LTP nfslock01 test failing on NFS v3 (lockd: cannot monitor 10.0.0.2)
  2022-01-18 15:51   ` [LTP] " Nikita Yushchenko via ltp
@ 2022-01-18 22:13     ` NeilBrown
  -1 siblings, 0 replies; 16+ messages in thread
From: NeilBrown @ 2022-01-18 22:13 UTC (permalink / raw)
  To: Nikita Yushchenko
  Cc: Petr Vorel, linux-nfs, J. Bruce Fields, Chuck Lever,
	Trond Myklebust, Anna Schumaker, Steve Dickson, ltp, kernel

On Wed, 19 Jan 2022, Nikita Yushchenko wrote:
> 18.01.2022 18:26, Petr Vorel wrote:
> > Hi all,
> > 
> > this is a test failure posted by Nikita Yushchenko [1]. LTP NFS test nfslock01
> > looks to be failing on NFS v3:
> > 
> > "not unsharing /var makes AF_UNIX socket for host's rpcbind to become available
> > inside ltpns. Then, at nfs3 mount time, kernel creates an instance of lockd for
> > ltpns, and ports for that instance leak to host's rpcbind and overwrite ports
> > for lockd already active for root namespace. This breaks nfs3 file locking."
> 
> What exactly happens is:
> 
> Test runs 'mount' in non-root netns, trying to mount a directory from root netns of the same host via nfsv3
> 
> (Part of) call chain inside the kernel
> 
> nfs_try_get_tree()
>   nfs3_create_server()
>    nfs_create_server()
>     nfs_init_server()
>      nfs_start_lockd()
>       nlmclnt_init()
>        lockd_up()
>         svc_bind()
>          svc_rpcb_setup()
>           rpcb_create_local()
> 
> ... and at this point it tries AF_UNIX connection to /var/run/rpcbind.sock
> 
> AF_UNIX is not netns-aware.
> So it connects to host's rpcbind.
> And overwrites ports registered in host's rpcbind by lockd instance for root namespace. Since this 
> point, lockd instance for root namespace becomes no longer accessible (it still listens but nobody can 
> learn the ports). Thus nfs locks don't work.
> 
> I'm not sure what is the correct behavior here.
> 
> Maybe rpcb_create_local() shall detect that it is not in root netns, and only try AF_INET connection to 
> localhost in that case.

That would be simple and might be sensible.  IF changing the AF_UNIX
path to "/run/rpcbind.sock" isn't sufficient, then testing for the
root_ns is probably the best second option.

Thanks,
NeilBrown

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: LTP nfslock01 test failing on NFS v3 (lockd: cannot monitor 10.0.0.2)
  2022-01-18 22:11   ` [LTP] " NeilBrown
@ 2022-01-19  5:17     ` Nikita Yushchenko via ltp
  -1 siblings, 0 replies; 16+ messages in thread
From: Nikita Yushchenko @ 2022-01-19  5:17 UTC (permalink / raw)
  To: NeilBrown, Petr Vorel
  Cc: linux-nfs, J. Bruce Fields, Chuck Lever, Trond Myklebust,
	Anna Schumaker, Steve Dickson, ltp, kernel

19.01.2022 01:11, NeilBrown wrote:
> On Wed, 19 Jan 2022, Petr Vorel wrote:
>> Hi all,
>>
>> this is a test failure posted by Nikita Yushchenko [1]. LTP NFS test nfslock01
>> looks to be failing on NFS v3:
>>
>> "not unsharing /var makes AF_UNIX socket for host's rpcbind to become available
>> inside ltpns. Then, at nfs3 mount time, kernel creates an instance of lockd for
>> ltpns, and ports for that instance leak to host's rpcbind and overwrite ports
>> for lockd already active for root namespace. This breaks nfs3 file locking."
> 
> "not unsharing /var" ....  can this be fixed by simply unsharing /var?
> Or is that not simple?

Big picture is - lockd tries to be per-netns, but lockd isn't standalone, it depends on rpcbind, and 
rpcbind isn't guaranteed to be per-netns.

One can argue that it is not kernel's job to provide per-netns rpcbind.

Still, the current situation is that, by default, doing an NFS mount from within netns B immediately
breaks the lockd serving NFS mounts exported from a different netns A. "By default" = "unless the mount
process executed in netns B is also in a different mount namespace in which RPCBIND_SOCK_PATHNAME does
not point to the AF_UNIX socket owned by the rpcbind serving netns A".

Although in LTP's 'nfslock01' test the broken locking is reproduced on the same mount that
triggered the breakage, the breakage is not limited to that mount. After that mount operation in netns
B, locking is broken for any client of NFS exports from netns A, including clients running on
different physical hosts.
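
A toy model of the registry shows why remote clients are affected too. This is not rpcbind's real implementation: the dictionary layout, the netns labels and the port numbers 40001/40002 are made up, and the model simply assumes that a later registration for the same (program, version, netid) key ends up replacing the earlier one. Only the NLM program number 100021 is a real value.

```python
NLM_PROGRAM = 100021  # RPC program number of the NFS lock manager (NLM)


def register(registry, netns, prog, vers, netid, port):
    """Record a service; a later call with the same key replaces the entry."""
    registry[(prog, vers, netid)] = {"port": port, "owner_netns": netns}


def lookup(registry, prog, vers, netid):
    """Return the advertised port for a service, or None if not registered."""
    entry = registry.get((prog, vers, netid))
    return None if entry is None else entry["port"]


def reachable(registry, listeners, target_netns, prog, vers, netid):
    """True if the advertised port is one the target netns really listens on."""
    port = lookup(registry, prog, vers, netid)
    return port is not None and port in listeners.get(target_netns, set())


def demo():
    registry = {}  # the single rpcbind shared through the AF_UNIX socket
    listeners = {"A": {40001}, "B": {40002}}  # hypothetical lockd ports
    register(registry, "A", NLM_PROGRAM, 4, "tcp", 40001)
    before = reachable(registry, listeners, "A", NLM_PROGRAM, 4, "tcp")
    # The mount in netns B starts a second lockd that registers with the same rpcbind.
    register(registry, "B", NLM_PROGRAM, 4, "tcp", 40002)
    after = reachable(registry, listeners, "A", NLM_PROGRAM, 4, "tcp")
    return before, after
```

In this model demo() returns (True, False): any client that asks the rpcbind of netns A for the lockd port is handed a port on which nothing listens in netns A, no matter which host the client runs on.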

I'd say that using AF_UNIX connection from lockd to rpcbind does not play well with per-netns lockd.

Using an AF_UNIX connection to rpcbind only for the lockd serving the root netns, and AF_INET
otherwise, looks saner.

> One could easily argue that RPCBIND_SOCK_PATHNAME in the kernel should be
> changed to "/run/rpcbind.sock".

It may be a better idea to make it configurable per-netns.

Nikita

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: LTP nfslock01 test failing on NFS v3 (lockd: cannot monitor 10.0.0.2)
  2022-01-19  5:17     ` [LTP] " Nikita Yushchenko via ltp
@ 2022-01-19  5:26       ` Nikita Yushchenko via ltp
  -1 siblings, 0 replies; 16+ messages in thread
From: Nikita Yushchenko @ 2022-01-19  5:26 UTC (permalink / raw)
  To: NeilBrown, Petr Vorel
  Cc: linux-nfs, J. Bruce Fields, Chuck Lever, Trond Myklebust,
	Anna Schumaker, Steve Dickson, ltp, kernel

> Big picture is - lockd tries to be per-netns, but lockd isn't standalone, it depends on rpcbind, and 
> rpcbind isn't guaranteed to be per-netns.
> 
> One can argue that it is not kernel's job to provide per-netns rpcbind.
> 
> Still, the current situation is - by default, doing an nfs mount from within netns B immediately breaks 
> lockd serving nfs mounts exported from different netns A. "By default" = "as long as nfsmount process 
> executed in netns B is also in a different mount namespace that has RPCBIND_SOCK_PATHNAME not pointing 
> to AF_UNIX socket instance owned by rpcbind serving netns A.
> 
> Although in LTP's 'nfslock01' test the "non working locking" is reproduced on the same mount that 
> triggered the breakage, the breakage is not limited to that mount. Since that mount operation in netns 
> B, any client of nfs exports from netns A will get locking broken - including clients running on 
> different physical hosts.
> 
> I'd say that using AF_UNIX connection from lockd to rpcbind does not play well with per-netns lockd.
> 
> Solution to use AF_UNIX connection to rpcbind only for lockd serving root netns, and using AF_INET 
> otherwise - looks more sane.

Btw, I'm not sure (I did not test) what happens if an NFS server is similarly started in netns B.
Will it hijack requests addressed to the NFS server running in netns A?

Nikita

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: LTP nfslock01 test failing on NFS v3 (lockd: cannot monitor 10.0.0.2)
  2022-01-19  5:26       ` [LTP] " Nikita Yushchenko via ltp
@ 2022-01-19  5:28         ` Nikita Yushchenko via ltp
  -1 siblings, 0 replies; 16+ messages in thread
From: Nikita Yushchenko @ 2022-01-19  5:28 UTC (permalink / raw)
  To: NeilBrown, Petr Vorel
  Cc: linux-nfs, J. Bruce Fields, Chuck Lever, Trond Myklebust,
	Anna Schumaker, Steve Dickson, ltp, kernel

19.01.2022 08:26, Nikita Yushchenko wrote:
>> Big picture is - lockd tries to be per-netns, but lockd isn't standalone, it depends on rpcbind, and 
>> rpcbind isn't guaranteed to be per-netns.
>>
>> One can argue that it is not kernel's job to provide per-netns rpcbind.
>>
>> Still, the current situation is - by default, doing an nfs mount from within netns B immediately 
>> breaks lockd serving nfs mounts exported from different netns A. "By default" = "as long as nfsmount 
>> process executed in netns B is also in a different mount namespace that has RPCBIND_SOCK_PATHNAME not 
>> pointing to AF_UNIX socket instance owned by rpcbind serving netns A.
>>
>> Although in LTP's 'nfslock01' test the "non working locking" is reproduced on the same mount that 
>> triggered the breakage, the breakage is not limited to that mount. Since that mount operation in netns 
>> B, any client of nfs exports from netns A will get locking broken - including clients running on 
>> different physical hosts.
>>
>> I'd say that using AF_UNIX connection from lockd to rpcbind does not play well with per-netns lockd.
>>
>> Solution to use AF_UNIX connection to rpcbind only for lockd serving root netns, and using AF_INET 
>> otherwise - looks more sane.
> 
> Btw, not sure (did not test) what will happen if nfs server will be similarly started in netns B.  Will 
> it hijack requests addressed to nfs server running in netns A?

No, it won't "hijack" them, because it will still listen inside netns B only.  But if the ports in
rpcbind get overwritten in a similar manner, the NFS server running in netns A will no longer be
reachable.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [LTP] LTP nfslock01 test failing on NFS v3 (lockd: cannot monitor 10.0.0.2)
@ 2022-01-19  5:28         ` Nikita Yushchenko via ltp
  0 siblings, 0 replies; 16+ messages in thread
From: Nikita Yushchenko via ltp @ 2022-01-19  5:28 UTC (permalink / raw)
  To: NeilBrown, Petr Vorel
  Cc: linux-nfs, Steve Dickson, Anna Schumaker, J. Bruce Fields,
	Chuck Lever, kernel, Trond Myklebust, ltp

19.01.2022 08:26, Nikita Yushchenko wrote:
>> Big picture is - lockd tries to be per-netns, but lockd isn't standalone, it depends on rpcbind, and 
>> rpcbind isn't guaranteed to be per-netns.
>>
>> One can argue that it is not kernel's job to provide per-netns rpcbind.
>>
>> Still, the current situation is - by default, doing an nfs mount from within netns B immediately 
>> breaks lockd serving nfs mounts exported from different netns A. "By default" = "as long as nfsmount 
>> process executed in netns B is also in a different mount namespace that has RPCBIND_SOCK_PATHNAME not 
>> pointing to AF_UNIX socket instance owned by rpcbind serving netns A.
>>
>> Although in LTP's 'nfslock01' test the "non working locking" is reproduced on the same mount that 
>> triggered the breakage, the breakage is not limited to that mount. Since that mount operation in netns 
>> B, any client of nfs exports from netns A will get locking broken - including clients running on 
>> different physical hosts.
>>
>> I'd say that using AF_UNIX connection from lockd to rpcbind does not play well with per-netns lockd.
>>
>> Solution to use AF_UNIX connection to rpcbind only for lockd serving root netns, and using AF_INET 
>> otherwise - looks more sane.
> 
> Btw, not sure (did not test) what will happen if nfs server will be similarly started in netns B.  Will 
> it hijack requests addressed to nfs server running in netns A?

No, it won't "hijack"...  because it will still listen inside netns B only.  But, if ports in rpcbind get
overwritten in a similar manner, the nfs server running in netns A will no longer be reachable.
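
The overwrite described above can be illustrated with a toy model. This is only a sketch, not kernel or rpcbind code: a plain dict stands in for rpcbind's registration table, the class and function names are hypothetical, and the port numbers are made up. It assumes lockd registers the NLM program (100021) for versions 1, 3 and 4, and that both lockd instances talk to the same rpcbind because they share its AF_UNIX socket.

```python
# Toy model only: a dict stands in for rpcbind's registration table.
# Keys are (program, version, transport); values are (port, registering netns).
NLM_PROGRAM = 100021  # RPC program number of the NFS lock manager (nlockmgr)


class ToyRpcbind:
    """Hypothetical stand-in for a single rpcbind instance."""

    def __init__(self):
        self.table = {}

    def set(self, program, version, transport, port, netns):
        # A later registration for the same key replaces the earlier one.
        self.table[(program, version, transport)] = (port, netns)

    def getport(self, program, version, transport):
        return self.table.get((program, version, transport))


def register_lockd(rpcbind, netns, port):
    """Register one lockd instance for every NLM version and transport."""
    for version in (1, 3, 4):
        for transport in ("tcp", "udp"):
            rpcbind.set(NLM_PROGRAM, version, transport, port, netns)


# One rpcbind, reachable from both namespaces through the shared socket.
host_rpcbind = ToyRpcbind()

register_lockd(host_rpcbind, "netns A", 40001)  # made-up port
before = host_rpcbind.getport(NLM_PROGRAM, 4, "tcp")

register_lockd(host_rpcbind, "netns B", 40002)  # made-up port
after = host_rpcbind.getport(NLM_PROGRAM, 4, "tcp")

print("before mount in netns B:", before)
print("after mount in netns B: ", after)
```

In this model, a client asking the shared rpcbind for the lock manager after the second registration gets the port of the netns B instance, which listens only inside netns B.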

-- 
Mailing list info: https://lists.linux.it/listinfo/ltp


* Re: LTP nfslock01 test failing on NFS v3 (lockd: cannot monitor 10.0.0.2)
  2022-01-18 22:11   ` [LTP] " NeilBrown
@ 2022-01-20 12:24     ` Petr Vorel
  -1 siblings, 0 replies; 16+ messages in thread
From: Petr Vorel @ 2022-01-20 12:24 UTC (permalink / raw)
  To: NeilBrown
  Cc: linux-nfs, J. Bruce Fields, Chuck Lever, Trond Myklebust,
	Anna Schumaker, Steve Dickson, Nikita Yushchenko, ltp

Hi Neil, all,

> On Wed, 19 Jan 2022, Petr Vorel wrote:
> > Hi all,

> > this is a test failure posted by Nikita Yushchenko [1]. LTP NFS test nfslock01
> > looks to be failing on NFS v3:

> > "not unsharing /var makes AF_UNIX socket for host's rpcbind to become available
> > inside ltpns. Then, at nfs3 mount time, kernel creates an instance of lockd for
> > ltpns, and ports for that instance leak to host's rpcbind and overwrite ports
> > for lockd already active for root namespace. This breaks nfs3 file locking."

> "not unsharing /var" ....  can this be fixed by simply unsharing /var?
> Or is that not simple?

> One could easily argue that RPCBIND_SOCK_PATHNAME in the kernel should be
> changed to "/run/rpcbind.sock".  Does this test suite unshare /run?
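
Whether unsharing helps depends on whether the rpcbind socket is visible at that path from the mount namespace in which the test runs. A minimal sketch for checking this follows; it assumes the kernel's current RPCBIND_SOCK_PATHNAME is "/var/run/rpcbind.sock", and the helper name is hypothetical. It only inspects the filesystem and does not talk to rpcbind.

```python
import os
import stat

# Assumption: the path the kernel currently uses for RPCBIND_SOCK_PATHNAME.
RPCBIND_SOCK_PATHNAME = "/var/run/rpcbind.sock"


def is_unix_socket(path):
    """Return True if path exists in this mount namespace and is a socket."""
    try:
        # os.stat follows symlinks, so /var/run -> /run is handled.
        return stat.S_ISSOCK(os.stat(path).st_mode)
    except OSError:
        # Missing path, permission problem, etc.: treat as not visible.
        return False


if __name__ == "__main__":
    visible = is_unix_socket(RPCBIND_SOCK_PATHNAME)
    print(f"{RPCBIND_SOCK_PATHNAME} visible as a socket: {visible}")
```

Run from inside the test's namespaces, a True result would mean the host's rpcbind socket is still reachable there.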

> BTW, your email contains [1], [2], etc which suggests there are links
> somewhere - but there aren't.
I'm sorry, here they are:

[1] https://lore.kernel.org/ltp/590378ee-71af-deb6-6c03-1d2af459ed63@virtuozzo.com/
(the report)

[2] https://lore.kernel.org/ltp/20220112161942.4065665-1-nikita.yushchenko@virtuozzo.com/
(Nikita's not yet merged LTP patch)

[3] https://github.com/pevik/ltp/commits/nfs_flock/fail-on-error
(my LTP fork with Nikita's patch [2] + strace debugging; the report was produced
with this code)

Kind regards,
Petr

> NeilBrown


