* Regular deadlocks
From: Cyril B. @ 2016-06-25 23:41 UTC
  To: autofs

Hello,

I have occasional deadlocks using autofs 4.1.2 (but it happened on 4.1.1 
as well) on my servers, typically about once every 2 or 3 days.

I already posted to this mailing list back in 2015 about a bug that also 
triggered deadlocks (which was fixed), so I'll copy/paste parts of my 
original message as my config hasn't changed.

/etc/auto.master:
--
/nfs program:/etc/auto.nfs
/home program:/etc/auto.home
--

/etc/auto.nfs basically returns:

-fstype=nfs4,noatime,nosuid,_netdev,soft,intr,timeo=1000 $1:/

/etc/auto.home:
--
#!/bin/sh

if [ ! -h /var/home/$1 ]
then
    exit 1
fi

echo -fstype=bind :$(readlink --no-newline /var/home/$1)
--

So for instance, /var/home/foo would be a symlink pointing to
/nfs/serverX/foo.
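
In case it helps anyone try the map by hand, here is a minimal sketch of the same lookup logic with the key quoted and wrapped in a function. HOME_LINKS and lookup_home are hypothetical names added for illustration; the script above reads /var/home directly.

```sh
#!/bin/sh
# Sketch of the /etc/auto.home lookup; $1 is the key (the user name).
# HOME_LINKS is a hypothetical override so the logic can be tried outside /var/home.
HOME_LINKS="${HOME_LINKS:-/var/home}"

lookup_home() {
    key="$1"
    link="$HOME_LINKS/$key"
    # No symlink for this key: return non-zero so the lookup is reported as failed.
    [ -h "$link" ] || return 1
    # Emit a bind-mount map entry pointing at the symlink target.
    printf '%s\n' "-fstype=bind :$(readlink "$link")"
}

# A real map script would finish with:
#   lookup_home "$1"
```

With /var/home/foo pointing to /nfs/serverX/foo, `lookup_home foo` prints `-fstype=bind :/nfs/serverX/foo`, the same entry the original script emits.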

Kernel: Linux 4.4.7.

My servers have cronjobs that trigger /home/userX mounts at basically the 
same time (when the jobs start). I have 2 servers with the same config, 
but one of them has MANY more users/cronjobs and, oddly enough, the 
deadlock happens much less often on that one.
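
To make the access pattern concrete, here is a rough sketch of what the cron jobs amount to from the automounter's point of view: several paths touched at nearly the same moment. The trigger_all function is hypothetical and only approximates the real jobs; it is an assumption, not a confirmed reproducer.

```sh
# Access every given path at once, roughly as simultaneous cron jobs would.
# On an autofs mount point each access triggers a lookup; elsewhere it is just a stat.
trigger_all() {
    for path in "$@"; do
        # Background each access so the lookups overlap.
        ls -d "$path" >/dev/null 2>&1 &
    done
    # Wait for every background access to return (this blocks if a mount hangs).
    wait
}

# Example:
#   trigger_all /home/user1 /home/user2 /home/user3
```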

Anyway, here's the 'ps faux' output taken a few hours after the deadlock started:

root        3437  0.0  0.0 217644  5428 ?        Ssl  Jun24   0:14 /usr/sbin/automount --pid-file /var/run/autofs.pid
root     1269869  0.0  0.0 209420  1904 ?        S    18:00   0:00  \_ /usr/sbin/automount --pid-file /var/run/autofs.pid
root     1269915  0.0  0.0 209420  1904 ?        S    18:00   0:00  \_ /usr/sbin/automount --pid-file /var/run/autofs.pid
root     1269920  0.0  0.0      0     0 ?        Z    18:00   0:00  \_ [auto.nfs] <defunct>
root     1269923  0.0  0.0 209420  1904 ?        S    18:00   0:00  \_ /usr/sbin/automount --pid-file /var/run/autofs.pid
root     1269931  0.0  0.0 209420  1904 ?        S    18:00   0:00  \_ /usr/sbin/automount --pid-file /var/run/autofs.pid
root     1269932  0.0  0.0 209420  1904 ?        S    18:00   0:00  \_ /usr/sbin/automount --pid-file /var/run/autofs.pid
root     1269933  0.0  0.0 209420  1904 ?        S    18:00   0:00  \_ /usr/sbin/automount --pid-file /var/run/autofs.pid
root     1269934  0.0  0.0 209420  1904 ?        S    18:00   0:00  \_ /usr/sbin/automount --pid-file /var/run/autofs.pid
root     1269937  0.0  0.0 209420  1904 ?        S    18:00   0:00  \_ /usr/sbin/automount --pid-file /var/run/autofs.pid
root     1271340  0.0  0.0 209420  1904 ?        S    18:01   0:00  \_ /usr/sbin/automount --pid-file /var/run/autofs.pid
root     1271943  0.0  0.0 209420  1904 ?        S    18:03   0:00  \_ /usr/sbin/automount --pid-file /var/run/autofs.pid
root     1272408  0.0  0.0 209420  1904 ?        S    18:05   0:00  \_ /usr/sbin/automount --pid-file /var/run/autofs.pid
root     1273323  0.0  0.0 210448  1936 ?        S    18:08   0:00  \_ /usr/sbin/automount --pid-file /var/run/autofs.pid
root     1274002  0.0  0.0 211476  1968 ?        S    18:10   0:00  \_ /usr/sbin/automount --pid-file /var/run/autofs.pid
root     1276131  0.0  0.0 213532  2032 ?        S    18:20   0:00  \_ /usr/sbin/automount --pid-file /var/run/autofs.pid
root     1278540  0.0  0.0 213532  2032 ?        S    18:30   0:00  \_ /usr/sbin/automount --pid-file /var/run/autofs.pid
root     1302389  0.0  0.0 214560  2064 ?        S    20:00   0:00  \_ /usr/sbin/automount --pid-file /var/run/autofs.pid
root     1317670  0.0  0.0 215588  2096 ?        S    20:52   0:00  \_ /usr/sbin/automount --pid-file /var/run/autofs.pid
root     1319408  0.0  0.0 216616  2128 ?        S    21:00   0:00  \_ /usr/sbin/automount --pid-file /var/run/autofs.pid
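
The one entry that stands out in that listing is the defunct auto.nfs child. Here is a small sketch for picking such entries out of ps output, assuming the procps column layout `pid,ppid,stat,comm`; the zombies function name is hypothetical.

```sh
# Print "PID PPID COMMAND" for each process in state Z (zombie/defunct).
# Expects rows laid out like: ps -eo pid,ppid,stat,comm (header on line 1).
zombies() {
    awk 'NR > 1 && $3 ~ /^Z/ { print $1, $2, $4 }'
}

# Typical use on a live system:
#   ps -eo pid,ppid,stat,comm | zombies
```

The PPID column shows which automount child is still holding the unreaped script, which the flattened tree above does not make obvious.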


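For anyone who wants to collect the same kind of dump non-interactively, a minimal sketch follows. gdb_bt_all is a hypothetical wrapper name; -p, -batch and -ex are standard gdb options, and batch mode already disables the pager, so the explicit pagination setting is only there for clarity.

```sh
# Dump backtraces of all threads of a running process without prompts.
# $1: PID to attach to (needs permission to ptrace the target).
gdb_bt_all() {
    gdb -p "$1" -batch \
        -ex 'set pagination off' \
        -ex 'thread apply all bt'
}

# Example:
#   gdb_bt_all 3437 > automount-bt.txt 2>&1
```
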
Backtraces of the main process (3437):

Thread 24 (Thread 0x404a2950 (LWP 3438)):
#0  0x00007f438c7e8fad in pthread_cond_timedwait@@GLIBC_2.3.2 () from 
/lib/libpthread.so.0
#1  0x000055e6c5bd28e6 in alarm_handler (arg=<optimized out>) at alarm.c:191
#2  0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#3  0x00007f438b9ad64d in clone () from /lib/libc.so.6
#4  0x0000000000000000 in ?? ()

Thread 23 (Thread 0x4182f950 (LWP 3439)):
#0  0x00007f438c7e8d29 in pthread_cond_wait@@GLIBC_2.3.2 () from 
/lib/libpthread.so.0
#1  0x000055e6c5bc7205 in st_queue_handler (arg=<optimized out>) at 
state.c:1101
#2  0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#3  0x00007f438b9ad64d in clone () from /lib/libc.so.6
#4  0x0000000000000000 in ?? ()

Thread 22 (Thread 0x42030950 (LWP 3442)):
#0  0x00007f438b9a4bd6 in poll () from /lib/libc.so.6
#1  0x000055e6c5bbabce in get_pkt (pkt=<optimized out>, ap=<optimized 
out>) at automount.c:1009
#2  handle_packet (ap=<optimized out>) at automount.c:1160
#3  handle_mounts (arg=<optimized out>) at automount.c:1834
#4  0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#5  0x00007f438b9ad64d in clone () from /lib/libc.so.6
#6  0x0000000000000000 in ?? ()

Thread 21 (Thread 0x417f3950 (LWP 3445)):
#0  0x00007f438b9a4bd6 in poll () from /lib/libc.so.6
#1  0x000055e6c5bbabce in get_pkt (pkt=<optimized out>, ap=<optimized 
out>) at automount.c:1009
#2  handle_packet (ap=<optimized out>) at automount.c:1160
#3  handle_mounts (arg=<optimized out>) at automount.c:1834
#4  0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#5  0x00007f438b9ad64d in clone () from /lib/libc.so.6
#6  0x0000000000000000 in ?? ()

Thread 20 (Thread 0x409a0950 (LWP 1269762)):
#0  0x00007f438b9a4bd6 in poll () from /lib/libc.so.6
#1  0x000055e6c5bc1474 in timed_read (time=<optimized out>, 
len=<optimized out>, buf=<optimized out>, pipe=<optimized out>) at 
spawn.c:107
#2  do_spawn (logopt=0, wait=4294967295, options=<optimized out>, 
prog=<optimized out>, argv=<optimized out>) at spawn.c:272
#3  0x000055e6c5bc2050 in spawn_bind_mount (logopt=0) at spawn.c:538
#4  0x00007f438a9a88f8 in mount_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x4099d9b0 "user1", name_len=12, 
what=0x4099d980 "/nfs/http12/user1", fstype=0x4099d9d0 "bind",
     options=0x7f438a9a9716 "defaults", context=0x55e6c642cb00) at 
mount_bind.c:170
#5  0x000055e6c5bc34ad in do_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x4099d9b0 "user1", name_len=12, 
what=0x4099d980 "/nfs/http12/user1", fstype=0x4099d9d0 "bind",
     options=0x4099d9f0 "") at mount.c:78
#6  0x00007f438ade8a43 in sun_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x4099e000 "user1", namelen=12, 
loc=0x55e6c642cbb0 ":/nfs/http12/user1", loclen=25,
     options=0x4099d9f0 "", ctxt=0x55e6c63f7770) at parse_sun.c:712
#7  0x00007f438adeacc2 in parse_mount (ap=0x55e6c63f9030, 
name=0x4099e000 "user1", name_len=12, mapent=<optimized out>, 
context=<optimized out>) at parse_sun.c:1673
#8  0x00007f438b018be1 in lookup_mount (ap=0x55e6c63f9030, 
name=0x4099e000 "user1", name_len=12, context=0x55e6c63f76f0) at 
lookup_program.c:691
#9  0x000055e6c5bc4072 in do_lookup_mount (ap=0x55e6c63f9030, 
map=0x55e6c63f9150, name=0x4099e000 "user1", name_len=12) at lookup.c:766
#10 0x000055e6c5bc4642 in do_name_lookup_mount (name_len=<optimized 
out>, name=<optimized out>, map=<optimized out>, ap=<optimized out>) at 
lookup.c:945
#11 lookup_nss_mount (ap=0x55e6c63f9030, source=0x0, name=0x4099e000 
"user1", name_len=12) at lookup.c:1110
#12 0x000055e6c5bbb4d6 in do_mount_indirect (arg=<optimized out>) at 
indirect.c:772
#13 0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#14 0x00007f438b9ad64d in clone () from /lib/libc.so.6
#15 0x0000000000000000 in ?? ()

Thread 19 (Thread 0x42232950 (LWP 1269775)):
#0  0x00007f438b9a4bd6 in poll () from /lib/libc.so.6
#1  0x000055e6c5bc1474 in timed_read (time=<optimized out>, 
len=<optimized out>, buf=<optimized out>, pipe=<optimized out>) at 
spawn.c:107
#2  do_spawn (logopt=0, wait=4294967295, options=<optimized out>, 
prog=<optimized out>, argv=<optimized out>) at spawn.c:272
#3  0x000055e6c5bc2050 in spawn_bind_mount (logopt=0) at spawn.c:538
#4  0x00007f438a9a88f8 in mount_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x4222f9b0 "user2", name_len=8, 
what=0x4222f980 "/nfs/http12/user2", fstype=0x4222f9d0 "bind",
     options=0x7f438a9a9716 "defaults", context=0x55e6c64311d0) at 
mount_bind.c:170
#5  0x000055e6c5bc34ad in do_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x4222f9b0 "user2", name_len=8, 
what=0x4222f980 "/nfs/http12/user2", fstype=0x4222f9d0 "bind",
     options=0x4222f9f0 "") at mount.c:78
#6  0x00007f438ade8a43 in sun_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x42230000 "user2", namelen=8, 
loc=0x55e6c6431810 ":/nfs/http12/user2", loclen=21, options=0x4222f9f0 "",
     ctxt=0x55e6c63f7770) at parse_sun.c:712
#7  0x00007f438adeacc2 in parse_mount (ap=0x55e6c63f9030, 
name=0x42230000 "user2", name_len=8, mapent=<optimized out>, 
context=<optimized out>) at parse_sun.c:1673
#8  0x00007f438b018be1 in lookup_mount (ap=0x55e6c63f9030, 
name=0x42230000 "user2", name_len=8, context=0x55e6c63f76f0) at 
lookup_program.c:691
#9  0x000055e6c5bc4072 in do_lookup_mount (ap=0x55e6c63f9030, 
map=0x55e6c63f9150, name=0x42230000 "user2", name_len=8) at lookup.c:766
#10 0x000055e6c5bc4642 in do_name_lookup_mount (name_len=<optimized 
out>, name=<optimized out>, map=<optimized out>, ap=<optimized out>) at 
lookup.c:945
#11 lookup_nss_mount (ap=0x55e6c63f9030, source=0x0, name=0x42230000 
"user2", name_len=8) at lookup.c:1110
#12 0x000055e6c5bbb4d6 in do_mount_indirect (arg=<optimized out>) at 
indirect.c:772
#13 0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#14 0x00007f438b9ad64d in clone () from /lib/libc.so.6
#15 0x0000000000000000 in ?? ()

Thread 18 (Thread 0x42434950 (LWP 1269778)):
#0  0x00007f438b9a4bd6 in poll () from /lib/libc.so.6
#1  0x000055e6c5bc1474 in timed_read (time=<optimized out>, 
len=<optimized out>, buf=<optimized out>, pipe=<optimized out>) at 
spawn.c:107
#2  do_spawn (logopt=0, wait=4294967295, options=<optimized out>, 
prog=<optimized out>, argv=<optimized out>) at spawn.c:272
#3  0x000055e6c5bc2050 in spawn_bind_mount (logopt=0) at spawn.c:538
#4  0x00007f438a9a88f8 in mount_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x424319b0 "user3", name_len=9, 
what=0x42431980 "/nfs/http12/user3", fstype=0x424319d0 "bind",
     options=0x7f438a9a9716 "defaults", context=0xffffffff00000024) at 
mount_bind.c:170
#5  0x000055e6c5bc34ad in do_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x424319b0 "user3", name_len=9, 
what=0x42431980 "/nfs/http12/user3", fstype=0x424319d0 "bind",
     options=0x424319f0 "") at mount.c:78
#6  0x00007f438ade8a43 in sun_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x42432000 "user3", namelen=9, 
loc=0x55e6c6431f20 ":/nfs/http12/user3", loclen=22, options=0x424319f0 "",
     ctxt=0x55e6c63f7770) at parse_sun.c:712
#7  0x00007f438adeacc2 in parse_mount (ap=0x55e6c63f9030, 
name=0x42432000 "user3", name_len=9, mapent=<optimized out>, 
context=<optimized out>) at parse_sun.c:1673
#8  0x00007f438b018be1 in lookup_mount (ap=0x55e6c63f9030, 
name=0x42432000 "user3", name_len=9, context=0x55e6c63f76f0) at 
lookup_program.c:691
#9  0x000055e6c5bc4072 in do_lookup_mount (ap=0x55e6c63f9030, 
map=0x55e6c63f9150, name=0x42432000 "user3", name_len=9) at lookup.c:766
#10 0x000055e6c5bc4642 in do_name_lookup_mount (name_len=<optimized 
out>, name=<optimized out>, map=<optimized out>, ap=<optimized out>) at 
lookup.c:945
#11 lookup_nss_mount (ap=0x55e6c63f9030, source=0x0, name=0x42432000 
"user3", name_len=9) at lookup.c:1110
#12 0x000055e6c5bbb4d6 in do_mount_indirect (arg=<optimized out>) at 
indirect.c:772
#13 0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#14 0x00007f438b9ad64d in clone () from /lib/libc.so.6
#15 0x0000000000000000 in ?? ()

Thread 17 (Thread 0x42636950 (LWP 1269781)):
#0  0x00007f438b9a4bd6 in poll () from /lib/libc.so.6
#1  0x000055e6c5bc1474 in timed_read (time=<optimized out>, 
len=<optimized out>, buf=<optimized out>, pipe=<optimized out>) at 
spawn.c:107
#2  do_spawn (logopt=0, wait=4294967295, options=<optimized out>, 
prog=<optimized out>, argv=<optimized out>) at spawn.c:272
#3  0x000055e6c5bc2050 in spawn_bind_mount (logopt=0) at spawn.c:538
#4  0x00007f438a9a88f8 in mount_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x426339b0 "user4", name_len=10, 
what=0x42633980 "/nfs/http12/user4", fstype=0x426339d0 "bind",
     options=0x7f438a9a9716 "defaults", context=0x55e6c64316d0) at 
mount_bind.c:170
#5  0x000055e6c5bc34ad in do_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x426339b0 "user4", name_len=10, 
what=0x42633980 "/nfs/http12/user4", fstype=0x426339d0 "bind",
     options=0x426339f0 "") at mount.c:78
#6  0x00007f438ade8a43 in sun_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x42634000 "user4", namelen=10, 
loc=0x55e6c6431a70 ":/nfs/http12/user4", loclen=23, options=0x426339f0 "",
     ctxt=0x55e6c63f7770) at parse_sun.c:712
#7  0x00007f438adeacc2 in parse_mount (ap=0x55e6c63f9030, 
name=0x42634000 "user4", name_len=10, mapent=<optimized out>, 
context=<optimized out>) at parse_sun.c:1673
#8  0x00007f438b018be1 in lookup_mount (ap=0x55e6c63f9030, 
name=0x42634000 "user4", name_len=10, context=0x55e6c63f76f0) at 
lookup_program.c:691
#9  0x000055e6c5bc4072 in do_lookup_mount (ap=0x55e6c63f9030, 
map=0x55e6c63f9150, name=0x42634000 "user4", name_len=10) at lookup.c:766
#10 0x000055e6c5bc4642 in do_name_lookup_mount (name_len=<optimized 
out>, name=<optimized out>, map=<optimized out>, ap=<optimized out>) at 
lookup.c:945
#11 lookup_nss_mount (ap=0x55e6c63f9030, source=0x0, name=0x42634000 
"user4", name_len=10) at lookup.c:1110
#12 0x000055e6c5bbb4d6 in do_mount_indirect (arg=<optimized out>) at 
indirect.c:772
#13 0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#14 0x00007f438b9ad64d in clone () from /lib/libc.so.6
#15 0x0000000000000000 in ?? ()

Thread 16 (Thread 0x40be5950 (LWP 1269783)):
#0  0x00007f438b9a4bd6 in poll () from /lib/libc.so.6
#1  0x000055e6c5bc1474 in timed_read (time=<optimized out>, 
len=<optimized out>, buf=<optimized out>, pipe=<optimized out>) at 
spawn.c:107
#2  do_spawn (logopt=0, wait=4294967295, options=<optimized out>, 
prog=<optimized out>, argv=<optimized out>) at spawn.c:272
#3  0x000055e6c5bc2050 in spawn_bind_mount (logopt=0) at spawn.c:538
#4  0x00007f438a9a88f8 in mount_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x40be29b0 "user5", name_len=7, 
what=0x40be2980 "/nfs/http12/user5", fstype=0x40be29d0 "bind",
     options=0x7f438a9a9716 "defaults", context=0x55e6c6431a10) at 
mount_bind.c:170
#5  0x000055e6c5bc34ad in do_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x40be29b0 "user5", name_len=7, 
what=0x40be2980 "/nfs/http12/user5", fstype=0x40be29d0 "bind",
     options=0x40be29f0 "") at mount.c:78
#6  0x00007f438ade8a43 in sun_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x40be3000 "user5", namelen=7, 
loc=0x55e6c6431c90 ":/nfs/http12/user5", loclen=20, options=0x40be29f0 "",
     ctxt=0x55e6c63f7770) at parse_sun.c:712
#7  0x00007f438adeacc2 in parse_mount (ap=0x55e6c63f9030, 
name=0x40be3000 "user5", name_len=7, mapent=<optimized out>, 
context=<optimized out>) at parse_sun.c:1673
#8  0x00007f438b018be1 in lookup_mount (ap=0x55e6c63f9030, 
name=0x40be3000 "user5", name_len=7, context=0x55e6c63f76f0) at 
lookup_program.c:691
#9  0x000055e6c5bc4072 in do_lookup_mount (ap=0x55e6c63f9030, 
map=0x55e6c63f9150, name=0x40be3000 "user5", name_len=7) at lookup.c:766
#10 0x000055e6c5bc4642 in do_name_lookup_mount (name_len=<optimized 
out>, name=<optimized out>, map=<optimized out>, ap=<optimized out>) at 
lookup.c:945
#11 lookup_nss_mount (ap=0x55e6c63f9030, source=0x0, name=0x40be3000 
"user5", name_len=7) at lookup.c:1110
#12 0x000055e6c5bbb4d6 in do_mount_indirect (arg=<optimized out>) at 
indirect.c:772
#13 0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#14 0x00007f438b9ad64d in clone () from /lib/libc.so.6
#15 0x0000000000000000 in ?? ()

Thread 15 (Thread 0x42a3a950 (LWP 1269785)):
#0  0x00007f438b9a4bd6 in poll () from /lib/libc.so.6
#1  0x000055e6c5bc1474 in timed_read (time=<optimized out>, 
len=<optimized out>, buf=<optimized out>, pipe=<optimized out>) at 
spawn.c:107
#2  do_spawn (logopt=0, wait=4294967295, options=<optimized out>, 
prog=<optimized out>, argv=<optimized out>) at spawn.c:272
#3  0x000055e6c5bc2050 in spawn_bind_mount (logopt=0) at spawn.c:538
#4  0x00007f438a9a88f8 in mount_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x42a379b0 "user6", name_len=12, 
what=0x42a37980 "/nfs/http12/user6", fstype=0x42a379d0 "bind",
     options=0x7f438a9a9716 "defaults", context=0x0) at mount_bind.c:170
#5  0x000055e6c5bc34ad in do_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x42a379b0 "user6", name_len=12, 
what=0x42a37980 "/nfs/http12/user6", fstype=0x42a379d0 "bind",
     options=0x42a379f0 "") at mount.c:78
#6  0x00007f438ade8a43 in sun_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x42a38000 "user6", namelen=12, 
loc=0x55e6c6432660 ":/nfs/http12/user6", loclen=25,
     options=0x42a379f0 "", ctxt=0x55e6c63f7770) at parse_sun.c:712
#7  0x00007f438adeacc2 in parse_mount (ap=0x55e6c63f9030, 
name=0x42a38000 "user6", name_len=12, mapent=<optimized out>, 
context=<optimized out>) at parse_sun.c:1673
#8  0x00007f438b018be1 in lookup_mount (ap=0x55e6c63f9030, 
name=0x42a38000 "user6", name_len=12, context=0x55e6c63f76f0) at 
lookup_program.c:691
#9  0x000055e6c5bc4072 in do_lookup_mount (ap=0x55e6c63f9030, 
map=0x55e6c63f9150, name=0x42a38000 "user6", name_len=12) at lookup.c:766
#10 0x000055e6c5bc4642 in do_name_lookup_mount (name_len=<optimized 
out>, name=<optimized out>, map=<optimized out>, ap=<optimized out>) at 
lookup.c:945
#11 lookup_nss_mount (ap=0x55e6c63f9030, source=0x0, name=0x42a38000 
"user6", name_len=12) at lookup.c:1110
#12 0x000055e6c5bbb4d6 in do_mount_indirect (arg=<optimized out>) at 
indirect.c:772
#13 0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#14 0x00007f438b9ad64d in clone () from /lib/libc.so.6
#15 0x0000000000000000 in ?? ()

Thread 14 (Thread 0x42535950 (LWP 1269810)):
#0  0x00007f438b9a4bd6 in poll () from /lib/libc.so.6
#1  0x000055e6c5bc1474 in timed_read (time=<optimized out>, 
len=<optimized out>, buf=<optimized out>, pipe=<optimized out>) at 
spawn.c:107
#2  do_spawn (logopt=0, wait=4294967295, options=<optimized out>, 
prog=<optimized out>, argv=<optimized out>) at spawn.c:272
#3  0x000055e6c5bc2050 in spawn_bind_mount (logopt=0) at spawn.c:538
#4  0x00007f438a9a88f8 in mount_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x425329c0 "user7", name_len=6, 
what=0x42532990 "/nfs/http12/user7", fstype=0x425329e0 "bind",
     options=0x7f438a9a9716 "defaults", context=0x55e6c6428ff0) at 
mount_bind.c:170
#5  0x000055e6c5bc34ad in do_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x425329c0 "user7", name_len=6, 
what=0x42532990 "/nfs/http12/user7", fstype=0x425329e0 "bind",
     options=0x42532a00 "") at mount.c:78
#6  0x00007f438ade8a43 in sun_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x42533000 "user7", namelen=6, 
loc=0x55e6c64322d0 ":/nfs/http12/user7", loclen=19, options=0x42532a00 "",
     ctxt=0x55e6c63f7770) at parse_sun.c:712
#7  0x00007f438adeacc2 in parse_mount (ap=0x55e6c63f9030, 
name=0x42533000 "user7", name_len=6, mapent=<optimized out>, 
context=<optimized out>) at parse_sun.c:1673
#8  0x00007f438b018be1 in lookup_mount (ap=0x55e6c63f9030, 
name=0x42533000 "user7", name_len=6, context=0x55e6c63f76f0) at 
lookup_program.c:691
#9  0x000055e6c5bc4072 in do_lookup_mount (ap=0x55e6c63f9030, 
map=0x55e6c63f9150, name=0x42533000 "user7", name_len=6) at lookup.c:766
#10 0x000055e6c5bc4642 in do_name_lookup_mount (name_len=<optimized 
out>, name=<optimized out>, map=<optimized out>, ap=<optimized out>) at 
lookup.c:945
#11 lookup_nss_mount (ap=0x55e6c63f9030, source=0x0, name=0x42533000 
"user7", name_len=6) at lookup.c:1110
#12 0x000055e6c5bbb4d6 in do_mount_indirect (arg=<optimized out>) at 
indirect.c:772
#13 0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#14 0x00007f438b9ad64d in clone () from /lib/libc.so.6
#15 0x0000000000000000 in ?? ()

Thread 13 (Thread 0x42939950 (LWP 1269829)):
#0  0x00007f438b9a4bd6 in poll () from /lib/libc.so.6
#1  0x000055e6c5bc1474 in timed_read (time=<optimized out>, 
len=<optimized out>, buf=<optimized out>, pipe=<optimized out>) at 
spawn.c:107
#2  do_spawn (logopt=0, wait=4294967295, options=<optimized out>, 
prog=<optimized out>, argv=<optimized out>) at spawn.c:272
#3  0x000055e6c5bc2050 in spawn_bind_mount (logopt=0) at spawn.c:538
#4  0x00007f438a9a88f8 in mount_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x429369c0 "user8", name_len=3, 
what=0x429369a0 "/nfs/http12/user8", fstype=0x429369e0 "bind",
     options=0x7f438a9a9716 "defaults", context=0x20) at mount_bind.c:170
#5  0x000055e6c5bc34ad in do_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x429369c0 "user8", name_len=3, 
what=0x429369a0 "/nfs/http12/user8", fstype=0x429369e0 "bind", 
     options=0x42936a00 "") at mount.c:78
#6  0x00007f438ade8a43 in sun_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x42937000 "user8", namelen=3, 
loc=0x55e6c6432100 ":/nfs/http12/user8", loclen=16, options=0x42936a00 "",
     ctxt=0x55e6c63f7770) at parse_sun.c:712
#7  0x00007f438adeacc2 in parse_mount (ap=0x55e6c63f9030, 
name=0x42937000 "user8", name_len=3, mapent=<optimized out>, 
context=<optimized out>) at parse_sun.c:1673
#8  0x00007f438b018be1 in lookup_mount (ap=0x55e6c63f9030, 
name=0x42937000 "user8", name_len=3, context=0x55e6c63f76f0) at 
lookup_program.c:691
#9  0x000055e6c5bc4072 in do_lookup_mount (ap=0x55e6c63f9030, 
map=0x55e6c63f9150, name=0x42937000 "user8", name_len=3) at lookup.c:766
#10 0x000055e6c5bc4642 in do_name_lookup_mount (name_len=<optimized 
out>, name=<optimized out>, map=<optimized out>, ap=<optimized out>) at 
lookup.c:945
#11 lookup_nss_mount (ap=0x55e6c63f9030, source=0x0, name=0x42937000 
"user8", name_len=3) at lookup.c:1110
#12 0x000055e6c5bbb4d6 in do_mount_indirect (arg=<optimized out>) at 
indirect.c:772
#13 0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#14 0x00007f438b9ad64d in clone () from /lib/libc.so.6
#15 0x0000000000000000 in ?? ()

Thread 12 (Thread 0x42838950 (LWP 1269881)):
#0  0x00007f438c7eb7db in read () from /lib/libpthread.so.0
#1  0x00007f438b017f2d in lookup_one (ap=0x55e6c63f8c30, 
name=0x55e6c6432320 "http12", name_len=<optimized out>, ctxt=<optimized 
out>) at lookup_program.c:291
#2  0x00007f438b01898d in match_key (ctxt=<optimized out>, 
mapent=<optimized out>, name_len=<optimized out>, name=<optimized out>, 
source=<optimized out>, ap=<optimized out>) at lookup_program.c:493
#3  lookup_mount (ap=0x55e6c63f8c30, name=0x42836000 "http12", 
name_len=6, context=0x55e6c63f87c0) at lookup_program.c:682
#4  0x000055e6c5bc4072 in do_lookup_mount (ap=0x55e6c63f8c30, 
map=0x55e6c63f8d50, name=0x42836000 "http12", name_len=6) at lookup.c:766
#5  0x000055e6c5bc4642 in do_name_lookup_mount (name_len=<optimized 
out>, name=<optimized out>, map=<optimized out>, ap=<optimized out>) at 
lookup.c:945
#6  lookup_nss_mount (ap=0x55e6c63f8c30, source=0x0, name=0x42836000 
"http12", name_len=6) at lookup.c:1110
#7  0x000055e6c5bbb4d6 in do_mount_indirect (arg=<optimized out>) at 
indirect.c:772
#8  0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#9  0x00007f438b9ad64d in clone () from /lib/libc.so.6
#10 0x0000000000000000 in ?? ()

Thread 11 (Thread 0x42333950 (LWP 1271292)):
#0  0x00007f438b9a4bd6 in poll () from /lib/libc.so.6
#1  0x000055e6c5bc1474 in timed_read (time=<optimized out>, 
len=<optimized out>, buf=<optimized out>, pipe=<optimized out>) at 
spawn.c:107
#2  do_spawn (logopt=0, wait=4294967295, options=<optimized out>, 
prog=<optimized out>, argv=<optimized out>) at spawn.c:272
#3  0x000055e6c5bc2050 in spawn_bind_mount (logopt=0) at spawn.c:538
#4  0x00007f438a9a88f8 in mount_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x423309b0 "user9", name_len=9, 
what=0x42330980 "/nfs/http12/user9", fstype=0x423309d0 "bind",
     options=0x7f438a9a9716 "defaults", context=0x7f438bc2ca40) at 
mount_bind.c:170
#5  0x000055e6c5bc34ad in do_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x423309b0 "user9", name_len=9, 
what=0x42330980 "/nfs/http12/user9", fstype=0x423309d0 "bind",
     options=0x423309f0 "") at mount.c:78
#6  0x00007f438ade8a43 in sun_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x42331000 "user9", namelen=9, 
loc=0x55e6c6432740 ":/nfs/http12/user9", loclen=22, options=0x423309f0 "",
     ctxt=0x55e6c63f7770) at parse_sun.c:712
#7  0x00007f438adeacc2 in parse_mount (ap=0x55e6c63f9030, 
name=0x42331000 "user9", name_len=9, mapent=<optimized out>, 
context=<optimized out>) at parse_sun.c:1673
#8  0x00007f438b018be1 in lookup_mount (ap=0x55e6c63f9030, 
name=0x42331000 "user9", name_len=9, context=0x55e6c63f76f0) at 
lookup_program.c:691
#9  0x000055e6c5bc4072 in do_lookup_mount (ap=0x55e6c63f9030, 
map=0x55e6c63f9150, name=0x42331000 "user9", name_len=9) at lookup.c:766
#10 0x000055e6c5bc4642 in do_name_lookup_mount (name_len=<optimized 
out>, name=<optimized out>, map=<optimized out>, ap=<optimized out>) at 
lookup.c:945
#11 lookup_nss_mount (ap=0x55e6c63f9030, source=0x0, name=0x42331000 
"user9", name_len=9) at lookup.c:1110
#12 0x000055e6c5bbb4d6 in do_mount_indirect (arg=<optimized out>) at 
indirect.c:772
#13 0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#14 0x00007f438b9ad64d in clone () from /lib/libc.so.6
#15 0x0000000000000000 in ?? ()

Thread 10 (Thread 0x42737950 (LWP 1271894)):
#0  0x00007f438b9a4bd6 in poll () from /lib/libc.so.6
#1  0x000055e6c5bc1474 in timed_read (time=<optimized out>, 
len=<optimized out>, buf=<optimized out>, pipe=<optimized out>) at 
spawn.c:107
#2  do_spawn (logopt=0, wait=4294967295, options=<optimized out>, 
prog=<optimized out>, argv=<optimized out>) at spawn.c:272
#3  0x000055e6c5bc2050 in spawn_bind_mount (logopt=0) at spawn.c:538
#4  0x00007f438a9a88f8 in mount_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x427349b0 "user10", name_len=12, 
what=0x42734980 "/nfs/http12/user10", fstype=0x427349d0 "bind",
     options=0x7f438a9a9716 "defaults", context=0x0) at mount_bind.c:170
#5  0x000055e6c5bc34ad in do_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x427349b0 "user10", name_len=12, 
what=0x42734980 "/nfs/http12/user10", fstype=0x427349d0 "bind",
     options=0x427349f0 "") at mount.c:78
#6  0x00007f438ade8a43 in sun_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x42735000 "user10", namelen=12, 
loc=0x55e6c64328d0 ":/nfs/http12/user10", loclen=25,
     options=0x427349f0 "", ctxt=0x55e6c63f7770) at parse_sun.c:712
#7  0x00007f438adeacc2 in parse_mount (ap=0x55e6c63f9030, 
name=0x42735000 "user10", name_len=12, mapent=<optimized out>, 
context=<optimized out>) at parse_sun.c:1673
#8  0x00007f438b018be1 in lookup_mount (ap=0x55e6c63f9030, 
name=0x42735000 "user10", name_len=12, context=0x55e6c63f76f0) at 
lookup_program.c:691
#9  0x000055e6c5bc4072 in do_lookup_mount (ap=0x55e6c63f9030, 
map=0x55e6c63f9150, name=0x42735000 "user10", name_len=12) at lookup.c:766
#10 0x000055e6c5bc4642 in do_name_lookup_mount (name_len=<optimized 
out>, name=<optimized out>, map=<optimized out>, ap=<optimized out>) at 
lookup.c:945
#11 lookup_nss_mount (ap=0x55e6c63f9030, source=0x0, name=0x42735000 
"user10", name_len=12) at lookup.c:1110
#12 0x000055e6c5bbb4d6 in do_mount_indirect (arg=<optimized out>) at 
indirect.c:772
#13 0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#14 0x00007f438b9ad64d in clone () from /lib/libc.so.6
#15 0x0000000000000000 in ?? ()

Thread 9 (Thread 0x42131950 (LWP 1272342)):
#0  0x00007f438b9a4bd6 in poll () from /lib/libc.so.6
#1  0x000055e6c5bc1474 in timed_read (time=<optimized out>, 
len=<optimized out>, buf=<optimized out>, pipe=<optimized out>) at 
spawn.c:107
#2  do_spawn (logopt=0, wait=4294967295, options=<optimized out>, 
prog=<optimized out>, argv=<optimized out>) at spawn.c:272
#3  0x000055e6c5bc2050 in spawn_bind_mount (logopt=0) at spawn.c:538
#4  0x00007f438a9a88f8 in mount_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x4212e9b0 "user11", name_len=9, 
what=0x4212e980 "/nfs/http12/user11", fstype=0x4212e9d0 "bind",
     options=0x7f438a9a9716 "defaults", context=0x55e6c64353f0) at 
mount_bind.c:170
#5  0x000055e6c5bc34ad in do_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x4212e9b0 "user11", name_len=9, 
what=0x4212e980 "/nfs/http12/user11", fstype=0x4212e9d0 "bind",
     options=0x4212e9f0 "") at mount.c:78
#6  0x00007f438ade8a43 in sun_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x4212f000 "user11", namelen=9, 
loc=0x55e6c6432a30 ":/nfs/http12/user11", loclen=22, options=0x4212e9f0 "",
     ctxt=0x55e6c63f7770) at parse_sun.c:712
#7  0x00007f438adeacc2 in parse_mount (ap=0x55e6c63f9030, 
name=0x4212f000 "user11", name_len=9, mapent=<optimized out>, 
context=<optimized out>) at parse_sun.c:1673
#8  0x00007f438b018be1 in lookup_mount (ap=0x55e6c63f9030, 
name=0x4212f000 "user11", name_len=9, context=0x55e6c63f76f0) at 
lookup_program.c:691
#9  0x000055e6c5bc4072 in do_lookup_mount (ap=0x55e6c63f9030, 
map=0x55e6c63f9150, name=0x4212f000 "user11", name_len=9) at lookup.c:766
#10 0x000055e6c5bc4642 in do_name_lookup_mount (name_len=<optimized 
out>, name=<optimized out>, map=<optimized out>, ap=<optimized out>) at 
lookup.c:945
#11 lookup_nss_mount (ap=0x55e6c63f9030, source=0x0, name=0x4212f000 
"user11", name_len=9) at lookup.c:1110
#12 0x000055e6c5bbb4d6 in do_mount_indirect (arg=<optimized out>) at 
indirect.c:772
#13 0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#14 0x00007f438b9ad64d in clone () from /lib/libc.so.6
#15 0x0000000000000000 in ?? ()

Thread 8 (Thread 0x407af950 (LWP 1273318)):
#0  0x00007f438b9a4bd6 in poll () from /lib/libc.so.6
#1  0x000055e6c5bc1474 in timed_read (time=<optimized out>, 
len=<optimized out>, buf=<optimized out>, pipe=<optimized out>) at 
spawn.c:107
#2  do_spawn (logopt=0, wait=4294967295, options=<optimized out>, 
prog=<optimized out>, argv=<optimized out>) at spawn.c:272
#3  0x000055e6c5bc2050 in spawn_bind_mount (logopt=0) at spawn.c:538
#4  0x00007f438a9a88f8 in mount_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x407ac9c0 "user12", name_len=5, 
what=0x407ac990 "/nfs/http12/user12", fstype=0x407ac9e0 "bind",
     options=0x7f438a9a9716 "defaults", context=0xffffffffffffffff) at 
mount_bind.c:170
#5  0x000055e6c5bc34ad in do_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x407ac9c0 "user12", name_len=5, 
what=0x407ac990 "/nfs/http12/user12", fstype=0x407ac9e0 "bind", 
     options=0x407aca00 "") at mount.c:78
#6  0x00007f438ade8a43 in sun_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x407ad000 "user12", namelen=5, 
loc=0x55e6c6432cf0 ":/nfs/http12/user12", loclen=18, options=0x407aca00 "",
     ctxt=0x55e6c63f7770) at parse_sun.c:712
#7  0x00007f438adeacc2 in parse_mount (ap=0x55e6c63f9030, 
name=0x407ad000 "user12", name_len=5, mapent=<optimized out>, 
context=<optimized out>) at parse_sun.c:1673
#8  0x00007f438b018be1 in lookup_mount (ap=0x55e6c63f9030, 
name=0x407ad000 "user12", name_len=5, context=0x55e6c63f76f0) at 
lookup_program.c:691
#9  0x000055e6c5bc4072 in do_lookup_mount (ap=0x55e6c63f9030, 
map=0x55e6c63f9150, name=0x407ad000 "user12", name_len=5) at lookup.c:766
#10 0x000055e6c5bc4642 in do_name_lookup_mount (name_len=<optimized 
out>, name=<optimized out>, map=<optimized out>, ap=<optimized out>) at 
lookup.c:945
#11 lookup_nss_mount (ap=0x55e6c63f9030, source=0x0, name=0x407ad000 
"user12", name_len=5) at lookup.c:1110
#12 0x000055e6c5bbb4d6 in do_mount_indirect (arg=<optimized out>) at 
indirect.c:772
#13 0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#14 0x00007f438b9ad64d in clone () from /lib/libc.so.6
#15 0x0000000000000000 in ?? ()

Thread 7 (Thread 0x42b3b950 (LWP 1273997)):
#0  0x00007f438b9a4bd6 in poll () from /lib/libc.so.6
#1  0x000055e6c5bc1474 in timed_read (time=<optimized out>, 
len=<optimized out>, buf=<optimized out>, pipe=<optimized out>) at 
spawn.c:107
#2  do_spawn (logopt=0, wait=4294967295, options=<optimized out>, 
prog=<optimized out>, argv=<optimized out>) at spawn.c:272
#3  0x000055e6c5bc2050 in spawn_bind_mount (logopt=0) at spawn.c:538
#4  0x00007f438a9a88f8 in mount_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x42b389c0 "user13", name_len=6, 
what=0x42b38990 "/nfs/http12/user13", fstype=0x42b389e0 "bind",
     options=0x7f438a9a9716 "defaults", context=0x4b1) at mount_bind.c:170
#5  0x000055e6c5bc34ad in do_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x42b389c0 "user13", name_len=6, 
what=0x42b38990 "/nfs/http12/user13", fstype=0x42b389e0 "bind",
options=0x42b38a00 "")
     at mount.c:78
#6  0x00007f438ade8a43 in sun_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x42b39000 "user13", namelen=6, 
loc=0x55e6c64330c0 ":/nfs/http12/user13", loclen=19, options=0x42b38a00 "",
     ctxt=0x55e6c63f7770) at parse_sun.c:712
#7  0x00007f438adeacc2 in parse_mount (ap=0x55e6c63f9030, 
name=0x42b39000 "user13", name_len=6, mapent=<optimized out>, 
context=<optimized out>) at parse_sun.c:1673
#8  0x00007f438b018be1 in lookup_mount (ap=0x55e6c63f9030, 
name=0x42b39000 "user13", name_len=6, context=0x55e6c63f76f0) at 
lookup_program.c:691
#9  0x000055e6c5bc4072 in do_lookup_mount (ap=0x55e6c63f9030, 
map=0x55e6c63f9150, name=0x42b39000 "user13", name_len=6) at lookup.c:766
#10 0x000055e6c5bc4642 in do_name_lookup_mount (name_len=<optimized 
out>, name=<optimized out>, map=<optimized out>, ap=<optimized out>) at 
lookup.c:945
#11 lookup_nss_mount (ap=0x55e6c63f9030, source=0x0, name=0x42b39000 
"user13", name_len=6) at lookup.c:1110
#12 0x000055e6c5bbb4d6 in do_mount_indirect (arg=<optimized out>) at 
indirect.c:772
#13 0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#14 0x00007f438b9ad64d in clone () from /lib/libc.so.6
#15 0x0000000000000000 in ?? ()

Thread 6 (Thread 0x405a3950 (LWP 1276040)):
#0  0x00007f438b9a4bd6 in poll () from /lib/libc.so.6
#1  0x000055e6c5bc1474 in timed_read (time=<optimized out>, 
len=<optimized out>, buf=<optimized out>, pipe=<optimized out>) at 
spawn.c:107
#2  do_spawn (logopt=0, wait=4294967295, options=<optimized out>, 
prog=<optimized out>, argv=<optimized out>) at spawn.c:272
#3  0x000055e6c5bc2050 in spawn_bind_mount (logopt=0) at spawn.c:538
#4  0x00007f438a9a88f8 in mount_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x405a09c0 "user14", name_len=5, 
what=0x405a0990 "/nfs/http12/user14", fstype=0x405a09e0 "bind",
     options=0x7f438a9a9716 "defaults", context=0x21) at mount_bind.c:170
#5  0x000055e6c5bc34ad in do_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x405a09c0 "user14", name_len=5, 
what=0x405a0990 "/nfs/http12/user14", fstype=0x405a09e0 "bind", 
options=0x405a0a00
"")
     at mount.c:78
#6  0x00007f438ade8a43 in sun_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x405a1000 "user14", namelen=5, 
loc=0x55e6c6433360 ":/nfs/http12/user14", loclen=18, options=0x405a0a00 "",
     ctxt=0x55e6c63f7770) at parse_sun.c:712
#7  0x00007f438adeacc2 in parse_mount (ap=0x55e6c63f9030, 
name=0x405a1000 "user14", name_len=5, mapent=<optimized out>, 
context=<optimized out>) at parse_sun.c:1673
#8  0x00007f438b018be1 in lookup_mount (ap=0x55e6c63f9030, 
name=0x405a1000 "user14", name_len=5, context=0x55e6c63f76f0) at 
lookup_program.c:691
#9  0x000055e6c5bc4072 in do_lookup_mount (ap=0x55e6c63f9030, 
map=0x55e6c63f9150, name=0x405a1000 "user14", name_len=5) at lookup.c:766
#10 0x000055e6c5bc4642 in do_name_lookup_mount (name_len=<optimized 
out>, name=<optimized out>, map=<optimized out>, ap=<optimized out>) at 
lookup.c:945
#11 lookup_nss_mount (ap=0x55e6c63f9030, source=0x0, name=0x405a1000 
"user14", name_len=5) at lookup.c:1110
#12 0x000055e6c5bbb4d6 in do_mount_indirect (arg=<optimized out>) at 
indirect.c:772
#13 0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#14 0x00007f438b9ad64d in clone () from /lib/libc.so.6
#15 0x0000000000000000 in ?? ()

Thread 5 (Thread 0x42c3c950 (LWP 1278498)):
#0  0x00007f438b9a4bd6 in poll () from /lib/libc.so.6
#1  0x000055e6c5bc1474 in timed_read (time=<optimized out>, 
len=<optimized out>, buf=<optimized out>, pipe=<optimized out>) at 
spawn.c:107
#2  do_spawn (logopt=0, wait=4294967295, options=<optimized out>, 
prog=<optimized out>, argv=<optimized out>) at spawn.c:272
#3  0x000055e6c5bc2050 in spawn_bind_mount (logopt=0) at spawn.c:538
#4  0x00007f438a9a88f8 in mount_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x42c399c0 "user15", name_len=6, 
what=0x42c39990 "/nfs/http12/user15", fstype=0x42c399e0 "bind",
     options=0x7f438a9a9716 "defaults", context=0x6974616877040020) at 
mount_bind.c:170
#5  0x000055e6c5bc34ad in do_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x42c399c0 "user15", name_len=6, 
what=0x42c39990 "/nfs/http12/user15", fstype=0x42c399e0 "bind",
options=0x42c39a00 "")
     at mount.c:78
#6  0x00007f438ade8a43 in sun_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x42c3a000 "user15", namelen=6, 
loc=0x55e6c6435160 ":/nfs/http12/user15", loclen=19, options=0x42c39a00 "",
     ctxt=0x55e6c63f7770) at parse_sun.c:712
#7  0x00007f438adeacc2 in parse_mount (ap=0x55e6c63f9030, 
name=0x42c3a000 "user15", name_len=6, mapent=<optimized out>, 
context=<optimized out>) at parse_sun.c:1673
#8  0x00007f438b018be1 in lookup_mount (ap=0x55e6c63f9030, 
name=0x42c3a000 "user15", name_len=6, context=0x55e6c63f76f0) at 
lookup_program.c:691
#9  0x000055e6c5bc4072 in do_lookup_mount (ap=0x55e6c63f9030, 
map=0x55e6c63f9150, name=0x42c3a000 "user15", name_len=6) at lookup.c:766
#10 0x000055e6c5bc4642 in do_name_lookup_mount (name_len=<optimized 
out>, name=<optimized out>, map=<optimized out>, ap=<optimized out>) at 
lookup.c:945
#11 lookup_nss_mount (ap=0x55e6c63f9030, source=0x0, name=0x42c3a000 
"user15", name_len=6) at lookup.c:1110
#12 0x000055e6c5bbb4d6 in do_mount_indirect (arg=<optimized out>) at 
indirect.c:772
#13 0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#14 0x00007f438b9ad64d in clone () from /lib/libc.so.6
#15 0x0000000000000000 in ?? ()

Thread 4 (Thread 0x42d3d950 (LWP 1302366)):
#0  0x00007f438b9a4bd6 in poll () from /lib/libc.so.6
#1  0x000055e6c5bc1474 in timed_read (time=<optimized out>, 
len=<optimized out>, buf=<optimized out>, pipe=<optimized out>) at 
spawn.c:107
#2  do_spawn (logopt=0, wait=4294967295, options=<optimized out>, 
prog=<optimized out>, argv=<optimized out>) at spawn.c:272
#3  0x000055e6c5bc2050 in spawn_bind_mount (logopt=0) at spawn.c:538
#4  0x00007f438a9a88f8 in mount_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x42d3a9b0 "user16", name_len=8, 
what=0x42d3a980 "/nfs/http12/user16", fstype=0x42d3a9d0 "bind",
     options=0x7f438a9a9716 "defaults", context=0xffffffffffffffff) at 
mount_bind.c:170
#5  0x000055e6c5bc34ad in do_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x42d3a9b0 "user16", name_len=8, 
what=0x42d3a980 "/nfs/http12/user16", fstype=0x42d3a9d0 "bind",
     options=0x42d3a9f0 "") at mount.c:78
#6  0x00007f438ade8a43 in sun_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x42d3b000 "user16", namelen=8, 
loc=0x55e6c6435420 ":/nfs/http12/user16", loclen=21, options=0x42d3a9f0 "",
     ctxt=0x55e6c63f7770) at parse_sun.c:712
#7  0x00007f438adeacc2 in parse_mount (ap=0x55e6c63f9030, 
name=0x42d3b000 "user16", name_len=8, mapent=<optimized out>, 
context=<optimized out>) at parse_sun.c:1673
#8  0x00007f438b018be1 in lookup_mount (ap=0x55e6c63f9030, 
name=0x42d3b000 "user16", name_len=8, context=0x55e6c63f76f0) at 
lookup_program.c:691
#9  0x000055e6c5bc4072 in do_lookup_mount (ap=0x55e6c63f9030, 
map=0x55e6c63f9150, name=0x42d3b000 "user16", name_len=8) at lookup.c:766
#10 0x000055e6c5bc4642 in do_name_lookup_mount (name_len=<optimized 
out>, name=<optimized out>, map=<optimized out>, ap=<optimized out>) at 
lookup.c:945
#11 lookup_nss_mount (ap=0x55e6c63f9030, source=0x0, name=0x42d3b000 
"user16", name_len=8) at lookup.c:1110
#12 0x000055e6c5bbb4d6 in do_mount_indirect (arg=<optimized out>) at 
indirect.c:772
#13 0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#14 0x00007f438b9ad64d in clone () from /lib/libc.so.6
#15 0x0000000000000000 in ?? ()

Thread 3 (Thread 0x42e3e950 (LWP 1317665)):
#0  0x00007f438b9a4bd6 in poll () from /lib/libc.so.6
#1  0x000055e6c5bc1474 in timed_read (time=<optimized out>, 
len=<optimized out>, buf=<optimized out>, pipe=<optimized out>) at 
spawn.c:107
#2  do_spawn (logopt=0, wait=4294967295, options=<optimized out>, 
prog=<optimized out>, argv=<optimized out>) at spawn.c:272
#3  0x000055e6c5bc2050 in spawn_bind_mount (logopt=0) at spawn.c:538
#4  0x00007f438a9a88f8 in mount_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x42e3b9b0 "user17", name_len=7, 
what=0x42e3b980 "/nfs/http12/user17", fstype=0x42e3b9d0 "bind",
     options=0x7f438a9a9716 "defaults", context=0x0) at mount_bind.c:170
#5  0x000055e6c5bc34ad in do_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x42e3b9b0 "user17", name_len=7, 
what=0x42e3b980 "/nfs/http12/user17", fstype=0x42e3b9d0 "bind",
     options=0x42e3b9f0 "") at mount.c:78
#6  0x00007f438ade8a43 in sun_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x42e3c000 "user17", namelen=7, 
loc=0x55e6c64358a0 ":/nfs/http12/user17", loclen=20, options=0x42e3b9f0 "",
     ctxt=0x55e6c63f7770) at parse_sun.c:712
#7  0x00007f438adeacc2 in parse_mount (ap=0x55e6c63f9030, 
name=0x42e3c000 "user17", name_len=7, mapent=<optimized out>, 
context=<optimized out>) at parse_sun.c:1673
#8  0x00007f438b018be1 in lookup_mount (ap=0x55e6c63f9030, 
name=0x42e3c000 "user17", name_len=7, context=0x55e6c63f76f0) at 
lookup_program.c:691
#9  0x000055e6c5bc4072 in do_lookup_mount (ap=0x55e6c63f9030, 
map=0x55e6c63f9150, name=0x42e3c000 "user17", name_len=7) at lookup.c:766
#10 0x000055e6c5bc4642 in do_name_lookup_mount (name_len=<optimized 
out>, name=<optimized out>, map=<optimized out>, ap=<optimized out>) at 
lookup.c:945
#11 lookup_nss_mount (ap=0x55e6c63f9030, source=0x0, name=0x42e3c000 
"user17", name_len=7) at lookup.c:1110
#12 0x000055e6c5bbb4d6 in do_mount_indirect (arg=<optimized out>) at 
indirect.c:772
#13 0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#14 0x00007f438b9ad64d in clone () from /lib/libc.so.6
#15 0x0000000000000000 in ?? ()

Thread 2 (Thread 0x42f3f950 (LWP 1319319)):
#0  0x00007f438b9a4bd6 in poll () from /lib/libc.so.6
#1  0x000055e6c5bc1474 in timed_read (time=<optimized out>, 
len=<optimized out>, buf=<optimized out>, pipe=<optimized out>) at 
spawn.c:107
#2  do_spawn (logopt=0, wait=4294967295, options=<optimized out>, 
prog=<optimized out>, argv=<optimized out>) at spawn.c:272
#3  0x000055e6c5bc2050 in spawn_bind_mount (logopt=0) at spawn.c:538
#4  0x00007f438a9a88f8 in mount_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x42f3c9a0 "user18", name_len=18, 
what=0x42f3c970 "/nfs/http12/user18",
     fstype=0x42f3c9d0 "bind", options=0x7f438a9a9716 "defaults", 
context=0xffffffffffffffff) at mount_bind.c:170
#5  0x000055e6c5bc34ad in do_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x42f3c9a0 "user18", name_len=18, 
what=0x42f3c970 "/nfs/http12/user18",
     fstype=0x42f3c9d0 "bind", options=0x42f3c9f0 "") at mount.c:78
#6  0x00007f438ade8a43 in sun_mount (ap=0x55e6c63f9030, 
root=0x55e6c63f9130 "/home", name=0x42f3d000 "user18", namelen=18, 
loc=0x55e6c6435b40 ":/nfs/http12/user18", loclen=31,
     options=0x42f3c9f0 "", ctxt=0x55e6c63f7770) at parse_sun.c:712
#7  0x00007f438adeacc2 in parse_mount (ap=0x55e6c63f9030, 
name=0x42f3d000 "user18", name_len=18, mapent=<optimized out>, 
context=<optimized out>) at parse_sun.c:1673
#8  0x00007f438b018be1 in lookup_mount (ap=0x55e6c63f9030, 
name=0x42f3d000 "user18", name_len=18, context=0x55e6c63f76f0) at 
lookup_program.c:691
#9  0x000055e6c5bc4072 in do_lookup_mount (ap=0x55e6c63f9030, 
map=0x55e6c63f9150, name=0x42f3d000 "user18", name_len=18) at lookup.c:766
#10 0x000055e6c5bc4642 in do_name_lookup_mount (name_len=<optimized 
out>, name=<optimized out>, map=<optimized out>, ap=<optimized out>) at 
lookup.c:945
#11 lookup_nss_mount (ap=0x55e6c63f9030, source=0x0, name=0x42f3d000 
"user18", name_len=18) at lookup.c:1110
#12 0x000055e6c5bbb4d6 in do_mount_indirect (arg=<optimized out>) at 
indirect.c:772
#13 0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#14 0x00007f438b9ad64d in clone () from /lib/libc.so.6
#15 0x0000000000000000 in ?? ()

Thread 1 (Thread 0x7f438cbfa6e0 (LWP 3437)):
#0  0x00007f438c7ec797 in do_sigwait () from /lib/libpthread.so.0
#1  0x00007f438c7ec83d in sigwait () from /lib/libpthread.so.0
#2  0x000055e6c5bba3c6 in statemachine (arg=<optimized out>) at 
automount.c:1469
#3  main (argc=0, argv=<optimized out>) at automount.c:2476


I cannot attach gdb on subprocesses: gdb just hangs after:
Attaching to program: /usr/sbin/automount, process 1269869

Kernel trace:

# cat /proc/1269869/stack
[<ffffffffc052f3bf>] autofs4_wait+0x3df/0xb60 [autofs4]
[<ffffffffc052e0a5>] autofs4_d_automount+0x235/0x270 [autofs4]
[<ffffffff921cbb8f>] follow_managed+0x1ff/0x2d0
[<ffffffff921cccb3>] walk_component+0x263/0x300
[<ffffffff921cdded>] link_path_walk+0x18d/0x5a0
[<ffffffff921cf48e>] path_openat+0xbe/0x1070
[<ffffffff921d04c5>] do_filp_open+0x85/0xe0
[<ffffffff921bed96>] do_sys_open+0x146/0x220
[<ffffffff921beeae>] SyS_open+0x1e/0x20
[<ffffffff927685b2>] entry_SYSCALL_64_fastpath+0x12/0x71
[<ffffffffffffffff>] 0xffffffffffffffff

Other subprocesses may have a slightly different kernel trace:

~# cat /proc/1269923/stack
[<ffffffffc052f3bf>] autofs4_wait+0x3df/0xb60 [autofs4]
[<ffffffffc052dc97>] autofs4_d_manage+0x117/0x1b0 [autofs4]
[<ffffffff921cbaa6>] follow_managed+0x116/0x2d0
[<ffffffff921cccb3>] walk_component+0x263/0x300
[<ffffffff921cdded>] link_path_walk+0x18d/0x5a0
[<ffffffff921cf48e>] path_openat+0xbe/0x1070
[<ffffffff921d04c5>] do_filp_open+0x85/0xe0
[<ffffffff921bed96>] do_sys_open+0x146/0x220
[<ffffffff921beeae>] SyS_open+0x1e/0x20
[<ffffffff927685b2>] entry_SYSCALL_64_fastpath+0x12/0x71
[<ffffffffffffffff>] 0xffffffffffffffff

When the deadlock happens, I have to kill -9 subprocesses one by one. 
One specific subprocess finally unlocks the deadlock and everything goes 
back to normal (remaining subprocesses disappear as well).

I have no other interesting log, and in particular no kernel log when 
that happens.

Thanks,

-- 
Cyril B.
--
To unsubscribe from this list: send the line "unsubscribe autofs" in

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Regular deadlocks
  2016-06-25 23:41 Regular deadlocks Cyril B.
@ 2016-06-26  1:59 ` Ian Kent
  2016-06-26 10:02   ` Cyril B.
  2016-06-26 10:59 ` Ian Kent
  2016-06-26 11:02 ` Ian Kent
  2 siblings, 1 reply; 17+ messages in thread
From: Ian Kent @ 2016-06-26  1:59 UTC (permalink / raw)
  To: Cyril B., autofs

On Sun, 2016-06-26 at 01:41 +0200, Cyril B. wrote:
> Hello,
> 
> I have occasional deadlocks using autofs 4.1.2 (but it happened on 4.1.1 
> as well) on my servers, typically about once every 2 or 3 days.
> 
> I already posted on this mailing-list back in 2015 for a bug that also 
> triggered deadlocks (which was fixed), so I'll copy/paste parts of my 
> original message as my config hasn't changed.
> 
> /etc/auto.master:
> --
> /nfs program:/etc/auto.nfs
> /home program:/etc/auto.home
> --
> 
> /etc/auto.nfs is basically returning:
> 
> -fstype=nfs4,noatime,nosuid,_netdev,soft,intr,timeo=1000 $1:/
> 
> /etc/auto.home:
> --
> #!/bin/sh
> 
> if [ ! -h /var/home/$1 ]
> then
>     exit 1
> fi
> 
> echo -fstype=bind :$(readlink --no-newline /var/home/$1)
> --
> 
> So for instance, /var/home/foo would be a symlink pointing to
> /nfs/serverX/foo.
> 
> Kernel: Linux 4.4.7.
> 
> My servers have cronjobs that trigger /home/userX mounts basically at 
> the same time (when the jobs do start). I have 2 servers with the same 
> config, but one of them has MANY more users/cronjobs and oddly enough, 
> the deadlock happens much more infrequently.
> 
> Anyway, here's a 'ps faux' a few hours after the deadlock started:

Looks like these aren't showing a deadlock.

I think I've been seeing the same thing during testing and I can see it's
mount.nfs(8) that is not returning when it should.

I initially thought it was my environment but I've swapped several devices and
used different servers so I'm beginning to think mount.nfs(8) has grown a
problem, but I'm still not sure.

So far I've been thinking this was a problem with my environment so I didn't
worry too much about it.

snip ...


> 
> I cannot attach gdb on subprocesses: gdb just hangs after:
> Attaching to program: /usr/sbin/automount, process 1269869
> 
> Kernel trace:
> 
> # cat /proc/1269869/stack
> [<ffffffffc052f3bf>] autofs4_wait+0x3df/0xb60 [autofs4]
> [<ffffffffc052e0a5>] autofs4_d_automount+0x235/0x270 [autofs4]
> [<ffffffff921cbb8f>] follow_managed+0x1ff/0x2d0
> [<ffffffff921cccb3>] walk_component+0x263/0x300
> [<ffffffff921cdded>] link_path_walk+0x18d/0x5a0
> [<ffffffff921cf48e>] path_openat+0xbe/0x1070
> [<ffffffff921d04c5>] do_filp_open+0x85/0xe0
> [<ffffffff921bed96>] do_sys_open+0x146/0x220
> [<ffffffff921beeae>] SyS_open+0x1e/0x20
> [<ffffffff927685b2>] entry_SYSCALL_64_fastpath+0x12/0x71
> [<ffffffffffffffff>] 0xffffffffffffffff

The autofs4_d_automount() entry here indicates this is the one that triggered
the mount.

If you are seeing a problem with mount, looking for the blocked process and
killing it should clear the rest of these up.
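
A minimal sketch of how the waiting children could be sorted by the autofs
entry point in their kernel stack, assuming /proc/<pid>/stack is readable
(root) and pgrep(1) is available. The function names and the output labels
are made up for illustration only:

```sh
#!/bin/sh
# classify_stack FILE: report which autofs entry point appears in a saved
# kernel stack. The labels printed here are illustrative, not autofs terms.
classify_stack() {
    if grep -q 'autofs4_d_automount' "$1" 2>/dev/null; then
        echo automount
    elif grep -q 'autofs4_d_manage' "$1" 2>/dev/null; then
        echo manage
    else
        echo other
    fi
}

# scan_children PID: classify every direct child of PID.
# Reading /proc/<pid>/stack normally requires root.
scan_children() {
    for pid in $(pgrep -P "$1"); do
        printf '%s %s\n' "$pid" "$(classify_stack "/proc/$pid/stack")"
    done
}
```

Something like scan_children "$(cat /var/run/autofs.pid)" would then list one
line per child; the ones reported as "automount" are the candidates for
having triggered a mount.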

You would think that, even if mount.nfs(8) was losing network packets, it would
timeout after about 3 minutes. I must admit I haven't waited long enough to find
out if that's the case so far.

If you find you are actually seeing this, setting the configuration option
mount_wait to some sensible value sufficient for a mount to complete might help.
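
As a rough illustration only: on autofs 5.1.x the option normally lives in
the [ autofs ] section of the autofs configuration file. The path and the
value of 60 seconds below are assumptions, not something taken from this
report:

```
# /etc/autofs.conf (path assumed; it may differ per distribution)
[ autofs ]
# Seconds to wait for a spawned mount(8) before it is sent a SIGTERM.
# The default, -1, waits until mount(8) returns.
mount_wait = 60
```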

The problem with that is it's hard to locate the actual blocked child process
from automount to kill it once the timeout has expired. Only the process spawned
by automount itself is killed so sub processes will probably accumulate.

Normally that process would go away after the usual lengthy timeout and was
commonly due to a server not responding or something like that so not locating
the process and killing it wasn't a big problem.

If this really is what's happening then I will need to fix that so that
automount can work around it. Not sure it will really help that much though
......

snip ...

> When the deadlock happens, I have to kill -9 subprocesses one by one. 
> One specific subprocess finally unlocks the deadlock and everything goes 
> back to normal (remaining subprocesses disappear as well).

Right, you might need to do that on the blocked mount process, NFS hangs on
pretty strongly when mounting.

Ian

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Regular deadlocks
  2016-06-26  1:59 ` Ian Kent
@ 2016-06-26 10:02   ` Cyril B.
  2016-06-26 10:30     ` Ian Kent
  2016-06-27  0:26     ` Ian Kent
  0 siblings, 2 replies; 17+ messages in thread
From: Cyril B. @ 2016-06-26 10:02 UTC (permalink / raw)
  To: Ian Kent, autofs

On 06/26/2016 03:59 AM, Ian Kent wrote:
> The autofs4_d_automount() entry here indicates this is the one that triggered
> the mount.
>
> If you are seeing a problem with mount, looking for the blocked process and
> killing it should clear the rest of these up.
>
> You would think that, even if mount.nfs(8) was losing network packets, it would
> timeout after about 3 minutes. I must admit I haven't waited long enough to find
> out if that's the case so far.
>
> If you find you are actually seeing this, setting the configuration option
> mount_wait to some sensible value sufficient for a mount to complete might help.


Thanks for the quick response. I don't remember seeing any lingering 
mount.nfs, but I must have overlooked it.

However, I've also had those "deadlocks" happen with several different 
NFS servers at the same time (this was not the case with the traces I 
sent in my previous email). If one NFS server was unmountable, should 
automount be blocked when trying to mount others?

Anyway, I'll investigate some more next time the "deadlock" occurs and 
will also try setting mount_wait to a reasonable value. I'll let you 
know how it goes.

-- 
Cyril B.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Regular deadlocks
  2016-06-26 10:02   ` Cyril B.
@ 2016-06-26 10:30     ` Ian Kent
  2016-06-26 11:13       ` Cyril B.
  2016-06-27  0:26     ` Ian Kent
  1 sibling, 1 reply; 17+ messages in thread
From: Ian Kent @ 2016-06-26 10:30 UTC (permalink / raw)
  To: Cyril B., autofs

On Sun, 2016-06-26 at 12:02 +0200, Cyril B. wrote:
> On 06/26/2016 03:59 AM, Ian Kent wrote:
> > The autofs4_d_automount() entry here indicates this is the one that
> > triggered
> > the mount.
> > 
> > If you are seeing a problem with mount, looking for the blocked process and
> > killing it should clear the rest of these up.
> > 
> > You would think that, even if mount.nfs(8) was losing network packets, it
> > would
> > timeout after about 3 minutes. I must admit I haven't waited long enough to
> > find
> > out if that's the case so far.
> > 
> > If you find you are actually seeing this, setting the configuration option
> > mount_wait to some sensible value sufficient for a mount to complete might
> > help.
> 
> 
> Thanks for the quick response. I don't remember seeing any lingering 
> mount.nfs, but I must have overlooked.

Yep, check it and let me know.

> 
> However, I've also had those "deadlocks" happen with several different 
> NFS servers at the same time (this was not the case with the traces I 
> sent in my previous email). If one NFS server was unmountable, should 
> automount be blocked when trying to mount others?

That's a bit harder to answer, it's different for mount and umount and one can
affect the other.

I would need to check since the locking scope has changed due to changes for
other problems; that's with 5.1.1 and 5.1.2, right?

That also assumes automount has got all the way to mounting or umounting, it can
block at other points too like remote key lookup.

Ideally one down server shouldn't affect other mounts or umounts but, as I say,
I'll need to check and try to fix it if it's not working like that. Having said
that, it might not be straightforward either.

> 
> Anyway, I'll investigate some more next time the "deadlock" occurs and 
> will also try setting mount_wait to a reasonable value. I'll let you 
> know how it goes.
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Regular deadlocks
  2016-06-25 23:41 Regular deadlocks Cyril B.
  2016-06-26  1:59 ` Ian Kent
@ 2016-06-26 10:59 ` Ian Kent
  2016-06-26 11:02 ` Ian Kent
  2 siblings, 0 replies; 17+ messages in thread
From: Ian Kent @ 2016-06-26 10:59 UTC (permalink / raw)
  To: Cyril B., autofs

On Sun, 2016-06-26 at 01:41 +0200, Cyril B. wrote:
> Hello,
> 
> I have occasional deadlocks using autofs 4.1.2 (but it happened on 4.1.1 
> as well) on my servers, typically about once every 2 or 3 days.
> 
> I already posted on this mailing-list back in 2015 for a bug that also 
> triggered deadlocks (which was fixed), so I'll copy/paste parts of my 
> original message as my config hasn't changed.
> 
> /etc/auto.master:
> --
> /nfs program:/etc/auto.nfs
> /home program:/etc/auto.home
> --
> 
> /etc/auto.nfs is basically returning:
> 
> -fstype=nfs4,noatime,nosuid,_netdev,soft,intr,timeo=1000 $1:/
> 
> /etc/auto.home:
> --
> #!/bin/sh
> 
> if [ ! -h /var/home/$1 ]
> then
>     exit 1
> fi
> 
> echo -fstype=bind :$(readlink --no-newline /var/home/$1)
> --

Btw, there was a bug in 5.1.1 where a program map lookup could hang under load
due to out of order locking.

The stack traces here didn't show what I'd expect to see with that but it can be
a problem with 5.1.1.

fyi, the patch that fixed that is:
https://www.kernel.org/pub/linux/daemons/autofs/v5/patches-5.1.2/autofs-5.1.1-fix-out-of-order-call-in-program-map-lookup.patch

Ian

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Regular deadlocks
  2016-06-25 23:41 Regular deadlocks Cyril B.
  2016-06-26  1:59 ` Ian Kent
  2016-06-26 10:59 ` Ian Kent
@ 2016-06-26 11:02 ` Ian Kent
  2 siblings, 0 replies; 17+ messages in thread
From: Ian Kent @ 2016-06-26 11:02 UTC (permalink / raw)
  To: Cyril B., autofs

On Sun, 2016-06-26 at 01:41 +0200, Cyril B. wrote:
> Hello,
> 
> I have occasional deadlocks using autofs 4.1.2 (but it happened on 4.1.1 
> as well) on my servers, typically about once every 2 or 3 days.
> 

Ummm ... is that meant to be 5.1.1 and 5.1.2, typo perhaps?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Regular deadlocks
  2016-06-26 10:30     ` Ian Kent
@ 2016-06-26 11:13       ` Cyril B.
  0 siblings, 0 replies; 17+ messages in thread
From: Cyril B. @ 2016-06-26 11:13 UTC (permalink / raw)
  To: Ian Kent, autofs

On 06/26/2016 12:30 PM, Ian Kent wrote:
> I would need to check since the locking scope changes due to changes for other
> problems, that's with 5.1.1 and 5.1.2 right?

Yes, both versions exhibit the same behaviour for me, with no apparent 
difference (same deadlock frequency, for instance).

> Btw, there was a bug in 5.1.1 where a program map lookup could hang under load
> due to out of order locking.
>
> The stack traces here didn't show what I'd expect to see with that but it can be
> a problem with 5.1.1.

That's actually the bug you fixed after my bug report in 2015, so my 
5.1.1 already included that patch. And since it's included in 5.1.2, all 
my tests were with that patch.

>> Hello,
>>
>> I have occasional deadlocks using autofs 4.1.2 (but it happened on 4.1.1
>> as well) on my servers, typically about once every 2 or 3 days.
>>
>
> Ummm ... is that meant to be 5.1.1 and 5.1.2, typo perhaps?

Whoops, yes :)

-- 
Cyril B.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Regular deadlocks
  2016-06-26 10:02   ` Cyril B.
  2016-06-26 10:30     ` Ian Kent
@ 2016-06-27  0:26     ` Ian Kent
  2016-06-27  0:35       ` Ian Kent
  2016-06-27 14:04       ` Cyril B.
  1 sibling, 2 replies; 17+ messages in thread
From: Ian Kent @ 2016-06-27  0:26 UTC (permalink / raw)
  To: Cyril B., autofs

On Sun, 2016-06-26 at 12:02 +0200, Cyril B. wrote:
> 
> However, I've also had those "deadlocks" happen with several different 
> NFS servers at the same time (this was not the case with the traces I 
> sent in my previous email). If one NFS server was unmountable, should 
> automount be blocked when trying to mount others?
> 

How is autofs configured?

If --disable-mount-locking is not used then any mount can block all other
mounts; if it is used then there can be mtab corruption if still using a text
based mtab.

I always use --disable-mount-locking and nowadays the mtab is usually a symlink
into the proc file system so corruption isn't a problem.
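
A small sketch for checking which case applies on a given machine. The
function name and the labels it prints are hypothetical, and it only
distinguishes a symlink into /proc from a regular text file:

```sh
#!/bin/sh
# mtab_kind [PATH]: describe what PATH (default /etc/mtab) is.
# Prints one of: symlink-to-proc, symlink-other, regular-file, missing.
mtab_kind() {
    path=${1:-/etc/mtab}
    if [ -h "$path" ]; then
        # A symlink: look at its target, absolute or relative.
        case "$(readlink "$path")" in
            /proc/*|../proc/*) echo symlink-to-proc ;;
            *) echo symlink-other ;;
        esac
    elif [ -f "$path" ]; then
        echo regular-file
    else
        echo missing
    fi
}

mtab_kind /etc/mtab
```

A result of "regular-file" would mean a text based mtab, which is the case
where corruption is a concern.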

Ian

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Regular deadlocks
  2016-06-27  0:26     ` Ian Kent
@ 2016-06-27  0:35       ` Ian Kent
  2016-06-27 14:04       ` Cyril B.
  1 sibling, 0 replies; 17+ messages in thread
From: Ian Kent @ 2016-06-27  0:35 UTC (permalink / raw)
  To: Cyril B., autofs

On Mon, 2016-06-27 at 08:26 +0800, Ian Kent wrote:
> On Sun, 2016-06-26 at 12:02 +0200, Cyril B. wrote:
> > 
> > However, I've also had those "deadlocks" happen with several different 
> > NFS servers at the same time (this was not the case with the traces I 
> > sent in my previous email). If one NFS server was unmountable, should 
> > automount be blocked when trying to mount others?

And don't forget it's not different servers that affect locking.

Most of the locking that can cause blocking is at directory level so it
should be at or below the autofs mount point directories.

> 
> How is autofs configured.
> 
> If --disable-mount-locking is not used then any mount can block all other
> mounts, if it is used then there can be mtab corruption if still using a text
> based mtab.
> 
> I always use --disable-mount-locking and nowadays the mtab is usually a
> symlink
> into the proc file system so corruption isn't a problem.
> 
> Ian

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Regular deadlocks
  2016-06-27  0:26     ` Ian Kent
  2016-06-27  0:35       ` Ian Kent
@ 2016-06-27 14:04       ` Cyril B.
  2016-06-28  0:27         ` Ian Kent
  2016-07-01  7:03         ` Ian Kent
  1 sibling, 2 replies; 17+ messages in thread
From: Cyril B. @ 2016-06-27 14:04 UTC (permalink / raw)
  To: Ian Kent, autofs

On 06/27/2016 02:26 AM, Ian Kent wrote:
> How is autofs configured.
>
> If --disable-mount-locking is not used then any mount can block all other
> mounts, if it is used then there can be mtab corruption if still using a text
> based mtab.

I use --disable-mount-locking.

> I always use --disable-mount-locking and nowadays the mtab is usually a symlink
> into the proc file system so corruption isn't a problem.

/etc/mtab is actually not a symlink on my systems.


Anyway, I have more details for you as the issue appeared today and I 
could investigate some more. This is on a server that only mounts one 
single NFS server (http12), so the multi-servers blocking issue is 
irrelevant here.

A few minutes before the "deadlock" occurred, /nfs/http12 was unmounted 
by autofs, I assume because it was idle. I have TIMEOUT=600. That 
explains why the issue appears much more frequently on a server which is 
way less busy (and usually in the middle of the night): the NFS server 
needs to be idle enough to be unmounted.

However, I still had many /home/userX mounted (by autofs), which point 
to /nfs/http12/userX. Shouldn't autofs keep /nfs/http12 mounted while at 
least one /home/userX is mounted? To be clear, here's an extract from my 
/proc/mounts BEFORE the NFS server is unmounted by autofs:

http12:/ /nfs/http12 nfs4 rw,nosuid,noatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,soft,proto=tcp6,timeo=1000,retrans=2,sec=sys,clientaddr=2a00:42:1:50:1::1,local_lock=none,addr=2a00:42:1:20:1::1 0 0
http12://user1 /home/user1 nfs4 rw,nosuid,noatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,soft,proto=tcp6,timeo=1000,retrans=2,sec=sys,clientaddr=2a00:42:1:50:1::1,local_lock=none,addr=2a00:42:1:20:1::1 0 0
http12://user2 /home/user2 nfs4 rw,nosuid,noatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,soft,proto=tcp6,timeo=1000,retrans=2,sec=sys,clientaddr=2a00:42:1:50:1::1,local_lock=none,addr=2a00:42:1:20:1::1 0 0


Also, I couldn't find any blocked mount process that would explain the 
"deadlock". I had a 'ps aux|grep mount' done every 10 seconds:

Mon Jun 27 05:00:00 CEST 2016
root        3437  0.0  0.0 218676  5500 ?        Ssl  Jun24   0:26 
/usr/sbin/automount --pid-file /var/run/autofs.pid

Mon Jun 27 05:00:10 CEST 2016
root        3437  0.0  0.0 218676  5500 ?        Ssl  Jun24   0:27 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root     2618146  0.0  0.0 218676  2168 ?        S    05:00   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root     2618214  0.0  0.0 218676  2168 ?        S    05:00   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root     2618215  0.0  0.0      0     0 ?        Z    05:00   0:00 
[umount] <defunct>
root     2618224  0.0  0.0 218676  2168 ?        S    05:00   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root     2618227  0.0  0.0 218676  2168 ?        S    05:00   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root     2618230  0.0  0.0      0     0 ?        Z    05:00   0:00 
[umount] <defunct>
root     2618240  0.0  0.0 218676  2168 ?        S    05:00   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root     2618248  0.0  0.0 218676  2168 ?        S    05:00   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root     2618250  0.0  0.0 218676  2168 ?        S    05:00   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root     2618252  0.1  0.0 218676  2168 ?        S    05:00   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid

Mon Jun 27 05:00:20 CEST 2016
root        3437  0.0  0.0 218676  5500 ?        Ssl  Jun24   0:27 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root     2618146  0.0  0.0 218676  2168 ?        S    05:00   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root     2618214  0.0  0.0 218676  2168 ?        S    05:00   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root     2618224  0.0  0.0 218676  2168 ?        S    05:00   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root     2618227  0.0  0.0 218676  2168 ?        S    05:00   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root     2618240  0.0  0.0 218676  2168 ?        S    05:00   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root     2618248  0.0  0.0 218676  2168 ?        S    05:00   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root     2618250  0.0  0.0 218676  2168 ?        S    05:00   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root     2618252  0.0  0.0 218676  2168 ?        S    05:00   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root     2618701  0.0  0.0 218676  2168 ?        S    05:00   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root     2618702  0.0  0.0 218676  2168 ?        S    05:00   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid

And it remained in that state afterwards. I don't know if the defunct 
umount processes are suspicious; I guess not.
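
The periodic capture described above can be reproduced with a small loop. This is only a sketch of one way to do it, assuming a POSIX shell; the function name snapshot and the log path in the usage note are hypothetical:

```sh
#!/bin/sh
# Print a timestamp followed by every process whose command line mentions
# "mount" (automount, mount, umount, mount.nfs, ...).
# snapshot is a hypothetical helper name used only in this sketch.
snapshot() {
    date
    # The [m] bracket keeps grep from matching its own command line;
    # "|| true" keeps an empty result from being treated as a failure.
    ps aux | grep '[m]ount' || true
    echo
}
```

Running "while :; do snapshot >> /var/tmp/mount-ps.log; sleep 10; done" in the background gives one timestamped block every 10 seconds, in the same shape as the listing above.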

One last thing: a manual umount of /home/userY was done by a script at 
6:26 (/home/userY was NOT mounted though), and it remained blocked. I'm 
not sure if it's a consequence of autofs being blocked or something else.

-- 
Cyril B.
--
To unsubscribe from this list: send the line "unsubscribe autofs" in

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Regular deadlocks
  2016-06-27 14:04       ` Cyril B.
@ 2016-06-28  0:27         ` Ian Kent
  2016-06-28  7:48           ` Cyril B.
  2016-06-29 12:45           ` Cyril B.
  2016-07-01  7:03         ` Ian Kent
  1 sibling, 2 replies; 17+ messages in thread
From: Ian Kent @ 2016-06-28  0:27 UTC (permalink / raw)
  To: Cyril B., autofs

On Mon, 2016-06-27 at 16:04 +0200, Cyril B. wrote:
> On 06/27/2016 02:26 AM, Ian Kent wrote:
> > How is autofs configured.
> > 
> > If --disable-mount-locking is not used then any mount can block all other
> > mounts, if it is used then there can be mtab corruption if still using a
> > text
> > based mtab.
> 
> I use --disable-mount-locking.
> 
> > I always use --disable-mount-locking and nowadays the mtab is usually a
> > symlink
> > into the proc file system so corruption isn't a problem.
> 
> /etc/mtab is actually not a symlink on my systems.
> 
> 
> Anyway, I have more details for you as the issue appeared today and I 
> could investigate some more. This is on a server that only mounts one 
> single NFS server (http12), so the multi-servers blocking issue is 
> irrelevant here.
> 
> A few minutes before the "deadlock" occurred, /nfs/http12 was unmounted 
> by autofs, I assume because it was idle. I have TIMEOUT=600. That 
> explains why the issue appears much more frequently on a server which is 
> way less busy (and usually in the middle of the night): the NFS server 
> needs to be idle enough to be unmounted.
> 
> However, I still had many /home/userX mounted (by autofs), which point 
> to /nfs/http12/userX. Shouldn't autofs not unmount /nfs/http12 when at 
> least one /home/userX is mounted? To be clear, here's an extract from my 
> /proc/mounts BEFORE the NFS server is unmounted by autofs:

This doesn't look like the full picture.
Where are the autofs file system mounts?

Mounted doesn't necessarily mean busy or in use but that also depends on the
mount hierarchy and how it has been constructed.

And symlinks are never "busy"; they can't be a pwd and they aren't opened as
a file.

> 
> http12:/ /nfs/http12 nfs4 rw,nosuid,noatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,soft,proto=tcp6,timeo=1000,retrans=2,sec=sys,clientaddr=2a00:42:1:50:1::1,local_lock=none,addr=2a00:42:1:20:1::1 0 0

This looks like an nfs4 fsid 0 mount.

Is this one mounted by a program map similar to what you described earlier?
Do you then rely on the nfs cross device automounting to mount the user mounts?

Or is this one mounted external to autofs and used by autofs automounts?
Or are you using autofs to mount this one and some other map to mount the other
mounts?

> http12://user1 /home/user1 nfs4 rw,nosuid,noatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,soft,proto=tcp6,timeo=1000,retrans=2,sec=sys,clientaddr=2a00:42:1:50:1::1,local_lock=none,addr=2a00:42:1:20:1::1 0 0
> http12://user2 /home/user2 nfs4 rw,nosuid,noatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,soft,proto=tcp6,timeo=1000,retrans=2,sec=sys,clientaddr=2a00:42:1:50:1::1,local_lock=none,addr=2a00:42:1:20:1::1 0 0

But then there are these, that don't seem to be related to the mount above,
perhaps they reference the first mount. 

Can you describe again how this fits together?

At this point a full debug log would probably answer most of my questions.

> 
> 
> Also, I couldn't find any blocked mount process that would explain the 
> "deadlock". I had a 'ps aux|grep mount' done every 10 seconds:
> 
> Mon Jun 27 05:00:00 CEST 2016
> root        3437  0.0  0.0 218676  5500 ?        Ssl  Jun24   0:26 
> /usr/sbin/automount --pid-file /var/run/autofs.pid
> 
> Mon Jun 27 05:00:10 CEST 2016
> root        3437  0.0  0.0 218676  5500 ?        Ssl  Jun24   0:27 
> /usr/sbin/automount --pid-file /var/run/autofs.pid
> root     2618146  0.0  0.0 218676  2168 ?        S    05:00   0:00 
> /usr/sbin/automount --pid-file /var/run/autofs.pid
> root     2618214  0.0  0.0 218676  2168 ?        S    05:00   0:00 
> /usr/sbin/automount --pid-file /var/run/autofs.pid
> root     2618215  0.0  0.0      0     0 ?        Z    05:00   0:00 
> [umount] <defunct>
> root     2618224  0.0  0.0 218676  2168 ?        S    05:00   0:00 
> /usr/sbin/automount --pid-file /var/run/autofs.pid
> root     2618227  0.0  0.0 218676  2168 ?        S    05:00   0:00 
> /usr/sbin/automount --pid-file /var/run/autofs.pid
> root     2618230  0.0  0.0      0     0 ?        Z    05:00   0:00 
> [umount] <defunct>
> root     2618240  0.0  0.0 218676  2168 ?        S    05:00   0:00 
> /usr/sbin/automount --pid-file /var/run/autofs.pid
> root     2618248  0.0  0.0 218676  2168 ?        S    05:00   0:00 
> /usr/sbin/automount --pid-file /var/run/autofs.pid
> root     2618250  0.0  0.0 218676  2168 ?        S    05:00   0:00 
> /usr/sbin/automount --pid-file /var/run/autofs.pid
> root     2618252  0.1  0.0 218676  2168 ?        S    05:00   0:00 
> /usr/sbin/automount --pid-file /var/run/autofs.pid
> 
> Mon Jun 27 05:00:20 CEST 2016
> root        3437  0.0  0.0 218676  5500 ?        Ssl  Jun24   0:27 
> /usr/sbin/automount --pid-file /var/run/autofs.pid
> root     2618146  0.0  0.0 218676  2168 ?        S    05:00   0:00 
> /usr/sbin/automount --pid-file /var/run/autofs.pid
> root     2618214  0.0  0.0 218676  2168 ?        S    05:00   0:00 
> /usr/sbin/automount --pid-file /var/run/autofs.pid
> root     2618224  0.0  0.0 218676  2168 ?        S    05:00   0:00 
> /usr/sbin/automount --pid-file /var/run/autofs.pid
> root     2618227  0.0  0.0 218676  2168 ?        S    05:00   0:00 
> /usr/sbin/automount --pid-file /var/run/autofs.pid
> root     2618240  0.0  0.0 218676  2168 ?        S    05:00   0:00 
> /usr/sbin/automount --pid-file /var/run/autofs.pid
> root     2618248  0.0  0.0 218676  2168 ?        S    05:00   0:00 
> /usr/sbin/automount --pid-file /var/run/autofs.pid
> root     2618250  0.0  0.0 218676  2168 ?        S    05:00   0:00 
> /usr/sbin/automount --pid-file /var/run/autofs.pid
> root     2618252  0.0  0.0 218676  2168 ?        S    05:00   0:00 
> /usr/sbin/automount --pid-file /var/run/autofs.pid
> root     2618701  0.0  0.0 218676  2168 ?        S    05:00   0:00 
> /usr/sbin/automount --pid-file /var/run/autofs.pid
> root     2618702  0.0  0.0 218676  2168 ?        S    05:00   0:00 
> /usr/sbin/automount --pid-file /var/run/autofs.pid
> 
> And it remained in that state afterwards. I don't know if the defunct 
> umount are suspicious, I guess not.
> 
> One last thing: a manual umount of /home/userY was done by a script at 
> 6:26 (/home/userY was NOT mounted though), and it remained blocked. I'm 
> not sure if it's a consequence of autofs being blocked or something else.
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Regular deadlocks
  2016-06-28  0:27         ` Ian Kent
@ 2016-06-28  7:48           ` Cyril B.
  2016-07-01  7:28             ` Ian Kent
  2016-06-29 12:45           ` Cyril B.
  1 sibling, 1 reply; 17+ messages in thread
From: Cyril B. @ 2016-06-28  7:48 UTC (permalink / raw)
  To: Ian Kent, autofs

On 06/28/2016 02:27 AM, Ian Kent wrote:
> Can you describe again how this fits together?

Sure. Please let me know if I'm not clear. Here's my /etc/auto.master:

/nfs program:/etc/auto.nfs nobind
/home program:/etc/auto.home --mode=751

/etc/auto.nfs does some sanity checks, then does:
echo -fstype=nfs4,noatime,nosuid,_netdev,soft,intr,timeo=1000 $1:/

/etc/auto.home looks up in an internal database which NFS server the 
account is on, then does:
echo -fstype=bind :/nfs/httpX/$1

So /nfs and /home are entirely handled by autofs. There's no NFS or user 
mount handled by something else.

On the NFS servers, /home is a standard filesystem (XFS), here's the 
/etc/exports:

/home  1.2.3.4  (rw,async,insecure,no_subtree_check,no_root_squash,fsid=0)

On my server that deadlocks the most, all users are on the same NFS 
server (named http12).

> At this point a full debug log would probably answer most of my questions.

I've just enabled debug logs, I'll send them to you as soon as I have a 
new deadlock.

Thanks again.

-- 
Cyril B.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Regular deadlocks
  2016-06-28  0:27         ` Ian Kent
  2016-06-28  7:48           ` Cyril B.
@ 2016-06-29 12:45           ` Cyril B.
  1 sibling, 0 replies; 17+ messages in thread
From: Cyril B. @ 2016-06-29 12:45 UTC (permalink / raw)
  To: Ian Kent, autofs

On 06/28/2016 02:27 AM, Ian Kent wrote:
> At this point a full debug log would probably answer most of my questions.

I have a full debug log for you. I don't know what the policy is for 
sending attachments on the mailing list, so I'll send the log to you only.

The "deadlock" happened at 22:00, /nfs/http12 was unmounted shortly before:

Jun 28 21:55:21 ssh1 automount[3848595]: umount_subtree_mounts: 
unmounting dir = /nfs/http12

Here's a 'ps aux|grep mount' from this morning, before I cleaned it up:

root      153793  0.0  0.0 271852  1868 ?        S    Jun28   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root      153798  0.0  0.0 271852  1868 ?        S    Jun28   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root      153803  0.0  0.0 271852  1868 ?        S    Jun28   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root      153808  0.0  0.0      0     0 ?        Z    Jun28   0:00 
[mount] <defunct>
root      153819  0.0  0.0 271852  1868 ?        S    Jun28   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root      153860  0.0  0.0 271852  1868 ?        S    Jun28   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root      153861  0.0  0.0 271852  1868 ?        S    Jun28   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root      154529  0.0  0.0 271852  1868 ?        S    Jun28   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root      155381  0.0  0.0 271852  1868 ?        S    Jun28   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root      156565  0.0  0.0 272880  1900 ?        S    Jun28   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root      176662  0.0  0.0   3944   392 ?        S    Jun28   0:00 
/bin/sh -c /bin/umount /home/bikes
root      176663  0.0  0.0   8124   480 ?        S    Jun28   0:00 
/bin/umount /home/bikes
root      176671  0.0  0.0 274936  1964 ?        S    Jun28   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root      195792  0.0  0.0 275964  1996 ?        S    Jun28   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root      195817  0.0  0.0 275964  1996 ?        S    Jun28   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root      228410  0.0  0.0 276992  2028 ?        S    Jun28   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root      232950  0.0  0.0 279048  2092 ?        S    00:00   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root      232969  0.0  0.0 279048  2092 ?        S    00:00   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root      236404  0.0  0.0 280076  2124 ?        S    00:05   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root      262934  0.0  0.0 281104  2156 ?        S    00:39   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root      340095  0.0  0.0 283160  2220 ?        S    02:55   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root      344231  0.0  0.0 283160  2220 ?        S    03:05   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root      361273  0.0  0.0 284188  2252 ?        S    03:55   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root      384154  0.0  0.0 287272  2348 ?        S    05:00   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root      384165  0.0  0.0 287272  2348 ?        S    05:00   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root      384170  0.0  0.0 287272  2348 ?        S    05:00   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root      407146  0.0  0.0 288300  2380 ?        S    06:01   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root      595311  0.0  0.0 289328  2412 ?        S    09:34   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root      597797  0.0  0.0 290356  2444 ?        S    09:40   0:00 
/usr/sbin/automount --pid-file /var/run/autofs.pid
root     3848595  0.0  0.0 291384  5756 ?        Ssl  Jun28   0:21 
/usr/sbin/automount --pid-file /var/run/autofs.pid

The umount '/home/bikes' was not done by autofs, it was done by a script.

-- 
Cyril B.
--
To unsubscribe from this list: send the line "unsubscribe autofs" in

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Regular deadlocks
  2016-06-27 14:04       ` Cyril B.
  2016-06-28  0:27         ` Ian Kent
@ 2016-07-01  7:03         ` Ian Kent
  1 sibling, 0 replies; 17+ messages in thread
From: Ian Kent @ 2016-07-01  7:03 UTC (permalink / raw)
  To: Cyril B., autofs

On Mon, 2016-06-27 at 16:04 +0200, Cyril B. wrote:
> On 06/27/2016 02:26 AM, Ian Kent wrote:
> > How is autofs configured.
> > 
> > If --disable-mount-locking is not used then any mount can block all other
> > mounts, if it is used then there can be mtab corruption if still using a
> > text
> > based mtab.
> 
> I use --disable-mount-locking.
> 
> > I always use --disable-mount-locking and nowadays the mtab is usually a
> > symlink
> > into the proc file system so corruption isn't a problem.
> 
> /etc/mtab is actually not a symlink on my systems.
> 
> 
> Anyway, I have more details for you as the issue appeared today and I 
> could investigate some more. This is on a server that only mounts one 
> single NFS server (http12), so the multi-servers blocking issue is 
> irrelevant here.
> 
> A few minutes before the "deadlock" occurred, /nfs/http12 was unmounted 
> by autofs, I assume because it was idle. I have TIMEOUT=600. That 
> explains why the issue appears much more frequently on a server which is 
> way less busy (and usually in the middle of the night): the NFS server 
> needs to be idle enough to be unmounted.

That does seem to be causing a problem.

The mount request for /nfs/http12 doesn't seem to be able to make progress but
that could be due to what looks like a signal handling problem, not sure.

There are a bunch of processes blocked on poll(2), waiting for input from a pipe
that probably belongs to a process that has died (quite a few of them).

You would think that poll(2) would get a SIGCHLD signal when the child process
terminates but, unfortunately, that can't be relied upon in a threaded
application.

Only a single thread of those that don't have SIGCHLD blocked will receive the
signal, and that might not be the thread that fork(2)ed the child, and if there
are multiple signals sent at the same time the number of signals delivered might
not match the number of processes that sent the signal.

So I think the first thing to try will be to change the logic around the poll(2)
call in the timed_wait() function to be non-blocking and check for child process
existence before waiting on poll(2) again.
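
The idea of waking up periodically and re-checking that the child still exists, rather than blocking until a signal arrives, can be illustrated outside of autofs. The following is a shell analogue only, not the autofs C code and not the eventual change to timed_wait(); the function name wait_child and the 124 timeout status are choices made for this sketch:

```sh
#!/bin/sh
# wait_child PID TIMEOUT_SECONDS
# Wake up once a second and check that the child still exists instead of
# blocking indefinitely on it. Returns the child's exit status, or 124 if
# the timeout expires first (124 is an arbitrary choice for this sketch).
wait_child() {
    pid=$1
    remaining=$2
    # kill -0 sends no signal; it only tests whether the process exists.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$remaining" -le 0 ]; then
            return 124
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    # The child is gone; collect its exit status.
    wait "$pid"
}
```

For example, "some_command & wait_child $! 30" gives up after roughly 30 seconds instead of hanging if the child never finishes, and it does not depend on a SIGCHLD reaching the waiting code.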

That's probably not going to help with whatever has caused a problem with
mount(8) (or probably mount.nfs(8)) but it will provide the opportunity to put
some logging in to try and get more information on it. Not only that, you will
then likely get a bunch of mount failures for mounts that shouldn't have failed.

The really annoying thing is that there is no output at all from any of the
child processes that must have been forked.

Anyway, that's going to take a while.

> 
> However, I still had many /home/userX mounted (by autofs), which point 
> to /nfs/http12/userX. Shouldn't autofs not unmount /nfs/http12 when at 
> least one /home/userX is mounted? To be clear, here's an extract from my 
> /proc/mounts BEFORE the NFS server is unmounted by autofs:

That's another fairly difficult question.

First it's the kernel dentry corresponding to /nfs/http12 that holds the
last_used counter that determines if the dentry hasn't been used for the given
timeout. For that timeout to occur the dentry must not have been busy during
that time which means no open file handles, no working directories open within
it and no activity that would update the last_used value (not usually plain path
walks).

Then there's the question of bind mounting.

I think that when you bind mount the result is an independent mount but just how
that is handled when bind mounting a sub directory of a mount isn't clear. The
output of /proc/mounts (last time I looked at this case) makes it look like the
parent mount is used.

So it's not clear what's going on there.
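
To see which mount points currently claim a given NFS server as their source, the device field of /proc/mounts can be filtered. A minimal sketch, assuming a POSIX shell with awk; the function name mounts_from_server is hypothetical, and this only lists what /proc/mounts reports; it does not show how the kernel relates the mounts internally:

```sh
#!/bin/sh
# List mount points whose source device belongs to a given NFS server.
# Usage: mounts_from_server SERVER [MOUNTS_FILE]
# MOUNTS_FILE defaults to /proc/mounts; the function name is hypothetical.
mounts_from_server() {
    server=$1
    file=${2:-/proc/mounts}
    # Field 1 is the device ("http12:/" or "http12://user1"),
    # field 2 is the mount point.
    awk -v srv="$server" 'index($1, srv ":") == 1 { print $2 }' "$file"
}
```

With the /proc/mounts extract quoted earlier, "mounts_from_server http12" would list /nfs/http12 together with the /home/userX bind mounts, since all of them report an http12 device.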

If the parent mount is used for each bound mount then there would be multiple
independent mounts each able to be umounted independently. For a start that
implies the busyness of each of these mounts can't influence the busyness of
the others, nor that of the parent itself.

That sounds a bit strange I know but it would take a lot of time trawling the
VFS to really understand what is going on there.

We do however see that /nfs/http12 can be umounted so I think we can assume
something similar to what I describe is the way it is.

I don't know yet what that means for the scenario here, what I've suggested
above needs to be done first I think.

Ian

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Regular deadlocks
  2016-06-28  7:48           ` Cyril B.
@ 2016-07-01  7:28             ` Ian Kent
  2016-07-01  7:48               ` Cyril B.
  0 siblings, 1 reply; 17+ messages in thread
From: Ian Kent @ 2016-07-01  7:28 UTC (permalink / raw)
  To: Cyril B., autofs

On Tue, 2016-06-28 at 09:48 +0200, Cyril B. wrote:
> On 06/28/2016 02:27 AM, Ian Kent wrote:
> > Can you describe again how this fits together?
> 
> Sure. Please let me know if I'm not clear. Here's my /etc/auto.master:

Thanks for this, after reading this I see I should have been able to work it out
from your original post.

Usually I don't like to question why people set things up the way they do, in
fact I find it annoying when someone asks a question and people respond with
"don't do it that way, do it this way" because, usually, if that was an option
the poster wouldn't have asked the question.

But in this case I think it's worth mentioning because due to the mount
independence required by kernel mount namespaces the problem we are seeing with
/nfs/http12 might not be easily solvable or solvable at all. 

> 
> /nfs program:/etc/auto.nfs nobind
> /home program:/etc/auto.home --mode=751
> 
> /etc/auto.nfs does some sanity checks, then does:
> echo -fstype=nfs4,noatime,nosuid,_netdev,soft,intr,timeo=1000 $1:/

Is there a reason you need to use the nfsv4 root namespace rather than using the
/home export directly?

That was a problem, particularly for autofs, and needed to be done long ago but
that has long since changed.

You should be able to use nfsv4 mounts directly without this intermediate mount.

Ian

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Regular deadlocks
  2016-07-01  7:28             ` Ian Kent
@ 2016-07-01  7:48               ` Cyril B.
  2016-07-28  7:36                 ` Cyril B.
  0 siblings, 1 reply; 17+ messages in thread
From: Cyril B. @ 2016-07-01  7:48 UTC (permalink / raw)
  To: Ian Kent, autofs

On 07/01/2016 09:28 AM, Ian Kent wrote:
>> /nfs program:/etc/auto.nfs nobind
>> /home program:/etc/auto.home --mode=751
>>
>> /etc/auto.nfs does some sanity checks, then does:
>> echo -fstype=nfs4,noatime,nosuid,_netdev,soft,intr,timeo=1000 $1:/
>
> Is there a reason you need to use the nfsv4 root namespace rather than using the
> /home export directly?

You mean get rid of the bind mounts and have /home/foo be a NFSv4 mount 
to (e.g.) http12:/foo?

If so, I assumed that it would mean having a lot of independent mounts 
over NFS (and as many TCP connections between the NFS client and 
server), and doing a single NFS mount with local bind mounts would be 
better.

However, my assumption may very well be wrong. Would the kernel share 
the same underlying NFS mount (and TCP connection) for all those mount 
points to the same NFS server?

-- 
Cyril B.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Regular deadlocks
  2016-07-01  7:48               ` Cyril B.
@ 2016-07-28  7:36                 ` Cyril B.
  0 siblings, 0 replies; 17+ messages in thread
From: Cyril B. @ 2016-07-28  7:36 UTC (permalink / raw)
  To: Ian Kent, autofs

On 07/01/2016 09:48 AM, Cyril B. wrote:
>> Is there a reason you need to use the nfsv4 root namespace rather than
>> using the
>> /home export directly?
>
> You mean get rid of the bind mounts and have /home/foo be a NFSv4 mount
> to (e.g.) http12:/foo?
>
> If so, I assumed that it would mean having a lot of independent mounts
> over NFS (and as many TCP connections between the NFS client and
> server), and doing a single NFS mount with local bind mounts would be
> better.
>
> However, my assumption may very well be wrong. Would the kernel share
> the same underlying NFS mount (and TCP connection) for all those mount
> points to the same NFS server?


Well, I've been doing that for the past month and everything works 
rather flawlessly. In particular, deadlocks are gone, so I'll keep that 
setup.
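
For readers who want to try the same change, a program map that emits a direct NFSv4 mount instead of a bind mount might look roughly like this. It is a sketch under assumptions and not the script actually in use: the server lookup is a placeholder (the thread only says it comes from an internal database), the names lookup_server and map_entry are hypothetical, and the mount options are simply carried over from the /etc/auto.nfs entry quoted earlier:

```sh
#!/bin/sh
# Hypothetical program map for /home that emits a direct NFSv4 mount instead
# of a bind mount onto /nfs/<server>. Not the poster's actual script.

# lookup_server USER: print the NFS server holding USER's home.
# Placeholder only: the real lookup is an internal database query.
lookup_server() {
    echo "http12"
}

# map_entry USER: print the autofs map entry for USER.
map_entry() {
    user=$1
    server=$(lookup_server "$user") || return 1
    # With fsid=0 on the server's /home export, <server>:/<user> is the home.
    echo "-fstype=nfs4,noatime,nosuid,_netdev,soft,intr,timeo=1000 $server:/$user"
}
```

Used as /etc/auto.home, the script would end with a call such as map_entry "$1", since autofs passes the lookup key as the first argument. For the key foo it would print an entry pointing at http12:/foo, so no intermediate /nfs/http12 mount is involved.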

-- 
Cyril B.

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2016-07-28  7:36 UTC | newest]

Thread overview: 17+ messages
2016-06-25 23:41 Regular deadlocks Cyril B.
2016-06-26  1:59 ` Ian Kent
2016-06-26 10:02   ` Cyril B.
2016-06-26 10:30     ` Ian Kent
2016-06-26 11:13       ` Cyril B.
2016-06-27  0:26     ` Ian Kent
2016-06-27  0:35       ` Ian Kent
2016-06-27 14:04       ` Cyril B.
2016-06-28  0:27         ` Ian Kent
2016-06-28  7:48           ` Cyril B.
2016-07-01  7:28             ` Ian Kent
2016-07-01  7:48               ` Cyril B.
2016-07-28  7:36                 ` Cyril B.
2016-06-29 12:45           ` Cyril B.
2016-07-01  7:03         ` Ian Kent
2016-06-26 10:59 ` Ian Kent
2016-06-26 11:02 ` Ian Kent
