* Regular deadlocks
@ 2016-06-25 23:41 Cyril B.
From: Cyril B. @ 2016-06-25 23:41 UTC
To: autofs
Hello,
I have regular deadlocks using autofs 5.1.2 (it also happened on 5.1.1)
on my servers, typically about once every 2 or 3 days.
I already posted on this mailing list back in 2015 about a bug that also
triggered deadlocks (and which was fixed), so I'll copy/paste parts of my
original message, as my config hasn't changed.
/etc/auto.master:
--
/nfs program:/etc/auto.nfs
/home program:/etc/auto.home
--
/etc/auto.nfs is basically returning:
-fstype=nfs4,noatime,nosuid,_netdev,soft,intr,timeo=1000 $1:/
/etc/auto.home:
--
#!/bin/sh
if [ ! -h /var/home/$1 ]
then
exit 1
fi
echo -fstype=bind :$(readlink --no-newline /var/home/$1)
--
So for instance, /var/home/foo would be a symlink pointing to
/nfs/serverX/foo.
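To make the behaviour of the /home map easier to check outside automount, here is a minimal sketch of the same logic as a shell function. The function name home_map_entry and the extra directory parameter are hypothetical; they exist only so the lookup can be tried against a scratch directory instead of /var/home. It assumes, as with any program map, that the key arrives as the first argument, the map entry goes to stdout, and a non-zero exit means "no such key".

```shell
#!/bin/sh
# Hypothetical helper mirroring /etc/auto.home. The symlink directory is a
# parameter (an assumption for testing), where the real script uses /var/home.
home_map_entry() {
    linkdir=$1
    key=$2
    # No symlink for this key: report failure, as the real script does with exit 1.
    if [ ! -h "$linkdir/$key" ]; then
        return 1
    fi
    # Command substitution strips the trailing newline, so plain readlink is enough.
    printf '%s\n' "-fstype=bind :$(readlink "$linkdir/$key")"
}
```

With a symlink foo pointing to /nfs/serverX/foo in the given directory, home_map_entry prints "-fstype=bind :/nfs/serverX/foo", which is the entry automount then bind-mounts on /home/foo.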
Kernel: Linux 4.4.7.
My servers have cronjobs that trigger /home/userX mounts at basically
the same time (when the jobs start). I have 2 servers with the same
config, but one of them has MANY more users/cronjobs and, oddly enough,
the deadlock happens much less frequently there.
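The access pattern described above (many keys under the same autofs mount point being looked up at the same moment) can be approximated by hand with something like the following. This is only a sketch: trigger_mounts is a hypothetical name, and the real cronjobs do more than list a directory. On a healthy system it returns once every lookup has completed; during a deadlock it would be expected to hang in the final loop.

```shell
#!/bin/sh
# Hypothetical reproduction helper: access several automount keys at once.
# $1 is the autofs mount point (e.g. /home); the remaining arguments are keys.
trigger_mounts() {
    base=$1
    shift
    pids=""
    for key in "$@"; do
        # Accessing the directory is what triggers the automount; run each
        # lookup in the background so they all start close together.
        ls -d "$base/$key/." >/dev/null 2>&1 &
        pids="$pids $!"
    done
    failed=0
    for pid in $pids; do
        # Collect each lookup's exit status; a hang here means a stuck mount.
        wait "$pid" || failed=$((failed + 1))
    done
    echo "failed=$failed"
}
```

For example, trigger_mounts /home user1 user2 user3 starts three lookups together and reports how many of them failed.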
Anyway, here's a 'ps faux' a few hours after the deadlock started:
root 3437 0.0 0.0 217644 5428 ? Ssl Jun24 0:14 /usr/sbin/automount --pid-file /var/run/autofs.pid
root 1269869 0.0 0.0 209420 1904 ? S 18:00 0:00 \_ /usr/sbin/automount --pid-file /var/run/autofs.pid
root 1269915 0.0 0.0 209420 1904 ? S 18:00 0:00 \_ /usr/sbin/automount --pid-file /var/run/autofs.pid
root 1269920 0.0 0.0 0 0 ? Z 18:00 0:00 \_ [auto.nfs] <defunct>
root 1269923 0.0 0.0 209420 1904 ? S 18:00 0:00 \_ /usr/sbin/automount --pid-file /var/run/autofs.pid
root 1269931 0.0 0.0 209420 1904 ? S 18:00 0:00 \_ /usr/sbin/automount --pid-file /var/run/autofs.pid
root 1269932 0.0 0.0 209420 1904 ? S 18:00 0:00 \_ /usr/sbin/automount --pid-file /var/run/autofs.pid
root 1269933 0.0 0.0 209420 1904 ? S 18:00 0:00 \_ /usr/sbin/automount --pid-file /var/run/autofs.pid
root 1269934 0.0 0.0 209420 1904 ? S 18:00 0:00 \_ /usr/sbin/automount --pid-file /var/run/autofs.pid
root 1269937 0.0 0.0 209420 1904 ? S 18:00 0:00 \_ /usr/sbin/automount --pid-file /var/run/autofs.pid
root 1271340 0.0 0.0 209420 1904 ? S 18:01 0:00 \_ /usr/sbin/automount --pid-file /var/run/autofs.pid
root 1271943 0.0 0.0 209420 1904 ? S 18:03 0:00 \_ /usr/sbin/automount --pid-file /var/run/autofs.pid
root 1272408 0.0 0.0 209420 1904 ? S 18:05 0:00 \_ /usr/sbin/automount --pid-file /var/run/autofs.pid
root 1273323 0.0 0.0 210448 1936 ? S 18:08 0:00 \_ /usr/sbin/automount --pid-file /var/run/autofs.pid
root 1274002 0.0 0.0 211476 1968 ? S 18:10 0:00 \_ /usr/sbin/automount --pid-file /var/run/autofs.pid
root 1276131 0.0 0.0 213532 2032 ? S 18:20 0:00 \_ /usr/sbin/automount --pid-file /var/run/autofs.pid
root 1278540 0.0 0.0 213532 2032 ? S 18:30 0:00 \_ /usr/sbin/automount --pid-file /var/run/autofs.pid
root 1302389 0.0 0.0 214560 2064 ? S 20:00 0:00 \_ /usr/sbin/automount --pid-file /var/run/autofs.pid
root 1317670 0.0 0.0 215588 2096 ? S 20:52 0:00 \_ /usr/sbin/automount --pid-file /var/run/autofs.pid
root 1319408 0.0 0.0 216616 2128 ? S 21:00 0:00 \_ /usr/sbin/automount --pid-file /var/run/autofs.pid
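To keep an eye on how the listing above grows over time, the process states can be tallied with a small filter. This is a rough sketch, not something from autofs itself: summarise_states is a hypothetical name, and it assumes the usual `ps aux` column layout, where the 8th field is STAT and a leading Z marks a defunct process.

```shell
#!/bin/sh
# Hypothetical helper: tally process states from `ps aux`-style lines on stdin.
# Assumes the 8th whitespace-separated field is STAT, as in the listing above.
summarise_states() {
    awk '
        $1 == "USER" { next }        # skip the ps header line, if present
        $8 ~ /^Z/    { zombies++ }   # defunct children, e.g. [auto.nfs] <defunct>
        $8 ~ /^S/    { sleeping++ }  # sleeping processes (S, Ssl, ...)
        END { printf "sleeping=%d zombies=%d\n", sleeping, zombies }
    '
}
```

One possible use is ps faux | grep -e automount -e auto.nfs | summarise_states, run periodically while the deadlock is in progress.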
Backtraces of the main process (3437):
Thread 24 (Thread 0x404a2950 (LWP 3438)):
#0 0x00007f438c7e8fad in pthread_cond_timedwait@@GLIBC_2.3.2 () from
/lib/libpthread.so.0
#1 0x000055e6c5bd28e6 in alarm_handler (arg=<optimized out>) at alarm.c:191
#2 0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#3 0x00007f438b9ad64d in clone () from /lib/libc.so.6
#4 0x0000000000000000 in ?? ()
Thread 23 (Thread 0x4182f950 (LWP 3439)):
#0 0x00007f438c7e8d29 in pthread_cond_wait@@GLIBC_2.3.2 () from
/lib/libpthread.so.0
#1 0x000055e6c5bc7205 in st_queue_handler (arg=<optimized out>) at
state.c:1101
#2 0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#3 0x00007f438b9ad64d in clone () from /lib/libc.so.6
#4 0x0000000000000000 in ?? ()
Thread 22 (Thread 0x42030950 (LWP 3442)):
#0 0x00007f438b9a4bd6 in poll () from /lib/libc.so.6
#1 0x000055e6c5bbabce in get_pkt (pkt=<optimized out>, ap=<optimized
out>) at automount.c:1009
#2 handle_packet (ap=<optimized out>) at automount.c:1160
#3 handle_mounts (arg=<optimized out>) at automount.c:1834
#4 0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#5 0x00007f438b9ad64d in clone () from /lib/libc.so.6
#6 0x0000000000000000 in ?? ()
Thread 21 (Thread 0x417f3950 (LWP 3445)):
#0 0x00007f438b9a4bd6 in poll () from /lib/libc.so.6
#1 0x000055e6c5bbabce in get_pkt (pkt=<optimized out>, ap=<optimized
out>) at automount.c:1009
#2 handle_packet (ap=<optimized out>) at automount.c:1160
#3 handle_mounts (arg=<optimized out>) at automount.c:1834
#4 0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#5 0x00007f438b9ad64d in clone () from /lib/libc.so.6
#6 0x0000000000000000 in ?? ()
Thread 20 (Thread 0x409a0950 (LWP 1269762)):
#0 0x00007f438b9a4bd6 in poll () from /lib/libc.so.6
#1 0x000055e6c5bc1474 in timed_read (time=<optimized out>,
len=<optimized out>, buf=<optimized out>, pipe=<optimized out>) at
spawn.c:107
#2 do_spawn (logopt=0, wait=4294967295, options=<optimized out>,
prog=<optimized out>, argv=<optimized out>) at spawn.c:272
#3 0x000055e6c5bc2050 in spawn_bind_mount (logopt=0) at spawn.c:538
#4 0x00007f438a9a88f8 in mount_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x4099d9b0 "user1", name_len=12,
what=0x4099d980 "/nfs/http12/user1", fstype=0x4099d9d0 "bind",
options=0x7f438a9a9716 "defaults", context=0x55e6c642cb00) at
mount_bind.c:170
#5 0x000055e6c5bc34ad in do_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x4099d9b0 "user1", name_len=12,
what=0x4099d980 "/nfs/http12/user1", fstype=0x4099d9d0 "bind",
options=0x4099d9f0 "") at mount.c:78
#6 0x00007f438ade8a43 in sun_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x4099e000 "user1", namelen=12,
loc=0x55e6c642cbb0 ":/nfs/http12/user1", loclen=25,
options=0x4099d9f0 "", ctxt=0x55e6c63f7770) at parse_sun.c:712
#7 0x00007f438adeacc2 in parse_mount (ap=0x55e6c63f9030,
name=0x4099e000 "user1", name_len=12, mapent=<optimized out>,
context=<optimized out>) at parse_sun.c:1673
#8 0x00007f438b018be1 in lookup_mount (ap=0x55e6c63f9030,
name=0x4099e000 "user1", name_len=12, context=0x55e6c63f76f0) at
lookup_program.c:691
#9 0x000055e6c5bc4072 in do_lookup_mount (ap=0x55e6c63f9030,
map=0x55e6c63f9150, name=0x4099e000 "user1", name_len=12) at lookup.c:766
#10 0x000055e6c5bc4642 in do_name_lookup_mount (name_len=<optimized
out>, name=<optimized out>, map=<optimized out>, ap=<optimized out>) at
lookup.c:945
#11 lookup_nss_mount (ap=0x55e6c63f9030, source=0x0, name=0x4099e000
"user1", name_len=12) at lookup.c:1110
#12 0x000055e6c5bbb4d6 in do_mount_indirect (arg=<optimized out>) at
indirect.c:772
#13 0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#14 0x00007f438b9ad64d in clone () from /lib/libc.so.6
#15 0x0000000000000000 in ?? ()
Thread 19 (Thread 0x42232950 (LWP 1269775)):
#0 0x00007f438b9a4bd6 in poll () from /lib/libc.so.6
#1 0x000055e6c5bc1474 in timed_read (time=<optimized out>,
len=<optimized out>, buf=<optimized out>, pipe=<optimized out>) at
spawn.c:107
#2 do_spawn (logopt=0, wait=4294967295, options=<optimized out>,
prog=<optimized out>, argv=<optimized out>) at spawn.c:272
#3 0x000055e6c5bc2050 in spawn_bind_mount (logopt=0) at spawn.c:538
#4 0x00007f438a9a88f8 in mount_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x4222f9b0 "user2", name_len=8,
what=0x4222f980 "/nfs/http12/user2", fstype=0x4222f9d0 "bind",
options=0x7f438a9a9716 "defaults", context=0x55e6c64311d0) at
mount_bind.c:170
#5 0x000055e6c5bc34ad in do_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x4222f9b0 "user2", name_len=8,
what=0x4222f980 "/nfs/http12/user2", fstype=0x4222f9d0 "bind",
options=0x4222f9f0 "") at mount.c:78
#6 0x00007f438ade8a43 in sun_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x42230000 "user2", namelen=8,
loc=0x55e6c6431810 ":/nfs/http12/user2", loclen=21, options=0x4222f9f0 "",
ctxt=0x55e6c63f7770) at parse_sun.c:712
#7 0x00007f438adeacc2 in parse_mount (ap=0x55e6c63f9030,
name=0x42230000 "user2", name_len=8, mapent=<optimized out>,
context=<optimized out>) at parse_sun.c:1673
#8 0x00007f438b018be1 in lookup_mount (ap=0x55e6c63f9030,
name=0x42230000 "user2", name_len=8, context=0x55e6c63f76f0) at
lookup_program.c:691
#9 0x000055e6c5bc4072 in do_lookup_mount (ap=0x55e6c63f9030,
map=0x55e6c63f9150, name=0x42230000 "user2", name_len=8) at lookup.c:766
#10 0x000055e6c5bc4642 in do_name_lookup_mount (name_len=<optimized
out>, name=<optimized out>, map=<optimized out>, ap=<optimized out>) at
lookup.c:945
#11 lookup_nss_mount (ap=0x55e6c63f9030, source=0x0, name=0x42230000
"user2", name_len=8) at lookup.c:1110
#12 0x000055e6c5bbb4d6 in do_mount_indirect (arg=<optimized out>) at
indirect.c:772
#13 0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#14 0x00007f438b9ad64d in clone () from /lib/libc.so.6
#15 0x0000000000000000 in ?? ()
Thread 18 (Thread 0x42434950 (LWP 1269778)):
#0 0x00007f438b9a4bd6 in poll () from /lib/libc.so.6
#1 0x000055e6c5bc1474 in timed_read (time=<optimized out>,
len=<optimized out>, buf=<optimized out>, pipe=<optimized out>) at
spawn.c:107
#2 do_spawn (logopt=0, wait=4294967295, options=<optimized out>,
prog=<optimized out>, argv=<optimized out>) at spawn.c:272
#3 0x000055e6c5bc2050 in spawn_bind_mount (logopt=0) at spawn.c:538
#4 0x00007f438a9a88f8 in mount_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x424319b0 "user3", name_len=9,
what=0x42431980 "/nfs/http12/user3", fstype=0x424319d0 "bind",
options=0x7f438a9a9716 "defaults", context=0xffffffff00000024) at
mount_bind.c:170
#5 0x000055e6c5bc34ad in do_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x424319b0 "user3", name_len=9,
what=0x42431980 "/nfs/http12/user3", fstype=0x424319d0 "bind",
options=0x424319f0 "") at mount.c:78
#6 0x00007f438ade8a43 in sun_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x42432000 "user3", namelen=9,
loc=0x55e6c6431f20 ":/nfs/http12/user3", loclen=22, options=0x424319f0 "",
ctxt=0x55e6c63f7770) at parse_sun.c:712
#7 0x00007f438adeacc2 in parse_mount (ap=0x55e6c63f9030,
name=0x42432000 "user3", name_len=9, mapent=<optimized out>,
context=<optimized out>) at parse_sun.c:1673
#8 0x00007f438b018be1 in lookup_mount (ap=0x55e6c63f9030,
name=0x42432000 "user3", name_len=9, context=0x55e6c63f76f0) at
lookup_program.c:691
#9 0x000055e6c5bc4072 in do_lookup_mount (ap=0x55e6c63f9030,
map=0x55e6c63f9150, name=0x42432000 "user3", name_len=9) at lookup.c:766
#10 0x000055e6c5bc4642 in do_name_lookup_mount (name_len=<optimized
out>, name=<optimized out>, map=<optimized out>, ap=<optimized out>) at
lookup.c:945
#11 lookup_nss_mount (ap=0x55e6c63f9030, source=0x0, name=0x42432000
"user3", name_len=9) at lookup.c:1110
#12 0x000055e6c5bbb4d6 in do_mount_indirect (arg=<optimized out>) at
indirect.c:772
#13 0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#14 0x00007f438b9ad64d in clone () from /lib/libc.so.6
#15 0x0000000000000000 in ?? ()
Thread 17 (Thread 0x42636950 (LWP 1269781)):
#0 0x00007f438b9a4bd6 in poll () from /lib/libc.so.6
#1 0x000055e6c5bc1474 in timed_read (time=<optimized out>,
len=<optimized out>, buf=<optimized out>, pipe=<optimized out>) at
spawn.c:107
#2 do_spawn (logopt=0, wait=4294967295, options=<optimized out>,
prog=<optimized out>, argv=<optimized out>) at spawn.c:272
#3 0x000055e6c5bc2050 in spawn_bind_mount (logopt=0) at spawn.c:538
#4 0x00007f438a9a88f8 in mount_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x426339b0 "user4", name_len=10,
what=0x42633980 "/nfs/http12/user4", fstype=0x426339d0 "bind",
options=0x7f438a9a9716 "defaults", context=0x55e6c64316d0) at
mount_bind.c:170
#5 0x000055e6c5bc34ad in do_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x426339b0 "user4", name_len=10,
what=0x42633980 "/nfs/http12/user4", fstype=0x426339d0 "bind",
options=0x426339f0 "") at mount.c:78
#6 0x00007f438ade8a43 in sun_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x42634000 "user4", namelen=10,
loc=0x55e6c6431a70 ":/nfs/http12/user4", loclen=23, options=0x426339f0 "",
ctxt=0x55e6c63f7770) at parse_sun.c:712
#7 0x00007f438adeacc2 in parse_mount (ap=0x55e6c63f9030,
name=0x42634000 "user4", name_len=10, mapent=<optimized out>,
context=<optimized out>) at parse_sun.c:1673
#8 0x00007f438b018be1 in lookup_mount (ap=0x55e6c63f9030,
name=0x42634000 "user4", name_len=10, context=0x55e6c63f76f0) at
lookup_program.c:691
#9 0x000055e6c5bc4072 in do_lookup_mount (ap=0x55e6c63f9030,
map=0x55e6c63f9150, name=0x42634000 "user4", name_len=10) at lookup.c:766
#10 0x000055e6c5bc4642 in do_name_lookup_mount (name_len=<optimized
out>, name=<optimized out>, map=<optimized out>, ap=<optimized out>) at
lookup.c:945
#11 lookup_nss_mount (ap=0x55e6c63f9030, source=0x0, name=0x42634000
"user4", name_len=10) at lookup.c:1110
#12 0x000055e6c5bbb4d6 in do_mount_indirect (arg=<optimized out>) at
indirect.c:772
#13 0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#14 0x00007f438b9ad64d in clone () from /lib/libc.so.6
#15 0x0000000000000000 in ?? ()
Thread 16 (Thread 0x40be5950 (LWP 1269783)):
#0 0x00007f438b9a4bd6 in poll () from /lib/libc.so.6
#1 0x000055e6c5bc1474 in timed_read (time=<optimized out>,
len=<optimized out>, buf=<optimized out>, pipe=<optimized out>) at
spawn.c:107
#2 do_spawn (logopt=0, wait=4294967295, options=<optimized out>,
prog=<optimized out>, argv=<optimized out>) at spawn.c:272
#3 0x000055e6c5bc2050 in spawn_bind_mount (logopt=0) at spawn.c:538
#4 0x00007f438a9a88f8 in mount_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x40be29b0 "user5", name_len=7,
what=0x40be2980 "/nfs/http12/user5", fstype=0x40be29d0 "bind",
options=0x7f438a9a9716 "defaults", context=0x55e6c6431a10) at
mount_bind.c:170
#5 0x000055e6c5bc34ad in do_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x40be29b0 "user5", name_len=7,
what=0x40be2980 "/nfs/http12/user5", fstype=0x40be29d0 "bind",
options=0x40be29f0 "") at mount.c:78
#6 0x00007f438ade8a43 in sun_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x40be3000 "user5", namelen=7,
loc=0x55e6c6431c90 ":/nfs/http12/user5", loclen=20, options=0x40be29f0 "",
ctxt=0x55e6c63f7770) at parse_sun.c:712
#7 0x00007f438adeacc2 in parse_mount (ap=0x55e6c63f9030,
name=0x40be3000 "user5", name_len=7, mapent=<optimized out>,
context=<optimized out>) at parse_sun.c:1673
#8 0x00007f438b018be1 in lookup_mount (ap=0x55e6c63f9030,
name=0x40be3000 "user5", name_len=7, context=0x55e6c63f76f0) at
lookup_program.c:691
#9 0x000055e6c5bc4072 in do_lookup_mount (ap=0x55e6c63f9030,
map=0x55e6c63f9150, name=0x40be3000 "user5", name_len=7) at lookup.c:766
#10 0x000055e6c5bc4642 in do_name_lookup_mount (name_len=<optimized
out>, name=<optimized out>, map=<optimized out>, ap=<optimized out>) at
lookup.c:945
#11 lookup_nss_mount (ap=0x55e6c63f9030, source=0x0, name=0x40be3000
"user5", name_len=7) at lookup.c:1110
#12 0x000055e6c5bbb4d6 in do_mount_indirect (arg=<optimized out>) at
indirect.c:772
#13 0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#14 0x00007f438b9ad64d in clone () from /lib/libc.so.6
#15 0x0000000000000000 in ?? ()
Thread 15 (Thread 0x42a3a950 (LWP 1269785)):
#0 0x00007f438b9a4bd6 in poll () from /lib/libc.so.6
#1 0x000055e6c5bc1474 in timed_read (time=<optimized out>,
len=<optimized out>, buf=<optimized out>, pipe=<optimized out>) at
spawn.c:107
#2 do_spawn (logopt=0, wait=4294967295, options=<optimized out>,
prog=<optimized out>, argv=<optimized out>) at spawn.c:272
#3 0x000055e6c5bc2050 in spawn_bind_mount (logopt=0) at spawn.c:538
#4 0x00007f438a9a88f8 in mount_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x42a379b0 "user6", name_len=12,
what=0x42a37980 "/nfs/http12/user6", fstype=0x42a379d0 "bind",
options=0x7f438a9a9716 "defaults", context=0x0) at mount_bind.c:170
#5 0x000055e6c5bc34ad in do_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x42a379b0 "user6", name_len=12,
what=0x42a37980 "/nfs/http12/user6", fstype=0x42a379d0 "bind",
options=0x42a379f0 "") at mount.c:78
#6 0x00007f438ade8a43 in sun_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x42a38000 "user6", namelen=12,
loc=0x55e6c6432660 ":/nfs/http12/user6", loclen=25,
options=0x42a379f0 "", ctxt=0x55e6c63f7770) at parse_sun.c:712
#7 0x00007f438adeacc2 in parse_mount (ap=0x55e6c63f9030,
name=0x42a38000 "user6", name_len=12, mapent=<optimized out>,
context=<optimized out>) at parse_sun.c:1673
#8 0x00007f438b018be1 in lookup_mount (ap=0x55e6c63f9030,
name=0x42a38000 "user6", name_len=12, context=0x55e6c63f76f0) at
lookup_program.c:691
#9 0x000055e6c5bc4072 in do_lookup_mount (ap=0x55e6c63f9030,
map=0x55e6c63f9150, name=0x42a38000 "user6", name_len=12) at lookup.c:766
#10 0x000055e6c5bc4642 in do_name_lookup_mount (name_len=<optimized
out>, name=<optimized out>, map=<optimized out>, ap=<optimized out>) at
lookup.c:945
#11 lookup_nss_mount (ap=0x55e6c63f9030, source=0x0, name=0x42a38000
"user6", name_len=12) at lookup.c:1110
#12 0x000055e6c5bbb4d6 in do_mount_indirect (arg=<optimized out>) at
indirect.c:772
#13 0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#14 0x00007f438b9ad64d in clone () from /lib/libc.so.6
#15 0x0000000000000000 in ?? ()
Thread 14 (Thread 0x42535950 (LWP 1269810)):
#0 0x00007f438b9a4bd6 in poll () from /lib/libc.so.6
#1 0x000055e6c5bc1474 in timed_read (time=<optimized out>,
len=<optimized out>, buf=<optimized out>, pipe=<optimized out>) at
spawn.c:107
#2 do_spawn (logopt=0, wait=4294967295, options=<optimized out>,
prog=<optimized out>, argv=<optimized out>) at spawn.c:272
#3 0x000055e6c5bc2050 in spawn_bind_mount (logopt=0) at spawn.c:538
#4 0x00007f438a9a88f8 in mount_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x425329c0 "user7", name_len=6,
what=0x42532990 "/nfs/http12/user7", fstype=0x425329e0 "bind",
options=0x7f438a9a9716 "defaults", context=0x55e6c6428ff0) at
mount_bind.c:170
#5 0x000055e6c5bc34ad in do_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x425329c0 "user7", name_len=6,
what=0x42532990 "/nfs/http12/user7", fstype=0x425329e0 "bind",
options=0x42532a00 "") at mount.c:78
#6 0x00007f438ade8a43 in sun_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x42533000 "user7", namelen=6,
loc=0x55e6c64322d0 ":/nfs/http12/user7", loclen=19, options=0x42532a00 "",
ctxt=0x55e6c63f7770) at parse_sun.c:712
#7 0x00007f438adeacc2 in parse_mount (ap=0x55e6c63f9030,
name=0x42533000 "user7", name_len=6, mapent=<optimized out>,
context=<optimized out>) at parse_sun.c:1673
#8 0x00007f438b018be1 in lookup_mount (ap=0x55e6c63f9030,
name=0x42533000 "user7", name_len=6, context=0x55e6c63f76f0) at
lookup_program.c:691
#9 0x000055e6c5bc4072 in do_lookup_mount (ap=0x55e6c63f9030,
map=0x55e6c63f9150, name=0x42533000 "user7", name_len=6) at lookup.c:766
#10 0x000055e6c5bc4642 in do_name_lookup_mount (name_len=<optimized
out>, name=<optimized out>, map=<optimized out>, ap=<optimized out>) at
lookup.c:945
#11 lookup_nss_mount (ap=0x55e6c63f9030, source=0x0, name=0x42533000
"user7", name_len=6) at lookup.c:1110
#12 0x000055e6c5bbb4d6 in do_mount_indirect (arg=<optimized out>) at
indirect.c:772
#13 0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#14 0x00007f438b9ad64d in clone () from /lib/libc.so.6
#15 0x0000000000000000 in ?? ()
Thread 13 (Thread 0x42939950 (LWP 1269829)):
#0 0x00007f438b9a4bd6 in poll () from /lib/libc.so.6
#1 0x000055e6c5bc1474 in timed_read (time=<optimized out>,
len=<optimized out>, buf=<optimized out>, pipe=<optimized out>) at
spawn.c:107
#2 do_spawn (logopt=0, wait=4294967295, options=<optimized out>,
prog=<optimized out>, argv=<optimized out>) at spawn.c:272
#3 0x000055e6c5bc2050 in spawn_bind_mount (logopt=0) at spawn.c:538
#4 0x00007f438a9a88f8 in mount_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x429369c0 "user8", name_len=3,
what=0x429369a0 "/nfs/http12/user8", fstype=0x429369e0 "bind",
options=0x7f438a9a9716 "defaults", context=0x20) at mount_bind.c:170
#5 0x000055e6c5bc34ad in do_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x429369c0 "user8", name_len=3,
what=0x429369a0 "/nfs/http12/user8", fstype=0x429369e0 "bind",
options=0x42936a00 "") at mount.c:78
#6 0x00007f438ade8a43 in sun_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x42937000 "user8", namelen=3,
loc=0x55e6c6432100 ":/nfs/http12/user8", loclen=16, options=0x42936a00 "",
ctxt=0x55e6c63f7770) at parse_sun.c:712
#7 0x00007f438adeacc2 in parse_mount (ap=0x55e6c63f9030,
name=0x42937000 "user8", name_len=3, mapent=<optimized out>,
context=<optimized out>) at parse_sun.c:1673
#8 0x00007f438b018be1 in lookup_mount (ap=0x55e6c63f9030,
name=0x42937000 "user8", name_len=3, context=0x55e6c63f76f0) at
lookup_program.c:691
#9 0x000055e6c5bc4072 in do_lookup_mount (ap=0x55e6c63f9030,
map=0x55e6c63f9150, name=0x42937000 "user8", name_len=3) at lookup.c:766
#10 0x000055e6c5bc4642 in do_name_lookup_mount (name_len=<optimized
out>, name=<optimized out>, map=<optimized out>, ap=<optimized out>) at
lookup.c:945
#11 lookup_nss_mount (ap=0x55e6c63f9030, source=0x0, name=0x42937000
"user8", name_len=3) at lookup.c:1110
#12 0x000055e6c5bbb4d6 in do_mount_indirect (arg=<optimized out>) at
indirect.c:772
#13 0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#14 0x00007f438b9ad64d in clone () from /lib/libc.so.6
#15 0x0000000000000000 in ?? ()
Thread 12 (Thread 0x42838950 (LWP 1269881)):
#0 0x00007f438c7eb7db in read () from /lib/libpthread.so.0
#1 0x00007f438b017f2d in lookup_one (ap=0x55e6c63f8c30,
name=0x55e6c6432320 "http12", name_len=<optimized out>, ctxt=<optimized
out>) at lookup_program.c:291
#2 0x00007f438b01898d in match_key (ctxt=<optimized out>,
mapent=<optimized out>, name_len=<optimized out>, name=<optimized out>,
source=<optimized out>, ap=<optimized out>) at lookup_program.c:493
#3 lookup_mount (ap=0x55e6c63f8c30, name=0x42836000 "http12",
name_len=6, context=0x55e6c63f87c0) at lookup_program.c:682
#4 0x000055e6c5bc4072 in do_lookup_mount (ap=0x55e6c63f8c30,
map=0x55e6c63f8d50, name=0x42836000 "http12", name_len=6) at lookup.c:766
#5 0x000055e6c5bc4642 in do_name_lookup_mount (name_len=<optimized
out>, name=<optimized out>, map=<optimized out>, ap=<optimized out>) at
lookup.c:945
#6 lookup_nss_mount (ap=0x55e6c63f8c30, source=0x0, name=0x42836000
"http12", name_len=6) at lookup.c:1110
#7 0x000055e6c5bbb4d6 in do_mount_indirect (arg=<optimized out>) at
indirect.c:772
#8 0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#9 0x00007f438b9ad64d in clone () from /lib/libc.so.6
#10 0x0000000000000000 in ?? ()
Thread 11 (Thread 0x42333950 (LWP 1271292)):
#0 0x00007f438b9a4bd6 in poll () from /lib/libc.so.6
#1 0x000055e6c5bc1474 in timed_read (time=<optimized out>,
len=<optimized out>, buf=<optimized out>, pipe=<optimized out>) at
spawn.c:107
#2 do_spawn (logopt=0, wait=4294967295, options=<optimized out>,
prog=<optimized out>, argv=<optimized out>) at spawn.c:272
#3 0x000055e6c5bc2050 in spawn_bind_mount (logopt=0) at spawn.c:538
#4 0x00007f438a9a88f8 in mount_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x423309b0 "user9", name_len=9,
what=0x42330980 "/nfs/http12/user9", fstype=0x423309d0 "bind",
options=0x7f438a9a9716 "defaults", context=0x7f438bc2ca40) at
mount_bind.c:170
#5 0x000055e6c5bc34ad in do_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x423309b0 "user9", name_len=9,
what=0x42330980 "/nfs/http12/user9", fstype=0x423309d0 "bind",
options=0x423309f0 "") at mount.c:78
#6 0x00007f438ade8a43 in sun_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x42331000 "user9", namelen=9,
loc=0x55e6c6432740 ":/nfs/http12/user9", loclen=22, options=0x423309f0 "",
ctxt=0x55e6c63f7770) at parse_sun.c:712
#7 0x00007f438adeacc2 in parse_mount (ap=0x55e6c63f9030,
name=0x42331000 "user9", name_len=9, mapent=<optimized out>,
context=<optimized out>) at parse_sun.c:1673
#8 0x00007f438b018be1 in lookup_mount (ap=0x55e6c63f9030,
name=0x42331000 "user9", name_len=9, context=0x55e6c63f76f0) at
lookup_program.c:691
#9 0x000055e6c5bc4072 in do_lookup_mount (ap=0x55e6c63f9030,
map=0x55e6c63f9150, name=0x42331000 "user9", name_len=9) at lookup.c:766
#10 0x000055e6c5bc4642 in do_name_lookup_mount (name_len=<optimized
out>, name=<optimized out>, map=<optimized out>, ap=<optimized out>) at
lookup.c:945
#11 lookup_nss_mount (ap=0x55e6c63f9030, source=0x0, name=0x42331000
"user9", name_len=9) at lookup.c:1110
#12 0x000055e6c5bbb4d6 in do_mount_indirect (arg=<optimized out>) at
indirect.c:772
#13 0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#14 0x00007f438b9ad64d in clone () from /lib/libc.so.6
#15 0x0000000000000000 in ?? ()
Thread 10 (Thread 0x42737950 (LWP 1271894)):
#0 0x00007f438b9a4bd6 in poll () from /lib/libc.so.6
#1 0x000055e6c5bc1474 in timed_read (time=<optimized out>,
len=<optimized out>, buf=<optimized out>, pipe=<optimized out>) at
spawn.c:107
#2 do_spawn (logopt=0, wait=4294967295, options=<optimized out>,
prog=<optimized out>, argv=<optimized out>) at spawn.c:272
#3 0x000055e6c5bc2050 in spawn_bind_mount (logopt=0) at spawn.c:538
#4 0x00007f438a9a88f8 in mount_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x427349b0 "user10", name_len=12,
what=0x42734980 "/nfs/http12/user10", fstype=0x427349d0 "bind",
options=0x7f438a9a9716 "defaults", context=0x0) at mount_bind.c:170
#5 0x000055e6c5bc34ad in do_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x427349b0 "user10", name_len=12,
what=0x42734980 "/nfs/http12/user10", fstype=0x427349d0 "bind",
options=0x427349f0 "") at mount.c:78
#6 0x00007f438ade8a43 in sun_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x42735000 "user10", namelen=12,
loc=0x55e6c64328d0 ":/nfs/http12/user10", loclen=25,
options=0x427349f0 "", ctxt=0x55e6c63f7770) at parse_sun.c:712
#7 0x00007f438adeacc2 in parse_mount (ap=0x55e6c63f9030,
name=0x42735000 "user10", name_len=12, mapent=<optimized out>,
context=<optimized out>) at parse_sun.c:1673
#8 0x00007f438b018be1 in lookup_mount (ap=0x55e6c63f9030,
name=0x42735000 "user10", name_len=12, context=0x55e6c63f76f0) at
lookup_program.c:691
#9 0x000055e6c5bc4072 in do_lookup_mount (ap=0x55e6c63f9030,
map=0x55e6c63f9150, name=0x42735000 "user10", name_len=12) at lookup.c:766
#10 0x000055e6c5bc4642 in do_name_lookup_mount (name_len=<optimized
out>, name=<optimized out>, map=<optimized out>, ap=<optimized out>) at
lookup.c:945
#11 lookup_nss_mount (ap=0x55e6c63f9030, source=0x0, name=0x42735000
"user10", name_len=12) at lookup.c:1110
#12 0x000055e6c5bbb4d6 in do_mount_indirect (arg=<optimized out>) at
indirect.c:772
#13 0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#14 0x00007f438b9ad64d in clone () from /lib/libc.so.6
#15 0x0000000000000000 in ?? ()
Thread 9 (Thread 0x42131950 (LWP 1272342)):
#0 0x00007f438b9a4bd6 in poll () from /lib/libc.so.6
#1 0x000055e6c5bc1474 in timed_read (time=<optimized out>,
len=<optimized out>, buf=<optimized out>, pipe=<optimized out>) at
spawn.c:107
#2 do_spawn (logopt=0, wait=4294967295, options=<optimized out>,
prog=<optimized out>, argv=<optimized out>) at spawn.c:272
#3 0x000055e6c5bc2050 in spawn_bind_mount (logopt=0) at spawn.c:538
#4 0x00007f438a9a88f8 in mount_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x4212e9b0 "user11", name_len=9,
what=0x4212e980 "/nfs/http12/user11", fstype=0x4212e9d0 "bind",
options=0x7f438a9a9716 "defaults", context=0x55e6c64353f0) at
mount_bind.c:170
#5 0x000055e6c5bc34ad in do_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x4212e9b0 "user11", name_len=9,
what=0x4212e980 "/nfs/http12/user11", fstype=0x4212e9d0 "bind",
options=0x4212e9f0 "") at mount.c:78
#6 0x00007f438ade8a43 in sun_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x4212f000 "user11", namelen=9,
loc=0x55e6c6432a30 ":/nfs/http12/user11", loclen=22, options=0x4212e9f0 "",
ctxt=0x55e6c63f7770) at parse_sun.c:712
#7 0x00007f438adeacc2 in parse_mount (ap=0x55e6c63f9030,
name=0x4212f000 "user11", name_len=9, mapent=<optimized out>,
context=<optimized out>) at parse_sun.c:1673
#8 0x00007f438b018be1 in lookup_mount (ap=0x55e6c63f9030,
name=0x4212f000 "user11", name_len=9, context=0x55e6c63f76f0) at
lookup_program.c:691
#9 0x000055e6c5bc4072 in do_lookup_mount (ap=0x55e6c63f9030,
map=0x55e6c63f9150, name=0x4212f000 "user11", name_len=9) at lookup.c:766
#10 0x000055e6c5bc4642 in do_name_lookup_mount (name_len=<optimized
out>, name=<optimized out>, map=<optimized out>, ap=<optimized out>) at
lookup.c:945
#11 lookup_nss_mount (ap=0x55e6c63f9030, source=0x0, name=0x4212f000
"user11", name_len=9) at lookup.c:1110
#12 0x000055e6c5bbb4d6 in do_mount_indirect (arg=<optimized out>) at
indirect.c:772
#13 0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#14 0x00007f438b9ad64d in clone () from /lib/libc.so.6
#15 0x0000000000000000 in ?? ()
Thread 8 (Thread 0x407af950 (LWP 1273318)):
#0 0x00007f438b9a4bd6 in poll () from /lib/libc.so.6
#1 0x000055e6c5bc1474 in timed_read (time=<optimized out>,
len=<optimized out>, buf=<optimized out>, pipe=<optimized out>) at
spawn.c:107
#2 do_spawn (logopt=0, wait=4294967295, options=<optimized out>,
prog=<optimized out>, argv=<optimized out>) at spawn.c:272
#3 0x000055e6c5bc2050 in spawn_bind_mount (logopt=0) at spawn.c:538
#4 0x00007f438a9a88f8 in mount_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x407ac9c0 "user12", name_len=5,
what=0x407ac990 "/nfs/http12/user12", fstype=0x407ac9e0 "bind",
options=0x7f438a9a9716 "defaults", context=0xffffffffffffffff) at
mount_bind.c:170
#5 0x000055e6c5bc34ad in do_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x407ac9c0 "user12", name_len=5,
what=0x407ac990 "/nfs/http12/user12", fstype=0x407ac9e0 "bind",
options=0x407aca00 "") at mount.c:78
#6 0x00007f438ade8a43 in sun_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x407ad000 "user12", namelen=5,
loc=0x55e6c6432cf0 ":/nfs/http12/user12", loclen=18, options=0x407aca00 "",
ctxt=0x55e6c63f7770) at parse_sun.c:712
#7 0x00007f438adeacc2 in parse_mount (ap=0x55e6c63f9030,
name=0x407ad000 "user12", name_len=5, mapent=<optimized out>,
context=<optimized out>) at parse_sun.c:1673
#8 0x00007f438b018be1 in lookup_mount (ap=0x55e6c63f9030,
name=0x407ad000 "user12", name_len=5, context=0x55e6c63f76f0) at
lookup_program.c:691
#9 0x000055e6c5bc4072 in do_lookup_mount (ap=0x55e6c63f9030,
map=0x55e6c63f9150, name=0x407ad000 "user12", name_len=5) at lookup.c:766
#10 0x000055e6c5bc4642 in do_name_lookup_mount (name_len=<optimized
out>, name=<optimized out>, map=<optimized out>, ap=<optimized out>) at
lookup.c:945
#11 lookup_nss_mount (ap=0x55e6c63f9030, source=0x0, name=0x407ad000
"user12", name_len=5) at lookup.c:1110
#12 0x000055e6c5bbb4d6 in do_mount_indirect (arg=<optimized out>) at
indirect.c:772
#13 0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#14 0x00007f438b9ad64d in clone () from /lib/libc.so.6
#15 0x0000000000000000 in ?? ()
Thread 7 (Thread 0x42b3b950 (LWP 1273997)):
#0 0x00007f438b9a4bd6 in poll () from /lib/libc.so.6
#1 0x000055e6c5bc1474 in timed_read (time=<optimized out>,
len=<optimized out>, buf=<optimized out>, pipe=<optimized out>) at
spawn.c:107
#2 do_spawn (logopt=0, wait=4294967295, options=<optimized out>,
prog=<optimized out>, argv=<optimized out>) at spawn.c:272
#3 0x000055e6c5bc2050 in spawn_bind_mount (logopt=0) at spawn.c:538
#4 0x00007f438a9a88f8 in mount_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x42b389c0 "user13", name_len=6,
what=0x42b38990 "/nfs/http12/user13", fstype=0x42b389e0 "bind",
options=0x7f438a9a9716 "defaults", context=0x4b1) at mount_bind.c:170
#5 0x000055e6c5bc34ad in do_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x42b389c0 "user13", name_len=6,
what=0x42b38990 "/nfs/http12/user13", fstype=0x42b389e0 "bind",
options=0x42b38a00 "")
at mount.c:78
#6 0x00007f438ade8a43 in sun_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x42b39000 "user13", namelen=6,
loc=0x55e6c64330c0 ":/nfs/http12/user13", loclen=19, options=0x42b38a00 "",
ctxt=0x55e6c63f7770) at parse_sun.c:712
#7 0x00007f438adeacc2 in parse_mount (ap=0x55e6c63f9030,
name=0x42b39000 "user13", name_len=6, mapent=<optimized out>,
context=<optimized out>) at parse_sun.c:1673
#8 0x00007f438b018be1 in lookup_mount (ap=0x55e6c63f9030,
name=0x42b39000 "user13", name_len=6, context=0x55e6c63f76f0) at
lookup_program.c:691
#9 0x000055e6c5bc4072 in do_lookup_mount (ap=0x55e6c63f9030,
map=0x55e6c63f9150, name=0x42b39000 "user13", name_len=6) at lookup.c:766
#10 0x000055e6c5bc4642 in do_name_lookup_mount (name_len=<optimized
out>, name=<optimized out>, map=<optimized out>, ap=<optimized out>) at
lookup.c:945
#11 lookup_nss_mount (ap=0x55e6c63f9030, source=0x0, name=0x42b39000
"user13", name_len=6) at lookup.c:1110
#12 0x000055e6c5bbb4d6 in do_mount_indirect (arg=<optimized out>) at
indirect.c:772
#13 0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#14 0x00007f438b9ad64d in clone () from /lib/libc.so.6
#15 0x0000000000000000 in ?? ()
Thread 6 (Thread 0x405a3950 (LWP 1276040)):
#0 0x00007f438b9a4bd6 in poll () from /lib/libc.so.6
#1 0x000055e6c5bc1474 in timed_read (time=<optimized out>,
len=<optimized out>, buf=<optimized out>, pipe=<optimized out>) at
spawn.c:107
#2 do_spawn (logopt=0, wait=4294967295, options=<optimized out>,
prog=<optimized out>, argv=<optimized out>) at spawn.c:272
#3 0x000055e6c5bc2050 in spawn_bind_mount (logopt=0) at spawn.c:538
#4 0x00007f438a9a88f8 in mount_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x405a09c0 "user14", name_len=5,
what=0x405a0990 "/nfs/http12/user14", fstype=0x405a09e0 "bind",
options=0x7f438a9a9716 "defaults", context=0x21) at mount_bind.c:170
#5 0x000055e6c5bc34ad in do_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x405a09c0 "user14", name_len=5,
what=0x405a0990 "/nfs/http12/user14", fstype=0x405a09e0 "bind",
options=0x405a0a00 "") at mount.c:78
#6 0x00007f438ade8a43 in sun_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x405a1000 "user14", namelen=5,
loc=0x55e6c6433360 ":/nfs/http12/user14", loclen=18, options=0x405a0a00 "",
ctxt=0x55e6c63f7770) at parse_sun.c:712
#7 0x00007f438adeacc2 in parse_mount (ap=0x55e6c63f9030,
name=0x405a1000 "user14", name_len=5, mapent=<optimized out>,
context=<optimized out>) at parse_sun.c:1673
#8 0x00007f438b018be1 in lookup_mount (ap=0x55e6c63f9030,
name=0x405a1000 "user14", name_len=5, context=0x55e6c63f76f0) at
lookup_program.c:691
#9 0x000055e6c5bc4072 in do_lookup_mount (ap=0x55e6c63f9030,
map=0x55e6c63f9150, name=0x405a1000 "user14", name_len=5) at lookup.c:766
#10 0x000055e6c5bc4642 in do_name_lookup_mount (name_len=<optimized
out>, name=<optimized out>, map=<optimized out>, ap=<optimized out>) at
lookup.c:945
#11 lookup_nss_mount (ap=0x55e6c63f9030, source=0x0, name=0x405a1000
"user14", name_len=5) at lookup.c:1110
#12 0x000055e6c5bbb4d6 in do_mount_indirect (arg=<optimized out>) at
indirect.c:772
#13 0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#14 0x00007f438b9ad64d in clone () from /lib/libc.so.6
#15 0x0000000000000000 in ?? ()
Thread 5 (Thread 0x42c3c950 (LWP 1278498)):
#0 0x00007f438b9a4bd6 in poll () from /lib/libc.so.6
#1 0x000055e6c5bc1474 in timed_read (time=<optimized out>,
len=<optimized out>, buf=<optimized out>, pipe=<optimized out>) at
spawn.c:107
#2 do_spawn (logopt=0, wait=4294967295, options=<optimized out>,
prog=<optimized out>, argv=<optimized out>) at spawn.c:272
#3 0x000055e6c5bc2050 in spawn_bind_mount (logopt=0) at spawn.c:538
#4 0x00007f438a9a88f8 in mount_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x42c399c0 "user15", name_len=6,
what=0x42c39990 "/nfs/http12/user15", fstype=0x42c399e0 "bind",
options=0x7f438a9a9716 "defaults", context=0x6974616877040020) at
mount_bind.c:170
#5 0x000055e6c5bc34ad in do_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x42c399c0 "user15", name_len=6,
what=0x42c39990 "/nfs/http12/user15", fstype=0x42c399e0 "bind",
options=0x42c39a00 "")
at mount.c:78
#6 0x00007f438ade8a43 in sun_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x42c3a000 "user15", namelen=6,
loc=0x55e6c6435160 ":/nfs/http12/user15", loclen=19, options=0x42c39a00 "",
ctxt=0x55e6c63f7770) at parse_sun.c:712
#7 0x00007f438adeacc2 in parse_mount (ap=0x55e6c63f9030,
name=0x42c3a000 "user15", name_len=6, mapent=<optimized out>,
context=<optimized out>) at parse_sun.c:1673
#8 0x00007f438b018be1 in lookup_mount (ap=0x55e6c63f9030,
name=0x42c3a000 "user15", name_len=6, context=0x55e6c63f76f0) at
lookup_program.c:691
#9 0x000055e6c5bc4072 in do_lookup_mount (ap=0x55e6c63f9030,
map=0x55e6c63f9150, name=0x42c3a000 "user15", name_len=6) at lookup.c:766
#10 0x000055e6c5bc4642 in do_name_lookup_mount (name_len=<optimized
out>, name=<optimized out>, map=<optimized out>, ap=<optimized out>) at
lookup.c:945
#11 lookup_nss_mount (ap=0x55e6c63f9030, source=0x0, name=0x42c3a000
"user15", name_len=6) at lookup.c:1110
#12 0x000055e6c5bbb4d6 in do_mount_indirect (arg=<optimized out>) at
indirect.c:772
#13 0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#14 0x00007f438b9ad64d in clone () from /lib/libc.so.6
#15 0x0000000000000000 in ?? ()
Thread 4 (Thread 0x42d3d950 (LWP 1302366)):
#0 0x00007f438b9a4bd6 in poll () from /lib/libc.so.6
#1 0x000055e6c5bc1474 in timed_read (time=<optimized out>,
len=<optimized out>, buf=<optimized out>, pipe=<optimized out>) at
spawn.c:107
#2 do_spawn (logopt=0, wait=4294967295, options=<optimized out>,
prog=<optimized out>, argv=<optimized out>) at spawn.c:272
#3 0x000055e6c5bc2050 in spawn_bind_mount (logopt=0) at spawn.c:538
#4 0x00007f438a9a88f8 in mount_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x42d3a9b0 "user16", name_len=8,
what=0x42d3a980 "/nfs/http12/user16", fstype=0x42d3a9d0 "bind",
options=0x7f438a9a9716 "defaults", context=0xffffffffffffffff) at
mount_bind.c:170
#5 0x000055e6c5bc34ad in do_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x42d3a9b0 "user16", name_len=8,
what=0x42d3a980 "/nfs/http12/user16", fstype=0x42d3a9d0 "bind",
options=0x42d3a9f0 "") at mount.c:78
#6 0x00007f438ade8a43 in sun_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x42d3b000 "user16", namelen=8,
loc=0x55e6c6435420 ":/nfs/http12/user16", loclen=21, options=0x42d3a9f0 "",
ctxt=0x55e6c63f7770) at parse_sun.c:712
#7 0x00007f438adeacc2 in parse_mount (ap=0x55e6c63f9030,
name=0x42d3b000 "user16", name_len=8, mapent=<optimized out>,
context=<optimized out>) at parse_sun.c:1673
#8 0x00007f438b018be1 in lookup_mount (ap=0x55e6c63f9030,
name=0x42d3b000 "user16", name_len=8, context=0x55e6c63f76f0) at
lookup_program.c:691
#9 0x000055e6c5bc4072 in do_lookup_mount (ap=0x55e6c63f9030,
map=0x55e6c63f9150, name=0x42d3b000 "user16", name_len=8) at lookup.c:766
#10 0x000055e6c5bc4642 in do_name_lookup_mount (name_len=<optimized
out>, name=<optimized out>, map=<optimized out>, ap=<optimized out>) at
lookup.c:945
#11 lookup_nss_mount (ap=0x55e6c63f9030, source=0x0, name=0x42d3b000
"user16", name_len=8) at lookup.c:1110
#12 0x000055e6c5bbb4d6 in do_mount_indirect (arg=<optimized out>) at
indirect.c:772
#13 0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#14 0x00007f438b9ad64d in clone () from /lib/libc.so.6
#15 0x0000000000000000 in ?? ()
Thread 3 (Thread 0x42e3e950 (LWP 1317665)):
#0 0x00007f438b9a4bd6 in poll () from /lib/libc.so.6
#1 0x000055e6c5bc1474 in timed_read (time=<optimized out>,
len=<optimized out>, buf=<optimized out>, pipe=<optimized out>) at
spawn.c:107
#2 do_spawn (logopt=0, wait=4294967295, options=<optimized out>,
prog=<optimized out>, argv=<optimized out>) at spawn.c:272
#3 0x000055e6c5bc2050 in spawn_bind_mount (logopt=0) at spawn.c:538
#4 0x00007f438a9a88f8 in mount_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x42e3b9b0 "user17", name_len=7,
what=0x42e3b980 "/nfs/http12/user17", fstype=0x42e3b9d0 "bind",
options=0x7f438a9a9716 "defaults", context=0x0) at mount_bind.c:170
#5 0x000055e6c5bc34ad in do_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x42e3b9b0 "user17", name_len=7,
what=0x42e3b980 "/nfs/http12/user17", fstype=0x42e3b9d0 "bind",
options=0x42e3b9f0 "") at mount.c:78
#6 0x00007f438ade8a43 in sun_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x42e3c000 "user17", namelen=7,
loc=0x55e6c64358a0 ":/nfs/http12/user17", loclen=20, options=0x42e3b9f0 "",
ctxt=0x55e6c63f7770) at parse_sun.c:712
#7 0x00007f438adeacc2 in parse_mount (ap=0x55e6c63f9030,
name=0x42e3c000 "user17", name_len=7, mapent=<optimized out>,
context=<optimized out>) at parse_sun.c:1673
#8 0x00007f438b018be1 in lookup_mount (ap=0x55e6c63f9030,
name=0x42e3c000 "user17", name_len=7, context=0x55e6c63f76f0) at
lookup_program.c:691
#9 0x000055e6c5bc4072 in do_lookup_mount (ap=0x55e6c63f9030,
map=0x55e6c63f9150, name=0x42e3c000 "user17", name_len=7) at lookup.c:766
#10 0x000055e6c5bc4642 in do_name_lookup_mount (name_len=<optimized
out>, name=<optimized out>, map=<optimized out>, ap=<optimized out>) at
lookup.c:945
#11 lookup_nss_mount (ap=0x55e6c63f9030, source=0x0, name=0x42e3c000
"user17", name_len=7) at lookup.c:1110
#12 0x000055e6c5bbb4d6 in do_mount_indirect (arg=<optimized out>) at
indirect.c:772
#13 0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#14 0x00007f438b9ad64d in clone () from /lib/libc.so.6
#15 0x0000000000000000 in ?? ()
Thread 2 (Thread 0x42f3f950 (LWP 1319319)):
#0 0x00007f438b9a4bd6 in poll () from /lib/libc.so.6
#1 0x000055e6c5bc1474 in timed_read (time=<optimized out>,
len=<optimized out>, buf=<optimized out>, pipe=<optimized out>) at
spawn.c:107
#2 do_spawn (logopt=0, wait=4294967295, options=<optimized out>,
prog=<optimized out>, argv=<optimized out>) at spawn.c:272
#3 0x000055e6c5bc2050 in spawn_bind_mount (logopt=0) at spawn.c:538
#4 0x00007f438a9a88f8 in mount_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x42f3c9a0 "user18", name_len=18,
what=0x42f3c970 "/nfs/http12/user18",
fstype=0x42f3c9d0 "bind", options=0x7f438a9a9716 "defaults",
context=0xffffffffffffffff) at mount_bind.c:170
#5 0x000055e6c5bc34ad in do_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x42f3c9a0 "user18", name_len=18,
what=0x42f3c970 "/nfs/http12/user18",
fstype=0x42f3c9d0 "bind", options=0x42f3c9f0 "") at mount.c:78
#6 0x00007f438ade8a43 in sun_mount (ap=0x55e6c63f9030,
root=0x55e6c63f9130 "/home", name=0x42f3d000 "user18", namelen=18,
loc=0x55e6c6435b40 ":/nfs/http12/user18", loclen=31,
options=0x42f3c9f0 "", ctxt=0x55e6c63f7770) at parse_sun.c:712
#7 0x00007f438adeacc2 in parse_mount (ap=0x55e6c63f9030,
name=0x42f3d000 "user18", name_len=18, mapent=<optimized out>,
context=<optimized out>) at parse_sun.c:1673
#8 0x00007f438b018be1 in lookup_mount (ap=0x55e6c63f9030,
name=0x42f3d000 "user18", name_len=18, context=0x55e6c63f76f0) at
lookup_program.c:691
#9 0x000055e6c5bc4072 in do_lookup_mount (ap=0x55e6c63f9030,
map=0x55e6c63f9150, name=0x42f3d000 "user18", name_len=18) at lookup.c:766
#10 0x000055e6c5bc4642 in do_name_lookup_mount (name_len=<optimized
out>, name=<optimized out>, map=<optimized out>, ap=<optimized out>) at
lookup.c:945
#11 lookup_nss_mount (ap=0x55e6c63f9030, source=0x0, name=0x42f3d000
"user18", name_len=18) at lookup.c:1110
#12 0x000055e6c5bbb4d6 in do_mount_indirect (arg=<optimized out>) at
indirect.c:772
#13 0x00007f438c7e4fc7 in start_thread () from /lib/libpthread.so.0
#14 0x00007f438b9ad64d in clone () from /lib/libc.so.6
#15 0x0000000000000000 in ?? ()
Thread 1 (Thread 0x7f438cbfa6e0 (LWP 3437)):
#0 0x00007f438c7ec797 in do_sigwait () from /lib/libpthread.so.0
#1 0x00007f438c7ec83d in sigwait () from /lib/libpthread.so.0
#2 0x000055e6c5bba3c6 in statemachine (arg=<optimized out>) at
automount.c:1469
#3 main (argc=0, argv=<optimized out>) at automount.c:2476
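A long "thread apply all bt" dump like the one above is easier to compare once it is reduced to one line per thread. The following is a minimal sketch, assuming the gdb output has been saved as text in the layout shown here; the helper name innermost_frames and the shortened sample are illustrative only and are not part of gdb or autofs.

```python
import re
from collections import Counter

# Matches thread headers such as "Thread 7 (Thread 0x42b3b950 (LWP 1273997)):"
THREAD_RE = re.compile(r"^Thread (\d+) \(Thread 0x[0-9a-f]+ \(LWP (\d+)\)\):")
# Matches frame #0 with or without a leading address, capturing the function name
FRAME0_RE = re.compile(r"^#0\s+(?:0x[0-9a-f]+ in )?(\w+) \(")


def innermost_frames(bt_text):
    """Map each gdb thread number to the function name of its frame #0."""
    frames = {}
    current = None
    for raw in bt_text.splitlines():
        line = raw.strip()
        m = THREAD_RE.match(line)
        if m:
            current = int(m.group(1))
            continue
        m = FRAME0_RE.match(line)
        if m and current is not None:
            frames.setdefault(current, m.group(1))
    return frames


# Shortened sample taken from the trace above (hypothetical excerpt).
sample = """\
Thread 7 (Thread 0x42b3b950 (LWP 1273997)):
#0 0x00007f438b9a4bd6 in poll () from /lib/libc.so.6
#1 0x000055e6c5bc1474 in timed_read (time=<optimized out>,
Thread 1 (Thread 0x7f438cbfa6e0 (LWP 3437)):
#0 0x00007f438c7ec797 in do_sigwait () from /lib/libpthread.so.0
"""

frames = innermost_frames(sample)
summary = Counter(frames.values())
print(summary)
```

Run over the full dump, a summary of this kind makes it quick to see how many threads are parked in poll() and which ones differ.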
I cannot attach gdb on subprocesses: gdb just hangs after:
Attaching to program: /usr/sbin/automount, process 1269869
Kernel trace:
# cat /proc/1269869/stack
[<ffffffffc052f3bf>] autofs4_wait+0x3df/0xb60 [autofs4]
[<ffffffffc052e0a5>] autofs4_d_automount+0x235/0x270 [autofs4]
[<ffffffff921cbb8f>] follow_managed+0x1ff/0x2d0
[<ffffffff921cccb3>] walk_component+0x263/0x300
[<ffffffff921cdded>] link_path_walk+0x18d/0x5a0
[<ffffffff921cf48e>] path_openat+0xbe/0x1070
[<ffffffff921d04c5>] do_filp_open+0x85/0xe0
[<ffffffff921bed96>] do_sys_open+0x146/0x220
[<ffffffff921beeae>] SyS_open+0x1e/0x20
[<ffffffff927685b2>] entry_SYSCALL_64_fastpath+0x12/0x71
[<ffffffffffffffff>] 0xffffffffffffffff
Other subprocesses may have a slightly different kernel trace:
~# cat /proc/1269923/stack
[<ffffffffc052f3bf>] autofs4_wait+0x3df/0xb60 [autofs4]
[<ffffffffc052dc97>] autofs4_d_manage+0x117/0x1b0 [autofs4]
[<ffffffff921cbaa6>] follow_managed+0x116/0x2d0
[<ffffffff921cccb3>] walk_component+0x263/0x300
[<ffffffff921cdded>] link_path_walk+0x18d/0x5a0
[<ffffffff921cf48e>] path_openat+0xbe/0x1070
[<ffffffff921d04c5>] do_filp_open+0x85/0xe0
[<ffffffff921bed96>] do_sys_open+0x146/0x220
[<ffffffff921beeae>] SyS_open+0x1e/0x20
[<ffffffff927685b2>] entry_SYSCALL_64_fastpath+0x12/0x71
[<ffffffffffffffff>] 0xffffffffffffffff
When the deadlock happens, I have to kill -9 subprocesses one by one.
One specific subprocess finally unlocks the deadlock and everything goes
back to normal (remaining subprocesses disappear as well).
I have no other interesting log, and in particular no kernel log when
that happens.
Thanks,
--
Cyril B.
--
To unsubscribe from this list: send the line "unsubscribe autofs" in
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Regular deadlocks
2016-06-25 23:41 Regular deadlocks Cyril B.
@ 2016-06-26 1:59 ` Ian Kent
2016-06-26 10:02 ` Cyril B.
2016-06-26 10:59 ` Ian Kent
2016-06-26 11:02 ` Ian Kent
2 siblings, 1 reply; 17+ messages in thread
From: Ian Kent @ 2016-06-26 1:59 UTC (permalink / raw)
To: Cyril B., autofs
On Sun, 2016-06-26 at 01:41 +0200, Cyril B. wrote:
> Hello,
>
> I have occasional deadlocks using autofs 4.1.2 (but it happened on 4.1.1
> as well) on my servers, typically about once every 2 or 3 days.
>
> I already posted on this mailing-list back in 2015 for a bug that also
> triggered deadlocks (which was fixed), so I'll copy/paste parts of my
> original message as my config hasn't changed.
>
> /etc/auto.master:
> --
> /nfs program:/etc/auto.nfs
> /home program:/etc/auto.home
> --
>
> /etc/auto.nfs is basically returning:
>
> -fstype=nfs4,noatime,nosuid,_netdev,soft,intr,timeo=1000 $1:/
>
> /etc/auto.home:
> --
> #!/bin/sh
>
> if [ ! -h /var/home/$1 ]
> then
> exit 1
> fi
>
> echo -fstype=bind :$(readlink --no-newline /var/home/$1)
> --
>
> So for instance, /var/home/foo would be a symlink pointing to
> /nfs/serverX/foo.
>
> Kernel: Linux 4.4.7.
>
> My servers have cronjobs that trigger /home/userX mounts basically at
> the same time (when the jobs do start). I have 2 servers with the same
> config, but one of them has MANY more users/cronjobs and oddly enough,
> the deadlock happens much more infrequently.
>
> Anyway, here's a 'ps faux' a few hours after the deadlock started:
Looks like these aren't showing a deadlock.
I think I've been seeing the same thing during testing and I can see it's
mount.nfs(8) that is not returning when it should.
I initially thought it was my environment but I've swapped several devices and
used different servers, so I'm beginning to think mount.nfs(8) has grown a
problem, but I'm still not sure.
So far I've been thinking this was a problem with my environment so I didn't worry
too much about it.
snip ...
>
> I cannot attach gdb on subprocesses: gdb just hangs after:
> Attaching to program: /usr/sbin/automount, process 1269869
>
> Kernel trace:
>
> # cat /proc/1269869/stack
> [<ffffffffc052f3bf>] autofs4_wait+0x3df/0xb60 [autofs4]
> [<ffffffffc052e0a5>] autofs4_d_automount+0x235/0x270 [autofs4]
> [<ffffffff921cbb8f>] follow_managed+0x1ff/0x2d0
> [<ffffffff921cccb3>] walk_component+0x263/0x300
> [<ffffffff921cdded>] link_path_walk+0x18d/0x5a0
> [<ffffffff921cf48e>] path_openat+0xbe/0x1070
> [<ffffffff921d04c5>] do_filp_open+0x85/0xe0
> [<ffffffff921bed96>] do_sys_open+0x146/0x220
> [<ffffffff921beeae>] SyS_open+0x1e/0x20
> [<ffffffff927685b2>] entry_SYSCALL_64_fastpath+0x12/0x71
> [<ffffffffffffffff>] 0xffffffffffffffff
The autofs4_d_automount() entry here indicates this is the one that triggered
the mount.
If you are seeing a problem with mount, looking for the blocked process and
killing it should clear the rest of these up.
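The distinction between the two kinds of kernel stack can be checked mechanically. This is a minimal sketch, assuming the text of /proc/<pid>/stack has already been read (normally this requires root); the function name classify_autofs_waiter and the labels it returns are hypothetical. Only the meaning of autofs4_d_automount (the process that triggered the mount) comes from the explanation above; labelling autofs4_d_manage stacks as "waiter" is an assumption.

```python
def classify_autofs_waiter(stack_text):
    """Classify a /proc/<pid>/stack dump of a process blocked in autofs.

    Returns "trigger" when autofs4_d_automount is on the stack (the process
    that triggered the mount), "waiter" when autofs4_d_manage is on the stack
    (assumed here to be a process waiting on a pending request), and "other"
    for anything else.
    """
    if "autofs4_wait" not in stack_text:
        return "other"
    if "autofs4_d_automount" in stack_text:
        return "trigger"
    if "autofs4_d_manage" in stack_text:
        return "waiter"
    return "other"


# Shortened excerpts of the two stacks quoted in this thread.
stack_1269869 = """\
[<ffffffffc052f3bf>] autofs4_wait+0x3df/0xb60 [autofs4]
[<ffffffffc052e0a5>] autofs4_d_automount+0x235/0x270 [autofs4]
[<ffffffff921cbb8f>] follow_managed+0x1ff/0x2d0
"""
stack_1269923 = """\
[<ffffffffc052f3bf>] autofs4_wait+0x3df/0xb60 [autofs4]
[<ffffffffc052dc97>] autofs4_d_manage+0x117/0x1b0 [autofs4]
[<ffffffff921cbaa6>] follow_managed+0x116/0x2d0
"""

print(classify_autofs_waiter(stack_1269869))
print(classify_autofs_waiter(stack_1269923))
```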
You would think that, even if mount.nfs(8) was losing network packets, it would
time out after about 3 minutes. I must admit I haven't waited long enough to find
out if that's the case so far.
If you find you are actually seeing this, setting the configuration option
mount_wait to some sensible value sufficient for a mount to complete might help.
The problem with that is it's hard to locate the actual blocked child process
from automount to kill it once the timeout has expired. Only the process spawned
by automount itself is killed so sub processes will probably accumulate.
Normally that process would go away after the usual lengthy timeout and was
commonly due to a server not responding or something like that so not locating
the process and killing it wasn't a big problem.
If this really is what's happening then I will need to fix that so that
automount can work around it. Not sure it will really help that much though
......
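The point about only the directly spawned process being killed can be illustrated outside autofs. The sketch below is not automount's code; it is a generic Python illustration, with a hypothetical helper name run_with_group_timeout, of starting a command in its own session so that a timeout can signal the whole process group rather than just the direct child. For reference, wait=4294967295 in the gdb frames is what -1 looks like as an unsigned 32-bit value.

```python
import os
import signal
import subprocess


def run_with_group_timeout(argv, timeout_s):
    """Run argv and wait up to timeout_s seconds.

    Returns the exit code, or None if the timeout expired. On timeout the
    whole process group is killed, so helpers forked by the command do not
    linger. POSIX only.
    """
    # start_new_session=True makes the child a session and process group
    # leader, so its pid doubles as the process group id.
    proc = subprocess.Popen(argv, start_new_session=True)
    try:
        return proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        os.killpg(proc.pid, signal.SIGKILL)
        proc.wait()  # reap the direct child so it does not stay defunct
        return None
```

For example, run_with_group_timeout(["sh", "-c", "sleep 30 & wait"], 0.5) returns None and takes the background sleep down along with the shell, whereas killing only the shell would leave the sleep running.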
snip ...
> When the deadlock happens, I have to kill -9 subprocesses one by one.
> One specific subprocess finally unlocks the deadlock and everything goes
> back to normal (remaining subprocesses disappear as well).
Right, you might need to do that on the blocked mount process, NFS hangs on
pretty strongly when mounting.
Ian
* Re: Regular deadlocks
2016-06-26 1:59 ` Ian Kent
@ 2016-06-26 10:02 ` Cyril B.
2016-06-26 10:30 ` Ian Kent
2016-06-27 0:26 ` Ian Kent
0 siblings, 2 replies; 17+ messages in thread
From: Cyril B. @ 2016-06-26 10:02 UTC (permalink / raw)
To: Ian Kent, autofs
On 06/26/2016 03:59 AM, Ian Kent wrote:
> The autofs4_d_automount() entry here indicates this is the one that triggered
> the mount.
>
> If you are seeing a problem with mount, looking for the blocked process and
> killing it should clear the rest of these up.
>
> You would think that, even if mount.nfs(8) was losing network packets, it would
> timeout after about 3 minutes. I must admit I haven't waited long enough to find
> out if that's the case so far.
>
> If you find you are actually seeing this, setting the configuration option
> mount_wait to some sensible value sufficient for a mount to complete might help.
Thanks for the quick response. I don't remember seeing any lingering
mount.nfs, but I may have overlooked it.
However, I've also had those "deadlocks" happen with several different
NFS servers at the same time (this was not the case with the traces I
sent in my previous email). If one NFS server was unmountable, should
automount be blocked when trying to mount others?
Anyway, I'll investigate some more next time the "deadlock" occurs and
will also try setting mount_wait to a reasonable value. I'll let you
know how it goes.
--
Cyril B.
* Re: Regular deadlocks
2016-06-26 10:02 ` Cyril B.
@ 2016-06-26 10:30 ` Ian Kent
2016-06-26 11:13 ` Cyril B.
2016-06-27 0:26 ` Ian Kent
1 sibling, 1 reply; 17+ messages in thread
From: Ian Kent @ 2016-06-26 10:30 UTC (permalink / raw)
To: Cyril B., autofs
On Sun, 2016-06-26 at 12:02 +0200, Cyril B. wrote:
> On 06/26/2016 03:59 AM, Ian Kent wrote:
> > The autofs4_d_automount() entry here indicates this is the one that
> > triggered
> > the mount.
> >
> > If you are seeing a problem with mount, looking for the blocked process and
> > killing it should clear the rest of these up.
> >
> > You would think that, even if mount.nfs(8) was losing network packets, it
> > would
> > timeout after about 3 minutes. I must admit I haven't waited long enough to
> > find
> > out if that's the case so far.
> >
> > If you find you are actually seeing this, setting the configuration option
> > mount_wait to some sensible value sufficient for a mount to complete might
> > help.
>
>
> Thanks for the quick response. I don't remember seeing any lingering
> mount.nfs, but I must have overlooked.
Yep, check it and let me know.
>
> However, I've also had those "deadlocks" happen with several different
> NFS servers at the same time (this was not the case with the traces I
> sent in my previous email). If one NFS server was unmountable, should
> automount be blocked when trying to mount others?
That's a bit harder to answer; it's different for mount and umount and one can
affect the other.
I would need to check, since the locking scope has changed due to fixes for other
problems. That's with 5.1.1 and 5.1.2, right?
That also assumes automount has got all the way to mounting or umounting; it can
block at other points too, like remote key lookup.
Ideally one down server shouldn't affect other mounts or umounts but, as I say,
I'll need to check and try to fix it if it's not working like that. Having said
that, it might not be straightforward either.
>
> Anyway, I'll investigate some more next time the "deadlock" occurs and
> will also try setting mount_wait to a reasonable value. I'll let you
> know how it goes.
>
* Re: Regular deadlocks
2016-06-25 23:41 Regular deadlocks Cyril B.
2016-06-26 1:59 ` Ian Kent
@ 2016-06-26 10:59 ` Ian Kent
2016-06-26 11:02 ` Ian Kent
2 siblings, 0 replies; 17+ messages in thread
From: Ian Kent @ 2016-06-26 10:59 UTC (permalink / raw)
To: Cyril B., autofs
On Sun, 2016-06-26 at 01:41 +0200, Cyril B. wrote:
> Hello,
>
> I have occasional deadlocks using autofs 4.1.2 (but it happened on 4.1.1
> as well) on my servers, typically about once every 2 or 3 days.
>
> I already posted on this mailing-list back in 2015 for a bug that also
> triggered deadlocks (which was fixed), so I'll copy/paste parts of my
> original message as my config hasn't changed.
>
> /etc/auto.master:
> --
> /nfs program:/etc/auto.nfs
> /home program:/etc/auto.home
> --
>
> /etc/auto.nfs is basically returning:
>
> -fstype=nfs4,noatime,nosuid,_netdev,soft,intr,timeo=1000 $1:/
>
> /etc/auto.home:
> --
> #!/bin/sh
>
> if [ ! -h /var/home/$1 ]
> then
> exit 1
> fi
>
> echo -fstype=bind :$(readlink --no-newline /var/home/$1)
> --
Btw, there was a bug in 5.1.1 where a program map lookup could hang under load
due to out-of-order locking.
The stack traces here didn't show what I'd expect to see with that but it can be
a problem with 5.1.1.
fyi, the patch that fixed that is:
https://www.kernel.org/pub/linux/daemons/autofs/v5/patches-5.1.2/autofs-5.1.1-fix-out-of-order-call-in-program-map-lookup.patch
Ian
* Re: Regular deadlocks
2016-06-25 23:41 Regular deadlocks Cyril B.
2016-06-26 1:59 ` Ian Kent
2016-06-26 10:59 ` Ian Kent
@ 2016-06-26 11:02 ` Ian Kent
2 siblings, 0 replies; 17+ messages in thread
From: Ian Kent @ 2016-06-26 11:02 UTC (permalink / raw)
To: Cyril B., autofs
On Sun, 2016-06-26 at 01:41 +0200, Cyril B. wrote:
> Hello,
>
> I have occasional deadlocks using autofs 4.1.2 (but it happened on 4.1.1
> as well) on my servers, typically about once every 2 or 3 days.
>
Ummm ... is that meant to be 5.1.1 and 5.1.2, typo perhaps?
* Re: Regular deadlocks
2016-06-26 10:30 ` Ian Kent
@ 2016-06-26 11:13 ` Cyril B.
0 siblings, 0 replies; 17+ messages in thread
From: Cyril B. @ 2016-06-26 11:13 UTC (permalink / raw)
To: Ian Kent, autofs
On 06/26/2016 12:30 PM, Ian Kent wrote:
> I would need to check since the locking scope changes due to changes for other
> problems, that's with 5.1.1 and 5.1.2 right?
Yes, both versions exhibit the same behaviour for me, with no apparent
difference (same deadlock frequency, for instance).
> Btw, there was a bug in 5.1.1 where a program map lookup could hang under load
> due to out of order locking.
>
> The stack traces here didn't show what I'd expect to see with that but it can be
> a problem with 5.1.1.
That's actually the bug you fixed after my bug report in 2015, so my
5.1.1 already included that patch. And since it's included in 5.1.2, all
my tests were with that patch.
>> Hello,
>>
>> I have occasional deadlocks using autofs 4.1.2 (but it happened on 4.1.1
>> as well) on my servers, typically about once every 2 or 3 days.
>>
>
> Ummm ... is that meant to be 5.1.1 and 5.1.2, typo perhaps?
Whoops, yes :)
--
Cyril B.
* Re: Regular deadlocks
2016-06-26 10:02 ` Cyril B.
2016-06-26 10:30 ` Ian Kent
@ 2016-06-27 0:26 ` Ian Kent
2016-06-27 0:35 ` Ian Kent
2016-06-27 14:04 ` Cyril B.
1 sibling, 2 replies; 17+ messages in thread
From: Ian Kent @ 2016-06-27 0:26 UTC (permalink / raw)
To: Cyril B., autofs
On Sun, 2016-06-26 at 12:02 +0200, Cyril B. wrote:
>
> However, I've also had those "deadlocks" happen with several different
> NFS servers at the same time (this was not the case with the traces I
> sent in my previous email). If one NFS server was unmountable, should
> automount be blocked when trying to mount others?
>
How is autofs configured?
If --disable-mount-locking is not used then any mount can block all other
mounts; if it is used then there can be mtab corruption if still using a
text-based mtab.
I always use --disable-mount-locking and nowadays the mtab is usually a symlink
into the proc file system so corruption isn't a problem.
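Whether a given system still has a text-based mtab can be checked directly. A minimal sketch, assuming the conventional path /etc/mtab; the helper name describe_mtab is hypothetical.

```python
import os


def describe_mtab(path="/etc/mtab"):
    """Report whether path is a symlink, a regular file, or missing.

    Returns ("symlink", target), ("regular", None) or ("missing", None).
    """
    if os.path.islink(path):
        return ("symlink", os.readlink(path))
    if os.path.exists(path):
        return ("regular", None)
    return ("missing", None)


print(describe_mtab())
```

A "regular" result means mount(8) rewrites the file itself, which is the case where concurrent mounts without locking could corrupt it.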
Ian
* Re: Regular deadlocks
2016-06-27 0:26 ` Ian Kent
@ 2016-06-27 0:35 ` Ian Kent
2016-06-27 14:04 ` Cyril B.
1 sibling, 0 replies; 17+ messages in thread
From: Ian Kent @ 2016-06-27 0:35 UTC (permalink / raw)
To: Cyril B., autofs
On Mon, 2016-06-27 at 08:26 +0800, Ian Kent wrote:
> On Sun, 2016-06-26 at 12:02 +0200, Cyril B. wrote:
> >
> > However, I've also had those "deadlocks" happen with several different
> > NFS servers at the same time (this was not the case with the traces I
> > sent in my previous email). If one NFS server was unmountable, should
> > automount be blocked when trying to mount others?
And don't forget it's not different servers that affect locking.
Most of the locking that can cause blocking is at directory level, so it
should be at or below the autofs mount point directories.
>
> How is autofs configured.
>
> If --disable-mount-locking is not used then any mount can block all other
> mounts, if it is used then there can be mtab corruption if still using a text
> based mtab.
>
> I always use --disable-mount-locking and nowadays the mtab is usually a
> symlink
> into the proc file system so corruption isn't a problem.
>
> Ian
* Re: Regular deadlocks
2016-06-27 0:26 ` Ian Kent
2016-06-27 0:35 ` Ian Kent
@ 2016-06-27 14:04 ` Cyril B.
2016-06-28 0:27 ` Ian Kent
2016-07-01 7:03 ` Ian Kent
1 sibling, 2 replies; 17+ messages in thread
From: Cyril B. @ 2016-06-27 14:04 UTC (permalink / raw)
To: Ian Kent, autofs
On 06/27/2016 02:26 AM, Ian Kent wrote:
> How is autofs configured.
>
> If --disable-mount-locking is not used then any mount can block all other
> mounts, if it is used then there can be mtab corruption if still using a text
> based mtab.
I use --disable-mount-locking.
> I always use --disable-mount-locking and nowadays the mtab is usually a symlink
> into the proc file system so corruption isn't a problem.
/etc/mtab is actually not a symlink on my systems.
Anyway, I have more details for you as the issue appeared today and I
could investigate some more. This is on a server that only mounts one
single NFS server (http12), so the multi-servers blocking issue is
irrelevant here.
A few minutes before the "deadlock" occurred, /nfs/http12 was unmounted
by autofs, I assume because it was idle. I have TIMEOUT=600. That
explains why the issue appears much more frequently on a server which is
way less busy (and usually in the middle of the night): the NFS server
needs to be idle enough to be unmounted.
However, I still had many /home/userX mounted (by autofs), which point
to /nfs/http12/userX. Shouldn't autofs leave /nfs/http12 mounted while at
least one /home/userX is mounted? To be clear, here's an extract from my
/proc/mounts BEFORE the NFS server is unmounted by autofs:
http12:/ /nfs/http12 nfs4 rw,nosuid,noatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,soft,proto=tcp6,timeo=1000,retrans=2,sec=sys,clientaddr=2a00:42:1:50:1::1,local_lock=none,addr=2a00:42:1:20:1::1 0 0
http12://user1 /home/user1 nfs4 rw,nosuid,noatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,soft,proto=tcp6,timeo=1000,retrans=2,sec=sys,clientaddr=2a00:42:1:50:1::1,local_lock=none,addr=2a00:42:1:20:1::1 0 0
http12://user2 /home/user2 nfs4 rw,nosuid,noatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,soft,proto=tcp6,timeo=1000,retrans=2,sec=sys,clientaddr=2a00:42:1:50:1::1,local_lock=none,addr=2a00:42:1:20:1::1 0 0
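Because the bind mounts appear in /proc/mounts with a device field derived from the NFS export (http12://user1 and so on), the mounts that still depend on /nfs/http12 can be listed by matching on that field. A minimal sketch, assuming the /proc/mounts layout shown in the extract; the function name mounts_from_device_prefix is hypothetical and the sample options are abbreviated.

```python
def mounts_from_device_prefix(mounts_text, prefix):
    """Return (device, mountpoint) pairs whose device field starts with prefix."""
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        device, mountpoint = fields[0], fields[1]
        if device.startswith(prefix):
            result.append((device, mountpoint))
    return result


# Abbreviated version of the extract above (options shortened for readability).
sample_mounts = """\
http12:/ /nfs/http12 nfs4 rw,nosuid,noatime 0 0
http12://user1 /home/user1 nfs4 rw,nosuid,noatime 0 0
http12://user2 /home/user2 nfs4 rw,nosuid,noatime 0 0
proc /proc proc rw 0 0
"""

pairs = mounts_from_device_prefix(sample_mounts, "http12:/")
# Everything other than the NFS mount itself depends on it.
dependents = [mp for _dev, mp in pairs if mp != "/nfs/http12"]
print(dependents)
```

On a live system the same function could be fed the contents of /proc/mounts just before the idle timeout to see what would be left dangling.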
Also, I couldn't find any blocked mount process that would explain the
"deadlock". I had a 'ps aux|grep mount' done every 10 seconds:
Mon Jun 27 05:00:00 CEST 2016
root 3437 0.0 0.0 218676 5500 ? Ssl Jun24 0:26
/usr/sbin/automount --pid-file /var/run/autofs.pid
Mon Jun 27 05:00:10 CEST 2016
root 3437 0.0 0.0 218676 5500 ? Ssl Jun24 0:27
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 2618146 0.0 0.0 218676 2168 ? S 05:00 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 2618214 0.0 0.0 218676 2168 ? S 05:00 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 2618215 0.0 0.0 0 0 ? Z 05:00 0:00
[umount] <defunct>
root 2618224 0.0 0.0 218676 2168 ? S 05:00 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 2618227 0.0 0.0 218676 2168 ? S 05:00 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 2618230 0.0 0.0 0 0 ? Z 05:00 0:00
[umount] <defunct>
root 2618240 0.0 0.0 218676 2168 ? S 05:00 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 2618248 0.0 0.0 218676 2168 ? S 05:00 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 2618250 0.0 0.0 218676 2168 ? S 05:00 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 2618252 0.1 0.0 218676 2168 ? S 05:00 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
Mon Jun 27 05:00:20 CEST 2016
root 3437 0.0 0.0 218676 5500 ? Ssl Jun24 0:27
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 2618146 0.0 0.0 218676 2168 ? S 05:00 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 2618214 0.0 0.0 218676 2168 ? S 05:00 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 2618224 0.0 0.0 218676 2168 ? S 05:00 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 2618227 0.0 0.0 218676 2168 ? S 05:00 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 2618240 0.0 0.0 218676 2168 ? S 05:00 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 2618248 0.0 0.0 218676 2168 ? S 05:00 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 2618250 0.0 0.0 218676 2168 ? S 05:00 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 2618252 0.0 0.0 218676 2168 ? S 05:00 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 2618701 0.0 0.0 218676 2168 ? S 05:00 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 2618702 0.0 0.0 218676 2168 ? S 05:00 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
And it remained in that state afterwards. I don't know if the defunct
umount processes are suspicious; I guess not.
One last thing: a manual umount of /home/userY was done by a script at
6:26 (/home/userY was NOT mounted though), and it remained blocked. I'm
not sure if it's a consequence of autofs being blocked or something else.
--
Cyril B.
--
To unsubscribe from this list: send the line "unsubscribe autofs" in
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Regular deadlocks
2016-06-27 14:04 ` Cyril B.
@ 2016-06-28 0:27 ` Ian Kent
2016-06-28 7:48 ` Cyril B.
2016-06-29 12:45 ` Cyril B.
2016-07-01 7:03 ` Ian Kent
1 sibling, 2 replies; 17+ messages in thread
From: Ian Kent @ 2016-06-28 0:27 UTC (permalink / raw)
To: Cyril B., autofs
On Mon, 2016-06-27 at 16:04 +0200, Cyril B. wrote:
> On 06/27/2016 02:26 AM, Ian Kent wrote:
> > How is autofs configured.
> >
> > If --disable-mount-locking is not used then any mount can block all other
> > mounts, if it is used then there can be mtab corruption if still using a
> > text
> > based mtab.
>
> I use --disable-mount-locking.
>
> > I always use --disable-mount-locking and nowadays the mtab is usually a
> > symlink
> > into the proc file system so corruption isn't a problem.
>
> /etc/mtab is actually not a symlink on my systems.
>
>
> Anyway, I have more details for you as the issue appeared today and I
> could investigate some more. This is on a server that only mounts one
> single NFS server (http12), so the multi-servers blocking issue is
> irrelevant here.
>
> A few minutes before the "deadlock" occurred, /nfs/http12 was unmounted
> by autofs, I assume because it was idle. I have TIMEOUT=600. That
> explains why the issue appears much more frequently on a server which is
> way less busy (and usually in the middle of the night): the NFS server
> needs to be idle enough to be unmounted.
>
> However, I still had many /home/userX mounted (by autofs), which point
> to /nfs/http12/userX. Shouldn't autofs not unmount /nfs/http12 when at
> least one /home/userX is mounted? To be clear, here's an extract from my
> /proc/mounts BEFORE the NFS server is unmounted by autofs:
This doesn't look like the full picture.
Where are the autofs file system mounts?
Mounted doesn't necessarily mean busy or in use but that also depends on the
mount hierarchy and how it has been constructed.
And symlinks are never "busy", they can't be a pwd and they aren't opened as
a file.
>
> http12:/ /nfs/http12 nfs4
> rw,nosuid,noatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,soft,proto=t
> cp6,timeo=1000,retrans=2,sec=sys,clientaddr=2a00:42:1:50:1::1,local_lock=none,
> addr=2a00:42:1:20:1::1
> 0
> 0
This looks like an nfs4 fsid 0 mount.
Is this one mounted by a program map similar to what you described earlier?
Do you then rely on the nfs cross device automounting to mount the user mounts?
Or is this one mounted external to autofs and used by autofs automounts?
Or are you using autofs to mount this one and some other map to mount the other
mounts?
> http12://user1 /home/user1 nfs4
> rw,nosuid,noatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,soft,proto=t
> cp6,timeo=1000,retrans=2,sec=sys,clientaddr=2a00:42:1:50:1::1,local_lock=none,
> addr=2a00:42:1:20:
> 1::1 0 0
> http12://user2 /home/user2 nfs4
> rw,nosuid,noatime,vers=4.0,rsize=1048576,wsize=1048576,namlen=255,soft,proto=t
> cp6,timeo=1000,retrans=2,sec=sys,clientaddr=2a00:42:1:50:1::1,local_lock=none,
> addr=2a00:42:1:2
> 0:1::1 0 0
But then there are these, that don't seem to be related to the mount above,
perhaps they reference the first mount.
Can you describe again how this fits together?
At this point a full debug log would probably answer most of my questions.
>
>
> Also, I couldn't find any blocked mount process that would explain the
> "deadlock". I had a 'ps aux|grep mount' done every 10 seconds:
>
> Mon Jun 27 05:00:00 CEST 2016
> root 3437 0.0 0.0 218676 5500 ? Ssl Jun24 0:26
> /usr/sbin/automount --pid-file /var/run/autofs.pid
>
> Mon Jun 27 05:00:10 CEST 2016
> root 3437 0.0 0.0 218676 5500 ? Ssl Jun24 0:27
> /usr/sbin/automount --pid-file /var/run/autofs.pid
> root 2618146 0.0 0.0 218676 2168 ? S 05:00 0:00
> /usr/sbin/automount --pid-file /var/run/autofs.pid
> root 2618214 0.0 0.0 218676 2168 ? S 05:00 0:00
> /usr/sbin/automount --pid-file /var/run/autofs.pid
> root 2618215 0.0 0.0 0 0 ? Z 05:00 0:00
> [umount] <defunct>
> root 2618224 0.0 0.0 218676 2168 ? S 05:00 0:00
> /usr/sbin/automount --pid-file /var/run/autofs.pid
> root 2618227 0.0 0.0 218676 2168 ? S 05:00 0:00
> /usr/sbin/automount --pid-file /var/run/autofs.pid
> root 2618230 0.0 0.0 0 0 ? Z 05:00 0:00
> [umount] <defunct>
> root 2618240 0.0 0.0 218676 2168 ? S 05:00 0:00
> /usr/sbin/automount --pid-file /var/run/autofs.pid
> root 2618248 0.0 0.0 218676 2168 ? S 05:00 0:00
> /usr/sbin/automount --pid-file /var/run/autofs.pid
> root 2618250 0.0 0.0 218676 2168 ? S 05:00 0:00
> /usr/sbin/automount --pid-file /var/run/autofs.pid
> root 2618252 0.1 0.0 218676 2168 ? S 05:00 0:00
> /usr/sbin/automount --pid-file /var/run/autofs.pid
>
> Mon Jun 27 05:00:20 CEST 2016
> root 3437 0.0 0.0 218676 5500 ? Ssl Jun24 0:27
> /usr/sbin/automount --pid-file /var/run/autofs.pid
> root 2618146 0.0 0.0 218676 2168 ? S 05:00 0:00
> /usr/sbin/automount --pid-file /var/run/autofs.pid
> root 2618214 0.0 0.0 218676 2168 ? S 05:00 0:00
> /usr/sbin/automount --pid-file /var/run/autofs.pid
> root 2618224 0.0 0.0 218676 2168 ? S 05:00 0:00
> /usr/sbin/automount --pid-file /var/run/autofs.pid
> root 2618227 0.0 0.0 218676 2168 ? S 05:00 0:00
> /usr/sbin/automount --pid-file /var/run/autofs.pid
> root 2618240 0.0 0.0 218676 2168 ? S 05:00 0:00
> /usr/sbin/automount --pid-file /var/run/autofs.pid
> root 2618248 0.0 0.0 218676 2168 ? S 05:00 0:00
> /usr/sbin/automount --pid-file /var/run/autofs.pid
> root 2618250 0.0 0.0 218676 2168 ? S 05:00 0:00
> /usr/sbin/automount --pid-file /var/run/autofs.pid
> root 2618252 0.0 0.0 218676 2168 ? S 05:00 0:00
> /usr/sbin/automount --pid-file /var/run/autofs.pid
> root 2618701 0.0 0.0 218676 2168 ? S 05:00 0:00
> /usr/sbin/automount --pid-file /var/run/autofs.pid
> root 2618702 0.0 0.0 218676 2168 ? S 05:00 0:00
> /usr/sbin/automount --pid-file /var/run/autofs.pid
>
> And it remained in that state afterwards. I don't know if the defunct
> umount are suspicious, I guess not.
>
> One last thing: a manual umount of /home/userY was done by a script at
> 6:26 (/home/userY was NOT mounted though), and it remained blocked. I'm
> not sure if it's a consequence of autofs being blocked or something else.
>
* Re: Regular deadlocks
2016-06-28 0:27 ` Ian Kent
@ 2016-06-28 7:48 ` Cyril B.
2016-07-01 7:28 ` Ian Kent
2016-06-29 12:45 ` Cyril B.
1 sibling, 1 reply; 17+ messages in thread
From: Cyril B. @ 2016-06-28 7:48 UTC (permalink / raw)
To: Ian Kent, autofs
On 06/28/2016 02:27 AM, Ian Kent wrote:
> Can you describe again how this fits together?
Sure. Please let me know if I'm not clear. Here's my /etc/auto.master:
/nfs program:/etc/auto.nfs nobind
/home program:/etc/auto.home --mode=751
/etc/auto.nfs does some sanity checks, then does:
echo -fstype=nfs4,noatime,nosuid,_netdev,soft,intr,timeo=1000 $1:/
/etc/auto.home looks up in an internal database which NFS server the
account is on, then does:
echo -fstype=bind :/nfs/httpX/$1
So /nfs and /home are entirely handled by autofs. There's no NFS or user
mount handled by something else.
On the NFS servers, /home is a standard filesystem (XFS), here's the
/etc/exports:
/home 1.2.3.4 (rw,async,insecure,no_subtree_check,no_root_squash,fsid=0)
On my server that deadlocks the most, all users are on the same NFS
server (named http12).
> At this point a full debug log would probably answer most of my questions.
I've just enabled debug logs, I'll send them to you as soon as I have a
new deadlock.
Thanks again.
--
Cyril B.
* Re: Regular deadlocks
2016-06-28 0:27 ` Ian Kent
2016-06-28 7:48 ` Cyril B.
@ 2016-06-29 12:45 ` Cyril B.
1 sibling, 0 replies; 17+ messages in thread
From: Cyril B. @ 2016-06-29 12:45 UTC (permalink / raw)
To: Ian Kent, autofs
On 06/28/2016 02:27 AM, Ian Kent wrote:
> At this point a full debug log would probably answer most of my questions.
I have a full debug log for you. I don't know what the policy is for
sending attachments on the mailing list, so I'll send the log to you only.
The "deadlock" happened at 22:00, /nfs/http12 was unmounted shortly before:
Jun 28 21:55:21 ssh1 automount[3848595]: umount_subtree_mounts:
unmounting dir = /nfs/http12
Here's a 'ps aux|grep mount' from this morning, before I cleaned it up:
root 153793 0.0 0.0 271852 1868 ? S Jun28 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 153798 0.0 0.0 271852 1868 ? S Jun28 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 153803 0.0 0.0 271852 1868 ? S Jun28 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 153808 0.0 0.0 0 0 ? Z Jun28 0:00
[mount] <defunct>
root 153819 0.0 0.0 271852 1868 ? S Jun28 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 153860 0.0 0.0 271852 1868 ? S Jun28 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 153861 0.0 0.0 271852 1868 ? S Jun28 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 154529 0.0 0.0 271852 1868 ? S Jun28 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 155381 0.0 0.0 271852 1868 ? S Jun28 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 156565 0.0 0.0 272880 1900 ? S Jun28 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 176662 0.0 0.0 3944 392 ? S Jun28 0:00
/bin/sh -c /bin/umount /home/bikes
root 176663 0.0 0.0 8124 480 ? S Jun28 0:00
/bin/umount /home/bikes
root 176671 0.0 0.0 274936 1964 ? S Jun28 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 195792 0.0 0.0 275964 1996 ? S Jun28 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 195817 0.0 0.0 275964 1996 ? S Jun28 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 228410 0.0 0.0 276992 2028 ? S Jun28 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 232950 0.0 0.0 279048 2092 ? S 00:00 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 232969 0.0 0.0 279048 2092 ? S 00:00 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 236404 0.0 0.0 280076 2124 ? S 00:05 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 262934 0.0 0.0 281104 2156 ? S 00:39 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 340095 0.0 0.0 283160 2220 ? S 02:55 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 344231 0.0 0.0 283160 2220 ? S 03:05 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 361273 0.0 0.0 284188 2252 ? S 03:55 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 384154 0.0 0.0 287272 2348 ? S 05:00 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 384165 0.0 0.0 287272 2348 ? S 05:00 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 384170 0.0 0.0 287272 2348 ? S 05:00 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 407146 0.0 0.0 288300 2380 ? S 06:01 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 595311 0.0 0.0 289328 2412 ? S 09:34 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 597797 0.0 0.0 290356 2444 ? S 09:40 0:00
/usr/sbin/automount --pid-file /var/run/autofs.pid
root 3848595 0.0 0.0 291384 5756 ? Ssl Jun28 0:21
/usr/sbin/automount --pid-file /var/run/autofs.pid
The umount '/home/bikes' was not done by autofs, it was done by a script.
--
Cyril B.
* Re: Regular deadlocks
2016-06-27 14:04 ` Cyril B.
2016-06-28 0:27 ` Ian Kent
@ 2016-07-01 7:03 ` Ian Kent
1 sibling, 0 replies; 17+ messages in thread
From: Ian Kent @ 2016-07-01 7:03 UTC (permalink / raw)
To: Cyril B., autofs
On Mon, 2016-06-27 at 16:04 +0200, Cyril B. wrote:
> On 06/27/2016 02:26 AM, Ian Kent wrote:
> > How is autofs configured.
> >
> > If --disable-mount-locking is not used then any mount can block all other
> > mounts, if it is used then there can be mtab corruption if still using a
> > text
> > based mtab.
>
> I use --disable-mount-locking.
>
> > I always use --disable-mount-locking and nowadays the mtab is usually a
> > symlink
> > into the proc file system so corruption isn't a problem.
>
> /etc/mtab is actually not a symlink on my systems.
>
>
> Anyway, I have more details for you as the issue appeared today and I
> could investigate some more. This is on a server that only mounts one
> single NFS server (http12), so the multi-servers blocking issue is
> irrelevant here.
>
> A few minutes before the "deadlock" occurred, /nfs/http12 was unmounted
> by autofs, I assume because it was idle. I have TIMEOUT=600. That
> explains why the issue appears much more frequently on a server which is
> way less busy (and usually in the middle of the night): the NFS server
> needs to be idle enough to be unmounted.
That does seem to be causing a problem.
The mount request for /nfs/http12 doesn't seem to be able to make progress but
that could be due to what looks like a signal handling problem, not sure.
There are a bunch of processes blocked on poll(2), waiting for input from a pipe
that probably belongs to a process that has died (quite a few of them).
You would think that poll(2) would get a SIGCHLD signal when the child process
terminates but, unfortunately, that can't be relied upon in a threaded
application.
Only a single thread of those that don't have SIGCHLD blocked will receive the
signal, and that might not be the thread that fork(2)ed the child, and if there
are multiple signals sent at the same time the number of signals delivered might
not match the number of processes that sent the signal.
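The usual way to cope with SIGCHLD notifications being merged is to treat
one notification as "at least one child exited" and keep collecting until
nothing is left. A minimal sketch of that idea follows, in Python for
brevity (autofs itself is C, and this is not its code). The function name
reap_exited_children is made up for the example; os.waitpid() with
os.WNOHANG is the standard non-blocking wait.

```python
import os


def reap_exited_children():
    """Collect every child that has already exited, without blocking.

    Intended to run after any SIGCHLD notification: a single notification
    may stand for several exits, so keep calling waitpid() until it
    reports that nothing more can be collected.
    """
    reaped = {}
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children left at all
        if pid == 0:
            break  # children exist, but none of them has exited yet
        reaped[pid] = status
    return reaped
```

Note that waiting on -1 collects any child of the process, which is only
appropriate when a single place in the program owns all child reaping.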
So I think the first thing to try will be to change the logic around the poll(2)
call in the timed_wait() function to be non-blocking and check for child process
existence before waiting on poll(2) again.
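As a rough illustration of that logic (a sketch only, in Python rather than
C, and not the actual timed_wait() change): poll the pipe in short slices
and, after each slice, ask the kernel directly whether the child has gone
away instead of relying on a signal reaching the right thread. The name
wait_for_child and the parameters timeout_s and slice_ms are assumptions
made up for this example.

```python
import os
import select
import time


def wait_for_child(read_fd, child_pid, timeout_s, slice_ms=100):
    """Collect a child's pipe output without ever blocking indefinitely.

    Returns (output, exit_code). exit_code is None if the timeout expired
    while the child was still running, and negative if a signal killed it.
    """
    poller = select.poll()
    poller.register(read_fd, select.POLLIN)
    deadline = time.monotonic() + timeout_s
    output = bytearray()
    eof = False

    def read_ready(wait_ms):
        """Read once if the pipe is ready within wait_ms; True if it was."""
        nonlocal eof
        if eof or not poller.poll(wait_ms):
            return False
        data = os.read(read_fd, 4096)
        if data:
            output.extend(data)
        else:
            eof = True  # every write end of the pipe has been closed
        return True

    while True:
        if not read_ready(slice_ms) and eof:
            time.sleep(slice_ms / 1000)  # nothing to poll; pace the loop

        # Check for the child explicitly rather than trusting that a
        # SIGCHLD interrupted this particular thread.
        pid, status = os.waitpid(child_pid, os.WNOHANG)
        if pid == child_pid:
            while read_ready(0):
                pass  # drain what the child wrote before it exited
            if os.WIFEXITED(status):
                return bytes(output), os.WEXITSTATUS(status)
            return bytes(output), -os.WTERMSIG(status)
        if time.monotonic() >= deadline:
            return bytes(output), None
```

The point of the short poll slices is that the loop still notices the
child's exit when the pipe never reports end-of-file, for example because
some other process inherited the write end and keeps it open.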
That's probably not going to help with whatever has caused a problem with
mount(8) (or probably mount.nfs(8)) but it will provide the opportunity to put
some logging in to try and get more information on it. Not only that, you will
then likely get a bunch of mount failures for mounts that shouldn't have failed.
The really annoying thing is that there is no output at all from any of the
child processes that must have been forked.
Anyway, that's going to take a while.
>
> However, I still had many /home/userX mounted (by autofs), which point
> to /nfs/http12/userX. Shouldn't autofs not unmount /nfs/http12 when at
> least one /home/userX is mounted? To be clear, here's an extract from my
> /proc/mounts BEFORE the NFS server is unmounted by autofs:
That's another fairly difficult question.
First it's the kernel dentry corresponding to /nfs/http12 that holds the
last_used counter that determines if the dentry hasn't been used for the given
timeout. For that timeout to occur the dentry must not have been busy during
that time which means no open file handles, no working directories open within
it and no activity that would update the last_used value (not usually plain path
walks).
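The expiry decision itself is made in the kernel, but the two user-visible
reasons listed above (open files and working directories) can be
approximated from /proc. The sketch below is only a diagnostic aid under
that assumption, not what autofs does; users_of is a made-up name, and
without root privileges it can only inspect processes owned by the caller.

```python
import os


def users_of(mount_point):
    """Return {pid: [reasons]} for processes with a cwd or open file
    located at or below mount_point, judging by the /proc symlinks."""
    root = os.path.normpath(mount_point)
    prefix = root.rstrip("/") + "/"
    found = {}

    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue  # not a process directory
        base = os.path.join("/proc", entry)
        links = [("cwd", os.path.join(base, "cwd"))]
        try:
            links += [("fd " + fd, os.path.join(base, "fd", fd))
                      for fd in os.listdir(os.path.join(base, "fd"))]
        except OSError:
            pass  # process exited, or its fd table is not readable
        for reason, link in links:
            try:
                target = os.readlink(link)
            except OSError:
                continue
            if target == root or target.startswith(prefix):
                found.setdefault(int(entry), []).append(reason)
    return found
```

An empty result for a mount point does not prove the kernel considers it
idle; it only rules out the two causes this function looks for.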
Then there's the question of bind mounting.
I think that when you bind mount the result is an independent mount but just how
that is handled when bind mounting a sub directory of a mount isn't clear. The
output of /proc/mounts (last time I looked at this case) makes it look like the
parent mount is used.
So it's not clear what's going on there.
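One place to look from user space is /proc/self/mountinfo, which, unlike
/proc/mounts, reports for every mount both the device number of its
superblock and the directory inside that filesystem that forms the root of
the mount. The sketch below (Python, helper names made up for the example)
splits those lines into fields and groups mounts by device, so bind mounts
of sub directories show up next to the mount they share a superblock with.
It ignores the octal escaping that mountinfo applies to spaces in paths.

```python
def parse_mountinfo_line(line):
    """Split one /proc/self/mountinfo line into its documented fields."""
    # Optional fields end at a lone "-"; the filesystem details follow it.
    left, right = line.rstrip("\n").split(" - ", 1)
    fields = left.split()
    tail = right.split()
    return {
        "mount_id": int(fields[0]),
        "parent_id": int(fields[1]),
        "device": fields[2],        # major:minor of the superblock
        "root": fields[3],          # directory in the fs that is this mount's root
        "mount_point": fields[4],
        "options": fields[5],
        "fstype": tail[0],
        "source": tail[1],
        "super_options": tail[2] if len(tail) > 2 else "",
    }


def mounts_sharing_superblock(lines):
    """Group (mount_point, root) pairs by superblock device number."""
    by_device = {}
    for line in lines:
        m = parse_mountinfo_line(line)
        by_device.setdefault(m["device"], []).append(
            (m["mount_point"], m["root"]))
    return by_device
```

On a live system the input would be the lines of /proc/self/mountinfo; a
group whose entries have different roots but the same device is a set of
separate mounts over one filesystem instance.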
If the parent mount is used for each bound mount then there would be multiple
independent mounts each able to be umounted independently. For a start that
implies the busyness of each of these mounts can't influence the busyness of
the others, nor that of the parent itself.
That sounds a bit strange I know but it would take a lot of time trawling the
VFS to really understand what is going on there.
We do however see that /nfs/http12 can be umounted so I think we can assume
something similar to what I describe is the way it is.
I don't know yet what that means for the scenario here, what I've suggested
above needs to be done first I think.
Ian
* Re: Regular deadlocks
2016-06-28 7:48 ` Cyril B.
@ 2016-07-01 7:28 ` Ian Kent
2016-07-01 7:48 ` Cyril B.
0 siblings, 1 reply; 17+ messages in thread
From: Ian Kent @ 2016-07-01 7:28 UTC (permalink / raw)
To: Cyril B., autofs
On Tue, 2016-06-28 at 09:48 +0200, Cyril B. wrote:
> On 06/28/2016 02:27 AM, Ian Kent wrote:
> > Can you describe again how this fits together?
>
> Sure. Please let me know if I'm not clear. Here's my /etc/auto.master:
Thanks for this. After reading it I see I should have been able to work it out
from your original post.
Usually I don't like to question why people set things up the way they do, in
fact I find it annoying when someone asks a question and people respond with
"don't do it that way, do it this way" because, usually, if that was an option
the poster wouldn't have asked the question.
But in this case I think it's worth mentioning because, due to the mount
independence required by kernel mount namespaces, the problem we are seeing
with /nfs/http12 might not be easily solvable, or solvable at all.
>
> /nfs program:/etc/auto.nfs nobind
> /home program:/etc/auto.home --mode=751
>
> /etc/auto.nfs does some sanity checks, then does:
> echo -fstype=nfs4,noatime,nosuid,_netdev,soft,intr,timeo=1000 $1:/
Is there a reason you need to use the nfsv4 root namespace rather than using the
/home export directly?
Mounting the root namespace first needed to be done long ago, and it was a
problem, particularly for autofs, but that has long since changed.
You should be able to use nfsv4 mounts directly without this intermediate mount.
Ian
* Re: Regular deadlocks
2016-07-01 7:28 ` Ian Kent
@ 2016-07-01 7:48 ` Cyril B.
2016-07-28 7:36 ` Cyril B.
0 siblings, 1 reply; 17+ messages in thread
From: Cyril B. @ 2016-07-01 7:48 UTC (permalink / raw)
To: Ian Kent, autofs
On 07/01/2016 09:28 AM, Ian Kent wrote:
>> /nfs program:/etc/auto.nfs nobind
>> /home program:/etc/auto.home --mode=751
>>
>> /etc/auto.nfs does some sanity checks, then does:
>> echo -fstype=nfs4,noatime,nosuid,_netdev,soft,intr,timeo=1000 $1:/
>
> Is there a reason you need to use the nfsv4 root namespace rather than using the
> /home export directly?
You mean get rid of the bind mounts and have /home/foo be a NFSv4 mount
to (e.g.) http12:/foo?
If so, I assumed that it would mean having a lot of independent mounts
over NFS (and as many TCP connections between the NFS client and
server), and doing a single NFS mount with local bind mounts would be
better.
However, my assumption may very well be wrong. Would the kernel share
the same underlying NFS mount (and TCP connection) for all those mount
points to the same NFS server?
--
Cyril B.
* Re: Regular deadlocks
2016-07-01 7:48 ` Cyril B.
@ 2016-07-28 7:36 ` Cyril B.
0 siblings, 0 replies; 17+ messages in thread
From: Cyril B. @ 2016-07-28 7:36 UTC (permalink / raw)
To: Ian Kent, autofs
On 07/01/2016 09:48 AM, Cyril B. wrote:
>> Is there a reason you need to use the nfsv4 root namespace rather than
>> using the
>> /home export directly?
>
> You mean get rid of the bind mounts and have /home/foo be a NFSv4 mount
> to (e.g.) http12:/foo?
>
> If so, I assumed that it would mean having a lot of independent mounts
> over NFS (and as many TCP connections between the NFS client and
> server), and doing a single NFS mount with local bind mounts would be
> better.
>
> However, my assumption may very well be wrong. Would the kernel share
> the same underlying NFS mount (and TCP connection) for all those mount
> points to the same NFS server?
Well, I've been doing that for the past month and everything works
rather flawlessly. In particular, deadlocks are gone, so I'll keep that
setup.
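The replacement map script is not shown in this thread. Purely as an
illustration, a program map returning a direct NFSv4 entry could look like
the sketch below, written in Python (a program map can be any executable;
the original maps are shell scripts). ACCOUNT_SERVER is a hypothetical
stand-in for the internal account database, and the mount options are
copied from the /etc/auto.nfs line quoted earlier. autofs runs a program map
with the key as its argument and uses the entry printed on standard output;
a non-zero exit status means the key was not found.

```python
#!/usr/bin/env python3
import sys

# Options taken from the /etc/auto.nfs entry quoted earlier in the thread.
NFS_OPTIONS = "-fstype=nfs4,noatime,nosuid,_netdev,soft,intr,timeo=1000"

# Hypothetical stand-in for the internal account database.
ACCOUNT_SERVER = {"foo": "http12"}


def map_entry(key, accounts=ACCOUNT_SERVER):
    """Return the map entry for key, or None if the key is unknown."""
    if not key or "/" in key or any(c.isspace() for c in key):
        return None  # refuse keys that could escape the export root
    server = accounts.get(key)
    if server is None:
        return None
    # The servers export /home with fsid=0, so the account is at server:/key.
    return f"{NFS_OPTIONS} {server}:/{key}"


def main(argv):
    """Program-map entry point: argv[1] is the key passed by autofs."""
    if len(argv) != 2:
        return 1
    entry = map_entry(argv[1])
    if entry is None:
        return 1
    print(entry)
    return 0


if __name__ == "__main__" and len(sys.argv) == 2:
    sys.exit(main(sys.argv))
```

With this kind of map, /home/foo becomes an NFSv4 mount of http12:/foo and
the intermediate /nfs/http12 mount and the bind mounts are no longer needed.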
--
Cyril B.
end of thread, other threads:[~2016-07-28 7:36 UTC | newest]
Thread overview: 17+ messages
2016-06-25 23:41 Regular deadlocks Cyril B.
2016-06-26 1:59 ` Ian Kent
2016-06-26 10:02 ` Cyril B.
2016-06-26 10:30 ` Ian Kent
2016-06-26 11:13 ` Cyril B.
2016-06-27 0:26 ` Ian Kent
2016-06-27 0:35 ` Ian Kent
2016-06-27 14:04 ` Cyril B.
2016-06-28 0:27 ` Ian Kent
2016-06-28 7:48 ` Cyril B.
2016-07-01 7:28 ` Ian Kent
2016-07-01 7:48 ` Cyril B.
2016-07-28 7:36 ` Cyril B.
2016-06-29 12:45 ` Cyril B.
2016-07-01 7:03 ` Ian Kent
2016-06-26 10:59 ` Ian Kent
2016-06-26 11:02 ` Ian Kent