* NFS client locking hangs for period
From: Christian Reis @ 2003-01-24 20:49 UTC
To: neilb; +Cc: linux-kernel, NFS

Hello Neil,

I've been trying to get at this problem for a while now, and had been
concentrating on the client side of the problem (and consequently
bothering Trond about it) [1,2]. I am now pretty much convinced this is
a server-side problem, and as I've patched 2.4.20 with all the pending
NFS patches (those that didn't have to do with the kernel lock
breaking) and still see the issue, I decided to report this bug.

The scenario is: a set of NFS clients with root mounted over NFS from a
single server. Clients run vanilla 2.4.20, server runs 2.4.20 patched
with your server-side patches I mentioned above. The clients run okay
for a period, and then one of them will start to hang for long periods
of time on certain operations (it happens on startup and shutdown, for
instance). Once the client hangs start, the server needs to be rebooted
for it to clear up.

It seems to be reproducible by having the client hang or reboot without
shutting down properly. Another tip is that the server gets files left
over in /var/lib/nfs/sm/ for the hanging client(s).

I've been trying to track this down for a while, but since I'm not very
proficient with debugging at this level, I haven't had much luck. It's
really a problem because I need to reboot and make 20 people stop
working when the problem gets serious. Trond has had a hand trying to
help me, but we still haven't uncovered anything. I wonder if you have
any clue what could be happening?

The other details are standard: the clients are Debian woody systems
with nfs-utils 1.0.1 installed, and the server has the same version.
The server runs reiserfs over RAID-1 partitions (using the kernel md
driver). Could it be triggered because of this perhaps unusual
combination?

Some of the messages I point out below have some info about the issue,
including tcpdumps and traces of nlm_debug on the server and client.

Mount options follow for the client filesystems:

anthem:/export/root/ / nfs defaults,rw,rsize=8192,wsize=8192,nfsvers=2 0 0
anthem:/home /home nfs defaults,rw,rsize=8192,wsize=8192,nfsvers=3 0 0

I have checked and, yes, root is mounted using version 2 and the rest
as version 3. Perhaps I should try getting the kernel to mount root
using version 3?

[1] http://groups.google.com/groups?q=trond+christian+nfs&hl=pt&lr=&ie=UTF-8&client=googlet&scoring=d&selm=20030108151424.N2628%40blackjesus.async.com.br.lucky.linux.kernel&rnum=1
[2] http://groups.google.com/groups?hl=pt&lr=&ie=UTF-8&client=googlet&th=3575b3c5f3360eb0&seekm=20030108151424.N2628%40blackjesus.async.com.br.lucky.linux.kernel&frame=off

Thanks for any help you can give.

Take care,
-- 
Christian Reis, Senior Engineer, Async Open Source, Brazil.
http://async.com.br/~kiko/ | [+55 16] 261 2331 | NMFL

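For anyone auditing a similar setup, the NFS version each fstab entry asks for can be pulled out with a few lines of Python. This is only a rough sketch: the helper name nfs_versions is made up for illustration, and it assumes the usual six whitespace-separated fstab fields as in the two entries quoted above.

```python
def nfs_versions(fstab_text):
    """Map each NFS mount point to the protocol version its options request.

    Hypothetical helper: assumes whitespace-separated fstab fields
    (device, mount point, type, options, dump, pass).
    """
    versions = {}
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) < 4 or fields[2] != "nfs":
            continue  # not an NFS entry
        mount_point, options = fields[1], fields[3].split(",")
        version = None  # None means no version option was given
        for opt in options:
            if opt.startswith("nfsvers=") or opt.startswith("vers="):
                version = int(opt.split("=", 1)[1])
        versions[mount_point] = version
    return versions


# The two entries from the report above.
FSTAB = """\
anthem:/export/root/ / nfs defaults,rw,rsize=8192,wsize=8192,nfsvers=2 0 0
anthem:/home /home nfs defaults,rw,rsize=8192,wsize=8192,nfsvers=3 0 0
"""

print(nfs_versions(FSTAB))
```

Note that fstab only records what was requested. As the report says for root, what the kernel actually negotiated has to be checked against the live mount table.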
* Re: NFS client locking hangs for period
From: Neil Brown @ 2003-01-25 3:54 UTC
To: Christian Reis; +Cc: linux-kernel, NFS

On Friday January 24, kiko@async.com.br wrote:
>
> Hello Neil,

Hi.

>
> I've been trying to get at this problem for a while now,
....
>
> It seems to be reproducible by having the client hang or reboot without
> shutting down properly. Another tip is that the server gets files left
> over in /var/lib/nfs/sm/ for the hanging client(s).
>
> Mount options follow for the client filesystems:
>
> anthem:/export/root/ / nfs defaults,rw,rsize=8192,wsize=8192,nfsvers=2 0 0
> anthem:/home /home nfs defaults,rw,rsize=8192,wsize=8192,nfsvers=3 0 0
>

Hmmm. So you have several clients all mounting the same root
filesystem, and mounting it writable? That doesn't sound like a plan
for success. How do you make sure the clients don't tread over each
other when using /etc files?

I suspect that what you really want is to mount root read-only, or
mount separate roots for each client, and then in either case to mount
with the "nolock" flag.

I suspect that your problem is related to the client trying to do
locking, but not having statd running on the client.

You cannot meaningfully do locking on an NFS mounted root filesystem.
In fact, I think it would be good if the default mount options for nfs
root included nolock... and if I read fs/nfs/nfsroot.c:root_nfs_name
correctly, nolock is the default. Are you overriding that default
by explicitly setting "lock"??

NeilBrown

* Re: NFS client locking hangs for period
From: Christian Reis @ 2003-01-26 16:02 UTC
To: Neil Brown; +Cc: linux-kernel, NFS

On Sat, Jan 25, 2003 at 02:54:09PM +1100, Neil Brown wrote:
> Hmmm. So you have several clients all mounting the same root
> filesystem, and mounting it writable? That doesn't sound like a plan
> for success. How do you make sure the clients don't tread over each
> other when using /etc files?

The truth is few (broken wrt the FHS) programs actually write to /etc.
I have set up everything so nothing is written to in /etc, and it
actually works very well (have to use a special init(8) that doesn't
write to /etc/ioctl.save). This setup has been running for almost a
year now, with the locking problem being the only one left to fix.

> I suspect that what you really want is to mount root read-only, or
> mount separate roots for each client, and then in either case to mount
> with the "nolock" flag.

Well, mounting root read-only is a good idea but it sacrifices being
able to administer the system from any station, and it also puts a lot
of burden on me to fix *all* programs to not write to anywhere on it.
This shouldn't be too hard, but we're still just working around the
bug, which I would really like to identify and fix.

> I suspect that your problem is related to the client trying to do
> locking, but not having statd running on the client.

I am 100% positive statd runs on every single client. This problem
here only happens sporadically. It goes away when I restart nfsd and
mountd (in that order). It really does look like a bug <wink>

> You cannot meaningfully do locking on an NFS mounted root filesystem.
> In fact, I think it would be good if the default mount options for nfs
> root included nolock... and if I read fs/nfs/nfsroot.c:root_nfs_name
> correctly, nolock is the default. Are you overriding that default
> by explicitly setting "lock"??

Nope. I've just tested and the default (specifying no lock option upon
bootup) really is nolock:

/dev/root on / type nfs (rw,v3,rsize=8192,wsize=8192,hard,udp,nolock,addr=192.168.99.4)

I wonder why you can't do locking on NFS root (if it's a current
limitation or if it doesn't make sense). But I also think this problem
shouldn't be happening if no locking was going on. And when I checked
using nlm_debug it sure did seem locking was being used. What do you
make of it?

Take care,
-- 
Christian Reis, Senior Engineer, Async Open Source, Brazil.
http://async.com.br/~kiko/ | [+55 16] 261 2331 | NMFL

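The check done by eye above (looking for nolock in the mount(8) output) is easy to script across many clients. The following is a minimal sketch, not a definitive tool: nfs_lock_status is a hypothetical name, and the regular expression assumes the "device on dir type fstype (options)" layout of the line quoted above. The second sample line is made up for contrast.

```python
import re

# Matches lines such as: /dev/root on / type nfs (rw,v3,...,nolock,...)
MOUNT_LINE = re.compile(r"^(\S+) on (\S+) type (\S+) \(([^)]*)\)$")


def nfs_lock_status(mount_output):
    """Return {mount_point: True/False} for NFS mounts.

    True means "nolock" is absent from the option list, i.e. NLM locking
    was not disabled at mount time. Hypothetical helper for illustration.
    """
    status = {}
    for line in mount_output.splitlines():
        match = MOUNT_LINE.match(line.strip())
        if not match or match.group(3) != "nfs":
            continue  # unparseable line or not an NFS mount
        options = match.group(4).split(",")
        status[match.group(2)] = "nolock" not in options
    return status


SAMPLE = (
    "/dev/root on / type nfs "
    "(rw,v3,rsize=8192,wsize=8192,hard,udp,nolock,addr=192.168.99.4)\n"
    # Made-up second line, for contrast only:
    "anthem:/home on /home type nfs (rw,v3,rsize=8192,wsize=8192)\n"
)

print(nfs_lock_status(SAMPLE))
```

With root mounted nolock, any lock traffic seen with nlm_debug would have to come from another mount such as /home, which is consistent with the observation above.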
* Re: [NFS] Re: NFS client locking hangs for period
From: Trond Myklebust @ 2003-01-26 21:49 UTC
To: Christian Reis; +Cc: Neil Brown, linux-kernel, NFS

>>>>> " " == Christian Reis <kiko@async.com.br> writes:

     > I wonder why you can't do locking on NFS root (if it's a
     > current limitation or if it doesn't make sense).

Locking supposes that you are already running a statd daemon, which
you clearly cannot be doing on an nfsroot system. If you need locking
on a root partition, then you'll need to set up an initrd from which
to start all the necessary daemons...

BTW: Did I understand you and Neil correctly when you appeared to say
that you were sharing the *same* root partition between several
clients?

If so, then that could easily explain your problem: a directory like
/var/lib/nfs simply cannot be shared among several different
machines. Read the 'statd' manpage, and I'm sure you will understand
why.

Cheers,
  Trond

* Re: [NFS] Re: NFS client locking hangs for period
From: Christian Reis @ 2003-01-26 22:47 UTC
To: Trond Myklebust; +Cc: Neil Brown, linux-kernel, NFS

On Sun, Jan 26, 2003 at 10:49:14PM +0100, Trond Myklebust wrote:
> >>>>> " " == Christian Reis <kiko@async.com.br> writes:
>
>      > I wonder why you can't do locking on NFS root (if it's a
>      > current limitation or if it doesn't make sense).
>
> Locking supposes that you are already running a statd daemon, which
> you clearly cannot be doing on an nfsroot system. If you need locking
> on a root partition, then you'll need to set up an initrd from which
> to start all the necessary daemons...

This makes a lot of sense, I just had never thought about it properly.
I'm not sure I *need* locking, so I'll run with nolock till it bites
me.

> BTW: Did I understand you and Neil correctly when you appeared to say
> that you were sharing the *same* root partition between several
> clients?

Yes, you did understand correctly. The same root partition is mounted
by around 20 machines. It works, too. The bug that we have manifests
itself very rarely, and only when one of the machines does an unclean
shutdown. I still haven't been able to reproduce it so I still haven't
seen a solution yet.

> If so, then that could easily explain your problem: a directory like
> /var/lib/nfs simply cannot be shared among several different
> machines. Read the 'statd' manpage, and I'm sure you will understand
> why.

Well, none of the machines by default exports anything through NFS, so
none of them explicitly *need* /var/lib/nfs. I've done some careful
study and separated the directories which are written to on a per-host
basis, and used a lot of tmpfs. It works quite well, to be honest. A
breakdown of "special" directories:

- /var/spool and /var/log need to be separate, for obvious reasons.
- /etc/mtab should be a symlink to /proc/mounts, to avoid the need for
  writing there.
- /tmp, /var/tmp, /dev/shm, /var/lock, /var/run, /var/lib/nfs,
  /var/yp/binding, /var/lib/sendmail are tmpfs.

None of the users have root access so writing to the partition is only
done as the result of servers running. I used a lot of reboots and
ls -lt to find out what needs to be separate, and there are few issues
that need fixing (/etc/ioctl.save being the latest).

One issue I ran into that I only discovered today (well, we all have
to learn someday) was that a shared /dev is not a good idea, because
some programs write to it. Case in point was syslogd, which creates
/dev/log: all but the last machine had logging broken. Since nobody
needs logs on these boxes anyway, it had gone on unnoticed, but I'm now
using devfs, and it works fine.

Everybody seems to find this setup a bit bizarre. It's not. It keeps
maintenance down to zero for everything, and adding a new box means
running a script once.

statd(8) does indicate that /var/lib/nfs is private, so I just mount it
as tmpfs. Should I make it persistent, or is the fact those files
disappear on an unclean reboot a sign of trouble?

Take care,
-- 
Christian Reis, Senior Engineer, Async Open Source, Brazil.
http://async.com.br/~kiko/ | [+55 16] 261 2331 | NMFL

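The tmpfs part of the breakdown above maps onto fstab entries fairly mechanically. The sketch below generates them. It is an illustration under assumptions, not the setup actually used here: tmpfs_fstab_lines is a hypothetical helper, the "defaults" options string is a guess, and /var/lib/nfs is deliberately left out because Trond's reply in this thread argues that directory must be persistent.

```python
def tmpfs_fstab_lines(directories, options="defaults"):
    """Build one fstab line per directory that should live on tmpfs.

    Hypothetical helper; the options string is an assumption and would
    normally carry size= and mode= limits chosen per site.
    """
    return ["tmpfs {} tmpfs {} 0 0".format(d, options) for d in directories]


# Directories from the list above, minus /var/lib/nfs (see lead-in).
VOLATILE_DIRS = [
    "/tmp",
    "/var/tmp",
    "/dev/shm",
    "/var/lock",
    "/var/run",
    "/var/yp/binding",
    "/var/lib/sendmail",
]

for entry in tmpfs_fstab_lines(VOLATILE_DIRS):
    print(entry)
```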
* Re: [NFS] Re: NFS client locking hangs for period
From: Trond Myklebust @ 2003-01-26 23:02 UTC
To: Christian Reis; +Cc: Neil Brown, linux-kernel, NFS

>>>>> " " == Christian Reis <kiko@async.com.br> writes:

     > statd(8) does indicate that /var/lib/nfs is private, so I just
     > mount it as tmpfs. Should I make it persistent, or is the fact
     > those files disappear on an unclean reboot a sign of trouble?

If you want locking to work, then /var/lib/nfs *MUST* be persistent
and unique for each client. If not, then the server will fail to be
notified that it needs to release any POSIX locks it might think you
were holding if/when your NFS client fails to shut down cleanly.

That in turn will typically cause a deadlock the next time you try to
access your mailspool (if the server thinks it is already holding a
lock on your behalf).

Cheers,
  Trond

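Whether /var/lib/nfs sits on volatile storage can be checked from the mount table. Below is a minimal sketch that takes the text of /proc/mounts and reports which mount backs a given path. The names backing_mount and VOLATILE_TYPES are invented for this example, and the sample mount table is made up to resemble the tmpfs setup described in this thread.

```python
VOLATILE_TYPES = {"tmpfs", "ramfs"}  # contents vanish on reboot


def backing_mount(path, proc_mounts_text):
    """Return (mount_point, fstype) of the mount that holds `path`.

    Uses longest-prefix matching over the mount table; on ties a later
    line wins, since later mounts shadow earlier ones.
    """
    best = None
    for line in proc_mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # malformed line
        mount_point, fstype = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if (path + "/").startswith(prefix):
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, fstype)
    return best


# Made-up mount table resembling the tmpfs layout discussed above.
SAMPLE_MOUNTS = """\
/dev/root / nfs rw,v3,nolock 0 0
none /proc proc rw 0 0
tmpfs /var/lib/nfs tmpfs rw 0 0
"""

mount_point, fstype = backing_mount("/var/lib/nfs", SAMPLE_MOUNTS)
if fstype in VOLATILE_TYPES:
    print("WARNING: %s is on %s; statd state is lost on reboot"
          % (mount_point, fstype))
```

This only covers the "persistent" half of the requirement. Uniqueness per client still has to be verified on the server side, for instance by confirming each client mounts a different export.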
* Re: [NFS] Re: NFS client locking hangs for period
From: Christian Reis @ 2003-01-26 23:56 UTC
To: Trond Myklebust; +Cc: Neil Brown, linux-kernel, NFS

On Mon, Jan 27, 2003 at 12:02:00AM +0100, Trond Myklebust wrote:
> >>>>> " " == Christian Reis <kiko@async.com.br> writes:
>
>      > statd(8) does indicate that /var/lib/nfs is private, so I just
>      > mount it as tmpfs. Should I make it persistent, or is the fact
>      > those files disappear on an unclean reboot a sign of trouble?
>
> If you want locking to work, then /var/lib/nfs *MUST* be
> persistent and unique for each client.

I had never realized this; things are symmetric in an odd way with NFS,
and this bit with locking can trick you. I've changed the clients to
mount private nfs directories (the perks of shared root for diskless ;)
and I do hope things will work out from now on.

One thing worth noting is that the private /var/lib/nfs directory has
to be mounted a) with nolock (I assume) and b) *before* statd and lockd
have gone up.

> That again will typically cause a deadlock the next time you try to
> access your mailspool (if the server thinks it is already holding a
> lock on your behalf).

I am now left wondering how it bit us so little here.

Is there a way of finding out exactly *which* files are being locked at
a certain time for a certain client?

Take care,
-- 
Christian Reis, Senior Engineer, Async Open Source, Brazil.
http://async.com.br/~kiko/ | [+55 16] 261 2331 | NMFL

* Re: [NFS] Re: NFS client locking hangs for period
From: Trond Myklebust @ 2003-01-27 0:06 UTC
To: Christian Reis; +Cc: Neil Brown, linux-kernel, NFS

>>>>> " " == Christian Reis <kiko@async.com.br> writes:

     > Is there a way of finding out exactly *which* files are being
     > locked at a certain time for a certain client?

Not really. 'cat /proc/locks' is about the closest you can get. That
will give you no NFS-specific information though.

Cheers,
  Trond

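For reference, /proc/locks output is regular enough to parse. The sketch below assumes the field layout described in proc(5) for current kernels (ordinal, lock type, ADVISORY/MANDATORY, access mode, PID, major:minor:inode, start, end). Older kernels may print extra trailing fields, which this code simply ignores. parse_proc_locks is a hypothetical name and the sample lines are made up.

```python
def parse_proc_locks(text):
    """Parse /proc/locks-style text into a list of dicts.

    Assumes the layout documented in proc(5); anything after the eighth
    field is ignored. Lines marked "->" describe blocked waiters.
    """
    locks = []
    for line in text.splitlines():
        fields = line.split()
        blocked = len(fields) > 1 and fields[1] == "->"
        if blocked:
            fields = fields[:1] + fields[2:]  # drop the "->" marker
        if len(fields) < 8:
            continue  # not a lock line
        major, minor, inode = fields[5].split(":")
        locks.append({
            "kind": fields[1],            # POSIX, FLOCK, ...
            "mode": fields[2],            # ADVISORY or MANDATORY
            "access": fields[3],          # READ or WRITE
            "pid": int(fields[4]),
            "device": major + ":" + minor,
            "inode": int(inode),
            "start": fields[6],
            "end": fields[7],             # may be the string "EOF"
            "blocked": blocked,
        })
    return locks


# Made-up sample in the proc(5) layout.
SAMPLE_LOCKS = """\
1: POSIX  ADVISORY  WRITE 1234 08:01:7864448 0 EOF
1: -> POSIX  ADVISORY  WRITE 5678 08:01:7864448 0 EOF
"""

for lock in parse_proc_locks(SAMPLE_LOCKS):
    state = "waiting for" if lock["blocked"] else "holding"
    print("pid %d %s %s lock on inode %d (dev %s)"
          % (lock["pid"], state, lock["access"], lock["inode"], lock["device"]))
```

As noted above, nothing here says which NFS client a lock belongs to. The device and inode pair can at least be mapped back to a path on the server, for example with find and its -inum test on the exported filesystem.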
* Re: Dell Latitude CPi keyboard problems since 2.5.42
From: Tom Sightler @ 2003-01-27 2:19 UTC
To: linux-kernel; +Cc: vojtech

> Hmm, interesting. Can you try disabling some of the probes for
> extended keyboards in atkbd.c to see if some of them could confuse
> your keyboard so that the BIOS doesn't like it after boot? Also you
> may want to kill the keyboard reset on reboot ... (atkbd_cleanup) ...

I've been following this because my Dell Latitude C810 has the
"keyboard/mouse doesn't work after reboot" problem with all of the
recent 2.5.x kernels that I have tried. When I saw the suggestion above
to try removing the keyboard reset I thought that was just too easy to
pass up giving it a try.

Sure enough, removing just the one line that performs the keyboard
reset from atkbd_cleanup solves the problem for me.

Now I guess it's time to try to determine why, and what the real fix
should likely be. Just thought it might be valuable to have another
report about the problem.

Later,
Tom

* Re: [NFS] Re: NFS client locking hangs for period
From: Denis Vlasenko @ 2003-01-28 8:14 UTC
To: Christian Reis, Trond Myklebust; +Cc: Neil Brown, linux-kernel, NFS

On 27 January 2003 00:47, Christian Reis wrote:
> On Sun, Jan 26, 2003 at 10:49:14PM +0100, Trond Myklebust wrote:
> > >>>>> " " == Christian Reis <kiko@async.com.br> writes:
> > > I wonder why you can't do locking on NFS root (if it's a
> > > current limitation or if it doesn't make sense).
> >
> > locking supposes that you are already running a statd daemon, which
> > you clearly cannot be doing on an nfsroot system. If you need
> > locking on a root partition, then you'll need to set up an initrd
> > from which to start all the necessary daemons...
>
> This makes a lot of sense, I just had never thought about it
> properly. I'm not sure I *need* locking, so I'll run with nolock till
> it bites me.
>
> > BTW: Did I understand you and Neil correctly when you appeared to
> > say that you were sharing the *same* root partition between several
> > clients?
>
> Yes, you did understand correctly. The same root partition is mounted
> by around 20 machines. It works, too. The bug that we have manifests
> itself very rarely, and only when one of the machines does an unclean
> shutdown. I still haven't been able to reproduce it so I still
> haven't seen a solution yet.
>
> > If so, then that could easily explain your problem: a directory
> > like /var/lib/nfs simply cannot be shared among several different
> > machines. Read the 'statd' manpage, and I'm sure you will
> > understand why.
>
> Well, none of the machines by default exports anything through NFS,
> so none of them explicitly *need* /var/lib/nfs. I've done some
> careful study and separated the directories which are written to on a
> per-host basis, and used a lot of tmpfs. It works quite well, to be
> honest. A breakdown of "special" directories:
>
> - /var/spool and /var/log need to be separate, for obvious reasons.
> - /etc/mtab should be a symlink to /proc/mounts, to avoid the need
>   for writing there.
> - /tmp, /var/tmp, /dev/shm, /var/lock, /var/run, /var/lib/nfs,
>   /var/yp/binding, /var/lib/sendmail are tmpfs.

I did the same. You will end up amending this list. Simplify it:

/var needs to be separate, for obvious reasons. ;)
/tmp needs to be separate
/etc needs to be separate

> None of the users have root access so writing to the partition only
> is done as the result of servers running. I used a lot of reboots and
> ls -lt to find out what needs to be separate, and there are few
> issues that need fixing (/etc/ioctl.save being the latest).

Entire /etc. How can you have different per-client configs for
e.g. /etc/resolv.conf? I know you don't usually need that.
Sometimes we need to do unusual things ;)

> One issue I ran into that I only discovered today (well, we all have
> to learn someday) was that a shared /dev is not a good idea, because
> some programs write to it. Case in point was syslogd, which creates
> /dev/log - all but the last machine had logging broken. Since nobody
> needs logs on these boxes anyway, it had gone on unnoticed, but I'm
> now using devfs, and it works fine.

Same here. Devfs is cool ;)
For one, it forces people to think before they get strange ideas of
putting something foreign in /dev. Like abm syslogd.

> Everybody seems to find this setup a bit bizarre. It's not. It keeps
> maintenance down to zero for everything, and adding a new box means
> running a script once.

Yeah! ;) What a contrast with the typical Windows network mess you can
find in a random office!
--
vda

* Re: [NFS] Re: NFS client locking hangs for period
From: Christian Reis @ 2003-01-28 16:47 UTC
To: Denis Vlasenko; +Cc: Trond Myklebust, Neil Brown, linux-kernel, NFS

On Tue, Jan 28, 2003 at 10:14:09AM +0200, Denis Vlasenko wrote:
> > None of the users have root access so writing to the partition only
> > is done as the result of servers running. I used a lot of reboots and
> > ls -lt to find out what needs to be separate, and there are few
> > issues that need fixing (/etc/ioctl.save being the latest).
>
> Entire /etc. How can you have different per-client configs for
> e.g. /etc/resolv.conf? I know you don't usually need that.
> Sometimes we need to do unusual things ;)

Well, the per-client configurations are an exception at our office, and
the only things we customize are XFree86 (we use the
Xfree86Config.hostname capability), gpm and the kernel, which is dealt
out by DHCPD. We also have a special startup script that is run for the
named box if it exists (/etc/init.d/host-specific/`hostname`).

I'm sure this won't work for everybody, but it does work for us, a
smallish development team.

> Same here. Devfs is cool ;)
> For one, it forces people to think before they got strange ideas of
> putting something foreign in /dev. Like abm syslogd.

Only after using devfs in this context have I come to appreciate how
nice it is. And I had it in place after 10 minutes of reading and two
of recompiled kernels. Amazing.

Take care,
-- 
Christian Reis, Senior Engineer, Async Open Source, Brazil.
http://async.com.br/~kiko/ | [+55 16] 261 2331 | NMFL

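The host-specific hook mentioned above (a script named after the box under /etc/init.d/host-specific/) comes down to a lookup by hostname. A rough sketch of that lookup follows. host_specific_script is a made-up name, the executable-bit check is an assumption about how such a hook would be validated, and the demonstration uses a temporary directory with invented host names rather than the real path.

```python
import os
import socket
import tempfile


def host_specific_script(base_dir, hostname=None):
    """Return the path of the executable script named after this host.

    Returns None when no such script exists, so callers can skip the
    hook silently. Hypothetical helper for illustration.
    """
    if hostname is None:
        hostname = socket.gethostname()
    candidate = os.path.join(base_dir, hostname)
    if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
        return candidate
    return None


# Demonstration in a throwaway directory; "box1" and "box2" are invented.
with tempfile.TemporaryDirectory() as base:
    script = os.path.join(base, "box1")
    with open(script, "w") as handle:
        handle.write("#!/bin/sh\necho host-specific setup\n")
    os.chmod(script, 0o755)
    found = host_specific_script(base, "box1")
    missing = host_specific_script(base, "box2")
    found_name = os.path.basename(found)

print(found_name, missing)
```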
* Re: NFS client locking hangs for period
From: Denis Vlasenko @ 2003-01-28 8:00 UTC
To: Christian Reis, Neil Brown; +Cc: linux-kernel, NFS

On 26 January 2003 18:02, Christian Reis wrote:
> On Sat, Jan 25, 2003 at 02:54:09PM +1100, Neil Brown wrote:
> > Hmmm. So you have several clients all mounting the same root
> > filesystem, and mounting it writable? That doesn't sound like a
> > plan for success. How do you make sure the clients don't tread
> > over each other when using /etc files?
>
> The truth is few (broken wrt the FHS) programs actually write to
> /etc. I have set up everything so nothing is written to in /etc, and
> it actually works very well (have to use a special init(8) that
> doesn't write to /etc/ioctl.save). This setup has been running for
> almost a year now, with the locking problem being the only one left
> to fix.

My root fs is RO. Works wonders. Clients simply CANNOT trash their
/bin, /lib etc ;)

> > I suspect that what you really want is to mount root read-only, or
> > mount separate roots for each client, and then in either case to
> > mount with the "nolock" flag.
>
> Well, mounting root read-only is a good idea but it sacrifices being
> able to administer the system from any station, and it also puts a
> lot of burden on me to fix *all* programs to not write to anywhere on
> it. This shouldn't be too hard, but we're still just working around
> the bug, which I would really like to identify and fix.

It was not really *that* difficult for me. I used devfs and symlinks.
/etc, /var, /tmp are different directories per client,
/home, /usr are shared. The rest stays on root fs readonly.
ssh to NFS server if you want to modify some files on root fs.

Separate etc/var/tmp files for each client = no concurrent rw access.

> > I suspect that your problem is related to the client trying to do
> > locking, but no having statd running on the client.
>
> I am 100% positive statd runs on every single client. This problem
> here only happens spuriously. It goes away when I restart nfsd and
> mountd (in that order). It really does look like a bug <wink>

File locking over the network is hard to do reliably.
I have no experience with that in NFS, but presume there
can be problems in some situations (statd or portmap
crashed on a client, client hung/disconnected from the net,
etc etc etc...)

Anyway, such corner cases are painful, thank you for
your efforts to nail it down.
--
vda

* Re: NFS client locking hangs for period
From: Christian Reis @ 2003-01-28 16:44 UTC
To: Denis Vlasenko; +Cc: Neil Brown, linux-kernel, NFS

On Tue, Jan 28, 2003 at 10:00:05AM +0200, Denis Vlasenko wrote:
> > Well, mounting root read-only is a good idea but it sacrifices being
> > able to administer the system from any station, and it also puts a
> > lot of burden on me to fix *all* programs to not write to anywhere on
> > it. This shouldn't be too hard, but we're still just working around
> > the bug, which I would really like to identify and fix.
>
> It was not really *that* difficult for me. I used devfs and symlinks.
> /etc, /var, /tmp are different directories per client,
> /home, /usr are shared. The rest stays on root fs readonly.
> ssh to NFS server if you want to modify some files on root fs.
>
> Separate etc/var/tmp files for each client = no concurrent rw access.

I agree it is a lot simpler; however, you have to give up the ability
to install and upgrade system software seamlessly. When Debian reports
a security issue, all I do is apt-get -u upgrade and skim through it,
and all boxes are magically updated. No need to update the individual
/etc files for the changes, and no messy links either.

It does require you to take care, though. The most important issue is
finding out what files are written to in these directories (in
violation of the LFS/FHS, I must say). The current culprit I am after
is /sbin/init, which writes to /etc/ioctl.save (why, I wonder). After a
lot of cleanup, I've managed to pare this down to the minimum, and I'm
going after some of the last culprits now.

> File locking over the network is hard to do reliably.
> I have no experience with that in NFS, but presume there
> can be problems in some situations (statd or portmap
> crashed on a client, client hung/disconnected from the net,
> etc etc etc...)
>
> Anyway, such corner cases are painful, thank you for
> your efforts to nail it down.

It seems Trond has given us the answer to the problem: the persistence
of /var/lib/nfs seems to be essential to a healthy diskless client. One
of our co-workers who was an expert at triggering the problems is at
the beach this week, so I can't tell for sure, but next Tuesday or so I
hope to post to NFS-list with [SUMMARY] in the Subject line <wink>

Take care,
-- 
Christian Reis, Senior Engineer, Async Open Source, Brazil.
http://async.com.br/~kiko/ | [+55 16] 261 2331 | NMFL

* Re: NFS client locking hangs for period
From: Daniel Egger @ 2003-01-29 21:53 UTC
To: vda; +Cc: linux-kernel, NFS

On Tue, 2003-01-28 at 09:00, Denis Vlasenko wrote:

> It was not really *that* difficult for me. I used devfs and symlinks.
> /etc, /var, /tmp are different directories per client,
> /home, /usr are shared. The rest stays on root fs readonly.
> ssh to NFS server if you want to modify some files on root fs.

This will only work well if the server runs the same OS on the same
architecture and its own system is well enough equipped to do software
installations and bootstraps. Although I'm using Linux on my server, as
well as the same architecture as most of the clients, I sometimes
experience trouble working in the chrooted client environment.

-- 
Servus,
       Daniel
