From: Jim Carter
Subject: Re: clients suddenly start hanging (was: (no subject))
Date: Fri, 20 Jun 2008 18:02:29 -0700 (PDT)
In-Reply-To: <1213934961.2971.69.camel@raven.themaw.net>
References: <20080423185018.122C53C3B1@xena.cft.ca.us>
 <1213414942.18072.26.camel@raven.themaw.net>
 <1213845274.2971.11.camel@raven.themaw.net>
 <20080619183446.532D82111B1@simba.math.ucla.edu>
 <1213934961.2971.69.camel@raven.themaw.net>
To: Ian Kent
Cc: autofs@linux.kernel.org

On Fri, 20 Jun 2008, Ian Kent wrote:

> So here is autofs-5.0.3-submount-shutdown-recovery-8.patch.
> Please try it instead of revision 7.

The patch went on cleanly. However, there was a problem in execution.
The output was:

    17:00:14 -- #1, chkd 0, run 0, OK 570, mtd 2, of 570
    Jun 20 17:00:22 serval automount[2799]: unexpected pthreads error: -1 at 901 in master.c

After patching, this is in:

    void master_signal_submount(struct autofs_point *ap, unsigned int action)

            status = pthread_barrier_wait(&ap->submount_barrier);
            if (status)
                    fatal(status);

I'm not sure what's frozen; the machine responds to ping, but I can't
ssh to it, and I'm not at work.  I would have expected the needed NFS
resources to be mounted already, from the session that started the
test program.  Any ideas what went wrong?  I can commandeer another
machine for the next test, since its owner is also not at work.

About setting up a test environment: we have 133 Linux boxes (a few of
them are down), so you would need a lot of hosts.  I was thinking about
how to do this.  How about lots of UML or Xen virtual machines, each
exporting maybe two NFS filesystems?  I'm most familiar with UML.  I
have some rather old notes on UML here:

    http://www.math.ucla.edu/~jimc/documents/uml-install-suse.html

I think your best bet is to create a read-only backing image with the
standard configuration, then give each virtual guest its own writeable
COW (copy-on-write) overlay file; the overlays would occupy only a few
hundred KB each, since most of the material stays in the shared
read-only backing image.  Allow at most 16 MB of "physical" memory per
guest, and perhaps even forget the swap file.  This lets you pack 64
UML instances per GB of physical memory on the host (leaving some
uncommitted physical memory for the host kernel and daemons).  You
might think this would keep the host CPU hopping, but in reality the
guests will only be handing out and cancelling mounts for the test
machine, one at a time, so the host CPU load should be manageable.

On the guest image you can ease your life by creating a small file,
maybe 64 KB of zeroes, assigning it to a loopback device, and putting a
filesystem on it.  In one application I used Minix because it's compact
yet supports normal UNIX semantics.  Mount it, create a single file
with a unique ID, and export it.  From that file you can verify which
host, and which of its two filesystems, you actually mounted.  The UML
guest can do the loopback trick if it was given a complete set of
modules (SuSE's UML has them).

If you actually go ahead with this, tell me and I'll send you the test
program.
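Roughly, the per-guest setup described above might look like this (just
a sketch; the file names, the umid, and the tuntap address are
placeholders, and the guest needs the loop and minix modules):

    # On the host: shared read-only backing image, one small COW overlay
    # per guest, and 16 MB of "physical" memory each.  UML creates the
    # COW file on first boot if it doesn't exist.
    ./linux ubd0=guest01.cow,base.img mem=16M umid=guest01 \
            eth0=tuntap,,,192.168.0.254

    # Inside each guest: a tiny loopback filesystem holding one marker
    # file, exported read-only over NFS.
    dd if=/dev/zero of=/srv/fs1.img bs=1k count=64
    losetup /dev/loop0 /srv/fs1.img
    mkfs.minix /dev/loop0                      # compact, normal UNIX semantics
    mkdir -p /export/fs1
    mount /dev/loop0 /export/fs1
    echo "$(hostname)-fs1" > /export/fs1/ID    # identifies host and filesystem
    echo "/export/fs1 *(ro,sync)" >> /etc/exports
    exportfs -a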
The automount maps go like this:

auto.master:

    /net    /etc/auto.net

auto.net:  (the backslash is not really there; the entry is all one line)

    *    -rsize=8192,wsize=8192,retry=1,soft,fstype=autofs,-DSERVER=& \
             file:/etc/auto.net.generic

auto.net.generic:

    *    ${SERVER}:/&

(So a lookup of /net/HOST/DIR matches the wildcard in auto.net, which
mounts an autofs submount with SERVER=HOST using auto.net.generic, and
that in turn NFS-mounts HOST:/DIR.)

Good luck, you're going to need it :-)

James F. Carter          Voice 310 825 2897    FAX 310 206 6673
UCLA-Mathnet;  6115 MSA; 405 Hilgard Ave.; Los Angeles, CA, USA  90095-1555
Email: jimc@math.ucla.edu    http://www.math.ucla.edu/~jimc (q.v. for PGP key)