From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ian Kent
Subject: Re: [RFC PATCH]autofs4: hang and proposed fix
Date: Tue, 29 Nov 2005 09:19:23 -0500 (EST)
Message-ID:
References: <20051116101740.GA9551@RAM> <438B3C34.2050509@us.ibm.com> <1133219572.27824.24.camel@localhost.localdomain>
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Cc: "William H. Taber" , Ram Pai , autofs mailing list , linux-fsdevel
Return-path:
Received: from wombat.indigo.net.au ([202.0.185.19]:6406 "EHLO wombat.indigo.net.au") by vger.kernel.org with ESMTP id S932319AbVK2BTa (ORCPT ); Mon, 28 Nov 2005 20:19:30 -0500
To: Badari Pulavarty
In-Reply-To: <1133219572.27824.24.camel@localhost.localdomain>
Sender: linux-fsdevel-owner@vger.kernel.org
List-Id: linux-fsdevel.vger.kernel.org

We'll need to do an analysis of all callers of the revalidate method.

On Mon, 28 Nov 2005, Badari Pulavarty wrote:

> On Mon, 2005-11-28 at 12:19 -0500, William H. Taber wrote:
> > Ian Kent wrote:
> >
> > > My thoughts:
> > >
> > > The cause of this issue is user space programs using autofs4 need to
> > > call services that must be able to take the inode semaphore. Notably
> > > sys_mkdir and sys_symlink in order to complete their task.
> > >
> > > I believe that, in this case, releasing the semaphore is ok since the
> > > entry is part of the autofs filesystem and so autofs is responsible for
> > > taking care of it, provided that it is done carefully. The semaphore is
> > > meant to serialize changes being made to the directory and these changes are
> > > done in autofs by asking the user space process to do it. Which are
> > > themselves serialized by the same semaphore.
> > >
> > > The only tricky thing I can think of here is that care must be taken to
> > > ensure that the semaphore is not released before the DCACHE_AUTOFS_PENDING
> > > flag is set to make sure that other incoming requests are sent to the wait
> > > queue.
> > >
> > > The attached patch does this and opts for a conservative approach by
> > > broadening the critical region instead of narrowing it.
> > >
> > > It may also be necessary to review the return codes from revalidate but I'm
> > > only part way through that.
> > >
> > > Please review and test this patch and offer further comment.
> > > Sorry guys but I haven't been able to test this at all save verifying that
> > > it compiles.
> > >
> > > Hopefully I haven't missed anything completely obvious ... DOH!
> > >
> > > Ian
> > >
> > > --- linux-2.6.15-rc1/fs/autofs4/root.c.lookup-deadlock 2005-11-17 18:58:38.000000000 +0800
> > > +++ linux-2.6.15-rc1/fs/autofs4/root.c 2005-11-27 17:00:40.000000000 +0800
> > > @@ -487,11 +487,8 @@ static struct dentry *autofs4_lookup(str
> > >                  dentry->d_fsdata = NULL;
> > >                  d_add(dentry, NULL);
> > >
> > > -                if (dentry->d_op && dentry->d_op->d_revalidate) {
> > > -                        up(&dir->i_sem);
> > > +                if (dentry->d_op && dentry->d_op->d_revalidate)
> > >                          (dentry->d_op->d_revalidate)(dentry, nd);
> > > -                        down(&dir->i_sem);
> > > -                }
> > >
> > >                  /*
> > >                   * If we are still pending, check if we had to handle
> > > --- linux-2.6.15-rc1/fs/autofs4/waitq.c.lookup-deadlock 2005-11-27 17:09:42.000000000 +0800
> > > +++ linux-2.6.15-rc1/fs/autofs4/waitq.c 2005-11-27 17:17:34.000000000 +0800
> > > @@ -161,6 +161,8 @@ int autofs4_wait(struct autofs_sb_info *
> > >                  enum autofs_notify notify)
> > >  {
> > >          struct autofs_wait_queue *wq;
> > > +        struct inode *dir = dentry->d_parent->d_inode;
> > > +        int i_sem_held;
> > >          char *name;
> > >          int len, status;
> > >
> > > @@ -227,6 +229,14 @@ int autofs4_wait(struct autofs_sb_info *
> > >                          (unsigned long) wq->wait_queue_token, wq->len, wq->name, notify);
> > >          }
> > >
> > > +        /*
> > > +         * If we are called from lookup or lookup_hash the
> > > +         * the inode semaphore needs to be released for
> > > +         * userspace to do its thing.
> > > +         */
> > > +        i_sem_held = down_trylock(&dir->i_sem);
> > > +        up(&dir->i_sem);
> > > +
> > >          if (notify != NFY_NONE && atomic_dec_and_test(&wq->notified)) {
> > >                  int type = (notify == NFY_MOUNT ?
> > >                          autofs_ptype_missing : autofs_ptype_expire_multi);
> > > @@ -268,6 +278,10 @@ int autofs4_wait(struct autofs_sb_info *
> > >                  DPRINTK("skipped sleeping");
> > >          }
> > >
> > > +        /* Re-take the inode semaphore if it was held */
> > > +        if (i_sem_held)
> > > +                down(&dir->i_sem);
> > > +
> > >          status = wq->status;
> > >
> > >          /* Are we the last process to need status? */
> > > -
> > Ian,
> > I have not tested this patch but it seems to have a serious flaw. Given
> > that do_lookup does not get the parent i_sem lock before calling
> > revalidate, you have the possibility that you are being called without
> > having gotten the lock but the lock may be held by another process. In
> > that case you do not want to be releasing their lock while they are
> > relying on it.
> >
>
> Here is the patch Will Taber proposed and I am posting on his behalf.
>
> Thanks,
> Badari
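
For reference, the hazard Will describes can be reproduced in user space. The following is only a rough analogy, not kernel code: a POSIX semaphore initialised to 1 stands in for dir->i_sem, and sem_trywait()/sem_post() stand in for down_trylock()/up(). The function name and its parameters are made up for illustration. It assumes Linux, and older glibc versions may need -pthread at link time.

```c
#include <assert.h>
#include <semaphore.h>

/*
 * Hypothetical helper: mimics the "trylock, then unconditional up"
 * sequence from the autofs4_wait() hunk on a POSIX semaphore.
 *
 * held_by_other: nonzero to simulate another task already holding the lock.
 * busy:          set to 1 if the trylock failed (lock was taken), else 0.
 * Returns the semaphore count after the unconditional release.
 */
int count_after_unconditional_up(int held_by_other, int *busy)
{
    sem_t i_sem;
    int rc, value = -1;

    rc = sem_init(&i_sem, 0, 1);    /* count 1 == unlocked */
    assert(rc == 0);
    (void)rc;

    if (held_by_other)
        sem_wait(&i_sem);           /* the other task takes the lock */

    /* Like down_trylock(): a failure only says "someone holds it". */
    *busy = (sem_trywait(&i_sem) != 0);

    /* Like up(): releases the lock regardless of who the owner is. */
    sem_post(&i_sem);

    sem_getvalue(&i_sem, &value);
    sem_destroy(&i_sem);
    return value;
}
```

With held_by_other set, the trylock fails and the count still comes back as 1, so the lock looks free while the other holder is still relying on it. A failed trylock cannot tell "held by us" from "held by someone else", which is why the callers need auditing.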