From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Aneesh Kumar K.V" Subject: Re: [PATCH] vfs: move ACL cache lookup into generic code Date: Mon, 25 Jul 2011 13:45:04 +0530 Message-ID: <87tyaag4jb.fsf@skywalker.in.ibm.com> References: <20110723032944.GA24703@ZenIV.linux.org.uk> <20110723043120.GB24703@ZenIV.linux.org.uk> <20110723060626.GC24703@ZenIV.linux.org.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Cc: Christoph Hellwig , linux-fsdevel , Linus Torvalds To: Al Viro Return-path: Received: from e6.ny.us.ibm.com ([32.97.182.146]:42504 "EHLO e6.ny.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751294Ab1GYIbL (ORCPT ); Mon, 25 Jul 2011 04:31:11 -0400 Received: from d01relay01.pok.ibm.com (d01relay01.pok.ibm.com [9.56.227.233]) by e6.ny.us.ibm.com (8.14.4/8.13.1) with ESMTP id p6P7p7SW000890 for ; Mon, 25 Jul 2011 03:51:07 -0400 Received: from d01av02.pok.ibm.com (d01av02.pok.ibm.com [9.56.224.216]) by d01relay01.pok.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id p6P8FEfg121814 for ; Mon, 25 Jul 2011 04:15:14 -0400 Received: from d01av02.pok.ibm.com (loopback [127.0.0.1]) by d01av02.pok.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP id p6P8FDnJ017724 for ; Mon, 25 Jul 2011 05:15:14 -0300 In-Reply-To: <20110723060626.GC24703@ZenIV.linux.org.uk> Sender: linux-fsdevel-owner@vger.kernel.org List-ID: On Sat, 23 Jul 2011 07:06:26 +0100, Al Viro wrote: > On Sat, Jul 23, 2011 at 05:31:20AM +0100, Al Viro wrote: > > > Heh... In addition to ocfs2 leak: 9p leaks nicely if v9fs_acl_mode() is > > called with !S_ISDIR(mode). In that case acl reference is simply lost. > > So yes, it's worth looking at. > > Damn... FWIW, that code is seriously broken and not only on failure exits. > Guys, think what happens if we have a negative dentry and call e.g. mkdir(). > Dentry is hashed. Eventually we get to > d_instantiate(dentry, inode); > err = v9fs_fid_add(dentry, fid); > in v9fs_create() (or v9fs_vfs_mkdir_dotl()). At the same time somebody does > lookup *in* that directory. We do find that thing in dcache, it's (already) > positive and we hit > dfid = v9fs_fid_lookup(dentry->d_parent); > at the same time. That calls v9fs_fid_lookup_with_uid() and then > v9fs_fid_add(). Now, we have two tasks doing v9fs_fid_add() on the same > dentry. Which is to say, > dent = dentry->d_fsdata; > if (!dent) { > dent = kmalloc(sizeof(struct v9fs_dentry), GFP_KERNEL); > if (!dent) > return -ENOMEM; > > spin_lock_init(&dent->lock); > INIT_LIST_HEAD(&dent->fidlist); > dentry->d_fsdata = dent; > } > and that's an obvious leak. It's not easy to hit, but... Note that > ->d_revalidate() does *not* help - v9fs_create() will force revaliation > of parent, but the second process might be already past that point. > Moreover, ->d_revalidate() would just talk to server and return 1, so > even if the second process does hit it, nothing will change except for > minor delay. And the first one might be sleeping in blocking allocation > at that point... > > I don't like that; we certainly can protect v9fs_fid_add() in standard way > (i.e. recheck ->d_fsdata after kmalloc(), if we have lost the race - kfree() > and go on, otherwise set ->d_fsdata and make that check+assignment atomic, > e.g. by use of ->d_lock). However, that looks like a kludge. Is there any > saner way to do it? E.g. do we absolutely have to do that *after* > d_instantiate()? We should be able to move that v9fs_fid_add before d_instantiate. That way we can be sure that positive dentries have fids attached. v9fs_vfs_lookup does that already. I guess we should audit all the v9fs_fid_add usage to make sure we do that consistently all other place. -aneesh