* [GIT PULL] 9p file system bug fixes for 2.6.35-rc2 @ 2010-06-07 20:02 Eric Van Hensbergen 2010-06-08 0:08 ` Linus Torvalds 0 siblings, 1 reply; 16+ messages in thread From: Eric Van Hensbergen @ 2010-06-07 20:02 UTC (permalink / raw) To: Linus Torvalds; +Cc: V9FS Developers, linux-kernel Linus, Please merge the following bug fixes for the 9P file system. The following changes since commit 386f40c86d6c8d5b717ef20620af1a750d0dacb4: Linus Torvalds (1): Revert "tty: fix a little bug in scrup, vt.c" are available in the git repository at: ssh://master.kernel.org/pub/scm/linux/kernel/git/ericvh/v9fs.git for-linus Fang Wenqi (1): virtio_9p.h needs <linux/types.h> jvrao (2): Add a helper function to get fsgid for a file create. 9p: Add a wstat after TCREATE to fix the gid. fs/9p/vfs_inode.c | 34 ++++++++++++++++++++++++++++++++++ include/linux/virtio_9p.h | 1 + 2 files changed, 35 insertions(+), 0 deletions(-) ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [GIT PULL] 9p file system bug fixes for 2.6.35-rc2 2010-06-07 20:02 [GIT PULL] 9p file system bug fixes for 2.6.35-rc2 Eric Van Hensbergen @ 2010-06-08 0:08 ` Linus Torvalds 2010-06-08 0:41 ` Al Viro 2010-06-08 14:29 ` Venkateswararao Jujjuri (JV) 0 siblings, 2 replies; 16+ messages in thread From: Linus Torvalds @ 2010-06-08 0:08 UTC (permalink / raw) To: Eric Van Hensbergen; +Cc: V9FS Developers, linux-kernel On Mon, 7 Jun 2010, Eric Van Hensbergen wrote: > > jvrao (2): > Add a helper function to get fsgid for a file create. > 9p: Add a wstat after TCREATE to fix the gid. Quite frankly, this looks rather broken. It uses "dentry->d_parent" without locking (it so happens to likely be ok, since we are in "create()" and thus should be holding the parent semaphore). On its own, that might be excusable (if people were even _aware_ of the this locking rule!), but it does so just to get the inode pointer to that parent. And the only thing that makes it ok to access dentry->d_parent - the fact that we are in v9fs_create() - is also the thing that should have made people look at the arguments to the function and say "hmm". We pass in the directory inode pointer as an argument to the create function! The code could have used that thing directly, instead of mucking around with dentry pointers that it had no business looking at. I see why it seems to have happened: v9fs does the exact same thing for the pre-existing "v9fs_fid_lookup()". So there is history to this behavior. Maybe people weren't aware of the fact that just dereferencing dentry->d_parent willy-nilly isn't actually allowed. That field changes. Sure, there are cases where it's ok, but this is a dangerous thing to do in general. In fact, the other thing that I find doing that whole "dentry->d_parent" thing seems to literally be broken. If you look at v9fs_fid_lookup(), you'll notice how it walks up the d_parent chain, and at that point you do NOT own the directory i_mutex, so at that point d_parent really _can_ be changing wildly due to concurrent renames or whatever. So 9pfs seems to have some preexisting bugs in this area. I'm not going to pull new bug-prone code. See the other discussions about being tight this release about really _only_ taking regressions after the merge window closed. Linus ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [GIT PULL] 9p file system bug fixes for 2.6.35-rc2 2010-06-08 0:08 ` Linus Torvalds @ 2010-06-08 0:41 ` Al Viro 2010-06-08 0:48 ` Linus Torvalds 2010-06-16 16:42 ` [V9fs-developer] " Aneesh Kumar K. V 2010-06-08 14:29 ` Venkateswararao Jujjuri (JV) 1 sibling, 2 replies; 16+ messages in thread From: Al Viro @ 2010-06-08 0:41 UTC (permalink / raw) To: Linus Torvalds; +Cc: Eric Van Hensbergen, V9FS Developers, linux-kernel On Mon, Jun 07, 2010 at 05:08:19PM -0700, Linus Torvalds wrote: > In fact, the other thing that I find doing that whole "dentry->d_parent" > thing seems to literally be broken. If you look at v9fs_fid_lookup(), > you'll notice how it walks up the d_parent chain, and at that point you do > NOT own the directory i_mutex, so at that point d_parent really _can_ be > changing wildly due to concurrent renames or whatever. Eh... It's bogus, all right, but i_mutex is not the correct solution. You'd have to take it on a lot of inodes along the way to root *and* you'd violate the ordering in process (ancestors first). I'm not sure what's the right thing to do there, actually - s_vfs_rename_sem also won't do, since it'll give you ordering problems of its own (it's taken before i_mutex in VFS, so trying to take it under i_mutex would not do). The _really_ interesting question is how do servers deal with topology-changing renames. Note that the problem exists only with extended 9P - with the original one all of that had been a non-issue, since it didn't allow cross-directory renames at all and the tree topology remained stable all along. ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [GIT PULL] 9p file system bug fixes for 2.6.35-rc2 2010-06-08 0:41 ` Al Viro @ 2010-06-08 0:48 ` Linus Torvalds 2010-06-16 16:42 ` [V9fs-developer] " Aneesh Kumar K. V 1 sibling, 0 replies; 16+ messages in thread From: Linus Torvalds @ 2010-06-08 0:48 UTC (permalink / raw) To: Al Viro; +Cc: Eric Van Hensbergen, V9FS Developers, linux-kernel On Tue, 8 Jun 2010, Al Viro wrote: > On Mon, Jun 07, 2010 at 05:08:19PM -0700, Linus Torvalds wrote: > > > In fact, the other thing that I find doing that whole "dentry->d_parent" > > thing seems to literally be broken. If you look at v9fs_fid_lookup(), > > you'll notice how it walks up the d_parent chain, and at that point you do > > NOT own the directory i_mutex, so at that point d_parent really _can_ be > > changing wildly due to concurrent renames or whatever. > > Eh... It's bogus, all right, but i_mutex is not the correct solution. Oh, no, I didn't imply it was. But the other sites that I saw doing the dentry->d_parent access already _had_ the i_mutex thing, so I was pointing out how this one does not (and indeed _cannot_ do that). So I'm just saying that pretty much _all_ the dentry->d_parent use in 9p seems very suspect. The cases where we hold i_mutex (because the caller already took it) shouldn't do that whole d_parent dance, because they get the directory inode passed into them directly. And the other places are just buggy. So from a quick look, the use of d_parent in 9p is simply not a good idea. Linus ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [V9fs-developer] [GIT PULL] 9p file system bug fixes for 2.6.35-rc2 2010-06-08 0:41 ` Al Viro 2010-06-08 0:48 ` Linus Torvalds @ 2010-06-16 16:42 ` Aneesh Kumar K. V 2010-06-24 16:30 ` Aneesh Kumar K. V 1 sibling, 1 reply; 16+ messages in thread From: Aneesh Kumar K. V @ 2010-06-16 16:42 UTC (permalink / raw) To: Al Viro, Linus Torvalds Cc: Eric Van Hensbergen, V9FS Developers, linux-kernel On Tue, 8 Jun 2010 01:41:02 +0100, Al Viro <viro@ZenIV.linux.org.uk> wrote: > On Mon, Jun 07, 2010 at 05:08:19PM -0700, Linus Torvalds wrote: > > > In fact, the other thing that I find doing that whole "dentry->d_parent" > > thing seems to literally be broken. If you look at v9fs_fid_lookup(), > > you'll notice how it walks up the d_parent chain, and at that point you do > > NOT own the directory i_mutex, so at that point d_parent really _can_ be > > changing wildly due to concurrent renames or whatever. > > Eh... It's bogus, all right, but i_mutex is not the correct solution. > You'd have to take it on a lot of inodes along the way to root *and* > you'd violate the ordering in process (ancestors first). > > I'm not sure what's the right thing to do there, actually - s_vfs_rename_sem > also won't do, since it'll give you ordering problems of its own (it's > taken before i_mutex in VFS, so trying to take it under i_mutex would not > do). Can we use dcache_lock in v9fs_fid_lookup ? Since we are holding parent directory inode->i_mutex in other places where we refer dentry->d_parent I guess those access are ok ?. And for v9fs_fid_lookup we can hold dcache_lock, walk the parent, build the full path name and use that for TWALK ? Another option is we deny a cross directory rename when doing fid_lookup. That is we can introduce a per superblock v9fs specific rwlock that get taken (in write mode) in a cross directory rename and in fid_lookup we take them in read mode ? We will have to set FS_RENAME_DOES_D_MOVE for 9p. > > The _really_ interesting question is how do servers deal with topology-changing > renames. Note that the problem exists only with extended 9P - with the > original one all of that had been a non-issue, since it didn't allow > cross-directory renames at all and the tree topology remained stable all along. > True the problem exist only with .L extension since .u protocol don't allow a cross directory renames. In the server we update the path names attached to a fid during rename. This happen to all fids that have path component matching the changed name. -aneesh ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [V9fs-developer] [GIT PULL] 9p file system bug fixes for 2.6.35-rc2 2010-06-16 16:42 ` [V9fs-developer] " Aneesh Kumar K. V @ 2010-06-24 16:30 ` Aneesh Kumar K. V 2010-06-29 20:18 ` Eric Van Hensbergen 2010-06-30 12:55 ` Aneesh Kumar K. V 0 siblings, 2 replies; 16+ messages in thread From: Aneesh Kumar K. V @ 2010-06-24 16:30 UTC (permalink / raw) To: Al Viro, Linus Torvalds Cc: Eric Van Hensbergen, V9FS Developers, linux-kernel On Wed, 16 Jun 2010 22:12:32 +0530, "Aneesh Kumar K. V" <aneesh.kumar@linux.vnet.ibm.com> wrote: > On Tue, 8 Jun 2010 01:41:02 +0100, Al Viro <viro@ZenIV.linux.org.uk> wrote: > > On Mon, Jun 07, 2010 at 05:08:19PM -0700, Linus Torvalds wrote: > > > > > In fact, the other thing that I find doing that whole "dentry->d_parent" > > > thing seems to literally be broken. If you look at v9fs_fid_lookup(), > > > you'll notice how it walks up the d_parent chain, and at that point you do > > > NOT own the directory i_mutex, so at that point d_parent really _can_ be > > > changing wildly due to concurrent renames or whatever. > > > > Eh... It's bogus, all right, but i_mutex is not the correct solution. > > You'd have to take it on a lot of inodes along the way to root *and* > > you'd violate the ordering in process (ancestors first). > > > > I'm not sure what's the right thing to do there, actually - s_vfs_rename_sem > > also won't do, since it'll give you ordering problems of its own (it's > > taken before i_mutex in VFS, so trying to take it under i_mutex would not > > do). > > Can we use dcache_lock in v9fs_fid_lookup ? Since we are holding > parent directory inode->i_mutex in other places where we refer > dentry->d_parent I guess those access are ok ?. And for v9fs_fid_lookup we > can hold dcache_lock, walk the parent, build the full path name and use > that for TWALK ? > > Another option is we deny a cross directory rename when > doing fid_lookup. That is we can introduce a per superblock v9fs > specific rwlock that get taken (in write mode) in a cross directory > rename and in fid_lookup we take them in read mode ? We will have to set > FS_RENAME_DOES_D_MOVE for 9p. something like this ? commit 86c06ad7506e7e05dd4e1e1b8cee28e19703c4f6 Author: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Date: Mon Jun 21 21:50:07 2010 +0530 fs/9p: Prevent parallel rename when doing fid_lookup During fid lookup we need to make sure that the dentry->d_parent doesn't change so that we can safely walk the parent dentries. To ensure that we need to prevent cross directory rename during fid_lookup. Add a per superblock rename_lock rwlock to prevent parallel fid lookup and rename. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> diff --git a/fs/9p/fid.c b/fs/9p/fid.c index 7317b39..14b542a 100644 --- a/fs/9p/fid.c +++ b/fs/9p/fid.c @@ -139,14 +139,19 @@ struct p9_fid *v9fs_fid_lookup(struct dentry *dentry) fid = v9fs_fid_find(dentry, uid, any); if (fid) return fid; - + /* + * we don't have a matching fid. To do a TWALK we need + * parent fid. We need to prevent rename when we want to + * look at the parent. + */ + read_lock(&v9ses->rename_lock); ds = dentry->d_parent; fid = v9fs_fid_find(ds, uid, any); if (!fid) { /* walk from the root */ n = 0; for (ds = dentry; !IS_ROOT(ds); ds = ds->d_parent) n++; - + read_unlock(&v9ses->rename_lock); fid = v9fs_fid_find(ds, uid, any); if (!fid) { /* the user is not attached to the fs yet */ if (access == V9FS_ACCESS_SINGLE) @@ -165,9 +170,11 @@ struct p9_fid *v9fs_fid_lookup(struct dentry *dentry) v9fs_fid_add(ds, fid); } - } else /* walk from the parent */ + } else { + /* walk from the parent */ n = 1; - + read_unlock(&v9ses->rename_lock); + } if (ds == dentry) return fid; @@ -175,8 +182,10 @@ struct p9_fid *v9fs_fid_lookup(struct dentry *dentry) if (!wnames) return ERR_PTR(-ENOMEM); + read_lock(&v9ses->rename_lock); for (d = dentry, i = (n-1); i >= 0; i--, d = d->d_parent) wnames[i] = (char *) d->d_name.name; + read_unlock(&v9ses->rename_lock); clone = 1; i = 0; @@ -199,7 +208,6 @@ struct p9_fid *v9fs_fid_lookup(struct dentry *dentry) i += l; clone = 0; } - kfree(wnames); v9fs_fid_add(dentry, fid); return fid; diff --git a/fs/9p/v9fs.c b/fs/9p/v9fs.c index f8b86e9..6839fca 100644 --- a/fs/9p/v9fs.c +++ b/fs/9p/v9fs.c @@ -237,6 +237,7 @@ struct p9_fid *v9fs_session_init(struct v9fs_session_info *v9ses, __putname(v9ses->uname); return ERR_PTR(-ENOMEM); } + rwlock_init(&v9ses->rename_lock); rc = bdi_setup_and_register(&v9ses->bdi, "9p", BDI_CAP_MAP_COPY); if (rc) { diff --git a/fs/9p/v9fs.h b/fs/9p/v9fs.h index bec4d0b..dee4f26 100644 --- a/fs/9p/v9fs.h +++ b/fs/9p/v9fs.h @@ -104,6 +104,7 @@ struct v9fs_session_info { struct p9_client *clnt; /* 9p client */ struct list_head slist; /* list of sessions registered with v9fs */ struct backing_dev_info bdi; + rwlock_t rename_lock; }; struct p9_fid *v9fs_session_init(struct v9fs_session_info *, const char *, diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c index 4331b3b..be39035 100644 --- a/fs/9p/vfs_inode.c +++ b/fs/9p/vfs_inode.c @@ -678,6 +678,7 @@ static struct dentry *v9fs_vfs_lookup(struct inode *dir, struct dentry *dentry, sb = dir->i_sb; v9ses = v9fs_inode2v9ses(dir); + /* We can walk d_parent because we hold the dir->i_mutex */ dfid = v9fs_fid_lookup(dentry->d_parent); if (IS_ERR(dfid)) return ERR_CAST(dfid); @@ -763,7 +764,7 @@ v9fs_vfs_rename(struct inode *old_dir, struct dentry *old_dentry, struct p9_fid *olddirfid; struct p9_fid *newdirfid; struct p9_wstat wstat; - int retval; + int retval, cross_dir_rename; P9_DPRINTK(P9_DEBUG_VFS, "\n"); retval = 0; @@ -784,7 +785,7 @@ v9fs_vfs_rename(struct inode *old_dir, struct dentry *old_dentry, retval = PTR_ERR(newdirfid); goto clunk_olddir; } - + cross_dir_rename = (old_dentry->d_parent != new_dentry->d_parent); if (v9fs_proto_dotl(v9ses)) { retval = p9_client_rename(oldfid, newdirfid, (char *) new_dentry->d_name.name); @@ -806,6 +807,11 @@ v9fs_vfs_rename(struct inode *old_dir, struct dentry *old_dentry, retval = p9_client_wstat(oldfid, &wstat); clunk_newdir: + if (cross_dir_rename) + write_lock(&v9ses->rename_lock); + d_move(old_dentry, new_dentry); + if (cross_dir_rename) + write_unlock(&v9ses->rename_lock); p9_client_clunk(newdirfid); clunk_olddir: diff --git a/fs/9p/vfs_super.c b/fs/9p/vfs_super.c index be74d02..0f6d06d 100644 --- a/fs/9p/vfs_super.c +++ b/fs/9p/vfs_super.c @@ -278,4 +278,5 @@ struct file_system_type v9fs_fs_type = { .get_sb = v9fs_get_sb, .kill_sb = v9fs_kill_super, .owner = THIS_MODULE, + .fs_flags = FS_RENAME_DOES_D_MOVE, }; ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [V9fs-developer] [GIT PULL] 9p file system bug fixes for 2.6.35-rc2 2010-06-24 16:30 ` Aneesh Kumar K. V @ 2010-06-29 20:18 ` Eric Van Hensbergen 2010-06-29 20:38 ` Linus Torvalds 2010-06-30 12:55 ` Aneesh Kumar K. V 1 sibling, 1 reply; 16+ messages in thread From: Eric Van Hensbergen @ 2010-06-29 20:18 UTC (permalink / raw) To: Linus Torvalds, Al Viro; +Cc: V9FS Developers, linux-kernel Linus, Al, Does this approach satisfy your concerns? We've been going over several different options on how to proceed, but this seems to be the best option. The v9fs_fid_lookup traversal of the d_parent path is an exception case that hardly ever gets hit in practice so the extra locking shouldn't be a performance problem, but we haven't been able to figure out a way to resolve it in another fashion for existing use-case scenarios. Thanks for any guidance, -eric On Thu, Jun 24, 2010 at 11:30 AM, Aneesh Kumar K. V <aneesh.kumar@linux.vnet.ibm.com> wrote: > On Wed, 16 Jun 2010 22:12:32 +0530, "Aneesh Kumar K. V" <aneesh.kumar@linux.vnet.ibm.com> wrote: >> On Tue, 8 Jun 2010 01:41:02 +0100, Al Viro <viro@ZenIV.linux.org.uk> wrote: >> > On Mon, Jun 07, 2010 at 05:08:19PM -0700, Linus Torvalds wrote: >> > >> > > In fact, the other thing that I find doing that whole "dentry->d_parent" >> > > thing seems to literally be broken. If you look at v9fs_fid_lookup(), >> > > you'll notice how it walks up the d_parent chain, and at that point you do >> > > NOT own the directory i_mutex, so at that point d_parent really _can_ be >> > > changing wildly due to concurrent renames or whatever. >> > >> > Eh... It's bogus, all right, but i_mutex is not the correct solution. >> > You'd have to take it on a lot of inodes along the way to root *and* >> > you'd violate the ordering in process (ancestors first). >> > >> > I'm not sure what's the right thing to do there, actually - s_vfs_rename_sem >> > also won't do, since it'll give you ordering problems of its own (it's >> > taken before i_mutex in VFS, so trying to take it under i_mutex would not >> > do). >> >> Can we use dcache_lock in v9fs_fid_lookup ? Since we are holding >> parent directory inode->i_mutex in other places where we refer >> dentry->d_parent I guess those access are ok ?. And for v9fs_fid_lookup we >> can hold dcache_lock, walk the parent, build the full path name and use >> that for TWALK ? >> >> Another option is we deny a cross directory rename when >> doing fid_lookup. That is we can introduce a per superblock v9fs >> specific rwlock that get taken (in write mode) in a cross directory >> rename and in fid_lookup we take them in read mode ? We will have to set >> FS_RENAME_DOES_D_MOVE for 9p. > > something like this ? > > commit 86c06ad7506e7e05dd4e1e1b8cee28e19703c4f6 > Author: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> > Date: Mon Jun 21 21:50:07 2010 +0530 > > fs/9p: Prevent parallel rename when doing fid_lookup > > During fid lookup we need to make sure that the dentry->d_parent doesn't > change so that we can safely walk the parent dentries. To ensure that > we need to prevent cross directory rename during fid_lookup. Add a > per superblock rename_lock rwlock to prevent parallel fid lookup and rename. > > Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> > > diff --git a/fs/9p/fid.c b/fs/9p/fid.c > index 7317b39..14b542a 100644 > --- a/fs/9p/fid.c > +++ b/fs/9p/fid.c > @@ -139,14 +139,19 @@ struct p9_fid *v9fs_fid_lookup(struct dentry *dentry) > fid = v9fs_fid_find(dentry, uid, any); > if (fid) > return fid; > - > + /* > + * we don't have a matching fid. To do a TWALK we need > + * parent fid. We need to prevent rename when we want to > + * look at the parent. > + */ > + read_lock(&v9ses->rename_lock); > ds = dentry->d_parent; > fid = v9fs_fid_find(ds, uid, any); > if (!fid) { /* walk from the root */ > n = 0; > for (ds = dentry; !IS_ROOT(ds); ds = ds->d_parent) > n++; > - > + read_unlock(&v9ses->rename_lock); > fid = v9fs_fid_find(ds, uid, any); > if (!fid) { /* the user is not attached to the fs yet */ > if (access == V9FS_ACCESS_SINGLE) > @@ -165,9 +170,11 @@ struct p9_fid *v9fs_fid_lookup(struct dentry *dentry) > > v9fs_fid_add(ds, fid); > } > - } else /* walk from the parent */ > + } else { > + /* walk from the parent */ > n = 1; > - > + read_unlock(&v9ses->rename_lock); > + } > if (ds == dentry) > return fid; > > @@ -175,8 +182,10 @@ struct p9_fid *v9fs_fid_lookup(struct dentry *dentry) > if (!wnames) > return ERR_PTR(-ENOMEM); > > + read_lock(&v9ses->rename_lock); > for (d = dentry, i = (n-1); i >= 0; i--, d = d->d_parent) > wnames[i] = (char *) d->d_name.name; > + read_unlock(&v9ses->rename_lock); > > clone = 1; > i = 0; > @@ -199,7 +208,6 @@ struct p9_fid *v9fs_fid_lookup(struct dentry *dentry) > i += l; > clone = 0; > } > - > kfree(wnames); > v9fs_fid_add(dentry, fid); > return fid; > diff --git a/fs/9p/v9fs.c b/fs/9p/v9fs.c > index f8b86e9..6839fca 100644 > --- a/fs/9p/v9fs.c > +++ b/fs/9p/v9fs.c > @@ -237,6 +237,7 @@ struct p9_fid *v9fs_session_init(struct v9fs_session_info *v9ses, > __putname(v9ses->uname); > return ERR_PTR(-ENOMEM); > } > + rwlock_init(&v9ses->rename_lock); > > rc = bdi_setup_and_register(&v9ses->bdi, "9p", BDI_CAP_MAP_COPY); > if (rc) { > diff --git a/fs/9p/v9fs.h b/fs/9p/v9fs.h > index bec4d0b..dee4f26 100644 > --- a/fs/9p/v9fs.h > +++ b/fs/9p/v9fs.h > @@ -104,6 +104,7 @@ struct v9fs_session_info { > struct p9_client *clnt; /* 9p client */ > struct list_head slist; /* list of sessions registered with v9fs */ > struct backing_dev_info bdi; > + rwlock_t rename_lock; > }; > > struct p9_fid *v9fs_session_init(struct v9fs_session_info *, const char *, > diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c > index 4331b3b..be39035 100644 > --- a/fs/9p/vfs_inode.c > +++ b/fs/9p/vfs_inode.c > @@ -678,6 +678,7 @@ static struct dentry *v9fs_vfs_lookup(struct inode *dir, struct dentry *dentry, > > sb = dir->i_sb; > v9ses = v9fs_inode2v9ses(dir); > + /* We can walk d_parent because we hold the dir->i_mutex */ > dfid = v9fs_fid_lookup(dentry->d_parent); > if (IS_ERR(dfid)) > return ERR_CAST(dfid); > @@ -763,7 +764,7 @@ v9fs_vfs_rename(struct inode *old_dir, struct dentry *old_dentry, > struct p9_fid *olddirfid; > struct p9_fid *newdirfid; > struct p9_wstat wstat; > - int retval; > + int retval, cross_dir_rename; > > P9_DPRINTK(P9_DEBUG_VFS, "\n"); > retval = 0; > @@ -784,7 +785,7 @@ v9fs_vfs_rename(struct inode *old_dir, struct dentry *old_dentry, > retval = PTR_ERR(newdirfid); > goto clunk_olddir; > } > - > + cross_dir_rename = (old_dentry->d_parent != new_dentry->d_parent); > if (v9fs_proto_dotl(v9ses)) { > retval = p9_client_rename(oldfid, newdirfid, > (char *) new_dentry->d_name.name); > @@ -806,6 +807,11 @@ v9fs_vfs_rename(struct inode *old_dir, struct dentry *old_dentry, > retval = p9_client_wstat(oldfid, &wstat); > > clunk_newdir: > + if (cross_dir_rename) > + write_lock(&v9ses->rename_lock); > + d_move(old_dentry, new_dentry); > + if (cross_dir_rename) > + write_unlock(&v9ses->rename_lock); > p9_client_clunk(newdirfid); > > clunk_olddir: > diff --git a/fs/9p/vfs_super.c b/fs/9p/vfs_super.c > index be74d02..0f6d06d 100644 > --- a/fs/9p/vfs_super.c > +++ b/fs/9p/vfs_super.c > @@ -278,4 +278,5 @@ struct file_system_type v9fs_fs_type = { > .get_sb = v9fs_get_sb, > .kill_sb = v9fs_kill_super, > .owner = THIS_MODULE, > + .fs_flags = FS_RENAME_DOES_D_MOVE, > }; > ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [V9fs-developer] [GIT PULL] 9p file system bug fixes for 2.6.35-rc2 2010-06-29 20:18 ` Eric Van Hensbergen @ 2010-06-29 20:38 ` Linus Torvalds 2010-06-30 11:52 ` Aneesh Kumar K. V 0 siblings, 1 reply; 16+ messages in thread From: Linus Torvalds @ 2010-06-29 20:38 UTC (permalink / raw) To: Eric Van Hensbergen; +Cc: Al Viro, V9FS Developers, linux-kernel On Tue, Jun 29, 2010 at 1:18 PM, Eric Van Hensbergen <ericvh@gmail.com> wrote: > > Does this approach satisfy your concerns? We've been going over > several different options on how to proceed, but this seems to be the > best option. Using a p9fs rename lock does seem to be a reasonable option. That said, the patch itself seems to not be valid. You drop the lock too early in v9fs_fid_lookup() as far as I can tell. You then re-take it before doing that whole for (d = dentry, i = (n-1); i >= 0; i--, d = d->d_parent) loop with it held again, but that's totally bogus - because you dropped the lock, your 'n-1' count has absolutely no meaning any more, since a cross-directory rename might have changed the depth of the thing in the meantime. And if the depth changes, you aren't at all guaranteed to stay on the same p9fs filesystem, so now you're doing that d_parent access without the proper locking (sure: you hold the rename lock, but it's not at all guaranteed that the rename lock is the _right_ lock any more as you traverse the list down!) But I didn't look deeply at the patch. There might be some reason why it's safe (I doubt it, though), and there might be other places where you do the same. But in general, dropping and re-taking a lock is a bad idea. If you dropped the lock, you can't depend on anything you found out while having held it. Linus ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [V9fs-developer] [GIT PULL] 9p file system bug fixes for 2.6.35-rc2 2010-06-29 20:38 ` Linus Torvalds @ 2010-06-30 11:52 ` Aneesh Kumar K. V 2010-06-30 16:48 ` Linus Torvalds 2010-06-30 18:16 ` Latchesar Ionkov 0 siblings, 2 replies; 16+ messages in thread From: Aneesh Kumar K. V @ 2010-06-30 11:52 UTC (permalink / raw) To: Linus Torvalds, Eric Van Hensbergen Cc: V9FS Developers, Al Viro, linux-kernel On Tue, 29 Jun 2010 13:38:57 -0700, Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Tue, Jun 29, 2010 at 1:18 PM, Eric Van Hensbergen <ericvh@gmail.com> wrote: > > > > Does this approach satisfy your concerns? We've been going over > > several different options on how to proceed, but this seems to be the > > best option. > > Using a p9fs rename lock does seem to be a reasonable option. > > That said, the patch itself seems to not be valid. You drop the lock > too early in v9fs_fid_lookup() as far as I can tell. You then re-take > it before doing that whole > > for (d = dentry, i = (n-1); i >= 0; i--, d = d->d_parent) > > loop with it held again, but that's totally bogus - because you > dropped the lock, your 'n-1' count has absolutely no meaning any more, > since a cross-directory rename might have changed the depth of the > thing in the meantime. > > And if the depth changes, you aren't at all guaranteed to stay on the > same p9fs filesystem, so now you're doing that d_parent access without > the proper locking (sure: you hold the rename lock, but it's not at > all guaranteed that the rename lock is the _right_ lock any more as > you traverse the list down!) > > But I didn't look deeply at the patch. There might be some reason why > it's safe (I doubt it, though), and there might be other places where > you do the same. But in general, dropping and re-taking a lock is a > bad idea. If you dropped the lock, you can't depend on anything you > found out while having held it. You are correct. we cannot drop the rename lock in between. I also found another issue in that we are using dentry->d_name.name directly. That would imply we need to hold the rename_lock even during the client_walk. How about the patch below ?. I updated the patch to hold rename_lock during multiple path walk. Also the rename path is updated to hold the lock during p9_client_rename operations. commit 79f6f20dbb70ad35db37b674957c95de20662a75 Author: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Date: Wed Jun 30 15:37:58 2010 +0530 fs/9p: Prevent parallel rename when doing fid_lookup During fid lookup we need to make sure that the dentry->d_parent doesn't change so that we can safely walk the parent dentries. To ensure that we need to prevent cross directory rename during fid_lookup. Add a per superblock rename_lock rwlock to prevent parallel fid lookup and rename. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> diff --git a/fs/9p/fid.c b/fs/9p/fid.c index 5d6cfcb..7b387fe 100644 --- a/fs/9p/fid.c +++ b/fs/9p/fid.c @@ -97,6 +97,34 @@ static struct p9_fid *v9fs_fid_find(struct dentry *dentry, u32 uid, int any) return ret; } +/* + * We need to hold v9ses->rename_lock as long as we hold references + * to returned path array. Array element contain pointers to + * dentry names. + */ +static int build_path_from_dentry(struct v9fs_session_info *v9ses, + struct dentry *dentry, char ***names) +{ + int n = 0, i; + char **wnames; + struct dentry *ds; + + for (ds = dentry; !IS_ROOT(ds); ds = ds->d_parent) + n++; + + wnames = kmalloc(sizeof(char *) * n, GFP_KERNEL); + if (!wnames) + goto err_out; + + for (ds = dentry, i = (n-1); i >= 0; i--, ds = ds->d_parent) + wnames[i] = (char *)ds->d_name.name; + + *names = wnames; + return n; +err_out: + return -ENOMEM; +} + /** * v9fs_fid_lookup - lookup for a fid, try to walk if not found * @dentry: dentry to look for fid in @@ -112,7 +140,7 @@ struct p9_fid *v9fs_fid_lookup(struct dentry *dentry) int i, n, l, clone, any, access; u32 uid; struct p9_fid *fid, *old_fid = NULL; - struct dentry *d, *ds; + struct dentry *ds; struct v9fs_session_info *v9ses; char **wnames, *uname; @@ -139,50 +167,63 @@ struct p9_fid *v9fs_fid_lookup(struct dentry *dentry) fid = v9fs_fid_find(dentry, uid, any); if (fid) return fid; - + /* + * we don't have a matching fid. To do a TWALK we need + * parent fid. We need to prevent rename when we want to + * look at the parent. + */ + read_lock(&v9ses->rename_lock); ds = dentry->d_parent; fid = v9fs_fid_find(ds, uid, any); - if (!fid) { /* walk from the root */ - n = 0; - for (ds = dentry; !IS_ROOT(ds); ds = ds->d_parent) - n++; - - fid = v9fs_fid_find(ds, uid, any); - if (!fid) { /* the user is not attached to the fs yet */ - if (access == V9FS_ACCESS_SINGLE) - return ERR_PTR(-EPERM); - - if (v9fs_proto_dotu(v9ses) || - v9fs_proto_dotl(v9ses)) - uname = NULL; - else - uname = v9ses->uname; + if (fid) { + /* Found the parent fid do a lookup with that */ + fid = p9_client_walk(fid, 1, (char **)&ds->d_name.name, 1); + read_unlock(&v9ses->rename_lock); + return fid; + } + read_unlock(&v9ses->rename_lock); - fid = p9_client_attach(v9ses->clnt, NULL, uname, uid, - v9ses->aname); + /* start from the root and try to do a lookup */ + fid = v9fs_fid_find(dentry->d_sb->s_root, uid, any); + if (!fid) { + /* the user is not attached to the fs yet */ + if (access == V9FS_ACCESS_SINGLE) + return ERR_PTR(-EPERM); - if (IS_ERR(fid)) - return fid; + if (v9fs_proto_dotu(v9ses) || v9fs_proto_dotl(v9ses)) + uname = NULL; + else + uname = v9ses->uname; - v9fs_fid_add(ds, fid); - } - } else /* walk from the parent */ - n = 1; + fid = p9_client_attach(v9ses->clnt, NULL, uname, uid, + v9ses->aname); + if (IS_ERR(fid)) + return fid; - if (ds == dentry) + v9fs_fid_add(dentry->d_sb->s_root, fid); + } + /* If we are root ourself just return that */ + if (dentry->d_sb->s_root == dentry) return fid; - - wnames = kmalloc(sizeof(char *) * n, GFP_KERNEL); - if (!wnames) - return ERR_PTR(-ENOMEM); - - for (d = dentry, i = (n-1); i >= 0; i--, d = d->d_parent) - wnames[i] = (char *) d->d_name.name; - + /* + * Do a multipath walk with attached root. + * When walking parent we need to make sure we + * don't have a parallel rename happening + */ + read_lock(&v9ses->rename_lock); + n = build_path_from_dentry(v9ses, dentry, &wnames); + if (n < 0) { + fid = ERR_CAST(ERR_PTR(n)); + goto err_out; + } clone = 1; i = 0; while (i < n) { l = min(n - i, P9_MAXWELEM); + /* + * We need to hold rename lock when doing a multipath + * walk to ensure none of the patch component change + */ fid = p9_client_walk(fid, l, &wnames[i], clone); if (IS_ERR(fid)) { if (old_fid) { @@ -194,15 +235,16 @@ struct p9_fid *v9fs_fid_lookup(struct dentry *dentry) p9_client_clunk(old_fid); } kfree(wnames); - return fid; + goto err_out; } old_fid = fid; i += l; clone = 0; } - kfree(wnames); v9fs_fid_add(dentry, fid); +err_out: + read_unlock(&v9ses->rename_lock); return fid; } diff --git a/fs/9p/v9fs.c b/fs/9p/v9fs.c index 3c49201..b41bcef 100644 --- a/fs/9p/v9fs.c +++ b/fs/9p/v9fs.c @@ -237,6 +237,7 @@ struct p9_fid *v9fs_session_init(struct v9fs_session_info *v9ses, __putname(v9ses->uname); return ERR_PTR(-ENOMEM); } + rwlock_init(&v9ses->rename_lock); rc = bdi_setup_and_register(&v9ses->bdi, "9p", BDI_CAP_MAP_COPY); if (rc) { diff --git a/fs/9p/v9fs.h b/fs/9p/v9fs.h index bec4d0b..dee4f26 100644 --- a/fs/9p/v9fs.h +++ b/fs/9p/v9fs.h @@ -104,6 +104,7 @@ struct v9fs_session_info { struct p9_client *clnt; /* 9p client */ struct list_head slist; /* list of sessions registered with v9fs */ struct backing_dev_info bdi; + rwlock_t rename_lock; }; struct p9_fid *v9fs_session_init(struct v9fs_session_info *, const char *, diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c index 503a6a2..eae89ad 100644 --- a/fs/9p/vfs_inode.c +++ b/fs/9p/vfs_inode.c @@ -964,6 +964,7 @@ static struct dentry *v9fs_vfs_lookup(struct inode *dir, struct dentry *dentry, sb = dir->i_sb; v9ses = v9fs_inode2v9ses(dir); + /* We can walk d_parent because we hold the dir->i_mutex */ dfid = v9fs_fid_lookup(dentry->d_parent); if (IS_ERR(dfid)) return ERR_CAST(dfid); @@ -1049,7 +1050,7 @@ v9fs_vfs_rename(struct inode *old_dir, struct dentry *old_dentry, struct p9_fid *olddirfid; struct p9_fid *newdirfid; struct p9_wstat wstat; - int retval; + int retval, cross_dir_rename = 0; P9_DPRINTK(P9_DEBUG_VFS, "\n"); retval = 0; @@ -1070,6 +1071,9 @@ v9fs_vfs_rename(struct inode *old_dir, struct dentry *old_dentry, retval = PTR_ERR(newdirfid); goto clunk_olddir; } + cross_dir_rename = (old_dentry->d_parent != new_dentry->d_parent); + if (cross_dir_rename) + write_lock(&v9ses->rename_lock); if (v9fs_proto_dotl(v9ses)) { retval = p9_client_rename(oldfid, newdirfid, @@ -1077,21 +1081,27 @@ v9fs_vfs_rename(struct inode *old_dir, struct dentry *old_dentry, if (retval != -ENOSYS) goto clunk_newdir; } + if (cross_dir_rename) { + /* + * 9P .u can only handle file rename in the same directory + */ - /* 9P can only handle file rename in the same directory */ - if (memcmp(&olddirfid->qid, &newdirfid->qid, sizeof(newdirfid->qid))) { P9_DPRINTK(P9_DEBUG_ERROR, "old dir and new dir are different\n"); retval = -EXDEV; goto clunk_newdir; } - v9fs_blank_wstat(&wstat); wstat.muid = v9ses->uname; wstat.name = (char *) new_dentry->d_name.name; retval = p9_client_wstat(oldfid, &wstat); clunk_newdir: + if (!retval) + /* successful rename */ + d_move(old_dentry, new_dentry); + if (cross_dir_rename) + write_unlock(&v9ses->rename_lock); p9_client_clunk(newdirfid); clunk_olddir: diff --git a/fs/9p/vfs_super.c b/fs/9p/vfs_super.c index 0740675..3abc3ec 100644 --- a/fs/9p/vfs_super.c +++ b/fs/9p/vfs_super.c @@ -287,4 +287,5 @@ struct file_system_type v9fs_fs_type = { .get_sb = v9fs_get_sb, .kill_sb = v9fs_kill_super, .owner = THIS_MODULE, + .fs_flags = FS_RENAME_DOES_D_MOVE, }; ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [V9fs-developer] [GIT PULL] 9p file system bug fixes for 2.6.35-rc2 2010-06-30 11:52 ` Aneesh Kumar K. V @ 2010-06-30 16:48 ` Linus Torvalds 2010-06-30 18:58 ` Aneesh Kumar K. V 2010-06-30 18:16 ` Latchesar Ionkov 1 sibling, 1 reply; 16+ messages in thread From: Linus Torvalds @ 2010-06-30 16:48 UTC (permalink / raw) To: Aneesh Kumar K. V Cc: Eric Van Hensbergen, V9FS Developers, Al Viro, linux-kernel On Wed, Jun 30, 2010 at 4:52 AM, Aneesh Kumar K. V <aneesh.kumar@linux.vnet.ibm.com> wrote: > > You are correct. we cannot drop the rename lock in between. I also found > another issue in that we are using dentry->d_name.name directly. That > would imply we need to hold the rename_lock even during the > client_walk. Yes. > How about the patch below ?. I updated the patch to hold > rename_lock during multiple path walk. Also the rename path is updated > to hold the lock during p9_client_rename operations. I'm not finding any obvious problems with this, and you're right that you also need to hold the rename write-lock even for regular renames (not just cross-directory ones) in order to protect the name itself. So those two patches together seem to be ok at a quick glance. That said, I do wonder if you wouldn't be better off copying the name components in order to then drop the lock earlier. That way you'd only need to hold the lock while walking the dentries, and could possibly release it during the actual walk (unless you need the names to be stable, but I don't think you can rely on that anyway, since other clients might be doing renames concurrently.. I don't know) Linus ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [V9fs-developer] [GIT PULL] 9p file system bug fixes for 2.6.35-rc2 2010-06-30 16:48 ` Linus Torvalds @ 2010-06-30 18:58 ` Aneesh Kumar K. V 0 siblings, 0 replies; 16+ messages in thread From: Aneesh Kumar K. V @ 2010-06-30 18:58 UTC (permalink / raw) To: Linus Torvalds Cc: Eric Van Hensbergen, V9FS Developers, Al Viro, linux-kernel On Wed, 30 Jun 2010 09:48:15 -0700, Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Wed, Jun 30, 2010 at 4:52 AM, Aneesh Kumar K. V > <aneesh.kumar@linux.vnet.ibm.com> wrote: > > > > You are correct. we cannot drop the rename lock in between. I also found > > another issue in that we are using dentry->d_name.name directly. That > > would imply we need to hold the rename_lock even during the > > client_walk. > > Yes. > > > How about the patch below ?. I updated the patch to hold > > rename_lock during multiple path walk. Also the rename path is updated > > to hold the lock during p9_client_rename operations. > > I'm not finding any obvious problems with this, and you're right that > you also need to hold the rename write-lock even for regular renames > (not just cross-directory ones) in order to protect the name itself. > So those two patches together seem to be ok at a quick glance. > > That said, I do wonder if you wouldn't be better off copying the name > components in order to then drop the lock earlier. That way you'd only > need to hold the lock while walking the dentries, and could possibly > release it during the actual walk (unless you need the names to be > stable, but I don't think you can rely on that anyway, since other > clients might be doing renames concurrently.. I don't know) > > Linus Updated patch >From 0bae321401e88d495ea7e9e96272c11c1b9b9ec3 Mon Sep 17 00:00:00 2001 From: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> Date: Wed, 30 Jun 2010 19:18:50 +0530 Subject: [PATCH] fs/9p: Prevent parallel rename when doing fid_lookup During fid lookup we need to make sure that the dentry->d_parent doesn't change so that we can safely walk the parent dentries. To ensure that we need to prevent cross directory rename during fid_lookup. Add a per superblock rename_lock rwlock to prevent parallel fid lookup and rename. Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> --- fs/9p/fid.c | 134 ++++++++++++++++++++++++++++++++++++++--------------- fs/9p/v9fs.c | 1 + fs/9p/v9fs.h | 1 + fs/9p/vfs_inode.c | 14 ++++-- fs/9p/vfs_super.c | 1 + 5 files changed, 110 insertions(+), 41 deletions(-) diff --git a/fs/9p/fid.c b/fs/9p/fid.c index 5d6cfcb..7dcefaa 100644 --- a/fs/9p/fid.c +++ b/fs/9p/fid.c @@ -97,6 +97,48 @@ static struct p9_fid *v9fs_fid_find(struct dentry *dentry, u32 uid, int any) return ret; } +static int build_path_from_dentry(struct v9fs_session_info *v9ses, + struct dentry *dentry, char ***names) +{ + int n = 0, i; + char **wnames; + struct dentry *ds; + + read_lock(&v9ses->rename_lock); + for (ds = dentry; !IS_ROOT(ds); ds = ds->d_parent) + n++; + + wnames = kmalloc(sizeof(char *) * n, GFP_KERNEL); + if (!wnames) { + n = 0; + goto err_out; + } + for (ds = dentry, i = (n-1); i >= 0; i--, ds = ds->d_parent) { + wnames[i] = kmalloc(strlen(ds->d_name.name) + 1, GFP_KERNEL); + if (!wnames[i]) + goto err_out; + + strcpy(wnames[i], ds->d_name.name); + } + *names = wnames; + read_unlock(&v9ses->rename_lock); + return n; +err_out: + read_unlock(&v9ses->rename_lock); + for (; i < n; i++) + kfree(wnames[i]); + kfree(wnames); + return -ENOMEM; +} + +static void kfree_wnames(char **wnames, int n) +{ + int i; + for (i = 0; i < n; i++) + kfree(wnames[i]); + kfree(wnames); +} + /** * v9fs_fid_lookup - lookup for a fid, try to walk if not found * @dentry: dentry to look for fid in @@ -112,7 +154,7 @@ struct p9_fid *v9fs_fid_lookup(struct dentry *dentry) int i, n, l, clone, any, access; u32 uid; struct p9_fid *fid, *old_fid = NULL; - struct dentry *d, *ds; + struct dentry *ds; struct v9fs_session_info *v9ses; char **wnames, *uname; @@ -139,46 +181,62 @@ struct p9_fid *v9fs_fid_lookup(struct dentry *dentry) fid = v9fs_fid_find(dentry, uid, any); if (fid) return fid; - + /* + * we don't have a matching fid. To do a TWALK we need + * parent fid. We need to prevent rename when we want to + * look at the parent. + */ + read_lock(&v9ses->rename_lock); ds = dentry->d_parent; fid = v9fs_fid_find(ds, uid, any); - if (!fid) { /* walk from the root */ - n = 0; - for (ds = dentry; !IS_ROOT(ds); ds = ds->d_parent) - n++; - - fid = v9fs_fid_find(ds, uid, any); - if (!fid) { /* the user is not attached to the fs yet */ - if (access == V9FS_ACCESS_SINGLE) - return ERR_PTR(-EPERM); - - if (v9fs_proto_dotu(v9ses) || - v9fs_proto_dotl(v9ses)) + if (fid) { + /* + * Found the parent fid do a lookup with that. + * We want to do the walk without holding rename_lock + */ + char *dentry_name; + dentry_name = kmalloc(strlen(dentry->d_name.name) + 1, + GFP_KERNEL); + if (dentry_name) { + strcpy(dentry_name, dentry->d_name.name); + read_unlock(&v9ses->rename_lock); + fid = p9_client_walk(fid, 1, (char **)&dentry_name, 1); + kfree(dentry_name); + goto dentry_out; + } else { + read_unlock(&v9ses->rename_lock); + return ERR_PTR(-ENOMEM); + } + } + read_unlock(&v9ses->rename_lock); + /* start from the root and try to do a lookup */ + fid = v9fs_fid_find(dentry->d_sb->s_root, uid, any); + if (!fid) { + /* the user is not attached to the fs yet */ + if (access == V9FS_ACCESS_SINGLE) + return ERR_PTR(-EPERM); + + if (v9fs_proto_dotu(v9ses) || v9fs_proto_dotl(v9ses)) uname = NULL; - else - uname = v9ses->uname; - - fid = p9_client_attach(v9ses->clnt, NULL, uname, uid, - v9ses->aname); + else + uname = v9ses->uname; - if (IS_ERR(fid)) - return fid; - - v9fs_fid_add(ds, fid); - } - } else /* walk from the parent */ - n = 1; + fid = p9_client_attach(v9ses->clnt, NULL, uname, uid, + v9ses->aname); + if (IS_ERR(fid)) + return fid; - if (ds == dentry) + v9fs_fid_add(dentry->d_sb->s_root, fid); + } + /* If we are root just return the fid */ + if (dentry->d_sb->s_root == dentry) return fid; - wnames = kmalloc(sizeof(char *) * n, GFP_KERNEL); - if (!wnames) - return ERR_PTR(-ENOMEM); - - for (d = dentry, i = (n-1); i >= 0; i--, d = d->d_parent) - wnames[i] = (char *) d->d_name.name; - + n = build_path_from_dentry(v9ses, dentry, &wnames); + if (n < 0) { + fid = ERR_PTR(n); + goto err_out; + } clone = 1; i = 0; while (i < n) { @@ -193,16 +251,18 @@ struct p9_fid *v9fs_fid_lookup(struct dentry *dentry) */ p9_client_clunk(old_fid); } - kfree(wnames); - return fid; + kfree_wnames(wnames, n); + goto err_out; } old_fid = fid; i += l; clone = 0; } + kfree_wnames(wnames, n); - kfree(wnames); +dentry_out: v9fs_fid_add(dentry, fid); +err_out: return fid; } diff --git a/fs/9p/v9fs.c b/fs/9p/v9fs.c index 3c49201..b41bcef 100644 --- a/fs/9p/v9fs.c +++ b/fs/9p/v9fs.c @@ -237,6 +237,7 @@ struct p9_fid *v9fs_session_init(struct v9fs_session_info *v9ses, __putname(v9ses->uname); return ERR_PTR(-ENOMEM); } + rwlock_init(&v9ses->rename_lock); rc = bdi_setup_and_register(&v9ses->bdi, "9p", BDI_CAP_MAP_COPY); if (rc) { diff --git a/fs/9p/v9fs.h b/fs/9p/v9fs.h index bec4d0b..dee4f26 100644 --- a/fs/9p/v9fs.h +++ b/fs/9p/v9fs.h @@ -104,6 +104,7 @@ struct v9fs_session_info { struct p9_client *clnt; /* 9p client */ struct list_head slist; /* list of sessions registered with v9fs */ struct backing_dev_info bdi; + rwlock_t rename_lock; }; struct p9_fid *v9fs_session_init(struct v9fs_session_info *, const char *, diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c index 503a6a2..97fee13 100644 --- a/fs/9p/vfs_inode.c +++ b/fs/9p/vfs_inode.c @@ -964,6 +964,7 @@ static struct dentry *v9fs_vfs_lookup(struct inode *dir, struct dentry *dentry, sb = dir->i_sb; v9ses = v9fs_inode2v9ses(dir); + /* We can walk d_parent because we hold the dir->i_mutex */ dfid = v9fs_fid_lookup(dentry->d_parent); if (IS_ERR(dfid)) return ERR_CAST(dfid); @@ -1070,28 +1071,33 @@ v9fs_vfs_rename(struct inode *old_dir, struct dentry *old_dentry, retval = PTR_ERR(newdirfid); goto clunk_olddir; } - if (v9fs_proto_dotl(v9ses)) { retval = p9_client_rename(oldfid, newdirfid, (char *) new_dentry->d_name.name); if (retval != -ENOSYS) goto clunk_newdir; } + if (old_dentry->d_parent != new_dentry->d_parent) { + /* + * 9P .u can only handle file rename in the same directory + */ - /* 9P can only handle file rename in the same directory */ - if (memcmp(&olddirfid->qid, &newdirfid->qid, sizeof(newdirfid->qid))) { P9_DPRINTK(P9_DEBUG_ERROR, "old dir and new dir are different\n"); retval = -EXDEV; goto clunk_newdir; } - v9fs_blank_wstat(&wstat); wstat.muid = v9ses->uname; wstat.name = (char *) new_dentry->d_name.name; retval = p9_client_wstat(oldfid, &wstat); clunk_newdir: + write_lock(&v9ses->rename_lock); + if (!retval) + /* successful rename */ + d_move(old_dentry, new_dentry); + write_unlock(&v9ses->rename_lock); p9_client_clunk(newdirfid); clunk_olddir: diff --git a/fs/9p/vfs_super.c b/fs/9p/vfs_super.c index 0740675..3abc3ec 100644 --- a/fs/9p/vfs_super.c +++ b/fs/9p/vfs_super.c @@ -287,4 +287,5 @@ struct file_system_type v9fs_fs_type = { .get_sb = v9fs_get_sb, .kill_sb = v9fs_kill_super, .owner = THIS_MODULE, + .fs_flags = FS_RENAME_DOES_D_MOVE, }; -- 1.7.2.rc0 ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [V9fs-developer] [GIT PULL] 9p file system bug fixes for 2.6.35-rc2 2010-06-30 11:52 ` Aneesh Kumar K. V 2010-06-30 16:48 ` Linus Torvalds @ 2010-06-30 18:16 ` Latchesar Ionkov 2010-06-30 18:31 ` Linus Torvalds 2010-06-30 18:33 ` Aneesh Kumar K. V 1 sibling, 2 replies; 16+ messages in thread From: Latchesar Ionkov @ 2010-06-30 18:16 UTC (permalink / raw) To: Aneesh Kumar K. V Cc: Linus Torvalds, Eric Van Hensbergen, V9FS Developers, Al Viro, linux-kernel I think that you need to use the s_vfs_rename_mutex in the super_block struct instead of introducing a new rename_lock in the v9fs session. Thanks, Lucho On Wed, Jun 30, 2010 at 5:52 AM, Aneesh Kumar K. V <aneesh.kumar@linux.vnet.ibm.com> wrote: > On Tue, 29 Jun 2010 13:38:57 -0700, Linus Torvalds <torvalds@linux-foundation.org> wrote: >> On Tue, Jun 29, 2010 at 1:18 PM, Eric Van Hensbergen <ericvh@gmail.com> wrote: >> > >> > Does this approach satisfy your concerns? We've been going over >> > several different options on how to proceed, but this seems to be the >> > best option. >> >> Using a p9fs rename lock does seem to be a reasonable option. >> >> That said, the patch itself seems to not be valid. You drop the lock >> too early in v9fs_fid_lookup() as far as I can tell. You then re-take >> it before doing that whole >> >> for (d = dentry, i = (n-1); i >= 0; i--, d = d->d_parent) >> >> loop with it held again, but that's totally bogus - because you >> dropped the lock, your 'n-1' count has absolutely no meaning any more, >> since a cross-directory rename might have changed the depth of the >> thing in the meantime. >> >> And if the depth changes, you aren't at all guaranteed to stay on the >> same p9fs filesystem, so now you're doing that d_parent access without >> the proper locking (sure: you hold the rename lock, but it's not at >> all guaranteed that the rename lock is the _right_ lock any more as >> you traverse the list down!) >> >> But I didn't look deeply at the patch. There might be some reason why >> it's safe (I doubt it, though), and there might be other places where >> you do the same. But in general, dropping and re-taking a lock is a >> bad idea. If you dropped the lock, you can't depend on anything you >> found out while having held it. > > You are correct. we cannot drop the rename lock in between. I also found > another issue in that we are using dentry->d_name.name directly. That > would imply we need to hold the rename_lock even during the > client_walk. How about the patch below ?. I updated the patch to hold > rename_lock during multiple path walk. Also the rename path is updated > to hold the lock during p9_client_rename operations. > > commit 79f6f20dbb70ad35db37b674957c95de20662a75 > Author: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> > Date: Wed Jun 30 15:37:58 2010 +0530 > > fs/9p: Prevent parallel rename when doing fid_lookup > > During fid lookup we need to make sure that the dentry->d_parent doesn't > change so that we can safely walk the parent dentries. To ensure that > we need to prevent cross directory rename during fid_lookup. Add a > per superblock rename_lock rwlock to prevent parallel fid lookup and rename. > > Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> > > diff --git a/fs/9p/fid.c b/fs/9p/fid.c > index 5d6cfcb..7b387fe 100644 > --- a/fs/9p/fid.c > +++ b/fs/9p/fid.c > @@ -97,6 +97,34 @@ static struct p9_fid *v9fs_fid_find(struct dentry *dentry, u32 uid, int any) > return ret; > } > > +/* > + * We need to hold v9ses->rename_lock as long as we hold references > + * to returned path array. Array element contain pointers to > + * dentry names. > + */ > +static int build_path_from_dentry(struct v9fs_session_info *v9ses, > + struct dentry *dentry, char ***names) > +{ > + int n = 0, i; > + char **wnames; > + struct dentry *ds; > + > + for (ds = dentry; !IS_ROOT(ds); ds = ds->d_parent) > + n++; > + > + wnames = kmalloc(sizeof(char *) * n, GFP_KERNEL); > + if (!wnames) > + goto err_out; > + > + for (ds = dentry, i = (n-1); i >= 0; i--, ds = ds->d_parent) > + wnames[i] = (char *)ds->d_name.name; > + > + *names = wnames; > + return n; > +err_out: > + return -ENOMEM; > +} > + > /** > * v9fs_fid_lookup - lookup for a fid, try to walk if not found > * @dentry: dentry to look for fid in > @@ -112,7 +140,7 @@ struct p9_fid *v9fs_fid_lookup(struct dentry *dentry) > int i, n, l, clone, any, access; > u32 uid; > struct p9_fid *fid, *old_fid = NULL; > - struct dentry *d, *ds; > + struct dentry *ds; > struct v9fs_session_info *v9ses; > char **wnames, *uname; > > @@ -139,50 +167,63 @@ struct p9_fid *v9fs_fid_lookup(struct dentry *dentry) > fid = v9fs_fid_find(dentry, uid, any); > if (fid) > return fid; > - > + /* > + * we don't have a matching fid. To do a TWALK we need > + * parent fid. We need to prevent rename when we want to > + * look at the parent. > + */ > + read_lock(&v9ses->rename_lock); > ds = dentry->d_parent; > fid = v9fs_fid_find(ds, uid, any); > - if (!fid) { /* walk from the root */ > - n = 0; > - for (ds = dentry; !IS_ROOT(ds); ds = ds->d_parent) > - n++; > - > - fid = v9fs_fid_find(ds, uid, any); > - if (!fid) { /* the user is not attached to the fs yet */ > - if (access == V9FS_ACCESS_SINGLE) > - return ERR_PTR(-EPERM); > - > - if (v9fs_proto_dotu(v9ses) || > - v9fs_proto_dotl(v9ses)) > - uname = NULL; > - else > - uname = v9ses->uname; > + if (fid) { > + /* Found the parent fid do a lookup with that */ > + fid = p9_client_walk(fid, 1, (char **)&ds->d_name.name, 1); > + read_unlock(&v9ses->rename_lock); > + return fid; > + } > + read_unlock(&v9ses->rename_lock); > > - fid = p9_client_attach(v9ses->clnt, NULL, uname, uid, > - v9ses->aname); > + /* start from the root and try to do a lookup */ > + fid = v9fs_fid_find(dentry->d_sb->s_root, uid, any); > + if (!fid) { > + /* the user is not attached to the fs yet */ > + if (access == V9FS_ACCESS_SINGLE) > + return ERR_PTR(-EPERM); > > - if (IS_ERR(fid)) > - return fid; > + if (v9fs_proto_dotu(v9ses) || v9fs_proto_dotl(v9ses)) > + uname = NULL; > + else > + uname = v9ses->uname; > > - v9fs_fid_add(ds, fid); > - } > - } else /* walk from the parent */ > - n = 1; > + fid = p9_client_attach(v9ses->clnt, NULL, uname, uid, > + v9ses->aname); > + if (IS_ERR(fid)) > + return fid; > > - if (ds == dentry) > + v9fs_fid_add(dentry->d_sb->s_root, fid); > + } > + /* If we are root ourself just return that */ > + if (dentry->d_sb->s_root == dentry) > return fid; > - > - wnames = kmalloc(sizeof(char *) * n, GFP_KERNEL); > - if (!wnames) > - return ERR_PTR(-ENOMEM); > - > - for (d = dentry, i = (n-1); i >= 0; i--, d = d->d_parent) > - wnames[i] = (char *) d->d_name.name; > - > + /* > + * Do a multipath walk with attached root. > + * When walking parent we need to make sure we > + * don't have a parallel rename happening > + */ > + read_lock(&v9ses->rename_lock); > + n = build_path_from_dentry(v9ses, dentry, &wnames); > + if (n < 0) { > + fid = ERR_CAST(ERR_PTR(n)); > + goto err_out; > + } > clone = 1; > i = 0; > while (i < n) { > l = min(n - i, P9_MAXWELEM); > + /* > + * We need to hold rename lock when doing a multipath > + * walk to ensure none of the patch component change > + */ > fid = p9_client_walk(fid, l, &wnames[i], clone); > if (IS_ERR(fid)) { > if (old_fid) { > @@ -194,15 +235,16 @@ struct p9_fid *v9fs_fid_lookup(struct dentry *dentry) > p9_client_clunk(old_fid); > } > kfree(wnames); > - return fid; > + goto err_out; > } > old_fid = fid; > i += l; > clone = 0; > } > - > kfree(wnames); > v9fs_fid_add(dentry, fid); > +err_out: > + read_unlock(&v9ses->rename_lock); > return fid; > } > > diff --git a/fs/9p/v9fs.c b/fs/9p/v9fs.c > index 3c49201..b41bcef 100644 > --- a/fs/9p/v9fs.c > +++ b/fs/9p/v9fs.c > @@ -237,6 +237,7 @@ struct p9_fid *v9fs_session_init(struct v9fs_session_info *v9ses, > __putname(v9ses->uname); > return ERR_PTR(-ENOMEM); > } > + rwlock_init(&v9ses->rename_lock); > > rc = bdi_setup_and_register(&v9ses->bdi, "9p", BDI_CAP_MAP_COPY); > if (rc) { > diff --git a/fs/9p/v9fs.h b/fs/9p/v9fs.h > index bec4d0b..dee4f26 100644 > --- a/fs/9p/v9fs.h > +++ b/fs/9p/v9fs.h > @@ -104,6 +104,7 @@ struct v9fs_session_info { > struct p9_client *clnt; /* 9p client */ > struct list_head slist; /* list of sessions registered with v9fs */ > struct backing_dev_info bdi; > + rwlock_t rename_lock; > }; > > struct p9_fid *v9fs_session_init(struct v9fs_session_info *, const char *, > diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c > index 503a6a2..eae89ad 100644 > --- a/fs/9p/vfs_inode.c > +++ b/fs/9p/vfs_inode.c > @@ -964,6 +964,7 @@ static struct dentry *v9fs_vfs_lookup(struct inode *dir, struct dentry *dentry, > > sb = dir->i_sb; > v9ses = v9fs_inode2v9ses(dir); > + /* We can walk d_parent because we hold the dir->i_mutex */ > dfid = v9fs_fid_lookup(dentry->d_parent); > if (IS_ERR(dfid)) > return ERR_CAST(dfid); > @@ -1049,7 +1050,7 @@ v9fs_vfs_rename(struct inode *old_dir, struct dentry *old_dentry, > struct p9_fid *olddirfid; > struct p9_fid *newdirfid; > struct p9_wstat wstat; > - int retval; > + int retval, cross_dir_rename = 0; > > P9_DPRINTK(P9_DEBUG_VFS, "\n"); > retval = 0; > @@ -1070,6 +1071,9 @@ v9fs_vfs_rename(struct inode *old_dir, struct dentry *old_dentry, > retval = PTR_ERR(newdirfid); > goto clunk_olddir; > } > + cross_dir_rename = (old_dentry->d_parent != new_dentry->d_parent); > + if (cross_dir_rename) > + write_lock(&v9ses->rename_lock); > > if (v9fs_proto_dotl(v9ses)) { > retval = p9_client_rename(oldfid, newdirfid, > @@ -1077,21 +1081,27 @@ v9fs_vfs_rename(struct inode *old_dir, struct dentry *old_dentry, > if (retval != -ENOSYS) > goto clunk_newdir; > } > + if (cross_dir_rename) { > + /* > + * 9P .u can only handle file rename in the same directory > + */ > > - /* 9P can only handle file rename in the same directory */ > - if (memcmp(&olddirfid->qid, &newdirfid->qid, sizeof(newdirfid->qid))) { > P9_DPRINTK(P9_DEBUG_ERROR, > "old dir and new dir are different\n"); > retval = -EXDEV; > goto clunk_newdir; > } > - > v9fs_blank_wstat(&wstat); > wstat.muid = v9ses->uname; > wstat.name = (char *) new_dentry->d_name.name; > retval = p9_client_wstat(oldfid, &wstat); > > clunk_newdir: > + if (!retval) > + /* successful rename */ > + d_move(old_dentry, new_dentry); > + if (cross_dir_rename) > + write_unlock(&v9ses->rename_lock); > p9_client_clunk(newdirfid); > > clunk_olddir: > diff --git a/fs/9p/vfs_super.c b/fs/9p/vfs_super.c > index 0740675..3abc3ec 100644 > --- a/fs/9p/vfs_super.c > +++ b/fs/9p/vfs_super.c > @@ -287,4 +287,5 @@ struct file_system_type v9fs_fs_type = { > .get_sb = v9fs_get_sb, > .kill_sb = v9fs_kill_super, > .owner = THIS_MODULE, > + .fs_flags = FS_RENAME_DOES_D_MOVE, > }; > > ------------------------------------------------------------------------------ > This SF.net email is sponsored by Sprint > What will you do first with EVO, the first 4G phone? > Visit sprint.com/first -- http://p.sf.net/sfu/sprint-com-first > _______________________________________________ > V9fs-developer mailing list > V9fs-developer@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/v9fs-developer > ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [V9fs-developer] [GIT PULL] 9p file system bug fixes for 2.6.35-rc2 2010-06-30 18:16 ` Latchesar Ionkov @ 2010-06-30 18:31 ` Linus Torvalds 2010-06-30 18:33 ` Aneesh Kumar K. V 1 sibling, 0 replies; 16+ messages in thread From: Linus Torvalds @ 2010-06-30 18:31 UTC (permalink / raw) To: Latchesar Ionkov Cc: Aneesh Kumar K. V, Eric Van Hensbergen, V9FS Developers, Al Viro, linux-kernel On Wed, Jun 30, 2010 at 11:16 AM, Latchesar Ionkov <lucho@ionkov.net> wrote: > > I think that you need to use the s_vfs_rename_mutex in the super_block > struct instead of introducing a new rename_lock in the v9fs session. I actually think it's better to avoid having filesystems muck around with VFS locking details. Also, I think we get the VFS rename mutex only for cross-directory renames, and as mentioned, 9p needs locking even for regular directory renames. (Also, this way you can have parallel readers - although we could obviously change the vfs rename mutex into a rw-sem too). So I do think that keeping the logic private to a 9p-specific lock is the right solution here. Linus ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [V9fs-developer] [GIT PULL] 9p file system bug fixes for 2.6.35-rc2 2010-06-30 18:16 ` Latchesar Ionkov 2010-06-30 18:31 ` Linus Torvalds @ 2010-06-30 18:33 ` Aneesh Kumar K. V 1 sibling, 0 replies; 16+ messages in thread From: Aneesh Kumar K. V @ 2010-06-30 18:33 UTC (permalink / raw) To: Latchesar Ionkov Cc: Linus Torvalds, Eric Van Hensbergen, V9FS Developers, Al Viro, linux-kernel On Wed, 30 Jun 2010 12:16:15 -0600, Latchesar Ionkov <lucho@ionkov.net> wrote: > I think that you need to use the s_vfs_rename_mutex in the super_block > struct instead of introducing a new rename_lock in the v9fs session. > I guess that will have lock dependency issue with inode->i_mutex. Also as Linus suggested with dentry name copied we can avoid holding the lock when doing 9P operation. -aneesh ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [V9fs-developer] [GIT PULL] 9p file system bug fixes for 2.6.35-rc2 2010-06-24 16:30 ` Aneesh Kumar K. V 2010-06-29 20:18 ` Eric Van Hensbergen @ 2010-06-30 12:55 ` Aneesh Kumar K. V 1 sibling, 0 replies; 16+ messages in thread From: Aneesh Kumar K. V @ 2010-06-30 12:55 UTC (permalink / raw) To: Al Viro, Linus Torvalds Cc: Eric Van Hensbergen, V9FS Developers, linux-kernel On Thu, 24 Jun 2010 22:00:10 +0530, "Aneesh Kumar K. V" <aneesh.kumar@linux.vnet.ibm.com> wrote: > On Wed, 16 Jun 2010 22:12:32 +0530, "Aneesh Kumar K. V" <aneesh.kumar@linux.vnet.ibm.com> wrote: > > On Tue, 8 Jun 2010 01:41:02 +0100, Al Viro <viro@ZenIV.linux.org.uk> wrote: > > > On Mon, Jun 07, 2010 at 05:08:19PM -0700, Linus Torvalds wrote: > > > > > > > In fact, the other thing that I find doing that whole "dentry->d_parent" > > > > thing seems to literally be broken. If you look at v9fs_fid_lookup(), > > > > you'll notice how it walks up the d_parent chain, and at that point you do > > > > NOT own the directory i_mutex, so at that point d_parent really _can_ be > > > > changing wildly due to concurrent renames or whatever. > > > > > > Eh... It's bogus, all right, but i_mutex is not the correct solution. > > > You'd have to take it on a lot of inodes along the way to root *and* > > > you'd violate the ordering in process (ancestors first). > > > > > > I'm not sure what's the right thing to do there, actually - s_vfs_rename_sem > > > also won't do, since it'll give you ordering problems of its own (it's > > > taken before i_mutex in VFS, so trying to take it under i_mutex would not > > > do). > > > > Can we use dcache_lock in v9fs_fid_lookup ? Since we are holding > > parent directory inode->i_mutex in other places where we refer > > dentry->d_parent I guess those access are ok ?. And for v9fs_fid_lookup we > > can hold dcache_lock, walk the parent, build the full path name and use > > that for TWALK ? > > > > Another option is we deny a cross directory rename when > > doing fid_lookup. That is we can introduce a per superblock v9fs > > specific rwlock that get taken (in write mode) in a cross directory > > rename and in fid_lookup we take them in read mode ? We will have to set > > FS_RENAME_DOES_D_MOVE for 9p. > > something like this ? > > commit 86c06ad7506e7e05dd4e1e1b8cee28e19703c4f6 > Author: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> > Date: Mon Jun 21 21:50:07 2010 +0530 > > fs/9p: Prevent parallel rename when doing fid_lookup > > During fid lookup we need to make sure that the dentry->d_parent doesn't > change so that we can safely walk the parent dentries. To ensure that > we need to prevent cross directory rename during fid_lookup. Add a > per superblock rename_lock rwlock to prevent parallel fid lookup and rename. > > Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> > To make sure d_name.name remain correct we need something like below ? diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c index eae89ad..9f9f804 100644 --- a/fs/9p/vfs_inode.c +++ b/fs/9p/vfs_inode.c @@ -1050,7 +1050,7 @@ v9fs_vfs_rename(struct inode *old_dir, struct dentry *old_dentry, struct p9_fid *olddirfid; struct p9_fid *newdirfid; struct p9_wstat wstat; - int retval, cross_dir_rename = 0; + int retval; P9_DPRINTK(P9_DEBUG_VFS, "\n"); retval = 0; @@ -1071,17 +1071,15 @@ v9fs_vfs_rename(struct inode *old_dir, struct dentry *old_dentry, retval = PTR_ERR(newdirfid); goto clunk_olddir; } - cross_dir_rename = (old_dentry->d_parent != new_dentry->d_parent); - if (cross_dir_rename) - write_lock(&v9ses->rename_lock); + write_lock(&v9ses->rename_lock); if (v9fs_proto_dotl(v9ses)) { retval = p9_client_rename(oldfid, newdirfid, (char *) new_dentry->d_name.name); if (retval != -ENOSYS) goto clunk_newdir; } - if (cross_dir_rename) { + if (old_dentry->d_parent != new_dentry->d_parent) { /* * 9P .u can only handle file rename in the same directory */ @@ -1100,8 +1098,7 @@ clunk_newdir: if (!retval) /* successful rename */ d_move(old_dentry, new_dentry); - if (cross_dir_rename) - write_unlock(&v9ses->rename_lock); + write_unlock(&v9ses->rename_lock); p9_client_clunk(newdirfid); clunk_olddir: ^ permalink raw reply related [flat|nested] 16+ messages in thread
* Re: [V9fs-developer] [GIT PULL] 9p file system bug fixes for 2.6.35-rc2 2010-06-08 0:08 ` Linus Torvalds 2010-06-08 0:41 ` Al Viro @ 2010-06-08 14:29 ` Venkateswararao Jujjuri (JV) 1 sibling, 0 replies; 16+ messages in thread From: Venkateswararao Jujjuri (JV) @ 2010-06-08 14:29 UTC (permalink / raw) To: Linus Torvalds; +Cc: Eric Van Hensbergen, V9FS Developers, linux-kernel Linus Torvalds wrote: > > On Mon, 7 Jun 2010, Eric Van Hensbergen wrote: >> jvrao (2): >> Add a helper function to get fsgid for a file create. >> 9p: Add a wstat after TCREATE to fix the gid. > > Quite frankly, this looks rather broken. > > It uses "dentry->d_parent" without locking (it so happens to likely be ok, > since we are in "create()" and thus should be holding the parent > semaphore). On its own, that might be excusable (if people were even > _aware_ of the this locking rule!), but it does so just to get the inode > pointer to that parent. > > And the only thing that makes it ok to access dentry->d_parent - the fact > that we are in v9fs_create() - is also the thing that should have made > people look at the arguments to the function and say "hmm". Silly me. I sent out another patch using the dir inode passed through arguments. But we still need to analyze the use of dentry->d_parent in other parts of code.. - JV > > We pass in the directory inode pointer as an argument to the create > function! The code could have used that thing directly, instead of > mucking around with dentry pointers that it had no business looking at. > > I see why it seems to have happened: v9fs does the exact same thing for > the pre-existing "v9fs_fid_lookup()". So there is history to this > behavior. > > Maybe people weren't aware of the fact that just dereferencing > dentry->d_parent willy-nilly isn't actually allowed. That field changes. > Sure, there are cases where it's ok, but this is a dangerous thing to do > in general. > > In fact, the other thing that I find doing that whole "dentry->d_parent" > thing seems to literally be broken. If you look at v9fs_fid_lookup(), > you'll notice how it walks up the d_parent chain, and at that point you do > NOT own the directory i_mutex, so at that point d_parent really _can_ be > changing wildly due to concurrent renames or whatever. > > So 9pfs seems to have some preexisting bugs in this area. I'm not going to > pull new bug-prone code. See the other discussions about being tight this > release about really _only_ taking regressions after the merge window > closed. > > Linus > > ------------------------------------------------------------------------------ > ThinkGeek and WIRED's GeekDad team up for the Ultimate > GeekDad Father's Day Giveaway. ONE MASSIVE PRIZE to the > lucky parental unit. See the prize list and enter to win: > http://p.sf.net/sfu/thinkgeek-promo > _______________________________________________ > V9fs-developer mailing list > V9fs-developer@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/v9fs-developer ^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2010-06-30 18:58 UTC | newest] Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2010-06-07 20:02 [GIT PULL] 9p file system bug fixes for 2.6.35-rc2 Eric Van Hensbergen 2010-06-08 0:08 ` Linus Torvalds 2010-06-08 0:41 ` Al Viro 2010-06-08 0:48 ` Linus Torvalds 2010-06-16 16:42 ` [V9fs-developer] " Aneesh Kumar K. V 2010-06-24 16:30 ` Aneesh Kumar K. V 2010-06-29 20:18 ` Eric Van Hensbergen 2010-06-29 20:38 ` Linus Torvalds 2010-06-30 11:52 ` Aneesh Kumar K. V 2010-06-30 16:48 ` Linus Torvalds 2010-06-30 18:58 ` Aneesh Kumar K. V 2010-06-30 18:16 ` Latchesar Ionkov 2010-06-30 18:31 ` Linus Torvalds 2010-06-30 18:33 ` Aneesh Kumar K. V 2010-06-30 12:55 ` Aneesh Kumar K. V 2010-06-08 14:29 ` Venkateswararao Jujjuri (JV)
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).