All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/16] vfs: atomic open v2
@ 2012-03-28 20:24 Miklos Szeredi
  2012-03-28 20:24 ` [PATCH 01/16] vfs: split do_lookup() Miklos Szeredi
                   ` (15 more replies)
  0 siblings, 16 replies; 22+ messages in thread
From: Miklos Szeredi @ 2012-03-28 20:24 UTC (permalink / raw)
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

This series allows clean implementation of atomic lookup+(create)+open
operations that previously were done via ->lookup and ->create using open
intents.

The biggest change from the first version is that ->atomic_create has been
merged into ->atomic_open, and that now ->atomic_open is allowed to return with
the result of a lookup without actually opening the file.

The interface is thus simpler and more flexible at the cost of a bit of extra
complexity in the filesystem implementations.

I dropped the rest of the nameidata cleanup patches from this series, and will
post them as a separate series.

Note, I only tested NFS and FUSE, the CIFS, CEPH and 9P changes are completely
untested, so please report any testing results.

git tree is here:

  git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git atomic-open.v2

Thanks,
Miklos
---

Miklos Szeredi (16):
      vfs: split do_lookup()
      vfs: reorganize do_last()
      vfs: split __dentry_open()
      vfs: add i_op->atomic_open()
      nfs: don't open in ->d_revalidate
      nfs: implement i_op->atomic_open()
      nfs: clean up ->create in nfs_rpc_ops
      nfs: don't use nd->intent.open.flags
      nfs: don't use intents for checking atomic open
      fuse: implement i_op->atomic_create()
      cifs: implement i_op->atomic_open() and i_op->atomic_create()
      ceph: remove unused arg from ceph_lookup_open()
      ceph: implement i_op->atomic_open() and i_op->atomic_create()
      9p: implement i_op->atomic_create()
      vfs: remove open intents from nameidata
      vfs: only retry last component if opening stale dentry

---
 fs/9p/vfs_inode.c       |  169 ++++++++++------
 fs/9p/vfs_inode_dotl.c  |   52 ++++--
 fs/ceph/dir.c           |   68 ++++---
 fs/ceph/file.c          |   22 +-
 fs/ceph/super.h         |    6 +-
 fs/cifs/cifsfs.c        |    1 +
 fs/cifs/cifsfs.h        |    3 +
 fs/cifs/dir.c           |  435 ++++++++++++++++++++++------------------
 fs/fuse/dir.c           |   97 +++++++---
 fs/internal.h           |   10 +-
 fs/namei.c              |  510 ++++++++++++++++++++++++++++++++++++-----------
 fs/nfs/dir.c            |  285 ++++++++++-----------------
 fs/nfs/file.c           |   69 ++++++-
 fs/nfs/nfs3proc.c       |    2 +-
 fs/nfs/nfs4proc.c       |   37 +---
 fs/nfs/proc.c           |    2 +-
 fs/open.c               |  119 +++++------
 include/linux/errno.h   |    1 +
 include/linux/fs.h      |    7 +
 include/linux/namei.h   |   11 -
 include/linux/nfs_xdr.h |    2 +-
 21 files changed, 1153 insertions(+), 755 deletions(-)


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH 01/16] vfs: split do_lookup()
  2012-03-28 20:24 [PATCH 00/16] vfs: atomic open v2 Miklos Szeredi
@ 2012-03-28 20:24 ` Miklos Szeredi
  2012-03-28 20:24 ` [PATCH 02/16] vfs: reorganize do_last() Miklos Szeredi
                   ` (14 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: Miklos Szeredi @ 2012-03-28 20:24 UTC (permalink / raw)
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

From: Miklos Szeredi <mszeredi@suse.cz>

Split do_lookup() into two functions:

  lookup_fast() - does cached lookup without i_mutex
  lookup_slow() - does lookup with i_mutex

Both follow managed dentries.

The new functions are needed by atomic_open.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
 fs/namei.c |   56 ++++++++++++++++++++++++++++++++++++++++++--------------
 1 files changed, 42 insertions(+), 14 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index d15d0ba..936648c 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1142,8 +1142,8 @@ static struct dentry *__lookup_hash(struct qstr *name, struct dentry *base,
  *  small and for now I'd prefer to have fast path as straight as possible.
  *  It _is_ time-critical.
  */
-static int do_lookup(struct nameidata *nd, struct qstr *name,
-			struct path *path, struct inode **inode)
+static int lookup_fast(struct nameidata *nd, struct qstr *name,
+		       struct path *path, struct inode **inode)
 {
 	struct vfsmount *mnt = nd->path.mnt;
 	struct dentry *dentry, *parent = nd->path.dentry;
@@ -1212,7 +1212,6 @@ unlazy:
 		}
 	}
 
-success:
 	path->mnt = mnt;
 	path->dentry = dentry;
 	err = follow_managed(path, nd);
@@ -1224,6 +1223,17 @@ success:
 	return 0;
 
 need_lookup:
+	return 1;
+}
+
+/* Fast lookup failed, do it the slow way */
+static int lookup_slow(struct nameidata *nd, struct qstr *name,
+		       struct path *path)
+{
+	struct dentry *dentry, *parent;
+	int err;
+
+	parent = nd->path.dentry;
 	BUG_ON(nd->inode != parent->d_inode);
 
 	mutex_lock(&parent->d_inode->i_mutex);
@@ -1232,7 +1242,14 @@ need_lookup:
 	if (IS_ERR(dentry))
 		return PTR_ERR(dentry);
 
-	goto success;
+	path->mnt = nd->path.mnt;
+	path->dentry = dentry;
+	err = follow_managed(path, nd);
+	if (unlikely(err)) {
+		path_put_conditional(path, nd);
+		return err;
+	}
+	return 0;
 }
 
 static inline int may_lookup(struct nameidata *nd)
@@ -1304,21 +1321,26 @@ static inline int walk_component(struct nameidata *nd, struct path *path,
 	 */
 	if (unlikely(type != LAST_NORM))
 		return handle_dots(nd, type);
-	err = do_lookup(nd, name, path, &inode);
+	err = lookup_fast(nd, name, path, &inode);
 	if (unlikely(err)) {
-		terminate_walk(nd);
-		return err;
-	}
-	if (!inode) {
-		path_to_nameidata(path, nd);
-		terminate_walk(nd);
-		return -ENOENT;
+		if (err < 0)
+			goto out_err;
+
+		err = lookup_slow(nd, name, path);
+		if (err < 0)
+			goto out_err;
+
+		inode = path->dentry->d_inode;
 	}
+	err = -ENOENT;
+	if (!inode)
+		goto out_path_put;
+
 	if (should_follow_link(inode, follow)) {
 		if (nd->flags & LOOKUP_RCU) {
 			if (unlikely(unlazy_walk(nd, path->dentry))) {
-				terminate_walk(nd);
-				return -ECHILD;
+				err = -ECHILD;
+				goto out_err;
 			}
 		}
 		BUG_ON(inode != path->dentry->d_inode);
@@ -1327,6 +1349,12 @@ static inline int walk_component(struct nameidata *nd, struct path *path,
 	path_to_nameidata(path, nd);
 	nd->inode = inode;
 	return 0;
+
+out_path_put:
+	path_to_nameidata(path, nd);
+out_err:
+	terminate_walk(nd);
+	return err;
 }
 
 /*
-- 
1.7.7


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 02/16] vfs: reorganize do_last()
  2012-03-28 20:24 [PATCH 00/16] vfs: atomic open v2 Miklos Szeredi
  2012-03-28 20:24 ` [PATCH 01/16] vfs: split do_lookup() Miklos Szeredi
@ 2012-03-28 20:24 ` Miklos Szeredi
  2012-03-28 20:24 ` [PATCH 03/16] vfs: split __dentry_open() Miklos Szeredi
                   ` (13 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: Miklos Szeredi @ 2012-03-28 20:24 UTC (permalink / raw)
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

From: Miklos Szeredi <mszeredi@suse.cz>

Make the slow lookup part of O_CREAT and non-O_CREAT opens common.

This allows atomic_open to be hooked into the slow lookup part.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
 fs/namei.c |   92 ++++++++++++++++++++++++++++++++++++------------------------
 1 files changed, 55 insertions(+), 37 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 936648c..ff3c47b 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2108,6 +2108,8 @@ static struct file *do_last(struct nameidata *nd, struct path *path,
 	int want_write = 0;
 	int acc_mode = op->acc_mode;
 	struct file *filp;
+	struct inode *inode;
+	int symlink_ok = 0;
 	int error;
 
 	nd->flags &= ~LOOKUP_PARENT;
@@ -2139,47 +2141,38 @@ static struct file *do_last(struct nameidata *nd, struct path *path,
 	}
 
 	if (!(open_flag & O_CREAT)) {
-		int symlink_ok = 0;
 		if (nd->last.name[nd->last.len])
 			nd->flags |= LOOKUP_FOLLOW | LOOKUP_DIRECTORY;
 		if (open_flag & O_PATH && !(nd->flags & LOOKUP_FOLLOW))
 			symlink_ok = 1;
 		/* we _can_ be in RCU mode here */
-		error = walk_component(nd, path, &nd->last, LAST_NORM,
-					!symlink_ok);
-		if (error < 0)
-			return ERR_PTR(error);
-		if (error) /* symlink */
-			return NULL;
-		/* sayonara */
+		error = lookup_fast(nd, &nd->last, path, &inode);
+		if (error <= 0) {
+			if (error)
+				goto terminate;
+
+			goto finish_lookup;
+		}
+		/* cached lookup failed, no longer in RCU mode */
+	} else {
+		/* create side of things */
+
+		/*
+		 * This will *only* deal with leaving RCU mode - LOOKUP_JUMPED
+		 * has been cleared when we got to the last component we are
+		 * about to look up
+		 */
 		error = complete_walk(nd);
 		if (error)
 			return ERR_PTR(error);
 
-		error = -ENOTDIR;
-		if (nd->flags & LOOKUP_DIRECTORY) {
-			if (!nd->inode->i_op->lookup)
-				goto exit;
-		}
-		audit_inode(pathname, nd->path.dentry);
-		goto ok;
+		audit_inode(pathname, dir);
+		error = -EISDIR;
+		/* trailing slashes? */
+		if (nd->last.name[nd->last.len])
+			goto exit;
 	}
 
-	/* create side of things */
-	/*
-	 * This will *only* deal with leaving RCU mode - LOOKUP_JUMPED has been
-	 * cleared when we got to the last component we are about to look up
-	 */
-	error = complete_walk(nd);
-	if (error)
-		return ERR_PTR(error);
-
-	audit_inode(pathname, dir);
-	error = -EISDIR;
-	/* trailing slashes? */
-	if (nd->last.name[nd->last.len])
-		goto exit;
-
 	mutex_lock(&dir->d_inode->i_mutex);
 
 	dentry = lookup_hash(nd);
@@ -2192,9 +2185,14 @@ static struct file *do_last(struct nameidata *nd, struct path *path,
 	path->dentry = dentry;
 	path->mnt = nd->path.mnt;
 
-	/* Negative dentry, just create the file */
+	/* Negative dentry, create the file if O_CREAT */
 	if (!dentry->d_inode) {
 		umode_t mode = op->mode;
+
+		error = -ENOENT;
+		if (!(open_flag & O_CREAT))
+			goto exit_mutex_unlock;
+
 		if (!IS_POSIXACL(dir->d_inode))
 			mode &= ~current_umask();
 		/*
@@ -2238,22 +2236,38 @@ static struct file *do_last(struct nameidata *nd, struct path *path,
 	if (error)
 		goto exit_dput;
 
+	inode = path->dentry->d_inode;
+finish_lookup:
 	error = -ENOENT;
-	if (!path->dentry->d_inode)
-		goto exit_dput;
+	if (!inode) {
+		path_to_nameidata(path, nd);
+		goto terminate;
+	}
 
-	if (path->dentry->d_inode->i_op->follow_link)
+	if (should_follow_link(inode, !symlink_ok)) {
+		if (nd->flags & LOOKUP_RCU) {
+			if (unlikely(unlazy_walk(nd, path->dentry))) {
+				error = -ECHILD;
+				goto terminate;
+			}
+		}
 		return NULL;
+	}
 
 	path_to_nameidata(path, nd);
-	nd->inode = path->dentry->d_inode;
-	/* Why this, you ask?  _Now_ we might have grown LOOKUP_JUMPED... */
+	nd->inode = inode;
+
 	error = complete_walk(nd);
 	if (error)
 		return ERR_PTR(error);
 	error = -EISDIR;
-	if (S_ISDIR(nd->inode->i_mode))
+	if ((open_flag & O_CREAT) && S_ISDIR(inode->i_mode))
+		goto exit;
+	error = -ENOTDIR;
+	if ((nd->flags & LOOKUP_DIRECTORY) && !inode->i_op->lookup)
 		goto exit;
+
+	audit_inode(pathname, nd->path.dentry);
 ok:
 	if (!S_ISREG(nd->inode->i_mode))
 		will_truncate = 0;
@@ -2298,6 +2312,10 @@ exit_dput:
 exit:
 	filp = ERR_PTR(error);
 	goto out;
+
+terminate:
+	terminate_walk(nd);
+	return ERR_PTR(error);
 }
 
 static struct file *path_openat(int dfd, const char *pathname,
-- 
1.7.7


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 03/16] vfs: split __dentry_open()
  2012-03-28 20:24 [PATCH 00/16] vfs: atomic open v2 Miklos Szeredi
  2012-03-28 20:24 ` [PATCH 01/16] vfs: split do_lookup() Miklos Szeredi
  2012-03-28 20:24 ` [PATCH 02/16] vfs: reorganize do_last() Miklos Szeredi
@ 2012-03-28 20:24 ` Miklos Szeredi
  2012-03-28 20:24 ` [PATCH 04/16] vfs: add i_op->atomic_open() Miklos Szeredi
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: Miklos Szeredi @ 2012-03-28 20:24 UTC (permalink / raw)
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

From: Miklos Szeredi <mszeredi@suse.cz>

Split __dentry_open() into two functions:

  do_dentry_open() - does most of the actual work, doesn't put file on failure
  open_check_o_direct() - after a successful open, checks direct_IO method

This will allow i_op->atomic_open to do just the file initialization and leave
the direct_IO checking to the VFS.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
 fs/internal.h |    1 +
 fs/open.c     |   47 +++++++++++++++++++++++++++++++++--------------
 2 files changed, 34 insertions(+), 14 deletions(-)

diff --git a/fs/internal.h b/fs/internal.h
index 9962c59..4d69fdd 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -100,6 +100,7 @@ extern struct file *do_file_open_root(struct dentry *, struct vfsmount *,
 
 extern long do_handle_open(int mountdirfd,
 			   struct file_handle __user *ufh, int open_flag);
+extern int open_check_o_direct(struct file *f);
 
 /*
  * inode.c
diff --git a/fs/open.c b/fs/open.c
index 77becc0..6acfd2d 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -644,10 +644,23 @@ static inline int __get_file_write_access(struct inode *inode,
 	return error;
 }
 
-static struct file *__dentry_open(struct dentry *dentry, struct vfsmount *mnt,
-					struct file *f,
-					int (*open)(struct inode *, struct file *),
-					const struct cred *cred)
+int open_check_o_direct(struct file *f)
+{
+	/* NB: we're sure to have correct a_ops only after f_op->open */
+	if (f->f_flags & O_DIRECT) {
+		if (!f->f_mapping->a_ops ||
+		    ((!f->f_mapping->a_ops->direct_IO) &&
+		    (!f->f_mapping->a_ops->get_xip_mem))) {
+			return -EINVAL;
+		}
+	}
+	return 0;
+}
+
+static struct file *do_dentry_open(struct dentry *dentry, struct vfsmount *mnt,
+				   struct file *f,
+				   int (*open)(struct inode *, struct file *),
+				   const struct cred *cred)
 {
 	static const struct file_operations empty_fops = {};
 	struct inode *inode;
@@ -703,16 +716,6 @@ static struct file *__dentry_open(struct dentry *dentry, struct vfsmount *mnt,
 
 	file_ra_state_init(&f->f_ra, f->f_mapping->host->i_mapping);
 
-	/* NB: we're sure to have correct a_ops only after f_op->open */
-	if (f->f_flags & O_DIRECT) {
-		if (!f->f_mapping->a_ops ||
-		    ((!f->f_mapping->a_ops->direct_IO) &&
-		    (!f->f_mapping->a_ops->get_xip_mem))) {
-			fput(f);
-			f = ERR_PTR(-EINVAL);
-		}
-	}
-
 	return f;
 
 cleanup_all:
@@ -740,6 +743,22 @@ cleanup_file:
 	return ERR_PTR(error);
 }
 
+static struct file *__dentry_open(struct dentry *dentry, struct vfsmount *mnt,
+				struct file *f,
+				int (*open)(struct inode *, struct file *),
+				const struct cred *cred)
+{
+	struct file *res = do_dentry_open(dentry, mnt, f, open, cred);
+	if (!IS_ERR(res)) {
+		int error = open_check_o_direct(f);
+		if (error) {
+			fput(res);
+			res = ERR_PTR(error);
+		}
+	}
+	return res;
+}
+
 /**
  * lookup_instantiate_filp - instantiates the open intent filp
  * @nd: pointer to nameidata
-- 
1.7.7


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 04/16] vfs: add i_op->atomic_open()
  2012-03-28 20:24 [PATCH 00/16] vfs: atomic open v2 Miklos Szeredi
                   ` (2 preceding siblings ...)
  2012-03-28 20:24 ` [PATCH 03/16] vfs: split __dentry_open() Miklos Szeredi
@ 2012-03-28 20:24 ` Miklos Szeredi
  2012-03-28 20:24 ` [PATCH 05/16] nfs: don't open in ->d_revalidate Miklos Szeredi
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: Miklos Szeredi @ 2012-03-28 20:24 UTC (permalink / raw)
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

From: Miklos Szeredi <mszeredi@suse.cz>

Add a new inode operation which is called on the last component of an open.
Using this the filesystem can look up, possibly create and open the file in one
atomic operation.  If it cannot perform this (e.g. the file type turned out to
be wrong) it may signal this by returning NULL instead of an open struct file
pointer.

i_op->atomic_open() is only called if the last component is negative or needs
lookup.  Handling cached positive dentries here doesn't add much value: these
can be opened using f_op->open().  If the cached file turns out to be invalid,
the open can be retried, this time using ->atomic_open() with a fresh dentry.

For now leave the old way of using open intents in lookup and revalidate in
place.  This will be removed once all the users are converted.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
 fs/internal.h      |    6 ++
 fs/namei.c         |  246 +++++++++++++++++++++++++++++++++++++++++++++++++---
 fs/open.c          |   41 +++++++++
 include/linux/fs.h |    7 ++
 4 files changed, 289 insertions(+), 11 deletions(-)

diff --git a/fs/internal.h b/fs/internal.h
index 4d69fdd..438fbd3 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -87,12 +87,18 @@ extern struct super_block *user_get_super(dev_t);
 struct nameidata;
 extern struct file *nameidata_to_filp(struct nameidata *);
 extern void release_open_intent(struct nameidata *);
+struct opendata {
+	struct dentry *dentry;
+	struct vfsmount *mnt;
+	struct file **filp;
+};
 struct open_flags {
 	int open_flag;
 	umode_t mode;
 	int acc_mode;
 	int intent;
 };
+
 extern struct file *do_filp_open(int dfd, const char *pathname,
 		const struct open_flags *op, int lookup_flags);
 extern struct file *do_file_open_root(struct dentry *, struct vfsmount *,
diff --git a/fs/namei.c b/fs/namei.c
index ff3c47b..f6dfd7c 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2095,6 +2095,216 @@ static inline int open_to_namei_flags(int flag)
 	return flag;
 }
 
+static int may_o_create(struct path *dir, struct dentry *dentry, umode_t mode)
+{
+	int error = security_path_mknod(dir, dentry, mode, 0);
+	if (error)
+		return error;
+
+	error = may_create(dir->dentry->d_inode, dentry);
+	if (error)
+		return error;
+
+	return security_inode_create(dir->dentry->d_inode, dentry, mode);
+}
+
+static struct file *atomic_open(struct nameidata *nd, struct dentry *dentry,
+				struct path *path, const struct open_flags *op,
+				int *want_write, bool need_lookup)
+{
+	struct inode *dir =  nd->path.dentry->d_inode;
+	unsigned open_flag = open_to_namei_flags(op->open_flag);
+	umode_t mode;
+	int error;
+	bool created = false;
+	int acc_mode;
+	struct opendata od;
+	struct file *filp;
+	int create_error = 0;
+	struct dentry *const DENTRY_NOT_SET = (void *) -1UL;
+
+	BUG_ON(dentry->d_inode);
+
+	/* Don't create child dentry for a dead directory. */
+	if (unlikely(IS_DEADDIR(dir))) {
+		filp = ERR_PTR(-ENOENT);
+		goto out;
+	}
+
+	mode = op->mode & S_IALLUGO;
+	if ((open_flag & O_CREAT) && !IS_POSIXACL(dir))
+		mode &= ~current_umask();
+
+	if (open_flag & O_EXCL) {
+		open_flag &= ~O_TRUNC;
+		created = true;
+	}
+
+	/*
+	 * Checking write permission is tricky, bacuse we don't know if we are
+	 * going to actually need it: O_CREAT opens should work as long as the
+	 * file exists.  But checking existence breaks atomicity.  The trick is
+	 * to check access and if not granted clear O_CREAT from the flags.
+	 *
+	 * Another problem is returing the "right" error value (e.g. for an
+	 * O_EXCL open we want to return EEXIST not EROFS).
+	 */
+	if ((open_flag & (O_CREAT | O_TRUNC)) ||
+	    (open_flag & O_ACCMODE) != O_RDONLY) {
+		error = mnt_want_write(nd->path.mnt);
+		if (!error) {
+			*want_write = 1;
+		} else if (!(open_flag & O_CREAT)) {
+			/*
+			 * No O_CREATE -> atomicity not a requirement -> fall
+			 * back to lookup + open
+			 */
+			goto no_open;
+		} else if (open_flag & (O_EXCL | O_TRUNC)) {
+			/* Fall back and fail with the right error */
+			create_error = error;
+			goto no_open;
+		} else {
+			/* No side effects, safe to clear O_CREAT */
+			create_error = error;
+			open_flag &= ~O_CREAT;
+		}
+	}
+
+	if (open_flag & O_CREAT) {
+		error = may_o_create(&nd->path, dentry, op->mode);
+		if (error) {
+			create_error = error;
+			if (open_flag & O_EXCL)
+				goto no_open;
+			open_flag &= ~O_CREAT;
+		}
+	}
+
+	if (nd->flags & LOOKUP_DIRECTORY)
+		open_flag |= O_DIRECTORY;
+
+	od.dentry = DENTRY_NOT_SET;
+	od.mnt = nd->path.mnt;
+	od.filp = &nd->intent.open.file;
+	filp = dir->i_op->atomic_open(dir, dentry, &od, open_flag, mode,
+				      &created);
+	if (IS_ERR(filp)) {
+		if (WARN_ON(od.dentry != DENTRY_NOT_SET))
+			dput(od.dentry);
+
+		if (create_error && PTR_ERR(filp) == -ENOENT)
+			filp = ERR_PTR(create_error);
+		goto out;
+	}
+
+	acc_mode = op->acc_mode;
+	if (created) {
+		fsnotify_create(dir, dentry);
+		acc_mode = MAY_OPEN;
+	}
+
+	if (!filp) {
+		if (WARN_ON(od.dentry == DENTRY_NOT_SET)) {
+			filp = ERR_PTR(-EIO);
+			goto out;
+		}
+		if (od.dentry) {
+			dput(dentry);
+			dentry = od.dentry;
+		}
+		goto looked_up;
+	}
+
+	/*
+	 * We didn't have the inode before the open, so check open permission
+	 * here.
+	 */
+	error = may_open(&filp->f_path, acc_mode, open_flag);
+	if (error)
+		goto out_fput;
+
+	error = open_check_o_direct(filp);
+	if (error)
+		goto out_fput;
+
+out:
+	dput(dentry);
+	return filp;
+
+out_fput:
+	fput(filp);
+	filp = ERR_PTR(error);
+	goto out;
+
+no_open:
+	if (need_lookup) {
+		dentry = lookup_real(dir, dentry, nd);
+		if (IS_ERR(dentry))
+			return ERR_CAST(dentry);
+
+		if (create_error) {
+			int open_flag = op->open_flag;
+
+			filp = ERR_PTR(create_error);
+			if ((open_flag & O_EXCL) && !dentry->d_inode)
+				goto out;
+			if (!(open_flag & O_EXCL) && (open_flag & O_TRUNC) &&
+			    dentry->d_inode && S_ISREG(dentry->d_inode->i_mode))
+				goto out;
+
+			/* will fail later, go on to get the right error */
+		}
+	}
+looked_up:
+	path->dentry = dentry;
+	path->mnt = nd->path.mnt;
+	return NULL;
+}
+
+/*
+ * Lookup and possibly open (and create) the last component
+ *
+ * Must be called with i_mutex held on parent.
+ *
+ * Returns open file or NULL on success, error otherwise.  NULL means no open
+ * was performed, only lookup.
+ */
+static struct file *lookup_open(struct nameidata *nd, struct path *path,
+				const struct open_flags *op, int *want_write)
+{
+	struct dentry *dir = nd->path.dentry;
+	struct inode *dir_inode = dir->d_inode;
+	struct dentry *dentry;
+	bool need_lookup;
+
+	dentry = lookup_dcache(&nd->last, dir, nd, &need_lookup);
+	if (IS_ERR(dentry))
+		return ERR_CAST(dentry);
+
+	/* Cached positive dentry: will open in f_op->open */
+	if (!need_lookup && dentry->d_inode)
+		goto out_no_open;
+
+	if ((nd->flags & LOOKUP_OPEN) && dir_inode->i_op->atomic_open) {
+		return atomic_open(nd, dentry, path, op, want_write,
+				   need_lookup);
+	}
+
+	if (need_lookup) {
+		BUG_ON(dentry->d_inode);
+
+		dentry = lookup_real(dir_inode, dentry, nd);
+		if (IS_ERR(dentry))
+			return ERR_CAST(dentry);
+	}
+
+out_no_open:
+	path->dentry = dentry;
+	path->mnt = nd->path.mnt;
+	return NULL;
+}
+
 /*
  * Handle the last step of open()
  */
@@ -2175,15 +2385,16 @@ static struct file *do_last(struct nameidata *nd, struct path *path,
 
 	mutex_lock(&dir->d_inode->i_mutex);
 
-	dentry = lookup_hash(nd);
-	error = PTR_ERR(dentry);
-	if (IS_ERR(dentry)) {
+	filp = lookup_open(nd, path, op, &want_write);
+	if (filp) {
 		mutex_unlock(&dir->d_inode->i_mutex);
-		goto exit;
-	}
+		if (IS_ERR(filp))
+			goto out;
 
-	path->dentry = dentry;
-	path->mnt = nd->path.mnt;
+		audit_inode(pathname, filp->f_path.dentry);
+		goto opened;
+	}
+	dentry = path->dentry;
 
 	/* Negative dentry, create the file if O_CREAT */
 	if (!dentry->d_inode) {
@@ -2202,10 +2413,12 @@ static struct file *do_last(struct nameidata *nd, struct path *path,
 		 * a permanent write count is taken through
 		 * the 'struct file' in nameidata_to_filp().
 		 */
-		error = mnt_want_write(nd->path.mnt);
-		if (error)
-			goto exit_mutex_unlock;
-		want_write = 1;
+		if (!want_write) {
+			error = mnt_want_write(nd->path.mnt);
+			if (error)
+				goto exit_mutex_unlock;
+			want_write = 1;
+		}
 		/* Don't check for write permission, don't truncate */
 		open_flag &= ~O_TRUNC;
 		will_truncate = 0;
@@ -2228,6 +2441,16 @@ static struct file *do_last(struct nameidata *nd, struct path *path,
 	mutex_unlock(&dir->d_inode->i_mutex);
 	audit_inode(pathname, path->dentry);
 
+	/*
+	 * If atomic_open() acquired write access it is dropped now due to
+	 * possible mount and symlink following (this might be optimized away if
+	 * necessary...)
+	 */
+	if (want_write) {
+		mnt_drop_write(nd->path.mnt);
+		want_write = 0;
+	}
+
 	error = -EEXIST;
 	if (open_flag & O_EXCL)
 		goto exit_dput;
@@ -2283,6 +2506,7 @@ common:
 	if (error)
 		goto exit;
 	filp = nameidata_to_filp(nd);
+opened:
 	if (!IS_ERR(filp)) {
 		error = ima_file_check(filp, op->acc_mode);
 		if (error) {
diff --git a/fs/open.c b/fs/open.c
index 6acfd2d..cd73de1 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -800,6 +800,47 @@ out_err:
 EXPORT_SYMBOL_GPL(lookup_instantiate_filp);
 
 /**
+ * finish_open - finish opening a file
+ * @od: opaque open data
+ * @dentry: pointer to dentry
+ * @open: open callback
+ *
+ * This can be used to finish opening a file passed to i_op->atomic_open().
+ *
+ * If the open callback is set to NULL, then the standard f_op->open()
+ * filesystem callback is substituted.
+ */
+struct file *finish_open(struct opendata *od, struct dentry *dentry,
+			 int (*open)(struct inode *, struct file *))
+{
+	struct file *filp;
+
+	filp = *(od->filp);
+	*(od->filp) = NULL;
+
+	mntget(od->mnt);
+	dget(dentry);
+
+	return do_dentry_open(dentry, od->mnt, filp, open, current_cred());
+}
+EXPORT_SYMBOL(finish_open);
+
+/**
+ * finish_no_open - finish ->atomic_open() without opening the file
+ *
+ * @od: opaque open data
+ * @dentry: dentry or NULL (as returned from ->lookup())
+ *
+ * This can be used to set the result of a successful lookup in ->atomic_open().
+ * The filesystem's atomic_open() method shall return NULL after calling this.
+ */
+void finish_no_open(struct opendata *od, struct dentry *dentry)
+{
+	od->dentry = dentry;
+}
+EXPORT_SYMBOL(finish_no_open);
+
+/**
  * nameidata_to_filp - convert a nameidata to an open filp.
  * @nd: pointer to nameidata
  * @flags: open flags
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 69cd5bb..3d144b4 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -412,6 +412,7 @@ struct kstatfs;
 struct vm_area_struct;
 struct vfsmount;
 struct cred;
+struct opendata;
 
 extern void __init inode_init(void);
 extern void __init inode_init_early(void);
@@ -1653,6 +1654,9 @@ struct inode_operations {
 	void (*truncate_range)(struct inode *, loff_t, loff_t);
 	int (*fiemap)(struct inode *, struct fiemap_extent_info *, u64 start,
 		      u64 len);
+	struct file * (*atomic_open)(struct inode *, struct dentry *,
+				     struct opendata *, unsigned open_flag,
+				     umode_t create_mode, bool *created);
 } ____cacheline_aligned;
 
 struct seq_file;
@@ -2027,6 +2031,9 @@ extern struct file * dentry_open(struct dentry *, struct vfsmount *, int,
 				 const struct cred *);
 extern int filp_close(struct file *, fl_owner_t id);
 extern char * getname(const char __user *);
+extern struct file *finish_open(struct opendata *od, struct dentry *dentry,
+				int (*open)(struct inode *, struct file *));
+extern void finish_no_open(struct opendata *od, struct dentry *dentry);
 
 /* fs/ioctl.c */
 
-- 
1.7.7


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 05/16] nfs: don't open in ->d_revalidate
  2012-03-28 20:24 [PATCH 00/16] vfs: atomic open v2 Miklos Szeredi
                   ` (3 preceding siblings ...)
  2012-03-28 20:24 ` [PATCH 04/16] vfs: add i_op->atomic_open() Miklos Szeredi
@ 2012-03-28 20:24 ` Miklos Szeredi
  2012-03-28 21:45     ` Myklebust, Trond
  2012-03-28 20:24 ` [PATCH 06/16] nfs: implement i_op->atomic_open() Miklos Szeredi
                   ` (10 subsequent siblings)
  15 siblings, 1 reply; 22+ messages in thread
From: Miklos Szeredi @ 2012-03-28 20:24 UTC (permalink / raw)
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

From: Miklos Szeredi <mszeredi@suse.cz>

NFSv4 can't do reliable opens in d_revalidate, since it cannot know whether a
mount needs to be followed or not.  It does check d_mountpoint() on the dentry,
which can result in a weird error if the VFS found that the mount does not in
fact need to be followed, e.g.:

  # mount --bind /mnt/nfs /mnt/nfs-clone
  # echo something > /mnt/nfs/tmp/bar
  # echo x > /tmp/file
  # mount --bind /tmp/file /mnt/nfs-clone/tmp/bar
  # cat  /mnt/nfs/tmp/bar
  cat: /mnt/nfs/tmp/bar: Not a directory

Which should, by any sane filesystem, result in "something" being printed.

So instead do the open in f_op->open() and in the unlikely case that the cached
dentry turned out to be invalid, drop the dentry and return ESTALE to let the
VFS retry.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
CC: Trond Myklebust <Trond.Myklebust@netapp.com>
---
 fs/nfs/dir.c  |   47 ++++----------------------------------
 fs/nfs/file.c |   69 +++++++++++++++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 70 insertions(+), 46 deletions(-)

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index fd9a872..715a8c1 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -1328,10 +1328,10 @@ out:
 }
 
 #ifdef CONFIG_NFS_V4
-static int nfs_open_revalidate(struct dentry *, struct nameidata *);
+static int nfs4_lookup_revalidate(struct dentry *, struct nameidata *);
 
 const struct dentry_operations nfs4_dentry_operations = {
-	.d_revalidate	= nfs_open_revalidate,
+	.d_revalidate	= nfs4_lookup_revalidate,
 	.d_delete	= nfs_dentry_delete,
 	.d_iput		= nfs_dentry_iput,
 	.d_automount	= nfs_d_automount,
@@ -1489,12 +1489,11 @@ no_open:
 	return nfs_lookup(dir, dentry, nd);
 }
 
-static int nfs_open_revalidate(struct dentry *dentry, struct nameidata *nd)
+static int nfs4_lookup_revalidate(struct dentry *dentry, struct nameidata *nd)
 {
 	struct dentry *parent = NULL;
 	struct inode *inode;
 	struct inode *dir;
-	struct nfs_open_context *ctx;
 	int openflags, ret = 0;
 
 	if (nd->flags & LOOKUP_RCU)
@@ -1523,49 +1522,13 @@ static int nfs_open_revalidate(struct dentry *dentry, struct nameidata *nd)
 	/* We cannot do exclusive creation on a positive dentry */
 	if ((openflags & (O_CREAT|O_EXCL)) == (O_CREAT|O_EXCL))
 		goto no_open_dput;
-	/* We can't create new files, or truncate existing ones here */
-	openflags &= ~(O_CREAT|O_EXCL|O_TRUNC);
 
-	ctx = create_nfs_open_context(dentry, openflags);
-	ret = PTR_ERR(ctx);
-	if (IS_ERR(ctx))
-		goto out;
-	/*
-	 * Note: we're not holding inode->i_mutex and so may be racing with
-	 * operations that change the directory. We therefore save the
-	 * change attribute *before* we do the RPC call.
-	 */
-	inode = NFS_PROTO(dir)->open_context(dir, ctx, openflags, NULL);
-	if (IS_ERR(inode)) {
-		ret = PTR_ERR(inode);
-		switch (ret) {
-		case -EPERM:
-		case -EACCES:
-		case -EDQUOT:
-		case -ENOSPC:
-		case -EROFS:
-			goto out_put_ctx;
-		default:
-			goto out_drop;
-		}
-	}
-	iput(inode);
-	if (inode != dentry->d_inode)
-		goto out_drop;
+	/* Let f_op->open() actually open (and revalidate) the file */
+	ret = 1;
 
-	nfs_set_verifier(dentry, nfs_save_change_attribute(dir));
-	ret = nfs_intent_set_file(nd, ctx);
-	if (ret >= 0)
-		ret = 1;
 out:
 	dput(parent);
 	return ret;
-out_drop:
-	d_drop(dentry);
-	ret = 0;
-out_put_ctx:
-	put_nfs_open_context(ctx);
-	goto out;
 
 no_open_dput:
 	dput(parent);
diff --git a/fs/nfs/file.c b/fs/nfs/file.c
index c43a452..4e626ec 100644
--- a/fs/nfs/file.c
+++ b/fs/nfs/file.c
@@ -870,12 +870,73 @@ const struct file_operations nfs_file_operations = {
 static int
 nfs4_file_open(struct inode *inode, struct file *filp)
 {
+	struct nfs_open_context *ctx;
+	struct dentry *dentry = filp->f_path.dentry;
+	struct dentry *parent = NULL;
+	struct inode *dir;
+	unsigned openflags = filp->f_flags;
+	int err;
+
+	BUG_ON(inode != dentry->d_inode);
 	/*
-	 * NFSv4 opens are handled in d_lookup and d_revalidate. If we get to
-	 * this point, then something is very wrong
+	 * If no cached dentry exists or if it's negative, NFSv4 handled the
+	 * opens in ->lookup() or ->create().
+	 *
+	 * We only get this far for a cached positive dentry.  We skipped
+	 * revalidation, so handle it here by dropping the dentry and returning
+	 * -ESTALE.  The VFS will retry the lookup/create/open.
 	 */
-	dprintk("NFS: %s called! inode=%p filp=%p\n", __func__, inode, filp);
-	return -ENOTDIR;
+
+	dprintk("NFS: open file(%s/%s)\n",
+		dentry->d_parent->d_name.name,
+		dentry->d_name.name);
+
+	if ((openflags & O_ACCMODE) == 3)
+		openflags--;
+
+	/* We can't create new files, or truncate existing ones here */
+	openflags &= ~(O_CREAT|O_EXCL|O_TRUNC);
+
+	parent = dget_parent(dentry);
+	dir = parent->d_inode;
+
+	ctx = alloc_nfs_open_context(filp->f_path.dentry, filp->f_mode);
+	err = PTR_ERR(ctx);
+	if (IS_ERR(ctx))
+		goto out;
+
+	inode = NFS_PROTO(dir)->open_context(dir, ctx, openflags, NULL);
+	if (IS_ERR(inode)) {
+		err = PTR_ERR(inode);
+		switch (err) {
+		case -EPERM:
+		case -EACCES:
+		case -EDQUOT:
+		case -ENOSPC:
+		case -EROFS:
+			goto out_put_ctx;
+		default:
+			goto out_drop;
+		}
+	}
+	iput(inode);
+	if (inode != dentry->d_inode)
+		goto out_drop;
+
+	nfs_set_verifier(dentry, nfs_save_change_attribute(dir));
+	nfs_file_set_open_context(filp, ctx);
+	err = 0;
+
+out_put_ctx:
+	put_nfs_open_context(ctx);
+out:
+	dput(parent);
+	return err;
+
+out_drop:
+	d_drop(dentry);
+	err = -ESTALE;
+	goto out_put_ctx;
 }
 
 const struct file_operations nfs4_file_operations = {
-- 
1.7.7


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 06/16] nfs: implement i_op->atomic_open()
  2012-03-28 20:24 [PATCH 00/16] vfs: atomic open v2 Miklos Szeredi
                   ` (4 preceding siblings ...)
  2012-03-28 20:24 ` [PATCH 05/16] nfs: don't open in ->d_revalidate Miklos Szeredi
@ 2012-03-28 20:24 ` Miklos Szeredi
  2012-03-28 20:24 ` [PATCH 07/16] nfs: clean up ->create in nfs_rpc_ops Miklos Szeredi
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: Miklos Szeredi @ 2012-03-28 20:24 UTC (permalink / raw)
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

From: Miklos Szeredi <mszeredi@suse.cz>

Replace NFS4 specific ->lookup implementation with ->atomic_open impelementation
and use the generic nfs_lookup for other lookups.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
 fs/nfs/dir.c |  179 +++++++++++++++++++++++++++++++---------------------------
 1 files changed, 95 insertions(+), 84 deletions(-)

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 715a8c1..15846e6 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -111,11 +111,15 @@ const struct inode_operations nfs3_dir_inode_operations = {
 
 #ifdef CONFIG_NFS_V4
 
-static struct dentry *nfs_atomic_lookup(struct inode *, struct dentry *, struct nameidata *);
-static int nfs_open_create(struct inode *dir, struct dentry *dentry, umode_t mode, struct nameidata *nd);
+static struct file *nfs_atomic_open(struct inode *, struct dentry *,
+				    struct opendata *, unsigned, umode_t,
+				    bool *);
+static int nfs4_create(struct inode *dir, struct dentry *dentry,
+		       umode_t mode, struct nameidata *nd);
 const struct inode_operations nfs4_dir_inode_operations = {
-	.create		= nfs_open_create,
-	.lookup		= nfs_atomic_lookup,
+	.create		= nfs4_create,
+	.lookup		= nfs_lookup,
+	.atomic_open	= nfs_atomic_open,
 	.link		= nfs_link,
 	.unlink		= nfs_unlink,
 	.symlink	= nfs_symlink,
@@ -1377,116 +1381,128 @@ static int do_open(struct inode *inode, struct file *filp)
 	return 0;
 }
 
-static int nfs_intent_set_file(struct nameidata *nd, struct nfs_open_context *ctx)
+static struct file *nfs_finish_open(struct nfs_open_context *ctx,
+				    struct dentry *dentry,
+				    struct opendata *od, unsigned open_flags)
 {
 	struct file *filp;
-	int ret = 0;
+	int err;
+
+	if (ctx->dentry != dentry) {
+		dput(ctx->dentry);
+		ctx->dentry = dget(dentry);
+	}
 
 	/* If the open_intent is for execute, we have an extra check to make */
 	if (ctx->mode & FMODE_EXEC) {
-		ret = nfs_may_open(ctx->dentry->d_inode,
-				ctx->cred,
-				nd->intent.open.flags);
-		if (ret < 0)
+		err = nfs_may_open(dentry->d_inode, ctx->cred, open_flags);
+		if (err < 0) {
+			filp = ERR_PTR(err);
 			goto out;
+		}
 	}
-	filp = lookup_instantiate_filp(nd, ctx->dentry, do_open);
-	if (IS_ERR(filp))
-		ret = PTR_ERR(filp);
-	else
+
+	filp = finish_open(od, dentry, do_open);
+	if (!IS_ERR(filp))
 		nfs_file_set_open_context(filp, ctx);
+
 out:
 	put_nfs_open_context(ctx);
-	return ret;
+	return filp;
 }
 
-static struct dentry *nfs_atomic_lookup(struct inode *dir, struct dentry *dentry, struct nameidata *nd)
+static struct file *nfs_atomic_open(struct inode *dir, struct dentry *dentry,
+				    struct opendata *od, unsigned open_flags,
+				    umode_t mode, bool *created)
 {
 	struct nfs_open_context *ctx;
-	struct iattr attr;
-	struct dentry *res = NULL;
+	struct dentry *res;
+	struct iattr attr = { .ia_valid = 0 };
 	struct inode *inode;
-	int open_flags;
+	struct file *filp;
 	int err;
 
-	dfprintk(VFS, "NFS: atomic_lookup(%s/%ld), %s\n",
+	/* Expect a negative dentry */
+	BUG_ON(dentry->d_inode);
+
+	dfprintk(VFS, "NFS: atomic_open(%s/%ld), %s\n",
 			dir->i_sb->s_id, dir->i_ino, dentry->d_name.name);
 
-	/* Check that we are indeed trying to open this file */
-	if (!is_atomic_open(nd))
+	/* NFS only supports OPEN on regular files */
+	if ((open_flags & O_DIRECTORY)) {
+		err = -ENOENT;
+		if (!d_unhashed(dentry)) {
+			/*
+			 * Hashed negative dentry with O_DIRECTORY: dentry was
+			 * revalidated and is fine, no need to perform lookup
+			 * again
+			 */
+			goto out_err;
+		}
 		goto no_open;
-
-	if (dentry->d_name.len > NFS_SERVER(dir)->namelen) {
-		res = ERR_PTR(-ENAMETOOLONG);
-		goto out;
 	}
 
-	/* Let vfs_create() deal with O_EXCL. Instantiate, but don't hash
-	 * the dentry. */
-	if (nd->flags & LOOKUP_EXCL) {
-		d_instantiate(dentry, NULL);
-		goto out;
-	}
+	err = -ENAMETOOLONG;
+	if (dentry->d_name.len > NFS_SERVER(dir)->namelen)
+		goto out_err;
 
-	open_flags = nd->intent.open.flags;
+	if (open_flags & O_CREAT) {
+		attr.ia_mode = mode & ~current_umask();
+		attr.ia_valid |= ATTR_MODE;
+	}
 
 	ctx = create_nfs_open_context(dentry, open_flags);
-	res = ERR_CAST(ctx);
+	err = PTR_ERR(ctx);
 	if (IS_ERR(ctx))
-		goto out;
-
-	if (nd->flags & LOOKUP_CREATE) {
-		attr.ia_mode = nd->intent.open.create_mode;
-		attr.ia_valid = ATTR_MODE;
-		attr.ia_mode &= ~current_umask();
-	} else {
-		open_flags &= ~(O_EXCL | O_CREAT);
-		attr.ia_valid = 0;
-	}
+		goto out_err;
 
-	/* Open the file on the server */
 	nfs_block_sillyrename(dentry->d_parent);
 	inode = NFS_PROTO(dir)->open_context(dir, ctx, open_flags, &attr);
+	d_drop(dentry);
 	if (IS_ERR(inode)) {
 		nfs_unblock_sillyrename(dentry->d_parent);
 		put_nfs_open_context(ctx);
-		switch (PTR_ERR(inode)) {
-			/* Make a negative dentry */
-			case -ENOENT:
-				d_add(dentry, NULL);
-				res = NULL;
-				goto out;
-			/* This turned out not to be a regular file */
-			case -EISDIR:
-			case -ENOTDIR:
+		err = PTR_ERR(inode);
+		switch (err) {
+		case -ENOENT:
+			d_add(dentry, NULL);
+			break;
+		case -EISDIR:
+		case -ENOTDIR:
+			goto no_open;
+		case -ELOOP:
+			if (!(open_flags & O_NOFOLLOW))
 				goto no_open;
-			case -ELOOP:
-				if (!(nd->intent.open.flags & O_NOFOLLOW))
-					goto no_open;
+			break;
 			/* case -EINVAL: */
-			default:
-				res = ERR_CAST(inode);
-				goto out;
+		default:
+			break;
 		}
+		goto out_err;
 	}
 	res = d_add_unique(dentry, inode);
-	nfs_unblock_sillyrename(dentry->d_parent);
-	if (res != NULL) {
-		dput(ctx->dentry);
-		ctx->dentry = dget(res);
+	if (res != NULL)
 		dentry = res;
-	}
-	err = nfs_intent_set_file(nd, ctx);
-	if (err < 0) {
-		if (res != NULL)
-			dput(res);
-		return ERR_PTR(err);
-	}
-out:
+
+	nfs_unblock_sillyrename(dentry->d_parent);
 	nfs_set_verifier(dentry, nfs_save_change_attribute(dir));
-	return res;
+
+	filp = nfs_finish_open(ctx, dentry, od, open_flags);
+
+	dput(res);
+	return filp;
+
+out_err:
+	return ERR_PTR(err);
+
 no_open:
-	return nfs_lookup(dir, dentry, nd);
+	res = nfs_lookup(dir, dentry, NULL);
+	err = PTR_ERR(res);
+	if (IS_ERR(res))
+		goto out_err;
+
+	finish_no_open(od, res);
+	return NULL;
 }
 
 static int nfs4_lookup_revalidate(struct dentry *dentry, struct nameidata *nd)
@@ -1536,8 +1552,8 @@ no_open:
 	return nfs_lookup_revalidate(dentry, nd);
 }
 
-static int nfs_open_create(struct inode *dir, struct dentry *dentry,
-		umode_t mode, struct nameidata *nd)
+static int nfs4_create(struct inode *dir, struct dentry *dentry,
+		       umode_t mode, struct nameidata *nd)
 {
 	struct nfs_open_context *ctx = NULL;
 	struct iattr attr;
@@ -1561,19 +1577,14 @@ static int nfs_open_create(struct inode *dir, struct dentry *dentry,
 	error = NFS_PROTO(dir)->create(dir, dentry, &attr, open_flags, ctx);
 	if (error != 0)
 		goto out_put_ctx;
-	if (nd) {
-		error = nfs_intent_set_file(nd, ctx);
-		if (error < 0)
-			goto out_err;
-	} else {
-		put_nfs_open_context(ctx);
-	}
+
+	put_nfs_open_context(ctx);
+
 	return 0;
 out_put_ctx:
 	put_nfs_open_context(ctx);
 out_err_drop:
 	d_drop(dentry);
-out_err:
 	return error;
 }
 
-- 
1.7.7


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 07/16] nfs: clean up ->create in nfs_rpc_ops
  2012-03-28 20:24 [PATCH 00/16] vfs: atomic open v2 Miklos Szeredi
                   ` (5 preceding siblings ...)
  2012-03-28 20:24 ` [PATCH 06/16] nfs: implement i_op->atomic_open() Miklos Szeredi
@ 2012-03-28 20:24 ` Miklos Szeredi
  2012-03-28 20:24 ` [PATCH 08/16] nfs: don't use nd->intent.open.flags Miklos Szeredi
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: Miklos Szeredi @ 2012-03-28 20:24 UTC (permalink / raw)
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

From: Miklos Szeredi <mszeredi@suse.cz>

Don't pass nfs_open_context() to ->create().  Only the NFS4 implementation
needed that and only because it wanted to return an open file using open
intents.  That task has been replaced by ->atomic_open so it is not necessary
anymore to pass the context to the create rpc operation.

Despite nfs4_proc_create apparently being okay with a NULL context it Oopses
somewhere down the call chain.  So allocate a context here.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
 fs/nfs/dir.c            |   42 ++----------------------------------------
 fs/nfs/nfs3proc.c       |    2 +-
 fs/nfs/nfs4proc.c       |   37 ++++++++++---------------------------
 fs/nfs/proc.c           |    2 +-
 include/linux/nfs_xdr.h |    2 +-
 5 files changed, 15 insertions(+), 70 deletions(-)

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 15846e6..8828b0d 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -114,10 +114,8 @@ const struct inode_operations nfs3_dir_inode_operations = {
 static struct file *nfs_atomic_open(struct inode *, struct dentry *,
 				    struct opendata *, unsigned, umode_t,
 				    bool *);
-static int nfs4_create(struct inode *dir, struct dentry *dentry,
-		       umode_t mode, struct nameidata *nd);
 const struct inode_operations nfs4_dir_inode_operations = {
-	.create		= nfs4_create,
+	.create		= nfs_create,
 	.lookup		= nfs_lookup,
 	.atomic_open	= nfs_atomic_open,
 	.link		= nfs_link,
@@ -1552,42 +1550,6 @@ no_open:
 	return nfs_lookup_revalidate(dentry, nd);
 }
 
-static int nfs4_create(struct inode *dir, struct dentry *dentry,
-		       umode_t mode, struct nameidata *nd)
-{
-	struct nfs_open_context *ctx = NULL;
-	struct iattr attr;
-	int error;
-	int open_flags = O_CREAT|O_EXCL;
-
-	dfprintk(VFS, "NFS: create(%s/%ld), %s\n",
-			dir->i_sb->s_id, dir->i_ino, dentry->d_name.name);
-
-	attr.ia_mode = mode;
-	attr.ia_valid = ATTR_MODE;
-
-	if (nd)
-		open_flags = nd->intent.open.flags;
-
-	ctx = create_nfs_open_context(dentry, open_flags);
-	error = PTR_ERR(ctx);
-	if (IS_ERR(ctx))
-		goto out_err_drop;
-
-	error = NFS_PROTO(dir)->create(dir, dentry, &attr, open_flags, ctx);
-	if (error != 0)
-		goto out_put_ctx;
-
-	put_nfs_open_context(ctx);
-
-	return 0;
-out_put_ctx:
-	put_nfs_open_context(ctx);
-out_err_drop:
-	d_drop(dentry);
-	return error;
-}
-
 #endif /* CONFIG_NFSV4 */
 
 /*
@@ -1654,7 +1616,7 @@ static int nfs_create(struct inode *dir, struct dentry *dentry,
 	if (nd)
 		open_flags = nd->intent.open.flags;
 
-	error = NFS_PROTO(dir)->create(dir, dentry, &attr, open_flags, NULL);
+	error = NFS_PROTO(dir)->create(dir, dentry, &attr, open_flags);
 	if (error != 0)
 		goto out_err;
 	return 0;
diff --git a/fs/nfs/nfs3proc.c b/fs/nfs/nfs3proc.c
index 9194395..821c8bf 100644
--- a/fs/nfs/nfs3proc.c
+++ b/fs/nfs/nfs3proc.c
@@ -314,7 +314,7 @@ static void nfs3_free_createdata(struct nfs3_createdata *data)
  */
 static int
 nfs3_proc_create(struct inode *dir, struct dentry *dentry, struct iattr *sattr,
-		 int flags, struct nfs_open_context *ctx)
+		 int flags)
 {
 	struct nfs3_createdata *data;
 	umode_t mode = sattr->ia_mode;
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index ec9f6ef..f80c547 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2613,37 +2613,22 @@ static int nfs4_proc_readlink(struct inode *inode, struct page *page,
 }
 
 /*
- * Got race?
- * We will need to arrange for the VFS layer to provide an atomic open.
- * Until then, this create/open method is prone to inefficiency and race
- * conditions due to the lookup, create, and open VFS calls from sys_open()
- * placed on the wire.
- *
- * Given the above sorry state of affairs, I'm simply sending an OPEN.
- * The file will be opened again in the subsequent VFS open call
- * (nfs4_proc_file_open).
- *
- * The open for read will just hang around to be used by any process that
- * opens the file O_RDONLY. This will all be resolved with the VFS changes.
+ * This is just for mknod.  open(O_CREAT) will always do ->open_context().
  */
-
 static int
 nfs4_proc_create(struct inode *dir, struct dentry *dentry, struct iattr *sattr,
-                 int flags, struct nfs_open_context *ctx)
+		 int flags)
 {
-	struct dentry *de = dentry;
+	struct nfs_open_context *ctx;
 	struct nfs4_state *state;
-	struct rpc_cred *cred = NULL;
-	fmode_t fmode = 0;
 	int status = 0;
 
-	if (ctx != NULL) {
-		cred = ctx->cred;
-		de = ctx->dentry;
-		fmode = ctx->mode;
-	}
+	ctx = alloc_nfs_open_context(dentry, FMODE_READ);
+	if (IS_ERR(ctx))
+		return PTR_ERR(ctx);
+
 	sattr->ia_mode &= ~current_umask();
-	state = nfs4_do_open(dir, de, fmode, flags, sattr, cred);
+	state = nfs4_do_open(dir, dentry, ctx->mode, flags, sattr, ctx->cred);
 	d_drop(dentry);
 	if (IS_ERR(state)) {
 		status = PTR_ERR(state);
@@ -2651,11 +2636,9 @@ nfs4_proc_create(struct inode *dir, struct dentry *dentry, struct iattr *sattr,
 	}
 	d_add(dentry, igrab(state->inode));
 	nfs_set_verifier(dentry, nfs_save_change_attribute(dir));
-	if (ctx != NULL)
-		ctx->state = state;
-	else
-		nfs4_close_sync(state, fmode);
+	ctx->state = state;
 out:
+	put_nfs_open_context(ctx);
 	return status;
 }
 
diff --git a/fs/nfs/proc.c b/fs/nfs/proc.c
index 0c672588..60834bc 100644
--- a/fs/nfs/proc.c
+++ b/fs/nfs/proc.c
@@ -259,7 +259,7 @@ static void nfs_free_createdata(const struct nfs_createdata *data)
 
 static int
 nfs_proc_create(struct inode *dir, struct dentry *dentry, struct iattr *sattr,
-		int flags, struct nfs_open_context *ctx)
+		int flags)
 {
 	struct nfs_createdata *data;
 	struct rpc_message msg = {
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index d6ba9a1..c3df045 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -1218,7 +1218,7 @@ struct nfs_rpc_ops {
 	int	(*readlink)(struct inode *, struct page *, unsigned int,
 			    unsigned int);
 	int	(*create)  (struct inode *, struct dentry *,
-			    struct iattr *, int, struct nfs_open_context *);
+			    struct iattr *, int);
 	int	(*remove)  (struct inode *, struct qstr *);
 	void	(*unlink_setup)  (struct rpc_message *, struct inode *dir);
 	int	(*unlink_done) (struct rpc_task *, struct inode *);
-- 
1.7.7


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 08/16] nfs: don't use nd->intent.open.flags
  2012-03-28 20:24 [PATCH 00/16] vfs: atomic open v2 Miklos Szeredi
                   ` (6 preceding siblings ...)
  2012-03-28 20:24 ` [PATCH 07/16] nfs: clean up ->create in nfs_rpc_ops Miklos Szeredi
@ 2012-03-28 20:24 ` Miklos Szeredi
  2012-03-28 20:24 ` [PATCH 09/16] nfs: don't use intents for checking atomic open Miklos Szeredi
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: Miklos Szeredi @ 2012-03-28 20:24 UTC (permalink / raw)
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

From: Miklos Szeredi <mszeredi@suse.cz>

Instead check LOOKUP_EXCL in nd->flags, which is basically what the open intent
flags were used for.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
 fs/nfs/dir.c |    9 ++++-----
 1 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 8828b0d..35fa125 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -1508,7 +1508,7 @@ static int nfs4_lookup_revalidate(struct dentry *dentry, struct nameidata *nd)
 	struct dentry *parent = NULL;
 	struct inode *inode;
 	struct inode *dir;
-	int openflags, ret = 0;
+	int ret = 0;
 
 	if (nd->flags & LOOKUP_RCU)
 		return -ECHILD;
@@ -1532,9 +1532,8 @@ static int nfs4_lookup_revalidate(struct dentry *dentry, struct nameidata *nd)
 	/* NFS only supports OPEN on regular files */
 	if (!S_ISREG(inode->i_mode))
 		goto no_open_dput;
-	openflags = nd->intent.open.flags;
 	/* We cannot do exclusive creation on a positive dentry */
-	if ((openflags & (O_CREAT|O_EXCL)) == (O_CREAT|O_EXCL))
+	if (nd && nd->flags & LOOKUP_EXCL)
 		goto no_open_dput;
 
 	/* Let f_op->open() actually open (and revalidate) the file */
@@ -1613,8 +1612,8 @@ static int nfs_create(struct inode *dir, struct dentry *dentry,
 	attr.ia_mode = mode;
 	attr.ia_valid = ATTR_MODE;
 
-	if (nd)
-		open_flags = nd->intent.open.flags;
+	if (nd && !(nd->flags & LOOKUP_EXCL))
+		open_flags = O_CREAT;
 
 	error = NFS_PROTO(dir)->create(dir, dentry, &attr, open_flags);
 	if (error != 0)
-- 
1.7.7


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 09/16] nfs: don't use intents for checking atomic open
  2012-03-28 20:24 [PATCH 00/16] vfs: atomic open v2 Miklos Szeredi
                   ` (7 preceding siblings ...)
  2012-03-28 20:24 ` [PATCH 08/16] nfs: don't use nd->intent.open.flags Miklos Szeredi
@ 2012-03-28 20:24 ` Miklos Szeredi
  2012-03-28 20:24 ` [PATCH 10/16] fuse: implement i_op->atomic_create() Miklos Szeredi
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: Miklos Szeredi @ 2012-03-28 20:24 UTC (permalink / raw)
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

From: Miklos Szeredi <mszeredi@suse.cz>

is_atomic_open() is now only used by nfs4_lookup_revalidate() to check whether
it's okay to skip normal revalidation.

It does a racy check for mount read-onlyness and falls back to normal
revalidation if the open would fail.  This makes little sense now that this
function isn't used for determining whether to actually open the file or not.

The d_mountpoint() check still makes sense since it is an indication that we
might be following a mount and so open may not revalidate the dentry.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
 fs/nfs/dir.c |   24 ++++--------------------
 1 files changed, 4 insertions(+), 20 deletions(-)

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 35fa125..48260a5 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -1340,24 +1340,6 @@ const struct dentry_operations nfs4_dentry_operations = {
 	.d_release	= nfs_d_release,
 };
 
-/*
- * Use intent information to determine whether we need to substitute
- * the NFSv4-style stateful OPEN for the LOOKUP call
- */
-static int is_atomic_open(struct nameidata *nd)
-{
-	if (nd == NULL || nfs_lookup_check_intent(nd, LOOKUP_OPEN) == 0)
-		return 0;
-	/* NFS does not (yet) have a stateful open for directories */
-	if (nd->flags & LOOKUP_DIRECTORY)
-		return 0;
-	/* Are we trying to write to a read only partition? */
-	if (__mnt_is_readonly(nd->path.mnt) &&
-	    (nd->intent.open.flags & (O_CREAT|O_TRUNC|O_ACCMODE)))
-		return 0;
-	return 1;
-}
-
 static fmode_t flags_to_mode(int flags)
 {
 	fmode_t res = (__force fmode_t)flags & FMODE_EXEC;
@@ -1513,10 +1495,12 @@ static int nfs4_lookup_revalidate(struct dentry *dentry, struct nameidata *nd)
 	if (nd->flags & LOOKUP_RCU)
 		return -ECHILD;
 
-	inode = dentry->d_inode;
-	if (!is_atomic_open(nd) || d_mountpoint(dentry))
+	if (!(nd->flags & LOOKUP_OPEN) || (nd->flags & LOOKUP_DIRECTORY))
+		goto no_open;
+	if (d_mountpoint(dentry))
 		goto no_open;
 
+	inode = dentry->d_inode;
 	parent = dget_parent(dentry);
 	dir = parent->d_inode;
 
-- 
1.7.7


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 10/16] fuse: implement i_op->atomic_create()
  2012-03-28 20:24 [PATCH 00/16] vfs: atomic open v2 Miklos Szeredi
                   ` (8 preceding siblings ...)
  2012-03-28 20:24 ` [PATCH 09/16] nfs: don't use intents for checking atomic open Miklos Szeredi
@ 2012-03-28 20:24 ` Miklos Szeredi
  2012-03-28 20:24 ` [PATCH 11/16] cifs: implement i_op->atomic_open() and i_op->atomic_create() Miklos Szeredi
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: Miklos Szeredi @ 2012-03-28 20:24 UTC (permalink / raw)
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

From: Miklos Szeredi <mszeredi@suse.cz>

Replace fuse's ->create implementation with a ->atomic_create implementation.
No functionality is changed.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
 fs/fuse/dir.c |   97 ++++++++++++++++++++++++++++++++++++++++----------------
 1 files changed, 69 insertions(+), 28 deletions(-)

diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index 2066328..75e5142 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -369,8 +369,9 @@ static struct dentry *fuse_lookup(struct inode *dir, struct dentry *entry,
  * If the filesystem doesn't support this, then fall back to separate
  * 'mknod' + 'open' requests.
  */
-static int fuse_create_open(struct inode *dir, struct dentry *entry,
-			    umode_t mode, struct nameidata *nd)
+static struct file *fuse_create_open(struct inode *dir, struct dentry *entry,
+				     struct opendata *od, unsigned flags,
+				     umode_t mode)
 {
 	int err;
 	struct inode *inode;
@@ -382,17 +383,15 @@ static int fuse_create_open(struct inode *dir, struct dentry *entry,
 	struct fuse_entry_out outentry;
 	struct fuse_file *ff;
 	struct file *file;
-	int flags = nd->intent.open.flags;
-
-	if (fc->no_create)
-		return -ENOSYS;
 
+	err = -EINVAL;
 	if (flags & O_DIRECT)
-		return -EINVAL;
+		goto out_err;
 
 	forget = fuse_alloc_forget();
+	err = -ENOMEM;
 	if (!forget)
-		return -ENOMEM;
+		goto out_err;
 
 	req = fuse_get_req(fc);
 	err = PTR_ERR(req);
@@ -431,11 +430,8 @@ static int fuse_create_open(struct inode *dir, struct dentry *entry,
 	req->out.args[1].value = &outopen;
 	fuse_request_send(fc, req);
 	err = req->out.h.error;
-	if (err) {
-		if (err == -ENOSYS)
-			fc->no_create = 1;
+	if (err)
 		goto out_free_ff;
-	}
 
 	err = -EIO;
 	if (!S_ISREG(outentry.attr.mode) || invalid_nodeid(outentry.nodeid))
@@ -451,28 +447,78 @@ static int fuse_create_open(struct inode *dir, struct dentry *entry,
 		flags &= ~(O_CREAT | O_EXCL | O_TRUNC);
 		fuse_sync_release(ff, flags);
 		fuse_queue_forget(fc, forget, outentry.nodeid, 1);
-		return -ENOMEM;
+		err = -ENOMEM;
+		goto out_err;
 	}
 	kfree(forget);
 	d_instantiate(entry, inode);
 	fuse_change_entry_timeout(entry, &outentry);
 	fuse_invalidate_attr(dir);
-	file = lookup_instantiate_filp(nd, entry, generic_file_open);
+	file = finish_open(od, entry, generic_file_open);
 	if (IS_ERR(file)) {
 		fuse_sync_release(ff, flags);
-		return PTR_ERR(file);
+	} else {
+		file->private_data = fuse_file_get(ff);
+		fuse_finish_open(inode, file);
 	}
-	file->private_data = fuse_file_get(ff);
-	fuse_finish_open(inode, file);
-	return 0;
+	return file;
 
- out_free_ff:
+out_free_ff:
 	fuse_file_free(ff);
- out_put_request:
+out_put_request:
 	fuse_put_request(fc, req);
- out_put_forget_req:
+out_put_forget_req:
 	kfree(forget);
-	return err;
+out_err:
+	return ERR_PTR(err);
+}
+
+static int fuse_mknod(struct inode *, struct dentry *, umode_t, dev_t);
+static struct file *fuse_atomic_open(struct inode *dir, struct dentry *entry,
+				     struct opendata *od, unsigned flags,
+				     umode_t mode, bool *created)
+{
+	int err;
+	struct fuse_conn *fc = get_fuse_conn(dir);
+	struct file *file;
+	struct dentry *res = NULL;
+
+	if (d_unhashed(entry)) {
+		res = fuse_lookup(dir, entry, NULL);
+		if (IS_ERR(res))
+			return ERR_CAST(res);
+
+		if (res)
+			entry = res;
+	}
+
+	if (!(flags & O_CREAT) || entry->d_inode)
+		goto no_open;
+
+	/* Only creates */
+	*created = true;
+
+	if (fc->no_create)
+		goto mknod;
+
+	file = fuse_create_open(dir, entry, od, flags, mode);
+	if (PTR_ERR(file) == -ENOSYS) {
+		fc->no_create = 1;
+		goto mknod;
+	}
+out_dput:
+	dput(res);
+	return file;
+
+mknod:
+	err = fuse_mknod(dir, entry, mode, 0);
+	if (err) {
+		file = ERR_PTR(err);
+		goto out_dput;
+	}
+no_open:
+	finish_no_open(od, res);
+	return NULL;
 }
 
 /*
@@ -576,12 +622,6 @@ static int fuse_mknod(struct inode *dir, struct dentry *entry, umode_t mode,
 static int fuse_create(struct inode *dir, struct dentry *entry, umode_t mode,
 		       struct nameidata *nd)
 {
-	if (nd) {
-		int err = fuse_create_open(dir, entry, mode, nd);
-		if (err != -ENOSYS)
-			return err;
-		/* Fall back on mknod */
-	}
 	return fuse_mknod(dir, entry, mode, 0);
 }
 
@@ -1632,6 +1672,7 @@ static const struct inode_operations fuse_dir_inode_operations = {
 	.link		= fuse_link,
 	.setattr	= fuse_setattr,
 	.create		= fuse_create,
+	.atomic_open	= fuse_atomic_open,
 	.mknod		= fuse_mknod,
 	.permission	= fuse_permission,
 	.getattr	= fuse_getattr,
-- 
1.7.7


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 11/16] cifs: implement i_op->atomic_open() and i_op->atomic_create()
  2012-03-28 20:24 [PATCH 00/16] vfs: atomic open v2 Miklos Szeredi
                   ` (9 preceding siblings ...)
  2012-03-28 20:24 ` [PATCH 10/16] fuse: implement i_op->atomic_create() Miklos Szeredi
@ 2012-03-28 20:24 ` Miklos Szeredi
  2012-03-28 20:24 ` [PATCH 12/16] ceph: remove unused arg from ceph_lookup_open() Miklos Szeredi
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: Miklos Szeredi @ 2012-03-28 20:24 UTC (permalink / raw)
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

From: Miklos Szeredi <mszeredi@suse.cz>

Replace CIFS's ->create operation with ->atomic_open and ->atomic_create.  Also
move the relevant code from ->lookup into the create function.

CIFS currently only does atomic open for O_CREAT, but it wants to do that as
early as possible, without first calling ->lookup, so it uses ->atomic_open,
just like NFS.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
 fs/cifs/cifsfs.c |    1 +
 fs/cifs/cifsfs.h |    3 +
 fs/cifs/dir.c    |  435 ++++++++++++++++++++++++++++++------------------------
 3 files changed, 245 insertions(+), 194 deletions(-)

diff --git a/fs/cifs/cifsfs.c b/fs/cifs/cifsfs.c
index b1fd382..755b9d8 100644
--- a/fs/cifs/cifsfs.c
+++ b/fs/cifs/cifsfs.c
@@ -772,6 +772,7 @@ struct file_system_type cifs_fs_type = {
 };
 const struct inode_operations cifs_dir_inode_ops = {
 	.create = cifs_create,
+	.atomic_open = cifs_atomic_open,
 	.lookup = cifs_lookup,
 	.getattr = cifs_getattr,
 	.unlink = cifs_unlink,
diff --git a/fs/cifs/cifsfs.h b/fs/cifs/cifsfs.h
index fe5ecf1..d33c008 100644
--- a/fs/cifs/cifsfs.h
+++ b/fs/cifs/cifsfs.h
@@ -46,6 +46,9 @@ extern const struct inode_operations cifs_dir_inode_ops;
 extern struct inode *cifs_root_iget(struct super_block *);
 extern int cifs_create(struct inode *, struct dentry *, umode_t,
 		       struct nameidata *);
+extern struct file *cifs_atomic_open(struct inode *, struct dentry *,
+				     struct opendata *, unsigned, umode_t,
+				     bool *);
 extern struct dentry *cifs_lookup(struct inode *, struct dentry *,
 				  struct nameidata *);
 extern int cifs_unlink(struct inode *dir, struct dentry *dentry);
diff --git a/fs/cifs/dir.c b/fs/cifs/dir.c
index bc7e244..d133b0e 100644
--- a/fs/cifs/dir.c
+++ b/fs/cifs/dir.c
@@ -133,108 +133,141 @@ cifs_bp_rename_retry:
 	return full_path;
 }
 
+/*
+ * Don't allow the separator character in a path component.
+ * The VFS will not allow "/", but "\" is allowed by posix.
+ */
+static int
+check_name(struct dentry *direntry)
+{
+	struct cifs_sb_info *cifs_sb = CIFS_SB(direntry->d_sb);
+	int i;
+
+	if (!(cifs_sb->mnt_cifs_flags & CIFS_MOUNT_POSIX_PATHS)) {
+		for (i = 0; i < direntry->d_name.len; i++) {
+			if (direntry->d_name.name[i] == '\\') {
+				cFYI(1, "Invalid file name");
+				return -EINVAL;
+			}
+		}
+	}
+	return 0;
+}
+
+
 /* Inode operations in similar order to how they appear in Linux file fs.h */
 
-int
-cifs_create(struct inode *inode, struct dentry *direntry, umode_t mode,
-		struct nameidata *nd)
+static int cifs_do_create(struct inode *inode, struct dentry *direntry,
+			  int xid, struct tcon_link *tlink, unsigned oflags,
+			  umode_t mode, __u32 *oplock, __u16 *fileHandle,
+			  bool *created)
 {
 	int rc = -ENOENT;
-	int xid;
 	int create_options = CREATE_NOT_DIR;
-	__u32 oplock = 0;
-	int oflags;
-	/*
-	 * BB below access is probably too much for mknod to request
-	 *    but we have to do query and setpathinfo so requesting
-	 *    less could fail (unless we want to request getatr and setatr
-	 *    permissions (only).  At least for POSIX we do not have to
-	 *    request so much.
-	 */
-	int desiredAccess = GENERIC_READ | GENERIC_WRITE;
-	__u16 fileHandle;
-	struct cifs_sb_info *cifs_sb;
-	struct tcon_link *tlink;
-	struct cifs_tcon *tcon;
+	int desiredAccess;
+	struct cifs_sb_info *cifs_sb = CIFS_SB(inode->i_sb);
+	struct cifs_tcon *tcon = tlink_tcon(tlink);
 	char *full_path = NULL;
 	FILE_ALL_INFO *buf = NULL;
 	struct inode *newinode = NULL;
-	int disposition = FILE_OVERWRITE_IF;
-
-	xid = GetXid();
-
-	cifs_sb = CIFS_SB(inode->i_sb);
-	tlink = cifs_sb_tlink(cifs_sb);
-	if (IS_ERR(tlink)) {
-		FreeXid(xid);
-		return PTR_ERR(tlink);
-	}
-	tcon = tlink_tcon(tlink);
+	int disposition;
 
+	*oplock = 0;
 	if (enable_oplocks)
-		oplock = REQ_OPLOCK;
-
-	if (nd)
-		oflags = nd->intent.open.file->f_flags;
-	else
-		oflags = O_RDONLY | O_CREAT;
+		*oplock = REQ_OPLOCK;
 
 	full_path = build_path_from_dentry(direntry);
 	if (full_path == NULL) {
 		rc = -ENOMEM;
-		goto cifs_create_out;
+		goto out;
 	}
 
 	if (tcon->unix_ext && (tcon->ses->capabilities & CAP_UNIX) &&
+	    !tcon->broken_posix_open &&
 	    (CIFS_UNIX_POSIX_PATH_OPS_CAP &
 			le64_to_cpu(tcon->fsUnixInfo.Capability))) {
 		rc = cifs_posix_open(full_path, &newinode,
-			inode->i_sb, mode, oflags, &oplock, &fileHandle, xid);
-		/* EIO could indicate that (posix open) operation is not
-		   supported, despite what server claimed in capability
-		   negotiation.  EREMOTE indicates DFS junction, which is not
-		   handled in posix open */
+			inode->i_sb, mode, oflags, oplock, fileHandle, xid);
+		switch (rc) {
+		case 0:
+			if (newinode == NULL) {
+				/* query inode info */
+				goto cifs_create_get_file_info;
+			}
 
-		if (rc == 0) {
-			if (newinode == NULL) /* query inode info */
+			if (!S_ISREG(newinode->i_mode)) {
+				/*
+				 * The server may allow us to open things like
+				 * FIFOs, but the client isn't set up to deal
+				 * with that. If it's not a regular file, just
+				 * close it and proceed as if it were a normal
+				 * lookup.
+				 */
+				CIFSSMBClose(xid, tcon, *fileHandle);
 				goto cifs_create_get_file_info;
-			else /* success, no need to query */
-				goto cifs_create_set_dentry;
-		} else if ((rc != -EIO) && (rc != -EREMOTE) &&
-			 (rc != -EOPNOTSUPP) && (rc != -EINVAL))
-			goto cifs_create_out;
-		/* else fallthrough to retry, using older open call, this is
-		   case where server does not support this SMB level, and
-		   falsely claims capability (also get here for DFS case
-		   which should be rare for path not covered on files) */
-	}
+			}
+			/* success, no need to query */
+			goto cifs_create_set_dentry;
+
+		case -ENOENT:
+			goto cifs_create_get_file_info;
+
+		case -EIO:
+		case -EINVAL:
+			/*
+			 * EIO could indicate that (posix open) operation is not
+			 * supported, despite what server claimed in capability
+			 * negotiation.
+			 *
+			 * POSIX open in samba versions 3.3.1 and earlier could
+			 * incorrectly fail with invalid parameter.
+			 */
+			tcon->broken_posix_open = true;
+			break;
 
-	if (nd) {
-		/* if the file is going to stay open, then we
-		   need to set the desired access properly */
-		desiredAccess = 0;
-		if (OPEN_FMODE(oflags) & FMODE_READ)
-			desiredAccess |= GENERIC_READ; /* is this too little? */
-		if (OPEN_FMODE(oflags) & FMODE_WRITE)
-			desiredAccess |= GENERIC_WRITE;
+		case -EREMOTE:
+		case -EOPNOTSUPP:
+			/*
+			 * EREMOTE indicates DFS junction, which is not handled
+			 * in posix open.  If either that or op not supported
+			 * returned, follow the normal lookup.
+			 */
+			break;
 
-		if ((oflags & (O_CREAT | O_EXCL)) == (O_CREAT | O_EXCL))
-			disposition = FILE_CREATE;
-		else if ((oflags & (O_CREAT | O_TRUNC)) == (O_CREAT | O_TRUNC))
-			disposition = FILE_OVERWRITE_IF;
-		else if ((oflags & O_CREAT) == O_CREAT)
-			disposition = FILE_OPEN_IF;
-		else
-			cFYI(1, "Create flag not set in create function");
+		default:
+			goto out;
+		}
+		/*
+		 * fallthrough to retry, using older open call, this is case
+		 * where server does not support this SMB level, and falsely
+		 * claims capability (also get here for DFS case which should be
+		 * rare for path not covered on files)
+		 */
 	}
 
+	desiredAccess = 0;
+	if (OPEN_FMODE(oflags) & FMODE_READ)
+		desiredAccess |= GENERIC_READ; /* is this too little? */
+	if (OPEN_FMODE(oflags) & FMODE_WRITE)
+		desiredAccess |= GENERIC_WRITE;
+
+	disposition = FILE_OVERWRITE_IF;
+	if ((oflags & (O_CREAT | O_EXCL)) == (O_CREAT | O_EXCL))
+		disposition = FILE_CREATE;
+	else if ((oflags & (O_CREAT | O_TRUNC)) == (O_CREAT | O_TRUNC))
+		disposition = FILE_OVERWRITE_IF;
+	else if ((oflags & O_CREAT) == O_CREAT)
+		disposition = FILE_OPEN_IF;
+	else
+		cFYI(1, "Create flag not set in create function");
+
 	/* BB add processing to set equivalent of mode - e.g. via CreateX with
 	   ACLs */
 
 	buf = kmalloc(sizeof(FILE_ALL_INFO), GFP_KERNEL);
 	if (buf == NULL) {
 		rc = -ENOMEM;
-		goto cifs_create_out;
+		goto out;
 	}
 
 	/*
@@ -250,7 +283,7 @@ cifs_create(struct inode *inode, struct dentry *direntry, umode_t mode,
 	if (tcon->ses->capabilities & CAP_NT_SMBS)
 		rc = CIFSSMBOpen(xid, tcon, full_path, disposition,
 			 desiredAccess, create_options,
-			 &fileHandle, &oplock, buf, cifs_sb->local_nls,
+			 fileHandle, oplock, buf, cifs_sb->local_nls,
 			 cifs_sb->mnt_cifs_flags & CIFS_MOUNT_MAP_SPECIAL_CHR);
 	else
 		rc = -EIO; /* no NT SMB support fall into legacy open below */
@@ -259,17 +292,17 @@ cifs_create(struct inode *inode, struct dentry *direntry, umode_t mode,
 		/* old server, retry the open legacy style */
 		rc = SMBLegacyOpen(xid, tcon, full_path, disposition,
 			desiredAccess, create_options,
-			&fileHandle, &oplock, buf, cifs_sb->local_nls,
+			fileHandle, oplock, buf, cifs_sb->local_nls,
 			cifs_sb->mnt_cifs_flags & CIFS_MOUNT_MAP_SPECIAL_CHR);
 	}
 	if (rc) {
 		cFYI(1, "cifs_create returned 0x%x", rc);
-		goto cifs_create_out;
+		goto out;
 	}
 
 	/* If Open reported that we actually created a file
 	   then we now have to set the mode if possible */
-	if ((tcon->unix_ext) && (oplock & CIFS_CREATE_ACTION)) {
+	if ((tcon->unix_ext) && (*oplock & CIFS_CREATE_ACTION)) {
 		struct cifs_unix_set_info_args args = {
 				.mode	= mode,
 				.ctime	= NO_CHANGE_64,
@@ -278,6 +311,7 @@ cifs_create(struct inode *inode, struct dentry *direntry, umode_t mode,
 				.device	= 0,
 		};
 
+		*created = true;
 		if (cifs_sb->mnt_cifs_flags & CIFS_MOUNT_SET_UID) {
 			args.uid = (__u64) current_fsuid();
 			if (inode->i_mode & S_ISGID)
@@ -288,7 +322,7 @@ cifs_create(struct inode *inode, struct dentry *direntry, umode_t mode,
 			args.uid = NO_CHANGE_64;
 			args.gid = NO_CHANGE_64;
 		}
-		CIFSSMBUnixSetFileInfo(xid, tcon, &args, fileHandle,
+		CIFSSMBUnixSetFileInfo(xid, tcon, &args, *fileHandle,
 					current->tgid);
 	} else {
 		/* BB implement mode setting via Windows security
@@ -305,11 +339,11 @@ cifs_create_get_file_info:
 					      inode->i_sb, xid);
 	else {
 		rc = cifs_get_inode_info(&newinode, full_path, buf,
-					 inode->i_sb, xid, &fileHandle);
+					 inode->i_sb, xid, fileHandle);
 		if (newinode) {
 			if (cifs_sb->mnt_cifs_flags & CIFS_MOUNT_DYNPERM)
 				newinode->i_mode = mode;
-			if ((oplock & CIFS_CREATE_ACTION) &&
+			if ((*oplock & CIFS_CREATE_ACTION) &&
 			    (cifs_sb->mnt_cifs_flags & CIFS_MOUNT_SET_UID)) {
 				newinode->i_uid = current_fsuid();
 				if (inode->i_mode & S_ISGID)
@@ -321,37 +355,139 @@ cifs_create_get_file_info:
 	}
 
 cifs_create_set_dentry:
-	if (rc == 0)
-		d_instantiate(direntry, newinode);
-	else
+	if (rc != 0) {
 		cFYI(1, "Create worked, get_inode_info failed rc = %d", rc);
+		goto out;
+	}
+	d_drop(direntry);
+	d_add(direntry, newinode);
 
-	if (newinode && nd) {
-		struct cifsFileInfo *pfile_info;
-		struct file *filp;
+	/* ENOENT for create?  How weird... */
+	rc = -ENOENT;
+	if (!newinode) {
+		CIFSSMBClose(xid, tcon, *fileHandle);
+		goto out;
+	}
+	rc = 0;
 
-		filp = lookup_instantiate_filp(nd, direntry, generic_file_open);
-		if (IS_ERR(filp)) {
-			rc = PTR_ERR(filp);
-			CIFSSMBClose(xid, tcon, fileHandle);
-			goto cifs_create_out;
-		}
+out:
+	kfree(buf);
+	kfree(full_path);
+	return rc;
+}
 
-		pfile_info = cifs_new_fileinfo(fileHandle, filp, tlink, oplock);
-		if (pfile_info == NULL) {
-			fput(filp);
-			CIFSSMBClose(xid, tcon, fileHandle);
-			rc = -ENOMEM;
-		}
-	} else {
+struct file *
+cifs_atomic_open(struct inode *inode, struct dentry *direntry,
+		 struct opendata *od, unsigned oflags, umode_t mode,
+		 bool *created)
+{
+	int rc;
+	int xid;
+	struct tcon_link *tlink;
+	struct cifs_tcon *tcon;
+	__u16 fileHandle;
+	__u32 oplock;
+	struct file *filp;
+	struct cifsFileInfo *pfile_info;
+
+	/* Posix open is only called (at lookup time) for file create now.  For
+	 * opens (rather than creates), because we do not know if it is a file
+	 * or directory yet, and current Samba no longer allows us to do posix
+	 * open on dirs, we could end up wasting an open call on what turns out
+	 * to be a dir. For file opens, we wait to call posix open till
+	 * cifs_open.  It could be added to atomic_open in the future but the
+	 * performance tradeoff of the extra network request when EISDIR or
+	 * EACCES is returned would have to be weighed against the 50% reduction
+	 * in network traffic in the other paths.
+	 */
+	if (!(oflags & O_CREAT)) {
+		struct dentry *res = cifs_lookup(inode, direntry, NULL);
+		if (IS_ERR(res))
+			return ERR_CAST(res);
+
+		finish_no_open(od, res);
+		return NULL;
+	}
+
+	rc = check_name(direntry);
+	if (rc)
+		return ERR_PTR(rc);
+
+	xid = GetXid();
+
+	cFYI(1, "parent inode = 0x%p name is: %s and dentry = 0x%p",
+	     inode, direntry->d_name.name, direntry);
+
+	tlink = cifs_sb_tlink(CIFS_SB(inode->i_sb));
+	filp = ERR_CAST(tlink);
+	if (IS_ERR(tlink))
+		goto free_xid;
+
+	tcon = tlink_tcon(tlink);
+
+	rc = cifs_do_create(inode, direntry, xid, tlink, oflags, mode,
+			    &oplock, &fileHandle, created);
+
+	if (rc) {
+		filp = ERR_PTR(rc);
+		goto out;
+	}
+
+	filp = finish_open(od, direntry, generic_file_open);
+	if (IS_ERR(filp)) {
 		CIFSSMBClose(xid, tcon, fileHandle);
+		goto out;
 	}
 
-cifs_create_out:
-	kfree(buf);
-	kfree(full_path);
+	pfile_info = cifs_new_fileinfo(fileHandle, filp, tlink, oplock);
+	if (pfile_info == NULL) {
+		CIFSSMBClose(xid, tcon, fileHandle);
+		fput(filp);
+		filp = ERR_PTR(-ENOMEM);
+	}
+
+out:
+	cifs_put_tlink(tlink);
+free_xid:
+	FreeXid(xid);
+	return filp;
+}
+
+int cifs_create(struct inode *inode, struct dentry *direntry, umode_t mode,
+		struct nameidata *nd)
+{
+	int rc;
+	int xid = GetXid();
+	/*
+	 * BB below access is probably too much for mknod to request
+	 *    but we have to do query and setpathinfo so requesting
+	 *    less could fail (unless we want to request getatr and setatr
+	 *    permissions (only).  At least for POSIX we do not have to
+	 *    request so much.
+	 */
+	unsigned oflags = O_EXCL | O_CREAT | O_RDWR;
+	struct tcon_link *tlink;
+	__u16 fileHandle;
+	__u32 oplock;
+	bool created = true;
+
+	cFYI(1, "cifs_create parent inode = 0x%p name is: %s and dentry = 0x%p",
+	     inode, direntry->d_name.name, direntry);
+
+	tlink = cifs_sb_tlink(CIFS_SB(inode->i_sb));
+	rc = PTR_ERR(tlink);
+	if (IS_ERR(tlink))
+		goto free_xid;
+
+	rc = cifs_do_create(inode, direntry, xid, tlink, oflags, mode,
+			    &oplock, &fileHandle, &created);
+	if (!rc)
+		CIFSSMBClose(xid, tlink_tcon(tlink), fileHandle);
+
 	cifs_put_tlink(tlink);
+free_xid:
 	FreeXid(xid);
+
 	return rc;
 }
 
@@ -492,16 +628,11 @@ cifs_lookup(struct inode *parent_dir_inode, struct dentry *direntry,
 {
 	int xid;
 	int rc = 0; /* to get around spurious gcc warning, set to zero here */
-	__u32 oplock = enable_oplocks ? REQ_OPLOCK : 0;
-	__u16 fileHandle = 0;
-	bool posix_open = false;
 	struct cifs_sb_info *cifs_sb;
 	struct tcon_link *tlink;
 	struct cifs_tcon *pTcon;
-	struct cifsFileInfo *cfile;
 	struct inode *newInode = NULL;
 	char *full_path = NULL;
-	struct file *filp;
 
 	xid = GetXid();
 
@@ -518,29 +649,9 @@ cifs_lookup(struct inode *parent_dir_inode, struct dentry *direntry,
 	}
 	pTcon = tlink_tcon(tlink);
 
-	/*
-	 * Don't allow the separator character in a path component.
-	 * The VFS will not allow "/", but "\" is allowed by posix.
-	 */
-	if (!(cifs_sb->mnt_cifs_flags & CIFS_MOUNT_POSIX_PATHS)) {
-		int i;
-		for (i = 0; i < direntry->d_name.len; i++)
-			if (direntry->d_name.name[i] == '\\') {
-				cFYI(1, "Invalid file name");
-				rc = -EINVAL;
-				goto lookup_out;
-			}
-	}
-
-	/*
-	 * O_EXCL: optimize away the lookup, but don't hash the dentry. Let
-	 * the VFS handle the create.
-	 */
-	if (nd && (nd->flags & LOOKUP_EXCL)) {
-		d_instantiate(direntry, NULL);
-		rc = 0;
+	rc = check_name(direntry);
+	if (rc)
 		goto lookup_out;
-	}
 
 	/* can not grab the rename sem here since it would
 	deadlock in the cases (beginning of sys_rename itself)
@@ -558,80 +669,16 @@ cifs_lookup(struct inode *parent_dir_inode, struct dentry *direntry,
 	}
 	cFYI(1, "Full path: %s inode = 0x%p", full_path, direntry->d_inode);
 
-	/* Posix open is only called (at lookup time) for file create now.
-	 * For opens (rather than creates), because we do not know if it
-	 * is a file or directory yet, and current Samba no longer allows
-	 * us to do posix open on dirs, we could end up wasting an open call
-	 * on what turns out to be a dir. For file opens, we wait to call posix
-	 * open till cifs_open.  It could be added here (lookup) in the future
-	 * but the performance tradeoff of the extra network request when EISDIR
-	 * or EACCES is returned would have to be weighed against the 50%
-	 * reduction in network traffic in the other paths.
-	 */
 	if (pTcon->unix_ext) {
-		if (nd && !(nd->flags & LOOKUP_DIRECTORY) &&
-		     (nd->flags & LOOKUP_OPEN) && !pTcon->broken_posix_open &&
-		     (nd->intent.open.file->f_flags & O_CREAT)) {
-			rc = cifs_posix_open(full_path, &newInode,
-					parent_dir_inode->i_sb,
-					nd->intent.open.create_mode,
-					nd->intent.open.file->f_flags, &oplock,
-					&fileHandle, xid);
-			/*
-			 * The check below works around a bug in POSIX
-			 * open in samba versions 3.3.1 and earlier where
-			 * open could incorrectly fail with invalid parameter.
-			 * If either that or op not supported returned, follow
-			 * the normal lookup.
-			 */
-			switch (rc) {
-			case 0:
-				/*
-				 * The server may allow us to open things like
-				 * FIFOs, but the client isn't set up to deal
-				 * with that. If it's not a regular file, just
-				 * close it and proceed as if it were a normal
-				 * lookup.
-				 */
-				if (newInode && !S_ISREG(newInode->i_mode)) {
-					CIFSSMBClose(xid, pTcon, fileHandle);
-					break;
-				}
-			case -ENOENT:
-				posix_open = true;
-			case -EOPNOTSUPP:
-				break;
-			default:
-				pTcon->broken_posix_open = true;
-			}
-		}
-		if (!posix_open)
-			rc = cifs_get_inode_info_unix(&newInode, full_path,
-						parent_dir_inode->i_sb, xid);
-	} else
+		rc = cifs_get_inode_info_unix(&newInode, full_path,
+					      parent_dir_inode->i_sb, xid);
+	} else {
 		rc = cifs_get_inode_info(&newInode, full_path, NULL,
 				parent_dir_inode->i_sb, xid, NULL);
+	}
 
 	if ((rc == 0) && (newInode != NULL)) {
 		d_add(direntry, newInode);
-		if (posix_open) {
-			filp = lookup_instantiate_filp(nd, direntry,
-						       generic_file_open);
-			if (IS_ERR(filp)) {
-				rc = PTR_ERR(filp);
-				CIFSSMBClose(xid, pTcon, fileHandle);
-				goto lookup_out;
-			}
-
-			cfile = cifs_new_fileinfo(fileHandle, filp, tlink,
-						  oplock);
-			if (cfile == NULL) {
-				fput(filp);
-				CIFSSMBClose(xid, pTcon, fileHandle);
-				rc = -ENOMEM;
-				goto lookup_out;
-			}
-		}
 		/* since paths are not looked up by component - the parent
 		   directories are presumed to be good here */
 		renew_parental_timestamps(direntry);
-- 
1.7.7


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 12/16] ceph: remove unused arg from ceph_lookup_open()
  2012-03-28 20:24 [PATCH 00/16] vfs: atomic open v2 Miklos Szeredi
                   ` (10 preceding siblings ...)
  2012-03-28 20:24 ` [PATCH 11/16] cifs: implement i_op->atomic_open() and i_op->atomic_create() Miklos Szeredi
@ 2012-03-28 20:24 ` Miklos Szeredi
  2012-03-28 20:24 ` [PATCH 13/16] ceph: implement i_op->atomic_open() and i_op->atomic_create() Miklos Szeredi
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 22+ messages in thread
From: Miklos Szeredi @ 2012-03-28 20:24 UTC (permalink / raw)
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

From: Miklos Szeredi <mszeredi@suse.cz>

What was the purpose of this?

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
 fs/ceph/dir.c   |    4 ++--
 fs/ceph/file.c  |    3 +--
 fs/ceph/super.h |    3 +--
 3 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
index 3e8094b..c4b7832 100644
--- a/fs/ceph/dir.c
+++ b/fs/ceph/dir.c
@@ -599,7 +599,7 @@ static struct dentry *ceph_lookup(struct inode *dir, struct dentry *dentry,
 	    (nd->flags & LOOKUP_OPEN) &&
 	    !(nd->intent.open.flags & O_CREAT)) {
 		int mode = nd->intent.open.create_mode & ~current->fs->umask;
-		return ceph_lookup_open(dir, dentry, nd, mode, 1);
+		return ceph_lookup_open(dir, dentry, nd, mode);
 	}
 
 	/* can we conclude ENOENT locally? */
@@ -710,7 +710,7 @@ static int ceph_create(struct inode *dir, struct dentry *dentry, umode_t mode,
 
 	if (nd) {
 		BUG_ON((nd->flags & LOOKUP_OPEN) == 0);
-		dentry = ceph_lookup_open(dir, dentry, nd, mode, 0);
+		dentry = ceph_lookup_open(dir, dentry, nd, mode);
 		/* hrm, what should i do here if we get aliased? */
 		if (IS_ERR(dentry))
 			return PTR_ERR(dentry);
diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index ed72428..2fe9a3e 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -220,8 +220,7 @@ out:
  *  path_lookup_create -> LOOKUP_OPEN|LOOKUP_CREATE
  */
 struct dentry *ceph_lookup_open(struct inode *dir, struct dentry *dentry,
-				struct nameidata *nd, int mode,
-				int locked_dir)
+				struct nameidata *nd, int mode)
 {
 	struct ceph_fs_client *fsc = ceph_sb_to_client(dir->i_sb);
 	struct ceph_mds_client *mdsc = fsc->mdsc;
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index 1421f3d..c6b2cba 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -805,8 +805,7 @@ extern int ceph_copy_from_page_vector(struct page **pages,
 extern struct page **ceph_alloc_page_vector(int num_pages, gfp_t flags);
 extern int ceph_open(struct inode *inode, struct file *file);
 extern struct dentry *ceph_lookup_open(struct inode *dir, struct dentry *dentry,
-				       struct nameidata *nd, int mode,
-				       int locked_dir);
+				       struct nameidata *nd, int mode);
 extern int ceph_release(struct inode *inode, struct file *filp);
 
 /* dir.c */
-- 
1.7.7


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 13/16] ceph: implement i_op->atomic_open() and i_op->atomic_create()
  2012-03-28 20:24 [PATCH 00/16] vfs: atomic open v2 Miklos Szeredi
                   ` (11 preceding siblings ...)
  2012-03-28 20:24 ` [PATCH 12/16] ceph: remove unused arg from ceph_lookup_open() Miklos Szeredi
@ 2012-03-28 20:24 ` Miklos Szeredi
  2012-03-29  5:23   ` Sage Weil
  2012-03-28 20:24 ` [PATCH 14/16] 9p: implement i_op->atomic_create() Miklos Szeredi
                   ` (2 subsequent siblings)
  15 siblings, 1 reply; 22+ messages in thread
From: Miklos Szeredi @ 2012-03-28 20:24 UTC (permalink / raw)
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

From: Miklos Szeredi <mszeredi@suse.cz>

Instead of calling ceph_lookup_open() from ->lookup and ->create, call it from
->atomic_open and ->atomic_create.

CEPH does non-create open in ->atomic_open and create-open in ->atomic_create.
To prevent unnecessary call to ->atomic_open the FS_NO_LOOKUP_CREATE flag is set
in the filesystem flags to only call ->atomic_open on non-create opens.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
 fs/ceph/dir.c   |   68 ++++++++++++++++++++++++++++++++++--------------------
 fs/ceph/file.c  |   21 ++++++++---------
 fs/ceph/super.h |    5 ++-
 3 files changed, 56 insertions(+), 38 deletions(-)

diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
index c4b7832..75df600 100644
--- a/fs/ceph/dir.c
+++ b/fs/ceph/dir.c
@@ -594,14 +594,6 @@ static struct dentry *ceph_lookup(struct inode *dir, struct dentry *dentry,
 	if (err < 0)
 		return ERR_PTR(err);
 
-	/* open (but not create!) intent? */
-	if (nd &&
-	    (nd->flags & LOOKUP_OPEN) &&
-	    !(nd->intent.open.flags & O_CREAT)) {
-		int mode = nd->intent.open.create_mode & ~current->fs->umask;
-		return ceph_lookup_open(dir, dentry, nd, mode);
-	}
-
 	/* can we conclude ENOENT locally? */
 	if (dentry->d_inode == NULL) {
 		struct ceph_inode_info *ci = ceph_inode(dir);
@@ -642,6 +634,47 @@ static struct dentry *ceph_lookup(struct inode *dir, struct dentry *dentry,
 	return dentry;
 }
 
+struct file *ceph_atomic_open(struct inode *dir, struct dentry *dentry,
+			      struct opendata *od, unsigned flags, umode_t mode,
+			      bool *created)
+{
+	int err;
+	struct dentry *res = NULL;
+	struct file *filp;
+
+	if (!(flags & O_CREAT)) {
+		if (dentry->d_name.len > NAME_MAX)
+			return ERR_PTR(-ENAMETOOLONG);
+
+		err = ceph_init_dentry(dentry);
+		if (err < 0)
+			return ERR_PTR(err);
+
+		return ceph_lookup_open(dir, dentry, od, flags, mode);
+	}
+
+	if (d_unhashed(dentry)) {
+		res = ceph_lookup(dir, dentry, NULL);
+		if (IS_ERR(res))
+			return ERR_CAST(res);
+
+		if (res)
+			dentry = res;
+	}
+
+	/* We don't deal with positive dentries here */
+	if (dentry->d_inode) {
+		finish_no_open(od, res);
+		return NULL;
+	}
+
+	*created = true;
+	filp = ceph_lookup_open(dir, dentry, od, flags, mode);
+	dput(res);
+
+	return filp;
+}
+
 /*
  * If we do a create but get no trace back from the MDS, follow up with
  * a lookup (the VFS expects us to link up the provided dentry).
@@ -702,23 +735,7 @@ static int ceph_mknod(struct inode *dir, struct dentry *dentry,
 static int ceph_create(struct inode *dir, struct dentry *dentry, umode_t mode,
 		       struct nameidata *nd)
 {
-	dout("create in dir %p dentry %p name '%.*s'\n",
-	     dir, dentry, dentry->d_name.len, dentry->d_name.name);
-
-	if (ceph_snap(dir) != CEPH_NOSNAP)
-		return -EROFS;
-
-	if (nd) {
-		BUG_ON((nd->flags & LOOKUP_OPEN) == 0);
-		dentry = ceph_lookup_open(dir, dentry, nd, mode);
-		/* hrm, what should i do here if we get aliased? */
-		if (IS_ERR(dentry))
-			return PTR_ERR(dentry);
-		return 0;
-	}
-
-	/* fall back to mknod */
-	return ceph_mknod(dir, dentry, (mode & ~S_IFMT) | S_IFREG, 0);
+	return ceph_mknod(dir, dentry, mode, 0);
 }
 
 static int ceph_symlink(struct inode *dir, struct dentry *dentry,
@@ -1357,6 +1374,7 @@ const struct inode_operations ceph_dir_iops = {
 	.rmdir = ceph_unlink,
 	.rename = ceph_rename,
 	.create = ceph_create,
+	.atomic_open = ceph_atomic_open,
 };
 
 const struct dentry_operations ceph_dentry_ops = {
diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index 2fe9a3e..bf35158 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -214,21 +214,15 @@ out:
  * may_open() fails, the struct *file gets cleaned up (i.e.
  * ceph_release gets called).  So fear not!
  */
-/*
- * flags
- *  path_lookup_open   -> LOOKUP_OPEN
- *  path_lookup_create -> LOOKUP_OPEN|LOOKUP_CREATE
- */
-struct dentry *ceph_lookup_open(struct inode *dir, struct dentry *dentry,
-				struct nameidata *nd, int mode)
+struct file *ceph_lookup_open(struct inode *dir, struct dentry *dentry,
+			      struct opendata *od, unsigned flags, umode_t mode)
 {
 	struct ceph_fs_client *fsc = ceph_sb_to_client(dir->i_sb);
 	struct ceph_mds_client *mdsc = fsc->mdsc;
-	struct file *file;
+	struct file *file = NULL;
 	struct ceph_mds_request *req;
 	struct dentry *ret;
 	int err;
-	int flags = nd->intent.open.flags;
 
 	dout("ceph_lookup_open dentry %p '%.*s' flags %d mode 0%o\n",
 	     dentry, dentry->d_name.len, dentry->d_name.name, flags, mode);
@@ -254,14 +248,19 @@ struct dentry *ceph_lookup_open(struct inode *dir, struct dentry *dentry,
 		err = ceph_handle_notrace_create(dir, dentry);
 	if (err)
 		goto out;
-	file = lookup_instantiate_filp(nd, req->r_dentry, ceph_open);
+	file = finish_open(od, req->r_dentry, ceph_open);
 	if (IS_ERR(file))
 		err = PTR_ERR(file);
 out:
 	ret = ceph_finish_lookup(req, dentry, err);
 	ceph_mdsc_put_request(req);
 	dout("ceph_lookup_open result=%p\n", ret);
-	return ret;
+
+	if (IS_ERR(ret))
+		return ERR_CAST(ret);
+
+	dput(ret);
+	return err ? ERR_PTR(err) : file;
 }
 
 int ceph_release(struct inode *inode, struct file *file)
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index c6b2cba..4e83114 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -804,8 +804,9 @@ extern int ceph_copy_from_page_vector(struct page **pages,
 				    loff_t off, size_t len);
 extern struct page **ceph_alloc_page_vector(int num_pages, gfp_t flags);
 extern int ceph_open(struct inode *inode, struct file *file);
-extern struct dentry *ceph_lookup_open(struct inode *dir, struct dentry *dentry,
-				       struct nameidata *nd, int mode);
+extern struct file *ceph_lookup_open(struct inode *dir, struct dentry *dentry,
+				     struct opendata *od, unsigned flags,
+				     umode_t mode);
 extern int ceph_release(struct inode *inode, struct file *filp);
 
 /* dir.c */
-- 
1.7.7


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 14/16] 9p: implement i_op->atomic_create()
  2012-03-28 20:24 [PATCH 00/16] vfs: atomic open v2 Miklos Szeredi
                   ` (12 preceding siblings ...)
  2012-03-28 20:24 ` [PATCH 13/16] ceph: implement i_op->atomic_open() and i_op->atomic_create() Miklos Szeredi
@ 2012-03-28 20:24 ` Miklos Szeredi
  2012-03-28 20:24 ` [PATCH 15/16] vfs: remove open intents from nameidata Miklos Szeredi
  2012-03-28 20:24 ` [PATCH 16/16] vfs: only retry last component if opening stale dentry Miklos Szeredi
  15 siblings, 0 replies; 22+ messages in thread
From: Miklos Szeredi @ 2012-03-28 20:24 UTC (permalink / raw)
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

From: Miklos Szeredi <mszeredi@suse.cz>

Replace 9p's ->create implementations with ->atomic_create implementations.  No
functionality is changed.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
 fs/9p/vfs_inode.c      |  169 +++++++++++++++++++++++++++++-------------------
 fs/9p/vfs_inode_dotl.c |   52 ++++++++++-----
 2 files changed, 137 insertions(+), 84 deletions(-)

diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
index 014c8dd..203892d 100644
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -712,11 +712,14 @@ error:
 }
 
 /**
- * v9fs_vfs_create - VFS hook to create files
+ * v9fs_vfs_create - VFS hook to create a regular file
+ *
+ * open(.., O_CREAT) is handled in v9fs_vfs_atomic_open().  This is only called
+ * for mknod(2).
+ *
  * @dir: directory inode that is being created
  * @dentry:  dentry that is being deleted
  * @mode: create permissions
- * @nd: path information
  *
  */
 
@@ -724,76 +727,19 @@ static int
 v9fs_vfs_create(struct inode *dir, struct dentry *dentry, umode_t mode,
 		struct nameidata *nd)
 {
-	int err;
-	u32 perm;
-	int flags;
-	struct file *filp;
-	struct v9fs_inode *v9inode;
-	struct v9fs_session_info *v9ses;
-	struct p9_fid *fid, *inode_fid;
-
-	err = 0;
-	fid = NULL;
-	v9ses = v9fs_inode2v9ses(dir);
-	perm = unixmode2p9mode(v9ses, mode);
-	if (nd)
-		flags = nd->intent.open.flags;
-	else
-		flags = O_RDWR;
+	struct v9fs_session_info *v9ses = v9fs_inode2v9ses(dir);
+	u32 perm = unixmode2p9mode(v9ses, mode);
+	struct p9_fid *fid;
 
-	fid = v9fs_create(v9ses, dir, dentry, NULL, perm,
-				v9fs_uflags2omode(flags,
-						v9fs_proto_dotu(v9ses)));
-	if (IS_ERR(fid)) {
-		err = PTR_ERR(fid);
-		fid = NULL;
-		goto error;
-	}
+	/* P9_OEXCL? */
+	fid = v9fs_create(v9ses, dir, dentry, NULL, perm, P9_ORDWR);
+	if (IS_ERR(fid))
+		return PTR_ERR(fid);
 
 	v9fs_invalidate_inode_attr(dir);
-	/* if we are opening a file, assign the open fid to the file */
-	if (nd) {
-		v9inode = V9FS_I(dentry->d_inode);
-		mutex_lock(&v9inode->v_mutex);
-		if (v9ses->cache && !v9inode->writeback_fid &&
-		    ((flags & O_ACCMODE) != O_RDONLY)) {
-			/*
-			 * clone a fid and add it to writeback_fid
-			 * we do it during open time instead of
-			 * page dirty time via write_begin/page_mkwrite
-			 * because we want write after unlink usecase
-			 * to work.
-			 */
-			inode_fid = v9fs_writeback_fid(dentry);
-			if (IS_ERR(inode_fid)) {
-				err = PTR_ERR(inode_fid);
-				mutex_unlock(&v9inode->v_mutex);
-				goto error;
-			}
-			v9inode->writeback_fid = (void *) inode_fid;
-		}
-		mutex_unlock(&v9inode->v_mutex);
-		filp = lookup_instantiate_filp(nd, dentry, generic_file_open);
-		if (IS_ERR(filp)) {
-			err = PTR_ERR(filp);
-			goto error;
-		}
-
-		filp->private_data = fid;
-#ifdef CONFIG_9P_FSCACHE
-		if (v9ses->cache)
-			v9fs_cache_inode_set_cookie(dentry->d_inode, filp);
-#endif
-	} else
-		p9_client_clunk(fid);
+	p9_client_clunk(fid);
 
 	return 0;
-
-error:
-	if (fid)
-		p9_client_clunk(fid);
-
-	return err;
 }
 
 /**
@@ -910,6 +856,93 @@ error:
 	return ERR_PTR(result);
 }
 
+static struct file *
+v9fs_vfs_atomic_open(struct inode *dir, struct dentry *dentry,
+		     struct opendata *od, unsigned flags, umode_t mode,
+		     bool *created)
+{
+	int err;
+	u32 perm;
+	struct file *filp;
+	struct v9fs_inode *v9inode;
+	struct v9fs_session_info *v9ses;
+	struct p9_fid *fid, *inode_fid;
+	struct dentry *res = NULL;
+
+	if (d_unhashed(dentry)) {
+		res = v9fs_vfs_lookup(dir, dentry, NULL);
+		if (IS_ERR(res))
+			return ERR_CAST(res);
+
+		if (res)
+			dentry = res;
+	}
+
+	/* Only creates */
+	if (!(flags & O_CREAT) || dentry->d_inode) {
+		finish_no_open(od, res);
+		return NULL;
+	}
+
+	err = 0;
+	fid = NULL;
+	v9ses = v9fs_inode2v9ses(dir);
+	perm = unixmode2p9mode(v9ses, mode);
+	fid = v9fs_create(v9ses, dir, dentry, NULL, perm,
+				v9fs_uflags2omode(flags,
+						v9fs_proto_dotu(v9ses)));
+	if (IS_ERR(fid)) {
+		err = PTR_ERR(fid);
+		fid = NULL;
+		goto error;
+	}
+
+	v9fs_invalidate_inode_attr(dir);
+	v9inode = V9FS_I(dentry->d_inode);
+	mutex_lock(&v9inode->v_mutex);
+	if (v9ses->cache && !v9inode->writeback_fid &&
+	    ((flags & O_ACCMODE) != O_RDONLY)) {
+		/*
+		 * clone a fid and add it to writeback_fid
+		 * we do it during open time instead of
+		 * page dirty time via write_begin/page_mkwrite
+		 * because we want write after unlink usecase
+		 * to work.
+		 */
+		inode_fid = v9fs_writeback_fid(dentry);
+		if (IS_ERR(inode_fid)) {
+			err = PTR_ERR(inode_fid);
+			mutex_unlock(&v9inode->v_mutex);
+			goto error;
+		}
+		v9inode->writeback_fid = (void *) inode_fid;
+	}
+	mutex_unlock(&v9inode->v_mutex);
+	filp = finish_open(od, dentry, generic_file_open);
+	if (IS_ERR(filp)) {
+		err = PTR_ERR(filp);
+		goto error;
+	}
+
+	filp->private_data = fid;
+#ifdef CONFIG_9P_FSCACHE
+	if (v9ses->cache)
+		v9fs_cache_inode_set_cookie(dentry->d_inode, filp);
+#endif
+
+	*created = true;
+out:
+	dput(res);
+	return filp;
+
+error:
+	if (fid)
+		p9_client_clunk(fid);
+
+	filp = ERR_PTR(err);
+	goto out;
+}
+
 /**
  * v9fs_vfs_unlink - VFS unlink hook to delete an inode
  * @i:  inode that is being unlinked
@@ -1488,6 +1521,7 @@ out:
 static const struct inode_operations v9fs_dir_inode_operations_dotu = {
 	.create = v9fs_vfs_create,
 	.lookup = v9fs_vfs_lookup,
+	.atomic_open = v9fs_vfs_atomic_open,
 	.symlink = v9fs_vfs_symlink,
 	.link = v9fs_vfs_link,
 	.unlink = v9fs_vfs_unlink,
@@ -1502,6 +1536,7 @@ static const struct inode_operations v9fs_dir_inode_operations_dotu = {
 static const struct inode_operations v9fs_dir_inode_operations = {
 	.create = v9fs_vfs_create,
 	.lookup = v9fs_vfs_lookup,
+	.atomic_open = v9fs_vfs_atomic_open,
 	.unlink = v9fs_vfs_unlink,
 	.mkdir = v9fs_vfs_mkdir,
 	.rmdir = v9fs_vfs_rmdir,
diff --git a/fs/9p/vfs_inode_dotl.c b/fs/9p/vfs_inode_dotl.c
index a1e6c99..7a1b48e 100644
--- a/fs/9p/vfs_inode_dotl.c
+++ b/fs/9p/vfs_inode_dotl.c
@@ -248,7 +248,6 @@ int v9fs_open_to_dotl_flags(int flags)
  * @dir: directory inode that is being created
  * @dentry:  dentry that is being deleted
  * @mode: create permissions
- * @nd: path information
  *
  */
 
@@ -256,9 +255,16 @@ static int
 v9fs_vfs_create_dotl(struct inode *dir, struct dentry *dentry, umode_t omode,
 		struct nameidata *nd)
 {
+	return v9fs_vfs_mknod_dotl(dir, dentry, omode, 0);
+}
+
+static struct file *
+v9fs_vfs_atomic_open_dotl(struct inode *dir, struct dentry *dentry,
+			  struct opendata *od, unsigned flags, umode_t omode,
+			  bool *created)
+{
 	int err = 0;
 	gid_t gid;
-	int flags;
 	umode_t mode;
 	char *name = NULL;
 	struct file *filp;
@@ -269,19 +275,25 @@ v9fs_vfs_create_dotl(struct inode *dir, struct dentry *dentry, umode_t omode,
 	struct p9_fid *dfid, *ofid, *inode_fid;
 	struct v9fs_session_info *v9ses;
 	struct posix_acl *pacl = NULL, *dacl = NULL;
+	struct dentry *res = NULL;
 
-	v9ses = v9fs_inode2v9ses(dir);
-	if (nd)
-		flags = nd->intent.open.flags;
-	else {
-		/*
-		 * create call without LOOKUP_OPEN is due
-		 * to mknod of regular files. So use mknod
-		 * operation.
-		 */
-		return v9fs_vfs_mknod_dotl(dir, dentry, omode, 0);
+	if (d_unhashed(dentry)) {
+		res = v9fs_vfs_lookup(dir, dentry, NULL);
+		if (IS_ERR(res))
+			return ERR_CAST(res);
+
+		if (res)
+			dentry = res;
+	}
+
+	/* Only creates */
+	if (!(flags & O_CREAT) || dentry->d_inode) {
+		finish_no_open(od, res);
+		return NULL;
 	}
 
+	v9ses = v9fs_inode2v9ses(dir);
+
 	name = (char *) dentry->d_name.name;
 	p9_debug(P9_DEBUG_VFS, "name:%s flags:0x%x mode:0x%hx\n",
 		 name, flags, omode);
@@ -290,7 +302,7 @@ v9fs_vfs_create_dotl(struct inode *dir, struct dentry *dentry, umode_t omode,
 	if (IS_ERR(dfid)) {
 		err = PTR_ERR(dfid);
 		p9_debug(P9_DEBUG_VFS, "fid lookup failed %d\n", err);
-		return err;
+		goto err_return;
 	}
 
 	/* clone a fid to use for creation */
@@ -298,7 +310,7 @@ v9fs_vfs_create_dotl(struct inode *dir, struct dentry *dentry, umode_t omode,
 	if (IS_ERR(ofid)) {
 		err = PTR_ERR(ofid);
 		p9_debug(P9_DEBUG_VFS, "p9_client_walk failed %d\n", err);
-		return err;
+		goto err_return;
 	}
 
 	gid = v9fs_get_fsgid_for_create(dir);
@@ -363,7 +375,7 @@ v9fs_vfs_create_dotl(struct inode *dir, struct dentry *dentry, umode_t omode,
 	}
 	mutex_unlock(&v9inode->v_mutex);
 	/* Since we are opening a file, assign the open fid to the file */
-	filp = lookup_instantiate_filp(nd, dentry, generic_file_open);
+	filp = finish_open(od, dentry, generic_file_open);
 	if (IS_ERR(filp)) {
 		err = PTR_ERR(filp);
 		goto err_clunk_old_fid;
@@ -373,7 +385,10 @@ v9fs_vfs_create_dotl(struct inode *dir, struct dentry *dentry, umode_t omode,
 	if (v9ses->cache)
 		v9fs_cache_inode_set_cookie(inode, filp);
 #endif
-	return 0;
+	*created = true;
+out:
+	dput(res);
+	return filp;
 
 error:
 	if (fid)
@@ -382,7 +397,9 @@ err_clunk_old_fid:
 	if (ofid)
 		p9_client_clunk(ofid);
 	v9fs_set_create_acl(NULL, &dacl, &pacl);
-	return err;
+err_return:
+	filp = ERR_PTR(err);
+	goto out;
 }
 
 /**
@@ -1000,6 +1017,7 @@ out:
 
 const struct inode_operations v9fs_dir_inode_operations_dotl = {
 	.create = v9fs_vfs_create_dotl,
+	.atomic_open = v9fs_vfs_atomic_open_dotl,
 	.lookup = v9fs_vfs_lookup,
 	.link = v9fs_vfs_link_dotl,
 	.symlink = v9fs_vfs_symlink_dotl,
-- 
1.7.7


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 15/16] vfs: remove open intents from nameidata
  2012-03-28 20:24 [PATCH 00/16] vfs: atomic open v2 Miklos Szeredi
                   ` (13 preceding siblings ...)
  2012-03-28 20:24 ` [PATCH 14/16] 9p: implement i_op->atomic_create() Miklos Szeredi
@ 2012-03-28 20:24 ` Miklos Szeredi
  2012-03-28 20:24 ` [PATCH 16/16] vfs: only retry last component if opening stale dentry Miklos Szeredi
  15 siblings, 0 replies; 22+ messages in thread
From: Miklos Szeredi @ 2012-03-28 20:24 UTC (permalink / raw)
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

From: Miklos Szeredi <mszeredi@suse.cz>

All users of open intents have been converted to use ->atomic_{open,create}.

This patch gets rid of nd->intent.open and related infrastructure.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
 fs/internal.h         |    5 +--
 fs/namei.c            |  114 ++++++++++++++++++++++---------------------------
 fs/open.c             |   97 +++++------------------------------------
 include/linux/namei.h |   11 -----
 4 files changed, 64 insertions(+), 163 deletions(-)

diff --git a/fs/internal.h b/fs/internal.h
index 438fbd3..f99e89a 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -84,13 +84,10 @@ extern struct super_block *user_get_super(dev_t);
 /*
  * open.c
  */
-struct nameidata;
-extern struct file *nameidata_to_filp(struct nameidata *);
-extern void release_open_intent(struct nameidata *);
 struct opendata {
 	struct dentry *dentry;
 	struct vfsmount *mnt;
-	struct file **filp;
+	struct file *filp;
 };
 struct open_flags {
 	int open_flag;
diff --git a/fs/namei.c b/fs/namei.c
index f6dfd7c..66b26da 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -476,22 +476,6 @@ err_root:
 	return -ECHILD;
 }
 
-/**
- * release_open_intent - free up open intent resources
- * @nd: pointer to nameidata
- */
-void release_open_intent(struct nameidata *nd)
-{
-	struct file *file = nd->intent.open.file;
-
-	if (file && !IS_ERR(file)) {
-		if (file->f_path.dentry == NULL)
-			put_filp(file);
-		else
-			fput(file);
-	}
-}
-
 static inline int d_revalidate(struct dentry *dentry, struct nameidata *nd)
 {
 	return dentry->d_op->d_revalidate(dentry, nd);
@@ -2109,7 +2093,8 @@ static int may_o_create(struct path *dir, struct dentry *dentry, umode_t mode)
 }
 
 static struct file *atomic_open(struct nameidata *nd, struct dentry *dentry,
-				struct path *path, const struct open_flags *op,
+				struct path *path, struct opendata *od,
+				const struct open_flags *op,
 				int *want_write, bool need_lookup)
 {
 	struct inode *dir =  nd->path.dentry->d_inode;
@@ -2118,7 +2103,6 @@ static struct file *atomic_open(struct nameidata *nd, struct dentry *dentry,
 	int error;
 	bool created = false;
 	int acc_mode;
-	struct opendata od;
 	struct file *filp;
 	int create_error = 0;
 	struct dentry *const DENTRY_NOT_SET = (void *) -1UL;
@@ -2184,14 +2168,13 @@ static struct file *atomic_open(struct nameidata *nd, struct dentry *dentry,
 	if (nd->flags & LOOKUP_DIRECTORY)
 		open_flag |= O_DIRECTORY;
 
-	od.dentry = DENTRY_NOT_SET;
-	od.mnt = nd->path.mnt;
-	od.filp = &nd->intent.open.file;
-	filp = dir->i_op->atomic_open(dir, dentry, &od, open_flag, mode,
+	od->dentry = DENTRY_NOT_SET;
+	od->mnt = nd->path.mnt;
+	filp = dir->i_op->atomic_open(dir, dentry, od, open_flag, mode,
 				      &created);
 	if (IS_ERR(filp)) {
-		if (WARN_ON(od.dentry != DENTRY_NOT_SET))
-			dput(od.dentry);
+		if (WARN_ON(od->dentry != DENTRY_NOT_SET))
+			dput(od->dentry);
 
 		if (create_error && PTR_ERR(filp) == -ENOENT)
 			filp = ERR_PTR(create_error);
@@ -2205,13 +2188,13 @@ static struct file *atomic_open(struct nameidata *nd, struct dentry *dentry,
 	}
 
 	if (!filp) {
-		if (WARN_ON(od.dentry == DENTRY_NOT_SET)) {
+		if (WARN_ON(od->dentry == DENTRY_NOT_SET)) {
 			filp = ERR_PTR(-EIO);
 			goto out;
 		}
-		if (od.dentry) {
+		if (od->dentry) {
 			dput(dentry);
-			dentry = od.dentry;
+			dentry = od->dentry;
 		}
 		goto looked_up;
 	}
@@ -2271,6 +2254,7 @@ looked_up:
  * was performed, only lookup.
  */
 static struct file *lookup_open(struct nameidata *nd, struct path *path,
+				struct opendata *od,
 				const struct open_flags *op, int *want_write)
 {
 	struct dentry *dir = nd->path.dentry;
@@ -2287,7 +2271,7 @@ static struct file *lookup_open(struct nameidata *nd, struct path *path,
 		goto out_no_open;
 
 	if ((nd->flags & LOOKUP_OPEN) && dir_inode->i_op->atomic_open) {
-		return atomic_open(nd, dentry, path, op, want_write,
+		return atomic_open(nd, dentry, path, od, op, want_write,
 				   need_lookup);
 	}
 
@@ -2309,7 +2293,8 @@ out_no_open:
  * Handle the last step of open()
  */
 static struct file *do_last(struct nameidata *nd, struct path *path,
-			    const struct open_flags *op, const char *pathname)
+			    struct opendata *od, const struct open_flags *op,
+			    const char *pathname)
 {
 	struct dentry *dir = nd->path.dentry;
 	struct dentry *dentry;
@@ -2385,7 +2370,7 @@ static struct file *do_last(struct nameidata *nd, struct path *path,
 
 	mutex_lock(&dir->d_inode->i_mutex);
 
-	filp = lookup_open(nd, path, op, &want_write);
+	filp = lookup_open(nd, path, od, op, &want_write);
 	if (filp) {
 		mutex_unlock(&dir->d_inode->i_mutex);
 		if (IS_ERR(filp))
@@ -2411,7 +2396,7 @@ static struct file *do_last(struct nameidata *nd, struct path *path,
 		 * rw->ro transition does not occur between
 		 * the time when the file is created and when
 		 * a permanent write count is taken through
-		 * the 'struct file' in nameidata_to_filp().
+		 * the 'struct file' in finish_open().
 		 */
 		if (!want_write) {
 			error = mnt_want_write(nd->path.mnt);
@@ -2505,23 +2490,22 @@ common:
 	error = may_open(&nd->path, acc_mode, open_flag);
 	if (error)
 		goto exit;
-	filp = nameidata_to_filp(nd);
+	od->mnt = nd->path.mnt;
+	filp = finish_open(od, nd->path.dentry, NULL);
+	if (IS_ERR(filp))
+		goto out;
+	error = open_check_o_direct(filp);
+	if (error)
+		goto exit_fput;
 opened:
-	if (!IS_ERR(filp)) {
-		error = ima_file_check(filp, op->acc_mode);
-		if (error) {
-			fput(filp);
-			filp = ERR_PTR(error);
-		}
-	}
-	if (!IS_ERR(filp)) {
-		if (will_truncate) {
-			error = handle_truncate(filp);
-			if (error) {
-				fput(filp);
-				filp = ERR_PTR(error);
-			}
-		}
+	error = ima_file_check(filp, op->acc_mode);
+	if (error)
+		goto exit_fput;
+
+	if (will_truncate) {
+		error = handle_truncate(filp);
+		if (error)
+			goto exit_fput;
 	}
 out:
 	if (want_write)
@@ -2537,6 +2521,10 @@ exit:
 	filp = ERR_PTR(error);
 	goto out;
 
+exit_fput:
+	fput(filp);
+	goto exit;
+
 terminate:
 	terminate_walk(nd);
 	return ERR_PTR(error);
@@ -2546,18 +2534,16 @@ static struct file *path_openat(int dfd, const char *pathname,
 		struct nameidata *nd, const struct open_flags *op, int flags)
 {
 	struct file *base = NULL;
-	struct file *filp;
+	struct opendata od;
+	struct file *res;
 	struct path path;
 	int error;
 
-	filp = get_empty_filp();
-	if (!filp)
+	od.filp = get_empty_filp();
+	if (!od.filp)
 		return ERR_PTR(-ENFILE);
 
-	filp->f_flags = op->open_flag;
-	nd->intent.open.file = filp;
-	nd->intent.open.flags = open_to_namei_flags(op->open_flag);
-	nd->intent.open.create_mode = op->mode;
+	od.filp->f_flags = op->open_flag;
 
 	error = path_init(dfd, pathname, flags | LOOKUP_PARENT, nd, &base);
 	if (unlikely(error))
@@ -2568,23 +2554,23 @@ static struct file *path_openat(int dfd, const char *pathname,
 	if (unlikely(error))
 		goto out_filp;
 
-	filp = do_last(nd, &path, op, pathname);
-	while (unlikely(!filp)) { /* trailing symlink */
+	res = do_last(nd, &path, &od, op, pathname);
+	while (unlikely(!res)) { /* trailing symlink */
 		struct path link = path;
 		void *cookie;
 		if (!(nd->flags & LOOKUP_FOLLOW)) {
 			path_put_conditional(&path, nd);
 			path_put(&nd->path);
-			filp = ERR_PTR(-ELOOP);
+			res = ERR_PTR(-ELOOP);
 			break;
 		}
 		nd->flags |= LOOKUP_PARENT;
 		nd->flags &= ~(LOOKUP_OPEN|LOOKUP_CREATE|LOOKUP_EXCL);
 		error = follow_link(&link, nd, &cookie);
 		if (unlikely(error))
-			filp = ERR_PTR(error);
+			res = ERR_PTR(error);
 		else
-			filp = do_last(nd, &path, op, pathname);
+			res = do_last(nd, &path, &od, op, pathname);
 		put_link(nd, &link, cookie);
 	}
 out:
@@ -2592,11 +2578,14 @@ out:
 		path_put(&nd->root);
 	if (base)
 		fput(base);
-	release_open_intent(nd);
-	return filp;
+	if (od.filp) {
+		BUG_ON(od.filp->f_path.dentry);
+		put_filp(od.filp);
+	}
+	return res;
 
 out_filp:
-	filp = ERR_PTR(error);
+	res = ERR_PTR(error);
 	goto out;
 }
 
@@ -2652,7 +2641,6 @@ struct dentry *kern_path_create(int dfd, const char *pathname, struct path *path
 		goto out;
 	nd.flags &= ~LOOKUP_PARENT;
 	nd.flags |= LOOKUP_CREATE | LOOKUP_EXCL;
-	nd.intent.open.flags = O_EXCL;
 
 	/*
 	 * Do the final lookup.
diff --git a/fs/open.c b/fs/open.c
index cd73de1..a18e6bc 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -743,62 +743,6 @@ cleanup_file:
 	return ERR_PTR(error);
 }
 
-static struct file *__dentry_open(struct dentry *dentry, struct vfsmount *mnt,
-				struct file *f,
-				int (*open)(struct inode *, struct file *),
-				const struct cred *cred)
-{
-	struct file *res = do_dentry_open(dentry, mnt, f, open, cred);
-	if (!IS_ERR(res)) {
-		int error = open_check_o_direct(f);
-		if (error) {
-			fput(res);
-			res = ERR_PTR(error);
-		}
-	}
-	return res;
-}
-
-/**
- * lookup_instantiate_filp - instantiates the open intent filp
- * @nd: pointer to nameidata
- * @dentry: pointer to dentry
- * @open: open callback
- *
- * Helper for filesystems that want to use lookup open intents and pass back
- * a fully instantiated struct file to the caller.
- * This function is meant to be called from within a filesystem's
- * lookup method.
- * Beware of calling it for non-regular files! Those ->open methods might block
- * (e.g. in fifo_open), leaving you with parent locked (and in case of fifo,
- * leading to a deadlock, as nobody can open that fifo anymore, because
- * another process to open fifo will block on locked parent when doing lookup).
- * Note that in case of error, nd->intent.open.file is destroyed, but the
- * path information remains valid.
- * If the open callback is set to NULL, then the standard f_op->open()
- * filesystem callback is substituted.
- */
-struct file *lookup_instantiate_filp(struct nameidata *nd, struct dentry *dentry,
-		int (*open)(struct inode *, struct file *))
-{
-	const struct cred *cred = current_cred();
-
-	if (IS_ERR(nd->intent.open.file))
-		goto out;
-	if (IS_ERR(dentry))
-		goto out_err;
-	nd->intent.open.file = __dentry_open(dget(dentry), mntget(nd->path.mnt),
-					     nd->intent.open.file,
-					     open, cred);
-out:
-	return nd->intent.open.file;
-out_err:
-	release_open_intent(nd);
-	nd->intent.open.file = ERR_CAST(dentry);
-	goto out;
-}
-EXPORT_SYMBOL_GPL(lookup_instantiate_filp);
-
 /**
  * finish_open - finish opening a file
  * @od: opaque open data
@@ -815,8 +759,8 @@ struct file *finish_open(struct opendata *od, struct dentry *dentry,
 {
 	struct file *filp;
 
-	filp = *(od->filp);
-	*(od->filp) = NULL;
+	filp = od->filp;
+	od->filp = NULL;
 
 	mntget(od->mnt);
 	dget(dentry);
@@ -840,31 +784,6 @@ void finish_no_open(struct opendata *od, struct dentry *dentry)
 }
 EXPORT_SYMBOL(finish_no_open);
 
-/**
- * nameidata_to_filp - convert a nameidata to an open filp.
- * @nd: pointer to nameidata
- * @flags: open flags
- *
- * Note that this function destroys the original nameidata
- */
-struct file *nameidata_to_filp(struct nameidata *nd)
-{
-	const struct cred *cred = current_cred();
-	struct file *filp;
-
-	/* Pick up the filp from the open intent */
-	filp = nd->intent.open.file;
-	nd->intent.open.file = NULL;
-
-	/* Has the filesystem initialised the file for us? */
-	if (filp->f_path.dentry == NULL) {
-		path_get(&nd->path);
-		filp = __dentry_open(nd->path.dentry, nd->path.mnt, filp,
-				     NULL, cred);
-	}
-	return filp;
-}
-
 /*
  * dentry_open() will have done dput(dentry) and mntput(mnt) if it returns an
  * error.
@@ -873,7 +792,7 @@ struct file *dentry_open(struct dentry *dentry, struct vfsmount *mnt, int flags,
 			 const struct cred *cred)
 {
 	int error;
-	struct file *f;
+	struct file *f, *res;
 
 	validate_creds(cred);
 
@@ -889,7 +808,15 @@ struct file *dentry_open(struct dentry *dentry, struct vfsmount *mnt, int flags,
 	}
 
 	f->f_flags = flags;
-	return __dentry_open(dentry, mnt, f, NULL, cred);
+	res = do_dentry_open(dentry, mnt, f, NULL, cred);
+	if (!IS_ERR(res)) {
+		int error = open_check_o_direct(f);
+		if (error) {
+			fput(res);
+			res = ERR_PTR(error);
+		}
+	}
+	return res;
 }
 EXPORT_SYMBOL(dentry_open);
 
diff --git a/include/linux/namei.h b/include/linux/namei.h
index ffc0213..54dadda 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -7,12 +7,6 @@
 
 struct vfsmount;
 
-struct open_intent {
-	int	flags;
-	int	create_mode;
-	struct file *file;
-};
-
 enum { MAX_NESTED_LINKS = 8 };
 
 struct nameidata {
@@ -25,11 +19,6 @@ struct nameidata {
 	int		last_type;
 	unsigned	depth;
 	char *saved_names[MAX_NESTED_LINKS + 1];
-
-	/* Intent data */
-	union {
-		struct open_intent open;
-	} intent;
 };
 
 /*
-- 
1.7.7


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH 16/16] vfs: only retry last component if opening stale dentry
  2012-03-28 20:24 [PATCH 00/16] vfs: atomic open v2 Miklos Szeredi
                   ` (14 preceding siblings ...)
  2012-03-28 20:24 ` [PATCH 15/16] vfs: remove open intents from nameidata Miklos Szeredi
@ 2012-03-28 20:24 ` Miklos Szeredi
  15 siblings, 0 replies; 22+ messages in thread
From: Miklos Szeredi @ 2012-03-28 20:24 UTC (permalink / raw)
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

From: Miklos Szeredi <mszeredi@suse.cz>

NFS optimizes away d_revalidates for last component of open.  This means that
open itself can find the dentry stale.  It returns ESTALE resulting in the
complete path being looked up again with LOOKUP_REVAL.

This is unnecessary, however, since it would be enough to retry the last
component only.  Introduce EOPENSTALE (a kernel private errno) and allow NFS to
retry opening only the last component.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
 fs/namei.c            |   34 ++++++++++++++++++++++++++++++----
 fs/nfs/file.c         |    2 +-
 fs/open.c             |   16 +++++++++-------
 include/linux/errno.h |    1 +
 4 files changed, 41 insertions(+), 12 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 66b26da..c2f7951 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2133,8 +2133,8 @@ static struct file *atomic_open(struct nameidata *nd, struct dentry *dentry,
 	 * Another problem is returing the "right" error value (e.g. for an
 	 * O_EXCL open we want to return EEXIST not EROFS).
 	 */
-	if ((open_flag & (O_CREAT | O_TRUNC)) ||
-	    (open_flag & O_ACCMODE) != O_RDONLY) {
+	if (!*want_write && ((open_flag & (O_CREAT | O_TRUNC)) ||
+			     (open_flag & O_ACCMODE) != O_RDONLY)) {
 		error = mnt_want_write(nd->path.mnt);
 		if (!error) {
 			*want_write = 1;
@@ -2305,6 +2305,8 @@ static struct file *do_last(struct nameidata *nd, struct path *path,
 	struct file *filp;
 	struct inode *inode;
 	int symlink_ok = 0;
+	struct path save_parent = { .dentry = NULL, .mnt = NULL };
+	bool retried = false;
 	int error;
 
 	nd->flags &= ~LOOKUP_PARENT;
@@ -2368,6 +2370,7 @@ static struct file *do_last(struct nameidata *nd, struct path *path,
 			goto exit;
 	}
 
+retry_lookup:
 	mutex_lock(&dir->d_inode->i_mutex);
 
 	filp = lookup_open(nd, path, od, op, &want_write);
@@ -2462,12 +2465,21 @@ finish_lookup:
 		return NULL;
 	}
 
-	path_to_nameidata(path, nd);
+	if ((nd->flags & LOOKUP_RCU) || nd->path.mnt != path->mnt) {
+		path_to_nameidata(path, nd);
+	} else {
+		save_parent.dentry = nd->path.dentry;
+		save_parent.mnt = mntget(path->mnt);
+		nd->path.dentry = path->dentry;
+
+	}
 	nd->inode = inode;
 
 	error = complete_walk(nd);
-	if (error)
+	if (error) {
+		path_put(&save_parent);
 		return ERR_PTR(error);
+	}
 	error = -EISDIR;
 	if ((open_flag & O_CREAT) && S_ISDIR(inode->i_mode))
 		goto exit;
@@ -2492,6 +2504,19 @@ common:
 		goto exit;
 	od->mnt = nd->path.mnt;
 	filp = finish_open(od, nd->path.dentry, NULL);
+	if (IS_ERR(filp) && PTR_ERR(filp) == -EOPENSTALE) {
+		error = -ESTALE;
+		if (!save_parent.dentry || retried)
+			goto exit;
+		BUG_ON(save_parent.dentry != dir);
+		path_put(&nd->path);
+		nd->path = save_parent;
+		nd->inode = dir->d_inode;
+		save_parent.mnt = NULL;
+		save_parent.dentry = NULL;
+		retried = true;
+		goto retry_lookup;
+	}
 	if (IS_ERR(filp))
 		goto out;
 	error = open_check_o_direct(filp);
@@ -2510,6 +2535,7 @@ opened:
 out:
 	if (want_write)
 		mnt_drop_write(nd->path.mnt);
+	path_put(&save_parent);
 	path_put(&nd->path);
 	return filp;
 
diff --git a/fs/nfs/file.c b/fs/nfs/file.c
index 4e626ec..bb1f5cb 100644
--- a/fs/nfs/file.c
+++ b/fs/nfs/file.c
@@ -935,7 +935,7 @@ out:
 
 out_drop:
 	d_drop(dentry);
-	err = -ESTALE;
+	err = -EOPENSTALE;
 	goto out_put_ctx;
 }
 
diff --git a/fs/open.c b/fs/open.c
index a18e6bc..e7298b5 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -737,7 +737,6 @@ cleanup_all:
 	f->f_path.dentry = NULL;
 	f->f_path.mnt = NULL;
 cleanup_file:
-	put_filp(f);
 	dput(dentry);
 	mntput(mnt);
 	return ERR_PTR(error);
@@ -757,15 +756,16 @@ cleanup_file:
 struct file *finish_open(struct opendata *od, struct dentry *dentry,
 			 int (*open)(struct inode *, struct file *))
 {
-	struct file *filp;
-
-	filp = od->filp;
-	od->filp = NULL;
+	struct file *res;
 
 	mntget(od->mnt);
 	dget(dentry);
 
-	return do_dentry_open(dentry, od->mnt, filp, open, current_cred());
+	res = do_dentry_open(dentry, od->mnt, od->filp, open, current_cred());
+	if (!IS_ERR(res))
+		od->filp = NULL;
+
+	return res;
 }
 EXPORT_SYMBOL(finish_open);
 
@@ -809,7 +809,9 @@ struct file *dentry_open(struct dentry *dentry, struct vfsmount *mnt, int flags,
 
 	f->f_flags = flags;
 	res = do_dentry_open(dentry, mnt, f, NULL, cred);
-	if (!IS_ERR(res)) {
+	if (IS_ERR(res)) {
+		put_filp(f);
+	} else {
 		int error = open_check_o_direct(f);
 		if (error) {
 			fput(res);
diff --git a/include/linux/errno.h b/include/linux/errno.h
index 4668583..b1c33a0 100644
--- a/include/linux/errno.h
+++ b/include/linux/errno.h
@@ -16,6 +16,7 @@
 #define ERESTARTNOHAND	514	/* restart if no handler.. */
 #define ENOIOCTLCMD	515	/* No ioctl command */
 #define ERESTART_RESTARTBLOCK 516 /* restart by calling sys_restart_syscall */
+#define EOPENSTALE	517	/* open found a stale dentry */
 
 /* Defined for the NFSv3 protocol */
 #define EBADHANDLE	521	/* Illegal NFS file handle */
-- 
1.7.7


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 05/16] nfs: don't open in ->d_revalidate
  2012-03-28 20:24 ` [PATCH 05/16] nfs: don't open in ->d_revalidate Miklos Szeredi
@ 2012-03-28 21:45     ` Myklebust, Trond
  0 siblings, 0 replies; 22+ messages in thread
From: Myklebust, Trond @ 2012-03-28 21:45 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: viro, linux-fsdevel, linux-kernel, hch, sfrench, sage, ericvh, mszeredi

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 1371 bytes --]

On Wed, 2012-03-28 at 22:24 +0200, Miklos Szeredi wrote:
> From: Miklos Szeredi <mszeredi@suse.cz>
> 
> NFSv4 can't do reliable opens in d_revalidate, since it cannot know whether a
> mount needs to be followed or not.  It does check d_mountpoint() on the dentry,
> which can result in a weird error if the VFS found that the mount does not in
> fact need to be followed, e.g.:
> 
>   # mount --bind /mnt/nfs /mnt/nfs-clone
>   # echo something > /mnt/nfs/tmp/bar
>   # echo x > /tmp/file
>   # mount --bind /tmp/file /mnt/nfs-clone/tmp/bar
>   # cat  /mnt/nfs/tmp/bar
>   cat: /mnt/nfs/tmp/bar: Not a directory
> 
> Which should, by any sane filesystem, result in "something" being printed.
> 
> So instead do the open in f_op->open() and in the unlikely case that the cached
> dentry turned out to be invalid, drop the dentry and return ESTALE to let the
> VFS retry.


Just one comment. Would it now make sense for NFSv4 to just skip
->d_revalidate() if LOOKUP_OPEN is set, and LOOKUP_EXCL is not set? We
will in any case be doing a revalidation in nfs4_file_open.

Otherwise, the rest all looks good to me.

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com

ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 05/16] nfs: don't open in ->d_revalidate
@ 2012-03-28 21:45     ` Myklebust, Trond
  0 siblings, 0 replies; 22+ messages in thread
From: Myklebust, Trond @ 2012-03-28 21:45 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: viro, linux-fsdevel, linux-kernel, hch, sfrench, sage, ericvh, mszeredi

On Wed, 2012-03-28 at 22:24 +0200, Miklos Szeredi wrote:
> From: Miklos Szeredi <mszeredi@suse.cz>
> 
> NFSv4 can't do reliable opens in d_revalidate, since it cannot know whether a
> mount needs to be followed or not.  It does check d_mountpoint() on the dentry,
> which can result in a weird error if the VFS found that the mount does not in
> fact need to be followed, e.g.:
> 
>   # mount --bind /mnt/nfs /mnt/nfs-clone
>   # echo something > /mnt/nfs/tmp/bar
>   # echo x > /tmp/file
>   # mount --bind /tmp/file /mnt/nfs-clone/tmp/bar
>   # cat  /mnt/nfs/tmp/bar
>   cat: /mnt/nfs/tmp/bar: Not a directory
> 
> Which should, by any sane filesystem, result in "something" being printed.
> 
> So instead do the open in f_op->open() and in the unlikely case that the cached
> dentry turned out to be invalid, drop the dentry and return ESTALE to let the
> VFS retry.


Just one comment. Would it now make sense for NFSv4 to just skip
->d_revalidate() if LOOKUP_OPEN is set, and LOOKUP_EXCL is not set? We
will in any case be doing a revalidation in nfs4_file_open.

Otherwise, the rest all looks good to me.

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 13/16] ceph: implement i_op->atomic_open() and i_op->atomic_create()
  2012-03-28 20:24 ` [PATCH 13/16] ceph: implement i_op->atomic_open() and i_op->atomic_create() Miklos Szeredi
@ 2012-03-29  5:23   ` Sage Weil
  0 siblings, 0 replies; 22+ messages in thread
From: Sage Weil @ 2012-03-29  5:23 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: viro, linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench,
	ericvh, mszeredi

Hi Miklos,

I was pleasantly surprised to find that this works as-is.  I simplified 
this to call ceph_lookup_open for the atomic O_CREAT case as well, and it 
passes my basic battery of tests.  An incremental patch is below.

I haven't looked closely at the vfs bits, but from an fs perspective this 
series looks good to me!

Thanks-
sage




diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
index 75df600..9d679dd 100644
--- a/fs/ceph/dir.c
+++ b/fs/ceph/dir.c
@@ -642,37 +642,17 @@ struct file *ceph_atomic_open(struct inode *dir, struct dentry *dentry,
        struct dentry *res = NULL;
        struct file *filp;
 
-       if (!(flags & O_CREAT)) {
-               if (dentry->d_name.len > NAME_MAX)
-                       return ERR_PTR(-ENAMETOOLONG);
-
-               err = ceph_init_dentry(dentry);
-               if (err < 0)
-                       return ERR_PTR(err);
-
-               return ceph_lookup_open(dir, dentry, od, flags, mode);
-       }
-
-       if (d_unhashed(dentry)) {
-               res = ceph_lookup(dir, dentry, NULL);
-               if (IS_ERR(res))
-                       return ERR_CAST(res);
-
-               if (res)
-                       dentry = res;
-       }
+       dout("atomic_open %p dentry %p '%.*s'\n",
+            dir, dentry, dentry->d_name.len, dentry->d_name.name);
 
-       /* We don't deal with positive dentries here */
-       if (dentry->d_inode) {
-               finish_no_open(od, res);
-               return NULL;
-       }
+       if (dentry->d_name.len > NAME_MAX)
+               return ERR_PTR(-ENAMETOOLONG);
 
-       *created = true;
-       filp = ceph_lookup_open(dir, dentry, od, flags, mode);
-       dput(res);
+       err = ceph_init_dentry(dentry);
+       if (err < 0)
+               return ERR_PTR(err);
 
-       return filp;
+       return ceph_lookup_open(dir, dentry, od, flags, mode);
 }
 
 /*

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH 05/16] nfs: don't open in ->d_revalidate
  2012-03-28 21:45     ` Myklebust, Trond
@ 2012-03-29 12:19       ` Miklos Szeredi
  -1 siblings, 0 replies; 22+ messages in thread
From: Miklos Szeredi @ 2012-03-29 12:19 UTC (permalink / raw)
  To: Myklebust, Trond
  Cc: viro@ZenIV.linux.org.uk, linux-fsdevel, linux-kernel, hch,
	sfrench, sage, ericvh, mszeredi

On Wed, Mar 28, 2012 at 11:45 PM, Myklebust, Trond
<Trond.Myklebust@netapp.com> wrote:
> On Wed, 2012-03-28 at 22:24 +0200, Miklos Szeredi wrote:
>> From: Miklos Szeredi <mszeredi@suse.cz>
>>
>> NFSv4 can't do reliable opens in d_revalidate, since it cannot know whether a
>> mount needs to be followed or not.  It does check d_mountpoint() on the dentry,
>> which can result in a weird error if the VFS found that the mount does not in
>> fact need to be followed, e.g.:
>>
>>   # mount --bind /mnt/nfs /mnt/nfs-clone
>>   # echo something > /mnt/nfs/tmp/bar
>>   # echo x > /tmp/file
>>   # mount --bind /tmp/file /mnt/nfs-clone/tmp/bar
>>   # cat  /mnt/nfs/tmp/bar
>>   cat: /mnt/nfs/tmp/bar: Not a directory
>>
>> Which should, by any sane filesystem, result in "something" being printed.
>>
>> So instead do the open in f_op->open() and in the unlikely case that the cached
>> dentry turned out to be invalid, drop the dentry and return ESTALE to let the
>> VFS retry.
>
>
> Just one comment. Would it now make sense for NFSv4 to just skip
> ->d_revalidate() if LOOKUP_OPEN is set, and LOOKUP_EXCL is not set? We
> will in any case be doing a revalidation in nfs4_file_open.

And dentry is positive and regular.  Which is basically what
nfs4_lookup_revalidate() does check at the moment.

One question is whether this can be done without dropping out of RCU
mode, which might be a real performance win. I'm not sure about
dereferencing inode->i_mode.  AFAICS it should be fine, considering
that destruction of the inode will leave the mode bits untouched,
but...

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH 05/16] nfs: don't open in ->d_revalidate
@ 2012-03-29 12:19       ` Miklos Szeredi
  0 siblings, 0 replies; 22+ messages in thread
From: Miklos Szeredi @ 2012-03-29 12:19 UTC (permalink / raw)
  To: Myklebust, Trond
  Cc: viro@ZenIV.linux.org.uk, linux-fsdevel, linux-kernel, hch,
	sfrench, sage, ericvh, mszeredi

On Wed, Mar 28, 2012 at 11:45 PM, Myklebust, Trond
<Trond.Myklebust@netapp.com> wrote:
> On Wed, 2012-03-28 at 22:24 +0200, Miklos Szeredi wrote:
>> From: Miklos Szeredi <mszeredi@suse.cz>
>>
>> NFSv4 can't do reliable opens in d_revalidate, since it cannot know whether a
>> mount needs to be followed or not.  It does check d_mountpoint() on the dentry,
>> which can result in a weird error if the VFS found that the mount does not in
>> fact need to be followed, e.g.:
>>
>>   # mount --bind /mnt/nfs /mnt/nfs-clone
>>   # echo something > /mnt/nfs/tmp/bar
>>   # echo x > /tmp/file
>>   # mount --bind /tmp/file /mnt/nfs-clone/tmp/bar
>>   # cat  /mnt/nfs/tmp/bar
>>   cat: /mnt/nfs/tmp/bar: Not a directory
>>
>> Which should, by any sane filesystem, result in "something" being printed.
>>
>> So instead do the open in f_op->open() and in the unlikely case that the cached
>> dentry turned out to be invalid, drop the dentry and return ESTALE to let the
>> VFS retry.
>
>
> Just one comment. Would it now make sense for NFSv4 to just skip
> ->d_revalidate() if LOOKUP_OPEN is set, and LOOKUP_EXCL is not set? We
> will in any case be doing a revalidation in nfs4_file_open.

And dentry is positive and regular.  Which is basically what
nfs4_lookup_revalidate() does check at the moment.

One question is whether this can be done without dropping out of RCU
mode, which might be a real performance win. I'm not sure about
dereferencing inode->i_mode.  AFAICS it should be fine, considering
that destruction of the inode will leave the mode bits untouched,
but...

Thanks,
Miklos
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2012-03-29 12:19 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-03-28 20:24 [PATCH 00/16] vfs: atomic open v2 Miklos Szeredi
2012-03-28 20:24 ` [PATCH 01/16] vfs: split do_lookup() Miklos Szeredi
2012-03-28 20:24 ` [PATCH 02/16] vfs: reorganize do_last() Miklos Szeredi
2012-03-28 20:24 ` [PATCH 03/16] vfs: split __dentry_open() Miklos Szeredi
2012-03-28 20:24 ` [PATCH 04/16] vfs: add i_op->atomic_open() Miklos Szeredi
2012-03-28 20:24 ` [PATCH 05/16] nfs: don't open in ->d_revalidate Miklos Szeredi
2012-03-28 21:45   ` Myklebust, Trond
2012-03-28 21:45     ` Myklebust, Trond
2012-03-29 12:19     ` Miklos Szeredi
2012-03-29 12:19       ` Miklos Szeredi
2012-03-28 20:24 ` [PATCH 06/16] nfs: implement i_op->atomic_open() Miklos Szeredi
2012-03-28 20:24 ` [PATCH 07/16] nfs: clean up ->create in nfs_rpc_ops Miklos Szeredi
2012-03-28 20:24 ` [PATCH 08/16] nfs: don't use nd->intent.open.flags Miklos Szeredi
2012-03-28 20:24 ` [PATCH 09/16] nfs: don't use intents for checking atomic open Miklos Szeredi
2012-03-28 20:24 ` [PATCH 10/16] fuse: implement i_op->atomic_create() Miklos Szeredi
2012-03-28 20:24 ` [PATCH 11/16] cifs: implement i_op->atomic_open() and i_op->atomic_create() Miklos Szeredi
2012-03-28 20:24 ` [PATCH 12/16] ceph: remove unused arg from ceph_lookup_open() Miklos Szeredi
2012-03-28 20:24 ` [PATCH 13/16] ceph: implement i_op->atomic_open() and i_op->atomic_create() Miklos Szeredi
2012-03-29  5:23   ` Sage Weil
2012-03-28 20:24 ` [PATCH 14/16] 9p: implement i_op->atomic_create() Miklos Szeredi
2012-03-28 20:24 ` [PATCH 15/16] vfs: remove open intents from nameidata Miklos Szeredi
2012-03-28 20:24 ` [PATCH 16/16] vfs: only retry last component if opening stale dentry Miklos Szeredi

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.