linux-kernel.vger.kernel.org archive mirror
* [PATCH 00/25] vfs: atomic open RFC
@ 2012-03-07 21:22 Miklos Szeredi
  2012-03-07 21:22 ` [PATCH 01/25] vfs: split do_lookup() Miklos Szeredi
                   ` (26 more replies)
  0 siblings, 27 replies; 55+ messages in thread
From: Miklos Szeredi @ 2012-03-07 21:22 UTC
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

This series allows a clean implementation of atomic lookup+(create)+open and
create+open operations, which were previously done via ->lookup and ->create
using open intents.

Testing and review are welcome, but at this stage I'd mainly like to hear
opinions on the overall design of the new interfaces.

This is based on the vfs fixes patchset posted previously on -fsdevel.

git tree is here:

  git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git atomic-open

Thanks,
Miklos

---
Miklos Szeredi (25):
      vfs: split do_lookup()
      vfs: reorganize do_last()
      vfs: split __dentry_open()
      vfs: add i_op->atomic_open()
      vfs: add filesystem flags for atomic_open
      vfs: add i_op->atomic_create()
      nfs: implement i_op->atomic_open()
      nfs: clean up ->create in nfs_rpc_ops
      nfs: remove nfs4 specific create function
      nfs: don't use nd->intent.open.flags
      nfs: don't use intents for checking atomic open
      fuse: implement i_op->atomic_create()
      cifs: implement i_op->atomic_open() and i_op->atomic_create()
      ceph: remove unused arg from ceph_lookup_open()
      ceph: implement i_op->atomic_open() and i_op->atomic_create()
      9p: implement i_op->atomic_create()
      vfs: remove open intents from nameidata
      vfs: only retry last component if opening stale dentry
      vfs: remove nameidata argument from vfs_create
      vfs: move O_DIRECT check to common code
      gfs2: use i_op->atomic_create()
      nfs: use i_op->atomic_create()
      vfs: remove nameidata argument from i_op->create()
      vfs: optionally skip lookup on exclusive create
      vfs: remove nameidata from lookup

---
 Documentation/filesystems/Locking |    3 +-
 Documentation/filesystems/vfs.txt |    2 +-
 fs/9p/v9fs.h                      |    3 +-
 fs/9p/vfs_inode.c                 |   36 +--
 fs/9p/vfs_inode_dotl.c            |   36 +-
 fs/adfs/dir.c                     |    2 +-
 fs/affs/affs.h                    |    5 +-
 fs/affs/namei.c                   |    4 +-
 fs/afs/dir.c                      |   12 +-
 fs/afs/mntpt.c                    |    7 +-
 fs/autofs4/root.c                 |    4 +-
 fs/bad_inode.c                    |    7 +-
 fs/befs/linuxvfs.c                |    4 +-
 fs/bfs/dir.c                      |    6 +-
 fs/btrfs/inode.c                  |    6 +-
 fs/cachefiles/namei.c             |    2 +-
 fs/ceph/dir.c                     |   38 +--
 fs/ceph/file.c                    |   23 +-
 fs/ceph/super.c                   |    2 +-
 fs/ceph/super.h                   |    6 +-
 fs/cifs/cifsfs.c                  |   24 ++-
 fs/cifs/cifsfs.h                  |    8 +-
 fs/cifs/dir.c                     |  175 +++++------
 fs/coda/dir.c                     |    8 +-
 fs/configfs/dir.c                 |    4 +-
 fs/cramfs/inode.c                 |    2 +-
 fs/ecryptfs/inode.c               |    9 +-
 fs/efs/efs.h                      |    2 +-
 fs/efs/namei.c                    |    3 +-
 fs/exofs/namei.c                  |    6 +-
 fs/ext2/namei.c                   |    4 +-
 fs/ext3/namei.c                   |    5 +-
 fs/ext4/namei.c                   |    5 +-
 fs/fat/namei_msdos.c              |    6 +-
 fs/fat/namei_vfat.c               |    6 +-
 fs/freevxfs/vxfs_lookup.c         |    5 +-
 fs/fuse/dir.c                     |   60 ++--
 fs/gfs2/inode.c                   |   24 +-
 fs/hfs/dir.c                      |    6 +-
 fs/hfs/inode.c                    |    3 +-
 fs/hfsplus/dir.c                  |    7 +-
 fs/hfsplus/inode.c                |    2 +-
 fs/hostfs/hostfs_kern.c           |    6 +-
 fs/hpfs/dir.c                     |    2 +-
 fs/hpfs/hpfs_fn.h                 |    2 +-
 fs/hpfs/namei.c                   |    2 +-
 fs/hppfs/hppfs.c                  |    3 +-
 fs/hugetlbfs/inode.c              |    3 +-
 fs/internal.h                     |    9 +-
 fs/isofs/isofs.h                  |    2 +-
 fs/isofs/namei.c                  |    2 +-
 fs/jffs2/dir.c                    |   11 +-
 fs/jfs/namei.c                    |    6 +-
 fs/libfs.c                        |    2 +-
 fs/logfs/dir.c                    |    6 +-
 fs/minix/namei.c                  |    5 +-
 fs/namei.c                        |  636 ++++++++++++++++++++++++++++---------
 fs/ncpfs/dir.c                    |    9 +-
 fs/nfs/dir.c                      |  278 +++++++----------
 fs/nfs/file.c                     |    2 +-
 fs/nfs/nfs3proc.c                 |    2 +-
 fs/nfs/nfs4proc.c                 |   48 ---
 fs/nfs/proc.c                     |    2 +-
 fs/nfs/super.c                    |    9 +-
 fs/nfsd/vfs.c                     |    4 +-
 fs/nilfs2/namei.c                 |    6 +-
 fs/ntfs/namei.c                   |    4 +-
 fs/ocfs2/dlmfs/dlmfs.c            |    3 +-
 fs/ocfs2/namei.c                  |    6 +-
 fs/omfs/dir.c                     |    6 +-
 fs/open.c                         |  113 +++----
 fs/openpromfs/inode.c             |    5 +-
 fs/proc/base.c                    |   22 +-
 fs/proc/generic.c                 |    3 +-
 fs/proc/internal.h                |    4 +-
 fs/proc/namespaces.c              |    2 +-
 fs/proc/proc_net.c                |    2 +-
 fs/proc/proc_sysctl.c             |    3 +-
 fs/proc/root.c                    |    9 +-
 fs/qnx4/namei.c                   |    2 +-
 fs/qnx4/qnx4.h                    |    2 +-
 fs/ramfs/inode.c                  |    2 +-
 fs/reiserfs/namei.c               |    7 +-
 fs/romfs/super.c                  |    3 +-
 fs/squashfs/namei.c               |    3 +-
 fs/sysfs/dir.c                    |    3 +-
 fs/sysv/namei.c                   |    4 +-
 fs/ubifs/dir.c                    |    6 +-
 fs/udf/namei.c                    |    6 +-
 fs/ufs/namei.c                    |    5 +-
 fs/xfs/xfs_iops.c                 |    9 +-
 include/linux/errno.h             |    1 +
 include/linux/fs.h                |   21 +-
 include/linux/namei.h             |   11 -
 include/linux/nfs_xdr.h           |    2 +-
 ipc/mqueue.c                      |    5 +-
 kernel/cgroup.c                   |    4 +-
 mm/shmem.c                        |    3 +-
 98 files changed, 1031 insertions(+), 889 deletions(-)




* [PATCH 01/25] vfs: split do_lookup()
  2012-03-07 21:22 [PATCH 00/25] vfs: atomic open RFC Miklos Szeredi
@ 2012-03-07 21:22 ` Miklos Szeredi
  2012-03-07 21:22 ` [PATCH 02/25] vfs: reorganize do_last() Miklos Szeredi
                   ` (25 subsequent siblings)
  26 siblings, 0 replies; 55+ messages in thread
From: Miklos Szeredi @ 2012-03-07 21:22 UTC
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

From: Miklos Szeredi <mszeredi@suse.cz>

Split do_lookup() into two functions:

  lookup_fast() - does cached lookup without i_mutex
  lookup_slow() - does lookup with i_mutex

Both follow managed dentries.

The new functions are needed by atomic_open.
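
The calling convention of the two helpers can be modelled in userspace.  This
is only a sketch: the toy cache and the toy_*() functions are hypothetical
names invented for illustration, not kernel code.  It mirrors the convention
used in the patch, where the fast path returns 0 on success, a negative errno
on error, and 1 when the caller has to fall back to the slow path.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Toy stand-in for the dcache: a single cached name.  Hypothetical. */
struct toy_cache {
	const char *cached_name;	/* NULL when nothing is cached */
	int cached_ino;			/* 0 models a negative entry */
};

/* Toy stand-in for the backing filesystem.  Hypothetical. */
static int toy_backing_lookup(const char *name)
{
	if (strcmp(name, "a") == 0)
		return 10;
	if (strcmp(name, "b") == 0)
		return 20;
	return 0;	/* no such entry */
}

/* Returns 0 on a cache hit, 1 if the slow path is needed, <0 on error. */
static int toy_lookup_fast(const struct toy_cache *c, const char *name,
			   int *ino)
{
	if (name[0] == '\0')
		return -EINVAL;
	if (c->cached_name && strcmp(c->cached_name, name) == 0) {
		*ino = c->cached_ino;
		return 0;
	}
	return 1;
}

/* Slow path: in the kernel this is where i_mutex would be taken. */
static int toy_lookup_slow(struct toy_cache *c, const char *name, int *ino)
{
	assert(ino != NULL);	/* sanity check, in the spirit of BUG_ON() */
	*ino = toy_backing_lookup(name);
	c->cached_name = name;
	c->cached_ino = *ino;
	return 0;
}

/* Caller, shaped like the reworked walk_component(). */
static int toy_walk(struct toy_cache *c, const char *name, int *ino)
{
	int err = toy_lookup_fast(c, name, ino);

	if (err) {
		if (err < 0)
			return err;
		err = toy_lookup_slow(c, name, ino);
		if (err < 0)
			return err;
	}
	if (!*ino)
		return -ENOENT;
	return 0;
}
```

The point of the tri-state return is that the caller, rather than the fast
path itself, decides when to pay for the locked lookup.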

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
 fs/namei.c |   56 ++++++++++++++++++++++++++++++++++++++++++--------------
 1 files changed, 42 insertions(+), 14 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 29a00e3..f9639bf 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1147,8 +1147,8 @@ static struct dentry *__lookup_hash(struct qstr *name, struct dentry *base,
  *  small and for now I'd prefer to have fast path as straight as possible.
  *  It _is_ time-critical.
  */
-static int do_lookup(struct nameidata *nd, struct qstr *name,
-			struct path *path, struct inode **inode)
+static int lookup_fast(struct nameidata *nd, struct qstr *name,
+		       struct path *path, struct inode **inode)
 {
 	struct vfsmount *mnt = nd->path.mnt;
 	struct dentry *dentry, *parent = nd->path.dentry;
@@ -1217,7 +1217,6 @@ unlazy:
 		}
 	}
 
-success:
 	path->mnt = mnt;
 	path->dentry = dentry;
 	err = follow_managed(path, nd);
@@ -1229,6 +1228,17 @@ success:
 	return 0;
 
 need_lookup:
+	return 1;
+}
+
+/* Fast lookup failed, do it the slow way */
+static int lookup_slow(struct nameidata *nd, struct qstr *name,
+		       struct path *path)
+{
+	struct dentry *dentry, *parent;
+	int err;
+
+	parent = nd->path.dentry;
 	BUG_ON(nd->inode != parent->d_inode);
 
 	mutex_lock(&parent->d_inode->i_mutex);
@@ -1237,7 +1247,14 @@ need_lookup:
 	if (IS_ERR(dentry))
 		return PTR_ERR(dentry);
 
-	goto success;
+	path->mnt = nd->path.mnt;
+	path->dentry = dentry;
+	err = follow_managed(path, nd);
+	if (unlikely(err)) {
+		path_put_conditional(path, nd);
+		return err;
+	}
+	return 0;
 }
 
 static inline int may_lookup(struct nameidata *nd)
@@ -1309,21 +1326,26 @@ static inline int walk_component(struct nameidata *nd, struct path *path,
 	 */
 	if (unlikely(type != LAST_NORM))
 		return handle_dots(nd, type);
-	err = do_lookup(nd, name, path, &inode);
+	err = lookup_fast(nd, name, path, &inode);
 	if (unlikely(err)) {
-		terminate_walk(nd);
-		return err;
-	}
-	if (!inode) {
-		path_to_nameidata(path, nd);
-		terminate_walk(nd);
-		return -ENOENT;
+		if (err < 0)
+			goto out_err;
+
+		err = lookup_slow(nd, name, path);
+		if (err < 0)
+			goto out_err;
+
+		inode = path->dentry->d_inode;
 	}
+	err = -ENOENT;
+	if (!inode)
+		goto out_path_put;
+
 	if (should_follow_link(inode, follow)) {
 		if (nd->flags & LOOKUP_RCU) {
 			if (unlikely(unlazy_walk(nd, path->dentry))) {
-				terminate_walk(nd);
-				return -ECHILD;
+				err = -ECHILD;
+				goto out_err;
 			}
 		}
 		BUG_ON(inode != path->dentry->d_inode);
@@ -1332,6 +1354,12 @@ static inline int walk_component(struct nameidata *nd, struct path *path,
 	path_to_nameidata(path, nd);
 	nd->inode = inode;
 	return 0;
+
+out_path_put:
+	path_to_nameidata(path, nd);
+out_err:
+	terminate_walk(nd);
+	return err;
 }
 
 /*
-- 
1.7.7



* [PATCH 02/25] vfs: reorganize do_last()
  2012-03-07 21:22 [PATCH 00/25] vfs: atomic open RFC Miklos Szeredi
  2012-03-07 21:22 ` [PATCH 01/25] vfs: split do_lookup() Miklos Szeredi
@ 2012-03-07 21:22 ` Miklos Szeredi
  2012-03-07 21:22 ` [PATCH 03/25] vfs: split __dentry_open() Miklos Szeredi
                   ` (24 subsequent siblings)
  26 siblings, 0 replies; 55+ messages in thread
From: Miklos Szeredi @ 2012-03-07 21:22 UTC
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

From: Miklos Szeredi <mszeredi@suse.cz>

Make the slow lookup part of O_CREAT and non-O_CREAT opens common.

This allows atomic_open to be hooked into the slow lookup part.

The audit_inode() call that used to follow mutex_unlock() has been moved down
to just before the "ok:" label.  Is this correct?

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
 fs/namei.c |   93 +++++++++++++++++++++++++++++++++++------------------------
 1 files changed, 55 insertions(+), 38 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index f9639bf..ff8bc94 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2113,6 +2113,8 @@ static struct file *do_last(struct nameidata *nd, struct path *path,
 	int want_write = 0;
 	int acc_mode = op->acc_mode;
 	struct file *filp;
+	struct inode *inode;
+	int symlink_ok = 0;
 	int error;
 
 	nd->flags &= ~LOOKUP_PARENT;
@@ -2144,47 +2146,38 @@ static struct file *do_last(struct nameidata *nd, struct path *path,
 	}
 
 	if (!(open_flag & O_CREAT)) {
-		int symlink_ok = 0;
 		if (nd->last.name[nd->last.len])
 			nd->flags |= LOOKUP_FOLLOW | LOOKUP_DIRECTORY;
 		if (open_flag & O_PATH && !(nd->flags & LOOKUP_FOLLOW))
 			symlink_ok = 1;
 		/* we _can_ be in RCU mode here */
-		error = walk_component(nd, path, &nd->last, LAST_NORM,
-					!symlink_ok);
-		if (error < 0)
-			return ERR_PTR(error);
-		if (error) /* symlink */
-			return NULL;
-		/* sayonara */
+		error = lookup_fast(nd, &nd->last, path, &inode);
+		if (error <= 0) {
+			if (error)
+				goto terminate;
+
+			goto finish_lookup;
+		}
+		/* cached lookup failed, no longer in RCU mode */
+	} else {
+		/* create side of things */
+
+		/*
+		 * This will *only* deal with leaving RCU mode - LOOKUP_JUMPED
+		 * has been cleared when we got to the last component we are
+		 * about to look up
+		 */
 		error = complete_walk(nd);
 		if (error)
 			return ERR_PTR(error);
 
-		error = -ENOTDIR;
-		if (nd->flags & LOOKUP_DIRECTORY) {
-			if (!nd->inode->i_op->lookup)
-				goto exit;
-		}
-		audit_inode(pathname, nd->path.dentry);
-		goto ok;
+		audit_inode(pathname, dir);
+		error = -EISDIR;
+		/* trailing slashes? */
+		if (nd->last.name[nd->last.len])
+			goto exit;
 	}
 
-	/* create side of things */
-	/*
-	 * This will *only* deal with leaving RCU mode - LOOKUP_JUMPED has been
-	 * cleared when we got to the last component we are about to look up
-	 */
-	error = complete_walk(nd);
-	if (error)
-		return ERR_PTR(error);
-
-	audit_inode(pathname, dir);
-	error = -EISDIR;
-	/* trailing slashes? */
-	if (nd->last.name[nd->last.len])
-		goto exit;
-
 	mutex_lock(&dir->d_inode->i_mutex);
 
 	dentry = lookup_hash(nd);
@@ -2197,9 +2190,14 @@ static struct file *do_last(struct nameidata *nd, struct path *path,
 	path->dentry = dentry;
 	path->mnt = nd->path.mnt;
 
-	/* Negative dentry, just create the file */
+	/* Negative dentry, create the file if O_CREAT */
 	if (!dentry->d_inode) {
 		umode_t mode = op->mode;
+
+		error = -ENOENT;
+		if (!(open_flag & O_CREAT))
+			goto exit_mutex_unlock;
+
 		if (!IS_POSIXACL(dir->d_inode))
 			mode &= ~current_umask();
 		/*
@@ -2233,7 +2231,6 @@ static struct file *do_last(struct nameidata *nd, struct path *path,
 	 * It already exists.
 	 */
 	mutex_unlock(&dir->d_inode->i_mutex);
-	audit_inode(pathname, path->dentry);
 
 	error = -EEXIST;
 	if (open_flag & O_EXCL)
@@ -2243,22 +2240,38 @@ static struct file *do_last(struct nameidata *nd, struct path *path,
 	if (error)
 		goto exit_dput;
 
+	inode = path->dentry->d_inode;
+finish_lookup:
 	error = -ENOENT;
-	if (!path->dentry->d_inode)
-		goto exit_dput;
+	if (!inode) {
+		path_to_nameidata(path, nd);
+		goto terminate;
+	}
 
-	if (path->dentry->d_inode->i_op->follow_link)
+	if (should_follow_link(inode, !symlink_ok)) {
+		if (nd->flags & LOOKUP_RCU) {
+			if (unlikely(unlazy_walk(nd, path->dentry))) {
+				error = -ECHILD;
+				goto terminate;
+			}
+		}
 		return NULL;
+	}
 
 	path_to_nameidata(path, nd);
-	nd->inode = path->dentry->d_inode;
-	/* Why this, you ask?  _Now_ we might have grown LOOKUP_JUMPED... */
+	nd->inode = inode;
+
 	error = complete_walk(nd);
 	if (error)
 		return ERR_PTR(error);
 	error = -EISDIR;
-	if (S_ISDIR(nd->inode->i_mode))
+	if ((open_flag & O_CREAT) && S_ISDIR(inode->i_mode))
+		goto exit;
+	error = -ENOTDIR;
+	if (nd->flags & LOOKUP_DIRECTORY && !inode->i_op->lookup)
 		goto exit;
+
+	audit_inode(pathname, nd->path.dentry);
 ok:
 	if (!S_ISREG(nd->inode->i_mode))
 		will_truncate = 0;
@@ -2303,6 +2316,10 @@ exit_dput:
 exit:
 	filp = ERR_PTR(error);
 	goto out;
+
+terminate:
+	terminate_walk(nd);
+	return ERR_PTR(error);
 }
 
 static struct file *path_openat(int dfd, const char *pathname,
-- 
1.7.7



* [PATCH 03/25] vfs: split __dentry_open()
  2012-03-07 21:22 [PATCH 00/25] vfs: atomic open RFC Miklos Szeredi
  2012-03-07 21:22 ` [PATCH 01/25] vfs: split do_lookup() Miklos Szeredi
  2012-03-07 21:22 ` [PATCH 02/25] vfs: reorganize do_last() Miklos Szeredi
@ 2012-03-07 21:22 ` Miklos Szeredi
  2012-03-24 14:12   ` Christoph Hellwig
  2012-03-07 21:22 ` [PATCH 04/25] vfs: add i_op->atomic_open() Miklos Szeredi
                   ` (23 subsequent siblings)
  26 siblings, 1 reply; 55+ messages in thread
From: Miklos Szeredi @ 2012-03-07 21:22 UTC
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

From: Miklos Szeredi <mszeredi@suse.cz>

Split __dentry_open() into two functions:

  do_dentry_open() - does most of the actual work, doesn't put file on failure
  open_check_o_direct() - after a successful open, checks direct_IO method

This will allow i_op->atomic_open to do just the file initialization and leave
the direct_IO checking to the VFS.
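
The shape of this split can be modelled in userspace.  This is only a sketch
under stated assumptions: struct toy_file, TOY_O_DIRECT and the toy_*()
helpers are hypothetical and stand in for struct file, O_DIRECT and the kernel
functions.  The step that sets the file up performs no direct I/O check, the
check itself never releases the file, and a thin wrapper combines the two to
keep the old behaviour.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

#define TOY_O_DIRECT 0x1	/* hypothetical flag value for this model */

struct toy_file {
	unsigned flags;
	bool has_direct_io;	/* models a_ops->direct_IO being set */
};

static int toy_live_files;	/* number of files not yet released */

static void toy_fput(struct toy_file *f)
{
	assert(toy_live_files > 0);
	toy_live_files--;
	free(f);
}

/* Step 1: set the file up.  Performs no O_DIRECT check. */
static struct toy_file *toy_do_open(unsigned flags, bool has_direct_io)
{
	struct toy_file *f = malloc(sizeof(*f));

	if (!f)
		return NULL;
	f->flags = flags;
	f->has_direct_io = has_direct_io;
	toy_live_files++;
	return f;
}

/* Step 2: check after a successful open.  Never releases the file. */
static int toy_check_o_direct(const struct toy_file *f)
{
	if ((f->flags & TOY_O_DIRECT) && !f->has_direct_io)
		return -EINVAL;
	return 0;
}

/* Wrapper with the old semantics: on failure the file is released. */
static struct toy_file *toy_open(unsigned flags, bool has_direct_io, int *err)
{
	struct toy_file *f = toy_do_open(flags, has_direct_io);

	if (!f) {
		*err = -ENOMEM;
		return NULL;
	}
	*err = toy_check_o_direct(f);
	if (*err) {
		toy_fput(f);
		return NULL;
	}
	return f;
}
```

A caller that builds the file itself can then run only the second step, which
is the role the commit message describes for the VFS after ->atomic_open.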

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
 fs/internal.h |    1 +
 fs/open.c     |   47 +++++++++++++++++++++++++++++++++--------------
 2 files changed, 34 insertions(+), 14 deletions(-)

diff --git a/fs/internal.h b/fs/internal.h
index 9962c59..4d69fdd 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -100,6 +100,7 @@ extern struct file *do_file_open_root(struct dentry *, struct vfsmount *,
 
 extern long do_handle_open(int mountdirfd,
 			   struct file_handle __user *ufh, int open_flag);
+extern int open_check_o_direct(struct file *f);
 
 /*
  * inode.c
diff --git a/fs/open.c b/fs/open.c
index 77becc0..6acfd2d 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -644,10 +644,23 @@ static inline int __get_file_write_access(struct inode *inode,
 	return error;
 }
 
-static struct file *__dentry_open(struct dentry *dentry, struct vfsmount *mnt,
-					struct file *f,
-					int (*open)(struct inode *, struct file *),
-					const struct cred *cred)
+int open_check_o_direct(struct file *f)
+{
+	/* NB: we're sure to have correct a_ops only after f_op->open */
+	if (f->f_flags & O_DIRECT) {
+		if (!f->f_mapping->a_ops ||
+		    ((!f->f_mapping->a_ops->direct_IO) &&
+		    (!f->f_mapping->a_ops->get_xip_mem))) {
+			return -EINVAL;
+		}
+	}
+	return 0;
+}
+
+static struct file *do_dentry_open(struct dentry *dentry, struct vfsmount *mnt,
+				   struct file *f,
+				   int (*open)(struct inode *, struct file *),
+				   const struct cred *cred)
 {
 	static const struct file_operations empty_fops = {};
 	struct inode *inode;
@@ -703,16 +716,6 @@ static struct file *__dentry_open(struct dentry *dentry, struct vfsmount *mnt,
 
 	file_ra_state_init(&f->f_ra, f->f_mapping->host->i_mapping);
 
-	/* NB: we're sure to have correct a_ops only after f_op->open */
-	if (f->f_flags & O_DIRECT) {
-		if (!f->f_mapping->a_ops ||
-		    ((!f->f_mapping->a_ops->direct_IO) &&
-		    (!f->f_mapping->a_ops->get_xip_mem))) {
-			fput(f);
-			f = ERR_PTR(-EINVAL);
-		}
-	}
-
 	return f;
 
 cleanup_all:
@@ -740,6 +743,22 @@ cleanup_file:
 	return ERR_PTR(error);
 }
 
+static struct file *__dentry_open(struct dentry *dentry, struct vfsmount *mnt,
+				struct file *f,
+				int (*open)(struct inode *, struct file *),
+				const struct cred *cred)
+{
+	struct file *res = do_dentry_open(dentry, mnt, f, open, cred);
+	if (!IS_ERR(res)) {
+		int error = open_check_o_direct(f);
+		if (error) {
+			fput(res);
+			res = ERR_PTR(error);
+		}
+	}
+	return res;
+}
+
 /**
  * lookup_instantiate_filp - instantiates the open intent filp
  * @nd: pointer to nameidata
-- 
1.7.7



* [PATCH 04/25] vfs: add i_op->atomic_open()
  2012-03-07 21:22 [PATCH 00/25] vfs: atomic open RFC Miklos Szeredi
                   ` (2 preceding siblings ...)
  2012-03-07 21:22 ` [PATCH 03/25] vfs: split __dentry_open() Miklos Szeredi
@ 2012-03-07 21:22 ` Miklos Szeredi
  2012-03-13 14:38   ` Myklebust, Trond
  2012-03-07 21:22 ` [PATCH 05/25] vfs: add filesystem flags for atomic_open Miklos Szeredi
                   ` (22 subsequent siblings)
  26 siblings, 1 reply; 55+ messages in thread
From: Miklos Szeredi @ 2012-03-07 21:22 UTC
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

From: Miklos Szeredi <mszeredi@suse.cz>

Add a new inode operation which is called on the last component of an open.
Using this, the filesystem can look up, possibly create, and open the file in
one atomic operation.  If it cannot perform this (e.g. the file type turned out
to be wrong), it may signal this by returning NULL instead of an open struct
file pointer.

i_op->atomic_open() is only called if the last component is negative or needs
lookup.  Handling cached positive dentries here doesn't add much value: these
can be opened using f_op->open().  If the cached file turns out to be invalid,
the open can be retried, this time using ->atomic_open() with a fresh dentry.

For now leave the old way of using open intents in lookup and revalidate in
place.  This will be removed once all the users are converted.
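
The write-access handling in atomic_open() is the subtle part of this patch.
Its decision table can be modelled as a pure function.  This is a sketch, not
the kernel code: toy_write_access() and enum toy_action are hypothetical names,
want_write_err stands in for the return value of mnt_want_write(), and the
model uses the plain userspace O_* flags from <fcntl.h>, ignoring the
open_to_namei_flags() conversion applied in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>

enum toy_action {
	TOY_CALL_ATOMIC_OPEN,	/* go on and call ->atomic_open() */
	TOY_FALL_BACK,		/* fall back to plain lookup + open */
};

/*
 * want_write_err is 0 or a negative errno such as -EROFS.  On return,
 * *open_flag may have O_CREAT cleared and *create_error holds the error
 * to report later, if any.
 */
static enum toy_action toy_write_access(int *open_flag, int want_write_err,
					int *create_error)
{
	int needs_write = (*open_flag & (O_CREAT | O_TRUNC)) ||
			  (*open_flag & O_ACCMODE) != O_RDONLY;

	assert(want_write_err <= 0);
	if (!needs_write || !want_write_err)
		return TOY_CALL_ATOMIC_OPEN;

	/* No O_CREAT: atomicity is not a requirement. */
	if (!(*open_flag & O_CREAT))
		return TOY_FALL_BACK;

	*create_error = want_write_err;

	/* O_EXCL or O_TRUNC: fall back and fail with the right error. */
	if (*open_flag & (O_EXCL | O_TRUNC))
		return TOY_FALL_BACK;

	/* No side effects, so it is safe to clear O_CREAT. */
	*open_flag &= ~O_CREAT;
	return TOY_CALL_ATOMIC_OPEN;
}
```

For example, an O_RDONLY | O_CREAT open on a read-only mount still reaches
->atomic_open(), with O_CREAT cleared and the saved error available in case
the file turns out not to exist.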

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
 fs/internal.h      |    5 +
 fs/namei.c         |  231 +++++++++++++++++++++++++++++++++++++++++++++++++---
 fs/open.c          |   27 ++++++
 include/linux/fs.h |    6 ++
 4 files changed, 258 insertions(+), 11 deletions(-)

diff --git a/fs/internal.h b/fs/internal.h
index 4d69fdd..10143de 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -87,12 +87,17 @@ extern struct super_block *user_get_super(dev_t);
 struct nameidata;
 extern struct file *nameidata_to_filp(struct nameidata *);
 extern void release_open_intent(struct nameidata *);
+struct opendata {
+	struct vfsmount *mnt;
+	struct file **filp;
+};
 struct open_flags {
 	int open_flag;
 	umode_t mode;
 	int acc_mode;
 	int intent;
 };
+
 extern struct file *do_filp_open(int dfd, const char *pathname,
 		const struct open_flags *op, int lookup_flags);
 extern struct file *do_file_open_root(struct dentry *, struct vfsmount *,
diff --git a/fs/namei.c b/fs/namei.c
index ff8bc94..835dcf1 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2100,6 +2100,201 @@ static inline int open_to_namei_flags(int flag)
 	return flag;
 }
 
+static int may_o_create(struct path *dir, struct dentry *dentry, umode_t mode)
+{
+	int error = security_path_mknod(dir, dentry, mode, 0);
+	if (error)
+		return error;
+
+	error = may_create(dir->dentry->d_inode, dentry);
+	if (error)
+		return error;
+
+	return security_inode_create(dir->dentry->d_inode, dentry, mode);
+}
+
+static struct file *atomic_open(struct nameidata *nd, struct dentry *dentry,
+				const struct open_flags *op,
+				int *want_write, int *create_error)
+{
+	struct inode *dir =  nd->path.dentry->d_inode;
+	unsigned open_flag = open_to_namei_flags(op->open_flag);
+	umode_t mode;
+	int error;
+	bool created = false;
+	int acc_mode;
+	struct opendata od;
+	struct file *filp;
+
+	BUG_ON(dentry->d_inode);
+
+	/* Don't create child dentry for a dead directory. */
+	if (unlikely(IS_DEADDIR(dir)))
+		return ERR_PTR(-ENOENT);
+
+	mode = op->mode & S_IALLUGO;
+	if ((open_flag & O_CREAT) && !IS_POSIXACL(dir))
+		mode &= ~current_umask();
+
+	if (open_flag & O_EXCL) {
+		open_flag &= ~O_TRUNC;
+		created = true;
+	}
+
+	/*
+	 * Checking write permission is tricky, because we don't know if we are
+	 * going to actually need it: O_CREAT opens should work as long as the
+	 * file exists.  But checking existence breaks atomicity.  The trick is
+	 * to check access and if not granted clear O_CREAT from the flags.
+	 *
+	 * Another problem is returning the "right" error value (e.g. for an
+	 * O_EXCL open we want to return EEXIST not EROFS).
+	 */
+	if ((open_flag & (O_CREAT | O_TRUNC)) ||
+	    (open_flag & O_ACCMODE) != O_RDONLY) {
+		error = mnt_want_write(nd->path.mnt);
+		if (!error) {
+			*want_write = 1;
+		} else if (!(open_flag & O_CREAT)) {
+			/*
+			 * No O_CREAT -> atomicity not a requirement -> fall
+			 * back to lookup + open
+			 */
+			goto look_up;
+		} else if (open_flag & (O_EXCL | O_TRUNC)) {
+			/* Fall back and fail with the right error */
+			*create_error = error;
+			goto look_up;
+		} else {
+			/* No side effects, safe to clear O_CREAT */
+			*create_error = error;
+			open_flag &= ~O_CREAT;
+		}
+	}
+
+	if (open_flag & O_CREAT) {
+		error = may_o_create(&nd->path, dentry, op->mode);
+		if (error) {
+			*create_error = error;
+			if (open_flag & O_EXCL)
+				goto look_up;
+			open_flag &= ~O_CREAT;
+		}
+	}
+
+	if (nd->flags & LOOKUP_DIRECTORY)
+		open_flag |= O_DIRECTORY;
+
+	od.mnt = nd->path.mnt;
+	od.filp = &nd->intent.open.file;
+	filp = dir->i_op->atomic_open(dir, dentry, &od, open_flag, mode,
+				      &created);
+	if (IS_ERR(filp)) {
+		if (*create_error && PTR_ERR(filp) == -ENOENT)
+			filp = ERR_PTR(*create_error);
+		goto out;
+	}
+
+	acc_mode = op->acc_mode;
+	if (created) {
+		fsnotify_create(dir, dentry);
+		acc_mode = MAY_OPEN;
+	}
+
+	if (filp) {
+		/*
+		 * We didn't have the inode before the open, so check open
+		 * permission here.
+		 */
+		error = may_open(&filp->f_path, acc_mode, open_flag);
+		if (error)
+			goto out_fput;
+
+		error = open_check_o_direct(filp);
+		if (error)
+			goto out_fput;
+	}
+	*create_error = 0;
+
+out:
+	return filp;
+
+out_fput:
+	fput(filp);
+	return ERR_PTR(error);
+
+look_up:
+	return NULL;
+}
+
+/*
+ * Lookup and possibly open (and create) the last component
+ *
+ * Must be called with i_mutex held on parent.
+ *
+ * Returns open file or NULL on success, error otherwise.  NULL means no open
+ * was performed, only lookup.
+ */
+static struct file *lookup_open(struct nameidata *nd, struct path *path,
+				const struct open_flags *op, int *want_write)
+{
+	struct dentry *dir = nd->path.dentry;
+	struct inode *dir_inode = dir->d_inode;
+	struct dentry *dentry;
+	int error;
+	int create_error = 0;
+	bool need_lookup;
+
+	dentry = lookup_dcache(&nd->last, dir, nd, &need_lookup);
+	if (IS_ERR(dentry))
+		return ERR_CAST(dentry);
+
+	/* Cached positive dentry: will open in f_op->open */
+	if (!need_lookup && dentry->d_inode)
+		goto out_no_open;
+
+	if ((nd->flags & LOOKUP_OPEN) && dir_inode->i_op->atomic_open) {
+		struct file *filp;
+
+		filp = atomic_open(nd, dentry, op, want_write, &create_error);
+		if (filp) {
+			dput(dentry);
+			return filp;
+		}
+		/* fall back to plain lookup */
+	}
+
+	if (need_lookup) {
+		BUG_ON(dentry->d_inode);
+
+		dentry = lookup_real(dir_inode, dentry, nd);
+		if (IS_ERR(dentry))
+			return ERR_CAST(dentry);
+
+		if (create_error) {
+			int open_flag = op->open_flag;
+
+			error = create_error;
+			if ((open_flag & O_EXCL) && !dentry->d_inode)
+				goto out_dput;
+			if (!(open_flag & O_EXCL) && (open_flag & O_TRUNC) &&
+			    dentry->d_inode && S_ISREG(dentry->d_inode->i_mode))
+				goto out_dput;
+
+			/* will fail later, go on to get the right error */
+		}
+	}
+
+out_no_open:
+	path->dentry = dentry;
+	path->mnt = nd->path.mnt;
+	return NULL;
+
+out_dput:
+	dput(dentry);
+	return ERR_PTR(error);
+}
+
 /*
  * Handle the last step of open()
  */
@@ -2180,15 +2375,16 @@ static struct file *do_last(struct nameidata *nd, struct path *path,
 
 	mutex_lock(&dir->d_inode->i_mutex);
 
-	dentry = lookup_hash(nd);
-	error = PTR_ERR(dentry);
-	if (IS_ERR(dentry)) {
+	filp = lookup_open(nd, path, op, &want_write);
+	if (filp) {
 		mutex_unlock(&dir->d_inode->i_mutex);
-		goto exit;
-	}
+		if (IS_ERR(filp))
+			goto out;
 
-	path->dentry = dentry;
-	path->mnt = nd->path.mnt;
+		audit_inode(pathname, filp->f_path.dentry);
+		goto opened;
+	}
+	dentry = path->dentry;
 
 	/* Negative dentry, create the file if O_CREAT */
 	if (!dentry->d_inode) {
@@ -2207,10 +2403,12 @@ static struct file *do_last(struct nameidata *nd, struct path *path,
 		 * a permanent write count is taken through
 		 * the 'struct file' in nameidata_to_filp().
 		 */
-		error = mnt_want_write(nd->path.mnt);
-		if (error)
-			goto exit_mutex_unlock;
-		want_write = 1;
+		if (!want_write) {
+			error = mnt_want_write(nd->path.mnt);
+			if (error)
+				goto exit_mutex_unlock;
+			want_write = 1;
+		}
 		/* Don't check for write permission, don't truncate */
 		open_flag &= ~O_TRUNC;
 		will_truncate = 0;
@@ -2232,6 +2430,16 @@ static struct file *do_last(struct nameidata *nd, struct path *path,
 	 */
 	mutex_unlock(&dir->d_inode->i_mutex);
 
+	/*
+	 * If atomic_open() acquired write access it is dropped now due to
+	 * possible mount and symlink following (this might be optimized away if
+	 * necessary...)
+	 */
+	if (want_write) {
+		mnt_drop_write(nd->path.mnt);
+		want_write = 0;
+	}
+
 	error = -EEXIST;
 	if (open_flag & O_EXCL)
 		goto exit_dput;
@@ -2287,6 +2495,7 @@ common:
 	if (error)
 		goto exit;
 	filp = nameidata_to_filp(nd);
+opened:
 	if (!IS_ERR(filp)) {
 		error = ima_file_check(filp, op->acc_mode);
 		if (error) {
diff --git a/fs/open.c b/fs/open.c
index 6acfd2d..238c5ae 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -800,6 +800,33 @@ out_err:
 EXPORT_SYMBOL_GPL(lookup_instantiate_filp);
 
 /**
+ * finish_open - set up a not fully instantiated file
+ * @od: opaque open data
+ * @dentry: pointer to dentry
+ * @open: open callback
+ *
+ * This can be used to finish opening a file passed to i_op->atomic_open() or
+ * i_op->atomic_create().
+ *
+ * If the open callback is set to NULL, then the standard f_op->open()
+ * filesystem callback is substituted.
+ */
+struct file *finish_open(struct opendata *od, struct dentry *dentry,
+			 int (*open)(struct inode *, struct file *))
+{
+	struct file *filp;
+
+	filp = *(od->filp);
+	*(od->filp) = NULL;
+
+	mntget(od->mnt);
+	dget(dentry);
+
+	return do_dentry_open(dentry, od->mnt, filp, open, current_cred());
+}
+EXPORT_SYMBOL(finish_open);
+
+/**
  * nameidata_to_filp - convert a nameidata to an open filp.
  * @nd: pointer to nameidata
  * @flags: open flags
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 69cd5bb..3a47ecd 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -412,6 +412,7 @@ struct kstatfs;
 struct vm_area_struct;
 struct vfsmount;
 struct cred;
+struct opendata;
 
 extern void __init inode_init(void);
 extern void __init inode_init_early(void);
@@ -1653,6 +1654,9 @@ struct inode_operations {
 	void (*truncate_range)(struct inode *, loff_t, loff_t);
 	int (*fiemap)(struct inode *, struct fiemap_extent_info *, u64 start,
 		      u64 len);
+	struct file * (*atomic_open)(struct inode *, struct dentry *,
+				     struct opendata *, unsigned open_flag,
+				     umode_t create_mode, bool *created);
 } ____cacheline_aligned;
 
 struct seq_file;
@@ -2027,6 +2031,8 @@ extern struct file * dentry_open(struct dentry *, struct vfsmount *, int,
 				 const struct cred *);
 extern int filp_close(struct file *, fl_owner_t id);
 extern char * getname(const char __user *);
+extern struct file *finish_open(struct opendata *od, struct dentry *dentry,
+				int (*open)(struct inode *, struct file *));
 
 /* fs/ioctl.c */
 
-- 
1.7.7



* [PATCH 05/25] vfs: add filesystem flags for atomic_open
  2012-03-07 21:22 [PATCH 00/25] vfs: atomic open RFC Miklos Szeredi
                   ` (3 preceding siblings ...)
  2012-03-07 21:22 ` [PATCH 04/25] vfs: add i_op->atomic_open() Miklos Szeredi
@ 2012-03-07 21:22 ` Miklos Szeredi
  2012-03-13  9:33   ` Christoph Hellwig
  2012-03-07 21:22 ` [PATCH 06/25] vfs: add i_op->atomic_create() Miklos Szeredi
                   ` (21 subsequent siblings)
  26 siblings, 1 reply; 55+ messages in thread
From: Miklos Szeredi @ 2012-03-07 21:22 UTC
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

From: Miklos Szeredi <mszeredi@suse.cz>

Allow the filesystem to select the cases in which it wants to perform
atomic_open.
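
The selection logic can be modelled in userspace as follows.  This is a
sketch: the two FS_NO_* values are the ones this patch defines, while
TOY_LOOKUP_OPEN, TOY_LOOKUP_CREATE and toy_is_atomic_lookup_open() are
hypothetical stand-ins for the LOOKUP_* bits in nd->flags and for the kernel
helper.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values as defined by this patch. */
#define FS_NO_LOOKUP_OPEN	0x10000	/* fs can't do atomic lookup+open */
#define FS_NO_LOOKUP_CREATE	0x20000	/* fs can't do lookup+create+open */

/* Hypothetical stand-ins for the LOOKUP_* bits in nd->flags. */
#define TOY_LOOKUP_OPEN		0x1
#define TOY_LOOKUP_CREATE	0x2

static bool toy_is_atomic_lookup_open(bool has_atomic_open, int fs_flags,
				      unsigned nd_flags)
{
	/* The two opt-out flags must be distinct bits. */
	assert((FS_NO_LOOKUP_OPEN & FS_NO_LOOKUP_CREATE) == 0);

	if (!(nd_flags & TOY_LOOKUP_OPEN) || !has_atomic_open)
		return false;

	/* Creating opens and plain opens are opted out of separately. */
	if (nd_flags & TOY_LOOKUP_CREATE)
		return !(fs_flags & FS_NO_LOOKUP_CREATE);
	return !(fs_flags & FS_NO_LOOKUP_OPEN);
}
```

Note that the flags are opt-outs: a filesystem that sets neither gets
->atomic_open() called in both cases.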

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
 fs/namei.c         |   16 +++++++++++++++-
 include/linux/fs.h |    2 ++
 2 files changed, 17 insertions(+), 1 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 835dcf1..8dfbe45 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2227,6 +2227,20 @@ look_up:
 	return NULL;
 }
 
+static bool is_atomic_lookup_open(struct inode *dir, struct nameidata *nd)
+{
+	int fs_flags;
+
+	if (!(nd->flags & LOOKUP_OPEN) || !dir->i_op->atomic_open)
+		return false;
+
+	fs_flags = dir->i_sb->s_type->fs_flags;
+	if (nd->flags & LOOKUP_CREATE)
+		return !(fs_flags & FS_NO_LOOKUP_CREATE);
+	else
+		return !(fs_flags & FS_NO_LOOKUP_OPEN);
+}
+
 /*
  * Lookup and possibly open (and create) the last component
  *
@@ -2253,7 +2267,7 @@ static struct file *lookup_open(struct nameidata *nd, struct path *path,
 	if (!need_lookup && dentry->d_inode)
 		goto out_no_open;
 
-	if ((nd->flags & LOOKUP_OPEN) && dir_inode->i_op->atomic_open) {
+	if (is_atomic_lookup_open(dir_inode, nd)) {
 		struct file *filp;
 
 		filp = atomic_open(nd, dentry, op, want_write, &create_error);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 3a47ecd..6615355 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -181,6 +181,8 @@ struct inodes_stat_t {
 #define FS_RENAME_DOES_D_MOVE	32768	/* FS will handle d_move()
 					 * during rename() internally.
 					 */
+#define FS_NO_LOOKUP_OPEN	0x10000	/* fs can't do atomic lookup+open */
+#define FS_NO_LOOKUP_CREATE	0x20000 /* fs can't do lookup+create+open */
 
 /*
  * These are the fs-independent mount-flags: up to 32 flags are supported
-- 
1.7.7


^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 06/25] vfs: add i_op->atomic_create()
  2012-03-07 21:22 [PATCH 00/25] vfs: atomic open RFC Miklos Szeredi
                   ` (4 preceding siblings ...)
  2012-03-07 21:22 ` [PATCH 05/25] vfs: add filesystem flags for atomic_open Miklos Szeredi
@ 2012-03-07 21:22 ` Miklos Szeredi
  2012-03-13  9:37   ` Christoph Hellwig
  2012-03-07 21:22 ` [PATCH 07/25] nfs: implement i_op->atomic_open() Miklos Szeredi
                   ` (20 subsequent siblings)
  26 siblings, 1 reply; 55+ messages in thread
From: Miklos Szeredi @ 2012-03-07 21:22 UTC (permalink / raw)
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

From: Miklos Szeredi <mszeredi@suse.cz>

Add a new inode operation which is called on regular file create.  This is a
replacement for ->create() which allows the file to be opened atomically with
creation.

This function is also called for non-open creates (mknod(2)) with a NULL
opendata argument.  Only one of ->create or ->atomic_create will be called;
implementing both makes no sense.

The functionality of this method partially overlaps that of ->atomic_open().
FUSE and 9P only use ->atomic_create; NFS, CIFS and CEPH use both.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
 fs/namei.c         |  118 +++++++++++++++++++++++++++++++++++++++++-----------
 include/linux/fs.h |    4 ++
 2 files changed, 97 insertions(+), 25 deletions(-)
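
Note, not part of the patch: create_open() in the hunk below has three
possible kinds of return value (an error pointer, NULL, or an open file).  A
minimal user-space sketch of how a caller tells them apart follows.  The sk_*
helpers are hypothetical stand-ins that mimic the kernel's ERR_PTR(),
PTR_ERR() and IS_ERR(); they are not the kernel definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define SK_MAX_ERRNO 4095

/* User-space imitations of ERR_PTR(), PTR_ERR() and IS_ERR(). */
static void *sk_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

static long sk_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static int sk_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-SK_MAX_ERRNO;
}

enum sk_outcome {
	SK_FAILED,		/* error pointer: nothing was created */
	SK_CREATED_NOT_OPENED,	/* NULL: created, caller must still open */
	SK_CREATED_AND_OPENED,	/* valid pointer: file is already open */
};

/* Classify a result the way do_last() and vfs_create() treat it. */
static enum sk_outcome sk_classify(const void *filp)
{
	if (sk_is_err(filp))
		return SK_FAILED;
	if (filp == NULL)
		return SK_CREATED_NOT_OPENED;
	return SK_CREATED_AND_OPENED;
}
```

In the patch, vfs_create() passes a NULL opendata and therefore expects only
the first two outcomes, which is what its BUG_ON(res != NULL) asserts.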

diff --git a/fs/namei.c b/fs/namei.c
index 8dfbe45..200cffe 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1998,27 +1998,6 @@ void unlock_rename(struct dentry *p1, struct dentry *p2)
 	}
 }
 
-int vfs_create(struct inode *dir, struct dentry *dentry, umode_t mode,
-		struct nameidata *nd)
-{
-	int error = may_create(dir, dentry);
-
-	if (error)
-		return error;
-
-	if (!dir->i_op->create)
-		return -EACCES;	/* shouldn't it be ENOSYS? */
-	mode &= S_IALLUGO;
-	mode |= S_IFREG;
-	error = security_inode_create(dir, dentry, mode);
-	if (error)
-		return error;
-	error = dir->i_op->create(dir, dentry, mode, nd);
-	if (!error)
-		fsnotify_create(dir, dentry);
-	return error;
-}
-
 static int may_open(struct path *path, int acc_mode, int flag)
 {
 	struct dentry *dentry = path->dentry;
@@ -2071,6 +2050,87 @@ static int may_open(struct path *path, int acc_mode, int flag)
 	return 0;
 }
 
+static struct file *atomic_create(struct inode *dir, struct dentry *dentry,
+				  struct opendata *od, unsigned open_flag,
+				  umode_t mode)
+{
+	struct file *filp;
+	int error;
+
+	filp = dir->i_op->atomic_create(dir, dentry, od, open_flag, mode);
+	if (IS_ERR(filp))
+		goto out;
+
+	fsnotify_create(dir, dentry);
+
+	if (!filp)
+		goto out;
+
+	/*
+	 * We don't have the inode before the open, so check open permission
+	 * here.
+	 */
+	error = may_open(&filp->f_path, MAY_OPEN, open_flag);
+	if (error)
+		goto out_fput;
+
+	error = open_check_o_direct(filp);
+	if (error)
+		goto out_fput;
+
+out:
+	return filp;
+
+out_fput:
+	fput(filp);
+	return ERR_PTR(error);
+}
+
+static struct file *create_open(struct inode *dir, struct dentry *dentry,
+				struct opendata *od, unsigned open_flag,
+				umode_t mode, struct nameidata *nd)
+{
+	int error = may_create(dir, dentry);
+	if (error)
+		goto out_err;
+
+	error = -EACCES; /* shouldn't it be ENOSYS? */
+	if (!dir->i_op->create && !dir->i_op->atomic_create)
+		goto out_err;
+	mode &= S_IALLUGO;
+	mode |= S_IFREG;
+	error = security_inode_create(dir, dentry, mode);
+	if (error)
+		goto out_err;
+	if (dir->i_op->create) {
+		error = dir->i_op->create(dir, dentry, mode, nd);
+		if (error)
+			goto out_err;
+
+		fsnotify_create(dir, dentry);
+		return NULL;
+	} else {
+		return atomic_create(dir, dentry, od, open_flag, mode);
+	}
+
+out_err:
+	return ERR_PTR(error);
+}
+
+int vfs_create(struct inode *dir, struct dentry *dentry, umode_t mode,
+	       struct nameidata *nd)
+{
+	struct file *res;
+	unsigned open_flag = O_RDONLY|O_CREAT|O_EXCL;
+
+	res = create_open(dir, dentry, NULL, open_flag, mode, NULL);
+	if (IS_ERR(res))
+		return PTR_ERR(res);
+
+	BUG_ON(res != NULL);
+	return 0;
+}
+
 static int handle_truncate(struct file *filp)
 {
 	struct path *path = &filp->f_path;
@@ -2317,7 +2377,7 @@ static struct file *do_last(struct nameidata *nd, struct path *path,
 {
 	struct dentry *dir = nd->path.dentry;
 	struct dentry *dentry;
-	int open_flag = op->open_flag;
+	int open_flag = open_to_namei_flags(op->open_flag);
 	int will_truncate = open_flag & O_TRUNC;
 	int want_write = 0;
 	int acc_mode = op->acc_mode;
@@ -2402,6 +2462,7 @@ static struct file *do_last(struct nameidata *nd, struct path *path,
 
 	/* Negative dentry, create the file if O_CREAT */
 	if (!dentry->d_inode) {
+		struct opendata od;
 		umode_t mode = op->mode;
 
 		error = -ENOENT;
@@ -2430,12 +2491,19 @@ static struct file *do_last(struct nameidata *nd, struct path *path,
 		error = security_path_mknod(&nd->path, dentry, mode, 0);
 		if (error)
 			goto exit_mutex_unlock;
-		error = vfs_create(dir->d_inode, dentry, mode, nd);
-		if (error)
-			goto exit_mutex_unlock;
+		od.mnt = nd->path.mnt;
+		od.filp = &nd->intent.open.file;
+		filp = create_open(dir->d_inode, dentry, &od, open_flag, mode,
+				   nd);
 		mutex_unlock(&dir->d_inode->i_mutex);
 		dput(nd->path.dentry);
 		nd->path.dentry = dentry;
+		if (filp) {
+			if (IS_ERR(filp))
+				goto out;
+
+			goto opened;
+		}
 		goto common;
 	}
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 6615355..af291bb 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1659,6 +1659,10 @@ struct inode_operations {
 	struct file * (*atomic_open)(struct inode *, struct dentry *,
 				     struct opendata *, unsigned open_flag,
 				     umode_t create_mode, bool *created);
+	struct file * (*atomic_create)(struct inode *, struct dentry *,
+				       struct opendata *, unsigned open_flag,
+				       umode_t create_mode);
+
 } ____cacheline_aligned;
 
 struct seq_file;
-- 
1.7.7


^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 07/25] nfs: implement i_op->atomic_open()
  2012-03-07 21:22 [PATCH 00/25] vfs: atomic open RFC Miklos Szeredi
                   ` (5 preceding siblings ...)
  2012-03-07 21:22 ` [PATCH 06/25] vfs: add i_op->atomic_create() Miklos Szeredi
@ 2012-03-07 21:22 ` Miklos Szeredi
  2012-03-07 21:22 ` [PATCH 08/25] nfs: clean up ->create in nfs_rpc_ops Miklos Szeredi
                   ` (19 subsequent siblings)
  26 siblings, 0 replies; 55+ messages in thread
From: Miklos Szeredi @ 2012-03-07 21:22 UTC (permalink / raw)
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

From: Miklos Szeredi <mszeredi@suse.cz>

Replace the NFS4-specific ->lookup implementation with an ->atomic_open
implementation and use the generic nfs_lookup for other lookups.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
 fs/nfs/dir.c |  161 ++++++++++++++++++++++++++++------------------------------
 1 files changed, 77 insertions(+), 84 deletions(-)
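
Note, not part of the patch: the way nfs_atomic_open() below maps a failed
->open_context() call onto its two exits (out_err and no_open) can be
sketched in user space like this.  The enum, the function name and the
SK_O_NOFOLLOW value are assumptions introduced for the illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Illustrative flag value; the real O_NOFOLLOW is architecture specific. */
#define SK_O_NOFOLLOW 0x1

enum sk_action {
	SK_RETURN_ERROR,	/* "goto out_err": return ERR_PTR(err) */
	SK_NO_OPEN,		/* "goto no_open": return NULL, nothing opened */
};

/*
 * Hypothetical model of the switch on PTR_ERR(inode).  *add_negative is
 * set when the patch would call d_add(dentry, NULL) before returning.
 */
static enum sk_action sk_open_error_action(int err, unsigned open_flags,
					   bool *add_negative)
{
	*add_negative = false;

	switch (err) {
	case -ENOENT:
		/* Remember the miss as a negative dentry, then fail. */
		*add_negative = true;
		return SK_RETURN_ERROR;
	case -EISDIR:
	case -ENOTDIR:
		/* Not a regular file: let the caller proceed without OPEN. */
		return SK_NO_OPEN;
	case -ELOOP:
		/* A symlink is only an error if O_NOFOLLOW was requested. */
		if (!(open_flags & SK_O_NOFOLLOW))
			return SK_NO_OPEN;
		return SK_RETURN_ERROR;
	default:
		return SK_RETURN_ERROR;
	}
}
```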

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 715a8c1..949b9e8 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -111,11 +111,15 @@ const struct inode_operations nfs3_dir_inode_operations = {
 
 #ifdef CONFIG_NFS_V4
 
-static struct dentry *nfs_atomic_lookup(struct inode *, struct dentry *, struct nameidata *);
-static int nfs_open_create(struct inode *dir, struct dentry *dentry, umode_t mode, struct nameidata *nd);
+static struct file *nfs_atomic_open(struct inode *, struct dentry *,
+				    struct opendata *, unsigned, umode_t,
+				    bool *);
+static int nfs4_create(struct inode *dir, struct dentry *dentry,
+		       umode_t mode, struct nameidata *nd);
 const struct inode_operations nfs4_dir_inode_operations = {
-	.create		= nfs_open_create,
-	.lookup		= nfs_atomic_lookup,
+	.create		= nfs4_create,
+	.lookup		= nfs_lookup,
+	.atomic_open	= nfs_atomic_open,
 	.link		= nfs_link,
 	.unlink		= nfs_unlink,
 	.symlink	= nfs_symlink,
@@ -1377,116 +1381,110 @@ static int do_open(struct inode *inode, struct file *filp)
 	return 0;
 }
 
-static int nfs_intent_set_file(struct nameidata *nd, struct nfs_open_context *ctx)
+static struct file *nfs_finish_open(struct nfs_open_context *ctx,
+				    struct dentry *dentry,
+				    struct opendata *od, unsigned open_flags)
 {
 	struct file *filp;
-	int ret = 0;
+	int err;
+
+	if (ctx->dentry != dentry) {
+		dput(ctx->dentry);
+		ctx->dentry = dget(dentry);
+	}
 
 	/* If the open_intent is for execute, we have an extra check to make */
 	if (ctx->mode & FMODE_EXEC) {
-		ret = nfs_may_open(ctx->dentry->d_inode,
-				ctx->cred,
-				nd->intent.open.flags);
-		if (ret < 0)
+		err = nfs_may_open(dentry->d_inode, ctx->cred, open_flags);
+		if (err < 0) {
+			filp = ERR_PTR(err);
 			goto out;
+		}
 	}
-	filp = lookup_instantiate_filp(nd, ctx->dentry, do_open);
-	if (IS_ERR(filp))
-		ret = PTR_ERR(filp);
-	else
+
+	filp = finish_open(od, dentry, do_open);
+	if (!IS_ERR(filp))
 		nfs_file_set_open_context(filp, ctx);
+
 out:
 	put_nfs_open_context(ctx);
-	return ret;
+	return filp;
 }
 
-static struct dentry *nfs_atomic_lookup(struct inode *dir, struct dentry *dentry, struct nameidata *nd)
+static struct file *nfs_atomic_open(struct inode *dir, struct dentry *dentry,
+				    struct opendata *od, unsigned open_flags,
+				    umode_t mode, bool *created)
 {
 	struct nfs_open_context *ctx;
-	struct iattr attr;
-	struct dentry *res = NULL;
+	struct dentry *res;
+	struct iattr attr = { .ia_valid = 0 };
 	struct inode *inode;
-	int open_flags;
+	struct file *filp;
 	int err;
 
-	dfprintk(VFS, "NFS: atomic_lookup(%s/%ld), %s\n",
+	/* Expect a negative dentry */
+	BUG_ON(dentry->d_inode);
+
+	dfprintk(VFS, "NFS: atomic_open(%s/%ld), %s\n",
 			dir->i_sb->s_id, dir->i_ino, dentry->d_name.name);
 
-	/* Check that we are indeed trying to open this file */
-	if (!is_atomic_open(nd))
+	/* NFS only supports OPEN on regular files */
+	if ((open_flags & O_DIRECTORY))
 		goto no_open;
 
-	if (dentry->d_name.len > NFS_SERVER(dir)->namelen) {
-		res = ERR_PTR(-ENAMETOOLONG);
-		goto out;
-	}
+	err = -ENAMETOOLONG;
+	if (dentry->d_name.len > NFS_SERVER(dir)->namelen)
+		goto out_err;
 
-	/* Let vfs_create() deal with O_EXCL. Instantiate, but don't hash
-	 * the dentry. */
-	if (nd->flags & LOOKUP_EXCL) {
-		d_instantiate(dentry, NULL);
-		goto out;
+	if (open_flags & O_CREAT) {
+		attr.ia_mode = mode & ~current_umask();
+		attr.ia_valid |= ATTR_MODE;
 	}
 
-	open_flags = nd->intent.open.flags;
-
 	ctx = create_nfs_open_context(dentry, open_flags);
-	res = ERR_CAST(ctx);
+	err = PTR_ERR(ctx);
 	if (IS_ERR(ctx))
-		goto out;
-
-	if (nd->flags & LOOKUP_CREATE) {
-		attr.ia_mode = nd->intent.open.create_mode;
-		attr.ia_valid = ATTR_MODE;
-		attr.ia_mode &= ~current_umask();
-	} else {
-		open_flags &= ~(O_EXCL | O_CREAT);
-		attr.ia_valid = 0;
-	}
+		goto out_err;
 
-	/* Open the file on the server */
 	nfs_block_sillyrename(dentry->d_parent);
 	inode = NFS_PROTO(dir)->open_context(dir, ctx, open_flags, &attr);
+	d_drop(dentry);
 	if (IS_ERR(inode)) {
 		nfs_unblock_sillyrename(dentry->d_parent);
 		put_nfs_open_context(ctx);
-		switch (PTR_ERR(inode)) {
-			/* Make a negative dentry */
-			case -ENOENT:
-				d_add(dentry, NULL);
-				res = NULL;
-				goto out;
-			/* This turned out not to be a regular file */
-			case -EISDIR:
-			case -ENOTDIR:
+		err = PTR_ERR(inode);
+		switch (err) {
+		case -ENOENT:
+			d_add(dentry, NULL);
+			break;
+		case -EISDIR:
+		case -ENOTDIR:
+			goto no_open;
+		case -ELOOP:
+			if (!(open_flags & O_NOFOLLOW))
 				goto no_open;
-			case -ELOOP:
-				if (!(nd->intent.open.flags & O_NOFOLLOW))
-					goto no_open;
+			break;
 			/* case -EINVAL: */
-			default:
-				res = ERR_CAST(inode);
-				goto out;
+		default:
+			break;
 		}
+		goto out_err;
 	}
 	res = d_add_unique(dentry, inode);
-	nfs_unblock_sillyrename(dentry->d_parent);
-	if (res != NULL) {
-		dput(ctx->dentry);
-		ctx->dentry = dget(res);
+	if (res != NULL)
 		dentry = res;
-	}
-	err = nfs_intent_set_file(nd, ctx);
-	if (err < 0) {
-		if (res != NULL)
-			dput(res);
-		return ERR_PTR(err);
-	}
-out:
+
+	nfs_unblock_sillyrename(dentry->d_parent);
 	nfs_set_verifier(dentry, nfs_save_change_attribute(dir));
-	return res;
+
+	filp = nfs_finish_open(ctx, dentry, od, open_flags);
+
+	dput(res);
+	return filp;
+out_err:
+	return ERR_PTR(err);
 no_open:
-	return nfs_lookup(dir, dentry, nd);
+	return NULL;
 }
 
 static int nfs4_lookup_revalidate(struct dentry *dentry, struct nameidata *nd)
@@ -1536,8 +1534,8 @@ no_open:
 	return nfs_lookup_revalidate(dentry, nd);
 }
 
-static int nfs_open_create(struct inode *dir, struct dentry *dentry,
-		umode_t mode, struct nameidata *nd)
+static int nfs4_create(struct inode *dir, struct dentry *dentry,
+		       umode_t mode, struct nameidata *nd)
 {
 	struct nfs_open_context *ctx = NULL;
 	struct iattr attr;
@@ -1561,19 +1559,14 @@ static int nfs_open_create(struct inode *dir, struct dentry *dentry,
 	error = NFS_PROTO(dir)->create(dir, dentry, &attr, open_flags, ctx);
 	if (error != 0)
 		goto out_put_ctx;
-	if (nd) {
-		error = nfs_intent_set_file(nd, ctx);
-		if (error < 0)
-			goto out_err;
-	} else {
-		put_nfs_open_context(ctx);
-	}
+
+	put_nfs_open_context(ctx);
+
 	return 0;
 out_put_ctx:
 	put_nfs_open_context(ctx);
 out_err_drop:
 	d_drop(dentry);
-out_err:
 	return error;
 }
 
-- 
1.7.7


^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 08/25] nfs: clean up ->create in nfs_rpc_ops
  2012-03-07 21:22 [PATCH 00/25] vfs: atomic open RFC Miklos Szeredi
                   ` (6 preceding siblings ...)
  2012-03-07 21:22 ` [PATCH 07/25] nfs: implement i_op->atomic_open() Miklos Szeredi
@ 2012-03-07 21:22 ` Miklos Szeredi
  2012-03-07 21:22 ` [PATCH 09/25] nfs: remove nfs4 specific create function Miklos Szeredi
                   ` (18 subsequent siblings)
  26 siblings, 0 replies; 55+ messages in thread
From: Miklos Szeredi @ 2012-03-07 21:22 UTC (permalink / raw)
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

From: Miklos Szeredi <mszeredi@suse.cz>

Don't pass the nfs_open_context to ->create().  Only the NFS4 implementation
needed that and only because it wanted to return an open file using open
intents.  That task has been replaced by ->atomic_open so it is no longer
necessary to pass the context to the create rpc operation.

Despite nfs4_proc_create apparently being okay with a NULL context, it Oopses
somewhere down the call chain.  So allocate a context here.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
 fs/nfs/dir.c            |   42 ++----------------------------------------
 fs/nfs/nfs3proc.c       |    2 +-
 fs/nfs/nfs4proc.c       |   37 ++++++++++---------------------------
 fs/nfs/proc.c           |    2 +-
 include/linux/nfs_xdr.h |    2 +-
 5 files changed, 15 insertions(+), 70 deletions(-)

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 949b9e8..24bf3c9 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -114,10 +114,8 @@ const struct inode_operations nfs3_dir_inode_operations = {
 static struct file *nfs_atomic_open(struct inode *, struct dentry *,
 				    struct opendata *, unsigned, umode_t,
 				    bool *);
-static int nfs4_create(struct inode *dir, struct dentry *dentry,
-		       umode_t mode, struct nameidata *nd);
 const struct inode_operations nfs4_dir_inode_operations = {
-	.create		= nfs4_create,
+	.create		= nfs_create,
 	.lookup		= nfs_lookup,
 	.atomic_open	= nfs_atomic_open,
 	.link		= nfs_link,
@@ -1534,42 +1532,6 @@ no_open:
 	return nfs_lookup_revalidate(dentry, nd);
 }
 
-static int nfs4_create(struct inode *dir, struct dentry *dentry,
-		       umode_t mode, struct nameidata *nd)
-{
-	struct nfs_open_context *ctx = NULL;
-	struct iattr attr;
-	int error;
-	int open_flags = O_CREAT|O_EXCL;
-
-	dfprintk(VFS, "NFS: create(%s/%ld), %s\n",
-			dir->i_sb->s_id, dir->i_ino, dentry->d_name.name);
-
-	attr.ia_mode = mode;
-	attr.ia_valid = ATTR_MODE;
-
-	if (nd)
-		open_flags = nd->intent.open.flags;
-
-	ctx = create_nfs_open_context(dentry, open_flags);
-	error = PTR_ERR(ctx);
-	if (IS_ERR(ctx))
-		goto out_err_drop;
-
-	error = NFS_PROTO(dir)->create(dir, dentry, &attr, open_flags, ctx);
-	if (error != 0)
-		goto out_put_ctx;
-
-	put_nfs_open_context(ctx);
-
-	return 0;
-out_put_ctx:
-	put_nfs_open_context(ctx);
-out_err_drop:
-	d_drop(dentry);
-	return error;
-}
-
 #endif /* CONFIG_NFSV4 */
 
 /*
@@ -1636,7 +1598,7 @@ static int nfs_create(struct inode *dir, struct dentry *dentry,
 	if (nd)
 		open_flags = nd->intent.open.flags;
 
-	error = NFS_PROTO(dir)->create(dir, dentry, &attr, open_flags, NULL);
+	error = NFS_PROTO(dir)->create(dir, dentry, &attr, open_flags);
 	if (error != 0)
 		goto out_err;
 	return 0;
diff --git a/fs/nfs/nfs3proc.c b/fs/nfs/nfs3proc.c
index 9194395..821c8bf 100644
--- a/fs/nfs/nfs3proc.c
+++ b/fs/nfs/nfs3proc.c
@@ -314,7 +314,7 @@ static void nfs3_free_createdata(struct nfs3_createdata *data)
  */
 static int
 nfs3_proc_create(struct inode *dir, struct dentry *dentry, struct iattr *sattr,
-		 int flags, struct nfs_open_context *ctx)
+		 int flags)
 {
 	struct nfs3_createdata *data;
 	umode_t mode = sattr->ia_mode;
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index ec9f6ef..f80c547 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2613,37 +2613,22 @@ static int nfs4_proc_readlink(struct inode *inode, struct page *page,
 }
 
 /*
- * Got race?
- * We will need to arrange for the VFS layer to provide an atomic open.
- * Until then, this create/open method is prone to inefficiency and race
- * conditions due to the lookup, create, and open VFS calls from sys_open()
- * placed on the wire.
- *
- * Given the above sorry state of affairs, I'm simply sending an OPEN.
- * The file will be opened again in the subsequent VFS open call
- * (nfs4_proc_file_open).
- *
- * The open for read will just hang around to be used by any process that
- * opens the file O_RDONLY. This will all be resolved with the VFS changes.
+ * This is just for mknod.  open(O_CREAT) will always do ->open_context().
  */
-
 static int
 nfs4_proc_create(struct inode *dir, struct dentry *dentry, struct iattr *sattr,
-                 int flags, struct nfs_open_context *ctx)
+		 int flags)
 {
-	struct dentry *de = dentry;
+	struct nfs_open_context *ctx;
 	struct nfs4_state *state;
-	struct rpc_cred *cred = NULL;
-	fmode_t fmode = 0;
 	int status = 0;
 
-	if (ctx != NULL) {
-		cred = ctx->cred;
-		de = ctx->dentry;
-		fmode = ctx->mode;
-	}
+	ctx = alloc_nfs_open_context(dentry, FMODE_READ);
+	if (IS_ERR(ctx))
+		return PTR_ERR(ctx);
+
 	sattr->ia_mode &= ~current_umask();
-	state = nfs4_do_open(dir, de, fmode, flags, sattr, cred);
+	state = nfs4_do_open(dir, dentry, ctx->mode, flags, sattr, ctx->cred);
 	d_drop(dentry);
 	if (IS_ERR(state)) {
 		status = PTR_ERR(state);
@@ -2651,11 +2636,9 @@ nfs4_proc_create(struct inode *dir, struct dentry *dentry, struct iattr *sattr,
 	}
 	d_add(dentry, igrab(state->inode));
 	nfs_set_verifier(dentry, nfs_save_change_attribute(dir));
-	if (ctx != NULL)
-		ctx->state = state;
-	else
-		nfs4_close_sync(state, fmode);
+	ctx->state = state;
 out:
+	put_nfs_open_context(ctx);
 	return status;
 }
 
diff --git a/fs/nfs/proc.c b/fs/nfs/proc.c
index 0c672588..60834bc 100644
--- a/fs/nfs/proc.c
+++ b/fs/nfs/proc.c
@@ -259,7 +259,7 @@ static void nfs_free_createdata(const struct nfs_createdata *data)
 
 static int
 nfs_proc_create(struct inode *dir, struct dentry *dentry, struct iattr *sattr,
-		int flags, struct nfs_open_context *ctx)
+		int flags)
 {
 	struct nfs_createdata *data;
 	struct rpc_message msg = {
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index d6ba9a1..c3df045 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -1218,7 +1218,7 @@ struct nfs_rpc_ops {
 	int	(*readlink)(struct inode *, struct page *, unsigned int,
 			    unsigned int);
 	int	(*create)  (struct inode *, struct dentry *,
-			    struct iattr *, int, struct nfs_open_context *);
+			    struct iattr *, int);
 	int	(*remove)  (struct inode *, struct qstr *);
 	void	(*unlink_setup)  (struct rpc_message *, struct inode *dir);
 	int	(*unlink_done) (struct rpc_task *, struct inode *);
-- 
1.7.7


^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 09/25] nfs: remove nfs4 specific create function
  2012-03-07 21:22 [PATCH 00/25] vfs: atomic open RFC Miklos Szeredi
                   ` (7 preceding siblings ...)
  2012-03-07 21:22 ` [PATCH 08/25] nfs: clean up ->create in nfs_rpc_ops Miklos Szeredi
@ 2012-03-07 21:22 ` Miklos Szeredi
  2012-03-13 12:09   ` Christoph Hellwig
  2012-03-07 21:22 ` [PATCH 10/25] nfs: don't use nd->intent.open.flags Miklos Szeredi
                   ` (17 subsequent siblings)
  26 siblings, 1 reply; 55+ messages in thread
From: Miklos Szeredi @ 2012-03-07 21:22 UTC (permalink / raw)
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

From: Miklos Szeredi <mszeredi@suse.cz>

Make nfs_atomic_open() work for non-open creates.  This is trivial to do and
allows the NFSv4 specific create code to be removed.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
 fs/nfs/dir.c      |   28 ++++++++++++++++++++--------
 fs/nfs/nfs4proc.c |   31 -------------------------------
 2 files changed, 20 insertions(+), 39 deletions(-)

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 24bf3c9..8627965 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -114,10 +114,13 @@ const struct inode_operations nfs3_dir_inode_operations = {
 static struct file *nfs_atomic_open(struct inode *, struct dentry *,
 				    struct opendata *, unsigned, umode_t,
 				    bool *);
+static struct file *nfs_atomic_open_common(struct inode *, struct dentry *,
+					   struct opendata *, unsigned,
+					   umode_t);
 const struct inode_operations nfs4_dir_inode_operations = {
-	.create		= nfs_create,
 	.lookup		= nfs_lookup,
 	.atomic_open	= nfs_atomic_open,
+	.atomic_create	= nfs_atomic_open_common, /* called for mknod */
 	.link		= nfs_link,
 	.unlink		= nfs_unlink,
 	.symlink	= nfs_symlink,
@@ -1383,7 +1386,7 @@ static struct file *nfs_finish_open(struct nfs_open_context *ctx,
 				    struct dentry *dentry,
 				    struct opendata *od, unsigned open_flags)
 {
-	struct file *filp;
+	struct file *filp = NULL;
 	int err;
 
 	if (ctx->dentry != dentry) {
@@ -1400,18 +1403,20 @@ static struct file *nfs_finish_open(struct nfs_open_context *ctx,
 		}
 	}
 
-	filp = finish_open(od, dentry, do_open);
-	if (!IS_ERR(filp))
-		nfs_file_set_open_context(filp, ctx);
+	if (od) {
+		filp = finish_open(od, dentry, do_open);
+		if (!IS_ERR(filp))
+			nfs_file_set_open_context(filp, ctx);
+	}
 
 out:
 	put_nfs_open_context(ctx);
 	return filp;
 }
 
-static struct file *nfs_atomic_open(struct inode *dir, struct dentry *dentry,
-				    struct opendata *od, unsigned open_flags,
-				    umode_t mode, bool *created)
+static struct file *nfs_atomic_open_common(struct inode *dir,
+			struct dentry *dentry, struct opendata *od,
+			unsigned open_flags, umode_t mode)
 {
 	struct nfs_open_context *ctx;
 	struct dentry *res;
@@ -1485,6 +1490,13 @@ no_open:
 	return NULL;
 }
 
+static struct file *nfs_atomic_open(struct inode *dir, struct dentry *dentry,
+				    struct opendata *od, unsigned open_flags,
+				    umode_t mode, bool *created)
+{
+	return nfs_atomic_open_common(dir, dentry, od, open_flags, mode);
+}
+
 static int nfs4_lookup_revalidate(struct dentry *dentry, struct nameidata *nd)
 {
 	struct dentry *parent = NULL;
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index f80c547..a0f169a 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2612,36 +2612,6 @@ static int nfs4_proc_readlink(struct inode *inode, struct page *page,
 	return err;
 }
 
-/*
- * This is just for mknod.  open(O_CREAT) will always do ->open_context().
- */
-static int
-nfs4_proc_create(struct inode *dir, struct dentry *dentry, struct iattr *sattr,
-		 int flags)
-{
-	struct nfs_open_context *ctx;
-	struct nfs4_state *state;
-	int status = 0;
-
-	ctx = alloc_nfs_open_context(dentry, FMODE_READ);
-	if (IS_ERR(ctx))
-		return PTR_ERR(ctx);
-
-	sattr->ia_mode &= ~current_umask();
-	state = nfs4_do_open(dir, dentry, ctx->mode, flags, sattr, ctx->cred);
-	d_drop(dentry);
-	if (IS_ERR(state)) {
-		status = PTR_ERR(state);
-		goto out;
-	}
-	d_add(dentry, igrab(state->inode));
-	nfs_set_verifier(dentry, nfs_save_change_attribute(dir));
-	ctx->state = state;
-out:
-	put_nfs_open_context(ctx);
-	return status;
-}
-
 static int _nfs4_proc_remove(struct inode *dir, struct qstr *name)
 {
 	struct nfs_server *server = NFS_SERVER(dir);
@@ -6240,7 +6210,6 @@ const struct nfs_rpc_ops nfs_v4_clientops = {
 	.lookup		= nfs4_proc_lookup,
 	.access		= nfs4_proc_access,
 	.readlink	= nfs4_proc_readlink,
-	.create		= nfs4_proc_create,
 	.remove		= nfs4_proc_remove,
 	.unlink_setup	= nfs4_proc_unlink_setup,
 	.unlink_done	= nfs4_proc_unlink_done,
-- 
1.7.7


^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 10/25] nfs: don't use nd->intent.open.flags
  2012-03-07 21:22 [PATCH 00/25] vfs: atomic open RFC Miklos Szeredi
                   ` (8 preceding siblings ...)
  2012-03-07 21:22 ` [PATCH 09/25] nfs: remove nfs4 specific create function Miklos Szeredi
@ 2012-03-07 21:22 ` Miklos Szeredi
  2012-03-07 21:22 ` [PATCH 11/25] nfs: don't use intents for checking atomic open Miklos Szeredi
                   ` (16 subsequent siblings)
  26 siblings, 0 replies; 55+ messages in thread
From: Miklos Szeredi @ 2012-03-07 21:22 UTC (permalink / raw)
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

From: Miklos Szeredi <mszeredi@suse.cz>

Check LOOKUP_EXCL in nd->flags instead of nd->intent.open.flags; that is
basically what the open intent flags were used for.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
 fs/nfs/dir.c |    9 ++++-----
 1 files changed, 4 insertions(+), 5 deletions(-)
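
Note, not part of the patch: the flag selection that nfs_create() performs
after this change can be sketched in user space as below.  The SK_* constants
and the function name are made up for the illustration.  The default of
O_CREAT|O_EXCL is an assumption: it matches the removed nfs4_create(), but
the initializer in nfs_create() is outside the hunk shown here.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only; not the kernel's constants. */
#define SK_LOOKUP_EXCL	0x1
#define SK_O_CREAT	0x40
#define SK_O_EXCL	0x80

/*
 * Hypothetical model of the open_flags choice in nfs_create().  have_nd
 * models "nd != NULL", nd_flags models nd->flags.
 */
static int sk_create_open_flags(bool have_nd, unsigned nd_flags)
{
	/* Assumed default: exclusive create. */
	int open_flags = SK_O_CREAT | SK_O_EXCL;

	/* Only a lookup without LOOKUP_EXCL relaxes it to a plain create. */
	if (have_nd && !(nd_flags & SK_LOOKUP_EXCL))
		open_flags = SK_O_CREAT;

	return open_flags;
}
```

A NULL nameidata therefore keeps the exclusive default, as does a nameidata
that carries LOOKUP_EXCL.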

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 8627965..cd35abb 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -1502,7 +1502,7 @@ static int nfs4_lookup_revalidate(struct dentry *dentry, struct nameidata *nd)
 	struct dentry *parent = NULL;
 	struct inode *inode;
 	struct inode *dir;
-	int openflags, ret = 0;
+	int ret = 0;
 
 	if (nd->flags & LOOKUP_RCU)
 		return -ECHILD;
@@ -1526,9 +1526,8 @@ static int nfs4_lookup_revalidate(struct dentry *dentry, struct nameidata *nd)
 	/* NFS only supports OPEN on regular files */
 	if (!S_ISREG(inode->i_mode))
 		goto no_open_dput;
-	openflags = nd->intent.open.flags;
 	/* We cannot do exclusive creation on a positive dentry */
-	if ((openflags & (O_CREAT|O_EXCL)) == (O_CREAT|O_EXCL))
+	if (nd && nd->flags & LOOKUP_EXCL)
 		goto no_open_dput;
 
 	/* Let f_op->open() actually open (and revalidate) the file */
@@ -1607,8 +1606,8 @@ static int nfs_create(struct inode *dir, struct dentry *dentry,
 	attr.ia_mode = mode;
 	attr.ia_valid = ATTR_MODE;
 
-	if (nd)
-		open_flags = nd->intent.open.flags;
+	if (nd && !(nd->flags & LOOKUP_EXCL))
+		open_flags = O_CREAT;
 
 	error = NFS_PROTO(dir)->create(dir, dentry, &attr, open_flags);
 	if (error != 0)
-- 
1.7.7


^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 11/25] nfs: don't use intents for checking atomic open
  2012-03-07 21:22 [PATCH 00/25] vfs: atomic open RFC Miklos Szeredi
                   ` (9 preceding siblings ...)
  2012-03-07 21:22 ` [PATCH 10/25] nfs: don't use nd->intent.open.flags Miklos Szeredi
@ 2012-03-07 21:22 ` Miklos Szeredi
  2012-03-24 14:20   ` Christoph Hellwig
  2012-03-07 21:22 ` [PATCH 12/25] fuse: implement i_op->atomic_create() Miklos Szeredi
                   ` (15 subsequent siblings)
  26 siblings, 1 reply; 55+ messages in thread
From: Miklos Szeredi @ 2012-03-07 21:22 UTC (permalink / raw)
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

From: Miklos Szeredi <mszeredi@suse.cz>

is_atomic_open() is now only used by nfs4_lookup_revalidate() to check whether
it's okay to skip normal revalidation.

It does a racy check for mount read-onlyness and falls back to normal
revalidation if the open would fail.  This makes little sense now that this
function isn't used for determining whether to actually open the file or not.

The d_mountpoint() check still makes sense since it is an indication that we
might be following a mount and so open may not revalidate the dentry.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
 fs/nfs/dir.c |   24 ++++--------------------
 1 files changed, 4 insertions(+), 20 deletions(-)

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index cd35abb..36a12b4 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -1343,24 +1343,6 @@ const struct dentry_operations nfs4_dentry_operations = {
 	.d_release	= nfs_d_release,
 };
 
-/*
- * Use intent information to determine whether we need to substitute
- * the NFSv4-style stateful OPEN for the LOOKUP call
- */
-static int is_atomic_open(struct nameidata *nd)
-{
-	if (nd == NULL || nfs_lookup_check_intent(nd, LOOKUP_OPEN) == 0)
-		return 0;
-	/* NFS does not (yet) have a stateful open for directories */
-	if (nd->flags & LOOKUP_DIRECTORY)
-		return 0;
-	/* Are we trying to write to a read only partition? */
-	if (__mnt_is_readonly(nd->path.mnt) &&
-	    (nd->intent.open.flags & (O_CREAT|O_TRUNC|O_ACCMODE)))
-		return 0;
-	return 1;
-}
-
 static fmode_t flags_to_mode(int flags)
 {
 	fmode_t res = (__force fmode_t)flags & FMODE_EXEC;
@@ -1507,10 +1489,12 @@ static int nfs4_lookup_revalidate(struct dentry *dentry, struct nameidata *nd)
 	if (nd->flags & LOOKUP_RCU)
 		return -ECHILD;
 
-	inode = dentry->d_inode;
-	if (!is_atomic_open(nd) || d_mountpoint(dentry))
+	if (!(nd->flags & LOOKUP_OPEN) || (nd->flags & LOOKUP_DIRECTORY))
+		goto no_open;
+	if (d_mountpoint(dentry))
 		goto no_open;
 
+	inode = dentry->d_inode;
 	parent = dget_parent(dentry);
 	dir = parent->d_inode;
 
-- 
1.7.7


^ permalink raw reply related	[flat|nested] 55+ messages in thread

* [PATCH 12/25] fuse: implement i_op->atomic_create()
  2012-03-07 21:22 [PATCH 00/25] vfs: atomic open RFC Miklos Szeredi
                   ` (10 preceding siblings ...)
  2012-03-07 21:22 ` [PATCH 11/25] nfs: don't use intents for checking atomic open Miklos Szeredi
@ 2012-03-07 21:22 ` Miklos Szeredi
  2012-03-07 21:22 ` [PATCH 13/25] cifs: implement i_op->atomic_open() and i_op->atomic_create() Miklos Szeredi
                   ` (14 subsequent siblings)
  26 siblings, 0 replies; 55+ messages in thread
From: Miklos Szeredi @ 2012-03-07 21:22 UTC (permalink / raw)
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

From: Miklos Szeredi <mszeredi@suse.cz>

Replace fuse's ->create implementation with a ->atomic_create implementation.
No functionality is changed.
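
The converted fuse_create_open() in the diff below has three possible
outcomes: an open struct file * on success, NULL when the file was created
through the mknod fallback without being opened, and an ERR_PTR()-encoded
errno on failure.  A minimal userspace sketch of that convention follows.
The err_ptr()/is_err()/ptr_err() helpers only imitate the kernel macros, and
struct file, try_create_open() and its parameters are hypothetical stand-ins
for illustration, not the FUSE code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace imitations of the kernel's ERR_PTR()/IS_ERR()/PTR_ERR(). */
#define MAX_ERRNO 4095

static inline void *err_ptr(long err) { return (void *)(intptr_t)err; }
static inline long ptr_err(const void *p) { return (long)(intptr_t)p; }
static inline int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical stand-in for the kernel's struct file. */
struct file { int opened; };

static struct file the_file;

/*
 * Hypothetical model of an ->atomic_create implementation:
 *  - want_open == 0 models a NULL opendata (plain create, no open)
 *  - server_has_create == 0 models a server that lacks create+open
 *  - mknod_err is the result of the fallback mknod (0 or a negative errno)
 */
static struct file *try_create_open(int want_open, int server_has_create,
				    int o_direct, int mknod_err)
{
	assert(mknod_err <= 0);

	if (!want_open || !server_has_create) {
		/* Fallback: create only; the caller must open separately. */
		if (mknod_err)
			return err_ptr(mknod_err);
		return NULL;
	}
	if (o_direct)
		return err_ptr(-EINVAL);

	the_file.opened = 1;
	return &the_file;
}
```

A caller would test is_err() first, then treat NULL as "created, not yet
opened" and any other pointer as the opened file.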

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
 fs/fuse/dir.c |   57 +++++++++++++++++++++++++++++----------------------------
 1 files changed, 29 insertions(+), 28 deletions(-)

diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index 2066328..584385e 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -363,14 +363,16 @@ static struct dentry *fuse_lookup(struct inode *dir, struct dentry *entry,
 	return ERR_PTR(err);
 }
 
+static int fuse_mknod(struct inode *, struct dentry *, umode_t, dev_t);
 /*
  * Atomic create+open operation
  *
  * If the filesystem doesn't support this, then fall back to separate
  * 'mknod' + 'open' requests.
  */
-static int fuse_create_open(struct inode *dir, struct dentry *entry,
-			    umode_t mode, struct nameidata *nd)
+static struct file *fuse_create_open(struct inode *dir, struct dentry *entry,
+				     struct opendata *od, unsigned flags,
+				     umode_t mode)
 {
 	int err;
 	struct inode *inode;
@@ -382,17 +384,18 @@ static int fuse_create_open(struct inode *dir, struct dentry *entry,
 	struct fuse_entry_out outentry;
 	struct fuse_file *ff;
 	struct file *file;
-	int flags = nd->intent.open.flags;
 
-	if (fc->no_create)
-		return -ENOSYS;
+	if (!od || fc->no_create)
+		goto no_open;
 
+	err = -EINVAL;
 	if (flags & O_DIRECT)
-		return -EINVAL;
+		goto out_err;
 
 	forget = fuse_alloc_forget();
+	err = -ENOMEM;
 	if (!forget)
-		return -ENOMEM;
+		goto out_err;
 
 	req = fuse_get_req(fc);
 	err = PTR_ERR(req);
@@ -432,8 +435,10 @@ static int fuse_create_open(struct inode *dir, struct dentry *entry,
 	fuse_request_send(fc, req);
 	err = req->out.h.error;
 	if (err) {
-		if (err == -ENOSYS)
+		if (err == -ENOSYS) {
 			fc->no_create = 1;
+			goto no_open;
+		}
 		goto out_free_ff;
 	}
 
@@ -451,20 +456,21 @@ static int fuse_create_open(struct inode *dir, struct dentry *entry,
 		flags &= ~(O_CREAT | O_EXCL | O_TRUNC);
 		fuse_sync_release(ff, flags);
 		fuse_queue_forget(fc, forget, outentry.nodeid, 1);
-		return -ENOMEM;
+		err = -ENOMEM;
+		goto out_err;
 	}
 	kfree(forget);
 	d_instantiate(entry, inode);
 	fuse_change_entry_timeout(entry, &outentry);
 	fuse_invalidate_attr(dir);
-	file = lookup_instantiate_filp(nd, entry, generic_file_open);
+	file = finish_open(od, entry, generic_file_open);
 	if (IS_ERR(file)) {
 		fuse_sync_release(ff, flags);
-		return PTR_ERR(file);
+	} else {
+		file->private_data = fuse_file_get(ff);
+		fuse_finish_open(inode, file);
 	}
-	file->private_data = fuse_file_get(ff);
-	fuse_finish_open(inode, file);
-	return 0;
+	return file;
 
  out_free_ff:
 	fuse_file_free(ff);
@@ -472,7 +478,14 @@ static int fuse_create_open(struct inode *dir, struct dentry *entry,
 	fuse_put_request(fc, req);
  out_put_forget_req:
 	kfree(forget);
-	return err;
+out_err:
+	return ERR_PTR(err);
+
+no_open:
+	err = fuse_mknod(dir, entry, mode, 0);
+	if (err)
+		goto out_err;
+	return NULL;
 }
 
 /*
@@ -573,18 +586,6 @@ static int fuse_mknod(struct inode *dir, struct dentry *entry, umode_t mode,
 	return create_new_entry(fc, req, dir, entry, mode);
 }
 
-static int fuse_create(struct inode *dir, struct dentry *entry, umode_t mode,
-		       struct nameidata *nd)
-{
-	if (nd) {
-		int err = fuse_create_open(dir, entry, mode, nd);
-		if (err != -ENOSYS)
-			return err;
-		/* Fall back on mknod */
-	}
-	return fuse_mknod(dir, entry, mode, 0);
-}
-
 static int fuse_mkdir(struct inode *dir, struct dentry *entry, umode_t mode)
 {
 	struct fuse_mkdir_in inarg;
@@ -1631,7 +1632,7 @@ static const struct inode_operations fuse_dir_inode_operations = {
 	.rename		= fuse_rename,
 	.link		= fuse_link,
 	.setattr	= fuse_setattr,
-	.create		= fuse_create,
+	.atomic_create	= fuse_create_open,
 	.mknod		= fuse_mknod,
 	.permission	= fuse_permission,
 	.getattr	= fuse_getattr,
-- 
1.7.7



* [PATCH 13/25] cifs: implement i_op->atomic_open() and i_op->atomic_create()
  2012-03-07 21:22 [PATCH 00/25] vfs: atomic open RFC Miklos Szeredi
                   ` (11 preceding siblings ...)
  2012-03-07 21:22 ` [PATCH 12/25] fuse: implement i_op->atomic_create() Miklos Szeredi
@ 2012-03-07 21:22 ` Miklos Szeredi
  2012-03-13 12:06   ` Christoph Hellwig
  2012-03-07 21:22 ` [PATCH 14/25] ceph: remove unused arg from ceph_lookup_open() Miklos Szeredi
                   ` (13 subsequent siblings)
  26 siblings, 1 reply; 55+ messages in thread
From: Miklos Szeredi @ 2012-03-07 21:22 UTC (permalink / raw)
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

From: Miklos Szeredi <mszeredi@suse.cz>

Replace CIFS's ->create operation with ->atomic_open and ->atomic_create.  Also
move the relevant code from ->lookup into the create function.

CIFS currently only does atomic open for O_CREAT, but it wants to do that as
early as possible, without first calling ->lookup, so it uses ->atomic_open,
just like NFS.
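
Part of the code moved out of ->lookup is the path separator check, which the
diff below factors into a check_name() helper shared by lookup and atomic
open.  A rough userspace sketch of that check, assuming a plain boolean in
place of the CIFS_MOUNT_POSIX_PATHS mount flag; check_component() and its
parameters are hypothetical names, not the CIFS code:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Userspace model of the separator check: unless the mount uses POSIX
 * paths, a backslash inside a single path component is rejected.
 * posix_paths stands in for the CIFS_MOUNT_POSIX_PATHS mount flag.
 */
static int check_component(const char *name, size_t len, bool posix_paths)
{
	size_t i;

	assert(name != NULL);
	if (posix_paths)
		return 0;
	for (i = 0; i < len; i++) {
		if (name[i] == '\\')
			return -EINVAL;
	}
	return 0;
}
```

The length is passed explicitly because dentry names are length-counted, so
the sketch does not rely on NUL termination.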

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
 fs/cifs/cifsfs.c |   24 +++++++-
 fs/cifs/cifsfs.h |    5 +-
 fs/cifs/dir.c    |  172 +++++++++++++++++++++++-------------------------------
 3 files changed, 98 insertions(+), 103 deletions(-)

diff --git a/fs/cifs/cifsfs.c b/fs/cifs/cifsfs.c
index b1fd382..694bc3d 100644
--- a/fs/cifs/cifsfs.c
+++ b/fs/cifs/cifsfs.c
@@ -763,15 +763,35 @@ static int cifs_setlease(struct file *file, long arg, struct file_lock **lease)
 		return -EAGAIN;
 }
 
+static struct file *cifs_create(struct inode *dir, struct dentry *dentry,
+				struct opendata *od, unsigned flags,
+				umode_t mode)
+{
+	bool created = true;
+	return cifs_atomic_open(dir, dentry, od, flags, mode, &created);
+}
+
 struct file_system_type cifs_fs_type = {
 	.owner = THIS_MODULE,
 	.name = "cifs",
 	.mount = cifs_do_mount,
 	.kill_sb = cifs_kill_sb,
-	/*  .fs_flags */
+
+	/* Posix open is only called (at lookup time) for file create now.  For
+	 * opens (rather than creates), because we do not know if it is a file
+	 * or directory yet, and current Samba no longer allows us to do posix
+	 * open on dirs, we could end up wasting an open call on what turns out
+	 * to be a dir. For file opens, we wait to call posix open till
+	 * cifs_open.  It could be added to atomic_open in the future but the
+	 * performance tradeoff of the extra network request when EISDIR or
+	 * EACCES is returned would have to be weighed against the 50% reduction
+	 * in network traffic in the other paths.
+	 */
+	.fs_flags = FS_NO_LOOKUP_OPEN,
 };
 const struct inode_operations cifs_dir_inode_ops = {
-	.create = cifs_create,
+	.atomic_open = cifs_atomic_open,
+	.atomic_create = cifs_create,
 	.lookup = cifs_lookup,
 	.getattr = cifs_getattr,
 	.unlink = cifs_unlink,
diff --git a/fs/cifs/cifsfs.h b/fs/cifs/cifsfs.h
index fe5ecf1..16aa162 100644
--- a/fs/cifs/cifsfs.h
+++ b/fs/cifs/cifsfs.h
@@ -44,8 +44,9 @@ extern const struct address_space_operations cifs_addr_ops_smallbuf;
 /* Functions related to inodes */
 extern const struct inode_operations cifs_dir_inode_ops;
 extern struct inode *cifs_root_iget(struct super_block *);
-extern int cifs_create(struct inode *, struct dentry *, umode_t,
-		       struct nameidata *);
+extern struct file *cifs_atomic_open(struct inode *, struct dentry *,
+				     struct opendata *, unsigned, umode_t,
+				     bool *);
 extern struct dentry *cifs_lookup(struct inode *, struct dentry *,
 				  struct nameidata *);
 extern int cifs_unlink(struct inode *dir, struct dentry *dentry);
diff --git a/fs/cifs/dir.c b/fs/cifs/dir.c
index 63a196b..507cc67 100644
--- a/fs/cifs/dir.c
+++ b/fs/cifs/dir.c
@@ -133,17 +133,39 @@ cifs_bp_rename_retry:
 	return full_path;
 }
 
+/*
+ * Don't allow the separator character in a path component.
+ * The VFS will not allow "/", but "\" is allowed by posix.
+ */
+static int
+check_name(struct dentry *direntry)
+{
+	struct cifs_sb_info *cifs_sb = CIFS_SB(direntry->d_sb);
+	int i;
+
+	if (!(cifs_sb->mnt_cifs_flags & CIFS_MOUNT_POSIX_PATHS)) {
+		for (i = 0; i < direntry->d_name.len; i++) {
+			if (direntry->d_name.name[i] == '\\') {
+				cFYI(1, "Invalid file name");
+				return -EINVAL;
+			}
+		}
+	}
+	return 0;
+}
+
+
 /* Inode operations in similar order to how they appear in Linux file fs.h */
 
-int
-cifs_create(struct inode *inode, struct dentry *direntry, umode_t mode,
-		struct nameidata *nd)
+struct file *
+cifs_atomic_open(struct inode *inode, struct dentry *direntry,
+		 struct opendata *od, unsigned oflags, umode_t mode,
+		 bool *created)
 {
 	int rc = -ENOENT;
 	int xid;
 	int create_options = CREATE_NOT_DIR;
 	__u32 oplock = 0;
-	int oflags;
 	/*
 	 * BB below access is probably too much for mknod to request
 	 *    but we have to do query and setpathinfo so requesting
@@ -160,23 +182,30 @@ cifs_create(struct inode *inode, struct dentry *direntry, umode_t mode,
 	FILE_ALL_INFO *buf = NULL;
 	struct inode *newinode = NULL;
 	int disposition = FILE_OVERWRITE_IF;
+	struct file *filp;
+
+	rc = check_name(direntry);
+	if (rc)
+		return ERR_PTR(rc);
 
 	xid = GetXid();
 
+	cFYI(1, "parent inode = 0x%p name is: %s and dentry = 0x%p",
+	     inode, direntry->d_name.name, direntry);
+
 	cifs_sb = CIFS_SB(inode->i_sb);
 	tlink = cifs_sb_tlink(cifs_sb);
 	if (IS_ERR(tlink)) {
 		FreeXid(xid);
-		return PTR_ERR(tlink);
+		return ERR_CAST(tlink);
 	}
 	tcon = tlink_tcon(tlink);
 
 	if (enable_oplocks)
 		oplock = REQ_OPLOCK;
 
-	if (nd)
-		oflags = nd->intent.open.file->f_flags;
-	else
+	/* Why not O_CREAT|O_EXCL? */
+	if (!od)
 		oflags = O_RDONLY | O_CREAT;
 
 	full_path = build_path_from_dentry(direntry);
@@ -186,6 +215,7 @@ cifs_create(struct inode *inode, struct dentry *direntry, umode_t mode,
 	}
 
 	if (tcon->unix_ext && (tcon->ses->capabilities & CAP_UNIX) &&
+	    !tcon->broken_posix_open &&
 	    (CIFS_UNIX_POSIX_PATH_OPS_CAP &
 			le64_to_cpu(tcon->fsUnixInfo.Capability))) {
 		rc = cifs_posix_open(full_path, &newinode,
@@ -207,9 +237,19 @@ cifs_create(struct inode *inode, struct dentry *direntry, umode_t mode,
 		   case where server does not support this SMB level, and
 		   falsely claims capability (also get here for DFS case
 		   which should be rare for path not covered on files) */
+
+		/*
+		 * The check below works around a bug in POSIX
+		 * open in samba versions 3.3.1 and earlier where
+		 * open could incorrectly fail with invalid parameter.
+		 * If either that or op not supported returned, follow
+		 * the normal lookup.
+		 */
+		if ((rc == -EINVAL) || (rc != -EOPNOTSUPP))
+			tcon->broken_posix_open = true;
 	}
 
-	if (nd) {
+	if (od) {
 		/* if the file is going to stay open, then we
 		   need to set the desired access properly */
 		desiredAccess = 0;
@@ -278,6 +318,7 @@ cifs_create(struct inode *inode, struct dentry *direntry, umode_t mode,
 				.device	= 0,
 		};
 
+		*created = true;
 		if (cifs_sb->mnt_cifs_flags & CIFS_MOUNT_SET_UID) {
 			args.uid = (__u64) current_fsuid();
 			if (inode->i_mode & S_ISGID)
@@ -321,16 +362,17 @@ cifs_create_get_file_info:
 	}
 
 cifs_create_set_dentry:
-	if (rc == 0)
-		d_instantiate(direntry, newinode);
-	else
+	if (rc != 0) {
 		cFYI(1, "Create worked, get_inode_info failed rc = %d", rc);
+		goto cifs_create_out;
+	}
+	d_drop(direntry);
+	d_add(direntry, newinode);
 
-	if (newinode && nd) {
+	if (newinode && od) {
 		struct cifsFileInfo *pfile_info;
-		struct file *filp;
 
-		filp = lookup_instantiate_filp(nd, direntry, generic_file_open);
+		filp = finish_open(od, direntry, generic_file_open);
 		if (IS_ERR(filp)) {
 			rc = PTR_ERR(filp);
 			CIFSSMBClose(xid, tcon, fileHandle);
@@ -339,20 +381,25 @@ cifs_create_set_dentry:
 
 		pfile_info = cifs_new_fileinfo(fileHandle, filp, tlink, oplock);
 		if (pfile_info == NULL) {
-			fput(filp);
 			CIFSSMBClose(xid, tcon, fileHandle);
+			fput(filp);
 			rc = -ENOMEM;
+			goto cifs_create_out;
 		}
 	} else {
 		CIFSSMBClose(xid, tcon, fileHandle);
+		filp = NULL;
 	}
-
-cifs_create_out:
+out:
 	kfree(buf);
 	kfree(full_path);
 	cifs_put_tlink(tlink);
 	FreeXid(xid);
-	return rc;
+	return filp;
+
+cifs_create_out:
+	filp = ERR_PTR(rc);
+	goto out;
 }
 
 int cifs_mknod(struct inode *inode, struct dentry *direntry, umode_t mode,
@@ -492,16 +539,11 @@ cifs_lookup(struct inode *parent_dir_inode, struct dentry *direntry,
 {
 	int xid;
 	int rc = 0; /* to get around spurious gcc warning, set to zero here */
-	__u32 oplock = enable_oplocks ? REQ_OPLOCK : 0;
-	__u16 fileHandle = 0;
-	bool posix_open = false;
 	struct cifs_sb_info *cifs_sb;
 	struct tcon_link *tlink;
 	struct cifs_tcon *pTcon;
-	struct cifsFileInfo *cfile;
 	struct inode *newInode = NULL;
 	char *full_path = NULL;
-	struct file *filp;
 
 	xid = GetXid();
 
@@ -518,29 +560,9 @@ cifs_lookup(struct inode *parent_dir_inode, struct dentry *direntry,
 	}
 	pTcon = tlink_tcon(tlink);
 
-	/*
-	 * Don't allow the separator character in a path component.
-	 * The VFS will not allow "/", but "\" is allowed by posix.
-	 */
-	if (!(cifs_sb->mnt_cifs_flags & CIFS_MOUNT_POSIX_PATHS)) {
-		int i;
-		for (i = 0; i < direntry->d_name.len; i++)
-			if (direntry->d_name.name[i] == '\\') {
-				cFYI(1, "Invalid file name");
-				rc = -EINVAL;
-				goto lookup_out;
-			}
-	}
-
-	/*
-	 * O_EXCL: optimize away the lookup, but don't hash the dentry. Let
-	 * the VFS handle the create.
-	 */
-	if (nd && (nd->flags & LOOKUP_EXCL)) {
-		d_instantiate(direntry, NULL);
-		rc = 0;
+	rc = check_name(direntry);
+	if (rc)
 		goto lookup_out;
-	}
 
 	/* can not grab the rename sem here since it would
 	deadlock in the cases (beginning of sys_rename itself)
@@ -558,64 +580,16 @@ cifs_lookup(struct inode *parent_dir_inode, struct dentry *direntry,
 	}
 	cFYI(1, "Full path: %s inode = 0x%p", full_path, direntry->d_inode);
 
-	/* Posix open is only called (at lookup time) for file create now.
-	 * For opens (rather than creates), because we do not know if it
-	 * is a file or directory yet, and current Samba no longer allows
-	 * us to do posix open on dirs, we could end up wasting an open call
-	 * on what turns out to be a dir. For file opens, we wait to call posix
-	 * open till cifs_open.  It could be added here (lookup) in the future
-	 * but the performance tradeoff of the extra network request when EISDIR
-	 * or EACCES is returned would have to be weighed against the 50%
-	 * reduction in network traffic in the other paths.
-	 */
 	if (pTcon->unix_ext) {
-		if (nd && !(nd->flags & LOOKUP_DIRECTORY) &&
-		     (nd->flags & LOOKUP_OPEN) && !pTcon->broken_posix_open &&
-		     (nd->intent.open.file->f_flags & O_CREAT)) {
-			rc = cifs_posix_open(full_path, &newInode,
-					parent_dir_inode->i_sb,
-					nd->intent.open.create_mode,
-					nd->intent.open.file->f_flags, &oplock,
-					&fileHandle, xid);
-			/*
-			 * The check below works around a bug in POSIX
-			 * open in samba versions 3.3.1 and earlier where
-			 * open could incorrectly fail with invalid parameter.
-			 * If either that or op not supported returned, follow
-			 * the normal lookup.
-			 */
-			if ((rc == 0) || (rc == -ENOENT))
-				posix_open = true;
-			else if ((rc == -EINVAL) || (rc != -EOPNOTSUPP))
-				pTcon->broken_posix_open = true;
-		}
-		if (!posix_open)
-			rc = cifs_get_inode_info_unix(&newInode, full_path,
-						parent_dir_inode->i_sb, xid);
-	} else
+		rc = cifs_get_inode_info_unix(&newInode, full_path,
+					      parent_dir_inode->i_sb, xid);
+	} else {
 		rc = cifs_get_inode_info(&newInode, full_path, NULL,
-				parent_dir_inode->i_sb, xid, NULL);
+					 parent_dir_inode->i_sb, xid, NULL);
+	}
 
 	if ((rc == 0) && (newInode != NULL)) {
 		d_add(direntry, newInode);
-		if (posix_open) {
-			filp = lookup_instantiate_filp(nd, direntry,
-						       generic_file_open);
-			if (IS_ERR(filp)) {
-				rc = PTR_ERR(filp);
-				CIFSSMBClose(xid, pTcon, fileHandle);
-				goto lookup_out;
-			}
-
-			cfile = cifs_new_fileinfo(fileHandle, filp, tlink,
-						  oplock);
-			if (cfile == NULL) {
-				fput(filp);
-				CIFSSMBClose(xid, pTcon, fileHandle);
-				rc = -ENOMEM;
-				goto lookup_out;
-			}
-		}
 		/* since paths are not looked up by component - the parent
 		   directories are presumed to be good here */
 		renew_parental_timestamps(direntry);
-- 
1.7.7



* [PATCH 14/25] ceph: remove unused arg from ceph_lookup_open()
  2012-03-07 21:22 [PATCH 00/25] vfs: atomic open RFC Miklos Szeredi
                   ` (12 preceding siblings ...)
  2012-03-07 21:22 ` [PATCH 13/25] cifs: implement i_op->atomic_open() and i_op->atomic_create() Miklos Szeredi
@ 2012-03-07 21:22 ` Miklos Szeredi
  2012-03-07 21:22 ` [PATCH 15/25] ceph: implement i_op->atomic_open() and i_op->atomic_create() Miklos Szeredi
                   ` (12 subsequent siblings)
  26 siblings, 0 replies; 55+ messages in thread
From: Miklos Szeredi @ 2012-03-07 21:22 UTC (permalink / raw)
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

From: Miklos Szeredi <mszeredi@suse.cz>

Remove the locked_dir argument of ceph_lookup_open(); the function never uses
it.  What was its purpose?

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
 fs/ceph/dir.c   |    4 ++--
 fs/ceph/file.c  |    3 +--
 fs/ceph/super.h |    3 +--
 3 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
index 3e8094b..c4b7832 100644
--- a/fs/ceph/dir.c
+++ b/fs/ceph/dir.c
@@ -599,7 +599,7 @@ static struct dentry *ceph_lookup(struct inode *dir, struct dentry *dentry,
 	    (nd->flags & LOOKUP_OPEN) &&
 	    !(nd->intent.open.flags & O_CREAT)) {
 		int mode = nd->intent.open.create_mode & ~current->fs->umask;
-		return ceph_lookup_open(dir, dentry, nd, mode, 1);
+		return ceph_lookup_open(dir, dentry, nd, mode);
 	}
 
 	/* can we conclude ENOENT locally? */
@@ -710,7 +710,7 @@ static int ceph_create(struct inode *dir, struct dentry *dentry, umode_t mode,
 
 	if (nd) {
 		BUG_ON((nd->flags & LOOKUP_OPEN) == 0);
-		dentry = ceph_lookup_open(dir, dentry, nd, mode, 0);
+		dentry = ceph_lookup_open(dir, dentry, nd, mode);
 		/* hrm, what should i do here if we get aliased? */
 		if (IS_ERR(dentry))
 			return PTR_ERR(dentry);
diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index ed72428..2fe9a3e 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -220,8 +220,7 @@ out:
  *  path_lookup_create -> LOOKUP_OPEN|LOOKUP_CREATE
  */
 struct dentry *ceph_lookup_open(struct inode *dir, struct dentry *dentry,
-				struct nameidata *nd, int mode,
-				int locked_dir)
+				struct nameidata *nd, int mode)
 {
 	struct ceph_fs_client *fsc = ceph_sb_to_client(dir->i_sb);
 	struct ceph_mds_client *mdsc = fsc->mdsc;
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index 1421f3d..c6b2cba 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -805,8 +805,7 @@ extern int ceph_copy_from_page_vector(struct page **pages,
 extern struct page **ceph_alloc_page_vector(int num_pages, gfp_t flags);
 extern int ceph_open(struct inode *inode, struct file *file);
 extern struct dentry *ceph_lookup_open(struct inode *dir, struct dentry *dentry,
-				       struct nameidata *nd, int mode,
-				       int locked_dir);
+				       struct nameidata *nd, int mode);
 extern int ceph_release(struct inode *inode, struct file *filp);
 
 /* dir.c */
-- 
1.7.7



* [PATCH 15/25] ceph: implement i_op->atomic_open() and i_op->atomic_create()
  2012-03-07 21:22 [PATCH 00/25] vfs: atomic open RFC Miklos Szeredi
                   ` (13 preceding siblings ...)
  2012-03-07 21:22 ` [PATCH 14/25] ceph: remove unused arg from ceph_lookup_open() Miklos Szeredi
@ 2012-03-07 21:22 ` Miklos Szeredi
  2012-03-07 21:22 ` [PATCH 16/25] 9p: implement i_op->atomic_create() Miklos Szeredi
                   ` (11 subsequent siblings)
  26 siblings, 0 replies; 55+ messages in thread
From: Miklos Szeredi @ 2012-03-07 21:22 UTC (permalink / raw)
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

From: Miklos Szeredi <mszeredi@suse.cz>

Instead of calling ceph_lookup_open() from ->lookup and ->create, call it from
->atomic_open and ->atomic_create.

CEPH does non-create open in ->atomic_open and create-open in ->atomic_create.
To prevent unnecessary calls to ->atomic_open, the FS_NO_LOOKUP_CREATE flag is
set in the filesystem flags so that ->atomic_open is only called on non-create
opens.
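
The gating that the filesystem flags provide can be sketched as a small
predicate.  This is only an illustration: the bit values below are
hypothetical (the real definitions come from an earlier patch in this
series), wants_atomic_open() is an invented name, and the meaning given to
FS_NO_LOOKUP_OPEN is inferred from the CIFS patch, which sets it because CIFS
only does atomic open for O_CREAT.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>

/* Hypothetical bit values, chosen for this sketch only. */
#define FS_NO_LOOKUP_OPEN   (1u << 0)	/* assumed: skip ->atomic_open for plain opens */
#define FS_NO_LOOKUP_CREATE (1u << 1)	/* skip ->atomic_open for O_CREAT opens */

/* Decide whether the VFS would try ->atomic_open for this open. */
static bool wants_atomic_open(unsigned fs_flags, unsigned open_flags)
{
	bool creating = (open_flags & O_CREAT) != 0;

	/* Only the two modelled bits are understood here. */
	assert((fs_flags & ~(FS_NO_LOOKUP_OPEN | FS_NO_LOOKUP_CREATE)) == 0);

	if (creating)
		return !(fs_flags & FS_NO_LOOKUP_CREATE);
	return !(fs_flags & FS_NO_LOOKUP_OPEN);
}
```

With FS_NO_LOOKUP_CREATE set, as in this patch, an O_CREAT open skips
->atomic_open and is left to ->atomic_create, while a plain open still goes
through ->atomic_open.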

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
 fs/ceph/dir.c   |   33 +++++++++------------------------
 fs/ceph/file.c  |   22 +++++++++++-----------
 fs/ceph/super.c |    2 +-
 fs/ceph/super.h |    5 +++--
 4 files changed, 24 insertions(+), 38 deletions(-)

diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
index c4b7832..62b10e7 100644
--- a/fs/ceph/dir.c
+++ b/fs/ceph/dir.c
@@ -594,14 +594,6 @@ static struct dentry *ceph_lookup(struct inode *dir, struct dentry *dentry,
 	if (err < 0)
 		return ERR_PTR(err);
 
-	/* open (but not create!) intent? */
-	if (nd &&
-	    (nd->flags & LOOKUP_OPEN) &&
-	    !(nd->intent.open.flags & O_CREAT)) {
-		int mode = nd->intent.open.create_mode & ~current->fs->umask;
-		return ceph_lookup_open(dir, dentry, nd, mode);
-	}
-
 	/* can we conclude ENOENT locally? */
 	if (dentry->d_inode == NULL) {
 		struct ceph_inode_info *ci = ceph_inode(dir);
@@ -699,26 +691,18 @@ static int ceph_mknod(struct inode *dir, struct dentry *dentry,
 	return err;
 }
 
-static int ceph_create(struct inode *dir, struct dentry *dentry, umode_t mode,
-		       struct nameidata *nd)
+struct file *ceph_create(struct inode *dir, struct dentry *dentry,
+			 struct opendata *od, unsigned flags, umode_t mode)
 {
-	dout("create in dir %p dentry %p name '%.*s'\n",
-	     dir, dentry, dentry->d_name.len, dentry->d_name.name);
+	bool created = true;
 
-	if (ceph_snap(dir) != CEPH_NOSNAP)
-		return -EROFS;
+	if (!od) {
+		int err = ceph_mknod(dir, dentry, mode, 0);
 
-	if (nd) {
-		BUG_ON((nd->flags & LOOKUP_OPEN) == 0);
-		dentry = ceph_lookup_open(dir, dentry, nd, mode);
-		/* hrm, what should i do here if we get aliased? */
-		if (IS_ERR(dentry))
-			return PTR_ERR(dentry);
-		return 0;
+		return err ? ERR_PTR(err) : NULL;
 	}
 
-	/* fall back to mknod */
-	return ceph_mknod(dir, dentry, (mode & ~S_IFMT) | S_IFREG, 0);
+	return ceph_lookup_open(dir, dentry, od, flags, mode, &created);
 }
 
 static int ceph_symlink(struct inode *dir, struct dentry *dentry,
@@ -1356,7 +1340,8 @@ const struct inode_operations ceph_dir_iops = {
 	.unlink = ceph_unlink,
 	.rmdir = ceph_unlink,
 	.rename = ceph_rename,
-	.create = ceph_create,
+	.atomic_create = ceph_create,
+	.atomic_open = ceph_lookup_open,
 };
 
 const struct dentry_operations ceph_dentry_ops = {
diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index 2fe9a3e..6a00f89 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -214,21 +214,16 @@ out:
  * may_open() fails, the struct *file gets cleaned up (i.e.
  * ceph_release gets called).  So fear not!
  */
-/*
- * flags
- *  path_lookup_open   -> LOOKUP_OPEN
- *  path_lookup_create -> LOOKUP_OPEN|LOOKUP_CREATE
- */
-struct dentry *ceph_lookup_open(struct inode *dir, struct dentry *dentry,
-				struct nameidata *nd, int mode)
+struct file *ceph_lookup_open(struct inode *dir, struct dentry *dentry,
+			      struct opendata *od, unsigned flags, umode_t mode,
+			      bool *created)
 {
 	struct ceph_fs_client *fsc = ceph_sb_to_client(dir->i_sb);
 	struct ceph_mds_client *mdsc = fsc->mdsc;
-	struct file *file;
+	struct file *file = NULL;
 	struct ceph_mds_request *req;
 	struct dentry *ret;
 	int err;
-	int flags = nd->intent.open.flags;
 
 	dout("ceph_lookup_open dentry %p '%.*s' flags %d mode 0%o\n",
 	     dentry, dentry->d_name.len, dentry->d_name.name, flags, mode);
@@ -254,14 +249,19 @@ struct dentry *ceph_lookup_open(struct inode *dir, struct dentry *dentry,
 		err = ceph_handle_notrace_create(dir, dentry);
 	if (err)
 		goto out;
-	file = lookup_instantiate_filp(nd, req->r_dentry, ceph_open);
+	file = finish_open(od, req->r_dentry, ceph_open);
 	if (IS_ERR(file))
 		err = PTR_ERR(file);
 out:
 	ret = ceph_finish_lookup(req, dentry, err);
 	ceph_mdsc_put_request(req);
 	dout("ceph_lookup_open result=%p\n", ret);
-	return ret;
+
+	if (IS_ERR(ret))
+		return ERR_CAST(ret);
+
+	dput(ret);
+	return err ? ERR_PTR(err) : file;
 }
 
 int ceph_release(struct inode *inode, struct file *file)
diff --git a/fs/ceph/super.c b/fs/ceph/super.c
index 00de2c9..6e87fc2 100644
--- a/fs/ceph/super.c
+++ b/fs/ceph/super.c
@@ -915,7 +915,7 @@ static struct file_system_type ceph_fs_type = {
 	.name		= "ceph",
 	.mount		= ceph_mount,
 	.kill_sb	= ceph_kill_sb,
-	.fs_flags	= FS_RENAME_DOES_D_MOVE,
+	.fs_flags	= FS_RENAME_DOES_D_MOVE | FS_NO_LOOKUP_CREATE,
 };
 
 #define _STRINGIFY(x) #x
diff --git a/fs/ceph/super.h b/fs/ceph/super.h
index c6b2cba..cf66773 100644
--- a/fs/ceph/super.h
+++ b/fs/ceph/super.h
@@ -804,8 +804,9 @@ extern int ceph_copy_from_page_vector(struct page **pages,
 				    loff_t off, size_t len);
 extern struct page **ceph_alloc_page_vector(int num_pages, gfp_t flags);
 extern int ceph_open(struct inode *inode, struct file *file);
-extern struct dentry *ceph_lookup_open(struct inode *dir, struct dentry *dentry,
-				       struct nameidata *nd, int mode);
+extern struct file *ceph_lookup_open(struct inode *dir, struct dentry *dentry,
+				     struct opendata *od, unsigned flags,
+				     umode_t mode, bool *);
 extern int ceph_release(struct inode *inode, struct file *filp);
 
 /* dir.c */
-- 
1.7.7



* [PATCH 16/25] 9p: implement i_op->atomic_create()
  2012-03-07 21:22 [PATCH 00/25] vfs: atomic open RFC Miklos Szeredi
                   ` (14 preceding siblings ...)
  2012-03-07 21:22 ` [PATCH 15/25] ceph: implement i_op->atomic_open() and i_op->atomic_create() Miklos Szeredi
@ 2012-03-07 21:22 ` Miklos Szeredi
  2012-03-07 21:22 ` [PATCH 17/25] vfs: remove open intents from nameidata Miklos Szeredi
                   ` (10 subsequent siblings)
  26 siblings, 0 replies; 55+ messages in thread
From: Miklos Szeredi @ 2012-03-07 21:22 UTC (permalink / raw)
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

From: Miklos Szeredi <mszeredi@suse.cz>

Replace 9p's ->create implementations with ->atomic_create implementations.  No
functionality is changed.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
 fs/9p/vfs_inode.c      |   28 ++++++++++++----------------
 fs/9p/vfs_inode_dotl.c |   36 ++++++++++++++++++------------------
 2 files changed, 30 insertions(+), 34 deletions(-)

diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
index 014c8dd..3d526a7 100644
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -716,18 +716,16 @@ error:
  * @dir: directory inode that is being created
  * @dentry:  dentry that is being deleted
  * @mode: create permissions
- * @nd: path information
  *
  */
 
-static int
-v9fs_vfs_create(struct inode *dir, struct dentry *dentry, umode_t mode,
-		struct nameidata *nd)
+static struct file *
+v9fs_vfs_create(struct inode *dir, struct dentry *dentry, struct opendata *od,
+		unsigned flags, umode_t mode)
 {
 	int err;
 	u32 perm;
-	int flags;
-	struct file *filp;
+	struct file *filp = NULL;
 	struct v9fs_inode *v9inode;
 	struct v9fs_session_info *v9ses;
 	struct p9_fid *fid, *inode_fid;
@@ -736,10 +734,8 @@ v9fs_vfs_create(struct inode *dir, struct dentry *dentry, umode_t mode,
 	fid = NULL;
 	v9ses = v9fs_inode2v9ses(dir);
 	perm = unixmode2p9mode(v9ses, mode);
-	if (nd)
-		flags = nd->intent.open.flags;
-	else
-		flags = O_RDWR;
+	if (!od)
+		flags = O_RDWR; /* Why not O_RDONLY|O_CREAT|O_EXCL? */
 
 	fid = v9fs_create(v9ses, dir, dentry, NULL, perm,
 				v9fs_uflags2omode(flags,
@@ -752,7 +748,7 @@ v9fs_vfs_create(struct inode *dir, struct dentry *dentry, umode_t mode,
 
 	v9fs_invalidate_inode_attr(dir);
 	/* if we are opening a file, assign the open fid to the file */
-	if (nd) {
+	if (od) {
 		v9inode = V9FS_I(dentry->d_inode);
 		mutex_lock(&v9inode->v_mutex);
 		if (v9ses->cache && !v9inode->writeback_fid &&
@@ -773,7 +769,7 @@ v9fs_vfs_create(struct inode *dir, struct dentry *dentry, umode_t mode,
 			v9inode->writeback_fid = (void *) inode_fid;
 		}
 		mutex_unlock(&v9inode->v_mutex);
-		filp = lookup_instantiate_filp(nd, dentry, generic_file_open);
+		filp = finish_open(od, dentry, generic_file_open);
 		if (IS_ERR(filp)) {
 			err = PTR_ERR(filp);
 			goto error;
@@ -787,13 +783,13 @@ v9fs_vfs_create(struct inode *dir, struct dentry *dentry, umode_t mode,
 	} else
 		p9_client_clunk(fid);
 
-	return 0;
+	return filp;
 
 error:
 	if (fid)
 		p9_client_clunk(fid);
 
-	return err;
+	return ERR_PTR(err);
 }
 
 /**
@@ -1486,7 +1482,7 @@ out:
 }
 
 static const struct inode_operations v9fs_dir_inode_operations_dotu = {
-	.create = v9fs_vfs_create,
+	.atomic_create = v9fs_vfs_create,
 	.lookup = v9fs_vfs_lookup,
 	.symlink = v9fs_vfs_symlink,
 	.link = v9fs_vfs_link,
@@ -1500,7 +1496,7 @@ static const struct inode_operations v9fs_dir_inode_operations_dotu = {
 };
 
 static const struct inode_operations v9fs_dir_inode_operations = {
-	.create = v9fs_vfs_create,
+	.atomic_create = v9fs_vfs_create,
 	.lookup = v9fs_vfs_lookup,
 	.unlink = v9fs_vfs_unlink,
 	.mkdir = v9fs_vfs_mkdir,
diff --git a/fs/9p/vfs_inode_dotl.c b/fs/9p/vfs_inode_dotl.c
index a1e6c99..6e37911 100644
--- a/fs/9p/vfs_inode_dotl.c
+++ b/fs/9p/vfs_inode_dotl.c
@@ -248,17 +248,15 @@ int v9fs_open_to_dotl_flags(int flags)
  * @dir: directory inode that is being created
  * @dentry:  dentry that is being deleted
  * @mode: create permissions
- * @nd: path information
  *
  */
 
-static int
-v9fs_vfs_create_dotl(struct inode *dir, struct dentry *dentry, umode_t omode,
-		struct nameidata *nd)
+static struct file *
+v9fs_vfs_create_dotl(struct inode *dir, struct dentry *dentry,
+		     struct opendata *od, unsigned flags,  umode_t omode)
 {
 	int err = 0;
 	gid_t gid;
-	int flags;
 	umode_t mode;
 	char *name = NULL;
 	struct file *filp;
@@ -271,15 +269,16 @@ v9fs_vfs_create_dotl(struct inode *dir, struct dentry *dentry, umode_t omode,
 	struct posix_acl *pacl = NULL, *dacl = NULL;
 
 	v9ses = v9fs_inode2v9ses(dir);
-	if (nd)
-		flags = nd->intent.open.flags;
-	else {
+	if (!od) {
 		/*
-		 * create call without LOOKUP_OPEN is due
-		 * to mknod of regular files. So use mknod
-		 * operation.
+		 * create call without filp is due to mknod of regular files.
+		 * So use mknod operation.
 		 */
-		return v9fs_vfs_mknod_dotl(dir, dentry, omode, 0);
+		err = v9fs_vfs_mknod_dotl(dir, dentry, omode, 0);
+		if (err)
+			goto err_return;
+
+		return NULL;
 	}
 
 	name = (char *) dentry->d_name.name;
@@ -290,7 +289,7 @@ v9fs_vfs_create_dotl(struct inode *dir, struct dentry *dentry, umode_t omode,
 	if (IS_ERR(dfid)) {
 		err = PTR_ERR(dfid);
 		p9_debug(P9_DEBUG_VFS, "fid lookup failed %d\n", err);
-		return err;
+		goto err_return;
 	}
 
 	/* clone a fid to use for creation */
@@ -298,7 +297,7 @@ v9fs_vfs_create_dotl(struct inode *dir, struct dentry *dentry, umode_t omode,
 	if (IS_ERR(ofid)) {
 		err = PTR_ERR(ofid);
 		p9_debug(P9_DEBUG_VFS, "p9_client_walk failed %d\n", err);
-		return err;
+		goto err_return;
 	}
 
 	gid = v9fs_get_fsgid_for_create(dir);
@@ -363,7 +362,7 @@ v9fs_vfs_create_dotl(struct inode *dir, struct dentry *dentry, umode_t omode,
 	}
 	mutex_unlock(&v9inode->v_mutex);
 	/* Since we are opening a file, assign the open fid to the file */
-	filp = lookup_instantiate_filp(nd, dentry, generic_file_open);
+	filp = finish_open(od, dentry, generic_file_open);
 	if (IS_ERR(filp)) {
 		err = PTR_ERR(filp);
 		goto err_clunk_old_fid;
@@ -373,7 +372,7 @@ v9fs_vfs_create_dotl(struct inode *dir, struct dentry *dentry, umode_t omode,
 	if (v9ses->cache)
 		v9fs_cache_inode_set_cookie(inode, filp);
 #endif
-	return 0;
+	return filp;
 
 error:
 	if (fid)
@@ -382,7 +381,8 @@ err_clunk_old_fid:
 	if (ofid)
 		p9_client_clunk(ofid);
 	v9fs_set_create_acl(NULL, &dacl, &pacl);
-	return err;
+err_return:
+	return ERR_PTR(err);
 }
 
 /**
@@ -999,7 +999,7 @@ out:
 }
 
 const struct inode_operations v9fs_dir_inode_operations_dotl = {
-	.create = v9fs_vfs_create_dotl,
+	.atomic_create = v9fs_vfs_create_dotl,
 	.lookup = v9fs_vfs_lookup,
 	.link = v9fs_vfs_link_dotl,
 	.symlink = v9fs_vfs_symlink_dotl,
-- 
1.7.7


* [PATCH 17/25] vfs: remove open intents from nameidata
  2012-03-07 21:22 [PATCH 00/25] vfs: atomic open RFC Miklos Szeredi
                   ` (15 preceding siblings ...)
  2012-03-07 21:22 ` [PATCH 16/25] 9p: implement i_op->atomic_create() Miklos Szeredi
@ 2012-03-07 21:22 ` Miklos Szeredi
  2012-03-07 21:22 ` [PATCH 18/25] vfs: only retry last component if opening stale dentry Miklos Szeredi
                   ` (9 subsequent siblings)
  26 siblings, 0 replies; 55+ messages in thread
From: Miklos Szeredi @ 2012-03-07 21:22 UTC (permalink / raw)
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

From: Miklos Szeredi <mszeredi@suse.cz>

All users of open intents have been converted to use ->atomic_{open,create}.

This patch gets rid of nd->intent.open and related infrastructure.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
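A minimal userspace sketch of the ownership rule for od->filp after this
patch, assuming nothing beyond what the hunks below show: path_openat()
allocates the file, finish_open() takes it out of the opendata, and the
tail of path_openat() releases it only if nobody claimed it.  All names
(model_open, get_empty_file, put_file, live_files, the two-field structs)
are hypothetical stand-ins, not kernel definitions, and the failure path
inside do_dentry_open() is not modelled.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct file / struct opendata. */
struct file { int opened; };
struct opendata { struct file *filp; };

static int live_files;	/* files allocated and not yet released */

static struct file *get_empty_file(void)
{
	struct file *f = calloc(1, sizeof(*f));

	if (f)
		live_files++;
	return f;
}

static void put_file(struct file *f)
{
	free(f);
	live_files--;
}

/* Models finish_open() in this patch: ownership moves out of od. */
static struct file *model_finish_open(struct opendata *od)
{
	struct file *f = od->filp;

	od->filp = NULL;
	f->opened = 1;
	return f;
}

/* Models the tail of path_openat(): drop the file if it was never claimed. */
static void model_cleanup(struct opendata *od)
{
	if (od->filp) {
		/* stands in for BUG_ON(od.filp->f_path.dentry) */
		assert(!od->filp->opened);
		put_file(od->filp);
		od->filp = NULL;
	}
}

/* Returns the opened file, or NULL if the walk failed before finish_open. */
static struct file *model_open(int lookup_fails)
{
	struct opendata od;
	struct file *res = NULL;

	od.filp = get_empty_file();
	if (!od.filp)
		return NULL;
	if (!lookup_fails)
		res = model_finish_open(&od);
	model_cleanup(&od);
	return res;
}
```

In this model a successful open leaves exactly one live file, owned by the
caller, and a failed walk leaves none, so no separate release_open_intent()
style helper is needed.
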
 fs/internal.h         |    5 +--
 fs/namei.c            |  109 +++++++++++++++++++++---------------------------
 fs/open.c             |   97 +++++--------------------------------------
 include/linux/namei.h |   11 -----
 4 files changed, 61 insertions(+), 161 deletions(-)

diff --git a/fs/internal.h b/fs/internal.h
index 10143de..6d71416 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -84,12 +84,9 @@ extern struct super_block *user_get_super(dev_t);
 /*
  * open.c
  */
-struct nameidata;
-extern struct file *nameidata_to_filp(struct nameidata *);
-extern void release_open_intent(struct nameidata *);
 struct opendata {
 	struct vfsmount *mnt;
-	struct file **filp;
+	struct file *filp;
 };
 struct open_flags {
 	int open_flag;
diff --git a/fs/namei.c b/fs/namei.c
index 200cffe..ff21a67 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -476,22 +476,6 @@ err_root:
 	return -ECHILD;
 }
 
-/**
- * release_open_intent - free up open intent resources
- * @nd: pointer to nameidata
- */
-void release_open_intent(struct nameidata *nd)
-{
-	struct file *file = nd->intent.open.file;
-
-	if (file && !IS_ERR(file)) {
-		if (file->f_path.dentry == NULL)
-			put_filp(file);
-		else
-			fput(file);
-	}
-}
-
 static inline int d_revalidate(struct dentry *dentry, struct nameidata *nd)
 {
 	return dentry->d_op->d_revalidate(dentry, nd);
@@ -2174,6 +2158,7 @@ static int may_o_create(struct path *dir, struct dentry *dentry, umode_t mode)
 }
 
 static struct file *atomic_open(struct nameidata *nd, struct dentry *dentry,
+				struct opendata *od,
 				const struct open_flags *op,
 				int *want_write, int *create_error)
 {
@@ -2183,7 +2168,6 @@ static struct file *atomic_open(struct nameidata *nd, struct dentry *dentry,
 	int error;
 	bool created = false;
 	int acc_mode;
-	struct opendata od;
 	struct file *filp;
 
 	BUG_ON(dentry->d_inode);
@@ -2245,9 +2229,8 @@ static struct file *atomic_open(struct nameidata *nd, struct dentry *dentry,
 	if (nd->flags & LOOKUP_DIRECTORY)
 		open_flag |= O_DIRECTORY;
 
-	od.mnt = nd->path.mnt;
-	od.filp = &nd->intent.open.file;
-	filp = dir->i_op->atomic_open(dir, dentry, &od, open_flag, mode,
+	od->mnt = nd->path.mnt;
+	filp = dir->i_op->atomic_open(dir, dentry, od, open_flag, mode,
 				      &created);
 	if (IS_ERR(filp)) {
 		if (*create_error && PTR_ERR(filp) == -ENOENT)
@@ -2310,6 +2293,7 @@ static bool is_atomic_lookup_open(struct inode *dir, struct nameidata *nd)
  * was performed, only lookup.
  */
 static struct file *lookup_open(struct nameidata *nd, struct path *path,
+				struct opendata *od,
 				const struct open_flags *op, int *want_write)
 {
 	struct dentry *dir = nd->path.dentry;
@@ -2330,7 +2314,8 @@ static struct file *lookup_open(struct nameidata *nd, struct path *path,
 	if (is_atomic_lookup_open(dir_inode, nd)) {
 		struct file *filp;
 
-		filp = atomic_open(nd, dentry, op, want_write, &create_error);
+		filp = atomic_open(nd, dentry, od, op, want_write,
+				   &create_error);
 		if (filp) {
 			dput(dentry);
 			return filp;
@@ -2373,15 +2358,16 @@ out_dput:
  * Handle the last step of open()
  */
 static struct file *do_last(struct nameidata *nd, struct path *path,
-			    const struct open_flags *op, const char *pathname)
+			    struct opendata *od, const struct open_flags *op,
+			    const char *pathname)
 {
 	struct dentry *dir = nd->path.dentry;
 	struct dentry *dentry;
+	struct file *filp;
 	int open_flag = open_to_namei_flags(op->open_flag);
 	int will_truncate = open_flag & O_TRUNC;
 	int want_write = 0;
 	int acc_mode = op->acc_mode;
-	struct file *filp;
 	struct inode *inode;
 	int symlink_ok = 0;
 	int error;
@@ -2449,7 +2435,7 @@ static struct file *do_last(struct nameidata *nd, struct path *path,
 
 	mutex_lock(&dir->d_inode->i_mutex);
 
-	filp = lookup_open(nd, path, op, &want_write);
+	filp = lookup_open(nd, path, od, op, &want_write);
 	if (filp) {
 		mutex_unlock(&dir->d_inode->i_mutex);
 		if (IS_ERR(filp))
@@ -2462,7 +2448,6 @@ static struct file *do_last(struct nameidata *nd, struct path *path,
 
 	/* Negative dentry, create the file if O_CREAT */
 	if (!dentry->d_inode) {
-		struct opendata od;
 		umode_t mode = op->mode;
 
 		error = -ENOENT;
@@ -2476,7 +2461,7 @@ static struct file *do_last(struct nameidata *nd, struct path *path,
 		 * rw->ro transition does not occur between
 		 * the time when the file is created and when
 		 * a permanent write count is taken through
-		 * the 'struct file' in nameidata_to_filp().
+		 * the 'struct file' in finish_open().
 		 */
 		if (!want_write) {
 			error = mnt_want_write(nd->path.mnt);
@@ -2491,9 +2476,8 @@ static struct file *do_last(struct nameidata *nd, struct path *path,
 		error = security_path_mknod(&nd->path, dentry, mode, 0);
 		if (error)
 			goto exit_mutex_unlock;
-		od.mnt = nd->path.mnt;
-		od.filp = &nd->intent.open.file;
-		filp = create_open(dir->d_inode, dentry, &od, open_flag, mode,
+		od->mnt = nd->path.mnt;
+		filp = create_open(dir->d_inode, dentry, od, open_flag, mode,
 				   nd);
 		mutex_unlock(&dir->d_inode->i_mutex);
 		dput(nd->path.dentry);
@@ -2576,23 +2560,22 @@ common:
 	error = may_open(&nd->path, acc_mode, open_flag);
 	if (error)
 		goto exit;
-	filp = nameidata_to_filp(nd);
+	od->mnt = nd->path.mnt;
+	filp = finish_open(od, nd->path.dentry, NULL);
+	if (IS_ERR(filp))
+		goto out;
+	error = open_check_o_direct(filp);
+	if (error)
+		goto exit_fput;
 opened:
-	if (!IS_ERR(filp)) {
-		error = ima_file_check(filp, op->acc_mode);
-		if (error) {
-			fput(filp);
-			filp = ERR_PTR(error);
-		}
-	}
-	if (!IS_ERR(filp)) {
-		if (will_truncate) {
-			error = handle_truncate(filp);
-			if (error) {
-				fput(filp);
-				filp = ERR_PTR(error);
-			}
-		}
+	error = ima_file_check(filp, op->acc_mode);
+	if (error)
+		goto exit_fput;
+
+	if (will_truncate) {
+		error = handle_truncate(filp);
+		if (error)
+			goto exit_fput;
 	}
 out:
 	if (want_write)
@@ -2608,6 +2591,10 @@ exit:
 	filp = ERR_PTR(error);
 	goto out;
 
+exit_fput:
+	fput(filp);
+	goto exit;
+
 terminate:
 	terminate_walk(nd);
 	return ERR_PTR(error);
@@ -2617,18 +2604,16 @@ static struct file *path_openat(int dfd, const char *pathname,
 		struct nameidata *nd, const struct open_flags *op, int flags)
 {
 	struct file *base = NULL;
-	struct file *filp;
+	struct opendata od;
+	struct file *res;
 	struct path path;
 	int error;
 
-	filp = get_empty_filp();
-	if (!filp)
+	od.filp = get_empty_filp();
+	if (!od.filp)
 		return ERR_PTR(-ENFILE);
 
-	filp->f_flags = op->open_flag;
-	nd->intent.open.file = filp;
-	nd->intent.open.flags = open_to_namei_flags(op->open_flag);
-	nd->intent.open.create_mode = op->mode;
+	od.filp->f_flags = op->open_flag;
 
 	error = path_init(dfd, pathname, flags | LOOKUP_PARENT, nd, &base);
 	if (unlikely(error))
@@ -2639,23 +2624,23 @@ static struct file *path_openat(int dfd, const char *pathname,
 	if (unlikely(error))
 		goto out_filp;
 
-	filp = do_last(nd, &path, op, pathname);
-	while (unlikely(!filp)) { /* trailing symlink */
+	res = do_last(nd, &path, &od, op, pathname);
+	while (unlikely(!res)) { /* trailing symlink */
 		struct path link = path;
 		void *cookie;
 		if (!(nd->flags & LOOKUP_FOLLOW)) {
 			path_put_conditional(&path, nd);
 			path_put(&nd->path);
-			filp = ERR_PTR(-ELOOP);
+			res = ERR_PTR(-ELOOP);
 			break;
 		}
 		nd->flags |= LOOKUP_PARENT;
 		nd->flags &= ~(LOOKUP_OPEN|LOOKUP_CREATE|LOOKUP_EXCL);
 		error = follow_link(&link, nd, &cookie);
 		if (unlikely(error))
-			filp = ERR_PTR(error);
+			res = ERR_PTR(error);
 		else
-			filp = do_last(nd, &path, op, pathname);
+			res = do_last(nd, &path, &od, op, pathname);
 		put_link(nd, &link, cookie);
 	}
 out:
@@ -2663,11 +2648,14 @@ out:
 		path_put(&nd->root);
 	if (base)
 		fput(base);
-	release_open_intent(nd);
-	return filp;
+	if (od.filp) {
+		BUG_ON(od.filp->f_path.dentry);
+		put_filp(od.filp);
+	}
+	return res;
 
 out_filp:
-	filp = ERR_PTR(error);
+	res = ERR_PTR(error);
 	goto out;
 }
 
@@ -2723,7 +2711,6 @@ struct dentry *kern_path_create(int dfd, const char *pathname, struct path *path
 		goto out;
 	nd.flags &= ~LOOKUP_PARENT;
 	nd.flags |= LOOKUP_CREATE | LOOKUP_EXCL;
-	nd.intent.open.flags = O_EXCL;
 
 	/*
 	 * Do the final lookup.
diff --git a/fs/open.c b/fs/open.c
index 238c5ae..b51afcc 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -743,62 +743,6 @@ cleanup_file:
 	return ERR_PTR(error);
 }
 
-static struct file *__dentry_open(struct dentry *dentry, struct vfsmount *mnt,
-				struct file *f,
-				int (*open)(struct inode *, struct file *),
-				const struct cred *cred)
-{
-	struct file *res = do_dentry_open(dentry, mnt, f, open, cred);
-	if (!IS_ERR(res)) {
-		int error = open_check_o_direct(f);
-		if (error) {
-			fput(res);
-			res = ERR_PTR(error);
-		}
-	}
-	return res;
-}
-
-/**
- * lookup_instantiate_filp - instantiates the open intent filp
- * @nd: pointer to nameidata
- * @dentry: pointer to dentry
- * @open: open callback
- *
- * Helper for filesystems that want to use lookup open intents and pass back
- * a fully instantiated struct file to the caller.
- * This function is meant to be called from within a filesystem's
- * lookup method.
- * Beware of calling it for non-regular files! Those ->open methods might block
- * (e.g. in fifo_open), leaving you with parent locked (and in case of fifo,
- * leading to a deadlock, as nobody can open that fifo anymore, because
- * another process to open fifo will block on locked parent when doing lookup).
- * Note that in case of error, nd->intent.open.file is destroyed, but the
- * path information remains valid.
- * If the open callback is set to NULL, then the standard f_op->open()
- * filesystem callback is substituted.
- */
-struct file *lookup_instantiate_filp(struct nameidata *nd, struct dentry *dentry,
-		int (*open)(struct inode *, struct file *))
-{
-	const struct cred *cred = current_cred();
-
-	if (IS_ERR(nd->intent.open.file))
-		goto out;
-	if (IS_ERR(dentry))
-		goto out_err;
-	nd->intent.open.file = __dentry_open(dget(dentry), mntget(nd->path.mnt),
-					     nd->intent.open.file,
-					     open, cred);
-out:
-	return nd->intent.open.file;
-out_err:
-	release_open_intent(nd);
-	nd->intent.open.file = ERR_CAST(dentry);
-	goto out;
-}
-EXPORT_SYMBOL_GPL(lookup_instantiate_filp);
-
 /**
  * finish_open - set up a not fully instantiated file
  * @od: opaque open data
@@ -816,8 +760,8 @@ struct file *finish_open(struct opendata *od, struct dentry *dentry,
 {
 	struct file *filp;
 
-	filp = *(od->filp);
-	*(od->filp) = NULL;
+	filp = od->filp;
+	od->filp = NULL;
 
 	mntget(od->mnt);
 	dget(dentry);
@@ -826,31 +770,6 @@ struct file *finish_open(struct opendata *od, struct dentry *dentry,
 }
 EXPORT_SYMBOL(finish_open);
 
-/**
- * nameidata_to_filp - convert a nameidata to an open filp.
- * @nd: pointer to nameidata
- * @flags: open flags
- *
- * Note that this function destroys the original nameidata
- */
-struct file *nameidata_to_filp(struct nameidata *nd)
-{
-	const struct cred *cred = current_cred();
-	struct file *filp;
-
-	/* Pick up the filp from the open intent */
-	filp = nd->intent.open.file;
-	nd->intent.open.file = NULL;
-
-	/* Has the filesystem initialised the file for us? */
-	if (filp->f_path.dentry == NULL) {
-		path_get(&nd->path);
-		filp = __dentry_open(nd->path.dentry, nd->path.mnt, filp,
-				     NULL, cred);
-	}
-	return filp;
-}
-
 /*
  * dentry_open() will have done dput(dentry) and mntput(mnt) if it returns an
  * error.
@@ -859,7 +778,7 @@ struct file *dentry_open(struct dentry *dentry, struct vfsmount *mnt, int flags,
 			 const struct cred *cred)
 {
 	int error;
-	struct file *f;
+	struct file *f, *res;
 
 	validate_creds(cred);
 
@@ -875,7 +794,15 @@ struct file *dentry_open(struct dentry *dentry, struct vfsmount *mnt, int flags,
 	}
 
 	f->f_flags = flags;
-	return __dentry_open(dentry, mnt, f, NULL, cred);
+	res = do_dentry_open(dentry, mnt, f, NULL, cred);
+	if (!IS_ERR(res)) {
+		int error = open_check_o_direct(f);
+		if (error) {
+			fput(res);
+			res = ERR_PTR(error);
+		}
+	}
+	return res;
 }
 EXPORT_SYMBOL(dentry_open);
 
diff --git a/include/linux/namei.h b/include/linux/namei.h
index ffc0213..54dadda 100644
--- a/include/linux/namei.h
+++ b/include/linux/namei.h
@@ -7,12 +7,6 @@
 
 struct vfsmount;
 
-struct open_intent {
-	int	flags;
-	int	create_mode;
-	struct file *file;
-};
-
 enum { MAX_NESTED_LINKS = 8 };
 
 struct nameidata {
@@ -25,11 +19,6 @@ struct nameidata {
 	int		last_type;
 	unsigned	depth;
 	char *saved_names[MAX_NESTED_LINKS + 1];
-
-	/* Intent data */
-	union {
-		struct open_intent open;
-	} intent;
 };
 
 /*
-- 
1.7.7


* [PATCH 18/25] vfs: only retry last component if opening stale dentry
  2012-03-07 21:22 [PATCH 00/25] vfs: atomic open RFC Miklos Szeredi
                   ` (16 preceding siblings ...)
  2012-03-07 21:22 ` [PATCH 17/25] vfs: remove open intents from nameidata Miklos Szeredi
@ 2012-03-07 21:22 ` Miklos Szeredi
  2012-03-07 21:22 ` [PATCH 19/25] vfs: remove nameidata argument from vfs_create Miklos Szeredi
                   ` (8 subsequent siblings)
  26 siblings, 0 replies; 55+ messages in thread
From: Miklos Szeredi @ 2012-03-07 21:22 UTC (permalink / raw)
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

From: Miklos Szeredi <mszeredi@suse.cz>

NFS optimizes away d_revalidate calls for the last component of an open.  This
means that the open itself can find the dentry stale.  It then returns ESTALE,
which results in the complete path being looked up again with LOOKUP_REVAL.

This is unnecessary, however, since it would be enough to retry the last
component only.  Introduce EOPENSTALE (a kernel private errno) and allow NFS to
retry opening only the last component.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
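A minimal userspace sketch of the retry-once control flow described above,
assuming a simulated last-component open.  EOPENSTALE_MODEL, struct sim,
sim_open_last() and model_do_last() are hypothetical names; only the value
517 and the retried/retry_lookup pattern come from the patch.  The
save_parent bookkeeping (and the case where no parent was saved) is left
out.

```c
#include <errno.h>
#include <stdbool.h>

/* Same value the patch assigns to EOPENSTALE; the name is local to this sketch. */
#define EOPENSTALE_MODEL 517

/* Simulated filesystem: reports "stale" for the first stale_left opens. */
struct sim {
	int stale_left;
	int last_lookups;	/* how often the last component was looked up */
};

static int sim_open_last(struct sim *s)
{
	s->last_lookups++;
	if (s->stale_left > 0) {
		s->stale_left--;
		return -EOPENSTALE_MODEL;
	}
	return 0;
}

/* Retry the last component once; a second stale result becomes -ESTALE. */
static int model_do_last(struct sim *s)
{
	bool retried = false;
	int err;

retry_lookup:
	err = sim_open_last(s);
	if (err == -EOPENSTALE_MODEL) {
		if (retried)
			return -ESTALE;
		retried = true;
		goto retry_lookup;
	}
	return err;
}
```

With one stale result the open succeeds after two lookups of the last
component only.  With two, the caller sees plain -ESTALE, so the private
errno never escapes and the existing full-path retry can still take over.
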
 fs/namei.c            |   34 ++++++++++++++++++++++++++++++----
 fs/nfs/file.c         |    2 +-
 fs/open.c             |   16 +++++++++-------
 include/linux/errno.h |    1 +
 4 files changed, 41 insertions(+), 12 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index ff21a67..b991aa0 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2194,8 +2194,8 @@ static struct file *atomic_open(struct nameidata *nd, struct dentry *dentry,
 	 * Another problem is returing the "right" error value (e.g. for an
 	 * O_EXCL open we want to return EEXIST not EROFS).
 	 */
-	if ((open_flag & (O_CREAT | O_TRUNC)) ||
-	    (open_flag & O_ACCMODE) != O_RDONLY) {
+	if (!*want_write && ((open_flag & (O_CREAT | O_TRUNC)) ||
+			     (open_flag & O_ACCMODE) != O_RDONLY)) {
 		error = mnt_want_write(nd->path.mnt);
 		if (!error) {
 			*want_write = 1;
@@ -2370,6 +2370,8 @@ static struct file *do_last(struct nameidata *nd, struct path *path,
 	int acc_mode = op->acc_mode;
 	struct inode *inode;
 	int symlink_ok = 0;
+	struct path save_parent = { .dentry = NULL, .mnt = NULL };
+	bool retried = false;
 	int error;
 
 	nd->flags &= ~LOOKUP_PARENT;
@@ -2433,6 +2435,7 @@ static struct file *do_last(struct nameidata *nd, struct path *path,
 			goto exit;
 	}
 
+retry_lookup:
 	mutex_lock(&dir->d_inode->i_mutex);
 
 	filp = lookup_open(nd, path, od, op, &want_write);
@@ -2532,12 +2535,21 @@ finish_lookup:
 		return NULL;
 	}
 
-	path_to_nameidata(path, nd);
+	if ((nd->flags & LOOKUP_RCU) || nd->path.mnt != path->mnt) {
+		path_to_nameidata(path, nd);
+	} else {
+		save_parent.dentry = nd->path.dentry;
+		save_parent.mnt = mntget(path->mnt);
+		nd->path.dentry = path->dentry;
+
+	}
 	nd->inode = inode;
 
 	error = complete_walk(nd);
-	if (error)
+	if (error) {
+		path_put(&save_parent);
 		return ERR_PTR(error);
+	}
 	error = -EISDIR;
 	if ((open_flag & O_CREAT) && S_ISDIR(inode->i_mode))
 		goto exit;
@@ -2562,6 +2574,19 @@ common:
 		goto exit;
 	od->mnt = nd->path.mnt;
 	filp = finish_open(od, nd->path.dentry, NULL);
+	if (IS_ERR(filp) && PTR_ERR(filp) == -EOPENSTALE) {
+		error = -ESTALE;
+		if (!save_parent.dentry || retried)
+			goto exit;
+		BUG_ON(save_parent.dentry != dir);
+		path_put(&nd->path);
+		nd->path = save_parent;
+		nd->inode = dir->d_inode;
+		save_parent.mnt = NULL;
+		save_parent.dentry = NULL;
+		retried = true;
+		goto retry_lookup;
+	}
 	if (IS_ERR(filp))
 		goto out;
 	error = open_check_o_direct(filp);
@@ -2580,6 +2605,7 @@ opened:
 out:
 	if (want_write)
 		mnt_drop_write(nd->path.mnt);
+	path_put(&save_parent);
 	path_put(&nd->path);
 	return filp;
 
diff --git a/fs/nfs/file.c b/fs/nfs/file.c
index 4e626ec..bb1f5cb 100644
--- a/fs/nfs/file.c
+++ b/fs/nfs/file.c
@@ -935,7 +935,7 @@ out:
 
 out_drop:
 	d_drop(dentry);
-	err = -ESTALE;
+	err = -EOPENSTALE;
 	goto out_put_ctx;
 }
 
diff --git a/fs/open.c b/fs/open.c
index b51afcc..d324139 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -737,7 +737,6 @@ cleanup_all:
 	f->f_path.dentry = NULL;
 	f->f_path.mnt = NULL;
 cleanup_file:
-	put_filp(f);
 	dput(dentry);
 	mntput(mnt);
 	return ERR_PTR(error);
@@ -758,15 +757,16 @@ cleanup_file:
 struct file *finish_open(struct opendata *od, struct dentry *dentry,
 			 int (*open)(struct inode *, struct file *))
 {
-	struct file *filp;
-
-	filp = od->filp;
-	od->filp = NULL;
+	struct file *res;
 
 	mntget(od->mnt);
 	dget(dentry);
 
-	return do_dentry_open(dentry, od->mnt, filp, open, current_cred());
+	res = do_dentry_open(dentry, od->mnt, od->filp, open, current_cred());
+	if (!IS_ERR(res))
+		od->filp = NULL;
+
+	return res;
 }
 EXPORT_SYMBOL(finish_open);
 
@@ -795,7 +795,9 @@ struct file *dentry_open(struct dentry *dentry, struct vfsmount *mnt, int flags,
 
 	f->f_flags = flags;
 	res = do_dentry_open(dentry, mnt, f, NULL, cred);
-	if (!IS_ERR(res)) {
+	if (IS_ERR(res)) {
+		put_filp(f);
+	} else {
 		int error = open_check_o_direct(f);
 		if (error) {
 			fput(res);
diff --git a/include/linux/errno.h b/include/linux/errno.h
index 4668583..b1c33a0 100644
--- a/include/linux/errno.h
+++ b/include/linux/errno.h
@@ -16,6 +16,7 @@
 #define ERESTARTNOHAND	514	/* restart if no handler.. */
 #define ENOIOCTLCMD	515	/* No ioctl command */
 #define ERESTART_RESTARTBLOCK 516 /* restart by calling sys_restart_syscall */
+#define EOPENSTALE	517	/* open found a stale dentry */
 
 /* Defined for the NFSv3 protocol */
 #define EBADHANDLE	521	/* Illegal NFS file handle */
-- 
1.7.7


* [PATCH 19/25] vfs: remove nameidata argument from vfs_create
  2012-03-07 21:22 [PATCH 00/25] vfs: atomic open RFC Miklos Szeredi
                   ` (17 preceding siblings ...)
  2012-03-07 21:22 ` [PATCH 18/25] vfs: only retry last component if opening stale dentry Miklos Szeredi
@ 2012-03-07 21:22 ` Miklos Szeredi
  2012-03-07 21:22 ` [PATCH 20/25] vfs: move O_DIRECT check to common code Miklos Szeredi
                   ` (7 subsequent siblings)
  26 siblings, 0 replies; 55+ messages in thread
From: Miklos Szeredi @ 2012-03-07 21:22 UTC (permalink / raw)
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

From: Miklos Szeredi <mszeredi@suse.cz>

All callers of vfs_create() pass a NULL nameidata.  So this argument can be
removed.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
 fs/cachefiles/namei.c |    2 +-
 fs/ecryptfs/inode.c   |    2 +-
 fs/namei.c            |    5 ++---
 fs/nfsd/vfs.c         |    4 ++--
 include/linux/fs.h    |    2 +-
 ipc/mqueue.c          |    2 +-
 6 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index a0358c2..faa933f 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -567,7 +567,7 @@ lookup_again:
 			if (ret < 0)
 				goto create_error;
 			start = jiffies;
-			ret = vfs_create(dir->d_inode, next, S_IFREG, NULL);
+			ret = vfs_create(dir->d_inode, next, S_IFREG);
 			cachefiles_hist(cachefiles_create_histogram, start);
 			if (ret < 0)
 				goto create_error;
diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
index ab35b11..b6d0d69 100644
--- a/fs/ecryptfs/inode.c
+++ b/fs/ecryptfs/inode.c
@@ -173,7 +173,7 @@ ecryptfs_do_create(struct inode *directory_inode,
 		inode = ERR_CAST(lower_dir_dentry);
 		goto out;
 	}
-	rc = vfs_create(lower_dir_dentry->d_inode, lower_dentry, mode, NULL);
+	rc = vfs_create(lower_dir_dentry->d_inode, lower_dentry, mode);
 	if (rc) {
 		printk(KERN_ERR "%s: Failure to create dentry in lower fs; "
 		       "rc = [%d]\n", __func__, rc);
diff --git a/fs/namei.c b/fs/namei.c
index b991aa0..fadc95c 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2101,8 +2101,7 @@ out_err:
 	return ERR_PTR(error);
 }
 
-int vfs_create(struct inode *dir, struct dentry *dentry, umode_t mode,
-	       struct nameidata *nd)
+int vfs_create(struct inode *dir, struct dentry *dentry, umode_t mode)
 {
 	struct file *res;
 	unsigned open_flag = O_RDONLY|O_CREAT|O_EXCL;
@@ -2856,7 +2855,7 @@ SYSCALL_DEFINE4(mknodat, int, dfd, const char __user *, filename, umode_t, mode,
 		goto out_drop_write;
 	switch (mode & S_IFMT) {
 		case 0: case S_IFREG:
-			error = vfs_create(path.dentry->d_inode,dentry,mode,NULL);
+			error = vfs_create(path.dentry->d_inode, dentry, mode);
 			break;
 		case S_IFCHR: case S_IFBLK:
 			error = vfs_mknod(path.dentry->d_inode,dentry,mode,
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c
index edf6d3e..26470ad 100644
--- a/fs/nfsd/vfs.c
+++ b/fs/nfsd/vfs.c
@@ -1321,7 +1321,7 @@ nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
 	err = 0;
 	switch (type) {
 	case S_IFREG:
-		host_err = vfs_create(dirp, dchild, iap->ia_mode, NULL);
+		host_err = vfs_create(dirp, dchild, iap->ia_mode);
 		if (!host_err)
 			nfsd_check_ignore_resizing(iap);
 		break;
@@ -1484,7 +1484,7 @@ do_nfsd_create(struct svc_rqst *rqstp, struct svc_fh *fhp,
 		goto out;
 	}
 
-	host_err = vfs_create(dirp, dchild, iap->ia_mode, NULL);
+	host_err = vfs_create(dirp, dchild, iap->ia_mode);
 	if (host_err < 0) {
 		fh_drop_write(fhp);
 		goto out_nfserr;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index af291bb..23268f4 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1534,7 +1534,7 @@ extern void unlock_super(struct super_block *);
 /*
  * VFS helper functions..
  */
-extern int vfs_create(struct inode *, struct dentry *, umode_t, struct nameidata *);
+extern int vfs_create(struct inode *, struct dentry *, umode_t);
 extern int vfs_mkdir(struct inode *, struct dentry *, umode_t);
 extern int vfs_mknod(struct inode *, struct dentry *, umode_t, dev_t);
 extern int vfs_symlink(struct inode *, struct dentry *, const char *);
diff --git a/ipc/mqueue.c b/ipc/mqueue.c
index 86ee272..b31b495 100644
--- a/ipc/mqueue.c
+++ b/ipc/mqueue.c
@@ -634,7 +634,7 @@ static struct file *do_create(struct ipc_namespace *ipc_ns, struct dentry *dir,
 	ret = mnt_want_write(ipc_ns->mq_mnt);
 	if (ret)
 		goto out;
-	ret = vfs_create(dir->d_inode, dentry, mode, NULL);
+	ret = vfs_create(dir->d_inode, dentry, mode);
 	dentry->d_fsdata = NULL;
 	if (ret)
 		goto out_drop_write;
-- 
1.7.7


* [PATCH 20/25] vfs: move O_DIRECT check to common code
  2012-03-07 21:22 [PATCH 00/25] vfs: atomic open RFC Miklos Szeredi
                   ` (18 preceding siblings ...)
  2012-03-07 21:22 ` [PATCH 19/25] vfs: remove nameidata argument from vfs_create Miklos Szeredi
@ 2012-03-07 21:22 ` Miklos Szeredi
  2012-03-07 21:22 ` [PATCH 21/25] gfs2: use i_op->atomic_create() Miklos Szeredi
                   ` (6 subsequent siblings)
  26 siblings, 0 replies; 55+ messages in thread
From: Miklos Szeredi @ 2012-03-07 21:22 UTC (permalink / raw)
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

From: Miklos Szeredi <mszeredi@suse.cz>

Perform open_check_o_direct() in a common place in do_last after opening the
file.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
 fs/namei.c |   30 +++++++++---------------------
 1 files changed, 9 insertions(+), 21 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index fadc95c..4207e4f 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2055,19 +2055,13 @@ static struct file *atomic_create(struct inode *dir, struct dentry *dentry,
 	 * here.
 	 */
 	error = may_open(&filp->f_path, MAY_OPEN, open_flag);
-	if (error)
-		goto out_fput;
-
-	error = open_check_o_direct(filp);
-	if (error)
-		goto out_fput;
+	if (error) {
+		fput(filp);
+		return ERR_PTR(error);
+	}
 
 out:
 	return filp;
-
-out_fput:
-	fput(filp);
-	return ERR_PTR(error);
 }
 
 static struct file *create_open(struct inode *dir, struct dentry *dentry,
@@ -2249,22 +2243,16 @@ static struct file *atomic_open(struct nameidata *nd, struct dentry *dentry,
 		 * permission here.
 		 */
 		error = may_open(&filp->f_path, acc_mode, open_flag);
-		if (error)
-			goto out_fput;
-
-		error = open_check_o_direct(filp);
-		if (error)
-			goto out_fput;
+		if (error) {
+			fput(filp);
+			return ERR_PTR(error);
+		}
 	}
 	*create_error = 0;
 
 out:
 	return filp;
 
-out_fput:
-	fput(filp);
-	return ERR_PTR(error);
-
 look_up:
 	return NULL;
 }
@@ -2588,10 +2576,10 @@ common:
 	}
 	if (IS_ERR(filp))
 		goto out;
+opened:
 	error = open_check_o_direct(filp);
 	if (error)
 		goto exit_fput;
-opened:
 	error = ima_file_check(filp, op->acc_mode);
 	if (error)
 		goto exit_fput;
-- 
1.7.7


* [PATCH 21/25] gfs2: use i_op->atomic_create()
  2012-03-07 21:22 [PATCH 00/25] vfs: atomic open RFC Miklos Szeredi
                   ` (19 preceding siblings ...)
  2012-03-07 21:22 ` [PATCH 20/25] vfs: move O_DIRECT check to common code Miklos Szeredi
@ 2012-03-07 21:22 ` Miklos Szeredi
  2012-03-24 14:27   ` Christoph Hellwig
  2012-03-07 21:22 ` [PATCH 22/25] nfs: " Miklos Szeredi
                   ` (5 subsequent siblings)
  26 siblings, 1 reply; 55+ messages in thread
From: Miklos Szeredi @ 2012-03-07 21:22 UTC (permalink / raw)
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

From: Miklos Szeredi <mszeredi@suse.cz>

GFS2 doesn't open the file in ->create, but it does check the LOOKUP_EXCL flag
in its create function.  Convert to using ->atomic_create and checking O_EXCL
so that the nameidata argument is no longer necessary.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
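A minimal userspace sketch of the return convention used by the converted
function below: an error pointer on failure, NULL when the file was
created but not opened.  The ERR_PTR/PTR_ERR/IS_ERR helpers are userspace
re-implementations modelled on include/linux/err.h; model_create() and
model_vfs_create() are hypothetical, and how the real caller treats a
non-NULL file is an assumption here.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Userspace copies of the kernel's error-pointer helpers. */
static inline void *ERR_PTR(long error)
{
	return (void *)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct file;	/* opaque; this sketch never opens anything */

/* Hypothetical ->atomic_create that creates but does not open. */
static struct file *model_create(int backend_err)
{
	if (backend_err)
		return ERR_PTR(backend_err);
	return NULL;	/* created, not opened */
}

/* Caller side: fold the pointer result back into an errno-style int. */
static int model_vfs_create(int backend_err)
{
	struct file *res = model_create(backend_err);

	if (IS_ERR(res))
		return (int)PTR_ERR(res);
	/* NULL means success without an open file; nothing to release. */
	return 0;
}
```

Note that NULL is not an error pointer in this scheme, so a caller has to
test IS_ERR() first and only then distinguish NULL from an opened file.
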
 fs/gfs2/inode.c |   20 ++++++++++++++------
 1 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 5698746..203ec3c 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -754,13 +754,21 @@ fail:
  * Returns: errno
  */
 
-static int gfs2_create(struct inode *dir, struct dentry *dentry,
-		       umode_t mode, struct nameidata *nd)
+static struct file *gfs2_create(struct inode *dir, struct dentry *dentry,
+				struct opendata *od, unsigned open_flag,
+				umode_t mode)
 {
-	int excl = 0;
-	if (nd && (nd->flags & LOOKUP_EXCL))
+	int err;
+	int excl = 0; /* Why is excl not the default? */
+
+	if (od && (open_flag & O_EXCL))
 		excl = 1;
-	return gfs2_create_inode(dir, dentry, S_IFREG | mode, 0, NULL, 0, excl);
+
+	err = gfs2_create_inode(dir, dentry, S_IFREG | mode, 0, NULL, 0, excl);
+	if (err)
+		return ERR_PTR(err);
+
+	return NULL;
 }
 
 /**
@@ -1809,7 +1817,7 @@ const struct inode_operations gfs2_file_iops = {
 };
 
 const struct inode_operations gfs2_dir_iops = {
-	.create = gfs2_create,
+	.atomic_create = gfs2_create,
 	.lookup = gfs2_lookup,
 	.link = gfs2_link,
 	.unlink = gfs2_unlink,
-- 
1.7.7


* [PATCH 22/25] nfs: use i_op->atomic_create()
  2012-03-07 21:22 [PATCH 00/25] vfs: atomic open RFC Miklos Szeredi
                   ` (20 preceding siblings ...)
  2012-03-07 21:22 ` [PATCH 21/25] gfs2: use i_op->atomic_create() Miklos Szeredi
@ 2012-03-07 21:22 ` Miklos Szeredi
  2012-03-07 21:22 ` [PATCH 23/25] vfs: remove nameidata argument from i_op->create() Miklos Szeredi
                   ` (4 subsequent siblings)
  26 siblings, 0 replies; 55+ messages in thread
From: Miklos Szeredi @ 2012-03-07 21:22 UTC (permalink / raw)
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

From: Miklos Szeredi <mszeredi@suse.cz>

NFS doesn't open the file in ->create, but it does check the LOOKUP_EXCL flag in
its generic create function.  Convert to using ->atomic_create and checking
O_EXCL so that the nameidata argument is no longer necessary.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
 fs/nfs/dir.c |   20 +++++++++-----------
 1 files changed, 9 insertions(+), 11 deletions(-)

diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 36a12b4..887226d 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -47,7 +47,8 @@ static int nfs_opendir(struct inode *, struct file *);
 static int nfs_closedir(struct inode *, struct file *);
 static int nfs_readdir(struct file *, void *, filldir_t);
 static struct dentry *nfs_lookup(struct inode *, struct dentry *, struct nameidata *);
-static int nfs_create(struct inode *, struct dentry *, umode_t, struct nameidata *);
+static struct file *nfs_create(struct inode *, struct dentry *,
+			       struct opendata *, unsigned, umode_t);
 static int nfs_mkdir(struct inode *, struct dentry *, umode_t);
 static int nfs_rmdir(struct inode *, struct dentry *);
 static int nfs_unlink(struct inode *, struct dentry *);
@@ -70,7 +71,7 @@ const struct file_operations nfs_dir_operations = {
 };
 
 const struct inode_operations nfs_dir_inode_operations = {
-	.create		= nfs_create,
+	.atomic_create	= nfs_create,
 	.lookup		= nfs_lookup,
 	.link		= nfs_link,
 	.unlink		= nfs_unlink,
@@ -90,7 +91,7 @@ const struct address_space_operations nfs_dir_aops = {
 
 #ifdef CONFIG_NFS_V3
 const struct inode_operations nfs3_dir_inode_operations = {
-	.create		= nfs_create,
+	.atomic_create	= nfs_create,
 	.lookup		= nfs_lookup,
 	.link		= nfs_link,
 	.unlink		= nfs_unlink,
@@ -1577,12 +1578,12 @@ out_error:
  * that the operation succeeded on the server, but an error in the
  * reply path made it appear to have failed.
  */
-static int nfs_create(struct inode *dir, struct dentry *dentry,
-		umode_t mode, struct nameidata *nd)
+static struct file *nfs_create(struct inode *dir, struct dentry *dentry,
+			       struct opendata *od, unsigned open_flags,
+			       umode_t mode)
 {
 	struct iattr attr;
 	int error;
-	int open_flags = O_CREAT|O_EXCL;
 
 	dfprintk(VFS, "NFS: create(%s/%ld), %s\n",
 			dir->i_sb->s_id, dir->i_ino, dentry->d_name.name);
@@ -1590,16 +1591,13 @@ static int nfs_create(struct inode *dir, struct dentry *dentry,
 	attr.ia_mode = mode;
 	attr.ia_valid = ATTR_MODE;
 
-	if (nd && !(nd->flags & LOOKUP_EXCL))
-		open_flags = O_CREAT;
-
 	error = NFS_PROTO(dir)->create(dir, dentry, &attr, open_flags);
 	if (error != 0)
 		goto out_err;
-	return 0;
+	return NULL;
 out_err:
 	d_drop(dentry);
-	return error;
+	return ERR_PTR(error);
 }
 
 /*
-- 
1.7.7



* [PATCH 23/25] vfs: remove nameidata argument from i_op->create()
  2012-03-07 21:22 [PATCH 00/25] vfs: atomic open RFC Miklos Szeredi
                   ` (21 preceding siblings ...)
  2012-03-07 21:22 ` [PATCH 22/25] nfs: " Miklos Szeredi
@ 2012-03-07 21:22 ` Miklos Szeredi
  2012-03-07 21:22 ` [PATCH 24/25] vfs: optionally skip lookup on exclusive create Miklos Szeredi
                   ` (3 subsequent siblings)
  26 siblings, 0 replies; 55+ messages in thread
From: Miklos Szeredi @ 2012-03-07 21:22 UTC (permalink / raw)
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

From: Miklos Szeredi <mszeredi@suse.cz>

The nameidata argument of i_op->create() is no longer used by any filesystem.
Filesystems that need open-related information get it through
i_op->atomic_create() instead.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
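For readers skimming the diffstat, the shape of the change can be sketched in
userspace.  This is only an illustration, not kernel code: the struct
definitions are stand-ins, and demo_create(), demo_call_create() and demo_iops
are hypothetical names.  The -EACCES fallback for a missing ->create is an
assumption of the mock; in this series create_open() has other paths for that
case.

```c
#include <errno.h>

/* Userspace stand-ins for kernel types; not the real definitions. */
typedef unsigned short umode_t;

struct inode;
struct dentry {
	const char *name;
	int instantiated;
};

struct inode_operations {
	/* Prototype after this patch: no struct nameidata * argument. */
	int (*create)(struct inode *, struct dentry *, umode_t);
};

struct inode {
	const struct inode_operations *i_op;
};

/* Hypothetical filesystem callback using the three-argument form. */
int demo_create(struct inode *dir, struct dentry *dentry, umode_t mode)
{
	(void)dir;
	(void)mode;
	dentry->instantiated = 1;	/* stands in for d_instantiate() */
	return 0;
}

/* Hypothetical caller, loosely modelled on the ->create call in create_open(). */
int demo_call_create(struct inode *dir, struct dentry *dentry, umode_t mode)
{
	if (!dir->i_op->create)
		return -EACCES;		/* assumption of this mock only */
	return dir->i_op->create(dir, dentry, mode);
}

const struct inode_operations demo_iops = {
	.create = demo_create,
};
```

Calling demo_call_create() on an inode whose i_op points at demo_iops runs the
callback with only the directory, the dentry and the mode, which is all that
the converted filesystems in the diff below receive.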
 fs/affs/affs.h          |    3 ++-
 fs/affs/namei.c         |    2 +-
 fs/afs/dir.c            |    6 ++----
 fs/bad_inode.c          |    4 ++--
 fs/bfs/dir.c            |    3 +--
 fs/btrfs/inode.c        |    3 +--
 fs/coda/dir.c           |    4 ++--
 fs/ecryptfs/inode.c     |    3 +--
 fs/exofs/namei.c        |    3 +--
 fs/ext2/namei.c         |    2 +-
 fs/ext3/namei.c         |    3 +--
 fs/ext4/namei.c         |    3 +--
 fs/fat/namei_msdos.c    |    3 +--
 fs/fat/namei_vfat.c     |    3 +--
 fs/hfs/dir.c            |    3 +--
 fs/hfsplus/dir.c        |    4 ++--
 fs/hostfs/hostfs_kern.c |    3 +--
 fs/hpfs/namei.c         |    2 +-
 fs/hugetlbfs/inode.c    |    3 ++-
 fs/jffs2/dir.c          |    5 ++---
 fs/jfs/namei.c          |    4 +---
 fs/logfs/dir.c          |    3 +--
 fs/minix/namei.c        |    3 +--
 fs/namei.c              |    9 ++++-----
 fs/ncpfs/dir.c          |    5 ++---
 fs/nilfs2/namei.c       |    3 +--
 fs/ocfs2/dlmfs/dlmfs.c  |    3 +--
 fs/ocfs2/namei.c        |    3 +--
 fs/omfs/dir.c           |    3 +--
 fs/ramfs/inode.c        |    2 +-
 fs/reiserfs/namei.c     |    4 ++--
 fs/sysv/namei.c         |    2 +-
 fs/ubifs/dir.c          |    3 +--
 fs/udf/namei.c          |    3 +--
 fs/ufs/namei.c          |    3 +--
 fs/xfs/xfs_iops.c       |    3 +--
 include/linux/fs.h      |    2 +-
 ipc/mqueue.c            |    3 +--
 mm/shmem.c              |    3 +--
 39 files changed, 51 insertions(+), 78 deletions(-)

diff --git a/fs/affs/affs.h b/fs/affs/affs.h
index 45a0ce4..e0fca52 100644
--- a/fs/affs/affs.h
+++ b/fs/affs/affs.h
@@ -156,7 +156,8 @@ extern void	affs_free_bitmap(struct super_block *sb);
 extern int	affs_hash_name(struct super_block *sb, const u8 *name, unsigned int len);
 extern struct dentry *affs_lookup(struct inode *dir, struct dentry *dentry, struct nameidata *);
 extern int	affs_unlink(struct inode *dir, struct dentry *dentry);
-extern int	affs_create(struct inode *dir, struct dentry *dentry, umode_t mode, struct nameidata *);
+extern int	affs_create(struct inode *dir, struct dentry *dentry,
+			    umode_t mode);
 extern int	affs_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode);
 extern int	affs_rmdir(struct inode *dir, struct dentry *dentry);
 extern int	affs_link(struct dentry *olddentry, struct inode *dir,
diff --git a/fs/affs/namei.c b/fs/affs/namei.c
index 4780694..3ad7695 100644
--- a/fs/affs/namei.c
+++ b/fs/affs/namei.c
@@ -255,7 +255,7 @@ affs_unlink(struct inode *dir, struct dentry *dentry)
 }
 
 int
-affs_create(struct inode *dir, struct dentry *dentry, umode_t mode, struct nameidata *nd)
+affs_create(struct inode *dir, struct dentry *dentry, umode_t mode)
 {
 	struct super_block *sb = dir->i_sb;
 	struct inode	*inode;
diff --git a/fs/afs/dir.c b/fs/afs/dir.c
index e22dc4b..ea254ab 100644
--- a/fs/afs/dir.c
+++ b/fs/afs/dir.c
@@ -28,8 +28,7 @@ static int afs_d_delete(const struct dentry *dentry);
 static void afs_d_release(struct dentry *dentry);
 static int afs_lookup_filldir(void *_cookie, const char *name, int nlen,
 				  loff_t fpos, u64 ino, unsigned dtype);
-static int afs_create(struct inode *dir, struct dentry *dentry, umode_t mode,
-		      struct nameidata *nd);
+static int afs_create(struct inode *dir, struct dentry *dentry, umode_t mode);
 static int afs_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode);
 static int afs_rmdir(struct inode *dir, struct dentry *dentry);
 static int afs_unlink(struct inode *dir, struct dentry *dentry);
@@ -948,8 +947,7 @@ error:
 /*
  * create a regular file on an AFS filesystem
  */
-static int afs_create(struct inode *dir, struct dentry *dentry, umode_t mode,
-		      struct nameidata *nd)
+static int afs_create(struct inode *dir, struct dentry *dentry, umode_t mode)
 {
 	struct afs_file_status status;
 	struct afs_callback cb;
diff --git a/fs/bad_inode.c b/fs/bad_inode.c
index 22e9a78..9fc0eab 100644
--- a/fs/bad_inode.c
+++ b/fs/bad_inode.c
@@ -172,8 +172,8 @@ static const struct file_operations bad_file_ops =
 	.splice_read	= bad_file_splice_read,
 };
 
-static int bad_inode_create (struct inode *dir, struct dentry *dentry,
-		umode_t mode, struct nameidata *nd)
+static int bad_inode_create(struct inode *dir, struct dentry *dentry,
+			    umode_t mode)
 {
 	return -EIO;
 }
diff --git a/fs/bfs/dir.c b/fs/bfs/dir.c
index d12c796..e9a3937 100644
--- a/fs/bfs/dir.c
+++ b/fs/bfs/dir.c
@@ -84,8 +84,7 @@ const struct file_operations bfs_dir_operations = {
 
 extern void dump_imap(const char *, struct super_block *);
 
-static int bfs_create(struct inode *dir, struct dentry *dentry, umode_t mode,
-						struct nameidata *nd)
+static int bfs_create(struct inode *dir, struct dentry *dentry, umode_t mode)
 {
 	int err;
 	struct inode *inode;
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 892b347..a5b0fb5 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -4694,8 +4694,7 @@ out_unlock:
 	return err;
 }
 
-static int btrfs_create(struct inode *dir, struct dentry *dentry,
-			umode_t mode, struct nameidata *nd)
+static int btrfs_create(struct inode *dir, struct dentry *dentry, umode_t mode)
 {
 	struct btrfs_trans_handle *trans;
 	struct btrfs_root *root = BTRFS_I(dir)->root;
diff --git a/fs/coda/dir.c b/fs/coda/dir.c
index 1775158..e8d02b9 100644
--- a/fs/coda/dir.c
+++ b/fs/coda/dir.c
@@ -30,7 +30,7 @@
 #include "coda_int.h"
 
 /* dir inode-ops */
-static int coda_create(struct inode *dir, struct dentry *new, umode_t mode, struct nameidata *nd);
+static int coda_create(struct inode *dir, struct dentry *new, umode_t mode);
 static struct dentry *coda_lookup(struct inode *dir, struct dentry *target, struct nameidata *nd);
 static int coda_link(struct dentry *old_dentry, struct inode *dir_inode, 
 		     struct dentry *entry);
@@ -188,7 +188,7 @@ static inline void coda_dir_drop_nlink(struct inode *dir)
 }
 
 /* creation routines: create, mknod, mkdir, link, symlink */
-static int coda_create(struct inode *dir, struct dentry *de, umode_t mode, struct nameidata *nd)
+static int coda_create(struct inode *dir, struct dentry *de, umode_t mode)
 {
 	int error;
 	const char *name=de->d_name.name;
diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
index b6d0d69..8371f06 100644
--- a/fs/ecryptfs/inode.c
+++ b/fs/ecryptfs/inode.c
@@ -240,7 +240,6 @@ out:
  * @dir: The inode of the directory in which to create the file.
  * @dentry: The eCryptfs dentry
  * @mode: The mode of the new file.
- * @nd: nameidata
  *
  * Creates a new file.
  *
@@ -248,7 +247,7 @@ out:
  */
 static int
 ecryptfs_create(struct inode *directory_inode, struct dentry *ecryptfs_dentry,
-		umode_t mode, struct nameidata *nd)
+		umode_t mode)
 {
 	struct inode *ecryptfs_inode;
 	int rc;
diff --git a/fs/exofs/namei.c b/fs/exofs/namei.c
index 9dbf0c3..570845b 100644
--- a/fs/exofs/namei.c
+++ b/fs/exofs/namei.c
@@ -59,8 +59,7 @@ static struct dentry *exofs_lookup(struct inode *dir, struct dentry *dentry,
 	return d_splice_alias(inode, dentry);
 }
 
-static int exofs_create(struct inode *dir, struct dentry *dentry, umode_t mode,
-			 struct nameidata *nd)
+static int exofs_create(struct inode *dir, struct dentry *dentry, umode_t mode)
 {
 	struct inode *inode = exofs_new_inode(dir, mode);
 	int err = PTR_ERR(inode);
diff --git a/fs/ext2/namei.c b/fs/ext2/namei.c
index 0804198..a39f4c4 100644
--- a/fs/ext2/namei.c
+++ b/fs/ext2/namei.c
@@ -94,7 +94,7 @@ struct dentry *ext2_get_parent(struct dentry *child)
  * If the create succeeds, we fill in the inode information
  * with d_instantiate(). 
  */
-static int ext2_create (struct inode * dir, struct dentry * dentry, umode_t mode, struct nameidata *nd)
+static int ext2_create(struct inode *dir, struct dentry *dentry, umode_t mode)
 {
 	struct inode *inode;
 
diff --git a/fs/ext3/namei.c b/fs/ext3/namei.c
index e8e2117..af7adfc 100644
--- a/fs/ext3/namei.c
+++ b/fs/ext3/namei.c
@@ -1701,8 +1701,7 @@ static int ext3_add_nondir(handle_t *handle,
  * If the create succeeds, we fill in the inode information
  * with d_instantiate().
  */
-static int ext3_create (struct inode * dir, struct dentry * dentry, umode_t mode,
-		struct nameidata *nd)
+static int ext3_create(struct inode *dir, struct dentry *dentry, umode_t mode)
 {
 	handle_t *handle;
 	struct inode * inode;
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 2043f48..20f9a13 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -1736,8 +1736,7 @@ static int ext4_add_nondir(handle_t *handle,
  * If the create succeeds, we fill in the inode information
  * with d_instantiate().
  */
-static int ext4_create(struct inode *dir, struct dentry *dentry, umode_t mode,
-		       struct nameidata *nd)
+static int ext4_create(struct inode *dir, struct dentry *dentry, umode_t mode)
 {
 	handle_t *handle;
 	struct inode *inode;
diff --git a/fs/fat/namei_msdos.c b/fs/fat/namei_msdos.c
index c5938c9..c854101 100644
--- a/fs/fat/namei_msdos.c
+++ b/fs/fat/namei_msdos.c
@@ -264,8 +264,7 @@ static int msdos_add_entry(struct inode *dir, const unsigned char *name,
 }
 
 /***** Create a file */
-static int msdos_create(struct inode *dir, struct dentry *dentry, umode_t mode,
-			struct nameidata *nd)
+static int msdos_create(struct inode *dir, struct dentry *dentry, umode_t mode)
 {
 	struct super_block *sb = dir->i_sb;
 	struct inode *inode = NULL;
diff --git a/fs/fat/namei_vfat.c b/fs/fat/namei_vfat.c
index a81eb23..39e810f 100644
--- a/fs/fat/namei_vfat.c
+++ b/fs/fat/namei_vfat.c
@@ -782,8 +782,7 @@ error:
 	return ERR_PTR(err);
 }
 
-static int vfat_create(struct inode *dir, struct dentry *dentry, umode_t mode,
-		       struct nameidata *nd)
+static int vfat_create(struct inode *dir, struct dentry *dentry, umode_t mode)
 {
 	struct super_block *sb = dir->i_sb;
 	struct inode *inode;
diff --git a/fs/hfs/dir.c b/fs/hfs/dir.c
index 62fc14e..ba125c1 100644
--- a/fs/hfs/dir.c
+++ b/fs/hfs/dir.c
@@ -186,8 +186,7 @@ static int hfs_dir_release(struct inode *inode, struct file *file)
  * a directory and return a corresponding inode, given the inode for
  * the directory and the name (and its length) of the new file.
  */
-static int hfs_create(struct inode *dir, struct dentry *dentry, umode_t mode,
-		      struct nameidata *nd)
+static int hfs_create(struct inode *dir, struct dentry *dentry, umode_t mode)
 {
 	struct inode *inode;
 	int res;
diff --git a/fs/hfsplus/dir.c b/fs/hfsplus/dir.c
index 88e155f..bfb2ad8 100644
--- a/fs/hfsplus/dir.c
+++ b/fs/hfsplus/dir.c
@@ -453,8 +453,8 @@ out:
 	return res;
 }
 
-static int hfsplus_create(struct inode *dir, struct dentry *dentry, umode_t mode,
-			  struct nameidata *nd)
+static int hfsplus_create(struct inode *dir, struct dentry *dentry,
+			  umode_t mode)
 {
 	return hfsplus_mknod(dir, dentry, mode, 0);
 }
diff --git a/fs/hostfs/hostfs_kern.c b/fs/hostfs/hostfs_kern.c
index e130bd4..a3d2242 100644
--- a/fs/hostfs/hostfs_kern.c
+++ b/fs/hostfs/hostfs_kern.c
@@ -551,8 +551,7 @@ static int read_name(struct inode *ino, char *name)
 	return 0;
 }
 
-int hostfs_create(struct inode *dir, struct dentry *dentry, umode_t mode,
-		  struct nameidata *nd)
+static int hostfs_create(struct inode *dir, struct dentry *dentry, umode_t mode)
 {
 	struct inode *inode;
 	char *name;
diff --git a/fs/hpfs/namei.c b/fs/hpfs/namei.c
index 30dd7b1..3129925 100644
--- a/fs/hpfs/namei.c
+++ b/fs/hpfs/namei.c
@@ -115,7 +115,7 @@ bail:
 	return err;
 }
 
-static int hpfs_create(struct inode *dir, struct dentry *dentry, umode_t mode, struct nameidata *nd)
+static int hpfs_create(struct inode *dir, struct dentry *dentry, umode_t mode)
 {
 	const unsigned char *name = dentry->d_name.name;
 	unsigned len = dentry->d_name.len;
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 1e85a7a..00a60dc 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -545,7 +545,8 @@ static int hugetlbfs_mkdir(struct inode *dir, struct dentry *dentry, umode_t mod
 	return retval;
 }
 
-static int hugetlbfs_create(struct inode *dir, struct dentry *dentry, umode_t mode, struct nameidata *nd)
+static int hugetlbfs_create(struct inode *dir, struct dentry *dentry,
+			    umode_t mode)
 {
 	return hugetlbfs_mknod(dir, dentry, mode | S_IFREG, 0);
 }
diff --git a/fs/jffs2/dir.c b/fs/jffs2/dir.c
index 973ac58..099dd6a 100644
--- a/fs/jffs2/dir.c
+++ b/fs/jffs2/dir.c
@@ -22,8 +22,7 @@
 
 static int jffs2_readdir (struct file *, void *, filldir_t);
 
-static int jffs2_create (struct inode *,struct dentry *,umode_t,
-			 struct nameidata *);
+static int jffs2_create(struct inode *, struct dentry *, umode_t);
 static struct dentry *jffs2_lookup (struct inode *,struct dentry *,
 				    struct nameidata *);
 static int jffs2_link (struct dentry *,struct inode *,struct dentry *);
@@ -170,7 +169,7 @@ static int jffs2_readdir(struct file *filp, void *dirent, filldir_t filldir)
 
 
 static int jffs2_create(struct inode *dir_i, struct dentry *dentry,
-			umode_t mode, struct nameidata *nd)
+			umode_t mode)
 {
 	struct jffs2_raw_inode *ri;
 	struct jffs2_inode_info *f, *dir_f;
diff --git a/fs/jfs/namei.c b/fs/jfs/namei.c
index 5f7c160..423c520 100644
--- a/fs/jfs/namei.c
+++ b/fs/jfs/namei.c
@@ -67,13 +67,11 @@ static inline void free_ea_wmap(struct inode *inode)
  * PARAMETER:	dip	- parent directory vnode
  *		dentry	- dentry of new file
  *		mode	- create mode (rwxrwxrwx).
- *		nd- nd struct
  *
  * RETURN:	Errors from subroutines
  *
  */
-static int jfs_create(struct inode *dip, struct dentry *dentry, umode_t mode,
-		struct nameidata *nd)
+static int jfs_create(struct inode *dip, struct dentry *dentry, umode_t mode)
 {
 	int rc = 0;
 	tid_t tid;		/* transaction id */
diff --git a/fs/logfs/dir.c b/fs/logfs/dir.c
index 3de7a32..33556ae 100644
--- a/fs/logfs/dir.c
+++ b/fs/logfs/dir.c
@@ -501,8 +501,7 @@ static int logfs_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode)
 	return __logfs_create(dir, dentry, inode, NULL, 0);
 }
 
-static int logfs_create(struct inode *dir, struct dentry *dentry, umode_t mode,
-		struct nameidata *nd)
+static int logfs_create(struct inode *dir, struct dentry *dentry, umode_t mode)
 {
 	struct inode *inode;
 
diff --git a/fs/minix/namei.c b/fs/minix/namei.c
index 2f76e38..2ec09bb 100644
--- a/fs/minix/namei.c
+++ b/fs/minix/namei.c
@@ -54,8 +54,7 @@ static int minix_mknod(struct inode * dir, struct dentry *dentry, umode_t mode,
 	return error;
 }
 
-static int minix_create(struct inode *dir, struct dentry *dentry, umode_t mode,
-		struct nameidata *nd)
+static int minix_create(struct inode *dir, struct dentry *dentry, umode_t mode)
 {
 	return minix_mknod(dir, dentry, mode, 0);
 }
diff --git a/fs/namei.c b/fs/namei.c
index 4207e4f..5c95ce5 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2066,7 +2066,7 @@ out:
 
 static struct file *create_open(struct inode *dir, struct dentry *dentry,
 				struct opendata *od, unsigned open_flag,
-				umode_t mode, struct nameidata *nd)
+				umode_t mode)
 {
 	int error = may_create(dir, dentry);
 	if (error)
@@ -2081,7 +2081,7 @@ static struct file *create_open(struct inode *dir, struct dentry *dentry,
 	if (error)
 		goto out_err;
 	if (dir->i_op->create) {
-		error = dir->i_op->create(dir, dentry, mode, nd);
+		error = dir->i_op->create(dir, dentry, mode);
 		if (error)
 			goto out_err;
 
@@ -2100,7 +2100,7 @@ int vfs_create(struct inode *dir, struct dentry *dentry, umode_t mode)
 	struct file *res;
 	unsigned open_flag = O_RDONLY|O_CREAT|O_EXCL;
 
-	res = create_open(dir, dentry, NULL, open_flag, mode, NULL);
+	res = create_open(dir, dentry, NULL, open_flag, mode);
 	if (IS_ERR(res))
 		return PTR_ERR(res);
 
@@ -2467,8 +2467,7 @@ retry_lookup:
 		if (error)
 			goto exit_mutex_unlock;
 		od->mnt = nd->path.mnt;
-		filp = create_open(dir->d_inode, dentry, od, open_flag, mode,
-				   nd);
+		filp = create_open(dir->d_inode, dentry, od, open_flag, mode);
 		mutex_unlock(&dir->d_inode->i_mutex);
 		dput(nd->path.dentry);
 		nd->path.dentry = dentry;
diff --git a/fs/ncpfs/dir.c b/fs/ncpfs/dir.c
index aeed93a..34eeb46 100644
--- a/fs/ncpfs/dir.c
+++ b/fs/ncpfs/dir.c
@@ -30,7 +30,7 @@ static void ncp_do_readdir(struct file *, void *, filldir_t,
 
 static int ncp_readdir(struct file *, void *, filldir_t);
 
-static int ncp_create(struct inode *, struct dentry *, umode_t, struct nameidata *);
+static int ncp_create(struct inode *, struct dentry *, umode_t);
 static struct dentry *ncp_lookup(struct inode *, struct dentry *, struct nameidata *);
 static int ncp_unlink(struct inode *, struct dentry *);
 static int ncp_mkdir(struct inode *, struct dentry *, umode_t);
@@ -979,8 +979,7 @@ out:
 	return error;
 }
 
-static int ncp_create(struct inode *dir, struct dentry *dentry, umode_t mode,
-		struct nameidata *nd)
+static int ncp_create(struct inode *dir, struct dentry *dentry, umode_t mode)
 {
 	return ncp_create_new(dir, dentry, mode, 0, 0);
 }
diff --git a/fs/nilfs2/namei.c b/fs/nilfs2/namei.c
index 1cd3f62..ebde465 100644
--- a/fs/nilfs2/namei.c
+++ b/fs/nilfs2/namei.c
@@ -84,8 +84,7 @@ nilfs_lookup(struct inode *dir, struct dentry *dentry, struct nameidata *nd)
  * If the create succeeds, we fill in the inode information
  * with d_instantiate().
  */
-static int nilfs_create(struct inode *dir, struct dentry *dentry, umode_t mode,
-			struct nameidata *nd)
+static int nilfs_create(struct inode *dir, struct dentry *dentry, umode_t mode)
 {
 	struct inode *inode;
 	struct nilfs_transaction_info ti;
diff --git a/fs/ocfs2/dlmfs/dlmfs.c b/fs/ocfs2/dlmfs/dlmfs.c
index abfac0d..aafd800 100644
--- a/fs/ocfs2/dlmfs/dlmfs.c
+++ b/fs/ocfs2/dlmfs/dlmfs.c
@@ -525,8 +525,7 @@ bail:
 
 static int dlmfs_create(struct inode *dir,
 			struct dentry *dentry,
-			umode_t mode,
-			struct nameidata *nd)
+			umode_t mode)
 {
 	int status = 0;
 	struct inode *inode;
diff --git a/fs/ocfs2/namei.c b/fs/ocfs2/namei.c
index a9856e3..0ef9de8 100644
--- a/fs/ocfs2/namei.c
+++ b/fs/ocfs2/namei.c
@@ -617,8 +617,7 @@ static int ocfs2_mkdir(struct inode *dir,
 
 static int ocfs2_create(struct inode *dir,
 			struct dentry *dentry,
-			umode_t mode,
-			struct nameidata *nd)
+			umode_t mode)
 {
 	int ret;
 
diff --git a/fs/omfs/dir.c b/fs/omfs/dir.c
index f00576e..d1aa51d 100644
--- a/fs/omfs/dir.c
+++ b/fs/omfs/dir.c
@@ -284,8 +284,7 @@ static int omfs_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode)
 	return omfs_add_node(dir, dentry, mode | S_IFDIR);
 }
 
-static int omfs_create(struct inode *dir, struct dentry *dentry, umode_t mode,
-		struct nameidata *nd)
+static int omfs_create(struct inode *dir, struct dentry *dentry, umode_t mode)
 {
 	return omfs_add_node(dir, dentry, mode | S_IFREG);
 }
diff --git a/fs/ramfs/inode.c b/fs/ramfs/inode.c
index aec766a..f08381d 100644
--- a/fs/ramfs/inode.c
+++ b/fs/ramfs/inode.c
@@ -114,7 +114,7 @@ static int ramfs_mkdir(struct inode * dir, struct dentry * dentry, umode_t mode)
 	return retval;
 }
 
-static int ramfs_create(struct inode *dir, struct dentry *dentry, umode_t mode, struct nameidata *nd)
+static int ramfs_create(struct inode *dir, struct dentry *dentry, umode_t mode)
 {
 	return ramfs_mknod(dir, dentry, mode | S_IFREG, 0);
 }
diff --git a/fs/reiserfs/namei.c b/fs/reiserfs/namei.c
index 1463788..044120c 100644
--- a/fs/reiserfs/namei.c
+++ b/fs/reiserfs/namei.c
@@ -572,8 +572,8 @@ static int new_inode_init(struct inode *inode, struct inode *dir, umode_t mode)
 	return 0;
 }
 
-static int reiserfs_create(struct inode *dir, struct dentry *dentry, umode_t mode,
-			   struct nameidata *nd)
+static int reiserfs_create(struct inode *dir, struct dentry *dentry,
+			   umode_t mode)
 {
 	int retval;
 	struct inode *inode;
diff --git a/fs/sysv/namei.c b/fs/sysv/namei.c
index b217797..9fd62a4 100644
--- a/fs/sysv/namei.c
+++ b/fs/sysv/namei.c
@@ -80,7 +80,7 @@ static int sysv_mknod(struct inode * dir, struct dentry * dentry, umode_t mode,
 	return err;
 }
 
-static int sysv_create(struct inode * dir, struct dentry * dentry, umode_t mode, struct nameidata *nd)
+static int sysv_create(struct inode *dir, struct dentry *dentry, umode_t mode)
 {
 	return sysv_mknod(dir, dentry, mode, 0);
 }
diff --git a/fs/ubifs/dir.c b/fs/ubifs/dir.c
index d6fe1c7..90adfaf 100644
--- a/fs/ubifs/dir.c
+++ b/fs/ubifs/dir.c
@@ -253,8 +253,7 @@ out:
 	return ERR_PTR(err);
 }
 
-static int ubifs_create(struct inode *dir, struct dentry *dentry, umode_t mode,
-			struct nameidata *nd)
+static int ubifs_create(struct inode *dir, struct dentry *dentry, umode_t mode)
 {
 	struct inode *inode;
 	struct ubifs_info *c = dir->i_sb->s_fs_info;
diff --git a/fs/udf/namei.c b/fs/udf/namei.c
index 08bf46e..c96f1a7 100644
--- a/fs/udf/namei.c
+++ b/fs/udf/namei.c
@@ -552,8 +552,7 @@ static int udf_delete_entry(struct inode *inode, struct fileIdentDesc *fi,
 	return udf_write_fi(inode, cfi, fi, fibh, NULL, NULL);
 }
 
-static int udf_create(struct inode *dir, struct dentry *dentry, umode_t mode,
-		      struct nameidata *nd)
+static int udf_create(struct inode *dir, struct dentry *dentry, umode_t mode)
 {
 	struct udf_fileident_bh fibh;
 	struct inode *inode;
diff --git a/fs/ufs/namei.c b/fs/ufs/namei.c
index 38cac19..ca6c710 100644
--- a/fs/ufs/namei.c
+++ b/fs/ufs/namei.c
@@ -70,8 +70,7 @@ static struct dentry *ufs_lookup(struct inode * dir, struct dentry *dentry, stru
  * If the create succeeds, we fill in the inode information
  * with d_instantiate(). 
  */
-static int ufs_create (struct inode * dir, struct dentry * dentry, umode_t mode,
-		struct nameidata *nd)
+static int ufs_create(struct inode *dir, struct dentry *dentry, umode_t mode)
 {
 	struct inode *inode;
 	int err;
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index ab30253..66810ac 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -231,8 +231,7 @@ STATIC int
 xfs_vn_create(
 	struct inode	*dir,
 	struct dentry	*dentry,
-	umode_t		mode,
-	struct nameidata *nd)
+	umode_t		mode)
 {
 	return xfs_vn_mknod(dir, dentry, mode, 0);
 }
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 23268f4..ea9282d 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1637,7 +1637,7 @@ struct inode_operations {
 	int (*readlink) (struct dentry *, char __user *,int);
 	void (*put_link) (struct dentry *, struct nameidata *, void *);
 
-	int (*create) (struct inode *,struct dentry *,umode_t,struct nameidata *);
+	int (*create) (struct inode *, struct dentry *, umode_t);
 	int (*link) (struct dentry *,struct inode *,struct dentry *);
 	int (*unlink) (struct inode *,struct dentry *);
 	int (*symlink) (struct inode *,struct dentry *,const char *);
diff --git a/ipc/mqueue.c b/ipc/mqueue.c
index b31b495..d757ebb 100644
--- a/ipc/mqueue.c
+++ b/ipc/mqueue.c
@@ -294,8 +294,7 @@ static void mqueue_evict_inode(struct inode *inode)
 		put_ipc_ns(ipc_ns);
 }
 
-static int mqueue_create(struct inode *dir, struct dentry *dentry,
-				umode_t mode, struct nameidata *nd)
+static int mqueue_create(struct inode *dir, struct dentry *dentry, umode_t mode)
 {
 	struct inode *inode;
 	struct mq_attr *attr = dentry->d_fsdata;
diff --git a/mm/shmem.c b/mm/shmem.c
index 269d049..28d63b1 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1524,8 +1524,7 @@ static int shmem_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode)
 	return 0;
 }
 
-static int shmem_create(struct inode *dir, struct dentry *dentry, umode_t mode,
-		struct nameidata *nd)
+static int shmem_create(struct inode *dir, struct dentry *dentry, umode_t mode)
 {
 	return shmem_mknod(dir, dentry, mode | S_IFREG, 0);
 }
-- 
1.7.7



* [PATCH 24/25] vfs: optionally skip lookup on exclusive create
  2012-03-07 21:22 [PATCH 00/25] vfs: atomic open RFC Miklos Szeredi
                   ` (22 preceding siblings ...)
  2012-03-07 21:22 ` [PATCH 23/25] vfs: remove nameidata argument from i_op->create() Miklos Szeredi
@ 2012-03-07 21:22 ` Miklos Szeredi
  2012-03-07 21:22 ` [PATCH 25/25] vfs: remove nameidata from lookup Miklos Szeredi
                   ` (2 subsequent siblings)
  26 siblings, 0 replies; 55+ messages in thread
From: Miklos Szeredi @ 2012-03-07 21:22 UTC (permalink / raw)
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

From: Miklos Szeredi <mszeredi@suse.cz>

NFS optimizes away the last component lookup for exclusive creation (create,
mkdir, mknod, link, symlink).  It does this by checking for LOOKUP_EXCL in
nd->flags and skipping the actual lookup in that case, leaving a negative
unhashed dentry for the create function to fill.

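Move this logic into the VFS, where a filesystem enables it by setting the new
FS_SKIP_LOOKUP_EXCL flag.

Since nfs_lookup() is no longer called on this path, the name length check it
used to perform is added to the NFS mknod, mkdir, symlink and link functions.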

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
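The decision made in kern_path_create() can be sketched in userspace as
follows.  This is a mock under stated assumptions, not the kernel code: struct
demo_fs, struct demo_dentry and demo_final_lookup() are hypothetical, the
need_lookup argument models what lookup_dcache() reports, and the
exists_on_server argument stands in for whatever lookup_real() would find.
Only the flag value is taken from the patch.

```c
#include <errno.h>
#include <stdbool.h>

#define FS_SKIP_LOOKUP_EXCL	0x40000	/* value taken from the patch */

/* Userspace stand-ins; hypothetical, not the kernel structures. */
struct demo_fs {
	int fs_flags;
	int real_lookups;	/* counts simulated lookup_real() calls */
};

struct demo_dentry {
	bool positive;		/* true once an inode is attached */
};

/*
 * Mock of the final-component step of an exclusive create.
 * Returns 0 if the create may proceed, -EEXIST otherwise.
 */
int demo_final_lookup(struct demo_fs *fs, struct demo_dentry *dentry,
		      bool need_lookup, bool exists_on_server)
{
	if (need_lookup && !(fs->fs_flags & FS_SKIP_LOOKUP_EXCL)) {
		/* stands in for lookup_real(): ask the filesystem */
		fs->real_lookups++;
		dentry->positive = exists_on_server;
	}

	/* A positive dentry means the name exists: exclusive create fails. */
	return dentry->positive ? -EEXIST : 0;
}
```

With the flag set, the dentry stays negative and the function returns 0 even
if the name exists, so in this model the existence check is left to the create
operation itself, which matches the "negative unhashed dentry for the create
function to fill" described above.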
 fs/namei.c         |   12 +++++++++++-
 fs/nfs/dir.c       |   22 ++++++++++++----------
 fs/nfs/super.c     |    9 ++++++---
 include/linux/fs.h |    1 +
 4 files changed, 30 insertions(+), 14 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 5c95ce5..3e3652c 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2709,6 +2709,7 @@ struct file *do_file_open_root(struct dentry *dentry, struct vfsmount *mnt,
 
 struct dentry *kern_path_create(int dfd, const char *pathname, struct path *path, int is_dir)
 {
+	bool need_lookup;
 	struct dentry *dentry = ERR_PTR(-EEXIST);
 	struct nameidata nd;
 	int error = do_path_lookup(dfd, pathname, LOOKUP_PARENT, &nd);
@@ -2728,10 +2729,19 @@ struct dentry *kern_path_create(int dfd, const char *pathname, struct path *path
 	 * Do the final lookup.
 	 */
 	mutex_lock_nested(&nd.path.dentry->d_inode->i_mutex, I_MUTEX_PARENT);
-	dentry = lookup_hash(&nd);
+	dentry = lookup_dcache(&nd.last, nd.path.dentry, &nd, &need_lookup);
 	if (IS_ERR(dentry))
 		goto fail;
 
+	if (need_lookup) {
+		struct inode *dir = nd.path.dentry->d_inode;
+		if (!(dir->i_sb->s_type->fs_flags & FS_SKIP_LOOKUP_EXCL)) {
+			dentry = lookup_real(dir, dentry, &nd);
+			if (IS_ERR(dentry))
+				goto fail;
+		}
+	}
+
 	if (dentry->d_inode)
 		goto eexist;
 	/*
diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 887226d..2b91cf3 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -1286,16 +1286,6 @@ static struct dentry *nfs_lookup(struct inode *dir, struct dentry * dentry, stru
 	if (dentry->d_name.len > NFS_SERVER(dir)->namelen)
 		goto out;
 
-	/*
-	 * If we're doing an exclusive create, optimize away the lookup
-	 * but don't hash the dentry.
-	 */
-	if (nfs_is_exclusive_create(dir, nd)) {
-		d_instantiate(dentry, NULL);
-		res = NULL;
-		goto out;
-	}
-
 	res = ERR_PTR(-ENOMEM);
 	fhandle = nfs_alloc_fhandle();
 	fattr = nfs_alloc_fattr();
@@ -1612,6 +1602,9 @@ nfs_mknod(struct inode *dir, struct dentry *dentry, umode_t mode, dev_t rdev)
 	dfprintk(VFS, "NFS: mknod(%s/%ld), %s\n",
 			dir->i_sb->s_id, dir->i_ino, dentry->d_name.name);
 
+	if (dentry->d_name.len > NFS_SERVER(dir)->namelen)
+		return -ENAMETOOLONG;
+
 	if (!new_valid_dev(rdev))
 		return -EINVAL;
 
@@ -1638,6 +1631,9 @@ static int nfs_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode)
 	dfprintk(VFS, "NFS: mkdir(%s/%ld), %s\n",
 			dir->i_sb->s_id, dir->i_ino, dentry->d_name.name);
 
+	if (dentry->d_name.len > NFS_SERVER(dir)->namelen)
+		return -ENAMETOOLONG;
+
 	attr.ia_valid = ATTR_MODE;
 	attr.ia_mode = mode | S_IFDIR;
 
@@ -1771,6 +1767,9 @@ static int nfs_symlink(struct inode *dir, struct dentry *dentry, const char *sym
 	dfprintk(VFS, "NFS: symlink(%s/%ld, %s, %s)\n", dir->i_sb->s_id,
 		dir->i_ino, dentry->d_name.name, symname);
 
+	if (dentry->d_name.len > NFS_SERVER(dir)->namelen)
+		return -ENAMETOOLONG;
+
 	if (pathlen > PAGE_SIZE)
 		return -ENAMETOOLONG;
 
@@ -1824,6 +1823,9 @@ nfs_link(struct dentry *old_dentry, struct inode *dir, struct dentry *dentry)
 		old_dentry->d_parent->d_name.name, old_dentry->d_name.name,
 		dentry->d_parent->d_name.name, dentry->d_name.name);
 
+	if (dentry->d_name.len > NFS_SERVER(dir)->namelen)
+		return -ENAMETOOLONG;
+
 	nfs_inode_return_delegation(inode);
 
 	d_drop(dentry);
diff --git a/fs/nfs/super.c b/fs/nfs/super.c
index 3dfa4f1..41ec94b 100644
--- a/fs/nfs/super.c
+++ b/fs/nfs/super.c
@@ -279,7 +279,8 @@ static struct file_system_type nfs_fs_type = {
 	.name		= "nfs",
 	.mount		= nfs_fs_mount,
 	.kill_sb	= nfs_kill_super,
-	.fs_flags	= FS_RENAME_DOES_D_MOVE|FS_REVAL_DOT|FS_BINARY_MOUNTDATA,
+	.fs_flags	= FS_RENAME_DOES_D_MOVE | FS_REVAL_DOT |
+			  FS_BINARY_MOUNTDATA | FS_SKIP_LOOKUP_EXCL,
 };
 
 struct file_system_type nfs_xdev_fs_type = {
@@ -287,7 +288,8 @@ struct file_system_type nfs_xdev_fs_type = {
 	.name		= "nfs",
 	.mount		= nfs_xdev_mount,
 	.kill_sb	= nfs_kill_super,
-	.fs_flags	= FS_RENAME_DOES_D_MOVE|FS_REVAL_DOT|FS_BINARY_MOUNTDATA,
+	.fs_flags	= FS_RENAME_DOES_D_MOVE | FS_REVAL_DOT |
+			  FS_BINARY_MOUNTDATA | FS_SKIP_LOOKUP_EXCL,
 };
 
 static const struct super_operations nfs_sops = {
@@ -327,7 +329,8 @@ static struct file_system_type nfs4_fs_type = {
 	.name		= "nfs4",
 	.mount		= nfs4_mount,
 	.kill_sb	= nfs4_kill_super,
-	.fs_flags	= FS_RENAME_DOES_D_MOVE|FS_REVAL_DOT|FS_BINARY_MOUNTDATA,
+	.fs_flags	= FS_RENAME_DOES_D_MOVE | FS_REVAL_DOT |
+			  FS_BINARY_MOUNTDATA | FS_SKIP_LOOKUP_EXCL,
 };
 
 static struct file_system_type nfs4_remote_fs_type = {
diff --git a/include/linux/fs.h b/include/linux/fs.h
index ea9282d..4a5e0d3 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -183,6 +183,7 @@ struct inodes_stat_t {
 					 */
 #define FS_NO_LOOKUP_OPEN	0x10000	/* fs can't do atomic lookup+open */
 #define FS_NO_LOOKUP_CREATE	0x20000 /* fs can't do lookup+create+open */
+#define FS_SKIP_LOOKUP_EXCL	0x40000 /* skip lookup for exclusive create */
 
 /*
  * These are the fs-independent mount-flags: up to 32 flags are supported
-- 
1.7.7



* [PATCH 25/25] vfs: remove nameidata from lookup
  2012-03-07 21:22 [PATCH 00/25] vfs: atomic open RFC Miklos Szeredi
                   ` (23 preceding siblings ...)
  2012-03-07 21:22 ` [PATCH 24/25] vfs: optionally skip lookup on exclusive create Miklos Szeredi
@ 2012-03-07 21:22 ` Miklos Szeredi
  2012-03-07 21:27 ` [PATCH 00/25] vfs: atomic open RFC Steve French
  2012-03-13  9:51 ` Christoph Hellwig
  26 siblings, 0 replies; 55+ messages in thread
From: Miklos Szeredi @ 2012-03-07 21:22 UTC (permalink / raw)
  To: viro
  Cc: linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench, sage,
	ericvh, mszeredi

From: Miklos Szeredi <mszeredi@suse.cz>

Remove the nameidata argument of i_op->lookup().  It is no longer used by any
filesystem.

Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
---
 Documentation/filesystems/Locking |    3 +--
 Documentation/filesystems/vfs.txt |    2 +-
 fs/9p/v9fs.h                      |    3 +--
 fs/9p/vfs_inode.c                 |    8 +++-----
 fs/adfs/dir.c                     |    2 +-
 fs/affs/affs.h                    |    2 +-
 fs/affs/namei.c                   |    2 +-
 fs/afs/dir.c                      |    6 ++----
 fs/afs/mntpt.c                    |    7 ++-----
 fs/autofs4/root.c                 |    4 ++--
 fs/bad_inode.c                    |    3 +--
 fs/befs/linuxvfs.c                |    4 ++--
 fs/bfs/dir.c                      |    3 +--
 fs/btrfs/inode.c                  |    3 +--
 fs/ceph/dir.c                     |    5 ++---
 fs/cifs/cifsfs.h                  |    3 +--
 fs/cifs/dir.c                     |    3 +--
 fs/coda/dir.c                     |    4 ++--
 fs/configfs/dir.c                 |    4 +---
 fs/cramfs/inode.c                 |    2 +-
 fs/ecryptfs/inode.c               |    4 +---
 fs/efs/efs.h                      |    2 +-
 fs/efs/namei.c                    |    3 ++-
 fs/exofs/namei.c                  |    3 +--
 fs/ext2/namei.c                   |    2 +-
 fs/ext3/namei.c                   |    2 +-
 fs/ext4/namei.c                   |    2 +-
 fs/fat/namei_msdos.c              |    3 +--
 fs/fat/namei_vfat.c               |    3 +--
 fs/freevxfs/vxfs_lookup.c         |    5 ++---
 fs/fuse/dir.c                     |    3 +--
 fs/gfs2/inode.c                   |    4 +---
 fs/hfs/dir.c                      |    3 +--
 fs/hfs/inode.c                    |    3 +--
 fs/hfsplus/dir.c                  |    3 +--
 fs/hfsplus/inode.c                |    2 +-
 fs/hostfs/hostfs_kern.c           |    3 +--
 fs/hpfs/dir.c                     |    2 +-
 fs/hpfs/hpfs_fn.h                 |    2 +-
 fs/hppfs/hppfs.c                  |    3 +--
 fs/isofs/isofs.h                  |    2 +-
 fs/isofs/namei.c                  |    2 +-
 fs/jffs2/dir.c                    |    6 ++----
 fs/jfs/namei.c                    |    2 +-
 fs/libfs.c                        |    2 +-
 fs/logfs/dir.c                    |    3 +--
 fs/minix/namei.c                  |    2 +-
 fs/namei.c                        |   11 +++++------
 fs/ncpfs/dir.c                    |    4 ++--
 fs/nfs/dir.c                      |    4 ++--
 fs/nilfs2/namei.c                 |    3 +--
 fs/ntfs/namei.c                   |    4 +---
 fs/ocfs2/namei.c                  |    3 +--
 fs/omfs/dir.c                     |    3 +--
 fs/openpromfs/inode.c             |    5 +++--
 fs/proc/base.c                    |   22 ++++++++++++----------
 fs/proc/generic.c                 |    3 +--
 fs/proc/internal.h                |    4 ++--
 fs/proc/namespaces.c              |    2 +-
 fs/proc/proc_net.c                |    2 +-
 fs/proc/proc_sysctl.c             |    3 +--
 fs/proc/root.c                    |    9 ++++-----
 fs/qnx4/namei.c                   |    2 +-
 fs/qnx4/qnx4.h                    |    2 +-
 fs/reiserfs/namei.c               |    3 +--
 fs/romfs/super.c                  |    3 +--
 fs/squashfs/namei.c               |    3 +--
 fs/sysfs/dir.c                    |    3 +--
 fs/sysv/namei.c                   |    2 +-
 fs/ubifs/dir.c                    |    3 +--
 fs/udf/namei.c                    |    3 +--
 fs/ufs/namei.c                    |    2 +-
 fs/xfs/xfs_iops.c                 |    6 ++----
 include/linux/fs.h                |    4 ++--
 kernel/cgroup.c                   |    4 ++--
 75 files changed, 112 insertions(+), 159 deletions(-)

diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking
index 4fca82e..636eb0b 100644
--- a/Documentation/filesystems/Locking
+++ b/Documentation/filesystems/Locking
@@ -38,8 +38,7 @@ d_manage:	no		no		yes (ref-walk)	maybe
 --------------------------- inode_operations --------------------------- 
 prototypes:
 	int (*create) (struct inode *,struct dentry *,umode_t, struct nameidata *);
-	struct dentry * (*lookup) (struct inode *,struct dentry *, struct nameid
-ata *);
+	struct dentry * (*lookup) (struct inode *,struct dentry *);
 	int (*link) (struct dentry *,struct inode *,struct dentry *);
 	int (*unlink) (struct inode *,struct dentry *);
 	int (*symlink) (struct inode *,struct dentry *,const char *);
diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
index 3d9393b..647307f 100644
--- a/Documentation/filesystems/vfs.txt
+++ b/Documentation/filesystems/vfs.txt
@@ -342,7 +342,7 @@ filesystem. As of kernel 2.6.22, the following members are defined:
 
 struct inode_operations {
 	int (*create) (struct inode *,struct dentry *, umode_t, struct nameidata *);
-	struct dentry * (*lookup) (struct inode *,struct dentry *, struct nameidata *);
+	struct dentry * (*lookup) (struct inode *,struct dentry *);
 	int (*link) (struct dentry *,struct inode *,struct dentry *);
 	int (*unlink) (struct inode *,struct dentry *);
 	int (*symlink) (struct inode *,struct dentry *,const char *);
diff --git a/fs/9p/v9fs.h b/fs/9p/v9fs.h
index e78956c..cb28298 100644
--- a/fs/9p/v9fs.h
+++ b/fs/9p/v9fs.h
@@ -143,8 +143,7 @@ struct p9_fid *v9fs_session_init(struct v9fs_session_info *, const char *,
 extern void v9fs_session_close(struct v9fs_session_info *v9ses);
 extern void v9fs_session_cancel(struct v9fs_session_info *v9ses);
 extern void v9fs_session_begin_cancel(struct v9fs_session_info *v9ses);
-extern struct dentry *v9fs_vfs_lookup(struct inode *dir, struct dentry *dentry,
-			struct nameidata *nameidata);
+extern struct dentry *v9fs_vfs_lookup(struct inode *dir, struct dentry *dentry);
 extern int v9fs_vfs_unlink(struct inode *i, struct dentry *d);
 extern int v9fs_vfs_rmdir(struct inode *i, struct dentry *d);
 extern int v9fs_vfs_rename(struct inode *old_dir, struct dentry *old_dentry,
diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
index 3d526a7..6883765 100644
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -830,12 +830,10 @@ static int v9fs_vfs_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode
  * v9fs_vfs_lookup - VFS lookup hook to "walk" to a new inode
  * @dir:  inode that is being walked from
  * @dentry: dentry that is being walked to?
- * @nameidata: path data
  *
  */
 
-struct dentry *v9fs_vfs_lookup(struct inode *dir, struct dentry *dentry,
-				      struct nameidata *nameidata)
+struct dentry *v9fs_vfs_lookup(struct inode *dir, struct dentry *dentry)
 {
 	struct dentry *res;
 	struct super_block *sb;
@@ -845,8 +843,8 @@ struct dentry *v9fs_vfs_lookup(struct inode *dir, struct dentry *dentry,
 	char *name;
 	int result = 0;
 
-	p9_debug(P9_DEBUG_VFS, "dir: %p dentry: (%s) %p nameidata: %p\n",
-		 dir, dentry->d_name.name, dentry, nameidata);
+	p9_debug(P9_DEBUG_VFS, "dir: %p dentry: (%s) %p\n",
+		 dir, dentry->d_name.name, dentry);
 
 	if (dentry->d_name.len > NAME_MAX)
 		return ERR_PTR(-ENAMETOOLONG);
diff --git a/fs/adfs/dir.c b/fs/adfs/dir.c
index 3d83075a..9314d8d 100644
--- a/fs/adfs/dir.c
+++ b/fs/adfs/dir.c
@@ -266,7 +266,7 @@ const struct dentry_operations adfs_dentry_operations = {
 };
 
 static struct dentry *
-adfs_lookup(struct inode *dir, struct dentry *dentry, struct nameidata *nd)
+adfs_lookup(struct inode *dir, struct dentry *dentry)
 {
 	struct inode *inode = NULL;
 	struct object_info obj;
diff --git a/fs/affs/affs.h b/fs/affs/affs.h
index e0fca52..9b698c8 100644
--- a/fs/affs/affs.h
+++ b/fs/affs/affs.h
@@ -154,7 +154,7 @@ extern void	affs_free_bitmap(struct super_block *sb);
 /* namei.c */
 
 extern int	affs_hash_name(struct super_block *sb, const u8 *name, unsigned int len);
-extern struct dentry *affs_lookup(struct inode *dir, struct dentry *dentry, struct nameidata *);
+extern struct dentry *affs_lookup(struct inode *dir, struct dentry *dentry);
 extern int	affs_unlink(struct inode *dir, struct dentry *dentry);
 extern int	affs_create(struct inode *dir, struct dentry *dentry,
 			    umode_t mode);
diff --git a/fs/affs/namei.c b/fs/affs/namei.c
index 3ad7695..886f4d2 100644
--- a/fs/affs/namei.c
+++ b/fs/affs/namei.c
@@ -211,7 +211,7 @@ affs_find_entry(struct inode *dir, struct dentry *dentry)
 }
 
 struct dentry *
-affs_lookup(struct inode *dir, struct dentry *dentry, struct nameidata *nd)
+affs_lookup(struct inode *dir, struct dentry *dentry)
 {
 	struct super_block *sb = dir->i_sb;
 	struct buffer_head *bh;
diff --git a/fs/afs/dir.c b/fs/afs/dir.c
index ea254ab..853e9ca 100644
--- a/fs/afs/dir.c
+++ b/fs/afs/dir.c
@@ -19,8 +19,7 @@
 #include <linux/sched.h>
 #include "internal.h"
 
-static struct dentry *afs_lookup(struct inode *dir, struct dentry *dentry,
-				 struct nameidata *nd);
+static struct dentry *afs_lookup(struct inode *dir, struct dentry *dentry);
 static int afs_dir_open(struct inode *inode, struct file *file);
 static int afs_readdir(struct file *file, void *dirent, filldir_t filldir);
 static int afs_d_revalidate(struct dentry *dentry, struct nameidata *nd);
@@ -514,8 +513,7 @@ out:
 /*
  * look up an entry in a directory
  */
-static struct dentry *afs_lookup(struct inode *dir, struct dentry *dentry,
-				 struct nameidata *nd)
+static struct dentry *afs_lookup(struct inode *dir, struct dentry *dentry)
 {
 	struct afs_vnode *vnode;
 	struct afs_fid fid;
diff --git a/fs/afs/mntpt.c b/fs/afs/mntpt.c
index 8f4ce26..fa15077 100644
--- a/fs/afs/mntpt.c
+++ b/fs/afs/mntpt.c
@@ -21,8 +21,7 @@
 
 
 static struct dentry *afs_mntpt_lookup(struct inode *dir,
-				       struct dentry *dentry,
-				       struct nameidata *nd);
+				       struct dentry *dentry);
 static int afs_mntpt_open(struct inode *inode, struct file *file);
 static void afs_mntpt_expiry_timed_out(struct work_struct *work);
 
@@ -102,9 +101,7 @@ out:
 /*
  * no valid lookup procedure on this sort of dir
  */
-static struct dentry *afs_mntpt_lookup(struct inode *dir,
-				       struct dentry *dentry,
-				       struct nameidata *nd)
+static struct dentry *afs_mntpt_lookup(struct inode *dir, struct dentry *dentry)
 {
 	_enter("%p,%p{%p{%s},%s}",
 	       dir,
diff --git a/fs/autofs4/root.c b/fs/autofs4/root.c
index 75e5f1c..43ccbc5 100644
--- a/fs/autofs4/root.c
+++ b/fs/autofs4/root.c
@@ -32,7 +32,7 @@ static long autofs4_root_ioctl(struct file *,unsigned int,unsigned long);
 static long autofs4_root_compat_ioctl(struct file *,unsigned int,unsigned long);
 #endif
 static int autofs4_dir_open(struct inode *inode, struct file *file);
-static struct dentry *autofs4_lookup(struct inode *,struct dentry *, struct nameidata *);
+static struct dentry *autofs4_lookup(struct inode *, struct dentry *);
 static struct vfsmount *autofs4_d_automount(struct path *);
 static int autofs4_d_manage(struct dentry *, bool);
 static void autofs4_dentry_release(struct dentry *);
@@ -458,7 +458,7 @@ int autofs4_d_manage(struct dentry *dentry, bool rcu_walk)
 }
 
 /* Lookups in the root directory */
-static struct dentry *autofs4_lookup(struct inode *dir, struct dentry *dentry, struct nameidata *nd)
+static struct dentry *autofs4_lookup(struct inode *dir, struct dentry *dentry)
 {
 	struct autofs_sb_info *sbi;
 	struct autofs_info *ino;
diff --git a/fs/bad_inode.c b/fs/bad_inode.c
index 9fc0eab..8f76ebf 100644
--- a/fs/bad_inode.c
+++ b/fs/bad_inode.c
@@ -178,8 +178,7 @@ static int bad_inode_create(struct inode *dir, struct dentry *dentry,
 	return -EIO;
 }
 
-static struct dentry *bad_inode_lookup(struct inode *dir,
-			struct dentry *dentry, struct nameidata *nd)
+static struct dentry *bad_inode_lookup(struct inode *dir, struct dentry *dentry)
 {
 	return ERR_PTR(-EIO);
 }
diff --git a/fs/befs/linuxvfs.c b/fs/befs/linuxvfs.c
index 6e6d536..1c2e50f 100644
--- a/fs/befs/linuxvfs.c
+++ b/fs/befs/linuxvfs.c
@@ -34,7 +34,7 @@ static int befs_readdir(struct file *, void *, filldir_t);
 static int befs_get_block(struct inode *, sector_t, struct buffer_head *, int);
 static int befs_readpage(struct file *file, struct page *page);
 static sector_t befs_bmap(struct address_space *mapping, sector_t block);
-static struct dentry *befs_lookup(struct inode *, struct dentry *, struct nameidata *);
+static struct dentry *befs_lookup(struct inode *, struct dentry *);
 static struct inode *befs_iget(struct super_block *, unsigned long);
 static struct inode *befs_alloc_inode(struct super_block *sb);
 static void befs_destroy_inode(struct inode *inode);
@@ -159,7 +159,7 @@ befs_get_block(struct inode *inode, sector_t block,
 }
 
 static struct dentry *
-befs_lookup(struct inode *dir, struct dentry *dentry, struct nameidata *nd)
+befs_lookup(struct inode *dir, struct dentry *dentry)
 {
 	struct inode *inode = NULL;
 	struct super_block *sb = dir->i_sb;
diff --git a/fs/bfs/dir.c b/fs/bfs/dir.c
index e9a3937..4322fc7 100644
--- a/fs/bfs/dir.c
+++ b/fs/bfs/dir.c
@@ -131,8 +131,7 @@ static int bfs_create(struct inode *dir, struct dentry *dentry, umode_t mode)
 	return 0;
 }
 
-static struct dentry *bfs_lookup(struct inode *dir, struct dentry *dentry,
-						struct nameidata *nd)
+static struct dentry *bfs_lookup(struct inode *dir, struct dentry *dentry)
 {
 	struct inode *inode = NULL;
 	struct buffer_head *bh;
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index a5b0fb5..2514364 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -4007,8 +4007,7 @@ static void btrfs_dentry_release(struct dentry *dentry)
 		kfree(dentry->d_fsdata);
 }
 
-static struct dentry *btrfs_lookup(struct inode *dir, struct dentry *dentry,
-				   struct nameidata *nd)
+static struct dentry *btrfs_lookup(struct inode *dir, struct dentry *dentry)
 {
 	struct dentry *ret;
 
diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
index 62b10e7..17f984f 100644
--- a/fs/ceph/dir.c
+++ b/fs/ceph/dir.c
@@ -575,8 +575,7 @@ static int is_root_ceph_dentry(struct inode *inode, struct dentry *dentry)
  * Look up a single dir entry.  If there is a lookup intent, inform
  * the MDS so that it gets our 'caps wanted' value in a single op.
  */
-static struct dentry *ceph_lookup(struct inode *dir, struct dentry *dentry,
-				  struct nameidata *nd)
+static struct dentry *ceph_lookup(struct inode *dir, struct dentry *dentry)
 {
 	struct ceph_fs_client *fsc = ceph_sb_to_client(dir->i_sb);
 	struct ceph_mds_client *mdsc = fsc->mdsc;
@@ -640,7 +639,7 @@ static struct dentry *ceph_lookup(struct inode *dir, struct dentry *dentry,
  */
 int ceph_handle_notrace_create(struct inode *dir, struct dentry *dentry)
 {
-	struct dentry *result = ceph_lookup(dir, dentry, NULL);
+	struct dentry *result = ceph_lookup(dir, dentry);
 
 	if (result && !IS_ERR(result)) {
 		/*
diff --git a/fs/cifs/cifsfs.h b/fs/cifs/cifsfs.h
index 16aa162..2cfd0f4 100644
--- a/fs/cifs/cifsfs.h
+++ b/fs/cifs/cifsfs.h
@@ -47,8 +47,7 @@ extern struct inode *cifs_root_iget(struct super_block *);
 extern struct file *cifs_atomic_open(struct inode *, struct dentry *,
 				     struct opendata *, unsigned, umode_t,
 				     bool *);
-extern struct dentry *cifs_lookup(struct inode *, struct dentry *,
-				  struct nameidata *);
+extern struct dentry *cifs_lookup(struct inode *, struct dentry *);
 extern int cifs_unlink(struct inode *dir, struct dentry *dentry);
 extern int cifs_hardlink(struct dentry *, struct inode *, struct dentry *);
 extern int cifs_mknod(struct inode *, struct dentry *, umode_t, dev_t);
diff --git a/fs/cifs/dir.c b/fs/cifs/dir.c
index 507cc67..57212c9 100644
--- a/fs/cifs/dir.c
+++ b/fs/cifs/dir.c
@@ -534,8 +534,7 @@ mknod_out:
 }
 
 struct dentry *
-cifs_lookup(struct inode *parent_dir_inode, struct dentry *direntry,
-	    struct nameidata *nd)
+cifs_lookup(struct inode *parent_dir_inode, struct dentry *direntry)
 {
 	int xid;
 	int rc = 0; /* to get around spurious gcc warning, set to zero here */
diff --git a/fs/coda/dir.c b/fs/coda/dir.c
index e8d02b9..025c174 100644
--- a/fs/coda/dir.c
+++ b/fs/coda/dir.c
@@ -31,7 +31,7 @@
 
 /* dir inode-ops */
 static int coda_create(struct inode *dir, struct dentry *new, umode_t mode);
-static struct dentry *coda_lookup(struct inode *dir, struct dentry *target, struct nameidata *nd);
+static struct dentry *coda_lookup(struct inode *dir, struct dentry *target);
 static int coda_link(struct dentry *old_dentry, struct inode *dir_inode, 
 		     struct dentry *entry);
 static int coda_unlink(struct inode *dir_inode, struct dentry *entry);
@@ -94,7 +94,7 @@ const struct file_operations coda_dir_operations = {
 
 /* inode operations for directories */
 /* access routines: lookup, readlink, permission */
-static struct dentry *coda_lookup(struct inode *dir, struct dentry *entry, struct nameidata *nd)
+static struct dentry *coda_lookup(struct inode *dir, struct dentry *entry)
 {
 	struct super_block *sb = dir->i_sb;
 	const char *name = entry->d_name.name;
diff --git a/fs/configfs/dir.c b/fs/configfs/dir.c
index 5ddd7eb..2c09ae6 100644
--- a/fs/configfs/dir.c
+++ b/fs/configfs/dir.c
@@ -450,9 +450,7 @@ static int configfs_attach_attr(struct configfs_dirent * sd, struct dentry * den
 	return 0;
 }
 
-static struct dentry * configfs_lookup(struct inode *dir,
-				       struct dentry *dentry,
-				       struct nameidata *nd)
+static struct dentry *configfs_lookup(struct inode *dir, struct dentry *dentry)
 {
 	struct configfs_dirent * parent_sd = dentry->d_parent->d_fsdata;
 	struct configfs_dirent * sd;
diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c
index a2ee8f9..53a5135 100644
--- a/fs/cramfs/inode.c
+++ b/fs/cramfs/inode.c
@@ -419,7 +419,7 @@ static int cramfs_readdir(struct file *filp, void *dirent, filldir_t filldir)
 /*
  * Lookup and fill in the inode data..
  */
-static struct dentry * cramfs_lookup(struct inode *dir, struct dentry *dentry, struct nameidata *nd)
+static struct dentry *cramfs_lookup(struct inode *dir, struct dentry *dentry)
 {
 	unsigned int offset = 0;
 	struct inode *inode = NULL;
diff --git a/fs/ecryptfs/inode.c b/fs/ecryptfs/inode.c
index 8371f06..0e3fa4c 100644
--- a/fs/ecryptfs/inode.c
+++ b/fs/ecryptfs/inode.c
@@ -366,14 +366,12 @@ static int ecryptfs_lookup_interpose(struct dentry *dentry,
  * ecryptfs_lookup
  * @ecryptfs_dir_inode: The eCryptfs directory inode
  * @ecryptfs_dentry: The eCryptfs dentry that we are looking up
- * @ecryptfs_nd: nameidata; may be NULL
  *
  * Find a file on disk. If the file does not exist, then we'll add it to the
  * dentry cache and continue on to read it from the disk.
  */
 static struct dentry *ecryptfs_lookup(struct inode *ecryptfs_dir_inode,
-				      struct dentry *ecryptfs_dentry,
-				      struct nameidata *ecryptfs_nd)
+				      struct dentry *ecryptfs_dentry)
 {
 	char *encrypted_and_encoded_name = NULL;
 	size_t encrypted_and_encoded_name_size;
diff --git a/fs/efs/efs.h b/fs/efs/efs.h
index d8305b5..2729b33 100644
--- a/fs/efs/efs.h
+++ b/fs/efs/efs.h
@@ -129,7 +129,7 @@ extern struct inode *efs_iget(struct super_block *, unsigned long);
 extern efs_block_t efs_map_block(struct inode *, efs_block_t);
 extern int efs_get_block(struct inode *, sector_t, struct buffer_head *, int);
 
-extern struct dentry *efs_lookup(struct inode *, struct dentry *, struct nameidata *);
+extern struct dentry *efs_lookup(struct inode *, struct dentry *);
 extern struct dentry *efs_fh_to_dentry(struct super_block *sb, struct fid *fid,
 		int fh_len, int fh_type);
 extern struct dentry *efs_fh_to_parent(struct super_block *sb, struct fid *fid,
diff --git a/fs/efs/namei.c b/fs/efs/namei.c
index 832b10d..ee1f261 100644
--- a/fs/efs/namei.c
+++ b/fs/efs/namei.c
@@ -58,7 +58,8 @@ static efs_ino_t efs_find_entry(struct inode *inode, const char *name, int len)
 	return(0);
 }
 
-struct dentry *efs_lookup(struct inode *dir, struct dentry *dentry, struct nameidata *nd) {
+struct dentry *efs_lookup(struct inode *dir, struct dentry *dentry)
+{
 	efs_ino_t inodenum;
 	struct inode *inode = NULL;
 
diff --git a/fs/exofs/namei.c b/fs/exofs/namei.c
index 570845b..4c77014 100644
--- a/fs/exofs/namei.c
+++ b/fs/exofs/namei.c
@@ -45,8 +45,7 @@ static inline int exofs_add_nondir(struct dentry *dentry, struct inode *inode)
 	return err;
 }
 
-static struct dentry *exofs_lookup(struct inode *dir, struct dentry *dentry,
-				   struct nameidata *nd)
+static struct dentry *exofs_lookup(struct inode *dir, struct dentry *dentry)
 {
 	struct inode *inode;
 	ino_t ino;
diff --git a/fs/ext2/namei.c b/fs/ext2/namei.c
index a39f4c4..0fc69a6 100644
--- a/fs/ext2/namei.c
+++ b/fs/ext2/namei.c
@@ -55,7 +55,7 @@ static inline int ext2_add_nondir(struct dentry *dentry, struct inode *inode)
  * Methods themselves.
  */
 
-static struct dentry *ext2_lookup(struct inode * dir, struct dentry *dentry, struct nameidata *nd)
+static struct dentry *ext2_lookup(struct inode *dir, struct dentry *dentry)
 {
 	struct inode * inode;
 	ino_t ino;
diff --git a/fs/ext3/namei.c b/fs/ext3/namei.c
index af7adfc..d7803b0 100644
--- a/fs/ext3/namei.c
+++ b/fs/ext3/namei.c
@@ -1023,7 +1023,7 @@ errout:
 	return NULL;
 }
 
-static struct dentry *ext3_lookup(struct inode * dir, struct dentry *dentry, struct nameidata *nd)
+static struct dentry *ext3_lookup(struct inode *dir, struct dentry *dentry)
 {
 	struct inode * inode;
 	struct ext3_dir_entry_2 * de;
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 20f9a13..380b7b2 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -1019,7 +1019,7 @@ errout:
 	return NULL;
 }
 
-static struct dentry *ext4_lookup(struct inode *dir, struct dentry *dentry, struct nameidata *nd)
+static struct dentry *ext4_lookup(struct inode *dir, struct dentry *dentry)
 {
 	struct inode *inode;
 	struct ext4_dir_entry_2 *de;
diff --git a/fs/fat/namei_msdos.c b/fs/fat/namei_msdos.c
index c854101..c6fa9c9 100644
--- a/fs/fat/namei_msdos.c
+++ b/fs/fat/namei_msdos.c
@@ -200,8 +200,7 @@ static const struct dentry_operations msdos_dentry_operations = {
  */
 
 /***** Get inode using directory and name */
-static struct dentry *msdos_lookup(struct inode *dir, struct dentry *dentry,
-				   struct nameidata *nd)
+static struct dentry *msdos_lookup(struct inode *dir, struct dentry *dentry)
 {
 	struct super_block *sb = dir->i_sb;
 	struct fat_slot_info sinfo;
diff --git a/fs/fat/namei_vfat.c b/fs/fat/namei_vfat.c
index 39e810f..fe45623 100644
--- a/fs/fat/namei_vfat.c
+++ b/fs/fat/namei_vfat.c
@@ -724,8 +724,7 @@ static int vfat_d_anon_disconn(struct dentry *dentry)
 	return IS_ROOT(dentry) && (dentry->d_flags & DCACHE_DISCONNECTED);
 }
 
-static struct dentry *vfat_lookup(struct inode *dir, struct dentry *dentry,
-				  struct nameidata *nd)
+static struct dentry *vfat_lookup(struct inode *dir, struct dentry *dentry)
 {
 	struct super_block *sb = dir->i_sb;
 	struct fat_slot_info sinfo;
diff --git a/fs/freevxfs/vxfs_lookup.c b/fs/freevxfs/vxfs_lookup.c
index 3360f1e..9eec2b4 100644
--- a/fs/freevxfs/vxfs_lookup.c
+++ b/fs/freevxfs/vxfs_lookup.c
@@ -48,7 +48,7 @@
 #define VXFS_BLOCK_PER_PAGE(sbp)  ((PAGE_CACHE_SIZE / (sbp)->s_blocksize))
 
 
-static struct dentry *	vxfs_lookup(struct inode *, struct dentry *, struct nameidata *);
+static struct dentry	*vxfs_lookup(struct inode *, struct dentry *);
 static int		vxfs_readdir(struct file *, void *, filldir_t);
 
 const struct inode_operations vxfs_dir_inode_ops = {
@@ -192,7 +192,6 @@ vxfs_inode_by_name(struct inode *dip, struct dentry *dp)
  * vxfs_lookup - lookup pathname component
  * @dip:	dir in which we lookup
  * @dp:		dentry we lookup
- * @nd:		lookup nameidata
  *
  * Description:
  *   vxfs_lookup tries to lookup the pathname component described
@@ -203,7 +202,7 @@ vxfs_inode_by_name(struct inode *dip, struct dentry *dp)
  *   in the return pointer.
  */
 static struct dentry *
-vxfs_lookup(struct inode *dip, struct dentry *dp, struct nameidata *nd)
+vxfs_lookup(struct inode *dip, struct dentry *dp)
 {
 	struct inode		*ip = NULL;
 	ino_t			ino;
diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index 584385e..cf0e470 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -315,8 +315,7 @@ int fuse_lookup_name(struct super_block *sb, u64 nodeid, struct qstr *name,
 	return err;
 }
 
-static struct dentry *fuse_lookup(struct inode *dir, struct dentry *entry,
-				  struct nameidata *nd)
+static struct dentry *fuse_lookup(struct inode *dir, struct dentry *entry)
 {
 	int err;
 	struct fuse_entry_out outarg;
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c
index 203ec3c..d1820e5 100644
--- a/fs/gfs2/inode.c
+++ b/fs/gfs2/inode.c
@@ -775,15 +775,13 @@ static struct file *gfs2_create(struct inode *dir, struct dentry *dentry,
  * gfs2_lookup - Look up a filename in a directory and return its inode
  * @dir: The directory inode
  * @dentry: The dentry of the new inode
- * @nd: passed from Linux VFS, ignored by us
  *
  * Called by the VFS layer. Lock dir and call gfs2_lookupi()
  *
  * Returns: errno
  */
 
-static struct dentry *gfs2_lookup(struct inode *dir, struct dentry *dentry,
-				  struct nameidata *nd)
+static struct dentry *gfs2_lookup(struct inode *dir, struct dentry *dentry)
 {
 	struct inode *inode = gfs2_lookupi(dir, &dentry->d_name, 0);
 	if (inode && !IS_ERR(inode)) {
diff --git a/fs/hfs/dir.c b/fs/hfs/dir.c
index ba125c1..ee6b6f2 100644
--- a/fs/hfs/dir.c
+++ b/fs/hfs/dir.c
@@ -17,8 +17,7 @@
 /*
  * hfs_lookup()
  */
-static struct dentry *hfs_lookup(struct inode *dir, struct dentry *dentry,
-				 struct nameidata *nd)
+static struct dentry *hfs_lookup(struct inode *dir, struct dentry *dentry)
 {
 	hfs_cat_rec rec;
 	struct hfs_find_data fd;
diff --git a/fs/hfs/inode.c b/fs/hfs/inode.c
index 737dbeb..c649d1a 100644
--- a/fs/hfs/inode.c
+++ b/fs/hfs/inode.c
@@ -488,8 +488,7 @@ out:
 	return 0;
 }
 
-static struct dentry *hfs_file_lookup(struct inode *dir, struct dentry *dentry,
-				      struct nameidata *nd)
+static struct dentry *hfs_file_lookup(struct inode *dir, struct dentry *dentry)
 {
 	struct inode *inode = NULL;
 	hfs_cat_rec rec;
diff --git a/fs/hfsplus/dir.c b/fs/hfsplus/dir.c
index bfb2ad8..03bd823 100644
--- a/fs/hfsplus/dir.c
+++ b/fs/hfsplus/dir.c
@@ -24,8 +24,7 @@ static inline void hfsplus_instantiate(struct dentry *dentry,
 }
 
 /* Find the entry inside dir named dentry->d_name */
-static struct dentry *hfsplus_lookup(struct inode *dir, struct dentry *dentry,
-				     struct nameidata *nd)
+static struct dentry *hfsplus_lookup(struct inode *dir, struct dentry *dentry)
 {
 	struct inode *inode = NULL;
 	struct hfs_find_data fd;
diff --git a/fs/hfsplus/inode.c b/fs/hfsplus/inode.c
index 6643b24..8edd486 100644
--- a/fs/hfsplus/inode.c
+++ b/fs/hfsplus/inode.c
@@ -168,7 +168,7 @@ const struct dentry_operations hfsplus_dentry_operations = {
 };
 
 static struct dentry *hfsplus_file_lookup(struct inode *dir,
-		struct dentry *dentry, struct nameidata *nd)
+					  struct dentry *dentry)
 {
 	struct hfs_find_data fd;
 	struct super_block *sb = dir->i_sb;
diff --git a/fs/hostfs/hostfs_kern.c b/fs/hostfs/hostfs_kern.c
index a3d2242..6f6985b 100644
--- a/fs/hostfs/hostfs_kern.c
+++ b/fs/hostfs/hostfs_kern.c
@@ -592,8 +592,7 @@ static int hostfs_create(struct inode *dir, struct dentry *dentry, umode_t mode)
 	return error;
 }
 
-struct dentry *hostfs_lookup(struct inode *ino, struct dentry *dentry,
-			     struct nameidata *nd)
+static struct dentry *hostfs_lookup(struct inode *ino, struct dentry *dentry)
 {
 	struct inode *inode;
 	char *name;
diff --git a/fs/hpfs/dir.c b/fs/hpfs/dir.c
index 2fa0089..087afbf 100644
--- a/fs/hpfs/dir.c
+++ b/fs/hpfs/dir.c
@@ -189,7 +189,7 @@ out:
  *	      to tell read_inode to read fnode or not.
  */
 
-struct dentry *hpfs_lookup(struct inode *dir, struct dentry *dentry, struct nameidata *nd)
+struct dentry *hpfs_lookup(struct inode *dir, struct dentry *dentry)
 {
 	const unsigned char *name = dentry->d_name.name;
 	unsigned len = dentry->d_name.len;
diff --git a/fs/hpfs/hpfs_fn.h b/fs/hpfs/hpfs_fn.h
index de94617..eb08076 100644
--- a/fs/hpfs/hpfs_fn.h
+++ b/fs/hpfs/hpfs_fn.h
@@ -227,7 +227,7 @@ extern const struct dentry_operations hpfs_dentry_operations;
 
 /* dir.c */
 
-struct dentry *hpfs_lookup(struct inode *, struct dentry *, struct nameidata *);
+struct dentry *hpfs_lookup(struct inode *, struct dentry *);
 extern const struct file_operations hpfs_dir_ops;
 
 /* dnode.c */
diff --git a/fs/hppfs/hppfs.c b/fs/hppfs/hppfs.c
index d92f4ce..90b94e6 100644
--- a/fs/hppfs/hppfs.c
+++ b/fs/hppfs/hppfs.c
@@ -137,8 +137,7 @@ static int file_removed(struct dentry *dentry, const char *file)
 	return 0;
 }
 
-static struct dentry *hppfs_lookup(struct inode *ino, struct dentry *dentry,
-				   struct nameidata *nd)
+static struct dentry *hppfs_lookup(struct inode *ino, struct dentry *dentry)
 {
 	struct dentry *proc_dentry, *parent;
 	struct qstr *name = &dentry->d_name;
diff --git a/fs/isofs/isofs.h b/fs/isofs/isofs.h
index 0e73f63..c0589a4 100644
--- a/fs/isofs/isofs.h
+++ b/fs/isofs/isofs.h
@@ -114,7 +114,7 @@ extern int isofs_name_translate(struct iso_directory_record *, char *, struct in
 int get_joliet_filename(struct iso_directory_record *, unsigned char *, struct inode *);
 int get_acorn_filename(struct iso_directory_record *, char *, struct inode *);
 
-extern struct dentry *isofs_lookup(struct inode *, struct dentry *, struct nameidata *);
+extern struct dentry *isofs_lookup(struct inode *, struct dentry *);
 extern struct buffer_head *isofs_bread(struct inode *, sector_t);
 extern int isofs_get_blocks(struct inode *, sector_t, struct buffer_head **, unsigned long);
 
diff --git a/fs/isofs/namei.c b/fs/isofs/namei.c
index 1e2946f..3d6e214 100644
--- a/fs/isofs/namei.c
+++ b/fs/isofs/namei.c
@@ -163,7 +163,7 @@ isofs_find_entry(struct inode *dir, struct dentry *dentry,
 	return 0;
 }
 
-struct dentry *isofs_lookup(struct inode *dir, struct dentry *dentry, struct nameidata *nd)
+struct dentry *isofs_lookup(struct inode *dir, struct dentry *dentry)
 {
 	int found;
 	unsigned long uninitialized_var(block);
diff --git a/fs/jffs2/dir.c b/fs/jffs2/dir.c
index 099dd6a..0c3a2b8 100644
--- a/fs/jffs2/dir.c
+++ b/fs/jffs2/dir.c
@@ -23,8 +23,7 @@
 static int jffs2_readdir (struct file *, void *, filldir_t);
 
 static int jffs2_create(struct inode *, struct dentry *, umode_t);
-static struct dentry *jffs2_lookup (struct inode *,struct dentry *,
-				    struct nameidata *);
+static struct dentry *jffs2_lookup(struct inode *, struct dentry *);
 static int jffs2_link (struct dentry *,struct inode *,struct dentry *);
 static int jffs2_unlink (struct inode *,struct dentry *);
 static int jffs2_symlink (struct inode *,struct dentry *,const char *);
@@ -70,8 +69,7 @@ const struct inode_operations jffs2_dir_inode_operations =
    and we use the same hash function as the dentries. Makes this
    nice and simple
 */
-static struct dentry *jffs2_lookup(struct inode *dir_i, struct dentry *target,
-				   struct nameidata *nd)
+static struct dentry *jffs2_lookup(struct inode *dir_i, struct dentry *target)
 {
 	struct jffs2_inode_info *dir_f;
 	struct jffs2_full_dirent *fd = NULL, *fd_list;
diff --git a/fs/jfs/namei.c b/fs/jfs/namei.c
index 423c520..2c1ce6b 100644
--- a/fs/jfs/namei.c
+++ b/fs/jfs/namei.c
@@ -1447,7 +1447,7 @@ static int jfs_mknod(struct inode *dir, struct dentry *dentry,
 	return rc;
 }
 
-static struct dentry *jfs_lookup(struct inode *dip, struct dentry *dentry, struct nameidata *nd)
+static struct dentry *jfs_lookup(struct inode *dip, struct dentry *dentry)
 {
 	struct btstack btstack;
 	ino_t inum;
diff --git a/fs/libfs.c b/fs/libfs.c
index 5b2dbb3..2654564 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -53,7 +53,7 @@ static int simple_delete_dentry(const struct dentry *dentry)
  * Lookup the data. This is trivial - if the dentry didn't already
  * exist, we know it is negative.  Set d_op to delete negative dentries.
  */
-struct dentry *simple_lookup(struct inode *dir, struct dentry *dentry, struct nameidata *nd)
+struct dentry *simple_lookup(struct inode *dir, struct dentry *dentry)
 {
 	static const struct dentry_operations simple_dentry_operations = {
 		.d_delete = simple_delete_dentry,
diff --git a/fs/logfs/dir.c b/fs/logfs/dir.c
index 33556ae..a748d02 100644
--- a/fs/logfs/dir.c
+++ b/fs/logfs/dir.c
@@ -348,8 +348,7 @@ static void logfs_set_name(struct logfs_disk_dentry *dd, struct qstr *name)
 	memcpy(dd->name, name->name, name->len);
 }
 
-static struct dentry *logfs_lookup(struct inode *dir, struct dentry *dentry,
-		struct nameidata *nd)
+static struct dentry *logfs_lookup(struct inode *dir, struct dentry *dentry)
 {
 	struct page *page;
 	struct logfs_disk_dentry *dd;
diff --git a/fs/minix/namei.c b/fs/minix/namei.c
index 2ec09bb..ea5585a 100644
--- a/fs/minix/namei.c
+++ b/fs/minix/namei.c
@@ -18,7 +18,7 @@ static int add_nondir(struct dentry *dentry, struct inode *inode)
 	return err;
 }
 
-static struct dentry *minix_lookup(struct inode * dir, struct dentry *dentry, struct nameidata *nd)
+static struct dentry *minix_lookup(struct inode *dir, struct dentry *dentry)
 {
 	struct inode * inode = NULL;
 	ino_t ino;
diff --git a/fs/namei.c b/fs/namei.c
index 3e3652c..5d821ff 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1094,8 +1094,7 @@ static struct dentry *lookup_dcache(struct qstr *name, struct dentry *dir,
  *
  * dir->d_inode->i_mutex must be held
  */
-static struct dentry *lookup_real(struct inode *dir, struct dentry *dentry,
-				  struct nameidata *nd)
+static struct dentry *lookup_real(struct inode *dir, struct dentry *dentry)
 {
 	struct dentry *old;
 
@@ -1105,7 +1104,7 @@ static struct dentry *lookup_real(struct inode *dir, struct dentry *dentry,
 		return ERR_PTR(-ENOENT);
 	}
 
-	old = dir->i_op->lookup(dir, dentry, nd);
+	old = dir->i_op->lookup(dir, dentry);
 	if (unlikely(old)) {
 		dput(dentry);
 		dentry = old;
@@ -1123,7 +1122,7 @@ static struct dentry *__lookup_hash(struct qstr *name, struct dentry *base,
 	if (IS_ERR(dentry) || !need_lookup)
 		return dentry;
 
-	return lookup_real(base->d_inode, dentry, nd);
+	return lookup_real(base->d_inode, dentry);
 }
 
 /*
@@ -2313,7 +2312,7 @@ static struct file *lookup_open(struct nameidata *nd, struct path *path,
 	if (need_lookup) {
 		BUG_ON(dentry->d_inode);
 
-		dentry = lookup_real(dir_inode, dentry, nd);
+		dentry = lookup_real(dir_inode, dentry);
 		if (IS_ERR(dentry))
 			return ERR_CAST(dentry);
 
@@ -2736,7 +2735,7 @@ struct dentry *kern_path_create(int dfd, const char *pathname, struct path *path
 	if (need_lookup) {
 		struct inode *dir = nd.path.dentry->d_inode;
 		if (!(dir->i_sb->s_type->fs_flags & FS_SKIP_LOOKUP_EXCL)) {
-			dentry = lookup_real(dir, dentry, &nd);
+			dentry = lookup_real(dir, dentry);
 			if (IS_ERR(dentry))
 				goto fail;
 		}
diff --git a/fs/ncpfs/dir.c b/fs/ncpfs/dir.c
index 34eeb46..119c0b0 100644
--- a/fs/ncpfs/dir.c
+++ b/fs/ncpfs/dir.c
@@ -31,7 +31,7 @@ static void ncp_do_readdir(struct file *, void *, filldir_t,
 static int ncp_readdir(struct file *, void *, filldir_t);
 
 static int ncp_create(struct inode *, struct dentry *, umode_t);
-static struct dentry *ncp_lookup(struct inode *, struct dentry *, struct nameidata *);
+static struct dentry *ncp_lookup(struct inode *, struct dentry *);
 static int ncp_unlink(struct inode *, struct dentry *);
 static int ncp_mkdir(struct inode *, struct dentry *, umode_t);
 static int ncp_rmdir(struct inode *, struct dentry *);
@@ -836,7 +836,7 @@ out:
 	return result;
 }
 
-static struct dentry *ncp_lookup(struct inode *dir, struct dentry *dentry, struct nameidata *nd)
+static struct dentry *ncp_lookup(struct inode *dir, struct dentry *dentry)
 {
 	struct ncp_server *server = NCP_SERVER(dir);
 	struct inode *inode = NULL;
diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
index 2b91cf3..9125230 100644
--- a/fs/nfs/dir.c
+++ b/fs/nfs/dir.c
@@ -46,7 +46,7 @@
 static int nfs_opendir(struct inode *, struct file *);
 static int nfs_closedir(struct inode *, struct file *);
 static int nfs_readdir(struct file *, void *, filldir_t);
-static struct dentry *nfs_lookup(struct inode *, struct dentry *, struct nameidata *);
+static struct dentry *nfs_lookup(struct inode *, struct dentry *);
 static struct file *nfs_create(struct inode *, struct dentry *,
 			       struct opendata *, unsigned, umode_t);
 static int nfs_mkdir(struct inode *, struct dentry *, umode_t);
@@ -1269,7 +1269,7 @@ const struct dentry_operations nfs_dentry_operations = {
 	.d_release	= nfs_d_release,
 };
 
-static struct dentry *nfs_lookup(struct inode *dir, struct dentry * dentry, struct nameidata *nd)
+static struct dentry *nfs_lookup(struct inode *dir, struct dentry *dentry)
 {
 	struct dentry *res;
 	struct dentry *parent;
diff --git a/fs/nilfs2/namei.c b/fs/nilfs2/namei.c
index ebde465..26f703a 100644
--- a/fs/nilfs2/namei.c
+++ b/fs/nilfs2/namei.c
@@ -62,8 +62,7 @@ static inline int nilfs_add_nondir(struct dentry *dentry, struct inode *inode)
  * Methods themselves.
  */
 
-static struct dentry *
-nilfs_lookup(struct inode *dir, struct dentry *dentry, struct nameidata *nd)
+static struct dentry *nilfs_lookup(struct inode *dir, struct dentry *dentry)
 {
 	struct inode *inode;
 	ino_t ino;
diff --git a/fs/ntfs/namei.c b/fs/ntfs/namei.c
index 358273e..8f788ec 100644
--- a/fs/ntfs/namei.c
+++ b/fs/ntfs/namei.c
@@ -35,7 +35,6 @@
  * ntfs_lookup - find the inode represented by a dentry in a directory inode
  * @dir_ino:	directory inode in which to look for the inode
  * @dent:	dentry representing the inode to look for
- * @nd:		lookup nameidata
  *
  * In short, ntfs_lookup() looks for the inode represented by the dentry @dent
  * in the directory inode @dir_ino and if found attaches the inode to the
@@ -100,8 +99,7 @@
  *
  * Locking: Caller must hold i_mutex on the directory.
  */
-static struct dentry *ntfs_lookup(struct inode *dir_ino, struct dentry *dent,
-		struct nameidata *nd)
+static struct dentry *ntfs_lookup(struct inode *dir_ino, struct dentry *dent)
 {
 	ntfs_volume *vol = NTFS_SB(dir_ino->i_sb);
 	struct inode *dent_inode;
diff --git a/fs/ocfs2/namei.c b/fs/ocfs2/namei.c
index 0ef9de8..634be51 100644
--- a/fs/ocfs2/namei.c
+++ b/fs/ocfs2/namei.c
@@ -97,8 +97,7 @@ static int ocfs2_create_symlink_data(struct ocfs2_super *osb,
 /* An orphan dir name is an 8 byte value, printed as a hex string */
 #define OCFS2_ORPHAN_NAMELEN ((int)(2 * sizeof(u64)))
 
-static struct dentry *ocfs2_lookup(struct inode *dir, struct dentry *dentry,
-				   struct nameidata *nd)
+static struct dentry *ocfs2_lookup(struct inode *dir, struct dentry *dentry)
 {
 	int status;
 	u64 blkno;
diff --git a/fs/omfs/dir.c b/fs/omfs/dir.c
index d1aa51d..c4b5a5e 100644
--- a/fs/omfs/dir.c
+++ b/fs/omfs/dir.c
@@ -289,8 +289,7 @@ static int omfs_create(struct inode *dir, struct dentry *dentry, umode_t mode)
 	return omfs_add_node(dir, dentry, mode | S_IFREG);
 }
 
-static struct dentry *omfs_lookup(struct inode *dir, struct dentry *dentry,
-				  struct nameidata *nd)
+static struct dentry *omfs_lookup(struct inode *dir, struct dentry *dentry)
 {
 	struct buffer_head *bh;
 	struct inode *inode = NULL;
diff --git a/fs/openpromfs/inode.c b/fs/openpromfs/inode.c
index a88c03b..5ea1c81 100644
--- a/fs/openpromfs/inode.c
+++ b/fs/openpromfs/inode.c
@@ -170,13 +170,14 @@ static const struct file_operations openprom_operations = {
 	.llseek		= generic_file_llseek,
 };
 
-static struct dentry *openpromfs_lookup(struct inode *, struct dentry *, struct nameidata *);
+static struct dentry *openpromfs_lookup(struct inode *, struct dentry *);
 
 static const struct inode_operations openprom_inode_operations = {
 	.lookup		= openpromfs_lookup,
 };
 
-static struct dentry *openpromfs_lookup(struct inode *dir, struct dentry *dentry, struct nameidata *nd)
+static struct dentry *openpromfs_lookup(struct inode *dir,
+					struct dentry *dentry)
 {
 	struct op_inode_info *ent_oi, *oi = OP_I(dir);
 	struct device_node *dp, *child;
diff --git a/fs/proc/base.c b/fs/proc/base.c
index d4548dd..05d768d 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -1967,8 +1967,7 @@ out_no_task:
 	return retval;
 }
 
-static struct dentry *proc_lookupfd(struct inode *dir, struct dentry *dentry,
-				    struct nameidata *nd)
+static struct dentry *proc_lookupfd(struct inode *dir, struct dentry *dentry)
 {
 	return proc_lookupfd_common(dir, dentry, proc_fd_instantiate);
 }
@@ -2160,7 +2159,7 @@ proc_map_files_instantiate(struct inode *dir, struct dentry *dentry,
 }
 
 static struct dentry *proc_map_files_lookup(struct inode *dir,
-		struct dentry *dentry, struct nameidata *nd)
+					    struct dentry *dentry)
 {
 	unsigned long vm_start, vm_end;
 	struct vm_area_struct *vma;
@@ -2398,8 +2397,7 @@ static struct dentry *proc_fdinfo_instantiate(struct inode *dir,
 }
 
 static struct dentry *proc_lookupfdinfo(struct inode *dir,
-					struct dentry *dentry,
-					struct nameidata *nd)
+					struct dentry *dentry)
 {
 	return proc_lookupfd_common(dir, dentry, proc_fdinfo_instantiate);
 }
@@ -2649,7 +2647,7 @@ static const struct file_operations proc_attr_dir_operations = {
 };
 
 static struct dentry *proc_attr_dir_lookup(struct inode *dir,
-				struct dentry *dentry, struct nameidata *nd)
+					   struct dentry *dentry)
 {
 	return proc_pident_lookup(dir, dentry,
 				  attr_dir_stuff, ARRAY_SIZE(attr_dir_stuff));
@@ -3061,7 +3059,9 @@ static const struct file_operations proc_tgid_base_operations = {
 	.llseek		= default_llseek,
 };
 
-static struct dentry *proc_tgid_base_lookup(struct inode *dir, struct dentry *dentry, struct nameidata *nd){
+static struct dentry *proc_tgid_base_lookup(struct inode *dir,
+					    struct dentry *dentry)
+{
 	return proc_pident_lookup(dir, dentry,
 				  tgid_base_stuff, ARRAY_SIZE(tgid_base_stuff));
 }
@@ -3190,7 +3190,7 @@ out:
 	return error;
 }
 
-struct dentry *proc_pid_lookup(struct inode *dir, struct dentry * dentry, struct nameidata *nd)
+struct dentry *proc_pid_lookup(struct inode *dir, struct dentry *dentry)
 {
 	struct dentry *result;
 	struct task_struct *task;
@@ -3410,7 +3410,9 @@ static int proc_tid_base_readdir(struct file * filp,
 				   tid_base_stuff,ARRAY_SIZE(tid_base_stuff));
 }
 
-static struct dentry *proc_tid_base_lookup(struct inode *dir, struct dentry *dentry, struct nameidata *nd){
+static struct dentry *proc_tid_base_lookup(struct inode *dir,
+					   struct dentry *dentry)
+{
 	return proc_pident_lookup(dir, dentry,
 				  tid_base_stuff, ARRAY_SIZE(tid_base_stuff));
 }
@@ -3454,7 +3456,7 @@ out:
 	return error;
 }
 
-static struct dentry *proc_task_lookup(struct inode *dir, struct dentry * dentry, struct nameidata *nd)
+static struct dentry *proc_task_lookup(struct inode *dir, struct dentry *dentry)
 {
 	struct dentry *result = ERR_PTR(-ENOENT);
 	struct task_struct *task;
diff --git a/fs/proc/generic.c b/fs/proc/generic.c
index 2edf34f..7db4387 100644
--- a/fs/proc/generic.c
+++ b/fs/proc/generic.c
@@ -445,8 +445,7 @@ out_unlock:
 	return ERR_PTR(error);
 }
 
-struct dentry *proc_lookup(struct inode *dir, struct dentry *dentry,
-		struct nameidata *nd)
+struct dentry *proc_lookup(struct inode *dir, struct dentry *dentry)
 {
 	return proc_lookup_de(PDE(dir), dir, dentry);
 }
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index 2925775..03ca6b7 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -101,7 +101,7 @@ void pde_users_dec(struct proc_dir_entry *pde);
 
 extern spinlock_t proc_subdir_lock;
 
-struct dentry *proc_pid_lookup(struct inode *dir, struct dentry * dentry, struct nameidata *);
+struct dentry *proc_pid_lookup(struct inode *dir, struct dentry *dentry);
 int proc_pid_readdir(struct file * filp, void * dirent, filldir_t filldir);
 unsigned long task_vsize(struct mm_struct *);
 unsigned long task_statm(struct mm_struct *,
@@ -127,7 +127,7 @@ int proc_remount(struct super_block *sb, int *flags, char *data);
  * of the /proc/<pid> subdirectories.
  */
 int proc_readdir(struct file *, void *, filldir_t);
-struct dentry *proc_lookup(struct inode *, struct dentry *, struct nameidata *);
+struct dentry *proc_lookup(struct inode *, struct dentry *);
 
 
 
diff --git a/fs/proc/namespaces.c b/fs/proc/namespaces.c
index 27da860..7962651 100644
--- a/fs/proc/namespaces.c
+++ b/fs/proc/namespaces.c
@@ -140,7 +140,7 @@ const struct file_operations proc_ns_dir_operations = {
 };
 
 static struct dentry *proc_ns_dir_lookup(struct inode *dir,
-				struct dentry *dentry, struct nameidata *nd)
+					 struct dentry *dentry)
 {
 	struct dentry *error;
 	struct task_struct *task = get_proc_task(dir);
diff --git a/fs/proc/proc_net.c b/fs/proc/proc_net.c
index 06e1cc1..aa3c10d 100644
--- a/fs/proc/proc_net.c
+++ b/fs/proc/proc_net.c
@@ -119,7 +119,7 @@ static struct net *get_proc_task_net(struct inode *dir)
 }
 
 static struct dentry *proc_tgid_net_lookup(struct inode *dir,
-		struct dentry *dentry, struct nameidata *nd)
+					   struct dentry *dentry)
 {
 	struct dentry *de;
 	struct net *net;
diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index a6b6217..e0c4d31 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -86,8 +86,7 @@ static struct ctl_table_header *grab_header(struct inode *inode)
 		return sysctl_head_next(NULL);
 }
 
-static struct dentry *proc_sys_lookup(struct inode *dir, struct dentry *dentry,
-					struct nameidata *nd)
+static struct dentry *proc_sys_lookup(struct inode *dir, struct dentry *dentry)
 {
 	struct ctl_table_header *head = grab_header(dir);
 	struct ctl_table *table = PROC_I(dir)->sysctl_entry;
diff --git a/fs/proc/root.c b/fs/proc/root.c
index 46a15d8..2ad8a08 100644
--- a/fs/proc/root.c
+++ b/fs/proc/root.c
@@ -199,13 +199,12 @@ static int proc_root_getattr(struct vfsmount *mnt, struct dentry *dentry, struct
 	return 0;
 }
 
-static struct dentry *proc_root_lookup(struct inode * dir, struct dentry * dentry, struct nameidata *nd)
+static struct dentry *proc_root_lookup(struct inode *dir, struct dentry *dentry)
 {
-	if (!proc_lookup(dir, dentry, nd)) {
+	if (!proc_lookup(dir, dentry))
 		return NULL;
-	}
-	
-	return proc_pid_lookup(dir, dentry, nd);
+
+	return proc_pid_lookup(dir, dentry);
 }
 
 static int proc_root_readdir(struct file * filp,
diff --git a/fs/qnx4/namei.c b/fs/qnx4/namei.c
index 275327b..42bd979 100644
--- a/fs/qnx4/namei.c
+++ b/fs/qnx4/namei.c
@@ -98,7 +98,7 @@ static struct buffer_head *qnx4_find_entry(int len, struct inode *dir,
 	return NULL;
 }
 
-struct dentry * qnx4_lookup(struct inode *dir, struct dentry *dentry, struct nameidata *nd)
+struct dentry *qnx4_lookup(struct inode *dir, struct dentry *dentry)
 {
 	int ino;
 	struct qnx4_inode_entry *de;
diff --git a/fs/qnx4/qnx4.h b/fs/qnx4/qnx4.h
index 33a6085..12003ec 100644
--- a/fs/qnx4/qnx4.h
+++ b/fs/qnx4/qnx4.h
@@ -23,7 +23,7 @@ struct qnx4_inode_info {
 };
 
 extern struct inode *qnx4_iget(struct super_block *, unsigned long);
-extern struct dentry *qnx4_lookup(struct inode *dir, struct dentry *dentry, struct nameidata *nd);
+extern struct dentry *qnx4_lookup(struct inode *dir, struct dentry *dentry);
 extern unsigned long qnx4_count_free_blocks(struct super_block *sb);
 extern unsigned long qnx4_block_map(struct inode *inode, long iblock);
 
diff --git a/fs/reiserfs/namei.c b/fs/reiserfs/namei.c
index 044120c..d377f38 100644
--- a/fs/reiserfs/namei.c
+++ b/fs/reiserfs/namei.c
@@ -321,8 +321,7 @@ static int reiserfs_find_entry(struct inode *dir, const char *name, int namelen,
 	}			/* while (1) */
 }
 
-static struct dentry *reiserfs_lookup(struct inode *dir, struct dentry *dentry,
-				      struct nameidata *nd)
+static struct dentry *reiserfs_lookup(struct inode *dir, struct dentry *dentry)
 {
 	int retval;
 	int lock_depth;
diff --git a/fs/romfs/super.c b/fs/romfs/super.c
index bb36ab7..baeeb46 100644
--- a/fs/romfs/super.c
+++ b/fs/romfs/super.c
@@ -209,8 +209,7 @@ out:
 /*
  * look up an entry in a directory
  */
-static struct dentry *romfs_lookup(struct inode *dir, struct dentry *dentry,
-				   struct nameidata *nd)
+static struct dentry *romfs_lookup(struct inode *dir, struct dentry *dentry)
 {
 	unsigned long offset, maxoff;
 	struct inode *inode;
diff --git a/fs/squashfs/namei.c b/fs/squashfs/namei.c
index 0682b38..efd8612 100644
--- a/fs/squashfs/namei.c
+++ b/fs/squashfs/namei.c
@@ -133,8 +133,7 @@ out:
 }
 
 
-static struct dentry *squashfs_lookup(struct inode *dir, struct dentry *dentry,
-				 struct nameidata *nd)
+static struct dentry *squashfs_lookup(struct inode *dir, struct dentry *dentry)
 {
 	const unsigned char *name = dentry->d_name.name;
 	int len = dentry->d_name.len;
diff --git a/fs/sysfs/dir.c b/fs/sysfs/dir.c
index 7fdf6a7..a079d23 100644
--- a/fs/sysfs/dir.c
+++ b/fs/sysfs/dir.c
@@ -724,8 +724,7 @@ int sysfs_create_dir(struct kobject * kobj)
 	return error;
 }
 
-static struct dentry * sysfs_lookup(struct inode *dir, struct dentry *dentry,
-				struct nameidata *nd)
+static struct dentry *sysfs_lookup(struct inode *dir, struct dentry *dentry)
 {
 	struct dentry *ret = NULL;
 	struct dentry *parent = dentry->d_parent;
diff --git a/fs/sysv/namei.c b/fs/sysv/namei.c
index 9fd62a4..13fd282 100644
--- a/fs/sysv/namei.c
+++ b/fs/sysv/namei.c
@@ -43,7 +43,7 @@ const struct dentry_operations sysv_dentry_operations = {
 	.d_hash		= sysv_hash,
 };
 
-static struct dentry *sysv_lookup(struct inode * dir, struct dentry * dentry, struct nameidata *nd)
+static struct dentry *sysv_lookup(struct inode *dir, struct dentry *dentry)
 {
 	struct inode * inode = NULL;
 	ino_t ino;
diff --git a/fs/ubifs/dir.c b/fs/ubifs/dir.c
index 90adfaf..c50109d 100644
--- a/fs/ubifs/dir.c
+++ b/fs/ubifs/dir.c
@@ -191,8 +191,7 @@ static int dbg_check_name(const struct ubifs_info *c,
 
 #endif
 
-static struct dentry *ubifs_lookup(struct inode *dir, struct dentry *dentry,
-				   struct nameidata *nd)
+static struct dentry *ubifs_lookup(struct inode *dir, struct dentry *dentry)
 {
 	int err;
 	union ubifs_key key;
diff --git a/fs/udf/namei.c b/fs/udf/namei.c
index c96f1a7..8c3baf4 100644
--- a/fs/udf/namei.c
+++ b/fs/udf/namei.c
@@ -252,8 +252,7 @@ out_ok:
 	return fi;
 }
 
-static struct dentry *udf_lookup(struct inode *dir, struct dentry *dentry,
-				 struct nameidata *nd)
+static struct dentry *udf_lookup(struct inode *dir, struct dentry *dentry)
 {
 	struct inode *inode = NULL;
 	struct fileIdentDesc cfi;
diff --git a/fs/ufs/namei.c b/fs/ufs/namei.c
index ca6c710..b747bb6 100644
--- a/fs/ufs/namei.c
+++ b/fs/ufs/namei.c
@@ -46,7 +46,7 @@ static inline int ufs_add_nondir(struct dentry *dentry, struct inode *inode)
 	return err;
 }
 
-static struct dentry *ufs_lookup(struct inode * dir, struct dentry *dentry, struct nameidata *nd)
+static struct dentry *ufs_lookup(struct inode *dir, struct dentry *dentry)
 {
 	struct inode * inode = NULL;
 	ino_t ino;
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index 66810ac..8c1bf879 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -248,8 +248,7 @@ xfs_vn_mkdir(
 STATIC struct dentry *
 xfs_vn_lookup(
 	struct inode	*dir,
-	struct dentry	*dentry,
-	struct nameidata *nd)
+	struct dentry	*dentry)
 {
 	struct xfs_inode *cip;
 	struct xfs_name	name;
@@ -273,8 +272,7 @@ xfs_vn_lookup(
 STATIC struct dentry *
 xfs_vn_ci_lookup(
 	struct inode	*dir,
-	struct dentry	*dentry,
-	struct nameidata *nd)
+	struct dentry	*dentry)
 {
 	struct xfs_inode *ip;
 	struct xfs_name	xname;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 4a5e0d3..8cfe7bf 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1630,7 +1630,7 @@ struct file_operations {
 };
 
 struct inode_operations {
-	struct dentry * (*lookup) (struct inode *,struct dentry *, struct nameidata *);
+	struct dentry * (*lookup) (struct inode *, struct dentry *);
 	void * (*follow_link) (struct dentry *, struct nameidata *);
 	int (*permission) (struct inode *, int);
 	struct posix_acl * (*get_acl)(struct inode *, int);
@@ -2537,7 +2537,7 @@ extern int simple_write_end(struct file *file, struct address_space *mapping,
 			loff_t pos, unsigned len, unsigned copied,
 			struct page *page, void *fsdata);
 
-extern struct dentry *simple_lookup(struct inode *, struct dentry *, struct nameidata *);
+extern struct dentry *simple_lookup(struct inode *, struct dentry *);
 extern ssize_t generic_read_dir(struct file *, char __user *, size_t, loff_t *);
 extern const struct file_operations simple_dir_operations;
 extern const struct inode_operations simple_dir_inode_operations;
diff --git a/kernel/cgroup.c b/kernel/cgroup.c
index a5d3b53..a284d08 100644
--- a/kernel/cgroup.c
+++ b/kernel/cgroup.c
@@ -778,7 +778,7 @@ EXPORT_SYMBOL_GPL(cgroup_unlock);
  */
 
 static int cgroup_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode);
-static struct dentry *cgroup_lookup(struct inode *, struct dentry *, struct nameidata *);
+static struct dentry *cgroup_lookup(struct inode *, struct dentry *);
 static int cgroup_rmdir(struct inode *unused_dir, struct dentry *dentry);
 static int cgroup_populate_dir(struct cgroup *cgrp);
 static const struct inode_operations cgroup_dir_inode_operations;
@@ -2606,7 +2606,7 @@ static const struct inode_operations cgroup_dir_inode_operations = {
 	.rename = cgroup_rename,
 };
 
-static struct dentry *cgroup_lookup(struct inode *dir, struct dentry *dentry, struct nameidata *nd)
+static struct dentry *cgroup_lookup(struct inode *dir, struct dentry *dentry)
 {
 	if (dentry->d_name.len > NAME_MAX)
 		return ERR_PTR(-ENAMETOOLONG);
-- 
1.7.7


^ permalink raw reply related	[flat|nested] 55+ messages in thread

* Re: [PATCH 00/25] vfs: atomic open RFC
  2012-03-07 21:22 [PATCH 00/25] vfs: atomic open RFC Miklos Szeredi
                   ` (24 preceding siblings ...)
  2012-03-07 21:22 ` [PATCH 25/25] vfs: remove nameidata from lookup Miklos Szeredi
@ 2012-03-07 21:27 ` Steve French
  2012-03-13  9:51 ` Christoph Hellwig
  26 siblings, 0 replies; 55+ messages in thread
From: Steve French @ 2012-03-07 21:27 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: viro, linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench,
	sage, ericvh, mszeredi

On Wed, Mar 7, 2012 at 3:22 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> This series allows clean implementation of atomic lookup+(create)+open and
> create+open operations that previously were done via ->lookup and ->create using
> open intents.
>
> Testing and review is welcome, but at this stage mainly I'd like to hear
> opinions on the overall design of the new interfaces.

This could be great for network file systems like cifs/smb2

Historically mapping the multistage lookup/create/open to an atomic
opencreate protocol operation (for cifs, and now smb2 and smb2.2)
has always been hard.   We go from an atomic syscall to
lookup intents, but it was tricky to make this atomic on the wire in every case.

It will be interesting to review this.

-- 
Thanks,

Steve

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 05/25] vfs: add filesystem flags for atomic_open
  2012-03-07 21:22 ` [PATCH 05/25] vfs: add filesystem flags for atomic_open Miklos Szeredi
@ 2012-03-13  9:33   ` Christoph Hellwig
  2012-03-13 11:17     ` Miklos Szeredi
  0 siblings, 1 reply; 55+ messages in thread
From: Christoph Hellwig @ 2012-03-13  9:33 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: viro, linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench,
	sage, ericvh, mszeredi

On Wed, Mar 07, 2012 at 10:22:22PM +0100, Miklos Szeredi wrote:
> From: Miklos Szeredi <mszeredi@suse.cz>
> 
> Allow filesystem to select which cases it wants to perform atomic_open.

Given that atomic_open is allowed to return a NULL filp and fall back to
the traditional open I don't think this is needed - it can be handled
much more cleanly by the fs just returning NULL for that case.
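
A minimal sketch of that fall-back, written as a userspace model rather
than kernel code.  Every name here (model_file, model_vfs_open,
MODEL_O_CREAT, ...) is hypothetical and only illustrates the control
flow; error returns, which the kernel would carry as ERR_PTR values,
are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model types; none of these are kernel APIs. */
struct model_file {
	int opened_by;	/* 1 = atomic path, 2 = generic path */
};

#define MODEL_O_CREAT 0x1u

typedef struct model_file *(*model_atomic_open_t)(struct model_file *slot,
						   unsigned int flags);

/*
 * Example filesystem method: it only handles creating opens atomically
 * and returns NULL for everything else, asking the caller to fall back.
 */
static struct model_file *model_fs_atomic_open(struct model_file *slot,
						unsigned int flags)
{
	if (!(flags & MODEL_O_CREAT))
		return NULL;
	slot->opened_by = 1;
	return slot;
}

/* Stand-in for the traditional lookup + open sequence. */
static struct model_file *model_generic_open(struct model_file *slot)
{
	slot->opened_by = 2;
	return slot;
}

/*
 * Caller side: try the atomic method if there is one; a NULL result
 * means "not handled", so the generic path is used.  No per-filesystem
 * flag is needed to select the cases.
 */
static struct model_file *model_vfs_open(model_atomic_open_t atomic_open,
					 struct model_file *slot,
					 unsigned int flags)
{
	struct model_file *filp = NULL;

	assert(slot != NULL);
	if (atomic_open)
		filp = atomic_open(slot, flags);
	if (!filp)
		filp = model_generic_open(slot);
	return filp;
}
```

With this shape the decision about which opens are atomic lives entirely
inside the filesystem method.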


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 06/25] vfs: add i_op->atomic_create()
  2012-03-07 21:22 ` [PATCH 06/25] vfs: add i_op->atomic_create() Miklos Szeredi
@ 2012-03-13  9:37   ` Christoph Hellwig
  2012-03-13 11:22     ` Miklos Szeredi
  0 siblings, 1 reply; 55+ messages in thread
From: Christoph Hellwig @ 2012-03-13  9:37 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: viro, linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench,
	sage, ericvh, mszeredi

On Wed, Mar 07, 2012 at 10:22:23PM +0100, Miklos Szeredi wrote:
> From: Miklos Szeredi <mszeredi@suse.cz>
> 
> Add a new inode operation which is called on regular file create.  This is a
> replacement for ->create() which allows the file to be opened atomically with
> creation.
> 
> This function is also called for non-open creates (mknod(2)) with a NULL file
> argument.  Only one of ->create or ->atomic_create will be called, implementing
> both makes no sense.
> 
> The functionality of this method partially overlaps that of ->atomic_open().
> FUSE and 9P only use ->atomic_create, NFS, CIFS and CEPH use both.

I really don't like the special casing of the mknod handling in every
atomic_create instance.  Either we should keep ->create for it, or do a
cleanup pass before to always make that pass go through ->mknod - in
fact most filesystems handle the two in common code anyway so we might
be able to get rid of one of them, possibly mkdir as well.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 00/25] vfs: atomic open RFC
  2012-03-07 21:22 [PATCH 00/25] vfs: atomic open RFC Miklos Szeredi
                   ` (25 preceding siblings ...)
  2012-03-07 21:27 ` [PATCH 00/25] vfs: atomic open RFC Steve French
@ 2012-03-13  9:51 ` Christoph Hellwig
  2012-03-13 11:00   ` Miklos Szeredi
  26 siblings, 1 reply; 55+ messages in thread
From: Christoph Hellwig @ 2012-03-13  9:51 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: viro, linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench,
	sage, ericvh, mszeredi

Do we really need the opendata structure?

It seems like we could just pass a struct path instead of the dentry
passed directly and the vfsmount in it.  There should be no need to
preallocate the file before calling into ->atomic_open, as it's only
used to pass around f_flags - but we already pass that one to
->atomic_open directly and might as well pass it on to finish_open and
allocate the file there.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 00/25] vfs: atomic open RFC
  2012-03-13  9:51 ` Christoph Hellwig
@ 2012-03-13 11:00   ` Miklos Szeredi
  2012-03-13 12:01     ` Christoph Hellwig
  0 siblings, 1 reply; 55+ messages in thread
From: Miklos Szeredi @ 2012-03-13 11:00 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: viro, linux-fsdevel, linux-kernel, Trond.Myklebust, sfrench,
	sage, ericvh

Christoph Hellwig <hch@infradead.org> writes:

> Do we really need the opendata structure?
>
> It seems like we could just pass a struct path instead of the dentry
> passed directly and the vfsmount in it.  There should be no need to
> preallocate the file before calling into ->atomic_open, as it's only
> used to pass around f_flags - but we already pass that one to
> ->atomic_open directly and might as well pass it on to finish_open and
> allocate the file there.

We really don't want to get into the situation where the open fails
after a successful create(*).  Which means the file needs to be allocated
prior to calling ->atomic_open and needs to be passed to finish_open()
together with the vfsmount and dentry.

In the first version of the patch I set filp->f_path.mnt to nd->path.mnt
and passed the half initialized filp to ->atomic_open.  But then decided
that it's confusing for the filesystem code to deal with a half baked
filp (does it need to be fput on error?  etc...)

Doing it with an opaque opendata makes this cleaner I think.

Thanks,
Miklos


(*)
commit a1a5b3d93ca45613ec1d920fdb131b69b6553882
Author: Peter Staubach <staubach@redhat.com>
Date:   Tue Sep 13 01:25:12 2005 -0700

    [PATCH] open returns ENFILE but creates file anyway
    
    When open(O_CREAT) is called and the error, ENFILE, is returned, the file
    may be created anyway.  This is counter intuitive, against the SUS V3
    specification, and may cause applications to misbehave if they are not
    coded correctly to handle this semantic.  The SUS V3 specification
    explicitly states "No files shall be created or modified if the function
    returns -1.".
    
    The error, ENFILE, is used to indicate the system wide open file table is
    full and no more file structs can be allocated.
    
    This is due to an ordering problem.  The entry in the directory is created
    before the file struct is allocated.  If the allocation for the file struct
    fails, then the system call must return an error, but the directory entry
    was already created and can not be safely removed.
    
    The solution to this situation is relatively easy.  The file struct should
    be allocated before the directory entry is created.  If the allocation
    fails, then the error can be returned directly.  If the creation of the
    directory entry fails, then the file struct can be easily freed.
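
The ordering argument in that commit message can be sketched with a
small userspace model.  This is only an illustration under stated
assumptions: struct model_fs and all model_*/open_* names are
hypothetical, and the "file table" is reduced to a counter.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_fs {
	int  free_files;	/* free slots in the "file table" */
	bool entry_exists;	/* directory entry has been created */
	bool create_fails;	/* force the create step to fail */
};

/* Take one slot from the file table; -ENFILE when it is full. */
static int model_alloc_file(struct model_fs *fs)
{
	if (fs->free_files == 0)
		return -ENFILE;
	fs->free_files--;
	return 0;
}

/* Give the slot back; this is the cheap undo step. */
static void model_free_file(struct model_fs *fs)
{
	fs->free_files++;
}

/* Create the directory entry; assumed hard to undo safely. */
static int model_create_entry(struct model_fs *fs)
{
	if (fs->create_fails)
		return -ENOSPC;
	fs->entry_exists = true;
	return 0;
}

/* Old ordering: create first, allocate second. */
static int open_create_then_alloc(struct model_fs *fs)
{
	int err;

	assert(fs != NULL);
	err = model_create_entry(fs);
	if (err)
		return err;
	/* A failure here returns an error but leaves the entry behind. */
	return model_alloc_file(fs);
}

/* Fixed ordering: allocate first, create second. */
static int open_alloc_then_create(struct model_fs *fs)
{
	int err;

	assert(fs != NULL);
	err = model_alloc_file(fs);
	if (err)
		return err;	/* nothing was created yet */
	err = model_create_entry(fs);
	if (err)
		model_free_file(fs);	/* easy to undo */
	return err;
}
```

With a full file table the first variant returns -ENFILE after the
entry already exists, while the second returns -ENFILE with nothing
created.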

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 05/25] vfs: add filesystem flags for atomic_open
  2012-03-13  9:33   ` Christoph Hellwig
@ 2012-03-13 11:17     ` Miklos Szeredi
  0 siblings, 0 replies; 55+ messages in thread
From: Miklos Szeredi @ 2012-03-13 11:17 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: viro, linux-fsdevel, linux-kernel, Trond.Myklebust, sfrench,
	sage, ericvh

Christoph Hellwig <hch@infradead.org> writes:

> On Wed, Mar 07, 2012 at 10:22:22PM +0100, Miklos Szeredi wrote:
>> From: Miklos Szeredi <mszeredi@suse.cz>
>> 
>> Allow filesystem to select which cases it wants to perform atomic_open.
>
> Given that atomic_open is allowed to return a NULL filp and fall back to
> the traditional open I don't think this is needed - it can be handled
> much more cleanly by the fs just returning NULL for that case.

Hmm, yeah.  The reason I did this was to prevent doing permission/
security checks twice before create.

But in fact that could be optimized away with careful code flow.  I'll
look into dropping this.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 06/25] vfs: add i_op->atomic_create()
  2012-03-13  9:37   ` Christoph Hellwig
@ 2012-03-13 11:22     ` Miklos Szeredi
  2012-03-13 11:55       ` Christoph Hellwig
  0 siblings, 1 reply; 55+ messages in thread
From: Miklos Szeredi @ 2012-03-13 11:22 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: viro, linux-fsdevel, linux-kernel, Trond.Myklebust, sfrench,
	sage, ericvh

Christoph Hellwig <hch@infradead.org> writes:

> On Wed, Mar 07, 2012 at 10:22:23PM +0100, Miklos Szeredi wrote:
>> From: Miklos Szeredi <mszeredi@suse.cz>
>> 
>> Add a new inode operation which is called on regular file create.  This is a
>> replacement for ->create() which allows the file to be opened atomically with
>> creation.
>> 
>> This function is also called for non-open creates (mknod(2)) with a NULL file
>> argument.  Only one of ->create or ->atomic_create will be called, implementing
>> both makes no sense.
>> 
>> The functionality of this method partially overlaps that of ->atomic_open().
>> FUSE and 9P only use ->atomic_create, NFS, CIFS and CEPH use both.
>
> I really don't like the special casing of the mknod handling in every
> atomic_create instance.  Either we should keep ->create for it, or do a
> cleanup pass before to always make that pass go through ->mknod - in
> fact most filesystems handle the two in common code anyway so we might
> be able to get rid of one of them, possibly mkdir as well.

Good point.  Yes, ->create is probably worth getting rid of.  Mkdir, I'm
not so sure, but I'll look at what filesystems are doing.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 06/25] vfs: add i_op->atomic_create()
  2012-03-13 11:22     ` Miklos Szeredi
@ 2012-03-13 11:55       ` Christoph Hellwig
  2012-03-13 13:26         ` Miklos Szeredi
  0 siblings, 1 reply; 55+ messages in thread
From: Christoph Hellwig @ 2012-03-13 11:55 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Christoph Hellwig, viro, linux-fsdevel, linux-kernel,
	Trond.Myklebust, sfrench, sage, ericvh

On Tue, Mar 13, 2012 at 12:22:10PM +0100, Miklos Szeredi wrote:
> Good point.  Yes, ->create is probably worth getting rid of.  Mkdir, I'm
> not so sure, but I'll look at what filesystems are doing.

Btw, is there any good reason to keep ->atomic_open and ->atomic_create
separate?  It seems like the instances in general share code anyway.


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 00/25] vfs: atomic open RFC
  2012-03-13 11:00   ` Miklos Szeredi
@ 2012-03-13 12:01     ` Christoph Hellwig
  2012-03-13 13:33       ` Miklos Szeredi
  0 siblings, 1 reply; 55+ messages in thread
From: Christoph Hellwig @ 2012-03-13 12:01 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Christoph Hellwig, viro, linux-fsdevel, linux-kernel,
	Trond.Myklebust, sfrench, sage, ericvh

On Tue, Mar 13, 2012 at 12:00:05PM +0100, Miklos Szeredi wrote:
> > Do we really need the opendata structure?
> >
> > It seems like we could just pass a struct path instead of the dentry
> > passed directly and the vfsmount in it.  There should be no need to
> > preallocate the file before calling into ->atomic_open, as it's only
> > used to pass around f_flags - but we already pass that one to
> > ->atomic_open directly and might as well pass it on to finish_open and
> > allocate the file there.
> 
> We really don't want to get into the situation where the open fails
> after a successful create(*).  Which means the file needs to be allocated
> prior to calling ->atomic_open and needs to be passed to finish_open()
> together with the vfsmount and dentry.
> 
> In the first version of the patch I set filp->f_path.mnt to nd->path.mnt
> and passed the half initialized filp to ->atomic_open.  But then decided
> that it's confusing for the filesystem code to deal with a half baked
> filp (does it need to be fput on error?  etc...)
> 
> Doing it with an opaque opendata makes this cleaner I think.

Makes sense.  Can you throw in another cleanup patch to really just make
it a pass-through and not also use it as a boolean flag if open_flags
should be obeyed?  This probably will change semantics for the various
filesystems, but given that they should behave the same way that's a
good thing.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 13/25] cifs: implement i_op->atomic_open() and i_op->atomic_create()
  2012-03-07 21:22 ` [PATCH 13/25] cifs: implement i_op->atomic_open() and i_op->atomic_create() Miklos Szeredi
@ 2012-03-13 12:06   ` Christoph Hellwig
  2012-03-13 13:39     ` Miklos Szeredi
  0 siblings, 1 reply; 55+ messages in thread
From: Christoph Hellwig @ 2012-03-13 12:06 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: viro, linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench,
	sage, ericvh, mszeredi

On Wed, Mar 07, 2012 at 10:22:30PM +0100, Miklos Szeredi wrote:
> From: Miklos Szeredi <mszeredi@suse.cz>
> 
> Replace CIFS's ->create operation with ->atomic_open and ->atomic_create.  Also
> move the relevant code from ->lookup into the create function.
> 
> CIFS currently only does atomic open for O_CREAT, but it wants to do that as
> early as possible, without first calling ->lookup, so it uses ->atomic_open,
> just like NFS.

Why does cifs need to set the created flag from inside ->atomic_open?

It's different from everyone else in that respect.


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 09/25] nfs: remove nfs4 specific create function
  2012-03-07 21:22 ` [PATCH 09/25] nfs: remove nfs4 specific create function Miklos Szeredi
@ 2012-03-13 12:09   ` Christoph Hellwig
  0 siblings, 0 replies; 55+ messages in thread
From: Christoph Hellwig @ 2012-03-13 12:09 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: viro, linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench,
	sage, ericvh, mszeredi

On Wed, Mar 07, 2012 at 10:22:26PM +0100, Miklos Szeredi wrote:
> From: Miklos Szeredi <mszeredi@suse.cz>
> 
> Make nfs_atomic_open() work for non-open creates.  This is trivial to do and
> allows the NFSv4 specific create code to be removed.
> 
> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
> ---
>  fs/nfs/dir.c      |   28 ++++++++++++++++++++--------
>  fs/nfs/nfs4proc.c |   31 -------------------------------
>  2 files changed, 20 insertions(+), 39 deletions(-)
> 
> diff --git a/fs/nfs/dir.c b/fs/nfs/dir.c
> index 24bf3c9..8627965 100644
> --- a/fs/nfs/dir.c
> +++ b/fs/nfs/dir.c
> @@ -114,10 +114,13 @@ const struct inode_operations nfs3_dir_inode_operations = {
>  static struct file *nfs_atomic_open(struct inode *, struct dentry *,
>  				    struct opendata *, unsigned, umode_t,
>  				    bool *);
> +static struct file *nfs_atomic_open_common(struct inode *, struct dentry *,
> +					   struct opendata *, unsigned,
> +					   umode_t);
>  const struct inode_operations nfs4_dir_inode_operations = {
> -	.create		= nfs_create,
>  	.lookup		= nfs_lookup,
>  	.atomic_open	= nfs_atomic_open,
> +	.atomic_create	= nfs_atomic_open_common, /* called for mknod */

Can you please name the methods after the interface they implement,
e.g. do a s/nfs_atomic_open_common/nfs_atomic_create/g here, and
similar transformations for the other filesystems.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 06/25] vfs: add i_op->atomic_create()
  2012-03-13 11:55       ` Christoph Hellwig
@ 2012-03-13 13:26         ` Miklos Szeredi
  2012-03-13 14:08           ` Miklos Szeredi
  0 siblings, 1 reply; 55+ messages in thread
From: Miklos Szeredi @ 2012-03-13 13:26 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: viro, linux-fsdevel, linux-kernel, Trond.Myklebust, sfrench,
	sage, ericvh

Christoph Hellwig <hch@infradead.org> writes:

> On Tue, Mar 13, 2012 at 12:22:10PM +0100, Miklos Szeredi wrote:
>> Good point.  Yes, ->create is probably worth getting rid of.  Mkdir, I'm
>> not so sure, but I'll look at what filesystems are doing.
>
> Btw, is there any good reason to keep ->atomic_open and ->atomic_create
> separate?  It seems like the instances in general share code anyway.

->atomic_open is called before lookup, ->atomic_create after lookup.

How would we differentiate between the two if they were common?  We
could have a filesystem flag, but for example CEPH does weird things
like using ->atomic_open for !O_CREAT and ->atomic_create for O_CREAT.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 00/25] vfs: atomic open RFC
  2012-03-13 12:01     ` Christoph Hellwig
@ 2012-03-13 13:33       ` Miklos Szeredi
  0 siblings, 0 replies; 55+ messages in thread
From: Miklos Szeredi @ 2012-03-13 13:33 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: viro, linux-fsdevel, linux-kernel, Trond.Myklebust, sfrench,
	sage, ericvh

Christoph Hellwig <hch@infradead.org> writes:

> On Tue, Mar 13, 2012 at 12:00:05PM +0100, Miklos Szeredi wrote:
>> > Do we really need the opendata structure?
>> >
>> > It seems like we could just pass a struct path instead of the dentry
>> > passed directly and the vfsmount in it.  There should be no need to
>> > preallocate the file before calling into ->atomic_open, as it's only
>> > used to pass around f_flags - but we already pass that one to
>> > ->atomic_open directly and might as well pass it on to finish_open and
>> > allocate the file there.
>> 
>> We really don't want to get into the situation where the open fails
>> after a successful create(*).  Which means the file needs to be allocated
>> prior to calling ->atomic_open and needs to be passed to finish_open()
>> together with the vfsmount and dentry.
>> 
>> In the first version of the patch I set filp->f_path.mnt to nd->path.mnt
>> and passed the half initialized filp to ->atomic_open.  But then decided
>> that it's confusing for the filesystem code to deal with a half baked
>> filp (does it need to be fput on error?  etc...)
>> 
>> Doing it with an opaque opendata makes this cleaner I think.
>
> Makes sense.  Can you throw in another cleanup patch to really just make
> it a pass-through and not also use it as a boolean flag if open_flags
> should be obeyed?  This probably will change semantics for the various
> filesystems, but given that they should behave the same way that's a
> good thing.

It's not just that.  The filesystems will create some state if od is
non-NULL, which is released in f_op->release.  If od is always non-NULL
then the VFS has to call ->release on a dummy file, that file has to be
allocated, which might fail...  So this brings with it a couple of
issues that I didn't want to deal with.

But yes, it would probably be a good cleanup...

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 13/25] cifs: implement i_op->atomic_open() and i_op->atomic_create()
  2012-03-13 12:06   ` Christoph Hellwig
@ 2012-03-13 13:39     ` Miklos Szeredi
  2012-03-13 16:43       ` Sage Weil
  2012-03-24 14:22       ` Christoph Hellwig
  0 siblings, 2 replies; 55+ messages in thread
From: Miklos Szeredi @ 2012-03-13 13:39 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: viro, linux-fsdevel, linux-kernel, Trond.Myklebust, sfrench,
	sage, ericvh

Christoph Hellwig <hch@infradead.org> writes:

> On Wed, Mar 07, 2012 at 10:22:30PM +0100, Miklos Szeredi wrote:
>> From: Miklos Szeredi <mszeredi@suse.cz>
>> 
>> Replace CIFS's ->create operation with ->atomic_open and ->atomic_create.  Also
>> move the relevant code from ->lookup into the create function.
>> 
>> CIFS currently only does atomic open for O_CREAT, but it wants to do that as
>> early as possible, without first calling ->lookup, so it uses ->atomic_open,
>> just like NFS.
>
> Why does cifs need to set the created flag from inside ->atomic_open?
>
> It's different from everyone else in that respect.

Apparently CIFS is the only one that can tell whether the file was
created or not.  If the flag is set then fsnotify_create() is called.
Users of NFS don't seem to care, it's of dubious value anyway, but why
not use the info when available?

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 06/25] vfs: add i_op->atomic_create()
  2012-03-13 13:26         ` Miklos Szeredi
@ 2012-03-13 14:08           ` Miklos Szeredi
  2012-03-13 16:34             ` Sage Weil
  2012-03-24 14:14             ` Christoph Hellwig
  0 siblings, 2 replies; 55+ messages in thread
From: Miklos Szeredi @ 2012-03-13 14:08 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: viro, linux-fsdevel, linux-kernel, Trond.Myklebust, sfrench,
	sage, ericvh

Miklos Szeredi <miklos@szeredi.hu> writes:

> Christoph Hellwig <hch@infradead.org> writes:
>
>> On Tue, Mar 13, 2012 at 12:22:10PM +0100, Miklos Szeredi wrote:
>>> Good point.  Yes, ->create is probably worth getting rid of.  Mkdir, I'm
>>> not so sure, but I'll look at what filesystems are doing.
>>
>> Btw, is there any good reason to keep ->atomic_open and ->atomic_create
>> separate?  It seems like the instances in general share code anyway.
>
> ->atomic_open is called before lookup, ->atomic_create after lookup.
>
> How would we differentiate between the two if they were common?  We
> could have a filesystem flag, but for example CEPH does weird things
> like using ->atomic_open for !O_CREAT and ->atomic_create for O_CREAT.

Or let the filesystem do the lookup in ->atomic_open if it wants (and
pass the need_lookup flag to the filesystem).

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 04/25] vfs: add i_op->atomic_open()
  2012-03-07 21:22 ` [PATCH 04/25] vfs: add i_op->atomic_open() Miklos Szeredi
@ 2012-03-13 14:38   ` Myklebust, Trond
  2012-03-13 15:11     ` Miklos Szeredi
  0 siblings, 1 reply; 55+ messages in thread
From: Myklebust, Trond @ 2012-03-13 14:38 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: viro, linux-fsdevel, linux-kernel, hch, sfrench, sage, ericvh, mszeredi

On Wed, 2012-03-07 at 22:22 +0100, Miklos Szeredi wrote:
> From: Miklos Szeredi <mszeredi@suse.cz>
> 
> Add a new inode operation which is called on the last component of an open.
> Using this the filesystem can look up, possibly create and open the file in one
> atomic operation.  If it cannot perform this (e.g. the file type turned out to
> be wrong) it may signal this by returning NULL instead of an open struct file
> pointer.
> 
> i_op->atomic_open() is only called if the last component is negative or needs
> lookup.  Handling cached positive dentries here doesn't add much value: these
> can be opened using f_op->open().  If the cached file turns out to be invalid,
> the open can be retried, this time using ->atomic_open() with a fresh dentry.
> 
> For now leave the old way of using open intents in lookup and revalidate in
> place.  This will be removed once all the users are converted.
> 
> Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
> ---
>  fs/internal.h      |    5 +
>  fs/namei.c         |  231 +++++++++++++++++++++++++++++++++++++++++++++++++---
>  fs/open.c          |   27 ++++++
>  include/linux/fs.h |    6 ++
>  4 files changed, 258 insertions(+), 11 deletions(-)
> 
> diff --git a/fs/internal.h b/fs/internal.h
> index 4d69fdd..10143de 100644
> --- a/fs/internal.h
> +++ b/fs/internal.h
> @@ -87,12 +87,17 @@ extern struct super_block *user_get_super(dev_t);
>  struct nameidata;
>  extern struct file *nameidata_to_filp(struct nameidata *);
>  extern void release_open_intent(struct nameidata *);
> +struct opendata {
> +	struct vfsmount *mnt;
> +	struct file **filp;
> +};
>  struct open_flags {
>  	int open_flag;
>  	umode_t mode;
>  	int acc_mode;
>  	int intent;
>  };
> +
>  extern struct file *do_filp_open(int dfd, const char *pathname,
>  		const struct open_flags *op, int lookup_flags);
>  extern struct file *do_file_open_root(struct dentry *, struct vfsmount *,
> diff --git a/fs/namei.c b/fs/namei.c
> index ff8bc94..835dcf1 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -2100,6 +2100,201 @@ static inline int open_to_namei_flags(int flag)
>  	return flag;
>  }
>  
> +static int may_o_create(struct path *dir, struct dentry *dentry, umode_t mode)
> +{
> +	int error = security_path_mknod(dir, dentry, mode, 0);
> +	if (error)
> +		return error;
> +
> +	error = may_create(dir->dentry->d_inode, dentry);
> +	if (error)
> +		return error;
> +
> +	return security_inode_create(dir->dentry->d_inode, dentry, mode);
> +}
> +
> +static struct file *atomic_open(struct nameidata *nd, struct dentry *dentry,
> +				const struct open_flags *op,
> +				int *want_write, int *create_error)
> +{
> +	struct inode *dir =  nd->path.dentry->d_inode;
> +	unsigned open_flag = open_to_namei_flags(op->open_flag);
> +	umode_t mode;
> +	int error;
> +	bool created = false;
> +	int acc_mode;
> +	struct opendata od;
> +	struct file *filp;
> +
> +	BUG_ON(dentry->d_inode);
> +
> +	/* Don't create child dentry for a dead directory. */
> +	if (unlikely(IS_DEADDIR(dir)))
> +		return ERR_PTR(-ENOENT);
> +
> +	mode = op->mode & S_IALLUGO;
> +	if ((open_flag & O_CREAT) && !IS_POSIXACL(dir))
> +		mode &= ~current_umask();
> +
> +	if (open_flag & O_EXCL) {
> +		open_flag &= ~O_TRUNC;
> +		created = true;
> +	}
> +
> +	/*
> +	 * Checking write permission is tricky, because we don't know if we are
> +	 * going to actually need it: O_CREAT opens should work as long as the
> +	 * file exists.  But checking existence breaks atomicity.  The trick is
> +	 * to check access and if not granted clear O_CREAT from the flags.
> +	 *
> +	 * Another problem is returning the "right" error value (e.g. for an
> +	 * O_EXCL open we want to return EEXIST not EROFS).
> +	 */
> +	if ((open_flag & (O_CREAT | O_TRUNC)) ||
> +	    (open_flag & O_ACCMODE) != O_RDONLY) {
> +		error = mnt_want_write(nd->path.mnt);
> +		if (!error) {
> +			*want_write = 1;
> +		} else if (!(open_flag & O_CREAT)) {
> +			/*
> +			 * No O_CREATE -> atomicity not a requirement -> fall
> +			 * back to lookup + open
> +			 */
> +			goto look_up;
> +		} else if (open_flag & (O_EXCL | O_TRUNC)) {
> +			/* Fall back and fail with the right error */
> +			*create_error = error;
> +			goto look_up;
> +		} else {
> +			/* No side effects, safe to clear O_CREAT */
> +			*create_error = error;
> +			open_flag &= ~O_CREAT;
> +		}
> +	}
> +
> +	if (open_flag & O_CREAT) {
> +		error = may_o_create(&nd->path, dentry, op->mode);
> +		if (error) {
> +			*create_error = error;
> +			if (open_flag & O_EXCL)
> +				goto look_up;
> +			open_flag &= ~O_CREAT;
> +		}
> +	}
> +
> +	if (nd->flags & LOOKUP_DIRECTORY)
> +		open_flag |= O_DIRECTORY;
> +
> +	od.mnt = nd->path.mnt;
> +	od.filp = &nd->intent.open.file;
> +	filp = dir->i_op->atomic_open(dir, dentry, &od, open_flag, mode,
> +				      &created);
> +	if (IS_ERR(filp)) {
> +		if (*create_error && PTR_ERR(filp) == -ENOENT)
> +			filp = ERR_PTR(*create_error);
> +		goto out;
> +	}
> +
> +	acc_mode = op->acc_mode;
> +	if (created) {
> +		fsnotify_create(dir, dentry);
> +		acc_mode = MAY_OPEN;
> +	}
> +
> +	if (filp) {
> +		/*
> +		 * We didn't have the inode before the open, so check open
> +		 * permission here.
> +		 */
> +		error = may_open(&filp->f_path, acc_mode, open_flag);
> +		if (error)
> +			goto out_fput;
> +
> +		error = open_check_o_direct(filp);
> +		if (error)
> +			goto out_fput;
> +	}
> +	*create_error = 0;
> +
> +out:
> +	return filp;
> +
> +out_fput:
> +	fput(filp);
> +	return ERR_PTR(error);
> +
> +look_up:
> +	return NULL;
> +}
> +
> +/*
> + * Lookup and possibly open (and create) the last component
> + *
> + * Must be called with i_mutex held on parent.
> + *
> + * Returns open file or NULL on success, error otherwise.  NULL means no open
> + * was performed, only lookup.
> + */
> +static struct file *lookup_open(struct nameidata *nd, struct path *path,
> +				const struct open_flags *op, int *want_write)
> +{
> +	struct dentry *dir = nd->path.dentry;
> +	struct inode *dir_inode = dir->d_inode;
> +	struct dentry *dentry;
> +	int error;
> +	int create_error = 0;
> +	bool need_lookup;
> +
> +	dentry = lookup_dcache(&nd->last, dir, nd, &need_lookup);
> +	if (IS_ERR(dentry))
> +		return ERR_CAST(dentry);
> +
> +	/* Cached positive dentry: will open in f_op->open */
> +	if (!need_lookup && dentry->d_inode)
> +		goto out_no_open;
> +
> +	if ((nd->flags & LOOKUP_OPEN) && dir_inode->i_op->atomic_open) {
> +		struct file *filp;
> +
> +		filp = atomic_open(nd, dentry, op, want_write, &create_error);
> +		if (filp) {
> +			dput(dentry);
> +			return filp;
> +		}
> +		/* fall back to plain lookup */
> +	}

Would it be possible to allow the filesystem to return a new dentry even
if it can't complete the actual open? That way we can return the actual
symlink that caused the open to fail instead of looking it up separately
(which may be subject to races).

 

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 04/25] vfs: add i_op->atomic_open()
  2012-03-13 14:38   ` Myklebust, Trond
@ 2012-03-13 15:11     ` Miklos Szeredi
  2012-03-13 15:31       ` Myklebust, Trond
  0 siblings, 1 reply; 55+ messages in thread
From: Miklos Szeredi @ 2012-03-13 15:11 UTC (permalink / raw)
  To: Myklebust, Trond
  Cc: viro, linux-fsdevel, linux-kernel, hch, sfrench, sage, ericvh

"Myklebust, Trond" <Trond.Myklebust@netapp.com> writes:

> On Wed, 2012-03-07 at 22:22 +0100, Miklos Szeredi wrote:
>> +
>> +	if ((nd->flags & LOOKUP_OPEN) && dir_inode->i_op->atomic_open) {
>> +		struct file *filp;
>> +
>> +		filp = atomic_open(nd, dentry, op, want_write, &create_error);
>> +		if (filp) {
>> +			dput(dentry);
>> +			return filp;
>> +		}
>> +		/* fall back to plain lookup */
>> +	}
>
> Would it be possible to allow the filesystem to return a new dentry even
> if it can't complete the actual open? That way we can return the actual
> symlink that caused the open to fail instead of looking it up separately
> (which may be subject to races).

This should be possible, but I'm reluctant to add more arguments to
->atomic_open.  Other possibilities that come to mind:

  return -ELOOKEDUP - caller should retry d_lookup and proceed with the result

  call opendata_set_dentry(od, dentry) and return NULL - caller checks
  opendata for non-NULL dentry and proceeds with that

Thanks,
Miklos
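
The second possibility above can be illustrated with a small userspace
mock.  This is only a sketch under stated assumptions: the mock_* types
and functions are hypothetical stand-ins rather than kernel structures,
and opendata_set_dentry() is the helper name floated in this message,
not an existing interface.  The filesystem records the dentry it found
in the opendata and returns NULL; the caller then continues with that
dentry instead of looking it up again.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are the kernel types. */
struct mock_dentry {
	const char *name;
	int is_symlink;
};

struct mock_file {
	struct mock_dentry *dentry;
};

struct mock_opendata {
	struct mock_file *filp;		/* preallocated by the caller */
	struct mock_dentry *dentry;	/* set by the fs if it looked up but did not open */
};

/* Helper named after the proposal; records the dentry the fs found. */
static void opendata_set_dentry(struct mock_opendata *od,
				struct mock_dentry *dentry)
{
	assert(od != NULL);
	od->dentry = dentry;
}

/* Mock ->atomic_open: declines to open symlinks but reports what it found. */
static struct mock_file *mock_atomic_open(struct mock_opendata *od,
					  struct mock_dentry *found)
{
	if (found->is_symlink) {
		opendata_set_dentry(od, found);
		return NULL;	/* no open performed */
	}
	od->filp->dentry = found;
	return od->filp;
}

/*
 * Mock caller: returns the dentry to proceed with.  *opened is the open
 * file, or NULL when only a lookup result is available.
 */
static struct mock_dentry *mock_lookup_open(struct mock_opendata *od,
					    struct mock_dentry *found,
					    struct mock_file **opened)
{
	od->dentry = NULL;
	*opened = mock_atomic_open(od, found);
	if (*opened)
		return (*opened)->dentry;
	return od->dentry;	/* NULL here would mean: fall back to plain lookup */
}
```

In this mock, a symlink target yields a NULL file plus the dentry the
filesystem found, while a regular file yields the preallocated file.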

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 04/25] vfs: add i_op->atomic_open()
  2012-03-13 15:11     ` Miklos Szeredi
@ 2012-03-13 15:31       ` Myklebust, Trond
  0 siblings, 0 replies; 55+ messages in thread
From: Myklebust, Trond @ 2012-03-13 15:31 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: viro, linux-fsdevel, linux-kernel, hch, sfrench, sage, ericvh

On Tue, 2012-03-13 at 16:11 +0100, Miklos Szeredi wrote:
> "Myklebust, Trond" <Trond.Myklebust@netapp.com> writes:
> 
> > On Wed, 2012-03-07 at 22:22 +0100, Miklos Szeredi wrote:
> >> +
> >> +	if ((nd->flags & LOOKUP_OPEN) && dir_inode->i_op->atomic_open) {
> >> +		struct file *filp;
> >> +
> >> +		filp = atomic_open(nd, dentry, op, want_write, &create_error);
> >> +		if (filp) {
> >> +			dput(dentry);
> >> +			return filp;
> >> +		}
> >> +		/* fall back to plain lookup */
> >> +	}
> >
> > Would it be possible to allow the filesystem to return a new dentry even
> > if it can't complete the actual open? That way we can return the actual
> > symlink that caused the open to fail instead of looking it up separately
> > (which may be subject to races).
> 
> This should be possible, but I'm reluctant to add more arguments to
> ->atomic_open.  Other possibilities that come to mind:
> 
>   return -ELOOKEDUP - caller should retry d_lookup and proceed with the result
> 
>   call opendata_set_dentry(od, dentry) and return NULL - caller checks
>   opendata for non-NULL dentry and proceeds with that

Or convert the existing 'dentry' argument into a struct dentry **.

Then again, it might just be easier to convert the existing arguments
into a single "open" structure.

-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
www.netapp.com

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 06/25] vfs: add i_op->atomic_create()
  2012-03-13 14:08           ` Miklos Szeredi
@ 2012-03-13 16:34             ` Sage Weil
  2012-03-24 14:14             ` Christoph Hellwig
  1 sibling, 0 replies; 55+ messages in thread
From: Sage Weil @ 2012-03-13 16:34 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Christoph Hellwig, viro, linux-fsdevel, linux-kernel,
	Trond.Myklebust, sfrench, ericvh

On Tue, 13 Mar 2012, Miklos Szeredi wrote:
> Miklos Szeredi <miklos@szeredi.hu> writes:
> 
> > Christoph Hellwig <hch@infradead.org> writes:
> >
> >> On Tue, Mar 13, 2012 at 12:22:10PM +0100, Miklos Szeredi wrote:
> >>> Good point.  Yes, ->create is probably worth getting rid of.  Mkdir, I'm
> >>> not so sure, but I'll look at what filesystems are doing.
> >>
> >> Btw, is there any good reason to keep ->atomic_open and ->atomic_create
> >> separate?  It seems like the instances in general share code anyway.
> >
> > ->atomic_open is called before lookup, ->atomic_create after lookup.
> >
> > How would we differentiate between the two if they were common?  We
> > could have a filesystem flag, but for example CEPH does weird things
> > like using ->atomic_open for !O_CREAT and ->atomic_create for O_CREAT.

Don't let what Ceph used to do distract you; I only got certain intent 
cases to work and didn't bother with the others.

> Or let the filesystem do the lookup in ->atomic_open if it wants (and
> pass the need_lookup flag to the filesystem).

Either way is fine from my perspective.

sage

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 13/25] cifs: implement i_op->atomic_open() and i_op->atomic_create()
  2012-03-13 13:39     ` Miklos Szeredi
@ 2012-03-13 16:43       ` Sage Weil
  2012-03-24 14:22       ` Christoph Hellwig
  1 sibling, 0 replies; 55+ messages in thread
From: Sage Weil @ 2012-03-13 16:43 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Christoph Hellwig, viro, linux-fsdevel, linux-kernel,
	Trond.Myklebust, sfrench, ericvh

On Tue, 13 Mar 2012, Miklos Szeredi wrote:
> Christoph Hellwig <hch@infradead.org> writes:
> 
> > On Wed, Mar 07, 2012 at 10:22:30PM +0100, Miklos Szeredi wrote:
> >> From: Miklos Szeredi <mszeredi@suse.cz>
> >> 
> >> Replace CIFS's ->create operation with ->atomic_open and ->atomic_create.  Also
> >> move the relevant code from ->lookup into the create function.
> >> 
> >> CIFS currently only does atomic open for O_CREAT, but it wants to do that as
> >> early as possible, without first calling ->lookup, so it uses ->atomic_open,
> >> just like NFS.
> >
> > Why does cifs need to set the created flag from inside ->atomic_open?
> >
> > It's different from everyone else in that respect.
> 
> Apparently CIFS is the only one that can tell whether the file was
> created or not.  If the flag is set then fsnotify_create() is called.
> Users of NFS don't seem to care, it's of dubious value anyway, but why
> not use the info when available?

It also adds MAY_OPEN to acc_mode... I take it that only matters if the 
fs who set created = true looks for it in ->permission()?

+       acc_mode = op->acc_mode;
+       if (created) {
+               fsnotify_create(dir, dentry);
+               acc_mode = MAY_OPEN;
+       }

sage

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 03/25] vfs: split __dentry_open()
  2012-03-07 21:22 ` [PATCH 03/25] vfs: split __dentry_open() Miklos Szeredi
@ 2012-03-24 14:12   ` Christoph Hellwig
  2012-03-26 13:22     ` Miklos Szeredi
  0 siblings, 1 reply; 55+ messages in thread
From: Christoph Hellwig @ 2012-03-24 14:12 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: viro, linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench,
	sage, ericvh, mszeredi

On Wed, Mar 07, 2012 at 10:22:20PM +0100, Miklos Szeredi wrote:
> From: Miklos Szeredi <mszeredi@suse.cz>
> 
> Split __dentry_open() into two functions:
> 
>   do_dentry_open() - does most of the actual work, doesn't put file on failure
>   open_check_o_direct() - after a successful open, checks direct_IO method
> 
> This will allow i_op->atomic_open to do just the file initialization and leave
> the direct_IO checking to the VFS.

I think the O_DIRECT checks should move out of the VFS.  The direct I/O
method isn't called from the VFS anywhere, but just from the
generic_file_* routines in filemap.c, which suggest doing the O_DIRECT
check in there as well.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 06/25] vfs: add i_op->atomic_create()
  2012-03-13 14:08           ` Miklos Szeredi
  2012-03-13 16:34             ` Sage Weil
@ 2012-03-24 14:14             ` Christoph Hellwig
  1 sibling, 0 replies; 55+ messages in thread
From: Christoph Hellwig @ 2012-03-24 14:14 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Christoph Hellwig, viro, linux-fsdevel, linux-kernel,
	Trond.Myklebust, sfrench, sage, ericvh

On Tue, Mar 13, 2012 at 03:08:59PM +0100, Miklos Szeredi wrote:
> >> Btw, is there any good reason to keep ->atomic_open and ->atomic_create
> >> separate?  It seems like the instances in general share code anyway.
> >
> > ->atomic_open is called before lookup, ->atomic_create after lookup.
> >
> > How would we differentiate between the two if they were common?  We
> > could have a filesystem flag, but for example CEPH does weird things
> > like using ->atomic_open for !O_CREAT and ->atomic_create for O_CREAT.
> 
> Or let the filesystem do the lookup in ->atomic_open if it wants (and
> pass the need_lookup flag to the filesystem).

That sounds like a much better approach.  And as mentioned by Sage a lot
of the odd fs behaviour currently probably is because the current atomic
open interface is so awkward that no one really understands it.  We'll
probably be able to clean up a lot of that, but I'd suggest to not try
to cram everything into the initial atomic open support.


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 11/25] nfs: don't use intents for checking atomic open
  2012-03-07 21:22 ` [PATCH 11/25] nfs: don't use intents for checking atomic open Miklos Szeredi
@ 2012-03-24 14:20   ` Christoph Hellwig
  0 siblings, 0 replies; 55+ messages in thread
From: Christoph Hellwig @ 2012-03-24 14:20 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: viro, linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench,
	sage, ericvh, mszeredi

As NFS was the only instance of d_revalidate using other fields
than 'flags' from the nameidata argument we can simply pass an int
flags now.  That will make life for ecryptfs a lot simpler, and it
also gets rid of all that if (nd && check_flags) crap because it's
passed conditionally.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 13/25] cifs: implement i_op->atomic_open() and i_op->atomic_create()
  2012-03-13 13:39     ` Miklos Szeredi
  2012-03-13 16:43       ` Sage Weil
@ 2012-03-24 14:22       ` Christoph Hellwig
  1 sibling, 0 replies; 55+ messages in thread
From: Christoph Hellwig @ 2012-03-24 14:22 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Christoph Hellwig, viro, linux-fsdevel, linux-kernel,
	Trond.Myklebust, sfrench, sage, ericvh

On Tue, Mar 13, 2012 at 02:39:16PM +0100, Miklos Szeredi wrote:
> Apparently CIFS is the only one that can tell whether the file was
> created or not.  If the flag is set then fsnotify_create() is called.
> Users of NFS don't seem to care, it's of dubious value anyway, but why
> not use the info when available?

Given that *notify doesn't work correctly on non-local filesystems anyway
I don't think it matters.

^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 21/25] gfs2: use i_op->atomic_create()
  2012-03-07 21:22 ` [PATCH 21/25] gfs2: use i_op->atomic_create() Miklos Szeredi
@ 2012-03-24 14:27   ` Christoph Hellwig
  2012-03-24 15:38     ` Miklos Szeredi
  0 siblings, 1 reply; 55+ messages in thread
From: Christoph Hellwig @ 2012-03-24 14:27 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: viro, linux-fsdevel, linux-kernel, hch, Trond.Myklebust, sfrench,
	sage, ericvh, mszeredi

On Wed, Mar 07, 2012 at 10:22:38PM +0100, Miklos Szeredi wrote:
> From: Miklos Szeredi <mszeredi@suse.cz>
> 
> GFS2 doesn't open the file in ->create, but it does check the LOOKUP_EXCL flag
> in its create function.  Convert to using ->atomic_create and checking O_EXCL
> so that the nameidata argument is no longer necessary.

It seems odd that we require implementing ->atomic_create even if we
don't actually do an atomic create but just want to look at the flags.

In fact I wonder if we really need to bother with having ->atomic_create
and ->create, or if we should have one method (kinda contra my previous
stance that ->atomic_open and ->atomic_create should be one).

This method then could or could not return a file pointer, but it would
always be the entry point for creates.


^ permalink raw reply	[flat|nested] 55+ messages in thread

* Re: [PATCH 21/25] gfs2: use i_op->atomic_create()
  2012-03-24 14:27   ` Christoph Hellwig
@ 2012-03-24 15:38     ` Miklos Szeredi
  0 siblings, 0 replies; 55+ messages in thread
From: Miklos Szeredi @ 2012-03-24 15:38 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: viro, linux-fsdevel, linux-kernel, Trond.Myklebust, sfrench,
	sage, ericvh, mszeredi

On Sat, Mar 24, 2012 at 3:27 PM, Christoph Hellwig <hch@infradead.org> wrote:
> On Wed, Mar 07, 2012 at 10:22:38PM +0100, Miklos Szeredi wrote:
>> From: Miklos Szeredi <mszeredi@suse.cz>
>>
>> GFS2 doesn't open the file in ->create, but it does check the LOOKUP_EXCL flag
>> in its create function.  Convert to using ->atomic_create and checking O_EXCL
>> so that the nameidata argument is no longer necessary.
>
> It seems odd that we require implementing ->atomic_create even if we
> don't actually do an atomic create but just want to look at the flags.
>
> In fact I wonder if we really need to bother with having ->atomic_create
> and ->create, or if we should have one method (kinda contra my previous
> stance that ->atomic_open and ->atomic_create should be one).
>
> This method then could or could not return a file pointer, but it would
> always be the entry point for creates.

Yes, except I thought of the pain of converting all those returns to
ERR_PTR and left the old ->create instead.

Maybe I should just bite the bullet and do that; having a single return
point for those functions shouldn't be that difficult to do.

Thanks,
Miklos
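
For readers unfamiliar with the ERR_PTR convention mentioned above, here
is a small user-space sketch of what such a conversion might look like.
It is an illustration only, not code from the series: the model_*
helpers are hypothetical re-implementations of the idea behind the
kernel's ERR_PTR()/IS_ERR()/PTR_ERR(), and model_create() is an invented
function with a single exit point.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_ERRNO 4095	/* errnos live in the top 4095 addresses */

/* Encode a negative errno in a pointer value. */
static inline void *model_err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Decode the errno again. */
static inline long model_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True if the pointer is really an encoded errno. */
static inline int model_is_err(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MODEL_MAX_ERRNO;
}

struct model_file { int opened; };

/*
 * Hypothetical converted create: every failure jumps to one exit label,
 * which turns the errno into a pointer.  NULL means "created but not
 * opened"; a valid pointer means "created and opened".
 */
static struct model_file *model_create(int exists, int excl, int nospace,
				       struct model_file *spare)
{
	struct model_file *res = NULL;
	int err = 0;

	if (exists && excl) {
		err = -EEXIST;
		goto out;
	}
	if (nospace) {
		err = -ENOSPC;
		goto out;
	}
	if (spare) {		/* this filesystem can open atomically */
		spare->opened = 1;
		res = spare;
	}
out:
	return err ? model_err_ptr(err) : res;
}
```

Funnelling all failures through one label means the int-to-pointer
encoding happens in exactly one place, which is roughly the "single
return point" simplification referred to above.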


* Re: [PATCH 03/25] vfs: split __dentry_open()
  2012-03-24 14:12   ` Christoph Hellwig
@ 2012-03-26 13:22     ` Miklos Szeredi
  2012-03-26 13:30       ` Christoph Hellwig
  0 siblings, 1 reply; 55+ messages in thread
From: Miklos Szeredi @ 2012-03-26 13:22 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: viro, linux-fsdevel, linux-kernel, Trond.Myklebust, sfrench,
	sage, ericvh, mszeredi

On Sat, Mar 24, 2012 at 3:12 PM, Christoph Hellwig <hch@infradead.org> wrote:
> On Wed, Mar 07, 2012 at 10:22:20PM +0100, Miklos Szeredi wrote:
>> From: Miklos Szeredi <mszeredi@suse.cz>
>>
>> Split __dentry_open() into two functions:
>>
>>   do_dentry_open() - does most of the actual work, doesn't put file on failure
>>   open_check_o_direct() - after a successful open, checks direct_IO method
>>
>> This will allow i_op->atomic_open to do just the file initialization and leave
>> the direct_IO checking to the VFS.
>
> I think the O_DIRECT checks should move out of the VFS.  The direct I/O
> method isn't called from the VFS anywhere, but just from the
> generic_file_* routines in filemap.c, which suggests doing the O_DIRECT
> check in there as well.

Returning the error at the earliest opportunity (from open as opposed
to read/write) makes sense.  Given that some apps may actually rely on
the return value from open to verify O_DIRECT support, it doesn't seem
to be a good idea to move the checks to read/write.

Thanks,
Miklos
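
The difference between the two placements can be sketched in a few
lines of user-space C.  This is only a model under stated assumptions:
struct model_file, MODEL_O_DIRECT and the model_* functions are
hypothetical stand-ins, not the kernel's struct file, O_DIRECT or
open_check_o_direct().

```c
#include <errno.h>
#include <stddef.h>

#define MODEL_O_DIRECT 0x1	/* stand-in for O_DIRECT; value is arbitrary */

struct model_file {
	int f_flags;
	int has_direct_io;	/* models "a_ops->direct_IO is set" */
};

/* The check itself: O_DIRECT requested but no direct I/O method. */
static int model_check_o_direct(const struct model_file *f)
{
	if ((f->f_flags & MODEL_O_DIRECT) && !f->has_direct_io)
		return -EINVAL;
	return 0;
}

/* Variant A: check at open time, so open itself fails. */
static int model_open_early(struct model_file *f, int flags)
{
	f->f_flags = flags;
	return model_check_o_direct(f);
}

/* Variant B: open always succeeds, the error surfaces on first I/O. */
static int model_open_late(struct model_file *f, int flags)
{
	f->f_flags = flags;
	return 0;
}

static int model_read_late(const struct model_file *f)
{
	return model_check_o_direct(f);
}
```

An application that probes for O_DIRECT support by looking only at the
result of open gets -EINVAL under variant A, but a success under
variant B followed by a failure on its first read.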


* Re: [PATCH 03/25] vfs: split __dentry_open()
  2012-03-26 13:22     ` Miklos Szeredi
@ 2012-03-26 13:30       ` Christoph Hellwig
  2012-03-26 13:47         ` Miklos Szeredi
  0 siblings, 1 reply; 55+ messages in thread
From: Christoph Hellwig @ 2012-03-26 13:30 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Christoph Hellwig, viro, linux-fsdevel, linux-kernel,
	Trond.Myklebust, sfrench, sage, ericvh, mszeredi

On Mon, Mar 26, 2012 at 03:22:09PM +0200, Miklos Szeredi wrote:
> > I think the O_DIRECT checks should move out of the VFS.  The direct I/O
> > method isn't called from the VFS anywhere, but just from the
> > generic_file_* routines in filemap.c, which suggests doing the O_DIRECT
> > check in there as well.
> 
> Returning the error at the earliest opportunity (from open as opposed
> to read/write) makes sense.  Given that some apps may actually rely on
> the return value from open to verify O_DIRECT support, it doesn't seem
> to be a good idea to move the checks to read/write.

I'm fine with keeping it in open, but it should be in generic_file_open,
not in the VFS (and yeah, generic_file_open is in open.c not filemap.c
where it should be, sorry)



* Re: [PATCH 03/25] vfs: split __dentry_open()
  2012-03-26 13:30       ` Christoph Hellwig
@ 2012-03-26 13:47         ` Miklos Szeredi
  0 siblings, 0 replies; 55+ messages in thread
From: Miklos Szeredi @ 2012-03-26 13:47 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: viro, linux-fsdevel, linux-kernel, Trond.Myklebust, sfrench,
	sage, ericvh, mszeredi

On Mon, Mar 26, 2012 at 3:30 PM, Christoph Hellwig <hch@infradead.org> wrote:
> On Mon, Mar 26, 2012 at 03:22:09PM +0200, Miklos Szeredi wrote:
>> > I think the O_DIRECT checks should move out of the VFS.  The direct I/O
>> > method isn't called from the VFS anywhere, but just from the
>> > generic_file_* routines in filemap.c, which suggests doing the O_DIRECT
>> > check in there as well.
>>
>> Returning the error at the earliest opportunity (from open as opposed
>> to read/write) makes sense.  Given that some apps may actually rely on
>> the return value from open to verify O_DIRECT support, it doesn't seem
>> to be a good idea to move the checks to read/write.
>
> I'm fine with keeping it in open, but it should be in generic_file_open,
> not in the VFS (and yeah, generic_file_open is in open.c not filemap.c
> where it should be, sorry)

Unfortunately a lot of filesystems that use generic_file_aio_foo()
don't actually call generic_file_open() in their ->open method.  And
it would be difficult to enforce.

Thanks,
Miklos
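
The enforcement problem can be illustrated with a toy model.  This is a
sketch under assumptions, not kernel code: model_generic_open() stands
in for a generic_file_open() that carries the check, and
model_private_open() stands in for a filesystem ->open method that does
not call the generic helper.  All names are hypothetical.

```c
#include <errno.h>
#include <stddef.h>

#define MODEL_O_DIRECT 0x1	/* stand-in for O_DIRECT; value is arbitrary */

struct model_file;

struct model_fops {
	int (*open)(struct model_file *f);
};

struct model_file {
	int f_flags;
	int has_direct_io;	/* models "a_ops->direct_IO is set" */
	const struct model_fops *f_op;
};

static int model_check_o_direct(const struct model_file *f)
{
	if ((f->f_flags & MODEL_O_DIRECT) && !f->has_direct_io)
		return -EINVAL;
	return 0;
}

/* Generic helper with the check placed inside it. */
static int model_generic_open(struct model_file *f)
{
	return model_check_o_direct(f);
}

/* A filesystem's own ->open that never calls the generic helper. */
static int model_private_open(struct model_file *f)
{
	(void)f;
	return 0;
}

/* Placement 1: the only check is whatever ->open happens to do. */
static int model_open_helper_only(struct model_file *f)
{
	return f->f_op->open ? f->f_op->open(f) : 0;
}

/* Placement 2: common code checks after ->open, for every filesystem. */
static int model_open_common(struct model_file *f)
{
	int err = f->f_op->open ? f->f_op->open(f) : 0;

	return err ? err : model_check_o_direct(f);
}
```

With placement 1, a file using model_private_open() is opened with
O_DIRECT successfully even though it has no direct I/O method; with
placement 2 the same open fails with -EINVAL regardless of what the
filesystem's ->open does.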



Thread overview: 55+ messages
2012-03-07 21:22 [PATCH 00/25] vfs: atomic open RFC Miklos Szeredi
2012-03-07 21:22 ` [PATCH 01/25] vfs: split do_lookup() Miklos Szeredi
2012-03-07 21:22 ` [PATCH 02/25] vfs: reorganize do_last() Miklos Szeredi
2012-03-07 21:22 ` [PATCH 03/25] vfs: split __dentry_open() Miklos Szeredi
2012-03-24 14:12   ` Christoph Hellwig
2012-03-26 13:22     ` Miklos Szeredi
2012-03-26 13:30       ` Christoph Hellwig
2012-03-26 13:47         ` Miklos Szeredi
2012-03-07 21:22 ` [PATCH 04/25] vfs: add i_op->atomic_open() Miklos Szeredi
2012-03-13 14:38   ` Myklebust, Trond
2012-03-13 15:11     ` Miklos Szeredi
2012-03-13 15:31       ` Myklebust, Trond
2012-03-07 21:22 ` [PATCH 05/25] vfs: add filesystem flags for atomic_open Miklos Szeredi
2012-03-13  9:33   ` Christoph Hellwig
2012-03-13 11:17     ` Miklos Szeredi
2012-03-07 21:22 ` [PATCH 06/25] vfs: add i_op->atomic_create() Miklos Szeredi
2012-03-13  9:37   ` Christoph Hellwig
2012-03-13 11:22     ` Miklos Szeredi
2012-03-13 11:55       ` Christoph Hellwig
2012-03-13 13:26         ` Miklos Szeredi
2012-03-13 14:08           ` Miklos Szeredi
2012-03-13 16:34             ` Sage Weil
2012-03-24 14:14             ` Christoph Hellwig
2012-03-07 21:22 ` [PATCH 07/25] nfs: implement i_op->atomic_open() Miklos Szeredi
2012-03-07 21:22 ` [PATCH 08/25] nfs: clean up ->create in nfs_rpc_ops Miklos Szeredi
2012-03-07 21:22 ` [PATCH 09/25] nfs: remove nfs4 specific create function Miklos Szeredi
2012-03-13 12:09   ` Christoph Hellwig
2012-03-07 21:22 ` [PATCH 10/25] nfs: don't use nd->intent.open.flags Miklos Szeredi
2012-03-07 21:22 ` [PATCH 11/25] nfs: don't use intents for checking atomic open Miklos Szeredi
2012-03-24 14:20   ` Christoph Hellwig
2012-03-07 21:22 ` [PATCH 12/25] fuse: implement i_op->atomic_create() Miklos Szeredi
2012-03-07 21:22 ` [PATCH 13/25] cifs: implement i_op->atomic_open() and i_op->atomic_create() Miklos Szeredi
2012-03-13 12:06   ` Christoph Hellwig
2012-03-13 13:39     ` Miklos Szeredi
2012-03-13 16:43       ` Sage Weil
2012-03-24 14:22       ` Christoph Hellwig
2012-03-07 21:22 ` [PATCH 14/25] ceph: remove unused arg from ceph_lookup_open() Miklos Szeredi
2012-03-07 21:22 ` [PATCH 15/25] ceph: implement i_op->atomic_open() and i_op->atomic_create() Miklos Szeredi
2012-03-07 21:22 ` [PATCH 16/25] 9p: implement i_op->atomic_create() Miklos Szeredi
2012-03-07 21:22 ` [PATCH 17/25] vfs: remove open intents from nameidata Miklos Szeredi
2012-03-07 21:22 ` [PATCH 18/25] vfs: only retry last component if opening stale dentry Miklos Szeredi
2012-03-07 21:22 ` [PATCH 19/25] vfs: remove nameidata argument from vfs_create Miklos Szeredi
2012-03-07 21:22 ` [PATCH 20/25] vfs: move O_DIRECT check to common code Miklos Szeredi
2012-03-07 21:22 ` [PATCH 21/25] gfs2: use i_op->atomic_create() Miklos Szeredi
2012-03-24 14:27   ` Christoph Hellwig
2012-03-24 15:38     ` Miklos Szeredi
2012-03-07 21:22 ` [PATCH 22/25] nfs: " Miklos Szeredi
2012-03-07 21:22 ` [PATCH 23/25] vfs: remove nameidata argument from i_op->create() Miklos Szeredi
2012-03-07 21:22 ` [PATCH 24/25] vfs: optionally skip lookup on exclusive create Miklos Szeredi
2012-03-07 21:22 ` [PATCH 25/25] vfs: remove nameidata from lookup Miklos Szeredi
2012-03-07 21:27 ` [PATCH 00/25] vfs: atomic open RFC Steve French
2012-03-13  9:51 ` Christoph Hellwig
2012-03-13 11:00   ` Miklos Szeredi
2012-03-13 12:01     ` Christoph Hellwig
2012-03-13 13:33       ` Miklos Szeredi
