linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC][PATCH 0/4] ovl: Do not hold s_vfs_rename_mutex during data copy up
@ 2016-11-12 19:36 Amir Goldstein
  2016-11-12 19:36 ` [RFC][PATCH 1/4] vfs: update documentation for inode operation locking rules Amir Goldstein
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Amir Goldstein @ 2016-11-12 19:36 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Konstantin Khlebnikov, Al Viro, linux-unionfs, linux-fsdevel

With these changes, different files data can be copied up concurrently and
data copy up will not block copy up of other files nor will it block other
rename operations on the underlaying fs.

Following 3 cleanup patches, the last patch implements the following locking
scheme:

Directories:
  <maybe> inode_lock(ovl_inode)
    lock_rename(workdir, upperdir)
      copy up dir to workdir
      move dir to upperdir

Special and zero size files:
  inode_lock(ovl_inode)
    lock_rename(workdir, upperdir)
      copy up file to workdir (no data)
      move file to upperdir

Regular files with data:
  inode_lock(ovl_inode)
    lock_rename(workdir, upperdir)
      copy up file to workdir (no data)
    unlock_rename(workdir, upperdir)
      copy data to file in workdir
    lock_rename(workdir, upperdir)
      move file to upperdir

Those changes are sanitity tested with xfstest -g quick, pjdfstest and
unionmount-testsuite.

Also tested that during copy up of 4gb file, it is possible to touch small files
on same overlay without blocking and rename files in underlying fs

Rename or touch the 4gb file before copy up is done will block until copy up is
done and complete there after.

Amir Goldstein (4):
  vfs: update documentation for inode operation locking rules
  ovl: add helper ovl_dentry_is_upper()
  ovl: fold ovl_copy_up_truncate() into ovl_copy_up()
  ovl: allow concurrent copy up data of differnt files

 Documentation/filesystems/Locking |  27 +++++-----
 fs/overlayfs/copy_up.c            | 100 ++++++++++++++++++++++++++++++++------
 fs/overlayfs/dir.c                |   4 +-
 fs/overlayfs/inode.c              |  36 ++------------
 fs/overlayfs/overlayfs.h          |   4 +-
 fs/overlayfs/readdir.c            |   2 +-
 fs/overlayfs/util.c               |   7 +++
 7 files changed, 116 insertions(+), 64 deletions(-)

-- 
2.7.4


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [RFC][PATCH 1/4] vfs: update documentation for inode operation locking rules
  2016-11-12 19:36 [RFC][PATCH 0/4] ovl: Do not hold s_vfs_rename_mutex during data copy up Amir Goldstein
@ 2016-11-12 19:36 ` Amir Goldstein
  2016-11-12 19:36 ` [RFC][PATCH 2/4] ovl: add helper ovl_dentry_is_upper() Amir Goldstein
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 12+ messages in thread
From: Amir Goldstein @ 2016-11-12 19:36 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Konstantin Khlebnikov, Al Viro, linux-unionfs, linux-fsdevel

- Remove obsolete truncate operation prototype
- Reorder prototypes mostly by their order in inode_operations struct
- Match locking rules order to prototypes order
- Add missing rule for set_acl operation
- Fix indentation

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 Documentation/filesystems/Locking | 27 ++++++++++++++-------------
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking
index 14cdc10..0e35b16 100644
--- a/Documentation/filesystems/Locking
+++ b/Documentation/filesystems/Locking
@@ -42,8 +42,11 @@ d_real		no		no		yes 		no
 
 --------------------------- inode_operations --------------------------- 
 prototypes:
-	int (*create) (struct inode *,struct dentry *,umode_t, bool);
 	struct dentry * (*lookup) (struct inode *,struct dentry *, unsigned int);
+	int (*permission) (struct inode *, int);
+	int (*readlink) (struct dentry *, char __user *,int);
+	const char *(*get_link) (struct dentry *, struct inode *, void **);
+	int (*create) (struct inode *,struct dentry *,umode_t, bool);
 	int (*link) (struct dentry *,struct inode *,struct dentry *);
 	int (*unlink) (struct inode *,struct dentry *);
 	int (*symlink) (struct inode *,struct dentry *,const char *);
@@ -52,11 +55,6 @@ prototypes:
 	int (*mknod) (struct inode *,struct dentry *,umode_t,dev_t);
 	int (*rename) (struct inode *, struct dentry *,
 			struct inode *, struct dentry *, unsigned int);
-	int (*readlink) (struct dentry *, char __user *,int);
-	const char *(*get_link) (struct dentry *, struct inode *, void **);
-	void (*truncate) (struct inode *);
-	int (*permission) (struct inode *, int, unsigned int);
-	int (*get_acl)(struct inode *, int);
 	int (*setattr) (struct dentry *, struct iattr *);
 	int (*getattr) (struct vfsmount *, struct dentry *, struct kstat *);
 	ssize_t (*listxattr) (struct dentry *, char *, size_t);
@@ -66,30 +64,33 @@ prototypes:
 				struct file *, unsigned open_flag,
 				umode_t create_mode, int *opened);
 	int (*tmpfile) (struct inode *, struct dentry *, umode_t);
+	int (*set_acl) (struct inode *, struct posix_acl *, int);
+	struct posix_acl *(*get_acl)(struct inode *, int);
 
 locking rules:
 	all may block
 		i_mutex(inode)
 lookup:		yes
+permission:	no (may not block if called in rcu-walk mode)
+readlink:	no
+get_link:	no
 create:		yes
 link:		yes (both)
-mknod:		yes
+unlink:		yes (both)
 symlink:	yes
 mkdir:		yes
-unlink:		yes (both)
 rmdir:		yes (both)	(see below)
-rename:	yes (all)	(see below)
-readlink:	no
-get_link:	no
+mknod:		yes
+rename:		yes (all)	(see below)
 setattr:	yes
-permission:	no (may not block if called in rcu-walk mode)
-get_acl:	no
 getattr:	no
 listxattr:	no
 fiemap:		no
 update_time:	no
 atomic_open:	yes
 tmpfile:	no
+set_acl:	yes
+get_acl:	no
 
 
 	Additionally, ->rmdir(), ->unlink() and ->rename() have ->i_mutex on
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [RFC][PATCH 2/4] ovl: add helper ovl_dentry_is_upper()
  2016-11-12 19:36 [RFC][PATCH 0/4] ovl: Do not hold s_vfs_rename_mutex during data copy up Amir Goldstein
  2016-11-12 19:36 ` [RFC][PATCH 1/4] vfs: update documentation for inode operation locking rules Amir Goldstein
@ 2016-11-12 19:36 ` Amir Goldstein
  2016-11-12 19:36 ` [RFC][PATCH 3/4] ovl: fold ovl_copy_up_truncate() into ovl_copy_up() Amir Goldstein
  2016-11-12 19:36 ` [RFC][PATCH 4/4] ovl: allow concurrent copy up data of different files Amir Goldstein
  3 siblings, 0 replies; 12+ messages in thread
From: Amir Goldstein @ 2016-11-12 19:36 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Konstantin Khlebnikov, Al Viro, linux-unionfs, linux-fsdevel

Add a simple helper to find out if an overlay entry is upper,
and use instead of the cumbersome expression
OVL_TYPE_UPPER(ovl_path_type(dentry)).

Use the helper in all the places that call ovl_dentry_upper(dentry)
without dereferencing the returned entry.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/copy_up.c   | 11 ++++-------
 fs/overlayfs/dir.c       |  4 ++--
 fs/overlayfs/overlayfs.h |  1 +
 fs/overlayfs/readdir.c   |  2 +-
 fs/overlayfs/util.c      |  7 +++++++
 5 files changed, 15 insertions(+), 10 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index f18c1a6..7e67512 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -346,7 +346,6 @@ int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 	struct path parentpath;
 	struct dentry *lowerdentry = lowerpath->dentry;
 	struct dentry *upperdir;
-	struct dentry *upperdentry;
 	const char *link = NULL;
 
 	if (WARN_ON(!workdir))
@@ -372,8 +371,8 @@ int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 		pr_err("overlayfs: failed to lock workdir+upperdir\n");
 		goto out_unlock;
 	}
-	upperdentry = ovl_dentry_upper(dentry);
-	if (upperdentry) {
+
+	if (ovl_dentry_is_upper(dentry)) {
 		/* Raced with another copy-up?  Nothing to do, then... */
 		err = 0;
 		goto out_unlock;
@@ -402,9 +401,8 @@ int ovl_copy_up(struct dentry *dentry)
 		struct dentry *parent;
 		struct path lowerpath;
 		struct kstat stat;
-		enum ovl_path_type type = ovl_path_type(dentry);
 
-		if (OVL_TYPE_UPPER(type))
+		if (ovl_dentry_is_upper(dentry))
 			break;
 
 		next = dget(dentry);
@@ -412,8 +410,7 @@ int ovl_copy_up(struct dentry *dentry)
 		for (;;) {
 			parent = dget_parent(next);
 
-			type = ovl_path_type(parent);
-			if (OVL_TYPE_UPPER(type))
+			if (ovl_dentry_is_upper(parent))
 				break;
 
 			dput(next);
diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index f24b6b9..100dce5 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -631,7 +631,7 @@ static int ovl_remove_and_whiteout(struct dentry *dentry, bool is_dir)
 
 	err = -ESTALE;
 	if ((opaquedir && upper != opaquedir) ||
-	    (!opaquedir && ovl_dentry_upper(dentry) &&
+	    (!opaquedir && ovl_dentry_is_upper(dentry) &&
 	     upper != ovl_dentry_upper(dentry))) {
 		goto out_dput_upper;
 	}
@@ -858,7 +858,7 @@ static int ovl_rename(struct inode *olddir, struct dentry *old,
 	new_opaque = ovl_dentry_is_opaque(new);
 
 	err = -ESTALE;
-	if (ovl_dentry_upper(new)) {
+	if (ovl_dentry_is_upper(new)) {
 		if (opaquedir) {
 			if (newdentry != opaquedir)
 				goto out_dput;
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index f6e4d35..421fd50 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -156,6 +156,7 @@ void ovl_set_dir_cache(struct dentry *dentry, struct ovl_dir_cache *cache);
 bool ovl_dentry_is_opaque(struct dentry *dentry);
 bool ovl_dentry_is_whiteout(struct dentry *dentry);
 void ovl_dentry_set_opaque(struct dentry *dentry, bool opaque);
+bool ovl_dentry_is_upper(struct dentry *dentry);
 void ovl_dentry_update(struct dentry *dentry, struct dentry *upperdentry);
 void ovl_inode_init(struct inode *inode, struct inode *realinode,
 		    bool is_upper);
diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
index f241b4e..857f24b 100644
--- a/fs/overlayfs/readdir.c
+++ b/fs/overlayfs/readdir.c
@@ -437,7 +437,7 @@ static int ovl_dir_fsync(struct file *file, loff_t start, loff_t end,
 	/*
 	 * Need to check if we started out being a lower dir, but got copied up
 	 */
-	if (!od->is_upper && OVL_TYPE_UPPER(ovl_path_type(dentry))) {
+	if (!od->is_upper && ovl_dentry_is_upper(dentry)) {
 		struct inode *inode = file_inode(file);
 
 		realfile = lockless_dereference(od->upperfile);
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 0d45a84..9a8071a 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -121,6 +121,13 @@ struct dentry *ovl_dentry_upper(struct dentry *dentry)
 	return ovl_upperdentry_dereference(oe);
 }
 
+bool ovl_dentry_is_upper(struct dentry *dentry)
+{
+	struct ovl_entry *oe = dentry->d_fsdata;
+
+	return (oe->__upperdentry != NULL);
+}
+
 static struct dentry *__ovl_dentry_lower(struct ovl_entry *oe)
 {
 	return oe->numlower ? oe->lowerstack[0].dentry : NULL;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [RFC][PATCH 3/4] ovl: fold ovl_copy_up_truncate() into ovl_copy_up()
  2016-11-12 19:36 [RFC][PATCH 0/4] ovl: Do not hold s_vfs_rename_mutex during data copy up Amir Goldstein
  2016-11-12 19:36 ` [RFC][PATCH 1/4] vfs: update documentation for inode operation locking rules Amir Goldstein
  2016-11-12 19:36 ` [RFC][PATCH 2/4] ovl: add helper ovl_dentry_is_upper() Amir Goldstein
@ 2016-11-12 19:36 ` Amir Goldstein
  2016-12-03 17:04   ` Amir Goldstein
  2016-12-05 14:51   ` Miklos Szeredi
  2016-11-12 19:36 ` [RFC][PATCH 4/4] ovl: allow concurrent copy up data of different files Amir Goldstein
  3 siblings, 2 replies; 12+ messages in thread
From: Amir Goldstein @ 2016-11-12 19:36 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Konstantin Khlebnikov, Al Viro, linux-unionfs, linux-fsdevel

This removes code duplication and is prep work
for fixing up copy up locking.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/copy_up.c   | 19 ++++++++++++++++---
 fs/overlayfs/inode.c     | 33 +--------------------------------
 fs/overlayfs/overlayfs.h |  3 +--
 3 files changed, 18 insertions(+), 37 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 7e67512..a16127b 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -336,8 +336,8 @@ static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
  * d_parent it is possible that the copy up will lock the old parent.  At
  * that point the file will have already been copied up anyway.
  */
-int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
-		    struct path *lowerpath, struct kstat *stat)
+static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
+			   struct path *lowerpath, struct kstat *stat)
 {
 	DEFINE_DELAYED_CALL(done);
 	struct dentry *workdir = ovl_workdir(dentry);
@@ -391,7 +391,7 @@ int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 	return err;
 }
 
-int ovl_copy_up(struct dentry *dentry)
+static int __ovl_copy_up(struct dentry *dentry, int flags)
 {
 	int err = 0;
 	const struct cred *old_cred = ovl_override_creds(dentry->d_sb);
@@ -419,6 +419,9 @@ int ovl_copy_up(struct dentry *dentry)
 
 		ovl_path_lower(next, &lowerpath);
 		err = vfs_getattr(&lowerpath, &stat);
+		/* maybe truncate regular file. this has no effect on dirs */
+		if (flags & O_TRUNC)
+			stat.size = 0;
 		if (!err)
 			err = ovl_copy_up_one(parent, next, &lowerpath, &stat);
 
@@ -429,3 +432,13 @@ int ovl_copy_up(struct dentry *dentry)
 
 	return err;
 }
+
+int ovl_copy_up(struct dentry *dentry)
+{
+	return __ovl_copy_up(dentry, 0);
+}
+
+int ovl_copy_up_open(struct dentry *dentry, int flags)
+{
+	return __ovl_copy_up(dentry, flags);
+}
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index a10e948..7abae00 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -13,34 +13,6 @@
 #include <linux/posix_acl.h>
 #include "overlayfs.h"
 
-static int ovl_copy_up_truncate(struct dentry *dentry)
-{
-	int err;
-	struct dentry *parent;
-	struct kstat stat;
-	struct path lowerpath;
-	const struct cred *old_cred;
-
-	parent = dget_parent(dentry);
-	err = ovl_copy_up(parent);
-	if (err)
-		goto out_dput_parent;
-
-	ovl_path_lower(dentry, &lowerpath);
-
-	old_cred = ovl_override_creds(dentry->d_sb);
-	err = vfs_getattr(&lowerpath, &stat);
-	if (!err) {
-		stat.size = 0;
-		err = ovl_copy_up_one(parent, dentry, &lowerpath, &stat);
-	}
-	revert_creds(old_cred);
-
-out_dput_parent:
-	dput(parent);
-	return err;
-}
-
 int ovl_setattr(struct dentry *dentry, struct iattr *attr)
 {
 	int err;
@@ -281,10 +253,7 @@ int ovl_open_maybe_copy_up(struct dentry *dentry, unsigned int file_flags)
 	if (ovl_open_need_copy_up(file_flags, type, realpath.dentry)) {
 		err = ovl_want_write(dentry);
 		if (!err) {
-			if (file_flags & O_TRUNC)
-				err = ovl_copy_up_truncate(dentry);
-			else
-				err = ovl_copy_up(dentry);
+			err = ovl_copy_up_open(dentry, file_flags);
 			ovl_drop_write(dentry);
 		}
 	}
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 421fd50..1a2f9a5 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -215,7 +215,6 @@ void ovl_cleanup(struct inode *dir, struct dentry *dentry);
 
 /* copy_up.c */
 int ovl_copy_up(struct dentry *dentry);
-int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
-		    struct path *lowerpath, struct kstat *stat);
+int ovl_copy_up_open(struct dentry *dentry, int flags);
 int ovl_copy_xattr(struct dentry *old, struct dentry *new);
 int ovl_set_attr(struct dentry *upper, struct kstat *stat);
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [RFC][PATCH 4/4] ovl: allow concurrent copy up data of different files
  2016-11-12 19:36 [RFC][PATCH 0/4] ovl: Do not hold s_vfs_rename_mutex during data copy up Amir Goldstein
                   ` (2 preceding siblings ...)
  2016-11-12 19:36 ` [RFC][PATCH 3/4] ovl: fold ovl_copy_up_truncate() into ovl_copy_up() Amir Goldstein
@ 2016-11-12 19:36 ` Amir Goldstein
  2016-11-15 15:56   ` Vivek Goyal
  3 siblings, 1 reply; 12+ messages in thread
From: Amir Goldstein @ 2016-11-12 19:36 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Konstantin Khlebnikov, Al Viro, linux-unionfs, linux-fsdevel

Currently, all copy operations are serialized by taking the
lock_rename(workdir, upperdir) locks for the entire copy up
operation, including the data copy up.

Copying up data may take a long time with large files and
holding s_vfs_rename_mutex during that time blocks all
rename and copy up operations on the entire underlying fs.

This change addresses this problem by changing the copy up
locking scheme for different types of files as follows.

Directories:
  <maybe> inode_lock(ovl_inode)
    lock_rename(workdir, upperdir)
      copy up dir to workdir
      move dir to upperdir

Special and zero size files:
  inode_lock(ovl_inode)
    lock_rename(workdir, upperdir)
      copy up file to workdir (no data)
      move file to upperdir

Regular files with data:
  inode_lock(ovl_inode)
    lock_rename(workdir, upperdir)
      copy up file to workdir (no data)
    unlock_rename(workdir, upperdir)
      copy data to file in workdir
    lock_rename(workdir, upperdir)
      move file to upperdir

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/copy_up.c | 70 +++++++++++++++++++++++++++++++++++++++++++++++---
 fs/overlayfs/inode.c   |  3 +++
 2 files changed, 69 insertions(+), 4 deletions(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index a16127b..1b9705e 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -230,6 +230,44 @@ int ovl_set_attr(struct dentry *upperdentry, struct kstat *stat)
 	return err;
 }
 
+/*
+ * Called with dentry->d_inode lock held only for the last (leaf) copy up
+ * from __ovl_copy_up(), so it is NOT held when called for ancestor
+ * directory from __ovl_copy_up()
+ *
+ * Called with lock_rename(workdir, upperdir) locks held.
+ *
+ * lock_rename() locks remain locked throughout the copy up
+ * of non regular files and zero sized regular files.
+ *
+ * lock_rename() locks are released during ovl_copy_up_data()
+ * of non zero sized regular files. During this time, the overlay
+ * dentry->d_inode lock is still held to avoid concurrent
+ * copy up of files with data.
+ *
+ * Maybe a better description of this locking scheme:
+ *
+ * Directories:
+ *   <maybe> inode_lock(ovl_inode)
+ *     lock_rename(workdir, upperdir)
+ *       copy up dir to workdir
+ *       move dir to upperdir
+ *
+ * Special and zero size files:
+ *   inode_lock(ovl_inode)
+ *     lock_rename(workdir, upperdir)
+ *       copy up file to workdir (no data)
+ *       move file to upperdir
+ *
+ * Regular files with data:
+ *  inode_lock(ovl_inode)
+ *    lock_rename(workdir, upperdir)
+ *      copy up file to workdir (no data)
+ *    unlock_rename(workdir, upperdir)
+ *      copy data to file in workdir
+ *    lock_rename(workdir, upperdir)
+ *      move file to upperdir
+ */
 static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
 			      struct dentry *dentry, struct path *lowerpath,
 			      struct kstat *stat, const char *link)
@@ -274,16 +312,39 @@ static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
 	if (err)
 		goto out2;
 
-	if (S_ISREG(stat->mode)) {
+	if (S_ISREG(stat->mode) && stat->size > 0) {
 		struct path upperpath;
 
 		ovl_path_upper(dentry, &upperpath);
 		BUG_ON(upperpath.dentry != NULL);
 		upperpath.dentry = newdentry;
 
+		/*
+		 * Release rename locks, because copy data may take a long time,
+		 * and holding s_vfs_rename_mutex will block all rename and
+		 * copy up operations on the entire underlying fs.
+		 * We still hold the overlay inode lock to avoid concurrent
+		 * copy up of this file.
+		 */
+		unlock_rename(workdir, upperdir);
+
 		err = ovl_copy_up_data(lowerpath, &upperpath, stat->size);
+
+		/*
+		 * Re-aquire rename locks, before moving copied file into place.
+		 */
+		if (unlikely(lock_rename(workdir, upperdir) != NULL)) {
+			pr_err("overlayfs: failed to re-aquire lock_rename\n");
+			err = -EIO;
+		}
+
 		if (err)
 			goto out_cleanup;
+
+		if (WARN_ON(ovl_dentry_is_upper(dentry))) {
+			/* Raced with another copy-up? This shouldn't happen */
+			goto out_cleanup;
+		}
 	}
 
 	err = ovl_copy_xattr(lowerpath->dentry, newdentry);
@@ -366,15 +427,14 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
 			return PTR_ERR(link);
 	}
 
-	err = -EIO;
-	if (lock_rename(workdir, upperdir) != NULL) {
+	if (unlikely(lock_rename(workdir, upperdir) != NULL)) {
 		pr_err("overlayfs: failed to lock workdir+upperdir\n");
+		err = -EIO;
 		goto out_unlock;
 	}
 
 	if (ovl_dentry_is_upper(dentry)) {
 		/* Raced with another copy-up?  Nothing to do, then... */
-		err = 0;
 		goto out_unlock;
 	}
 
@@ -433,11 +493,13 @@ static int __ovl_copy_up(struct dentry *dentry, int flags)
 	return err;
 }
 
+/* Called with dentry->d_inode lock held */
 int ovl_copy_up(struct dentry *dentry)
 {
 	return __ovl_copy_up(dentry, 0);
 }
 
+/* Called with dentry->d_inode lock held */
 int ovl_copy_up_open(struct dentry *dentry, int flags)
 {
 	return __ovl_copy_up(dentry, flags);
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 7abae00..532b0d5 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -251,11 +251,14 @@ int ovl_open_maybe_copy_up(struct dentry *dentry, unsigned int file_flags)
 
 	type = ovl_path_real(dentry, &realpath);
 	if (ovl_open_need_copy_up(file_flags, type, realpath.dentry)) {
+		/* Take the overlay inode lock to avoid concurrent copy-up */
+		inode_lock(dentry->d_inode);
 		err = ovl_want_write(dentry);
 		if (!err) {
 			err = ovl_copy_up_open(dentry, file_flags);
 			ovl_drop_write(dentry);
 		}
+		inode_unlock(dentry->d_inode);
 	}
 
 	return err;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [RFC][PATCH 4/4] ovl: allow concurrent copy up data of different files
  2016-11-12 19:36 ` [RFC][PATCH 4/4] ovl: allow concurrent copy up data of different files Amir Goldstein
@ 2016-11-15 15:56   ` Vivek Goyal
  2016-11-15 19:20     ` Amir Goldstein
  0 siblings, 1 reply; 12+ messages in thread
From: Vivek Goyal @ 2016-11-15 15:56 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Miklos Szeredi, Konstantin Khlebnikov, Al Viro, linux-unionfs,
	linux-fsdevel

On Sat, Nov 12, 2016 at 09:36:04PM +0200, Amir Goldstein wrote:
> Currently, all copy operations are serialized by taking the
> lock_rename(workdir, upperdir) locks for the entire copy up
> operation, including the data copy up.
> 
> Copying up data may take a long time with large files and
> holding s_vfs_rename_mutex during that time blocks all
> rename and copy up operations on the entire underlying fs.
> 
> This change addresses this problem by changing the copy up
> locking scheme for different types of files as follows.
> 
> Directories:
>   <maybe> inode_lock(ovl_inode)

Hi Amir,

What does <maybe> mean here? Is it optional?

>     lock_rename(workdir, upperdir)
>       copy up dir to workdir
>       move dir to upperdir
> 
> Special and zero size files:
>   inode_lock(ovl_inode)
>     lock_rename(workdir, upperdir)
>       copy up file to workdir (no data)
>       move file to upperdir
> 

If we are copying up directories and special and zero file under
lock_rename() and don't drop it during the whole operation, then
why do we need to take ovl_inode lock.

IOW, current code seems to be just doing lock_rename(workdir, upperdir)
for the copy up operation. But now this new code also requires
inode_lock(ovl_inode) and I am trying to understand why.

Thanks
Vivek

> Regular files with data:
>   inode_lock(ovl_inode)
>     lock_rename(workdir, upperdir)
>       copy up file to workdir (no data)
>     unlock_rename(workdir, upperdir)
>       copy data to file in workdir
>     lock_rename(workdir, upperdir)
>       move file to upperdir
> 
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>  fs/overlayfs/copy_up.c | 70 +++++++++++++++++++++++++++++++++++++++++++++++---
>  fs/overlayfs/inode.c   |  3 +++
>  2 files changed, 69 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
> index a16127b..1b9705e 100644
> --- a/fs/overlayfs/copy_up.c
> +++ b/fs/overlayfs/copy_up.c
> @@ -230,6 +230,44 @@ int ovl_set_attr(struct dentry *upperdentry, struct kstat *stat)
>  	return err;
>  }
>  
> +/*
> + * Called with dentry->d_inode lock held only for the last (leaf) copy up
> + * from __ovl_copy_up(), so it is NOT held when called for ancestor
> + * directory from __ovl_copy_up()
> + *
> + * Called with lock_rename(workdir, upperdir) locks held.
> + *
> + * lock_rename() locks remain locked throughout the copy up
> + * of non regular files and zero sized regular files.
> + *
> + * lock_rename() locks are released during ovl_copy_up_data()
> + * of non zero sized regular files. During this time, the overlay
> + * dentry->d_inode lock is still held to avoid concurrent
> + * copy up of files with data.
> + *
> + * Maybe a better description of this locking scheme:
> + *
> + * Directories:
> + *   <maybe> inode_lock(ovl_inode)
> + *     lock_rename(workdir, upperdir)
> + *       copy up dir to workdir
> + *       move dir to upperdir
> + *
> + * Special and zero size files:
> + *   inode_lock(ovl_inode)
> + *     lock_rename(workdir, upperdir)
> + *       copy up file to workdir (no data)
> + *       move file to upperdir
> + *
> + * Regular files with data:
> + *  inode_lock(ovl_inode)
> + *    lock_rename(workdir, upperdir)
> + *      copy up file to workdir (no data)
> + *    unlock_rename(workdir, upperdir)
> + *      copy data to file in workdir
> + *    lock_rename(workdir, upperdir)
> + *      move file to upperdir
> + */
>  static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
>  			      struct dentry *dentry, struct path *lowerpath,
>  			      struct kstat *stat, const char *link)
> @@ -274,16 +312,39 @@ static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
>  	if (err)
>  		goto out2;
>  
> -	if (S_ISREG(stat->mode)) {
> +	if (S_ISREG(stat->mode) && stat->size > 0) {
>  		struct path upperpath;
>  
>  		ovl_path_upper(dentry, &upperpath);
>  		BUG_ON(upperpath.dentry != NULL);
>  		upperpath.dentry = newdentry;
>  
> +		/*
> +		 * Release rename locks, because copy data may take a long time,
> +		 * and holding s_vfs_rename_mutex will block all rename and
> +		 * copy up operations on the entire underlying fs.
> +		 * We still hold the overlay inode lock to avoid concurrent
> +		 * copy up of this file.
> +		 */
> +		unlock_rename(workdir, upperdir);
> +
>  		err = ovl_copy_up_data(lowerpath, &upperpath, stat->size);
> +
> +		/*
> +		 * Re-aquire rename locks, before moving copied file into place.
> +		 */
> +		if (unlikely(lock_rename(workdir, upperdir) != NULL)) {
> +			pr_err("overlayfs: failed to re-aquire lock_rename\n");
> +			err = -EIO;
> +		}
> +
>  		if (err)
>  			goto out_cleanup;
> +
> +		if (WARN_ON(ovl_dentry_is_upper(dentry))) {
> +			/* Raced with another copy-up? This shouldn't happen */
> +			goto out_cleanup;
> +		}
>  	}
>  
>  	err = ovl_copy_xattr(lowerpath->dentry, newdentry);
> @@ -366,15 +427,14 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
>  			return PTR_ERR(link);
>  	}
>  
> -	err = -EIO;
> -	if (lock_rename(workdir, upperdir) != NULL) {
> +	if (unlikely(lock_rename(workdir, upperdir) != NULL)) {
>  		pr_err("overlayfs: failed to lock workdir+upperdir\n");
> +		err = -EIO;
>  		goto out_unlock;
>  	}
>  
>  	if (ovl_dentry_is_upper(dentry)) {
>  		/* Raced with another copy-up?  Nothing to do, then... */
> -		err = 0;
>  		goto out_unlock;
>  	}
>  
> @@ -433,11 +493,13 @@ static int __ovl_copy_up(struct dentry *dentry, int flags)
>  	return err;
>  }
>  
> +/* Called with dentry->d_inode lock held */
>  int ovl_copy_up(struct dentry *dentry)
>  {
>  	return __ovl_copy_up(dentry, 0);
>  }
>  
> +/* Called with dentry->d_inode lock held */
>  int ovl_copy_up_open(struct dentry *dentry, int flags)
>  {
>  	return __ovl_copy_up(dentry, flags);
> diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
> index 7abae00..532b0d5 100644
> --- a/fs/overlayfs/inode.c
> +++ b/fs/overlayfs/inode.c
> @@ -251,11 +251,14 @@ int ovl_open_maybe_copy_up(struct dentry *dentry, unsigned int file_flags)
>  
>  	type = ovl_path_real(dentry, &realpath);
>  	if (ovl_open_need_copy_up(file_flags, type, realpath.dentry)) {
> +		/* Take the overlay inode lock to avoid concurrent copy-up */
> +		inode_lock(dentry->d_inode);
>  		err = ovl_want_write(dentry);
>  		if (!err) {
>  			err = ovl_copy_up_open(dentry, file_flags);
>  			ovl_drop_write(dentry);
>  		}
> +		inode_unlock(dentry->d_inode);
>  	}
>  
>  	return err;
> -- 
> 2.7.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-unionfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC][PATCH 4/4] ovl: allow concurrent copy up data of different files
  2016-11-15 15:56   ` Vivek Goyal
@ 2016-11-15 19:20     ` Amir Goldstein
  2016-11-15 21:57       ` Vivek Goyal
  0 siblings, 1 reply; 12+ messages in thread
From: Amir Goldstein @ 2016-11-15 19:20 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: Miklos Szeredi, Konstantin Khlebnikov, Al Viro, linux-unionfs,
	linux-fsdevel

On Tue, Nov 15, 2016 at 5:56 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Sat, Nov 12, 2016 at 09:36:04PM +0200, Amir Goldstein wrote:
>> Currently, all copy operations are serialized by taking the
>> lock_rename(workdir, upperdir) locks for the entire copy up
>> operation, including the data copy up.
>>
>> Copying up data may take a long time with large files and
>> holding s_vfs_rename_mutex during that time blocks all
>> rename and copy up operations on the entire underlying fs.
>>
>> This change addresses this problem by changing the copy up
>> locking scheme for different types of files as follows.
>>
>> Directories:
>>   <maybe> inode_lock(ovl_inode)
>
> Hi Amir,
>
> What does <maybe> mean here? Is it optional?
>


It means that some execution paths inode_lock(ovl_inode) is taken (e.g. rmdir()
of overlay dir) and in some execution paths it is not taken (e.g. when
copying up
the victim inodes' parents).

What I have not explain properly is that my change does not add any new
inode_lock(ovl_inode) calls for directories and special files - it is taken in
VFS before getting to overlay code.
I listed the inode_lock(ovl_inode) in the locking scheme of directories and
special files to show that is safe (locking order wise) to take the same lock
inside overlay code, for regular files on open.

>>     lock_rename(workdir, upperdir)
>>       copy up dir to workdir
>>       move dir to upperdir
>>
>> Special and zero size files:
>>   inode_lock(ovl_inode)
>>     lock_rename(workdir, upperdir)
>>       copy up file to workdir (no data)
>>       move file to upperdir
>>
>
> If we are copying up directories and special and zero file under
> lock_rename() and don't drop it during the whole operation, then
> why do we need to take ovl_inode lock.
>

So we really don't take it, but for all the call sites of ovl_copy_up()
except for the ovl_d_real() for regular files open, the ovl_inode lock is
already taken in VFS.

> IOW, current code seems to be just doing lock_rename(workdir, upperdir)
> for the copy up operation. But now this new code also requires
> inode_lock(ovl_inode) and I am trying to understand why.
>

So inode_lock(ovl_inode) is now taken ALSO in the only path it was not
already taken until now. And the reason that we take it is so we can release
lock_rename() for the duration of copy up data.

Hope I was able to clarify myself.

Amir.

> Thanks
> Vivek
>
>> Regular files with data:
>>   inode_lock(ovl_inode)
>>     lock_rename(workdir, upperdir)
>>       copy up file to workdir (no data)
>>     unlock_rename(workdir, upperdir)
>>       copy data to file in workdir
>>     lock_rename(workdir, upperdir)
>>       move file to upperdir
>>
>> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
>> ---
>>  fs/overlayfs/copy_up.c | 70 +++++++++++++++++++++++++++++++++++++++++++++++---
>>  fs/overlayfs/inode.c   |  3 +++
>>  2 files changed, 69 insertions(+), 4 deletions(-)
>>
>> diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
>> index a16127b..1b9705e 100644
>> --- a/fs/overlayfs/copy_up.c
>> +++ b/fs/overlayfs/copy_up.c
>> @@ -230,6 +230,44 @@ int ovl_set_attr(struct dentry *upperdentry, struct kstat *stat)
>>       return err;
>>  }
>>
>> +/*
>> + * Called with dentry->d_inode lock held only for the last (leaf) copy up
>> + * from __ovl_copy_up(), so it is NOT held when called for ancestor
>> + * directory from __ovl_copy_up()
>> + *
>> + * Called with lock_rename(workdir, upperdir) locks held.
>> + *
>> + * lock_rename() locks remain locked throughout the copy up
>> + * of non regular files and zero sized regular files.
>> + *
>> + * lock_rename() locks are released during ovl_copy_up_data()
>> + * of non zero sized regular files. During this time, the overlay
>> + * dentry->d_inode lock is still held to avoid concurrent
>> + * copy up of files with data.
>> + *
>> + * Maybe a better description of this locking scheme:
>> + *
>> + * Directories:
>> + *   <maybe> inode_lock(ovl_inode)
>> + *     lock_rename(workdir, upperdir)
>> + *       copy up dir to workdir
>> + *       move dir to upperdir
>> + *
>> + * Special and zero size files:
>> + *   inode_lock(ovl_inode)
>> + *     lock_rename(workdir, upperdir)
>> + *       copy up file to workdir (no data)
>> + *       move file to upperdir
>> + *
>> + * Regular files with data:
>> + *  inode_lock(ovl_inode)
>> + *    lock_rename(workdir, upperdir)
>> + *      copy up file to workdir (no data)
>> + *    unlock_rename(workdir, upperdir)
>> + *      copy data to file in workdir
>> + *    lock_rename(workdir, upperdir)
>> + *      move file to upperdir
>> + */
>>  static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
>>                             struct dentry *dentry, struct path *lowerpath,
>>                             struct kstat *stat, const char *link)
>> @@ -274,16 +312,39 @@ static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
>>       if (err)
>>               goto out2;
>>
>> -     if (S_ISREG(stat->mode)) {
>> +     if (S_ISREG(stat->mode) && stat->size > 0) {
>>               struct path upperpath;
>>
>>               ovl_path_upper(dentry, &upperpath);
>>               BUG_ON(upperpath.dentry != NULL);
>>               upperpath.dentry = newdentry;
>>
>> +             /*
>> +              * Release rename locks, because copy data may take a long time,
>> +              * and holding s_vfs_rename_mutex will block all rename and
>> +              * copy up operations on the entire underlying fs.
>> +              * We still hold the overlay inode lock to avoid concurrent
>> +              * copy up of this file.
>> +              */
>> +             unlock_rename(workdir, upperdir);
>> +
>>               err = ovl_copy_up_data(lowerpath, &upperpath, stat->size);
>> +
>> +             /*
>> +              * Re-aquire rename locks, before moving copied file into place.
>> +              */
>> +             if (unlikely(lock_rename(workdir, upperdir) != NULL)) {
>> +                     pr_err("overlayfs: failed to re-aquire lock_rename\n");
>> +                     err = -EIO;
>> +             }
>> +
>>               if (err)
>>                       goto out_cleanup;
>> +
>> +             if (WARN_ON(ovl_dentry_is_upper(dentry))) {
>> +                     /* Raced with another copy-up? This shouldn't happen */
>> +                     goto out_cleanup;
>> +             }
>>       }
>>
>>       err = ovl_copy_xattr(lowerpath->dentry, newdentry);
>> @@ -366,15 +427,14 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
>>                       return PTR_ERR(link);
>>       }
>>
>> -     err = -EIO;
>> -     if (lock_rename(workdir, upperdir) != NULL) {
>> +     if (unlikely(lock_rename(workdir, upperdir) != NULL)) {
>>               pr_err("overlayfs: failed to lock workdir+upperdir\n");
>> +             err = -EIO;
>>               goto out_unlock;
>>       }
>>
>>       if (ovl_dentry_is_upper(dentry)) {
>>               /* Raced with another copy-up?  Nothing to do, then... */
>> -             err = 0;
>>               goto out_unlock;
>>       }
>>
>> @@ -433,11 +493,13 @@ static int __ovl_copy_up(struct dentry *dentry, int flags)
>>       return err;
>>  }
>>
>> +/* Called with dentry->d_inode lock held */
>>  int ovl_copy_up(struct dentry *dentry)
>>  {
>>       return __ovl_copy_up(dentry, 0);
>>  }
>>
>> +/* Called with dentry->d_inode lock held */
>>  int ovl_copy_up_open(struct dentry *dentry, int flags)
>>  {
>>       return __ovl_copy_up(dentry, flags);
>> diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
>> index 7abae00..532b0d5 100644
>> --- a/fs/overlayfs/inode.c
>> +++ b/fs/overlayfs/inode.c
>> @@ -251,11 +251,14 @@ int ovl_open_maybe_copy_up(struct dentry *dentry, unsigned int file_flags)
>>
>>       type = ovl_path_real(dentry, &realpath);
>>       if (ovl_open_need_copy_up(file_flags, type, realpath.dentry)) {
>> +             /* Take the overlay inode lock to avoid concurrent copy-up */
>> +             inode_lock(dentry->d_inode);
>>               err = ovl_want_write(dentry);
>>               if (!err) {
>>                       err = ovl_copy_up_open(dentry, file_flags);
>>                       ovl_drop_write(dentry);
>>               }
>> +             inode_unlock(dentry->d_inode);
>>       }
>>
>>       return err;
>> --
>> 2.7.4
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-unionfs" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC][PATCH 4/4] ovl: allow concurrent copy up data of different files
  2016-11-15 19:20     ` Amir Goldstein
@ 2016-11-15 21:57       ` Vivek Goyal
  2016-11-16  5:27         ` Amir Goldstein
  0 siblings, 1 reply; 12+ messages in thread
From: Vivek Goyal @ 2016-11-15 21:57 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Miklos Szeredi, Konstantin Khlebnikov, Al Viro, linux-unionfs,
	linux-fsdevel

On Tue, Nov 15, 2016 at 09:20:32PM +0200, Amir Goldstein wrote:
> On Tue, Nov 15, 2016 at 5:56 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> > On Sat, Nov 12, 2016 at 09:36:04PM +0200, Amir Goldstein wrote:
> >> Currently, all copy operations are serialized by taking the
> >> lock_rename(workdir, upperdir) locks for the entire copy up
> >> operation, including the data copy up.
> >>
> >> Copying up data may take a long time with large files and
> >> holding s_vfs_rename_mutex during that time blocks all
> >> rename and copy up operations on the entire underlying fs.
> >>
> >> This change addresses this problem by changing the copy up
> >> locking scheme for different types of files as follows.
> >>
> >> Directories:
> >>   <maybe> inode_lock(ovl_inode)
> >
> > Hi Amir,
> >
> > What does <maybe> mean here? Is it optional?
> >
> 
> 
> It means that some execution paths inode_lock(ovl_inode) is taken (e.g. rmdir()
> of overlay dir) and in some execution paths it is not taken (e.g. when
> copying up
> the victim inodes' parents).
> 
> What I have not explain properly is that my change does not add any new
> inode_lock(ovl_inode) calls for directories and special files - it is taken in
> VFS before getting to overlay code.
> I listed the inode_lock(ovl_inode) in the locking scheme of directories and
> special files to show that is safe (locking order wise) to take the same lock
> inside overlay code, for regular files on open.
> 

Ok, got it. Only in case of regular files (non-zero size), we have added
the inode_lock(ovl_inode) and that's required because we will be dropping
lock_rename() locks temporarily for data copy and want to make sure
another thread does not trigger a parallel copy up of same file.

Thanks
Vivek

> >>     lock_rename(workdir, upperdir)
> >>       copy up dir to workdir
> >>       move dir to upperdir
> >>
> >> Special and zero size files:
> >>   inode_lock(ovl_inode)
> >>     lock_rename(workdir, upperdir)
> >>       copy up file to workdir (no data)
> >>       move file to upperdir
> >>
> >
> > If we are copying up directories and special and zero file under
> > lock_rename() and don't drop it during the whole operation, then
> > why do we need to take ovl_inode lock.
> >
> 
> So we really don't take it, but for all the call sites of ovl_copy_up()
> except for the ovl_d_real() for regular files open, the ovl_inode lock is
> already taken in VFS.
> 
> > IOW, current code seems to be just doing lock_rename(workdir, upperdir)
> > for the copy up operation. But now this new code also requires
> > inode_lock(ovl_inode) and I am trying to understand why.
> >
> 
> So inode_lock(ovl_inode) is now taken ALSO in the only path it was not
> already taken until now. And the reason that we take it is so we can release
> lock_rename() for the duration of copy up data.
> 
> Hope I was able to clarify myself.
> 
> Amir.
> 
> > Thanks
> > Vivek
> >
> >> Regular files with data:
> >>   inode_lock(ovl_inode)
> >>     lock_rename(workdir, upperdir)
> >>       copy up file to workdir (no data)
> >>     unlock_rename(workdir, upperdir)
> >>       copy data to file in workdir
> >>     lock_rename(workdir, upperdir)
> >>       move file to upperdir
> >>
> >> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> >> ---
> >>  fs/overlayfs/copy_up.c | 70 +++++++++++++++++++++++++++++++++++++++++++++++---
> >>  fs/overlayfs/inode.c   |  3 +++
> >>  2 files changed, 69 insertions(+), 4 deletions(-)
> >>
> >> diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
> >> index a16127b..1b9705e 100644
> >> --- a/fs/overlayfs/copy_up.c
> >> +++ b/fs/overlayfs/copy_up.c
> >> @@ -230,6 +230,44 @@ int ovl_set_attr(struct dentry *upperdentry, struct kstat *stat)
> >>       return err;
> >>  }
> >>
> >> +/*
> >> + * Called with dentry->d_inode lock held only for the last (leaf) copy up
> >> + * from __ovl_copy_up(), so it is NOT held when called for ancestor
> >> + * directory from __ovl_copy_up()
> >> + *
> >> + * Called with lock_rename(workdir, upperdir) locks held.
> >> + *
> >> + * lock_rename() locks remain locked throughout the copy up
> >> + * of non regular files and zero sized regular files.
> >> + *
> >> + * lock_rename() locks are released during ovl_copy_up_data()
> >> + * of non zero sized regular files. During this time, the overlay
> >> + * dentry->d_inode lock is still held to avoid concurrent
> >> + * copy up of files with data.
> >> + *
> >> + * Maybe a better description of this locking scheme:
> >> + *
> >> + * Directories:
> >> + *   <maybe> inode_lock(ovl_inode)
> >> + *     lock_rename(workdir, upperdir)
> >> + *       copy up dir to workdir
> >> + *       move dir to upperdir
> >> + *
> >> + * Special and zero size files:
> >> + *   inode_lock(ovl_inode)
> >> + *     lock_rename(workdir, upperdir)
> >> + *       copy up file to workdir (no data)
> >> + *       move file to upperdir
> >> + *
> >> + * Regular files with data:
> >> + *  inode_lock(ovl_inode)
> >> + *    lock_rename(workdir, upperdir)
> >> + *      copy up file to workdir (no data)
> >> + *    unlock_rename(workdir, upperdir)
> >> + *      copy data to file in workdir
> >> + *    lock_rename(workdir, upperdir)
> >> + *      move file to upperdir
> >> + */
> >>  static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
> >>                             struct dentry *dentry, struct path *lowerpath,
> >>                             struct kstat *stat, const char *link)
> >> @@ -274,16 +312,39 @@ static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
> >>       if (err)
> >>               goto out2;
> >>
> >> -     if (S_ISREG(stat->mode)) {
> >> +     if (S_ISREG(stat->mode) && stat->size > 0) {
> >>               struct path upperpath;
> >>
> >>               ovl_path_upper(dentry, &upperpath);
> >>               BUG_ON(upperpath.dentry != NULL);
> >>               upperpath.dentry = newdentry;
> >>
> >> +             /*
> >> +              * Release rename locks, because copy data may take a long time,
> >> +              * and holding s_vfs_rename_mutex will block all rename and
> >> +              * copy up operations on the entire underlying fs.
> >> +              * We still hold the overlay inode lock to avoid concurrent
> >> +              * copy up of this file.
> >> +              */
> >> +             unlock_rename(workdir, upperdir);
> >> +
> >>               err = ovl_copy_up_data(lowerpath, &upperpath, stat->size);
> >> +
> >> +             /*
> >> +              * Re-aquire rename locks, before moving copied file into place.
> >> +              */
> >> +             if (unlikely(lock_rename(workdir, upperdir) != NULL)) {
> >> +                     pr_err("overlayfs: failed to re-aquire lock_rename\n");
> >> +                     err = -EIO;
> >> +             }
> >> +
> >>               if (err)
> >>                       goto out_cleanup;
> >> +
> >> +             if (WARN_ON(ovl_dentry_is_upper(dentry))) {
> >> +                     /* Raced with another copy-up? This shouldn't happen */
> >> +                     goto out_cleanup;
> >> +             }
> >>       }
> >>
> >>       err = ovl_copy_xattr(lowerpath->dentry, newdentry);
> >> @@ -366,15 +427,14 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
> >>                       return PTR_ERR(link);
> >>       }
> >>
> >> -     err = -EIO;
> >> -     if (lock_rename(workdir, upperdir) != NULL) {
> >> +     if (unlikely(lock_rename(workdir, upperdir) != NULL)) {
> >>               pr_err("overlayfs: failed to lock workdir+upperdir\n");
> >> +             err = -EIO;
> >>               goto out_unlock;
> >>       }
> >>
> >>       if (ovl_dentry_is_upper(dentry)) {
> >>               /* Raced with another copy-up?  Nothing to do, then... */
> >> -             err = 0;
> >>               goto out_unlock;
> >>       }
> >>
> >> @@ -433,11 +493,13 @@ static int __ovl_copy_up(struct dentry *dentry, int flags)
> >>       return err;
> >>  }
> >>
> >> +/* Called with dentry->d_inode lock held */
> >>  int ovl_copy_up(struct dentry *dentry)
> >>  {
> >>       return __ovl_copy_up(dentry, 0);
> >>  }
> >>
> >> +/* Called with dentry->d_inode lock held */
> >>  int ovl_copy_up_open(struct dentry *dentry, int flags)
> >>  {
> >>       return __ovl_copy_up(dentry, flags);
> >> diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
> >> index 7abae00..532b0d5 100644
> >> --- a/fs/overlayfs/inode.c
> >> +++ b/fs/overlayfs/inode.c
> >> @@ -251,11 +251,14 @@ int ovl_open_maybe_copy_up(struct dentry *dentry, unsigned int file_flags)
> >>
> >>       type = ovl_path_real(dentry, &realpath);
> >>       if (ovl_open_need_copy_up(file_flags, type, realpath.dentry)) {
> >> +             /* Take the overlay inode lock to avoid concurrent copy-up */
> >> +             inode_lock(dentry->d_inode);
> >>               err = ovl_want_write(dentry);
> >>               if (!err) {
> >>                       err = ovl_copy_up_open(dentry, file_flags);
> >>                       ovl_drop_write(dentry);
> >>               }
> >> +             inode_unlock(dentry->d_inode);
> >>       }
> >>
> >>       return err;
> >> --
> >> 2.7.4
> >>
> >> --
> >> To unsubscribe from this list: send the line "unsubscribe linux-unionfs" in
> >> the body of a message to majordomo@vger.kernel.org
> >> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC][PATCH 4/4] ovl: allow concurrent copy up data of different files
  2016-11-15 21:57       ` Vivek Goyal
@ 2016-11-16  5:27         ` Amir Goldstein
  2016-11-20  8:27           ` Amir Goldstein
  0 siblings, 1 reply; 12+ messages in thread
From: Amir Goldstein @ 2016-11-16  5:27 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: Miklos Szeredi, Konstantin Khlebnikov, Al Viro, linux-unionfs,
	linux-fsdevel

On Tue, Nov 15, 2016 at 11:57 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
> On Tue, Nov 15, 2016 at 09:20:32PM +0200, Amir Goldstein wrote:
>> On Tue, Nov 15, 2016 at 5:56 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
>> > On Sat, Nov 12, 2016 at 09:36:04PM +0200, Amir Goldstein wrote:
>> >> Currently, all copy operations are serialized by taking the
>> >> lock_rename(workdir, upperdir) locks for the entire copy up
>> >> operation, including the data copy up.
>> >>
>> >> Copying up data may take a long time with large files and
>> >> holding s_vfs_rename_mutex during that time blocks all
>> >> rename and copy up operations on the entire underlying fs.
>> >>
>> >> This change addresses this problem by changing the copy up
>> >> locking scheme for different types of files as follows.
>> >>
>> >> Directories:
>> >>   <maybe> inode_lock(ovl_inode)
>> >
>> > Hi Amir,
>> >
>> > What does <maybe> mean here? Is it optional?
>> >
>>
>>
>> It means that some execution paths inode_lock(ovl_inode) is taken (e.g. rmdir()
>> of overlay dir) and in some execution paths it is not taken (e.g. when
>> copying up
>> the victim inodes' parents).
>>
>> What I have not explain properly is that my change does not add any new
>> inode_lock(ovl_inode) calls for directories and special files - it is taken in
>> VFS before getting to overlay code.
>> I listed the inode_lock(ovl_inode) in the locking scheme of directories and
>> special files to show that is safe (locking order wise) to take the same lock
>> inside overlay code, for regular files on open.
>>
>
> Ok, got it. Only in case of regular files (non-zero size), we have added
> the inode_lock(ovl_inode) and that's required because we will be dropping
> lock_rename() locks temporarily for data copy and want to make sure
> another thread does not trigger a parallel copy up of same file.
>

Well, to put it more accurately:

Only in case of regular files open for write and truncate(), we have added
the inode_lock(ovl_inode).
The inode_lock(ovl_inode) is required for open for write (no O_TRUNC)
of regular files (non-zero size), because we will be dropping
lock_rename() locks temporarily for data copy and want to make sure
another thread does not trigger a parallel copy up of same file.

So we are taking inode_lock(ovl_inode) for truncate() and open of
zero sized files for no other reason then keeping the code simpler
and being able to declare unconditionally above ovl_copy_up{,_open}():
/* Called with dentry->d_inode lock held */

>> >>     lock_rename(workdir, upperdir)
>> >>       copy up dir to workdir
>> >>       move dir to upperdir
>> >>
>> >> Special and zero size files:
>> >>   inode_lock(ovl_inode)
>> >>     lock_rename(workdir, upperdir)
>> >>       copy up file to workdir (no data)
>> >>       move file to upperdir
>> >>
>> >
>> > If we are copying up directories and special and zero file under
>> > lock_rename() and don't drop it during the whole operation, then
>> > why do we need to take ovl_inode lock.
>> >
>>
>> So we really don't take it, but for all the call sites of ovl_copy_up()
>> except for the ovl_d_real() for regular files open, the ovl_inode lock is
>> already taken in VFS.
>>
>> > IOW, current code seems to be just doing lock_rename(workdir, upperdir)
>> > for the copy up operation. But now this new code also requires
>> > inode_lock(ovl_inode) and I am trying to understand why.
>> >
>>
>> So inode_lock(ovl_inode) is now taken ALSO in the only path it was not
>> already taken until now. And the reason that we take it is so we can release
>> lock_rename() for the duration of copy up data.
>>
>> Hope I was able to clarify myself.
>>
>> Amir.
>>
>> > Thanks
>> > Vivek
>> >
>> >> Regular files with data:
>> >>   inode_lock(ovl_inode)
>> >>     lock_rename(workdir, upperdir)
>> >>       copy up file to workdir (no data)
>> >>     unlock_rename(workdir, upperdir)
>> >>       copy data to file in workdir
>> >>     lock_rename(workdir, upperdir)
>> >>       move file to upperdir
>> >>
>> >> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
>> >> ---
>> >>  fs/overlayfs/copy_up.c | 70 +++++++++++++++++++++++++++++++++++++++++++++++---
>> >>  fs/overlayfs/inode.c   |  3 +++
>> >>  2 files changed, 69 insertions(+), 4 deletions(-)
>> >>
>> >> diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
>> >> index a16127b..1b9705e 100644
>> >> --- a/fs/overlayfs/copy_up.c
>> >> +++ b/fs/overlayfs/copy_up.c
>> >> @@ -230,6 +230,44 @@ int ovl_set_attr(struct dentry *upperdentry, struct kstat *stat)
>> >>       return err;
>> >>  }
>> >>
>> >> +/*
>> >> + * Called with dentry->d_inode lock held only for the last (leaf) copy up
>> >> + * from __ovl_copy_up(), so it is NOT held when called for ancestor
>> >> + * directory from __ovl_copy_up()
>> >> + *
>> >> + * Called with lock_rename(workdir, upperdir) locks held.
>> >> + *
>> >> + * lock_rename() locks remain locked throughout the copy up
>> >> + * of non regular files and zero sized regular files.
>> >> + *
>> >> + * lock_rename() locks are released during ovl_copy_up_data()
>> >> + * of non zero sized regular files. During this time, the overlay
>> >> + * dentry->d_inode lock is still held to avoid concurrent
>> >> + * copy up of files with data.
>> >> + *
>> >> + * Maybe a better description of this locking scheme:
>> >> + *
>> >> + * Directories:
>> >> + *   <maybe> inode_lock(ovl_inode)
>> >> + *     lock_rename(workdir, upperdir)
>> >> + *       copy up dir to workdir
>> >> + *       move dir to upperdir
>> >> + *
>> >> + * Special and zero size files:
>> >> + *   inode_lock(ovl_inode)
>> >> + *     lock_rename(workdir, upperdir)
>> >> + *       copy up file to workdir (no data)
>> >> + *       move file to upperdir
>> >> + *
>> >> + * Regular files with data:
>> >> + *  inode_lock(ovl_inode)
>> >> + *    lock_rename(workdir, upperdir)
>> >> + *      copy up file to workdir (no data)
>> >> + *    unlock_rename(workdir, upperdir)
>> >> + *      copy data to file in workdir
>> >> + *    lock_rename(workdir, upperdir)
>> >> + *      move file to upperdir
>> >> + */
>> >>  static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
>> >>                             struct dentry *dentry, struct path *lowerpath,
>> >>                             struct kstat *stat, const char *link)
>> >> @@ -274,16 +312,39 @@ static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
>> >>       if (err)
>> >>               goto out2;
>> >>
>> >> -     if (S_ISREG(stat->mode)) {
>> >> +     if (S_ISREG(stat->mode) && stat->size > 0) {
>> >>               struct path upperpath;
>> >>
>> >>               ovl_path_upper(dentry, &upperpath);
>> >>               BUG_ON(upperpath.dentry != NULL);
>> >>               upperpath.dentry = newdentry;
>> >>
>> >> +             /*
>> >> +              * Release rename locks, because copy data may take a long time,
>> >> +              * and holding s_vfs_rename_mutex will block all rename and
>> >> +              * copy up operations on the entire underlying fs.
>> >> +              * We still hold the overlay inode lock to avoid concurrent
>> >> +              * copy up of this file.
>> >> +              */
>> >> +             unlock_rename(workdir, upperdir);
>> >> +
>> >>               err = ovl_copy_up_data(lowerpath, &upperpath, stat->size);
>> >> +
>> >> +             /*
>> >> +              * Re-aquire rename locks, before moving copied file into place.
>> >> +              */
>> >> +             if (unlikely(lock_rename(workdir, upperdir) != NULL)) {
>> >> +                     pr_err("overlayfs: failed to re-aquire lock_rename\n");
>> >> +                     err = -EIO;
>> >> +             }
>> >> +
>> >>               if (err)
>> >>                       goto out_cleanup;
>> >> +
>> >> +             if (WARN_ON(ovl_dentry_is_upper(dentry))) {
>> >> +                     /* Raced with another copy-up? This shouldn't happen */
>> >> +                     goto out_cleanup;
>> >> +             }
>> >>       }
>> >>
>> >>       err = ovl_copy_xattr(lowerpath->dentry, newdentry);
>> >> @@ -366,15 +427,14 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
>> >>                       return PTR_ERR(link);
>> >>       }
>> >>
>> >> -     err = -EIO;
>> >> -     if (lock_rename(workdir, upperdir) != NULL) {
>> >> +     if (unlikely(lock_rename(workdir, upperdir) != NULL)) {
>> >>               pr_err("overlayfs: failed to lock workdir+upperdir\n");
>> >> +             err = -EIO;
>> >>               goto out_unlock;
>> >>       }
>> >>
>> >>       if (ovl_dentry_is_upper(dentry)) {
>> >>               /* Raced with another copy-up?  Nothing to do, then... */
>> >> -             err = 0;
>> >>               goto out_unlock;
>> >>       }
>> >>
>> >> @@ -433,11 +493,13 @@ static int __ovl_copy_up(struct dentry *dentry, int flags)
>> >>       return err;
>> >>  }
>> >>
>> >> +/* Called with dentry->d_inode lock held */
>> >>  int ovl_copy_up(struct dentry *dentry)
>> >>  {
>> >>       return __ovl_copy_up(dentry, 0);
>> >>  }
>> >>
>> >> +/* Called with dentry->d_inode lock held */
>> >>  int ovl_copy_up_open(struct dentry *dentry, int flags)
>> >>  {
>> >>       return __ovl_copy_up(dentry, flags);
>> >> diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
>> >> index 7abae00..532b0d5 100644
>> >> --- a/fs/overlayfs/inode.c
>> >> +++ b/fs/overlayfs/inode.c
>> >> @@ -251,11 +251,14 @@ int ovl_open_maybe_copy_up(struct dentry *dentry, unsigned int file_flags)
>> >>
>> >>       type = ovl_path_real(dentry, &realpath);
>> >>       if (ovl_open_need_copy_up(file_flags, type, realpath.dentry)) {
>> >> +             /* Take the overlay inode lock to avoid concurrent copy-up */
>> >> +             inode_lock(dentry->d_inode);
>> >>               err = ovl_want_write(dentry);
>> >>               if (!err) {
>> >>                       err = ovl_copy_up_open(dentry, file_flags);
>> >>                       ovl_drop_write(dentry);
>> >>               }
>> >> +             inode_unlock(dentry->d_inode);
>> >>       }
>> >>
>> >>       return err;
>> >> --
>> >> 2.7.4
>> >>
>> >> --
>> >> To unsubscribe from this list: send the line "unsubscribe linux-unionfs" in
>> >> the body of a message to majordomo@vger.kernel.org
>> >> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC][PATCH 4/4] ovl: allow concurrent copy up data of different files
  2016-11-16  5:27         ` Amir Goldstein
@ 2016-11-20  8:27           ` Amir Goldstein
  0 siblings, 0 replies; 12+ messages in thread
From: Amir Goldstein @ 2016-11-20  8:27 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Konstantin Khlebnikov, Al Viro, linux-unionfs, linux-fsdevel,
	Vivek Goyal

On Wed, Nov 16, 2016 at 7:27 AM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Tue, Nov 15, 2016 at 11:57 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
>> On Tue, Nov 15, 2016 at 09:20:32PM +0200, Amir Goldstein wrote:
>>> On Tue, Nov 15, 2016 at 5:56 PM, Vivek Goyal <vgoyal@redhat.com> wrote:
>>> > On Sat, Nov 12, 2016 at 09:36:04PM +0200, Amir Goldstein wrote:
>>> >> Currently, all copy operations are serialized by taking the
>>> >> lock_rename(workdir, upperdir) locks for the entire copy up
>>> >> operation, including the data copy up.
>>> >>
>>> >> Copying up data may take a long time with large files and
>>> >> holding s_vfs_rename_mutex during that time blocks all
>>> >> rename and copy up operations on the entire underlying fs.
>>> >>
>>> >> This change addresses this problem by changing the copy up
>>> >> locking scheme for different types of files as follows.
>>> >>
>>> >> Directories:
>>> >>   <maybe> inode_lock(ovl_inode)
>>> >
>>> > Hi Amir,
>>> >
>>> > What does <maybe> mean here? Is it optional?
>>> >
>>>
>>>
>>> It means that some execution paths inode_lock(ovl_inode) is taken (e.g. rmdir()
>>> of overlay dir) and in some execution paths it is not taken (e.g. when
>>> copying up
>>> the victim inodes' parents).
>>>
>>> What I have not explain properly is that my change does not add any new
>>> inode_lock(ovl_inode) calls for directories and special files - it is taken in
>>> VFS before getting to overlay code.
>>> I listed the inode_lock(ovl_inode) in the locking scheme of directories and
>>> special files to show that is safe (locking order wise) to take the same lock
>>> inside overlay code, for regular files on open.
>>>
>>
>> Ok, got it. Only in case of regular files (non-zero size), we have added
>> the inode_lock(ovl_inode) and that's required because we will be dropping
>> lock_rename() locks temporarily for data copy and want to make sure
>> another thread does not trigger a parallel copy up of same file.
>>
>
> Well, to put it more accurately:
>
> Only in case of regular files open for write and truncate(), we have added
> the inode_lock(ovl_inode).
> The inode_lock(ovl_inode) is required for open for write (no O_TRUNC)
> of regular files (non-zero size), because we will be dropping
> lock_rename() locks temporarily for data copy and want to make sure
> another thread does not trigger a parallel copy up of same file.
>
> So we are taking inode_lock(ovl_inode) for truncate() and open of
> zero sized files for no other reason then keeping the code simpler
> and being able to declare unconditionally above ovl_copy_up{,_open}():
> /* Called with dentry->d_inode lock held */
>
>>> >>     lock_rename(workdir, upperdir)
>>> >>       copy up dir to workdir
>>> >>       move dir to upperdir
>>> >>
>>> >> Special and zero size files:
>>> >>   inode_lock(ovl_inode)
>>> >>     lock_rename(workdir, upperdir)
>>> >>       copy up file to workdir (no data)
>>> >>       move file to upperdir
>>> >>
>>> >
>>> > If we are copying up directories and special and zero file under
>>> > lock_rename() and don't drop it during the whole operation, then
>>> > why do we need to take ovl_inode lock.
>>> >
>>>
>>> So we really don't take it, but for all the call sites of ovl_copy_up()
>>> except for the ovl_d_real() for regular files open, the ovl_inode lock is
>>> already taken in VFS.
>>>
>>> > IOW, current code seems to be just doing lock_rename(workdir, upperdir)
>>> > for the copy up operation. But now this new code also requires
>>> > inode_lock(ovl_inode) and I am trying to understand why.
>>> >
>>>
>>> So inode_lock(ovl_inode) is now taken ALSO in the only path it was not
>>> already taken until now. And the reason that we take it is so we can release
>>> lock_rename() for the duration of copy up data.
>>>
>>> Hope I was able to clarify myself.
>>>
>>> Amir.
>>>


Ping.

Miklos,

Did you get a chance to look at this?
Unless there are any objections, would you mind queuing this up for 4.10?

Thanks,
Amir.



>>> >
>>> >> Regular files with data:
>>> >>   inode_lock(ovl_inode)
>>> >>     lock_rename(workdir, upperdir)
>>> >>       copy up file to workdir (no data)
>>> >>     unlock_rename(workdir, upperdir)
>>> >>       copy data to file in workdir
>>> >>     lock_rename(workdir, upperdir)
>>> >>       move file to upperdir
>>> >>
>>> >> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
>>> >> ---
>>> >>  fs/overlayfs/copy_up.c | 70 +++++++++++++++++++++++++++++++++++++++++++++++---
>>> >>  fs/overlayfs/inode.c   |  3 +++
>>> >>  2 files changed, 69 insertions(+), 4 deletions(-)
>>> >>
>>> >> diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
>>> >> index a16127b..1b9705e 100644
>>> >> --- a/fs/overlayfs/copy_up.c
>>> >> +++ b/fs/overlayfs/copy_up.c
>>> >> @@ -230,6 +230,44 @@ int ovl_set_attr(struct dentry *upperdentry, struct kstat *stat)
>>> >>       return err;
>>> >>  }
>>> >>
>>> >> +/*
>>> >> + * Called with dentry->d_inode lock held only for the last (leaf) copy up
>>> >> + * from __ovl_copy_up(), so it is NOT held when called for ancestor
>>> >> + * directory from __ovl_copy_up()
>>> >> + *
>>> >> + * Called with lock_rename(workdir, upperdir) locks held.
>>> >> + *
>>> >> + * lock_rename() locks remain locked throughout the copy up
>>> >> + * of non regular files and zero sized regular files.
>>> >> + *
>>> >> + * lock_rename() locks are released during ovl_copy_up_data()
>>> >> + * of non zero sized regular files. During this time, the overlay
>>> >> + * dentry->d_inode lock is still held to avoid concurrent
>>> >> + * copy up of files with data.
>>> >> + *
>>> >> + * Maybe a better description of this locking scheme:
>>> >> + *
>>> >> + * Directories:
>>> >> + *   <maybe> inode_lock(ovl_inode)
>>> >> + *     lock_rename(workdir, upperdir)
>>> >> + *       copy up dir to workdir
>>> >> + *       move dir to upperdir
>>> >> + *
>>> >> + * Special and zero size files:
>>> >> + *   inode_lock(ovl_inode)
>>> >> + *     lock_rename(workdir, upperdir)
>>> >> + *       copy up file to workdir (no data)
>>> >> + *       move file to upperdir
>>> >> + *
>>> >> + * Regular files with data:
>>> >> + *  inode_lock(ovl_inode)
>>> >> + *    lock_rename(workdir, upperdir)
>>> >> + *      copy up file to workdir (no data)
>>> >> + *    unlock_rename(workdir, upperdir)
>>> >> + *      copy data to file in workdir
>>> >> + *    lock_rename(workdir, upperdir)
>>> >> + *      move file to upperdir
>>> >> + */
>>> >>  static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
>>> >>                             struct dentry *dentry, struct path *lowerpath,
>>> >>                             struct kstat *stat, const char *link)
>>> >> @@ -274,16 +312,39 @@ static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
>>> >>       if (err)
>>> >>               goto out2;
>>> >>
>>> >> -     if (S_ISREG(stat->mode)) {
>>> >> +     if (S_ISREG(stat->mode) && stat->size > 0) {
>>> >>               struct path upperpath;
>>> >>
>>> >>               ovl_path_upper(dentry, &upperpath);
>>> >>               BUG_ON(upperpath.dentry != NULL);
>>> >>               upperpath.dentry = newdentry;
>>> >>
>>> >> +             /*
>>> >> +              * Release rename locks, because copy data may take a long time,
>>> >> +              * and holding s_vfs_rename_mutex will block all rename and
>>> >> +              * copy up operations on the entire underlying fs.
>>> >> +              * We still hold the overlay inode lock to avoid concurrent
>>> >> +              * copy up of this file.
>>> >> +              */
>>> >> +             unlock_rename(workdir, upperdir);
>>> >> +
>>> >>               err = ovl_copy_up_data(lowerpath, &upperpath, stat->size);
>>> >> +
>>> >> +             /*
>>> >> +              * Re-aquire rename locks, before moving copied file into place.
>>> >> +              */
>>> >> +             if (unlikely(lock_rename(workdir, upperdir) != NULL)) {
>>> >> +                     pr_err("overlayfs: failed to re-aquire lock_rename\n");
>>> >> +                     err = -EIO;
>>> >> +             }
>>> >> +
>>> >>               if (err)
>>> >>                       goto out_cleanup;
>>> >> +
>>> >> +             if (WARN_ON(ovl_dentry_is_upper(dentry))) {
>>> >> +                     /* Raced with another copy-up? This shouldn't happen */
>>> >> +                     goto out_cleanup;
>>> >> +             }
>>> >>       }
>>> >>
>>> >>       err = ovl_copy_xattr(lowerpath->dentry, newdentry);
>>> >> @@ -366,15 +427,14 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
>>> >>                       return PTR_ERR(link);
>>> >>       }
>>> >>
>>> >> -     err = -EIO;
>>> >> -     if (lock_rename(workdir, upperdir) != NULL) {
>>> >> +     if (unlikely(lock_rename(workdir, upperdir) != NULL)) {
>>> >>               pr_err("overlayfs: failed to lock workdir+upperdir\n");
>>> >> +             err = -EIO;
>>> >>               goto out_unlock;
>>> >>       }
>>> >>
>>> >>       if (ovl_dentry_is_upper(dentry)) {
>>> >>               /* Raced with another copy-up?  Nothing to do, then... */
>>> >> -             err = 0;
>>> >>               goto out_unlock;
>>> >>       }
>>> >>
>>> >> @@ -433,11 +493,13 @@ static int __ovl_copy_up(struct dentry *dentry, int flags)
>>> >>       return err;
>>> >>  }
>>> >>
>>> >> +/* Called with dentry->d_inode lock held */
>>> >>  int ovl_copy_up(struct dentry *dentry)
>>> >>  {
>>> >>       return __ovl_copy_up(dentry, 0);
>>> >>  }
>>> >>
>>> >> +/* Called with dentry->d_inode lock held */
>>> >>  int ovl_copy_up_open(struct dentry *dentry, int flags)
>>> >>  {
>>> >>       return __ovl_copy_up(dentry, flags);
>>> >> diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
>>> >> index 7abae00..532b0d5 100644
>>> >> --- a/fs/overlayfs/inode.c
>>> >> +++ b/fs/overlayfs/inode.c
>>> >> @@ -251,11 +251,14 @@ int ovl_open_maybe_copy_up(struct dentry *dentry, unsigned int file_flags)
>>> >>
>>> >>       type = ovl_path_real(dentry, &realpath);
>>> >>       if (ovl_open_need_copy_up(file_flags, type, realpath.dentry)) {
>>> >> +             /* Take the overlay inode lock to avoid concurrent copy-up */
>>> >> +             inode_lock(dentry->d_inode);
>>> >>               err = ovl_want_write(dentry);
>>> >>               if (!err) {
>>> >>                       err = ovl_copy_up_open(dentry, file_flags);
>>> >>                       ovl_drop_write(dentry);
>>> >>               }
>>> >> +             inode_unlock(dentry->d_inode);
>>> >>       }
>>> >>
>>> >>       return err;
>>> >> --
>>> >> 2.7.4
>>> >>
>>> >> --
>>> >> To unsubscribe from this list: send the line "unsubscribe linux-unionfs" in
>>> >> the body of a message to majordomo@vger.kernel.org
>>> >> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC][PATCH 3/4] ovl: fold ovl_copy_up_truncate() into ovl_copy_up()
  2016-11-12 19:36 ` [RFC][PATCH 3/4] ovl: fold ovl_copy_up_truncate() into ovl_copy_up() Amir Goldstein
@ 2016-12-03 17:04   ` Amir Goldstein
  2016-12-05 14:51   ` Miklos Szeredi
  1 sibling, 0 replies; 12+ messages in thread
From: Amir Goldstein @ 2016-12-03 17:04 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Konstantin Khlebnikov, Al Viro, linux-unionfs, linux-fsdevel

On Sat, Nov 12, 2016 at 9:36 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> This removes code duplication and is prep work
> for fixing up copy up locking.
>

Miklos,

Please consider applying this prep patch even though
you NACKed the concurrent copy up patch.

It is net code deduplication and its also a good prep work
to any other locking scheme change for concurrent copy up.

Amir.

> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
>  fs/overlayfs/copy_up.c   | 19 ++++++++++++++++---
>  fs/overlayfs/inode.c     | 33 +--------------------------------
>  fs/overlayfs/overlayfs.h |  3 +--
>  3 files changed, 18 insertions(+), 37 deletions(-)
>
> diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
> index 7e67512..a16127b 100644
> --- a/fs/overlayfs/copy_up.c
> +++ b/fs/overlayfs/copy_up.c
> @@ -336,8 +336,8 @@ static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
>   * d_parent it is possible that the copy up will lock the old parent.  At
>   * that point the file will have already been copied up anyway.
>   */
> -int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
> -                   struct path *lowerpath, struct kstat *stat)
> +static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
> +                          struct path *lowerpath, struct kstat *stat)
>  {
>         DEFINE_DELAYED_CALL(done);
>         struct dentry *workdir = ovl_workdir(dentry);
> @@ -391,7 +391,7 @@ int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
>         return err;
>  }
>
> -int ovl_copy_up(struct dentry *dentry)
> +static int __ovl_copy_up(struct dentry *dentry, int flags)
>  {
>         int err = 0;
>         const struct cred *old_cred = ovl_override_creds(dentry->d_sb);
> @@ -419,6 +419,9 @@ int ovl_copy_up(struct dentry *dentry)
>
>                 ovl_path_lower(next, &lowerpath);
>                 err = vfs_getattr(&lowerpath, &stat);
> +               /* maybe truncate regular file. this has no effect on dirs */
> +               if (flags & O_TRUNC)
> +                       stat.size = 0;
>                 if (!err)
>                         err = ovl_copy_up_one(parent, next, &lowerpath, &stat);
>
> @@ -429,3 +432,13 @@ int ovl_copy_up(struct dentry *dentry)
>
>         return err;
>  }
> +
> +int ovl_copy_up(struct dentry *dentry)
> +{
> +       return __ovl_copy_up(dentry, 0);
> +}
> +
> +int ovl_copy_up_open(struct dentry *dentry, int flags)
> +{
> +       return __ovl_copy_up(dentry, flags);
> +}
> diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
> index a10e948..7abae00 100644
> --- a/fs/overlayfs/inode.c
> +++ b/fs/overlayfs/inode.c
> @@ -13,34 +13,6 @@
>  #include <linux/posix_acl.h>
>  #include "overlayfs.h"
>
> -static int ovl_copy_up_truncate(struct dentry *dentry)
> -{
> -       int err;
> -       struct dentry *parent;
> -       struct kstat stat;
> -       struct path lowerpath;
> -       const struct cred *old_cred;
> -
> -       parent = dget_parent(dentry);
> -       err = ovl_copy_up(parent);
> -       if (err)
> -               goto out_dput_parent;
> -
> -       ovl_path_lower(dentry, &lowerpath);
> -
> -       old_cred = ovl_override_creds(dentry->d_sb);
> -       err = vfs_getattr(&lowerpath, &stat);
> -       if (!err) {
> -               stat.size = 0;
> -               err = ovl_copy_up_one(parent, dentry, &lowerpath, &stat);
> -       }
> -       revert_creds(old_cred);
> -
> -out_dput_parent:
> -       dput(parent);
> -       return err;
> -}
> -
>  int ovl_setattr(struct dentry *dentry, struct iattr *attr)
>  {
>         int err;
> @@ -281,10 +253,7 @@ int ovl_open_maybe_copy_up(struct dentry *dentry, unsigned int file_flags)
>         if (ovl_open_need_copy_up(file_flags, type, realpath.dentry)) {
>                 err = ovl_want_write(dentry);
>                 if (!err) {
> -                       if (file_flags & O_TRUNC)
> -                               err = ovl_copy_up_truncate(dentry);
> -                       else
> -                               err = ovl_copy_up(dentry);
> +                       err = ovl_copy_up_open(dentry, file_flags);
>                         ovl_drop_write(dentry);
>                 }
>         }
> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index 421fd50..1a2f9a5 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -215,7 +215,6 @@ void ovl_cleanup(struct inode *dir, struct dentry *dentry);
>
>  /* copy_up.c */
>  int ovl_copy_up(struct dentry *dentry);
> -int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry,
> -                   struct path *lowerpath, struct kstat *stat);
> +int ovl_copy_up_open(struct dentry *dentry, int flags);
>  int ovl_copy_xattr(struct dentry *old, struct dentry *new);
>  int ovl_set_attr(struct dentry *upper, struct kstat *stat);
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [RFC][PATCH 3/4] ovl: fold ovl_copy_up_truncate() into ovl_copy_up()
  2016-11-12 19:36 ` [RFC][PATCH 3/4] ovl: fold ovl_copy_up_truncate() into ovl_copy_up() Amir Goldstein
  2016-12-03 17:04   ` Amir Goldstein
@ 2016-12-05 14:51   ` Miklos Szeredi
  1 sibling, 0 replies; 12+ messages in thread
From: Miklos Szeredi @ 2016-12-05 14:51 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Konstantin Khlebnikov, Al Viro, linux-unionfs, linux-fsdevel

On Sat, Nov 12, 2016 at 8:36 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> This removes code duplication and is prep work
> for fixing up copy up locking.

Thanks, applied.

Miklos

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2016-12-05 14:51 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-12 19:36 [RFC][PATCH 0/4] ovl: Do not hold s_vfs_rename_mutex during data copy up Amir Goldstein
2016-11-12 19:36 ` [RFC][PATCH 1/4] vfs: update documentation for inode operation locking rules Amir Goldstein
2016-11-12 19:36 ` [RFC][PATCH 2/4] ovl: add helper ovl_dentry_is_upper() Amir Goldstein
2016-11-12 19:36 ` [RFC][PATCH 3/4] ovl: fold ovl_copy_up_truncate() into ovl_copy_up() Amir Goldstein
2016-12-03 17:04   ` Amir Goldstein
2016-12-05 14:51   ` Miklos Szeredi
2016-11-12 19:36 ` [RFC][PATCH 4/4] ovl: allow concurrent copy up data of different files Amir Goldstein
2016-11-15 15:56   ` Vivek Goyal
2016-11-15 19:20     ` Amir Goldstein
2016-11-15 21:57       ` Vivek Goyal
2016-11-16  5:27         ` Amir Goldstein
2016-11-20  8:27           ` Amir Goldstein

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).