($INBOX_DIR/description missing)
 help / color / Atom feed
* [RFC PATCH v4 0/9] implement containerized syncfs for overlayfs
@ 2020-11-13  6:55 Chengguang Xu
  2020-11-13  6:55 ` [RFC PATCH v4 1/9] ovl: setup overlayfs' private bdi Chengguang Xu
                   ` (9 more replies)
  0 siblings, 10 replies; 24+ messages in thread
From: Chengguang Xu @ 2020-11-13  6:55 UTC (permalink / raw)
  To: miklos, jack, amir73il; +Cc: linux-unionfs, linux-fsdevel, Chengguang Xu

Current syncfs(2) syscall on overlayfs just calls sync_filesystem()
on upper_sb to synchronize whole dirty inodes in upper filesystem
regardless of the overlay ownership of the inode. In the use case of
container, when multiple containers using the same underlying upper
filesystem, it has some shortcomings as below.

(1) Performance
Synchronization is probably heavy because it actually syncs unnecessary
inodes for target overlayfs.

(2) Interference
Unplanned synchronization will probably impact IO performance of
unrelated container processes on the other overlayfs.

This series try to implement containerized syncfs for overlayfs so that
only sync target dirty upper inodes which are belong to specific overlayfs
instance. By doing this, it is able to reduce cost of synchronization and
will not seriously impact IO performance of unrelated processes.

v1->v2:
- Mark overlayfs' inode dirty itself instead of adding notification
  mechanism to vfs inode.

v2->v3:
- Introduce overlayfs' extra syncfs wait list to wait target upper inodes
in ->sync_fs.

v3->v4:
- Using wait_sb_inodes() to wait syncing upper inodes.
- Mark overlay inode dirty only when having upper inode and  VM_SHARED
flag in ovl_mmap().
- Check upper i_state after checking upper mmap state
in ovl_write_inode.

Chengguang Xu (9):
  ovl: setup overlayfs' private bdi
  ovl: implement ->writepages operation
  ovl: implement overlayfs' ->evict_inode operation
  ovl: mark overlayfs' inode dirty on modification
  ovl: mark overlayfs' inode dirty on shared mmap
  ovl: implement overlayfs' ->write_inode operation
  ovl: cache dirty overlayfs' inode
  fs: export wait_sb_inodes()
  ovl: implement containerized syncfs for overlayfs

 fs/fs-writeback.c         |  3 +-
 fs/overlayfs/file.c       |  3 ++
 fs/overlayfs/inode.c      | 15 ++++++++++
 fs/overlayfs/overlayfs.h  |  4 +++
 fs/overlayfs/super.c      | 63 ++++++++++++++++++++++++++++++++++++---
 fs/overlayfs/util.c       | 14 +++++++++
 include/linux/writeback.h |  1 +
 7 files changed, 98 insertions(+), 5 deletions(-)

-- 
2.26.2



^ permalink raw reply	[flat|nested] 24+ messages in thread

* [RFC PATCH v4 1/9] ovl: setup overlayfs' private bdi
  2020-11-13  6:55 [RFC PATCH v4 0/9] implement containerized syncfs for overlayfs Chengguang Xu
@ 2020-11-13  6:55 ` Chengguang Xu
  2020-11-13  6:55 ` [RFC PATCH v4 2/9] ovl: implement ->writepages operation Chengguang Xu
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 24+ messages in thread
From: Chengguang Xu @ 2020-11-13  6:55 UTC (permalink / raw)
  To: miklos, jack, amir73il; +Cc: linux-unionfs, linux-fsdevel, Chengguang Xu

Setup overlayfs' private bdi so that we can collect
overlayfs' own dirty inodes.

Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
---
 fs/overlayfs/super.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 290983bcfbb3..d1e546abce87 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -1869,6 +1869,10 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 	if (!ofs)
 		goto out;
 
+	err = super_setup_bdi(sb);
+	if (err)
+		goto out_err;
+
 	ofs->creator_cred = cred = prepare_creds();
 	if (!cred)
 		goto out_err;
-- 
2.26.2



^ permalink raw reply	[flat|nested] 24+ messages in thread

* [RFC PATCH v4 2/9] ovl: implement ->writepages operation
  2020-11-13  6:55 [RFC PATCH v4 0/9] implement containerized syncfs for overlayfs Chengguang Xu
  2020-11-13  6:55 ` [RFC PATCH v4 1/9] ovl: setup overlayfs' private bdi Chengguang Xu
@ 2020-11-13  6:55 ` Chengguang Xu
  2020-11-13  6:55 ` [RFC PATCH v4 3/9] ovl: implement overlayfs' ->evict_inode operation Chengguang Xu
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 24+ messages in thread
From: Chengguang Xu @ 2020-11-13  6:55 UTC (permalink / raw)
  To: miklos, jack, amir73il; +Cc: linux-unionfs, linux-fsdevel, Chengguang Xu

Implement overlayfs' ->writepages operation so that
we can sync dirty data/metadata to upper filesystem.

Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
---
 fs/overlayfs/inode.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index b584dca845ba..8cfa75e86f56 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -11,6 +11,7 @@
 #include <linux/posix_acl.h>
 #include <linux/ratelimit.h>
 #include <linux/fiemap.h>
+#include <linux/writeback.h>
 #include "overlayfs.h"
 
 
@@ -516,7 +517,20 @@ static const struct inode_operations ovl_special_inode_operations = {
 	.update_time	= ovl_update_time,
 };
 
+static int ovl_writepages(struct address_space *mapping,
+			  struct writeback_control *wbc)
+{
+	struct inode *inode = mapping->host;
+	struct ovl_fs *ofs = inode->i_sb->s_fs_info;
+	struct inode *upper = ovl_inode_upper(inode);
+
+	if (!ovl_should_sync(ofs))
+		return 0;
+	return sync_inode(upper, wbc);
+}
+
 static const struct address_space_operations ovl_aops = {
+	.writepages		= ovl_writepages,
 	/* For O_DIRECT dentry_open() checks f_mapping->a_ops->direct_IO */
 	.direct_IO		= noop_direct_IO,
 };
-- 
2.26.2



^ permalink raw reply	[flat|nested] 24+ messages in thread

* [RFC PATCH v4 3/9] ovl: implement overlayfs' ->evict_inode operation
  2020-11-13  6:55 [RFC PATCH v4 0/9] implement containerized syncfs for overlayfs Chengguang Xu
  2020-11-13  6:55 ` [RFC PATCH v4 1/9] ovl: setup overlayfs' private bdi Chengguang Xu
  2020-11-13  6:55 ` [RFC PATCH v4 2/9] ovl: implement ->writepages operation Chengguang Xu
@ 2020-11-13  6:55 ` Chengguang Xu
  2020-11-13  6:55 ` [RFC PATCH v4 4/9] ovl: mark overlayfs' inode dirty on modification Chengguang Xu
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 24+ messages in thread
From: Chengguang Xu @ 2020-11-13  6:55 UTC (permalink / raw)
  To: miklos, jack, amir73il; +Cc: linux-unionfs, linux-fsdevel, Chengguang Xu

Implement overlayfs' ->evict_inode operation,
so that we can clear all inode dirty flags.

Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
---
 fs/overlayfs/super.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index d1e546abce87..883172ac8a12 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -390,11 +390,18 @@ static int ovl_remount(struct super_block *sb, int *flags, char *data)
 	return ret;
 }
 
+static void ovl_evict_inode(struct inode *inode)
+{
+	inode->i_state &= ~I_DIRTY_ALL;
+	clear_inode(inode);
+}
+
 static const struct super_operations ovl_super_operations = {
 	.alloc_inode	= ovl_alloc_inode,
 	.free_inode	= ovl_free_inode,
 	.destroy_inode	= ovl_destroy_inode,
 	.drop_inode	= generic_delete_inode,
+	.evict_inode	= ovl_evict_inode,
 	.put_super	= ovl_put_super,
 	.sync_fs	= ovl_sync_fs,
 	.statfs		= ovl_statfs,
-- 
2.26.2



^ permalink raw reply	[flat|nested] 24+ messages in thread

* [RFC PATCH v4 4/9] ovl: mark overlayfs' inode dirty on modification
  2020-11-13  6:55 [RFC PATCH v4 0/9] implement containerized syncfs for overlayfs Chengguang Xu
                   ` (2 preceding siblings ...)
  2020-11-13  6:55 ` [RFC PATCH v4 3/9] ovl: implement overlayfs' ->evict_inode operation Chengguang Xu
@ 2020-11-13  6:55 ` Chengguang Xu
  2021-04-09 13:45   ` Miklos Szeredi
  2020-11-13  6:55 ` [RFC PATCH v4 5/9] ovl: mark overlayfs' inode dirty on shared mmap Chengguang Xu
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 24+ messages in thread
From: Chengguang Xu @ 2020-11-13  6:55 UTC (permalink / raw)
  To: miklos, jack, amir73il; +Cc: linux-unionfs, linux-fsdevel, Chengguang Xu

Mark overlayfs' inode dirty on modification so that
we can recognize target inodes during syncfs.

Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
---
 fs/overlayfs/inode.c     |  1 +
 fs/overlayfs/overlayfs.h |  4 ++++
 fs/overlayfs/util.c      | 14 ++++++++++++++
 3 files changed, 19 insertions(+)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 8cfa75e86f56..342693657ab0 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -468,6 +468,7 @@ int ovl_update_time(struct inode *inode, struct timespec64 *ts, int flags)
 		if (upperpath.dentry) {
 			touch_atime(&upperpath);
 			inode->i_atime = d_inode(upperpath.dentry)->i_atime;
+			ovl_mark_inode_dirty(inode);
 		}
 	}
 	return 0;
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index f8880aa2ba0e..eaf1d5b05d8e 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -247,6 +247,7 @@ static inline bool ovl_open_flags_need_copy_up(int flags)
 }
 
 /* util.c */
+void ovl_mark_inode_dirty(struct inode *inode);
 int ovl_want_write(struct dentry *dentry);
 void ovl_drop_write(struct dentry *dentry);
 struct dentry *ovl_workdir(struct dentry *dentry);
@@ -472,6 +473,9 @@ static inline void ovl_copyattr(struct inode *from, struct inode *to)
 	to->i_mtime = from->i_mtime;
 	to->i_ctime = from->i_ctime;
 	i_size_write(to, i_size_read(from));
+
+	if (ovl_inode_upper(to) && from->i_state & I_DIRTY_ALL)
+		ovl_mark_inode_dirty(to);
 }
 
 static inline void ovl_copyflags(struct inode *from, struct inode *to)
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 23f475627d07..a6f59df744ae 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -950,3 +950,17 @@ char *ovl_get_redirect_xattr(struct ovl_fs *ofs, struct dentry *dentry,
 	kfree(buf);
 	return ERR_PTR(res);
 }
+
+/*
+ * We intentionally add I_DIRTY_SYNC flag regardless dirty flag
+ * of upper inode so that we have chance to invoke ->write_inode
+ * to re-dirty overlayfs' inode during writeback process.
+ */
+void ovl_mark_inode_dirty(struct inode *inode)
+{
+	struct inode *upper = ovl_inode_upper(inode);
+	unsigned long iflag = I_DIRTY_SYNC;
+
+	iflag |= upper->i_state & I_DIRTY_ALL;
+	__mark_inode_dirty(inode, iflag);
+}
-- 
2.26.2



^ permalink raw reply	[flat|nested] 24+ messages in thread

* [RFC PATCH v4 5/9] ovl: mark overlayfs' inode dirty on shared mmap
  2020-11-13  6:55 [RFC PATCH v4 0/9] implement containerized syncfs for overlayfs Chengguang Xu
                   ` (3 preceding siblings ...)
  2020-11-13  6:55 ` [RFC PATCH v4 4/9] ovl: mark overlayfs' inode dirty on modification Chengguang Xu
@ 2020-11-13  6:55 ` Chengguang Xu
  2020-11-13  6:55 ` [RFC PATCH v4 6/9] ovl: implement overlayfs' ->write_inode operation Chengguang Xu
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 24+ messages in thread
From: Chengguang Xu @ 2020-11-13  6:55 UTC (permalink / raw)
  To: miklos, jack, amir73il; +Cc: linux-unionfs, linux-fsdevel, Chengguang Xu

Overlayfs cannot be notified when mmapped area gets dirty,
so we need to proactively mark inode dirty in ->mmap operation.

Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
---
 fs/overlayfs/file.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index efccb7c1f9bc..3f9de5343513 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -486,6 +486,9 @@ static int ovl_mmap(struct file *file, struct vm_area_struct *vma)
 		/* Drop reference count from new vm_file value */
 		fput(realfile);
 	} else {
+		if (ovl_inode_upper(file_inode(file)) &&
+		    vma->vm_flags & VM_SHARED)
+			ovl_mark_inode_dirty(file_inode(file));
 		/* Drop reference count from previous vm_file value */
 		fput(file);
 	}
-- 
2.26.2



^ permalink raw reply	[flat|nested] 24+ messages in thread

* [RFC PATCH v4 6/9] ovl: implement overlayfs' ->write_inode operation
  2020-11-13  6:55 [RFC PATCH v4 0/9] implement containerized syncfs for overlayfs Chengguang Xu
                   ` (4 preceding siblings ...)
  2020-11-13  6:55 ` [RFC PATCH v4 5/9] ovl: mark overlayfs' inode dirty on shared mmap Chengguang Xu
@ 2020-11-13  6:55 ` Chengguang Xu
  2021-04-09 13:49   ` Miklos Szeredi
  2020-11-13  6:55 ` [RFC PATCH v4 7/9] ovl: cache dirty overlayfs' inode Chengguang Xu
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 24+ messages in thread
From: Chengguang Xu @ 2020-11-13  6:55 UTC (permalink / raw)
  To: miklos, jack, amir73il; +Cc: linux-unionfs, linux-fsdevel, Chengguang Xu

Implement overlayfs' ->write_inode to sync dirty data
and redirty overlayfs' inode if necessary.

Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
---
 fs/overlayfs/super.c | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 883172ac8a12..82e001b97f38 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -390,6 +390,35 @@ static int ovl_remount(struct super_block *sb, int *flags, char *data)
 	return ret;
 }
 
+static int ovl_write_inode(struct inode *inode,
+			   struct writeback_control *wbc)
+{
+	struct ovl_fs *ofs = inode->i_sb->s_fs_info;
+	struct inode *upper = ovl_inode_upper(inode);
+	unsigned long iflag = 0;
+	int ret = 0;
+
+	if (!upper)
+		return 0;
+
+	if (!ovl_should_sync(ofs))
+		return 0;
+
+	if (upper->i_sb->s_op->write_inode)
+		ret = upper->i_sb->s_op->write_inode(inode, wbc);
+
+	if (mapping_writably_mapped(upper->i_mapping) ||
+	    mapping_tagged(upper->i_mapping, PAGECACHE_TAG_WRITEBACK))
+		iflag |= I_DIRTY_PAGES;
+
+	iflag |= upper->i_state & I_DIRTY_ALL;
+
+	if (iflag)
+		ovl_mark_inode_dirty(inode);
+
+	return ret;
+}
+
 static void ovl_evict_inode(struct inode *inode)
 {
 	inode->i_state &= ~I_DIRTY_ALL;
@@ -402,6 +431,7 @@ static const struct super_operations ovl_super_operations = {
 	.destroy_inode	= ovl_destroy_inode,
 	.drop_inode	= generic_delete_inode,
 	.evict_inode	= ovl_evict_inode,
+	.write_inode	= ovl_write_inode,
 	.put_super	= ovl_put_super,
 	.sync_fs	= ovl_sync_fs,
 	.statfs		= ovl_statfs,
-- 
2.26.2



^ permalink raw reply	[flat|nested] 24+ messages in thread

* [RFC PATCH v4 7/9] ovl: cache dirty overlayfs' inode
  2020-11-13  6:55 [RFC PATCH v4 0/9] implement containerized syncfs for overlayfs Chengguang Xu
                   ` (5 preceding siblings ...)
  2020-11-13  6:55 ` [RFC PATCH v4 6/9] ovl: implement overlayfs' ->write_inode operation Chengguang Xu
@ 2020-11-13  6:55 ` Chengguang Xu
  2021-04-09 13:50   ` Miklos Szeredi
  2020-11-13  6:55 ` [RFC PATCH v4 8/9] fs: export wait_sb_inodes() Chengguang Xu
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 24+ messages in thread
From: Chengguang Xu @ 2020-11-13  6:55 UTC (permalink / raw)
  To: miklos, jack, amir73il; +Cc: linux-unionfs, linux-fsdevel, Chengguang Xu

Now drop overlayfs' inode will sync dirty data,
so we change to only drop clean inode.

Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
---
 fs/overlayfs/super.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 82e001b97f38..982b3954b47c 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -425,11 +425,20 @@ static void ovl_evict_inode(struct inode *inode)
 	clear_inode(inode);
 }
 
+static int ovl_drop_inode(struct inode *inode)
+{
+	struct inode *upper = ovl_inode_upper(inode);
+
+	if (!upper || !(inode->i_state & I_DIRTY_ALL))
+		return 1;
+	return generic_drop_inode(inode);
+}
+
 static const struct super_operations ovl_super_operations = {
 	.alloc_inode	= ovl_alloc_inode,
 	.free_inode	= ovl_free_inode,
 	.destroy_inode	= ovl_destroy_inode,
-	.drop_inode	= generic_delete_inode,
+	.drop_inode	= ovl_drop_inode,
 	.evict_inode	= ovl_evict_inode,
 	.write_inode	= ovl_write_inode,
 	.put_super	= ovl_put_super,
-- 
2.26.2



^ permalink raw reply	[flat|nested] 24+ messages in thread

* [RFC PATCH v4 8/9] fs: export wait_sb_inodes()
  2020-11-13  6:55 [RFC PATCH v4 0/9] implement containerized syncfs for overlayfs Chengguang Xu
                   ` (6 preceding siblings ...)
  2020-11-13  6:55 ` [RFC PATCH v4 7/9] ovl: cache dirty overlayfs' inode Chengguang Xu
@ 2020-11-13  6:55 ` Chengguang Xu
  2020-11-13  6:55 ` [RFC PATCH v4 9/9] ovl: implement containerized syncfs for overlayfs Chengguang Xu
  2020-12-04 14:49 ` 回复:[RFC PATCH v4 0/9] " Chengguang Xu
  9 siblings, 0 replies; 24+ messages in thread
From: Chengguang Xu @ 2020-11-13  6:55 UTC (permalink / raw)
  To: miklos, jack, amir73il; +Cc: linux-unionfs, linux-fsdevel, Chengguang Xu

In order to wait syncing upper inodes we need to call wait_sb_inodes()
in overlayfs' ->sync_fs.

Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
---
 fs/fs-writeback.c         | 3 ++-
 include/linux/writeback.h | 1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 149227160ff0..55dafdc72661 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -2365,7 +2365,7 @@ EXPORT_SYMBOL(__mark_inode_dirty);
  * completed by the time we have gained the lock and waited for all IO that is
  * in progress regardless of the order callers are granted the lock.
  */
-static void wait_sb_inodes(struct super_block *sb)
+void wait_sb_inodes(struct super_block *sb)
 {
 	LIST_HEAD(sync_list);
 
@@ -2449,6 +2449,7 @@ static void wait_sb_inodes(struct super_block *sb)
 	rcu_read_unlock();
 	mutex_unlock(&sb->s_sync_lock);
 }
+EXPORT_SYMBOL(wait_sb_inodes);
 
 static void __writeback_inodes_sb_nr(struct super_block *sb, unsigned long nr,
 				     enum wb_reason reason, bool skip_if_busy)
diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index 8e5c5bb16e2d..53e826af33b6 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -198,6 +198,7 @@ void wakeup_flusher_threads_bdi(struct backing_dev_info *bdi,
 				enum wb_reason reason);
 void inode_wait_for_writeback(struct inode *inode);
 void inode_io_list_del(struct inode *inode);
+void wait_sb_inodes(struct super_block *sb);
 
 /* writeback.h requires fs.h; it, too, is not included from here. */
 static inline void wait_on_inode(struct inode *inode)
-- 
2.26.2



^ permalink raw reply	[flat|nested] 24+ messages in thread

* [RFC PATCH v4 9/9] ovl: implement containerized syncfs for overlayfs
  2020-11-13  6:55 [RFC PATCH v4 0/9] implement containerized syncfs for overlayfs Chengguang Xu
                   ` (7 preceding siblings ...)
  2020-11-13  6:55 ` [RFC PATCH v4 8/9] fs: export wait_sb_inodes() Chengguang Xu
@ 2020-11-13  6:55 ` Chengguang Xu
  2021-04-09 13:51   ` Miklos Szeredi
  2020-12-04 14:49 ` 回复:[RFC PATCH v4 0/9] " Chengguang Xu
  9 siblings, 1 reply; 24+ messages in thread
From: Chengguang Xu @ 2020-11-13  6:55 UTC (permalink / raw)
  To: miklos, jack, amir73il; +Cc: linux-unionfs, linux-fsdevel, Chengguang Xu

Now overlayfs can only sync dirty inode during syncfs,
so remove unnecessary sync_filesystem() on upper file
system.

Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
---
 fs/overlayfs/super.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 982b3954b47c..58507f1cd583 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -15,6 +15,8 @@
 #include <linux/seq_file.h>
 #include <linux/posix_acl_xattr.h>
 #include <linux/exportfs.h>
+#include <linux/blkdev.h>
+#include <linux/writeback.h>
 #include "overlayfs.h"
 
 MODULE_AUTHOR("Miklos Szeredi <miklos@szeredi.hu>");
@@ -270,8 +272,7 @@ static int ovl_sync_fs(struct super_block *sb, int wait)
 	 * Not called for sync(2) call or an emergency sync (SB_I_SKIP_SYNC).
 	 * All the super blocks will be iterated, including upper_sb.
 	 *
-	 * If this is a syncfs(2) call, then we do need to call
-	 * sync_filesystem() on upper_sb, but enough if we do it when being
+	 * if this is a syncfs(2) call, it will be enough we do it when being
 	 * called with wait == 1.
 	 */
 	if (!wait)
@@ -280,7 +281,11 @@ static int ovl_sync_fs(struct super_block *sb, int wait)
 	upper_sb = ovl_upper_mnt(ofs)->mnt_sb;
 
 	down_read(&upper_sb->s_umount);
-	ret = sync_filesystem(upper_sb);
+	wait_sb_inodes(upper_sb);
+	if (upper_sb->s_op->sync_fs)
+		ret = upper_sb->s_op->sync_fs(upper_sb, wait);
+	if (!ret)
+		ret = sync_blockdev(upper_sb->s_bdev);
 	up_read(&upper_sb->s_umount);
 
 	return ret;
-- 
2.26.2



^ permalink raw reply	[flat|nested] 24+ messages in thread

* 回复:[RFC PATCH v4 0/9] implement containerized syncfs for overlayfs
  2020-11-13  6:55 [RFC PATCH v4 0/9] implement containerized syncfs for overlayfs Chengguang Xu
                   ` (8 preceding siblings ...)
  2020-11-13  6:55 ` [RFC PATCH v4 9/9] ovl: implement containerized syncfs for overlayfs Chengguang Xu
@ 2020-12-04 14:49 ` Chengguang Xu
  2020-12-04 15:02   ` [RFC " Miklos Szeredi
  2020-12-04 16:40   ` 回复:[RFC " Jan Kara
  9 siblings, 2 replies; 24+ messages in thread
From: Chengguang Xu @ 2020-12-04 14:49 UTC (permalink / raw)
  To: miklos, jack, amir73il; +Cc: linux-unionfs, linux-fsdevel, Chengguang Xu

 ---- 在 星期五, 2020-11-13 14:55:46 Chengguang Xu <cgxu519@mykernel.net> 撰写 ----
 > Current syncfs(2) syscall on overlayfs just calls sync_filesystem()
 > on upper_sb to synchronize whole dirty inodes in upper filesystem
 > regardless of the overlay ownership of the inode. In the use case of
 > container, when multiple containers using the same underlying upper
 > filesystem, it has some shortcomings as below.
 > 
 > (1) Performance
 > Synchronization is probably heavy because it actually syncs unnecessary
 > inodes for target overlayfs.
 > 
 > (2) Interference
 > Unplanned synchronization will probably impact IO performance of
 > unrelated container processes on the other overlayfs.
 > 
 > This series try to implement containerized syncfs for overlayfs so that
 > only sync target dirty upper inodes which are belong to specific overlayfs
 > instance. By doing this, it is able to reduce cost of synchronization and
 > will not seriously impact IO performance of unrelated processes.
 
Hi Miklos,

I think this version has addressed all previous issues and comments from Jack
and Amir.  Have you got time to review this patch series?

Thanks,
Chengguang

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH v4 0/9] implement containerized syncfs for overlayfs
  2020-12-04 14:49 ` 回复:[RFC PATCH v4 0/9] " Chengguang Xu
@ 2020-12-04 15:02   ` Miklos Szeredi
  2020-12-04 16:40   ` 回复:[RFC " Jan Kara
  1 sibling, 0 replies; 24+ messages in thread
From: Miklos Szeredi @ 2020-12-04 15:02 UTC (permalink / raw)
  To: Chengguang Xu; +Cc: jack, amir73il, linux-unionfs, linux-fsdevel

On Fri, Dec 4, 2020 at 3:50 PM Chengguang Xu <cgxu519@mykernel.net> wrote:
>
>  ---- 在 星期五, 2020-11-13 14:55:46 Chengguang Xu <cgxu519@mykernel.net> 撰写 ----
>  > Current syncfs(2) syscall on overlayfs just calls sync_filesystem()
>  > on upper_sb to synchronize whole dirty inodes in upper filesystem
>  > regardless of the overlay ownership of the inode. In the use case of
>  > container, when multiple containers using the same underlying upper
>  > filesystem, it has some shortcomings as below.
>  >
>  > (1) Performance
>  > Synchronization is probably heavy because it actually syncs unnecessary
>  > inodes for target overlayfs.
>  >
>  > (2) Interference
>  > Unplanned synchronization will probably impact IO performance of
>  > unrelated container processes on the other overlayfs.
>  >
>  > This series try to implement containerized syncfs for overlayfs so that
>  > only sync target dirty upper inodes which are belong to specific overlayfs
>  > instance. By doing this, it is able to reduce cost of synchronization and
>  > will not seriously impact IO performance of unrelated processes.
>
> Hi Miklos,
>
> I think this version has addressed all previous issues and comments from Jack
> and Amir.  Have you got time to review this patch series?

Hopefully yes.

I'm really keen to finish off the unprivileged overlay patches first.
Will test and post a new version shortly.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: 回复:[RFC PATCH v4 0/9] implement containerized syncfs for overlayfs
  2020-12-04 14:49 ` 回复:[RFC PATCH v4 0/9] " Chengguang Xu
  2020-12-04 15:02   ` [RFC " Miklos Szeredi
@ 2020-12-04 16:40   ` Jan Kara
  1 sibling, 0 replies; 24+ messages in thread
From: Jan Kara @ 2020-12-04 16:40 UTC (permalink / raw)
  To: Chengguang Xu; +Cc: miklos, jack, amir73il, linux-unionfs, linux-fsdevel

On Fri 04-12-20 22:49:13, Chengguang Xu wrote:
>  ---- 在 星期五, 2020-11-13 14:55:46 Chengguang Xu <cgxu519@mykernel.net> 撰写 ----
>  > Current syncfs(2) syscall on overlayfs just calls sync_filesystem()
>  > on upper_sb to synchronize whole dirty inodes in upper filesystem
>  > regardless of the overlay ownership of the inode. In the use case of
>  > container, when multiple containers using the same underlying upper
>  > filesystem, it has some shortcomings as below.
>  > 
>  > (1) Performance
>  > Synchronization is probably heavy because it actually syncs unnecessary
>  > inodes for target overlayfs.
>  > 
>  > (2) Interference
>  > Unplanned synchronization will probably impact IO performance of
>  > unrelated container processes on the other overlayfs.
>  > 
>  > This series try to implement containerized syncfs for overlayfs so that
>  > only sync target dirty upper inodes which are belong to specific overlayfs
>  > instance. By doing this, it is able to reduce cost of synchronization and
>  > will not seriously impact IO performance of unrelated processes.
>  
> Hi Miklos,
> 
> I think this version has addressed all previous issues and comments from Jack
> and Amir.  Have you got time to review this patch series?

Yes, the patches now look good to me. Feel free to add:

Acked-by: Jan Kara <jack@suse.cz>

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH v4 4/9] ovl: mark overlayfs' inode dirty on modification
  2020-11-13  6:55 ` [RFC PATCH v4 4/9] ovl: mark overlayfs' inode dirty on modification Chengguang Xu
@ 2021-04-09 13:45   ` Miklos Szeredi
  2021-04-12 11:58     ` Chengguang Xu
  0 siblings, 1 reply; 24+ messages in thread
From: Miklos Szeredi @ 2021-04-09 13:45 UTC (permalink / raw)
  To: Chengguang Xu; +Cc: Jan Kara, Amir Goldstein, overlayfs, linux-fsdevel

On Fri, Nov 13, 2020 at 7:57 AM Chengguang Xu <cgxu519@mykernel.net> wrote:
>
> Mark overlayfs' inode dirty on modification so that
> we can recognize target inodes during syncfs.
>
> Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
> ---
>  fs/overlayfs/inode.c     |  1 +
>  fs/overlayfs/overlayfs.h |  4 ++++
>  fs/overlayfs/util.c      | 14 ++++++++++++++
>  3 files changed, 19 insertions(+)
>
> diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
> index 8cfa75e86f56..342693657ab0 100644
> --- a/fs/overlayfs/inode.c
> +++ b/fs/overlayfs/inode.c
> @@ -468,6 +468,7 @@ int ovl_update_time(struct inode *inode, struct timespec64 *ts, int flags)
>                 if (upperpath.dentry) {
>                         touch_atime(&upperpath);
>                         inode->i_atime = d_inode(upperpath.dentry)->i_atime;
> +                       ovl_mark_inode_dirty(inode);
>                 }
>         }
>         return 0;
> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index f8880aa2ba0e..eaf1d5b05d8e 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -247,6 +247,7 @@ static inline bool ovl_open_flags_need_copy_up(int flags)
>  }
>
>  /* util.c */
> +void ovl_mark_inode_dirty(struct inode *inode);
>  int ovl_want_write(struct dentry *dentry);
>  void ovl_drop_write(struct dentry *dentry);
>  struct dentry *ovl_workdir(struct dentry *dentry);
> @@ -472,6 +473,9 @@ static inline void ovl_copyattr(struct inode *from, struct inode *to)
>         to->i_mtime = from->i_mtime;
>         to->i_ctime = from->i_ctime;
>         i_size_write(to, i_size_read(from));
> +
> +       if (ovl_inode_upper(to) && from->i_state & I_DIRTY_ALL)
> +               ovl_mark_inode_dirty(to);
>  }

Okay, ovl_copyattr() certainly seems a good place to copy dirtyness as well.

What I'm fearing is that it does not cover all the places where
underlying inode can be dirtied.  This really needs an audit of all
filesystem modifying operations.

>
>  static inline void ovl_copyflags(struct inode *from, struct inode *to)
> diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> index 23f475627d07..a6f59df744ae 100644
> --- a/fs/overlayfs/util.c
> +++ b/fs/overlayfs/util.c
> @@ -950,3 +950,17 @@ char *ovl_get_redirect_xattr(struct ovl_fs *ofs, struct dentry *dentry,
>         kfree(buf);
>         return ERR_PTR(res);
>  }
> +
> +/*
> + * We intentionally add I_DIRTY_SYNC flag regardless dirty flag
> + * of upper inode so that we have chance to invoke ->write_inode
> + * to re-dirty overlayfs' inode during writeback process.
> + */
> +void ovl_mark_inode_dirty(struct inode *inode)
> +{
> +       struct inode *upper = ovl_inode_upper(inode);
> +       unsigned long iflag = I_DIRTY_SYNC;
> +
> +       iflag |= upper->i_state & I_DIRTY_ALL;
> +       __mark_inode_dirty(inode, iflag);
> +}
> --
> 2.26.2
>
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH v4 6/9] ovl: implement overlayfs' ->write_inode operation
  2020-11-13  6:55 ` [RFC PATCH v4 6/9] ovl: implement overlayfs' ->write_inode operation Chengguang Xu
@ 2021-04-09 13:49   ` Miklos Szeredi
  2021-04-14  6:14     ` Chengguang Xu
  0 siblings, 1 reply; 24+ messages in thread
From: Miklos Szeredi @ 2021-04-09 13:49 UTC (permalink / raw)
  To: Chengguang Xu; +Cc: Jan Kara, Amir Goldstein, overlayfs, linux-fsdevel

On Fri, Nov 13, 2020 at 7:57 AM Chengguang Xu <cgxu519@mykernel.net> wrote:
>
> Implement overlayfs' ->write_inode to sync dirty data
> and redirty overlayfs' inode if necessary.
>
> Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
> ---
>  fs/overlayfs/super.c | 30 ++++++++++++++++++++++++++++++
>  1 file changed, 30 insertions(+)
>
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index 883172ac8a12..82e001b97f38 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -390,6 +390,35 @@ static int ovl_remount(struct super_block *sb, int *flags, char *data)
>         return ret;
>  }
>
> +static int ovl_write_inode(struct inode *inode,
> +                          struct writeback_control *wbc)
> +{
> +       struct ovl_fs *ofs = inode->i_sb->s_fs_info;
> +       struct inode *upper = ovl_inode_upper(inode);
> +       unsigned long iflag = 0;
> +       int ret = 0;
> +
> +       if (!upper)
> +               return 0;
> +
> +       if (!ovl_should_sync(ofs))
> +               return 0;
> +
> +       if (upper->i_sb->s_op->write_inode)
> +               ret = upper->i_sb->s_op->write_inode(inode, wbc);
> +
> +       if (mapping_writably_mapped(upper->i_mapping) ||
> +           mapping_tagged(upper->i_mapping, PAGECACHE_TAG_WRITEBACK))
> +               iflag |= I_DIRTY_PAGES;
> +
> +       iflag |= upper->i_state & I_DIRTY_ALL;

How is I_DIRTY_SYNC added/removed from the overlay inode?

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH v4 7/9] ovl: cache dirty overlayfs' inode
  2020-11-13  6:55 ` [RFC PATCH v4 7/9] ovl: cache dirty overlayfs' inode Chengguang Xu
@ 2021-04-09 13:50   ` Miklos Szeredi
  2021-04-13  2:14     ` Chengguang Xu
  0 siblings, 1 reply; 24+ messages in thread
From: Miklos Szeredi @ 2021-04-09 13:50 UTC (permalink / raw)
  To: Chengguang Xu; +Cc: Jan Kara, Amir Goldstein, overlayfs, linux-fsdevel

On Fri, Nov 13, 2020 at 7:57 AM Chengguang Xu <cgxu519@mykernel.net> wrote:
>
> Now drop overlayfs' inode will sync dirty data,
> so we change to only drop clean inode.

I don't understand what happens here.  Please add more explanation.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH v4 9/9] ovl: implement containerized syncfs for overlayfs
  2020-11-13  6:55 ` [RFC PATCH v4 9/9] ovl: implement containerized syncfs for overlayfs Chengguang Xu
@ 2021-04-09 13:51   ` Miklos Szeredi
  2021-04-12 12:24     ` Chengguang Xu
  0 siblings, 1 reply; 24+ messages in thread
From: Miklos Szeredi @ 2021-04-09 13:51 UTC (permalink / raw)
  To: Chengguang Xu; +Cc: Jan Kara, Amir Goldstein, overlayfs, linux-fsdevel

On Fri, Nov 13, 2020 at 7:57 AM Chengguang Xu <cgxu519@mykernel.net> wrote:
>
> Now overlayfs can only sync dirty inode during syncfs,
> so remove unnecessary sync_filesystem() on upper file
> system.
>
> Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
> ---
>  fs/overlayfs/super.c | 11 ++++++++---
>  1 file changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index 982b3954b47c..58507f1cd583 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -15,6 +15,8 @@
>  #include <linux/seq_file.h>
>  #include <linux/posix_acl_xattr.h>
>  #include <linux/exportfs.h>
> +#include <linux/blkdev.h>
> +#include <linux/writeback.h>
>  #include "overlayfs.h"
>
>  MODULE_AUTHOR("Miklos Szeredi <miklos@szeredi.hu>");
> @@ -270,8 +272,7 @@ static int ovl_sync_fs(struct super_block *sb, int wait)
>          * Not called for sync(2) call or an emergency sync (SB_I_SKIP_SYNC).
>          * All the super blocks will be iterated, including upper_sb.
>          *
> -        * If this is a syncfs(2) call, then we do need to call
> -        * sync_filesystem() on upper_sb, but enough if we do it when being
> +        * if this is a syncfs(2) call, it will be enough we do it when being
>          * called with wait == 1.
>          */
>         if (!wait)
> @@ -280,7 +281,11 @@ static int ovl_sync_fs(struct super_block *sb, int wait)
>         upper_sb = ovl_upper_mnt(ofs)->mnt_sb;
>
>         down_read(&upper_sb->s_umount);
> -       ret = sync_filesystem(upper_sb);
> +       wait_sb_inodes(upper_sb);
> +       if (upper_sb->s_op->sync_fs)
> +               ret = upper_sb->s_op->sync_fs(upper_sb, wait);
> +       if (!ret)
> +               ret = sync_blockdev(upper_sb->s_bdev);

Should this instead be __sync_blockdev(..., wait)?

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH v4 4/9] ovl: mark overlayfs' inode dirty on modification
  2021-04-09 13:45   ` Miklos Szeredi
@ 2021-04-12 11:58     ` Chengguang Xu
  0 siblings, 0 replies; 24+ messages in thread
From: Chengguang Xu @ 2021-04-12 11:58 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Jan Kara, Amir Goldstein, overlayfs, linux-fsdevel

---- 在 星期五, 2021-04-09 21:45:28 Miklos Szeredi <miklos@szeredi.hu> 撰写 ----
 > On Fri, Nov 13, 2020 at 7:57 AM Chengguang Xu <cgxu519@mykernel.net> wrote:
 > >
 > > Mark overlayfs' inode dirty on modification so that
 > > we can recognize target inodes during syncfs.
 > >
 > > Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
 > > ---
 > >  fs/overlayfs/inode.c     |  1 +
 > >  fs/overlayfs/overlayfs.h |  4 ++++
 > >  fs/overlayfs/util.c      | 14 ++++++++++++++
 > >  3 files changed, 19 insertions(+)
 > >
 > > diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
 > > index 8cfa75e86f56..342693657ab0 100644
 > > --- a/fs/overlayfs/inode.c
 > > +++ b/fs/overlayfs/inode.c
 > > @@ -468,6 +468,7 @@ int ovl_update_time(struct inode *inode, struct timespec64 *ts, int flags)
 > >                 if (upperpath.dentry) {
 > >                         touch_atime(&upperpath);
 > >                         inode->i_atime = d_inode(upperpath.dentry)->i_atime;
 > > +                       ovl_mark_inode_dirty(inode);
 > >                 }
 > >         }
 > >         return 0;
 > > diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
 > > index f8880aa2ba0e..eaf1d5b05d8e 100644
 > > --- a/fs/overlayfs/overlayfs.h
 > > +++ b/fs/overlayfs/overlayfs.h
 > > @@ -247,6 +247,7 @@ static inline bool ovl_open_flags_need_copy_up(int flags)
 > >  }
 > >
 > >  /* util.c */
 > > +void ovl_mark_inode_dirty(struct inode *inode);
 > >  int ovl_want_write(struct dentry *dentry);
 > >  void ovl_drop_write(struct dentry *dentry);
 > >  struct dentry *ovl_workdir(struct dentry *dentry);
 > > @@ -472,6 +473,9 @@ static inline void ovl_copyattr(struct inode *from, struct inode *to)
 > >         to->i_mtime = from->i_mtime;
 > >         to->i_ctime = from->i_ctime;
 > >         i_size_write(to, i_size_read(from));
 > > +
 > > +       if (ovl_inode_upper(to) && from->i_state & I_DIRTY_ALL)
 > > +               ovl_mark_inode_dirty(to);
 > >  }
 > 
 > Okay, ovl_copyattr() certainly seems a good place to copy dirtyness as well.
 > 
 > What I'm fearing is that it does not cover all the places where
 > underlying inode can be dirtied.  This really needs an audit of all
 > filesystem modifying operations.
 
You are right. Let me fix it in next version.

Thanks,
Chengguang


 > >
 > >  static inline void ovl_copyflags(struct inode *from, struct inode *to)
 > > diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
 > > index 23f475627d07..a6f59df744ae 100644
 > > --- a/fs/overlayfs/util.c
 > > +++ b/fs/overlayfs/util.c
 > > @@ -950,3 +950,17 @@ char *ovl_get_redirect_xattr(struct ovl_fs *ofs, struct dentry *dentry,
 > >         kfree(buf);
 > >         return ERR_PTR(res);
 > >  }
 > > +
 > > +/*
 > > + * We intentionally add I_DIRTY_SYNC flag regardless dirty flag
 > > + * of upper inode so that we have chance to invoke ->write_inode
 > > + * to re-dirty overlayfs' inode during writeback process.
 > > + */
 > > +void ovl_mark_inode_dirty(struct inode *inode)
 > > +{
 > > +       struct inode *upper = ovl_inode_upper(inode);
 > > +       unsigned long iflag = I_DIRTY_SYNC;
 > > +
 > > +       iflag |= upper->i_state & I_DIRTY_ALL;
 > > +       __mark_inode_dirty(inode, iflag);
 > > +}
 > > --
 > > 2.26.2
 > >
 > >
 > 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH v4 9/9] ovl: implement containerized syncfs for overlayfs
  2021-04-09 13:51   ` Miklos Szeredi
@ 2021-04-12 12:24     ` Chengguang Xu
  2021-04-12 14:22       ` Miklos Szeredi
  0 siblings, 1 reply; 24+ messages in thread
From: Chengguang Xu @ 2021-04-12 12:24 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Jan Kara, Amir Goldstein, overlayfs, linux-fsdevel

 ---- 在 星期五, 2021-04-09 21:51:26 Miklos Szeredi <miklos@szeredi.hu> 撰写 ----
 > On Fri, Nov 13, 2020 at 7:57 AM Chengguang Xu <cgxu519@mykernel.net> wrote:
 > >
 > > Now overlayfs can only sync dirty inode during syncfs,
 > > so remove unnecessary sync_filesystem() on upper file
 > > system.
 > >
 > > Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
 > > ---
 > >  fs/overlayfs/super.c | 11 ++++++++---
 > >  1 file changed, 8 insertions(+), 3 deletions(-)
 > >
 > > diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
 > > index 982b3954b47c..58507f1cd583 100644
 > > --- a/fs/overlayfs/super.c
 > > +++ b/fs/overlayfs/super.c
 > > @@ -15,6 +15,8 @@
 > >  #include <linux/seq_file.h>
 > >  #include <linux/posix_acl_xattr.h>
 > >  #include <linux/exportfs.h>
 > > +#include <linux/blkdev.h>
 > > +#include <linux/writeback.h>
 > >  #include "overlayfs.h"
 > >
 > >  MODULE_AUTHOR("Miklos Szeredi <miklos@szeredi.hu>");
 > > @@ -270,8 +272,7 @@ static int ovl_sync_fs(struct super_block *sb, int wait)
 > >          * Not called for sync(2) call or an emergency sync (SB_I_SKIP_SYNC).
 > >          * All the super blocks will be iterated, including upper_sb.
 > >          *
 > > -        * If this is a syncfs(2) call, then we do need to call
 > > -        * sync_filesystem() on upper_sb, but enough if we do it when being
 > > +        * if this is a syncfs(2) call, it will be enough we do it when being
 > >          * called with wait == 1.
 > >          */
 > >         if (!wait)
 > > @@ -280,7 +281,11 @@ static int ovl_sync_fs(struct super_block *sb, int wait)
 > >         upper_sb = ovl_upper_mnt(ofs)->mnt_sb;
 > >
 > >         down_read(&upper_sb->s_umount);
 > > -       ret = sync_filesystem(upper_sb);
 > > +       wait_sb_inodes(upper_sb);
 > > +       if (upper_sb->s_op->sync_fs)
 > > +               ret = upper_sb->s_op->sync_fs(upper_sb, wait);
 > > +       if (!ret)
 > > +               ret = sync_blockdev(upper_sb->s_bdev);
 > 
 > Should this instead be __sync_blockdev(..., wait)?
 
I don't remember why we skipped the case of (wait == 0) here, just guess it's not worth
to export internal function __sync_blockdev() to modules, do you prefer to call __sync_blockdev()
and handle both nowait and wait cases?


Thanks,
Chengguang






^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH v4 9/9] ovl: implement containerized syncfs for overlayfs
  2021-04-12 12:24     ` Chengguang Xu
@ 2021-04-12 14:22       ` Miklos Szeredi
  0 siblings, 0 replies; 24+ messages in thread
From: Miklos Szeredi @ 2021-04-12 14:22 UTC (permalink / raw)
  To: Chengguang Xu; +Cc: Jan Kara, Amir Goldstein, overlayfs, linux-fsdevel

On Mon, Apr 12, 2021 at 2:24 PM Chengguang Xu <cgxu519@mykernel.net> wrote:
>
>  ---- 在 星期五, 2021-04-09 21:51:26 Miklos Szeredi <miklos@szeredi.hu> 撰写 ----
>  > On Fri, Nov 13, 2020 at 7:57 AM Chengguang Xu <cgxu519@mykernel.net> wrote:
>  > >
>  > > Now overlayfs can only sync dirty inode during syncfs,
>  > > so remove unnecessary sync_filesystem() on upper file
>  > > system.
>  > >
>  > > Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
>  > > ---
>  > >  fs/overlayfs/super.c | 11 ++++++++---
>  > >  1 file changed, 8 insertions(+), 3 deletions(-)
>  > >
>  > > diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
>  > > index 982b3954b47c..58507f1cd583 100644
>  > > --- a/fs/overlayfs/super.c
>  > > +++ b/fs/overlayfs/super.c
>  > > @@ -15,6 +15,8 @@
>  > >  #include <linux/seq_file.h>
>  > >  #include <linux/posix_acl_xattr.h>
>  > >  #include <linux/exportfs.h>
>  > > +#include <linux/blkdev.h>
>  > > +#include <linux/writeback.h>
>  > >  #include "overlayfs.h"
>  > >
>  > >  MODULE_AUTHOR("Miklos Szeredi <miklos@szeredi.hu>");
>  > > @@ -270,8 +272,7 @@ static int ovl_sync_fs(struct super_block *sb, int wait)
>  > >          * Not called for sync(2) call or an emergency sync (SB_I_SKIP_SYNC).
>  > >          * All the super blocks will be iterated, including upper_sb.
>  > >          *
>  > > -        * If this is a syncfs(2) call, then we do need to call
>  > > -        * sync_filesystem() on upper_sb, but enough if we do it when being
>  > > +        * if this is a syncfs(2) call, it will be enough we do it when being
>  > >          * called with wait == 1.
>  > >          */
>  > >         if (!wait)
>  > > @@ -280,7 +281,11 @@ static int ovl_sync_fs(struct super_block *sb, int wait)
>  > >         upper_sb = ovl_upper_mnt(ofs)->mnt_sb;
>  > >
>  > >         down_read(&upper_sb->s_umount);
>  > > -       ret = sync_filesystem(upper_sb);
>  > > +       wait_sb_inodes(upper_sb);
>  > > +       if (upper_sb->s_op->sync_fs)
>  > > +               ret = upper_sb->s_op->sync_fs(upper_sb, wait);
>  > > +       if (!ret)
>  > > +               ret = sync_blockdev(upper_sb->s_bdev);
>  >
>  > Should this instead be __sync_blockdev(..., wait)?
>
> I don't remember why we skipped the case of (wait == 0) here, just guess it's not worth
> to export internal function __sync_blockdev() to modules, do you prefer to call __sync_blockdev()
> and handle both nowait and wait cases?

Possibly it would make most sense to export the "->sync_fs() +
__sync_blockdev()" part of __sync_filesystem() as a separate helper.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH v4 7/9] ovl: cache dirty overlayfs' inode
  2021-04-09 13:50   ` Miklos Szeredi
@ 2021-04-13  2:14     ` Chengguang Xu
  2021-04-13  8:43       ` Miklos Szeredi
  0 siblings, 1 reply; 24+ messages in thread
From: Chengguang Xu @ 2021-04-13  2:14 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Jan Kara, Amir Goldstein, overlayfs, linux-fsdevel

 ---- 在 星期五, 2021-04-09 21:50:35 Miklos Szeredi <miklos@szeredi.hu> 撰写 ----
 > On Fri, Nov 13, 2020 at 7:57 AM Chengguang Xu <cgxu519@mykernel.net> wrote:
 > >
 > > Now drop overlayfs' inode will sync dirty data,
 > > so we change to only drop clean inode.
 > 
 > I don't understand what happens here.  Please add more explanation.

In iput_final(), clean overlayfs inode will directly drop as the same as before,
dirty overlayfs inode will keep in the cache to wait writeback to sync dirty data
and then add to lru list to wait reclaim.

The purpose of doing this is to keep compatible behavior with original one,
because without this series, dropping overlayfs inode will not trigger syncing
underlying dirty inode.


Thanks,
Chengguang

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH v4 7/9] ovl: cache dirty overlayfs' inode
  2021-04-13  2:14     ` Chengguang Xu
@ 2021-04-13  8:43       ` Miklos Szeredi
  2021-04-14  5:53         ` Chengguang Xu
  0 siblings, 1 reply; 24+ messages in thread
From: Miklos Szeredi @ 2021-04-13  8:43 UTC (permalink / raw)
  To: Chengguang Xu; +Cc: Jan Kara, Amir Goldstein, overlayfs, linux-fsdevel

On Tue, Apr 13, 2021 at 4:14 AM Chengguang Xu <cgxu519@mykernel.net> wrote:
>
>  ---- 在 星期五, 2021-04-09 21:50:35 Miklos Szeredi <miklos@szeredi.hu> 撰写 ----
>  > On Fri, Nov 13, 2020 at 7:57 AM Chengguang Xu <cgxu519@mykernel.net> wrote:
>  > >
>  > > Now drop overlayfs' inode will sync dirty data,
>  > > so we change to only drop clean inode.
>  >
>  > I don't understand what happens here.  Please add more explanation.
>
> In iput_final(), clean overlayfs inode will directly drop as the same as before,
> dirty overlayfs inode will keep in the cache to wait writeback to sync dirty data
> and then add to lru list to wait reclaim.
>
> The purpose of doing this is to keep compatible behavior with original one,
> because without this series, dropping overlayfs inode will not trigger syncing
> underlying dirty inode.

I get it now.  Can you please update the patch header with this description?

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH v4 7/9] ovl: cache dirty overlayfs' inode
  2021-04-13  8:43       ` Miklos Szeredi
@ 2021-04-14  5:53         ` Chengguang Xu
  0 siblings, 0 replies; 24+ messages in thread
From: Chengguang Xu @ 2021-04-14  5:53 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Jan Kara, Amir Goldstein, overlayfs, linux-fsdevel

 ---- 在 星期二, 2021-04-13 16:43:27 Miklos Szeredi <miklos@szeredi.hu> 撰写 ----
 > On Tue, Apr 13, 2021 at 4:14 AM Chengguang Xu <cgxu519@mykernel.net> wrote:
 > >
 > >  ---- 在 星期五, 2021-04-09 21:50:35 Miklos Szeredi <miklos@szeredi.hu> 撰写 ----
 > >  > On Fri, Nov 13, 2020 at 7:57 AM Chengguang Xu <cgxu519@mykernel.net> wrote:
 > >  > >
 > >  > > Now drop overlayfs' inode will sync dirty data,
 > >  > > so we change to only drop clean inode.
 > >  >
 > >  > I don't understand what happens here.  Please add more explanation.
 > >
 > > In iput_final(), clean overlayfs inode will directly drop as the same as before,
 > > dirty overlayfs inode will keep in the cache to wait writeback to sync dirty data
 > > and then add to lru list to wait reclaim.
 > >
 > > The purpose of doing this is to keep compatible behavior with original one,
 > > because without this series, dropping overlayfs inode will not trigger syncing
 > > underlying dirty inode.
 > 
 > I get it now.  Can you please update the patch header with this description?
 > 

No problem, I'll update commit log in next version.

Thanks,
Chengguang


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [RFC PATCH v4 6/9] ovl: implement overlayfs' ->write_inode operation
  2021-04-09 13:49   ` Miklos Szeredi
@ 2021-04-14  6:14     ` Chengguang Xu
  0 siblings, 0 replies; 24+ messages in thread
From: Chengguang Xu @ 2021-04-14  6:14 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Jan Kara, Amir Goldstein, overlayfs, linux-fsdevel

---- 在 星期五, 2021-04-09 21:49:07 Miklos Szeredi <miklos@szeredi.hu> 撰写 ----
 > On Fri, Nov 13, 2020 at 7:57 AM Chengguang Xu <cgxu519@mykernel.net> wrote:
 > >
 > > Implement overlayfs' ->write_inode to sync dirty data
 > > and redirty overlayfs' inode if necessary.
 > >
 > > Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
 > > ---
 > >  fs/overlayfs/super.c | 30 ++++++++++++++++++++++++++++++
 > >  1 file changed, 30 insertions(+)
 > >
 > > diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
 > > index 883172ac8a12..82e001b97f38 100644
 > > --- a/fs/overlayfs/super.c
 > > +++ b/fs/overlayfs/super.c
 > > @@ -390,6 +390,35 @@ static int ovl_remount(struct super_block *sb, int *flags, char *data)
 > >         return ret;
 > >  }
 > >
 > > +static int ovl_write_inode(struct inode *inode,
 > > +                          struct writeback_control *wbc)
 > > +{
 > > +       struct ovl_fs *ofs = inode->i_sb->s_fs_info;
 > > +       struct inode *upper = ovl_inode_upper(inode);
 > > +       unsigned long iflag = 0;
 > > +       int ret = 0;
 > > +
 > > +       if (!upper)
 > > +               return 0;
 > > +
 > > +       if (!ovl_should_sync(ofs))
 > > +               return 0;
 > > +
 > > +       if (upper->i_sb->s_op->write_inode)
 > > +               ret = upper->i_sb->s_op->write_inode(inode, wbc);
 > > +
 > > +       if (mapping_writably_mapped(upper->i_mapping) ||
 > > +           mapping_tagged(upper->i_mapping, PAGECACHE_TAG_WRITEBACK))
 > > +               iflag |= I_DIRTY_PAGES;
 > > +
 > > +       iflag |= upper->i_state & I_DIRTY_ALL;
 > 
 > How is I_DIRTY_SYNC added/removed from the overlay inode?
 > 

generally, I_DIRTY_SYNC is added to overlay inode when the operation dirties
upper inode, I'll check all those places and call ovl_mark_inode_dirty() to do it.
After writeback if upper inode becomes clean status, we will remove I_DIRTY_SYNC
from overlay inode.

One exception is mmaped file, it will always keep I_DIRTY_SYNC until to be evicted.

Thanks,
Chengguang




^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, back to index

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-11-13  6:55 [RFC PATCH v4 0/9] implement containerized syncfs for overlayfs Chengguang Xu
2020-11-13  6:55 ` [RFC PATCH v4 1/9] ovl: setup overlayfs' private bdi Chengguang Xu
2020-11-13  6:55 ` [RFC PATCH v4 2/9] ovl: implement ->writepages operation Chengguang Xu
2020-11-13  6:55 ` [RFC PATCH v4 3/9] ovl: implement overlayfs' ->evict_inode operation Chengguang Xu
2020-11-13  6:55 ` [RFC PATCH v4 4/9] ovl: mark overlayfs' inode dirty on modification Chengguang Xu
2021-04-09 13:45   ` Miklos Szeredi
2021-04-12 11:58     ` Chengguang Xu
2020-11-13  6:55 ` [RFC PATCH v4 5/9] ovl: mark overlayfs' inode dirty on shared mmap Chengguang Xu
2020-11-13  6:55 ` [RFC PATCH v4 6/9] ovl: implement overlayfs' ->write_inode operation Chengguang Xu
2021-04-09 13:49   ` Miklos Szeredi
2021-04-14  6:14     ` Chengguang Xu
2020-11-13  6:55 ` [RFC PATCH v4 7/9] ovl: cache dirty overlayfs' inode Chengguang Xu
2021-04-09 13:50   ` Miklos Szeredi
2021-04-13  2:14     ` Chengguang Xu
2021-04-13  8:43       ` Miklos Szeredi
2021-04-14  5:53         ` Chengguang Xu
2020-11-13  6:55 ` [RFC PATCH v4 8/9] fs: export wait_sb_inodes() Chengguang Xu
2020-11-13  6:55 ` [RFC PATCH v4 9/9] ovl: implement containerized syncfs for overlayfs Chengguang Xu
2021-04-09 13:51   ` Miklos Szeredi
2021-04-12 12:24     ` Chengguang Xu
2021-04-12 14:22       ` Miklos Szeredi
2020-12-04 14:49 ` 回复:[RFC PATCH v4 0/9] " Chengguang Xu
2020-12-04 15:02   ` [RFC " Miklos Szeredi
2020-12-04 16:40   ` 回复:[RFC " Jan Kara

($INBOX_DIR/description missing)

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-unionfs/0 linux-unionfs/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-unionfs linux-unionfs/ https://lore.kernel.org/linux-unionfs \
		linux-unionfs@vger.kernel.org
	public-inbox-index linux-unionfs

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-unionfs


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git