linux-unionfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH V6 0/7] implement containerized syncfs for overlayfs
@ 2021-11-22  3:00 Chengguang Xu
  2021-11-22  3:00 ` [RFC PATCH V6 1/7] ovl: setup overlayfs' private bdi Chengguang Xu
                   ` (7 more replies)
  0 siblings, 8 replies; 21+ messages in thread
From: Chengguang Xu @ 2021-11-22  3:00 UTC (permalink / raw)
  To: miklos, jack, amir73il
  Cc: linux-unionfs, linux-fsdevel, linux-kernel, Chengguang Xu

From: Chengguang Xu <charliecgxu@tencent.com>

Current syncfs(2) syscall on overlayfs just calls sync_filesystem()
on upper_sb to synchronize whole dirty inodes in upper filesystem
regardless of the overlay ownership of the inode. In the use case of
container, when multiple containers using the same underlying upper
filesystem, it has some shortcomings as below.

(1) Performance
Synchronization is probably heavy because it actually syncs unnecessary
inodes for target overlayfs.

(2) Interference
Unplanned synchronization will probably impact IO performance of
unrelated container processes on the other overlayfs.

This series try to implement containerized syncfs for overlayfs so that
only sync target dirty upper inodes which are belong to specific overlayfs
instance. By doing this, it is able to reduce cost of synchronization and 
will not seriously impact IO performance of unrelated processes.

v1->v2:
- Mark overlayfs' inode dirty itself instead of adding notification
mechanism to vfs inode.

v2->v3:
- Introduce overlayfs' extra syncfs wait list to wait target upper inodes
in ->sync_fs.

v3->v4:
- Using wait_sb_inodes() to wait syncing upper inodes.
- Mark overlay inode dirty only when having upper inode and VM_SHARED
flag in ovl_mmap().
- Check upper i_state after checking upper mmap state
in ovl_write_inode.

v4->v5:
- Add underlying inode dirtiness check after mnt_drop_write().
- Handle both wait/no-wait mode of syncfs(2) in overlayfs' ->sync_fs().

v5->v6:
- Rebase to latest overlayfs-next tree.
- Mark oerlay inode dirty when it has upper instead of marking dirty on
  modification.
- Trigger dirty page writeback in overlayfs' ->write_inode().
- Mark overlay inode 'DONTCACHE' flag.
- Delete overlayfs' ->writepages() and ->evict_inode() operations.

Chengguang Xu (7):
  ovl: setup overlayfs' private bdi
  ovl: mark overlayfs inode dirty when it has upper
  ovl: implement overlayfs' own ->write_inode operation
  ovl: set 'DONTCACHE' flag for overlayfs inode
  fs: export wait_sb_inodes()
  ovl: introduce ovl_sync_upper_blockdev()
  ovl: implement containerized syncfs for overlayfs

 fs/fs-writeback.c         |  3 ++-
 fs/overlayfs/inode.c      |  5 +++-
 fs/overlayfs/super.c      | 49 ++++++++++++++++++++++++++++++++-------
 fs/overlayfs/util.c       |  1 +
 include/linux/writeback.h |  1 +
 5 files changed, 48 insertions(+), 11 deletions(-)

-- 
2.27.0



^ permalink raw reply	[flat|nested] 21+ messages in thread

* [RFC PATCH V6 1/7] ovl: setup overlayfs' private bdi
  2021-11-22  3:00 [RFC PATCH V6 0/7] implement containerized syncfs for overlayfs Chengguang Xu
@ 2021-11-22  3:00 ` Chengguang Xu
  2021-11-26  8:51   ` Jan Kara
  2021-11-22  3:00 ` [RFC PATCH V6 2/7] ovl: mark overlayfs inode dirty when it has upper Chengguang Xu
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 21+ messages in thread
From: Chengguang Xu @ 2021-11-22  3:00 UTC (permalink / raw)
  To: miklos, jack, amir73il
  Cc: linux-unionfs, linux-fsdevel, linux-kernel, Chengguang Xu

From: Chengguang Xu <charliecgxu@tencent.com>

Setup overlayfs' private bdi so that we can collect
overlayfs' own dirty inodes.

Signed-off-by: Chengguang Xu <charliecgxu@tencent.com>
---
 fs/overlayfs/super.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 265181c110ae..18a12088a37b 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -1984,6 +1984,10 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 	if (!ofs)
 		goto out;
 
+	err = super_setup_bdi(sb);
+	if (err)
+		goto out_err;
+
 	err = -ENOMEM;
 	ofs->creator_cred = cred = prepare_creds();
 	if (!cred)
-- 
2.27.0



^ permalink raw reply	[flat|nested] 21+ messages in thread

* [RFC PATCH V6 2/7] ovl: mark overlayfs inode dirty when it has upper
  2021-11-22  3:00 [RFC PATCH V6 0/7] implement containerized syncfs for overlayfs Chengguang Xu
  2021-11-22  3:00 ` [RFC PATCH V6 1/7] ovl: setup overlayfs' private bdi Chengguang Xu
@ 2021-11-22  3:00 ` Chengguang Xu
  2021-11-26  9:10   ` Jan Kara
  2021-11-22  3:00 ` [RFC PATCH V6 3/7] ovl: implement overlayfs' own ->write_inode operation Chengguang Xu
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 21+ messages in thread
From: Chengguang Xu @ 2021-11-22  3:00 UTC (permalink / raw)
  To: miklos, jack, amir73il
  Cc: linux-unionfs, linux-fsdevel, linux-kernel, Chengguang Xu

From: Chengguang Xu <charliecgxu@tencent.com>

We simply mark overlayfs inode dirty when it has upper,
it's much simpler than mark dirtiness on modification.

Signed-off-by: Chengguang Xu <charliecgxu@tencent.com>
---
 fs/overlayfs/inode.c | 4 +++-
 fs/overlayfs/util.c  | 1 +
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 1f36158c7dbe..027ffc0a2539 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -778,8 +778,10 @@ void ovl_inode_init(struct inode *inode, struct ovl_inode_params *oip,
 {
 	struct inode *realinode;
 
-	if (oip->upperdentry)
+	if (oip->upperdentry) {
 		OVL_I(inode)->__upperdentry = oip->upperdentry;
+		mark_inode_dirty(inode);
+	}
 	if (oip->lowerpath && oip->lowerpath->dentry)
 		OVL_I(inode)->lower = igrab(d_inode(oip->lowerpath->dentry));
 	if (oip->lowerdata)
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index f48284a2a896..a1922af32a13 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -421,6 +421,7 @@ void ovl_inode_update(struct inode *inode, struct dentry *upperdentry)
 		inode->i_private = upperinode;
 		__insert_inode_hash(inode, (unsigned long) upperinode);
 	}
+	mark_inode_dirty(inode);
 }
 
 static void ovl_dir_version_inc(struct dentry *dentry, bool impurity)
-- 
2.27.0



^ permalink raw reply	[flat|nested] 21+ messages in thread

* [RFC PATCH V6 3/7] ovl: implement overlayfs' own ->write_inode operation
  2021-11-22  3:00 [RFC PATCH V6 0/7] implement containerized syncfs for overlayfs Chengguang Xu
  2021-11-22  3:00 ` [RFC PATCH V6 1/7] ovl: setup overlayfs' private bdi Chengguang Xu
  2021-11-22  3:00 ` [RFC PATCH V6 2/7] ovl: mark overlayfs inode dirty when it has upper Chengguang Xu
@ 2021-11-22  3:00 ` Chengguang Xu
  2021-11-26  9:14   ` Jan Kara
  2021-11-22  3:00 ` [RFC PATCH V6 4/7] ovl: set 'DONTCACHE' flag for overlayfs inode Chengguang Xu
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 21+ messages in thread
From: Chengguang Xu @ 2021-11-22  3:00 UTC (permalink / raw)
  To: miklos, jack, amir73il
  Cc: linux-unionfs, linux-fsdevel, linux-kernel, Chengguang Xu

From: Chengguang Xu <charliecgxu@tencent.com>

Sync dirty data and meta of upper inode in overlayfs' ->write_inode()
and redirty overlayfs' inode.

Signed-off-by: Chengguang Xu <charliecgxu@tencent.com>
---
 fs/overlayfs/super.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 18a12088a37b..12acf0ec7395 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -15,6 +15,7 @@
 #include <linux/seq_file.h>
 #include <linux/posix_acl_xattr.h>
 #include <linux/exportfs.h>
+#include <linux/writeback.h>
 #include "overlayfs.h"
 
 MODULE_AUTHOR("Miklos Szeredi <miklos@szeredi.hu>");
@@ -406,12 +407,32 @@ static int ovl_remount(struct super_block *sb, int *flags, char *data)
 	return ret;
 }
 
+static int ovl_write_inode(struct inode *inode,
+			   struct writeback_control *wbc)
+{
+	struct ovl_fs *ofs = inode->i_sb->s_fs_info;
+	struct inode *upper_inode = ovl_inode_upper(inode);
+	int ret = 0;
+
+	if (!upper_inode)
+		return 0;
+
+	if (!ovl_should_sync(ofs))
+		return 0;
+
+	ret = write_inode_now(upper_inode, wbc->sync_mode == WB_SYNC_ALL);
+	mark_inode_dirty(inode);
+
+	return ret;
+}
+
 static const struct super_operations ovl_super_operations = {
 	.alloc_inode	= ovl_alloc_inode,
 	.free_inode	= ovl_free_inode,
 	.destroy_inode	= ovl_destroy_inode,
 	.drop_inode	= generic_delete_inode,
 	.put_super	= ovl_put_super,
+	.write_inode	= ovl_write_inode,
 	.sync_fs	= ovl_sync_fs,
 	.statfs		= ovl_statfs,
 	.show_options	= ovl_show_options,
-- 
2.27.0



^ permalink raw reply	[flat|nested] 21+ messages in thread

* [RFC PATCH V6 4/7] ovl: set 'DONTCACHE' flag for overlayfs inode
  2021-11-22  3:00 [RFC PATCH V6 0/7] implement containerized syncfs for overlayfs Chengguang Xu
                   ` (2 preceding siblings ...)
  2021-11-22  3:00 ` [RFC PATCH V6 3/7] ovl: implement overlayfs' own ->write_inode operation Chengguang Xu
@ 2021-11-22  3:00 ` Chengguang Xu
  2021-11-26  9:20   ` Jan Kara
  2021-11-22  3:00 ` [RFC PATCH V6 5/7] fs: export wait_sb_inodes() Chengguang Xu
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 21+ messages in thread
From: Chengguang Xu @ 2021-11-22  3:00 UTC (permalink / raw)
  To: miklos, jack, amir73il
  Cc: linux-unionfs, linux-fsdevel, linux-kernel, Chengguang Xu

From: Chengguang Xu <charliecgxu@tencent.com>

Set 'DONTCACHE' flag to overlayfs inode so that
upper inode to be always synced before eviction.

Signed-off-by: Chengguang Xu <charliecgxu@tencent.com>
---
 fs/overlayfs/inode.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 027ffc0a2539..c4472299d5df 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -791,6 +791,7 @@ void ovl_inode_init(struct inode *inode, struct ovl_inode_params *oip,
 	ovl_copyattr(realinode, inode);
 	ovl_copyflags(realinode, inode);
 	ovl_map_ino(inode, ino, fsid);
+	d_mark_dontcache(inode);
 }
 
 static void ovl_fill_inode(struct inode *inode, umode_t mode, dev_t rdev)
-- 
2.27.0



^ permalink raw reply	[flat|nested] 21+ messages in thread

* [RFC PATCH V6 5/7] fs: export wait_sb_inodes()
  2021-11-22  3:00 [RFC PATCH V6 0/7] implement containerized syncfs for overlayfs Chengguang Xu
                   ` (3 preceding siblings ...)
  2021-11-22  3:00 ` [RFC PATCH V6 4/7] ovl: set 'DONTCACHE' flag for overlayfs inode Chengguang Xu
@ 2021-11-22  3:00 ` Chengguang Xu
  2021-11-26  9:20   ` Jan Kara
  2021-11-22  3:00 ` [RFC PATCH V6 6/7] ovl: introduce ovl_sync_upper_blockdev() Chengguang Xu
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 21+ messages in thread
From: Chengguang Xu @ 2021-11-22  3:00 UTC (permalink / raw)
  To: miklos, jack, amir73il
  Cc: linux-unionfs, linux-fsdevel, linux-kernel, Chengguang Xu

From: Chengguang Xu <charliecgxu@tencent.com>

In order to wait syncing upper inodes we need to
call wait_sb_inodes() in overlayfs' ->sync_fs.

Signed-off-by: Chengguang Xu <charliecgxu@tencent.com>
---
 fs/fs-writeback.c         | 3 ++-
 include/linux/writeback.h | 1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 81ec192ce067..0438c911241e 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -2505,7 +2505,7 @@ EXPORT_SYMBOL(__mark_inode_dirty);
  * completed by the time we have gained the lock and waited for all IO that is
  * in progress regardless of the order callers are granted the lock.
  */
-static void wait_sb_inodes(struct super_block *sb)
+void wait_sb_inodes(struct super_block *sb)
 {
 	LIST_HEAD(sync_list);
 
@@ -2589,6 +2589,7 @@ static void wait_sb_inodes(struct super_block *sb)
 	rcu_read_unlock();
 	mutex_unlock(&sb->s_sync_lock);
 }
+EXPORT_SYMBOL(wait_sb_inodes);
 
 static void __writeback_inodes_sb_nr(struct super_block *sb, unsigned long nr,
 				     enum wb_reason reason, bool skip_if_busy)
diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index d1f65adf6a26..d7aacd0434cf 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -198,6 +198,7 @@ void wakeup_flusher_threads_bdi(struct backing_dev_info *bdi,
 				enum wb_reason reason);
 void inode_wait_for_writeback(struct inode *inode);
 void inode_io_list_del(struct inode *inode);
+void wait_sb_inodes(struct super_block *sb);
 
 /* writeback.h requires fs.h; it, too, is not included from here. */
 static inline void wait_on_inode(struct inode *inode)
-- 
2.27.0



^ permalink raw reply	[flat|nested] 21+ messages in thread

* [RFC PATCH V6 6/7] ovl: introduce ovl_sync_upper_blockdev()
  2021-11-22  3:00 [RFC PATCH V6 0/7] implement containerized syncfs for overlayfs Chengguang Xu
                   ` (4 preceding siblings ...)
  2021-11-22  3:00 ` [RFC PATCH V6 5/7] fs: export wait_sb_inodes() Chengguang Xu
@ 2021-11-22  3:00 ` Chengguang Xu
  2021-11-26  9:21   ` Jan Kara
  2021-11-22  3:00 ` [RFC PATCH V6 7/7] ovl: implement containerized syncfs for overlayfs Chengguang Xu
  2021-11-27  9:26 ` 回复:[RFC PATCH V6 0/7] " Chengguang Xu
  7 siblings, 1 reply; 21+ messages in thread
From: Chengguang Xu @ 2021-11-22  3:00 UTC (permalink / raw)
  To: miklos, jack, amir73il
  Cc: linux-unionfs, linux-fsdevel, linux-kernel, Chengguang Xu

From: Chengguang Xu <charliecgxu@tencent.com>

Introduce new helper ovl_sync_upper_blockdev() to sync
upper blockdev.

Signed-off-by: Chengguang Xu <charliecgxu@tencent.com>
---
 fs/overlayfs/super.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 12acf0ec7395..ccffcd96491d 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -258,6 +258,16 @@ static void ovl_put_super(struct super_block *sb)
 	ovl_free_fs(ofs);
 }
 
+static int ovl_sync_upper_blockdev(struct super_block *sb, int wait)
+{
+	if (!sb->s_bdev)
+		return 0;
+
+	if (!wait)
+		return filemap_flush(sb->s_bdev->bd_inode->i_mapping);
+	return filemap_write_and_wait_range(sb->s_bdev->bd_inode->i_mapping, 0, LLONG_MAX);
+}
+
 /* Sync real dirty inodes in upper filesystem (if it exists) */
 static int ovl_sync_fs(struct super_block *sb, int wait)
 {
-- 
2.27.0



^ permalink raw reply	[flat|nested] 21+ messages in thread

* [RFC PATCH V6 7/7] ovl: implement containerized syncfs for overlayfs
  2021-11-22  3:00 [RFC PATCH V6 0/7] implement containerized syncfs for overlayfs Chengguang Xu
                   ` (5 preceding siblings ...)
  2021-11-22  3:00 ` [RFC PATCH V6 6/7] ovl: introduce ovl_sync_upper_blockdev() Chengguang Xu
@ 2021-11-22  3:00 ` Chengguang Xu
  2021-11-22  7:40   ` Amir Goldstein
  2021-11-26  9:25   ` Jan Kara
  2021-11-27  9:26 ` 回复:[RFC PATCH V6 0/7] " Chengguang Xu
  7 siblings, 2 replies; 21+ messages in thread
From: Chengguang Xu @ 2021-11-22  3:00 UTC (permalink / raw)
  To: miklos, jack, amir73il
  Cc: linux-unionfs, linux-fsdevel, linux-kernel, Chengguang Xu

From: Chengguang Xu <charliecgxu@tencent.com>

Now overlayfs can only sync own dirty inodes during syncfs,
so remove unnecessary sync_filesystem() on upper file system.

Signed-off-by: Chengguang Xu <charliecgxu@tencent.com>
---
 fs/overlayfs/super.c | 14 +++++---------
 1 file changed, 5 insertions(+), 9 deletions(-)

diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index ccffcd96491d..213b795a6a86 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -292,18 +292,14 @@ static int ovl_sync_fs(struct super_block *sb, int wait)
 	/*
 	 * Not called for sync(2) call or an emergency sync (SB_I_SKIP_SYNC).
 	 * All the super blocks will be iterated, including upper_sb.
-	 *
-	 * If this is a syncfs(2) call, then we do need to call
-	 * sync_filesystem() on upper_sb, but enough if we do it when being
-	 * called with wait == 1.
 	 */
-	if (!wait)
-		return 0;
-
 	upper_sb = ovl_upper_mnt(ofs)->mnt_sb;
-
 	down_read(&upper_sb->s_umount);
-	ret = sync_filesystem(upper_sb);
+	if (wait)
+		wait_sb_inodes(upper_sb);
+	if (upper_sb->s_op->sync_fs)
+		upper_sb->s_op->sync_fs(upper_sb, wait);
+	ret = ovl_sync_upper_blockdev(upper_sb, wait);
 	up_read(&upper_sb->s_umount);
 
 	return ret;
-- 
2.27.0



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH V6 7/7] ovl: implement containerized syncfs for overlayfs
  2021-11-22  3:00 ` [RFC PATCH V6 7/7] ovl: implement containerized syncfs for overlayfs Chengguang Xu
@ 2021-11-22  7:40   ` Amir Goldstein
  2021-11-26  5:03     ` Chengguang Xu
  2021-11-26  9:25   ` Jan Kara
  1 sibling, 1 reply; 21+ messages in thread
From: Amir Goldstein @ 2021-11-22  7:40 UTC (permalink / raw)
  To: Chengguang Xu
  Cc: Miklos Szeredi, Jan Kara, overlayfs, linux-fsdevel, linux-kernel,
	Chengguang Xu

On Mon, Nov 22, 2021 at 5:01 AM Chengguang Xu <cgxu519@mykernel.net> wrote:
>
> From: Chengguang Xu <charliecgxu@tencent.com>
>
> Now overlayfs can only sync own dirty inodes during syncfs,
> so remove unnecessary sync_filesystem() on upper file system.
>
> Signed-off-by: Chengguang Xu <charliecgxu@tencent.com>
> ---
>  fs/overlayfs/super.c | 14 +++++---------
>  1 file changed, 5 insertions(+), 9 deletions(-)
>
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index ccffcd96491d..213b795a6a86 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -292,18 +292,14 @@ static int ovl_sync_fs(struct super_block *sb, int wait)
>         /*
>          * Not called for sync(2) call or an emergency sync (SB_I_SKIP_SYNC).
>          * All the super blocks will be iterated, including upper_sb.
> -        *
> -        * If this is a syncfs(2) call, then we do need to call
> -        * sync_filesystem() on upper_sb, but enough if we do it when being
> -        * called with wait == 1.
>          */
> -       if (!wait)
> -               return 0;
> -
>         upper_sb = ovl_upper_mnt(ofs)->mnt_sb;
> -
>         down_read(&upper_sb->s_umount);
> -       ret = sync_filesystem(upper_sb);
> +       if (wait)
> +               wait_sb_inodes(upper_sb);
> +       if (upper_sb->s_op->sync_fs)
> +               upper_sb->s_op->sync_fs(upper_sb, wait);
> +       ret = ovl_sync_upper_blockdev(upper_sb, wait);

I think it will be cleaner to use a helper ovl_sync_upper_filesystem()
with everything from  upper_sb = ... and a comment to explain that
this is a variant of __sync_filesystem() where all the dirty inodes write
have already been started.

Thanks,
Amir.

P.S. I like this "stoopid proof" v6 because I can understand it ;-)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH V6 7/7] ovl: implement containerized syncfs for overlayfs
  2021-11-22  7:40   ` Amir Goldstein
@ 2021-11-26  5:03     ` Chengguang Xu
  0 siblings, 0 replies; 21+ messages in thread
From: Chengguang Xu @ 2021-11-26  5:03 UTC (permalink / raw)
  To: Amir Goldstein
  Cc: Miklos Szeredi, Jan Kara, overlayfs, linux-fsdevel, linux-kernel,
	Chengguang Xu

 ---- 在 星期一, 2021-11-22 15:40:59 Amir Goldstein <amir73il@gmail.com> 撰写 ----
 > On Mon, Nov 22, 2021 at 5:01 AM Chengguang Xu <cgxu519@mykernel.net> wrote:
 > >
 > > From: Chengguang Xu <charliecgxu@tencent.com>
 > >
 > > Now overlayfs can only sync own dirty inodes during syncfs,
 > > so remove unnecessary sync_filesystem() on upper file system.
 > >
 > > Signed-off-by: Chengguang Xu <charliecgxu@tencent.com>
 > > ---
 > >  fs/overlayfs/super.c | 14 +++++---------
 > >  1 file changed, 5 insertions(+), 9 deletions(-)
 > >
 > > diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
 > > index ccffcd96491d..213b795a6a86 100644
 > > --- a/fs/overlayfs/super.c
 > > +++ b/fs/overlayfs/super.c
 > > @@ -292,18 +292,14 @@ static int ovl_sync_fs(struct super_block *sb, int wait)
 > >         /*
 > >          * Not called for sync(2) call or an emergency sync (SB_I_SKIP_SYNC).
 > >          * All the super blocks will be iterated, including upper_sb.
 > > -        *
 > > -        * If this is a syncfs(2) call, then we do need to call
 > > -        * sync_filesystem() on upper_sb, but enough if we do it when being
 > > -        * called with wait == 1.
 > >          */
 > > -       if (!wait)
 > > -               return 0;
 > > -
 > >         upper_sb = ovl_upper_mnt(ofs)->mnt_sb;
 > > -
 > >         down_read(&upper_sb->s_umount);
 > > -       ret = sync_filesystem(upper_sb);
 > > +       if (wait)
 > > +               wait_sb_inodes(upper_sb);
 > > +       if (upper_sb->s_op->sync_fs)
 > > +               upper_sb->s_op->sync_fs(upper_sb, wait);
 > > +       ret = ovl_sync_upper_blockdev(upper_sb, wait);
 > 
 > I think it will be cleaner to use a helper ovl_sync_upper_filesystem()
 > with everything from  upper_sb = ... and a comment to explain that
 > this is a variant of __sync_filesystem() where all the dirty inodes write
 > have already been started.
 > 
 
I agree with you. 

Thanks,
Chengguang

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH V6 1/7] ovl: setup overlayfs' private bdi
  2021-11-22  3:00 ` [RFC PATCH V6 1/7] ovl: setup overlayfs' private bdi Chengguang Xu
@ 2021-11-26  8:51   ` Jan Kara
  0 siblings, 0 replies; 21+ messages in thread
From: Jan Kara @ 2021-11-26  8:51 UTC (permalink / raw)
  To: Chengguang Xu
  Cc: miklos, jack, amir73il, linux-unionfs, linux-fsdevel,
	linux-kernel, Chengguang Xu

On Mon 22-11-21 11:00:32, Chengguang Xu wrote:
> From: Chengguang Xu <charliecgxu@tencent.com>
> 
> Setup overlayfs' private bdi so that we can collect
> overlayfs' own dirty inodes.
> 
> Signed-off-by: Chengguang Xu <charliecgxu@tencent.com>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/overlayfs/super.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index 265181c110ae..18a12088a37b 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -1984,6 +1984,10 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>  	if (!ofs)
>  		goto out;
>  
> +	err = super_setup_bdi(sb);
> +	if (err)
> +		goto out_err;
> +
>  	err = -ENOMEM;
>  	ofs->creator_cred = cred = prepare_creds();
>  	if (!cred)
> -- 
> 2.27.0
> 
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH V6 2/7] ovl: mark overlayfs inode dirty when it has upper
  2021-11-22  3:00 ` [RFC PATCH V6 2/7] ovl: mark overlayfs inode dirty when it has upper Chengguang Xu
@ 2021-11-26  9:10   ` Jan Kara
  2021-11-26 13:06     ` Chengguang Xu
  0 siblings, 1 reply; 21+ messages in thread
From: Jan Kara @ 2021-11-26  9:10 UTC (permalink / raw)
  To: Chengguang Xu
  Cc: miklos, jack, amir73il, linux-unionfs, linux-fsdevel,
	linux-kernel, Chengguang Xu

On Mon 22-11-21 11:00:33, Chengguang Xu wrote:
> From: Chengguang Xu <charliecgxu@tencent.com>
> 
> We simply mark overlayfs inode dirty when it has upper,
> it's much simpler than mark dirtiness on modification.
> 
> Signed-off-by: Chengguang Xu <charliecgxu@tencent.com>
> ---
>  fs/overlayfs/inode.c | 4 +++-
>  fs/overlayfs/util.c  | 1 +
>  2 files changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
> index 1f36158c7dbe..027ffc0a2539 100644
> --- a/fs/overlayfs/inode.c
> +++ b/fs/overlayfs/inode.c
> @@ -778,8 +778,10 @@ void ovl_inode_init(struct inode *inode, struct ovl_inode_params *oip,
>  {
>  	struct inode *realinode;
>  
> -	if (oip->upperdentry)
> +	if (oip->upperdentry) {
>  		OVL_I(inode)->__upperdentry = oip->upperdentry;
> +		mark_inode_dirty(inode);
> +	}
>  	if (oip->lowerpath && oip->lowerpath->dentry)
>  		OVL_I(inode)->lower = igrab(d_inode(oip->lowerpath->dentry));
>  	if (oip->lowerdata)

Hum, does this get called only for inodes with upper inode existing? I
suppose we do not need to track inodes that were not copied up because they
cannot be dirty? I'm sorry, my knowledge of overlayfs is rather limited so
I may be missing something basic.



								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH V6 3/7] ovl: implement overlayfs' own ->write_inode operation
  2021-11-22  3:00 ` [RFC PATCH V6 3/7] ovl: implement overlayfs' own ->write_inode operation Chengguang Xu
@ 2021-11-26  9:14   ` Jan Kara
  2021-11-26 13:09     ` Chengguang Xu
  0 siblings, 1 reply; 21+ messages in thread
From: Jan Kara @ 2021-11-26  9:14 UTC (permalink / raw)
  To: Chengguang Xu
  Cc: miklos, jack, amir73il, linux-unionfs, linux-fsdevel,
	linux-kernel, Chengguang Xu

On Mon 22-11-21 11:00:34, Chengguang Xu wrote:
> From: Chengguang Xu <charliecgxu@tencent.com>
> 
> Sync dirty data and meta of upper inode in overlayfs' ->write_inode()
> and redirty overlayfs' inode.
> 
> Signed-off-by: Chengguang Xu <charliecgxu@tencent.com>

Looks good. I'm still not 100% convinced keeping inodes dirty all the time
will not fire back in excessive writeback activity (e.g. flush worker will
traverse the list of all dirty inodes every 30 seconds and try to write
them) but I guess we can start with this and if people complain, dirtiness
handling can be improved. So feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/overlayfs/super.c | 21 +++++++++++++++++++++
>  1 file changed, 21 insertions(+)
> 
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index 18a12088a37b..12acf0ec7395 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -15,6 +15,7 @@
>  #include <linux/seq_file.h>
>  #include <linux/posix_acl_xattr.h>
>  #include <linux/exportfs.h>
> +#include <linux/writeback.h>
>  #include "overlayfs.h"
>  
>  MODULE_AUTHOR("Miklos Szeredi <miklos@szeredi.hu>");
> @@ -406,12 +407,32 @@ static int ovl_remount(struct super_block *sb, int *flags, char *data)
>  	return ret;
>  }
>  
> +static int ovl_write_inode(struct inode *inode,
> +			   struct writeback_control *wbc)
> +{
> +	struct ovl_fs *ofs = inode->i_sb->s_fs_info;
> +	struct inode *upper_inode = ovl_inode_upper(inode);
> +	int ret = 0;
> +
> +	if (!upper_inode)
> +		return 0;
> +
> +	if (!ovl_should_sync(ofs))
> +		return 0;
> +
> +	ret = write_inode_now(upper_inode, wbc->sync_mode == WB_SYNC_ALL);
> +	mark_inode_dirty(inode);
> +
> +	return ret;
> +}
> +
>  static const struct super_operations ovl_super_operations = {
>  	.alloc_inode	= ovl_alloc_inode,
>  	.free_inode	= ovl_free_inode,
>  	.destroy_inode	= ovl_destroy_inode,
>  	.drop_inode	= generic_delete_inode,
>  	.put_super	= ovl_put_super,
> +	.write_inode	= ovl_write_inode,
>  	.sync_fs	= ovl_sync_fs,
>  	.statfs		= ovl_statfs,
>  	.show_options	= ovl_show_options,
> -- 
> 2.27.0
> 
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH V6 4/7] ovl: set 'DONTCACHE' flag for overlayfs inode
  2021-11-22  3:00 ` [RFC PATCH V6 4/7] ovl: set 'DONTCACHE' flag for overlayfs inode Chengguang Xu
@ 2021-11-26  9:20   ` Jan Kara
  0 siblings, 0 replies; 21+ messages in thread
From: Jan Kara @ 2021-11-26  9:20 UTC (permalink / raw)
  To: Chengguang Xu
  Cc: miklos, jack, amir73il, linux-unionfs, linux-fsdevel,
	linux-kernel, Chengguang Xu

On Mon 22-11-21 11:00:35, Chengguang Xu wrote:
> From: Chengguang Xu <charliecgxu@tencent.com>
> 
> Set 'DONTCACHE' flag to overlayfs inode so that
> upper inode to be always synced before eviction.
> 
> Signed-off-by: Chengguang Xu <charliecgxu@tencent.com>
> ---
>  fs/overlayfs/inode.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
> index 027ffc0a2539..c4472299d5df 100644
> --- a/fs/overlayfs/inode.c
> +++ b/fs/overlayfs/inode.c
> @@ -791,6 +791,7 @@ void ovl_inode_init(struct inode *inode, struct ovl_inode_params *oip,
>  	ovl_copyattr(realinode, inode);
>  	ovl_copyflags(realinode, inode);
>  	ovl_map_ino(inode, ino, fsid);
> +	d_mark_dontcache(inode);
>  }

Doesn't this effectively disable dcache for overlayfs dentries? I mean e.g.
whenever overlayfs file is closed, we will drop its dentry & inode from the
cache. Upper and lower inodes / dentries stay in cache so no disk access
should be needed to reconstruct overlayfs dentry & inode but still it may
be a bit costly? I guess others more familiar with overlayfs have to judge.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH V6 5/7] fs: export wait_sb_inodes()
  2021-11-22  3:00 ` [RFC PATCH V6 5/7] fs: export wait_sb_inodes() Chengguang Xu
@ 2021-11-26  9:20   ` Jan Kara
  0 siblings, 0 replies; 21+ messages in thread
From: Jan Kara @ 2021-11-26  9:20 UTC (permalink / raw)
  To: Chengguang Xu
  Cc: miklos, jack, amir73il, linux-unionfs, linux-fsdevel,
	linux-kernel, Chengguang Xu

On Mon 22-11-21 11:00:36, Chengguang Xu wrote:
> From: Chengguang Xu <charliecgxu@tencent.com>
> 
> In order to wait syncing upper inodes we need to
> call wait_sb_inodes() in overlayfs' ->sync_fs.
> 
> Signed-off-by: Chengguang Xu <charliecgxu@tencent.com>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/fs-writeback.c         | 3 ++-
>  include/linux/writeback.h | 1 +
>  2 files changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index 81ec192ce067..0438c911241e 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -2505,7 +2505,7 @@ EXPORT_SYMBOL(__mark_inode_dirty);
>   * completed by the time we have gained the lock and waited for all IO that is
>   * in progress regardless of the order callers are granted the lock.
>   */
> -static void wait_sb_inodes(struct super_block *sb)
> +void wait_sb_inodes(struct super_block *sb)
>  {
>  	LIST_HEAD(sync_list);
>  
> @@ -2589,6 +2589,7 @@ static void wait_sb_inodes(struct super_block *sb)
>  	rcu_read_unlock();
>  	mutex_unlock(&sb->s_sync_lock);
>  }
> +EXPORT_SYMBOL(wait_sb_inodes);
>  
>  static void __writeback_inodes_sb_nr(struct super_block *sb, unsigned long nr,
>  				     enum wb_reason reason, bool skip_if_busy)
> diff --git a/include/linux/writeback.h b/include/linux/writeback.h
> index d1f65adf6a26..d7aacd0434cf 100644
> --- a/include/linux/writeback.h
> +++ b/include/linux/writeback.h
> @@ -198,6 +198,7 @@ void wakeup_flusher_threads_bdi(struct backing_dev_info *bdi,
>  				enum wb_reason reason);
>  void inode_wait_for_writeback(struct inode *inode);
>  void inode_io_list_del(struct inode *inode);
> +void wait_sb_inodes(struct super_block *sb);
>  
>  /* writeback.h requires fs.h; it, too, is not included from here. */
>  static inline void wait_on_inode(struct inode *inode)
> -- 
> 2.27.0
> 
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH V6 6/7] ovl: introduce ovl_sync_upper_blockdev()
  2021-11-22  3:00 ` [RFC PATCH V6 6/7] ovl: introduce ovl_sync_upper_blockdev() Chengguang Xu
@ 2021-11-26  9:21   ` Jan Kara
  0 siblings, 0 replies; 21+ messages in thread
From: Jan Kara @ 2021-11-26  9:21 UTC (permalink / raw)
  To: Chengguang Xu
  Cc: miklos, jack, amir73il, linux-unionfs, linux-fsdevel,
	linux-kernel, Chengguang Xu

On Mon 22-11-21 11:00:37, Chengguang Xu wrote:
> From: Chengguang Xu <charliecgxu@tencent.com>
> 
> Introduce new helper ovl_sync_upper_blockdev() to sync
> upper blockdev.
> 
> Signed-off-by: Chengguang Xu <charliecgxu@tencent.com>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/overlayfs/super.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index 12acf0ec7395..ccffcd96491d 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -258,6 +258,16 @@ static void ovl_put_super(struct super_block *sb)
>  	ovl_free_fs(ofs);
>  }
>  
> +static int ovl_sync_upper_blockdev(struct super_block *sb, int wait)
> +{
> +	if (!sb->s_bdev)
> +		return 0;
> +
> +	if (!wait)
> +		return filemap_flush(sb->s_bdev->bd_inode->i_mapping);
> +	return filemap_write_and_wait_range(sb->s_bdev->bd_inode->i_mapping, 0, LLONG_MAX);
> +}
> +
>  /* Sync real dirty inodes in upper filesystem (if it exists) */
>  static int ovl_sync_fs(struct super_block *sb, int wait)
>  {
> -- 
> 2.27.0
> 
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH V6 7/7] ovl: implement containerized syncfs for overlayfs
  2021-11-22  3:00 ` [RFC PATCH V6 7/7] ovl: implement containerized syncfs for overlayfs Chengguang Xu
  2021-11-22  7:40   ` Amir Goldstein
@ 2021-11-26  9:25   ` Jan Kara
  1 sibling, 0 replies; 21+ messages in thread
From: Jan Kara @ 2021-11-26  9:25 UTC (permalink / raw)
  To: Chengguang Xu
  Cc: miklos, jack, amir73il, linux-unionfs, linux-fsdevel,
	linux-kernel, Chengguang Xu

On Mon 22-11-21 11:00:38, Chengguang Xu wrote:
> From: Chengguang Xu <charliecgxu@tencent.com>
> 
> Now overlayfs can only sync own dirty inodes during syncfs,
> so remove unnecessary sync_filesystem() on upper file system.
> 
> Signed-off-by: Chengguang Xu <charliecgxu@tencent.com>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/overlayfs/super.c | 14 +++++---------
>  1 file changed, 5 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index ccffcd96491d..213b795a6a86 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -292,18 +292,14 @@ static int ovl_sync_fs(struct super_block *sb, int wait)
>  	/*
>  	 * Not called for sync(2) call or an emergency sync (SB_I_SKIP_SYNC).
>  	 * All the super blocks will be iterated, including upper_sb.
> -	 *
> -	 * If this is a syncfs(2) call, then we do need to call
> -	 * sync_filesystem() on upper_sb, but enough if we do it when being
> -	 * called with wait == 1.
>  	 */
> -	if (!wait)
> -		return 0;
> -
>  	upper_sb = ovl_upper_mnt(ofs)->mnt_sb;
> -
>  	down_read(&upper_sb->s_umount);
> -	ret = sync_filesystem(upper_sb);
> +	if (wait)
> +		wait_sb_inodes(upper_sb);
> +	if (upper_sb->s_op->sync_fs)
> +		upper_sb->s_op->sync_fs(upper_sb, wait);
> +	ret = ovl_sync_upper_blockdev(upper_sb, wait);
>  	up_read(&upper_sb->s_umount);
>  
>  	return ret;
> -- 
> 2.27.0
> 
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH V6 2/7] ovl: mark overlayfs inode dirty when it has upper
  2021-11-26  9:10   ` Jan Kara
@ 2021-11-26 13:06     ` Chengguang Xu
  2021-11-26 14:32       ` Jan Kara
  0 siblings, 1 reply; 21+ messages in thread
From: Chengguang Xu @ 2021-11-26 13:06 UTC (permalink / raw)
  To: Jan Kara
  Cc: miklos, amir73il, linux-unionfs, linux-fsdevel, linux-kernel,
	Chengguang Xu, ronyjin

 ---- 在 星期五, 2021-11-26 17:10:07 Jan Kara <jack@suse.cz> 撰写 ----
 > On Mon 22-11-21 11:00:33, Chengguang Xu wrote:
 > > From: Chengguang Xu <charliecgxu@tencent.com>
 > > 
 > > We simply mark overlayfs inode dirty when it has upper,
 > > it's much simpler than mark dirtiness on modification.
 > > 
 > > Signed-off-by: Chengguang Xu <charliecgxu@tencent.com>
 > > ---
 > >  fs/overlayfs/inode.c | 4 +++-
 > >  fs/overlayfs/util.c  | 1 +
 > >  2 files changed, 4 insertions(+), 1 deletion(-)
 > > 
 > > diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
 > > index 1f36158c7dbe..027ffc0a2539 100644
 > > --- a/fs/overlayfs/inode.c
 > > +++ b/fs/overlayfs/inode.c
 > > @@ -778,8 +778,10 @@ void ovl_inode_init(struct inode *inode, struct ovl_inode_params *oip,
 > >  {
 > >      struct inode *realinode;
 > >  
 > > -    if (oip->upperdentry)
 > > +    if (oip->upperdentry) {
 > >          OVL_I(inode)->__upperdentry = oip->upperdentry;
 > > +        mark_inode_dirty(inode);
 > > +    }
 > >      if (oip->lowerpath && oip->lowerpath->dentry)
 > >          OVL_I(inode)->lower = igrab(d_inode(oip->lowerpath->dentry));
 > >      if (oip->lowerdata)
 > 
 > Hum, does this get called only for inodes with upper inode existing? I
 > suppose we do not need to track inodes that were not copied up because they
 > cannot be dirty? I'm sorry, my knowledge of overlayfs is rather limited so
 > I may be missing something basic.
 > 

Well, as long as overly inode has upper it can be modified without copy-up,
so we need to track all overlay inodes which have upper inode.

Thanks,
Chengguang



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH V6 3/7] ovl: implement overlayfs' own ->write_inode operation
  2021-11-26  9:14   ` Jan Kara
@ 2021-11-26 13:09     ` Chengguang Xu
  0 siblings, 0 replies; 21+ messages in thread
From: Chengguang Xu @ 2021-11-26 13:09 UTC (permalink / raw)
  To: Jan Kara
  Cc: miklos, amir73il, linux-unionfs, linux-fsdevel, linux-kernel,
	Chengguang Xu, ronyjin

 ---- 在 星期五, 2021-11-26 17:14:30 Jan Kara <jack@suse.cz> 撰写 ----
 > On Mon 22-11-21 11:00:34, Chengguang Xu wrote:
 > > From: Chengguang Xu <charliecgxu@tencent.com>
 > > 
 > > Sync dirty data and meta of upper inode in overlayfs' ->write_inode()
 > > and redirty overlayfs' inode.
 > > 
 > > Signed-off-by: Chengguang Xu <charliecgxu@tencent.com>
 > 
 > Looks good. I'm still not 100% convinced keeping inodes dirty all the time
 > will not fire back in excessive writeback activity (e.g. flush worker will
 > traverse the list of all dirty inodes every 30 seconds and try to write
 > them) but I guess we can start with this and if people complain, dirtiness
 > handling can be improved. 

Hi Jan,

Thanks for the review and suggestion.


Thanks,
Chengguang



 > So feel free to add:
 > 
 > Reviewed-by: Jan Kara <jack@suse.cz>
 > 
 >                                 Honza
 > 
 > > ---
 > >  fs/overlayfs/super.c | 21 +++++++++++++++++++++
 > >  1 file changed, 21 insertions(+)
 > > 
 > > diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
 > > index 18a12088a37b..12acf0ec7395 100644
 > > --- a/fs/overlayfs/super.c
 > > +++ b/fs/overlayfs/super.c
 > > @@ -15,6 +15,7 @@
 > >  #include <linux/seq_file.h>
 > >  #include <linux/posix_acl_xattr.h>
 > >  #include <linux/exportfs.h>
 > > +#include <linux/writeback.h>
 > >  #include "overlayfs.h"
 > >  
 > >  MODULE_AUTHOR("Miklos Szeredi <miklos@szeredi.hu>");
 > > @@ -406,12 +407,32 @@ static int ovl_remount(struct super_block *sb, int *flags, char *data)
 > >      return ret;
 > >  }
 > >  
 > > +static int ovl_write_inode(struct inode *inode,
 > > +               struct writeback_control *wbc)
 > > +{
 > > +    struct ovl_fs *ofs = inode->i_sb->s_fs_info;
 > > +    struct inode *upper_inode = ovl_inode_upper(inode);
 > > +    int ret = 0;
 > > +
 > > +    if (!upper_inode)
 > > +        return 0;
 > > +
 > > +    if (!ovl_should_sync(ofs))
 > > +        return 0;
 > > +
 > > +    ret = write_inode_now(upper_inode, wbc->sync_mode == WB_SYNC_ALL);
 > > +    mark_inode_dirty(inode);
 > > +
 > > +    return ret;
 > > +}
 > > +
 > >  static const struct super_operations ovl_super_operations = {
 > >      .alloc_inode    = ovl_alloc_inode,
 > >      .free_inode    = ovl_free_inode,
 > >      .destroy_inode    = ovl_destroy_inode,
 > >      .drop_inode    = generic_delete_inode,
 > >      .put_super    = ovl_put_super,
 > > +    .write_inode    = ovl_write_inode,
 > >      .sync_fs    = ovl_sync_fs,
 > >      .statfs        = ovl_statfs,
 > >      .show_options    = ovl_show_options,
 > > -- 
 > > 2.27.0
 > > 
 > > 
 > -- 
 > Jan Kara <jack@suse.com>
 > SUSE Labs, CR
 > 

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH V6 2/7] ovl: mark overlayfs inode dirty when it has upper
  2021-11-26 13:06     ` Chengguang Xu
@ 2021-11-26 14:32       ` Jan Kara
  0 siblings, 0 replies; 21+ messages in thread
From: Jan Kara @ 2021-11-26 14:32 UTC (permalink / raw)
  To: Chengguang Xu
  Cc: Jan Kara, miklos, amir73il, linux-unionfs, linux-fsdevel,
	linux-kernel, Chengguang Xu, ronyjin

On Fri 26-11-21 21:06:10, Chengguang Xu wrote:
>  ---- 在 星期五, 2021-11-26 17:10:07 Jan Kara <jack@suse.cz> 撰写 ----
>  > On Mon 22-11-21 11:00:33, Chengguang Xu wrote:
>  > > From: Chengguang Xu <charliecgxu@tencent.com>
>  > > 
>  > > We simply mark overlayfs inode dirty when it has upper,
>  > > it's much simpler than mark dirtiness on modification.
>  > > 
>  > > Signed-off-by: Chengguang Xu <charliecgxu@tencent.com>
>  > > ---
>  > >  fs/overlayfs/inode.c | 4 +++-
>  > >  fs/overlayfs/util.c  | 1 +
>  > >  2 files changed, 4 insertions(+), 1 deletion(-)
>  > > 
>  > > diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
>  > > index 1f36158c7dbe..027ffc0a2539 100644
>  > > --- a/fs/overlayfs/inode.c
>  > > +++ b/fs/overlayfs/inode.c
>  > > @@ -778,8 +778,10 @@ void ovl_inode_init(struct inode *inode, struct ovl_inode_params *oip,
>  > >  {
>  > >      struct inode *realinode;
>  > >  
>  > > -    if (oip->upperdentry)
>  > > +    if (oip->upperdentry) {
>  > >          OVL_I(inode)->__upperdentry = oip->upperdentry;
>  > > +        mark_inode_dirty(inode);
>  > > +    }
>  > >      if (oip->lowerpath && oip->lowerpath->dentry)
>  > >          OVL_I(inode)->lower = igrab(d_inode(oip->lowerpath->dentry));
>  > >      if (oip->lowerdata)
>  > 
>  > Hum, does this get called only for inodes with upper inode existing? I
>  > suppose we do not need to track inodes that were not copied up because they
>  > cannot be dirty? I'm sorry, my knowledge of overlayfs is rather limited so
>  > I may be missing something basic.
>  > 
> 
> Well, as long as overly inode has upper it can be modified without copy-up,
> so we need to track all overlay inodes which have upper inode.

OK, and oip->upperdentry is set only if there's upper inode, now I
understand. Thanks for explanation and feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 21+ messages in thread

* 回复:[RFC PATCH V6 0/7] implement containerized syncfs for overlayfs
  2021-11-22  3:00 [RFC PATCH V6 0/7] implement containerized syncfs for overlayfs Chengguang Xu
                   ` (6 preceding siblings ...)
  2021-11-22  3:00 ` [RFC PATCH V6 7/7] ovl: implement containerized syncfs for overlayfs Chengguang Xu
@ 2021-11-27  9:26 ` Chengguang Xu
  7 siblings, 0 replies; 21+ messages in thread
From: Chengguang Xu @ 2021-11-27  9:26 UTC (permalink / raw)
  To: miklos
  Cc: linux-unionfs, linux-fsdevel, linux-kernel, Chengguang Xu,
	ronyjin, amir73il, jack

 ---- 在 星期一, 2021-11-22 11:00:31 Chengguang Xu <cgxu519@mykernel.net> 撰写 ----
 > From: Chengguang Xu <charliecgxu@tencent.com>
 > 
 > Current syncfs(2) syscall on overlayfs just calls sync_filesystem()
 > on upper_sb to synchronize whole dirty inodes in upper filesystem
 > regardless of the overlay ownership of the inode. In the use case of
 > container, when multiple containers using the same underlying upper
 > filesystem, it has some shortcomings as below.
 > 
 > (1) Performance
 > Synchronization is probably heavy because it actually syncs unnecessary
 > inodes for target overlayfs.
 > 
 > (2) Interference
 > Unplanned synchronization will probably impact IO performance of
 > unrelated container processes on the other overlayfs.
 > 
 > This series try to implement containerized syncfs for overlayfs so that
 > only sync target dirty upper inodes which are belong to specific overlayfs
 > instance. By doing this, it is able to reduce cost of synchronization and 
 > will not seriously impact IO performance of unrelated processes.
 > 
 > v1->v2:
 > - Mark overlayfs' inode dirty itself instead of adding notification
 > mechanism to vfs inode.
 > 
 > v2->v3:
 > - Introduce overlayfs' extra syncfs wait list to wait target upper inodes
 > in ->sync_fs.
 > 
 > v3->v4:
 > - Using wait_sb_inodes() to wait syncing upper inodes.
 > - Mark overlay inode dirty only when having upper inode and VM_SHARED
 > flag in ovl_mmap().
 > - Check upper i_state after checking upper mmap state
 > in ovl_write_inode.
 > 
 > v4->v5:
 > - Add underlying inode dirtiness check after mnt_drop_write().
 > - Handle both wait/no-wait mode of syncfs(2) in overlayfs' ->sync_fs().
 > 
 > v5->v6:
 > - Rebase to latest overlayfs-next tree.
 > - Mark oerlay inode dirty when it has upper instead of marking dirty on
 >   modification.
 > - Trigger dirty page writeback in overlayfs' ->write_inode().
 > - Mark overlay inode 'DONTCACHE' flag.
 > - Delete overlayfs' ->writepages() and ->evict_inode() operations.


Hi Miklos,

Have you got time to have a look at this V6 series? I think this version has already fixed
the issues in previous feedbacks of you guys and passed fstests (generic/overlay cases).

I did some stress long time tests (tar & syncfs & diff on w/wo copy-up) and found no obvious problem.
For syncfs time with 1M clean upper inodes, there was extra 1.3s wasted on waiting scheduling.
I guess this 1.3s will not bring significant impact to container instance in most cases, I also
agree with Jack that we can start with this approach and do some improvements afterwards if there is
complain from any real users.



Thanks,
Chengguang


 > 
 > Chengguang Xu (7):
 >   ovl: setup overlayfs' private bdi
 >   ovl: mark overlayfs inode dirty when it has upper
 >   ovl: implement overlayfs' own ->write_inode operation
 >   ovl: set 'DONTCACHE' flag for overlayfs inode
 >   fs: export wait_sb_inodes()
 >   ovl: introduce ovl_sync_upper_blockdev()
 >   ovl: implement containerized syncfs for overlayfs
 > 
 >  fs/fs-writeback.c         |  3 ++-
 >  fs/overlayfs/inode.c      |  5 +++-
 >  fs/overlayfs/super.c      | 49 ++++++++++++++++++++++++++++++++-------
 >  fs/overlayfs/util.c       |  1 +
 >  include/linux/writeback.h |  1 +
 >  5 files changed, 48 insertions(+), 11 deletions(-)
 > 
 > -- 
 > 2.27.0
 > 
 > 

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2021-11-27  9:29 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-22  3:00 [RFC PATCH V6 0/7] implement containerized syncfs for overlayfs Chengguang Xu
2021-11-22  3:00 ` [RFC PATCH V6 1/7] ovl: setup overlayfs' private bdi Chengguang Xu
2021-11-26  8:51   ` Jan Kara
2021-11-22  3:00 ` [RFC PATCH V6 2/7] ovl: mark overlayfs inode dirty when it has upper Chengguang Xu
2021-11-26  9:10   ` Jan Kara
2021-11-26 13:06     ` Chengguang Xu
2021-11-26 14:32       ` Jan Kara
2021-11-22  3:00 ` [RFC PATCH V6 3/7] ovl: implement overlayfs' own ->write_inode operation Chengguang Xu
2021-11-26  9:14   ` Jan Kara
2021-11-26 13:09     ` Chengguang Xu
2021-11-22  3:00 ` [RFC PATCH V6 4/7] ovl: set 'DONTCACHE' flag for overlayfs inode Chengguang Xu
2021-11-26  9:20   ` Jan Kara
2021-11-22  3:00 ` [RFC PATCH V6 5/7] fs: export wait_sb_inodes() Chengguang Xu
2021-11-26  9:20   ` Jan Kara
2021-11-22  3:00 ` [RFC PATCH V6 6/7] ovl: introduce ovl_sync_upper_blockdev() Chengguang Xu
2021-11-26  9:21   ` Jan Kara
2021-11-22  3:00 ` [RFC PATCH V6 7/7] ovl: implement containerized syncfs for overlayfs Chengguang Xu
2021-11-22  7:40   ` Amir Goldstein
2021-11-26  5:03     ` Chengguang Xu
2021-11-26  9:25   ` Jan Kara
2021-11-27  9:26 ` 回复:[RFC PATCH V6 0/7] " Chengguang Xu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).