From: Chengguang Xu <cgxu519@mykernel.net>
To: "cgxu519" <cgxu519@mykernel.net>
Cc: "miklos" <miklos@szeredi.hu>, "amir73il" <amir73il@gmail.com>,
"jack" <jack@suse.cz>,
"linux-unionfs" <linux-unionfs@vger.kernel.org>
Subject: Re: [PATCH v11] ovl: Improving syncfs efficiency
Date: Tue, 17 Mar 2020 12:49:51 +0800 [thread overview]
Message-ID: <170e6d3975b.f31c4fdf10742.467886206992391518@mykernel.net> (raw)
In-Reply-To: <170e6cbc06e.d92f059210735.6409036322143087221@mykernel.net>
---- On Tue, 2020-03-17 12:41:17 Chengguang Xu <cgxu519@mykernel.net> wrote ----
> ---- On Mon, 2020-02-10 11:10:09 Chengguang Xu <cgxu519@mykernel.net> wrote ----
> > The current syncfs(2) syscall on overlayfs simply calls sync_filesystem()
> > on upper_sb to synchronize all dirty inodes in the upper filesystem,
> > regardless of which overlay instance owns each inode. In the container
> > use case, where multiple containers share the same underlying upper
> > filesystem, this has the following shortcomings.
> >
> > (1) Performance
> > Synchronization is potentially heavy because it syncs inodes that are
> > unnecessary for the target overlayfs.
> >
> > (2) Interference
> > Unplanned synchronization may impact the IO performance of
> > unrelated container processes on other overlayfs instances.
> >
> > This patch iterates the upper inode list in overlayfs to sync only the
> > target dirty inodes and wait for completion. By doing this, it is able to
> > reduce the cost of synchronization without seriously impacting the IO
> > performance of unrelated processes. In the special case of very few dirty
> > inodes and a lot of clean upper inodes in an overlayfs, the iteration may
> > waste time so that synchronization is slower than before, but we think the
> > potential for performance improvement far surpasses the potential for
> > performance regression in most cases.
> >
> > Signed-off-by: Chengguang Xu <cgxu519@mykernel.net>
> > ---
>
> Hi Miklos,
>
> Have you had time to review this patch?
> I tested the overlay part of xfstests (including generic/427, the test case for syncfs) plus a heavy untar + syncfs test, and I found no problems.
>
Additionally, Jack and Amir have already reviewed it up to the v10 version.
Thanks,
cgxu
> > Changes since v10:
> > - Add special handling in ovl_evict_inode() for inode eviction triggered
> > by kswapd memory reclamation (with the PF_MEMALLOC flag).
> > - Rebase on current latest overlayfs-next tree.
> > - Slightly modify license information.
> >
> > Changes since v9:
> > - Rebase on current latest overlayfs-next tree.
> > - Calling clear_inode() regardless of having upper inode.
> >
> > Changes since v8:
> > - Remove unnecessary blk_start_plug() call in ovl_sync_inodes().
> >
> > Changes since v7:
> > - Check the OVL_EVICT_PENDING flag instead of I_FREEING to recognize
> > an inode which is in the eviction process.
> > - Do not move the ovl_inode back to ofs->upper_inodes when the inode
> > is in the eviction process.
> > - Delete unnecessary memory barrier in ovl_evict_inode().
> >
> > Changes since v6:
> > - Take/release sb->s_sync_lock before/after sync_fs to serialize
> > concurrent syncfs.
> > - Change list iterating method to improve code readability.
> > - Fix typo in commit log and comments.
> > - Add header comment to sync.c.
> >
> > Changes since v5:
> > - Move sync related functions to new file sync.c.
> > - Introduce OVL_EVICT_PENDING flag to avoid race conditions.
> > - If the ovl inode state is I_WILL_FREE or I_FREEING then syncfs
> > will wait until OVL_EVICT_PENDING is cleared.
> > - Move inode sync operation into evict_inode from drop_inode.
> > - Call upper_sb->sync_fs with no-wait after sync dirty upper
> > inodes.
> > - Hold s_inode_wblist_lock until deleting item from list.
> > - Remove the write_inode function because it is no longer used.
> >
> > Changes since v4:
> > - Add syncing operation and deleting from upper_inodes list
> > during ovl inode destruction.
> > - Export symbol of inode_wait_for_writeback() for waiting
> > writeback on ovl inode.
> > - Reuse I_SYNC flag to avoid race conditions between syncfs
> > and drop_inode.
> >
> > Changes since v3:
> > - Introduce upper_inode list to organize ovl inode which has upper
> > inode.
> > - Avoid iput operation inside spin lock.
> > - Change list iteration method for safety.
> > - Reuse inode->i_wb_list and sb->s_inodes_wb to save memory.
> > - Modify commit log and add more comments to the code.
> >
> > Changes since v2:
> > - Decoupling with syncfs of upper fs by taking sb->s_sync_lock of
> > overlayfs.
> > - Factoring out sync operation to different helpers for easy
> > understanding.
> >
> > Changes since v1:
> > - If upper filesystem is readonly mode then skip synchronization.
> > - Introduce global wait list to replace temporary wait list for
> > concurrent synchronization.
> >
> > fs/overlayfs/Makefile | 2 +-
> > fs/overlayfs/overlayfs.h | 9 ++
> > fs/overlayfs/ovl_entry.h | 5 +
> > fs/overlayfs/super.c | 44 ++----
> > fs/overlayfs/sync.c | 331 +++++++++++++++++++++++++++++++++++++++
> > fs/overlayfs/util.c | 23 ++-
> > 6 files changed, 381 insertions(+), 33 deletions(-)
> > create mode 100644 fs/overlayfs/sync.c
> >
> > diff --git a/fs/overlayfs/Makefile b/fs/overlayfs/Makefile
> > index 9164c585eb2f..0c4e53fcf4ef 100644
> > --- a/fs/overlayfs/Makefile
> > +++ b/fs/overlayfs/Makefile
> > @@ -6,4 +6,4 @@
> > obj-$(CONFIG_OVERLAY_FS) += overlay.o
> >
> > overlay-objs := super.o namei.o util.o inode.o file.o dir.o readdir.o \
> > - copy_up.o export.o
> > + copy_up.o export.o sync.o
> > diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> > index 3623d28aa4fa..3cdb06e206a3 100644
> > --- a/fs/overlayfs/overlayfs.h
> > +++ b/fs/overlayfs/overlayfs.h
> > @@ -40,6 +40,10 @@ enum ovl_inode_flag {
> > OVL_UPPERDATA,
> > /* Inode number will remain constant over copy up. */
> > OVL_CONST_INO,
> > + /* Set when ovl inode is about to be freed */
> > + OVL_EVICT_PENDING,
> > + /* Set when sync upper inode in workqueue work */
> > + OVL_WRITE_INODE_PENDING,
> > };
> >
> > enum ovl_entry_flag {
> > @@ -295,6 +299,7 @@ bool ovl_is_metacopy_dentry(struct dentry *dentry);
> > char *ovl_get_redirect_xattr(struct dentry *dentry, int padding);
> > ssize_t ovl_getxattr(struct dentry *dentry, char *name, char **value,
> > size_t padding);
> > +void ovl_detach_upper_inodes_list(struct inode *inode);
> >
> > static inline bool ovl_is_impuredir(struct dentry *dentry)
> > {
> > @@ -466,3 +471,7 @@ int ovl_set_origin(struct dentry *dentry, struct dentry *lower,
> >
> > /* export.c */
> > extern const struct export_operations ovl_export_operations;
> > +
> > +/* sync.c */
> > +void ovl_evict_inode(struct inode *inode);
> > +int ovl_sync_fs(struct super_block *sb, int wait);
> > diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
> > index 89015ea822e7..42e092e8cd61 100644
> > --- a/fs/overlayfs/ovl_entry.h
> > +++ b/fs/overlayfs/ovl_entry.h
> > @@ -75,6 +75,9 @@ struct ovl_fs {
> > struct inode *indexdir_trap;
> > /* -1: disabled, 0: same fs, 1..32: number of unused ino bits */
> > int xino_mode;
> > + /* Upper inode list and lock */
> > + spinlock_t upper_inodes_lock;
> > + struct list_head upper_inodes;
> > };
> >
> > static inline struct ovl_fs *OVL_FS(struct super_block *sb)
> > @@ -115,6 +118,8 @@ struct ovl_inode {
> >
> > /* synchronize copy up and more */
> > struct mutex lock;
> > + /* Upper inodes list */
> > + struct list_head upper_inodes_list;
> > };
> >
> > static inline struct ovl_inode *OVL_I(struct inode *inode)
> > diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> > index 319fe0d355b0..3e6871e1a7ba 100644
> > --- a/fs/overlayfs/super.c
> > +++ b/fs/overlayfs/super.c
> > @@ -183,6 +183,7 @@ static struct inode *ovl_alloc_inode(struct super_block *sb)
> > oi->lower = NULL;
> > oi->lowerdata = NULL;
> > mutex_init(&oi->lock);
> > + INIT_LIST_HEAD(&oi->upper_inodes_list);
> >
> > return &oi->vfs_inode;
> > }
> > @@ -249,36 +250,6 @@ static void ovl_put_super(struct super_block *sb)
> > ovl_free_fs(ofs);
> > }
> >
> > -/* Sync real dirty inodes in upper filesystem (if it exists) */
> > -static int ovl_sync_fs(struct super_block *sb, int wait)
> > -{
> > - struct ovl_fs *ofs = sb->s_fs_info;
> > - struct super_block *upper_sb;
> > - int ret;
> > -
> > - if (!ofs->upper_mnt)
> > - return 0;
> > -
> > - /*
> > - * If this is a sync(2) call or an emergency sync, all the super blocks
> > - * will be iterated, including upper_sb, so no need to do anything.
> > - *
> > - * If this is a syncfs(2) call, then we do need to call
> > - * sync_filesystem() on upper_sb, but enough if we do it when being
> > - * called with wait == 1.
> > - */
> > - if (!wait)
> > - return 0;
> > -
> > - upper_sb = ofs->upper_mnt->mnt_sb;
> > -
> > - down_read(&upper_sb->s_umount);
> > - ret = sync_filesystem(upper_sb);
> > - up_read(&upper_sb->s_umount);
> > -
> > - return ret;
> > -}
> > -
> > /**
> > * ovl_statfs
> > * @sb: The overlayfs super block
> > @@ -376,11 +347,19 @@ static int ovl_remount(struct super_block *sb, int *flags, char *data)
> > return 0;
> > }
> >
> > +static int ovl_drop_inode(struct inode *inode)
> > +{
> > + if (ovl_inode_upper(inode))
> > + ovl_set_flag(OVL_EVICT_PENDING, inode);
> > + return 1;
> > +}
> > +
> > static const struct super_operations ovl_super_operations = {
> > .alloc_inode = ovl_alloc_inode,
> > .free_inode = ovl_free_inode,
> > .destroy_inode = ovl_destroy_inode,
> > - .drop_inode = generic_delete_inode,
> > + .drop_inode = ovl_drop_inode,
> > + .evict_inode = ovl_evict_inode,
> > .put_super = ovl_put_super,
> > .sync_fs = ovl_sync_fs,
> > .statfs = ovl_statfs,
> > @@ -1601,6 +1580,9 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
> > if (!ofs)
> > goto out;
> >
> > + spin_lock_init(&ofs->upper_inodes_lock);
> > + INIT_LIST_HEAD(&ofs->upper_inodes);
> > +
> > ofs->creator_cred = cred = prepare_creds();
> > if (!cred)
> > goto out_err;
> > diff --git a/fs/overlayfs/sync.c b/fs/overlayfs/sync.c
> > new file mode 100644
> > index 000000000000..aecd312ec851
> > --- /dev/null
> > +++ b/fs/overlayfs/sync.c
> > @@ -0,0 +1,331 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * Copyright (C) 2020 All Rights Reserved.
> > + * Author Chengguang Xu <cgxu519@mykernel.net>
> > + */
> > +
> > +#include <linux/fs.h>
> > +#include <linux/xattr.h>
> > +#include <linux/mount.h>
> > +#include <linux/writeback.h>
> > +#include <linux/blkdev.h>
> > +#include "overlayfs.h"
> > +
> > +/**
> > + * The upper_inodes list is used for organizing potential targets of
> > + * syncfs; an ovl inode which has an upper inode will be added to this
> > + * list while initializing or updating and will be deleted from it
> > + * while evicting.
> > + *
> > + * Introduce the ovl_inode flag "OVL_EVICT_PENDING" to indicate that the
> > + * ovl inode is in the eviction process; syncfs (actually ovl_sync_inodes())
> > + * will wait on an evicting inode until its IO is completed in evict_inode().
> > + *
> > + * inode state/ovl_inode flags cycle in syncfs context will be as below.
> > + * OVL_EVICT_PENDING (ovl_inode->flags) is only marked when inode having
> > + * upper inode.
> > + *
> > + * (allocation)
> > + * |
> > + * I_NEW (inode->i_state)
> > + * |
> > + * NONE (inode->i_state)
> > + * |
> > + * OVL_EVICT_PENDING (ovl_inode->flags)
> > + * |
> > + * I_FREEING (inode->i_state) | OVL_EVICT_PENDING (ovl_inode->flags)
> > + * |
> > + * I_FREEING (inode->i_state) | I_CLEAR (inode->i_state)
> > + * |
> > + * (destruction)
> > + *
> > + *
> > + * There are five locks in syncfs contexts:
> > + *
> > + * upper_sb->s_umount(semaphore) : avoiding r/o to r/w or vice versa
> > + * sb->s_sync_lock(mutex) : avoiding concurrent syncfs running
> > + * ofs->upper_inodes_lock(spinlock) : protecting upper_inodes list
> > + * sb->s_inode_wblist_lock(spinlock): protecting s_inodes_wb(sync waiting) list
> > + * inode->i_lock(spinlock) : protecting inode fields
> > + *
> > + * Lock dependencies in syncfs contexts:
> > + *
> > + * upper_sb->s_umount
> > + * sb->s_sync_lock
> > + * ofs->upper_inodes_lock
> > + * inode->i_lock
> > + *
> > + * upper_sb->s_umount
> > + * sb->s_sync_lock
> > + * sb->s_inode_wblist_lock
> > + *
> > + */
> > +
> > +struct ovl_write_inode_work {
> > + struct work_struct work;
> > + struct inode *inode;
> > +};
> > +
> > +static void ovl_write_inode_work_fn(struct work_struct *work)
> > +{
> > + struct ovl_write_inode_work *ovl_wiw;
> > + struct inode *inode;
> > +
> > + ovl_wiw = container_of(work, struct ovl_write_inode_work, work);
> > + inode = ovl_wiw->inode;
> > + write_inode_now(ovl_inode_upper(inode), 1);
> > +
> > + spin_lock(&inode->i_lock);
> > + ovl_clear_flag(OVL_WRITE_INODE_PENDING, inode);
> > + wake_up_bit(&OVL_I(inode)->flags, OVL_WRITE_INODE_PENDING);
> > + spin_unlock(&inode->i_lock);
> > +}
> > +
> > +void ovl_evict_inode(struct inode *inode)
> > +{
> > + struct ovl_inode *oi = OVL_I(inode);
> > + struct ovl_write_inode_work ovl_wiw;
> > + DEFINE_WAIT_BIT(wait, &oi->flags, OVL_WRITE_INODE_PENDING);
> > + wait_queue_head_t *wqh;
> > +
> > + if (ovl_inode_upper(inode)) {
> > + if (current->flags & PF_MEMALLOC) {
> > + spin_lock(&inode->i_lock);
> > + ovl_set_flag(OVL_WRITE_INODE_PENDING, inode);
> > + wqh = bit_waitqueue(&oi->flags,
> > + OVL_WRITE_INODE_PENDING);
> > + prepare_to_wait(wqh, &wait.wq_entry,
> > + TASK_UNINTERRUPTIBLE);
> > + spin_unlock(&inode->i_lock);
> > +
> > + ovl_wiw.inode = inode;
> > + INIT_WORK(&ovl_wiw.work, ovl_write_inode_work_fn);
> > + schedule_work(&ovl_wiw.work);
> > +
> > + schedule();
> > + finish_wait(wqh, &wait.wq_entry);
> > + } else {
> > + write_inode_now(ovl_inode_upper(inode), 1);
> > + }
> > + ovl_detach_upper_inodes_list(inode);
> > +
> > + /*
> > + * ovl_sync_inodes() may wait until the
> > + * OVL_EVICT_PENDING flag is cleared.
> > + */
> > + spin_lock(&inode->i_lock);
> > + ovl_clear_flag(OVL_EVICT_PENDING, inode);
> > + wake_up_bit(&OVL_I(inode)->flags, OVL_EVICT_PENDING);
> > + spin_unlock(&inode->i_lock);
> > + }
> > + clear_inode(inode);
> > +}
> > +
> > +void ovl_wait_evict_pending(struct inode *inode)
> > +{
> > + struct ovl_fs *ofs = inode->i_sb->s_fs_info;
> > + struct ovl_inode *oi = OVL_I(inode);
> > + DEFINE_WAIT_BIT(wait, &oi->flags, OVL_EVICT_PENDING);
> > + wait_queue_head_t *wqh;
> > +
> > + wqh = bit_waitqueue(&oi->flags, OVL_EVICT_PENDING);
> > + prepare_to_wait(wqh, &wait.wq_entry, TASK_UNINTERRUPTIBLE);
> > + spin_unlock(&inode->i_lock);
> > + spin_unlock(&ofs->upper_inodes_lock);
> > + schedule();
> > + finish_wait(wqh, &wait.wq_entry);
> > +}
> > +
> > +/**
> > + * ovl_sync_inodes
> > + * @sb: The overlayfs super block
> > + *
> > + * The upper_inodes list organizes ovl inodes which have an upper inode;
> > + * syncfs iterates the list, looking for and syncing dirty upper inodes.
> > + *
> > + * When starting to sync an inode, we add it to the wait list explicitly;
> > + * to save memory we reuse inode->i_wb_list and sb->s_inodes_wb,
> > + * so those fields have slightly different meanings in overlayfs.
> > + */
> > +static void ovl_sync_inodes(struct super_block *sb)
> > +{
> > + struct ovl_fs *ofs = sb->s_fs_info;
> > + struct ovl_inode *oi;
> > + struct inode *inode;
> > + struct inode *upper_inode;
> > + struct blk_plug plug;
> > + LIST_HEAD(sync_tmp_list);
> > +
> > + struct writeback_control wbc_for_sync = {
> > + .sync_mode = WB_SYNC_ALL,
> > + .for_sync = 1,
> > + .range_start = 0,
> > + .range_end = LLONG_MAX,
> > + .nr_to_write = LONG_MAX,
> > + };
> > +
> > + blk_start_plug(&plug);
> > + spin_lock(&ofs->upper_inodes_lock);
> > + list_splice_init(&ofs->upper_inodes, &sync_tmp_list);
> > +
> > + while (!list_empty(&sync_tmp_list)) {
> > + oi = list_first_entry(&sync_tmp_list, struct ovl_inode,
> > + upper_inodes_list);
> > + inode = &oi->vfs_inode;
> > + spin_lock(&inode->i_lock);
> > +
> > + if (inode->i_state & I_NEW) {
> > + list_move_tail(&oi->upper_inodes_list,
> > + &ofs->upper_inodes);
> > + spin_unlock(&inode->i_lock);
> > + continue;
> > + }
> > +
> > + /*
> > + * If the OVL_EVICT_PENDING flag is set,
> > + * the sync operation is left to ovl_evict_inode(),
> > + * so wait here until the OVL_EVICT_PENDING flag is cleared.
> > + */
> > + if (ovl_test_flag(OVL_EVICT_PENDING, inode)) {
> > + ovl_wait_evict_pending(inode);
> > + goto next;
> > + }
> > +
> > + list_move_tail(&oi->upper_inodes_list,
> > + &ofs->upper_inodes);
> > + ihold(inode);
> > + upper_inode = ovl_inode_upper(inode);
> > + spin_unlock(&inode->i_lock);
> > + spin_unlock(&ofs->upper_inodes_lock);
> > +
> > + if (!(upper_inode->i_state & I_DIRTY_ALL)) {
> > + if (!mapping_tagged(upper_inode->i_mapping,
> > + PAGECACHE_TAG_WRITEBACK)) {
> > + iput(inode);
> > + goto next;
> > + }
> > + } else {
> > + sync_inode(upper_inode, &wbc_for_sync);
> > + }
> > +
> > + spin_lock(&sb->s_inode_wblist_lock);
> > + list_add_tail(&inode->i_wb_list, &sb->s_inodes_wb);
> > + spin_unlock(&sb->s_inode_wblist_lock);
> > +
> > +next:
> > + cond_resched();
> > + spin_lock(&ofs->upper_inodes_lock);
> > + }
> > + spin_unlock(&ofs->upper_inodes_lock);
> > + blk_finish_plug(&plug);
> > +}
> > +
> > +/**
> > + * ovl_wait_inodes
> > + * @sb: The overlayfs super block
> > + *
> > + * Wait for writeback on inodes in the s_inodes_wb list;
> > + * all the IO that was issued up to the time this
> > + * function is entered is guaranteed to be completed.
> > + */
> > +static void ovl_wait_inodes(struct super_block *sb)
> > +{
> > + LIST_HEAD(sync_wait_list);
> > + struct inode *inode;
> > + struct inode *upper_inode;
> > +
> > + /*
> > + * An ovl inode in the sb->s_inodes_wb list already holds an inode
> > + * reference, which avoids the inode being silently dropped while waiting.
> > + *
> > + * Splice the sync wait list onto a temporary list to avoid waiting on
> > + * inodes that have started writeback after this point.
> > + */
> > + spin_lock(&sb->s_inode_wblist_lock);
> > + list_splice_init(&sb->s_inodes_wb, &sync_wait_list);
> > +
> > + while (!list_empty(&sync_wait_list)) {
> > + inode = list_first_entry(&sync_wait_list, struct inode,
> > + i_wb_list);
> > + list_del_init(&inode->i_wb_list);
> > + upper_inode = ovl_inode_upper(inode);
> > + spin_unlock(&sb->s_inode_wblist_lock);
> > +
> > + if (!mapping_tagged(upper_inode->i_mapping,
> > + PAGECACHE_TAG_WRITEBACK)) {
> > + goto next;
> > + }
> > +
> > + filemap_fdatawait_keep_errors(upper_inode->i_mapping);
> > + cond_resched();
> > +
> > +next:
> > + iput(inode);
> > + spin_lock(&sb->s_inode_wblist_lock);
> > + }
> > + spin_unlock(&sb->s_inode_wblist_lock);
> > +}
> > +
> > +/**
> > + * ovl_sync_filesystem
> > + * @sb: The overlayfs super block
> > + *
> > + * Sync underlying dirty inodes in upper filesystem and
> > + * wait for completion.
> > + */
> > +static int ovl_sync_filesystem(struct super_block *sb)
> > +{
> > + struct ovl_fs *ofs = sb->s_fs_info;
> > + struct super_block *upper_sb = ofs->upper_mnt->mnt_sb;
> > + int ret = 0;
> > +
> > + down_read(&upper_sb->s_umount);
> > + if (!sb_rdonly(upper_sb)) {
> > + /*
> > + * s_sync_lock is used for serializing concurrent
> > + * syncfs operations.
> > + */
> > + mutex_lock(&sb->s_sync_lock);
> > + ovl_sync_inodes(sb);
> > + /* Calling sync_fs with no-wait for better performance. */
> > + if (upper_sb->s_op->sync_fs)
> > + upper_sb->s_op->sync_fs(upper_sb, 0);
> > +
> > + ovl_wait_inodes(sb);
> > + if (upper_sb->s_op->sync_fs)
> > + upper_sb->s_op->sync_fs(upper_sb, 1);
> > +
> > + ret = sync_blockdev(upper_sb->s_bdev);
> > + mutex_unlock(&sb->s_sync_lock);
> > + }
> > + up_read(&upper_sb->s_umount);
> > + return ret;
> > +}
> > +
> > +/**
> > + * ovl_sync_fs
> > + * @sb: The overlayfs super block
> > + * @wait: Wait for I/O to complete
> > + *
> > + * Sync real dirty inodes in upper filesystem (if it exists)
> > + */
> > +int ovl_sync_fs(struct super_block *sb, int wait)
> > +{
> > + struct ovl_fs *ofs = sb->s_fs_info;
> > +
> > + if (!ofs->upper_mnt)
> > + return 0;
> > +
> > + /*
> > + * If this is a sync(2) call or an emergency sync, all the super blocks
> > + * will be iterated, including upper_sb, so no need to do anything.
> > + *
> > + * If this is a syncfs(2) call, then we do need to call
> > + * sync_filesystem() on upper_sb, but enough if we do it when being
> > + * called with wait == 1.
> > + */
> > + if (!wait)
> > + return 0;
> > +
> > + return ovl_sync_filesystem(sb);
> > +}
> > diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> > index ea005085803f..73ef195d9aef 100644
> > --- a/fs/overlayfs/util.c
> > +++ b/fs/overlayfs/util.c
> > @@ -386,13 +386,33 @@ void ovl_dentry_set_redirect(struct dentry *dentry, const char *redirect)
> > oi->redirect = redirect;
> > }
> >
> > +void ovl_attach_upper_inodes_list(struct inode *inode)
> > +{
> > + struct ovl_fs *ofs = inode->i_sb->s_fs_info;
> > +
> > + spin_lock(&ofs->upper_inodes_lock);
> > + list_add(&OVL_I(inode)->upper_inodes_list, &ofs->upper_inodes);
> > + spin_unlock(&ofs->upper_inodes_lock);
> > +}
> > +
> > +void ovl_detach_upper_inodes_list(struct inode *inode)
> > +{
> > + struct ovl_fs *ofs = inode->i_sb->s_fs_info;
> > +
> > + spin_lock(&ofs->upper_inodes_lock);
> > + list_del_init(&OVL_I(inode)->upper_inodes_list);
> > + spin_unlock(&ofs->upper_inodes_lock);
> > +}
> > +
> > void ovl_inode_init(struct inode *inode, struct dentry *upperdentry,
> > struct dentry *lowerdentry, struct dentry *lowerdata)
> > {
> > struct inode *realinode = d_inode(upperdentry ?: lowerdentry);
> >
> > - if (upperdentry)
> > + if (upperdentry) {
> > OVL_I(inode)->__upperdentry = upperdentry;
> > + ovl_attach_upper_inodes_list(inode);
> > + }
> > if (lowerdentry)
> > OVL_I(inode)->lower = igrab(d_inode(lowerdentry));
> > if (lowerdata)
> > @@ -421,6 +441,7 @@ void ovl_inode_update(struct inode *inode, struct dentry *upperdentry)
> > inode->i_private = upperinode;
> > __insert_inode_hash(inode, (unsigned long) upperinode);
> > }
> > + ovl_attach_upper_inodes_list(inode);
> > }
> >
> > static void ovl_dentry_version_inc(struct dentry *dentry, bool impurity)
> > --
> > 2.21.1
> >
> >
> >
>
Thread overview: 14+ messages
2020-02-10 3:10 [PATCH v11] ovl: Improving syncfs efficiency Chengguang Xu
2020-03-17 4:41 ` 回复:[PATCH " Chengguang Xu
2020-03-17 4:49 ` Chengguang Xu [this message]
2020-04-15 19:19 ` [PATCH " Miklos Szeredi
2020-04-16 6:08 ` Chengguang Xu
2020-04-16 7:21 ` Miklos Szeredi
2020-04-16 7:39 ` Miklos Szeredi
2020-04-16 8:00 ` Miklos Szeredi
2020-04-16 11:16 ` Jan Kara
2020-04-16 11:14 ` Jan Kara
2020-04-16 13:52 ` Chengguang Xu
2020-04-16 14:33 ` Jan Kara
2020-04-17 1:23 ` Chengguang Xu
2020-04-17 9:26 ` Jan Kara