From: Jan Kara <jack@suse.cz>
To: Chengguang Xu <cgxu519@mykernel.net>
Cc: Jan Kara <jack@suse.cz>, Miklos Szeredi <miklos@szeredi.hu>,
	Amir Goldstein <amir73il@gmail.com>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	overlayfs <linux-unionfs@vger.kernel.org>,
	linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: [RFC PATCH v5 06/10] ovl: implement overlayfs' ->write_inode operation
Date: Thu, 18 Nov 2021 17:43:49 +0100
Message-ID: <20211118164349.GB8267@quack2.suse.cz>
In-Reply-To: <17d32ecf46e.124314f8f672.8832559275193368959@mykernel.net>

On Thu 18-11-21 20:02:09, Chengguang Xu wrote:
>  ---- On Thursday, 2021-11-18 19:23:15 Jan Kara <jack@suse.cz> wrote ----
>  > On Thu 18-11-21 14:32:36, Chengguang Xu wrote:
>  > > 
>  > >  ---- On Wednesday, 2021-11-17 14:11:29 Chengguang Xu <cgxu519@mykernel.net> wrote ----
>  > >  >  ---- On Tuesday, 2021-11-16 20:35:55 Miklos Szeredi <miklos@szeredi.hu> wrote ----
>  > >  >  > On Tue, 16 Nov 2021 at 03:20, Chengguang Xu <cgxu519@mykernel.net> wrote:
>  > >  >  > >
>  > >  >  > >  ---- On Thursday, 2021-10-07 21:34:19 Miklos Szeredi <miklos@szeredi.hu> wrote ----
>  > >  >  > >  > On Thu, 7 Oct 2021 at 15:10, Chengguang Xu <cgxu519@mykernel.net> wrote:
>  > >  >  > >  > >  > However that wasn't what I was asking about.  AFAICS ->write_inode()
>  > >  >  > >  > >  > won't start write back for dirty pages.   Maybe I'm missing something,
>  > >  >  > >  > >  > but there it looks as if nothing will actually trigger writeback for
>  > >  >  > >  > >  > dirty pages in upper inode.
>  > >  >  > >  > >  >
>  > >  >  > >  > >
>  > >  >  > >  > > Actually, page writeback on the upper inode will be triggered by overlayfs' ->writepages, and
>  > >  >  > >  > > overlayfs' ->writepages will be called by the vfs writeback function (i.e. writeback_sb_inodes()).
>  > >  >  > >  >
>  > >  >  > >  > Right.
>  > >  >  > >  >
>  > >  >  > >  > But wouldn't it be simpler to do this from ->write_inode()?
>  > >  >  > >  >
>  > >  >  > >  > I.e. call write_inode_now() as suggested by Jan.
>  > >  >  > >  >
>  > >  >  > >  > Also could just call mark_inode_dirty() on the overlay inode
>  > >  >  > >  > regardless of the dirty flags on the upper inode since it shouldn't
>  > >  >  > >  > matter and results in simpler logic.
>  > >  >  > >  >
>  > >  >  > >
>  > >  >  > > Hi Miklos,
>  > >  >  > >
>  > >  >  > > Sorry for the delayed response, I've been busy with another project.
>  > >  >  > >
>  > >  >  > > I agree with your suggestion above. Furthermore, how about just marking the overlay inode dirty
>  > >  >  > > whenever it has an upper inode? This approach would make dirtiness marking simple enough.
>  > >  >  > 
>  > >  >  > Are you suggesting that all non-lower overlay inodes should always be dirty?
>  > >  >  > 
>  > >  >  > The logic would be simple, no doubt, but there's the cost to walking
>  > >  >  > those overlay inodes which don't have a dirty upper inode, right?  
>  > >  > 
>  > >  > That's true.
>  > >  > 
>  > >  >  > Can you quantify this cost with a benchmark?  Can be totally synthetic,
>  > >  >  > e.g. lookup a million upper files without modifying them, then call
>  > >  >  > syncfs.
>  > >  >  > 
>  > >  > 
>  > >  > No problem, I'll do some tests for the performance.
>  > >  > 
>  > > 
>  > > Hi Miklos,
>  > > 
>  > > I did some rough tests and the results are below.  In practice, I don't
>  > > think that 1.3s of extra syncfs time will cause a significant problem.
>  > > What do you think?
>  > 
>  > Well, burning 1.3s worth of CPU time for doing nothing seems like quite a
>  > bit to me. I understand this is with 1000000 inodes but although that is
>  > quite a few it is not unheard of. If there were several containers
>  > calling syncfs(2) on the machine they could easily hog the machine... That
>  > is why I was originally against keeping overlay inodes always dirty and
>  > wanted their dirtiness to at least roughly track the real need to do
>  > writeback.
>  > 
> 
> Hi Jan,
> 
> Actually, the user and sys times are almost the same as when executing syncfs directly on the underlying fs.
> IMO, it only extends the syncfs(2) waiting time for a particular container and does not burn CPU.
> What am I missing?

Ah, right, I missed that only the real time changed, not the sys time. I'm
sorry for the confusion. But why did the real time increase so much? Are we
waiting for some IO?
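
The real time versus sys time distinction above can be checked in isolation. The following is only a minimal sketch under stated assumptions: `timed_syncfs()` and `struct sync_timing` are hypothetical names introduced here for illustration, and this is not the `./syncfs` test program quoted below (its source is not shown in this thread). It calls syncfs(2) on the filesystem containing a path and reports elapsed wall-clock time separately from the CPU time charged to the calling process.

```c
#define _GNU_SOURCE		/* needed for syncfs() in glibc */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/resource.h>
#include <sys/time.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical result record, all values in seconds. */
struct sync_timing {
	double real;	/* elapsed wall-clock time */
	double user;	/* user-mode CPU time charged to this process */
	double sys;	/* kernel-mode CPU time charged to this process */
};

static double tv_to_sec(struct timeval tv)
{
	return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
}

/*
 * Call syncfs(2) on the filesystem containing `path` and record how long
 * the call took. Returns 0 on success, -1 on error with errno set by the
 * failing open() or syncfs().
 */
int timed_syncfs(const char *path, struct sync_timing *out)
{
	struct timespec t0, t1;
	struct rusage r0, r1;
	int fd, ret, saved_errno;

	assert(path != NULL && out != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	getrusage(RUSAGE_SELF, &r0);
	clock_gettime(CLOCK_MONOTONIC, &t0);
	ret = syncfs(fd);
	saved_errno = errno;
	clock_gettime(CLOCK_MONOTONIC, &t1);
	getrusage(RUSAGE_SELF, &r1);
	close(fd);

	if (ret < 0) {
		errno = saved_errno;
		return -1;
	}

	out->real = (double)(t1.tv_sec - t0.tv_sec) +
		    (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
	out->user = tv_to_sec(r1.ru_utime) - tv_to_sec(r0.ru_utime);
	out->sys = tv_to_sec(r1.ru_stime) - tv_to_sec(r0.ru_stime);
	return 0;
}
```

A caller would pass a path inside the mount of interest, for example `timed_syncfs("ovl-merge/create-file.sh", &t)`, and compare `t.real` against `t.user + t.sys`. Note that getrusage(2) only accounts for CPU time consumed in the context of the calling process, so a large gap between real and CPU time means the caller was waiting, but does not by itself say what it was waiting for.
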

								Honza

>  > > Test bed: kvm vm 
>  > > 2.50GHz cpu 32core
>  > > 64GB mem
>  > > vm kernel  5.15.0-rc1+ (with ovl syncfs patch V6)
>  > > 
>  > > One million files spread across a 2-level directory hierarchy.
>  > > Test steps:
>  > > 1) create test files in the ovl upper dir
>  > > 2) mount overlayfs
>  > > 3) execute ls -lR to look up all files in the overlay merge dir
>  > > 4) execute slabtop to confirm the number of overlay inodes
>  > > 5) call syncfs on a file in the merge dir
>  > > 
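
Step 1 of the quoted procedure could be reproduced roughly as follows. This is a sketch only: the thread does not show how the test files were actually created, so the function name `populate_tree()`, the `d<N>`/`f<N>` naming scheme, and the reading of "2 level" as two nested directory levels with empty files in the leaves are all assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* mkdir() that tolerates an already existing directory. */
static int mkdir_ok(const char *path)
{
	if (mkdir(path, 0755) < 0 && errno != EEXIST)
		return -1;
	return 0;
}

/*
 * Create `top` directories under `root`, `sub` directories in each of
 * those, and `per_dir` empty files in every second-level directory.
 * Returns the number of files opened/created, or -1 on the first error.
 */
long populate_tree(const char *root, int top, int sub, int per_dir)
{
	char path[4096];
	long created = 0;

	assert(root != NULL);

	if (mkdir_ok(root) < 0)
		return -1;

	for (int i = 0; i < top; i++) {
		snprintf(path, sizeof(path), "%s/d%d", root, i);
		if (mkdir_ok(path) < 0)
			return -1;

		for (int j = 0; j < sub; j++) {
			snprintf(path, sizeof(path), "%s/d%d/d%d", root, i, j);
			if (mkdir_ok(path) < 0)
				return -1;

			for (int k = 0; k < per_dir; k++) {
				int fd;

				snprintf(path, sizeof(path), "%s/d%d/d%d/f%d",
					 root, i, j, k);
				fd = open(path, O_CREAT | O_WRONLY, 0644);
				if (fd < 0)
					return -1;
				close(fd);
				created++;
			}
		}
	}
	return created;
}
```

For example, `populate_tree("ovl-upper", 100, 100, 100)` would yield one million files; the actual split used in the quoted test is not stated. The files are created in the upper directory before overlayfs is mounted, so the later lookup through the merge directory instantiates overlay inodes without modifying anything.
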
>  > > Tested five times; the results are in the range 1.310s ~ 1.326s
>  > > 
>  > > [root@VM-144-4-centos test]# time ./syncfs ovl-merge/create-file.sh 
>  > > syncfs success
>  > > 
>  > > real    0m1.310s
>  > > user    0m0.000s
>  > > sys     0m0.001s
>  > > [root@VM-144-4-centos test]# time ./syncfs ovl-merge/create-file.sh 
>  > > syncfs success
>  > > 
>  > > real    0m1.326s
>  > > user    0m0.001s
>  > > sys     0m0.000s
>  > > [root@VM-144-4-centos test]# time ./syncfs ovl-merge/create-file.sh 
>  > > syncfs success
>  > > 
>  > > real    0m1.321s
>  > > user    0m0.000s
>  > > sys     0m0.001s
>  > > [root@VM-144-4-centos test]# time ./syncfs ovl-merge/create-file.sh 
>  > > syncfs success
>  > > 
>  > > real    0m1.316s
>  > > user    0m0.000s
>  > > sys     0m0.001s
>  > > [root@VM-144-4-centos test]# time ./syncfs ovl-merge/create-file.sh 
>  > > syncfs success
>  > > 
>  > > real    0m1.314s
>  > > user    0m0.001s
>  > > sys     0m0.001s
>  > > 
>  > > 
>  > > Directly running syncfs on a file in the ovl-upper dir.
>  > > Tested five times; the results are in the range 0.001s ~ 0.003s
>  > > 
>  > > [root@VM-144-4-centos test]# time ./syncfs a
>  > > syncfs success
>  > > 
>  > > real    0m0.002s
>  > > user    0m0.001s
>  > > sys     0m0.000s
>  > > [root@VM-144-4-centos test]# time ./syncfs ovl-upper/create-file.sh 
>  > > syncfs success
>  > > 
>  > > real    0m0.003s
>  > > user    0m0.001s
>  > > sys     0m0.000s
>  > > [root@VM-144-4-centos test]# time ./syncfs ovl-upper/create-file.sh 
>  > > syncfs success
>  > > 
>  > > real    0m0.001s
>  > > user    0m0.000s
>  > > sys     0m0.001s
>  > > [root@VM-144-4-centos test]# time ./syncfs ovl-upper/create-file.sh 
>  > > syncfs success
>  > > 
>  > > real    0m0.001s
>  > > user    0m0.000s
>  > > sys     0m0.001s
>  > > [root@VM-144-4-centos test]# time ./syncfs ovl-upper/create-file.sh 
>  > > syncfs success
>  > > 
>  > > real    0m0.001s
>  > > user    0m0.000s
>  > > sys     0m0.001s
>  > > [root@VM-144-4-centos test]# time ./syncfs ovl-upper/create-file.sh 
>  > > syncfs success
>  > > 
>  > > real    0m0.001s
>  > > user    0m0.000s
>  > > sys     0m0.001s
>  > > 
>  > -- 
>  > Jan Kara <jack@suse.com>
>  > SUSE Labs, CR
>  > 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
