All of lore.kernel.org
 help / color / mirror / Atom feed
From: Amir Goldstein <amir73il@gmail.com>
To: Chengguang Xu <cgxu519@mykernel.net>
Cc: Jan Kara <jack@suse.cz>, miklos <miklos@szeredi.hu>,
	linux-unionfs <linux-unionfs@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Al Viro <viro@zeniv.linux.org.uk>
Subject: Re: [RFC PATCH 1/5] fs: introduce notifier list for vfs inode
Date: Thu, 15 Oct 2020 15:32:24 +0300	[thread overview]
Message-ID: <CAOQ4uxhEA=ggONsJrUzGfHOGHob+81-UHk1Wo9ejj=CziAjtTQ@mail.gmail.com> (raw)
In-Reply-To: <1752c05cbe5.fd554a7a44272.2744418978249296745@mykernel.net>

>  > Perhaps you can combine that with the shadow overlay sbi approach.
>  > Instead of dirtying overlay inode when underlying is dirtied, you can
>  > "pre-dirty" overlayfs inode in higher level file ops and add them to the
>  > "maybe-dirty" list (e.g. after write).
>  >
>
> Main problem is we can't be notified by set_page_dirty from writable mmap.
> Meanwhile, if we dirty overlay inode then writeback will pick up dirty overlay
> inode and clear it after syncing, then overlay inode could be release at any time,
> so in the end, maybe overlay inode is released but upper inode is still dirty and
> there is no any pointer to find upper dirty inode out.
>

But we can control whether overlay inode is release with ovl_drop_inode()
right? Can we prevent dropping overlay inode if upper inode is
inode_is_open_for_write()?

About re-dirty, see below...

>
>  > ovl_sync_fs() can iterate the maybe-dirty list and re-dirty overlay inodes
>  > if the underlying inode is still dirty on the (!wait) pass.
>  >
>  > As for memory mapped inodes via overlayfs (which can be dirtied without
>  > notifying overlayfs) I am not sure that is a big problem in practice.
>  >
>
> Yes, it's key problem here.
>
>  > When an inode is writably mapped via ovarlayfs, you can flag the
>  > overlay inode with "maybe-writably-mapped" and then remove
>  > it from the maybe dirty list when the underlying inode is not dirty
>  > AND its i_writecount is 0 (checked on write_inode() and release()).
>  >
>  > Actually, there is no reason to treat writably mapped inodes and
>  > other dirty inodes differently - insert to suspect list on open for
>  > write, remove from suspect list on last release() or write_inode()
>  > when inode is no longer dirty and writable.
>  >
>  > Did I miss anything?
>  >
>
> If we dirty overlay inode that means we have launched writeback mechanism,
> so in this case, re-dirty overlay inode in time becomes important.
>

My idea was to use the first call to ovl_sync_fs() with 'wait' false
to iterate the
maybe-dirty list and re-dirty overlay inodes whose upper is dirty.

Then in the second call to __sync_filesystem, sync_inodes_sb() will take
care of calling ovl_write_inode() for all the re-dirty inodes.

In current code we sync ALL dirty upper fs inodes and we do not overlay
inodes with no reference in cache.

The best code would sync only upper fs inodes dirtied by this overlay
and will be able to evict overlay inodes whose upper inodes are clean.

The compromise code would sync only upper fs inodes dirtied by this overlay,
and would not evict overlay inodes as long as upper inodes are "open for write".
I think its a fine compromise considering the alternatives.

Is this workable?

Thanks,
Amir.

  reply	other threads:[~2020-10-15 12:32 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-10 14:23 [RFC PATCH 0/5] implement containerized syncfs for overlayfs Chengguang Xu
2020-10-10 14:23 ` [RFC PATCH 1/5] fs: introduce notifier list for vfs inode Chengguang Xu
2020-10-14 16:15   ` Jan Kara
2020-10-15  3:03     ` Chengguang Xu
2020-10-15  6:11       ` Amir Goldstein
2020-10-15 11:29         ` Chengguang Xu
2020-10-15 12:32           ` Amir Goldstein [this message]
2020-10-15 13:13             ` Chengguang Xu
2020-10-15 16:02               ` Amir Goldstein
2020-10-15 16:06                 ` Amir Goldstein
2020-10-16  1:56                 ` Chengguang Xu
2020-10-16  4:39                   ` Amir Goldstein
2020-10-16  7:43                     ` Chengguang Xu
2020-10-15  3:25   ` Al Viro
2020-10-15  3:42     ` Chengguang Xu
2020-10-15  4:57       ` Al Viro
2020-10-15 10:56         ` Chengguang Xu
2020-10-16  7:09         ` Chengguang Xu
2020-10-16  9:22           ` Jan Kara
2020-10-10 14:23 ` [RFC PATCH 2/5] fs: export symbol of writeback_single_inode() Chengguang Xu
2020-10-10 14:23 ` [RFC PATCH 3/5] ovl: setup overlayfs' private bdi Chengguang Xu
2020-10-10 14:23 ` [RFC PATCH 4/5] ovl: monitor marking dirty activity of underlying upper inode Chengguang Xu
2020-10-11  1:16   ` kernel test robot
2020-10-11  1:55   ` kernel test robot
2020-10-10 14:23 ` [RFC PATCH 5/5] ovl: impement containerized syncfs for overlayfs Chengguang Xu
2020-10-11  2:57   ` kernel test robot
2020-10-11  3:10   ` kernel test robot
2020-10-11 13:08   ` kernel test robot
2020-10-12 12:31   ` Dan Carpenter
2020-10-12 12:31     ` Dan Carpenter

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAOQ4uxhEA=ggONsJrUzGfHOGHob+81-UHk1Wo9ejj=CziAjtTQ@mail.gmail.com' \
    --to=amir73il@gmail.com \
    --cc=cgxu519@mykernel.net \
    --cc=jack@suse.cz \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-unionfs@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.