linux-unionfs.vger.kernel.org archive mirror
From: Chengguang Xu <cgxu519@mykernel.net>
To: "Jan Kara" <jack@suse.cz>
Cc: "Miklos Szeredi" <miklos@szeredi.hu>,
	"Amir Goldstein" <amir73il@gmail.com>,
	"overlayfs" <linux-unionfs@vger.kernel.org>
Subject: Re: [PATCH v11] ovl: Improving syncfs efficiency
Date: Fri, 17 Apr 2020 09:23:07 +0800
Message-ID: <17185bb747c.ce54f2d81371.5054043296955656018@mykernel.net>
In-Reply-To: <20200416143349.GP23739@quack2.suse.cz>

 ---- On Thursday, 2020-04-16 22:33:49 Jan Kara <jack@suse.cz> wrote ----
 > On Thu 16-04-20 21:52:27, Chengguang Xu wrote:
 > >  ---- On Thursday, 2020-04-16 19:14:24 Jan Kara <jack@suse.cz> wrote ----
 > >  > On Thu 16-04-20 14:08:59, Chengguang Xu wrote:
 > >  > >  ---- On Thursday, 2020-04-16 03:19:50 Miklos Szeredi <miklos@szeredi.hu> wrote ----
 > >  > >  > On Mon, Feb 10, 2020 at 4:10 AM Chengguang Xu <cgxu519@mykernel.net> wrote:
 > >  > >  > > +void ovl_evict_inode(struct inode *inode)
 > >  > >  > > +{
 > >  > >  > > +       struct ovl_inode *oi = OVL_I(inode);
 > >  > >  > > +       struct ovl_write_inode_work ovl_wiw;
 > >  > >  > > +       DEFINE_WAIT_BIT(wait, &oi->flags, OVL_WRITE_INODE_PENDING);
 > >  > >  > > +       wait_queue_head_t *wqh;
 > >  > >  > > +
 > >  > >  > > +       if (ovl_inode_upper(inode)) {
 > >  > >  > > +               if (current->flags & PF_MEMALLOC) {
 > >  > >  > > +                       spin_lock(&inode->i_lock);
 > >  > >  > > +                       ovl_set_flag(OVL_WRITE_INODE_PENDING, inode);
 > >  > >  > > +                       wqh = bit_waitqueue(&oi->flags,
 > >  > >  > > +                                       OVL_WRITE_INODE_PENDING);
 > >  > >  > > +                       prepare_to_wait(wqh, &wait.wq_entry,
 > >  > >  > > +                                       TASK_UNINTERRUPTIBLE);
 > >  > >  > > +                       spin_unlock(&inode->i_lock);
 > >  > >  > > +
 > >  > >  > > +                       ovl_wiw.inode = inode;
 > >  > >  > > +                       INIT_WORK(&ovl_wiw.work, ovl_write_inode_work_fn);
 > >  > >  > > +                       schedule_work(&ovl_wiw.work);
 > >  > >  > > +
 > >  > >  > > +                       schedule();
 > >  > >  > > +                       finish_wait(wqh, &wait.wq_entry);
 > >  > >  > 
 > >  > >  > What is the reason to do this in another thread if this is a PF_MEMALLOC task?
 > >  > > 
 > >  > > Some underlying filesystems (for example ext4) check the flag in
 > >  > > ->write_inode() and treat it as an abnormal case (warn and return).
 > >  > > 
 > >  > > ext4_write_inode():
 > >  > >         if (WARN_ON_ONCE(current->flags & PF_MEMALLOC) ||
 > >  > >                 sb_rdonly(inode->i_sb))
 > >  > >                         return 0;
 > >  > > 
 > >  > > overlayfs inodes are always kept clean, even after writing/modifying the
 > >  > > upper file, so they are a valid target for kswapd. From the point of view
 > >  > > of the lower layer, however, ext4 just thinks kswapd is choosing a wrong
 > >  > > (dirty) inode to reclaim memory.
 > >  > 
 > >  > In ext4, it isn't a big problem if ext4_write_inode() is called from
 > >  > kswapd. But if ext4_write_inode() is called from direct reclaim (which also
 > >  > sets PF_MEMALLOC) we can deadlock because we may wait for transaction
 > >  > commit and transaction commit may require locks (such as page lock or
 > >  > waiting for page writeback to complete) which are held by the task
 > >  > currently in direct reclaim. Your push to workqueue will silence the
 > >  > warning but does not solve the possible deadlock.
 > >  > 
 > >  > I'm actually not sure why you need to writeback the upper inode when
 > >  > reclaiming overlayfs inode. Why not leave it on flush worker on upper fs?
 > >  > 
 > > 
 > > Because it is the last chance we have to sync the dirty upper inode. After
 > > evicting the overlayfs inode, we cannot find the associated dirty upper inode
 > > on any list, so that dirty upper inode would be skipped by syncfs().
 > 
 > I see. But this flushing of dirty inodes on reclaim really isn't a great
 > idea. It can also stall reclaim (due to it being stuck waiting for IO) and
 > thus lead to bad behavior in low memory situations. It's better to just
 > skip reclaiming such inodes - but then I agree it's a difficult question
 > when to reclaim them. Ideally you'd need to hook into inode_lru_isolate()
 > but that's just too ugly (but maybe it could be done in some clean manner).
 > 

How about preparing another list to track this kind of inode? When evicting an
overlay inode, we grab a reference to the dirty upper inode, put an entry (with a
field pointing to the upper inode) on this new list, and return. After that, we
periodically check the list entries and release (iput) the upper inodes that have
become clean.

In overlayfs's syncfs(), we would also need to iterate over this new list after
iterating over oi->upper_inodes_list.
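
To make the bookkeeping concrete, here is a rough userspace sketch of that idea.
It is only a model under stated assumptions, not patch code: struct upper_inode,
struct deferred_entry, struct deferred_list and the deferred_*() helpers are all
hypothetical names, the refcount increment/decrement stands in for igrab()/iput(),
and clearing the dirty flag stands in for real writeback.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for an upper-layer inode (NOT the kernel's struct inode). */
struct upper_inode {
	int refcount;	/* models i_count */
	bool dirty;	/* models the inode's dirty state */
};

/* Hypothetical entry on the proposed "evicted but still dirty" list. */
struct deferred_entry {
	struct upper_inode *upper;
	struct deferred_entry *next;
};

struct deferred_list {
	struct deferred_entry *head;
	size_t len;
};

/*
 * Evict path: pin a dirty upper inode and return without doing any I/O.
 * Clean inodes are not tracked. Returns false only if allocation fails.
 * (The malloc() is part of the model only.)
 */
static bool deferred_add(struct deferred_list *dl, struct upper_inode *upper)
{
	struct deferred_entry *e;

	if (!upper->dirty)
		return true;

	e = malloc(sizeof(*e));
	if (!e)
		return false;

	upper->refcount++;	/* models igrab() */
	e->upper = upper;
	e->next = dl->head;
	dl->head = e;
	dl->len++;
	return true;
}

/* Periodic pass: release entries whose upper inode has become clean. */
static size_t deferred_reap(struct deferred_list *dl)
{
	struct deferred_entry **pp = &dl->head;
	size_t released = 0;

	while (*pp) {
		struct deferred_entry *e = *pp;

		if (e->upper->dirty) {
			pp = &e->next;
			continue;
		}
		*pp = e->next;
		assert(e->upper->refcount > 0);
		e->upper->refcount--;	/* models iput() */
		free(e);
		dl->len--;
		released++;
	}
	return released;
}

/* syncfs pass over the new list: "write back" every pinned dirty inode. */
static size_t deferred_sync(struct deferred_list *dl)
{
	size_t written = 0;

	for (struct deferred_entry *e = dl->head; e; e = e->next) {
		if (e->upper->dirty) {
			e->upper->dirty = false;	/* models writeback */
			written++;
		}
	}
	return written;
}
```

In this model the evict path only pins the inode and never waits for I/O; the
entry is dropped by a later deferred_reap() once deferred_sync() (or the upper
fs flusher) has cleaned it. Whether an allocation is acceptable in the real
evict path under reclaim is left open here.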

Thanks,
cgxu

Thread overview: 14+ messages
2020-02-10  3:10 [PATCH v11] ovl: Improving syncfs efficiency Chengguang Xu
2020-03-17  4:41 ` Re: [PATCH " Chengguang Xu
2020-03-17  4:49   ` Chengguang Xu
2020-04-15 19:19 ` [PATCH " Miklos Szeredi
2020-04-16  6:08   ` Chengguang Xu
2020-04-16  7:21     ` Miklos Szeredi
2020-04-16  7:39       ` Miklos Szeredi
2020-04-16  8:00         ` Miklos Szeredi
2020-04-16 11:16           ` Jan Kara
2020-04-16 11:14     ` Jan Kara
2020-04-16 13:52       ` Chengguang Xu
2020-04-16 14:33         ` Jan Kara
2020-04-17  1:23           ` Chengguang Xu [this message]
2020-04-17  9:26             ` Jan Kara