From: Jan Kara <jack@suse.cz>
To: Miklos Szeredi <miklos@szeredi.hu>
Cc: Chengguang Xu <cgxu519@mykernel.net>,
Amir Goldstein <amir73il@gmail.com>, Jan Kara <jack@suse.cz>,
overlayfs <linux-unionfs@vger.kernel.org>
Subject: Re: [PATCH v11] ovl: Improving syncfs efficiency
Date: Thu, 16 Apr 2020 13:16:26 +0200
Message-ID: <20200416111626.GG23739@quack2.suse.cz>
In-Reply-To: <CAJfpegsooG8-kVkozZYf3sP_UL78nTAaqmcUyPTktTiyzSa+Kw@mail.gmail.com>

On Thu 16-04-20 10:00:13, Miklos Szeredi wrote:
> On Thu, Apr 16, 2020 at 9:39 AM Miklos Szeredi <miklos@szeredi.hu> wrote:
> >
> > On Thu, Apr 16, 2020 at 9:21 AM Miklos Szeredi <miklos@szeredi.hu> wrote:
> > >
> > > On Thu, Apr 16, 2020 at 8:09 AM Chengguang Xu <cgxu519@mykernel.net> wrote:
> > > >
> > > > ---- On Thursday, 2020-04-16 03:19:50 Miklos Szeredi <miklos@szeredi.hu> wrote ----
> > > > > On Mon, Feb 10, 2020 at 4:10 AM Chengguang Xu <cgxu519@mykernel.net> wrote:
> > >
> > > > > > > +	if (current->flags & PF_MEMALLOC) {
> > > > > > > +		spin_lock(&inode->i_lock);
> > > > > > > +		ovl_set_flag(OVL_WRITE_INODE_PENDING, inode);
> > > > > > > +		wqh = bit_waitqueue(&oi->flags,
> > > > > > > +				OVL_WRITE_INODE_PENDING);
> > > > > > > +		prepare_to_wait(wqh, &wait.wq_entry,
> > > > > > > +				TASK_UNINTERRUPTIBLE);
> > > > > > > +		spin_unlock(&inode->i_lock);
> > > > > > > +
> > > > > > > +		ovl_wiw.inode = inode;
> > > > > > > +		INIT_WORK(&ovl_wiw.work, ovl_write_inode_work_fn);
> > > > > > > +		schedule_work(&ovl_wiw.work);
> > > > > > > +
> > > > > > > +		schedule();
> > > > > > > +		finish_wait(wqh, &wait.wq_entry);
> > > > >
> > > > > What is the reason to do this in another thread if this is a PF_MEMALLOC task?
> > > >
> > > > Some underlying filesystems (for example ext4) check the flag in ->write_inode()
> > > > and treat it as an abnormal case (warn and return).
> > > >
> > > > ext4_write_inode():
> > > >	if (WARN_ON_ONCE(current->flags & PF_MEMALLOC) ||
> > > >	    sb_rdonly(inode->i_sb))
> > > >		return 0;
> > > >
> > > > overlayfs inodes always stay clean, even after the upper file is written or
> > > > modified, so they are legitimate targets for kswapd. From the lower layer's
> > > > point of view, however, ext4 just thinks kswapd picked the wrong, dirty inode
> > > > to reclaim memory from.
> > >
> > > I don't get it. Why can't overlayfs just skip the writeback of upper
> > > inode in the reclaim case? It will be written back through the normal
> > > reclaim channels.
> >
> > And how do we get reclaim on overlay inode at all? Overlay inodes
> > will get evicted immediately after their refcount drops to zero, so
> > there's absolutely no chance that memory reclaim will encounter them,
> > no?
>
> Spoke too soon. Obviously this case is about dentry reclaim, not
> inode reclaim.
>
> So how about temporarily clearing PF_MEMALLOC in this case? Doing
> this in a kernel thread doesn't seem to add any advantages.

Clearing PF_MEMALLOC will not solve the deadlock I've described in the
reply to Chengguang. Ext4 really cannot safely handle data integrity
writeback (which is what write_inode_now(inode, 1) does) from direct
reclaim.

								Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
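
Editorial illustration, not part of the original message: a minimal
userspace sketch of why clearing the flag only silences the check
quoted from ext4_write_inode() without changing the calling context.
Everything below is a hypothetical model. The MODEL_* names, struct
model_task and the in_direct_reclaim field are assumptions made for
illustration; none of it is kernel code, and it does not reproduce the
deadlock described in the other reply.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; stand-in for the kernel's PF_MEMALLOC bit. */
#define MODEL_PF_MEMALLOC 0x0800u

struct model_task {
	unsigned int flags;      /* models current->flags */
	bool in_direct_reclaim;  /* hypothetical: the real calling context */
};

enum model_result {
	MODEL_SKIPPED,            /* flag check fired: warn and return 0 */
	MODEL_WROTE_SAFELY,       /* writeback ran outside reclaim */
	MODEL_WROTE_FROM_RECLAIM, /* writeback ran from reclaim: the unsafe case */
};

/* Models the flag check quoted from ext4_write_inode() in the thread. */
enum model_result model_write_inode(const struct model_task *t)
{
	if (t->flags & MODEL_PF_MEMALLOC)
		return MODEL_SKIPPED;
	return t->in_direct_reclaim ? MODEL_WROTE_FROM_RECLAIM
				    : MODEL_WROTE_SAFELY;
}

/* Models "temporarily clearing PF_MEMALLOC" around the call. */
enum model_result model_write_inode_flag_cleared(struct model_task *t)
{
	unsigned int saved = t->flags & MODEL_PF_MEMALLOC;
	enum model_result res;

	t->flags &= ~MODEL_PF_MEMALLOC;
	res = model_write_inode(t);
	t->flags |= saved;	/* restore the caller's flag state */
	return res;
}

/* Runs the scenario; returns 0 if the model behaves as described. */
int model_demo(void)
{
	struct model_task reclaimer = {
		.flags = MODEL_PF_MEMALLOC,
		.in_direct_reclaim = true,
	};

	/* With the flag set, the check skips the write. */
	assert(model_write_inode(&reclaimer) == MODEL_SKIPPED);
	/* With the flag cleared, the write runs, still from reclaim. */
	assert(model_write_inode_flag_cleared(&reclaimer) ==
	       MODEL_WROTE_FROM_RECLAIM);
	/* The flag is restored afterwards. */
	assert(reclaimer.flags & MODEL_PF_MEMALLOC);
	return 0;
}
```

In this model the flag is only a marker: clearing it gets past the
check, but the writeback still runs in the reclaim context, which is
the situation the reply above says ext4 cannot handle safely.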