linux-fsdevel.vger.kernel.org archive mirror
From: Matthew Wilcox <willy@infradead.org>
To: Goldwyn Rodrigues <rgoldwyn@suse.de>
Cc: Michal Hocko <mhocko@kernel.org>,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	david@fromorbit.com
Subject: Re: [PATCH 1/3] fs: Perform writebacks under memalloc_nofs
Date: Tue, 27 Mar 2018 09:45:01 -0700
Message-ID: <20180327164501.GA21975@bombadil.infradead.org>
In-Reply-To: <3a96b6ff-7d55-9bb6-8a30-f32f5dd0b054@suse.de>

On Tue, Mar 27, 2018 at 10:13:53AM -0500, Goldwyn Rodrigues wrote:
> On 03/27/2018 09:21 AM, Matthew Wilcox wrote:
> > On Tue, Mar 27, 2018 at 07:52:48AM -0500, Goldwyn Rodrigues wrote:
> >> I am not sure if I missed a condition in the code, but here is one of
> >> the call lineup:
> >>
> >> writepages() -> writepage() -> kmalloc() -> __alloc_pages() ->
> >> __alloc_pages_nodemask -> __alloc_pages_slowpath ->
> >> __alloc_pages_direct_reclaim() -> try_to_free_pages() ->
> >> do_try_to_free_pages() -> shrink_zones() -> shrink_node() ->
> >> shrink_slab() -> do_shrink_slab() -> shrinker.scan_objects() ->
> >> super_cache_scan() -> prune_icache_sb() -> fs/inode.c:dispose_list() ->
> >> evict(inode) -> evict_inode() for ext4 ->  filemap_write_and_wait() ->
> >> filemap_fdatawrite(mapping) -> __filemap_fdatawrite_range() ->
> >> do_writepages -> writepages()
> >>
> >> Please note, most filesystems currently have a safeguard in writepage()
> >> which will return if the PF_MEMALLOC is set. The other safeguard is
> >> __GFP_FS which we are trying to eliminate.
> > 
> > But is that harmful?  ext4_writepage() (for example) says that it will
> > not deadlock in that circumstance:
> 
> No, it is not harmful.
> 
> > 
> >  * We can get recursively called as show below.
> >  *
> >  *      ext4_writepage() -> kmalloc() -> __alloc_pages() -> page_launder() ->
> >  *              ext4_writepage()
> >  *
> >  * But since we don't do any block allocation we should not deadlock.
> >  * Page also have the dirty flag cleared so we don't get recurive page_lock.
> 
> Yes, and it avoids this by checking for PF_MEMALLOC flag.
> 
> > 
> > One might well argue that it's not *useful*; if we've gone into
> > writepage already, there's no point in re-entering writepage.  And the
> > last thing we want to do is 
> 
> ?

Sorry, got cut off.  The last thing we want to do is blow the stack by
recursing too deeply, but I don't think we're going to go through this
loop more than once.
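
To make the "no more than once" claim concrete, here is a toy userspace model of the call chain quoted above. It is a sketch, not kernel code: `model_writepage()`, `model_direct_reclaim()`, `task_flags` and the counters are hypothetical names, and the bit value given to `PF_MEMALLOC` is illustrative. The only behaviour taken from this thread is that writepage() returns early when PF_MEMALLOC is set.

```c
#include <assert.h>

#define PF_MEMALLOC 0x1u        /* illustrative bit value, not the kernel's */

static unsigned int task_flags; /* stand-in for current->flags */
static int writepage_entries;   /* how many times writepage was entered */
static int pages_written;       /* how many entries did real work */

static int model_writepage(void);

/* Stand-in for direct reclaim: marks the task and re-enters writeback. */
static void model_direct_reclaim(void)
{
    unsigned int old = task_flags & PF_MEMALLOC;

    task_flags |= PF_MEMALLOC;
    model_writepage();          /* the recursive entry from the quoted chain */
    task_flags = (task_flags & ~PF_MEMALLOC) | old;
}

static int model_writepage(void)
{
    writepage_entries++;

    /* The per-filesystem safeguard: bail out if we came from reclaim. */
    if (task_flags & PF_MEMALLOC)
        return -1;

    /* Pretend the kmalloc() in writepage fell into direct reclaim. */
    model_direct_reclaim();

    pages_written++;
    return 0;
}
```

In this model, one top-level call to model_writepage() enters writepage twice in total: the outer call does the work, and the nested call returns at the flag check without recursing further.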

> > But I could see filesystems behaving differently when entered
> > for writepage-for-regularly-scheduled-writeback versus
> > writepage-for-shrinking, so maybe they can make progress.
> > 
> 
> do_writepages() is the same for both, and hence the memalloc_* API patch.

But we don't want to avoid this particular recursion.  We only need to
avoid the recursion if it would result in a deadlock.

> > Maybe no real filesystem behaves that way.  We need feedback from
> > filesystem people.
> 
> The idea is to:
> * Keep a central location for the check, rather than one in each
> filesystem's writepage(). It should reduce code as well.
> * Let filesystem developers allocate memory without thinking twice
> about which GFP flag to use: GFP_KERNEL or GFP_NOFS. In essence,
> eliminate GFP_NOFS.

I know the goal is to eliminate GFP_NOFS.  I'm very much in favour
of that idea.  I'm just not sure you're going about it the right way.
Probably we will have a good discussion about it next month.
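
For readers who have not used the scope API named in the patch series, a minimal userspace sketch of the idea follows. The names memalloc_nofs_save(), memalloc_nofs_restore(), PF_MEMALLOC_NOFS, __GFP_FS, GFP_KERNEL and GFP_NOFS mirror the kernel's, but the bodies and bit values here are simplified assumptions. `task_flags`, `effective_gfp()` and `writeback_alloc_mask()` are hypothetical helpers for illustration only.

```c
#include <assert.h>

typedef unsigned int gfp_t;

/* Illustrative bit values; the kernel's real masks differ. */
#define __GFP_IO         0x1u
#define __GFP_FS         0x2u
#define GFP_KERNEL       (__GFP_IO | __GFP_FS)
#define GFP_NOFS         (__GFP_IO)
#define PF_MEMALLOC_NOFS 0x1u

static unsigned int task_flags; /* stand-in for current->flags */

/* Enter a NOFS scope; returns the previous state so scopes can nest. */
static unsigned int memalloc_nofs_save(void)
{
    unsigned int old = task_flags & PF_MEMALLOC_NOFS;

    task_flags |= PF_MEMALLOC_NOFS;
    return old;
}

/* Leave a NOFS scope, restoring whatever state the matching save saw. */
static void memalloc_nofs_restore(unsigned int old)
{
    task_flags = (task_flags & ~PF_MEMALLOC_NOFS) | old;
}

/* Hypothetical helper: the mask an allocation would effectively use. */
static gfp_t effective_gfp(gfp_t gfp)
{
    if (task_flags & PF_MEMALLOC_NOFS)
        gfp &= ~__GFP_FS;
    return gfp;
}

/* Hypothetical writeback path: the caller asks for GFP_KERNEL throughout. */
static gfp_t writeback_alloc_mask(void)
{
    unsigned int old = memalloc_nofs_save();
    gfp_t used = effective_gfp(GFP_KERNEL);

    memalloc_nofs_restore(old);
    return used;
}
```

Inside the scope a GFP_KERNEL request behaves like GFP_NOFS, and outside it is unchanged. The open question in this thread is where such scopes belong, not how they work.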
