From: Goldwyn Rodrigues <rgoldwyn@suse.de>
To: Matthew Wilcox <willy@infradead.org>,
	Goldwyn Rodrigues <rgoldwyn@suse.de>
Cc: Michal Hocko <mhocko@kernel.org>,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	david@fromorbit.com
Subject: Re: [PATCH 1/3] fs: Perform writebacks under memalloc_nofs
Date: Tue, 27 Mar 2018 10:13:53 -0500
Message-ID: <3a96b6ff-7d55-9bb6-8a30-f32f5dd0b054@suse.de>
In-Reply-To: <20180327142150.GA13604@bombadil.infradead.org>



On 03/27/2018 09:21 AM, Matthew Wilcox wrote:
> On Tue, Mar 27, 2018 at 07:52:48AM -0500, Goldwyn Rodrigues wrote:
>> I am not sure if I missed a condition in the code, but here is one of
>> the call lineup:
>>
>> writepages() -> writepage() -> kmalloc() -> __alloc_pages() ->
>> __alloc_pages_nodemask -> __alloc_pages_slowpath ->
>> __alloc_pages_direct_reclaim() -> try_to_free_pages() ->
>> do_try_to_free_pages() -> shrink_zones() -> shrink_node() ->
>> shrink_slab() -> do_shrink_slab() -> shrinker.scan_objects() ->
>> super_cache_scan() -> prune_icache_sb() -> fs/inode.c:dispose_list() ->
>> evict(inode) -> evict_inode() for ext4 ->  filemap_write_and_wait() ->
>> filemap_fdatawrite(mapping) -> __filemap_fdatawrite_range() ->
>> do_writepages -> writepages()
>>
>> Please note, most filesystems currently have a safeguard in writepage()
>> which will return if the PF_MEMALLOC is set. The other safeguard is
>> __GFP_FS which we are trying to eliminate.
> 
> But is that harmful?  ext4_writepage() (for example) says that it will
> not deadlock in that circumstance:

No, it is not harmful.

> 
>  * We can get recursively called as show below.
>  *
>  *      ext4_writepage() -> kmalloc() -> __alloc_pages() -> page_launder() ->
>  *              ext4_writepage()
>  *
>  * But since we don't do any block allocation we should not deadlock.
>  * Page also have the dirty flag cleared so we don't get recurive page_lock.

Yes, and it avoids this by checking for the PF_MEMALLOC flag.
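
As a rough illustration of that kind of guard, here is a minimal userspace
sketch. It is not the ext4 code: struct sim_task, SIM_PF_MEMALLOC and
sim_writepage() are hypothetical stand-ins for current->flags, PF_MEMALLOC
and a filesystem's writepage(), and the flag value is made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for PF_MEMALLOC; the value is illustrative only. */
#define SIM_PF_MEMALLOC 0x1u

/* Hypothetical stand-in for the parts of the task we care about. */
struct sim_task {
	unsigned int flags;	/* models current->flags */
	int writes_issued;	/* counts pages actually written */
};

/*
 * Model of a writepage() with the per-filesystem safeguard.
 * Returns true if the page was written, false if the call bailed out
 * because it was entered from memory reclaim. A real filesystem would
 * typically redirty the page before returning in that case.
 */
bool sim_writepage(struct sim_task *t)
{
	assert(t != NULL);

	if (t->flags & SIM_PF_MEMALLOC)
		return false;	/* re-entered from reclaim: do nothing */

	t->writes_issued++;
	return true;
}
```

The point of the sketch is only that every writepage() implementation has
to carry its own copy of this test.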

> 
> One might well argue that it's not *useful*; if we've gone into
> writepage already, there's no point in re-entering writepage.  And the
> last thing we want to do is 

?

> But I could see filesystems behaving differently when entered
> for writepage-for-regularly-scheduled-writeback versus
> writepage-for-shrinking, so maybe they can make progress.
> 

do_writepages() is the common entry point for both cases, hence the
memalloc_* scope API patch.

> Maybe no real filesystem behaves that way.  We need feedback from
> filesystem people.

The idea is to:
* Keep the check in one central location, rather than in each
filesystem's writepage(). This should reduce code as well.
* Let filesystem developers call memory allocation functions without
thinking twice about which GFP flag to use, GFP_KERNEL or GFP_NOFS. In
essence, eliminate GFP_NOFS.
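
To make the two points above concrete, here is a small userspace model of
the scope API. It is a sketch under assumptions, not the patch itself: the
sim_* functions mimic memalloc_nofs_save(), memalloc_nofs_restore() and
current_gfp_context(), the SIM_* flag values are invented, and
sim_do_writepages() is a hypothetical stand-in for wrapping the writeback
path in a NOFS scope.

```c
#include <assert.h>

/* Illustrative flag values; they do not match the kernel's. */
#define SIM_GFP_IO		0x1u
#define SIM_GFP_FS		0x2u
#define SIM_GFP_KERNEL		(SIM_GFP_IO | SIM_GFP_FS)
#define SIM_GFP_NOFS		(SIM_GFP_IO)
#define SIM_PF_MEMALLOC_NOFS	0x4u

unsigned int sim_current_flags;	/* models current->flags */

/* Enter a NOFS scope and return the previous state of the flag. */
unsigned int sim_memalloc_nofs_save(void)
{
	unsigned int old = sim_current_flags & SIM_PF_MEMALLOC_NOFS;

	sim_current_flags |= SIM_PF_MEMALLOC_NOFS;
	return old;
}

/* Leave a NOFS scope, restoring whatever the enclosing scope had set. */
void sim_memalloc_nofs_restore(unsigned int old)
{
	assert((old & ~SIM_PF_MEMALLOC_NOFS) == 0);
	sim_current_flags = (sim_current_flags & ~SIM_PF_MEMALLOC_NOFS) | old;
}

/* The allocator masks out the FS bit while a NOFS scope is active. */
unsigned int sim_current_gfp_context(unsigned int gfp)
{
	if (sim_current_flags & SIM_PF_MEMALLOC_NOFS)
		gfp &= ~SIM_GFP_FS;
	return gfp;
}

/*
 * Model of the central location: the whole writeback call runs inside
 * one scope, so a filesystem asking for SIM_GFP_KERNEL below it is
 * silently given the NOFS behaviour. Returns the effective mask.
 */
unsigned int sim_do_writepages(void)
{
	unsigned int saved = sim_memalloc_nofs_save();
	unsigned int effective = sim_current_gfp_context(SIM_GFP_KERNEL);

	sim_memalloc_nofs_restore(saved);
	return effective;
}
```

Because save returns the previous state, scopes nest: an inner restore
leaves the flag set if an outer scope is still active.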


-- 
Goldwyn
