From: Michal Hocko <mhocko@kernel.org>
To: Dave Chinner <david@fromorbit.com>
Cc: Mikulas Patocka <mpatocka@redhat.com>,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-xfs@vger.kernel.org, dm-devel@redhat.com,
	Jens Axboe <axboe@kernel.dk>, NeilBrown <neilb@suse.de>
Subject: Re: [PATCH 0/6] Overhaul memalloc_no*
Date: Mon, 29 Jun 2020 10:22:09 +0200
Message-ID: <20200629082209.GC32461@dhcp22.suse.cz>
In-Reply-To: <20200626230847.GI2005@dread.disaster.area>

On Sat 27-06-20 09:08:47, Dave Chinner wrote:
> On Fri, Jun 26, 2020 at 11:02:19AM -0400, Mikulas Patocka wrote:
> > Hi
> > 
> > I suggest joining memalloc_noio and memalloc_nofs into just one flag that
> > prevents both filesystem recursion and I/O recursion.
> > 
> > Note that any I/O can recurse into a filesystem via the loop device, thus 
> > it doesn't make much sense to have a context where PF_MEMALLOC_NOFS is set 
> > and PF_MEMALLOC_NOIO is not set.
> 
> Correct me if I'm wrong, but I think that will prevent swapping from
> GFP_NOFS memory reclaim contexts. IOWs, this will substantially
> change the behaviour of the memory reclaim system under sustained
> GFP_NOFS memory pressure. Sustained GFP_NOFS memory pressure is
> quite common, so I really don't think we want to be telling memory
> reclaim "you can't do IO at all" when all we are trying to do is
> prevent recursion back into the same filesystem.
> 
> Given that the loop device IO path already operates under
> memalloc_noio context (i.e. the recursion restriction is applied
> only in the context that needs it), I see no reason for making that a
> global reclaim limitation....
> 
> In reality, we need to be moving the other way with GFP_NOFS - to
> fine grained anti-recursion contexts, not more broad contexts.

Absolutely agreed! It is not really hard to see a system struggling under
a heavy FS metadata workload while there are objects which could be
reclaimed.

> That is, GFP_NOFS prevents recursion into any filesystem, not just
> the one that we are actively operating on and needing to prevent
> recursion back into. We can safely have reclaim do reclaim work on
> other filesystems without fear of recursion deadlocks, but the
> memory reclaim infrastructure does not provide that capability.(*)
> 
> e.g. if memalloc_nofs_save() took a reclaim context structure in
> which the filesystem put the superblock, the superblock's nesting depth
> (because layering on loop devices can create cross-filesystem
> recursion dependencies), and any other filesystem private data the
> fs wanted to add, we could actually have reclaim only avoid reclaim
> from filesystems where there is a deadlock possibility. e.g.:
> 
> 	- superblock nesting depth is different, apply GFP_NOFS
> 	  reclaim unconditionally
> 	- superblock different, apply GFP_KERNEL reclaim
> 	- superblock the same, pass context to filesystem to
> 	  decide if reclaim from the superblock is safe.
> 
> At this point, memory reclaim is always able to reclaim from
> filesystems that are not at risk of recursion deadlocks. Direct
> reclaim is much more likely to be able to make progress now
> because it is much less restricted in what it can reclaim. That's
> going to make direct reclaim faster and more efficient, and that's
> the ultimate goal we are aiming to achieve here...
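
For reference, the three-way rule quoted above could be sketched roughly
like this. It is a userspace model only, not kernel code: struct sb_model,
struct reclaim_ctx, the same_sb_safe hook, example_same_sb_safe() and
reclaim_mode_for() are all made-up names, and the "no active scope" and
"no hook" defaults are my assumptions rather than part of the proposal.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct super_block; only what the rule needs. */
struct sb_model {
	int nesting_depth;	/* position in a loop-device stack */
};

/* Hypothetical context a filesystem would hand to memalloc_nofs_save(). */
struct reclaim_ctx {
	const struct sb_model *sb;	/* superblock we must not recurse into */
	int (*same_sb_safe)(const struct reclaim_ctx *ctx);	/* fs decision hook */
	void *fs_private;	/* opaque filesystem data */
};

enum reclaim_mode {
	RECLAIM_NOFS,	/* behave like GFP_NOFS for this target */
	RECLAIM_KERNEL,	/* behave like GFP_KERNEL for this target */
};

/*
 * Example hook (illustrative): fs_private points at an int that is
 * non-zero while the filesystem holds locks that reclaim from the same
 * superblock would need.
 */
static int example_same_sb_safe(const struct reclaim_ctx *ctx)
{
	const int *holding_locks = ctx->fs_private;

	return holding_locks != NULL && *holding_locks == 0;
}

/* Decide how reclaim may treat @target while @ctx is the active scope. */
static enum reclaim_mode reclaim_mode_for(const struct reclaim_ctx *ctx,
					  const struct sb_model *target)
{
	assert(target != NULL);

	/* Assumption: no active scope means no restriction. */
	if (ctx == NULL || ctx->sb == NULL)
		return RECLAIM_KERNEL;
	/* Different nesting depth: GFP_NOFS reclaim unconditionally. */
	if (ctx->sb->nesting_depth != target->nesting_depth)
		return RECLAIM_NOFS;
	/* Same depth, different superblock: GFP_KERNEL reclaim. */
	if (ctx->sb != target)
		return RECLAIM_KERNEL;
	/* Same superblock: the filesystem decides. No hook: stay conservative. */
	if (ctx->same_sb_safe == NULL)
		return RECLAIM_NOFS;
	return ctx->same_sb_safe(ctx) ? RECLAIM_KERNEL : RECLAIM_NOFS;
}
```

The checks run in the order of the quoted list, so a nesting depth mismatch
wins over everything else and the filesystem is only consulted for its own
superblock.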

Yes, we discussed something like that a few years back at LSFMM IIRC.
The scoped NOFS/NOIO API was just a first step to reduce explicit
NOFS/NOIO usage, with the hope that we would get much better defined
no-recursion entry points and get rid of many instances of "this is fs
code so it has to use a NOFS gfp mask".
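
To illustrate what "scoped" means here, below is a small userspace model
of the save/restore pattern, loosely following how I understand
memalloc_nofs_save()/memalloc_nofs_restore() and current_gfp_context() to
behave. All MODEL_* names and model_* functions are made up, the bit
values are arbitrary, and the gfp masks are reduced to just the IO and FS
bits, so take it as a sketch and not as the kernel implementation.

```c
#include <assert.h>

/* Illustrative bit values, not the kernel's. */
#define MODEL_GFP_IO		0x1u
#define MODEL_GFP_FS		0x2u
#define MODEL_GFP_KERNEL	(MODEL_GFP_IO | MODEL_GFP_FS)
#define MODEL_GFP_NOFS		MODEL_GFP_IO
#define MODEL_GFP_NOIO		0x0u

#define MODEL_PF_NOFS		0x1u
#define MODEL_PF_NOIO		0x2u

/* Stand-in for current->flags. */
static unsigned int model_task_flags;

/* Enter a scope: set the bit and return its previous state for nesting. */
static unsigned int model_scope_save(unsigned int pf_bit)
{
	unsigned int old = model_task_flags & pf_bit;

	model_task_flags |= pf_bit;
	return old;
}

/* Leave a scope: put the bit back to what the matching save returned. */
static void model_scope_restore(unsigned int pf_bit, unsigned int old)
{
	assert((old & ~pf_bit) == 0);
	model_task_flags = (model_task_flags & ~pf_bit) | old;
}

/* Apply the active scopes to the gfp mask of an allocation request. */
static unsigned int model_gfp_context(unsigned int gfp)
{
	if (model_task_flags & MODEL_PF_NOIO)
		gfp &= ~(MODEL_GFP_IO | MODEL_GFP_FS);	/* NOIO implies NOFS */
	else if (model_task_flags & MODEL_PF_NOFS)
		gfp &= ~MODEL_GFP_FS;
	return gfp;
}
```

The point of the pattern is that code called inside the scope can keep
asking for GFP_KERNEL and still gets the restricted mask, and that an inner
save/restore pair does not end the outer scope early. In this model a NOFS
scope leaves the IO bit alone, while a NOIO scope drops both bits.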

Some of that has happened and that is really great. On the other hand,
many people still like to use that API as a workaround for an immediate
problem, because no-recursion scopes are much harder to recognize unless
you are super familiar with the specific fs/IO layer implementation.
So this is definitely not a project where somebody can just go over all
the code and do the cleanup.

Thanks!
-- 
Michal Hocko
SUSE Labs