linux-xfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dave Chinner <david@fromorbit.com>
To: NeilBrown <neilb@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Theodore Ts'o <tytso@mit.edu>,
	Andreas Dilger <adilger.kernel@dilger.ca>,
	"Darrick J. Wong" <djwong@kernel.org>,
	Matthew Wilcox <willy@infradead.org>,
	Mel Gorman <mgorman@suse.com>,
	linux-xfs@vger.kernel.org, linux-ext4@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-nfs@vger.kernel.org,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 6/6] XFS: remove congestion_wait() loop from xfs_buf_alloc_pages()
Date: Tue, 14 Sep 2021 15:33:00 +1000	[thread overview]
Message-ID: <20210914053300.GI2361455@dread.disaster.area> (raw)
In-Reply-To: <163158695921.3992.9776900395549582360@noble.neil.brown.name>

On Tue, Sep 14, 2021 at 12:35:59PM +1000, NeilBrown wrote:
> On Tue, 14 Sep 2021, Dave Chinner wrote:
> > On Tue, Sep 14, 2021 at 10:13:04AM +1000, NeilBrown wrote:
> > > Documentation commment in gfp.h discourages indefinite retry loops on
> > > ENOMEM and says of __GFP_NOFAIL that it
> > > 
> > >     is definitely preferable to use the flag rather than opencode
> > >     endless loop around allocator.
> > > 
> > > congestion_wait() is indistinguishable from
> > > schedule_timeout_uninterruptible() in practice and it is not a good way
> > > to wait for memory to become available.
> > > 
> > > So instead of waiting, allocate a single page using __GFP_NOFAIL, then
> > > loop around and try to get any more pages that might be needed with a
> > > bulk allocation.  This single-page allocation will wait in the most
> > > appropriate way.
> > > 
> > > Signed-off-by: NeilBrown <neilb@suse.de>
> > > ---
> > >  fs/xfs/xfs_buf.c |    6 +++---
> > >  1 file changed, 3 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
> > > index 5fa6cd947dd4..1ae3768f6504 100644
> > > --- a/fs/xfs/xfs_buf.c
> > > +++ b/fs/xfs/xfs_buf.c
> > > @@ -372,8 +372,8 @@ xfs_buf_alloc_pages(
> > >  
> > >  	/*
> > >  	 * Bulk filling of pages can take multiple calls. Not filling the entire
> > > -	 * array is not an allocation failure, so don't back off if we get at
> > > -	 * least one extra page.
> > > +	 * array is not an allocation failure, so don't fail or fall back on
> > > +	 * __GFP_NOFAIL if we get at least one extra page.
> > >  	 */
> > >  	for (;;) {
> > >  		long	last = filled;
> > > @@ -394,7 +394,7 @@ xfs_buf_alloc_pages(
> > >  		}
> > >  
> > >  		XFS_STATS_INC(bp->b_mount, xb_page_retries);
> > > -		congestion_wait(BLK_RW_ASYNC, HZ / 50);
> > > +		bp->b_pages[filled++] = alloc_page(gfp_mask | __GFP_NOFAIL);
> > 
> > This smells wrong - the whole point of using the bulk page allocator
> > in this loop is to avoid the costly individual calls to
> > alloc_page().
> > 
> > What we are implementing here fail-fast semantics for readahead and
> > fail-never for everything else.  If the bulk allocator fails to get
> > a page from the fast path free lists, it already falls back to
> > __alloc_pages(gfp, 0, ...) to allocate a single page. So AFAICT
> > there's no need to add another call to alloc_page() because we can
> > just do this instead:
> > 
> > 	if (flags & XBF_READ_AHEAD)
> > 		gfp_mask |= __GFP_NORETRY;
> > 	else
> > -		gfp_mask |= GFP_NOFS;
> > +		gfp_mask |= GFP_NOFS | __GFP_NOFAIL;
> > 
> > Which should make the __alloc_pages() call in
> > alloc_pages_bulk_array() do a __GFP_NOFAIL allocation and hence
> > provide the necessary never-fail guarantee that is needed here.
> 
> That is a nice simplification.
> Mel Gorman told me
>   https://lore.kernel.org/linux-nfs/20210907153116.GJ3828@suse.com/
> that alloc_pages_bulk ignores GFP_NOFAIL.  I added that to the
> documentation comment in an earlier patch.

Well, that's a surprise to me - I can't see where it masked out
NOFAIL, and it seems quite arbitrary to just say "different code
needs different fallbacks, so you can't have NOFAIL" despite NOFAIL
being the exact behavioural semantics one of only three users of the
bulk allocator really needs...

> I had a look at the code and cannot see how it would fail to allocate at
> least one page.  Maybe Mel can help....

Yup, clarification is definitely needed here.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

  reply	other threads:[~2021-09-14  5:33 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-14  0:13 [PATCH 0/6] congestion_wait() and GFP_NOFAIL NeilBrown
2021-09-14  0:13 ` [PATCH 2/6] MM: annotate congestion_wait() and wait_iff_congested() as ineffective NeilBrown
2021-09-15 11:56   ` Michal Hocko
2021-09-16 22:13     ` NeilBrown
2021-09-14  0:13 ` [PATCH 5/6] XFS: remove congestion_wait() loop from kmem_alloc() NeilBrown
2021-09-14  1:31   ` Dave Chinner
2021-09-14  3:27     ` NeilBrown
2021-09-14  6:05       ` Dave Chinner
2021-09-14  0:13 ` [PATCH 3/6] EXT4: Remove ENOMEM/congestion_wait() loops NeilBrown
2021-09-14 16:34   ` Mel Gorman
2021-09-14 21:48     ` NeilBrown
2021-09-15 12:06       ` Michal Hocko
2021-09-15 22:35         ` NeilBrown
2021-09-16  0:37           ` Dave Chinner
2021-09-16  6:52           ` Michal Hocko
2021-09-14 23:55     ` Dave Chinner
2021-09-15  8:59       ` Mel Gorman
2021-09-15 12:20         ` Michal Hocko
2021-09-15 14:35         ` Mel Gorman
2021-09-15 22:38           ` Dave Chinner
2021-09-16  9:00             ` Mel Gorman
2021-09-15  0:28   ` Theodore Ts'o
2021-09-15  5:25     ` NeilBrown
2021-09-15 17:02       ` Theodore Ts'o
2021-09-14  0:13 ` [PATCH 1/6] MM: improve documentation for __GFP_NOFAIL NeilBrown
2021-09-15 11:51   ` Michal Hocko
2021-09-14  0:13 ` [PATCH 6/6] XFS: remove congestion_wait() loop from xfs_buf_alloc_pages() NeilBrown
2021-09-14  2:08   ` Dave Chinner
2021-09-14  2:35     ` NeilBrown
2021-09-14  5:33       ` Dave Chinner [this message]
2021-09-14 16:45       ` Mel Gorman
2021-09-14 21:13         ` NeilBrown
2021-09-14  0:13 ` [PATCH 4/6] EXT4: remove congestion_wait from ext4_bio_write_page, and simplify NeilBrown
2021-09-17  2:56 [PATCH 0/6 v2] congestion_wait() and GFP_NOFAIL NeilBrown
2021-09-17  2:56 ` [PATCH 6/6] XFS: remove congestion_wait() loop from xfs_buf_alloc_pages() NeilBrown

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210914053300.GI2361455@dread.disaster.area \
    --to=david@fromorbit.com \
    --cc=adilger.kernel@dilger.ca \
    --cc=akpm@linux-foundation.org \
    --cc=djwong@kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=mgorman@suse.com \
    --cc=neilb@suse.de \
    --cc=tytso@mit.edu \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).