linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@kernel.org>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vlastimil Babka <vbabka@suse.cz>,
	Dave Chinner <david@fromorbit.com>,
	linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	Mel Gorman <mgorman@suse.de>, Rik van Riel <riel@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [RFC PATCH 2/2] xfs: map KM_MAYFAIL to __GFP_RETRY_HARD
Date: Tue, 21 Jun 2016 19:00:52 +0200	[thread overview]
Message-ID: <20160621170052.GA24410@dhcp22.suse.cz> (raw)
In-Reply-To: <20160621042249.GA18870@cmpxchg.org>

On Tue 21-06-16 00:22:49, Johannes Weiner wrote:
[...]
> As for changing the default - remember that we currently warn about
> allocation failures as if they were bugs, unless they are explicitely
> allocated with the __GFP_NOWARN flag. We can assume that the current
> __GFP_NOWARN sites are 1) commonly failing but 2) prefer to fall back
> rather than incurring latency (otherwise they would have added the
> __GFP_REPEAT flag). These sites would be a good list of candidates to
> annotate with __GFP_NORETRY.

This sounds like a good idea at first sight but a brief git grep shows
that many of them are just trying to silence the warning from non
sleeping allocations which are quite likely to fail. This wouldn't
be hard to filter out so we can ignore them.

Then there are things like 8be04b9374e5 ("treewide: Add __GFP_NOWARN
to k.alloc calls with v.alloc fallbacks") where the flag is added to
many places with vmalloc fallbacks. Do we want to weaken them in
favor of the vmalloc in general (note that it is not clear from the size
whether they are costly or !costly)?

I have looked at some random others and they are adding the flag without
any explanation so it is not really clear what was the motivation.

To me it seems like the flag is used quite randomly. I suspect there are
many places which do not have that flag just because nobody bothered to
report the allocation failure which is hard to reproduce.

> If we made __GFP_REPEAT then the default,
> the sites that would then try harder are the same sites that would now
> emit page allocation failure warnings. These are rare, and the only
> times I have seen them is under enough load that latency is shot to
> hell anyway. So I'm not really convinced by the regression argument.

You do not need to be under a heavy load to fail those allocations. It
is sufficient to have the memory fragmented which might be just a matter
of time. I am worried that we have hard to examine number of allocation
requests that might change the overall system behavior because they
might trigger more reclaim/swap and the source of the behavior change
wouldn't be quite obvious. On the other hand we already have some places
already annotated to require a more effort which is the reason I would
find it better to follow up with that.

High order allocations can be really expensive and the current behavior
with the allocation warning has an advantage that we can see the failure
mode and get a bug report with the exact trace (hopefully) without too
much of a background interference. Then the subsystem familiar person
can judge whether that particular allocation is worth more effort or
different fallback.

I am not saying that changing the default behavior for costly
allocations is a no go. I just feel it is too risky and it would be
better to use "override the default because I know what I am doing"
flag. The __GFP_RETRY_HARD might be a terrible name and a better name
would cause less confusion (__GFP_RETRY_MAYFAIL?).

> But that would *actually* clean up the flags, not make them even more
> confusing:
> 
> Allocations that can't ever handle failure would use __GFP_NOFAIL.
> 
> Callers like XFS would use __GFP_MAYFAIL specifically to disable the
> implicit __GFP_NOFAIL of !costly allocations.
>
> Callers that would prefer falling back over killing and looping would
> use __GFP_NORETRY.
> 
> Wouldn't that cover all usecases and be much more intuitive, both in
> the default behavior as well as in the names of the flags?

How do we describe kcompacd vs. page fault THP allocations? We do not
want to cause a lot of reclaim for those but we can wait for compaction
for the first while we would prefer not to for the later.

Thanks!
-- 
Michal Hocko
SUSE Labs

  parent reply	other threads:[~2016-06-21 17:01 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-06-06 11:32 [RFC PATCH 0/2] mm: give GFP_REPEAT a better semantic Michal Hocko
2016-06-06 11:32 ` [RFC PATCH 1/2] mm, tree wide: replace __GFP_REPEAT by __GFP_RETRY_HARD with more useful semantic Michal Hocko
2016-06-07 12:11   ` Tetsuo Handa
2016-06-07 12:31     ` Michal Hocko
2016-06-11 14:35       ` Tetsuo Handa
2016-06-13 11:37         ` Michal Hocko
2016-06-13 14:54           ` Tetsuo Handa
2016-06-13 15:17             ` Michal Hocko
2016-06-14 11:12               ` Tetsuo Handa
2016-06-14 18:54                 ` Michal Hocko
2016-06-06 11:32 ` [RFC PATCH 2/2] xfs: map KM_MAYFAIL to __GFP_RETRY_HARD Michal Hocko
2016-06-16  0:23   ` Dave Chinner
2016-06-16  8:03     ` Michal Hocko
2016-06-16 11:26       ` Michal Hocko
2016-06-17 18:22         ` Johannes Weiner
2016-06-17 20:30           ` Vlastimil Babka
2016-06-17 21:39             ` Johannes Weiner
2016-06-20  8:08               ` Michal Hocko
2016-06-21  4:22                 ` Johannes Weiner
2016-06-21  9:29                   ` Vlastimil Babka
2016-06-21 17:00                   ` Michal Hocko [this message]
2016-10-06 11:14 ` [RFC PATCH 0/2] mm: give GFP_REPEAT a better semantic Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160621170052.GA24410@dhcp22.suse.cz \
    --to=mhocko@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=david@fromorbit.com \
    --cc=hannes@cmpxchg.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@suse.de \
    --cc=riel@redhat.com \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).