From: Mike Kravetz <mike.kravetz@oracle.com>
To: Mel Gorman <mgorman@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>,
Michal Hocko <mhocko@kernel.org>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
linux-kernel <linux-kernel@vger.kernel.org>,
Andrea Arcangeli <aarcange@redhat.com>,
Johannes Weiner <hannes@cmpxchg.org>
Subject: Re: [Question] Should direct reclaim time be bounded?
Date: Mon, 1 Jul 2019 20:15:50 -0700 [thread overview]
Message-ID: <80036eed-993d-1d24-7ab6-e495f01b1caa@oracle.com> (raw)
In-Reply-To: <20190701085920.GB2812@suse.de>
On 7/1/19 1:59 AM, Mel Gorman wrote:
> On Fri, Jun 28, 2019 at 11:20:42AM -0700, Mike Kravetz wrote:
>> On 4/24/19 7:35 AM, Vlastimil Babka wrote:
>>> On 4/23/19 6:39 PM, Mike Kravetz wrote:
>>>>> That being said, I do not think __GFP_RETRY_MAYFAIL is wrong here. It
>>>>> looks like there is something wrong in the reclaim going on.
>>>>
>>>> Ok, I will start digging into that. Just wanted to make sure before I got
>>>> into it too deep.
>>>>
>>>> BTW - This is very easy to reproduce. Just try to allocate more huge pages
>>>> than will fit into memory. I see this 'reclaim taking forever' behavior on
>>>> v5.1-rc5-mmotm-2019-04-19-14-53. Looks like it was there in v5.0 as well.
>>>
>>> I'd suspect this in should_continue_reclaim():
>>>
>>> /* Consider stopping depending on scan and reclaim activity */
>>> if (sc->gfp_mask & __GFP_RETRY_MAYFAIL) {
>>> /*
>>> * For __GFP_RETRY_MAYFAIL allocations, stop reclaiming if the
>>> * full LRU list has been scanned and we are still failing
>>> * to reclaim pages. This full LRU scan is potentially
>>> * expensive but a __GFP_RETRY_MAYFAIL caller really wants to succeed
>>> */
>>> if (!nr_reclaimed && !nr_scanned)
>>> return false;
>>>
>>> And that for some reason, nr_scanned never becomes zero. But it's hard
>>> to figure out through all the layers of functions :/
>>
>> I got back to looking into the direct reclaim/compaction stalls when
>> trying to allocate huge pages. As previously mentioned, the code is
>> looping for a long time in shrink_node(). The routine
>> should_continue_reclaim() returns true perhaps more often than it should.
>>
>> As Vlastmil guessed, my debug code output below shows nr_scanned is remaining
>> non-zero for quite a while. This was on v5.2-rc6.
>>
>
> I think it would be reasonable to have should_continue_reclaim allow an
> exit if scanning at higher priority than DEF_PRIORITY - 2, nr_scanned is
> less than SWAP_CLUSTER_MAX and no pages are being reclaimed.
Thanks Mel,
I added such a check to should_continue_reclaim. However, it does not
address the issue I am seeing. In that do-while loop in shrink_node,
the scan priority is not raised (priority--). We can enter the loop
with priority == DEF_PRIORITY and continue to loop for minutes as seen
in my previous debug output.
--
Mike Kravetz
WARNING: multiple messages have this Message-ID (diff)
From: Hillf Danton <hdanton@sina.com>
To: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>,
Michal Hocko <mhocko@kernel.org>, Mel Gorman <mgorman@suse.de>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
linux-kernel <linux-kernel@vger.kernel.org>,
Johannes Weiner <hannes@cmpxchg.org>
Subject: Re: [Question] Should direct reclaim time be bounded?
Date: Mon, 8 Jul 2019 13:19:46 +0800 [thread overview]
Message-ID: <80036eed-993d-1d24-7ab6-e495f01b1caa@oracle.com> (raw)
Message-ID: <20190708051946.bER1NpKpnH663PpCF1VjLChmGkn61iaoze7e3Z2wYzw@z> (raw)
In-Reply-To: <20190701085920.GB2812@suse.de>
On Mon, 01 Jul 2019 20:15:51 -0700 Mike Kravetz wrote:
>On 7/1/19 1:59 AM, Mel Gorman wrote:
>> On Fri, Jun 28, 2019 at 11:20:42AM -0700, Mike Kravetz wrote:
>>> On 4/24/19 7:35 AM, Vlastimil Babka wrote:
>>>> On 4/23/19 6:39 PM, Mike Kravetz wrote:
>>>>>> That being said, I do not think __GFP_RETRY_MAYFAIL is wrong here. It
>>>>>> looks like there is something wrong in the reclaim going on.
>>>>>
>>>>> Ok, I will start digging into that. Just wanted to make sure before I got
>>>>> into it too deep.
>>>>>
>>>>> BTW - This is very easy to reproduce. Just try to allocate more huge pages
>>>>> than will fit into memory. I see this 'reclaim taking forever' behavior on
>>>>> v5.1-rc5-mmotm-2019-04-19-14-53. Looks like it was there in v5.0 as well.
>>>>
>>>> I'd suspect this in should_continue_reclaim():
>>>>
>>>> /* Consider stopping depending on scan and reclaim activity */
>>>> if (sc->gfp_mask & __GFP_RETRY_MAYFAIL) {
>>>> /*
>>>> * For __GFP_RETRY_MAYFAIL allocations, stop reclaiming if the
>>>> * full LRU list has been scanned and we are still failing
>>>> * to reclaim pages. This full LRU scan is potentially
>>>> * expensive but a __GFP_RETRY_MAYFAIL caller really wants to succeed
>>>> */
>>>> if (!nr_reclaimed && !nr_scanned)
>>>> return false;
>>>>
>>>> And that for some reason, nr_scanned never becomes zero. But it's hard
>>>> to figure out through all the layers of functions :/
>>>
>>> I got back to looking into the direct reclaim/compaction stalls when
>>> trying to allocate huge pages. As previously mentioned, the code is
>>> looping for a long time in shrink_node(). The routine
>>> should_continue_reclaim() returns true perhaps more often than it should.
>>>
>>> As Vlastmil guessed, my debug code output below shows nr_scanned is remaining
>>> non-zero for quite a while. This was on v5.2-rc6.
>>>
>>
>> I think it would be reasonable to have should_continue_reclaim allow an
>> exit if scanning at higher priority than DEF_PRIORITY - 2, nr_scanned is
>> less than SWAP_CLUSTER_MAX and no pages are being reclaimed.
>
>Thanks Mel,
>
>I added such a check to should_continue_reclaim. However, it does not
>address the issue I am seeing. In that do-while loop in shrink_node,
>the scan priority is not raised (priority--). We can enter the loop
>with priority == DEF_PRIORITY and continue to loop for minutes as seen
>in my previous debug output.
>
Does it help raise prioity in your case?
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2543,11 +2543,18 @@ static inline bool should_continue_reclaim(struct pglist_data *pgdat,
unsigned long pages_for_compaction;
unsigned long inactive_lru_pages;
int z;
+ bool costly_fg_reclaim = false;
/* If not in reclaim/compaction mode, stop */
if (!in_reclaim_compaction(sc))
return false;
+ /* Let compact determine what to do for high order allocators */
+ costly_fg_reclaim = sc->order > PAGE_ALLOC_COSTLY_ORDER &&
+ !current_is_kswapd();
+ if (costly_fg_reclaim)
+ goto check_compact;
+
/* Consider stopping depending on scan and reclaim activity */
if (sc->gfp_mask & __GFP_RETRY_MAYFAIL) {
/*
@@ -2571,6 +2578,7 @@ static inline bool should_continue_reclaim(struct pglist_data *pgdat,
return false;
}
+check_compact:
/*
* If we have not reclaimed enough pages for compaction and the
* inactive lists are large enough, continue reclaiming
@@ -2583,6 +2591,9 @@ static inline bool should_continue_reclaim(struct pglist_data *pgdat,
inactive_lru_pages > pages_for_compaction)
return true;
+ if (costly_fg_reclaim)
+ return false;
+
/* If compaction would go ahead or the allocation would succeed, stop */
for (z = 0; z <= sc->reclaim_idx; z++) {
struct zone *zone = &pgdat->node_zones[z];
--
next prev parent reply other threads:[~2019-07-02 3:16 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-04-23 4:07 [Question] Should direct reclaim time be bounded? Mike Kravetz
2019-04-23 7:19 ` Michal Hocko
2019-04-23 16:39 ` Mike Kravetz
2019-04-24 14:35 ` Vlastimil Babka
2019-06-28 18:20 ` Mike Kravetz
2019-07-01 8:59 ` Mel Gorman
2019-07-02 3:15 ` Mike Kravetz [this message]
2019-07-03 9:43 ` Mel Gorman
2019-07-03 23:54 ` Mike Kravetz
2019-07-04 11:09 ` Michal Hocko
2019-07-04 15:11 ` Mike Kravetz
2019-07-08 5:19 ` Hillf Danton
2019-07-10 18:42 ` Mike Kravetz
2019-07-10 19:44 ` Michal Hocko
2019-07-10 23:36 ` Mike Kravetz
2019-07-11 7:12 ` Michal Hocko
2019-07-12 9:49 ` Mel Gorman
2019-07-11 15:44 Hillf Danton
2019-07-12 5:47 Hillf Danton
2019-07-13 1:11 ` Mike Kravetz
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=80036eed-993d-1d24-7ab6-e495f01b1caa@oracle.com \
--to=mike.kravetz@oracle.com \
--cc=aarcange@redhat.com \
--cc=hannes@cmpxchg.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@suse.de \
--cc=mhocko@kernel.org \
--cc=vbabka@suse.cz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).