linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Mike Kravetz <mike.kravetz@oracle.com>
To: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>, Hillf Danton <hdanton@sina.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH RESEND 0/8] hugetlb: add demote/split page functionality
Date: Wed, 15 Sep 2021 09:57:15 -0700	[thread overview]
Message-ID: <b053cb84-67d3-d5c7-fbb6-c9041116f707@oracle.com> (raw)
In-Reply-To: <YT9zUaPofENSMQHg@dhcp22.suse.cz>

On 9/13/21 8:50 AM, Michal Hocko wrote:
> On Fri 10-09-21 17:11:05, Mike Kravetz wrote:
> [...]
>> @@ -5064,8 +5068,18 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask)
>>  	if (did_some_progress > 0 &&
>>  			should_compact_retry(ac, order, alloc_flags,
>>  				compact_result, &compact_priority,
>> -				&compaction_retries))
>> +				&compaction_retries)) {
>> +		/*
>> +		 * In one pathological case, pages can be stolen immediately
>> +		 * after reclaimed.  It looks like we are making progress, and
>> +		 * compaction_retries is not incremented.  This could cause
>> +		 * an indefinite number of retries.  Cap the number of retries
>> +		 * for costly orders.
>> +		 */
>> +		if (max_tries && tries > max_tries)
>> +			goto nopage;
>>  		goto retry;
>> +	}
> 
> I do not think this is a good approach. We do not want to play with
> retries numbers. If we want to go with a minimal change for now then the
> compaction feedback mechanism should track the number of reclaimed pages
> to satisfy watermarks and if that grows beyond reasonable (proportionaly
> to the request size) then simply give up rather than keep trying again
> and again.

I am experimenting with code similar to the patch below.  The idea is to
key off of 'COMPACT_SKIPPED' being returned.  This implies that not enough
pages are available for compaction.  One of the values in this calculation
is compact_gap() which is scaled based on allocation order.  The simple
patch is to give up if number of reclaimed pages is greater than
compact_gap().  The thought is that this implies pages are being stolen
before compaction.

Such a patch does help in my specific test.  However, the concern is
that giving up earlier may regress some workloads.  One could of course
come up with a scenario where one more retry would result in success.

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index eeb3a9c..f8e3b0e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4413,6 +4413,7 @@ void warn_alloc(gfp_t gfp_mask, nodemask_t *nodemask, const char *fmt, ...)
 
 static inline bool
 should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
+		     unsigned long nr_reclaimed,
 		     enum compact_result compact_result,
 		     enum compact_priority *compact_priority,
 		     int *compaction_retries)
@@ -4445,7 +4446,16 @@ void warn_alloc(gfp_t gfp_mask, nodemask_t *nodemask, const char *fmt, ...)
 	 * to work with, so we retry only if it looks like reclaim can help.
 	 */
 	if (compaction_needs_reclaim(compact_result)) {
-		ret = compaction_zonelist_suitable(ac, order, alloc_flags);
+		/*
+		 * If we reclaimed enough pages, but they are not available
+		 * now, this implies they were stolen.  Do not reclaim again
+		 * as this could go on indefinitely.
+		 */
+		if (nr_reclaimed > compact_gap(order))
+			ret = false;
+		else
+			ret = compaction_zonelist_suitable(ac, order,
+								alloc_flags);
 		goto out;
 	}
 
@@ -4504,6 +4514,7 @@ void warn_alloc(gfp_t gfp_mask, nodemask_t *nodemask, const char *fmt, ...)
 static inline bool
 should_compact_retry(struct alloc_context *ac, unsigned int order, int alloc_flags,
 		     enum compact_result compact_result,
+		     unsigned long nr_reclaimed,
 		     enum compact_priority *compact_priority,
 		     int *compaction_retries)
 {
@@ -4890,6 +4901,7 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask)
 	struct page *page = NULL;
 	unsigned int alloc_flags;
 	unsigned long did_some_progress;
+	unsigned long nr_reclaimed;
 	enum compact_priority compact_priority;
 	enum compact_result compact_result;
 	int compaction_retries;
@@ -5033,6 +5045,7 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask)
 							&did_some_progress);
 	if (page)
 		goto got_pg;
+	nr_reclaimed = did_some_progress;
 
 	/* Try direct compaction and then allocating */
 	page = __alloc_pages_direct_compact(gfp_mask, order, alloc_flags, ac,
@@ -5063,7 +5076,7 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask)
 	 */
 	if (did_some_progress > 0 &&
 			should_compact_retry(ac, order, alloc_flags,
-				compact_result, &compact_priority,
+				nr_reclaimed, compact_result, &compact_priority,
 				&compaction_retries))
 		goto retry;
 
diff --git a/mm/vmscan.c b/mm/vmscan.c
index eeae2f6..d40eca5 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2819,10 +2819,18 @@ static inline bool should_continue_reclaim(struct pglist_data *pgdat,
 	}
 
 	/*
+	 * If we have reclaimed enough pages for compaction, and compaction
+	 * is not ready, this implies pages are being stolen as they are being
+	 * reclaimed.  Quit now instead of continual retrying.
+	 */
+	pages_for_compaction = compact_gap(sc->order);
+	if (!current_is_kswapd() && sc->nr_reclaimed > pages_for_compaction)
+		return false;
+
+	/*
 	 * If we have not reclaimed enough pages for compaction and the
 	 * inactive lists are large enough, continue reclaiming
 	 */
-	pages_for_compaction = compact_gap(sc->order);
 	inactive_lru_pages = node_page_state(pgdat, NR_INACTIVE_FILE);
 	if (get_nr_swap_pages() > 0)
 		inactive_lru_pages += node_page_state(pgdat, NR_INACTIVE_ANON);


-- 
Mike Kravetz


  reply	other threads:[~2021-09-15 16:57 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-16 22:49 [PATCH RESEND 0/8] hugetlb: add demote/split page functionality Mike Kravetz
2021-08-16 22:49 ` [PATCH 1/8] hugetlb: add demote hugetlb page sysfs interfaces Mike Kravetz
2021-08-16 22:49 ` [PATCH 2/8] hugetlb: add HPageCma flag and code to free non-gigantic pages in CMA Mike Kravetz
2021-08-16 22:49 ` [PATCH 3/8] hugetlb: add demote bool to gigantic page routines Mike Kravetz
2021-08-16 22:49 ` [PATCH 4/8] hugetlb: add hugetlb demote page support Mike Kravetz
2021-08-16 22:49 ` [PATCH 5/8] hugetlb: document the demote sysfs interfaces Mike Kravetz
2021-08-16 23:28   ` Andrew Morton
2021-08-17  1:04     ` Mike Kravetz
2021-09-21 13:52   ` Aneesh Kumar K.V
2021-09-21 17:17     ` Mike Kravetz
2021-08-16 22:49 ` [PATCH 6/8] hugetlb: vmemmap optimizations when demoting hugetlb pages Mike Kravetz
2021-08-16 22:49 ` [PATCH 7/8] hugetlb: prepare destroy and prep routines for vmemmap optimized pages Mike Kravetz
2021-08-16 22:49 ` [PATCH 8/8] hugetlb: Optimized demote vmemmap optimizatized pages Mike Kravetz
2021-08-16 23:23 ` [PATCH RESEND 0/8] hugetlb: add demote/split page functionality Andrew Morton
2021-08-17  0:17   ` Mike Kravetz
2021-08-17  0:39     ` Andrew Morton
2021-08-17  0:58       ` Mike Kravetz
2021-08-16 23:27 ` Andrew Morton
2021-08-17  0:46   ` Mike Kravetz
2021-08-17  1:46     ` Andrew Morton
2021-08-17  7:30       ` David Hildenbrand
2021-08-17 16:19         ` Mike Kravetz
2021-08-17 18:49           ` David Hildenbrand
2021-08-24 22:08       ` Mike Kravetz
2021-08-26  7:32         ` Hillf Danton
2021-08-27 17:22         ` Vlastimil Babka
2021-08-27 23:04           ` Mike Kravetz
2021-08-30 10:11             ` Vlastimil Babka
2021-09-02 18:17               ` Mike Kravetz
2021-09-06 14:40                 ` Vlastimil Babka
2021-09-07  8:50                   ` Hillf Danton
2021-09-08 21:00                     ` Mike Kravetz
2021-09-09  4:07                       ` Hillf Danton
2021-09-09 11:54                       ` Michal Hocko
2021-09-09 13:45                         ` Vlastimil Babka
2021-09-09 21:31                           ` Mike Kravetz
2021-09-10  8:20                           ` Michal Hocko
2021-09-11  0:11                             ` Mike Kravetz
2021-09-11  3:11                               ` Hillf Danton
2021-09-13 15:50                               ` Michal Hocko
2021-09-15 16:57                                 ` Mike Kravetz [this message]
2021-09-17 20:44                                   ` Mike Kravetz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=b053cb84-67d3-d5c7-fbb6-c9041116f707@oracle.com \
    --to=mike.kravetz@oracle.com \
    --cc=hdanton@sina.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).