From: Mike Kravetz <mike.kravetz@oracle.com>
To: Michal Hocko <mhocko@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>, Hillf Danton <hdanton@sina.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH RESEND 0/8] hugetlb: add demote/split page functionality
Date: Fri, 17 Sep 2021 13:44:56 -0700
Message-ID: <3b3d4263-8872-21cd-cd05-4e13043a882e@oracle.com>
In-Reply-To: <b053cb84-67d3-d5c7-fbb6-c9041116f707@oracle.com>

On 9/15/21 9:57 AM, Mike Kravetz wrote:
> On 9/13/21 8:50 AM, Michal Hocko wrote:
>> I do not think this is a good approach. We do not want to play with
>> retry counts. If we want to go with a minimal change for now, then the
>> compaction feedback mechanism should track the number of pages reclaimed
>> to satisfy watermarks and, if that grows beyond reason (proportionally
>> to the request size), simply give up rather than keep trying again
>> and again.
> 
> I am experimenting with code similar to the patch below.  The idea is to
> key off of COMPACT_SKIPPED being returned, which implies that not enough
> pages are available for compaction.  One of the values in that calculation
> is compact_gap(), which is scaled based on allocation order.  The simple
> patch is to give up if the number of reclaimed pages is greater than
> compact_gap().  The thought is that this implies the reclaimed pages are
> being stolen before compaction can make use of them.
> 
> Such a patch does help in my specific test.  However, the concern is
> that giving up earlier may regress some workloads.  One could of course
> come up with a scenario where one more retry would result in success.
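
(For reference, the compact_gap() mentioned above is a trivial helper in
mm/internal.h; in kernels of this vintage it is essentially:

	static inline unsigned long compact_gap(unsigned int order)
	{
		return 2UL << order;
	}

i.e. twice the requested allocation size, so the early give-up only
triggers once reclaim has freed well more than the request and the pages
are still not available to compaction.)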

With the patch below, I instrumented the number of times shrink_node and
__alloc_pages_slowpath quit earlier than they would have before.  In my
test (free 1GB hugetlb pages, then allocate 2MB hugetlb pages), the early
exits were exercised a significant number of times:

shrink_node:
        931124 quick exits

__alloc_pages_slowpath:
        bail early 32 times

For comparison, I ran the LTP mm tests, mostly because they include a few
OOM test scenarios.  Test results were unchanged, and the early-exit
counts were much lower than in my hugetlb test:

shrink_node:
        57 quick exits

__alloc_pages_slowpath:
        bail early 0 times
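
(Both sets of counts come from throwaway instrumentation, nothing fancier
than a bare counter bumped at each new early-exit path, along the lines
of:

	/* illustrative only; not part of the proposed change */
	static atomic_long_t quick_exits = ATOMIC_LONG_INIT(0);
	...
	atomic_long_inc(&quick_exits);	/* at the new early return */

with the totals printed once the run completes.)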

At least on its face, this suggests the changes target the page-stealing
corner case in my test while having minimal impact on other scenarios.
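
For reference, with the vmscan.c hunk below applied, the tail of
should_continue_reclaim() (the shrink_node "quick exit" counted above)
reads roughly:

	pages_for_compaction = compact_gap(sc->order);
	if (!current_is_kswapd() && sc->nr_reclaimed > pages_for_compaction)
		return false;	/* reclaimed enough; pages being stolen */

	inactive_lru_pages = node_page_state(pgdat, NR_INACTIVE_FILE);
	if (get_nr_swap_pages() > 0)
		inactive_lru_pages += node_page_state(pgdat, NR_INACTIVE_ANON);

	return inactive_lru_pages > pages_for_compaction;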

Any suggestions for other tests to run, or tweaks to the code?
-- 
Mike Kravetz

> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index eeb3a9c..f8e3b0e 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -4413,6 +4413,7 @@ void warn_alloc(gfp_t gfp_mask, nodemask_t *nodemask, const char *fmt, ...)
>  
>  static inline bool
>  should_compact_retry(struct alloc_context *ac, int order, int alloc_flags,
> +		     unsigned long nr_reclaimed,
>  		     enum compact_result compact_result,
>  		     enum compact_priority *compact_priority,
>  		     int *compaction_retries)
> @@ -4445,7 +4446,16 @@ void warn_alloc(gfp_t gfp_mask, nodemask_t *nodemask, const char *fmt, ...)
>  	 * to work with, so we retry only if it looks like reclaim can help.
>  	 */
>  	if (compaction_needs_reclaim(compact_result)) {
> -		ret = compaction_zonelist_suitable(ac, order, alloc_flags);
> +		/*
> +		 * If we reclaimed enough pages, but they are not available
> +		 * now, this implies they were stolen.  Do not reclaim again
> +		 * as this could go on indefinitely.
> +		 */
> +		if (nr_reclaimed > compact_gap(order))
> +			ret = false;
> +		else
> +			ret = compaction_zonelist_suitable(ac, order,
> +								alloc_flags);
>  		goto out;
>  	}
>  
> @@ -4504,6 +4514,7 @@ void warn_alloc(gfp_t gfp_mask, nodemask_t *nodemask, const char *fmt, ...)
>  static inline bool
>  should_compact_retry(struct alloc_context *ac, unsigned int order, int alloc_flags,
> +		     unsigned long nr_reclaimed,
>  		     enum compact_result compact_result,
>  		     enum compact_priority *compact_priority,
>  		     int *compaction_retries)
>  {
> @@ -4890,6 +4901,7 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask)
>  	struct page *page = NULL;
>  	unsigned int alloc_flags;
>  	unsigned long did_some_progress;
> +	unsigned long nr_reclaimed;
>  	enum compact_priority compact_priority;
>  	enum compact_result compact_result;
>  	int compaction_retries;
> @@ -5033,6 +5045,7 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask)
>  							&did_some_progress);
>  	if (page)
>  		goto got_pg;
> +	nr_reclaimed = did_some_progress;
>  
>  	/* Try direct compaction and then allocating */
>  	page = __alloc_pages_direct_compact(gfp_mask, order, alloc_flags, ac,
> @@ -5063,7 +5076,7 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask)
>  	 */
>  	if (did_some_progress > 0 &&
>  			should_compact_retry(ac, order, alloc_flags,
> -				compact_result, &compact_priority,
> +				nr_reclaimed, compact_result, &compact_priority,
>  				&compaction_retries))
>  		goto retry;
>  
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index eeae2f6..d40eca5 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2819,10 +2819,18 @@ static inline bool should_continue_reclaim(struct pglist_data *pgdat,
>  	}
>  
>  	/*
> +	 * If we have reclaimed enough pages for compaction, yet compaction
> +	 * is still not ready, this implies pages are being stolen as they
> +	 * are reclaimed.  Quit now instead of continually retrying.
> +	 */
> +	pages_for_compaction = compact_gap(sc->order);
> +	if (!current_is_kswapd() && sc->nr_reclaimed > pages_for_compaction)
> +		return false;
> +
> +	/*
>  	 * If we have not reclaimed enough pages for compaction and the
>  	 * inactive lists are large enough, continue reclaiming
>  	 */
> -	pages_for_compaction = compact_gap(sc->order);
>  	inactive_lru_pages = node_page_state(pgdat, NR_INACTIVE_FILE);
>  	if (get_nr_swap_pages() > 0)
>  		inactive_lru_pages += node_page_state(pgdat, NR_INACTIVE_ANON);
> 
> 

