From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path: 
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753129AbcCGPm3 (ORCPT );
	Mon, 7 Mar 2016 10:42:29 -0500
Received: from mx2.suse.de ([195.135.220.15]:44611 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1751146AbcCGPmW (ORCPT );
	Mon, 7 Mar 2016 10:42:22 -0500
Subject: Re: [PATCH] mm: limit direct reclaim for higher order allocations
To: Rik van Riel , linux-kernel@vger.kernel.org
References: <20160224163850.3d7eb56c@annuminas.surriel.com>
Cc: hannes@cmpxchg.org, akpm@linux-foundation.org, mgorman@suse.de
From: Vlastimil Babka 
Message-ID: <56DDA15B.70006@suse.cz>
Date: Mon, 7 Mar 2016 16:42:19 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101
 Thunderbird/38.6.0
MIME-Version: 1.0
In-Reply-To: <20160224163850.3d7eb56c@annuminas.surriel.com>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID: 
X-Mailing-List: linux-kernel@vger.kernel.org

On 02/24/2016 10:38 PM, Rik van Riel wrote:
> For multi page allocations smaller than PAGE_ALLOC_COSTLY_ORDER,
> the kernel will do direct reclaim if compaction failed for any
> reason. This worked fine when Linux systems had 128MB RAM, but
> on my 24GB system I frequently see higher order allocations
> free up over 3GB of memory, pushing all kinds of things into
> swap, and slowing down applications.
>
> It would be much better to limit the amount of reclaim done,
> rather than cause excessive pageout activity.
>
> When enough memory is free to do compaction for the highest order
> allocation possible, bail out of the direct page reclaim code.
>
> On smaller systems, this may be enough to obtain contiguous
> free memory areas to satisfy small allocations, continuing our
> strategy of relying on luck occasionally. On larger systems,
> relying on luck like that has not been working for years.
>
> Signed-off-by: Rik van Riel

So the main point of this patch is the change from "continue" to
"return true", right? That will prevent looking at other zones, but I
guess it's not the reason why, without this patch, reclaim frees 3 of
your 24GB?

What I suspect more is should_continue_reclaim(), where it wants to
reclaim (2UL << sc->order) pages regardless of watermarks or compaction
status (a rough sketch of the check I mean is appended after the quoted
diff below). But that one is called from shrink_zone(), and
shrink_zones() should not call shrink_zone() if compaction is ready,
even before this patch. Perhaps if multiple processes manage to enter
shrink_zone() simultaneously, they could over-reclaim due to that?

> ---
>  mm/vmscan.c | 19 ++++++++-----------
>  1 file changed, 8 insertions(+), 11 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index fc62546096f9..8dd15d514761 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2584,20 +2584,17 @@ static bool shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
>  				continue;	/* Let kswapd poll it */
>
>  			/*
> -			 * If we already have plenty of memory free for
> -			 * compaction in this zone, don't free any more.
> -			 * Even though compaction is invoked for any
> -			 * non-zero order, only frequent costly order
> -			 * reclamation is disruptive enough to become a
> -			 * noticeable problem, like transparent huge
> -			 * page allocations.
> +			 * For higher order allocations, free enough memory
> +			 * to be able to do compaction for the largest possible
> +			 * allocation. On smaller systems, this may be enough
> +			 * that smaller allocations can skip compaction, if
> +			 * enough adjacent pages get freed.
>  			 */
> -			if (IS_ENABLED(CONFIG_COMPACTION) &&
> -			    sc->order > PAGE_ALLOC_COSTLY_ORDER &&
> +			if (IS_ENABLED(CONFIG_COMPACTION) && sc->order &&
>  			    zonelist_zone_idx(z) <= requested_highidx &&
> -			    compaction_ready(zone, sc->order)) {
> +			    compaction_ready(zone, MAX_ORDER)) {
>  				sc->compaction_ready = true;
> -				continue;
> +				return true;
>  			}
>
>  			/*
>
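For reference, the check I mean in should_continue_reclaim() looks
roughly like this; it's my own trimmed-down paraphrase of mm/vmscan.c
around v4.5, so the exact lines and identifiers may differ a bit from
the tree:

	/*
	 * Inside should_continue_reclaim(), called from shrink_zone().
	 * Paraphrased, not copied verbatim from the current source.
	 */
	pages_for_compaction = (2UL << sc->order);
	inactive_lru_pages = zone_page_state(zone, NR_INACTIVE_FILE);
	if (get_nr_swap_pages() > 0)
		inactive_lru_pages += zone_page_state(zone, NR_INACTIVE_ANON);

	/*
	 * Keep reclaiming until 2UL << sc->order pages have been freed,
	 * as long as the inactive lists can still supply them, without
	 * consulting watermarks or compaction readiness.
	 */
	if (sc->nr_reclaimed < pages_for_compaction &&
			inactive_lru_pages > pages_for_compaction)
		return true;

If several processes sit in shrink_zone() at the same time, each of
them applies that threshold independently, which is how I could imagine
the over-reclaim adding up.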