From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1760804AbcBYOnk (ORCPT );
	Thu, 25 Feb 2016 09:43:40 -0500
Received: from mail-wm0-f54.google.com ([74.125.82.54]:37066 "EHLO
	mail-wm0-f54.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1760611AbcBYOni (ORCPT );
	Thu, 25 Feb 2016 09:43:38 -0500
Date: Thu, 25 Feb 2016 15:43:35 +0100
From: Michal Hocko
To: Rik van Riel
Cc: linux-kernel@vger.kernel.org, hannes@cmpxchg.org,
	akpm@linux-foundation.org, vbabka@suse.cz, mgorman@suse.de
Subject: Re: [PATCH] mm: limit direct reclaim for higher order allocations
Message-ID: <20160225144335.GA5517@dhcp22.suse.cz>
References: <20160224163850.3d7eb56c@annuminas.surriel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20160224163850.3d7eb56c@annuminas.surriel.com>
User-Agent: Mutt/1.5.24 (2015-08-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed 24-02-16 16:38:50, Rik van Riel wrote:
> For multi page allocations smaller than PAGE_ALLOC_COSTLY_ORDER,
> the kernel will do direct reclaim if compaction failed for any
> reason. This worked fine when Linux systems had 128MB RAM, but
> on my 24GB system I frequently see higher order allocations
> free up over 3GB of memory, pushing all kinds of things into
> swap, and slowing down applications. 
> 
> It would be much better to limit the amount of reclaim done,
> rather than cause excessive pageout activity.
> 
> When enough memory is free to do compaction for the highest order
> allocation possible, bail out of the direct page reclaim code.
> 
> On smaller systems, this may be enough to obtain contiguous
> free memory areas to satisfy small allocations, continuing our
> strategy of relying on luck occasionally. On larger systems,
> relying on luck like that has not been working for years.
I guess I have seen a similar problem, just from a different direction
though. With my oom detection rework I have started seeing premature
OOM killing for higher order requests (mostly order-2). The thing is
that the oom rework has limited the number of reclaim/compaction
retries to a finite number. Currently we are relying on
zone_reclaimable, which can keep the reclaim in a loop for a long time,
reclaiming order-0 pages while compaction doesn't bother to compact at
all. The reason is most probably that compaction is mainly focused on
THP and doesn't care about !costly high order allocations. Wouldn't it
be better if compaction tried harder for these requests rather than
falling back to reclaim, which is not guaranteed to help much? We can
compact pages if they are on the LRU even without reclaiming them,
right?

That being said, shouldn't we rather have a look at compaction than
the reclaim path?

> Signed-off-by: Rik van Riel
> ---
>  mm/vmscan.c | 19 ++++++++-----------
>  1 file changed, 8 insertions(+), 11 deletions(-)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index fc62546096f9..8dd15d514761 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2584,20 +2584,17 @@ static bool shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
>  			continue;	/* Let kswapd poll it */
> 
>  		/*
> -		 * If we already have plenty of memory free for
> -		 * compaction in this zone, don't free any more.
> -		 * Even though compaction is invoked for any
> -		 * non-zero order, only frequent costly order
> -		 * reclamation is disruptive enough to become a
> -		 * noticeable problem, like transparent huge
> -		 * page allocations.
> +		 * For higher order allocations, free enough memory
> +		 * to be able to do compaction for the largest possible
> +		 * allocation. On smaller systems, this may be enough
> +		 * that smaller allocations can skip compaction, if
> +		 * enough adjacent pages get freed.
>  		 */
> -		if (IS_ENABLED(CONFIG_COMPACTION) &&
> -		    sc->order > PAGE_ALLOC_COSTLY_ORDER &&
> +		if (IS_ENABLED(CONFIG_COMPACTION) && sc->order &&
>  		    zonelist_zone_idx(z) <= requested_highidx &&
> -		    compaction_ready(zone, sc->order)) {
> +		    compaction_ready(zone, MAX_ORDER)) {
>  			sc->compaction_ready = true;
> -			continue;
> +			return true;
>  		}
> 
>  		/*

-- 
Michal Hocko
SUSE Labs