From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S934791AbcIVPSz (ORCPT );
	Thu, 22 Sep 2016 11:18:55 -0400
Received: from mx2.suse.de ([195.135.220.15]:44295 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S933098AbcIVPSy (ORCPT );
	Thu, 22 Sep 2016 11:18:54 -0400
Subject: Re: [PATCH 0/4] reintroduce compaction feedback for OOM decisions
To: Michal Hocko
References: <20160906135258.18335-1-vbabka@suse.cz>
 <20160921171830.GH24210@dhcp22.suse.cz>
Cc: Andrew Morton, Arkadiusz Miskiewicz, Ralf-Peter Rohbeck, Olaf Hering,
 linux-kernel@vger.kernel.org, Linus Torvalds, linux-mm@kvack.org,
 David Rientjes, Joonsoo Kim, Mel Gorman, Rik van Riel
From: Vlastimil Babka
Message-ID: <56f2c2ed-8a58-cf9c-dd00-c0d0e274607a@suse.cz>
Date: Thu, 22 Sep 2016 17:18:48 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
 Thunderbird/45.1.1
MIME-Version: 1.0
In-Reply-To: <20160921171830.GH24210@dhcp22.suse.cz>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 09/21/2016 07:18 PM, Michal Hocko wrote:
> On Tue 06-09-16 15:52:54, Vlastimil Babka wrote:
>
> We still do not ignore fragindex in the full priority. This part has
> always been quite unclear to me so I cannot really tell whether that
> makes any difference or not but just to be on the safe side I would
> prefer to have _all_ the shortcuts out of the way in the highest
> priority. It is true that this will cause COMPACT_NOT_SUITABLE_ZONE
> so keep retrying but still a complication to understand the workflow.
>
> What do you think?

I was thinking that this shouldn't be a problem on non-costly orders and
the default extfrag_threshold. But better to be safe. Moreover, I think the
issue is much more dangerous for compaction_zonelist_suitable(), as
explained below.

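To make the mismatch described in the changelog below concrete, here is a
minimal user-space sketch in C. It is not the kernel code: struct
sketch_zone, the sketch_* names and the single-number watermark comparison
are hypothetical simplifications. The index formula is assumed to follow
__fragmentation_index() in mm/vmstat.c, and 500 is assumed to be the default
extfrag_threshold.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's compact_result values. */
enum sketch_result {
	SKETCH_SKIPPED,
	SKETCH_NOT_SUITABLE_ZONE,
	SKETCH_CONTINUE,
};

/* Hypothetical, heavily simplified view of a zone. */
struct sketch_zone {
	unsigned long free_pages;		/* truly free pages */
	unsigned long reclaimable_pages;	/* pages reclaim could free */
	unsigned long free_blocks_total;	/* free blocks of any order */
	unsigned long free_blocks_suitable;	/* free blocks >= requested order */
	unsigned long wmark;			/* simplified order-0 watermark */
};

/* Assumed default of the extfrag_threshold sysctl. */
#define SKETCH_EXTFRAG_THRESHOLD 500

/*
 * Fragmentation index scaled to 0..1000 (assumed formula):
 * near 0    => a failure would be due to lack of memory
 * near 1000 => a failure would be due to fragmentation
 * Only truly free pages enter the computation.
 */
static int sketch_fragindex(const struct sketch_zone *z, unsigned int order)
{
	unsigned long long requested = 1ULL << order;
	long long scaled;

	assert(order < 32);

	if (!z->free_blocks_total)
		return 0;

	/* The index only makes sense when the request would fail. */
	if (z->free_blocks_suitable)
		return -1000;

	scaled = (long long)((1000 + z->free_pages * 1000ULL / requested) /
			     z->free_blocks_total);
	return (int)(1000 - scaled);
}

/*
 * Simplified suitability check: a watermark test against wmark_target,
 * then the optional fragindex test that the patch makes skippable.
 */
static enum sketch_result sketch_suitable(const struct sketch_zone *z,
					  unsigned int order,
					  unsigned long wmark_target,
					  bool check_fragindex)
{
	int fragindex;

	if (wmark_target < z->wmark)
		return SKETCH_SKIPPED;

	if (check_fragindex) {
		fragindex = sketch_fragindex(z, order);
		if (fragindex >= 0 && fragindex <= SKETCH_EXTFRAG_THRESHOLD)
			return SKETCH_NOT_SUITABLE_ZONE;
	}

	return SKETCH_CONTINUE;
}
```

In this sketch, take an order-3 request on a zone with 12 free pages in three
order-2 blocks, 5000 reclaimable pages and a watermark of 1000. The index is
1000 - (1000 + 1500) / 3 = 167, which is below the assumed threshold. The
zone is therefore reported as not suitable even though free plus reclaimable
pages pass the watermark test. Passing check_fragindex = false yields
SKETCH_CONTINUE for the same zone.
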
----8<----
>>From 0e6cb251aa6e3b1be7deff315c0238c4d478f22e Mon Sep 17 00:00:00 2001
From: Vlastimil Babka
Date: Thu, 22 Sep 2016 15:33:57 +0200
Subject: [PATCH] mm, compaction: ignore fragindex on highest direct compaction
 priority

Fragmentation index check in compaction_suitable() should be the last
heuristic that we allow on the highest compaction priority. Since that's a
potential premature OOM, disable it too. Even more problematic is its usage
from compaction_zonelist_suitable() -> __compaction_suitable() where we check
the order-0 watermark against free plus available-for-reclaim pages, but the
fragindex considers only truly free pages. Thus we can get a result close to 0
indicating failure due to lack of memory, and wrongly decide that compaction
won't be suitable even after reclaim. The solution is to skip the fragindex
check also in this context, regardless of priority.

Signed-off-by: Vlastimil Babka
---
 include/linux/compaction.h |  5 +++--
 mm/compaction.c            | 44 +++++++++++++++++++++++---------------------
 mm/internal.h              |  1 +
 mm/vmscan.c                |  6 ++++--
 4 files changed, 31 insertions(+), 25 deletions(-)

diff --git a/include/linux/compaction.h b/include/linux/compaction.h
index 0d8415820fc3..3ccf13d57651 100644
--- a/include/linux/compaction.h
+++ b/include/linux/compaction.h
@@ -97,7 +97,8 @@ extern enum compact_result try_to_compact_pages(gfp_t gfp_mask,
 		const struct alloc_context *ac, enum compact_priority prio);
 extern void reset_isolation_suitable(pg_data_t *pgdat);
 extern enum compact_result compaction_suitable(struct zone *zone, int order,
-		unsigned int alloc_flags, int classzone_idx);
+		unsigned int alloc_flags, int classzone_idx,
+		bool check_fragindex);
 
 extern void defer_compaction(struct zone *zone, int order);
 extern bool compaction_deferred(struct zone *zone, int order);
@@ -183,7 +184,7 @@ static inline void reset_isolation_suitable(pg_data_t *pgdat)
 }
 
 static inline enum compact_result compaction_suitable(struct zone *zone, int order,
-					int alloc_flags, int classzone_idx)
+			int alloc_flags, int classzone_idx, bool check_fragindex)
 {
 	return COMPACT_SKIPPED;
 }
diff --git a/mm/compaction.c b/mm/compaction.c
index 86d4d0bbfc7c..ae6a115f37b2 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1379,7 +1379,6 @@ static enum compact_result __compaction_suitable(struct zone *zone, int order,
 					int classzone_idx,
 					unsigned long wmark_target)
 {
-	int fragindex;
 	unsigned long watermark;
 
 	if (is_via_compact_memory(order))
@@ -1415,6 +1414,18 @@ static enum compact_result __compaction_suitable(struct zone *zone, int order,
 						ALLOC_CMA, wmark_target))
 		return COMPACT_SKIPPED;
 
+	return COMPACT_CONTINUE;
+}
+
+enum compact_result compaction_suitable(struct zone *zone, int order,
+					unsigned int alloc_flags,
+					int classzone_idx, bool check_fragindex)
+{
+	enum compact_result ret;
+	int fragindex;
+
+	ret = __compaction_suitable(zone, order, alloc_flags, classzone_idx,
+				    zone_page_state(zone, NR_FREE_PAGES));
 	/*
 	 * fragmentation index determines if allocation failures are due to
 	 * low memory or external fragmentation
@@ -1426,21 +1437,12 @@ static enum compact_result __compaction_suitable(struct zone *zone, int order,
 	 *
 	 * Only compact if a failure would be due to fragmentation.
 	 */
-	fragindex = fragmentation_index(zone, order);
-	if (fragindex >= 0 && fragindex <= sysctl_extfrag_threshold)
-		return COMPACT_NOT_SUITABLE_ZONE;
-
-	return COMPACT_CONTINUE;
-}
-
-enum compact_result compaction_suitable(struct zone *zone, int order,
-					unsigned int alloc_flags,
-					int classzone_idx)
-{
-	enum compact_result ret;
+	if (ret == COMPACT_CONTINUE && check_fragindex) {
+		fragindex = fragmentation_index(zone, order);
+		if (fragindex >= 0 && fragindex <= sysctl_extfrag_threshold)
+			ret = COMPACT_NOT_SUITABLE_ZONE;
+	}
 
-	ret = __compaction_suitable(zone, order, alloc_flags, classzone_idx,
-				    zone_page_state(zone, NR_FREE_PAGES));
 	trace_mm_compaction_suitable(zone, order, ret);
 	if (ret == COMPACT_NOT_SUITABLE_ZONE)
 		ret = COMPACT_SKIPPED;
@@ -1473,8 +1475,7 @@ bool compaction_zonelist_suitable(struct alloc_context *ac, int order,
 		available += zone_page_state_snapshot(zone, NR_FREE_PAGES);
 		compact_result = __compaction_suitable(zone, order, alloc_flags,
 				ac_classzone_idx(ac), available);
-		if (compact_result != COMPACT_SKIPPED &&
-				compact_result != COMPACT_NOT_SUITABLE_ZONE)
+		if (compact_result != COMPACT_SKIPPED)
 			return true;
 	}
 
@@ -1490,7 +1491,7 @@ static enum compact_result compact_zone(struct zone *zone, struct compact_contro
 	const bool sync = cc->mode != MIGRATE_ASYNC;
 
 	ret = compaction_suitable(zone, cc->order, cc->alloc_flags,
-							cc->classzone_idx);
+				cc->classzone_idx, !cc->ignore_fragindex);
 	/* Compaction is likely to fail */
 	if (ret == COMPACT_SUCCESS || ret == COMPACT_SKIPPED)
 		return ret;
@@ -1661,7 +1662,8 @@ static enum compact_result compact_zone_order(struct zone *zone, int order,
 		.direct_compaction = true,
 		.whole_zone = (prio == MIN_COMPACT_PRIORITY),
 		.ignore_skip_hint = (prio == MIN_COMPACT_PRIORITY),
-		.ignore_block_suitable = (prio == MIN_COMPACT_PRIORITY)
+		.ignore_block_suitable = (prio == MIN_COMPACT_PRIORITY),
+		.ignore_fragindex = (prio == MIN_COMPACT_PRIORITY)
 	};
 	INIT_LIST_HEAD(&cc.freepages);
 	INIT_LIST_HEAD(&cc.migratepages);
@@ -1869,7 +1871,7 @@ static bool kcompactd_node_suitable(pg_data_t *pgdat)
 			continue;
 
 		if (compaction_suitable(zone, pgdat->kcompactd_max_order, 0,
-					classzone_idx) == COMPACT_CONTINUE)
+				classzone_idx, true) == COMPACT_CONTINUE)
 			return true;
 	}
 
@@ -1905,7 +1907,7 @@ static void kcompactd_do_work(pg_data_t *pgdat)
 		if (compaction_deferred(zone, cc.order))
 			continue;
 
-		if (compaction_suitable(zone, cc.order, 0, zoneid) !=
+		if (compaction_suitable(zone, cc.order, 0, zoneid, true) !=
 							COMPACT_CONTINUE)
 			continue;
 
diff --git a/mm/internal.h b/mm/internal.h
index 537ac9951f5f..f18adf559e28 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -179,6 +179,7 @@ struct compact_control {
 	enum migrate_mode mode;		/* Async or sync migration mode */
 	bool ignore_skip_hint;		/* Scan blocks even if marked skip */
 	bool ignore_block_suitable;	/* Scan blocks considered unsuitable */
+	bool ignore_fragindex;		/* Ignore fragmentation index */
 	bool direct_compaction;		/* False from kcompactd or /proc/... */
 	bool whole_zone;		/* Whole zone should/has been scanned */
 	int order;			/* order a direct compactor needs */
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 55943a284082..08f16893cb2b 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2511,7 +2511,8 @@ static inline bool should_continue_reclaim(struct pglist_data *pgdat,
 		if (!managed_zone(zone))
 			continue;
 
-		switch (compaction_suitable(zone, sc->order, 0, sc->reclaim_idx)) {
+		switch (compaction_suitable(zone, sc->order, 0,
+						sc->reclaim_idx, true)) {
 		case COMPACT_SUCCESS:
 		case COMPACT_CONTINUE:
 			return false;
@@ -2624,7 +2625,8 @@ static inline bool compaction_ready(struct zone *zone, struct scan_control *sc)
 	unsigned long watermark;
 	enum compact_result suitable;
 
-	suitable = compaction_suitable(zone, sc->order, 0, sc->reclaim_idx);
+	suitable = compaction_suitable(zone, sc->order, 0, sc->reclaim_idx,
+									true);
 	if (suitable == COMPACT_SUCCESS)
 		/* Allocation should succeed already. Don't reclaim. */
 		return true;
-- 
2.10.0