From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Thu, 16 Jun 2016 15:47:52 +0100
From: Mel Gorman
To: Vlastimil Babka
Cc: Andrew Morton, Linux-MM, Rik van Riel, Johannes Weiner, LKML
Subject: Re: [PATCH 12/27] mm, vmscan: Make shrink_node decisions more node-centric
Message-ID: <20160616144752.GI1868@techsingularity.net>
References: <1465495483-11855-1-git-send-email-mgorman@techsingularity.net>
 <1465495483-11855-13-git-send-email-mgorman@techsingularity.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-15
Content-Disposition: inline
User-Agent: Mutt/1.5.23 (2014-03-12)
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jun 16, 2016 at 03:35:15PM +0200, Vlastimil Babka wrote:
> On 06/09/2016 08:04 PM, Mel Gorman wrote:
> >Earlier patches focused on having direct reclaim and kswapd use data that
> >is node-centric for reclaiming but shrink_node() itself still uses too much
> >zone information. This patch removes unnecessary zone-based information
> >with the most important decision being whether to continue reclaim or
> >not. Some memcg APIs are adjusted as a result even though memcg itself
> >still uses some zone information.
> >
> >Signed-off-by: Mel Gorman
>
> [...]
> >@@ -2372,21 +2374,27 @@ static inline bool should_continue_reclaim(struct zone *zone,
> > 	 * inactive lists are large enough, continue reclaiming
> > 	 */
> > 	pages_for_compaction = (2UL << sc->order);
> >-	inactive_lru_pages = node_page_state(zone->zone_pgdat, NR_INACTIVE_FILE);
> >+	inactive_lru_pages = node_page_state(pgdat, NR_INACTIVE_FILE);
> > 	if (get_nr_swap_pages() > 0)
> >-		inactive_lru_pages += node_page_state(zone->zone_pgdat, NR_INACTIVE_ANON);
> >+		inactive_lru_pages += node_page_state(pgdat, NR_INACTIVE_ANON);
> > 	if (sc->nr_reclaimed < pages_for_compaction &&
> > 			inactive_lru_pages > pages_for_compaction)
> > 		return true;
> >
> > 	/* If compaction would go ahead or the allocation would succeed, stop */
> >-	switch (compaction_suitable(zone, sc->order, 0, 0)) {
> >-	case COMPACT_PARTIAL:
> >-	case COMPACT_CONTINUE:
> >-		return false;
> >-	default:
> >-		return true;
> >+	for (z = 0; z <= sc->reclaim_idx; z++) {
> >+		struct zone *zone = &pgdat->node_zones[z];
> >+
> >+		switch (compaction_suitable(zone, sc->order, 0, 0)) {
>
> Using 0 for classzone_idx here was sort of OK when each zone was reclaimed
> separately, as a Normal allocation not passing the appropriate classzone_idx
> (and thus not subtracting lowmem reserve from free pages) means that a false
> COMPACT_PARTIAL (or COMPACT_CONTINUE) could be returned for e.g. the DMA
> zone. That means a premature end of reclaim for this single zone, which is
> relatively small anyway, so no big deal (and we might avoid useless
> over-reclaim, when even reclaiming everything wouldn't get us above the
> lowmem_reserve).
>
> But in node-centric reclaim, such a premature "return false" from the DMA
> zone stops reclaiming the whole node. So I think we should involve the real
> classzone_idx here.
>

Fair point, although for compaction this will only occur in a marginal
corner case. Prematurely allowing compaction for ZONE_DMA is unfortunate,
but it would be bizarre for a high-order allocation to be restricted to
just that zone.
I'll pass in sc->reclaim_idx as it represents the highest zone usable by
the allocating context. That highlighted that direct reclaim was not
setting reclaim_idx, but that has since been corrected.

-- 
Mel Gorman
SUSE Labs