From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752912AbcFVQA1 (ORCPT ); Wed, 22 Jun 2016 12:00:27 -0400 Received: from mx2.suse.de ([195.135.220.15]:35818 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752151AbcFVQAZ (ORCPT ); Wed, 22 Jun 2016 12:00:25 -0400 Subject: Re: [PATCH 04/27] mm, vmscan: Begin reclaiming pages on a per-node basis To: Mel Gorman , Andrew Morton , Linux-MM References: <1466518566-30034-1-git-send-email-mgorman@techsingularity.net> <1466518566-30034-5-git-send-email-mgorman@techsingularity.net> <6eecdf50-7880-2bfe-5519-004a4beeece6@suse.cz> Cc: Rik van Riel , Johannes Weiner , LKML , Michal Hocko From: Vlastimil Babka Message-ID: Date: Wed, 22 Jun 2016 18:00:12 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.1.1 MIME-Version: 1.0 In-Reply-To: <6eecdf50-7880-2bfe-5519-004a4beeece6@suse.cz> Content-Type: text/plain; charset=iso-8859-2; format=flowed Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 06/22/2016 04:04 PM, Vlastimil Babka wrote: > On 06/21/2016 04:15 PM, Mel Gorman wrote: >> This patch makes reclaim decisions on a per-node basis. A reclaimer knows >> what zone is required by the allocation request and skips pages from >> higher zones. In many cases this will be ok because it's a GFP_HIGHMEM >> request of some description. On 64-bit, ZONE_DMA32 requests will cause >> some problems but 32-bit devices on 64-bit platforms are increasingly >> rare. Historically it would have been a major problem on 32-bit with big >> Highmem:Lowmem ratios but such configurations are also now rare and even >> where they exist, they are not encouraged. If it really becomes a problem, >> it'll manifest as very low reclaim efficiencies. >> >> Signed-off-by: Mel Gorman > > [...] > >> @@ -2540,14 +2559,14 @@ static inline bool compaction_ready(struct zone *zone, int order, int classzone_ >> * If a zone is deemed to be full of pinned pages then just give it a light >> * scan then give up on it. >> */ >> -static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc) >> +static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc, >> + enum zone_type classzone_idx) >> { >> struct zoneref *z; >> struct zone *zone; >> unsigned long nr_soft_reclaimed; >> unsigned long nr_soft_scanned; >> gfp_t orig_mask; >> - enum zone_type requested_highidx = gfp_zone(sc->gfp_mask); >> >> /* >> * If the number of buffer_heads in the machine exceeds the maximum >> @@ -2560,15 +2579,20 @@ static void shrink_zones(struct zonelist *zonelist, struct scan_control *sc) >> >> for_each_zone_zonelist_nodemask(zone, z, zonelist, >> gfp_zone(sc->gfp_mask), sc->nodemask) { > > Using sc->reclaim_idx could be faster/nicer here than gfp_zone()? > Although after "mm, vmscan: Update classzone_idx if buffer_heads_over_limit" > there would need to be a variable for the highmem adjusted value - maybe reuse > "requested_highidx"? Not important though. > >> - enum zone_type classzone_idx; >> - >> if (!populated_zone(zone)) >> continue; >> >> - classzone_idx = requested_highidx; >> + /* >> + * Note that reclaim_idx does not change as it is the highest >> + * zone reclaimed from which for empty zones is a no-op but >> + * classzone_idx is used by shrink_node to test if the slabs >> + * should be shrunk on a given node. >> + */ >> while (!populated_zone(zone->zone_pgdat->node_zones + >> - classzone_idx)) >> + classzone_idx)) { >> classzone_idx--; >> + continue; Oh and Michal's comment on Patch 20 made me realize that my objection to v6 about possible underflow of sc->reclaim_idx and classzone_idx seems to still apply here for classzone_idx? Updated example: Normal zone allocation. A small node 0 without Normal zone will get us classzone_idx == dma32. Node 1 next in zonelist won't have dma/dma32 zones so we won't see node_zones + classzone_idx populated, and the while loop will lead to underflow of classzone_idx. I may be missing something, but I don't really see another way around it than resetting classzone_idx to sc->reclaim_idx before the while loop.