From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1753047AbdGXJZ4 (ORCPT ); Mon, 24 Jul 2017 05:25:56 -0400
Received: from mx2.suse.de ([195.135.220.15]:36871 "EHLO mx1.suse.de"
	rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP
	id S1750864AbdGXJZu (ORCPT ); Mon, 24 Jul 2017 05:25:50 -0400
Subject: Re: [PATCH 6/9] mm, page_alloc: simplify zonelist initialization
To: Michal Hocko, Andrew Morton
Cc: Mel Gorman, Johannes Weiner, linux-mm@kvack.org, LKML, Michal Hocko
References: <20170721143915.14161-1-mhocko@kernel.org> <20170721143915.14161-7-mhocko@kernel.org>
From: Vlastimil Babka
Message-ID: <994c1d72-bc57-1378-586d-fdfce770e53e@suse.cz>
Date: Mon, 24 Jul 2017 11:25:47 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.2.1
MIME-Version: 1.0
In-Reply-To: <20170721143915.14161-7-mhocko@kernel.org>
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 07/21/2017 04:39 PM, Michal Hocko wrote:
> From: Michal Hocko
>
> build_zonelists gradually builds zonelists from the nearest to the most
> distant node. As we do not know how many populated zones we will have in
> each node we rely on the _zoneref to terminate initialized part of the
> zonelist by a NULL zone. While this is functionally correct it is quite
> suboptimal because we cannot allow updaters to race with zonelists
> users because they could see an empty zonelist and fail the allocation
> or hit the OOM killer in the worst case.
>
> We can do much better, though. We can store the node ordering into an
> already existing node_order array and then give this array to
> build_zonelists_in_node_order and do the whole initialization at once.
> zonelists consumers still might see halfway initialized state but that
> should be much more tolerable because the list will not be empty and
> they would either see some zone twice or skip over some zone(s) in the
> worst case which shouldn't lead to immediate failures.
>
> While at it let's simplify build_zonelists_node which is rather
> confusing now. It gets an index into the zoneref array and returns
> the updated index for the next iteration. Let's rename the function
> to build_zonerefs_node to better reflect its purpose and give it
> zoneref array to update. The function doesn't need the index anymore. It
> just returns the number of added zones so that the caller can advance
> the zoneref array start for the next update.
>
> This patch alone doesn't introduce any functional change yet, though, it
> is merely a preparatory work for later changes.
>
> Changes since v1
> - build_zonelists_node -> build_zonerefs_node and operate directly on
>   zonerefs array rather than play tricks with index into the array.
> - give build_zonelists_in_node_order nr_nodes to not iterate over all
>   MAX_NUMNODES as per Mel
>
> Signed-off-by: Michal Hocko

Acked-by: Vlastimil Babka
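
For readers following the thread, the pattern the quoted changelog
describes can be sketched in plain userspace C: a helper that writes the
populated zones of one node into a zoneref array and returns how many it
added, and a caller that walks a precomputed node_order array, advances
the array start by that count, and terminates the list once at the end.
This is only a rough sketch under assumptions, not the kernel code: the
struct layouts, the "populated" flag, the constants and the function
bodies below are all hypothetical simplifications, and only the names
build_zonerefs_node, node_order and nr_nodes come from the changelog.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical limits, chosen only for this sketch. */
#define MAX_NR_ZONES 3
#define MAX_NUMNODES 4
#define MAX_ZONEREFS (MAX_NR_ZONES * MAX_NUMNODES + 1)

/* Simplified stand-ins for the kernel structures (assumptions). */
struct zone {
	int node;
	int idx;
	int populated;		/* nonzero if the zone has memory */
};

struct zoneref {
	struct zone *zone;	/* a NULL zone terminates the list */
};

struct node {
	struct zone zones[MAX_NR_ZONES];
};

/*
 * Write the populated zones of one node into zonerefs, highest zone
 * index first, and return the number of entries added. The caller
 * owns the position in the array; no index is passed in or out.
 */
int build_zonerefs_node(struct node *n, struct zoneref *zonerefs)
{
	int nr_zones = 0;

	for (int i = MAX_NR_ZONES - 1; i >= 0; i--) {
		struct zone *z = &n->zones[i];

		if (z->populated)
			zonerefs[nr_zones++].zone = z;
	}
	return nr_zones;
}

/*
 * Build a whole zonelist in one pass from a precomputed node ordering.
 * Only the first nr_nodes entries of node_order are visited, and the
 * NULL terminator is written once, after all nodes have been added.
 * Returns the number of zonerefs written, excluding the terminator.
 */
int build_zonelist_in_node_order(struct node *nodes, const int *node_order,
				 int nr_nodes, struct zoneref *zonerefs)
{
	struct zoneref *cur = zonerefs;

	/* zonerefs is assumed to hold at least MAX_ZONEREFS entries. */
	assert(nr_nodes <= MAX_NUMNODES);

	for (int i = 0; i < nr_nodes; i++)
		cur += build_zonerefs_node(&nodes[node_order[i]], cur);

	cur->zone = NULL;
	return (int)(cur - zonerefs);
}
```

With two nodes and node_order = { 1, 0 }, the list holds the populated
zones of node 1 first, then those of node 0, then the NULL terminator.
Because the terminator is written only after every node has been added,
the sketch never produces the intermediate "empty list" state that the
changelog is concerned about, although it makes no attempt to model
concurrent readers.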