From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
    id S1754208Ab2KMMCz (ORCPT );
    Tue, 13 Nov 2012 07:02:55 -0500
Received: from cantor2.suse.de ([195.135.220.15]:51380 "EHLO mx2.suse.de"
    rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
    id S1751628Ab2KMMCw (ORCPT );
    Tue, 13 Nov 2012 07:02:52 -0500
Date: Tue, 13 Nov 2012 12:02:48 +0000
From: Mel Gorman
To: Ingo Molnar
Cc: Peter Zijlstra, Andrea Arcangeli, Rik van Riel, Johannes Weiner,
    Hugh Dickins, Thomas Gleixner, Linus Torvalds, Andrew Morton,
    Linux-MM, LKML
Subject: Re: [PATCH 14/19] mm: mempolicy: Add MPOL_MF_LAZY
Message-ID: <20121113120248.GA8218@suse.de>
References: <1352193295-26815-1-git-send-email-mgorman@suse.de>
    <1352193295-26815-15-git-send-email-mgorman@suse.de>
    <20121113102555.GE21522@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-15
Content-Disposition: inline
In-Reply-To: <20121113102555.GE21522@gmail.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Nov 13, 2012 at 11:25:55AM +0100, Ingo Molnar wrote:
>
> * Mel Gorman wrote:
>
> > From: Lee Schermerhorn
> >
> > NOTE: Once again there is a lot of patch stealing and the end result
> >       is sufficiently different that I had to drop the signed-offs.
> >       Will re-add if the original authors are ok with that.
> >
> > This patch adds another mbind() flag to request "lazy migration". The
> > flag, MPOL_MF_LAZY, modifies MPOL_MF_MOVE* such that the selected
> > pages are marked PROT_NONE. The pages will be migrated in the fault
> > path on "first touch", if the policy dictates at that time.
> >
>
> Here you are paying a heavy price for the earlier design
> mistake, for forking into per arch approach - the NUMA version
> of change_protection() had to be open-coded:
>

I considered this when looking at the two trees.
At the time I also had the option of making change_prot_numa() a wrapper
around change_protection(), and if pte_numa is made generic, that becomes
more attractive.

One of the reasons I went with this version from Andrea's tree is simply
because it does less work than change_protection() while still doing what
should be sufficient for _PAGE_NUMA. For example, I avoid the TLB flush if
there are no PTE updates, but I could shuffle change_protection() and get
the same thing.

> >  include/linux/mm.h             |   3 +
> >  include/uapi/linux/mempolicy.h |  13 ++-
> >  mm/mempolicy.c                 | 176 ++++++++++++++++++++++++++++++++++++----
> >  3 files changed, 174 insertions(+), 18 deletions(-)
>
> Compare it to the generic version that Peter used:
>
>  include/uapi/linux/mempolicy.h | 13 ++++++++---
>  mm/mempolicy.c                 | 49 +++++++++++++++++++++++++++---------------
>  2 files changed, 42 insertions(+), 20 deletions(-)
>
> and the cleanliness and maintainability advantages are obvious.
>
> So without some really good arguments in favor of your approach
> NAK on that complex approach really.
>

I will reimplement around change_protection() and see what effect, if any,
it has on overhead.

-- 
Mel Gorman
SUSE Labs
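
For illustration only, here is a rough userspace sketch of the wrapper
idea discussed above: a NUMA helper that reuses a generic protection
walker and skips the TLB flush when no PTE was updated. It is a toy
model, not kernel code. Every name in it (the toy_* functions, TOY_NUMA,
the flat pte array) is made up for this sketch and only stands in for
change_prot_numa(), change_protection() and _PAGE_NUMA.

```c
#include <stddef.h>

/* Toy userspace model, NOT kernel code. All names are hypothetical. */
#define TOY_PRESENT 0x1u        /* entry maps a page */
#define TOY_NUMA    0x2u        /* stands in for _PAGE_NUMA */

/* Counts simulated TLB flushes so the effect can be observed. */
static unsigned long toy_tlb_flushes;

static void toy_flush_tlb_range(void)
{
        toy_tlb_flushes++;
}

/*
 * Generic walker: set new_flag on every present entry in [start, end)
 * that does not already carry it. Returns the number of entries changed.
 */
static unsigned long toy_change_protection(unsigned int *pte, size_t start,
                                           size_t end, unsigned int new_flag)
{
        unsigned long updated = 0;

        for (size_t i = start; i < end; i++) {
                if ((pte[i] & TOY_PRESENT) && !(pte[i] & new_flag)) {
                        pte[i] |= new_flag;
                        updated++;
                }
        }
        return updated;
}

/*
 * NUMA wrapper: reuse the generic walker and flush only when at least
 * one entry was actually updated.
 */
static unsigned long toy_change_prot_numa(unsigned int *pte, size_t start,
                                          size_t end)
{
        unsigned long updated = toy_change_protection(pte, start, end,
                                                      TOY_NUMA);

        if (updated)
                toy_flush_tlb_range();
        return updated;
}
```

Calling toy_change_prot_numa() twice over the same range marks the
present entries on the first pass and flushes once. The second pass
finds nothing to update and skips the flush.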