From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752020Ab3LLUt5 (ORCPT ); Thu, 12 Dec 2013 15:49:57 -0500
Received: from relay3.sgi.com ([192.48.152.1]:44659 "EHLO relay.sgi.com"
	rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP
	id S1751554Ab3LLUty (ORCPT ); Thu, 12 Dec 2013 15:49:54 -0500
Date: Thu, 12 Dec 2013 14:49:50 -0600
From: Alex Thorlton
To: Andy Lutomirski
Cc: "linux-mm@kvack.org", Andrew Morton, "Kirill A. Shutemov",
	Benjamin Herrenschmidt, Rik van Riel, Wanpeng Li, Mel Gorman,
	Michel Lespinasse, Benjamin LaHaise, Oleg Nesterov,
	"Eric W. Biederman", Al Viro, David Rientjes, Zhang Yanfei,
	Peter Zijlstra, Johannes Weiner, Michal Hocko, Jiang Liu,
	Cody P Schafer, Glauber Costa, Kamezawa Hiroyuki, Naoya Horiguchi,
	"linux-kernel@vger.kernel.org"
Subject: Re: [RFC PATCH 2/3] Add tunable to control THP behavior
Message-ID: <20131212204950.GA6034@sgi.com>
References: <20131212180050.GC134240@sgi.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

> Is there a setting that will turn off the must-be-the-same-node
> behavior? There are workloads where TLB matters more than cross-node
> traffic (or where all the pages are hopelessly shared between nodes,
> but hugepages are still useful).

That's pretty much how THPs already behave in the kernel, so if you want
to allow THPs to be handed out to one node, but referenced from many
others, you'd just set the threshold to 1, and let the existing code
take over.

As for the must-be-the-same-node behavior: I'd actually say it's more
like a "must have so much on one node" behavior, in that, if you set the
threshold to 16, for example, 16 4K pages must be faulted in on the same
node, in the same contiguous 2M chunk, before a THP will be created.
What happens after that THP is created is out of our control, it could
be referenced from anywhere.

The idea here is that we can tune things so that jobs that behave poorly
with THP on will not be given THPs, but the jobs that like THPs can
still get them. Granted, there are still issues with this approach, but
I think it's a bit better than just handing out a THP because we touched
one byte in a 2M chunk.

- Alex
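
The threshold behavior described above can be pictured with a small userspace model. This is only a sketch and is not code from the patch: the names `Chunk` and `handle_fault` are hypothetical, and it assumes every fault touches a distinct 4K page inside one 2M-aligned chunk. It counts faults per node and promotes the chunk to a THP once a single node reaches the threshold.

```python
PAGE_SIZE = 4 * 1024                        # 4K base page
CHUNK_SIZE = 2 * 1024 * 1024                # 2M huge page
PAGES_PER_CHUNK = CHUNK_SIZE // PAGE_SIZE   # 512 base pages per chunk


class Chunk:
    """Hypothetical model of one contiguous, 2M-aligned chunk."""

    def __init__(self):
        self.faulted_per_node = {}  # node id -> number of 4K pages faulted in
        self.is_thp = False


def handle_fault(chunk, node, threshold):
    """Record one 4K fault from `node` and report whether the chunk is a THP.

    Assumption: each call represents a fault on a page that was not
    touched before. The chunk is promoted once any single node has
    faulted in `threshold` pages; faults from other nodes do not count
    toward that node's total.
    """
    if chunk.is_thp:
        return True
    count = chunk.faulted_per_node.get(node, 0) + 1
    chunk.faulted_per_node[node] = count
    if count >= threshold:
        chunk.is_thp = True
    return chunk.is_thp
```

With a threshold of 1 the first fault promotes the chunk, which matches the existing behavior of handing out a THP on first touch. With a threshold of 16, thirty faults alternating between two nodes leave each node at 15 and the chunk unpromoted; the next fault from either node promotes it.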