From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1757243Ab2JRXpG (ORCPT ); Thu, 18 Oct 2012 19:45:06 -0400
Received: from mail.linuxfoundation.org ([140.211.169.12]:41556 "EHLO mail.linuxfoundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1757174Ab2JRXpD (ORCPT ); Thu, 18 Oct 2012 19:45:03 -0400
Date: Thu, 18 Oct 2012 16:45:02 -0700
From: Andrew Morton
To: "Kirill A. Shutemov"
Cc: Andrea Arcangeli, linux-mm@kvack.org, Andi Kleen, "H. Peter Anvin", linux-kernel@vger.kernel.org, "Kirill A. Shutemov"
Subject: Re: [PATCH v4 10/10] thp: implement refcounting for huge zero page
Message-Id: <20121018164502.b32791e7.akpm@linux-foundation.org>
In-Reply-To: <1350280859-18801-11-git-send-email-kirill.shutemov@linux.intel.com>
References: <1350280859-18801-1-git-send-email-kirill.shutemov@linux.intel.com> <1350280859-18801-11-git-send-email-kirill.shutemov@linux.intel.com>
X-Mailer: Sylpheed 3.0.2 (GTK+ 2.20.1; x86_64-pc-linux-gnu)
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, 15 Oct 2012 09:00:59 +0300 "Kirill A. Shutemov" wrote:

> H. Peter Anvin doesn't like huge zero page which sticks in memory forever
> after the first allocation. Here's implementation of lockless refcounting
> for huge zero page.
>
> We have two basic primitives: {get,put}_huge_zero_page(). They
> manipulate reference counter.
>
> If counter is 0, get_huge_zero_page() allocates a new huge page and
> takes two references: one for caller and one for shrinker. We free the
> page only in shrinker callback if counter is 1 (only shrinker has the
> reference).
>
> put_huge_zero_page() only decrements counter. Counter is never zero
> in put_huge_zero_page() since shrinker holds on reference.
>
> Freeing huge zero page in shrinker callback helps to avoid frequent
> allocate-free.

I'd like more details on this please.  The cost of freeing then
reinstantiating that page is tremendous, because it has to be zeroed
out again.  If there is any way at all in which the kernel can be made
to enter a high-frequency free/reinstantiate pattern then I expect the
effects would be quite bad.

Do we have sufficient mechanisms in there to prevent this from
happening in all cases?  If so, what are they, because I'm not seeing
them?

> Refcounting has cost. On 4 socket machine I observe ~1% slowdown on
> parallel (40 processes) read page faulting comparing to lazy huge page
> allocation. I think it's pretty reasonable for synthetic benchmark.
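
For readers following the thread, the refcounting scheme in the quoted
patch description can be sketched in userspace C11 roughly as below.
This is an illustration under stated assumptions, not the code from the
patch: the names get_zero_page(), put_zero_page(), shrink_zero_page(),
inc_not_zero(), zero_refcount, zero_page and nr_allocs are hypothetical,
calloc() stands in for allocating and zeroing a huge page, and the 2 MiB
size is assumed. The nr_allocs counter is there only to make the
free/reinstantiate cost raised above visible.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

#define ZERO_PAGE_SIZE (2u * 1024 * 1024)  /* assumed huge page size */

static atomic_int zero_refcount;     /* 0 means no page is allocated */
static _Atomic(void *) zero_page;    /* NULL until the first get */
static atomic_int nr_allocs;         /* counts zeroed allocations (demo only) */

/* Increment the counter unless it is 0. Returns true if it was incremented. */
static bool inc_not_zero(atomic_int *v)
{
    int old = atomic_load(v);
    while (old != 0) {
        /* On failure, old is reloaded with the current value. */
        if (atomic_compare_exchange_weak(v, &old, old + 1))
            return true;
    }
    return false;
}

/* Take a reference, allocating the page if the counter is 0. */
static void *get_zero_page(void)
{
    for (;;) {
        if (inc_not_zero(&zero_refcount))
            return atomic_load(&zero_page);

        void *page = calloc(1, ZERO_PAGE_SIZE);  /* the expensive zeroing */
        if (!page)
            return NULL;

        void *expected = NULL;
        if (!atomic_compare_exchange_strong(&zero_page, &expected, page)) {
            free(page);   /* another thread installed a page first */
            continue;
        }
        atomic_fetch_add(&nr_allocs, 1);
        /* Two references: one for the caller, one for the shrinker. */
        atomic_store(&zero_refcount, 2);
        return page;
    }
}

/* Drop a caller reference. The shrinker's reference keeps the count >= 1. */
static void put_zero_page(void)
{
    int old = atomic_fetch_sub(&zero_refcount, 1);
    assert(old > 1);
}

/* Shrinker callback: free the page only if the shrinker is the sole owner. */
static bool shrink_zero_page(void)
{
    int expected = 1;
    if (!atomic_compare_exchange_strong(&zero_refcount, &expected, 0))
        return false;
    free(atomic_exchange(&zero_page, (void *)NULL));
    return true;
}
```

In this sketch, a get / put / shrink / get sequence leaves nr_allocs at
2: every shrinker pass that runs while there are no users forces the
next fault to allocate and zero the page again. Nothing in the sketch
rate-limits that cycle, which is the pattern the question above is
about.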