From: Muchun Song
Date: Tue, 17 Nov 2020 23:35:19 +0800
Subject: Re: [External] Re: [PATCH v3 03/21] mm/hugetlb: Introduce a new config HUGETLB_PAGE_FREE_VMEMMAP
To: Matthew Wilcox
Cc: Mike Kravetz, Oscar Salvador, Jonathan Corbet, Thomas Gleixner,
    mingo@redhat.com, bp@alien8.de, x86@kernel.org, hpa@zytor.com,
    dave.hansen@linux.intel.com, luto@kernel.org, Peter Zijlstra,
    viro@zeniv.linux.org.uk, Andrew Morton, paulmck@kernel.org,
    mchehab+huawei@kernel.org, pawan.kumar.gupta@linux.intel.com,
    Randy Dunlap, oneukum@suse.com, anshuman.khandual@arm.com,
    jroedel@suse.de, Mina Almasry, David Rientjes, Michal Hocko,
    Xiongchun duan, linux-doc@vger.kernel.org, LKML,
    Linux Memory Management List, linux-fsdevel
References: <20201108141113.65450-1-songmuchun@bytedance.com>
    <20201108141113.65450-4-songmuchun@bytedance.com>
    <20201109135215.GA4778@localhost.localdomain>
    <20201110195025.GN17076@casper.infradead.org>
In-Reply-To: <20201110195025.GN17076@casper.infradead.org>

On Wed, Nov 11, 2020 at 3:50 AM Matthew Wilcox wrote:
>
> On Tue, Nov 10, 2020 at 11:31:31AM -0800, Mike Kravetz wrote:
> > On 11/9/20 5:52 AM, Oscar Salvador wrote:
> > > On Sun, Nov 08, 2020 at 10:10:55PM +0800, Muchun Song wrote:
> > >> The purpose of introducing HUGETLB_PAGE_FREE_VMEMMAP is to configure
> > >> whether to enable the feature of freeing unused vmemmap associated
> > >> with HugeTLB pages. Now only support x86.
> > >>
> > >> Signed-off-by: Muchun Song
> > >> ---
> > >>  arch/x86/mm/init_64.c |  2 +-
> > >>  fs/Kconfig            | 16 ++++++++++++++++
> > >>  mm/bootmem_info.c     |  3 +--
> > >>  3 files changed, 18 insertions(+), 3 deletions(-)
> > >>
> > >> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> > >> index 0a45f062826e..0435bee2e172 100644
> > >> --- a/arch/x86/mm/init_64.c
> > >> +++ b/arch/x86/mm/init_64.c
> > >> @@ -1225,7 +1225,7 @@ static struct kcore_list kcore_vsyscall;
> > >>
> > >>  static void __init register_page_bootmem_info(void)
> > >>  {
> > >> -#ifdef CONFIG_NUMA
> > >> +#if defined(CONFIG_NUMA) || defined(CONFIG_HUGETLB_PAGE_FREE_VMEMMAP)
> > >>  	int i;
> > >>
> > >>  	for_each_online_node(i)
> > >> diff --git a/fs/Kconfig b/fs/Kconfig
> > >> index 976e8b9033c4..21b8d39a9715 100644
> > >> --- a/fs/Kconfig
> > >> +++ b/fs/Kconfig
> > >> @@ -245,6 +245,22 @@ config HUGETLBFS
> > >>  config HUGETLB_PAGE
> > >>  	def_bool HUGETLBFS
> > >>
> > >> +config HUGETLB_PAGE_FREE_VMEMMAP
> > >> +	bool "Free unused vmemmap associated with HugeTLB pages"
> > >> +	default y
> > >> +	depends on X86
> > >> +	depends on HUGETLB_PAGE
> > >> +	depends on SPARSEMEM_VMEMMAP
> > >> +	depends on HAVE_BOOTMEM_INFO_NODE
> > >> +	help
> > >> +	  There are many struct page structures associated with each HugeTLB
> > >> +	  page. But we only use a few struct page structures. In this case,
> > >> +	  it wastes some memory. It is better to free the unused struct page
> > >> +	  structures to buddy system which can save some memory. For
> > >> +	  architectures that support it, say Y here.
> > >> +
> > >> +	  If unsure, say N.
> > >
> > > I am not sure the above is useful for someone who needs to decide
> > > whether he needs/wants to enable this or not.
> > > I think the above fits better in a Documentation part.
> > >
> > > I suck at this, but what about the following, or something along those
> > > lines?
> > >
> > > "
> > > When using SPARSEMEM_VMEMMAP, the system can save up some memory
> > > from pre-allocated HugeTLB pages when they are not used.
> > > 6 pages per 2MB HugeTLB page and 4095 per 1GB HugeTLB page.
> > > When the pages are going to be used or freed up, the vmemmap
> > > array representing that range needs to be remapped again and
> > > the pages we discarded earlier need to be reallocated again.
> > > Therefore, this is a trade-off between saving memory and
> > > increasing time in allocation/free path.
> > > "
> > >
> > > It would be also great to point out that this might be a
> > > trade-off between saving up memory and increasing the cost
> > > of certain operations on allocation/free path.
> > > That is why I mentioned it there.
> >
> > Yes, this is somewhat a trade-off.
> >
> > As a config option, this is something that would likely be decided by
> > distros. I almost hate to suggest this, but is it something that an
> > end user would want to decide? Is this something that perhaps should
> > be a boot/kernel command line option?
>
> I don't like config options. I like boot options even less. I don't
> know how to describe to an end-user whether they should select this
> or not. Is there a way to make this not a tradeoff? Or make the
> tradeoff so minimal as to be not worth describing? (do we have numbers
> for the worst possible situation when enabling this option?)
>
> I haven't read through these patches in detail, so maybe we do this
> already, but when we free the pages to the buddy allocator, do we retain
> the third page to use for the PTEs (and free pages 3-7), or do we allocate
> a separate page for the PTEs and free pages 2-7?

Sorry, I missed this reply. It is a good idea. I will start an
investigation and implement this. Thanks Matthew.

--
Yours,
Muchun
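
For reference, the page counts discussed in this thread can be checked with a little arithmetic. The sketch below is only an illustration, not code from the patch series. It assumes a 4 KiB base page and a 64-byte struct page, which are typical for x86-64 but are not stated anywhere in the thread, and the helper names are hypothetical. Under those assumptions a 2MB HugeTLB page is described by 8 vmemmap pages (numbered 0-7) and a 1GB HugeTLB page by 4096.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for x86-64; neither is stated in the thread. */
#define BASE_PAGE_SIZE   4096UL  /* 4 KiB base page */
#define STRUCT_PAGE_SIZE 64UL    /* assumed sizeof(struct page) */

/* Hypothetical helper: base pages of vmemmap describing one huge page. */
static unsigned long vmemmap_pages(unsigned long hugepage_size)
{
	unsigned long nr_struct_pages = hugepage_size / BASE_PAGE_SIZE;
	unsigned long vmemmap_bytes = nr_struct_pages * STRUCT_PAGE_SIZE;

	return vmemmap_bytes / BASE_PAGE_SIZE;
}

/* Hypothetical helper: vmemmap pages left to free if `retained` stay mapped. */
static unsigned long freeable_pages(unsigned long hugepage_size,
				    unsigned long retained)
{
	unsigned long total = vmemmap_pages(hugepage_size);

	assert(retained <= total);
	return total - retained;
}
```

With these assumptions, freeable_pages(2UL << 20, 2) gives the 6 pages quoted for a 2MB HugeTLB page (pages 2-7), while retaining a third page leaves 5 to free (pages 3-7), which is the difference Matthew asks about. The quoted figure of 4095 for a 1GB page corresponds to retaining a single vmemmap page out of 4096.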