From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
References: <20201120064325.34492-1-songmuchun@bytedance.com> <20201120064325.34492-12-songmuchun@bytedance.com> <20201120081123.GC3200@dhcp22.suse.cz>
In-Reply-To: <20201120081123.GC3200@dhcp22.suse.cz>
From: Muchun Song
Date: Fri, 20 Nov 2020 16:51:59 +0800
Message-ID:
Subject: Re: [External] Re: [PATCH v5 11/21] mm/hugetlb: Allocate the vmemmap pages associated with each hugetlb page
To: Michal Hocko
Cc: Jonathan Corbet, Mike Kravetz, Thomas Gleixner, mingo@redhat.com, bp@alien8.de, x86@kernel.org, hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org, Peter Zijlstra, viro@zeniv.linux.org.uk, Andrew Morton, paulmck@kernel.org, mchehab+huawei@kernel.org, pawan.kumar.gupta@linux.intel.com, Randy Dunlap, oneukum@suse.com, anshuman.khandual@arm.com,
 jroedel@suse.de, Mina Almasry, David Rientjes, Matthew Wilcox, Oscar Salvador, "Song Bao Hua (Barry Song)", Xiongchun duan, linux-doc@vger.kernel.org, LKML, Linux Memory Management List, linux-fsdevel
Content-Type: text/plain; charset="UTF-8"
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Nov 20, 2020 at 4:11 PM Michal Hocko wrote:
>
> On Fri 20-11-20 14:43:15, Muchun Song wrote:
> [...]
> > diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
> > index eda7e3a0b67c..361c4174e222 100644
> > --- a/mm/hugetlb_vmemmap.c
> > +++ b/mm/hugetlb_vmemmap.c
> > @@ -117,6 +117,8 @@
> >  #define RESERVE_VMEMMAP_NR 2U
> >  #define RESERVE_VMEMMAP_SIZE (RESERVE_VMEMMAP_NR << PAGE_SHIFT)
> >  #define TAIL_PAGE_REUSE -1
> > +#define GFP_VMEMMAP_PAGE \
> > +	(GFP_KERNEL | __GFP_NOFAIL | __GFP_MEMALLOC)
>
> This is really dangerous! __GFP_MEMALLOC would allow a complete memory
> depletion. I am not even sure triggering the OOM killer is a reasonable
> behavior. It is just unexpected that shrinking a hugetlb pool can have
> destructive side effects. I believe it would be more reasonable to
> simply refuse to shrink the pool if we cannot free those pages up. This
> sucks as well but it isn't destructive at least.

I found the description of __GFP_MEMALLOC in the kernel doc:

    %__GFP_MEMALLOC allows access to all memory. This should only be used
    when the caller guarantees the allocation will allow more memory to be
    freed very shortly.

Our situation is in line with this description: shortly after the
allocation, we free a HugeTLB page to the buddy allocator, and that page
is much larger than the vmemmap pages we allocated.

Thanks.

> --
> Michal Hocko
> SUSE Labs

--
Yours,
Muchun
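
The size argument in the reply above ("much larger than that we allocated") can be made concrete with a little arithmetic. The following is only a userspace sketch, not code from the patch series. RESERVE_VMEMMAP_NR comes from the quoted diff. The 4 KiB base page size, the 64-byte struct page, and all function names are assumptions chosen for illustration.

```c
#include <assert.h>

/* From the quoted patch: vmemmap pages that stay mapped per HugeTLB page. */
#define RESERVE_VMEMMAP_NR 2U

/* Assumed values for illustration only; they do not appear in the thread. */
#define ASSUMED_PAGE_SIZE        4096UL
#define ASSUMED_STRUCT_PAGE_SIZE 64UL

/* Number of base pages covered by one huge page of the given size. */
unsigned long base_pages_per_hpage(unsigned long hpage_size)
{
	assert(hpage_size % ASSUMED_PAGE_SIZE == 0);
	return hpage_size / ASSUMED_PAGE_SIZE;
}

/* vmemmap pages needed to hold the struct page array of one huge page. */
unsigned long vmemmap_pages_total(unsigned long hpage_size)
{
	unsigned long bytes =
		base_pages_per_hpage(hpage_size) * ASSUMED_STRUCT_PAGE_SIZE;

	/* Round up to whole pages. */
	return (bytes + ASSUMED_PAGE_SIZE - 1) / ASSUMED_PAGE_SIZE;
}

/* vmemmap pages that must be allocated before the huge page is freed. */
unsigned long vmemmap_pages_to_allocate(unsigned long hpage_size)
{
	unsigned long total = vmemmap_pages_total(hpage_size);

	return total > RESERVE_VMEMMAP_NR ? total - RESERVE_VMEMMAP_NR : 0;
}

/* Net base pages gained once the huge page is back in the allocator. */
unsigned long net_pages_gained(unsigned long hpage_size)
{
	return base_pages_per_hpage(hpage_size) -
	       vmemmap_pages_to_allocate(hpage_size);
}
```

Under these assumptions, a 2 MiB HugeTLB page (2UL << 20) needs 6 vmemmap pages allocated and returns 512 base pages, a net gain of 506 pages. The sketch only quantifies the size difference; it says nothing about what should happen when those 6 pages cannot be allocated, which is the concern raised in the quoted review.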