From: Pasha Tatashin
Date: Wed, 12 Apr 2023 15:57:42 -0400
Subject: Re: [PATCH] mm: hugetlb_vmemmap: provide stronger vmemmap allocaction gurantees
To: David Rientjes
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, akpm@linux-foundation.org, mike.kravetz@oracle.com, muchun.song@linux.dev, souravpanda@google.com
References: <20230412152337.1203254-1-pasha.tatashin@soleen.com> <63736432-5cef-f67c-c809-cc19b236a7f4@google.com>
In-Reply-To: <63736432-5cef-f67c-c809-cc19b236a7f4@google.com>

On Wed, Apr 12, 2023 at 1:54 PM David Rientjes wrote:
>
> On Wed, 12 Apr 2023, Pasha Tatashin wrote:
>
> > HugeTLB pages have a struct page optimization where struct pages for tail
> > pages are freed. However, when HugeTLB pages are destroyed, the memory for
> > struct pages (vmemmap) needs to be allocated again.
> >
> > Currently, the __GFP_NORETRY flag is used to allocate the memory for vmemmap,
> > but given that this flag makes very little effort to actually reclaim
> > memory, returning huge pages back to the system can be a problem. Let's
> > use __GFP_RETRY_MAYFAIL instead. This flag also performs graceful
> > reclaim without causing OOMs, but at least it may perform a few retries,
> > and will fail only when there is genuinely little unused memory
> > in the system.
> >
>
> Thanks Pasha, this definitely makes sense. We want to free the hugetlb
> page back to the system so it would be a shame to have to strand it in the
> hugetlb pool because we can't allocate the tail pages (we want to free
> more memory than we're allocating).
>
> > Signed-off-by: Pasha Tatashin
> > Suggested-by: David Rientjes
> > ---
> >  mm/hugetlb_vmemmap.c | 5 ++++-
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
> > index a559037cce00..c4226d2af7cc 100644
> > --- a/mm/hugetlb_vmemmap.c
> > +++ b/mm/hugetlb_vmemmap.c
> > @@ -475,9 +475,12 @@ int hugetlb_vmemmap_restore(const struct hstate *h, struct page *head)
> >  	 * the range is mapped to the page which @vmemmap_reuse is mapped to.
> >  	 * When a HugeTLB page is freed to the buddy allocator, previously
> >  	 * discarded vmemmap pages must be allocated and remapping.
> > +	 *
> > +	 * Use __GFP_RETRY_MAYFAIL to fail only when there is genuinely little
> > +	 * unused memory in the system.
> >  	 */
> >  	ret = vmemmap_remap_alloc(vmemmap_start, vmemmap_end, vmemmap_reuse,
> > -				  GFP_KERNEL | __GFP_NORETRY | __GFP_THISNODE);
> > +				  GFP_KERNEL | __GFP_RETRY_MAYFAIL | __GFP_THISNODE);
> >  	if (!ret) {
> >  		ClearHPageVmemmapOptimized(head);
> >  		static_branch_dec(&hugetlb_optimize_vmemmap_key);
>
> The behavior of __GFP_RETRY_MAYFAIL is different for high-order memory (at
> least larger than PAGE_ALLOC_COSTLY_ORDER).
> The order that we're
> allocating would depend on the implementation of alloc_vmemmap_page_list()
> so likely best to move the gfp mask to that function.

Thank you David. This makes sense, I will send the 2nd version soon.

Pasha
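
For readers following the thread, here is a minimal user-space sketch of the suggestion above: the helper that builds the page list chooses the gfp mask itself, so the mask sits next to the allocation order it applies to. This is not the v2 patch and not kernel code. Every name prefixed with sk_ or SK_ is hypothetical, the flag bit values are made up, and the choice of order 0 (one base page per request) is an assumption made only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's gfp bits; values are illustrative only. */
enum {
	SK_GFP_KERNEL        = 1u << 0,
	SK_GFP_RETRY_MAYFAIL = 1u << 1,
	SK_GFP_THISNODE      = 1u << 2,
};

/* Records what the fake allocator was asked for, so the policy is observable. */
struct sk_alloc_log {
	unsigned int gfp;	/* mask used for the last request */
	unsigned int order;	/* order used for the last request */
	size_t calls;		/* number of page requests made */
};

/* Fake page allocator: it only logs the request and never touches real memory. */
static int sk_alloc_page(struct sk_alloc_log *log, unsigned int gfp,
			 unsigned int order)
{
	log->gfp = gfp;
	log->order = order;
	log->calls++;
	return 0;	/* always "succeeds" in this sketch */
}

/*
 * Model of a list-allocation helper that owns its gfp mask. The caller
 * passes only the page count, so the mask and the order it is used with
 * are decided in one place.
 */
static int sk_alloc_vmemmap_page_list(struct sk_alloc_log *log, size_t nr_pages)
{
	const unsigned int gfp = SK_GFP_KERNEL | SK_GFP_RETRY_MAYFAIL |
				 SK_GFP_THISNODE;
	const unsigned int order = 0;	/* assumption: one base page per request */

	assert(log != NULL);
	for (size_t i = 0; i < nr_pages; i++) {
		if (sk_alloc_page(log, gfp, order))
			return -1;
	}
	return 0;
}
```

With the mask kept inside the helper, a later change to the allocation order would be made in the same function as the flags, which is where the high-order behavior of __GFP_RETRY_MAYFAIL would need to be reconsidered.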