Date: Wed, 2 Oct 2019 16:03:03 -0700 (PDT)
From: David Rientjes
To: Mike Kravetz, Michal Hocko
cc: Vlastimil Babka, Linus Torvalds, Andrea Arcangeli, Andrew Morton,
    Mel Gorman, "Kirill A. Shutemov", Linux Kernel Mailing List, Linux-MM
Subject: [rfc] mm, hugetlb: allow hugepage allocations to excessively reclaim
X-Mailing-List: linux-kernel@vger.kernel.org

Hugetlb allocations use __GFP_RETRY_MAYFAIL to aggressively attempt to get
hugepages that the user needs.
Commit b39d0ee2632d ("mm, page_alloc: avoid expensive reclaim when
compaction may not succeed") intends to improve allocator behavior for
thp allocations to prevent excessive amounts of reclaim, especially when
constrained to a single node.

Since hugetlb allocations have explicitly preferred to loop and do reclaim
and compaction, exempt them from this new behavior at least for the time
being. It is not shown that hugetlb allocation success rate has been
impacted by commit b39d0ee2632d, but hugetlb allocations are admittedly
beyond the scope of what the patch is intended to address (thp
allocations).

Cc: Mike Kravetz
Signed-off-by: David Rientjes
---
 Mike, you alluded to possibly opting hugetlbfs out of this for the time
 being in https://marc.info/?l=linux-kernel&m=156771690024533 -- not sure
 if you want to allow this excessive amount of reclaim for hugetlb
 allocations or not, given the swap storms Andrea has shown are possible
 (and nr_hugepages_mempolicy does exist), but hugetlbfs was not part of
 the problem we are trying to address here, so no objection to opting it
 out.

 You might want to consider how expensive hugetlb allocations can become,
 and how disruptive they are to the system when they do not yield
 additional hugepages, but that can be done at any time later as a
 general improvement rather than as part of a series aimed at thp.

 mm/page_alloc.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4467,12 +4467,14 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 		if (page)
 			goto got_pg;
 
-		if (order >= pageblock_order && (gfp_mask & __GFP_IO)) {
+		if (order >= pageblock_order && (gfp_mask & __GFP_IO) &&
+		    !(gfp_mask & __GFP_RETRY_MAYFAIL)) {
 			/*
 			 * If allocating entire pageblock(s) and compaction
 			 * failed because all zones are below low watermarks
 			 * or is prohibited because it recently failed at this
-			 * order, fail immediately.
+			 * order, fail immediately unless the allocator has
+			 * requested compaction and reclaim retry.
 			 *
 			 * Reclaim is
 			 *  - potentially very expensive because zones are far