From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 14 Apr 2022 11:06:07 -0700
In-Reply-To: <20220414180612.3844426-1-zokeefe@google.com>
Message-Id: <20220414180612.3844426-8-zokeefe@google.com>
Mime-Version: 1.0
References: <20220414180612.3844426-1-zokeefe@google.com>
X-Mailer: git-send-email 2.36.0.rc0.470.gd361397f0d-goog
Subject: [PATCH v2 07/12] mm/khugepaged: add flag to ignore khugepaged_max_ptes_*
From: "Zach O'Keefe"
To: Alex Shi, David Hildenbrand, David Rientjes, Matthew Wilcox, Michal Hocko,
	Pasha Tatashin, SeongJae Park, Song Liu, Vlastimil Babka, Yang Shi,
	Zi Yan, linux-mm@kvack.org
Cc: Andrea Arcangeli, Andrew Morton, Arnd Bergmann, Axel Rasmussen,
	Chris Kennelly, Chris Zankel, Helge Deller, Hugh Dickins,
	Ivan Kokshaysky, "James E.J. Bottomley", Jens Axboe,
	"Kirill A. Shutemov", Matt Turner, Max Filippov, Miaohe Lin,
	Minchan Kim, Patrick Xia, Pavel Begunkov, Peter Xu,
	Thomas Bogendoerfer, "Zach O'Keefe"
Content-Type: text/plain; charset="UTF-8"

Add an enforce_pte_scan_limits flag to struct collapse_control that allows a
collapse context to ignore the sysfs-controlled knobs
khugepaged_max_ptes_[none|swap|shared]. Set this flag in the khugepaged
collapse context to preserve existing khugepaged behavior, and clear it in
the madvise collapse context, since the user presumably has reason to
believe the collapse will be beneficial.
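The gating added by this patch can be sketched as two standalone predicates. This is a simplified userspace illustration, not kernel code: the helper names and the limit values are made up for the example (the defaults shown match common x86-64 values of the sysfs knobs); only the enforce_pte_scan_limits field mirrors the patch.

```c
#include <stdbool.h>

/* Simplified mirror of the field added to struct collapse_control. */
struct collapse_control {
	bool enforce_pte_scan_limits;
};

/* Illustrative stand-ins for the sysfs-controlled knobs. */
static const int khugepaged_max_ptes_none = 511;
static const int khugepaged_max_ptes_shared = 256;

/* A none/zero PTE is tolerated if under the limit, or if limits are off. */
static bool none_pte_ok(const struct collapse_control *cc, int none_or_zero)
{
	return none_or_zero <= khugepaged_max_ptes_none ||
	       !cc->enforce_pte_scan_limits;
}

/* A shared page counts toward SCAN_EXCEED_SHARED_PTE only when limits are on. */
static bool shared_pte_exceeded(const struct collapse_control *cc, int shared)
{
	return cc->enforce_pte_scan_limits &&
	       shared > khugepaged_max_ptes_shared;
}
```

With limits enforced (the khugepaged context), the 512th none/zero PTE fails the check; with limits off (the madvise context), the same scan proceeds.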
Signed-off-by: Zach O'Keefe
---
 mm/khugepaged.c | 32 ++++++++++++++++++++++----------
 1 file changed, 22 insertions(+), 10 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 716ba465b356..2f95f60431aa 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -87,6 +87,9 @@ static struct kmem_cache *mm_slot_cache __read_mostly;
 #define MAX_PTE_MAPPED_THP 8
 
 struct collapse_control {
+	/* Respect khugepaged_max_ptes_[none|swap|shared] */
+	bool enforce_pte_scan_limits;
+
 	/* Num pages scanned per node */
 	int node_load[MAX_NUMNODES];
 
@@ -631,6 +634,7 @@ static bool is_refcount_suitable(struct page *page)
 static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
 					unsigned long address,
 					pte_t *pte,
+					struct collapse_control *cc,
 					struct list_head *compound_pagelist)
 {
 	struct page *page = NULL;
@@ -644,7 +648,8 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
 		if (pte_none(pteval) || (pte_present(pteval) &&
 				is_zero_pfn(pte_pfn(pteval)))) {
 			if (!userfaultfd_armed(vma) &&
-			    ++none_or_zero <= khugepaged_max_ptes_none) {
+			    (++none_or_zero <= khugepaged_max_ptes_none ||
+			     !cc->enforce_pte_scan_limits)) {
 				continue;
 			} else {
 				result = SCAN_EXCEED_NONE_PTE;
@@ -664,8 +669,8 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
 
 		VM_BUG_ON_PAGE(!PageAnon(page), page);
 
-		if (page_mapcount(page) > 1 &&
-				++shared > khugepaged_max_ptes_shared) {
+		if (cc->enforce_pte_scan_limits && page_mapcount(page) > 1 &&
+		    ++shared > khugepaged_max_ptes_shared) {
 			result = SCAN_EXCEED_SHARED_PTE;
 			count_vm_event(THP_SCAN_EXCEED_SHARED_PTE);
 			goto out;
@@ -1207,7 +1212,7 @@ static void collapse_huge_page(struct mm_struct *mm, unsigned long address,
 	mmu_notifier_invalidate_range_end(&range);
 
 	spin_lock(pte_ptl);
-	cr->result = __collapse_huge_page_isolate(vma, address, pte,
+	cr->result = __collapse_huge_page_isolate(vma, address, pte, cc,
 						  &compound_pagelist);
 	spin_unlock(pte_ptl);
 
@@ -1296,7 +1301,8 @@ static void scan_pmd(struct mm_struct *mm, struct vm_area_struct *vma,
 	     _pte++, _address += PAGE_SIZE) {
 		pte_t pteval = *_pte;
 		if (is_swap_pte(pteval)) {
-			if (++unmapped <= khugepaged_max_ptes_swap) {
+			if (++unmapped <= khugepaged_max_ptes_swap ||
+			    !cc->enforce_pte_scan_limits) {
 				/*
 				 * Always be strict with uffd-wp
 				 * enabled swap entries.  Please see
@@ -1315,7 +1321,8 @@ static void scan_pmd(struct mm_struct *mm, struct vm_area_struct *vma,
 		}
 		if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) {
 			if (!userfaultfd_armed(vma) &&
-			    ++none_or_zero <= khugepaged_max_ptes_none) {
+			    (++none_or_zero <= khugepaged_max_ptes_none ||
+			     !cc->enforce_pte_scan_limits)) {
 				continue;
 			} else {
 				cr->result = SCAN_EXCEED_NONE_PTE;
@@ -1345,8 +1352,9 @@ static void scan_pmd(struct mm_struct *mm, struct vm_area_struct *vma,
 			goto out_unmap;
 		}
 
-		if (page_mapcount(page) > 1 &&
-				++shared > khugepaged_max_ptes_shared) {
+		if (cc->enforce_pte_scan_limits &&
+		    page_mapcount(page) > 1 &&
+		    ++shared > khugepaged_max_ptes_shared) {
 			cr->result = SCAN_EXCEED_SHARED_PTE;
 			count_vm_event(THP_SCAN_EXCEED_SHARED_PTE);
 			goto out_unmap;
@@ -2086,7 +2094,8 @@ static void khugepaged_scan_file(struct mm_struct *mm,
 			continue;
 
 		if (xa_is_value(page)) {
-			if (++swap > khugepaged_max_ptes_swap) {
+			if (cc->enforce_pte_scan_limits &&
+			    ++swap > khugepaged_max_ptes_swap) {
 				cr->result = SCAN_EXCEED_SWAP_PTE;
 				count_vm_event(THP_SCAN_EXCEED_SWAP_PTE);
 				break;
@@ -2137,7 +2146,8 @@ static void khugepaged_scan_file(struct mm_struct *mm,
 	rcu_read_unlock();
 
 	if (cr->result == SCAN_SUCCEED) {
-		if (present < HPAGE_PMD_NR - khugepaged_max_ptes_none) {
+		if (present < HPAGE_PMD_NR - khugepaged_max_ptes_none &&
+		    cc->enforce_pte_scan_limits) {
 			cr->result = SCAN_EXCEED_NONE_PTE;
 			count_vm_event(THP_SCAN_EXCEED_NONE_PTE);
 		} else {
@@ -2364,6 +2374,7 @@ static int khugepaged(void *none)
 {
 	struct mm_slot *mm_slot;
 	struct collapse_control cc = {
+		.enforce_pte_scan_limits = true,
 		.last_target_node = NUMA_NO_NODE,
 		.alloc_hpage = &khugepaged_alloc_page,
 	};
@@ -2505,6 +2516,7 @@ int madvise_collapse(struct vm_area_struct *vma, struct vm_area_struct **prev,
 		     unsigned long start, unsigned long end)
 {
 	struct collapse_control cc = {
+		.enforce_pte_scan_limits = false,
 		.last_target_node = NUMA_NO_NODE,
 		.hpage = NULL,
 		.alloc_hpage = &alloc_hpage,
-- 
2.36.0.rc0.470.gd361397f0d-goog