From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Fri, 3 Jun 2022 17:39:57 -0700
In-Reply-To: <20220604004004.954674-1-zokeefe@google.com>
Message-Id: <20220604004004.954674-9-zokeefe@google.com>
Mime-Version: 1.0
References: <20220604004004.954674-1-zokeefe@google.com>
X-Mailer: git-send-email 2.36.1.255.ge46751e96f-goog
Subject: [PATCH v6 08/15] mm/khugepaged: add flag to ignore THP sysfs enabled
From: "Zach O'Keefe" <zokeefe@google.com>
To: Alex Shi, David Hildenbrand, David Rientjes, Matthew Wilcox,
    Michal Hocko, Pasha Tatashin, Peter Xu,
    Rongwei Wang, SeongJae Park, Song Liu, Vlastimil Babka, Yang Shi,
    Zi Yan, linux-mm@kvack.org
Cc: Andrea Arcangeli, Andrew Morton, Arnd Bergmann, Axel Rasmussen,
    Chris Kennelly, Chris Zankel, Helge Deller, Hugh Dickins,
    Ivan Kokshaysky, "James E.J. Bottomley", Jens Axboe,
    "Kirill A. Shutemov", Matt Turner, Max Filippov, Miaohe Lin,
    Minchan Kim, Patrick Xia, Pavel Begunkov, Thomas Bogendoerfer,
    "Zach O'Keefe"
Content-Type: text/plain; charset="UTF-8"

Add an enforce_thp_enabled flag to struct collapse_control that allows a
collapse context to ignore the constraints imposed by
/sys/kernel/mm/transparent_hugepage/enabled. This flag is set in the
khugepaged collapse context to preserve existing khugepaged behavior.

This flag will be unset when the madvise collapse context is introduced,
since the desired THP semantics of MADV_COLLAPSE aren't coupled to sysfs
THP settings. Most notably, for the purpose of eventual
madvise_collapse(2) support, this allows userspace to trigger THP
collapse on behalf of another process, without needing a way to meddle
with the VMA flags of that process or to change sysfs THP settings (a
sketch of the intended semantics follows below).

For now, limit this flag to /sys/kernel/mm/transparent_hugepage/enabled,
but it can be expanded to cover
/sys/kernel/mm/transparent_hugepage/shmem_enabled later.
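To make the intended semantics concrete, here is a minimal userspace
sketch. It is not part of this patch: VM_HUGEPAGE's value, the trimmed
struct collapse_control, and madvise_mode_check() (a toy stand-in for
hugepage_vma_check() under "madvise" mode) are all simplifications. The
point is only that OR-ing VM_HUGEPAGE into the flags handed to the check
lets it pass for a VMA that was never MADV_HUGEPAGE'd:

#include <stdbool.h>
#include <stdio.h>

#define VM_HUGEPAGE	0x01UL	/* toy value, not the kernel's */
#define VM_NOHUGEPAGE	0x02UL

struct collapse_control {
	bool enforce_thp_enabled;
};

/*
 * Toy stand-in for hugepage_vma_check() in "madvise" mode: the VMA must
 * have VM_HUGEPAGE set and VM_NOHUGEPAGE clear for the check to pass.
 */
static bool madvise_mode_check(unsigned long vm_flags)
{
	return (vm_flags & VM_HUGEPAGE) && !(vm_flags & VM_NOHUGEPAGE);
}

/* Same shape as the hunk in hugepage_vma_revalidate() below. */
static unsigned long effective_flags(const struct collapse_control *cc,
				     unsigned long vm_flags)
{
	return cc->enforce_thp_enabled ? vm_flags
				       : vm_flags | VM_HUGEPAGE;
}

int main(void)
{
	struct collapse_control khugepaged_cc = { .enforce_thp_enabled = true };
	struct collapse_control madvise_cc = { .enforce_thp_enabled = false };
	unsigned long vm_flags = 0;	/* VMA without MADV_HUGEPAGE */

	/* khugepaged honors "madvise" mode: check fails, prints 0 */
	printf("%d\n", madvise_mode_check(effective_flags(&khugepaged_cc, vm_flags)));
	/* a context that unsets the flag bypasses it: prints 1 */
	printf("%d\n", madvise_mode_check(effective_flags(&madvise_cc, vm_flags)));
	return 0;
}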
Link: https://lore.kernel.org/linux-mm/CAAa6QmQxay1_=Pmt8oCX2-Va18t44FV-Vs-WsQt_6+qBks4nZA@mail.gmail.com/
Signed-off-by: Zach O'Keefe <zokeefe@google.com>
---
 mm/khugepaged.c | 34 +++++++++++++++++++++++++++-------
 1 file changed, 27 insertions(+), 7 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index c3589b3e238d..4ad04f552347 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -94,6 +94,11 @@ struct collapse_control {
 	 */
 	bool enforce_page_heuristics;
 
+	/* Enforce constraints of
+	 * /sys/kernel/mm/transparent_hugepage/enabled
+	 */
+	bool enforce_thp_enabled;
+
 	/* Num pages scanned per node */
 	int node_load[MAX_NUMNODES];
 
@@ -893,10 +898,12 @@ static bool khugepaged_alloc_page(struct page **hpage, gfp_t gfp, int node)
  */
 static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address,
-				   struct vm_area_struct **vmap)
+				   struct vm_area_struct **vmap,
+				   struct collapse_control *cc)
 {
 	struct vm_area_struct *vma;
 	unsigned long hstart, hend;
+	unsigned long vma_flags;
 
 	if (unlikely(khugepaged_test_exit(mm)))
 		return SCAN_ANY_PROCESS;
@@ -909,7 +916,18 @@ static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address,
 	hend = vma->vm_end & HPAGE_PMD_MASK;
 	if (address < hstart || address + HPAGE_PMD_SIZE > hend)
 		return SCAN_ADDRESS_RANGE;
-	if (!hugepage_vma_check(vma, vma->vm_flags))
+
+	/*
+	 * If !cc->enforce_thp_enabled, set VM_HUGEPAGE so that
+	 * hugepage_vma_check() can pass even if
+	 * TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG is set (i.e. "madvise" mode).
+	 * Note that hugepage_vma_check() doesn't enforce that
+	 * TRANSPARENT_HUGEPAGE_FLAG or TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG
+	 * must be set (i.e. "never" mode).
+	 */
+	vma_flags = cc->enforce_thp_enabled ? vma->vm_flags
+					    : vma->vm_flags | VM_HUGEPAGE;
+	if (!hugepage_vma_check(vma, vma_flags))
 		return SCAN_VMA_CHECK;
 	/* Anon VMA expected */
 	if (!vma->anon_vma || !vma_is_anonymous(vma))
@@ -953,7 +971,8 @@ static int find_pmd_or_thp_or_none(struct mm_struct *mm,
 static bool __collapse_huge_page_swapin(struct mm_struct *mm,
 					struct vm_area_struct *vma,
 					unsigned long haddr, pmd_t *pmd,
-					int referenced)
+					int referenced,
+					struct collapse_control *cc)
 {
 	int swapped_in = 0;
 	vm_fault_t ret = 0;
@@ -980,7 +999,7 @@ static bool __collapse_huge_page_swapin(struct mm_struct *mm,
 		/* do_swap_page returns VM_FAULT_RETRY with released mmap_lock */
 		if (ret & VM_FAULT_RETRY) {
 			mmap_read_lock(mm);
-			if (hugepage_vma_revalidate(mm, haddr, &vma)) {
+			if (hugepage_vma_revalidate(mm, haddr, &vma, cc)) {
 				/* vma is no longer available, don't continue to swapin */
 				trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
 				return false;
@@ -1047,7 +1066,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
 		goto out_nolock;
 
 	mmap_read_lock(mm);
-	result = hugepage_vma_revalidate(mm, address, &vma);
+	result = hugepage_vma_revalidate(mm, address, &vma, cc);
 	if (result) {
 		mmap_read_unlock(mm);
 		goto out_nolock;
@@ -1066,7 +1085,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
 	 * Continuing to collapse causes inconsistency.
 	 */
 	if (unmapped && !__collapse_huge_page_swapin(mm, vma, address,
-						     pmd, referenced)) {
+						     pmd, referenced, cc)) {
 		mmap_read_unlock(mm);
 		goto out_nolock;
 	}
@@ -1078,7 +1097,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
 	 * handled by the anon_vma lock + PG_lock.
 	 */
 	mmap_write_lock(mm);
-	result = hugepage_vma_revalidate(mm, address, &vma);
+	result = hugepage_vma_revalidate(mm, address, &vma, cc);
 	if (result)
 		goto out_up_write;
 	/* check if the pmd is still valid */
@@ -2277,6 +2296,7 @@ static int khugepaged(void *none)
 	struct mm_slot *mm_slot;
 	struct collapse_control cc = {
 		.enforce_page_heuristics = true,
+		.enforce_thp_enabled = true,
 		.last_target_node = NUMA_NO_NODE,
 		/* .gfp set later */
 	};
-- 
2.36.1.255.ge46751e96f-goog
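
As a usage sketch, hypothetical here since the madvise collapse context
arrives later in this series and is not part of this patch, a caller
wiring up MADV_COLLAPSE would be expected to unset both flags in its
collapse_control, in contrast to the khugepaged() initializer above:

	/*
	 * Hypothetical madvise collapse context: MADV_COLLAPSE acts on an
	 * explicit userspace request, so neither the khugepaged page
	 * heuristics nor the sysfs "enabled" setting should gate it.
	 */
	struct collapse_control cc = {
		.enforce_page_heuristics = false,
		.enforce_thp_enabled = false,
		.last_target_node = NUMA_NO_NODE,
		/* .gfp set later */
	};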