From: Barry Song <21cnbao@gmail.com>
Date: Thu, 17 Mar 2022 11:15:11 +1300
Subject: Re: [PATCH v9 02/14] mm: x86: add CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG
References: <20220309021230.721028-1-yuzhao@google.com> <20220309021230.721028-3-yuzhao@google.com>
In-Reply-To: <20220309021230.721028-3-yuzhao@google.com>
To: Yu Zhao
Cc: Andrew Morton, Linus Torvalds, Andi Kleen, Aneesh Kumar,
 Catalin Marinas, Dave Hansen, Hillf Danton, Jens Axboe, Jesse Barnes,
 Johannes Weiner, Jonathan Corbet, Matthew Wilcox, Mel Gorman,
 Michael Larabel, Michal Hocko, Mike Rapoport, Rik van Riel,
 Vlastimil Babka, Will Deacon, Ying Huang, LAK, Linux Doc Mailing List,
 LKML, Linux-MM, Kernel Page Reclaim v2, x86, Brian Geffon,
 Jan Alexander Steffens, Oleksandr Natalenko, Steven Barrett,
 Suleiman Souhlal, Daniel Byrne, Donald Carr, Holger Hoffstätte,
 Konstantin Kharlamov, Shuang Zhai, Sofia Trinh, Vaibhav Jain

On Wed, Mar 9, 2022 at 3:47 PM Yu Zhao wrote:
>
> Some architectures support the accessed bit in non-leaf PMD entries,
> e.g., x86 sets the accessed bit in a non-leaf PMD entry when using it
> as part of linear address translation [1]. Page table walkers that
> clear the accessed bit may use this capability to reduce their search
> space.
>
> Note that:
> 1. Although an inline function is preferable, this capability is added
>    as a configuration option for consistency with the existing macros.
> 2. Due to the little interest in other varieties, this capability was
>    only tested on Intel and AMD CPUs.
>
> [1]: Intel 64 and IA-32 Architectures Software Developer's Manual
>      Volume 3 (June 2021), section 4.8
>
> Signed-off-by: Yu Zhao
> Acked-by: Brian Geffon
> Acked-by: Jan Alexander Steffens (heftig)
> Acked-by: Oleksandr Natalenko
> Acked-by: Steven Barrett
> Acked-by: Suleiman Souhlal
> Tested-by: Daniel Byrne
> Tested-by: Donald Carr
> Tested-by: Holger Hoffstätte
> Tested-by: Konstantin Kharlamov
> Tested-by: Shuang Zhai
> Tested-by: Sofia Trinh
> Tested-by: Vaibhav Jain
> ---

Reviewed-by: Barry Song

This patch is hard to read by itself, but after reading the change in
walk_pmd_range(), it becomes quite clear:

walk_pmd_range()
{
...
#ifdef CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG
	if (get_cap(LRU_GEN_NONLEAF_YOUNG)) {
		if (!pmd_young(val))
			continue;

		walk_pmd_range_locked(pud, addr, vma, walk, &pos);
	}
#endif
...
}

This gives us the chance to skip scanning all the PTEs within the PMD
when its accessed bit is clear. So I am not quite sure whether this
should be a separate patch, or whether it should be folded into the
change to walk_pmd_range() so that readers can understand its purpose.

>  arch/Kconfig                   | 9 +++++++++
>  arch/x86/Kconfig               | 1 +
>  arch/x86/include/asm/pgtable.h | 3 ++-
>  arch/x86/mm/pgtable.c          | 5 ++++-
>  include/linux/pgtable.h        | 4 ++--
>  5 files changed, 18 insertions(+), 4 deletions(-)
>
> diff --git a/arch/Kconfig b/arch/Kconfig
> index 678a80713b21..f9c59ecadbbb 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -1322,6 +1322,15 @@ config DYNAMIC_SIGFRAME
>  config HAVE_ARCH_NODE_DEV_GROUP
>  	bool
>
> +config ARCH_HAS_NONLEAF_PMD_YOUNG
> +	bool
> +	depends on PGTABLE_LEVELS > 2
> +	help
> +	  Architectures that select this option are capable of setting the
> +	  accessed bit in non-leaf PMD entries when using them as part of linear
> +	  address translations. Page table walkers that clear the accessed bit
> +	  may use this capability to reduce their search space.
> +
>  source "kernel/gcov/Kconfig"
>
>  source "scripts/gcc-plugins/Kconfig"
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 9f5bd41bf660..e787b7fc75be 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -85,6 +85,7 @@ config X86
>  	select ARCH_HAS_PMEM_API		if X86_64
>  	select ARCH_HAS_PTE_DEVMAP		if X86_64
>  	select ARCH_HAS_PTE_SPECIAL
> +	select ARCH_HAS_NONLEAF_PMD_YOUNG
>  	select ARCH_HAS_UACCESS_FLUSHCACHE	if X86_64
>  	select ARCH_HAS_COPY_MC			if X86_64
>  	select ARCH_HAS_SET_MEMORY
> diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
> index 60b6ce45c2e3..f973788f6b21 100644
> --- a/arch/x86/include/asm/pgtable.h
> +++ b/arch/x86/include/asm/pgtable.h
> @@ -819,7 +819,8 @@ static inline unsigned long pmd_page_vaddr(pmd_t pmd)
>
>  static inline int pmd_bad(pmd_t pmd)
>  {
> -	return (pmd_flags(pmd) & ~_PAGE_USER) != _KERNPG_TABLE;
> +	return (pmd_flags(pmd) & ~(_PAGE_USER | _PAGE_ACCESSED)) !=
> +	       (_KERNPG_TABLE & ~_PAGE_ACCESSED);
>  }
>
>  static inline unsigned long pages_to_mb(unsigned long npg)
> diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
> index 3481b35cb4ec..a224193d84bf 100644
> --- a/arch/x86/mm/pgtable.c
> +++ b/arch/x86/mm/pgtable.c
> @@ -550,7 +550,7 @@ int ptep_test_and_clear_young(struct vm_area_struct *vma,
>  	return ret;
>  }
>
> -#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> +#if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG)
>  int pmdp_test_and_clear_young(struct vm_area_struct *vma,
>  			      unsigned long addr, pmd_t *pmdp)
>  {
> @@ -562,6 +562,9 @@ int pmdp_test_and_clear_young(struct vm_area_struct *vma,
>
>  	return ret;
>  }
> +#endif
> +
> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
>  int pudp_test_and_clear_young(struct vm_area_struct *vma,
>  			      unsigned long addr, pud_t *pudp)
>  {
> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
> index 79f64dcff07d..743e7fc4afda 100644
> --- a/include/linux/pgtable.h
> +++ b/include/linux/pgtable.h
> @@ -212,7 +212,7 @@ static inline int ptep_test_and_clear_young(struct vm_area_struct *vma,
>  #endif
>
>  #ifndef __HAVE_ARCH_PMDP_TEST_AND_CLEAR_YOUNG
> -#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> +#if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG)
>  static inline int pmdp_test_and_clear_young(struct vm_area_struct *vma,
>  					    unsigned long address,
>  					    pmd_t *pmdp)
> @@ -233,7 +233,7 @@ static inline int pmdp_test_and_clear_young(struct vm_area_struct *vma,
>  	BUILD_BUG();
>  	return 0;
>  }
> -#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
> +#endif /* CONFIG_TRANSPARENT_HUGEPAGE || CONFIG_ARCH_HAS_NONLEAF_PMD_YOUNG */
>  #endif
>
>  #ifndef __HAVE_ARCH_PTEP_CLEAR_YOUNG_FLUSH
> --
> 2.35.1.616.g0bdcbb4464-goog
>

Thanks
Barry
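
To illustrate the two ideas discussed in this message (skipping the PTE scan when a non-leaf PMD is not young, and why pmd_bad() has to ignore the accessed bit once a walker may clear it), here is a minimal userspace sketch. It is a toy model, not kernel code: every toy_/TOY_ name, the table sizes, and the flat array layout are assumptions made up for illustration. Only the bit positions follow the x86 page table entry layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag names; the bit positions mirror x86 page table entries. */
#define TOY_PRESENT	(1u << 0)
#define TOY_RW		(1u << 1)
#define TOY_USER	(1u << 2)
#define TOY_ACCESSED	(1u << 5)
#define TOY_DIRTY	(1u << 6)
#define TOY_KERNPG_TABLE (TOY_PRESENT | TOY_RW | TOY_ACCESSED | TOY_DIRTY)

#define TOY_PTRS_PER_PTE 8	/* hypothetical, kept small for the demo */

/* Toy non-leaf PMD entry: its flags plus the PTE table it points to. */
struct toy_pmd {
	uint32_t flags;
	uint32_t pte[TOY_PTRS_PER_PTE];
};

/* Check before the patch: a table entry with the accessed bit cleared looks bad. */
static bool toy_pmd_bad_old(uint32_t flags)
{
	return (flags & ~TOY_USER) != TOY_KERNPG_TABLE;
}

/* Check after the patch: the accessed bit is ignored on both sides. */
static bool toy_pmd_bad(uint32_t flags)
{
	return (flags & ~(TOY_USER | TOY_ACCESSED)) !=
	       (TOY_KERNPG_TABLE & ~TOY_ACCESSED);
}

/*
 * Walk n toy PMD entries, clearing accessed bits and counting young PTEs.
 * When nonleaf_young is true, a PMD whose accessed bit is clear is skipped
 * without touching its PTEs. Returns the number of PTEs examined.
 */
static size_t toy_walk(struct toy_pmd *pmds, size_t n, bool nonleaf_young,
		       size_t *young)
{
	size_t scanned = 0;

	assert(pmds && young);
	*young = 0;

	for (size_t i = 0; i < n; i++) {
		if (nonleaf_young) {
			if (!(pmds[i].flags & TOY_ACCESSED))
				continue;	/* nothing below was accessed */
			pmds[i].flags &= ~TOY_ACCESSED;	/* test and clear */
		}

		for (size_t j = 0; j < TOY_PTRS_PER_PTE; j++) {
			scanned++;
			if (pmds[i].pte[j] & TOY_ACCESSED) {
				pmds[i].pte[j] &= ~TOY_ACCESSED;
				(*young)++;
			}
		}
	}

	return scanned;
}
```

With four toy PMD entries of which only one is young, the walk with nonleaf_young set examines 8 PTEs instead of 32. After the walk the young entry has its accessed bit cleared, which toy_pmd_bad_old() would flag as bad while toy_pmd_bad() accepts it. That is presumably why the patch touches pmd_bad() at all.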