From: Barry Song <21cnbao@gmail.com>
Date: Fri, 11 Mar 2022 23:55:19 +1300
Subject: Re: [PATCH v9 01/14] mm: x86, arm64: add arch_has_hw_pte_young()
In-Reply-To: <20220309021230.721028-2-yuzhao@google.com>
References: <20220309021230.721028-1-yuzhao@google.com> <20220309021230.721028-2-yuzhao@google.com>
To: Yu Zhao
Cc: Andrew Morton, Linus Torvalds, Andi Kleen, Aneesh Kumar,
 Catalin Marinas, Dave Hansen, Hillf Danton, Jens Axboe, Jesse Barnes,
 Johannes Weiner, Jonathan Corbet, Matthew Wilcox, Mel Gorman,
 Michael Larabel, Michal Hocko, Mike Rapoport, Rik van Riel,
 Vlastimil Babka, Will Deacon, Ying Huang, LAK,
 Linux Doc Mailing List, LKML, Linux-MM, page-reclaim@google.com, x86,
 Brian Geffon, Jan Alexander Steffens, Oleksandr Natalenko,
 Steven Barrett, Suleiman Souhlal, Daniel Byrne, Donald Carr,
 Holger Hoffstätte,
 Konstantin Kharlamov, Shuang Zhai, Sofia Trinh, Vaibhav Jain

On Wed, Mar 9, 2022 at 3:47 PM Yu Zhao wrote:
>
> Some architectures automatically set the accessed bit in PTEs, e.g.,
> x86 and arm64 v8.2. On architectures that do not have this capability,
> clearing the accessed bit in a PTE usually triggers a page fault
> following the TLB miss of this PTE (to emulate the accessed bit).
>
> Being aware of this capability can help make better decisions, e.g.,
> whether to spread the work out over a period of time to reduce bursty
> page faults when trying to clear the accessed bit in many PTEs.
>
> Note that theoretically this capability can be unreliable, e.g.,
> hotplugged CPUs might be different from builtin ones. Therefore it
> should not be used in architecture-independent code that involves
> correctness, e.g., to determine whether TLB flushes are required (in
> combination with the accessed bit).
>
> Signed-off-by: Yu Zhao
> Acked-by: Brian Geffon
> Acked-by: Jan Alexander Steffens (heftig)
> Acked-by: Oleksandr Natalenko
> Acked-by: Steven Barrett
> Acked-by: Suleiman Souhlal
> Acked-by: Will Deacon
> Tested-by: Daniel Byrne
> Tested-by: Donald Carr
> Tested-by: Holger Hoffstätte
> Tested-by: Konstantin Kharlamov
> Tested-by: Shuang Zhai
> Tested-by: Sofia Trinh
> Tested-by: Vaibhav Jain
> ---

Reviewed-by: Barry Song

I guess arch_has_hw_pte_young() isn't called that often in either
mm/memory.c or mm/vmscan.c; otherwise, moving to a static key might
help. Is that the case?

>  arch/arm64/include/asm/pgtable.h | 14 ++------------
>  arch/x86/include/asm/pgtable.h   |  6 +++---
>  include/linux/pgtable.h          | 13 +++++++++++++
>  mm/memory.c                      | 14 +-------------
>  4 files changed, 19 insertions(+), 28 deletions(-)
>
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index c4ba047a82d2..990358eca359 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -999,23 +999,13 @@ static inline void update_mmu_cache(struct vm_area_struct *vma,
>   * page after fork() + CoW for pfn mappings. We don't always have a
>   * hardware-managed access flag on arm64.
>   */
> -static inline bool arch_faults_on_old_pte(void)
> -{
> -	WARN_ON(preemptible());
> -
> -	return !cpu_has_hw_af();
> -}
> -#define arch_faults_on_old_pte arch_faults_on_old_pte
> +#define arch_has_hw_pte_young cpu_has_hw_af
>
>  /*
>   * Experimentally, it's cheap to set the access flag in hardware and we
>   * benefit from prefaulting mappings as 'old' to start with.
>   */
> -static inline bool arch_wants_old_prefaulted_pte(void)
> -{
> -	return !arch_faults_on_old_pte();
> -}
> -#define arch_wants_old_prefaulted_pte arch_wants_old_prefaulted_pte
> +#define arch_wants_old_prefaulted_pte cpu_has_hw_af
>
>  static inline pgprot_t arch_filter_pgprot(pgprot_t prot)
>  {
> diff --git a/arch/x86/include/asm/pgtable.h b/arch/x86/include/asm/pgtable.h
> index 8a9432fb3802..60b6ce45c2e3 100644
> --- a/arch/x86/include/asm/pgtable.h
> +++ b/arch/x86/include/asm/pgtable.h
> @@ -1423,10 +1423,10 @@ static inline bool arch_has_pfn_modify_check(void)
>  	return boot_cpu_has_bug(X86_BUG_L1TF);
>  }
>
> -#define arch_faults_on_old_pte arch_faults_on_old_pte
> -static inline bool arch_faults_on_old_pte(void)
> +#define arch_has_hw_pte_young arch_has_hw_pte_young
> +static inline bool arch_has_hw_pte_young(void)
>  {
> -	return false;
> +	return true;
>  }
>
>  #endif /* __ASSEMBLY__ */
> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
> index f4f4077b97aa..79f64dcff07d 100644
> --- a/include/linux/pgtable.h
> +++ b/include/linux/pgtable.h
> @@ -259,6 +259,19 @@ static inline int pmdp_clear_flush_young(struct vm_area_struct *vma,
>  #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
>  #endif
>
> +#ifndef arch_has_hw_pte_young
> +/*
> + * Return whether the accessed bit is supported on the local CPU.
> + *
> + * This stub assumes accessing through an old PTE triggers a page fault.
> + * Architectures that automatically set the access bit should overwrite it.
> + */
> +static inline bool arch_has_hw_pte_young(void)
> +{
> +	return false;
> +}
> +#endif
> +
>  #ifndef __HAVE_ARCH_PTEP_CLEAR
>  static inline void ptep_clear(struct mm_struct *mm, unsigned long addr,
>  			      pte_t *ptep)
> diff --git a/mm/memory.c b/mm/memory.c
> index c125c4969913..a7379196a47e 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -122,18 +122,6 @@ int randomize_va_space __read_mostly =
>  					2;
>  #endif
>
> -#ifndef arch_faults_on_old_pte
> -static inline bool arch_faults_on_old_pte(void)
> -{
> -	/*
> -	 * Those arches which don't have hw access flag feature need to
> -	 * implement their own helper. By default, "true" means pagefault
> -	 * will be hit on old pte.
> -	 */
> -	return true;
> -}
> -#endif
> -
>  #ifndef arch_wants_old_prefaulted_pte
>  static inline bool arch_wants_old_prefaulted_pte(void)
>  {
> @@ -2778,7 +2766,7 @@ static inline bool cow_user_page(struct page *dst, struct page *src,
>  	 * On architectures with software "accessed" bits, we would
>  	 * take a double page fault, so mark it accessed here.
>  	 */
> -	if (arch_faults_on_old_pte() && !pte_young(vmf->orig_pte)) {
> +	if (!arch_has_hw_pte_young() && !pte_young(vmf->orig_pte)) {
>  		pte_t entry;
>
>  		vmf->pte = pte_offset_map_lock(mm, vmf->pmd, addr, &vmf->ptl);
> --
> 2.35.1.616.g0bdcbb4464-goog
>

Thanks
Barry