From: Yu Zhao <yuzhao@google.com>
To: Ryan Roberts <ryan.roberts@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Matthew Wilcox <willy@infradead.org>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Yin Fengwei <fengwei.yin@intel.com>,
	David Hildenbrand <david@redhat.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	Anshuman Khandual <anshuman.khandual@arm.com>,
	Yang Shi <shy828301@gmail.com>,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH v2 3/5] mm: Default implementation of arch_wants_pte_order()
Date: Mon, 3 Jul 2023 13:50:48 -0600
Message-ID: <CAOUHufa_xFJvFFvmw1Tkdc9cXaZ1GPA1dVSauH+J9zGX-sO1UA@mail.gmail.com>
In-Reply-To: <20230703135330.1865927-4-ryan.roberts@arm.com>

On Mon, Jul 3, 2023 at 7:53 AM Ryan Roberts <ryan.roberts@arm.com> wrote:
>
> arch_wants_pte_order() can be overridden by the arch to return the
> preferred folio order for pte-mapped memory. This is useful as some
> architectures (e.g. arm64) can coalesce TLB entries when the physical
> memory is suitably contiguous.
>
> The first user for this hint will be FLEXIBLE_THP, which aims to
> allocate large folios for anonymous memory to reduce page faults and
> other per-page operation costs.
>
> Here we add the default implementation of the function, used when the
> architecture does not define it, which returns the order corresponding
> to 64K.

I don't really mind a non-zero default value. But people would ask why
non-zero and why 64KB. Probably you could argue this is the large size
all known archs support if they have TLB coalescing. For x86, AMD CPUs
would want to override this. I'll leave it to Fengwei to decide
whether Intel wants a different default value.

Also I don't like the vma parameter because it makes
arch_wants_pte_order() a mix of hw preference and vma policy. From my
POV, the function should be only about the former; the latter should
be decided by arch-independent MM code. However, I can live with it if
ARM MM people think this is really what you want. ATM, I'm skeptical
they do.

> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>

After another CPU vendor, e.g., Fengwei, and an ARM MM person, e.g.,
Will give the green light:
Reviewed-by: Yu Zhao <yuzhao@google.com>

> ---
>  include/linux/pgtable.h | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
>
> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
> index a661a17173fa..f7e38598f20b 100644
> --- a/include/linux/pgtable.h
> +++ b/include/linux/pgtable.h
> @@ -13,6 +13,7 @@
>  #include <linux/errno.h>
>  #include <asm-generic/pgtable_uffd.h>
>  #include <linux/page_table_check.h>
> +#include <linux/sizes.h>
>
>  #if 5 - defined(__PAGETABLE_P4D_FOLDED) - defined(__PAGETABLE_PUD_FOLDED) - \
>  	defined(__PAGETABLE_PMD_FOLDED) != CONFIG_PGTABLE_LEVELS
> @@ -336,6 +337,18 @@ static inline bool arch_has_hw_pte_young(void)
>  }
>  #endif
>
> +#ifndef arch_wants_pte_order
> +/*
> + * Returns preferred folio order for pte-mapped memory. Must be in range [0,
> + * PMD_SHIFT-PAGE_SHIFT) and must not be order-1 since THP requires large folios

The warning is helpful.

> + * to be at least order-2.
> + */
> +static inline int arch_wants_pte_order(struct vm_area_struct *vma)
> +{
> +	return ilog2(SZ_64K >> PAGE_SHIFT);
> +}
> +#endif
> +
>  #ifndef __HAVE_ARCH_PTEP_GET_AND_CLEAR
>  static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
>  				       unsigned long address,
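
For readers following the arithmetic, the default in the quoted hunk, ilog2(SZ_64K >> PAGE_SHIFT), can be tried out in userspace. The following is only a sketch and not part of the patch: ilog2_ul(), default_pte_order() and pte_order_is_valid() are hypothetical stand-ins, PAGE_SHIFT is passed as a parameter instead of being a build-time constant, and the (PAGE_SHIFT, PMD_SHIFT) pairs mentioned afterwards are assumed typical arm64 values rather than anything stated in this thread.

```c
#include <assert.h>

#define SZ_64K 0x00010000UL

/*
 * Userspace stand-in for the kernel's ilog2(): floor(log2(n)).
 * Hypothetical helper; the caller must pass n > 0.
 */
static int ilog2_ul(unsigned long n)
{
	int log = -1;

	assert(n != 0);
	while (n) {
		n >>= 1;
		log++;
	}
	return log;
}

/*
 * Mirrors the patch's default, ilog2(SZ_64K >> PAGE_SHIFT), but takes
 * the page shift as an argument so several base page sizes can be tried.
 * Only meaningful for page sizes up to 64K (page_shift <= 16).
 */
static int default_pte_order(int page_shift)
{
	return ilog2_ul(SZ_64K >> page_shift);
}

/*
 * Checks the constraints from the patch comment: the order must lie in
 * [0, PMD_SHIFT - PAGE_SHIFT) and must not be 1.
 */
static int pte_order_is_valid(int order, int page_shift, int pmd_shift)
{
	return order >= 0 && order < pmd_shift - page_shift && order != 1;
}
```

Under those assumptions, default_pte_order(12) is 4 (sixteen 4K pages), default_pte_order(14) is 2 (four 16K pages) and default_pte_order(16) is 0 (a single 64K page). With assumed PMD_SHIFT values of 21, 25 and 29 respectively, all three results satisfy pte_order_is_valid(), and none of them is the disallowed order-1.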