linux-mm.kvack.org archive mirror
From: Ryan Roberts <ryan.roberts@arm.com>
To: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	Yu Zhao <yuzhao@google.com>,
	"Yin, Fengwei" <fengwei.yin@intel.com>,
	linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org
Subject: Re: [RFC v2 PATCH 05/17] mm: Routines to determine max anon folio allocation order
Date: Fri, 14 Apr 2023 17:06:49 +0100
Message-ID: <e65acdf4-339f-874c-608f-2472e071e7ac@arm.com>
In-Reply-To: <20230414153747.n5kyhvb5a726lvrz@box.shutemov.name>

On 14/04/2023 16:37, Kirill A. Shutemov wrote:
> On Fri, Apr 14, 2023 at 03:38:35PM +0100, Ryan Roberts wrote:
>> On 14/04/2023 15:09, Kirill A. Shutemov wrote:
>>> On Fri, Apr 14, 2023 at 02:02:51PM +0100, Ryan Roberts wrote:
>>>> For variable-order anonymous folios, we want to tune the order that we
>>>> prefer to allocate based on the vma. Add the routines to manage that
>>>> heuristic.
>>>>
>>>> TODO: Currently we always use the global maximum. Add per-vma logic!
>>>>
>>>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
>>>> ---
>>>>  include/linux/mm.h | 5 +++++
>>>>  mm/memory.c        | 8 ++++++++
>>>>  2 files changed, 13 insertions(+)
>>>>
>>>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>>>> index cdb8c6031d0f..cc8d0b239116 100644
>>>> --- a/include/linux/mm.h
>>>> +++ b/include/linux/mm.h
>>>> @@ -3674,4 +3674,9 @@ madvise_set_anon_name(struct mm_struct *mm, unsigned long start,
>>>>  }
>>>>  #endif
>>>>
>>>> +/*
>>>> + * TODO: Should this be set per-architecture?
>>>> + */
>>>> +#define ANON_FOLIO_ORDER_MAX	4
>>>> +
>>>
>>> I think it has to be derived from a size in bytes, not directly specified
>>> as a page order. For 4K pages, order 4 is 64k and for 64k pages it is 1M.
>>>
>>
>> Yes I see where you are coming from. What's your feel for what a sensible upper
>> bound in bytes is?
>>
>> My difficulty is that I would like to be able to use this allocation mechanism
>> to enable using the "contiguous bit" on arm64; that's a set of contiguous PTEs
>> that are mapped to physically contiguous memory, and the HW can use that hint to
>> coalesce the TLB entries.
>>
>> For 4KB pages, the contig size is 64KB (order-4), so that works nicely. But for
>> 16KB and 64KB pages, it's 2MB (order-7 and order-5 respectively). Do you think
>> allocating 2MB pages here is going to lead to too much memory wastage?
> 
> I think it boils down to the specifics of the microarchitecture.
> 
> We can justify 2M PMD-mapped THP in many cases. But PMD-mapped THP not
> only reduces TLB pressure (the contiguous bit does that too, I believe), but
> also saves one more memory access on the page table walk.
> 
> It may or may not matter for the processor. It has to be evaluated.

I think you are saying that if the performance uplift is good, then some extra
memory wastage can be justified?

The point I'm thinking about is that for 4K pages, we need to allocate 64K blocks
to use the contig bit. Roughly, I guess that means going from an average of 2K
wastage per anon VMA to 32K. Perhaps you can get away with that for a decent perf
uplift.

But for 64K pages, we need to allocate 2M blocks to use the contig bit. So that
takes average wastage from 32K to 1M. That feels a bit harder to justify.
Perhaps here, we should make a decision based on MADV_HUGEPAGE?
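
The wastage figures above can be reproduced with a small sketch. It assumes a
simple model in which the end of an anon VMA leaves, on average, half of one
allocation block unused; the function names are hypothetical and are not part
of the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Size in bytes of one allocation block of the given order. */
static inline size_t sketch_block_size(unsigned int page_shift,
				       unsigned int order)
{
	/* Guard against shifting past the width of size_t. */
	assert(page_shift + order < sizeof(size_t) * 8);
	return (size_t)1 << (page_shift + order);
}

/*
 * Assumed model: on average half of the last block in a VMA is unused,
 * so the expected wastage is half the block size.
 */
static inline size_t sketch_avg_wastage(unsigned int page_shift,
					unsigned int order)
{
	return sketch_block_size(page_shift, order) / 2;
}
```

With 4K pages (page_shift 12), order 0 gives 2K and order 4 gives 32K. With
64K pages (page_shift 16), order 0 gives 32K and order 5 (a 2M block) gives 1M.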

So perhaps we actually want 2 values: one for if MADV_HUGEPAGE is not set on the
VMA, and one if it is? (with 64K pages I'm guessing there are many cases where
we won't PMD-map THPs - it's 512MB).
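
One way to picture the two-value idea is the sketch below. Everything in it is
an assumption for illustration: the names are hypothetical, the limit for the
MADV_HUGEPAGE case is taken to be the arm64 contiguous-bit span quoted earlier
in the thread, the limit for the other case is taken to be 64K, and the PMD
size assumes 8-byte table entries filling one page.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_SZ_64K	(64UL << 10)
#define SKETCH_SZ_2M	(2UL << 20)

/* PMD mapping size, assuming 8-byte entries and one page per table. */
static inline unsigned long sketch_pmd_size(unsigned int page_shift)
{
	unsigned long page_size = 1UL << page_shift;

	return page_size * (page_size / 8);
}

/*
 * Contiguous-bit span as quoted in the thread: 64K for 4K pages,
 * 2M for 16K and 64K pages.
 */
static inline unsigned long sketch_contig_size(unsigned int page_shift)
{
	assert(page_shift == 12 || page_shift == 14 || page_shift == 16);
	return page_shift == 12 ? SKETCH_SZ_64K : SKETCH_SZ_2M;
}

/*
 * Hypothetical policy: a larger limit when the VMA has MADV_HUGEPAGE,
 * a smaller one otherwise, and never less than a single page.
 */
static inline unsigned long sketch_anon_folio_max_bytes(bool madv_hugepage,
							unsigned int page_shift)
{
	unsigned long page_size = 1UL << page_shift;
	unsigned long limit = madv_hugepage ? sketch_contig_size(page_shift)
					    : SKETCH_SZ_64K;

	return limit < page_size ? page_size : limit;
}
```

Under these assumptions, 64K pages give a 512MB PMD size, a 2M limit with
MADV_HUGEPAGE and a single 64K page without it.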

> 
> Maybe moving it to per-arch is the right way. With default in generic code
> to be ilog2(SZ_64K >> PAGE_SHIFT) or something.

Yes, I agree that sounds like a good starting point for the !MADV_HUGEPAGE case.
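
A minimal userspace sketch of that generic default follows, taking the shift
amount to be PAGE_SHIFT (the log2 of the page size). The helper names are
hypothetical stand-ins: sketch_ilog2() mimics the kernel's ilog2() for
non-zero inputs, and the clamp for page sizes above 64K is an added
assumption, since the log2 of zero is undefined.

```c
#include <assert.h>

#define SKETCH_SZ_64K	0x10000UL

/* Floor of log2(v) for non-zero v; stand-in for the kernel's ilog2(). */
static inline unsigned int sketch_ilog2(unsigned long v)
{
	unsigned int r = 0;

	assert(v != 0);
	while (v >>= 1)
		r++;
	return r;
}

/*
 * Default maximum anon folio order derived from a 64K byte limit.
 * If a page is already larger than 64K, fall back to order 0.
 */
static inline unsigned int sketch_default_anon_folio_order(unsigned int page_shift)
{
	unsigned long pages;

	assert(page_shift < sizeof(unsigned long) * 8);
	pages = SKETCH_SZ_64K >> page_shift;
	if (pages == 0)
		pages = 1;
	return sketch_ilog2(pages);
}
```

This yields order 4 for 4K pages, order 2 for 16K pages and order 0 for 64K
pages, so the limit stays at 64K in bytes regardless of the base page size.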


