linux-arch.vger.kernel.org archive mirror
From: Nicholas Piggin <npiggin@gmail.com>
To: Andrew Morton <akpm@linux-foundation.org>,
	Eric Dumazet <eric.dumazet@gmail.com>,
	linux-mm@kvack.org
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>,
	Christoph Hellwig <hch@infradead.org>,
	Jonathan Cameron <Jonathan.Cameron@Huawei.com>,
	linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org, Zefan Li <lizefan@huawei.com>
Subject: Re: [PATCH v6 11/12] mm/vmalloc: Hugepage vmalloc mappings
Date: Sat, 22 Aug 2020 02:05:35 +1000
Message-ID: <1598025275.jd6s9py77x.astroid@bobo.none>
In-Reply-To: <1e001c2c-6c47-21a9-e920-caf78933b713@gmail.com>

Excerpts from Eric Dumazet's message of August 22, 2020 1:38 am:
> 
> On 8/21/20 8:12 AM, Nicholas Piggin wrote:
>> Support huge page vmalloc mappings. Config option HAVE_ARCH_HUGE_VMALLOC
>> enables support on architectures that define HAVE_ARCH_HUGE_VMAP and
>> support PMD-sized vmap mappings.
>> 
>> vmalloc will attempt to allocate PMD-sized pages if allocating PMD size or
>> larger, and fall back to small pages if that was unsuccessful.
>> 
>> Allocations that do not use PAGE_KERNEL prot are not permitted to use huge
>> pages, because not all callers expect this (e.g., module allocations vs
>> strict module rwx).
>> 
>> This reduces TLB misses by nearly 30x on a `git diff` workload on a 2-node
>> POWER9 (59,800 -> 2,100) and reduces CPU cycles by 0.54%.
>> 
>> This can result in more internal fragmentation and memory overhead for a
>> given allocation, so an option nohugevmalloc is added to disable it at boot.
>> 
>>
> 
> Thanks for working on this stuff, I tried something similar in the past,
> but could not really do more than a hack.
> ( https://lkml.org/lkml/2016/12/21/285 )

Oh nice. It might still be possible to pick up some ideas from your
patch: higher-order pages smaller than PMD size, or the memory
policy stuff, perhaps.

> Note that __init alloc_large_system_hash() is used at boot time,
> when NUMA policy is spreading allocations over all NUMA nodes.
> 
> This means that on a dual node system, a hash table should be 50/50 spread.
> 
> With your patch, if a hashtable is exactly the size of one huge page,
> the location of this hashtable will not be balanced; this might have some
> unwanted impact.

In that case it shouldn't, because it divides by the number of nodes,
but in general the balancing will of course have a somewhat coarser
granularity than with small pages.
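
To make the "divides by the number of nodes" point concrete, here is a
minimal user-space sketch of that sizing rule as I read it. It is not
the code from the patch: the helper name, the 4 KiB base page size and
the 2 MiB PMD size are all assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for illustration: 4 KiB base pages, 2 MiB PMD mappings. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PMD_SHIFT  21
#define SKETCH_PMD_SIZE   ((size_t)1 << SKETCH_PMD_SHIFT)

/*
 * Hypothetical helper: choose the mapping shift for an allocation of
 * `size` bytes that will be spread over `nr_nodes` NUMA nodes.
 * Huge mappings are chosen only when each node's share is at least
 * one PMD-sized page.
 */
static unsigned int sketch_pick_shift(size_t size, unsigned int nr_nodes)
{
	assert(nr_nodes > 0);

	size_t size_per_node = size / nr_nodes;

	if (size_per_node >= SKETCH_PMD_SIZE)
		return SKETCH_PMD_SHIFT;
	return SKETCH_PAGE_SHIFT;
}
```

Under these assumptions, a table of exactly one huge page on a dual
node system has a per-node share of half a huge page, so it stays on
small pages and can still be spread 50/50.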

There's probably a better way to size these important hashes on NUMA. I
suspect that most of the time, on a NUMA machine, you would actually
prefer to use large pages now, even if it means taking up to 2MB more
memory per node per hash. That's not a great amount, and the allocation
size is rather arbitrary anyway.
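
As a rough worked example of the "up to 2MB more memory per node"
figure, the sketch below assumes each node's share of a hash is simply
rounded up to a whole 2 MiB huge page. The helper names and that
rounding model are assumptions for illustration, not what the kernel
necessarily does.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed huge page size for illustration: 2 MiB. */
#define SKETCH_HUGE_SIZE ((size_t)2 << 20)

/* Round `bytes` up to the next multiple of the huge page size. */
static size_t sketch_round_up_huge(size_t bytes)
{
	return (bytes + SKETCH_HUGE_SIZE - 1) / SKETCH_HUGE_SIZE * SKETCH_HUGE_SIZE;
}

/*
 * Hypothetical helper: extra bytes taken on one node when its share of
 * a hash table is rounded up to whole huge pages. The result is always
 * smaller than SKETCH_HUGE_SIZE.
 */
static size_t sketch_huge_overhead(size_t bytes_per_node)
{
	size_t rounded = sketch_round_up_huge(bytes_per_node);

	assert(rounded >= bytes_per_node); /* guard against overflow */
	return rounded - bytes_per_node;
}
```

With this model a 3 MiB per-node share costs 1 MiB extra, and the worst
case is a share one byte over a huge page boundary, which costs just
under 2 MiB.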

Thanks,
Nick

Thread overview: 20+ messages
2020-08-21 15:12 [PATCH v6 00/12] huge vmalloc mappings Nicholas Piggin
2020-08-21 15:12 ` [PATCH v6 01/12] mm/vmalloc: fix vmalloc_to_page for huge vmap mappings Nicholas Piggin
2020-08-21 20:07   ` Andrew Morton
2020-08-21 22:42     ` Nicholas Piggin
2020-08-21 15:12 ` [PATCH v6 02/12] mm: apply_to_pte_range warn and fail if a large pte is encountered Nicholas Piggin
2020-08-21 15:12 ` [PATCH v6 03/12] mm/vmalloc: rename vmap_*_range vmap_pages_*_range Nicholas Piggin
2020-08-21 15:12 ` [PATCH v6 04/12] mm/ioremap: rename ioremap_*_range to vmap_*_range Nicholas Piggin
2020-08-21 15:12 ` [PATCH v6 05/12] mm: HUGE_VMAP arch support cleanup Nicholas Piggin
2020-08-21 20:14   ` Andrew Morton
2020-08-21 22:45     ` Nicholas Piggin
2020-08-21 15:12 ` [PATCH v6 06/12] powerpc: inline huge vmap supported functions Nicholas Piggin
2020-08-21 20:15   ` Andrew Morton
2020-08-21 15:12 ` [PATCH v6 07/12] arm64: " Nicholas Piggin
2020-08-21 15:12 ` [PATCH v6 08/12] x86: " Nicholas Piggin
2020-08-21 15:12 ` [PATCH v6 09/12] mm: Move vmap_range from mm/ioremap.c to mm/vmalloc.c Nicholas Piggin
2020-08-21 15:12 ` [PATCH v6 10/12] mm/vmalloc: add vmap_range_noflush variant Nicholas Piggin
2020-08-21 15:12 ` [PATCH v6 11/12] mm/vmalloc: Hugepage vmalloc mappings Nicholas Piggin
2020-08-21 15:38   ` Eric Dumazet
2020-08-21 16:05     ` Nicholas Piggin [this message]
2020-08-21 15:12 ` [PATCH v6 12/12] powerpc/64s/radix: Enable huge " Nicholas Piggin
