From: Kefeng Wang <wangkefeng.wang@huawei.com>
To: Nicholas Piggin <npiggin@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
Jonathan Corbet <corbet@lwn.net>,
Dave Hansen <dave.hansen@intel.com>,
<linux-arm-kernel@lists.infradead.org>,
<linux-doc@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
<linux-mm@kvack.org>, <linuxppc-dev@lists.ozlabs.org>,
<x86@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>,
Borislav Petkov <bp@alien8.de>,
Catalin Marinas <catalin.marinas@arm.com>,
Christophe Leroy <christophe.leroy@csgroup.eu>,
Dave Hansen <dave.hansen@linux.intel.com>,
"H. Peter Anvin" <hpa@zytor.com>, Ingo Molnar <mingo@redhat.com>,
"Michael Ellerman" <mpe@ellerman.id.au>,
Paul Mackerras <paulus@samba.org>,
"Thomas Gleixner" <tglx@linutronix.de>,
Will Deacon <will@kernel.org>,
Matthew Wilcox <willy@infradead.org>
Subject: Re: [PATCH v2 3/3] x86: Support huge vmalloc mappings
Date: Wed, 19 Jan 2022 21:32:13 +0800
Message-ID: <e9961fc7-30c7-1b1f-0b38-d23891c60284@huawei.com>
In-Reply-To: <1642565468.c0jax91tvn.astroid@bobo.none>
On 2022/1/19 12:17, Nicholas Piggin wrote:
> Excerpts from Dave Hansen's message of January 19, 2022 3:28 am:
>> On 1/17/22 6:46 PM, Nicholas Piggin wrote:
>>>> This all sounds very fragile to me. Every time a new architecture would
>>>> get added for huge vmalloc() support, the developer needs to know to go
>>>> find that architecture's module_alloc() and add this flag.
>>> This is documented in the Kconfig.
>>>
>>> #
>>> # Archs that select this would be capable of PMD-sized vmaps (i.e.,
>>> # arch_vmap_pmd_supported() returns true), and they must make no assumptions
>>> # that vmalloc memory is mapped with PAGE_SIZE ptes. The VM_NO_HUGE_VMAP flag
>>> # can be used to prohibit arch-specific allocations from using hugepages to
>>> # help with this (e.g., modules may require it).
>>> #
>>> config HAVE_ARCH_HUGE_VMALLOC
>>> depends on HAVE_ARCH_HUGE_VMAP
>>> bool
>>>
>>> Is it really fair to say it's *very* fragile? Surely it's reasonable to
>>> read the (not very long) documentation and understand the consequences for
>>> the arch code before enabling it.
>> Very fragile or not, I think folks are likely to get it wrong. It would
>> be nice to have it default *everyone* to safe and slow and make *sure*
> It's not safe to enable though. That's the problem. If it was just
> modules then you'd have a point but it could be anything.
>
>> they go look at the architecture modules code itself before enabling
>> this for modules.
> This is required not just for modules but for the whole arch code; it
> has to be looked at and decided that this will work.
>
>> Just from that Kconfig text, I don't think I'd know off the top of my
>> head what to do for x86, or what code I needed to go touch.
> You have to make sure arch/x86 makes no assumptions that vmalloc memory
> is backed by PAGE_SIZE ptes. If you can't do that then you shouldn't
> enable the option. The option can not explain it any more because any
> arch could do anything with its mappings. The module code is an example,
> not the recipe.
Hi Nick, Dave and Christophe, thanks for your review. I am a little
confused; my understanding is:

1) For ppc/arm64, module_alloc() must set VM_NO_HUGE_VMAP because the
   arch's set_memory_* functions can only support PAGE_SIZE mappings,
   due to the limitation of apply_to_page_range().
2) For x86, adding VM_NO_HUGE_VMAP in module_alloc() only avoids
   fragmentation: x86's __change_page_attr functions will split the
   huge mapping, so the flag is not a must.

The behavior above occurs when STRICT_MODULE_RWX is enabled, so either

1) add a unified function to set the vm flags (suggested by Dave), or
2) add the vm flags, with some comments, to each arch's module_alloc()

is acceptable. With the unified function, we could make this the
default recipe with STRICT_MODULE_RWX, and also fold two more vm flags
into it, e.g.:
+unsigned long module_alloc_vm_flags(bool need_flush_reset_perms)
+{
+	unsigned long vm_flags = VM_DEFER_KMEMLEAK;
+
+	if (need_flush_reset_perms)
+		vm_flags |= VM_FLUSH_RESET_PERMS;
+	/*
+	 * Modules use a single, large vmalloc(). Different permissions
+	 * are applied later and will fragment huge mappings or even
+	 * fail in set_memory_* on some architectures. Avoid using
+	 * huge pages for modules.
+	 */
+	if (IS_ENABLED(CONFIG_STRICT_MODULE_RWX))
+		vm_flags |= VM_NO_HUGE_VMAP;
+
+	return vm_flags;
+}
and then call it from each arch's module_alloc().
Any suggestions? Many thanks.
>
> Thanks,
> Nick
> .
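To make the "unified function" option concrete, here is a rough
userspace sketch of how two per-arch callers might pick their flags
through one shared helper. It is not kernel code. The bit values are
placeholders (the real definitions live in include/linux/vmalloc.h and
may differ), the STRICT_MODULE_RWX macro stands in for
IS_ENABLED(CONFIG_STRICT_MODULE_RWX), and arch_a/arch_b are
hypothetical names, not real architectures.

```c
#include <stdbool.h>

/* Placeholder bit values for illustration only; not the kernel's. */
#define VM_FLUSH_RESET_PERMS	0x1UL
#define VM_NO_HUGE_VMAP		0x2UL
#define VM_DEFER_KMEMLEAK	0x4UL

/* Stand-in for IS_ENABLED(CONFIG_STRICT_MODULE_RWX); 1 means enabled. */
#ifndef STRICT_MODULE_RWX
#define STRICT_MODULE_RWX 1
#endif

/* Same decision logic as the helper proposed in the mail. */
static unsigned long module_alloc_vm_flags(bool need_flush_reset_perms)
{
	unsigned long vm_flags = VM_DEFER_KMEMLEAK;

	if (need_flush_reset_perms)
		vm_flags |= VM_FLUSH_RESET_PERMS;

	/* Permissions are changed per page later, so avoid huge mappings. */
	if (STRICT_MODULE_RWX)
		vm_flags |= VM_NO_HUGE_VMAP;

	return vm_flags;
}

/* Hypothetical arch that wants permissions reset and flushed on free. */
static unsigned long arch_a_module_vm_flags(void)
{
	return module_alloc_vm_flags(true);
}

/* Hypothetical arch that does not need VM_FLUSH_RESET_PERMS. */
static unsigned long arch_b_module_vm_flags(void)
{
	return module_alloc_vm_flags(false);
}
```

In this model each arch only states whether it needs
VM_FLUSH_RESET_PERMS; the VM_NO_HUGE_VMAP decision stays in one place,
so a new arch cannot forget it.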
Thread overview: 25+ messages
2021-12-27 14:59 [PATCH v2 0/3] mm: support huge vmalloc mapping on arm64/x86 Kefeng Wang
2021-12-27 14:59 ` [PATCH v2 1/3] mm: vmalloc: Let user to control huge vmalloc default behavior Kefeng Wang
2022-01-18 2:52 ` Nicholas Piggin
2022-01-19 12:57 ` Kefeng Wang
2022-01-19 13:22 ` Matthew Wilcox
2022-01-19 13:44 ` Kefeng Wang
2022-01-19 13:48 ` Matthew Wilcox
2021-12-27 14:59 ` [PATCH v2 2/3] arm64: Support huge vmalloc mappings Kefeng Wang
2021-12-27 17:35 ` (No subject) William Kucharski
2021-12-28 1:36 ` Kefeng Wang
2022-01-15 10:05 ` [PATCH v2 2/3] arm64: Support huge vmalloc mappings Christophe Leroy
2021-12-27 14:59 ` [PATCH v2 3/3] x86: " Kefeng Wang
2021-12-27 15:56 ` Dave Hansen
2021-12-28 10:26 ` Kefeng Wang
2021-12-28 16:14 ` Dave Hansen
2021-12-29 11:01 ` Kefeng Wang
2022-01-15 10:17 ` Christophe Leroy
2022-01-15 10:15 ` Christophe Leroy
2022-01-18 2:46 ` Nicholas Piggin
2022-01-18 17:28 ` Dave Hansen
2022-01-19 4:17 ` Nicholas Piggin
2022-01-19 13:32 ` Kefeng Wang [this message]
2022-01-15 10:11 ` Christophe Leroy
2022-01-15 10:06 ` Christophe Leroy
2022-01-15 10:07 ` [PATCH v2 0/3] mm: support huge vmalloc mapping on arm64/x86 Christophe Leroy