All of lore.kernel.org
 help / color / mirror / Atom feed
From: Kefeng Wang <wangkefeng.wang@huawei.com>
To: Nicholas Piggin <npiggin@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Jonathan Corbet <corbet@lwn.net>,
	Dave Hansen <dave.hansen@intel.com>,
	<linux-arm-kernel@lists.infradead.org>,
	<linux-doc@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	<linux-mm@kvack.org>, <linuxppc-dev@lists.ozlabs.org>,
	<x86@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Borislav Petkov <bp@alien8.de>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Christophe Leroy <christophe.leroy@csgroup.eu>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	"H. Peter Anvin" <hpa@zytor.com>, Ingo Molnar <mingo@redhat.com>,
	"Michael Ellerman" <mpe@ellerman.id.au>,
	Paul Mackerras <paulus@samba.org>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	Will Deacon <will@kernel.org>,
	Matthew Wilcox <willy@infradead.org>
Subject: Re: [PATCH v2 3/3] x86: Support huge vmalloc mappings
Date: Wed, 19 Jan 2022 21:32:13 +0800	[thread overview]
Message-ID: <e9961fc7-30c7-1b1f-0b38-d23891c60284@huawei.com> (raw)
In-Reply-To: <1642565468.c0jax91tvn.astroid@bobo.none>


On 2022/1/19 12:17, Nicholas Piggin wrote:
> Excerpts from Dave Hansen's message of January 19, 2022 3:28 am:
>> On 1/17/22 6:46 PM, Nicholas Piggin wrote:
>>>> This all sounds very fragile to me.  Every time a new architecture would
>>>> get added for huge vmalloc() support, the developer needs to know to go
>>>> find that architecture's module_alloc() and add this flag.
>>> This is documented in the Kconfig.
>>>
>>>   #
>>>   #  Archs that select this would be capable of PMD-sized vmaps (i.e.,
>>>   #  arch_vmap_pmd_supported() returns true), and they must make no assumptions
>>>   #  that vmalloc memory is mapped with PAGE_SIZE ptes. The VM_NO_HUGE_VMAP flag
>>>   #  can be used to prohibit arch-specific allocations from using hugepages to
>>>   #  help with this (e.g., modules may require it).
>>>   #
>>>   config HAVE_ARCH_HUGE_VMALLOC
>>>           depends on HAVE_ARCH_HUGE_VMAP
>>>           bool
>>>
>>> Is it really fair to say it's *very* fragile? Surely it's reasonable to
>>> read the (not very long) documentation ad understand the consequences for
>>> the arch code before enabling it.
>> Very fragile or not, I think folks are likely to get it wrong.  It would
>> be nice to have it default *everyone* to safe and slow and make *sure*
> It's not safe to enable though. That's the problem. If it was just
> modules then you'd have a point but it could be anything.
>
>> they go look at the architecture modules code itself before enabling
>> this for modules.
> This is required not just for modules for the whole arch code, it
> has to be looked at and decided this will work.
>
>> Just from that Kconfig text, I don't think I'd know off the top of my
>> head what do do for x86, or what code I needed to go touch.
> You have to make sure arch/x86 makes no assumptions that vmalloc memory
> is backed by PAGE_SIZE ptes. If you can't do that then you shouldn't
> enable the option. The option can not explain it any more because any
> arch could do anything with its mappings. The module code is an example,
> not the recipe.

Hi Nick, Dave and Christophe,thanks for your review,  a little 
confused,   I think,

1) for ppc/arm64 module_alloc(),  it must set VM_NO_HUGE_VMAP because the

arch's set_memory_* funcitons can only support PAGE_SIZE mapping, due to the

limit of apply_to_page_range().

2) but for x86's module_alloc(), add VM_NO_HUGE_VMAP is to avoid 
fragmentation,

x86's __change_page_attr functions will split the huge mapping. this 
flags is not a must.


and the behavior above occurred when STRICT_MODULE_RWX enabled, so

1) add a unified function to set vm flags(suggested by Dave ) or

2) add vm flags with some comments to per-arch's module_alloc()

are both acceptable, for the way of unified function ,  we could make 
this a default recipe

with STRICT_MODULE_RWX, also make two more vm flags into it, eg,

+unsigned long module_alloc_vm_flags(bool need_flush_reset_perms)
+{
+       unsigned long vm_flags = VM_DEFER_KMEMLEAK;
+
+       if (need_flush_reset_perms)
+               vm_flags |= VM_FLUSH_RESET_PERMS;
+       /*
+        * Modules use a single, large vmalloc(). Different permissions
+        * are applied later and will fragment huge mappings or even
+        * fails in set_memory_* on some architectures. Avoid using
+        * huge pages for modules.
+        */
+       if (IS_ENABLED(CONFIG_STRICT_MODULE_RWX))
+               vm_flags |= VM_NO_HUGE_VMAP;
+
+       return vm_flags;
+}

then called each arch's module_alloc().

Any suggestion, many thanks.


>
> Thanks,
> Nick
> .

WARNING: multiple messages have this Message-ID (diff)
From: Kefeng Wang <wangkefeng.wang@huawei.com>
To: Nicholas Piggin <npiggin@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Jonathan Corbet <corbet@lwn.net>,
	Dave Hansen <dave.hansen@intel.com>,
	<linux-arm-kernel@lists.infradead.org>,
	<linux-doc@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	<linux-mm@kvack.org>, <linuxppc-dev@lists.ozlabs.org>,
	<x86@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Paul Mackerras <paulus@samba.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Will Deacon <will@kernel.org>
Subject: Re: [PATCH v2 3/3] x86: Support huge vmalloc mappings
Date: Wed, 19 Jan 2022 21:32:13 +0800	[thread overview]
Message-ID: <e9961fc7-30c7-1b1f-0b38-d23891c60284@huawei.com> (raw)
In-Reply-To: <1642565468.c0jax91tvn.astroid@bobo.none>


On 2022/1/19 12:17, Nicholas Piggin wrote:
> Excerpts from Dave Hansen's message of January 19, 2022 3:28 am:
>> On 1/17/22 6:46 PM, Nicholas Piggin wrote:
>>>> This all sounds very fragile to me.  Every time a new architecture would
>>>> get added for huge vmalloc() support, the developer needs to know to go
>>>> find that architecture's module_alloc() and add this flag.
>>> This is documented in the Kconfig.
>>>
>>>   #
>>>   #  Archs that select this would be capable of PMD-sized vmaps (i.e.,
>>>   #  arch_vmap_pmd_supported() returns true), and they must make no assumptions
>>>   #  that vmalloc memory is mapped with PAGE_SIZE ptes. The VM_NO_HUGE_VMAP flag
>>>   #  can be used to prohibit arch-specific allocations from using hugepages to
>>>   #  help with this (e.g., modules may require it).
>>>   #
>>>   config HAVE_ARCH_HUGE_VMALLOC
>>>           depends on HAVE_ARCH_HUGE_VMAP
>>>           bool
>>>
>>> Is it really fair to say it's *very* fragile? Surely it's reasonable to
>>> read the (not very long) documentation ad understand the consequences for
>>> the arch code before enabling it.
>> Very fragile or not, I think folks are likely to get it wrong.  It would
>> be nice to have it default *everyone* to safe and slow and make *sure*
> It's not safe to enable though. That's the problem. If it was just
> modules then you'd have a point but it could be anything.
>
>> they go look at the architecture modules code itself before enabling
>> this for modules.
> This is required not just for modules for the whole arch code, it
> has to be looked at and decided this will work.
>
>> Just from that Kconfig text, I don't think I'd know off the top of my
>> head what do do for x86, or what code I needed to go touch.
> You have to make sure arch/x86 makes no assumptions that vmalloc memory
> is backed by PAGE_SIZE ptes. If you can't do that then you shouldn't
> enable the option. The option can not explain it any more because any
> arch could do anything with its mappings. The module code is an example,
> not the recipe.

Hi Nick, Dave and Christophe,thanks for your review,  a little 
confused,   I think,

1) for ppc/arm64 module_alloc(),  it must set VM_NO_HUGE_VMAP because the

arch's set_memory_* funcitons can only support PAGE_SIZE mapping, due to the

limit of apply_to_page_range().

2) but for x86's module_alloc(), add VM_NO_HUGE_VMAP is to avoid 
fragmentation,

x86's __change_page_attr functions will split the huge mapping. this 
flags is not a must.


and the behavior above occurred when STRICT_MODULE_RWX enabled, so

1) add a unified function to set vm flags(suggested by Dave ) or

2) add vm flags with some comments to per-arch's module_alloc()

are both acceptable, for the way of unified function ,  we could make 
this a default recipe

with STRICT_MODULE_RWX, also make two more vm flags into it, eg,

+unsigned long module_alloc_vm_flags(bool need_flush_reset_perms)
+{
+       unsigned long vm_flags = VM_DEFER_KMEMLEAK;
+
+       if (need_flush_reset_perms)
+               vm_flags |= VM_FLUSH_RESET_PERMS;
+       /*
+        * Modules use a single, large vmalloc(). Different permissions
+        * are applied later and will fragment huge mappings or even
+        * fails in set_memory_* on some architectures. Avoid using
+        * huge pages for modules.
+        */
+       if (IS_ENABLED(CONFIG_STRICT_MODULE_RWX))
+               vm_flags |= VM_NO_HUGE_VMAP;
+
+       return vm_flags;
+}

then called each arch's module_alloc().

Any suggestion, many thanks.


>
> Thanks,
> Nick
> .

WARNING: multiple messages have this Message-ID (diff)
From: Kefeng Wang <wangkefeng.wang@huawei.com>
To: Nicholas Piggin <npiggin@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Jonathan Corbet <corbet@lwn.net>,
	Dave Hansen <dave.hansen@intel.com>,
	<linux-arm-kernel@lists.infradead.org>,
	<linux-doc@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	<linux-mm@kvack.org>, <linuxppc-dev@lists.ozlabs.org>,
	<x86@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Borislav Petkov <bp@alien8.de>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Christophe Leroy <christophe.leroy@csgroup.eu>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	"H. Peter Anvin" <hpa@zytor.com>, Ingo Molnar <mingo@redhat.com>,
	"Michael Ellerman" <mpe@ellerman.id.au>,
	Paul Mackerras <paulus@samba.org>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	Will Deacon <will@kernel.org>,
	Matthew Wilcox <willy@infradead.org>
Subject: Re: [PATCH v2 3/3] x86: Support huge vmalloc mappings
Date: Wed, 19 Jan 2022 21:32:13 +0800	[thread overview]
Message-ID: <e9961fc7-30c7-1b1f-0b38-d23891c60284@huawei.com> (raw)
In-Reply-To: <1642565468.c0jax91tvn.astroid@bobo.none>


On 2022/1/19 12:17, Nicholas Piggin wrote:
> Excerpts from Dave Hansen's message of January 19, 2022 3:28 am:
>> On 1/17/22 6:46 PM, Nicholas Piggin wrote:
>>>> This all sounds very fragile to me.  Every time a new architecture would
>>>> get added for huge vmalloc() support, the developer needs to know to go
>>>> find that architecture's module_alloc() and add this flag.
>>> This is documented in the Kconfig.
>>>
>>>   #
>>>   #  Archs that select this would be capable of PMD-sized vmaps (i.e.,
>>>   #  arch_vmap_pmd_supported() returns true), and they must make no assumptions
>>>   #  that vmalloc memory is mapped with PAGE_SIZE ptes. The VM_NO_HUGE_VMAP flag
>>>   #  can be used to prohibit arch-specific allocations from using hugepages to
>>>   #  help with this (e.g., modules may require it).
>>>   #
>>>   config HAVE_ARCH_HUGE_VMALLOC
>>>           depends on HAVE_ARCH_HUGE_VMAP
>>>           bool
>>>
>>> Is it really fair to say it's *very* fragile? Surely it's reasonable to
>>> read the (not very long) documentation ad understand the consequences for
>>> the arch code before enabling it.
>> Very fragile or not, I think folks are likely to get it wrong.  It would
>> be nice to have it default *everyone* to safe and slow and make *sure*
> It's not safe to enable though. That's the problem. If it was just
> modules then you'd have a point but it could be anything.
>
>> they go look at the architecture modules code itself before enabling
>> this for modules.
> This is required not just for modules for the whole arch code, it
> has to be looked at and decided this will work.
>
>> Just from that Kconfig text, I don't think I'd know off the top of my
>> head what do do for x86, or what code I needed to go touch.
> You have to make sure arch/x86 makes no assumptions that vmalloc memory
> is backed by PAGE_SIZE ptes. If you can't do that then you shouldn't
> enable the option. The option can not explain it any more because any
> arch could do anything with its mappings. The module code is an example,
> not the recipe.

Hi Nick, Dave and Christophe,thanks for your review,  a little 
confused,   I think,

1) for ppc/arm64 module_alloc(),  it must set VM_NO_HUGE_VMAP because the

arch's set_memory_* funcitons can only support PAGE_SIZE mapping, due to the

limit of apply_to_page_range().

2) but for x86's module_alloc(), add VM_NO_HUGE_VMAP is to avoid 
fragmentation,

x86's __change_page_attr functions will split the huge mapping. this 
flags is not a must.


and the behavior above occurred when STRICT_MODULE_RWX enabled, so

1) add a unified function to set vm flags(suggested by Dave ) or

2) add vm flags with some comments to per-arch's module_alloc()

are both acceptable, for the way of unified function ,  we could make 
this a default recipe

with STRICT_MODULE_RWX, also make two more vm flags into it, eg,

+unsigned long module_alloc_vm_flags(bool need_flush_reset_perms)
+{
+       unsigned long vm_flags = VM_DEFER_KMEMLEAK;
+
+       if (need_flush_reset_perms)
+               vm_flags |= VM_FLUSH_RESET_PERMS;
+       /*
+        * Modules use a single, large vmalloc(). Different permissions
+        * are applied later and will fragment huge mappings or even
+        * fails in set_memory_* on some architectures. Avoid using
+        * huge pages for modules.
+        */
+       if (IS_ENABLED(CONFIG_STRICT_MODULE_RWX))
+               vm_flags |= VM_NO_HUGE_VMAP;
+
+       return vm_flags;
+}

then called each arch's module_alloc().

Any suggestion, many thanks.


>
> Thanks,
> Nick
> .

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  reply	other threads:[~2022-01-19 13:32 UTC|newest]

Thread overview: 75+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-27 14:59 [PATCH v2 0/3] mm: support huge vmalloc mapping on arm64/x86 Kefeng Wang
2021-12-27 14:59 ` Kefeng Wang
2021-12-27 14:59 ` Kefeng Wang
2021-12-27 14:59 ` [PATCH v2 1/3] mm: vmalloc: Let user to control huge vmalloc default behavior Kefeng Wang
2021-12-27 14:59   ` Kefeng Wang
2021-12-27 14:59   ` Kefeng Wang
2022-01-18  2:52   ` Nicholas Piggin
2022-01-18  2:52     ` Nicholas Piggin
2022-01-18  2:52     ` Nicholas Piggin
2022-01-19 12:57     ` Kefeng Wang
2022-01-19 12:57       ` Kefeng Wang
2022-01-19 12:57       ` Kefeng Wang
2022-01-19 13:22       ` Matthew Wilcox
2022-01-19 13:22         ` Matthew Wilcox
2022-01-19 13:22         ` Matthew Wilcox
2022-01-19 13:44         ` Kefeng Wang
2022-01-19 13:44           ` Kefeng Wang
2022-01-19 13:44           ` Kefeng Wang
2022-01-19 13:48           ` Matthew Wilcox
2022-01-19 13:48             ` Matthew Wilcox
2022-01-19 13:48             ` Matthew Wilcox
2021-12-27 14:59 ` [PATCH v2 2/3] arm64: Support huge vmalloc mappings Kefeng Wang
2021-12-27 14:59   ` Kefeng Wang
2021-12-27 14:59   ` Kefeng Wang
2021-12-27 17:35   ` (No subject) William Kucharski
2021-12-27 17:35     ` William Kucharski
2021-12-27 17:35     ` William Kucharski
2021-12-28  1:36     ` Kefeng Wang
2021-12-28  1:36       ` Kefeng Wang
2021-12-28  1:36       ` Kefeng Wang
2022-01-15 10:05   ` [PATCH v2 2/3] arm64: Support huge vmalloc mappings Christophe Leroy
2022-01-15 10:05     ` Christophe Leroy
2022-01-15 10:05     ` Christophe Leroy
2021-12-27 14:59 ` [PATCH v2 3/3] x86: " Kefeng Wang
2021-12-27 14:59   ` Kefeng Wang
2021-12-27 14:59   ` Kefeng Wang
2021-12-27 15:56   ` Dave Hansen
2021-12-27 15:56     ` Dave Hansen
2021-12-27 15:56     ` Dave Hansen
2021-12-28 10:26     ` Kefeng Wang
2021-12-28 10:26       ` Kefeng Wang
2021-12-28 10:26       ` Kefeng Wang
2021-12-28 16:14       ` Dave Hansen
2021-12-28 16:14         ` Dave Hansen
2021-12-28 16:14         ` Dave Hansen
2021-12-29 11:01         ` Kefeng Wang
2021-12-29 11:01           ` Kefeng Wang
2021-12-29 11:01           ` Kefeng Wang
2022-01-15 10:17           ` Christophe Leroy
2022-01-15 10:17             ` Christophe Leroy
2022-01-15 10:17             ` Christophe Leroy
2022-01-15 10:15         ` Christophe Leroy
2022-01-15 10:15           ` Christophe Leroy
2022-01-15 10:15           ` Christophe Leroy
2022-01-18  2:46         ` Nicholas Piggin
2022-01-18  2:46           ` Nicholas Piggin
2022-01-18  2:46           ` Nicholas Piggin
2022-01-18 17:28           ` Dave Hansen
2022-01-18 17:28             ` Dave Hansen
2022-01-18 17:28             ` Dave Hansen
2022-01-19  4:17             ` Nicholas Piggin
2022-01-19  4:17               ` Nicholas Piggin
2022-01-19  4:17               ` Nicholas Piggin
2022-01-19 13:32               ` Kefeng Wang [this message]
2022-01-19 13:32                 ` Kefeng Wang
2022-01-19 13:32                 ` Kefeng Wang
2022-01-15 10:11       ` Christophe Leroy
2022-01-15 10:11         ` Christophe Leroy
2022-01-15 10:11         ` Christophe Leroy
2022-01-15 10:06   ` Christophe Leroy
2022-01-15 10:06     ` Christophe Leroy
2022-01-15 10:06     ` Christophe Leroy
2022-01-15 10:07 ` [PATCH v2 0/3] mm: support huge vmalloc mapping on arm64/x86 Christophe Leroy
2022-01-15 10:07   ` Christophe Leroy
2022-01-15 10:07   ` Christophe Leroy

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=e9961fc7-30c7-1b1f-0b38-d23891c60284@huawei.com \
    --to=wangkefeng.wang@huawei.com \
    --cc=akpm@linux-foundation.org \
    --cc=benh@kernel.crashing.org \
    --cc=bp@alien8.de \
    --cc=catalin.marinas@arm.com \
    --cc=christophe.leroy@csgroup.eu \
    --cc=corbet@lwn.net \
    --cc=dave.hansen@intel.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=mingo@redhat.com \
    --cc=mpe@ellerman.id.au \
    --cc=npiggin@gmail.com \
    --cc=paulus@samba.org \
    --cc=tglx@linutronix.de \
    --cc=will@kernel.org \
    --cc=willy@infradead.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.