Re: [PATCH v4] arm64: mm: fix linear mem mapping access performance degradation

From: "guanghui.fgh" <guanghuifeng@linux.alibaba.com>
To: Will Deacon <will@kernel.org>, Ard Biesheuvel <ardb@kernel.org>
Cc: baolin.wang@linux.alibaba.com, catalin.marinas@arm.com,
	akpm@linux-foundation.org, david@redhat.com, jianyong.wu@arm.com,
	james.morse@arm.com, quic_qiancai@quicinc.com,
	christophe.leroy@csgroup.eu, jonathan@marek.ca,
	mark.rutland@arm.com, thunder.leizhen@huawei.com,
	anshuman.khandual@arm.com, linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org, rppt@kernel.org,
	geert+renesas@glider.be, linux-mm@kvack.org,
	yaohongbo@linux.alibaba.com,
	alikernel-developer@linux.alibaba.com
Subject: Re: [PATCH v4] arm64: mm: fix linear mem mapping access performance degradation
Date: Tue, 5 Jul 2022 20:07:07 +0800	[thread overview]
Message-ID: <5d044fdd-a61a-d60f-d294-89e17de37712@linux.alibaba.com> (raw)
In-Reply-To: <20220705095231.GB552@willie-the-truck>

在 2022/7/5 17:52, Will Deacon 写道:
> On Mon, Jul 04, 2022 at 07:09:23PM +0200, Ard Biesheuvel wrote:
>> On Mon, 4 Jul 2022 at 18:38, Will Deacon <will@kernel.org> wrote:
>>>
>>> On Mon, Jul 04, 2022 at 10:34:07PM +0800, guanghui.fgh wrote:
>>>> Thanks.
>>>>
>>>> 在 2022/7/4 22:23, Will Deacon 写道:
>>>>> On Mon, Jul 04, 2022 at 10:11:27PM +0800, guanghui.fgh wrote:
>> ...
>>>>>> Namely, it's need to use non block/section mapping for crashkernel mem
>>>>>> before shringking.
>>>>>
>>>>> Well, yes, but we can change arch_kexec_[un]protect_crashkres() not to do
>>>>> that if we're leaving the thing mapped, no?
>>>>>
>>>> I think we should use arch_kexec_[un]protect_crashkres for crashkernel mem.
>>>>
>>>> Because when invalid crashkernel mem pagetable, there is no chance to rd/wr
>>>> the crashkernel mem by mistake.
>>>>
>>>> If we don't use arch_kexec_[un]protect_crashkres to invalid crashkernel mem
>>>> pagetable, there maybe some write operations to these mem by mistake which
>>>> may cause crashkernel boot error and vmcore saving error.
>>>
>>> I don't really buy this line of reasoning. The entire main kernel is
>>> writable, so why do we care about protecting the crashkernel so much? The
>>> _code_ to launch the crash kernel is writable! If you care about preventing
>>> writes to memory which should not be writable, then you should use
>>> rodata=full.
>>>
>>
>> This is not entirely true - the core kernel text and rodata are
>> remapped r/o in the linear map, whereas all module code and rodata are
>> left writable when rodata != full.
> 
> Yes, sorry, you're quite right. The kernel text is only writable if
> rodata=off.
> 
> But I still think it makes sense to protect the crashkernel only if
> rodata=full (which is the default on arm64) as this allows us to rely
> on page mappings and I think fits well with what we do for modules.
> 
>> But the conclusion is the same, imo: if you can't be bothered to
>> protect a good chunk of the code and rodata that the kernel relies on,
>> why should the crashkernel be treated any differently?
> 
> Thanks.
> 
> Will
Thanks.

1.The rodata full is harm to the performance and has been disabled in-house.

2.When using crashkernel with rodata non full, the kernel also will use 
non block/section mapping which cause high d-TLB miss and degrade 
performance greatly.
This patch fix it to use block/section mapping as far as possible.

bool can_set_direct_map(void)
{
	return rodata_full || debug_pagealloc_enabled();
}

map_mem:
if (can_set_direct_map() || IS_ENABLED(CONFIG_KFENCE))
	flags |= NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;

3.When rodata full is disabled, crashkernel also need protect(keep 
arch_kexec_[un]protect_crashkres using).
I think crashkernel should't depend on radata full(Maybe other 
architecture don't support radata full now).

Thanks.