From: James Morse <james.morse@arm.com>
To: Chen Zhou <chenzhou10@huawei.com>
Cc: wangkefeng.wang@huawei.com, horms@verge.net.au,
ard.biesheuvel@linaro.org, catalin.marinas@arm.com,
will.deacon@arm.com, linux-kernel@vger.kernel.org,
rppt@linux.ibm.com, linux-mm@kvack.org,
takahiro.akashi@linaro.org, mingo@redhat.com, bp@alien8.de,
ebiederm@xmission.com, kexec@lists.infradead.org,
akpm@linux-foundation.org, tglx@linutronix.de,
linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH 2/4] arm64: kdump: support reserving crashkernel above 4G
Date: Wed, 5 Jun 2019 17:29:54 +0100
Message-ID: <df2b659d-7406-fbfd-597d-be3a3f69abcb@arm.com>
In-Reply-To: <20190507035058.63992-3-chenzhou10@huawei.com>
Hello,
On 07/05/2019 04:50, Chen Zhou wrote:
> When crashkernel is reserved above 4G in memory, kernel should
> reserve some amount of low memory for swiotlb and some DMA buffers.
> Meanwhile, support crashkernel=X,[high,low] in arm64. When use
> crashkernel=X parameter, try low memory first and fall back to high
> memory unless "crashkernel=X,high" is specified.
What is the 'unless crashkernel=...,high' for? I think it would be simpler to relax the
ARCH_LOW_ADDRESS_LIMIT if reserve_crashkernel_low() allocated something.
This way "crashkernel=1G" tries to allocate 1G below 4G, but fails if there isn't enough
memory. "crashkernel=1G crashkernel=16M,low" allocates 16M below 4G, which is more likely
to succeed; if it does, the 1G block can then be placed anywhere.
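To make the suggestion concrete, here is a minimal userspace sketch of the policy, not
kernel code. The helper name crashkernel_search_limit() and its parameters are made up
for illustration: low_limit stands in for ARCH_LOW_ADDRESS_LIMIT and end_of_dram for
memblock_end_of_DRAM().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: pick the upper bound for the main crashkernel search.
 * low_reserved says whether a "crashkernel=Y,low" reservation succeeded.
 */
static uint64_t crashkernel_search_limit(bool low_reserved,
					 uint64_t low_limit,
					 uint64_t end_of_dram)
{
	/* An empty DRAM range would make either answer meaningless. */
	assert(end_of_dram > 0);

	/* With a low region in hand, the main block may go anywhere. */
	if (low_reserved)
		return end_of_dram;

	/* Otherwise the main block must itself sit in low memory. */
	return low_limit;
}
```

With this shape there is no need for a separate ",high" mode: the presence of a
successful low reservation is what relaxes the limit.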
> diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
> index 413d566..82cd9a0 100644
> --- a/arch/arm64/kernel/setup.c
> +++ b/arch/arm64/kernel/setup.c
> @@ -243,6 +243,9 @@ static void __init request_standard_resources(void)
> request_resource(res, &kernel_data);
> #ifdef CONFIG_KEXEC_CORE
> /* Userspace will find "Crash kernel" region in /proc/iomem. */
> + if (crashk_low_res.end && crashk_low_res.start >= res->start &&
> + crashk_low_res.end <= res->end)
> + request_resource(res, &crashk_low_res);
> if (crashk_res.end && crashk_res.start >= res->start &&
> crashk_res.end <= res->end)
> request_resource(res, &crashk_res);
With both crashk_low_res and crashk_res, we end up with two entries in /proc/iomem called
"Crash kernel". Because it's sorted by address, and kexec-tools stops searching when it
finds "Crash kernel", you are always going to get the kernel placed in the lower portion.
I suspect this isn't what you want. Can we rename crashk_low_res for arm64 so that
existing kexec-tools doesn't use it?
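As an illustration of the first-match problem, here is a small userspace model of such a
scan. It is not the kexec-tools code; first_crash_kernel() and the line format it parses
("start-end : name", as /proc/iomem prints it) are assumptions made for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Model of a first-match scan over /proc/iomem-style lines.
 * Returns 0 and fills start/end for the first region named exactly
 * "Crash kernel", or -1 if there is none.
 */
static int first_crash_kernel(const char *const *lines, size_t n,
			      unsigned long long *start,
			      unsigned long long *end)
{
	assert(lines && start && end);

	for (size_t i = 0; i < n; i++) {
		unsigned long long s, e;
		char name[64];

		/* The leading space skips the indentation of nested entries. */
		if (sscanf(lines[i], " %llx-%llx : %63[^\n]", &s, &e, name) != 3)
			continue;
		if (strcmp(name, "Crash kernel") == 0) {
			*start = s;
			*end = e;
			return 0;	/* stop at the first hit */
		}
	}
	return -1;
}
```

Fed two "Crash kernel" lines in address order, a scan like this only ever reports the
lower one, so the higher region is never considered.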
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index d2adffb..3fcd739 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -74,20 +74,37 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
> static void __init reserve_crashkernel(void)
> {
> unsigned long long crash_base, crash_size;
> + bool high = false;
> int ret;
>
> ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
> &crash_size, &crash_base);
> /* no crashkernel= or invalid value specified */
> - if (ret || !crash_size)
> - return;
> + if (ret || !crash_size) {
> + /* crashkernel=X,high */
> + ret = parse_crashkernel_high(boot_command_line,
> + memblock_phys_mem_size(),
> + &crash_size, &crash_base);
> + if (ret || !crash_size)
> + return;
> + high = true;
> + }
>
> crash_size = PAGE_ALIGN(crash_size);
>
> if (crash_base == 0) {
> - /* Current arm64 boot protocol requires 2MB alignment */
> - crash_base = memblock_find_in_range(0, ARCH_LOW_ADDRESS_LIMIT,
> - crash_size, SZ_2M);
> + /*
> + * Try low memory first and fall back to high memory
> + * unless "crashkernel=size[KMG],high" is specified.
> + */
> + if (!high)
> + crash_base = memblock_find_in_range(0,
> + ARCH_LOW_ADDRESS_LIMIT,
> + crash_size, CRASH_ALIGN);
> + if (!crash_base)
> + crash_base = memblock_find_in_range(0,
> + memblock_end_of_DRAM(),
> + crash_size, CRASH_ALIGN);
> if (crash_base == 0) {
> pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
> crash_size);
> @@ -105,13 +122,18 @@ static void __init reserve_crashkernel(void)
> return;
> }
>
> - if (!IS_ALIGNED(crash_base, SZ_2M)) {
> + if (!IS_ALIGNED(crash_base, CRASH_ALIGN)) {
> pr_warn("cannot reserve crashkernel: base address is not 2MB aligned\n");
> return;
> }
> }
> memblock_reserve(crash_base, crash_size);
>
> + if (crash_base >= SZ_4G && reserve_crashkernel_low()) {
> + memblock_free(crash_base, crash_size);
> + return;
This is going to be annoying on platforms that don't have, and don't need, memory below 4G.
A "crashkernel=...,low" on these systems will break crashdump. I don't think we should
expect users to know the memory layout. (I'm assuming distros are going to add a low
reservation everywhere, just in case.)
I think the 'low' region should be a small optional/best-effort extra, that kexec-tools
can't touch.
I'm afraid you've missed the ugly bit of the crashkernel reservation...
arch/arm64/mm/mmu.c::map_mem() marks the crashkernel as 'nomap' during the first pass of
page-table generation. This means it isn't mapped in the linear map. It then maps it with
page-size mappings, and removes the nomap flag.
This is done so that arch_kexec_protect_crashkres() and
arch_kexec_unprotect_crashkres() can remove the valid bits of the crashkernel mapping.
This way the old-kernel can't accidentally overwrite the crashkernel. It also saves us if
the old-kernel and the crashkernel use different memory attributes for the mapping.
As your low-memory reservation is intended to be used for devices, having it mapped by the
old-kernel as cacheable memory is going to cause problems if those CPUs aren't taken
offline and go corrupting this memory. (we did crash for a reason after all)
I think the simplest thing to do is mark the low region as 'nomap' in
reserve_crashkernel() and always leave it unmapped. We can then describe it via a
different string in /proc/iomem, something like "Crash kernel (low)". Older kexec-tools
shouldn't use it (I assume it's not using strncmp() in a way that would do this by
accident), and newer kexec-tools can know to describe it in the DT, but it can't write to it.
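For reference, the strncmp() worry looks like this in a standalone sketch. The two helper
names are invented for the example, and nothing here is taken from kexec-tools; it only
shows how a prefix comparison would pick up the new name while an exact one would not.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define CRASH_KERNEL "Crash kernel"

/* Exact comparison: "Crash kernel (low)" is a different name. */
static bool is_crash_kernel_exact(const char *name)
{
	assert(name);
	return strcmp(name, CRASH_KERNEL) == 0;
}

/* Prefix comparison: anything starting with "Crash kernel" matches. */
static bool is_crash_kernel_prefix(const char *name)
{
	assert(name);
	return strncmp(name, CRASH_KERNEL, strlen(CRASH_KERNEL)) == 0;
}
```

If an existing tool matched names the second way, "Crash kernel (low)" would be treated
as a usable crash kernel region, which is the accident to rule out before settling on
that string.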
Thanks,
James