All of lore.kernel.org
 help / color / mirror / Atom feed
From: Prabhakar Kushwaha <prabhakar.pkin@gmail.com>
To: John Donnelly <john.p.donnelly@oracle.com>
Cc: Bhupesh Sharma <bhsharma@redhat.com>,
	Chen Zhou <chenzhou10@huawei.com>,
	Simon Horman <horms@verge.net.au>,
	Devicetree List <devicetree@vger.kernel.org>,
	Baoquan He <bhe@redhat.com>, Will Deacon <will@kernel.org>,
	Linux Doc Mailing List <linux-doc@vger.kernel.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	kexec mailing list <kexec@lists.infradead.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Rob Herring <robh+dt@kernel.org>, Ingo Molnar <mingo@redhat.com>,
	Arnd Bergmann <arnd@arndb.de>,
	guohanjun@huawei.com, James Morse <james.morse@arm.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Prabhakar Kushwaha <pkushwaha@marvell.com>,
	RuiRui Yang <dyoung@redhat.com>,
	linux-arm-kernel <linux-arm-kernel@lists.infradead.org>
Subject: Re: [PATCH v8 0/5] support reserving crashkernel above 4G on arm64 kdump
Date: Tue, 2 Jun 2020 11:08:49 +0530	[thread overview]
Message-ID: <CAJ2QiJJE-jeRL1HPUZCwi1LtV9CBMmYrsOaS6vX1R1sJ6Z1t8g@mail.gmail.com> (raw)
In-Reply-To: <F64A309C-B9C0-45F2-A50D-D677005C33A6@oracle.com>

On Tue, Jun 2, 2020 at 3:29 AM John Donnelly <john.p.donnelly@oracle.com> wrote:
>
> Hi .  See below !
>
> > On Jun 1, 2020, at 4:02 PM, Bhupesh Sharma <bhsharma@redhat.com> wrote:
> >
> > Hi John,
> >
> > On Tue, Jun 2, 2020 at 1:01 AM John Donnelly <John.P.donnelly@oracle.com> wrote:
> >>
> >> Hi,
> >>
> >>
> >> On 6/1/20 7:02 AM, Prabhakar Kushwaha wrote:
> >>> Hi Chen,
> >>>
> >>> On Thu, May 21, 2020 at 3:05 PM Chen Zhou <chenzhou10@huawei.com> wrote:
> >>>> This patch series enable reserving crashkernel above 4G in arm64.
> >>>>
> >>>> There are following issues in arm64 kdump:
> >>>> 1. We use crashkernel=X to reserve crashkernel below 4G, which will fail
> >>>> when there is no enough low memory.
> >>>> 2. Currently, crashkernel=Y@X can be used to reserve crashkernel above 4G,
> >>>> in this case, if swiotlb or DMA buffers are required, crash dump kernel
> >>>> will boot failure because there is no low memory available for allocation.
> >>>>
> >>>> To solve these issues, introduce crashkernel=X,low to reserve specified
> >>>> size low memory.
> >>>> Crashkernel=X tries to reserve memory for the crash dump kernel under
> >>>> 4G. If crashkernel=Y,low is specified simultaneously, reserve spcified
> >>>> size low memory for crash kdump kernel devices firstly and then reserve
> >>>> memory above 4G.
> >>>>
> >>>> When crashkernel is reserved above 4G in memory, that is, crashkernel=X,low
> >>>> is specified simultaneously, kernel should reserve specified size low memory
> >>>> for crash dump kernel devices. So there may be two crash kernel regions, one
> >>>> is below 4G, the other is above 4G.
> >>>> In order to distinct from the high region and make no effect to the use of
> >>>> kexec-tools, rename the low region as "Crash kernel (low)", and add DT property
> >>>> "linux,low-memory-range" to crash dump kernel's dtb to pass the low region.
> >>>>
> >>>> Besides, we need to modify kexec-tools:
> >>>> arm64: kdump: add another DT property to crash dump kernel's dtb(see [1])
> >>>>
> >>>> The previous changes and discussions can be retrieved from:
> >>>>
> >>>> Changes since [v7]
> >>>> - Move x86 CRASH_ALIGN to 2M
> >>>> Suggested by Dave and do some test, move x86 CRASH_ALIGN to 2M.
> >>>> - Update Documentation/devicetree/bindings/chosen.txt
> >>>> Add corresponding documentation to Documentation/devicetree/bindings/chosen.txt suggested by Arnd.
> >>>> - Add Tested-by from Jhon and pk
> >>>>
> >>>> Changes since [v6]
> >>>> - Fix build errors reported by kbuild test robot.
> >>>>
> >>>> Changes since [v5]
> >>>> - Move reserve_crashkernel_low() into kernel/crash_core.c.
> >>>> - Delete crashkernel=X,high.
> >>>> - Modify crashkernel=X,low.
> >>>> If crashkernel=X,low is specified simultaneously, reserve spcified size low
> >>>> memory for crash kdump kernel devices firstly and then reserve memory above 4G.
> >>>> In addition, rename crashk_low_res as "Crash kernel (low)" for arm64, and then
> >>>> pass to crash dump kernel by DT property "linux,low-memory-range".
> >>>> - Update Documentation/admin-guide/kdump/kdump.rst.
> >>>>
> >>>> Changes since [v4]
> >>>> - Reimplement memblock_cap_memory_ranges for multiple ranges by Mike.
> >>>>
> >>>> Changes since [v3]
> >>>> - Add memblock_cap_memory_ranges back for multiple ranges.
> >>>> - Fix some compiling warnings.
> >>>>
> >>>> Changes since [v2]
> >>>> - Split patch "arm64: kdump: support reserving crashkernel above 4G" as
> >>>> two. Put "move reserve_crashkernel_low() into kexec_core.c" in a separate
> >>>> patch.
> >>>>
> >>>> Changes since [v1]:
> >>>> - Move common reserve_crashkernel_low() code into kernel/kexec_core.c.
> >>>> - Remove memblock_cap_memory_ranges() i added in v1 and implement that
> >>>> in fdt_enforce_memory_region().
> >>>> There are at most two crash kernel regions, for two crash kernel regions
> >>>> case, we cap the memory range [min(regs[*].start), max(regs[*].end)]
> >>>> and then remove the memory range in the middle.
> >>>>
> >>>> [1]: https://urldefense.com/v3/__http://lists.infradead.org/pipermail/kexec/2020-May/025128.html__;!!GqivPVa7Brio!LnTSARkCt0V0FozR0KmqooaH5ADtdXvs3mPdP3KRVqALmvSK2VmCkIPIhsaxbvpn1uM1$
> >>>> [v1]: https://urldefense.com/v3/__https://lkml.org/lkml/2019/4/2/1174__;!!GqivPVa7Brio!LnTSARkCt0V0FozR0KmqooaH5ADtdXvs3mPdP3KRVqALmvSK2VmCkIPIhsaxbt0xN9PE$
> >>>> [v2]: https://urldefense.com/v3/__https://lkml.org/lkml/2019/4/9/86__;!!GqivPVa7Brio!LnTSARkCt0V0FozR0KmqooaH5ADtdXvs3mPdP3KRVqALmvSK2VmCkIPIhsaxbub7yUQH$
> >>>> [v3]: https://urldefense.com/v3/__https://lkml.org/lkml/2019/4/9/306__;!!GqivPVa7Brio!LnTSARkCt0V0FozR0KmqooaH5ADtdXvs3mPdP3KRVqALmvSK2VmCkIPIhsaxbnc4zPPV$
> >>>> [v4]: https://urldefense.com/v3/__https://lkml.org/lkml/2019/4/15/273__;!!GqivPVa7Brio!LnTSARkCt0V0FozR0KmqooaH5ADtdXvs3mPdP3KRVqALmvSK2VmCkIPIhsaxbvsAsZLu$
> >>>> [v5]: https://urldefense.com/v3/__https://lkml.org/lkml/2019/5/6/1360__;!!GqivPVa7Brio!LnTSARkCt0V0FozR0KmqooaH5ADtdXvs3mPdP3KRVqALmvSK2VmCkIPIhsaxbl24n-79$
> >>>> [v6]: https://urldefense.com/v3/__https://lkml.org/lkml/2019/8/30/142__;!!GqivPVa7Brio!LnTSARkCt0V0FozR0KmqooaH5ADtdXvs3mPdP3KRVqALmvSK2VmCkIPIhsaxbs7r8G2a$
> >>>> [v7]: https://urldefense.com/v3/__https://lkml.org/lkml/2019/12/23/411__;!!GqivPVa7Brio!LnTSARkCt0V0FozR0KmqooaH5ADtdXvs3mPdP3KRVqALmvSK2VmCkIPIhsaxbiFUH90G$
> >>>>
> >>>> Chen Zhou (5):
> >>>>   x86: kdump: move reserve_crashkernel_low() into crash_core.c
> >>>>   arm64: kdump: reserve crashkenel above 4G for crash dump kernel
> >>>>   arm64: kdump: add memory for devices by DT property, low-memory-range
> >>>>   kdump: update Documentation about crashkernel on arm64
> >>>>   dt-bindings: chosen: Document linux,low-memory-range for arm64 kdump
> >>>>
> >>> We are getting "warn_alloc" [1] warning during boot of kdump kernel
> >>> with bootargs as [2] of primary kernel.
> >>> This error observed on ThunderX2  ARM64 platform.
> >>>
> >>> It is observed with latest upstream tag (v5.7-rc3) with this patch set
> >>>  and https://urldefense.com/v3/__https://lists.infradead.org/pipermail/kexec/2020-May/025128.html__;!!GqivPVa7Brio!LnTSARkCt0V0FozR0KmqooaH5ADtdXvs3mPdP3KRVqALmvSK2VmCkIPIhsaxbiIAAlzu$
> >>> Also **without** this patch-set
> >>> "https://urldefense.com/v3/__https://www.spinics.net/lists/arm-kernel/msg806882.html__;!!GqivPVa7Brio!LnTSARkCt0V0FozR0KmqooaH5ADtdXvs3mPdP3KRVqALmvSK2VmCkIPIhsaxbjC6ujMA$"
> >>>
> >>> This issue comes whenever crashkernel memory is reserved after 0xc000_0000.
> >>> More details discussed earlier in
> >>> https://urldefense.com/v3/__https://www.spinics.net/lists/arm-kernel/msg806882.html__;!!GqivPVa7Brio!LnTSARkCt0V0FozR0KmqooaH5ADtdXvs3mPdP3KRVqALmvSK2VmCkIPIhsaxbjC6ujMA$  without any
> >>> solution
> >>>
> >>> This patch-set is expected to solve similar kind of issue.
> >>> i.e. low memory is only targeted for DMA, swiotlb; So above mentioned
> >>> observation should be considered/fixed. .
> >>>
> >>> --pk
> >>>
> >>> [1]
> >>> [   30.366695] DMI: Cavium Inc. Saber/Saber, BIOS
> >>> TX2-FW-Release-3.1-build_01-2803-g74253a541a mm/dd/yyyy
> >>> [   30.367696] NET: Registered protocol family 16
> >>> [   30.369973] swapper/0: page allocation failure: order:6,
> >>> mode:0x1(GFP_DMA), nodemask=(null),cpuset=/,mems_allowed=0
> >>> [   30.369980] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.7.0-rc3+ #121
> >>> [   30.369981] Hardware name: Cavium Inc. Saber/Saber, BIOS
> >>> TX2-FW-Release-3.1-build_01-2803-g74253a541a mm/dd/yyyy
> >>> [   30.369984] Call trace:
> >>> [   30.369989]  dump_backtrace+0x0/0x1f8
> >>> [   30.369991]  show_stack+0x20/0x30
> >>> [   30.369997]  dump_stack+0xc0/0x10c
> >>> [   30.370001]  warn_alloc+0x10c/0x178
> >>> [   30.370004]  __alloc_pages_slowpath.constprop.111+0xb10/0xb50
> >>> [   30.370006]  __alloc_pages_nodemask+0x2b4/0x300
> >>> [   30.370008]  alloc_page_interleave+0x24/0x98
> >>> [   30.370011]  alloc_pages_current+0xe4/0x108
> >>> [   30.370017]  dma_atomic_pool_init+0x44/0x1a4
> >>> [   30.370020]  do_one_initcall+0x54/0x228
> >>> [   30.370027]  kernel_init_freeable+0x228/0x2cc
> >>> [   30.370031]  kernel_init+0x1c/0x110
> >>> [   30.370034]  ret_from_fork+0x10/0x18
> >>> [   30.370036] Mem-Info:
> >>> [   30.370064] active_anon:0 inactive_anon:0 isolated_anon:0
> >>> [   30.370064]  active_file:0 inactive_file:0 isolated_file:0
> >>> [   30.370064]  unevictable:0 dirty:0 writeback:0 unstable:0
> >>> [   30.370064]  slab_reclaimable:34 slab_unreclaimable:4438
> >>> [   30.370064]  mapped:0 shmem:0 pagetables:14 bounce:0
> >>> [   30.370064]  free:1537719 free_pcp:219 free_cma:0
> >>> [   30.370070] Node 0 active_anon:0kB inactive_anon:0kB
> >>> active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB
> >>> isolated(file):0kB mapped:0kB dirty:0kB writeback:0kB shmem:0kB
> >>> shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB
> >>> unstable:0kB all_unreclaimable? no
> >>> [   30.370073] Node 1 active_anon:0kB inactive_anon:0kB
> >>> active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB
> >>> isolated(file):0kB mapped:0kB dirty:0kB writeback:0kB shmem:0kB
> >>> shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB
> >>> unstable:0kB all_unreclaimable? no
> >>> [   30.370079] Node 0 DMA free:0kB min:0kB low:0kB high:0kB
> >>> reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB
> >>> active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB
> >>> present:128kB managed:0kB mlocked:0kB kernel_stack:0kB pagetables:0kB
> >>> bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> >>> [   30.370084] lowmem_reserve[]: 0 250 6063 6063
> >>> [   30.370090] Node 0 DMA32 free:256000kB min:408kB low:664kB
> >>> high:920kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB
> >>> active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB
> >>> present:269700kB managed:256000kB mlocked:0kB kernel_stack:0kB
> >>> pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> >>> [   30.370094] lowmem_reserve[]: 0 0 5813 5813
> >>> [   30.370100] Node 0 Normal free:5894876kB min:9552kB low:15504kB
> >>> high:21456kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB
> >>> active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB
> >>> present:8388608kB managed:5953112kB mlocked:0kB kernel_stack:21672kB
> >>> pagetables:56kB bounce:0kB free_pcp:876kB local_pcp:176kB free_cma:0kB
> >>> [   30.370104] lowmem_reserve[]: 0 0 0 0
> >>> [   30.370107] Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB
> >>> 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 0kB
> >>> [   30.370113] Node 0 DMA32: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB
> >>> 0*256kB 0*512kB 0*1024kB 1*2048kB (M) 62*4096kB (M) = 256000kB
> >>> [   30.370119] Node 0 Normal: 2*4kB (M) 3*8kB (ME) 2*16kB (UE) 3*32kB
> >>> (UM) 1*64kB (U) 2*128kB (M) 2*256kB (ME) 3*512kB (ME) 3*1024kB (ME)
> >>> 3*2048kB (UME) 1436*4096kB (M) = 5893600kB
> >>> [   30.370129] Node 0 hugepages_total=0 hugepages_free=0
> >>> hugepages_surp=0 hugepages_size=1048576kB
> >>> [   30.370130] 0 total pagecache pages
> >>> [   30.370132] 0 pages in swap cache
> >>> [   30.370134] Swap cache stats: add 0, delete 0, find 0/0
> >>> [   30.370135] Free swap  = 0kB
> >>> [   30.370136] Total swap = 0kB
> >>> [   30.370137] 2164609 pages RAM
> >>> [   30.370139] 0 pages HighMem/MovableOnly
> >>> [   30.370140] 612331 pages reserved
> >>> [   30.370141] 0 pages hwpoisoned
> >>> [   30.370143] DMA: failed to allocate 256 KiB pool for atomic
> >>> coherent allocation
> >>
> >>
> >> During my testing I saw the same error and Chen's  solution corrected it .
> >
> > Which combination you are using on your side? I am using Prabhakar's
> > suggested environment and can reproduce the issue
> > with or without Chen's crashkernel support above 4G patchset.
> >
> > I am also using a ThunderX2 platform with latest makedumpfile code and
> > kexec-tools (with the suggested patch
> > <https://urldefense.com/v3/__https://lists.infradead.org/pipermail/kexec/2020-May/025128.html__;!!GqivPVa7Brio!J6lUig58-Gw6TKZnEEYzEeSU36T-1SqlB1kImU00xtX_lss5Tx-JbUmLE9TJC3foXBLg$ >).
> >
> > Thanks,
> > Bhupesh
>
>
> I did this activity 5 months ago and I have moved on to other activities. My DMA failures were related to PCI devices that could not be enumerated because  low-DMA space was not  available  when crashkernel was moved above 4G; I don’t recall the exact platform.
>
>
>
> For this failure ,
>
> >>>  DMA: failed to allocate 256 KiB pool for atomic
> >>> coherent allocation
>
>
> Is due to :
>
>
>  3618082c
>  ("arm64 use both ZONE_DMA and ZONE_DMA32")
>
> With the introduction of ZONE_DMA to support the Raspberry DMA
> region below 1G, the crashkernel is placed in the upper 4G
> ZONE_DMA_32 region. Since the crashkernel does not have access
> to the ZONE_DMA region, it prints out call trace during bootup.
>
> It is due to having this CONFIG item  ON  :
>
>
> CONFIG_ZONE_DMA=y
>
> Turning off ZONE_DMA fixes a issue and Raspberry PI 4 will
> use the device tree to specify memory below 1G.
>
>

Disabling ZONE_DMA is temporary solution.  We may need proper solution

> I would like to see Chen’s feature added , perhaps as EXPERIMENTAL,  so we can get some configuration testing done on it.   It corrects having a DMA zone in low memory while crash-kernel is above 4GB.  This has been going on for a year now.

I will also like this patch to be added in Linux as early as possible.

Issue mentioned by me happens with or without this patch.

This patch-set can consider fixing because it uses low memory for DMA
& swiotlb only.
We can consider restricting crashkernel within the required range like below

diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 7f9e5a6dc48c..bd67b90d35bd 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -354,7 +354,7 @@ int __init reserve_crashkernel_low(void)
                        return 0;
        }

-       low_base = memblock_find_in_range(0, 1ULL << 32, low_size, CRASH_ALIGN);
+       low_base = memblock_find_in_range(0, 0xc0000000, low_size, CRASH_ALIGN);
        if (!low_base) {
                pr_err("Cannot reserve %ldMB crashkernel low memory,
please try smaller size.\n",
                       (unsigned long)(low_size >> 20));


Similar change can be considered for scenario "without" this patch.
But it will decrease memory availability for crashkernel.
Hence increase the failure probability of crashkernel reservation.

--pk

WARNING: multiple messages have this Message-ID (diff)
From: Prabhakar Kushwaha <prabhakar.pkin@gmail.com>
To: John Donnelly <john.p.donnelly@oracle.com>
Cc: Devicetree List <devicetree@vger.kernel.org>,
	Arnd Bergmann <arnd@arndb.de>, Baoquan He <bhe@redhat.com>,
	Linux Doc Mailing List <linux-doc@vger.kernel.org>,
	Chen Zhou <chenzhou10@huawei.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Bhupesh Sharma <bhsharma@redhat.com>,
	RuiRui Yang <dyoung@redhat.com>,
	kexec mailing list <kexec@lists.infradead.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Rob Herring <robh+dt@kernel.org>,
	Simon Horman <horms@verge.net.au>,
	James Morse <james.morse@arm.com>,
	guohanjun@huawei.com, Thomas Gleixner <tglx@linutronix.de>,
	Prabhakar Kushwaha <pkushwaha@marvell.com>,
	Will Deacon <will@kernel.org>, Ingo Molnar <mingo@redhat.com>,
	linux-arm-kernel <linux-arm-kernel@lists.infradead.org>
Subject: Re: [PATCH v8 0/5] support reserving crashkernel above 4G on arm64 kdump
Date: Tue, 2 Jun 2020 11:08:49 +0530	[thread overview]
Message-ID: <CAJ2QiJJE-jeRL1HPUZCwi1LtV9CBMmYrsOaS6vX1R1sJ6Z1t8g@mail.gmail.com> (raw)
In-Reply-To: <F64A309C-B9C0-45F2-A50D-D677005C33A6@oracle.com>

On Tue, Jun 2, 2020 at 3:29 AM John Donnelly <john.p.donnelly@oracle.com> wrote:
>
> Hi .  See below !
>
> > On Jun 1, 2020, at 4:02 PM, Bhupesh Sharma <bhsharma@redhat.com> wrote:
> >
> > Hi John,
> >
> > On Tue, Jun 2, 2020 at 1:01 AM John Donnelly <John.P.donnelly@oracle.com> wrote:
> >>
> >> Hi,
> >>
> >>
> >> On 6/1/20 7:02 AM, Prabhakar Kushwaha wrote:
> >>> Hi Chen,
> >>>
> >>> On Thu, May 21, 2020 at 3:05 PM Chen Zhou <chenzhou10@huawei.com> wrote:
> >>>> This patch series enable reserving crashkernel above 4G in arm64.
> >>>>
> >>>> There are following issues in arm64 kdump:
> >>>> 1. We use crashkernel=X to reserve crashkernel below 4G, which will fail
> >>>> when there is no enough low memory.
> >>>> 2. Currently, crashkernel=Y@X can be used to reserve crashkernel above 4G,
> >>>> in this case, if swiotlb or DMA buffers are required, crash dump kernel
> >>>> will boot failure because there is no low memory available for allocation.
> >>>>
> >>>> To solve these issues, introduce crashkernel=X,low to reserve specified
> >>>> size low memory.
> >>>> Crashkernel=X tries to reserve memory for the crash dump kernel under
> >>>> 4G. If crashkernel=Y,low is specified simultaneously, reserve spcified
> >>>> size low memory for crash kdump kernel devices firstly and then reserve
> >>>> memory above 4G.
> >>>>
> >>>> When crashkernel is reserved above 4G in memory, that is, crashkernel=X,low
> >>>> is specified simultaneously, kernel should reserve specified size low memory
> >>>> for crash dump kernel devices. So there may be two crash kernel regions, one
> >>>> is below 4G, the other is above 4G.
> >>>> In order to distinct from the high region and make no effect to the use of
> >>>> kexec-tools, rename the low region as "Crash kernel (low)", and add DT property
> >>>> "linux,low-memory-range" to crash dump kernel's dtb to pass the low region.
> >>>>
> >>>> Besides, we need to modify kexec-tools:
> >>>> arm64: kdump: add another DT property to crash dump kernel's dtb(see [1])
> >>>>
> >>>> The previous changes and discussions can be retrieved from:
> >>>>
> >>>> Changes since [v7]
> >>>> - Move x86 CRASH_ALIGN to 2M
> >>>> Suggested by Dave and do some test, move x86 CRASH_ALIGN to 2M.
> >>>> - Update Documentation/devicetree/bindings/chosen.txt
> >>>> Add corresponding documentation to Documentation/devicetree/bindings/chosen.txt suggested by Arnd.
> >>>> - Add Tested-by from Jhon and pk
> >>>>
> >>>> Changes since [v6]
> >>>> - Fix build errors reported by kbuild test robot.
> >>>>
> >>>> Changes since [v5]
> >>>> - Move reserve_crashkernel_low() into kernel/crash_core.c.
> >>>> - Delete crashkernel=X,high.
> >>>> - Modify crashkernel=X,low.
> >>>> If crashkernel=X,low is specified simultaneously, reserve spcified size low
> >>>> memory for crash kdump kernel devices firstly and then reserve memory above 4G.
> >>>> In addition, rename crashk_low_res as "Crash kernel (low)" for arm64, and then
> >>>> pass to crash dump kernel by DT property "linux,low-memory-range".
> >>>> - Update Documentation/admin-guide/kdump/kdump.rst.
> >>>>
> >>>> Changes since [v4]
> >>>> - Reimplement memblock_cap_memory_ranges for multiple ranges by Mike.
> >>>>
> >>>> Changes since [v3]
> >>>> - Add memblock_cap_memory_ranges back for multiple ranges.
> >>>> - Fix some compiling warnings.
> >>>>
> >>>> Changes since [v2]
> >>>> - Split patch "arm64: kdump: support reserving crashkernel above 4G" as
> >>>> two. Put "move reserve_crashkernel_low() into kexec_core.c" in a separate
> >>>> patch.
> >>>>
> >>>> Changes since [v1]:
> >>>> - Move common reserve_crashkernel_low() code into kernel/kexec_core.c.
> >>>> - Remove memblock_cap_memory_ranges() i added in v1 and implement that
> >>>> in fdt_enforce_memory_region().
> >>>> There are at most two crash kernel regions, for two crash kernel regions
> >>>> case, we cap the memory range [min(regs[*].start), max(regs[*].end)]
> >>>> and then remove the memory range in the middle.
> >>>>
> >>>> [1]: https://urldefense.com/v3/__http://lists.infradead.org/pipermail/kexec/2020-May/025128.html__;!!GqivPVa7Brio!LnTSARkCt0V0FozR0KmqooaH5ADtdXvs3mPdP3KRVqALmvSK2VmCkIPIhsaxbvpn1uM1$
> >>>> [v1]: https://urldefense.com/v3/__https://lkml.org/lkml/2019/4/2/1174__;!!GqivPVa7Brio!LnTSARkCt0V0FozR0KmqooaH5ADtdXvs3mPdP3KRVqALmvSK2VmCkIPIhsaxbt0xN9PE$
> >>>> [v2]: https://urldefense.com/v3/__https://lkml.org/lkml/2019/4/9/86__;!!GqivPVa7Brio!LnTSARkCt0V0FozR0KmqooaH5ADtdXvs3mPdP3KRVqALmvSK2VmCkIPIhsaxbub7yUQH$
> >>>> [v3]: https://urldefense.com/v3/__https://lkml.org/lkml/2019/4/9/306__;!!GqivPVa7Brio!LnTSARkCt0V0FozR0KmqooaH5ADtdXvs3mPdP3KRVqALmvSK2VmCkIPIhsaxbnc4zPPV$
> >>>> [v4]: https://urldefense.com/v3/__https://lkml.org/lkml/2019/4/15/273__;!!GqivPVa7Brio!LnTSARkCt0V0FozR0KmqooaH5ADtdXvs3mPdP3KRVqALmvSK2VmCkIPIhsaxbvsAsZLu$
> >>>> [v5]: https://urldefense.com/v3/__https://lkml.org/lkml/2019/5/6/1360__;!!GqivPVa7Brio!LnTSARkCt0V0FozR0KmqooaH5ADtdXvs3mPdP3KRVqALmvSK2VmCkIPIhsaxbl24n-79$
> >>>> [v6]: https://urldefense.com/v3/__https://lkml.org/lkml/2019/8/30/142__;!!GqivPVa7Brio!LnTSARkCt0V0FozR0KmqooaH5ADtdXvs3mPdP3KRVqALmvSK2VmCkIPIhsaxbs7r8G2a$
> >>>> [v7]: https://urldefense.com/v3/__https://lkml.org/lkml/2019/12/23/411__;!!GqivPVa7Brio!LnTSARkCt0V0FozR0KmqooaH5ADtdXvs3mPdP3KRVqALmvSK2VmCkIPIhsaxbiFUH90G$
> >>>>
> >>>> Chen Zhou (5):
> >>>>   x86: kdump: move reserve_crashkernel_low() into crash_core.c
> >>>>   arm64: kdump: reserve crashkenel above 4G for crash dump kernel
> >>>>   arm64: kdump: add memory for devices by DT property, low-memory-range
> >>>>   kdump: update Documentation about crashkernel on arm64
> >>>>   dt-bindings: chosen: Document linux,low-memory-range for arm64 kdump
> >>>>
> >>> We are getting "warn_alloc" [1] warning during boot of kdump kernel
> >>> with bootargs as [2] of primary kernel.
> >>> This error observed on ThunderX2  ARM64 platform.
> >>>
> >>> It is observed with latest upstream tag (v5.7-rc3) with this patch set
> >>>  and https://urldefense.com/v3/__https://lists.infradead.org/pipermail/kexec/2020-May/025128.html__;!!GqivPVa7Brio!LnTSARkCt0V0FozR0KmqooaH5ADtdXvs3mPdP3KRVqALmvSK2VmCkIPIhsaxbiIAAlzu$
> >>> Also **without** this patch-set
> >>> "https://urldefense.com/v3/__https://www.spinics.net/lists/arm-kernel/msg806882.html__;!!GqivPVa7Brio!LnTSARkCt0V0FozR0KmqooaH5ADtdXvs3mPdP3KRVqALmvSK2VmCkIPIhsaxbjC6ujMA$"
> >>>
> >>> This issue comes whenever crashkernel memory is reserved after 0xc000_0000.
> >>> More details discussed earlier in
> >>> https://urldefense.com/v3/__https://www.spinics.net/lists/arm-kernel/msg806882.html__;!!GqivPVa7Brio!LnTSARkCt0V0FozR0KmqooaH5ADtdXvs3mPdP3KRVqALmvSK2VmCkIPIhsaxbjC6ujMA$  without any
> >>> solution
> >>>
> >>> This patch-set is expected to solve similar kind of issue.
> >>> i.e. low memory is only targeted for DMA, swiotlb; So above mentioned
> >>> observation should be considered/fixed. .
> >>>
> >>> --pk
> >>>
> >>> [1]
> >>> [   30.366695] DMI: Cavium Inc. Saber/Saber, BIOS
> >>> TX2-FW-Release-3.1-build_01-2803-g74253a541a mm/dd/yyyy
> >>> [   30.367696] NET: Registered protocol family 16
> >>> [   30.369973] swapper/0: page allocation failure: order:6,
> >>> mode:0x1(GFP_DMA), nodemask=(null),cpuset=/,mems_allowed=0
> >>> [   30.369980] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.7.0-rc3+ #121
> >>> [   30.369981] Hardware name: Cavium Inc. Saber/Saber, BIOS
> >>> TX2-FW-Release-3.1-build_01-2803-g74253a541a mm/dd/yyyy
> >>> [   30.369984] Call trace:
> >>> [   30.369989]  dump_backtrace+0x0/0x1f8
> >>> [   30.369991]  show_stack+0x20/0x30
> >>> [   30.369997]  dump_stack+0xc0/0x10c
> >>> [   30.370001]  warn_alloc+0x10c/0x178
> >>> [   30.370004]  __alloc_pages_slowpath.constprop.111+0xb10/0xb50
> >>> [   30.370006]  __alloc_pages_nodemask+0x2b4/0x300
> >>> [   30.370008]  alloc_page_interleave+0x24/0x98
> >>> [   30.370011]  alloc_pages_current+0xe4/0x108
> >>> [   30.370017]  dma_atomic_pool_init+0x44/0x1a4
> >>> [   30.370020]  do_one_initcall+0x54/0x228
> >>> [   30.370027]  kernel_init_freeable+0x228/0x2cc
> >>> [   30.370031]  kernel_init+0x1c/0x110
> >>> [   30.370034]  ret_from_fork+0x10/0x18
> >>> [   30.370036] Mem-Info:
> >>> [   30.370064] active_anon:0 inactive_anon:0 isolated_anon:0
> >>> [   30.370064]  active_file:0 inactive_file:0 isolated_file:0
> >>> [   30.370064]  unevictable:0 dirty:0 writeback:0 unstable:0
> >>> [   30.370064]  slab_reclaimable:34 slab_unreclaimable:4438
> >>> [   30.370064]  mapped:0 shmem:0 pagetables:14 bounce:0
> >>> [   30.370064]  free:1537719 free_pcp:219 free_cma:0
> >>> [   30.370070] Node 0 active_anon:0kB inactive_anon:0kB
> >>> active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB
> >>> isolated(file):0kB mapped:0kB dirty:0kB writeback:0kB shmem:0kB
> >>> shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB
> >>> unstable:0kB all_unreclaimable? no
> >>> [   30.370073] Node 1 active_anon:0kB inactive_anon:0kB
> >>> active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB
> >>> isolated(file):0kB mapped:0kB dirty:0kB writeback:0kB shmem:0kB
> >>> shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB
> >>> unstable:0kB all_unreclaimable? no
> >>> [   30.370079] Node 0 DMA free:0kB min:0kB low:0kB high:0kB
> >>> reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB
> >>> active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB
> >>> present:128kB managed:0kB mlocked:0kB kernel_stack:0kB pagetables:0kB
> >>> bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> >>> [   30.370084] lowmem_reserve[]: 0 250 6063 6063
> >>> [   30.370090] Node 0 DMA32 free:256000kB min:408kB low:664kB
> >>> high:920kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB
> >>> active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB
> >>> present:269700kB managed:256000kB mlocked:0kB kernel_stack:0kB
> >>> pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> >>> [   30.370094] lowmem_reserve[]: 0 0 5813 5813
> >>> [   30.370100] Node 0 Normal free:5894876kB min:9552kB low:15504kB
> >>> high:21456kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB
> >>> active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB
> >>> present:8388608kB managed:5953112kB mlocked:0kB kernel_stack:21672kB
> >>> pagetables:56kB bounce:0kB free_pcp:876kB local_pcp:176kB free_cma:0kB
> >>> [   30.370104] lowmem_reserve[]: 0 0 0 0
> >>> [   30.370107] Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB
> >>> 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 0kB
> >>> [   30.370113] Node 0 DMA32: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB
> >>> 0*256kB 0*512kB 0*1024kB 1*2048kB (M) 62*4096kB (M) = 256000kB
> >>> [   30.370119] Node 0 Normal: 2*4kB (M) 3*8kB (ME) 2*16kB (UE) 3*32kB
> >>> (UM) 1*64kB (U) 2*128kB (M) 2*256kB (ME) 3*512kB (ME) 3*1024kB (ME)
> >>> 3*2048kB (UME) 1436*4096kB (M) = 5893600kB
> >>> [   30.370129] Node 0 hugepages_total=0 hugepages_free=0
> >>> hugepages_surp=0 hugepages_size=1048576kB
> >>> [   30.370130] 0 total pagecache pages
> >>> [   30.370132] 0 pages in swap cache
> >>> [   30.370134] Swap cache stats: add 0, delete 0, find 0/0
> >>> [   30.370135] Free swap  = 0kB
> >>> [   30.370136] Total swap = 0kB
> >>> [   30.370137] 2164609 pages RAM
> >>> [   30.370139] 0 pages HighMem/MovableOnly
> >>> [   30.370140] 612331 pages reserved
> >>> [   30.370141] 0 pages hwpoisoned
> >>> [   30.370143] DMA: failed to allocate 256 KiB pool for atomic
> >>> coherent allocation
> >>
> >>
> >> During my testing I saw the same error and Chen's  solution corrected it .
> >
> > Which combination you are using on your side? I am using Prabhakar's
> > suggested environment and can reproduce the issue
> > with or without Chen's crashkernel support above 4G patchset.
> >
> > I am also using a ThunderX2 platform with latest makedumpfile code and
> > kexec-tools (with the suggested patch
> > <https://urldefense.com/v3/__https://lists.infradead.org/pipermail/kexec/2020-May/025128.html__;!!GqivPVa7Brio!J6lUig58-Gw6TKZnEEYzEeSU36T-1SqlB1kImU00xtX_lss5Tx-JbUmLE9TJC3foXBLg$ >).
> >
> > Thanks,
> > Bhupesh
>
>
> I did this activity 5 months ago and I have moved on to other activities. My DMA failures were related to PCI devices that could not be enumerated because  low-DMA space was not  available  when crashkernel was moved above 4G; I don’t recall the exact platform.
>
>
>
> For this failure ,
>
> >>>  DMA: failed to allocate 256 KiB pool for atomic
> >>> coherent allocation
>
>
> Is due to :
>
>
>  3618082c
>  ("arm64 use both ZONE_DMA and ZONE_DMA32")
>
> With the introduction of ZONE_DMA to support the Raspberry DMA
> region below 1G, the crashkernel is placed in the upper 4G
> ZONE_DMA_32 region. Since the crashkernel does not have access
> to the ZONE_DMA region, it prints out call trace during bootup.
>
> It is due to having this CONFIG item  ON  :
>
>
> CONFIG_ZONE_DMA=y
>
> Turning off ZONE_DMA fixes a issue and Raspberry PI 4 will
> use the device tree to specify memory below 1G.
>
>

Disabling ZONE_DMA is temporary solution.  We may need proper solution

> I would like to see Chen’s feature added , perhaps as EXPERIMENTAL,  so we can get some configuration testing done on it.   It corrects having a DMA zone in low memory while crash-kernel is above 4GB.  This has been going on for a year now.

I will also like this patch to be added in Linux as early as possible.

Issue mentioned by me happens with or without this patch.

This patch-set can consider fixing because it uses low memory for DMA
& swiotlb only.
We can consider restricting crashkernel within the required range like below

diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 7f9e5a6dc48c..bd67b90d35bd 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -354,7 +354,7 @@ int __init reserve_crashkernel_low(void)
                        return 0;
        }

-       low_base = memblock_find_in_range(0, 1ULL << 32, low_size, CRASH_ALIGN);
+       low_base = memblock_find_in_range(0, 0xc0000000, low_size, CRASH_ALIGN);
        if (!low_base) {
                pr_err("Cannot reserve %ldMB crashkernel low memory,
please try smaller size.\n",
                       (unsigned long)(low_size >> 20));


Similar change can be considered for scenario "without" this patch.
But it will decrease memory availability for crashkernel.
Hence increase the failure probability of crashkernel reservation.

--pk

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

WARNING: multiple messages have this Message-ID (diff)
From: Prabhakar Kushwaha <prabhakar.pkin@gmail.com>
To: John Donnelly <john.p.donnelly@oracle.com>
Cc: Devicetree List <devicetree@vger.kernel.org>,
	Arnd Bergmann <arnd@arndb.de>, Baoquan He <bhe@redhat.com>,
	Linux Doc Mailing List <linux-doc@vger.kernel.org>,
	Chen Zhou <chenzhou10@huawei.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Bhupesh Sharma <bhsharma@redhat.com>,
	RuiRui Yang <dyoung@redhat.com>,
	kexec mailing list <kexec@lists.infradead.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Rob Herring <robh+dt@kernel.org>,
	Simon Horman <horms@verge.net.au>,
	James Morse <james.morse@arm.com>,
	guohanjun@huawei.com, Thomas Gleixner <tglx@linutronix.de>,
	Prabhakar Kushwaha <pkushwaha@marvell.com>,
	Will Deacon <will@kernel.org>, Ingo Molnar <mingo@redhat.com>,
	linux-arm-kernel <linux-arm-kernel@lists.infradead.org>
Subject: Re: [PATCH v8 0/5] support reserving crashkernel above 4G on arm64 kdump
Date: Tue, 2 Jun 2020 11:08:49 +0530	[thread overview]
Message-ID: <CAJ2QiJJE-jeRL1HPUZCwi1LtV9CBMmYrsOaS6vX1R1sJ6Z1t8g@mail.gmail.com> (raw)
In-Reply-To: <F64A309C-B9C0-45F2-A50D-D677005C33A6@oracle.com>

On Tue, Jun 2, 2020 at 3:29 AM John Donnelly <john.p.donnelly@oracle.com> wrote:
>
> Hi .  See below !
>
> > On Jun 1, 2020, at 4:02 PM, Bhupesh Sharma <bhsharma@redhat.com> wrote:
> >
> > Hi John,
> >
> > On Tue, Jun 2, 2020 at 1:01 AM John Donnelly <John.P.donnelly@oracle.com> wrote:
> >>
> >> Hi,
> >>
> >>
> >> On 6/1/20 7:02 AM, Prabhakar Kushwaha wrote:
> >>> Hi Chen,
> >>>
> >>> On Thu, May 21, 2020 at 3:05 PM Chen Zhou <chenzhou10@huawei.com> wrote:
> >>>> This patch series enable reserving crashkernel above 4G in arm64.
> >>>>
> >>>> There are following issues in arm64 kdump:
> >>>> 1. We use crashkernel=X to reserve crashkernel below 4G, which will fail
> >>>> when there is no enough low memory.
> >>>> 2. Currently, crashkernel=Y@X can be used to reserve crashkernel above 4G,
> >>>> in this case, if swiotlb or DMA buffers are required, crash dump kernel
> >>>> will boot failure because there is no low memory available for allocation.
> >>>>
> >>>> To solve these issues, introduce crashkernel=X,low to reserve specified
> >>>> size low memory.
> >>>> Crashkernel=X tries to reserve memory for the crash dump kernel under
> >>>> 4G. If crashkernel=Y,low is specified simultaneously, reserve spcified
> >>>> size low memory for crash kdump kernel devices firstly and then reserve
> >>>> memory above 4G.
> >>>>
> >>>> When crashkernel is reserved above 4G in memory, that is, crashkernel=X,low
> >>>> is specified simultaneously, kernel should reserve specified size low memory
> >>>> for crash dump kernel devices. So there may be two crash kernel regions, one
> >>>> is below 4G, the other is above 4G.
> >>>> In order to distinct from the high region and make no effect to the use of
> >>>> kexec-tools, rename the low region as "Crash kernel (low)", and add DT property
> >>>> "linux,low-memory-range" to crash dump kernel's dtb to pass the low region.
> >>>>
> >>>> Besides, we need to modify kexec-tools:
> >>>> arm64: kdump: add another DT property to crash dump kernel's dtb(see [1])
> >>>>
> >>>> The previous changes and discussions can be retrieved from:
> >>>>
> >>>> Changes since [v7]
> >>>> - Move x86 CRASH_ALIGN to 2M
> >>>> Suggested by Dave and do some test, move x86 CRASH_ALIGN to 2M.
> >>>> - Update Documentation/devicetree/bindings/chosen.txt
> >>>> Add corresponding documentation to Documentation/devicetree/bindings/chosen.txt suggested by Arnd.
> >>>> - Add Tested-by from Jhon and pk
> >>>>
> >>>> Changes since [v6]
> >>>> - Fix build errors reported by kbuild test robot.
> >>>>
> >>>> Changes since [v5]
> >>>> - Move reserve_crashkernel_low() into kernel/crash_core.c.
> >>>> - Delete crashkernel=X,high.
> >>>> - Modify crashkernel=X,low.
> >>>> If crashkernel=X,low is specified simultaneously, reserve spcified size low
> >>>> memory for crash kdump kernel devices firstly and then reserve memory above 4G.
> >>>> In addition, rename crashk_low_res as "Crash kernel (low)" for arm64, and then
> >>>> pass to crash dump kernel by DT property "linux,low-memory-range".
> >>>> - Update Documentation/admin-guide/kdump/kdump.rst.
> >>>>
> >>>> Changes since [v4]
> >>>> - Reimplement memblock_cap_memory_ranges for multiple ranges by Mike.
> >>>>
> >>>> Changes since [v3]
> >>>> - Add memblock_cap_memory_ranges back for multiple ranges.
> >>>> - Fix some compiling warnings.
> >>>>
> >>>> Changes since [v2]
> >>>> - Split patch "arm64: kdump: support reserving crashkernel above 4G" as
> >>>> two. Put "move reserve_crashkernel_low() into kexec_core.c" in a separate
> >>>> patch.
> >>>>
> >>>> Changes since [v1]:
> >>>> - Move common reserve_crashkernel_low() code into kernel/kexec_core.c.
> >>>> - Remove memblock_cap_memory_ranges() i added in v1 and implement that
> >>>> in fdt_enforce_memory_region().
> >>>> There are at most two crash kernel regions, for two crash kernel regions
> >>>> case, we cap the memory range [min(regs[*].start), max(regs[*].end)]
> >>>> and then remove the memory range in the middle.
> >>>>
> >>>> [1]: https://urldefense.com/v3/__http://lists.infradead.org/pipermail/kexec/2020-May/025128.html__;!!GqivPVa7Brio!LnTSARkCt0V0FozR0KmqooaH5ADtdXvs3mPdP3KRVqALmvSK2VmCkIPIhsaxbvpn1uM1$
> >>>> [v1]: https://urldefense.com/v3/__https://lkml.org/lkml/2019/4/2/1174__;!!GqivPVa7Brio!LnTSARkCt0V0FozR0KmqooaH5ADtdXvs3mPdP3KRVqALmvSK2VmCkIPIhsaxbt0xN9PE$
> >>>> [v2]: https://urldefense.com/v3/__https://lkml.org/lkml/2019/4/9/86__;!!GqivPVa7Brio!LnTSARkCt0V0FozR0KmqooaH5ADtdXvs3mPdP3KRVqALmvSK2VmCkIPIhsaxbub7yUQH$
> >>>> [v3]: https://urldefense.com/v3/__https://lkml.org/lkml/2019/4/9/306__;!!GqivPVa7Brio!LnTSARkCt0V0FozR0KmqooaH5ADtdXvs3mPdP3KRVqALmvSK2VmCkIPIhsaxbnc4zPPV$
> >>>> [v4]: https://urldefense.com/v3/__https://lkml.org/lkml/2019/4/15/273__;!!GqivPVa7Brio!LnTSARkCt0V0FozR0KmqooaH5ADtdXvs3mPdP3KRVqALmvSK2VmCkIPIhsaxbvsAsZLu$
> >>>> [v5]: https://urldefense.com/v3/__https://lkml.org/lkml/2019/5/6/1360__;!!GqivPVa7Brio!LnTSARkCt0V0FozR0KmqooaH5ADtdXvs3mPdP3KRVqALmvSK2VmCkIPIhsaxbl24n-79$
> >>>> [v6]: https://urldefense.com/v3/__https://lkml.org/lkml/2019/8/30/142__;!!GqivPVa7Brio!LnTSARkCt0V0FozR0KmqooaH5ADtdXvs3mPdP3KRVqALmvSK2VmCkIPIhsaxbs7r8G2a$
> >>>> [v7]: https://urldefense.com/v3/__https://lkml.org/lkml/2019/12/23/411__;!!GqivPVa7Brio!LnTSARkCt0V0FozR0KmqooaH5ADtdXvs3mPdP3KRVqALmvSK2VmCkIPIhsaxbiFUH90G$
> >>>>
> >>>> Chen Zhou (5):
> >>>>   x86: kdump: move reserve_crashkernel_low() into crash_core.c
> >>>>   arm64: kdump: reserve crashkenel above 4G for crash dump kernel
> >>>>   arm64: kdump: add memory for devices by DT property, low-memory-range
> >>>>   kdump: update Documentation about crashkernel on arm64
> >>>>   dt-bindings: chosen: Document linux,low-memory-range for arm64 kdump
> >>>>
> >>> We are getting "warn_alloc" [1] warning during boot of kdump kernel
> >>> with bootargs as [2] of primary kernel.
> >>> This error observed on ThunderX2  ARM64 platform.
> >>>
> >>> It is observed with latest upstream tag (v5.7-rc3) with this patch set
> >>>  and https://urldefense.com/v3/__https://lists.infradead.org/pipermail/kexec/2020-May/025128.html__;!!GqivPVa7Brio!LnTSARkCt0V0FozR0KmqooaH5ADtdXvs3mPdP3KRVqALmvSK2VmCkIPIhsaxbiIAAlzu$
> >>> Also **without** this patch-set
> >>> "https://urldefense.com/v3/__https://www.spinics.net/lists/arm-kernel/msg806882.html__;!!GqivPVa7Brio!LnTSARkCt0V0FozR0KmqooaH5ADtdXvs3mPdP3KRVqALmvSK2VmCkIPIhsaxbjC6ujMA$"
> >>>
> >>> This issue comes whenever crashkernel memory is reserved after 0xc000_0000.
> >>> More details discussed earlier in
> >>> https://urldefense.com/v3/__https://www.spinics.net/lists/arm-kernel/msg806882.html__;!!GqivPVa7Brio!LnTSARkCt0V0FozR0KmqooaH5ADtdXvs3mPdP3KRVqALmvSK2VmCkIPIhsaxbjC6ujMA$  without any
> >>> solution
> >>>
> >>> This patch-set is expected to solve similar kind of issue.
> >>> i.e. low memory is only targeted for DMA, swiotlb; So above mentioned
> >>> observation should be considered/fixed. .
> >>>
> >>> --pk
> >>>
> >>> [1]
> >>> [   30.366695] DMI: Cavium Inc. Saber/Saber, BIOS
> >>> TX2-FW-Release-3.1-build_01-2803-g74253a541a mm/dd/yyyy
> >>> [   30.367696] NET: Registered protocol family 16
> >>> [   30.369973] swapper/0: page allocation failure: order:6,
> >>> mode:0x1(GFP_DMA), nodemask=(null),cpuset=/,mems_allowed=0
> >>> [   30.369980] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.7.0-rc3+ #121
> >>> [   30.369981] Hardware name: Cavium Inc. Saber/Saber, BIOS
> >>> TX2-FW-Release-3.1-build_01-2803-g74253a541a mm/dd/yyyy
> >>> [   30.369984] Call trace:
> >>> [   30.369989]  dump_backtrace+0x0/0x1f8
> >>> [   30.369991]  show_stack+0x20/0x30
> >>> [   30.369997]  dump_stack+0xc0/0x10c
> >>> [   30.370001]  warn_alloc+0x10c/0x178
> >>> [   30.370004]  __alloc_pages_slowpath.constprop.111+0xb10/0xb50
> >>> [   30.370006]  __alloc_pages_nodemask+0x2b4/0x300
> >>> [   30.370008]  alloc_page_interleave+0x24/0x98
> >>> [   30.370011]  alloc_pages_current+0xe4/0x108
> >>> [   30.370017]  dma_atomic_pool_init+0x44/0x1a4
> >>> [   30.370020]  do_one_initcall+0x54/0x228
> >>> [   30.370027]  kernel_init_freeable+0x228/0x2cc
> >>> [   30.370031]  kernel_init+0x1c/0x110
> >>> [   30.370034]  ret_from_fork+0x10/0x18
> >>> [   30.370036] Mem-Info:
> >>> [   30.370064] active_anon:0 inactive_anon:0 isolated_anon:0
> >>> [   30.370064]  active_file:0 inactive_file:0 isolated_file:0
> >>> [   30.370064]  unevictable:0 dirty:0 writeback:0 unstable:0
> >>> [   30.370064]  slab_reclaimable:34 slab_unreclaimable:4438
> >>> [   30.370064]  mapped:0 shmem:0 pagetables:14 bounce:0
> >>> [   30.370064]  free:1537719 free_pcp:219 free_cma:0
> >>> [   30.370070] Node 0 active_anon:0kB inactive_anon:0kB
> >>> active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB
> >>> isolated(file):0kB mapped:0kB dirty:0kB writeback:0kB shmem:0kB
> >>> shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB
> >>> unstable:0kB all_unreclaimable? no
> >>> [   30.370073] Node 1 active_anon:0kB inactive_anon:0kB
> >>> active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB
> >>> isolated(file):0kB mapped:0kB dirty:0kB writeback:0kB shmem:0kB
> >>> shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB
> >>> unstable:0kB all_unreclaimable? no
> >>> [   30.370079] Node 0 DMA free:0kB min:0kB low:0kB high:0kB
> >>> reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB
> >>> active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB
> >>> present:128kB managed:0kB mlocked:0kB kernel_stack:0kB pagetables:0kB
> >>> bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> >>> [   30.370084] lowmem_reserve[]: 0 250 6063 6063
> >>> [   30.370090] Node 0 DMA32 free:256000kB min:408kB low:664kB
> >>> high:920kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB
> >>> active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB
> >>> present:269700kB managed:256000kB mlocked:0kB kernel_stack:0kB
> >>> pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> >>> [   30.370094] lowmem_reserve[]: 0 0 5813 5813
> >>> [   30.370100] Node 0 Normal free:5894876kB min:9552kB low:15504kB
> >>> high:21456kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB
> >>> active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB
> >>> present:8388608kB managed:5953112kB mlocked:0kB kernel_stack:21672kB
> >>> pagetables:56kB bounce:0kB free_pcp:876kB local_pcp:176kB free_cma:0kB
> >>> [   30.370104] lowmem_reserve[]: 0 0 0 0
> >>> [   30.370107] Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB
> >>> 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 0kB
> >>> [   30.370113] Node 0 DMA32: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB
> >>> 0*256kB 0*512kB 0*1024kB 1*2048kB (M) 62*4096kB (M) = 256000kB
> >>> [   30.370119] Node 0 Normal: 2*4kB (M) 3*8kB (ME) 2*16kB (UE) 3*32kB
> >>> (UM) 1*64kB (U) 2*128kB (M) 2*256kB (ME) 3*512kB (ME) 3*1024kB (ME)
> >>> 3*2048kB (UME) 1436*4096kB (M) = 5893600kB
> >>> [   30.370129] Node 0 hugepages_total=0 hugepages_free=0
> >>> hugepages_surp=0 hugepages_size=1048576kB
> >>> [   30.370130] 0 total pagecache pages
> >>> [   30.370132] 0 pages in swap cache
> >>> [   30.370134] Swap cache stats: add 0, delete 0, find 0/0
> >>> [   30.370135] Free swap  = 0kB
> >>> [   30.370136] Total swap = 0kB
> >>> [   30.370137] 2164609 pages RAM
> >>> [   30.370139] 0 pages HighMem/MovableOnly
> >>> [   30.370140] 612331 pages reserved
> >>> [   30.370141] 0 pages hwpoisoned
> >>> [   30.370143] DMA: failed to allocate 256 KiB pool for atomic
> >>> coherent allocation
> >>
> >>
> >> During my testing I saw the same error and Chen's  solution corrected it .
> >
> > Which combination you are using on your side? I am using Prabhakar's
> > suggested environment and can reproduce the issue
> > with or without Chen's crashkernel support above 4G patchset.
> >
> > I am also using a ThunderX2 platform with latest makedumpfile code and
> > kexec-tools (with the suggested patch
> > <https://urldefense.com/v3/__https://lists.infradead.org/pipermail/kexec/2020-May/025128.html__;!!GqivPVa7Brio!J6lUig58-Gw6TKZnEEYzEeSU36T-1SqlB1kImU00xtX_lss5Tx-JbUmLE9TJC3foXBLg$ >).
> >
> > Thanks,
> > Bhupesh
>
>
> I did this activity 5 months ago and I have moved on to other activities. My DMA failures were related to PCI devices that could not be enumerated because  low-DMA space was not  available  when crashkernel was moved above 4G; I don’t recall the exact platform.
>
>
>
> For this failure ,
>
> >>>  DMA: failed to allocate 256 KiB pool for atomic
> >>> coherent allocation
>
>
> Is due to :
>
>
>  3618082c
>  ("arm64 use both ZONE_DMA and ZONE_DMA32")
>
> With the introduction of ZONE_DMA to support the Raspberry DMA
> region below 1G, the crashkernel is placed in the upper 4G
> ZONE_DMA_32 region. Since the crashkernel does not have access
> to the ZONE_DMA region, it prints out call trace during bootup.
>
> It is due to having this CONFIG item  ON  :
>
>
> CONFIG_ZONE_DMA=y
>
> Turning off ZONE_DMA fixes a issue and Raspberry PI 4 will
> use the device tree to specify memory below 1G.
>
>

Disabling ZONE_DMA is temporary solution.  We may need proper solution

> I would like to see Chen’s feature added , perhaps as EXPERIMENTAL,  so we can get some configuration testing done on it.   It corrects having a DMA zone in low memory while crash-kernel is above 4GB.  This has been going on for a year now.

I will also like this patch to be added in Linux as early as possible.

Issue mentioned by me happens with or without this patch.

This patch-set can consider fixing because it uses low memory for DMA
& swiotlb only.
We can consider restricting crashkernel within the required range like below

diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 7f9e5a6dc48c..bd67b90d35bd 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -354,7 +354,7 @@ int __init reserve_crashkernel_low(void)
                        return 0;
        }

-       low_base = memblock_find_in_range(0, 1ULL << 32, low_size, CRASH_ALIGN);
+       low_base = memblock_find_in_range(0, 0xc0000000, low_size, CRASH_ALIGN);
        if (!low_base) {
                pr_err("Cannot reserve %ldMB crashkernel low memory,
please try smaller size.\n",
                       (unsigned long)(low_size >> 20));


Similar change can be considered for scenario "without" this patch.
But it will decrease memory availability for crashkernel.
Hence increase the failure probability of crashkernel reservation.

--pk

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

  reply	other threads:[~2020-06-02  5:39 UTC|newest]

Thread overview: 102+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-21  9:38 [PATCH v8 0/5] support reserving crashkernel above 4G on arm64 kdump Chen Zhou
2020-05-21  9:38 ` Chen Zhou
2020-05-21  9:38 ` Chen Zhou
2020-05-21  9:38 ` [PATCH v8 1/5] x86: kdump: move reserve_crashkernel_low() into crash_core.c Chen Zhou
2020-05-21  9:38   ` Chen Zhou
2020-05-21  9:38   ` Chen Zhou
2020-05-26  0:56   ` Baoquan He
2020-05-26  0:56     ` Baoquan He
2020-05-26  0:56     ` Baoquan He
2020-05-21  9:38 ` [PATCH v8 2/5] arm64: kdump: reserve crashkenel above 4G for crash dump kernel Chen Zhou
2020-05-21  9:38   ` Chen Zhou
2020-05-21  9:38   ` Chen Zhou
2020-05-26  0:59   ` Baoquan He
2020-05-26  0:59     ` Baoquan He
2020-05-26  0:59     ` Baoquan He
2020-05-21  9:38 ` [PATCH v8 3/5] arm64: kdump: add memory for devices by DT property, low-memory-range Chen Zhou
2020-05-21  9:38   ` Chen Zhou
2020-05-21  9:38   ` Chen Zhou
2020-05-21  9:38 ` [PATCH v8 4/5] kdump: update Documentation about crashkernel on arm64 Chen Zhou
2020-05-21  9:38   ` Chen Zhou
2020-05-21  9:38   ` Chen Zhou
2020-05-21  9:38 ` [PATCH v8 5/5] dt-bindings: chosen: Document linux,low-memory-range for arm64 kdump Chen Zhou
2020-05-21  9:38   ` [PATCH v8 5/5] dt-bindings: chosen: Document linux, low-memory-range " Chen Zhou
2020-05-21  9:38   ` Chen Zhou
2020-05-21 13:29   ` [PATCH v8 5/5] dt-bindings: chosen: Document linux,low-memory-range " Rob Herring
2020-05-21 13:29     ` [PATCH v8 5/5] dt-bindings: chosen: Document linux, low-memory-range " Rob Herring
2020-05-21 13:29     ` Rob Herring
2020-05-22  3:24     ` [PATCH v8 5/5] dt-bindings: chosen: Document linux,low-memory-range " chenzhou
2020-05-22  3:24       ` chenzhou
2020-05-22  3:24       ` chenzhou
2020-05-26 21:18       ` Rob Herring
2020-05-26 21:18         ` Rob Herring
2020-05-26 21:18         ` Rob Herring
2020-05-29 16:11         ` James Morse
2020-05-29 16:11           ` James Morse
2020-05-29 16:11           ` James Morse
2020-06-20  3:54           ` chenzhou
2020-06-20  3:54             ` chenzhou
2020-06-20  3:54             ` chenzhou
2020-05-26  1:42 ` [PATCH v8 0/5] support reserving crashkernel above 4G on " Baoquan He
2020-05-26  1:42   ` Baoquan He
2020-05-26  1:42   ` Baoquan He
2020-05-26  2:28   ` chenzhou
2020-05-26  2:28     ` chenzhou
2020-05-26  2:28     ` chenzhou
2020-05-28 22:20   ` John Donnelly
2020-05-28 22:20     ` John Donnelly
2020-05-28 22:20     ` John Donnelly
2020-05-29  8:05     ` Will Deacon
2020-05-29  8:05       ` Will Deacon
2020-05-29  8:05       ` Will Deacon
2020-06-01 12:02 ` Prabhakar Kushwaha
2020-06-01 12:02   ` Prabhakar Kushwaha
2020-06-01 12:02   ` Prabhakar Kushwaha
2020-06-01 19:30   ` John Donnelly
2020-06-01 19:30     ` John Donnelly
2020-06-01 19:30     ` John Donnelly
2020-06-01 21:02     ` Bhupesh Sharma
2020-06-01 21:02       ` Bhupesh Sharma
2020-06-01 21:02       ` Bhupesh Sharma
2020-06-01 21:59       ` John Donnelly
2020-06-01 21:59         ` John Donnelly
2020-06-01 21:59         ` John Donnelly
2020-06-02  5:38         ` Prabhakar Kushwaha [this message]
2020-06-02  5:38           ` Prabhakar Kushwaha
2020-06-02  5:38           ` Prabhakar Kushwaha
2020-06-02 14:41           ` John Donnelly
2020-06-02 14:41             ` John Donnelly
2020-06-02 14:41             ` John Donnelly
2020-06-03 11:47             ` Prabhakar Kushwaha
2020-06-03 11:47               ` Prabhakar Kushwaha
2020-06-03 11:47               ` Prabhakar Kushwaha
2020-06-03 13:20               ` chenzhou
2020-06-03 13:20                 ` chenzhou
2020-06-03 13:20                 ` chenzhou
2020-06-03 15:30                 ` John Donnelly
2020-06-03 15:30                   ` John Donnelly
2020-06-03 15:30                   ` John Donnelly
2020-06-03 19:47                   ` Bhupesh Sharma
2020-06-03 19:47                     ` Bhupesh Sharma
2020-06-03 19:47                     ` Bhupesh Sharma
2020-06-04  7:14                     ` Will Deacon
2020-06-04  7:14                       ` Will Deacon
2020-06-04  7:14                       ` Will Deacon
2020-06-04 17:01                     ` Nicolas Saenz Julienne
2020-06-04 17:01                       ` Nicolas Saenz Julienne
2020-06-04 17:01                       ` Nicolas Saenz Julienne
2020-06-05  2:26                       ` John Donnelly
2020-06-05  2:26                         ` John Donnelly
2020-06-05  2:26                         ` John Donnelly
2020-06-05  8:21                         ` Nicolas Saenz Julienne
2020-06-05  8:21                           ` Nicolas Saenz Julienne
2020-06-05  8:21                           ` Nicolas Saenz Julienne
2020-06-19  2:32                       ` John Donnelly
2020-06-19  2:32                         ` John Donnelly
2020-06-19  2:32                         ` John Donnelly
2020-06-19  8:21                         ` chenzhou
2020-06-19  8:21                           ` chenzhou
2020-06-19  8:21                           ` chenzhou
2020-06-20  0:01                           ` John Donnelly
2020-06-20  0:01                             ` John Donnelly
2020-06-20  0:01                             ` John Donnelly

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAJ2QiJJE-jeRL1HPUZCwi1LtV9CBMmYrsOaS6vX1R1sJ6Z1t8g@mail.gmail.com \
    --to=prabhakar.pkin@gmail.com \
    --cc=arnd@arndb.de \
    --cc=bhe@redhat.com \
    --cc=bhsharma@redhat.com \
    --cc=catalin.marinas@arm.com \
    --cc=chenzhou10@huawei.com \
    --cc=devicetree@vger.kernel.org \
    --cc=dyoung@redhat.com \
    --cc=guohanjun@huawei.com \
    --cc=horms@verge.net.au \
    --cc=james.morse@arm.com \
    --cc=john.p.donnelly@oracle.com \
    --cc=kexec@lists.infradead.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=pkushwaha@marvell.com \
    --cc=robh+dt@kernel.org \
    --cc=tglx@linutronix.de \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.