devicetree.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: chenzhou <chenzhou10@huawei.com>
To: Prabhakar Kushwaha <prabhakar.pkin@gmail.com>,
	John Donnelly <john.p.donnelly@oracle.com>
Cc: Bhupesh Sharma <bhsharma@redhat.com>,
	Simon Horman <horms@verge.net.au>,
	Devicetree List <devicetree@vger.kernel.org>,
	Baoquan He <bhe@redhat.com>, Will Deacon <will@kernel.org>,
	Linux Doc Mailing List <linux-doc@vger.kernel.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	"kexec mailing list" <kexec@lists.infradead.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Rob Herring <robh+dt@kernel.org>, Ingo Molnar <mingo@redhat.com>,
	Arnd Bergmann <arnd@arndb.de>, <guohanjun@huawei.com>,
	James Morse <james.morse@arm.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Prabhakar Kushwaha <pkushwaha@marvell.com>,
	RuiRui Yang <dyoung@redhat.com>,
	linux-arm-kernel <linux-arm-kernel@lists.infradead.org>
Subject: Re: [PATCH v8 0/5] support reserving crashkernel above 4G on arm64 kdump
Date: Wed, 3 Jun 2020 21:20:18 +0800	[thread overview]
Message-ID: <8463464e-5461-f328-621c-bacc6a3b88dd@huawei.com> (raw)
In-Reply-To: <CAJ2QiJJhUCnobrMHui5=6zLzgy3KsoPxrqiH_oYT8Jhb5MkmbA@mail.gmail.com>

Hi,


On 2020/6/3 19:47, Prabhakar Kushwaha wrote:
> Hi Chen,
>
> On Tue, Jun 2, 2020 at 8:12 PM John Donnelly <john.p.donnelly@oracle.com> wrote:
>>
>>
>>> On Jun 2, 2020, at 12:38 AM, Prabhakar Kushwaha <prabhakar.pkin@gmail.com> wrote:
>>>
>>> On Tue, Jun 2, 2020 at 3:29 AM John Donnelly <john.p.donnelly@oracle.com> wrote:
>>>> Hi .  See below !
>>>>
>>>>> On Jun 1, 2020, at 4:02 PM, Bhupesh Sharma <bhsharma@redhat.com> wrote:
>>>>>
>>>>> Hi John,
>>>>>
>>>>> On Tue, Jun 2, 2020 at 1:01 AM John Donnelly <John.P.donnelly@oracle.com> wrote:
>>>>>> Hi,
>>>>>>
>>>>>>
>>>>>> On 6/1/20 7:02 AM, Prabhakar Kushwaha wrote:
>>>>>>> Hi Chen,
>>>>>>>
>>>>>>> On Thu, May 21, 2020 at 3:05 PM Chen Zhou <chenzhou10@huawei.com> wrote:
>>>>>>>> This patch series enable reserving crashkernel above 4G in arm64.
>>>>>>>>
>>>>>>>> There are following issues in arm64 kdump:
>>>>>>>> 1. We use crashkernel=X to reserve crashkernel below 4G, which will fail
>>>>>>>> when there is no enough low memory.
>>>>>>>> 2. Currently, crashkernel=Y@X can be used to reserve crashkernel above 4G,
>>>>>>>> in this case, if swiotlb or DMA buffers are required, crash dump kernel
>>>>>>>> will boot failure because there is no low memory available for allocation.
>>>>>>>>
>>>>>>> We are getting "warn_alloc" [1] warning during boot of kdump kernel
>>>>>>> with bootargs as [2] of primary kernel.
>>>>>>> This error observed on ThunderX2  ARM64 platform.
>>>>>>>
>>>>>>> It is observed with latest upstream tag (v5.7-rc3) with this patch set
>>>>>>> and https://urldefense.com/v3/__https://lists.infradead.org/pipermail/kexec/2020-May/025128.html__;!!GqivPVa7Brio!LnTSARkCt0V0FozR0KmqooaH5ADtdXvs3mPdP3KRVqALmvSK2VmCkIPIhsaxbiIAAlzu$
>>>>>>> Also **without** this patch-set
>>>>>>> "https://urldefense.com/v3/__https://www.spinics.net/lists/arm-kernel/msg806882.html__;!!GqivPVa7Brio!LnTSARkCt0V0FozR0KmqooaH5ADtdXvs3mPdP3KRVqALmvSK2VmCkIPIhsaxbjC6ujMA$"
>>>>>>>
>>>>>>> This issue comes whenever crashkernel memory is reserved after 0xc000_0000.
>>>>>>> More details discussed earlier in
>>>>>>> https://urldefense.com/v3/__https://www.spinics.net/lists/arm-kernel/msg806882.html__;!!GqivPVa7Brio!LnTSARkCt0V0FozR0KmqooaH5ADtdXvs3mPdP3KRVqALmvSK2VmCkIPIhsaxbjC6ujMA$  without any
>>>>>>> solution
>>>>>>>
>>>>>>> This patch-set is expected to solve similar kind of issue.
>>>>>>> i.e. low memory is only targeted for DMA, swiotlb; So above mentioned
>>>>>>> observation should be considered/fixed. .
>>>>>>>
>>>>>>> --pk
>>>>>>>
>>>>>>> [1]
>>>>>>> [   30.366695] DMI: Cavium Inc. Saber/Saber, BIOS
>>>>>>> TX2-FW-Release-3.1-build_01-2803-g74253a541a mm/dd/yyyy
>>>>>>> [   30.367696] NET: Registered protocol family 16
>>>>>>> [   30.369973] swapper/0: page allocation failure: order:6,
>>>>>>> mode:0x1(GFP_DMA), nodemask=(null),cpuset=/,mems_allowed=0
>>>>>>> [   30.369980] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.7.0-rc3+ #121
>>>>>>> [   30.369981] Hardware name: Cavium Inc. Saber/Saber, BIOS
>>>>>>> TX2-FW-Release-3.1-build_01-2803-g74253a541a mm/dd/yyyy
>>>>>>> [   30.369984] Call trace:
>>>>>>> [   30.369989]  dump_backtrace+0x0/0x1f8
>>>>>>> [   30.369991]  show_stack+0x20/0x30
>>>>>>> [   30.369997]  dump_stack+0xc0/0x10c
>>>>>>> [   30.370001]  warn_alloc+0x10c/0x178
>>>>>>> [   30.370004]  __alloc_pages_slowpath.constprop.111+0xb10/0xb50
>>>>>>> [   30.370006]  __alloc_pages_nodemask+0x2b4/0x300
>>>>>>> [   30.370008]  alloc_page_interleave+0x24/0x98
>>>>>>> [   30.370011]  alloc_pages_current+0xe4/0x108
>>>>>>> [   30.370017]  dma_atomic_pool_init+0x44/0x1a4
>>>>>>> [   30.370020]  do_one_initcall+0x54/0x228
>>>>>>> [   30.370027]  kernel_init_freeable+0x228/0x2cc
>>>>>>> [   30.370031]  kernel_init+0x1c/0x110
>>>>>>> [   30.370034]  ret_from_fork+0x10/0x18
>>>>>>> [   30.370036] Mem-Info:
>>>>>>> [   30.370064] active_anon:0 inactive_anon:0 isolated_anon:0
>>>>>>> [   30.370064]  active_file:0 inactive_file:0 isolated_file:0
>>>>>>> [   30.370064]  unevictable:0 dirty:0 writeback:0 unstable:0
>>>>>>> [   30.370064]  slab_reclaimable:34 slab_unreclaimable:4438
>>>>>>> [   30.370064]  mapped:0 shmem:0 pagetables:14 bounce:0
>>>>>>> [   30.370064]  free:1537719 free_pcp:219 free_cma:0
>>>>>>> [   30.370070] Node 0 active_anon:0kB inactive_anon:0kB
>>>>>>> active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB
>>>>>>> isolated(file):0kB mapped:0kB dirty:0kB writeback:0kB shmem:0kB
>>>>>>> shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB
>>>>>>> unstable:0kB all_unreclaimable? no
>>>>>>> [   30.370073] Node 1 active_anon:0kB inactive_anon:0kB
>>>>>>> active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB
>>>>>>> isolated(file):0kB mapped:0kB dirty:0kB writeback:0kB shmem:0kB
>>>>>>> shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB writeback_tmp:0kB
>>>>>>> unstable:0kB all_unreclaimable? no
>>>>>>> [   30.370079] Node 0 DMA free:0kB min:0kB low:0kB high:0kB
>>>>>>> reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB
>>>>>>> active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB
>>>>>>> present:128kB managed:0kB mlocked:0kB kernel_stack:0kB pagetables:0kB
>>>>>>> bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
>>>>>>> [   30.370084] lowmem_reserve[]: 0 250 6063 6063
>>>>>>> [   30.370090] Node 0 DMA32 free:256000kB min:408kB low:664kB
>>>>>>> high:920kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB
>>>>>>> active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB
>>>>>>> present:269700kB managed:256000kB mlocked:0kB kernel_stack:0kB
>>>>>>> pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
>>>>>>> [   30.370094] lowmem_reserve[]: 0 0 5813 5813
>>>>>>> [   30.370100] Node 0 Normal free:5894876kB min:9552kB low:15504kB
>>>>>>> high:21456kB reserved_highatomic:0KB active_anon:0kB inactive_anon:0kB
>>>>>>> active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB
>>>>>>> present:8388608kB managed:5953112kB mlocked:0kB kernel_stack:21672kB
>>>>>>> pagetables:56kB bounce:0kB free_pcp:876kB local_pcp:176kB free_cma:0kB
>>>>>>> [   30.370104] lowmem_reserve[]: 0 0 0 0
>>>>>>> [   30.370107] Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB
>>>>>>> 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 0kB
>>>>>>> [   30.370113] Node 0 DMA32: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB
>>>>>>> 0*256kB 0*512kB 0*1024kB 1*2048kB (M) 62*4096kB (M) = 256000kB
>>>>>>> [   30.370119] Node 0 Normal: 2*4kB (M) 3*8kB (ME) 2*16kB (UE) 3*32kB
>>>>>>> (UM) 1*64kB (U) 2*128kB (M) 2*256kB (ME) 3*512kB (ME) 3*1024kB (ME)
>>>>>>> 3*2048kB (UME) 1436*4096kB (M) = 5893600kB
>>>>>>> [   30.370129] Node 0 hugepages_total=0 hugepages_free=0
>>>>>>> hugepages_surp=0 hugepages_size=1048576kB
>>>>>>> [   30.370130] 0 total pagecache pages
>>>>>>> [   30.370132] 0 pages in swap cache
>>>>>>> [   30.370134] Swap cache stats: add 0, delete 0, find 0/0
>>>>>>> [   30.370135] Free swap  = 0kB
>>>>>>> [   30.370136] Total swap = 0kB
>>>>>>> [   30.370137] 2164609 pages RAM
>>>>>>> [   30.370139] 0 pages HighMem/MovableOnly
>>>>>>> [   30.370140] 612331 pages reserved
>>>>>>> [   30.370141] 0 pages hwpoisoned
>>>>>>> [   30.370143] DMA: failed to allocate 256 KiB pool for atomic
>>>>>>> coherent allocation
>>>>>>
>>>>>> During my testing I saw the same error and Chen's  solution corrected it .
>>>>> Which combination you are using on your side? I am using Prabhakar's
>>>>> suggested environment and can reproduce the issue
>>>>> with or without Chen's crashkernel support above 4G patchset.
>>>>>
>>>>> I am also using a ThunderX2 platform with latest makedumpfile code and
>>>>> kexec-tools (with the suggested patch
>>>>> <https://urldefense.com/v3/__https://lists.infradead.org/pipermail/kexec/2020-May/025128.html__;!!GqivPVa7Brio!J6lUig58-Gw6TKZnEEYzEeSU36T-1SqlB1kImU00xtX_lss5Tx-JbUmLE9TJC3foXBLg$ >).
>>>>>
>>>>> Thanks,
>>>>> Bhupesh
>>>>
>>>> I did this activity 5 months ago and I have moved on to other activities. My DMA failures were related to PCI devices that could not be enumerated because  low-DMA space was not  available when crashkernel was moved above 4G; I don’t recall the exact platform.
>>>>
>>>>
>>>>
>>>> For this failure ,
>>>>
>>>>>>> DMA: failed to allocate 256 KiB pool for atomic
>>>>>>> coherent allocation
>>>>
>>>> Is due to :
>>>>
>>>>
>>>> 3618082c
>>>> ("arm64 use both ZONE_DMA and ZONE_DMA32")
>>>>
>>>> With the introduction of ZONE_DMA to support the Raspberry DMA
>>>> region below 1G, the crashkernel is placed in the upper 4G
>>>> ZONE_DMA_32 region. Since the crashkernel does not have access
>>>> to the ZONE_DMA region, it prints out call trace during bootup.
>>>>
>>>> It is due to having this CONFIG item  ON  :
>>>>
>>>>
>>>> CONFIG_ZONE_DMA=y
>>>>
>>>> Turning off ZONE_DMA fixes a issue and Raspberry PI 4 will
>>>> use the device tree to specify memory below 1G.
>>>>
>>>>
>>> Disabling ZONE_DMA is temporary solution.  We may need proper solution
>>
>> Perhaps the Raspberry platform configuration dependencies need separated  from “server class” Arm  equipment ?  Or auto-configured on boot ?  Consult an expert ;-)
>>
>>
>>
>>>> I would like to see Chen’s feature added , perhaps as EXPERIMENTAL,  so we can get some configuration testing done on it.   It corrects having a DMA zone in low memory while crash-kernel is above 4GB.  This has been going on for a year now.
>>> I will also like this patch to be added in Linux as early as possible.
>>>
>>> Issue mentioned by me happens with or without this patch.
>>>
>>> This patch-set can consider fixing because it uses low memory for DMA
>>> & swiotlb only.
>>> We can consider restricting crashkernel within the required range like below
>>>
>>> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
>>> index 7f9e5a6dc48c..bd67b90d35bd 100644
>>> --- a/kernel/crash_core.c
>>> +++ b/kernel/crash_core.c
>>> @@ -354,7 +354,7 @@ int __init reserve_crashkernel_low(void)
>>>                        return 0;
>>>        }
>>>
>>> -       low_base = memblock_find_in_range(0, 1ULL << 32, low_size, CRASH_ALIGN);
>>> +       low_base = memblock_find_in_range(0,0xc0000000, low_size, CRASH_ALIGN);
>>>        if (!low_base) {
>>>                pr_err("Cannot reserve %ldMB crashkernel low memory,
>>> please try smaller size.\n",
>>>                       (unsigned long)(low_size >> 20));
>>>
>>>
>>     I suspect  0xc0000000  would need to be a CONFIG item  and not hard-coded.
>>
> if you consider this as valid change,  can you please incorporate as
> part of your patch-set.

After commit 1a8e1cef7 ("arm64: use both ZONE_DMA and ZONE_DMA32"),the 0-4G memory is splited
to DMA [mem 0x0000000000000000-0x000000003fffffff] and DMA32 [mem 0x0000000040000000-0x00000000ffffffff] on arm64.

From the above discussion, on your platform, the low crashkernel fall in DMA32 region, but your environment needs to access DMA
region, so there is the call trace.

I have a question, why do you choose 0xc0000000 here?

Besides, this is common code, we also need to consider about x86.

Thanks,
Chen Zhou

>
> --pk.
>
> .
>



  reply	other threads:[~2020-06-03 13:20 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-05-21  9:38 [PATCH v8 0/5] support reserving crashkernel above 4G on arm64 kdump Chen Zhou
2020-05-21  9:38 ` [PATCH v8 1/5] x86: kdump: move reserve_crashkernel_low() into crash_core.c Chen Zhou
2020-05-26  0:56   ` Baoquan He
2020-05-21  9:38 ` [PATCH v8 2/5] arm64: kdump: reserve crashkenel above 4G for crash dump kernel Chen Zhou
2020-05-26  0:59   ` Baoquan He
2020-05-21  9:38 ` [PATCH v8 3/5] arm64: kdump: add memory for devices by DT property, low-memory-range Chen Zhou
2020-05-21  9:38 ` [PATCH v8 4/5] kdump: update Documentation about crashkernel on arm64 Chen Zhou
2020-05-21  9:38 ` [PATCH v8 5/5] dt-bindings: chosen: Document linux,low-memory-range for arm64 kdump Chen Zhou
2020-05-21 13:29   ` Rob Herring
2020-05-22  3:24     ` chenzhou
2020-05-26 21:18       ` Rob Herring
2020-05-29 16:11         ` James Morse
2020-06-20  3:54           ` chenzhou
2020-05-26  1:42 ` [PATCH v8 0/5] support reserving crashkernel above 4G on " Baoquan He
2020-05-26  2:28   ` chenzhou
2020-05-28 22:20   ` John Donnelly
2020-05-29  8:05     ` Will Deacon
2020-06-01 12:02 ` Prabhakar Kushwaha
2020-06-01 19:30   ` John Donnelly
2020-06-01 21:02     ` Bhupesh Sharma
2020-06-01 21:59       ` John Donnelly
2020-06-02  5:38         ` Prabhakar Kushwaha
2020-06-02 14:41           ` John Donnelly
2020-06-03 11:47             ` Prabhakar Kushwaha
2020-06-03 13:20               ` chenzhou [this message]
2020-06-03 15:30                 ` John Donnelly
2020-06-03 19:47                   ` Bhupesh Sharma
2020-06-04  7:14                     ` Will Deacon
2020-06-04 17:01                     ` Nicolas Saenz Julienne
2020-06-05  2:26                       ` John Donnelly
2020-06-05  8:21                         ` Nicolas Saenz Julienne
2020-06-19  2:32                       ` John Donnelly
2020-06-19  8:21                         ` chenzhou
2020-06-20  0:01                           ` John Donnelly

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8463464e-5461-f328-621c-bacc6a3b88dd@huawei.com \
    --to=chenzhou10@huawei.com \
    --cc=arnd@arndb.de \
    --cc=bhe@redhat.com \
    --cc=bhsharma@redhat.com \
    --cc=catalin.marinas@arm.com \
    --cc=devicetree@vger.kernel.org \
    --cc=dyoung@redhat.com \
    --cc=guohanjun@huawei.com \
    --cc=horms@verge.net.au \
    --cc=james.morse@arm.com \
    --cc=john.p.donnelly@oracle.com \
    --cc=kexec@lists.infradead.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=pkushwaha@marvell.com \
    --cc=prabhakar.pkin@gmail.com \
    --cc=robh+dt@kernel.org \
    --cc=tglx@linutronix.de \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).