From: "Leizhen (ThunderTown)" <thunder.leizhen@huawei.com>
To: Baoquan He <bhe@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	<x86@kernel.org>, "H . Peter Anvin" <hpa@zytor.com>,
	<linux-kernel@vger.kernel.org>, Dave Young <dyoung@redhat.com>,
	Vivek Goyal <vgoyal@redhat.com>,
	Eric Biederman <ebiederm@xmission.com>,
	<kexec@lists.infradead.org>, Will Deacon <will@kernel.org>,
	<linux-arm-kernel@lists.infradead.org>,
	Rob Herring <robh+dt@kernel.org>,
	Frank Rowand <frowand.list@gmail.com>,
	<devicetree@vger.kernel.org>, "Jonathan Corbet" <corbet@lwn.net>,
	<linux-doc@vger.kernel.org>, Randy Dunlap <rdunlap@infradead.org>,
	Feng Zhou <zhoufeng.zf@bytedance.com>,
	Kefeng Wang <wangkefeng.wang@huawei.com>,
	Chen Zhou <dingguo.cz@antgroup.com>,
	"John Donnelly" <John.p.donnelly@oracle.com>,
	Dave Kleikamp <dave.kleikamp@oracle.com>
Subject: Re: [PATCH v24 3/6] arm64: kdump: Reimplement crashkernel=X
Date: Sat, 7 May 2022 19:49:56 +0800
Message-ID: <0c7e91fb-10a3-f7e6-e856-0c865c71527b@huawei.com>
In-Reply-To: <6e892914-74ae-2b8f-954e-342aaf4be870@huawei.com>



On 2022/5/7 17:35, Leizhen (ThunderTown) wrote:
> 
> 
> On 2022/5/7 11:37, Leizhen (ThunderTown) wrote:
>>
>>
>> On 2022/5/7 10:07, Baoquan He wrote:
>>> On 05/07/22 at 09:34am, Leizhen (ThunderTown) wrote:
>>>>
>>>>
>>>> On 2022/5/7 7:10, Baoquan He wrote:
>>>>> On 05/06/22 at 07:43pm, Zhen Lei wrote:
>>>>> ......  
>>>>>> @@ -118,8 +162,7 @@ static void __init reserve_crashkernel(void)
>>>>>>  	if (crash_base)
>>>>>>  		crash_max = crash_base + crash_size;
>>>>>>  
>>>>>> -	/* Current arm64 boot protocol requires 2MB alignment */
>>>>>> -	crash_base = memblock_phys_alloc_range(crash_size, SZ_2M,
>>>>>> +	crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
>>>>>>  					       crash_base, crash_max);
>>>>>>  	if (!crash_base) {
>>>>>>  		pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
>>>>>> @@ -127,6 +170,11 @@ static void __init reserve_crashkernel(void)
>>>>>>  		return;
>>>>>>  	}
>>>>>>  
>>>>>
>>>>> Some corner cases are missed, e.g.:
>>>>> 1) ,high and ,low are specified, but CONFIG_ZONE_DMA|DMA32 is not enabled;
>>>>> 2) ,high and ,low are specified, but the whole system memory is under 4G.
>>>>>
>>>>> The check below can filter them out:
>>>>>         
>>>>> 	if (crash_base > arm64_dma_phys_limit && crash_low_size &&
>>>>> 	    reserve_crashkernel_low(crash_low_size)) {
>>>>>
>>>>> What's your opinion? Leave it as is and document it to notify users,
>>>>> or fix it with a code change?
> 
> I decided to modify both the code and the documentation. But the code changes
> aren't what you suggested, for the following reasons:
> 1. The memory allocated for 'high' may be partially under 4G, so the low
>    memory may not be enough. Of course, it's rare.
> 2. The second kernel can work properly only when both the high and low memory
>    are successfully reserved. For example, high=128M, low=128M, but the
>    second kernel needs 256M.
> 
> So for the cases you listed:
> 1) ,high and ,low are specified, CONFIG_ZONE_DMA|DMA32 is not enabled;
>    --> Follow your suggestion: ignore crashkernel=Y,low and don't allocate low memory.
> 
> @@ -100,6 +100,14 @@ static int __init reserve_crashkernel_low(unsigned long long low_size)
>  {
>         unsigned long long low_base;
> 
> +       /*
> +        * The kernel does not have any DMA zone, so the range of each DMA
> +        * zone is unknown. Please make sure both CONFIG_ZONE_DMA and
> +        * CONFIG_ZONE_DMA32 are also not set in the second kernel.
> +        */
> +       if (!IS_ENABLED(CONFIG_ZONE_DMA) && !IS_ENABLED(CONFIG_ZONE_DMA32))
> +               return 0;
> +
> 
> 2) ,high and ,low are specified, the whole system memory is under 4G.
>    --> Two memory ranges will be allocated, with the sizes that 'high' and 'low' specify.
>    --> Yes, the 'low' memory may end up above the 'high' memory, because 'high' is only a
>    --> hint to allocate from the top and try high memory first. Of course, this may cause
>    --> kexec to fail to load, because the 'low' memory, which is small, will be used to
>    --> store the Image, etc.
>    --> But 'low' above 'high' is almost impossible: we use the memblock API to allocate
>    --> memory from top to bottom, so 'low' above 'high' needs a sizeable memory block
>    --> (128M, 256M?) to be freed during the init phase.
>    --> Maybe I should add: crash_max = min(crash_base, CRASH_ADDR_LOW_MAX);
>    --> to make sure the 'low' memory is always under the 'high' memory.

I have added the min() above.

Test results:
1) ,high and ,low are specified, CONFIG_ZONE_DMA|DMA32 is not enabled;
root@localhost:~# dmesg | grep crash
[    0.000000] crashkernel reserved: 0x0000000420000000 - 0x0000000440000000 (512 MB)
[    0.000000] Kernel command line: console=ttyAMA0 root=/dev/vda rw panic_on_oops=1 oops=panic crashkernel=512M,high crashkernel=128M,low

2) ,high and ,low are specified, the whole system memory is under 4G.
root@localhost:~# dmesg | grep crash
[    0.000000] crashkernel tmp reserved: 0x00000000f2800000 - 0x00000000fa800000 (128 MB)
[    0.000000] crashkernel low memory reserved: 0xca800000 - 0xd2800000 (128 MB)
[    0.000000] crashkernel reserved: 0x00000000d2800000 - 0x00000000f2800000 (512 MB)
[    0.000000] Kernel command line: console=ttyAMA0 root=/dev/vda rw panic_on_oops=1 oops=panic crashkernel=512M,high crashkernel=128M,low

Test stub for 2):

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 5cb73bbd286b100..abbde2158a0976a 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -147,6 +147,7 @@ static void __init reserve_crashkernel(void)
        unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
        char *cmdline = boot_command_line;
        int ret;
+       unsigned long long tmp_base;

        if (!IS_ENABLED(CONFIG_KEXEC_CORE))
                return;
@@ -179,6 +180,11 @@ static void __init reserve_crashkernel(void)
        if (crash_base)
                crash_max = crash_base + crash_size;

+       tmp_base = memblock_phys_alloc_range(crash_low_size, CRASH_ALIGN, crash_base, crash_max);
+       BUG_ON(!tmp_base);
+       pr_info("crashkernel tmp reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
+               tmp_base, tmp_base + crash_low_size, crash_low_size >> 20);
+
        crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
                                               crash_base, crash_max);
        if (!crash_base) {
@@ -186,6 +192,7 @@ static void __init reserve_crashkernel(void)
                        crash_size);
                return;
        }
+       memblock_phys_free(tmp_base, crash_low_size);

        if (crash_low_size && reserve_crashkernel_low(crash_low_size, crash_base)) {
                memblock_phys_free(crash_base, crash_size);

> 
>>>>
>>>> I think maybe we can leave it unchanged. If the user configures two memory ranges,
>>>> we'd better reserve two. Otherwise, they will be confused when they check the result.
>>>> Currently, crash_low_size is non-zero only when 'crashkernel=Y,low' is explicitly configured.
>>>
>>> Then the user needs to know the system information, e.g. how much high
>>> memory and low memory there is, and whether CONFIG_ZONE_DMA|DMA32 is
>>> enabled. And we need to describe these cases in the documentation. Any
>>> corner case or exception needs to be noted if we don't handle it in code.
>>>
>>> I care about this very much because we have CI with existing test cases
>>> that run on the system, and QA will check these manually too. Support
>>> engineers need detailed documentation if anything special happens.
>>> Anything unclear or uncovered will be reported as a bug to our kernel devs.
>>> I guess your company does something similar.
>>>
>>> This crashkernel,high and crashkernel,low reservation is special if we
>>> allow ,high and ,low to exist in the same zone. Imagine that on a system
>>> with CONFIG_ZONE_DMA|DMA32 disabled, people copy crashkernel=512M,high
>>> and crashkernel=128M,low from another system, and they could get
>>> crash_res at [5G, 5G+512M], while crash_low_res is at [6G, 6G+128M]. Guess
>>> how they will judge us.
>>
>> OK, I got it.
>>
>>>
>>>>
>>>>>
>>>>> I would suggest merging this series; Lei can add this corner case
>>>>> handling on top. Since this is newly added support, we don't have
>>>>> to do it in one step. Going step by step can make reviewing easier.
>>>>>
>>>>>> +	if (crash_low_size && reserve_crashkernel_low(crash_low_size)) {
>>>>>> +		memblock_phys_free(crash_base, crash_size);
>>>>>> +		return;
>>>>>> +	}
>>>>>> +
>>>>>>  	pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
>>>>>>  		crash_base, crash_base + crash_size, crash_size >> 20);
>>>>>>  
>>>>>> @@ -135,6 +183,9 @@ static void __init reserve_crashkernel(void)
>>>>>>  	 * map. Inform kmemleak so that it won't try to access it.
>>>>>>  	 */
>>>>>>  	kmemleak_ignore_phys(crash_base);
>>>>>> +	if (crashk_low_res.end)
>>>>>> +		kmemleak_ignore_phys(crashk_low_res.start);
>>>>>> +
>>>>>>  	crashk_res.start = crash_base;
>>>>>>  	crashk_res.end = crash_base + crash_size - 1;
>>>>>>  	insert_resource(&iomem_resource, &crashk_res);
>>>>>> -- 
>>>>>> 2.25.1
>>>>>>
>>>>>
>>>>> .
>>>>>
>>>>
>>>> -- 
>>>> Regards,
>>>>   Zhen Lei
>>>>
>>>
>>> .
>>>
>>
> 

-- 
Regards,
  Zhen Lei


