linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Baoquan He <bhe@redhat.com>
To: "Leizhen (ThunderTown)" <thunder.leizhen@huawei.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	x86@kernel.org, "H . Peter Anvin" <hpa@zytor.com>,
	linux-kernel@vger.kernel.org, Dave Young <dyoung@redhat.com>,
	Vivek Goyal <vgoyal@redhat.com>,
	Eric Biederman <ebiederm@xmission.com>,
	kexec@lists.infradead.org,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	linux-arm-kernel@lists.infradead.org,
	Rob Herring <robh+dt@kernel.org>,
	Frank Rowand <frowand.list@gmail.com>,
	devicetree@vger.kernel.org, Jonathan Corbet <corbet@lwn.net>,
	linux-doc@vger.kernel.org, Randy Dunlap <rdunlap@infradead.org>,
	Feng Zhou <zhoufeng.zf@bytedance.com>,
	Kefeng Wang <wangkefeng.wang@huawei.com>,
	Chen Zhou <dingguo.cz@antgroup.com>,
	John Donnelly <John.p.donnelly@oracle.com>,
	Dave Kleikamp <dave.kleikamp@oracle.com>
Subject: Re: [PATCH v20 3/5] arm64: kdump: reimplement crashkernel=X
Date: Wed, 16 Feb 2022 18:20:58 +0800	[thread overview]
Message-ID: <YgzQCqwmCOfzwzx/@MiWiFi-R3L-srv> (raw)
In-Reply-To: <f1614874-306d-482a-a652-d71a8bcbe3a9@huawei.com>

On 02/16/22 at 10:58am, Leizhen (ThunderTown) wrote:
> 
> 
> On 2022/2/14 15:53, Leizhen (ThunderTown) wrote:
> > 
> > 
> > On 2022/2/14 11:52, Baoquan He wrote:
> >> On 01/24/22 at 04:47pm, Zhen Lei wrote:
> >> ......
> >>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> >>> index 6c653a2c7cff052..a5d43feac0d7d96 100644
> >>> --- a/arch/arm64/mm/init.c
> >>> +++ b/arch/arm64/mm/init.c
> >>> @@ -71,6 +71,30 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
> >>>  #define CRASH_ADDR_LOW_MAX	arm64_dma_phys_limit
> >>>  #define CRASH_ADDR_HIGH_MAX	MEMBLOCK_ALLOC_ACCESSIBLE
> >>>  
> >>> +static int __init reserve_crashkernel_low(unsigned long long low_size)
> >>> +{
> >>> +	unsigned long long low_base;
> >>> +
> >>> +	/* passed with crashkernel=0,low ? */
> >>> +	if (!low_size)
> >>> +		return 0;
> >>> +
> >>> +	low_base = memblock_phys_alloc_range(low_size, CRASH_ALIGN, 0, CRASH_ADDR_LOW_MAX);
> >>> +	if (!low_base) {
> >>> +		pr_err("cannot allocate crashkernel low memory (size:0x%llx).\n", low_size);
> >>> +		return -ENOMEM;
> >>> +	}
> >>> +
> >>> +	pr_info("crashkernel low memory reserved: 0x%llx - 0x%llx (%lld MB)\n",
> >>> +		low_base, low_base + low_size, low_size >> 20);
> >>> +
> >>> +	crashk_low_res.start = low_base;
> >>> +	crashk_low_res.end   = low_base + low_size - 1;
> >>> +	insert_resource(&iomem_resource, &crashk_low_res);
> >>> +
> >>> +	return 0;
> >>> +}
> >>> +
> >>>  /*
> >>>   * reserve_crashkernel() - reserves memory for crash kernel
> >>
> >> My another concern is the crashkernel=,low handling. In this patch, the
> >> code related to low memory is obscure. Wondering if we should make them
> >> explicit with a little redundant but very clear code flows. Saying this
> >> because the code must be very clear to you and reviewers, it may be
> >> harder for later code reader or anyone interested to understand.
> >>
> >> 1) crashkernel=X,high
> >> 2) crashkernel=X,high crashkernel=Y,low
> >> 3) crashkernel=X,high crashkernel=0,low
> >> 4) crashkernel=X,high crashkernel='messy code',low
> >> 5) crashkernel=X //fall back to high memory, low memory is required then.
> >>
> >> It could be me thinking about it too much. I made changes to your patch
> >> with a tuning, not sure if it's OK to you. Otherwise, this patchset
> > 
> > I think it's good.
> > 
> >> works very well for all above test cases, it's ripe to be merged for
> >> wider testing.
> > 
> > I will test it tomorrow. I've prepared a little more use cases than yours.
> 
> After the following modifications, I have tested it and it works well. Passed
> all the test cases I prepared.

That's great.

You might need to add 'crashkernel=xM, crashkernel=0,low',
'crashkernel=xM, crashkernel='messy code',low' to your test cases.

> 
> > 
> > 1) crashkernel=4G						//high=4G, low=256M
> > 2) crashkernel=4G crashkernel=512M,high crashkernel=512M,low	//high=4G, low=256M, high and low are ignored
> > 3) crashkernel=4G crashkernel=512M,high				//high=4G, low=256M, high is ignored
> > 4) crashkernel=4G crashkernel=512M,low				//high=4G, low=256M, low is ignored
> > 5) crashkernel=4G@0xe0000000					//high=0G, low=0M, cannot allocate, failed
> > 6) crashkernel=512M						//high=0G, low=512M
> > 7) crashkernel=128M						//high=0G, low=128M
> > 8) crashkernel=512M@0xde000000		//512M@3552M		//high=0G, low=512M
> > 9) crashkernel=4G,high						//high=4G, low=256M
> > a) crashkernel=4G,high crashkernel=512M,low			//high=4G, low=512M
> > b) crashkernel=512M,high crashkernel=128M,low			//high=512M, low=128M
> > c) crashkernel=512M,low						//high=0G, low=0M, invalid
> > 
> > 
> >>
> >> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> >> index a5d43feac0d7..671862c56d7d 100644
> >> --- a/arch/arm64/mm/init.c
> >> +++ b/arch/arm64/mm/init.c
> >> @@ -94,7 +94,8 @@ static int __init reserve_crashkernel_low(unsigned long long low_size)
> >>  
> >>  	return 0;
> >>  }
> >> -
> >> +/*Words explaining why it's 256M*/
> >> +#define DEFAULT_CRASH_KERNEL_LOW_SIZE SZ_256M
> 
> It's an empirical value.
> 
> 94fb9334182284e8e7e4bcb9125c25dc33af19d4 x86/crash: Allocate enough low memory when crashkernel=high
> 
>     When the crash kernel is loaded above 4GiB in memory, the
>     first kernel allocates only 72MiB of low-memory for the DMA
>     requirements of the second kernel. On systems with many
>     devices this is not enough and causes device driver
>     initialization errors and failed crash dumps. Testing by
>     SUSE and Redhat has shown that 256MiB is a good default
>     value for now and the discussion has lead to this value as
>     well. So set this default value to 256MiB to make sure there
>     is enough memory available for DMA.

Then, some words like below can be added. I am not confident it's good
enought, hope someone else can help to polish it.

/*
 * This is an empirical value in x86_64 and taken here directly. Please
 * refer to code comment in reserve_crashkernel_low() of x86_64 for more
 * details.
 */
#define DEFAULT_CRASH_KERNEL_LOW_SIZE SZ_256M

> 
> 
> >>  /*
> >>   * reserve_crashkernel() - reserves memory for crash kernel
> >>   *
> >> @@ -105,10 +106,10 @@ static int __init reserve_crashkernel_low(unsigned long long low_size)
> >>  static void __init reserve_crashkernel(void)
> >>  {
> >>  	unsigned long long crash_base, crash_size;
> >> -	unsigned long long crash_low_size = SZ_256M;
> >> +	unsigned long long crash_low_size;
> >>  	unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
> >>  	int ret;
> >> -	bool fixed_base;
> >> +	bool fixed_base, high;
> 
> high = false;
> 
> >>  	char *cmdline = boot_command_line;
> >>  
> >>  	/* crashkernel=X[@offset] */
> >> @@ -126,7 +127,10 @@ static void __init reserve_crashkernel(void)
> >>  		ret = parse_crashkernel_low(cmdline, 0, &low_size, &crash_base);
> >>  		if (!ret)
> >>  			crash_low_size = low_size;
> >> +		else
> >> +			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
> >>  
> >> +		high = true;
> >>  		crash_max = CRASH_ADDR_HIGH_MAX;
> >>  	}
> >>  
> >> @@ -134,7 +138,7 @@ static void __init reserve_crashkernel(void)
> >>  	crash_size = PAGE_ALIGN(crash_size);
> >>  
> >>  	/* User specifies base address explicitly. */
> >> -	if (crash_base)
> >> +	if (fixed_base)
> >>  		crash_max = crash_base + crash_size;
> >>  
> >>  retry:
> >> @@ -156,7 +160,10 @@ static void __init reserve_crashkernel(void)
> >>  		return;
> >>  	}
> >>  
> >> -	if (crash_base >= SZ_4G && reserve_crashkernel_low(crash_low_size)) {
> >> +	if (crash_base >= SZ_4G && !high) 
> >> +		crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
> >> +
> >> +	if (reserve_crashkernel_low(crash_low_size)) {
> >>  		memblock_phys_free(crash_base, crash_size);
> >>  		return;
> >>  	}
> 
> -       if (crash_base >= SZ_4G && reserve_crashkernel_low(crash_low_size)) {
> -               memblock_phys_free(crash_base, crash_size);
> -               return;
> +       if (crash_base >= SZ_4G) {
> +               if (!high)
> +                       crash_low_size = SZ_256M;
> +
> +               if (reserve_crashkernel_low(crash_low_size)) {
> +                       memblock_phys_free(crash_base, crash_size);
> +                       return;
> +               }
>         }
> 
> Looks like changing 'high' to 'low' would be more accurate. Whether crashkernel=Y,low is specified.

What I menat is like below, we even can add code comment to make it more
clearer.

static void __init reserve_crashkernel(void)
{

        /* crashkernel=X[@offset] */
        ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
                                &crash_size, &crash_base);
        if (ret || !crash_size) {
                unsigned long long low_size;

                /* crashkernel=X,high */
                ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
                if (ret || !crash_size)
                        return;

                /* crashkernel=X,low */
                ret = parse_crashkernel_low(cmdline, 0, &low_size, &crash_base);
		//case #1, crashkernel=yM,low is specified explicitly in cmdline
                if (!ret)
                        crash_low_size = low_size;
		else //case #2, crashkernel=yM,low is not specified explicitly
                        crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;

		//high means crashkernel,high is specified explicitly
		high = true;
                crash_max = CRASH_ADDR_HIGH_MAX;
        }

        fixed_base = !!crash_base;
        crash_size = PAGE_ALIGN(crash_size);

        /* User specifies base address explicitly. */
        if (crash_base)
                crash_max = crash_base + crash_size;
retry:
        crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
                                               crash_base, crash_max);
        if (!crash_base) {
                /*
                 * Attempt to fully allocate low memory failed, fall back
                 * to high memory, the minimum required low memory will be
                 * reserved later.
                 */
                if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX)) {
                        crash_max = CRASH_ADDR_HIGH_MAX;
                        goto retry;
                }

                pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
                        crash_size);
                return;
        }


	//case #3: get crashkernel from high memory through fallback, let's set crashkernel,low too.
        if (crash_base >= SZ_4G && !high)
		crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;	

        if (reserve_crashkernel_low(crash_low_size)) {
                memblock_phys_free(crash_base, crash_size);
                return;
        }

        pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
                crash_base, crash_base + crash_size, crash_size >> 20);

        /*
         * The crashkernel memory will be removed from the kernel linear
         * map. Inform kmemleak so that it won't try to access it.
         */
        kmemleak_ignore_phys(crash_base);
        if (crashk_low_res.end)
                kmemleak_ignore_phys(crashk_low_res.start);

        crashk_res.start = crash_base;
        crashk_res.end = crash_base + crash_size - 1;
        insert_resource(&iomem_resource, &crashk_res);
}


> 
> 
> > 
> > It feels like {} may need to be added here so that it is in branch "if (crash_base >= SZ_4G)".
> > The case of "crashkernel=128M" will not fall back to high memory and does not need to reserve
> > low memory again.
> > 
> >>
> >>>   *
> >>> @@ -81,29 +105,62 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
> >>>  static void __init reserve_crashkernel(void)
> >>>  {
> >>>  	unsigned long long crash_base, crash_size;
> >>> +	unsigned long long crash_low_size = SZ_256M;
> >>>  	unsigned long long crash_max = CRASH_ADDR_LOW_MAX;
> >>>  	int ret;
> >>> +	bool fixed_base;
> >>> +	char *cmdline = boot_command_line;
> >>>  
> >>> -	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
> >>> +	/* crashkernel=X[@offset] */
> >>> +	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
> >>>  				&crash_size, &crash_base);
> >>> -	/* no crashkernel= or invalid value specified */
> >>> -	if (ret || !crash_size)
> >>> -		return;
> >>> +	if (ret || !crash_size) {
> >>> +		unsigned long long low_size;
> >>>  
> >>> +		/* crashkernel=X,high */
> >>> +		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
> >>> +		if (ret || !crash_size)
> >>> +			return;
> >>> +
> >>> +		/* crashkernel=X,low */
> >>> +		ret = parse_crashkernel_low(cmdline, 0, &low_size, &crash_base);
> >>> +		if (!ret)
> >>> +			crash_low_size = low_size;
> >>> +
> >>> +		crash_max = CRASH_ADDR_HIGH_MAX;
> >>> +	}
> >>> +
> >>> +	fixed_base = !!crash_base;
> >>>  	crash_size = PAGE_ALIGN(crash_size);
> >>>  
> >>>  	/* User specifies base address explicitly. */
> >>>  	if (crash_base)
> >>>  		crash_max = crash_base + crash_size;
> >>>  
> >>> +retry:
> >>>  	crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
> >>>  					       crash_base, crash_max);
> >>>  	if (!crash_base) {
> >>> +		/*
> >>> +		 * Attempt to fully allocate low memory failed, fall back
> >>> +		 * to high memory, the minimum required low memory will be
> >>> +		 * reserved later.
> >>> +		 */
> >>> +		if (!fixed_base && (crash_max == CRASH_ADDR_LOW_MAX)) {
> >>> +			crash_max = CRASH_ADDR_HIGH_MAX;
> >>> +			goto retry;
> >>> +		}
> >>> +
> >>>  		pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
> >>>  			crash_size);
> >>>  		return;
> >>>  	}
> >>>  
> >>> +	if (crash_base >= SZ_4G && reserve_crashkernel_low(crash_low_size)) {
> >>> +		memblock_phys_free(crash_base, crash_size);
> >>> +		return;
> >>> +	}
> >>> +
> >>>  	pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
> >>>  		crash_base, crash_base + crash_size, crash_size >> 20);
> >>>  
> >>> @@ -112,6 +169,9 @@ static void __init reserve_crashkernel(void)
> >>>  	 * map. Inform kmemleak so that it won't try to access it.
> >>>  	 */
> >>>  	kmemleak_ignore_phys(crash_base);
> >>> +	if (crashk_low_res.end)
> >>> +		kmemleak_ignore_phys(crashk_low_res.start);
> >>> +
> >>>  	crashk_res.start = crash_base;
> >>>  	crashk_res.end = crash_base + crash_size - 1;
> >>>  	insert_resource(&iomem_resource, &crashk_res);
> >>> -- 
> >>> 2.25.1
> >>>
> >>
> >> .
> >>
> > 
> 
> -- 
> Regards,
>   Zhen Lei
> 


  reply	other threads:[~2022-02-16 10:21 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-24  8:47 [PATCH v20 0/5] support reserving crashkernel above 4G on arm64 kdump Zhen Lei
2022-01-24  8:47 ` [PATCH v20 1/5] arm64: Use insert_resource() to simplify code Zhen Lei
2022-01-26 15:16   ` john.p.donnelly
2022-02-08  1:43   ` Baoquan He
2022-01-24  8:47 ` [PATCH v20 2/5] arm64: kdump: introduce some macros for crash kernel reservation Zhen Lei
2022-01-26 15:17   ` john.p.donnelly
2022-02-11 10:39   ` Baoquan He
2022-02-14  6:22     ` Leizhen (ThunderTown)
2022-02-21  3:22       ` Baoquan He
2022-02-21  6:19         ` Leizhen (ThunderTown)
2022-01-24  8:47 ` [PATCH v20 3/5] arm64: kdump: reimplement crashkernel=X Zhen Lei
2022-01-26 15:18   ` john.p.donnelly
2022-02-11 10:30   ` Baoquan He
2022-02-11 10:41     ` Leizhen (ThunderTown)
2022-02-11 10:51       ` Baoquan He
2022-02-14  6:44         ` Leizhen (ThunderTown)
2022-02-14  7:09           ` Baoquan He
2022-02-14  3:52   ` Baoquan He
2022-02-14  7:53     ` Leizhen (ThunderTown)
2022-02-16  2:58       ` Leizhen (ThunderTown)
2022-02-16 10:20         ` Baoquan He [this message]
2022-02-17  1:57           ` Leizhen (ThunderTown)
2022-01-24  8:47 ` [PATCH v20 4/5] of: fdt: Add memory for devices by DT property "linux,usable-memory-range" Zhen Lei
2022-01-26 15:19   ` john.p.donnelly
2022-01-24  8:47 ` [PATCH v20 5/5] kdump: update Documentation about crashkernel Zhen Lei
2022-01-26 15:19   ` john.p.donnelly
2022-02-21  3:48   ` Baoquan He
2022-02-21  6:38     ` Leizhen (ThunderTown)
2022-02-07  4:04 ` [PATCH v20 0/5] support reserving crashkernel above 4G on arm64 kdump Leizhen (ThunderTown)
2022-02-08  2:34   ` Baoquan He

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YgzQCqwmCOfzwzx/@MiWiFi-R3L-srv \
    --to=bhe@redhat.com \
    --cc=John.p.donnelly@oracle.com \
    --cc=bp@alien8.de \
    --cc=catalin.marinas@arm.com \
    --cc=corbet@lwn.net \
    --cc=dave.kleikamp@oracle.com \
    --cc=devicetree@vger.kernel.org \
    --cc=dingguo.cz@antgroup.com \
    --cc=dyoung@redhat.com \
    --cc=ebiederm@xmission.com \
    --cc=frowand.list@gmail.com \
    --cc=hpa@zytor.com \
    --cc=kexec@lists.infradead.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=rdunlap@infradead.org \
    --cc=robh+dt@kernel.org \
    --cc=tglx@linutronix.de \
    --cc=thunder.leizhen@huawei.com \
    --cc=vgoyal@redhat.com \
    --cc=wangkefeng.wang@huawei.com \
    --cc=will@kernel.org \
    --cc=x86@kernel.org \
    --cc=zhoufeng.zf@bytedance.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).