LinuxPPC-Dev Archive on lore.kernel.org
* [PATCH] powerpc/crashkernel: take mem option into account
@ 2019-09-09  4:05 Pingfan Liu
  2019-09-09  7:35 ` Pingfan Liu
  2019-09-12  2:50 ` Pingfan Liu
  0 siblings, 2 replies; 7+ messages in thread
From: Pingfan Liu @ 2019-09-09  4:05 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Hari Bathini, Pingfan Liu

'mem=" option is an easy way to put high pressure on memory during some
test. Hence in stead of total mem, the effective usable memory size should
be considered when reserving mem for crashkernel. Otherwise the boot up may
experience oom issue.

E.g passing
crashkernel="2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G", and
mem=5G.

Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
To: linuxppc-dev@lists.ozlabs.org
---
 arch/powerpc/kernel/machine_kexec.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/machine_kexec.c b/arch/powerpc/kernel/machine_kexec.c
index c4ed328..714b733 100644
--- a/arch/powerpc/kernel/machine_kexec.c
+++ b/arch/powerpc/kernel/machine_kexec.c
@@ -114,11 +114,12 @@ void machine_kexec(struct kimage *image)
 
 void __init reserve_crashkernel(void)
 {
-	unsigned long long crash_size, crash_base;
+	unsigned long long crash_size, crash_base, total_mem_sz;
 	int ret;
 
+	total_mem_sz = memory_limit ? memory_limit : memblock_phys_mem_size();
 	/* use common parsing */
-	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
+	ret = parse_crashkernel(boot_command_line, total_mem_sz,
 			&crash_size, &crash_base);
 	if (ret == 0 && crash_size > 0) {
 		crashk_res.start = crash_base;
-- 
2.7.5



* Re: [PATCH] powerpc/crashkernel: take mem option into account
  2019-09-09  4:05 [PATCH] powerpc/crashkernel: take mem option into account Pingfan Liu
@ 2019-09-09  7:35 ` Pingfan Liu
  2019-09-12  2:50 ` Pingfan Liu
  1 sibling, 0 replies; 7+ messages in thread
From: Pingfan Liu @ 2019-09-09  7:35 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Hari Bathini

On Mon, Sep 9, 2019 at 12:05 PM Pingfan Liu <kernelfans@gmail.com> wrote:
>
> 'mem=" option is an easy way to put high pressure on memory during some
> test. Hence in stead of total mem, the effective usable memory size should
> be considered when reserving mem for crashkernel. Otherwise the boot up may
> experience oom issue.
>
> E.g passing
> crashkernel="2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G", and
> mem=5G.
>
> Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
> Cc: Hari Bathini <hbathini@linux.ibm.com>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> To: linuxppc-dev@lists.ozlabs.org
> ---
>  arch/powerpc/kernel/machine_kexec.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/kernel/machine_kexec.c b/arch/powerpc/kernel/machine_kexec.c
> index c4ed328..714b733 100644
> --- a/arch/powerpc/kernel/machine_kexec.c
> +++ b/arch/powerpc/kernel/machine_kexec.c
> @@ -114,11 +114,12 @@ void machine_kexec(struct kimage *image)
>
>  void __init reserve_crashkernel(void)
>  {
> -       unsigned long long crash_size, crash_base;
> +       unsigned long long crash_size, crash_base, total_mem_sz;
>         int ret;
>
> +       total_mem_sz = memory_limit ? memory_limit : memblock_phys_mem_size();
Here memory_limit is only used for estimation and may still be changed.
So I think it is better to use memory_limit here than to move
memblock_enforce_memory_limit() before the call to
reserve_crashkernel().

Thanks,
Pingfan
>         /* use common parsing */
> -       ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
> +       ret = parse_crashkernel(boot_command_line, total_mem_sz,
>                         &crash_size, &crash_base);
>         if (ret == 0 && crash_size > 0) {
>                 crashk_res.start = crash_base;
> --
> 2.7.5
>


* Re: [PATCH] powerpc/crashkernel: take mem option into account
  2019-09-09  4:05 [PATCH] powerpc/crashkernel: take mem option into account Pingfan Liu
  2019-09-09  7:35 ` Pingfan Liu
@ 2019-09-12  2:50 ` Pingfan Liu
  1 sibling, 0 replies; 7+ messages in thread
From: Pingfan Liu @ 2019-09-12  2:50 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Hari Bathini

NACK this version: I missed updating the printk info. I will send out a v2.

On Mon, Sep 9, 2019 at 12:05 PM Pingfan Liu <kernelfans@gmail.com> wrote:
>
> 'mem=" option is an easy way to put high pressure on memory during some
> test. Hence in stead of total mem, the effective usable memory size should
> be considered when reserving mem for crashkernel. Otherwise the boot up may
> experience oom issue.
>
> E.g passing
> crashkernel="2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G", and
> mem=5G.
>
> Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
> Cc: Hari Bathini <hbathini@linux.ibm.com>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> To: linuxppc-dev@lists.ozlabs.org
> ---
>  arch/powerpc/kernel/machine_kexec.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/kernel/machine_kexec.c b/arch/powerpc/kernel/machine_kexec.c
> index c4ed328..714b733 100644
> --- a/arch/powerpc/kernel/machine_kexec.c
> +++ b/arch/powerpc/kernel/machine_kexec.c
> @@ -114,11 +114,12 @@ void machine_kexec(struct kimage *image)
>
>  void __init reserve_crashkernel(void)
>  {
> -       unsigned long long crash_size, crash_base;
> +       unsigned long long crash_size, crash_base, total_mem_sz;
>         int ret;
>
> +       total_mem_sz = memory_limit ? memory_limit : memblock_phys_mem_size();
>         /* use common parsing */
> -       ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
> +       ret = parse_crashkernel(boot_command_line, total_mem_sz,
>                         &crash_size, &crash_base);
>         if (ret == 0 && crash_size > 0) {
>                 crashk_res.start = crash_base;
> --
> 2.7.5
>


* Re: [PATCH] powerpc/crashkernel: take mem option into account
  2019-09-18 11:22   ` Michael Ellerman
@ 2019-09-23  4:14     ` Pingfan Liu
  0 siblings, 0 replies; 7+ messages in thread
From: Pingfan Liu @ 2019-09-23  4:14 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: linuxppc-dev, Hari Bathini

On Wed, Sep 18, 2019 at 7:23 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
>
> Pingfan Liu <kernelfans@gmail.com> writes:
> > Cc Kexec list. And keep the original content.
> >
> > On Thu, Sep 12, 2019 at 10:50 AM Pingfan Liu <kernelfans@gmail.com> wrote:
> >>
> >> 'mem=" option is an easy way to put high pressure on memory during some
> >> test. Hence in stead of total mem, the effective usable memory size
>                ^                          ^
>                instead                    "actual" would be clearer
>
> I think adding: "after applying the memory limit"
>
> would help here.
>
> >> should be considered when reserving mem for crashkernel. Otherwise
> >> the boot up may experience oom issue.
>                               ^
>                               OOM
> >>
> >> E.g passing
> >> crashkernel="2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G", and
> >> mem=5G on a 256G machine.
>
> Spelling out the behaviour before and after would help here, eg:
>
> .. "would reserve 4G prior to the change and 512M afterward."
>
Thanks for the kind review. I will update the commit message based on your suggestions.
>
> >> Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
> >> Cc: Hari Bathini <hbathini@linux.ibm.com>
> >> Cc: Michael Ellerman <mpe@ellerman.id.au>
> >> To: linuxppc-dev@lists.ozlabs.org
> >> ---
> >> v1 -> v2: fix the printk info about the total mem
> >>  arch/powerpc/kernel/machine_kexec.c | 7 ++++---
> >>  1 file changed, 4 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/arch/powerpc/kernel/machine_kexec.c b/arch/powerpc/kernel/machine_kexec.c
> >> index c4ed328..eec96dc 100644
> >> --- a/arch/powerpc/kernel/machine_kexec.c
> >> +++ b/arch/powerpc/kernel/machine_kexec.c
> >> @@ -114,11 +114,12 @@ void machine_kexec(struct kimage *image)
> >>
> >>  void __init reserve_crashkernel(void)
> >>  {
> >> -       unsigned long long crash_size, crash_base;
> >> +       unsigned long long crash_size, crash_base, total_mem_sz;
> >>         int ret;
> >>
> >> +       total_mem_sz = memory_limit ? memory_limit : memblock_phys_mem_size();
> >>         /* use common parsing */
> >> -       ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
> >> +       ret = parse_crashkernel(boot_command_line, total_mem_sz,
> >>                         &crash_size, &crash_base);
>
> I think this change makes sense. But we have multiple arches that
> implement similar logic, and I wonder if we should keep them all the
> same.
>
> eg:
>
>   arch/arm/kernel/setup.c:                ret = parse_crashkernel(boot_command_line, total_mem,
>   arch/arm64/mm/init.c:                   ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
>   arch/ia64/kernel/setup.c:               ret = parse_crashkernel(boot_command_line, total,
>   arch/mips/kernel/setup.c:               ret = parse_crashkernel(boot_command_line, total_mem,
>   arch/powerpc/kernel/fadump.c:           ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
>   arch/powerpc/kernel/machine_kexec.c:    ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
>   arch/s390/kernel/setup.c:               rc = parse_crashkernel(boot_command_line, memory_end, &crash_size,
>   arch/sh/kernel/machine_kexec.c:         ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
>   arch/x86/kernel/setup.c:                ret = parse_crashkernel(boot_command_line, total_mem, &crash_size, &crash_base);
>
>
> From a quick glance most of them don't seem to take the memory limit
> into account.
>
> So I guess the question is do we want all arches to implement the same
> behaviour or do we think it doesn't matter if they differ in details
> like this?

On powerpc, the current code gives fadump/kdump a higher priority than
the "mem=" option, as the comment in fadump_reserve_mem() says:
"
        /*
         * Calculate the memory boundary.
         * If memory_limit is less than actual memory boundary then reserve
         * the memory for fadump beyond the memory_limit and adjust the
         * memory_limit accordingly, so that the running kernel can run with
         * specified memory_limit.
         */
"

Other arches, in contrast, fold the "mem=" limit into memblock before
memblock_phys_mem_size() is called. So by the time its result is passed
to parse_crashkernel(), "mem=" has already taken effect.

E.g. for x86, in arch/x86/kernel/e820.c:

static int __init parse_memopt(char *p)
{
	...
	/* This packs the "mem=" info into e820, which is finally fed to memblock. */
	e820__range_remove(mem_size, ULLONG_MAX - mem_size, E820_TYPE_RAM, 1);
}
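
The difference described above can be sketched in user space. This is only
an illustration, not kernel code: both function names are hypothetical, and
the two parameters stand in for the kernel's memory_limit variable and the
value returned by memblock_phys_mem_size(). The first function mirrors the
expression added by the patch; the second models an arch where "mem=" has
already trimmed the memory map before the size is read.

```c
/*
 * Hypothetical model of the patched powerpc logic: memblock has not been
 * trimmed yet, so fall back to the physical size only when no limit is set.
 */
static unsigned long long powerpc_patched_basis(unsigned long long memory_limit,
						unsigned long long phys_mem_size)
{
	return memory_limit ? memory_limit : phys_mem_size;
}

/*
 * Hypothetical model of an arch that applies "mem=" to its memory map first:
 * the size seen later is already capped by the limit.
 */
static unsigned long long trimmed_map_basis(unsigned long long memory_limit,
					    unsigned long long phys_mem_size)
{
	if (memory_limit && memory_limit < phys_mem_size)
		return memory_limit;
	return phys_mem_size;
}
```

With mem=5G on a 256G machine both models yield 5G, which is the value the
patch wants parse_crashkernel() to see on powerpc as well.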

Thanks,
Pingfan


* Re: [PATCH] powerpc/crashkernel: take mem option into account
  2019-09-17  5:29 ` Pingfan Liu
@ 2019-09-18 11:22   ` Michael Ellerman
  2019-09-23  4:14     ` Pingfan Liu
  0 siblings, 1 reply; 7+ messages in thread
From: Michael Ellerman @ 2019-09-18 11:22 UTC (permalink / raw)
  To: Pingfan Liu, linuxppc-dev; +Cc: Hari Bathini

Pingfan Liu <kernelfans@gmail.com> writes:
> Cc Kexec list. And keep the original content.
>
> On Thu, Sep 12, 2019 at 10:50 AM Pingfan Liu <kernelfans@gmail.com> wrote:
>>
>> 'mem=" option is an easy way to put high pressure on memory during some
>> test. Hence in stead of total mem, the effective usable memory size
               ^                          ^
               instead                    "actual" would be clearer

I think adding: "after applying the memory limit" 

would help here.

>> should be considered when reserving mem for crashkernel. Otherwise
>> the boot up may experience oom issue.
                              ^
                              OOM
>>
>> E.g passing
>> crashkernel="2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G", and
>> mem=5G on a 256G machine.

Spelling out the behaviour before and after would help here, eg:

.. "would reserve 4G prior to the change and 512M afterward."
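
To make the before/after numbers concrete, here is a minimal user-space
sketch of how a range-based crashkernel= value selects a size for a given
amount of system RAM. It is not the kernel's parse_crashkernel(); the
helper names parse_size() and pick_crash_size() are made up for this
illustration, and the sketch assumes the first range with
start <= system_ram < end wins, with an omitted end meaning "no upper bound".

```c
#include <limits.h>
#include <stdlib.h>

/* Parse a number with an optional K/M/G suffix; *end is left after it. */
static unsigned long long parse_size(const char *s, char **end)
{
	unsigned long long v = strtoull(s, end, 0);

	switch (**end) {
	case 'G': case 'g':
		v <<= 10; /* fall through */
	case 'M': case 'm':
		v <<= 10; /* fall through */
	case 'K': case 'k':
		v <<= 10;
		(*end)++;
		break;
	default:
		break;
	}
	return v;
}

/*
 * Walk "start-[end]:size[,...]" and return the size of the first range that
 * contains system_ram. Returns 0 if nothing matches or the spec is malformed.
 */
static unsigned long long pick_crash_size(const char *spec,
					  unsigned long long system_ram)
{
	const char *cur = spec;
	char *tmp;

	while (*cur) {
		unsigned long long start, end = ULLONG_MAX, size;

		start = parse_size(cur, &tmp);
		if (tmp == cur || *tmp != '-')
			return 0;
		cur = tmp + 1;

		if (*cur != ':') {		/* an explicit end is given */
			end = parse_size(cur, &tmp);
			if (tmp == cur)
				return 0;
			cur = tmp;
		}
		if (*cur != ':')
			return 0;
		cur++;

		size = parse_size(cur, &tmp);
		if (tmp == cur)
			return 0;
		cur = tmp;

		if (system_ram >= start && system_ram < end)
			return size;

		if (*cur == ',')
			cur++;
		else if (*cur != '\0')
			return 0;
	}
	return 0;
}
```

With the spec from the commit message, pick_crash_size(spec, 256ULL << 30)
selects the "128G-:4G" entry, while pick_crash_size(spec, 5ULL << 30)
selects "4G-16G:512M", matching the 4G versus 512M figures above.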


>> Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
>> Cc: Hari Bathini <hbathini@linux.ibm.com>
>> Cc: Michael Ellerman <mpe@ellerman.id.au>
>> To: linuxppc-dev@lists.ozlabs.org
>> ---
>> v1 -> v2: fix the printk info about the total mem
>>  arch/powerpc/kernel/machine_kexec.c | 7 ++++---
>>  1 file changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/powerpc/kernel/machine_kexec.c b/arch/powerpc/kernel/machine_kexec.c
>> index c4ed328..eec96dc 100644
>> --- a/arch/powerpc/kernel/machine_kexec.c
>> +++ b/arch/powerpc/kernel/machine_kexec.c
>> @@ -114,11 +114,12 @@ void machine_kexec(struct kimage *image)
>>
>>  void __init reserve_crashkernel(void)
>>  {
>> -       unsigned long long crash_size, crash_base;
>> +       unsigned long long crash_size, crash_base, total_mem_sz;
>>         int ret;
>>
>> +       total_mem_sz = memory_limit ? memory_limit : memblock_phys_mem_size();
>>         /* use common parsing */
>> -       ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
>> +       ret = parse_crashkernel(boot_command_line, total_mem_sz,
>>                         &crash_size, &crash_base);

I think this change makes sense. But we have multiple arches that
implement similar logic, and I wonder if we should keep them all the
same.

eg:

  arch/arm/kernel/setup.c:                ret = parse_crashkernel(boot_command_line, total_mem,
  arch/arm64/mm/init.c:                   ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
  arch/ia64/kernel/setup.c:               ret = parse_crashkernel(boot_command_line, total,
  arch/mips/kernel/setup.c:               ret = parse_crashkernel(boot_command_line, total_mem,
  arch/powerpc/kernel/fadump.c:           ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
  arch/powerpc/kernel/machine_kexec.c:    ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
  arch/s390/kernel/setup.c:               rc = parse_crashkernel(boot_command_line, memory_end, &crash_size,
  arch/sh/kernel/machine_kexec.c:         ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
  arch/x86/kernel/setup.c:                ret = parse_crashkernel(boot_command_line, total_mem, &crash_size, &crash_base);


From a quick glance most of them don't seem to take the memory limit
into account.

So I guess the question is do we want all arches to implement the same
behaviour or do we think it doesn't matter if they differ in details
like this?

cheers


* Re: [PATCH] powerpc/crashkernel: take mem option into account
  2019-09-12  2:50 Pingfan Liu
@ 2019-09-17  5:29 ` Pingfan Liu
  2019-09-18 11:22   ` Michael Ellerman
  0 siblings, 1 reply; 7+ messages in thread
From: Pingfan Liu @ 2019-09-17  5:29 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Hari Bathini

Cc Kexec list. And keep the original content.

On Thu, Sep 12, 2019 at 10:50 AM Pingfan Liu <kernelfans@gmail.com> wrote:
>
> 'mem=" option is an easy way to put high pressure on memory during some
> test. Hence in stead of total mem, the effective usable memory size should
> be considered when reserving mem for crashkernel. Otherwise the boot up may
> experience oom issue.
>
> E.g passing
> crashkernel="2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G", and
> mem=5G on a 256G machine.
>
> Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
> Cc: Hari Bathini <hbathini@linux.ibm.com>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> To: linuxppc-dev@lists.ozlabs.org
> ---
> v1 -> v2: fix the printk info about the total mem
>  arch/powerpc/kernel/machine_kexec.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/kernel/machine_kexec.c b/arch/powerpc/kernel/machine_kexec.c
> index c4ed328..eec96dc 100644
> --- a/arch/powerpc/kernel/machine_kexec.c
> +++ b/arch/powerpc/kernel/machine_kexec.c
> @@ -114,11 +114,12 @@ void machine_kexec(struct kimage *image)
>
>  void __init reserve_crashkernel(void)
>  {
> -       unsigned long long crash_size, crash_base;
> +       unsigned long long crash_size, crash_base, total_mem_sz;
>         int ret;
>
> +       total_mem_sz = memory_limit ? memory_limit : memblock_phys_mem_size();
>         /* use common parsing */
> -       ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
> +       ret = parse_crashkernel(boot_command_line, total_mem_sz,
>                         &crash_size, &crash_base);
>         if (ret == 0 && crash_size > 0) {
>                 crashk_res.start = crash_base;
> @@ -185,7 +186,7 @@ void __init reserve_crashkernel(void)
>                         "for crashkernel (System RAM: %ldMB)\n",
>                         (unsigned long)(crash_size >> 20),
>                         (unsigned long)(crashk_res.start >> 20),
> -                       (unsigned long)(memblock_phys_mem_size() >> 20));
> +                       (unsigned long)(total_mem_sz >> 20));
>
>         if (!memblock_is_region_memory(crashk_res.start, crash_size) ||
>             memblock_reserve(crashk_res.start, crash_size)) {
> --
> 2.7.5
>


* [PATCH] powerpc/crashkernel: take mem option into account
@ 2019-09-12  2:50 Pingfan Liu
  2019-09-17  5:29 ` Pingfan Liu
  0 siblings, 1 reply; 7+ messages in thread
From: Pingfan Liu @ 2019-09-12  2:50 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: Hari Bathini, Pingfan Liu

'mem=" option is an easy way to put high pressure on memory during some
test. Hence in stead of total mem, the effective usable memory size should
be considered when reserving mem for crashkernel. Otherwise the boot up may
experience oom issue.

E.g passing
crashkernel="2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G", and
mem=5G on a 256G machine.

Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
To: linuxppc-dev@lists.ozlabs.org
---
v1 -> v2: fix the printk info about the total mem
 arch/powerpc/kernel/machine_kexec.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/machine_kexec.c b/arch/powerpc/kernel/machine_kexec.c
index c4ed328..eec96dc 100644
--- a/arch/powerpc/kernel/machine_kexec.c
+++ b/arch/powerpc/kernel/machine_kexec.c
@@ -114,11 +114,12 @@ void machine_kexec(struct kimage *image)
 
 void __init reserve_crashkernel(void)
 {
-	unsigned long long crash_size, crash_base;
+	unsigned long long crash_size, crash_base, total_mem_sz;
 	int ret;
 
+	total_mem_sz = memory_limit ? memory_limit : memblock_phys_mem_size();
 	/* use common parsing */
-	ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
+	ret = parse_crashkernel(boot_command_line, total_mem_sz,
 			&crash_size, &crash_base);
 	if (ret == 0 && crash_size > 0) {
 		crashk_res.start = crash_base;
@@ -185,7 +186,7 @@ void __init reserve_crashkernel(void)
 			"for crashkernel (System RAM: %ldMB)\n",
 			(unsigned long)(crash_size >> 20),
 			(unsigned long)(crashk_res.start >> 20),
-			(unsigned long)(memblock_phys_mem_size() >> 20));
+			(unsigned long)(total_mem_sz >> 20));
 
 	if (!memblock_is_region_memory(crashk_res.start, crash_size) ||
 	    memblock_reserve(crashk_res.start, crash_size)) {
-- 
2.7.5



Thread overview: 7+ messages
2019-09-09  4:05 [PATCH] powerpc/crashkernel: take mem option into account Pingfan Liu
2019-09-09  7:35 ` Pingfan Liu
2019-09-12  2:50 ` Pingfan Liu
2019-09-12  2:50 Pingfan Liu
2019-09-17  5:29 ` Pingfan Liu
2019-09-18 11:22   ` Michael Ellerman
2019-09-23  4:14     ` Pingfan Liu