All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Do not reserve crashkernel high memory if crashkernel low memory reserving failed
@ 2015-07-19 11:07 Baoquan He
  2015-07-19 13:40 ` Borislav Petkov
  2015-07-19 14:53 ` [PATCH v2] " Baoquan He
  0 siblings, 2 replies; 26+ messages in thread
From: Baoquan He @ 2015-07-19 11:07 UTC (permalink / raw)
  To: tglx, mingo, hpa, bp, akpm, jkosina, vgoyal, linux-kernel; +Cc: Baoquan He

People reported that when allocating crashkernel memory using
",high" and ",low" syntax, there were cases where the reservation
of the "high" portion succeeds, but the reservation of the "low"
portion fails. Then kexec can load kdump kernel successfully, but
the boot of kdump kernel fails as there's no low memory. This is
because allocation of low memory for kdump kernel can fail on large
systems for reasons. E.g it could be manually specified crashkernel
low memory is too large to find in memblock region.

In this patch add return value for reserve_crashkernel_low. Then put
the crashkernel low memory reserving earlier, just between finding
the crashkernel high memory region and reserving crashkernel high
memory. Then if crashkernel low memory reserving failed we do not
reserve crashkernel high memory but return directly. Users can take
measures when they found kdump kernel cann't be loaded successfully.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 arch/x86/kernel/setup.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 80f874b..b9d6f71 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -513,7 +513,7 @@ static void __init memblock_x86_reserve_range_setup_data(void)
 # define CRASH_KERNEL_ADDR_HIGH_MAX	MAXMEM
 #endif
 
-static void __init reserve_crashkernel_low(void)
+static int __init reserve_crashkernel_low(void)
 {
 #ifdef CONFIG_X86_64
 	const unsigned long long alignment = 16<<20;	/* 16M */
@@ -542,7 +542,7 @@ static void __init reserve_crashkernel_low(void)
 	} else {
 		/* passed with crashkernel=0,low ? */
 		if (!low_size)
-			return;
+			return 0;
 	}
 
 	low_base = memblock_find_in_range(low_size, (1ULL<<32),
@@ -552,7 +552,7 @@ static void __init reserve_crashkernel_low(void)
 		if (!auto_set)
 			pr_info("crashkernel low reservation failed - No suitable area found.\n");
 
-		return;
+		return EINVAL;
 	}
 
 	memblock_reserve(low_base, low_size);
@@ -564,6 +564,7 @@ static void __init reserve_crashkernel_low(void)
 	crashk_low_res.end   = low_base + low_size - 1;
 	insert_resource(&iomem_resource, &crashk_low_res);
 #endif
+	return 0;
 }
 
 static void __init reserve_crashkernel(void)
@@ -613,6 +614,10 @@ static void __init reserve_crashkernel(void)
 			return;
 		}
 	}
+
+	if (crash_base >= (1ULL<<32) && reserve_crashkernel_low())
+		return;
+
 	memblock_reserve(crash_base, crash_size);
 
 	printk(KERN_INFO "Reserving %ldMB of memory at %ldMB "
@@ -624,9 +629,6 @@ static void __init reserve_crashkernel(void)
 	crashk_res.start = crash_base;
 	crashk_res.end   = crash_base + crash_size - 1;
 	insert_resource(&iomem_resource, &crashk_res);
-
-	if (crash_base >= (1ULL<<32))
-		reserve_crashkernel_low();
 }
 #else
 static void __init reserve_crashkernel(void)
-- 
1.9.3


^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH] Do not reserve crashkernel high memory if crashkernel low memory reserving failed
  2015-07-19 11:07 [PATCH] Do not reserve crashkernel high memory if crashkernel low memory reserving failed Baoquan He
@ 2015-07-19 13:40 ` Borislav Petkov
  2015-07-19 14:23   ` Baoquan He
  2015-07-19 14:53 ` [PATCH v2] " Baoquan He
  1 sibling, 1 reply; 26+ messages in thread
From: Borislav Petkov @ 2015-07-19 13:40 UTC (permalink / raw)
  To: Baoquan He; +Cc: tglx, mingo, hpa, akpm, jkosina, vgoyal, linux-kernel

On Sun, Jul 19, 2015 at 07:07:44PM +0800, Baoquan He wrote:
> People reported that when allocating crashkernel memory using
> ",high" and ",low" syntax, there were cases where the reservation
> of the "high" portion succeeds, but the reservation of the "low"
> portion fails. Then kexec can load kdump kernel successfully, but
> the boot of kdump kernel fails as there's no low memory. This is
> because allocation of low memory for kdump kernel can fail on large
> systems for reasons. E.g it could be manually specified crashkernel
> low memory is too large to find in memblock region.
> 
> In this patch add return value for reserve_crashkernel_low. Then put
> the crashkernel low memory reserving earlier, just between finding
> the crashkernel high memory region and reserving crashkernel high
> memory. Then if crashkernel low memory reserving failed we do not
> reserve crashkernel high memory but return directly. Users can take
> measures when they found kdump kernel cann't be loaded successfully.
>
> Signed-off-by: Baoquan He <bhe@redhat.com>
> ---
>  arch/x86/kernel/setup.c | 14 ++++++++------
>  1 file changed, 8 insertions(+), 6 deletions(-)

...

> @@ -552,7 +552,7 @@ static void __init reserve_crashkernel_low(void)
>  		if (!auto_set)
>  			pr_info("crashkernel low reservation failed - No suitable area found.\n");
>  
> -		return;
> +		return EINVAL;

Error values are negative:

		return -EINVAL;

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH] Do not reserve crashkernel high memory if crashkernel low memory reserving failed
  2015-07-19 13:40 ` Borislav Petkov
@ 2015-07-19 14:23   ` Baoquan He
  0 siblings, 0 replies; 26+ messages in thread
From: Baoquan He @ 2015-07-19 14:23 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: tglx, mingo, hpa, akpm, jkosina, vgoyal, linux-kernel

On 07/19/15 at 03:40pm, Borislav Petkov wrote:
> ...
> 
> > @@ -552,7 +552,7 @@ static void __init reserve_crashkernel_low(void)
> >  		if (!auto_set)
> >  			pr_info("crashkernel low reservation failed - No suitable area found.\n");
> >  
> > -		return;
> > +		return EINVAL;
> 
> Error values are negative:
> 
> 		return -EINVAL;

Yes, will repost. Thanks a lot.

Thanks
Baoquan

^ permalink raw reply	[flat|nested] 26+ messages in thread

* [PATCH v2] Do not reserve crashkernel high memory if crashkernel low memory reserving failed
  2015-07-19 11:07 [PATCH] Do not reserve crashkernel high memory if crashkernel low memory reserving failed Baoquan He
  2015-07-19 13:40 ` Borislav Petkov
@ 2015-07-19 14:53 ` Baoquan He
  2015-07-21  7:31   ` Dave Young
  2015-07-27 14:41   ` Joerg Roedel
  1 sibling, 2 replies; 26+ messages in thread
From: Baoquan He @ 2015-07-19 14:53 UTC (permalink / raw)
  To: tglx, mingo, hpa, bp, akpm, jkosina, vgoyal, linux-kernel

People reported that when allocating crashkernel memory using
",high" and ",low" syntax, there were cases where the reservation
of the "high" portion succeeds, but the reservation of the "low"
portion fails. Then kexec can load kdump kernel successfully, but
the boot of kdump kernel fails as there's no low memory. This is
because allocation of low memory for kdump kernel can fail on large
systems for reasons. E.g it could be manually specified crashkernel
low memory is too large to find in memblock region.

In this patch add return value for reserve_crashkernel_low. Then put
the crashkernel low memory reserving earlier, just between finding
the crashkernel high memory region and reserving crashkernel high
memory. Then if crashkernel low memory reserving failed we do not
reserve crashkernel high memory but return immediately. Users can
take measures when they found kdump kernel cann't be loaded
successfully.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 arch/x86/kernel/setup.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 80f874b..36aeac3 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -513,7 +513,7 @@ static void __init memblock_x86_reserve_range_setup_data(void)
 # define CRASH_KERNEL_ADDR_HIGH_MAX	MAXMEM
 #endif
 
-static void __init reserve_crashkernel_low(void)
+static int __init reserve_crashkernel_low(void)
 {
 #ifdef CONFIG_X86_64
 	const unsigned long long alignment = 16<<20;	/* 16M */
@@ -542,7 +542,7 @@ static void __init reserve_crashkernel_low(void)
 	} else {
 		/* passed with crashkernel=0,low ? */
 		if (!low_size)
-			return;
+			return 0;
 	}
 
 	low_base = memblock_find_in_range(low_size, (1ULL<<32),
@@ -552,7 +552,7 @@ static void __init reserve_crashkernel_low(void)
 		if (!auto_set)
 			pr_info("crashkernel low reservation failed - No suitable area found.\n");
 
-		return;
+		return -EINVAL;
 	}
 
 	memblock_reserve(low_base, low_size);
@@ -564,6 +564,7 @@ static void __init reserve_crashkernel_low(void)
 	crashk_low_res.end   = low_base + low_size - 1;
 	insert_resource(&iomem_resource, &crashk_low_res);
 #endif
+	return 0;
 }
 
 static void __init reserve_crashkernel(void)
@@ -613,6 +614,10 @@ static void __init reserve_crashkernel(void)
 			return;
 		}
 	}
+
+	if (crash_base >= (1ULL<<32) && reserve_crashkernel_low())
+		return;
+
 	memblock_reserve(crash_base, crash_size);
 
 	printk(KERN_INFO "Reserving %ldMB of memory at %ldMB "
@@ -624,9 +629,6 @@ static void __init reserve_crashkernel(void)
 	crashk_res.start = crash_base;
 	crashk_res.end   = crash_base + crash_size - 1;
 	insert_resource(&iomem_resource, &crashk_res);
-
-	if (crash_base >= (1ULL<<32))
-		reserve_crashkernel_low();
 }
 #else
 static void __init reserve_crashkernel(void)
-- 
1.9.3



^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] Do not reserve crashkernel high memory if crashkernel low memory reserving failed
  2015-07-19 14:53 ` [PATCH v2] " Baoquan He
@ 2015-07-21  7:31   ` Dave Young
  2015-07-21  7:38     ` Ingo Molnar
                       ` (2 more replies)
  2015-07-27 14:41   ` Joerg Roedel
  1 sibling, 3 replies; 26+ messages in thread
From: Dave Young @ 2015-07-21  7:31 UTC (permalink / raw)
  To: Baoquan He
  Cc: tglx, mingo, hpa, bp, akpm, jkosina, vgoyal, linux-kernel, yinghai

Hi, Baoquan

The interface was introduced by Yinghai, ccing him.

On 07/19/15 at 10:53pm, Baoquan He wrote:
> People reported that when allocating crashkernel memory using
> ",high" and ",low" syntax, there were cases where the reservation
> of the "high" portion succeeds, but the reservation of the "low"
> portion fails. Then kexec can load kdump kernel successfully, but
> the boot of kdump kernel fails as there's no low memory. This is
> because allocation of low memory for kdump kernel can fail on large
> systems for reasons. E.g it could be manually specified crashkernel
> low memory is too large to find in memblock region.
> 
> In this patch add return value for reserve_crashkernel_low. Then put
> the crashkernel low memory reserving earlier, just between finding
> the crashkernel high memory region and reserving crashkernel high
> memory. Then if crashkernel low memory reserving failed we do not
> reserve crashkernel high memory but return immediately. Users can
> take measures when they found kdump kernel cann't be loaded
> successfully.

So we have 3 sementics now,
crashkernel ,low
crashkernel ,high
crashkernel ,low + crashkernel ,high
For the last case, we need make sure both ,low and ,high reserved,

Can we assume we need both ,low and ,high being ok if one specify
these two types in kernel cmdline?

Also in case one suceed, another fail, should we free the reserved memory?

> 
> Signed-off-by: Baoquan He <bhe@redhat.com>
> ---
>  arch/x86/kernel/setup.c | 14 ++++++++------
>  1 file changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index 80f874b..36aeac3 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -513,7 +513,7 @@ static void __init memblock_x86_reserve_range_setup_data(void)
>  # define CRASH_KERNEL_ADDR_HIGH_MAX	MAXMEM
>  #endif
>  
> -static void __init reserve_crashkernel_low(void)
> +static int __init reserve_crashkernel_low(void)
>  {
>  #ifdef CONFIG_X86_64
>  	const unsigned long long alignment = 16<<20;	/* 16M */
> @@ -542,7 +542,7 @@ static void __init reserve_crashkernel_low(void)
>  	} else {
>  		/* passed with crashkernel=0,low ? */
>  		if (!low_size)
> -			return;
> +			return 0;
>  	}
>  
>  	low_base = memblock_find_in_range(low_size, (1ULL<<32),
> @@ -552,7 +552,7 @@ static void __init reserve_crashkernel_low(void)
>  		if (!auto_set)
>  			pr_info("crashkernel low reservation failed - No suitable area found.\n");
>  
> -		return;
> +		return -EINVAL;
>  	}
>  
>  	memblock_reserve(low_base, low_size);
> @@ -564,6 +564,7 @@ static void __init reserve_crashkernel_low(void)
>  	crashk_low_res.end   = low_base + low_size - 1;
>  	insert_resource(&iomem_resource, &crashk_low_res);
>  #endif
> +	return 0;
>  }
>  
>  static void __init reserve_crashkernel(void)
> @@ -613,6 +614,10 @@ static void __init reserve_crashkernel(void)
>  			return;
>  		}
>  	}
> +
> +	if (crash_base >= (1ULL<<32) && reserve_crashkernel_low())
> +		return;
> +
>  	memblock_reserve(crash_base, crash_size);
>  
>  	printk(KERN_INFO "Reserving %ldMB of memory at %ldMB "
> @@ -624,9 +629,6 @@ static void __init reserve_crashkernel(void)
>  	crashk_res.start = crash_base;
>  	crashk_res.end   = crash_base + crash_size - 1;
>  	insert_resource(&iomem_resource, &crashk_res);
> -
> -	if (crash_base >= (1ULL<<32))
> -		reserve_crashkernel_low();
>  }
>  #else
>  static void __init reserve_crashkernel(void)
> -- 
> 1.9.3
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] Do not reserve crashkernel high memory if crashkernel low memory reserving failed
  2015-07-21  7:31   ` Dave Young
@ 2015-07-21  7:38     ` Ingo Molnar
  2015-07-21  8:19       ` Dave Young
                         ` (2 more replies)
  2015-07-21  7:50     ` Baoquan He
  2015-07-27 18:31     ` Yinghai Lu
  2 siblings, 3 replies; 26+ messages in thread
From: Ingo Molnar @ 2015-07-21  7:38 UTC (permalink / raw)
  To: Dave Young
  Cc: Baoquan He, tglx, mingo, hpa, bp, akpm, jkosina, vgoyal,
	linux-kernel, yinghai, Peter Zijlstra


* Dave Young <dyoung@redhat.com> wrote:

> Hi, Baoquan
> 
> The interface was introduced by Yinghai, ccing him.

Also, why was this syntax introduced in the first place? Why should the user 
care??

We should only have a single crashkernel option, to enable it - and everything 
else should be figured out by the kernel, automatically.

Any other sub-options just paper over some fragility elsewhere and make the 
feature harder to use, hence more fragile.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] Do not reserve crashkernel high memory if crashkernel low memory reserving failed
  2015-07-21  7:31   ` Dave Young
  2015-07-21  7:38     ` Ingo Molnar
@ 2015-07-21  7:50     ` Baoquan He
  2015-07-21  8:23       ` Dave Young
  2015-07-27 18:31     ` Yinghai Lu
  2 siblings, 1 reply; 26+ messages in thread
From: Baoquan He @ 2015-07-21  7:50 UTC (permalink / raw)
  To: Dave Young
  Cc: tglx, mingo, hpa, bp, akpm, jkosina, vgoyal, linux-kernel, yinghai

Hi Dave,

On 07/21/15 at 03:31pm, Dave Young wrote:
> Hi, Baoquan
> 
> The interface was introduced by Yinghai, ccing him.
> 
> On 07/19/15 at 10:53pm, Baoquan He wrote:
> > People reported that when allocating crashkernel memory using
> > ",high" and ",low" syntax, there were cases where the reservation
> > of the "high" portion succeeds, but the reservation of the "low"
> > portion fails. Then kexec can load kdump kernel successfully, but
> > the boot of kdump kernel fails as there's no low memory. This is
> > because allocation of low memory for kdump kernel can fail on large
> > systems for reasons. E.g it could be manually specified crashkernel
> > low memory is too large to find in memblock region.
> > 
> > In this patch add return value for reserve_crashkernel_low. Then put
> > the crashkernel low memory reserving earlier, just between finding
> > the crashkernel high memory region and reserving crashkernel high
> > memory. Then if crashkernel low memory reserving failed we do not
> > reserve crashkernel high memory but return immediately. Users can
> > take measures when they found kdump kernel cann't be loaded
> > successfully.
> 
> So we have 3 sementics now,
> crashkernel ,low
> crashkernel ,high
> crashkernel ,low + crashkernel ,high
> For the last case, we need make sure both ,low and ,high reserved,
> 
> Can we assume we need both ,low and ,high being ok if one specify
> these two types in kernel cmdline?

I think so. the reason why ,low is introduced is swiotlb or pci device
need low memory when crashkernel is reserved above 4G. Low memory is
necessary when ,high is specified unless user can make sure their
machines don't need low memory and specify crashkernel=0,low explictly.

> 
> Also in case one suceed, another fail, should we free the reserved memory?

In this patch it doesn't need to free. Since it just finds an available
memblock region for crashkernel high, then try to reserve crashkernel
low memory. If low memory failed to allocate, it doesn't allocate
crashkernel high memory found, but return immediately. Only if low
memory is allocated successfully, high memory is allocated too
subsequently.

Thanks
Baoquan

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] Do not reserve crashkernel high memory if crashkernel low memory reserving failed
  2015-07-21  7:38     ` Ingo Molnar
@ 2015-07-21  8:19       ` Dave Young
  2015-07-22  1:13       ` Baoquan He
  2015-07-27 14:45       ` Joerg Roedel
  2 siblings, 0 replies; 26+ messages in thread
From: Dave Young @ 2015-07-21  8:19 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Baoquan He, tglx, mingo, hpa, bp, akpm, jkosina, vgoyal,
	linux-kernel, yinghai, Peter Zijlstra

On 07/21/15 at 09:38am, Ingo Molnar wrote:
> 
> * Dave Young <dyoung@redhat.com> wrote:
> 
> > Hi, Baoquan
> > 
> > The interface was introduced by Yinghai, ccing him.
> 
> Also, why was this syntax introduced in the first place? Why should the user 
> care??

The history is like below, I might miss something, Yinghai/HPA/Vivek may correct
and provide more background:

First commit is below:
commit 7d41a8a4a2b2438621a9159477bff36a11d79a42
Author: Yinghai Lu <yinghai@kernel.org>
Date:   Thu Jan 24 12:20:10 2013 -0800

    x86, kdump: Remove crashkernel range find limit for 64bit

Previously kdump reservation is limited to <896M, above commit remove the limitation.
Then because of kdump kernel need access low mem for swiotlb buffer, thus there's
commit below:

commit 0212f9159694be61c6bc52e925fa76643e0c1abf
Author: Yinghai Lu <yinghai@kernel.org>
Date:   Thu Jan 24 12:20:11 2013 -0800

    x86: Add Crash kernel low reservation

Later, Vivek found old kexec-tools does not work, then Yinghai introduced new param
crashkernel_high=:

commit 55a20ee7804ab64ac90bcdd4e2868a42829e2784
Author: Yinghai Lu <yinghai@kernel.org>
Date:   Mon Apr 15 22:23:47 2013 -0700

    x86, kdump: Retore crashkernel= to allocate under 896M

Finally it becomes crashkernel=x,high:
commit adbc742bf78695bb98c79d18c558b61571748b99
Author: Yinghai Lu <yinghai@kernel.org>
Date:   Mon Apr 15 22:23:48 2013 -0700

    x86, kdump: Change crashkernel_high/low= to crashkernel=,high/low


> 
> We should only have a single crashkernel option, to enable it - and everything 
> else should be figured out by the kernel, automatically.

Presonally I do not current complicated params either, I hope we can have an
elegant interface, but seems there's no better way to resolve the low memory
only requirement for software tlb.

> 
> Any other sub-options just paper over some fragility elsewhere and make the 
> feature harder to use, hence more fragile.

Agree, it is becoming worse..

Thanks
Dave

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] Do not reserve crashkernel high memory if crashkernel low memory reserving failed
  2015-07-21  7:50     ` Baoquan He
@ 2015-07-21  8:23       ` Dave Young
  2015-07-21  8:58         ` Baoquan He
  0 siblings, 1 reply; 26+ messages in thread
From: Dave Young @ 2015-07-21  8:23 UTC (permalink / raw)
  To: Baoquan He
  Cc: tglx, mingo, hpa, bp, akpm, jkosina, vgoyal, linux-kernel, yinghai

On 07/21/15 at 03:50pm, Baoquan He wrote:
> Hi Dave,
> 
> On 07/21/15 at 03:31pm, Dave Young wrote:
> > Hi, Baoquan
> > 
> > The interface was introduced by Yinghai, ccing him.
> > 
> > On 07/19/15 at 10:53pm, Baoquan He wrote:
> > > People reported that when allocating crashkernel memory using
> > > ",high" and ",low" syntax, there were cases where the reservation
> > > of the "high" portion succeeds, but the reservation of the "low"
> > > portion fails. Then kexec can load kdump kernel successfully, but
> > > the boot of kdump kernel fails as there's no low memory. This is
> > > because allocation of low memory for kdump kernel can fail on large
> > > systems for reasons. E.g it could be manually specified crashkernel
> > > low memory is too large to find in memblock region.
> > > 
> > > In this patch add return value for reserve_crashkernel_low. Then put
> > > the crashkernel low memory reserving earlier, just between finding
> > > the crashkernel high memory region and reserving crashkernel high
> > > memory. Then if crashkernel low memory reserving failed we do not
> > > reserve crashkernel high memory but return immediately. Users can
> > > take measures when they found kdump kernel cann't be loaded
> > > successfully.
> > 
> > So we have 3 sementics now,
> > crashkernel ,low
> > crashkernel ,high
> > crashkernel ,low + crashkernel ,high
> > For the last case, we need make sure both ,low and ,high reserved,
> > 
> > Can we assume we need both ,low and ,high being ok if one specify
> > these two types in kernel cmdline?
> 
> I think so. the reason why ,low is introduced is swiotlb or pci device
> need low memory when crashkernel is reserved above 4G. Low memory is
> necessary when ,high is specified unless user can make sure their
> machines don't need low memory and specify crashkernel=0,low explictly.

I think forcing user to provide crashkernel=0,low even they do not need
is bad, IMHO one should provided crashkernel value they really need..

> 
> > 
> > Also in case one suceed, another fail, should we free the reserved memory?
> 
> In this patch it doesn't need to free. Since it just finds an available
> memblock region for crashkernel high, then try to reserve crashkernel
> low memory. If low memory failed to allocate, it doesn't allocate
> crashkernel high memory found, but return immediately. Only if low
> memory is allocated successfully, high memory is allocated too
> subsequently.

Ok, thanks for explanation
Dave

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] Do not reserve crashkernel high memory if crashkernel low memory reserving failed
  2015-07-21  8:23       ` Dave Young
@ 2015-07-21  8:58         ` Baoquan He
  2015-07-21 19:22           ` Yinghai Lu
  0 siblings, 1 reply; 26+ messages in thread
From: Baoquan He @ 2015-07-21  8:58 UTC (permalink / raw)
  To: Dave Young
  Cc: tglx, mingo, hpa, bp, akpm, jkosina, vgoyal, linux-kernel, yinghai

On 07/21/15 at 04:23pm, Dave Young wrote:
> > I think so. the reason why ,low is introduced is swiotlb or pci device
> > need low memory when crashkernel is reserved above 4G. Low memory is
> > necessary when ,high is specified unless user can make sure their
> > machines don't need low memory and specify crashkernel=0,low explictly.
> 
> I think forcing user to provide crashkernel=0,low even they do not need
> is bad, IMHO one should provided crashkernel value they really need..

Maybe system which don't need low memory is rare, only for testing?


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] Do not reserve crashkernel high memory if crashkernel low memory reserving failed
  2015-07-21  8:58         ` Baoquan He
@ 2015-07-21 19:22           ` Yinghai Lu
  2015-07-22  0:59             ` Baoquan He
                               ` (2 more replies)
  0 siblings, 3 replies; 26+ messages in thread
From: Yinghai Lu @ 2015-07-21 19:22 UTC (permalink / raw)
  To: Baoquan He
  Cc: Dave Young, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Borislav Petkov, Andrew Morton, Jiri Kosina, Vivek Goyal,
	Linux Kernel Mailing List

On Tue, Jul 21, 2015 at 1:58 AM, Baoquan He <bhe@redhat.com> wrote:

> Maybe system which don't need low memory is rare, only for testing?

No, it is not rare.

All recent intel based systems with iommu support does not need low.

And those systems get punished by following patch:

| commit 94fb9334182284e8e7e4bcb9125c25dc33af19d4
| Author: Joerg Roedel <jroedel@suse.de>
| Date:   Wed Jun 10 17:49:42 2015 +0200
|
|    x86/crash: Allocate enough low memory when crashkernel=high

that reserve 256M low always. and those 256M get wasted.

That commit should only be used to workaround some systems that
have partial iommu support.

Yinghai

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] Do not reserve crashkernel high memory if crashkernel low memory reserving failed
  2015-07-21 19:22           ` Yinghai Lu
@ 2015-07-22  0:59             ` Baoquan He
  2015-07-22 23:47               ` Yinghai Lu
  2015-07-22  4:47             ` Minfei Huang
  2015-07-22 10:11             ` Joerg Roedel
  2 siblings, 1 reply; 26+ messages in thread
From: Baoquan He @ 2015-07-22  0:59 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Dave Young, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Borislav Petkov, Andrew Morton, Jiri Kosina, Vivek Goyal,
	Linux Kernel Mailing List

On 07/21/15 at 12:22pm, Yinghai Lu wrote:
> On Tue, Jul 21, 2015 at 1:58 AM, Baoquan He <bhe@redhat.com> wrote:
> 
> > Maybe system which don't need low memory is rare, only for testing?
> 
> No, it is not rare.
> 
> All recent intel based systems with iommu support does not need low.

Yeah, you are right.

> 
> And those systems get punished by following patch:
> 
> | commit 94fb9334182284e8e7e4bcb9125c25dc33af19d4
> | Author: Joerg Roedel <jroedel@suse.de>
> | Date:   Wed Jun 10 17:49:42 2015 +0200
> |
> |    x86/crash: Allocate enough low memory when crashkernel=high
> 
> that reserve 256M low always. and those 256M get wasted.
> 
> That commit should only be used to workaround some systems that
> have partial iommu support.

Those big servers mostly has hardware iommu. But they still can
enable swiotlb suport. Then low memory is needed.

> 
> Yinghai

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] Do not reserve crashkernel high memory if crashkernel low memory reserving failed
  2015-07-21  7:38     ` Ingo Molnar
  2015-07-21  8:19       ` Dave Young
@ 2015-07-22  1:13       ` Baoquan He
  2015-07-27 14:45       ` Joerg Roedel
  2 siblings, 0 replies; 26+ messages in thread
From: Baoquan He @ 2015-07-22  1:13 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Dave Young, tglx, mingo, hpa, bp, akpm, jkosina, vgoyal,
	linux-kernel, yinghai, Peter Zijlstra

On 07/21/15 at 09:38am, Ingo Molnar wrote:
> 
> * Dave Young <dyoung@redhat.com> wrote:
> 
> > Hi, Baoquan
> > 
> > The interface was introduced by Yinghai, ccing him.
> 
> Also, why was this syntax introduced in the first place? Why should the user 
> care??

The user space tool makedumpfile default to use bitmap array to mark
all pages if they are dumpable or need be filtered out. For x86_64 the
filtering logic requires 2bits per 4K of RAM except of kdump kernel
running. Then 1G memory will cost 512K bytes, 1T costs 512M. So for
system with more than 1T physical memory user prefers crashkernel memory
resereved above 4G since memory space above 4G could be huge while
memory under 4G is very limited. So system with more than 1T physical
memory should take high memory and with a very few low memory
reservation, e.g about 100M. This is very cost-effective for those
systems where pci memory space is very precious because of many pci
devices.

I think it's more meaningful to provide a ,high/,low for those systems
which need reserve about 500M~1G crashkernel memory since they can be
allocated in below 4G successfully and can be allocated above 4G for
saving pci memory space. where to reserve is decided by user who knows
if they need it.

> 
> We should only have a single crashkernel option, to enable it - and everything 
> else should be figured out by the kernel, automatically.

We could provide a single crashkernel option, like
crashkernel=xxx:yyy-zzz, xxx is size, yyy-zzz is range starting and end
address. But when memory is reserved above 4G, we still need be told if
low memory is needed since some systems must have low memory, some don't
need it like systems with hardware iommu.

Thanks
Baoquan

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] Do not reserve crashkernel high memory if crashkernel low memory reserving failed
  2015-07-21 19:22           ` Yinghai Lu
  2015-07-22  0:59             ` Baoquan He
@ 2015-07-22  4:47             ` Minfei Huang
  2015-07-22 23:44               ` Yinghai Lu
  2015-07-22 10:11             ` Joerg Roedel
  2 siblings, 1 reply; 26+ messages in thread
From: Minfei Huang @ 2015-07-22  4:47 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Baoquan He, Dave Young, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Borislav Petkov, Andrew Morton, Jiri Kosina,
	Vivek Goyal, Linux Kernel Mailing List

On 07/21/15 at 12:22pm, Yinghai Lu wrote:
> On Tue, Jul 21, 2015 at 1:58 AM, Baoquan He <bhe@redhat.com> wrote:
> 
> > Maybe system which don't need low memory is rare, only for testing?
> 
> No, it is not rare.
> 
> All recent intel based systems with iommu support does not need low.
> 
> And those systems get punished by following patch:
> 
> | commit 94fb9334182284e8e7e4bcb9125c25dc33af19d4
> | Author: Joerg Roedel <jroedel@suse.de>
> | Date:   Wed Jun 10 17:49:42 2015 +0200
> |
> |    x86/crash: Allocate enough low memory when crashkernel=high
> 
> that reserve 256M low always. and those 256M get wasted.
> 
> That commit should only be used to workaround some systems that
> have partial iommu support.
> 

Since low memory does not need for some machines, how about kexec does
not allocate low memory automatically, if cmdline does not specify the
option ",low". User shall know well, if they specify the cmdline with
option ",high".

Thanks
Minfei

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] Do not reserve crashkernel high memory if crashkernel low memory reserving failed
  2015-07-21 19:22           ` Yinghai Lu
  2015-07-22  0:59             ` Baoquan He
  2015-07-22  4:47             ` Minfei Huang
@ 2015-07-22 10:11             ` Joerg Roedel
  2015-07-22 23:41               ` Yinghai Lu
  2 siblings, 1 reply; 26+ messages in thread
From: Joerg Roedel @ 2015-07-22 10:11 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Baoquan He, Dave Young, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Borislav Petkov, Andrew Morton, Jiri Kosina,
	Vivek Goyal, Linux Kernel Mailing List

On Tue, Jul 21, 2015 at 12:22:53PM -0700, Yinghai Lu wrote:
> On Tue, Jul 21, 2015 at 1:58 AM, Baoquan He <bhe@redhat.com> wrote:
> 
> > Maybe system which don't need low memory is rare, only for testing?
> 
> No, it is not rare.
> 
> All recent intel based systems with iommu support does not need low.

All Intel-IOMMU systems have the iommu disabled by default (at least
that is the default in most distros). So low memory is definitly needed
by those systems too.

> that reserve 256M low always. and those 256M get wasted.
> 
> That commit should only be used to workaround some systems that
> have partial iommu support.

We currently lack the infrastructure for that, but I am happy to review
patches. How about letting subsystems announce their need for low
crash-kernel memory and allocate based on that?

The subsystems (like iommu or swiotlb code, for example) could even
announce how much memory they need and we base our allocation on that.


	Joerg


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] Do not reserve crashkernel high memory if crashkernel low memory reserving failed
  2015-07-22 10:11             ` Joerg Roedel
@ 2015-07-22 23:41               ` Yinghai Lu
  2015-07-27 14:03                 ` Joerg Roedel
  0 siblings, 1 reply; 26+ messages in thread
From: Yinghai Lu @ 2015-07-22 23:41 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Baoquan He, Dave Young, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Borislav Petkov, Andrew Morton, Jiri Kosina,
	Vivek Goyal, Linux Kernel Mailing List

On Wed, Jul 22, 2015 at 3:11 AM, Joerg Roedel <joro@8bytes.org> wrote:
> On Tue, Jul 21, 2015 at 12:22:53PM -0700, Yinghai Lu wrote:
>> On Tue, Jul 21, 2015 at 1:58 AM, Baoquan He <bhe@redhat.com> wrote:
>>
>> > Maybe system which don't need low memory is rare, only for testing?
>>
>> No, it is not rare.
>>
>> All recent intel based systems with iommu support does not need low.
>
> All Intel-IOMMU systems have the iommu disabled by default (at least
> that is the default in most distros). So low memory is definitly needed
> by those systems too.

Do those systems need crashkernel=,high?

Do you mean BIOS have that disabled with not exposing DMAR table ?

kernel for RHEL 6 and RHEL7 have them enabled.
Also opensuse kernel have that enabled too.


>
>> that reserve 256M low always. and those 256M get wasted.
>>
>> That commit should only be used to workaround some systems that
>> have partial iommu support.
>
> We currently lack the infrastructure for that, but I am happy to review
> patches. How about letting subsystems announce their need for low
> crash-kernel memory and allocate based on that?
>
> The subsystems (like iommu or swiotlb code, for example) could even
> announce how much memory they need and we base our allocation on that.

That would be hard, as we don't know if second kernel could take what
kernel parameters.
user could disable iommu etc from command kernel for second kernel.

Yinghai

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] Do not reserve crashkernel high memory if crashkernel low memory reserving failed
  2015-07-22  4:47             ` Minfei Huang
@ 2015-07-22 23:44               ` Yinghai Lu
  0 siblings, 0 replies; 26+ messages in thread
From: Yinghai Lu @ 2015-07-22 23:44 UTC (permalink / raw)
  To: Minfei Huang
  Cc: Baoquan He, Dave Young, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Borislav Petkov, Andrew Morton, Jiri Kosina,
	Vivek Goyal, Linux Kernel Mailing List

On Tue, Jul 21, 2015 at 9:47 PM, Minfei Huang <mhuang@redhat.com> wrote:
>
> Since low memory does not need for some machines, how about kexec does
> not allocate low memory automatically, if cmdline does not specify the
> option ",low". User shall know well, if they specify the cmdline with
> option ",high".

That was what I tried to do at that time.  Some others think
automatically set a small
value would be friendly to users.

Yinghai

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] Do not reserve crashkernel high memory if crashkernel low memory reserving failed
  2015-07-22  0:59             ` Baoquan He
@ 2015-07-22 23:47               ` Yinghai Lu
  2015-07-28  0:52                 ` Baoquan He
  0 siblings, 1 reply; 26+ messages in thread
From: Yinghai Lu @ 2015-07-22 23:47 UTC (permalink / raw)
  To: Baoquan He
  Cc: Dave Young, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Borislav Petkov, Andrew Morton, Jiri Kosina, Vivek Goyal,
	Linux Kernel Mailing List

On Tue, Jul 21, 2015 at 5:59 PM, Baoquan He <bhe@redhat.com> wrote:
>> That commit should only be used to workaround some systems that
>> have partial iommu support.
>
> Those big servers mostly has hardware iommu. But they still can
> enable swiotlb suport. Then low memory is needed.

Do you have whole bootlog? I don't understand why those system can not use
full iommu. BIOS problem or HW/silicon limitation?

Yinghai

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] Do not reserve crashkernel high memory if crashkernel low memory reserving failed
  2015-07-22 23:41               ` Yinghai Lu
@ 2015-07-27 14:03                 ` Joerg Roedel
  0 siblings, 0 replies; 26+ messages in thread
From: Joerg Roedel @ 2015-07-27 14:03 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Baoquan He, Dave Young, Thomas Gleixner, Ingo Molnar,
	H. Peter Anvin, Borislav Petkov, Andrew Morton, Jiri Kosina,
	Vivek Goyal, Linux Kernel Mailing List

On Wed, Jul 22, 2015 at 04:41:00PM -0700, Yinghai Lu wrote:
> Do you mean BIOS have that disabled with not exposing DMAR table ?
> 
> kernel for RHEL 6 and RHEL7 have them enabled.
> Also opensuse kernel have that enabled too.

You still need to pass intel_iommu=on in the kernel command line to
enable its usage.



	Joerg


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] Do not reserve crashkernel high memory if crashkernel low memory reserving failed
  2015-07-19 14:53 ` [PATCH v2] " Baoquan He
  2015-07-21  7:31   ` Dave Young
@ 2015-07-27 14:41   ` Joerg Roedel
  1 sibling, 0 replies; 26+ messages in thread
From: Joerg Roedel @ 2015-07-27 14:41 UTC (permalink / raw)
  To: Baoquan He; +Cc: tglx, mingo, hpa, bp, akpm, jkosina, vgoyal, linux-kernel

Hi Baoquan,

thanks for the fix!

On Sun, Jul 19, 2015 at 10:53:20PM +0800, Baoquan He wrote:
> People reported that when allocating crashkernel memory using
> ",high" and ",low" syntax, there were cases where the reservation
> of the "high" portion succeeds, but the reservation of the "low"
> portion fails. Then kexec can load kdump kernel successfully, but
> the boot of kdump kernel fails as there's no low memory. This is
> because allocation of low memory for kdump kernel can fail on large
> systems for reasons. E.g it could be manually specified crashkernel
> low memory is too large to find in memblock region.
> 
> In this patch add return value for reserve_crashkernel_low. Then put
> the crashkernel low memory reserving earlier, just between finding
> the crashkernel high memory region and reserving crashkernel high
> memory. Then if crashkernel low memory reserving failed we do not
> reserve crashkernel high memory but return immediately. Users can
> take measures when they found kdump kernel cann't be loaded
> successfully.
> 
> Signed-off-by: Baoquan He <bhe@redhat.com>
> ---
>  arch/x86/kernel/setup.c | 14 ++++++++------
>  1 file changed, 8 insertions(+), 6 deletions(-)

Reviewed-by: Joerg Roedel <jroedel@suse.de>


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] Do not reserve crashkernel high memory if crashkernel low memory reserving failed
  2015-07-21  7:38     ` Ingo Molnar
  2015-07-21  8:19       ` Dave Young
  2015-07-22  1:13       ` Baoquan He
@ 2015-07-27 14:45       ` Joerg Roedel
  2 siblings, 0 replies; 26+ messages in thread
From: Joerg Roedel @ 2015-07-27 14:45 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Dave Young, Baoquan He, tglx, mingo, hpa, bp, akpm, jkosina,
	vgoyal, linux-kernel, yinghai, Peter Zijlstra

On Tue, Jul 21, 2015 at 09:38:14AM +0200, Ingo Molnar wrote:
> Also, why was this syntax introduced in the first place? Why should the user 
> care??
> 
> We should only have a single crashkernel option, to enable it - and everything 
> else should be figured out by the kernel, automatically.
> 
> Any other sub-options just paper over some fragility elsewhere and make the 
> feature harder to use, hence more fragile.

Hmm, maybe the reason is that old userspace (kdump/kexec tools) can't
deal with crashkernel loaded high, so that the default for
crashkernel=size allocations was kept to be under 896MB.

If that's not an issue we can change the default and get rid of the
,high and ,low syntax.


	Joerg


^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] Do not reserve crashkernel high memory if crashkernel low memory reserving failed
  2015-07-21  7:31   ` Dave Young
  2015-07-21  7:38     ` Ingo Molnar
  2015-07-21  7:50     ` Baoquan He
@ 2015-07-27 18:31     ` Yinghai Lu
  2015-07-27 23:32       ` Baoquan He
  2 siblings, 1 reply; 26+ messages in thread
From: Yinghai Lu @ 2015-07-27 18:31 UTC (permalink / raw)
  To: Dave Young
  Cc: Baoquan He, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Borislav Petkov, Andrew Morton, Jiri Kosina, Vivek Goyal,
	Linux Kernel Mailing List

On Tue, Jul 21, 2015 at 12:31 AM, Dave Young <dyoung@redhat.com> wrote:
>> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
>> index 80f874b..36aeac3 100644
>> --- a/arch/x86/kernel/setup.c
>> +++ b/arch/x86/kernel/setup.c
>> @@ -513,7 +513,7 @@ static void __init memblock_x86_reserve_range_setup_data(void)
>>  # define CRASH_KERNEL_ADDR_HIGH_MAX  MAXMEM
>>  #endif
>>
>> -static void __init reserve_crashkernel_low(void)
>> +static int __init reserve_crashkernel_low(void)
>>  {
>>  #ifdef CONFIG_X86_64
>>       const unsigned long long alignment = 16<<20;    /* 16M */
>> @@ -542,7 +542,7 @@ static void __init reserve_crashkernel_low(void)
>>       } else {
>>               /* passed with crashkernel=0,low ? */
>>               if (!low_size)
>> -                     return;
>> +                     return 0;
>>       }
>>
>>       low_base = memblock_find_in_range(low_size, (1ULL<<32),
>> @@ -552,7 +552,7 @@ static void __init reserve_crashkernel_low(void)
>>               if (!auto_set)
>>                       pr_info("crashkernel low reservation failed - No suitable area found.\n");
>>
>> -             return;
>> +             return -EINVAL;
>>       }
>>
>>       memblock_reserve(low_base, low_size);
>> @@ -564,6 +564,7 @@ static void __init reserve_crashkernel_low(void)
>>       crashk_low_res.end   = low_base + low_size - 1;
>>       insert_resource(&iomem_resource, &crashk_low_res);
>>  #endif
>> +     return 0;
>>  }
>>
>>  static void __init reserve_crashkernel(void)
>> @@ -613,6 +614,10 @@ static void __init reserve_crashkernel(void)
>>                       return;
>>               }
>>       }
>> +
>> +     if (crash_base >= (1ULL<<32) && reserve_crashkernel_low())
>> +             return;
>> +
>>       memblock_reserve(crash_base, crash_size);
>>
>>       printk(KERN_INFO "Reserving %ldMB of memory at %ldMB "
>> @@ -624,9 +629,6 @@ static void __init reserve_crashkernel(void)
>>       crashk_res.start = crash_base;
>>       crashk_res.end   = crash_base + crash_size - 1;
>>       insert_resource(&iomem_resource, &crashk_res);
>> -
>> -     if (crash_base >= (1ULL<<32))
>> -             reserve_crashkernel_low();
>>  }
>>  #else
>>  static void __init reserve_crashkernel(void)

No, you can not move the calling position for reserve_crashkernel_low().

old sequence:

memblock_find_in_range  for high
memblock_reserve for high
memblock_find_in_range  for low
memblock_reserve for low

now you change to:
memblock_find_in_range  for high
memblock_find_in_range  for low
memblock_reserve for low
memblock_reserve for high

during memblock_reserve, we would double the memblock reserve array.
So there is possibility that new membock reserve array is overlapped with
range for  crashdump high.

so you should keep the old sequence, and if reserve_crashkernel_low fail,
just call memblock_free to free high range that is reserved before.

Thanks

Yinghai

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] Do not reserve crashkernel high memory if crashkernel low memory reserving failed
  2015-07-27 18:31     ` Yinghai Lu
@ 2015-07-27 23:32       ` Baoquan He
  0 siblings, 0 replies; 26+ messages in thread
From: Baoquan He @ 2015-07-27 23:32 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Dave Young, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Borislav Petkov, Andrew Morton, Jiri Kosina, Vivek Goyal,
	Linux Kernel Mailing List

On 07/27/15 at 11:31am, Yinghai Lu wrote:
> >>  #else
> >>  static void __init reserve_crashkernel(void)
> 
> No, you can not move the calling position for reserve_crashkernel_low().
> 
> old sequence:
> 
> memblock_find_in_range  for high
> memblock_reserve for high
> memblock_find_in_range  for low
> memblock_reserve for low
> 
> now you change to:
> memblock_find_in_range  for high
> memblock_find_in_range  for low
> memblock_reserve for low
> memblock_reserve for high
> 
> during memblock_reserve, we would double the memblock reserve array.
> So there is possibility that new membock reserve array is overlapped with
> range for  crashdump high.
> 
> so you should keep the old sequence, and if reserve_crashkernel_low fail,
> just call memblock_free to free high range that is reserved before.

Right, memblock_double_array need avoid the required region. Will
repost.

> 
> Thanks
> 
> Yinghai

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] Do not reserve crashkernel high memory if crashkernel low memory reserving failed
  2015-07-22 23:47               ` Yinghai Lu
@ 2015-07-28  0:52                 ` Baoquan He
  2015-07-28  2:45                   ` Yinghai Lu
  0 siblings, 1 reply; 26+ messages in thread
From: Baoquan He @ 2015-07-28  0:52 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Dave Young, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Borislav Petkov, Andrew Morton, Jiri Kosina, Vivek Goyal,
	Linux Kernel Mailing List

On 07/22/15 at 04:47pm, Yinghai Lu wrote:
> On Tue, Jul 21, 2015 at 5:59 PM, Baoquan He <bhe@redhat.com> wrote:
> >> That commit should only be used to workaround some systems that
> >> have partial iommu support.
> >
> > Those big servers mostly has hardware iommu. But they still can
> > enable swiotlb suport. Then low memory is needed.
> 
> Do you have whole bootlog? I don't understand why those system can not use
> full iommu. BIOS problem or HW/silicon limitation?

Sorry for late reply. This problem is reported by customers. They usulay
don't like to make these things public. While for those systems with
good hard iommu support, it could also fail to initialize hw iommu and
then use swiotlb again, E.g in kdump kernel case. You can see in
intel_iommu_init() swiotlb is assigned to 0 only if intel iommu (namely vt-d)
is initialized successfully. There's possibility that hw iommu
initialization will fail, in this case kdump kernel will fail to boot
if no any low memory is given. So we can't make assumption that system
can boot always well without low memory.

Thanks
Baoquan

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] Do not reserve crashkernel high memory if crashkernel low memory reserving failed
  2015-07-28  0:52                 ` Baoquan He
@ 2015-07-28  2:45                   ` Yinghai Lu
  2015-07-28  9:19                     ` Baoquan He
  0 siblings, 1 reply; 26+ messages in thread
From: Yinghai Lu @ 2015-07-28  2:45 UTC (permalink / raw)
  To: Baoquan He
  Cc: Dave Young, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Borislav Petkov, Andrew Morton, Jiri Kosina, Vivek Goyal,
	Linux Kernel Mailing List

On Mon, Jul 27, 2015 at 5:52 PM, Baoquan He <bhe@redhat.com> wrote:
> On 07/22/15 at 04:47pm, Yinghai Lu wrote:
>
> Sorry for late reply. This problem is reported by customers. They usulay
> don't like to make these things public. While for those systems with
> good hard iommu support, it could also fail to initialize hw iommu and
> then use swiotlb again, E.g in kdump kernel case. You can see in
> intel_iommu_init() swiotlb is assigned to 0 only if intel iommu (namely vt-d)
> is initialized successfully. There's possibility that hw iommu
> initialization will fail, in this case kdump kernel will fail to boot
> if no any low memory is given. So we can't make assumption that system
> can boot always well without low memory.

When we had crashkernel=,high working with auto low=40M.
all system with crashkernel=,high worked.
Now come one model (assume it is 16 socket system), and it
could use crashkernel=,high crashkernel=256M,low. So it forces
all auto_low be 256M.

That is ugly. If the customer does not want to make the log public to make
us to find good solution, why should we care about it ?
why not just let them carry the "crashkernel=256M,low" all the way?

Yinghai

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: [PATCH v2] Do not reserve crashkernel high memory if crashkernel low memory reserving failed
  2015-07-28  2:45                   ` Yinghai Lu
@ 2015-07-28  9:19                     ` Baoquan He
  0 siblings, 0 replies; 26+ messages in thread
From: Baoquan He @ 2015-07-28  9:19 UTC (permalink / raw)
  To: Yinghai Lu
  Cc: Dave Young, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Borislav Petkov, Andrew Morton, Jiri Kosina, Vivek Goyal,
	Linux Kernel Mailing List

On 07/27/15 at 07:45pm, Yinghai Lu wrote:
> On Mon, Jul 27, 2015 at 5:52 PM, Baoquan He <bhe@redhat.com> wrote:
> > On 07/22/15 at 04:47pm, Yinghai Lu wrote:
> >
> > Sorry for late reply. This problem is reported by customers. They usulay
> > don't like to make these things public. While for those systems with
> > good hard iommu support, it could also fail to initialize hw iommu and
> > then use swiotlb again, E.g in kdump kernel case. You can see in
> > intel_iommu_init() swiotlb is assigned to 0 only if intel iommu (namely vt-d)
> > is initialized successfully. There's possibility that hw iommu
> > initialization will fail, in this case kdump kernel will fail to boot
> > if no any low memory is given. So we can't make assumption that system
> > can boot always well without low memory.
> 
> When we had crashkernel=,high working with auto low=40M.
> all system with crashkernel=,high worked.
> Now come one model (assume it is 16 socket system), and it
> could use crashkernel=,high crashkernel=256M,low. So it forces
> all auto_low be 256M.

Previously the default low memory was 64+8M (64M is for swiotlb). That
value was changed since people complained their crash dumping failed
because of insufficient low memory unexpectedly, almost 200M low memory
is needed on their system when crashkernel=, high is specified. 
I think it makes sense to set a default low memory size to make systems
work, specify a exact low memory size to make it better (for memory
usage optimizing).
> 
> That is ugly. If the customer does not want to make the log public to make
> us to find good solution, why should we care about it ?
> why not just let them carry the "crashkernel=256M,low" all the way?

Now companies can provide big servers with tens of Tera bytes physical
memory, I am not sure if this will leak information of their products. I
will ask if a boot log is pasted here. They found the old default
72M low memory is not enough since they believe the old default low
memory and then crash dump failed.

> 
> Yinghai

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2015-07-28  9:20 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-19 11:07 [PATCH] Do not reserve crashkernel high memory if crashkernel low memory reserving failed Baoquan He
2015-07-19 13:40 ` Borislav Petkov
2015-07-19 14:23   ` Baoquan He
2015-07-19 14:53 ` [PATCH v2] " Baoquan He
2015-07-21  7:31   ` Dave Young
2015-07-21  7:38     ` Ingo Molnar
2015-07-21  8:19       ` Dave Young
2015-07-22  1:13       ` Baoquan He
2015-07-27 14:45       ` Joerg Roedel
2015-07-21  7:50     ` Baoquan He
2015-07-21  8:23       ` Dave Young
2015-07-21  8:58         ` Baoquan He
2015-07-21 19:22           ` Yinghai Lu
2015-07-22  0:59             ` Baoquan He
2015-07-22 23:47               ` Yinghai Lu
2015-07-28  0:52                 ` Baoquan He
2015-07-28  2:45                   ` Yinghai Lu
2015-07-28  9:19                     ` Baoquan He
2015-07-22  4:47             ` Minfei Huang
2015-07-22 23:44               ` Yinghai Lu
2015-07-22 10:11             ` Joerg Roedel
2015-07-22 23:41               ` Yinghai Lu
2015-07-27 14:03                 ` Joerg Roedel
2015-07-27 18:31     ` Yinghai Lu
2015-07-27 23:32       ` Baoquan He
2015-07-27 14:41   ` Joerg Roedel

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.