linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] Add +~800M crashkernel explaination
@ 2016-12-10  0:22 Robert LeBlanc
  2016-12-10  2:49 ` Baoquan He
  0 siblings, 1 reply; 9+ messages in thread
From: Robert LeBlanc @ 2016-12-10  0:22 UTC (permalink / raw)
  To: kexec, linux-doc, linux-kernel; +Cc: Robert LeBlanc

When trying to configure crashkernel greater than about 800 MB, the
kernel fails to allocate memory on x86 and x86_64. This is due to an
undocumented limit that the crashkernel and other low memory items must
be allocated below 896 MB unless the ",high" option is given. This
updates the documentation to explain this and what I understand the
limitations to be on the option.

Signed-off-by: Robert LeBlanc <robert@leblancnet.us>
---
 Documentation/kdump/kdump.txt | 22 +++++++++++++++++-----
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/Documentation/kdump/kdump.txt b/Documentation/kdump/kdump.txt
index b0eb27b..aa3efa8 100644
--- a/Documentation/kdump/kdump.txt
+++ b/Documentation/kdump/kdump.txt
@@ -256,7 +256,9 @@ While the "crashkernel=size[@offset]" syntax is sufficient for most
 configurations, sometimes it's handy to have the reserved memory dependent
 on the value of System RAM -- that's mostly for distributors that pre-setup
 the kernel command line to avoid a unbootable system after some memory has
-been removed from the machine.
+been removed from the machine. If you need to allocate more than ~800M
+for x86 or x86_64 then you must use the simple format as the format
+',high' conflicts with the separators of ranges.
 
 The syntax is:
 
@@ -282,11 +284,21 @@ Boot into System Kernel
 1) Update the boot loader (such as grub, yaboot, or lilo) configuration
    files as necessary.
 
-2) Boot the system kernel with the boot parameter "crashkernel=Y@X",
+2) Boot the system kernel with the boot parameter "crashkernel=Y[@X | ,high]",
    where Y specifies how much memory to reserve for the dump-capture kernel
-   and X specifies the beginning of this reserved memory. For example,
-   "crashkernel=64M@16M" tells the system kernel to reserve 64 MB of memory
-   starting at physical address 0x01000000 (16MB) for the dump-capture kernel.
+   and X specifies the beginning of this reserved memory or ',high' to load in
+   high memory. For example, "crashkernel=64M@16M" tells the system
+   kernel to reserve 64 MB of memory starting at physical address
+   0x01000000 (16MB) for the dump-capture kernel.
+
+   Specifying "crashkernel=1G,high" tells the system kernel to reserve 1 GB
+   of memory using high memory for the dump-capture kernel; some low memory
+   may be allocated as well. If you need more than ~800M for
+   the crash kernel to operate (volumes on FC/iSCSI, large volumes, systemd
+   added to the previous, etc.), you need to specify ',high' since without
+   it the crashkernel has to try to fit under 896M along with some other
+   items and will fail to allocate memory. High memory may only be relevant
+   on x86 and x86_64.
 
    On x86 and x86_64, use "crashkernel=64M@16M".
 
-- 
2.10.2

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] Add +~800M crashkernel explaination
  2016-12-10  0:22 [PATCH] Add +~800M crashkernel explaination Robert LeBlanc
@ 2016-12-10  2:49 ` Baoquan He
  2016-12-10  5:20   ` Robert LeBlanc
  0 siblings, 1 reply; 9+ messages in thread
From: Baoquan He @ 2016-12-10  2:49 UTC (permalink / raw)
  To: Robert LeBlanc; +Cc: kexec, linux-doc, linux-kernel

On 12/09/16 at 05:22pm, Robert LeBlanc wrote:
> When trying to configure crashkernel greater than about 800 MB, the
> kernel fails to allocate memory on x86 and x86_64. This is due to an
> undocumented limit that the crashkernel and other low memory items must
> be allocated below 896 MB unless the ",high" option is given. This
> updates the documentation to explain this and what I understand the
> limitations to be on the option.

This is true, but not very precise. You found it's about 800M because
the running kernel usually needs about 40M of space, and some extra
reservations made before reserve_crashkernel is invoked take another
~10M. However, that is only the normal case; people may build modules
into the kernel or have special code that bloats it. This patch makes
sense for addressing the low|high issue, but it might not be good to
state ~800M so definitively.
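
As a rough worked example of that arithmetic, here is a minimal sketch. The
40M and 10M inputs are the typical figures quoted in this mail, not kernel
constants, and the helper name is made up for illustration:

```c
#include <assert.h>

/* Upper bound for a low crashkernel reservation on x86_64, in MiB. */
#define LOW_LIMIT_MIB 896UL

/*
 * Hypothetical helper: estimate the largest crashkernel size (in MiB)
 * that can still fit below the low limit, given how much of that area
 * is already taken. Both inputs are rough, system-dependent estimates.
 */
static unsigned long max_low_crashkernel_mib(unsigned long kernel_mib,
					     unsigned long early_reserved_mib)
{
	unsigned long used = kernel_mib + early_reserved_mib;

	assert(used <= LOW_LIMIT_MIB);
	return LOW_LIMIT_MIB - used;
}
```

With the typical values, max_low_crashkernel_mib(40, 10) gives 846, which is
why a fixed "~800M" can only ever be a rule of thumb.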

> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Add +~800M crashkernel explaination
  2016-12-10  2:49 ` Baoquan He
@ 2016-12-10  5:20   ` Robert LeBlanc
  2016-12-14  3:08     ` Xunlei Pang
  0 siblings, 1 reply; 9+ messages in thread
From: Robert LeBlanc @ 2016-12-10  5:20 UTC (permalink / raw)
  To: Baoquan He; +Cc: kexec, linux-doc, Linux-Kernel@Vger. Kernel. Org

On Fri, Dec 9, 2016 at 7:49 PM, Baoquan He <bhe@redhat.com> wrote:
> On 12/09/16 at 05:22pm, Robert LeBlanc wrote:
>> When trying to configure crashkernel greater than about 800 MB, the
>> kernel fails to allocate memory on x86 and x86_64. This is due to an
>> undocumented limit that the crashkernel and other low memory items must
>> be allocated below 896 MB unless the ",high" option is given. This
>> updates the documentation to explain this and what I understand the
>> limitations to be on the option.
>
> This is true, but not very precise. You found it's about 800M because
> the running kernel usually needs about 40M of space, and some extra
> reservations made before reserve_crashkernel is invoked take another
> ~10M. However, that is only the normal case; people may build modules
> into the kernel or have special code that bloats it. This patch makes
> sense for addressing the low|high issue, but it might not be good to
> state ~800M so definitively.

My testing showed that I could go anywhere from about 830M to 880M,
depending on distro, kernel version, and stuff that you mentioned. I
just thought some rule of thumb of when to consider using high would
be good. People may not think that 800 MB is 'large' when you have 512
GB of RAM for instance. I thought about making 512 MB be the rule of
thumb, but you can do a lot with ~300 MB.

I'm happy to adjust the wording; what would you recommend? Also, I'm
not 100% sure that I got the cases covered correctly. I was surprised
that I could not get it to work with the "new" format with the
multiple ranges, and that specifying an offset wouldn't work either,
although the offset kind of makes sense. Do you know for sure that it
doesn't work with ranges?

I tried,

crashkernel=256M-1G:128M,high,1G-4G:256M,high,4G-:512M,high

and

crashkernel=256M-1G:128M,1G-4G:256M,4G-:512M,high

and neither worked. It seems that a better separator would be ';'
instead of ',' for ranges, then you could specify options better. Kind
of hard to change now.


----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Add +~800M crashkernel explaination
  2016-12-10  5:20   ` Robert LeBlanc
@ 2016-12-14  3:08     ` Xunlei Pang
  2016-12-14  3:23       ` Xunlei Pang
  2016-12-14 17:50       ` Robert LeBlanc
  0 siblings, 2 replies; 9+ messages in thread
From: Xunlei Pang @ 2016-12-14  3:08 UTC (permalink / raw)
  To: Robert LeBlanc, Baoquan He
  Cc: kexec, Linux-Kernel@Vger. Kernel. Org, linux-doc

On 12/10/2016 at 01:20 PM, Robert LeBlanc wrote:
> On Fri, Dec 9, 2016 at 7:49 PM, Baoquan He <bhe@redhat.com> wrote:
>> On 12/09/16 at 05:22pm, Robert LeBlanc wrote:
>>> When trying to configure crashkernel greater than about 800 MB, the
>>> kernel fails to allocate memory on x86 and x86_64. This is due to an
>>> undocumented limit that the crashkernel and other low memory items must
>>> be allocated below 896 MB unless the ",high" option is given. This
>>> updates the documentation to explain this and what I understand the
>>> limitations to be on the option.
>> This is true, but not very precise. You found it's about 800M because
>> the running kernel usually needs about 40M of space, and some extra
>> reservations made before reserve_crashkernel is invoked take another
>> ~10M. However, that is only the normal case; people may build modules
>> into the kernel or have special code that bloats it. This patch makes
>> sense for addressing the low|high issue, but it might not be good to
>> state ~800M so definitively.
> My testing showed that I could go anywhere from about 830M to 880M,
> depending on distro, kernel version, and stuff that you mentioned. I
> just thought some rule of thumb of when to consider using high would
> be good. People may not think that 800 MB is 'large' when you have 512
> GB of RAM for instance. I thought about making 512 MB be the rule of
> thumb, but you can do a lot with ~300 MB.

Hi Robert,

I think you are correct.

For x86, the kernel uses memblock to search for a suitable region in the range from 16MB to some "end".
Without the ",high" suffix, "end" is CRASH_ADDR_LOW_MAX; otherwise it is CRASH_ADDR_HIGH_MAX.

The definitions for 32-bit and 64-bit are:
#ifdef CONFIG_X86_32
# define CRASH_ADDR_LOW_MAX (512 << 20)
# define CRASH_ADDR_HIGH_MAX    (512 << 20)
#else
# define CRASH_ADDR_LOW_MAX (896UL << 20)
# define CRASH_ADDR_HIGH_MAX    MAXMEM
#endif

Since some memory has already been allocated by the kernel, a reservation failure is highly likely when the
crashkernel value is near 800MB (for x86_64), which is what you ran into. We can't give an exact
threshold, but it would be better to have some explanation of this in the document.
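
To put numbers on the x86_64 case, here is a small sketch. It assumes the
search window really begins at the 16MB mentioned above; SEARCH_START and the
two helper names are made up for illustration and do not come from the kernel
source:

```c
#include <assert.h>

/* Limit quoted above for x86_64 (the non-CONFIG_X86_32 branch). */
#define CRASH_ADDR_LOW_MAX (896UL << 20)

/* Assumed start of the search window: the 16MB lower bound. */
#define SEARCH_START (16UL << 20)

/* Size in bytes of the window searched when ",high" is not given. */
static unsigned long low_window_bytes(void)
{
	return CRASH_ADDR_LOW_MAX - SEARCH_START;
}

/* Convert a byte count to whole MiB. */
static unsigned long to_mib(unsigned long bytes)
{
	return bytes >> 20;
}
```

Under that assumption the window is 880 MiB wide before anything else is
reserved in it, which seems to line up with the upper end of the 830M to 880M
range you measured.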

>
> I'm happy to adjust the wording; what would you recommend? Also, I'm
> not 100% sure that I got the cases covered correctly. I was surprised
> that I could not get it to work with the "new" format with the
> multiple ranges, and that specifying an offset wouldn't work either,
> although the offset kind of makes sense. Do you know for sure that it
> doesn't work with ranges?
>
> I tried,
>
> crashkernel=256M-1G:128M,high,1G-4G:256M,high,4G-:512M,high
>
> and
>
> crashkernel=256M-1G:128M,1G-4G:256M,4G-:512M,high
>
> and neither worked. It seems that a better separator would be ';'
> instead of ',' for ranges, then you could specify options better. Kind
> of hard to change now.

For "crashkernel=range1:size1[,range2:size2,...][@offset]",
I'm afraid the current implementation doesn't support the ",high" suffix, so there is no guarantee it works.
I guess we can add a note to eliminate the confusion.
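
To illustrate why the two forms collide, here is a simplified sketch. It is
not the kernel's parser; the enum and function names are invented for this
example. It only shows that once a ':' marks the value as range syntax, ','
is already the separator between range entries, so a trailing ",high" cannot
be told apart from another (malformed) entry:

```c
#include <assert.h>
#include <string.h>

enum ck_syntax { CK_SIMPLE, CK_SIMPLE_HIGH, CK_RANGES };

/*
 * Simplified illustration only: classify the text that follows
 * "crashkernel=" on the command line.
 */
static enum ck_syntax classify_crashkernel(const char *value)
{
	const char *suffix = ",high";
	size_t len = strlen(value);
	size_t slen = strlen(suffix);

	/* A ':' means range1:size1[,range2:size2,...]; ',' splits entries. */
	if (strchr(value, ':'))
		return CK_RANGES;

	/* Otherwise a trailing ",high" selects the high-memory variant. */
	if (len > slen && strcmp(value + len - slen, suffix) == 0)
		return CK_SIMPLE_HIGH;

	return CK_SIMPLE;
}
```

In this sketch "1G,high" is classified as CK_SIMPLE_HIGH and "64M@16M" as
CK_SIMPLE, while both of the range strings you tried come out as CK_RANGES,
where the ",high" piece has no defined meaning.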

Regards,
Xunlei


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Add +~800M crashkernel explaination
  2016-12-14  3:08     ` Xunlei Pang
@ 2016-12-14  3:23       ` Xunlei Pang
  2016-12-14 17:50       ` Robert LeBlanc
  1 sibling, 0 replies; 9+ messages in thread
From: Xunlei Pang @ 2016-12-14  3:23 UTC (permalink / raw)
  To: Robert LeBlanc, Baoquan He
  Cc: kexec, Linux-Kernel@Vger. Kernel. Org, linux-doc

On 12/14/2016 at 11:08 AM, Xunlei Pang wrote:
> On 12/10/2016 at 01:20 PM, Robert LeBlanc wrote:
>> On Fri, Dec 9, 2016 at 7:49 PM, Baoquan He <bhe@redhat.com> wrote:
>>> On 12/09/16 at 05:22pm, Robert LeBlanc wrote:
>>>> When trying to configure crashkernel greater than about 800 MB, the
>>>> kernel fails to allocate memory on x86 and x86_64. This is due to an
>>>> undocumented limit that the crashkernel and other low memory items must
>>>> be allocated below 896 MB unless the ",high" option is given. This
>>>> updates the documentation to explain this and what I understand the
>>>> limitations to be on the option.
>>> This is true, but not very precise. You found it's about 800M because
>>> the running kernel usually needs about 40M of space, and some extra
>>> reservations made before reserve_crashkernel is invoked take another
>>> ~10M. However, that is only the normal case; people may build modules
>>> into the kernel or have special code that bloats it. This patch makes
>>> sense for addressing the low|high issue, but it might not be good to
>>> state ~800M so definitively.
>> My testing showed that I could go anywhere from about 830M to 880M,
>> depending on distro, kernel version, and stuff that you mentioned. I
>> just thought some rule of thumb of when to consider using high would
>> be good. People may not think that 800 MB is 'large' when you have 512
>> GB of RAM for instance. I thought about making 512 MB be the rule of
>> thumb, but you can do a lot with ~300 MB.
> Hi Robert,
>
> I think you are correct.
>
> For x86, the kernel uses memblock to search for a suitable region in the range from 16MB to some "end".
> Without the ",high" suffix, "end" is CRASH_ADDR_LOW_MAX; otherwise it is CRASH_ADDR_HIGH_MAX.
>
> The definitions for 32-bit and 64-bit are:
> #ifdef CONFIG_X86_32
> # define CRASH_ADDR_LOW_MAX (512 << 20)
> # define CRASH_ADDR_HIGH_MAX    (512 << 20)
> #else
> # define CRASH_ADDR_LOW_MAX (896UL << 20)
> # define CRASH_ADDR_HIGH_MAX    MAXMEM
> #endif
>
> Since some memory has already been allocated by the kernel, a reservation failure is highly likely when the
> crashkernel value is near 800MB (for x86_64), which is what you ran into. We can't give an exact
> threshold, but it would be better to have some explanation of this in the document.

But there is another point:
If you specify the base using crashkernel=size[KMG][@offset[KMG]], for example
"crashkernel=1024M@0x10000000", there is no such limitation, and you may get
a successful reservation. I have no idea why the design is so different.

Regards,
Xunlei

>
>> I'm happy to adjust the wording; what would you recommend? Also, I'm
>> not 100% sure that I got the cases covered correctly. I was surprised
>> that I could not get it to work with the "new" format with the
>> multiple ranges, and that specifying an offset wouldn't work either,
>> although the offset kind of makes sense. Do you know for sure that it
>> doesn't work with ranges?
>>
>> I tried,
>>
>> crashkernel=256M-1G:128M,high,1G-4G:256M,high,4G-:512M,high
>>
>> and
>>
>> crashkernel=256M-1G:128M,1G-4G:256M,4G-:512M,high
>>
>> and neither worked. It seems that a better separator would be ';'
>> instead of ',' for ranges, then you could specify options better. Kind
>> of hard to change now.
> For "crashkernel=range1:size1[,range2:size2,...][@offset]",
> I'm afraid the current implementation doesn't support the ",high" suffix, so there is no guarantee it works.
> I guess we can add a note to eliminate the confusion.
>
> Regards,
> Xunlei

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] Add +~800M crashkernel explaination
  2016-12-14  3:08     ` Xunlei Pang
  2016-12-14  3:23       ` Xunlei Pang
@ 2016-12-14 17:50       ` Robert LeBlanc
  2016-12-14 23:17         ` Xunlei Pang
  1 sibling, 1 reply; 9+ messages in thread
From: Robert LeBlanc @ 2016-12-14 17:50 UTC (permalink / raw)
  To: xlpang; +Cc: Baoquan He, kexec, Linux-Kernel@Vger. Kernel. Org, linux-doc

On Tue, Dec 13, 2016 at 8:08 PM, Xunlei Pang <xpang@redhat.com> wrote:
> On 12/10/2016 at 01:20 PM, Robert LeBlanc wrote:
>> On Fri, Dec 9, 2016 at 7:49 PM, Baoquan He <bhe@redhat.com> wrote:
>>> On 12/09/16 at 05:22pm, Robert LeBlanc wrote:
>>>> When trying to configure crashkernel greater than about 800 MB, the
>>>> kernel fails to allocate memory on x86 and x86_64. This is due to an
>>>> undocumented limit that the crashkernel and other low memory items must
>>>> be allocated below 896 MB unless the ",high" option is given. This
>>>> updates the documentation to explain this and what I understand the
>>>> limitations to be on the option.
>>> This is true, but not very precise. You found it's about 800M because
>>> the running kernel usually needs about 40M of space, and some extra
>>> reservations made before reserve_crashkernel is invoked take another
>>> ~10M. However, that is only the normal case; people may build modules
>>> into the kernel or have special code that bloats it. This patch makes
>>> sense for addressing the low|high issue, but it might not be good to
>>> state ~800M so definitively.
>> My testing showed that I could go anywhere from about 830M to 880M,
>> depending on distro, kernel version, and stuff that you mentioned. I
>> just thought some rule of thumb of when to consider using high would
>> be good. People may not think that 800 MB is 'large' when you have 512
>> GB of RAM for instance. I thought about making 512 MB be the rule of
>> thumb, but you can do a lot with ~300 MB.
>
> Hi Robert,
>
> I think you are correct.
>
> For x86, the kernel uses memblock to search for a suitable region in the range from 16MB to some "end".
> Without the ",high" suffix, "end" is CRASH_ADDR_LOW_MAX; otherwise it is CRASH_ADDR_HIGH_MAX.
>
> The definitions for 32-bit and 64-bit are:
> #ifdef CONFIG_X86_32
> # define CRASH_ADDR_LOW_MAX (512 << 20)
> # define CRASH_ADDR_HIGH_MAX    (512 << 20)
> #else
> # define CRASH_ADDR_LOW_MAX (896UL << 20)
> # define CRASH_ADDR_HIGH_MAX    MAXMEM
> #endif
>
> Since some memory has already been allocated by the kernel, a reservation failure is highly likely when the
> crashkernel value is near 800MB (for x86_64), which is what you ran into. We can't give an exact
> threshold, but it would be better to have some explanation of this in the document.

To make sure I understand what you are saying: you want me to go
into a bit more detail about the limitation and spell out the
differences between x86 and x86_64, right?

>> I'm happy to adjust the wording; what would you recommend? Also, I'm
>> not 100% sure that I got the cases covered correctly. I was surprised
>> that I could not get it to work with the "new" format with the
>> multiple ranges, and that specifying an offset wouldn't work either,
>> although the offset kind of makes sense. Do you know for sure that it
>> doesn't work with ranges?
>>
>> I tried,
>>
>> crashkernel=256M-1G:128M,high,1G-4G:256M,high,4G-:512M,high
>>
>> and
>>
>> crashkernel=256M-1G:128M,1G-4G:256M,4G-:512M,high
>>
>> and neither worked. It seems that a better separator would be ';'
>> instead of ',' for ranges, then you could specify options better. Kind
>> of hard to change now.
>
> For "crashkernel=range1:size1[,range2:size2,...][@offset]",
> I'm afraid the current implementation doesn't support the ",high" suffix, so there is no guarantee it works.
> I guess we can add a note to eliminate the confusion.

I tried to express in the extended syntax section that ',high' is not
available and you have to use the 'simple' format. Do you think this
needs to be expanded as well?


----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1



* Re: [PATCH] Add +~800M crashkernel explaination
  2016-12-14 17:50       ` Robert LeBlanc
@ 2016-12-14 23:17         ` Xunlei Pang
  2017-01-11 19:35           ` Robert LeBlanc
  0 siblings, 1 reply; 9+ messages in thread
From: Xunlei Pang @ 2016-12-14 23:17 UTC (permalink / raw)
  To: Robert LeBlanc, xlpang
  Cc: kexec, Linux-Kernel@Vger. Kernel. Org, Baoquan He, linux-doc

On 12/15/2016 at 01:50 AM, Robert LeBlanc wrote:
> On Tue, Dec 13, 2016 at 8:08 PM, Xunlei Pang <xpang@redhat.com> wrote:
>> On 12/10/2016 at 01:20 PM, Robert LeBlanc wrote:
>>> On Fri, Dec 9, 2016 at 7:49 PM, Baoquan He <bhe@redhat.com> wrote:
>>>> On 12/09/16 at 05:22pm, Robert LeBlanc wrote:
>>>>> When trying to configure crashkernel greater than about 800 MB, the
>>>>> kernel fails to allocate memory on x86 and x86_64. This is due to an
>>>>> undocumented limit that the crashkernel and other low memory items must
>>>>> be allocated below 896 MB unless the ",high" option is given. This
>>>>> updates the documentation to explain this and what I understand the
>>>>> limitations to be on the option.
>>>> This is true, but not very accurate. You found it's about 800M because
>>>> the current kernel usually needs about 40M of space to run, plus some
>>>> extra reservations made before reserve_crashkernel is invoked, another ~10M.
>>>> However, that is only the normal case; people may build modules in or have
>>>> some special code that bloats the kernel. This patch makes sense to address
>>>> the low|high issue, but it might not be good to state ~800M so definitively.
>>> My testing showed that I could go anywhere from about 830M to 880M,
>>> depending on distro, kernel version, and stuff that you mentioned. I
>>> just thought some rule of thumb of when to consider using high would
>>> be good. People may not think that 800 MB is 'large' when you have 512
>>> GB of RAM for instance. I thought about making 512 MB be the rule of
>>> thumb, but you can do a lot with ~300 MB.
>> Hi Robert,
>>
>> I think you are correct.
>>
>> For x86, the kernel uses memblock to locate a suitable range starting from 16MB up to some "end".
>> Without the "high" option, "end" is CRASH_ADDR_LOW_MAX; otherwise it is CRASH_ADDR_HIGH_MAX.
>>
>> You can find the definition for both 32-bit and 64-bit:
>> #ifdef CONFIG_X86_32
>> # define CRASH_ADDR_LOW_MAX (512 << 20)
>> # define CRASH_ADDR_HIGH_MAX    (512 << 20)
>> #else
>> # define CRASH_ADDR_LOW_MAX (896UL << 20)
>> # define CRASH_ADDR_HIGH_MAX    MAXMEM
>> #endif
>>
>> As some memory has already been allocated by the kernel, a reservation is highly likely to fail
>> when the crashkernel value is near 800MB (for x86_64), which is what you hit. We can't give the
>> exact threshold, but it would be better to have some explanation of this in the document.
> To make sure I'm understanding what you are saying, you want me to go
> into a bit more detail about the limitation and specify the
> differences between x86 and x86_64, right?

Yeah, it would be better to have one, at least to mention the different upper bounds.
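
As a rough illustration of those upper bounds, here is a minimal C sketch
using the CRASH_ADDR_LOW_MAX values and the 16MB search start quoted above.
The helper names and the "already_used" parameter are assumptions made for
illustration only; the real memblock search needs one contiguous free range,
so this check is optimistic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MiB (1ULL << 20)

/* Search start mentioned in the thread: the range is located from 16MB up. */
#define CRASH_SEARCH_START (16 * MiB)

/* CRASH_ADDR_LOW_MAX as quoted in the thread: 512MB (32-bit), 896MB (64-bit). */
static uint64_t crash_addr_low_max(bool is_x86_64)
{
    return (is_x86_64 ? 896 : 512) * MiB;
}

/*
 * Hypothetical helper: could `size` bytes fit between 16MB and the low limit
 * if `already_used` bytes of that window are taken by earlier reservations?
 * This only compares totals; it does not model fragmentation.
 */
static bool fits_below_low_max(uint64_t size, uint64_t already_used,
                               bool is_x86_64)
{
    uint64_t window = crash_addr_low_max(is_x86_64) - CRASH_SEARCH_START;

    assert(already_used <= window);
    return size <= window - already_used;
}
```

With the ~40M plus ~10M estimate mentioned earlier in the thread passed as
already_used, a request of 800M fits on x86_64 under this model while 880M
does not, which is why no exact threshold can be promised.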

As I replied in another post, if you really want to detail the behaviour, you should also
mention "crashkernel=size[KMG][@offset[KMG]]" with @offset[KMG] specified explicitly. It is
handled differently, with no upper bound limitation, but doing this may put the first kernel
at risk of lacking low memory (some devices require 32-bit DMA). Use it with care, because
the kernel assumes users know what they are doing and makes the reservation as long as the
given range is available.

>
>>> I'm happy to adjust the wording, what would you recommend? Also, I'm
>>> not 100% sure that I got the cases covered correctly. I was surprised
>>> that I could not get it to work with the "new" format with the
>>> multiple ranges, and that specifying an offset wouldn't work either,
>>> although the offset kind of makes sense. Do you know for sure that it
>>> doesn't work with ranges?
>>>
>>> I tried,
>>>
>>> crashkernel=256M-1G:128M,high,1G-4G:256M,high,4G-:512M,high
>>>
>>> and
>>>
>>> crashkernel=256M-1G:128M,1G-4G:256M,4G-:512M,high
>>>
>>> and neither worked. It seems that a better separator would be ';'
>>> instead of ',' for ranges, then you could specify options better. Kind
>>> of hard to change now.
>> For "crashkernel=range1:size1[,range2:size2,...][@offset]"
>> I'm afraid the current implementation doesn't support the "high" option, so there is no guarantee.
>> I guess we can add a note to eliminate the confusion.
> I tried to express in the extended syntax section that ',high' is not
> available and you have to use the 'simple' format. Do you think this

ditto

> needs to be expanded as well?

If you really have good reasons or use cases, please try it :-)

Regards,
Xunlei



* Re: [PATCH] Add +~800M crashkernel explaination
  2016-12-14 23:17         ` Xunlei Pang
@ 2017-01-11 19:35           ` Robert LeBlanc
  2017-01-12  3:24             ` Xunlei Pang
  0 siblings, 1 reply; 9+ messages in thread
From: Robert LeBlanc @ 2017-01-11 19:35 UTC (permalink / raw)
  To: xlpang; +Cc: kexec, Linux-Kernel@Vger. Kernel. Org, Baoquan He, linux-doc

On Wed, Dec 14, 2016 at 4:17 PM, Xunlei Pang <xpang@redhat.com> wrote:
> As I replied in another post, if you really want to detail the behaviour, you should also
> mention "crashkernel=size[KMG][@offset[KMG]]" with @offset[KMG] specified explicitly. It is
> handled differently, with no upper bound limitation, but doing this may put the first kernel
> at risk of lacking low memory (some devices require 32-bit DMA). Use it with care, because
> the kernel assumes users know what they are doing and makes the reservation as long as the
> given range is available.

crashkernel=1024M@0x10000000

I can't get the offset to work. It seems that it allocates the space
and loads the crash kernel, but I couldn't get it to actually boot
into the crash kernel. Does it work for you? I'm using the 4.9 kernel.
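
For reference, a minimal C sketch of how the "size[KMG][@offset[KMG]]" form
above breaks down into a size and an offset. The function names are
hypothetical and the suffix handling is only similar in spirit to the
kernel's memparse(); it is not the kernel's parser.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse a number with an optional K/M/G suffix; *end is set past it. */
static uint64_t parse_size(const char *s, const char **end)
{
    char *p;
    uint64_t v;

    assert(s != NULL);
    v = strtoull(s, &p, 0);          /* base 0 accepts decimal and 0x hex */
    switch (*p) {
    case 'K': case 'k': v <<= 10; p++; break;
    case 'M': case 'm': v <<= 20; p++; break;
    case 'G': case 'g': v <<= 30; p++; break;
    default: break;
    }
    if (end)
        *end = p;
    return v;
}

/*
 * Hypothetical helper: split "size[KMG][@offset[KMG]]".
 * Returns 1 if an explicit offset was given, 0 otherwise (offset set to 0).
 */
static int parse_crashkernel_simple(const char *arg, uint64_t *size,
                                    uint64_t *offset)
{
    const char *p;

    *size = parse_size(arg, &p);
    *offset = 0;
    if (*p != '@')
        return 0;
    *offset = parse_size(p + 1, &p);
    return 1;
}
```

For the value above this gives size 0x40000000 (1 GB) at offset 0x10000000
(256 MB), so the requested range ends at 0x50000000 (1280 MB), beyond the
896 MB low limit that applies when no offset is given.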

Thanks

----------------
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


* Re: [PATCH] Add +~800M crashkernel explaination
  2017-01-11 19:35           ` Robert LeBlanc
@ 2017-01-12  3:24             ` Xunlei Pang
  0 siblings, 0 replies; 9+ messages in thread
From: Xunlei Pang @ 2017-01-12  3:24 UTC (permalink / raw)
  To: Robert LeBlanc, xlpang
  Cc: kexec, Linux-Kernel@Vger. Kernel. Org, Baoquan He, linux-doc

On 01/12/2017 at 03:35 AM, Robert LeBlanc wrote:
> On Wed, Dec 14, 2016 at 4:17 PM, Xunlei Pang <xpang@redhat.com> wrote:
>> As I replied in another post, if you really want to detail the behaviour, you should also
>> mention "crashkernel=size[KMG][@offset[KMG]]" with @offset[KMG] specified explicitly. It is
>> handled differently, with no upper bound limitation, but doing this may put the first kernel
>> at risk of lacking low memory (some devices require 32-bit DMA). Use it with care, because
>> the kernel assumes users know what they are doing and makes the reservation as long as the
>> given range is available.
> crashkernel=1024M@0x10000000
>
> I can't get the offset to work. It seems that it allocates the space
> and loads the crash kernel, but I couldn't get it to actually boot
> into the crash kernel. Does it work for you? I'm using the 4.9 kernel.

I'm not sure what problem you hit, but the kdump kernel boots fine using 4.9 on my x86_64 machine.

Regards,
Xunlei


end of thread, other threads:[~2017-01-12  3:23 UTC | newest]

Thread overview: 9+ messages
2016-12-10  0:22 [PATCH] Add +~800M crashkernel explaination Robert LeBlanc
2016-12-10  2:49 ` Baoquan He
2016-12-10  5:20   ` Robert LeBlanc
2016-12-14  3:08     ` Xunlei Pang
2016-12-14  3:23       ` Xunlei Pang
2016-12-14 17:50       ` Robert LeBlanc
2016-12-14 23:17         ` Xunlei Pang
2017-01-11 19:35           ` Robert LeBlanc
2017-01-12  3:24             ` Xunlei Pang
