linux-pci.vger.kernel.org archive mirror
* Re: Crash kernel with 256 MB reserved memory runs into OOM condition
       [not found] <d65e4a42-1962-78c6-1b5a-65cb70529d62@molgen.mpg.de>
@ 2019-08-12  9:50 ` Michal Hocko
  2019-08-12  9:59   ` Paul Menzel
  2019-08-13  2:43   ` Dave Young
  0 siblings, 2 replies; 7+ messages in thread
From: Michal Hocko @ 2019-08-12  9:50 UTC (permalink / raw)
  To: Paul Menzel
  Cc: Jörg Rödel, iommu, linux-pci, x86, kexec,
	Linux Kernel Mailing List, Donald Buczek

On Mon 12-08-19 11:42:33, Paul Menzel wrote:
> Dear Linux folks,
> 
> 
> On a Dell PowerEdge R7425 with two AMD EPYC 7601 (total 128 threads) and
> 1 TB RAM, the crash kernel with 256 MB of space reserved crashes.
> 
> Please find the messages of the normal and the crash kernel attached.

You will need more memory to reserve for the crash kernel because ...

> [    4.548703] Node 0 DMA free:484kB min:4kB low:4kB high:4kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:568kB managed:484kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> [    4.573612] lowmem_reserve[]: 0 125 125 125
> [    4.577799] Node 0 DMA32 free:1404kB min:1428kB low:1784kB high:2140kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:15720kB writepending:0kB present:261560kB managed:133752kB mlocked:0kB kernel_stack:2496kB pagetables:0kB bounce:0kB free_pcp:212kB local_pcp:212kB free_cma:0kB

... the memory is really depleted and there is nothing to reclaim (no anon
or file pages). Look how the free memory is below the min watermark (the DMA
zone has lowmem protection for GFP_KERNEL allocations).
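
The watermark comparison above can be reproduced from the log line itself.
What follows is a minimal sketch, not a kernel tool: the helper name
parse_zone_line is hypothetical, and the input is a shortened copy of the
DMA32 line quoted above.

```python
import re


def parse_zone_line(line: str) -> dict:
    """Return the 'name:<number>kB' fields of a zone report line as ints (kB)."""
    return {key: int(val) for key, val in re.findall(r"(\w+):(\d+)kB", line)}


# Shortened copy of the DMA32 line from the crash kernel log.
dma32 = ("Node 0 DMA32 free:1404kB min:1428kB low:1784kB high:2140kB "
         "present:261560kB managed:133752kB")

z = parse_zone_line(dma32)
below_min = z["free"] < z["min"]        # True means the zone is under its min watermark
shortfall_kb = z["min"] - z["free"]     # how far below the watermark, in kB
print(below_min, shortfall_kb)
```

With the values from the log, free (1404 kB) is 24 kB below min (1428 kB),
which matches the statement that nothing is left to allocate from.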

[...]
> [    4.923156] Out of memory and no killable processes...

and there is no task left to kill, so we panic.
-- 
Michal Hocko
SUSE Labs


* Re: Crash kernel with 256 MB reserved memory runs into OOM condition
  2019-08-12  9:50 ` Crash kernel with 256 MB reserved memory runs into OOM condition Michal Hocko
@ 2019-08-12  9:59   ` Paul Menzel
  2019-08-13  2:43   ` Dave Young
  1 sibling, 0 replies; 7+ messages in thread
From: Paul Menzel @ 2019-08-12  9:59 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Jörg Rödel, iommu, linux-pci, x86, kexec,
	Linux Kernel Mailing List, Donald Buczek


Dear Michal,


On 12.08.19 11:50, Michal Hocko wrote:
> On Mon 12-08-19 11:42:33, Paul Menzel wrote:

>> On a Dell PowerEdge R7425 with two AMD EPYC 7601 (total 128 threads) and
>> 1 TB RAM, the crash kernel with 256 MB of space reserved crashes.
>>
>> Please find the messages of the normal and the crash kernel attached.
> 
> You will need more memory to reserve for the crash kernel because ...
> 
>> [    4.548703] Node 0 DMA free:484kB min:4kB low:4kB high:4kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:568kB managed:484kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
>> [    4.573612] lowmem_reserve[]: 0 125 125 125
>> [    4.577799] Node 0 DMA32 free:1404kB min:1428kB low:1784kB high:2140kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:15720kB writepending:0kB present:261560kB managed:133752kB mlocked:0kB kernel_stack:2496kB pagetables:0kB bounce:0kB free_pcp:212kB local_pcp:212kB free_cma:0kB
> 
> ... the memory is really depleted and there is nothing to reclaim (no anon
> or file pages). Look how the free memory is below the min watermark (the DMA
> zone has lowmem protection for GFP_KERNEL allocations).
> 
> [...]
>> [    4.923156] Out of memory and no killable processes...
> 
> and there is no task left to kill, so we panic.

Yeah, we figured that.

What we wonder is how 256 MB can be too little for booting, and which
hardware properties make it too small. In the overview I see just one
allocation of about 60 MB.

    [    4.857565] kmalloc-2048           59164KB      59164KB
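
The numbers already quoted in this thread allow a rough budget. The sketch
below only does arithmetic on the DMA32 "present" and "managed" values and on
the kmalloc-2048 line; the interpretation is an assumption: memory that is
present but not managed has usually been set aside during early boot, and the
thread does not establish what took it here.

```python
# Values copied from the crash kernel log quoted in this thread (all in kB).
present_kb = 261560       # DMA32 "present"
managed_kb = 133752       # DMA32 "managed" (what the page allocator can hand out)
kmalloc_2048_kb = 59164   # kmalloc-2048 slab usage

not_managed_kb = present_kb - managed_kb          # set aside before the allocator started
slab_share = kmalloc_2048_kb / managed_kb         # fraction of managed memory in this slab

print(f"present but not managed: {not_managed_kb} kB (~{not_managed_kb / 1024:.0f} MiB)")
print(f"kmalloc-2048 share of managed memory: {slab_share:.0%}")
```

So only about half of the 256 MB reservation is available to the page
allocator, and the single kmalloc-2048 cache accounts for a large part of
that half.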


Kind regards,

Paul




* Re: Crash kernel with 256 MB reserved memory runs into OOM condition
  2019-08-12  9:50 ` Crash kernel with 256 MB reserved memory runs into OOM condition Michal Hocko
  2019-08-12  9:59   ` Paul Menzel
@ 2019-08-13  2:43   ` Dave Young
  2019-08-13  2:46     ` Dave Young
  1 sibling, 1 reply; 7+ messages in thread
From: Dave Young @ 2019-08-13  2:43 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Paul Menzel, linux-pci, Jörg Rödel, x86, kexec,
	Linux Kernel Mailing List, iommu, kasong, lijiang, bhe,
	Donald Buczek

Hi,

On 08/12/19 at 11:50am, Michal Hocko wrote:
> On Mon 12-08-19 11:42:33, Paul Menzel wrote:
> > Dear Linux folks,
> > 
> > 
> > On a Dell PowerEdge R7425 with two AMD EPYC 7601 (total 128 threads) and
> > 1 TB RAM, the crash kernel with 256 MB of space reserved crashes.
> > 
> > Please find the messages of the normal and the crash kernel attached.
> 
> You will need more memory to reserve for the crash kernel because ...
> 
> > [    4.548703] Node 0 DMA free:484kB min:4kB low:4kB high:4kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:568kB managed:484kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> > [    4.573612] lowmem_reserve[]: 0 125 125 125
> > [    4.577799] Node 0 DMA32 free:1404kB min:1428kB low:1784kB high:2140kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:15720kB writepending:0kB present:261560kB managed:133752kB mlocked:0kB kernel_stack:2496kB pagetables:0kB bounce:0kB free_pcp:212kB local_pcp:212kB free_cma:0kB
> 
> ... the memory is really depleted and there is nothing to reclaim (no anon
> or file pages). Look how the free memory is below the min watermark (the DMA
> zone has lowmem protection for GFP_KERNEL allocations).

We found a similar issue on our side while working on kdump on SME-enabled
systems.  Kairui is working on some patches.

Actually, on those SME/SEV-enabled machines swiotlb is enabled
automatically, so we need at least 64M of extra memory for kdump on top
of the normal expectation.

Can you check if this is also your case?
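
To put the 64M figure in proportion, here is a small sketch of the
arithmetic. It assumes the default swiotlb bounce buffer size of 64 MiB and
no swiotlb= override on the command line; the function name is hypothetical.

```python
SWIOTLB_DEFAULT_MIB = 64  # assumption: default bounce buffer size, no swiotlb= override


def usable_after_swiotlb(crashkernel_mib: int,
                         swiotlb_mib: int = SWIOTLB_DEFAULT_MIB) -> int:
    """Return the MiB left of a crashkernel reservation once swiotlb is carved out."""
    return crashkernel_mib - swiotlb_mib


left = usable_after_swiotlb(256)
print(left, f"{SWIOTLB_DEFAULT_MIB / 256:.0%}")
```

For a 256 MB reservation the bounce buffer alone would take a quarter of the
space, leaving 192 MiB for everything else.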

Thanks
Dave


* Re: Crash kernel with 256 MB reserved memory runs into OOM condition
  2019-08-13  2:43   ` Dave Young
@ 2019-08-13  2:46     ` Dave Young
  2019-08-13  2:54       ` Dave Young
  2019-08-15 17:00       ` Messages to kexec@ get moderated (was: Crash kernel with 256 MB reserved memory runs into OOM condition) Paul Menzel
  0 siblings, 2 replies; 7+ messages in thread
From: Dave Young @ 2019-08-13  2:46 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Paul Menzel, linux-pci, Jörg Rödel, x86, kexec,
	Linux Kernel Mailing List, iommu, kasong, lijiang, Donald Buczek

Add more cc.
On 08/13/19 at 10:43am, Dave Young wrote:
> Hi,
> 
> On 08/12/19 at 11:50am, Michal Hocko wrote:
> > On Mon 12-08-19 11:42:33, Paul Menzel wrote:
> > > Dear Linux folks,
> > > 
> > > 
> > > On a Dell PowerEdge R7425 with two AMD EPYC 7601 (total 128 threads) and
> > > 1 TB RAM, the crash kernel with 256 MB of space reserved crashes.
> > > 
> > > Please find the messages of the normal and the crash kernel attached.
> > 
> > You will need more memory to reserve for the crash kernel because ...
> > 
> > > [    4.548703] Node 0 DMA free:484kB min:4kB low:4kB high:4kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:568kB managed:484kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> > > [    4.573612] lowmem_reserve[]: 0 125 125 125
> > > [    4.577799] Node 0 DMA32 free:1404kB min:1428kB low:1784kB high:2140kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:15720kB writepending:0kB present:261560kB managed:133752kB mlocked:0kB kernel_stack:2496kB pagetables:0kB bounce:0kB free_pcp:212kB local_pcp:212kB free_cma:0kB
> > 
> > ... the memory is really depleted and there is nothing to reclaim (no anon
> > or file pages). Look how the free memory is below the min watermark (the DMA
> > zone has lowmem protection for GFP_KERNEL allocations).
> 
> We found a similar issue on our side while working on kdump on SME-enabled
> systems.  Kairui is working on some patches.
> 
> Actually, on those SME/SEV-enabled machines swiotlb is enabled
> automatically, so we need at least 64M of extra memory for kdump on top
> of the normal expectation.
> 
> Can you check if this is also your case?

The question is for Paul. Also, it is always good to cc the kexec mailing
list for kexec and kdump issues.


* Re: Crash kernel with 256 MB reserved memory runs into OOM condition
  2019-08-13  2:46     ` Dave Young
@ 2019-08-13  2:54       ` Dave Young
  2019-09-04 10:10         ` Paul Menzel
  2019-08-15 17:00       ` Messages to kexec@ get moderated (was: Crash kernel with 256 MB reserved memory runs into OOM condition) Paul Menzel
  1 sibling, 1 reply; 7+ messages in thread
From: Dave Young @ 2019-08-13  2:54 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Paul Menzel, linux-pci, Jörg Rödel, x86, kexec,
	Linux Kernel Mailing List, iommu, kasong, lijiang, Donald Buczek

On 08/13/19 at 10:46am, Dave Young wrote:
> Add more cc.
> On 08/13/19 at 10:43am, Dave Young wrote:
> > Hi,
> > 
> > On 08/12/19 at 11:50am, Michal Hocko wrote:
> > > On Mon 12-08-19 11:42:33, Paul Menzel wrote:
> > > > Dear Linux folks,
> > > > 
> > > > 
> > > > On a Dell PowerEdge R7425 with two AMD EPYC 7601 (total 128 threads) and
> > > > 1 TB RAM, the crash kernel with 256 MB of space reserved crashes.
> > > > 
> > > > Please find the messages of the normal and the crash kernel attached.
> > > 
> > > You will need more memory to reserve for the crash kernel because ...
> > > 
> > > > [    4.548703] Node 0 DMA free:484kB min:4kB low:4kB high:4kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:568kB managed:484kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
> > > > [    4.573612] lowmem_reserve[]: 0 125 125 125
> > > > [    4.577799] Node 0 DMA32 free:1404kB min:1428kB low:1784kB high:2140kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:15720kB writepending:0kB present:261560kB managed:133752kB mlocked:0kB kernel_stack:2496kB pagetables:0kB bounce:0kB free_pcp:212kB local_pcp:212kB free_cma:0kB
> > > 
> > > ... the memory is really depleted and there is nothing to reclaim (no anon
> > > or file pages). Look how the free memory is below the min watermark (the DMA
> > > zone has lowmem protection for GFP_KERNEL allocations).
> > 
> > We found a similar issue on our side while working on kdump on SME-enabled
> > systems.  Kairui is working on some patches.
> > 
> > Actually, on those SME/SEV-enabled machines swiotlb is enabled
> > automatically, so we need at least 64M of extra memory for kdump on top
> > of the normal expectation.
> > 
> > Can you check if this is also your case?
> 
> The question is for Paul. Also, it is always good to cc the kexec mailing
> list for kexec and kdump issues.

Looks like a hardware IOMMU is in use, so maybe SME is not enabled on
your machine?

Also, replacing maxcpus=1 with nr_cpus=1 can save some memory; please
give it a try.
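
The difference between the two options is that maxcpus= limits how many CPUs
are brought up during boot, while nr_cpus= limits how many CPUs the kernel
can support at all, so per-CPU data need not be sized for every possible CPU
of a 128-thread machine. A minimal sketch of the substitution on a
command-line string follows; the helper name and the example command line
are hypothetical, and how the crash kernel command line is actually set
depends on the kdump tooling in use.

```python
def use_nr_cpus(cmdline: str, cpus: int = 1) -> str:
    """Drop any maxcpus=/nr_cpus= option and append nr_cpus=<cpus>."""
    kept = [tok for tok in cmdline.split()
            if not tok.startswith(("maxcpus=", "nr_cpus="))]
    return " ".join(kept + [f"nr_cpus={cpus}"])


# Hypothetical crash kernel command line, for illustration only.
example = "root=/dev/sda1 ro maxcpus=1 irqpoll reset_devices"
print(use_nr_cpus(example))
```

Appending the option rather than replacing it in place keeps the helper
simple; the position of the option on the command line does not matter here.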



* Messages to kexec@ get moderated (was: Crash kernel with 256 MB reserved memory runs into OOM condition)
  2019-08-13  2:46     ` Dave Young
  2019-08-13  2:54       ` Dave Young
@ 2019-08-15 17:00       ` Paul Menzel
  1 sibling, 0 replies; 7+ messages in thread
From: Paul Menzel @ 2019-08-15 17:00 UTC (permalink / raw)
  To: Dave Young, Michal Hocko
  Cc: linux-pci, Jörg Rödel, x86, kexec,
	Linux Kernel Mailing List, iommu, kasong, lijiang, Donald Buczek


Dear Dave,


On 13.08.19 04:46, Dave Young wrote:

> On 08/13/19 at 10:43am, Dave Young wrote:

[…]

> The question is for Paul. Also, it is always good to cc the kexec mailing
> list for kexec and kdump issues.

kexec@ was CCed in my original mail, but my messages got moderated. It'd
be great if you checked that with the list administrators.

> Your mail to 'kexec' with the subject
> 
>     Crash kernel with 256 MB reserved memory runs into OOM condition
> 
> Is being held until the list moderator can review it for approval.
> 
> The reason it is being held:
> 
>     Message has a suspicious header
> 
> Either the message will get posted to the list, or you will receive
> notification of the moderator's decision.  If you would like to cancel
> this posting, please visit the following URL:
> 
>     http://lists.infradead.org/mailman/confirm/kexec/a23ab6162ef34d099af5dd86c46113def5152bb1


Kind regards,

Paul




* Re: Crash kernel with 256 MB reserved memory runs into OOM condition
  2019-08-13  2:54       ` Dave Young
@ 2019-09-04 10:10         ` Paul Menzel
  0 siblings, 0 replies; 7+ messages in thread
From: Paul Menzel @ 2019-09-04 10:10 UTC (permalink / raw)
  To: Dave Young, Michal Hocko
  Cc: linux-pci, Jörg Rödel, x86, kexec,
	Linux Kernel Mailing List, iommu, kasong, lijiang, Donald Buczek


Dear Dave,


Thank you for your replies.


On 2019-08-13 04:54, Dave Young wrote:
> On 08/13/19 at 10:46am, Dave Young wrote:

>> On 08/13/19 at 10:43am, Dave Young wrote:

>>> On 08/12/19 at 11:50am, Michal Hocko wrote:
>>>> On Mon 12-08-19 11:42:33, Paul Menzel wrote:

>>>>> On a Dell PowerEdge R7425 with two AMD EPYC 7601 (total 128 threads) and
>>>>> 1 TB RAM, the crash kernel with 256 MB of space reserved crashes.
>>>>>
>>>>> Please find the messages of the normal and the crash kernel attached.
>>>>
>>>> You will need more memory to reserve for the crash kernel because ...
>>>>
>>>>> [    4.548703] Node 0 DMA free:484kB min:4kB low:4kB high:4kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB writepending:0kB present:568kB managed:484kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB
>>>>> [    4.573612] lowmem_reserve[]: 0 125 125 125
>>>>> [    4.577799] Node 0 DMA32 free:1404kB min:1428kB low:1784kB high:2140kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:15720kB writepending:0kB present:261560kB managed:133752kB mlocked:0kB kernel_stack:2496kB pagetables:0kB bounce:0kB free_pcp:212kB local_pcp:212kB free_cma:0kB
>>>>
>>>> ... the memory is really depleted and there is nothing to reclaim (no anon
>>>> or file pages). Look how the free memory is below the min watermark (the DMA
>>>> zone has lowmem protection for GFP_KERNEL allocations).
>>>
>>> We found a similar issue on our side while working on kdump on SME-enabled
>>> systems.  Kairui is working on some patches.
>>>
>>> Actually, on those SME/SEV-enabled machines swiotlb is enabled
>>> automatically, so we need at least 64M of extra memory for kdump on top
>>> of the normal expectation.
>>>
>>> Can you check if this is also your case?
>>
>> The question is for Paul. Also, it is always good to cc the kexec mailing
>> list for kexec and kdump issues.

As already replied, <kexec@lists.infradead.org> was CCed in my original
message, but the list put it under moderation.

> Looks like a hardware IOMMU is in use, so maybe SME is not enabled on
> your machine?

Do you mean AMD Secure Memory Encryption? I do not think we use that.

> Also, replacing maxcpus=1 with nr_cpus=1 can save some memory; please
> give it a try.

Thank you for this suggestion. That did indeed fix it, and the reserved
memory can stay at 256 MB. (The parameter names are a little unintuitive,
I guess for historical reasons [1].)


Kind regards,

Paul


[1]: https://www.kernel.org/doc/Documentation/admin-guide/kernel-parameters.txt


