* Transparent Huge pages hanging on 5.1.x/5.2.0 kernels?
@ 2019-07-15 10:32 David Zarzycki
  2019-08-08 16:08 ` Vlastimil Babka
From: David Zarzycki @ 2019-07-15 10:32 UTC
  To: linux-mm

Hello,

In the last few weeks, one of my build boxes started hanging at the end of a build with a zombie ld.lld process stuck in the kernel:

[97199.634549] CPU: 14 PID: 72214 Comm: ld.lld Kdump: loaded Not tainted 5.2.0-1.fc31.x86_64 #1
[97199.634550] Hardware name: Supermicro SYS-5038K-i-NF9/K1SPE, BIOS 1.0b 04/13/2017
[97199.634551] RIP: 0010:compact_zone+0x4d0/0xce0
[97199.634553] Code: 41 c6 47 78 01 e9 52 fc ff ff 4c 89 f7 48 89 ea 4c 89 e6 e8 22 8e 02 00 49 89 c6 e9 d7 fd ff ff 8b 4c 24 10 4c 89 e2 4c 89 ee <4c> 89 ff e8 e8 e0 ff ff 49 89 c4 48 85 c0 0f 84 bd fe ff ff 45 8b
[97199.634555] RSP: 0018:ffffac6a53c879c0 EFLAGS: 00000202
[97199.634557] RAX: 0000000000000001 RBX: 000000000619f200 RCX: 000000000000000c
[97199.634558] RDX: 000000000619f000 RSI: 000000000619ee20 RDI: ffff95f77ffc8330
[97199.634559] RBP: ffff95fb7ffd4d00 R08: 0000000000000007 R09: 000000000619f000
[97199.634561] R10: 0000000000000000 R11: 0000000000000003 R12: 000000000619f000
[97199.634562] R13: 000000000619ee20 R14: fffffb58467b8000 R15: ffffac6a53c87a90
[97199.634563] FS:  00007ffff10fd700(0000) GS:ffff95f5fb780000(0000) knlGS:0000000000000000
[97199.634566] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[97199.634567] CR2: 00007fff08001378 CR3: 00000054737f6000 CR4: 00000000001406e0
[97199.634568] Call Trace:
[97199.634569]  compact_zone_order+0xde/0x140
[97199.634570]  try_to_compact_pages+0xcc/0x2a0
[97199.634570]  __alloc_pages_direct_compact+0x8c/0x170
[97199.634571]  __alloc_pages_slowpath+0x248/0xdf0
[97199.634572]  ? get_vtime_delta+0x13/0xe0
[97199.634573]  ? finish_task_switch+0x12f/0x2a0
[97199.634574]  __alloc_pages_nodemask+0x2f2/0x340
[97199.634575]  do_huge_pmd_anonymous_page+0x130/0x910
[97199.634576]  __handle_mm_fault+0xfd7/0x1ac0
[97199.634577]  handle_mm_fault+0xc4/0x1f0
[97199.634577]  do_user_addr_fault+0x1f6/0x450
[97199.634578]  do_page_fault+0x33/0x120
[97199.634579]  ? page_fault+0x8/0x30
[97199.634580]  page_fault+0x1e/0x30

This bug seems to go away if I comment out the following lines from my boot script:

# echo always > /sys/kernel/mm/transparent_hugepage/enabled
# echo always > /sys/kernel/mm/transparent_hugepage/defrag
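
For reference, the setting actually in effect can be read back from the same two files: the kernel lists the available modes and marks the active one with square brackets (for example "always madvise [never]"). A minimal sketch, assuming a POSIX shell with sed; thp_active is a hypothetical helper name used only for illustration, not an existing tool:

```shell
#!/bin/sh
# Hypothetical helper: print the bracketed (active) token from a THP
# sysfs line, e.g. "always madvise [never]" -> "never".
thp_active() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Report the active mode for each knob, if the file is readable.
for knob in enabled defrag; do
    path="/sys/kernel/mm/transparent_hugepage/$knob"
    if [ -r "$path" ]; then
        echo "$knob: $(thp_active "$(cat "$path")")"
    else
        echo "$knob: not available"
    fi
done
```

Running this before and after the boot script makes it easy to confirm which mode was active when a hang occurred.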

What can I do to debug this further?

Dave


* Re: Transparent Huge pages hanging on 5.1.x/5.2.0 kernels?
  2019-07-15 10:32 Transparent Huge pages hanging on 5.1.x/5.2.0 kernels? David Zarzycki
@ 2019-08-08 16:08 ` Vlastimil Babka
From: Vlastimil Babka @ 2019-08-08 16:08 UTC
  To: David Zarzycki, linux-mm; +Cc: Mel Gorman

On 7/15/19 12:32 PM, David Zarzycki wrote:
> Hello,
> 
> In the last few weeks, one of my build boxes started hanging at the end of a build with a zombie ld.lld process stuck in the kernel:
> 
> [97199.634549] CPU: 14 PID: 72214 Comm: ld.lld Kdump: loaded Not tainted 5.2.0-1.fc31.x86_64 #1
> [97199.634550] Hardware name: Supermicro SYS-5038K-i-NF9/K1SPE, BIOS 1.0b 04/13/2017
> [97199.634551] RIP: 0010:compact_zone+0x4d0/0xce0
> [97199.634553] Code: 41 c6 47 78 01 e9 52 fc ff ff 4c 89 f7 48 89 ea 4c 89 e6 e8 22 8e 02 00 49 89 c6 e9 d7 fd ff ff 8b 4c 24 10 4c 89 e2 4c 89 ee <4c> 89 ff e8 e8 e0 ff ff 49 89 c4 48 85 c0 0f 84 bd fe ff ff 45 8b
> [97199.634555] RSP: 0018:ffffac6a53c879c0 EFLAGS: 00000202
> [97199.634557] RAX: 0000000000000001 RBX: 000000000619f200 RCX: 000000000000000c
> [97199.634558] RDX: 000000000619f000 RSI: 000000000619ee20 RDI: ffff95f77ffc8330
> [97199.634559] RBP: ffff95fb7ffd4d00 R08: 0000000000000007 R09: 000000000619f000
> [97199.634561] R10: 0000000000000000 R11: 0000000000000003 R12: 000000000619f000
> [97199.634562] R13: 000000000619ee20 R14: fffffb58467b8000 R15: ffffac6a53c87a90
> [97199.634563] FS:  00007ffff10fd700(0000) GS:ffff95f5fb780000(0000) knlGS:0000000000000000
> [97199.634566] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [97199.634567] CR2: 00007fff08001378 CR3: 00000054737f6000 CR4: 00000000001406e0
> [97199.634568] Call Trace:
> [97199.634569]  compact_zone_order+0xde/0x140

This was likely the same issue as
https://bugzilla.kernel.org/show_bug.cgi?id=204165
It was fixed by the patch at https://marc.info/?l=linux-mm&m=156344023621776&w=2
which is now commit 670105a25608 ("mm: compaction: avoid 100% CPU usage during
compaction when a task is killed").
The fix should reach your distro kernel at some point.
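
One way to check whether a stuck task is still spinning in compaction is to sample the compaction counters in /proc/vmstat while the process is hung. A minimal sketch, assuming a kernel built with compaction support so that the compact_* counters are present; compact_counters is a hypothetical helper name, and the one-second interval is arbitrary:

```shell
#!/bin/sh
# Hypothetical helper: keep only the compaction counters from
# vmstat-formatted input on stdin. "|| true" keeps the exit status
# zero when no such counters are present.
compact_counters() {
    grep '^compact_' || true
}

# Sample twice, one second apart. Counters that keep climbing while the
# linker is stuck would be consistent with a task looping in compaction.
if [ -r /proc/vmstat ]; then
    compact_counters < /proc/vmstat
    sleep 1
    compact_counters < /proc/vmstat
fi
```

This only shows system-wide activity, so it is a rough indicator rather than proof that a particular process is the one looping.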

> [97199.634570]  try_to_compact_pages+0xcc/0x2a0
> [97199.634570]  __alloc_pages_direct_compact+0x8c/0x170
> [97199.634571]  __alloc_pages_slowpath+0x248/0xdf0
> [97199.634572]  ? get_vtime_delta+0x13/0xe0
> [97199.634573]  ? finish_task_switch+0x12f/0x2a0
> [97199.634574]  __alloc_pages_nodemask+0x2f2/0x340
> [97199.634575]  do_huge_pmd_anonymous_page+0x130/0x910
> [97199.634576]  __handle_mm_fault+0xfd7/0x1ac0
> [97199.634577]  handle_mm_fault+0xc4/0x1f0
> [97199.634577]  do_user_addr_fault+0x1f6/0x450
> [97199.634578]  do_page_fault+0x33/0x120
> [97199.634579]  ? page_fault+0x8/0x30
> [97199.634580]  page_fault+0x1e/0x30
> 
> This bug seems to go away if I comment out the following lines from my boot script:
> 
> # echo always > /sys/kernel/mm/transparent_hugepage/enabled
> # echo always > /sys/kernel/mm/transparent_hugepage/defrag
> 
> What can I do to debug this further?
> 
> Dave
> 