linuxppc-dev.lists.ozlabs.org archive mirror
* WARN at kernel/sched/core.c:5358 (kthread_end_lazy_tlb_mm)
@ 2023-06-01 10:46 Sachin Sant
  2023-06-06  9:46 ` Nicholas Piggin
  0 siblings, 1 reply; 3+ messages in thread
From: Sachin Sant @ 2023-06-01 10:46 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: linux-mm, Nick Piggin

While compiling a kernel on an IBM Power system booted with
6.4.0-rc4-next-20230601, the following warning is observed:

[  276.351697] ------------[ cut here ]------------
[  276.351709] WARNING: CPU: 27 PID: 9237 at kernel/sched/core.c:5358 kthread_end_lazy_tlb_mm+0x90/0xa0
[  276.351719] Modules linked in: dm_mod nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bonding tls rfkill ip_set nf_tables nfnetlink sunrpc pseries_rng aes_gcm_p10_crypto xfs libcrc32c sd_mod sr_mod t10_pi crc64_rocksoft_generic cdrom crc64_rocksoft crc64 sg ibmvscsi scsi_transport_srp ibmveth vmx_crypto fuse
[  276.351752] CPU: 27 PID: 9237 Comm: cc1 Kdump: loaded Not tainted 6.4.0-rc4-next-20230601 #1
[  276.351756] Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1030.20 (NH1030_058) hv:phyp pSeries
[  276.351759] NIP:  c0000000001b8c10 LR: c0000000000a8d54 CTR: c00000000046ec00
[  276.351763] REGS: c0000000dce337d0 TRAP: 0700   Not tainted  (6.4.0-rc4-next-20230601)
[  276.351766] MSR:  8000000000029033 <SF,EE,ME,IR,DR,RI,LE>  CR: 24002228  XER: 00000000
[  276.351774] CFAR: c0000000001b8ba0 IRQMASK: 0
[  276.351774] GPR00: c0000000000a8d54 c0000000dce33a70 c0000000014a1800 c000000007852a00
[  276.351774] GPR04: 0000000000000001 ffffffffffffffff 0000000000000000 c000000007852f78
[  276.351774] GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000024002428
[  276.351774] GPR12: c0000000a032b608 c00000135faa5b00 0000000000000000 0000000000000000
[  276.351774] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[  276.351774] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[  276.351774] GPR24: 0000000000000000 0000000000000000 0000000000000000 c000000007852a70
[  276.351774] GPR28: 0000000000000000 0000000000000000 000000000000001b c000000007852a00
[  276.351810] NIP [c0000000001b8c10] kthread_end_lazy_tlb_mm+0x90/0xa0
[  276.351814] LR [c0000000000a8d54] exit_lazy_flush_tlb+0xf4/0x110
[  276.351818] Call Trace:
[  276.351820] [c0000000dce33a70] [0000000000000001] 0x1 (unreliable)
[  276.351825] [c0000000dce33ab0] [c0000000000a8fbc] flush_type_needed+0x24c/0x260
[  276.351829] [c0000000dce33af0] [c0000000000a91a8] __flush_all_mm+0x48/0x2c0
[  276.351833] [c0000000dce33b40] [c0000000004d6dcc] tlb_finish_mmu+0x16c/0x230
[  276.351839] [c0000000dce33b70] [c0000000004d2a2c] exit_mmap+0x17c/0x4c0
[  276.351844] [c0000000dce33c90] [c000000000159120] __mmput+0x60/0x1e0
[  276.351849] [c0000000dce33cc0] [c0000000001689cc] exit_mm+0xdc/0x170
[  276.351853] [c0000000dce33d00] [c000000000168cec] do_exit+0x28c/0x580
[  276.351857] [c0000000dce33db0] [c00000000016922c] do_group_exit+0x4c/0xc0
[  276.351861] [c0000000dce33df0] [c0000000001692c8] sys_exit_group+0x28/0x30
[  276.351866] [c0000000dce33e10] [c000000000034adc] system_call_exception+0x13c/0x340
[  276.351871] [c0000000dce33e50] [c00000000000d05c] system_call_vectored_common+0x15c/0x2ec
[  276.351876] --- interrupt: 3000 at 0x7fffa330c2c4
[  276.351879] NIP:  00007fffa330c2c4 LR: 0000000000000000 CTR: 0000000000000000
[  276.351882] REGS: c0000000dce33e80 TRAP: 3000   Not tainted  (6.4.0-rc4-next-20230601)
[  276.351885] MSR:  800000000280f033 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE>  CR: 44002422  XER: 00000000
[  276.351894] IRQMASK: 0
[  276.351894] GPR00: 00000000000000ea 00007fffed1a4ca0 00007fffa3a77e00 0000000000000000
[  276.351894] GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[  276.351894] GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[  276.351894] GPR12: 0000000000000000 00007fffa3a7cac0 0000000000000000 0000000000000000
[  276.351894] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[  276.351894] GPR20: 0000000000000000 0000000000000000 000000000000003a 0000000000000001
[  276.351894] GPR24: 00007fffa34423c0 0000000000000000 00007fffa3440a38 0000000000000000
[  276.351894] GPR28: 0000000000000001 00007fffa3a75e48 fffffffffffff000 0000000000000000
[  276.351928] NIP [00007fffa330c2c4] 0x7fffa330c2c4
[  276.351930] LR [0000000000000000] 0x0
[  276.351932] --- interrupt: 3000
[  276.351935] Code: 38210020 e8010010 7c0803a6 4e800020 0fe00000 60000000 60000000 60000000 4e800020 60000000 60000000 60000000 <0fe00000> 60000000 60000000 60000000
[  276.351946] ---[ end trace 0000000000000000 ]---
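
For anyone reading the register dump by hand, the MSR line is the quickest check on interrupt state: the kernel-side MSR 8000000000029033 has EE set, so external interrupts were hardware-enabled when the WARN fired (the dump also shows IRQMASK: 0). A minimal userspace sketch of that decoding follows. The helper names (`msr_decode`, `msr_ee_set`) are made up for illustration, and the bit positions are my reading of the 64-bit Power MSR layout rather than something taken from this thread.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumed bit positions (LSB = bit 0) for the MSR flags that appear
 * in the dump; hypothetical table, not copied from kernel headers. */
struct msr_bit {
    unsigned shift;
    const char *name;
};

static const struct msr_bit msr_bits[] = {
    {63, "SF"}, {60, "HV"}, {25, "VEC"}, {23, "VSX"},
    {15, "EE"}, {14, "PR"}, {13, "FP"},  {12, "ME"},
    {5, "IR"},  {4, "DR"},  {1, "RI"},   {0, "LE"},
};

/* Write a comma-separated list of the set, named bits into buf
 * (len must be at least 1). */
static void msr_decode(uint64_t msr, char *buf, size_t len)
{
    size_t used = 0;

    buf[0] = '\0';
    for (size_t i = 0; i < sizeof(msr_bits) / sizeof(msr_bits[0]); i++) {
        if (!((msr >> msr_bits[i].shift) & 1))
            continue;
        int n = snprintf(buf + used, len - used, "%s%s",
                         used ? "," : "", msr_bits[i].name);
        if (n < 0 || (size_t)n >= len - used)
            break; /* output truncated, stop here */
        used += (size_t)n;
    }
}

/* Non-zero when MSR[EE] (external interrupt enable) is set. */
static int msr_ee_set(uint64_t msr)
{
    return (int)((msr >> 15) & 1);
}
```

Feeding it the two MSR values from the trace reproduces the flag lists printed in the dump, which is a reasonable sanity check on the assumed table.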

Git bisect points to the following code change:

commit 253808d464bf472c66d299faa3d8ffb65149f4da
     lazy tlb: consolidate lazy tlb mm switching

- Sachin



* Re: WARN at kernel/sched/core.c:5358 (kthread_end_lazy_tlb_mm)
  2023-06-01 10:46 WARN at kernel/sched/core.c:5358 (kthread_end_lazy_tlb_mm) Sachin Sant
@ 2023-06-06  9:46 ` Nicholas Piggin
  2023-06-06 23:23   ` Michael Ellerman
  0 siblings, 1 reply; 3+ messages in thread
From: Nicholas Piggin @ 2023-06-06  9:46 UTC (permalink / raw)
  To: Sachin Sant, linuxppc-dev; +Cc: linux-mm

On Thu Jun 1, 2023 at 8:46 PM AEST, Sachin Sant wrote:
> While compiling a kernel on an IBM Power system booted with
> 6.4.0-rc4-next-20230601, the following warning is observed:
>
> [  276.351697] ------------[ cut here ]------------
> [  276.351709] WARNING: CPU: 27 PID: 9237 at kernel/sched/core.c:5358 kthread_end_lazy_tlb_mm+0x90/0xa0
> [  276.351719] Modules linked in: dm_mod nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bonding tls rfkill ip_set nf_tables nfnetlink sunrpc pseries_rng aes_gcm_p10_crypto xfs libcrc32c sd_mod sr_mod t10_pi crc64_rocksoft_generic cdrom crc64_rocksoft crc64 sg ibmvscsi scsi_transport_srp ibmveth vmx_crypto fuse
> [  276.351752] CPU: 27 PID: 9237 Comm: cc1 Kdump: loaded Not tainted 6.4.0-rc4-next-20230601 #1
> [  276.351756] Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1030.20 (NH1030_058) hv:phyp pSeries
> [  276.351759] NIP:  c0000000001b8c10 LR: c0000000000a8d54 CTR: c00000000046ec00
> [  276.351763] REGS: c0000000dce337d0 TRAP: 0700   Not tainted  (6.4.0-rc4-next-20230601)
> [  276.351766] MSR:  8000000000029033 <SF,EE,ME,IR,DR,RI,LE>  CR: 24002228  XER: 00000000
> [  276.351774] CFAR: c0000000001b8ba0 IRQMASK: 0
> [  276.351774] GPR00: c0000000000a8d54 c0000000dce33a70 c0000000014a1800 c000000007852a00
> [  276.351774] GPR04: 0000000000000001 ffffffffffffffff 0000000000000000 c000000007852f78
> [  276.351774] GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000024002428
> [  276.351774] GPR12: c0000000a032b608 c00000135faa5b00 0000000000000000 0000000000000000
> [  276.351774] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> [  276.351774] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> [  276.351774] GPR24: 0000000000000000 0000000000000000 0000000000000000 c000000007852a70
> [  276.351774] GPR28: 0000000000000000 0000000000000000 000000000000001b c000000007852a00
> [  276.351810] NIP [c0000000001b8c10] kthread_end_lazy_tlb_mm+0x90/0xa0
> [  276.351814] LR [c0000000000a8d54] exit_lazy_flush_tlb+0xf4/0x110
> [  276.351818] Call Trace:
> [  276.351820] [c0000000dce33a70] [0000000000000001] 0x1 (unreliable)
> [  276.351825] [c0000000dce33ab0] [c0000000000a8fbc] flush_type_needed+0x24c/0x260
> [  276.351829] [c0000000dce33af0] [c0000000000a91a8] __flush_all_mm+0x48/0x2c0
> [  276.351833] [c0000000dce33b40] [c0000000004d6dcc] tlb_finish_mmu+0x16c/0x230
> [  276.351839] [c0000000dce33b70] [c0000000004d2a2c] exit_mmap+0x17c/0x4c0

Thanks for the report. IRQs aren't disabled where I thought they would
be. The fix should be just to add a local_irq_disable() somewhere, but
this looks like it is exposing an upstream bug of mine, so I'll work out
a fix for that first. No big deal for this series: it can stay in -next
for now, though it might require a rebase.
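
To illustrate the invariant being discussed, here is a small userspace model, not kernel code. It assumes the warning is a once-only check that interrupts are disabled on entry, and it shows how a caller that disables interrupts first avoids it. Every `model_*` name is hypothetical, and the "fixed" caller is only a sketch of the "add a local_irq_disable somewhere" idea, not the actual fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Userspace stand-ins for per-CPU interrupt state (hypothetical). */
static bool model_irqs_off;
static int model_warn_count;

static void model_local_irq_disable(void) { model_irqs_off = true; }
static void model_local_irq_enable(void)  { model_irqs_off = false; }

/* Warn at most once per call site, loosely like WARN_ON_ONCE(). */
#define MODEL_WARN_ON_ONCE(cond)                                     \
    do {                                                             \
        static bool warned_;                                         \
        if ((cond) && !warned_) {                                    \
            warned_ = true;                                          \
            model_warn_count++;                                      \
            fprintf(stderr, "WARNING at %s:%d\n", __FILE__, __LINE__); \
        }                                                            \
    } while (0)

/* Models a function that requires interrupts to be off on entry. */
static void model_end_lazy_tlb_mm(void)
{
    MODEL_WARN_ON_ONCE(!model_irqs_off);
    /* the real function would drop the lazy mm reference here */
}

/* Caller that forgets to disable interrupts: trips the warning. */
static void model_exit_lazy_flush_buggy(void)
{
    model_end_lazy_tlb_mm();
}

/* Caller that disables interrupts around the call and restores them. */
static void model_exit_lazy_flush_fixed(void)
{
    bool was_off = model_irqs_off;

    model_local_irq_disable();
    model_end_lazy_tlb_mm();
    if (!was_off)
        model_local_irq_enable();
}
```

The once-only flag is why the report shows a single splat even though the exit path runs for every process that exits.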

Thanks,
Nick


* Re: WARN at kernel/sched/core.c:5358 (kthread_end_lazy_tlb_mm)
  2023-06-06  9:46 ` Nicholas Piggin
@ 2023-06-06 23:23   ` Michael Ellerman
  0 siblings, 0 replies; 3+ messages in thread
From: Michael Ellerman @ 2023-06-06 23:23 UTC (permalink / raw)
  To: Nicholas Piggin, Sachin Sant, linuxppc-dev; +Cc: linux-mm

"Nicholas Piggin" <npiggin@gmail.com> writes:
> On Thu Jun 1, 2023 at 8:46 PM AEST, Sachin Sant wrote:
>> While compiling a kernel on an IBM Power system booted with
>> 6.4.0-rc4-next-20230601, the following warning is observed:
>>
>> [  276.351697] ------------[ cut here ]------------
>> [  276.351709] WARNING: CPU: 27 PID: 9237 at kernel/sched/core.c:5358 kthread_end_lazy_tlb_mm+0x90/0xa0
>> [  276.351719] Modules linked in: dm_mod nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 bonding tls rfkill ip_set nf_tables nfnetlink sunrpc pseries_rng aes_gcm_p10_crypto xfs libcrc32c sd_mod sr_mod t10_pi crc64_rocksoft_generic cdrom crc64_rocksoft crc64 sg ibmvscsi scsi_transport_srp ibmveth vmx_crypto fuse
>> [  276.351752] CPU: 27 PID: 9237 Comm: cc1 Kdump: loaded Not tainted 6.4.0-rc4-next-20230601 #1
>> [  276.351756] Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1030.20 (NH1030_058) hv:phyp pSeries
>> [  276.351759] NIP:  c0000000001b8c10 LR: c0000000000a8d54 CTR: c00000000046ec00
>> [  276.351763] REGS: c0000000dce337d0 TRAP: 0700   Not tainted  (6.4.0-rc4-next-20230601)
>> [  276.351766] MSR:  8000000000029033 <SF,EE,ME,IR,DR,RI,LE>  CR: 24002228  XER: 00000000
>> [  276.351774] CFAR: c0000000001b8ba0 IRQMASK: 0
>> [  276.351774] GPR00: c0000000000a8d54 c0000000dce33a70 c0000000014a1800 c000000007852a00
>> [  276.351774] GPR04: 0000000000000001 ffffffffffffffff 0000000000000000 c000000007852f78
>> [  276.351774] GPR08: 0000000000000000 0000000000000000 0000000000000000 0000000024002428
>> [  276.351774] GPR12: c0000000a032b608 c00000135faa5b00 0000000000000000 0000000000000000
>> [  276.351774] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>> [  276.351774] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>> [  276.351774] GPR24: 0000000000000000 0000000000000000 0000000000000000 c000000007852a70
>> [  276.351774] GPR28: 0000000000000000 0000000000000000 000000000000001b c000000007852a00
>> [  276.351810] NIP [c0000000001b8c10] kthread_end_lazy_tlb_mm+0x90/0xa0
>> [  276.351814] LR [c0000000000a8d54] exit_lazy_flush_tlb+0xf4/0x110
>> [  276.351818] Call Trace:
>> [  276.351820] [c0000000dce33a70] [0000000000000001] 0x1 (unreliable)
>> [  276.351825] [c0000000dce33ab0] [c0000000000a8fbc] flush_type_needed+0x24c/0x260
>> [  276.351829] [c0000000dce33af0] [c0000000000a91a8] __flush_all_mm+0x48/0x2c0
>> [  276.351833] [c0000000dce33b40] [c0000000004d6dcc] tlb_finish_mmu+0x16c/0x230
>> [  276.351839] [c0000000dce33b70] [c0000000004d2a2c] exit_mmap+0x17c/0x4c0
>
> Thanks for the report. IRQs aren't disabled where I thought they would
> be. The fix should be just to add a local_irq_disable() somewhere, but
> this looks like it is exposing an upstream bug of mine, so I'll work out
> a fix for that first. No big deal for this series: it can stay in -next
> for now, though it might require a rebase.

Can we drop the newly added WARN_ON_ONCE() in the interim?

It blows up a bunch of my tests, because they fail on seeing any WARN.
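
As a rough illustration of why a single splat is enough to fail a run, a test harness might simply count warning markers in the captured console log and treat any non-zero count as a failure. The sketch below assumes that approach. `count_kernel_warnings` is a hypothetical helper, and matching on the literal string "WARNING: " is an assumption about what such a harness looks for.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count occurrences of the "WARNING: " marker in a captured log.
 * Hypothetical helper: a real harness may match other patterns too. */
static size_t count_kernel_warnings(const char *log)
{
    static const char marker[] = "WARNING: ";
    size_t count = 0;

    for (const char *p = strstr(log, marker); p != NULL;
         p = strstr(p + sizeof(marker) - 1, marker))
        count++;

    return count;
}
```

With a check like this, a WARN_ON_ONCE() that fires on an otherwise healthy boot turns every test on that kernel red, which is the motivation for dropping it in the interim.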

cheers
