All of lore.kernel.org
 help / color / mirror / Atom feed
* Fwd: rcu stall in __d_lookup_rcu
       [not found]   ` <CAEtum4kzuWgCTyzUAdGwQZFxj76wnbptN-5Vf4kyv2-nTmM62w@mail.gmail.com>
@ 2021-09-12 13:28     ` lixiaofeng li
  0 siblings, 0 replies; only message in thread
From: lixiaofeng li @ 2021-09-12 13:28 UTC (permalink / raw)
  To: linux-kernel

Hello,
      The rcu stall warming is reported randomly after the system was
uptime more than 100 days. Because it is an embedded system, the
vmcore can not be generated.  From the dumpstack of rcu stall warm,
the rcu stall was detected in __d_lookup_rcu() function.
The similar rcu stall warnings happened 6 times on different machines,
which all pointed to  d_alloc_parallel()-->__d_lookup_rcu().  Some of
them triggered the oops due to khungtaskd detection..
     And the commits 015555f and 8cc07c8 by Will Deacon  have already
been present in our linux  kernel.
Can someone help guide what the issue is and how to process this type
of issue further?

1. The linux version is 4.14.47 and the cpu is fsl,T1040RDB, 32bits.
    4.14.47+gb09b730 #1 SMP Fri Apr 2 01:34:59 UTC 2021 ppc GNU/Linux.
2. The tree rcu is used and preempt is disabled
CONFIG_TREE_RCU=y
CONFIG_SRCU=y
CONFIG_TREE_SRCU=y
CONFIG_RCU_STALL_COMMON=y
CONFIG_RCU_NEED_SEGCBLIST=y
CONFIG_HAVE_RCU_TABLE_FREE=y
CONFIG_RCU_CPU_STALL_TIMEOUT=60
CONFIG_RCU_TRACE=y

CONFIG_PREEMPT_NONE=y

3. Below is the rcu warming:
[10953024.068601] INFO: rcu_sched self-detected stall on CPU
[10953024.074035]    0-...: (59998 ticks this GP)
idle=cc2/140000000000001/0 softirq=1466139644/1466139644 fqs=14337
[10953024.084226]     (t=60000 jiffies g=872597753 c=872597752 q=83688)
[10953024.090418] Task dump for CPU 0:
[10953024.090421] pidof           R  running task     6096 29584
29580 0x00000006
[10953024.090439] Call Trace:
[10953024.090450] [e871fb00] [c00cc2f4] rcu_dump_cpu_stacks+0xa4/0x100
(unreliable)
[10953024.090465] [e871fb20] [c00cb52c] rcu_check_callbacks+0x7dc/0xa00
[10953024.090476] [e871fb90] [c00d3ec0] update_process_times+0x40/0x80
[10953024.090481] [e871fba0] [c00eccac] tick_sched_handle.isra.5+0x4c/0x70
[10953024.090486] [e871fbb0] [c00ecd44] tick_sched_timer+0x74/0x110
[10953024.090493] [e871fbd0] [c00d5698] __hrtimer_run_queues+0x138/0x380
[10953024.090499] [e871fc20] [c00d5cc8] hrtimer_interrupt+0x108/0x2c0
[10953024.090510] [e871fc60] [c000bee8] __timer_interrupt+0xf8/0x340
[10953024.090516] [e871fc90] [c000c360] timer_interrupt+0xd0/0x110
[10953024.090525] [e871fcb0] [c0013928] ret_from_except+0x0/0x18
[10953024.090537] --- interrupt: 901 at __d_lookup_rcu+0x8c/0x210
[10953024.090537]     LR = d_alloc_parallel+0xe4/0x520
[10953024.090541] [e871fd70] [c02537bc] __d_alloc+0x10c/0x230 (unreliable)
[10953024.090547] [e871fda0] [c02540b4] d_alloc_parallel+0xe4/0x520
[10953024.090556] [e871fe10] [c02ae7b0] proc_fill_cache+0x120/0x1e0
[10953024.090562] [e871fe70] [c02af940] proc_pid_readdir+0x1a0/0x300
[10953024.090567] [e871fed0] [c024c4e4] iterate_dir+0x1d4/0x240
[10953024.090572] [e871ff00] [c024cbd8] SyS_getdents+0x98/0x160
[10953024.090578] [e871ff40] [c0013278] ret_from_syscall+0x0/0x3c
[10953024.090583] --- interrupt: c01 at 0xfea2d14
[10953024.090583]     LR = 0xfea2ccc
[10953125.985432] INFO: task notfmgrd:5188 blocked for more than 120 seconds.
[10953125.992357]       Tainted: P           O    4.14.47+gb09b730 #1
[10953125.998646] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[10953126.006769] notfmgrd        D 3984  5188      1 0x00000002
[10953126.006785] Call Trace:
[10953126.006800] [e78cdad0] [c0088584]
__update_load_avg_se.isra.3+0x234/0x270 (unreliable)
[10953126.006808] [e78cdb90] [c09b8dcc] __schedule+0x2cc/0x9d0
[10953126.006813] [e78cdbf0] [c09b9508] schedule+0x38/0xc0
[10953126.006820] [e78cdc00] [c09bd6d4] schedule_timeout+0x244/0x4b0
[10953126.006826] [e78cdc50] [c09ba428] wait_for_common+0xc8/0x1c0
[10953126.006835] [e78cdc90] [c00c128c] __wait_rcu_gp+0x22c/0x240
[10953126.006844] [e78cdcd0] [c00ca284] synchronize_sched+0x54/0x90
[10953126.006861] [e78cdd00] [f46dc804] pram_notify_change+0x214/0x3b0 [pramfs]
[10953126.006872] [e78cdd40] [c02592f4] notify_change+0x1e4/0x520
[10953126.006879] [e78cdd70] [c022d9ec] do_truncate+0x7c/0x100
[10953126.006890] [e78cddc0] [c0246674] path_openat+0xf74/0x13f0
[10953126.006896] [e78cde50] [c02481e4] do_filp_open+0x74/0x100
[10953126.006902] [e78cdf00] [c022f3c0] do_sys_open+0x220/0x2c0
[10953126.006913] [e78cdf40] [c0013278] ret_from_syscall+0x0/0x3c
[10953126.006918] --- interrupt: c01 at 0xfed5eb8
[10953126.006918]     LR = 0xfed5e7c
[10953126.006921] Kernel panic - not syncing: hung_task: blocked tasks
[10953126.013201] CPU: 3 PID: 404 Comm: khungtaskd Tainted: P
 O    4.14.47+gb09b730 #1
[10953126.021740] Call Trace:
[10953126.024452] [e93b1e30] [c09a05a8] dump_stack+0x88/0xc0 (unreliable)
[10953126.030993] [e93b1e50] [c004a088] panic+0x124/0x28c
[10953126.036145] [e93b1eb0] [c0124fd8] watchdog+0x3c8/0x430
[10953126.041557] [e93b1f10] [c0072594] kthread+0x164/0x170
[10953126.046878] [e93b1f40] [c00133b0] ret_from_kernel_thread+0x5c/0x64

Thanks

regards
sky

^ permalink raw reply	[flat|nested] only message in thread

only message in thread, other threads:[~2021-09-12 13:28 UTC | newest]

Thread overview: (only message) (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <CAEtum4m92ibKHY6W2hWDB8vg96HMreyoXCEPVCq137nO25E7Gg@mail.gmail.com>
     [not found] ` <CAEtum4m4b+q9+T7s1-ReYYa0VXwb1h+n6BKWssB=YzcJgCznaA@mail.gmail.com>
     [not found]   ` <CAEtum4kzuWgCTyzUAdGwQZFxj76wnbptN-5Vf4kyv2-nTmM62w@mail.gmail.com>
2021-09-12 13:28     ` Fwd: rcu stall in __d_lookup_rcu lixiaofeng li

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.