rcu_sched stall detected, but no state dump

@ 2014-12-10 12:52 Miroslav Benes
  2014-12-10 16:28 ` Paul E. McKenney
From: Miroslav Benes @ 2014-12-10 12:52 UTC
  To: Paul E. McKenney; +Cc: Linux Kernel Mailing List


Hi,

Today I came across an RCU stall that was correctly detected, but there
was no state dump. This seems a bit suspicious to me.

This is the output on the serial console:

[  105.727003] INFO: rcu_sched detected stalls on CPUs/tasks:
[  105.727003]  (detected by 0, t=21002 jiffies, g=3269, c=3268, q=138)
[  105.727003] INFO: Stall ended before state dump start
[  168.732006] INFO: rcu_sched detected stalls on CPUs/tasks:
[  168.732006]  (detected by 0, t=84007 jiffies, g=3269, c=3268, q=270)
[  168.732006] INFO: Stall ended before state dump start
[  231.737003] INFO: rcu_sched detected stalls on CPUs/tasks:
[  231.737003]  (detected by 0, t=147012 jiffies, g=3269, c=3268, q=388)
[  231.737003] INFO: Stall ended before state dump start
[  294.742003] INFO: rcu_sched detected stalls on CPUs/tasks:
[  294.742003]  (detected by 0, t=210017 jiffies, g=3269, c=3268, q=539)
[  294.742003] INFO: Stall ended before state dump start
[  357.747003] INFO: rcu_sched detected stalls on CPUs/tasks:
[  357.747003]  (detected by 0, t=273022 jiffies, g=3269, c=3268, q=693)
[  357.747003] INFO: Stall ended before state dump start
[  420.752003] INFO: rcu_sched detected stalls on CPUs/tasks:
[  420.752003]  (detected by 0, t=336027 jiffies, g=3269, c=3268, q=806)
[  420.752003] INFO: Stall ended before state dump start
...
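
As a sanity check on the numbers above, the t= values can be compared
with the printk timestamps. The following is only a user-space sketch,
not kernel code: the names stall_report, parse_report, estimate_hz and
sample_log are hypothetical helpers made up for illustration, and the
sample lines are copied from the log above. It assumes the report lines
keep exactly the format shown here.

```c
#include <stdio.h>

/* Fields of one "(detected by ...)" line. Hypothetical helper type. */
struct stall_report {
	double ts;	/* printk timestamp, in seconds */
	long t;		/* t= field, in jiffies */
	long g, c, q;	/* g=, c= and q= fields, copied as-is */
};

/* First three report lines, copied from the serial console output. */
static const char *const sample_log[] = {
	"[  105.727003]  (detected by 0, t=21002 jiffies, g=3269, c=3268, q=138)",
	"[  168.732006]  (detected by 0, t=84007 jiffies, g=3269, c=3268, q=270)",
	"[  231.737003]  (detected by 0, t=147012 jiffies, g=3269, c=3268, q=388)",
};
#define SAMPLE_LOG_LEN (sizeof(sample_log) / sizeof(sample_log[0]))

/* Parse one report line. Returns 1 on success, 0 if it does not match. */
static int parse_report(const char *line, struct stall_report *r)
{
	int cpu;	/* detecting CPU, parsed but not kept */

	return sscanf(line,
		      "[ %lf] (detected by %d, t=%ld jiffies, g=%ld, c=%ld, q=%ld",
		      &r->ts, &cpu, &r->t, &r->g, &r->c, &r->q) == 6;
}

/*
 * Estimate jiffies per second from the first and last parsable lines.
 * Returns -1.0 if fewer than two lines parse or time does not advance.
 */
static double estimate_hz(const char *const *lines, size_t n)
{
	struct stall_report first = {0}, last = {0}, cur;
	size_t i, parsed = 0;

	for (i = 0; i < n; i++) {
		if (!parse_report(lines[i], &cur))
			continue;
		if (parsed++ == 0)
			first = cur;
		last = cur;
	}
	if (parsed < 2 || last.ts <= first.ts)
		return -1.0;
	return (double)(last.t - first.t) / (last.ts - first.ts);
}
```

Calling estimate_hz(sample_log, SAMPLE_LOG_LEN) gives about 1000
jiffies per second, so the t= counter advances in step with wall-clock
time between reports. The g= and c= values are identical in every
report, which suggests that all of them refer to the same grace period.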

It can be reproduced with the trivial code attached to this mail (an
infinite loop in a kernel thread created by a kernel module). I have
CONFIG_PREEMPT=n. The kernel thread is scheduled on the same CPU, which
causes a soft lockup (reliably detected when the lockup detector is on).
There certainly is an RCU stall, but I would expect a state dump. Is this
expected behaviour? Maybe I overlooked some config option; I don't know.

I tested 3.18 and also next-20141210. If this is improper behaviour, I
could try to find a good kernel release and bisect it.

Best regards,
--
Miroslav Benes
SUSE Labs

[-- Attachment #2: Type: TEXT/x-c++src; name=kthread_mod.c, Size: 604 bytes --]

#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/kthread.h>
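#include <linux/delay.h>
#include <linux/err.h>

static struct task_struct *test_thread;

/*
 * Busy-loop until the module is unloaded. The loop never sleeps or
 * yields, so with CONFIG_PREEMPT=n it does not give up its CPU.
 */
static int test_thread_fn(void *data)
{
	while (1) {
		if (kthread_should_stop())
			break;

		/* msleep(1000); disabled, so the loop never sleeps */
	}

	return 0;
}

static int __init test_module_init(void)
{
	test_thread = kthread_run(test_thread_fn, NULL, "test_thread");
	/* kthread_run() returns an ERR_PTR() on failure, not NULL. */
	if (IS_ERR(test_thread))
		return PTR_ERR(test_thread);

	return 0;
}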

static void __exit test_module_cleanup(void)
{
	kthread_stop(test_thread);
}

module_init(test_module_init);
module_exit(test_module_cleanup);

MODULE_LICENSE("GPL");
