On Mon, Feb 20, 2017 at 2:03 AM, Niklas Cassel wrote:
> On 02/19/2017 05:46 PM, Guenter Roeck wrote:
> > Cc: Wolfram for input.
> >
> > On 02/17/2017 10:25 AM, Niklas Cassel wrote:
> >> From: Niklas Cassel
> >>
> >> Checking for timer expiration is done from the softirq TIMER_SOFTIRQ.
> >>
> >> Since commit 4cd13c21b207 ("softirq: Let ksoftirqd do its job"),
> >> pending softirqs are no longer always handled immediately, instead,
> >> if there are pending softirqs, and ksoftirqd is in state TASK_RUNNING,
> >> the handling of the softirqs are deferred, and are instead supposed
> >> to be handled by ksoftirqd, when ksoftirqd gets scheduled.
> >>
> >> If a user space process with a real-time policy starts to misbehave
> >> by never relinquishing the CPU while ksoftirqd is in state TASK_RUNNING,
> >> what will happen is that all softirqs will get deferred, while ksoftirqd,
> >> which is supposed to handle the deferred softirqs, will never get to run.
> >>
> >> To make sure that the watchdog is able to fire even when we do not get
> >> to run softirqs, replace the timers with hrtimers.
> >>
> >
> > This makes the driver dependent on HIGH_RES_TIMERS, which is not available
> > on all architectures. Before adding that restriction, I would like to see
> > some discussion if this is the only feasible solution.
> >
> > Is this driver the only one with this problem, or is anything using
> > timers affected ?
>
> Anything using timers is affected.
> The timers will still get incremented, but the code checking for timer
> expiration is run from a softirq, which in this case never gets to run,
> so the timers will never expire.
>
> Before 4cd13c21b207 ("softirq: Let ksoftirqd do its job"), softirqs
> were never deferred, so they always got to run when exiting an irq.
>
> So previously with a user space process using all the CPU, like:
> chrt -r 99 sh -c "while :; do :; done"
> the softdog would still fire.
>

So the question is: if some RT process does an infinite loop, should we care
about the system being functional?

Looks like the OS is now WAI (Working As Intended).

> If we ask the system to run something all the time,
> and the system does that, I don't think we can blame the system.
> It is however important that the watchdog can still detect and
> fire when this happens. Other drivers, not so much.
>
> I guess another solution would be to modify the if-statements in
> kernel/softirq.c to sometimes do the softirq directly, even if ksoftirqd
> is in state TASK_RUNNING, if we also meet some other condition.
> However, do we want to add that extra complexity?
> Perhaps someone with more softirq/scheduler knowledge can give
> some input on this.
>
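To make the starvation scenario discussed above concrete, here is a toy
userspace model in C. It is only a sketch, not kernel code: the names
(simulate_expiry, the policy enum, the flags) are hypothetical and chosen for
illustration, one loop iteration stands in for one timer tick, and the
"ksoftirqd is runnable" state is simply taken as an input rather than derived
from real scheduler behaviour.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; none of these names exist in the kernel. */
enum softirq_policy {
    /* Before 4cd13c21b207: pending softirqs always run on irq exit. */
    POLICY_ALWAYS_INLINE,
    /* After: inline handling is skipped while ksoftirqd is runnable. */
    POLICY_DEFER_WHILE_KSOFTIRQD_RUNNABLE
};

/*
 * Returns the tick at which a timer with the given timeout is seen as
 * expired, or -1 if that never happens within max_ticks.
 *
 * expire_in_hardirq  models an hrtimer-like timer whose expiry is checked
 *                    in the tick interrupt itself, without a softirq.
 * rt_hog             models an RT task that never yields, so ksoftirqd
 *                    is never scheduled.
 * ksoftirqd_runnable models ksoftirqd being in state TASK_RUNNING.
 */
long simulate_expiry(enum softirq_policy policy, bool expire_in_hardirq,
                     bool rt_hog, bool ksoftirqd_runnable,
                     long timeout, long max_ticks)
{
    assert(timeout > 0 && max_ticks > 0);

    for (long jiffies = 1; jiffies <= max_ticks; jiffies++) {
        bool expired = jiffies >= timeout;
        bool run_softirq;

        if (expire_in_hardirq) {
            /* No softirq involved: the interrupt checks expiry directly. */
            if (expired)
                return jiffies;
            continue;
        }

        /* Classic timer: the tick only raises TIMER_SOFTIRQ. */
        if (policy == POLICY_ALWAYS_INLINE || !ksoftirqd_runnable) {
            run_softirq = true;          /* handled on irq exit */
        } else if (!rt_hog) {
            run_softirq = true;          /* ksoftirqd gets the CPU */
            ksoftirqd_runnable = false;  /* and goes back to sleep */
        } else {
            run_softirq = false;         /* deferred, ksoftirqd starved */
        }

        if (run_softirq && expired)
            return jiffies;
    }
    return -1;
}
```

In this model, simulate_expiry(POLICY_DEFER_WHILE_KSOFTIRQD_RUNNABLE, false,
true, true, 5, 100) never sees the timer expire, while the same call with
POLICY_ALWAYS_INLINE, or with expire_in_hardirq set to true, reports expiry at
tick 5. That matches the behaviour described in the quoted text, under the
simplifying assumptions stated above.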