preempt_disable causes big latency

* preempt_disable causes big latency
@ 2021-01-28  9:04 chensong_2000
  2021-01-28 11:30 ` Daniel Vacek
  0 siblings, 1 reply; 3+ messages in thread
From: chensong_2000 @ 2021-01-28  9:04 UTC (permalink / raw)
  To: linux-rt-users

Dear experts,

When i was running cyclictest with ltp(kernel 4.19.90 rt, arch64), 
cyclictest got a big latecy(more than 10ms), the ftrace log shows that 
modprobe called preempt_disable for a long time, which stops cyclictest 
being switched.

here is a piece of log:

modprobe-4551    0...10   712.754529: preempt_disable: 
caller=0xffff0001019900c8 parent=0xffff0001019900c8
...
modprobe-4551    0d..10   712.754589: irq_disable: caller=el1_irq+0x7c 
parent=0xffff0001019900dc
modprobe-4551    0d.h10   712.754590: irq_handler_entry:    irq=2 
name=arch_timer
modprobe-4551    0d.h20   712.754591: hrtimer_cancel: 
hrtimer=0xffff8020f4983d98
modprobe-4551    0d.h10   712.754591: hrtimer_expire_entry: 
hrtimer=0xffff8020f4983d98 now=712720218245 function=hrtimer_wakeup/0x0
modprobe-4551    0d.h20   712.754592: sched_waking: comm=cyclictest 
pid=2212 prio=9 target_cpu=000
modprobe-4551    0dNh30   712.754593: sched_wakeup: cyclictest:2212 [9] 
success=1 CPU:000
modprobe-4551    0dNh10   712.754594: hrtimer_expire_exit: 
hrtimer=0xffff8020f4983d98
modprobe-4551    0dNh10   712.754594: irq_handler_exit:     irq=2 
ret=handled

Further i found 
preemptirq_delay_test(kernel/trace/preemptirq_delay_test.c) can 
reproduce it easily and got the similar ftrace log.

The root cause i found is in el1_irq, it checks preempt count and 
TIF_NEED_RESCHED before it goes through the path el1_preempt, 
preempt_disable just stops it happening.

Then i came up an idea and did an experiment,  call 
preempt_count_set(0); and set_tsk_need_resched(task); in hrtimer_wakeup 
to meet the expectation in order to reschdule.

static enum hrtimer_restart hrtimer_wakeup(struct hrtimer *timer)
{
     struct hrtimer_sleeper *t =
         container_of(timer, struct hrtimer_sleeper, timer);
     struct task_struct *task = t->task;

     t->task = NULL;
     if (task)
         wake_up_process(task);

     if(preempt_count())
         preempt_count_set(0);
     set_tsk_need_resched(task);

     return HRTIMER_NORESTART;
}

but, failed, system froze, no panic information.

Is there anyone having the same problem? i would appreciate it if you 
could share me some information in this case, many thanks.

BR

Song Chen

^ permalink raw reply	[flat|nested] 3+ messages in thread