On Mon, 2018-07-09 at 05:34 -0700, Paul E. McKenney wrote:
> The reason that David's latencies went from 100ms to one second is
> because I made this code less aggressive about invoking resched_cpu().

Ten seconds. We saw synchronize_sched() take ten seconds in 4.15. We
wouldn't have been happy with one second, but ten seconds was
considered particularly suboptimal.

> The reason I did that was to allow cond_resched_rcu_qs() to be used less
> without performance regressions.  And just plain cond_resched() on
> !PREEMPT is intended to handle the faster checks.  But KVM defeats
> this by checking need_resched() before invoking cond_resched().

It isn't just KVM. It's a relatively common construct to use
need_resched(), then drop any local locks around cond_resched(). A bare
cond_resched() will call rcu_all_qs() unconditionally, and it is kind
of inconsistent that need_resched() doesn't include the corresponding
condition.
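
To make the construct concrete, here is a minimal userspace sketch of
it. This is not kernel code: need_resched(), cond_resched() and
rcu_all_qs() below are hypothetical stand-ins that only model the
behaviour described in this message, and resched_flag, rcu_qs_wanted,
demo_lock() and demo_unlock() are names invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Everything below is a hypothetical userspace model, not kernel code. */

static bool resched_flag;     /* models the "reschedule requested" flag */
static bool rcu_qs_wanted;    /* models RCU waiting on a quiescent state */
static int  rcu_qs_reported;  /* quiescent states reported so far */
static bool lock_held;        /* models some local lock */

static bool need_resched(void)
{
	/* Looks only at the resched flag; knows nothing about RCU. */
	return resched_flag;
}

static void rcu_all_qs(void)
{
	/* Report a quiescent state if RCU is waiting for one. */
	if (rcu_qs_wanted) {
		rcu_qs_wanted = false;
		rcu_qs_reported++;
	}
}

static void cond_resched(void)
{
	/* Per the message, a bare call reaches rcu_all_qs() regardless. */
	rcu_all_qs();
	resched_flag = false;  /* pretend any pending reschedule happened */
}

static void demo_lock(void)   { assert(!lock_held); lock_held = true; }
static void demo_unlock(void) { assert(lock_held);  lock_held = false; }

/* The common construct: only call cond_resched() if need_resched(). */
static void guarded_loop(int iterations)
{
	demo_lock();
	for (int i = 0; i < iterations; i++) {
		if (need_resched()) {
			demo_unlock();
			cond_resched();
			demo_lock();
		}
		/* ... one unit of work under the lock ... */
	}
	demo_unlock();
}

/* For contrast: call cond_resched() on every iteration. */
static void bare_loop(int iterations)
{
	for (int i = 0; i < iterations; i++)
		cond_resched();
}
```

In this model, with rcu_qs_wanted set and resched_flag clear,
guarded_loop() never reaches rcu_all_qs() however long it runs, while a
single iteration of bare_loop() reports the quiescent state. That is
the inconsistency between need_resched() and cond_resched() in
miniature.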