On Mon, 2018-07-09 at 05:34 -0700, Paul E. McKenney wrote:
> The reason that David's latencies went from 100ms to one second is
> because I made this code less aggressive about invoking resched_cpu().

Ten seconds. We saw synchronize_sched() take ten seconds in 4.15. We
wouldn't have been happy with one second, but ten seconds was
considered particularly suboptimal.

> The reason I did that was to allow cond_resched_rcu_qs() to be used less
> without performance regressions.  And just plain cond_resched() on
> !PREEMPT is intended to handle the faster checks.  But KVM defeats
> this by checking need_resched() before invoking cond_resched().

It isn't just KVM. It's a relatively common construct to check
need_resched() first, and only then drop any local locks around a call
to cond_resched().

A bare cond_resched() will call rcu_all_qs() unconditionally, and it is
kind of inconsistent that need_resched() doesn't include the
corresponding condition.
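
To make the inconsistency concrete, here is a rough userspace model of
the two patterns. It is only a sketch: the functions named
need_resched(), cond_resched() and rcu_all_qs() are stubs standing in
for the kernel ones and do not reproduce their real logic, and
resched_flag, rcu_qs_reports, lock_held, guarded_loop() and bare_loop()
are names made up for the illustration.

```c
/* Userspace model only. The stubs below borrow kernel function names
 * but are simplified stand-ins, not the kernel implementations. */
#include <assert.h>
#include <stdbool.h>

static bool resched_flag;    /* hypothetical: models TIF_NEED_RESCHED */
static bool lock_held;       /* hypothetical: models a local lock */
static int  rcu_qs_reports;  /* hypothetical: counts quiescent states */

static bool need_resched(void) { return resched_flag; }
static void rcu_all_qs(void)   { rcu_qs_reports++; }

static void lock(void)   { lock_held = true; }
static void unlock(void) { lock_held = false; }

/* Simplification: every call counts one quiescent state, on the
 * assumption that either rcu_all_qs() runs or a context switch
 * happens, and both let RCU make progress. */
static int cond_resched(void)
{
	assert(!lock_held);          /* callers must drop the lock first */
	rcu_all_qs();
	if (resched_flag) {
		resched_flag = false;    /* pretend we scheduled */
		return 1;
	}
	return 0;
}

/* The common construct: only reach cond_resched() if need_resched()
 * says so, dropping the lock around the call. */
static void guarded_loop(int iterations)
{
	lock();
	for (int i = 0; i < iterations; i++) {
		if (need_resched()) {
			unlock();
			cond_resched();
			lock();
		}
	}
	unlock();
}

/* The bare form: cond_resched() on every iteration. */
static void bare_loop(int iterations)
{
	for (int i = 0; i < iterations; i++)
		cond_resched();
}
```

In this model, with resched_flag clear, guarded_loop() never reports a
quiescent state however long it runs, while bare_loop() reports one per
iteration. That gap is what the need_resched() check hides.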