On Mon, 2018-07-09 at 10:53 +0200, Peter Zijlstra wrote:
> On Fri, Jul 06, 2018 at 10:11:50AM -0700, Paul E. McKenney wrote:
> > On Fri, Jul 06, 2018 at 06:29:05PM +0200, Peter Zijlstra wrote:
> > > On Fri, Jul 06, 2018 at 03:53:30PM +0100, David Woodhouse wrote:
> > > >
> > > > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > > > index e4d4e60..89f5814 100644
> > > > --- a/include/linux/sched.h
> > > > +++ b/include/linux/sched.h
> > > > @@ -1616,7 +1616,8 @@ static inline int spin_needbreak(spinlock_t *lock)
> > > >  
> > > >  static __always_inline bool need_resched(void)
> > > >  {
> > > > -	return unlikely(tif_need_resched());
> > > > +	return unlikely(tif_need_resched()) ||
> > > > +		rcu_urgent_qs_requested();
> > > >  }
> > >
> > > Instead of making need_resched() touch two cachelines, I think I would
> > > prefer adding resched_cpu() to rcu_request_urgent_qs_task().
> >
> > I used to do something like this, but decided that whacking each holdout
> > CPU over the head ten times a second was a bit much.
>
> This is only called from the !list_empty(rcu_tasks_holdout) loop in
> rcu_tasks_kthread afaict, and that has a
> schedule_timeout_interruptible(HZ) in it, which I read as once a second.
>
> Which seems like an entirely reasonable amount of time to kick a task.
> Not scheduling for a second is like an eternity.

If that is our only "fix" for KVM, then wouldn't that mean that things
like expand_fdtable() would be *expected* to take "an eternity" when
another CPU happens to be in the guest? Because vcpu_run() would still
loop until the task gets kicked after a second?

Of course, we can explicitly put a check into the KVM loop, but that
brings me back to my original concern: why is it OK to do it there as
a special case and not for the general case construct of

	if (need_resched()) {
		drop_local_locks();
		cond_resched();
		get_local_locks();
	}
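
Spelled out, the sort of loop I have in mind looks roughly like this
(a minimal sketch only; the context structure, its lock and
handle_one_item() are invented for illustration, not taken from any
real caller):

	#include <linux/spinlock.h>
	#include <linux/list.h>
	#include <linux/sched.h>

	/* Hypothetical context; the names are illustrative only. */
	struct my_ctx {
		spinlock_t		lock;
		struct list_head	items;
	};

	static void process_items(struct my_ctx *ctx)
	{
		spin_lock(&ctx->lock);
		while (!list_empty(&ctx->items)) {
			handle_one_item(ctx);	/* hypothetical per-item work */

			if (need_resched()) {
				/* Can't schedule with the spinlock held. */
				spin_unlock(&ctx->lock);
				cond_resched();
				spin_lock(&ctx->lock);
				/* The list may have changed while unlocked. */
			}
		}
		spin_unlock(&ctx->lock);
	}

With the hunk above, that need_resched() check would also pick up the
RCU urgent-QS request; without it, such a loop only breaks when the
task is eventually kicked.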