On Mon, 2018-07-09 at 10:53 +0200, Peter Zijlstra wrote:
> On Fri, Jul 06, 2018 at 10:11:50AM -0700, Paul E. McKenney wrote:
> > On Fri, Jul 06, 2018 at 06:29:05PM +0200, Peter Zijlstra wrote:
> > > On Fri, Jul 06, 2018 at 03:53:30PM +0100, David Woodhouse wrote:
> > > >
> > > > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > > > index e4d4e60..89f5814 100644
> > > > --- a/include/linux/sched.h
> > > > +++ b/include/linux/sched.h
> > > > @@ -1616,7 +1616,8 @@ static inline int spin_needbreak(spinlock_t *lock)
> > > >  
> > > >  static __always_inline bool need_resched(void)
> > > >  {
> > > > -	return unlikely(tif_need_resched());
> > > > +	return unlikely(tif_need_resched()) ||
> > > > +		rcu_urgent_qs_requested();
> > > >  }
> > >
> > > Instead of making need_resched() touch two cachelines, I think I would
> > > prefer adding resched_cpu() to rcu_request_urgent_qs_task().
> >
> > I used to do something like this, but decided that whacking each holdout
> > CPU over the head ten times a second was a bit much.
>
> This is only called from the !list_empty(rcu_tasks_holdout) loop in
> rcu_tasks_kthread afaict, and that has a
> schedule_timeout_interruptible(HZ) in it, which I read as once a second.
>
> Which seems like an entirely reasonable amount of time to kick a task.
> Not scheduling for a second is like an eternity.

If that is our only "fix" for KVM, then wouldn't that mean that things
like expand_fdtable() would be *expected* to take "an eternity" when
another CPU happens to be in the guest? Because vcpu_run() would still
loop until the task gets kicked after a second?

Of course, we can explicitly put a check into the KVM loop, but that
brings me back to my original concern: why is it OK to do it there as
a special case and not for the general case construct of

	if (need_resched()) {
		drop_local_locks();
		cond_resched();
		get_local_locks();
	}
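
Spelled out, the sort of loop I have in mind looks roughly like this
(a minimal sketch only; the context structure, its lock and
handle_one_item() are invented for illustration, not taken from any
real caller):

	#include <linux/spinlock.h>
	#include <linux/list.h>
	#include <linux/sched.h>

	/* Hypothetical context; the names are illustrative only. */
	struct my_ctx {
		spinlock_t		lock;
		struct list_head	items;
	};

	static void process_items(struct my_ctx *ctx)
	{
		spin_lock(&ctx->lock);
		while (!list_empty(&ctx->items)) {
			handle_one_item(ctx);	/* hypothetical per-item work */

			if (need_resched()) {
				/* Can't schedule with the spinlock held. */
				spin_unlock(&ctx->lock);
				cond_resched();
				spin_lock(&ctx->lock);
				/* The list may have changed while unlocked. */
			}
		}
		spin_unlock(&ctx->lock);
	}

With the hunk above, that need_resched() check would also pick up the
RCU urgent-QS request; without it, such a loop only breaks when the
task is eventually kicked.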