On Sat, Aug 09, 2014 at 09:01:37AM -0700, Paul E. McKenney wrote:

> > That's so wrong it's not funny. If you need some abortion to deal with
> > NOHZ_FULL then put it under CONFIG_NOHZ_FULL; don't burden the entire
> > world with it.
>
> Peter, the polling approach actually -reduces- the common-case
> per-context-switch burden, as in when RCU-tasks isn't doing anything.
> See your own code above.

I'm not seeing it; CONFIG_PREEMPT already touches a per-task cacheline
on every context switch, and for !PREEMPT this thing should pretty much
reduce to rcu_sched.

Would not the thing I proposed be a valid rcu_preempt implementation?
One where the RCU read-side primitives implicitly run from one
(voluntary) schedule() call to the next, and therefore entirely cover
any smaller section. (Rough sketch at the end of this mail.)

> > As for idle tasks, I'm not sure about those. I think that we should
> > say NO to anything that would require waking idle CPUs; push the pain
> > to ftrace/kprobes. We should _not_ be waking idle CPUs.
>
> So the current patch set wakes an idle task once per RCU-tasks grace
> period, but only when that idle task did not otherwise get awakened.
> This is not a real problem.

On the other hand, we're trying to reduce random wakeups, so this sure
is a problem. If we don't start, we don't have to fix it later.

> And it could probably be reduced further, for example, for architectures
> where the program counter of sleeping CPUs can be remotely accessed and
> where the address of the am-asleep code is known. I doubt that this
> would really be worth it, but it could be done, in theory anyway. Or,
> as Steven suggested earlier, there could be a per-CPU variable that
> was set (with appropriate memory ordering) when the CPU was actually
> sleeping.
>
> So I don't believe that the current wakeup rate is a problem, and it
> can be reduced if it proves to be a problem.

How about we simply assume that 'idle' code, as delimited by the RCU
idle hooks (rcu_idle_enter()/rcu_idle_exit()), is safe? Why do we want
to bend over backwards to cover this?
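
For concreteness, a minimal sketch of the grace-period side of that
scheme -- this is not the posted patch set, just an illustration. It
waits until every task has either voluntarily scheduled (its nvcsw
count advanced) or left the runqueue. task_struct::nvcsw and ::on_rq
are fields the scheduler already maintains; the function name is made
up, and tasklist locking plus task-exit handling are deliberately
elided:

	static void synchronize_tasks_sketch(void)
	{
		struct task_struct *g, *t;

		rcu_read_lock();
		do_each_thread(g, t) {
			unsigned long snap = ACCESS_ONCE(t->nvcsw);

			/*
			 * @t is still inside a read-side section while
			 * it stays queued without having voluntarily
			 * scheduled since the snapshot. Real code would
			 * sleep here instead of spinning.
			 */
			while (ACCESS_ONCE(t->on_rq) &&
			       ACCESS_ONCE(t->nvcsw) == snap)
				cpu_relax();
		} while_each_thread(g, t);
		rcu_read_unlock();
	}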
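
And for Steve's per-CPU am-asleep variable, something like the below
would let a grace-period poll skip sleeping CPUs without ever waking
them. Again only a sketch; all of these names are made up rather than
existing kernel API, and the pairing barrier on the poller side is
elided:

	static DEFINE_PER_CPU(int, cpu_asleep);

	/* Called on the way into the low-power sleep. */
	static void cpu_sleep_enter(void)
	{
		smp_mb(); /* Finish prior accesses before appearing asleep. */
		this_cpu_write(cpu_asleep, 1);
	}

	/* Called immediately after waking up. */
	static void cpu_sleep_exit(void)
	{
		this_cpu_write(cpu_asleep, 0);
		smp_mb(); /* Appear awake before any later accesses. */
	}

	/* Grace-period side: a sleeping CPU runs no read-side sections. */
	static bool cpu_in_quiescent_sleep(int cpu)
	{
		return ACCESS_ONCE(per_cpu(cpu_asleep, cpu));
	}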