On Fri, Aug 08, 2014 at 01:58:26PM -0700, Paul E. McKenney wrote:
>
> > And on that, you probably should change rcu_sched_rq() to read:
> >
> > 	this_cpu_inc(rcu_sched_data.passed_quiesce);
> >
> > That avoids touching the per-cpu data offset.
>
> Hmmm...  Interrupts are disabled,

No, they are not. __schedule()->rcu_note_context_switch()->rcu_sched_qs()
is only called with preemption disabled.

We only disable IRQs later, where we take the rq->lock.

> so no need to further disable
> interrupts.  Storing 1 works fine, no need to increment.  If I followed
> the twisty per_cpu passages correctly, my guess is that you would like
> me to do something like this:
>
> 	__this_cpu_write(rcu_sched_data.passed_quiesce, 1);
>
> Does that work?

Yeah, should be more or less similar; the inc might be encoded shorter
due to not requiring an immediate, but who cares :-)

void rcu_sched_qs(int cpu)
{
	if (trace_rcu_grace_period_enabled()) {
		if (!__this_cpu_read(rcu_sched_data.passed_quiesce))
			trace_rcu_grace_period(...);
	}
	__this_cpu_write(rcu_sched_data.passed_quiesce, 1);
}

That would further avoid emitting the conditional in the normal case
where the tracepoint is inactive.

Steve, does it make sense to have __DO_TRACE() emit __trace_##name() to
avoid the double static_branch thing?

> > And it would be very good if we could avoid the unconditional IRQ flag
> > fiddling in rcu_preempt_note_context_switch(), them expensive, this
> > looks entirely feasibly in the 'normal' case where
> > t->rcu_read_unlock_special doesn't have RCU_READ_UNLOCK_NEED_QS set.
>
> Agreed, but sometimes RCU_READ_UNLOCK_NEED_QS is set.
>
> That said, I should probably revisit RCU_READ_UNLOCK_NEED_QS.  A lot has
> changed since I wrote that code.

Sure, but a conditional testing RCU_READ_UNLOCK_NEED_QS is far cheaper
than poking the IRQ flags.

That said, it's not entirely clear to me why that needs IRQs disabled at
all; then again, I didn't look long and I'm sure it's all subtle.
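
For anyone who wants to poke at the rcu_sched_qs() shape suggested above
outside the kernel, here is a rough userspace model. It is only a sketch:
the per-cpu variable becomes a plain array, the tracepoint static branch
becomes an ordinary bool, and every model_* name is made up for
illustration. It shows nothing about instruction encoding or preemption;
it only shows that the trace fires on the 0 -> 1 transition when tracing
is on, and that the store of 1 is unconditional.

```c
/*
 * Userspace model only; this is not kernel code.
 * All model_* names are hypothetical stand-ins.
 */
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4	/* hypothetical CPU count */

/* Stand-in for the per-cpu rcu_sched_data.passed_quiesce. */
static int model_passed_quiesce[MODEL_NR_CPUS];

/* Stand-in for trace_rcu_grace_period_enabled(). */
static bool model_trace_enabled;

/* Counts how many times the modelled tracepoint fired. */
static int model_trace_events;

static void model_rcu_sched_qs(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);

	/* Common case: tracing is off, so the flag is never read. */
	if (model_trace_enabled) {
		/* Trace only the first quiescent state (0 -> 1). */
		if (!model_passed_quiesce[cpu])
			model_trace_events++;
	}

	/* Plain store of 1; no read-modify-write is needed. */
	model_passed_quiesce[cpu] = 1;
}
```

Calling model_rcu_sched_qs() twice for the same CPU with
model_trace_enabled set bumps model_trace_events once, not twice; with
it clear, the counter never moves and the flag still ends up at 1.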