On Mon, 2018-07-09 at 13:06 +0200, Peter Zijlstra wrote:
> On Mon, Jul 09, 2018 at 11:56:41AM +0100, David Woodhouse wrote:
> > > But either proposal is exactly the same in this respect. The whole
> > > rcu_urgent_qs thing won't be set any earlier either.
> > Er.... Marius, our latencies in expand_fdtable() definitely went from
> > ~10s to well below one second when we just added the rcu_all_qs() into
> > the loop, didn't they? And that does nothing if !rcu_urgent_qs.
> Argh I never found that, because obfuscation:
>
>     ruqp = per_cpu_ptr(&rcu_dynticks.rcu_urgent_qs, rdp->cpu);
>     ...
>     smp_store_release(ruqp, true);
>
> I, using git grep "rcu_urgent_qs.*true" only found
> rcu_request_urgent_qs_task() and sync_sched_exp_handler().
>
> But how come KVM even triggers that case; rcu_implicit_dynticks_qs() is
> for NOHZ and offline CPUs.

I don't know that it is; I'm merely going by the empirical observation
that with a check for rcu_urgent_qs in the vcpu_run() loop, KVM is no
longer screwing over synchronize_sched() for 10 seconds at a time. Or
even 1 second at a time.

I'm all for considering a CPU in guest mode to be quiescent, and not
waiting for it at all. But we don't do that without full NOHZ even for
CPUs in userspace.