On 02.03.20 14:25, Igor Druzhinin wrote:
> On 28/02/2020 07:10, Jürgen Groß wrote:
>>
>> I think you are just narrowing the window of the race:
>>
>> It is still possible to have two cpus entering rcu_barrier() and to
>> make it into the if ( !initial ) clause.
>>
>> Instead of introducing another atomic I believe the following patch
>> instead of yours should do it:
>>
>> diff --git a/xen/common/rcupdate.c b/xen/common/rcupdate.c
>> index e6add0b120..0d5469a326 100644
>> --- a/xen/common/rcupdate.c
>> +++ b/xen/common/rcupdate.c
>> @@ -180,23 +180,17 @@ static void rcu_barrier_action(void)
>>
>>  void rcu_barrier(void)
>>  {
>> -    int initial = atomic_read(&cpu_count);
>> -
>>      while ( !get_cpu_maps() )
>>      {
>>          process_pending_softirqs();
>> -        if ( initial && !atomic_read(&cpu_count) )
>> +        if ( !atomic_read(&cpu_count) )
>>              return;
>>
>>          cpu_relax();
>> -        initial = atomic_read(&cpu_count);
>>      }
>>
>> -    if ( !initial )
>> -    {
>> -        atomic_set(&cpu_count, num_online_cpus());
>> +    if ( atomic_cmpxchg(&cpu_count, 0, num_online_cpus()) == 0 )
>>          cpumask_raise_softirq(&cpu_online_map, RCU_SOFTIRQ);
>> -    }
>>
>>      while ( atomic_read(&cpu_count) )
>>      {
>>
>> Could you give that a try, please?
>
> With this patch I cannot disable SMT at all.
>
> The problem that my diff solved was a race between 2 consecutive
> rcu_barrier operations on CPU0 (the pattern specific to SMT-on/off
> operation) where some CPUs didn't exit the cpu_count checking loop
> completely but cpu_count is already reinitialized on CPU0 - this
> results in some CPUs being stuck in the loop.

Ah, okay, then I believe a combination of the two patches is needed.

Something like the attached version?


Juergen