On Sun, Sep 24, 2017 at 02:23:04PM +0000, Mathieu Desnoyers wrote: [...] > >> > >> copy_mm() is performed without holding current->sighand->siglock, so > >> it appears to be racing with concurrent membarrier register cmd. > > > > Speak of racing, I think we currently have a problem if we do a > > register_private_expedited in one thread and do a > > membarrer_private_expedited in another thread(sharing the same mm), as > > follow: > > > > {t1,t2,t3 sharing the same ->mm} > > CPU 0 CPU 1 CPU2 > > ==================== =================== ============ > > {in thread t1} > > membarrier_register_private_expedited(): > > ... > > WRITE_ONCE(->mm->membarrier_private_expedited, 1); > > membarrier_arch_register_private_expedited(): > > ... > > > > > > {in thread t2} > > membarrier_private_expedited(): > > READ_ONCE(->mm->membarrier_private_expedited); // == 1 > > ... > > for_each_online_cpu() > > ... > >

curr> > > if (p && p->mm == current->mm) // false > > > > > > {about to switch to t3} > > rq->curr = t3; > > .... > > context_switch(): > > ... > > finish_task_swtich(): > > membarrier_sched_in(): > > > > // no smp_mb() here. > > > > , and we will miss the smp_mb() on CPU2, right? And this could even > > happen if t2 has a membarrier_register_private_expedited() preceding the > > membarrier_private_expedited(). > > > > Am I missing something subtle here? > > I think the problem sits in this function: > > static void membarrier_register_private_expedited(void) > { > struct task_struct *p = current; > > if (READ_ONCE(p->mm->membarrier_private_expedited)) > return; > WRITE_ONCE(p->mm->membarrier_private_expedited, 1); > membarrier_arch_register_private_expedited(p); > } > > I need to change the order between WRITE_ONCE() and invoking > membarrier_arch_register_private_expedited. If I issue the > WRITE_ONCE after the arch code (which sets the TIF flags), > then concurrent membarrier priv exped commands will simply > return an -EPERM error: > > static void membarrier_register_private_expedited(void) > { > struct task_struct *p = current; > > if (READ_ONCE(p->mm->membarrier_private_expedited)) > return; > membarrier_arch_register_private_expedited(p); > WRITE_ONCE(p->mm->membarrier_private_expedited, 1); > } > > Do you agree that this would fix the race you identified ? > Yep, that will do the trick ;-) Regards, Boqun > Thanks, > > Mathieu >