On Fri, Sep 22, 2017 at 03:10:10PM +0000, Mathieu Desnoyers wrote:
> ----- On Sep 22, 2017, at 4:59 AM, Boqun Feng boqun.feng@gmail.com wrote:
> 
> > On Tue, Sep 19, 2017 at 06:13:41PM -0400, Mathieu Desnoyers wrote:
> > [...]
> >> +static inline void membarrier_arch_sched_in(struct task_struct *prev,
> >> +		struct task_struct *next)
> >> +{
> >> +	/*
> >> +	 * Only need the full barrier when switching between processes.
> >> +	 */
> >> +	if (likely(!test_ti_thread_flag(task_thread_info(next),
> >> +			TIF_MEMBARRIER_PRIVATE_EXPEDITED)
> >> +			|| prev->mm == next->mm))
> > 
> > And we also don't need the smp_mb() if !prev->mm, because switching from
> > kernel to user will have a smp_mb() implied by mmdrop()?
> 
> Right. And we also don't need it when switching from userspace to kernel

Yep, but this case is covered already, as I think we don't allow kernel
threads to have TIF_MEMBARRIER_PRIVATE_EXPEDITED set, right?

> thread neither. Something like this:
> 
> static inline void membarrier_arch_sched_in(struct task_struct *prev,
> 		struct task_struct *next)
> {
> 	/*
> 	 * Only need the full barrier when switching between processes.
> 	 * Barrier when switching from kernel to userspace is not
> 	 * required here, given that it is implied by mmdrop(). Barrier
> 	 * when switching from userspace to kernel is not needed after
> 	 * store to rq->curr.
> 	 */
> 	if (likely(!test_ti_thread_flag(task_thread_info(next),
> 			TIF_MEMBARRIER_PRIVATE_EXPEDITED)
> 			|| !prev->mm || !next->mm || prev->mm == next->mm))

, so no need to test next->mm here.

> 		return;
> 
> 	/*
> 	 * The membarrier system call requires a full memory barrier
> 	 * after storing to rq->curr, before going back to user-space.
> 	 */
> 	smp_mb();
> }
> 
> > 
> >> +		return;
> >> +
> >> +	/*
> >> +	 * The membarrier system call requires a full memory barrier
> >> +	 * after storing to rq->curr, before going back to user-space.
> >> +	 */
> >> +	smp_mb();
> >> +}
> > 
> > [...]
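For context, the full barrier discussed above pairs with the rq->curr
scan on the syscall side. Below is a simplified sketch of that scan,
assuming the overall structure of membarrier_private_expedited() from
this series; the skip of the calling CPU, the cpumask batching, and
error handling are elided:

	static void ipi_mb(void *info)
	{
		smp_mb();	/* IPIs should be serializing but just in case */
	}

	static void membarrier_private_expedited(void)
	{
		int cpu;

		if (num_online_cpus() == 1)
			return;

		/*
		 * Matching memory barrier on the caller thread before
		 * reading each rq->curr; pairs with the barrier after
		 * storing to rq->curr discussed above.
		 */
		smp_mb();

		cpus_read_lock();
		for_each_online_cpu(cpu) {
			struct task_struct *p;

			/*
			 * Read the remote rq->curr under RCU rather than
			 * taking the remote rq lock, and IPI only the CPUs
			 * currently running a thread of this process.
			 */
			rcu_read_lock();
			p = task_rcu_dereference(&cpu_rq(cpu)->curr);
			if (p && p->mm == current->mm)
				smp_call_function_single(cpu, ipi_mb, NULL, 1);
			rcu_read_unlock();
		}
		cpus_read_unlock();

		/* Order the IPI-issued barriers before the caller's return. */
		smp_mb();
	}

It is exactly this lock-free read of rq->curr that makes the
scheduler-side smp_mb() after the rq->curr store necessary.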
> >> +static inline void membarrier_fork(struct task_struct *t,
> >> +		unsigned long clone_flags)
> >> +{
> >> +	if (!current->mm || !t->mm)
> >> +		return;
> >> +	t->mm->membarrier_private_expedited =
> >> +		current->mm->membarrier_private_expedited;
> > 
> > Have we already done the copy of ->membarrier_private_expedited in
> > copy_mm()?
> 
> copy_mm() is performed without holding current->sighand->siglock, so
> it appears to be racing with concurrent membarrier register cmd.

Speaking of racing, I think we currently have a problem if we do a
membarrier_register_private_expedited() in one thread and do a
membarrier_private_expedited() in another thread (sharing the same mm),
as follows:

	{ t1, t2, t3 sharing the same ->mm }

	CPU 0				CPU 1			CPU 2
	====================		===================	============
	{in thread t1}
	membarrier_register_private_expedited():
	  ...
	  WRITE_ONCE(->mm->membarrier_private_expedited, 1);
	  membarrier_arch_register_private_expedited():
	    ...
					{in thread t2}
					membarrier_private_expedited():
					  READ_ONCE(->mm->membarrier_private_expedited);
					  // == 1
					  ...
					  for_each_online_cpu()
					    ...
					    <reads rq->curr on CPU 2,
					     which is not t3 yet>
					    if (p && p->mm == current->mm)
					      // false
								{about to switch to t3}
								rq->curr = t3;
								....
								context_switch():
								  ...
								  finish_task_switch():
								    membarrier_sched_in():
								      // no smp_mb() here.

, and we will miss the smp_mb() on CPU 2, right? And this could even
happen if t2 has a membarrier_register_private_expedited() preceding
the membarrier_private_expedited().

Am I missing something subtle here?

Regards,
Boqun

> However, given that it is a single flag updated with WRITE_ONCE()
> and read with READ_ONCE(), it might be OK to rely on copy_mm there.
> If userspace runs registration concurrently with fork, they should
> not expect the child to be specifically registered or unregistered.
> 
> So yes, I think you are right about removing this copy and relying on
> copy_mm() instead. I also think we can improve membarrier_arch_fork()
> on powerpc to test the current thread flag rather than using current->mm.
> 
> Which leads to those two changes:
> 
> static inline void membarrier_fork(struct task_struct *t,
> 		unsigned long clone_flags)
> {
> 	/*
> 	 * Prior copy_mm() copies the membarrier_private_expedited field
> 	 * from current->mm to t->mm.
> 	 */
> 	membarrier_arch_fork(t, clone_flags);
> }
> 
> And on PowerPC:
> 
> static inline void membarrier_arch_fork(struct task_struct *t,
> 		unsigned long clone_flags)
> {
> 	/*
> 	 * Coherence of TIF_MEMBARRIER_PRIVATE_EXPEDITED against thread
> 	 * fork is protected by siglock. membarrier_arch_fork() is called
> 	 * with siglock held.
> 	 */
> 	if (test_thread_flag(TIF_MEMBARRIER_PRIVATE_EXPEDITED))
> 		set_ti_thread_flag(task_thread_info(t),
> 				TIF_MEMBARRIER_PRIVATE_EXPEDITED);
> }
> 
> Thanks,
> 
> Mathieu
> 
> > 
> > Regards,
> > Boqun
> > 
> >> +	membarrier_arch_fork(t, clone_flags);
> >> +}
> >> +static inline void membarrier_execve(struct task_struct *t)
> >> +{
> >> +	t->mm->membarrier_private_expedited = 0;
> >> +	membarrier_arch_execve(t);
> >> +}
> >> +#else
> >> +static inline void membarrier_sched_in(struct task_struct *prev,
> >> +		struct task_struct *next)
> >> +{
> >> +}
> >> +static inline void membarrier_fork(struct task_struct *t,
> >> +		unsigned long clone_flags)
> >> +{
> >> +}
> >> +static inline void membarrier_execve(struct task_struct *t)
> >> +{
> >> +}
> >> +#endif
> >> +
> > [...]
> 
> -- 
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com
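As a closing illustration, here is a minimal user-space sketch of the
register-then-use pattern at the heart of the race above, assuming the
command names this series introduces in <linux/membarrier.h>; the
problematic interleaving arises when the two calls are issued by
different threads of the same process:

	#include <linux/membarrier.h>
	#include <sys/syscall.h>
	#include <unistd.h>
	#include <stdio.h>

	static int membarrier(int cmd, int flags)
	{
		return syscall(__NR_membarrier, cmd, flags);
	}

	int main(void)
	{
		/* Declare this process's intent to use the private expedited command. */
		if (membarrier(MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED, 0)) {
			perror("membarrier register");
			return 1;
		}

		/*
		 * Issue a full memory barrier on every CPU currently
		 * running a thread of this process.
		 */
		if (membarrier(MEMBARRIER_CMD_PRIVATE_EXPEDITED, 0)) {
			perror("membarrier");
			return 1;
		}
		return 0;
	}

Once registration has fully completed (including the arch-level thread
flag propagation), every MEMBARRIER_CMD_PRIVATE_EXPEDITED call must be
able to rely on the scheduler-side smp_mb(); the diagram above shows a
window where that guarantee does not yet hold.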