* [RFC PATCH 0/3] sched: membarrier updates

From: Mathieu Desnoyers @ 2020-08-14 16:43 UTC
To: Peter Zijlstra
Cc: linux-kernel, Will Deacon, Paul E. McKenney, Andy Lutomirski,
    Andrew Morton, Alan Stern, Nicholas Piggin, Mathieu Desnoyers

Hi,

Please find these three changes based on our recent discussions about
the interaction between membarrier and exit_mm and kthread_use_mm. I
added documentation of memory ordering scenarios to membarrier.c as
well.

Thanks,

Mathieu

Mathieu Desnoyers (3):
  sched: fix exit_mm vs membarrier (v2)
  sched: membarrier: cover kthread_use_mm (v2)
  sched: membarrier: document memory ordering scenarios

 kernel/exit.c             |   8 +++
 kernel/kthread.c          |  19 +++++++
 kernel/sched/idle.c       |   1 +
 kernel/sched/membarrier.c | 136 ++++++++++++++++++++++++++++++++++++++++++++--
 4 files changed, 158 insertions(+), 6 deletions(-)

-- 
2.11.0

* [RFC PATCH 1/3] sched: fix exit_mm vs membarrier (v2)

From: Mathieu Desnoyers @ 2020-08-14 16:43 UTC
To: Peter Zijlstra
Cc: linux-kernel, Will Deacon, Paul E. McKenney, Andy Lutomirski,
    Andrew Morton, Alan Stern, Nicholas Piggin, Mathieu Desnoyers,
    Thomas Gleixner, Linus Torvalds, linux-mm

exit_mm should issue memory barriers after user-space memory accesses,
before clearing current->mm, to order user-space memory accesses
performed prior to exit_mm before clearing tsk->mm, which has the
effect of skipping the membarrier private expedited IPIs.

The membarrier system call can be issued concurrently with do_exit
if we have thread groups created with CLONE_VM but not CLONE_THREAD.

Here is the scenario I have in mind:

Two thread groups are created, A and B. Thread group B is created by
issuing clone from group A with flag CLONE_VM set, but not CLONE_THREAD.
Let's assume we have a single thread within each thread group (Thread A
and Thread B).

AFAIU we can have:

Userspace variables:

int x = 0, y = 0;

 CPU 0                        CPU 1
 Thread A                     Thread B
 (in thread group A)          (in thread group B)

 x = 1
 barrier()
 y = 1
 exit()
 exit_mm()
   current->mm = NULL;
                              r1 = load y
                              membarrier()
                                skips CPU 0 (no IPI)
                                because its current mm is NULL
                              r2 = load x

 BUG_ON(r1 == 1 && r2 == 0)

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Alan Stern <stern@rowland.harvard.edu>
Cc: linux-mm@kvack.org
---
Changes since v1:
- Use smp_mb__after_spinlock rather than smp_mb.
- Document race scenario in commit message.
---
 kernel/exit.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/kernel/exit.c b/kernel/exit.c
index 733e80f334e7..fe64e6e28dd5 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -475,6 +475,14 @@ static void exit_mm(void)
 	BUG_ON(mm != current->active_mm);
 	/* more a memory barrier than a real lock */
 	task_lock(current);
+	/*
+	 * When a thread stops operating on an address space, the loop
+	 * in membarrier_{private,global}_expedited() may not observe
+	 * that tsk->mm, and not issue an IPI. Membarrier requires a
+	 * memory barrier after accessing user-space memory, before
+	 * clearing tsk->mm.
+	 */
+	smp_mb__after_spinlock();
 	current->mm = NULL;
 	mmap_read_unlock(mm);
 	enter_lazy_tlb(mm, current);
-- 
2.11.0

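To make the setup above concrete, here is a minimal user-space sketch of the
CLONE_VM-without-CLONE_THREAD arrangement, with the exiting side placed in the
clone child for simplicity. It is illustrative rather than a reliable
reproducer (the window in exit_mm() is tiny), and the stack size, the
inline-asm compiler barrier, and the use of the registered private expedited
command are choices of this sketch, not part of the patch.

	#define _GNU_SOURCE
	#include <sched.h>
	#include <stdio.h>
	#include <stdlib.h>
	#include <unistd.h>
	#include <sys/syscall.h>
	#include <linux/membarrier.h>

	static volatile int x, y;

	/* The exiting thread: its own thread group, shared mm (CLONE_VM). */
	static int exiting_thread(void *arg)
	{
		x = 1;
		__asm__ __volatile__("" ::: "memory");	/* barrier() */
		y = 1;
		syscall(SYS_exit, 0);	/* do_exit() -> exit_mm() clears current->mm */
		return 0;
	}

	int main(void)
	{
		static char stack[64 * 1024];
		int r1, r2;

		/* Private expedited membarrier must be registered before use. */
		if (syscall(__NR_membarrier,
			    MEMBARRIER_CMD_REGISTER_PRIVATE_EXPEDITED, 0))
			exit(1);
		/* CLONE_VM without CLONE_THREAD: shared mm, separate thread group. */
		if (clone(exiting_thread, stack + sizeof(stack), CLONE_VM, NULL) == -1)
			exit(1);

		r1 = y;
		/* If the sibling's CPU already has current->mm == NULL, no IPI. */
		syscall(__NR_membarrier, MEMBARRIER_CMD_PRIVATE_EXPEDITED, 0);
		r2 = x;

		if (r1 == 1 && r2 == 0)
			printf("ordering violation observed\n");
		return 0;
	}
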
* Re: [RFC PATCH 1/3] sched: fix exit_mm vs membarrier (v2)

From: Boqun Feng @ 2020-08-16 15:23 UTC
To: Mathieu Desnoyers
Cc: Peter Zijlstra, linux-kernel, Will Deacon, Paul E. McKenney,
    Andy Lutomirski, Andrew Morton, Alan Stern, Nicholas Piggin,
    Thomas Gleixner, Linus Torvalds, linux-mm

Hi Mathieu,

On Fri, Aug 14, 2020 at 12:43:56PM -0400, Mathieu Desnoyers wrote:
> exit_mm should issue memory barriers after user-space memory accesses,
> before clearing current->mm, to order user-space memory accesses
> performed prior to exit_mm before clearing tsk->mm, which has the
> effect of skipping the membarrier private expedited IPIs.
>
> The membarrier system call can be issued concurrently with do_exit
> if we have thread groups created with CLONE_VM but not CLONE_THREAD.
>
> Here is the scenario I have in mind:
>
> Two thread groups are created, A and B. Thread group B is created by
> issuing clone from group A with flag CLONE_VM set, but not CLONE_THREAD.
> Let's assume we have a single thread within each thread group (Thread A
> and Thread B).
>
> AFAIU we can have:
>
> Userspace variables:
>
> int x = 0, y = 0;
>
>  CPU 0                        CPU 1
>  Thread A                     Thread B
>  (in thread group A)          (in thread group B)
>
>  x = 1
>  barrier()
>  y = 1
>  exit()
>  exit_mm()
>    current->mm = NULL;
>                               r1 = load y
>                               membarrier()
>                                 skips CPU 0 (no IPI)
>                                 because its current mm is NULL
>                               r2 = load x
>
>  BUG_ON(r1 == 1 && r2 == 0)
>
> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
> Cc: Will Deacon <will@kernel.org>
> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Nicholas Piggin <npiggin@gmail.com>
> Cc: Andy Lutomirski <luto@amacapital.net>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: Alan Stern <stern@rowland.harvard.edu>
> Cc: linux-mm@kvack.org
> ---
> Changes since v1:
> - Use smp_mb__after_spinlock rather than smp_mb.
> - Document race scenario in commit message.
> ---
>  kernel/exit.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/kernel/exit.c b/kernel/exit.c
> index 733e80f334e7..fe64e6e28dd5 100644
> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -475,6 +475,14 @@ static void exit_mm(void)
>  	BUG_ON(mm != current->active_mm);
>  	/* more a memory barrier than a real lock */
>  	task_lock(current);
> +	/*
> +	 * When a thread stops operating on an address space, the loop
> +	 * in membarrier_{private,global}_expedited() may not observe

Is it accurate to say that the correctness of
membarrier_global_expedited() relies on the observation of ->mm? Because
IIUC the membarrier_global_expedited() loop doesn't check ->mm.

Regards,
Boqun

> +	 * that tsk->mm, and not issue an IPI. Membarrier requires a
> +	 * memory barrier after accessing user-space memory, before
> +	 * clearing tsk->mm.
> +	 */
> +	smp_mb__after_spinlock();
>  	current->mm = NULL;
>  	mmap_read_unlock(mm);
>  	enter_lazy_tlb(mm, current);
> -- 
> 2.11.0
>

* Re: [RFC PATCH 1/3] sched: fix exit_mm vs membarrier (v2)

From: Mathieu Desnoyers @ 2020-09-24 15:01 UTC
To: Boqun Feng
Cc: Peter Zijlstra, linux-kernel, Will Deacon, paulmck, Andy Lutomirski,
    Andrew Morton, Alan Stern, Nicholas Piggin, Thomas Gleixner,
    Linus Torvalds, linux-mm

----- On Aug 16, 2020, at 11:23 AM, Boqun Feng boqun.feng@gmail.com wrote:

> Hi Mathieu,
>
> On Fri, Aug 14, 2020 at 12:43:56PM -0400, Mathieu Desnoyers wrote:
>> exit_mm should issue memory barriers after user-space memory accesses,
>> before clearing current->mm, to order user-space memory accesses
>> performed prior to exit_mm before clearing tsk->mm, which has the
>> effect of skipping the membarrier private expedited IPIs.
>>
>> The membarrier system call can be issued concurrently with do_exit
>> if we have thread groups created with CLONE_VM but not CLONE_THREAD.
>>
>> Here is the scenario I have in mind:
>>
>> Two thread groups are created, A and B. Thread group B is created by
>> issuing clone from group A with flag CLONE_VM set, but not CLONE_THREAD.
>> Let's assume we have a single thread within each thread group (Thread A
>> and Thread B).
>>
>> AFAIU we can have:
>>
>> Userspace variables:
>>
>> int x = 0, y = 0;
>>
>>  CPU 0                        CPU 1
>>  Thread A                     Thread B
>>  (in thread group A)          (in thread group B)
>>
>>  x = 1
>>  barrier()
>>  y = 1
>>  exit()
>>  exit_mm()
>>    current->mm = NULL;
>>                               r1 = load y
>>                               membarrier()
>>                                 skips CPU 0 (no IPI)
>>                                 because its current mm is NULL
>>                               r2 = load x
>>
>>  BUG_ON(r1 == 1 && r2 == 0)
>>
>> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
>> Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
>> Cc: Will Deacon <will@kernel.org>
>> Cc: Paul E. McKenney <paulmck@kernel.org>
>> Cc: Nicholas Piggin <npiggin@gmail.com>
>> Cc: Andy Lutomirski <luto@amacapital.net>
>> Cc: Thomas Gleixner <tglx@linutronix.de>
>> Cc: Linus Torvalds <torvalds@linux-foundation.org>
>> Cc: Alan Stern <stern@rowland.harvard.edu>
>> Cc: linux-mm@kvack.org
>> ---
>> Changes since v1:
>> - Use smp_mb__after_spinlock rather than smp_mb.
>> - Document race scenario in commit message.
>> ---
>>  kernel/exit.c | 8 ++++++++
>>  1 file changed, 8 insertions(+)
>>
>> diff --git a/kernel/exit.c b/kernel/exit.c
>> index 733e80f334e7..fe64e6e28dd5 100644
>> --- a/kernel/exit.c
>> +++ b/kernel/exit.c
>> @@ -475,6 +475,14 @@ static void exit_mm(void)
>>  	BUG_ON(mm != current->active_mm);
>>  	/* more a memory barrier than a real lock */
>>  	task_lock(current);
>> +	/*
>> +	 * When a thread stops operating on an address space, the loop
>> +	 * in membarrier_{private,global}_expedited() may not observe
>
> Is it accurate to say that the correctness of
> membarrier_global_expedited() relies on the observation of ->mm? Because
> IIUC the membarrier_global_expedited() loop doesn't check ->mm.

Good point, I was wrong. Will instead reword as:

	/*
	 * When a thread stops operating on an address space, the loop
	 * in membarrier_private_expedited() may not observe that
	 * tsk->mm, and the loop in membarrier_global_expedited() may
	 * not observe a MEMBARRIER_STATE_GLOBAL_EXPEDITED
	 * rq->membarrier_state, so those would not issue an IPI.
	 * Membarrier requires a memory barrier after accessing
	 * user-space memory, before clearing tsk->mm or the
	 * rq->membarrier_state.
	 */

And I'll make sure exit_mm clears this_rq()->membarrier_state as well.

Thanks,

Mathieu

> Regards,
> Boqun
>
>> +	 * that tsk->mm, and not issue an IPI. Membarrier requires a
>> +	 * memory barrier after accessing user-space memory, before
>> +	 * clearing tsk->mm.
>> +	 */
>> +	smp_mb__after_spinlock();
>>  	current->mm = NULL;
>>  	mmap_read_unlock(mm);
>>  	enter_lazy_tlb(mm, current);
>> -- 
>> 2.11.0

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

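For illustration, clearing the runqueue's membarrier state on exit could take
roughly the shape below. This is a sketch of one possible implementation, not
the posted series: the helper name membarrier_update_current_mm() is assumed
here, and since this_rq() is private to kernel/sched/, exit_mm() would have to
call through such a helper rather than touch the runqueue directly.

	/* Sketch: possible helper in kernel/sched/membarrier.c (name assumed). */
	void membarrier_update_current_mm(struct mm_struct *next_mm)
	{
		struct rq *rq = this_rq();
		int membarrier_state = 0;

		if (next_mm)
			membarrier_state = atomic_read(&next_mm->membarrier_state);
		/* Keep the runqueue's cached state in sync with current->mm. */
		WRITE_ONCE(rq->membarrier_state, membarrier_state);
	}

exit_mm() would then call membarrier_update_current_mm(NULL) next to the
existing smp_mb__after_spinlock(), so that membarrier_global_expedited() stops
seeing stale MEMBARRIER_STATE_GLOBAL_EXPEDITED state for the exiting CPU.
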
* [RFC PATCH 2/3] sched: membarrier: cover kthread_use_mm (v2)

From: Mathieu Desnoyers @ 2020-08-14 16:43 UTC
To: Peter Zijlstra
Cc: linux-kernel, Will Deacon, Paul E. McKenney, Andy Lutomirski,
    Andrew Morton, Alan Stern, Nicholas Piggin, Mathieu Desnoyers

Add comments and memory barrier to kthread_use_mm and kthread_unuse_mm
to allow the effect of membarrier(2) to apply to kthreads accessing
user-space memory as well.

Given that no prior kthread uses this guarantee and that it only affects
kthreads, adding this guarantee does not affect user-space ABI.

Refine the check in membarrier_global_expedited to exclude runqueues
running the idle thread rather than all kthreads from the IPI cpumask.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
Changes since v1:
- Add WARN_ON_ONCE(current->mm) in play_idle_precise (PeterZ),
- Use smp_mb__after_spinlock rather than smp_mb after task_lock.
---
 kernel/kthread.c          | 19 +++++++++++++++++++
 kernel/sched/idle.c       |  1 +
 kernel/sched/membarrier.c |  8 ++------
 3 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/kernel/kthread.c b/kernel/kthread.c
index 3edaa380dc7b..77aaaa7bc8d9 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -1255,8 +1255,19 @@ void kthread_use_mm(struct mm_struct *mm)
 	finish_arch_post_lock_switch();
 #endif
 
+	/*
+	 * When a kthread starts operating on an address space, the loop
+	 * in membarrier_{private,global}_expedited() may not observe
+	 * that tsk->mm, and not issue an IPI. Membarrier requires a
+	 * memory barrier after storing to tsk->mm, before accessing
+	 * user-space memory. A full memory barrier for membarrier
+	 * {PRIVATE,GLOBAL}_EXPEDITED is implicitly provided by
+	 * mmdrop(), or explicitly with smp_mb().
+	 */
 	if (active_mm != mm)
 		mmdrop(active_mm);
+	else
+		smp_mb();
 
 	to_kthread(tsk)->oldfs = force_uaccess_begin();
 }
@@ -1276,6 +1287,14 @@ void kthread_unuse_mm(struct mm_struct *mm)
 	force_uaccess_end(to_kthread(tsk)->oldfs);
 
 	task_lock(tsk);
+	/*
+	 * When a kthread stops operating on an address space, the loop
+	 * in membarrier_{private,global}_expedited() may not observe
+	 * that tsk->mm, and not issue an IPI. Membarrier requires a
+	 * memory barrier after accessing user-space memory, before
+	 * clearing tsk->mm.
+	 */
+	smp_mb__after_spinlock();
 	sync_mm_rss(mm);
 	local_irq_disable();
 	tsk->mm = NULL;
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index 6bf34986f45c..3443ee8335d0 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -341,6 +341,7 @@ void play_idle_precise(u64 duration_ns, u64 latency_ns)
 	WARN_ON_ONCE(!(current->flags & PF_KTHREAD));
 	WARN_ON_ONCE(!(current->flags & PF_NO_SETAFFINITY));
 	WARN_ON_ONCE(!duration_ns);
+	WARN_ON_ONCE(current->mm);
 
 	rcu_sleep_check();
 	preempt_disable();
diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c
index 168479a7d61b..8a294483074d 100644
--- a/kernel/sched/membarrier.c
+++ b/kernel/sched/membarrier.c
@@ -100,13 +100,9 @@ static int membarrier_global_expedited(void)
 			    MEMBARRIER_STATE_GLOBAL_EXPEDITED))
 			continue;
 
-		/*
-		 * Skip the CPU if it runs a kernel thread. The scheduler
-		 * leaves the prior task mm in place as an optimization when
-		 * scheduling a kthread.
-		 */
+		/* Skip the CPU if it runs the idle thread. */
 		p = rcu_dereference(cpu_rq(cpu)->curr);
-		if (p->flags & PF_KTHREAD)
+		if (is_idle_task(p))
 			continue;
 
 		__cpumask_set_cpu(cpu, tmpmask);
-- 
2.11.0

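As background for where the new barriers bite, here is a hedged sketch of the
usual kthread pattern for accessing user memory. The worker function, how it
obtains the mm reference, and the user address are all invented for
illustration; only kthread_use_mm()/kthread_unuse_mm(), copy_from_user() and
mmput() are the real APIs involved.

	#include <linux/kthread.h>
	#include <linux/sched/mm.h>
	#include <linux/uaccess.h>

	/* Illustrative kthread body; mm is assumed to come from get_task_mm(). */
	static int example_worker(void *data)
	{
		struct mm_struct *mm = data;
		void __user *uaddr = (void __user *)0x1000;	/* placeholder */
		int value = 0;

		kthread_use_mm(mm);
		/*
		 * The barrier added above in kthread_use_mm() orders the store
		 * to tsk->mm before the user-space accesses below, so a
		 * concurrent membarrier(2) either IPIs this CPU or sees the mm
		 * and the accesses happen after it became visible.
		 */
		if (copy_from_user(&value, uaddr, sizeof(value)))
			value = -EFAULT;
		kthread_unuse_mm(mm);	/* barrier before tsk->mm is cleared */

		mmput(mm);
		return value;
	}
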
* Re: [RFC PATCH 2/3] sched: membarrier: cover kthread_use_mm (v2)

From: Boqun Feng @ 2020-08-16 15:29 UTC
To: Mathieu Desnoyers
Cc: Peter Zijlstra, linux-kernel, Will Deacon, Paul E. McKenney,
    Andy Lutomirski, Andrew Morton, Alan Stern, Nicholas Piggin

On Fri, Aug 14, 2020 at 12:43:57PM -0400, Mathieu Desnoyers wrote:
> Add comments and memory barrier to kthread_use_mm and kthread_unuse_mm
> to allow the effect of membarrier(2) to apply to kthreads accessing
> user-space memory as well.
>
> Given that no prior kthread uses this guarantee and that it only affects
> kthreads, adding this guarantee does not affect user-space ABI.
>
> Refine the check in membarrier_global_expedited to exclude runqueues
> running the idle thread rather than all kthreads from the IPI cpumask.
>
> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
> Cc: Will Deacon <will@kernel.org>
> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Nicholas Piggin <npiggin@gmail.com>
> Cc: Andy Lutomirski <luto@amacapital.net>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> ---
> Changes since v1:
> - Add WARN_ON_ONCE(current->mm) in play_idle_precise (PeterZ),
> - Use smp_mb__after_spinlock rather than smp_mb after task_lock.
> ---
>  kernel/kthread.c          | 19 +++++++++++++++++++
>  kernel/sched/idle.c       |  1 +
>  kernel/sched/membarrier.c |  8 ++------
>  3 files changed, 22 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/kthread.c b/kernel/kthread.c
> index 3edaa380dc7b..77aaaa7bc8d9 100644
> --- a/kernel/kthread.c
> +++ b/kernel/kthread.c
> @@ -1255,8 +1255,19 @@ void kthread_use_mm(struct mm_struct *mm)
>  	finish_arch_post_lock_switch();
>  #endif
>
> +	/*
> +	 * When a kthread starts operating on an address space, the loop
> +	 * in membarrier_{private,global}_expedited() may not observe
> +	 * that tsk->mm, and not issue an IPI. Membarrier requires a
> +	 * memory barrier after storing to tsk->mm, before accessing
> +	 * user-space memory. A full memory barrier for membarrier
> +	 * {PRIVATE,GLOBAL}_EXPEDITED is implicitly provided by
> +	 * mmdrop(), or explicitly with smp_mb().
> +	 */
>  	if (active_mm != mm)
>  		mmdrop(active_mm);
> +	else
> +		smp_mb();

Similar question here: could smp_mb() guarantee the correctness of
GLOBAL_EXPEDITED? Don't you need membarrier_switch_mm() here and in
kthread_unuse_mm(), too?

Am I missing something here?

Regards,
Boqun

>
>  	to_kthread(tsk)->oldfs = force_uaccess_begin();
>  }
> @@ -1276,6 +1287,14 @@ void kthread_unuse_mm(struct mm_struct *mm)
>  	force_uaccess_end(to_kthread(tsk)->oldfs);
>
>  	task_lock(tsk);
> +	/*
> +	 * When a kthread stops operating on an address space, the loop
> +	 * in membarrier_{private,global}_expedited() may not observe
> +	 * that tsk->mm, and not issue an IPI. Membarrier requires a
> +	 * memory barrier after accessing user-space memory, before
> +	 * clearing tsk->mm.
> +	 */
> +	smp_mb__after_spinlock();
>  	sync_mm_rss(mm);
>  	local_irq_disable();
>  	tsk->mm = NULL;
> diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
> index 6bf34986f45c..3443ee8335d0 100644
> --- a/kernel/sched/idle.c
> +++ b/kernel/sched/idle.c
> @@ -341,6 +341,7 @@ void play_idle_precise(u64 duration_ns, u64 latency_ns)
>  	WARN_ON_ONCE(!(current->flags & PF_KTHREAD));
>  	WARN_ON_ONCE(!(current->flags & PF_NO_SETAFFINITY));
>  	WARN_ON_ONCE(!duration_ns);
> +	WARN_ON_ONCE(current->mm);
>
>  	rcu_sleep_check();
>  	preempt_disable();
> diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c
> index 168479a7d61b..8a294483074d 100644
> --- a/kernel/sched/membarrier.c
> +++ b/kernel/sched/membarrier.c
> @@ -100,13 +100,9 @@ static int membarrier_global_expedited(void)
>  			    MEMBARRIER_STATE_GLOBAL_EXPEDITED))
>  			continue;
>
> -		/*
> -		 * Skip the CPU if it runs a kernel thread. The scheduler
> -		 * leaves the prior task mm in place as an optimization when
> -		 * scheduling a kthread.
> -		 */
> +		/* Skip the CPU if it runs the idle thread. */
>  		p = rcu_dereference(cpu_rq(cpu)->curr);
> -		if (p->flags & PF_KTHREAD)
> +		if (is_idle_task(p))
>  			continue;
>
>  		__cpumask_set_cpu(cpu, tmpmask);
> -- 
> 2.11.0
>

* Re: [RFC PATCH 2/3] sched: membarrier: cover kthread_use_mm (v2)

From: Mathieu Desnoyers @ 2020-08-24 15:27 UTC
To: Boqun Feng
Cc: Peter Zijlstra, linux-kernel, Will Deacon, paulmck, Andy Lutomirski,
    Andrew Morton, Alan Stern, Nicholas Piggin

----- On Aug 16, 2020, at 11:29 AM, Boqun Feng boqun.feng@gmail.com wrote:

> On Fri, Aug 14, 2020 at 12:43:57PM -0400, Mathieu Desnoyers wrote:
>> Add comments and memory barrier to kthread_use_mm and kthread_unuse_mm
>> to allow the effect of membarrier(2) to apply to kthreads accessing
>> user-space memory as well.
>>
>> Given that no prior kthread uses this guarantee and that it only affects
>> kthreads, adding this guarantee does not affect user-space ABI.
>>
>> Refine the check in membarrier_global_expedited to exclude runqueues
>> running the idle thread rather than all kthreads from the IPI cpumask.
>>
>> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
>> Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
>> Cc: Will Deacon <will@kernel.org>
>> Cc: Paul E. McKenney <paulmck@kernel.org>
>> Cc: Nicholas Piggin <npiggin@gmail.com>
>> Cc: Andy Lutomirski <luto@amacapital.net>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> ---
>> Changes since v1:
>> - Add WARN_ON_ONCE(current->mm) in play_idle_precise (PeterZ),
>> - Use smp_mb__after_spinlock rather than smp_mb after task_lock.
>> ---
>>  kernel/kthread.c          | 19 +++++++++++++++++++
>>  kernel/sched/idle.c       |  1 +
>>  kernel/sched/membarrier.c |  8 ++------
>>  3 files changed, 22 insertions(+), 6 deletions(-)
>>
>> diff --git a/kernel/kthread.c b/kernel/kthread.c
>> index 3edaa380dc7b..77aaaa7bc8d9 100644
>> --- a/kernel/kthread.c
>> +++ b/kernel/kthread.c
>> @@ -1255,8 +1255,19 @@ void kthread_use_mm(struct mm_struct *mm)
>>  	finish_arch_post_lock_switch();
>>  #endif
>>
>> +	/*
>> +	 * When a kthread starts operating on an address space, the loop
>> +	 * in membarrier_{private,global}_expedited() may not observe
>> +	 * that tsk->mm, and not issue an IPI. Membarrier requires a
>> +	 * memory barrier after storing to tsk->mm, before accessing
>> +	 * user-space memory. A full memory barrier for membarrier
>> +	 * {PRIVATE,GLOBAL}_EXPEDITED is implicitly provided by
>> +	 * mmdrop(), or explicitly with smp_mb().
>> +	 */
>>  	if (active_mm != mm)
>>  		mmdrop(active_mm);
>> +	else
>> +		smp_mb();
>
> Similar question here: could smp_mb() guarantee the correctness of
> GLOBAL_EXPEDITED? Don't you need membarrier_switch_mm() here and in
> kthread_unuse_mm(), too?
>
> Am I missing something here?

I think you have a good point there. Which brings me to wonder why we
don't have membarrier_switch_mm() when entering/leaving lazy TLB mode.
This means an IPI can be sent to a kthread even if it does not use an
mm, just because the membarrier state in the runqueue is that of the
active_mm.

Thoughts?

Thanks,

Mathieu

> Regards,
> Boqun
>
>>
>>  	to_kthread(tsk)->oldfs = force_uaccess_begin();
>>  }
>> @@ -1276,6 +1287,14 @@ void kthread_unuse_mm(struct mm_struct *mm)
>>  	force_uaccess_end(to_kthread(tsk)->oldfs);
>>
>>  	task_lock(tsk);
>> +	/*
>> +	 * When a kthread stops operating on an address space, the loop
>> +	 * in membarrier_{private,global}_expedited() may not observe
>> +	 * that tsk->mm, and not issue an IPI. Membarrier requires a
>> +	 * memory barrier after accessing user-space memory, before
>> +	 * clearing tsk->mm.
>> +	 */
>> +	smp_mb__after_spinlock();
>>  	sync_mm_rss(mm);
>>  	local_irq_disable();
>>  	tsk->mm = NULL;
>> diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
>> index 6bf34986f45c..3443ee8335d0 100644
>> --- a/kernel/sched/idle.c
>> +++ b/kernel/sched/idle.c
>> @@ -341,6 +341,7 @@ void play_idle_precise(u64 duration_ns, u64 latency_ns)
>>  	WARN_ON_ONCE(!(current->flags & PF_KTHREAD));
>>  	WARN_ON_ONCE(!(current->flags & PF_NO_SETAFFINITY));
>>  	WARN_ON_ONCE(!duration_ns);
>> +	WARN_ON_ONCE(current->mm);
>>
>>  	rcu_sleep_check();
>>  	preempt_disable();
>> diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c
>> index 168479a7d61b..8a294483074d 100644
>> --- a/kernel/sched/membarrier.c
>> +++ b/kernel/sched/membarrier.c
>> @@ -100,13 +100,9 @@ static int membarrier_global_expedited(void)
>>  			    MEMBARRIER_STATE_GLOBAL_EXPEDITED))
>>  			continue;
>>
>> -		/*
>> -		 * Skip the CPU if it runs a kernel thread. The scheduler
>> -		 * leaves the prior task mm in place as an optimization when
>> -		 * scheduling a kthread.
>> -		 */
>> +		/* Skip the CPU if it runs the idle thread. */
>>  		p = rcu_dereference(cpu_rq(cpu)->curr);
>> -		if (p->flags & PF_KTHREAD)
>> +		if (is_idle_task(p))
>>  			continue;
>>
>>  		__cpumask_set_cpu(cpu, tmpmask);
>> -- 
>> 2.11.0

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

* Re: [RFC PATCH 2/3] sched: membarrier: cover kthread_use_mm (v2)

From: Boqun Feng @ 2020-08-25 2:06 UTC
To: Mathieu Desnoyers
Cc: Peter Zijlstra, linux-kernel, Will Deacon, paulmck, Andy Lutomirski,
    Andrew Morton, Alan Stern, Nicholas Piggin

On Mon, Aug 24, 2020 at 11:27:49AM -0400, Mathieu Desnoyers wrote:
> ----- On Aug 16, 2020, at 11:29 AM, Boqun Feng boqun.feng@gmail.com wrote:
>
> > On Fri, Aug 14, 2020 at 12:43:57PM -0400, Mathieu Desnoyers wrote:
> >> Add comments and memory barrier to kthread_use_mm and kthread_unuse_mm
> >> to allow the effect of membarrier(2) to apply to kthreads accessing
> >> user-space memory as well.
> >>
> >> Given that no prior kthread uses this guarantee and that it only affects
> >> kthreads, adding this guarantee does not affect user-space ABI.
> >>
> >> Refine the check in membarrier_global_expedited to exclude runqueues
> >> running the idle thread rather than all kthreads from the IPI cpumask.
> >>
> >> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> >> Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
> >> Cc: Will Deacon <will@kernel.org>
> >> Cc: Paul E. McKenney <paulmck@kernel.org>
> >> Cc: Nicholas Piggin <npiggin@gmail.com>
> >> Cc: Andy Lutomirski <luto@amacapital.net>
> >> Cc: Andrew Morton <akpm@linux-foundation.org>
> >> ---
> >> Changes since v1:
> >> - Add WARN_ON_ONCE(current->mm) in play_idle_precise (PeterZ),
> >> - Use smp_mb__after_spinlock rather than smp_mb after task_lock.
> >> ---
> >>  kernel/kthread.c          | 19 +++++++++++++++++++
> >>  kernel/sched/idle.c       |  1 +
> >>  kernel/sched/membarrier.c |  8 ++------
> >>  3 files changed, 22 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/kernel/kthread.c b/kernel/kthread.c
> >> index 3edaa380dc7b..77aaaa7bc8d9 100644
> >> --- a/kernel/kthread.c
> >> +++ b/kernel/kthread.c
> >> @@ -1255,8 +1255,19 @@ void kthread_use_mm(struct mm_struct *mm)
> >>  	finish_arch_post_lock_switch();
> >>  #endif
> >>
> >> +	/*
> >> +	 * When a kthread starts operating on an address space, the loop
> >> +	 * in membarrier_{private,global}_expedited() may not observe
> >> +	 * that tsk->mm, and not issue an IPI. Membarrier requires a
> >> +	 * memory barrier after storing to tsk->mm, before accessing
> >> +	 * user-space memory. A full memory barrier for membarrier
> >> +	 * {PRIVATE,GLOBAL}_EXPEDITED is implicitly provided by
> >> +	 * mmdrop(), or explicitly with smp_mb().
> >> +	 */
> >>  	if (active_mm != mm)
> >>  		mmdrop(active_mm);
> >> +	else
> >> +		smp_mb();
> >
> > Similar question here: could smp_mb() guarantee the correctness of
> > GLOBAL_EXPEDITED? Don't you need membarrier_switch_mm() here and in
> > kthread_unuse_mm(), too?
> >
> > Am I missing something here?
>
> I think you have a good point there. Which brings me to wonder why we
> don't have membarrier_switch_mm() when entering/leaving lazy TLB mode.
> This means an IPI can be sent to a kthread even if it does not use an
> mm, just because the membarrier state in the runqueue is that of the
> active_mm.
>
> Thoughts?

Right, I think we should also handle the percpu membarrier_state. The
basic rule is whenever we change current->mm or current (i.e. rq->curr)
itself, we need to update the percpu membarrier_state accordingly.

Regards,
Boqun

> Thanks,
>
> Mathieu
>
> > Regards,
> > Boqun
> >
> >>
> >>  	to_kthread(tsk)->oldfs = force_uaccess_begin();
> >>  }
> >> @@ -1276,6 +1287,14 @@ void kthread_unuse_mm(struct mm_struct *mm)
> >>  	force_uaccess_end(to_kthread(tsk)->oldfs);
> >>
> >>  	task_lock(tsk);
> >> +	/*
> >> +	 * When a kthread stops operating on an address space, the loop
> >> +	 * in membarrier_{private,global}_expedited() may not observe
> >> +	 * that tsk->mm, and not issue an IPI. Membarrier requires a
> >> +	 * memory barrier after accessing user-space memory, before
> >> +	 * clearing tsk->mm.
> >> +	 */
> >> +	smp_mb__after_spinlock();
> >>  	sync_mm_rss(mm);
> >>  	local_irq_disable();
> >>  	tsk->mm = NULL;
> >> diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
> >> index 6bf34986f45c..3443ee8335d0 100644
> >> --- a/kernel/sched/idle.c
> >> +++ b/kernel/sched/idle.c
> >> @@ -341,6 +341,7 @@ void play_idle_precise(u64 duration_ns, u64 latency_ns)
> >>  	WARN_ON_ONCE(!(current->flags & PF_KTHREAD));
> >>  	WARN_ON_ONCE(!(current->flags & PF_NO_SETAFFINITY));
> >>  	WARN_ON_ONCE(!duration_ns);
> >> +	WARN_ON_ONCE(current->mm);
> >>
> >>  	rcu_sleep_check();
> >>  	preempt_disable();
> >> diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c
> >> index 168479a7d61b..8a294483074d 100644
> >> --- a/kernel/sched/membarrier.c
> >> +++ b/kernel/sched/membarrier.c
> >> @@ -100,13 +100,9 @@ static int membarrier_global_expedited(void)
> >>  			    MEMBARRIER_STATE_GLOBAL_EXPEDITED))
> >>  			continue;
> >>
> >> -		/*
> >> -		 * Skip the CPU if it runs a kernel thread. The scheduler
> >> -		 * leaves the prior task mm in place as an optimization when
> >> -		 * scheduling a kthread.
> >> -		 */
> >> +		/* Skip the CPU if it runs the idle thread. */
> >>  		p = rcu_dereference(cpu_rq(cpu)->curr);
> >> -		if (p->flags & PF_KTHREAD)
> >> +		if (is_idle_task(p))
> >>  			continue;
> >>
> >>  		__cpumask_set_cpu(cpu, tmpmask);
> >> -- 
> >> 2.11.0
>
> -- 
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com

* Re: [RFC PATCH 2/3] sched: membarrier: cover kthread_use_mm (v2)

From: Mathieu Desnoyers @ 2020-09-24 15:26 UTC
To: Boqun Feng
Cc: Peter Zijlstra, linux-kernel, Will Deacon, paulmck, Andy Lutomirski,
    Andrew Morton, Alan Stern, Nicholas Piggin

----- On Aug 24, 2020, at 10:06 PM, Boqun Feng boqun.feng@gmail.com wrote:

> On Mon, Aug 24, 2020 at 11:27:49AM -0400, Mathieu Desnoyers wrote:
>> ----- On Aug 16, 2020, at 11:29 AM, Boqun Feng boqun.feng@gmail.com wrote:
>>
>> > On Fri, Aug 14, 2020 at 12:43:57PM -0400, Mathieu Desnoyers wrote:
>> >> Add comments and memory barrier to kthread_use_mm and kthread_unuse_mm
>> >> to allow the effect of membarrier(2) to apply to kthreads accessing
>> >> user-space memory as well.
>> >>
>> >> Given that no prior kthread uses this guarantee and that it only affects
>> >> kthreads, adding this guarantee does not affect user-space ABI.
>> >>
>> >> Refine the check in membarrier_global_expedited to exclude runqueues
>> >> running the idle thread rather than all kthreads from the IPI cpumask.
>> >>
>> >> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
>> >> Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
>> >> Cc: Will Deacon <will@kernel.org>
>> >> Cc: Paul E. McKenney <paulmck@kernel.org>
>> >> Cc: Nicholas Piggin <npiggin@gmail.com>
>> >> Cc: Andy Lutomirski <luto@amacapital.net>
>> >> Cc: Andrew Morton <akpm@linux-foundation.org>
>> >> ---
>> >> Changes since v1:
>> >> - Add WARN_ON_ONCE(current->mm) in play_idle_precise (PeterZ),
>> >> - Use smp_mb__after_spinlock rather than smp_mb after task_lock.
>> >> ---
>> >>  kernel/kthread.c          | 19 +++++++++++++++++++
>> >>  kernel/sched/idle.c       |  1 +
>> >>  kernel/sched/membarrier.c |  8 ++------
>> >>  3 files changed, 22 insertions(+), 6 deletions(-)
>> >>
>> >> diff --git a/kernel/kthread.c b/kernel/kthread.c
>> >> index 3edaa380dc7b..77aaaa7bc8d9 100644
>> >> --- a/kernel/kthread.c
>> >> +++ b/kernel/kthread.c
>> >> @@ -1255,8 +1255,19 @@ void kthread_use_mm(struct mm_struct *mm)
>> >>  	finish_arch_post_lock_switch();
>> >>  #endif
>> >>
>> >> +	/*
>> >> +	 * When a kthread starts operating on an address space, the loop
>> >> +	 * in membarrier_{private,global}_expedited() may not observe
>> >> +	 * that tsk->mm, and not issue an IPI. Membarrier requires a
>> >> +	 * memory barrier after storing to tsk->mm, before accessing
>> >> +	 * user-space memory. A full memory barrier for membarrier
>> >> +	 * {PRIVATE,GLOBAL}_EXPEDITED is implicitly provided by
>> >> +	 * mmdrop(), or explicitly with smp_mb().
>> >> +	 */
>> >>  	if (active_mm != mm)
>> >>  		mmdrop(active_mm);
>> >> +	else
>> >> +		smp_mb();
>> >
>> > Similar question here: could smp_mb() guarantee the correctness of
>> > GLOBAL_EXPEDITED? Don't you need membarrier_switch_mm() here and in
>> > kthread_unuse_mm(), too?
>> >
>> > Am I missing something here?
>>
>> I think you have a good point there. Which brings me to wonder why we
>> don't have membarrier_switch_mm() when entering/leaving lazy TLB mode.
>> This means an IPI can be sent to a kthread even if it does not use an
>> mm, just because the membarrier state in the runqueue is that of the
>> active_mm.
>>
>> Thoughts?
>
> Right, I think we should also handle the percpu membarrier_state. The
> basic rule is whenever we change current->mm or current (i.e. rq->curr)
> itself, we need to update the percpu membarrier_state accordingly.

OK, so as we introduce IPIs to kthreads which are using kthread_use_mm,
we need to reconsider how the scheduler deals with runqueues entering
lazy TLB state.

Currently, membarrier_switch_mm() is not issued when entering lazy TLB
state. But as we start considering kthreads as candidates for sending
IPIs, we need to update the rq->membarrier_state whenever we enter lazy
TLB state as well. So I plan to do the following change to cover this:

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 84758f34cdb0..44521dc5602a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3736,6 +3736,8 @@ context_switch(struct rq *rq, struct task_struct *prev,
 	 */
 	arch_start_context_switch(prev);
 
+	membarrier_switch_mm(rq, prev->mm, next->mm);
+
 	/*
 	 * kernel -> kernel   lazy + transfer active
 	 *   user -> kernel   lazy + mmgrab() active
@@ -3752,7 +3754,6 @@ context_switch(struct rq *rq, struct task_struct *prev,
 		else
 			prev->active_mm = NULL;
 	} else {                                        // to user
-		membarrier_switch_mm(rq, prev->active_mm, next->mm);
 		/*
 		 * sys_membarrier() requires an smp_mb() between setting
 		 * rq->curr / membarrier_switch_mm() and returning to userspace.
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 3fd283892761..481149066086 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -2592,12 +2592,13 @@ static inline void membarrier_switch_mm(struct rq *rq,
 					 struct mm_struct *prev_mm,
 					 struct mm_struct *next_mm)
 {
-	int membarrier_state;
+	int membarrier_state = 0;
 
 	if (prev_mm == next_mm)
 		return;
 
-	membarrier_state = atomic_read(&next_mm->membarrier_state);
+	if (next_mm)
+		membarrier_state = atomic_read(&next_mm->membarrier_state);
 	if (READ_ONCE(rq->membarrier_state) == membarrier_state)
 		return;
 
Thoughts?

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

* [RFC PATCH 3/3] sched: membarrier: document memory ordering scenarios

From: Mathieu Desnoyers @ 2020-08-14 16:43 UTC
To: Peter Zijlstra
Cc: linux-kernel, Will Deacon, Paul E. McKenney, Andy Lutomirski,
    Andrew Morton, Alan Stern, Nicholas Piggin, Mathieu Desnoyers

Document membarrier ordering scenarios in membarrier.c. Thanks to Alan
Stern for refreshing my memory. Now that I have those in mind, it seems
appropriate to serialize them to comments for posterity.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Alan Stern <stern@rowland.harvard.edu>
---
 kernel/sched/membarrier.c | 128 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 128 insertions(+)

diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c
index 8a294483074d..103f5edb8ba5 100644
--- a/kernel/sched/membarrier.c
+++ b/kernel/sched/membarrier.c
@@ -7,6 +7,134 @@
 #include "sched.h"
 
 /*
+ * For documentation purposes, here are some membarrier ordering
+ * scenarios to keep in mind:
+ *
+ * A) Userspace thread execution after IPI vs membarrier's memory
+ *    barrier before sending the IPI
+ *
+ * Userspace variables:
+ *
+ * int x = 0, y = 0;
+ *
+ * The memory barrier at the start of membarrier() on CPU0 is necessary in
+ * order to enforce the guarantee that any writes occurring on CPU0 before
+ * the membarrier() is executed will be visible to any code executing on
+ * CPU1 after the IPI-induced memory barrier:
+ *
+ *         CPU0                              CPU1
+ *
+ *         x = 1
+ *         membarrier():
+ *           a: smp_mb()
+ *           b: send IPI                     IPI-induced mb
+ *           c: smp_mb()
+ *         r2 = y
+ *                                           y = 1
+ *                                           barrier()
+ *                                           r1 = x
+ *
+ *         BUG_ON(r1 == 0 && r2 == 0)
+ *
+ * The write to y and load from x by CPU1 are unordered by the hardware,
+ * so it's possible to have "r1 = x" reordered before "y = 1" at any
+ * point after (b). If the memory barrier at (a) is omitted, then "x = 1"
+ * can be reordered after (a) (although not after (c)), so we get r1 == 0
+ * and r2 == 0. This violates the guarantee that membarrier() is
+ * supposed to provide.
+ *
+ * The timing of the memory barrier at (a) has to ensure that it executes
+ * before the IPI-induced memory barrier on CPU1.
+ *
+ * B) Userspace thread execution before IPI vs membarrier's memory
+ *    barrier after completing the IPI
+ *
+ * Userspace variables:
+ *
+ * int x = 0, y = 0;
+ *
+ * The memory barrier at the end of membarrier() on CPU0 is necessary in
+ * order to enforce the guarantee that any writes occurring on CPU1 before
+ * the membarrier() is executed will be visible to any code executing on
+ * CPU0 after the membarrier():
+ *
+ *         CPU0                              CPU1
+ *
+ *                                           x = 1
+ *                                           barrier()
+ *                                           y = 1
+ *         r2 = y
+ *         membarrier():
+ *           a: smp_mb()
+ *           b: send IPI                     IPI-induced mb
+ *           c: smp_mb()
+ *         r1 = x
+ *
+ *         BUG_ON(r1 == 0 && r2 == 1)
+ *
+ * The writes to x and y are unordered by the hardware, so it's possible to
+ * have "r2 = 1" even though the write to x doesn't execute until (b). If
+ * the memory barrier at (c) is omitted then "r1 = x" can be reordered
+ * before (b) (although not before (a)), so we get "r1 = 0". This violates
+ * the guarantee that membarrier() is supposed to provide.
+ *
+ * The timing of the memory barrier at (c) has to ensure that it executes
+ * after the IPI-induced memory barrier on CPU1.
+ *
+ * C) Scheduling userspace thread -> kthread -> userspace thread vs membarrier
+ *
+ *         CPU0                              CPU1
+ *
+ *         membarrier():
+ *           a: smp_mb()
+ *                                           d: switch to kthread (includes mb)
+ *           b: read rq->curr->mm == NULL
+ *                                           e: switch to user (includes mb)
+ *           c: smp_mb()
+ *
+ * Using the scenario from (A), we can show that (a) needs to be paired
+ * with (e). Using the scenario from (B), we can show that (c) needs to
+ * be paired with (d).
+ *
+ * D) exit_mm vs membarrier
+ *
+ * Two thread groups are created, A and B. Thread group B is created by
+ * issuing clone from group A with flag CLONE_VM set, but not CLONE_THREAD.
+ * Let's assume we have a single thread within each thread group (Thread A
+ * and Thread B). Thread A runs on CPU0, Thread B runs on CPU1.
+ *
+ *         CPU0                              CPU1
+ *
+ *         membarrier():
+ *           a: smp_mb()
+ *                                           exit_mm():
+ *                                             d: smp_mb()
+ *                                             e: current->mm = NULL
+ *           b: read rq->curr->mm == NULL
+ *           c: smp_mb()
+ *
+ * Using scenario (B), we can show that (c) needs to be paired with (d).
+ *
+ * E) kthread_{use,unuse}_mm vs membarrier
+ *
+ *         CPU0                              CPU1
+ *
+ *         membarrier():
+ *           a: smp_mb()
+ *                                           kthread_unuse_mm()
+ *                                             d: smp_mb()
+ *                                             e: current->mm = NULL
+ *           b: read rq->curr->mm == NULL
+ *                                           kthread_use_mm()
+ *                                             f: current->mm = mm
+ *                                             g: smp_mb()
+ *           c: smp_mb()
+ *
+ * Using the scenario from (A), we can show that (a) needs to be paired
+ * with (g). Using the scenario from (B), we can show that (c) needs to
+ * be paired with (d).
+ */
+
+/*
  * Bitmask made from a "or" of all commands within enum membarrier_cmd,
  * except MEMBARRIER_CMD_QUERY.
  */
-- 
2.11.0

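Scenario (A) above can be checked mechanically against the Linux-kernel
memory model. The litmus test below is a sketch under one stated assumption:
herd7 has no notion of IPIs, so the IPI-induced barrier on P1 is modeled as a
plain smp_mb() between the two accesses. With that approximation, herd7
reports the "exists" clause as never satisfied, matching the BUG_ON.

	C membarrier-scenario-A

	(*
	 * P0 is the membarrier() caller; P1 is the CPU receiving the IPI.
	 * The IPI-induced barrier is approximated by smp_mb() (assumption).
	 *)

	{}

	P0(int *x, int *y)
	{
		int r2;

		WRITE_ONCE(*x, 1);
		smp_mb();	/* (a): barrier at the start of membarrier() */
		r2 = READ_ONCE(*y);
	}

	P1(int *x, int *y)
	{
		int r1;

		WRITE_ONCE(*y, 1);
		smp_mb();	/* stand-in for the IPI-induced barrier */
		r1 = READ_ONCE(*x);
	}

	exists (0:r2=0 /\ 1:r1=0)
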
* Re: [RFC PATCH 2/3] sched: membarrier: cover kthread_use_mm (v2)

From: Mathieu Desnoyers @ 2020-08-16 21:05 UTC
To: Hillf Danton
Cc: Peter Zijlstra, linux-kernel, Will Deacon, paulmck, Andy Lutomirski,
    Andrew Morton, Alan Stern, Nicholas Piggin

----- On Aug 16, 2020, at 3:09 AM, Hillf Danton hdanton@sina.com wrote:

> On Fri, 14 Aug 2020 12:43:57 -0400 Mathieu Desnoyers wrote:
>>
>> Given that no prior kthread uses this guarantee and that it only affects
>> kthreads, adding this guarantee does not affect user-space ABI.
>
> Can you expand a bit on why kthreads like ksoftirqd have to ack the
> IPIs from Dave who's not CAP_SYS_ADMIN.

Do ksoftirqd kthreads ever use kthread_use_mm() to access user
processes' memory? If not, then they won't be disturbed by any IPI.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com