From: ebiederm@xmission.com (Eric W. Biederman) To: Linus Torvalds <torvalds@linux-foundation.org> Cc: Oleg Nesterov <oleg@redhat.com>, Russell King - ARM Linux admin <linux@armlinux.org.uk>, Peter Zijlstra <peterz@infradead.org>, Chris Metcalf <cmetcalf@ezchip.com>, Christoph Lameter <cl@linux.com>, Kirill Tkhai <tkhai@yandex.ru>, Mike Galbraith <efault@gmx.de>, Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@kernel.org>, Linux List Kernel Mailing <linux-kernel@vger.kernel.org>, Davidlohr Bueso <dave@stgolabs.net> Subject: [PATCH 3/3] task: Clean house now that tasks on the runqueue are rcu protected Date: Mon, 02 Sep 2019 23:52:35 -0500 Message-ID: <8736het20c.fsf_-_@x220.int.ebiederm.org> (raw) In-Reply-To: <87k1aqt23r.fsf_-_@x220.int.ebiederm.org> (Eric W. Biederman's message of "Mon, 02 Sep 2019 23:50:32 -0500") Use rcu_dereference instead of task_rcu_dereference. Remove task_rcu_dereference. Remove the complications of rcuwait that were in place because tasks on the runqueue were not rcu protected. It is now safe to call wake_up_process if the target was know to be on the runqueue in the current rcu interval. Cc: Davidlohr Bueso <dave@stgolabs.net> Cc: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Oleg Nesterov <oleg@redhat.com> Ref: 8f95c90ceb54 ("sched/wait, RCU: Introduce rcuwait machinery") Ref: 150593bf8693 ("sched/api: Introduce task_rcu_dereference() and try_get_task_struct()") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> --- include/linux/rcuwait.h | 20 +++--------- include/linux/sched/task.h | 1 - kernel/exit.c | 67 -------------------------------------- kernel/sched/fair.c | 2 +- kernel/sched/membarrier.c | 4 +-- 5 files changed, 7 insertions(+), 87 deletions(-) diff --git a/include/linux/rcuwait.h b/include/linux/rcuwait.h index 563290fc194f..75c97e4bbc57 100644 --- a/include/linux/rcuwait.h +++ b/include/linux/rcuwait.h @@ -6,16 +6,11 @@ /* * rcuwait provides a way of blocking and waking up a single - * task in an rcu-safe manner; where it is forbidden to use - * after exit_notify(). task_struct is not properly rcu protected, - * unless dealing with rcu-aware lists, ie: find_task_by_*(). + * task in an rcu-safe manner. * - * Alternatively we have task_rcu_dereference(), but the return - * semantics have different implications which would break the - * wakeup side. The only time @task is non-nil is when a user is - * blocked (or checking if it needs to) on a condition, and reset - * as soon as we know that the condition has succeeded and are - * awoken. + * The only time @task is non-nil is when a user is blocked (or + * checking if it needs to) on a condition, and reset as soon as we + * know that the condition has succeeded and are awoken. */ struct rcuwait { struct task_struct __rcu *task; @@ -37,13 +32,6 @@ extern void rcuwait_wake_up(struct rcuwait *w); */ #define rcuwait_wait_event(w, condition) \ ({ \ - /* \ - * Complain if we are called after do_exit()/exit_notify(), \ - * as we cannot rely on the rcu critical region for the \ - * wakeup side. \ - */ \ - WARN_ON(current->exit_state); \ - \ rcu_assign_pointer((w)->task, current); \ for (;;) { \ /* \ diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h index 4c44c37236b2..8bd51af44bf8 100644 --- a/include/linux/sched/task.h +++ b/include/linux/sched/task.h @@ -115,7 +115,6 @@ static inline void put_task_struct(struct task_struct *t) __put_task_struct(t); } -struct task_struct *task_rcu_dereference(struct task_struct **ptask); void put_task_struct_rcu_user(struct task_struct *task); #ifdef CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT diff --git a/kernel/exit.c b/kernel/exit.c index 2e259286f4e7..f943773622fc 100644 --- a/kernel/exit.c +++ b/kernel/exit.c @@ -234,69 +234,6 @@ void release_task(struct task_struct *p) goto repeat; } -/* - * Note that if this function returns a valid task_struct pointer (!NULL) - * task->usage must remain >0 for the duration of the RCU critical section. - */ -struct task_struct *task_rcu_dereference(struct task_struct **ptask) -{ - struct sighand_struct *sighand; - struct task_struct *task; - - /* - * We need to verify that release_task() was not called and thus - * delayed_put_task_struct() can't run and drop the last reference - * before rcu_read_unlock(). We check task->sighand != NULL, - * but we can read the already freed and reused memory. - */ -retry: - task = rcu_dereference(*ptask); - if (!task) - return NULL; - - probe_kernel_address(&task->sighand, sighand); - - /* - * Pairs with atomic_dec_and_test() in put_task_struct(). If this task - * was already freed we can not miss the preceding update of this - * pointer. - */ - smp_rmb(); - if (unlikely(task != READ_ONCE(*ptask))) - goto retry; - - /* - * We've re-checked that "task == *ptask", now we have two different - * cases: - * - * 1. This is actually the same task/task_struct. In this case - * sighand != NULL tells us it is still alive. - * - * 2. This is another task which got the same memory for task_struct. - * We can't know this of course, and we can not trust - * sighand != NULL. - * - * In this case we actually return a random value, but this is - * correct. - * - * If we return NULL - we can pretend that we actually noticed that - * *ptask was updated when the previous task has exited. Or pretend - * that probe_slab_address(&sighand) reads NULL. - * - * If we return the new task (because sighand is not NULL for any - * reason) - this is fine too. This (new) task can't go away before - * another gp pass. - * - * And note: We could even eliminate the false positive if re-read - * task->sighand once again to avoid the falsely NULL. But this case - * is very unlikely so we don't care. - */ - if (!sighand) - return NULL; - - return task; -} - void rcuwait_wake_up(struct rcuwait *w) { struct task_struct *task; @@ -316,10 +253,6 @@ void rcuwait_wake_up(struct rcuwait *w) */ smp_mb(); /* (B) */ - /* - * Avoid using task_rcu_dereference() magic as long as we are careful, - * see comment in rcuwait_wait_event() regarding ->exit_state. - */ task = rcu_dereference(w->task); if (task) wake_up_process(task); diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index bc9cfeaac8bd..215c640e2a6b 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -1644,7 +1644,7 @@ static void task_numa_compare(struct task_numa_env *env, return; rcu_read_lock(); - cur = task_rcu_dereference(&dst_rq->curr); + cur = rcu_dereference(dst_rq->curr); if (cur && ((cur->flags & PF_EXITING) || is_idle_task(cur))) cur = NULL; diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c index aa8d75804108..b14250a11608 100644 --- a/kernel/sched/membarrier.c +++ b/kernel/sched/membarrier.c @@ -71,7 +71,7 @@ static int membarrier_global_expedited(void) continue; rcu_read_lock(); - p = task_rcu_dereference(&cpu_rq(cpu)->curr); + p = rcu_dereference(cpu_rq(cpu)->curr); if (p && p->mm && (atomic_read(&p->mm->membarrier_state) & MEMBARRIER_STATE_GLOBAL_EXPEDITED)) { if (!fallback) @@ -150,7 +150,7 @@ static int membarrier_private_expedited(int flags) if (cpu == raw_smp_processor_id()) continue; rcu_read_lock(); - p = task_rcu_dereference(&cpu_rq(cpu)->curr); + p = rcu_dereference(cpu_rq(cpu)->curr); if (p && p->mm == current->mm) { if (!fallback) __cpumask_set_cpu(cpu, tmpmask); -- 2.21.0.dirty
next prev parent reply index Thread overview: 75+ messages / expand[flat|nested] mbox.gz Atom feed top 2019-08-30 14:08 [BUG] Use of probe_kernel_address() in task_rcu_dereference() without checking return value Russell King - ARM Linux admin 2019-08-30 15:24 ` Oleg Nesterov 2019-08-30 15:30 ` Linus Torvalds 2019-08-30 15:40 ` Russell King - ARM Linux admin 2019-08-30 15:43 ` Linus Torvalds 2019-08-30 15:41 ` Linus Torvalds 2019-08-30 16:09 ` Oleg Nesterov 2019-08-30 16:21 ` Linus Torvalds 2019-08-30 16:44 ` Oleg Nesterov 2019-08-30 16:58 ` Linus Torvalds 2019-08-30 19:36 ` Eric W. Biederman 2019-09-02 13:40 ` Oleg Nesterov 2019-09-02 13:53 ` Peter Zijlstra 2019-09-02 14:44 ` Oleg Nesterov 2019-09-02 16:20 ` Peter Zijlstra 2019-09-02 17:04 ` Eric W. Biederman 2019-09-02 17:34 ` Linus Torvalds 2019-09-03 4:50 ` [PATCH 0/3] task: Making tasks on the runqueue rcu protected Eric W. Biederman 2019-09-03 4:51 ` [PATCH 1/3] task: Add a count of task rcu users Eric W. Biederman 2019-09-04 14:36 ` Oleg Nesterov 2019-09-04 14:44 ` Frederic Weisbecker 2019-09-04 15:32 ` Oleg Nesterov 2019-09-04 16:33 ` Frederic Weisbecker 2019-09-04 18:20 ` Linus Torvalds 2019-09-05 14:59 ` Frederic Weisbecker 2019-09-03 4:52 ` [PATCH 2/3] task: RCU protect tasks on the runqueue Eric W. Biederman 2019-09-03 7:41 ` Peter Zijlstra 2019-09-03 7:47 ` Peter Zijlstra 2019-09-03 16:44 ` Eric W. Biederman 2019-09-03 17:08 ` Linus Torvalds 2019-09-03 18:13 ` Eric W. Biederman 2019-09-03 19:18 ` Linus Torvalds 2019-09-03 20:06 ` Peter Zijlstra 2019-09-03 21:32 ` Paul E. McKenney 2019-09-05 20:02 ` Eric W. Biederman 2019-09-05 20:55 ` Paul E. McKenney 2019-09-06 7:07 ` Peter Zijlstra 2019-09-09 12:22 ` Eric W. Biederman 2019-09-25 7:36 ` Peter Zijlstra 2019-09-27 8:10 ` [tip: sched/urgent] tasks, sched/core: RCUify the assignment of rq->curr tip-bot2 for Eric W. Biederman 2019-09-03 19:42 ` [PATCH 2/3] task: RCU protect tasks on the runqueue Peter Zijlstra 2019-09-14 12:31 ` [PATCH v2 1/4] task: Add a count of task rcu users Eric W. Biederman 2019-09-14 12:31 ` [PATCH v2 2/4] task: Ensure tasks are available for a grace period after leaving the runqueue Eric W. Biederman 2019-09-14 12:32 ` [PATCH v2 3/4] task: With a grace period after finish_task_switch, remove unnecessary code Eric W. Biederman 2019-09-04 14:22 ` [PATCH 2/3] task: RCU protect tasks on the runqueue Frederic Weisbecker 2019-09-03 4:52 ` Eric W. Biederman [this message] 2019-09-03 9:45 ` [PATCH 3/3] task: Clean house now that tasks on the runqueue are rcu protected kbuild test robot 2019-09-03 13:06 ` Oleg Nesterov 2019-09-03 13:58 ` [PATCH 0/3] task: Making tasks on the runqueue " Oleg Nesterov 2019-09-03 15:44 ` Linus Torvalds 2019-09-03 19:46 ` Peter Zijlstra [not found] ` <87muf7f4bf.fsf_-_@x220.int.ebiederm.org> 2019-09-14 12:33 ` [PATCH v2 1/4] task: Add a count of task rcu users Eric W. Biederman 2019-09-15 13:54 ` Paul E. McKenney 2019-09-27 8:10 ` [tip: sched/urgent] tasks: Add a count of task RCU users tip-bot2 for Eric W. Biederman 2019-09-14 12:33 ` [PATCH v2 2/4] task: Ensure tasks are available for a grace period after leaving the runqueue Eric W. Biederman 2019-09-15 14:07 ` Paul E. McKenney 2019-09-15 14:09 ` Paul E. McKenney 2019-09-27 8:10 ` [tip: sched/urgent] tasks, sched/core: " tip-bot2 for Eric W. Biederman 2019-09-14 12:34 ` [PATCH v2 3/4] task: With a grace period after finish_task_switch, remove unnecessary code Eric W. Biederman 2019-09-15 14:32 ` Paul E. McKenney 2019-09-15 17:07 ` Linus Torvalds 2019-09-15 18:47 ` Paul E. McKenney 2019-09-27 8:10 ` [tip: sched/urgent] tasks, sched/core: With a grace period after finish_task_switch(), " tip-bot2 for Eric W. Biederman 2019-09-14 12:35 ` [PATCH v2 4/4] task: RCUify the assignment of rq->curr Eric W. Biederman 2019-09-15 14:41 ` Paul E. McKenney 2019-09-15 17:59 ` Eric W. Biederman 2019-09-15 18:25 ` Eric W. Biederman 2019-09-15 18:48 ` Paul E. McKenney 2019-09-20 23:02 ` Frederic Weisbecker 2019-09-26 1:49 ` Eric W. Biederman 2019-09-26 12:42 ` Frederic Weisbecker 2019-09-14 17:43 ` [PATCH v2 0/4] task: Making tasks on the runqueue rcu protected Linus Torvalds 2019-09-17 17:38 ` Eric W. Biederman 2019-09-25 7:51 ` Peter Zijlstra 2019-09-26 1:11 ` Eric W. Biederman
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=8736het20c.fsf_-_@x220.int.ebiederm.org \ --to=ebiederm@xmission.com \ --cc=cl@linux.com \ --cc=cmetcalf@ezchip.com \ --cc=dave@stgolabs.net \ --cc=efault@gmx.de \ --cc=linux-kernel@vger.kernel.org \ --cc=linux@armlinux.org.uk \ --cc=mingo@kernel.org \ --cc=oleg@redhat.com \ --cc=peterz@infradead.org \ --cc=tglx@linutronix.de \ --cc=tkhai@yandex.ru \ --cc=torvalds@linux-foundation.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
LKML Archive on lore.kernel.org Archives are clonable: git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git git clone --mirror https://lore.kernel.org/lkml/8 lkml/git/8.git git clone --mirror https://lore.kernel.org/lkml/9 lkml/git/9.git # If you have public-inbox 1.1+ installed, you may # initialize and index your mirror using the following commands: public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \ linux-kernel@vger.kernel.org public-inbox-index lkml Example config snippet for mirrors Newsgroup available over NNTP: nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel AGPL code for this site: git clone https://public-inbox.org/public-inbox.git