linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/1] do_exit(): Solve possibility of BUG() due to race with try_to_wake_up()
@ 2014-08-25 10:54 Kautuk Consul
  2014-08-25 15:57 ` Oleg Nesterov
  0 siblings, 1 reply; 25+ messages in thread
From: Kautuk Consul @ 2014-08-25 10:54 UTC (permalink / raw)
  To: Andrew Morton, Oleg Nesterov, Michal Hocko, David Rientjes,
	Ionut Alexa, Guillaume Morin, linux-kernel
  Cc: Kautuk Consul

I encountered a BUG() scenario within do_exit() on an ARM system.

The problem is due to a race scenario between do_exit() and try_to_wake_up()
on different CPUs due to usage of sleeping primitives such as __down_common
and wait_for_common.

Race Scenario
=============

Let us assume there are 2 CPUs A and B execute code in the following order:
1)	CPU A was running in user-mode and enters kernel mode via some
	syscall/exception handler.
2)	CPU A sets the current task(t) state to TASK_INTERRUPTIBLE via __down_common
	or wait_for_common.
3)	CPU A checks for signal_pending() and returns due to TIF_SIGPENDING
	being set in t's threadinfo due to a previous signal(say SIGKILL) being
	received on this task t.
4)	CPU A returns returns back to the assembly trap handler and calls
	do_work_pending() -> do_signal() -> get_signal() -> do_group_exit()
	 -> do_exit()
	CPU A  has not yet executed the following line of code before the final
	call to schedule:
    /* causes final put_task_struct in finish_task_switch(). */
    tsk->state = TASK_DEAD;
5)	CPU B tries to send a signal to task t (currently executing on CPU A)
	and thus enters: signal_wake_up_state() -> wake_up_state() ->
					 try_to_wake_up()
6)	CPU B executes all code in try_to_wake_up() till the call to
	ttwu_queue -> ttwu_do_activate -> ttwu_do_wakeup().
	CPU B has still not executed the following code in ttwu_do_wakeup():
	p->state = TASK_RUNNING;
7)	CPU A executes the following line of code:
    /* causes final put_task_struct in finish_task_switch(). */
    tsk->state = TASK_DEAD;
8)	CPU B executes the following code in ttwu_do_wakeup():
	p->state = TASK_RUNNING;
9)	CPU A continues to the call to do_exit() -> schedule().
	Since the tsk->state is TASK_RUNNING, the call to schedule() returns and
	do_exit() -> BUG() is hit on CPU A.

Alternate Solution
==================

An alternate solution would be to simply set the current task state to
TASK_RUNNING in __down_common(), wait_for_common() and all other interruptible
sleeping primitives in their if(signal_pending/signal_pending_state) conditional
blocks.

But this change seems to me to be more logical because:
i)		This will involve lesser changes to the kernel core code.
ii)		Any further sleeping primitives in the kernel also need not suffer from
		this kind of race scenario.

Signed-off-by: Kautuk Consul <consul.kautuk@gmail.com>
---
 kernel/exit.c |   10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/kernel/exit.c b/kernel/exit.c
index 32c58f7..69a8231 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -824,14 +824,16 @@ void do_exit(long code)
 	 *     (or hypervisor of virtual machine switches to other guest)
 	 *  As a result, we may become TASK_RUNNING after becoming TASK_DEAD
 	 *
-	 * To avoid it, we have to wait for releasing tsk->pi_lock which
-	 * is held by try_to_wake_up()
+	 * To solve this, we have to compete for tsk->pi_lock which is held by
+	 * try_to_wake_up().
 	 */
-	smp_mb();
-	raw_spin_unlock_wait(&tsk->pi_lock);
+	raw_spin_lock(&tsk->pi_lock);
 
 	/* causes final put_task_struct in finish_task_switch(). */
 	tsk->state = TASK_DEAD;
+
+	raw_spin_unlock(&tsk->pi_lock);
+
 	tsk->flags |= PF_NOFREEZE;	/* tell freezer to ignore us */
 	schedule();
 	BUG();
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2014-09-05 11:38 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-08-25 10:54 [PATCH 1/1] do_exit(): Solve possibility of BUG() due to race with try_to_wake_up() Kautuk Consul
2014-08-25 15:57 ` Oleg Nesterov
2014-08-26  4:45   ` Kautuk Consul
2014-08-26 15:03     ` Oleg Nesterov
2014-09-01 15:39   ` Peter Zijlstra
2014-09-01 17:58     ` Oleg Nesterov
2014-09-01 19:09       ` Peter Zijlstra
2014-09-02 15:52         ` Oleg Nesterov
2014-09-02 16:47           ` Oleg Nesterov
2014-09-02 17:39             ` Peter Zijlstra
2014-09-03 13:36               ` Oleg Nesterov
2014-09-03 14:44                 ` Peter Zijlstra
2014-09-03 15:18                   ` Oleg Nesterov
2014-09-04  7:15                     ` Peter Zijlstra
2014-09-04 17:03                       ` Paul E. McKenney
2014-09-04  5:04                   ` Ingo Molnar
2014-09-04  6:32                     ` Peter Zijlstra
2014-09-03 16:08             ` task_numa_fault() && TASK_DEAD Oleg Nesterov
2014-09-03 16:33               ` Rik van Riel
2014-09-04  7:11               ` Peter Zijlstra
2014-09-04 10:39                 ` Oleg Nesterov
2014-09-04 19:14                   ` Hugh Dickins
2014-09-05 11:35                     ` Oleg Nesterov
2014-09-03  9:04   ` [PATCH 1/1] do_exit(): Solve possibility of BUG() due to race with try_to_wake_up() Kirill Tkhai
2014-09-03  9:45     ` Peter Zijlstra

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).