All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] fix kill_proc_info() vs copy_process() race
@ 2006-02-06 16:45 Oleg Nesterov
  2006-02-06 17:10 ` Oleg Nesterov
  2006-02-14 22:32 ` Paul E. McKenney
  0 siblings, 2 replies; 16+ messages in thread
From: Oleg Nesterov @ 2006-02-06 16:45 UTC (permalink / raw)
  To: Ingo Molnar, Paul E. McKenney
  Cc: linux-kernel, Roland McGrath, Linus Torvalds, Andrew Morton

kill_proc_info() takes taskllist_lock only for sig_kernel_stop()
case nowadays. I beleive it races with copy_process().

The first race is simple, copy_process:

	/*
	 * Check for pending SIGKILL! The new thread should not be allowed
	 * to slip out of an OOM kill. (or normal SIGKILL.)
	 */
	if (sigismember(&current->pending.signal, SIGKILL))
		return EINTR;

This relies on tasklist_lock and is racy now.


The second race is more tricky, copy_process:

	attach_pid(p, PIDTYPE_PID, p->pid);
	attach_pid(p, PIDTYPE_TGID, p->tgid);

This means that we can find a task in kill_proc_info()->find_task_by_pid()
which is not registered as part of thread group yet. Various bad things can
happen, note that handle_stop_signal(SIGCONT) and __group_complete_signal()
iterate over threads list. But p->pids[PIDTYPE_TGID] is a copy of current's
'struct pid' from dup_task_struct(), and if we don't have CLONE_THREAD here
we will use completely unreleated (parent's) thread list.

I think we can solve these problems by enlarging a ->siglock's scope in
copy_process(), but I don't know how to test this patch.

NOTE: release_task()->__unhash_process() path is safe, we already have
->sighand == NULL while detaching PIDTYPE_PID/PIDTYPE_TGID nonatomically.

Unless I missed something, I personally think this is a 2.6.16 material.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>

--- RC-1/kernel/fork.c~	2006-02-06 22:04:40.000000000 +0300
+++ RC-1/kernel/fork.c	2006-02-06 22:11:51.000000000 +0300
@@ -1050,7 +1050,7 @@ static task_t *copy_process(unsigned lon
 	sched_fork(p, clone_flags);
 
 	/* Need tasklist lock for parent etc handling! */
-	write_lock_irq(&tasklist_lock);
+	write_lock(&tasklist_lock);
 
 	/*
 	 * The task hasn't been attached yet, so its cpus_allowed mask will
@@ -1066,33 +1066,34 @@ static task_t *copy_process(unsigned lon
 			!cpu_online(task_cpu(p))))
 		set_task_cpu(p, smp_processor_id());
 
+	/* CLONE_PARENT re-uses the old parent */
+	if (clone_flags & (CLONE_PARENT|CLONE_THREAD))
+		p->real_parent = current->real_parent;
+	else
+		p->real_parent = current;
+	p->parent = p->real_parent;
+
+	spin_lock_irq(&current->sighand->siglock);
 	/*
 	 * Check for pending SIGKILL! The new thread should not be allowed
 	 * to slip out of an OOM kill. (or normal SIGKILL.)
 	 */
 	if (sigismember(&current->pending.signal, SIGKILL)) {
-		write_unlock_irq(&tasklist_lock);
+		spin_unlock_irq(&current->sighand->siglock);
+		write_unlock(&tasklist_lock);
 		retval = -EINTR;
 		goto bad_fork_cleanup_namespace;
 	}
 
-	/* CLONE_PARENT re-uses the old parent */
-	if (clone_flags & (CLONE_PARENT|CLONE_THREAD))
-		p->real_parent = current->real_parent;
-	else
-		p->real_parent = current;
-	p->parent = p->real_parent;
-
 	if (clone_flags & CLONE_THREAD) {
-		spin_lock(&current->sighand->siglock);
 		/*
 		 * Important: if an exit-all has been started then
 		 * do not create this new thread - the whole thread
 		 * group is supposed to exit anyway.
 		 */
 		if (current->signal->flags & SIGNAL_GROUP_EXIT) {
-			spin_unlock(&current->sighand->siglock);
-			write_unlock_irq(&tasklist_lock);
+			spin_unlock_irq(&current->sighand->siglock);
+			write_unlock(&tasklist_lock);
 			retval = -EAGAIN;
 			goto bad_fork_cleanup_namespace;
 		}
@@ -1122,8 +1123,6 @@ static task_t *copy_process(unsigned lon
 			 */
 			p->it_prof_expires = jiffies_to_cputime(1);
 		}
-
-		spin_unlock(&current->sighand->siglock);
 	}
 
 	/*
@@ -1151,7 +1150,9 @@ static task_t *copy_process(unsigned lon
 
 	nr_threads++;
 	total_forks++;
-	write_unlock_irq(&tasklist_lock);
+	spin_unlock_irq(&current->sighand->siglock);
+	write_unlock(&tasklist_lock);
+
 	proc_fork_connector(p);
 	return p;

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] fix kill_proc_info() vs copy_process() race
  2006-02-06 16:45 [PATCH] fix kill_proc_info() vs copy_process() race Oleg Nesterov
@ 2006-02-06 17:10 ` Oleg Nesterov
  2006-02-06 20:25   ` Oleg Nesterov
  2006-02-14 22:32 ` Paul E. McKenney
  1 sibling, 1 reply; 16+ messages in thread
From: Oleg Nesterov @ 2006-02-06 17:10 UTC (permalink / raw)
  To: Ingo Molnar, Paul E. McKenney, linux-kernel, Roland McGrath,
	Linus Torvalds, Andrew Morton

Oleg Nesterov wrote:
> 
> This means that we can find a task in kill_proc_info()->find_task_by_pid()
> which is not registered as part of thread group yet. Various bad things can
> happen, note that handle_stop_signal(SIGCONT) and __group_complete_signal()
> iterate over threads list. But p->pids[PIDTYPE_TGID] is a copy of current's
> 'struct pid' from dup_task_struct(), and if we don't have CLONE_THREAD here
> we will use completely unreleated (parent's) thread list.
> 
> I think we can solve these problems by enlarging a ->siglock's scope in
> copy_process(), but I don't know how to test this patch.
> 
> NOTE: release_task()->__unhash_process() path is safe, we already have
> ->sighand == NULL while detaching PIDTYPE_PID/PIDTYPE_TGID nonatomically.

Sorry, I was wrong. Without CLONE_THREAD current->sighand.siglock can't help,
we need p->sighand.siglock, I beleive.

Am I correct that the bug exists at least?

Oleg.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] fix kill_proc_info() vs copy_process() race
  2006-02-06 17:10 ` Oleg Nesterov
@ 2006-02-06 20:25   ` Oleg Nesterov
  0 siblings, 0 replies; 16+ messages in thread
From: Oleg Nesterov @ 2006-02-06 20:25 UTC (permalink / raw)
  To: Ingo Molnar, Paul E. McKenney, linux-kernel, Roland McGrath,
	Linus Torvalds, Andrew Morton

Oleg Nesterov wrote:
> 
> Sorry, I was wrong. Without CLONE_THREAD current->sighand.siglock can't help,
> we need p->sighand.siglock, I beleive.

Also, it is stupid to do write_lock(&tasklist_lock) without clearing irqs.

Ok, may be something like (untested, for review only) patch below ?

attach_pid() does wmb(), group_send_sig_info()->rcu_dereference() calls
rmb(), so we can just reverse PIDTYPE_PID/PIDTYPE_TGID attaching?

Note that now we check sigismember(->pending, SIGKILL) holding both
tasklist and ->sighand.siglock, this means we can kill SIGNAL_GROUP_EXIT
check under 'if (clone_flags & CLONE_THREAD)':

	__group_complete_signal() and zap_other_threads() need at least
	->sighand.siglock and they send SIGKILL without unlocking.

	We can miss SIGNAL_GROUP_EXIT from do_coredump(), but it is possible
	anyway. The new thread will be killed later, from zap_threads().
	
What do you think?

Oleg.

--- RC-1/kernel/fork.c~	2006-02-07 01:41:14.000000000 +0300
+++ RC-1/kernel/fork.c	2006-02-07 02:13:10.000000000 +0300
@@ -1066,11 +1066,13 @@ static task_t *copy_process(unsigned lon
 			!cpu_online(task_cpu(p))))
 		set_task_cpu(p, smp_processor_id());
 
+	spin_lock(&current->sighand->siglock);
 	/*
 	 * Check for pending SIGKILL! The new thread should not be allowed
 	 * to slip out of an OOM kill. (or normal SIGKILL.)
 	 */
 	if (sigismember(&current->pending.signal, SIGKILL)) {
+		spin_unlock(&current->sighand->siglock);
 		write_unlock_irq(&tasklist_lock);
 		retval = -EINTR;
 		goto bad_fork_cleanup_namespace;
@@ -1084,18 +1086,6 @@ static task_t *copy_process(unsigned lon
 	p->parent = p->real_parent;
 
 	if (clone_flags & CLONE_THREAD) {
-		spin_lock(&current->sighand->siglock);
-		/*
-		 * Important: if an exit-all has been started then
-		 * do not create this new thread - the whole thread
-		 * group is supposed to exit anyway.
-		 */
-		if (current->signal->flags & SIGNAL_GROUP_EXIT) {
-			spin_unlock(&current->sighand->siglock);
-			write_unlock_irq(&tasklist_lock);
-			retval = -EAGAIN;
-			goto bad_fork_cleanup_namespace;
-		}
 		p->group_leader = current->group_leader;
 
 		if (current->signal->group_stop_count > 0) {
@@ -1122,8 +1112,6 @@ static task_t *copy_process(unsigned lon
 			 */
 			p->it_prof_expires = jiffies_to_cputime(1);
 		}
-
-		spin_unlock(&current->sighand->siglock);
 	}
 
 	/*
@@ -1135,8 +1123,8 @@ static task_t *copy_process(unsigned lon
 	if (unlikely(p->ptrace & PT_PTRACED))
 		__ptrace_link(p, current->parent);
 
-	attach_pid(p, PIDTYPE_PID, p->pid);
 	attach_pid(p, PIDTYPE_TGID, p->tgid);
+	attach_pid(p, PIDTYPE_PID, p->pid);
 	if (thread_group_leader(p)) {
 		p->signal->tty = current->signal->tty;
 		p->signal->pgrp = process_group(current);
@@ -1149,6 +1137,7 @@ static task_t *copy_process(unsigned lon
 
 	nr_threads++;
 	total_forks++;
+	spin_unlock(&current->sighand->siglock);
 	write_unlock_irq(&tasklist_lock);
 	proc_fork_connector(p);
 	return p;

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] fix kill_proc_info() vs copy_process() race
  2006-02-06 16:45 [PATCH] fix kill_proc_info() vs copy_process() race Oleg Nesterov
  2006-02-06 17:10 ` Oleg Nesterov
@ 2006-02-14 22:32 ` Paul E. McKenney
  2006-02-15 14:05   ` Oleg Nesterov
  1 sibling, 1 reply; 16+ messages in thread
From: Paul E. McKenney @ 2006-02-14 22:32 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ingo Molnar, linux-kernel, Roland McGrath, Linus Torvalds, Andrew Morton

On Mon, Feb 06, 2006 at 07:45:48PM +0300, Oleg Nesterov wrote:
> kill_proc_info() takes taskllist_lock only for sig_kernel_stop()
> case nowadays. I beleive it races with copy_process().

For SIGCONT as well as well as sig_kernel_stop(), but yes, tasklist_lock
is now acquired only sometimes.  (And please accept my apologies for 
the late reply -- trouble with my local email system that I could not
fix from 8,000 miles away...)

> The first race is simple, copy_process:
> 
> 	/*
> 	 * Check for pending SIGKILL! The new thread should not be allowed
> 	 * to slip out of an OOM kill. (or normal SIGKILL.)
> 	 */
> 	if (sigismember(&current->pending.signal, SIGKILL))
> 		return EINTR;
> 
> This relies on tasklist_lock and is racy now.

Agreed, but is the race any worse than it was before?  Since SIGKILL is
fatal, the bit can be set but never cleared.  My belief, quite possibly
mistaken, is that this is a performance issue rather than a correctness
issue -- we would like to avoid the overhead of a fork() for a "walking
dead" process.

If my belief is incorrect, one fix would be to add SIGKILL to the checks
in kill_proc_info().

> The second race is more tricky, copy_process:
> 
> 	attach_pid(p, PIDTYPE_PID, p->pid);
> 	attach_pid(p, PIDTYPE_TGID, p->tgid);
> 
> This means that we can find a task in kill_proc_info()->find_task_by_pid()
> which is not registered as part of thread group yet. Various bad things can
> happen, note that handle_stop_signal(SIGCONT) and __group_complete_signal()
> iterate over threads list. But p->pids[PIDTYPE_TGID] is a copy of current's
> 'struct pid' from dup_task_struct(), and if we don't have CLONE_THREAD here
> we will use completely unreleated (parent's) thread list.

But handle_stop_signal() will not do anything except for sig_kernel_stop(),
SIGCONT, and SIGKILL.  The first two will have tasklist_lock held just
as before.  The latter (SIGKILL) does not iterate over anything, instead
it only sets p->signal->flags to zero.

On __group_complete_signal(), just to make sure I understand...  For this
to happen, we would have to kill() a child process while it was being
fork()ed, but before the parent was aware what the child's pid was, right?
And the race you are noting is that we might find the embryo process just
as it is checking for thread_group_leader(p), but before p->signal->tty,
p->signal_pgrp, and p->signal->session are initialized?

> I think we can solve these problems by enlarging a ->siglock's scope in
> copy_process(), but I don't know how to test this patch.

I have not convinced myself that this is a bug, but it might well
be.  The reason I am unconvinced is that if we have not done all the
attach_pid()s, the signal should not be able to find us.  Yes, we do
have a copy of the parent's p->pids[PIDTYPE_TGID], but this process is
not linked into the lists -- the process can find the parent, but not
vice versa.

But I could easily be missing something, still a bit jetlagged.  Could
you please lay out the exact sequence of events in the scenario that you
are thinking of?

And if there is a real problem, is it possible to fix it by changing
the order of the attach_pid() calls?

> NOTE: release_task()->__unhash_process() path is safe, we already have
> ->sighand == NULL while detaching PIDTYPE_PID/PIDTYPE_TGID nonatomically.

Agreed.

							Thanx, Paul

> Unless I missed something, I personally think this is a 2.6.16 material.
> 
> Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
> 
> --- RC-1/kernel/fork.c~	2006-02-06 22:04:40.000000000 +0300
> +++ RC-1/kernel/fork.c	2006-02-06 22:11:51.000000000 +0300
> @@ -1050,7 +1050,7 @@ static task_t *copy_process(unsigned lon
>  	sched_fork(p, clone_flags);
>  
>  	/* Need tasklist lock for parent etc handling! */
> -	write_lock_irq(&tasklist_lock);
> +	write_lock(&tasklist_lock);
>  
>  	/*
>  	 * The task hasn't been attached yet, so its cpus_allowed mask will
> @@ -1066,33 +1066,34 @@ static task_t *copy_process(unsigned lon
>  			!cpu_online(task_cpu(p))))
>  		set_task_cpu(p, smp_processor_id());
>  
> +	/* CLONE_PARENT re-uses the old parent */
> +	if (clone_flags & (CLONE_PARENT|CLONE_THREAD))
> +		p->real_parent = current->real_parent;
> +	else
> +		p->real_parent = current;
> +	p->parent = p->real_parent;
> +
> +	spin_lock_irq(&current->sighand->siglock);
>  	/*
>  	 * Check for pending SIGKILL! The new thread should not be allowed
>  	 * to slip out of an OOM kill. (or normal SIGKILL.)
>  	 */
>  	if (sigismember(&current->pending.signal, SIGKILL)) {
> -		write_unlock_irq(&tasklist_lock);
> +		spin_unlock_irq(&current->sighand->siglock);
> +		write_unlock(&tasklist_lock);
>  		retval = -EINTR;
>  		goto bad_fork_cleanup_namespace;
>  	}
>  
> -	/* CLONE_PARENT re-uses the old parent */
> -	if (clone_flags & (CLONE_PARENT|CLONE_THREAD))
> -		p->real_parent = current->real_parent;
> -	else
> -		p->real_parent = current;
> -	p->parent = p->real_parent;
> -
>  	if (clone_flags & CLONE_THREAD) {
> -		spin_lock(&current->sighand->siglock);
>  		/*
>  		 * Important: if an exit-all has been started then
>  		 * do not create this new thread - the whole thread
>  		 * group is supposed to exit anyway.
>  		 */
>  		if (current->signal->flags & SIGNAL_GROUP_EXIT) {
> -			spin_unlock(&current->sighand->siglock);
> -			write_unlock_irq(&tasklist_lock);
> +			spin_unlock_irq(&current->sighand->siglock);
> +			write_unlock(&tasklist_lock);
>  			retval = -EAGAIN;
>  			goto bad_fork_cleanup_namespace;
>  		}
> @@ -1122,8 +1123,6 @@ static task_t *copy_process(unsigned lon
>  			 */
>  			p->it_prof_expires = jiffies_to_cputime(1);
>  		}
> -
> -		spin_unlock(&current->sighand->siglock);
>  	}
>  
>  	/*
> @@ -1151,7 +1150,9 @@ static task_t *copy_process(unsigned lon
>  
>  	nr_threads++;
>  	total_forks++;
> -	write_unlock_irq(&tasklist_lock);
> +	spin_unlock_irq(&current->sighand->siglock);
> +	write_unlock(&tasklist_lock);
> +
>  	proc_fork_connector(p);
>  	return p;
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] fix kill_proc_info() vs copy_process() race
  2006-02-14 22:32 ` Paul E. McKenney
@ 2006-02-15 14:05   ` Oleg Nesterov
  2006-02-15 19:13     ` [PATCH 1/2] fix kill_proc_info() vs CLONE_THREAD race Oleg Nesterov
  2006-02-15 19:13     ` [PATCH 2/2] fix kill_proc_info() vs fork() theoretical race Oleg Nesterov
  0 siblings, 2 replies; 16+ messages in thread
From: Oleg Nesterov @ 2006-02-15 14:05 UTC (permalink / raw)
  To: paulmck
  Cc: Ingo Molnar, linux-kernel, Roland McGrath, Linus Torvalds, Andrew Morton

Paul E. McKenney wrote:
> 
> On Mon, Feb 06, 2006 at 07:45:48PM +0300, Oleg Nesterov wrote:
> > The first race is simple, copy_process:
> > 
> > 	/*
> > 	 * Check for pending SIGKILL! The new thread should not be allowed
> > 	 * to slip out of an OOM kill. (or normal SIGKILL.)
> > 	 */
> > 	if (sigismember(&current->pending.signal, SIGKILL))
> > 		return EINTR;
> > 
> > This relies on tasklist_lock and is racy now.
> 
> Agreed, but is the race any worse than it was before?  Since SIGKILL is
> fatal, the bit can be set but never cleared.  My belief, quite possibly
> mistaken, is that this is a performance issue rather than a correctness
> issue -- we would like to avoid the overhead of a fork() for a "walking
> dead" process.

My apologies, I was very unclear. I talked about this check because I tried
to unify it with 'if (SIGNAL_GROUP_EXIT)' below. Let me try again.

copy_process(CLONE_THREAD)					__group_complete_signal(SIGKILL)

	lock(->sighand);
	if (->signal->flags & SIGNAL_GROUP_EXIT) // NO
		...abort forking...
	unlock(->sighand)

								->signal->flags = SIGNAL_GROUP_EXIT;
								// does not see the new thread yet
								for_each_thread_in_thread_group(t) {
									sigaddset(t->pending, SIGKILL);
									signal_wake_up(t);
								}

	... finish clone ...


The new thread starts without TIF_SIGPENDING. When any of other threads calls
get_signal_to_deliver() it will notice SIGKILL and call do_group_exit(), which
does:

	if (SIGNAL_GROUP_EXIT) // Yes, was set in group_complete_signal()
		// don't call zap_other_threads()
	do_exit();

So, thread group missed SIGKILL. The new thread runs with SIGNAL_GROUP_EXIT set
and has SIGKILL in ->shared_pending, so it can't be killed via sys_kill(SIGKILL),
and it can't be stopped.

This is not fatal, we can kill this thread via tkill(), even if it blocked other
signals, but still this is a bug (if I am right).

> > The second race is more tricky, copy_process:
> > 
> > 	attach_pid(p, PIDTYPE_PID, p->pid);
> > 	attach_pid(p, PIDTYPE_TGID, p->tgid);
> > 
> > This means that we can find a task in kill_proc_info()->find_task_by_pid()
> > which is not registered as part of thread group yet. Various bad things can
> > happen, note that handle_stop_signal(SIGCONT) and __group_complete_signal()
> > iterate over threads list. But p->pids[PIDTYPE_TGID] is a copy of current's
> > 'struct pid' from dup_task_struct(), and if we don't have CLONE_THREAD here
> > we will use completely unreleated (parent's) thread list.
> 
> But I could easily be missing something, still a bit jetlagged.  Could
> you please lay out the exact sequence of events in the scenario that you
> are thinking of?

Let's suppose that process with pid == 1000 does fork (no CLONE_THREAD bit),
and a bad boy does sys_kill(1001, SIGXXX)

copy_process:

	// it is possible that p->pid == 1001
	attach_pid(p, PIDTYPE_PID, p->pid);


						kill_proc_info:

						    p = find_task_by_pid(1001); // Found!

						    __group_complete_signal(p):

						        // iterate over thread group
						        do {
						        	...
						        } while (next_thread(t) != p)

The (one of) problem is that this loop never stops: next_thread() will iterate
over parent's threads list, because p have a copy of the parent's pids[PIDTYPE_TGID],
and p is not a member of this thread group. Unless I missed something, we have
an endless loop with interrupts disabled.

> And if there is a real problem, is it possible to fix it by changing
> the order of the attach_pid() calls?

I think yes, and I did exactly that in my next attempt to fix this problem.

Oleg.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/2] fix kill_proc_info() vs CLONE_THREAD race
  2006-02-15 14:05   ` Oleg Nesterov
@ 2006-02-15 19:13     ` Oleg Nesterov
  2006-02-16 19:19       ` Paul E. McKenney
  2006-02-15 19:13     ` [PATCH 2/2] fix kill_proc_info() vs fork() theoretical race Oleg Nesterov
  1 sibling, 1 reply; 16+ messages in thread
From: Oleg Nesterov @ 2006-02-15 19:13 UTC (permalink / raw)
  To: paulmck, Ingo Molnar
  Cc: linux-kernel, Roland McGrath, Linus Torvalds, Andrew Morton

There is a window after copy_process() unlocks ->sighand.siglock
and before it adds the new thread to the thread list.

In that window __group_complete_signal(SIGKILL) will not see the
new thread yet, so this thread will start running while the whole
thread group was supposed to exit.

I beleive we have another good reason to place attach_pid(PID/TGID)
under ->sighand.siglock. We can do the same for

	release_task()->__unhash_process()

	de_thread()->switch_exec_pids()

After that we don't need tasklist_lock to iterate over the thread
list, and we can simplify things, see for example do_sigaction()
or sys_times().

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>

--- 2.6.16-rc3/kernel/fork.c~1_KILL	2006-02-15 22:52:07.000000000 +0300
+++ 2.6.16-rc3/kernel/fork.c	2006-02-15 23:21:51.000000000 +0300
@@ -1123,8 +1123,8 @@ static task_t *copy_process(unsigned lon
 		p->real_parent = current;
 	p->parent = p->real_parent;
 
+	spin_lock(&current->sighand->siglock);
 	if (clone_flags & CLONE_THREAD) {
-		spin_lock(&current->sighand->siglock);
 		/*
 		 * Important: if an exit-all has been started then
 		 * do not create this new thread - the whole thread
@@ -1162,8 +1162,6 @@ static task_t *copy_process(unsigned lon
 			 */
 			p->it_prof_expires = jiffies_to_cputime(1);
 		}
-
-		spin_unlock(&current->sighand->siglock);
 	}
 
 	/*
@@ -1189,6 +1187,7 @@ static task_t *copy_process(unsigned lon
 
 	nr_threads++;
 	total_forks++;
+	spin_unlock(&current->sighand->siglock);
 	write_unlock_irq(&tasklist_lock);
 	proc_fork_connector(p);
 	return p;

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 2/2] fix kill_proc_info() vs fork() theoretical race
  2006-02-15 14:05   ` Oleg Nesterov
  2006-02-15 19:13     ` [PATCH 1/2] fix kill_proc_info() vs CLONE_THREAD race Oleg Nesterov
@ 2006-02-15 19:13     ` Oleg Nesterov
  2006-02-16 19:26       ` Paul E. McKenney
  1 sibling, 1 reply; 16+ messages in thread
From: Oleg Nesterov @ 2006-02-15 19:13 UTC (permalink / raw)
  To: paulmck, Ingo Molnar
  Cc: linux-kernel, Roland McGrath, Linus Torvalds, Andrew Morton

copy_process:

	attach_pid(p, PIDTYPE_PID, p->pid);
	attach_pid(p, PIDTYPE_TGID, p->tgid);

What if kill_proc_info(p->pid) happens in between?

copy_process() holds current->sighand.siglock, so we are safe
in CLONE_THREAD case, because current->sighand == p->sighand.

Otherwise, p->sighand is unlocked, the new process is already
visible to the find_task_by_pid(), but have a copy of parent's
'struct pid' in ->pids[PIDTYPE_TGID].

This means that __group_complete_signal() may hang while doing

	do ... while (next_thread() != p)

We can solve this problem if we reverse these 2 attach_pid()s:

	attach_pid() does wmb()

	group_send_sig_info() calls spin_lock(), which
	provides a read barrier. // Yes ?

I don't think we can hit this race in practice, but still.

Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>

--- 2.6.16-rc3/kernel/fork.c~2_HANG	2006-02-15 23:21:51.000000000 +0300
+++ 2.6.16-rc3/kernel/fork.c	2006-02-16 00:03:20.000000000 +0300
@@ -1173,8 +1173,6 @@ static task_t *copy_process(unsigned lon
 	if (unlikely(p->ptrace & PT_PTRACED))
 		__ptrace_link(p, current->parent);
 
-	attach_pid(p, PIDTYPE_PID, p->pid);
-	attach_pid(p, PIDTYPE_TGID, p->tgid);
 	if (thread_group_leader(p)) {
 		p->signal->tty = current->signal->tty;
 		p->signal->pgrp = process_group(current);
@@ -1184,6 +1182,8 @@ static task_t *copy_process(unsigned lon
 		if (p->pid)
 			__get_cpu_var(process_counts)++;
 	}
+	attach_pid(p, PIDTYPE_TGID, p->tgid);
+	attach_pid(p, PIDTYPE_PID, p->pid);
 
 	nr_threads++;
 	total_forks++;

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] fix kill_proc_info() vs CLONE_THREAD race
  2006-02-15 19:13     ` [PATCH 1/2] fix kill_proc_info() vs CLONE_THREAD race Oleg Nesterov
@ 2006-02-16 19:19       ` Paul E. McKenney
  2006-02-16 21:16         ` Oleg Nesterov
  0 siblings, 1 reply; 16+ messages in thread
From: Paul E. McKenney @ 2006-02-16 19:19 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ingo Molnar, linux-kernel, Roland McGrath, Linus Torvalds, Andrew Morton

On Wed, Feb 15, 2006 at 10:13:24PM +0300, Oleg Nesterov wrote:
> There is a window after copy_process() unlocks ->sighand.siglock
> and before it adds the new thread to the thread list.
> 
> In that window __group_complete_signal(SIGKILL) will not see the
> new thread yet, so this thread will start running while the whole
> thread group was supposed to exit.

The fix looks good to me!

> I beleive we have another good reason to place attach_pid(PID/TGID)
> under ->sighand.siglock. We can do the same for
> 
> 	release_task()->__unhash_process()
> 
> 	de_thread()->switch_exec_pids()
> 
> After that we don't need tasklist_lock to iterate over the thread
> list, and we can simplify things, see for example do_sigaction()
> or sys_times().

The above proposal would require that we hold siglock during the
traversal, correct?  Is that reasonable for non-signal-related traversals?
Or were you thinking of making this change only for signal code?

						Thanx, Paul

Acked-by: <paulmck@us.ibm.com>
> Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
> 
> --- 2.6.16-rc3/kernel/fork.c~1_KILL	2006-02-15 22:52:07.000000000 +0300
> +++ 2.6.16-rc3/kernel/fork.c	2006-02-15 23:21:51.000000000 +0300
> @@ -1123,8 +1123,8 @@ static task_t *copy_process(unsigned lon
>  		p->real_parent = current;
>  	p->parent = p->real_parent;
>  
> +	spin_lock(&current->sighand->siglock);
>  	if (clone_flags & CLONE_THREAD) {
> -		spin_lock(&current->sighand->siglock);
>  		/*
>  		 * Important: if an exit-all has been started then
>  		 * do not create this new thread - the whole thread
> @@ -1162,8 +1162,6 @@ static task_t *copy_process(unsigned lon
>  			 */
>  			p->it_prof_expires = jiffies_to_cputime(1);
>  		}
> -
> -		spin_unlock(&current->sighand->siglock);
>  	}
>  
>  	/*
> @@ -1189,6 +1187,7 @@ static task_t *copy_process(unsigned lon
>  
>  	nr_threads++;
>  	total_forks++;
> +	spin_unlock(&current->sighand->siglock);
>  	write_unlock_irq(&tasklist_lock);
>  	proc_fork_connector(p);
>  	return p;
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] fix kill_proc_info() vs fork() theoretical race
  2006-02-15 19:13     ` [PATCH 2/2] fix kill_proc_info() vs fork() theoretical race Oleg Nesterov
@ 2006-02-16 19:26       ` Paul E. McKenney
  2006-02-16 20:56         ` Oleg Nesterov
  0 siblings, 1 reply; 16+ messages in thread
From: Paul E. McKenney @ 2006-02-16 19:26 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ingo Molnar, linux-kernel, Roland McGrath, Linus Torvalds, Andrew Morton

On Wed, Feb 15, 2006 at 10:13:26PM +0300, Oleg Nesterov wrote:
> copy_process:
> 
> 	attach_pid(p, PIDTYPE_PID, p->pid);
> 	attach_pid(p, PIDTYPE_TGID, p->tgid);
> 
> What if kill_proc_info(p->pid) happens in between?

Doesn't your patch 1/2 that expanded the scope of siglock in
copy_process() prevent this from happening?

o	A new process is being created on CPU 0, and does the first
	attach_pid() in copy_process(), but has not yet done
	the second attach_pid().

o	Meanwhile, on CPU 1, kill_proc_info() successfully looks up the
	new process via find_task_by_pid().

o	Also on CPU 1, kill_proc_info() calls group_send_sig_info(),
	which checks permissions, locates the sighand structure,
	then attempts to acquire siglock.

	Given your patch 1/2, CPU 1 cannot proceed until CPU 0 gets
	done with the remaining attach_pid() calls.

So, what am I missing this time?  ;-)

						Thanx, Paul

> copy_process() holds current->sighand.siglock, so we are safe
> in CLONE_THREAD case, because current->sighand == p->sighand.
> 
> Otherwise, p->sighand is unlocked, the new process is already
> visible to the find_task_by_pid(), but have a copy of parent's
> 'struct pid' in ->pids[PIDTYPE_TGID].
> 
> This means that __group_complete_signal() may hang while doing
> 
> 	do ... while (next_thread() != p)
> 
> We can solve this problem if we reverse these 2 attach_pid()s:
> 
> 	attach_pid() does wmb()
> 
> 	group_send_sig_info() calls spin_lock(), which
> 	provides a read barrier. // Yes ?
> 
> I don't think we can hit this race in practice, but still.
> 
> Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
> 
> --- 2.6.16-rc3/kernel/fork.c~2_HANG	2006-02-15 23:21:51.000000000 +0300
> +++ 2.6.16-rc3/kernel/fork.c	2006-02-16 00:03:20.000000000 +0300
> @@ -1173,8 +1173,6 @@ static task_t *copy_process(unsigned lon
>  	if (unlikely(p->ptrace & PT_PTRACED))
>  		__ptrace_link(p, current->parent);
>  
> -	attach_pid(p, PIDTYPE_PID, p->pid);
> -	attach_pid(p, PIDTYPE_TGID, p->tgid);
>  	if (thread_group_leader(p)) {
>  		p->signal->tty = current->signal->tty;
>  		p->signal->pgrp = process_group(current);
> @@ -1184,6 +1182,8 @@ static task_t *copy_process(unsigned lon
>  		if (p->pid)
>  			__get_cpu_var(process_counts)++;
>  	}
> +	attach_pid(p, PIDTYPE_TGID, p->tgid);
> +	attach_pid(p, PIDTYPE_PID, p->pid);
>  
>  	nr_threads++;
>  	total_forks++;
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] fix kill_proc_info() vs fork() theoretical race
  2006-02-16 20:56         ` Oleg Nesterov
@ 2006-02-16 19:53           ` Paul E. McKenney
  2006-02-16 21:20             ` Oleg Nesterov
  0 siblings, 1 reply; 16+ messages in thread
From: Paul E. McKenney @ 2006-02-16 19:53 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ingo Molnar, linux-kernel, Roland McGrath, Linus Torvalds, Andrew Morton

On Thu, Feb 16, 2006 at 11:56:12PM +0300, Oleg Nesterov wrote:
> "Paul E. McKenney" wrote:
> > 
> > On Wed, Feb 15, 2006 at 10:13:26PM +0300, Oleg Nesterov wrote:
> > > copy_process:
> > >
> > >       attach_pid(p, PIDTYPE_PID, p->pid);
> > >       attach_pid(p, PIDTYPE_TGID, p->tgid);
> > >
> > > What if kill_proc_info(p->pid) happens in between?
> > 
> > Doesn't your patch 1/2 that expanded the scope of siglock in
> > copy_process() prevent this from happening?
> 
> I think, no. Please see below,
> 
> > o       A new process is being created on CPU 0, and does the first
> >         attach_pid() in copy_process(), but has not yet done
> >         the second attach_pid().
> > 
> > o       Meanwhile, on CPU 1, kill_proc_info() successfully looks up the
> >         new process via find_task_by_pid().
> > 
> > o       Also on CPU 1, kill_proc_info() calls group_send_sig_info(),
> >         which checks permissions, locates the sighand structure,
> >         then attempts to acquire siglock.
> 
> ... and takes it. Without CLONE_THREAD (more precisely, CLONE_SIGHAND)
> we have different ->sighand for parent (current) and for the new child.
> 
> copy_process() holds parents's ->sighand, while group_send_sig_info()
> takes child's.

Good point!!!

The other thing to think through is tkill on a thread/process while it
is being created.  I believe that this is OK, since thread-specific
kill must target a specific thread, so does not do the traversal.

Does this match your understanding?

						Thanx, Paul

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] fix kill_proc_info() vs fork() theoretical race
  2006-02-16 19:26       ` Paul E. McKenney
@ 2006-02-16 20:56         ` Oleg Nesterov
  2006-02-16 19:53           ` Paul E. McKenney
  0 siblings, 1 reply; 16+ messages in thread
From: Oleg Nesterov @ 2006-02-16 20:56 UTC (permalink / raw)
  To: paulmck
  Cc: Ingo Molnar, linux-kernel, Roland McGrath, Linus Torvalds, Andrew Morton

"Paul E. McKenney" wrote:
> 
> On Wed, Feb 15, 2006 at 10:13:26PM +0300, Oleg Nesterov wrote:
> > copy_process:
> >
> >       attach_pid(p, PIDTYPE_PID, p->pid);
> >       attach_pid(p, PIDTYPE_TGID, p->tgid);
> >
> > What if kill_proc_info(p->pid) happens in between?
> 
> Doesn't your patch 1/2 that expanded the scope of siglock in
> copy_process() prevent this from happening?

I think, no. Please see below,

> o       A new process is being created on CPU 0, and does the first
>         attach_pid() in copy_process(), but has not yet done
>         the second attach_pid().
> 
> o       Meanwhile, on CPU 1, kill_proc_info() successfully looks up the
>         new process via find_task_by_pid().
> 
> o       Also on CPU 1, kill_proc_info() calls group_send_sig_info(),
>         which checks permissions, locates the sighand structure,
>         then attempts to acquire siglock.

... and takes it. Without CLONE_THREAD (more precisely, CLONE_SIGHAND)
we have different ->sighand for parent (current) and for the new child.

copy_process() holds parents's ->sighand, while group_send_sig_info()
takes child's.

Oleg.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/2] fix kill_proc_info() vs CLONE_THREAD race
  2006-02-16 19:19       ` Paul E. McKenney
@ 2006-02-16 21:16         ` Oleg Nesterov
  0 siblings, 0 replies; 16+ messages in thread
From: Oleg Nesterov @ 2006-02-16 21:16 UTC (permalink / raw)
  To: paulmck
  Cc: Ingo Molnar, linux-kernel, Roland McGrath, Linus Torvalds, Andrew Morton

"Paul E. McKenney" wrote:
> 
> > After that we don't need tasklist_lock to iterate over the thread
> > list, and we can simplify things, see for example do_sigaction()
> > or sys_times().
> 
> The above proposal would require that we hold siglock during the
> traversal, correct?

Yes, of course.

>                     Is that reasonable for non-signal-related traversals?
> Or were you thinking of making this change only for signal code?

Yes, I think it may be useful for non-signal-related traversals.

Currently we need tasklist_lock in order to use next_thread().
I beleive, we can migrate to rcu_read_lock+spinlock(sighand)
in most cases.

Well, next_thread() itself is safe already, but it can return
already zapped threads.

Oleg.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] fix kill_proc_info() vs fork() theoretical race
  2006-02-16 19:53           ` Paul E. McKenney
@ 2006-02-16 21:20             ` Oleg Nesterov
  2006-02-18  2:06               ` Paul E. McKenney
  0 siblings, 1 reply; 16+ messages in thread
From: Oleg Nesterov @ 2006-02-16 21:20 UTC (permalink / raw)
  To: paulmck
  Cc: Ingo Molnar, linux-kernel, Roland McGrath, Linus Torvalds, Andrew Morton

"Paul E. McKenney" wrote:
> 
> The other thing to think through is tkill on a thread/process while it
> is being created.  I believe that this is OK, since thread-specific
> kill must target a specific thread, so does not do the traversal.
> 

Also, tkill was not converted to use rcu_read_lock yet, it still
takes tasklist_lock, so I think it is safe.

Oleg.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] fix kill_proc_info() vs fork() theoretical race
  2006-02-16 21:20             ` Oleg Nesterov
@ 2006-02-18  2:06               ` Paul E. McKenney
  2006-02-18 18:19                 ` Oleg Nesterov
  0 siblings, 1 reply; 16+ messages in thread
From: Paul E. McKenney @ 2006-02-18  2:06 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ingo Molnar, linux-kernel, Roland McGrath, Linus Torvalds, Andrew Morton

On Fri, Feb 17, 2006 at 12:20:08AM +0300, Oleg Nesterov wrote:
> "Paul E. McKenney" wrote:
> > 
> > The other thing to think through is tkill on a thread/process while it
> > is being created.  I believe that this is OK, since thread-specific
> > kill must target a specific thread, so does not do the traversal.
> 
> Also, tkill was not converted to use rcu_read_lock yet, it still
> takes tasklist_lock, so I think it is safe.

I suspect that tkill will eventually need to avoid tasklist_lock...  ;-)

						Thanx, Paul

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] fix kill_proc_info() vs fork() theoretical race
  2006-02-18  2:06               ` Paul E. McKenney
@ 2006-02-18 18:19                 ` Oleg Nesterov
  2006-02-20 17:49                   ` Paul E. McKenney
  0 siblings, 1 reply; 16+ messages in thread
From: Oleg Nesterov @ 2006-02-18 18:19 UTC (permalink / raw)
  To: paulmck
  Cc: Ingo Molnar, linux-kernel, Roland McGrath, Linus Torvalds, Andrew Morton

"Paul E. McKenney" wrote:
> 
> On Fri, Feb 17, 2006 at 12:20:08AM +0300, Oleg Nesterov wrote:
> > "Paul E. McKenney" wrote:
> > >
> > > The other thing to think through is tkill on a thread/process while it
> > > is being created.  I believe that this is OK, since thread-specific
> > > kill must target a specific thread, so does not do the traversal.
> >
> > Also, tkill was not converted to use rcu_read_lock yet, it still
> > takes tasklist_lock, so I think it is safe.
> 
> I suspect that tkill will eventually need to avoid tasklist_lock...  ;-)

Ok, I am sending a couple of preparation patches for this.

Paul, I didn't beleive you when you started this work. Now I think
we can avoid tasklist AND cleanup the code in many places. I am glad
I was wrong.

Btw,
>
> firing off some steamroller tests on it.

Could you point me to these tests?

Oleg.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 2/2] fix kill_proc_info() vs fork() theoretical race
  2006-02-18 18:19                 ` Oleg Nesterov
@ 2006-02-20 17:49                   ` Paul E. McKenney
  0 siblings, 0 replies; 16+ messages in thread
From: Paul E. McKenney @ 2006-02-20 17:49 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Ingo Molnar, linux-kernel, Roland McGrath, Linus Torvalds, Andrew Morton

On Sat, Feb 18, 2006 at 09:19:36PM +0300, Oleg Nesterov wrote:
> "Paul E. McKenney" wrote:
> > 
> > On Fri, Feb 17, 2006 at 12:20:08AM +0300, Oleg Nesterov wrote:
> > > "Paul E. McKenney" wrote:
> > > >
> > > > The other thing to think through is tkill on a thread/process while it
> > > > is being created.  I believe that this is OK, since thread-specific
> > > > kill must target a specific thread, so does not do the traversal.
> > >
> > > Also, tkill was not converted to use rcu_read_lock yet, it still
> > > takes tasklist_lock, so I think it is safe.
> > 
> > I suspect that tkill will eventually need to avoid tasklist_lock...  ;-)
> 
> Ok, I am sending a couple of preparation patches for this.
> 
> Paul, I didn't beleive you when you started this work. Now I think
> we can avoid tasklist AND cleanup the code in many places. I am glad
> I was wrong.

And I am very glad that you are working this -- you have found some
approaches that are much better than those I would have come up with!

> Btw,
> >
> > firing off some steamroller tests on it.
> 
> Could you point me to these tests?

http://www.rdrop.com/users/paulmck/projects/steamroller/

Contributions of additional tests very welcome!

							Thanx, Paul

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2006-02-20 17:49 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-02-06 16:45 [PATCH] fix kill_proc_info() vs copy_process() race Oleg Nesterov
2006-02-06 17:10 ` Oleg Nesterov
2006-02-06 20:25   ` Oleg Nesterov
2006-02-14 22:32 ` Paul E. McKenney
2006-02-15 14:05   ` Oleg Nesterov
2006-02-15 19:13     ` [PATCH 1/2] fix kill_proc_info() vs CLONE_THREAD race Oleg Nesterov
2006-02-16 19:19       ` Paul E. McKenney
2006-02-16 21:16         ` Oleg Nesterov
2006-02-15 19:13     ` [PATCH 2/2] fix kill_proc_info() vs fork() theoretical race Oleg Nesterov
2006-02-16 19:26       ` Paul E. McKenney
2006-02-16 20:56         ` Oleg Nesterov
2006-02-16 19:53           ` Paul E. McKenney
2006-02-16 21:20             ` Oleg Nesterov
2006-02-18  2:06               ` Paul E. McKenney
2006-02-18 18:19                 ` Oleg Nesterov
2006-02-20 17:49                   ` Paul E. McKenney

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.