All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] posix-cpu-timers: rcu_read_lock/unlock protect find_task_by_vpid call
@ 2010-11-02 13:58 Sergey Senozhatsky
  2010-11-02 15:31 ` Thomas Gleixner
  0 siblings, 1 reply; 14+ messages in thread
From: Sergey Senozhatsky @ 2010-11-02 13:58 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Thomas Gleixner, Peter Zijlstra, Ingo Molnar, linux-kernel

Commit 4221a9918e38b7494cee341dda7b7b4bb8c04bde "Add RCU check for
find_task_by_vpid()" introduced rcu_lockdep_assert to find_task_by_pid_ns.
Add rcu_read_lock/rcu_read_unlock to call find_task_by_vpid.

Tetsuo Handa wrote:

Quoting from one of posts in that thead
http://kerneltrap.org/mailarchive/linux-kernel/2010/2/8/4536388

| Usually tasklist gives enough protection, but if copy_process() fails
| it calls free_pid() lockless and does call_rcu(delayed_put_pid().   
| This means, without rcu lock find_pid_ns() can't scan the hash table
| safely.


Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>

---

diff --git a/kernel/posix-cpu-timers.c b/kernel/posix-cpu-timers.c
index 6842eeb..855bc53 100644
--- a/kernel/posix-cpu-timers.c
+++ b/kernel/posix-cpu-timers.c
@@ -38,11 +38,13 @@ static int check_clock(const clockid_t which_clock)
 		return 0;
 
 	read_lock(&tasklist_lock);
+	rcu_read_lock();
 	p = find_task_by_vpid(pid);
 	if (!p || !(CPUCLOCK_PERTHREAD(which_clock) ?
 		   same_thread_group(p, current) : thread_group_leader(p))) {
 		error = -EINVAL;
 	}
+	rcu_read_unlock();
 	read_unlock(&tasklist_lock);
 
 	return error;
@@ -395,17 +397,21 @@ int posix_cpu_timer_create(struct k_itimer *new_timer)
 		if (pid == 0) {
 			p = current;
 		} else {
+			rcu_read_lock();
 			p = find_task_by_vpid(pid);
 			if (p && !same_thread_group(p, current))
 				p = NULL;
+			rcu_read_unlock();
 		}
 	} else {
 		if (pid == 0) {
 			p = current->group_leader;
 		} else {
+			rcu_read_lock();
 			p = find_task_by_vpid(pid);
 			if (p && !thread_group_leader(p))
 				p = NULL;
+			rcu_read_unlock();
 		}
 	}
 	new_timer->it.cpu.task = p;


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] posix-cpu-timers: rcu_read_lock/unlock protect find_task_by_vpid call
  2010-11-02 13:58 [PATCH] posix-cpu-timers: rcu_read_lock/unlock protect find_task_by_vpid call Sergey Senozhatsky
@ 2010-11-02 15:31 ` Thomas Gleixner
  2010-11-02 16:02   ` Sergey Senozhatsky
  0 siblings, 1 reply; 14+ messages in thread
From: Thomas Gleixner @ 2010-11-02 15:31 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Andrew Morton, Peter Zijlstra, Ingo Molnar, linux-kernel

On Tue, 2 Nov 2010, Sergey Senozhatsky wrote:

> Commit 4221a9918e38b7494cee341dda7b7b4bb8c04bde "Add RCU check for
> find_task_by_vpid()" introduced rcu_lockdep_assert to find_task_by_pid_ns.
> Add rcu_read_lock/rcu_read_unlock to call find_task_by_vpid.
> 
> Tetsuo Handa wrote:
> 
> Quoting from one of posts in that thead
> http://kerneltrap.org/mailarchive/linux-kernel/2010/2/8/4536388
> 
> | Usually tasklist gives enough protection, but if copy_process() fails
> | it calls free_pid() lockless and does call_rcu(delayed_put_pid().   
> | This means, without rcu lock find_pid_ns() can't scan the hash table
> | safely.

We can remove the tasklist_lock while at it. rcu_read_lock is enough.

Thanks,

	tglx
 
> 
> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
> 
> ---
> 
> diff --git a/kernel/posix-cpu-timers.c b/kernel/posix-cpu-timers.c
> index 6842eeb..855bc53 100644
> --- a/kernel/posix-cpu-timers.c
> +++ b/kernel/posix-cpu-timers.c
> @@ -38,11 +38,13 @@ static int check_clock(const clockid_t which_clock)
>  		return 0;
>  
>  	read_lock(&tasklist_lock);
> +	rcu_read_lock();
>  	p = find_task_by_vpid(pid);
>  	if (!p || !(CPUCLOCK_PERTHREAD(which_clock) ?
>  		   same_thread_group(p, current) : thread_group_leader(p))) {
>  		error = -EINVAL;
>  	}
> +	rcu_read_unlock();
>  	read_unlock(&tasklist_lock);
>  
>  	return error;
> @@ -395,17 +397,21 @@ int posix_cpu_timer_create(struct k_itimer *new_timer)
>  		if (pid == 0) {
>  			p = current;
>  		} else {
> +			rcu_read_lock();
>  			p = find_task_by_vpid(pid);
>  			if (p && !same_thread_group(p, current))
>  				p = NULL;
> +			rcu_read_unlock();
>  		}
>  	} else {
>  		if (pid == 0) {
>  			p = current->group_leader;
>  		} else {
> +			rcu_read_lock();
>  			p = find_task_by_vpid(pid);
>  			if (p && !thread_group_leader(p))
>  				p = NULL;
> +			rcu_read_unlock();
>  		}
>  	}
>  	new_timer->it.cpu.task = p;
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] posix-cpu-timers: rcu_read_lock/unlock protect find_task_by_vpid call
  2010-11-02 15:31 ` Thomas Gleixner
@ 2010-11-02 16:02   ` Sergey Senozhatsky
  2010-11-02 16:04     ` Thomas Gleixner
  0 siblings, 1 reply; 14+ messages in thread
From: Sergey Senozhatsky @ 2010-11-02 16:02 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Andrew Morton, Peter Zijlstra, Ingo Molnar, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1859 bytes --]

On (11/02/10 16:31), Thomas Gleixner wrote:
> On Tue, 2 Nov 2010, Sergey Senozhatsky wrote:
> 
> > Commit 4221a9918e38b7494cee341dda7b7b4bb8c04bde "Add RCU check for
> > find_task_by_vpid()" introduced rcu_lockdep_assert to find_task_by_pid_ns.
> > Add rcu_read_lock/rcu_read_unlock to call find_task_by_vpid.
> > 
> > Tetsuo Handa wrote:
> > 
> > Quoting from one of posts in that thead
> > http://kerneltrap.org/mailarchive/linux-kernel/2010/2/8/4536388
> > 
> > | Usually tasklist gives enough protection, but if copy_process() fails
> > | it calls free_pid() lockless and does call_rcu(delayed_put_pid().   
> > | This means, without rcu lock find_pid_ns() can't scan the hash table
> > | safely.
> 
> We can remove the tasklist_lock while at it. rcu_read_lock is enough.
> 

Please kindly review.

---

diff --git a/kernel/posix-cpu-timers.c b/kernel/posix-cpu-timers.c
index 6842eeb..96fe2a3 100644
--- a/kernel/posix-cpu-timers.c
+++ b/kernel/posix-cpu-timers.c
@@ -37,13 +37,13 @@ static int check_clock(const clockid_t which_clock)
 	if (pid == 0)
 		return 0;
 
-	read_lock(&tasklist_lock);
+	rcu_read_lock();
 	p = find_task_by_vpid(pid);
 	if (!p || !(CPUCLOCK_PERTHREAD(which_clock) ?
 		   same_thread_group(p, current) : thread_group_leader(p))) {
 		error = -EINVAL;
 	}
-	read_unlock(&tasklist_lock);
+	rcu_read_unlock();
 
 	return error;
 }
@@ -390,7 +390,7 @@ int posix_cpu_timer_create(struct k_itimer *new_timer)
 
 	INIT_LIST_HEAD(&new_timer->it.cpu.entry);
 
-	read_lock(&tasklist_lock);
+	rcu_read_lock();
 	if (CPUCLOCK_PERTHREAD(new_timer->it_clock)) {
 		if (pid == 0) {
 			p = current;
@@ -414,7 +414,7 @@ int posix_cpu_timer_create(struct k_itimer *new_timer)
 	} else {
 		ret = -EINVAL;
 	}
-	read_unlock(&tasklist_lock);
+	rcu_read_unlock();
 
 	return ret;
 }


[-- Attachment #2: Type: application/pgp-signature, Size: 316 bytes --]

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] posix-cpu-timers: rcu_read_lock/unlock protect find_task_by_vpid call
  2010-11-02 16:02   ` Sergey Senozhatsky
@ 2010-11-02 16:04     ` Thomas Gleixner
  2010-11-02 18:33       ` Oleg Nesterov
  0 siblings, 1 reply; 14+ messages in thread
From: Thomas Gleixner @ 2010-11-02 16:04 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Andrew Morton, Peter Zijlstra, Ingo Molnar, LKML, Oleg Nesterov

Just added Oleg to the cc-list

On Tue, 2 Nov 2010, Sergey Senozhatsky wrote:

> On (11/02/10 16:31), Thomas Gleixner wrote:
> > On Tue, 2 Nov 2010, Sergey Senozhatsky wrote:
> > 
> > > Commit 4221a9918e38b7494cee341dda7b7b4bb8c04bde "Add RCU check for
> > > find_task_by_vpid()" introduced rcu_lockdep_assert to find_task_by_pid_ns.
> > > Add rcu_read_lock/rcu_read_unlock to call find_task_by_vpid.
> > > 
> > > Tetsuo Handa wrote:
> > > 
> > > Quoting from one of posts in that thead
> > > http://kerneltrap.org/mailarchive/linux-kernel/2010/2/8/4536388
> > > 
> > > | Usually tasklist gives enough protection, but if copy_process() fails
> > > | it calls free_pid() lockless and does call_rcu(delayed_put_pid().   
> > > | This means, without rcu lock find_pid_ns() can't scan the hash table
> > > | safely.
> > 
> > We can remove the tasklist_lock while at it. rcu_read_lock is enough.
> > 
> 
> Please kindly review.
> 
> ---
> 
> diff --git a/kernel/posix-cpu-timers.c b/kernel/posix-cpu-timers.c
> index 6842eeb..96fe2a3 100644
> --- a/kernel/posix-cpu-timers.c
> +++ b/kernel/posix-cpu-timers.c
> @@ -37,13 +37,13 @@ static int check_clock(const clockid_t which_clock)
>  	if (pid == 0)
>  		return 0;
>  
> -	read_lock(&tasklist_lock);
> +	rcu_read_lock();
>  	p = find_task_by_vpid(pid);
>  	if (!p || !(CPUCLOCK_PERTHREAD(which_clock) ?
>  		   same_thread_group(p, current) : thread_group_leader(p))) {
>  		error = -EINVAL;
>  	}
> -	read_unlock(&tasklist_lock);
> +	rcu_read_unlock();
>  
>  	return error;
>  }
> @@ -390,7 +390,7 @@ int posix_cpu_timer_create(struct k_itimer *new_timer)
>  
>  	INIT_LIST_HEAD(&new_timer->it.cpu.entry);
>  
> -	read_lock(&tasklist_lock);
> +	rcu_read_lock();
>  	if (CPUCLOCK_PERTHREAD(new_timer->it_clock)) {
>  		if (pid == 0) {
>  			p = current;
> @@ -414,7 +414,7 @@ int posix_cpu_timer_create(struct k_itimer *new_timer)
>  	} else {
>  		ret = -EINVAL;
>  	}
> -	read_unlock(&tasklist_lock);
> +	rcu_read_unlock();
>  
>  	return ret;
>  }
> 
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] posix-cpu-timers: rcu_read_lock/unlock protect find_task_by_vpid call
  2010-11-02 16:04     ` Thomas Gleixner
@ 2010-11-02 18:33       ` Oleg Nesterov
  2010-11-03 10:58         ` Sergey Senozhatsky
  0 siblings, 1 reply; 14+ messages in thread
From: Oleg Nesterov @ 2010-11-02 18:33 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Sergey Senozhatsky, Andrew Morton, Peter Zijlstra, Ingo Molnar,
	LKML, Stanislaw Gruszka

> On Tue, 2 Nov 2010, Sergey Senozhatsky wrote:
>
> > On (11/02/10 16:31), Thomas Gleixner wrote:
> > > On Tue, 2 Nov 2010, Sergey Senozhatsky wrote:
> > >
> > > > Commit 4221a9918e38b7494cee341dda7b7b4bb8c04bde "Add RCU check for
> > > > find_task_by_vpid()" introduced rcu_lockdep_assert to find_task_by_pid_ns.
> > > > Add rcu_read_lock/rcu_read_unlock to call find_task_by_vpid.
> > > >
> > > > Tetsuo Handa wrote:
> > > >
> > > > Quoting from one of posts in that thead
> > > > http://kerneltrap.org/mailarchive/linux-kernel/2010/2/8/4536388
> > > >
> > > > | Usually tasklist gives enough protection, but if copy_process() fails
> > > > | it calls free_pid() lockless and does call_rcu(delayed_put_pid().
> > > > | This means, without rcu lock find_pid_ns() can't scan the hash table
> > > > | safely.
> > >
> > > We can remove the tasklist_lock while at it. rcu_read_lock is enough.
> > >

Yes, I believe posix-cpu-timers.c shouldn't use tasklist at all,
but it is not trivial to change this code.

Minor nit,

> > @@ -390,7 +390,7 @@ int posix_cpu_timer_create(struct k_itimer *new_timer)
> >
> >  	INIT_LIST_HEAD(&new_timer->it.cpu.entry);
> >
> > -	read_lock(&tasklist_lock);
> > +	rcu_read_lock();
> >  	if (CPUCLOCK_PERTHREAD(new_timer->it_clock)) {
> >  		if (pid == 0) {
> >  			p = current;
> > @@ -414,7 +414,7 @@ int posix_cpu_timer_create(struct k_itimer *new_timer)
> >  	} else {
> >  		ret = -EINVAL;
> >  	}
> > -	read_unlock(&tasklist_lock);
> > +	rcu_read_unlock();

I think this change is fine, but please note that thread_group_leader()
check is not relaible without tasklist. If we race with de_thread()
find_task_by_vpid() can find the new leader before it updates its
->group_leader. IOW, posix_cpu_timer_create() can fail when it shouldn't.

Not that I think this really matters, posix_cpu_timer_create() has
other problems with de_thread(). But perhaps it makes sense to
change posix_cpu_timer_create() to use has_group_leader_pid() instead,
just to make this code not look racy and avoid adding new problems.

The real fix, I think, should change cpu_timer_list to use
struct pid* instead of task_struct.

Oleg.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] posix-cpu-timers: rcu_read_lock/unlock protect find_task_by_vpid call
  2010-11-02 18:33       ` Oleg Nesterov
@ 2010-11-03 10:58         ` Sergey Senozhatsky
  2010-11-03 12:48           ` Oleg Nesterov
  0 siblings, 1 reply; 14+ messages in thread
From: Sergey Senozhatsky @ 2010-11-03 10:58 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Thomas Gleixner, Andrew Morton, Peter Zijlstra, Ingo Molnar,
	LKML, Stanislaw Gruszka

[-- Attachment #1: Type: text/plain, Size: 2519 bytes --]

On (11/02/10 19:33), Oleg Nesterov wrote:
> > On Tue, 2 Nov 2010, Sergey Senozhatsky wrote:
> >
> > > > We can remove the tasklist_lock while at it. rcu_read_lock is enough.
> > > >
> 
> Yes, I believe posix-cpu-timers.c shouldn't use tasklist at all,
> but it is not trivial to change this code.
>
>[..] 
> 
> I think this change is fine, but please note that thread_group_leader()
> check is not relaible without tasklist. If we race with de_thread()
> find_task_by_vpid() can find the new leader before it updates its
> ->group_leader. IOW, posix_cpu_timer_create() can fail when it shouldn't.
> 
> Not that I think this really matters, posix_cpu_timer_create() has
> other problems with de_thread(). But perhaps it makes sense to
> change posix_cpu_timer_create() to use has_group_leader_pid() instead,
> just to make this code not look racy and avoid adding new problems.
> 
> The real fix, I think, should change cpu_timer_list to use
> struct pid* instead of task_struct.
> 

Hello,
Using has_group_leader_pid instead of thread_group_leader, when tasklist_lock 
is not aquired (check_clock and posix_cpu_timer_create).

---

diff --git a/kernel/posix-cpu-timers.c b/kernel/posix-cpu-timers.c
index 6842eeb..05bb717 100644
--- a/kernel/posix-cpu-timers.c
+++ b/kernel/posix-cpu-timers.c
@@ -37,13 +37,13 @@ static int check_clock(const clockid_t which_clock)
 	if (pid == 0)
 		return 0;
 
-	read_lock(&tasklist_lock);
+	rcu_read_lock();
 	p = find_task_by_vpid(pid);
 	if (!p || !(CPUCLOCK_PERTHREAD(which_clock) ?
-		   same_thread_group(p, current) : thread_group_leader(p))) {
+		   same_thread_group(p, current) : has_group_leader_pid(p))) {
 		error = -EINVAL;
 	}
-	read_unlock(&tasklist_lock);
+	rcu_read_unlock();
 
 	return error;
 }
@@ -390,7 +390,7 @@ int posix_cpu_timer_create(struct k_itimer *new_timer)
 
 	INIT_LIST_HEAD(&new_timer->it.cpu.entry);
 
-	read_lock(&tasklist_lock);
+	rcu_read_lock();
 	if (CPUCLOCK_PERTHREAD(new_timer->it_clock)) {
 		if (pid == 0) {
 			p = current;
@@ -404,7 +404,7 @@ int posix_cpu_timer_create(struct k_itimer *new_timer)
 			p = current->group_leader;
 		} else {
 			p = find_task_by_vpid(pid);
-			if (p && !thread_group_leader(p))
+			if (p && !has_group_leader_pid(p))
 				p = NULL;
 		}
 	}
@@ -414,7 +414,7 @@ int posix_cpu_timer_create(struct k_itimer *new_timer)
 	} else {
 		ret = -EINVAL;
 	}
-	read_unlock(&tasklist_lock);
+	rcu_read_unlock();
 
 	return ret;
 }



[-- Attachment #2: Type: application/pgp-signature, Size: 316 bytes --]

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] posix-cpu-timers: rcu_read_lock/unlock protect find_task_by_vpid call
  2010-11-03 10:58         ` Sergey Senozhatsky
@ 2010-11-03 12:48           ` Oleg Nesterov
  2010-11-03 16:10             ` Oleg Nesterov
  0 siblings, 1 reply; 14+ messages in thread
From: Oleg Nesterov @ 2010-11-03 12:48 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Thomas Gleixner, Andrew Morton, Peter Zijlstra, Ingo Molnar,
	LKML, Stanislaw Gruszka

Damn.

Sergey, Thomas, please wait a bit.

Yes, I believe this patch is fine by itself. But looking into
posix-cpu-timers.c again, I suspect that those "other problems
with de_thread" I already mentioned are much more serious and
need the urgent fix.

I'll try to verify this a bit later today.

In any case, I believe someone should find the time to audit/
rewrite posix-cpu-timers.c ;)

On 11/03, Sergey Senozhatsky wrote:
>
> On (11/02/10 19:33), Oleg Nesterov wrote:
> > > On Tue, 2 Nov 2010, Sergey Senozhatsky wrote:
> > >
> > > > > We can remove the tasklist_lock while at it. rcu_read_lock is enough.
> > > > >
> >
> > Yes, I believe posix-cpu-timers.c shouldn't use tasklist at all,
> > but it is not trivial to change this code.
> >
> >[..]
> >
> > I think this change is fine, but please note that thread_group_leader()
> > check is not relaible without tasklist. If we race with de_thread()
> > find_task_by_vpid() can find the new leader before it updates its
> > ->group_leader. IOW, posix_cpu_timer_create() can fail when it shouldn't.
> >
> > Not that I think this really matters, posix_cpu_timer_create() has
> > other problems with de_thread(). But perhaps it makes sense to
> > change posix_cpu_timer_create() to use has_group_leader_pid() instead,
> > just to make this code not look racy and avoid adding new problems.
> >
> > The real fix, I think, should change cpu_timer_list to use
> > struct pid* instead of task_struct.
> >
>
> Hello,
> Using has_group_leader_pid instead of thread_group_leader, when tasklist_lock
> is not aquired (check_clock and posix_cpu_timer_create).
>
> ---
>
> diff --git a/kernel/posix-cpu-timers.c b/kernel/posix-cpu-timers.c
> index 6842eeb..05bb717 100644
> --- a/kernel/posix-cpu-timers.c
> +++ b/kernel/posix-cpu-timers.c
> @@ -37,13 +37,13 @@ static int check_clock(const clockid_t which_clock)
>  	if (pid == 0)
>  		return 0;
>
> -	read_lock(&tasklist_lock);
> +	rcu_read_lock();
>  	p = find_task_by_vpid(pid);
>  	if (!p || !(CPUCLOCK_PERTHREAD(which_clock) ?
> -		   same_thread_group(p, current) : thread_group_leader(p))) {
> +		   same_thread_group(p, current) : has_group_leader_pid(p))) {
>  		error = -EINVAL;
>  	}
> -	read_unlock(&tasklist_lock);
> +	rcu_read_unlock();
>
>  	return error;
>  }
> @@ -390,7 +390,7 @@ int posix_cpu_timer_create(struct k_itimer *new_timer)
>
>  	INIT_LIST_HEAD(&new_timer->it.cpu.entry);
>
> -	read_lock(&tasklist_lock);
> +	rcu_read_lock();
>  	if (CPUCLOCK_PERTHREAD(new_timer->it_clock)) {
>  		if (pid == 0) {
>  			p = current;
> @@ -404,7 +404,7 @@ int posix_cpu_timer_create(struct k_itimer *new_timer)
>  			p = current->group_leader;
>  		} else {
>  			p = find_task_by_vpid(pid);
> -			if (p && !thread_group_leader(p))
> +			if (p && !has_group_leader_pid(p))
>  				p = NULL;
>  		}
>  	}
> @@ -414,7 +414,7 @@ int posix_cpu_timer_create(struct k_itimer *new_timer)
>  	} else {
>  		ret = -EINVAL;
>  	}
> -	read_unlock(&tasklist_lock);
> +	rcu_read_unlock();
>
>  	return ret;
>  }
>
>




^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] posix-cpu-timers: rcu_read_lock/unlock protect find_task_by_vpid call
  2010-11-03 12:48           ` Oleg Nesterov
@ 2010-11-03 16:10             ` Oleg Nesterov
  2010-11-03 16:38               ` Sergey Senozhatsky
                                 ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Oleg Nesterov @ 2010-11-03 16:10 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Thomas Gleixner, Andrew Morton, Peter Zijlstra, Ingo Molnar,
	LKML, Stanislaw Gruszka

On 11/03, Oleg Nesterov wrote:
>
> Sergey, Thomas, please wait a bit.
>
> Yes, I believe this patch is fine by itself. But looking into
> posix-cpu-timers.c again, I suspect that those "other problems
> with de_thread" I already mentioned are much more serious and
> need the urgent fix.

Yes. I'll send the patch tomorrow.


However, my initial thinking was wrong, that bug is orthogonal
to this patch.

Sergey, how much will you hate me if I ask you to re-send it again?


> > On (11/02/10 19:33), Oleg Nesterov wrote:
> > > > On Tue, 2 Nov 2010, Sergey Senozhatsky wrote:
> > > >
> > > > > > We can remove the tasklist_lock while at it. rcu_read_lock is enough.
> > > > > >
> > >
> > > Yes, I believe posix-cpu-timers.c shouldn't use tasklist at all,
> > > but it is not trivial to change this code.
> > >
> > >[..]
> > >
> > > I think this change is fine, but please note that thread_group_leader()
> > > check is not relaible without tasklist. If we race with de_thread()
> > > find_task_by_vpid() can find the new leader before it updates its
> > > ->group_leader. IOW, posix_cpu_timer_create() can fail when it shouldn't.
> > >
> > > Not that I think this really matters, posix_cpu_timer_create() has
> > > other problems with de_thread(). But perhaps it makes sense to
> > > change posix_cpu_timer_create() to use has_group_leader_pid() instead,
> > > just to make this code not look racy and avoid adding new problems.
> > >
> > > The real fix, I think, should change cpu_timer_list to use
> > > struct pid* instead of task_struct.
> > >
> >
> > Hello,
> > Using has_group_leader_pid instead of thread_group_leader, when tasklist_lock
> > is not aquired (check_clock and posix_cpu_timer_create).

This doesn't look like a valid changelog ;) I'd suggest you to write
the new one without quoting old emails. It should explain that a)
tasklist_lock is not enough for find_vpid() and b) it is not needed.

Also, you forgot to add your signed-of-by.

Otherwise,

Reviewed-by: Oleg Nesterov <oleg@redhat.com>


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] posix-cpu-timers: rcu_read_lock/unlock protect find_task_by_vpid call
  2010-11-03 16:10             ` Oleg Nesterov
@ 2010-11-03 16:38               ` Sergey Senozhatsky
  2010-11-03 16:52               ` Sergey Senozhatsky
  2010-11-05 15:53               ` [PATCH] posix-cpu-timers: workaround to suppress the problems with mt exec Oleg Nesterov
  2 siblings, 0 replies; 14+ messages in thread
From: Sergey Senozhatsky @ 2010-11-03 16:38 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Sergey Senozhatsky, Thomas Gleixner, Andrew Morton,
	Peter Zijlstra, Ingo Molnar, LKML, Stanislaw Gruszka

[-- Attachment #1: Type: text/plain, Size: 2402 bytes --]

On (11/03/10 17:10), Oleg Nesterov wrote:
> On 11/03, Oleg Nesterov wrote:
> >
> > Sergey, Thomas, please wait a bit.
> >
> > Yes, I believe this patch is fine by itself. But looking into
> > posix-cpu-timers.c again, I suspect that those "other problems
> > with de_thread" I already mentioned are much more serious and
> > need the urgent fix.
> 
> Yes. I'll send the patch tomorrow.
> 
> 
> However, my initial thinking was wrong, that bug is orthogonal
> to this patch.
> 
> Sergey, how much will you hate me if I ask you to re-send it again?
>

Oleg, not a problem at all. Which one should I resend?

#  desc.
1) added rcu_read_lock/unlock
2) removed tasklist_lock
3) added has_group_leader_pid


	Sergey

 
> 
> > > On (11/02/10 19:33), Oleg Nesterov wrote:
> > > > > On Tue, 2 Nov 2010, Sergey Senozhatsky wrote:
> > > > >
> > > > > > > We can remove the tasklist_lock while at it. rcu_read_lock is enough.
> > > > > > >
> > > >
> > > > Yes, I believe posix-cpu-timers.c shouldn't use tasklist at all,
> > > > but it is not trivial to change this code.
> > > >
> > > >[..]
> > > >
> > > > I think this change is fine, but please note that thread_group_leader()
> > > > check is not relaible without tasklist. If we race with de_thread()
> > > > find_task_by_vpid() can find the new leader before it updates its
> > > > ->group_leader. IOW, posix_cpu_timer_create() can fail when it shouldn't.
> > > >
> > > > Not that I think this really matters, posix_cpu_timer_create() has
> > > > other problems with de_thread(). But perhaps it makes sense to
> > > > change posix_cpu_timer_create() to use has_group_leader_pid() instead,
> > > > just to make this code not look racy and avoid adding new problems.
> > > >
> > > > The real fix, I think, should change cpu_timer_list to use
> > > > struct pid* instead of task_struct.
> > > >
> > >
> > > Hello,
> > > Using has_group_leader_pid instead of thread_group_leader, when tasklist_lock
> > > is not aquired (check_clock and posix_cpu_timer_create).
> 
> This doesn't look like a valid changelog ;) I'd suggest you to write
> the new one without quoting old emails. It should explain that a)
> tasklist_lock is not enough for find_vpid() and b) it is not needed.
> 
> Also, you forgot to add your signed-of-by.
> 
> Otherwise,
> 
> Reviewed-by: Oleg Nesterov <oleg@redhat.com>
> 

[-- Attachment #2: Type: application/pgp-signature, Size: 316 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] posix-cpu-timers: rcu_read_lock/unlock protect find_task_by_vpid call
  2010-11-03 16:10             ` Oleg Nesterov
  2010-11-03 16:38               ` Sergey Senozhatsky
@ 2010-11-03 16:52               ` Sergey Senozhatsky
  2010-11-03 17:17                 ` Oleg Nesterov
  2010-11-05 15:53               ` [PATCH] posix-cpu-timers: workaround to suppress the problems with mt exec Oleg Nesterov
  2 siblings, 1 reply; 14+ messages in thread
From: Sergey Senozhatsky @ 2010-11-03 16:52 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Thomas Gleixner, Andrew Morton, Peter Zijlstra, Ingo Molnar,
	LKML, Stanislaw Gruszka

Commit 4221a9918e38b7494cee341dda7b7b4bb8c04bde "Add RCU check for
find_task_by_vpid()" introduced rcu_lockdep_assert to find_task_by_pid_ns.
Add rcu_read_lock/rcu_read_unlock to call find_task_by_vpid.

Tetsuo Handa wrote:
| Quoting from one of posts in that thead
| http://kerneltrap.org/mailarchive/linux-kernel/2010/2/8/4536388
|
|| Usually tasklist gives enough protection, but if copy_process() fails
|| it calls free_pid() lockless and does call_rcu(delayed_put_pid().
|| This means, without rcu lock find_pid_ns() can't scan the hash table
|| safely.

Thomas Gleixner wrote:
| We can remove the tasklist_lock while at it. rcu_read_lock is enough.

Patch also replaces thread_group_leader with has_group_leader_pid
in accordance to comment by Oleg Nesterov:

| ... thread_group_leader() check is not relaible without 
| tasklist. If we race with de_thread() find_task_by_vpid() can find
| the new leader before it updates its ->group_leader.
|
| perhaps it makes sense to change posix_cpu_timer_create() to use 
| has_group_leader_pid() instead, just to make this code not look racy
| and avoid adding new problems.


Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>

---

diff --git a/kernel/posix-cpu-timers.c b/kernel/posix-cpu-timers.c
index 6842eeb..05bb717 100644
--- a/kernel/posix-cpu-timers.c
+++ b/kernel/posix-cpu-timers.c
@@ -37,13 +37,13 @@ static int check_clock(const clockid_t which_clock)
 	if (pid == 0)
 		return 0;
 
-	read_lock(&tasklist_lock);
+	rcu_read_lock();
 	p = find_task_by_vpid(pid);
 	if (!p || !(CPUCLOCK_PERTHREAD(which_clock) ?
-		   same_thread_group(p, current) : thread_group_leader(p))) {
+		   same_thread_group(p, current) : has_group_leader_pid(p))) {
 		error = -EINVAL;
 	}
-	read_unlock(&tasklist_lock);
+	rcu_read_unlock();
 
 	return error;
 }
@@ -390,7 +390,7 @@ int posix_cpu_timer_create(struct k_itimer *new_timer)
 
 	INIT_LIST_HEAD(&new_timer->it.cpu.entry);
 
-	read_lock(&tasklist_lock);
+	rcu_read_lock();
 	if (CPUCLOCK_PERTHREAD(new_timer->it_clock)) {
 		if (pid == 0) {
 			p = current;
@@ -404,7 +404,7 @@ int posix_cpu_timer_create(struct k_itimer *new_timer)
 			p = current->group_leader;
 		} else {
 			p = find_task_by_vpid(pid);
-			if (p && !thread_group_leader(p))
+			if (p && !has_group_leader_pid(p))
 				p = NULL;
 		}
 	}
@@ -414,7 +414,7 @@ int posix_cpu_timer_create(struct k_itimer *new_timer)
 	} else {
 		ret = -EINVAL;
 	}
-	read_unlock(&tasklist_lock);
+	rcu_read_unlock();
 
 	return ret;
 }


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH] posix-cpu-timers: rcu_read_lock/unlock protect find_task_by_vpid call
  2010-11-03 16:52               ` Sergey Senozhatsky
@ 2010-11-03 17:17                 ` Oleg Nesterov
  0 siblings, 0 replies; 14+ messages in thread
From: Oleg Nesterov @ 2010-11-03 17:17 UTC (permalink / raw)
  To: Sergey Senozhatsky
  Cc: Thomas Gleixner, Andrew Morton, Peter Zijlstra, Ingo Molnar,
	LKML, Stanislaw Gruszka

On 11/03, Sergey Senozhatsky wrote:
>
> Commit 4221a9918e38b7494cee341dda7b7b4bb8c04bde "Add RCU check for
> find_task_by_vpid()" introduced rcu_lockdep_assert to find_task_by_pid_ns.
> Add rcu_read_lock/rcu_read_unlock to call find_task_by_vpid.
>
> Tetsuo Handa wrote:
> | Quoting from one of posts in that thead
> | http://kerneltrap.org/mailarchive/linux-kernel/2010/2/8/4536388
> |
> || Usually tasklist gives enough protection, but if copy_process() fails
> || it calls free_pid() lockless and does call_rcu(delayed_put_pid().
> || This means, without rcu lock find_pid_ns() can't scan the hash table
> || safely.
>
> Thomas Gleixner wrote:
> | We can remove the tasklist_lock while at it. rcu_read_lock is enough.
>
> Patch also replaces thread_group_leader with has_group_leader_pid
> in accordance to comment by Oleg Nesterov:
>
> | ... thread_group_leader() check is not relaible without
> | tasklist. If we race with de_thread() find_task_by_vpid() can find
> | the new leader before it updates its ->group_leader.
> |
> | perhaps it makes sense to change posix_cpu_timer_create() to use
> | has_group_leader_pid() instead, just to make this code not look racy
> | and avoid adding new problems.
>
>
> Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>

Reviewed-by: Oleg Nesterov <oleg@redhat.com>


> diff --git a/kernel/posix-cpu-timers.c b/kernel/posix-cpu-timers.c
> index 6842eeb..05bb717 100644
> --- a/kernel/posix-cpu-timers.c
> +++ b/kernel/posix-cpu-timers.c
> @@ -37,13 +37,13 @@ static int check_clock(const clockid_t which_clock)
>  	if (pid == 0)
>  		return 0;
>
> -	read_lock(&tasklist_lock);
> +	rcu_read_lock();
>  	p = find_task_by_vpid(pid);
>  	if (!p || !(CPUCLOCK_PERTHREAD(which_clock) ?
> -		   same_thread_group(p, current) : thread_group_leader(p))) {
> +		   same_thread_group(p, current) : has_group_leader_pid(p))) {
>  		error = -EINVAL;
>  	}
> -	read_unlock(&tasklist_lock);
> +	rcu_read_unlock();
>
>  	return error;
>  }
> @@ -390,7 +390,7 @@ int posix_cpu_timer_create(struct k_itimer *new_timer)
>
>  	INIT_LIST_HEAD(&new_timer->it.cpu.entry);
>
> -	read_lock(&tasklist_lock);
> +	rcu_read_lock();
>  	if (CPUCLOCK_PERTHREAD(new_timer->it_clock)) {
>  		if (pid == 0) {
>  			p = current;
> @@ -404,7 +404,7 @@ int posix_cpu_timer_create(struct k_itimer *new_timer)
>  			p = current->group_leader;
>  		} else {
>  			p = find_task_by_vpid(pid);
> -			if (p && !thread_group_leader(p))
> +			if (p && !has_group_leader_pid(p))
>  				p = NULL;
>  		}
>  	}
> @@ -414,7 +414,7 @@ int posix_cpu_timer_create(struct k_itimer *new_timer)
>  	} else {
>  		ret = -EINVAL;
>  	}
> -	read_unlock(&tasklist_lock);
> +	rcu_read_unlock();
>
>  	return ret;
>  }
>


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH] posix-cpu-timers: workaround to suppress the problems with mt exec
  2010-11-03 16:10             ` Oleg Nesterov
  2010-11-03 16:38               ` Sergey Senozhatsky
  2010-11-03 16:52               ` Sergey Senozhatsky
@ 2010-11-05 15:53               ` Oleg Nesterov
  2010-11-08 18:14                 ` Roland McGrath
  2010-11-09 14:54                 ` Stanislaw Gruszka
  2 siblings, 2 replies; 14+ messages in thread
From: Oleg Nesterov @ 2010-11-05 15:53 UTC (permalink / raw)
  To: Andrew Morton, Thomas Gleixner
  Cc: Peter Zijlstra, Ingo Molnar, LKML, Stanislaw Gruszka,
	Sergey Senozhatsky, Roland McGrath, stable

posix-cpu-timers.c correctly assumes that the dying process does
posix_cpu_timers_exit_group() and removes all !CPUCLOCK_PERTHREAD
timers from signal->cpu_timers list.

But, it also assumes that timer->it.cpu.task is always the group
leader, and thus the dead ->task means the dead thread group.

This is obviously not true after de_thread() changes the leader.
After that almost every posix_cpu_timer_ method has problems.

It is not simple to fix this bug correctly. First of all, I think
that timer->it.cpu should use struct pid instead of task_struct.
Also, the locking should be reworked completely. In particular,
tasklist_lock should not be used at all. This all needs a lot of
nontrivial and hard-to-test changes.

Change __exit_signal() to do posix_cpu_timers_exit_group() when
the old leader dies during exec. This is not the fix, just the
temporary hack to hide the problem for 2.6.37 and stable. IOW,
this is obviously wrong but this is what we currently have anyway:
cpu timers do not work after mt exec.

In theory this change adds another race. The exiting leader can
detach the timers which were attached to the new leader. However,
the window between de_thread() and release_task() is small, we
can pretend that sys_timer_create() was called before de_thread().

Signed-off-by: Oleg Nesterov <oleg@redhat.com>
---

 kernel/exit.c |    8 ++++++++
 1 file changed, 8 insertions(+)

--- kstub/kernel/exit.c~pct_de_thread_race	2010-08-17 12:32:24.000000000 +0200
+++ kstub/kernel/exit.c	2010-11-04 21:30:18.000000000 +0100
@@ -95,6 +95,14 @@ static void __exit_signal(struct task_st
 		sig->tty = NULL;
 	} else {
 		/*
+		 * This can only happen if the caller is de_thread().
+		 * FIXME: this is the temporary hack, we should teach
+		 * posix-cpu-timers to handle this case correctly.
+		 */
+		if (unlikely(has_group_leader_pid(tsk)))
+			posix_cpu_timers_exit_group(tsk);
+
+		/*
 		 * If there is any task waiting for the group exit
 		 * then notify it:
 		 */


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] posix-cpu-timers: workaround to suppress the problems with mt exec
  2010-11-05 15:53               ` [PATCH] posix-cpu-timers: workaround to suppress the problems with mt exec Oleg Nesterov
@ 2010-11-08 18:14                 ` Roland McGrath
  2010-11-09 14:54                 ` Stanislaw Gruszka
  1 sibling, 0 replies; 14+ messages in thread
From: Roland McGrath @ 2010-11-08 18:14 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Thomas Gleixner, Peter Zijlstra, Ingo Molnar,
	LKML, Stanislaw Gruszka, Sergey Senozhatsky, stable

That seems like a reasonable workaround off hand.  But since all the
reworkings I am no longer particularly authoritative on this code.


Thanks,
Roland


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH] posix-cpu-timers: workaround to suppress the problems with mt exec
  2010-11-05 15:53               ` [PATCH] posix-cpu-timers: workaround to suppress the problems with mt exec Oleg Nesterov
  2010-11-08 18:14                 ` Roland McGrath
@ 2010-11-09 14:54                 ` Stanislaw Gruszka
  1 sibling, 0 replies; 14+ messages in thread
From: Stanislaw Gruszka @ 2010-11-09 14:54 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Thomas Gleixner, Peter Zijlstra, Ingo Molnar,
	LKML, Sergey Senozhatsky, Roland McGrath, stable

On Fri, Nov 05, 2010 at 04:53:42PM +0100, Oleg Nesterov wrote:
> posix-cpu-timers.c correctly assumes that the dying process does
> posix_cpu_timers_exit_group() and removes all !CPUCLOCK_PERTHREAD
> timers from signal->cpu_timers list.
> 
> But, it also assumes that timer->it.cpu.task is always the group
> leader, and thus the dead ->task means the dead thread group.
> 
> This is obviously not true after de_thread() changes the leader.
> After that almost every posix_cpu_timer_ method has problems.
> 
> It is not simple to fix this bug correctly. First of all, I think
> that timer->it.cpu should use struct pid instead of task_struct.
> Also, the locking should be reworked completely. In particular,
> tasklist_lock should not be used at all. This all needs a lot of
> nontrivial and hard-to-test changes.
> 
> Change __exit_signal() to do posix_cpu_timers_exit_group() when
> the old leader dies during exec. This is not the fix, just the
> temporary hack to hide the problem for 2.6.37 and stable. IOW,
> this is obviously wrong but this is what we currently have anyway:
> cpu timers do not work after mt exec.
> 
> In theory this change adds another race. The exiting leader can
> detach the timers which were attached to the new leader. However,
> the window between de_thread() and release_task() is small, we
> can pretend that sys_timer_create() was called before de_thread().
> 
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>

Reviewed-by: Stanislaw Gruszka <sgruszka@redhat.com>


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2010-11-09 14:51 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-11-02 13:58 [PATCH] posix-cpu-timers: rcu_read_lock/unlock protect find_task_by_vpid call Sergey Senozhatsky
2010-11-02 15:31 ` Thomas Gleixner
2010-11-02 16:02   ` Sergey Senozhatsky
2010-11-02 16:04     ` Thomas Gleixner
2010-11-02 18:33       ` Oleg Nesterov
2010-11-03 10:58         ` Sergey Senozhatsky
2010-11-03 12:48           ` Oleg Nesterov
2010-11-03 16:10             ` Oleg Nesterov
2010-11-03 16:38               ` Sergey Senozhatsky
2010-11-03 16:52               ` Sergey Senozhatsky
2010-11-03 17:17                 ` Oleg Nesterov
2010-11-05 15:53               ` [PATCH] posix-cpu-timers: workaround to suppress the problems with mt exec Oleg Nesterov
2010-11-08 18:14                 ` Roland McGrath
2010-11-09 14:54                 ` Stanislaw Gruszka

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.