From: Oleg Nesterov <oleg@redhat.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Don Zickus <dzickus@redhat.com>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Ingo Molnar <mingo@elte.hu>,
	Jerome Marchand <jmarchan@redhat.com>,
	Mandeep Singh Baines <msb@google.com>,
	Roland McGrath <roland@redhat.com>,
	linux-kernel@vger.kernel.org, stable@kernel.org,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Subject: while_each_thread() under rcu_read_lock() is broken?
Date: Fri, 18 Jun 2010 21:34:03 +0200
Message-ID: <20100618193403.GA17314@redhat.com>
In-Reply-To: <20100618190251.GA17297@redhat.com>

(add cc's)

Hmm. Once I sent this patch, I suddenly realized with horror that
while_each_thread() is NOT safe under rcu_read_lock(). Both
do_each_thread()/while_each_thread() and a plain
do { ... } while_each_thread() loop can race with exec().

Yes, it is safe to do next_thread() or next_task(). But:

	#define while_each_thread(g, t) \
		while ((t = next_thread(t)) != g)

suppose that t is not the group leader, and it does de_thread() and then
release_task(g). After that next_thread(t) returns t, not g, and the loop
will never stop.
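
To make the scenario concrete, here is a minimal userspace sketch of it.
This is not kernel code: struct task, unlink_task(), walk_steps() and
demo_race() are hypothetical stand-ins, and unlink_task() only assumes
that unhashing behaves like list_del_rcu(), i.e. the neighbours skip the
removed entry while its own ->next is left intact. The step limit exists
only so the demo terminates.

```c
#include <assert.h>

/* Hypothetical stand-in for the thread_group linkage in task_struct. */
struct task {
	struct task *next;	/* circular list of threads in the group */
	struct task *prev;
};

static struct task *next_thread(struct task *t)
{
	return t->next;
}

/*
 * Assumed to model list_del_rcu(): the neighbours are re-linked around
 * the victim, but victim->next is kept for concurrent readers.
 */
static void unlink_task(struct task *victim)
{
	victim->prev->next = victim->next;
	victim->next->prev = victim->prev;
}

/*
 * Walk the way while_each_thread(g, t) does, but give up after 'limit'
 * iterations. Returns the number of iterations, or -1 if the limit was
 * hit (the loop would otherwise never stop).
 */
static int walk_steps(struct task *g, struct task *t, int limit)
{
	int steps = 0;

	assert(limit > 0);
	while ((t = next_thread(t)) != g) {
		if (++steps >= limit)
			return -1;
	}
	return steps;
}

/* Two-thread group: g is the old leader, t is the thread doing exec. */
static int demo_race(void)
{
	struct task g, t;

	g.next = &t; g.prev = &t;
	t.next = &g; t.prev = &g;

	if (walk_steps(&g, &t, 1000) != 0)	/* sane before the race */
		return 0;

	unlink_task(&g);	/* models release_task(g) after de_thread() */

	/* Now next_thread(&t) == &t, so g is never reached again. */
	return walk_steps(&g, &t, 1000);
}
```

In this model demo_race() hits the step limit: once g is unlinked, a
reader that still holds g as its stop marker keeps getting t back.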

I _really_ hope I missed something, will recheck tomorrow with a fresh
head. Still, I'd like to share my concerns...

If I am right, probably we can fix this, something like

	#define while_each_thread(g, t) \
		while ((t = next_thread(t)) != g && pid_alive(g))

[we can't do while (!thread_group_leader(t = next_thread(t)))].
But this needs barriers, and we should validate the callers anyway.

Or, perhaps,

	#define XXX(g, t)	({				\
		struct task_struct *__prev = t;		\
		t = next_thread(t);			\
		t != g && t != __prev;			\
	})

	#define while_each_thread(g, t) \
		while (XXX(g, t))
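
As a rough check of this second idea, the same kind of userspace model
can be used. Again this is only a sketch under assumptions: struct task,
advance(), walk_with_prev_check() and demo_prev_check() are hypothetical
names, advance() is a function form of the XXX() condition, and the
model covers only the case where t is left alone on the list after g
was unlinked.

```c
#include <assert.h>

/* Hypothetical stand-in for the thread_group linkage in task_struct. */
struct task {
	struct task *next;
};

static struct task *next_thread(struct task *t)
{
	return t->next;
}

/*
 * Function form of the XXX() condition: advance *t and continue only
 * if the new thread is neither g nor the thread we just came from.
 */
static int advance(struct task *g, struct task **t)
{
	struct task *prev = *t;

	*t = next_thread(*t);
	return *t != g && *t != prev;
}

/* Returns the number of iterations, or -1 if 'limit' was reached. */
static int walk_with_prev_check(struct task *g, struct task *t, int limit)
{
	int steps = 0;

	assert(limit > 0);
	while (advance(g, &t)) {
		if (++steps >= limit)
			return -1;
	}
	return steps;
}

/* State after g was unlinked: t points to itself, g is stale. */
static int demo_prev_check(void)
{
	struct task g, t;

	g.next = &t;	/* stale pointer kept by the unlinked leader */
	t.next = &t;	/* t is the only thread left on the list */

	return walk_with_prev_check(&g, &t, 1000);
}
```

Here the walk stops immediately because next_thread(t) returns t itself.
Whether the check is enough when more threads are still on the list
during the walk is not something this model says anything about.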

Please tell me I am wrong!

Oleg.

On 06/18, Oleg Nesterov wrote:
>
> check_hung_uninterruptible_tasks()->rcu_lock_break() introduced by
> "softlockup: check all tasks in hung_task" commit ce9dbe24 looks
> absolutely wrong.
>
> 	- rcu_lock_break() does put_task_struct(). If the task has exited
> 	  it is not safe to even read its ->state, nothing protects this
> 	  task_struct.
>
> 	- The TASK_DEAD checks are wrong too. Contrary to the comment, we
> 	  can't use it to check if the task was unhashed. It can be unhashed
> 	  without TASK_DEAD, or it can be valid with TASK_DEAD.
>
> 	  For example, an autoreaping task can do release_task(current)
> 	  long before it sets TASK_DEAD in do_exit().
>
> 	  Or, a zombie task can have ->state == TASK_DEAD but release_task()
> 	  was not called, and in this case we must not break the loop.
>
> Change this code to check pid_alive() instead, and do this before we
> drop the reference to the task_struct.
>
> Signed-off-by: Oleg Nesterov <oleg@redhat.com>
> ---
>
>  kernel/hung_task.c |   11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
>
> --- 35-rc2/kernel/hung_task.c~CHT_FIX_RCU_LOCK_BREAK	2009-12-18 19:05:38.000000000 +0100
> +++ 35-rc2/kernel/hung_task.c	2010-06-18 20:06:11.000000000 +0200
> @@ -113,15 +113,20 @@ static void check_hung_task(struct task_
>   * For preemptible RCU it is sufficient to call rcu_read_unlock in order
>   * exit the grace period. For classic RCU, a reschedule is required.
>   */
> -static void rcu_lock_break(struct task_struct *g, struct task_struct *t)
> +static bool rcu_lock_break(struct task_struct *g, struct task_struct *t)
>  {
> +	bool can_cont;
> +
>  	get_task_struct(g);
>  	get_task_struct(t);
>  	rcu_read_unlock();
>  	cond_resched();
>  	rcu_read_lock();
> +	can_cont = pid_alive(g) && pid_alive(t);
>  	put_task_struct(t);
>  	put_task_struct(g);
> +
> +	return can_cont;
>  }
>
>  /*
> @@ -148,9 +153,7 @@ static void check_hung_uninterruptible_t
>  			goto unlock;
>  		if (!--batch_count) {
>  			batch_count = HUNG_TASK_BATCHING;
> -			rcu_lock_break(g, t);
> -			/* Exit if t or g was unhashed during refresh. */
> -			if (t->state == TASK_DEAD || g->state == TASK_DEAD)
> +			if (!rcu_lock_break(g, t))
>  				goto unlock;
>  		}
>  		/* use "==" to skip the TASK_KILLABLE tasks waiting on NFS */

