From: "Paul E. McKenney" <paulmck-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org>
To: NeilBrown <neilb-l3A5Bk7waGM@public.gmane.org>
Cc: Ben Blum <bblum-OM76b2Iv3yLQjUSlxSEPGw@public.gmane.org>,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org,
	"linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org"
	<linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
	Paul Menage <menage-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>,
	Oleg Nesterov <oleg-6lXkIZvqkOAvJsYlp49lxw@public.gmane.org>
Subject: Re: Possible race between cgroup_attach_proc and de_thread, and questionable code in de_thread.
Date: Thu, 28 Jul 2011 05:17:41 -0700
Message-ID: <20110728121741.GB2427__24923.7984804951$1311855529$gmane$org@linux.vnet.ibm.com>
In-Reply-To: <20110728110813.7ff84b13-wvvUuzkyo1EYVZTmpyfIwg@public.gmane.org>

On Thu, Jul 28, 2011 at 11:08:13AM +1000, NeilBrown wrote:
> On Wed, 27 Jul 2011 16:42:35 -0700 "Paul E. McKenney"
> <paulmck-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8@public.gmane.org> wrote:
> 
> > On Wed, Jul 27, 2011 at 11:07:10AM -0400, Ben Blum wrote:
> > > On Wed, Jul 27, 2011 at 05:11:01PM +1000, NeilBrown wrote:
> > 
> > [ . . . ]
> > 
> > > >  The race as I understand it is with this code:
> > > > 
> > > > 
> > > > 		list_replace_rcu(&leader->tasks, &tsk->tasks);
> > > > 		list_replace_init(&leader->sibling, &tsk->sibling);
> > > > 
> > > > 		tsk->group_leader = tsk;
> > > > 		leader->group_leader = tsk;
> > > > 
> > > > 
> > > >  which seems to be called with only tasklist_lock held, which doesn't seem to
> > > >  be held in the cgroup code.
> > > > 
> > > >  If the "thread_group_leader(leader)" call in cgroup_attach_proc() runs before
> > > >  this chunk is run with the same value for 'leader', but the
> > > >  while_each_thread is run after, then the while_each_thread() might loop
> > > >  forever.  rcu_read_lock doesn't prevent this from happening.
> > > 
> > > Somehow I was under the impression that holding tasklist_lock (for
> > > writing) provided exclusion from code that holds rcu_read_lock -
> > > probably because there are other points in the kernel which do
> > > while_each_thread with only RCU-read held (and not tasklist):
> > > 
> > > - kernel/hung_task.c, check_hung_uninterruptible_tasks()
> > 
> > This one looks OK to me.  The code is just referencing fields in each
> > of the task structures, and appears to be making proper use of
> > rcu_dereference().  All this code requires is that the task structures
> > remain in existence through the full lifetime of the RCU read-side
> > critical section, which is guaranteed because of the way the task_struct
> > is freed.
> 
> I disagree.  It also requires - by virtue of the use of while_each_thread() -
> that 'g' remains on the list that 't' is walking along.

Doesn't the following code in the loop body deal with this possibility?

	/* Exit if t or g was unhashed during refresh. */
	if (t->state == TASK_DEAD || g->state == TASK_DEAD)
		goto unlock;

Yes, a concurrent de_thread() could cause some of the tasks to be skipped,
but if there really is a hung thread, it will still be there to be caught
next time, right?

							Thanx, Paul

> Now for a normal list, the head always stays on the list and is accessible
> even from an rcu-removed entry.  But the thread_group list isn't a normal
> list.  It doesn't have a distinct head.  It is a loop of all of the
> 'task_structs' in a thread group.  One of them is designated the 'leader' but
> de_thread() can change the 'leader' - though it doesn't remove the old leader.
> 
> __unhash_process in kernel/exit.c looks like it could remove the leader from the
> list and definitely could remove a non-leader.
> 
> So if a non-leader calls 'exec' and the leader calls 'exit', then a
> task_struct that was the leader could become a non-leader and then be removed
> from the list that kernel/hung_task could be walking along.
> 
> So I don't think that while_each_thread() is currently safe.  It depends on
> the thread leader not disappearing and I think it can.
> 
> So I'm imagining a patch like this to ensure that while_each_thread() is
> actually safe.  If it is always safe you can remove that extra check in
> cgroup_attach_proc() which looked wrong.
> 
> I just hope someone who understands the process tree is listening..
> The change in exit.c is the most uncertain part.
> 
> NeilBrown
> 
> diff --git a/fs/exec.c b/fs/exec.c
> index 6075a1e..c9ea5f0 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -960,6 +960,9 @@ static int de_thread(struct task_struct *tsk)
>  		list_replace_init(&leader->sibling, &tsk->sibling);
> 
>  		tsk->group_leader = tsk;
> +		smp_mb(); /* ensure that any reader will always be able to see
> +			   * a task that claims to be the group leader
> +			   */
>  		leader->group_leader = tsk;
> 
>  		tsk->exit_signal = SIGCHLD;
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 14a6c7b..13e0192 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -2267,8 +2267,10 @@ extern bool current_is_single_threaded(void);
>  #define do_each_thread(g, t) \
>  	for (g = t = &init_task ; (g = t = next_task(g)) != &init_task ; ) do
> 
> +/* Thread group leader can change, so stop loop when we see one
> + * even if it isn't 'g' */
>  #define while_each_thread(g, t) \
> -	while ((t = next_thread(t)) != g)
> +	while ((t = next_thread(t)) != g && !thread_group_leader(t))
> 
>  static inline int get_nr_threads(struct task_struct *tsk)
>  {
> diff --git a/kernel/exit.c b/kernel/exit.c
> index f2b321b..d6cef25 100644
> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -70,8 +70,13 @@ static void __unhash_process(struct task_struct *p, bool group_dead)
>  		list_del_rcu(&p->tasks);
>  		list_del_init(&p->sibling);
>  		__this_cpu_dec(process_counts);
> -	}
> -	list_del_rcu(&p->thread_group);
> +	} else
> +		/* only remove members from the thread group.
> +		 * The thread group leader must stay so that
> +		 * while_each_thread() uses can see the end of
> +		 * the list and stop.
> +		 */
> +		list_del_rcu(&p->thread_group);
>  }
> 
>  /*
> 
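
To make the failure mode under discussion concrete, here is a minimal
userspace sketch of a head-less ring like the thread_group list. It is not
kernel code and it ignores RCU and memory ordering entirely. The names
struct task, walk(), ring_del_keep_next(), run_scenario() and WALK_LIMIT are
hypothetical and exist only for this illustration. It models a two-thread
group in which the non-leader takes over leadership and the old leader is
then unlinked while a reader still holds it as 'g'.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the few task_struct fields the thread discusses. */
struct task {
	struct task *next;         /* models the thread_group ring */
	struct task *prev;
	struct task *group_leader;
};

#define WALK_LIMIT 100 /* hypothetical bound so the demo cannot spin forever */

static bool is_group_leader(const struct task *t)
{
	return t->group_leader == t;
}

/* Like list_del_rcu(): unlink p but leave p->next intact for readers. */
static void ring_del_keep_next(struct task *p)
{
	p->prev->next = p->next;
	p->next->prev = p->prev;
}

/*
 * Walk the ring starting at g.  With stop_at_leader == false the loop ends
 * only when it gets back to g (the original while_each_thread() condition);
 * with stop_at_leader == true it also ends on reaching any group leader
 * (the condition in the proposed patch).  Returns the number of tasks
 * visited, or -1 if WALK_LIMIT was exceeded.
 */
static int walk(struct task *g, bool stop_at_leader)
{
	struct task *t = g;
	int visits = 0;

	do {
		if (++visits > WALK_LIMIT)
			return -1;
		t = t->next;
	} while (t != g && !(stop_at_leader && is_group_leader(t)));

	return visits;
}

int run_scenario(bool stop_at_leader)
{
	struct task leader, tsk;

	/* Two-entry ring with 'leader' as the group leader. */
	leader.next = leader.prev = &tsk;
	tsk.next = tsk.prev = &leader;
	leader.group_leader = tsk.group_leader = &leader;
	assert(walk(&leader, stop_at_leader) == 2);

	/* tsk execs: leadership moves to tsk, in the order de_thread() uses. */
	tsk.group_leader = &tsk;
	leader.group_leader = &tsk;

	/* The old leader is then unlinked from the ring. */
	ring_del_keep_next(&leader);

	/* A reader that chose g = &leader earlier now continues its walk. */
	return walk(&leader, stop_at_leader);
}
```

In this model run_scenario(false) hits the bound and returns -1, because the
walk never gets back to the unlinked 'g', while run_scenario(true) returns 1
because the walk stops at the new leader. Whether the real kernel can reach
this interleaving is exactly what the thread is debating.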
