linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: ebiederm@xmission.com (Eric W. Biederman)
To: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Aaron Tomlin <atomlin@redhat.com>,
	Pavel Emelyanov <xemul@parallels.com>,
	Serge Hallyn <serge.hallyn@ubuntu.com>,
	Sterling Alexander <stalexan@redhat.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 2/2] exit: pidns: alloc_pid() leaks pid_namespace if child_reaper is exiting
Date: Tue, 25 Nov 2014 12:43:15 -0600	[thread overview]
Message-ID: <87wq6j9l4s.fsf@x220.int.ebiederm.org> (raw)
In-Reply-To: <20141125181521.GA31963@redhat.com> (Oleg Nesterov's message of "Tue, 25 Nov 2014 19:15:21 +0100")

Oleg Nesterov <oleg@redhat.com> writes:

> On 11/25, Eric W. Biederman wrote:
>>
>> Oleg Nesterov <oleg@redhat.com> writes:
>>
>> > On 11/24, Eric W. Biederman wrote:
>> >>
>> >> However at the moment my I can't figure out if it is safe to move
>> >> get_pid_ns elow hlist_add_head_rcu.  Because once we are on the rcu list
>> >> the pid is findable, and being publicly visible with a bad refcount could cause
>> >> problems.
>> >
>> > The caller has a reference, this ns can't go away. Obviously, otherwise
>> > get_pid_ns(ns) is not safe.
>> >
>> > We need this get_pid_ns() to balance put_pid()->put_pid_ns() which obviously
>> > won't be called until we return this pid, otherwise everything is wrong.
>> >
>> > So I think this should be safe?
>>
>> My concern is exposing a half initialized struct pid to the world via an
>> rcu data structure.  In particular could one of the rcu users get into
>> trouble because we haven't called get_pid_ns yet?  That is unclear to me.
>
> They can't. This pid was fully initialized, in particular
> pid->numbers[pid->level].ns == ns has a reference.
>
> Just it is not ready for put_pid() which will be called by the "owner" of
> this pid, the caller or the new child. So in this sense it doesn't matter
> when we call get_pid_ns(), just we need to do this before return.

Or by someone calling find_get_pid() ... put_pid().

Now the reference count should not hit zero in that case but I hate to
think of that case separately.

>> That is one of those weird nasty races I would rather not have to
>> consider and moving the get_pid_ns after hlist_add requires that we
>> think about it.
>>
>> To fix the error handling and avoid thinking about the races we have two
>> choices:
>> - In the error path that is currently called out_unlock we can drop the
>>   extra references.
>> - Immediately after we perform the test that on error jumps to out_unlock
>>   we call get_pid_ns.
>>
>> My preference would be the first, as it is a trivially correct one line
>> change.
>>
>> Aka I think this is the obviously correct trivial fix.
>>
>>  out_unlock:
>>  	spin_unlock_irq(&pidmap_lock);
>> +	put_pid_ns(ns);
>
> Sure, initially I was going to do this. But this is sub-optimal imo, I mainly
> mean less clear (imho).
>
> But again, I won't argue. I'll send V2 once we finish the discussion
> about 2/2.

At this point, and especially since we need to Cc stable and get this
fix backported to who knows how many kernel releases having something
that is trivial to validate is correct is important.

If you prefer to call get_pid_ns() immedately after:

	if (!(ns->nr_hashed & PIDNS_HASH_ADDING))
		goto out_unlock;

That would be fine with me as well.

Anything else to too clever for my brain to verify is correct today.

Eric


  reply	other threads:[~2014-11-25 18:44 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-11-07 20:14 [PATCH 0/4] proc: deuglify task_state() Oleg Nesterov
2014-11-07 20:14 ` [PATCH 1/4] proc: task_state: read cred->group_info outside of task_lock() Oleg Nesterov
2014-11-07 20:14 ` [PATCH 2/4] proc: task_state: deuglify the max_fds calculation Oleg Nesterov
2014-11-07 20:14 ` [PATCH 3/4] proc: task_state: move the main seq_printf() outside of rcu_read_lock() Oleg Nesterov
2014-11-13 18:04   ` Paul E. McKenney
2014-11-07 20:14 ` [PATCH 4/4] proc: task_state: ptrace_parent() doesn't need pid_alive() check Oleg Nesterov
2014-11-10 21:59 ` [PATCH 0/5] exit: reparent zombie fix + cleanups/optimizations Oleg Nesterov
2014-11-10 22:00   ` [PATCH 1/5] sched_show_task: fix unsafe usage of ->real_parent Oleg Nesterov
2014-11-11 10:39     ` Peter Zijlstra
2014-11-10 22:00   ` [PATCH 2/5] exit: reparent: use ->ptrace_entry rather than ->sibling for EXIT_DEAD tasks Oleg Nesterov
2014-11-10 22:00   ` [PATCH 3/5] exit: reparent: cleanup the changing of ->parent Oleg Nesterov
2014-11-10 22:00   ` [PATCH 4/5] exit: reparent: cleanup the usage of reparent_leader() Oleg Nesterov
2014-11-10 22:00   ` [PATCH 5/5] exit: ptrace: shift "reap dead" code from exit_ptrace() to forget_original_parent() Oleg Nesterov
2014-11-14  1:37 ` [PATCH 0/5] exit: more cleanups/optimizations Oleg Nesterov
2014-11-14  1:38   ` [PATCH 1/5] exit: wait: cleanup the ptrace_reparented() checks Oleg Nesterov
2014-11-14  1:38   ` [PATCH 2/5] exit: wait: don't use zombie->real_parent Oleg Nesterov
2014-11-14  1:38   ` [PATCH 3/5] exit: wait: drop tasklist_lock before psig->c* accounting Oleg Nesterov
2014-11-14  1:38   ` [PATCH 4/5] exit: release_task: fix the comment about group leader accounting Oleg Nesterov
2014-11-14  1:38   ` [PATCH 5/5] exit: proc: don't try to flush /proc/tgid/task/tgid Oleg Nesterov
2014-11-18 21:29 ` [PATCH 0/6] exit: find_new_reaper() fixes/cleanups Oleg Nesterov
2014-11-18 21:30   ` [PATCH 1/6] exit: reparent: fix the dead-parent PR_SET_CHILD_SUBREAPER reparenting Oleg Nesterov
2014-11-18 21:30   ` [PATCH 2/6] exit: reparent: fix the cross-namespace " Oleg Nesterov
2014-11-18 21:30   ` [PATCH 3/6] exit: reparent: s/while_each_thread/for_each_thread/ in find_new_reaper() Oleg Nesterov
2014-11-18 21:30   ` [PATCH 4/6] exit: reparent: document the ->has_child_subreaper checks Oleg Nesterov
2014-11-18 21:30   ` [PATCH 5/6] exit: reparent: introduce find_child_reaper() Oleg Nesterov
2014-11-18 21:30   ` [PATCH 6/6] exit: reparent: introduce find_alive_thread() Oleg Nesterov
2014-11-20 18:34 ` [PATCH 0/3] exit: avoid O(n ** 2) thread-list scan on group-exit if possible Oleg Nesterov
2014-11-20 18:34   ` [PATCH -mm 1/3] exit: reparent: avoid find_new_reaper() if no children Oleg Nesterov
2014-11-20 22:37     ` Andrew Morton
2014-11-21 20:01       ` Oleg Nesterov
2014-11-20 18:34   ` [PATCH -mm 2/3] exit: reparent: call forget_original_parent() under tasklist_lock Oleg Nesterov
2014-11-20 18:34   ` [PATCH -mm 3/3] exit: exit_notify: re-use "dead" list to autoreap current Oleg Nesterov
2014-11-24 20:06 ` [PATCH 0/2] exit/pid_ns: comments + simple fix Oleg Nesterov
2014-11-24 20:06   ` [PATCH 1/2] exit: reparent: fix/update the comments in zap_pid_ns_processes() Oleg Nesterov
2014-11-24 20:14     ` Oleg Nesterov
2014-11-24 22:07     ` Eric W. Biederman
2014-11-25 16:57       ` Oleg Nesterov
2014-11-25 17:17         ` Oleg Nesterov
2014-11-24 20:06   ` [PATCH 2/2] exit: pidns: alloc_pid() leaks pid_namespace if child_reaper is exiting Oleg Nesterov
2014-11-24 21:46     ` Eric W. Biederman
2014-11-25 17:07       ` Oleg Nesterov
2014-11-25 17:50         ` Eric W. Biederman
2014-11-25 18:15           ` Oleg Nesterov
2014-11-25 18:43             ` Eric W. Biederman [this message]
2014-11-25 18:59               ` Oleg Nesterov
2014-11-24 21:27   ` [PATCH 0/2] exit/pid_ns: comments + simple fix Eric W. Biederman
2014-11-24 21:38     ` Oleg Nesterov
2014-11-24 21:48   ` Eric W. Biederman
2014-11-25 16:57     ` Oleg Nesterov
2014-11-26 23:54   ` [PATCH v2 " Oleg Nesterov
2014-11-26 23:54     ` [PATCH v2 1/2] exit: pidns: alloc_pid() leaks pid_namespace if child_reaper is exiting Oleg Nesterov
2014-11-27 15:44       ` Eric W. Biederman
2014-11-26 23:54     ` [PATCH v2 2/2] exit: pidns: fix/update the comments in zap_pid_ns_processes() Oleg Nesterov
2014-12-01 22:39       ` Andrew Morton
2014-12-01 23:24         ` Oleg Nesterov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87wq6j9l4s.fsf@x220.int.ebiederm.org \
    --to=ebiederm@xmission.com \
    --cc=akpm@linux-foundation.org \
    --cc=atomlin@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=oleg@redhat.com \
    --cc=serge.hallyn@ubuntu.com \
    --cc=stalexan@redhat.com \
    --cc=xemul@parallels.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).