From: ebiederm@xmission.com (Eric W. Biederman)
To: Oleg Nesterov
Cc: EunTaik Lee, "mingo@redhat.com", "peterz@infradead.org", "linux-kernel@vger.kernel.org"
References: <20161209093351epcms1p418673c3cdec7d4c3e81b5df131173c57@epcms1p4> <20161209172114.GA25742@redhat.com>
Date: Sat, 10 Dec 2016 11:21:59 +1300
In-Reply-To: <20161209172114.GA25742@redhat.com> (Oleg Nesterov's message of "Fri, 9 Dec 2016 18:21:15 +0100")
Message-ID: <87wpf8hhfc.fsf@xmission.com>
Subject: Re: [PATCH] sched/pid fix use-after free in task_tgid_vnr
X-Mailing-List: linux-kernel@vger.kernel.org

Oleg Nesterov writes:

> On 12/09, EunTaik Lee wrote:
>>
>> There is a use-after-free case with below call stack.
>>
>> pid_nr_ns+0x10/0x38
>> cgroup_pidlist_start+0x144/0x400
>> cgroup_seqfile_start+0x1c/0x24
>> kernfs_seq_start+0x54/0x90
>> seq_read+0x15c/0x3a8
>> kernfs_fop_read+0x38/0x160
>> __vfs_read+0x28/0xc8
>> vfs_read+0x84/0xfc

How is this a use-after-free?  The function pid_nr_ns should take a NULL
pointer as input and return 0?

Certainly if the addition of pid_alive fixes it, pid_vnr(task_tgid(tsk))
is fine.

Are we perhaps missing rcu locking?

Or is the problem simply that in task_tgid we are accessing
task->group_leader which may already be dead?
If so the fix needs to be in task_tgid.

> This reminds about perf_event_pid() which is equally buggy...
>
>> static inline pid_t task_tgid_vnr(struct task_struct *tsk)
>> {
>> -       return pid_vnr(task_tgid(tsk));
>> +       pid_t pid = 0;
>> +
>> +       rcu_read_lock();
>> +       if (pid_alive(tsk))
>> +               pid = pid_vnr(task_tgid(tsk));
>> +       rcu_read_unlock();
>> +
>> +       return pid;
>> }
>
> Eric, EunTaik, what do you think about the patch below?
>
> I can't decide whether it is too ugly or not, but it would be nice
> to avoid the code duplication.

I think it can be beaten into shape but I am not certain it addresses
the core issue.

>
> Oleg.
>
>
> --- x/include/linux/pid.h
> +++ x/include/linux/pid.h
> @@ -8,7 +8,8 @@ enum pid_type
>       PIDTYPE_PID,
>       PIDTYPE_PGID,
>       PIDTYPE_SID,
> -     PIDTYPE_MAX
> +     PIDTYPE_MAX,
> +     PIDTYPE_TGID    /* do not use */

I would do:

/* __PIDTYPE_TGID is only valid to __task_pid_nr_ns */
#define __PIDTYPE_TGID PIDTYPE_MAX

Prefixing __PIDTYPE_TGID with __ should help make it clear this is a
special use define.

I am also curious why pid_alive is the proper check to see if
task->group_leader is valid?  That feels like it could get us into
trouble later.  Especially as that is the real problem child here.

>  };
>
>  /*
> --- x/kernel/pid.c
> +++ x/kernel/pid.c
> @@ -526,8 +526,11 @@ pid_t __task_pid_nr_ns(struct task_struc
>       if (!ns)
>               ns = task_active_pid_ns(current);
>       if (likely(pid_alive(task))) {
> -             if (type != PIDTYPE_PID)
> +             if (type != PIDTYPE_PID) {
> +                     if (type == PIDTYPE_TGID)
> +                             type = PIDTYPE_PID;
>                       task = task->group_leader;
> +             }
>               nr = pid_nr_ns(rcu_dereference(task->pids[type].pid), ns);
>       }
>       rcu_read_unlock();
> @@ -538,7 +541,7 @@ EXPORT_SYMBOL(__task_pid_nr_ns);
>
>  pid_t task_tgid_nr_ns(struct task_struct *tsk, struct pid_namespace *ns)
>  {
> -     return pid_nr_ns(task_tgid(tsk), ns);
> +     return __task_pid_nr_ns(tsk, PIDTYPE_TGID, ns);
>  }
>  EXPORT_SYMBOL(task_tgid_nr_ns);
>
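
To make the combined suggestion concrete, here is a rough user-space
sketch of how the lookup might behave with __PIDTYPE_TGID aliased to
PIDTYPE_MAX.  Everything prefixed model_ is a hypothetical stand-in
invented for illustration, not the kernel's definitions: the alive flag
replaces pid_alive(), and RCU locking and pid namespaces are left out
entirely, so it says nothing about whether pid_alive is the right
guard for task->group_leader.

```c
#include <assert.h>
#include <stddef.h>

/* User-space model only; none of this is the kernel's real definition. */
enum pid_type {
	PIDTYPE_PID,
	PIDTYPE_PGID,
	PIDTYPE_SID,
	PIDTYPE_MAX
};

/* Special-use alias, meant to be understood only by model_task_pid_nr(). */
#define __PIDTYPE_TGID PIDTYPE_MAX

struct model_pid {
	int nr;
};

struct model_task {
	int alive;				/* stand-in for pid_alive() */
	struct model_task *group_leader;
	struct model_pid *pids[PIDTYPE_MAX];
};

/* Models the expectation that a NULL pid maps to 0. */
int model_pid_nr(const struct model_pid *pid)
{
	return pid ? pid->nr : 0;
}

/* Models the quoted __task_pid_nr_ns() hunk, minus RCU and namespaces. */
int model_task_pid_nr(const struct model_task *task, int type)
{
	int nr = 0;

	if (task->alive) {
		if (type != PIDTYPE_PID) {
			/* TGID is the group leader's PIDTYPE_PID entry. */
			if (type == __PIDTYPE_TGID)
				type = PIDTYPE_PID;
			task = task->group_leader;
		}
		nr = model_pid_nr(task->pids[type]);
	}
	return nr;
}

/* Small self-check: a leader (100) with one extra thread (101). */
int model_self_check(void)
{
	struct model_pid leader_pid = { 100 }, thread_pid = { 101 };
	struct model_task leader = { 1, NULL, { &leader_pid, NULL, NULL } };
	struct model_task thread = { 1, &leader, { &thread_pid, NULL, NULL } };

	leader.group_leader = &leader;

	assert(model_pid_nr(NULL) == 0);
	assert(model_task_pid_nr(&thread, PIDTYPE_PID) == 101);
	assert(model_task_pid_nr(&thread, __PIDTYPE_TGID) == 100);

	/* Once the task is no longer alive, group_leader is never touched. */
	thread.alive = 0;
	assert(model_task_pid_nr(&thread, __PIDTYPE_TGID) == 0);
	return 0;
}
```

Because the alias equals PIDTYPE_MAX it can never index pids[] directly;
it has to be remapped to PIDTYPE_PID first, which is why it is only
meaningful inside the one helper.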