From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751355AbdHaAzv (ORCPT ); Wed, 30 Aug 2017 20:55:51 -0400
Received: from mail-qt0-f194.google.com ([209.85.216.194]:37067 "EHLO
	mail-qt0-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751086AbdHaAzt (ORCPT ); Wed, 30 Aug 2017 20:55:49 -0400
Date: Wed, 30 Aug 2017 17:55:45 -0700
From: Tejun Heo
To: Neeraj Upadhyay
Cc: lizefan@huawei.com, mingo@kernel.org, longman@redhat.com,
	akpm@linux-foundation.org, linux-kernel@vger.kernel.org,
	sramana@codeaurora.org, prsood@codeaurora.org
Subject: Re: [PATCH] cgroup: Fix potential race between cgroup_exit and migrate path
Message-ID: <20170831005545.GA491396@devbig577.frc2.facebook.com>
References: <1504097649-32754-1-git-send-email-neeraju@codeaurora.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1504097649-32754-1-git-send-email-neeraju@codeaurora.org>
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hello, Neeraj.

On Wed, Aug 30, 2017 at 06:24:09PM +0530, Neeraj Upadhyay wrote:
> There is a potential race between cgroup_exit() and the
> migration path. This race happens because cgroup_exit path
> reads the css_set and does cg_list empty check outside of
> css_set lock. This can potentially race with the migrate path
> trying to move the tasks to a different css_set.
> For instance,
> below is the interleaved sequence of events, where race is
> observed:
>
> cpuset_hotplug_workfn()
>   cgroup_transfer_tasks()
>     cgroup_migrate()
>       cgroup_migrate_execute()
>         css_set_move_task()
>           list_del_init(&task->cg_list);
>
> cgroup_exit()
>   cset = task_css_set(tsk);
>   if (!list_empty(&tsk->cg_list))
>
>           list_add_tail(&task->cg_list, use_mg_tasks
>
> In above sequence, as cgroup_exit() read the cg_list for
> the task as empty, it didn't disassociate it from its
> current css_set, and was moved to new css_set instance
> css_set_move_task() called from cpuset_hotplug_workfn()
> path. This eventually can result in use after free scenarios,
> while accessing the same task_struct again, like in following
> sequence:
>
> kernfs_seq_start()
>   cgroup_seqfile_start()
>     cgroup_pidlist_start()
>       css_task_iter_next()
>         __put_task_struct()
>
>
> Fix this problem, by moving the css_set and cg_list fetch in
> cgroup_exit() inside css_set lock.

Hmm... I haven't really thought through but could the problem be that
css_set_move_task() is temporarily making ->cg_list empty?  The
use_task_css_set_links optimization can't handle that.  Would
something like the following fix the issue?

Thanks.

diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index df2e0f1..cd85ca0 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -683,7 +683,7 @@ static void css_set_move_task(struct task_struct *task,
 				if (it->task_pos == &task->cg_list)
 					css_task_iter_advance(it);
 
-		list_del_init(&task->cg_list);
+		list_del(&task->cg_list);
 		if (!css_set_populated(from_cset))
 			css_set_update_populated(from_cset, false);
 	} else {
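
For illustration only, here is a rough userspace sketch of why the
one-line change above could matter. It is not the kernel's
<linux/list.h>: the list helpers are simplified reimplementations,
poison1/poison2 are hypothetical stand-ins for LIST_POISON1/2, and
looks_unlinked_mid_move() is a made-up name. The sketch samples
list_empty() on the task's node in the window between the unlink and the
re-add, which is roughly what an unlocked check would observe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical stand-ins for LIST_POISON1/2; only their addresses matter. */
static struct list_head poison1, poison2;

static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* A node counts as "empty" when it points at itself. */
static bool list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add_tail(struct list_head *n, struct list_head *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static void list_unlink(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
}

/* Unlink and poison: the node no longer points at itself. */
static void list_del(struct list_head *e)
{
	list_unlink(e);
	e->next = &poison1;
	e->prev = &poison2;
}

/* Unlink and reinitialize: the node now looks like an empty list. */
static void list_del_init(struct list_head *e)
{
	list_unlink(e);
	INIT_LIST_HEAD(e);
}

/*
 * Move a node from one list to another and report what list_empty()
 * on the node returns in the window between unlink and re-add.
 */
static bool looks_unlinked_mid_move(bool use_del_init)
{
	struct list_head from, to, task;
	bool seen_empty;

	INIT_LIST_HEAD(&from);
	INIT_LIST_HEAD(&to);
	list_add_tail(&task, &from);

	if (use_del_init)
		list_del_init(&task);
	else
		list_del(&task);

	seen_empty = list_empty(&task);	/* the racy observation point */

	list_add_tail(&task, &to);
	assert(!list_empty(&to));
	return seen_empty;
}
```

With list_del_init() the node briefly looks empty, so a check like the
one in cgroup_exit() could skip the task; with list_del() it does not.
Whether that alone closes the race in the real code is a separate
question that this sketch does not answer.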