From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752539AbaLYNLU (ORCPT ); Thu, 25 Dec 2014 08:11:20 -0500
Received: from www262.sakura.ne.jp ([202.181.97.72]:37784 "EHLO
	www262.sakura.ne.jp" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1752034AbaLYNLT (ORCPT );
	Thu, 25 Dec 2014 08:11:19 -0500
To: mingo@redhat.com, peterz@infradead.org
Cc: linux-kernel@vger.kernel.org
Subject: [PATCH] sched/fair: Fix RCU stall upon ENOMEM at sched_create_group().
From: Tetsuo Handa
Message-Id: <201412252210.GCC30204.SOMVFFOtQJFLOH@I-love.SAKURA.ne.jp>
X-Mailer: Winbiff [Version 2.51 PL2]
X-Accept-Language: ja,en,zh
Date: Thu, 25 Dec 2014 22:10:45 +0900
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

>>From 052595ab1a1d1c5668d9de61395c9cc17694597e Mon Sep 17 00:00:00 2001
From: Tetsuo Handa
Date: Thu, 25 Dec 2014 15:51:21 +0900
Subject: [PATCH] sched/fair: Fix RCU stall upon ENOMEM at sched_create_group().

When alloc_fair_sched_group() in sched_create_group() fails,
free_sched_group() is called, which in turn calls
free_fair_sched_group(). Since free_fair_sched_group() calls
destroy_cfs_bandwidth() even when init_cfs_bandwidth() was never called,
an RCU stall occurs at hrtimer_cancel().

  INFO: rcu_sched self-detected stall on CPU { 1}  (t=60000 jiffies g=13074 c=13073 q=0)
  Task dump for CPU 1:
  (fprintd)       R  running task        0  6249      1 0x00000088
   ffffffff81c47d40 ffff88007fc43d78 ffffffff81094988 0000000000000001
   ffffffff81c47d40 ffff88007fc43d98 ffffffff81097acd ffff88007fc43dd8
   0000000000000002 ffff88007fc43dc8 ffffffff810c3a80 ffff88007fc4d840
  Call Trace:
   [] sched_show_task+0xa8/0x110
   [] dump_cpu_task+0x3d/0x50
   [] rcu_dump_cpu_stacks+0x90/0xd0
   [] rcu_check_callbacks+0x491/0x700
   [] update_process_times+0x4b/0x80
   [] tick_sched_handle.isra.20+0x36/0x50
   [] tick_sched_timer+0x42/0x70
   [] __run_hrtimer+0x69/0x1a0
   [] ? tick_sched_handle.isra.20+0x50/0x50
   [] hrtimer_interrupt+0xef/0x230
   [] local_apic_timer_interrupt+0x3b/0x70
   [] smp_apic_timer_interrupt+0x45/0x60
   [] apic_timer_interrupt+0x6d/0x80
   [] ? lock_hrtimer_base.isra.23+0x18/0x50
   [] ? __kmalloc+0x211/0x230
   [] hrtimer_try_to_cancel+0x22/0xd0
   [] ? __kmalloc+0x211/0x230
   [] hrtimer_cancel+0x22/0x30
   [] free_fair_sched_group+0x25/0xd0
   [] free_sched_group+0x16/0x40
   [] sched_create_group+0x4b/0x80
   [] sched_autogroup_create_attach+0x43/0x1c0
   [] sys_setsid+0x7c/0x110
   [] system_call_fastpath+0x12/0x17

Check whether init_cfs_bandwidth() was called before calling
destroy_cfs_bandwidth().

Signed-off-by: Tetsuo Handa
---
 kernel/sched/fair.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index ef2b104..586ee15 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7756,8 +7756,12 @@ static void task_move_group_fair(struct task_struct *p, int queued)
 void free_fair_sched_group(struct task_group *tg)
 {
 	int i;
+	struct cfs_bandwidth *cfs_b;
 
-	destroy_cfs_bandwidth(tg_cfs_bandwidth(tg));
+	/* Check whether init_cfs_bandwidth() was called. */
+	cfs_b = tg_cfs_bandwidth(tg);
+	if (cfs_b->throttled_cfs_rq.next)
+		destroy_cfs_bandwidth(cfs_b);
 
 	for_each_possible_cpu(i) {
 		if (tg->cfs_rq)
-- 
1.8.3.1
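
For anyone who wants to try the guard outside the kernel, below is a
minimal userspace sketch of the same idea. It is not kernel code: struct
bandwidth, struct group, create_group(), free_group() and the other names
are hypothetical stand-ins for the scheduler structures, and
destroy_calls exists only so the behaviour can be observed. It assumes,
as the patch does, that the object is zero-allocated, so a NULL list
pointer means the init function never ran.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Minimal stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical, simplified analogue of struct cfs_bandwidth. */
struct bandwidth {
	struct list_head throttled;	/* stays zeroed until init_bandwidth() */
	int timer_armed;		/* stands in for the hrtimers */
};

/* Hypothetical analogue of struct task_group. */
struct group {
	struct bandwidth bw;
	int *per_cpu;			/* stands in for tg->cfs_rq */
};

int destroy_calls;	/* counts teardown calls, for observation only */

void init_bandwidth(struct bandwidth *b)
{
	/* Like INIT_LIST_HEAD(): an initialized empty list points at itself. */
	b->throttled.next = &b->throttled;
	b->throttled.prev = &b->throttled;
	b->timer_armed = 1;
}

void destroy_bandwidth(struct bandwidth *b)
{
	/* Teardown is only valid on an initialized object. */
	assert(b->throttled.next != NULL);
	b->timer_armed = 0;
	destroy_calls++;
}

void free_group(struct group *g)
{
	/* The guard from the patch: skip teardown if init never ran. */
	if (g->bw.throttled.next)
		destroy_bandwidth(&g->bw);
	free(g->per_cpu);
	free(g);
}

/* Returns NULL on failure; fail_alloc simulates ENOMEM on the inner allocation. */
struct group *create_group(bool fail_alloc)
{
	struct group *g = calloc(1, sizeof(*g));	/* zeroed, like kzalloc() */

	if (!g)
		return NULL;
	g->per_cpu = fail_alloc ? NULL : calloc(4, sizeof(*g->per_cpu));
	if (!g->per_cpu) {
		free_group(g);	/* error path runs before init_bandwidth() */
		return NULL;
	}
	init_bandwidth(&g->bw);
	return g;
}
```

Calling create_group(true) exercises the failure path: free_group() sees
a NULL next pointer and skips destroy_bandwidth(). Calling
create_group(false) and then free_group() runs the teardown exactly once.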