From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753517AbbFAOQv (ORCPT ); Mon, 1 Jun 2015 10:16:51 -0400
Received: from bombadil.infradead.org ([198.137.202.9]:53115 "EHLO bombadil.infradead.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751975AbbFAOQo (ORCPT ); Mon, 1 Jun 2015 10:16:44 -0400
Date: Mon, 1 Jun 2015 16:16:37 +0200
From: Peter Zijlstra
To: umgwanakikbuti@gmail.com, mingo@elte.hu
Cc: ktkhai@parallels.com, rostedt@goodmis.org, juri.lelli@gmail.com, pang.xunlei@linaro.org, oleg@redhat.com, linux-kernel@vger.kernel.org
Subject: Re: [RFC][PATCH 0/7] sched: balance callbacks
Message-ID: <20150601141637.GT19282@twins.programming.kicks-ass.net>
References: <20150601135818.506080835@infradead.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20150601135818.506080835@infradead.org>
User-Agent: Mutt/1.5.21 (2012-12-30)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jun 01, 2015 at 03:58:18PM +0200, Peter Zijlstra wrote:
> Hi,
>
> Mike stumbled over a cute bug where the RT/DL balancing ops caused a bug.
>
> The exact scenario is __sched_setscheduler() changing a (runnable) task from
> FIFO to OTHER. In swiched_from_rt(), where we do pull_rt_task() we temporarity
> drop rq->lock. This gap allows regular cfs load-balancing to step in and
> migrate our.

s/\./ task&/

> However, check_class_changed() will happily continue with switched_to_fair()
> which assumes our task is still on the old rq and makes the kernel go boom.
>
> Instead of trying to patch this up and make things complicated; simply disallow
> these methods to drop rq->lock and extend the current post_schedule stuff into
> a balancing callback list, and use that.
>
> This survives Mike's testcase for well over an hour on my ivb-ep. I've not yet
> tested it on anything bigger.
>
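
As an illustration only (this is not code from the patch series), the idea in the quoted text, queueing the balancing work while rq->lock is held and running it only after the class change has finished, can be sketched in user-space C. Every name below (toy_rq, toy_callback, toy_queue_callback and so on) is hypothetical, and the "lock" is just a flag so the sketch can assert that it is never dropped in the middle of the class change.

```c
#include <assert.h>
#include <stddef.h>

struct toy_rq;

/* Hypothetical deferred-work node; not the kernel's callback type. */
struct toy_callback {
	struct toy_callback *next;
	void (*func)(struct toy_rq *rq);
	int queued;
};

/* Hypothetical stand-in for a runqueue. */
struct toy_rq {
	int lock_held;                /* models rq->lock as a simple flag */
	struct toy_callback *pending; /* callbacks queued under the lock */
	int balance_runs;             /* how many balancing passes ran */
};

void toy_lock(struct toy_rq *rq)   { assert(!rq->lock_held); rq->lock_held = 1; }
void toy_unlock(struct toy_rq *rq) { assert(rq->lock_held);  rq->lock_held = 0; }

/* Queue work with the lock held; nothing runs here, so the lock is never dropped. */
void toy_queue_callback(struct toy_rq *rq, struct toy_callback *cb,
			void (*func)(struct toy_rq *rq))
{
	assert(rq->lock_held);
	if (cb->queued)
		return;               /* already pending, do not queue twice */
	cb->queued = 1;
	cb->func = func;
	cb->next = rq->pending;
	rq->pending = cb;
}

/* Run queued work once the class change is complete. */
void toy_run_callbacks(struct toy_rq *rq)
{
	struct toy_callback *cb, *next;

	toy_lock(rq);
	cb = rq->pending;
	rq->pending = NULL;
	while (cb) {
		next = cb->next;
		cb->next = NULL;
		cb->queued = 0;
		cb->func(rq);
		cb = next;
	}
	toy_unlock(rq);
}

/* Balancing op that may drop and retake the lock, which is safe only when deferred. */
void toy_pull_task(struct toy_rq *rq)
{
	toy_unlock(rq);
	toy_lock(rq);
	rq->balance_runs++;
}

struct toy_callback toy_pull_cb;

/* Models the class change; returns 1 if the lock was dropped mid-change. */
int toy_change_class(struct toy_rq *rq)
{
	int lock_dropped;

	toy_lock(rq);
	/* "switched_from" step: defer the pull instead of doing it here. */
	toy_queue_callback(rq, &toy_pull_cb, toy_pull_task);
	/* "switched_to" step would run here, still under the same lock hold. */
	lock_dropped = !rq->lock_held;
	toy_unlock(rq);

	toy_run_callbacks(rq);
	return lock_dropped;
}
```

Calling toy_change_class() on a zero-initialised toy_rq runs the pull exactly once, and only after both halves of the class change have completed under a single lock hold; that ordering is the property the quoted paragraph is after.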