From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 23 Jun 2011 14:51:07 +0200
From: Ingo Molnar
To: Tejun Heo
Cc: Thomas Gleixner, LKML, Peter Zijlstra, Jens Axboe, Linus Torvalds
Subject: Re: [patch 4/4] sched: Distangle worker accounting from rq->lock
Message-ID: <20110623125107.GB15430@elte.hu>
References: <20110622174659.496793734@linutronix.de> <20110622174919.135236139@linutronix.de> <20110623083722.GF30101@htj.dyndns.org> <20110623101541.GL30101@htj.dyndns.org> <20110623104455.GA9274@elte.hu>
X-Mailing-List: linux-kernel@vger.kernel.org

* Tejun Heo wrote:

> Hello, Ingo.
>
> On Thu, Jun 23, 2011 at 12:44 PM, Ingo Molnar wrote:
> >
> > * Tejun Heo wrote:
> >
> >> The patch description is simply untrue.  It does affect how wq
> >> behaves under heavy CPU load.  The effect might be perfectly okay
> >> but more likely it will result in subtle suboptimal behaviors under
> >> certain load situations which would be difficult to characterize
> >> and track down.  Again, the trade off (mostly killing of
> >> ttwu_local) could be okay but you can't get away with just claiming
> >> "there's no harm".
> >
> > Well, either it can be measured or not.
> > If you can suggest a specific testing method to Thomas, please
> > do.
>
> Crafting a test case where the suggested change results in worse
> behavior isn't difficult (it ends up waking/creating workers which
> it doesn't have to between ttwu and actual execution); however, as
> with any micro benchmark, the problem is with assessing whether and
> how much it would matter in actual workloads (whatever that means),
> and workqueue is used throughout the kernel with widely varying
> patterns, making drawing conclusions a bit tricky. [...]

Well, please suggest a workload where it *matters* - as i suspect
that for any workload tglx comes up with, you could suggest another
90% of workloads he missed: so it's much better if you suggest a
testing method.

When someone comes to me with a scheduler change i can give them the
workloads that they should double check. See the changelog of this
recent commit for example:

  c8b281161dfa: sched: Increase SCHED_LOAD_SCALE resolution

So please suggest a testing method.

> [...] Given that, changing the behavior for the worse just for this
> cleanup doesn't sound like too sweet a deal. Is there any other
> reason to change the behavior (latency, whatever) other than the
> ttwu_local ugliness?

Well, the ugliness is one aspect of it, but my main concern is simply
testability: any claim of speedup or slowdown ought to be testable,
right?

I mean, we'd like to drive people towards coming up with patches and
numbers like Nikhil Rao did in c8b281161dfa, right?

Thanks,

	Ingo