Message-ID: <516651C8.307@linux.vnet.ibm.com>
Date: Thu, 11 Apr 2013 14:01:44 +0800
From: Michael Wang
To: Peter Zijlstra
CC: LKML, Ingo Molnar, Mike Galbraith, Alex Shi, Namhyung Kim, Paul Turner, Andrew Morton, "Nikunj A. Dadhania", Ram Pai
Subject: Re: [PATCH] sched: wake-affine throttle
References: <5164DCE7.8080906@linux.vnet.ibm.com> <1365583873.30071.31.camel@laptop> <51652F43.7000300@linux.vnet.ibm.com>
In-Reply-To: <51652F43.7000300@linux.vnet.ibm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 04/10/2013 05:22 PM, Michael Wang wrote:
> Hi, Peter
>
> Thanks for your reply :)
>
> On 04/10/2013 04:51 PM, Peter Zijlstra wrote:
>> On Wed, 2013-04-10 at 11:30 +0800, Michael Wang wrote:
>>> | 15 GB | 32 | 35918 | | 37632 | +4.77% | 47923 | +33.42% |
>>> 52241 | +45.45%
>>
>> So I don't get this... is wake_affine() once every milisecond _that_
>> expensive?
>>
>> Seeing we get a 45%!! improvement out of once every 100ms that would
>> mean we're like spending 1/3rd of our time in wake_affine()? that's
>> preposterous. So what's happening?
>
> Not all the regression was caused by overhead, adopt curr_cpu not
> prev_cpu for select_idle_sibling() is a more important reason for the
> regression of pgbench.
>
> In other word, for pgbench, we waste time in wake_affine() and make the
> wrong decision at most of the time, the previously patch show
> wake_affine() do pull unrelated tasks together, that's good if current
> cpu still cached hot data for wakee, but that's not the case of the
> workload like pgbench.

Please let me know if I failed to express my thoughts clearly.

I know it is hard to see why the throttle brings so much benefit: the
wake-affine logic is a black box with too many unmeasurable factors.
That is exactly why we ended up with the throttle idea rather than an
approach like wakeup-buddy, although both of them help to stop the
regression.

Fortunately there is a benchmark that exposes the regression, and now
we have a simple and efficient approach ready for action ;-)

Regards,
Michael Wang

>
> The workload just don't satisfied the decision changed by wake-affine,
> the more wake-affine active, the more it suffered, that's why 100ms show
> better results than 1ms, but when reached some rate, the benefit and
> lost of wake-affine will be balanced.
>
> Regards,
> Michael Wang
>
>>
>>
>>
>
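
For readers following the thread, the effect of the throttle interval can be modelled roughly as below. This is only a toy sketch in Python, not the patch itself: the names `WakeAffineThrottle`, `try_affine`, `count_attempts` and the `interval_ms` parameter are hypothetical. The fallback to prev_cpu when an attempt is throttled is an assumption drawn from the discussion above.

```python
class WakeAffineThrottle:
    """Toy model: allow a wake-affine attempt at most once per interval."""

    def __init__(self, interval_ms):
        self.interval_ms = interval_ms  # hypothetical knob, e.g. 1 or 100
        self.next_allowed_ms = 0        # earliest time the next attempt may run

    def try_affine(self, now_ms):
        """Return True if a wake-affine attempt is allowed at now_ms."""
        if now_ms < self.next_allowed_ms:
            # Throttled: assumed here that the wakee stays near prev_cpu.
            return False
        self.next_allowed_ms = now_ms + self.interval_ms
        return True


def count_attempts(interval_ms, wakeups_ms):
    """Count how many of the given wakeup times get a wake-affine attempt."""
    throttle = WakeAffineThrottle(interval_ms)
    return sum(throttle.try_affine(t) for t in wakeups_ms)


if __name__ == "__main__":
    wakeups = range(1000)  # one wakeup per millisecond for one second
    print(count_attempts(1, wakeups))    # 1 ms interval
    print(count_attempts(100, wakeups))  # 100 ms interval
```

With one wakeup per millisecond, a 100ms interval lets through far fewer wake-affine attempts than a 1ms interval. If most of those decisions are wrong for a workload like pgbench, fewer attempts mean fewer bad placements, and not only less overhead.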