Message-ID: <1505666869.15333.110.camel@gmx.de>
Subject: Re: [lkp-robot] [sched/fair] 6d46bd3d97: netperf.Throughput_tps -11.3% regression
From: Mike Galbraith
To: Joel Fernandes, Rik van Riel
Cc: kernel test robot, LKML, Peter Zijlstra, Josef Bacik, Juri Lelli, Brendan Jackman, Dietmar Eggemann, Matt Fleming, Ingo Molnar, lkp@01.org
Date: Sun, 17 Sep 2017 18:47:49 +0200
References: <20170827010226.19703-1-joelaf@google.com> <20170910134021.GB29265@yexl-desktop> <1505098524.18240.42.camel@gmx.de> <1505404603.12821.19.camel@redhat.com>

On Sat, 2017-09-16 at 23:42 -0700, Joel Fernandes wrote:
>
> Yes I understand. However with my 'strong sync' patch, such a
> balancing check could be useful which is what I was trying to do in a
> different way in my patch - but it could be that my way is not good
> enough and potentially the old wake_affine check could help here

Help how?  The old wake_affine() check contained zero concurrency
information; it served to exclude excessive stacking, which defeats the
purpose of SMP.  A truly synchronous wakeup has absolutely nothing to do
with load balance in the general case: you can neither generate nor cure
an imbalance by replacing one (nice zero) task with another.  The mere
existence of a load based braking mechanism speaks volumes.

> > On systems with SMT, it may make more sense for
> > sync wakeups to look for idle threads of the same
> > core, than to have the woken task end up on the
> > same thread, and wait for the current task to stop
> > running.
>
> I am ok with additionally doing a select_idle_smt for the SMT cases.
> However Mike shows that it doesn't necessarily cause a performance
> improvement. But if there is consensus on checking for idle SMT
> threads, then I'm Ok with doing that.

select_idle_sibling() used to check thread first; that was changed to
core first for performance reasons.

> > "Strong sync" wakeups like you propose would also
> > change the semantics of wake_wide() and potentially
> > other bits of code...
> >
>
> I understand, I am not very confident that wake_wide does the right
> thing anyway.
> At least for Android, wake_wide doesn't seem to mirror
> the most common use case of the display pipeline well. It seems that we
> have cases where the 'flip count' is really high and causes wake_wide
> all the time and sends us straight to the wake up slow path causing
> regressions in Android benchmarks.

Hm.  It didn't pull those counts out of a vacuum, it measured them.  It
definitely does not force Android into the full balance path; that is
being done by Android developers, as SD_BALANCE_WAKE is off by default.
It was briefly on by default, but was quickly turned back off because
it... induced performance regressions.

In any case, if you have cause to believe that wake_wide() is causing
you grief, why the heck are you bending up the sync hint?

> At least with the sync flag, the caller provides a meaningful
> indication and I think making that flag stronger / more preferred than
> wake_wide makes sense from that perspective since it's not a signal
> that's guessed, but is rather an input request.

Sort of, if you disregard the history I mentioned...

https://www.youtube.com/watch?v=Yho1Eydh1mM :)

Lacking solid concurrency information to base your decision on, you'll
end up robbing Peter to pay Paul forever: you absolutely will stack
non-synchronous tasks, inducing needless latency hits and injuring
scalability.  We've been down that road.  $subject was a very small
sample of what lies down this path.

	-Mike