Date: Fri, 8 Jul 2016 09:57:17 -0500
From: Josh Poimboeuf
To: Peter Zijlstra
Cc: Ingo Molnar, linux-kernel@vger.kernel.org, Mel Gorman, Matt Fleming,
 Srikar Dronamraju
Subject: Re: [PATCH 0/5] sched/debug: decouple sched_stat tracepoints from CONFIG_SCHEDSTATS
Message-ID: <20160708145717.rut44wu3yuxefmwc@treble>
References: <20160628124336.GG30909@twins.programming.kicks-ass.net>
 <20160629102958.GC30927@twins.programming.kicks-ass.net>
In-Reply-To: <20160629102958.GC30927@twins.programming.kicks-ass.net>

On Wed, Jun 29, 2016 at 12:29:58PM +0200, Peter Zijlstra wrote:
> On Tue, Jun 28, 2016 at 02:43:36PM +0200, Peter Zijlstra wrote:
> 
> > Yeah, its a bit of a pain in general...
> > 
> > A) perf stat --null --repeat 50 -- perf bench sched messaging -g 50 -l 5000 | grep "seconds time elapsed"
> > B) perf stat --null --repeat 50 -- taskset 1 perf bench sched pipe | grep "seconds time elapsed"
> > 
> > 1) tip/master + 1-4
> > 2) tip/master + 1-5
> > 3) tip/master + 1-5 + below
> > 
> >                 1            2            3
> > 
> > A)    4.627767855  4.650429917  4.646208062
> >       4.633921933  4.641424424  4.612021058
> >       4.649536375  4.663144144  4.636815948
> >       4.630165619  4.649053552  4.613022902
> > 
> > B)    1.770732957  1.789534273  1.773334291
> >       1.761740716  1.795618428  1.773338681
> >       1.763761666  1.822316496  1.774385589
> > 
> > From this it looks like patch 5 does hurt a wee bit, but we can get most
> > of that back by reordering the structure a bit. The results seem
> > 'stable' across rebuilds and reboots (I've pop'ed all patches and
> > rebuild, rebooted and re-benched 1 at the end and obtained similar
> > results).
> 
> Ha! So those numbers were with CONFIG_SCHEDSTAT=n :-/
> 
> 1) above 1 (4 patches, CONFIG_SCHEDSTAT=n, sysctl=0)
> 2) 1 + CONFIG_SCHEDSTAT=y (sysctl=0)
> 3) 2 + sysctl=1
> 4) above 3 (6 patches) + CONFIG_SCHEDSTAT=y (sysctl=0)
> 
>                 1            2            3            4
> 
> A)    4.620495664  4.788352823  4.862036428  4.623480512
>       4.628800053  4.792622881  4.855325525  4.613553872
>       4.611909507  4.794282178  4.850959761  4.613323142
>       4.608379522  4.787300153  4.822439864  4.597903070
> 
> B)    1.765668026  1.788374847  1.877803100  1.827213170
>       1.769379968  1.779881911  1.870091005  1.825335322
>       1.765822150  1.786251610  1.885874745  1.828218761
> 
> Which looks good for hackbench, but still stinks for pipetest :/

I tried again on another system (Broadwell 2*10*2) and got seemingly more
consistent results, but my conclusions are a bit different from yours.  I
tested only with CONFIG_SCHEDSTATS=y, sysctl=0, because I think that should
be by far the most common configuration.
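For clarity: "sysctl=0/1" above refers to the kernel.sched_schedstats
runtime toggle, flipped between runs roughly as follows (a minimal sketch,
assuming a v4.6-era kernel built with CONFIG_SCHEDSTATS=y):

  # "sysctl=1": enable runtime schedstats collection
  sysctl -w kernel.sched_schedstats=1

  # "sysctl=0": the default; the stats (and, before patches 1-5, the
  # sched_stat tracepoints that depend on them) stay gated off behind
  # a static key
  sysctl -w kernel.sched_schedstats=0

The same knob is also exposed as /proc/sys/kernel/sched_schedstats.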
1) linus/master
2) linus/master + 1-4
3) linus/master + 1-5
4) linus/master + 1-5 + smp cacheline patch

A) perf stat --null --repeat 50 -- perf bench sched messaging -g 50 -l 5000
B) perf stat --null --repeat 50 -- taskset 1 perf bench sched pipe

                1            2            3            4

A)    6.335625627  6.299825679  6.317633969  6.305548464
      6.334188492  6.331391159  6.345195048  6.334608006
      6.345243359  6.329650737  6.328263309  6.304355127
      6.333154970  6.313859694  6.336338820  6.342374680

B)    2.310476138  2.324716175  2.355990033  2.350834083
      2.307231831  2.327946052  2.349816680  2.335581939
      2.303859470  2.317300965  2.347193526  2.333758084
      2.317224538  2.331390610  2.326164933  2.334235895

With patches 1-4, hackbench was slightly better and pipetest was slightly
worse.  With patches 1-5, hackbench was about the same or even slightly
better than baseline, and pipetest was 1-2% worse than baseline.  Adding
your smp cacheline patch on top didn't show a clear improvement.

It would be nice to have the schedstat tracepoints always be functional
(a quick check for that is sketched below), but I suppose it's up to you
and Ingo whether that's worth the performance tradeoff.  Another option
would be to merge only patches 1-4.
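For reference, the tracepoints in question are the sched:sched_stat_*
events.  With patches 1-5 applied they should fire even while the sysctl
is off.  A rough way to verify that (a sketch assuming the v4.6-era event
names are unchanged; not part of the numbers above):

  # confirm the schedstat tracepoints exist and fire with stats gated off
  perf list 'sched:sched_stat_*'
  sysctl -w kernel.sched_schedstats=0
  perf record -e sched:sched_stat_sleep,sched:sched_stat_wait -a -- sleep 1
  perf script | head

-- 
Josh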