From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752635AbcF2Cci (ORCPT <rfc822;w@1wt.eu>);
	Tue, 28 Jun 2016 22:32:38 -0400
Received: from mx1.redhat.com ([209.132.183.28]:41116 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1752449AbcF2Cch (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
	Tue, 28 Jun 2016 22:32:37 -0400
Date: Tue, 28 Jun 2016 21:32:35 -0500
From: Josh Poimboeuf <jpoimboe@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Ingo Molnar <mingo@redhat.com>, linux-kernel@vger.kernel.org,
        Mel Gorman <mgorman@techsingularity.net>,
        Matt Fleming <matt@codeblueprint.co.uk>,
        Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Subject: Re: [PATCH 0/5] sched/debug: decouple sched_stat tracepoints from
 CONFIG_SCHEDSTATS
Message-ID: <20160629023235.p3fviotuka5hwuzp@treble>
References: <cover.1466184592.git.jpoimboe@redhat.com>
 <20160628124336.GG30909@twins.programming.kicks-ass.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <20160628124336.GG30909@twins.programming.kicks-ass.net>
User-Agent: Mutt/1.6.0.1 (2016-04-01)
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.26]); Wed, 29 Jun 2016 02:32:36 +0000 (UTC)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Jun 28, 2016 at 02:43:36PM +0200, Peter Zijlstra wrote:
> On Fri, Jun 17, 2016 at 12:43:22PM -0500, Josh Poimboeuf wrote:
> > NOTE: I didn't include any performance numbers because I wasn't able to
> > get consistent results.  I tried the following on a Xeon E5-2420 v2 CPU:
> > 
> >   $ for i in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do echo -n performance > $i; done
> >   $ echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo
> >   $ echo 100 > /sys/devices/system/cpu/intel_pstate/min_perf_pct
> >   $ echo 0 > /proc/sys/kernel/nmi_watchdog
> >   $ taskset 0x10 perf stat -n -r10 perf bench sched pipe -l 1000000
> > 
> > I was going to post the numbers from that, both with and without
> > SCHEDSTATS, but then when I tried to repeat the test on a different day,
> > the results were surprisingly different, with different conclusions.
> > 
> > So any advice on measuring scheduler performance would be appreciated...
> 
> Yeah, it's a bit of a pain in general...
> 
> A) perf stat --null --repeat 50 -- perf bench sched messaging -g 50 -l 5000 | grep "seconds time elapsed"
> B) perf stat --null --repeat 50 -- taskset 1 perf bench sched pipe | grep "seconds time elapsed"
> 
> 1) tip/master + 1-4
> 2) tip/master + 1-5
> 3) tip/master + 1-5 + below
> 
> 	1		2		3
> 
> A)	4.627767855	4.650429917	4.646208062
> 	4.633921933	4.641424424	4.612021058
> 	4.649536375	4.663144144	4.636815948
> 	4.630165619	4.649053552	4.613022902
> 
> B)	1.770732957	1.789534273	1.773334291
> 	1.761740716	1.795618428	1.773338681
> 	1.763761666	1.822316496	1.774385589
> 
> 
> From this it looks like patch 5 does hurt a wee bit, but we can get most
> of that back by reordering the structure a bit. The results seem
> 'stable' across rebuilds and reboots (I've pop'ed all patches and
> rebuilt, rebooted and re-benched 1 at the end and obtained similar
> results).
> 
> Although, possible that if we reorder first and then do 5, we'll just
> see a bigger regression. I've not bothered.

Thanks a lot for benchmarking this!  And also for improving the cache
alignments.  Your changes look good to me.
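
For anyone who wants to double-check the reading of the table above, here
is a rough Python sketch that averages the runs per column and prints the
change relative to column 1.  The timings are copied from the table; the
names RUNS and rel_change are made up for illustration and are not part of
the series or of perf.

```python
from statistics import mean

# Elapsed seconds copied from the quoted table.
# Outer key: benchmark (A = sched messaging, B = sched pipe).
# Inner key: kernel config (1 = patches 1-4, 2 = patches 1-5,
#            3 = patches 1-5 plus the structure reordering).
RUNS = {
    "A": {
        1: [4.627767855, 4.633921933, 4.649536375, 4.630165619],
        2: [4.650429917, 4.641424424, 4.663144144, 4.649053552],
        3: [4.646208062, 4.612021058, 4.636815948, 4.613022902],
    },
    "B": {
        1: [1.770732957, 1.761740716, 1.763761666],
        2: [1.789534273, 1.795618428, 1.822316496],
        3: [1.773334291, 1.773338681, 1.774385589],
    },
}


def rel_change(bench, config, baseline=1):
    """Relative change in mean elapsed time versus the baseline config."""
    base = mean(RUNS[bench][baseline])
    return (mean(RUNS[bench][config]) - base) / base


if __name__ == "__main__":
    for bench in RUNS:
        for config in (2, 3):
            # Positive means slower than config 1.
            print(f"{bench}) config {config} vs 1: "
                  f"{rel_change(bench, config):+.2%}")
```

On these numbers, B) is roughly 2% slower with patch 5 and about 0.5%
slower after the reordering, while the differences in A) stay well under
half a percent.  With only three or four samples per column this is a
sanity check, not a statistical claim.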

-- 
Josh