From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752635AbcF2Cci (ORCPT ); Tue, 28 Jun 2016 22:32:38 -0400
Received: from mx1.redhat.com ([209.132.183.28]:41116 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1752449AbcF2Cch (ORCPT ); Tue, 28 Jun 2016 22:32:37 -0400
Date: Tue, 28 Jun 2016 21:32:35 -0500
From: Josh Poimboeuf
To: Peter Zijlstra
Cc: Ingo Molnar, linux-kernel@vger.kernel.org, Mel Gorman, Matt Fleming,
	Srikar Dronamraju
Subject: Re: [PATCH 0/5] sched/debug: decouple sched_stat tracepoints from CONFIG_SCHEDSTATS
Message-ID: <20160629023235.p3fviotuka5hwuzp@treble>
References: <20160628124336.GG30909@twins.programming.kicks-ass.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <20160628124336.GG30909@twins.programming.kicks-ass.net>
User-Agent: Mutt/1.6.0.1 (2016-04-01)
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16
	(mx1.redhat.com [10.5.110.26]); Wed, 29 Jun 2016 02:32:36 +0000 (UTC)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Jun 28, 2016 at 02:43:36PM +0200, Peter Zijlstra wrote:
> On Fri, Jun 17, 2016 at 12:43:22PM -0500, Josh Poimboeuf wrote:
> > NOTE: I didn't include any performance numbers because I wasn't able to
> > get consistent results.
> > I tried the following on a Xeon E5-2420 v2 CPU:
> >
> >   $ for i in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do echo -n performance > $i; done
> >   $ echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo
> >   $ echo 100 > /sys/devices/system/cpu/intel_pstate/min_perf_pct
> >   $ echo 0 > /proc/sys/kernel/nmi_watchdog
> >   $ taskset 0x10 perf stat -n -r10 perf bench sched pipe -l 1000000
> >
> > I was going to post the numbers from that, both with and without
> > SCHEDSTATS, but then when I tried to repeat the test on a different day,
> > the results were surprisingly different, with different conclusions.
> >
> > So any advice on measuring scheduler performance would be appreciated...
>
> Yeah, its a bit of a pain in general...
>
>  A) perf stat --null --repeat 50 -- perf bench sched messaging -g 50 -l 5000 | grep "seconds time elapsed"
>  B) perf stat --null --repeat 50 -- taskset 1 perf bench sched pipe | grep "seconds time elapsed"
>
>  1) tip/master + 1-4
>  2) tip/master + 1-5
>  3) tip/master + 1-5 + below
>
>         1               2               3
>
>  A)     4.627767855     4.650429917     4.646208062
>         4.633921933     4.641424424     4.612021058
>         4.649536375     4.663144144     4.636815948
>         4.630165619     4.649053552     4.613022902
>
>  B)     1.770732957     1.789534273     1.773334291
>         1.761740716     1.795618428     1.773338681
>         1.763761666     1.822316496     1.774385589
>
>
> From this it looks like patch 5 does hurt a wee bit, but we can get most
> of that back by reordering the structure a bit. The results seem
> 'stable' across rebuilds and reboots (I've pop'ed all patches and
> rebuild, rebooted and re-benched 1 at the end and obtained similar
> results).
>
> Although, possible that if we reorder first and then do 5, we'll just
> see a bigger regression. I've not bothered.

Thanks a lot for benchmarking this!  And also for improving the cache
alignments.  Your changes look good to me.

-- 
Josh
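
To compare the three configurations in the quoted table, here is a
minimal sketch that averages the elapsed-time samples per column and
reports the change relative to configuration 1. The numbers are copied
from the table above. The names (SAMPLES, column_means, percent_change)
are illustrative assumptions, and a plain arithmetic mean over three or
four samples is only a rough summary, not a significance test.

```python
from statistics import mean

# Elapsed seconds copied from the quoted table.
# Each tuple is one row; columns are configurations 1, 2 and 3.
SAMPLES = {
    "A": [  # perf bench sched messaging
        (4.627767855, 4.650429917, 4.646208062),
        (4.633921933, 4.641424424, 4.612021058),
        (4.649536375, 4.663144144, 4.636815948),
        (4.630165619, 4.649053552, 4.613022902),
    ],
    "B": [  # perf bench sched pipe
        (1.770732957, 1.789534273, 1.773334291),
        (1.761740716, 1.795618428, 1.773338681),
        (1.763761666, 1.822316496, 1.774385589),
    ],
}


def column_means(rows):
    """Return the arithmetic mean of each column (one per configuration)."""
    return [mean(col) for col in zip(*rows)]


def percent_change(means):
    """Return each mean's change relative to the first one, in percent."""
    base = means[0]
    return [100.0 * (m - base) / base for m in means]


MEANS = {name: column_means(rows) for name, rows in SAMPLES.items()}
CHANGES = {name: percent_change(m) for name, m in MEANS.items()}

for name in SAMPLES:
    for config, (m, pct) in enumerate(zip(MEANS[name], CHANGES[name]), start=1):
        print(f"{name}) config {config}: mean {m:.6f} s ({pct:+.2f}% vs config 1)")
```

Both benchmarks report elapsed time, so lower is better and a positive
percentage means a slowdown relative to configuration 1.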