From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <558A75EA.40905@bmw-carit.de>
Date: Wed, 24 Jun 2015 11:18:34 +0200
From: Daniel Wagner
To: Ingo Molnar, Peter Zijlstra
Subject: Re: [RFC][PATCH 00/13] percpu rwsem -v2
References: <20150622121623.291363374@infradead.org>
 <55884FC2.6030607@bmw-carit.de>
 <20150622190553.GZ3644@twins.programming.kicks-ass.net>
 <5589285C.2010100@bmw-carit.de>
 <20150623143411.GA25159@twins.programming.kicks-ass.net>
 <558973A7.6010407@bmw-carit.de>
 <20150623175012.GD3644@twins.programming.kicks-ass.net>
 <20150623193624.GH18673@twins.programming.kicks-ass.net>
 <20150624084648.GB27873@gmail.com>
In-Reply-To: <20150624084648.GB27873@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 06/24/2015 10:46 AM, Ingo Molnar wrote:
> So I'd suggest to first compare preemption behavior: does the workload
> context-switch heavily, and is it the exact same context switching rate and are
> the points of preemption the same as well between the two kernels?

If I read this correctly, the answer is yes.
First the 'stable' flock02 test:

perf stat --repeat 5 --pre 'rm -rf /tmp/a' ~/src/lockperf/flock02 -n 128 -l 64 /tmp/a
0.008793148
0.008784990
0.008587804
0.008693641
0.008776946

 Performance counter stats for '/home/wagi/src/lockperf/flock02 -n 128 -l 64 /tmp/a' (5 runs):

         76.509634      task-clock (msec)         #    3.312 CPUs utilized            ( +-  0.67% )
                 2      context-switches          #    0.029 K/sec                    ( +- 26.50% )
               128      cpu-migrations            #    0.002 M/sec                    ( +-  0.31% )
             5,295      page-faults               #    0.069 M/sec                    ( +-  0.49% )
        89,944,154      cycles                    #    1.176 GHz                      ( +-  0.66% )
        58,670,259      stalled-cycles-frontend   #   65.23% frontend cycles idle     ( +-  0.88% )
                 0      stalled-cycles-backend    #    0.00% backend  cycles idle
        76,991,414      instructions              #    0.86  insns per cycle
                                                  #    0.76  stalled cycles per insn  ( +-  0.19% )
        15,239,720      branches                  #  199.187 M/sec                    ( +-  0.20% )
           103,418      branch-misses             #    0.68% of all branches          ( +-  6.68% )

       0.023102895 seconds time elapsed                                          ( +-  1.09% )

And here posix01, which shows high variance:

perf stat --repeat 5 --pre 'rm -rf /tmp/a' ~/src/lockperf/posix01 -n 128 -l 64 /tmp/a
0.006020402
32.510838421
55.516466069
46.794470223
5.097701438

 Performance counter stats for '/home/wagi/src/lockperf/posix01 -n 128 -l 64 /tmp/a' (5 runs):

       4177.932106      task-clock (msec)         #   14.162 CPUs utilized            ( +- 34.59% )
            70,646      context-switches          #    0.017 M/sec                    ( +- 31.56% )
            28,009      cpu-migrations            #    0.007 M/sec                    ( +- 33.55% )
             4,834      page-faults               #    0.001 M/sec                    ( +-  0.98% )
     7,291,160,968      cycles                    #    1.745 GHz                      ( +- 32.17% )
     5,216,204,262      stalled-cycles-frontend   #   71.54% frontend cycles idle     ( +- 32.13% )
                 0      stalled-cycles-backend    #    0.00% backend  cycles idle
     1,901,289,780      instructions              #    0.26  insns per cycle
                                                  #    2.74  stalled cycles per insn  ( +- 30.80% )
       440,415,914      branches                  #  105.415 M/sec                    ( +- 31.06% )
         1,347,021      branch-misses             #    0.31% of all branches          ( +- 29.17% )

       0.295016987 seconds time elapsed                                          ( +- 32.01% )

BTW, thanks for the perf stat tip. Really handy!

cheers,
daniel
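
To put a rough number on "stable" versus "high variance", the five per-run values each test printed can be compared directly. The following is only a minimal sketch: the helper name rel_spread is made up for illustration, and it reports the sample standard deviation relative to the mean, which may not match the "+-" figures perf stat itself prints.

```python
from statistics import mean, stdev

# Per-run values printed by each test, copied from the output in this mail.
flock02 = [0.008793148, 0.008784990, 0.008587804, 0.008693641, 0.008776946]
posix01 = [0.006020402, 32.510838421, 55.516466069, 46.794470223, 5.097701438]


def rel_spread(samples):
    """Hypothetical helper: sample standard deviation as a fraction of the mean."""
    return stdev(samples) / mean(samples)


for name, runs in (("flock02", flock02), ("posix01", posix01)):
    # Print the mean of the five runs and the relative spread in percent.
    print(f"{name}: mean={mean(runs):.6f} spread={rel_spread(runs) * 100:.1f}%")
```

With these inputs the flock02 spread stays at the percent level, while the posix01 spread is of the same order as its mean.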