Date: Mon, 15 Aug 2016 07:10:02 +0200
From: Ingo Molnar
To: Brian Gerst
Cc: Linus Torvalds, the arch/x86 maintainers, Linux Kernel Mailing List,
    "H. Peter Anvin", Denys Vlasenko, Andy Lutomirski, Borislav Petkov,
    Thomas Gleixner, Josh Poimboeuf
Subject: Re: [PATCH v3 0/7] x86: Rewrite switch_to()
Message-ID: <20160815051002.GB16267@gmail.com>
References: <1471106302-10159-1-git-send-email-brgerst@gmail.com>
    <20160813184534.GA15037@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

* Brian Gerst wrote:

> > Something like this:
> >
> >   taskset 1 perf stat -a -e '{instructions,cycles}' --repeat 10 perf bench sched pipe
> >
> > ... will give a very good idea about the general impact of these changes on
> > context switch overhead.
>
> Before:
>  Performance counter stats for 'system wide' (10 runs):
>
>     12,010,932,128      instructions      #    1.03  insn per cycle    ( +-  0.31% )
>     11,691,797,513      cycles                                         ( +-  0.76% )
>
>        3.487329979 seconds time elapsed                                ( +-  0.78% )
>
> After:
>  Performance counter stats for 'system wide' (10 runs):
>
>     12,097,706,506      instructions      #    1.04  insn per cycle    ( +-  0.14% )
>     11,612,167,742      cycles                                         ( +-  0.81% )
>
>        3.451278789 seconds time elapsed                                ( +-  0.82% )
>
> The numbers with or without this patch series are roughly the same.
> There is noticeable variation in the numbers each time I run it, so
> I'm not sure how good of a benchmark this is.

Weird, I get an order of magnitude lower noise:

  triton:~/tip> taskset 1 perf stat -a -e '{instructions,cycles}' --repeat 10 perf bench sched pipe >/dev/null

   Performance counter stats for 'system wide' (10 runs):

      11,503,026,062      instructions      #    1.23  insn per cycle    ( +-  2.64% )
       9,377,410,613      cycles                                         ( +-  2.05% )

         1.669425407 seconds time elapsed                                ( +-  0.12% )

But note that I also had '--sync' for perf stat and did a >/dev/null at the end to
make sure no terminal output and subsequent Xorg activities interfere. Also, full
screen terminal.

Maybe try 'taskset 4' as well to put the workload on another CPU, if the first CPU
is busier than the others? (Any Hyperthreading on your test system?)

Thanks,

	Ingo
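
For readers who want to check the "roughly the same" reading of the quoted
before/after numbers, here is a minimal sketch in Python. It is not part of the
original thread. The helper names are hypothetical, and treating the sum of the
two reported "+-" percentages as a noise band is an assumption made only for
illustration, not a proper statistical test.

```python
# Rough comparison of the quoted before/after perf stat results.
# Helper names are hypothetical; the "sum of both +- values" threshold
# is a crude heuristic assumed for illustration only.

def relative_change_pct(before, after):
    """Change from before to after, as a percentage of before."""
    return (after - before) / before * 100.0

def within_noise(before, after, noise_before_pct, noise_after_pct):
    """True if the change is no larger than the two reported +- values added up."""
    change = abs(relative_change_pct(before, after))
    return change <= noise_before_pct + noise_after_pct

# (before, after, +- before in %, +- after in %), copied from the quoted output.
RESULTS = {
    "instructions": (12_010_932_128, 12_097_706_506, 0.31, 0.14),
    "cycles": (11_691_797_513, 11_612_167_742, 0.76, 0.81),
    "seconds elapsed": (3.487329979, 3.451278789, 0.78, 0.82),
}

for name, (before, after, nb, na) in RESULTS.items():
    change = relative_change_pct(before, after)
    verdict = "within noise" if within_noise(before, after, nb, na) else "outside noise"
    print(f"{name:16s} {change:+.2f}%  ({verdict})")
```

Under this heuristic the cycle count and elapsed time differences fall inside
the reported run-to-run variation, while the instruction count change does not,
which is one reason a lower-noise setup makes the comparison easier to judge.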