From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752516AbdBBSK7 (ORCPT ); Thu, 2 Feb 2017 13:10:59 -0500
Received: from mail.skyhub.de ([78.46.96.112]:39570 "EHLO mail.skyhub.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1752170AbdBBSK5 (ORCPT ); Thu, 2 Feb 2017 13:10:57 -0500
Date: Thu, 2 Feb 2017 19:10:34 +0100
From: Borislav Petkov
To: Ingo Molnar
Cc: "Ghannam, Yazen", x86-ml, Yves Dionne, Brice Goglin, Peter Zijlstra, lkml
Subject: Re: [RFC PATCH] x86/CPU/AMD: Bring back Compute Unit ID
Message-ID: <20170202181034.qjiggpbkmyr53yah@pd.tnic>
References: <20170201200237.36s2jwjgxi24we66@pd.tnic>
 <20170201214421.ppw2ww3faxxu2jrm@pd.tnic>
 <20170201222507.qvcn6dsxucn6fqcv@pd.tnic>
 <20170201224150.ohb7f7jvbttnikkz@pd.tnic>
 <20170202121054.im3c3iiqp26a2dyb@pd.tnic>
 <20170202154359.fevz4fmwc6t4ew75@pd.tnic>
 <20170202160916.GA12498@gmail.com>
 <20170202170455.gbuivmli5fhmuucn@pd.tnic>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <20170202170455.gbuivmli5fhmuucn@pd.tnic>
User-Agent: NeoMutt/20161014 (1.7.1)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Feb 02, 2017 at 06:04:56PM +0100, Borislav Petkov wrote:
> I think I stole it from you from some mail thread we had in the past.

Yap, --all-cpus is a bit better in that the difference between the two
kernels is smaller.

For some reason, though, with the patch the workload is a bit slower. We
have more cycles, more branches, ... It is only 2 sec slower though.

I think that's probably because it is the first Bulldozer uarch and when
you run it on newer versions of the uarch, it is better, due to
improvements in the uarch.

Yazen, what BD generation is your machine?

I have one more Bulldozer box: rev C0 on which I could run this over the
weekend.

./tools/perf/perf stat -a -e task-clock,context-switches,cache-misses,cpu-migrations,page-faults,cycles,instructions,branches,branch-misses --repeat 3 --sync --pre ~/bin/pre-build-kernel.sh -- make -s -j17 bzImage

before:

 Performance counter stats for 'system wide' (3 runs):

       2279512.230871      task-clock (msec)         #   15.999 CPUs utilized            ( +-  0.40% )
              714,492      context-switches          #    0.313 K/sec                    ( +-  0.19% )
        6,726,972,836      cache-misses                                                  ( +-  0.15% )
               56,490      cpu-migrations            #    0.025 K/sec                    ( +-  2.98% )
           27,794,829      page-faults               #    0.012 M/sec                    ( +-  0.04% )
    3,719,570,726,045      cycles                    #    1.632 GHz                      ( +-  0.06% )
    2,146,930,432,417      instructions              #    0.58  insn per cycle           ( +-  0.05% )
      476,587,085,009      branches                  #  209.074 M/sec                    ( +-  0.06% )
       25,286,321,575      branch-misses             #    5.31% of all branches          ( +-  0.07% )

        142.475046735 seconds time elapsed                                               ( +-  0.40% )

after:

 Performance counter stats for 'system wide' (3 runs):

       2312821.267459      task-clock (msec)         #   16.000 CPUs utilized            ( +-  0.20% )
              760,839      context-switches          #    0.329 K/sec                    ( +-  0.29% )
        6,769,543,062      cache-misses                                                  ( +-  0.05% )
               68,785      cpu-migrations            #    0.030 K/sec                    ( +-  0.75% )
           27,828,222      page-faults               #    0.012 M/sec                    ( +-  0.04% )
    3,725,704,384,061      cycles                    #    1.611 GHz                      ( +-  0.06% )
    2,149,336,525,435      instructions              #    0.58  insn per cycle           ( +-  0.01% )
      477,157,066,501      branches                  #  206.310 M/sec                    ( +-  0.01% )
       25,289,357,158      branch-misses             #    5.30% of all branches          ( +-  0.07% )

        144.551731453 seconds time elapsed                                               ( +-  0.20% )

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.
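
To put a number on the "only 2 sec slower" remark, the before/after counters above can be turned into absolute and relative deltas. The following is a minimal sketch and not part of the original measurement: the dictionaries are copied by hand from the two perf stat runs, and the helper names (relative_change, compare) are hypothetical.

```python
# Mean counter values copied from the two 'perf stat' runs quoted above
# (3 runs each). The dictionary layout and helper names are illustrative only.
before = {
    "task-clock (msec)": 2279512.230871,
    "context-switches": 714492,
    "cache-misses": 6726972836,
    "cpu-migrations": 56490,
    "page-faults": 27794829,
    "cycles": 3719570726045,
    "instructions": 2146930432417,
    "branches": 476587085009,
    "branch-misses": 25286321575,
    "seconds elapsed": 142.475046735,
}

after = {
    "task-clock (msec)": 2312821.267459,
    "context-switches": 760839,
    "cache-misses": 6769543062,
    "cpu-migrations": 68785,
    "page-faults": 27828222,
    "cycles": 3725704384061,
    "instructions": 2149336525435,
    "branches": 477157066501,
    "branch-misses": 25289357158,
    "seconds elapsed": 144.551731453,
}


def relative_change(old, new):
    """Return the change from old to new as a percentage of old."""
    return (new - old) / old * 100.0


def compare(old_run, new_run):
    """Map each counter name to its relative change in percent."""
    return {name: relative_change(old_run[name], new_run[name])
            for name in old_run}


if __name__ == "__main__":
    for name, pct in compare(before, after).items():
        delta = after[name] - before[name]
        print(f"{name:20s} {delta:+22.3f} {pct:+8.2f}%")
```

With these inputs the elapsed time comes out about 2.08 s (roughly 1.5%) higher with the patch, while cpu-migrations rise by roughly 22%. Note that the elapsed-time delta is larger than the reported +- 0.40% and +- 0.20% run-to-run variation, so it is probably not just noise, though three runs is a small sample.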