Date: Sun, 18 Oct 2020 18:00:27 -0700
From: Andi Kleen
To: Or Gerlitz
Cc: Andi Kleen, Peter Zijlstra, Ingo Molnar, Brendan Gregg, Linux Netdev List
Subject: Re: perf measure for stalled cycles per instruction on newer Intel processors
Message-ID:
<20201019010026.x72tmoqv6uh76ene@two.firstfloor.org>
References: <20201015183352.o4zmciukdrdvvdj4@two.firstfloor.org>
X-Mailing-List: netdev@vger.kernel.org

> > Don't use it. It's misleading on an out-of-order CPU because you don't
> > know if it's actually limiting anything.
> >
> > If you want useful bottleneck data use --topdown.
>
> So running again, this time with the below params, I got this output
> where all the right most column is colored red. I wonder what can be
> said on the amount/ratio of stalls for this app - if you can maybe recommend
> some posts of yours to better understand that, I saw some comment in the
> perf-stat man page and some lwn article but wasn't really able to figure it out.

TopDown determines what limits execution the most.

The application is mostly backend bound (55-72%). This can be caused by
either memory issues (more common) or, sometimes, execution issues.
Standard perf doesn't support a further breakdown beyond these high-level
categories, but there are alternative tools that do (e.g. mine is "toplev"
in https://github.com/andikleen/pmu-tools, or VTune).

Some references on TopDown:
https://github.com/andikleen/pmu-tools/wiki/toplev-manual
http://bit.ly/tma-ispass14

The tools above would also allow you to sample where the stalls are
occurring.

-Andi
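
For readers unfamiliar with how the high-level TopDown categories are
obtained, here is a rough sketch of the level-1 slot accounting described
in the TMA paper referenced above. It is an illustration, not perf's or
toplev's actual code: the function name topdown_level1 is hypothetical, the
parameter names only loosely mirror perf's generic topdown-* events, all
inputs are assumed to be already expressed in pipeline slots, and the
sample counts are made up.

```python
def topdown_level1(total_slots, slots_issued, slots_retired,
                   fetch_bubbles, recovery_bubbles):
    """Return the four level-1 TopDown fractions (each between 0 and 1).

    Every argument is a raw count in pipeline slots (assumption: any
    cycle-based counter has already been scaled by the pipeline width).
    """
    if total_slots <= 0:
        raise ValueError("total_slots must be positive")

    # Slots that did useful work.
    retiring = slots_retired / total_slots
    # Slots the frontend failed to fill while the backend could accept uops.
    frontend_bound = fetch_bubbles / total_slots
    # Slots wasted on uops that were issued but never retired, plus
    # slots lost while recovering from mis-speculation.
    bad_speculation = (slots_issued - slots_retired
                       + recovery_bubbles) / total_slots
    # Whatever is left was lost because the backend could not accept uops.
    backend_bound = 1.0 - (retiring + frontend_bound + bad_speculation)

    return {
        "retiring": retiring,
        "frontend_bound": frontend_bound,
        "bad_speculation": bad_speculation,
        "backend_bound": backend_bound,
    }


# Made-up counts, chosen only to produce a backend-bound profile.
breakdown = topdown_level1(
    total_slots=4_000_000,
    slots_issued=1_100_000,
    slots_retired=1_000_000,
    fetch_bubbles=400_000,
    recovery_bubbles=100_000,
)

# The category with the largest share is the main limiter.
limiter = max(breakdown, key=breakdown.get)

for name, fraction in breakdown.items():
    print(f"{name:16s} {fraction:6.1%}")
print("main limiter:", limiter)
```

Backend bound is computed as the remainder, which is why it cannot by
itself tell memory stalls apart from execution-port stalls; separating the
two needs the lower TopDown levels that tools such as toplev report.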