From: Stephane Eranian
Date: Wed, 7 Oct 2020 22:58:00 -0700
Subject: Re: Additional debug info to aid cacheline analysis
To: Peter Zijlstra
Cc: linux-toolchains@vger.kernel.org, Arnaldo Carvalho de Melo, linux-kernel@ver.kernel.org, Ingo Molnar, Jiri Olsa, Namhyung Kim, Ian Rogers, "Phillips, Kim", Mark Rutland
In-Reply-To: <20201006131703.GR2628@hirez.programming.kicks-ass.net>
X-Mailing-List: linux-toolchains@vger.kernel.org

Hi Peter,

On Tue, Oct 6, 2020 at 6:17 AM Peter Zijlstra wrote:
>
> Hi all,
>
> I've been trying to float this idea for a fair number of years, and I
> think at least Stephane has been talking to tools people about it, but
> I'm not sure what, if anything, ever happened with it, so let me post it
> here :-)
>
Thanks for bringing this back. This is a pet project of mine and I have
been looking at it intermittently for the last 4 years. I simply never
got a chance to complete it because it was preempted by other
higher-priority projects. I have developed an internal proof-of-concept
prototype using one of the 3 approaches I know. My goal was to
demonstrate that PMU statistical sampling of loads/stores with data
addresses would work as well as instrumentation. This is slightly
different from hit/miss in the analysis, but the process is the same.

As you point out, the difficulty is not so much in collecting the
samples but rather in symbolizing data addresses from the heap. Intel
PEBS and IBM Marked Events work well to collect the data. AMD IBS works,
though you get a lot of irrelevant samples due to the lack of hardware
filtering. ARM SPE would work too. Overall, all the major architectures
will provide the sampling support needed.

Some time ago, I had my intern pursue the other 2 approaches for
symbolization. The one I see as most promising uses the DWARF
information (no BPF needed). The good news is that I believe we do not
need more information than what is already there. We just need the
compiler to generate valid DWARF at most optimization levels, which I
believe is not the case for LLVM-based compilers but may be okay for
GCC.

Once we have the DWARF logic in place, it is easier to improve perf
report/annotate to do hit/miss, hot/cold, or read/write analysis on each
data type and the fields within it. Once we have the code for perf, we
are planning to contribute it upstream.

In the meantime, we need to lean on the compiler teams to ensure no
data type information is lost at high optimization levels. My
understanding from talking with some compiler folks is that this is not
a trivial fix.

> Basically, what I want is a (perf) tool for cacheline optimizations.
> Something very much like the excellent pahole tool, but with hit/miss
> information added.
>
> Now, some PMUs provide the data address for various relevant events, but
> that gets us the problem of mapping a 'random' address to a type and
> offset. And esp. for dynamic objects, that's a difficult problem.
>
> However, the compiler actually knows what type and offset (most) memory
> references are, so if perf can get us the exact IP (Intel PEBS / AMD
> IBS, as opposed to one with skid on) we could get the type from debug
> info.
>
> And therein lies the rub, existing debug info (DWARF) does contain type
> information, but in a way that is (I've been told) _very_ hard to use
> for this purpose.
>
> So could the compiler emit extra debug info for every instruction with a
> memory reference on to facilitate this?
>
> ~ Peter
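
To make the "pahole with hit/miss information" idea concrete, here is a
minimal sketch of the final attribution step only. It assumes the hard
part discussed above is already solved: for each sample we somehow know
the base address and type of the object that was touched. The struct
layout, field names, 64-byte cacheline size, and the (address, is_miss)
sample format are all hypothetical and chosen for illustration; nothing
here reflects perf's actual data structures or the prototype mentioned
in this thread.

```python
from dataclasses import dataclass

CACHELINE = 64  # assumed cacheline size in bytes


@dataclass(frozen=True)
class Field:
    name: str
    offset: int  # byte offset from the start of the struct
    size: int    # size in bytes


# Hypothetical struct layout, in the spirit of what pahole prints from DWARF.
LAYOUT = [
    Field("lock", 0, 4),
    Field("refcount", 4, 4),
    Field("name", 8, 56),
    Field("stats", 64, 16),
]


def field_at(layout, offset):
    """Return the field covering a byte offset, or None for padding."""
    for f in layout:
        if f.offset <= offset < f.offset + f.size:
            return f
    return None


def attribute(layout, base, samples):
    """Count hits and misses per (cacheline index, field name).

    base    -- address of the object (assumed known; the hard part)
    samples -- iterable of (data_address, is_miss) pairs
    Returns a dict mapping (line, name) to a (hits, misses) tuple.
    """
    counts = {}
    for addr, is_miss in samples:
        field = field_at(layout, addr - base)
        name = field.name if field else "<padding>"
        # Cacheline index relative to the line holding the object's start,
        # computed from absolute addresses so unaligned objects work too.
        line = addr // CACHELINE - base // CACHELINE
        hits, misses = counts.get((line, name), (0, 0))
        if is_miss:
            misses += 1
        else:
            hits += 1
        counts[(line, name)] = (hits, misses)
    return counts


if __name__ == "__main__":
    demo = [(0x1000, True), (0x1004, False), (0x1004, False), (0x1040, True)]
    for (line, name), (hits, misses) in sorted(attribute(LAYOUT, 0x1000, demo).items()):
        print(f"line {line} {name:10s} hits={hits} misses={misses}")
```

Keying the counts by (cacheline, field) rather than by field alone keeps
the output close to pahole's per-cacheline view, which is what matters
when deciding which fields to move apart or group together.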