From: Bill Wendling
Date: Mon, 14 Jun 2021 04:41:01 -0700
Subject: Re: [PATCH v9] pgo: add clang's Profile Guided Optimization infrastructure
To: Peter Zijlstra
Cc: Kees Cook, Jonathan Corbet, Masahiro Yamada, Linux Doc Mailing List,
 LKML, Linux Kbuild mailing list, clang-built-linux, Andrew Morton,
 Nathan Chancellor, Nick Desaulniers, Sami Tolvanen, Fangrui Song,
 "maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)", andreyknvl@gmail.com,
 dvyukov@google.com, elver@google.com, johannes.berg@intel.com,
 oberpar@linux.vnet.ibm.com, linux-toolchains@vger.kernel.org
References: <20210111081821.3041587-1-morbo@google.com>
 <20210407211704.367039-1-morbo@google.com>
 <20210612202505.GG68208@worktop.programming.kicks-ass.net>
X-Mailing-List: linux-doc@vger.kernel.org

On Mon, Jun 14, 2021 at 3:45 AM Peter Zijlstra wrote:
> On Mon, Jun 14, 2021 at 02:39:41AM -0700, Bill Wendling wrote:
> > On Mon, Jun 14, 2021 at 2:01 AM Peter Zijlstra wrote:
>
> > > Because having GCOV, KCOV and PGO all do essentially the same thing
> > > differently, makes heaps of sense?
> > >
> > It does when you're dealing with one toolchain without access to another.
>
> Here's a sekrit, don't tell anyone, but you can get a free copy of GCC
> right here:
>
>   https://gcc.gnu.org/
>
> We also have this linux-toolchains list (Cc'ed now) that contains folks
> from both sides.
>
Your sarcasm is not useful.

> > > I understand that the compilers actually generates radically different
> > > instrumentation for the various cases, but essentially they're all
> > > collecting (function/branch) arcs.
> > >
> > That's true, but there's no one format for profiling data that's
> > usable between all compilers. I'm not even sure there's a good way to
> > translate between, say, gcov and llvm's format. To make matters more
> > complicated, each compiler's format is tightly coupled to a specific
> > version of that compiler. And depending on *how* the data is collected
> > (e.g. sampling or instrumentation), it may not give us the full
> > benefit of FDO/PGO.
>
> I'm thinking that something simple like:
>
>     struct arc {
>             u64     from;
>             u64     to;
>             u64     nr;
>             u64     cntrs[0];
>     };
>
> goes a very long way. Stick a header on that says how large cntrs[] is,
> and some other data (like load offset and whatnot) and you should be
> good.
>
> Combine that with the executable image (say /proc/kcore) to recover
> what's @from (call, jmp or conditional branch) and I'm thinking one
> ought to be able to construct lots of useful data.
>
> I've also been led to believe that the KCOV data format is not in fact
> dependent on which toolchain is used.
>
> > > I'm thinking it might be about time to build _one_ infrastructure for
> > > that and define a kernel arc format and call it a day.
> > >
> > That may be nice, but it's a rather large request.
>
> Given GCOV just died, perhaps you can look at what KCOV does and see if
> that can be extended to do as you want. KCOV is actively used and
> we actually tripped over all the fun little noinstr bugs at the time.
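
For reference, the "struct arc plus a header" idea quoted above could look roughly like the userspace sketch below. It is only an illustration under stated assumptions: struct arc_header, its field names, and the helpers arc_record_size(), arc_alloc(), arc_at() and arc_total_hits() are all hypothetical and appear nowhere in the thread. It also assumes that nr is the arc's hit count, which the thread does not spell out, and it writes u64 as uint64_t and cntrs[0] as a C99 flexible array member.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical header: the thread only says it should record how large
 * cntrs[] is, plus "some other data (like load offset and whatnot)". */
struct arc_header {
	uint64_t nr_cntrs;	/* counters stored after each arc */
	uint64_t load_offset;	/* load offset of the profiled image */
	uint64_t nr_arcs;	/* number of arc records that follow */
};

/* The record from the quoted proposal; u64 is spelled uint64_t here. */
struct arc {
	uint64_t from;		/* address of the call/jmp/branch */
	uint64_t to;		/* destination address */
	uint64_t nr;		/* assumed: times this arc was taken */
	uint64_t cntrs[];	/* nr_cntrs extra counters */
};

/* Size in bytes of one variable-length record. */
static size_t arc_record_size(const struct arc_header *hdr)
{
	return sizeof(struct arc) + hdr->nr_cntrs * sizeof(uint64_t);
}

/* Allocate a zeroed buffer large enough for every record. */
static void *arc_alloc(const struct arc_header *hdr)
{
	return calloc(hdr->nr_arcs, arc_record_size(hdr));
}

/* Records are variable-length, so index them by byte offset. */
static struct arc *arc_at(const struct arc_header *hdr, void *records,
			  uint64_t i)
{
	assert(i < hdr->nr_arcs);
	return (struct arc *)((unsigned char *)records +
			      i * arc_record_size(hdr));
}

/* Sum the assumed hit counts over all arcs. */
static uint64_t arc_total_hits(const struct arc_header *hdr, void *records)
{
	uint64_t total = 0;

	for (uint64_t i = 0; i < hdr->nr_arcs; i++)
		total += arc_at(hdr, records, i)->nr;
	return total;
}
```

A consumer would read the header first, derive the record stride from nr_cntrs, and then walk the records with arc_at(). Resolving what sits at each from address would still need the executable image, as the quoted text notes.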