linux-toolchains.vger.kernel.org archive mirror
From: Marco Elver <elver@google.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Bill Wendling <morbo@google.com>, Kees Cook <keescook@google.com>,
	Jonathan Corbet <corbet@lwn.net>,
	Masahiro Yamada <masahiroy@kernel.org>,
	Linux Doc Mailing List <linux-doc@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Linux Kbuild mailing list <linux-kbuild@vger.kernel.org>,
	clang-built-linux <clang-built-linux@googlegroups.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Nathan Chancellor <natechancellor@gmail.com>,
	Nick Desaulniers <ndesaulniers@google.com>,
	Sami Tolvanen <samitolvanen@google.com>,
	Fangrui Song <maskray@google.com>,
	"maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)"
	<x86@kernel.org>, Andrey Konovalov <andreyknvl@gmail.com>,
	Dmitry Vyukov <dvyukov@google.com>,
	johannes.berg@intel.com, oberpar@linux.vnet.ibm.com,
	linux-toolchains@vger.kernel.org
Subject: Re: [PATCH v9] pgo: add clang's Profile Guided Optimization infrastructure
Date: Mon, 14 Jun 2021 16:16:16 +0200
Message-ID: <CANpmjNNnZv7DHYaJBL7knn9P+50F+SOCvis==Utaf-avENnVsw@mail.gmail.com>
In-Reply-To: <YMczJGPsxSWNgJMG@hirez.programming.kicks-ass.net>

On Mon, 14 Jun 2021 at 12:45, Peter Zijlstra <peterz@infradead.org> wrote:
[...]
> I've also been led to believe that the KCOV data format is not in fact
> dependent on which toolchain is used.

Correct, we use KCOV with both gcc and clang. Both gcc and clang emit
the same instrumentation for -fsanitize-coverage. Thus, the user-space
portion and interface are indeed identical:
https://www.kernel.org/doc/html/latest/dev-tools/kcov.html

> > > I'm thinking it might be about time to build _one_ infrastructure for
> > > that and define a kernel arc format and call it a day.
> > >
> > That may be nice, but it's a rather large request.
>
> Given GCOV just died, perhaps you can look at what KCOV does and see if
> that can be extended to do as you want. KCOV is actively used and
> we actually tripped over all the fun little noinstr bugs at the time.

There might be a subtle mismatch between coverage instrumentation for
testing/fuzzing and for profiling. (Disclaimer: I'm not too familiar
with Clang-PGO's requirements.) For example, while for testing/fuzzing
we may only need to know whether a code path has been visited, for
profiling the "hotness" of that path might be of interest. Therefore,
the data format exported to user space can make different trade-offs
in complexity depending on the use case.
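
To make the distinction concrete, here is a toy model in Python. It is
not kernel code, and it is not what KCOV or Clang's PGO actually emit:
the block names, the `trace` list, and both helper functions are made
up for illustration. The only point is that a visited-set throws away
the counts a profile needs.

```python
from collections import Counter

# Hypothetical stream of basic-block IDs executed during one run.
trace = ["entry", "loop", "loop", "loop", "exit"]


def coverage(trace):
    """Testing/fuzzing view: which blocks were visited at all."""
    return set(trace)


def profile(trace):
    """Profiling view: how often each block ran (its "hotness")."""
    return Counter(trace)


cov = coverage(trace)
prof = profile(trace)

print(sorted(cov))          # ['entry', 'exit', 'loop']
print(prof.most_common(1))  # [('loop', 3)]
```

Two runs with very different loop counts yield the same coverage set
but different profiles, so the counts cannot be reconstructed from the
coverage view alone.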

In theory, I imagine there's a limit to how generic one could make
profiling information, because one compiler's optimizations are not
another compiler's optimizations. On the other hand, it may be doable
to collect unified profiling information for common stuff, but I guess
there's little motivation for figuring out the common ground given that
the producer and consumer of the PGO data are the same compiler by
design (unlike coverage info for testing/fuzzing).

Therefore, if KCOV's exposed information does not match PGO's
requirements today, I'm not sure what realistically can be done
without turning KCOV into a monster: KCOV is optimized for
testing/fuzzing coverage, and I'm not sure how complex we can or want
to make it to cater to a new use case.

My intuition is that the simpler design is to have 2 subsystems for
instrumentation-based coverage collection: one for testing/fuzzing,
and the other for profiling.

Alas, there's the problem of GCOV, which should be replaceable by KCOV
for most use cases. But it would be good to hear from GCOV users, if
there are any.

But since we learned that GCOV is now broken on x86, I see these options:

1. Remove GCOV, make KCOV the de facto test-coverage collection
subsystem. Introduce a PGO-instrumentation subsystem for profile
collection only, and make it _very_ clear that KCOV != PGO data as
hinted above. A prerequisite is that compiler support for PGO
instrumentation adds selective instrumentation support, likely just
making attribute no_instrument_function do the right thing.

2. Like (1) but also keep GCOV, given proper support for attribute
no_instrument_function would probably fix it (?).

3. Keep GCOV (and KCOV of course). Somehow extract PGO profiles from KCOV.

4. Somehow extract PGO profiles from GCOV, or modify kernel/gcov to do so.

Thanks.
