On Sat, Mar 13, 2021 at 2:47 PM Sedat Dilek wrote:
[ ... ]
> Let me look if I will do a selfmade ThinLTO+PGO optimized LLVM
> toolchain v12.0.0-rc3 this weekend.
>

I did it.

Here are some fresh numbers:

[ Selfmade LLVM toolchain v12.0.0-rc3 "stage1-only" ]
[ Host-Kernel: 5.12.0-rc2-8-amd64-clang12-cfi includes Peter's NOPS patchset ]

 Performance counter stats for 'make V=1 -j4 LLVM=1 LLVM_IAS=1 PAHOLE=/opt/pahole/bin/pahole LOCALVERSION=-9-amd64-clang12-cfi KBUILD_VERBOSE=1 KBUILD_BUILD_HOST=iniza KBUILD_BUILD_USER=sedat.dilek@gmail.com KBUILD_BUILD_TIMESTAMP=2021-03-13 bindeb-pkg KDEB_PKGVERSION=5.12.0~rc2-9~bullseye+dileks1':

       55936351.95 msec task-clock                #    3.580 CPUs utilized
           8291848      context-switches          #    0.148 K/sec
            269686      cpu-migrations            #    0.005 K/sec
         288389721      page-faults               #    0.005 M/sec
   108344049253836      cycles                    #    1.937 GHz
    83228135285263      stalled-cycles-frontend   #   76.82% frontend cycles idle
    65616255370809      stalled-cycles-backend    #   60.56% backend cycles idle
    59590373937199      instructions              #    0.55  insn per cycle
                                                  #    1.40  stalled cycles per insn
    10906265495505      branches                  #  194.976 M/sec
      488578274434      branch-misses             #    4.48% of all branches

   15622.926203302 seconds time elapsed

   53453.974928000 seconds user
    2526.773533000 seconds sys

[ Selfmade LLVM toolchain v12.0.0-rc3 "thinlto_pgo_optimized" ]
[ Host-Kernel: Debian's 5.10.19-1 kernel ]

 Performance counter stats for 'make V=1 -j4 LLVM=1 LLVM_IAS=1 PAHOLE=/opt/pahole/bin/pahole LOCALVERSION=-10-amd64-clang12-cfi KBUILD_VERBOSE=1 KBUILD_BUILD_HOST=iniza KBUILD_BUILD_USER=sedat.dilek@gmail.com KBUILD_BUILD_TIMESTAMP=2021-03-14 bindeb-pkg KDEB_PKGVERSION=5.12.0~rc2-10~bullseye+dileks1':

       40223080.69 msec task-clock                #    3.434 CPUs utilized
           7438923      context-switches          #    0.185 K/sec
            245636      cpu-migrations            #    0.006 K/sec
         288073015      page-faults               #    0.007 M/sec
    77325441657129      cycles                    #    1.922 GHz
    55357463522675      stalled-cycles-frontend   #   71.59% frontend cycles idle
    38978871249074      stalled-cycles-backend    #   50.41% backend cycles idle
    55178265045056      instructions              #    0.71  insn per cycle
                                                  #    1.00  stalled cycles per insn
     9749166033571      branches                  #  242.377 M/sec
      431303563167      branch-misses             #    4.42% of all branches

   11714.751645982 seconds time elapsed

   37951.117840000 seconds user
    2313.807151000 seconds sys

[ Selfmade LLVM toolchain v12.0.0-rc3 "thinlto_pgo_optimized" ]
[ Host-Kernel: 5.12.0-rc2-10-amd64-clang12-cfi includes Peter's NOPS patchset ]

 Performance counter stats for 'make V=1 -j4 LLVM=1 LLVM_IAS=1 PAHOLE=/opt/pahole/bin/pahole LOCALVERSION=-1-amd64-clang12-cfi KBUILD_VERBOSE=1 KBUILD_BUILD_HOST=iniza KBUILD_BUILD_USER=sedat.dilek@gmail.com KBUILD_BUILD_TIMESTAMP=2021-03-15 bindeb-pkg KDEB_PKGVERSION=5.12.0~rc3-1~bullseye+dileks1':

       40632207.25 msec task-clock                #    3.406 CPUs utilized
           8216832      context-switches          #    0.202 K/sec
            277610      cpu-migrations            #    0.007 K/sec
         281331052      page-faults               #    0.007 M/sec
    77031538570411      cycles                    #    1.896 GHz                      (83.33%)
    55247905369487      stalled-cycles-frontend   #   71.72% frontend cycles idle     (83.33%)
    39046795510242      stalled-cycles-backend    #   50.69% backend cycles idle      (66.67%)
    54592585444704      instructions              #    0.71  insn per cycle
                                                  #    1.01  stalled cycles per insn  (83.33%)
     9641589406714      branches                  #  237.289 M/sec                    (83.33%)
      435317273069      branch-misses             #    4.51% of all branches          (83.33%)

   11928.047003788 seconds time elapsed

   38187.685111000 seconds user
    2502.075987000 seconds sys

As said in an earlier email:
A ThinLTO+PGO optimized LLVM toolchain saves approx. 60 mins of build time here.

With a host kernel that includes Peter's NOPS patchset, the build takes approx. 3 mins longer.
That is the brewing time of a single Turkish tea bag.

Attached are the 3 build-time log files.

- Sedat -
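
The two build-time deltas quoted above can be rechecked from the "seconds time elapsed" values of the three runs. A minimal sketch in Python follows; the dictionary keys and the helper name are made up here for illustration and are not part of the original logs:

```python
# Wall-clock times ("seconds time elapsed") copied from the three perf stat runs.
# The key names are illustrative labels, not taken from the build logs.
elapsed = {
    "stage1_nops_host": 15622.926203302,          # stage1-only toolchain, NOPS host kernel
    "thinlto_pgo_debian_host": 11714.751645982,   # ThinLTO+PGO toolchain, Debian 5.10.19-1 host
    "thinlto_pgo_nops_host": 11928.047003788,     # ThinLTO+PGO toolchain, NOPS host kernel
}


def minutes_between(slower: float, faster: float) -> float:
    """Return the difference of two wall-clock times (in seconds) in minutes."""
    return (slower - faster) / 60.0


# Toolchain effect: both builds ran on a host kernel with the NOPS patchset.
toolchain_gain = minutes_between(
    elapsed["stage1_nops_host"], elapsed["thinlto_pgo_nops_host"]
)

# Host-kernel effect: same ThinLTO+PGO toolchain, different host kernels.
host_kernel_cost = minutes_between(
    elapsed["thinlto_pgo_nops_host"], elapsed["thinlto_pgo_debian_host"]
)

# Relative saving of the optimized toolchain, as a fraction of the stage1-only time.
relative_gain = (
    elapsed["stage1_nops_host"] - elapsed["thinlto_pgo_nops_host"]
) / elapsed["stage1_nops_host"]

print(f"toolchain gain:   {toolchain_gain:.1f} min ({relative_gain:.1%})")
print(f"host-kernel cost: {host_kernel_cost:.1f} min")
```

This gives roughly 62 minutes saved by the optimized toolchain and roughly 3.6 minutes added by the NOPS host kernel, in line with the "approx. 60mins" and "3mins" figures. These are single runs, so the small host-kernel delta in particular may sit within run-to-run noise.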