From: Sedat Dilek <sedat.dilek@gmail.com>
To: Borislav Petkov <bp@alien8.de>
Cc: Peter Zijlstra <peterz@infradead.org>,
	x86@kernel.org, rostedt@goodmis.org, hpa@zytor.com,
	torvalds@linuxfoundation.org, linux-kernel@vger.kernel.org,
	linux-toolchains@vger.kernel.org, jpoimboe@redhat.com,
	alexei.starovoitov@gmail.com, mhiramat@kernel.org
Subject: Re: [PATCH 0/2] x86: Remove ideal_nops[]
Date: Mon, 15 Mar 2021 18:04:41 +0100
Message-ID: <CA+icZUXLyFqq0y_GnKca8MS4wO2kcj4K-D1kBHLa8u_pnLZ7eQ@mail.gmail.com>
In-Reply-To: <CA+icZUWTSo2vkQO_tRggDFvvF_Q6AdzhvhQvmAsNxKnpGXHi0Q@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 4618 bytes --]

On Sat, Mar 13, 2021 at 2:47 PM Sedat Dilek <sedat.dilek@gmail.com> wrote:
[ ... ]
> Let me look if I will do a selfmade ThinLTO+PGO optimized LLVM
> toolchain v12.0.0-rc3 this weekend.
>

I did it.

Here are some fresh numbers:

[ Selfmade LLVM toolchain v12.0.0-rc3 "stage1-only" ]
[ Host-Kernel: 5.12.0-rc2-8-amd64-clang12-cfi includes Peter's NOPS patchset ]

Performance counter stats for 'make V=1 -j4 LLVM=1 LLVM_IAS=1 PAHOLE=/opt/pahole/bin/pahole LOCALVERSION=-9-amd64-clang12-cfi KBUILD_VERBOSE=1 KBUILD_BUILD_HOST=iniza KBUILD_BUILD_USER=sedat.dilek@gmail.com KBUILD_BUILD_TIMESTAMP=2021-03-13 bindeb-pkg KDEB_PKGVERSION=5.12.0~rc2-9~bullseye+dileks1':

       55936351.95 msec task-clock                #    3.580 CPUs utilized
           8291848      context-switches          #    0.148 K/sec
            269686      cpu-migrations            #    0.005 K/sec
         288389721      page-faults               #    0.005 M/sec
   108344049253836      cycles                    #    1.937 GHz
    83228135285263      stalled-cycles-frontend   #   76.82% frontend cycles idle
    65616255370809      stalled-cycles-backend    #   60.56% backend cycles idle
    59590373937199      instructions              #    0.55  insn per cycle
                                                  #    1.40  stalled cycles per insn
    10906265495505      branches                  #  194.976 M/sec
      488578274434      branch-misses             #    4.48% of all branches

  15622.926203302 seconds time elapsed

  53453.974928000 seconds user
   2526.773533000 seconds sys


[ Selfmade LLVM toolchain v12.0.0-rc3 "thinlto_pgo_optimized" ]
[ Host-Kernel: Debian's 5.10.19-1 kernel ]

Performance counter stats for 'make V=1 -j4 LLVM=1 LLVM_IAS=1 PAHOLE=/opt/pahole/bin/pahole LOCALVERSION=-10-amd64-clang12-cfi KBUILD_VERBOSE=1 KBUILD_BUILD_HOST=iniza KBUILD_BUILD_USER=sedat.dilek@gmail.com KBUILD_BUILD_TIMESTAMP=2021-03-14 bindeb-pkg KDEB_PKGVERSION=5.12.0~rc2-10~bullseye+dileks1':

       40223080.69 msec task-clock                #    3.434 CPUs utilized
           7438923      context-switches          #    0.185 K/sec
            245636      cpu-migrations            #    0.006 K/sec
         288073015      page-faults               #    0.007 M/sec
    77325441657129      cycles                    #    1.922 GHz
    55357463522675      stalled-cycles-frontend   #   71.59% frontend cycles idle
    38978871249074      stalled-cycles-backend    #   50.41% backend cycles idle
    55178265045056      instructions              #    0.71  insn per cycle
                                                  #    1.00  stalled cycles per insn
     9749166033571      branches                  #  242.377 M/sec
      431303563167      branch-misses             #    4.42% of all branches

  11714.751645982 seconds time elapsed

  37951.117840000 seconds user
   2313.807151000 seconds sys


[ Selfmade LLVM toolchain v12.0.0-rc3 "thinlto_pgo_optimized" ]
[ Host-Kernel: 5.12.0-rc2-10-amd64-clang12-cfi includes Peter's NOPS patchset ]

Performance counter stats for 'make V=1 -j4 LLVM=1 LLVM_IAS=1 PAHOLE=/opt/pahole/bin/pahole LOCALVERSION=-1-amd64-clang12-cfi KBUILD_VERBOSE=1 KBUILD_BUILD_HOST=iniza KBUILD_BUILD_USER=sedat.dilek@gmail.com KBUILD_BUILD_TIMESTAMP=2021-03-15 bindeb-pkg KDEB_PKGVERSION=5.12.0~rc3-1~bullseye+dileks1':

       40632207.25 msec task-clock                #    3.406 CPUs utilized
           8216832      context-switches          #    0.202 K/sec
            277610      cpu-migrations            #    0.007 K/sec
         281331052      page-faults               #    0.007 M/sec
    77031538570411      cycles                    #    1.896 GHz                      (83.33%)
    55247905369487      stalled-cycles-frontend   #   71.72% frontend cycles idle     (83.33%)
    39046795510242      stalled-cycles-backend    #   50.69% backend cycles idle      (66.67%)
    54592585444704      instructions              #    0.71  insn per cycle
                                                  #    1.01  stalled cycles per insn  (83.33%)
     9641589406714      branches                  #  237.289 M/sec                    (83.33%)
      435317273069      branch-misses             #    4.51% of all branches          (83.33%)

  11928.047003788 seconds time elapsed

  38187.685111000 seconds user
   2502.075987000 seconds sys
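
For anyone who wants to cross-check the "insn per cycle" column, here is a minimal sketch that recomputes it from the raw instruction and cycle counters of the three runs above. The run labels and the helper name ipc() are only illustrative and do not come from perf. Note that the counters of the third run are scaled estimates, as indicated by the (83.33%) multiplexing annotations.

```python
# Raw counters copied from the three perf stat runs above: (instructions, cycles).
# The dictionary keys are illustrative labels, not perf output.
RUNS = {
    "stage1-only / NOPS host kernel": (59590373937199, 108344049253836),
    "thinlto_pgo / Debian 5.10.19-1 host kernel": (55178265045056, 77325441657129),
    "thinlto_pgo / NOPS host kernel": (54592585444704, 77031538570411),
}


def ipc(instructions, cycles):
    """Instructions retired per CPU cycle."""
    return instructions / cycles


# Round to two decimals, the precision perf stat prints.
ipc_by_run = {name: round(ipc(*counters), 2) for name, counters in RUNS.items()}

for name, value in ipc_by_run.items():
    print(f"{name}: {value:.2f} insn per cycle")
```
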

As said in an earlier email:
A ThinLTO+PGO optimized LLVM toolchain saves approx. 60 minutes of build time here.

With a host kernel that includes Peter's NOPS patchset, the build takes
approx. 3.5 minutes longer (11928 s vs. 11715 s elapsed).
That is the brewing time of a single Turkish tea bag.
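
The arithmetic behind these two statements can be sketched as follows, using only the "seconds time elapsed" values from the three runs above. The variable and function names are illustrative. Each configuration was measured once, so the smaller difference may well be within run-to-run noise.

```python
# "seconds time elapsed" copied from the three perf stat runs above.
STAGE1_NOPS_HOST = 15622.926203302   # stage1-only toolchain, host kernel with NOPS patchset
PGO_DEBIAN_HOST = 11714.751645982    # thinlto_pgo toolchain, Debian 5.10.19-1 host kernel
PGO_NOPS_HOST = 11928.047003788      # thinlto_pgo toolchain, host kernel with NOPS patchset


def minutes(seconds):
    """Convert seconds to minutes."""
    return seconds / 60.0


# Toolchain effect: both builds ran on a host kernel with the NOPS patchset.
toolchain_saving_min = minutes(STAGE1_NOPS_HOST - PGO_NOPS_HOST)

# Host-kernel effect: both builds used the thinlto_pgo toolchain.
host_kernel_delta_min = minutes(PGO_NOPS_HOST - PGO_DEBIAN_HOST)

print(f"ThinLTO+PGO toolchain saves {toolchain_saving_min:.1f} min")
print(f"NOPS host kernel costs {host_kernel_delta_min:.1f} min")
```
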

Attached are the 3 build-time log-files.

- Sedat -

[-- Attachment #2: build-time_5.12.0-rc2-9-amd64-clang12-cfi.txt --]
[-- Type: text/plain, Size: 1344 bytes --]

 Performance counter stats for 'make V=1 -j4 LLVM=1 LLVM_IAS=1 PAHOLE=/opt/pahole/bin/pahole LOCALVERSION=-9-amd64-clang12-cfi KBUILD_VERBOSE=1 KBUILD_BUILD_HOST=iniza KBUILD_BUILD_USER=sedat.dilek@gmail.com KBUILD_BUILD_TIMESTAMP=2021-03-13 bindeb-pkg KDEB_PKGVERSION=5.12.0~rc2-9~bullseye+dileks1':

       55936351.95 msec task-clock                #    3.580 CPUs utilized          
           8291848      context-switches          #    0.148 K/sec                  
            269686      cpu-migrations            #    0.005 K/sec                  
         288389721      page-faults               #    0.005 M/sec                  
   108344049253836      cycles                    #    1.937 GHz                    
    83228135285263      stalled-cycles-frontend   #   76.82% frontend cycles idle   
    65616255370809      stalled-cycles-backend    #   60.56% backend cycles idle    
    59590373937199      instructions              #    0.55  insn per cycle         
                                                  #    1.40  stalled cycles per insn
    10906265495505      branches                  #  194.976 M/sec                  
      488578274434      branch-misses             #    4.48% of all branches        

   15622.926203302 seconds time elapsed

   53453.974928000 seconds user
    2526.773533000 seconds sys



[-- Attachment #3: build-time_5.12.0-rc2-10-amd64-clang12-cfi.txt --]
[-- Type: text/plain, Size: 1346 bytes --]

 Performance counter stats for 'make V=1 -j4 LLVM=1 LLVM_IAS=1 PAHOLE=/opt/pahole/bin/pahole LOCALVERSION=-10-amd64-clang12-cfi KBUILD_VERBOSE=1 KBUILD_BUILD_HOST=iniza KBUILD_BUILD_USER=sedat.dilek@gmail.com KBUILD_BUILD_TIMESTAMP=2021-03-14 bindeb-pkg KDEB_PKGVERSION=5.12.0~rc2-10~bullseye+dileks1':

       40223080.69 msec task-clock                #    3.434 CPUs utilized          
           7438923      context-switches          #    0.185 K/sec                  
            245636      cpu-migrations            #    0.006 K/sec                  
         288073015      page-faults               #    0.007 M/sec                  
    77325441657129      cycles                    #    1.922 GHz                    
    55357463522675      stalled-cycles-frontend   #   71.59% frontend cycles idle   
    38978871249074      stalled-cycles-backend    #   50.41% backend cycles idle    
    55178265045056      instructions              #    0.71  insn per cycle         
                                                  #    1.00  stalled cycles per insn
     9749166033571      branches                  #  242.377 M/sec                  
      431303563167      branch-misses             #    4.42% of all branches        

   11714.751645982 seconds time elapsed

   37951.117840000 seconds user
    2313.807151000 seconds sys



[-- Attachment #4: build-time_5.12.0-rc3-1-amd64-clang12-cfi.txt --]
[-- Type: text/plain, Size: 1404 bytes --]

 Performance counter stats for 'make V=1 -j4 LLVM=1 LLVM_IAS=1 PAHOLE=/opt/pahole/bin/pahole LOCALVERSION=-1-amd64-clang12-cfi KBUILD_VERBOSE=1 KBUILD_BUILD_HOST=iniza KBUILD_BUILD_USER=sedat.dilek@gmail.com KBUILD_BUILD_TIMESTAMP=2021-03-15 bindeb-pkg KDEB_PKGVERSION=5.12.0~rc3-1~bullseye+dileks1':

       40632207.25 msec task-clock                #    3.406 CPUs utilized          
           8216832      context-switches          #    0.202 K/sec                  
            277610      cpu-migrations            #    0.007 K/sec                  
         281331052      page-faults               #    0.007 M/sec                  
    77031538570411      cycles                    #    1.896 GHz                      (83.33%)
    55247905369487      stalled-cycles-frontend   #   71.72% frontend cycles idle     (83.33%)
    39046795510242      stalled-cycles-backend    #   50.69% backend cycles idle      (66.67%)
    54592585444704      instructions              #    0.71  insn per cycle         
                                                  #    1.01  stalled cycles per insn  (83.33%)
     9641589406714      branches                  #  237.289 M/sec                    (83.33%)
      435317273069      branch-misses             #    4.51% of all branches          (83.33%)

   11928.047003788 seconds time elapsed

   38187.685111000 seconds user
    2502.075987000 seconds sys


