All of lore.kernel.org
 help / color / mirror / Atom feed
From: John Stultz <jstultz@google.com>
To: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Stephen Boyd <sboyd@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	kernel-janitors@vger.kernel.org
Subject: Re: [PATCH v2] timers: Optimize usleep_range()
Date: Fri, 29 Jul 2022 14:22:23 -0700	[thread overview]
Message-ID: <CANDhNCrFtv+jdGTFGH5NSV_W591Z-Ut-3rt94zzCRtVbUoye9g@mail.gmail.com> (raw)
In-Reply-To: <a896e176f0f0b819f8ec5ab8935355d01a642506.1659126514.git.christophe.jaillet@wanadoo.fr>

On Fri, Jul 29, 2022 at 1:29 PM Christophe JAILLET
<christophe.jaillet@wanadoo.fr> wrote:
>
> Most of the time the 'min' and 'max' parameters of usleep_range() are
> constant. We can take advantage of it to pre-compute at compile time
> some values otherwise computer at run-time in usleep_range_state().
>
> Replace usleep_range_state() by a new __nsleep_range_delta_state() function
> that takes as parameters the pre-computed values.
>
> The main benefit is to save a few instructions, especially 2
> multiplications (x1000 when converting us to ns).
>
>
> Some hand simplified diff of the generated asm are given below. They were
> produced on a Intel(R) Core(TM) i7-3770, with gcc 11.2.0.
>
> drivers/clk/clk-si514.c (taken as an example)
> -----------------------
> In this driver we have:
>    usleep_range(10000, 12000);
>
> --- clk_before.asm      2022-07-29 21:49:05.702289425 +0200
> +++ clk_after.asm       2022-07-29 21:50:23.801548963 +0200
> @@ -972,8 +972,8 @@
>   ea0:  45 85 e4                test   %r12d,%r12d
>   ea3:  0f 88 f6 fc ff ff       js     b9f <si514_set_rate+0x9f>
>   ea9:  e8 00 00 00 00          call   eae <si514_set_rate+0x3ae>
> - eae:  be e0 2e 00 00          mov    $0x2ee0,%esi             ;     12.000
> - eb3:  bf 10 27 00 00          mov    $0x2710,%edi             ;     10.000
> + eae:  be 80 84 1e 00          mov    $0x1e8480,%esi           ;  2.000.000
> + eb3:  bf 80 96 98 00          mov    $0x989680,%edi           ; 10.000.000
>   eb8:  ba 02 00 00 00          mov    $0x2,%edx
>   ebd:  e8 00 00 00 00          call   ec2 <si514_set_rate+0x3c2>
>   ec2:  44 8b 74 24 30          mov    0x30(%rsp),%r14d
>
> The asm produced in the caller is mostly the same. Only constant values
> passed to usleep_range_state() or __nsleep_range_delta_state() are
> different. No other instructions or whatever is different.
>
>
> kernel/time/timer.c
> -------------------
> -0000000000000000 <usleep_range_state>:
> +0000000000000000 <__nsleep_range_delta_state>:
>   f3 0f 1e fa           endbr64
>   e8 00 00 00 00        call   ...
>   48 b8 00 00 00 00 00  movabs $0xdffffc0000000000,%rax
> @@ -10692,16 +10692,14 @@
>   41 56                 push   %r14
>   49 c7 c6 00 00 00 00  mov    $0x0,%r14
>   41 55                 push   %r13
> - 41 89 d5              mov    %edx,%r13d
> + 49 89 f5              mov    %rsi,%r13
>   41 54                 push   %r12
> - 49 89 f4              mov    %rsi,%r12
> + 41 89 d4              mov    %edx,%r12d
>   55                    push   %rbp
> - 44 89 ed              mov    %r13d,%ebp
> + 44 89 e5              mov    %r12d,%ebp
>   53                    push   %rbx
>   48 89 fb              mov    %rdi,%rbx
>   81 e5 cc 00 00 00     and    $0xcc,%ebp
> - 49 29 dc              sub    %rbx,%r12              ; (max - min)
> - 4d 69 e4 e8 03 00 00  imul   $0x3e8,%r12,%r12       ; us --> ns (x 1000)
>   48 83 ec 68           sub    $0x68,%rsp
>   48 c7 44 24 08 b3 8a  movq   $0x41b58ab3,0x8(%rsp)
>   b5 41
> @@ -10721,18 +10719,16 @@
>   31 c0                 xor    %eax,%eax
>   e8 00 00 00 00        call   ...
>   e8 00 00 00 00        call   ...
> - 49 89 c0              mov    %rax,%r8
> - 48 69 c3 e8 03 00 00  imul   $0x3e8,%rbx,%rax       ; us --> ns (x 1000)
> + 48 01 d8              add    %rbx,%rax
> + 48 89 44 24 28        mov    %rax,0x28(%rsp)
>   65 48 8b 1c 25 00 00  mov    %gs:0x0,%rbx
>   00 00
> - 4c 01 c0              add    %r8,%rax
> - 48 89 44 24 28        mov    %rax,0x28(%rsp)
>   e8 00 00 00 00        call   ...
>   31 ff                 xor    %edi,%edi
>   89 ee                 mov    %ebp,%esi
>
> Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
> ---
> v1 -> v2
>   - Simplify and avoid use of __buildint_constant_p() [John Stultz <jstultz@google.com>]
>   - Also update usleep_idle_range()
>   - Axe usleep_range_state()  [John Stultz <jstultz@google.com>]
>   - Fix kerneldoc  [John Stultz <jstultz@google.com>]
>   - Update log message accordingly
> https://lore.kernel.org/all/d7fc85736adee02ce52ee88a54fa7477fbd18ed2.1653236802.git.christophe.jaillet@wanadoo.fr/
> ---

Thanks for taking the time to rework this! It looks much better to me!

The only nit I have is you still have a few checkpatch issues to resolve:

WARNING: 'convertion' may be misspelled - perhaps 'conversion'?
#154: FILE: include/linux/delay.h:68:
+        * convertion to ns will be optimized-out at compile time.
           ^^^^^^^^^^

ERROR: code indent should use tabs where possible
#198: FILE: kernel/time/timer.c:2124:
+^I^I^I^I        unsigned int state)$

CHECK: Alignment should match open parenthesis
#198: FILE: kernel/time/timer.c:2124:
+void __sched __nsleep_range_delta_state(u64 min, u64 delta,
+                                       unsigned int state)


(also the path lines in the commit message is confusing checkpatch a bit)

With those resolved, when you resubmit, you can add my:
  Acked-by: John Stultz <jstultz@google.com>

thanks
-john

  reply	other threads:[~2022-07-29 21:22 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-07-29 20:28 [PATCH v2] timers: Optimize usleep_range() Christophe JAILLET
2022-07-29 21:22 ` John Stultz [this message]
2022-08-01  8:18 ` David Laight
2022-08-01 18:01   ` Christophe JAILLET

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CANDhNCrFtv+jdGTFGH5NSV_W591Z-Ut-3rt94zzCRtVbUoye9g@mail.gmail.com \
    --to=jstultz@google.com \
    --cc=christophe.jaillet@wanadoo.fr \
    --cc=kernel-janitors@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=sboyd@kernel.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.