From: Peter Zijlstra <peterz@infradead.org> To: torvalds@linux-foundation.org, akpm@linux-foundation.org Cc: dave@stgolabs.net, aksgarg1989@gmail.com, tglx@linutronix.de, mingo@kernel.org, will.deacon@arm.com, joe@perches.com, peterz@infradead.org, linux-kernel@vger.kernel.org, davem@davemloft.net, mawilcox@microsoft.com, keescook@chromium.org, md@google.com Subject: [PATCH -v2 1/3] lib/int_sqrt: Optimize small argument Date: Fri, 20 Oct 2017 18:44:38 +0200 Message-ID: <20171020164644.876503355@infradead.org> (raw) In-Reply-To: <20171020164437.917622646@infradead.org> [-- Attachment #0: peterz-int_sqrt-opt-1.patch --] [-- Type: text/plain, Size: 1888 bytes --] The current int_sqrt() computation is sub-optimal for the case of small @x. Which is the interesting case when we're going to do cumulative distribution functions on idle times, which we assume to be a random variable, where the target residency of the deepest idle state gives an upper bound on the variable (5e6ns on recent Intel chips). In the case of small @x, the compute loop: while (m != 0) { b = y + m; y >>= 1; if (x >= b) { x -= b; y += m; } m >>= 2; } can be reduced to: while (m > x) m >>= 2; Because y==0, b==m and until x>=m y will remain 0. And while this is computationally equivalent, it runs much faster because there's less code, in particular less branches. cycles: branches: branch-misses: OLD: hot: 45.109444 +- 0.044117 44.333392 +- 0.002254 0.018723 +- 0.000593 cold: 187.737379 +- 0.156678 44.333407 +- 0.002254 6.272844 +- 0.004305 PRE: hot: 67.937492 +- 0.064124 66.999535 +- 0.000488 0.066720 +- 0.001113 cold: 232.004379 +- 0.332811 66.999527 +- 0.000488 6.914634 +- 0.006568 POST: hot: 43.633557 +- 0.034373 45.333132 +- 0.002277 0.023529 +- 0.000681 cold: 207.438411 +- 0.125840 45.333132 +- 0.002277 6.976486 +- 0.004219 Averages computed over all values <128k using a LFSR to generate order. Cold numbers have a LFSR based branch trace buffer 'confuser' ran between each int_sqrt() invocation. Fixes: 30493cc9dddb ("lib/int_sqrt.c: optimize square root algorithm") Suggested-by: Anshul Garg <aksgarg1989@gmail.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> --- lib/int_sqrt.c | 3 +++ 1 file changed, 3 insertions(+) --- a/lib/int_sqrt.c +++ b/lib/int_sqrt.c @@ -22,6 +22,9 @@ unsigned long int_sqrt(unsigned long x) return x; m = 1UL << (BITS_PER_LONG - 2); + while (m > x) + m >>= 2; + while (m != 0) { b = y + m; y >>= 1;
next prev parent reply index Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top 2017-10-20 16:44 [PATCH -v2 0/3] lib/int_sqrt: Fix optimize and document Peter Zijlstra 2017-10-20 16:44 ` Peter Zijlstra [this message] 2017-10-20 16:44 ` [PATCH -v2 2/3] lib/int_sqrt: Optimize initial value compute Peter Zijlstra 2017-10-20 16:44 ` [PATCH -v2 3/3] lib/int_sqrt: Adjust comments Peter Zijlstra 2017-10-20 19:31 ` [PATCH -v2 0/3] lib/int_sqrt: Fix optimize and document Linus Torvalds 2017-11-02 16:43 ` Joe Perches 2017-11-02 16:47 ` Peter Zijlstra 2017-11-02 16:51 ` Kees Cook
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20171020164644.876503355@infradead.org \ --to=peterz@infradead.org \ --cc=akpm@linux-foundation.org \ --cc=aksgarg1989@gmail.com \ --cc=dave@stgolabs.net \ --cc=davem@davemloft.net \ --cc=joe@perches.com \ --cc=keescook@chromium.org \ --cc=linux-kernel@vger.kernel.org \ --cc=mawilcox@microsoft.com \ --cc=md@google.com \ --cc=mingo@kernel.org \ --cc=tglx@linutronix.de \ --cc=torvalds@linux-foundation.org \ --cc=will.deacon@arm.com \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: link
LKML Archive on lore.kernel.org Archives are clonable: git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git git clone --mirror https://lore.kernel.org/lkml/8 lkml/git/8.git git clone --mirror https://lore.kernel.org/lkml/9 lkml/git/9.git # If you have public-inbox 1.1+ installed, you may # initialize and index your mirror using the following commands: public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \ linux-kernel@vger.kernel.org public-inbox-index lkml Example config snippet for mirrors Newsgroup available over NNTP: nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel AGPL code for this site: git clone https://public-inbox.org/public-inbox.git