All of lore.kernel.org
 help / color / mirror / Atom feed
From: Arnd Bergmann <arnd@arndb.de>
To: stable@vger.kernel.org, Will Deacon <will.deacon@arm.com>,
	Florian La Roche <florian.laroche@googlemail.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Anshul Garg <aksgarg1989@gmail.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Davidlohr Bueso <dave@stgolabs.net>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@kernel.org>, Joe Perches <joe@perches.com>,
	David Miller <davem@davemloft.net>,
	Matthew Wilcox <mawilcox@microsoft.com>,
	Kees Cook <keescook@chromium.org>,
	Michael Davidson <md@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Arnd Bergmann <arnd@arndb.de>,
	linux-kernel@vger.kernel.org
Subject: [BACKPORT 4.4.y 24/25] lib/int_sqrt: optimize small argument
Date: Fri, 22 Mar 2019 16:44:15 +0100	[thread overview]
Message-ID: <20190322154425.3852517-25-arnd@arndb.de> (raw)
In-Reply-To: <20190322154425.3852517-1-arnd@arndb.de>

From: Peter Zijlstra <peterz@infradead.org>

The current int_sqrt() computation is sub-optimal for the case of small
@x.  Which is the interesting case when we're going to do cumulative
distribution functions on idle times, which we assume to be a random
variable, where the target residency of the deepest idle state gives an
upper bound on the variable (5e6ns on recent Intel chips).

In the case of small @x, the compute loop:

	while (m != 0) {
		b = y + m;
		y >>= 1;

		if (x >= b) {
			x -= b;
			y += m;
		}
		m >>= 2;
	}

can be reduced to:

	while (m > x)
		m >>= 2;

Because y==0, b==m and until x>=m y will remain 0.

And while this is computationally equivalent, it runs much faster
because there's less code, in particular less branches.

      cycles:                 branches:              branch-misses:

OLD:

hot:   45.109444 +- 0.044117  44.333392 +- 0.002254  0.018723 +- 0.000593
cold: 187.737379 +- 0.156678  44.333407 +- 0.002254  6.272844 +- 0.004305

PRE:

hot:   67.937492 +- 0.064124  66.999535 +- 0.000488  0.066720 +- 0.001113
cold: 232.004379 +- 0.332811  66.999527 +- 0.000488  6.914634 +- 0.006568

POST:

hot:   43.633557 +- 0.034373  45.333132 +- 0.002277  0.023529 +- 0.000681
cold: 207.438411 +- 0.125840  45.333132 +- 0.002277  6.976486 +- 0.004219

Averages computed over all values <128k using a LFSR to generate order.
Cold numbers have a LFSR based branch trace buffer 'confuser' ran between
each int_sqrt() invocation.

Link: http://lkml.kernel.org/r/20171020164644.876503355@infradead.org
Fixes: 30493cc9dddb ("lib/int_sqrt.c: optimize square root algorithm")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Suggested-by: Anshul Garg <aksgarg1989@gmail.com>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Joe Perches <joe@perches.com>
Cc: David Miller <davem@davemloft.net>
Cc: Matthew Wilcox <mawilcox@microsoft.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Michael Davidson <md@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 3f3295709edea6268ff1609855f498035286af73)
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 lib/int_sqrt.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/lib/int_sqrt.c b/lib/int_sqrt.c
index 1ef4cc344977..1afb545a37c5 100644
--- a/lib/int_sqrt.c
+++ b/lib/int_sqrt.c
@@ -22,6 +22,9 @@ unsigned long int_sqrt(unsigned long x)
 		return x;
 
 	m = 1UL << (BITS_PER_LONG - 2);
+	while (m > x)
+		m >>= 2;
+
 	while (m != 0) {
 		b = y + m;
 		y >>= 1;
-- 
2.20.0


  parent reply	other threads:[~2019-03-22 15:50 UTC|newest]

Thread overview: 67+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-03-22 15:43 [BACKPORT 4.4.y 00/25] candidates from spreadtrum 4.4 product kernel Arnd Bergmann
2019-03-22 15:43 ` Arnd Bergmann
2019-03-22 15:43 ` Arnd Bergmann
2019-03-22 15:43 ` Arnd Bergmann
2019-03-22 15:43 ` [BACKPORT 4.4.y 01/25] mmc: pwrseq: constify mmc_pwrseq_ops structures Arnd Bergmann
2019-03-26  1:08   ` Greg KH
2019-03-26  6:44     ` Julia Lawall
2019-03-26  8:11     ` Arnd Bergmann
2019-03-22 15:43 ` [BACKPORT 4.4.y 02/25] ALSA: compress: add support for 32bit calls in a 64bit kernel Arnd Bergmann
2019-03-26  1:09   ` Greg KH
2019-03-26  7:55     ` Arnd Bergmann
2019-03-30  9:40       ` Greg KH
2019-03-22 15:43 ` [BACKPORT 4.4.y 03/25] mmc: pwrseq_simple: Make reset-gpios optional to match doc Arnd Bergmann
2019-03-22 15:43 ` [BACKPORT 4.4.y 04/25] USB: iowarrior: fix oops with malicious USB descriptors Arnd Bergmann
2019-03-22 15:43   ` [BACKPORT,4.4.y,04/25] " Arnd Bergmann
2019-03-26  1:13   ` [BACKPORT 4.4.y 04/25] " Greg Kroah-Hartman
2019-03-26  1:13     ` [BACKPORT,4.4.y,04/25] " Greg Kroah-Hartman
2019-03-26  8:20     ` [BACKPORT 4.4.y 04/25] " Arnd Bergmann
2019-03-26  8:20       ` [BACKPORT,4.4.y,04/25] " Arnd Bergmann
2019-03-26  9:35       ` [BACKPORT 4.4.y 04/25] " Baolin Wang
2019-03-26  9:35         ` [BACKPORT,4.4.y,04/25] " Baolin Wang
2019-03-26  9:47         ` [BACKPORT 4.4.y 04/25] " 翟京 (Orson Zhai)
2019-03-26  9:47           ` [BACKPORT,4.4.y,04/25] " 翟京 (Orson Zhai)
2019-03-22 15:43 ` [BACKPORT 4.4.y 05/25] mmc: debugfs: Add a restriction to mmc debugfs clock setting Arnd Bergmann
2019-03-22 15:43 ` [BACKPORT 4.4.y 06/25] mmc: make MAN_BKOPS_EN message a debug Arnd Bergmann
2019-03-22 15:43 ` [BACKPORT 4.4.y 07/25] mmc: sanitize 'bus width' in debug output Arnd Bergmann
2019-03-22 15:43 ` [BACKPORT 4.4.y 08/25] mmc: core: shut up "voltage-ranges unspecified" pr_info() Arnd Bergmann
2019-03-22 15:44 ` [BACKPORT 4.4.y 09/25] usb: dwc3: gadget: Fix suspend/resume during device mode Arnd Bergmann
2019-03-22 15:44   ` [BACKPORT,4.4.y,09/25] " Arnd Bergmann
2019-03-22 15:44 ` [BACKPORT 4.4.y 10/25] arm64: mm: Add trace_irqflags annotations to do_debug_exception() Arnd Bergmann
2019-03-22 15:44   ` Arnd Bergmann
2019-03-22 15:44 ` [BACKPORT 4.4.y 11/25] mmc: core: fix using wrong io voltage if mmc_select_hs200 fails Arnd Bergmann
2019-03-22 15:44 ` [BACKPORT 4.4.y 12/25] mm/rmap: replace BUG_ON(anon_vma->degree) with VM_WARN_ON Arnd Bergmann
2019-03-22 15:44 ` [BACKPORT 4.4.y 13/25] extcon: usb-gpio: Don't miss event during suspend/resume Arnd Bergmann
2019-03-22 15:44 ` [BACKPORT 4.4.y 14/25] kbuild: setlocalversion: print error to STDERR Arnd Bergmann
2019-03-22 15:44 ` [BACKPORT 4.4.y 15/25] usb: gadget: composite: fix dereference after null check coverify warning Arnd Bergmann
2019-03-22 15:44   ` [BACKPORT,4.4.y,15/25] " Arnd Bergmann
2019-03-22 15:44 ` [BACKPORT 4.4.y 16/25] usb: gadget: Add the gserial port checking in gs_start_tx() Arnd Bergmann
2019-03-22 15:44   ` [BACKPORT,4.4.y,16/25] " Arnd Bergmann
2019-03-22 15:44 ` [BACKPORT 4.4.y 17/25] mmc: core: don't try to switch block size for dual rate mode Arnd Bergmann
2019-03-26  1:27   ` Greg KH
2019-03-26  8:14     ` Arnd Bergmann
2019-03-22 15:44 ` [BACKPORT 4.4.y 18/25] tcp/dccp: drop SYN packets if accept queue is full Arnd Bergmann
2019-03-22 15:44   ` Arnd Bergmann
2019-03-26  1:21   ` Greg KH
2019-03-26  1:21     ` Greg KH
2019-03-22 15:44 ` [BACKPORT 4.4.y 19/25] serial: sprd: adjust TIMEOUT to a big value Arnd Bergmann
2019-03-26  1:21   ` Greg KH
2019-03-22 15:44 ` [BACKPORT 4.4.y 20/25] Hang/soft lockup in d_invalidate with simultaneous calls Arnd Bergmann
2019-03-26  1:30   ` Greg KH
2019-03-22 15:44 ` [BACKPORT 4.4.y 21/25] arm64: traps: disable irq in die() Arnd Bergmann
2019-03-22 15:44   ` Arnd Bergmann
2019-03-26  1:31   ` Greg KH
2019-03-26  1:31     ` Greg KH
2019-03-22 15:44 ` [BACKPORT 4.4.y 22/25] usb: renesas_usbhs: gadget: fix unused-but-set-variable warning Arnd Bergmann
2019-03-22 15:44   ` [BACKPORT,4.4.y,22/25] " Arnd Bergmann
2019-03-22 15:44 ` [BACKPORT 4.4.y 23/25] serial: sprd: clear timeout interrupt only rather than all interrupts Arnd Bergmann
2019-03-26  1:34   ` Greg KH
2019-03-22 15:44 ` Arnd Bergmann [this message]
2019-03-26  1:36   ` [BACKPORT 4.4.y 24/25] lib/int_sqrt: optimize small argument Greg KH
2019-03-22 15:44 ` [BACKPORT 4.4.y 25/25] USB: core: only clean up what we allocated Arnd Bergmann
2019-03-22 15:44   ` [BACKPORT,4.4.y,25/25] " Arnd Bergmann
2019-03-26  1:36   ` [BACKPORT 4.4.y 25/25] " Greg Kroah-Hartman
2019-03-26  1:36     ` [BACKPORT,4.4.y,25/25] " Greg Kroah-Hartman
2019-03-26  2:18 ` [BACKPORT 4.4.y 00/25] candidates from spreadtrum 4.4 product kernel Greg KH
2019-03-26  2:18   ` Greg KH
2019-03-26  2:18   ` Greg KH

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190322154425.3852517-25-arnd@arndb.de \
    --to=arnd@arndb.de \
    --cc=akpm@linux-foundation.org \
    --cc=aksgarg1989@gmail.com \
    --cc=dave@stgolabs.net \
    --cc=davem@davemloft.net \
    --cc=florian.laroche@googlemail.com \
    --cc=joe@perches.com \
    --cc=keescook@chromium.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mawilcox@microsoft.com \
    --cc=md@google.com \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=stable@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=will.deacon@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.