linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] Optimize is_power_of_2().
@ 2007-06-15 16:56 Vegard Nossum
  2007-06-15 18:21 ` H. Peter Anvin
  2007-06-15 19:47 ` Jan Engelhardt
  0 siblings, 2 replies; 7+ messages in thread
From: Vegard Nossum @ 2007-06-15 16:56 UTC (permalink / raw)
  To: linux-kernel

From: Vegard Nossum <vegard@peltkore.net>
Date: Fri, 15 Jun 2007 18:35:49 +0200
Subject: [PATCH] Optimize is_power_of_2().

Rationale: Removes one conditional branch and reduces icache footprint.
Proof: If n is false, the product of n and any value is false. If n is
true, the truth of (n * x) is the truth of x.

Signed-off-by: Vegard Nossum <vegard@peltkore.net>
---
 include/linux/log2.h |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/linux/log2.h b/include/linux/log2.h
index 1b8a2c1..56196b1 100644
--- a/include/linux/log2.h
+++ b/include/linux/log2.h
@@ -51,7 +51,7 @@ int __ilog2_u64(u64 n)
 static inline __attribute__((const))
 bool is_power_of_2(unsigned long n)
 {
-	return (n != 0 && ((n & (n - 1)) == 0));
+	return n * !(n & (n - 1));
 }
 
 /*
-- 
1.5.0.6


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH] Optimize is_power_of_2().
  2007-06-15 16:56 [PATCH] Optimize is_power_of_2() Vegard Nossum
@ 2007-06-15 18:21 ` H. Peter Anvin
  2007-06-15 18:28   ` Robert P. J. Day
  2007-06-15 19:47 ` Jan Engelhardt
  1 sibling, 1 reply; 7+ messages in thread
From: H. Peter Anvin @ 2007-06-15 18:21 UTC (permalink / raw)
  To: Vegard Nossum; +Cc: linux-kernel

Vegard Nossum wrote:
> From: Vegard Nossum <vegard@peltkore.net>
> Date: Fri, 15 Jun 2007 18:35:49 +0200
> Subject: [PATCH] Optimize is_power_of_2().
> 
> Rationale: Removes one conditional branch and reduces icache footprint.
> Proof: If n is false, the product of n and any value is false. If n is
> true, the truth of (n * x) is the truth of x.
> 

You realize that on a lot of platforms, multiplication is done by an
out-of-line subroutine, and that even on those that aren't, it's
generally a long-pipeline operation, right?

	-hpa

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Optimize is_power_of_2().
  2007-06-15 18:21 ` H. Peter Anvin
@ 2007-06-15 18:28   ` Robert P. J. Day
  2007-06-15 20:26     ` H. Peter Anvin
  0 siblings, 1 reply; 7+ messages in thread
From: Robert P. J. Day @ 2007-06-15 18:28 UTC (permalink / raw)
  To: H. Peter Anvin; +Cc: Vegard Nossum, linux-kernel

On Fri, 15 Jun 2007, H. Peter Anvin wrote:

> Vegard Nossum wrote:
> > From: Vegard Nossum <vegard@peltkore.net>
> > Date: Fri, 15 Jun 2007 18:35:49 +0200
> > Subject: [PATCH] Optimize is_power_of_2().
> >
> > Rationale: Removes one conditional branch and reduces icache
> > footprint. Proof: If n is false, the product of n and any value is
> > false. If n is true, the truth of (n * x) is the truth of x.
>
> You realize that on a lot of platforms, multiplication is done by an
> out-of-line subroutine, and that even on those that aren't, it's
> generally a long-pipeline operation, right?

i was *just* about to mention somthing of the sort, but probably not
as succinctly.

rday
-- 
========================================================================
Robert P. J. Day
Linux Consulting, Training and Annoying Kernel Pedantry
Waterloo, Ontario, CANADA

http://fsdev.net/wiki/index.php?title=Main_Page
========================================================================

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Optimize is_power_of_2().
  2007-06-15 16:56 [PATCH] Optimize is_power_of_2() Vegard Nossum
  2007-06-15 18:21 ` H. Peter Anvin
@ 2007-06-15 19:47 ` Jan Engelhardt
  2007-06-15 19:54   ` David M. Lloyd
  1 sibling, 1 reply; 7+ messages in thread
From: Jan Engelhardt @ 2007-06-15 19:47 UTC (permalink / raw)
  To: Vegard Nossum; +Cc: Linux Kernel Mailing List, H. Peter Anvin, Robert P. J. Day



On Jun 15 2007 18:56, Vegard Nossum wrote:
> bool is_power_of_2(unsigned long n)
> {
>-	return (n != 0 && ((n & (n - 1)) == 0));
>+	return n * !(n & (n - 1));
> }

There is a third way which uses neither * nor &&, but []:

bool is_power_of_2(unsigned long n)
{
	static const bool yn[] = {
		[0] = false,
		[1 ... (8 * sizeof(n) - 1)] = true,
	};
	return yn[(n & (n - 1)) == 0];
}



	Jan
-- 

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Optimize is_power_of_2().
  2007-06-15 19:47 ` Jan Engelhardt
@ 2007-06-15 19:54   ` David M. Lloyd
  2007-06-15 19:57     ` David M. Lloyd
  0 siblings, 1 reply; 7+ messages in thread
From: David M. Lloyd @ 2007-06-15 19:54 UTC (permalink / raw)
  To: Jan Engelhardt
  Cc: Vegard Nossum, Linux Kernel Mailing List, H. Peter Anvin,
	Robert P. J. Day

On Fri, 15 Jun 2007 21:47:50 +0200 (CEST)
Jan Engelhardt <jengelh@computergmbh.de> wrote:

> On Jun 15 2007 18:56, Vegard Nossum wrote:
> > bool is_power_of_2(unsigned long n)
> > {
> >-	return (n != 0 && ((n & (n - 1)) == 0));
> >+	return n * !(n & (n - 1));
> > }
> 
> There is a third way which uses neither * nor &&, but []:

I assume using something GCC-specific is right out?

bool is_power_of_to(unsigned long n)
{
	return __builtin_ffsl(n) == 1;
}

- DML

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Optimize is_power_of_2().
  2007-06-15 19:54   ` David M. Lloyd
@ 2007-06-15 19:57     ` David M. Lloyd
  0 siblings, 0 replies; 7+ messages in thread
From: David M. Lloyd @ 2007-06-15 19:57 UTC (permalink / raw)
  To: David M. Lloyd
  Cc: Jan Engelhardt, Vegard Nossum, Linux Kernel Mailing List,
	H. Peter Anvin, Robert P. J. Day

On Fri, 15 Jun 2007 14:54:20 -0500
"David M. Lloyd" <dmlloyd@flurg.com> wrote:

> On Fri, 15 Jun 2007 21:47:50 +0200 (CEST)
> Jan Engelhardt <jengelh@computergmbh.de> wrote:
> 
> > On Jun 15 2007 18:56, Vegard Nossum wrote:
> > > bool is_power_of_2(unsigned long n)
> > > {
> > >-	return (n != 0 && ((n & (n - 1)) == 0));
> > >+	return n * !(n & (n - 1));
> > > }
> > 
> > There is a third way which uses neither * nor &&, but []:
> 
> I assume using something GCC-specific is right out?
> 
> bool is_power_of_to(unsigned long n)
> {
> 	return __builtin_ffsl(n) == 1;

Pretend I typed this instead:

bool is_power_of_2(unsigned long n)
{
	return __builtin_popcountl(n) == 1;
}

All I can say is, it's really hot here :P

- DML

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] Optimize is_power_of_2().
  2007-06-15 18:28   ` Robert P. J. Day
@ 2007-06-15 20:26     ` H. Peter Anvin
  0 siblings, 0 replies; 7+ messages in thread
From: H. Peter Anvin @ 2007-06-15 20:26 UTC (permalink / raw)
  To: Robert P. J. Day; +Cc: Vegard Nossum, linux-kernel

Robert P. J. Day wrote:
> On Fri, 15 Jun 2007, H. Peter Anvin wrote:
> 
>> Vegard Nossum wrote:
>>> From: Vegard Nossum <vegard@peltkore.net>
>>> Date: Fri, 15 Jun 2007 18:35:49 +0200
>>> Subject: [PATCH] Optimize is_power_of_2().
>>>
>>> Rationale: Removes one conditional branch and reduces icache
>>> footprint. Proof: If n is false, the product of n and any value is
>>> false. If n is true, the truth of (n * x) is the truth of x.
>> You realize that on a lot of platforms, multiplication is done by an
>> out-of-line subroutine, and that even on those that aren't, it's
>> generally a long-pipeline operation, right?
> 
> i was *just* about to mention somthing of the sort, but probably not
> as succinctly.
> 

You also realize, I hope, that obfuscating this for the compiler just
hurts.  As has already been described, there are versions involving
jumps, selects, multiplication, tables, ffs, and popcount.  All of those
operations are hideously expensive on some architectures and cheap on
others.

Personally, I would bet that the version as originally written is
probably best, since select (conditional move) is a fairly commonly
supported operation, especially on architectures where jumps are expensive.

Just to confuse the situation further, here is yet another version:

bool is_power_of_2(unsigned long n)
{
	return !!n & !(n & (n-1));
}

... which, on i686 compiles to:
00000000 <is_power_of_2>:
   0:   85 c0                   test   %eax,%eax
   2:   8d 50 ff                lea    0xffffffff(%eax),%edx
   5:   0f 95 c1                setne  %cl
   8:   85 d0                   test   %edx,%eax
   a:   0f 94 c0                sete   %al
   d:   0f b6 c0                movzbl %al,%eax
  10:   21 c8                   and    %ecx,%eax
  12:   c3                      ret

In the current version, gcc does generate a jump on i686 and x86-64:

00000000 <is_power_of_2>:
   0:   89 c2                   mov    %eax,%edx
   2:   31 c0                   xor    %eax,%eax
   4:   85 d2                   test   %edx,%edx
   6:   74 0b                   je     13 <is_power_of_2+0x13>
   8:   8d 42 ff                lea    0xffffffff(%edx),%eax
   b:   85 c2                   test   %eax,%edx
   d:   0f 94 c0                sete   %al
  10:   83 e0 01                and    $0x1,%eax
  13:   f3 c3                   repz ret

	-hpa


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2007-06-15 20:26 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-06-15 16:56 [PATCH] Optimize is_power_of_2() Vegard Nossum
2007-06-15 18:21 ` H. Peter Anvin
2007-06-15 18:28   ` Robert P. J. Day
2007-06-15 20:26     ` H. Peter Anvin
2007-06-15 19:47 ` Jan Engelhardt
2007-06-15 19:54   ` David M. Lloyd
2007-06-15 19:57     ` David M. Lloyd

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).