linux-kernel.vger.kernel.org archive mirror
* [tip:x86/asm] x86, bitops: Change bitops to be native operand size
@ 2013-07-17  1:15 tip-bot for H. Peter Anvin
  2013-11-10 21:09 ` Joe Perches
  0 siblings, 1 reply; 12+ messages in thread
From: tip-bot for H. Peter Anvin @ 2013-07-17  1:15 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, james.t.kukunas, tglx, hpa

Commit-ID:  9b710506a03b01a9fdd83962912bc9d8237b82e8
Gitweb:     http://git.kernel.org/tip/9b710506a03b01a9fdd83962912bc9d8237b82e8
Author:     H. Peter Anvin <hpa@linux.intel.com>
AuthorDate: Tue, 16 Jul 2013 15:20:14 -0700
Committer:  H. Peter Anvin <hpa@linux.intel.com>
CommitDate: Tue, 16 Jul 2013 15:24:04 -0700

x86, bitops: Change bitops to be native operand size

Change the bitops operation to be naturally "long", i.e. 63 bits on
the 64-bit kernel.  Additional bugs are likely to crop up in the
future.

We already have bugs on machines with > 16 TiB of memory in a
single node, as can happen if memory is interleaved.  The x86 bitop
operations take a signed index, so using an unsigned type is not an
option.
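
A rough sketch of the arithmetic, assuming 4 KiB pages and a bitmap with
one bit per page frame:

	unsigned long long bytes  = 16ULL << 40;	/* 16 TiB           */
	unsigned long long frames = bytes >> 12;	/* 2^32 page frames */
	/*
	 * Beyond 16 TiB the bit index exceeds 2^32 - 1, so it no longer
	 * fits in any 32-bit type, signed or unsigned.
	 */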

Jim Kukunas measured the effect of this patch on kernel size: it adds
2779 bytes to the allyesconfig kernel.  Some of that probably could be
elided by replacing the inline functions with macros which select the
32-bit type if the index is a 32-bit value, something like:
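
(A minimal sketch; set_bit_32() and set_bit_64() are hypothetical
stand-ins for the two width-specific implementations.)

	#define set_bit(nr, addr)				\
		(sizeof(nr) > sizeof(u32)			\
		 ? set_bit_64((nr), (addr))			\
		 : set_bit_32((nr), (addr)))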

In that case we could also use "Jr" constraints for the 64-bit
version.

However, this would more than double the amount of code for a
relatively small gain.

Note that we can't use ilog2() for _BITOPS_LONG_SHIFT, as that causes
a recursive header inclusion problem.

The change to constant_test_bit() should both generate better code and
give the correct result for negative bit indices.  As previously
written, the compiler had to generate extra code to create the proper
wrong result for negative values.

Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Cc: Jim Kukunas <james.t.kukunas@intel.com>
Link: http://lkml.kernel.org/n/tip-z61ofiwe90xeyb461o72h8ya@git.kernel.org
---
 arch/x86/include/asm/bitops.h      | 46 ++++++++++++++++++++++----------------
 arch/x86/include/asm/sync_bitops.h | 24 ++++++++++----------
 2 files changed, 39 insertions(+), 31 deletions(-)

diff --git a/arch/x86/include/asm/bitops.h b/arch/x86/include/asm/bitops.h
index 6dfd019..41639ce 100644
--- a/arch/x86/include/asm/bitops.h
+++ b/arch/x86/include/asm/bitops.h
@@ -15,6 +15,14 @@
 #include <linux/compiler.h>
 #include <asm/alternative.h>
 
+#if BITS_PER_LONG == 32
+# define _BITOPS_LONG_SHIFT 5
+#elif BITS_PER_LONG == 64
+# define _BITOPS_LONG_SHIFT 6
+#else
+# error "Unexpected BITS_PER_LONG"
+#endif
+
 #define BIT_64(n)			(U64_C(1) << (n))
 
 /*
@@ -59,7 +67,7 @@
  * restricted to acting on a single-word quantity.
  */
 static __always_inline void
-set_bit(unsigned int nr, volatile unsigned long *addr)
+set_bit(long nr, volatile unsigned long *addr)
 {
 	if (IS_IMMEDIATE(nr)) {
 		asm volatile(LOCK_PREFIX "orb %1,%0"
@@ -81,7 +89,7 @@ set_bit(unsigned int nr, volatile unsigned long *addr)
  * If it's called on the same region of memory simultaneously, the effect
  * may be that only one operation succeeds.
  */
-static inline void __set_bit(int nr, volatile unsigned long *addr)
+static inline void __set_bit(long nr, volatile unsigned long *addr)
 {
 	asm volatile("bts %1,%0" : ADDR : "Ir" (nr) : "memory");
 }
@@ -97,7 +105,7 @@ static inline void __set_bit(int nr, volatile unsigned long *addr)
  * in order to ensure changes are visible on other processors.
  */
 static __always_inline void
-clear_bit(int nr, volatile unsigned long *addr)
+clear_bit(long nr, volatile unsigned long *addr)
 {
 	if (IS_IMMEDIATE(nr)) {
 		asm volatile(LOCK_PREFIX "andb %1,%0"
@@ -118,13 +126,13 @@ clear_bit(int nr, volatile unsigned long *addr)
  * clear_bit() is atomic and implies release semantics before the memory
  * operation. It can be used for an unlock.
  */
-static inline void clear_bit_unlock(unsigned nr, volatile unsigned long *addr)
+static inline void clear_bit_unlock(long nr, volatile unsigned long *addr)
 {
 	barrier();
 	clear_bit(nr, addr);
 }
 
-static inline void __clear_bit(int nr, volatile unsigned long *addr)
+static inline void __clear_bit(long nr, volatile unsigned long *addr)
 {
 	asm volatile("btr %1,%0" : ADDR : "Ir" (nr));
 }
@@ -141,7 +149,7 @@ static inline void __clear_bit(int nr, volatile unsigned long *addr)
  * No memory barrier is required here, because x86 cannot reorder stores past
  * older loads. Same principle as spin_unlock.
  */
-static inline void __clear_bit_unlock(unsigned nr, volatile unsigned long *addr)
+static inline void __clear_bit_unlock(long nr, volatile unsigned long *addr)
 {
 	barrier();
 	__clear_bit(nr, addr);
@@ -159,7 +167,7 @@ static inline void __clear_bit_unlock(unsigned nr, volatile unsigned long *addr)
  * If it's called on the same region of memory simultaneously, the effect
  * may be that only one operation succeeds.
  */
-static inline void __change_bit(int nr, volatile unsigned long *addr)
+static inline void __change_bit(long nr, volatile unsigned long *addr)
 {
 	asm volatile("btc %1,%0" : ADDR : "Ir" (nr));
 }
@@ -173,7 +181,7 @@ static inline void __change_bit(int nr, volatile unsigned long *addr)
  * Note that @nr may be almost arbitrarily large; this function is not
  * restricted to acting on a single-word quantity.
  */
-static inline void change_bit(int nr, volatile unsigned long *addr)
+static inline void change_bit(long nr, volatile unsigned long *addr)
 {
 	if (IS_IMMEDIATE(nr)) {
 		asm volatile(LOCK_PREFIX "xorb %1,%0"
@@ -194,7 +202,7 @@ static inline void change_bit(int nr, volatile unsigned long *addr)
  * This operation is atomic and cannot be reordered.
  * It also implies a memory barrier.
  */
-static inline int test_and_set_bit(int nr, volatile unsigned long *addr)
+static inline int test_and_set_bit(long nr, volatile unsigned long *addr)
 {
 	int oldbit;
 
@@ -212,7 +220,7 @@ static inline int test_and_set_bit(int nr, volatile unsigned long *addr)
  * This is the same as test_and_set_bit on x86.
  */
 static __always_inline int
-test_and_set_bit_lock(int nr, volatile unsigned long *addr)
+test_and_set_bit_lock(long nr, volatile unsigned long *addr)
 {
 	return test_and_set_bit(nr, addr);
 }
@@ -226,7 +234,7 @@ test_and_set_bit_lock(int nr, volatile unsigned long *addr)
  * If two examples of this operation race, one can appear to succeed
  * but actually fail.  You must protect multiple accesses with a lock.
  */
-static inline int __test_and_set_bit(int nr, volatile unsigned long *addr)
+static inline int __test_and_set_bit(long nr, volatile unsigned long *addr)
 {
 	int oldbit;
 
@@ -245,7 +253,7 @@ static inline int __test_and_set_bit(int nr, volatile unsigned long *addr)
  * This operation is atomic and cannot be reordered.
  * It also implies a memory barrier.
  */
-static inline int test_and_clear_bit(int nr, volatile unsigned long *addr)
+static inline int test_and_clear_bit(long nr, volatile unsigned long *addr)
 {
 	int oldbit;
 
@@ -272,7 +280,7 @@ static inline int test_and_clear_bit(int nr, volatile unsigned long *addr)
  * accessed from a hypervisor on the same CPU if running in a VM: don't change
  * this without also updating arch/x86/kernel/kvm.c
  */
-static inline int __test_and_clear_bit(int nr, volatile unsigned long *addr)
+static inline int __test_and_clear_bit(long nr, volatile unsigned long *addr)
 {
 	int oldbit;
 
@@ -284,7 +292,7 @@ static inline int __test_and_clear_bit(int nr, volatile unsigned long *addr)
 }
 
 /* WARNING: non atomic and it can be reordered! */
-static inline int __test_and_change_bit(int nr, volatile unsigned long *addr)
+static inline int __test_and_change_bit(long nr, volatile unsigned long *addr)
 {
 	int oldbit;
 
@@ -304,7 +312,7 @@ static inline int __test_and_change_bit(int nr, volatile unsigned long *addr)
  * This operation is atomic and cannot be reordered.
  * It also implies a memory barrier.
  */
-static inline int test_and_change_bit(int nr, volatile unsigned long *addr)
+static inline int test_and_change_bit(long nr, volatile unsigned long *addr)
 {
 	int oldbit;
 
@@ -315,13 +323,13 @@ static inline int test_and_change_bit(int nr, volatile unsigned long *addr)
 	return oldbit;
 }
 
-static __always_inline int constant_test_bit(unsigned int nr, const volatile unsigned long *addr)
+static __always_inline int constant_test_bit(long nr, const volatile unsigned long *addr)
 {
-	return ((1UL << (nr % BITS_PER_LONG)) &
-		(addr[nr / BITS_PER_LONG])) != 0;
+	return ((1UL << (nr & (BITS_PER_LONG-1))) &
+		(addr[nr >> _BITOPS_LONG_SHIFT])) != 0;
 }
 
-static inline int variable_test_bit(int nr, volatile const unsigned long *addr)
+static inline int variable_test_bit(long nr, volatile const unsigned long *addr)
 {
 	int oldbit;
 
diff --git a/arch/x86/include/asm/sync_bitops.h b/arch/x86/include/asm/sync_bitops.h
index 9d09b40..05af3b3 100644
--- a/arch/x86/include/asm/sync_bitops.h
+++ b/arch/x86/include/asm/sync_bitops.h
@@ -26,9 +26,9 @@
  * Note that @nr may be almost arbitrarily large; this function is not
  * restricted to acting on a single-word quantity.
  */
-static inline void sync_set_bit(int nr, volatile unsigned long *addr)
+static inline void sync_set_bit(long nr, volatile unsigned long *addr)
 {
-	asm volatile("lock; btsl %1,%0"
+	asm volatile("lock; bts %1,%0"
 		     : "+m" (ADDR)
 		     : "Ir" (nr)
 		     : "memory");
@@ -44,9 +44,9 @@ static inline void sync_set_bit(int nr, volatile unsigned long *addr)
  * you should call smp_mb__before_clear_bit() and/or smp_mb__after_clear_bit()
  * in order to ensure changes are visible on other processors.
  */
-static inline void sync_clear_bit(int nr, volatile unsigned long *addr)
+static inline void sync_clear_bit(long nr, volatile unsigned long *addr)
 {
-	asm volatile("lock; btrl %1,%0"
+	asm volatile("lock; btr %1,%0"
 		     : "+m" (ADDR)
 		     : "Ir" (nr)
 		     : "memory");
@@ -61,9 +61,9 @@ static inline void sync_clear_bit(int nr, volatile unsigned long *addr)
  * Note that @nr may be almost arbitrarily large; this function is not
  * restricted to acting on a single-word quantity.
  */
-static inline void sync_change_bit(int nr, volatile unsigned long *addr)
+static inline void sync_change_bit(long nr, volatile unsigned long *addr)
 {
-	asm volatile("lock; btcl %1,%0"
+	asm volatile("lock; btc %1,%0"
 		     : "+m" (ADDR)
 		     : "Ir" (nr)
 		     : "memory");
@@ -77,11 +77,11 @@ static inline void sync_change_bit(int nr, volatile unsigned long *addr)
  * This operation is atomic and cannot be reordered.
  * It also implies a memory barrier.
  */
-static inline int sync_test_and_set_bit(int nr, volatile unsigned long *addr)
+static inline int sync_test_and_set_bit(long nr, volatile unsigned long *addr)
 {
 	int oldbit;
 
-	asm volatile("lock; btsl %2,%1\n\tsbbl %0,%0"
+	asm volatile("lock; bts %2,%1\n\tsbbl %0,%0"
 		     : "=r" (oldbit), "+m" (ADDR)
 		     : "Ir" (nr) : "memory");
 	return oldbit;
@@ -95,11 +95,11 @@ static inline int sync_test_and_set_bit(int nr, volatile unsigned long *addr)
  * This operation is atomic and cannot be reordered.
  * It also implies a memory barrier.
  */
-static inline int sync_test_and_clear_bit(int nr, volatile unsigned long *addr)
+static inline int sync_test_and_clear_bit(long nr, volatile unsigned long *addr)
 {
 	int oldbit;
 
-	asm volatile("lock; btrl %2,%1\n\tsbbl %0,%0"
+	asm volatile("lock; btr %2,%1\n\tsbbl %0,%0"
 		     : "=r" (oldbit), "+m" (ADDR)
 		     : "Ir" (nr) : "memory");
 	return oldbit;
@@ -113,11 +113,11 @@ static inline int sync_test_and_clear_bit(int nr, volatile unsigned long *addr)
  * This operation is atomic and cannot be reordered.
  * It also implies a memory barrier.
  */
-static inline int sync_test_and_change_bit(int nr, volatile unsigned long *addr)
+static inline int sync_test_and_change_bit(long nr, volatile unsigned long *addr)
 {
 	int oldbit;
 
-	asm volatile("lock; btcl %2,%1\n\tsbbl %0,%0"
+	asm volatile("lock; btc %2,%1\n\tsbbl %0,%0"
 		     : "=r" (oldbit), "+m" (ADDR)
 		     : "Ir" (nr) : "memory");
 	return oldbit;


* Re: [tip:x86/asm] x86, bitops:  Change bitops to be native operand size
  2013-07-17  1:15 [tip:x86/asm] x86, bitops: Change bitops to be native operand size tip-bot for H. Peter Anvin
@ 2013-11-10 21:09 ` Joe Perches
  2013-11-10 22:10   ` H. Peter Anvin
  0 siblings, 1 reply; 12+ messages in thread
From: Joe Perches @ 2013-11-10 21:09 UTC (permalink / raw)
  To: mingo, hpa, linux-kernel, tglx, james.t.kukunas, hpa, Linus Torvalds
  Cc: David Miller

On Tue, 2013-07-16 at 18:15 -0700, tip-bot for H. Peter Anvin wrote:
> Commit-ID:  9b710506a03b01a9fdd83962912bc9d8237b82e8
[]
> x86, bitops: Change bitops to be native operand size
> 
> Change the bitops operation to be naturally "long", i.e. 63 bits on
> the 64-bit kernel.  Additional bugs are likely to crop up in the
> future.

> We already have bugs on machines with > 16 TiB of memory in a
> single node, as can happen if memory is interleaved.  The x86 bitop
> operations take a signed index, so using an unsigned type is not an
> option.

I think it odd that any bitop index nr should be
anything other than unsigned long for any arch.

Why should this arch be any different than the
defined type in Documentation/atomic_ops.txt?

What value is a negative index when the bitmap
array address passed is the starting 0th bit?

btw: asm-generic/bitops.h doesn't match
Documentation/atomic_ops.txt either.
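
Roughly, with signatures paraphrased:

	/* Documentation/atomic_ops.txt */
	void set_bit(unsigned long nr, volatile unsigned long *addr);

	/* include/asm-generic/bitops/atomic.h */
	static inline void set_bit(int nr, volatile unsigned long *addr);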




* Re: [tip:x86/asm] x86, bitops:  Change bitops to be native operand size
  2013-11-10 21:09 ` Joe Perches
@ 2013-11-10 22:10   ` H. Peter Anvin
  2013-11-10 22:44     ` Joe Perches
  0 siblings, 1 reply; 12+ messages in thread
From: H. Peter Anvin @ 2013-11-10 22:10 UTC (permalink / raw)
  To: Joe Perches, mingo, linux-kernel, tglx, james.t.kukunas, hpa,
	Linus Torvalds
  Cc: David Miller

Yes, on the generic it is int.

The problem is in part that some architectures have bitop instructions with specific behavior.

Joe Perches <joe@perches.com> wrote:
>On Tue, 2013-07-16 at 18:15 -0700, tip-bot for H. Peter Anvin wrote:
>> Commit-ID:  9b710506a03b01a9fdd83962912bc9d8237b82e8
>[]
>> x86, bitops: Change bitops to be native operand size
>> 
>> Change the bitops operation to be naturally "long", i.e. 63 bits on
>> the 64-bit kernel.  Additional bugs are likely to crop up in the
>> future.
>
>> We already have bugs on machines with > 16 TiB of memory in a
>> single node, as can happen if memory is interleaved.  The x86 bitop
>> operations take a signed index, so using an unsigned type is not an
>> option.
>
>I think it odd that any bitop index nr should be
>anything other than unsigned long for any arch.
>
>Why should this arch be any different than the
>defined type in Documentation/atomic_ops.txt?
>
>What value is a negative index when the bitmap
>array address passed is the starting 0th bit?
>
>btw: asm-generic/bitops.h doesn't match
>Documentation/atomic_ops.txt either.

-- 
Sent from my mobile phone.  Please pardon brevity and lack of formatting.


* Re: [tip:x86/asm] x86, bitops:  Change bitops to be native operand size
  2013-11-10 22:10   ` H. Peter Anvin
@ 2013-11-10 22:44     ` Joe Perches
  2013-11-11  2:06       ` H. Peter Anvin
  0 siblings, 1 reply; 12+ messages in thread
From: Joe Perches @ 2013-11-10 22:44 UTC (permalink / raw)
  To: H. Peter Anvin, linux-arch
  Cc: mingo, linux-kernel, tglx, james.t.kukunas, hpa, Linus Torvalds,
	David Miller

(adding linux-arch, and possible patch below)

On Sun, 2013-11-10 at 14:10 -0800, H. Peter Anvin wrote:
> Yes, on the generic it is int.
> 
> The problem is in part that some architectures have bitop
> instructions with specific behavior.

I think that all bitop indices should be changed
to unsigned (int or long, probably long) for all
arches.

Is there any impediment to that?

I didn't find a negative index used anywhere
but I didn't do an exhaustive search.

There are many different arch specific declarations
of <foo>_bit functions.

For instance:

$ git grep -w clear_bit arch|grep "bitops\.h.*static"
arch/arc/include/asm/bitops.h:static inline void clear_bit(unsigned long nr, volatile unsigned long *m)
arch/arc/include/asm/bitops.h:static inline void clear_bit(unsigned long nr, volatile unsigned long *m)
arch/avr32/include/asm/bitops.h:static inline void clear_bit(int nr, volatile void * addr)
arch/blackfin/include/asm/bitops.h:static inline void clear_bit(int nr, volatile unsigned long *addr)
arch/frv/include/asm/bitops.h:static inline void clear_bit(unsigned long nr, volatile void *addr)
arch/hexagon/include/asm/bitops.h:static inline void clear_bit(int nr, volatile void *addr)
arch/m32r/include/asm/bitops.h:static __inline__ void clear_bit(int nr, volatile void * addr)
arch/metag/include/asm/bitops.h:static inline void clear_bit(unsigned int bit, volatile unsigned long *p)
arch/mips/include/asm/bitops.h:static inline void clear_bit(unsigned long nr, volatile unsigned long *addr)
arch/parisc/include/asm/bitops.h:static __inline__ void clear_bit(int nr, volatile unsigned long * addr)
arch/powerpc/include/asm/bitops.h:static __inline__ void clear_bit(int nr, volatile unsigned long *addr)
arch/s390/include/asm/bitops.h:static inline void clear_bit(unsigned long nr, volatile unsigned long *ptr)
arch/xtensa/include/asm/bitops.h:static inline void clear_bit(unsigned int bit, volatile unsigned long *p)

> Joe Perches <joe@perches.com> wrote:
> >On Tue, 2013-07-16 at 18:15 -0700, tip-bot for H. Peter Anvin wrote:
> >> Commit-ID:  9b710506a03b01a9fdd83962912bc9d8237b82e8
> >[]
> >> x86, bitops: Change bitops to be native operand size
> >> 
> >> Change the bitops operation to be naturally "long", i.e. 63 bits on
> >> the 64-bit kernel.  Additional bugs are likely to crop up in the
> >> future.
> >
> >> We already have bugs on machines with > 16 TiB of memory in a
> >> single node, as can happen if memory is interleaved.  The x86 bitop
> >> operations take a signed index, so using an unsigned type is not an
> >> option.
> >
> >I think it odd that any bitop index nr should be
> >anything other than unsigned long for any arch.
> >
> >Why should this arch be any different than the
> >defined type in Documentation/atomic_ops.txt?
> >
> >What value is a negative index when the bitmap
> >array address passed is the starting 0th bit?
> >
> >btw: asm-generic/bitops.h doesn't match
> >Documentation/atomic_ops.txt either.

---

 include/asm-generic/bitops/atomic.h | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/include/asm-generic/bitops/atomic.h b/include/asm-generic/bitops/atomic.h
index 9ae6c34..e4feee5 100644
--- a/include/asm-generic/bitops/atomic.h
+++ b/include/asm-generic/bitops/atomic.h
@@ -62,7 +62,7 @@ extern arch_spinlock_t __atomic_hash[ATOMIC_HASH_SIZE] __lock_aligned;
  * Note that @nr may be almost arbitrarily large; this function is not
  * restricted to acting on a single-word quantity.
  */
-static inline void set_bit(int nr, volatile unsigned long *addr)
+static inline void set_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	unsigned long mask = BIT_MASK(nr);
 	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
@@ -83,7 +83,7 @@ static inline void set_bit(int nr, volatile unsigned long *addr)
  * you should call smp_mb__before_clear_bit() and/or smp_mb__after_clear_bit()
  * in order to ensure changes are visible on other processors.
  */
-static inline void clear_bit(int nr, volatile unsigned long *addr)
+static inline void clear_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	unsigned long mask = BIT_MASK(nr);
 	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
@@ -104,7 +104,7 @@ static inline void clear_bit(int nr, volatile unsigned long *addr)
  * Note that @nr may be almost arbitrarily large; this function is not
  * restricted to acting on a single-word quantity.
  */
-static inline void change_bit(int nr, volatile unsigned long *addr)
+static inline void change_bit(unsigned long nr, volatile unsigned long *addr)
 {
 	unsigned long mask = BIT_MASK(nr);
 	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
@@ -124,7 +124,8 @@ static inline void change_bit(int nr, volatile unsigned long *addr)
  * It may be reordered on other architectures than x86.
  * It also implies a memory barrier.
  */
-static inline int test_and_set_bit(int nr, volatile unsigned long *addr)
+static inline int test_and_set_bit(unsigned long nr,
+				   volatile unsigned long *addr)
 {
 	unsigned long mask = BIT_MASK(nr);
 	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
@@ -148,7 +149,8 @@ static inline int test_and_set_bit(int nr, volatile unsigned long *addr)
  * It can be reorderdered on other architectures other than x86.
  * It also implies a memory barrier.
  */
-static inline int test_and_clear_bit(int nr, volatile unsigned long *addr)
+static inline int test_and_clear_bit(unsigned long nr,
+				     volatile unsigned long *addr)
 {
 	unsigned long mask = BIT_MASK(nr);
 	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);
@@ -171,7 +173,8 @@ static inline int test_and_clear_bit(int nr, volatile unsigned long *addr)
  * This operation is atomic and cannot be reordered.
  * It also implies a memory barrier.
  */
-static inline int test_and_change_bit(int nr, volatile unsigned long *addr)
+static inline int test_and_change_bit(unsigned long nr,
+				      volatile unsigned long *addr)
 {
 	unsigned long mask = BIT_MASK(nr);
 	unsigned long *p = ((unsigned long *)addr) + BIT_WORD(nr);




* Re: [tip:x86/asm] x86, bitops:  Change bitops to be native operand size
  2013-11-10 22:44     ` Joe Perches
@ 2013-11-11  2:06       ` H. Peter Anvin
  2013-11-11  2:22         ` Joe Perches
  0 siblings, 1 reply; 12+ messages in thread
From: H. Peter Anvin @ 2013-11-11  2:06 UTC (permalink / raw)
  To: Joe Perches, linux-arch
  Cc: mingo, linux-kernel, tglx, james.t.kukunas, hpa, Linus Torvalds,
	David Miller

On 11/10/2013 02:44 PM, Joe Perches wrote:
> (adding linux-arch, and possible patch below)
> 
> On Sun, 2013-11-10 at 14:10 -0800, H. Peter Anvin wrote:
>> Yes, on the generic it is int.
>>
>> The problem is in part that some architectures have bitop
>> instructions with specific behavior.
> 
> I think that all bitop indices should be changed
> to unsigned (int or long, probably long) for all
> arches.
> 
> Is there any impediment to that?
> 

It is at the very best misleading.  On x86 bit indices will be signed
no matter what the data type says, and having an unsigned data type
being interpreted as signed seems really dangerous.

On the other hand, for the generic implementation unsigned long makes sense.

We might need a bitindex_t or something like that for it to be clean.
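
Something like this, as a rough per-arch sketch (with bitindex_t being
the hypothetical type):

	/* asm-generic */
	typedef unsigned long bitindex_t;

	/* x86: bt/bts/btr/btc interpret the index as signed */
	typedef long bitindex_t;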

	-hpa



* Re: [tip:x86/asm] x86, bitops:  Change bitops to be native operand size
  2013-11-11  2:06       ` H. Peter Anvin
@ 2013-11-11  2:22         ` Joe Perches
  2013-11-11 23:34           ` H. Peter Anvin
  0 siblings, 1 reply; 12+ messages in thread
From: Joe Perches @ 2013-11-11  2:22 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: linux-arch, mingo, linux-kernel, tglx, james.t.kukunas, hpa,
	Linus Torvalds, David Miller

On Sun, 2013-11-10 at 18:06 -0800, H. Peter Anvin wrote:
> On 11/10/2013 02:44 PM, Joe Perches wrote:
> > On Sun, 2013-11-10 at 14:10 -0800, H. Peter Anvin wrote:
> >> Yes, on the generic it is int.
> >> The problem is in part that some architectures have bitop
> >> instructions with specific behavior.
> > I think that all bitop indices should be changed
> > to unsigned (int or long, probably long) for all
> > arches.
> > Is there any impediment to that?
> It is at the very best misleading.  On x86 bit indices will be signed
> no matter what the data type says,

?

> and having an unsigned data type
> being interpreted as signed seems really dangerous.
> On the other hand, for the generic implementation unsigned long makes sense.
> We might need a bitindex_t or something like that for it to be clean.

Is there really any reason to introduce bitindex_t?

Perhaps the current x86 bitops asm code is being conflated
with the ideal implementation?



* Re: [tip:x86/asm] x86, bitops:  Change bitops to be native operand size
  2013-11-11  2:22         ` Joe Perches
@ 2013-11-11 23:34           ` H. Peter Anvin
  2013-11-12  2:54             ` Joe Perches
  0 siblings, 1 reply; 12+ messages in thread
From: H. Peter Anvin @ 2013-11-11 23:34 UTC (permalink / raw)
  To: Joe Perches, H. Peter Anvin
  Cc: linux-arch, mingo, linux-kernel, tglx, james.t.kukunas,
	Linus Torvalds, David Miller

On 11/10/2013 06:22 PM, Joe Perches wrote:
> 
> Perhaps the current x86 bitops asm code is being conflated
> with the ideal implementation?
> 

Yes, by you.

x86 has instructions that operate on signed bitindicies.  It doesn't
have instructions that operate on unsigned bitindicies.  Unless someone
is willing to do the work to prove that shift and mask is actually
faster than using the hardware instructions (which I doubt, but it is
always a possibility), that's what we have.
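
For comparison, the two candidate implementations look roughly like this
(simplified from the generic and x86 headers; the function names here
are just for illustration):

	/* shift and mask */
	static inline int test_bit_generic(unsigned long nr,
					   const volatile unsigned long *addr)
	{
		return (addr[nr / BITS_PER_LONG] >> (nr % BITS_PER_LONG)) & 1;
	}

	/* hardware bit-test instruction */
	static inline int test_bit_bt(long nr, const volatile unsigned long *addr)
	{
		int oldbit;

		asm volatile("bt %2,%1\n\t"
			     "sbb %0,%0"
			     : "=r" (oldbit)
			     : "m" (*(unsigned long *)addr), "Ir" (nr));
		return oldbit;
	}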

	-hpa



* Re: [tip:x86/asm] x86, bitops:  Change bitops to be native operand size
  2013-11-11 23:34           ` H. Peter Anvin
@ 2013-11-12  2:54             ` Joe Perches
  2013-11-12  3:15               ` Linus Torvalds
  0 siblings, 1 reply; 12+ messages in thread
From: Joe Perches @ 2013-11-12  2:54 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: H. Peter Anvin, linux-arch, mingo, linux-kernel, tglx,
	james.t.kukunas, Linus Torvalds, David Miller

On Mon, 2013-11-11 at 15:34 -0800, H. Peter Anvin wrote:
> On 11/10/2013 06:22 PM, Joe Perches wrote:
> > 
> > Perhaps the current x86 bitops asm code is being conflated
> > with the ideal implementation?
> > 
> Yes, by you.

Really?  I don't think so.

How does the use of signed long for an index where
no negative values are possible, or the use of a
negative int for BIT_MASK, make sense?
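
For illustration, BIT_MASK() in include/linux/bitops.h is (roughly):

	#define BIT_MASK(nr)	(1UL << ((nr) % BITS_PER_LONG))

so a negative nr produces a negative shift count, which is undefined
behaviour rather than a useful bit mask.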

> x86 has instructions that operate on signed bitindicies.

indices.

> It doesn't
> have instructions that operate on unsigned bitindicies.  Unless someone
> is willing to do the work to prove that shift and mask is actually
> faster than using the hardware instructions (which I doubt, but it is
> always a possibility), that's what we have.

That doesn't mean x86 is the ideal implementation.




* Re: [tip:x86/asm] x86, bitops: Change bitops to be native operand size
  2013-11-12  2:54             ` Joe Perches
@ 2013-11-12  3:15               ` Linus Torvalds
  2013-11-12  4:08                 ` Joe Perches
  0 siblings, 1 reply; 12+ messages in thread
From: Linus Torvalds @ 2013-11-12  3:15 UTC (permalink / raw)
  To: Joe Perches
  Cc: H. Peter Anvin, H. Peter Anvin, linux-arch, Ingo Molnar,
	Linux Kernel Mailing List, Thomas Gleixner, james.t.kukunas,
	David Miller

On Tue, Nov 12, 2013 at 11:54 AM, Joe Perches <joe@perches.com> wrote:
> On Mon, 2013-11-11 at 15:34 -0800, H. Peter Anvin wrote:
>> On 11/10/2013 06:22 PM, Joe Perches wrote:
>> >
>> > Perhaps the current x86 bitops asm code is being conflated
>> > with the ideal implementation?
>> >
>> Yes, by you.
>
> Really?  I don't think so.

What you think in this case doesn't really matter, does it? There are
actual facts, and then there is your thinking, and guess which one
matters?

Peter is absolutely correct, and has shown remarkable restraint trying
to explain it to you. The fact is, the x86 bitop instructions act on a
signed index. Making the index be "unsigned long" would violate the
actual *behavior* of the function, so it would be singularly stupid.

Talking about "ideal implementation" is also singularly stupid.
There's this fascinating thing called "reality", and you should try to
re-acquaint yourself with it.

Don't bother replying to this thread.

               Linus


* Re: [tip:x86/asm] x86, bitops: Change bitops to be native operand size
  2013-11-12  3:15               ` Linus Torvalds
@ 2013-11-12  4:08                 ` Joe Perches
  2013-11-12  8:52                   ` Geert Uytterhoeven
  0 siblings, 1 reply; 12+ messages in thread
From: Joe Perches @ 2013-11-12  4:08 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: H. Peter Anvin, H. Peter Anvin, linux-arch, Ingo Molnar,
	Linux Kernel Mailing List, Thomas Gleixner, james.t.kukunas,
	David Miller

On Tue, 2013-11-12 at 12:15 +0900, Linus Torvalds wrote:
> Talking about "ideal implementation" is also singularly stupid.

I just want the various arch implementations to match
the docs.  I know that's stupid.

Maybe if you really don't want to discuss things, you
should fix the documentation.

Documentation/atomic_ops.txt




* Re: [tip:x86/asm] x86, bitops: Change bitops to be native operand size
  2013-11-12  4:08                 ` Joe Perches
@ 2013-11-12  8:52                   ` Geert Uytterhoeven
  2013-11-30 23:16                     ` Rob Landley
  0 siblings, 1 reply; 12+ messages in thread
From: Geert Uytterhoeven @ 2013-11-12  8:52 UTC (permalink / raw)
  To: Joe Perches
  Cc: Linus Torvalds, H. Peter Anvin, H. Peter Anvin, linux-arch,
	Ingo Molnar, Linux Kernel Mailing List, Thomas Gleixner,
	james.t.kukunas, David Miller

On Tue, Nov 12, 2013 at 5:08 AM, Joe Perches <joe@perches.com> wrote:
> On Tue, 2013-11-12 at 12:15 +0900, Linus Torvalds wrote:
>> Talking about "ideal implementation" is also singularly stupid.
>
> I just want the various arch implementations to match
> the docs.  I know that's stupid.
>
> Maybe if you really don't want to discuss things, you
> should fix the documentation.

E.g. by adding a paragraph that the actual allowed range of indices may be
a subset of "unsigned long" on some architectures.
Or if we know that everyone supports at least 31 resp. 63 bits, that it may
be limited to 31 resp. 63 unsigned bits, which is the positive range subset of
"long".

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


* Re: [tip:x86/asm] x86, bitops: Change bitops to be native operand size
  2013-11-12  8:52                   ` Geert Uytterhoeven
@ 2013-11-30 23:16                     ` Rob Landley
  0 siblings, 0 replies; 12+ messages in thread
From: Rob Landley @ 2013-11-30 23:16 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Joe Perches, H. Peter Anvin, H. Peter Anvin, linux-arch,
	Ingo Molnar, Linux Kernel Mailing List, Thomas Gleixner,
	james.t.kukunas, David Miller

On 11/12/2013 02:52:57 AM, Geert Uytterhoeven wrote:
> On Tue, Nov 12, 2013 at 5:08 AM, Joe Perches <joe@perches.com> wrote:
> > On Tue, 2013-11-12 at 12:15 +0900, Linus Torvalds wrote:
> >> Talking about "ideal implementation" is also singularly stupid.
> >
> > I just want the various arch implementations to match
> > the docs.  I know that's stupid.
> >
> > Maybe if you really don't want to discuss things, you
> > should fix the documentation.
> 
> E.g. by adding a paragraph that the actual allowed range of indices
> may be a subset of "unsigned long" on some architectures.
> Or if we know that everyone supports at least 31 resp. 63 bits, that
> it may be limited to 31 resp. 63 unsigned bits, which is the positive
> range subset of "long".

If this ever turns into an actual patch to this file, could you cc: me  
on it so I can marshal it upstream? (Not enough domain expertise for me  
to produce it myself...)

Thanks,

Rob

