All of lore.kernel.org
 help / color / mirror / Atom feed
* [patch 0/2] Fixing AMD cmpxchg "missing lfence" mess
@ 2009-04-22 20:18 Mathieu Desnoyers
  2009-04-22 20:18 ` [patch 1/2] x86: cleanup alternative.h Mathieu Desnoyers
  2009-04-22 20:18 ` [patch 2/2] x86 amd fix cmpxchg read acquire barrier Mathieu Desnoyers
  0 siblings, 2 replies; 21+ messages in thread
From: Mathieu Desnoyers @ 2009-04-22 20:18 UTC (permalink / raw)
  To: akpm, Ingo Molnar, linux-kernel

Hi,

I've prepared this patch, which applies on 2.6.30-rc2. It's an attempt to make
the kernel work around AMD's bug regarding missing lfence in the sync; cmpxchg
instruction.

The first patch is a alternative.h cleanup, and the second patch implements the
fixup using alternatives.

Indeed, this only makes the kernel aware of the bug. Userland will still have to
be taught to use such lfence too.

Mathieu

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [patch 1/2] x86: cleanup alternative.h
  2009-04-22 20:18 [patch 0/2] Fixing AMD cmpxchg "missing lfence" mess Mathieu Desnoyers
@ 2009-04-22 20:18 ` Mathieu Desnoyers
  2009-04-28 13:05   ` [GIT PULL] " Mathieu Desnoyers
  2009-04-22 20:18 ` [patch 2/2] x86 amd fix cmpxchg read acquire barrier Mathieu Desnoyers
  1 sibling, 1 reply; 21+ messages in thread
From: Mathieu Desnoyers @ 2009-04-22 20:18 UTC (permalink / raw)
  To: akpm, Ingo Molnar, linux-kernel
  Cc: Mathieu Desnoyers, H. Peter Anvin, Andi Kleen

[-- Attachment #1: x86-cleanup-alternative.h.patch --]
[-- Type: text/plain, Size: 4853 bytes --]

Alternative header duplicates assembly that could be merged in one single macro.
Merging this into this macro also allows to directly declare ALTERNATIVE()
statements within assembly code.

Uses a __stringify() of the feature bits rather than passing a "i" operand.
Leave the old %0 operand as-is (set to 0), unused to stay compatible with API.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: 'Ingo Molnar' <mingo@elte.hu>
CC: H. Peter Anvin <hpa@zytor.com>
CC: Andi Kleen <andi@firstfloor.org>
CC: Andrew Morton <akpm@linux-foundation.org>
---
 arch/x86/include/asm/alternative.h |   58 ++++++++++++++-----------------------
 1 file changed, 22 insertions(+), 36 deletions(-)

Index: linux-2.6-lttng/arch/x86/include/asm/alternative.h
===================================================================
--- linux-2.6-lttng.orig/arch/x86/include/asm/alternative.h	2009-04-22 16:14:23.000000000 -0400
+++ linux-2.6-lttng/arch/x86/include/asm/alternative.h	2009-04-22 16:17:48.000000000 -0400
@@ -3,6 +3,7 @@
 
 #include <linux/types.h>
 #include <linux/stddef.h>
+#include <linux/stringify.h>
 #include <asm/asm.h>
 
 /*
@@ -74,6 +75,21 @@ static inline void alternatives_smp_swit
 
 const unsigned char *const *find_nop_table(void);
 
+/* alternative assembly primitive */
+#define ALTERNATIVE(oldinstr, newinstr, feature) \
+	      "661:\n\t" oldinstr "\n662:\n"			\
+	      ".section .altinstructions,\"a\"\n"		\
+	      _ASM_ALIGN "\n"					\
+	      _ASM_PTR "661b\n"		/* label */		\
+	      _ASM_PTR "663f\n"		/* new instruction */	\
+	      "	 .byte " __stringify(feature) "\n"	/* feature bit */ \
+	      "	 .byte 662b-661b\n"	/* sourcelen */		\
+	      "	 .byte 664f-663f\n"	/* replacementlen */	\
+	      ".previous\n"					\
+	      ".section .altinstr_replacement,\"ax\"\n"		\
+	      "663:\n\t" newinstr "\n664:\n"  /* replacement */	\
+	      ".previous"
+
 /*
  * Alternative instructions for different CPU types or capabilities.
  *
@@ -87,18 +103,7 @@ const unsigned char *const *find_nop_tab
  * without volatile and memory clobber.
  */
 #define alternative(oldinstr, newinstr, feature)			\
-	asm volatile ("661:\n\t" oldinstr "\n662:\n"			\
-		      ".section .altinstructions,\"a\"\n"		\
-		      _ASM_ALIGN "\n"					\
-		      _ASM_PTR "661b\n"		/* label */		\
-		      _ASM_PTR "663f\n"		/* new instruction */	\
-		      "	 .byte %c0\n"		/* feature bit */	\
-		      "	 .byte 662b-661b\n"	/* sourcelen */		\
-		      "	 .byte 664f-663f\n"	/* replacementlen */	\
-		      ".previous\n"					\
-		      ".section .altinstr_replacement,\"ax\"\n"		\
-		      "663:\n\t" newinstr "\n664:\n"  /* replacement */	\
-		      ".previous" :: "i" (feature) : "memory")
+	asm volatile (ALTERNATIVE(oldinstr, newinstr, feature) : : : "memory")
 
 /*
  * Alternative inline assembly with input.
@@ -109,35 +114,16 @@ const unsigned char *const *find_nop_tab
  * Best is to use constraints that are fixed size (like (%1) ... "r")
  * If you use variable sized constraints like "m" or "g" in the
  * replacement make sure to pad to the worst case length.
+ * Leaving an unused argument 0 to keep API compatibility.
  */
 #define alternative_input(oldinstr, newinstr, feature, input...)	\
-	asm volatile ("661:\n\t" oldinstr "\n662:\n"			\
-		      ".section .altinstructions,\"a\"\n"		\
-		      _ASM_ALIGN "\n"					\
-		      _ASM_PTR "661b\n"		/* label */		\
-		      _ASM_PTR "663f\n"		/* new instruction */	\
-		      "	 .byte %c0\n"		/* feature bit */	\
-		      "	 .byte 662b-661b\n"	/* sourcelen */		\
-		      "	 .byte 664f-663f\n"	/* replacementlen */	\
-		      ".previous\n"					\
-		      ".section .altinstr_replacement,\"ax\"\n"		\
-		      "663:\n\t" newinstr "\n664:\n"  /* replacement */	\
-		      ".previous" :: "i" (feature), ##input)
+	asm volatile (ALTERNATIVE(oldinstr, newinstr, feature)		\
+		: : "i" (0), ## input)
 
 /* Like alternative_input, but with a single output argument */
 #define alternative_io(oldinstr, newinstr, feature, output, input...)	\
-	asm volatile ("661:\n\t" oldinstr "\n662:\n"			\
-		      ".section .altinstructions,\"a\"\n"		\
-		      _ASM_ALIGN "\n"					\
-		      _ASM_PTR "661b\n"		/* label */		\
-		      _ASM_PTR "663f\n"		/* new instruction */	\
-		      "	 .byte %c[feat]\n"	/* feature bit */	\
-		      "	 .byte 662b-661b\n"	/* sourcelen */		\
-		      "	 .byte 664f-663f\n"	/* replacementlen */	\
-		      ".previous\n"					\
-		      ".section .altinstr_replacement,\"ax\"\n"		\
-		      "663:\n\t" newinstr "\n664:\n"  /* replacement */ \
-		      ".previous" : output : [feat] "i" (feature), ##input)
+	asm volatile (ALTERNATIVE(oldinstr, newinstr, feature)		\
+		: output : "i" (0), ## input)
 
 /*
  * use this macro(s) if you need more than one output parameter

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [patch 2/2] x86 amd fix cmpxchg read acquire barrier
  2009-04-22 20:18 [patch 0/2] Fixing AMD cmpxchg "missing lfence" mess Mathieu Desnoyers
  2009-04-22 20:18 ` [patch 1/2] x86: cleanup alternative.h Mathieu Desnoyers
@ 2009-04-22 20:18 ` Mathieu Desnoyers
  2009-04-22 20:59   ` Alan Cox
                     ` (2 more replies)
  1 sibling, 3 replies; 21+ messages in thread
From: Mathieu Desnoyers @ 2009-04-22 20:18 UTC (permalink / raw)
  To: akpm, Ingo Molnar, linux-kernel
  Cc: Mathieu Desnoyers, mark.langsdorf, arekm, H. Peter Anvin,
	Andi Kleen, Avi Kivity

[-- Attachment #1: x86-amd-fix-cmpxchg-read-acquire-barrier.patch --]
[-- Type: text/plain, Size: 8823 bytes --]

http://google-perftools.googlecode.com/svn-history/r48/trunk/src/base/atomicops-internals-x86.cc

says

"  // Opteron Rev E has a bug in which on very rare occasions a locked
  // instruction doesn't act as a read-acquire barrier if followed by a
  // non-locked read-modify-write instruction.  Rev F has this bug in 
  // pre-release versions, but not in versions released to customers,
  // so we test only for Rev E, which is family 15, model 32..63 inclusive.
  if (strcmp(vendor, "AuthenticAMD") == 0 &&       // AMD
      family == 15 &&
      32 <= model && model <= 63) {
    AtomicOps_Internalx86CPUFeatures.has_amd_lock_mb_bug = true;
  } else {
    AtomicOps_Internalx86CPUFeatures.has_amd_lock_mb_bug = false;
  }
"

There is no official AMD bug ID yet, but it seems to be reported in the field
and a fix is present in Solaris code base. The following links shows the current
understanding of the issue.

http://bugzilla.kernel.org/show_bug.cgi?id=11305
http://bugs.mysql.com/bug.php?id=26081

Since I needed to put the alternative within assembly statements, I decided to
rework alternative.h and fold the multiple alternative*() definitions into the
same ALTERNATIVE() macro. It is done in the previous patch. Those were
duplicated code anyway.

AFAIK, this bug has not received any official bug ID from AMD. Solaris already
have a workaround for this issue.

One should probably audit Xen cmpxchg too.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: mark.langsdorf@amd.com
CC: arekm@maven.pl
CC: 'Ingo Molnar' <mingo@elte.hu>
CC: H. Peter Anvin <hpa@zytor.com>
CC: Andi Kleen <andi@firstfloor.org>
CC: Avi Kivity <avi@qumranet.com>
---
 arch/x86/include/asm/alternative.h |    5 +++++
 arch/x86/include/asm/cmpxchg_64.h  |   21 ++++++++++++++-------
 arch/x86/include/asm/cpufeature.h  |    1 +
 arch/x86/include/asm/futex.h       |    2 ++
 arch/x86/include/asm/rwsem.h       |    1 +
 arch/x86/include/asm/spinlock.h    |    2 ++
 arch/x86/kernel/cpu/amd.c          |    4 ++++
 7 files changed, 29 insertions(+), 7 deletions(-)

Index: linux-2.6-lttng/arch/x86/include/asm/cpufeature.h
===================================================================
--- linux-2.6-lttng.orig/arch/x86/include/asm/cpufeature.h	2009-04-18 23:58:23.000000000 -0400
+++ linux-2.6-lttng/arch/x86/include/asm/cpufeature.h	2009-04-18 23:58:28.000000000 -0400
@@ -94,6 +94,7 @@
 #define X86_FEATURE_TSC_RELIABLE (3*32+23) /* TSC is known to be reliable */
 #define X86_FEATURE_NONSTOP_TSC	(3*32+24) /* TSC does not stop in C states */
 #define X86_FEATURE_CLFLUSH_MONITOR (3*32+25) /* "" clflush reqd with monitor */
+#define X86_FEATURE_NEED_CMPXCHG_LFENCE	(3*32+26) /* Fix AMD cmpxchg lfence */
 
 /* Intel-defined CPU features, CPUID level 0x00000001 (ecx), word 4 */
 #define X86_FEATURE_XMM3	(4*32+ 0) /* "pni" SSE-3 */
Index: linux-2.6-lttng/arch/x86/kernel/cpu/amd.c
===================================================================
--- linux-2.6-lttng.orig/arch/x86/kernel/cpu/amd.c	2009-04-18 23:58:23.000000000 -0400
+++ linux-2.6-lttng/arch/x86/kernel/cpu/amd.c	2009-04-18 23:58:28.000000000 -0400
@@ -416,6 +416,10 @@ static void __cpuinit init_amd(struct cp
 		clear_cpu_cap(c, X86_FEATURE_MCE);
 #endif
 
+	/* Enable workaround for missing AMD cmpxchg lfence */
+	if (c->x86 == 0xf && c->x86_model >= 32 && c->x86_model <= 63)
+		set_cpu_cap(c, X86_FEATURE_NEED_CMPXCHG_LFENCE);
+
 	/* Enable workaround for FXSAVE leak */
 	if (c->x86 >= 6)
 		set_cpu_cap(c, X86_FEATURE_FXSAVE_LEAK);
Index: linux-2.6-lttng/arch/x86/include/asm/cmpxchg_64.h
===================================================================
--- linux-2.6-lttng.orig/arch/x86/include/asm/cmpxchg_64.h	2009-04-18 23:58:23.000000000 -0400
+++ linux-2.6-lttng/arch/x86/include/asm/cmpxchg_64.h	2009-04-18 23:58:28.000000000 -0400
@@ -66,25 +66,29 @@ static inline unsigned long __cmpxchg(vo
 	unsigned long prev;
 	switch (size) {
 	case 1:
-		asm volatile(LOCK_PREFIX "cmpxchgb %b1,%2"
+		asm volatile(LOCK_PREFIX "cmpxchgb %b1,%2\n\t"
+			     CMPXCHG_LFENCE
 			     : "=a"(prev)
 			     : "q"(new), "m"(*__xg(ptr)), "0"(old)
 			     : "memory");
 		return prev;
 	case 2:
-		asm volatile(LOCK_PREFIX "cmpxchgw %w1,%2"
+		asm volatile(LOCK_PREFIX "cmpxchgw %w1,%2\n\t"
+			     CMPXCHG_LFENCE
 			     : "=a"(prev)
 			     : "r"(new), "m"(*__xg(ptr)), "0"(old)
 			     : "memory");
 		return prev;
 	case 4:
-		asm volatile(LOCK_PREFIX "cmpxchgl %k1,%2"
+		asm volatile(LOCK_PREFIX "cmpxchgl %k1,%2\n\t"
+			     CMPXCHG_LFENCE
 			     : "=a"(prev)
 			     : "r"(new), "m"(*__xg(ptr)), "0"(old)
 			     : "memory");
 		return prev;
 	case 8:
-		asm volatile(LOCK_PREFIX "cmpxchgq %1,%2"
+		asm volatile(LOCK_PREFIX "cmpxchgq %1,%2\n\t"
+			     CMPXCHG_LFENCE
 			     : "=a"(prev)
 			     : "r"(new), "m"(*__xg(ptr)), "0"(old)
 			     : "memory");
@@ -105,19 +109,22 @@ static inline unsigned long __sync_cmpxc
 	unsigned long prev;
 	switch (size) {
 	case 1:
-		asm volatile("lock; cmpxchgb %b1,%2"
+		asm volatile("lock; cmpxchgb %b1,%2\n\t"
+			     CMPXCHG_LFENCE
 			     : "=a"(prev)
 			     : "q"(new), "m"(*__xg(ptr)), "0"(old)
 			     : "memory");
 		return prev;
 	case 2:
-		asm volatile("lock; cmpxchgw %w1,%2"
+		asm volatile("lock; cmpxchgw %w1,%2\n\t"
+			     CMPXCHG_LFENCE
 			     : "=a"(prev)
 			     : "r"(new), "m"(*__xg(ptr)), "0"(old)
 			     : "memory");
 		return prev;
 	case 4:
-		asm volatile("lock; cmpxchgl %1,%2"
+		asm volatile("lock; cmpxchgl %1,%2\n\t"
+			     CMPXCHG_LFENCE
 			     : "=a"(prev)
 			     : "r"(new), "m"(*__xg(ptr)), "0"(old)
 			     : "memory");
Index: linux-2.6-lttng/arch/x86/include/asm/alternative.h
===================================================================
--- linux-2.6-lttng.orig/arch/x86/include/asm/alternative.h	2009-04-18 23:58:23.000000000 -0400
+++ linux-2.6-lttng/arch/x86/include/asm/alternative.h	2009-04-18 23:58:28.000000000 -0400
@@ -5,6 +5,7 @@
 #include <linux/stddef.h>
 #include <linux/stringify.h>
 #include <asm/asm.h>
+#include <asm/nops.h>
 
 /*
  * Alternative inline assembly for SMP.
@@ -54,6 +55,10 @@ struct alt_instr {
 #endif
 };
 
+/* Needed both in UP and SMP build because of cmpxchg_sync */
+#define CMPXCHG_LFENCE 	ALTERNATIVE(ASM_NOP3, "lfence",	\
+				    X86_FEATURE_NEED_CMPXCHG_LFENCE)
+
 extern void alternative_instructions(void);
 extern void apply_alternatives(struct alt_instr *start, struct alt_instr *end);
 
Index: linux-2.6-lttng/arch/x86/include/asm/spinlock.h
===================================================================
--- linux-2.6-lttng.orig/arch/x86/include/asm/spinlock.h	2009-04-18 23:58:23.000000000 -0400
+++ linux-2.6-lttng/arch/x86/include/asm/spinlock.h	2009-04-18 23:58:28.000000000 -0400
@@ -86,6 +86,7 @@ static __always_inline int __ticket_spin
 		     "leal 0x100(%" REG_PTR_MODE "0), %1\n\t"
 		     "jne 1f\n\t"
 		     LOCK_PREFIX "cmpxchgw %w1,%2\n\t"
+		     CMPXCHG_LFENCE "\n\t"
 		     "1:"
 		     "sete %b1\n\t"
 		     "movzbl %b1,%0\n\t"
@@ -139,6 +140,7 @@ static __always_inline int __ticket_spin
 		     "leal 0x00010000(%" REG_PTR_MODE "0), %1\n\t"
 		     "jne 1f\n\t"
 		     LOCK_PREFIX "cmpxchgl %1,%2\n\t"
+		     CMPXCHG_LFENCE "\n\t"
 		     "1:"
 		     "sete %b1\n\t"
 		     "movzbl %b1,%0\n\t"
Index: linux-2.6-lttng/arch/x86/include/asm/rwsem.h
===================================================================
--- linux-2.6-lttng.orig/arch/x86/include/asm/rwsem.h	2009-04-18 23:58:23.000000000 -0400
+++ linux-2.6-lttng/arch/x86/include/asm/rwsem.h	2009-04-18 23:58:28.000000000 -0400
@@ -129,6 +129,7 @@ static inline int __down_read_trylock(st
 		     "  addl      %3,%2\n\t"
 		     "  jle	     2f\n\t"
 		     LOCK_PREFIX "  cmpxchgl  %2,%0\n\t"
+		     CMPXCHG_LFENCE "\n\t"
 		     "  jnz	     1b\n\t"
 		     "2:\n\t"
 		     "# ending __down_read_trylock\n\t"
Index: linux-2.6-lttng/arch/x86/include/asm/futex.h
===================================================================
--- linux-2.6-lttng.orig/arch/x86/include/asm/futex.h	2009-04-18 23:58:23.000000000 -0400
+++ linux-2.6-lttng/arch/x86/include/asm/futex.h	2009-04-18 23:58:28.000000000 -0400
@@ -26,6 +26,7 @@
 		     "\tmovl\t%0, %3\n"				\
 		     "\t" insn "\n"				\
 		     "2:\t" LOCK_PREFIX "cmpxchgl %3, %2\n"	\
+		     CMPXCHG_LFENCE "\n"			\
 		     "\tjnz\t1b\n"				\
 		     "3:\t.section .fixup,\"ax\"\n"		\
 		     "4:\tmov\t%5, %1\n"			\
@@ -123,6 +124,7 @@ static inline int futex_atomic_cmpxchg_i
 		return -EFAULT;
 
 	asm volatile("1:\t" LOCK_PREFIX "cmpxchgl %3, %1\n"
+		     CMPXCHG_LFENCE "\n"
 		     "2:\t.section .fixup, \"ax\"\n"
 		     "3:\tmov     %2, %0\n"
 		     "\tjmp     2b\n"

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [patch 2/2] x86 amd fix cmpxchg read acquire barrier
  2009-04-22 20:18 ` [patch 2/2] x86 amd fix cmpxchg read acquire barrier Mathieu Desnoyers
@ 2009-04-22 20:59   ` Alan Cox
  2009-04-22 22:47     ` Mathieu Desnoyers
  2009-04-23  8:06   ` Ingo Molnar
  2009-04-25  8:19   ` Pavel Machek
  2 siblings, 1 reply; 21+ messages in thread
From: Alan Cox @ 2009-04-22 20:59 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: akpm, Ingo Molnar, linux-kernel, Mathieu Desnoyers,
	mark.langsdorf, arekm, H. Peter Anvin, Andi Kleen, Avi Kivity

> There is no official AMD bug ID yet, but it seems to be reported in the field
> and a fix is present in Solaris code base. The following links shows the current
> understanding of the issue.

I would like to know why there isn't an official vendor Bug ID yet
because without that nobody knows what the fix actually is or when it is
needed. It's also not going to help much given the user space problem
(and mixed kernel/user stuff like futexes).

Second point - it needs to be possible to avoid compiling this stuff in
the first place. I don't see where you arrange CMPXCHNG_LFENCE to come
out as blank when people compile for processors without the bug ?


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [patch 2/2] x86 amd fix cmpxchg read acquire barrier
  2009-04-22 20:59   ` Alan Cox
@ 2009-04-22 22:47     ` Mathieu Desnoyers
  2009-04-22 22:53       ` Alan Cox
  0 siblings, 1 reply; 21+ messages in thread
From: Mathieu Desnoyers @ 2009-04-22 22:47 UTC (permalink / raw)
  To: Alan Cox
  Cc: akpm, Ingo Molnar, linux-kernel, mark.langsdorf, arekm,
	H. Peter Anvin, Andi Kleen, Avi Kivity

* Alan Cox (alan@lxorguk.ukuu.org.uk) wrote:
> > There is no official AMD bug ID yet, but it seems to be reported in the field
> > and a fix is present in Solaris code base. The following links shows the current
> > understanding of the issue.
> 
> I would like to know why there isn't an official vendor Bug ID yet
> because without that nobody knows what the fix actually is or when it is
> needed.

I guess AMD should answer to this.

> It's also not going to help much given the user space problem
> (and mixed kernel/user stuff like futexes).

Yeah, ideally, unless user-space is completely audited, the kernel
should restrict multi-threaded user-space programs to run on a single
CPU unless they don't share any memory mapping when the kernel finds it
is running on such broken CPU. Or it can simply refuse to bring up more
than one CPU. That would be a very basic gross fix for a gross problem,
but at least there would be no data corruption. If we use this last
solution, then my cmpxchg lfence workaround becomes unneeded.

> Second point - it needs to be possible to avoid compiling this stuff in
> the first place. I don't see where you arrange CMPXCHNG_LFENCE to come
> out as blank when people compile for processors without the bug ?
> 

Good point. I should probably configure this as "nothing" unless

#ifdef CONFIG_X86_64
#if (defined(CONFIG_GENERIC_CPU) || defined(CONFIG_MK8))
...
#endif
#endif

I doubt it's worth adding a "HAVE_MISSING_CMPXCHG_LFENCE" select.

Mathieu

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [patch 2/2] x86 amd fix cmpxchg read acquire barrier
  2009-04-22 22:47     ` Mathieu Desnoyers
@ 2009-04-22 22:53       ` Alan Cox
  2009-04-22 23:26         ` Mathieu Desnoyers
  0 siblings, 1 reply; 21+ messages in thread
From: Alan Cox @ 2009-04-22 22:53 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: akpm, Ingo Molnar, linux-kernel, mark.langsdorf, arekm,
	H. Peter Anvin, Andi Kleen, Avi Kivity

> Good point. I should probably configure this as "nothing" unless
> 
> #ifdef CONFIG_X86_64
> #if (defined(CONFIG_GENERIC_CPU) || defined(CONFIG_MK8))
> ...
> #endif
> #endif

Is the erratum only present in 64bit mode ?

Really this needs someone with knowledge of the erratum to characterise
it accurately and the workarounds.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [patch 2/2] x86 amd fix cmpxchg read acquire barrier
  2009-04-22 22:53       ` Alan Cox
@ 2009-04-22 23:26         ` Mathieu Desnoyers
  0 siblings, 0 replies; 21+ messages in thread
From: Mathieu Desnoyers @ 2009-04-22 23:26 UTC (permalink / raw)
  To: Alan Cox
  Cc: akpm, Ingo Molnar, linux-kernel, mark.langsdorf, arekm,
	H.  Peter Anvin, Andi Kleen, Avi Kivity

* Alan Cox (alan@lxorguk.ukuu.org.uk) wrote:
> > Good point. I should probably configure this as "nothing" unless
> > 
> > #ifdef CONFIG_X86_64
> > #if (defined(CONFIG_GENERIC_CPU) || defined(CONFIG_MK8))
> > ...
> > #endif
> > #endif
> 
> Is the erratum only present in 64bit mode ?
> 

Hrm, from the info I gathered in the web pages linked in the patch
header (refering to Google and Solaris), nothing said it was limited to
64bit mode. You are right.

> Really this needs someone with knowledge of the erratum to characterise
> it accurately and the workarounds.

Yep, but all I can hear from AMD so far is... *crickets*

The last revision of their errata I can find has been updated in
February 2008, and does not talk about this issue.

http://support.amd.com/us/Processor_TechDocs/25759.pdf

Thanks,

Mathieu

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [patch 2/2] x86 amd fix cmpxchg read acquire barrier
  2009-04-22 20:18 ` [patch 2/2] x86 amd fix cmpxchg read acquire barrier Mathieu Desnoyers
  2009-04-22 20:59   ` Alan Cox
@ 2009-04-23  8:06   ` Ingo Molnar
  2009-04-23 13:19     ` Mathieu Desnoyers
  2009-04-25  8:19   ` Pavel Machek
  2 siblings, 1 reply; 21+ messages in thread
From: Ingo Molnar @ 2009-04-23  8:06 UTC (permalink / raw)
  To: Mathieu Desnoyers, Alan Cox
  Cc: akpm, linux-kernel, mark.langsdorf, arekm, H. Peter Anvin,
	Andi Kleen, Avi Kivity


* Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:

> " // Opteron Rev E has a bug in which on very rare occasions a locked
>   // instruction doesn't act as a read-acquire barrier if followed by a
>   // non-locked read-modify-write instruction.  Rev F has this bug in 
>   // pre-release versions, but not in versions released to customers,
>   // so we test only for Rev E, which is family 15, model 32..63 inclusive.

Dunno. The fix looks a bit intrusive (emits a NOP even on good 
CPUs). Also, the text above says "not in versions released to 
customers".

So unless there's an official erratum or reports in the field (not 
from early prototype systems shipped to developers) i'd not rush to 
apply it, just yet.

	Ingo

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [patch 2/2] x86 amd fix cmpxchg read acquire barrier
  2009-04-23  8:06   ` Ingo Molnar
@ 2009-04-23 13:19     ` Mathieu Desnoyers
  2009-04-23 13:41       ` Arkadiusz Miskiewicz
  0 siblings, 1 reply; 21+ messages in thread
From: Mathieu Desnoyers @ 2009-04-23 13:19 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Alan Cox, akpm, linux-kernel, mark.langsdorf, arekm,
	H. Peter Anvin, Andi Kleen, Avi Kivity

* Ingo Molnar (mingo@elte.hu) wrote:
> 
> * Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
> 
> > " // Opteron Rev E has a bug in which on very rare occasions a locked
> >   // instruction doesn't act as a read-acquire barrier if followed by a
> >   // non-locked read-modify-write instruction.  Rev F has this bug in 
> >   // pre-release versions, but not in versions released to customers,
> >   // so we test only for Rev E, which is family 15, model 32..63 inclusive.
> 
> Dunno. The fix looks a bit intrusive (emits a NOP even on good 
> CPUs). Also, the text above says "not in versions released to 
> customers".
> 
> So unless there's an official erratum or reports in the field (not 
> from early prototype systems shipped to developers) i'd not rush to 
> apply it, just yet.
> 

Actually, Operon Rev E has this bug in the field (family 15, model
32..64). Rev F only had the bug in pre-releases.

But yes, it's bad that it drags so many code additions to something as
critical as cmpxchg. I start to think it might be better to just
disallow bringing up more than one CPU on these machines.

Mathieu

> 	Ingo

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [patch 2/2] x86 amd fix cmpxchg read acquire barrier
  2009-04-23 13:19     ` Mathieu Desnoyers
@ 2009-04-23 13:41       ` Arkadiusz Miskiewicz
  2009-04-23 22:17         ` Mathieu Desnoyers
  0 siblings, 1 reply; 21+ messages in thread
From: Arkadiusz Miskiewicz @ 2009-04-23 13:41 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Ingo Molnar, Alan Cox, akpm, linux-kernel, mark.langsdorf,
	H. Peter Anvin, Andi Kleen, Avi Kivity

On Thursday 23 of April 2009, Mathieu Desnoyers wrote:
> * Ingo Molnar (mingo@elte.hu) wrote:
> > * Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
> > > " // Opteron Rev E has a bug in which on very rare occasions a locked
> > >   // instruction doesn't act as a read-acquire barrier if followed by a
> > >   // non-locked read-modify-write instruction.  Rev F has this bug in
> > >   // pre-release versions, but not in versions released to customers,
> > >   // so we test only for Rev E, which is family 15, model 32..63
> > > inclusive.
> >
> > Dunno. The fix looks a bit intrusive (emits a NOP even on good
> > CPUs). Also, the text above says "not in versions released to
> > customers".
> >
> > So unless there's an official erratum or reports in the field (not
> > from early prototype systems shipped to developers) i'd not rush to
> > apply it, just yet.
>
> Actually, Operon Rev E has this bug in the field (family 15, model
> 32..64). Rev F only had the bug in pre-releases.
>
> But yes, it's bad that it drags so many code additions to something as
> critical as cmpxchg. I start to think it might be better to just
> disallow bringing up more than one CPU on these machines.

That probably would be even worse than what we have now. This bug doesn't 
manifest too often in a noticeable way here (I have few such machines here, 
mostly 2 x dual core; once per few months mysql dies) and loosing 3 of 4 cores 
(or 1 cpu of 2; depends on what you mean) doesn't sound like fun.

> Mathieu


-- 
Arkadiusz Miśkiewicz        PLD/Linux Team
arekm / maven.pl            http://ftp.pld-linux.org/


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [patch 2/2] x86 amd fix cmpxchg read acquire barrier
  2009-04-23 13:41       ` Arkadiusz Miskiewicz
@ 2009-04-23 22:17         ` Mathieu Desnoyers
  0 siblings, 0 replies; 21+ messages in thread
From: Mathieu Desnoyers @ 2009-04-23 22:17 UTC (permalink / raw)
  To: Arkadiusz Miskiewicz
  Cc: Ingo Molnar, Alan Cox, akpm, linux-kernel, mark.langsdorf,
	H. Peter Anvin, Andi Kleen, Avi Kivity

* Arkadiusz Miskiewicz (a.miskiewicz@gmail.com) wrote:
> On Thursday 23 of April 2009, Mathieu Desnoyers wrote:
> > * Ingo Molnar (mingo@elte.hu) wrote:
> > > * Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:
> > > > " // Opteron Rev E has a bug in which on very rare occasions a locked
> > > >   // instruction doesn't act as a read-acquire barrier if followed by a
> > > >   // non-locked read-modify-write instruction.  Rev F has this bug in
> > > >   // pre-release versions, but not in versions released to customers,
> > > >   // so we test only for Rev E, which is family 15, model 32..63
> > > > inclusive.
> > >
> > > Dunno. The fix looks a bit intrusive (emits a NOP even on good
> > > CPUs). Also, the text above says "not in versions released to
> > > customers".
> > >
> > > So unless there's an official erratum or reports in the field (not
> > > from early prototype systems shipped to developers) i'd not rush to
> > > apply it, just yet.
> >
> > Actually, Operon Rev E has this bug in the field (family 15, model
> > 32..64). Rev F only had the bug in pre-releases.
> >
> > But yes, it's bad that it drags so many code additions to something as
> > critical as cmpxchg. I start to think it might be better to just
> > disallow bringing up more than one CPU on these machines.
> 
> That probably would be even worse than what we have now. This bug doesn't 
> manifest too often in a noticeable way here (I have few such machines here, 
> mostly 2 x dual core; once per few months mysql dies) and loosing 3 of 4 cores 
> (or 1 cpu of 2; depends on what you mean) doesn't sound like fun.
> 

Having silent data corruption does not sound like fun neither. Another
alternative, when we detect those CPUs, is to printk a warning telling :

"AMD Opteron family X model Y is known to corrupt data on SMP due"
"to incorrect cmpxchg instruction memory barriers. Please contact"
"AMD for more information."

And activate the "tainted" kernel flag. This way, we won't be bothered
trying to fix AMD bugs, and it will officially become AMD's problem.

Mathieu

> > Mathieu
> 
> 
> -- 
> Arkadiusz Miśkiewicz        PLD/Linux Team
> arekm / maven.pl            http://ftp.pld-linux.org/
> 
> 

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [patch 2/2] x86 amd fix cmpxchg read acquire barrier
  2009-04-22 20:18 ` [patch 2/2] x86 amd fix cmpxchg read acquire barrier Mathieu Desnoyers
  2009-04-22 20:59   ` Alan Cox
  2009-04-23  8:06   ` Ingo Molnar
@ 2009-04-25  8:19   ` Pavel Machek
  2009-05-02 15:55     ` Mathieu Desnoyers
  2 siblings, 1 reply; 21+ messages in thread
From: Pavel Machek @ 2009-04-25  8:19 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: akpm, Ingo Molnar, linux-kernel, mark.langsdorf, arekm,
	H. Peter Anvin, Andi Kleen, Avi Kivity

On Wed 2009-04-22 16:18:54, Mathieu Desnoyers wrote:
> http://google-perftools.googlecode.com/svn-history/r48/trunk/src/base/atomicops-internals-x86.cc
> 
> says
> 
> "  // Opteron Rev E has a bug in which on very rare occasions a locked
>   // instruction doesn't act as a read-acquire barrier if followed by a
>   // non-locked read-modify-write instruction.  Rev F has this bug in 
>   // pre-release versions, but not in versions released to customers,
>   // so we test only for Rev E, which is family 15, model 32..63 inclusive.

Could we be more careful  here and catch the F pre-release versions,
too? Stepping should help there...

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [GIT PULL] x86: cleanup alternative.h
  2009-04-22 20:18 ` [patch 1/2] x86: cleanup alternative.h Mathieu Desnoyers
@ 2009-04-28 13:05   ` Mathieu Desnoyers
  2009-04-28 14:58     ` Ingo Molnar
  0 siblings, 1 reply; 21+ messages in thread
From: Mathieu Desnoyers @ 2009-04-28 13:05 UTC (permalink / raw)
  To: akpm, Ingo Molnar, linux-kernel; +Cc: H. Peter Anvin, Andi Kleen

Ingo, do you think the cleanup below would be worth merging ? It adds
the ability to use ALTERNATIVE() directly in gcc inline assembly _and_
cleans up the current alternative*() cut-and-paste coding style. :)

So unless someone can spot any kind of serious drawback with it, I think
this cleanup patch should be pulled.

Alternative header duplicates assembly that could be merged in one single macro.
Merging this into this macro also allows to directly declare ALTERNATIVE()
statements within assembly code.

Uses a __stringify() of the feature bits rather than passing a "i" operand.
Leave the old %0 operand as-is (set to 0), unused to stay compatible with API.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: 'Ingo Molnar' <mingo@elte.hu>
CC: H. Peter Anvin <hpa@zytor.com>
CC: Andi Kleen <andi@firstfloor.org>
CC: Andrew Morton <akpm@linux-foundation.org>
---
 arch/x86/include/asm/alternative.h |   58 ++++++++++++++-----------------------
 1 file changed, 22 insertions(+), 36 deletions(-)

Index: linux-2.6-lttng/arch/x86/include/asm/alternative.h
===================================================================
--- linux-2.6-lttng.orig/arch/x86/include/asm/alternative.h	2009-04-22 16:14:23.000000000 -0400
+++ linux-2.6-lttng/arch/x86/include/asm/alternative.h	2009-04-22 16:17:48.000000000 -0400
@@ -3,6 +3,7 @@
 
 #include <linux/types.h>
 #include <linux/stddef.h>
+#include <linux/stringify.h>
 #include <asm/asm.h>
 
 /*
@@ -74,6 +75,21 @@ static inline void alternatives_smp_swit
 
 const unsigned char *const *find_nop_table(void);
 
+/* alternative assembly primitive */
+#define ALTERNATIVE(oldinstr, newinstr, feature) \
+	      "661:\n\t" oldinstr "\n662:\n"			\
+	      ".section .altinstructions,\"a\"\n"		\
+	      _ASM_ALIGN "\n"					\
+	      _ASM_PTR "661b\n"		/* label */		\
+	      _ASM_PTR "663f\n"		/* new instruction */	\
+	      "	 .byte " __stringify(feature) "\n"	/* feature bit */ \
+	      "	 .byte 662b-661b\n"	/* sourcelen */		\
+	      "	 .byte 664f-663f\n"	/* replacementlen */	\
+	      ".previous\n"					\
+	      ".section .altinstr_replacement,\"ax\"\n"		\
+	      "663:\n\t" newinstr "\n664:\n"  /* replacement */	\
+	      ".previous"
+
 /*
  * Alternative instructions for different CPU types or capabilities.
  *
@@ -87,18 +103,7 @@ const unsigned char *const *find_nop_tab
  * without volatile and memory clobber.
  */
 #define alternative(oldinstr, newinstr, feature)			\
-	asm volatile ("661:\n\t" oldinstr "\n662:\n"			\
-		      ".section .altinstructions,\"a\"\n"		\
-		      _ASM_ALIGN "\n"					\
-		      _ASM_PTR "661b\n"		/* label */		\
-		      _ASM_PTR "663f\n"		/* new instruction */	\
-		      "	 .byte %c0\n"		/* feature bit */	\
-		      "	 .byte 662b-661b\n"	/* sourcelen */		\
-		      "	 .byte 664f-663f\n"	/* replacementlen */	\
-		      ".previous\n"					\
-		      ".section .altinstr_replacement,\"ax\"\n"		\
-		      "663:\n\t" newinstr "\n664:\n"  /* replacement */	\
-		      ".previous" :: "i" (feature) : "memory")
+	asm volatile (ALTERNATIVE(oldinstr, newinstr, feature) : : : "memory")
 
 /*
  * Alternative inline assembly with input.
@@ -109,35 +114,16 @@ const unsigned char *const *find_nop_tab
  * Best is to use constraints that are fixed size (like (%1) ... "r")
  * If you use variable sized constraints like "m" or "g" in the
  * replacement make sure to pad to the worst case length.
+ * Leaving an unused argument 0 to keep API compatibility.
  */
 #define alternative_input(oldinstr, newinstr, feature, input...)	\
-	asm volatile ("661:\n\t" oldinstr "\n662:\n"			\
-		      ".section .altinstructions,\"a\"\n"		\
-		      _ASM_ALIGN "\n"					\
-		      _ASM_PTR "661b\n"		/* label */		\
-		      _ASM_PTR "663f\n"		/* new instruction */	\
-		      "	 .byte %c0\n"		/* feature bit */	\
-		      "	 .byte 662b-661b\n"	/* sourcelen */		\
-		      "	 .byte 664f-663f\n"	/* replacementlen */	\
-		      ".previous\n"					\
-		      ".section .altinstr_replacement,\"ax\"\n"		\
-		      "663:\n\t" newinstr "\n664:\n"  /* replacement */	\
-		      ".previous" :: "i" (feature), ##input)
+	asm volatile (ALTERNATIVE(oldinstr, newinstr, feature)		\
+		: : "i" (0), ## input)
 
 /* Like alternative_input, but with a single output argument */
 #define alternative_io(oldinstr, newinstr, feature, output, input...)	\
-	asm volatile ("661:\n\t" oldinstr "\n662:\n"			\
-		      ".section .altinstructions,\"a\"\n"		\
-		      _ASM_ALIGN "\n"					\
-		      _ASM_PTR "661b\n"		/* label */		\
-		      _ASM_PTR "663f\n"		/* new instruction */	\
-		      "	 .byte %c[feat]\n"	/* feature bit */	\
-		      "	 .byte 662b-661b\n"	/* sourcelen */		\
-		      "	 .byte 664f-663f\n"	/* replacementlen */	\
-		      ".previous\n"					\
-		      ".section .altinstr_replacement,\"ax\"\n"		\
-		      "663:\n\t" newinstr "\n664:\n"  /* replacement */ \
-		      ".previous" : output : [feat] "i" (feature), ##input)
+	asm volatile (ALTERNATIVE(oldinstr, newinstr, feature)		\
+		: output : "i" (0), ## input)
 
 /*
  * use this macro(s) if you need more than one output parameter

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [GIT PULL] x86: cleanup alternative.h
  2009-04-28 13:05   ` [GIT PULL] " Mathieu Desnoyers
@ 2009-04-28 14:58     ` Ingo Molnar
  2009-04-28 15:13       ` [GIT PULL] x86: cleanup alternative.h (v2) Mathieu Desnoyers
  2009-04-28 17:40       ` [GIT PULL] x86: cleanup alternative.h H. Peter Anvin
  0 siblings, 2 replies; 21+ messages in thread
From: Ingo Molnar @ 2009-04-28 14:58 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: akpm, linux-kernel, H. Peter Anvin, Andi Kleen, Thomas Gleixner


* Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca> wrote:

> Ingo, do you think the cleanup below would be worth merging ? It 
> adds the ability to use ALTERNATIVE() directly in gcc inline 
> assembly _and_ cleans up the current alternative*() cut-and-paste 
> coding style. :)

Sure it's worth it! Your patch removes quite a bit of code:

   1 file changed, 22 insertions(+), 36 deletions(-)

One small style detail:

> +/* alternative assembly primitive */
> +#define ALTERNATIVE(oldinstr, newinstr, feature) \
> +	      "661:\n\t" oldinstr "\n662:\n"			\
> +	      ".section .altinstructions,\"a\"\n"		\
> +	      _ASM_ALIGN "\n"					\
> +	      _ASM_PTR "661b\n"		/* label */		\
> +	      _ASM_PTR "663f\n"		/* new instruction */	\
> +	      "	 .byte " __stringify(feature) "\n"	/* feature bit */ \
> +	      "	 .byte 662b-661b\n"	/* sourcelen */		\
> +	      "	 .byte 664f-663f\n"	/* replacementlen */	\
> +	      ".previous\n"					\
> +	      ".section .altinstr_replacement,\"ax\"\n"		\
> +	      "663:\n\t" newinstr "\n664:\n"  /* replacement */	\
> +	      ".previous"

Please try to align both the continuation backslashes and the 
comments vertically. Something like:

> +#define ALTERNATIVE(oldinstr, newinstr, feature)			 \
> +									 \
> +	"661:\n\t" oldinstr "\n662:\n"					 \
> +	".section .altinstructions,\"a\"\n"			 	 \
> +	_ASM_ALIGN "\n"						 	 \
> +	_ASM_PTR "661b\n"			   /* label           */ \
> +	_ASM_PTR "663f\n"			   /* new instruction */ \
> +	 "	 .byte " __stringify(feature) "\n" /* feature bit     */ \
> +	 "	 .byte 662b-661b\n"		   /* sourcelen       */ \
> +	 "	 .byte 664f-663f\n"		   /* replacementlen  */ \ 

... should do the trick. (also note the extra line after the #define)

	Ingo

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [GIT PULL] x86: cleanup alternative.h (v2)
  2009-04-28 14:58     ` Ingo Molnar
@ 2009-04-28 15:13       ` Mathieu Desnoyers
  2009-04-29  6:17         ` [tip:x86/asm] x86: clean up alternative.h tip-bot for Mathieu Desnoyers
  2009-04-28 17:40       ` [GIT PULL] x86: cleanup alternative.h H. Peter Anvin
  1 sibling, 1 reply; 21+ messages in thread
From: Mathieu Desnoyers @ 2009-04-28 15:13 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: akpm, linux-kernel, H. Peter Anvin, Andi Kleen, Thomas Gleixner

Alternative header duplicates assembly that could be merged in one single macro.
Merging this into this macro also allows to directly declare ALTERNATIVE()
statements within assembly code.

Uses a __stringify() of the feature bits rather than passing a "i" operand.
Leave the old %0 operand as-is (set to 0), unused to stay compatible with API.

(update : tab alignment fixes)

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: 'Ingo Molnar' <mingo@elte.hu>
CC: H. Peter Anvin <hpa@zytor.com>
CC: Andi Kleen <andi@firstfloor.org>
CC: Andrew Morton <akpm@linux-foundation.org>
---
 arch/x86/include/asm/alternative.h |   59 ++++++++++++++-----------------------
 1 file changed, 23 insertions(+), 36 deletions(-)

Index: linux-2.6-lttng/arch/x86/include/asm/alternative.h
===================================================================
--- linux-2.6-lttng.orig/arch/x86/include/asm/alternative.h	2009-04-28 10:45:10.000000000 -0400
+++ linux-2.6-lttng/arch/x86/include/asm/alternative.h	2009-04-28 10:47:50.000000000 -0400
@@ -3,6 +3,7 @@
 
 #include <linux/types.h>
 #include <linux/stddef.h>
+#include <linux/stringify.h>
 #include <asm/asm.h>
 
 /*
@@ -74,6 +75,22 @@ static inline void alternatives_smp_swit
 
 const unsigned char *const *find_nop_table(void);
 
+/* alternative assembly primitive */
+#define ALTERNATIVE(oldinstr, newinstr, feature) 			\
+									\
+      "661:\n\t" oldinstr "\n662:\n"					\
+      ".section .altinstructions,\"a\"\n"				\
+      _ASM_ALIGN "\n"							\
+      _ASM_PTR "661b\n"				/* label */		\
+      _ASM_PTR "663f\n"				/* new instruction */	\
+      "	 .byte " __stringify(feature) "\n"	/* feature bit */	\
+      "	 .byte 662b-661b\n"			/* sourcelen */		\
+      "	 .byte 664f-663f\n"			/* replacementlen */	\
+      ".previous\n"							\
+      ".section .altinstr_replacement,\"ax\"\n"				\
+      "663:\n\t" newinstr "\n664:\n"		/* replacement */	\
+      ".previous"
+
 /*
  * Alternative instructions for different CPU types or capabilities.
  *
@@ -87,18 +104,7 @@ const unsigned char *const *find_nop_tab
  * without volatile and memory clobber.
  */
 #define alternative(oldinstr, newinstr, feature)			\
-	asm volatile ("661:\n\t" oldinstr "\n662:\n"			\
-		      ".section .altinstructions,\"a\"\n"		\
-		      _ASM_ALIGN "\n"					\
-		      _ASM_PTR "661b\n"		/* label */		\
-		      _ASM_PTR "663f\n"		/* new instruction */	\
-		      "	 .byte %c0\n"		/* feature bit */	\
-		      "	 .byte 662b-661b\n"	/* sourcelen */		\
-		      "	 .byte 664f-663f\n"	/* replacementlen */	\
-		      ".previous\n"					\
-		      ".section .altinstr_replacement,\"ax\"\n"		\
-		      "663:\n\t" newinstr "\n664:\n"  /* replacement */	\
-		      ".previous" :: "i" (feature) : "memory")
+	asm volatile (ALTERNATIVE(oldinstr, newinstr, feature) : : : "memory")
 
 /*
  * Alternative inline assembly with input.
@@ -109,35 +115,16 @@ const unsigned char *const *find_nop_tab
  * Best is to use constraints that are fixed size (like (%1) ... "r")
  * If you use variable sized constraints like "m" or "g" in the
  * replacement make sure to pad to the worst case length.
+ * Leaving an unused argument 0 to keep API compatibility.
  */
 #define alternative_input(oldinstr, newinstr, feature, input...)	\
-	asm volatile ("661:\n\t" oldinstr "\n662:\n"			\
-		      ".section .altinstructions,\"a\"\n"		\
-		      _ASM_ALIGN "\n"					\
-		      _ASM_PTR "661b\n"		/* label */		\
-		      _ASM_PTR "663f\n"		/* new instruction */	\
-		      "	 .byte %c0\n"		/* feature bit */	\
-		      "	 .byte 662b-661b\n"	/* sourcelen */		\
-		      "	 .byte 664f-663f\n"	/* replacementlen */	\
-		      ".previous\n"					\
-		      ".section .altinstr_replacement,\"ax\"\n"		\
-		      "663:\n\t" newinstr "\n664:\n"  /* replacement */	\
-		      ".previous" :: "i" (feature), ##input)
+	asm volatile (ALTERNATIVE(oldinstr, newinstr, feature)		\
+		: : "i" (0), ## input)
 
 /* Like alternative_input, but with a single output argument */
 #define alternative_io(oldinstr, newinstr, feature, output, input...)	\
-	asm volatile ("661:\n\t" oldinstr "\n662:\n"			\
-		      ".section .altinstructions,\"a\"\n"		\
-		      _ASM_ALIGN "\n"					\
-		      _ASM_PTR "661b\n"		/* label */		\
-		      _ASM_PTR "663f\n"		/* new instruction */	\
-		      "	 .byte %c[feat]\n"	/* feature bit */	\
-		      "	 .byte 662b-661b\n"	/* sourcelen */		\
-		      "	 .byte 664f-663f\n"	/* replacementlen */	\
-		      ".previous\n"					\
-		      ".section .altinstr_replacement,\"ax\"\n"		\
-		      "663:\n\t" newinstr "\n664:\n"  /* replacement */ \
-		      ".previous" : output : [feat] "i" (feature), ##input)
+	asm volatile (ALTERNATIVE(oldinstr, newinstr, feature)		\
+		: output : "i" (0), ## input)
 
 /*
  * use this macro(s) if you need more than one output parameter

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [GIT PULL] x86: cleanup alternative.h
  2009-04-28 14:58     ` Ingo Molnar
  2009-04-28 15:13       ` [GIT PULL] x86: cleanup alternative.h (v2) Mathieu Desnoyers
@ 2009-04-28 17:40       ` H. Peter Anvin
  2009-04-28 18:11         ` Mathieu Desnoyers
  2009-04-28 18:13         ` [GIT PULL] x86: cleanup alternative.h (v3) Mathieu Desnoyers
  1 sibling, 2 replies; 21+ messages in thread
From: H. Peter Anvin @ 2009-04-28 17:40 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Mathieu Desnoyers, akpm, linux-kernel, Andi Kleen, Thomas Gleixner

Ingo Molnar wrote:
> 
> Please try to align both the continuation backslashes and the 
> comments vertically. Something like:
> 
>> +#define ALTERNATIVE(oldinstr, newinstr, feature)			 \
>> +									 \
>> +	"661:\n\t" oldinstr "\n662:\n"					 \
>> +	".section .altinstructions,\"a\"\n"			 	 \
>> +	_ASM_ALIGN "\n"						 	 \
>> +	_ASM_PTR "661b\n"			   /* label           */ \
>> +	_ASM_PTR "663f\n"			   /* new instruction */ \
>> +	 "	 .byte " __stringify(feature) "\n" /* feature bit     */ \
>> +	 "	 .byte 662b-661b\n"		   /* sourcelen       */ \
>> +	 "	 .byte 664f-663f\n"		   /* replacementlen  */ \ 
> 
> ... should do the trick. (also note the extra line after the #define)
> 

This formatting seems a bit odd; _ASM_ALIGN, _ASM_PTR, and .section are
functionally equivalent to the opcode field, as is .byte, so the shift
in indentation for the .byte lines seems odd at best.

	-hpa

-- 
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel.  I don't speak on their behalf.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [GIT PULL] x86: cleanup alternative.h
  2009-04-28 17:40       ` [GIT PULL] x86: cleanup alternative.h H. Peter Anvin
@ 2009-04-28 18:11         ` Mathieu Desnoyers
  2009-04-28 18:13         ` [GIT PULL] x86: cleanup alternative.h (v3) Mathieu Desnoyers
  1 sibling, 0 replies; 21+ messages in thread
From: Mathieu Desnoyers @ 2009-04-28 18:11 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Ingo Molnar, akpm, linux-kernel, Andi Kleen, Thomas Gleixner

* H. Peter Anvin (hpa@zytor.com) wrote:
> Ingo Molnar wrote:
> > 
> > Please try to align both the continuation backslashes and the 
> > comments vertically. Something like:
> > 
> >> +#define ALTERNATIVE(oldinstr, newinstr, feature)			 \
> >> +									 \
> >> +	"661:\n\t" oldinstr "\n662:\n"					 \
> >> +	".section .altinstructions,\"a\"\n"			 	 \
> >> +	_ASM_ALIGN "\n"						 	 \
> >> +	_ASM_PTR "661b\n"			   /* label           */ \
> >> +	_ASM_PTR "663f\n"			   /* new instruction */ \
> >> +	 "	 .byte " __stringify(feature) "\n" /* feature bit     */ \
> >> +	 "	 .byte 662b-661b\n"		   /* sourcelen       */ \
> >> +	 "	 .byte 664f-663f\n"		   /* replacementlen  */ \ 
> > 
> > ... should do the trick. (also note the extra line after the #define)
> > 
> 
> This formatting seems a bit odd; _ASM_ALIGN, _ASM_PTR, and .section are
> functionally equivalent to the opcode field, as is .byte, so the shift
> in indentation for the .byte lines seems odd at best.
> 

OK, I'll repost with corrected indentation. I tried to follow the
original style as closely as possible, but indeed a cleanup is needed
here.

Mathieu

> 	-hpa
> 
> -- 
> H. Peter Anvin, Intel Open Source Technology Center
> I work for Intel.  I don't speak on their behalf.
> 

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [GIT PULL] x86: cleanup alternative.h (v3)
  2009-04-28 17:40       ` [GIT PULL] x86: cleanup alternative.h H. Peter Anvin
  2009-04-28 18:11         ` Mathieu Desnoyers
@ 2009-04-28 18:13         ` Mathieu Desnoyers
  1 sibling, 0 replies; 21+ messages in thread
From: Mathieu Desnoyers @ 2009-04-28 18:13 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Ingo Molnar, akpm, linux-kernel, Andi Kleen, Thomas Gleixner

Alternative header duplicates assembly that could be merged in one single macro.
Merging this into this macro also allows to directly declare ALTERNATIVE()
statements within assembly code.

Uses a __stringify() of the feature bits rather than passing a "i" operand.
Leave the old %0 operand as-is (set to 0), unused to stay compatible with API.

Update :
align comments, align code.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
CC: 'Ingo Molnar' <mingo@elte.hu>
CC: H. Peter Anvin <hpa@zytor.com>
CC: Andi Kleen <andi@firstfloor.org>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/alternative.h |   59 ++++++++++++++-----------------------
 1 file changed, 23 insertions(+), 36 deletions(-)

Index: linux-2.6-lttng/arch/x86/include/asm/alternative.h
===================================================================
--- linux-2.6-lttng.orig/arch/x86/include/asm/alternative.h	2009-04-28 13:47:03.000000000 -0400
+++ linux-2.6-lttng/arch/x86/include/asm/alternative.h	2009-04-28 13:48:38.000000000 -0400
@@ -3,6 +3,7 @@
 
 #include <linux/types.h>
 #include <linux/stddef.h>
+#include <linux/stringify.h>
 #include <asm/asm.h>
 
 /*
@@ -74,6 +75,22 @@ static inline void alternatives_smp_swit
 
 const unsigned char *const *find_nop_table(void);
 
+/* alternative assembly primitive */
+#define ALTERNATIVE(oldinstr, newinstr, feature) 			\
+									\
+      "661:\n\t" oldinstr "\n662:\n"					\
+      ".section .altinstructions,\"a\"\n"				\
+      _ASM_ALIGN "\n"							\
+      _ASM_PTR "661b\n"				/* label */		\
+      _ASM_PTR "663f\n"				/* new instruction */	\
+      ".byte " __stringify(feature) "\n"	/* feature bit */	\
+      ".byte 662b-661b\n"			/* sourcelen */		\
+      ".byte 664f-663f\n"			/* replacementlen */	\
+      ".previous\n"							\
+      ".section .altinstr_replacement,\"ax\"\n"				\
+      "663:\n\t" newinstr "\n664:\n"		/* replacement */	\
+      ".previous"
+
 /*
  * Alternative instructions for different CPU types or capabilities.
  *
@@ -87,18 +104,7 @@ const unsigned char *const *find_nop_tab
  * without volatile and memory clobber.
  */
 #define alternative(oldinstr, newinstr, feature)			\
-	asm volatile ("661:\n\t" oldinstr "\n662:\n"			\
-		      ".section .altinstructions,\"a\"\n"		\
-		      _ASM_ALIGN "\n"					\
-		      _ASM_PTR "661b\n"		/* label */		\
-		      _ASM_PTR "663f\n"		/* new instruction */	\
-		      "	 .byte %c0\n"		/* feature bit */	\
-		      "	 .byte 662b-661b\n"	/* sourcelen */		\
-		      "	 .byte 664f-663f\n"	/* replacementlen */	\
-		      ".previous\n"					\
-		      ".section .altinstr_replacement,\"ax\"\n"		\
-		      "663:\n\t" newinstr "\n664:\n"  /* replacement */	\
-		      ".previous" :: "i" (feature) : "memory")
+	asm volatile (ALTERNATIVE(oldinstr, newinstr, feature) : : : "memory")
 
 /*
  * Alternative inline assembly with input.
@@ -109,35 +115,16 @@ const unsigned char *const *find_nop_tab
  * Best is to use constraints that are fixed size (like (%1) ... "r")
  * If you use variable sized constraints like "m" or "g" in the
  * replacement make sure to pad to the worst case length.
+ * Leaving an unused argument 0 to keep API compatibility.
  */
 #define alternative_input(oldinstr, newinstr, feature, input...)	\
-	asm volatile ("661:\n\t" oldinstr "\n662:\n"			\
-		      ".section .altinstructions,\"a\"\n"		\
-		      _ASM_ALIGN "\n"					\
-		      _ASM_PTR "661b\n"		/* label */		\
-		      _ASM_PTR "663f\n"		/* new instruction */	\
-		      "	 .byte %c0\n"		/* feature bit */	\
-		      "	 .byte 662b-661b\n"	/* sourcelen */		\
-		      "	 .byte 664f-663f\n"	/* replacementlen */	\
-		      ".previous\n"					\
-		      ".section .altinstr_replacement,\"ax\"\n"		\
-		      "663:\n\t" newinstr "\n664:\n"  /* replacement */	\
-		      ".previous" :: "i" (feature), ##input)
+	asm volatile (ALTERNATIVE(oldinstr, newinstr, feature)		\
+		: : "i" (0), ## input)
 
 /* Like alternative_input, but with a single output argument */
 #define alternative_io(oldinstr, newinstr, feature, output, input...)	\
-	asm volatile ("661:\n\t" oldinstr "\n662:\n"			\
-		      ".section .altinstructions,\"a\"\n"		\
-		      _ASM_ALIGN "\n"					\
-		      _ASM_PTR "661b\n"		/* label */		\
-		      _ASM_PTR "663f\n"		/* new instruction */	\
-		      "	 .byte %c[feat]\n"	/* feature bit */	\
-		      "	 .byte 662b-661b\n"	/* sourcelen */		\
-		      "	 .byte 664f-663f\n"	/* replacementlen */	\
-		      ".previous\n"					\
-		      ".section .altinstr_replacement,\"ax\"\n"		\
-		      "663:\n\t" newinstr "\n664:\n"  /* replacement */ \
-		      ".previous" : output : [feat] "i" (feature), ##input)
+	asm volatile (ALTERNATIVE(oldinstr, newinstr, feature)		\
+		: output : "i" (0), ## input)
 
 /*
  * use this macro(s) if you need more than one output parameter

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [tip:x86/asm] x86: clean up alternative.h
  2009-04-28 15:13       ` [GIT PULL] x86: cleanup alternative.h (v2) Mathieu Desnoyers
@ 2009-04-29  6:17         ` tip-bot for Mathieu Desnoyers
  2009-04-29  6:22           ` Mathieu Desnoyers
  0 siblings, 1 reply; 21+ messages in thread
From: tip-bot for Mathieu Desnoyers @ 2009-04-29  6:17 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, mathieu.desnoyers, hpa, mingo, tglx, mingo

Commit-ID:  edc953fa4ebc0265ef3b1754fe116a9fd4264e15
Gitweb:     http://git.kernel.org/tip/edc953fa4ebc0265ef3b1754fe116a9fd4264e15
Author:     Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
AuthorDate: Tue, 28 Apr 2009 11:13:46 -0400
Committer:  Ingo Molnar <mingo@elte.hu>
CommitDate: Tue, 28 Apr 2009 17:42:08 +0200

x86: clean up alternative.h

Alternative header duplicates assembly that could be merged in
one single macro.  Merging this into this macro also allows to
directly declare ALTERNATIVE() statements within assembly code.

Uses a __stringify() of the feature bits rather than passing a
"i" operand.  Leave the old %0 operand as-is (set to 0), unused
to stay compatible with API.

(v2: tab alignment fixes)

[ Impact: cleanup ]

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
LKML-Reference: <20090428151346.GA31212@Krystal>
Signed-off-by: Ingo Molnar <mingo@elte.hu>


---
 arch/x86/include/asm/alternative.h |   59 ++++++++++++++----------------------
 1 files changed, 23 insertions(+), 36 deletions(-)

diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
index f6aa18e..1a37bcd 100644
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -3,6 +3,7 @@
 
 #include <linux/types.h>
 #include <linux/stddef.h>
+#include <linux/stringify.h>
 #include <asm/asm.h>
 
 /*
@@ -74,6 +75,22 @@ static inline void alternatives_smp_switch(int smp) {}
 
 const unsigned char *const *find_nop_table(void);
 
+/* alternative assembly primitive: */
+#define ALTERNATIVE(oldinstr, newinstr, feature)			\
+									\
+      "661:\n\t" oldinstr "\n662:\n"					\
+      ".section .altinstructions,\"a\"\n"				\
+      _ASM_ALIGN "\n"							\
+      _ASM_PTR "661b\n"				/* label           */	\
+      _ASM_PTR "663f\n"				/* new instruction */	\
+      "	 .byte " __stringify(feature) "\n"	/* feature bit     */	\
+      "	 .byte 662b-661b\n"			/* sourcelen       */	\
+      "	 .byte 664f-663f\n"			/* replacementlen  */	\
+      ".previous\n"							\
+      ".section .altinstr_replacement, \"ax\"\n"			\
+      "663:\n\t" newinstr "\n664:\n"		/* replacement     */	\
+      ".previous"
+
 /*
  * Alternative instructions for different CPU types or capabilities.
  *
@@ -87,18 +104,7 @@ const unsigned char *const *find_nop_table(void);
  * without volatile and memory clobber.
  */
 #define alternative(oldinstr, newinstr, feature)			\
-	asm volatile ("661:\n\t" oldinstr "\n662:\n"			\
-		      ".section .altinstructions,\"a\"\n"		\
-		      _ASM_ALIGN "\n"					\
-		      _ASM_PTR "661b\n"		/* label */		\
-		      _ASM_PTR "663f\n"		/* new instruction */	\
-		      "	 .byte %c0\n"		/* feature bit */	\
-		      "	 .byte 662b-661b\n"	/* sourcelen */		\
-		      "	 .byte 664f-663f\n"	/* replacementlen */	\
-		      ".previous\n"					\
-		      ".section .altinstr_replacement,\"ax\"\n"		\
-		      "663:\n\t" newinstr "\n664:\n"  /* replacement */	\
-		      ".previous" :: "i" (feature) : "memory")
+	asm volatile (ALTERNATIVE(oldinstr, newinstr, feature) : : : "memory")
 
 /*
  * Alternative inline assembly with input.
@@ -109,35 +115,16 @@ const unsigned char *const *find_nop_table(void);
  * Best is to use constraints that are fixed size (like (%1) ... "r")
  * If you use variable sized constraints like "m" or "g" in the
  * replacement make sure to pad to the worst case length.
+ * Leaving an unused argument 0 to keep API compatibility.
  */
 #define alternative_input(oldinstr, newinstr, feature, input...)	\
-	asm volatile ("661:\n\t" oldinstr "\n662:\n"			\
-		      ".section .altinstructions,\"a\"\n"		\
-		      _ASM_ALIGN "\n"					\
-		      _ASM_PTR "661b\n"		/* label */		\
-		      _ASM_PTR "663f\n"		/* new instruction */	\
-		      "	 .byte %c0\n"		/* feature bit */	\
-		      "	 .byte 662b-661b\n"	/* sourcelen */		\
-		      "	 .byte 664f-663f\n"	/* replacementlen */	\
-		      ".previous\n"					\
-		      ".section .altinstr_replacement,\"ax\"\n"		\
-		      "663:\n\t" newinstr "\n664:\n"  /* replacement */	\
-		      ".previous" :: "i" (feature), ##input)
+	asm volatile (ALTERNATIVE(oldinstr, newinstr, feature)		\
+		: : "i" (0), ## input)
 
 /* Like alternative_input, but with a single output argument */
 #define alternative_io(oldinstr, newinstr, feature, output, input...)	\
-	asm volatile ("661:\n\t" oldinstr "\n662:\n"			\
-		      ".section .altinstructions,\"a\"\n"		\
-		      _ASM_ALIGN "\n"					\
-		      _ASM_PTR "661b\n"		/* label */		\
-		      _ASM_PTR "663f\n"		/* new instruction */	\
-		      "	 .byte %c[feat]\n"	/* feature bit */	\
-		      "	 .byte 662b-661b\n"	/* sourcelen */		\
-		      "	 .byte 664f-663f\n"	/* replacementlen */	\
-		      ".previous\n"					\
-		      ".section .altinstr_replacement,\"ax\"\n"		\
-		      "663:\n\t" newinstr "\n664:\n"  /* replacement */ \
-		      ".previous" : output : [feat] "i" (feature), ##input)
+	asm volatile (ALTERNATIVE(oldinstr, newinstr, feature)		\
+		: output : "i" (0), ## input)
 
 /*
  * use this macro(s) if you need more than one output parameter

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [tip:x86/asm] x86: clean up alternative.h
  2009-04-29  6:17         ` [tip:x86/asm] x86: clean up alternative.h tip-bot for Mathieu Desnoyers
@ 2009-04-29  6:22           ` Mathieu Desnoyers
  0 siblings, 0 replies; 21+ messages in thread
From: Mathieu Desnoyers @ 2009-04-29  6:22 UTC (permalink / raw)
  To: mingo; +Cc: linux-kernel, hpa, tglx, mingo, linux-tip-commits

* tip-bot for Mathieu Desnoyers (mathieu.desnoyers@polymtl.ca) wrote:
> Commit-ID:  edc953fa4ebc0265ef3b1754fe116a9fd4264e15
> Gitweb:     http://git.kernel.org/tip/edc953fa4ebc0265ef3b1754fe116a9fd4264e15
> Author:     Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
> AuthorDate: Tue, 28 Apr 2009 11:13:46 -0400
> Committer:  Ingo Molnar <mingo@elte.hu>
> CommitDate: Tue, 28 Apr 2009 17:42:08 +0200
> 
> x86: clean up alternative.h
> 
> Alternative header duplicates assembly that could be merged in
> one single macro.  Merging this into this macro also allows to
> directly declare ALTERNATIVE() statements within assembly code.
> 
> Uses a __stringify() of the feature bits rather than passing a
> "i" operand.  Leave the old %0 operand as-is (set to 0), unused
> to stay compatible with API.
> 
> (v2: tab alignment fixes)
> 
> [ Impact: cleanup ]
> 
> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
> LKML-Reference: <20090428151346.GA31212@Krystal>
> Signed-off-by: Ingo Molnar <mingo@elte.hu>
> 
> 
> ---
>  arch/x86/include/asm/alternative.h |   59 ++++++++++++++----------------------
>  1 files changed, 23 insertions(+), 36 deletions(-)
> 
> diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
> index f6aa18e..1a37bcd 100644
> --- a/arch/x86/include/asm/alternative.h
> +++ b/arch/x86/include/asm/alternative.h
> @@ -3,6 +3,7 @@
>  
>  #include <linux/types.h>
>  #include <linux/stddef.h>
> +#include <linux/stringify.h>
>  #include <asm/asm.h>
>  
>  /*
> @@ -74,6 +75,22 @@ static inline void alternatives_smp_switch(int smp) {}
>  
>  const unsigned char *const *find_nop_table(void);
>  
> +/* alternative assembly primitive: */
> +#define ALTERNATIVE(oldinstr, newinstr, feature)			\
> +									\
> +      "661:\n\t" oldinstr "\n662:\n"					\
> +      ".section .altinstructions,\"a\"\n"				\
> +      _ASM_ALIGN "\n"							\
> +      _ASM_PTR "661b\n"				/* label           */	\
> +      _ASM_PTR "663f\n"				/* new instruction */	\
> +      "	 .byte " __stringify(feature) "\n"	/* feature bit     */	\
> +      "	 .byte 662b-661b\n"			/* sourcelen       */	\
> +      "	 .byte 664f-663f\n"			/* replacementlen  */	\

I posted an updated v3 this morning which removes the "tab" before each
of these .byte, as requested by Peter.

Mathieu

> +      ".previous\n"							\
> +      ".section .altinstr_replacement, \"ax\"\n"			\
> +      "663:\n\t" newinstr "\n664:\n"		/* replacement     */	\
> +      ".previous"
> +
>  /*
>   * Alternative instructions for different CPU types or capabilities.
>   *
> @@ -87,18 +104,7 @@ const unsigned char *const *find_nop_table(void);
>   * without volatile and memory clobber.
>   */
>  #define alternative(oldinstr, newinstr, feature)			\
> -	asm volatile ("661:\n\t" oldinstr "\n662:\n"			\
> -		      ".section .altinstructions,\"a\"\n"		\
> -		      _ASM_ALIGN "\n"					\
> -		      _ASM_PTR "661b\n"		/* label */		\
> -		      _ASM_PTR "663f\n"		/* new instruction */	\
> -		      "	 .byte %c0\n"		/* feature bit */	\
> -		      "	 .byte 662b-661b\n"	/* sourcelen */		\
> -		      "	 .byte 664f-663f\n"	/* replacementlen */	\
> -		      ".previous\n"					\
> -		      ".section .altinstr_replacement,\"ax\"\n"		\
> -		      "663:\n\t" newinstr "\n664:\n"  /* replacement */	\
> -		      ".previous" :: "i" (feature) : "memory")
> +	asm volatile (ALTERNATIVE(oldinstr, newinstr, feature) : : : "memory")
>  
>  /*
>   * Alternative inline assembly with input.
> @@ -109,35 +115,16 @@ const unsigned char *const *find_nop_table(void);
>   * Best is to use constraints that are fixed size (like (%1) ... "r")
>   * If you use variable sized constraints like "m" or "g" in the
>   * replacement make sure to pad to the worst case length.
> + * Leaving an unused argument 0 to keep API compatibility.
>   */
>  #define alternative_input(oldinstr, newinstr, feature, input...)	\
> -	asm volatile ("661:\n\t" oldinstr "\n662:\n"			\
> -		      ".section .altinstructions,\"a\"\n"		\
> -		      _ASM_ALIGN "\n"					\
> -		      _ASM_PTR "661b\n"		/* label */		\
> -		      _ASM_PTR "663f\n"		/* new instruction */	\
> -		      "	 .byte %c0\n"		/* feature bit */	\
> -		      "	 .byte 662b-661b\n"	/* sourcelen */		\
> -		      "	 .byte 664f-663f\n"	/* replacementlen */	\
> -		      ".previous\n"					\
> -		      ".section .altinstr_replacement,\"ax\"\n"		\
> -		      "663:\n\t" newinstr "\n664:\n"  /* replacement */	\
> -		      ".previous" :: "i" (feature), ##input)
> +	asm volatile (ALTERNATIVE(oldinstr, newinstr, feature)		\
> +		: : "i" (0), ## input)
>  
>  /* Like alternative_input, but with a single output argument */
>  #define alternative_io(oldinstr, newinstr, feature, output, input...)	\
> -	asm volatile ("661:\n\t" oldinstr "\n662:\n"			\
> -		      ".section .altinstructions,\"a\"\n"		\
> -		      _ASM_ALIGN "\n"					\
> -		      _ASM_PTR "661b\n"		/* label */		\
> -		      _ASM_PTR "663f\n"		/* new instruction */	\
> -		      "	 .byte %c[feat]\n"	/* feature bit */	\
> -		      "	 .byte 662b-661b\n"	/* sourcelen */		\
> -		      "	 .byte 664f-663f\n"	/* replacementlen */	\
> -		      ".previous\n"					\
> -		      ".section .altinstr_replacement,\"ax\"\n"		\
> -		      "663:\n\t" newinstr "\n664:\n"  /* replacement */ \
> -		      ".previous" : output : [feat] "i" (feature), ##input)
> +	asm volatile (ALTERNATIVE(oldinstr, newinstr, feature)		\
> +		: output : "i" (0), ## input)
>  
>  /*
>   * use this macro(s) if you need more than one output parameter

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [patch 2/2] x86 amd fix cmpxchg read acquire barrier
  2009-04-25  8:19   ` Pavel Machek
@ 2009-05-02 15:55     ` Mathieu Desnoyers
  0 siblings, 0 replies; 21+ messages in thread
From: Mathieu Desnoyers @ 2009-05-02 15:55 UTC (permalink / raw)
  To: Pavel Machek
  Cc: akpm, Ingo Molnar, linux-kernel, mark.langsdorf, arekm,
	H. Peter Anvin, Andi Kleen, Avi Kivity

* Pavel Machek (pavel@ucw.cz) wrote:
> On Wed 2009-04-22 16:18:54, Mathieu Desnoyers wrote:
> > http://google-perftools.googlecode.com/svn-history/r48/trunk/src/base/atomicops-internals-x86.cc
> > 
> > says
> > 
> > "  // Opteron Rev E has a bug in which on very rare occasions a locked
> >   // instruction doesn't act as a read-acquire barrier if followed by a
> >   // non-locked read-modify-write instruction.  Rev F has this bug in 
> >   // pre-release versions, but not in versions released to customers,
> >   // so we test only for Rev E, which is family 15, model 32..63 inclusive.
> 
> Could we be more careful  here and catch the F pre-release versions,
> too? Stepping should help there...
> 

Yes, I guess for F pre-releases we're stucked unless AMD provides more
information. Dtrace did not seem to bother about F pre-release versions
though, neither did Google.

Mathieu

> -- 
> (english) http://www.livejournal.com/~pavelmachek
> (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

-- 
Mathieu Desnoyers
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F  BA06 3F25 A8FE 3BAE 9A68

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2009-05-02 15:55 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-04-22 20:18 [patch 0/2] Fixing AMD cmpxchg "missing lfence" mess Mathieu Desnoyers
2009-04-22 20:18 ` [patch 1/2] x86: cleanup alternative.h Mathieu Desnoyers
2009-04-28 13:05   ` [GIT PULL] " Mathieu Desnoyers
2009-04-28 14:58     ` Ingo Molnar
2009-04-28 15:13       ` [GIT PULL] x86: cleanup alternative.h (v2) Mathieu Desnoyers
2009-04-29  6:17         ` [tip:x86/asm] x86: clean up alternative.h tip-bot for Mathieu Desnoyers
2009-04-29  6:22           ` Mathieu Desnoyers
2009-04-28 17:40       ` [GIT PULL] x86: cleanup alternative.h H. Peter Anvin
2009-04-28 18:11         ` Mathieu Desnoyers
2009-04-28 18:13         ` [GIT PULL] x86: cleanup alternative.h (v3) Mathieu Desnoyers
2009-04-22 20:18 ` [patch 2/2] x86 amd fix cmpxchg read acquire barrier Mathieu Desnoyers
2009-04-22 20:59   ` Alan Cox
2009-04-22 22:47     ` Mathieu Desnoyers
2009-04-22 22:53       ` Alan Cox
2009-04-22 23:26         ` Mathieu Desnoyers
2009-04-23  8:06   ` Ingo Molnar
2009-04-23 13:19     ` Mathieu Desnoyers
2009-04-23 13:41       ` Arkadiusz Miskiewicz
2009-04-23 22:17         ` Mathieu Desnoyers
2009-04-25  8:19   ` Pavel Machek
2009-05-02 15:55     ` Mathieu Desnoyers

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.