Fix prefetch patching in 2.5-bk

Message ID 20030501001511.GA2890@averell
State New, archived

Commit Message

Andi Kleen May 1, 2003, 12:15 a.m. UTC
Brown paper bag time. I forgot to take the modrm byte into account
with the prefetch patch replacement.  With gcc 3.2 it worked because
it used the right registers in my configuration.

But gcc 2.96 uses a different register in __dpath, so the prefetch becomes
4 bytes with modrm, and the original nop needs to be as long as that too.
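
To illustrate why the register choice changes the length, here is a minimal
user-space sketch (not kernel code; the enum and the function name are made
up for this example) that encodes "prefetchnta (%reg)" for 32-bit mode.
Most base registers need only the modrm byte, but %esp needs an extra SIB
byte and %ebp needs an extra zero displacement byte, so those two forms are
4 bytes instead of 3. Which register gcc 2.96 actually picked is not stated
above.

```c
#include <stddef.h>
#include <stdint.h>

/* Register numbers as they appear in the modrm r/m field. */
enum reg32 { EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI };

/*
 * Encode "prefetchnta (%base)" (opcode 0F 18 /0) into buf and
 * return the instruction length in bytes.
 */
static size_t encode_prefetchnta(enum reg32 base, uint8_t buf[4])
{
	size_t n = 0;

	buf[n++] = 0x0f;
	buf[n++] = 0x18;
	if (base == ESP) {
		buf[n++] = 0x04;	/* mod=00, rm=100: SIB byte follows */
		buf[n++] = 0x24;	/* SIB: no index, base=%esp */
	} else if (base == EBP) {
		buf[n++] = 0x45;	/* mod=01, rm=101: disp8 follows */
		buf[n++] = 0x00;	/* disp8 = 0 */
	} else {
		buf[n++] = (uint8_t)base;	/* mod=00, rm=base */
	}
	return n;
}
```

With this sketch, encode_prefetchnta(EAX, buf) yields 3 bytes and fits over
ASM_NOP3, while the %esp and %ebp forms yield 4 bytes and need ASM_NOP4.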

If your machine BUG()s in apply_alternatives at boot
or module load, you need this patch.

Linus please apply.

-Andi

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Comments

Linus Torvalds May 1, 2003, 1:21 a.m. UTC | #1
On Thu, 1 May 2003, Andi Kleen wrote:
> 
> If your machine BUG()s in apply_alternatives at booting 
> or module loading you need this patch.

I applied it, but I don't have to like it..

How about doing this differently, and having something like this:

	#define nop_alternative(newinstr, feature)		\
		".section .altinstructions,\"a\"\n"		\
		"  .align 4\n"					\
		"    .long 660f\n"				\
		"    .long 663f\n"				\
		"    .byte %c0\n"				\
		"    .byte 0\n"					\
		"    .byte 664f-663f\n"				\
		".previous\n"					\
		".section .altinstr_replacement,\"ax\"\n"	\
		"663:\n\t" newinstr "\n664:\n"			\
		".previous\n"					\
		"660:\n\t"					\
		".rept 664b-663b\n\t"				\
		".byte 0x90\n\t"				\
		".endr\n"

and making "sourcelen==0" a special case for replacement (replace with the 
proper destination length nop, instead of having that "0x90 0x90 0x90" 
sequence).

This allows you to use arbitrary-sized things without having to worry 
about having to have the size right, or without having to use 
unnecessarily long nop-sequences. You'll always get the right-size nop.
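
One way to read the "sourcelen==0" special case is sketched below in plain
user-space C. This is only an illustration under assumptions: the struct
layout mirrors the record emitted by the macro above, but the names
alt_entry, fill_nops and apply_nop_alternative are hypothetical, and the
nop table uses the generic i386 1- to 4-byte nop encodings.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of one .altinstructions record. */
struct alt_entry {
	uint8_t *site;			/* patch site (label 660) */
	const uint8_t *replacement;	/* replacement bytes (label 663) */
	uint8_t feature;		/* CPU feature bit */
	uint8_t sourcelen;		/* 0 marks a nop-only site */
	uint8_t replacementlen;		/* 664 - 663 */
};

/* Generic i386 nops, indexed by length - 1. */
static const uint8_t nops[4][4] = {
	{ 0x90 },			/* nop */
	{ 0x66, 0x90 },			/* xchg %ax,%ax */
	{ 0x8d, 0x76, 0x00 },		/* lea 0x0(%esi),%esi */
	{ 0x8d, 0x74, 0x26, 0x00 },	/* lea 0x0(%esi,1),%esi */
};

/* Cover len bytes with as few nop instructions as possible. */
static void fill_nops(uint8_t *dst, size_t len)
{
	while (len > 0) {
		size_t n = len > 4 ? 4 : len;

		memcpy(dst, nops[n - 1], n);
		dst += n;
		len -= n;
	}
}

/*
 * The site was assembled as replacementlen single-byte nops, so the
 * replacement always fits. Without the feature, a nop-only site gets
 * one nop of the proper length instead of the 0x90 run.
 */
static void apply_nop_alternative(const struct alt_entry *a, int cpu_has_feature)
{
	if (cpu_has_feature)
		memcpy(a->site, a->replacement, a->replacementlen);
	else if (a->sourcelen == 0)
		fill_nops(a->site, a->replacementlen);
}
```

The point of the design is that the length lives in one place, the
replacement itself, so a 3-byte or 4-byte prefetch needs no matching
ASM_NOP3 or ASM_NOP4 at the call site.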

		Linus

Andi Kleen May 1, 2003, 1:51 a.m. UTC | #2
On Thu, May 01, 2003 at 03:21:52AM +0200, Linus Torvalds wrote:
> and making "sourcelen==0" a special case for replacement (replace with the 
> proper destination length nop, instead of having that "0x90 0x90 0x90" 
> sequence).

Not sure what you mean with 0x90 sequence.

My original implementation used .rept to generate the correct number of
(single byte) nops based on the label length of the other case.
But it didn't work because I ran into at least one weird assembler bug
(it internally got confused on something and gave an impossible error
message about a missing label). Also it only generated single byte nops.

In any case you need to pad the code to the correct number of bytes,
I'm not sure how it can be done otherwise.
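
The padding constraint can be sketched in user-space C as follows. This is
an illustration only, assuming the patch site and the replacement are plain
byte buffers; patch_site is a hypothetical name, and the -1 return stands in
for the place where the kernel would BUG().

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy the replacement over the original instruction. A replacement
 * longer than the original cannot fit; a shorter one is padded with
 * single-byte nops (0x90) so the following code stays aligned.
 */
static int patch_site(uint8_t *site, size_t instrlen,
		      const uint8_t *repl, size_t replacementlen)
{
	if (replacementlen > instrlen)
		return -1;	/* would overwrite the next instruction */

	memcpy(site, repl, replacementlen);
	memset(site + replacementlen, 0x90, instrlen - replacementlen);
	return 0;
}
```

A 4-byte prefetch over a 3-byte ASM_NOP3 site fails the check, which is the
case the patch below fixes by reserving 4 bytes.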

-Andi

Patch

Index: linux/include/asm-i386/processor.h
===================================================================
RCS file: /home/cvs/linux-2.5/include/asm-i386/processor.h,v
retrieving revision 1.48
diff -u -u -r1.48 processor.h
--- linux/include/asm-i386/processor.h	30 Apr 2003 14:32:05 -0000	1.48
+++ linux/include/asm-i386/processor.h	30 Apr 2003 22:48:26 -0000
@@ -564,7 +564,7 @@ 
 #define ARCH_HAS_PREFETCH
 extern inline void prefetch(const void *x)
 {
-	alternative_input(ASM_NOP3,
+	alternative_input(ASM_NOP4,
 			  "prefetchnta (%1)",
 			  X86_FEATURE_XMM,
 			  "r" (x));
@@ -578,7 +578,7 @@ 
    spinlocks to avoid one state transition in the cache coherency protocol. */
 extern inline void prefetchw(const void *x)
 {
-	alternative_input(ASM_NOP3,
+	alternative_input(ASM_NOP4,
 			  "prefetchw (%1)",
 			  X86_FEATURE_3DNOW,
 			  "r" (x));