Re: [PATCH] optimize ia32 memmove

* Re: [PATCH] optimize ia32 memmove
       [not found] <200312300713.hBU7DGC4024213@hera.kernel.org>
@ 2003-12-30  7:32 ` Jeff Garzik
  2003-12-30  7:51   ` Andrew Morton
  2003-12-30 10:21   ` Ed Sweetman
  0 siblings, 2 replies; 12+ messages in thread
From: Jeff Garzik @ 2003-12-30  7:32 UTC (permalink / raw)
  To: manfred, akpm; +Cc: Linux Kernel Mailing List

Linux Kernel Mailing List wrote:
> ChangeSet 1.1496.22.32, 2003/12/29 21:45:30-08:00, akpm@osdl.org
> 
> 	[PATCH] optimize ia32 memmove
> 	
> 	From: Manfred Spraul <manfred@colorfullife.com>
> 	
> 	The memmove implementation of i386 is not optimized: it uses movsb, which is
> 	far slower than movsd.  The optimization is trivial: if dest is less than
> 	source, then call memcpy().  markw tried it on a 4xXeon with dbt2, it saved
> 	around 300 million cpu ticks in cache_flusharray():
[...]
> diff -Nru a/include/asm-i386/string.h b/include/asm-i386/string.h
> --- a/include/asm-i386/string.h	Mon Dec 29 23:13:20 2003
> +++ b/include/asm-i386/string.h	Mon Dec 29 23:13:20 2003
> @@ -299,14 +299,9 @@
>  static inline void * memmove(void * dest,const void * src, size_t n)
>  {
>  int d0, d1, d2;
> -if (dest<src)
> -__asm__ __volatile__(
> -	"rep\n\t"
> -	"movsb"
> -	: "=&c" (d0), "=&S" (d1), "=&D" (d2)
> -	:"0" (n),"1" (src),"2" (dest)
> -	: "memory");
> -else
> +if (dest<src) {
> +	memcpy(dest,src,n);
> +} else
>  __asm__ __volatile__(
>  	"std\n\t"
>  	"rep\n\t"

Dumb question, though...   what about the overlap case, when dest<src ? 
  It seems to me this change is ignoring that.

	Jeff

^ permalink raw reply	[flat|nested] 12+ messages in thread