* [PATCH] x86: Use __builtin_memset and __builtin_memcpy for memset/memcpy
@ 2009-09-28  9:34 Arjan van de Ven
From: Arjan van de Ven @ 2009-09-28  9:34 UTC
  To: linux-kernel; +Cc: mingo, tglx, hpa, torvalds


From ebb81aab0c3df19771ebc0eec1261ae314ddc0af Mon Sep 17 00:00:00 2001
From: Arjan van de Ven <arjan@linux.intel.com>
Date: Mon, 28 Sep 2009 11:21:32 +0200
Subject: [PATCH] x86: Use __builtin_memset and __builtin_memcpy for memset/memcpy

GCC provides reasonable memset/memcpy functions itself, with __builtin_memset
and __builtin_memcpy. For the "unknown" cases, it'll fall back to our
current existing functions, but for fixed size versions it'll inline
something smart. Quite often that will be the same as we have now,
but sometimes it can do something smarter (for example, if the code
then sets the first member of a struct, it can do a shorter memset).
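
As a rough illustration of the fixed-size and "unknown" cases described
above, here is a minimal userspace sketch. The struct and function names
are hypothetical and not taken from the kernel tree, and what the
compiler actually emits depends on the GCC version and optimization
level.

```c
#include <stddef.h>

/* Hypothetical struct, for illustration only. */
struct request {
	unsigned long flags;
	unsigned long id;
	char name[16];
};

/*
 * Fixed-size case: sizeof(*r) is a compile-time constant, so the
 * compiler is free to expand the builtin inline. Because it also sees
 * the store to r->flags right after, it may skip zeroing that member.
 */
static void request_init(struct request *r, unsigned long flags)
{
	__builtin_memset(r, 0, sizeof(*r));
	r->flags = flags;
}

/*
 * "Unknown" case: n is only known at run time, so the compiler will
 * normally emit a call to the out-of-line memset().
 */
static void clear_buf(void *p, size_t n)
{
	__builtin_memset(p, 0, n);
}
```

Comparing the output of `gcc -O2 -S` for the two functions is one way to
see whether the builtin was expanded inline or turned into a call.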

In addition, and this is more important, gcc knows which registers and
such are not clobbered (while for our asm version it pretty much
acts like a compiler barrier), so for various cases it can avoid reloading
values.
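
The barrier effect can be sketched as follows. copy_opaque() is a
hypothetical stand-in for an asm-based copy routine: the empty asm
statement with a "memory" clobber is an assumption used here to mimic
how such a routine looks to the optimizer, not the kernel's actual
implementation. Whether the reload is really avoided in the builtin
version depends on the compiler and flags.

```c
#include <stddef.h>

struct stats {
	int hits;
	int misses;
};

/*
 * Hypothetical stand-in for an asm copy. The "memory" clobber tells the
 * compiler that any memory may have changed, so values cached in
 * registers from memory have to be reloaded afterwards.
 */
static inline void *copy_opaque(void *to, const void *from, size_t n)
{
	unsigned char *d = to;
	const unsigned char *s = from;

	while (n--)
		*d++ = *s++;
	__asm__ __volatile__("" : : : "memory");
	return to;
}

/* After the barrier, *limit has to be read from memory a second time. */
static int sum_opaque(const int *restrict limit, struct stats *restrict dst,
		      const struct stats *restrict src)
{
	int before = *limit;

	copy_opaque(dst, src, sizeof(*dst));
	return before + *limit;
}

/*
 * With the builtin, the compiler knows only *dst is written. Since the
 * pointers are restrict-qualified, it is allowed to reuse the value of
 * *limit it already loaded.
 */
static int sum_builtin(const int *restrict limit, struct stats *restrict dst,
		       const struct stats *restrict src)
{
	int before = *limit;

	__builtin_memcpy(dst, src, sizeof(*dst));
	return before + *limit;
}
```

Both functions return the same result; the difference, if any, only
shows up in the generated code.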

The effect on codesize is shown below on my typical laptop .config:

   text	   data	    bss	    dec	    hex	filename
5605675	2041100	6525148	14171923	 d83f13	vmlinux.before
5595849	2041668	6525148	14162665	 d81ae9	vmlinux.after

Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
---
 arch/x86/include/asm/string_32.h |   11 ++---------
 1 files changed, 2 insertions(+), 9 deletions(-)

diff --git a/arch/x86/include/asm/string_32.h b/arch/x86/include/asm/string_32.h
index ae907e6..23a0bc2 100644
--- a/arch/x86/include/asm/string_32.h
+++ b/arch/x86/include/asm/string_32.h
@@ -177,10 +177,7 @@ static inline void *__memcpy3d(void *to, const void *from, size_t len)
  */
 
 #ifndef CONFIG_KMEMCHECK
-#define memcpy(t, f, n)				\
-	(__builtin_constant_p((n))		\
-	 ? __constant_memcpy((t), (f), (n))	\
-	 : __memcpy((t), (f), (n)))
+#define memcpy(t, f, n)	__builtin_memcpy(t, f, n)
 #else
 /*
  * kmemcheck becomes very happy if we use the REP instructions unconditionally,
@@ -316,11 +313,7 @@ void *__constant_c_and_count_memset(void *s, unsigned long pattern,
 	 : __memset_generic((s), (c), (count)))
 
 #define __HAVE_ARCH_MEMSET
-#define memset(s, c, count)						\
-	(__builtin_constant_p(c)					\
-	 ? __constant_c_x_memset((s), (0x01010101UL * (unsigned char)(c)), \
-				 (count))				\
-	 : __memset((s), (c), (count)))
+#define memset(s, c, count) __builtin_memset(s, c, count)
 
 /*
  * find the first occurrence of byte 'c', or 1 past the area if none
-- 
1.6.2.5


-- 
Arjan van de Ven 	Intel Open Source Technology Centre
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org


end of thread, last message 2009-10-02 20:12 UTC

Thread overview: 9+ messages
2009-09-28  9:34 [PATCH] x86: Use __builtin_memset and __builtin_memcpy for memset/memcpy Arjan van de Ven
2009-09-28 12:21 ` Arjan van de Ven
2009-09-28 23:47   ` [tip:x86/asm] " tip-bot for Arjan van de Ven
2009-09-29 12:44 ` [PATCH] " Arnd Bergmann
2009-09-29 12:53   ` Arjan van de Ven
2009-09-29 15:32   ` H. Peter Anvin
2009-10-02 19:19 ` Andi Kleen
2009-10-02 20:04   ` H. Peter Anvin
2009-10-02 20:12     ` Andi Kleen
