linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 2/4] perf/x86-64: also allow measuring alternative memcpy implementations
@ 2012-01-18 13:28 Jan Beulich
  2012-01-26 13:32 ` [tip:perf/core] perf bench: Also " tip-bot for Jan Beulich
  0 siblings, 1 reply; 2+ messages in thread
From: Jan Beulich @ 2012-01-18 13:28 UTC (permalink / raw)
  To: a.p.zijlstra, mingo, acme, paulus; +Cc: linux-kernel

Intended to be able to support the current selection of the preferred
memcpy() implementation, this patch adds the ability to also measure
the two alternative implementations, again by way of using some
pre-processsor replacement.

While on my Westmere system this proves that the movsb based variant is
worse than the movsq based one (since the ERMS feature isn't there), it
also shows that here for the default as well as small sizes the
unrolled variant outperforms the movsq one.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/tools/perf/bench/mem-memcpy-x86-64-asm-def.h
+++ b/tools/perf/bench/mem-memcpy-x86-64-asm-def.h
@@ -2,3 +2,11 @@
 MEMCPY_FN(__memcpy,
 	"x86-64-unrolled",
 	"unrolled memcpy() in arch/x86/lib/memcpy_64.S")
+
+MEMCPY_FN(memcpy_c,
+	"x86-64-movsq",
+	"movsq-based memcpy() in arch/x86/lib/memcpy_64.S")
+
+MEMCPY_FN(memcpy_c_e,
+	"x86-64-movsb",
+	"movsb-based memcpy() in arch/x86/lib/memcpy_64.S")
--- a/tools/perf/bench/mem-memcpy-x86-64-asm.S
+++ b/tools/perf/bench/mem-memcpy-x86-64-asm.S
@@ -1,2 +1,6 @@
 #define memcpy MEMCPY /* don't hide glibc's memcpy() */
+#define altinstr_replacement text
+#define globl p2align 4; .globl
+#define Lmemcpy_c globl memcpy_c; memcpy_c
+#define Lmemcpy_c_e globl memcpy_c_e; memcpy_c_e
 #include "../../../arch/x86/lib/memcpy_64.S"




^ permalink raw reply	[flat|nested] 2+ messages in thread

* [tip:perf/core] perf bench: Also allow measuring alternative memcpy implementations
  2012-01-18 13:28 [PATCH 2/4] perf/x86-64: also allow measuring alternative memcpy implementations Jan Beulich
@ 2012-01-26 13:32 ` tip-bot for Jan Beulich
  0 siblings, 0 replies; 2+ messages in thread
From: tip-bot for Jan Beulich @ 2012-01-26 13:32 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: acme, linux-kernel, paulus, hpa, mingo, a.p.zijlstra, jbeulich,
	JBeulich, tglx, mingo

Commit-ID:  800eb01484b3ca1eaf4eb5186df13fb24de2db19
Gitweb:     http://git.kernel.org/tip/800eb01484b3ca1eaf4eb5186df13fb24de2db19
Author:     Jan Beulich <JBeulich@suse.com>
AuthorDate: Wed, 18 Jan 2012 13:28:56 +0000
Committer:  Arnaldo Carvalho de Melo <acme@redhat.com>
CommitDate: Tue, 24 Jan 2012 19:51:01 -0200

perf bench: Also allow measuring alternative memcpy implementations

Intended to be able to support the current selection of the preferred
memcpy() implementation, this patch adds the ability to also measure the
two alternative implementations, again by way of using some
pre-processsor replacement.

While on my Westmere system this proves that the movsb based variant is
worse than the movsq based one (since the ERMS feature isn't there), it
also shows that here for the default as well as small sizes the unrolled
variant outperforms the movsq one.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/4F16D728020000780006D732@nat28.tlf.novell.com
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/bench/mem-memcpy-x86-64-asm-def.h |    8 ++++++++
 tools/perf/bench/mem-memcpy-x86-64-asm.S     |    4 ++++
 2 files changed, 12 insertions(+), 0 deletions(-)

diff --git a/tools/perf/bench/mem-memcpy-x86-64-asm-def.h b/tools/perf/bench/mem-memcpy-x86-64-asm-def.h
index d588b87..d66ab79 100644
--- a/tools/perf/bench/mem-memcpy-x86-64-asm-def.h
+++ b/tools/perf/bench/mem-memcpy-x86-64-asm-def.h
@@ -2,3 +2,11 @@
 MEMCPY_FN(__memcpy,
 	"x86-64-unrolled",
 	"unrolled memcpy() in arch/x86/lib/memcpy_64.S")
+
+MEMCPY_FN(memcpy_c,
+	"x86-64-movsq",
+	"movsq-based memcpy() in arch/x86/lib/memcpy_64.S")
+
+MEMCPY_FN(memcpy_c_e,
+	"x86-64-movsb",
+	"movsb-based memcpy() in arch/x86/lib/memcpy_64.S")
diff --git a/tools/perf/bench/mem-memcpy-x86-64-asm.S b/tools/perf/bench/mem-memcpy-x86-64-asm.S
index 384b607..a20780b 100644
--- a/tools/perf/bench/mem-memcpy-x86-64-asm.S
+++ b/tools/perf/bench/mem-memcpy-x86-64-asm.S
@@ -1,2 +1,6 @@
 #define memcpy MEMCPY /* don't hide glibc's memcpy() */
+#define altinstr_replacement text
+#define globl p2align 4; .globl
+#define Lmemcpy_c globl memcpy_c; memcpy_c
+#define Lmemcpy_c_e globl memcpy_c_e; memcpy_c_e
 #include "../../../arch/x86/lib/memcpy_64.S"

^ permalink raw reply related	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2012-01-26 13:32 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-01-18 13:28 [PATCH 2/4] perf/x86-64: also allow measuring alternative memcpy implementations Jan Beulich
2012-01-26 13:32 ` [tip:perf/core] perf bench: Also " tip-bot for Jan Beulich

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).