From: Joel Stanley <joel@jms.id.au>
To: linuxppc-dev@lists.ozlabs.org
Cc: Jeremy Kerr <jk@codeconstruct.com.au>,
Matt Johnston <matt@codeconstruct.com.au>
Subject: [PATCH] powerpc: Implement slightly better 64-bit LE non-VMX memory copy
Date: Mon, 3 Oct 2022 18:06:15 +1030
Message-ID: <20221003073615.5553-1-joel@jms.id.au>
From: Paul Mackerras <paulus@ozlabs.org>

At present, on 64-bit little-endian machines, we have the choice of
either a dumb loop that does one byte per iteration, or an optimized
loop using VMX instructions. On microwatt, we don't have VMX, so
we are stuck with the dumb loop, which is very slow.

This makes the dumb loop a little less dumb. It now does 16 bytes
per iteration, using 'ld' and 'std' instructions. If the number of
bytes to copy is not a multiple of 16, the one-byte-per-iteration
loop is used for the last 1--15 bytes.

Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Joel Stanley <joel@jms.id.au>
---
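Note for reviewers: the control flow of the new little-endian path can be
modelled in C roughly as below. This is only an illustrative sketch, not
code from the patch; the name memcpy_le_sketch is hypothetical, and the
8-byte memcpy() calls stand in for the 'ld'/'std' pairs so that the model
stays valid C for unaligned pointers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical C model of the patched fallback loop (not kernel code). */
static void *memcpy_le_sketch(void *dest, const void *src, size_t n)
{
	unsigned char *d = dest;
	const unsigned char *s = src;
	size_t blocks = n >> 4;	/* srdi.  r0,r5,4  : number of 16-byte blocks */
	size_t tail = n & 15;	/* clrldi r6,r5,60 : leftover 0..15 bytes */
	uint64_t tmp;		/* plays the role of r10 */

	/* Main loop: two 8-byte load/store pairs per iteration. */
	while (blocks--) {
		memcpy(&tmp, s, 8);		/* ld   r10,8(r4)  */
		memcpy(d, &tmp, 8);		/* std  r10,8(r9)  */
		memcpy(&tmp, s + 8, 8);		/* ldu  r10,16(r4) */
		memcpy(d + 8, &tmp, 8);		/* stdu r10,16(r9) */
		s += 16;
		d += 16;
	}

	/* Tail loop: the original one-byte-per-iteration copy. */
	while (tail--)
		*d++ = *s++;		/* lbzu / stbu */

	return dest;			/* r3 is left untouched in the assembly */
}
```

In the assembly the pointers are biased by -8 before the main loop and by
a further +7 before the tail loop, so that the update-form instructions
(ldu/stdu, lbzu/stbu) land on the correct addresses; the sketch uses plain
post-increment instead.
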
arch/powerpc/lib/memcpy_64.S | 27 +++++++++++++++++++--------
1 file changed, 19 insertions(+), 8 deletions(-)
diff --git a/arch/powerpc/lib/memcpy_64.S b/arch/powerpc/lib/memcpy_64.S
index 016c91e958d8..bed7eb327b25 100644
--- a/arch/powerpc/lib/memcpy_64.S
+++ b/arch/powerpc/lib/memcpy_64.S
@@ -18,7 +18,7 @@
 _GLOBAL_TOC_KASAN(memcpy)
 BEGIN_FTR_SECTION
 #ifdef __LITTLE_ENDIAN__
-	cmpdi	cr7,r5,0
+	clrldi	r6,r5,60
 #else
 	std	r3,-STACKFRAMESIZE+STK_REG(R31)(r1) /* save destination pointer for return value */
 #endif
@@ -29,13 +29,24 @@ FTR_SECTION_ELSE
 ALT_FTR_SECTION_END_IFCLR(CPU_FTR_VMX_COPY)
 #ifdef __LITTLE_ENDIAN__
 /* dumb little-endian memcpy that will get replaced at runtime */
-	addi	r9,r3,-1
-	addi	r4,r4,-1
-	beqlr	cr7
-	mtctr	r5
-1:	lbzu	r10,1(r4)
-	stbu	r10,1(r9)
-	bdnz	1b
+	addi	r9,r3,-8
+	addi	r4,r4,-8
+	srdi.	r0,r5,4
+	beq	2f
+	mtctr	r0
+3:	ld	r10,8(r4)
+	std	r10,8(r9)
+	ldu	r10,16(r4)
+	stdu	r10,16(r9)
+	bdnz	3b
+2:	cmpwi	r6,0
+	beqlr
+	addi	r9,r9,7
+	addi	r4,r4,7
+	mtctr	r6
+1:	lbzu	r10,1(r4)
+	stbu	r10,1(r9)
+	bdnz	1b
 	blr
 #else
 	PPC_MTOCRF(0x01,r5)
--
2.35.1