From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
To: sunrui26@huawei.com
Cc: Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will.deacon@arm.com>,
linux-arm-kernel <linux-arm-kernel@lists.infradead.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] arm64:crc:accelerated-crc32-by-64bytes
Date: Mon, 19 Nov 2018 08:11:38 -0800
Message-ID: <CAKv+Gu_=e70fzmVRuo7A4ObdNfWt=mQSU+n7E_HkRuqLV0fnMg@mail.gmail.com>
In-Reply-To: <1542612560-10089-1-git-send-email-sunrui26@huawei.com>

On Sun, 18 Nov 2018 at 23:30, Rui Sun <sunrui26@huawei.com> wrote:
>
> add 64 bytes loop to acceleration calculation
>

Can you share some performance numbers, please?

Also, we don't need separate 64-byte, 32-byte and 16-byte code paths:
keep the 64-byte loop, turn the 8-byte path into a loop as well, and
drop the 32-byte and 16-byte ones.

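
To illustrate the control flow suggested above, here is a minimal C
sketch. It is not kernel code: the function names are hypothetical, and
crc32_u64() is a bitwise software stand-in for a single crc32x step on a
little-endian load. It only shows that a 64-byte loop followed by an
8-byte loop and a byte tail covers every length without dedicated
32-byte and 16-byte blocks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Raw CRC-32 update for one byte (reflected polynomial 0xEDB88320). */
static uint32_t crc32_byte(uint32_t crc, uint8_t b)
{
	crc ^= b;
	for (int i = 0; i < 8; i++)
		crc = (crc >> 1) ^ (0xEDB88320u & -(crc & 1));
	return crc;
}

/* Reference implementation: one byte per step. */
static uint32_t crc32_bytewise(uint32_t crc, const uint8_t *p, size_t len)
{
	while (len--)
		crc = crc32_byte(crc, *p++);
	return crc;
}

/* Hypothetical stand-in for one crc32x step: fold 8 bytes in memory order. */
static uint32_t crc32_u64(uint32_t crc, const uint8_t *p)
{
	for (int i = 0; i < 8; i++)
		crc = crc32_byte(crc, p[i]);
	return crc;
}

/* 64-byte main loop, then an 8-byte loop, then the sub-8-byte tail. */
static uint32_t crc32_two_loops(uint32_t crc, const uint8_t *p, size_t len)
{
	assert(p != NULL || len == 0);

	while (len >= 64) {
		for (size_t i = 0; i < 64; i += 8)
			crc = crc32_u64(crc, p + i);
		p += 64;
		len -= 64;
	}
	while (len >= 8) {		/* replaces the 32- and 16-byte blocks */
		crc = crc32_u64(crc, p);
		p += 8;
		len -= 8;
	}
	return crc32_bytewise(crc, p, len);
}
```

The 8-byte loop runs at most seven times after the main loop, so the
cost of dropping the intermediate blocks is bounded. Whether that is
measurable is exactly what the performance numbers would show.
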
> Signed-off-by: Rui Sun <sunrui26@huawei.com>
> ---
> arch/arm64/lib/crc32.S | 54 ++++++++++++++++++++++++++++++++++++++++++++++----
> 1 file changed, 50 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm64/lib/crc32.S b/arch/arm64/lib/crc32.S
> index 5bc1e85..2b37009 100644
> --- a/arch/arm64/lib/crc32.S
> +++ b/arch/arm64/lib/crc32.S
> @@ -15,15 +15,61 @@
> .cpu generic+crc
>
> .macro __crc32, c
> -0: subs x2, x2, #16
> - b.mi 8f
> +
> +64: cmp x2, #64
> + b.lt 32f
> +
> + adds x11, x1, #16
> + adds x12, x1, #32
> + adds x13, x1, #48
> +
> +0 : subs x2, x2, #64
> + b.mi 32f
> +
> + ldp x3, x4, [x1], #64
> + ldp x5, x6, [x11], #64
> + ldp x7, x8, [x12], #64
> + ldp x9, x10,[x13], #64
> +

Can we do this instead, and get rid of the temporary registers
(x11-x13)?

	ldp	x3, x4, [x1], #64
	ldp	x5, x6, [x1, #-48]
	ldp	x7, x8, [x1, #-32]
	ldp	x9, x10, [x1, #-16]

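
The equivalence of the two addressing schemes can be checked with a
small C model. This is only a sketch: the struct and function names are
hypothetical, and pointers to 64-bit words stand in for the registers,
so advancing by 8 words models the 64-byte post-index writeback.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Patch as posted: four cursors, each post-incremented by 64 bytes. */
struct cursors {
	const uint64_t *x1, *x11, *x12, *x13;
};

static void load64_four_cursors(struct cursors *c, uint64_t r[8])
{
	assert(c != NULL);
	r[0] = c->x1[0];  r[1] = c->x1[1];  c->x1  += 8; /* ldp x3, x4, [x1], #64   */
	r[2] = c->x11[0]; r[3] = c->x11[1]; c->x11 += 8; /* ldp x5, x6, [x11], #64  */
	r[4] = c->x12[0]; r[5] = c->x12[1]; c->x12 += 8; /* ldp x7, x8, [x12], #64  */
	r[6] = c->x13[0]; r[7] = c->x13[1]; c->x13 += 8; /* ldp x9, x10, [x13], #64 */
}

/* Suggested form: one cursor, post-indexed once, then negative offsets. */
static const uint64_t *load64_one_cursor(const uint64_t *x1, uint64_t r[8])
{
	assert(x1 != NULL);
	r[0] = x1[0];  r[1] = x1[1];	/* ldp x3, x4, [x1], #64 (load part) */
	x1 += 8;			/* writeback: x1 += 64 bytes         */
	r[2] = x1[-6]; r[3] = x1[-5];	/* ldp x5, x6, [x1, #-48]            */
	r[4] = x1[-4]; r[5] = x1[-3];	/* ldp x7, x8, [x1, #-32]            */
	r[6] = x1[-2]; r[7] = x1[-1];	/* ldp x9, x10, [x1, #-16]           */
	return x1;
}
```

Both forms read the same eight words per iteration and leave x1 at the
start of the next 64-byte block. The single-cursor form needs no setup
instructions before the loop and frees three registers.
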
> + CPU_BE( rev x3, x3 )
> + CPU_BE( rev x4, x4 )
> + CPU_BE( rev x5, x5 )
> + CPU_BE( rev x6, x6 )
> + CPU_BE( rev x7, x7 )
> + CPU_BE( rev x8, x8 )
> + CPU_BE( rev x9, x9 )
> + CPU_BE( rev x10,x10 )
> +
> + crc32\c\()x w0, w0, x3
> + crc32\c\()x w0, w0, x4
> + crc32\c\()x w0, w0, x5
> + crc32\c\()x w0, w0, x6
> + crc32\c\()x w0, w0, x7
> + crc32\c\()x w0, w0, x8
> + crc32\c\()x w0, w0, x9
> + crc32\c\()x w0, w0, x10
> +
> + b.ne 0b
> + ret
> +
> +32: tbz x2, #5, 16f
> + ldp x3, x4, [x1], #16
> + ldp x5, x6, [x1], #16
> +CPU_BE( rev x3, x3 )
> +CPU_BE( rev x4, x4 )
> +CPU_BE( rev x5, x5 )
> +CPU_BE( rev x6, x6 )
> + crc32\c\()x w0, w0, x3
> + crc32\c\()x w0, w0, x4
> + crc32\c\()x w0, w0, x5
> + crc32\c\()x w0, w0, x6
> +
> +16: tbz x2, #4, 8f
> ldp x3, x4, [x1], #16
> CPU_BE( rev x3, x3 )
> CPU_BE( rev x4, x4 )
> crc32\c\()x w0, w0, x3
> crc32\c\()x w0, w0, x4
> - b.ne 0b
> - ret
>
> 8: tbz x2, #3, 4f
> ldr x3, [x1], #8
> --
> 1.8.3.1
>