From: Charlie Jenkins <charlie@rivosinc.com>
To: Charlie Jenkins <charlie@rivosinc.com>,
	Palmer Dabbelt <palmer@dabbelt.com>,
	Conor Dooley <conor@kernel.org>,
	Samuel Holland <samuel.holland@sifive.com>,
	David Laight <David.Laight@aculab.com>,
	linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org,
	linux-arch@vger.kernel.org
Cc: Paul Walmsley <paul.walmsley@sifive.com>,
	Albert Ou <aou@eecs.berkeley.edu>,
	Arnd Bergmann <arnd@arndb.de>,
	David Laight <david.laight@aculab.com>
Subject: [PATCH v7 1/4] asm-generic: Improve csum_fold
Date: Tue, 19 Sep 2023 11:44:30 -0700
Message-ID: <20230919-optimize_checksum-v7-1-06c7d0ddd5d6@rivosinc.com>
In-Reply-To: <20230919-optimize_checksum-v7-0-06c7d0ddd5d6@rivosinc.com>

This csum_fold implementation, introduced into arch/arc by Vineet Gupta,
is better than the default implementation on at least arc, x86, and
riscv. Using GCC trunk and compiling a non-inlined version with -O3
optimization, this implementation has 41.6667% and 25% fewer
instructions on riscv64 and x86-64, respectively.

Most implementations override this default in asm, but this should be
more performant than all of those other implementations except for arm,
which has barrel shifting, and sparc32, which has a carry flag.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: David Laight <david.laight@aculab.com>
---
 include/asm-generic/checksum.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/asm-generic/checksum.h b/include/asm-generic/checksum.h
index 43e18db89c14..ad928cce268b 100644
--- a/include/asm-generic/checksum.h
+++ b/include/asm-generic/checksum.h
@@ -2,6 +2,8 @@
 #ifndef __ASM_GENERIC_CHECKSUM_H
 #define __ASM_GENERIC_CHECKSUM_H
 
+#include <linux/bitops.h>
+
 /*
  * computes the checksum of a memory block at buff, length len,
  * and adds in "sum" (32-bit)
@@ -31,9 +33,7 @@ extern __sum16 ip_fast_csum(const void *iph, unsigned int ihl);
 static inline __sum16 csum_fold(__wsum csum)
 {
 	u32 sum = (__force u32)csum;
-	sum = (sum & 0xffff) + (sum >> 16);
-	sum = (sum & 0xffff) + (sum >> 16);
-	return (__force __sum16)~sum;
+	return (__force __sum16)((~sum - ror32(sum, 16)) >> 16);
 }
 #endif
 
-- 
2.42.0
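
For readers who want to convince themselves that the one-line fold
matches the two-step fold it replaces, here is a minimal userspace
sketch. It is not part of the patch. The names csum_fold_old,
csum_fold_new and csum_fold_count_mismatches are made up for
illustration, the kernel types u32, __wsum and __sum16 are replaced by
uint32_t and uint16_t, and ror32 is a local stand-in for the helper in
<linux/bitops.h>. The idea is that sum + ror32(sum, 16) leaves the
end-around-carry sum of both halves in the upper 16 bits, and
~sum - ror32(sum, 16) is the bitwise complement of that value.

```c
#include <stdint.h>

/* Userspace stand-in for the kernel's ror32() (assumption: same semantics). */
static inline uint32_t ror32(uint32_t word, unsigned int shift)
{
	return (word >> (shift & 31)) | (word << ((-shift) & 31));
}

/* The fold removed by the patch: add the halves twice, then complement. */
static inline uint16_t csum_fold_old(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/* The fold added by the patch, with kernel types swapped for stdint ones. */
static inline uint16_t csum_fold_new(uint32_t sum)
{
	return (uint16_t)((~sum - ror32(sum, 16)) >> 16);
}

/*
 * Hypothetical helper: compare both folds on a strided sample of the
 * 32-bit input space plus a few boundary values, and return the number
 * of inputs on which they disagree.
 */
static unsigned long csum_fold_count_mismatches(void)
{
	static const uint32_t edges[] = {
		0x00000000, 0x0000ffff, 0xffff0000, 0xffffffff,
		0x00010000, 0x0001ffff, 0xfffeffff, 0x80008000,
	};
	unsigned long mismatches = 0;
	uint64_t v;
	unsigned int i;

	/* 64-bit counter so the loop ends instead of wrapping around. */
	for (v = 0; v <= UINT32_MAX; v += 65521)
		if (csum_fold_old((uint32_t)v) != csum_fold_new((uint32_t)v))
			mismatches++;

	for (i = 0; i < sizeof(edges) / sizeof(edges[0]); i++)
		if (csum_fold_old(edges[i]) != csum_fold_new(edges[i]))
			mismatches++;

	return mismatches;
}
```

Calling csum_fold_count_mismatches() from a small main() and checking
that it returns 0 is enough for a quick sanity check. Dropping the
stride to 1 makes the comparison exhaustive at the cost of a longer run.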