linux-kernel.vger.kernel.org archive mirror
* [PATCH] net: optimise inet_proto_csum_replace4()
@ 2014-09-23  8:54 Christophe Leroy
  2014-09-23 11:10 ` Eric Dumazet
  2014-09-26 20:14 ` David Miller
  0 siblings, 2 replies; 3+ messages in thread
From: Christophe Leroy @ 2014-09-23  8:54 UTC
  To: David S. Miller; +Cc: linux-kernel, Eric Dumazet, netdev

csum_partial() is a generic function that is not optimised for small,
fixed-length calculations, and using it requires storing the "from" and "to"
values in memory even though they are already available in registers. This
has a performance impact, especially on RISC processors. In the same spirit
as the change made by Eric Dumazet to csum_replace2(), this patch rewrites
inet_proto_csum_replace4() to use the incremental update from RFC 1624.

During a NATted TCP transfer, I observed that csum_partial() is among the
5 most expensive functions (around 8%), and that the second biggest user of
csum_partial() is inet_proto_csum_replace4().

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>

---
--
 net/core/utils.c | 12 +++++-------
 1 file changed, 5 insertions(+), 7 deletions(-)

diff --git a/net/core/utils.c b/net/core/utils.c
index eed3433..efc76dd 100644
--- a/net/core/utils.c
+++ b/net/core/utils.c
@@ -306,16 +306,14 @@ EXPORT_SYMBOL(in6_pton);
 void inet_proto_csum_replace4(__sum16 *sum, struct sk_buff *skb,
 			      __be32 from, __be32 to, int pseudohdr)
 {
-	__be32 diff[] = { ~from, to };
 	if (skb->ip_summed != CHECKSUM_PARTIAL) {
-		*sum = csum_fold(csum_partial(diff, sizeof(diff),
-				~csum_unfold(*sum)));
+		*sum = csum_fold(csum_add(csum_sub(~csum_unfold(*sum), from),
+				 to));
 		if (skb->ip_summed == CHECKSUM_COMPLETE && pseudohdr)
-			skb->csum = ~csum_partial(diff, sizeof(diff),
-						~skb->csum);
+			skb->csum = ~csum_add(csum_sub(~(skb->csum), from), to);
 	} else if (pseudohdr)
-		*sum = ~csum_fold(csum_partial(diff, sizeof(diff),
-				csum_unfold(*sum)));
+		*sum = ~csum_fold(csum_add(csum_sub(csum_unfold(*sum), from),
+				  to));
 }
 EXPORT_SYMBOL(inet_proto_csum_replace4);
 
-- 
2.1.0
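
For readers who want to check the arithmetic outside the kernel, here is a
minimal user-space sketch of the RFC 1624 update the patch relies on
(HC' = ~(~HC + ~m + m')). It is not the kernel code: oc_add(), oc_sub(),
fold_not(), full_csum(), replace4() and replace4_selftest() are hypothetical
names that only mimic the shape of csum_add(), csum_sub() and csum_fold(),
plain uint16_t/uint32_t stand in for __sum16/__wsum/__be32, and byte order
is ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 32-bit one's complement add: wrap the carry back into bit 0. */
static uint32_t oc_add(uint32_t a, uint32_t b)
{
	uint32_t res = a + b;

	return res + (res < b);
}

/* One's complement subtract: a - b is computed as a + ~b. */
static uint32_t oc_sub(uint32_t a, uint32_t b)
{
	return oc_add(a, ~b);
}

/* Fold a 32-bit accumulator to 16 bits, then complement it. */
static uint16_t fold_not(uint32_t sum)
{
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/* Reference: checksum recomputed over every 16-bit word. */
static uint16_t full_csum(const uint16_t *w, size_t n)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i < n; i++)
		sum = oc_add(sum, w[i]);
	return fold_not(sum);
}

/* Incremental update for a 32-bit field changing from 'from' to 'to'. */
static uint16_t replace4(uint16_t check, uint32_t from, uint32_t to)
{
	return fold_not(oc_add(oc_sub(~(uint32_t)check, from), to));
}

/* Compare the incremental result with a full recomputation. */
static void replace4_selftest(void)
{
	uint16_t w[6] = { 0x4500, 0x0054, 0xc0a8, 0x0001, 0x0a00, 0x0002 };
	uint16_t before = full_csum(w, 6);
	uint16_t updated = replace4(before, 0xc0a80001u, 0x0a0000feu);

	/* Rewrite the two 16-bit words that make up the 32-bit field. */
	w[2] = 0x0a00;
	w[3] = 0x00fe;
	assert(updated == full_csum(w, 6));
}
```

Calling replace4_selftest() from a small main() is enough to run the check.
Both operands stay in local variables, so nothing has to be written to
memory for a csum_partial()-style buffer walk, which is the saving the
commit message describes.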



* Re: [PATCH] net: optimise inet_proto_csum_replace4()
  2014-09-23  8:54 [PATCH] net: optimise inet_proto_csum_replace4() Christophe Leroy
@ 2014-09-23 11:10 ` Eric Dumazet
  2014-09-26 20:14 ` David Miller
  1 sibling, 0 replies; 3+ messages in thread
From: Eric Dumazet @ 2014-09-23 11:10 UTC
  To: Christophe Leroy; +Cc: David S. Miller, linux-kernel, Eric Dumazet, netdev

On Tue, 2014-09-23 at 10:54 +0200, Christophe Leroy wrote:
> csum_partial() is a generic function that is not optimised for small,
> fixed-length calculations, and using it requires storing the "from" and "to"
> values in memory even though they are already available in registers. This
> has a performance impact, especially on RISC processors. In the same spirit
> as the change made by Eric Dumazet to csum_replace2(), this patch rewrites
> inet_proto_csum_replace4() to use the incremental update from RFC 1624.
> 
> During a NATted TCP transfer, I observed that csum_partial() is among the
> 5 most expensive functions (around 8%), and that the second biggest user of
> csum_partial() is inet_proto_csum_replace4().
> 
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> 

Acked-by: Eric Dumazet <edumazet@google.com>

Thanks !




* Re: [PATCH] net: optimise inet_proto_csum_replace4()
  2014-09-23  8:54 [PATCH] net: optimise inet_proto_csum_replace4() Christophe Leroy
  2014-09-23 11:10 ` Eric Dumazet
@ 2014-09-26 20:14 ` David Miller
  1 sibling, 0 replies; 3+ messages in thread
From: David Miller @ 2014-09-26 20:14 UTC
  To: christophe.leroy; +Cc: linux-kernel, edumazet, netdev

From: Christophe Leroy <christophe.leroy@c-s.fr>
Date: Tue, 23 Sep 2014 10:54:37 +0200 (CEST)

> csum_partial() is a generic function that is not optimised for small,
> fixed-length calculations, and using it requires storing the "from" and "to"
> values in memory even though they are already available in registers. This
> has a performance impact, especially on RISC processors. In the same spirit
> as the change made by Eric Dumazet to csum_replace2(), this patch rewrites
> inet_proto_csum_replace4() to use the incremental update from RFC 1624.
> 
> During a NATted TCP transfer, I observed that csum_partial() is among the
> 5 most expensive functions (around 8%), and that the second biggest user of
> csum_partial() is inet_proto_csum_replace4().
> 
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>

Also applied, thanks Christophe.

