All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] xfrm: optimize ipv4 selector matching
@ 2011-11-22 14:43 Alexey Dobriyan
  2011-11-22 14:50 ` David Laight
  0 siblings, 1 reply; 9+ messages in thread
From: Alexey Dobriyan @ 2011-11-22 14:43 UTC (permalink / raw)
  To: davem; +Cc: netdev

Current addr_match() is errh, under-optimized.

Compiler doesn't know that memcmp() branch doesn't trigger for IPv4.
Pass addresses by value.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
---

 include/net/xfrm.h     |   10 ++++++++++
 net/xfrm/xfrm_policy.c |    4 ++--
 2 files changed, 12 insertions(+), 2 deletions(-)

--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -827,6 +827,16 @@ static inline bool addr_match(const void *token1, const void *token2,
 	return true;
 }
 
+static inline int addr4_match(const __be32 a1, const __be32 a2, const u8 prefixlen)
+{
+	/* C99 6.5.7 (3): u32 << 32 is undefined behaviour */
+	if (prefixlen == 0) {
+		/* Matching constants result in smaller assembly. */
+		return 0xFFFFFFFFu;
+	}
+	return !((a1 ^ a2) & htonl(0xFFFFFFFFu << (32 - prefixlen)));
+}
+
 static __inline__
 __be16 xfrm_flowi_sport(const struct flowi *fl, const union flowi_uli *uli)
 {
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -61,8 +61,8 @@ __xfrm4_selector_match(const struct xfrm_selector *sel, const struct flowi *fl)
 {
 	const struct flowi4 *fl4 = &fl->u.ip4;
 
-	return  addr_match(&fl4->daddr, &sel->daddr, sel->prefixlen_d) &&
-		addr_match(&fl4->saddr, &sel->saddr, sel->prefixlen_s) &&
+	return  addr4_match(fl4->daddr, sel->daddr.a4, sel->prefixlen_d) &&
+		addr4_match(fl4->saddr, sel->saddr.a4, sel->prefixlen_s) &&
 		!((xfrm_flowi_dport(fl, &fl4->uli) ^ sel->dport) & sel->dport_mask) &&
 		!((xfrm_flowi_sport(fl, &fl4->uli) ^ sel->sport) & sel->sport_mask) &&
 		(fl4->flowi4_proto == sel->proto || !sel->proto) &&

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [PATCH] xfrm: optimize ipv4 selector matching
  2011-11-22 14:43 [PATCH] xfrm: optimize ipv4 selector matching Alexey Dobriyan
@ 2011-11-22 14:50 ` David Laight
  2011-11-22 14:59   ` Alexey Dobriyan
  0 siblings, 1 reply; 9+ messages in thread
From: David Laight @ 2011-11-22 14:50 UTC (permalink / raw)
  To: Alexey Dobriyan, davem; +Cc: netdev

 
> +static inline int addr4_match(const __be32 a1, const __be32 
> a2, const u8 prefixlen)
> +{
> +	/* C99 6.5.7 (3): u32 << 32 is undefined behaviour */
> +	if (prefixlen == 0) {
> +		/* Matching constants result in smaller assembly. */
> +		return 0xFFFFFFFFu;
> +	}
> +	return !((a1 ^ a2) & htonl(0xFFFFFFFFu << (32 - prefixlen)));
> +}
> +

It would probably be clearer to 'return 1' when prefixlen is zero.

If this is a common path, might be worth caching
    htonl(0xFFFFFFFFu << (32 - prefixlen))
in the enclosing structure.

	David

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] xfrm: optimize ipv4 selector matching
  2011-11-22 14:50 ` David Laight
@ 2011-11-22 14:59   ` Alexey Dobriyan
  2011-11-22 15:10     ` Eric Dumazet
  0 siblings, 1 reply; 9+ messages in thread
From: Alexey Dobriyan @ 2011-11-22 14:59 UTC (permalink / raw)
  To: David Laight; +Cc: davem, netdev

On Tue, Nov 22, 2011 at 02:50:59PM -0000, David Laight wrote:
>  
> > +static inline int addr4_match(const __be32 a1, const __be32 
> > a2, const u8 prefixlen)
> > +{
> > +	/* C99 6.5.7 (3): u32 << 32 is undefined behaviour */
> > +	if (prefixlen == 0) {
> > +		/* Matching constants result in smaller assembly. */
> > +		return 0xFFFFFFFFu;
> > +	}
> > +	return !((a1 ^ a2) & htonl(0xFFFFFFFFu << (32 - prefixlen)));
> > +}
> > +
> 
> It would probably be clearer to 'return 1' when prefixlen is zero.

"return 1" results in bigger code.
This function used only in boolean context, so exact return value doesn't matter.

> If this is a common path, might be worth caching
>     htonl(0xFFFFFFFFu << (32 - prefixlen))
> in the enclosing structure.

This means one more branch, which wouldn't be a win compared
to current code.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] xfrm: optimize ipv4 selector matching
  2011-11-22 14:59   ` Alexey Dobriyan
@ 2011-11-22 15:10     ` Eric Dumazet
  2011-11-22 15:15       ` David Laight
  2011-11-22 15:47       ` [PATCH] " Alexey Dobriyan
  0 siblings, 2 replies; 9+ messages in thread
From: Eric Dumazet @ 2011-11-22 15:10 UTC (permalink / raw)
  To: Alexey Dobriyan; +Cc: David Laight, davem, netdev

Le mardi 22 novembre 2011 à 17:59 +0300, Alexey Dobriyan a écrit :
> On Tue, Nov 22, 2011 at 02:50:59PM -0000, David Laight wrote:
> >  
> > > +static inline int addr4_match(const __be32 a1, const __be32 
> > > a2, const u8 prefixlen)
> > > +{
> > > +	/* C99 6.5.7 (3): u32 << 32 is undefined behaviour */
> > > +	if (prefixlen == 0) {
> > > +		/* Matching constants result in smaller assembly. */
> > > +		return 0xFFFFFFFFu;
> > > +	}
> > > +	return !((a1 ^ a2) & htonl(0xFFFFFFFFu << (32 - prefixlen)));
> > > +}
> > > +
> > 
> > It would probably be clearer to 'return 1' when prefixlen is zero.
> 
> "return 1" results in bigger code.
> This function used only in boolean context, so exact return value doesn't matter.


Thats too ugly, sorry.

Also, using "const" attributes on integral types (not pointers) make
code less clear. Caller doesnt care if the implementation wants to
change a1 or a2 or prefixlen.

Please use :

+static inline bool addr4_match(__be32 a1, __be32 a2, u8 prefixlen)
+{
+       /* C99 6.5.7 (3): u32 << 32 is undefined behaviour */
+       if (prefixlen == 0)
+               return true;
+       return !((a1 ^ a2) & htonl(0xFFFFFFFFu << (32 - prefixlen)));
+}

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [PATCH] xfrm: optimize ipv4 selector matching
  2011-11-22 15:10     ` Eric Dumazet
@ 2011-11-22 15:15       ` David Laight
  2011-11-22 15:41         ` Alexey Dobriyan
  2011-11-22 16:46         ` [PATCH v2] " Alexey Dobriyan
  2011-11-22 15:47       ` [PATCH] " Alexey Dobriyan
  1 sibling, 2 replies; 9+ messages in thread
From: David Laight @ 2011-11-22 15:15 UTC (permalink / raw)
  To: Eric Dumazet, Alexey Dobriyan; +Cc: davem, netdev

 
> Please use :
> 
> +static inline bool addr4_match(__be32 a1, __be32 a2, u8 prefixlen)
> +{
> +       /* C99 6.5.7 (3): u32 << 32 is undefined behaviour */
> +       if (prefixlen == 0)
> +               return true;
> +       return !((a1 ^ a2) & htonl(0xFFFFFFFFu << (32 - prefixlen)));
> +}

I'm not sure I'd agree about using 'u8'.
It may well cause an unnecessary mask with 0xff.

	David

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] xfrm: optimize ipv4 selector matching
  2011-11-22 15:15       ` David Laight
@ 2011-11-22 15:41         ` Alexey Dobriyan
  2011-11-22 16:46         ` [PATCH v2] " Alexey Dobriyan
  1 sibling, 0 replies; 9+ messages in thread
From: Alexey Dobriyan @ 2011-11-22 15:41 UTC (permalink / raw)
  To: David Laight; +Cc: Eric Dumazet, davem, netdev

On Tue, Nov 22, 2011 at 03:15:25PM -0000, David Laight wrote:
>  
> > Please use :
> > 
> > +static inline bool addr4_match(__be32 a1, __be32 a2, u8 prefixlen)
> > +{
> > +       /* C99 6.5.7 (3): u32 << 32 is undefined behaviour */
> > +       if (prefixlen == 0)
> > +               return true;
> > +       return !((a1 ^ a2) & htonl(0xFFFFFFFFu << (32 - prefixlen)));
> > +}
> 
> I'm not sure I'd agree about using 'u8'.
> It may well cause an unnecessary mask with 0xff.

It's u8 in all other places.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] xfrm: optimize ipv4 selector matching
  2011-11-22 15:10     ` Eric Dumazet
  2011-11-22 15:15       ` David Laight
@ 2011-11-22 15:47       ` Alexey Dobriyan
  1 sibling, 0 replies; 9+ messages in thread
From: Alexey Dobriyan @ 2011-11-22 15:47 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Laight, davem, netdev

On Tue, Nov 22, 2011 at 04:10:52PM +0100, Eric Dumazet wrote:
> Le mardi 22 novembre 2011 à 17:59 +0300, Alexey Dobriyan a écrit :
> > On Tue, Nov 22, 2011 at 02:50:59PM -0000, David Laight wrote:
> > >  
> > > > +static inline int addr4_match(const __be32 a1, const __be32 
> > > > a2, const u8 prefixlen)
> > > > +{
> > > > +	/* C99 6.5.7 (3): u32 << 32 is undefined behaviour */
> > > > +	if (prefixlen == 0) {
> > > > +		/* Matching constants result in smaller assembly. */
> > > > +		return 0xFFFFFFFFu;
> > > > +	}
> > > > +	return !((a1 ^ a2) & htonl(0xFFFFFFFFu << (32 - prefixlen)));
> > > > +}
> > > > +
> > > 
> > > It would probably be clearer to 'return 1' when prefixlen is zero.
> > 
> > "return 1" results in bigger code.
> > This function used only in boolean context, so exact return value doesn't matter.
> 
> 
> Thats too ugly, sorry.
> 
> Also, using "const" attributes on integral types (not pointers) make
> code less clear.

No. it doesn't.
It signals to programmer that functions doesn't change its arguments.

> Caller doesnt care if the implementation wants to
> change a1 or a2 or prefixlen.

const is never about caller.
Millions of const qualifiers were injected already, so who cares.

> Please use :
> 
> +static inline bool addr4_match(__be32 a1, __be32 a2, u8 prefixlen)
> +{
> +       /* C99 6.5.7 (3): u32 << 32 is undefined behaviour */
> +       if (prefixlen == 0)
> +               return true;
> +       return !((a1 ^ a2) & htonl(0xFFFFFFFFu << (32 - prefixlen)));
> +}

hmm, bool results in smaller code for -Os, so I'll resend.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v2] xfrm: optimize ipv4 selector matching
  2011-11-22 15:15       ` David Laight
  2011-11-22 15:41         ` Alexey Dobriyan
@ 2011-11-22 16:46         ` Alexey Dobriyan
  2011-11-22 20:27           ` David Miller
  1 sibling, 1 reply; 9+ messages in thread
From: Alexey Dobriyan @ 2011-11-22 16:46 UTC (permalink / raw)
  To: David Laight; +Cc: Eric Dumazet, davem, netdev

Current addr_match() is errh, under-optimized.

Compiler doesn't know that memcmp() branch doesn't trigger for IPv4.
Also, pass addresses by value -- they fit into register.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
---

 include/net/xfrm.h     |    8 ++++++++
 net/xfrm/xfrm_policy.c |    4 ++--
 2 files changed, 10 insertions(+), 2 deletions(-)

--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -827,6 +827,14 @@ static inline bool addr_match(const void *token1, const void *token2,
 	return true;
 }
 
+static inline bool addr4_match(__be32 a1, __be32 a2, u8 prefixlen)
+{
+	/* C99 6.5.7 (3): u32 << 32 is undefined behaviour */
+	if (prefixlen == 0)
+		return true;
+	return !((a1 ^ a2) & htonl(0xFFFFFFFFu << (32 - prefixlen)));
+}
+
 static __inline__
 __be16 xfrm_flowi_sport(const struct flowi *fl, const union flowi_uli *uli)
 {
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -61,8 +61,8 @@ __xfrm4_selector_match(const struct xfrm_selector *sel, const struct flowi *fl)
 {
 	const struct flowi4 *fl4 = &fl->u.ip4;
 
-	return  addr_match(&fl4->daddr, &sel->daddr, sel->prefixlen_d) &&
-		addr_match(&fl4->saddr, &sel->saddr, sel->prefixlen_s) &&
+	return  addr4_match(fl4->daddr, sel->daddr.a4, sel->prefixlen_d) &&
+		addr4_match(fl4->saddr, sel->saddr.a4, sel->prefixlen_s) &&
 		!((xfrm_flowi_dport(fl, &fl4->uli) ^ sel->dport) & sel->dport_mask) &&
 		!((xfrm_flowi_sport(fl, &fl4->uli) ^ sel->sport) & sel->sport_mask) &&
 		(fl4->flowi4_proto == sel->proto || !sel->proto) &&

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v2] xfrm: optimize ipv4 selector matching
  2011-11-22 16:46         ` [PATCH v2] " Alexey Dobriyan
@ 2011-11-22 20:27           ` David Miller
  0 siblings, 0 replies; 9+ messages in thread
From: David Miller @ 2011-11-22 20:27 UTC (permalink / raw)
  To: adobriyan; +Cc: David.Laight, eric.dumazet, netdev

From: Alexey Dobriyan <adobriyan@gmail.com>
Date: Tue, 22 Nov 2011 19:46:02 +0300

> Current addr_match() is errh, under-optimized.
> 
> Compiler doesn't know that memcmp() branch doesn't trigger for IPv4.
> Also, pass addresses by value -- they fit into register.
> 
> Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>

Applied, thanks.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2011-11-22 20:27 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-11-22 14:43 [PATCH] xfrm: optimize ipv4 selector matching Alexey Dobriyan
2011-11-22 14:50 ` David Laight
2011-11-22 14:59   ` Alexey Dobriyan
2011-11-22 15:10     ` Eric Dumazet
2011-11-22 15:15       ` David Laight
2011-11-22 15:41         ` Alexey Dobriyan
2011-11-22 16:46         ` [PATCH v2] " Alexey Dobriyan
2011-11-22 20:27           ` David Miller
2011-11-22 15:47       ` [PATCH] " Alexey Dobriyan

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.