netdev.vger.kernel.org archive mirror
* [PATCH ipsec-next 0/2] xfrm interface: use hash to store xfrmi contexts
@ 2020-07-09 10:16 Eyal Birger
From: Eyal Birger @ 2020-07-09 10:16 UTC
  To: steffen.klassert, davem, herbert, netdev; +Cc: Eyal Birger

With many xfrm interfaces, the linear lookup of devices based on
if_id becomes costly.

The first patch refactors xfrmi_decode_session() to use the xi stored in
the netdevice priv context instead of looking it up in the list based
on ifindex. This is needed so that if_id can be the only key used
for xi lookup.

The second patch extends the existing infrastructure, which already
stores the xfrmi contexts in an array of lists, to index that array by
a hash of the if_id.
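
A minimal userspace sketch of the bucket selection, for illustration
only. It assumes the generic kernel hash_32() behaviour (multiply by
GOLDEN_RATIO_32, keep the top bits); every sketch_* name is hypothetical
and not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match the kernel's GOLDEN_RATIO_32 constant. */
#define SKETCH_GOLDEN_RATIO_32	0x61C88647u
#define SKETCH_HASH_BITS	8
#define SKETCH_HASH_SIZE	(1u << SKETCH_HASH_BITS)

/* Userspace stand-in for hash_32(): keep the top `bits` bits of the product. */
static uint32_t sketch_hash_32(uint32_t val, unsigned int bits)
{
	assert(bits >= 1 && bits <= 32);
	return (uint32_t)(val * SKETCH_GOLDEN_RATIO_32) >> (32 - bits);
}

/* Hash if_ids 1..n into the buckets and return the longest resulting chain. */
static unsigned int sketch_longest_chain(uint32_t n)
{
	unsigned int count[SKETCH_HASH_SIZE] = {0};
	unsigned int longest = 0;
	uint32_t if_id;

	for (if_id = 1; if_id <= n; if_id++) {
		unsigned int c = ++count[sketch_hash_32(if_id, SKETCH_HASH_BITS)];

		if (c > longest)
			longest = c;
	}
	return longest;
}
```

Calling sketch_longest_chain(400) gives a rough idea of how many entries
a lookup might have to walk per bucket, compared with up to 400 for a
single list. Sequential if_ids are an assumption of the sketch, not
something the cover letter states.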

Example benchmarks:
- running on a KVM based VM
- xfrm tunnel mode between two namespaces
- xfrm interface in one namespace (10.0.0.2)

Before this change set:

Single xfrm interface in namespace:
$ netperf -H 10.0.0.2 -l8 -I95,10 -t TCP_STREAM

MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.0.0.2 () port 0 AF_INET : +/-5.000% @ 95% conf.  : demo
Recv   Send    Send                          
Socket Socket  Message  Elapsed              
Size   Size    Size     Time     Throughput  
bytes  bytes   bytes    secs.    10^6bits/sec  

131072  16384  16384    8.00      298.36

After adding 400 xfrmi interfaces in the same namespace:

$ netperf -H 10.0.0.2 -l8 -I95,10 -t TCP_STREAM

MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 10.0.0.2 () port 0 AF_INET : +/-5.000% @ 95% conf.  : demo
Recv   Send    Send                          
Socket Socket  Message  Elapsed              
Size   Size    Size     Time     Throughput  
bytes  bytes   bytes    secs.    10^6bits/sec  

131072  16384  16384    8.00      221.77   

With this patchset applied, throughput showed no observable change
after adding the 400 xfrmi interfaces.
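
As a rough reading of the numbers above, the helpers below compute the
relative throughput drop and the average chain length for a given
bucket count. This is only a sketch: the function names are
hypothetical, and the even spread across buckets is an assumption.

```c
#include <assert.h>

/* Relative drop, in percent, from a baseline throughput to a lower one. */
static double sketch_drop_percent(double before, double after)
{
	assert(before > 0.0);
	return (before - after) / before * 100.0;
}

/* Average list length if n entries were spread evenly over `buckets` lists. */
static double sketch_avg_chain(unsigned int n, unsigned int buckets)
{
	assert(buckets > 0);
	return (double)n / buckets;
}
```

With the figures reported here, sketch_drop_percent(298.36, 221.77) is
roughly 25.7, and sketch_avg_chain(401, 256) is about 1.6 entries per
bucket, against 401 entries in a single list.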

Eyal Birger (2):
  xfrm interface: avoid xi lookup in xfrmi_decode_session()
  xfrm interface: store xfrmi contexts in a hash by if_id

 net/xfrm/xfrm_interface.c | 52 +++++++++++++++++++++++++--------------
 1 file changed, 33 insertions(+), 19 deletions(-)

-- 
2.25.1


* [PATCH ipsec-next 1/2] xfrm interface: avoid xi lookup in xfrmi_decode_session()
@ 2020-07-09 10:16 Eyal Birger
From: Eyal Birger @ 2020-07-09 10:16 UTC
  To: steffen.klassert, davem, herbert, netdev; +Cc: Eyal Birger

The xfrmi context is stored in the netdevice priv context, so use it
directly instead of searching for it in a separate list.

Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
---
 net/xfrm/xfrm_interface.c | 23 +++++++++++++----------
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/net/xfrm/xfrm_interface.c b/net/xfrm/xfrm_interface.c
index c407ecbc5d46..069dafeba873 100644
--- a/net/xfrm/xfrm_interface.c
+++ b/net/xfrm/xfrm_interface.c
@@ -47,6 +47,7 @@ static int xfrmi_dev_init(struct net_device *dev);
 static void xfrmi_dev_setup(struct net_device *dev);
 static struct rtnl_link_ops xfrmi_link_ops __read_mostly;
 static unsigned int xfrmi_net_id __read_mostly;
+static const struct net_device_ops xfrmi_netdev_ops;
 
 struct xfrmi_net {
 	/* lists for storing interfaces in use */
@@ -73,8 +74,7 @@ static struct xfrm_if *xfrmi_lookup(struct net *net, struct xfrm_state *x)
 static struct xfrm_if *xfrmi_decode_session(struct sk_buff *skb,
 					    unsigned short family)
 {
-	struct xfrmi_net *xfrmn;
-	struct xfrm_if *xi;
+	struct net_device *dev;
 	int ifindex = 0;
 
 	if (!secpath_exists(skb) || !skb->dev)
@@ -88,18 +88,21 @@ static struct xfrm_if *xfrmi_decode_session(struct sk_buff *skb,
 		ifindex = inet_sdif(skb);
 		break;
 	}
-	if (!ifindex)
-		ifindex = skb->dev->ifindex;
 
-	xfrmn = net_generic(xs_net(xfrm_input_state(skb)), xfrmi_net_id);
+	if (ifindex) {
+		struct net *net = xs_net(xfrm_input_state(skb));
 
-	for_each_xfrmi_rcu(xfrmn->xfrmi[0], xi) {
-		if (ifindex == xi->dev->ifindex &&
-			(xi->dev->flags & IFF_UP))
-				return xi;
+		dev = dev_get_by_index_rcu(net, ifindex);
+	} else {
+		dev = skb->dev;
 	}
 
-	return NULL;
+	if (!dev || !(dev->flags & IFF_UP))
+		return NULL;
+	if (dev->netdev_ops != &xfrmi_netdev_ops)
+		return NULL;
+
+	return netdev_priv(dev);
 }
 
 static void xfrmi_link(struct xfrmi_net *xfrmn, struct xfrm_if *xi)
-- 
2.25.1
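
The check order in the reworked xfrmi_decode_session() can be modelled
in userspace as below. This is a simplified sketch: the sketch_* types
are hypothetical stand-ins, and the private context is a plain pointer
field here, whereas the kernel's netdev_priv() returns memory allocated
together with the net_device.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct net_device_ops. */
struct sketch_ops {
	const char *name;
};

#define SKETCH_IFF_UP	0x1u

/* Hypothetical stand-in for struct net_device. */
struct sketch_dev {
	const struct sketch_ops *ops;	/* identifies the owning driver */
	unsigned int flags;
	void *priv;			/* driver-private context */
};

/* Hypothetical stand-in for struct xfrm_if. */
struct sketch_xfrm_if {
	unsigned int if_id;
};

static const struct sketch_ops sketch_xfrmi_ops = { "xfrmi" };

/* Return the private context only for a device that is up and is an xfrmi. */
static struct sketch_xfrm_if *sketch_decode(struct sketch_dev *dev)
{
	if (!dev || !(dev->flags & SKETCH_IFF_UP))
		return NULL;
	/* Another device type: its priv area is not an xfrm_if. */
	if (dev->ops != &sketch_xfrmi_ops)
		return NULL;

	return dev->priv;
}
```

Comparing the ops pointer before touching the private area is what makes
the cast safe: a device found by ifindex may belong to any driver, so
its private data can only be interpreted once the owner is confirmed.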


* [PATCH ipsec-next 2/2] xfrm interface: store xfrmi contexts in a hash by if_id
@ 2020-07-09 10:16 Eyal Birger
From: Eyal Birger @ 2020-07-09 10:16 UTC
  To: steffen.klassert, davem, herbert, netdev; +Cc: Eyal Birger

xfrmi_lookup() is called on every packet. With many xfrm interfaces,
looking up if_id in a single list becomes a bottleneck.

Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
---
 net/xfrm/xfrm_interface.c | 29 ++++++++++++++++++++---------
 1 file changed, 20 insertions(+), 9 deletions(-)

diff --git a/net/xfrm/xfrm_interface.c b/net/xfrm/xfrm_interface.c
index 069dafeba873..f4ec117e9110 100644
--- a/net/xfrm/xfrm_interface.c
+++ b/net/xfrm/xfrm_interface.c
@@ -49,20 +49,28 @@ static struct rtnl_link_ops xfrmi_link_ops __read_mostly;
 static unsigned int xfrmi_net_id __read_mostly;
 static const struct net_device_ops xfrmi_netdev_ops;
 
+#define XFRMI_HASH_BITS	8
+#define XFRMI_HASH_SIZE	BIT(XFRMI_HASH_BITS)
+
 struct xfrmi_net {
 	/* lists for storing interfaces in use */
-	struct xfrm_if __rcu *xfrmi[1];
+	struct xfrm_if __rcu *xfrmi[XFRMI_HASH_SIZE];
 };
 
 #define for_each_xfrmi_rcu(start, xi) \
 	for (xi = rcu_dereference(start); xi; xi = rcu_dereference(xi->next))
 
+static u32 xfrmi_hash(u32 if_id)
+{
+	return hash_32(if_id, XFRMI_HASH_BITS);
+}
+
 static struct xfrm_if *xfrmi_lookup(struct net *net, struct xfrm_state *x)
 {
 	struct xfrmi_net *xfrmn = net_generic(net, xfrmi_net_id);
 	struct xfrm_if *xi;
 
-	for_each_xfrmi_rcu(xfrmn->xfrmi[0], xi) {
+	for_each_xfrmi_rcu(xfrmn->xfrmi[xfrmi_hash(x->if_id)], xi) {
 		if (x->if_id == xi->p.if_id &&
 		    (xi->dev->flags & IFF_UP))
 			return xi;
@@ -107,7 +115,7 @@ static struct xfrm_if *xfrmi_decode_session(struct sk_buff *skb,
 
 static void xfrmi_link(struct xfrmi_net *xfrmn, struct xfrm_if *xi)
 {
-	struct xfrm_if __rcu **xip = &xfrmn->xfrmi[0];
+	struct xfrm_if __rcu **xip = &xfrmn->xfrmi[xfrmi_hash(xi->p.if_id)];
 
 	rcu_assign_pointer(xi->next , rtnl_dereference(*xip));
 	rcu_assign_pointer(*xip, xi);
@@ -118,7 +126,7 @@ static void xfrmi_unlink(struct xfrmi_net *xfrmn, struct xfrm_if *xi)
 	struct xfrm_if __rcu **xip;
 	struct xfrm_if *iter;
 
-	for (xip = &xfrmn->xfrmi[0];
+	for (xip = &xfrmn->xfrmi[xfrmi_hash(xi->p.if_id)];
 	     (iter = rtnl_dereference(*xip)) != NULL;
 	     xip = &iter->next) {
 		if (xi == iter) {
@@ -162,7 +170,7 @@ static struct xfrm_if *xfrmi_locate(struct net *net, struct xfrm_if_parms *p)
 	struct xfrm_if *xi;
 	struct xfrmi_net *xfrmn = net_generic(net, xfrmi_net_id);
 
-	for (xip = &xfrmn->xfrmi[0];
+	for (xip = &xfrmn->xfrmi[xfrmi_hash(p->if_id)];
 	     (xi = rtnl_dereference(*xip)) != NULL;
 	     xip = &xi->next)
 		if (xi->p.if_id == p->if_id)
@@ -761,11 +769,14 @@ static void __net_exit xfrmi_exit_batch_net(struct list_head *net_exit_list)
 		struct xfrmi_net *xfrmn = net_generic(net, xfrmi_net_id);
 		struct xfrm_if __rcu **xip;
 		struct xfrm_if *xi;
+		int i;
 
-		for (xip = &xfrmn->xfrmi[0];
-		     (xi = rtnl_dereference(*xip)) != NULL;
-		     xip = &xi->next)
-			unregister_netdevice_queue(xi->dev, &list);
+		for (i = 0; i < XFRMI_HASH_SIZE; i++) {
+			for (xip = &xfrmn->xfrmi[i];
+			     (xi = rtnl_dereference(*xip)) != NULL;
+			     xip = &xi->next)
+				unregister_netdevice_queue(xi->dev, &list);
+		}
 	}
 	unregister_netdevice_many(&list);
 	rtnl_unlock();
-- 
2.25.1
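
The link, unlink and lookup pattern from this patch can be sketched in
plain userspace C as follows. The sketch deliberately omits RCU and
rtnl locking, uses hypothetical sketch_* names throughout, and assumes
the same multiplicative hash as the generic hash_32().

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_HASH_BITS	8
#define SKETCH_HASH_SIZE	(1u << SKETCH_HASH_BITS)

/* Hypothetical stand-in for struct xfrm_if: a singly linked list node. */
struct sketch_if {
	struct sketch_if *next;
	uint32_t if_id;
};

/* Hypothetical stand-in for struct xfrmi_net: one list head per bucket. */
struct sketch_table {
	struct sketch_if *bucket[SKETCH_HASH_SIZE];
};

/* Assumed equivalent of hash_32(if_id, SKETCH_HASH_BITS). */
static uint32_t sketch_hash(uint32_t if_id)
{
	return (uint32_t)(if_id * 0x61C88647u) >> (32 - SKETCH_HASH_BITS);
}

/* Insert at the head of the bucket selected by if_id. */
static void sketch_link(struct sketch_table *t, struct sketch_if *xi)
{
	struct sketch_if **xip = &t->bucket[sketch_hash(xi->if_id)];

	xi->next = *xip;
	*xip = xi;
}

/* Walk only the matching bucket; xip always points at the link to patch. */
static void sketch_unlink(struct sketch_table *t, struct sketch_if *xi)
{
	struct sketch_if **xip;

	for (xip = &t->bucket[sketch_hash(xi->if_id)]; *xip; xip = &(*xip)->next) {
		if (*xip == xi) {
			*xip = xi->next;
			xi->next = NULL;
			return;
		}
	}
}

/* Per-packet path: only one bucket is scanned instead of every entry. */
static struct sketch_if *sketch_lookup(struct sketch_table *t, uint32_t if_id)
{
	struct sketch_if *xi;

	for (xi = t->bucket[sketch_hash(if_id)]; xi; xi = xi->next)
		if (xi->if_id == if_id)
			return xi;
	return NULL;
}
```

Because link, unlink and lookup all derive the bucket from the same
if_id, an entry is always found in the list it was inserted into. Only
teardown needs to visit every bucket, as the xfrmi_exit_batch_net() hunk
above does.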


* Re: [PATCH ipsec-next 0/2] xfrm interface: use hash to store xfrmi contexts
@ 2020-07-14  9:44 Steffen Klassert
From: Steffen Klassert @ 2020-07-14  9:44 UTC
  To: Eyal Birger; +Cc: davem, herbert, netdev

On Thu, Jul 09, 2020 at 01:16:50PM +0300, Eyal Birger wrote:
> When having many xfrm interfaces, the linear lookup of devices based on
> if_id becomes costly.
> [...]

Applied to ipsec-next, thanks a lot Eyal!
