* [PATCH net-next-2.6] ipv4: speedup inet_dump_ifaddr()
@ 2009-11-12 14:11 Eric Dumazet
  2009-11-12 17:12 ` Stephen Hemminger
  0 siblings, 1 reply; 6+ messages in thread
From: Eric Dumazet @ 2009-11-12 14:11 UTC
  To: David S. Miller; +Cc: Linux Netdev List, Stephen Hemminger

When handling a large number of netdevices, inet_dump_ifaddr()
is very slow because it has O(N^2) complexity.

Instead of scanning one single list, we can use the NETDEV_HASHENTRIES
sub lists of the dev_index hash table, and RCU lookups.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
 net/ipv4/devinet.c |   57 ++++++++++++++++++++++++++-----------------
 1 files changed, 35 insertions(+), 22 deletions(-)

diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index c2045f9..a74cb4e 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -1174,39 +1174,52 @@ nla_put_failure:
 static int inet_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb)
 {
 	struct net *net = sock_net(skb->sk);
-	int idx, ip_idx;
+	int h, s_h;
+	int idx, s_idx;
+	int ip_idx, s_ip_idx;
 	struct net_device *dev;
 	struct in_device *in_dev;
 	struct in_ifaddr *ifa;
-	int s_ip_idx, s_idx = cb->args[0];
+	struct hlist_head *head;
+	struct hlist_node *node;
 
-	s_ip_idx = ip_idx = cb->args[1];
-	idx = 0;
-	for_each_netdev(net, dev) {
-		if (idx < s_idx)
-			goto cont;
-		if (idx > s_idx)
-			s_ip_idx = 0;
-		in_dev = __in_dev_get_rtnl(dev);
-		if (!in_dev)
-			goto cont;
+	s_h = cb->args[0];
+	s_idx = idx = cb->args[1];
+	s_ip_idx = ip_idx = cb->args[2];
 
-		for (ifa = in_dev->ifa_list, ip_idx = 0; ifa;
-		     ifa = ifa->ifa_next, ip_idx++) {
-			if (ip_idx < s_ip_idx)
-				continue;
-			if (inet_fill_ifaddr(skb, ifa, NETLINK_CB(cb->skb).pid,
+	rcu_read_lock();
+	for (h = s_h; h < NETDEV_HASHENTRIES; h++, s_idx = 0) {
+		idx = 0;
+		head = &net->dev_index_head[h];
+		hlist_for_each_entry_rcu(dev, node, head, index_hlist) {
+			if (idx < s_idx)
+				goto cont;
+			if (idx > s_idx)
+				s_ip_idx = 0;
+			in_dev = __in_dev_get_rcu(dev);
+			if (!in_dev)
+				goto cont;
+
+			for (ifa = in_dev->ifa_list, ip_idx = 0; ifa;
+			     ifa = ifa->ifa_next, ip_idx++) {
+				if (ip_idx < s_ip_idx)
+					continue;
+				if (inet_fill_ifaddr(skb, ifa,
+					     NETLINK_CB(cb->skb).pid,
 					     cb->nlh->nlmsg_seq,
 					     RTM_NEWADDR, NLM_F_MULTI) <= 0)
-				goto done;
-		}
+					goto done;
+			}
 cont:
-		idx++;
+			idx++;
+		}
 	}
 
 done:
-	cb->args[0] = idx;
-	cb->args[1] = ip_idx;
+	rcu_read_unlock();
+	cb->args[0] = h;
+	cb->args[1] = idx;
+	cb->args[2] = ip_idx;
 
 	return skb->len;
 }
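
The quadratic cost described in the changelog can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not kernel code: `struct cursor`, `bucket_len()`, `dump_once()` and `total_visits()` are hypothetical names, devices are assumed to be spread evenly over the buckets, and each dump call is assumed to have room for a fixed number of devices. The cursor plays the role of cb->args[]: with a single list (nbuckets == 1) every call re-walks all previously dumped devices, while with a hash table a call resumes directly at bucket h and only skips entries inside that bucket.

```c
#include <assert.h>

/* Resume position saved between dump calls (models cb->args[0..1]). */
struct cursor {
	unsigned long h;	/* bucket to resume in */
	unsigned long idx;	/* entries already dumped in that bucket */
};

/* Number of devices in bucket h when n devices are spread evenly. */
static unsigned long bucket_len(unsigned long h, unsigned long n,
				unsigned long nbuckets)
{
	return n / nbuckets + (h < n % nbuckets ? 1 : 0);
}

/*
 * One dump call: resume at the cursor, emit at most `room` devices,
 * and count every list node touched in *visits.
 * Returns the number of devices emitted by this call.
 */
static unsigned long dump_once(struct cursor *c, unsigned long n,
			       unsigned long nbuckets, unsigned long room,
			       unsigned long *visits)
{
	unsigned long h, idx = 0, s_idx = c->idx, dumped = 0;

	for (h = c->h; h < nbuckets; h++, s_idx = 0) {
		unsigned long len = bucket_len(h, n, nbuckets);

		for (idx = 0; idx < len; idx++) {
			(*visits)++;
			if (idx < s_idx)
				continue;	/* already dumped earlier */
			if (room == 0)
				goto done;	/* buffer full, save position */
			room--;
			dumped++;
		}
	}
done:
	c->h = h;
	c->idx = idx;
	return dumped;
}

/* Total nodes touched while dumping n devices, per_call devices per call. */
static unsigned long total_visits(unsigned long n, unsigned long per_call,
				  unsigned long nbuckets)
{
	struct cursor c = { 0, 0 };
	unsigned long visits = 0;

	assert(per_call > 0 && nbuckets > 0);
	while (dump_once(&c, n, nbuckets, per_call, &visits) > 0)
		;
	return visits;
}
```

In this model, total_visits(1024, 64, 1) returns 8719 for the single list, while total_visits(1024, 64, 256) returns 1039 with 256 buckets (256 is an assumed value for NETDEV_HASHENTRIES). The single-list count grows roughly as N^2 / per_call, the hashed one roughly as N.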

* Re: [PATCH net-next-2.6] ipv4: speedup inet_dump_ifaddr()
  2009-11-12 14:11 [PATCH net-next-2.6] ipv4: speedup inet_dump_ifaddr() Eric Dumazet
@ 2009-11-12 17:12 ` Stephen Hemminger
  2009-11-12 17:44   ` Eric Dumazet
  0 siblings, 1 reply; 6+ messages in thread
From: Stephen Hemminger @ 2009-11-12 17:12 UTC
  To: Eric Dumazet; +Cc: David S. Miller, Linux Netdev List

On Thu, 12 Nov 2009 15:11:36 +0100
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> When handling a large number of netdevices, inet_dump_ifaddr()
> is very slow because it has O(N^2) complexity.
> 
> Instead of scanning one single list, we can use the NETDEV_HASHENTRIES
> sub lists of the dev_index hash table, and RCU lookups.
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>

You might be able to make RCU critical section smaller by moving
it into loop.

Acked-by: Stephen Hemminger <shemminger@vyatta.com>

* Re: [PATCH net-next-2.6] ipv4: speedup inet_dump_ifaddr()
  2009-11-12 17:12 ` Stephen Hemminger
@ 2009-11-12 17:44   ` Eric Dumazet
  2009-11-12 19:12     ` Stephen Hemminger
  2009-11-14  4:50     ` David Miller
  0 siblings, 2 replies; 6+ messages in thread
From: Eric Dumazet @ 2009-11-12 17:44 UTC
  To: Stephen Hemminger; +Cc: David S. Miller, Linux Netdev List

Stephen Hemminger wrote:
> On Thu, 12 Nov 2009 15:11:36 +0100
> Eric Dumazet <eric.dumazet@gmail.com> wrote:
> 
>> When handling a large number of netdevices, inet_dump_ifaddr()
>> is very slow because it has O(N^2) complexity.
>>
>> Instead of scanning one single list, we can use the NETDEV_HASHENTRIES
>> sub lists of the dev_index hash table, and RCU lookups.
>>
>> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> 
> You might be able to make RCU critical section smaller by moving
> it into loop.
> 

Indeed. But we dump at most one skb (<= 8192 bytes?), so the rcu_read_lock
hold time is small, unless we meet many netdevices without
addresses. I wonder if that's really common...

Thanks

[PATCH net-next-2.6] ipv4: speedup inet_dump_ifaddr()

When handling a large number of netdevices, inet_dump_ifaddr()
is very slow because it has O(N^2) complexity.

Instead of scanning one single list, we can use the NETDEV_HASHENTRIES
sub lists of the dev_index hash table, and RCU lookups.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>
---
 net/ipv4/devinet.c |   61 ++++++++++++++++++++++++++-----------------
 1 files changed, 38 insertions(+), 23 deletions(-)

diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index c2045f9..7620382 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -1174,39 +1174,54 @@ nla_put_failure:
 static int inet_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb)
 {
 	struct net *net = sock_net(skb->sk);
-	int idx, ip_idx;
+	int h, s_h;
+	int idx, s_idx;
+	int ip_idx, s_ip_idx;
 	struct net_device *dev;
 	struct in_device *in_dev;
 	struct in_ifaddr *ifa;
-	int s_ip_idx, s_idx = cb->args[0];
+	struct hlist_head *head;
+	struct hlist_node *node;
 
-	s_ip_idx = ip_idx = cb->args[1];
-	idx = 0;
-	for_each_netdev(net, dev) {
-		if (idx < s_idx)
-			goto cont;
-		if (idx > s_idx)
-			s_ip_idx = 0;
-		in_dev = __in_dev_get_rtnl(dev);
-		if (!in_dev)
-			goto cont;
+	s_h = cb->args[0];
+	s_idx = idx = cb->args[1];
+	s_ip_idx = ip_idx = cb->args[2];
 
-		for (ifa = in_dev->ifa_list, ip_idx = 0; ifa;
-		     ifa = ifa->ifa_next, ip_idx++) {
-			if (ip_idx < s_ip_idx)
-				continue;
-			if (inet_fill_ifaddr(skb, ifa, NETLINK_CB(cb->skb).pid,
+	for (h = s_h; h < NETDEV_HASHENTRIES; h++, s_idx = 0) {
+		idx = 0;
+		head = &net->dev_index_head[h];
+		rcu_read_lock();
+		hlist_for_each_entry_rcu(dev, node, head, index_hlist) {
+			if (idx < s_idx)
+				goto cont;
+			if (idx > s_idx)
+				s_ip_idx = 0;
+			in_dev = __in_dev_get_rcu(dev);
+			if (!in_dev)
+				goto cont;
+
+			for (ifa = in_dev->ifa_list, ip_idx = 0; ifa;
+			     ifa = ifa->ifa_next, ip_idx++) {
+				if (ip_idx < s_ip_idx)
+					continue;
+				if (inet_fill_ifaddr(skb, ifa,
+					     NETLINK_CB(cb->skb).pid,
 					     cb->nlh->nlmsg_seq,
-					     RTM_NEWADDR, NLM_F_MULTI) <= 0)
-				goto done;
-		}
+					     RTM_NEWADDR, NLM_F_MULTI) <= 0) {
+					rcu_read_unlock();
+					goto done;
+				}
+			}
 cont:
-		idx++;
+			idx++;
+		}
+		rcu_read_unlock();
 	}
 
 done:
-	cb->args[0] = idx;
-	cb->args[1] = ip_idx;
+	cb->args[0] = h;
+	cb->args[1] = idx;
+	cb->args[2] = ip_idx;
 
 	return skb->len;
 }
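
The difference between holding the RCU read lock around the whole walk and taking it once per hash bucket can be sketched in userspace. This is an illustration under stated assumptions, not kernel code: `fake_rcu_read_lock()` and `fake_rcu_read_unlock()` are hypothetical stand-ins that only track nesting depth and the number of nodes visited inside one read-side section, and `walk_per_bucket()` / `walk_whole_loop()` are hypothetical names. The per-bucket variant also shows why the early-exit path needs its own unlock before jumping to done.

```c
#include <assert.h>

static int rcu_depth;			/* models read-side nesting */
static unsigned long in_section;	/* nodes visited in current section */
static unsigned long max_section;	/* longest section observed */

static void fake_rcu_read_lock(void)
{
	rcu_depth++;
	in_section = 0;
}

static void fake_rcu_read_unlock(void)
{
	assert(rcu_depth > 0);
	rcu_depth--;
	if (in_section > max_section)
		max_section = in_section;
}

/* Lock taken once per bucket; stop after `room` nodes were emitted. */
static unsigned long walk_per_bucket(unsigned long nbuckets,
				     unsigned long bucket_len,
				     unsigned long room)
{
	unsigned long h, idx, emitted = 0;

	for (h = 0; h < nbuckets; h++) {
		fake_rcu_read_lock();
		for (idx = 0; idx < bucket_len; idx++) {
			in_section++;
			if (emitted == room) {
				/* leaving early: drop the lock first */
				fake_rcu_read_unlock();
				goto done;
			}
			emitted++;
		}
		fake_rcu_read_unlock();
	}
done:
	return emitted;
}

/* Lock taken once around the whole walk; a single unlock at done. */
static unsigned long walk_whole_loop(unsigned long nbuckets,
				     unsigned long bucket_len,
				     unsigned long room)
{
	unsigned long h, idx, emitted = 0;

	fake_rcu_read_lock();
	for (h = 0; h < nbuckets; h++) {
		for (idx = 0; idx < bucket_len; idx++) {
			in_section++;
			if (emitted == room)
				goto done;
			emitted++;
		}
	}
done:
	fake_rcu_read_unlock();
	return emitted;
}
```

With 256 buckets of 4 nodes and room for 1000 nodes, the per-bucket walk never visits more than 4 nodes inside one section, while the whole-loop walk visits 1001. Both leave the nesting depth at zero, which is the property the extra unlock before goto done preserves.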

* Re: [PATCH net-next-2.6] ipv4: speedup inet_dump_ifaddr()
  2009-11-12 17:44   ` Eric Dumazet
@ 2009-11-12 19:12     ` Stephen Hemminger
  2009-11-12 19:14       ` Eric Dumazet
  2009-11-14  4:50     ` David Miller
  1 sibling, 1 reply; 6+ messages in thread
From: Stephen Hemminger @ 2009-11-12 19:12 UTC
  To: Eric Dumazet; +Cc: David S. Miller, Linux Netdev List

On Thu, 12 Nov 2009 18:44:25 +0100
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> Stephen Hemminger wrote:
> > On Thu, 12 Nov 2009 15:11:36 +0100
> > Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > 
> >> When handling a large number of netdevices, inet_dump_ifaddr()
> >> is very slow because it has O(N^2) complexity.
> >>
> >> Instead of scanning one single list, we can use the NETDEV_HASHENTRIES
> >> sub lists of the dev_index hash table, and RCU lookups.
> >>
> >> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> > 
> > You might be able to make RCU critical section smaller by moving
> > it into loop.
> > 
> 
> Indeed. But we dump at most one skb (<= 8192 bytes?), so the rcu_read_lock
> hold time is small, unless we meet many netdevices without
> addresses. I wonder if that's really common...
> 
> Thanks

One case where that might happen is:
  modprobe dummy numdummies=10000

But dummy devices should really be added with netlink, not at boot time.

* Re: [PATCH net-next-2.6] ipv4: speedup inet_dump_ifaddr()
  2009-11-12 19:12     ` Stephen Hemminger
@ 2009-11-12 19:14       ` Eric Dumazet
  0 siblings, 0 replies; 6+ messages in thread
From: Eric Dumazet @ 2009-11-12 19:14 UTC
  To: Stephen Hemminger; +Cc: David S. Miller, Linux Netdev List

Stephen Hemminger wrote:

> 
> One case where that might happen is:
>   modprobe dummy numdummies=10000
> 
> But dummy devices should really be added with netlink, not at boot time.
> 

At least we cannot exceed 32768 dummies :)

* Re: [PATCH net-next-2.6] ipv4: speedup inet_dump_ifaddr()
  2009-11-12 17:44   ` Eric Dumazet
  2009-11-12 19:12     ` Stephen Hemminger
@ 2009-11-14  4:50     ` David Miller
  1 sibling, 0 replies; 6+ messages in thread
From: David Miller @ 2009-11-14  4:50 UTC
  To: eric.dumazet; +Cc: shemminger, netdev

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 12 Nov 2009 18:44:25 +0100

> [PATCH net-next-2.6] ipv4: speedup inet_dump_ifaddr()
> 
> When handling a large number of netdevices, inet_dump_ifaddr()
> is very slow because it has O(N^2) complexity.
> 
> Instead of scanning one single list, we can use the NETDEV_HASHENTRIES
> sub lists of the dev_index hash table, and RCU lookups.
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> Acked-by: Stephen Hemminger <shemminger@vyatta.com>

Applied.
