netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net v2 0/2] ipv6: prevent fib6_run_gc() contention
@ 2013-08-01  8:04 Michal Kubecek
  2013-08-01  8:04 ` [PATCH net v2 1/2] " Michal Kubecek
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: Michal Kubecek @ 2013-08-01  8:04 UTC (permalink / raw)
  To: David S. Miller
  Cc: Eric Dumazet, netdev, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy

On a high-traffic router with many processors and many IPv6 dst entries,
soft lockup in fib6_run_gc() can occur when number of entries reaches
gc_thresh. Prevent this by not waiting for the lock if garbage collector
is already running as we would only run it again as soon as it finished.

v2: as suggested by Eric Dumazet, move update of ip6_rt_last_gc into
    fib6_run_gc() so that it reflects every run of garbage collector
    rather than only those via ip6_dst_gc()

Michal Kubecek (2):
  ipv6: prevent fib6_run_gc() contention
  ipv6: update ip6_rt_last_gc every time GC is run

 include/net/ip6_fib.h |  2 +-
 net/ipv6/ip6_fib.c    | 25 +++++++++++++------------
 net/ipv6/ndisc.c      |  4 ++--
 net/ipv6/route.c      |  8 +++-----
 4 files changed, 19 insertions(+), 20 deletions(-)

-- 
1.8.1.4

^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH net v2 1/2] ipv6: prevent fib6_run_gc() contention
  2013-08-01  8:04 [PATCH net v2 0/2] ipv6: prevent fib6_run_gc() contention Michal Kubecek
@ 2013-08-01  8:04 ` Michal Kubecek
  2013-08-01  8:04 ` [PATCH net v2 2/2] ipv6: update ip6_rt_last_gc every time GC is run Michal Kubecek
  2013-08-01 21:18 ` [PATCH net v2 0/2] ipv6: prevent fib6_run_gc() contention David Miller
  2 siblings, 0 replies; 4+ messages in thread
From: Michal Kubecek @ 2013-08-01  8:04 UTC (permalink / raw)
  To: David S. Miller
  Cc: Eric Dumazet, netdev, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy

On a high-traffic router with many processors and many IPv6 dst
entries, soft lockup in fib6_run_gc() can occur when number of
entries reaches gc_thresh.

This happens because fib6_run_gc() uses fib6_gc_lock to allow
only one thread to run the garbage collector but ip6_dst_gc()
doesn't update net->ipv6.ip6_rt_last_gc until fib6_run_gc()
returns. On a system with many entries, this can take some time
so that in the meantime, other threads pass the tests in
ip6_dst_gc() (ip6_rt_last_gc is still not updated) and wait for
the lock. They then have to run the garbage collector one after
another which blocks them for quite long.

Resolve this by replacing special value ~0UL of expire parameter
to fib6_run_gc() by explicit "force" parameter to choose between
spin_lock_bh() and spin_trylock_bh() and call fib6_run_gc() with
force=false if gc_thresh is reached but not max_size.

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
---
 include/net/ip6_fib.h |  2 +-
 net/ipv6/ip6_fib.c    | 19 ++++++++-----------
 net/ipv6/ndisc.c      |  4 ++--
 net/ipv6/route.c      |  4 ++--
 4 files changed, 13 insertions(+), 16 deletions(-)

diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
index 2a601e7..48ec25a 100644
--- a/include/net/ip6_fib.h
+++ b/include/net/ip6_fib.h
@@ -300,7 +300,7 @@ extern void			inet6_rt_notify(int event, struct rt6_info *rt,
 						struct nl_info *info);
 
 extern void			fib6_run_gc(unsigned long expires,
-					    struct net *net);
+					    struct net *net, bool force);
 
 extern void			fib6_gc_cleanup(void);
 
diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 5fc9c7a..d872553 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -1632,19 +1632,16 @@ static int fib6_age(struct rt6_info *rt, void *arg)
 
 static DEFINE_SPINLOCK(fib6_gc_lock);
 
-void fib6_run_gc(unsigned long expires, struct net *net)
+void fib6_run_gc(unsigned long expires, struct net *net, bool force)
 {
-	if (expires != ~0UL) {
+	if (force) {
 		spin_lock_bh(&fib6_gc_lock);
-		gc_args.timeout = expires ? (int)expires :
-			net->ipv6.sysctl.ip6_rt_gc_interval;
-	} else {
-		if (!spin_trylock_bh(&fib6_gc_lock)) {
-			mod_timer(&net->ipv6.ip6_fib_timer, jiffies + HZ);
-			return;
-		}
-		gc_args.timeout = net->ipv6.sysctl.ip6_rt_gc_interval;
+	} else if (!spin_trylock_bh(&fib6_gc_lock)) {
+		mod_timer(&net->ipv6.ip6_fib_timer, jiffies + HZ);
+		return;
 	}
+	gc_args.timeout = expires ? (int)expires :
+			  net->ipv6.sysctl.ip6_rt_gc_interval;
 
 	gc_args.more = icmp6_dst_gc();
 
@@ -1661,7 +1658,7 @@ void fib6_run_gc(unsigned long expires, struct net *net)
 
 static void fib6_gc_timer_cb(unsigned long arg)
 {
-	fib6_run_gc(0, (struct net *)arg);
+	fib6_run_gc(0, (struct net *)arg, true);
 }
 
 static int __net_init fib6_net_init(struct net *net)
diff --git a/net/ipv6/ndisc.c b/net/ipv6/ndisc.c
index 24c03396..79aa965 100644
--- a/net/ipv6/ndisc.c
+++ b/net/ipv6/ndisc.c
@@ -1576,7 +1576,7 @@ static int ndisc_netdev_event(struct notifier_block *this, unsigned long event,
 	switch (event) {
 	case NETDEV_CHANGEADDR:
 		neigh_changeaddr(&nd_tbl, dev);
-		fib6_run_gc(~0UL, net);
+		fib6_run_gc(0, net, false);
 		idev = in6_dev_get(dev);
 		if (!idev)
 			break;
@@ -1586,7 +1586,7 @@ static int ndisc_netdev_event(struct notifier_block *this, unsigned long event,
 		break;
 	case NETDEV_DOWN:
 		neigh_ifdown(&nd_tbl, dev);
-		fib6_run_gc(~0UL, net);
+		fib6_run_gc(0, net, false);
 		break;
 	case NETDEV_NOTIFY_PEERS:
 		ndisc_send_unsol_na(dev);
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index a8c891a..824c424 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1326,7 +1326,7 @@ static int ip6_dst_gc(struct dst_ops *ops)
 		goto out;
 
 	net->ipv6.ip6_rt_gc_expire++;
-	fib6_run_gc(net->ipv6.ip6_rt_gc_expire, net);
+	fib6_run_gc(net->ipv6.ip6_rt_gc_expire, net, entries > rt_max_size);
 	net->ipv6.ip6_rt_last_gc = now;
 	entries = dst_entries_get_slow(ops);
 	if (entries < ops->gc_thresh)
@@ -2827,7 +2827,7 @@ int ipv6_sysctl_rtcache_flush(struct ctl_table *ctl, int write,
 	net = (struct net *)ctl->extra1;
 	delay = net->ipv6.sysctl.flush_delay;
 	proc_dointvec(ctl, write, buffer, lenp, ppos);
-	fib6_run_gc(delay <= 0 ? ~0UL : (unsigned long)delay, net);
+	fib6_run_gc(delay <= 0 ? 0 : (unsigned long)delay, net, delay > 0);
 	return 0;
 }
 
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH net v2 2/2] ipv6: update ip6_rt_last_gc every time GC is run
  2013-08-01  8:04 [PATCH net v2 0/2] ipv6: prevent fib6_run_gc() contention Michal Kubecek
  2013-08-01  8:04 ` [PATCH net v2 1/2] " Michal Kubecek
@ 2013-08-01  8:04 ` Michal Kubecek
  2013-08-01 21:18 ` [PATCH net v2 0/2] ipv6: prevent fib6_run_gc() contention David Miller
  2 siblings, 0 replies; 4+ messages in thread
From: Michal Kubecek @ 2013-08-01  8:04 UTC (permalink / raw)
  To: David S. Miller
  Cc: Eric Dumazet, netdev, Alexey Kuznetsov, James Morris,
	Hideaki YOSHIFUJI, Patrick McHardy

As pointed out by Eric Dumazet, net->ipv6.ip6_rt_last_gc should
hold the last time garbage collector was run so that we should
update it whenever fib6_run_gc() calls fib6_clean_all(), not only
if we got there from ip6_dst_gc().

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
---
 net/ipv6/ip6_fib.c | 6 +++++-
 net/ipv6/route.c   | 4 +---
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index d872553..bff3d82 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -1634,6 +1634,8 @@ static DEFINE_SPINLOCK(fib6_gc_lock);
 
 void fib6_run_gc(unsigned long expires, struct net *net, bool force)
 {
+	unsigned long now;
+
 	if (force) {
 		spin_lock_bh(&fib6_gc_lock);
 	} else if (!spin_trylock_bh(&fib6_gc_lock)) {
@@ -1646,10 +1648,12 @@ void fib6_run_gc(unsigned long expires, struct net *net, bool force)
 	gc_args.more = icmp6_dst_gc();
 
 	fib6_clean_all(net, fib6_age, 0, NULL);
+	now = jiffies;
+	net->ipv6.ip6_rt_last_gc = now;
 
 	if (gc_args.more)
 		mod_timer(&net->ipv6.ip6_fib_timer,
-			  round_jiffies(jiffies
+			  round_jiffies(now
 					+ net->ipv6.sysctl.ip6_rt_gc_interval));
 	else
 		del_timer(&net->ipv6.ip6_fib_timer);
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 824c424..b70f897 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1311,7 +1311,6 @@ static void icmp6_clean_all(int (*func)(struct rt6_info *rt, void *arg),
 
 static int ip6_dst_gc(struct dst_ops *ops)
 {
-	unsigned long now = jiffies;
 	struct net *net = container_of(ops, struct net, ipv6.ip6_dst_ops);
 	int rt_min_interval = net->ipv6.sysctl.ip6_rt_gc_min_interval;
 	int rt_max_size = net->ipv6.sysctl.ip6_rt_max_size;
@@ -1321,13 +1320,12 @@ static int ip6_dst_gc(struct dst_ops *ops)
 	int entries;
 
 	entries = dst_entries_get_fast(ops);
-	if (time_after(rt_last_gc + rt_min_interval, now) &&
+	if (time_after(rt_last_gc + rt_min_interval, jiffies) &&
 	    entries <= rt_max_size)
 		goto out;
 
 	net->ipv6.ip6_rt_gc_expire++;
 	fib6_run_gc(net->ipv6.ip6_rt_gc_expire, net, entries > rt_max_size);
-	net->ipv6.ip6_rt_last_gc = now;
 	entries = dst_entries_get_slow(ops);
 	if (entries < ops->gc_thresh)
 		net->ipv6.ip6_rt_gc_expire = rt_gc_timeout>>1;
-- 
1.8.1.4

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH net v2 0/2] ipv6: prevent fib6_run_gc() contention
  2013-08-01  8:04 [PATCH net v2 0/2] ipv6: prevent fib6_run_gc() contention Michal Kubecek
  2013-08-01  8:04 ` [PATCH net v2 1/2] " Michal Kubecek
  2013-08-01  8:04 ` [PATCH net v2 2/2] ipv6: update ip6_rt_last_gc every time GC is run Michal Kubecek
@ 2013-08-01 21:18 ` David Miller
  2 siblings, 0 replies; 4+ messages in thread
From: David Miller @ 2013-08-01 21:18 UTC (permalink / raw)
  To: mkubecek; +Cc: eric.dumazet, netdev, kuznet, jmorris, yoshfuji, kaber

From: Michal Kubecek <mkubecek@suse.cz>
Date: Thu,  1 Aug 2013 10:04:04 +0200 (CEST)

> On a high-traffic router with many processors and many IPv6 dst entries,
> soft lockup in fib6_run_gc() can occur when number of entries reaches
> gc_thresh. Prevent this by not waiting for the lock if garbage collector
> is already running as we would only run it again as soon as it finished.
> 
> v2: as suggested by Eric Dumazet, move update of ip6_rt_last_gc into
>     fib6_run_gc() so that it reflects every run of garbage collector
>     rather than only those via ip6_dst_gc()

Applied, thanks.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2013-08-01 21:18 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-08-01  8:04 [PATCH net v2 0/2] ipv6: prevent fib6_run_gc() contention Michal Kubecek
2013-08-01  8:04 ` [PATCH net v2 1/2] " Michal Kubecek
2013-08-01  8:04 ` [PATCH net v2 2/2] ipv6: update ip6_rt_last_gc every time GC is run Michal Kubecek
2013-08-01 21:18 ` [PATCH net v2 0/2] ipv6: prevent fib6_run_gc() contention David Miller

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).