netdev.vger.kernel.org archive mirror
* [PATCH v2 net-next 0/9] ipv6: fib6: socket dst_entry improvements and cleanups
@ 2014-09-21 14:11 Hannes Frederic Sowa
  2014-09-21 14:11 ` [PATCH v2 net-next 1/9] ipv6: support for fib6_clean_* to update fn_sernum Hannes Frederic Sowa
                   ` (9 more replies)
  0 siblings, 10 replies; 14+ messages in thread
From: Hannes Frederic Sowa @ 2014-09-21 14:11 UTC (permalink / raw)
  To: netdev; +Cc: eric.dumazet, hideaki, vyasevich, nicolas.dichtel, kafai

Eric Dumazet noticed that rt6_nodes which are neither RTF_NONEXTHOP nor
RTF_GATEWAY but DST_HOST ones cause major routing lookup churn because
their rt6i_genid is never renewed, thus ip6_dst_check always considers
them outdated. This is a major problem, because this kind of route
is normally used in input handling.

Thus it does not make sense to use rt6i_genid anymore. This series
removes it.
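
As a rough illustration of the churn described above, the following
sketch models the two validity checks in user space. All toy_* names
are hypothetical stand-ins, not the kernel structures; the only point
is that a route whose generation id is never refreshed fails the old
check on every lookup, while the serial-number check alone does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-ins for the kernel structures; not the real definitions. */
struct toy_node  { uint32_t fn_sernum; };
struct toy_route { struct toy_node *node; int genid; };

/* Old scheme: a cached route is valid only if its generation id matches
 * the namespace-wide counter AND the node serial matches the cookie.
 */
static bool toy_check_old(const struct toy_route *rt, int ns_genid,
			  uint32_t cookie)
{
	if (rt->genid != ns_genid)
		return false;
	return rt->node && rt->node->fn_sernum == cookie;
}

/* Scheme after the series: only the node serial number is compared. */
static bool toy_check_new(const struct toy_route *rt, uint32_t cookie)
{
	return rt->node && rt->node->fn_sernum == cookie;
}

/* Number of failed checks out of `lookups` under the old scheme, for a
 * route whose genid is never refreshed after a single bump.
 */
int toy_count_old_misses(int lookups)
{
	struct toy_node n = { .fn_sernum = 7 };
	struct toy_route rt = { .node = &n, .genid = 0 };
	int ns_genid = 1;	/* bumped once; rt.genid stays at 0 */
	int misses = 0;

	assert(lookups >= 0);
	for (int i = 0; i < lookups; i++)
		if (!toy_check_old(&rt, ns_genid, 7))
			misses++;
	return misses;
}

/* Same experiment with the serial-number-only check. */
int toy_count_new_misses(int lookups)
{
	struct toy_node n = { .fn_sernum = 7 };
	struct toy_route rt = { .node = &n, .genid = 0 };
	int misses = 0;

	assert(lookups >= 0);
	for (int i = 0; i < lookups; i++)
		if (!toy_check_new(&rt, 7))
			misses++;
	return misses;
}
```

In this model every lookup misses under the old check, which is the
behaviour the series is meant to get rid of.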

The address insertion path is already covered and does not depend on
rt6i_genid. When we add a new address, we update the fn_sernums while
traversing the tree.

Because inet6_connect_socket depends on dst_check returning NULL also
for source address invalidation, we currently have to walk the whole
tree and update the fn_sernums manually when an address gets deleted.
This is a fairly expensive operation we currently have to do for
address deletion and xfrm policy changes. We currently do that for
interface MTU changes already.
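
To make the cost and the effect of such a walk concrete, here is a
minimal user-space sketch. It assumes a plain binary tree and invented
toy_* names; the real fib6 tree and walker are more involved. Every
node is stamped with a new serial number, so any cookie cached before
the walk stops matching.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tree node; only the serial number matters here. */
struct toy_node {
	struct toy_node *left, *right;
	uint32_t fn_sernum;
};

/* Stamp every node with `sernum` and return the number of nodes
 * visited. The cost is linear in the size of the tree.
 */
static size_t toy_stamp_tree(struct toy_node *n, uint32_t sernum)
{
	if (!n)
		return 0;
	n->fn_sernum = sernum;
	return 1 + toy_stamp_tree(n->left, sernum) +
		   toy_stamp_tree(n->right, sernum);
}

/* A cached entry remembers the serial number seen at lookup time. */
static int toy_cookie_valid(const struct toy_node *n, uint32_t cookie)
{
	return n->fn_sernum == cookie;
}

/* Build a three-node tree, cache a cookie, then flush the tree.
 * Returns 1 if the cookie was valid before and stale afterwards.
 */
int toy_flush_demo(void)
{
	struct toy_node l = { NULL, NULL, 1 }, r = { NULL, NULL, 1 };
	struct toy_node root = { &l, &r, 1 };
	uint32_t cookie = l.fn_sernum;
	int before = toy_cookie_valid(&l, cookie);
	size_t visited = toy_stamp_tree(&root, 2);

	assert(visited == 3);
	return before && !toy_cookie_valid(&l, cookie);
}
```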

I dropped the patch for updating the fn_sernum on deletion as it
showed some side effects with /proc/net/ipv6_route and we currently
don't need it. I stashed it away.

Thanks to Eric Dumazet for noticing the problem with rt6i_genid!

v2 (addressed YOSHIFUJI Hideaki's feedback, thanks!):
* fixed changelog in patch #2
* added patch to rename rt_genid_bump_ipv6 etc.

Regarding the __u32 to u32 conversion, I didn't see anything
problematic or left over; if I missed something there, I am happy to
fix it up in the next version.


Hannes Frederic Sowa (9):
  ipv6: support for fib6_clean_* to update fn_sernum
  ipv6: a bit more typesafety
  ipv6: only generate one new serial number during fib6_add()
  ipv6: if no function for cleaner is specified only visit fib6_nodes
  ipv6: new function fib6_flush_trees and use it instead of bumping
    removed rt6_genid
  ipv6: no need to bump rt_genid_ipv6 on address addition
  ipv6: keep rt_sernum per namespace to reduce number of flushes
  ipv6: switch rt_sernum to atomic_t and clean up types
  ipv6: rename rt_genid_bump_ipv6 to rt6_inval_dst_caches

 include/net/ip6_fib.h           | 16 ++++++--
 include/net/net_namespace.h     | 22 ++++------
 include/net/netns/ipv6.h        |  2 +-
 net/ipv6/addrconf.c             |  3 +-
 net/ipv6/addrconf_core.c        |  6 +++
 net/ipv6/af_inet6.c             |  2 +-
 net/ipv6/ip6_fib.c              | 90 +++++++++++++++++++++++++----------------
 net/ipv6/route.c                |  4 --
 net/xfrm/xfrm_policy.c          |  2 +-
 security/selinux/include/xfrm.h |  2 +-
 10 files changed, 87 insertions(+), 62 deletions(-)

-- 
1.9.3


* [PATCH v2 net-next 1/9] ipv6: support for fib6_clean_* to update fn_sernum
  2014-09-21 14:11 [PATCH v2 net-next 0/9] ipv6: fib6: socket dst_entry improvements and cleanups Hannes Frederic Sowa
@ 2014-09-21 14:11 ` Hannes Frederic Sowa
  2014-09-21 14:11 ` [PATCH v2 net-next 2/9] ipv6: a bit more typesafety Hannes Frederic Sowa
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Hannes Frederic Sowa @ 2014-09-21 14:11 UTC (permalink / raw)
  To: netdev; +Cc: eric.dumazet, hideaki, vyasevich, nicolas.dichtel, kafai

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: YOSHIFUJI Hideaki <hideaki@yoshifuji.org>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Cc: Martin Lau <kafai@fb.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
 include/net/ip6_fib.h |  2 +-
 net/ipv6/ip6_fib.c    | 31 ++++++++++++++++++++-----------
 2 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
index 9bcb220..1cdd46e 100644
--- a/include/net/ip6_fib.h
+++ b/include/net/ip6_fib.h
@@ -64,7 +64,7 @@ struct fib6_node {
 
 	__u16			fn_bit;		/* bit key */
 	__u16			fn_flags;
-	__u32			fn_sernum;
+	u32			fn_sernum;
 	struct rt6_info		*rr_ptr;
 };
 
diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 76b7f5e..900254e 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -60,6 +60,7 @@ struct fib6_cleaner_t {
 	struct fib6_walker_t w;
 	struct net *net;
 	int (*func)(struct rt6_info *, void *arg);
+	u32 sernum;
 	void *arg;
 };
 
@@ -71,7 +72,8 @@ static DEFINE_RWLOCK(fib6_walker_lock);
 #define FWS_INIT FWS_L
 #endif
 
-static void fib6_prune_clones(struct net *net, struct fib6_node *fn);
+static void fib6_prune_clones(struct net *net, struct fib6_node *fn,
+			      u32 sernum);
 static struct rt6_info *fib6_find_prefix(struct net *net, struct fib6_node *fn);
 static struct fib6_node *fib6_repair_tree(struct net *net, struct fib6_node *fn);
 static int fib6_walk(struct fib6_walker_t *w);
@@ -84,7 +86,7 @@ static int fib6_walk_continue(struct fib6_walker_t *w);
  *	result of redirects, path MTU changes, etc.
  */
 
-static __u32 rt_sernum;
+static u32 rt_sernum;
 
 static void fib6_gc_timer_cb(unsigned long arg);
 
@@ -107,11 +109,13 @@ static inline void fib6_walker_unlink(struct fib6_walker_t *w)
 static __inline__ u32 fib6_new_sernum(void)
 {
 	u32 n = ++rt_sernum;
-	if ((__s32)n <= 0)
+	if ((s32)n <= 0)
 		rt_sernum = n = 1;
 	return n;
 }
 
+#define FIB6_NO_SERNUM_CHANGE (0U)
+
 /*
  *	Auxiliary address test functions for the radix tree.
  *
@@ -430,7 +434,7 @@ static struct fib6_node *fib6_add_1(struct fib6_node *root,
 	struct rt6key *key;
 	int	bit;
 	__be32	dir = 0;
-	__u32	sernum = fib6_new_sernum();
+	u32	sernum = fib6_new_sernum();
 
 	RT6_TRACE("fib6_add_1\n");
 
@@ -940,7 +944,8 @@ int fib6_add(struct fib6_node *root, struct rt6_info *rt, struct nl_info *info,
 	if (!err) {
 		fib6_start_gc(info->nl_net, rt);
 		if (!(rt->rt6i_flags & RTF_CACHE))
-			fib6_prune_clones(info->nl_net, pn);
+			fib6_prune_clones(info->nl_net, pn,
+					  FIB6_NO_SERNUM_CHANGE);
 	}
 
 out:
@@ -1374,7 +1379,7 @@ int fib6_del(struct rt6_info *rt, struct nl_info *info)
 			pn = pn->parent;
 		}
 #endif
-		fib6_prune_clones(info->nl_net, pn);
+		fib6_prune_clones(info->nl_net, pn, FIB6_NO_SERNUM_CHANGE);
 	}
 
 	/*
@@ -1521,6 +1526,9 @@ static int fib6_clean_node(struct fib6_walker_t *w)
 		.nl_net = c->net,
 	};
 
+	if (c->sernum != FIB6_NO_SERNUM_CHANGE)
+		c->w.node->fn_sernum = c->sernum;
+
 	for (rt = w->leaf; rt; rt = rt->dst.rt6_next) {
 		res = c->func(rt, c->arg);
 		if (res < 0) {
@@ -1554,7 +1562,7 @@ static int fib6_clean_node(struct fib6_walker_t *w)
 
 static void fib6_clean_tree(struct net *net, struct fib6_node *root,
 			    int (*func)(struct rt6_info *, void *arg),
-			    int prune, void *arg)
+			    int prune, u32 sernum, void *arg)
 {
 	struct fib6_cleaner_t c;
 
@@ -1564,6 +1572,7 @@ static void fib6_clean_tree(struct net *net, struct fib6_node *root,
 	c.w.count = 0;
 	c.w.skip = 0;
 	c.func = func;
+	c.sernum = sernum;
 	c.arg = arg;
 	c.net = net;
 
@@ -1583,7 +1592,7 @@ void fib6_clean_all(struct net *net, int (*func)(struct rt6_info *, void *arg),
 		hlist_for_each_entry_rcu(table, head, tb6_hlist) {
 			write_lock_bh(&table->tb6_lock);
 			fib6_clean_tree(net, &table->tb6_root,
-					func, 0, arg);
+					func, 0, FIB6_NO_SERNUM_CHANGE, arg);
 			write_unlock_bh(&table->tb6_lock);
 		}
 	}
@@ -1600,9 +1609,9 @@ static int fib6_prune_clone(struct rt6_info *rt, void *arg)
 	return 0;
 }
 
-static void fib6_prune_clones(struct net *net, struct fib6_node *fn)
+static void fib6_prune_clones(struct net *net, struct fib6_node *fn, u32 sernum)
 {
-	fib6_clean_tree(net, fn, fib6_prune_clone, 1, NULL);
+	fib6_clean_tree(net, fn, fib6_prune_clone, 1, sernum, NULL);
 }
 
 /*
@@ -1811,7 +1820,7 @@ struct ipv6_route_iter {
 	struct fib6_walker_t w;
 	loff_t skip;
 	struct fib6_table *tbl;
-	__u32 sernum;
+	u32 sernum;
 };
 
 static int ipv6_route_seq_show(struct seq_file *seq, void *v)
-- 
1.9.3


* [PATCH v2 net-next 2/9] ipv6: a bit more typesafety
  2014-09-21 14:11 [PATCH v2 net-next 0/9] ipv6: fib6: socket dst_entry improvements and cleanups Hannes Frederic Sowa
  2014-09-21 14:11 ` [PATCH v2 net-next 1/9] ipv6: support for fib6_clean_* to update fn_sernum Hannes Frederic Sowa
@ 2014-09-21 14:11 ` Hannes Frederic Sowa
  2014-09-21 14:11 ` [PATCH v2 net-next 3/9] ipv6: only generate one new serial number during fib6_add() Hannes Frederic Sowa
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Hannes Frederic Sowa @ 2014-09-21 14:11 UTC (permalink / raw)
  To: netdev; +Cc: eric.dumazet, hideaki, vyasevich, nicolas.dichtel, kafai

Convert prune argument/struct member to bool and fib6_walker_t.state to
the already existing enum.

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: YOSHIFUJI Hideaki <hideaki@yoshifuji.org>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Cc: Martin Lau <kafai@fb.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
 include/net/ip6_fib.h | 14 ++++++++++++--
 net/ipv6/ip6_fib.c    | 17 ++++-------------
 2 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
index 1cdd46e..a09e554 100644
--- a/include/net/ip6_fib.h
+++ b/include/net/ip6_fib.h
@@ -205,12 +205,22 @@ static inline void ip6_rt_put(struct rt6_info *rt)
 	dst_release(&rt->dst);
 }
 
+enum fib_walk_state_t {
+#ifdef CONFIG_IPV6_SUBTREES
+	FWS_S,
+#endif
+	FWS_L,
+	FWS_R,
+	FWS_C,
+	FWS_U
+};
+
 struct fib6_walker_t {
 	struct list_head lh;
 	struct fib6_node *root, *node;
 	struct rt6_info *leaf;
-	unsigned char state;
-	unsigned char prune;
+	enum fib_walk_state_t state;
+	bool prune;
 	unsigned int skip;
 	unsigned int count;
 	int (*func)(struct fib6_walker_t *);
diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 900254e..67599d8 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -46,16 +46,6 @@
 
 static struct kmem_cache *fib6_node_kmem __read_mostly;
 
-enum fib_walk_state_t {
-#ifdef CONFIG_IPV6_SUBTREES
-	FWS_S,
-#endif
-	FWS_L,
-	FWS_R,
-	FWS_C,
-	FWS_U
-};
-
 struct fib6_cleaner_t {
 	struct fib6_walker_t w;
 	struct net *net;
@@ -1562,7 +1552,7 @@ static int fib6_clean_node(struct fib6_walker_t *w)
 
 static void fib6_clean_tree(struct net *net, struct fib6_node *root,
 			    int (*func)(struct rt6_info *, void *arg),
-			    int prune, u32 sernum, void *arg)
+			    bool prune, u32 sernum, void *arg)
 {
 	struct fib6_cleaner_t c;
 
@@ -1592,7 +1582,8 @@ void fib6_clean_all(struct net *net, int (*func)(struct rt6_info *, void *arg),
 		hlist_for_each_entry_rcu(table, head, tb6_hlist) {
 			write_lock_bh(&table->tb6_lock);
 			fib6_clean_tree(net, &table->tb6_root,
-					func, 0, FIB6_NO_SERNUM_CHANGE, arg);
+					func, false, FIB6_NO_SERNUM_CHANGE,
+					arg);
 			write_unlock_bh(&table->tb6_lock);
 		}
 	}
@@ -1611,7 +1602,7 @@ static int fib6_prune_clone(struct rt6_info *rt, void *arg)
 
 static void fib6_prune_clones(struct net *net, struct fib6_node *fn, u32 sernum)
 {
-	fib6_clean_tree(net, fn, fib6_prune_clone, 1, sernum, NULL);
+	fib6_clean_tree(net, fn, fib6_prune_clone, true, sernum, NULL);
 }
 
 /*
-- 
1.9.3


* [PATCH v2 net-next 3/9] ipv6: only generate one new serial number during fib6_add()
  2014-09-21 14:11 [PATCH v2 net-next 0/9] ipv6: fib6: socket dst_entry improvements and cleanups Hannes Frederic Sowa
  2014-09-21 14:11 ` [PATCH v2 net-next 1/9] ipv6: support for fib6_clean_* to update fn_sernum Hannes Frederic Sowa
  2014-09-21 14:11 ` [PATCH v2 net-next 2/9] ipv6: a bit more typesafety Hannes Frederic Sowa
@ 2014-09-21 14:11 ` Hannes Frederic Sowa
  2014-09-21 14:11 ` [PATCH v2 net-next 4/9] ipv6: if no function for cleaner is specified only visit fib6_nodes Hannes Frederic Sowa
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Hannes Frederic Sowa @ 2014-09-21 14:11 UTC (permalink / raw)
  To: netdev; +Cc: eric.dumazet, hideaki, vyasevich, nicolas.dichtel, kafai

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: YOSHIFUJI Hideaki <hideaki@yoshifuji.org>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Cc: Martin Lau <kafai@fb.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
 net/ipv6/ip6_fib.c | 19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 67599d8..d8f0af4 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -415,16 +415,15 @@ out:
  */
 
 static struct fib6_node *fib6_add_1(struct fib6_node *root,
-				     struct in6_addr *addr, int plen,
-				     int offset, int allow_create,
-				     int replace_required)
+				    struct in6_addr *addr, int plen,
+				    int offset, int allow_create,
+				    int replace_required, u32 sernum)
 {
 	struct fib6_node *fn, *in, *ln;
 	struct fib6_node *pn = NULL;
 	struct rt6key *key;
 	int	bit;
 	__be32	dir = 0;
-	u32	sernum = fib6_new_sernum();
 
 	RT6_TRACE("fib6_add_1\n");
 
@@ -842,6 +841,7 @@ int fib6_add(struct fib6_node *root, struct rt6_info *rt, struct nl_info *info,
 	int err = -ENOMEM;
 	int allow_create = 1;
 	int replace_required = 0;
+	u32 sernum = fib6_new_sernum();
 
 	if (info->nlh) {
 		if (!(info->nlh->nlmsg_flags & NLM_F_CREATE))
@@ -854,7 +854,7 @@ int fib6_add(struct fib6_node *root, struct rt6_info *rt, struct nl_info *info,
 
 	fn = fib6_add_1(root, &rt->rt6i_dst.addr, rt->rt6i_dst.plen,
 			offsetof(struct rt6_info, rt6i_dst), allow_create,
-			replace_required);
+			replace_required, sernum);
 	if (IS_ERR(fn)) {
 		err = PTR_ERR(fn);
 		fn = NULL;
@@ -888,14 +888,14 @@ int fib6_add(struct fib6_node *root, struct rt6_info *rt, struct nl_info *info,
 			sfn->leaf = info->nl_net->ipv6.ip6_null_entry;
 			atomic_inc(&info->nl_net->ipv6.ip6_null_entry->rt6i_ref);
 			sfn->fn_flags = RTN_ROOT;
-			sfn->fn_sernum = fib6_new_sernum();
+			sfn->fn_sernum = sernum;
 
 			/* Now add the first leaf node to new subtree */
 
 			sn = fib6_add_1(sfn, &rt->rt6i_src.addr,
 					rt->rt6i_src.plen,
 					offsetof(struct rt6_info, rt6i_src),
-					allow_create, replace_required);
+					allow_create, replace_required, sernum);
 
 			if (IS_ERR(sn)) {
 				/* If it is failed, discard just allocated
@@ -914,7 +914,7 @@ int fib6_add(struct fib6_node *root, struct rt6_info *rt, struct nl_info *info,
 			sn = fib6_add_1(fn->subtree, &rt->rt6i_src.addr,
 					rt->rt6i_src.plen,
 					offsetof(struct rt6_info, rt6i_src),
-					allow_create, replace_required);
+					allow_create, replace_required, sernum);
 
 			if (IS_ERR(sn)) {
 				err = PTR_ERR(sn);
@@ -934,8 +934,7 @@ int fib6_add(struct fib6_node *root, struct rt6_info *rt, struct nl_info *info,
 	if (!err) {
 		fib6_start_gc(info->nl_net, rt);
 		if (!(rt->rt6i_flags & RTF_CACHE))
-			fib6_prune_clones(info->nl_net, pn,
-					  FIB6_NO_SERNUM_CHANGE);
+			fib6_prune_clones(info->nl_net, pn, sernum);
 	}
 
 out:
-- 
1.9.3


* [PATCH v2 net-next 4/9] ipv6: if no function for cleaner is specified only visit fib6_nodes
  2014-09-21 14:11 [PATCH v2 net-next 0/9] ipv6: fib6: socket dst_entry improvements and cleanups Hannes Frederic Sowa
                   ` (2 preceding siblings ...)
  2014-09-21 14:11 ` [PATCH v2 net-next 3/9] ipv6: only generate one new serial number during fib6_add() Hannes Frederic Sowa
@ 2014-09-21 14:11 ` Hannes Frederic Sowa
  2014-09-21 14:11 ` [PATCH v2 net-next 5/9] ipv6: new function fib6_flush_trees and use it instead of bumping removed rt6_genid Hannes Frederic Sowa
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Hannes Frederic Sowa @ 2014-09-21 14:11 UTC (permalink / raw)
  To: netdev; +Cc: eric.dumazet, hideaki, vyasevich, nicolas.dichtel, kafai

We now allow NULL rt6_info walker functions, so that we only visit nodes.
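
A simplified model of this behaviour, using hypothetical toy_* types
rather than the kernel's fib6_cleaner_t and fib6_walker_t: when the
callback is NULL, the node is stamped and the attached routes are
skipped. The kernel code warns via WARN_ON_ONCE where this sketch uses
assert().

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NO_SERNUM_CHANGE 0

struct toy_route { struct toy_route *next; int visited; };
struct toy_node  { struct toy_route *leaf; int fn_sernum; };

struct toy_cleaner {
	int (*func)(struct toy_route *rt, void *arg);	/* may be NULL */
	int sernum;
	void *arg;
};

/* Visit one node: optionally stamp it, then walk its routes only if a
 * callback was supplied. Returns the number of routes examined.
 */
static int toy_clean_node(struct toy_node *n, const struct toy_cleaner *c)
{
	int seen = 0;

	if (c->sernum != TOY_NO_SERNUM_CHANGE)
		n->fn_sernum = c->sernum;

	if (!c->func) {
		/* A walk with neither callback nor serial does nothing. */
		assert(c->sernum != TOY_NO_SERNUM_CHANGE);
		return 0;
	}

	for (struct toy_route *rt = n->leaf; rt; rt = rt->next) {
		c->func(rt, c->arg);
		seen++;
	}
	return seen;
}

static int toy_mark(struct toy_route *rt, void *arg)
{
	(void)arg;
	rt->visited = 1;
	return 0;
}

/* Returns 1 if a NULL-callback walk stamps the node without touching
 * its routes and a later callback walk visits both routes.
 */
int toy_node_only_demo(void)
{
	struct toy_route b = { NULL, 0 }, a = { &b, 0 };
	struct toy_node n = { &a, 1 };
	struct toy_cleaner stamp_only = { NULL, 5, NULL };
	struct toy_cleaner with_cb = { toy_mark, TOY_NO_SERNUM_CHANGE, NULL };

	int r1 = toy_clean_node(&n, &stamp_only);
	int untouched = !a.visited && !b.visited;
	int r2 = toy_clean_node(&n, &with_cb);

	return r1 == 0 && untouched && n.fn_sernum == 5 &&
	       r2 == 2 && a.visited && b.visited;
}
```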

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: YOSHIFUJI Hideaki <hideaki@yoshifuji.org>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Cc: Martin Lau <kafai@fb.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
 net/ipv6/ip6_fib.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index d8f0af4..4dfadd4 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -1518,6 +1518,12 @@ static int fib6_clean_node(struct fib6_walker_t *w)
 	if (c->sernum != FIB6_NO_SERNUM_CHANGE)
 		c->w.node->fn_sernum = c->sernum;
 
+	if (!c->func) {
+		WARN_ON_ONCE(c->sernum == FIB6_NO_SERNUM_CHANGE);
+		w->leaf = NULL;
+		return 0;
+	}
+
 	for (rt = w->leaf; rt; rt = rt->dst.rt6_next) {
 		res = c->func(rt, c->arg);
 		if (res < 0) {
-- 
1.9.3


* [PATCH v2 net-next 5/9] ipv6: new function fib6_flush_trees and use it instead of bumping removed rt6_genid
  2014-09-21 14:11 [PATCH v2 net-next 0/9] ipv6: fib6: socket dst_entry improvements and cleanups Hannes Frederic Sowa
                   ` (3 preceding siblings ...)
  2014-09-21 14:11 ` [PATCH v2 net-next 4/9] ipv6: if no function for cleaner is specified only visit fib6_nodes Hannes Frederic Sowa
@ 2014-09-21 14:11 ` Hannes Frederic Sowa
  2014-09-21 14:11 ` [PATCH v2 net-next 6/9] ipv6: no need to bump rt_genid_ipv6 on address addition Hannes Frederic Sowa
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Hannes Frederic Sowa @ 2014-09-21 14:11 UTC (permalink / raw)
  To: netdev; +Cc: eric.dumazet, hideaki, vyasevich, nicolas.dichtel, kafai

fib6_flush_trees is still a very costly operation but now is only called
by xfrm code when a policy changes or ipv6 addresses are added/removed.

fib6_flush_trees must walk all ipv6 routing tables and modify fn_sernum,
so all sockets relookup their dst_entries. Use a NULL callback, so we
only walk the nodes without looking at the rt6_infos.
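
The patch exposes the flush through a function pointer that the ipv6
code fills in at init time, so callers outside of it only invoke the
flush when a provider is present. The pattern can be sketched in user
space as follows; the toy_* names and the two-table layout are
assumptions made for illustration, not kernel API.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NTABLES 2

struct toy_net {
	int table_sernum[TOY_NTABLES];	/* one serial per toy "table" */
	int next_sernum;
};

/* Hook that the optional component sets when it initialises. */
static void (*toy_flush_hook)(struct toy_net *);

/* Caller side: safe whether or not the hook has been registered. */
static void toy_invalidate(struct toy_net *net)
{
	if (toy_flush_hook)
		toy_flush_hook(net);
}

/* Provider side: take one new serial number and stamp every table. */
static void toy_flush_tables(struct toy_net *net)
{
	int sernum;

	assert(net != NULL);
	sernum = ++net->next_sernum;
	for (size_t i = 0; i < TOY_NTABLES; i++)
		net->table_sernum[i] = sernum;
}

/* Returns 1 if the call is a no-op before registration and stamps all
 * tables with the same new serial number afterwards.
 */
int toy_hook_demo(void)
{
	struct toy_net net = { { 1, 1 }, 1 };

	toy_flush_hook = NULL;
	toy_invalidate(&net);		/* no provider yet: nothing happens */
	if (net.table_sernum[0] != 1 || net.table_sernum[1] != 1)
		return 0;

	toy_flush_hook = toy_flush_tables;	/* stands in for module init */
	toy_invalidate(&net);
	return net.table_sernum[0] == 2 && net.table_sernum[1] == 2;
}
```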

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: YOSHIFUJI Hideaki <hideaki@yoshifuji.org>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Cc: Martin Lau <kafai@fb.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
 include/net/net_namespace.h | 14 +++-----------
 include/net/netns/ipv6.h    |  1 -
 net/ipv6/addrconf_core.c    |  6 ++++++
 net/ipv6/af_inet6.c         |  1 -
 net/ipv6/ip6_fib.c          | 21 +++++++++++++++++----
 net/ipv6/route.c            |  4 ----
 6 files changed, 26 insertions(+), 21 deletions(-)

diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index 361d260..61aad36 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -353,21 +353,13 @@ static inline void rt_genid_bump_ipv4(struct net *net)
 }
 
 #if IS_ENABLED(CONFIG_IPV6)
-static inline int rt_genid_ipv6(struct net *net)
-{
-	return atomic_read(&net->ipv6.rt_genid);
-}
-
+extern void (*__fib6_flush_trees)(struct net *);
 static inline void rt_genid_bump_ipv6(struct net *net)
 {
-	atomic_inc(&net->ipv6.rt_genid);
+	if (__fib6_flush_trees)
+		__fib6_flush_trees(net);
 }
 #else
-static inline int rt_genid_ipv6(struct net *net)
-{
-	return 0;
-}
-
 static inline void rt_genid_bump_ipv6(struct net *net)
 {
 }
diff --git a/include/net/netns/ipv6.h b/include/net/netns/ipv6.h
index eade27a..3291ba6 100644
--- a/include/net/netns/ipv6.h
+++ b/include/net/netns/ipv6.h
@@ -76,7 +76,6 @@ struct netns_ipv6 {
 #endif
 #endif
 	atomic_t		dev_addr_genid;
-	atomic_t		rt_genid;
 };
 
 #if IS_ENABLED(CONFIG_NF_DEFRAG_IPV6)
diff --git a/net/ipv6/addrconf_core.c b/net/ipv6/addrconf_core.c
index e696045..8b2d99a 100644
--- a/net/ipv6/addrconf_core.c
+++ b/net/ipv6/addrconf_core.c
@@ -10,6 +10,12 @@
 
 #define IPV6_ADDR_SCOPE_TYPE(scope)	((scope) << 16)
 
+/* if ipv6 module registers this function is used by xfrm to force
+ * all sockets to relookup their nodes - this is fairly expensive
+ */
+void (*__fib6_flush_trees)(struct net *);
+EXPORT_SYMBOL(__fib6_flush_trees);
+
 static inline unsigned int ipv6_addr_scope2type(unsigned int scope)
 {
 	switch (scope) {
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index e4865a3..2189d2d 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -766,7 +766,6 @@ static int __net_init inet6_net_init(struct net *net)
 	net->ipv6.sysctl.icmpv6_time = 1*HZ;
 	net->ipv6.sysctl.flowlabel_consistency = 1;
 	net->ipv6.sysctl.auto_flowlabels = 0;
-	atomic_set(&net->ipv6.rt_genid, 0);
 
 	err = ipv6_init_mibs(net);
 	if (err)
diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 4dfadd4..0a97216 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -1574,8 +1574,9 @@ static void fib6_clean_tree(struct net *net, struct fib6_node *root,
 	fib6_walk(&c.w);
 }
 
-void fib6_clean_all(struct net *net, int (*func)(struct rt6_info *, void *arg),
-		    void *arg)
+void __fib6_clean_all(struct net *net,
+		      int (*func)(struct rt6_info *, void *arg),
+		      u32 sernum, void *arg)
 {
 	struct fib6_table *table;
 	struct hlist_head *head;
@@ -1587,14 +1588,24 @@ void fib6_clean_all(struct net *net, int (*func)(struct rt6_info *, void *arg),
 		hlist_for_each_entry_rcu(table, head, tb6_hlist) {
 			write_lock_bh(&table->tb6_lock);
 			fib6_clean_tree(net, &table->tb6_root,
-					func, false, FIB6_NO_SERNUM_CHANGE,
-					arg);
+					func, false, sernum, arg);
 			write_unlock_bh(&table->tb6_lock);
 		}
 	}
 	rcu_read_unlock();
 }
 
+void fib6_clean_all(struct net *net, int (*func)(struct rt6_info *, void *arg),
+		    void *arg)
+{
+	__fib6_clean_all(net, func, FIB6_NO_SERNUM_CHANGE, arg);
+}
+
+static void fib6_flush_trees(struct net *net)
+{
+	__fib6_clean_all(net, NULL, fib6_new_sernum(), NULL);
+}
+
 static int fib6_prune_clone(struct rt6_info *rt, void *arg)
 {
 	if (rt->rt6i_flags & RTF_CACHE) {
@@ -1793,6 +1804,8 @@ int __init fib6_init(void)
 			      NULL);
 	if (ret)
 		goto out_unregister_subsys;
+
+	__fib6_flush_trees = fib6_flush_trees;
 out:
 	return ret;
 
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index f74b041..a318dd89 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -314,7 +314,6 @@ static inline struct rt6_info *ip6_dst_alloc(struct net *net,
 
 		memset(dst + 1, 0, sizeof(*rt) - sizeof(*dst));
 		rt6_init_peer(rt, table ? &table->tb6_peers : net->ipv6.peers);
-		rt->rt6i_genid = rt_genid_ipv6(net);
 		INIT_LIST_HEAD(&rt->rt6i_siblings);
 	}
 	return rt;
@@ -1096,9 +1095,6 @@ static struct dst_entry *ip6_dst_check(struct dst_entry *dst, u32 cookie)
 	 * DST_OBSOLETE_FORCE_CHK which forces validation calls down
 	 * into this function always.
 	 */
-	if (rt->rt6i_genid != rt_genid_ipv6(dev_net(rt->dst.dev)))
-		return NULL;
-
 	if (!rt->rt6i_node || (rt->rt6i_node->fn_sernum != cookie))
 		return NULL;
 
-- 
1.9.3


* [PATCH v2 net-next 6/9] ipv6: no need to bump rt_genid_ipv6 on address addition
  2014-09-21 14:11 [PATCH v2 net-next 0/9] ipv6: fib6: socket dst_entry improvements and cleanups Hannes Frederic Sowa
                   ` (4 preceding siblings ...)
  2014-09-21 14:11 ` [PATCH v2 net-next 5/9] ipv6: new function fib6_flush_trees and use it instead of bumping removed rt6_genid Hannes Frederic Sowa
@ 2014-09-21 14:11 ` Hannes Frederic Sowa
  2014-09-21 14:11 ` [PATCH v2 net-next 7/9] ipv6: keep rt_sernum per namespace to reduce number of flushes Hannes Frederic Sowa
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Hannes Frederic Sowa @ 2014-09-21 14:11 UTC (permalink / raw)
  To: netdev; +Cc: eric.dumazet, hideaki, vyasevich, nicolas.dichtel, kafai

fn_sernum takes care that on address insertion the sockets throw away
their cached dst_entries and do a relookup.

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: YOSHIFUJI Hideaki <hideaki@yoshifuji.org>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Cc: Martin Lau <kafai@fb.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
 net/ipv6/addrconf.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 39d3335..a2d2626 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -4781,10 +4781,11 @@ static void __ipv6_ifa_notify(int event, struct inet6_ifaddr *ifp)
 
 		if (ip6_del_rt(ifp->rt))
 			dst_free(&ifp->rt->dst);
+
+		rt_genid_bump_ipv6(net);
 		break;
 	}
 	atomic_inc(&net->ipv6.dev_addr_genid);
-	rt_genid_bump_ipv6(net);
 }
 
 static void ipv6_ifa_notify(int event, struct inet6_ifaddr *ifp)
-- 
1.9.3


* [PATCH v2 net-next 7/9] ipv6: keep rt_sernum per namespace to reduce number of flushes
  2014-09-21 14:11 [PATCH v2 net-next 0/9] ipv6: fib6: socket dst_entry improvements and cleanups Hannes Frederic Sowa
                   ` (5 preceding siblings ...)
  2014-09-21 14:11 ` [PATCH v2 net-next 6/9] ipv6: no need to bump rt_genid_ipv6 on address addition Hannes Frederic Sowa
@ 2014-09-21 14:11 ` Hannes Frederic Sowa
  2014-09-21 14:11 ` [PATCH v2 net-next 8/9] ipv6: switch rt_sernum to atomic_t and clean up types Hannes Frederic Sowa
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 14+ messages in thread
From: Hannes Frederic Sowa @ 2014-09-21 14:11 UTC (permalink / raw)
  To: netdev; +Cc: eric.dumazet, hideaki, vyasevich, nicolas.dichtel, kafai

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: YOSHIFUJI Hideaki <hideaki@yoshifuji.org>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Cc: Martin Lau <kafai@fb.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
 include/net/netns/ipv6.h |  1 +
 net/ipv6/af_inet6.c      |  1 +
 net/ipv6/ip6_fib.c       | 19 ++++++++++---------
 3 files changed, 12 insertions(+), 9 deletions(-)

diff --git a/include/net/netns/ipv6.h b/include/net/netns/ipv6.h
index 3291ba6..2319949 100644
--- a/include/net/netns/ipv6.h
+++ b/include/net/netns/ipv6.h
@@ -76,6 +76,7 @@ struct netns_ipv6 {
 #endif
 #endif
 	atomic_t		dev_addr_genid;
+	u32			rt_sernum;
 };
 
 #if IS_ENABLED(CONFIG_NF_DEFRAG_IPV6)
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index 2189d2d..7ff8996 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -766,6 +766,7 @@ static int __net_init inet6_net_init(struct net *net)
 	net->ipv6.sysctl.icmpv6_time = 1*HZ;
 	net->ipv6.sysctl.flowlabel_consistency = 1;
 	net->ipv6.sysctl.auto_flowlabels = 0;
+	net->ipv6.rt_sernum = 1;
 
 	err = ipv6_init_mibs(net);
 	if (err)
diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 0a97216..9f973e4 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -76,8 +76,6 @@ static int fib6_walk_continue(struct fib6_walker_t *w);
  *	result of redirects, path MTU changes, etc.
  */
 
-static u32 rt_sernum;
-
 static void fib6_gc_timer_cb(unsigned long arg);
 
 static LIST_HEAD(fib6_walkers);
@@ -96,12 +94,15 @@ static inline void fib6_walker_unlink(struct fib6_walker_t *w)
 	list_del(&w->lh);
 	write_unlock_bh(&fib6_walker_lock);
 }
-static __inline__ u32 fib6_new_sernum(void)
+
+static u32 fib6_new_sernum(struct net *net)
 {
-	u32 n = ++rt_sernum;
-	if ((s32)n <= 0)
-		rt_sernum = n = 1;
-	return n;
+	int *n = &net->ipv6.rt_sernum;
+
+	++*n;
+	if ((s32)*n <= 0)
+		*n = 1;
+	return *n;
 }
 
 #define FIB6_NO_SERNUM_CHANGE (0U)
@@ -841,7 +842,7 @@ int fib6_add(struct fib6_node *root, struct rt6_info *rt, struct nl_info *info,
 	int err = -ENOMEM;
 	int allow_create = 1;
 	int replace_required = 0;
-	u32 sernum = fib6_new_sernum();
+	u32 sernum = fib6_new_sernum(dev_net(rt->dst.dev));
 
 	if (info->nlh) {
 		if (!(info->nlh->nlmsg_flags & NLM_F_CREATE))
@@ -1603,7 +1604,7 @@ void fib6_clean_all(struct net *net, int (*func)(struct rt6_info *, void *arg),
 
 static void fib6_flush_trees(struct net *net)
 {
-	__fib6_clean_all(net, NULL, fib6_new_sernum(), NULL);
+	__fib6_clean_all(net, NULL, fib6_new_sernum(net), NULL);
 }
 
 static int fib6_prune_clone(struct rt6_info *rt, void *arg)
-- 
1.9.3


* [PATCH v2 net-next 8/9] ipv6: switch rt_sernum to atomic_t and clean up types
  2014-09-21 14:11 [PATCH v2 net-next 0/9] ipv6: fib6: socket dst_entry improvements and cleanups Hannes Frederic Sowa
                   ` (6 preceding siblings ...)
  2014-09-21 14:11 ` [PATCH v2 net-next 7/9] ipv6: keep rt_sernum per namespace to reduce number of flushes Hannes Frederic Sowa
@ 2014-09-21 14:11 ` Hannes Frederic Sowa
  2014-09-21 14:11 ` [PATCH v2 net-next 9/9] ipv6: rename rt_genid_bump_ipv6 to rt6_inval_dst_caches Hannes Frederic Sowa
  2014-09-26  4:28 ` [PATCH v2 net-next 0/9] ipv6: fib6: socket dst_entry improvements and cleanups David Miller
  9 siblings, 0 replies; 14+ messages in thread
From: Hannes Frederic Sowa @ 2014-09-21 14:11 UTC (permalink / raw)
  To: netdev; +Cc: eric.dumazet, hideaki, vyasevich, nicolas.dichtel, kafai

Switch rt_sernum to atomic_t, make it concurrency safe (the old scheme
looked broken to me) and switch from u32 to int types for the fn_sernum.
(fib6_new_sernum only gets used with table locks, but different tables
can get mutated at the same time.)
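
For readers who want to try the wrap-around scheme outside the kernel,
the following is a user-space approximation using C11 <stdatomic.h>
instead of the kernel's atomic_t helpers. The toy_* names are
hypothetical. The counter moves from INT_MAX back to 1, so 0 stays
free as the "no change" value.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

/* Advance the counter and return the new value. Wraps from INT_MAX to
 * 1 so that 0 is never handed out.
 */
int toy_new_sernum(atomic_int *counter)
{
	int old = atomic_load(counter);
	int next;

	do {
		next = old < INT_MAX ? old + 1 : 1;
		/* On failure, `old` is reloaded with the current value. */
	} while (!atomic_compare_exchange_weak(counter, &old, next));

	return next;
}

/* Helper for experiments: serial number handed out after `start`. */
int toy_sernum_after(int start)
{
	atomic_int c;
	int n;

	atomic_init(&c, start);
	n = toy_new_sernum(&c);
	assert(n > 0);
	return n;
}
```

The retry loop matters when two tables are modified concurrently: both
callers get a positive serial number, and neither update to the counter
is lost.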

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: YOSHIFUJI Hideaki <hideaki@yoshifuji.org>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Cc: Martin Lau <kafai@fb.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
 include/net/ip6_fib.h    |  2 +-
 include/net/netns/ipv6.h |  2 +-
 net/ipv6/af_inet6.c      |  2 +-
 net/ipv6/ip6_fib.c       | 29 +++++++++++++++--------------
 4 files changed, 18 insertions(+), 17 deletions(-)

diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
index a09e554..5440f99 100644
--- a/include/net/ip6_fib.h
+++ b/include/net/ip6_fib.h
@@ -64,7 +64,7 @@ struct fib6_node {
 
 	__u16			fn_bit;		/* bit key */
 	__u16			fn_flags;
-	u32			fn_sernum;
+	int			fn_sernum;
 	struct rt6_info		*rr_ptr;
 };
 
diff --git a/include/net/netns/ipv6.h b/include/net/netns/ipv6.h
index 2319949..7dee21b 100644
--- a/include/net/netns/ipv6.h
+++ b/include/net/netns/ipv6.h
@@ -76,7 +76,7 @@ struct netns_ipv6 {
 #endif
 #endif
 	atomic_t		dev_addr_genid;
-	u32			rt_sernum;
+	atomic_t		rt_sernum;
 };
 
 #if IS_ENABLED(CONFIG_NF_DEFRAG_IPV6)
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index 7ff8996..6cde9b4 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -766,7 +766,7 @@ static int __net_init inet6_net_init(struct net *net)
 	net->ipv6.sysctl.icmpv6_time = 1*HZ;
 	net->ipv6.sysctl.flowlabel_consistency = 1;
 	net->ipv6.sysctl.auto_flowlabels = 0;
-	net->ipv6.rt_sernum = 1;
+	atomic_set(&net->ipv6.rt_sernum, 1);
 
 	err = ipv6_init_mibs(net);
 	if (err)
diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 9f973e4..ae87d0c 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -50,7 +50,7 @@ struct fib6_cleaner_t {
 	struct fib6_walker_t w;
 	struct net *net;
 	int (*func)(struct rt6_info *, void *arg);
-	u32 sernum;
+	int sernum;
 	void *arg;
 };
 
@@ -63,7 +63,7 @@ static DEFINE_RWLOCK(fib6_walker_lock);
 #endif
 
 static void fib6_prune_clones(struct net *net, struct fib6_node *fn,
-			      u32 sernum);
+			      int sernum);
 static struct rt6_info *fib6_find_prefix(struct net *net, struct fib6_node *fn);
 static struct fib6_node *fib6_repair_tree(struct net *net, struct fib6_node *fn);
 static int fib6_walk(struct fib6_walker_t *w);
@@ -97,15 +97,16 @@ static inline void fib6_walker_unlink(struct fib6_walker_t *w)
 
 static u32 fib6_new_sernum(struct net *net)
 {
-	int *n = &net->ipv6.rt_sernum;
+	int old, new;
 
-	++*n;
-	if ((s32)*n <= 0)
-		*n = 1;
-	return *n;
+	do {
+		old = atomic_read(&net->ipv6.rt_sernum);
+		new = old < INT_MAX ? old + 1 : 1;
+	} while (atomic_cmpxchg(&net->ipv6.rt_sernum, old, new) != old);
+	return new;
 }
 
-#define FIB6_NO_SERNUM_CHANGE (0U)
+#define FIB6_NO_SERNUM_CHANGE (0)
 
 /*
  *	Auxiliary address test functions for the radix tree.
@@ -418,7 +419,7 @@ out:
 static struct fib6_node *fib6_add_1(struct fib6_node *root,
 				    struct in6_addr *addr, int plen,
 				    int offset, int allow_create,
-				    int replace_required, u32 sernum)
+				    int replace_required, int sernum)
 {
 	struct fib6_node *fn, *in, *ln;
 	struct fib6_node *pn = NULL;
@@ -842,7 +843,7 @@ int fib6_add(struct fib6_node *root, struct rt6_info *rt, struct nl_info *info,
 	int err = -ENOMEM;
 	int allow_create = 1;
 	int replace_required = 0;
-	u32 sernum = fib6_new_sernum(dev_net(rt->dst.dev));
+	int sernum = fib6_new_sernum(dev_net(rt->dst.dev));
 
 	if (info->nlh) {
 		if (!(info->nlh->nlmsg_flags & NLM_F_CREATE))
@@ -1558,7 +1559,7 @@ static int fib6_clean_node(struct fib6_walker_t *w)
 
 static void fib6_clean_tree(struct net *net, struct fib6_node *root,
 			    int (*func)(struct rt6_info *, void *arg),
-			    bool prune, u32 sernum, void *arg)
+			    bool prune, int sernum, void *arg)
 {
 	struct fib6_cleaner_t c;
 
@@ -1577,7 +1578,7 @@ static void fib6_clean_tree(struct net *net, struct fib6_node *root,
 
 void __fib6_clean_all(struct net *net,
 		      int (*func)(struct rt6_info *, void *arg),
-		      u32 sernum, void *arg)
+		      int sernum, void *arg)
 {
 	struct fib6_table *table;
 	struct hlist_head *head;
@@ -1617,7 +1618,7 @@ static int fib6_prune_clone(struct rt6_info *rt, void *arg)
 	return 0;
 }
 
-static void fib6_prune_clones(struct net *net, struct fib6_node *fn, u32 sernum)
+static void fib6_prune_clones(struct net *net, struct fib6_node *fn, int sernum)
 {
 	fib6_clean_tree(net, fn, fib6_prune_clone, true, sernum, NULL);
 }
@@ -1830,7 +1831,7 @@ struct ipv6_route_iter {
 	struct fib6_walker_t w;
 	loff_t skip;
 	struct fib6_table *tbl;
-	u32 sernum;
+	int sernum;
 };
 
 static int ipv6_route_seq_show(struct seq_file *seq, void *v)
-- 
1.9.3
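
For readers following the fib6_new_sernum() hunk above, here is a minimal
userspace sketch of the same wrap-around compare-and-swap loop. It assumes
C11 <stdatomic.h> in place of the kernel's atomic_t and atomic_cmpxchg().
The name toy_new_sernum and the caller-supplied counter are hypothetical
and not part of the patch.

```c
#include <limits.h>
#include <stdatomic.h>

/*
 * Toy analogue of fib6_new_sernum(): hand out the next serial number,
 * wrapping from INT_MAX back to 1 so that 0 is never returned
 * (0 is reserved as FIB6_NO_SERNUM_CHANGE in the patch).
 */
static int toy_new_sernum(atomic_int *counter)
{
	int old, new;

	old = atomic_load(counter);
	do {
		new = old < INT_MAX ? old + 1 : 1;
		/* On failure the call stores the current value into 'old'. */
	} while (!atomic_compare_exchange_weak(counter, &old, new));

	return new;
}
```

The loop retries only when another thread changed the counter between the
read and the exchange. Wrapping INT_MAX back to 1 keeps 0 free to mean
"no serial number change".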

* [PATCH v2 net-next 9/9] ipv6: rename rt_genid_bump_ipv6 to rt6_inval_dst_caches
  2014-09-21 14:11 [PATCH v2 net-next 0/9] ipv6: fib6: socket dst_entry improvments and cleanups Hannes Frederic Sowa
                   ` (7 preceding siblings ...)
  2014-09-21 14:11 ` [PATCH v2 net-next 8/9] ipv6: switch rt_sernum to atomic_t and clean up types Hannes Frederic Sowa
@ 2014-09-21 14:11 ` Hannes Frederic Sowa
  2014-09-26  4:28 ` [PATCH v2 net-next 0/9] ipv6: fib6: socket dst_entry improvments and cleanups David Miller
  9 siblings, 0 replies; 14+ messages in thread
From: Hannes Frederic Sowa @ 2014-09-21 14:11 UTC (permalink / raw)
  To: netdev; +Cc: eric.dumazet, hideaki, vyasevich, nicolas.dichtel, kafai

Also rename the IPv4- and IPv6-agnostic rt_genid_bump_all to
dst_inval_caches, as callers don't care how the flushing is implemented
in the protocols.

Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: YOSHIFUJI Hideaki <hideaki@yoshifuji.org>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Cc: Martin Lau <kafai@fb.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
 include/net/net_namespace.h     | 8 ++++----
 net/ipv6/addrconf.c             | 2 +-
 net/xfrm/xfrm_policy.c          | 2 +-
 security/selinux/include/xfrm.h | 2 +-
 4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index 61aad36..e73b80f 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -354,13 +354,13 @@ static inline void rt_genid_bump_ipv4(struct net *net)
 
 #if IS_ENABLED(CONFIG_IPV6)
 extern void (*__fib6_flush_trees)(struct net *);
-static inline void rt_genid_bump_ipv6(struct net *net)
+static inline void rt6_dst_inval_caches(struct net *net)
 {
 	if (__fib6_flush_trees)
 		__fib6_flush_trees(net);
 }
 #else
-static inline void rt_genid_bump_ipv6(struct net *net)
+static inline void rt6_dst_inval_caches(struct net *net)
 {
 }
 #endif
@@ -374,10 +374,10 @@ net_ieee802154_lowpan(struct net *net)
 #endif
 
 /* For callers who don't really care about whether it's IPv4 or IPv6 */
-static inline void rt_genid_bump_all(struct net *net)
+static inline void dst_inval_caches(struct net *net)
 {
 	rt_genid_bump_ipv4(net);
-	rt_genid_bump_ipv6(net);
+	rt6_dst_inval_caches(net);
 }
 
 static inline int fnhe_genid(struct net *net)
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index a2d2626..0c2aade 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -4782,7 +4782,7 @@ static void __ipv6_ifa_notify(int event, struct inet6_ifaddr *ifp)
 		if (ip6_del_rt(ifp->rt))
 			dst_free(&ifp->rt->dst);
 
-		rt_genid_bump_ipv6(net);
+		rt6_dst_inval_caches(net);
 		break;
 	}
 	atomic_inc(&net->ipv6.dev_addr_genid);
diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index beeed60..6d09195 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -665,7 +665,7 @@ int xfrm_policy_insert(int dir, struct xfrm_policy *policy, int excl)
 	if (policy->family == AF_INET)
 		rt_genid_bump_ipv4(net);
 	else
-		rt_genid_bump_ipv6(net);
+		rt6_dst_inval_caches(net);
 
 	if (delpol) {
 		xfrm_policy_requeue(delpol, policy);
diff --git a/security/selinux/include/xfrm.h b/security/selinux/include/xfrm.h
index 1450f85..a1c5f97 100644
--- a/security/selinux/include/xfrm.h
+++ b/security/selinux/include/xfrm.h
@@ -49,7 +49,7 @@ static inline void selinux_xfrm_notify_policyload(void)
 	rtnl_lock();
 	for_each_net(net) {
 		atomic_inc(&net->xfrm.flow_cache_genid);
-		rt_genid_bump_all(net);
+		dst_inval_caches(net);
 	}
 	rtnl_unlock();
 }
-- 
1.9.3
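
The renamed helpers above hide the flush mechanism behind an optional
function pointer, as __fib6_flush_trees does in the net_namespace.h hunk.
The following is a rough userspace sketch of that pattern only. All names
(toy_net, toy_flush_hook, toy_inval_caches, toy_flush) are hypothetical
stand-ins, not kernel symbols.

```c
#include <stddef.h>

/* Minimal stand-in for struct net; only counts how often it was flushed. */
struct toy_net {
	int flushes;
};

/* Stays NULL until the protocol code registers its flush routine. */
static void (*toy_flush_hook)(struct toy_net *);

/* Callers invalidate caches without knowing how, or whether, it is done. */
static void toy_inval_caches(struct toy_net *net)
{
	if (toy_flush_hook)
		toy_flush_hook(net);
}

/* Example flush routine a protocol implementation might register. */
static void toy_flush(struct toy_net *net)
{
	net->flushes++;
}
```

Until toy_flush_hook is assigned, toy_inval_caches() is a no-op, which
mirrors the NULL check in the patch. After assigning toy_flush to the hook,
each call flushes once.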

* Re: [PATCH v2 net-next 0/9] ipv6: fib6: socket dst_entry improvments and cleanups
  2014-09-21 14:11 [PATCH v2 net-next 0/9] ipv6: fib6: socket dst_entry improvments and cleanups Hannes Frederic Sowa
                   ` (8 preceding siblings ...)
  2014-09-21 14:11 ` [PATCH v2 net-next 9/9] ipv6: rename rt_genid_bump_ipv6 to rt6_inval_dst_caches Hannes Frederic Sowa
@ 2014-09-26  4:28 ` David Miller
  2014-09-26  7:58   ` Hannes Frederic Sowa
  9 siblings, 1 reply; 14+ messages in thread
From: David Miller @ 2014-09-26  4:28 UTC (permalink / raw)
  To: hannes; +Cc: netdev, eric.dumazet, hideaki, vyasevich, nicolas.dichtel, kafai

From: Hannes Frederic Sowa <hannes@stressinduktion.org>
Date: Sun, 21 Sep 2014 16:11:44 +0200

> Eric Dumazet noticed that rt6_nodes wich are neither RTF_NONEXTHOP nor
> RTF_GATEWAY but DST_HOST ones cause major routing lookup churn because
> their rt6_genid is never renewed, thus ip6_dst_check always considers
> them outdated. This is a major problem, because these kind of routes
> are normally used to in input handling.

This series is a disappointment for me from the perspective of the
fact that we have a regression in mainline and this is too complex
of a set of changes for there.

If we relookup the thing every TCP input packet, we might as well
not do the input route caching in the socket.

* Re: [PATCH v2 net-next 0/9] ipv6: fib6: socket dst_entry improvments and cleanups
  2014-09-26  4:28 ` [PATCH v2 net-next 0/9] ipv6: fib6: socket dst_entry improvments and cleanups David Miller
@ 2014-09-26  7:58   ` Hannes Frederic Sowa
  2014-09-26 16:46     ` David Miller
  0 siblings, 1 reply; 14+ messages in thread
From: Hannes Frederic Sowa @ 2014-09-26  7:58 UTC (permalink / raw)
  To: David Miller
  Cc: netdev, eric.dumazet, hideaki, vyasevich, nicolas.dichtel, kafai

On Fri, Sep 26, 2014, at 06:28, David Miller wrote:
> From: Hannes Frederic Sowa <hannes@stressinduktion.org>
> Date: Sun, 21 Sep 2014 16:11:44 +0200
> 
> > Eric Dumazet noticed that rt6_nodes wich are neither RTF_NONEXTHOP nor
> > RTF_GATEWAY but DST_HOST ones cause major routing lookup churn because
> > their rt6_genid is never renewed, thus ip6_dst_check always considers
> > them outdated. This is a major problem, because these kind of routes
> > are normally used to in input handling.
> 
> This series is a disappointment for me from the perspective of the
> fact that we have a regression in mainline and this is too complex
> of a set of changes for there.
> 
> If we relookup the thing every TCP input packet, we might as well
> not do the input route caching in the socket.

I can understand.

Toss this series, I'll try to do better tomorrow and send changes for
net and submit net-next cleanups when your queue is a bit smaller.

Bye,
Hannes

* Re: [PATCH v2 net-next 0/9] ipv6: fib6: socket dst_entry improvments and cleanups
  2014-09-26  7:58   ` Hannes Frederic Sowa
@ 2014-09-26 16:46     ` David Miller
  2014-09-27 23:32       ` Hannes Frederic Sowa
  0 siblings, 1 reply; 14+ messages in thread
From: David Miller @ 2014-09-26 16:46 UTC (permalink / raw)
  To: hannes; +Cc: netdev, eric.dumazet, hideaki, vyasevich, nicolas.dichtel, kafai

From: Hannes Frederic Sowa <hannes@stressinduktion.org>
Date: Fri, 26 Sep 2014 09:58:43 +0200

> On Fri, Sep 26, 2014, at 06:28, David Miller wrote:
>> From: Hannes Frederic Sowa <hannes@stressinduktion.org>
>> Date: Sun, 21 Sep 2014 16:11:44 +0200
>> 
>> > Eric Dumazet noticed that rt6_nodes wich are neither RTF_NONEXTHOP nor
>> > RTF_GATEWAY but DST_HOST ones cause major routing lookup churn because
>> > their rt6_genid is never renewed, thus ip6_dst_check always considers
>> > them outdated. This is a major problem, because these kind of routes
>> > are normally used to in input handling.
>> 
>> This series is a disappointment for me from the perspective of the
>> fact that we have a regression in mainline and this is too complex
>> of a set of changes for there.
>> 
>> If we relookup the thing every TCP input packet, we might as well
>> not do the input route caching in the socket.
> 
> I can understand.
> 
> Toss this series, I'll try to do better tomorrow and send changes for
> net and submit net-next cleanups when your queue is a bit smaller.

BTW, don't get me wrong, I like the new code and for 'net-next' it's
good.

But for 'net' we have to come up with something simpler meanwhile.

* Re: [PATCH v2 net-next 0/9] ipv6: fib6: socket dst_entry improvments and cleanups
  2014-09-26 16:46     ` David Miller
@ 2014-09-27 23:32       ` Hannes Frederic Sowa
  0 siblings, 0 replies; 14+ messages in thread
From: Hannes Frederic Sowa @ 2014-09-27 23:32 UTC (permalink / raw)
  To: David Miller
  Cc: netdev, eric.dumazet, hideaki, vyasevich, nicolas.dichtel, kafai

On Fri, Sep 26, 2014, at 18:46, David Miller wrote:
> From: Hannes Frederic Sowa <hannes@stressinduktion.org>
> Date: Fri, 26 Sep 2014 09:58:43 +0200
> 
> > On Fri, Sep 26, 2014, at 06:28, David Miller wrote:
> >> From: Hannes Frederic Sowa <hannes@stressinduktion.org>
> >> Date: Sun, 21 Sep 2014 16:11:44 +0200
> >> 
> >> > Eric Dumazet noticed that rt6_nodes wich are neither RTF_NONEXTHOP nor
> >> > RTF_GATEWAY but DST_HOST ones cause major routing lookup churn because
> >> > their rt6_genid is never renewed, thus ip6_dst_check always considers
> >> > them outdated. This is a major problem, because these kind of routes
> >> > are normally used to in input handling.
> >> 
> >> This series is a disappointment for me from the perspective of the
> >> fact that we have a regression in mainline and this is too complex
> >> of a set of changes for there.
> >> 
> >> If we relookup the thing every TCP input packet, we might as well
> >> not do the input route caching in the socket.
> > 
> > I can understand.
> > 
> > Toss this series, I'll try to do better tomorrow and send changes for
> > net and submit net-next cleanups when your queue is a bit smaller.
> 
> BTW, don't get me wrong, I like the new code and for 'net-next' it's
> good.

I didn't. ;)

> But for 'net' we have to come up with something simpler meanwhile.

Sure, I just posted one small patch to address the problem. Cleanups and
smaller performance optimizations will come later, after you have done a
merge.

Thanks,
Hannes

Thread overview: 14+ messages
2014-09-21 14:11 [PATCH v2 net-next 0/9] ipv6: fib6: socket dst_entry improvments and cleanups Hannes Frederic Sowa
2014-09-21 14:11 ` [PATCH v2 net-next 1/9] ipv6: support for fib6_clean_* to update fn_sernum Hannes Frederic Sowa
2014-09-21 14:11 ` [PATCH v2 net-next 2/9] ipv6: a bit more typesafety Hannes Frederic Sowa
2014-09-21 14:11 ` [PATCH v2 net-next 3/9] ipv6: only generate one new serial number during fib6_add() Hannes Frederic Sowa
2014-09-21 14:11 ` [PATCH v2 net-next 4/9] ipv6: if no function for cleaner is specified only visit fib6_nodes Hannes Frederic Sowa
2014-09-21 14:11 ` [PATCH v2 net-next 5/9] ipv6: new function fib6_flush_trees and use it instead of bumping removed rt6_genid Hannes Frederic Sowa
2014-09-21 14:11 ` [PATCH v2 net-next 6/9] ipv6: no need to bump rt_genid_ipv6 on address addition Hannes Frederic Sowa
2014-09-21 14:11 ` [PATCH v2 net-next 7/9] ipv6: keep rt_sernum per namespace to reduce number of flushes Hannes Frederic Sowa
2014-09-21 14:11 ` [PATCH v2 net-next 8/9] ipv6: switch rt_sernum to atomic_t and clean up types Hannes Frederic Sowa
2014-09-21 14:11 ` [PATCH v2 net-next 9/9] ipv6: rename rt_genid_bump_ipv6 to rt6_inval_dst_caches Hannes Frederic Sowa
2014-09-26  4:28 ` [PATCH v2 net-next 0/9] ipv6: fib6: socket dst_entry improvments and cleanups David Miller
2014-09-26  7:58   ` Hannes Frederic Sowa
2014-09-26 16:46     ` David Miller
2014-09-27 23:32       ` Hannes Frederic Sowa
