* [PATCH v2 net-next 0/9] ipv6: fib6: socket dst_entry improvments and cleanups
@ 2014-09-21 14:11 Hannes Frederic Sowa
2014-09-21 14:11 ` [PATCH v2 net-next 1/9] ipv6: support for fib6_clean_* to update fn_sernum Hannes Frederic Sowa
` (9 more replies)
0 siblings, 10 replies; 14+ messages in thread
From: Hannes Frederic Sowa @ 2014-09-21 14:11 UTC (permalink / raw)
To: netdev; +Cc: eric.dumazet, hideaki, vyasevich, nicolas.dichtel, kafai
Eric Dumazet noticed that rt6_nodes which are neither RTF_NONEXTHOP nor
RTF_GATEWAY but DST_HOST ones cause major routing lookup churn because
their rt6i_genid is never renewed, thus ip6_dst_check always considers
them outdated. This is a major problem, because this kind of route is
normally used in input handling.
Thus it does not make sense to use rt6i_genid anymore. This series
removes it.
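For illustration only, the validity test that remains once rt6i_genid is
gone can be modelled in user space roughly as below. The struct and
function names (node_model, route_model, route_cookie, route_still_valid)
are made up for this sketch and are not kernel symbols; only the
comparison itself follows what ip6_dst_check keeps doing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct fib6_node. */
struct node_model {
	int fn_sernum;            /* serial number stamped on the tree node */
};

/* Hypothetical, simplified stand-in for struct rt6_info. */
struct route_model {
	struct node_model *node;  /* node the route hangs off, NULL once unlinked */
};

/* Cookie a socket would remember next to its cached route. */
static int route_cookie(const struct route_model *rt)
{
	assert(rt != NULL);
	return rt->node ? rt->node->fn_sernum : 0;
}

/*
 * The cached route is usable only while its node still exists and the
 * node's serial number equals the cookie taken at lookup time.
 */
static bool route_still_valid(const struct route_model *rt, int cookie)
{
	if (!rt->node || rt->node->fn_sernum != cookie)
		return false;
	return true;
}
```

In this model, anything that writes a new serial number into a node
invalidates every cached route below it on the next check, which is why
a separate per-route generation id is no longer needed.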
The address deletion path is already covered and does not depend on
rt6i_genid. When we add a new address, we update the fn_sernums while
traversing the tree.
Because inet6_connect_socket depends on dst_check also returning NULL
for source address invalidation, we currently have to walk the whole
tree and update the fn_sernums manually when an address gets deleted.
This is a fairly expensive operation that we currently have to perform
for address deletion and xfrm policy changes. We already do that for
interface mtu changes.
I dropped the patch for updating the fn_sernum on deletion as it
showed some side effects with /proc/net/ipv6_route and we currently
don't need it. I stashed it away.
Thanks to Eric Dumazet for noticing the problem with rt6i_genid!
v2 (addressed YOSHIFUJI Hideaki's feedback, thanks!):
* fixed changelog in patch #2
* added patch to rename rt_genid_bump_ipv6 etc.
Regarding the __u32 to u32 conversion, I didn't see anything
problematic or left over; if I missed something there, I am happy to
fix it up in the next version.
Hannes Frederic Sowa (9):
ipv6: support for fib6_clean_* to update fn_sernum
ipv6: a bit more typesafety
ipv6: only generate one new serial number during fib6_add()
ipv6: if no function for cleaner is specified only visit fib6_nodes
ipv6: new function fib6_flush_trees and use it instead of bumping
removed rt6_genid
ipv6: no need to bump rt_genid_ipv6 on address addition
ipv6: keep rt_sernum per namespace to reduce number of flushes
ipv6: switch rt_sernum to atomic_t and clean up types
ipv6: rename rt_genid_bump_ipv6 to rt6_inval_dst_caches
include/net/ip6_fib.h | 16 ++++++--
include/net/net_namespace.h | 22 ++++------
include/net/netns/ipv6.h | 2 +-
net/ipv6/addrconf.c | 3 +-
net/ipv6/addrconf_core.c | 6 +++
net/ipv6/af_inet6.c | 2 +-
net/ipv6/ip6_fib.c | 90 +++++++++++++++++++++++++----------------
net/ipv6/route.c | 4 --
net/xfrm/xfrm_policy.c | 2 +-
security/selinux/include/xfrm.h | 2 +-
10 files changed, 87 insertions(+), 62 deletions(-)
--
1.9.3
* [PATCH v2 net-next 1/9] ipv6: support for fib6_clean_* to update fn_sernum
2014-09-21 14:11 [PATCH v2 net-next 0/9] ipv6: fib6: socket dst_entry improvments and cleanups Hannes Frederic Sowa
@ 2014-09-21 14:11 ` Hannes Frederic Sowa
2014-09-21 14:11 ` [PATCH v2 net-next 2/9] ipv6: a bit more typesafety Hannes Frederic Sowa
` (8 subsequent siblings)
9 siblings, 0 replies; 14+ messages in thread
From: Hannes Frederic Sowa @ 2014-09-21 14:11 UTC (permalink / raw)
To: netdev; +Cc: eric.dumazet, hideaki, vyasevich, nicolas.dichtel, kafai
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: YOSHIFUJI Hideaki <hideaki@yoshifuji.org>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Cc: Martin Lau <kafai@fb.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
include/net/ip6_fib.h | 2 +-
net/ipv6/ip6_fib.c | 31 ++++++++++++++++++++-----------
2 files changed, 21 insertions(+), 12 deletions(-)
diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
index 9bcb220..1cdd46e 100644
--- a/include/net/ip6_fib.h
+++ b/include/net/ip6_fib.h
@@ -64,7 +64,7 @@ struct fib6_node {
__u16 fn_bit; /* bit key */
__u16 fn_flags;
- __u32 fn_sernum;
+ u32 fn_sernum;
struct rt6_info *rr_ptr;
};
diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 76b7f5e..900254e 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -60,6 +60,7 @@ struct fib6_cleaner_t {
struct fib6_walker_t w;
struct net *net;
int (*func)(struct rt6_info *, void *arg);
+ u32 sernum;
void *arg;
};
@@ -71,7 +72,8 @@ static DEFINE_RWLOCK(fib6_walker_lock);
#define FWS_INIT FWS_L
#endif
-static void fib6_prune_clones(struct net *net, struct fib6_node *fn);
+static void fib6_prune_clones(struct net *net, struct fib6_node *fn,
+ u32 sernum);
static struct rt6_info *fib6_find_prefix(struct net *net, struct fib6_node *fn);
static struct fib6_node *fib6_repair_tree(struct net *net, struct fib6_node *fn);
static int fib6_walk(struct fib6_walker_t *w);
@@ -84,7 +86,7 @@ static int fib6_walk_continue(struct fib6_walker_t *w);
* result of redirects, path MTU changes, etc.
*/
-static __u32 rt_sernum;
+static u32 rt_sernum;
static void fib6_gc_timer_cb(unsigned long arg);
@@ -107,11 +109,13 @@ static inline void fib6_walker_unlink(struct fib6_walker_t *w)
static __inline__ u32 fib6_new_sernum(void)
{
u32 n = ++rt_sernum;
- if ((__s32)n <= 0)
+ if ((s32)n <= 0)
rt_sernum = n = 1;
return n;
}
+#define FIB6_NO_SERNUM_CHANGE (0U)
+
/*
* Auxiliary address test functions for the radix tree.
*
@@ -430,7 +434,7 @@ static struct fib6_node *fib6_add_1(struct fib6_node *root,
struct rt6key *key;
int bit;
__be32 dir = 0;
- __u32 sernum = fib6_new_sernum();
+ u32 sernum = fib6_new_sernum();
RT6_TRACE("fib6_add_1\n");
@@ -940,7 +944,8 @@ int fib6_add(struct fib6_node *root, struct rt6_info *rt, struct nl_info *info,
if (!err) {
fib6_start_gc(info->nl_net, rt);
if (!(rt->rt6i_flags & RTF_CACHE))
- fib6_prune_clones(info->nl_net, pn);
+ fib6_prune_clones(info->nl_net, pn,
+ FIB6_NO_SERNUM_CHANGE);
}
out:
@@ -1374,7 +1379,7 @@ int fib6_del(struct rt6_info *rt, struct nl_info *info)
pn = pn->parent;
}
#endif
- fib6_prune_clones(info->nl_net, pn);
+ fib6_prune_clones(info->nl_net, pn, FIB6_NO_SERNUM_CHANGE);
}
/*
@@ -1521,6 +1526,9 @@ static int fib6_clean_node(struct fib6_walker_t *w)
.nl_net = c->net,
};
+ if (c->sernum != FIB6_NO_SERNUM_CHANGE)
+ c->w.node->fn_sernum = c->sernum;
+
for (rt = w->leaf; rt; rt = rt->dst.rt6_next) {
res = c->func(rt, c->arg);
if (res < 0) {
@@ -1554,7 +1562,7 @@ static int fib6_clean_node(struct fib6_walker_t *w)
static void fib6_clean_tree(struct net *net, struct fib6_node *root,
int (*func)(struct rt6_info *, void *arg),
- int prune, void *arg)
+ int prune, u32 sernum, void *arg)
{
struct fib6_cleaner_t c;
@@ -1564,6 +1572,7 @@ static void fib6_clean_tree(struct net *net, struct fib6_node *root,
c.w.count = 0;
c.w.skip = 0;
c.func = func;
+ c.sernum = sernum;
c.arg = arg;
c.net = net;
@@ -1583,7 +1592,7 @@ void fib6_clean_all(struct net *net, int (*func)(struct rt6_info *, void *arg),
hlist_for_each_entry_rcu(table, head, tb6_hlist) {
write_lock_bh(&table->tb6_lock);
fib6_clean_tree(net, &table->tb6_root,
- func, 0, arg);
+ func, 0, FIB6_NO_SERNUM_CHANGE, arg);
write_unlock_bh(&table->tb6_lock);
}
}
@@ -1600,9 +1609,9 @@ static int fib6_prune_clone(struct rt6_info *rt, void *arg)
return 0;
}
-static void fib6_prune_clones(struct net *net, struct fib6_node *fn)
+static void fib6_prune_clones(struct net *net, struct fib6_node *fn, u32 sernum)
{
- fib6_clean_tree(net, fn, fib6_prune_clone, 1, NULL);
+ fib6_clean_tree(net, fn, fib6_prune_clone, 1, sernum, NULL);
}
/*
@@ -1811,7 +1820,7 @@ struct ipv6_route_iter {
struct fib6_walker_t w;
loff_t skip;
struct fib6_table *tbl;
- __u32 sernum;
+ u32 sernum;
};
static int ipv6_route_seq_show(struct seq_file *seq, void *v)
--
1.9.3
* [PATCH v2 net-next 2/9] ipv6: a bit more typesafety
2014-09-21 14:11 [PATCH v2 net-next 0/9] ipv6: fib6: socket dst_entry improvments and cleanups Hannes Frederic Sowa
2014-09-21 14:11 ` [PATCH v2 net-next 1/9] ipv6: support for fib6_clean_* to update fn_sernum Hannes Frederic Sowa
@ 2014-09-21 14:11 ` Hannes Frederic Sowa
2014-09-21 14:11 ` [PATCH v2 net-next 3/9] ipv6: only generate one new serial number during fib6_add() Hannes Frederic Sowa
` (7 subsequent siblings)
9 siblings, 0 replies; 14+ messages in thread
From: Hannes Frederic Sowa @ 2014-09-21 14:11 UTC (permalink / raw)
To: netdev; +Cc: eric.dumazet, hideaki, vyasevich, nicolas.dichtel, kafai
Convert prune argument/struct member to bool and fib6_walker_t.state to
the already existing enum.
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: YOSHIFUJI Hideaki <hideaki@yoshifuji.org>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Cc: Martin Lau <kafai@fb.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
include/net/ip6_fib.h | 14 ++++++++++++--
net/ipv6/ip6_fib.c | 17 ++++-------------
2 files changed, 16 insertions(+), 15 deletions(-)
diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
index 1cdd46e..a09e554 100644
--- a/include/net/ip6_fib.h
+++ b/include/net/ip6_fib.h
@@ -205,12 +205,22 @@ static inline void ip6_rt_put(struct rt6_info *rt)
dst_release(&rt->dst);
}
+enum fib_walk_state_t {
+#ifdef CONFIG_IPV6_SUBTREES
+ FWS_S,
+#endif
+ FWS_L,
+ FWS_R,
+ FWS_C,
+ FWS_U
+};
+
struct fib6_walker_t {
struct list_head lh;
struct fib6_node *root, *node;
struct rt6_info *leaf;
- unsigned char state;
- unsigned char prune;
+ enum fib_walk_state_t state;
+ bool prune;
unsigned int skip;
unsigned int count;
int (*func)(struct fib6_walker_t *);
diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 900254e..67599d8 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -46,16 +46,6 @@
static struct kmem_cache *fib6_node_kmem __read_mostly;
-enum fib_walk_state_t {
-#ifdef CONFIG_IPV6_SUBTREES
- FWS_S,
-#endif
- FWS_L,
- FWS_R,
- FWS_C,
- FWS_U
-};
-
struct fib6_cleaner_t {
struct fib6_walker_t w;
struct net *net;
@@ -1562,7 +1552,7 @@ static int fib6_clean_node(struct fib6_walker_t *w)
static void fib6_clean_tree(struct net *net, struct fib6_node *root,
int (*func)(struct rt6_info *, void *arg),
- int prune, u32 sernum, void *arg)
+ bool prune, u32 sernum, void *arg)
{
struct fib6_cleaner_t c;
@@ -1592,7 +1582,8 @@ void fib6_clean_all(struct net *net, int (*func)(struct rt6_info *, void *arg),
hlist_for_each_entry_rcu(table, head, tb6_hlist) {
write_lock_bh(&table->tb6_lock);
fib6_clean_tree(net, &table->tb6_root,
- func, 0, FIB6_NO_SERNUM_CHANGE, arg);
+ func, false, FIB6_NO_SERNUM_CHANGE,
+ arg);
write_unlock_bh(&table->tb6_lock);
}
}
@@ -1611,7 +1602,7 @@ static int fib6_prune_clone(struct rt6_info *rt, void *arg)
static void fib6_prune_clones(struct net *net, struct fib6_node *fn, u32 sernum)
{
- fib6_clean_tree(net, fn, fib6_prune_clone, 1, sernum, NULL);
+ fib6_clean_tree(net, fn, fib6_prune_clone, true, sernum, NULL);
}
/*
--
1.9.3
* [PATCH v2 net-next 3/9] ipv6: only generate one new serial number during fib6_add()
2014-09-21 14:11 [PATCH v2 net-next 0/9] ipv6: fib6: socket dst_entry improvments and cleanups Hannes Frederic Sowa
2014-09-21 14:11 ` [PATCH v2 net-next 1/9] ipv6: support for fib6_clean_* to update fn_sernum Hannes Frederic Sowa
2014-09-21 14:11 ` [PATCH v2 net-next 2/9] ipv6: a bit more typesafety Hannes Frederic Sowa
@ 2014-09-21 14:11 ` Hannes Frederic Sowa
2014-09-21 14:11 ` [PATCH v2 net-next 4/9] ipv6: if no function for cleaner is specified only visit fib6_nodes Hannes Frederic Sowa
` (6 subsequent siblings)
9 siblings, 0 replies; 14+ messages in thread
From: Hannes Frederic Sowa @ 2014-09-21 14:11 UTC (permalink / raw)
To: netdev; +Cc: eric.dumazet, hideaki, vyasevich, nicolas.dichtel, kafai
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: YOSHIFUJI Hideaki <hideaki@yoshifuji.org>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Cc: Martin Lau <kafai@fb.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
net/ipv6/ip6_fib.c | 19 +++++++++----------
1 file changed, 9 insertions(+), 10 deletions(-)
diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 67599d8..d8f0af4 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -415,16 +415,15 @@ out:
*/
static struct fib6_node *fib6_add_1(struct fib6_node *root,
- struct in6_addr *addr, int plen,
- int offset, int allow_create,
- int replace_required)
+ struct in6_addr *addr, int plen,
+ int offset, int allow_create,
+ int replace_required, u32 sernum)
{
struct fib6_node *fn, *in, *ln;
struct fib6_node *pn = NULL;
struct rt6key *key;
int bit;
__be32 dir = 0;
- u32 sernum = fib6_new_sernum();
RT6_TRACE("fib6_add_1\n");
@@ -842,6 +841,7 @@ int fib6_add(struct fib6_node *root, struct rt6_info *rt, struct nl_info *info,
int err = -ENOMEM;
int allow_create = 1;
int replace_required = 0;
+ u32 sernum = fib6_new_sernum();
if (info->nlh) {
if (!(info->nlh->nlmsg_flags & NLM_F_CREATE))
@@ -854,7 +854,7 @@ int fib6_add(struct fib6_node *root, struct rt6_info *rt, struct nl_info *info,
fn = fib6_add_1(root, &rt->rt6i_dst.addr, rt->rt6i_dst.plen,
offsetof(struct rt6_info, rt6i_dst), allow_create,
- replace_required);
+ replace_required, sernum);
if (IS_ERR(fn)) {
err = PTR_ERR(fn);
fn = NULL;
@@ -888,14 +888,14 @@ int fib6_add(struct fib6_node *root, struct rt6_info *rt, struct nl_info *info,
sfn->leaf = info->nl_net->ipv6.ip6_null_entry;
atomic_inc(&info->nl_net->ipv6.ip6_null_entry->rt6i_ref);
sfn->fn_flags = RTN_ROOT;
- sfn->fn_sernum = fib6_new_sernum();
+ sfn->fn_sernum = sernum;
/* Now add the first leaf node to new subtree */
sn = fib6_add_1(sfn, &rt->rt6i_src.addr,
rt->rt6i_src.plen,
offsetof(struct rt6_info, rt6i_src),
- allow_create, replace_required);
+ allow_create, replace_required, sernum);
if (IS_ERR(sn)) {
/* If it is failed, discard just allocated
@@ -914,7 +914,7 @@ int fib6_add(struct fib6_node *root, struct rt6_info *rt, struct nl_info *info,
sn = fib6_add_1(fn->subtree, &rt->rt6i_src.addr,
rt->rt6i_src.plen,
offsetof(struct rt6_info, rt6i_src),
- allow_create, replace_required);
+ allow_create, replace_required, sernum);
if (IS_ERR(sn)) {
err = PTR_ERR(sn);
@@ -934,8 +934,7 @@ int fib6_add(struct fib6_node *root, struct rt6_info *rt, struct nl_info *info,
if (!err) {
fib6_start_gc(info->nl_net, rt);
if (!(rt->rt6i_flags & RTF_CACHE))
- fib6_prune_clones(info->nl_net, pn,
- FIB6_NO_SERNUM_CHANGE);
+ fib6_prune_clones(info->nl_net, pn, sernum);
}
out:
--
1.9.3
* [PATCH v2 net-next 4/9] ipv6: if no function for cleaner is specified only visit fib6_nodes
2014-09-21 14:11 [PATCH v2 net-next 0/9] ipv6: fib6: socket dst_entry improvments and cleanups Hannes Frederic Sowa
` (2 preceding siblings ...)
2014-09-21 14:11 ` [PATCH v2 net-next 3/9] ipv6: only generate one new serial number during fib6_add() Hannes Frederic Sowa
@ 2014-09-21 14:11 ` Hannes Frederic Sowa
2014-09-21 14:11 ` [PATCH v2 net-next 5/9] ipv6: new function fib6_flush_trees and use it instead of bumping removed rt6_genid Hannes Frederic Sowa
` (5 subsequent siblings)
9 siblings, 0 replies; 14+ messages in thread
From: Hannes Frederic Sowa @ 2014-09-21 14:11 UTC (permalink / raw)
To: netdev; +Cc: eric.dumazet, hideaki, vyasevich, nicolas.dichtel, kafai
We now allow NULL rt6_info walker functions, so that we only visit nodes.
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: YOSHIFUJI Hideaki <hideaki@yoshifuji.org>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Cc: Martin Lau <kafai@fb.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
net/ipv6/ip6_fib.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index d8f0af4..4dfadd4 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -1518,6 +1518,12 @@ static int fib6_clean_node(struct fib6_walker_t *w)
if (c->sernum != FIB6_NO_SERNUM_CHANGE)
c->w.node->fn_sernum = c->sernum;
+ if (!c->func) {
+ WARN_ON_ONCE(c->sernum == FIB6_NO_SERNUM_CHANGE);
+ w->leaf = NULL;
+ return 0;
+ }
+
for (rt = w->leaf; rt; rt = rt->dst.rt6_next) {
res = c->func(rt, c->arg);
if (res < 0) {
--
1.9.3
* [PATCH v2 net-next 5/9] ipv6: new function fib6_flush_trees and use it instead of bumping removed rt6_genid
2014-09-21 14:11 [PATCH v2 net-next 0/9] ipv6: fib6: socket dst_entry improvments and cleanups Hannes Frederic Sowa
` (3 preceding siblings ...)
2014-09-21 14:11 ` [PATCH v2 net-next 4/9] ipv6: if no function for cleaner is specified only visit fib6_nodes Hannes Frederic Sowa
@ 2014-09-21 14:11 ` Hannes Frederic Sowa
2014-09-21 14:11 ` [PATCH v2 net-next 6/9] ipv6: no need to bump rt_genid_ipv6 on address addition Hannes Frederic Sowa
` (4 subsequent siblings)
9 siblings, 0 replies; 14+ messages in thread
From: Hannes Frederic Sowa @ 2014-09-21 14:11 UTC (permalink / raw)
To: netdev; +Cc: eric.dumazet, hideaki, vyasevich, nicolas.dichtel, kafai
fib6_flush_trees is still a very costly operation, but it is now only
called by xfrm code when a policy changes or when ipv6 addresses are
added/removed.
fib6_flush_trees must walk all ipv6 routing tables and modify fn_sernum,
so that all sockets look up their dst_entries again. Use a NULL
callback, so we only walk the nodes without looking at the rt6_infos.
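As a rough user-space sketch of the idea (not the kernel code): the walk
below stamps a new serial number on every node and skips the per-route
work when no callback is given. All names here (node_model, rt_model,
clean_tree, flush_tree, NO_SERNUM_CHANGE) are hypothetical, the recursion
replaces the kernel's iterative walker state machine, the callback's
return value is ignored, and a plain assert stands in for the kernel's
WARN_ON_ONCE.

```c
#include <assert.h>
#include <stddef.h>

#define NO_SERNUM_CHANGE 0

/* Hypothetical stand-in for a route entry chained on a node. */
struct rt_model {
	struct rt_model *next;
};

/* Hypothetical stand-in for a tree node. */
struct node_model {
	struct node_model *left, *right;
	struct rt_model *leaf;   /* first route on this node, may be NULL */
	int fn_sernum;
};

typedef int (*rt_func_t)(struct rt_model *rt, void *arg);

/*
 * Visit every node below fn. If sernum is set, write it into the node.
 * If func is NULL, the routes on the node are not touched at all.
 * Returns the number of nodes visited.
 */
static int clean_tree(struct node_model *fn, rt_func_t func,
		      int sernum, void *arg)
{
	struct rt_model *rt;
	int visited = 1;

	if (!fn)
		return 0;

	/* A NULL callback only makes sense when a serial number is written. */
	assert(func || sernum != NO_SERNUM_CHANGE);

	if (sernum != NO_SERNUM_CHANGE)
		fn->fn_sernum = sernum;

	if (func) {
		for (rt = fn->leaf; rt; rt = rt->next)
			func(rt, arg);
	}

	visited += clean_tree(fn->left, func, sernum, arg);
	visited += clean_tree(fn->right, func, sernum, arg);
	return visited;
}

/* Flush: no callback, just stamp the new serial number everywhere. */
static int flush_tree(struct node_model *root, int new_sernum)
{
	return clean_tree(root, NULL, new_sernum, NULL);
}
```

The cost is still linear in the number of nodes, which is why the flush
should stay limited to the rare events named above.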
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: YOSHIFUJI Hideaki <hideaki@yoshifuji.org>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Cc: Martin Lau <kafai@fb.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
include/net/net_namespace.h | 14 +++-----------
include/net/netns/ipv6.h | 1 -
net/ipv6/addrconf_core.c | 6 ++++++
net/ipv6/af_inet6.c | 1 -
net/ipv6/ip6_fib.c | 21 +++++++++++++++++----
net/ipv6/route.c | 4 ----
6 files changed, 26 insertions(+), 21 deletions(-)
diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index 361d260..61aad36 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -353,21 +353,13 @@ static inline void rt_genid_bump_ipv4(struct net *net)
}
#if IS_ENABLED(CONFIG_IPV6)
-static inline int rt_genid_ipv6(struct net *net)
-{
- return atomic_read(&net->ipv6.rt_genid);
-}
-
+extern void (*__fib6_flush_trees)(struct net *);
static inline void rt_genid_bump_ipv6(struct net *net)
{
- atomic_inc(&net->ipv6.rt_genid);
+ if (__fib6_flush_trees)
+ __fib6_flush_trees(net);
}
#else
-static inline int rt_genid_ipv6(struct net *net)
-{
- return 0;
-}
-
static inline void rt_genid_bump_ipv6(struct net *net)
{
}
diff --git a/include/net/netns/ipv6.h b/include/net/netns/ipv6.h
index eade27a..3291ba6 100644
--- a/include/net/netns/ipv6.h
+++ b/include/net/netns/ipv6.h
@@ -76,7 +76,6 @@ struct netns_ipv6 {
#endif
#endif
atomic_t dev_addr_genid;
- atomic_t rt_genid;
};
#if IS_ENABLED(CONFIG_NF_DEFRAG_IPV6)
diff --git a/net/ipv6/addrconf_core.c b/net/ipv6/addrconf_core.c
index e696045..8b2d99a 100644
--- a/net/ipv6/addrconf_core.c
+++ b/net/ipv6/addrconf_core.c
@@ -10,6 +10,12 @@
#define IPV6_ADDR_SCOPE_TYPE(scope) ((scope) << 16)
+/* if ipv6 module registers this function is used by xfrm to force
+ * all sockets to relookup their nodes - this is fairly expensive
+ */
+void (*__fib6_flush_trees)(struct net *);
+EXPORT_SYMBOL(__fib6_flush_trees);
+
static inline unsigned int ipv6_addr_scope2type(unsigned int scope)
{
switch (scope) {
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index e4865a3..2189d2d 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -766,7 +766,6 @@ static int __net_init inet6_net_init(struct net *net)
net->ipv6.sysctl.icmpv6_time = 1*HZ;
net->ipv6.sysctl.flowlabel_consistency = 1;
net->ipv6.sysctl.auto_flowlabels = 0;
- atomic_set(&net->ipv6.rt_genid, 0);
err = ipv6_init_mibs(net);
if (err)
diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 4dfadd4..0a97216 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -1574,8 +1574,9 @@ static void fib6_clean_tree(struct net *net, struct fib6_node *root,
fib6_walk(&c.w);
}
-void fib6_clean_all(struct net *net, int (*func)(struct rt6_info *, void *arg),
- void *arg)
+void __fib6_clean_all(struct net *net,
+ int (*func)(struct rt6_info *, void *arg),
+ u32 sernum, void *arg)
{
struct fib6_table *table;
struct hlist_head *head;
@@ -1587,14 +1588,24 @@ void fib6_clean_all(struct net *net, int (*func)(struct rt6_info *, void *arg),
hlist_for_each_entry_rcu(table, head, tb6_hlist) {
write_lock_bh(&table->tb6_lock);
fib6_clean_tree(net, &table->tb6_root,
- func, false, FIB6_NO_SERNUM_CHANGE,
- arg);
+ func, false, sernum, arg);
write_unlock_bh(&table->tb6_lock);
}
}
rcu_read_unlock();
}
+void fib6_clean_all(struct net *net, int (*func)(struct rt6_info *, void *arg),
+ void *arg)
+{
+ __fib6_clean_all(net, func, FIB6_NO_SERNUM_CHANGE, arg);
+}
+
+static void fib6_flush_trees(struct net *net)
+{
+ __fib6_clean_all(net, NULL, fib6_new_sernum(), NULL);
+}
+
static int fib6_prune_clone(struct rt6_info *rt, void *arg)
{
if (rt->rt6i_flags & RTF_CACHE) {
@@ -1793,6 +1804,8 @@ int __init fib6_init(void)
NULL);
if (ret)
goto out_unregister_subsys;
+
+ __fib6_flush_trees = fib6_flush_trees;
out:
return ret;
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index f74b041..a318dd89 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -314,7 +314,6 @@ static inline struct rt6_info *ip6_dst_alloc(struct net *net,
memset(dst + 1, 0, sizeof(*rt) - sizeof(*dst));
rt6_init_peer(rt, table ? &table->tb6_peers : net->ipv6.peers);
- rt->rt6i_genid = rt_genid_ipv6(net);
INIT_LIST_HEAD(&rt->rt6i_siblings);
}
return rt;
@@ -1096,9 +1095,6 @@ static struct dst_entry *ip6_dst_check(struct dst_entry *dst, u32 cookie)
* DST_OBSOLETE_FORCE_CHK which forces validation calls down
* into this function always.
*/
- if (rt->rt6i_genid != rt_genid_ipv6(dev_net(rt->dst.dev)))
- return NULL;
-
if (!rt->rt6i_node || (rt->rt6i_node->fn_sernum != cookie))
return NULL;
--
1.9.3
* [PATCH v2 net-next 6/9] ipv6: no need to bump rt_genid_ipv6 on address addition
2014-09-21 14:11 [PATCH v2 net-next 0/9] ipv6: fib6: socket dst_entry improvments and cleanups Hannes Frederic Sowa
` (4 preceding siblings ...)
2014-09-21 14:11 ` [PATCH v2 net-next 5/9] ipv6: new function fib6_flush_trees and use it instead of bumping removed rt6_genid Hannes Frederic Sowa
@ 2014-09-21 14:11 ` Hannes Frederic Sowa
2014-09-21 14:11 ` [PATCH v2 net-next 7/9] ipv6: keep rt_sernum per namespace to reduce number of flushes Hannes Frederic Sowa
` (3 subsequent siblings)
9 siblings, 0 replies; 14+ messages in thread
From: Hannes Frederic Sowa @ 2014-09-21 14:11 UTC (permalink / raw)
To: netdev; +Cc: eric.dumazet, hideaki, vyasevich, nicolas.dichtel, kafai
Updating fn_sernum on address insertion already makes the sockets throw
away their cached dst_entries and do a relookup.
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: YOSHIFUJI Hideaki <hideaki@yoshifuji.org>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Cc: Martin Lau <kafai@fb.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
net/ipv6/addrconf.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 39d3335..a2d2626 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -4781,10 +4781,11 @@ static void __ipv6_ifa_notify(int event, struct inet6_ifaddr *ifp)
if (ip6_del_rt(ifp->rt))
dst_free(&ifp->rt->dst);
+
+ rt_genid_bump_ipv6(net);
break;
}
atomic_inc(&net->ipv6.dev_addr_genid);
- rt_genid_bump_ipv6(net);
}
static void ipv6_ifa_notify(int event, struct inet6_ifaddr *ifp)
--
1.9.3
* [PATCH v2 net-next 7/9] ipv6: keep rt_sernum per namespace to reduce number of flushes
2014-09-21 14:11 [PATCH v2 net-next 0/9] ipv6: fib6: socket dst_entry improvments and cleanups Hannes Frederic Sowa
` (5 preceding siblings ...)
2014-09-21 14:11 ` [PATCH v2 net-next 6/9] ipv6: no need to bump rt_genid_ipv6 on address addition Hannes Frederic Sowa
@ 2014-09-21 14:11 ` Hannes Frederic Sowa
2014-09-21 14:11 ` [PATCH v2 net-next 8/9] ipv6: switch rt_sernum to atomic_t and clean up types Hannes Frederic Sowa
` (2 subsequent siblings)
9 siblings, 0 replies; 14+ messages in thread
From: Hannes Frederic Sowa @ 2014-09-21 14:11 UTC (permalink / raw)
To: netdev; +Cc: eric.dumazet, hideaki, vyasevich, nicolas.dichtel, kafai
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: YOSHIFUJI Hideaki <hideaki@yoshifuji.org>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Cc: Martin Lau <kafai@fb.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
include/net/netns/ipv6.h | 1 +
net/ipv6/af_inet6.c | 1 +
net/ipv6/ip6_fib.c | 19 ++++++++++---------
3 files changed, 12 insertions(+), 9 deletions(-)
diff --git a/include/net/netns/ipv6.h b/include/net/netns/ipv6.h
index 3291ba6..2319949 100644
--- a/include/net/netns/ipv6.h
+++ b/include/net/netns/ipv6.h
@@ -76,6 +76,7 @@ struct netns_ipv6 {
#endif
#endif
atomic_t dev_addr_genid;
+ u32 rt_sernum;
};
#if IS_ENABLED(CONFIG_NF_DEFRAG_IPV6)
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index 2189d2d..7ff8996 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -766,6 +766,7 @@ static int __net_init inet6_net_init(struct net *net)
net->ipv6.sysctl.icmpv6_time = 1*HZ;
net->ipv6.sysctl.flowlabel_consistency = 1;
net->ipv6.sysctl.auto_flowlabels = 0;
+ net->ipv6.rt_sernum = 1;
err = ipv6_init_mibs(net);
if (err)
diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 0a97216..9f973e4 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -76,8 +76,6 @@ static int fib6_walk_continue(struct fib6_walker_t *w);
* result of redirects, path MTU changes, etc.
*/
-static u32 rt_sernum;
-
static void fib6_gc_timer_cb(unsigned long arg);
static LIST_HEAD(fib6_walkers);
@@ -96,12 +94,15 @@ static inline void fib6_walker_unlink(struct fib6_walker_t *w)
list_del(&w->lh);
write_unlock_bh(&fib6_walker_lock);
}
-static __inline__ u32 fib6_new_sernum(void)
+
+static u32 fib6_new_sernum(struct net *net)
{
- u32 n = ++rt_sernum;
- if ((s32)n <= 0)
- rt_sernum = n = 1;
- return n;
+ int *n = &net->ipv6.rt_sernum;
+
+ ++*n;
+ if ((s32)*n <= 0)
+ *n = 1;
+ return *n;
}
#define FIB6_NO_SERNUM_CHANGE (0U)
@@ -841,7 +842,7 @@ int fib6_add(struct fib6_node *root, struct rt6_info *rt, struct nl_info *info,
int err = -ENOMEM;
int allow_create = 1;
int replace_required = 0;
- u32 sernum = fib6_new_sernum();
+ u32 sernum = fib6_new_sernum(dev_net(rt->dst.dev));
if (info->nlh) {
if (!(info->nlh->nlmsg_flags & NLM_F_CREATE))
@@ -1603,7 +1604,7 @@ void fib6_clean_all(struct net *net, int (*func)(struct rt6_info *, void *arg),
static void fib6_flush_trees(struct net *net)
{
- __fib6_clean_all(net, NULL, fib6_new_sernum(), NULL);
+ __fib6_clean_all(net, NULL, fib6_new_sernum(net), NULL);
}
static int fib6_prune_clone(struct rt6_info *rt, void *arg)
--
1.9.3
* [PATCH v2 net-next 8/9] ipv6: switch rt_sernum to atomic_t and clean up types
2014-09-21 14:11 [PATCH v2 net-next 0/9] ipv6: fib6: socket dst_entry improvments and cleanups Hannes Frederic Sowa
` (6 preceding siblings ...)
2014-09-21 14:11 ` [PATCH v2 net-next 7/9] ipv6: keep rt_sernum per namespace to reduce number of flushes Hannes Frederic Sowa
@ 2014-09-21 14:11 ` Hannes Frederic Sowa
2014-09-21 14:11 ` [PATCH v2 net-next 9/9] ipv6: rename rt_genid_bump_ipv6 to rt6_inval_dst_caches Hannes Frederic Sowa
2014-09-26 4:28 ` [PATCH v2 net-next 0/9] ipv6: fib6: socket dst_entry improvments and cleanups David Miller
9 siblings, 0 replies; 14+ messages in thread
From: Hannes Frederic Sowa @ 2014-09-21 14:11 UTC (permalink / raw)
To: netdev; +Cc: eric.dumazet, hideaki, vyasevich, nicolas.dichtel, kafai
Switch rt_sernum to atomic_t, make it concurrency safe (the old scheme
looked broken to me) and switch from u32 to int types for the fn_sernum.
(fib6_new_sernum only gets used with table locks, but different tables
can get mutated at the same time.)
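For readers without a kernel tree at hand, the wrap-around scheme can be
sketched with C11 atomics as below. This is only an analogy under stated
assumptions: new_sernum and the use of <stdatomic.h> are choices made for
this sketch, whereas the patch itself uses the kernel's atomic_read and
atomic_cmpxchg on an atomic_t.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

/*
 * Hand out the next serial number. Values stay in [1, INT_MAX]; 0 is
 * never returned, so it remains free to mean "no change".
 */
static int new_sernum(atomic_int *counter)
{
	int old, new_val;

	old = atomic_load(counter);
	do {
		/* Wrap back to 1 instead of overflowing into <= 0. */
		new_val = old < INT_MAX ? old + 1 : 1;
		/* On failure, old is reloaded with the current value. */
	} while (!atomic_compare_exchange_weak(counter, &old, new_val));

	assert(new_val > 0);
	return new_val;
}
```

Two callers racing on the same counter each retry until their own
compare-and-exchange succeeds, so neither a lost update nor a transient
non-positive value can be observed, unlike with a plain increment
followed by a separate fix-up store.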
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: YOSHIFUJI Hideaki <hideaki@yoshifuji.org>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Cc: Martin Lau <kafai@fb.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
include/net/ip6_fib.h | 2 +-
include/net/netns/ipv6.h | 2 +-
net/ipv6/af_inet6.c | 2 +-
net/ipv6/ip6_fib.c | 29 +++++++++++++++--------------
4 files changed, 18 insertions(+), 17 deletions(-)
diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
index a09e554..5440f99 100644
--- a/include/net/ip6_fib.h
+++ b/include/net/ip6_fib.h
@@ -64,7 +64,7 @@ struct fib6_node {
__u16 fn_bit; /* bit key */
__u16 fn_flags;
- u32 fn_sernum;
+ int fn_sernum;
struct rt6_info *rr_ptr;
};
diff --git a/include/net/netns/ipv6.h b/include/net/netns/ipv6.h
index 2319949..7dee21b 100644
--- a/include/net/netns/ipv6.h
+++ b/include/net/netns/ipv6.h
@@ -76,7 +76,7 @@ struct netns_ipv6 {
#endif
#endif
atomic_t dev_addr_genid;
- u32 rt_sernum;
+ atomic_t rt_sernum;
};
#if IS_ENABLED(CONFIG_NF_DEFRAG_IPV6)
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index 7ff8996..6cde9b4 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -766,7 +766,7 @@ static int __net_init inet6_net_init(struct net *net)
net->ipv6.sysctl.icmpv6_time = 1*HZ;
net->ipv6.sysctl.flowlabel_consistency = 1;
net->ipv6.sysctl.auto_flowlabels = 0;
- net->ipv6.rt_sernum = 1;
+ atomic_set(&net->ipv6.rt_sernum, 1);
err = ipv6_init_mibs(net);
if (err)
diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index 9f973e4..ae87d0c 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -50,7 +50,7 @@ struct fib6_cleaner_t {
struct fib6_walker_t w;
struct net *net;
int (*func)(struct rt6_info *, void *arg);
- u32 sernum;
+ int sernum;
void *arg;
};
@@ -63,7 +63,7 @@ static DEFINE_RWLOCK(fib6_walker_lock);
#endif
static void fib6_prune_clones(struct net *net, struct fib6_node *fn,
- u32 sernum);
+ int sernum);
static struct rt6_info *fib6_find_prefix(struct net *net, struct fib6_node *fn);
static struct fib6_node *fib6_repair_tree(struct net *net, struct fib6_node *fn);
static int fib6_walk(struct fib6_walker_t *w);
@@ -97,15 +97,16 @@ static inline void fib6_walker_unlink(struct fib6_walker_t *w)
static u32 fib6_new_sernum(struct net *net)
{
- int *n = &net->ipv6.rt_sernum;
+ int old, new;
- ++*n;
- if ((s32)*n <= 0)
- *n = 1;
- return *n;
+ do {
+ old = atomic_read(&net->ipv6.rt_sernum);
+ new = old < INT_MAX ? old + 1 : 1;
+ } while (atomic_cmpxchg(&net->ipv6.rt_sernum, old, new) != old);
+ return new;
}
-#define FIB6_NO_SERNUM_CHANGE (0U)
+#define FIB6_NO_SERNUM_CHANGE (0)
/*
* Auxiliary address test functions for the radix tree.
@@ -418,7 +419,7 @@ out:
static struct fib6_node *fib6_add_1(struct fib6_node *root,
struct in6_addr *addr, int plen,
int offset, int allow_create,
- int replace_required, u32 sernum)
+ int replace_required, int sernum)
{
struct fib6_node *fn, *in, *ln;
struct fib6_node *pn = NULL;
@@ -842,7 +843,7 @@ int fib6_add(struct fib6_node *root, struct rt6_info *rt, struct nl_info *info,
int err = -ENOMEM;
int allow_create = 1;
int replace_required = 0;
- u32 sernum = fib6_new_sernum(dev_net(rt->dst.dev));
+ int sernum = fib6_new_sernum(dev_net(rt->dst.dev));
if (info->nlh) {
if (!(info->nlh->nlmsg_flags & NLM_F_CREATE))
@@ -1558,7 +1559,7 @@ static int fib6_clean_node(struct fib6_walker_t *w)
static void fib6_clean_tree(struct net *net, struct fib6_node *root,
int (*func)(struct rt6_info *, void *arg),
- bool prune, u32 sernum, void *arg)
+ bool prune, int sernum, void *arg)
{
struct fib6_cleaner_t c;
@@ -1577,7 +1578,7 @@ static void fib6_clean_tree(struct net *net, struct fib6_node *root,
void __fib6_clean_all(struct net *net,
int (*func)(struct rt6_info *, void *arg),
- u32 sernum, void *arg)
+ int sernum, void *arg)
{
struct fib6_table *table;
struct hlist_head *head;
@@ -1617,7 +1618,7 @@ static int fib6_prune_clone(struct rt6_info *rt, void *arg)
return 0;
}
-static void fib6_prune_clones(struct net *net, struct fib6_node *fn, u32 sernum)
+static void fib6_prune_clones(struct net *net, struct fib6_node *fn, int sernum)
{
fib6_clean_tree(net, fn, fib6_prune_clone, true, sernum, NULL);
}
@@ -1830,7 +1831,7 @@ struct ipv6_route_iter {
struct fib6_walker_t w;
loff_t skip;
struct fib6_table *tbl;
- u32 sernum;
+ int sernum;
};
static int ipv6_route_seq_show(struct seq_file *seq, void *v)
--
1.9.3
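For readers following the fib6_new_sernum() hunk above, here is a minimal user-space sketch of the same compare-and-swap loop. It uses C11 <stdatomic.h> rather than the kernel's atomic_read()/atomic_cmpxchg(), and the name next_sernum() is made up for illustration. It is not the kernel code. The wrap from INT_MAX goes to 1, not 0, presumably so that 0 stays free for FIB6_NO_SERNUM_CHANGE.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

/*
 * Hypothetical user-space analogue of fib6_new_sernum(): advance the
 * counter atomically and keep the result within 1..INT_MAX.
 */
static int next_sernum(atomic_int *counter)
{
	int old = atomic_load(counter);
	int next;

	do {
		next = old < INT_MAX ? old + 1 : 1;
		/* On failure, 'old' is refreshed with the current value. */
	} while (!atomic_compare_exchange_weak(counter, &old, next));

	return next;
}

/* Small self-check: the counter wraps from INT_MAX to 1, never to 0. */
static void sernum_selfcheck(void)
{
	atomic_int c = INT_MAX;

	assert(next_sernum(&c) == 1);
	assert(next_sernum(&c) == 2);
}
```

Two callers racing on the same counter each get a distinct value, because the loser of the compare-and-swap recomputes from the refreshed 'old'.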
* [PATCH v2 net-next 9/9] ipv6: rename rt_genid_bump_ipv6 to rt6_inval_dst_caches
2014-09-21 14:11 [PATCH v2 net-next 0/9] ipv6: fib6: socket dst_entry improvments and cleanups Hannes Frederic Sowa
` (7 preceding siblings ...)
2014-09-21 14:11 ` [PATCH v2 net-next 8/9] ipv6: switch rt_sernum to atomic_t and clean up types Hannes Frederic Sowa
@ 2014-09-21 14:11 ` Hannes Frederic Sowa
2014-09-26 4:28 ` [PATCH v2 net-next 0/9] ipv6: fib6: socket dst_entry improvments and cleanups David Miller
9 siblings, 0 replies; 14+ messages in thread
From: Hannes Frederic Sowa @ 2014-09-21 14:11 UTC (permalink / raw)
To: netdev; +Cc: eric.dumazet, hideaki, vyasevich, nicolas.dichtel, kafai
Also rename the IPv4- and IPv6-agnostic rt_genid_bump_all to
dst_inval_caches, as callers don't care how the flushing is implemented
in the protocols.
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: YOSHIFUJI Hideaki <hideaki@yoshifuji.org>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Cc: Martin Lau <kafai@fb.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
include/net/net_namespace.h | 8 ++++----
net/ipv6/addrconf.c | 2 +-
net/xfrm/xfrm_policy.c | 2 +-
security/selinux/include/xfrm.h | 2 +-
4 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index 61aad36..e73b80f 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -354,13 +354,13 @@ static inline void rt_genid_bump_ipv4(struct net *net)
#if IS_ENABLED(CONFIG_IPV6)
extern void (*__fib6_flush_trees)(struct net *);
-static inline void rt_genid_bump_ipv6(struct net *net)
+static inline void rt6_dst_inval_caches(struct net *net)
{
if (__fib6_flush_trees)
__fib6_flush_trees(net);
}
#else
-static inline void rt_genid_bump_ipv6(struct net *net)
+static inline void rt6_dst_inval_caches(struct net *net)
{
}
#endif
@@ -374,10 +374,10 @@ net_ieee802154_lowpan(struct net *net)
#endif
/* For callers who don't really care about whether it's IPv4 or IPv6 */
-static inline void rt_genid_bump_all(struct net *net)
+static inline void dst_inval_caches(struct net *net)
{
rt_genid_bump_ipv4(net);
- rt_genid_bump_ipv6(net);
+ rt6_dst_inval_caches(net);
}
static inline int fnhe_genid(struct net *net)
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index a2d2626..0c2aade 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -4782,7 +4782,7 @@ static void __ipv6_ifa_notify(int event, struct inet6_ifaddr *ifp)
if (ip6_del_rt(ifp->rt))
dst_free(&ifp->rt->dst);
- rt_genid_bump_ipv6(net);
+ rt6_dst_inval_caches(net);
break;
}
atomic_inc(&net->ipv6.dev_addr_genid);
diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index beeed60..6d09195 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -665,7 +665,7 @@ int xfrm_policy_insert(int dir, struct xfrm_policy *policy, int excl)
if (policy->family == AF_INET)
rt_genid_bump_ipv4(net);
else
- rt_genid_bump_ipv6(net);
+ rt6_dst_inval_caches(net);
if (delpol) {
xfrm_policy_requeue(delpol, policy);
diff --git a/security/selinux/include/xfrm.h b/security/selinux/include/xfrm.h
index 1450f85..a1c5f97 100644
--- a/security/selinux/include/xfrm.h
+++ b/security/selinux/include/xfrm.h
@@ -49,7 +49,7 @@ static inline void selinux_xfrm_notify_policyload(void)
rtnl_lock();
for_each_net(net) {
atomic_inc(&net->xfrm.flow_cache_genid);
- rt_genid_bump_all(net);
+ dst_inval_caches(net);
}
rtnl_unlock();
}
--
1.9.3
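The renamed IPv6 helper in the net_namespace.h hunk above calls through a function pointer only when one has been set. A minimal user-space sketch of that optional-hook pattern follows. All names here (struct net_sketch, flush_hook, inval_caches, count_flush) are hypothetical stand-ins, not kernel identifiers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net; it only counts flushes. */
struct net_sketch {
	int flushes;
};

/* Set by the provider of the flush routine; NULL until then. */
static void (*flush_hook)(struct net_sketch *);

/* Callers invalidate caches without knowing whether a provider exists. */
static void inval_caches(struct net_sketch *net)
{
	if (flush_hook)
		flush_hook(net);
}

/* Example provider: records that a flush happened. */
static void count_flush(struct net_sketch *net)
{
	net->flushes++;
}

static void hook_selfcheck(void)
{
	struct net_sketch n = { 0 };

	flush_hook = NULL;
	inval_caches(&n);		/* no provider: nothing happens */
	assert(n.flushes == 0);

	flush_hook = count_flush;
	inval_caches(&n);		/* provider registered: one flush */
	assert(n.flushes == 1);
}
```

The NULL check is what lets the caller stay unconditional while the provider remains optional.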
* Re: [PATCH v2 net-next 0/9] ipv6: fib6: socket dst_entry improvments and cleanups
2014-09-21 14:11 [PATCH v2 net-next 0/9] ipv6: fib6: socket dst_entry improvments and cleanups Hannes Frederic Sowa
` (8 preceding siblings ...)
2014-09-21 14:11 ` [PATCH v2 net-next 9/9] ipv6: rename rt_genid_bump_ipv6 to rt6_inval_dst_caches Hannes Frederic Sowa
@ 2014-09-26 4:28 ` David Miller
2014-09-26 7:58 ` Hannes Frederic Sowa
9 siblings, 1 reply; 14+ messages in thread
From: David Miller @ 2014-09-26 4:28 UTC (permalink / raw)
To: hannes; +Cc: netdev, eric.dumazet, hideaki, vyasevich, nicolas.dichtel, kafai
From: Hannes Frederic Sowa <hannes@stressinduktion.org>
Date: Sun, 21 Sep 2014 16:11:44 +0200
> Eric Dumazet noticed that rt6_nodes which are neither RTF_NONEXTHOP nor
> RTF_GATEWAY but DST_HOST ones cause major routing lookup churn because
> their rt6i_genid is never renewed, thus ip6_dst_check always considers
> them outdated. This is a major problem, because these kinds of routes
> are normally used in input handling.
This series is a disappointment for me from the perspective of the
fact that we have a regression in mainline and this is too complex
of a set of changes for there.
If we relookup the thing every TCP input packet, we might as well
not do the input route caching in the socket.
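To illustrate the cost being discussed, here is a small sketch of a cookie-validated route cache. It assumes the socket stores a serial number alongside the cached route and treats the entry as valid only while the numbers match. The types and helpers (node_sketch, route_sketch, sock_cache, cache_store, cache_check) are hypothetical and much simpler than the kernel's dst_check() path.

```c
#include <assert.h>
#include <stddef.h>

struct node_sketch  { int sernum; };		/* stand-in for a fib6 node */
struct route_sketch { struct node_sketch *node; };

/* Per-socket cache: the route plus the serial number seen when storing. */
struct sock_cache {
	struct route_sketch *rt;
	int cookie;
};

static void cache_store(struct sock_cache *c, struct route_sketch *rt)
{
	c->rt = rt;
	c->cookie = rt->node ? rt->node->sernum : 0;
}

/* Return the cached route if still valid, else NULL (forces a relookup). */
static struct route_sketch *cache_check(const struct sock_cache *c)
{
	if (c->rt && c->rt->node && c->rt->node->sernum == c->cookie)
		return c->rt;
	return NULL;
}

static void cache_selfcheck(void)
{
	struct node_sketch n = { .sernum = 5 };
	struct route_sketch r = { .node = &n };
	struct sock_cache c = { 0 };

	cache_store(&c, &r);
	assert(cache_check(&c) == &r);	/* hit: numbers match */
	n.sernum = 6;			/* tree changed under the route */
	assert(cache_check(&c) == NULL);	/* miss: relookup needed */
}
```

If the check can never succeed for some class of routes, every packet takes the miss path, and the cache adds cost without saving any lookups.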
* Re: [PATCH v2 net-next 0/9] ipv6: fib6: socket dst_entry improvments and cleanups
2014-09-26 4:28 ` [PATCH v2 net-next 0/9] ipv6: fib6: socket dst_entry improvments and cleanups David Miller
@ 2014-09-26 7:58 ` Hannes Frederic Sowa
2014-09-26 16:46 ` David Miller
0 siblings, 1 reply; 14+ messages in thread
From: Hannes Frederic Sowa @ 2014-09-26 7:58 UTC (permalink / raw)
To: David Miller
Cc: netdev, eric.dumazet, hideaki, vyasevich, nicolas.dichtel, kafai
On Fri, Sep 26, 2014, at 06:28, David Miller wrote:
> From: Hannes Frederic Sowa <hannes@stressinduktion.org>
> Date: Sun, 21 Sep 2014 16:11:44 +0200
>
> > Eric Dumazet noticed that rt6_nodes which are neither RTF_NONEXTHOP nor
> > RTF_GATEWAY but DST_HOST ones cause major routing lookup churn because
> > their rt6i_genid is never renewed, thus ip6_dst_check always considers
> > them outdated. This is a major problem, because these kinds of routes
> > are normally used in input handling.
>
> This series is a disappointment for me from the perspective of the
> fact that we have a regression in mainline and this is too complex
> of a set of changes for there.
>
> If we relookup the thing every TCP input packet, we might as well
> not do the input route caching in the socket.
I can understand.
Toss this series, I'll try to do better tomorrow and send changes for
net and submit net-next cleanups when your queue is a bit smaller.
Bye,
Hannes
* Re: [PATCH v2 net-next 0/9] ipv6: fib6: socket dst_entry improvments and cleanups
2014-09-26 7:58 ` Hannes Frederic Sowa
@ 2014-09-26 16:46 ` David Miller
2014-09-27 23:32 ` Hannes Frederic Sowa
0 siblings, 1 reply; 14+ messages in thread
From: David Miller @ 2014-09-26 16:46 UTC (permalink / raw)
To: hannes; +Cc: netdev, eric.dumazet, hideaki, vyasevich, nicolas.dichtel, kafai
From: Hannes Frederic Sowa <hannes@stressinduktion.org>
Date: Fri, 26 Sep 2014 09:58:43 +0200
> On Fri, Sep 26, 2014, at 06:28, David Miller wrote:
>> From: Hannes Frederic Sowa <hannes@stressinduktion.org>
>> Date: Sun, 21 Sep 2014 16:11:44 +0200
>>
>> > Eric Dumazet noticed that rt6_nodes which are neither RTF_NONEXTHOP nor
>> > RTF_GATEWAY but DST_HOST ones cause major routing lookup churn because
>> > their rt6i_genid is never renewed, thus ip6_dst_check always considers
>> > them outdated. This is a major problem, because these kinds of routes
>> > are normally used in input handling.
>>
>> This series is a disappointment for me from the perspective of the
>> fact that we have a regression in mainline and this is too complex
>> of a set of changes for there.
>>
>> If we relookup the thing every TCP input packet, we might as well
>> not do the input route caching in the socket.
>
> I can understand.
>
> Toss this series, I'll try to do better tomorrow and send changes for
> net and submit net-next cleanups when your queue is a bit smaller.
BTW, don't get me wrong, I like the new code and for 'net-next' it's
good.
But for 'net' we have to come up with something simpler meanwhile.
* Re: [PATCH v2 net-next 0/9] ipv6: fib6: socket dst_entry improvments and cleanups
2014-09-26 16:46 ` David Miller
@ 2014-09-27 23:32 ` Hannes Frederic Sowa
0 siblings, 0 replies; 14+ messages in thread
From: Hannes Frederic Sowa @ 2014-09-27 23:32 UTC (permalink / raw)
To: David Miller
Cc: netdev, eric.dumazet, hideaki, vyasevich, nicolas.dichtel, kafai
On Fri, Sep 26, 2014, at 18:46, David Miller wrote:
> From: Hannes Frederic Sowa <hannes@stressinduktion.org>
> Date: Fri, 26 Sep 2014 09:58:43 +0200
>
> > On Fri, Sep 26, 2014, at 06:28, David Miller wrote:
> >> From: Hannes Frederic Sowa <hannes@stressinduktion.org>
> >> Date: Sun, 21 Sep 2014 16:11:44 +0200
> >>
> >> > Eric Dumazet noticed that rt6_nodes which are neither RTF_NONEXTHOP nor
> >> > RTF_GATEWAY but DST_HOST ones cause major routing lookup churn because
> >> > their rt6i_genid is never renewed, thus ip6_dst_check always considers
> >> > them outdated. This is a major problem, because these kinds of routes
> >> > are normally used in input handling.
> >>
> >> This series is a disappointment for me from the perspective of the
> >> fact that we have a regression in mainline and this is too complex
> >> of a set of changes for there.
> >>
> >> If we relookup the thing every TCP input packet, we might as well
> >> not do the input route caching in the socket.
> >
> > I can understand.
> >
> > Toss this series, I'll try to do better tomorrow and send changes for
> > net and submit net-next cleanups when your queue is a bit smaller.
>
> BTW, don't get me wrong, I like the new code and for 'net-next' it's
> good.
I didn't. ;)
> But for 'net' we have to come up with something simpler meanwhile.
Sure, I just posted one small patch to address the problem; cleanups and
smaller performance optimizations will come later, after you have done a merge.
Thanks,
Hannes
Thread overview: 14+ messages
2014-09-21 14:11 [PATCH v2 net-next 0/9] ipv6: fib6: socket dst_entry improvments and cleanups Hannes Frederic Sowa
2014-09-21 14:11 ` [PATCH v2 net-next 1/9] ipv6: support for fib6_clean_* to update fn_sernum Hannes Frederic Sowa
2014-09-21 14:11 ` [PATCH v2 net-next 2/9] ipv6: a bit more typesafety Hannes Frederic Sowa
2014-09-21 14:11 ` [PATCH v2 net-next 3/9] ipv6: only generate one new serial number during fib6_add() Hannes Frederic Sowa
2014-09-21 14:11 ` [PATCH v2 net-next 4/9] ipv6: if no function for cleaner is specified only visit fib6_nodes Hannes Frederic Sowa
2014-09-21 14:11 ` [PATCH v2 net-next 5/9] ipv6: new function fib6_flush_trees and use it instead of bumping removed rt6_genid Hannes Frederic Sowa
2014-09-21 14:11 ` [PATCH v2 net-next 6/9] ipv6: no need to bump rt_genid_ipv6 on address addition Hannes Frederic Sowa
2014-09-21 14:11 ` [PATCH v2 net-next 7/9] ipv6: keep rt_sernum per namespace to reduce number of flushes Hannes Frederic Sowa
2014-09-21 14:11 ` [PATCH v2 net-next 8/9] ipv6: switch rt_sernum to atomic_t and clean up types Hannes Frederic Sowa
2014-09-21 14:11 ` [PATCH v2 net-next 9/9] ipv6: rename rt_genid_bump_ipv6 to rt6_inval_dst_caches Hannes Frederic Sowa
2014-09-26 4:28 ` [PATCH v2 net-next 0/9] ipv6: fib6: socket dst_entry improvments and cleanups David Miller
2014-09-26 7:58 ` Hannes Frederic Sowa
2014-09-26 16:46 ` David Miller
2014-09-27 23:32 ` Hannes Frederic Sowa