* [PATCH net-next 0/5] ipv6: Stop /128 route from disappearing after pmtu update
@ 2015-04-28 20:03 Martin KaFai Lau
2015-04-28 20:03 ` [PATCH net-next 1/5] ipv6: Consider RTF_CACHE when searching the fib6 tree Martin KaFai Lau
` (5 more replies)
0 siblings, 6 replies; 18+ messages in thread
From: Martin KaFai Lau @ 2015-04-28 20:03 UTC (permalink / raw)
To: netdev
Cc: Hannes Frederic Sowa, Steffen Klassert, David Miller,
Yang Yingliang, shengyong, Kernel Team
The series is separated from another patch series,
'ipv6: Only create RTF_CACHE route after encountering pmtu exception',
which can be found here:
http://thread.gmane.org/gmane.linux.network/359140
This series focuses on fixing the /128 route issues. It is currently targeted
for net-next due to the amount of code churn, but it is also applicable
to net (it should apply without conflict). The original reported problem can be
found here:
http://thread.gmane.org/gmane.linux.network/348138
Patches 01 and 02 prepare the fib6 search to expect both the
RTF_CACHE clone and its original route to exist at the same fib6_node.
Patch 03 fixes the /128 route disappearing bug.
Patches 04 and 05 stop rt6_info from using the inet_peer's metrics, so that
the /128 routes (like the /128 clone and its original route)
no longer step on each other's metrics.
The second patch is by 'Steffen Klassert <steffen.klassert@secunet.com>'
which I pulled off from netdev. The third patch is also mostly by
Steffen with one minor optimization.
Many thanks to Hannes Frederic Sowa <hannes@stressinduktion.org> for
reviewing the patches and giving advice.
--Martin
* [PATCH net-next 1/5] ipv6: Consider RTF_CACHE when searching the fib6 tree
2015-04-28 20:03 [PATCH net-next 0/5] ipv6: Stop /128 route from disappearing after pmtu update Martin KaFai Lau
@ 2015-04-28 20:03 ` Martin KaFai Lau
2015-04-28 20:03 ` [PATCH net-next 2/5] ipv6: Extend the route lookups to low priority metrics Martin KaFai Lau
` (4 subsequent siblings)
5 siblings, 0 replies; 18+ messages in thread
From: Martin KaFai Lau @ 2015-04-28 20:03 UTC (permalink / raw)
To: netdev
Cc: Hannes Frederic Sowa, Steffen Klassert, David Miller,
Yang Yingliang, shengyong, Kernel Team
This is prep work for the later bug-fix patch, which will stop the /128 route
from disappearing after a pmtu update.
The later bug-fix patch will allow a /128 route and its RTF_CACHE clone
to both exist at the same fib6_node. To do this, we need to prepare the
existing fib6 tree search to expect RTF_CACHE for a /128 route.
Note that the fn->leaf is sorted by rt6i_metric. Hence,
RTF_CACHE (if there is any) is always at the front. This property
leads to the following:
1. When doing ip6_route_del(), it should honor the RTF_CACHE flag, which
the caller uses to ask for deleting a clone or a non-clone route.
The rtm_to_fib6_config() should also check RTM_F_CLONED and
then set RTF_CACHE accordingly so that:
- 'ip -6 r del...' will make ip6_route_del() delete a route
and all its clones. Note that its clones are flushed by fib6_del().
- 'ip -6 r flush table cache' will make ip6_route_del()
delete only the clone(s).
2. Exclude RTF_CACHE from addrconf_get_prefix_route(), which
should not configure a cloned route.
3. No change is needed for rt6_device_match() since it can already
return an RTF_CACHE clone route, so the later bug-fix patch will not
affect it.
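As a rough illustration of point 1 (not part of the patch), the selection
rule in ip6_route_del() can be modelled in user space as below. All names
prefixed with sk_ / SK_ are hypothetical stand-ins for the kernel types, and
the flag value is an arbitrary bit chosen for the sketch. It only shows that
a request without the cache flag skips the clone sitting at the front of the
leaf list, while a request with the flag picks the clone first.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for RTF_CACHE; the bit value is arbitrary here. */
#define SK_RTF_CACHE 0x1u

/* Hypothetical, heavily reduced model of a route on a fib6_node leaf list. */
struct sk_route {
	unsigned int flags;
	int ifindex;
	struct sk_route *next;
};

/* Return the first route on the leaf list that a delete request acts on. */
static struct sk_route *sk_pick_for_delete(struct sk_route *leaf,
					   unsigned int req_flags,
					   int req_ifindex)
{
	struct sk_route *rt;

	for (rt = leaf; rt; rt = rt->next) {
		/* A clone is only eligible when the caller asked for clones. */
		if ((rt->flags & SK_RTF_CACHE) && !(req_flags & SK_RTF_CACHE))
			continue;
		/* Optional device filter, as in the existing loop. */
		if (req_ifindex && rt->ifindex != req_ifindex)
			continue;
		return rt;
	}
	return NULL;
}

/* Small usage example: the clone sorts in front of its original route. */
static void sk_pick_for_delete_example(void)
{
	struct sk_route orig = { 0, 2, NULL };
	struct sk_route clone = { SK_RTF_CACHE, 2, &orig };

	assert(sk_pick_for_delete(&clone, 0, 0) == &orig);
	assert(sk_pick_for_delete(&clone, SK_RTF_CACHE, 0) == &clone);
}
```

The sketch leaves out the gateway and metric checks of the real loop, and it
does not model fib6_del() flushing the clones of a deleted route.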
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
---
net/ipv6/addrconf.c | 2 ++
net/ipv6/route.c | 6 ++++++
2 files changed, 8 insertions(+)
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 37b70e8..21c2c81 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -2121,6 +2121,8 @@ static struct rt6_info *addrconf_get_prefix_route(const struct in6_addr *pfx,
fn = fib6_locate(&table->tb6_root, pfx, plen, NULL, 0);
if (!fn)
goto out;
+
+ noflags |= RTF_CACHE;
for (rt = fn->leaf; rt; rt = rt->dst.rt6_next) {
if (rt->dst.dev->ifindex != dev->ifindex)
continue;
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 5c48293..4774f13 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1785,6 +1785,9 @@ static int ip6_route_del(struct fib6_config *cfg)
if (fn) {
for (rt = fn->leaf; rt; rt = rt->dst.rt6_next) {
+ if ((rt->rt6i_flags & RTF_CACHE) &&
+ !(cfg->fc_flags & RTF_CACHE))
+ continue;
if (cfg->fc_ifindex &&
(!rt->dst.dev ||
rt->dst.dev->ifindex != cfg->fc_ifindex))
@@ -2433,6 +2436,9 @@ static int rtm_to_fib6_config(struct sk_buff *skb, struct nlmsghdr *nlh,
if (rtm->rtm_type == RTN_LOCAL)
cfg->fc_flags |= RTF_LOCAL;
+ if (rtm->rtm_flags & RTM_F_CLONED)
+ cfg->fc_flags |= RTF_CACHE;
+
cfg->fc_nlinfo.portid = NETLINK_CB(skb).portid;
cfg->fc_nlinfo.nlh = nlh;
cfg->fc_nlinfo.nl_net = sock_net(skb->sk);
--
1.8.1
* [PATCH net-next 2/5] ipv6: Extend the route lookups to low priority metrics.
2015-04-28 20:03 [PATCH net-next 0/5] ipv6: Stop /128 route from disappearing after pmtu update Martin KaFai Lau
2015-04-28 20:03 ` [PATCH net-next 1/5] ipv6: Consider RTF_CACHE when searching the fib6 tree Martin KaFai Lau
@ 2015-04-28 20:03 ` Martin KaFai Lau
2015-04-28 20:03 ` [PATCH net-next 3/5] ipv6: Stop /128 route from disappearing after pmtu update Martin KaFai Lau
` (3 subsequent siblings)
5 siblings, 0 replies; 18+ messages in thread
From: Martin KaFai Lau @ 2015-04-28 20:03 UTC (permalink / raw)
To: netdev
Cc: Hannes Frederic Sowa, Steffen Klassert, David Miller,
Yang Yingliang, shengyong, Kernel Team
From: Steffen Klassert <steffen.klassert@secunet.com>
We search only for routes with the highest priority metric in
find_rr_leaf(). However, if one of these routes is marked
as invalid, we may fail to find a route even if there is
an appropriate route with lower priority. Then we lose
connectivity until the garbage collector deletes the
invalid route. This typically happens if a host route
expires after a pmtu event. Fix this by also searching
for routes with a lower priority metric.
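The fallback can be sketched in user space as follows (not part of the
patch). The names prefixed with sk_ are hypothetical, the "usable" field is
an assumed stand-in for whatever find_match() decides, and the round-robin
rotation via rr_head is left out, so this only shows the two-pass idea:
search the best-metric group first, and continue into lower priority routes
only when that group yields nothing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reduced route: sorted by metric on the leaf list. */
struct sk_rt {
	unsigned int metric;
	int usable;		/* assumed stand-in for a find_match() hit */
	struct sk_rt *next;
};

static struct sk_rt *sk_find_leaf(struct sk_rt *leaf)
{
	struct sk_rt *rt, *match = NULL, *cont = NULL;
	unsigned int metric;

	if (!leaf)
		return NULL;
	metric = leaf->metric;

	/* Pass 1: only the routes sharing the best (first) metric. */
	for (rt = leaf; rt; rt = rt->next) {
		if (rt->metric != metric) {
			cont = rt;	/* remember where the next group starts */
			break;
		}
		if (!match && rt->usable)
			match = rt;
	}

	if (match || !cont)
		return match;

	/* Pass 2: nothing usable so far, try the lower priority routes. */
	for (rt = cont; rt; rt = rt->next)
		if (!match && rt->usable)
			match = rt;

	return match;
}

/* Usage example: an expired host route in front of a usable prefix route. */
static void sk_find_leaf_example(void)
{
	struct sk_rt prefix = { 1024, 1, NULL };
	struct sk_rt expired = { 0, 0, &prefix };

	assert(sk_find_leaf(&expired) == &prefix);
}
```

Unlike the real find_match(), which scores candidates, the sketch simply takes
the first usable entry.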
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
net/ipv6/route.c | 28 +++++++++++++++++++++++-----
1 file changed, 23 insertions(+), 5 deletions(-)
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 4774f13..07562a2 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -652,15 +652,33 @@ static struct rt6_info *find_rr_leaf(struct fib6_node *fn,
u32 metric, int oif, int strict,
bool *do_rr)
{
- struct rt6_info *rt, *match;
+ struct rt6_info *rt, *match, *cont;
int mpri = -1;
match = NULL;
- for (rt = rr_head; rt && rt->rt6i_metric == metric;
- rt = rt->dst.rt6_next)
+ cont = NULL;
+ for (rt = rr_head; rt; rt = rt->dst.rt6_next) {
+ if (rt->rt6i_metric != metric) {
+ cont = rt;
+ break;
+ }
+
+ match = find_match(rt, oif, strict, &mpri, match, do_rr);
+ }
+
+ for (rt = fn->leaf; rt && rt != rr_head; rt = rt->dst.rt6_next) {
+ if (rt->rt6i_metric != metric) {
+ cont = rt;
+ break;
+ }
+
match = find_match(rt, oif, strict, &mpri, match, do_rr);
- for (rt = fn->leaf; rt && rt != rr_head && rt->rt6i_metric == metric;
- rt = rt->dst.rt6_next)
+ }
+
+ if (match || !cont)
+ return match;
+
+ for (rt = cont; rt; rt = rt->dst.rt6_next)
match = find_match(rt, oif, strict, &mpri, match, do_rr);
return match;
--
1.8.1
* [PATCH net-next 3/5] ipv6: Stop /128 route from disappearing after pmtu update
2015-04-28 20:03 [PATCH net-next 0/5] ipv6: Stop /128 route from disappearing after pmtu update Martin KaFai Lau
2015-04-28 20:03 ` [PATCH net-next 1/5] ipv6: Consider RTF_CACHE when searching the fib6 tree Martin KaFai Lau
2015-04-28 20:03 ` [PATCH net-next 2/5] ipv6: Extend the route lookups to low priority metrics Martin KaFai Lau
@ 2015-04-28 20:03 ` Martin KaFai Lau
2015-05-02 22:41 ` Hajime Tazaki
2015-04-28 20:03 ` [PATCH net-next 4/5] ipv6: Stop rt6_info from using inet_peer's metrics Martin KaFai Lau
` (2 subsequent siblings)
5 siblings, 1 reply; 18+ messages in thread
From: Martin KaFai Lau @ 2015-04-28 20:03 UTC (permalink / raw)
To: netdev
Cc: Hannes Frederic Sowa, Steffen Klassert, David Miller,
Yang Yingliang, shengyong, Kernel Team
This patch is mostly from Steffen Klassert <steffen.klassert@secunet.com>.
I only removed the (rt6->rt6i_dst.plen == 128) check from
ip6_rt_update_pmtu() because the (rt6->rt6i_flags & RTF_CACHE) test
already implies it.
This patch:
1. Creates an RTF_CACHE route for a /128 non-local route.
2. After (1), all routes that allow a pmtu update should have an RTF_CACHE
clone. Hence, stop updating the MTU for any non-RTF_CACHE route.
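To make point 2 concrete, here is a small user-space model of the new gate
in ip6_rt_update_pmtu() (not part of the patch). The sk_ / SK_ names and the
flag bit values are hypothetical; only the 1280-byte IPv6 minimum MTU is a
real constant. The sketch assumes the route's current MTU is held in a plain
field instead of the dst metrics.

```c
#include <assert.h>

/* Hypothetical flag bits; the values are arbitrary for this sketch. */
#define SK_RTF_CACHE	0x1u
#define SK_RTF_MODIFIED	0x2u
#define SK_IPV6_MIN_MTU	1280u	/* IPv6 minimum link MTU */

struct sk_rt6 {
	unsigned int flags;
	unsigned int mtu;	/* stand-in for dst_mtu() */
};

/* Returns 1 if the route accepted the new, smaller MTU, 0 otherwise. */
static int sk_update_pmtu(struct sk_rt6 *rt, unsigned int mtu)
{
	/* Only a clone may learn a PMTU, and only a smaller one. */
	if (mtu >= rt->mtu || !(rt->flags & SK_RTF_CACHE))
		return 0;

	rt->flags |= SK_RTF_MODIFIED;
	if (mtu < SK_IPV6_MIN_MTU)
		mtu = SK_IPV6_MIN_MTU;
	rt->mtu = mtu;
	return 1;
}

/* Usage example: the original /128 route is left alone, its clone is not. */
static void sk_update_pmtu_example(void)
{
	struct sk_rt6 orig = { 0, 1500 };
	struct sk_rt6 clone = { SK_RTF_CACHE, 1500 };

	assert(sk_update_pmtu(&orig, 1400) == 0 && orig.mtu == 1500);
	assert(sk_update_pmtu(&clone, 1400) == 1 && clone.mtu == 1400);
}
```

The expiry timer that the real function arms via rt6_update_expires() is not
modelled.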
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
net/ipv6/route.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 07562a2..aa4cfdd 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -977,7 +977,7 @@ redo_rt6_select:
if (!(rt->rt6i_flags & (RTF_NONEXTHOP | RTF_GATEWAY)))
nrt = rt6_alloc_cow(rt, &fl6->daddr, &fl6->saddr);
- else if (!(rt->dst.flags & DST_HOST))
+ else if (!(rt->dst.flags & DST_HOST) || !(rt->dst.flags & RTF_LOCAL))
nrt = rt6_alloc_clone(rt, &fl6->daddr);
else
goto out2;
@@ -1172,7 +1172,7 @@ static void ip6_rt_update_pmtu(struct dst_entry *dst, struct sock *sk,
struct rt6_info *rt6 = (struct rt6_info *)dst;
dst_confirm(dst);
- if (mtu < dst_mtu(dst) && rt6->rt6i_dst.plen == 128) {
+ if (mtu < dst_mtu(dst) && (rt6->rt6i_flags & RTF_CACHE)) {
struct net *net = dev_net(dst->dev);
rt6->rt6i_flags |= RTF_MODIFIED;
--
1.8.1
* [PATCH net-next 4/5] ipv6: Stop rt6_info from using inet_peer's metrics
2015-04-28 20:03 [PATCH net-next 0/5] ipv6: Stop /128 route from disappearing after pmtu update Martin KaFai Lau
` (2 preceding siblings ...)
2015-04-28 20:03 ` [PATCH net-next 3/5] ipv6: Stop /128 route from disappearing after pmtu update Martin KaFai Lau
@ 2015-04-28 20:03 ` Martin KaFai Lau
2015-04-28 20:03 ` [PATCH net-next 5/5] ipv6: Remove DST_METRICS_FORCE_OVERWRITE and _rt6i_peer Martin KaFai Lau
2015-05-02 1:01 ` [PATCH net-next 0/5] ipv6: Stop /128 route from disappearing after pmtu update David Miller
5 siblings, 0 replies; 18+ messages in thread
From: Martin KaFai Lau @ 2015-04-28 20:03 UTC (permalink / raw)
To: netdev
Cc: Hannes Frederic Sowa, Steffen Klassert, David Miller,
Yang Yingliang, shengyong, Kernel Team
inet_peer is indexed by the dst address alone. However, the fib6 tree
could have multiple routing entries (rt6_info) for the same dst. For
example,
1. A /128 dst via multiple gateways.
2. A RTF_CACHE route cloned from a /128 route.
In the above cases, all of them will share the same metrics and
step on each other.
This patch will steer away from inet_peer's metrics and use
dst_cow_metrics_generic() for everything.
Change Highlights:
1. Remove rt6_cow_metrics() which currently acquires metrics from
inet_peer for DST_HOST route (i.e. /128 route).
2. Add rt6i_pmtu to hold the pmtu update, which avoids creating a
full-size metrics array just to override RTAX_MTU.
3. After (2), the RTF_CACHE route can also share the metrics with its
dst.from route, by:
dst_init_metrics(&cache_rt->dst, dst_metrics_ptr(cache_rt->dst.from), true);
4. Stop creating RTF_CACHE route by cloning another RTF_CACHE route. Instead,
directly clone from rt->dst.
[ Currently, cloning from another RTF_CACHE is only possible during
rt6_do_redirect(). Also, the old clone is removed from the tree
immediately after the new clone is added. ]
In case of cloning from an older redirect RTF_CACHE, it should work as
before.
In case of cloning from an older pmtu RTF_CACHE, this patch will forget
the pmtu and re-learn it (if there is any) from the redirected route.
The _rt6i_peer and DST_METRICS_FORCE_OVERWRITE will be removed
in the next cleanup patch.
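The lookup order that change highlight (2) introduces in ip6_mtu() can be
sketched as below (not part of the patch). The sk_ / SK_ names are
hypothetical. The last fallback is an assumption, since the diff only shows
the start of that path: the sketch takes a device MTU argument and uses the
1280-byte IPv6 minimum when it is zero.

```c
#include <assert.h>

#define SK_IPV6_MIN_MTU 1280u	/* IPv6 minimum link MTU */

/* Hypothetical reduced route: only the two MTU sources of interest. */
struct sk_route_mtu {
	unsigned int pmtu;	 /* models rt6i_pmtu, 0 = no PMTU learned */
	unsigned int metric_mtu; /* models the raw RTAX_MTU metric, 0 = unset */
};

static unsigned int sk_route_mtu(const struct sk_route_mtu *rt,
				 unsigned int dev_mtu6)
{
	/* 1. A learned path MTU on the clone wins. */
	if (rt->pmtu)
		return rt->pmtu;
	/* 2. Otherwise use the (possibly shared) RTAX_MTU metric. */
	if (rt->metric_mtu)
		return rt->metric_mtu;
	/* 3. Assumed fallback: device MTU if known, else the IPv6 minimum. */
	return dev_mtu6 ? dev_mtu6 : SK_IPV6_MIN_MTU;
}

/* Usage example: a clone with a learned PMTU overrides the shared metric. */
static void sk_route_mtu_example(void)
{
	struct sk_route_mtu clone = { 1400, 1500 };
	struct sk_route_mtu plain = { 0, 1500 };

	assert(sk_route_mtu(&clone, 1500) == 1400);
	assert(sk_route_mtu(&plain, 1500) == 1500);
}
```

Because the PMTU lives in its own field, the clone can keep pointing at the
read-only metrics of its dst.from route, which is what highlight (3) relies on.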
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
---
include/net/ip6_fib.h | 10 +----
net/ipv6/route.c | 102 +++++++++++++++++++++++++++++---------------------
2 files changed, 60 insertions(+), 52 deletions(-)
diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
index 20e80fa..7383a8c 100644
--- a/include/net/ip6_fib.h
+++ b/include/net/ip6_fib.h
@@ -124,6 +124,7 @@ struct rt6_info {
unsigned long _rt6i_peer;
u32 rt6i_metric;
+ u32 rt6i_pmtu;
/* more non-fragment space at head required */
unsigned short rt6i_nfheader_len;
u8 rt6i_protocol;
@@ -189,15 +190,6 @@ static inline void rt6_update_expires(struct rt6_info *rt0, int timeout)
rt0->rt6i_flags |= RTF_EXPIRES;
}
-static inline void rt6_set_from(struct rt6_info *rt, struct rt6_info *from)
-{
- struct dst_entry *new = (struct dst_entry *) from;
-
- rt->rt6i_flags &= ~RTF_EXPIRES;
- dst_hold(new);
- rt->dst.from = new;
-}
-
static inline void ip6_rt_put(struct rt6_info *rt)
{
/* dst_release() accepts a NULL parameter.
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index aa4cfdd..4d6eb5d 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -92,6 +92,7 @@ static void ip6_rt_update_pmtu(struct dst_entry *dst, struct sock *sk,
struct sk_buff *skb, u32 mtu);
static void rt6_do_redirect(struct dst_entry *dst, struct sock *sk,
struct sk_buff *skb);
+static void rt6_dst_from_metrics_check(struct rt6_info *rt);
static int rt6_score_route(struct rt6_info *rt, int oif, int strict);
#ifdef CONFIG_IPV6_ROUTE_INFO
@@ -136,33 +137,12 @@ static struct inet_peer *rt6_get_peer_create(struct rt6_info *rt)
static u32 *ipv6_cow_metrics(struct dst_entry *dst, unsigned long old)
{
- struct rt6_info *rt = (struct rt6_info *) dst;
- struct inet_peer *peer;
- u32 *p = NULL;
+ struct rt6_info *rt = (struct rt6_info *)dst;
- if (!(rt->dst.flags & DST_HOST))
+ if (rt->rt6i_flags & RTF_CACHE)
+ return NULL;
+ else
return dst_cow_metrics_generic(dst, old);
-
- peer = rt6_get_peer_create(rt);
- if (peer) {
- u32 *old_p = __DST_METRICS_PTR(old);
- unsigned long prev, new;
-
- p = peer->metrics;
- if (inet_metrics_new(peer) ||
- (old & DST_METRICS_FORCE_OVERWRITE))
- memcpy(p, old_p, sizeof(u32) * RTAX_MAX);
-
- new = (unsigned long) p;
- prev = cmpxchg(&dst->_metrics, old, new);
-
- if (prev != old) {
- p = __DST_METRICS_PTR(prev);
- if (prev & DST_METRICS_READ_ONLY)
- p = NULL;
- }
- }
- return p;
}
static inline const void *choose_neigh_daddr(struct rt6_info *rt,
@@ -323,8 +303,7 @@ static void ip6_dst_destroy(struct dst_entry *dst)
struct inet6_dev *idev = rt->rt6i_idev;
struct dst_entry *from = dst->from;
- if (!(rt->dst.flags & DST_HOST))
- dst_destroy_metrics_generic(dst);
+ dst_destroy_metrics_generic(dst);
if (idev) {
rt->rt6i_idev = NULL;
@@ -333,11 +312,6 @@ static void ip6_dst_destroy(struct dst_entry *dst)
dst->from = NULL;
dst_release(from);
-
- if (rt6_has_peer(rt)) {
- struct inet_peer *peer = rt6_peer_ptr(rt);
- inet_putpeer(peer);
- }
}
static void ip6_dst_ifdown(struct dst_entry *dst, struct net_device *dev,
@@ -1003,6 +977,7 @@ redo_rt6_select:
goto redo_fib6_lookup_lock;
out2:
+ rt6_dst_from_metrics_check(rt);
rt->dst.lastuse = jiffies;
rt->dst.__use++;
@@ -1111,6 +1086,13 @@ struct dst_entry *ip6_blackhole_route(struct net *net, struct dst_entry *dst_ori
* Destination cache support functions
*/
+static void rt6_dst_from_metrics_check(struct rt6_info *rt)
+{
+ if (rt->dst.from &&
+ dst_metrics_ptr(&rt->dst) != dst_metrics_ptr(rt->dst.from))
+ dst_init_metrics(&rt->dst, dst_metrics_ptr(rt->dst.from), true);
+}
+
static struct dst_entry *ip6_dst_check(struct dst_entry *dst, u32 cookie)
{
struct rt6_info *rt;
@@ -1127,6 +1109,8 @@ static struct dst_entry *ip6_dst_check(struct dst_entry *dst, u32 cookie)
if (rt6_check_expired(rt))
return NULL;
+ rt6_dst_from_metrics_check(rt);
+
return dst;
}
@@ -1179,7 +1163,7 @@ static void ip6_rt_update_pmtu(struct dst_entry *dst, struct sock *sk,
if (mtu < IPV6_MIN_MTU)
mtu = IPV6_MIN_MTU;
- dst_metric_set(dst, RTAX_MTU, mtu);
+ rt6->rt6i_pmtu = mtu;
rt6_update_expires(rt6, net->ipv6.sysctl.ip6_rt_mtu_expires);
}
}
@@ -1359,12 +1343,17 @@ static unsigned int ip6_default_advmss(const struct dst_entry *dst)
static unsigned int ip6_mtu(const struct dst_entry *dst)
{
+ const struct rt6_info *rt = (const struct rt6_info *)dst;
+ unsigned int mtu = rt->rt6i_pmtu;
struct inet6_dev *idev;
- unsigned int mtu = dst_metric_raw(dst, RTAX_MTU);
if (mtu)
goto out;
+ mtu = dst_metric_raw(dst, RTAX_MTU);
+ if (mtu)
+ goto out;
+
mtu = IPV6_MIN_MTU;
rcu_read_lock();
@@ -1947,12 +1936,27 @@ out:
* Misc support functions
*/
+static void rt6_set_from(struct rt6_info *rt, struct rt6_info *from)
+{
+ BUG_ON(from->dst.from);
+
+ rt->rt6i_flags &= ~RTF_EXPIRES;
+ dst_hold(&from->dst);
+ rt->dst.from = &from->dst;
+ dst_init_metrics(&rt->dst, dst_metrics_ptr(&from->dst), true);
+}
+
static struct rt6_info *ip6_rt_copy(struct rt6_info *ort,
const struct in6_addr *dest)
{
struct net *net = dev_net(ort->dst.dev);
- struct rt6_info *rt = ip6_dst_alloc(net, ort->dst.dev, 0,
- ort->rt6i_table);
+ struct rt6_info *rt;
+
+ if (ort->rt6i_flags & RTF_CACHE)
+ ort = (struct rt6_info *)ort->dst.from;
+
+ rt = ip6_dst_alloc(net, ort->dst.dev, 0,
+ ort->rt6i_table);
if (rt) {
rt->dst.input = ort->dst.input;
@@ -1961,7 +1965,6 @@ static struct rt6_info *ip6_rt_copy(struct rt6_info *ort,
rt->rt6i_dst.addr = *dest;
rt->rt6i_dst.plen = 128;
- dst_copy_metrics(&rt->dst, &ort->dst);
rt->dst.error = ort->dst.error;
rt->rt6i_idev = ort->rt6i_idev;
if (rt->rt6i_idev)
@@ -2393,11 +2396,20 @@ static int rt6_mtu_change_route(struct rt6_info *rt, void *p_arg)
PMTU discouvery.
*/
if (rt->dst.dev == arg->dev &&
- !dst_metric_locked(&rt->dst, RTAX_MTU) &&
- (dst_mtu(&rt->dst) >= arg->mtu ||
- (dst_mtu(&rt->dst) < arg->mtu &&
- dst_mtu(&rt->dst) == idev->cnf.mtu6))) {
- dst_metric_set(&rt->dst, RTAX_MTU, arg->mtu);
+ !dst_metric_locked(&rt->dst, RTAX_MTU)) {
+ if (rt->rt6i_flags & RTF_CACHE) {
+ /* For RTF_CACHE with rt6i_pmtu == 0
+ * (i.e. a redirected route),
+ * the metrics of its rt->dst.from has already
+ * been updated.
+ */
+ if (rt->rt6i_pmtu && rt->rt6i_pmtu > arg->mtu)
+ rt->rt6i_pmtu = arg->mtu;
+ } else if (dst_mtu(&rt->dst) >= arg->mtu ||
+ (dst_mtu(&rt->dst) < arg->mtu &&
+ dst_mtu(&rt->dst) == idev->cnf.mtu6)) {
+ dst_metric_set(&rt->dst, RTAX_MTU, arg->mtu);
+ }
}
return 0;
}
@@ -2627,6 +2639,7 @@ static int rt6_fill_node(struct net *net,
int iif, int type, u32 portid, u32 seq,
int prefix, int nowait, unsigned int flags)
{
+ u32 metrics[RTAX_MAX];
struct rtmsg *rtm;
struct nlmsghdr *nlh;
long expires;
@@ -2740,7 +2753,10 @@ static int rt6_fill_node(struct net *net,
goto nla_put_failure;
}
- if (rtnetlink_put_metrics(skb, dst_metrics_ptr(&rt->dst)) < 0)
+ memcpy(metrics, dst_metrics_ptr(&rt->dst), sizeof(metrics));
+ if (rt->rt6i_pmtu)
+ metrics[RTAX_MTU - 1] = rt->rt6i_pmtu;
+ if (rtnetlink_put_metrics(skb, metrics) < 0)
goto nla_put_failure;
if (rt->rt6i_flags & RTF_GATEWAY) {
--
1.8.1
* [PATCH net-next 5/5] ipv6: Remove DST_METRICS_FORCE_OVERWRITE and _rt6i_peer
2015-04-28 20:03 [PATCH net-next 0/5] ipv6: Stop /128 route from disappearing after pmtu update Martin KaFai Lau
` (3 preceding siblings ...)
2015-04-28 20:03 ` [PATCH net-next 4/5] ipv6: Stop rt6_info from using inet_peer's metrics Martin KaFai Lau
@ 2015-04-28 20:03 ` Martin KaFai Lau
2015-05-02 1:01 ` [PATCH net-next 0/5] ipv6: Stop /128 route from disappearing after pmtu update David Miller
5 siblings, 0 replies; 18+ messages in thread
From: Martin KaFai Lau @ 2015-04-28 20:03 UTC (permalink / raw)
To: netdev
Cc: Hannes Frederic Sowa, Steffen Klassert, David Miller,
Yang Yingliang, shengyong, Kernel Team, Michal Kubeček
_rt6i_peer is no longer needed after the last patch,
'ipv6: Stop rt6_info from using inet_peer's metrics'.
DST_METRICS_FORCE_OVERWRITE is added by
commit e5fd387ad5b3 ("ipv6: do not overwrite inetpeer metrics prematurely").
Since inetpeer is no longer used for metrics, this bit is also not needed.
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Reviewed-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Michal Kubeček <mkubecek@suse.cz>
Cc: Steffen Klassert <steffen.klassert@secunet.com>
---
include/net/dst.h | 6 ------
include/net/ip6_fib.h | 31 -------------------------------
net/ipv6/route.c | 36 +-----------------------------------
net/ipv6/xfrm6_policy.c | 14 --------------
4 files changed, 1 insertion(+), 86 deletions(-)
diff --git a/include/net/dst.h b/include/net/dst.h
index 0fb99a2..22aa93f 100644
--- a/include/net/dst.h
+++ b/include/net/dst.h
@@ -109,7 +109,6 @@ u32 *dst_cow_metrics_generic(struct dst_entry *dst, unsigned long old);
extern const u32 dst_default_metrics[];
#define DST_METRICS_READ_ONLY 0x1UL
-#define DST_METRICS_FORCE_OVERWRITE 0x2UL
#define DST_METRICS_FLAGS 0x3UL
#define __DST_METRICS_PTR(Y) \
((u32 *)((Y) & ~DST_METRICS_FLAGS))
@@ -120,11 +119,6 @@ static inline bool dst_metrics_read_only(const struct dst_entry *dst)
return dst->_metrics & DST_METRICS_READ_ONLY;
}
-static inline void dst_metrics_set_force_overwrite(struct dst_entry *dst)
-{
- dst->_metrics |= DST_METRICS_FORCE_OVERWRITE;
-}
-
void __dst_destroy_metrics_generic(struct dst_entry *dst, unsigned long old);
static inline void dst_destroy_metrics_generic(struct dst_entry *dst)
diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
index 7383a8c..e000180 100644
--- a/include/net/ip6_fib.h
+++ b/include/net/ip6_fib.h
@@ -121,7 +121,6 @@ struct rt6_info {
struct rt6key rt6i_prefsrc;
struct inet6_dev *rt6i_idev;
- unsigned long _rt6i_peer;
u32 rt6i_metric;
u32 rt6i_pmtu;
@@ -130,36 +129,6 @@ struct rt6_info {
u8 rt6i_protocol;
};
-static inline struct inet_peer *rt6_peer_ptr(struct rt6_info *rt)
-{
- return inetpeer_ptr(rt->_rt6i_peer);
-}
-
-static inline bool rt6_has_peer(struct rt6_info *rt)
-{
- return inetpeer_ptr_is_peer(rt->_rt6i_peer);
-}
-
-static inline void __rt6_set_peer(struct rt6_info *rt, struct inet_peer *peer)
-{
- __inetpeer_ptr_set_peer(&rt->_rt6i_peer, peer);
-}
-
-static inline bool rt6_set_peer(struct rt6_info *rt, struct inet_peer *peer)
-{
- return inetpeer_ptr_set_peer(&rt->_rt6i_peer, peer);
-}
-
-static inline void rt6_init_peer(struct rt6_info *rt, struct inet_peer_base *base)
-{
- inetpeer_init_ptr(&rt->_rt6i_peer, base);
-}
-
-static inline void rt6_transfer_peer(struct rt6_info *rt, struct rt6_info *ort)
-{
- inetpeer_transfer_peer(&rt->_rt6i_peer, &ort->_rt6i_peer);
-}
-
static inline struct inet6_dev *ip6_dst_idev(struct dst_entry *dst)
{
return ((struct rt6_info *)dst)->rt6i_idev;
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 4d6eb5d..3522711 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -105,36 +105,6 @@ static struct rt6_info *rt6_get_route_info(struct net *net,
const struct in6_addr *gwaddr, int ifindex);
#endif
-static void rt6_bind_peer(struct rt6_info *rt, int create)
-{
- struct inet_peer_base *base;
- struct inet_peer *peer;
-
- base = inetpeer_base_ptr(rt->_rt6i_peer);
- if (!base)
- return;
-
- peer = inet_getpeer_v6(base, &rt->rt6i_dst.addr, create);
- if (peer) {
- if (!rt6_set_peer(rt, peer))
- inet_putpeer(peer);
- }
-}
-
-static struct inet_peer *__rt6_get_peer(struct rt6_info *rt, int create)
-{
- if (rt6_has_peer(rt))
- return rt6_peer_ptr(rt);
-
- rt6_bind_peer(rt, create);
- return (rt6_has_peer(rt) ? rt6_peer_ptr(rt) : NULL);
-}
-
-static struct inet_peer *rt6_get_peer_create(struct rt6_info *rt)
-{
- return __rt6_get_peer(rt, 1);
-}
-
static u32 *ipv6_cow_metrics(struct dst_entry *dst, unsigned long old)
{
struct rt6_info *rt = (struct rt6_info *)dst;
@@ -291,7 +261,6 @@ static inline struct rt6_info *ip6_dst_alloc(struct net *net,
struct dst_entry *dst = &rt->dst;
memset(dst + 1, 0, sizeof(*rt) - sizeof(*dst));
- rt6_init_peer(rt, table ? &table->tb6_peers : net->ipv6.peers);
INIT_LIST_HEAD(&rt->rt6i_siblings);
}
return rt;
@@ -1052,7 +1021,6 @@ struct dst_entry *ip6_blackhole_route(struct net *net, struct dst_entry *dst_ori
new = &rt->dst;
memset(new + 1, 0, sizeof(*rt) - sizeof(*new));
- rt6_init_peer(rt, net->ipv6.peers);
new->__use = 1;
new->input = dst_discard;
@@ -1597,10 +1565,8 @@ int ip6_route_add(struct fib6_config *cfg)
ipv6_addr_prefix(&rt->rt6i_dst.addr, &cfg->fc_dst, cfg->fc_dst_len);
rt->rt6i_dst.plen = cfg->fc_dst_len;
- if (rt->rt6i_dst.plen == 128) {
+ if (rt->rt6i_dst.plen == 128)
rt->dst.flags |= DST_HOST;
- dst_metrics_set_force_overwrite(&rt->dst);
- }
#ifdef CONFIG_IPV6_SUBTREES
ipv6_addr_prefix(&rt->rt6i_src.addr, &cfg->fc_src, cfg->fc_src_len);
diff --git a/net/ipv6/xfrm6_policy.c b/net/ipv6/xfrm6_policy.c
index f337a90..6ae256b 100644
--- a/net/ipv6/xfrm6_policy.c
+++ b/net/ipv6/xfrm6_policy.c
@@ -71,13 +71,6 @@ static int xfrm6_get_tos(const struct flowi *fl)
return 0;
}
-static void xfrm6_init_dst(struct net *net, struct xfrm_dst *xdst)
-{
- struct rt6_info *rt = (struct rt6_info *)xdst;
-
- rt6_init_peer(rt, net->ipv6.peers);
-}
-
static int xfrm6_init_path(struct xfrm_dst *path, struct dst_entry *dst,
int nfheader_len)
{
@@ -106,8 +99,6 @@ static int xfrm6_fill_dst(struct xfrm_dst *xdst, struct net_device *dev,
return -ENODEV;
}
- rt6_transfer_peer(&xdst->u.rt6, rt);
-
/* Sheit... I remember I did this right. Apparently,
* it was magically lost, so this code needs audit */
xdst->u.rt6.rt6i_flags = rt->rt6i_flags & (RTF_ANYCAST |
@@ -255,10 +246,6 @@ static void xfrm6_dst_destroy(struct dst_entry *dst)
if (likely(xdst->u.rt6.rt6i_idev))
in6_dev_put(xdst->u.rt6.rt6i_idev);
dst_destroy_metrics_generic(dst);
- if (rt6_has_peer(&xdst->u.rt6)) {
- struct inet_peer *peer = rt6_peer_ptr(&xdst->u.rt6);
- inet_putpeer(peer);
- }
xfrm_dst_destroy(xdst);
}
@@ -308,7 +295,6 @@ static struct xfrm_policy_afinfo xfrm6_policy_afinfo = {
.get_saddr = xfrm6_get_saddr,
.decode_session = _decode_session6,
.get_tos = xfrm6_get_tos,
- .init_dst = xfrm6_init_dst,
.init_path = xfrm6_init_path,
.fill_dst = xfrm6_fill_dst,
.blackhole_route = ip6_blackhole_route,
--
1.8.1
* Re: [PATCH net-next 0/5] ipv6: Stop /128 route from disappearing after pmtu update
2015-04-28 20:03 [PATCH net-next 0/5] ipv6: Stop /128 route from disappearing after pmtu update Martin KaFai Lau
` (4 preceding siblings ...)
2015-04-28 20:03 ` [PATCH net-next 5/5] ipv6: Remove DST_METRICS_FORCE_OVERWRITE and _rt6i_peer Martin KaFai Lau
@ 2015-05-02 1:01 ` David Miller
5 siblings, 0 replies; 18+ messages in thread
From: David Miller @ 2015-05-02 1:01 UTC (permalink / raw)
To: kafai
Cc: netdev, hannes, steffen.klassert, yangyingliang, shengyong1, Kernel-team
From: Martin KaFai Lau <kafai@fb.com>
Date: Tue, 28 Apr 2015 13:03:02 -0700
> The series is separated from another patch series,
> 'ipv6: Only create RTF_CACHE route after encountering pmtu exception',
> which can be found here:
Looks good, and the divorce of inetpeer from ipv6 routes is
especially nice to see.
Series applied, thanks Martin!
* Re: [PATCH net-next 3/5] ipv6: Stop /128 route from disappearing after pmtu update
2015-04-28 20:03 ` [PATCH net-next 3/5] ipv6: Stop /128 route from disappearing after pmtu update Martin KaFai Lau
@ 2015-05-02 22:41 ` Hajime Tazaki
2015-05-02 23:20 ` Martin KaFai Lau
0 siblings, 1 reply; 18+ messages in thread
From: Hajime Tazaki @ 2015-05-02 22:41 UTC (permalink / raw)
To: kafai
Cc: netdev, hannes, steffen.klassert, davem, yangyingliang,
shengyong1, Kernel-team
Hi Martin, Dave,
# I'm not a policeman though..
a regression is detected by my nightly test (below) and
quick bisecting with LibOS (ns-3/DCE) gave me this commit.
http://ns-3-dce.cloud.wide.ad.jp/jenkins/job/daily-net-next-sim/878/
At Tue, 28 Apr 2015 13:03:05 -0700,
Martin KaFai Lau wrote:
> ---
> net/ipv6/route.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
> index 07562a2..aa4cfdd 100644
> --- a/net/ipv6/route.c
> +++ b/net/ipv6/route.c
> @@ -977,7 +977,7 @@ redo_rt6_select:
>
> if (!(rt->rt6i_flags & (RTF_NONEXTHOP | RTF_GATEWAY)))
> nrt = rt6_alloc_cow(rt, &fl6->daddr, &fl6->saddr);
> - else if (!(rt->dst.flags & DST_HOST))
> + else if (!(rt->dst.flags & DST_HOST) || !(rt->dst.flags & RTF_LOCAL))
> nrt = rt6_alloc_clone(rt, &fl6->daddr);
> else
> goto out2;
> @@ -1172,7 +1172,7 @@ static void ip6_rt_update_pmtu(struct dst_entry *dst, struct sock *sk,
> struct rt6_info *rt6 = (struct rt6_info *)dst;
>
> dst_confirm(dst);
> - if (mtu < dst_mtu(dst) && rt6->rt6i_dst.plen == 128) {
> + if (mtu < dst_mtu(dst) && (rt6->rt6i_flags & RTF_CACHE)) {
> struct net *net = dev_net(dst->dev);
>
> rt6->rt6i_flags |= RTF_MODIFIED;
- how to reproduce it
the test is simply sending an IPv6 packet to a node on the
same subnet to verify the connectivity (e.g., ping6
2001:1::2 from 2001:1::1) and echo packets didn't get back.
reverting this commit fixes the issue.
please take a look at it: I'm glad to know if this only
happens in my local environment.
-- Hajime
* Re: [PATCH net-next 3/5] ipv6: Stop /128 route from disappearing after pmtu update
2015-05-02 22:41 ` Hajime Tazaki
@ 2015-05-02 23:20 ` Martin KaFai Lau
2015-05-03 0:19 ` Hajime Tazaki
0 siblings, 1 reply; 18+ messages in thread
From: Martin KaFai Lau @ 2015-05-02 23:20 UTC (permalink / raw)
To: Hajime Tazaki
Cc: netdev, hannes, steffen.klassert, davem, yangyingliang,
shengyong1, Kernel-team
Hi Hajime,
On Sun, May 03, 2015 at 07:41:57AM +0900, Hajime Tazaki wrote:
> a regression is detected by my nightly test (below) and
> quick bisecting with LibOS (ns-3/DCE) gave me this commit.
>
> http://ns-3-dce.cloud.wide.ad.jp/jenkins/job/daily-net-next-sim/878/
>
> At Tue, 28 Apr 2015 13:03:05 -0700,
> Martin KaFai Lau wrote:
> > ---
> > net/ipv6/route.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/net/ipv6/route.c b/net/ipv6/route.c
> > index 07562a2..aa4cfdd 100644
> > --- a/net/ipv6/route.c
> > +++ b/net/ipv6/route.c
> > @@ -977,7 +977,7 @@ redo_rt6_select:
> >
> > if (!(rt->rt6i_flags & (RTF_NONEXTHOP | RTF_GATEWAY)))
> > nrt = rt6_alloc_cow(rt, &fl6->daddr, &fl6->saddr);
> > - else if (!(rt->dst.flags & DST_HOST))
> > + else if (!(rt->dst.flags & DST_HOST) || !(rt->dst.flags & RTF_LOCAL))
> > nrt = rt6_alloc_clone(rt, &fl6->daddr);
> > else
> > goto out2;
> > @@ -1172,7 +1172,7 @@ static void ip6_rt_update_pmtu(struct dst_entry *dst, struct sock *sk,
> > struct rt6_info *rt6 = (struct rt6_info *)dst;
> >
> > dst_confirm(dst);
> > - if (mtu < dst_mtu(dst) && rt6->rt6i_dst.plen == 128) {
> > + if (mtu < dst_mtu(dst) && (rt6->rt6i_flags & RTF_CACHE)) {
> > struct net *net = dev_net(dst->dev);
> >
> > rt6->rt6i_flags |= RTF_MODIFIED;
>
> - how to reproduce it
>
> the test is simply sending an IPv6 packet to a node on the
> same subnet to verify the connectivity (e.g., ping6
> 2001:1::2 from 2001:1::1) and echo packets didn't get back.
>
> reverting this commit fixes the issue.
>
> please take a look at it: I'm glad to know if this only
> happens in my local environment.
Thanks for reporting.
I cannot reproduce in my environment.
15:58:30.658360 6a:aa:e6:a1:ce:f9 > 52:54:00:12:34:56, ethertype IPv6 (0x86dd), length 118: 2001:1::2 > 2001:1::1: ICMP6, echo request, seq 1, length 64
15:58:30.658479 52:54:00:12:34:56 > 6a:aa:e6:a1:ce:f9, ethertype IPv6 (0x86dd), length 118: 2001:1::1 > 2001:1::2: ICMP6, echo reply, seq 1, length 64
15:58:31.658093 6a:aa:e6:a1:ce:f9 > 52:54:00:12:34:56, ethertype IPv6 (0x86dd), length 118: 2001:1::2 > 2001:1::1: ICMP6, echo request, seq 2, length 64
15:58:31.658214 52:54:00:12:34:56 > 6a:aa:e6:a1:ce:f9, ethertype IPv6 (0x86dd), length 118: 2001:1::1 > 2001:1::2: ICMP6, echo reply, seq 2, length 64
15:58:32.657977 6a:aa:e6:a1:ce:f9 > 52:54:00:12:34:56, ethertype IPv6 (0x86dd), length 118: 2001:1::2 > 2001:1::1: ICMP6, echo request, seq 3, length 64
15:58:32.658079 52:54:00:12:34:56 > 6a:aa:e6:a1:ce:f9, ethertype IPv6 (0x86dd), length 118: 2001:1::1 > 2001:1::2: ICMP6, echo reply, seq 3, length 64
15:58:33.658104 6a:aa:e6:a1:ce:f9 > 52:54:00:12:34:56, ethertype IPv6 (0x86dd), length 118: 2001:1::2 > 2001:1::1: ICMP6, echo request, seq 4, length 64
15:58:33.658243 52:54:00:12:34:56 > 6a:aa:e6:a1:ce:f9, ethertype IPv6 (0x86dd), length 118: 2001:1::1 > 2001:1::2: ICMP6, echo reply, seq 4, length 64
15:58:34.658150 6a:aa:e6:a1:ce:f9 > 52:54:00:12:34:56, ethertype IPv6 (0x86dd), length 118: 2001:1::2 > 2001:1::1: ICMP6, echo request, seq 5, length 64
15:58:34.658275 52:54:00:12:34:56 > 6a:aa:e6:a1:ce:f9, ethertype IPv6 (0x86dd), length 118: 2001:1::1 > 2001:1::2: ICMP6, echo reply, seq 5, length 64
I suspect there is an RTF_LOCAL route getting an ICMPv6 too-big packet.
Can you provide a tcpdump at both ends? Also, the output of
the 'ip -6 a' and 'ip -6 r show'.
Also, can you try the following change which is a partial revert. If ping goes
through again, can you capture the 'ip -6 show' on both sides quickly after the
test.
Thanks,
--Martin
diff --git i/net/ipv6/route.c w/net/ipv6/route.c
index 3522711..60212d4 100644
--- i/net/ipv6/route.c
+++ w/net/ipv6/route.c
@@ -1124,7 +1124,7 @@ static void ip6_rt_update_pmtu(struct dst_entry *dst, struct sock *sk,
struct rt6_info *rt6 = (struct rt6_info *)dst;
dst_confirm(dst);
- if (mtu < dst_mtu(dst) && (rt6->rt6i_flags & RTF_CACHE)) {
+ if (mtu < dst_mtu(dst) && rt6->rt6i_dst.plen == 128) {
struct net *net = dev_net(dst->dev);
rt6->rt6i_flags |= RTF_MODIFIED;
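To make the difference between the two conditions concrete, here is a minimal userspace sketch. It is not kernel code: `struct pmtu_rt` and both helper names are hypothetical, and the RTF_CACHE value is assumed to match include/uapi/linux/ipv6_route.h. It only shows that the `plen == 128` test also matches a configured /128 route, while the RTF_CACHE test matches only a cached clone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to match include/uapi/linux/ipv6_route.h; illustrative copy only. */
#define RTF_CACHE 0x01000000u

/* Hypothetical, stripped-down stand-in for the fields of struct rt6_info
 * that the two conditions look at. */
struct pmtu_rt {
	uint32_t rt6i_flags;	/* RTF_* bits */
	int plen;		/* destination prefix length */
	unsigned int mtu;	/* current path MTU of the route */
};

/* Condition restored by the partial revert: any /128 route qualifies. */
bool pmtu_applies_by_plen(const struct pmtu_rt *rt, unsigned int mtu)
{
	return mtu < rt->mtu && rt->plen == 128;
}

/* Condition used by patch 3/5: only an RTF_CACHE clone qualifies. */
bool pmtu_applies_by_cache(const struct pmtu_rt *rt, unsigned int mtu)
{
	return mtu < rt->mtu && (rt->rt6i_flags & RTF_CACHE) != 0;
}

/* Small self-check: a configured /128 route versus a cached clone of it. */
void pmtu_self_check(void)
{
	struct pmtu_rt configured = { .rt6i_flags = 0, .plen = 128, .mtu = 1500 };
	struct pmtu_rt clone = { .rt6i_flags = RTF_CACHE, .plen = 128, .mtu = 1500 };

	assert(pmtu_applies_by_plen(&configured, 1280));
	assert(!pmtu_applies_by_cache(&configured, 1280));
	assert(pmtu_applies_by_plen(&clone, 1280));
	assert(pmtu_applies_by_cache(&clone, 1280));
}
```

With the prefix-length test, a too-big message for a configured /128 route would modify that route itself; with the flag test only the clone is touched.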
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH net-next 3/5] ipv6: Stop /128 route from disappearing after pmtu update
2015-05-02 23:20 ` Martin KaFai Lau
@ 2015-05-03 0:19 ` Hajime Tazaki
2015-05-03 1:00 ` Martin KaFai Lau
2015-05-03 3:38 ` Martin KaFai Lau
0 siblings, 2 replies; 18+ messages in thread
From: Hajime Tazaki @ 2015-05-03 0:19 UTC (permalink / raw)
To: kafai
Cc: netdev, hannes, steffen.klassert, davem, yangyingliang,
shengyong1, Kernel-team
Hello Martin,
thank you for your quick reply.
At Sat, 2 May 2015 16:20:40 -0700,
Martin KaFai Lau wrote:
> > - how to reproduce it
> >
> > the test is simply sending an IPv6 packet to a node on the
> > same subnet to verify the connectivity (e.g., ping6
> > 2001:1::2 from 2001:1::1) and echo packets didn't get back.
> >
> > reverting this commit fixes the issue.
> >
> > please take a look at it: I'm glad to know if this only
> > happens in my local environment.
> Thanks for reporting.
>
> I cannot reproduce in my environment.
(snip)
> 15:58:34.658150 6a:aa:e6:a1:ce:f9 > 52:54:00:12:34:56, ethertype IPv6 (0x86dd), length 118: 2001:1::2 > 2001:1::1: ICMP6, echo request, seq 5, length 64
> 15:58:34.658275 52:54:00:12:34:56 > 6a:aa:e6:a1:ce:f9, ethertype IPv6 (0x86dd), length 118: 2001:1::1 > 2001:1::2: ICMP6, echo reply, seq 5, length 64
>
> I suspect there is a RTF_LOCAL route getting a ICMPv6 too-big packet.
>
> Can you provide a tcpdump at both ends? Also, the output of
> the 'ip -6 a' and 'ip -6 r show'.
- tcpdump -vvv
09:00:00.200000 IP6 (hlim 255, next-header ICMPv6 (58) payload length: 16) fe80::200:ff:fe00:1 > ff02::2: [icmp6 sum ok] ICMP6, router solicitation, length 16
source link-address option (1), length 8 (1): 00:00:00:00:00:01
0x0000: 0000 0000 0001
09:00:00.401092 IP6 (hlim 255, next-header ICMPv6 (58) payload length: 16) fe80::200:ff:fe00:2 > ff02::2: [icmp6 sum ok] ICMP6, router solicitation, length 16
source link-address option (1), length 8 (1): 00:00:00:00:00:02
0x0000: 0000 0000 0002
09:00:01.000000 IP6 (hlim 64, next-header ICMPv6 (58) payload length: 1008) 2001:1::1 > 2001:1::2: [icmp6 sum ok] ICMP6, echo request, seq 1
09:00:02.000000 IP6 (hlim 64, next-header ICMPv6 (58) payload length: 1008) 2001:1::1 > 2001:1::2: [icmp6 sum ok] ICMP6, echo request, seq 2
09:00:03.000000 IP6 (hlim 64, next-header ICMPv6 (58) payload length: 1008) 2001:1::1 > 2001:1::2: [icmp6 sum ok] ICMP6, echo request, seq 3
09:00:04.000000 IP6 (hlim 64, next-header ICMPv6 (58) payload length: 1008) 2001:1::1 > 2001:1::2: [icmp6 sum ok] ICMP6, echo request, seq 4
09:00:04.200000 IP6 (hlim 255, next-header ICMPv6 (58) payload length: 16) fe80::200:ff:fe00:1 > ff02::2: [icmp6 sum ok] ICMP6, router solicitation, length 16
source link-address option (1), length 8 (1): 00:00:00:00:00:01
0x0000: 0000 0000 0001
09:00:04.401092 IP6 (hlim 255, next-header ICMPv6 (58) payload length: 16) fe80::200:ff:fe00:2 > ff02::2: [icmp6 sum ok] ICMP6, router solicitation, length 16
source link-address option (1), length 8 (1): 00:00:00:00:00:02
0x0000: 0000 0000 0002
(snip)
- 'ip -6 a' at the ping6 sender
7: sim0: <BROADCAST,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qlen 1000
inet6 2001:1::1/64 scope global
valid_lft forever preferred_lft forever
inet6 fe80::200:ff:fe00:1/64 scope link
valid_lft forever preferred_lft forever
- 'ip -6 r show' at the ping6 sender
2001:1::/64 dev sim0 proto kernel metric 256
fe80::/64 dev sim0 proto kernel metric 256
# the results of the ip commands on the receiver side are almost
the same.
I found that the test uses a NOARP interface between the nodes:
if I change the interface to a non-NOARP NIC, the issue goes
away without the revert.
I'm using the following scenario: just FYI.
https://gist.github.com/thehajime/26be8606ddbb924f357c
> Also, can you try the following change which is a partial revert. If ping goes
> through again, can you capture the 'ip -6 show' on both sides quickly after the
> test.
>
> Thanks,
> --Martin
>
> diff --git i/net/ipv6/route.c w/net/ipv6/route.c
> index 3522711..60212d4 100644
> --- i/net/ipv6/route.c
> +++ w/net/ipv6/route.c
> @@ -1124,7 +1124,7 @@ static void ip6_rt_update_pmtu(struct dst_entry *dst, struct sock *sk,
> struct rt6_info *rt6 = (struct rt6_info *)dst;
>
> dst_confirm(dst);
> - if (mtu < dst_mtu(dst) && (rt6->rt6i_flags & RTF_CACHE)) {
> + if (mtu < dst_mtu(dst) && rt6->rt6i_dst.plen == 128) {
> struct net *net = dev_net(dst->dev);
>
> rt6->rt6i_flags |= RTF_MODIFIED;
this partial revert didn't change my situation.
-- Hajime
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH net-next 3/5] ipv6: Stop /128 route from disappearing after pmtu update
2015-05-03 0:19 ` Hajime Tazaki
@ 2015-05-03 1:00 ` Martin KaFai Lau
2015-05-03 1:03 ` Martin KaFai Lau
2015-05-03 14:26 ` Hajime Tazaki
2015-05-03 3:38 ` Martin KaFai Lau
1 sibling, 2 replies; 18+ messages in thread
From: Martin KaFai Lau @ 2015-05-03 1:00 UTC (permalink / raw)
To: Hajime Tazaki
Cc: netdev, hannes, steffen.klassert, davem, yangyingliang,
shengyong1, Kernel-team
On Sun, May 03, 2015 at 09:19:47AM +0900, Hajime Tazaki wrote:
> At Sat, 2 May 2015 16:20:40 -0700,
> Martin KaFai Lau wrote:
> > Can you provide a tcpdump at both ends? Also, the output of
> > the 'ip -6 a' and 'ip -6 r show'.
>
> - tcpdump -vvv
> 09:00:00.200000 IP6 (hlim 255, next-header ICMPv6 (58) payload length: 16) fe80::200:ff:fe00:1 > ff02::2: [icmp6 sum ok] ICMP6, router solicitation, length 16
> source link-address option (1), length 8 (1): 00:00:00:00:00:01
> 0x0000: 0000 0000 0001
> 09:00:00.401092 IP6 (hlim 255, next-header ICMPv6 (58) payload length: 16) fe80::200:ff:fe00:2 > ff02::2: [icmp6 sum ok] ICMP6, router solicitation, length 16
> source link-address option (1), length 8 (1): 00:00:00:00:00:02
> 0x0000: 0000 0000 0002
> 09:00:01.000000 IP6 (hlim 64, next-header ICMPv6 (58) payload length: 1008) 2001:1::1 > 2001:1::2: [icmp6 sum ok] ICMP6, echo request, seq 1
> 09:00:02.000000 IP6 (hlim 64, next-header ICMPv6 (58) payload length: 1008) 2001:1::1 > 2001:1::2: [icmp6 sum ok] ICMP6, echo request, seq 2
> 09:00:03.000000 IP6 (hlim 64, next-header ICMPv6 (58) payload length: 1008) 2001:1::1 > 2001:1::2: [icmp6 sum ok] ICMP6, echo request, seq 3
> 09:00:04.000000 IP6 (hlim 64, next-header ICMPv6 (58) payload length: 1008) 2001:1::1 > 2001:1::2: [icmp6 sum ok] ICMP6, echo request, seq 4
> 09:00:04.200000 IP6 (hlim 255, next-header ICMPv6 (58) payload length: 16) fe80::200:ff:fe00:1 > ff02::2: [icmp6 sum ok] ICMP6, router solicitation, length 16
> source link-address option (1), length 8 (1): 00:00:00:00:00:01
> 0x0000: 0000 0000 0001
> 09:00:04.401092 IP6 (hlim 255, next-header ICMPv6 (58) payload length: 16) fe80::200:ff:fe00:2 > ff02::2: [icmp6 sum ok] ICMP6, router solicitation, length 16
> source link-address option (1), length 8 (1): 00:00:00:00:00:02
> 0x0000: 0000 0000 0002
Was it captured at the sender side?
Did the receiver (2001:1::2) get the echo request?
> (snip)
>
> - 'ip -6 a' at the ping6 sender
> 7: sim0: <BROADCAST,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qlen 1000
> inet6 2001:1::1/64 scope global
> valid_lft forever preferred_lft forever
> inet6 fe80::200:ff:fe00:1/64 scope link
> valid_lft forever preferred_lft forever
>
> - 'ip -6 r show' at the ping6 sender
> 2001:1::/64 dev sim0 proto kernel metric 256
> fe80::/64 dev sim0 proto kernel metric 256
>
hmm...It is weird. It is a /64 route, so it should have
failed the (rt->dst.flags & DST_HOST) test anyway...
> # the results of ip command on receiver side are almost
> similar.
>
> I found that the test uses non-ARP interface between nodes:
> if I changed the interface to 'non-NOARP' NIC, the issue has
> gone away without the revert.
>
> I'm using the following scenario: just FYI.
>
> https://gist.github.com/thehajime/26be8606ddbb924f357c
>
You meant without 'arp off'? Can you grep those IPs from 'ip -6 neigh'?
Can you try this patch just to confirm:
Thanks
--Martin
diff --git i/net/ipv6/route.c w/net/ipv6/route.c
index 3522711..c0ae180 100644
--- i/net/ipv6/route.c
+++ w/net/ipv6/route.c
@@ -920,7 +920,7 @@ redo_rt6_select:
if (!(rt->rt6i_flags & (RTF_NONEXTHOP | RTF_GATEWAY)))
nrt = rt6_alloc_cow(rt, &fl6->daddr, &fl6->saddr);
- else if (!(rt->dst.flags & DST_HOST) || !(rt->dst.flags & RTF_LOCAL))
+ else if (!(rt->dst.flags & DST_HOST)))
nrt = rt6_alloc_clone(rt, &fl6->daddr);
else
goto out2;
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH net-next 3/5] ipv6: Stop /128 route from disappearing after pmtu update
2015-05-03 1:00 ` Martin KaFai Lau
@ 2015-05-03 1:03 ` Martin KaFai Lau
2015-05-03 14:26 ` Hajime Tazaki
1 sibling, 0 replies; 18+ messages in thread
From: Martin KaFai Lau @ 2015-05-03 1:03 UTC (permalink / raw)
To: Hajime Tazaki
Cc: netdev, hannes, steffen.klassert, davem, yangyingliang,
shengyong1, Kernel-team
On Sat, May 02, 2015 at 06:00:55PM -0700, Martin KaFai Lau wrote:
> Can you try this patch just to confirm:
>
> Thanks
> --Martin
>
> diff --git i/net/ipv6/route.c w/net/ipv6/route.c
> index 3522711..c0ae180 100644
> --- i/net/ipv6/route.c
> +++ w/net/ipv6/route.c
> @@ -920,7 +920,7 @@ redo_rt6_select:
>
> if (!(rt->rt6i_flags & (RTF_NONEXTHOP | RTF_GATEWAY)))
> nrt = rt6_alloc_cow(rt, &fl6->daddr, &fl6->saddr);
> - else if (!(rt->dst.flags & DST_HOST) || !(rt->dst.flags & RTF_LOCAL))
> + else if (!(rt->dst.flags & DST_HOST)))
> nrt = rt6_alloc_clone(rt, &fl6->daddr);
> else
> goto out2;
Sorry for the noise, try this one instead
diff --git i/net/ipv6/route.c w/net/ipv6/route.c
index 3522711..f81b321 100644
--- i/net/ipv6/route.c
+++ w/net/ipv6/route.c
@@ -920,7 +920,7 @@ redo_rt6_select:
if (!(rt->rt6i_flags & (RTF_NONEXTHOP | RTF_GATEWAY)))
nrt = rt6_alloc_cow(rt, &fl6->daddr, &fl6->saddr);
- else if (!(rt->dst.flags & DST_HOST) || !(rt->dst.flags & RTF_LOCAL))
+ else if (!(rt->dst.flags & DST_HOST))
nrt = rt6_alloc_clone(rt, &fl6->daddr);
else
goto out2;
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH net-next 3/5] ipv6: Stop /128 route from disappearing after pmtu update
2015-05-03 0:19 ` Hajime Tazaki
2015-05-03 1:00 ` Martin KaFai Lau
@ 2015-05-03 3:38 ` Martin KaFai Lau
2015-05-03 14:29 ` Hajime Tazaki
1 sibling, 1 reply; 18+ messages in thread
From: Martin KaFai Lau @ 2015-05-03 3:38 UTC (permalink / raw)
To: Hajime Tazaki
Cc: netdev, hannes, steffen.klassert, davem, yangyingliang,
shengyong1, Kernel-team
On Sun, May 03, 2015 at 09:19:47AM +0900, Hajime Tazaki wrote:
> - 'ip -6 r show' at the ping6 sender
> 2001:1::/64 dev sim0 proto kernel metric 256
> fe80::/64 dev sim0 proto kernel metric 256
>
> # the results of ip command on receiver side are almost
> similar.
>
> I found that the test uses non-ARP interface between nodes:
> if I changed the interface to 'non-NOARP' NIC, the issue has
> gone away without the revert.
I have given this a little more thought. With the partial revert below
ruled out, and given the /64 route in your test,
I fail to see how the other one-line change could have broken it.
Can you share some more details on your test and
can you reproduce it with some basic iproute2 commands?
--Martin
>
> I'm using the following scenario: just FYI.
>
> https://gist.github.com/thehajime/26be8606ddbb924f357c
>
> > Also, can you try the following change which is a partial revert. If ping goes
> > through again, can you capture the 'ip -6 show' on both sides quickly after the
> > test.
> >
> > Thanks,
> > --Martin
> >
> > diff --git i/net/ipv6/route.c w/net/ipv6/route.c
> > index 3522711..60212d4 100644
> > --- i/net/ipv6/route.c
> > +++ w/net/ipv6/route.c
> > @@ -1124,7 +1124,7 @@ static void ip6_rt_update_pmtu(struct dst_entry *dst, struct sock *sk,
> > struct rt6_info *rt6 = (struct rt6_info *)dst;
> >
> > dst_confirm(dst);
> > - if (mtu < dst_mtu(dst) && (rt6->rt6i_flags & RTF_CACHE)) {
> > + if (mtu < dst_mtu(dst) && rt6->rt6i_dst.plen == 128) {
> > struct net *net = dev_net(dst->dev);
> >
> > rt6->rt6i_flags |= RTF_MODIFIED;
>
> this partial revert didn't change my situation.
>
>
> -- Hajime
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH net-next 3/5] ipv6: Stop /128 route from disappearing after pmtu update
2015-05-03 1:00 ` Martin KaFai Lau
2015-05-03 1:03 ` Martin KaFai Lau
@ 2015-05-03 14:26 ` Hajime Tazaki
1 sibling, 0 replies; 18+ messages in thread
From: Hajime Tazaki @ 2015-05-03 14:26 UTC (permalink / raw)
To: kafai
Cc: netdev, hannes, steffen.klassert, davem, yangyingliang,
shengyong1, Kernel-team
At Sat, 2 May 2015 18:00:55 -0700,
Martin KaFai Lau wrote:
> > 09:00:04.401092 IP6 (hlim 255, next-header ICMPv6 (58) payload length: 16) fe80::200:ff:fe00:2 > ff02::2: [icmp6 sum ok] ICMP6, router solicitation, length 16
> > source link-address option (1), length 8 (1): 00:00:00:00:00:02
> > 0x0000: 0000 0000 0002
> Was it captured at the sender side?
> Did the receiver (2001:1::2) get the echo request?
the capture was on the sender side.
the receiver got the echo request. I will give details in the next
email, but since the two nodes are connected back to back via a
point-to-point data link, the receiver side also has exactly
the same pcap.
> > (snip)
> > - 'ip -6 a' at the ping6 sender
> > 7: sim0: <BROADCAST,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qlen 1000
> > inet6 2001:1::1/64 scope global
> > valid_lft forever preferred_lft forever
> > inet6 fe80::200:ff:fe00:1/64 scope link
> > valid_lft forever preferred_lft forever
> >
> > - 'ip -6 r show' at the ping6 sender
> > 2001:1::/64 dev sim0 proto kernel metric 256
> > fe80::/64 dev sim0 proto kernel metric 256
> >
> hmm...It is weird. It is a /64 route, so it should have
> failed the (rt->dst.flags & DST_HOST) test anyway...
>
> > # the results of ip command on receiver side are almost
> > similar.
> >
> > I found that the test uses non-ARP interface between nodes:
> > if I changed the interface to 'non-NOARP' NIC, the issue has
> > gone away without the revert.
> >
> > I'm using the following scenario: just FYI.
> >
> > https://gist.github.com/thehajime/26be8606ddbb924f357c
> >
> You meant without 'arp off'?
Yes, I meant that.
> Can you grep those IP from 'ip -6 neigh'?
there is no output from 'ip -6 neigh' since the interface
is configured with IFF_NOARP.
> Can you try this patch just to confirm:
I applied the updated patch and the ping successfully got
replies.
-- Hajime
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH net-next 3/5] ipv6: Stop /128 route from disappearing after pmtu update
2015-05-03 3:38 ` Martin KaFai Lau
@ 2015-05-03 14:29 ` Hajime Tazaki
2015-05-03 19:01 ` Martin KaFai Lau
0 siblings, 1 reply; 18+ messages in thread
From: Hajime Tazaki @ 2015-05-03 14:29 UTC (permalink / raw)
To: kafai
Cc: netdev, hannes, steffen.klassert, davem, yangyingliang,
shengyong1, Kernel-team
Hi,
At Sat, 2 May 2015 20:38:01 -0700,
Martin KaFai Lau wrote:
>
> On Sun, May 03, 2015 at 09:19:47AM +0900, Hajime Tazaki wrote:
> > - 'ip -6 r show' at the ping6 sender
> > 2001:1::/64 dev sim0 proto kernel metric 256
> > fe80::/64 dev sim0 proto kernel metric 256
> >
> > # the results of ip command on receiver side are almost
> > similar.
> >
> > I found that the test uses non-ARP interface between nodes:
> > if I changed the interface to 'non-NOARP' NIC, the issue has
> > gone away without the revert.
> I have given a little more thoughts. With the below partial patch
> ruled out and together with a /64 route in your test,
> I failed to see how another line change could have broken.
>
> Can you share some more details on your test and
the test uses two nodes (running on the ns-3 network simulator
with a net-next kernel), which are connected via a
point-to-point data link: the payload is encapsulated by PPP
(RFC1661).
https://www.nsnam.org/docs/models/html/point-to-point.html
so there is no need for neighbor resolution (ARP, NS/NA) on
that link (dev->flags has IFF_NOARP and IFF_POINTOPOINT bits).
point-to-point
node 0 <---------------> node 1
2001:1::1/64 2001:1::2/64
let me know if you need further information.
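For anyone who wants to check the same device flags from userspace on a real interface, a minimal sketch follows. The helper names are hypothetical; the only system interfaces used are the SIOCGIFFLAGS ioctl and the IFF_* constants from <net/if.h>. It assumes Linux with glibc.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <net/if.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>

/* True when the flag word has IFF_NOARP set. */
bool link_is_noarp(unsigned int flags)
{
	return (flags & IFF_NOARP) != 0;
}

/* True when the flag word has IFF_POINTOPOINT set. */
bool link_is_p2p(unsigned int flags)
{
	return (flags & IFF_POINTOPOINT) != 0;
}

/* Read the flags of the named interface via SIOCGIFFLAGS.
 * Returns 0 on success and stores the flags in *flags, -1 on failure. */
int read_if_flags(const char *ifname, unsigned int *flags)
{
	struct ifreq ifr;
	int ret = -1;
	int fd = socket(AF_INET, SOCK_DGRAM, 0);

	if (fd < 0)
		return -1;

	memset(&ifr, 0, sizeof(ifr));
	strncpy(ifr.ifr_name, ifname, IFNAMSIZ - 1);
	if (ioctl(fd, SIOCGIFFLAGS, &ifr) == 0) {
		/* ifr_flags is a short; widen it without sign extension. */
		*flags = (unsigned short)ifr.ifr_flags;
		ret = 0;
	}
	close(fd);
	return ret;
}
```

A caller would pass an interface name such as the `sim0` device shown earlier to `read_if_flags()` and then test the result with the two predicates; `ip link show` prints the same bits in its angle-bracket list.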
> can you reproduce it with some basic iproute2 commands?
I'm going to create a real environment to figure out a
minimum-reproducible command set that we can try.
would you be able to create an interface with IFF_NOARP
and IFF_POINTOPOINT (some tunnel device?) on the latest
net-next kernel?
-- Hajime
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH net-next 3/5] ipv6: Stop /128 route from disappearing after pmtu update
2015-05-03 14:29 ` Hajime Tazaki
@ 2015-05-03 19:01 ` Martin KaFai Lau
2015-05-04 0:29 ` Martin KaFai Lau
0 siblings, 1 reply; 18+ messages in thread
From: Martin KaFai Lau @ 2015-05-03 19:01 UTC (permalink / raw)
To: Hajime Tazaki
Cc: netdev, hannes, steffen.klassert, davem, yangyingliang,
shengyong1, Kernel-team
Hi Hajime,
On Sun, May 03, 2015 at 11:29:55PM +0900, Hajime Tazaki wrote:
> the test uses two nodes (running on ns-3 network simulator
> with net-next kernel), which is connected via a
> point-to-point data link: payload is encapsulated by PPP
> (RFC1661).
>
> so there is no need of neighbor resolution (ARP, NS/NA) on
> that link (dev->flags has IFF_NOARP and IFF_POINTOPOINT bits).
>
> point-to-point
> node 0 <---------------> node 1
> 2001:1::1/64 2001:1::2/64
>
> let me know if you need further information.
Thanks for the details and confirming the last patch. I think I may
know what could be wrong. I am going to confirm it first by trying
to reproduce it.
> I'm going to create a real environment to figure out a
> minimum-reproducible command set that we can try.
That will be great!
> do you have any chance to create an interface with IFF_NOARP
> and IFF_POINTOPOINT (some tunnel device ?) on the latest
> net-next kernel ?
I am going to give it a try.
Thanks,
--Martin
^ permalink raw reply [flat|nested] 18+ messages in thread
* Re: [PATCH net-next 3/5] ipv6: Stop /128 route from disappearing after pmtu update
2015-05-03 19:01 ` Martin KaFai Lau
@ 2015-05-04 0:29 ` Martin KaFai Lau
2015-05-04 1:11 ` Hajime Tazaki
0 siblings, 1 reply; 18+ messages in thread
From: Martin KaFai Lau @ 2015-05-04 0:29 UTC (permalink / raw)
To: Hajime Tazaki
Cc: netdev, hannes, steffen.klassert, davem, yangyingliang,
shengyong1, Kernel-team
Hi Hajime,
On Sun, May 03, 2015 at 12:01:09PM -0700, Martin KaFai Lau wrote:
> Thanks for the details and confirming the last patch. I think I may
> know what could be wrong. I am going to confirm it first by trying
> to reproduce it.
I tried the sit and also the gre6 tunnel. I cannot make it break in
the way you have observed. The ping still goes through. I am probably
missing something.
However, I did uncover a problem in this patch and posted a fix to
netdev. I have also attached it here. Can you give it a try?
If there is still no luck, would you be able to
reproduce it with a simple setup using iproute2 commands?
Can you specify which POINTOPOINT device and sim device you are using?
Are they in-tree or out-of-tree drivers?
Thanks,
--Martin
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 3522711..106dbe5 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -920,7 +920,7 @@ redo_rt6_select:
if (!(rt->rt6i_flags & (RTF_NONEXTHOP | RTF_GATEWAY)))
nrt = rt6_alloc_cow(rt, &fl6->daddr, &fl6->saddr);
- else if (!(rt->dst.flags & DST_HOST) || !(rt->dst.flags & RTF_LOCAL))
+ else if (!(rt->dst.flags & DST_HOST) || !(rt->rt6i_flags & RTF_LOCAL))
nrt = rt6_alloc_clone(rt, &fl6->daddr);
else
goto out2;
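The one-line fix above comes down to testing RTF_LOCAL against the right flag word. A minimal userspace sketch of why the earlier form misbehaves follows. It is not kernel code: the struct and function names are hypothetical, the bit values are assumed to match the kernel headers of the time, and `dst.flags` is assumed to be a 16-bit field. Under those assumptions RTF_LOCAL can never be set in `dst.flags`, so the second clause is always true and the branch always clones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative copies; values are assumptions, not taken from this thread. */
#define DST_HOST  0x0001u	/* lives in dst.flags */
#define RTF_LOCAL 0x80000000u	/* lives in rt6i_flags */

/* Hypothetical stand-ins for struct dst_entry and struct rt6_info. */
struct clone_dst {
	unsigned short flags;	/* assumed 16 bits wide */
};

struct clone_rt {
	struct clone_dst dst;
	uint32_t rt6i_flags;
};

/* Earlier form: RTF_LOCAL is tested against dst.flags, which cannot hold
 * bit 31, so the second clause never fails. */
bool should_clone_buggy(const struct clone_rt *rt)
{
	return !(rt->dst.flags & DST_HOST) || !(rt->dst.flags & RTF_LOCAL);
}

/* Fixed form: RTF_LOCAL is tested against rt6i_flags. */
bool should_clone_fixed(const struct clone_rt *rt)
{
	return !(rt->dst.flags & DST_HOST) || !(rt->rt6i_flags & RTF_LOCAL);
}

/* Small self-check: a local host route is the case where the forms differ. */
void clone_self_check(void)
{
	struct clone_rt local_host = {
		.dst = { .flags = DST_HOST },
		.rt6i_flags = RTF_LOCAL,
	};

	assert(should_clone_buggy(&local_host));	/* always clones */
	assert(!should_clone_fixed(&local_host));	/* skips the clone */
}
```

For a non-host route both forms agree, which is consistent with the earlier observation that a /64 route alone should not have been affected by the DST_HOST test.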
^ permalink raw reply related [flat|nested] 18+ messages in thread
* Re: [PATCH net-next 3/5] ipv6: Stop /128 route from disappearing after pmtu update
2015-05-04 0:29 ` Martin KaFai Lau
@ 2015-05-04 1:11 ` Hajime Tazaki
0 siblings, 0 replies; 18+ messages in thread
From: Hajime Tazaki @ 2015-05-04 1:11 UTC (permalink / raw)
To: kafai
Cc: netdev, hannes, steffen.klassert, davem, yangyingliang,
shengyong1, Kernel-team
Hi Martin,
At Sun, 3 May 2015 17:29:38 -0700,
Martin KaFai Lau wrote:
> On Sun, May 03, 2015 at 12:01:09PM -0700, Martin KaFai Lau wrote:
> > Thanks for the details and confirming the last patch. I think I may
> > know what could be wrong. I am going to confirm it first by trying
> > to reproduce it.
> I tried the sit and also the gre6 tunnel. I cannot make it break as
> the way you have observed. The ping can still go through. I am probably
> missing something.
>
> However, I did uncover a problem in this patch and posted a fix to
> netdev. I have also attached here. Can you give it a try?
tried it and it's perfect !
all other tests I have are also working fine.
> If there is still no luck, do you have a chance to
> reproduce it with a simple setup by iproute2 commands?
> Can you specify which POINTTOPOINT device and sim device you are using?
> Are they in or out of kernel-tree driver?
indeed, it's an out-of-tree driver of LibOS patchset (*1)
https://github.com/libos-nuse/net-next-nuse/blob/nuse/arch/lib/lib-device.c
this is not always P2P device: an application (e.g., ns-3)
can define the flags of the device.
I'll follow up by trying to reproduce it with sit or gre and let you know
once I succeed.
thank you.
-- Hajime
^ permalink raw reply [flat|nested] 18+ messages in thread
end of thread, other threads:[~2015-05-04 1:11 UTC | newest]
Thread overview: 18+ messages
2015-04-28 20:03 [PATCH net-next 0/5] ipv6: Stop /128 route from disappearing after pmtu update Martin KaFai Lau
2015-04-28 20:03 ` [PATCH net-next 1/5] ipv6: Consider RTF_CACHE when searching the fib6 tree Martin KaFai Lau
2015-04-28 20:03 ` [PATCH net-next 2/5] ipv6: Extend the route lookups to low priority metrics Martin KaFai Lau
2015-04-28 20:03 ` [PATCH net-next 3/5] ipv6: Stop /128 route from disappearing after pmtu update Martin KaFai Lau
2015-05-02 22:41 ` Hajime Tazaki
2015-05-02 23:20 ` Martin KaFai Lau
2015-05-03 0:19 ` Hajime Tazaki
2015-05-03 1:00 ` Martin KaFai Lau
2015-05-03 1:03 ` Martin KaFai Lau
2015-05-03 14:26 ` Hajime Tazaki
2015-05-03 3:38 ` Martin KaFai Lau
2015-05-03 14:29 ` Hajime Tazaki
2015-05-03 19:01 ` Martin KaFai Lau
2015-05-04 0:29 ` Martin KaFai Lau
2015-05-04 1:11 ` Hajime Tazaki
2015-04-28 20:03 ` [PATCH net-next 4/5] ipv6: Stop rt6_info from using inet_peer's metrics Martin KaFai Lau
2015-04-28 20:03 ` [PATCH net-next 5/5] ipv6: Remove DST_METRICS_FORCE_OVERWRITE and _rt6i_peer Martin KaFai Lau
2015-05-02 1:01 ` [PATCH net-next 0/5] ipv6: Stop /128 route from disappearing after pmtu update David Miller