From: Alex Gartrell <agartrell@fb.com>
To: shengyong <shengyong1@huawei.com>, <davem@davemloft.net>
Cc: <netdev@vger.kernel.org>, <yangyingling@huawei.com>,
	<steffen.klassert@secunet.com>, <hannes@redhat.com>,
	<lvs-devel@vger.kernel.org>, Calvin Owens <calvinowens@fb.com>,
	<kernel-team@fb.com>
Subject: Re: Question: should local address be expired when updating PMTU?
Date: Mon, 2 Feb 2015 16:52:58 -0800	[thread overview]
Message-ID: <54D01BEA.2070501@fb.com> (raw)
In-Reply-To: <54CF3348.40207@huawei.com>

Hello Shengyong,

 > diff --git a/net/ipv6/route.c b/net/ipv6/route.c
 > index b2614b2..b80317a 100644
 > --- a/net/ipv6/route.c
 > +++ b/net/ipv6/route.c
 > @@ -1136,6 +1136,9 @@ static void ip6_rt_update_pmtu(struct dst_entry *dst, struct sock *sk,
 >   {
 >          struct rt6_info *rt6 = (struct rt6_info*)dst;
 >
 > +       if (rt6->rt6i_flags & RTF_LOCAL)
 > +               return;
 > +
 >          dst_confirm(dst);
 >          if (mtu < dst_mtu(dst) && rt6->rt6i_dst.plen == 128) {
 >                  struct net *net = dev_net(dst->dev);
 >
 > So is this modification correct? Or how can we avoid such expiring?


FWIW, we encountered this problem with IPVS tunneling.  Below is a patch
by Calvin (cc'ed) that replaces my earlier attempted fix.  We're not
particularly proud of it...

At a high level, I don't think the RTF_LOCAL check was sufficient, but I
didn't investigate deeply enough to say why; hopefully Calvin can.

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index f14d49b..c607a42 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1159,18 +1159,18 @@ static void ip6_rt_update_pmtu(struct dst_entry *dst, struct sock *sk,
                 }
                 dst_metric_set(dst, RTAX_MTU, mtu);

-               /* FACEBOOK HACK: We need to not expire local non-expiring
-                * routes so that we don't accidentally start blackholing
-                * ipvs traffic when we happen to use it locally for
-                * healthchecking (see ip_vs_xmit.c --
-                * __ip_vs_get_out_rt_v6 invokes update_pmtu if the rt is
-                * associated with a socket)
-                * Alex Gartrell <agartrell@fb.com>
+               /*
+                * FACEBOOK HACK: Only expire routes that aren't destined for
+                * the loopback interface.
+                *
+                * This prevents the strange route coalescing that happens when
+                * you add an address to the loopback that had a route that had
+                * been used when the address didn't exist from getting expired
+                * and causing packet loss in shiv.
                  */
-               if (!(rt6->rt6i_flags & RTF_LOCAL) ||
-                   (rt6->rt6i_flags & (RTF_EXPIRES | RTF_CACHE)))
-                       rt6_update_expires(
-                               rt6, net->ipv6.sysctl.ip6_rt_mtu_expires);
+               if (!(dst->dev->flags & IFF_LOOPBACK))
+                       rt6_update_expires(rt6,
+                               net->ipv6.sysctl.ip6_rt_mtu_expires);
         }
  }
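
To make the difference between the conditions discussed above easier to see, here is a minimal user-space sketch. It is not kernel code: `struct route_model` and the three `expires_with_*` helpers are hypothetical names invented for illustration, and the flag values are redefined locally (they are intended to mirror the Linux uapi definitions) so the snippet builds on its own. It models only the decision "would this PMTU update set an expiry on the route?", and it assumes the surrounding `mtu < dst_mtu(dst) && plen == 128` test has already passed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copies of the flag bits, redefined so the sketch builds in user space. */
#define RTF_EXPIRES  0x00400000u
#define RTF_CACHE    0x01000000u
#define RTF_LOCAL    0x80000000u
#define IFF_LOOPBACK 0x8u

/* Hypothetical stand-in for the two fields the conditions inspect:
 * rt6->rt6i_flags and dst->dev->flags. */
struct route_model {
	uint32_t rt_flags;   /* models rt6->rt6i_flags */
	uint32_t dev_flags;  /* models dst->dev->flags */
};

/* Quoted proposal: return early for RTF_LOCAL, so no expiry is ever set
 * on a local route (the real patch also skips dst_confirm() and the MTU
 * update, which this model does not capture). */
static inline bool expires_with_local_return(const struct route_model *r)
{
	return !(r->rt_flags & RTF_LOCAL);
}

/* Condition removed by the patch: expire unless the route is local and
 * is neither already expiring nor a cache entry. */
static inline bool expires_with_old_hack(const struct route_model *r)
{
	return !(r->rt_flags & RTF_LOCAL) ||
	       (r->rt_flags & (RTF_EXPIRES | RTF_CACHE));
}

/* Condition added by the patch: expire unless the route's device is a
 * loopback interface, regardless of the route flags. */
static inline bool expires_with_loopback_check(const struct route_model *r)
{
	return !(r->dev_flags & IFF_LOOPBACK);
}
```

Under this model, the conditions agree for a plain RTF_LOCAL route on the loopback device (no expiry) and for an ordinary route on a non-loopback device (expiry set). They differ for a route on the loopback device that lacks RTF_LOCAL, or that carries RTF_CACHE: the removed condition would still set an expiry there, while the loopback check would not. Whether that is the exact case hit in production is not established here.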


Cheers,
-- 
Alex Gartrell <agartrell@fb.com>

