From: Martin KaFai Lau <kafai@fb.com> To: Julian Anastasov <ja@ssi.bg> Cc: <netdev@vger.kernel.org>, David Ahern <dsahern@gmail.com>, Tom Herbert <tom@herbertland.com>, Eric Dumazet <eric.dumazet@gmail.com>, Nikita Shirokov <tehnerd@fb.com>, <kernel-team@fb.com>, <lvs-devel@vger.kernel.org> Subject: Re: [PATCH RFC net-next] net: ipvs: Adjust gso_size for IPPROTO_TCP Date: Thu, 3 May 2018 00:01:25 -0700 [thread overview] Message-ID: <20180503070114.bcuusvzga45klccs@kafai-mbp> (raw) In-Reply-To: <alpine.LFD.2.20.1805022143360.3301@ja.home.ssi.bg> On Wed, May 02, 2018 at 10:30:32PM +0300, Julian Anastasov wrote: > > Hello, > > On Wed, 2 May 2018, Martin KaFai Lau wrote: > > > On Wed, May 02, 2018 at 09:38:43AM +0300, Julian Anastasov wrote: > > > > > > - initial traffic for port 21 does not use GSO. But after > > > every packet IPVS calls maybe_update_pmtu (rt->dst.ops->update_pmtu) > > > to report the reduced MTU. These updates are stored in fnhe_pmtu > > > but they do not go to any route, even if we try to get fresh > > > output route. Why? Because the local routes are not cached, so > > > they can not use the fnhe. This is what my patch for route.c > > > will fix. With this fix FTP-DATA gets route with reduced PMTU. > > For IPv6, the 'if (rt6->rt6i_flags & RTF_LOCAL)' gate in > > __ip6_rt_update_pmtu() may need to be lifted also. > > Probably. I completely forgot the IPv6 part > but as I don't know the IPv6 code enough, it may take > some time to understand what can be the problem there... > I'm not sure whether everything started with commit 0a6b2a1dc2a2, > so that in some configurations before that commit things > worked and problem was not noticed. > > I think, we should focus on such direction for IPv6: > > - do we remember per-VIP PMTU for the local routes IPv6 used not to create cache route for DST_HOST route which is a /128 route (that includes local /128 route). Because of this, it had a bug such that a PMTU for the DST_HOST route will trigger dst.ops->update_pmtu() which then set an expire on the permanent /128 route instead of a cache route. The permanent route got unexpectedly expired/removed later. The fix was to allow creating /128 cache route as long as it is not RTF_LOCAL in 653437d02f1f and 7035870d1219. The first post spelled out the problem better: https://patchwork.ozlabs.org/patch/456050/ Later, when we only create cache route after seeing PMTU in 45e4fd26683c, this RTF_LOCAL checking was carried over to __ip6_rt_update_pmtu(). Out of my head, I don't see issue removing the RTF_LOCAL check from __ip6_rt_update_pmtu(). DavidA, what do you think? > > - when exactly we start to use the new PMTU, eg. what happens > in case socket caches the route, whether route is killed via > dst->obsolete. Or may be while the PMTU expiration is handled > per-packet, the PMTU change is noticed only on ICMP... Before sk can reuse its dst cache, the sk will notice its dst cache is no longer valid by calling dst_check(). dst_check() should return NULL which is one of the side effect of the earlier update_pmtu(). This dst_check() is usually only called when the sk needs to do output, so the new PMTU route (i.e. the RTF_CACHE IPv6 route) only have effect to the later packets. > > - as IPVS reports the PMTU via dst.ops->update_pmtu() long > before any large packets are sent, do we propagate the > PMTU. Also, for IPv4 __ip_rt_update_pmtu() has some protection > from such per-packet updates that do not change the PMTU. > > - if IPVS starts to send ICMP when gso_size exceeds PMTU, > like in my draft patch, whether the PMTU is propagated > to route and then to socket. As for the gso_size decrease, > playing in IPVS is not very safe, at least, we need help > from GSO experts to know how we should use it. > > Regards > > -- > Julian Anastasov <ja@ssi.bg>
WARNING: multiple messages have this Message-ID (diff)
From: Martin KaFai Lau <kafai@fb.com> To: Julian Anastasov <ja@ssi.bg> Cc: netdev@vger.kernel.org, David Ahern <dsahern@gmail.com>, Tom Herbert <tom@herbertland.com>, Eric Dumazet <eric.dumazet@gmail.com>, Nikita Shirokov <tehnerd@fb.com>, kernel-team@fb.com, lvs-devel@vger.kernel.org Subject: Re: [PATCH RFC net-next] net: ipvs: Adjust gso_size for IPPROTO_TCP Date: Thu, 3 May 2018 00:01:25 -0700 [thread overview] Message-ID: <20180503070114.bcuusvzga45klccs@kafai-mbp> (raw) In-Reply-To: <alpine.LFD.2.20.1805022143360.3301@ja.home.ssi.bg> On Wed, May 02, 2018 at 10:30:32PM +0300, Julian Anastasov wrote: > > Hello, > > On Wed, 2 May 2018, Martin KaFai Lau wrote: > > > On Wed, May 02, 2018 at 09:38:43AM +0300, Julian Anastasov wrote: > > > > > > - initial traffic for port 21 does not use GSO. But after > > > every packet IPVS calls maybe_update_pmtu (rt->dst.ops->update_pmtu) > > > to report the reduced MTU. These updates are stored in fnhe_pmtu > > > but they do not go to any route, even if we try to get fresh > > > output route. Why? Because the local routes are not cached, so > > > they can not use the fnhe. This is what my patch for route.c > > > will fix. With this fix FTP-DATA gets route with reduced PMTU. > > For IPv6, the 'if (rt6->rt6i_flags & RTF_LOCAL)' gate in > > __ip6_rt_update_pmtu() may need to be lifted also. > > Probably. I completely forgot the IPv6 part > but as I don't know the IPv6 code enough, it may take > some time to understand what can be the problem there... > I'm not sure whether everything started with commit 0a6b2a1dc2a2, > so that in some configurations before that commit things > worked and problem was not noticed. > > I think, we should focus on such direction for IPv6: > > - do we remember per-VIP PMTU for the local routes IPv6 used not to create cache route for DST_HOST route which is a /128 route (that includes local /128 route). Because of this, it had a bug such that a PMTU for the DST_HOST route will trigger dst.ops->update_pmtu() which then set an expire on the permanent /128 route instead of a cache route. The permanent route got unexpectedly expired/removed later. The fix was to allow creating /128 cache route as long as it is not RTF_LOCAL in 653437d02f1f and 7035870d1219. The first post spelled out the problem better: https://patchwork.ozlabs.org/patch/456050/ Later, when we only create cache route after seeing PMTU in 45e4fd26683c, this RTF_LOCAL checking was carried over to __ip6_rt_update_pmtu(). Out of my head, I don't see issue removing the RTF_LOCAL check from __ip6_rt_update_pmtu(). DavidA, what do you think? > > - when exactly we start to use the new PMTU, eg. what happens > in case socket caches the route, whether route is killed via > dst->obsolete. Or may be while the PMTU expiration is handled > per-packet, the PMTU change is noticed only on ICMP... Before sk can reuse its dst cache, the sk will notice its dst cache is no longer valid by calling dst_check(). dst_check() should return NULL which is one of the side effect of the earlier update_pmtu(). This dst_check() is usually only called when the sk needs to do output, so the new PMTU route (i.e. the RTF_CACHE IPv6 route) only have effect to the later packets. > > - as IPVS reports the PMTU via dst.ops->update_pmtu() long > before any large packets are sent, do we propagate the > PMTU. Also, for IPv4 __ip_rt_update_pmtu() has some protection > from such per-packet updates that do not change the PMTU. > > - if IPVS starts to send ICMP when gso_size exceeds PMTU, > like in my draft patch, whether the PMTU is propagated > to route and then to socket. As for the gso_size decrease, > playing in IPVS is not very safe, at least, we need help > from GSO experts to know how we should use it. > > Regards > > -- > Julian Anastasov <ja@ssi.bg>
next prev parent reply other threads:[~2018-05-03 7:01 UTC|newest] Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top 2018-04-19 21:23 [PATCH RFC net-next] net: ipvs: Adjust gso_size for IPPROTO_TCP Martin KaFai Lau 2018-04-20 20:43 ` Julian Anastasov 2018-04-20 23:13 ` Martin KaFai Lau 2018-05-02 6:38 ` Julian Anastasov 2018-05-02 17:10 ` Martin KaFai Lau 2018-05-02 17:10 ` Martin KaFai Lau 2018-05-02 19:30 ` Julian Anastasov 2018-05-03 7:01 ` Martin KaFai Lau [this message] 2018-05-03 7:01 ` Martin KaFai Lau 2018-05-04 3:50 ` David Ahern 2018-05-05 12:58 ` Julian Anastasov 2018-05-07 16:22 ` Martin KaFai Lau 2018-05-07 16:22 ` Martin KaFai Lau 2018-05-07 17:00 ` Julian Anastasov 2018-06-05 21:31 ` Martin KaFai Lau 2018-06-05 21:31 ` Martin KaFai Lau 2018-06-05 22:35 ` Julian Anastasov
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20180503070114.bcuusvzga45klccs@kafai-mbp \ --to=kafai@fb.com \ --cc=dsahern@gmail.com \ --cc=eric.dumazet@gmail.com \ --cc=ja@ssi.bg \ --cc=kernel-team@fb.com \ --cc=lvs-devel@vger.kernel.org \ --cc=netdev@vger.kernel.org \ --cc=tehnerd@fb.com \ --cc=tom@herbertland.com \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.