Netdev Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH net] ipv6/route: only update neigh confirm time if pmtu changed
@ 2019-11-22  6:19 Hangbin Liu
  2019-11-22 18:04 ` David Miller
  2019-12-03  2:11 ` [PATCHv2 net] ipv6/route: should not update neigh confirm time during PMTU update Hangbin Liu
  0 siblings, 2 replies; 8+ messages in thread
From: Hangbin Liu @ 2019-11-22  6:19 UTC (permalink / raw)
  To: netdev
  Cc: Julian Anastasov, David S . Miller, Marcelo Ricardo Leitner,
	David Ahern, Eric Dumazet, Hangbin Liu

When we setup a pair of gretap, ping each other and create neighbour cache.
Then delete and recreate one side. We will never be able to ping6 to the new
created gretap.

The reason is when we ping6 remote via gretap, we will call like

gre_tap_xmit()
 - ip_tunnel_xmit()
   - tnl_update_pmtu()
     - skb_dst_update_pmtu()
       - ip6_rt_update_pmtu()
         - __ip6_rt_update_pmtu()
           - dst_confirm_neigh()
             - ip6_confirm_neigh()
               - __ipv6_confirm_neigh()
                 - n->confirmed = now

As the confirmed time updated, in neigh_timer_handler() the check for
NUD_DELAY confirm time will pass and the neigh state will back to
NUD_REACHABLE. So the old/wrong mac address will be used again.

If we do not update the confirmed time, the neigh state will go to
neigh->nud_state = NUD_PROBE; then go to NUD_FAILED and re-create the
neigh later, which is what IPv4 does.

Fix it by reordering the dst_confirm_neigh() and only update it when
pmtu changed.

Fixes: 0dec879f636f ("net: use dst_confirm_neigh for UDP, RAW, ICMP, L2TP")
Reported-by: Jianlin Shi <jishi@redhat.com>
Suggested-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
---
 net/ipv6/route.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 3f83ea851ebf..6fbef61b8f64 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -2713,11 +2713,12 @@ static void __ip6_rt_update_pmtu(struct dst_entry *dst, const struct sock *sk,
 		daddr = NULL;
 		saddr = NULL;
 	}
-	dst_confirm_neigh(dst, daddr);
 	mtu = max_t(u32, mtu, IPV6_MIN_MTU);
 	if (mtu >= dst_mtu(dst))
 		return;
 
+	dst_confirm_neigh(dst, daddr);
+
 	if (!rt6_cache_allowed_for_pmtu(rt6)) {
 		rt6_do_update_pmtu(rt6, mtu);
 		/* update rt6_ex->stamp for cache */
-- 
2.19.2


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH net] ipv6/route: only update neigh confirm time if pmtu changed
  2019-11-22  6:19 [PATCH net] ipv6/route: only update neigh confirm time if pmtu changed Hangbin Liu
@ 2019-11-22 18:04 ` David Miller
  2019-11-26  9:17   ` Hangbin Liu
  2019-12-03  2:11 ` [PATCHv2 net] ipv6/route: should not update neigh confirm time during PMTU update Hangbin Liu
  1 sibling, 1 reply; 8+ messages in thread
From: David Miller @ 2019-11-22 18:04 UTC (permalink / raw)
  To: liuhangbin; +Cc: netdev, ja, marcelo.leitner, dsahern, edumazet

From: Hangbin Liu <liuhangbin@gmail.com>
Date: Fri, 22 Nov 2019 14:19:19 +0800

> The reason is when we ping6 remote via gretap, we will call like
> 
> gre_tap_xmit()
>  - ip_tunnel_xmit()
>    - tnl_update_pmtu()
>      - skb_dst_update_pmtu()
>        - ip6_rt_update_pmtu()
>          - __ip6_rt_update_pmtu()
>            - dst_confirm_neigh()
>              - ip6_confirm_neigh()
>                - __ipv6_confirm_neigh()
>                  - n->confirmed = now

This whole callchain violates the assumptions of the MTU update
machinery.

In this case it's just the tunneling code accounting for the
encapsulation it is creating, and checking the MTU just in case.

But the MTU update code is supposed to be invoked in response to real
networking events that update the PMTU.

So for this ip_tunnel_xmit() case, _EVEN_ if the MTU is changed, we
should not be invoking dst_confirm_neigh() as we have no evidence
of successful two-way communication at this point.

We have to stop papering over the tunneling code's abuse of the PMTU
update framework and do this properly.

Sorry, I'm not applying this.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH net] ipv6/route: only update neigh confirm time if pmtu changed
  2019-11-22 18:04 ` David Miller
@ 2019-11-26  9:17   ` Hangbin Liu
  0 siblings, 0 replies; 8+ messages in thread
From: Hangbin Liu @ 2019-11-26  9:17 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, ja, marcelo.leitner, dsahern, edumazet

Hi David,

Sorry for the late reply. I'm not sure why your reply went to spam list and
I didn't receive it timely.
On Fri, Nov 22, 2019 at 10:04:38AM -0800, David Miller wrote:
> From: Hangbin Liu <liuhangbin@gmail.com>
> Date: Fri, 22 Nov 2019 14:19:19 +0800
> 
> > The reason is when we ping6 remote via gretap, we will call like
> > 
> > gre_tap_xmit()
> >  - ip_tunnel_xmit()
> >    - tnl_update_pmtu()
> >      - skb_dst_update_pmtu()
> >        - ip6_rt_update_pmtu()
> >          - __ip6_rt_update_pmtu()
> >            - dst_confirm_neigh()
> >              - ip6_confirm_neigh()
> >                - __ipv6_confirm_neigh()
> >                  - n->confirmed = now
> 
> This whole callchain violates the assumptions of the MTU update
> machinery.
> 
> In this case it's just the tunneling code accounting for the
> encapsulation it is creating, and checking the MTU just in case.
> 
> But the MTU update code is supposed to be invoked in response to real
> networking events that update the PMTU.
> 
> So for this ip_tunnel_xmit() case, _EVEN_ if the MTU is changed, we
> should not be invoking dst_confirm_neigh() as we have no evidence
> of successful two-way communication at this point.

Thanks for the explanation. When I fixed the code, I was also wondering
if we need this neighbor confirmation. So I just moved the dst_confirm_neigh()
a little down to make sure pmtu changed. Your explanation make me clear that
we should not have this neighbor confirmation as PMTU is not a two-way
communication.
> 
> We have to stop papering over the tunneling code's abuse of the PMTU
> update framework and do this properly.

Should I do other works than just remove dst_confirm_neigh() in
__ip6_rt_update_pmtu().

Thanks
Hangbin
> 
> Sorry, I'm not applying this.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCHv2 net] ipv6/route: should not update neigh confirm time during PMTU update
  2019-11-22  6:19 [PATCH net] ipv6/route: only update neigh confirm time if pmtu changed Hangbin Liu
  2019-11-22 18:04 ` David Miller
@ 2019-12-03  2:11 ` Hangbin Liu
  2019-12-03  2:47   ` David Miller
  1 sibling, 1 reply; 8+ messages in thread
From: Hangbin Liu @ 2019-12-03  2:11 UTC (permalink / raw)
  To: netdev
  Cc: Julian Anastasov, Marcelo Ricardo Leitner, David Ahern,
	Eric Dumazet, David Miller, Hangbin Liu

When we setup a pair of gretap, ping each other and create neighbour cache.
Then delete and recreate one side. We will never be able to ping6 to the new
created gretap.

The reason is when we ping6 remote via gretap, we will call like

gre_tap_xmit()
 - ip_tunnel_xmit()
   - tnl_update_pmtu()
     - skb_dst_update_pmtu()
       - ip6_rt_update_pmtu()
         - __ip6_rt_update_pmtu()
           - dst_confirm_neigh()
             - ip6_confirm_neigh()
               - __ipv6_confirm_neigh()
                 - n->confirmed = now

As the confirmed time updated, in neigh_timer_handler() the check for
NUD_DELAY confirm time will pass and the neigh state will back to
NUD_REACHABLE. So the old/wrong mac address will be used again.

If we do not update the confirmed time, the neigh state will go to
neigh->nud_state = NUD_PROBE; then go to NUD_FAILED and re-create the
neigh later, which is what IPv4 does.

Fix it by removing the dst_confirm_neigh() in __ip6_rt_update_pmtu() as
there is no two-way communication during PMTU update.

v2: remove dst_confirm_neigh directly as David Miller pointed out.

Fixes: 0dec879f636f ("net: use dst_confirm_neigh for UDP, RAW, ICMP, L2TP")
Reported-by: Jianlin Shi <jishi@redhat.com>
Suggested-by: David Miller <davem@davemloft.net>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
---
 net/ipv6/route.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index b59940416cb5..335ed36a5c78 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -2710,7 +2710,6 @@ static void __ip6_rt_update_pmtu(struct dst_entry *dst, const struct sock *sk,
 		daddr = NULL;
 		saddr = NULL;
 	}
-	dst_confirm_neigh(dst, daddr);
 	mtu = max_t(u32, mtu, IPV6_MIN_MTU);
 	if (mtu >= dst_mtu(dst))
 		return;
-- 
2.19.2


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCHv2 net] ipv6/route: should not update neigh confirm time during PMTU update
  2019-12-03  2:11 ` [PATCHv2 net] ipv6/route: should not update neigh confirm time during PMTU update Hangbin Liu
@ 2019-12-03  2:47   ` David Miller
  2019-12-03 10:15     ` Hangbin Liu
  0 siblings, 1 reply; 8+ messages in thread
From: David Miller @ 2019-12-03  2:47 UTC (permalink / raw)
  To: liuhangbin; +Cc: netdev, ja, marcelo.leitner, dsahern, edumazet

From: Hangbin Liu <liuhangbin@gmail.com>
Date: Tue,  3 Dec 2019 10:11:37 +0800

> Fix it by removing the dst_confirm_neigh() in __ip6_rt_update_pmtu() as
> there is no two-way communication during PMTU update.
> 
> v2: remove dst_confirm_neigh directly as David Miller pointed out.

That's not what I said.

I said that this interface is designed for situations where the neigh
update is appropriate, and that's what happens for most callers _except_
these tunnel cases.

The tunnel use is the exception and invoking the interface
inappropriately.

It is important to keep the neigh reachability fresh for TCP flows so
you cannot remove this dst_confirm_neigh() call.

Instead, make a new interface that the tunnel use cases can call into
to elide the neigh update.

Yes, this means you will have too update all of the tunnel callers
into these calls chains but that's the price we have to pay in this
situation unfortunately.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCHv2 net] ipv6/route: should not update neigh confirm time during PMTU update
  2019-12-03  2:47   ` David Miller
@ 2019-12-03 10:15     ` Hangbin Liu
  2019-12-03 10:25       ` Hangbin Liu
  0 siblings, 1 reply; 8+ messages in thread
From: Hangbin Liu @ 2019-12-03 10:15 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, ja, marcelo.leitner, dsahern, edumazet

On Mon, Dec 02, 2019 at 06:47:04PM -0800, David Miller wrote:
> From: Hangbin Liu <liuhangbin@gmail.com>
> Date: Tue,  3 Dec 2019 10:11:37 +0800
> 
> > Fix it by removing the dst_confirm_neigh() in __ip6_rt_update_pmtu() as
> > there is no two-way communication during PMTU update.
> > 
> > v2: remove dst_confirm_neigh directly as David Miller pointed out.
> 
> That's not what I said.
> 
> I said that this interface is designed for situations where the neigh
> update is appropriate, and that's what happens for most callers _except_
> these tunnel cases.
> 
> The tunnel use is the exception and invoking the interface
> inappropriately.
> 
> It is important to keep the neigh reachability fresh for TCP flows so
> you cannot remove this dst_confirm_neigh() call.
> 
> Instead, make a new interface that the tunnel use cases can call into
> to elide the neigh update.
> 
> Yes, this means you will have too update all of the tunnel callers
> into these calls chains but that's the price we have to pay in this
> situation unfortunately.

Oh, I got what you mean now. thanks for the explain. I will see how
to do this.

Thanks
Hangbin

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCHv2 net] ipv6/route: should not update neigh confirm time during PMTU update
  2019-12-03 10:15     ` Hangbin Liu
@ 2019-12-03 10:25       ` Hangbin Liu
  2019-12-03 19:58         ` David Miller
  0 siblings, 1 reply; 8+ messages in thread
From: Hangbin Liu @ 2019-12-03 10:25 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, ja, marcelo.leitner, dsahern, edumazet


Hi David,
On Tue, Dec 03, 2019 at 06:15:36PM +0800, Hangbin Liu wrote:
> On Mon, Dec 02, 2019 at 06:47:04PM -0800, David Miller wrote:
> > From: Hangbin Liu <liuhangbin@gmail.com>
> > Date: Tue,  3 Dec 2019 10:11:37 +0800
> > 
> > > Fix it by removing the dst_confirm_neigh() in __ip6_rt_update_pmtu() as
> > > there is no two-way communication during PMTU update.
> > > 
> > > v2: remove dst_confirm_neigh directly as David Miller pointed out.
> > 
> > That's not what I said.
> > 
> > I said that this interface is designed for situations where the neigh
> > update is appropriate, and that's what happens for most callers _except_
> > these tunnel cases.
> > 
> > The tunnel use is the exception and invoking the interface
> > inappropriately.
> > 
> > It is important to keep the neigh reachability fresh for TCP flows so
> > you cannot remove this dst_confirm_neigh() call.

I have one question here. Since we have the .confirm_neigh fuction in
struct dst_ops. How about do a dst->ops->confirm_neigh() separately after
dst->ops->update_pmtu()? Why should we mix the confirm_neigh() in
update_pmtu(), like ip6_rt_update_pmtu()?

Thanks
Hangbin

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCHv2 net] ipv6/route: should not update neigh confirm time during PMTU update
  2019-12-03 10:25       ` Hangbin Liu
@ 2019-12-03 19:58         ` David Miller
  0 siblings, 0 replies; 8+ messages in thread
From: David Miller @ 2019-12-03 19:58 UTC (permalink / raw)
  To: liuhangbin; +Cc: netdev, ja, marcelo.leitner, dsahern, edumazet

From: Hangbin Liu <liuhangbin@gmail.com>
Date: Tue, 3 Dec 2019 18:25:35 +0800

> 
> Hi David,
> On Tue, Dec 03, 2019 at 06:15:36PM +0800, Hangbin Liu wrote:
>> On Mon, Dec 02, 2019 at 06:47:04PM -0800, David Miller wrote:
>> > From: Hangbin Liu <liuhangbin@gmail.com>
>> > Date: Tue,  3 Dec 2019 10:11:37 +0800
>> > 
>> > > Fix it by removing the dst_confirm_neigh() in __ip6_rt_update_pmtu() as
>> > > there is no two-way communication during PMTU update.
>> > > 
>> > > v2: remove dst_confirm_neigh directly as David Miller pointed out.
>> > 
>> > That's not what I said.
>> > 
>> > I said that this interface is designed for situations where the neigh
>> > update is appropriate, and that's what happens for most callers _except_
>> > these tunnel cases.
>> > 
>> > The tunnel use is the exception and invoking the interface
>> > inappropriately.
>> > 
>> > It is important to keep the neigh reachability fresh for TCP flows so
>> > you cannot remove this dst_confirm_neigh() call.
> 
> I have one question here. Since we have the .confirm_neigh fuction in
> struct dst_ops. How about do a dst->ops->confirm_neigh() separately after
> dst->ops->update_pmtu()? Why should we mix the confirm_neigh() in
> update_pmtu(), like ip6_rt_update_pmtu()?

Two indirect calls which have high cost due to spectre mitigation?

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, back to index

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-22  6:19 [PATCH net] ipv6/route: only update neigh confirm time if pmtu changed Hangbin Liu
2019-11-22 18:04 ` David Miller
2019-11-26  9:17   ` Hangbin Liu
2019-12-03  2:11 ` [PATCHv2 net] ipv6/route: should not update neigh confirm time during PMTU update Hangbin Liu
2019-12-03  2:47   ` David Miller
2019-12-03 10:15     ` Hangbin Liu
2019-12-03 10:25       ` Hangbin Liu
2019-12-03 19:58         ` David Miller

Netdev Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/netdev/0 netdev/git/0.git
	git clone --mirror https://lore.kernel.org/netdev/1 netdev/git/1.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 netdev netdev/ https://lore.kernel.org/netdev \
		netdev@vger.kernel.org
	public-inbox-index netdev

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.netdev


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git