[PATCH] xfrm6: Do not use xfrm_local_error for path MTU issues in tunnels

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH] xfrm6: Do not use xfrm_local_error for path MTU issues in tunnels
@ 2015-05-27 17:40 Alexander Duyck
  2015-05-28  4:49 ` Herbert Xu
  2015-05-28  5:36 ` Steffen Klassert
  0 siblings, 2 replies; 22+ messages in thread
From: Alexander Duyck @ 2015-05-27 17:40 UTC (permalink / raw)
  To: steffen.klassert, davem, herbert; +Cc: netdev, linux-crypto

This change makes it so that we use icmpv6_send to report PMTU issues back
into tunnels in the case that the resulting packet is larger than the MTU
of the outgoing interface.  Previously xfrm_local_error was being used in
this case, however this was resulting in no changes, I suspect due to the
fact that the tunnel itself was being kept out of the loop.

This patch fixes PMTU problems seen on ip6_vti tunnels and is based on the
behavior seen if the socket was orphaned.  Instead of requiring the socket
to be orphaned this patch simply defaults to using icmpv6_send in the case
that the frame came though a tunnel.

Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
---
 net/ipv6/xfrm6_output.c |   18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/net/ipv6/xfrm6_output.c b/net/ipv6/xfrm6_output.c
index 09c76a7b474d..6f9b514d0e38 100644
--- a/net/ipv6/xfrm6_output.c
+++ b/net/ipv6/xfrm6_output.c
@@ -72,6 +72,7 @@ static int xfrm6_tunnel_check_size(struct sk_buff *skb)
 {
 	int mtu, ret = 0;
 	struct dst_entry *dst = skb_dst(skb);
+	struct xfrm_state *x = dst->xfrm;
 
 	mtu = dst_mtu(dst);
 	if (mtu < IPV6_MIN_MTU)
@@ -82,7 +83,7 @@ static int xfrm6_tunnel_check_size(struct sk_buff *skb)
 
 		if (xfrm6_local_dontfrag(skb))
 			xfrm6_local_rxpmtu(skb, mtu);
-		else if (skb->sk)
+		else if (skb->sk && x->props.mode != XFRM_MODE_TUNNEL)
 			xfrm_local_error(skb, mtu);
 		else
 			icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
@@ -149,11 +150,16 @@ static int __xfrm6_output(struct sock *sk, struct sk_buff *skb)
 	else
 		mtu = dst_mtu(skb_dst(skb));
 
-	if (skb->len > mtu && xfrm6_local_dontfrag(skb)) {
-		xfrm6_local_rxpmtu(skb, mtu);
-		return -EMSGSIZE;
-	} else if (!skb->ignore_df && skb->len > mtu && skb->sk) {
-		xfrm_local_error(skb, mtu);
+	if (!skb->ignore_df && skb->len > mtu) {
+		skb->dev = dst->dev;
+
+		if (xfrm6_local_dontfrag(skb))
+			xfrm6_local_rxpmtu(skb, mtu);
+		else if (skb->sk && x->props.mode != XFRM_MODE_TUNNEL)
+			xfrm_local_error(skb, mtu);
+		else
+			icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
+
 		return -EMSGSIZE;
 	}
 

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH] xfrm6: Do not use xfrm_local_error for path MTU issues in tunnels
  2015-05-27 17:40 [PATCH] xfrm6: Do not use xfrm_local_error for path MTU issues in tunnels Alexander Duyck
@ 2015-05-28  4:49 ` Herbert Xu
  2015-05-28  4:56   ` Steffen Klassert
  2015-05-28  5:36 ` Steffen Klassert
  1 sibling, 1 reply; 22+ messages in thread
From: Herbert Xu @ 2015-05-28  4:49 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: steffen.klassert, davem, netdev, linux-crypto

On Wed, May 27, 2015 at 10:40:32AM -0700, Alexander Duyck wrote:
> This change makes it so that we use icmpv6_send to report PMTU issues back
> into tunnels in the case that the resulting packet is larger than the MTU
> of the outgoing interface.  Previously xfrm_local_error was being used in
> this case, however this was resulting in no changes, I suspect due to the
> fact that the tunnel itself was being kept out of the loop.
> 
> This patch fixes PMTU problems seen on ip6_vti tunnels and is based on the
> behavior seen if the socket was orphaned.  Instead of requiring the socket
> to be orphaned this patch simply defaults to using icmpv6_send in the case
> that the frame came though a tunnel.
> 
> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>

Does this still work with normal tunnel mode and identical inner
and outer addresses? I recall we used to have a bug where in that
situation the kernel would interpret the ICMP message as a reduction
in outer MTU and thus resulting in a loop where the MTU keeps
getting smaller.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] xfrm6: Do not use xfrm_local_error for path MTU issues in tunnels
  2015-05-28  4:49 ` Herbert Xu
@ 2015-05-28  4:56   ` Steffen Klassert
  0 siblings, 0 replies; 22+ messages in thread
From: Steffen Klassert @ 2015-05-28  4:56 UTC (permalink / raw)
  To: Herbert Xu; +Cc: Alexander Duyck, davem, netdev, linux-crypto

On Thu, May 28, 2015 at 12:49:19PM +0800, Herbert Xu wrote:
> On Wed, May 27, 2015 at 10:40:32AM -0700, Alexander Duyck wrote:
> > This change makes it so that we use icmpv6_send to report PMTU issues back
> > into tunnels in the case that the resulting packet is larger than the MTU
> > of the outgoing interface.  Previously xfrm_local_error was being used in
> > this case, however this was resulting in no changes, I suspect due to the
> > fact that the tunnel itself was being kept out of the loop.
> > 
> > This patch fixes PMTU problems seen on ip6_vti tunnels and is based on the
> > behavior seen if the socket was orphaned.  Instead of requiring the socket
> > to be orphaned this patch simply defaults to using icmpv6_send in the case
> > that the frame came though a tunnel.
> > 
> > Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
> 
> Does this still work with normal tunnel mode and identical inner
> and outer addresses? I recall we used to have a bug where in that
> situation the kernel would interpret the ICMP message as a reduction
> in outer MTU and thus resulting in a loop where the MTU keeps
> getting smaller.

Right, I think this reintroduces a bug that I fixed some years ago with
commit dd767856a36e ("xfrm6: Don't call icmpv6_send on local error")

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] xfrm6: Do not use xfrm_local_error for path MTU issues in tunnels
  2015-05-27 17:40 [PATCH] xfrm6: Do not use xfrm_local_error for path MTU issues in tunnels Alexander Duyck
  2015-05-28  4:49 ` Herbert Xu
@ 2015-05-28  5:36 ` Steffen Klassert
  2015-05-28  7:18   ` Alexander Duyck
  1 sibling, 1 reply; 22+ messages in thread
From: Steffen Klassert @ 2015-05-28  5:36 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: davem, herbert, netdev, linux-crypto

On Wed, May 27, 2015 at 10:40:32AM -0700, Alexander Duyck wrote:
> This change makes it so that we use icmpv6_send to report PMTU issues back
> into tunnels in the case that the resulting packet is larger than the MTU
> of the outgoing interface.  Previously xfrm_local_error was being used in
> this case, however this was resulting in no changes, I suspect due to the
> fact that the tunnel itself was being kept out of the loop.
> 
> This patch fixes PMTU problems seen on ip6_vti tunnels and is based on the
> behavior seen if the socket was orphaned.  Instead of requiring the socket
> to be orphaned this patch simply defaults to using icmpv6_send in the case
> that the frame came though a tunnel.

We can use icmpv6_send() just in the case that the packet
was already transmitted by a tunnel device, otherwise we
get the bug back that I mentioned in my other mail.

Not sure if we have something to know that the packet
traversed a tunnel device. That's what I asked in the
thread 'Looking for a lost patch'.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] xfrm6: Do not use xfrm_local_error for path MTU issues in tunnels
  2015-05-28  5:36 ` Steffen Klassert
@ 2015-05-28  7:18   ` Alexander Duyck
  2015-05-28  8:40     ` Steffen Klassert
  0 siblings, 1 reply; 22+ messages in thread
From: Alexander Duyck @ 2015-05-28  7:18 UTC (permalink / raw)
  To: Steffen Klassert, Alexander Duyck; +Cc: davem, herbert, netdev, linux-crypto

On 05/27/2015 10:36 PM, Steffen Klassert wrote:
> On Wed, May 27, 2015 at 10:40:32AM -0700, Alexander Duyck wrote:
>> This change makes it so that we use icmpv6_send to report PMTU issues back
>> into tunnels in the case that the resulting packet is larger than the MTU
>> of the outgoing interface.  Previously xfrm_local_error was being used in
>> this case, however this was resulting in no changes, I suspect due to the
>> fact that the tunnel itself was being kept out of the loop.
>>
>> This patch fixes PMTU problems seen on ip6_vti tunnels and is based on the
>> behavior seen if the socket was orphaned.  Instead of requiring the socket
>> to be orphaned this patch simply defaults to using icmpv6_send in the case
>> that the frame came though a tunnel.
> We can use icmpv6_send() just in the case that the packet
> was already transmitted by a tunnel device, otherwise we
> get the bug back that I mentioned in my other mail.
>
> Not sure if we have something to know that the packet
> traversed a tunnel device. That's what I asked in the
> thread 'Looking for a lost patch'.

Okay I will try to do some more digging.  From what I can tell right now 
it looks like my ping attempts are getting hung up on the 
xfrm_local_error in __xfrm6_output.  I wonder if we couldn't somehow 
make use of the skb->cb to store a pointer to the tunnel that could be 
checked to determine if we are going through a VTI or not.

- Alex

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] xfrm6: Do not use xfrm_local_error for path MTU issues in tunnels
  2015-05-28  7:18   ` Alexander Duyck
@ 2015-05-28  8:40     ` Steffen Klassert
  2015-05-28 19:15       ` Alexander Duyck
  0 siblings, 1 reply; 22+ messages in thread
From: Steffen Klassert @ 2015-05-28  8:40 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: Alexander Duyck, davem, herbert, netdev, linux-crypto

On Thu, May 28, 2015 at 12:18:51AM -0700, Alexander Duyck wrote:
> On 05/27/2015 10:36 PM, Steffen Klassert wrote:
> >On Wed, May 27, 2015 at 10:40:32AM -0700, Alexander Duyck wrote:
> >>This change makes it so that we use icmpv6_send to report PMTU issues back
> >>into tunnels in the case that the resulting packet is larger than the MTU
> >>of the outgoing interface.  Previously xfrm_local_error was being used in
> >>this case, however this was resulting in no changes, I suspect due to the
> >>fact that the tunnel itself was being kept out of the loop.
> >>
> >>This patch fixes PMTU problems seen on ip6_vti tunnels and is based on the
> >>behavior seen if the socket was orphaned.  Instead of requiring the socket
> >>to be orphaned this patch simply defaults to using icmpv6_send in the case
> >>that the frame came though a tunnel.
> >We can use icmpv6_send() just in the case that the packet
> >was already transmitted by a tunnel device, otherwise we
> >get the bug back that I mentioned in my other mail.
> >
> >Not sure if we have something to know that the packet
> >traversed a tunnel device. That's what I asked in the
> >thread 'Looking for a lost patch'.
> 
> Okay I will try to do some more digging.  From what I can tell right
> now it looks like my ping attempts are getting hung up on the
> xfrm_local_error in __xfrm6_output.  I wonder if we couldn't somehow
> make use of the skb->cb to store a pointer to the tunnel that could
> be checked to determine if we are going through a VTI or not.

Maybe it is as easy as the patch below, could you please test it?

Subject: [PATCH RFC] vti6: Add pmtu handling to vti6_xmit.

We currently rely on the PMTU discovery of xfrm.
However if a packet is localy sent, the PMTU mechanism
of xfrm tries to to local socket notification what
might not work for applications like ping that don't
check for this. So add pmtu handling to vti6_xmit to
report MTU changes immediately.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 net/ipv6/ip6_vti.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c
index ff3bd86..13cb771 100644
--- a/net/ipv6/ip6_vti.c
+++ b/net/ipv6/ip6_vti.c
@@ -434,6 +434,7 @@ vti6_xmit(struct sk_buff *skb, struct net_device *dev, struct flowi *fl)
 	struct dst_entry *dst = skb_dst(skb);
 	struct net_device *tdev;
 	struct xfrm_state *x;
+	int mtu;
 	int err = -1;
 
 	if (!dst)
@@ -468,6 +469,15 @@ vti6_xmit(struct sk_buff *skb, struct net_device *dev, struct flowi *fl)
 	skb_dst_set(skb, dst);
 	skb->dev = skb_dst(skb)->dev;
 
+	mtu = dst_mtu(dst);
+	if (!skb->ignore_df && skb->len > mtu) {
+		skb_dst(skb)->ops->update_pmtu(dst, NULL, skb, mtu);
+
+		icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
+
+		return -EMSGSIZE;
+	}
+
 	err = dst_output(skb);
 	if (net_xmit_eval(err) == 0) {
 		struct pcpu_sw_netstats *tstats = this_cpu_ptr(dev->tstats);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH] xfrm6: Do not use xfrm_local_error for path MTU issues in tunnels
  2015-05-28  8:40     ` Steffen Klassert
@ 2015-05-28 19:15       ` Alexander Duyck
  2015-05-29 16:53         ` Alexander Duyck
  2015-05-29 18:28         ` [PATCH] vti6: Add pmtu handling to vti6_xmit Alexander Duyck
  0 siblings, 2 replies; 22+ messages in thread
From: Alexander Duyck @ 2015-05-28 19:15 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: Alexander Duyck, davem, herbert, netdev, linux-crypto

On 05/28/2015 01:40 AM, Steffen Klassert wrote:
> On Thu, May 28, 2015 at 12:18:51AM -0700, Alexander Duyck wrote:
>> On 05/27/2015 10:36 PM, Steffen Klassert wrote:
>>> On Wed, May 27, 2015 at 10:40:32AM -0700, Alexander Duyck wrote:
>>>> This change makes it so that we use icmpv6_send to report PMTU issues back
>>>> into tunnels in the case that the resulting packet is larger than the MTU
>>>> of the outgoing interface.  Previously xfrm_local_error was being used in
>>>> this case, however this was resulting in no changes, I suspect due to the
>>>> fact that the tunnel itself was being kept out of the loop.
>>>>
>>>> This patch fixes PMTU problems seen on ip6_vti tunnels and is based on the
>>>> behavior seen if the socket was orphaned.  Instead of requiring the socket
>>>> to be orphaned this patch simply defaults to using icmpv6_send in the case
>>>> that the frame came though a tunnel.
>>> We can use icmpv6_send() just in the case that the packet
>>> was already transmitted by a tunnel device, otherwise we
>>> get the bug back that I mentioned in my other mail.
>>>
>>> Not sure if we have something to know that the packet
>>> traversed a tunnel device. That's what I asked in the
>>> thread 'Looking for a lost patch'.
>> Okay I will try to do some more digging.  From what I can tell right
>> now it looks like my ping attempts are getting hung up on the
>> xfrm_local_error in __xfrm6_output.  I wonder if we couldn't somehow
>> make use of the skb->cb to store a pointer to the tunnel that could
>> be checked to determine if we are going through a VTI or not.
> Maybe it is as easy as the patch below, could you please test it?
>
> Subject: [PATCH RFC] vti6: Add pmtu handling to vti6_xmit.
>
> We currently rely on the PMTU discovery of xfrm.
> However if a packet is localy sent, the PMTU mechanism
> of xfrm tries to to local socket notification what
> might not work for applications like ping that don't
> check for this. So add pmtu handling to vti6_xmit to
> report MTU changes immediately.
>
> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
> ---
>   net/ipv6/ip6_vti.c | 10 ++++++++++
>   1 file changed, 10 insertions(+)
>
> diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c
> index ff3bd86..13cb771 100644
> --- a/net/ipv6/ip6_vti.c
> +++ b/net/ipv6/ip6_vti.c
> @@ -434,6 +434,7 @@ vti6_xmit(struct sk_buff *skb, struct net_device *dev, struct flowi *fl)
>   	struct dst_entry *dst = skb_dst(skb);
>   	struct net_device *tdev;
>   	struct xfrm_state *x;
> +	int mtu;
>   	int err = -1;
>   
>   	if (!dst)
> @@ -468,6 +469,15 @@ vti6_xmit(struct sk_buff *skb, struct net_device *dev, struct flowi *fl)
>   	skb_dst_set(skb, dst);
>   	skb->dev = skb_dst(skb)->dev;
>   
> +	mtu = dst_mtu(dst);
> +	if (!skb->ignore_df && skb->len > mtu) {
> +		skb_dst(skb)->ops->update_pmtu(dst, NULL, skb, mtu);
> +
> +		icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
> +
> +		return -EMSGSIZE;
> +	}
> +
>   	err = dst_output(skb);
>   	if (net_xmit_eval(err) == 0) {
>   		struct pcpu_sw_netstats *tstats = this_cpu_ptr(dev->tstats);

That seems to be working for me.  I'm able to ping and while the first 
packet fails the second one and all that follow make it through 
correctly after the ptmu update.

- Alex

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] xfrm6: Do not use xfrm_local_error for path MTU issues in tunnels
  2015-05-28 19:15       ` Alexander Duyck
@ 2015-05-29 16:53         ` Alexander Duyck
  2015-05-29 18:28         ` [PATCH] vti6: Add pmtu handling to vti6_xmit Alexander Duyck
  1 sibling, 0 replies; 22+ messages in thread
From: Alexander Duyck @ 2015-05-29 16:53 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: Alexander Duyck, davem, herbert, netdev, linux-crypto

On 05/28/2015 12:15 PM, Alexander Duyck wrote:
> On 05/28/2015 01:40 AM, Steffen Klassert wrote:
>> On Thu, May 28, 2015 at 12:18:51AM -0700, Alexander Duyck wrote:
>>> On 05/27/2015 10:36 PM, Steffen Klassert wrote:
>>>> On Wed, May 27, 2015 at 10:40:32AM -0700, Alexander Duyck wrote:
>>>>> This change makes it so that we use icmpv6_send to report PMTU 
>>>>> issues back
>>>>> into tunnels in the case that the resulting packet is larger than 
>>>>> the MTU
>>>>> of the outgoing interface.  Previously xfrm_local_error was being 
>>>>> used in
>>>>> this case, however this was resulting in no changes, I suspect due 
>>>>> to the
>>>>> fact that the tunnel itself was being kept out of the loop.
>>>>>
>>>>> This patch fixes PMTU problems seen on ip6_vti tunnels and is 
>>>>> based on the
>>>>> behavior seen if the socket was orphaned.  Instead of requiring 
>>>>> the socket
>>>>> to be orphaned this patch simply defaults to using icmpv6_send in 
>>>>> the case
>>>>> that the frame came though a tunnel.
>>>> We can use icmpv6_send() just in the case that the packet
>>>> was already transmitted by a tunnel device, otherwise we
>>>> get the bug back that I mentioned in my other mail.
>>>>
>>>> Not sure if we have something to know that the packet
>>>> traversed a tunnel device. That's what I asked in the
>>>> thread 'Looking for a lost patch'.
>>> Okay I will try to do some more digging.  From what I can tell right
>>> now it looks like my ping attempts are getting hung up on the
>>> xfrm_local_error in __xfrm6_output.  I wonder if we couldn't somehow
>>> make use of the skb->cb to store a pointer to the tunnel that could
>>> be checked to determine if we are going through a VTI or not.
>> Maybe it is as easy as the patch below, could you please test it?
>>
>> Subject: [PATCH RFC] vti6: Add pmtu handling to vti6_xmit.
>>
>> We currently rely on the PMTU discovery of xfrm.
>> However if a packet is localy sent, the PMTU mechanism
>> of xfrm tries to to local socket notification what
>> might not work for applications like ping that don't
>> check for this. So add pmtu handling to vti6_xmit to
>> report MTU changes immediately.
>>
>> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
>> ---
>>   net/ipv6/ip6_vti.c | 10 ++++++++++
>>   1 file changed, 10 insertions(+)
>>
>> diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c
>> index ff3bd86..13cb771 100644
>> --- a/net/ipv6/ip6_vti.c
>> +++ b/net/ipv6/ip6_vti.c
>> @@ -434,6 +434,7 @@ vti6_xmit(struct sk_buff *skb, struct net_device 
>> *dev, struct flowi *fl)
>>       struct dst_entry *dst = skb_dst(skb);
>>       struct net_device *tdev;
>>       struct xfrm_state *x;
>> +    int mtu;
>>       int err = -1;
>>         if (!dst)
>> @@ -468,6 +469,15 @@ vti6_xmit(struct sk_buff *skb, struct net_device 
>> *dev, struct flowi *fl)
>>       skb_dst_set(skb, dst);
>>       skb->dev = skb_dst(skb)->dev;
>>   +    mtu = dst_mtu(dst);
>> +    if (!skb->ignore_df && skb->len > mtu) {
>> +        skb_dst(skb)->ops->update_pmtu(dst, NULL, skb, mtu);
>> +
>> +        icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
>> +
>> +        return -EMSGSIZE;
>> +    }
>> +
>>       err = dst_output(skb);
>>       if (net_xmit_eval(err) == 0) {
>>           struct pcpu_sw_netstats *tstats = this_cpu_ptr(dev->tstats);
>
> That seems to be working for me.  I'm able to ping and while the first 
> packet fails the second one and all that follow make it through 
> correctly after the ptmu update.
>
> - Alex

It looks like I spoke too soon.  It resolves it for IPv6, but IPv4 over 
the tunnel has the same issue.  Probably need to have some sort of 
protocol based check to determine which version of the call to use.

- Alex

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH] vti6: Add pmtu handling to vti6_xmit.
  2015-05-28 19:15       ` Alexander Duyck
  2015-05-29 16:53         ` Alexander Duyck
@ 2015-05-29 18:28         ` Alexander Duyck
  2015-06-01 23:04           ` David Miller
  2016-02-10  1:50           ` Mark McKinstry
  1 sibling, 2 replies; 22+ messages in thread
From: Alexander Duyck @ 2015-05-29 18:28 UTC (permalink / raw)
  To: steffen.klassert, davem, herbert; +Cc: netdev, linux-crypto

From: Steffen Klassert <steffen.klassert@secunet.com>

We currently rely on the PMTU discovery of xfrm.
However if a packet is localy sent, the PMTU mechanism
of xfrm tries to to local socket notification what
might not work for applications like ping that don't
check for this. So add pmtu handling to vti6_xmit to
report MTU changes immediately.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>
---

So this version is slightly modified to cover the IPv4 case in addition to
the IPv6 case.  With this patch I was able to run netperf over either an
IPv4 or IPv6 address routed over the ip6_vti tunnel.

 net/ipv6/ip6_vti.c |   14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c
index d25209657edc..3b5c1ea50d2f 100644
--- a/net/ipv6/ip6_vti.c
+++ b/net/ipv6/ip6_vti.c
@@ -435,6 +435,7 @@ vti6_xmit(struct sk_buff *skb, struct net_device *dev, struct flowi *fl)
 	struct net_device *tdev;
 	struct xfrm_state *x;
 	int err = -1;
+	int mtu;
 
 	if (!dst)
 		goto tx_err_link_failure;
@@ -468,6 +469,19 @@ vti6_xmit(struct sk_buff *skb, struct net_device *dev, struct flowi *fl)
 	skb_dst_set(skb, dst);
 	skb->dev = skb_dst(skb)->dev;
 
+	mtu = dst_mtu(dst);
+	if (!skb->ignore_df && skb->len > mtu) {
+		skb_dst(skb)->ops->update_pmtu(dst, NULL, skb, mtu);
+
+		if (skb->protocol == htons(ETH_P_IPV6))
+			icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
+		else
+			icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
+				  htonl(mtu));
+
+		return -EMSGSIZE;
+	}
+
 	err = dst_output(skb);
 	if (net_xmit_eval(err) == 0) {
 		struct pcpu_sw_netstats *tstats = this_cpu_ptr(dev->tstats);

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH] vti6: Add pmtu handling to vti6_xmit.
  2015-05-29 18:28         ` [PATCH] vti6: Add pmtu handling to vti6_xmit Alexander Duyck
@ 2015-06-01 23:04           ` David Miller
  2016-02-10  1:50           ` Mark McKinstry
  1 sibling, 0 replies; 22+ messages in thread
From: David Miller @ 2015-06-01 23:04 UTC (permalink / raw)
  To: alexander.h.duyck; +Cc: steffen.klassert, herbert, netdev, linux-crypto

From: Alexander Duyck <alexander.h.duyck@redhat.com>
Date: Fri, 29 May 2015 11:28:26 -0700

> From: Steffen Klassert <steffen.klassert@secunet.com>
> 
> We currently rely on the PMTU discovery of xfrm.
> However if a packet is localy sent, the PMTU mechanism
> of xfrm tries to to local socket notification what
> might not work for applications like ping that don't
> check for this. So add pmtu handling to vti6_xmit to
> report MTU changes immediately.
> 
> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
> Signed-off-by: Alexander Duyck <alexander.h.duyck@redhat.com>

Applied, thanks Andrew.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH] vti6: Add pmtu handling to vti6_xmit.
  2015-05-29 18:28         ` [PATCH] vti6: Add pmtu handling to vti6_xmit Alexander Duyck
  2015-06-01 23:04           ` David Miller
@ 2016-02-10  1:50           ` Mark McKinstry
  2016-02-17  7:08             ` Steffen Klassert
  1 sibling, 1 reply; 22+ messages in thread
From: Mark McKinstry @ 2016-02-10  1:50 UTC (permalink / raw)
  To: linux-crypto; +Cc: alexander.h.duyck, herbert, steffen.klassert, davem

http://www.spinics.net/lists/linux-crypto/msg15101.html
> From: Steffen Klassert <steffen.klassert@xxxxxxxxxxx>
>
> We currently rely on the PMTU discovery of xfrm.
> However if a packet is localy sent, the PMTU mechanism
> of xfrm tries to to local socket notification what
> might not work for applications like ping that don't
> check for this. So add pmtu handling to vti6_xmit to
> report MTU changes immediately.
>
> Signed-off-by: Steffen Klassert <steffen.klassert@xxxxxxxxxxx>
> Signed-off-by: Alexander Duyck <alexander.h.duyck@xxxxxxxxxx>
> ---
>
> So this version is slightly modified to cover the IPv4 case in addition to
> the IPv6 case.  With this patch I was able to run netperf over either an
> IPv4 or IPv6 address routed over the ip6_vti tunnel.
We have the same issue. When we do a local ping to a remote device over
a v4 vti tunnel and an intermediate device has a low mtu, pmtu
discovery reduces the route's pmtu, and ping fails because it does not
handle the local error message generated by xfrm4_tunnel_check_size().
Your patch fixes our issue for v6 vti tunnels, but the issue still
exists for v4 tunnels. Is there any particular reason this patch was
not delivered for v4 tunnels too - i.e. in vti_xmit()?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] vti6: Add pmtu handling to vti6_xmit.
  2016-02-10  1:50           ` Mark McKinstry
@ 2016-02-17  7:08             ` Steffen Klassert
  2016-02-18  1:40               ` Mark McKinstry
  0 siblings, 1 reply; 22+ messages in thread
From: Steffen Klassert @ 2016-02-17  7:08 UTC (permalink / raw)
  To: Mark McKinstry; +Cc: linux-crypto, alexander.h.duyck, herbert, davem

On Wed, Feb 10, 2016 at 01:50:20AM +0000, Mark McKinstry wrote:
> >
> > So this version is slightly modified to cover the IPv4 case in addition to
> > the IPv6 case.  With this patch I was able to run netperf over either an
> > IPv4 or IPv6 address routed over the ip6_vti tunnel.
> We have the same issue. When we do a local ping to a remote device over
> a v4 vti tunnel and an intermediate device has a low mtu, pmtu
> discovery reduces the route's pmtu, and ping fails because it does not
> handle the local error message generated by xfrm4_tunnel_check_size().
> Your patch fixes our issue for v6 vti tunnels, but the issue still
> exists for v4 tunnels. Is there any particular reason this patch was
> not delivered for v4 tunnels too - i.e. in vti_xmit()?

I don't remember why we fixed it just for ipv6, we probably need
a similar patch for ipv4.

Does the patch below help (compile tested only)?

Subject: [PATCH] vti: Add pmtu handling to vti_xmit.

We currently rely on the PMTU discovery of xfrm.
However if a packet is localy sent, the PMTU mechanism
of xfrm tries to to local socket notification what
might not work for applications like ping that don't
check for this. So add pmtu handling to vti_xmit to
report MTU changes immediately.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 net/ipv4/ip_vti.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/net/ipv4/ip_vti.c b/net/ipv4/ip_vti.c
index 5cf10b7..6862305 100644
--- a/net/ipv4/ip_vti.c
+++ b/net/ipv4/ip_vti.c
@@ -156,6 +156,7 @@ static netdev_tx_t vti_xmit(struct sk_buff *skb, struct net_device *dev,
 	struct dst_entry *dst = skb_dst(skb);
 	struct net_device *tdev;	/* Device to other host */
 	int err;
+	int mtu;
 
 	if (!dst) {
 		dev->stats.tx_carrier_errors++;
@@ -196,6 +197,18 @@ static netdev_tx_t vti_xmit(struct sk_buff *skb, struct net_device *dev,
 	skb_dst_set(skb, dst);
 	skb->dev = skb_dst(skb)->dev;
 
+	mtu = dst_mtu(dst);
+	if (!skb->ignore_df && skb->len > mtu) {
+		skb_dst(skb)->ops->update_pmtu(dst, NULL, skb, mtu);
+		if (skb->protocol == htons(ETH_P_IP))
+			icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
+				  htonl(mtu));
+		else
+			icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
+
+		return -EMSGSIZE;
+	}
+
 	err = dst_output(tunnel->net, skb->sk, skb);
 	if (net_xmit_eval(err) == 0)
 		err = skb->len;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH] vti6: Add pmtu handling to vti6_xmit.
  2016-02-17  7:08             ` Steffen Klassert
@ 2016-02-18  1:40               ` Mark McKinstry
  2016-02-18 12:19                 ` Steffen Klassert
  0 siblings, 1 reply; 22+ messages in thread
From: Mark McKinstry @ 2016-02-18  1:40 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: linux-crypto, alexander.h.duyck, herbert, davem



On 17/02/16 20:08, Steffen Klassert wrote:
> On Wed, Feb 10, 2016 at 01:50:20AM +0000, Mark McKinstry wrote:
>>> So this version is slightly modified to cover the IPv4 case in addition to
>>> the IPv6 case.  With this patch I was able to run netperf over either an
>>> IPv4 or IPv6 address routed over the ip6_vti tunnel.
>> We have the same issue. When we do a local ping to a remote device over
>> a v4 vti tunnel and an intermediate device has a low mtu, pmtu
>> discovery reduces the route's pmtu, and ping fails because it does not
>> handle the local error message generated by xfrm4_tunnel_check_size().
>> Your patch fixes our issue for v6 vti tunnels, but the issue still
>> exists for v4 tunnels. Is there any particular reason this patch was
>> not delivered for v4 tunnels too - i.e. in vti_xmit()?
> I don't remember why we fixed it just for ipv6, we probably need
> a similar patch for ipv4.
>
> Does the patch below help (compile tested only)?
>
> Subject: [PATCH] vti: Add pmtu handling to vti_xmit.
>
> We currently rely on the PMTU discovery of xfrm.
> However if a packet is localy sent, the PMTU mechanism
> of xfrm tries to to local socket notification what
> might not work for applications like ping that don't
> check for this. So add pmtu handling to vti_xmit to
> report MTU changes immediately.
>
> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
> ---
>   net/ipv4/ip_vti.c | 13 +++++++++++++
>   1 file changed, 13 insertions(+)
>
> diff --git a/net/ipv4/ip_vti.c b/net/ipv4/ip_vti.c
> index 5cf10b7..6862305 100644
> --- a/net/ipv4/ip_vti.c
> +++ b/net/ipv4/ip_vti.c
> @@ -156,6 +156,7 @@ static netdev_tx_t vti_xmit(struct sk_buff *skb, struct net_device *dev,
>   	struct dst_entry *dst = skb_dst(skb);
>   	struct net_device *tdev;	/* Device to other host */
>   	int err;
> +	int mtu;
>   
>   	if (!dst) {
>   		dev->stats.tx_carrier_errors++;
> @@ -196,6 +197,18 @@ static netdev_tx_t vti_xmit(struct sk_buff *skb, struct net_device *dev,
>   	skb_dst_set(skb, dst);
>   	skb->dev = skb_dst(skb)->dev;
>   
> +	mtu = dst_mtu(dst);
> +	if (!skb->ignore_df && skb->len > mtu) {
> +		skb_dst(skb)->ops->update_pmtu(dst, NULL, skb, mtu);
> +		if (skb->protocol == htons(ETH_P_IP))
> +			icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
> +				  htonl(mtu));
> +		else
> +			icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
> +
> +		return -EMSGSIZE;
> +	}
> +
>   	err = dst_output(tunnel->net, skb->sk, skb);
>   	if (net_xmit_eval(err) == 0)
>   		err = skb->len;
This patch fixes our issue, thanks. In our scenario the tunnel path MTU 
now gets updated so that subsequent large packets sent over the tunnel 
get fragmented correctly.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] vti6: Add pmtu handling to vti6_xmit.
  2016-02-18  1:40               ` Mark McKinstry
@ 2016-02-18 12:19                 ` Steffen Klassert
  2016-02-24 21:37                   ` Mark McKinstry
  0 siblings, 1 reply; 22+ messages in thread
From: Steffen Klassert @ 2016-02-18 12:19 UTC (permalink / raw)
  To: Mark McKinstry; +Cc: linux-crypto, alexander.h.duyck, herbert, davem

On Thu, Feb 18, 2016 at 01:40:00AM +0000, Mark McKinstry wrote:
> This patch fixes our issue, thanks. In our scenario the tunnel path MTU 
> now gets updated so that subsequent large packets sent over the tunnel 
> get fragmented correctly.

I've applied this patch to the ipsec tree now.
Thanks for testing!

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] vti6: Add pmtu handling to vti6_xmit.
  2016-02-18 12:19                 ` Steffen Klassert
@ 2016-02-24 21:37                   ` Mark McKinstry
  2016-02-25 11:21                     ` Steffen Klassert
  2016-03-04  7:05                     ` Steffen Klassert
  0 siblings, 2 replies; 22+ messages in thread
From: Mark McKinstry @ 2016-02-24 21:37 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: linux-crypto, alexander.h.duyck, herbert, davem

On 19/02/16 01:19, Steffen Klassert wrote:
> On Thu, Feb 18, 2016 at 01:40:00AM +0000, Mark McKinstry wrote:
>> This patch fixes our issue, thanks. In our scenario the tunnel path MTU
>> now gets updated so that subsequent large packets sent over the tunnel
>> get fragmented correctly.
> I've applied this patch to the ipsec tree now.
> Thanks for testing!
I spoke too soon. Upon further testing with this patch we have found it 
causes
a skt buffer leak. This is problematic for us and can cause memory 
exhaustion in
one of our test scenarios that has an IPv4 IPsec tunnel over a PPP link. 
Also
the patch's -EMSGSIZE return value appears to be invalid because vti_xmit()
should be returning a type netdev_tx_t (NETDEV_TX_OK etc). It looks to 
me that
this patch should really be doing a goto tx_error rather than doing an early
return with -EMSGSIZE. This would result in the skt buffer being freed,
NETDEV_TX_OK being returned (thus indicating vti_xmit() "took care of 
packet"),
and the tx_errors counter being incremented (which seems like a reasonable
thing to do).

I think the original IPv6 patch probably has the same issues, and could be
causing a DOS attack vulnerability in recent Linux releases. If this patch's
code gets hit for every received packet then the box's memory will soon be
exhausted - e.g. a rogue device sends a stream of largish pkts through a box
with a vti interface, and ignores every ICMPV6_PKT_TOOBIG pkt sent back 
to it.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] vti6: Add pmtu handling to vti6_xmit.
  2016-02-24 21:37                   ` Mark McKinstry
@ 2016-02-25 11:21                     ` Steffen Klassert
  2016-03-04  7:05                     ` Steffen Klassert
  1 sibling, 0 replies; 22+ messages in thread
From: Steffen Klassert @ 2016-02-25 11:21 UTC (permalink / raw)
  To: Mark McKinstry; +Cc: linux-crypto, alexander.h.duyck, herbert, davem

On Wed, Feb 24, 2016 at 09:37:39PM +0000, Mark McKinstry wrote:
> On 19/02/16 01:19, Steffen Klassert wrote:
> > On Thu, Feb 18, 2016 at 01:40:00AM +0000, Mark McKinstry wrote:
> >> This patch fixes our issue, thanks. In our scenario the tunnel path MTU
> >> now gets updated so that subsequent large packets sent over the tunnel
> >> get fragmented correctly.
> > I've applied this patch to the ipsec tree now.
> > Thanks for testing!
> I spoke too soon. Upon further testing with this patch we have found it 
> causes
> a skt buffer leak. This is problematic for us and can cause memory 
> exhaustion in
> one of our test scenarios that has an IPv4 IPsec tunnel over a PPP link. 
> Also
> the patch's -EMSGSIZE return value appears to be invalid because vti_xmit()
> should be returning a type netdev_tx_t (NETDEV_TX_OK etc). It looks to 
> me that
> this patch should really be doing a goto tx_error rather than doing an early
> return with -EMSGSIZE. This would result in the skt buffer being freed,
> NETDEV_TX_OK being returned (thus indicating vti_xmit() "took care of 
> packet"),
> and the tx_errors counter being incremented (which seems like a reasonable
> thing to do).

Yes, you are right here. 

> 
> I think the original IPv6 patch probably has the same issues, and could be
> causing a DOS attack vulnerability in recent Linux releases.

We need to fix both, ipv4 and ipv6. I'll care for it,
thanks for the report.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] vti6: Add pmtu handling to vti6_xmit.
  2016-02-24 21:37                   ` Mark McKinstry
  2016-02-25 11:21                     ` Steffen Klassert
@ 2016-03-04  7:05                     ` Steffen Klassert
  2016-03-14 21:52                       ` Mark McKinstry
  1 sibling, 1 reply; 22+ messages in thread
From: Steffen Klassert @ 2016-03-04  7:05 UTC (permalink / raw)
  To: Mark McKinstry; +Cc: linux-crypto, alexander.h.duyck, herbert, davem

On Wed, Feb 24, 2016 at 09:37:39PM +0000, Mark McKinstry wrote:
> On 19/02/16 01:19, Steffen Klassert wrote:
> > On Thu, Feb 18, 2016 at 01:40:00AM +0000, Mark McKinstry wrote:
> >> This patch fixes our issue, thanks. In our scenario the tunnel path MTU
> >> now gets updated so that subsequent large packets sent over the tunnel
> >> get fragmented correctly.
> > I've applied this patch to the ipsec tree now.
> > Thanks for testing!
> I spoke too soon. Upon further testing with this patch we have found it 
> causes
> a skt buffer leak. This is problematic for us and can cause memory 
> exhaustion in
> one of our test scenarios that has an IPv4 IPsec tunnel over a PPP link. 

The patch below is what I plan to apply on top of the original patch.

Subject: [PATCH] vti: Fix recource leeks on pmtu discovery

A recent patch introduced pmtu handling directly in the
vti transmit routine. Unfortunately we now return without
releasing the dst_entry and freeing the sk_buff. This patch
fixes the issue.

Fixes: 325b71fe0f57 ("vti: Add pmtu handling to vti_xmit.")
Reported-by: Mark McKinstry <Mark.McKinstry@alliedtelesis.co.nz>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 net/ipv4/ip_vti.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/ip_vti.c b/net/ipv4/ip_vti.c
index 6862305..2ea2b6e 100644
--- a/net/ipv4/ip_vti.c
+++ b/net/ipv4/ip_vti.c
@@ -206,7 +206,8 @@ static netdev_tx_t vti_xmit(struct sk_buff *skb, struct net_device *dev,
 		else
 			icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
 
-		return -EMSGSIZE;
+		dst_release(dst);
+		goto tx_error;
 	}
 
 	err = dst_output(tunnel->net, skb->sk, skb);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH] vti6: Add pmtu handling to vti6_xmit.
  2016-03-04  7:05                     ` Steffen Klassert
@ 2016-03-14 21:52                       ` Mark McKinstry
  2016-03-15 12:28                         ` Steffen Klassert
  0 siblings, 1 reply; 22+ messages in thread
From: Mark McKinstry @ 2016-03-14 21:52 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: linux-crypto, alexander.h.duyck, herbert, davem

On 04/03/16 20:05, Steffen Klassert wrote:
> On Wed, Feb 24, 2016 at 09:37:39PM +0000, Mark McKinstry wrote:
>> On 19/02/16 01:19, Steffen Klassert wrote:
>>> On Thu, Feb 18, 2016 at 01:40:00AM +0000, Mark McKinstry wrote:
>>>> This patch fixes our issue, thanks. In our scenario the tunnel path MTU
>>>> now gets updated so that subsequent large packets sent over the tunnel
>>>> get fragmented correctly.
>>> I've applied this patch to the ipsec tree now.
>>> Thanks for testing!
>> I spoke too soon. Upon further testing with this patch we have found it
>> causes
>> a skt buffer leak. This is problematic for us and can cause memory
>> exhaustion in
>> one of our test scenarios that has an IPv4 IPsec tunnel over a PPP link.
> The patch below is what I plan to apply on top of the original patch.
>
> Subject: [PATCH] vti: Fix recource leeks on pmtu discovery
>
> A recent patch introduced pmtu handling directly in the
> vti transmit routine. Unfortunately we now return without
> releasing the dst_entry and freeing the sk_buff. This patch
> fixes the issue.
>
> Fixes: 325b71fe0f57 ("vti: Add pmtu handling to vti_xmit.")
> Reported-by: Mark McKinstry <Mark.McKinstry@alliedtelesis.co.nz>
> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
> ---
>   net/ipv4/ip_vti.c | 3 ++-
>   1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/net/ipv4/ip_vti.c b/net/ipv4/ip_vti.c
> index 6862305..2ea2b6e 100644
> --- a/net/ipv4/ip_vti.c
> +++ b/net/ipv4/ip_vti.c
> @@ -206,7 +206,8 @@ static netdev_tx_t vti_xmit(struct sk_buff *skb, struct net_device *dev,
>   		else
>   			icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
>   
> -		return -EMSGSIZE;
> +		dst_release(dst);
> +		goto tx_error;
>   	}
>   
>   	err = dst_output(tunnel->net, skb->sk, skb);
Your patch adds a dst_release() call to my suggested fix, but this is 
problematic because the kfree_skb() call at tx_error already takes care 
of releasing dst - via kfree_skb() > __kfree_skb() > skb_release_all() > 
skb_release_head_state() > skb_dst_drop()
 > refdst_drop() > dst_release(). In our scenario your patch results in 
a negative refcount kernel warning being generated in dst_release() for 
every packet that is too big to go over the vti.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] vti6: Add pmtu handling to vti6_xmit.
  2016-03-14 21:52                       ` Mark McKinstry
@ 2016-03-15 12:28                         ` Steffen Klassert
  2016-03-22 10:53                           ` Steffen Klassert
  0 siblings, 1 reply; 22+ messages in thread
From: Steffen Klassert @ 2016-03-15 12:28 UTC (permalink / raw)
  To: Mark McKinstry; +Cc: linux-crypto, alexander.h.duyck, herbert, davem

On Mon, Mar 14, 2016 at 09:52:05PM +0000, Mark McKinstry wrote:
> Your patch adds a dst_release() call to my suggested fix, but this is 
> problematic because the kfree_skb() call at tx_error already takes care 
> of releasing dst - via kfree_skb() > __kfree_skb() > skb_release_all() > 
> skb_release_head_state() > skb_dst_drop()
>  > refdst_drop() > dst_release(). In our scenario your patch results in 
> a negative refcount kernel warning being generated in dst_release() for 
> every packet that is too big to go over the vti.

Hm. I've just noticed that my pmtu test does not trigger this
codepath, so I did not see the warning.

Seems like we do the pmtu handling too late, it should happen before
we do skb_dst_set(). Also skb_scrub_packet() resets skb->ignore_df,
so checking ignore_df after skb_scrub_packet() does not make much sense.

I'll send an updated version after some more testing.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] vti6: Add pmtu handling to vti6_xmit.
  2016-03-15 12:28                         ` Steffen Klassert
@ 2016-03-22 10:53                           ` Steffen Klassert
  2016-03-30 21:04                             ` Mark McKinstry
  0 siblings, 1 reply; 22+ messages in thread
From: Steffen Klassert @ 2016-03-22 10:53 UTC (permalink / raw)
  To: Mark McKinstry; +Cc: linux-crypto, alexander.h.duyck, herbert, davem

On Tue, Mar 15, 2016 at 01:28:01PM +0100, Steffen Klassert wrote:
> On Mon, Mar 14, 2016 at 09:52:05PM +0000, Mark McKinstry wrote:
> > Your patch adds a dst_release() call to my suggested fix, but this is 
> > problematic because the kfree_skb() call at tx_error already takes care 
> > of releasing dst - via kfree_skb() > __kfree_skb() > skb_release_all() > 
> > skb_release_head_state() > skb_dst_drop()
> >  > refdst_drop() > dst_release(). In our scenario your patch results in 
> > a negative refcount kernel warning being generated in dst_release() for 
> > every packet that is too big to go over the vti.
> 
> Hm. I've just noticed that my pmtu test does not trigger this
> codepath, so I did not see the warning.
> 
> Seems like we do the pmtu handling too late, it should happen before
> we do skb_dst_set(). Also skb_scrub_packet() resets skb->ignore_df,
> so checking ignore_df after skb_scrub_packet() does not make much sense.
> 
> I'll send an updated version after some more testing.
> 

I've added a testcase that triggers this codepath to my testing
environment. The patch below works for me, could you please test
if it fixes your problems?

Subject: [PATCH] vti: Add pmtu handling to vti_xmit.

We currently rely on the PMTU discovery of xfrm.
However if a packet is locally sent, the PMTU mechanism
of xfrm tries to do local socket notification what
might not work for applications like ping that don't
check for this. So add pmtu handling to vti_xmit to
report MTU changes immediately.

Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 net/ipv4/ip_vti.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/net/ipv4/ip_vti.c b/net/ipv4/ip_vti.c
index 5cf10b7..a917903 100644
--- a/net/ipv4/ip_vti.c
+++ b/net/ipv4/ip_vti.c
@@ -156,6 +156,7 @@ static netdev_tx_t vti_xmit(struct sk_buff *skb, struct net_device *dev,
 	struct dst_entry *dst = skb_dst(skb);
 	struct net_device *tdev;	/* Device to other host */
 	int err;
+	int mtu;
 
 	if (!dst) {
 		dev->stats.tx_carrier_errors++;
@@ -192,6 +193,23 @@ static netdev_tx_t vti_xmit(struct sk_buff *skb, struct net_device *dev,
 			tunnel->err_count = 0;
 	}
 
+	mtu = dst_mtu(dst);
+	if (skb->len > mtu) {
+		skb_dst(skb)->ops->update_pmtu(skb_dst(skb), NULL, skb, mtu);
+		if (skb->protocol == htons(ETH_P_IP)) {
+			icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
+				  htonl(mtu));
+		} else {
+			if (mtu < IPV6_MIN_MTU)
+				mtu = IPV6_MIN_MTU;
+
+			icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
+		}
+
+		dst_release(dst);
+		goto tx_error;
+	}
+
 	skb_scrub_packet(skb, !net_eq(tunnel->net, dev_net(dev)));
 	skb_dst_set(skb, dst);
 	skb->dev = skb_dst(skb)->dev;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH] vti6: Add pmtu handling to vti6_xmit.
  2016-03-22 10:53                           ` Steffen Klassert
@ 2016-03-30 21:04                             ` Mark McKinstry
  2016-04-01  8:08                               ` Steffen Klassert
  0 siblings, 1 reply; 22+ messages in thread
From: Mark McKinstry @ 2016-03-30 21:04 UTC (permalink / raw)
  To: Steffen Klassert; +Cc: linux-crypto, alexander.h.duyck, herbert, davem

I've tested this patch in our scenario and I can confirm that it still 
fixes all of our issues.

On 22/03/16 23:53, Steffen Klassert wrote:
> On Tue, Mar 15, 2016 at 01:28:01PM +0100, Steffen Klassert wrote:
>> On Mon, Mar 14, 2016 at 09:52:05PM +0000, Mark McKinstry wrote:
>>> Your patch adds a dst_release() call to my suggested fix, but this is
>>> problematic because the kfree_skb() call at tx_error already takes care
>>> of releasing dst - via kfree_skb() > __kfree_skb() > skb_release_all() >
>>> skb_release_head_state() > skb_dst_drop()
>>>   > refdst_drop() > dst_release(). In our scenario your patch results in
>>> a negative refcount kernel warning being generated in dst_release() for
>>> every packet that is too big to go over the vti.
>> Hm. I've just noticed that my pmtu test does not trigger this
>> codepath, so I did not see the warning.
>>
>> Seems like we do the pmtu handling too late, it should happen before
>> we do skb_dst_set(). Also skb_scrub_packet() resets skb->ignore_df,
>> so checking ignore_df after skb_scrub_packet() does not make much sense.
>>
>> I'll send an updated version after some more testing.
>>
> I've added a testcase that triggers this codepath to my testing
> environment. The patch below works for me, could you please test
> if it fixes your problems?
>
> Subject: [PATCH] vti: Add pmtu handling to vti_xmit.
>
> We currently rely on the PMTU discovery of xfrm.
> However if a packet is locally sent, the PMTU mechanism
> of xfrm tries to do local socket notification what
> might not work for applications like ping that don't
> check for this. So add pmtu handling to vti_xmit to
> report MTU changes immediately.
>
> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
> ---
>   net/ipv4/ip_vti.c | 18 ++++++++++++++++++
>   1 file changed, 18 insertions(+)
>
> diff --git a/net/ipv4/ip_vti.c b/net/ipv4/ip_vti.c
> index 5cf10b7..a917903 100644
> --- a/net/ipv4/ip_vti.c
> +++ b/net/ipv4/ip_vti.c
> @@ -156,6 +156,7 @@ static netdev_tx_t vti_xmit(struct sk_buff *skb, struct net_device *dev,
>   	struct dst_entry *dst = skb_dst(skb);
>   	struct net_device *tdev;	/* Device to other host */
>   	int err;
> +	int mtu;
>   
>   	if (!dst) {
>   		dev->stats.tx_carrier_errors++;
> @@ -192,6 +193,23 @@ static netdev_tx_t vti_xmit(struct sk_buff *skb, struct net_device *dev,
>   			tunnel->err_count = 0;
>   	}
>   
> +	mtu = dst_mtu(dst);
> +	if (skb->len > mtu) {
> +		skb_dst(skb)->ops->update_pmtu(skb_dst(skb), NULL, skb, mtu);
> +		if (skb->protocol == htons(ETH_P_IP)) {
> +			icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
> +				  htonl(mtu));
> +		} else {
> +			if (mtu < IPV6_MIN_MTU)
> +				mtu = IPV6_MIN_MTU;
> +
> +			icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
> +		}
> +
> +		dst_release(dst);
> +		goto tx_error;
> +	}
> +
>   	skb_scrub_packet(skb, !net_eq(tunnel->net, dev_net(dev)));
>   	skb_dst_set(skb, dst);
>   	skb->dev = skb_dst(skb)->dev;

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] vti6: Add pmtu handling to vti6_xmit.
  2016-03-30 21:04                             ` Mark McKinstry
@ 2016-04-01  8:08                               ` Steffen Klassert
  0 siblings, 0 replies; 22+ messages in thread
From: Steffen Klassert @ 2016-04-01  8:08 UTC (permalink / raw)
  To: Mark McKinstry; +Cc: linux-crypto, alexander.h.duyck, herbert, davem

On Wed, Mar 30, 2016 at 09:04:03PM +0000, Mark McKinstry wrote:
> I've tested this patch in our scenario and I can confirm that it still 
> fixes all of our issues.

I've applied the patch to the ipsec tree now.
Thanks for testing!

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2016-04-01  8:08 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-05-27 17:40 [PATCH] xfrm6: Do not use xfrm_local_error for path MTU issues in tunnels Alexander Duyck
2015-05-28  4:49 ` Herbert Xu
2015-05-28  4:56   ` Steffen Klassert
2015-05-28  5:36 ` Steffen Klassert
2015-05-28  7:18   ` Alexander Duyck
2015-05-28  8:40     ` Steffen Klassert
2015-05-28 19:15       ` Alexander Duyck
2015-05-29 16:53         ` Alexander Duyck
2015-05-29 18:28         ` [PATCH] vti6: Add pmtu handling to vti6_xmit Alexander Duyck
2015-06-01 23:04           ` David Miller
2016-02-10  1:50           ` Mark McKinstry
2016-02-17  7:08             ` Steffen Klassert
2016-02-18  1:40               ` Mark McKinstry
2016-02-18 12:19                 ` Steffen Klassert
2016-02-24 21:37                   ` Mark McKinstry
2016-02-25 11:21                     ` Steffen Klassert
2016-03-04  7:05                     ` Steffen Klassert
2016-03-14 21:52                       ` Mark McKinstry
2016-03-15 12:28                         ` Steffen Klassert
2016-03-22 10:53                           ` Steffen Klassert
2016-03-30 21:04                             ` Mark McKinstry
2016-04-01  8:08                               ` Steffen Klassert

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.