All of lore.kernel.org
 help / color / mirror / Atom feed
* Regression for ip6-in-ip4 IPsec tunnel in 4.14.16
@ 2018-02-07 16:38 Yves-Alexis Perez
  2018-02-07 17:05 ` Yves-Alexis Perez
  0 siblings, 1 reply; 8+ messages in thread
From: Yves-Alexis Perez @ 2018-02-07 16:38 UTC (permalink / raw)
  To: Mike Maloney
  Cc: David S. Miller, Alexey Kuznetsov, Hideaki YOSHIFUJI, netdev,
	linux-kernel, Eric Dumazet, Greg Kroah-Hartman, stable,
	debian-kernel


[-- Attachment #1.1: Type: text/plain, Size: 1022 bytes --]

Hi Mike,

I noticed a regression in 4.14.16 stable kernel (I assume it's also
present in mainline).

I'm using an IPsec setup where I tunnel all my IP trafic to a gateway.
The tunnel can use either IPv6 or IPv4 (depending on what's available
locally) and will route both IPv4 and IPv6 (my gateway will assign both
family addresses).

The tunnel doesn't use ESP directly but rather encapsulates in UDP.

Starting with 4.14.16, IPv6 traffic is broken. When trying a simple ping
to an IPv6 address I get:

ping: sendmsg: Invalid argument

I bisected 4.14.15 to 4.14.16 and got the attached bisect log, which
ends with 8278804e05f6bcfe3fdfea4a404020752ead15a6 as the offending
commit. The -EINVAL is consistent with the “Invalid argument” return
from ping. I didn't try yet on a pure IPv6 setup (without IPsec
tunneling) but will followup when I have a chance to test it.

Reverting that commit on top of 4.14.17 fixes the problem.

If you need more info, please ask.

Regards,
-- 
Yves-Alexis

[-- Attachment #1.2: ip6-bisect.log --]
[-- Type: text/x-log, Size: 1308 bytes --]

# bad: [6c70076667f246dc200c7a3e9aeabd2f8f388416] Linux 4.14.16
# good: [a16134b082346b7e7c34f594a0763eafacdcea92] Linux 4.14.15
git bisect start 'v4.14.16' 'v4.14.15'
# bad: [72d4f3abd6d3521f5cd978b63f9301051f127812] r8169: fix memory corruption on retrieval of hardware statistics.
git bisect bad 72d4f3abd6d3521f5cd978b63f9301051f127812
# good: [295bcfbbcf5a741e9103605c3252276ed21433bb] ARM: net: bpf: correct stack layout documentation
git bisect good 295bcfbbcf5a741e9103605c3252276ed21433bb
# bad: [8278804e05f6bcfe3fdfea4a404020752ead15a6] ipv6: fix udpv6 sendmsg crash caused by too small MTU
git bisect bad 8278804e05f6bcfe3fdfea4a404020752ead15a6
# good: [9ad970c8a13595e38d3af98114bcbbec6d8a5be4] drm/vc4: Fix NULL pointer dereference in vc4_save_hang_state()
git bisect good 9ad970c8a13595e38d3af98114bcbbec6d8a5be4
# good: [42d68bf2a42381642ea5ae460c6a5d86a56213f0] ipv4: Make neigh lookup keys for loopback/point-to-point devices be INADDR_ANY
git bisect good 42d68bf2a42381642ea5ae460c6a5d86a56213f0
# good: [2295b3dd543f9a5a1834e4265d506a5bc0819983] ipv6: Fix getsockopt() for sockets with default IPV6_AUTOFLOWLABEL
git bisect good 2295b3dd543f9a5a1834e4265d506a5bc0819983
# first bad commit: [8278804e05f6bcfe3fdfea4a404020752ead15a6] ipv6: fix udpv6 sendmsg crash caused by too small MTU

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Regression for ip6-in-ip4 IPsec tunnel in 4.14.16
  2018-02-07 16:38 Regression for ip6-in-ip4 IPsec tunnel in 4.14.16 Yves-Alexis Perez
@ 2018-02-07 17:05 ` Yves-Alexis Perez
  2018-02-07 17:23   ` Yves-Alexis Perez
  0 siblings, 1 reply; 8+ messages in thread
From: Yves-Alexis Perez @ 2018-02-07 17:05 UTC (permalink / raw)
  To: Mike Maloney
  Cc: David S. Miller, Alexey Kuznetsov, Hideaki YOSHIFUJI, netdev,
	linux-kernel, Eric Dumazet, Greg Kroah-Hartman, stable,
	debian-kernel, Tobias Brunner

[-- Attachment #1: Type: text/plain, Size: 772 bytes --]

On Wed, 2018-02-07 at 17:38 +0100, Yves-Alexis Perez wrote:
> Starting with 4.14.16, IPv6 traffic is broken. When trying a simple ping
> to an IPv6 address I get:
> 
> ping: sendmsg: Invalid argument

I forgot an important bit of information: due to the way routers usually broke
path MTU discovery by filtering ICMP, I'm lowering the MTU to 1280 (so exactly
  IPV6_MIN_MTU) when using IPsec.

The MTU is configured by the IKE daemon (strongSwan, thus adding Tobias to
CC:) in routing table 220:

default via 192.168.28.254 dev eth0 proto static src 192.168.0.129 mtu 1280 advmss 1320

I'll try to printk the mtu before returning EINVAL to see why it's lower than
1280, but maybe the IP encapsulation is not correctly handled?

Regards,
-- 
Yves-Alexis

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Regression for ip6-in-ip4 IPsec tunnel in 4.14.16
  2018-02-07 17:05 ` Yves-Alexis Perez
@ 2018-02-07 17:23   ` Yves-Alexis Perez
  2018-02-07 18:50       ` Mike Maloney
  0 siblings, 1 reply; 8+ messages in thread
From: Yves-Alexis Perez @ 2018-02-07 17:23 UTC (permalink / raw)
  To: Mike Maloney
  Cc: David S. Miller, Alexey Kuznetsov, Hideaki YOSHIFUJI, netdev,
	linux-kernel, Eric Dumazet, Greg Kroah-Hartman, stable,
	debian-kernel, Tobias Brunner

[-- Attachment #1: Type: text/plain, Size: 1020 bytes --]

On Wed, 2018-02-07 at 18:05 +0100, Yves-Alexis Perez wrote:
> I'll try to printk the mtu before returning EINVAL to see why it's lower than
> 1280, but maybe the IP encapsulation is not correctly handled?

I did:

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 3763dc01e374..d3c651158d35 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1215,7 +1215,7 @@ static int ip6_setup_cork(struct sock *sk, struct inet_cork_full *cork,
                        mtu = np->frag_size;
        }
        if (mtu < IPV6_MIN_MTU)
-               return -EINVAL;
+               printk("mtu: %d\n", mtu);
        cork->base.fragsize = mtu;
        if (dst_allfrag(rt->dst.path))
                cork->base.flags |= IPCORK_ALLFRAG;

and I get:

févr. 07 18:19:50 scapa kernel: mtu: 1218

and it doesn't depend on the original packet size (same thing happens with
ping -s 100). It also happens with UDP (DNS) traffic, but apparently not with
TCP.

Regards,
-- 
Yves-Alexis

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: Regression for ip6-in-ip4 IPsec tunnel in 4.14.16
  2018-02-07 17:23   ` Yves-Alexis Perez
@ 2018-02-07 18:50       ` Mike Maloney
  0 siblings, 0 replies; 8+ messages in thread
From: Mike Maloney @ 2018-02-07 18:50 UTC (permalink / raw)
  To: Yves-Alexis Perez
  Cc: Mike Maloney, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, netdev, linux-kernel, Eric Dumazet,
	Greg Kroah-Hartman, stable, debian-kernel, Tobias Brunner

On Wed, Feb 7, 2018 at 12:23 PM, Yves-Alexis Perez <corsac@debian.org> wrote:
> On Wed, 2018-02-07 at 18:05 +0100, Yves-Alexis Perez wrote:
>> I'll try to printk the mtu before returning EINVAL to see why it's lower than
>> 1280, but maybe the IP encapsulation is not correctly handled?
>
> I did:
>
> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> index 3763dc01e374..d3c651158d35 100644
> --- a/net/ipv6/ip6_output.c
> +++ b/net/ipv6/ip6_output.c
> @@ -1215,7 +1215,7 @@ static int ip6_setup_cork(struct sock *sk, struct inet_cork_full *cork,
>                         mtu = np->frag_size;
>         }
>         if (mtu < IPV6_MIN_MTU)
> -               return -EINVAL;
> +               printk("mtu: %d\n", mtu);
>         cork->base.fragsize = mtu;
>         if (dst_allfrag(rt->dst.path))
>                 cork->base.flags |= IPCORK_ALLFRAG;
>
> and I get:
>
> févr. 07 18:19:50 scapa kernel: mtu: 1218
>
> and it doesn't depend on the original packet size (same thing happens with
> ping -s 100). It also happens with UDP (DNS) traffic, but apparently not with
> TCP.
>
> Regards,
> --
> Yves-Alexis

Hi Yves-Alexis -

I apologize for the problem.  It seems to me that tunneling with an
outer MTU that causes the inner MTU to be smaller than the min, is
potentially problematic in other ways as well.

But also it could seem unfortunate that the code with my fix does not
look at actual packet size, but instead only looks at the MTU and then
fails, even if no packet was going to be so large.  The intention of
my patch was to prevent a negative number while calculating the
maxfraglen in  __ip6_append_data().  An alternative fix maybe to
instead return an error only if the mtu is less than or equal to the
fragheaderlen.   Something like:

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 3763dc01e374..5d912a289b95 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1214,8 +1214,6 @@ static int ip6_setup_cork(struct sock *sk,
struct inet_cork_full *cork,
                if (np->frag_size)
                        mtu = np->frag_size;
        }
-       if (mtu < IPV6_MIN_MTU)
-               return -EINVAL;
        cork->base.fragsize = mtu;
        if (dst_allfrag(rt->dst.path))
                cork->base.flags |= IPCORK_ALLFRAG;
@@ -1264,6 +1262,8 @@ static int __ip6_append_data(struct sock *sk,

        fragheaderlen = sizeof(struct ipv6hdr) + rt->rt6i_nfheader_len +
                        (opt ? opt->opt_nflen : 0);
+       if (mtu < fragheaderlen + 8)
+               return -EINVAL;
        maxfraglen = ((mtu - fragheaderlen) & ~7) + fragheaderlen -
                     sizeof(struct frag_hdr);
                        (opt ? opt->opt_nflen : 0);

But then we also have to convince ourselves that maxfraglen can never
be <= 0.  I'd have to think about that.

I am not sure if others have thoughts on supporting MTUs configured
below the min in the spec.


Thanks.
--
Mike Maloney

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: Regression for ip6-in-ip4 IPsec tunnel in 4.14.16
@ 2018-02-07 18:50       ` Mike Maloney
  0 siblings, 0 replies; 8+ messages in thread
From: Mike Maloney @ 2018-02-07 18:50 UTC (permalink / raw)
  To: Yves-Alexis Perez
  Cc: Mike Maloney, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, netdev, linux-kernel, Eric Dumazet,
	Greg Kroah-Hartman, stable, debian-kernel, Tobias Brunner

On Wed, Feb 7, 2018 at 12:23 PM, Yves-Alexis Perez <corsac@debian.org> wrote:
> On Wed, 2018-02-07 at 18:05 +0100, Yves-Alexis Perez wrote:
>> I'll try to printk the mtu before returning EINVAL to see why it's lower than
>> 1280, but maybe the IP encapsulation is not correctly handled?
>
> I did:
>
> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> index 3763dc01e374..d3c651158d35 100644
> --- a/net/ipv6/ip6_output.c
> +++ b/net/ipv6/ip6_output.c
> @@ -1215,7 +1215,7 @@ static int ip6_setup_cork(struct sock *sk, struct inet_cork_full *cork,
>                         mtu = np->frag_size;
>         }
>         if (mtu < IPV6_MIN_MTU)
> -               return -EINVAL;
> +               printk("mtu: %d\n", mtu);
>         cork->base.fragsize = mtu;
>         if (dst_allfrag(rt->dst.path))
>                 cork->base.flags |= IPCORK_ALLFRAG;
>
> and I get:
>
> févr. 07 18:19:50 scapa kernel: mtu: 1218
>
> and it doesn't depend on the original packet size (same thing happens with
> ping -s 100). It also happens with UDP (DNS) traffic, but apparently not with
> TCP.
>
> Regards,
> --
> Yves-Alexis

Hi Yves-Alexis -

I apologize for the problem.  It seems to me that tunneling with an
outer MTU that causes the inner MTU to be smaller than the min, is
potentially problematic in other ways as well.

But also it could seem unfortunate that the code with my fix does not
look at actual packet size, but instead only looks at the MTU and then
fails, even if no packet was going to be so large.  The intention of
my patch was to prevent a negative number while calculating the
maxfraglen in  __ip6_append_data().  An alternative fix maybe to
instead return an error only if the mtu is less than or equal to the
fragheaderlen.   Something like:

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 3763dc01e374..5d912a289b95 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1214,8 +1214,6 @@ static int ip6_setup_cork(struct sock *sk,
struct inet_cork_full *cork,
                if (np->frag_size)
                        mtu = np->frag_size;
        }
-       if (mtu < IPV6_MIN_MTU)
-               return -EINVAL;
        cork->base.fragsize = mtu;
        if (dst_allfrag(rt->dst.path))
                cork->base.flags |= IPCORK_ALLFRAG;
@@ -1264,6 +1262,8 @@ static int __ip6_append_data(struct sock *sk,

        fragheaderlen = sizeof(struct ipv6hdr) + rt->rt6i_nfheader_len +
                        (opt ? opt->opt_nflen : 0);
+       if (mtu < fragheaderlen + 8)
+               return -EINVAL;
        maxfraglen = ((mtu - fragheaderlen) & ~7) + fragheaderlen -
                     sizeof(struct frag_hdr);
                        (opt ? opt->opt_nflen : 0);

But then we also have to convince ourselves that maxfraglen can never
be <= 0.  I'd have to think about that.

I am not sure if others have thoughts on supporting MTUs configured
below the min in the spec.


Thanks.

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: Regression for ip6-in-ip4 IPsec tunnel in 4.14.16
  2018-02-07 18:50       ` Mike Maloney
  (?)
@ 2018-02-07 19:46       ` Yves-Alexis Perez
  2018-02-08 12:40           ` Yves-Alexis Perez
  -1 siblings, 1 reply; 8+ messages in thread
From: Yves-Alexis Perez @ 2018-02-07 19:46 UTC (permalink / raw)
  To: Mike Maloney
  Cc: Mike Maloney, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, netdev, linux-kernel, Eric Dumazet,
	Greg Kroah-Hartman, stable, debian-kernel, Tobias Brunner

[-- Attachment #1: Type: text/plain, Size: 2372 bytes --]

On Wed, 2018-02-07 at 13:50 -0500, Mike Maloney wrote:
> On Wed, Feb 7, 2018 at 12:23 PM, Yves-Alexis Perez <corsac@debian.org>
> 
> Hi Yves-Alexis -
> 
> I apologize for the problem.  It seems to me that tunneling with an
> outer MTU that causes the inner MTU to be smaller than the min, is
> potentially problematic in other ways as well.

Maybe. I tried with removing the MTU setting, and I get (on ping again)

févr. 07 20:44:01 scapa kernel: mtu: 1266

which means I would get -EINVAL on standards kernels, which is not really good
either.

> But also it could seem unfortunate that the code with my fix does not
> look at actual packet size, but instead only looks at the MTU and then
> fails, even if no packet was going to be so large.  The intention of
> my patch was to prevent a negative number while calculating the
> maxfraglen in  __ip6_append_data().  An alternative fix maybe to
> instead return an error only if the mtu is less than or equal to the
> fragheaderlen.   Something like:
> 
> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> index 3763dc01e374..5d912a289b95 100644
> --- a/net/ipv6/ip6_output.c
> +++ b/net/ipv6/ip6_output.c
> @@ -1214,8 +1214,6 @@ static int ip6_setup_cork(struct sock *sk,
> struct inet_cork_full *cork,
>                 if (np->frag_size)
>                         mtu = np->frag_size;
>         }
> -       if (mtu < IPV6_MIN_MTU)
> -               return -EINVAL;
>         cork->base.fragsize = mtu;
>         if (dst_allfrag(rt->dst.path))
>                 cork->base.flags |= IPCORK_ALLFRAG;
> @@ -1264,6 +1262,8 @@ static int __ip6_append_data(struct sock *sk,
> 
>         fragheaderlen = sizeof(struct ipv6hdr) + rt->rt6i_nfheader_len +
>                         (opt ? opt->opt_nflen : 0);
> +       if (mtu < fragheaderlen + 8)
> +               return -EINVAL;
>         maxfraglen = ((mtu - fragheaderlen) & ~7) + fragheaderlen -
>                      sizeof(struct frag_hdr);
>                         (opt ? opt->opt_nflen : 0);
> 
> But then we also have to convince ourselves that maxfraglen can never
> be <= 0.  I'd have to think about that.
> 
> I am not sure if others have thoughts on supporting MTUs configured
> below the min in the spec.
> 
Here, the MTU is not below, so I'm not sure what happens.

Regards,
-- 
Yves-Alexis

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Regression for ip6-in-ip4 IPsec tunnel in 4.14.16
  2018-02-07 19:46       ` Yves-Alexis Perez
@ 2018-02-08 12:40           ` Yves-Alexis Perez
  0 siblings, 0 replies; 8+ messages in thread
From: Yves-Alexis Perez @ 2018-02-08 12:40 UTC (permalink / raw)
  To: Mike Maloney
  Cc: Mike Maloney, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, netdev, linux-kernel, Eric Dumazet,
	Greg Kroah-Hartman, stable, debian-kernel, Tobias Brunner

[-- Attachment #1: Type: text/plain, Size: 779 bytes --]

On Wed, 2018-02-07 at 20:46 +0100, Yves-Alexis Perez wrote:
> Maybe. I tried with removing the MTU setting, and I get (on ping again)
> 
> févr. 07 20:44:01 scapa kernel: mtu: 1266
> 
> which means I would get -EINVAL on standards kernels, which is not really good
> either.

Actually after rebooting on the Debian 4.14.17 kernel, with the outter MTU
unset (or set to 1360), I don't get the -EINVAL anymore, so maybe it'd be OK.

Unfortunately I have the feeling that debugging this will be a bit tricky for
other people. MTU tuning for tunnels are always a bit confusing, but having
the kernel return EINVAL on a ping doesn't really help narrowing it down to
that MTU setting. Not sure if some kind of logging could be added to help?

Regards,
-- 
Yves-Alexis

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: Regression for ip6-in-ip4 IPsec tunnel in 4.14.16
@ 2018-02-08 12:40           ` Yves-Alexis Perez
  0 siblings, 0 replies; 8+ messages in thread
From: Yves-Alexis Perez @ 2018-02-08 12:40 UTC (permalink / raw)
  To: Mike Maloney
  Cc: Mike Maloney, David S. Miller, Alexey Kuznetsov,
	Hideaki YOSHIFUJI, netdev, linux-kernel, Eric Dumazet,
	Greg Kroah-Hartman, stable, debian-kernel, Tobias Brunner

[-- Attachment #1: Type: text/plain, Size: 779 bytes --]

On Wed, 2018-02-07 at 20:46 +0100, Yves-Alexis Perez wrote:
> Maybe. I tried with removing the MTU setting, and I get (on ping again)
> 
> févr. 07 20:44:01 scapa kernel: mtu: 1266
> 
> which means I would get -EINVAL on standards kernels, which is not really good
> either.

Actually after rebooting on the Debian 4.14.17 kernel, with the outter MTU
unset (or set to 1360), I don't get the -EINVAL anymore, so maybe it'd be OK.

Unfortunately I have the feeling that debugging this will be a bit tricky for
other people. MTU tuning for tunnels are always a bit confusing, but having
the kernel return EINVAL on a ping doesn't really help narrowing it down to
that MTU setting. Not sure if some kind of logging could be added to help?

Regards,
-- 
Yves-Alexis

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2018-02-08 12:40 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-02-07 16:38 Regression for ip6-in-ip4 IPsec tunnel in 4.14.16 Yves-Alexis Perez
2018-02-07 17:05 ` Yves-Alexis Perez
2018-02-07 17:23   ` Yves-Alexis Perez
2018-02-07 18:50     ` Mike Maloney
2018-02-07 18:50       ` Mike Maloney
2018-02-07 19:46       ` Yves-Alexis Perez
2018-02-08 12:40         ` Yves-Alexis Perez
2018-02-08 12:40           ` Yves-Alexis Perez

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.