All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH net] af_packet: fix raw sockets over 6in4 tunnel
@ 2019-01-11 14:43 Nicolas Dichtel
  2019-01-11 21:29 ` Willem de Bruijn
  0 siblings, 1 reply; 17+ messages in thread
From: Nicolas Dichtel @ 2019-01-11 14:43 UTC (permalink / raw)
  To: davem; +Cc: netdev, Nicolas Dichtel, Willem de Bruijn

Since commit cb9f1b783850, scapy (which uses an AF_PACKET socket in
SOCK_RAW mode) is unable to send a basic icmp packet over a sit tunnel:

Here is a example of the setup:
$ ip link set ntfp2 up
$ ip addr add 10.125.0.1/24 dev ntfp2
$ ip tunnel add tun1 mode sit ttl 64 local 10.125.0.1 remote 10.125.0.2 dev ntfp2
$ ip addr add fd00:cafe:cafe::1/128 dev tun1
$ ip link set dev tun1 up
$ ip route add fd00:200::/64 dev tun1
$ scapy
>>> p = []
>>> p += IPv6(src='fd00:100::1', dst='fd00:200::1')/ICMPv6EchoRequest()
>>> send(p, count=1, inter=0.1)
>>> quit()
$ ip -s link ls dev tun1 | grep -A1 "TX.*errors"
    TX: bytes  packets  errors  dropped carrier collsns
    0          0        1       0       0       0

The problem is that the network offset is set to the hard_header_len of the
output device (tun1, ie 14 + 20) and in our case, because the packet is
small (48 bytes) the pskb_inet_may_pull() fails (it tries to pull 40 bytes
(ipv6 header) starting from the network offset). Let's reset the network
offset so that it points to the beginning of the L3 data, ie skb->data.

Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=993675a3100b1
Fixes: cb9f1b783850 ("ip: validate header length on virtual device xmit")
CC: Willem de Bruijn <willemb@google.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
---
 net/packet/af_packet.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index d0945253f43b..911c3870e66d 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2887,8 +2887,7 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
 			goto out_free;
 	} else if (reserve) {
 		skb_reserve(skb, -reserve);
-		if (len < reserve)
-			skb_reset_network_header(skb);
+		skb_reset_network_header(skb);
 	}
 
 	/* Returns -EFAULT on error */
-- 
2.18.0

^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH net] af_packet: fix raw sockets over 6in4 tunnel
  2019-01-11 14:43 [PATCH net] af_packet: fix raw sockets over 6in4 tunnel Nicolas Dichtel
@ 2019-01-11 21:29 ` Willem de Bruijn
  2019-01-12 18:00   ` Willem de Bruijn
  0 siblings, 1 reply; 17+ messages in thread
From: Willem de Bruijn @ 2019-01-11 21:29 UTC (permalink / raw)
  To: Nicolas Dichtel; +Cc: David Miller, Network Development

On Fri, Jan 11, 2019 at 9:44 AM Nicolas Dichtel
<nicolas.dichtel@6wind.com> wrote:
>
> Since commit cb9f1b783850, scapy (which uses an AF_PACKET socket in
> SOCK_RAW mode) is unable to send a basic icmp packet over a sit tunnel:
>
> Here is a example of the setup:
> $ ip link set ntfp2 up
> $ ip addr add 10.125.0.1/24 dev ntfp2
> $ ip tunnel add tun1 mode sit ttl 64 local 10.125.0.1 remote 10.125.0.2 dev ntfp2
> $ ip addr add fd00:cafe:cafe::1/128 dev tun1
> $ ip link set dev tun1 up
> $ ip route add fd00:200::/64 dev tun1
> $ scapy
> >>> p = []
> >>> p += IPv6(src='fd00:100::1', dst='fd00:200::1')/ICMPv6EchoRequest()
> >>> send(p, count=1, inter=0.1)
> >>> quit()
> $ ip -s link ls dev tun1 | grep -A1 "TX.*errors"
>     TX: bytes  packets  errors  dropped carrier collsns
>     0          0        1       0       0       0
>
> The problem is that the network offset is set to the hard_header_len of the
> output device (tun1, ie 14 + 20) and in our case, because the packet is
> small (48 bytes) the pskb_inet_may_pull() fails (it tries to pull 40 bytes
> (ipv6 header) starting from the network offset). Let's reset the network
> offset so that it points to the beginning of the L3 data, ie skb->data.

This only just landed in my inbox, sorry for the delay. So far I'm not
reproducing the issue, but I'm trying and am having a look.

Which pskb_inet_may_pull() is failing?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net] af_packet: fix raw sockets over 6in4 tunnel
  2019-01-11 21:29 ` Willem de Bruijn
@ 2019-01-12 18:00   ` Willem de Bruijn
  2019-01-14 14:51     ` Nicolas Dichtel
  0 siblings, 1 reply; 17+ messages in thread
From: Willem de Bruijn @ 2019-01-12 18:00 UTC (permalink / raw)
  To: Nicolas Dichtel; +Cc: David Miller, Network Development

On Fri, Jan 11, 2019 at 4:29 PM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> On Fri, Jan 11, 2019 at 9:44 AM Nicolas Dichtel
> <nicolas.dichtel@6wind.com> wrote:
> >
> > Since commit cb9f1b783850, scapy (which uses an AF_PACKET socket in
> > SOCK_RAW mode) is unable to send a basic icmp packet over a sit tunnel:
> >
> > Here is a example of the setup:
> > $ ip link set ntfp2 up
> > $ ip addr add 10.125.0.1/24 dev ntfp2
> > $ ip tunnel add tun1 mode sit ttl 64 local 10.125.0.1 remote 10.125.0.2 dev ntfp2
> > $ ip addr add fd00:cafe:cafe::1/128 dev tun1
> > $ ip link set dev tun1 up
> > $ ip route add fd00:200::/64 dev tun1
> > $ scapy
> > >>> p = []
> > >>> p += IPv6(src='fd00:100::1', dst='fd00:200::1')/ICMPv6EchoRequest()
> > >>> send(p, count=1, inter=0.1)
> > >>> quit()
> > $ ip -s link ls dev tun1 | grep -A1 "TX.*errors"
> >     TX: bytes  packets  errors  dropped carrier collsns
> >     0          0        1       0       0       0
> >
> > The problem is that the network offset is set to the hard_header_len of the
> > output device (tun1, ie 14 + 20) and in our case, because the packet is
> > small (48 bytes) the pskb_inet_may_pull() fails (it tries to pull 40 bytes
> > (ipv6 header) starting from the network offset). Let's reset the network
> > offset so that it points to the beginning of the L3 data, ie skb->data.
>
> This only just landed in my inbox, sorry for the delay. So far I'm not
> reproducing the issue, but I'm trying and am having a look.
>
> Which pskb_inet_may_pull() is failing?

Okay. I see what's going on.

Commit cb9f1b783850 converted a pskb_may_pull ipv6hdr, pulling from
skb->data, to pskb_network_may_pull, pulling from skb->network_header.

Normally packet sockets with SOCK_RAW write a fixed size hardware
header and place skb->network_header directly after that. Dropping
these in sit_tunnel_xmit if they fail pskb_inet_may_pull is correct.

But with devices with variable length hardware headers like sit, it is
possible to cook packets that are shorter than the upper bound hhlen.
In that case, we have no way of knowing where the variable length
header ends (short of device specific parsing if implemented, see also
header_ops.validate), so we set the network header to the same as the
mac header as of commit 993675a3100b1.

This is not a good thing to do in general, and not needed in the
common case.

The issue here is that the icmpv6 packet (48B) exceeds the sit hhlen
(34), so is not caught by that workaround. We need to extend it to
cases where the variable length hhlen is less than its upper bound
in such a way that the packet as a whole exceeds hhlen, but by so
little that the L3 header still starts before dev->hard_header_len.

The simplest, to handle ip validation, is to extend the check from

  if (len < reserve)
    skb_reset_network_header(skb);

to one that accounts for requiring a fully formed protocol header

        } else if (reserve) {
+               int l3_min_len = reserve;
+
                skb_reserve(skb, -reserve);
-               if (len < reserve)
+
+               if (proto == htons(ETH_P_IPV6))
+                       l3_min_len += sizeof(struct ipv6hdr);
+               else if (proto == htons(ETH_P_IP))
+                       l3_min_len += sizeof(struct iphdr);
+
+               if (len < l3_min_len)
                        skb_reset_network_header(skb);
        }

Instead of special casing protocols, it might be cleaner to convert
dev_validate_header to return instead of a pass/fail bool the hardware
header length. And then update skb->network_header accordingly. That
would definitely only be for net-next.

For completeness, my variant of your test on top of v5.0-rc1

"
ip netns add test
ip netns exec test bash

ip link set lo up
ip addr add 10.0.0.1/24 dev lo
ip tunnel add tun1 mode sit ttl 64 local 10.0.0.1 remote 10.0.0.2 dev lo
ip addr add fd00:cafe:cafe::1/128 dev tun1
ip link set dev tun1 up
ip route add fd00:200::/64 dev tun1

tcpdump -vvv -i tun1 -n &
sleep 0.4






scapy
p = []
p += IPv6(src='fd00:100::1', dst='fd00:200::1')/ICMPv6EchoRequest()
send(p, count=1, inter=0.1)
quit()
"

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net] af_packet: fix raw sockets over 6in4 tunnel
  2019-01-12 18:00   ` Willem de Bruijn
@ 2019-01-14 14:51     ` Nicolas Dichtel
  2019-01-14 15:15       ` Willem de Bruijn
  0 siblings, 1 reply; 17+ messages in thread
From: Nicolas Dichtel @ 2019-01-14 14:51 UTC (permalink / raw)
  To: Willem de Bruijn; +Cc: David Miller, Network Development

Le 12/01/2019 à 19:00, Willem de Bruijn a écrit :
> On Fri, Jan 11, 2019 at 4:29 PM Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
>>
>> On Fri, Jan 11, 2019 at 9:44 AM Nicolas Dichtel
>> <nicolas.dichtel@6wind.com> wrote:
>>>
>>> Since commit cb9f1b783850, scapy (which uses an AF_PACKET socket in
>>> SOCK_RAW mode) is unable to send a basic icmp packet over a sit tunnel:
>>>
>>> Here is a example of the setup:
>>> $ ip link set ntfp2 up
>>> $ ip addr add 10.125.0.1/24 dev ntfp2
>>> $ ip tunnel add tun1 mode sit ttl 64 local 10.125.0.1 remote 10.125.0.2 dev ntfp2
>>> $ ip addr add fd00:cafe:cafe::1/128 dev tun1
>>> $ ip link set dev tun1 up
>>> $ ip route add fd00:200::/64 dev tun1
>>> $ scapy
>>>>>> p = []
>>>>>> p += IPv6(src='fd00:100::1', dst='fd00:200::1')/ICMPv6EchoRequest()
>>>>>> send(p, count=1, inter=0.1)
>>>>>> quit()
>>> $ ip -s link ls dev tun1 | grep -A1 "TX.*errors"
>>>     TX: bytes  packets  errors  dropped carrier collsns
>>>     0          0        1       0       0       0
>>>
>>> The problem is that the network offset is set to the hard_header_len of the
>>> output device (tun1, ie 14 + 20) and in our case, because the packet is
>>> small (48 bytes) the pskb_inet_may_pull() fails (it tries to pull 40 bytes
>>> (ipv6 header) starting from the network offset). Let's reset the network
>>> offset so that it points to the beginning of the L3 data, ie skb->data.
>>
>> This only just landed in my inbox, sorry for the delay. So far I'm not
>> reproducing the issue, but I'm trying and am having a look.
>>
>> Which pskb_inet_may_pull() is failing?
> 
> Okay. I see what's going on.
> 
> Commit cb9f1b783850 converted a pskb_may_pull ipv6hdr, pulling from
> skb->data, to pskb_network_may_pull, pulling from skb->network_header.
> 
> Normally packet sockets with SOCK_RAW write a fixed size hardware
> header and place skb->network_header directly after that. Dropping
> these in sit_tunnel_xmit if they fail pskb_inet_may_pull is correct.
> 
> But with devices with variable length hardware headers like sit, it is
> possible to cook packets that are shorter than the upper bound hhlen.
> In that case, we have no way of knowing where the variable length
> header ends (short of device specific parsing if implemented, see also
> header_ops.validate), so we set the network header to the same as the
> mac header as of commit 993675a3100b1.
> 
> This is not a good thing to do in general, and not needed in the
> common case.
Sorry, but I fail to understand what we try to achieve here. In this case, the
packet socket builds a packet without a hard header:

<-------48B--------->
<--40B--->
+--------+----------+
|IPv6 hdr|ICMPv6 hdr|
+--------+----------+
^ skb->data
      ^ nh offset: 34 (ie dev->hard_header_len, ie ethernet hdr + ipv4 hdr)

The nh offset does not match any data in the packet.

dev_validate_header() checks that datalen (ie 48) is >= dev->hard_header_len (ie
34). Still no hard header here.

Then, dev->xmit (ie sit_tunnel_xmit()) is called. This function calls
pskb_inet_may_pull() which tries to pull 40B from the nh offset (34). But the nh
offset still points somewhere in the ipv6 hdr, thus the test fails (34 + 40 > 48).
I don't understand why it is wrong to set the nh offset to 0 (skb->data), ie
exactly where the ipv6 header starts.


Regards,
Nicolas

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net] af_packet: fix raw sockets over 6in4 tunnel
  2019-01-14 14:51     ` Nicolas Dichtel
@ 2019-01-14 15:15       ` Willem de Bruijn
  2019-01-14 15:38         ` Willem de Bruijn
  0 siblings, 1 reply; 17+ messages in thread
From: Willem de Bruijn @ 2019-01-14 15:15 UTC (permalink / raw)
  To: Nicolas Dichtel; +Cc: David Miller, Network Development

On Mon, Jan 14, 2019 at 9:51 AM Nicolas Dichtel
<nicolas.dichtel@6wind.com> wrote:
>
> Le 12/01/2019 à 19:00, Willem de Bruijn a écrit :
> > On Fri, Jan 11, 2019 at 4:29 PM Willem de Bruijn
> > <willemdebruijn.kernel@gmail.com> wrote:
> >>
> >> On Fri, Jan 11, 2019 at 9:44 AM Nicolas Dichtel
> >> <nicolas.dichtel@6wind.com> wrote:
> >>>
> >>> Since commit cb9f1b783850, scapy (which uses an AF_PACKET socket in
> >>> SOCK_RAW mode) is unable to send a basic icmp packet over a sit tunnel:
> >>>
> >>> Here is a example of the setup:
> >>> $ ip link set ntfp2 up
> >>> $ ip addr add 10.125.0.1/24 dev ntfp2
> >>> $ ip tunnel add tun1 mode sit ttl 64 local 10.125.0.1 remote 10.125.0.2 dev ntfp2
> >>> $ ip addr add fd00:cafe:cafe::1/128 dev tun1
> >>> $ ip link set dev tun1 up
> >>> $ ip route add fd00:200::/64 dev tun1
> >>> $ scapy
> >>>>>> p = []
> >>>>>> p += IPv6(src='fd00:100::1', dst='fd00:200::1')/ICMPv6EchoRequest()
> >>>>>> send(p, count=1, inter=0.1)
> >>>>>> quit()
> >>> $ ip -s link ls dev tun1 | grep -A1 "TX.*errors"
> >>>     TX: bytes  packets  errors  dropped carrier collsns
> >>>     0          0        1       0       0       0
> >>>
> >>> The problem is that the network offset is set to the hard_header_len of the
> >>> output device (tun1, ie 14 + 20) and in our case, because the packet is
> >>> small (48 bytes) the pskb_inet_may_pull() fails (it tries to pull 40 bytes
> >>> (ipv6 header) starting from the network offset). Let's reset the network
> >>> offset so that it points to the beginning of the L3 data, ie skb->data.
> >>
> >> This only just landed in my inbox, sorry for the delay. So far I'm not
> >> reproducing the issue, but I'm trying and am having a look.
> >>
> >> Which pskb_inet_may_pull() is failing?
> >
> > Okay. I see what's going on.
> >
> > Commit cb9f1b783850 converted a pskb_may_pull ipv6hdr, pulling from
> > skb->data, to pskb_network_may_pull, pulling from skb->network_header.
> >
> > Normally packet sockets with SOCK_RAW write a fixed size hardware
> > header and place skb->network_header directly after that. Dropping
> > these in sit_tunnel_xmit if they fail pskb_inet_may_pull is correct.
> >
> > But with devices with variable length hardware headers like sit, it is
> > possible to cook packets that are shorter than the upper bound hhlen.
> > In that case, we have no way of knowing where the variable length
> > header ends (short of device specific parsing if implemented, see also
> > header_ops.validate), so we set the network header to the same as the
> > mac header as of commit 993675a3100b1.
> >
> > This is not a good thing to do in general, and not needed in the
> > common case.
> Sorry, but I fail to understand what we try to achieve here. In this case, the
> packet socket builds a packet without a hard header:
>
> <-------48B--------->
> <--40B--->
> +--------+----------+
> |IPv6 hdr|ICMPv6 hdr|
> +--------+----------+
> ^ skb->data
>       ^ nh offset: 34 (ie dev->hard_header_len, ie ethernet hdr + ipv4 hdr)
>
> The nh offset does not match any data in the packet.
>
> dev_validate_header() checks that datalen (ie 48) is >= dev->hard_header_len (ie
> 34). Still no hard header here.
>
> Then, dev->xmit (ie sit_tunnel_xmit()) is called. This function calls
> pskb_inet_may_pull() which tries to pull 40B from the nh offset (34). But the nh
> offset still points somewhere in the ipv6 hdr, thus the test fails (34 + 40 > 48).
> I don't understand why it is wrong to set the nh offset to 0 (skb->data), ie
> exactly where the ipv6 header starts.

It is wrong because for other devices l2 header length is not zero, so
this incorrectly sets skb->network_header to the start of the link
layer header on all those devices.

A one line variant of the above would be

-               if (len < reserve)
+               if (len < reserve + sizeof(struct ipv6hdr))

>
> Regards,
> Nicolas

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net] af_packet: fix raw sockets over 6in4 tunnel
  2019-01-14 15:15       ` Willem de Bruijn
@ 2019-01-14 15:38         ` Willem de Bruijn
  2019-01-15  9:49           ` Nicolas Dichtel
  0 siblings, 1 reply; 17+ messages in thread
From: Willem de Bruijn @ 2019-01-14 15:38 UTC (permalink / raw)
  To: Nicolas Dichtel; +Cc: David Miller, Network Development

On Mon, Jan 14, 2019 at 10:15 AM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> On Mon, Jan 14, 2019 at 9:51 AM Nicolas Dichtel
> <nicolas.dichtel@6wind.com> wrote:
> >
> > Le 12/01/2019 à 19:00, Willem de Bruijn a écrit :
> > > On Fri, Jan 11, 2019 at 4:29 PM Willem de Bruijn
> > > <willemdebruijn.kernel@gmail.com> wrote:
> > >>
> > >> On Fri, Jan 11, 2019 at 9:44 AM Nicolas Dichtel
> > >> <nicolas.dichtel@6wind.com> wrote:
> > >>>
> > >>> Since commit cb9f1b783850, scapy (which uses an AF_PACKET socket in
> > >>> SOCK_RAW mode) is unable to send a basic icmp packet over a sit tunnel:
> > >>>
> > >>> Here is a example of the setup:
> > >>> $ ip link set ntfp2 up
> > >>> $ ip addr add 10.125.0.1/24 dev ntfp2
> > >>> $ ip tunnel add tun1 mode sit ttl 64 local 10.125.0.1 remote 10.125.0.2 dev ntfp2
> > >>> $ ip addr add fd00:cafe:cafe::1/128 dev tun1
> > >>> $ ip link set dev tun1 up
> > >>> $ ip route add fd00:200::/64 dev tun1
> > >>> $ scapy
> > >>>>>> p = []
> > >>>>>> p += IPv6(src='fd00:100::1', dst='fd00:200::1')/ICMPv6EchoRequest()
> > >>>>>> send(p, count=1, inter=0.1)
> > >>>>>> quit()
> > >>> $ ip -s link ls dev tun1 | grep -A1 "TX.*errors"
> > >>>     TX: bytes  packets  errors  dropped carrier collsns
> > >>>     0          0        1       0       0       0
> > >>>
> > >>> The problem is that the network offset is set to the hard_header_len of the
> > >>> output device (tun1, ie 14 + 20) and in our case, because the packet is
> > >>> small (48 bytes) the pskb_inet_may_pull() fails (it tries to pull 40 bytes
> > >>> (ipv6 header) starting from the network offset). Let's reset the network
> > >>> offset so that it points to the beginning of the L3 data, ie skb->data.
> > >>
> > >> This only just landed in my inbox, sorry for the delay. So far I'm not
> > >> reproducing the issue, but I'm trying and am having a look.
> > >>
> > >> Which pskb_inet_may_pull() is failing?
> > >
> > > Okay. I see what's going on.
> > >
> > > Commit cb9f1b783850 converted a pskb_may_pull ipv6hdr, pulling from
> > > skb->data, to pskb_network_may_pull, pulling from skb->network_header.
> > >
> > > Normally packet sockets with SOCK_RAW write a fixed size hardware
> > > header and place skb->network_header directly after that. Dropping
> > > these in sit_tunnel_xmit if they fail pskb_inet_may_pull is correct.
> > >
> > > But with devices with variable length hardware headers like sit, it is
> > > possible to cook packets that are shorter than the upper bound hhlen.
> > > In that case, we have no way of knowing where the variable length
> > > header ends (short of device specific parsing if implemented, see also
> > > header_ops.validate), so we set the network header to the same as the
> > > mac header as of commit 993675a3100b1.
> > >
> > > This is not a good thing to do in general, and not needed in the
> > > common case.
> > Sorry, but I fail to understand what we try to achieve here. In this case, the
> > packet socket builds a packet without a hard header:
> >
> > <-------48B--------->
> > <--40B--->
> > +--------+----------+
> > |IPv6 hdr|ICMPv6 hdr|
> > +--------+----------+
> > ^ skb->data
> >       ^ nh offset: 34 (ie dev->hard_header_len, ie ethernet hdr + ipv4 hdr)
> >
> > The nh offset does not match any data in the packet.
> >
> > dev_validate_header() checks that datalen (ie 48) is >= dev->hard_header_len (ie
> > 34). Still no hard header here.
> >
> > Then, dev->xmit (ie sit_tunnel_xmit()) is called. This function calls
> > pskb_inet_may_pull() which tries to pull 40B from the nh offset (34). But the nh
> > offset still points somewhere in the ipv6 hdr, thus the test fails (34 + 40 > 48).
> > I don't understand why it is wrong to set the nh offset to 0 (skb->data), ie
> > exactly where the ipv6 header starts.
>
> It is wrong because for other devices l2 header length is not zero, so
> this incorrectly sets skb->network_header to the start of the link
> layer header on all those devices.
>
> A one line variant of the above would be
>
> -               if (len < reserve)
> +               if (len < reserve + sizeof(struct ipv6hdr))

and exclude the majority of devices with fixed hard header len. Those
require len to be >= reserve, so this workaround does not apply to
them:

if (len < reserve + sizeof(struct ipv6hdr) &&
    dev->min_header_len != dev->hard_header_len)

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net] af_packet: fix raw sockets over 6in4 tunnel
  2019-01-14 15:38         ` Willem de Bruijn
@ 2019-01-15  9:49           ` Nicolas Dichtel
  2019-01-15 16:20             ` Willem de Bruijn
  0 siblings, 1 reply; 17+ messages in thread
From: Nicolas Dichtel @ 2019-01-15  9:49 UTC (permalink / raw)
  To: Willem de Bruijn; +Cc: David Miller, Network Development

Le 14/01/2019 à 16:38, Willem de Bruijn a écrit :
> On Mon, Jan 14, 2019 at 10:15 AM Willem de Bruijn
> <willemdebruijn.kernel@gmail.com> wrote:
[snip]
>> It is wrong because for other devices l2 header length is not zero, so
>> this incorrectly sets skb->network_header to the start of the link
>> layer header on all those devices.
Ok, thank you for the details.

>>
>> A one line variant of the above would be
>>
>> -               if (len < reserve)
>> +               if (len < reserve + sizeof(struct ipv6hdr))
> 
> and exclude the majority of devices with fixed hard header len. Those
> require len to be >= reserve, so this workaround does not apply to
> them:
> 
> if (len < reserve + sizeof(struct ipv6hdr) &&
>     dev->min_header_len != dev->hard_header_len)
> 
And what about:
- if (len < reserve)
+ if (dev->min_header_len != dev->hard_header_len)
 	skb_reset_network_header(skb);
?

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net] af_packet: fix raw sockets over 6in4 tunnel
  2019-01-15  9:49           ` Nicolas Dichtel
@ 2019-01-15 16:20             ` Willem de Bruijn
  2019-01-17 10:27               ` [PATCH net v2] " Nicolas Dichtel
  0 siblings, 1 reply; 17+ messages in thread
From: Willem de Bruijn @ 2019-01-15 16:20 UTC (permalink / raw)
  To: Nicolas Dichtel; +Cc: David Miller, Network Development

On Tue, Jan 15, 2019 at 4:49 AM Nicolas Dichtel
<nicolas.dichtel@6wind.com> wrote:
>
> Le 14/01/2019 à 16:38, Willem de Bruijn a écrit :
> > On Mon, Jan 14, 2019 at 10:15 AM Willem de Bruijn
> > <willemdebruijn.kernel@gmail.com> wrote:
> [snip]
> >> It is wrong because for other devices l2 header length is not zero, so
> >> this incorrectly sets skb->network_header to the start of the link
> >> layer header on all those devices.
> Ok, thank you for the details.
>
> >>
> >> A one line variant of the above would be
> >>
> >> -               if (len < reserve)
> >> +               if (len < reserve + sizeof(struct ipv6hdr))
> >
> > and exclude the majority of devices with fixed hard header len. Those
> > require len to be >= reserve, so this workaround does not apply to
> > them:
> >
> > if (len < reserve + sizeof(struct ipv6hdr) &&
> >     dev->min_header_len != dev->hard_header_len)
> >
> And what about:
> - if (len < reserve)
> + if (dev->min_header_len != dev->hard_header_len)
>         skb_reset_network_header(skb);
> ?

As a fix for the regression introduced by cb9f1b783850 ("ip: validate
header length on virtual device xmit"), the test I proposed is a bit
more narrow by only applying this ugly workaround in cases where that
test might have started failing.

By limiting to short packets we also avoid in the common case reading
dev->min_header_len, which may not be cached (btw, we should use
reserve instead of dev->hard_header_len for the same reason).

But the length restriction does look rather arbitrary, so there is
something to say for your suggestion to apply this uniformly to
devices with variable length.

Note also the concurrent discussion in
http://patchwork.ozlabs.org/patch/1024489/ about extending header_ops
with a hard header parser which may return the exist header length.
That is only for net-next, but it will allow us to set
skb->network_header correctly even for these devices where
(dev->min_header_len != dev->hard_header_len).

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH net v2] af_packet: fix raw sockets over 6in4 tunnel
  2019-01-15 16:20             ` Willem de Bruijn
@ 2019-01-17 10:27               ` Nicolas Dichtel
  2019-01-17 15:09                 ` Willem de Bruijn
  2019-01-17 23:55                 ` David Miller
  0 siblings, 2 replies; 17+ messages in thread
From: Nicolas Dichtel @ 2019-01-17 10:27 UTC (permalink / raw)
  To: davem; +Cc: netdev, Nicolas Dichtel, Willem de Bruijn, Maxim Mikityanskiy

Since commit cb9f1b783850, scapy (which uses an AF_PACKET socket in
SOCK_RAW mode) is unable to send a basic icmp packet over a sit tunnel:

Here is a example of the setup:
$ ip link set ntfp2 up
$ ip addr add 10.125.0.1/24 dev ntfp2
$ ip tunnel add tun1 mode sit ttl 64 local 10.125.0.1 remote 10.125.0.2 dev ntfp2
$ ip addr add fd00:cafe:cafe::1/128 dev tun1
$ ip link set dev tun1 up
$ ip route add fd00:200::/64 dev tun1
$ scapy
>>> p = []
>>> p += IPv6(src='fd00:100::1', dst='fd00:200::1')/ICMPv6EchoRequest()
>>> send(p, count=1, inter=0.1)
>>> quit()
$ ip -s link ls dev tun1 | grep -A1 "TX.*errors"
    TX: bytes  packets  errors  dropped carrier collsns
    0          0        1       0       0       0

The problem is that the network offset is set to the hard_header_len of the
output device (tun1, ie 14 + 20) and in our case, because the packet is
small (48 bytes) the pskb_inet_may_pull() fails (it tries to pull 40 bytes
(ipv6 header) starting from the network offset).

This problem is more generally related to device with variable hard header
length. To avoid a too intrusive patch in the current release, a (ugly)
workaround is proposed in this patch. It has to be cleaned up in net-next.

Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=993675a3100b1
Link: http://patchwork.ozlabs.org/patch/1024489/
Fixes: cb9f1b783850 ("ip: validate header length on virtual device xmit")
CC: Willem de Bruijn <willemb@google.com>
CC: Maxim Mikityanskiy <maximmi@mellanox.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
---

v1 -> v2:
  reset nh offset only for small packets sent on a variable hard hdr len device

 net/packet/af_packet.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index d0945253f43b..3b1a78906bc0 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -2887,7 +2887,8 @@ static int packet_snd(struct socket *sock, struct msghdr *msg, size_t len)
 			goto out_free;
 	} else if (reserve) {
 		skb_reserve(skb, -reserve);
-		if (len < reserve)
+		if (len < reserve + sizeof(struct ipv6hdr) &&
+		    dev->min_header_len != dev->hard_header_len)
 			skb_reset_network_header(skb);
 	}
 
-- 
2.18.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH net v2] af_packet: fix raw sockets over 6in4 tunnel
  2019-01-17 10:27               ` [PATCH net v2] " Nicolas Dichtel
@ 2019-01-17 15:09                 ` Willem de Bruijn
  2019-01-17 23:55                 ` David Miller
  1 sibling, 0 replies; 17+ messages in thread
From: Willem de Bruijn @ 2019-01-17 15:09 UTC (permalink / raw)
  To: Nicolas Dichtel
  Cc: David Miller, Network Development, Willem de Bruijn, Maxim Mikityanskiy

On Thu, Jan 17, 2019 at 5:29 AM Nicolas Dichtel
<nicolas.dichtel@6wind.com> wrote:
>
> Since commit cb9f1b783850, scapy (which uses an AF_PACKET socket in
> SOCK_RAW mode) is unable to send a basic icmp packet over a sit tunnel:
>
> Here is a example of the setup:
> $ ip link set ntfp2 up
> $ ip addr add 10.125.0.1/24 dev ntfp2
> $ ip tunnel add tun1 mode sit ttl 64 local 10.125.0.1 remote 10.125.0.2 dev ntfp2
> $ ip addr add fd00:cafe:cafe::1/128 dev tun1
> $ ip link set dev tun1 up
> $ ip route add fd00:200::/64 dev tun1
> $ scapy
> >>> p = []
> >>> p += IPv6(src='fd00:100::1', dst='fd00:200::1')/ICMPv6EchoRequest()
> >>> send(p, count=1, inter=0.1)
> >>> quit()
> $ ip -s link ls dev tun1 | grep -A1 "TX.*errors"
>     TX: bytes  packets  errors  dropped carrier collsns
>     0          0        1       0       0       0
>
> The problem is that the network offset is set to the hard_header_len of the
> output device (tun1, ie 14 + 20) and in our case, because the packet is
> small (48 bytes) the pskb_inet_may_pull() fails (it tries to pull 40 bytes
> (ipv6 header) starting from the network offset).
>
> This problem is more generally related to device with variable hard header
> length. To avoid a too intrusive patch in the current release, a (ugly)
> workaround is proposed in this patch. It has to be cleaned up in net-next.
>
> Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=993675a3100b1
> Link: http://patchwork.ozlabs.org/patch/1024489/
> Fixes: cb9f1b783850 ("ip: validate header length on virtual device xmit")
> CC: Willem de Bruijn <willemb@google.com>
> CC: Maxim Mikityanskiy <maximmi@mellanox.com>
> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>

Acked-by: Willem de Bruijn <willemb@google.com>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net v2] af_packet: fix raw sockets over 6in4 tunnel
  2019-01-17 10:27               ` [PATCH net v2] " Nicolas Dichtel
  2019-01-17 15:09                 ` Willem de Bruijn
@ 2019-01-17 23:55                 ` David Miller
  2019-02-18 15:57                   ` Sasha Levin
  1 sibling, 1 reply; 17+ messages in thread
From: David Miller @ 2019-01-17 23:55 UTC (permalink / raw)
  To: nicolas.dichtel; +Cc: netdev, willemb, maximmi

From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Date: Thu, 17 Jan 2019 11:27:22 +0100

> Since commit cb9f1b783850, scapy (which uses an AF_PACKET socket in
> SOCK_RAW mode) is unable to send a basic icmp packet over a sit tunnel:
> 
> Here is a example of the setup:
> $ ip link set ntfp2 up
> $ ip addr add 10.125.0.1/24 dev ntfp2
> $ ip tunnel add tun1 mode sit ttl 64 local 10.125.0.1 remote 10.125.0.2 dev ntfp2
> $ ip addr add fd00:cafe:cafe::1/128 dev tun1
> $ ip link set dev tun1 up
> $ ip route add fd00:200::/64 dev tun1
> $ scapy
>>>> p = []
>>>> p += IPv6(src='fd00:100::1', dst='fd00:200::1')/ICMPv6EchoRequest()
>>>> send(p, count=1, inter=0.1)
>>>> quit()
> $ ip -s link ls dev tun1 | grep -A1 "TX.*errors"
>     TX: bytes  packets  errors  dropped carrier collsns
>     0          0        1       0       0       0
> 
> The problem is that the network offset is set to the hard_header_len of the
> output device (tun1, ie 14 + 20) and in our case, because the packet is
> small (48 bytes) the pskb_inet_may_pull() fails (it tries to pull 40 bytes
> (ipv6 header) starting from the network offset).
> 
> This problem is more generally related to device with variable hard header
> length. To avoid a too intrusive patch in the current release, a (ugly)
> workaround is proposed in this patch. It has to be cleaned up in net-next.
> 
> Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=993675a3100b1
> Link: http://patchwork.ozlabs.org/patch/1024489/
> Fixes: cb9f1b783850 ("ip: validate header length on virtual device xmit")
> CC: Willem de Bruijn <willemb@google.com>
> CC: Maxim Mikityanskiy <maximmi@mellanox.com>
> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
> ---
> 
> v1 -> v2:
>   reset nh offset only for small packets sent on a variable hard hdr len device

Applied.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net v2] af_packet: fix raw sockets over 6in4 tunnel
  2019-01-17 23:55                 ` David Miller
@ 2019-02-18 15:57                   ` Sasha Levin
  2019-02-20 18:39                     ` Willem de Bruijn
  0 siblings, 1 reply; 17+ messages in thread
From: Sasha Levin @ 2019-02-18 15:57 UTC (permalink / raw)
  To: David Miller; +Cc: nicolas.dichtel, netdev, willemb, maximmi

On Thu, Jan 17, 2019 at 03:55:27PM -0800, David Miller wrote:
>From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
>Date: Thu, 17 Jan 2019 11:27:22 +0100
>
>> Since commit cb9f1b783850, scapy (which uses an AF_PACKET socket in
>> SOCK_RAW mode) is unable to send a basic icmp packet over a sit tunnel:
>>
>> Here is a example of the setup:
>> $ ip link set ntfp2 up
>> $ ip addr add 10.125.0.1/24 dev ntfp2
>> $ ip tunnel add tun1 mode sit ttl 64 local 10.125.0.1 remote 10.125.0.2 dev ntfp2
>> $ ip addr add fd00:cafe:cafe::1/128 dev tun1
>> $ ip link set dev tun1 up
>> $ ip route add fd00:200::/64 dev tun1
>> $ scapy
>>>>> p = []
>>>>> p += IPv6(src='fd00:100::1', dst='fd00:200::1')/ICMPv6EchoRequest()
>>>>> send(p, count=1, inter=0.1)
>>>>> quit()
>> $ ip -s link ls dev tun1 | grep -A1 "TX.*errors"
>>     TX: bytes  packets  errors  dropped carrier collsns
>>     0          0        1       0       0       0
>>
>> The problem is that the network offset is set to the hard_header_len of the
>> output device (tun1, ie 14 + 20) and in our case, because the packet is
>> small (48 bytes) the pskb_inet_may_pull() fails (it tries to pull 40 bytes
>> (ipv6 header) starting from the network offset).
>>
>> This problem is more generally related to device with variable hard header
>> length. To avoid a too intrusive patch in the current release, a (ugly)
>> workaround is proposed in this patch. It has to be cleaned up in net-next.
>>
>> Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=993675a3100b1
>> Link: http://patchwork.ozlabs.org/patch/1024489/
>> Fixes: cb9f1b783850 ("ip: validate header length on virtual device xmit")
>> CC: Willem de Bruijn <willemb@google.com>
>> CC: Maxim Mikityanskiy <maximmi@mellanox.com>
>> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
>> ---
>>
>> v1 -> v2:
>>   reset nh offset only for small packets sent on a variable hard hdr len device
>
>Applied.

Should this go to -stable as well? The patch it fixes is in 4.20.

--
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net v2] af_packet: fix raw sockets over 6in4 tunnel
  2019-02-18 15:57                   ` Sasha Levin
@ 2019-02-20 18:39                     ` Willem de Bruijn
  2019-02-20 19:33                       ` David Miller
  2019-02-22 19:50                       ` David Miller
  0 siblings, 2 replies; 17+ messages in thread
From: Willem de Bruijn @ 2019-02-20 18:39 UTC (permalink / raw)
  To: Sasha Levin
  Cc: David Miller, Nicolas Dichtel, Network Development,
	Willem de Bruijn, Maxim Mikityanskiy

On Mon, Feb 18, 2019 at 1:50 PM Sasha Levin <sashal@kernel.org> wrote:
>
> On Thu, Jan 17, 2019 at 03:55:27PM -0800, David Miller wrote:
> >From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
> >Date: Thu, 17 Jan 2019 11:27:22 +0100
> >
> >> Since commit cb9f1b783850, scapy (which uses an AF_PACKET socket in
> >> SOCK_RAW mode) is unable to send a basic icmp packet over a sit tunnel:
> >>
> >> Here is a example of the setup:
> >> $ ip link set ntfp2 up
> >> $ ip addr add 10.125.0.1/24 dev ntfp2
> >> $ ip tunnel add tun1 mode sit ttl 64 local 10.125.0.1 remote 10.125.0.2 dev ntfp2
> >> $ ip addr add fd00:cafe:cafe::1/128 dev tun1
> >> $ ip link set dev tun1 up
> >> $ ip route add fd00:200::/64 dev tun1
> >> $ scapy
> >>>>> p = []
> >>>>> p += IPv6(src='fd00:100::1', dst='fd00:200::1')/ICMPv6EchoRequest()
> >>>>> send(p, count=1, inter=0.1)
> >>>>> quit()
> >> $ ip -s link ls dev tun1 | grep -A1 "TX.*errors"
> >>     TX: bytes  packets  errors  dropped carrier collsns
> >>     0          0        1       0       0       0
> >>
> >> The problem is that the network offset is set to the hard_header_len of the
> >> output device (tun1, ie 14 + 20) and in our case, because the packet is
> >> small (48 bytes) the pskb_inet_may_pull() fails (it tries to pull 40 bytes
> >> (ipv6 header) starting from the network offset).
> >>
> >> This problem is more generally related to device with variable hard header
> >> length. To avoid a too intrusive patch in the current release, a (ugly)
> >> workaround is proposed in this patch. It has to be cleaned up in net-next.
> >>
> >> Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=993675a3100b1
> >> Link: http://patchwork.ozlabs.org/patch/1024489/
> >> Fixes: cb9f1b783850 ("ip: validate header length on virtual device xmit")
> >> CC: Willem de Bruijn <willemb@google.com>
> >> CC: Maxim Mikityanskiy <maximmi@mellanox.com>
> >> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
> >> ---
> >>
> >> v1 -> v2:
> >>   reset nh offset only for small packets sent on a variable hard hdr len device
> >
> >Applied.
>
> Should this go to -stable as well? The patch it fixes is in 4.20.

I believe so. It was also backported to 4.19 stable.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net v2] af_packet: fix raw sockets over 6in4 tunnel
  2019-02-20 18:39                     ` Willem de Bruijn
@ 2019-02-20 19:33                       ` David Miller
  2019-02-22 19:50                       ` David Miller
  1 sibling, 0 replies; 17+ messages in thread
From: David Miller @ 2019-02-20 19:33 UTC (permalink / raw)
  To: willemdebruijn.kernel; +Cc: sashal, nicolas.dichtel, netdev, willemb, maximmi

From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Date: Wed, 20 Feb 2019 13:39:23 -0500

> On Mon, Feb 18, 2019 at 1:50 PM Sasha Levin <sashal@kernel.org> wrote:
>> Should this go to -stable as well? The patch it fixes is in 4.20.
> 
> I believe so. It was also backported to 4.19 stable.

Ok, I'll submit it, thanks.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net v2] af_packet: fix raw sockets over 6in4 tunnel
  2019-02-20 18:39                     ` Willem de Bruijn
  2019-02-20 19:33                       ` David Miller
@ 2019-02-22 19:50                       ` David Miller
  2019-02-22 23:53                         ` Willem de Bruijn
  1 sibling, 1 reply; 17+ messages in thread
From: David Miller @ 2019-02-22 19:50 UTC (permalink / raw)
  To: willemdebruijn.kernel; +Cc: sashal, nicolas.dichtel, netdev, willemb, maximmi

From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Date: Wed, 20 Feb 2019 13:39:23 -0500

> On Mon, Feb 18, 2019 at 1:50 PM Sasha Levin <sashal@kernel.org> wrote:
>> Should this go to -stable as well? The patch it fixes is in 4.20.
> 
> I believe so. It was also backported to 4.19 stable.

It's queued up now.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net v2] af_packet: fix raw sockets over 6in4 tunnel
  2019-02-22 19:50                       ` David Miller
@ 2019-02-22 23:53                         ` Willem de Bruijn
  2019-02-25  8:55                           ` Nicolas Dichtel
  0 siblings, 1 reply; 17+ messages in thread
From: Willem de Bruijn @ 2019-02-22 23:53 UTC (permalink / raw)
  To: David Miller
  Cc: Sasha Levin, Nicolas Dichtel, Network Development,
	Willem de Bruijn, Maxim Mikityanskiy

On Fri, Feb 22, 2019 at 2:50 PM David Miller <davem@davemloft.net> wrote:
>
> From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
> Date: Wed, 20 Feb 2019 13:39:23 -0500
>
> > On Mon, Feb 18, 2019 at 1:50 PM Sasha Levin <sashal@kernel.org> wrote:
> >> Should this go to -stable as well? The patch it fixes is in 4.20.
> >
> > I believe so. It was also backported to 4.19 stable.
>
> It's queued up now.

Thanks David!

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH net v2] af_packet: fix raw sockets over 6in4 tunnel
  2019-02-22 23:53                         ` Willem de Bruijn
@ 2019-02-25  8:55                           ` Nicolas Dichtel
  0 siblings, 0 replies; 17+ messages in thread
From: Nicolas Dichtel @ 2019-02-25  8:55 UTC (permalink / raw)
  To: Willem de Bruijn, David Miller
  Cc: Sasha Levin, Network Development, Willem de Bruijn, Maxim Mikityanskiy

Le 23/02/2019 à 00:53, Willem de Bruijn a écrit :
> On Fri, Feb 22, 2019 at 2:50 PM David Miller <davem@davemloft.net> wrote:
>>
>> From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
>> Date: Wed, 20 Feb 2019 13:39:23 -0500
>>
>>> On Mon, Feb 18, 2019 at 1:50 PM Sasha Levin <sashal@kernel.org> wrote:
>>>> Should this go to -stable as well? The patch it fixes is in 4.20.
>>>
>>> I believe so. It was also backported to 4.19 stable.
>>
>> It's queued up now.
> 
> Thanks David!
> 
Thanks all and sorry for the late reply, I was off last week.


Regards,
Nicolas

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2019-02-25  8:55 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-11 14:43 [PATCH net] af_packet: fix raw sockets over 6in4 tunnel Nicolas Dichtel
2019-01-11 21:29 ` Willem de Bruijn
2019-01-12 18:00   ` Willem de Bruijn
2019-01-14 14:51     ` Nicolas Dichtel
2019-01-14 15:15       ` Willem de Bruijn
2019-01-14 15:38         ` Willem de Bruijn
2019-01-15  9:49           ` Nicolas Dichtel
2019-01-15 16:20             ` Willem de Bruijn
2019-01-17 10:27               ` [PATCH net v2] " Nicolas Dichtel
2019-01-17 15:09                 ` Willem de Bruijn
2019-01-17 23:55                 ` David Miller
2019-02-18 15:57                   ` Sasha Levin
2019-02-20 18:39                     ` Willem de Bruijn
2019-02-20 19:33                       ` David Miller
2019-02-22 19:50                       ` David Miller
2019-02-22 23:53                         ` Willem de Bruijn
2019-02-25  8:55                           ` Nicolas Dichtel

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.