All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] Fix skb->csum calculation when netfilter manipulation for NF_NAT_MANIP_SRC\DST is done on IPV6 packet.
@ 2019-10-23 19:02 Praveen Chaudhary
  2019-10-23 19:02 ` [PATCH] [netfilter]: " Praveen Chaudhary
  0 siblings, 1 reply; 9+ messages in thread
From: Praveen Chaudhary @ 2019-10-23 19:02 UTC (permalink / raw)
  To: davem, fw, kadlec, pablo, netdev, linux-kernel


-------------Issue [BUG in current code]:-------------

IPV6 UDP packet is dropped by kernel in function udp6_csum_init(), when netfilter for NF_NAT_MANIP_SRC\DST is applied.
    
Counter increased: Udp6InCsumErrors. Note: incoming UPD6 packet has correct UDP checksum. 


-------------Reproduction steps: (using IPV6 UDP DNS Query)--- [linux kernel any version > 4.9.110]-------------

1.) Set an SNAT entry in /etc/iptables/rules.v6 as show below.
```
    -A POSTROUTING -p udp -m udp --dport 53 -o Ethernet+ -j SNAT --to-source 2a04:xxxx:xxxx:4::2 (Masked)
```
Note: Above rule will change ipv6 source address to 2a04:xxxx:xxxx:4::2 for outgoing UDPv6 DNS packet , and will change the ipv6     destination address from 2a04:xxxx:xxxx:4::2 to original ipv6 address for incoming DNS response.

2.) Apply the iptable changes.
```
    switch$ sudo /etc/network/if-pre-up.d/iptables
```
3.) Run a DNS query using IPV6 DNS server for any site. We can observe that DNS query is timed out.
```
    switch$ host facebook.com 2a04:xxxx:xx:1::c216
    ;; connection timed out; no servers could be reached
```

---------------Details of issue----------------

1.) On incoming path: 
Function Trace: udp_error()-->nf_checksum()-->nf_ip6_checksum():

In function nf_ip6_checksum(); checksum is verified for incoming packet. Here skb->data points to IPV6 HEADER and ip_summed == CHECKSUM_NONE. So after a call to __skb_checksum_complete, skb->csum will store the 16-bit sum of [ipv6 header + UDP header + UDP data].

Checksum verification will be successful here, because csum_ipv6_magic() subtracts 16-bit sum of IPv6 header from 16-bit sum of Pseudo header.

Note: UDP checksum = ~[pseudo header + UDP header + UDP data].


2.) SNAT iptable rule processing, 

Function Trace: nf_nat_ipv6_manip_pkt()--> l4proto_manip_pkt()-->udp_manip_pkt()-->__udp_manip_pkt()-->nf_csum_update()---> nf_nat_ipv6_csum_update()-->inet_proto_csum_replace16():

In function inet_proto_csum_replace16(), first udp_header checksum field will be updated as below, because of the NF_NAT_MANIP_SRC manipulation,     I.e to reflect IPV6 src address change in IPV6 header.
```
        *sum = csum_fold(csum_partial(diff, sizeof(diff),
                 ~csum_unfold(*sum)));
```
Since skb->csum includes udp header checksum field, skb->csum will also go through similar calculation to reflect udp header checksum field change.
```
    if (skb->ip_summed == CHECKSUM_COMPLETE && pseudohdr)
            skb->csum = ~csum_partial(diff, sizeof(diff),
                          ~skb->csum);
```
After this update, skb->csum = [Original IPV6 Header + Modified UDP header + UDP data]



3.) Then in function nf_nat_ipv6_manip_pkt(), ipv6 header will be updated to have target IPV6 src address:
    
```
    ipv6h->saddr = target->src.u3.in6;
```
Here we do not update skb->csum. So skb->csum will still be equal to [Original IPV6 Header + Modified UDP header + UDP data]. 
**BUG BUG BUG**: This is the bug, skb->csum must be updated to reflect this change in IPV6 SRC Address.

Ideally change in UDP header checksum field and change is IPV6 SRC address cancels each other in this case, so no update was needed in skb->csum.

4.) IPV6 header processing: 

Function Trace: ip6_input() --> ip6_input_finish() --> ip6_protocol_deliver_rcu():

In ip6_protocol_deliver_rcu(), 16-bit sum of IPV6 header will be subtracted from skb->csum.

```
    skb_postpull_rcsum(skb, skb_network_header(skb),
                       skb_network_header_len(skb));
```

This is the sum of new IPV6 header with modified IPV6 Source address. After this subtraction 

skb->csum = [Original IPV6 Header + Modified UDP header + UDP data] - [Modified IPv6 Header]
This is wrong value of skb->csum.


5.) UDP Header Checksum init:

Function Trace: __udp6_lib_rcv() --> udp6_csum_init() -->  __skb_checksum_validate_complete()

In  __skb_checksum_validate_complete() below condition will never met. Because value of skb->csum is unexpected. 
```
    if (skb->ip_summed == CHECKSUM_COMPLETE) {
        if (!csum_fold(csum_add(psum, skb->csum))) {
            skb->csum_valid = 1;
            return 0;
        }
    }
```

In udp6_csum_init(), below check will be true:
```
if (skb->ip_summed == CHECKSUM_COMPLETE && !skb->csum_valid) {
        /* If SW calculated the value, we know it's bad */
        if (skb->csum_complete_sw)
            return 1;
```

and this packet will be dropped in  __udp6_lib_rcv() due to below check

```
if (udp6_csum_init(skb, uh, proto))
        goto csum_error;
.......
.......

csum_error:
    __UDP6_INC_STATS(net, UDP_MIB_CSUMERRORS, proto == IPPROTO_UDPLITE);
```



-------------------Related JPROBE LOGS:-------------------
```
Oct 19 05:15:01.420147 asw03 NOTICE kernel: [350574.264081]  nf_ip6_checksum: dataoff=40   
Oct 19 05:15:01.420180 asw03 NOTICE kernel: [350574.264084]  nf_ip6_checksum: skb:ffff9faa71cbc500 ip_sum=0 csum_valid=0 csum_com_sw=0 csum=0
```
[Explanation]: dataoff = 40 for ipv6 header and checksum is not verified yet, so skb->csum = 0.

```
Oct 19 05:15:01.420190 asw03 NOTICE kernel: [350574.264090]  nf_ip6_checksum: len:228 psum:ffffd68e csum=2e57fb19 sum=0
```
[Explanation]: psum = pseudo header sum - ipv6 header sum. Csum = skb->csm = [ipv6 header + UDP header + UDP data]
psum + csum (16-bit sum) = ffff + d68e + 2e57 + fb19 = 0x2FFFD = 0xFFFD + 0x2 = 0xFFFF.

```
Oct 19 05:15:01.420199 asw03 NOTICE kernel: [350574.264105] inet_proto_csum_replace16 : prev_hw_check=7feb uh_check=01cb
```
[Explanation]: With SNAT, IPV6 address will change from 2axx:xxxx:00xx:0004:0000:0000:0000:0002 to 2axx:xxx:00xx:2082:0000:0000:0000:0002. So the 16 bit diff is: 0x2082 - 0x0004 = 0x207E . So UDP header checksum changes from 0xEB7f to 0xCB01, Diff = ~(0x207E). [Printed without htons above.]

```
Oct 19 05:15:01.420203 asw03 NOTICE kernel: [350574.264107] inet_proto_csum_replace16: skb:ffff9faa71cbc500 ip_sum=2 csum_valid=1 csum_com_sw=1 csum= b037fb18
```
[Explanation]: Skb->csum changed according to udp header checksum diff: 16-bit diff for (b037fb18 - 2e57fb19) = ~htons(0x207E).

```
Oct 19 05:15:01.420218 asw03 NOTICE kernel: [350574.264116] ip6_rcv_finish: skb:ffff9faa71cbc500 ip_sum=2 csum_valid=1 csum_com_sw=1 csum=b037fb18
Oct 19 05:15:01.420227 asw03 NOTICE kernel: [350574.264135] __udp6_lib_rcv: src: 2axx:xxxx:00xx:0001:0000:0000:0000:c216 dst: 2axx:xxxx:00xx:2082:0000:0000:0000:0002 port: 53 uh_check: cb01
Oct 19 05:15:01.420230 asw03 NOTICE kernel: [350574.264137] __udp6_lib_rcv: skb:ffff9faa71cbc500 ip_sum=2 csum_valid=1 csum_com_sw=1 csum=57e294da
```
[Explanation]: Between ip6_rcv_finish and __udp6_lib_rcv, 16-bit sum of ipv6 header is subtracted from skb->csum. So skb->csum = 57e294da at __udp6_lib_rcv.

```
Oct 19 05:15:01.420236 asw03 NOTICE kernel: [350574.264140] __udp6_lib_rcv_post: len:188 psum:ffff9522 csum=57e294da
```
Here psum + csum = ffff + 9522 + 57e2 + 94da = 0x281DD = 0x81DF = ~(htons(207E)). Which shows that SKB_CSUM should be updated due to change IPV6 address.


--------------FIX ------------------------

nf_nat_ipv6_csum_update():

If nf_nat_ipv6_csum_update() function is called, it is guaranteed that one of the manipulation (NF_NAT_MANIP_SRC\NF_NAT_MANIP_DST) will surely happen. So skb->csum can be updated here for IPV6 header change. 

With IPV6 address update, skb->csum must be updated as below:

```
    skb->csum = csum_partial(diff, sizeof(diff),
                  skb->csum);
```
Where diff is:
```
__be32 diff[] = {
        ~from[0], ~from[1], ~from[2], ~from[3],
        to[0], to[1], to[2], to[3],
    };
```
Here From contains old IPV6 address and to contains new IPV6 address.


-----------------------
Interesting Facts:

------------------Why Problem is not seen for IPV4--------------

IPV4 suffers with similar issue, but since 16-bit sum of IPV4 header is always zero [Due to IPv4 header checksum present in IPV4 header]. Skb->csum as per below state will correct:

skb->csum = [Original IPV6 Header + Modified UDP header + UDP data - Modified IPv6 Header] = [Modified UDP header + UDP data]


------------------Why Problem is not seen for IPV6 TCP--------------

Even though skb->csum will be incorrect, when packet hit tcp_v6_rcv()--> skb_checksum_init()--> __skb_checksum_validate().

TCP code start checksum calculation from scratch by assigning pseudo header sum in skb->csum and by returning 0 from skb_checksum_init().

With this fix,  validation will be successful in __skb_checksum_validate() even for tcpv6 packet, so kernel will not recalculate checksum.


------------------Why Problem is not seen in linux kernel version < 3.16--------------

Call to udp_csum_pull_header(skb) in udpv6_queue_rcv_one_skb(struct sock *sk, struct sk_buff *skb) has exposed the issue for UDPv6.

-------------------------------------------------------------------------------------

Note: 

-- Please let me know correct name and the file for new function inet_proto_skb_csum_replace16().

-- I am new to linux kernel contribution, so please let me know if any guideline is missed, I will follow it. I appreciate any suggestion. Thanks. 

-- I see KernelPRBot reply on each PR, let me know if that is the correct way to contribute.

-- Kindly let me know, If needed I can present entire packet and example with checksum calculation at each stage.

-------------------------------------------------------------------------------------



---------------------------Testing (with JPROBE LOGS) [AFTER FIX UDPv6 DNS query works fine]:---------------------------


1.) 
```
pchaudha@asw03:~$ sudo cat /etc/iptables/rules.v6 | grep SNAT
# SNAT
-A POSTROUTING -p tcp -m tcp --dport 443 -o Ethernet+ -j SNAT --to-source 2a04:xxxx:62:4::2
-A POSTROUTING -p udp -m udp --dport 53 -o Ethernet+ -j SNAT --to-source 2a04:xxxx:62:4::2
-A POSTROUTING -p tcp -m tcp --dport 53 -o Ethernet+ -j SNAT --to-source 2a04:xxxx:62:4::2
```


2.)
```
 pchaudha@asw03:~$ sudo /etc/network/if-pre-up.d/iptables
```
3.) UDPv6 DNS query:
```
pchaudha@asw03:~$ host facebook.com 2a04:xxxx:xx:1::c216
Using domain server:
Name: 2a04:xxxx:32:1::c216
Address: 2a04:xxxx:32:1::c216#53
Aliases:

facebook.com has address 157.240.11.35
facebook.com has IPv6 address 2a03:2880:f10d:183:face:b00c:0:25de
facebook.com mail is handled by 10 smtpin.vvv.facebook.com.
```
4.) JPROBE LOGS: 
```
Oct 22 05:51:20.748153 asw03 NOTICE kernel: [611954.922595] nf_nat_ipv6_manip_pkt: skb:ffff9faa7c9bf400 t->dst.protonum=17
Oct 22 05:51:20.748186 asw03 NOTICE kernel: [611954.922600] inet_proto_csum_replace16: skb:ffff9faa7c9bf400 ips=2 csumv=1 csumsw=1 csum=4914e05c
Oct 22 05:51:20.748192 asw03 NOTICE kernel: [611954.922601] inet_proto_csum_replace16: new sum:c734e05c <<<<<<<<
Oct 22 05:51:20.748195 asw03 NOTICE kernel: [611954.922605] ip6_rcv_finish: skb:ffff9faa7c9bf400 ips=2 csumv=1 csumsw=1 csum=4914e05c
```
5.) TCPv6 query:
```
pchaudha@asw03:~$ host -T facebook.com 2a04:xxxx:32:1::c216
Using domain server:
Name: 2a04:xxxx:32:1::c216
Address: 2a04:xxxx:32:1::c216#53
Aliases:

facebook.com has address 31.13.70.36
facebook.com has IPv6 address 2a03:2880:f10d:183:face:b00c:0:25de
facebook.com mail is handled by 10 smtpin.vvv.facebook.com.
--------------------------
```
6.) JPROBE LOGS:
```
Oct 22 18:13:29.199793 asw03 NOTICE kernel: [656483.605018] nf_nat_ipv6_manip_pkt: skb:ffff9faab3e79400 t->dst.protonum=6
Oct 22 18:13:29.199826 asw03 NOTICE kernel: [656483.605032] inet_proto_csum_replace16: skb:ffff9faab3e79400 ips=2 csumv=1 csumsw=1 csum=c6796dec
Oct 22 18:13:29.199831 asw03 NOTICE kernel: [656483.605033] inet_proto_csum_replace16: new sum:c4996dec
Oct 22 18:13:29.199835 asw03 NOTICE kernel: [656483.605038] ip6_rcv_finish: skb:ffff9faab3e79400 ips=2 csumv=1 csumsw=1 csum=c6796dec
```

Praveen Chaudhary (1):
  [netfilter]: Fix skb->csum calculation when netfilter manipulation for
    NF_NAT_MANIP_SRC\DST is done on IPV6 packet.

 include/net/checksum.h       |  2 ++
 net/core/utils.c             | 13 +++++++++++++
 net/netfilter/nf_nat_proto.c |  2 ++
 3 files changed, 17 insertions(+)

-- 
2.7.4


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH] [netfilter]: Fix skb->csum calculation when netfilter manipulation for NF_NAT_MANIP_SRC\DST is done on IPV6 packet.
  2019-10-23 19:02 [PATCH] Fix skb->csum calculation when netfilter manipulation for NF_NAT_MANIP_SRC\DST is done on IPV6 packet Praveen Chaudhary
@ 2019-10-23 19:02 ` Praveen Chaudhary
  2019-10-23 19:33   ` Florian Westphal
  0 siblings, 1 reply; 9+ messages in thread
From: Praveen Chaudhary @ 2019-10-23 19:02 UTC (permalink / raw)
  To: davem, fw, kadlec, pablo, netdev, linux-kernel; +Cc: Zhenggen Xu, Andy Stracner

Update skb->csum, when netfilter code updates IPV6 SRC\DST address in IPV6 HEADER due to iptable rule.

Signed-off-by: Praveen Chaudhary <pchaudhary@linkedin.com>
Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
Signed-off-by: Andy Stracner <astracner@linkedin.com>
---
 include/net/checksum.h       |  2 ++
 net/core/utils.c             | 13 +++++++++++++
 net/netfilter/nf_nat_proto.c |  2 ++
 3 files changed, 17 insertions(+)

diff --git a/include/net/checksum.h b/include/net/checksum.h
index 97bf488..d7d28b7 100644
--- a/include/net/checksum.h
+++ b/include/net/checksum.h
@@ -145,6 +145,8 @@ void inet_proto_csum_replace4(__sum16 *sum, struct sk_buff *skb,
 void inet_proto_csum_replace16(__sum16 *sum, struct sk_buff *skb,
 			       const __be32 *from, const __be32 *to,
 			       bool pseudohdr);
+void inet_proto_skb_csum_replace16(struct sk_buff *skb,
+			       const __be32 *from, const __be32 *to);
 void inet_proto_csum_replace_by_diff(__sum16 *sum, struct sk_buff *skb,
 				     __wsum diff, bool pseudohdr);
 
diff --git a/net/core/utils.c b/net/core/utils.c
index 6b6e51d..ab3284b 100644
--- a/net/core/utils.c
+++ b/net/core/utils.c
@@ -458,6 +458,19 @@ void inet_proto_csum_replace16(__sum16 *sum, struct sk_buff *skb,
 }
 EXPORT_SYMBOL(inet_proto_csum_replace16);
 
+void inet_proto_skb_csum_replace16(struct sk_buff *skb,
+			       const __be32 *from, const __be32 *to)
+{
+	__be32 diff[] = {
+		~from[0], ~from[1], ~from[2], ~from[3],
+		to[0], to[1], to[2], to[3],
+	};
+	if (skb->ip_summed == CHECKSUM_COMPLETE)
+		skb->csum = csum_partial(diff, sizeof(diff),
+				  skb->csum);
+}
+EXPORT_SYMBOL(inet_proto_skb_csum_replace16);
+
 void inet_proto_csum_replace_by_diff(__sum16 *sum, struct sk_buff *skb,
 				     __wsum diff, bool pseudohdr)
 {
diff --git a/net/netfilter/nf_nat_proto.c b/net/netfilter/nf_nat_proto.c
index 0a59c14..de94590 100644
--- a/net/netfilter/nf_nat_proto.c
+++ b/net/netfilter/nf_nat_proto.c
@@ -467,6 +467,8 @@ static void nf_nat_ipv6_csum_update(struct sk_buff *skb,
 	}
 	inet_proto_csum_replace16(check, skb, oldip->s6_addr32,
 				  newip->s6_addr32, true);
+	inet_proto_skb_csum_replace16(skb, oldip->s6_addr32,
+				  newip->s6_addr32);
 #endif
 }
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH] [netfilter]: Fix skb->csum calculation when netfilter manipulation for NF_NAT_MANIP_SRC\DST is done on IPV6 packet.
  2019-10-23 19:02 ` [PATCH] [netfilter]: " Praveen Chaudhary
@ 2019-10-23 19:33   ` Florian Westphal
  2019-10-23 20:56     ` [PATCH] [netfilter]: Fix skb->csum calculation when netfilter Praveen Chaudhary
       [not found]     ` <CAJ_cd4qHM3kqz24Uywpyyz0mPz7axiNZk0Q385ROd4O8XZ11fA@mail.gmail.com>
  0 siblings, 2 replies; 9+ messages in thread
From: Florian Westphal @ 2019-10-23 19:33 UTC (permalink / raw)
  To: Praveen Chaudhary
  Cc: davem, fw, kadlec, pablo, netdev, linux-kernel, Zhenggen Xu,
	Andy Stracner

Praveen Chaudhary <praveen5582@gmail.com> wrote:
> Update skb->csum, when netfilter code updates IPV6 SRC\DST address in IPV6 HEADER due to iptable rule.
> 
> Signed-off-by: Praveen Chaudhary <pchaudhary@linkedin.com>
> Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
> Signed-off-by: Andy Stracner <astracner@linkedin.com>
> ---
>  include/net/checksum.h       |  2 ++
>  net/core/utils.c             | 13 +++++++++++++
>  net/netfilter/nf_nat_proto.c |  2 ++
>  3 files changed, 17 insertions(+)
> 
> diff --git a/include/net/checksum.h b/include/net/checksum.h
> index 97bf488..d7d28b7 100644
> --- a/include/net/checksum.h
> +++ b/include/net/checksum.h
> @@ -145,6 +145,8 @@ void inet_proto_csum_replace4(__sum16 *sum, struct sk_buff *skb,
>  void inet_proto_csum_replace16(__sum16 *sum, struct sk_buff *skb,
>  			       const __be32 *from, const __be32 *to,
>  			       bool pseudohdr);
> +void inet_proto_skb_csum_replace16(struct sk_buff *skb,
> +			       const __be32 *from, const __be32 *to);
>  void inet_proto_csum_replace_by_diff(__sum16 *sum, struct sk_buff *skb,
>  				     __wsum diff, bool pseudohdr);
>  
> diff --git a/net/core/utils.c b/net/core/utils.c
> index 6b6e51d..ab3284b 100644
> --- a/net/core/utils.c
> +++ b/net/core/utils.c
> @@ -458,6 +458,19 @@ void inet_proto_csum_replace16(__sum16 *sum, struct sk_buff *skb,
>  }
>  EXPORT_SYMBOL(inet_proto_csum_replace16);
>  
> +void inet_proto_skb_csum_replace16(struct sk_buff *skb,
> +			       const __be32 *from, const __be32 *to)
> +{
> +	__be32 diff[] = {
> +		~from[0], ~from[1], ~from[2], ~from[3],
> +		to[0], to[1], to[2], to[3],
> +	};
> +	if (skb->ip_summed == CHECKSUM_COMPLETE)
> +		skb->csum = csum_partial(diff, sizeof(diff),
> +				  skb->csum);
> +}
> +EXPORT_SYMBOL(inet_proto_skb_csum_replace16);
> +
>  void inet_proto_csum_replace_by_diff(__sum16 *sum, struct sk_buff *skb,
>  				     __wsum diff, bool pseudohdr)
>  {
> diff --git a/net/netfilter/nf_nat_proto.c b/net/netfilter/nf_nat_proto.c
> index 0a59c14..de94590 100644
> --- a/net/netfilter/nf_nat_proto.c
> +++ b/net/netfilter/nf_nat_proto.c
> @@ -467,6 +467,8 @@ static void nf_nat_ipv6_csum_update(struct sk_buff *skb,
>  	}
>  	inet_proto_csum_replace16(check, skb, oldip->s6_addr32,
>  				  newip->s6_addr32, true);
> +	inet_proto_skb_csum_replace16(skb, oldip->s6_addr32,
> +				  newip->s6_addr32);

This is confusing.

You're saying that inet_proto_csum_replace16() is producing a wrong
skb->csum.  So why are you adding a new function to do the
csum calculation instead of fixing inet_proto_csum_replace16() to do
the right thing?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [PATCH] [netfilter]: Fix skb->csum calculation when netfilter
  2019-10-23 19:33   ` Florian Westphal
@ 2019-10-23 20:56     ` Praveen Chaudhary
       [not found]     ` <CAJ_cd4qHM3kqz24Uywpyyz0mPz7axiNZk0Q385ROd4O8XZ11fA@mail.gmail.com>
  1 sibling, 0 replies; 9+ messages in thread
From: Praveen Chaudhary @ 2019-10-23 20:56 UTC (permalink / raw)
  To: fw
  Cc: astracner, davem, kadlec, linux-kernel, netdev, pablo, praveen5582, zxu

Hi Florian 

Thanks for the review,

inet_proto_csum_replace16 is called from many places, whereas this fix is applicable only for nf_nat_ipv6_csum_update, where we need to update skb->csum for ipv6 src/dst address change. 
Also my point is, inet_proto_csum_replace16 is updating skb->csum for change in udp header checksum field, but that is not complete. So, I added a new function. Basically, I used a safe apprioach to fix it, without impacting other cases. Let me know other options,  I am open to suggestions.

More importantly, I hope this is clear that the current code does not update skb->csum completely. Which is a bug. Thanks again.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] [netfilter]: Fix skb->csum calculation when netfilter manipulation for NF_NAT_MANIP_SRC\DST is done on IPV6 packet.
       [not found]     ` <CAJ_cd4qHM3kqz24Uywpyyz0mPz7axiNZk0Q385ROd4O8XZ11fA@mail.gmail.com>
@ 2019-10-24  1:12       ` Florian Westphal
  2019-10-24  5:19         ` [netfilter]: Fix skb->csum calculation when netfilter Praveen Chaudhary
  0 siblings, 1 reply; 9+ messages in thread
From: Florian Westphal @ 2019-10-24  1:12 UTC (permalink / raw)
  To: praveen chaudhary
  Cc: Florian Westphal, davem, kadlec, pablo, netdev, linux-kernel,
	Zhenggen Xu, Andy Stracner

praveen chaudhary <praveen5582@gmail.com> wrote:
> inet_proto_csum_replace16 is called from many places, whereas this fix is
> applicable only for nf_nat_ipv6_csum_update. where we need to update
> skb->csum for ipv6 src/dst address change.

Under which circumstances does inet_proto_csum_replace16 upate
skb->csum correctly?

> So, I added a new function. Basically, I used a safe approach to fix it,
> without impacting other cases. Let me know other options,  I am open to
> suggestions.

You seem to imply inet_proto_csum_replace16 is fine and only broken for ipv6
nat.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: [netfilter]: Fix skb->csum calculation when netfilter
  2019-10-24  1:12       ` [PATCH] [netfilter]: Fix skb->csum calculation when netfilter manipulation for NF_NAT_MANIP_SRC\DST is done on IPV6 packet Florian Westphal
@ 2019-10-24  5:19         ` Praveen Chaudhary
  0 siblings, 0 replies; 9+ messages in thread
From: Praveen Chaudhary @ 2019-10-24  5:19 UTC (permalink / raw)
  To: fw
  Cc: astracner, davem, kadlec, linux-kernel, netdev, pablo, praveen5582, zxu

Hi Florian

Thanks for giving time for review. This fix is pretty important for SONiC OS (https://azure.github.io/SONiC/). So I really appreciate it.

>> inet_proto_csum_replace16 is called from many places, whereas this fix is
>> applicable only for nf_nat_ipv6_csum_update. where we need to update
>> skb->csum for ipv6 src/dst address change.
>
>Under which circumstances does inet_proto_csum_replace16 upate
>skb->csum correctly?

inet_proto_csum_replace16 calculates skb->csum correctly if skb->csum does not include 16-bit sum of IPv6 Header i.e skb->data points to UDP\TCP\ICMPv6 header while calling __skb_checksum_complete() on packet. Function inet_proto_csum_replace16 is called from  nf_nat_ipv6_csum_update(),  nf_flow_nat_ipv6_tcp(),  nf_flow_nat_ipv6_udp() and update_ipv6_checksum(). For all nf_*() functions, inet_proto_csum_replace16() will not update skb->csum correctly\completely. 
But I am not sure about update_ipv6_checksum() (in net/openvswitch/actions.c). This is where I seek help from experts. If even for update_ipv6_checksum(), skb->csum includes 16-bit sum of IPv6 Header then inet_proto_csum_replace16() does updates skb->csum correctly. Then our fix will be to remove below line from this function. Because change in UDP\TCP\ICMPv6 header checksum field and change in IPv6 SRC\DST address cancels each other for checksum calculation, i.e. no update to skb->csum is needed.
```
if (skb->ip_summed == CHECKSUM_COMPLETE && pseudohdr)
    skb->csum = ~csum_partial(diff, sizeof(diff),
            ~skb->csum);
```
>
>> So, I added a new function. Basically, I used a safe approach to fix it,
>> without impacting other cases. Let me know other options,  I am open to
>> suggestions.
>
>You seem to imply inet_proto_csum_replace16 is fine and only broken for ipv6
>nat.

Yeah, as mentioned above, I took safe approach to fix only nf_nat_ipv6_csum_update() part, where I am sure that skb->csum is broken. But I am not sure if (net/openvswitch/actions.c) needs this fix or not. Consider this my lack of expertise, So kindly suggest whether net/openvswitch/actions.c needs this fix or not. Note: I may not be able to test this part. After your suggestion, I will change my patch. Again Thanks for your time. 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH] [netfilter]: Fix skb->csum calculation when netfilter manipulation for NF_NAT_MANIP_SRC\DST is done on IPV6 packet.
  2019-10-28 22:27 ` Praveen Chaudhary
@ 2019-10-29 16:09   ` Florian Westphal
  0 siblings, 0 replies; 9+ messages in thread
From: Florian Westphal @ 2019-10-29 16:09 UTC (permalink / raw)
  To: Praveen Chaudhary
  Cc: davem, fw, kadlec, pablo, netdev, linux-kernel, Zhenggen Xu,
	Andy Stracner

Praveen Chaudhary <praveen5582@gmail.com> wrote:
> No need to update skb->csum in function inet_proto_csum_replace16(),
> even if skb->ip_summed == CHECKSUM_COMPLETE, because change in L4
> header checksum field and change in IPV6 header cancels each other
> for skb->csum calculation.

Can you resend this and submit this patch to
netfilter-devel@vger.kernel.org?

You may add:
Reviewed-by: Florian Westphal <fw@strlen.de>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH] [netfilter]: Fix skb->csum calculation when netfilter manipulation for NF_NAT_MANIP_SRC\DST is done on IPV6 packet.
  2019-10-28 22:27 [PATCH] [netfilter]: Fix skb->csum calculation when netfilter manipulation for NF_NAT_MANIP_SRC\DST is done on IPV6 packet Praveen Chaudhary
@ 2019-10-28 22:27 ` Praveen Chaudhary
  2019-10-29 16:09   ` Florian Westphal
  0 siblings, 1 reply; 9+ messages in thread
From: Praveen Chaudhary @ 2019-10-28 22:27 UTC (permalink / raw)
  To: davem, fw, kadlec, pablo, netdev, linux-kernel; +Cc: Zhenggen Xu, Andy Stracner

No need to update skb->csum in function inet_proto_csum_replace16(),
even if skb->ip_summed == CHECKSUM_COMPLETE, because change in L4
header checksum field and change in IPV6 header cancels each other
for skb->csum calculation.

Signed-off-by: Praveen Chaudhary <pchaudhary@linkedin.com>
Signed-off-by: Zhenggen Xu <zxu@linkedin.com>
Signed-off-by: Andy Stracner <astracner@linkedin.com>
---
Changes in V2.
1.) Updating diff as per email discussion with Florian Westphal.
    Since inet_proto_csum_replace16() does incorrect calculation
    for skb->csum in all cases.
2.) Change in Commmit logs.
---
---
 net/core/utils.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/net/core/utils.c b/net/core/utils.c
index 6b6e51d..cec9924 100644
--- a/net/core/utils.c
+++ b/net/core/utils.c
@@ -438,6 +438,12 @@ void inet_proto_csum_replace4(__sum16 *sum, struct sk_buff *skb,
 }
 EXPORT_SYMBOL(inet_proto_csum_replace4);
 
+/**
+ * No need to update skb->csum in this function, even if
+ * skb->ip_summed == CHECKSUM_COMPLETE, because change in
+ * L4 header checksum field and change in IPV6 header
+ * cancels each other for skb->csum calculation.
+ */
 void inet_proto_csum_replace16(__sum16 *sum, struct sk_buff *skb,
 			       const __be32 *from, const __be32 *to,
 			       bool pseudohdr)
@@ -449,9 +455,6 @@ void inet_proto_csum_replace16(__sum16 *sum, struct sk_buff *skb,
 	if (skb->ip_summed != CHECKSUM_PARTIAL) {
 		*sum = csum_fold(csum_partial(diff, sizeof(diff),
 				 ~csum_unfold(*sum)));
-		if (skb->ip_summed == CHECKSUM_COMPLETE && pseudohdr)
-			skb->csum = ~csum_partial(diff, sizeof(diff),
-						  ~skb->csum);
 	} else if (pseudohdr)
 		*sum = ~csum_fold(csum_partial(diff, sizeof(diff),
 				  csum_unfold(*sum)));
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH] [netfilter]: Fix skb->csum calculation when netfilter manipulation for NF_NAT_MANIP_SRC\DST is done on IPV6 packet.
@ 2019-10-28 22:27 Praveen Chaudhary
  2019-10-28 22:27 ` Praveen Chaudhary
  0 siblings, 1 reply; 9+ messages in thread
From: Praveen Chaudhary @ 2019-10-28 22:27 UTC (permalink / raw)
  To: davem, fw, kadlec, pablo, netdev, linux-kernel

-------------Issue [BUG in current code]:-------------

IPV6 UDP packet is dropped by kernel in function udp6_csum_init(), when netfilter for NF_NAT_MANIP_SRC\DST is applied.
    
Counter increased: Udp6InCsumErrors. Note: incoming UPD6 packet has correct UDP checksum. 


-------------Reproduction steps: (using IPV6 UDP DNS Query)--- [linux kernel any version > 4.9.110]-------------

1.) Set an SNAT entry in /etc/iptables/rules.v6 as show below.
```
    -A POSTROUTING -p udp -m udp --dport 53 -o Ethernet+ -j SNAT --to-source 2a04:xxxx:xxxx:4::2 (Masked)
```
Note: Above rule will change ipv6 source address to 2a04:xxxx:xxxx:4::2 for outgoing UDPv6 DNS packet , and will change the ipv6     destination address from 2a04:xxxx:xxxx:4::2 to original ipv6 address for incoming DNS response.

2.) Apply the iptable changes.
```
    switch$ sudo /etc/network/if-pre-up.d/iptables
```
3.) Run a DNS query using IPV6 DNS server for any site. We can observe that DNS query is timed out.
```
    switch$ host facebook.com 2a04:xxxx:xx:1::c216
    ;; connection timed out; no servers could be reached
```

---------------Details of issue----------------

1.) On incoming path: 
Function Trace: udp_error()-->nf_checksum()-->nf_ip6_checksum():

In function nf_ip6_checksum(); checksum is verified for incoming packet. Here skb->data points to IPV6 HEADER and ip_summed == CHECKSUM_NONE. So after a call to __skb_checksum_complete, skb->csum will store the 16-bit sum of [ipv6 header + UDP header + UDP data].

Checksum verification will be successful here, because csum_ipv6_magic() subtracts 16-bit sum of IPv6 header from 16-bit sum of Pseudo header.

Note: UDP checksum = ~[pseudo header + UDP header + UDP data].


2.) SNAT iptable rule processing, 

Function Trace: nf_nat_ipv6_manip_pkt()--> l4proto_manip_pkt()-->udp_manip_pkt()-->__udp_manip_pkt()-->nf_csum_update()---> nf_nat_ipv6_csum_update()-->inet_proto_csum_replace16():

In function inet_proto_csum_replace16(), first udp_header checksum field will be updated as below, because of the NF_NAT_MANIP_SRC manipulation,     I.e to reflect IPV6 src\dst address change in IPV6 header.
```
        *sum = csum_fold(csum_partial(diff, sizeof(diff),
                 ~csum_unfold(*sum)));
```
Since skb->csum includes udp header checksum field, skb->csum will also go through similar calculation to reflect udp header checksum field change.
```
    if (skb->ip_summed == CHECKSUM_COMPLETE && pseudohdr)
            skb->csum = ~csum_partial(diff, sizeof(diff),
                          ~skb->csum);
```
After this update, skb->csum = [Original IPV6 Header + Modified UDP header + UDP data]



3.) Then in function nf_nat_ipv6_manip_pkt(), ipv6 header will be updated to have target IPV6 src address:
    
```
    ipv6h->saddr = target->src.u3.in6;
```
Here we do not update skb->csum. So skb->csum will still be equal to [Original IPV6 Header + Modified UDP header + UDP data]. 
**BUG BUG BUG**: This is the bug, skb->csum must be updated to reflect this change in IPV6 SRC Address.

Ideally change in UDP header checksum field and change in IPV6 SRC address cancels each other in this case, so no update was needed in skb->csum.

4.) IPV6 header processing: 

Function Trace: ip6_input() --> ip6_input_finish() --> ip6_protocol_deliver_rcu():

In ip6_protocol_deliver_rcu(), 16-bit sum of IPV6 header will be subtracted from skb->csum.

```
    skb_postpull_rcsum(skb, skb_network_header(skb),
                       skb_network_header_len(skb));
```

This is the sum of new IPV6 header with modified IPV6 Source\dest address. After this subtraction 

skb->csum = [Original IPV6 Header + Modified UDP header + UDP data] - [Modified IPv6 Header]
This is wrong value of skb->csum.


5.) UDP Header Checksum init:

Function Trace: __udp6_lib_rcv() --> udp6_csum_init() -->  __skb_checksum_validate_complete()

In  __skb_checksum_validate_complete() below condition will never met. Because value of skb->csum is unexpected. 
```
    if (skb->ip_summed == CHECKSUM_COMPLETE) {
        if (!csum_fold(csum_add(psum, skb->csum))) {
            skb->csum_valid = 1;
            return 0;
        }
    }
```

In udp6_csum_init(), below check will be true:
```
if (skb->ip_summed == CHECKSUM_COMPLETE && !skb->csum_valid) {
        /* If SW calculated the value, we know it's bad */
        if (skb->csum_complete_sw)
            return 1;
```

and this packet will be dropped in  __udp6_lib_rcv() due to below check

```
if (udp6_csum_init(skb, uh, proto))
        goto csum_error;
.......
.......

csum_error:
    __UDP6_INC_STATS(net, UDP_MIB_CSUMERRORS, proto == IPPROTO_UDPLITE);
```



-------------------Related JPROBE LOGS:-------------------
```
Oct 19 05:15:01.420147 asw03 NOTICE kernel: [350574.264081]  nf_ip6_checksum: dataoff=40   
Oct 19 05:15:01.420180 asw03 NOTICE kernel: [350574.264084]  nf_ip6_checksum: skb:ffff9faa71cbc500 ip_sum=0 csum_valid=0 csum_com_sw=0 csum=0
```
[Explanation]: dataoff = 40 for ipv6 header and checksum is not verified yet, so skb->csum = 0.

```
Oct 19 05:15:01.420190 asw03 NOTICE kernel: [350574.264090]  nf_ip6_checksum: len:228 psum:ffffd68e csum=2e57fb19 sum=0
```
[Explanation]: psum = pseudo header sum - ipv6 header sum. Csum = skb->csm = [ipv6 header + UDP header + UDP data]
psum + csum (16-bit sum) = ffff + d68e + 2e57 + fb19 = 0x2FFFD = 0xFFFD + 0x2 = 0xFFFF.

```
Oct 19 05:15:01.420199 asw03 NOTICE kernel: [350574.264105] inet_proto_csum_replace16 : prev_hw_check=7feb uh_check=01cb
```
[Explanation]: With SNAT, IPV6 address will change from 2axx:xxxx:00xx:0004:0000:0000:0000:0002 to 2axx:xxx:00xx:2082:0000:0000:0000:0002. So the 16 bit diff is: 0x2082 - 0x0004 = 0x207E . So UDP header checksum changes from 0xEB7f to 0xCB01, Diff = ~(0x207E). [Printed without htons above.]

```
Oct 19 05:15:01.420203 asw03 NOTICE kernel: [350574.264107] inet_proto_csum_replace16: skb:ffff9faa71cbc500 ip_sum=2 csum_valid=1 csum_com_sw=1 csum= b037fb18
```
[Explanation]: Skb->csum changed according to udp header checksum diff: 16-bit diff for (b037fb18 - 2e57fb19) = ~htons(0x207E).

```
Oct 19 05:15:01.420218 asw03 NOTICE kernel: [350574.264116] ip6_rcv_finish: skb:ffff9faa71cbc500 ip_sum=2 csum_valid=1 csum_com_sw=1 csum=b037fb18
Oct 19 05:15:01.420227 asw03 NOTICE kernel: [350574.264135] __udp6_lib_rcv: src: 2axx:xxxx:00xx:0001:0000:0000:0000:c216 dst: 2axx:xxxx:00xx:2082:0000:0000:0000:0002 port: 53 uh_check: cb01
Oct 19 05:15:01.420230 asw03 NOTICE kernel: [350574.264137] __udp6_lib_rcv: skb:ffff9faa71cbc500 ip_sum=2 csum_valid=1 csum_com_sw=1 csum=57e294da
```
[Explanation]: Between ip6_rcv_finish and __udp6_lib_rcv, 16-bit sum of ipv6 header is subtracted from skb->csum. So skb->csum = 57e294da at __udp6_lib_rcv.

```
Oct 19 05:15:01.420236 asw03 NOTICE kernel: [350574.264140] __udp6_lib_rcv_post: len:188 psum:ffff9522 csum=57e294da
```
Here psum + csum = ffff + 9522 + 57e2 + 94da = 0x281DD = 0x81DF = ~(htons(207E)). Which shows that SKB_CSUM should be updated due to change IPV6 address.


--------------FIX ------------------------

No need to update skb->csum in function inet_proto_csum_replace16(), even if skb->ip_summed == CHECKSUM_COMPLETE, because change in L4
header checksum field and change in IPV6 header cancels each other for skb->csum calculation. This is because skb->csum includes sum of 
[IPv6 Header + L4 Header + L4 Data]

-----------------------
Interesting Facts:

------------------Why Problem is not seen for IPV4--------------

IPV4 suffers with similar issue, but since 16-bit sum of IPV4 header is always zero [Due to IPv4 header checksum present in IPV4 header]. Skb->csum as per below state will correct:

skb->csum = [Original IPV6 Header + Modified UDP header + UDP data - Modified IPv6 Header] = [Modified UDP header + UDP data]


------------------Why Problem is not seen for IPV6 TCP--------------

Even though skb->csum will be incorrect, when packet hit tcp_v6_rcv()--> skb_checksum_init()--> __skb_checksum_validate().

TCP code start checksum calculation from scratch by assigning pseudo header sum in skb->csum and by returning 0 from skb_checksum_init().

With this fix,  validation will be successful in __skb_checksum_validate() even for tcpv6 packet, so kernel will not recalculate checksum.


------------------Why Problem is not seen in linux kernel version < 3.16--------------

Call to udp_csum_pull_header(skb) in udpv6_queue_rcv_one_skb(struct sock *sk, struct sk_buff *skb) has exposed the issue for UDPv6.

-------------------------------------------------------------------------------------

Note: 

-- Kindly let me know, If needed I can present entire packet and example with checksum calculation at each stage.

-------------------------------------------------------------------------------------



---------------------------Testing (with JPROBE LOGS) [AFTER FIX UDPv6 DNS query works fine]:---------------------------


1.) 
```
pchaudha@asw03:~$ sudo cat /etc/iptables/rules.v6 | grep SNAT
# SNAT
-A POSTROUTING -p tcp -m tcp --dport 443 -o Ethernet+ -j SNAT --to-source 2a04:xxxx:62:4::2
-A POSTROUTING -p udp -m udp --dport 53 -o Ethernet+ -j SNAT --to-source 2a04:xxxx:62:4::2
-A POSTROUTING -p tcp -m tcp --dport 53 -o Ethernet+ -j SNAT --to-source 2a04:xxxx:62:4::2
```


2.)
```
 pchaudha@asw03:~$ sudo /etc/network/if-pre-up.d/iptables
```
3.) UDPv6 DNS query:
```
pchaudha@asw03:~$ host facebook.com 2a04:xxxx:xx:1::c216
Using domain server:
Name: 2a04:xxxx:32:1::c216
Address: 2a04:xxxx:32:1::c216#53
Aliases:

facebook.com has address 157.240.11.35
facebook.com has IPv6 address 2a03:2880:f10d:183:face:b00c:0:25de
facebook.com mail is handled by 10 smtpin.vvv.facebook.com.
```
4.) JPROBE LOGS: 
```
Oct 22 05:51:20.748153 asw03 NOTICE kernel: [611954.922595] nf_nat_ipv6_manip_pkt: skb:ffff9faa7c9bf400 t->dst.protonum=17
Oct 22 05:51:20.748186 asw03 NOTICE kernel: [611954.922600] inet_proto_csum_replace16: skb:ffff9faa7c9bf400 ips=2 csumv=1 csumsw=1 csum=4914e05c
Oct 22 05:51:20.748192 asw03 NOTICE kernel: [611954.922601] inet_proto_csum_replace16: new sum:c734e05c <<<<<<<<
Oct 22 05:51:20.748195 asw03 NOTICE kernel: [611954.922605] ip6_rcv_finish: skb:ffff9faa7c9bf400 ips=2 csumv=1 csumsw=1 csum=4914e05c
```
5.) TCPv6 query:
```
pchaudha@asw03:~$ host -T facebook.com 2a04:xxxx:32:1::c216
Using domain server:
Name: 2a04:xxxx:32:1::c216
Address: 2a04:xxxx:32:1::c216#53
Aliases:

facebook.com has address 31.13.70.36
facebook.com has IPv6 address 2a03:2880:f10d:183:face:b00c:0:25de
facebook.com mail is handled by 10 smtpin.vvv.facebook.com.
--------------------------
```
6.) JPROBE LOGS:
```
Oct 22 18:13:29.199793 asw03 NOTICE kernel: [656483.605018] nf_nat_ipv6_manip_pkt: skb:ffff9faab3e79400 t->dst.protonum=6
Oct 22 18:13:29.199826 asw03 NOTICE kernel: [656483.605032] inet_proto_csum_replace16: skb:ffff9faab3e79400 ips=2 csumv=1 csumsw=1 csum=c6796dec
Oct 22 18:13:29.199831 asw03 NOTICE kernel: [656483.605033] inet_proto_csum_replace16: new sum:c4996dec
Oct 22 18:13:29.199835 asw03 NOTICE kernel: [656483.605038] ip6_rcv_finish: skb:ffff9faab3e79400 ips=2 csumv=1 csumsw=1 csum=c6796dec
```

Praveen Chaudhary (1):
  [netfilter]: Fix skb->csum calculation when netfilter manipulation for
    NF_NAT_MANIP_SRC\DST is done on IPV6 packet.

 net/core/utils.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

-- 
2.7.4


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2019-10-29 16:09 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-23 19:02 [PATCH] Fix skb->csum calculation when netfilter manipulation for NF_NAT_MANIP_SRC\DST is done on IPV6 packet Praveen Chaudhary
2019-10-23 19:02 ` [PATCH] [netfilter]: " Praveen Chaudhary
2019-10-23 19:33   ` Florian Westphal
2019-10-23 20:56     ` [PATCH] [netfilter]: Fix skb->csum calculation when netfilter Praveen Chaudhary
     [not found]     ` <CAJ_cd4qHM3kqz24Uywpyyz0mPz7axiNZk0Q385ROd4O8XZ11fA@mail.gmail.com>
2019-10-24  1:12       ` [PATCH] [netfilter]: Fix skb->csum calculation when netfilter manipulation for NF_NAT_MANIP_SRC\DST is done on IPV6 packet Florian Westphal
2019-10-24  5:19         ` [netfilter]: Fix skb->csum calculation when netfilter Praveen Chaudhary
2019-10-28 22:27 [PATCH] [netfilter]: Fix skb->csum calculation when netfilter manipulation for NF_NAT_MANIP_SRC\DST is done on IPV6 packet Praveen Chaudhary
2019-10-28 22:27 ` Praveen Chaudhary
2019-10-29 16:09   ` Florian Westphal

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.