From mboxrd@z Thu Jan 1 00:00:00 1970 From: Ed Swierk Subject: Re: [PATCH] openvswitch: Trim off padding before L3 conntrack processing Date: Thu, 14 Dec 2017 12:05:02 -0800 Message-ID: References: <1513095439-128864-1-git-send-email-eswierk@skyportsystems.com> Mime-Version: 1.0 Content-Type: text/plain; charset="UTF-8" Cc: ovs-dev , Linux Kernel Network Developers , Lance Richardson , Benjamin Warren , Keith Holleman To: Pravin Shelar Return-path: Received: from mail-it0-f65.google.com ([209.85.214.65]:46623 "EHLO mail-it0-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752298AbdLNUFo (ORCPT ); Thu, 14 Dec 2017 15:05:44 -0500 Received: by mail-it0-f65.google.com with SMTP id t1so14141826ite.5 for ; Thu, 14 Dec 2017 12:05:44 -0800 (PST) In-Reply-To: Sender: netdev-owner@vger.kernel.org List-ID: On Wed, Dec 13, 2017 at 4:58 PM, Pravin Shelar wrote: > On Tue, Dec 12, 2017 at 8:17 AM, Ed Swierk wrote: >> A short IPv4 packet may have up to 6 bytes of padding following the IP >> payload when received on an Ethernet device. >> >> In the normal IPv4 receive path, ip_rcv() trims the packet to >> ip_hdr->tot_len before invoking NF_INET_PRE_ROUTING hooks (including >> conntrack). Then any subsequent L3+ processing steps, like >> nf_checksum(), use skb->len as the length of the packet, rather than >> referring back to ip_hdr->tot_len. In the IPv6 receive path, ip6_rcv() >> does the same using ipv6_hdr->payload_len. >> >> In the OVS conntrack receive path, this trimming does not occur, so >> the checksum verification in tcp_header() fails, printing "nf_ct_tcp: >> bad TCP checksum". Extra zero bytes don't affect the checksum, but the >> length in the IP pseudoheader does. That length is based on skb->len, >> and without trimming, it doesn't match the length the sender used when >> computing the checksum. >> >> With this change, OVS conntrack trims IPv4 and IPv6 packets prior to >> L3 processing. >> >> Signed-off-by: Ed Swierk >> --- >> net/openvswitch/conntrack.c | 17 +++++++++++++++++ >> 1 file changed, 17 insertions(+) >> >> diff --git a/net/openvswitch/conntrack.c b/net/openvswitch/conntrack.c >> index d558e882ca0c..3a7c9215c431 100644 >> --- a/net/openvswitch/conntrack.c >> +++ b/net/openvswitch/conntrack.c >> @@ -1105,12 +1105,29 @@ int ovs_ct_execute(struct net *net, struct sk_buff *skb, >> const struct ovs_conntrack_info *info) >> { >> int nh_ofs; >> + unsigned int nh_len; >> int err; >> >> /* The conntrack module expects to be working at L3. */ >> nh_ofs = skb_network_offset(skb); >> skb_pull_rcsum(skb, nh_ofs); >> >> + /* Trim to L3 length since nf_checksum() doesn't expect padding. */ > Can you explore if nf_checksum can be changed to avoid the padding? The nf_ip_checksum() and nf_ip6_checksum() helper functions can easily be changed to avoid the padding. My worry is that conntrack is just one of many netfilter hooks that perform L3+ processing, and may assume that once skb->data points to the L3 header, skb->len reflects the length of the L3 header and payload. For example, in nf_conntrack_ftp.c, help() uses skb->len to determine the length of the FTP payload and the TCP sequence number of the next packet; this would be thrown off by lower-layer padding. br_netfilter, a cousin of OVS, has always preserved this assumption--like ip_rcv() and ip6_rcv(), br_validate_ipv4() and br_validate_ipv6() trim the skb to the L3 length before they invoke NF_INET_PRE_ROUTING hooks. Modifying OVS to fit the mold seems more straightforward than changing this assumption. >> + switch (skb->protocol) { >> + case htons(ETH_P_IP): >> + nh_len = ntohs(ip_hdr(skb)->tot_len); >> + break; >> + case htons(ETH_P_IPV6): >> + nh_len = ntohs(ipv6_hdr(skb)->payload_len) >> + + sizeof(struct ipv6hdr); >> + break; >> + default: >> + nh_len = skb->len; >> + } >> + err = pskb_trim_rcsum(skb, nh_len); >> + if (err) >> + return err; >> + > In case of error skb needs to be freed. Thanks, I will fix this. --Ed