From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753335AbcD0P7s (ORCPT ); Wed, 27 Apr 2016 11:59:48 -0400 Received: from mail2.candelatech.com ([208.74.158.173]:56571 "EHLO mail2.candelatech.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750785AbcD0P7q (ORCPT ); Wed, 27 Apr 2016 11:59:46 -0400 Subject: =?UTF-8?Q?Re:_[PATCH_3.2_085/115]_veth:_don=e2=80=99t_modify_ip=5fs?= =?UTF-8?Q?ummed;_doing_so_treats_packets_with_bad_checksums_as_good.?= To: Ben Hutchings , linux-kernel@vger.kernel.org, stable@vger.kernel.org References: Cc: akpm@linux-foundation.org, "David S. Miller" , Vijay Pandurangan , Cong Wang , netdev@vger.kernel.org, Evan Jones , Nicolas Dichtel , Phil Sutter , Toshiaki Makita From: Ben Greear Organization: Candela Technologies Message-ID: <5720E1F0.9010203@candelatech.com> Date: Wed, 27 Apr 2016 08:59:44 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 04/26/2016 04:02 PM, Ben Hutchings wrote: > 3.2.80-rc1 review patch. If anyone has any objections, please let me know. I would be careful about this. It causes regressions when sending PACKET_SOCKET buffers from user-space to veth devices. There was a proposed upstream fix for the regression, but it has not gone into the tree as far as I know. http://www.spinics.net/lists/netdev/msg370436.html Thanks, Ben > > ------------------ > > From: Vijay Pandurangan > > [ Upstream commit ce8c839b74e3017996fad4e1b7ba2e2625ede82f ] > > Packets that arrive from real hardware devices have ip_summed == > CHECKSUM_UNNECESSARY if the hardware verified the checksums, or > CHECKSUM_NONE if the packet is bad or it was unable to verify it. The > current version of veth will replace CHECKSUM_NONE with > CHECKSUM_UNNECESSARY, which causes corrupt packets routed from hardware to > a veth device to be delivered to the application. This caused applications > at Twitter to receive corrupt data when network hardware was corrupting > packets. > > We believe this was added as an optimization to skip computing and > verifying checksums for communication between containers. However, locally > generated packets have ip_summed == CHECKSUM_PARTIAL, so the code as > written does nothing for them. As far as we can tell, after removing this > code, these packets are transmitted from one stack to another unmodified > (tcpdump shows invalid checksums on both sides, as expected), and they are > delivered correctly to applications. We didn’t test every possible network > configuration, but we tried a few common ones such as bridging containers, > using NAT between the host and a container, and routing from hardware > devices to containers. We have effectively deployed this in production at > Twitter (by disabling RX checksum offloading on veth devices). > > This code dates back to the first version of the driver, commit > ("[NET]: Virtual ethernet device driver"), so I > suspect this bug occurred mostly because the driver API has evolved > significantly since then. Commit <0b7967503dc97864f283a> ("net/veth: Fix > packet checksumming") (in December 2010) fixed this for packets that get > created locally and sent to hardware devices, by not changing > CHECKSUM_PARTIAL. However, the same issue still occurs for packets coming > in from hardware devices. > > Co-authored-by: Evan Jones > Signed-off-by: Evan Jones > Cc: Nicolas Dichtel > Cc: Phil Sutter > Cc: Toshiaki Makita > Cc: netdev@vger.kernel.org > Cc: linux-kernel@vger.kernel.org > Signed-off-by: Vijay Pandurangan > Acked-by: Cong Wang > Signed-off-by: David S. Miller > [bwh: Backported to 3.2: adjust context] > Signed-off-by: Ben Hutchings > --- > drivers/net/veth.c | 6 ------ > 1 file changed, 6 deletions(-) > > --- a/drivers/net/veth.c > +++ b/drivers/net/veth.c > @@ -126,11 +126,6 @@ static netdev_tx_t veth_xmit(struct sk_b > stats = this_cpu_ptr(priv->stats); > rcv_stats = this_cpu_ptr(rcv_priv->stats); > > - /* don't change ip_summed == CHECKSUM_PARTIAL, as that > - will cause bad checksum on forwarded packets */ > - if (skb->ip_summed == CHECKSUM_NONE && > - rcv->features & NETIF_F_RXCSUM) > - skb->ip_summed = CHECKSUM_UNNECESSARY; > > length = skb->len; > if (dev_forward_skb(rcv, skb) != NET_RX_SUCCESS) > -- Ben Greear Candela Technologies Inc http://www.candelatech.com