All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] veth: don't modify ip-summed; doing so treats packets with bad checksums as good.
@ 2015-12-18 17:54 Vijay Pandurangan
  2015-12-18 19:00 ` Cong Wang
  0 siblings, 1 reply; 11+ messages in thread
From: Vijay Pandurangan @ 2015-12-18 17:54 UTC (permalink / raw)
  To: Nicolas Dichtel, Phil Sutter, Toshiaki Makita; +Cc: netdev, linux-kernel, ej

Packets that arrive from real hardware devices have ip_summed ==
CHECKSUM_UNNECESSARY if the hardware verified the checksums, or
CHECKSUM_NONE if the packet is bad or it was unable to verify it. The
current version of veth will replace CHECKSUM_NONE with
CHECKSUM_UNNECESSARY, which causes corrupt packets routed from hardware to
a veth device to be delivered to the application. This caused applications
at Twitter to receive corrupt data when network hardware was corrupting
packets.

We believe this was added as an optimization to skip computing and
verifying checksums for communication between containers. However, locally
generated packets have ip_summed == CHECKSUM_PARTIAL, so the code as
written does nothing for them. As far as we can tell, after removing this
code, these packets are transmitted from one stack to another unmodified
(tcpdump shows invalid checksums on both sides, as expected), and they are
delivered correctly to applications. We didn’t test every possible network
configuration, but we tried a few common ones such as bridging containers,
using NAT between the host and a container, and routing from hardware
devices to containers. We have effectively deployed this in production at
Twitter (by disabling RX checksum offloading on veth devices).

This code dates back to the first version of the driver, commit
<e314dbdc1c0dc6a548ecf> ("[NET]: Virtual ethernet device driver"), so I
suspect this bug occurred mostly because the driver API has evolved
significantly since then. Commit <0b7967503dc97864f283a> ("net/veth: Fix
packet checksumming") (in December 2010) fixed this for packets that get
created locally and sent to hardware devices, by not changing
CHECKSUM_PARTIAL. However, the same issue still occurs for packets coming
in from hardware devices.

Co-authored-by: Evan Jones <ej@evanjones.ca>
Signed-off-by: Evan Jones <ej@evanjones.ca>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Cc: Phil Sutter <phil@nwl.cc>
Cc: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Vijay Pandurangan <vijayp@vijayp.ca>
---
 drivers/net/veth.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 0ef4a5a..ba21d07 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -117,12 +117,6 @@ static netdev_tx_t veth_xmit(struct sk_buff *skb,
struct net_device *dev)
  kfree_skb(skb);
  goto drop;
  }
- /* don't change ip_summed == CHECKSUM_PARTIAL, as that
- * will cause bad checksum on forwarded packets
- */
- if (skb->ip_summed == CHECKSUM_NONE &&
-    rcv->features & NETIF_F_RXCSUM)
- skb->ip_summed = CHECKSUM_UNNECESSARY;

  if (likely(dev_forward_skb(rcv, skb) == NET_RX_SUCCESS)) {
  struct pcpu_vstats *stats = this_cpu_ptr(dev->vstats);
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] veth: don't modify ip-summed; doing so treats packets with bad checksums as good.
  2015-12-18 17:54 [PATCH] veth: don't modify ip-summed; doing so treats packets with bad checksums as good Vijay Pandurangan
@ 2015-12-18 19:00 ` Cong Wang
  2015-12-18 19:34   ` [PATCH] veth: don’t modify ip_summed; " Vijay Pandurangan
  2015-12-18 19:42   ` [PATCH] veth: don't modify ip-summed; " Vijay Pandurangan
  0 siblings, 2 replies; 11+ messages in thread
From: Cong Wang @ 2015-12-18 19:00 UTC (permalink / raw)
  To: Vijay Pandurangan
  Cc: Nicolas Dichtel, Phil Sutter, Toshiaki Makita,
	Linux Kernel Network Developers, LKML, ej, Eric Biederman,
	Tom Herbert

(Cc'ing Eric B and Tom)

On Fri, Dec 18, 2015 at 9:54 AM, Vijay Pandurangan <vijayp@vijayp.ca> wrote:
> Packets that arrive from real hardware devices have ip_summed ==
> CHECKSUM_UNNECESSARY if the hardware verified the checksums, or
> CHECKSUM_NONE if the packet is bad or it was unable to verify it. The
> current version of veth will replace CHECKSUM_NONE with
> CHECKSUM_UNNECESSARY, which causes corrupt packets routed from hardware to
> a veth device to be delivered to the application. This caused applications
> at Twitter to receive corrupt data when network hardware was corrupting
> packets.

Yeah, https://reviews.apache.org/r/41158/.

This is because normally packets to a veth device are _only_ from its pair
device, Mesos network isolator redirects packets from a hardware interface
to veth, which violates this expectation. This is also why no one else sees
this bug. ;)

>
> We believe this was added as an optimization to skip computing and
> verifying checksums for communication between containers. However, locally
> generated packets have ip_summed == CHECKSUM_PARTIAL, so the code as
> written does nothing for them. As far as we can tell, after removing this
> code, these packets are transmitted from one stack to another unmodified
> (tcpdump shows invalid checksums on both sides, as expected), and they are
> delivered correctly to applications. We didn’t test every possible network
> configuration, but we tried a few common ones such as bridging containers,
> using NAT between the host and a container, and routing from hardware
> devices to containers. We have effectively deployed this in production at
> Twitter (by disabling RX checksum offloading on veth devices).


I am wondering if there is any other CHECKSUM_NONE case in the tx
path we could miss here. Mesos case is too special not only because
it redirects packets from hardware to veth, but also because it moves
packets from RX path to TX path.

Eric? Tom?

>
> This code dates back to the first version of the driver, commit
> <e314dbdc1c0dc6a548ecf> ("[NET]: Virtual ethernet device driver"), so I
> suspect this bug occurred mostly because the driver API has evolved
> significantly since then. Commit <0b7967503dc97864f283a> ("net/veth: Fix
> packet checksumming") (in December 2010) fixed this for packets that get
> created locally and sent to hardware devices, by not changing
> CHECKSUM_PARTIAL. However, the same issue still occurs for packets coming
> in from hardware devices.
>
> Co-authored-by: Evan Jones <ej@evanjones.ca>
> Signed-off-by: Evan Jones <ej@evanjones.ca>
> Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
> Cc: Phil Sutter <phil@nwl.cc>
> Cc: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
> Cc: netdev@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Vijay Pandurangan <vijayp@vijayp.ca>

Your patch looks good to me but your email client corrupts your patch,
so please resend.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH] veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.
  2015-12-18 19:00 ` Cong Wang
@ 2015-12-18 19:34   ` Vijay Pandurangan
  2015-12-19 21:41     ` Cong Wang
  2015-12-22 20:16       ` David Miller
  2015-12-18 19:42   ` [PATCH] veth: don't modify ip-summed; " Vijay Pandurangan
  1 sibling, 2 replies; 11+ messages in thread
From: Vijay Pandurangan @ 2015-12-18 19:34 UTC (permalink / raw)
  To: Cong Wang
  Cc: Vijay Pandurangan, Evan Jones, Nicolas Dichtel, Phil Sutter,
	Toshiaki Makita, netdev, linux-kernel

Packets that arrive from real hardware devices have ip_summed ==
CHECKSUM_UNNECESSARY if the hardware verified the checksums, or
CHECKSUM_NONE if the packet is bad or it was unable to verify it. The
current version of veth will replace CHECKSUM_NONE with
CHECKSUM_UNNECESSARY, which causes corrupt packets routed from hardware to
a veth device to be delivered to the application. This caused applications
at Twitter to receive corrupt data when network hardware was corrupting
packets.

We believe this was added as an optimization to skip computing and
verifying checksums for communication between containers. However, locally
generated packets have ip_summed == CHECKSUM_PARTIAL, so the code as
written does nothing for them. As far as we can tell, after removing this
code, these packets are transmitted from one stack to another unmodified
(tcpdump shows invalid checksums on both sides, as expected), and they are
delivered correctly to applications. We didn’t test every possible network
configuration, but we tried a few common ones such as bridging containers,
using NAT between the host and a container, and routing from hardware
devices to containers. We have effectively deployed this in production at
Twitter (by disabling RX checksum offloading on veth devices).

This code dates back to the first version of the driver, commit
<e314dbdc1c0dc6a548ecf> ("[NET]: Virtual ethernet device driver"), so I
suspect this bug occurred mostly because the driver API has evolved
significantly since then. Commit <0b7967503dc97864f283a> ("net/veth: Fix
packet checksumming") (in December 2010) fixed this for packets that get
created locally and sent to hardware devices, by not changing
CHECKSUM_PARTIAL. However, the same issue still occurs for packets coming
in from hardware devices.

Co-authored-by: Evan Jones <ej@evanjones.ca>
Signed-off-by: Evan Jones <ej@evanjones.ca>
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Cc: Phil Sutter <phil@nwl.cc>
Cc: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Vijay Pandurangan <vijayp@vijayp.ca>
---
 drivers/net/veth.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 0ef4a5a..ba21d07 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -117,12 +117,6 @@ static netdev_tx_t veth_xmit(struct sk_buff *skb, struct net_device *dev)
 		kfree_skb(skb);
 		goto drop;
 	}
-	/* don't change ip_summed == CHECKSUM_PARTIAL, as that
-	 * will cause bad checksum on forwarded packets
-	 */
-	if (skb->ip_summed == CHECKSUM_NONE &&
-	    rcv->features & NETIF_F_RXCSUM)
-		skb->ip_summed = CHECKSUM_UNNECESSARY;
 
 	if (likely(dev_forward_skb(rcv, skb) == NET_RX_SUCCESS)) {
 		struct pcpu_vstats *stats = this_cpu_ptr(dev->vstats);
-- 
2.5.0


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH] veth: don't modify ip-summed; doing so treats packets with bad checksums as good.
  2015-12-18 19:00 ` Cong Wang
  2015-12-18 19:34   ` [PATCH] veth: don’t modify ip_summed; " Vijay Pandurangan
@ 2015-12-18 19:42   ` Vijay Pandurangan
  2015-12-19 21:01     ` Cong Wang
  1 sibling, 1 reply; 11+ messages in thread
From: Vijay Pandurangan @ 2015-12-18 19:42 UTC (permalink / raw)
  To: Cong Wang
  Cc: Nicolas Dichtel, Phil Sutter, Toshiaki Makita,
	Linux Kernel Network Developers, LKML, Evan Jones,
	Eric Biederman, Tom Herbert

Evan and I have demonstrated this bug on Kubernetes as well, so it's
not just a problem in Mesos. (See
https://github.com/kubernetes/kubernetes/issues/18898)

Sorry about my email client, I've re-sent the patch in another thread
from git-email as I should have initially.

I'll read through the TX path again to see if we missed something, but
I'd love input from anyone else!

--

Vijay Pandurangan
https://www.twitter.com/vijayp
http://www.vijayp.ca


On Fri, Dec 18, 2015 at 2:00 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> (Cc'ing Eric B and Tom)
>
> On Fri, Dec 18, 2015 at 9:54 AM, Vijay Pandurangan <vijayp@vijayp.ca> wrote:
>> Packets that arrive from real hardware devices have ip_summed ==
>> CHECKSUM_UNNECESSARY if the hardware verified the checksums, or
>> CHECKSUM_NONE if the packet is bad or it was unable to verify it. The
>> current version of veth will replace CHECKSUM_NONE with
>> CHECKSUM_UNNECESSARY, which causes corrupt packets routed from hardware to
>> a veth device to be delivered to the application. This caused applications
>> at Twitter to receive corrupt data when network hardware was corrupting
>> packets.
>
> Yeah, https://reviews.apache.org/r/41158/.
>
> This is because normally packets to a veth device are _only_ from its pair
> device, Mesos network isolator redirects packets from a hardware interface
> to veth, which violates this expectation. This is also why no one else sees
> this bug. ;)
>
>>
>> We believe this was added as an optimization to skip computing and
>> verifying checksums for communication between containers. However, locally
>> generated packets have ip_summed == CHECKSUM_PARTIAL, so the code as
>> written does nothing for them. As far as we can tell, after removing this
>> code, these packets are transmitted from one stack to another unmodified
>> (tcpdump shows invalid checksums on both sides, as expected), and they are
>> delivered correctly to applications. We didn’t test every possible network
>> configuration, but we tried a few common ones such as bridging containers,
>> using NAT between the host and a container, and routing from hardware
>> devices to containers. We have effectively deployed this in production at
>> Twitter (by disabling RX checksum offloading on veth devices).
>
>
> I am wondering if there is any other CHECKSUM_NONE case in the tx
> path we could miss here. Mesos case is too special not only because
> it redirects packets from hardware to veth, but also because it moves
> packets from RX path to TX path.
>
> Eric? Tom?
>
>>
>> This code dates back to the first version of the driver, commit
>> <e314dbdc1c0dc6a548ecf> ("[NET]: Virtual ethernet device driver"), so I
>> suspect this bug occurred mostly because the driver API has evolved
>> significantly since then. Commit <0b7967503dc97864f283a> ("net/veth: Fix
>> packet checksumming") (in December 2010) fixed this for packets that get
>> created locally and sent to hardware devices, by not changing
>> CHECKSUM_PARTIAL. However, the same issue still occurs for packets coming
>> in from hardware devices.
>>
>> Co-authored-by: Evan Jones <ej@evanjones.ca>
>> Signed-off-by: Evan Jones <ej@evanjones.ca>
>> Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
>> Cc: Phil Sutter <phil@nwl.cc>
>> Cc: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
>> Cc: netdev@vger.kernel.org
>> Cc: linux-kernel@vger.kernel.org
>> Signed-off-by: Vijay Pandurangan <vijayp@vijayp.ca>
>
> Your patch looks good to me but your email client corrupts your patch,
> so please resend.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] veth: don't modify ip-summed; doing so treats packets with bad checksums as good.
  2015-12-18 19:42   ` [PATCH] veth: don't modify ip-summed; " Vijay Pandurangan
@ 2015-12-19 21:01     ` Cong Wang
  2015-12-19 21:34       ` Cong Wang
  0 siblings, 1 reply; 11+ messages in thread
From: Cong Wang @ 2015-12-19 21:01 UTC (permalink / raw)
  To: Vijay Pandurangan
  Cc: Nicolas Dichtel, Phil Sutter, Toshiaki Makita,
	Linux Kernel Network Developers, LKML, Evan Jones,
	Eric Biederman, Tom Herbert

On Fri, Dec 18, 2015 at 11:42 AM, Vijay Pandurangan <vijayp@vijayp.ca> wrote:
> Evan and I have demonstrated this bug on Kubernetes as well, so it's
> not just a problem in Mesos. (See
> https://github.com/kubernetes/kubernetes/issues/18898)
>

Interesting... then this problem is much more serious than I thought.

Looks like in RX path the bridge sets the checksum to CHECKSUM_NONE
too:

static inline void skb_forward_csum(struct sk_buff *skb)
{
        /* Unfortunately we don't support this one.  Any brave souls? */
        if (skb->ip_summed == CHECKSUM_COMPLETE)
                skb->ip_summed = CHECKSUM_NONE;
}

I guess this is probably why Docker/Kubernetes could be affected too.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] veth: don't modify ip-summed; doing so treats packets with bad checksums as good.
  2015-12-19 21:01     ` Cong Wang
@ 2015-12-19 21:34       ` Cong Wang
  0 siblings, 0 replies; 11+ messages in thread
From: Cong Wang @ 2015-12-19 21:34 UTC (permalink / raw)
  To: Cong Wang
  Cc: Vijay Pandurangan, Nicolas Dichtel, Phil Sutter, Toshiaki Makita,
	Linux Kernel Network Developers, LKML, Evan Jones, Tom Herbert

On Sat, Dec 19, 2015 at 1:01 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> On Fri, Dec 18, 2015 at 11:42 AM, Vijay Pandurangan <vijayp@vijayp.ca> wrote:
>> Evan and I have demonstrated this bug on Kubernetes as well, so it's
>> not just a problem in Mesos. (See
>> https://github.com/kubernetes/kubernetes/issues/18898)
>>
>
> Interesting... then this problem is much more serious than I thought.
>
> Looks like in RX path the bridge sets the checksum to CHECKSUM_NONE
> too:
>
> static inline void skb_forward_csum(struct sk_buff *skb)
> {
>         /* Unfortunately we don't support this one.  Any brave souls? */
>         if (skb->ip_summed == CHECKSUM_COMPLETE)
>                 skb->ip_summed = CHECKSUM_NONE;
> }
>
> I guess this is probably why Docker/Kubernetes could be affected too.

Hmm, no, actually this is due to netem does the software checksum
and sets it to CHECKSUM_NONE:

        if (q->corrupt && q->corrupt >= get_crandom(&q->corrupt_cor)) {
                if (!(skb = skb_unshare(skb, GFP_ATOMIC)) ||
                    (skb->ip_summed == CHECKSUM_PARTIAL &&
                     skb_checksum_help(skb)))
                        return qdisc_drop(skb, sch);

                skb->data[prandom_u32() % skb_headlen(skb)] ^=
                        1<<(prandom_u32() % 8);
        }


But anyway, your patch still looks correct to me.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.
  2015-12-18 19:34   ` [PATCH] veth: don’t modify ip_summed; " Vijay Pandurangan
@ 2015-12-19 21:41     ` Cong Wang
  2015-12-22 20:16       ` David Miller
  1 sibling, 0 replies; 11+ messages in thread
From: Cong Wang @ 2015-12-19 21:41 UTC (permalink / raw)
  To: Vijay Pandurangan
  Cc: Evan Jones, Nicolas Dichtel, Phil Sutter, Toshiaki Makita,
	Linux Kernel Network Developers, LKML

On Fri, Dec 18, 2015 at 11:34 AM, Vijay Pandurangan <vijayp@vijayp.ca> wrote:
> Packets that arrive from real hardware devices have ip_summed ==
> CHECKSUM_UNNECESSARY if the hardware verified the checksums, or
> CHECKSUM_NONE if the packet is bad or it was unable to verify it. The
> current version of veth will replace CHECKSUM_NONE with
> CHECKSUM_UNNECESSARY, which causes corrupt packets routed from hardware to
> a veth device to be delivered to the application. This caused applications
> at Twitter to receive corrupt data when network hardware was corrupting
> packets.
>
> We believe this was added as an optimization to skip computing and
> verifying checksums for communication between containers. However, locally
> generated packets have ip_summed == CHECKSUM_PARTIAL, so the code as
> written does nothing for them. As far as we can tell, after removing this
> code, these packets are transmitted from one stack to another unmodified
> (tcpdump shows invalid checksums on both sides, as expected), and they are
> delivered correctly to applications. We didn’t test every possible network
> configuration, but we tried a few common ones such as bridging containers,
> using NAT between the host and a container, and routing from hardware
> devices to containers. We have effectively deployed this in production at
> Twitter (by disabling RX checksum offloading on veth devices).
>
> This code dates back to the first version of the driver, commit
> <e314dbdc1c0dc6a548ecf> ("[NET]: Virtual ethernet device driver"), so I
> suspect this bug occurred mostly because the driver API has evolved
> significantly since then. Commit <0b7967503dc97864f283a> ("net/veth: Fix
> packet checksumming") (in December 2010) fixed this for packets that get
> created locally and sent to hardware devices, by not changing
> CHECKSUM_PARTIAL. However, the same issue still occurs for packets coming
> in from hardware devices.
>
> Co-authored-by: Evan Jones <ej@evanjones.ca>
> Signed-off-by: Evan Jones <ej@evanjones.ca>
> Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
> Cc: Phil Sutter <phil@nwl.cc>
> Cc: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
> Cc: netdev@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Vijay Pandurangan <vijayp@vijayp.ca>


Acked-by: Cong Wang <cwang@twopensource.com>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.
  2015-12-18 19:34   ` [PATCH] veth: don’t modify ip_summed; " Vijay Pandurangan
@ 2015-12-22 20:16       ` David Miller
  2015-12-22 20:16       ` David Miller
  1 sibling, 0 replies; 11+ messages in thread
From: David Miller @ 2015-12-22 20:16 UTC (permalink / raw)
  To: vijayp
  Cc: xiyou.wangcong, ej, nicolas.dichtel, phil, makita.toshiaki,
	netdev, linux-kernel

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: Text/Plain; charset=iso-8859-7, Size: 2484 bytes --]

From: Vijay Pandurangan <vijayp@vijayp.ca>
Date: Fri, 18 Dec 2015 14:34:59 -0500

> Packets that arrive from real hardware devices have ip_summed ==
> CHECKSUM_UNNECESSARY if the hardware verified the checksums, or
> CHECKSUM_NONE if the packet is bad or it was unable to verify it. The
> current version of veth will replace CHECKSUM_NONE with
> CHECKSUM_UNNECESSARY, which causes corrupt packets routed from hardware to
> a veth device to be delivered to the application. This caused applications
> at Twitter to receive corrupt data when network hardware was corrupting
> packets.
> 
> We believe this was added as an optimization to skip computing and
> verifying checksums for communication between containers. However, locally
> generated packets have ip_summed == CHECKSUM_PARTIAL, so the code as
> written does nothing for them. As far as we can tell, after removing this
> code, these packets are transmitted from one stack to another unmodified
> (tcpdump shows invalid checksums on both sides, as expected), and they are
> delivered correctly to applications. We didn¢t test every possible network
> configuration, but we tried a few common ones such as bridging containers,
> using NAT between the host and a container, and routing from hardware
> devices to containers. We have effectively deployed this in production at
> Twitter (by disabling RX checksum offloading on veth devices).
> 
> This code dates back to the first version of the driver, commit
> <e314dbdc1c0dc6a548ecf> ("[NET]: Virtual ethernet device driver"), so I
> suspect this bug occurred mostly because the driver API has evolved
> significantly since then. Commit <0b7967503dc97864f283a> ("net/veth: Fix
> packet checksumming") (in December 2010) fixed this for packets that get
> created locally and sent to hardware devices, by not changing
> CHECKSUM_PARTIAL. However, the same issue still occurs for packets coming
> in from hardware devices.
> 
> Co-authored-by: Evan Jones <ej@evanjones.ca>
> Signed-off-by: Evan Jones <ej@evanjones.ca>
> Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
> Cc: Phil Sutter <phil@nwl.cc>
> Cc: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
> Cc: netdev@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Vijay Pandurangan <vijayp@vijayp.ca>

Applied and queued up for -stable, thanks.
ÿôèº{.nÇ+‰·Ÿ®‰­†+%ŠËÿ±éݶ\x17¥Šwÿº{.nÇ+‰·¥Š{±þG«éÿŠ{ayº\x1dʇڙë,j\a­¢f£¢·hšïêÿ‘êçz_è®\x03(­éšŽŠÝ¢j"ú\x1a¶^[m§ÿÿ¾\a«þG«éÿ¢¸?™¨è­Ú&£ø§~á¶iO•æ¬z·švØ^\x14\x04\x1a¶^[m§ÿÿÃ\fÿ¶ìÿ¢¸?–I¥

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.
@ 2015-12-22 20:16       ` David Miller
  0 siblings, 0 replies; 11+ messages in thread
From: David Miller @ 2015-12-22 20:16 UTC (permalink / raw)
  To: vijayp
  Cc: xiyou.wangcong, ej, nicolas.dichtel, phil, makita.toshiaki,
	netdev, linux-kernel

From: Vijay Pandurangan <vijayp@vijayp.ca>
Date: Fri, 18 Dec 2015 14:34:59 -0500

> Packets that arrive from real hardware devices have ip_summed ==
> CHECKSUM_UNNECESSARY if the hardware verified the checksums, or
> CHECKSUM_NONE if the packet is bad or it was unable to verify it. The
> current version of veth will replace CHECKSUM_NONE with
> CHECKSUM_UNNECESSARY, which causes corrupt packets routed from hardware to
> a veth device to be delivered to the application. This caused applications
> at Twitter to receive corrupt data when network hardware was corrupting
> packets.
> 
> We believe this was added as an optimization to skip computing and
> verifying checksums for communication between containers. However, locally
> generated packets have ip_summed == CHECKSUM_PARTIAL, so the code as
> written does nothing for them. As far as we can tell, after removing this
> code, these packets are transmitted from one stack to another unmodified
> (tcpdump shows invalid checksums on both sides, as expected), and they are
> delivered correctly to applications. We didn’t test every possible network
> configuration, but we tried a few common ones such as bridging containers,
> using NAT between the host and a container, and routing from hardware
> devices to containers. We have effectively deployed this in production at
> Twitter (by disabling RX checksum offloading on veth devices).
> 
> This code dates back to the first version of the driver, commit
> <e314dbdc1c0dc6a548ecf> ("[NET]: Virtual ethernet device driver"), so I
> suspect this bug occurred mostly because the driver API has evolved
> significantly since then. Commit <0b7967503dc97864f283a> ("net/veth: Fix
> packet checksumming") (in December 2010) fixed this for packets that get
> created locally and sent to hardware devices, by not changing
> CHECKSUM_PARTIAL. However, the same issue still occurs for packets coming
> in from hardware devices.
> 
> Co-authored-by: Evan Jones <ej@evanjones.ca>
> Signed-off-by: Evan Jones <ej@evanjones.ca>
> Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
> Cc: Phil Sutter <phil@nwl.cc>
> Cc: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
> Cc: netdev@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Vijay Pandurangan <vijayp@vijayp.ca>

Applied and queued up for -stable, thanks.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.
       [not found]       ` <CAKUBDd83j1yL+HR24K+qdQBcN66BZ4U50Dj7H_STRrjHb85-zA@mail.gmail.com>
@ 2015-12-23 17:57         ` Cong Wang
  2016-02-11 19:19           ` Vijay Pandurangan
  0 siblings, 1 reply; 11+ messages in thread
From: Cong Wang @ 2015-12-23 17:57 UTC (permalink / raw)
  To: Vijay Pandurangan
  Cc: David Miller, Evan Jones, Nicolas Dichtel, Phil Sutter,
	Toshiaki Makita, Linux Kernel Network Developers, LKML

On Tue, Dec 22, 2015 at 11:37 PM, Vijay Pandurangan <vijayp@vijayp.ca> wrote:
> Cool, thanks! I see it in the -stable queue. Is there anything else I need
> to do to help with getting this into main or backporting?  Happy to pitch in
> if I can be helpful.
>

DaveM usually just backports it to a few recent stable tree, if you want
to backport further, for example 3.14, you probably need to send
the commit ID to Greg KH.

Thanks.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] veth: don’t modify ip_summed; doing so treats packets with bad checksums as good.
  2015-12-23 17:57         ` Cong Wang
@ 2016-02-11 19:19           ` Vijay Pandurangan
  0 siblings, 0 replies; 11+ messages in thread
From: Vijay Pandurangan @ 2016-02-11 19:19 UTC (permalink / raw)
  To: Cong Wang
  Cc: David Miller, Evan Jones, Nicolas Dichtel, Phil Sutter,
	Toshiaki Makita, Linux Kernel Network Developers, LKML

Just as a followup, I wrote a short blog detailing the bug and our
resolution: (https://twitter.com/vijayp/status/697837808417779716)

Thanks again for your help in guiding us through our first kernel
patch. This was a great experience!

direct link: https://medium.com/vijay-pandurangan/linux-kernel-bug-delivers-corrupt-tcp-ip-data-to-mesos-kubernetes-docker-containers-4986f88f7a19#.aymvnbaa8

On Wed, Dec 23, 2015 at 9:57 AM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> On Tue, Dec 22, 2015 at 11:37 PM, Vijay Pandurangan <vijayp@vijayp.ca> wrote:
>> Cool, thanks! I see it in the -stable queue. Is there anything else I need
>> to do to help with getting this into main or backporting?  Happy to pitch in
>> if I can be helpful.
>>
>
> DaveM usually just backports it to a few recent stable tree, if you want
> to backport further, for example 3.14, you probably need to send
> the commit ID to Greg KH.
>
> Thanks.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2016-02-11 19:20 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-12-18 17:54 [PATCH] veth: don't modify ip-summed; doing so treats packets with bad checksums as good Vijay Pandurangan
2015-12-18 19:00 ` Cong Wang
2015-12-18 19:34   ` [PATCH] veth: don’t modify ip_summed; " Vijay Pandurangan
2015-12-19 21:41     ` Cong Wang
2015-12-22 20:16     ` David Miller
2015-12-22 20:16       ` David Miller
     [not found]       ` <CAKUBDd83j1yL+HR24K+qdQBcN66BZ4U50Dj7H_STRrjHb85-zA@mail.gmail.com>
2015-12-23 17:57         ` Cong Wang
2016-02-11 19:19           ` Vijay Pandurangan
2015-12-18 19:42   ` [PATCH] veth: don't modify ip-summed; " Vijay Pandurangan
2015-12-19 21:01     ` Cong Wang
2015-12-19 21:34       ` Cong Wang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.