linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Unexpected timestamps in tcpdump with veth + tc qdisc netem delay
@ 2021-04-26 14:35 Henrique de Moraes Holschuh
  2021-04-26 17:07 ` Eric Dumazet
  0 siblings, 1 reply; 3+ messages in thread
From: Henrique de Moraes Holschuh @ 2021-04-26 14:35 UTC (permalink / raw)
  To: linux-kernel, netdev

[-- Attachment #1: Type: text/plain, Size: 3942 bytes --]

(please CC me in any replies, thank you!)

Hello,

While trying to simulate large delay links using veth and netns, I came across what looks like unexpected / incorrect behavior.

I have reproduced it in Debian 4.19 and 5.10 kernels, and a quick look at mainline doesn't show any relevant deviation from Debian kernels to mainline in my limited understanding of this area of the kernel.

I have attached a simple script to reproduce the scenario.  If my explanation below is not clear, please just look at the script to see what it does: it should be trivial to understand.  It needs tcpdump, and CAP_NET_ADMIN (or root, etc).

Topology

root netns:
   veth vec0 (192.168.233.1)   paired to ves0 (192.168.233.2)
   tc qdisc dev vec0 root netem delay 250ms

lab500ms netns:
   veth ves0 (192.168.233.2), paired to vec0 (192.168.233.1)
   tc qdisc dev ves0 root netem delay 250ms

So:
[root netns  -- veth (tc qdisc netem delay 250ms) ] <> [ veth (tc qdisc netem delay 250ms) -- lab500ms netns ]

Expected RTT from a packet roundtrip (root nets -> lab500ms netns -> root netns) is 500ms.


The problem:

[root netns]:  ping 192.168.233.2
PING 192.168.233.2 (192.168.233.2) 56(84) bytes of data.
64 bytes from 192.168.233.2: icmp_seq=1 ttl=64 time=500 ms
64 bytes from 192.168.233.2: icmp_seq=2 ttl=64 time=500 ms

(the RTT reported by ping is 500ms as expected: there is a 250ms transmit delay attached to each member of the veth pair)

However:

[root netns]: tcpdump -i vec0 -s0 -n -p net 192.168.233.0/30
listening on vec0, link-type EN10MB (Ethernet), capture size 262144 bytes
17:09:09.740681 IP 192.168.233.1 > 192.168.233.2: ICMP echo request, id 9327, seq 1, length 64
17:09:09.990891 IP 192.168.233.2 > 192.168.233.1: ICMP echo reply, id 9327, seq 1, length 64
17:09:10.741903 IP 192.168.233.1 > 192.168.233.2: ICMP echo request, id 9327, seq 2, length 64
17:09:10.992031 IP 192.168.233.2 > 192.168.233.1: ICMP echo reply, id 9327, seq 2, length 64
17:09:11.742813 IP 192.168.233.1 > 192.168.233.2: ICMP echo request, id 9327, seq 3, length 64
17:09:11.993009 IP 192.168.233.2 > 192.168.233.1: ICMP echo reply, id 9327, seq 3, length 64

[lab500ms netns]: ip netns exec lab500ms tcpdump -i ves0 -s0 -n -p net 192.168.233.0/30
listening on ves0, link-type EN10MB (Ethernet), capture size 262144 bytes
17:09:09.740724 IP 192.168.233.1 > 192.168.233.2: ICMP echo request, id 9327, seq 1, length 64
17:09:09.990867 IP 192.168.233.2 > 192.168.233.1: ICMP echo reply, id 9327, seq 1, length 64
17:09:10.741942 IP 192.168.233.1 > 192.168.233.2: ICMP echo request, id 9327, seq 2, length 64
17:09:10.992012 IP 192.168.233.2 > 192.168.233.1: ICMP echo reply, id 9327, seq 2, length 64
17:09:11.742851 IP 192.168.233.1 > 192.168.233.2: ICMP echo request, id 9327, seq 3, length 64
17:09:11.992985 IP 192.168.233.2 > 192.168.233.1: ICMP echo reply, id 9327, seq 3, length 64

One can see that the timestamps shown by tcpdump (also reproduced using wireshark) are *not* what one would expect: the 250ms delays are missing in incoming packets (i.e. there's 250ms missing from timestamps in packets "echo reply" in vec0, and "echo request" in ves0).

The 250ms vec0->ves0 delay AND 250ms ves0 -> vec0 delay *are* there, as shown by "ping", but you'd not know it if you look at the tcpdump.  The timing shown in tcpdump looks more like packet injection time at the first interface, than the time the packet was "seen" at the other end (capture interface).

Adding more namespaces and VETH pairs + routing "in a row" so that the packet "exits" one veth tunnel and enters another one (after trivial routing) doesn't fix the tcpdump timestamps in the capture at the other end of the veth-veth->routing->veth-veth->routing->... chain.

It looks like some sort of bug to me, but maybe I am missing something, in which case I would greatly appreciate an explanation of where I went wrong... 

Thanks in advance,
Henrique de Moraes Holschuh <hmh@hmh.eng.br>

[-- Attachment #2: netns.sh --]
[-- Type: application/x-sh, Size: 1858 bytes --]

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Unexpected timestamps in tcpdump with veth + tc qdisc netem delay
  2021-04-26 14:35 Unexpected timestamps in tcpdump with veth + tc qdisc netem delay Henrique de Moraes Holschuh
@ 2021-04-26 17:07 ` Eric Dumazet
  2021-04-27  2:34   ` Henrique de Moraes Holschuh
  0 siblings, 1 reply; 3+ messages in thread
From: Eric Dumazet @ 2021-04-26 17:07 UTC (permalink / raw)
  To: Henrique de Moraes Holschuh, linux-kernel, netdev



On 4/26/21 4:35 PM, Henrique de Moraes Holschuh wrote:
> (please CC me in any replies, thank you!)
> 
> Hello,
> 
> While trying to simulate large delay links using veth and netns, I came across what looks like unexpected / incorrect behavior.
> 
> I have reproduced it in Debian 4.19 and 5.10 kernels, and a quick look at mainline doesn't show any relevant deviation from Debian kernels to mainline in my limited understanding of this area of the kernel.
> 
> I have attached a simple script to reproduce the scenario.  If my explanation below is not clear, please just look at the script to see what it does: it should be trivial to understand.  It needs tcpdump, and CAP_NET_ADMIN (or root, etc).
> 
> Topology
> 
> root netns:
>    veth vec0 (192.168.233.1)   paired to ves0 (192.168.233.2)
>    tc qdisc dev vec0 root netem delay 250ms
> 
> lab500ms netns:
>    veth ves0 (192.168.233.2), paired to vec0 (192.168.233.1)
>    tc qdisc dev ves0 root netem delay 250ms
> 
> So:
> [root netns  -- veth (tc qdisc netem delay 250ms) ] <> [ veth (tc qdisc netem delay 250ms) -- lab500ms netns ]
> 
> Expected RTT from a packet roundtrip (root nets -> lab500ms netns -> root netns) is 500ms.
> 
> 
> The problem:
> 
> [root netns]:  ping 192.168.233.2
> PING 192.168.233.2 (192.168.233.2) 56(84) bytes of data.
> 64 bytes from 192.168.233.2: icmp_seq=1 ttl=64 time=500 ms
> 64 bytes from 192.168.233.2: icmp_seq=2 ttl=64 time=500 ms
> 
> (the RTT reported by ping is 500ms as expected: there is a 250ms transmit delay attached to each member of the veth pair)
> 
> However:
> 
> [root netns]: tcpdump -i vec0 -s0 -n -p net 192.168.233.0/30
> listening on vec0, link-type EN10MB (Ethernet), capture size 262144 bytes
> 17:09:09.740681 IP 192.168.233.1 > 192.168.233.2: ICMP echo request, id 9327, seq 1, length 64

Here you see the packet _after_ the 250ms delay

> 17:09:09.990891 IP 192.168.233.2 > 192.168.233.1: ICMP echo reply, id 9327, seq 1, length 64
Same here.

I do not see any problem.

> 17:09:10.741903 IP 192.168.233.1 > 192.168.233.2: ICMP echo request, id 9327, seq 2, length 64
> 17:09:10.992031 IP 192.168.233.2 > 192.168.233.1: ICMP echo reply, id 9327, seq 2, length 64
> 17:09:11.742813 IP 192.168.233.1 > 192.168.233.2: ICMP echo request, id 9327, seq 3, length 64
> 17:09:11.993009 IP 192.168.233.2 > 192.168.233.1: ICMP echo reply, id 9327, seq 3, length 64
> 
> [lab500ms netns]: ip netns exec lab500ms tcpdump -i ves0 -s0 -n -p net 192.168.233.0/30
> listening on ves0, link-type EN10MB (Ethernet), capture size 262144 bytes
> 17:09:09.740724 IP 192.168.233.1 > 192.168.233.2: ICMP echo request, id 9327, seq 1, length 64
> 17:09:09.990867 IP 192.168.233.2 > 192.168.233.1: ICMP echo reply, id 9327, seq 1, length 64
> 17:09:10.741942 IP 192.168.233.1 > 192.168.233.2: ICMP echo request, id 9327, seq 2, length 64
> 17:09:10.992012 IP 192.168.233.2 > 192.168.233.1: ICMP echo reply, id 9327, seq 2, length 64
> 17:09:11.742851 IP 192.168.233.1 > 192.168.233.2: ICMP echo request, id 9327, seq 3, length 64
> 17:09:11.992985 IP 192.168.233.2 > 192.168.233.1: ICMP echo reply, id 9327, seq 3, length 64
> 
> One can see that the timestamps shown by tcpdump (also reproduced using wireshark) are *not* what one would expect: the 250ms delays are missing in incoming packets (i.e. there's 250ms missing from timestamps in packets "echo reply" in vec0, and "echo request" in ves0).
> 
> The 250ms vec0->ves0 delay AND 250ms ves0 -> vec0 delay *are* there, as shown by "ping", but you'd not know it if you look at the tcpdump.  The timing shown in tcpdump looks more like packet injection time at the first interface, than the time the packet was "seen" at the other end (capture interface).
> 
> Adding more namespaces and VETH pairs + routing "in a row" so that the packet "exits" one veth tunnel and enters another one (after trivial routing) doesn't fix the tcpdump timestamps in the capture at the other end of the veth-veth->routing->veth-veth->routing->... chain.
> 
> It looks like some sort of bug to me, but maybe I am missing something, in which case I would greatly appreciate an explanation of where I went wrong... 
> 

That is only because you expect to see something, but you forgot that tcpdump captures
TX packet _after_ netem.


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Unexpected timestamps in tcpdump with veth + tc qdisc netem delay
  2021-04-26 17:07 ` Eric Dumazet
@ 2021-04-27  2:34   ` Henrique de Moraes Holschuh
  0 siblings, 0 replies; 3+ messages in thread
From: Henrique de Moraes Holschuh @ 2021-04-27  2:34 UTC (permalink / raw)
  To: Eric Dumazet, linux-kernel, netdev

On Mon, Apr 26, 2021, at 14:07, Eric Dumazet wrote:
> On 4/26/21 4:35 PM, Henrique de Moraes Holschuh wrote:
[...]
> > [root netns]: tcpdump -i vec0 -s0 -n -p net 192.168.233.0/30
> > listening on vec0, link-type EN10MB (Ethernet), capture size 262144 bytes
> > 17:09:09.740681 IP 192.168.233.1 > 192.168.233.2: ICMP echo request, id 9327, seq 1, length 64
> 
> Here you see the packet _after_ the 250ms delay
> > 17:09:09.990891 IP 192.168.233.2 > 192.168.233.1: ICMP echo reply, id 9327, seq 1, length 64
> Same here.
[...]
> > Adding more namespaces and VETH pairs + routing "in a row" so that the packet "exits" one veth tunnel and enters another one (after trivial routing) doesn't fix the tcpdump timestamps in the capture at the other end of the veth-veth->routing->veth-veth->routing->... chain.
> > 
> > It looks like some sort of bug to me, but maybe I am missing something, in which case I would greatly appreciate an explanation of where I went wrong... 
> 
> That is only because you expect to see something, but you forgot that tcpdump captures TX packet _after_ netem.

That was it!  Thank you very much for the quick reply, and direct, precise explanation.

I had completely forgotten about the qdisc running before interface timestamping, and I overlooked the fact that I did not try three netns in-a-chain with the qdiscs *in the middle netns*: I tried them in the two opposite edge netns, only.

Again, thank you!

-- 
  Henrique Holschuh

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-04-27  2:35 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-26 14:35 Unexpected timestamps in tcpdump with veth + tc qdisc netem delay Henrique de Moraes Holschuh
2021-04-26 17:07 ` Eric Dumazet
2021-04-27  2:34   ` Henrique de Moraes Holschuh

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).