All of lore.kernel.org
 help / color / mirror / Atom feed
* Re: [MPTCP] KSelftests: errors last night (20190604)
@ 2019-06-04 20:59 Mat Martineau
  0 siblings, 0 replies; 6+ messages in thread
From: Mat Martineau @ 2019-06-04 20:59 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 1702 bytes --]


On Tue, 4 Jun 2019, Matthieu Baerts wrote:

> Hello,
>
> Last night, I got an error with one test:
>
> (...)
> # ns2  TCP   -> ns3  (10.0.2.2:10029) MPTCP	[ OK ]
> # ns2  MPTCP -> ns3  (10.0.3.2:10030) MPTCP	[ OK ]
> # ns2  MPTCP -> ns3  (10.0.3.2:10031) TCP  	[ OK ]
> # ns2  TCP   -> ns3  (10.0.3.2:10032) MPTCP	[ OK ]
> # ns2  MPTCP -> ns4  (10.0.3.1:10033) MPTCP	main_loop_s: timed out
> # copyfd_io: poll timed out (events: POLLIN 1, POLLOUT 4)
> # [ FAIL ] client exit code 2, server 2
> # \nnetns ns4 socket stat for 10033:
> # State     Recv-Q   Send-Q        Local Address:Port        Peer
> Address:Port
> # SYN-RECV  0        0                  10.0.3.1:10033
> 10.0.2.1:37248    timer:(on,5.020ms,3) cwnd:2 retrans:3/0 reordering:0
> # \nnetns ns2 socket stat for 10033:
> # State       Recv-Q   Send-Q      Local Address:Port        Peer
> Address:Port
> # FIN-WAIT-1  0        49153            10.0.2.1:37248
> 10.0.3.1:10033    timer:(on,3.280ms,5) rto:6.75 cwnd:1 ssthresh:7
> retrans:5/0 reordering:0
> # ns3  MPTCP -> ns1  (10.0.1.1:10034) MPTCP	[ OK ]
> # ns3  MPTCP -> ns1  (10.0.1.1:10035) TCP  	[ OK ]
> # ns3  TCP   -> ns1  (10.0.1.1:10036) MPTCP	[ OK ]
> # ns3  MPTCP -> ns2  (10.0.1.2:10037) MPTCP	[ OK ]
> (...)
>
> The same tests were running fine just before without KASAN and
> PROVE_LOCKING.
>
> I just re-launched the tests and I no longer have this error.
>
> Anyone else saw this error?

Before Paolo fixed the skb_ext copying problem, I used to see this 
poll timeout regularly, but I haven't seen it in a while. I will keep a 
look out for the error and try to reproduce it if it comes up again.

--
Mat Martineau
Intel

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [MPTCP] KSelftests: errors last night (20190604)
@ 2019-07-01 18:15 Mat Martineau
  0 siblings, 0 replies; 6+ messages in thread
From: Mat Martineau @ 2019-07-01 18:15 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 2147 bytes --]


On Mon, 1 Jul 2019, Paolo Abeni wrote:

> On Fri, 2019-06-28 at 15:47 +0200, Matthieu Baerts wrote:
>> (...)
>> 00:13:23.223 # ns1  TCP   -> ns2  (10.0.1.2:10005) MPTCP	[ OK ]
>> 00:13:24.256 # ns1  MPTCP -> ns2  (10.0.2.1:10006) MPTCP	[ OK ]
>> 00:13:25.292 # ns1  MPTCP -> ns2  (10.0.2.1:10007) TCP  	[ OK ]
>> 00:13:26.338 # ns1  TCP   -> ns2  (10.0.2.1:10008) MPTCP	[ OK ]
>> 00:13:27.374 # ns1  MPTCP -> ns3  (10.0.2.2:10009) MPTCP	main_loop_s:
>> timed out
>> 00:13:58.414 # copyfd_io: poll timed out (events: POLLIN 1, POLLOUT 4)
>> 00:13:58.415 # [ FAIL ] client exit code 2, server 2
>> 00:13:58.415 # \nnetns ns3 socket stat for 10009:
>> 00:13:58.441 # State     Recv-Q   Send-Q        Local Address:Port
>>   Peer Address:Port
>> 00:13:58.441 # SYN-RECV  0        0                  10.0.2.2:10009
>>       10.0.1.1:32774    timer:(on,1.270ms,4) cwnd:2 retrans:4/0 reordering:0
>> 00:13:58.443 # \nnetns ns1 socket stat for 10009:
>> 00:13:58.455 # State       Recv-Q   Send-Q      Local Address:Port
>>   Peer Address:Port
>> 00:13:58.456 # FIN-WAIT-1  0        49153            10.0.1.1:32774
>>       10.0.2.2:10009    timer:(on,22sec,7) rto:25.72 cwnd:1 ssthresh:7
>> retrans:7/0 reordering:0
>> 00:13:58.460 # ns1  MPTCP -> ns3  (10.0.3.2:10010) MPTCP	[ OK ]
>> 00:13:59.502 # ns1  MPTCP -> ns3  (10.0.3.2:10011) TCP  	[ OK ]
>> 00:14:00.543 # ns1  TCP   -> ns3  (10.0.3.2:10012) MPTCP	[ OK ]
>> 00:14:01.585 # ns1  MPTCP -> ns4  (10.0.3.1:10013) MPTCP	[ OK ]
>> (...)
>
> I'm (/was) able to reproduce the issue locally, but not reliably.
>
> @Mat: is "mptcp: Make MPTCP socket block/wakeup ignore
> sk_receive_queue" somehow related to this issue ? (e.g. an attempt to
> fix it)
>

That patch is a fix for a blocking issue I noticed in a different testing 
scenario, where recvmsg was blocking after a peer disconnected.

I was speculating that my patch since it looked like something was 
blocking too long - but examining the output more closely, that SYN-RECV 
state on one peer and the send-q on the other make this bug seem like 
something different.

--
Mat Martineau
Intel

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [MPTCP] KSelftests: errors last night (20190604)
@ 2019-07-01 17:24 Paolo Abeni
  0 siblings, 0 replies; 6+ messages in thread
From: Paolo Abeni @ 2019-07-01 17:24 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 1675 bytes --]

On Fri, 2019-06-28 at 15:47 +0200, Matthieu Baerts wrote:
> (...)
> 00:13:23.223 # ns1  TCP   -> ns2  (10.0.1.2:10005) MPTCP	[ OK ]
> 00:13:24.256 # ns1  MPTCP -> ns2  (10.0.2.1:10006) MPTCP	[ OK ]
> 00:13:25.292 # ns1  MPTCP -> ns2  (10.0.2.1:10007) TCP  	[ OK ]
> 00:13:26.338 # ns1  TCP   -> ns2  (10.0.2.1:10008) MPTCP	[ OK ]
> 00:13:27.374 # ns1  MPTCP -> ns3  (10.0.2.2:10009) MPTCP	main_loop_s:
> timed out
> 00:13:58.414 # copyfd_io: poll timed out (events: POLLIN 1, POLLOUT 4)
> 00:13:58.415 # [ FAIL ] client exit code 2, server 2
> 00:13:58.415 # \nnetns ns3 socket stat for 10009:
> 00:13:58.441 # State     Recv-Q   Send-Q        Local Address:Port
>   Peer Address:Port
> 00:13:58.441 # SYN-RECV  0        0                  10.0.2.2:10009
>       10.0.1.1:32774    timer:(on,1.270ms,4) cwnd:2 retrans:4/0 reordering:0
> 00:13:58.443 # \nnetns ns1 socket stat for 10009:
> 00:13:58.455 # State       Recv-Q   Send-Q      Local Address:Port
>   Peer Address:Port
> 00:13:58.456 # FIN-WAIT-1  0        49153            10.0.1.1:32774
>       10.0.2.2:10009    timer:(on,22sec,7) rto:25.72 cwnd:1 ssthresh:7
> retrans:7/0 reordering:0
> 00:13:58.460 # ns1  MPTCP -> ns3  (10.0.3.2:10010) MPTCP	[ OK ]
> 00:13:59.502 # ns1  MPTCP -> ns3  (10.0.3.2:10011) TCP  	[ OK ]
> 00:14:00.543 # ns1  TCP   -> ns3  (10.0.3.2:10012) MPTCP	[ OK ]
> 00:14:01.585 # ns1  MPTCP -> ns4  (10.0.3.1:10013) MPTCP	[ OK ]
> (...)

I'm (/was) able to reproduce the issue locally, but not reliably. 

@Mat: is "mptcp: Make MPTCP socket block/wakeup ignore
sk_receive_queue" somehow related to this issue ? (e.g. an attempt to
fix it)

Thanks,

Paolo


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [MPTCP] KSelftests: errors last night (20190604)
@ 2019-06-28 13:47 Matthieu Baerts
  0 siblings, 0 replies; 6+ messages in thread
From: Matthieu Baerts @ 2019-06-28 13:47 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 3121 bytes --]

Hello

Last night I also got a similar issue:

(...)
00:13:23.223 # ns1  TCP   -> ns2  (10.0.1.2:10005) MPTCP	[ OK ]
00:13:24.256 # ns1  MPTCP -> ns2  (10.0.2.1:10006) MPTCP	[ OK ]
00:13:25.292 # ns1  MPTCP -> ns2  (10.0.2.1:10007) TCP  	[ OK ]
00:13:26.338 # ns1  TCP   -> ns2  (10.0.2.1:10008) MPTCP	[ OK ]
00:13:27.374 # ns1  MPTCP -> ns3  (10.0.2.2:10009) MPTCP	main_loop_s:
timed out
00:13:58.414 # copyfd_io: poll timed out (events: POLLIN 1, POLLOUT 4)
00:13:58.415 # [ FAIL ] client exit code 2, server 2
00:13:58.415 # \nnetns ns3 socket stat for 10009:
00:13:58.441 # State     Recv-Q   Send-Q        Local Address:Port
  Peer Address:Port
00:13:58.441 # SYN-RECV  0        0                  10.0.2.2:10009
      10.0.1.1:32774    timer:(on,1.270ms,4) cwnd:2 retrans:4/0 reordering:0
00:13:58.443 # \nnetns ns1 socket stat for 10009:
00:13:58.455 # State       Recv-Q   Send-Q      Local Address:Port
  Peer Address:Port
00:13:58.456 # FIN-WAIT-1  0        49153            10.0.1.1:32774
      10.0.2.2:10009    timer:(on,22sec,7) rto:25.72 cwnd:1 ssthresh:7
retrans:7/0 reordering:0
00:13:58.460 # ns1  MPTCP -> ns3  (10.0.3.2:10010) MPTCP	[ OK ]
00:13:59.502 # ns1  MPTCP -> ns3  (10.0.3.2:10011) TCP  	[ OK ]
00:14:00.543 # ns1  TCP   -> ns3  (10.0.3.2:10012) MPTCP	[ OK ]
00:14:01.585 # ns1  MPTCP -> ns4  (10.0.3.1:10013) MPTCP	[ OK ]
(...)

But I was not able to reproduce it today.

Cheers,
Matt

On 04/06/2019 14:18, Matthieu Baerts wrote:
> Hello,
> 
> Last night, I got an error with one test:
> 
> (...)
> # ns2  TCP   -> ns3  (10.0.2.2:10029) MPTCP	[ OK ]
> # ns2  MPTCP -> ns3  (10.0.3.2:10030) MPTCP	[ OK ]
> # ns2  MPTCP -> ns3  (10.0.3.2:10031) TCP  	[ OK ]
> # ns2  TCP   -> ns3  (10.0.3.2:10032) MPTCP	[ OK ]
> # ns2  MPTCP -> ns4  (10.0.3.1:10033) MPTCP	main_loop_s: timed out
> # copyfd_io: poll timed out (events: POLLIN 1, POLLOUT 4)
> # [ FAIL ] client exit code 2, server 2
> # \nnetns ns4 socket stat for 10033:
> # State     Recv-Q   Send-Q        Local Address:Port        Peer
> Address:Port
> # SYN-RECV  0        0                  10.0.3.1:10033
> 10.0.2.1:37248    timer:(on,5.020ms,3) cwnd:2 retrans:3/0 reordering:0
> # \nnetns ns2 socket stat for 10033:
> # State       Recv-Q   Send-Q      Local Address:Port        Peer
> Address:Port
> # FIN-WAIT-1  0        49153            10.0.2.1:37248
> 10.0.3.1:10033    timer:(on,3.280ms,5) rto:6.75 cwnd:1 ssthresh:7
> retrans:5/0 reordering:0
> # ns3  MPTCP -> ns1  (10.0.1.1:10034) MPTCP	[ OK ]
> # ns3  MPTCP -> ns1  (10.0.1.1:10035) TCP  	[ OK ]
> # ns3  TCP   -> ns1  (10.0.1.1:10036) MPTCP	[ OK ]
> # ns3  MPTCP -> ns2  (10.0.1.2:10037) MPTCP	[ OK ]
> (...)
> 
> The same tests were running fine just before without KASAN and
> PROVE_LOCKING.
> 
> I just re-launched the tests and I no longer have this error.
> 
> Anyone else saw this error?
> 
> Cheers,
> Matt
> 

-- 
Matthieu Baerts | R&D Engineer
matthieu.baerts(a)tessares.net
Tessares SA | Hybrid Access Solutions
www.tessares.net
1 Avenue Jean Monnet, 1348 Louvain-la-Neuve, Belgium

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [MPTCP] KSelftests: errors last night (20190604)
@ 2019-06-04 22:12 Florian Westphal
  0 siblings, 0 replies; 6+ messages in thread
From: Florian Westphal @ 2019-06-04 22:12 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 791 bytes --]

Mat Martineau <mathew.j.martineau(a)linux.intel.com> wrote:
> > The same tests were running fine just before without KASAN and
> > PROVE_LOCKING.
> > 
> > I just re-launched the tests and I no longer have this error.
> > 
> > Anyone else saw this error?
> 
> Before Paolo fixed the skb_ext copying problem, I used to see this poll
> timeout regularly, but I haven't seen it in a while. I will keep a look out
> for the error and try to reproduce it if it comes up again.

Paolo thinks its just because of the test script change that added
qdiscs with random loss and delay/reordering.

I've started a test run now (a ton of test script repeats) and will
test with just raising the timeout.

If that fixes things then Paolo is right and i will submit a test script
patch.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [MPTCP] KSelftests: errors last night (20190604)
@ 2019-06-04 12:18 Matthieu Baerts
  0 siblings, 0 replies; 6+ messages in thread
From: Matthieu Baerts @ 2019-06-04 12:18 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 1545 bytes --]

Hello,

Last night, I got an error with one test:

(...)
# ns2  TCP   -> ns3  (10.0.2.2:10029) MPTCP	[ OK ]
# ns2  MPTCP -> ns3  (10.0.3.2:10030) MPTCP	[ OK ]
# ns2  MPTCP -> ns3  (10.0.3.2:10031) TCP  	[ OK ]
# ns2  TCP   -> ns3  (10.0.3.2:10032) MPTCP	[ OK ]
# ns2  MPTCP -> ns4  (10.0.3.1:10033) MPTCP	main_loop_s: timed out
# copyfd_io: poll timed out (events: POLLIN 1, POLLOUT 4)
# [ FAIL ] client exit code 2, server 2
# \nnetns ns4 socket stat for 10033:
# State     Recv-Q   Send-Q        Local Address:Port        Peer
Address:Port
# SYN-RECV  0        0                  10.0.3.1:10033
10.0.2.1:37248    timer:(on,5.020ms,3) cwnd:2 retrans:3/0 reordering:0
# \nnetns ns2 socket stat for 10033:
# State       Recv-Q   Send-Q      Local Address:Port        Peer
Address:Port
# FIN-WAIT-1  0        49153            10.0.2.1:37248
10.0.3.1:10033    timer:(on,3.280ms,5) rto:6.75 cwnd:1 ssthresh:7
retrans:5/0 reordering:0
# ns3  MPTCP -> ns1  (10.0.1.1:10034) MPTCP	[ OK ]
# ns3  MPTCP -> ns1  (10.0.1.1:10035) TCP  	[ OK ]
# ns3  TCP   -> ns1  (10.0.1.1:10036) MPTCP	[ OK ]
# ns3  MPTCP -> ns2  (10.0.1.2:10037) MPTCP	[ OK ]
(...)

The same tests were running fine just before without KASAN and
PROVE_LOCKING.

I just re-launched the tests and I no longer have this error.

Anyone else saw this error?

Cheers,
Matt

-- 
Matthieu Baerts | R&D Engineer
matthieu.baerts(a)tessares.net
Tessares SA | Hybrid Access Solutions
www.tessares.net
1 Avenue Jean Monnet, 1348 Louvain-la-Neuve, Belgium

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2019-07-01 18:15 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-04 20:59 [MPTCP] KSelftests: errors last night (20190604) Mat Martineau
  -- strict thread matches above, loose matches on Subject: below --
2019-07-01 18:15 Mat Martineau
2019-07-01 17:24 Paolo Abeni
2019-06-28 13:47 Matthieu Baerts
2019-06-04 22:12 Florian Westphal
2019-06-04 12:18 Matthieu Baerts

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.