All of lore.kernel.org
 help / color / mirror / Atom feed
* [MPTCP] Re: Warn + crash in mptcp_sendmsg_frag()
@ 2019-11-16 12:59 Matthieu Baerts
  0 siblings, 0 replies; 7+ messages in thread
From: Matthieu Baerts @ 2019-11-16 12:59 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 4643 bytes --]

Hi Paolo,

Thank you for your answer!

On 15/11/2019 23:39, Paolo Abeni wrote:
> On Fri, 2019-11-15 at 22:06 +0100, Matthieu Baerts wrote:
>> Hello,
>>
>> When testing some changes in the kselftests area, I initially saw the
>> message:
>>
>>     main_loop_s: timed out
>>
>> in the middle of the tests. But just after, there was a "[ OK ]".
>> Anybody saw that?
> 
> Yes. It happens when you kill the script via ctrl-c while the server
> program is running (in background). The server is not killed, and so it
> time-out in the background.
> 
> Meanwhile you run another iteration of the self-tests and you see the
> timeout from the previous iteration just in the middle of the current
> output.

Oh it makes sense! That's probably what I had!

>> I tried to understand why because the binary should returns 2 if the
>> message is printed and the shell script should catch the error. Then I
>> added debug in the script and launched it in a while-loop. I didn't get
>> the initial error I was looking for but instead I got a WARN followed by
>> an OOPS:
> 
> That is way worse :(
> 
>> Of course, in parallel I did other stuffs, I no longer have the proper
>> vmlinux and my MPTCP VM was not recent...
> 
> uhm... that's a pity... a decoded stack trace would help

The tests are running in a loop, I don't have any issue so far. Maybe 
because I was using a VM build last week.

>> I suspect that "skb" is NULL and we then have an issue with this chunk:
>>
>>
>>           collapsed = skb == tcp_write_queue_tail(ssk);
>>           if (collapsed) {
>>                   WARN_ON_ONCE(!can_collapse);
>>                   /* when collapsing mpext always exists */
>>                   mpext->data_len += ret;
>>                   goto out;
>>           }
>>
>>
>> Please also note that earlier in the function, skb is initialised with:
>>
>>       skb = tcp_write_queue_tail(ssk);
>>
>> So I guess collapsed is always true, no?
> 
> No. 'collapsed' is true only if do_tcp_sendpages() did not allocated a
> new skb - that is, collapsed the new data to the tail one. It should
> happen only when the 'can_collapse' variable is true.

OK, I missed that the queue could be modified by this function.

> I just notice that there is a very unlikely scenario where
> 
> 'collapsed = skb == tcp_write_queue_tail(ssk);'
> 
> is true, but do_tcp_sendpages() did not actually collapsed the data:
> 
> - do_tcp_sendpages() is called with 'can_collapse' == false.
> - do_tcp_sendpages() tries to create a new skb but hits memory limits
> and calls sk_stream_wait_memory()
> - sk_stream_wait_memory() release the socket lock and goes to sleep.
> - the tail skb is transmitted and freed, than do_tcp_sendpages() is
> woken again
> - do_tcp_sendpages() allocates a new skb and get a ptr to the last
> freed skb - that is, the 'old' tail just freed.
> - do_tcp_sendpages() complete successfully and mptcp_sendmsg_frag()
> sees skb == tcp_write_queue_tail(ssk), even if there was no collapsing.
> - havoc happens
> 
> All the above is very unlikely - after all I never observed that in
> practice - but still possible I think.

Thank you for your analysis!

I will let the tests running in a while loop for a few days and check if 
I get an issue.

>> If skb is NULL, mpext is NULL too (even if it could unlikely also be
>> even if skb is no NULL but sounds more like a bug I guess) and
>> can_collapse is False.
> 
> Uhm... I don't think this is the case. mpext is NULL even when
> do_tcp_sendpages() did not collapsed the data - that is in the above
> scenario.

I get the OOPS I got was because mpext was NULL, no?

BUG: kernel NULL pointer dereference, address: 0000000000000014

I didn't check with pahole but maybe linked to this instruction:

   mpext->data_len += ret;

> A possible solution would be let mptcp_sendmsg_frag() allocate the skb
> and do the memory scheduling for the subflow before calling
> do_tcp_sendpages(), so that it would never block.
> 
> Some/a lot of code duplication vs do_tcp_sendpages(), but that would
> allow us also to take care of the pending todo in
> mptcp_sendmsg_frag()[1] e.g. handle mptcp ext allocation failure
> gracefully.
> 
> Any better ideas?!?

#BPF? I thought it was the solution to all problems...

Maybe when MPTCP will be published, it will be possible to modify 
do_tcp_sendpages() :)

Cheers,
Matt
-- 
Matthieu Baerts | R&D Engineer
matthieu.baerts(a)tessares.net
Tessares SA | Hybrid Access Solutions
www.tessares.net
1 Avenue Jean Monnet, 1348 Louvain-la-Neuve, Belgium

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [MPTCP] Re: Warn + crash in mptcp_sendmsg_frag()
@ 2019-11-20 15:26 Paolo Abeni
  0 siblings, 0 replies; 7+ messages in thread
From: Paolo Abeni @ 2019-11-20 15:26 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 2045 bytes --]

On Wed, 2019-11-20 at 11:57 +0100, Matthieu Baerts wrote:
> Hi Paolo,
> 
> On 20/11/2019 11:18, Paolo Abeni wrote:
> > On Wed, 2019-11-20 at 10:51 +0100, Matthieu Baerts wrote:
> > > Hi Paolo,
> > > 
> > > On 18/11/2019 12:07, Paolo Abeni wrote:
> > > > On Sat, 2019-11-16 at 13:59 +0100, Matthieu Baerts wrote:
> > > > > On 15/11/2019 23:39, Paolo Abeni wrote:
> > > > > > On Fri, 2019-11-15 at 22:06 +0100, Matthieu Baerts wrote:
> > > > > > > If skb is NULL, mpext is NULL too (even if it could unlikely also be
> > > > > > > even if skb is no NULL but sounds more like a bug I guess) and
> > > > > > > can_collapse is False.
> > > > > > 
> > > > > > Uhm... I don't think this is the case. mpext is NULL even when
> > > > > > do_tcp_sendpages() did not collapsed the data - that is in the above
> > > > > > scenario.
> > > > > 
> > > > > I get the OOPS I got was because mpext was NULL, no?
> > > > > 
> > > > > BUG: kernel NULL pointer dereference, address: 0000000000000014
> > > > > 
> > > > > I didn't check with pahole but maybe linked to this instruction:
> > > > > 
> > > > >      mpext->data_len += ret;
> > > > 
> > > > Sorry, I was not clear.
> > > > 
> > > > yes, very likely the crash is due to mpext, but I don't think mpext is
> > > > NULL because skb is NULL. I think it's NULL because there was no
> > > > coalescing on skb (and thus no prior mpext in place).
> > > 
> > > Thank you, it is clearer!
> > > 
> > > I was not able to reproduce this bug by running the selftests for a few
> > > days so I guess this bug has been fixed. Sorry for the noise!
> > 
> > Uhm... if the diagnosys is correct, the bug is still there. It's just
> > very difficult to trigger... expecially without pktdrill ;)
> 
> Indeed, sorry.
> If it is very difficult to trigger, should we simply add a return/goto 
> if the WARN condition is valid? We can fix it properly later if it 
> really seen somewhere?

I have a tentative fix - but I can't test it ;)

Will try to push soon.

Cheers,

Paolo

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [MPTCP] Re: Warn + crash in mptcp_sendmsg_frag()
@ 2019-11-20 10:57 Matthieu Baerts
  0 siblings, 0 replies; 7+ messages in thread
From: Matthieu Baerts @ 2019-11-20 10:57 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 1902 bytes --]

Hi Paolo,

On 20/11/2019 11:18, Paolo Abeni wrote:
> On Wed, 2019-11-20 at 10:51 +0100, Matthieu Baerts wrote:
>> Hi Paolo,
>>
>> On 18/11/2019 12:07, Paolo Abeni wrote:
>>> On Sat, 2019-11-16 at 13:59 +0100, Matthieu Baerts wrote:
>>>> On 15/11/2019 23:39, Paolo Abeni wrote:
>>>>> On Fri, 2019-11-15 at 22:06 +0100, Matthieu Baerts wrote:
>>>>>> If skb is NULL, mpext is NULL too (even if it could unlikely also be
>>>>>> even if skb is no NULL but sounds more like a bug I guess) and
>>>>>> can_collapse is False.
>>>>>
>>>>> Uhm... I don't think this is the case. mpext is NULL even when
>>>>> do_tcp_sendpages() did not collapsed the data - that is in the above
>>>>> scenario.
>>>>
>>>> I get the OOPS I got was because mpext was NULL, no?
>>>>
>>>> BUG: kernel NULL pointer dereference, address: 0000000000000014
>>>>
>>>> I didn't check with pahole but maybe linked to this instruction:
>>>>
>>>>      mpext->data_len += ret;
>>>
>>> Sorry, I was not clear.
>>>
>>> yes, very likely the crash is due to mpext, but I don't think mpext is
>>> NULL because skb is NULL. I think it's NULL because there was no
>>> coalescing on skb (and thus no prior mpext in place).
>>
>> Thank you, it is clearer!
>>
>> I was not able to reproduce this bug by running the selftests for a few
>> days so I guess this bug has been fixed. Sorry for the noise!
> 
> Uhm... if the diagnosys is correct, the bug is still there. It's just
> very difficult to trigger... expecially without pktdrill ;)

Indeed, sorry.
If it is very difficult to trigger, should we simply add a return/goto 
if the WARN condition is valid? We can fix it properly later if it 
really seen somewhere?

Cheers,
Matt
-- 
Matthieu Baerts | R&D Engineer
matthieu.baerts(a)tessares.net
Tessares SA | Hybrid Access Solutions
www.tessares.net
1 Avenue Jean Monnet, 1348 Louvain-la-Neuve, Belgium

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [MPTCP] Re: Warn + crash in mptcp_sendmsg_frag()
@ 2019-11-20 10:18 Paolo Abeni
  0 siblings, 0 replies; 7+ messages in thread
From: Paolo Abeni @ 2019-11-20 10:18 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 1574 bytes --]

On Wed, 2019-11-20 at 10:51 +0100, Matthieu Baerts wrote:
> Hi Paolo,
> 
> On 18/11/2019 12:07, Paolo Abeni wrote:
> > On Sat, 2019-11-16 at 13:59 +0100, Matthieu Baerts wrote:
> > > On 15/11/2019 23:39, Paolo Abeni wrote:
> > > > On Fri, 2019-11-15 at 22:06 +0100, Matthieu Baerts wrote:
> > > > > If skb is NULL, mpext is NULL too (even if it could unlikely also be
> > > > > even if skb is no NULL but sounds more like a bug I guess) and
> > > > > can_collapse is False.
> > > > 
> > > > Uhm... I don't think this is the case. mpext is NULL even when
> > > > do_tcp_sendpages() did not collapsed the data - that is in the above
> > > > scenario.
> > > 
> > > I get the OOPS I got was because mpext was NULL, no?
> > > 
> > > BUG: kernel NULL pointer dereference, address: 0000000000000014
> > > 
> > > I didn't check with pahole but maybe linked to this instruction:
> > > 
> > >     mpext->data_len += ret;
> > 
> > Sorry, I was not clear.
> > 
> > yes, very likely the crash is due to mpext, but I don't think mpext is
> > NULL because skb is NULL. I think it's NULL because there was no
> > coalescing on skb (and thus no prior mpext in place).
> 
> Thank you, it is clearer!
> 
> I was not able to reproduce this bug by running the selftests for a few 
> days so I guess this bug has been fixed. Sorry for the noise!

Uhm... if the diagnosys is correct, the bug is still there. It's just
very difficult to trigger... expecially without pktdrill ;)

And even with pktdrill I think would not be easy to create a repro
(!!!)

/P

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [MPTCP] Re: Warn + crash in mptcp_sendmsg_frag()
@ 2019-11-20  9:51 Matthieu Baerts
  0 siblings, 0 replies; 7+ messages in thread
From: Matthieu Baerts @ 2019-11-20  9:51 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 1396 bytes --]

Hi Paolo,

On 18/11/2019 12:07, Paolo Abeni wrote:
> On Sat, 2019-11-16 at 13:59 +0100, Matthieu Baerts wrote:
>> On 15/11/2019 23:39, Paolo Abeni wrote:
>>> On Fri, 2019-11-15 at 22:06 +0100, Matthieu Baerts wrote:
>>>> If skb is NULL, mpext is NULL too (even if it could unlikely also be
>>>> even if skb is no NULL but sounds more like a bug I guess) and
>>>> can_collapse is False.
>>>
>>> Uhm... I don't think this is the case. mpext is NULL even when
>>> do_tcp_sendpages() did not collapsed the data - that is in the above
>>> scenario.
>>
>> I get the OOPS I got was because mpext was NULL, no?
>>
>> BUG: kernel NULL pointer dereference, address: 0000000000000014
>>
>> I didn't check with pahole but maybe linked to this instruction:
>>
>>     mpext->data_len += ret;
> 
> Sorry, I was not clear.
> 
> yes, very likely the crash is due to mpext, but I don't think mpext is
> NULL because skb is NULL. I think it's NULL because there was no
> coalescing on skb (and thus no prior mpext in place).

Thank you, it is clearer!

I was not able to reproduce this bug by running the selftests for a few 
days so I guess this bug has been fixed. Sorry for the noise!

Cheers,
Matt
-- 
Matthieu Baerts | R&D Engineer
matthieu.baerts(a)tessares.net
Tessares SA | Hybrid Access Solutions
www.tessares.net
1 Avenue Jean Monnet, 1348 Louvain-la-Neuve, Belgium

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [MPTCP] Re: Warn + crash in mptcp_sendmsg_frag()
@ 2019-11-18 11:07 Paolo Abeni
  0 siblings, 0 replies; 7+ messages in thread
From: Paolo Abeni @ 2019-11-18 11:07 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 983 bytes --]

On Sat, 2019-11-16 at 13:59 +0100, Matthieu Baerts wrote:
> On 15/11/2019 23:39, Paolo Abeni wrote:
> > On Fri, 2019-11-15 at 22:06 +0100, Matthieu Baerts wrote:
> > > If skb is NULL, mpext is NULL too (even if it could unlikely also be
> > > even if skb is no NULL but sounds more like a bug I guess) and
> > > can_collapse is False.
> > 
> > Uhm... I don't think this is the case. mpext is NULL even when
> > do_tcp_sendpages() did not collapsed the data - that is in the above
> > scenario.
> 
> I get the OOPS I got was because mpext was NULL, no?
> 
> BUG: kernel NULL pointer dereference, address: 0000000000000014
> 
> I didn't check with pahole but maybe linked to this instruction:
> 
>    mpext->data_len += ret;

Sorry, I was not clear.

yes, very likely the crash is due to mpext, but I don't think mpext is
NULL because skb is NULL. I think it's NULL because there was no
coalescing on skb (and thus no prior mpext in place).

Cheers,

Paolo

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [MPTCP] Re: Warn + crash in mptcp_sendmsg_frag()
@ 2019-11-15 22:39 Paolo Abeni
  0 siblings, 0 replies; 7+ messages in thread
From: Paolo Abeni @ 2019-11-15 22:39 UTC (permalink / raw)
  To: mptcp

[-- Attachment #1: Type: text/plain, Size: 3560 bytes --]

On Fri, 2019-11-15 at 22:06 +0100, Matthieu Baerts wrote:
> Hello,
> 
> When testing some changes in the kselftests area, I initially saw the 
> message:
> 
>    main_loop_s: timed out
> 
> in the middle of the tests. But just after, there was a "[ OK ]".
> Anybody saw that?

Yes. It happens when you kill the script via ctrl-c while the server
program is running (in background). The server is not killed, and so it
time-out in the background. 

Meanwhile you run another iteration of the self-tests and you see the
timeout from the previous iteration just in the middle of the current
output.

> I tried to understand why because the binary should returns 2 if the 
> message is printed and the shell script should catch the error. Then I 
> added debug in the script and launched it in a while-loop. I didn't get 
> the initial error I was looking for but instead I got a WARN followed by 
> an OOPS:

That is way worse :(

> Of course, in parallel I did other stuffs, I no longer have the proper 
> vmlinux and my MPTCP VM was not recent...

uhm... that's a pity... a decoded stack trace would help

> I suspect that "skb" is NULL and we then have an issue with this chunk:
> 
> 
>          collapsed = skb == tcp_write_queue_tail(ssk);
>          if (collapsed) {
>                  WARN_ON_ONCE(!can_collapse);
>                  /* when collapsing mpext always exists */
>                  mpext->data_len += ret;
>                  goto out;
>          }
> 
> 
> Please also note that earlier in the function, skb is initialised with:
> 
>      skb = tcp_write_queue_tail(ssk);
> 
> So I guess collapsed is always true, no?

No. 'collapsed' is true only if do_tcp_sendpages() did not allocated a
new skb - that is, collapsed the new data to the tail one. It should
happen only when the 'can_collapse' variable is true.

I just notice that there is a very unlikely scenario where 

'collapsed = skb == tcp_write_queue_tail(ssk);' 

is true, but do_tcp_sendpages() did not actually collapsed the data:

- do_tcp_sendpages() is called with 'can_collapse' == false.
- do_tcp_sendpages() tries to create a new skb but hits memory limits
and calls sk_stream_wait_memory()
- sk_stream_wait_memory() release the socket lock and goes to sleep.
- the tail skb is transmitted and freed, than do_tcp_sendpages() is
woken again
- do_tcp_sendpages() allocates a new skb and get a ptr to the last
freed skb - that is, the 'old' tail just freed.
- do_tcp_sendpages() complete successfully and mptcp_sendmsg_frag()
sees skb == tcp_write_queue_tail(ssk), even if there was no collapsing.
- havoc happens

All the above is very unlikely - after all I never observed that in
practice - but still possible I think.

> If skb is NULL, mpext is NULL too (even if it could unlikely also be 
> even if skb is no NULL but sounds more like a bug I guess) and 
> can_collapse is False.

Uhm... I don't think this is the case. mpext is NULL even when
do_tcp_sendpages() did not collapsed the data - that is in the above
scenario. 

A possible solution would be let mptcp_sendmsg_frag() allocate the skb
and do the memory scheduling for the subflow before calling
do_tcp_sendpages(), so that it would never block. 

Some/a lot of code duplication vs do_tcp_sendpages(), but that would
allow us also to take care of the pending todo in
mptcp_sendmsg_frag()[1] e.g. handle mptcp ext allocation failure
gracefully.

Any better ideas?!?

Cheers,

Paolo

[1] in net/mptcp/protocol.c:337

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2019-11-20 15:26 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-16 12:59 [MPTCP] Re: Warn + crash in mptcp_sendmsg_frag() Matthieu Baerts
  -- strict thread matches above, loose matches on Subject: below --
2019-11-20 15:26 Paolo Abeni
2019-11-20 10:57 Matthieu Baerts
2019-11-20 10:18 Paolo Abeni
2019-11-20  9:51 Matthieu Baerts
2019-11-18 11:07 Paolo Abeni
2019-11-15 22:39 Paolo Abeni

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.