Re: [MPTCP] [PATCH] mptcp: sendmsg: fix stream corruption

From: Paolo Abeni @ 2019-09-23  7:46 UTC
  To: mptcp

On Fri, 2019-09-20 at 15:45 -0700, Mat Martineau wrote:
> On Fri, 20 Sep 2019, Paolo Abeni wrote:
> 
> > Currently we can hit stream corruption with the following sequence:
> > - sendmsg_frag is invoked while the write_queue tail skb's len is equal to
> >  the current size_goal.
> >  sendmsg_frag sets can_collapse to false, even if mptcp data are in sequence.
> >  However, it does not set eor in the write_queue skb.
> > - sendmsg_frag invokes do_tcp_sendmsg(), which tries to allocate a new skb -
> >  as the 'size_goal - skb->len <= 0' condition is true.
> >  But it hits the memory limit and/or the allocation fails, so do_tcp_sendmsg()
> >  jumps to the 'wait_for_memory:' label and invokes tcp_push()
> > - tcp_push() tries to transmit the write queue tail skb but, due to the window
> >  size limit, splits it, sends the first half and leaves a smaller skb in the
> >  write queue - without any ext attached
> > - do_tcp_sendmsg() invokes sk_stream_wait_memory() and then checks again for skb
> >  allocation; this time skb->len is less than size_goal and do_tcp_sendmsg()
> >  collapses the data onto the existing skb
> > - sendmsg_frag adds a new ext to the tail skb in the write queue, but due to the above
> >  it is associated with a wrong/unexpected subflow sequence number: it maps
> >  a future ssn -> stream corruption
> > 
> > This change addresses the above issue by explicitly setting 'eor' when hitting
> > the initial condition, so that tcp will not collapse the data after tcp_push().
> 
> Thanks for the detailed description. Code looks good to me and tested fine 
> (although failures remain, but I don't think this was intended to fix 
> everything).

I haven't seen any failures on top of this patch (plus the recvmsg
refactor). Could you share debug-enabled dmesg output and/or a pcap
trace for the failures you see?

There is still at least one known bad scenario: it is quite similar to
the above, but additionally the window size shrinks at
sk_stream_wait_memory() invocation time, so that on the next iterations
do_tcp_sendpages() ends up creating multiple skbs. All except the last
one are sent without any DSS attached, and that will corrupt the stream.

AFAICS there is no way of fixing the above, except replacing the
do_tcp_sendpages() call with a chunk of almost duplicate code in mptcp
that handles this corner case explicitly.
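
For illustration only, the sequence from the patch description quoted
above can be replayed in a toy model. This is a rough Python sketch, not
kernel code: ToySubflow, push_partial(), run() and mapping_matches() are
made-up names, the numbers are arbitrary, and the model assumes that the
skb left in the queue after a split inherits 'eor'. The 'ext' field is
reduced to a (start ssn, length) pair.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Skb:
    seq: int                               # subflow sequence number of the first byte
    length: int
    eor: bool = False                      # end-of-record: TCP must not append to this skb
    ext: Optional[Tuple[int, int]] = None  # toy MPTCP mapping: (start ssn, data length)


class ToySubflow:
    """Hypothetical stand-in for one subflow's write queue."""

    def __init__(self, size_goal: int, set_eor: bool):
        self.size_goal = size_goal
        self.set_eor = set_eor             # True models the behaviour with the patch
        self.queue: List[Skb] = []
        self.write_seq = 0

    def push_partial(self, sent: int) -> None:
        # Toy tcp_push() under a window limit: send `sent` bytes of the tail.
        # The remainder stays queued without ext; it is assumed to inherit eor.
        tail = self.queue.pop()
        self.queue.append(Skb(tail.seq + sent, tail.length - sent, eor=tail.eor))

    def sendmsg_frag(self, length: int, alloc_fails: bool = False,
                     window: int = 0) -> Skb:
        tail = self.queue[-1] if self.queue else None
        # MPTCP decides up front whether the new data can share the tail skb.
        can_collapse = tail is not None and tail.length < self.size_goal
        if tail is not None and not can_collapse and self.set_eor:
            tail.eor = True

        # Toy TCP send path: a failed allocation leads to a partial push.
        if alloc_fails and tail is not None and tail.length >= self.size_goal:
            self.push_partial(window)
            tail = self.queue[-1]

        ssn = self.write_seq               # ssn of the first new byte
        if (tail is not None and not tail.eor
                and tail.length + length <= self.size_goal):
            tail.length += length          # TCP collapses onto the existing skb
        else:
            tail = Skb(ssn, length)        # TCP starts a new skb
            self.queue.append(tail)
        self.write_seq += length

        # MPTCP acts on its earlier decision, unaware of what TCP really did.
        if can_collapse and tail.ext is not None:
            tail.ext = (tail.ext[0], tail.ext[1] + length)
        else:
            tail.ext = (ssn, length)
        return tail


def mapping_matches(skb: Skb) -> bool:
    # The mapping is sane only if it starts at the skb's own first byte.
    return skb.ext is not None and skb.ext[0] == skb.seq


def run(set_eor: bool) -> Skb:
    sf = ToySubflow(size_goal=1000, set_eor=set_eor)
    sf.sendmsg_frag(1000)                  # tail skb len == size_goal
    return sf.sendmsg_frag(300, alloc_fails=True, window=400)
```

In this model run(False) returns a tail skb that starts at ssn 400 but
carries a mapping starting at ssn 1000 (the "future ssn" case), while
run(True) returns a fresh skb whose mapping starts at its own first byte.
The second scenario (window shrinking across iterations) is not modelled.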

Thanks,

Paolo
