* Re: [MPTCP] [PRE-RFC 0/6] mptcp: implement retransmit infrastructure
@ 2019-08-19 14:35 Florian Westphal
From: Florian Westphal @ 2019-08-19 14:35 UTC
  To: mptcp


Paolo Abeni <pabeni@redhat.com> wrote:
> On Mon, 2019-08-19 at 13:06 +0200, Florian Westphal wrote:
> > Is this for the future failover case?
> 
> Yes, with a single subflow, we should not use MPTCP-level
> retransmission; it should only be triggered in scenarios using failover
> and/or multiple concurrent substreams.
> 
> Well, to be 110% honest, I think that with a wrong/less-than-perfect
> MPTCP-level retransmission heuristic we could end up retransmitting
> even with a single subflow: since the receiver sends the MPTCP-level ack
> only after user-space has read the data, a slow receiver user-space
> could trigger the timeout on the sender (and now that I have written it
> down, it looks like quite a bad thing to me :(

It's not ideal, but I don't think it's something that needs
to be addressed *right now*.

> [If I'm not completely lost, this is the only question left without an
> answer right???]

Yes, thanks.


* Re: [MPTCP] [PRE-RFC 0/6] mptcp: implement retransmit infrastructure
@ 2019-08-19 13:02 Paolo Abeni
From: Paolo Abeni @ 2019-08-19 13:02 UTC
  To: mptcp


Hi,

On Mon, 2019-08-19 at 13:06 +0200, Florian Westphal wrote:
> Paolo Abeni <pabeni@redhat.com> wrote:
> > As suggested by the subj prefix, this is a very early draft, mainly to
> > discuss design decisions and/or architectural issues. This is only
> > compile-tested so far.
> > 
> > In the current status:
> > - update the MPTCP unacked sequence number (patches 1-3) on sendmsg()
> >   - also try to cope with 32-bit sequence numbers
> > - queue data (page frags) in the MPTCP retransmission queue (patch 4)
> > - clean up acked data from the retransmission queue and schedule a timeout
> >   to update the MPTCP unacked sequence number (patch 5)
> > - do memory accounting for the MPTCP rtx queue (patch 6)
> 
> Can you explain how this rtx queue is expected to be used?
> 
> At this time, we only use one TCP connection for sending data, so
> we can rely on TCP-level retransmits.
> 
> Is this for the future failover case?

Yes, with a single subflow, we should not use MPTCP-level
retransmission; it should only be triggered in scenarios using failover
and/or multiple concurrent substreams.

Well, to be 110% honest, I think that with a wrong/less-than-perfect
MPTCP-level retransmission heuristic we could end up retransmitting
even with a single subflow: since the receiver sends the MPTCP-level ack
only after user-space has read the data, a slow receiver user-space
could trigger the timeout on the sender (and now that I have written it
down, it looks like quite a bad thing to me :(
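A minimal sketch of a guard against that case could look like the
following userspace C; the struct and helper names (mptcp_sock,
mptcp_should_arm_rtx_timer, subflow_count) are illustrative assumptions,
not the real kernel API:

```c
#include <stdbool.h>

/* Sketch: with a single subflow we rely purely on TCP-level
 * retransmissions, so the MPTCP-level retransmit timer is armed only
 * when more than one subflow is attached.  Illustrative names only. */
struct mptcp_sock {
	int subflow_count;	/* number of currently attached subflows */
};

static bool mptcp_should_arm_rtx_timer(const struct mptcp_sock *msk)
{
	/* single subflow: the TCP-level RTO already covers retransmission */
	return msk->subflow_count > 1;
}
```

This avoids the slow-receiver false positive entirely for the
single-subflow case, at the cost of deferring the heuristic question to
the multi-subflow/failover scenarios where it actually matters.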

[If I'm not completely lost, this is the only question left without an
answer right???]

Cheers,

Paolo



* Re: [MPTCP] [PRE-RFC 0/6] mptcp: implement retransmit infrastructure
@ 2019-08-19 11:06 Florian Westphal
From: Florian Westphal @ 2019-08-19 11:06 UTC
  To: mptcp


Paolo Abeni <pabeni@redhat.com> wrote:
> As suggested by the subj prefix, this is a very early draft, mainly to
> discuss design decisions and/or architectural issues. This is only
> compile-tested so far.
> 
> In the current status:
> - update the MPTCP unacked sequence number (patches 1-3) on sendmsg()
>   - also try to cope with 32-bit sequence numbers
> - queue data (page frags) in the MPTCP retransmission queue (patch 4)
> - clean up acked data from the retransmission queue and schedule a timeout
>   to update the MPTCP unacked sequence number (patch 5)
> - do memory accounting for the MPTCP rtx queue (patch 6)

Can you explain how this rtx queue is expected to be used?

At this time, we only use one TCP connection for sending data, so
we can rely on TCP-level retransmits.

Is this for the future failover case?

> Open points:
> - what about the current seq_una update scheme (patches 2-3)? too complex?
>    too loose? too generic?
> - is it too early to plug any heuristic inside the code? (as done in patch 3)

As long as they don't expose a particular behaviour, I think it's fine; it
can be tweaked/refined later.

> - do we really need a timer to update the unacked sequence?
>   - we can avoid it with an additional atomic operation in
>     mptcp_incoming_options() - either using an atomic type for 'snd_una' or
>     using an additional spin lock to protect it (and possibly other fields) [2]

I think that using a timer is very strange.  I will comment in the patch
itself.

>   - still, we will likely need a timer to detect that a subflow has become
>     unusable, e.g. due to a link-down event on the peer side. Should we rely
>     on explicit MPTCP notifications only (e.g. the DEL_ADDR sub-option)?

Ah, that answers my question above.  I don't think we can rely on
on-wire notifications only.  A client can move outside of wifi range,
for example, so we would only notice that the TCP connection is
"blocked" (e.g. outstanding data with no further acks coming in).

> - retransmission code needs to run in process context, as we will need to
>   acquire the msk socket lock and the ssk socket lock, in that order.
>   - As MPTCP-level retransmission should be considerably less frequent than
>     TCP-level retransmissions, we could use/schedule a workqueue for
>     that.

Agree, workqueue seems right for this.


* [MPTCP] [PRE-RFC 0/6] mptcp: implement retransmit infrastructure
@ 2019-08-16 16:47 Paolo Abeni
From: Paolo Abeni @ 2019-08-16 16:47 UTC
  To: mptcp


As suggested by the subj prefix, this is a very early draft, mainly to
discuss design decisions and/or architectural issues. This is only
compile-tested so far.

In the current status:
- update the MPTCP unacked sequence number (patches 1-3) on sendmsg()
  - also try to cope with 32-bit sequence numbers
- queue data (page frags) in the MPTCP retransmission queue (patch 4)
- clean up acked data from the retransmission queue and schedule a timeout
   to update the MPTCP unacked sequence number (patch 5)
- do memory accounting for the MPTCP rtx queue (patch 6)

TODO:
- do retransmissions in the timer handler, with some heuristic - e.g. last
   seq_una update timestamp older than timeout*2 [1]
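The heuristic in that TODO could be expressed as a small predicate; this
is a userspace sketch where plain integer timestamps stand in for
jiffies and the function name is a made-up placeholder:

```c
#include <stdbool.h>
#include <stdint.h>

/* Sketch of the TODO heuristic above: fire an MPTCP-level
 * retransmission only when snd_una has not advanced for more than
 * twice the retransmit timeout.  Illustrative names only. */
static bool mptcp_rtx_timeout_expired(uint64_t now, uint64_t last_una_update,
				      uint64_t timeout)
{
	return now - last_una_update > 2 * timeout;
}
```

The "timeout*2" slack is the draft's way of tolerating one round of
TCP-level recovery on a subflow before escalating to MPTCP-level
retransmission.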

Open points:
- what about the current seq_una update scheme (patches 2-3)? too complex?
   too loose? too generic?
- is it too early to plug any heuristic inside the code? (as done in patch 3)
- do we really need a timer to update the unacked sequence?
  - we can avoid it with an additional atomic operation in
    mptcp_incoming_options() - either using an atomic type for 'snd_una' or
    using an additional spin lock to protect it (and possibly other fields) [2]
  - still, we will likely need a timer to detect that a subflow has become
    unusable, e.g. due to a link-down event on the peer side. Should we rely
    on explicit MPTCP notifications only (e.g. the DEL_ADDR sub-option)?
- retransmission code needs to run in process context, as we will need to
  acquire the msk socket lock and the ssk socket lock, in that order.
  - As MPTCP-level retransmission should be considerably less frequent than
    TCP-level retransmissions, we could use/schedule a workqueue for
    that.
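The timer-less alternative from the open points above (an atomic
'snd_una' updated directly from mptcp_incoming_options()) could look
roughly like this in userspace C11; the kernel would use its atomic64
helpers instead, and all names here are illustrative:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* 64-bit sequence-space comparison, as in the before64/after64 helpers
 * patch 1 moves into the header file. */
static inline bool after64(uint64_t seq1, uint64_t seq2)
{
	return (int64_t)(seq1 - seq2) > 0;
}

/* Sketch: advance snd_una only when the new ack is "after" the current
 * value, using a compare-exchange loop in place of a timer.  A stale or
 * duplicate ack leaves snd_una untouched. */
static void mptcp_update_snd_una(_Atomic uint64_t *snd_una, uint64_t ack)
{
	uint64_t old = atomic_load(snd_una);

	/* retry until we store the newer ack or observe one >= ours */
	while (after64(ack, old) &&
	       !atomic_compare_exchange_weak(snd_una, &old, ack))
		;
}
```

As footnote [2] notes, the cost of this approach is an extra atomic
operation on a contended cache line for every MPTCP ack processed, which
is exactly what the timer-based design tries to avoid.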

Any kind of feedback more than welcome!

[1] After a better look at the current code, I see no issues in allocating the
sk_buff hdr at retransmission time - ensuring we do that without blocking,
with GFP_ATOMIC, e.g. as tls_device_write_space() currently does.

[2] I chose the current design to avoid such an extra atomic operation: it
would happen on a contended cache line, while processing each MPTCP ack.

Paolo Abeni (6):
  mptcp: move before/after64 into the header file
  mptcp: update per subflow unacked sequence on pkt reception
  mptcp: update msk unacked sequence in sendmsg()
  mptcp: queue data for mptcp level retransmission
  mptcp: use retransmission timer to update msk una
  mptcp: implement memory accounting for mptcp rtx queue

 net/mptcp/options.c  |  31 ++++--
 net/mptcp/protocol.c | 240 +++++++++++++++++++++++++++++++++++++++++--
 net/mptcp/protocol.h |  30 ++++++
 3 files changed, 286 insertions(+), 15 deletions(-)

-- 
2.20.1


