* Re: [MPTCP] [RFC PATCH 4/4] mptcp: allow collapsing consecutive sendpages on the same substream
@ 2019-04-10 8:05 Paolo Abeni
0 siblings, 0 replies; 5+ messages in thread
From: Paolo Abeni @ 2019-04-10 8:05 UTC (permalink / raw)
To: mptcp
[-- Attachment #1: Type: text/plain, Size: 1306 bytes --]
On Fri, 2019-04-05 at 17:34 -0700, Mat Martineau wrote:
> On Fri, 5 Apr 2019, Paolo Abeni wrote:
> > diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> > index 2fbfe3e30630..5df2283de488 100644
> > --- a/net/mptcp/protocol.c
> > +++ b/net/mptcp/protocol.c
> > @@ -47,11 +47,19 @@ static struct sock *mptcp_subflow_get_ref(const struct mptcp_sock *msk)
> > return NULL;
> > }
> >
> > +static inline bool mptcp_skb_can_collapse_to(const struct mptcp_sock *msk,
> > + const struct sock *ssk,
> > + const struct sk_buff *skb)
> > +{
> > + return tcp_skb_can_collapse_to(skb) && (msk->last == ssk);
>
> Even if the last send was on a particular subflow, that doesn't guarantee
> the next send can be collapsed on that subflow. There are other cases like
> MPTCP-level resends or sending duplicate data on different subflows.
uhmm... I thought a bit more about the above and the code in the patch
still LGTM: msk->last references the last subflow we enqueued new data
to, that is, the subflow with the highest/newest MPTCP write_seq...
> I think it would have to make sure the MPTCP-level sequence numbers are
> contiguous before collapsing.
... so this condition always holds when (msk->last == ssk).
Am I missing something?
Thanks,
Paolo
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [MPTCP] [RFC PATCH 4/4] mptcp: allow collapsing consecutive sendpages on the same substream
@ 2019-04-10 20:59 Paolo Abeni
From: Paolo Abeni @ 2019-04-10 20:59 UTC (permalink / raw)
To: mptcp
On Wed, 2019-04-10 at 10:16 -0700, Mat Martineau wrote:
> On Wed, 10 Apr 2019, Paolo Abeni wrote:
> > uhmm... I thought a bit more about the above and the code in the patch
> > still LGTM: msk->last references the last subflow we enqueued new data
> > to, that is, the subflow with the highest/newest MPTCP write_seq...
> >
> > > I think it would have to make sure the MPTCP-level sequence numbers are
> > > contiguous before collapsing.
> >
> > ... so this condition always holds when (msk->last == ssk).
> >
> > Am I missing something?
> >
>
> Resends will be a problem with the "last subflow written to" assumption.
>
> In a case where you have multiple subflows S1 and S2, you might have a
> scenario like this (sequence numbers are in the 64-bit MPTCP sequence
> space):
>
> 1. Send bytes 0-99 on S1
>
> 2. Send bytes 100-199 on S2
>
> 3. <time passes: S2 connection has a problem and data does not get
> through>
>
> 4. Send bytes 200-299 on S1
>
> 5. MPTCP-level resend 100-199 on S1
>
> 6. Send bytes 300-399 on S1
>
>
> When you get to step 6, the "last" send was on S1, but you need a new
> MPTCP mapping header for bytes 300-399 because the previous byte in the S1
> stream is 199. It's important to know where the earlier mappings end so
> you can create a new one without leaving a gap in the MPTCP sequence
> space.
Ah, now I see! Thank you for the detailed example!
I'll take care of it in the next iteration, but I doubt I'll be able to
send a new version before the next meeting, because I'm travelling with
very limited time available.
Thank you,
Paolo
* Re: [MPTCP] [RFC PATCH 4/4] mptcp: allow collapsing consecutive sendpages on the same substream
@ 2019-04-10 17:16 Mat Martineau
From: Mat Martineau @ 2019-04-10 17:16 UTC (permalink / raw)
To: mptcp
On Wed, 10 Apr 2019, Paolo Abeni wrote:
> On Fri, 2019-04-05 at 17:34 -0700, Mat Martineau wrote:
>> On Fri, 5 Apr 2019, Paolo Abeni wrote:
>>> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
>>> index 2fbfe3e30630..5df2283de488 100644
>>> --- a/net/mptcp/protocol.c
>>> +++ b/net/mptcp/protocol.c
>>> @@ -47,11 +47,19 @@ static struct sock *mptcp_subflow_get_ref(const struct mptcp_sock *msk)
>>> return NULL;
>>> }
>>>
>>> +static inline bool mptcp_skb_can_collapse_to(const struct mptcp_sock *msk,
>>> + const struct sock *ssk,
>>> + const struct sk_buff *skb)
>>> +{
>>> + return tcp_skb_can_collapse_to(skb) && (msk->last == ssk);
>>
>> Even if the last send was on a particular subflow, that doesn't guarantee
>> the next send can be collapsed on that subflow. There are other cases like
>> MPTCP-level resends or sending duplicate data on different subflows.
>
> uhmm... I thought a bit more about the above and the code in the patch
> still LGTM: msk->last references the last subflow we enqueued new data
> to, that is, the subflow with the highest/newest MPTCP write_seq...
>
>> I think it would have to make sure the MPTCP-level sequence numbers are
>> contiguous before collapsing.
>
> ... so this condition always holds when (msk->last == ssk).
>
> Am I missing something?
>
Resends will be a problem with the "last subflow written to" assumption.
In a case where you have multiple subflows S1 and S2, you might have a
scenario like this (sequence numbers are in the 64-bit MPTCP sequence
space):
1. Send bytes 0-99 on S1
2. Send bytes 100-199 on S2
3. <time passes: S2 connection has a problem and data does not get
through>
4. Send bytes 200-299 on S1
5. MPTCP-level resend 100-199 on S1
6. Send bytes 300-399 on S1
When you get to step 6, the "last" send was on S1, but you need a new
MPTCP mapping header for bytes 300-399 because the previous byte in the S1
stream is 199. It's important to know where the earlier mappings end so
you can create a new one without leaving a gap in the MPTCP sequence
space.
--
Mat Martineau
Intel
* Re: [MPTCP] [RFC PATCH 4/4] mptcp: allow collapsing consecutive sendpages on the same substream
@ 2019-04-06 0:34 Mat Martineau
From: Mat Martineau @ 2019-04-06 0:34 UTC (permalink / raw)
To: mptcp
Hi Paolo,
On Fri, 5 Apr 2019, Paolo Abeni wrote:
> If the current sendmsg() lands on the same subflow we used last, we
> can try to collapse the data.
>
> Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
> ---
> include/net/mptcp.h | 1 +
> net/mptcp/protocol.c | 57 +++++++++++++++++++++++++++++++-------------
> 2 files changed, 41 insertions(+), 17 deletions(-)
>
> diff --git a/include/net/mptcp.h b/include/net/mptcp.h
> index 0c0839d9ab38..788d4d5a084c 100644
> --- a/include/net/mptcp.h
> +++ b/include/net/mptcp.h
> @@ -50,6 +50,7 @@ struct mptcp_sock {
> spinlock_t conn_list_lock;
> struct hlist_head conn_list;
> struct socket *subflow; /* outgoing connect/listener/!mp_capable */
> + struct sock *last;
> };
>
> #define mptcp_for_each_subflow(__msk, __subflow) \
> diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
> index 2fbfe3e30630..5df2283de488 100644
> --- a/net/mptcp/protocol.c
> +++ b/net/mptcp/protocol.c
> @@ -47,11 +47,19 @@ static struct sock *mptcp_subflow_get_ref(const struct mptcp_sock *msk)
> return NULL;
> }
>
> +static inline bool mptcp_skb_can_collapse_to(const struct mptcp_sock *msk,
> + const struct sock *ssk,
> + const struct sk_buff *skb)
> +{
> + return tcp_skb_can_collapse_to(skb) && (msk->last == ssk);
Even if the last send was on a particular subflow, that doesn't guarantee
the next send can be collapsed on that subflow. There are other cases like
MPTCP-level resends or sending duplicate data on different subflows.
I think it would have to make sure the MPTCP-level sequence numbers are
contiguous before collapsing.
> +}
> +
> static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
> - struct msghdr *msg)
> + struct msghdr *msg, int *pmss_now, int *ps_goal)
> {
> + int mss_now, avail_size, size_goal, poffset, ret;
> struct mptcp_sock *msk = mptcp_sk(sk);
> - int mss_now, size_goal, poffset, ret;
> + bool collapsed, can_collapse = false;
> struct mptcp_ext *mpext = NULL;
> struct page_frag *pfrag;
> struct sk_buff *skb;
> @@ -78,25 +86,35 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
> if (!psize)
> return -EINVAL;
>
> - /* Mark the end of the previous write so the beginning of the
> - * next write (with its own mptcp skb extension data) is not
> - * collapsed.
> - */
> + mss_now = tcp_send_mss(ssk, &size_goal, msg->msg_flags);
> + *pmss_now = mss_now;
> + *ps_goal = size_goal;
> + avail_size = size_goal;
> skb = tcp_write_queue_tail(ssk);
> - if (skb)
> - TCP_SKB_CB(skb)->eor = 1;
> + if (skb) {
> + can_collapse = mptcp_skb_can_collapse_to(msk, ssk, skb);
> + if (!can_collapse)
> + TCP_SKB_CB(skb)->eor = 1;
> + else if (size_goal - skb->len > 0)
> + avail_size = size_goal + size_goal - skb->len;
Can you comment more on the calculation for avail_size here - is this how
you "allocate at most a new skb" as you mention below?
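For what it's worth, one possible reading of the arithmetic in the hunk above, as a standalone sketch (illustrative names, not kernel code): the tail skb may still grow by (size_goal - skb_len) bytes, and at most one new skb of up to size_goal bytes may be allocated, so the budget for a single sendpages call is the sum of the two.

```c
/* Mirrors the 'avail_size' computation in the patch, assuming the
 * tail skb is collapsible; purely illustrative. */
static int avail_size(int size_goal, int skb_len)
{
	if (size_goal - skb_len > 0)	/* room left in the tail skb */
		return size_goal + size_goal - skb_len;
	return size_goal;		/* tail full: one new skb only */
}
```

E.g. with size_goal 1000 and a 400-byte tail, the call may write 600 bytes to top up the tail plus 1000 bytes in one new skb, 1600 in total.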
> + }
>
> - mss_now = tcp_send_mss(ssk, &size_goal, msg->msg_flags);
> - psize = min_t(int, size_goal, psize);
> + /* be sure that we are going to allocate at most a new skb, and
> + * tell the tcp stack to delay the push so that we can safely
> + * access the skb after the sendpages call
> + */
> + psize = min_t(int, avail_size, psize);
> ret = do_tcp_sendpages(ssk, pfrag->page, poffset, psize,
> msg->msg_flags | MSG_SENDPAGE_NOTLAST);
> if (ret <= 0)
> return ret;
>
> - if (skb == tcp_write_queue_tail(ssk))
> - pr_err("no new skb %p/%p", sk, ssk);
> + collapsed = skb == tcp_write_queue_tail(ssk);
> + BUG_ON(collapsed && !can_collapse);
>
> skb = tcp_write_queue_tail(ssk);
> + if (collapsed)
> + goto out;
Without the previous commit (moving mpext->data_len assignment), could you
instead update the existing mpext in the "collapsed == true" case?
Mat
>
> mpext = skb_ext_add(skb, SKB_EXT_MPTCP);
>
> @@ -109,24 +127,25 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
> mpext->dsn64 = 1;
>
> pr_debug("data_seq=%llu subflow_seq=%u data_len=%u checksum=%u, dsn64=%d",
> - mpext->data_seq, mpext->subflow_seq, mpext->data_len,
> + mpext->data_seq, mpext->subflow_seq, skb->len,
> mpext->checksum, mpext->dsn64);
> } /* TODO: else fallback */
>
> +out:
> pfrag->offset += ret;
> + msk->last = ssk;
> msk->write_seq += ret;
> subflow_ctx(ssk)->rel_write_seq += ret;
>
> - tcp_push(ssk, msg->msg_flags, mss_now, tcp_sk(ssk)->nonagle, size_goal);
> return ret;
> }
>
> static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
> {
> struct mptcp_sock *msk = mptcp_sk(sk);
> + int mss_now = 0, size_goal = 0, ret = 0;
> size_t copied = 0;
> struct sock *ssk;
> - int ret = 0;
>
> pr_debug("msk=%p", msk);
> if (msk->subflow) {
> @@ -153,15 +172,19 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
>
> lock_sock(sk);
> lock_sock(ssk);
> +
> while (msg_data_left(msg)) {
> - ret = mptcp_sendmsg_frag(sk, ssk, msg);
> + ret = mptcp_sendmsg_frag(sk, ssk, msg, &mss_now, &size_goal);
> if (ret < 0)
> break;
>
> copied += ret;
> }
> - if (copied > 0)
> + if (copied) {
> ret = copied;
> + tcp_push(ssk, msg->msg_flags, mss_now, tcp_sk(ssk)->nonagle,
> + size_goal);
> + }
>
> release_sock(ssk);
> release_sock(sk);
> --
> 2.20.1
>
> _______________________________________________
> mptcp mailing list
> mptcp(a)lists.01.org
> https://lists.01.org/mailman/listinfo/mptcp
>
--
Mat Martineau
Intel
* [MPTCP] [RFC PATCH 4/4] mptcp: allow collapsing consecutive sendpages on the same substream
@ 2019-04-05 14:46 Paolo Abeni
From: Paolo Abeni @ 2019-04-05 14:46 UTC (permalink / raw)
To: mptcp
If the current sendmsg() lands on the same subflow we used last, we
can try to collapse the data.
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
---
include/net/mptcp.h | 1 +
net/mptcp/protocol.c | 57 +++++++++++++++++++++++++++++++-------------
2 files changed, 41 insertions(+), 17 deletions(-)
diff --git a/include/net/mptcp.h b/include/net/mptcp.h
index 0c0839d9ab38..788d4d5a084c 100644
--- a/include/net/mptcp.h
+++ b/include/net/mptcp.h
@@ -50,6 +50,7 @@ struct mptcp_sock {
spinlock_t conn_list_lock;
struct hlist_head conn_list;
struct socket *subflow; /* outgoing connect/listener/!mp_capable */
+ struct sock *last;
};
#define mptcp_for_each_subflow(__msk, __subflow) \
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 2fbfe3e30630..5df2283de488 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -47,11 +47,19 @@ static struct sock *mptcp_subflow_get_ref(const struct mptcp_sock *msk)
return NULL;
}
+static inline bool mptcp_skb_can_collapse_to(const struct mptcp_sock *msk,
+ const struct sock *ssk,
+ const struct sk_buff *skb)
+{
+ return tcp_skb_can_collapse_to(skb) && (msk->last == ssk);
+}
+
static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
- struct msghdr *msg)
+ struct msghdr *msg, int *pmss_now, int *ps_goal)
{
+ int mss_now, avail_size, size_goal, poffset, ret;
struct mptcp_sock *msk = mptcp_sk(sk);
- int mss_now, size_goal, poffset, ret;
+ bool collapsed, can_collapse = false;
struct mptcp_ext *mpext = NULL;
struct page_frag *pfrag;
struct sk_buff *skb;
@@ -78,25 +86,35 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
if (!psize)
return -EINVAL;
- /* Mark the end of the previous write so the beginning of the
- * next write (with its own mptcp skb extension data) is not
- * collapsed.
- */
+ mss_now = tcp_send_mss(ssk, &size_goal, msg->msg_flags);
+ *pmss_now = mss_now;
+ *ps_goal = size_goal;
+ avail_size = size_goal;
skb = tcp_write_queue_tail(ssk);
- if (skb)
- TCP_SKB_CB(skb)->eor = 1;
+ if (skb) {
+ can_collapse = mptcp_skb_can_collapse_to(msk, ssk, skb);
+ if (!can_collapse)
+ TCP_SKB_CB(skb)->eor = 1;
+ else if (size_goal - skb->len > 0)
+ avail_size = size_goal + size_goal - skb->len;
+ }
- mss_now = tcp_send_mss(ssk, &size_goal, msg->msg_flags);
- psize = min_t(int, size_goal, psize);
+ /* be sure that we are going to allocate at most a new skb, and
+ * tell the tcp stack to delay the push so that we can safely
+ * access the skb after the sendpages call
+ */
+ psize = min_t(int, avail_size, psize);
ret = do_tcp_sendpages(ssk, pfrag->page, poffset, psize,
msg->msg_flags | MSG_SENDPAGE_NOTLAST);
if (ret <= 0)
return ret;
- if (skb == tcp_write_queue_tail(ssk))
- pr_err("no new skb %p/%p", sk, ssk);
+ collapsed = skb == tcp_write_queue_tail(ssk);
+ BUG_ON(collapsed && !can_collapse);
skb = tcp_write_queue_tail(ssk);
+ if (collapsed)
+ goto out;
mpext = skb_ext_add(skb, SKB_EXT_MPTCP);
@@ -109,24 +127,25 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,
mpext->dsn64 = 1;
pr_debug("data_seq=%llu subflow_seq=%u data_len=%u checksum=%u, dsn64=%d",
- mpext->data_seq, mpext->subflow_seq, mpext->data_len,
+ mpext->data_seq, mpext->subflow_seq, skb->len,
mpext->checksum, mpext->dsn64);
} /* TODO: else fallback */
+out:
pfrag->offset += ret;
+ msk->last = ssk;
msk->write_seq += ret;
subflow_ctx(ssk)->rel_write_seq += ret;
- tcp_push(ssk, msg->msg_flags, mss_now, tcp_sk(ssk)->nonagle, size_goal);
return ret;
}
static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
{
struct mptcp_sock *msk = mptcp_sk(sk);
+ int mss_now = 0, size_goal = 0, ret = 0;
size_t copied = 0;
struct sock *ssk;
- int ret = 0;
pr_debug("msk=%p", msk);
if (msk->subflow) {
@@ -153,15 +172,19 @@ static int mptcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
lock_sock(sk);
lock_sock(ssk);
+
while (msg_data_left(msg)) {
- ret = mptcp_sendmsg_frag(sk, ssk, msg);
+ ret = mptcp_sendmsg_frag(sk, ssk, msg, &mss_now, &size_goal);
if (ret < 0)
break;
copied += ret;
}
- if (copied > 0)
+ if (copied) {
ret = copied;
+ tcp_push(ssk, msg->msg_flags, mss_now, tcp_sk(ssk)->nonagle,
+ size_goal);
+ }
release_sock(ssk);
release_sock(sk);
--
2.20.1