* [PATCH net-next 0/2] sctp: support MSG_MORE flag when sending msg @ 2017-02-18 17:52 ` Xin Long 0 siblings, 0 replies; 44+ messages in thread From: Xin Long @ 2017-02-18 17:52 UTC (permalink / raw) To: network dev, linux-sctp Cc: Marcelo Ricardo Leitner, Neil Horman, Vlad Yasevich, davem, David Laight This patch is to add support for MSG_MORE on sctp. Patch 1/2 is an improvement ahead of patch 2/2 to solve the close block problem mentioned in https://patchwork.ozlabs.org/patch/372404/. Xin Long (2): sctp: flush out queue once assoc state falls into SHUTDOWN_PENDING sctp: add support for MSG_MORE include/net/sctp/structs.h | 1 + net/sctp/output.c | 9 +++------ net/sctp/sm_sideeffect.c | 4 ++++ net/sctp/socket.c | 1 + 4 files changed, 9 insertions(+), 6 deletions(-) -- 2.1.0 ^ permalink raw reply [flat|nested] 44+ messages in thread
* [PATCH net-next 0/2] sctp: support MSG_MORE flag when sending msg @ 2017-02-18 17:52 ` Xin Long 0 siblings, 0 replies; 44+ messages in thread From: Xin Long @ 2017-02-18 17:52 UTC (permalink / raw) To: network dev, linux-sctp Cc: Marcelo Ricardo Leitner, Neil Horman, Vlad Yasevich, davem, David Laight This patch is to add support for MSG_MORE on sctp. Patch 1/2 is an improvement ahead of patch 2/2 to solve the close block problem mentioned in https://patchwork.ozlabs.org/patch/372404/. Xin Long (2): sctp: flush out queue once assoc state falls into SHUTDOWN_PENDING sctp: add support for MSG_MORE include/net/sctp/structs.h | 1 + net/sctp/output.c | 9 +++------ net/sctp/sm_sideeffect.c | 4 ++++ net/sctp/socket.c | 1 + 4 files changed, 9 insertions(+), 6 deletions(-) -- 2.1.0 ^ permalink raw reply [flat|nested] 44+ messages in thread
* [PATCH net-next 1/2] sctp: flush out queue once assoc state falls into SHUTDOWN_PENDING 2017-02-18 17:52 ` Xin Long @ 2017-02-18 17:52 ` Xin Long -1 siblings, 0 replies; 44+ messages in thread From: Xin Long @ 2017-02-18 17:52 UTC (permalink / raw) To: network dev, linux-sctp Cc: Marcelo Ricardo Leitner, Neil Horman, Vlad Yasevich, davem, David Laight This patch is to flush out queue when assoc state falls into SHUTDOWN_PENDING if there are still chunks in it, so that the data can be sent out as soon as possible before sending SHUTDOWN chunk. When sctp supports MSG_MORE flag in next patch, this improvement can also solve the problem that the chunks with MSG_MORE flag may be stuck in queue when closing an assoc. Signed-off-by: Xin Long <lucien.xin@gmail.com> --- net/sctp/sm_sideeffect.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/net/sctp/sm_sideeffect.c b/net/sctp/sm_sideeffect.c index 51abcc9..25384fa 100644 --- a/net/sctp/sm_sideeffect.c +++ b/net/sctp/sm_sideeffect.c @@ -872,6 +872,10 @@ static void sctp_cmd_new_state(sctp_cmd_seq_t *cmds, if (!sctp_style(sk, UDP)) sk->sk_state_change(sk); } + + if (sctp_state(asoc, SHUTDOWN_PENDING) && + !sctp_outq_is_empty(&asoc->outqueue)) + sctp_outq_uncork(&asoc->outqueue, GFP_ATOMIC); } /* Helper function to delete an association. */ -- 2.1.0 ^ permalink raw reply related [flat|nested] 44+ messages in thread
* [PATCH net-next 1/2] sctp: flush out queue once assoc state falls into SHUTDOWN_PENDING @ 2017-02-18 17:52 ` Xin Long 0 siblings, 0 replies; 44+ messages in thread From: Xin Long @ 2017-02-18 17:52 UTC (permalink / raw) To: network dev, linux-sctp Cc: Marcelo Ricardo Leitner, Neil Horman, Vlad Yasevich, davem, David Laight This patch is to flush out queue when assoc state falls into SHUTDOWN_PENDING if there are still chunks in it, so that the data can be sent out as soon as possible before sending SHUTDOWN chunk. When sctp supports MSG_MORE flag in next patch, this improvement can also solve the problem that the chunks with MSG_MORE flag may be stuck in queue when closing an assoc. Signed-off-by: Xin Long <lucien.xin@gmail.com> --- net/sctp/sm_sideeffect.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/net/sctp/sm_sideeffect.c b/net/sctp/sm_sideeffect.c index 51abcc9..25384fa 100644 --- a/net/sctp/sm_sideeffect.c +++ b/net/sctp/sm_sideeffect.c @@ -872,6 +872,10 @@ static void sctp_cmd_new_state(sctp_cmd_seq_t *cmds, if (!sctp_style(sk, UDP)) sk->sk_state_change(sk); } + + if (sctp_state(asoc, SHUTDOWN_PENDING) && + !sctp_outq_is_empty(&asoc->outqueue)) + sctp_outq_uncork(&asoc->outqueue, GFP_ATOMIC); } /* Helper function to delete an association. */ -- 2.1.0 ^ permalink raw reply related [flat|nested] 44+ messages in thread
* [PATCH net-next 2/2] sctp: add support for MSG_MORE 2017-02-18 17:52 ` Xin Long @ 2017-02-18 17:52 ` Xin Long -1 siblings, 0 replies; 44+ messages in thread From: Xin Long @ 2017-02-18 17:52 UTC (permalink / raw) To: network dev, linux-sctp Cc: Marcelo Ricardo Leitner, Neil Horman, Vlad Yasevich, davem, David Laight This patch is to add support for MSG_MORE on sctp. It adds force_delay in sctp_datamsg to save MSG_MORE, and sets it after creating datamsg according to the send flag. sctp_packet_can_append_data then uses it to decide if the chunks of this msg will be sent at once or delay it. Note that unlike [1], this patch saves MSG_MORE in datamsg, instead of in assoc. As sctp enqueues the chunks first, then dequeue them one by one. If it's saved in assoc,the current msg's send flag (MSG_MORE) may affect other chunks' bundling. Since last patch, sctp flush out queue once assoc state falls into SHUTDOWN_PENDING, the close block problem mentioned in [1] has been solved as well. [1] https://patchwork.ozlabs.org/patch/372404/ Signed-off-by: Xin Long <lucien.xin@gmail.com> --- include/net/sctp/structs.h | 1 + net/sctp/output.c | 9 +++------ net/sctp/socket.c | 1 + 3 files changed, 5 insertions(+), 6 deletions(-) diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h index 387c802..a244db5 100644 --- a/include/net/sctp/structs.h +++ b/include/net/sctp/structs.h @@ -497,6 +497,7 @@ struct sctp_datamsg { /* Did the messenge fail to send? */ int send_error; u8 send_failed:1, + force_delay:1, can_delay; /* should this message be Nagle delayed */ }; diff --git a/net/sctp/output.c b/net/sctp/output.c index 814eac0..85406d5 100644 --- a/net/sctp/output.c +++ b/net/sctp/output.c @@ -704,18 +704,15 @@ static sctp_xmit_t sctp_packet_can_append_data(struct sctp_packet *packet, * unacknowledged. */ - if (sctp_sk(asoc->base.sk)->nodelay) - /* Nagle disabled */ + if ((sctp_sk(asoc->base.sk)->nodelay || inflight == 0) && + !chunk->msg->force_delay) + /* Nothing unacked */ return SCTP_XMIT_OK; if (!sctp_packet_empty(packet)) /* Append to packet */ return SCTP_XMIT_OK; - if (inflight == 0) - /* Nothing unacked */ - return SCTP_XMIT_OK; - if (!sctp_state(asoc, ESTABLISHED)) return SCTP_XMIT_OK; diff --git a/net/sctp/socket.c b/net/sctp/socket.c index 75f35ce..b532148 100644 --- a/net/sctp/socket.c +++ b/net/sctp/socket.c @@ -1964,6 +1964,7 @@ static int sctp_sendmsg(struct sock *sk, struct msghdr *msg, size_t msg_len) err = PTR_ERR(datamsg); goto out_free; } + datamsg->force_delay = !!(msg->msg_flags & MSG_MORE); /* Now send the (possibly) fragmented message. */ list_for_each_entry(chunk, &datamsg->chunks, frag_list) { -- 2.1.0 ^ permalink raw reply related [flat|nested] 44+ messages in thread
* [PATCH net-next 2/2] sctp: add support for MSG_MORE @ 2017-02-18 17:52 ` Xin Long 0 siblings, 0 replies; 44+ messages in thread From: Xin Long @ 2017-02-18 17:52 UTC (permalink / raw) To: network dev, linux-sctp Cc: Marcelo Ricardo Leitner, Neil Horman, Vlad Yasevich, davem, David Laight This patch is to add support for MSG_MORE on sctp. It adds force_delay in sctp_datamsg to save MSG_MORE, and sets it after creating datamsg according to the send flag. sctp_packet_can_append_data then uses it to decide if the chunks of this msg will be sent at once or delay it. Note that unlike [1], this patch saves MSG_MORE in datamsg, instead of in assoc. As sctp enqueues the chunks first, then dequeue them one by one. If it's saved in assoc,the current msg's send flag (MSG_MORE) may affect other chunks' bundling. Since last patch, sctp flush out queue once assoc state falls into SHUTDOWN_PENDING, the close block problem mentioned in [1] has been solved as well. [1] https://patchwork.ozlabs.org/patch/372404/ Signed-off-by: Xin Long <lucien.xin@gmail.com> --- include/net/sctp/structs.h | 1 + net/sctp/output.c | 9 +++------ net/sctp/socket.c | 1 + 3 files changed, 5 insertions(+), 6 deletions(-) diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h index 387c802..a244db5 100644 --- a/include/net/sctp/structs.h +++ b/include/net/sctp/structs.h @@ -497,6 +497,7 @@ struct sctp_datamsg { /* Did the messenge fail to send? */ int send_error; u8 send_failed:1, + force_delay:1, can_delay; /* should this message be Nagle delayed */ }; diff --git a/net/sctp/output.c b/net/sctp/output.c index 814eac0..85406d5 100644 --- a/net/sctp/output.c +++ b/net/sctp/output.c @@ -704,18 +704,15 @@ static sctp_xmit_t sctp_packet_can_append_data(struct sctp_packet *packet, * unacknowledged. */ - if (sctp_sk(asoc->base.sk)->nodelay) - /* Nagle disabled */ + if ((sctp_sk(asoc->base.sk)->nodelay || inflight = 0) && + !chunk->msg->force_delay) + /* Nothing unacked */ return SCTP_XMIT_OK; if (!sctp_packet_empty(packet)) /* Append to packet */ return SCTP_XMIT_OK; - if (inflight = 0) - /* Nothing unacked */ - return SCTP_XMIT_OK; - if (!sctp_state(asoc, ESTABLISHED)) return SCTP_XMIT_OK; diff --git a/net/sctp/socket.c b/net/sctp/socket.c index 75f35ce..b532148 100644 --- a/net/sctp/socket.c +++ b/net/sctp/socket.c @@ -1964,6 +1964,7 @@ static int sctp_sendmsg(struct sock *sk, struct msghdr *msg, size_t msg_len) err = PTR_ERR(datamsg); goto out_free; } + datamsg->force_delay = !!(msg->msg_flags & MSG_MORE); /* Now send the (possibly) fragmented message. */ list_for_each_entry(chunk, &datamsg->chunks, frag_list) { -- 2.1.0 ^ permalink raw reply related [flat|nested] 44+ messages in thread
* RE: [PATCH net-next 2/2] sctp: add support for MSG_MORE 2017-02-18 17:52 ` Xin Long (?) @ 2017-02-21 14:27 ` David Laight 2017-02-23 3:45 ` Xin Long -1 siblings, 1 reply; 44+ messages in thread From: David Laight @ 2017-02-21 14:27 UTC (permalink / raw) To: 'Xin Long', network dev, linux-sctp Cc: Marcelo Ricardo Leitner, Neil Horman, Vlad Yasevich, davem From: Xin Long > Sent: 18 February 2017 17:53 > This patch is to add support for MSG_MORE on sctp. > > It adds force_delay in sctp_datamsg to save MSG_MORE, and sets it after > creating datamsg according to the send flag. sctp_packet_can_append_data > then uses it to decide if the chunks of this msg will be sent at once or > delay it. > > Note that unlike [1], this patch saves MSG_MORE in datamsg, instead of > in assoc. As sctp enqueues the chunks first, then dequeue them one by > one. If it's saved in assoc,the current msg's send flag (MSG_MORE) may > affect other chunks' bundling. I thought about that and decided that the MSG_MORE flag on the last data chunk was the only one that mattered. Indeed looking at any others is broken. Consider what happens if you have two small chunks queued, the first with MSG_MORE set, the second with it clear. I think that sctp_outq_flush() will look at the first chunk and decide it doesn't need to do anything because sctp_packet_transmit_chunk() returns SCTP_XMIT_DELAY. The data chunk with MSG_MORE clear won't even be looked at. So the data will never be sent. I wouldn't worry about having messages queued that have MSG_MORE clean when the final message has it set. While it might be 'nice' to send the data (would have to be tx credit) waiting for the next data chunk shouldn't be a problem. I'm not sure I even want to test the current patch! David ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH net-next 2/2] sctp: add support for MSG_MORE 2017-02-21 14:27 ` David Laight @ 2017-02-23 3:45 ` Xin Long 0 siblings, 0 replies; 44+ messages in thread From: Xin Long @ 2017-02-23 3:45 UTC (permalink / raw) To: David Laight Cc: network dev, linux-sctp, Marcelo Ricardo Leitner, Neil Horman, Vlad Yasevich, davem On Tue, Feb 21, 2017 at 10:27 PM, David Laight <David.Laight@aculab.com> wrote: > From: Xin Long >> Sent: 18 February 2017 17:53 >> This patch is to add support for MSG_MORE on sctp. >> >> It adds force_delay in sctp_datamsg to save MSG_MORE, and sets it after >> creating datamsg according to the send flag. sctp_packet_can_append_data >> then uses it to decide if the chunks of this msg will be sent at once or >> delay it. >> >> Note that unlike [1], this patch saves MSG_MORE in datamsg, instead of >> in assoc. As sctp enqueues the chunks first, then dequeue them one by >> one. If it's saved in assoc,the current msg's send flag (MSG_MORE) may >> affect other chunks' bundling. > > I thought about that and decided that the MSG_MORE flag on the last data > chunk was the only one that mattered. > Indeed looking at any others is broken. > > Consider what happens if you have two small chunks queued, the first > with MSG_MORE set, the second with it clear. > > I think that sctp_outq_flush() will look at the first chunk and decide it > doesn't need to do anything because sctp_packet_transmit_chunk() > returns SCTP_XMIT_DELAY. > The data chunk with MSG_MORE clear won't even be looked at. > So the data will never be sent. It's not that bad as you thought, in sctp_packet_can_append_data(): when inflight == 0 || sctp_sk(asoc->base.sk)->nodelay, the chunks would be still sent out. What MSG_MORE flag actually does is ignore inflight == 0 and sctp_sk(asoc->base.sk)->nodelay to delay the chunks, but still it has to respect the original logic (like !chunk->msg->can_delay || !sctp_packet_empty(packet) || ...) To delay the chunks with MSG_MORE set even when inflight is 0 it especially important here for users. > > I wouldn't worry about having messages queued that have MSG_MORE clean > when the final message has it set. Yeah, It's an old optimization for bundling. MSG_MORE should NOT break that. > While it might be 'nice' to send the data (would have to be tx credit) > waiting for the next data chunk shouldn't be a problem. sorry, you mean it shouldn't send the data if it's waiting for the next data whenever ? > > I'm not sure I even want to test the current patch! > > David > ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH net-next 2/2] sctp: add support for MSG_MORE @ 2017-02-23 3:45 ` Xin Long 0 siblings, 0 replies; 44+ messages in thread From: Xin Long @ 2017-02-23 3:45 UTC (permalink / raw) To: David Laight Cc: network dev, linux-sctp, Marcelo Ricardo Leitner, Neil Horman, Vlad Yasevich, davem On Tue, Feb 21, 2017 at 10:27 PM, David Laight <David.Laight@aculab.com> wrote: > From: Xin Long >> Sent: 18 February 2017 17:53 >> This patch is to add support for MSG_MORE on sctp. >> >> It adds force_delay in sctp_datamsg to save MSG_MORE, and sets it after >> creating datamsg according to the send flag. sctp_packet_can_append_data >> then uses it to decide if the chunks of this msg will be sent at once or >> delay it. >> >> Note that unlike [1], this patch saves MSG_MORE in datamsg, instead of >> in assoc. As sctp enqueues the chunks first, then dequeue them one by >> one. If it's saved in assoc,the current msg's send flag (MSG_MORE) may >> affect other chunks' bundling. > > I thought about that and decided that the MSG_MORE flag on the last data > chunk was the only one that mattered. > Indeed looking at any others is broken. > > Consider what happens if you have two small chunks queued, the first > with MSG_MORE set, the second with it clear. > > I think that sctp_outq_flush() will look at the first chunk and decide it > doesn't need to do anything because sctp_packet_transmit_chunk() > returns SCTP_XMIT_DELAY. > The data chunk with MSG_MORE clear won't even be looked at. > So the data will never be sent. It's not that bad as you thought, in sctp_packet_can_append_data(): when inflight = 0 || sctp_sk(asoc->base.sk)->nodelay, the chunks would be still sent out. What MSG_MORE flag actually does is ignore inflight = 0 and sctp_sk(asoc->base.sk)->nodelay to delay the chunks, but still it has to respect the original logic (like !chunk->msg->can_delay || !sctp_packet_empty(packet) || ...) To delay the chunks with MSG_MORE set even when inflight is 0 it especially important here for users. > > I wouldn't worry about having messages queued that have MSG_MORE clean > when the final message has it set. Yeah, It's an old optimization for bundling. MSG_MORE should NOT break that. > While it might be 'nice' to send the data (would have to be tx credit) > waiting for the next data chunk shouldn't be a problem. sorry, you mean it shouldn't send the data if it's waiting for the next data whenever ? > > I'm not sure I even want to test the current patch! > > David > ^ permalink raw reply [flat|nested] 44+ messages in thread
* RE: [PATCH net-next 2/2] sctp: add support for MSG_MORE 2017-02-23 3:45 ` Xin Long @ 2017-02-23 16:04 ` David Laight -1 siblings, 0 replies; 44+ messages in thread From: David Laight @ 2017-02-23 16:04 UTC (permalink / raw) To: 'Xin Long' Cc: network dev, linux-sctp, Marcelo Ricardo Leitner, Neil Horman, Vlad Yasevich, davem From: Xin Long > Sent: 23 February 2017 03:46 > On Tue, Feb 21, 2017 at 10:27 PM, David Laight <David.Laight@aculab.com> wrote: > > From: Xin Long > >> Sent: 18 February 2017 17:53 > >> This patch is to add support for MSG_MORE on sctp. > >> > >> It adds force_delay in sctp_datamsg to save MSG_MORE, and sets it after > >> creating datamsg according to the send flag. sctp_packet_can_append_data > >> then uses it to decide if the chunks of this msg will be sent at once or > >> delay it. > >> > >> Note that unlike [1], this patch saves MSG_MORE in datamsg, instead of > >> in assoc. As sctp enqueues the chunks first, then dequeue them one by > >> one. If it's saved in assoc,the current msg's send flag (MSG_MORE) may > >> affect other chunks' bundling. > > > > I thought about that and decided that the MSG_MORE flag on the last data > > chunk was the only one that mattered. > > Indeed looking at any others is broken. > > > > Consider what happens if you have two small chunks queued, the first > > with MSG_MORE set, the second with it clear. > > > > I think that sctp_outq_flush() will look at the first chunk and decide it > > doesn't need to do anything because sctp_packet_transmit_chunk() > > returns SCTP_XMIT_DELAY. > > The data chunk with MSG_MORE clear won't even be looked at. > > So the data will never be sent. > It's not that bad as you thought, in sctp_packet_can_append_data(): > when inflight == 0 || sctp_sk(asoc->base.sk)->nodelay, the chunks > would be still sent out. One of us isn't understanding the other :-) IIRC sctp_packet_can_append_data() is called for the first queued data chunk in order to decide whether to generate a message that consists only of data chunks. If it returns SCTP_XMIT_OK then a message is built collecting the rest of the queued data chunks (until the window fills). So if I send a message with MSG_MORE set (on an idle connection) SCTP_XMIT_DELAY is returned and a message isn't sent. I now send a second small message, this time with MSG_MORE clear. The message is queued, then the code looks to see if it can send anything. sctp_packet_can_append_data() is called for the first queued chunk. Since it has force_delay set SCTP_XMIT_DELAY is returned and no message is built. The second message isn't even looked at. > What MSG_MORE flag actually does is ignore inflight == 0 and > sctp_sk(asoc->base.sk)->nodelay to delay the chunks, but still > it has to respect the original logic (like !chunk->msg->can_delay > || !sctp_packet_empty(packet) || ...) > > To delay the chunks with MSG_MORE set even when inflight is 0 > it especially important here for users. I'm not too worried about that. Sending the first message was a cheap way to ensure something got sent if the application lied and didn't send a subsequent message. The change has hit Linus's tree, I'll should be able to test that and confirm what I think is going on. David ^ permalink raw reply [flat|nested] 44+ messages in thread
* RE: [PATCH net-next 2/2] sctp: add support for MSG_MORE @ 2017-02-23 16:04 ` David Laight 0 siblings, 0 replies; 44+ messages in thread From: David Laight @ 2017-02-23 16:04 UTC (permalink / raw) To: 'Xin Long' Cc: network dev, linux-sctp, Marcelo Ricardo Leitner, Neil Horman, Vlad Yasevich, davem RnJvbTogWGluIExvbmcNCj4gU2VudDogMjMgRmVicnVhcnkgMjAxNyAwMzo0Ng0KPiBPbiBUdWUs IEZlYiAyMSwgMjAxNyBhdCAxMDoyNyBQTSwgRGF2aWQgTGFpZ2h0IDxEYXZpZC5MYWlnaHRAYWN1 bGFiLmNvbT4gd3JvdGU6DQo+ID4gRnJvbTogWGluIExvbmcNCj4gPj4gU2VudDogMTggRmVicnVh cnkgMjAxNyAxNzo1Mw0KPiA+PiBUaGlzIHBhdGNoIGlzIHRvIGFkZCBzdXBwb3J0IGZvciBNU0df TU9SRSBvbiBzY3RwLg0KPiA+Pg0KPiA+PiBJdCBhZGRzIGZvcmNlX2RlbGF5IGluIHNjdHBfZGF0 YW1zZyB0byBzYXZlIE1TR19NT1JFLCBhbmQgc2V0cyBpdCBhZnRlcg0KPiA+PiBjcmVhdGluZyBk YXRhbXNnIGFjY29yZGluZyB0byB0aGUgc2VuZCBmbGFnLiBzY3RwX3BhY2tldF9jYW5fYXBwZW5k X2RhdGENCj4gPj4gdGhlbiB1c2VzIGl0IHRvIGRlY2lkZSBpZiB0aGUgY2h1bmtzIG9mIHRoaXMg bXNnIHdpbGwgYmUgc2VudCBhdCBvbmNlIG9yDQo+ID4+IGRlbGF5IGl0Lg0KPiA+Pg0KPiA+PiBO b3RlIHRoYXQgdW5saWtlIFsxXSwgdGhpcyBwYXRjaCBzYXZlcyBNU0dfTU9SRSBpbiBkYXRhbXNn LCBpbnN0ZWFkIG9mDQo+ID4+IGluIGFzc29jLiBBcyBzY3RwIGVucXVldWVzIHRoZSBjaHVua3Mg Zmlyc3QsIHRoZW4gZGVxdWV1ZSB0aGVtIG9uZSBieQ0KPiA+PiBvbmUuIElmIGl0J3Mgc2F2ZWQg aW4gYXNzb2MsdGhlIGN1cnJlbnQgbXNnJ3Mgc2VuZCBmbGFnIChNU0dfTU9SRSkgbWF5DQo+ID4+ IGFmZmVjdCBvdGhlciBjaHVua3MnIGJ1bmRsaW5nLg0KPiA+DQo+ID4gSSB0aG91Z2h0IGFib3V0 IHRoYXQgYW5kIGRlY2lkZWQgdGhhdCB0aGUgTVNHX01PUkUgZmxhZyBvbiB0aGUgbGFzdCBkYXRh DQo+ID4gY2h1bmsgd2FzIHRoZSBvbmx5IG9uZSB0aGF0IG1hdHRlcmVkLg0KPiA+IEluZGVlZCBs b29raW5nIGF0IGFueSBvdGhlcnMgaXMgYnJva2VuLg0KPiA+DQo+ID4gQ29uc2lkZXIgd2hhdCBo YXBwZW5zIGlmIHlvdSBoYXZlIHR3byBzbWFsbCBjaHVua3MgcXVldWVkLCB0aGUgZmlyc3QNCj4g PiB3aXRoIE1TR19NT1JFIHNldCwgdGhlIHNlY29uZCB3aXRoIGl0IGNsZWFyLg0KPiA+DQo+ID4g SSB0aGluayB0aGF0IHNjdHBfb3V0cV9mbHVzaCgpIHdpbGwgbG9vayBhdCB0aGUgZmlyc3QgY2h1 bmsgYW5kIGRlY2lkZSBpdA0KPiA+IGRvZXNuJ3QgbmVlZCB0byBkbyBhbnl0aGluZyBiZWNhdXNl IHNjdHBfcGFja2V0X3RyYW5zbWl0X2NodW5rKCkNCj4gPiByZXR1cm5zIFNDVFBfWE1JVF9ERUxB WS4NCj4gPiBUaGUgZGF0YSBjaHVuayB3aXRoIE1TR19NT1JFIGNsZWFyIHdvbid0IGV2ZW4gYmUg bG9va2VkIGF0Lg0KPiA+IFNvIHRoZSBkYXRhIHdpbGwgbmV2ZXIgYmUgc2VudC4NCg0KPiBJdCdz IG5vdCB0aGF0IGJhZCBhcyB5b3UgdGhvdWdodCwgaW4gc2N0cF9wYWNrZXRfY2FuX2FwcGVuZF9k YXRhKCk6DQo+IHdoZW4gaW5mbGlnaHQgPT0gMCB8fCBzY3RwX3NrKGFzb2MtPmJhc2Uuc2spLT5u b2RlbGF5LCB0aGUgY2h1bmtzDQo+IHdvdWxkIGJlIHN0aWxsIHNlbnQgb3V0Lg0KDQpPbmUgb2Yg dXMgaXNuJ3QgdW5kZXJzdGFuZGluZyB0aGUgb3RoZXIgOi0pDQoNCklJUkMgc2N0cF9wYWNrZXRf Y2FuX2FwcGVuZF9kYXRhKCkgaXMgY2FsbGVkIGZvciB0aGUgZmlyc3QgcXVldWVkDQpkYXRhIGNo dW5rIGluIG9yZGVyIHRvIGRlY2lkZSB3aGV0aGVyIHRvIGdlbmVyYXRlIGEgbWVzc2FnZSB0aGF0 DQpjb25zaXN0cyBvbmx5IG9mIGRhdGEgY2h1bmtzLg0KSWYgaXQgcmV0dXJucyBTQ1RQX1hNSVRf T0sgdGhlbiBhIG1lc3NhZ2UgaXMgYnVpbHQgY29sbGVjdGluZyB0aGUNCnJlc3Qgb2YgdGhlIHF1 ZXVlZCBkYXRhIGNodW5rcyAodW50aWwgdGhlIHdpbmRvdyBmaWxscykuDQoNClNvIGlmIEkgc2Vu ZCBhIG1lc3NhZ2Ugd2l0aCBNU0dfTU9SRSBzZXQgKG9uIGFuIGlkbGUgY29ubmVjdGlvbikNClND VFBfWE1JVF9ERUxBWSBpcyByZXR1cm5lZCBhbmQgYSBtZXNzYWdlIGlzbid0IHNlbnQuDQoNCkkg bm93IHNlbmQgYSBzZWNvbmQgc21hbGwgbWVzc2FnZSwgdGhpcyB0aW1lIHdpdGggTVNHX01PUkUg Y2xlYXIuDQpUaGUgbWVzc2FnZSBpcyBxdWV1ZWQsIHRoZW4gdGhlIGNvZGUgbG9va3MgdG8gc2Vl IGlmIGl0IGNhbiBzZW5kIGFueXRoaW5nLg0KDQpzY3RwX3BhY2tldF9jYW5fYXBwZW5kX2RhdGEo KSBpcyBjYWxsZWQgZm9yIHRoZSBmaXJzdCBxdWV1ZWQgY2h1bmsuDQpTaW5jZSBpdCBoYXMgZm9y Y2VfZGVsYXkgc2V0IFNDVFBfWE1JVF9ERUxBWSBpcyByZXR1cm5lZCBhbmQgbm8NCm1lc3NhZ2Ug aXMgYnVpbHQuDQpUaGUgc2Vjb25kIG1lc3NhZ2UgaXNuJ3QgZXZlbiBsb29rZWQgYXQuDQoNCj4g V2hhdCBNU0dfTU9SRSBmbGFnIGFjdHVhbGx5IGRvZXMgaXMgaWdub3JlIGluZmxpZ2h0ID09IDAg YW5kDQo+IHNjdHBfc2soYXNvYy0+YmFzZS5zayktPm5vZGVsYXkgdG8gZGVsYXkgdGhlIGNodW5r cywgYnV0IHN0aWxsDQo+IGl0IGhhcyB0byByZXNwZWN0IHRoZSBvcmlnaW5hbCBsb2dpYyAobGlr ZSAhY2h1bmstPm1zZy0+Y2FuX2RlbGF5DQo+IHx8ICFzY3RwX3BhY2tldF9lbXB0eShwYWNrZXQp IHx8IC4uLikNCj4gDQo+IFRvIGRlbGF5IHRoZSBjaHVua3Mgd2l0aCBNU0dfTU9SRSBzZXQgZXZl biB3aGVuIGluZmxpZ2h0IGlzIDANCj4gaXQgZXNwZWNpYWxseSBpbXBvcnRhbnQgaGVyZSBmb3Ig dXNlcnMuDQoNCkknbSBub3QgdG9vIHdvcnJpZWQgYWJvdXQgdGhhdC4NClNlbmRpbmcgdGhlIGZp cnN0IG1lc3NhZ2Ugd2FzIGEgY2hlYXAgd2F5IHRvIGVuc3VyZSBzb21ldGhpbmcgZ290DQpzZW50 IGlmIHRoZSBhcHBsaWNhdGlvbiBsaWVkIGFuZCBkaWRuJ3Qgc2VuZCBhIHN1YnNlcXVlbnQgbWVz c2FnZS4NCg0KVGhlIGNoYW5nZSBoYXMgaGl0IExpbnVzJ3MgdHJlZSwgSSdsbCBzaG91bGQgYmUg YWJsZSB0byB0ZXN0IHRoYXQNCmFuZCBjb25maXJtIHdoYXQgSSB0aGluayBpcyBnb2luZyBvbi4N Cg0KCURhdmlkDQoNCg= ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH net-next 2/2] sctp: add support for MSG_MORE 2017-02-23 16:04 ` David Laight @ 2017-02-23 17:40 ` Marcelo Ricardo Leitner -1 siblings, 0 replies; 44+ messages in thread From: Marcelo Ricardo Leitner @ 2017-02-23 17:40 UTC (permalink / raw) To: David Laight Cc: 'Xin Long', network dev, linux-sctp, Neil Horman, Vlad Yasevich, davem On Thu, Feb 23, 2017 at 04:04:10PM +0000, David Laight wrote: > From: Xin Long > > Sent: 23 February 2017 03:46 > > On Tue, Feb 21, 2017 at 10:27 PM, David Laight <David.Laight@aculab.com> wrote: > > > From: Xin Long > > >> Sent: 18 February 2017 17:53 > > >> This patch is to add support for MSG_MORE on sctp. > > >> > > >> It adds force_delay in sctp_datamsg to save MSG_MORE, and sets it after > > >> creating datamsg according to the send flag. sctp_packet_can_append_data > > >> then uses it to decide if the chunks of this msg will be sent at once or > > >> delay it. > > >> > > >> Note that unlike [1], this patch saves MSG_MORE in datamsg, instead of > > >> in assoc. As sctp enqueues the chunks first, then dequeue them one by > > >> one. If it's saved in assoc,the current msg's send flag (MSG_MORE) may > > >> affect other chunks' bundling. > > > > > > I thought about that and decided that the MSG_MORE flag on the last data > > > chunk was the only one that mattered. > > > Indeed looking at any others is broken. > > > > > > Consider what happens if you have two small chunks queued, the first > > > with MSG_MORE set, the second with it clear. > > > > > > I think that sctp_outq_flush() will look at the first chunk and decide it > > > doesn't need to do anything because sctp_packet_transmit_chunk() > > > returns SCTP_XMIT_DELAY. > > > The data chunk with MSG_MORE clear won't even be looked at. > > > So the data will never be sent. > > > It's not that bad as you thought, in sctp_packet_can_append_data(): > > when inflight == 0 || sctp_sk(asoc->base.sk)->nodelay, the chunks > > would be still sent out. > > One of us isn't understanding the other :-) > > IIRC sctp_packet_can_append_data() is called for the first queued > data chunk in order to decide whether to generate a message that Perhaps here lies the source of the confusion? sctp_packet_can_append_data() is called for all queued data chunks, and not just the first one. sctp_outq_flush (retransmissions here, omitted for simplicity) /* Finally, transmit new packets. */ while ((chunk = sctp_outq_dequeue_data(q)) != NULL) { sctp_packet_transmit_chunk sctp_packet_append_chunk sctp_packet_can_append_data __sctp_packet_append_chunk So chunks are checked one by one. > consists only of data chunks. That's not really its purpose. It's to check if it can append a data chunk to the packet being prepared, while respecting asoc state, cwnd, etc. HTH! Marcelo > If it returns SCTP_XMIT_OK then a message is built collecting the > rest of the queued data chunks (until the window fills). > > So if I send a message with MSG_MORE set (on an idle connection) > SCTP_XMIT_DELAY is returned and a message isn't sent. > > I now send a second small message, this time with MSG_MORE clear. > The message is queued, then the code looks to see if it can send anything. > > sctp_packet_can_append_data() is called for the first queued chunk. > Since it has force_delay set SCTP_XMIT_DELAY is returned and no > message is built. > The second message isn't even looked at. > > > What MSG_MORE flag actually does is ignore inflight == 0 and > > sctp_sk(asoc->base.sk)->nodelay to delay the chunks, but still > > it has to respect the original logic (like !chunk->msg->can_delay > > || !sctp_packet_empty(packet) || ...) > > > > To delay the chunks with MSG_MORE set even when inflight is 0 > > it especially important here for users. > > I'm not too worried about that. > Sending the first message was a cheap way to ensure something got > sent if the application lied and didn't send a subsequent message. > > The change has hit Linus's tree, I'll should be able to test that > and confirm what I think is going on. > > David > ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH net-next 2/2] sctp: add support for MSG_MORE @ 2017-02-23 17:40 ` Marcelo Ricardo Leitner 0 siblings, 0 replies; 44+ messages in thread From: Marcelo Ricardo Leitner @ 2017-02-23 17:40 UTC (permalink / raw) To: David Laight Cc: 'Xin Long', network dev, linux-sctp, Neil Horman, Vlad Yasevich, davem On Thu, Feb 23, 2017 at 04:04:10PM +0000, David Laight wrote: > From: Xin Long > > Sent: 23 February 2017 03:46 > > On Tue, Feb 21, 2017 at 10:27 PM, David Laight <David.Laight@aculab.com> wrote: > > > From: Xin Long > > >> Sent: 18 February 2017 17:53 > > >> This patch is to add support for MSG_MORE on sctp. > > >> > > >> It adds force_delay in sctp_datamsg to save MSG_MORE, and sets it after > > >> creating datamsg according to the send flag. sctp_packet_can_append_data > > >> then uses it to decide if the chunks of this msg will be sent at once or > > >> delay it. > > >> > > >> Note that unlike [1], this patch saves MSG_MORE in datamsg, instead of > > >> in assoc. As sctp enqueues the chunks first, then dequeue them one by > > >> one. If it's saved in assoc,the current msg's send flag (MSG_MORE) may > > >> affect other chunks' bundling. > > > > > > I thought about that and decided that the MSG_MORE flag on the last data > > > chunk was the only one that mattered. > > > Indeed looking at any others is broken. > > > > > > Consider what happens if you have two small chunks queued, the first > > > with MSG_MORE set, the second with it clear. > > > > > > I think that sctp_outq_flush() will look at the first chunk and decide it > > > doesn't need to do anything because sctp_packet_transmit_chunk() > > > returns SCTP_XMIT_DELAY. > > > The data chunk with MSG_MORE clear won't even be looked at. > > > So the data will never be sent. > > > It's not that bad as you thought, in sctp_packet_can_append_data(): > > when inflight = 0 || sctp_sk(asoc->base.sk)->nodelay, the chunks > > would be still sent out. > > One of us isn't understanding the other :-) > > IIRC sctp_packet_can_append_data() is called for the first queued > data chunk in order to decide whether to generate a message that Perhaps here lies the source of the confusion? sctp_packet_can_append_data() is called for all queued data chunks, and not just the first one. sctp_outq_flush (retransmissions here, omitted for simplicity) /* Finally, transmit new packets. */ while ((chunk = sctp_outq_dequeue_data(q)) != NULL) { sctp_packet_transmit_chunk sctp_packet_append_chunk sctp_packet_can_append_data __sctp_packet_append_chunk So chunks are checked one by one. > consists only of data chunks. That's not really its purpose. It's to check if it can append a data chunk to the packet being prepared, while respecting asoc state, cwnd, etc. HTH! Marcelo > If it returns SCTP_XMIT_OK then a message is built collecting the > rest of the queued data chunks (until the window fills). > > So if I send a message with MSG_MORE set (on an idle connection) > SCTP_XMIT_DELAY is returned and a message isn't sent. > > I now send a second small message, this time with MSG_MORE clear. > The message is queued, then the code looks to see if it can send anything. > > sctp_packet_can_append_data() is called for the first queued chunk. > Since it has force_delay set SCTP_XMIT_DELAY is returned and no > message is built. > The second message isn't even looked at. > > > What MSG_MORE flag actually does is ignore inflight = 0 and > > sctp_sk(asoc->base.sk)->nodelay to delay the chunks, but still > > it has to respect the original logic (like !chunk->msg->can_delay > > || !sctp_packet_empty(packet) || ...) > > > > To delay the chunks with MSG_MORE set even when inflight is 0 > > it especially important here for users. > > I'm not too worried about that. > Sending the first message was a cheap way to ensure something got > sent if the application lied and didn't send a subsequent message. > > The change has hit Linus's tree, I'll should be able to test that > and confirm what I think is going on. > > David > ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH net-next 2/2] sctp: add support for MSG_MORE 2017-02-23 17:40 ` Marcelo Ricardo Leitner @ 2017-02-23 18:16 ` Xin Long -1 siblings, 0 replies; 44+ messages in thread From: Xin Long @ 2017-02-23 18:16 UTC (permalink / raw) To: Marcelo Ricardo Leitner Cc: David Laight, network dev, linux-sctp, Neil Horman, Vlad Yasevich, davem On Fri, Feb 24, 2017 at 1:40 AM, Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> wrote: > On Thu, Feb 23, 2017 at 04:04:10PM +0000, David Laight wrote: >> From: Xin Long >> > Sent: 23 February 2017 03:46 >> > On Tue, Feb 21, 2017 at 10:27 PM, David Laight <David.Laight@aculab.com> wrote: >> > > From: Xin Long >> > >> Sent: 18 February 2017 17:53 >> > >> This patch is to add support for MSG_MORE on sctp. >> > >> >> > >> It adds force_delay in sctp_datamsg to save MSG_MORE, and sets it after >> > >> creating datamsg according to the send flag. sctp_packet_can_append_data >> > >> then uses it to decide if the chunks of this msg will be sent at once or >> > >> delay it. >> > >> >> > >> Note that unlike [1], this patch saves MSG_MORE in datamsg, instead of >> > >> in assoc. As sctp enqueues the chunks first, then dequeue them one by >> > >> one. If it's saved in assoc,the current msg's send flag (MSG_MORE) may >> > >> affect other chunks' bundling. >> > > >> > > I thought about that and decided that the MSG_MORE flag on the last data >> > > chunk was the only one that mattered. >> > > Indeed looking at any others is broken. >> > > >> > > Consider what happens if you have two small chunks queued, the first >> > > with MSG_MORE set, the second with it clear. >> > > >> > > I think that sctp_outq_flush() will look at the first chunk and decide it >> > > doesn't need to do anything because sctp_packet_transmit_chunk() >> > > returns SCTP_XMIT_DELAY. >> > > The data chunk with MSG_MORE clear won't even be looked at. >> > > So the data will never be sent. >> >> > It's not that bad as you thought, in sctp_packet_can_append_data(): >> > when inflight == 0 || sctp_sk(asoc->base.sk)->nodelay, the chunks >> > would be still sent out. >> >> One of us isn't understanding the other :-) >> >> IIRC sctp_packet_can_append_data() is called for the first queued >> data chunk in order to decide whether to generate a message that > > Perhaps here lies the source of the confusion? > sctp_packet_can_append_data() is called for all queued data chunks, and > not just the first one. > > sctp_outq_flush > (retransmissions here, omitted for simplicity) > /* Finally, transmit new packets. */ > while ((chunk = sctp_outq_dequeue_data(q)) != NULL) { > sctp_packet_transmit_chunk > sctp_packet_append_chunk > sctp_packet_can_append_data > __sctp_packet_append_chunk > > So chunks are checked one by one. I think I got David's point. like, the queue is: chunk3[null]-->chunk2 [msg_more]-->chunk1 [msg_more] it dequeue from chunk1, once it returns SCTP_XMIT_DELAY chunk2, chunk3 will has no chance to dequeue, as it will goto: sctpflush_out in sctp_outq_flush(), But we are expecting to send all chunks. > >> consists only of data chunks. > > That's not really its purpose. It's to check if it can append a data > chunk to the packet being prepared, while respecting asoc state, cwnd, > etc. > > HTH! > > Marcelo > >> If it returns SCTP_XMIT_OK then a message is built collecting the >> rest of the queued data chunks (until the window fills). >> >> So if I send a message with MSG_MORE set (on an idle connection) >> SCTP_XMIT_DELAY is returned and a message isn't sent. >> >> I now send a second small message, this time with MSG_MORE clear. >> The message is queued, then the code looks to see if it can send anything. >> >> sctp_packet_can_append_data() is called for the first queued chunk. >> Since it has force_delay set SCTP_XMIT_DELAY is returned and no >> message is built. >> The second message isn't even looked at. >> >> > What MSG_MORE flag actually does is ignore inflight == 0 and >> > sctp_sk(asoc->base.sk)->nodelay to delay the chunks, but still >> > it has to respect the original logic (like !chunk->msg->can_delay >> > || !sctp_packet_empty(packet) || ...) >> > >> > To delay the chunks with MSG_MORE set even when inflight is 0 >> > it especially important here for users. >> >> I'm not too worried about that. >> Sending the first message was a cheap way to ensure something got >> sent if the application lied and didn't send a subsequent message. >> >> The change has hit Linus's tree, I'll should be able to test that >> and confirm what I think is going on. >> >> David >> ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH net-next 2/2] sctp: add support for MSG_MORE @ 2017-02-23 18:16 ` Xin Long 0 siblings, 0 replies; 44+ messages in thread From: Xin Long @ 2017-02-23 18:16 UTC (permalink / raw) To: Marcelo Ricardo Leitner Cc: David Laight, network dev, linux-sctp, Neil Horman, Vlad Yasevich, davem On Fri, Feb 24, 2017 at 1:40 AM, Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> wrote: > On Thu, Feb 23, 2017 at 04:04:10PM +0000, David Laight wrote: >> From: Xin Long >> > Sent: 23 February 2017 03:46 >> > On Tue, Feb 21, 2017 at 10:27 PM, David Laight <David.Laight@aculab.com> wrote: >> > > From: Xin Long >> > >> Sent: 18 February 2017 17:53 >> > >> This patch is to add support for MSG_MORE on sctp. >> > >> >> > >> It adds force_delay in sctp_datamsg to save MSG_MORE, and sets it after >> > >> creating datamsg according to the send flag. sctp_packet_can_append_data >> > >> then uses it to decide if the chunks of this msg will be sent at once or >> > >> delay it. >> > >> >> > >> Note that unlike [1], this patch saves MSG_MORE in datamsg, instead of >> > >> in assoc. As sctp enqueues the chunks first, then dequeue them one by >> > >> one. If it's saved in assoc,the current msg's send flag (MSG_MORE) may >> > >> affect other chunks' bundling. >> > > >> > > I thought about that and decided that the MSG_MORE flag on the last data >> > > chunk was the only one that mattered. >> > > Indeed looking at any others is broken. >> > > >> > > Consider what happens if you have two small chunks queued, the first >> > > with MSG_MORE set, the second with it clear. >> > > >> > > I think that sctp_outq_flush() will look at the first chunk and decide it >> > > doesn't need to do anything because sctp_packet_transmit_chunk() >> > > returns SCTP_XMIT_DELAY. >> > > The data chunk with MSG_MORE clear won't even be looked at. >> > > So the data will never be sent. >> >> > It's not that bad as you thought, in sctp_packet_can_append_data(): >> > when inflight = 0 || sctp_sk(asoc->base.sk)->nodelay, the chunks >> > would be still sent out. >> >> One of us isn't understanding the other :-) >> >> IIRC sctp_packet_can_append_data() is called for the first queued >> data chunk in order to decide whether to generate a message that > > Perhaps here lies the source of the confusion? > sctp_packet_can_append_data() is called for all queued data chunks, and > not just the first one. > > sctp_outq_flush > (retransmissions here, omitted for simplicity) > /* Finally, transmit new packets. */ > while ((chunk = sctp_outq_dequeue_data(q)) != NULL) { > sctp_packet_transmit_chunk > sctp_packet_append_chunk > sctp_packet_can_append_data > __sctp_packet_append_chunk > > So chunks are checked one by one. I think I got David's point. like, the queue is: chunk3[null]-->chunk2 [msg_more]-->chunk1 [msg_more] it dequeue from chunk1, once it returns SCTP_XMIT_DELAY chunk2, chunk3 will has no chance to dequeue, as it will goto: sctpflush_out in sctp_outq_flush(), But we are expecting to send all chunks. > >> consists only of data chunks. > > That's not really its purpose. It's to check if it can append a data > chunk to the packet being prepared, while respecting asoc state, cwnd, > etc. > > HTH! > > Marcelo > >> If it returns SCTP_XMIT_OK then a message is built collecting the >> rest of the queued data chunks (until the window fills). >> >> So if I send a message with MSG_MORE set (on an idle connection) >> SCTP_XMIT_DELAY is returned and a message isn't sent. >> >> I now send a second small message, this time with MSG_MORE clear. >> The message is queued, then the code looks to see if it can send anything. >> >> sctp_packet_can_append_data() is called for the first queued chunk. >> Since it has force_delay set SCTP_XMIT_DELAY is returned and no >> message is built. >> The second message isn't even looked at. >> >> > What MSG_MORE flag actually does is ignore inflight = 0 and >> > sctp_sk(asoc->base.sk)->nodelay to delay the chunks, but still >> > it has to respect the original logic (like !chunk->msg->can_delay >> > || !sctp_packet_empty(packet) || ...) >> > >> > To delay the chunks with MSG_MORE set even when inflight is 0 >> > it especially important here for users. >> >> I'm not too worried about that. >> Sending the first message was a cheap way to ensure something got >> sent if the application lied and didn't send a subsequent message. >> >> The change has hit Linus's tree, I'll should be able to test that >> and confirm what I think is going on. >> >> David >> ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH net-next 2/2] sctp: add support for MSG_MORE 2017-02-23 18:16 ` Xin Long @ 2017-02-23 18:39 ` Marcelo Ricardo Leitner -1 siblings, 0 replies; 44+ messages in thread From: Marcelo Ricardo Leitner @ 2017-02-23 18:39 UTC (permalink / raw) To: Xin Long Cc: David Laight, network dev, linux-sctp, Neil Horman, Vlad Yasevich, davem On Fri, Feb 24, 2017 at 02:16:02AM +0800, Xin Long wrote: > On Fri, Feb 24, 2017 at 1:40 AM, Marcelo Ricardo Leitner > <marcelo.leitner@gmail.com> wrote: > > On Thu, Feb 23, 2017 at 04:04:10PM +0000, David Laight wrote: > >> From: Xin Long > >> > Sent: 23 February 2017 03:46 > >> > On Tue, Feb 21, 2017 at 10:27 PM, David Laight <David.Laight@aculab.com> wrote: > >> > > From: Xin Long > >> > >> Sent: 18 February 2017 17:53 > >> > >> This patch is to add support for MSG_MORE on sctp. > >> > >> > >> > >> It adds force_delay in sctp_datamsg to save MSG_MORE, and sets it after > >> > >> creating datamsg according to the send flag. sctp_packet_can_append_data > >> > >> then uses it to decide if the chunks of this msg will be sent at once or > >> > >> delay it. > >> > >> > >> > >> Note that unlike [1], this patch saves MSG_MORE in datamsg, instead of > >> > >> in assoc. As sctp enqueues the chunks first, then dequeue them one by > >> > >> one. If it's saved in assoc,the current msg's send flag (MSG_MORE) may > >> > >> affect other chunks' bundling. > >> > > > >> > > I thought about that and decided that the MSG_MORE flag on the last data > >> > > chunk was the only one that mattered. > >> > > Indeed looking at any others is broken. > >> > > > >> > > Consider what happens if you have two small chunks queued, the first > >> > > with MSG_MORE set, the second with it clear. > >> > > > >> > > I think that sctp_outq_flush() will look at the first chunk and decide it > >> > > doesn't need to do anything because sctp_packet_transmit_chunk() > >> > > returns SCTP_XMIT_DELAY. > >> > > The data chunk with MSG_MORE clear won't even be looked at. > >> > > So the data will never be sent. > >> > >> > It's not that bad as you thought, in sctp_packet_can_append_data(): > >> > when inflight == 0 || sctp_sk(asoc->base.sk)->nodelay, the chunks > >> > would be still sent out. > >> > >> One of us isn't understanding the other :-) > >> > >> IIRC sctp_packet_can_append_data() is called for the first queued > >> data chunk in order to decide whether to generate a message that > > > > Perhaps here lies the source of the confusion? > > sctp_packet_can_append_data() is called for all queued data chunks, and > > not just the first one. > > > > sctp_outq_flush > > (retransmissions here, omitted for simplicity) > > /* Finally, transmit new packets. */ > > while ((chunk = sctp_outq_dequeue_data(q)) != NULL) { > > sctp_packet_transmit_chunk > > sctp_packet_append_chunk > > sctp_packet_can_append_data > > __sctp_packet_append_chunk > > > > So chunks are checked one by one. > I think I got David's point. > like, the queue is: > > chunk3[null]-->chunk2 [msg_more]-->chunk1 [msg_more] > > it dequeue from chunk1, once it returns SCTP_XMIT_DELAY > chunk2, chunk3 will has no chance to dequeue, as it will > goto: sctpflush_out in sctp_outq_flush(), But we are expecting > to send all chunks. Ahh yes, exactly. > > > > >> consists only of data chunks. > > > > That's not really its purpose. It's to check if it can append a data > > chunk to the packet being prepared, while respecting asoc state, cwnd, > > etc. > > > > HTH! > > > > Marcelo > > > >> If it returns SCTP_XMIT_OK then a message is built collecting the > >> rest of the queued data chunks (until the window fills). > >> > >> So if I send a message with MSG_MORE set (on an idle connection) > >> SCTP_XMIT_DELAY is returned and a message isn't sent. > >> > >> I now send a second small message, this time with MSG_MORE clear. > >> The message is queued, then the code looks to see if it can send anything. > >> > >> sctp_packet_can_append_data() is called for the first queued chunk. > >> Since it has force_delay set SCTP_XMIT_DELAY is returned and no > >> message is built. > >> The second message isn't even looked at. > >> > >> > What MSG_MORE flag actually does is ignore inflight == 0 and > >> > sctp_sk(asoc->base.sk)->nodelay to delay the chunks, but still > >> > it has to respect the original logic (like !chunk->msg->can_delay > >> > || !sctp_packet_empty(packet) || ...) > >> > > >> > To delay the chunks with MSG_MORE set even when inflight is 0 > >> > it especially important here for users. > >> > >> I'm not too worried about that. > >> Sending the first message was a cheap way to ensure something got > >> sent if the application lied and didn't send a subsequent message. > >> > >> The change has hit Linus's tree, I'll should be able to test that > >> and confirm what I think is going on. > >> > >> David > >> > -- > To unsubscribe from this list: send the line "unsubscribe linux-sctp" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH net-next 2/2] sctp: add support for MSG_MORE @ 2017-02-23 18:39 ` Marcelo Ricardo Leitner 0 siblings, 0 replies; 44+ messages in thread From: Marcelo Ricardo Leitner @ 2017-02-23 18:39 UTC (permalink / raw) To: Xin Long Cc: David Laight, network dev, linux-sctp, Neil Horman, Vlad Yasevich, davem On Fri, Feb 24, 2017 at 02:16:02AM +0800, Xin Long wrote: > On Fri, Feb 24, 2017 at 1:40 AM, Marcelo Ricardo Leitner > <marcelo.leitner@gmail.com> wrote: > > On Thu, Feb 23, 2017 at 04:04:10PM +0000, David Laight wrote: > >> From: Xin Long > >> > Sent: 23 February 2017 03:46 > >> > On Tue, Feb 21, 2017 at 10:27 PM, David Laight <David.Laight@aculab.com> wrote: > >> > > From: Xin Long > >> > >> Sent: 18 February 2017 17:53 > >> > >> This patch is to add support for MSG_MORE on sctp. > >> > >> > >> > >> It adds force_delay in sctp_datamsg to save MSG_MORE, and sets it after > >> > >> creating datamsg according to the send flag. sctp_packet_can_append_data > >> > >> then uses it to decide if the chunks of this msg will be sent at once or > >> > >> delay it. > >> > >> > >> > >> Note that unlike [1], this patch saves MSG_MORE in datamsg, instead of > >> > >> in assoc. As sctp enqueues the chunks first, then dequeue them one by > >> > >> one. If it's saved in assoc,the current msg's send flag (MSG_MORE) may > >> > >> affect other chunks' bundling. > >> > > > >> > > I thought about that and decided that the MSG_MORE flag on the last data > >> > > chunk was the only one that mattered. > >> > > Indeed looking at any others is broken. > >> > > > >> > > Consider what happens if you have two small chunks queued, the first > >> > > with MSG_MORE set, the second with it clear. > >> > > > >> > > I think that sctp_outq_flush() will look at the first chunk and decide it > >> > > doesn't need to do anything because sctp_packet_transmit_chunk() > >> > > returns SCTP_XMIT_DELAY. > >> > > The data chunk with MSG_MORE clear won't even be looked at. > >> > > So the data will never be sent. > >> > >> > It's not that bad as you thought, in sctp_packet_can_append_data(): > >> > when inflight = 0 || sctp_sk(asoc->base.sk)->nodelay, the chunks > >> > would be still sent out. > >> > >> One of us isn't understanding the other :-) > >> > >> IIRC sctp_packet_can_append_data() is called for the first queued > >> data chunk in order to decide whether to generate a message that > > > > Perhaps here lies the source of the confusion? > > sctp_packet_can_append_data() is called for all queued data chunks, and > > not just the first one. > > > > sctp_outq_flush > > (retransmissions here, omitted for simplicity) > > /* Finally, transmit new packets. */ > > while ((chunk = sctp_outq_dequeue_data(q)) != NULL) { > > sctp_packet_transmit_chunk > > sctp_packet_append_chunk > > sctp_packet_can_append_data > > __sctp_packet_append_chunk > > > > So chunks are checked one by one. > I think I got David's point. > like, the queue is: > > chunk3[null]-->chunk2 [msg_more]-->chunk1 [msg_more] > > it dequeue from chunk1, once it returns SCTP_XMIT_DELAY > chunk2, chunk3 will has no chance to dequeue, as it will > goto: sctpflush_out in sctp_outq_flush(), But we are expecting > to send all chunks. Ahh yes, exactly. > > > > >> consists only of data chunks. > > > > That's not really its purpose. It's to check if it can append a data > > chunk to the packet being prepared, while respecting asoc state, cwnd, > > etc. > > > > HTH! > > > > Marcelo > > > >> If it returns SCTP_XMIT_OK then a message is built collecting the > >> rest of the queued data chunks (until the window fills). > >> > >> So if I send a message with MSG_MORE set (on an idle connection) > >> SCTP_XMIT_DELAY is returned and a message isn't sent. > >> > >> I now send a second small message, this time with MSG_MORE clear. > >> The message is queued, then the code looks to see if it can send anything. > >> > >> sctp_packet_can_append_data() is called for the first queued chunk. > >> Since it has force_delay set SCTP_XMIT_DELAY is returned and no > >> message is built. > >> The second message isn't even looked at. > >> > >> > What MSG_MORE flag actually does is ignore inflight = 0 and > >> > sctp_sk(asoc->base.sk)->nodelay to delay the chunks, but still > >> > it has to respect the original logic (like !chunk->msg->can_delay > >> > || !sctp_packet_empty(packet) || ...) > >> > > >> > To delay the chunks with MSG_MORE set even when inflight is 0 > >> > it especially important here for users. > >> > >> I'm not too worried about that. > >> Sending the first message was a cheap way to ensure something got > >> sent if the application lied and didn't send a subsequent message. > >> > >> The change has hit Linus's tree, I'll should be able to test that > >> and confirm what I think is going on. > >> > >> David > >> > -- > To unsubscribe from this list: send the line "unsubscribe linux-sctp" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH net-next 2/2] sctp: add support for MSG_MORE 2017-02-23 16:04 ` David Laight @ 2017-02-24 6:43 ` Xin Long -1 siblings, 0 replies; 44+ messages in thread From: Xin Long @ 2017-02-24 6:43 UTC (permalink / raw) To: David Laight Cc: network dev, linux-sctp, Marcelo Ricardo Leitner, Neil Horman, Vlad Yasevich, davem On Fri, Feb 24, 2017 at 12:04 AM, David Laight <David.Laight@aculab.com> wrote: > From: Xin Long >> Sent: 23 February 2017 03:46 >> On Tue, Feb 21, 2017 at 10:27 PM, David Laight <David.Laight@aculab.com> wrote: >> > From: Xin Long >> >> Sent: 18 February 2017 17:53 >> >> This patch is to add support for MSG_MORE on sctp. >> >> >> >> It adds force_delay in sctp_datamsg to save MSG_MORE, and sets it after >> >> creating datamsg according to the send flag. sctp_packet_can_append_data >> >> then uses it to decide if the chunks of this msg will be sent at once or >> >> delay it. >> >> >> >> Note that unlike [1], this patch saves MSG_MORE in datamsg, instead of >> >> in assoc. As sctp enqueues the chunks first, then dequeue them one by >> >> one. If it's saved in assoc,the current msg's send flag (MSG_MORE) may >> >> affect other chunks' bundling. >> > >> > I thought about that and decided that the MSG_MORE flag on the last data >> > chunk was the only one that mattered. >> > Indeed looking at any others is broken. >> > >> > Consider what happens if you have two small chunks queued, the first >> > with MSG_MORE set, the second with it clear. >> > >> > I think that sctp_outq_flush() will look at the first chunk and decide it >> > doesn't need to do anything because sctp_packet_transmit_chunk() >> > returns SCTP_XMIT_DELAY. >> > The data chunk with MSG_MORE clear won't even be looked at. >> > So the data will never be sent. > >> It's not that bad as you thought, in sctp_packet_can_append_data(): >> when inflight == 0 || sctp_sk(asoc->base.sk)->nodelay, the chunks >> would be still sent out. > > One of us isn't understanding the other :-) > > IIRC sctp_packet_can_append_data() is called for the first queued > data chunk in order to decide whether to generate a message that > consists only of data chunks. > If it returns SCTP_XMIT_OK then a message is built collecting the > rest of the queued data chunks (until the window fills). > > So if I send a message with MSG_MORE set (on an idle connection) > SCTP_XMIT_DELAY is returned and a message isn't sent. > > I now send a second small message, this time with MSG_MORE clear. > The message is queued, then the code looks to see if it can send anything. > > sctp_packet_can_append_data() is called for the first queued chunk. > Since it has force_delay set SCTP_XMIT_DELAY is returned and no > message is built. > The second message isn't even looked at. You're right. I can see the problem now. What I expected is it should work like: 1, send 3 small chunks with MSG_MORE set, the queue is: chk3 [set] -> chk2 [set] -> chk1 [set] 2. send 1 more chunk with MSG_MORE clear, the queue is: chk4[clear] -> chk3 [clear] -> chk2 [clear] -> chk1 [clear] 3. then if user send more small chunks with MSG_MORE set, the queue is like: chkB[set] -> chkA[set] -> chk4[clear] -> chk3 [clear] -> chk2 [clear] -> chk1 [clear] so that the new small chunks' flag will not affect the other chunks bundling. is it ok to you to work like that ? if yes, I propose the fix for this issue is: @@ -303,6 +303,17 @@ void sctp_outq_tail(struct sctp_outq *q, struct sctp_chunk *chunk, gfp_t gfp) sctp_cname(SCTP_ST_CHUNK(chunk->chunk_hdr->type)) : "illegal chunk"); + if (q->has_delay && !chunk->msg->force_delay) { + struct sctp_chunk *chk; + + list_for_each_entry(chk, &q->out_chunk_list, list) { + if (!chk->msg->force_delay) + break; + chk->msg->force_delay = 0; + } + } + q->has_delay = chunk->msg->force_delay; + Thanks. > >> What MSG_MORE flag actually does is ignore inflight == 0 and >> sctp_sk(asoc->base.sk)->nodelay to delay the chunks, but still >> it has to respect the original logic (like !chunk->msg->can_delay >> || !sctp_packet_empty(packet) || ...) >> >> To delay the chunks with MSG_MORE set even when inflight is 0 >> it especially important here for users. > > I'm not too worried about that. > Sending the first message was a cheap way to ensure something got > sent if the application lied and didn't send a subsequent message. > > The change has hit Linus's tree, I'll should be able to test that > and confirm what I think is going on. > > David > ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH net-next 2/2] sctp: add support for MSG_MORE @ 2017-02-24 6:43 ` Xin Long 0 siblings, 0 replies; 44+ messages in thread From: Xin Long @ 2017-02-24 6:43 UTC (permalink / raw) To: David Laight Cc: network dev, linux-sctp, Marcelo Ricardo Leitner, Neil Horman, Vlad Yasevich, davem On Fri, Feb 24, 2017 at 12:04 AM, David Laight <David.Laight@aculab.com> wrote: > From: Xin Long >> Sent: 23 February 2017 03:46 >> On Tue, Feb 21, 2017 at 10:27 PM, David Laight <David.Laight@aculab.com> wrote: >> > From: Xin Long >> >> Sent: 18 February 2017 17:53 >> >> This patch is to add support for MSG_MORE on sctp. >> >> >> >> It adds force_delay in sctp_datamsg to save MSG_MORE, and sets it after >> >> creating datamsg according to the send flag. sctp_packet_can_append_data >> >> then uses it to decide if the chunks of this msg will be sent at once or >> >> delay it. >> >> >> >> Note that unlike [1], this patch saves MSG_MORE in datamsg, instead of >> >> in assoc. As sctp enqueues the chunks first, then dequeue them one by >> >> one. If it's saved in assoc,the current msg's send flag (MSG_MORE) may >> >> affect other chunks' bundling. >> > >> > I thought about that and decided that the MSG_MORE flag on the last data >> > chunk was the only one that mattered. >> > Indeed looking at any others is broken. >> > >> > Consider what happens if you have two small chunks queued, the first >> > with MSG_MORE set, the second with it clear. >> > >> > I think that sctp_outq_flush() will look at the first chunk and decide it >> > doesn't need to do anything because sctp_packet_transmit_chunk() >> > returns SCTP_XMIT_DELAY. >> > The data chunk with MSG_MORE clear won't even be looked at. >> > So the data will never be sent. > >> It's not that bad as you thought, in sctp_packet_can_append_data(): >> when inflight = 0 || sctp_sk(asoc->base.sk)->nodelay, the chunks >> would be still sent out. > > One of us isn't understanding the other :-) > > IIRC sctp_packet_can_append_data() is called for the first queued > data chunk in order to decide whether to generate a message that > consists only of data chunks. > If it returns SCTP_XMIT_OK then a message is built collecting the > rest of the queued data chunks (until the window fills). > > So if I send a message with MSG_MORE set (on an idle connection) > SCTP_XMIT_DELAY is returned and a message isn't sent. > > I now send a second small message, this time with MSG_MORE clear. > The message is queued, then the code looks to see if it can send anything. > > sctp_packet_can_append_data() is called for the first queued chunk. > Since it has force_delay set SCTP_XMIT_DELAY is returned and no > message is built. > The second message isn't even looked at. You're right. I can see the problem now. What I expected is it should work like: 1, send 3 small chunks with MSG_MORE set, the queue is: chk3 [set] -> chk2 [set] -> chk1 [set] 2. send 1 more chunk with MSG_MORE clear, the queue is: chk4[clear] -> chk3 [clear] -> chk2 [clear] -> chk1 [clear] 3. then if user send more small chunks with MSG_MORE set, the queue is like: chkB[set] -> chkA[set] -> chk4[clear] -> chk3 [clear] -> chk2 [clear] -> chk1 [clear] so that the new small chunks' flag will not affect the other chunks bundling. is it ok to you to work like that ? if yes, I propose the fix for this issue is: @@ -303,6 +303,17 @@ void sctp_outq_tail(struct sctp_outq *q, struct sctp_chunk *chunk, gfp_t gfp) sctp_cname(SCTP_ST_CHUNK(chunk->chunk_hdr->type)) : "illegal chunk"); + if (q->has_delay && !chunk->msg->force_delay) { + struct sctp_chunk *chk; + + list_for_each_entry(chk, &q->out_chunk_list, list) { + if (!chk->msg->force_delay) + break; + chk->msg->force_delay = 0; + } + } + q->has_delay = chunk->msg->force_delay; + Thanks. > >> What MSG_MORE flag actually does is ignore inflight = 0 and >> sctp_sk(asoc->base.sk)->nodelay to delay the chunks, but still >> it has to respect the original logic (like !chunk->msg->can_delay >> || !sctp_packet_empty(packet) || ...) >> >> To delay the chunks with MSG_MORE set even when inflight is 0 >> it especially important here for users. > > I'm not too worried about that. > Sending the first message was a cheap way to ensure something got > sent if the application lied and didn't send a subsequent message. > > The change has hit Linus's tree, I'll should be able to test that > and confirm what I think is going on. > > David > ^ permalink raw reply [flat|nested] 44+ messages in thread
* RE: [PATCH net-next 2/2] sctp: add support for MSG_MORE 2017-02-24 6:43 ` Xin Long @ 2017-02-24 10:14 ` David Laight -1 siblings, 0 replies; 44+ messages in thread From: David Laight @ 2017-02-24 10:14 UTC (permalink / raw) To: 'Xin Long' Cc: network dev, linux-sctp, Marcelo Ricardo Leitner, Neil Horman, Vlad Yasevich, davem From: Xin Long > Sent: 24 February 2017 06:44 ... > > IIRC sctp_packet_can_append_data() is called for the first queued > > data chunk in order to decide whether to generate a message that > > consists only of data chunks. > > If it returns SCTP_XMIT_OK then a message is built collecting the > > rest of the queued data chunks (until the window fills). > > > > So if I send a message with MSG_MORE set (on an idle connection) > > SCTP_XMIT_DELAY is returned and a message isn't sent. > > > > I now send a second small message, this time with MSG_MORE clear. > > The message is queued, then the code looks to see if it can send anything. > > > > sctp_packet_can_append_data() is called for the first queued chunk. > > Since it has force_delay set SCTP_XMIT_DELAY is returned and no > > message is built. > > The second message isn't even looked at. > You're right. I can see the problem now. > > What I expected is it should work like: > > 1, send 3 small chunks with MSG_MORE set, the queue is: > chk3 [set] -> chk2 [set] -> chk1 [set] Strange way to write a queue! chk1 points to chk2 :-) > 2. send 1 more chunk with MSG_MORE clear, the queue is: > chk4[clear] -> chk3 [clear] -> chk2 [clear] -> chk1 [clear] I don't think processing the entire queue is a good idea. Both from execution time and the effects on the data cache. The SCTP code is horrid enough as it is. > 3. then if user send more small chunks with MSG_MORE set, > the queue is like: > chkB[set] -> chkA[set] -> chk4[clear] -> chk3 [clear] -> chk2 [clear] -> chk1 [clear] > so that the new small chunks' flag will not affect the other chunks bundling. That isn't really necessary. The user can't expect to have absolute control over which chunks get bundled together. If the above chunks still aren't big enough to fill a frame the code might as well wait for the next chunk instead of building a packet that contains chk1 through to chkB. Remember you'll only get a queued chunk with MSG_MORE clear if data can't be sent. As soon as data can be sent, if the first chunk has MSG_MORE clear all of the queued chunks will be sent. So immediately after your (3) the application is expected to send a chunk with MSG_MORE clear - at that point all the queued chunks can be sent in a single packet. So just save the last MSG_MORE on the association as I did. David ^ permalink raw reply [flat|nested] 44+ messages in thread
* RE: [PATCH net-next 2/2] sctp: add support for MSG_MORE @ 2017-02-24 10:14 ` David Laight 0 siblings, 0 replies; 44+ messages in thread From: David Laight @ 2017-02-24 10:14 UTC (permalink / raw) To: 'Xin Long' Cc: network dev, linux-sctp, Marcelo Ricardo Leitner, Neil Horman, Vlad Yasevich, davem RnJvbTogWGluIExvbmcNCj4gU2VudDogMjQgRmVicnVhcnkgMjAxNyAwNjo0NA0KLi4uDQo+ID4g SUlSQyBzY3RwX3BhY2tldF9jYW5fYXBwZW5kX2RhdGEoKSBpcyBjYWxsZWQgZm9yIHRoZSBmaXJz dCBxdWV1ZWQNCj4gPiBkYXRhIGNodW5rIGluIG9yZGVyIHRvIGRlY2lkZSB3aGV0aGVyIHRvIGdl bmVyYXRlIGEgbWVzc2FnZSB0aGF0DQo+ID4gY29uc2lzdHMgb25seSBvZiBkYXRhIGNodW5rcy4N Cj4gPiBJZiBpdCByZXR1cm5zIFNDVFBfWE1JVF9PSyB0aGVuIGEgbWVzc2FnZSBpcyBidWlsdCBj b2xsZWN0aW5nIHRoZQ0KPiA+IHJlc3Qgb2YgdGhlIHF1ZXVlZCBkYXRhIGNodW5rcyAodW50aWwg dGhlIHdpbmRvdyBmaWxscykuDQo+ID4NCj4gPiBTbyBpZiBJIHNlbmQgYSBtZXNzYWdlIHdpdGgg TVNHX01PUkUgc2V0IChvbiBhbiBpZGxlIGNvbm5lY3Rpb24pDQo+ID4gU0NUUF9YTUlUX0RFTEFZ IGlzIHJldHVybmVkIGFuZCBhIG1lc3NhZ2UgaXNuJ3Qgc2VudC4NCj4gPg0KPiA+IEkgbm93IHNl bmQgYSBzZWNvbmQgc21hbGwgbWVzc2FnZSwgdGhpcyB0aW1lIHdpdGggTVNHX01PUkUgY2xlYXIu DQo+ID4gVGhlIG1lc3NhZ2UgaXMgcXVldWVkLCB0aGVuIHRoZSBjb2RlIGxvb2tzIHRvIHNlZSBp ZiBpdCBjYW4gc2VuZCBhbnl0aGluZy4NCj4gPg0KPiA+IHNjdHBfcGFja2V0X2Nhbl9hcHBlbmRf ZGF0YSgpIGlzIGNhbGxlZCBmb3IgdGhlIGZpcnN0IHF1ZXVlZCBjaHVuay4NCj4gPiBTaW5jZSBp dCBoYXMgZm9yY2VfZGVsYXkgc2V0IFNDVFBfWE1JVF9ERUxBWSBpcyByZXR1cm5lZCBhbmQgbm8N Cj4gPiBtZXNzYWdlIGlzIGJ1aWx0Lg0KPiA+IFRoZSBzZWNvbmQgbWVzc2FnZSBpc24ndCBldmVu IGxvb2tlZCBhdC4NCj4gWW91J3JlIHJpZ2h0LiBJIGNhbiBzZWUgdGhlIHByb2JsZW0gbm93Lg0K PiANCj4gV2hhdCBJIGV4cGVjdGVkIGlzIGl0IHNob3VsZCB3b3JrIGxpa2U6DQo+IA0KPiAxLCBz ZW5kIDMgc21hbGwgY2h1bmtzIHdpdGggTVNHX01PUkUgc2V0LCB0aGUgcXVldWUgaXM6DQo+ICAg Y2hrMyBbc2V0XSAtPiBjaGsyIFtzZXRdIC0+IGNoazEgW3NldF0NCg0KU3RyYW5nZSB3YXkgdG8g d3JpdGUgYSBxdWV1ZSEgY2hrMSBwb2ludHMgdG8gY2hrMiA6LSkNCg0KPiAyLiBzZW5kIDEgbW9y ZSBjaHVuayB3aXRoIE1TR19NT1JFIGNsZWFyLCB0aGUgcXVldWUgaXM6DQo+ICAgY2hrNFtjbGVh cl0gLT4gY2hrMyBbY2xlYXJdIC0+IGNoazIgW2NsZWFyXSAtPiBjaGsxIFtjbGVhcl0NCg0KSSBk b24ndCB0aGluayBwcm9jZXNzaW5nIHRoZSBlbnRpcmUgcXVldWUgaXMgYSBnb29kIGlkZWEuDQpC b3RoIGZyb20gZXhlY3V0aW9uIHRpbWUgYW5kIHRoZSBlZmZlY3RzIG9uIHRoZSBkYXRhIGNhY2hl Lg0KVGhlIFNDVFAgY29kZSBpcyBob3JyaWQgZW5vdWdoIGFzIGl0IGlzLg0KDQo+IDMuIHRoZW4g aWYgdXNlciBzZW5kIG1vcmUgc21hbGwgY2h1bmtzIHdpdGggTVNHX01PUkUgc2V0LA0KPiB0aGUg cXVldWUgaXMgbGlrZToNCj4gICBjaGtCW3NldF0gLT4gY2hrQVtzZXRdIC0+IGNoazRbY2xlYXJd IC0+IGNoazMgW2NsZWFyXSAtPiBjaGsyIFtjbGVhcl0gLT4gY2hrMSBbY2xlYXJdDQo+IHNvIHRo YXQgdGhlIG5ldyBzbWFsbCBjaHVua3MnIGZsYWcgd2lsbCBub3QgYWZmZWN0IHRoZSBvdGhlciBj aHVua3MgYnVuZGxpbmcuDQoNClRoYXQgaXNuJ3QgcmVhbGx5IG5lY2Vzc2FyeS4NClRoZSB1c2Vy IGNhbid0IGV4cGVjdCB0byBoYXZlIGFic29sdXRlIGNvbnRyb2wgb3ZlciB3aGljaCBjaHVua3Mg Z2V0IGJ1bmRsZWQNCnRvZ2V0aGVyLg0KSWYgdGhlIGFib3ZlIGNodW5rcyBzdGlsbCBhcmVuJ3Qg YmlnIGVub3VnaCB0byBmaWxsIGEgZnJhbWUgdGhlIGNvZGUgbWlnaHQNCmFzIHdlbGwgd2FpdCBm b3IgdGhlIG5leHQgY2h1bmsgaW5zdGVhZCBvZiBidWlsZGluZyBhIHBhY2tldCB0aGF0IGNvbnRh aW5zDQpjaGsxIHRocm91Z2ggdG8gY2hrQi4NCg0KUmVtZW1iZXIgeW91J2xsIG9ubHkgZ2V0IGEg cXVldWVkIGNodW5rIHdpdGggTVNHX01PUkUgY2xlYXIgaWYgZGF0YSBjYW4ndCBiZSBzZW50Lg0K QXMgc29vbiBhcyBkYXRhIGNhbiBiZSBzZW50LCBpZiB0aGUgZmlyc3QgY2h1bmsgaGFzIE1TR19N T1JFIGNsZWFyIGFsbCBvZiB0aGUNCnF1ZXVlZCBjaHVua3Mgd2lsbCBiZSBzZW50Lg0KDQpTbyBp bW1lZGlhdGVseSBhZnRlciB5b3VyICgzKSB0aGUgYXBwbGljYXRpb24gaXMgZXhwZWN0ZWQgdG8g c2VuZCBhIGNodW5rDQp3aXRoIE1TR19NT1JFIGNsZWFyIC0gYXQgdGhhdCBwb2ludCBhbGwgdGhl IHF1ZXVlZCBjaHVua3MgY2FuIGJlIHNlbnQgaW4NCmEgc2luZ2xlIHBhY2tldC4NCg0KU28ganVz dCBzYXZlIHRoZSBsYXN0IE1TR19NT1JFIG9uIHRoZSBhc3NvY2lhdGlvbiBhcyBJIGRpZC4NCg0K CURhdmlkDQo ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH net-next 2/2] sctp: add support for MSG_MORE 2017-02-24 10:14 ` David Laight @ 2017-02-25 8:41 ` Xin Long -1 siblings, 0 replies; 44+ messages in thread From: Xin Long @ 2017-02-25 8:41 UTC (permalink / raw) To: David Laight Cc: network dev, linux-sctp, Marcelo Ricardo Leitner, Neil Horman, Vlad Yasevich, davem On Fri, Feb 24, 2017 at 6:14 PM, David Laight <David.Laight@aculab.com> wrote: > > From: Xin Long > > Sent: 24 February 2017 06:44 > ... > > > IIRC sctp_packet_can_append_data() is called for the first queued > > > data chunk in order to decide whether to generate a message that > > > consists only of data chunks. > > > If it returns SCTP_XMIT_OK then a message is built collecting the > > > rest of the queued data chunks (until the window fills). > > > > > > So if I send a message with MSG_MORE set (on an idle connection) > > > SCTP_XMIT_DELAY is returned and a message isn't sent. > > > > > > I now send a second small message, this time with MSG_MORE clear. > > > The message is queued, then the code looks to see if it can send anything. > > > > > > sctp_packet_can_append_data() is called for the first queued chunk. > > > Since it has force_delay set SCTP_XMIT_DELAY is returned and no > > > message is built. > > > The second message isn't even looked at. > > You're right. I can see the problem now. > > > > What I expected is it should work like: > > > > 1, send 3 small chunks with MSG_MORE set, the queue is: > > chk3 [set] -> chk2 [set] -> chk1 [set] > > Strange way to write a queue! chk1 points to chk2 :-) haha, just a model. > > > 2. send 1 more chunk with MSG_MORE clear, the queue is: > > chk4[clear] -> chk3 [clear] -> chk2 [clear] -> chk1 [clear] > > I don't think processing the entire queue is a good idea. > Both from execution time and the effects on the data cache. > The SCTP code is horrid enough as it is. you check the codes in last email, it's not processing the entire queue. 1). only when queue has delay chunk inside by checking queue->has_delay and current chunk has msg_more flag. 2). will break on the first chunk with clear in the queue. but yes, in 2), extra work has to be done than before, but not much. > > > 3. then if user send more small chunks with MSG_MORE set, > > the queue is like: > > chkB[set] -> chkA[set] -> chk4[clear] -> chk3 [clear] -> chk2 [clear] -> chk1 [clear] > > so that the new small chunks' flag will not affect the other chunks bundling. > > That isn't really necessary. > The user can't expect to have absolute control over which chunks get bundled > together. > If the above chunks still aren't big enough to fill a frame the code might > as well wait for the next chunk instead of building a packet that contains > chk1 through to chkB. > > Remember you'll only get a queued chunk with MSG_MORE clear if data can't be sent. > As soon as data can be sent, if the first chunk has MSG_MORE clear all of the > queued chunks will be sent. > > So immediately after your (3) the application is expected to send a chunk > with MSG_MORE clear - at that point all the queued chunks can be sent in > a single packet. understand this. what I'm worried about is if the msg_more is saved in assoc: chk4[clear] -> chk3 [clear] -> chk2 [clear] -> chk1 [clear] then when you send a small chkA with MSG_MORE, the queue will be like: chkA [set] -> chk4[set] -> chk3 [set] -> chk2 [set] -> chk1 [set] because msg_more is saved in assoc, every chunk can look at it. chk1 - chk4 are big enough to be packed into a packet, they were not sent last time because a lot of chunks are in the retransmit queue. But now even if retransmit queue is null, chk1-chk4 are still blocked. can you accept that chkA may block the old chunks ? > > So just save the last MSG_MORE on the association as I did. I will think about it, thanks. > > David ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH net-next 2/2] sctp: add support for MSG_MORE @ 2017-02-25 8:41 ` Xin Long 0 siblings, 0 replies; 44+ messages in thread From: Xin Long @ 2017-02-25 8:41 UTC (permalink / raw) To: David Laight Cc: network dev, linux-sctp, Marcelo Ricardo Leitner, Neil Horman, Vlad Yasevich, davem On Fri, Feb 24, 2017 at 6:14 PM, David Laight <David.Laight@aculab.com> wrote: > > From: Xin Long > > Sent: 24 February 2017 06:44 > ... > > > IIRC sctp_packet_can_append_data() is called for the first queued > > > data chunk in order to decide whether to generate a message that > > > consists only of data chunks. > > > If it returns SCTP_XMIT_OK then a message is built collecting the > > > rest of the queued data chunks (until the window fills). > > > > > > So if I send a message with MSG_MORE set (on an idle connection) > > > SCTP_XMIT_DELAY is returned and a message isn't sent. > > > > > > I now send a second small message, this time with MSG_MORE clear. > > > The message is queued, then the code looks to see if it can send anything. > > > > > > sctp_packet_can_append_data() is called for the first queued chunk. > > > Since it has force_delay set SCTP_XMIT_DELAY is returned and no > > > message is built. > > > The second message isn't even looked at. > > You're right. I can see the problem now. > > > > What I expected is it should work like: > > > > 1, send 3 small chunks with MSG_MORE set, the queue is: > > chk3 [set] -> chk2 [set] -> chk1 [set] > > Strange way to write a queue! chk1 points to chk2 :-) haha, just a model. > > > 2. send 1 more chunk with MSG_MORE clear, the queue is: > > chk4[clear] -> chk3 [clear] -> chk2 [clear] -> chk1 [clear] > > I don't think processing the entire queue is a good idea. > Both from execution time and the effects on the data cache. > The SCTP code is horrid enough as it is. you check the codes in last email, it's not processing the entire queue. 1). only when queue has delay chunk inside by checking queue->has_delay and current chunk has msg_more flag. 2). will break on the first chunk with clear in the queue. but yes, in 2), extra work has to be done than before, but not much. > > > 3. then if user send more small chunks with MSG_MORE set, > > the queue is like: > > chkB[set] -> chkA[set] -> chk4[clear] -> chk3 [clear] -> chk2 [clear] -> chk1 [clear] > > so that the new small chunks' flag will not affect the other chunks bundling. > > That isn't really necessary. > The user can't expect to have absolute control over which chunks get bundled > together. > If the above chunks still aren't big enough to fill a frame the code might > as well wait for the next chunk instead of building a packet that contains > chk1 through to chkB. > > Remember you'll only get a queued chunk with MSG_MORE clear if data can't be sent. > As soon as data can be sent, if the first chunk has MSG_MORE clear all of the > queued chunks will be sent. > > So immediately after your (3) the application is expected to send a chunk > with MSG_MORE clear - at that point all the queued chunks can be sent in > a single packet. understand this. what I'm worried about is if the msg_more is saved in assoc: chk4[clear] -> chk3 [clear] -> chk2 [clear] -> chk1 [clear] then when you send a small chkA with MSG_MORE, the queue will be like: chkA [set] -> chk4[set] -> chk3 [set] -> chk2 [set] -> chk1 [set] because msg_more is saved in assoc, every chunk can look at it. chk1 - chk4 are big enough to be packed into a packet, they were not sent last time because a lot of chunks are in the retransmit queue. But now even if retransmit queue is null, chk1-chk4 are still blocked. can you accept that chkA may block the old chunks ? > > So just save the last MSG_MORE on the association as I did. I will think about it, thanks. > > David ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH net-next 2/2] sctp: add support for MSG_MORE 2017-02-25 8:41 ` Xin Long @ 2017-02-27 4:49 ` Xin Long -1 siblings, 0 replies; 44+ messages in thread From: Xin Long @ 2017-02-27 4:49 UTC (permalink / raw) To: David Laight Cc: network dev, linux-sctp, Marcelo Ricardo Leitner, Neil Horman, Vlad Yasevich, davem On Sat, Feb 25, 2017 at 4:41 PM, Xin Long <lucien.xin@gmail.com> wrote: > On Fri, Feb 24, 2017 at 6:14 PM, David Laight <David.Laight@aculab.com> wrote: >> >> From: Xin Long >> > Sent: 24 February 2017 06:44 >> ... >> > > IIRC sctp_packet_can_append_data() is called for the first queued >> > > data chunk in order to decide whether to generate a message that >> > > consists only of data chunks. >> > > If it returns SCTP_XMIT_OK then a message is built collecting the >> > > rest of the queued data chunks (until the window fills). >> > > >> > > So if I send a message with MSG_MORE set (on an idle connection) >> > > SCTP_XMIT_DELAY is returned and a message isn't sent. >> > > >> > > I now send a second small message, this time with MSG_MORE clear. >> > > The message is queued, then the code looks to see if it can send anything. >> > > >> > > sctp_packet_can_append_data() is called for the first queued chunk. >> > > Since it has force_delay set SCTP_XMIT_DELAY is returned and no >> > > message is built. >> > > The second message isn't even looked at. >> > You're right. I can see the problem now. >> > >> > What I expected is it should work like: >> > >> > 1, send 3 small chunks with MSG_MORE set, the queue is: >> > chk3 [set] -> chk2 [set] -> chk1 [set] >> >> Strange way to write a queue! chk1 points to chk2 :-) > haha, just a model. > >> >> > 2. send 1 more chunk with MSG_MORE clear, the queue is: >> > chk4[clear] -> chk3 [clear] -> chk2 [clear] -> chk1 [clear] >> >> I don't think processing the entire queue is a good idea. >> Both from execution time and the effects on the data cache. >> The SCTP code is horrid enough as it is. > you check the codes in last email, it's not processing the entire queue. > > 1). only when queue has delay chunk inside by checking queue->has_delay > and current chunk has msg_more flag. > > 2). will break on the first chunk with clear in the queue. > > but yes, in 2), extra work has to be done than before, but not much. > >> >> > 3. then if user send more small chunks with MSG_MORE set, >> > the queue is like: >> > chkB[set] -> chkA[set] -> chk4[clear] -> chk3 [clear] -> chk2 [clear] -> chk1 [clear] >> > so that the new small chunks' flag will not affect the other chunks bundling. >> >> That isn't really necessary. >> The user can't expect to have absolute control over which chunks get bundled >> together. >> If the above chunks still aren't big enough to fill a frame the code might >> as well wait for the next chunk instead of building a packet that contains >> chk1 through to chkB. >> >> Remember you'll only get a queued chunk with MSG_MORE clear if data can't be sent. >> As soon as data can be sent, if the first chunk has MSG_MORE clear all of the >> queued chunks will be sent. >> >> So immediately after your (3) the application is expected to send a chunk >> with MSG_MORE clear - at that point all the queued chunks can be sent in >> a single packet. > understand this. > > what I'm worried about is if the msg_more is saved in assoc: > chk4[clear] -> chk3 [clear] -> chk2 [clear] -> chk1 [clear] > then when you send a small chkA with MSG_MORE, > the queue will be like: > chkA [set] -> chk4[set] -> chk3 [set] -> chk2 [set] -> chk1 [set] > because msg_more is saved in assoc, every chunk can look at it. > chk1 - chk4 are big enough to be packed into a packet, they were > not sent last time because a lot of chunks are in the retransmit > queue. > > But now even if retransmit queue is null, chk1-chk4 are still blocked. > > can you accept that chkA may block the old chunks ? even also block the retransmit chunks. > >> >> So just save the last MSG_MORE on the association as I did. > I will think about it, thanks. > >> >> David ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH net-next 2/2] sctp: add support for MSG_MORE @ 2017-02-27 4:49 ` Xin Long 0 siblings, 0 replies; 44+ messages in thread From: Xin Long @ 2017-02-27 4:49 UTC (permalink / raw) To: David Laight Cc: network dev, linux-sctp, Marcelo Ricardo Leitner, Neil Horman, Vlad Yasevich, davem On Sat, Feb 25, 2017 at 4:41 PM, Xin Long <lucien.xin@gmail.com> wrote: > On Fri, Feb 24, 2017 at 6:14 PM, David Laight <David.Laight@aculab.com> wrote: >> >> From: Xin Long >> > Sent: 24 February 2017 06:44 >> ... >> > > IIRC sctp_packet_can_append_data() is called for the first queued >> > > data chunk in order to decide whether to generate a message that >> > > consists only of data chunks. >> > > If it returns SCTP_XMIT_OK then a message is built collecting the >> > > rest of the queued data chunks (until the window fills). >> > > >> > > So if I send a message with MSG_MORE set (on an idle connection) >> > > SCTP_XMIT_DELAY is returned and a message isn't sent. >> > > >> > > I now send a second small message, this time with MSG_MORE clear. >> > > The message is queued, then the code looks to see if it can send anything. >> > > >> > > sctp_packet_can_append_data() is called for the first queued chunk. >> > > Since it has force_delay set SCTP_XMIT_DELAY is returned and no >> > > message is built. >> > > The second message isn't even looked at. >> > You're right. I can see the problem now. >> > >> > What I expected is it should work like: >> > >> > 1, send 3 small chunks with MSG_MORE set, the queue is: >> > chk3 [set] -> chk2 [set] -> chk1 [set] >> >> Strange way to write a queue! chk1 points to chk2 :-) > haha, just a model. > >> >> > 2. send 1 more chunk with MSG_MORE clear, the queue is: >> > chk4[clear] -> chk3 [clear] -> chk2 [clear] -> chk1 [clear] >> >> I don't think processing the entire queue is a good idea. >> Both from execution time and the effects on the data cache. >> The SCTP code is horrid enough as it is. > you check the codes in last email, it's not processing the entire queue. > > 1). only when queue has delay chunk inside by checking queue->has_delay > and current chunk has msg_more flag. > > 2). will break on the first chunk with clear in the queue. > > but yes, in 2), extra work has to be done than before, but not much. > >> >> > 3. then if user send more small chunks with MSG_MORE set, >> > the queue is like: >> > chkB[set] -> chkA[set] -> chk4[clear] -> chk3 [clear] -> chk2 [clear] -> chk1 [clear] >> > so that the new small chunks' flag will not affect the other chunks bundling. >> >> That isn't really necessary. >> The user can't expect to have absolute control over which chunks get bundled >> together. >> If the above chunks still aren't big enough to fill a frame the code might >> as well wait for the next chunk instead of building a packet that contains >> chk1 through to chkB. >> >> Remember you'll only get a queued chunk with MSG_MORE clear if data can't be sent. >> As soon as data can be sent, if the first chunk has MSG_MORE clear all of the >> queued chunks will be sent. >> >> So immediately after your (3) the application is expected to send a chunk >> with MSG_MORE clear - at that point all the queued chunks can be sent in >> a single packet. > understand this. > > what I'm worried about is if the msg_more is saved in assoc: > chk4[clear] -> chk3 [clear] -> chk2 [clear] -> chk1 [clear] > then when you send a small chkA with MSG_MORE, > the queue will be like: > chkA [set] -> chk4[set] -> chk3 [set] -> chk2 [set] -> chk1 [set] > because msg_more is saved in assoc, every chunk can look at it. > chk1 - chk4 are big enough to be packed into a packet, they were > not sent last time because a lot of chunks are in the retransmit > queue. > > But now even if retransmit queue is null, chk1-chk4 are still blocked. > > can you accept that chkA may block the old chunks ? even also block the retransmit chunks. > >> >> So just save the last MSG_MORE on the association as I did. > I will think about it, thanks. > >> >> David ^ permalink raw reply [flat|nested] 44+ messages in thread
* RE: [PATCH net-next 2/2] sctp: add support for MSG_MORE 2017-02-27 4:49 ` Xin Long @ 2017-02-27 10:48 ` David Laight -1 siblings, 0 replies; 44+ messages in thread From: David Laight @ 2017-02-27 10:48 UTC (permalink / raw) To: 'Xin Long' Cc: network dev, linux-sctp, Marcelo Ricardo Leitner, Neil Horman, Vlad Yasevich, davem From: Xin Long > Sent: 27 February 2017 04:49 ... > > what I'm worried about is if the msg_more is saved in assoc: > > chk4[clear] -> chk3 [clear] -> chk2 [clear] -> chk1 [clear] > > then when you send a small chkA with MSG_MORE, > > the queue will be like: > > chkA [set] -> chk4[set] -> chk3 [set] -> chk2 [set] -> chk1 [set] > > because msg_more is saved in assoc, every chunk can look at it. > > chk1 - chk4 are big enough to be packed into a packet, they were > > not sent last time because a lot of chunks are in the retransmit > > queue. Or just waiting for transmit window space. > > But now even if retransmit queue is null, chk1-chk4 are still blocked. > > > > can you accept that chkA may block the old chunks ? > even also block the retransmit chunks. I don't see a problem, the application is about to send another chunk. It is likely that the whole lot will go in one packet. If the connection is flow controlled off (even if due to packet loss) the minor delay waiting for the application won't make any real difference. Indeed reducing the number of ethernet frames may help increase overall throughput. Does a request for chunks to be retransmitted force a response packet be built? If so it will pick up all the queued data chunks regardless of any saved MSG_MORE bits. David ^ permalink raw reply [flat|nested] 44+ messages in thread
* RE: [PATCH net-next 2/2] sctp: add support for MSG_MORE @ 2017-02-27 10:48 ` David Laight 0 siblings, 0 replies; 44+ messages in thread From: David Laight @ 2017-02-27 10:48 UTC (permalink / raw) To: 'Xin Long' Cc: network dev, linux-sctp, Marcelo Ricardo Leitner, Neil Horman, Vlad Yasevich, davem RnJvbTogWGluIExvbmcNCj4gU2VudDogMjcgRmVicnVhcnkgMjAxNyAwNDo0OQ0KLi4uDQo+ID4g d2hhdCBJJ20gd29ycmllZCBhYm91dCBpcyBpZiB0aGUgbXNnX21vcmUgaXMgc2F2ZWQgaW4gYXNz b2M6DQo+ID4gICAgICBjaGs0W2NsZWFyXSAtPiBjaGszIFtjbGVhcl0gLT4gY2hrMiBbY2xlYXJd IC0+IGNoazEgW2NsZWFyXQ0KPiA+IHRoZW4gd2hlbiB5b3Ugc2VuZCBhIHNtYWxsIGNoa0Egd2l0 aCBNU0dfTU9SRSwNCj4gPiB0aGUgcXVldWUgd2lsbCBiZSBsaWtlOg0KPiA+ICAgICAgY2hrQSBb c2V0XSAtPiBjaGs0W3NldF0gLT4gY2hrMyBbc2V0XSAtPiBjaGsyIFtzZXRdIC0+IGNoazEgW3Nl dF0NCj4gPiBiZWNhdXNlIG1zZ19tb3JlIGlzIHNhdmVkIGluIGFzc29jLCBldmVyeSBjaHVuayBj YW4gbG9vayBhdCBpdC4NCj4gPiBjaGsxIC0gY2hrNCBhcmUgYmlnIGVub3VnaCB0byBiZSBwYWNr ZWQgaW50byBhIHBhY2tldCwgdGhleSB3ZXJlDQo+ID4gbm90IHNlbnQgbGFzdCB0aW1lIGJlY2F1 c2UgYSBsb3Qgb2YgY2h1bmtzIGFyZSBpbiB0aGUgcmV0cmFuc21pdA0KPiA+IHF1ZXVlLg0KDQpP ciBqdXN0IHdhaXRpbmcgZm9yIHRyYW5zbWl0IHdpbmRvdyBzcGFjZS4NCg0KPiA+IEJ1dCBub3cg ZXZlbiBpZiByZXRyYW5zbWl0IHF1ZXVlIGlzIG51bGwsIGNoazEtY2hrNCBhcmUgc3RpbGwgYmxv Y2tlZC4NCj4gPg0KPiA+IGNhbiB5b3UgYWNjZXB0IHRoYXQgY2hrQSBtYXkgYmxvY2sgdGhlIG9s ZCBjaHVua3MgPw0KPiBldmVuIGFsc28gYmxvY2sgdGhlIHJldHJhbnNtaXQgY2h1bmtzLg0KDQpJ IGRvbid0IHNlZSBhIHByb2JsZW0sIHRoZSBhcHBsaWNhdGlvbiBpcyBhYm91dCB0byBzZW5kIGFu b3RoZXIgY2h1bmsuDQpJdCBpcyBsaWtlbHkgdGhhdCB0aGUgd2hvbGUgbG90IHdpbGwgZ28gaW4g b25lIHBhY2tldC4NCg0KSWYgdGhlIGNvbm5lY3Rpb24gaXMgZmxvdyBjb250cm9sbGVkIG9mZiAo ZXZlbiBpZiBkdWUgdG8gcGFja2V0IGxvc3MpDQp0aGUgbWlub3IgZGVsYXkgd2FpdGluZyBmb3Ig dGhlIGFwcGxpY2F0aW9uIHdvbid0IG1ha2UgYW55IHJlYWwgZGlmZmVyZW5jZS4NCkluZGVlZCBy ZWR1Y2luZyB0aGUgbnVtYmVyIG9mIGV0aGVybmV0IGZyYW1lcyBtYXkgaGVscCBpbmNyZWFzZQ0K b3ZlcmFsbCB0aHJvdWdocHV0Lg0KDQpEb2VzIGEgcmVxdWVzdCBmb3IgY2h1bmtzIHRvIGJlIHJl dHJhbnNtaXR0ZWQgZm9yY2UgYSByZXNwb25zZSBwYWNrZXQgYmUgYnVpbHQ/DQpJZiBzbyBpdCB3 aWxsIHBpY2sgdXAgYWxsIHRoZSBxdWV1ZWQgZGF0YSBjaHVua3MgcmVnYXJkbGVzcyBvZiBhbnkg c2F2ZWQNCk1TR19NT1JFIGJpdHMuDQoNCglEYXZpZA0KDQoNCg= ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH net-next 2/2] sctp: add support for MSG_MORE 2017-02-24 10:14 ` David Laight @ 2017-03-21 22:04 ` Marcelo Ricardo Leitner -1 siblings, 0 replies; 44+ messages in thread From: Marcelo Ricardo Leitner @ 2017-03-21 22:04 UTC (permalink / raw) To: David Laight Cc: 'Xin Long', network dev, linux-sctp, Neil Horman, Vlad Yasevich, davem Hi, On Fri, Feb 24, 2017 at 10:14:09AM +0000, David Laight wrote: > From: Xin Long > > Sent: 24 February 2017 06:44 > ... > > > IIRC sctp_packet_can_append_data() is called for the first queued > > > data chunk in order to decide whether to generate a message that > > > consists only of data chunks. > > > If it returns SCTP_XMIT_OK then a message is built collecting the > > > rest of the queued data chunks (until the window fills). > > > > > > So if I send a message with MSG_MORE set (on an idle connection) > > > SCTP_XMIT_DELAY is returned and a message isn't sent. > > > > > > I now send a second small message, this time with MSG_MORE clear. > > > The message is queued, then the code looks to see if it can send anything. > > > > > > sctp_packet_can_append_data() is called for the first queued chunk. > > > Since it has force_delay set SCTP_XMIT_DELAY is returned and no > > > message is built. > > > The second message isn't even looked at. > > You're right. I can see the problem now. > > > > What I expected is it should work like: > > > > 1, send 3 small chunks with MSG_MORE set, the queue is: > > chk3 [set] -> chk2 [set] -> chk1 [set] > > Strange way to write a queue! chk1 points to chk2 :-) Strictly speaking, it's actually both together, as it's a double-linked list. :-) > > > 2. send 1 more chunk with MSG_MORE clear, the queue is: > > chk4[clear] -> chk3 [clear] -> chk2 [clear] -> chk1 [clear] > > I don't think processing the entire queue is a good idea. > Both from execution time and the effects on the data cache. It won't be processing the entire queue if not needed, and it will only process it on the last sendmsg() call. As the list is double-linked, it can walk backwards as necessary and stop just at the right point. So this doesn't imply on any quadratic or exponential factor here, but linear and only if/when finishing the MSG_MORE block. If the application is not using MSG_MORE, impact is zero. > The SCTP code is horrid enough as it is. > > > 3. then if user send more small chunks with MSG_MORE set, > > the queue is like: > > chkB[set] -> chkA[set] -> chk4[clear] -> chk3 [clear] -> chk2 [clear] -> chk1 [clear] > > so that the new small chunks' flag will not affect the other chunks bundling. > > That isn't really necessary. > The user can't expect to have absolute control over which chunks get bundled > together. So...? I mean, I'm okay with that but that doesn't explain why we can't do as Xin proposed on previous email here. > If the above chunks still aren't big enough to fill a frame the code might > as well wait for the next chunk instead of building a packet that contains > chk1 through to chkB. Our expectations are the same and that's what the proposed solution also achieves, no? > > Remember you'll only get a queued chunk with MSG_MORE clear if data can't be sent. > As soon as data can be sent, if the first chunk has MSG_MORE clear all of the > queued chunks will be sent. With the fix proposed by Xin, this would be more like: ... all of the _non-held_ chunks will be sent. After all, application asked to hold them, for whatever reason it had. > > So immediately after your (3) the application is expected to send a chunk > with MSG_MORE clear - at that point all the queued chunks can be sent in > a single packet. Yes. Isn't that the idea? > > So just save the last MSG_MORE on the association as I did. I don't see the reason to change that. Your reply seem to reject the idea but I cannot get the reason why. The solution proposed is more complex, yes, allows more control, yes, but those aren't real issues here. Marcelo ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH net-next 2/2] sctp: add support for MSG_MORE @ 2017-03-21 22:04 ` Marcelo Ricardo Leitner 0 siblings, 0 replies; 44+ messages in thread From: Marcelo Ricardo Leitner @ 2017-03-21 22:04 UTC (permalink / raw) To: David Laight Cc: 'Xin Long', network dev, linux-sctp, Neil Horman, Vlad Yasevich, davem Hi, On Fri, Feb 24, 2017 at 10:14:09AM +0000, David Laight wrote: > From: Xin Long > > Sent: 24 February 2017 06:44 > ... > > > IIRC sctp_packet_can_append_data() is called for the first queued > > > data chunk in order to decide whether to generate a message that > > > consists only of data chunks. > > > If it returns SCTP_XMIT_OK then a message is built collecting the > > > rest of the queued data chunks (until the window fills). > > > > > > So if I send a message with MSG_MORE set (on an idle connection) > > > SCTP_XMIT_DELAY is returned and a message isn't sent. > > > > > > I now send a second small message, this time with MSG_MORE clear. > > > The message is queued, then the code looks to see if it can send anything. > > > > > > sctp_packet_can_append_data() is called for the first queued chunk. > > > Since it has force_delay set SCTP_XMIT_DELAY is returned and no > > > message is built. > > > The second message isn't even looked at. > > You're right. I can see the problem now. > > > > What I expected is it should work like: > > > > 1, send 3 small chunks with MSG_MORE set, the queue is: > > chk3 [set] -> chk2 [set] -> chk1 [set] > > Strange way to write a queue! chk1 points to chk2 :-) Strictly speaking, it's actually both together, as it's a double-linked list. :-) > > > 2. send 1 more chunk with MSG_MORE clear, the queue is: > > chk4[clear] -> chk3 [clear] -> chk2 [clear] -> chk1 [clear] > > I don't think processing the entire queue is a good idea. > Both from execution time and the effects on the data cache. It won't be processing the entire queue if not needed, and it will only process it on the last sendmsg() call. As the list is double-linked, it can walk backwards as necessary and stop just at the right point. So this doesn't imply on any quadratic or exponential factor here, but linear and only if/when finishing the MSG_MORE block. If the application is not using MSG_MORE, impact is zero. > The SCTP code is horrid enough as it is. > > > 3. then if user send more small chunks with MSG_MORE set, > > the queue is like: > > chkB[set] -> chkA[set] -> chk4[clear] -> chk3 [clear] -> chk2 [clear] -> chk1 [clear] > > so that the new small chunks' flag will not affect the other chunks bundling. > > That isn't really necessary. > The user can't expect to have absolute control over which chunks get bundled > together. So...? I mean, I'm okay with that but that doesn't explain why we can't do as Xin proposed on previous email here. > If the above chunks still aren't big enough to fill a frame the code might > as well wait for the next chunk instead of building a packet that contains > chk1 through to chkB. Our expectations are the same and that's what the proposed solution also achieves, no? > > Remember you'll only get a queued chunk with MSG_MORE clear if data can't be sent. > As soon as data can be sent, if the first chunk has MSG_MORE clear all of the > queued chunks will be sent. With the fix proposed by Xin, this would be more like: ... all of the _non-held_ chunks will be sent. After all, application asked to hold them, for whatever reason it had. > > So immediately after your (3) the application is expected to send a chunk > with MSG_MORE clear - at that point all the queued chunks can be sent in > a single packet. Yes. Isn't that the idea? > > So just save the last MSG_MORE on the association as I did. I don't see the reason to change that. Your reply seem to reject the idea but I cannot get the reason why. The solution proposed is more complex, yes, allows more control, yes, but those aren't real issues here. Marcelo ^ permalink raw reply [flat|nested] 44+ messages in thread
* RE: [PATCH net-next 2/2] sctp: add support for MSG_MORE 2017-03-21 22:04 ` Marcelo Ricardo Leitner (?) @ 2017-03-22 14:07 ` David Laight 2017-03-22 17:33 ` 'Marcelo Ricardo Leitner' -1 siblings, 1 reply; 44+ messages in thread From: David Laight @ 2017-03-22 14:07 UTC (permalink / raw) To: 'Marcelo Ricardo Leitner' Cc: 'Xin Long', network dev, linux-sctp, Neil Horman, Vlad Yasevich, davem From: Marcelo Ricardo Leitner [mailto:marcelo.leitner@gmail.com] > Sent: 21 March 2017 22:04 > Hi, ... > > > 2. send 1 more chunk with MSG_MORE clear, the queue is: > > > chk4[clear] -> chk3 [clear] -> chk2 [clear] -> chk1 [clear] > > > > I don't think processing the entire queue is a good idea. > > Both from execution time and the effects on the data cache. > > It won't be processing the entire queue if not needed, and it will only > process it on the last sendmsg() call. As the list is double-linked, it > can walk backwards as necessary and stop just at the right point. So > this doesn't imply on any quadratic or exponential factor here, but > linear and only if/when finishing the MSG_MORE block. > > If the application is not using MSG_MORE, impact is zero. > > > The SCTP code is horrid enough as it is. > > > > > 3. then if user send more small chunks with MSG_MORE set, > > > the queue is like: > > > chkB[set] -> chkA[set] -> chk4[clear] -> chk3 [clear] -> chk2 [clear] -> chk1 [clear] > > > so that the new small chunks' flag will not affect the other chunks bundling. > > > > That isn't really necessary. > > The user can't expect to have absolute control over which chunks get bundled > > together. > > So...? > I mean, I'm okay with that but that doesn't explain why we can't do as > Xin proposed on previous email here. > > > If the above chunks still aren't big enough to fill a frame the code might > > as well wait for the next chunk instead of building a packet that contains > > chk1 through to chkB. > > Our expectations are the same and that's what the proposed solution also > achieves, no? Not really. > > Remember you'll only get a queued chunk with MSG_MORE clear if data can't be sent. > > As soon as data can be sent, if the first chunk has MSG_MORE clear all of the > > queued chunks will be sent. > > With the fix proposed by Xin, this would be more like: ... all of the > _non-held_ chunks will be sent. > After all, application asked to hold them, for whatever reason it had. You are mis-understanding what I think MSG_MORE is for. It isn't the application saying 'don't send this packet', but rather 'there is no point sending ANY data because I've more data to send'. There is also the inference that the application will immediately send the next piece of data. So it isn't a property of the queued chunk, it is an indication that the application is going to send more data immediately. > > So immediately after your (3) the application is expected to send a chunk > > with MSG_MORE clear - at that point all the queued chunks can be sent in > > a single packet. > > Yes. Isn't that the idea? > > > > > So just save the last MSG_MORE on the association as I did. > > I don't see the reason to change that. Your reply seem to reject the > idea but I cannot get the reason why. The solution proposed is more > complex, yes, allows more control, yes, but those aren't real issues > here. I think you are trying to provide control of chunking that is neither necessary nor desirable and may give a false indication of what it might be sensible for the application to have control over. Regardless of the MSG_MORE flags associated with any specific send() request there will always be protocol effects (like retransmissions or flow control 'on') that will generate different 'chunking'. David ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH net-next 2/2] sctp: add support for MSG_MORE 2017-03-22 14:07 ` David Laight @ 2017-03-22 17:33 ` 'Marcelo Ricardo Leitner' 0 siblings, 0 replies; 44+ messages in thread From: 'Marcelo Ricardo Leitner' @ 2017-03-22 17:33 UTC (permalink / raw) To: David Laight Cc: 'Xin Long', network dev, linux-sctp, Neil Horman, Vlad Yasevich, davem On Wed, Mar 22, 2017 at 02:07:37PM +0000, David Laight wrote: > From: Marcelo Ricardo Leitner [mailto:marcelo.leitner@gmail.com] > > Sent: 21 March 2017 22:04 > > Hi, > ... > > > > 2. send 1 more chunk with MSG_MORE clear, the queue is: > > > > chk4[clear] -> chk3 [clear] -> chk2 [clear] -> chk1 [clear] > > > > > > I don't think processing the entire queue is a good idea. > > > Both from execution time and the effects on the data cache. > > > > It won't be processing the entire queue if not needed, and it will only > > process it on the last sendmsg() call. As the list is double-linked, it > > can walk backwards as necessary and stop just at the right point. So > > this doesn't imply on any quadratic or exponential factor here, but > > linear and only if/when finishing the MSG_MORE block. > > > > If the application is not using MSG_MORE, impact is zero. > > > > > The SCTP code is horrid enough as it is. > > > > > > > 3. then if user send more small chunks with MSG_MORE set, > > > > the queue is like: > > > > chkB[set] -> chkA[set] -> chk4[clear] -> chk3 [clear] -> chk2 [clear] -> chk1 [clear] > > > > so that the new small chunks' flag will not affect the other chunks bundling. > > > > > > That isn't really necessary. > > > The user can't expect to have absolute control over which chunks get bundled > > > together. > > > > So...? > > I mean, I'm okay with that but that doesn't explain why we can't do as > > Xin proposed on previous email here. > > > > > If the above chunks still aren't big enough to fill a frame the code might > > > as well wait for the next chunk instead of building a packet that contains > > > chk1 through to chkB. > > > > Our expectations are the same and that's what the proposed solution also > > achieves, no? > > Not really. > > > > Remember you'll only get a queued chunk with MSG_MORE clear if data can't be sent. > > > As soon as data can be sent, if the first chunk has MSG_MORE clear all of the > > > queued chunks will be sent. > > > > With the fix proposed by Xin, this would be more like: ... all of the > > _non-held_ chunks will be sent. > > After all, application asked to hold them, for whatever reason it had. > > You are mis-understanding what I think MSG_MORE is for. > It isn't the application saying 'don't send this packet', but rather > 'there is no point sending ANY data because I've more data to send'. I see the difference now, thanks. > There is also the inference that the application will immediately > send the next piece of data. > > So it isn't a property of the queued chunk, it is an indication that > the application is going to send more data immediately. It would basically avoid sending the chunk immediately when inflight==0 and the rest could stay as is then. As the application is supposed to send more data immediately and clear the flag, we can send packets as they become full, and also let normal Nagle processing be. TCP also has a ceiling on this waiting, 200ms according to man 7 tcp/TCP_CORK. > > > > So immediately after your (3) the application is expected to send a chunk > > > with MSG_MORE clear - at that point all the queued chunks can be sent in > > > a single packet. > > > > Yes. Isn't that the idea? > > > > > > > > So just save the last MSG_MORE on the association as I did. > > > > I don't see the reason to change that. Your reply seem to reject the > > idea but I cannot get the reason why. The solution proposed is more > > complex, yes, allows more control, yes, but those aren't real issues > > here. > > I think you are trying to provide control of chunking that is neither > necessary nor desirable and may give a false indication of what it > might be sensible for the application to have control over. Agreed. > > Regardless of the MSG_MORE flags associated with any specific send() > request there will always be protocol effects (like retransmissions > or flow control 'on') that will generate different 'chunking'. Yes, those are the ones that may lead to some confusion on how it actually works, and mangling them is not really desired for the sideeffects that it might have. Sooner or later we could have bug reports like "hey this chunk shouldn't have been packed with that." if we stick with the initial proposition, while with David's view, we are only promising to not send packets with a single chunk and as long as the application send more data fast enough. David, are we on the same page now? ;-) Xin, what do you think? Marcelo ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH net-next 2/2] sctp: add support for MSG_MORE @ 2017-03-22 17:33 ` 'Marcelo Ricardo Leitner' 0 siblings, 0 replies; 44+ messages in thread From: 'Marcelo Ricardo Leitner' @ 2017-03-22 17:33 UTC (permalink / raw) To: David Laight Cc: 'Xin Long', network dev, linux-sctp, Neil Horman, Vlad Yasevich, davem On Wed, Mar 22, 2017 at 02:07:37PM +0000, David Laight wrote: > From: Marcelo Ricardo Leitner [mailto:marcelo.leitner@gmail.com] > > Sent: 21 March 2017 22:04 > > Hi, > ... > > > > 2. send 1 more chunk with MSG_MORE clear, the queue is: > > > > chk4[clear] -> chk3 [clear] -> chk2 [clear] -> chk1 [clear] > > > > > > I don't think processing the entire queue is a good idea. > > > Both from execution time and the effects on the data cache. > > > > It won't be processing the entire queue if not needed, and it will only > > process it on the last sendmsg() call. As the list is double-linked, it > > can walk backwards as necessary and stop just at the right point. So > > this doesn't imply on any quadratic or exponential factor here, but > > linear and only if/when finishing the MSG_MORE block. > > > > If the application is not using MSG_MORE, impact is zero. > > > > > The SCTP code is horrid enough as it is. > > > > > > > 3. then if user send more small chunks with MSG_MORE set, > > > > the queue is like: > > > > chkB[set] -> chkA[set] -> chk4[clear] -> chk3 [clear] -> chk2 [clear] -> chk1 [clear] > > > > so that the new small chunks' flag will not affect the other chunks bundling. > > > > > > That isn't really necessary. > > > The user can't expect to have absolute control over which chunks get bundled > > > together. > > > > So...? > > I mean, I'm okay with that but that doesn't explain why we can't do as > > Xin proposed on previous email here. > > > > > If the above chunks still aren't big enough to fill a frame the code might > > > as well wait for the next chunk instead of building a packet that contains > > > chk1 through to chkB. > > > > Our expectations are the same and that's what the proposed solution also > > achieves, no? > > Not really. > > > > Remember you'll only get a queued chunk with MSG_MORE clear if data can't be sent. > > > As soon as data can be sent, if the first chunk has MSG_MORE clear all of the > > > queued chunks will be sent. > > > > With the fix proposed by Xin, this would be more like: ... all of the > > _non-held_ chunks will be sent. > > After all, application asked to hold them, for whatever reason it had. > > You are mis-understanding what I think MSG_MORE is for. > It isn't the application saying 'don't send this packet', but rather > 'there is no point sending ANY data because I've more data to send'. I see the difference now, thanks. > There is also the inference that the application will immediately > send the next piece of data. > > So it isn't a property of the queued chunk, it is an indication that > the application is going to send more data immediately. It would basically avoid sending the chunk immediately when inflight=0 and the rest could stay as is then. As the application is supposed to send more data immediately and clear the flag, we can send packets as they become full, and also let normal Nagle processing be. TCP also has a ceiling on this waiting, 200ms according to man 7 tcp/TCP_CORK. > > > > So immediately after your (3) the application is expected to send a chunk > > > with MSG_MORE clear - at that point all the queued chunks can be sent in > > > a single packet. > > > > Yes. Isn't that the idea? > > > > > > > > So just save the last MSG_MORE on the association as I did. > > > > I don't see the reason to change that. Your reply seem to reject the > > idea but I cannot get the reason why. The solution proposed is more > > complex, yes, allows more control, yes, but those aren't real issues > > here. > > I think you are trying to provide control of chunking that is neither > necessary nor desirable and may give a false indication of what it > might be sensible for the application to have control over. Agreed. > > Regardless of the MSG_MORE flags associated with any specific send() > request there will always be protocol effects (like retransmissions > or flow control 'on') that will generate different 'chunking'. Yes, those are the ones that may lead to some confusion on how it actually works, and mangling them is not really desired for the sideeffects that it might have. Sooner or later we could have bug reports like "hey this chunk shouldn't have been packed with that." if we stick with the initial proposition, while with David's view, we are only promising to not send packets with a single chunk and as long as the application send more data fast enough. David, are we on the same page now? ;-) Xin, what do you think? Marcelo ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH net-next 2/2] sctp: add support for MSG_MORE 2017-03-22 17:33 ` 'Marcelo Ricardo Leitner' @ 2017-03-23 4:35 ` Xin Long -1 siblings, 0 replies; 44+ messages in thread From: Xin Long @ 2017-03-23 4:35 UTC (permalink / raw) To: Marcelo Ricardo Leitner Cc: David Laight, network dev, linux-sctp, Neil Horman, Vlad Yasevich, davem On Thu, Mar 23, 2017 at 1:33 AM, Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> wrote: > On Wed, Mar 22, 2017 at 02:07:37PM +0000, David Laight wrote: >> From: Marcelo Ricardo Leitner [mailto:marcelo.leitner@gmail.com] >> > Sent: 21 March 2017 22:04 >> > Hi, >> ... >> > > > 2. send 1 more chunk with MSG_MORE clear, the queue is: >> > > > chk4[clear] -> chk3 [clear] -> chk2 [clear] -> chk1 [clear] >> > > >> > > I don't think processing the entire queue is a good idea. >> > > Both from execution time and the effects on the data cache. >> > >> > It won't be processing the entire queue if not needed, and it will only >> > process it on the last sendmsg() call. As the list is double-linked, it >> > can walk backwards as necessary and stop just at the right point. So >> > this doesn't imply on any quadratic or exponential factor here, but >> > linear and only if/when finishing the MSG_MORE block. >> > >> > If the application is not using MSG_MORE, impact is zero. >> > >> > > The SCTP code is horrid enough as it is. >> > > >> > > > 3. then if user send more small chunks with MSG_MORE set, >> > > > the queue is like: >> > > > chkB[set] -> chkA[set] -> chk4[clear] -> chk3 [clear] -> chk2 [clear] -> chk1 [clear] >> > > > so that the new small chunks' flag will not affect the other chunks bundling. >> > > >> > > That isn't really necessary. >> > > The user can't expect to have absolute control over which chunks get bundled >> > > together. >> > >> > So...? >> > I mean, I'm okay with that but that doesn't explain why we can't do as >> > Xin proposed on previous email here. >> > >> > > If the above chunks still aren't big enough to fill a frame the code might >> > > as well wait for the next chunk instead of building a packet that contains >> > > chk1 through to chkB. >> > >> > Our expectations are the same and that's what the proposed solution also >> > achieves, no? >> >> Not really. >> >> > > Remember you'll only get a queued chunk with MSG_MORE clear if data can't be sent. >> > > As soon as data can be sent, if the first chunk has MSG_MORE clear all of the >> > > queued chunks will be sent. >> > >> > With the fix proposed by Xin, this would be more like: ... all of the >> > _non-held_ chunks will be sent. >> > After all, application asked to hold them, for whatever reason it had. >> >> You are mis-understanding what I think MSG_MORE is for. >> It isn't the application saying 'don't send this packet', but rather >> 'there is no point sending ANY data because I've more data to send'. > > I see the difference now, thanks. > >> There is also the inference that the application will immediately >> send the next piece of data. >> >> So it isn't a property of the queued chunk, it is an indication that >> the application is going to send more data immediately. > > It would basically avoid sending the chunk immediately when inflight==0 > and the rest could stay as is then. > As the application is supposed to send more data immediately and clear > the flag, we can send packets as they become full, and also let normal > Nagle processing be. TCP also has a ceiling on this waiting, 200ms > according to man 7 tcp/TCP_CORK. > >> >> > > So immediately after your (3) the application is expected to send a chunk >> > > with MSG_MORE clear - at that point all the queued chunks can be sent in >> > > a single packet. >> > >> > Yes. Isn't that the idea? >> > >> > > >> > > So just save the last MSG_MORE on the association as I did. >> > >> > I don't see the reason to change that. Your reply seem to reject the >> > idea but I cannot get the reason why. The solution proposed is more >> > complex, yes, allows more control, yes, but those aren't real issues >> > here. >> >> I think you are trying to provide control of chunking that is neither >> necessary nor desirable and may give a false indication of what it >> might be sensible for the application to have control over. > > Agreed. > >> >> Regardless of the MSG_MORE flags associated with any specific send() >> request there will always be protocol effects (like retransmissions >> or flow control 'on') that will generate different 'chunking'. > > Yes, those are the ones that may lead to some confusion on how it > actually works, and mangling them is not really desired for the > sideeffects that it might have. > > Sooner or later we could have bug reports like "hey this chunk shouldn't > have been packed with that." if we stick with the initial proposition, > while with David's view, we are only promising to not send packets with > a single chunk and as long as the application send more data fast enough. > > David, are we on the same page now? ;-) > > Xin, what do you think? If we insist that MSG_MORE means not to send ANY data, I compromise. does ANY include retransmission DATA? should MSG_MORE block retransmission ? > > Marcelo > ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH net-next 2/2] sctp: add support for MSG_MORE @ 2017-03-23 4:35 ` Xin Long 0 siblings, 0 replies; 44+ messages in thread From: Xin Long @ 2017-03-23 4:35 UTC (permalink / raw) To: Marcelo Ricardo Leitner Cc: David Laight, network dev, linux-sctp, Neil Horman, Vlad Yasevich, davem On Thu, Mar 23, 2017 at 1:33 AM, Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> wrote: > On Wed, Mar 22, 2017 at 02:07:37PM +0000, David Laight wrote: >> From: Marcelo Ricardo Leitner [mailto:marcelo.leitner@gmail.com] >> > Sent: 21 March 2017 22:04 >> > Hi, >> ... >> > > > 2. send 1 more chunk with MSG_MORE clear, the queue is: >> > > > chk4[clear] -> chk3 [clear] -> chk2 [clear] -> chk1 [clear] >> > > >> > > I don't think processing the entire queue is a good idea. >> > > Both from execution time and the effects on the data cache. >> > >> > It won't be processing the entire queue if not needed, and it will only >> > process it on the last sendmsg() call. As the list is double-linked, it >> > can walk backwards as necessary and stop just at the right point. So >> > this doesn't imply on any quadratic or exponential factor here, but >> > linear and only if/when finishing the MSG_MORE block. >> > >> > If the application is not using MSG_MORE, impact is zero. >> > >> > > The SCTP code is horrid enough as it is. >> > > >> > > > 3. then if user send more small chunks with MSG_MORE set, >> > > > the queue is like: >> > > > chkB[set] -> chkA[set] -> chk4[clear] -> chk3 [clear] -> chk2 [clear] -> chk1 [clear] >> > > > so that the new small chunks' flag will not affect the other chunks bundling. >> > > >> > > That isn't really necessary. >> > > The user can't expect to have absolute control over which chunks get bundled >> > > together. >> > >> > So...? >> > I mean, I'm okay with that but that doesn't explain why we can't do as >> > Xin proposed on previous email here. >> > >> > > If the above chunks still aren't big enough to fill a frame the code might >> > > as well wait for the next chunk instead of building a packet that contains >> > > chk1 through to chkB. >> > >> > Our expectations are the same and that's what the proposed solution also >> > achieves, no? >> >> Not really. >> >> > > Remember you'll only get a queued chunk with MSG_MORE clear if data can't be sent. >> > > As soon as data can be sent, if the first chunk has MSG_MORE clear all of the >> > > queued chunks will be sent. >> > >> > With the fix proposed by Xin, this would be more like: ... all of the >> > _non-held_ chunks will be sent. >> > After all, application asked to hold them, for whatever reason it had. >> >> You are mis-understanding what I think MSG_MORE is for. >> It isn't the application saying 'don't send this packet', but rather >> 'there is no point sending ANY data because I've more data to send'. > > I see the difference now, thanks. > >> There is also the inference that the application will immediately >> send the next piece of data. >> >> So it isn't a property of the queued chunk, it is an indication that >> the application is going to send more data immediately. > > It would basically avoid sending the chunk immediately when inflight=0 > and the rest could stay as is then. > As the application is supposed to send more data immediately and clear > the flag, we can send packets as they become full, and also let normal > Nagle processing be. TCP also has a ceiling on this waiting, 200ms > according to man 7 tcp/TCP_CORK. > >> >> > > So immediately after your (3) the application is expected to send a chunk >> > > with MSG_MORE clear - at that point all the queued chunks can be sent in >> > > a single packet. >> > >> > Yes. Isn't that the idea? >> > >> > > >> > > So just save the last MSG_MORE on the association as I did. >> > >> > I don't see the reason to change that. Your reply seem to reject the >> > idea but I cannot get the reason why. The solution proposed is more >> > complex, yes, allows more control, yes, but those aren't real issues >> > here. >> >> I think you are trying to provide control of chunking that is neither >> necessary nor desirable and may give a false indication of what it >> might be sensible for the application to have control over. > > Agreed. > >> >> Regardless of the MSG_MORE flags associated with any specific send() >> request there will always be protocol effects (like retransmissions >> or flow control 'on') that will generate different 'chunking'. > > Yes, those are the ones that may lead to some confusion on how it > actually works, and mangling them is not really desired for the > sideeffects that it might have. > > Sooner or later we could have bug reports like "hey this chunk shouldn't > have been packed with that." if we stick with the initial proposition, > while with David's view, we are only promising to not send packets with > a single chunk and as long as the application send more data fast enough. > > David, are we on the same page now? ;-) > > Xin, what do you think? If we insist that MSG_MORE means not to send ANY data, I compromise. does ANY include retransmission DATA? should MSG_MORE block retransmission ? > > Marcelo > ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH net-next 2/2] sctp: add support for MSG_MORE 2017-03-23 4:35 ` Xin Long @ 2017-03-23 16:42 ` Marcelo Ricardo Leitner -1 siblings, 0 replies; 44+ messages in thread From: Marcelo Ricardo Leitner @ 2017-03-23 16:42 UTC (permalink / raw) To: Xin Long Cc: David Laight, network dev, linux-sctp, Neil Horman, Vlad Yasevich, davem On Thu, Mar 23, 2017 at 12:35:46PM +0800, Xin Long wrote: > On Thu, Mar 23, 2017 at 1:33 AM, Marcelo Ricardo Leitner > <marcelo.leitner@gmail.com> wrote: > > On Wed, Mar 22, 2017 at 02:07:37PM +0000, David Laight wrote: > >> Regardless of the MSG_MORE flags associated with any specific send() > >> request there will always be protocol effects (like retransmissions > >> or flow control 'on') that will generate different 'chunking'. > > > > Yes, those are the ones that may lead to some confusion on how it > > actually works, and mangling them is not really desired for the > > sideeffects that it might have. > > > > Sooner or later we could have bug reports like "hey this chunk shouldn't > > have been packed with that." if we stick with the initial proposition, > > while with David's view, we are only promising to not send packets with > > a single chunk and as long as the application send more data fast enough. > > > > David, are we on the same page now? ;-) > > > > Xin, what do you think? > If we insist that MSG_MORE means not to send ANY data, I compromise. > does ANY include retransmission DATA? should MSG_MORE block > retransmission ? That's not really what he meant by that, I think. That "ANY" in there is a way to refer to the entire buf and not that msg sendmsg is sending. Later I explained what I got from his explanation, which should be more like: "If MSG_MORE was used, and there are no packets in flight, do not send a packet right away because the application is going to send more data." Would have to handle the (Not-)Nagle situation too: "If not using Nagle and using MSG_MORE, try to not generate a packet right away." (because this may send packets with a single chunk even if in_flight != 0) In both cases, if the flush is generated by other triggers, it's okay. Because if there are chunks already queued, they will be sent as soon as in_flight reaches 0 or some other break is lifted (flow control). Holding the chunk that was queued with MSG_MORE and sending a partial packet in this case because of MSG_MORE is not good, it's possibly not saving anything. Marcelo ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH net-next 2/2] sctp: add support for MSG_MORE @ 2017-03-23 16:42 ` Marcelo Ricardo Leitner 0 siblings, 0 replies; 44+ messages in thread From: Marcelo Ricardo Leitner @ 2017-03-23 16:42 UTC (permalink / raw) To: Xin Long Cc: David Laight, network dev, linux-sctp, Neil Horman, Vlad Yasevich, davem On Thu, Mar 23, 2017 at 12:35:46PM +0800, Xin Long wrote: > On Thu, Mar 23, 2017 at 1:33 AM, Marcelo Ricardo Leitner > <marcelo.leitner@gmail.com> wrote: > > On Wed, Mar 22, 2017 at 02:07:37PM +0000, David Laight wrote: > >> Regardless of the MSG_MORE flags associated with any specific send() > >> request there will always be protocol effects (like retransmissions > >> or flow control 'on') that will generate different 'chunking'. > > > > Yes, those are the ones that may lead to some confusion on how it > > actually works, and mangling them is not really desired for the > > sideeffects that it might have. > > > > Sooner or later we could have bug reports like "hey this chunk shouldn't > > have been packed with that." if we stick with the initial proposition, > > while with David's view, we are only promising to not send packets with > > a single chunk and as long as the application send more data fast enough. > > > > David, are we on the same page now? ;-) > > > > Xin, what do you think? > If we insist that MSG_MORE means not to send ANY data, I compromise. > does ANY include retransmission DATA? should MSG_MORE block > retransmission ? That's not really what he meant by that, I think. That "ANY" in there is a way to refer to the entire buf and not that msg sendmsg is sending. Later I explained what I got from his explanation, which should be more like: "If MSG_MORE was used, and there are no packets in flight, do not send a packet right away because the application is going to send more data." Would have to handle the (Not-)Nagle situation too: "If not using Nagle and using MSG_MORE, try to not generate a packet right away." (because this may send packets with a single chunk even if in_flight != 0) In both cases, if the flush is generated by other triggers, it's okay. Because if there are chunks already queued, they will be sent as soon as in_flight reaches 0 or some other break is lifted (flow control). Holding the chunk that was queued with MSG_MORE and sending a partial packet in this case because of MSG_MORE is not good, it's possibly not saving anything. Marcelo ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH net-next 2/2] sctp: add support for MSG_MORE 2017-03-23 16:42 ` Marcelo Ricardo Leitner @ 2017-03-24 16:09 ` Xin Long -1 siblings, 0 replies; 44+ messages in thread From: Xin Long @ 2017-03-24 16:09 UTC (permalink / raw) To: Marcelo Ricardo Leitner Cc: David Laight, network dev, linux-sctp, Neil Horman, Vlad Yasevich, davem On Fri, Mar 24, 2017 at 12:42 AM, Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> wrote: > On Thu, Mar 23, 2017 at 12:35:46PM +0800, Xin Long wrote: >> On Thu, Mar 23, 2017 at 1:33 AM, Marcelo Ricardo Leitner >> <marcelo.leitner@gmail.com> wrote: >> > On Wed, Mar 22, 2017 at 02:07:37PM +0000, David Laight wrote: >> >> Regardless of the MSG_MORE flags associated with any specific send() >> >> request there will always be protocol effects (like retransmissions >> >> or flow control 'on') that will generate different 'chunking'. >> > >> > Yes, those are the ones that may lead to some confusion on how it >> > actually works, and mangling them is not really desired for the >> > sideeffects that it might have. >> > >> > Sooner or later we could have bug reports like "hey this chunk shouldn't >> > have been packed with that." if we stick with the initial proposition, >> > while with David's view, we are only promising to not send packets with >> > a single chunk and as long as the application send more data fast enough. >> > >> > David, are we on the same page now? ;-) >> > >> > Xin, what do you think? >> If we insist that MSG_MORE means not to send ANY data, I compromise. >> does ANY include retransmission DATA? should MSG_MORE block >> retransmission ? > > That's not really what he meant by that, I think. That "ANY" in there is > a way to refer to the entire buf and not that msg sendmsg is sending. > Later I explained what I got from his explanation, which should be more > like: > "If MSG_MORE was used, and there are no packets in flight, do not send a > packet right away because the application is going to send more data." > Would have to handle the (Not-)Nagle situation too: > "If not using Nagle and using MSG_MORE, try to not generate a packet > right away." (because this may send packets with a single chunk even if > in_flight != 0) > In both cases, if the flush is generated by other triggers, it's okay. > > Because if there are chunks already queued, they will be sent as soon as > in_flight reaches 0 or some other break is lifted (flow control). > Holding the chunk that was queued with MSG_MORE and sending a partial > packet in this case because of MSG_MORE is not good, it's possibly not > saving anything. Makes sence, thanks for making this clear, will post a new fix. ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH net-next 2/2] sctp: add support for MSG_MORE @ 2017-03-24 16:09 ` Xin Long 0 siblings, 0 replies; 44+ messages in thread From: Xin Long @ 2017-03-24 16:09 UTC (permalink / raw) To: Marcelo Ricardo Leitner Cc: David Laight, network dev, linux-sctp, Neil Horman, Vlad Yasevich, davem On Fri, Mar 24, 2017 at 12:42 AM, Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> wrote: > On Thu, Mar 23, 2017 at 12:35:46PM +0800, Xin Long wrote: >> On Thu, Mar 23, 2017 at 1:33 AM, Marcelo Ricardo Leitner >> <marcelo.leitner@gmail.com> wrote: >> > On Wed, Mar 22, 2017 at 02:07:37PM +0000, David Laight wrote: >> >> Regardless of the MSG_MORE flags associated with any specific send() >> >> request there will always be protocol effects (like retransmissions >> >> or flow control 'on') that will generate different 'chunking'. >> > >> > Yes, those are the ones that may lead to some confusion on how it >> > actually works, and mangling them is not really desired for the >> > sideeffects that it might have. >> > >> > Sooner or later we could have bug reports like "hey this chunk shouldn't >> > have been packed with that." if we stick with the initial proposition, >> > while with David's view, we are only promising to not send packets with >> > a single chunk and as long as the application send more data fast enough. >> > >> > David, are we on the same page now? ;-) >> > >> > Xin, what do you think? >> If we insist that MSG_MORE means not to send ANY data, I compromise. >> does ANY include retransmission DATA? should MSG_MORE block >> retransmission ? > > That's not really what he meant by that, I think. That "ANY" in there is > a way to refer to the entire buf and not that msg sendmsg is sending. > Later I explained what I got from his explanation, which should be more > like: > "If MSG_MORE was used, and there are no packets in flight, do not send a > packet right away because the application is going to send more data." > Would have to handle the (Not-)Nagle situation too: > "If not using Nagle and using MSG_MORE, try to not generate a packet > right away." (because this may send packets with a single chunk even if > in_flight != 0) > In both cases, if the flush is generated by other triggers, it's okay. > > Because if there are chunks already queued, they will be sent as soon as > in_flight reaches 0 or some other break is lifted (flow control). > Holding the chunk that was queued with MSG_MORE and sending a partial > packet in this case because of MSG_MORE is not good, it's possibly not > saving anything. Makes sence, thanks for making this clear, will post a new fix. ^ permalink raw reply [flat|nested] 44+ messages in thread
* RE: [PATCH net-next 2/2] sctp: add support for MSG_MORE 2017-03-23 16:42 ` Marcelo Ricardo Leitner (?) (?) @ 2017-03-24 17:38 ` David Laight -1 siblings, 0 replies; 44+ messages in thread From: David Laight @ 2017-03-24 17:38 UTC (permalink / raw) To: 'Marcelo Ricardo Leitner', Xin Long Cc: network dev, linux-sctp, Neil Horman, Vlad Yasevich, davem From: Marcelo Ricardo Leitner > Sent: 23 March 2017 16:42 ... > > > David, are we on the same page now? ;-) Probably... > > > Xin, what do you think? > > If we insist that MSG_MORE means not to send ANY data, I compromise. > > does ANY include retransmission DATA? should MSG_MORE block > > retransmission ? > > That's not really what he meant by that, I think. That "ANY" in there is > a way to refer to the entire buf and not that msg sendmsg is sending. > Later I explained what I got from his explanation, which should be more > like: > "If MSG_MORE was used, and there are no packets in flight, do not send a > packet right away because the application is going to send more data." > Would have to handle the (Not-)Nagle situation too: > "If not using Nagle and using MSG_MORE, try to not generate a packet > right away." (because this may send packets with a single chunk even if > in_flight != 0) > In both cases, if the flush is generated by other triggers, it's okay. > > Because if there are chunks already queued, they will be sent as soon as > in_flight reaches 0 or some other break is lifted (flow control). > Holding the chunk that was queued with MSG_MORE and sending a partial > packet in this case because of MSG_MORE is not good, it's possibly not > saving anything. It is also worth remembering that this is all about whether the first chunk in an ethernet frame is a data chunk. If a frame is being sent for some other reason (eg ack or heartbeat) then it will collect queued data chunks. I've seen hearbeats collect data chunks, I've not checked that this doesn't happen for heartbeats that are probing IP addresses. David ^ permalink raw reply [flat|nested] 44+ messages in thread
* RE: [PATCH net-next 2/2] sctp: add support for MSG_MORE 2017-03-21 22:04 ` Marcelo Ricardo Leitner (?) (?) @ 2017-03-28 10:29 ` David Laight 2017-03-28 18:12 ` 'Marcelo Ricardo Leitner' -1 siblings, 1 reply; 44+ messages in thread From: David Laight @ 2017-03-28 10:29 UTC (permalink / raw) To: 'Marcelo Ricardo Leitner' Cc: 'Xin Long', network dev, linux-sctp, Neil Horman, Vlad Yasevich, davem From: Marcelo Ricardo Leitner > Sent: 21 March 2017 22:04 ... > > > 3. then if user send more small chunks with MSG_MORE set, > > > the queue is like: > > > chkB[set] -> chkA[set] -> chk4[clear] -> chk3 [clear] -> chk2 [clear] -> chk1 [clear] > > > so that the new small chunks' flag will not affect the other chunks bundling. > > > > That isn't really necessary. > > The user can't expect to have absolute control over which chunks get bundled > > together. > > So...? > I mean, I'm okay with that but that doesn't explain why we can't do as > Xin proposed on previous email here. > > > If the above chunks still aren't big enough to fill a frame the code might > > as well wait for the next chunk instead of building a packet that contains > > chk1 through to chkB. > > Our expectations are the same and that's what the proposed solution also > achieves, no? > > > > > Remember you'll only get a queued chunk with MSG_MORE clear if data can't be sent. > > As soon as data can be sent, if the first chunk has MSG_MORE clear all of the > > queued chunks will be sent. > > With the fix proposed by Xin, this would be more like: ... all of the > _non-held_ chunks will be sent. > After all, application asked to hold them, for whatever reason it had. > > > > > So immediately after your (3) the application is expected to send a chunk > > with MSG_MORE clear - at that point all the queued chunks can be sent in > > a single packet. > > Yes. Isn't that the idea? > > > > > So just save the last MSG_MORE on the association as I did. > > I don't see the reason to change that. Your reply seem to reject the > idea but I cannot get the reason why. The solution proposed is more > complex, yes, allows more control, yes, but those aren't real issues > here. I think I've realised why we were disagreeing.... For TCP MSG_MORE was implemented as an alternative to 'corking' for allocations that used multiple send() calls to send a single application level message (eg those that send a length and data separately). The real solution is to use sendv() (or writev() if sendv() doesn't exist). SCTP is different, every send() generates a separate data chunk. The problem in SCTP is that if an application has more than one data chunk to send it needs to stop the kernel sending anything until it has sent all the data chunks - otherwise they all go out in separate ethernet packets. A TCP application can avoid this by doing a single send() for all the data. This doesn't work for SCTP because it would generate a single data chunk. David ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH net-next 2/2] sctp: add support for MSG_MORE 2017-03-28 10:29 ` David Laight @ 2017-03-28 18:12 ` 'Marcelo Ricardo Leitner' 0 siblings, 0 replies; 44+ messages in thread From: 'Marcelo Ricardo Leitner' @ 2017-03-28 18:12 UTC (permalink / raw) To: David Laight Cc: 'Xin Long', network dev, linux-sctp, Neil Horman, Vlad Yasevich, davem On Tue, Mar 28, 2017 at 10:29:27AM +0000, David Laight wrote: > From: Marcelo Ricardo Leitner > > Sent: 21 March 2017 22:04 > ... > > > > 3. then if user send more small chunks with MSG_MORE set, > > > > the queue is like: > > > > chkB[set] -> chkA[set] -> chk4[clear] -> chk3 [clear] -> chk2 [clear] -> chk1 [clear] > > > > so that the new small chunks' flag will not affect the other chunks bundling. > > > > > > That isn't really necessary. > > > The user can't expect to have absolute control over which chunks get bundled > > > together. > > > > So...? > > I mean, I'm okay with that but that doesn't explain why we can't do as > > Xin proposed on previous email here. > > > > > If the above chunks still aren't big enough to fill a frame the code might > > > as well wait for the next chunk instead of building a packet that contains > > > chk1 through to chkB. > > > > Our expectations are the same and that's what the proposed solution also > > achieves, no? > > > > > > > > Remember you'll only get a queued chunk with MSG_MORE clear if data can't be sent. > > > As soon as data can be sent, if the first chunk has MSG_MORE clear all of the > > > queued chunks will be sent. > > > > With the fix proposed by Xin, this would be more like: ... all of the > > _non-held_ chunks will be sent. > > After all, application asked to hold them, for whatever reason it had. > > > > > > > > So immediately after your (3) the application is expected to send a chunk > > > with MSG_MORE clear - at that point all the queued chunks can be sent in > > > a single packet. > > > > Yes. Isn't that the idea? > > > > > > > > So just save the last MSG_MORE on the association as I did. > > > > I don't see the reason to change that. Your reply seem to reject the > > idea but I cannot get the reason why. The solution proposed is more > > complex, yes, allows more control, yes, but those aren't real issues > > here. > > I think I've realised why we were disagreeing.... > > For TCP MSG_MORE was implemented as an alternative to 'corking' for > allocations that used multiple send() calls to send a single application > level message (eg those that send a length and data separately). > The real solution is to use sendv() (or writev() if sendv() doesn't exist). > > SCTP is different, every send() generates a separate data chunk. > The problem in SCTP is that if an application has more than one data > chunk to send it needs to stop the kernel sending anything until > it has sent all the data chunks - otherwise they all go out in > separate ethernet packets. > > A TCP application can avoid this by doing a single send() for all the > data. This doesn't work for SCTP because it would generate a single > data chunk. Yep, it was based on TCP one since day 0. There is also the way MSG_MORE works for UDP, in which it appends the subsequent write to the same datagram. But then, for SCTP, we should use EOR flag, as Michael Tuexen had indicated here on the list a few months ago: """ What about adding support for the explicit EOR mode as specified in https://tools.ietf.org/html/rfc6458#section-8.1.26 """ Thus why it was very based on TCP implementation, so maybe in the future we could have both options available for SCTP. Marcelo ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH net-next 2/2] sctp: add support for MSG_MORE @ 2017-03-28 18:12 ` 'Marcelo Ricardo Leitner' 0 siblings, 0 replies; 44+ messages in thread From: 'Marcelo Ricardo Leitner' @ 2017-03-28 18:12 UTC (permalink / raw) To: David Laight Cc: 'Xin Long', network dev, linux-sctp, Neil Horman, Vlad Yasevich, davem On Tue, Mar 28, 2017 at 10:29:27AM +0000, David Laight wrote: > From: Marcelo Ricardo Leitner > > Sent: 21 March 2017 22:04 > ... > > > > 3. then if user send more small chunks with MSG_MORE set, > > > > the queue is like: > > > > chkB[set] -> chkA[set] -> chk4[clear] -> chk3 [clear] -> chk2 [clear] -> chk1 [clear] > > > > so that the new small chunks' flag will not affect the other chunks bundling. > > > > > > That isn't really necessary. > > > The user can't expect to have absolute control over which chunks get bundled > > > together. > > > > So...? > > I mean, I'm okay with that but that doesn't explain why we can't do as > > Xin proposed on previous email here. > > > > > If the above chunks still aren't big enough to fill a frame the code might > > > as well wait for the next chunk instead of building a packet that contains > > > chk1 through to chkB. > > > > Our expectations are the same and that's what the proposed solution also > > achieves, no? > > > > > > > > Remember you'll only get a queued chunk with MSG_MORE clear if data can't be sent. > > > As soon as data can be sent, if the first chunk has MSG_MORE clear all of the > > > queued chunks will be sent. > > > > With the fix proposed by Xin, this would be more like: ... all of the > > _non-held_ chunks will be sent. > > After all, application asked to hold them, for whatever reason it had. > > > > > > > > So immediately after your (3) the application is expected to send a chunk > > > with MSG_MORE clear - at that point all the queued chunks can be sent in > > > a single packet. > > > > Yes. Isn't that the idea? > > > > > > > > So just save the last MSG_MORE on the association as I did. > > > > I don't see the reason to change that. Your reply seem to reject the > > idea but I cannot get the reason why. The solution proposed is more > > complex, yes, allows more control, yes, but those aren't real issues > > here. > > I think I've realised why we were disagreeing.... > > For TCP MSG_MORE was implemented as an alternative to 'corking' for > allocations that used multiple send() calls to send a single application > level message (eg those that send a length and data separately). > The real solution is to use sendv() (or writev() if sendv() doesn't exist). > > SCTP is different, every send() generates a separate data chunk. > The problem in SCTP is that if an application has more than one data > chunk to send it needs to stop the kernel sending anything until > it has sent all the data chunks - otherwise they all go out in > separate ethernet packets. > > A TCP application can avoid this by doing a single send() for all the > data. This doesn't work for SCTP because it would generate a single > data chunk. Yep, it was based on TCP one since day 0. There is also the way MSG_MORE works for UDP, in which it appends the subsequent write to the same datagram. But then, for SCTP, we should use EOR flag, as Michael Tuexen had indicated here on the list a few months ago: """ What about adding support for the explicit EOR mode as specified in https://tools.ietf.org/html/rfc6458#section-8.1.26 """ Thus why it was very based on TCP implementation, so maybe in the future we could have both options available for SCTP. Marcelo ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH net-next 0/2] sctp: support MSG_MORE flag when sending msg 2017-02-18 17:52 ` Xin Long @ 2017-02-20 15:26 ` David Miller -1 siblings, 0 replies; 44+ messages in thread From: David Miller @ 2017-02-20 15:26 UTC (permalink / raw) To: lucien.xin Cc: netdev, linux-sctp, marcelo.leitner, nhorman, vyasevich, david.laight From: Xin Long <lucien.xin@gmail.com> Date: Sun, 19 Feb 2017 01:52:44 +0800 > This patch is to add support for MSG_MORE on sctp. Patch 1/2 is an > improvement ahead of patch 2/2 to solve the close block problem > mentioned in https://patchwork.ozlabs.org/patch/372404/. Series applied, thanks Xin. ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH net-next 0/2] sctp: support MSG_MORE flag when sending msg @ 2017-02-20 15:26 ` David Miller 0 siblings, 0 replies; 44+ messages in thread From: David Miller @ 2017-02-20 15:26 UTC (permalink / raw) To: lucien.xin Cc: netdev, linux-sctp, marcelo.leitner, nhorman, vyasevich, david.laight From: Xin Long <lucien.xin@gmail.com> Date: Sun, 19 Feb 2017 01:52:44 +0800 > This patch is to add support for MSG_MORE on sctp. Patch 1/2 is an > improvement ahead of patch 2/2 to solve the close block problem > mentioned in https://patchwork.ozlabs.org/patch/372404/. Series applied, thanks Xin. ^ permalink raw reply [flat|nested] 44+ messages in thread
end of thread, other threads:[~2017-03-28 18:13 UTC | newest] Thread overview: 44+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-02-18 17:52 [PATCH net-next 0/2] sctp: support MSG_MORE flag when sending msg Xin Long 2017-02-18 17:52 ` Xin Long 2017-02-18 17:52 ` [PATCH net-next 1/2] sctp: flush out queue once assoc state falls into SHUTDOWN_PENDING Xin Long 2017-02-18 17:52 ` Xin Long 2017-02-18 17:52 ` [PATCH net-next 2/2] sctp: add support for MSG_MORE Xin Long 2017-02-18 17:52 ` Xin Long 2017-02-21 14:27 ` David Laight 2017-02-23 3:45 ` Xin Long 2017-02-23 3:45 ` Xin Long 2017-02-23 16:04 ` David Laight 2017-02-23 16:04 ` David Laight 2017-02-23 17:40 ` Marcelo Ricardo Leitner 2017-02-23 17:40 ` Marcelo Ricardo Leitner 2017-02-23 18:16 ` Xin Long 2017-02-23 18:16 ` Xin Long 2017-02-23 18:39 ` Marcelo Ricardo Leitner 2017-02-23 18:39 ` Marcelo Ricardo Leitner 2017-02-24 6:43 ` Xin Long 2017-02-24 6:43 ` Xin Long 2017-02-24 10:14 ` David Laight 2017-02-24 10:14 ` David Laight 2017-02-25 8:41 ` Xin Long 2017-02-25 8:41 ` Xin Long 2017-02-27 4:49 ` Xin Long 2017-02-27 4:49 ` Xin Long 2017-02-27 10:48 ` David Laight 2017-02-27 10:48 ` David Laight 2017-03-21 22:04 ` Marcelo Ricardo Leitner 2017-03-21 22:04 ` Marcelo Ricardo Leitner 2017-03-22 14:07 ` David Laight 2017-03-22 17:33 ` 'Marcelo Ricardo Leitner' 2017-03-22 17:33 ` 'Marcelo Ricardo Leitner' 2017-03-23 4:35 ` Xin Long 2017-03-23 4:35 ` Xin Long 2017-03-23 16:42 ` Marcelo Ricardo Leitner 2017-03-23 16:42 ` Marcelo Ricardo Leitner 2017-03-24 16:09 ` Xin Long 2017-03-24 16:09 ` Xin Long 2017-03-24 17:38 ` David Laight 2017-03-28 10:29 ` David Laight 2017-03-28 18:12 ` 'Marcelo Ricardo Leitner' 2017-03-28 18:12 ` 'Marcelo Ricardo Leitner' 2017-02-20 15:26 ` [PATCH net-next 0/2] sctp: support MSG_MORE flag when sending msg David Miller 2017-02-20 15:26 ` David Miller
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.