* Piggyback the final ACK of the three way TCP connection establishment with the data
@ 2012-03-21 23:38 Vincent Li
  2012-03-21 23:52 ` Rick Jones
  2012-03-22  0:00 ` Eric Dumazet
  0 siblings, 2 replies; 11+ messages in thread
From: Vincent Li @ 2012-03-21 23:38 UTC (permalink / raw)
  To: netdev

Hi,

I came across this document:
http://h20000.www2.hp.com/bc/docs/support/SupportManual/c02020743/c02020743.pdf
which describes the following tunable:

tcp_delay_final_twh_ack:
            Piggyback the final ACK of the three way TCP connection
            establishment with the data by delaying the final ack by
           10ms.

           0: Send the final ACK as soon as the SYN+ACK packet
               arrives from the remote host.

           1: Delay the sending of the final ACK by 10ms. If
               there is data available to be sent with in the
               next 10ms, piggyback the ACK for the SYN.
[0-1] Default: 1

It appears this feature is not available in the Linux kernel TCP
implementation. Is it trivial to write a custom patch that adds it?
Can someone give me a hint on how to write such a patch?

We have a situation where we want the Linux kernel TCP stack to behave
this way so that we can reproduce another issue at hand.

Thanks

Vincent


* Re: Piggyback the final ACK of the three way TCP connection establishment with the data
  2012-03-21 23:38 Piggyback the final ACK of the three way TCP connection establishment with the data Vincent Li
@ 2012-03-21 23:52 ` Rick Jones
  2012-03-22  0:00 ` Eric Dumazet
  1 sibling, 0 replies; 11+ messages in thread
From: Rick Jones @ 2012-03-21 23:52 UTC (permalink / raw)
  To: Vincent Li; +Cc: netdev

On 03/21/2012 04:38 PM, Vincent Li wrote:
> Hi,
>
> I happen to see this link
> http://h20000.www2.hp.com/bc/docs/support/SupportManual/c02020743/c02020743.pdf
> and

That dates to March of 2008.

> tcp_delay_final_twh_ack:
>              Piggyback the final ACK of the three way TCP connection
>              establishment with the data by delaying the final ack by
>             10ms.
>
>             0: Send the final ACK as soon as the SYN+ACK packet
>                 arrives from the remote host.
>
>             1: Delay the sending of the final ACK by 10ms. If
>                 there is data available to be sent with in the
>                 next 10ms, piggyback the ACK for the SYN.
> [0-1] Default: 1

I guess that was to allow the client's ACK of the server's SYN|ACK of 
the client's SYN to be piggybacked with the client's initial request. 
Probably did nice things on either netperf TCP_CRR, or some 
connect()-heavy workload.    Probably not a good thing for those 
applications where the accept()ing side sends first...

> It appears this feature is not available in kernel tcp implementation,
> is it trivial to make a custom patch to make this feature available?
> can someone give a hint on how to make a patch for this?
>
> We had a situation that we want to make linux kernel tcp stack behave
> this way so we can reproduce another issue at hand.

Unless some other stack has that feature, why not simply use an HP-UX 
system directly?

rick jones


* Re: Piggyback the final ACK of the three way TCP connection establishment with the data
  2012-03-21 23:38 Piggyback the final ACK of the three way TCP connection establishment with the data Vincent Li
  2012-03-21 23:52 ` Rick Jones
@ 2012-03-22  0:00 ` Eric Dumazet
  2012-03-22 23:02   ` Vincent Li
  1 sibling, 1 reply; 11+ messages in thread
From: Eric Dumazet @ 2012-03-22  0:00 UTC (permalink / raw)
  To: Vincent Li; +Cc: netdev

On Wed, 2012-03-21 at 16:38 -0700, Vincent Li wrote:
> Hi,
> 
> I happen to see this link
> http://h20000.www2.hp.com/bc/docs/support/SupportManual/c02020743/c02020743.pdf
> and
> 
> tcp_delay_final_twh_ack:
>             Piggyback the final ACK of the three way TCP connection
>             establishment with the data by delaying the final ack by
>            10ms.
> 
>            0: Send the final ACK as soon as the SYN+ACK packet
>                arrives from the remote host.
> 
>            1: Delay the sending of the final ACK by 10ms. If
>                there is data available to be sent with in the
>                next 10ms, piggyback the ACK for the SYN.
> [0-1] Default: 1
> 
> It appears this feature is not available in kernel tcp implementation,
> is it trivial to make a custom patch to make this feature available?
> can someone give a hint on how to make a patch for this?
> 
> We had a situation that we want to make linux kernel tcp stack behave
> this way so we can reproduce another issue at hand.
> 

No kernel patch is needed, you already can do this on linux.

Check file net/ipv4/tcp_input.c lines around 5722


* Re: Piggyback the final ACK of the three way TCP connection establishment with the data
  2012-03-22  0:00 ` Eric Dumazet
@ 2012-03-22 23:02   ` Vincent Li
  2012-03-22 23:07     ` David Miller
                       ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Vincent Li @ 2012-03-22 23:02 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev

>
> No kernel patch is needed, you already can do this on linux.
>
> Check file net/ipv4/tcp_input.c lines around 5722
>
>

Is this the code snippet in tcp_rcv_synsent_state_process() that you are referring to?

                if (sk->sk_write_pending ||
                    icsk->icsk_accept_queue.rskq_defer_accept ||
                    icsk->icsk_ack.pingpong) {
                        /* Save one ACK. Data will be ready after
                         * several ticks, if write_pending is set.
                         *
                         * It may be deleted, but with this feature tcpdumps
                         * look so _wonderfully_ clever, that I was not able
                         * to stand against the temptation 8)     --ANK
                         */
                        inet_csk_schedule_ack(sk);
                        icsk->icsk_ack.lrcvtime = tcp_time_stamp;
                        icsk->icsk_ack.ato       = TCP_ATO_MIN;
                        tcp_incr_quickack(sk);
                        tcp_enter_quickack_mode(sk);
                        inet_csk_reset_xmit_timer(sk, ICSK_TIME_DACK,
                                                  TCP_DELACK_MAX, TCP_RTO_MAX);

discard:
                        __kfree_skb(skb);
                        return 0;
                } else {
                        tcp_send_ack(sk);
                }

If I understand it correctly, on Linux the application code needs to set
the TCP_DEFER_ACCEPT or TCP_QUICKACK socket option in order to
trigger it, correct?

We have a user running wu-ftpd on HP-UX with the TCP tunable
tcp_delay_final_twh_ack turned on. In the active FTP case, when wu-ftpd
opens the data connection to the client, it sends a SYN, the client
replies with SYN/ACK, and wu-ftpd then sends ACK/PUSH with data. So on
HP-UX, it appears that just turning on the tcp_delay_final_twh_ack
tunable makes this happen.

But on Linux, do I need to change the wu-ftpd code and set the
TCP_DEFER_ACCEPT or TCP_QUICKACK socket option in order to trigger the
code snippet in tcp_rcv_synsent_state_process()?
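
For illustration, a minimal sketch of the socket-option route asked about here. It assumes that turning TCP_QUICKACK off (value 0) before connect() is what sets icsk_ack.pingpong in the condition quoted above; setting TCP_DEFER_ACCEPT on the connecting socket would be the alternative for the rskq_defer_accept branch. The helper name connect_delaying_final_ack() is hypothetical, the behaviour has not been verified against any particular kernel version, and tcp(7) notes that TCP_QUICKACK is not a permanent setting.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: connect to dst with quick ACKs disabled, so that
 * (by assumption) the final handshake ACK is scheduled as a delayed ACK
 * and can be piggybacked on the first data segment.
 * Returns the connected socket, or -1 on error. */
int connect_delaying_final_ack(const struct sockaddr_in *dst)
{
    int sd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (sd < 0)
        return -1;

    /* TCP_QUICKACK = 0 must be set before connect() so that it is
     * already in effect when the SYN+ACK arrives. */
    int off = 0;
    if (setsockopt(sd, IPPROTO_TCP, TCP_QUICKACK, &off, sizeof(off)) < 0) {
        close(sd);
        return -1;
    }

    if (connect(sd, (const struct sockaddr *)dst, sizeof(*dst)) < 0) {
        close(sd);
        return -1;
    }
    return sd;
}
```

The caller would write its first request on the returned socket right away; if nothing is written before the delayed-ACK timer fires, a standalone ACK goes out anyway.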

Thanks

Vincent


* Re: Piggyback the final ACK of the three way TCP connection establishment with the data
  2012-03-22 23:02   ` Vincent Li
@ 2012-03-22 23:07     ` David Miller
  2012-03-22 23:13     ` Eric Dumazet
  2012-03-22 23:44     ` Rick Jones
  2 siblings, 0 replies; 11+ messages in thread
From: David Miller @ 2012-03-22 23:07 UTC (permalink / raw)
  To: vincent.mc.li; +Cc: eric.dumazet, netdev

From: Vincent Li <vincent.mc.li@gmail.com>
Date: Thu, 22 Mar 2012 16:02:41 -0700

> but on Linux, do I need to change wu-ftpd code and modify the socket
> option with TCP_DEFER_ACCEPT or TCP_QUICKACK in order to trigger the
> code snippet in tcp_rcv_synsent_state_process?

Yes.


* Re: Piggyback the final ACK of the three way TCP connection establishment with the data
  2012-03-22 23:02   ` Vincent Li
  2012-03-22 23:07     ` David Miller
@ 2012-03-22 23:13     ` Eric Dumazet
  2012-03-22 23:17       ` Eric Dumazet
  2012-03-22 23:44     ` Rick Jones
  2 siblings, 1 reply; 11+ messages in thread
From: Eric Dumazet @ 2012-03-22 23:13 UTC (permalink / raw)
  To: Vincent Li; +Cc: netdev

On Thu, 2012-03-22 at 16:02 -0700, Vincent Li wrote:
> >
> > No kernel patch is needed, you already can do this on linux.
> >
> > Check file net/ipv4/tcp_input.c lines around 5722
> >
> >
> 
> is this code snippet in tcp_rcv_synsent_state_process that you refer to?
> 
> 5676                 if (sk->sk_write_pending ||
> 5677                     icsk->icsk_accept_queue.rskq_defer_accept ||
> 5678                     icsk->icsk_ack.pingpong) {
> 5679                         /* Save one ACK. Data will be ready after
> 5680                          * several ticks, if write_pending is set.
> 5681                          *
> 5682                          * It may be deleted, but with this
> feature tcpdumps
> 5683                          * look so _wonderfully_ clever, that I
> was not able
> 5684                          * to stand against the temptation 8)     --ANK
> 5685                          */
> 5686                         inet_csk_schedule_ack(sk);
> 5687                         icsk->icsk_ack.lrcvtime = tcp_time_stamp;
> 5688                         icsk->icsk_ack.ato       = TCP_ATO_MIN;
> 5689                         tcp_incr_quickack(sk);
> 5690                         tcp_enter_quickack_mode(sk);
> 5691                         inet_csk_reset_xmit_timer(sk, ICSK_TIME_DACK,
> 5692                                                   TCP_DELACK_MAX,
> TCP_RTO_MAX);
> 5693
> 5694 discard:
> 5695                         __kfree_skb(skb);
> 5696                         return 0;
> 5697                 } else {
> 5698                         tcp_send_ack(sk);
> 5699                 }
> 
> if I understand it correct on linux, the application code need to set
> socket option with TCP_DEFER_ACCEPT or TCP_QUICKACK in order to
> trigger it, correct?
> 
> We have user running wu-ftpd on HP Unix with tcp tunable
> tcp_delay_final_twh_ack on. so in active FTP situation, when wu-ftpd
> open up data connection to client, it sends SYN, client SYN/ACK, then
> ACK/PUSH with data. so on HU UNIX, it appears just turn on tcp tunable
> tcp_delay_final_twh_ack would make it happen.
> 
> but on Linux, do I need to change wu-ftpd code and modify the socket
> option with TCP_DEFER_ACCEPT or TCP_QUICKACK in order to trigger the
> code snippet in tcp_rcv_synsent_state_process?

Yes.

A third possibility (reading the code) if you use non blocking IO, is to
send() a message right after connect()


* Re: Piggyback the final ACK of the three way TCP connection establishment with the data
  2012-03-22 23:13     ` Eric Dumazet
@ 2012-03-22 23:17       ` Eric Dumazet
  2012-03-23 15:38         ` Vincent Li
  0 siblings, 1 reply; 11+ messages in thread
From: Eric Dumazet @ 2012-03-22 23:17 UTC (permalink / raw)
  To: Vincent Li; +Cc: netdev

On Thu, 2012-03-22 at 16:13 -0700, Eric Dumazet wrote:

> A third possibility (reading the code) if you use non blocking IO, is to
> send() a message right after connect()
> 
> 

Or use the auto connect on sendto() more probably...

That combines the connect() and send() 


* Re: Piggyback the final ACK of the three way TCP connection establishment with the data
  2012-03-22 23:02   ` Vincent Li
  2012-03-22 23:07     ` David Miller
  2012-03-22 23:13     ` Eric Dumazet
@ 2012-03-22 23:44     ` Rick Jones
  2 siblings, 0 replies; 11+ messages in thread
From: Rick Jones @ 2012-03-22 23:44 UTC (permalink / raw)
  To: Vincent Li; +Cc: Eric Dumazet, netdev

I like piggybacking as much as the next guy, and pushed for a bunch of 
it in the past, but are the files being transferred really that small 
relative to the RTT that the savings of the standalone ACK of the 
SYN|ACK really buys that much?

I'm sure it does nice things for a default netperf TCP_CRR test, shaving 
one segment out of 8 or so, and maybe even more on a TCP_CC test, but 
how many sub-MSS files are transferred these days?

rick jones


* Re: Piggyback the final ACK of the three way TCP connection establishment with the data
  2012-03-22 23:17       ` Eric Dumazet
@ 2012-03-23 15:38         ` Vincent Li
  2012-03-23 19:03           ` Yuchung Cheng
  0 siblings, 1 reply; 11+ messages in thread
From: Vincent Li @ 2012-03-23 15:38 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev

On Thu, Mar 22, 2012 at 4:17 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Thu, 2012-03-22 at 16:13 -0700, Eric Dumazet wrote:
>
>> A third possibility (reading the code) if you use non blocking IO, is to
>> send() a message right after connect()
>>
>>
>
> Or use the auto connect on sendto() more probably...
>
> That combines the connect() and send()
>

Thanks,

I got TCP_QUICKACK working, but not the non-blocking IO approach.
Basically, what I did is:

  if((sock = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP)) < 0){
    perror("Can't create TCP socket");
    return 0;
  }

  val = Fcntl(sock, F_GETFL, 0);
  Fcntl(sock, F_SETFL, val | O_NONBLOCK);


  ip = get_ip(host);
  fprintf(stderr, "IP is %s\n", ip);
  /* allocate the structure itself, not just the size of a pointer */
  remote = (struct sockaddr_in *)malloc(sizeof(struct sockaddr_in));
  remote->sin_family = AF_INET;
  tmpres = inet_pton(AF_INET, ip, (void *)(&(remote->sin_addr.s_addr)));
  if( tmpres < 0)
  {
    perror("Can't set remote->sin_addr.s_addr");
    return 0;
  }else if(tmpres == 0)
  {
    fprintf(stderr, "%s is not a valid IP address\n", ip);
    return 0;
  }
  remote->sin_port = htons(PORT);

char *query = "GET / HTTP/1.0\r\nHost: 127.0.0.1\r\n\r\n";

tmpres = sendto(sock, query, strlen(query), 0,
                (struct sockaddr *)remote, sizeof(*remote));

Maybe I misinterpreted you.

Vincent


* Re: Piggyback the final ACK of the three way TCP connection establishment with the data
  2012-03-23 15:38         ` Vincent Li
@ 2012-03-23 19:03           ` Yuchung Cheng
  2012-03-23 19:11             ` Eric Dumazet
  0 siblings, 1 reply; 11+ messages in thread
From: Yuchung Cheng @ 2012-03-23 19:03 UTC (permalink / raw)
  To: Vincent Li; +Cc: Eric Dumazet, netdev

Hi Vincent

On Fri, Mar 23, 2012 at 8:38 AM, Vincent Li <vincent.mc.li@gmail.com> wrote:
> On Thu, Mar 22, 2012 at 4:17 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> On Thu, 2012-03-22 at 16:13 -0700, Eric Dumazet wrote:
>>
>>> A third possibility (reading the code) if you use non blocking IO, is to
>>> send() a message right after connect()
>>>
>>>
>>
>> Or use the auto connect on sendto() more probably...
>>
>> That combines the connect() and send()
>>
>
> thanks,
>
> I got the TCP_QUICKACK working, but not the non blocking IO,
> basically, what I did is
>
>  if((sock = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP)) < 0){
>    perror("Can't create TCP socket");
>    return 0;
>  }
>
>  val = Fcntl(sock, F_GETFL, 0);
>  Fcntl(sock, F_SETFL, val | O_NONBLOCK);
>
>
>  ip = get_ip(host);
>  fprintf(stderr, "IP is %s\n", ip);
>  remote = (struct sockaddr_in *)malloc(sizeof(struct sockaddr_in));
>  remote->sin_family = AF_INET;
>  tmpres = inet_pton(AF_INET, ip, (void *)(&(remote->sin_addr.s_addr)));
>  if( tmpres < 0)
>  {
>    perror("Can't set remote->sin_addr.s_addr");
>    return 0;
>  }else if(tmpres == 0)
>  {
>    fprintf(stderr, "%s is not a valid IP address\n", ip);
>    return 0;
>  }
>  remote->sin_port = htons(PORT);
>
> char *query = "GET / HTTP/1.0\r\nHost: 127.0.0.1\r\n\r\n";
>
> tmpres = sendto(sock, query, strlen(query), 0, (struct sockaddr
> *)remote, sizeof(struct sockaddr));
>
>  maybe I mis-interpret you.
>
> Vincent

AFAICT the feature that Eric refers to is TCP Fast Open, which I am
still testing and have not yet submitted to netdev.

But one way to achieve that currently is doing a non-blocking connect
then a blocking write.

I just tried this on a 2.6 machine against a remote server:

fcntl(sd, F_SETFL, fcntl(sd, F_GETFL, 0) | O_NONBLOCK);
connect(sd, ...);
fcntl(sd, F_SETFL, fcntl(sd, F_GETFL, 0) & ~O_NONBLOCK);
sendto(sd, buf, len, 0, ...);

The key is that the socket has to be in the process of connecting
when the application calls write/sendto(2). If you are testing on loopback,
the connect might finish before sendto, so this won't work.
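
Spelled out with error handling, the four calls above might look like the following sketch. The helper name connect_then_send() is hypothetical, and send() with MSG_NOSIGNAL stands in for the sendto() call (both are assumptions for illustration). Whether the ACK is actually piggybacked still depends on the handshake being in progress when send() is called, as noted above.

```c
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: start a non-blocking connect, switch back to
 * blocking mode, then send while the handshake may still be in progress.
 * Returns the connected socket, or -1 on error. */
int connect_then_send(const struct sockaddr_in *dst,
                      const void *buf, size_t len)
{
    int sd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    if (sd < 0)
        return -1;

    int flags = fcntl(sd, F_GETFL, 0);
    if (flags < 0 || fcntl(sd, F_SETFL, flags | O_NONBLOCK) < 0)
        goto fail;

    /* Non-blocking connect: the SYN goes out and the call normally
     * returns -1 with errno set to EINPROGRESS. */
    if (connect(sd, (const struct sockaddr *)dst, sizeof(*dst)) < 0 &&
        errno != EINPROGRESS)
        goto fail;

    /* Back to blocking mode so the send waits for the handshake. */
    if (fcntl(sd, F_SETFL, flags & ~O_NONBLOCK) < 0)
        goto fail;

    /* Blocking send: if the SYN+ACK has not arrived yet, the kernel
     * sees a pending writer when it does. MSG_NOSIGNAL avoids SIGPIPE
     * if the connection attempt failed. */
    if (send(sd, buf, len, MSG_NOSIGNAL) != (ssize_t)len)
        goto fail;

    return sd;

fail:
    close(sd);
    return -1;
}
```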

HTH


* Re: Piggyback the final ACK of the three way TCP connection establishment with the data
  2012-03-23 19:03           ` Yuchung Cheng
@ 2012-03-23 19:11             ` Eric Dumazet
  0 siblings, 0 replies; 11+ messages in thread
From: Eric Dumazet @ 2012-03-23 19:11 UTC (permalink / raw)
  To: Yuchung Cheng; +Cc: Vincent Li, netdev

On Fri, 2012-03-23 at 12:03 -0700, Yuchung Cheng wrote:

> AFAICT the feature that Eric refers to is TCP Fast Open that I am
> still testing and have not yet submit to netdev.
> 

Nope, I was referring to actual Linux code ;)

TFO will make it possible to send the data in the SYN packet, while
Vincent only asked to send it in the second packet the client sends to
the server.

> But one way to achieve that currently is doing a non-blocking connect
> then a blocking write.
> 
> I just tried this on 2.6 machine to a remote server
> 
> fcntl(sd, F_SETFL, fcntl(sd, F_GETFL, 0) | O_NONBLOCK);
> connect(sd, ...);
> fcntl(sd, F_SETFL, fcntl(sd, F_GETFL, 0) & ~O_NONBLOCK);
> sendto(sd, buf, len, 0, ...);
> 
> The key is that the socket has to be in progress of connecting
> when application calls write/sendto(2). If you are testing on loopback,
> the connect might finish before sendto so this won't work.
> 
