From: Avner Ben Hanoch <avnerb@mellanox.com>
To: Yehuda Sadeh-Weinraub <yehuda@redhat.com>
Cc: Sage Weil <sage@newdream.net>,
	Ceph Development <ceph-devel@vger.kernel.org>
Subject: RE: msgr2 protocol
Date: Thu, 30 Jun 2016 11:59:15 +0000
Message-ID: <DB5PR05MB1607B0F2121D66DED27A66F1A9240@DB5PR05MB1607.eurprd05.prod.outlook.com>
In-Reply-To: <CADRKj5RQF8jOAMWgB+PPXn_z5nQpFPO43ieyuG8WpMy2bO5JuQ@mail.gmail.com>



> -----Original Message-----
> From: Yehuda Sadeh-Weinraub 
> Sent: Wednesday, June 29, 2016 19:53
> 
> On Wed, Jun 29, 2016 at 4:59 AM, Avner Ben Hanoch
> <avnerb@mellanox.com> wrote:
> >
> > On Sat, 28 May 2016 11:19 AM, Yehuda Sadeh-Weinraub <yehuda@xxxxxxxxxx> wrote:
> >>On Fri, May 27, 2016 at 10:37 AM, Sage Weil <sweil@xxxxxxxxxx> wrote:
> >>>
> >>> If you do a connect and immediately write a few bytes to the TCP
> >>> stream, does that actually translate to fewer packets?  I was
> >>> guessing that the server writing the first bytes of the exchange
> >>> would be fine but if it speeds things up for the client to
> >>> optimistically start the exchange too we may as well...
> >>>
> >>
> >>While I haven't really looked at it recently, I don't think it'd be
> >>possible to embed data with the SYN packet using the plain vanilla TCP
> >>implementation. However, I believe that doing connect() and sending data
> >>immediately following it should improve things, specifically if doing an
> >>async connect (as with the async messenger), but this still needs to be
> >>proven.
> >>
> >>Yehuda
> >
> > I use TCP with network sniffers like Wireshark, and this is what I always
> > see in Linux - *sending data soon after connect will always save a packet
> > by combining the ACK from the last step of the TCP 3-way handshake with
> > the 1st data packet*.
> > This is the case even when I did a "short" activity between connect and
> > send.
> >
> > The sniffer will show you 3 packets on the stream:
> > 1.      Client sends a SYN packet
> > 2.      Server replies with a SYN-ACK packet
> > 3.      Client sends a *data packet* that has the ACK flag set in it (this
> >         ACK completes the TCP 3-way handshake and makes 'accept' return on
> >         the server side)
> >
> > Whether the socket is synchronous or asynchronous isn't relevant here,
> > because 'connect' returns with success upon receiving the SYN-ACK from the
> > server, regardless of when the client actually sends the ACK that completes
> > the TCP 3-way handshake (i.e., the client application doesn't need this ACK
> > in order to rely on the socket as connected - only the server side needs
> > it).
> >
> > From my experience, even disabling Nagle (TCP_NODELAY) doesn't affect
> > this behavior (probably because TCP_NODELAY only affects sending *data*
> > faster but does not change TCP handshake behavior)
> 
> Right. However, I was aiming at sending the data with the SYN, not with the
> ACK that follows it (trying to avoid the first roundtrip). I assume it's not really
> possible with vanilla tcp.
> 
> Yehuda
> 
I think that Sage's idea/question was to have fewer packets and to speed things up by having the client start the exchange instead of the server.
Hence, I am saying - YES.  This idea is correct.  It will save one packet and it will save half a round trip (i.e., one leg), because the client is allowed to send data after 2 legs of the TCP 3-way handshake, while the server must wait for all 3 legs to complete before sending any data.
And I am saying that this is the behavior of vanilla TCP without any special settings.
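
To make the "client starts the exchange" pattern concrete, here is a minimal Python sketch over loopback. It is an illustration only, not Ceph messenger code: the names client_first_exchange and run_server and both payloads are hypothetical. It shows the application-level ordering (connect, then send right away, with the server reading before it writes). It does not show the packet count by itself; whether the handshake ACK actually rides on the first data segment depends on the kernel and on timing, so you would still need tcpdump or Wireshark to confirm it on your system.

```python
import socket
import threading


def run_server(listener, received):
    """Accept one connection; the server reads first, then replies."""
    conn, _ = listener.accept()
    with conn:
        conn.settimeout(5)
        received.append(conn.recv(1024))  # client speaks first
        conn.sendall(b"server-reply")


def client_first_exchange(payload=b"client-banner"):
    """Connect and send immediately, without waiting for the server.

    Returns (bytes the server received, bytes the client received).
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.settimeout(5)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    received = []
    server = threading.Thread(target=run_server, args=(listener, received))
    server.start()

    # connect() returns once the SYN-ACK has arrived; the send follows at once.
    with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
        client.sendall(payload)
        reply = client.recv(1024)

    server.join()
    listener.close()
    return received[0], reply


if __name__ == "__main__":
    print(client_first_exchange())
```

Capturing on the loopback interface while this runs (for example with tcpdump -i lo) is one way to check whether the third packet of the stream carries both the ACK and the payload.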

If you want to send traffic with the 1st SYN packet, then it sounds a bit strange even for non-vanilla TCP, because the client doesn't even know that the server is alive at this phase, and the server doesn't know that the client is legitimate at this phase.  This will also probably expose your server to the security issue known as a SYN flood attack, because servers avoid allocating connection resources just for SYN packets (the src ip/port can be fictitious at this phase, before the client has used the server's reply).

Avner