From: Tom Herbert <therbert@google.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: David Miller <davem@davemloft.net>,
	rick.jones2@hp.com, ycheng@google.com, dave.taht@gmail.com,
	netdev@vger.kernel.org, codel@lists.bufferbloat.net,
	mattmathis@google.com, nanditad@google.com, ncardwell@google.com,
	andrewmcgr@gmail.com
Subject: Re: [RFC PATCH v2] tcp: TCP Small Queues
Date: Thu, 12 Jul 2012 07:55:33 -0700
Message-ID: <CA+mtBx8tHJ1QkJWMSUVfFp_a4ymjsf7fA=wL+VQTJMKXmj0uuQ@mail.gmail.com>
In-Reply-To: <1342079487.3265.8245.camel@edumazet-glaptop>

On Thu, Jul 12, 2012 at 12:51 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Thu, 2012-07-12 at 00:37 -0700, David Miller wrote:
>> From: Eric Dumazet <eric.dumazet@gmail.com>
>> Date: Thu, 12 Jul 2012 09:34:19 +0200
>>
>> > On Thu, 2012-07-12 at 01:49 +0200, Eric Dumazet wrote:
>> >
>> >> The 10Gb receiver is a net-next kernel, but the 1Gb receiver is a 2.6.38
>> >> ubuntu kernel. They probably have very different TCP behavior.
>> >
>> >
>> > I tested TSQ on bnx2x and 10Gb links.
>> >
>> > I get full rate even using 65536 bytes for
>> > the /proc/sys/net/ipv4/tcp_limit_output_bytes tunable
>>
>> Great work Eric.
>
> Thanks !
>
This is indeed great work!  A couple of comments...

Do you know if there are any qdiscs that function less efficiently
when we are restricting the number of packets?  For instance, will HTB
work as expected in various configurations?

One extension to this work would be to make the limit dynamic and mostly
eliminate the tunable.  I'm thinking we might be able to correlate the
limit to the BQL limit of the egress queue for the flow, if there is
one.

Assuming all qdiscs are work-conserving, the minimal amount of outstanding
host data for a queue could be associated with the BQL limit of the
egress NIC queue.  We want to minimize the outstanding data so that:

sum(data_of_tcp_flows_share_same_queue) > bql_limit_for_queue

So this could imply a per flow limit of:

tcp_limit = max(bql_limit - bql_inflight, one_packet)

For a single active connection on a queue, the tcp_limit is equal to
the BQL limit.  Once the BQL limit is hit in the NIC, we only need one
packet outstanding per flow to maintain flow control.  For fairness,
we might need "one_packet" to actually be max GSO data.  This also
disregards any latency in scheduling and running the tasklet, which
might need to be taken into account.
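
To make the formula above concrete, here is a minimal sketch in plain
userspace C.  It is not kernel code: the function name
tsq_dynamic_limit() and its parameters are hypothetical, and all values
are assumed to be byte counts.  The only thing it adds to the formula is
a guard against unsigned underflow when bql_inflight exceeds bql_limit.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not an existing kernel API): compute a per-flow
 * limit as max(bql_limit - bql_inflight, one_packet), all in bytes.
 *
 * bql_limit    - assumed current BQL limit of the egress queue
 * bql_inflight - assumed bytes already queued to the NIC on that queue
 * one_packet   - floor for the limit (one MTU-sized packet, or max GSO
 *                size if that is needed for fairness)
 */
static uint32_t tsq_dynamic_limit(uint32_t bql_limit,
                                  uint32_t bql_inflight,
                                  uint32_t one_packet)
{
	/* Remaining room under the BQL limit; clamp at 0 to avoid wrap. */
	uint32_t headroom = bql_limit > bql_inflight ?
			    bql_limit - bql_inflight : 0;

	/* Never go below one packet so the flow keeps making progress. */
	return headroom > one_packet ? headroom : one_packet;
}
```

With an idle queue (bql_inflight == 0) this returns bql_limit, matching
the single-connection case; once the queue is at or over its BQL limit
it returns one_packet.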

Tom
