From: David Laight <David.Laight@ACULAB.COM>
To: "'George Dunlap'" <george.dunlap@eu.citrix.com>,
	Eric Dumazet <eric.dumazet@gmail.com>
Cc: Jonathan Davies <Jonathan.Davies@citrix.com>,
	"xen-devel@lists.xensource.com" <xen-devel@lists.xensource.com>,
	Wei Liu <wei.liu2@citrix.com>,
	Ian Campbell <Ian.Campbell@citrix.com>,
	"Stefano Stabellini" <stefano.stabellini@eu.citrix.com>,
	netdev <netdev@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Eric Dumazet <edumazet@google.com>,
	"Paul Durrant" <paul.durrant@citrix.com>,
	Christoffer Dall <christoffer.dall@linaro.org>,
	Felipe Franciosi <felipe.franciosi@citrix.com>,
	"linux-arm-kernel@lists.infradead.org" 
	<linux-arm-kernel@lists.infradead.org>,
	David Vrabel <david.vrabel@citrix.com>
Subject: RE: [Xen-devel] "tcp: refine TSO autosizing" causes performance regression on Xen
Date: Thu, 16 Apr 2015 09:22:23 +0000
Message-ID: <063D6719AE5E284EB5DD2968C1650D6D1CB1F43C@AcuExch.aculab.com>
In-Reply-To: <552F7936.9070205@eu.citrix.com>

From: George Dunlap
> Sent: 16 April 2015 09:56
> On 04/15/2015 07:19 PM, Eric Dumazet wrote:
> > On Wed, 2015-04-15 at 19:04 +0100, George Dunlap wrote:
> >
> >> Maybe you should stop wasting all of our time and just tell us what
> >> you're thinking.
> >
> > I think you make me wasting my time.
> >
> > I already gave all the hints in prior discussions.
> 
> Right, and I suggested these two options:
> 
> "Obviously one solution would be to allow the drivers themselves to set
> the tcp_limit_output_bytes, but that seems like a maintenance
> nightmare.
> 
> "Another simple solution would be to allow drivers to indicate whether
> they have a high transmit latency, and have the kernel use a higher
> value by default when that's the case." [1]
> 
> Neither of which you commented on.  Instead you pointed me to a comment
> that only partially described what the limitations were. (I.e., it
> described the "two packets or 1ms", but not how they related, nor how
> they related to the "max of 2 64k packets outstanding" of the default
> tcp_limit_output_bytes setting.)

ISTM that you are changing the wrong knob.
You need to change something that affects the global amount of pending tx data,
not the amount that can be buffered by a single connection.

If you change tcp_limit_output_bytes and then have 1000 connections trying
to send data you'll suffer 'bufferbloat'.
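
To put rough numbers on that, here is a back-of-envelope sketch. The
128 KiB figure comes from the "max of 2 64k packets outstanding" default
quoted above. The 1 Gbit/s link rate and the assumption that every
connection fills its allowance at the same moment are illustrative only,
and the function names are hypothetical.

```python
# Back-of-envelope model: a per-connection limit does not bound the total
# amount of tx data queued across all connections.
# ASSUMPTIONS (not from the thread): the 1 Gbit/s link rate, and that all
# connections fill their allowance simultaneously (the worst case).

DEFAULT_LIMIT_BYTES = 2 * 64 * 1024   # "max of 2 64k packets outstanding"
ASSUMED_LINK_BITS_PER_SEC = 10**9     # illustrative link rate


def worst_case_queued_bytes(limit_per_conn: int, connections: int) -> int:
    """Total bytes queued if every connection uses its whole allowance."""
    return limit_per_conn * connections


def drain_time_seconds(queued_bytes: int, link_bits_per_sec: int) -> float:
    """Time for the link to transmit everything that is already queued."""
    return queued_bytes * 8 / link_bits_per_sec


if __name__ == "__main__":
    for multiplier in (1, 2, 4):
        limit = DEFAULT_LIMIT_BYTES * multiplier
        queued = worst_case_queued_bytes(limit, 1000)
        delay = drain_time_seconds(queued, ASSUMED_LINK_BITS_PER_SEC)
        print(f"limit={limit} B  queued={queued} B  drain={delay:.3f} s")
```

Under these assumptions the queueing delay grows linearly with the
per-connection limit, which is why raising it helps one flow and hurts
everything queued behind it.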

If you call skb_orphan() in the tx setup path then the total number of
buffers is limited, but a single connection can (and will) fill the tx
ring leading to incorrect RTT calculations and additional latency for
other connections.
This will give high single connection throughput but isn't ideal.

One possibility might be to call skb_orphan() when enough time has
elapsed since the packet was queued for transmit that it is very likely
to have actually been transmitted - even though 'transmit done' has
not yet been signalled.
Not at all sure how this would fit in though...
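
As a way to reason about the idea, here is a small userspace model, not
kernel code. It assumes that "orphaning" means releasing a packet's bytes
from its connection's send accounting while the packet still occupies a
slot in the tx ring. The class names, the 1000 us threshold and the
128 KiB limit are all hypothetical choices for illustration.

```python
# Toy model of time-based orphaning. Nothing here is a kernel API: the
# classes, the threshold and the limits are illustrative assumptions.
from dataclasses import dataclass


@dataclass
class Connection:
    limit: int          # per-connection cap on charged bytes
    charged: int = 0    # bytes currently counted against this connection


@dataclass
class PendingPacket:
    conn: Connection
    size: int
    queued_at_us: int
    orphaned: bool = False


class TxRing:
    def __init__(self, orphan_after_us: int):
        self.orphan_after_us = orphan_after_us
        self.pending: list[PendingPacket] = []

    def queue(self, conn: Connection, size: int, now_us: int) -> bool:
        """Queue a packet unless it would exceed the connection's limit."""
        if conn.charged + size > conn.limit:
            return False
        conn.charged += size
        self.pending.append(PendingPacket(conn, size, now_us))
        return True

    def orphan_stale(self, now_us: int) -> int:
        """Uncharge packets old enough to have very likely been sent."""
        released = 0
        for pkt in self.pending:
            if not pkt.orphaned and now_us - pkt.queued_at_us >= self.orphan_after_us:
                pkt.orphaned = True
                pkt.conn.charged -= pkt.size
                released += pkt.size
        return released

    def tx_done(self, count: int) -> None:
        """Late 'transmit done': free the oldest slots, uncharging only
        packets that were not already orphaned."""
        done, self.pending = self.pending[:count], self.pending[count:]
        for pkt in done:
            if not pkt.orphaned:
                pkt.conn.charged -= pkt.size
```

In this model a sender blocked at its limit can resume once its oldest
packet ages past the threshold, without waiting for the completion. The
packets still occupy ring slots until tx_done(), so the ring size remains
the global bound.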

	David

