From: Willy Tarreau <w@1wt.eu>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Arnaud Ebalard <arno@natisbad.org>, Cong Wang <xiyou.wangcong@gmail.com>, edumazet@google.com, linux-arm-kernel@lists.infradead.org, netdev@vger.kernel.org, Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Subject: Re: [BUG,REGRESSION?] 3.11.6+,3.12: GbE iface rate drops to few KB/s
Date: Wed, 20 Nov 2013 18:38:28 +0100
Message-ID: <20131120173828.GL8581@1wt.eu>
In-Reply-To: <1384968607.10637.14.camel@edumazet-glaptop2.roam.corp.google.com>

On Wed, Nov 20, 2013 at 09:30:07AM -0800, Eric Dumazet wrote:
> Well, all TCP performance results are highly dependent on the workload,
> and both receivers and senders behavior.
>
> We made many improvements like TSO auto sizing, DRS (dynamic Right
> Sizing), and if the application used some specific settings (like
> SO_SNDBUF / SO_RCVBUF or other tweaks), we can not guarantee that same
> exact performance is reached from kernel version X to kernel version Y.

Of course, which is why I only care when there's a significant
difference. If I need 6 streams in one version and 8 in another to
fill the wire, I call them identical. It's only when we dig into the
details that we analyse the differences.

> We try to make forward progress, there is little gain to revert all
> these great works. Linux had this tendency to favor throughput by using
> overly large skbs. Its time to do better.

I agree. Unfortunately our mails have crossed each other, so just to
keep this thread mostly linear, your next patch here:

    http://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=98e09386c0ef4dfd48af7ba60ff908f0d525cdee

fixes that regression, and the performance is back to normal, which is
good.

> As explained, some drivers are buggy, and need fixes.

Agreed!

> If nobody wants to fix them, this really means no one is interested
> getting them fixed.

I was in the middle of reading the code when I came across a window
with your patch above, which was exactly what I was looking for :-)

> I am willing to help if you provide details, because otherwise I need
> a crystal ball ;)
>
> One known problem of TCP is the fact that an incoming ACK making room in
> socket write queue immediately wakeup a blocked thread (POLLOUT), even
> if only one MSS was ack, and write queue has 2MB of outstanding bytes.

Indeed.

> All these scheduling problems should be identified and fixed, and yes,
> this will require a dozen more patches.
>
> max (128KB , 1-2 ms) of buffering per flow should be enough to reach
> line rate, even for a single flow, but this means the sk_sndbuf value
> for the socket must take into account the pipe size _plus_ 1ms of
> buffering.

Which is the purpose of your patch above, and I confirm it fixes the
problem.

Now I'm looking at how to work around this lack of Tx IRQ.

Thanks!
Willy
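
The buffering rule quoted above, max(128KB, 1-2 ms) per flow with
sk_sndbuf covering the pipe size plus 1 ms, can be made concrete with a
little arithmetic. What follows is only a sketch under stated
assumptions, not the kernel's actual sizing logic: the helper names
(bytes_sent, sndbuf_estimate), the 200 us round-trip time, and the
reading of the rule as "pipe size + max(128KB, 1 ms at line rate)" are
all illustrative choices that do not come from the thread.

```python
KIB = 1024


def bytes_sent(rate_bps: int, usec: int) -> int:
    """Bytes transmitted in `usec` microseconds at `rate_bps` bits per second."""
    return rate_bps // 8 * usec // 1_000_000


def sndbuf_estimate(rate_bps: int, rtt_usec: int,
                    floor_bytes: int = 128 * KIB,
                    buffer_usec: int = 1000) -> int:
    """Rough send-buffer size: pipe size plus a per-flow queueing allowance.

    Assumption: the allowance is max(floor_bytes, buffer_usec worth of data
    at line rate), which is one possible reading of the quoted rule.
    """
    pipe = bytes_sent(rate_bps, rtt_usec)  # bandwidth-delay product
    queue = max(floor_bytes, bytes_sent(rate_bps, buffer_usec))
    return pipe + queue


if __name__ == "__main__":
    # Hypothetical LAN: 1 Gb/s link, 200 us round-trip time.
    print(sndbuf_estimate(1_000_000_000, 200))
```

At 1 Gb/s, 1 ms of data is 125000 bytes, which is below the 128KB floor,
so under these assumptions the floor dominates on a gigabit link; the
time-based term only takes over at higher line rates.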