* TCP many-connection regression between 4.7 and 4.13 kernels.
@ 2018-01-22 17:28 Ben Greear
  2018-01-22 18:16 ` Eric Dumazet
  0 siblings, 1 reply; 15+ messages in thread
From: Ben Greear @ 2018-01-22 17:28 UTC (permalink / raw)
  To: netdev

My test case is to have 6 processes each create 5000 TCP IPv4 connections to each other
on a system with 16GB RAM and send slow-speed data.  This works fine on a 4.7 kernel, but
will not work at all on a 4.13.  The 4.13 first complains about running out of tcp memory,
but even after forcing those values higher, the max connections we can get is around 15k.

Both kernels have my out-of-tree patches applied, so it is possible it is my fault
at this point.

Any suggestions as to what this might be caused by, or if it is fixed in more recent kernels?

I will start bisecting in the meantime...

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

* Re: TCP many-connection regression between 4.7 and 4.13 kernels.
  2018-01-22 17:28 TCP many-connection regression between 4.7 and 4.13 kernels Ben Greear
@ 2018-01-22 18:16 ` Eric Dumazet
  2018-01-22 18:27   ` Willy Tarreau
                     ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Eric Dumazet @ 2018-01-22 18:16 UTC (permalink / raw)
  To: Ben Greear, netdev

On Mon, 2018-01-22 at 09:28 -0800, Ben Greear wrote:
> My test case is to have 6 processes each create 5000 TCP IPv4 connections to each other
> on a system with 16GB RAM and send slow-speed data.  This works fine on a 4.7 kernel, but
> will not work at all on a 4.13.  The 4.13 first complains about running out of tcp memory,
> but even after forcing those values higher, the max connections we can get is around 15k.
> 
> Both kernels have my out-of-tree patches applied, so it is possible it is my fault
> at this point.
> 
> Any suggestions as to what this might be caused by, or if it is fixed in more recent kernels?
> 
> I will start bisecting in the meantime...
> 

Hi Ben

Unfortunately I have no idea.

Are you using loopback flows, or have I misunderstood you?

How can loopback connections be slow-speed?

* Re: TCP many-connection regression between 4.7 and 4.13 kernels.
  2018-01-22 18:16 ` Eric Dumazet
@ 2018-01-22 18:27   ` Willy Tarreau
  2018-01-22 18:30   ` Ben Greear
  2018-01-23 21:49   ` TCP many-connection regression (bisected to 4.5.0-rc2+) Ben Greear
  2 siblings, 0 replies; 15+ messages in thread
From: Willy Tarreau @ 2018-01-22 18:27 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Ben Greear, netdev

Hi Eric,

On Mon, Jan 22, 2018 at 10:16:06AM -0800, Eric Dumazet wrote:
> On Mon, 2018-01-22 at 09:28 -0800, Ben Greear wrote:
> > My test case is to have 6 processes each create 5000 TCP IPv4 connections to each other
> > on a system with 16GB RAM and send slow-speed data.  This works fine on a 4.7 kernel, but
> > will not work at all on a 4.13.  The 4.13 first complains about running out of tcp memory,
> > but even after forcing those values higher, the max connections we can get is around 15k.
> > 
> > Both kernels have my out-of-tree patches applied, so it is possible it is my fault
> > at this point.
> > 
> > Any suggestions as to what this might be caused by, or if it is fixed in more recent kernels?
> > 
> > I will start bisecting in the meantime...
> > 
> 
> Hi Ben
> 
> Unfortunately I have no idea.
> 
> Are you using loopback flows, or have I misunderstood you ?
> 
> How loopback connections can be slow-speed ?

A few quick points: I have not noticed this on 4.9, which we use with
pretty satisfying performance (typically around 100k conn/s). However,
during some recent tests I did around the meltdown fixes on 4.15, I
noticed a high connect() or bind() cost to find a local port when
running on the loopback, which I didn't have time to compare to older
kernels. strace clearly showed that bind() (or connect() if bind was
not used) could take as much as 2-3 ms as source ports were filling up.

To be clear, it was just a quick observation and anything could be wrong
there, including my tests. I'm just saying this in case it matches anything
Ben has observed. I can try to get more info if that helps, but it could be
a different case.
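
For what it's worth, one trivial way to see that per-call cost without strace is to
time the call directly; timed_connect() below is only an illustrative wrapper, not
code from the test described above:

#include <stdio.h>
#include <sys/socket.h>
#include <time.h>

/* Illustrative wrapper: report how long a single connect() call takes,
 * e.g. to watch it grow as the ephemeral-port space fills up. */
static int timed_connect(int fd, const struct sockaddr *sa, socklen_t len)
{
    struct timespec t0, t1;
    int rc;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    rc = connect(fd, sa, len);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    fprintf(stderr, "connect() took %.3f ms\n",
            (t1.tv_sec - t0.tv_sec) * 1e3 +
            (t1.tv_nsec - t0.tv_nsec) / 1e6);
    return rc;
}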

Cheers,
Willy

* Re: TCP many-connection regression between 4.7 and 4.13 kernels.
  2018-01-22 18:16 ` Eric Dumazet
  2018-01-22 18:27   ` Willy Tarreau
@ 2018-01-22 18:30   ` Ben Greear
  2018-01-22 18:44     ` Ben Greear
  2018-01-22 18:46     ` Josh Hunt
  2018-01-23 21:49   ` TCP many-connection regression (bisected to 4.5.0-rc2+) Ben Greear
  2 siblings, 2 replies; 15+ messages in thread
From: Ben Greear @ 2018-01-22 18:30 UTC (permalink / raw)
  To: Eric Dumazet, netdev

On 01/22/2018 10:16 AM, Eric Dumazet wrote:
> On Mon, 2018-01-22 at 09:28 -0800, Ben Greear wrote:
>> My test case is to have 6 processes each create 5000 TCP IPv4 connections to each other
>> on a system with 16GB RAM and send slow-speed data.  This works fine on a 4.7 kernel, but
>> will not work at all on a 4.13.  The 4.13 first complains about running out of tcp memory,
>> but even after forcing those values higher, the max connections we can get is around 15k.
>>
>> Both kernels have my out-of-tree patches applied, so it is possible it is my fault
>> at this point.
>>
>> Any suggestions as to what this might be caused by, or if it is fixed in more recent kernels?
>>
>> I will start bisecting in the meantime...
>>
>
> Hi Ben
>
> Unfortunately I have no idea.
>
> Are you using loopback flows, or have I misunderstood you ?
>
> How loopback connections can be slow-speed ?
>

I am sending to self, but over external network interfaces, by using
routing tables and rules and such.

On 4.13.16+, I see the Intel driver bouncing when I try to start 20k
connections.  In this case, I have a pair of 10G ports doing 15k, and then
I try to start 5k on two of the 1G ports....

Jan 22 10:15:41 lf1003-e3v2-13100124-f20x64 kernel: e1000e: eth3 NIC Link is Down
Jan 22 10:15:41 lf1003-e3v2-13100124-f20x64 kernel: e1000e: eth3 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Jan 22 10:15:41 lf1003-e3v2-13100124-f20x64 kernel: e1000e: eth3 NIC Link is Down
Jan 22 10:15:41 lf1003-e3v2-13100124-f20x64 kernel: e1000e: eth3 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Jan 22 10:15:41 lf1003-e3v2-13100124-f20x64 kernel: e1000e: eth3 NIC Link is Down
Jan 22 10:15:41 lf1003-e3v2-13100124-f20x64 kernel: e1000e: eth3 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Jan 22 10:15:43 lf1003-e3v2-13100124-f20x64 kernel: e1000e: eth3 NIC Link is Down
Jan 22 10:15:45 lf1003-e3v2-13100124-f20x64 kernel: e1000e: eth3 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
Jan 22 10:15:51 lf1003-e3v2-13100124-f20x64 kernel: NETDEV WATCHDOG: eth3 (e1000e): transmit queue 0 timed out, trans_s...es: 1
Jan 22 10:15:51 lf1003-e3v2-13100124-f20x64 kernel: e1000e 0000:07:00.0 eth3: Reset adapter unexpectedly


System reports 10+GB RAM free in this case, btw.

Actually, maybe the good kernel was even older than 4.7... I see the same resets and inability to do a full 20k
connections on 4.7 too.  I double-checked with system-test and it seems 4.4 was a good kernel.  I'll test
that next.  Here is the splat from 4.7:

[  238.921679] ------------[ cut here ]------------
[  238.921689] WARNING: CPU: 0 PID: 3 at /home/greearb/git/linux-bisect/net/sched/sch_generic.c:272 dev_watchdog+0xd4/0x12f
[  238.921690] NETDEV WATCHDOG: eth3 (e1000e): transmit queue 0 timed out
[  238.921691] Modules linked in: nf_conntrack_netlink nf_conntrack nfnetlink nf_defrag_ipv4 cfg80211 macvlan pktgen bnep bluetooth fuse coretemp intel_rapl 
ftdi_sio x86_pkg_temp_thermal intel_powerclamp kvm_intel kvm iTCO_wdt iTCO_vendor_support joydev ie31200_edac ipmi_devintf irqbypass serio_raw ipmi_si edac_core 
shpchp fjes video i2c_i801 tpm_tis lpc_ich ipmi_msghandler tpm nfsd auth_rpcgss nfs_acl lockd grace sunrpc mgag200 i2c_algo_bit drm_kms_helper ttm drm i2c_core 
e1000e ixgbe mdio hwmon dca ptp pps_core ipv6 [last unloaded: nf_conntrack]
[  238.921720] CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted 4.7.0 #62
[  238.921721] Hardware name: Supermicro X9SCI/X9SCA/X9SCI/X9SCA, BIOS 2.0b 09/17/2012
[  238.921723]  0000000000000000 ffff88041cdd7cd8 ffffffff81352a23 ffff88041cdd7d28
[  238.921725]  0000000000000000 ffff88041cdd7d18 ffffffff810ea5dd 000001101cdd7d90
[  238.921727]  ffff880417a84000 0000000000000100 ffffffff8163ecff ffff880417a84440
[  238.921728] Call Trace:
[  238.921733]  [<ffffffff81352a23>] dump_stack+0x61/0x7d
[  238.921736]  [<ffffffff810ea5dd>] __warn+0xbd/0xd8
[  238.921738]  [<ffffffff8163ecff>] ? netif_tx_lock+0x81/0x81
[  238.921740]  [<ffffffff810ea63e>] warn_slowpath_fmt+0x46/0x4e
[  238.921741]  [<ffffffff8163ecf2>] ? netif_tx_lock+0x74/0x81
[  238.921743]  [<ffffffff8163edd3>] dev_watchdog+0xd4/0x12f
[  238.921746]  [<ffffffff8113cfbb>] call_timer_fn+0x65/0x11b
[  238.921748]  [<ffffffff8163ecff>] ? netif_tx_lock+0x81/0x81
[  238.921749]  [<ffffffff8113d73e>] run_timer_softirq+0x1ad/0x1d7
[  238.921751]  [<ffffffff810ee701>] __do_softirq+0xfb/0x25c
[  238.921752]  [<ffffffff810ee87b>] run_ksoftirqd+0x19/0x35
[  238.921755]  [<ffffffff81105ae8>] smpboot_thread_fn+0x169/0x1a9
[  238.921756]  [<ffffffff8110597f>] ? sort_range+0x1d/0x1d
[  238.921759]  [<ffffffff811031a1>] kthread+0xa0/0xa8
[  238.921763]  [<ffffffff816ce19f>] ret_from_fork+0x1f/0x40
[  238.921764]  [<ffffffff81103101>] ? init_completion+0x24/0x24
[  238.921765] ---[ end trace 933912956c6ee5ff ]---
[  238.961672] e1000e 0000:07:00.0 eth3: Reset adapter unexpectedly


Thanks,
Ben


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

* Re: TCP many-connection regression between 4.7 and 4.13 kernels.
  2018-01-22 18:30   ` Ben Greear
@ 2018-01-22 18:44     ` Ben Greear
  2018-01-22 18:46     ` Josh Hunt
  1 sibling, 0 replies; 15+ messages in thread
From: Ben Greear @ 2018-01-22 18:44 UTC (permalink / raw)
  To: Eric Dumazet, netdev

On 01/22/2018 10:30 AM, Ben Greear wrote:
> On 01/22/2018 10:16 AM, Eric Dumazet wrote:
>> On Mon, 2018-01-22 at 09:28 -0800, Ben Greear wrote:
>>> My test case is to have 6 processes each create 5000 TCP IPv4 connections to each other
>>> on a system with 16GB RAM and send slow-speed data.  This works fine on a 4.7 kernel, but
>>> will not work at all on a 4.13.  The 4.13 first complains about running out of tcp memory,
>>> but even after forcing those values higher, the max connections we can get is around 15k.
>>>
>>> Both kernels have my out-of-tree patches applied, so it is possible it is my fault
>>> at this point.
>>>
>>> Any suggestions as to what this might be caused by, or if it is fixed in more recent kernels?
>>>
>>> I will start bisecting in the meantime...
>>>
>>
>> Hi Ben
>>
>> Unfortunately I have no idea.
>>
>> Are you using loopback flows, or have I misunderstood you ?
>>
>> How loopback connections can be slow-speed ?
>>
>
> I am sending to self, but over external network interfaces, by using
> routing tables and rules and such.
>
> On 4.13.16+, I see the Intel driver bouncing when I try to start 20k
> connections.  In this case, I have a pair of 10G ports doing 15k, and then
> I try to start 5k on two of the 1G ports....
>
> Jan 22 10:15:41 lf1003-e3v2-13100124-f20x64 kernel: e1000e: eth3 NIC Link is Down
> Jan 22 10:15:41 lf1003-e3v2-13100124-f20x64 kernel: e1000e: eth3 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
> Jan 22 10:15:41 lf1003-e3v2-13100124-f20x64 kernel: e1000e: eth3 NIC Link is Down
> Jan 22 10:15:41 lf1003-e3v2-13100124-f20x64 kernel: e1000e: eth3 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
> Jan 22 10:15:41 lf1003-e3v2-13100124-f20x64 kernel: e1000e: eth3 NIC Link is Down
> Jan 22 10:15:41 lf1003-e3v2-13100124-f20x64 kernel: e1000e: eth3 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
> Jan 22 10:15:43 lf1003-e3v2-13100124-f20x64 kernel: e1000e: eth3 NIC Link is Down
> Jan 22 10:15:45 lf1003-e3v2-13100124-f20x64 kernel: e1000e: eth3 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
> Jan 22 10:15:51 lf1003-e3v2-13100124-f20x64 kernel: NETDEV WATCHDOG: eth3 (e1000e): transmit queue 0 timed out, trans_s...es: 1
> Jan 22 10:15:51 lf1003-e3v2-13100124-f20x64 kernel: e1000e 0000:07:00.0 eth3: Reset adapter unexpectedly
>
>
> System reports 10+GB RAM free in this case, btw.
>
> Actually, maybe the good kernel was even older than 4.7...I see same resets and inability to do a full 20k
> connections on 4.7 too.   I double-checked with system-test and it seems 4.4 was a good kernel.  I'll test
> that next.  Here is splat from 4.7:
>
> [  238.921679] ------------[ cut here ]------------
> [  238.921689] WARNING: CPU: 0 PID: 3 at /home/greearb/git/linux-bisect/net/sched/sch_generic.c:272 dev_watchdog+0xd4/0x12f
> [  238.921690] NETDEV WATCHDOG: eth3 (e1000e): transmit queue 0 timed out
> [  238.921691] Modules linked in: nf_conntrack_netlink nf_conntrack nfnetlink nf_defrag_ipv4 cfg80211 macvlan pktgen bnep bluetooth fuse coretemp intel_rapl
> ftdi_sio x86_pkg_temp_thermal intel_powerclamp kvm_intel kvm iTCO_wdt iTCO_vendor_support joydev ie31200_edac ipmi_devintf irqbypass serio_raw ipmi_si edac_core
> shpchp fjes video i2c_i801 tpm_tis lpc_ich ipmi_msghandler tpm nfsd auth_rpcgss nfs_acl lockd grace sunrpc mgag200 i2c_algo_bit drm_kms_helper ttm drm i2c_core
> e1000e ixgbe mdio hwmon dca ptp pps_core ipv6 [last unloaded: nf_conntrack]
> [  238.921720] CPU: 0 PID: 3 Comm: ksoftirqd/0 Not tainted 4.7.0 #62
> [  238.921721] Hardware name: Supermicro X9SCI/X9SCA/X9SCI/X9SCA, BIOS 2.0b 09/17/2012
> [  238.921723]  0000000000000000 ffff88041cdd7cd8 ffffffff81352a23 ffff88041cdd7d28
> [  238.921725]  0000000000000000 ffff88041cdd7d18 ffffffff810ea5dd 000001101cdd7d90
> [  238.921727]  ffff880417a84000 0000000000000100 ffffffff8163ecff ffff880417a84440
> [  238.921728] Call Trace:
> [  238.921733]  [<ffffffff81352a23>] dump_stack+0x61/0x7d
> [  238.921736]  [<ffffffff810ea5dd>] __warn+0xbd/0xd8
> [  238.921738]  [<ffffffff8163ecff>] ? netif_tx_lock+0x81/0x81
> [  238.921740]  [<ffffffff810ea63e>] warn_slowpath_fmt+0x46/0x4e
> [  238.921741]  [<ffffffff8163ecf2>] ? netif_tx_lock+0x74/0x81
> [  238.921743]  [<ffffffff8163edd3>] dev_watchdog+0xd4/0x12f
> [  238.921746]  [<ffffffff8113cfbb>] call_timer_fn+0x65/0x11b
> [  238.921748]  [<ffffffff8163ecff>] ? netif_tx_lock+0x81/0x81
> [  238.921749]  [<ffffffff8113d73e>] run_timer_softirq+0x1ad/0x1d7
> [  238.921751]  [<ffffffff810ee701>] __do_softirq+0xfb/0x25c
> [  238.921752]  [<ffffffff810ee87b>] run_ksoftirqd+0x19/0x35
> [  238.921755]  [<ffffffff81105ae8>] smpboot_thread_fn+0x169/0x1a9
> [  238.921756]  [<ffffffff8110597f>] ? sort_range+0x1d/0x1d
> [  238.921759]  [<ffffffff811031a1>] kthread+0xa0/0xa8
> [  238.921763]  [<ffffffff816ce19f>] ret_from_fork+0x1f/0x40
> [  238.921764]  [<ffffffff81103101>] ? init_completion+0x24/0x24
> [  238.921765] ---[ end trace 933912956c6ee5ff ]---
> [  238.961672] e1000e 0000:07:00.0 eth3: Reset adapter unexpectedly

So, on 4.4.8+, I see this and other splats related to e1000e.  I can easily start 40k
connections, however: 30k across the two 10G ports, and 10k more across a pair of
mac-vlans on the 10G ports (since I was out of address space to add a full 40k on the
two physical ports).


It looks like the e1000e problem is a separate issue, so I'll just focus on the 10G NICs for now.

Thanks,
Ben


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

* Re: TCP many-connection regression between 4.7 and 4.13 kernels.
  2018-01-22 18:30   ` Ben Greear
  2018-01-22 18:44     ` Ben Greear
@ 2018-01-22 18:46     ` Josh Hunt
  2018-01-23 22:06       ` Ben Greear
  1 sibling, 1 reply; 15+ messages in thread
From: Josh Hunt @ 2018-01-22 18:46 UTC (permalink / raw)
  To: Ben Greear; +Cc: Eric Dumazet, netdev

On Mon, Jan 22, 2018 at 10:30 AM, Ben Greear <greearb@candelatech.com> wrote:
> On 01/22/2018 10:16 AM, Eric Dumazet wrote:
>>
>> On Mon, 2018-01-22 at 09:28 -0800, Ben Greear wrote:
>>>
>>> My test case is to have 6 processes each create 5000 TCP IPv4 connections
>>> to each other
>>> on a system with 16GB RAM and send slow-speed data.  This works fine on a
>>> 4.7 kernel, but
>>> will not work at all on a 4.13.  The 4.13 first complains about running
>>> out of tcp memory,
>>> but even after forcing those values higher, the max connections we can
>>> get is around 15k.
>>>
>>> Both kernels have my out-of-tree patches applied, so it is possible it is
>>> my fault
>>> at this point.
>>>
>>> Any suggestions as to what this might be caused by, or if it is fixed in
>>> more recent kernels?
>>>
>>> I will start bisecting in the meantime...
>>>
>>
>> Hi Ben
>>
>> Unfortunately I have no idea.
>>
>> Are you using loopback flows, or have I misunderstood you ?
>>
>> How loopback connections can be slow-speed ?
>>
>
> I am sending to self, but over external network interfaces, by using
> routing tables and rules and such.
>
> On 4.13.16+, I see the Intel driver bouncing when I try to start 20k
> connections.  In this case, I have a pair of 10G ports doing 15k, and then
> I try to start 5k on two of the 1G ports....
>
> Jan 22 10:15:41 lf1003-e3v2-13100124-f20x64 kernel: e1000e: eth3 NIC Link is
> Down
> Jan 22 10:15:41 lf1003-e3v2-13100124-f20x64 kernel: e1000e: eth3 NIC Link is
> Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
> Jan 22 10:15:41 lf1003-e3v2-13100124-f20x64 kernel: e1000e: eth3 NIC Link is
> Down
> Jan 22 10:15:41 lf1003-e3v2-13100124-f20x64 kernel: e1000e: eth3 NIC Link is
> Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
> Jan 22 10:15:41 lf1003-e3v2-13100124-f20x64 kernel: e1000e: eth3 NIC Link is
> Down
> Jan 22 10:15:41 lf1003-e3v2-13100124-f20x64 kernel: e1000e: eth3 NIC Link is
> Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
> Jan 22 10:15:43 lf1003-e3v2-13100124-f20x64 kernel: e1000e: eth3 NIC Link is
> Down
> Jan 22 10:15:45 lf1003-e3v2-13100124-f20x64 kernel: e1000e: eth3 NIC Link is
> Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
> Jan 22 10:15:51 lf1003-e3v2-13100124-f20x64 kernel: NETDEV WATCHDOG: eth3
> (e1000e): transmit queue 0 timed out, trans_s...es: 1
> Jan 22 10:15:51 lf1003-e3v2-13100124-f20x64 kernel: e1000e 0000:07:00.0
> eth3: Reset adapter unexpectedly
>

Ben

We had an interface doing this and grabbing these commits resolved it for us:

4aea7a5c5e94 e1000e: Avoid receiver overrun interrupt bursts
19110cfbb34d e1000e: Separate signaling for link check/link up
d3509f8bc7b0 e1000e: Fix return value test
65a29da1f5fd e1000e: Fix wrong comment related to link detection
c4c40e51f9c3 e1000e: Fix error path in link detection

They are in the LTS kernels now, but I don't believe they were when we
first hit this problem.

Josh

* Re: TCP many-connection regression (bisected to 4.5.0-rc2+)
  2018-01-22 18:16 ` Eric Dumazet
  2018-01-22 18:27   ` Willy Tarreau
  2018-01-22 18:30   ` Ben Greear
@ 2018-01-23 21:49   ` Ben Greear
  2018-01-23 22:07     ` Eric Dumazet
  2 siblings, 1 reply; 15+ messages in thread
From: Ben Greear @ 2018-01-23 21:49 UTC (permalink / raw)
  To: Eric Dumazet, netdev

On 01/22/2018 10:16 AM, Eric Dumazet wrote:
> On Mon, 2018-01-22 at 09:28 -0800, Ben Greear wrote:
>> My test case is to have 6 processes each create 5000 TCP IPv4 connections to each other
>> on a system with 16GB RAM and send slow-speed data.  This works fine on a 4.7 kernel, but
>> will not work at all on a 4.13.  The 4.13 first complains about running out of tcp memory,
>> but even after forcing those values higher, the max connections we can get is around 15k.
>>
>> Both kernels have my out-of-tree patches applied, so it is possible it is my fault
>> at this point.
>>
>> Any suggestions as to what this might be caused by, or if it is fixed in more recent kernels?
>>
>> I will start bisecting in the meantime...
>>
>
> Hi Ben
>
> Unfortunately I have no idea.
>
> Are you using loopback flows, or have I misunderstood you ?
>
> How loopback connections can be slow-speed ?
>

Hello Eric, it looks like it is one of your commits that causes the issue
I see.

Here are some more details on the specific test case I used to bisect:

I have two ixgbe ports looped back, configured on the same subnet, but with different IPs.
Routing table rules, SO_BINDTODEVICE, and binding to specific IPs on both the client and
server sides let me send-to-self over the external looped cable.

I have 2 mac-vlans on each physical interface.

I created 5 server-side endpoints on one physical port, and two more on one of the mac-vlans.

On the client side, I create a process that spawns 5000 connections to the corresponding server-side endpoint.

The end result is 25,000 connections on one pair of real interfaces, and 10,000 connections on the
mac-vlan ports.

In the passing case, I get very close to all 5000 connections on all endpoints quickly.

In the failing case, I get a max of around 16k connections on the two physical ports.  The two mac-vlans have 10k connections
across them working reliably.  It seems to be an issue with 'connect' failing.

connect(2074, {sa_family=AF_INET, sin_port=htons(33012), sin_addr=inet_addr("10.1.1.5")}, 16) = -1 EINPROGRESS (Operation now in progress)
socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 2075
fcntl(2075, F_GETFD)                    = 0
fcntl(2075, F_SETFD, FD_CLOEXEC)        = 0
setsockopt(2075, SOL_SOCKET, SO_BINDTODEVICE, "eth4\0\0\0\0\0\0\0\0\0\0\0\0", 16) = 0
setsockopt(2075, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
bind(2075, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("10.1.1.4")}, 16) = 0
getsockopt(2075, SOL_SOCKET, SO_RCVBUF, [87380], [4]) = 0
getsockopt(2075, SOL_SOCKET, SO_SNDBUF, [16384], [4]) = 0
setsockopt(2075, SOL_TCP, TCP_NODELAY, [0], 4) = 0
fcntl(2075, F_GETFL)                    = 0x2 (flags O_RDWR)
fcntl(2075, F_SETFL, O_ACCMODE|O_NONBLOCK) = 0
connect(2075, {sa_family=AF_INET, sin_port=htons(33012), sin_addr=inet_addr("10.1.1.5")}, 16) = -1 EINPROGRESS (Operation now in progress)
socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 2076
fcntl(2076, F_GETFD)                    = 0
fcntl(2076, F_SETFD, FD_CLOEXEC)        = 0
setsockopt(2076, SOL_SOCKET, SO_BINDTODEVICE, "eth4\0\0\0\0\0\0\0\0\0\0\0\0", 16) = 0
setsockopt(2076, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
bind(2076, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("10.1.1.4")}, 16) = 0
getsockopt(2076, SOL_SOCKET, SO_RCVBUF, [87380], [4]) = 0
getsockopt(2076, SOL_SOCKET, SO_SNDBUF, [16384], [4]) = 0
setsockopt(2076, SOL_TCP, TCP_NODELAY, [0], 4) = 0
fcntl(2076, F_GETFL)                    = 0x2 (flags O_RDWR)
fcntl(2076, F_SETFL, O_ACCMODE|O_NONBLOCK) = 0
connect(2076, {sa_family=AF_INET, sin_port=htons(33012), sin_addr=inet_addr("10.1.1.5")}, 16) = -1 EADDRNOTAVAIL (Cannot assign requested address)
....


ea8add2b190395408b22a9127bed2c0912aecbc8 is the first bad commit
commit ea8add2b190395408b22a9127bed2c0912aecbc8
Author: Eric Dumazet <edumazet@google.com>
Date:   Thu Feb 11 16:28:50 2016 -0800

     tcp/dccp: better use of ephemeral ports in bind()

     Implement strategy used in __inet_hash_connect() in opposite way :

     Try to find a candidate using odd ports, then fallback to even ports.

     We no longer disable BH for whole traversal, but one bucket at a time.
     We also use cond_resched() to yield cpu to other tasks if needed.

     I removed one indentation level and tried to mirror the loop we have
     in __inet_hash_connect() and variable names to ease code maintenance.

     Signed-off-by: Eric Dumazet <edumazet@google.com>
     Signed-off-by: David S. Miller <davem@davemloft.net>

:040000 040000 3af4595c6eb6d331e1cba78a142d44e00f710d81 e0c014ae8b7e2867256eff60f6210821d36eacef M	net
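
For context, and purely as an illustration rather than the kernel code: the strategy that
commit message describes is to walk candidate ephemeral ports at bind() time preferring odd
ports first, falling back to even ports (which connect()-time allocation in
__inet_hash_connect() prefers).  A toy sketch of that ordering, with port_is_usable()
standing in for the real bucket checks:

#include <stdbool.h>

/* Stand-in for the real "is this port's bind bucket usable?" check. */
extern bool port_is_usable(unsigned int port);

static int pick_local_port(unsigned int low, unsigned int high)
{
    for (int want_odd = 1; want_odd >= 0; want_odd--) {
        for (unsigned int port = low; port <= high; port++) {
            if ((int)(port & 1) != want_odd)
                continue;               /* wrong parity for this pass */
            if (port_is_usable(port))
                return (int)port;
        }
    }
    return -1; /* local port range exhausted */
}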


I will be happy to test patches or try to get any other results that might help diagnose
this problem better.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

* Re: TCP many-connection regression between 4.7 and 4.13 kernels.
  2018-01-22 18:46     ` Josh Hunt
@ 2018-01-23 22:06       ` Ben Greear
  0 siblings, 0 replies; 15+ messages in thread
From: Ben Greear @ 2018-01-23 22:06 UTC (permalink / raw)
  To: Josh Hunt; +Cc: Eric Dumazet, netdev

On 01/22/2018 10:46 AM, Josh Hunt wrote:
> On Mon, Jan 22, 2018 at 10:30 AM, Ben Greear <greearb@candelatech.com> wrote:
>> On 01/22/2018 10:16 AM, Eric Dumazet wrote:
>>>
>>> On Mon, 2018-01-22 at 09:28 -0800, Ben Greear wrote:
>>>>
>>>> My test case is to have 6 processes each create 5000 TCP IPv4 connections
>>>> to each other
>>>> on a system with 16GB RAM and send slow-speed data.  This works fine on a
>>>> 4.7 kernel, but
>>>> will not work at all on a 4.13.  The 4.13 first complains about running
>>>> out of tcp memory,
>>>> but even after forcing those values higher, the max connections we can
>>>> get is around 15k.
>>>>
>>>> Both kernels have my out-of-tree patches applied, so it is possible it is
>>>> my fault
>>>> at this point.
>>>>
>>>> Any suggestions as to what this might be caused by, or if it is fixed in
>>>> more recent kernels?
>>>>
>>>> I will start bisecting in the meantime...
>>>>
>>>
>>> Hi Ben
>>>
>>> Unfortunately I have no idea.
>>>
>>> Are you using loopback flows, or have I misunderstood you ?
>>>
>>> How loopback connections can be slow-speed ?
>>>
>>
>> I am sending to self, but over external network interfaces, by using
>> routing tables and rules and such.
>>
>> On 4.13.16+, I see the Intel driver bouncing when I try to start 20k
>> connections.  In this case, I have a pair of 10G ports doing 15k, and then
>> I try to start 5k on two of the 1G ports....
>>
>> Jan 22 10:15:41 lf1003-e3v2-13100124-f20x64 kernel: e1000e: eth3 NIC Link is
>> Down
>> Jan 22 10:15:41 lf1003-e3v2-13100124-f20x64 kernel: e1000e: eth3 NIC Link is
>> Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
>> Jan 22 10:15:41 lf1003-e3v2-13100124-f20x64 kernel: e1000e: eth3 NIC Link is
>> Down
>> Jan 22 10:15:41 lf1003-e3v2-13100124-f20x64 kernel: e1000e: eth3 NIC Link is
>> Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
>> Jan 22 10:15:41 lf1003-e3v2-13100124-f20x64 kernel: e1000e: eth3 NIC Link is
>> Down
>> Jan 22 10:15:41 lf1003-e3v2-13100124-f20x64 kernel: e1000e: eth3 NIC Link is
>> Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
>> Jan 22 10:15:43 lf1003-e3v2-13100124-f20x64 kernel: e1000e: eth3 NIC Link is
>> Down
>> Jan 22 10:15:45 lf1003-e3v2-13100124-f20x64 kernel: e1000e: eth3 NIC Link is
>> Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
>> Jan 22 10:15:51 lf1003-e3v2-13100124-f20x64 kernel: NETDEV WATCHDOG: eth3
>> (e1000e): transmit queue 0 timed out, trans_s...es: 1
>> Jan 22 10:15:51 lf1003-e3v2-13100124-f20x64 kernel: e1000e 0000:07:00.0
>> eth3: Reset adapter unexpectedly
>>
>
> Ben
>
> We had an interface doing this and grabbing these commits resolved it for us:
>
> 4aea7a5c5e94 e1000e: Avoid receiver overrun interrupt bursts
> 19110cfbb34d e1000e: Separate signaling for link check/link up
> d3509f8bc7b0 e1000e: Fix return value test
> 65a29da1f5fd e1000e: Fix wrong comment related to link detection
> c4c40e51f9c3 e1000e: Fix error path in link detection
>
> They are in the LTS kernels now, but don't believe they were when we
> first hit this problem.

Thanks a lot for the suggestions. I can confirm that these patches, applied to my 4.13.16+
tree, do indeed seem to fix the problem.

Thanks,
Ben

>
> Josh
>


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

* Re: TCP many-connection regression (bisected to 4.5.0-rc2+)
  2018-01-23 21:49   ` TCP many-connection regression (bisected to 4.5.0-rc2+) Ben Greear
@ 2018-01-23 22:07     ` Eric Dumazet
  2018-01-23 22:09       ` Ben Greear
  0 siblings, 1 reply; 15+ messages in thread
From: Eric Dumazet @ 2018-01-23 22:07 UTC (permalink / raw)
  To: Ben Greear, netdev

On Tue, 2018-01-23 at 13:49 -0800, Ben Greear wrote:
> On 01/22/2018 10:16 AM, Eric Dumazet wrote:
> > On Mon, 2018-01-22 at 09:28 -0800, Ben Greear wrote:
> > > My test case is to have 6 processes each create 5000 TCP IPv4 connections to each other
> > > on a system with 16GB RAM and send slow-speed data.  This works fine on a 4.7 kernel, but
> > > will not work at all on a 4.13.  The 4.13 first complains about running out of tcp memory,
> > > but even after forcing those values higher, the max connections we can get is around 15k.
> > > 
> > > Both kernels have my out-of-tree patches applied, so it is possible it is my fault
> > > at this point.
> > > 
> > > Any suggestions as to what this might be caused by, or if it is fixed in more recent kernels?
> > > 
> > > I will start bisecting in the meantime...
> > > 
> > 
> > Hi Ben
> > 
> > Unfortunately I have no idea.
> > 
> > Are you using loopback flows, or have I misunderstood you ?
> > 
> > How loopback connections can be slow-speed ?
> > 
> 
> Hello Eric, looks like it is one of your commits that causes the issue
> I see.
> 
> Here are some more details on my specific test case I used to bisect:
> 
> I have two ixgbe ports looped back, configured on same subnet, but with different IPs.
> Routing table rules, SO_BINDTODEVICE, binding to specific IPs on both client and server
> side let me send-to-self over the external looped cable.
> 
> I have 2 mac-vlans on each physical interface.
> 
> I created 5 server-side connections on one physical port, and two more on one of the mac-vlans.
> 
> On the client-side, I create a process that spawns 5000 connections to the corresponding server side.
> 
> End result is 25,000 connections on one pair of real interfaces, and 10,000 connections on the
> mac-vlan ports.
> 
> In the passing case, I get very close to all 5000 connections on all endpoints quickly.
> 
> In the failing case, I get a max of around 16k connections on the two physical ports.  The two mac-vlans have 10k connections
> across them working reliably.  It seems to be an issue with 'connect' failing.
> 
> connect(2074, {sa_family=AF_INET, sin_port=htons(33012), sin_addr=inet_addr("10.1.1.5")}, 16) = -1 EINPROGRESS (Operation now in progress)
> socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 2075
> fcntl(2075, F_GETFD)                    = 0
> fcntl(2075, F_SETFD, FD_CLOEXEC)        = 0
> setsockopt(2075, SOL_SOCKET, SO_BINDTODEVICE, "eth4\0\0\0\0\0\0\0\0\0\0\0\0", 16) = 0
> setsockopt(2075, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
> bind(2075, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("10.1.1.4")}, 16) = 0
> getsockopt(2075, SOL_SOCKET, SO_RCVBUF, [87380], [4]) = 0
> getsockopt(2075, SOL_SOCKET, SO_SNDBUF, [16384], [4]) = 0
> setsockopt(2075, SOL_TCP, TCP_NODELAY, [0], 4) = 0
> fcntl(2075, F_GETFL)                    = 0x2 (flags O_RDWR)
> fcntl(2075, F_SETFL, O_ACCMODE|O_NONBLOCK) = 0
> connect(2075, {sa_family=AF_INET, sin_port=htons(33012), sin_addr=inet_addr("10.1.1.5")}, 16) = -1 EINPROGRESS (Operation now in progress)
> socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 2076
> fcntl(2076, F_GETFD)                    = 0
> fcntl(2076, F_SETFD, FD_CLOEXEC)        = 0
> setsockopt(2076, SOL_SOCKET, SO_BINDTODEVICE, "eth4\0\0\0\0\0\0\0\0\0\0\0\0", 16) = 0
> setsockopt(2076, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
> bind(2076, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("10.1.1.4")}, 16) = 0
> getsockopt(2076, SOL_SOCKET, SO_RCVBUF, [87380], [4]) = 0
> getsockopt(2076, SOL_SOCKET, SO_SNDBUF, [16384], [4]) = 0
> setsockopt(2076, SOL_TCP, TCP_NODELAY, [0], 4) = 0
> fcntl(2076, F_GETFL)                    = 0x2 (flags O_RDWR)
> fcntl(2076, F_SETFL, O_ACCMODE|O_NONBLOCK) = 0
> connect(2076, {sa_family=AF_INET, sin_port=htons(33012), sin_addr=inet_addr("10.1.1.5")}, 16) = -1 EADDRNOTAVAIL (Cannot assign requested address)
> ....
> 
> 
> ea8add2b190395408b22a9127bed2c0912aecbc8 is the first bad commit
> commit ea8add2b190395408b22a9127bed2c0912aecbc8
> Author: Eric Dumazet <edumazet@google.com>
> Date:   Thu Feb 11 16:28:50 2016 -0800
> 
>      tcp/dccp: better use of ephemeral ports in bind()
> 
>      Implement strategy used in __inet_hash_connect() in opposite way :
> 
>      Try to find a candidate using odd ports, then fallback to even ports.
> 
>      We no longer disable BH for whole traversal, but one bucket at a time.
>      We also use cond_resched() to yield cpu to other tasks if needed.
> 
>      I removed one indentation level and tried to mirror the loop we have
>      in __inet_hash_connect() and variable names to ease code maintenance.
> 
>      Signed-off-by: Eric Dumazet <edumazet@google.com>
>      Signed-off-by: David S. Miller <davem@davemloft.net>
> 
> :040000 040000 3af4595c6eb6d331e1cba78a142d44e00f710d81 e0c014ae8b7e2867256eff60f6210821d36eacef M	net
> 
> 
> I will be happy to test patches or try to get any other results that might help diagnose
> this problem better.

Problem is I do not see anything obvious here.

Please provide /proc/sys/net/ipv4/ip_local_port_range

Also, you could probably use the IP_BIND_ADDRESS_NO_PORT socket option
before the bind()
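
As a rough illustration (not code from this thread), the client-side sequence from the strace
above would look something like this with that option added; open_client() and its parameters
are made up for the sketch:

#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper mirroring the strace above: pin the socket to a
 * device and source IP, but let connect() pick the ephemeral port. */
static int open_client(const char *dev, const char *src_ip,
                       const char *dst_ip, unsigned short dst_port)
{
    int one = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    /* Pin the flow to one interface/IP, as in the send-to-self setup. */
    setsockopt(fd, SOL_SOCKET, SO_BINDTODEVICE, dev, strlen(dev) + 1);
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));

    /* Defer ephemeral-port selection from bind() to connect(), so the
     * port can be shared as long as the 4-tuple stays unique. */
    setsockopt(fd, IPPROTO_IP, IP_BIND_ADDRESS_NO_PORT, &one, sizeof(one));

    struct sockaddr_in local = { .sin_family = AF_INET, .sin_port = htons(0) };
    inet_pton(AF_INET, src_ip, &local.sin_addr);
    if (bind(fd, (struct sockaddr *)&local, sizeof(local)) < 0) {
        close(fd);
        return -1;
    }

    struct sockaddr_in peer = { .sin_family = AF_INET, .sin_port = htons(dst_port) };
    inet_pton(AF_INET, dst_ip, &peer.sin_addr);

    /* Non-blocking connect, as in the strace; EINPROGRESS is expected here. */
    fcntl(fd, F_SETFL, fcntl(fd, F_GETFL) | O_NONBLOCK);
    if (connect(fd, (struct sockaddr *)&peer, sizeof(peer)) < 0 && errno != EINPROGRESS) {
        close(fd);
        return -1;
    }
    return fd;
}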

* Re: TCP many-connection regression (bisected to 4.5.0-rc2+)
  2018-01-23 22:07     ` Eric Dumazet
@ 2018-01-23 22:09       ` Ben Greear
  2018-01-23 22:29         ` Eric Dumazet
  0 siblings, 1 reply; 15+ messages in thread
From: Ben Greear @ 2018-01-23 22:09 UTC (permalink / raw)
  To: Eric Dumazet, netdev

On 01/23/2018 02:07 PM, Eric Dumazet wrote:
> On Tue, 2018-01-23 at 13:49 -0800, Ben Greear wrote:
>> On 01/22/2018 10:16 AM, Eric Dumazet wrote:
>>> On Mon, 2018-01-22 at 09:28 -0800, Ben Greear wrote:
>>>> My test case is to have 6 processes each create 5000 TCP IPv4 connections to each other
>>>> on a system with 16GB RAM and send slow-speed data.  This works fine on a 4.7 kernel, but
>>>> will not work at all on a 4.13.  The 4.13 first complains about running out of tcp memory,
>>>> but even after forcing those values higher, the max connections we can get is around 15k.
>>>>
>>>> Both kernels have my out-of-tree patches applied, so it is possible it is my fault
>>>> at this point.
>>>>
>>>> Any suggestions as to what this might be caused by, or if it is fixed in more recent kernels?
>>>>
>>>> I will start bisecting in the meantime...
>>>>
>>>
>>> Hi Ben
>>>
>>> Unfortunately I have no idea.
>>>
>>> Are you using loopback flows, or have I misunderstood you ?
>>>
>>> How loopback connections can be slow-speed ?
>>>
>>
>> Hello Eric, looks like it is one of your commits that causes the issue
>> I see.
>>
>> Here are some more details on my specific test case I used to bisect:
>>
>> I have two ixgbe ports looped back, configured on same subnet, but with different IPs.
>> Routing table rules, SO_BINDTODEVICE, binding to specific IPs on both client and server
>> side let me send-to-self over the external looped cable.
>>
>> I have 2 mac-vlans on each physical interface.
>>
>> I created 5 server-side connections on one physical port, and two more on one of the mac-vlans.
>>
>> On the client-side, I create a process that spawns 5000 connections to the corresponding server side.
>>
>> End result is 25,000 connections on one pair of real interfaces, and 10,000 connections on the
>> mac-vlan ports.
>>
>> In the passing case, I get very close to all 5000 connections on all endpoints quickly.
>>
>> In the failing case, I get a max of around 16k connections on the two physical ports.  The two mac-vlans have 10k connections
>> across them working reliably.  It seems to be an issue with 'connect' failing.
>>
>> connect(2074, {sa_family=AF_INET, sin_port=htons(33012), sin_addr=inet_addr("10.1.1.5")}, 16) = -1 EINPROGRESS (Operation now in progress)
>> socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 2075
>> fcntl(2075, F_GETFD)                    = 0
>> fcntl(2075, F_SETFD, FD_CLOEXEC)        = 0
>> setsockopt(2075, SOL_SOCKET, SO_BINDTODEVICE, "eth4\0\0\0\0\0\0\0\0\0\0\0\0", 16) = 0
>> setsockopt(2075, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
>> bind(2075, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("10.1.1.4")}, 16) = 0
>> getsockopt(2075, SOL_SOCKET, SO_RCVBUF, [87380], [4]) = 0
>> getsockopt(2075, SOL_SOCKET, SO_SNDBUF, [16384], [4]) = 0
>> setsockopt(2075, SOL_TCP, TCP_NODELAY, [0], 4) = 0
>> fcntl(2075, F_GETFL)                    = 0x2 (flags O_RDWR)
>> fcntl(2075, F_SETFL, O_ACCMODE|O_NONBLOCK) = 0
>> connect(2075, {sa_family=AF_INET, sin_port=htons(33012), sin_addr=inet_addr("10.1.1.5")}, 16) = -1 EINPROGRESS (Operation now in progress)
>> socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 2076
>> fcntl(2076, F_GETFD)                    = 0
>> fcntl(2076, F_SETFD, FD_CLOEXEC)        = 0
>> setsockopt(2076, SOL_SOCKET, SO_BINDTODEVICE, "eth4\0\0\0\0\0\0\0\0\0\0\0\0", 16) = 0
>> setsockopt(2076, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
>> bind(2076, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("10.1.1.4")}, 16) = 0
>> getsockopt(2076, SOL_SOCKET, SO_RCVBUF, [87380], [4]) = 0
>> getsockopt(2076, SOL_SOCKET, SO_SNDBUF, [16384], [4]) = 0
>> setsockopt(2076, SOL_TCP, TCP_NODELAY, [0], 4) = 0
>> fcntl(2076, F_GETFL)                    = 0x2 (flags O_RDWR)
>> fcntl(2076, F_SETFL, O_ACCMODE|O_NONBLOCK) = 0
>> connect(2076, {sa_family=AF_INET, sin_port=htons(33012), sin_addr=inet_addr("10.1.1.5")}, 16) = -1 EADDRNOTAVAIL (Cannot assign requested address)
>> ....
>>
>>
>> ea8add2b190395408b22a9127bed2c0912aecbc8 is the first bad commit
>> commit ea8add2b190395408b22a9127bed2c0912aecbc8
>> Author: Eric Dumazet <edumazet@google.com>
>> Date:   Thu Feb 11 16:28:50 2016 -0800
>>
>>      tcp/dccp: better use of ephemeral ports in bind()
>>
>>      Implement strategy used in __inet_hash_connect() in opposite way :
>>
>>      Try to find a candidate using odd ports, then fallback to even ports.
>>
>>      We no longer disable BH for whole traversal, but one bucket at a time.
>>      We also use cond_resched() to yield cpu to other tasks if needed.
>>
>>      I removed one indentation level and tried to mirror the loop we have
>>      in __inet_hash_connect() and variable names to ease code maintenance.
>>
>>      Signed-off-by: Eric Dumazet <edumazet@google.com>
>>      Signed-off-by: David S. Miller <davem@davemloft.net>
>>
>> :040000 040000 3af4595c6eb6d331e1cba78a142d44e00f710d81 e0c014ae8b7e2867256eff60f6210821d36eacef M	net
>>
>>
>> I will be happy to test patches or try to get any other results that might help diagnose
>> this problem better.
>
> Problem is I do not see anything obvious here.
>
> Please provide /proc/sys/net/ipv4/ip_local_port_range

[root@lf1003-e3v2-13100124-f20x64 ~]# cat /proc/sys/net/ipv4/ip_local_port_range
10000	61001

>
> Also you probably could use IP_BIND_ADDRESS_NO_PORT socket option
> before the bind()

I'll read up on that to see what it does...

Thanks,
Ben


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

* Re: TCP many-connection regression (bisected to 4.5.0-rc2+)
  2018-01-23 22:09       ` Ben Greear
@ 2018-01-23 22:29         ` Eric Dumazet
  2018-01-23 23:10           ` Ben Greear
  0 siblings, 1 reply; 15+ messages in thread
From: Eric Dumazet @ 2018-01-23 22:29 UTC (permalink / raw)
  To: Ben Greear, netdev

On Tue, 2018-01-23 at 14:09 -0800, Ben Greear wrote:
> On 01/23/2018 02:07 PM, Eric Dumazet wrote:
> > On Tue, 2018-01-23 at 13:49 -0800, Ben Greear wrote:
> > > On 01/22/2018 10:16 AM, Eric Dumazet wrote:
> > > > On Mon, 2018-01-22 at 09:28 -0800, Ben Greear wrote:
> > > > > My test case is to have 6 processes each create 5000 TCP IPv4 connections to each other
> > > > > on a system with 16GB RAM and send slow-speed data.  This works fine on a 4.7 kernel, but
> > > > > will not work at all on a 4.13.  The 4.13 first complains about running out of tcp memory,
> > > > > but even after forcing those values higher, the max connections we can get is around 15k.
> > > > > 
> > > > > Both kernels have my out-of-tree patches applied, so it is possible it is my fault
> > > > > at this point.
> > > > > 
> > > > > Any suggestions as to what this might be caused by, or if it is fixed in more recent kernels?
> > > > > 
> > > > > I will start bisecting in the meantime...
> > > > > 
> > > > 
> > > > Hi Ben
> > > > 
> > > > Unfortunately I have no idea.
> > > > 
> > > > Are you using loopback flows, or have I misunderstood you ?
> > > > 
> > > > How loopback connections can be slow-speed ?
> > > > 
> > > 
> > > Hello Eric, looks like it is one of your commits that causes the issue
> > > I see.
> > > 
> > > Here are some more details on my specific test case I used to bisect:
> > > 
> > > I have two ixgbe ports looped back, configured on same subnet, but with different IPs.
> > > Routing table rules, SO_BINDTODEVICE, binding to specific IPs on both client and server
> > > side let me send-to-self over the external looped cable.
> > > 
> > > I have 2 mac-vlans on each physical interface.
> > > 
> > > I created 5 server-side connections on one physical port, and two more on one of the mac-vlans.
> > > 
> > > On the client-side, I create a process that spawns 5000 connections to the corresponding server side.
> > > 
> > > End result is 25,000 connections on one pair of real interfaces, and 10,000 connections on the
> > > mac-vlan ports.
> > > 
> > > In the passing case, I get very close to all 5000 connections on all endpoints quickly.
> > > 
> > > In the failing case, I get a max of around 16k connections on the two physical ports.  The two mac-vlans have 10k connections
> > > across them working reliably.  It seems to be an issue with 'connect' failing.
> > > 
> > > connect(2074, {sa_family=AF_INET, sin_port=htons(33012), sin_addr=inet_addr("10.1.1.5")}, 16) = -1 EINPROGRESS (Operation now in progress)
> > > socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 2075
> > > fcntl(2075, F_GETFD)                    = 0
> > > fcntl(2075, F_SETFD, FD_CLOEXEC)        = 0
> > > setsockopt(2075, SOL_SOCKET, SO_BINDTODEVICE, "eth4\0\0\0\0\0\0\0\0\0\0\0\0", 16) = 0
> > > setsockopt(2075, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
> > > bind(2075, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("10.1.1.4")}, 16) = 0
> > > getsockopt(2075, SOL_SOCKET, SO_RCVBUF, [87380], [4]) = 0
> > > getsockopt(2075, SOL_SOCKET, SO_SNDBUF, [16384], [4]) = 0
> > > setsockopt(2075, SOL_TCP, TCP_NODELAY, [0], 4) = 0
> > > fcntl(2075, F_GETFL)                    = 0x2 (flags O_RDWR)
> > > fcntl(2075, F_SETFL, O_ACCMODE|O_NONBLOCK) = 0
> > > connect(2075, {sa_family=AF_INET, sin_port=htons(33012), sin_addr=inet_addr("10.1.1.5")}, 16) = -1 EINPROGRESS (Operation now in progress)
> > > socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 2076
> > > fcntl(2076, F_GETFD)                    = 0
> > > fcntl(2076, F_SETFD, FD_CLOEXEC)        = 0
> > > setsockopt(2076, SOL_SOCKET, SO_BINDTODEVICE, "eth4\0\0\0\0\0\0\0\0\0\0\0\0", 16) = 0
> > > setsockopt(2076, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
> > > bind(2076, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("10.1.1.4")}, 16) = 0
> > > getsockopt(2076, SOL_SOCKET, SO_RCVBUF, [87380], [4]) = 0
> > > getsockopt(2076, SOL_SOCKET, SO_SNDBUF, [16384], [4]) = 0
> > > setsockopt(2076, SOL_TCP, TCP_NODELAY, [0], 4) = 0
> > > fcntl(2076, F_GETFL)                    = 0x2 (flags O_RDWR)
> > > fcntl(2076, F_SETFL, O_ACCMODE|O_NONBLOCK) = 0
> > > connect(2076, {sa_family=AF_INET, sin_port=htons(33012), sin_addr=inet_addr("10.1.1.5")}, 16) = -1 EADDRNOTAVAIL (Cannot assign requested address)
> > > ....
> > > 
> > > 
> > > ea8add2b190395408b22a9127bed2c0912aecbc8 is the first bad commit
> > > commit ea8add2b190395408b22a9127bed2c0912aecbc8
> > > Author: Eric Dumazet <edumazet@google.com>
> > > Date:   Thu Feb 11 16:28:50 2016 -0800
> > > 
> > >      tcp/dccp: better use of ephemeral ports in bind()
> > > 
> > >      Implement strategy used in __inet_hash_connect() in opposite way :
> > > 
> > >      Try to find a candidate using odd ports, then fallback to even ports.
> > > 
> > >      We no longer disable BH for whole traversal, but one bucket at a time.
> > >      We also use cond_resched() to yield cpu to other tasks if needed.
> > > 
> > >      I removed one indentation level and tried to mirror the loop we have
> > >      in __inet_hash_connect() and variable names to ease code maintenance.
> > > 
> > >      Signed-off-by: Eric Dumazet <edumazet@google.com>
> > >      Signed-off-by: David S. Miller <davem@davemloft.net>
> > > 
> > > :040000 040000 3af4595c6eb6d331e1cba78a142d44e00f710d81 e0c014ae8b7e2867256eff60f6210821d36eacef M	net
> > > 
> > > 
> > > I will be happy to test patches or try to get any other results that might help diagnose
> > > this problem better.
> > 
> > Problem is I do not see anything obvious here.
> > 
> > Please provide /proc/sys/net/ipv4/ip_local_port_range
> 
> [root@lf1003-e3v2-13100124-f20x64 ~]# cat /proc/sys/net/ipv4/ip_local_port_range
> 10000	61001
> 
> > 
> > Also you probably could use IP_BIND_ADDRESS_NO_PORT socket option
> > before the bind()
> 
> I'll read up on that to see what it does...

man 7 ip

       IP_BIND_ADDRESS_NO_PORT (since Linux 4.2)
              Inform the kernel to not reserve an ephemeral port when
              using bind(2) with a port number of 0.  The port will
              later be automatically chosen at connect(2) time, in a
              way that allows sharing a source port as long as the
              4-tuple is unique.

* Re: TCP many-connection regression (bisected to 4.5.0-rc2+)
  2018-01-23 22:29         ` Eric Dumazet
@ 2018-01-23 23:10           ` Ben Greear
  2018-01-23 23:21             ` Eric Dumazet
  0 siblings, 1 reply; 15+ messages in thread
From: Ben Greear @ 2018-01-23 23:10 UTC (permalink / raw)
  To: Eric Dumazet, netdev

On 01/23/2018 02:29 PM, Eric Dumazet wrote:
> On Tue, 2018-01-23 at 14:09 -0800, Ben Greear wrote:
>> On 01/23/2018 02:07 PM, Eric Dumazet wrote:
>>> On Tue, 2018-01-23 at 13:49 -0800, Ben Greear wrote:
>>>> On 01/22/2018 10:16 AM, Eric Dumazet wrote:
>>>>> On Mon, 2018-01-22 at 09:28 -0800, Ben Greear wrote:
>>>>>> My test case is to have 6 processes each create 5000 TCP IPv4 connections to each other
>>>>>> on a system with 16GB RAM and send slow-speed data.  This works fine on a 4.7 kernel, but
>>>>>> will not work at all on a 4.13.  The 4.13 first complains about running out of tcp memory,
>>>>>> but even after forcing those values higher, the max connections we can get is around 15k.
>>>>>>
>>>>>> Both kernels have my out-of-tree patches applied, so it is possible it is my fault
>>>>>> at this point.
>>>>>>
>>>>>> Any suggestions as to what this might be caused by, or if it is fixed in more recent kernels?
>>>>>>
>>>>>> I will start bisecting in the meantime...
>>>>>>
>>>>>
>>>>> Hi Ben
>>>>>
>>>>> Unfortunately I have no idea.
>>>>>
>>>>> Are you using loopback flows, or have I misunderstood you ?
>>>>>
>>>>> How loopback connections can be slow-speed ?
>>>>>
>>>>
>>>> Hello Eric, looks like it is one of your commits that causes the issue
>>>> I see.
>>>>
>>>> Here are some more details on my specific test case I used to bisect:
>>>>
>>>> I have two ixgbe ports looped back, configured on same subnet, but with different IPs.
>>>> Routing table rules, SO_BINDTODEVICE, binding to specific IPs on both client and server
>>>> side let me send-to-self over the external looped cable.
>>>>
>>>> I have 2 mac-vlans on each physical interface.
>>>>
>>>> I created 5 server-side connections on one physical port, and two more on one of the mac-vlans.
>>>>
>>>> On the client-side, I create a process that spawns 5000 connections to the corresponding server side.
>>>>
>>>> End result is 25,000 connections on one pair of real interfaces, and 10,000 connections on the
>>>> mac-vlan ports.
>>>>
>>>> In the passing case, I get very close to all 5000 connections on all endpoints quickly.
>>>>
>>>> In the failing case, I get a max of around 16k connections on the two physical ports.  The two mac-vlans have 10k connections
>>>> across them working reliably.  It seems to be an issue with 'connect' failing.
>>>>
>>>> connect(2074, {sa_family=AF_INET, sin_port=htons(33012), sin_addr=inet_addr("10.1.1.5")}, 16) = -1 EINPROGRESS (Operation now in progress)
>>>> socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 2075
>>>> fcntl(2075, F_GETFD)                    = 0
>>>> fcntl(2075, F_SETFD, FD_CLOEXEC)        = 0
>>>> setsockopt(2075, SOL_SOCKET, SO_BINDTODEVICE, "eth4\0\0\0\0\0\0\0\0\0\0\0\0", 16) = 0
>>>> setsockopt(2075, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
>>>> bind(2075, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("10.1.1.4")}, 16) = 0
>>>> getsockopt(2075, SOL_SOCKET, SO_RCVBUF, [87380], [4]) = 0
>>>> getsockopt(2075, SOL_SOCKET, SO_SNDBUF, [16384], [4]) = 0
>>>> setsockopt(2075, SOL_TCP, TCP_NODELAY, [0], 4) = 0
>>>> fcntl(2075, F_GETFL)                    = 0x2 (flags O_RDWR)
>>>> fcntl(2075, F_SETFL, O_ACCMODE|O_NONBLOCK) = 0
>>>> connect(2075, {sa_family=AF_INET, sin_port=htons(33012), sin_addr=inet_addr("10.1.1.5")}, 16) = -1 EINPROGRESS (Operation now in progress)
>>>> socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 2076
>>>> fcntl(2076, F_GETFD)                    = 0
>>>> fcntl(2076, F_SETFD, FD_CLOEXEC)        = 0
>>>> setsockopt(2076, SOL_SOCKET, SO_BINDTODEVICE, "eth4\0\0\0\0\0\0\0\0\0\0\0\0", 16) = 0
>>>> setsockopt(2076, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
>>>> bind(2076, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("10.1.1.4")}, 16) = 0
>>>> getsockopt(2076, SOL_SOCKET, SO_RCVBUF, [87380], [4]) = 0
>>>> getsockopt(2076, SOL_SOCKET, SO_SNDBUF, [16384], [4]) = 0
>>>> setsockopt(2076, SOL_TCP, TCP_NODELAY, [0], 4) = 0
>>>> fcntl(2076, F_GETFL)                    = 0x2 (flags O_RDWR)
>>>> fcntl(2076, F_SETFL, O_ACCMODE|O_NONBLOCK) = 0
>>>> connect(2076, {sa_family=AF_INET, sin_port=htons(33012), sin_addr=inet_addr("10.1.1.5")}, 16) = -1 EADDRNOTAVAIL (Cannot assign requested address)
>>>> ....
>>>>
>>>>
>>>> ea8add2b190395408b22a9127bed2c0912aecbc8 is the first bad commit
>>>> commit ea8add2b190395408b22a9127bed2c0912aecbc8
>>>> Author: Eric Dumazet <edumazet@google.com>
>>>> Date:   Thu Feb 11 16:28:50 2016 -0800
>>>>
>>>>      tcp/dccp: better use of ephemeral ports in bind()
>>>>
>>>>      Implement strategy used in __inet_hash_connect() in opposite way :
>>>>
>>>>      Try to find a candidate using odd ports, then fallback to even ports.
>>>>
>>>>      We no longer disable BH for whole traversal, but one bucket at a time.
>>>>      We also use cond_resched() to yield cpu to other tasks if needed.
>>>>
>>>>      I removed one indentation level and tried to mirror the loop we have
>>>>      in __inet_hash_connect() and variable names to ease code maintenance.
>>>>
>>>>      Signed-off-by: Eric Dumazet <edumazet@google.com>
>>>>      Signed-off-by: David S. Miller <davem@davemloft.net>
>>>>
>>>> :040000 040000 3af4595c6eb6d331e1cba78a142d44e00f710d81 e0c014ae8b7e2867256eff60f6210821d36eacef M	net
>>>>
>>>>
>>>> I will be happy to test patches or try to get any other results that might help diagnose
>>>> this problem better.
>>>
>>> Problem is I do not see anything obvious here.
>>>
>>> Please provide /proc/sys/net/ipv4/ip_local_port_range
>>
>> [root@lf1003-e3v2-13100124-f20x64 ~]# cat /proc/sys/net/ipv4/ip_local_port_range
>> 10000	61001
>>
>>>
>>> Also you probably could use IP_BIND_ADDRESS_NO_PORT socket option
>>> before the bind()
>>
>> I'll read up on that to see what it does...
>
> man 7 ip
>
>        IP_BIND_ADDRESS_NO_PORT (since Linux 4.2)
>               Inform the kernel to not reserve an ephemeral port when
>               using bind(2) with a port number of 0.  The port will
>               later be automatically chosen at connect(2) time, in a
>               way that allows sharing a source port as long as the
>               4-tuple is unique.
>

Yes, I found that.

It appears this option works well for my case, and I see 30k connections across my pair of e1000e ports
(though the NIC is retching again, so I guess its issues are not fully resolved).

I tested this on my 4.13.16+ kernel.

That said, maybe there is still some issue with the patch I bisected to, so if you have
other suggestions, I can back out this IP_BIND_ADDRESS_NO_PORT feature and re-test.

Also, I had to increase /proc/sys/net/ipv4/tcp_mem to get 30k connections to work without
the kernel spamming:

Jan 23 15:02:41 lf1003-e3v2-13100124-f20x64 kernel: TCP: out of memory -- consider tuning tcp_mem
Jan 23 15:02:41 lf1003-e3v2-13100124-f20x64 kernel: TCP: out of memory -- consider tuning tcp_mem

This is a 16 GB RAM system, and I did not have to tune this on the 4.5.0-rc2+ (good) kernels
to get similar performance.  I was testing on ixgbe there, though, so possibly that is part
of it, or maybe I just need to force tcp_mem to be larger on more recent kernels?

Thanks,
Ben


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

* Re: TCP many-connection regression (bisected to 4.5.0-rc2+)
  2018-01-23 23:10           ` Ben Greear
@ 2018-01-23 23:21             ` Eric Dumazet
  2018-01-23 23:27               ` Ben Greear
  0 siblings, 1 reply; 15+ messages in thread
From: Eric Dumazet @ 2018-01-23 23:21 UTC (permalink / raw)
  To: Ben Greear, netdev

On Tue, 2018-01-23 at 15:10 -0800, Ben Greear wrote:
> On 01/23/2018 02:29 PM, Eric Dumazet wrote:
> > On Tue, 2018-01-23 at 14:09 -0800, Ben Greear wrote:
> > > On 01/23/2018 02:07 PM, Eric Dumazet wrote:
> > > > On Tue, 2018-01-23 at 13:49 -0800, Ben Greear wrote:
> > > > > On 01/22/2018 10:16 AM, Eric Dumazet wrote:
> > > > > > On Mon, 2018-01-22 at 09:28 -0800, Ben Greear wrote:
> > > > > > > My test case is to have 6 processes each create 5000 TCP IPv4 connections to each other
> > > > > > > on a system with 16GB RAM and send slow-speed data.  This works fine on a 4.7 kernel, but
> > > > > > > will not work at all on a 4.13.  The 4.13 first complains about running out of tcp memory,
> > > > > > > but even after forcing those values higher, the max connections we can get is around 15k.
> > > > > > > 
> > > > > > > Both kernels have my out-of-tree patches applied, so it is possible it is my fault
> > > > > > > at this point.
> > > > > > > 
> > > > > > > Any suggestions as to what this might be caused by, or if it is fixed in more recent kernels?
> > > > > > > 
> > > > > > > I will start bisecting in the meantime...
> > > > > > > 
> > > > > > 
> > > > > > Hi Ben
> > > > > > 
> > > > > > Unfortunately I have no idea.
> > > > > > 
> > > > > > Are you using loopback flows, or have I misunderstood you ?
> > > > > > 
> > > > > > How loopback connections can be slow-speed ?
> > > > > > 
> > > > > 
> > > > > Hello Eric, looks like it is one of your commits that causes the issue
> > > > > I see.
> > > > > 
> > > > > Here are some more details on my specific test case I used to bisect:
> > > > > 
> > > > > I have two ixgbe ports looped back, configured on same subnet, but with different IPs.
> > > > > Routing table rules, SO_BINDTODEVICE, binding to specific IPs on both client and server
> > > > > side let me send-to-self over the external looped cable.
> > > > > 
> > > > > I have 2 mac-vlans on each physical interface.
> > > > > 
> > > > > I created 5 server-side connections on one physical port, and two more on one of the mac-vlans.
> > > > > 
> > > > > On the client-side, I create a process that spawns 5000 connections to the corresponding server side.
> > > > > 
> > > > > End result is 25,000 connections on one pair of real interfaces, and 10,000 connections on the
> > > > > mac-vlan ports.
> > > > > 
> > > > > In the passing case, I get very close to all 5000 connections on all endpoints quickly.
> > > > > 
> > > > > In the failing case, I get a max of around 16k connections on the two physical ports.  The two mac-vlans have 10k connections
> > > > > across them working reliably.  It seems to be an issue with 'connect' failing.
> > > > > 
> > > > > connect(2074, {sa_family=AF_INET, sin_port=htons(33012), sin_addr=inet_addr("10.1.1.5")}, 16) = -1 EINPROGRESS (Operation now in progress)
> > > > > socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 2075
> > > > > fcntl(2075, F_GETFD)                    = 0
> > > > > fcntl(2075, F_SETFD, FD_CLOEXEC)        = 0
> > > > > setsockopt(2075, SOL_SOCKET, SO_BINDTODEVICE, "eth4\0\0\0\0\0\0\0\0\0\0\0\0", 16) = 0
> > > > > setsockopt(2075, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
> > > > > bind(2075, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("10.1.1.4")}, 16) = 0
> > > > > getsockopt(2075, SOL_SOCKET, SO_RCVBUF, [87380], [4]) = 0
> > > > > getsockopt(2075, SOL_SOCKET, SO_SNDBUF, [16384], [4]) = 0
> > > > > setsockopt(2075, SOL_TCP, TCP_NODELAY, [0], 4) = 0
> > > > > fcntl(2075, F_GETFL)                    = 0x2 (flags O_RDWR)
> > > > > fcntl(2075, F_SETFL, O_ACCMODE|O_NONBLOCK) = 0
> > > > > connect(2075, {sa_family=AF_INET, sin_port=htons(33012), sin_addr=inet_addr("10.1.1.5")}, 16) = -1 EINPROGRESS (Operation now in progress)
> > > > > socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 2076
> > > > > fcntl(2076, F_GETFD)                    = 0
> > > > > fcntl(2076, F_SETFD, FD_CLOEXEC)        = 0
> > > > > setsockopt(2076, SOL_SOCKET, SO_BINDTODEVICE, "eth4\0\0\0\0\0\0\0\0\0\0\0\0", 16) = 0
> > > > > setsockopt(2076, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
> > > > > bind(2076, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("10.1.1.4")}, 16) = 0
> > > > > getsockopt(2076, SOL_SOCKET, SO_RCVBUF, [87380], [4]) = 0
> > > > > getsockopt(2076, SOL_SOCKET, SO_SNDBUF, [16384], [4]) = 0
> > > > > setsockopt(2076, SOL_TCP, TCP_NODELAY, [0], 4) = 0
> > > > > fcntl(2076, F_GETFL)                    = 0x2 (flags O_RDWR)
> > > > > fcntl(2076, F_SETFL, O_ACCMODE|O_NONBLOCK) = 0
> > > > > connect(2076, {sa_family=AF_INET, sin_port=htons(33012), sin_addr=inet_addr("10.1.1.5")}, 16) = -1 EADDRNOTAVAIL (Cannot assign requested address)
> > > > > ....
> > > > > 
> > > > > 
> > > > > ea8add2b190395408b22a9127bed2c0912aecbc8 is the first bad commit
> > > > > commit ea8add2b190395408b22a9127bed2c0912aecbc8
> > > > > Author: Eric Dumazet <edumazet@google.com>
> > > > > Date:   Thu Feb 11 16:28:50 2016 -0800
> > > > > 
> > > > >      tcp/dccp: better use of ephemeral ports in bind()
> > > > > 
> > > > >      Implement strategy used in __inet_hash_connect() in opposite way :
> > > > > 
> > > > >      Try to find a candidate using odd ports, then fallback to even ports.
> > > > > 
> > > > >      We no longer disable BH for whole traversal, but one bucket at a time.
> > > > >      We also use cond_resched() to yield cpu to other tasks if needed.
> > > > > 
> > > > >      I removed one indentation level and tried to mirror the loop we have
> > > > >      in __inet_hash_connect() and variable names to ease code maintenance.
> > > > > 
> > > > >      Signed-off-by: Eric Dumazet <edumazet@google.com>
> > > > >      Signed-off-by: David S. Miller <davem@davemloft.net>
> > > > > 
> > > > > :040000 040000 3af4595c6eb6d331e1cba78a142d44e00f710d81 e0c014ae8b7e2867256eff60f6210821d36eacef M	net
> > > > > 
> > > > > 
> > > > > I will be happy to test patches or try to get any other results that might help diagnose
> > > > > this problem better.
> > > > 
> > > > Problem is I do not see anything obvious here.
> > > > 
> > > > Please provide /proc/sys/net/ipv4/ip_local_port_range
> > > 
> > > [root@lf1003-e3v2-13100124-f20x64 ~]# cat /proc/sys/net/ipv4/ip_local_port_range
> > > 10000	61001
> > > 
> > > > 
> > > > Also you probably could use IP_BIND_ADDRESS_NO_PORT socket option
> > > > before the bind()
> > > 
> > > I'll read up on that to see what it does...
> > 
> > man 7 ip
> > 
> >        IP_BIND_ADDRESS_NO_PORT (since Linux 4.2)
> >               Inform the kernel to not reserve an ephemeral port when
> >               using bind(2) with a port number of 0.  The port will
> >               later be automatically chosen at connect(2) time, in a
> >               way that allows sharing a source port as long as the
> >               4-tuple is unique.
> > 
> 
> Yes, I found that.
> 
> It appears this option works well for my case, and I see 30k connections across my pair of e1000e
> (though the NIC is retching again, so I guess its issues are not fully resolved).
> 
> I tested this on my 4.13.16+ kernel.
> 
> But that said, maybe there is still some issue with the patch I bisected to, so if you have
> other suggestions, I can back out this IP_BIND_ADDRESS_NO_PORT feature and re-test.
> 
> Also, I had to increase /proc/sys/net/ipv4/tcp_mem to get 30k connections to work without
> the kernel spamming:
> 
> Jan 23 15:02:41 lf1003-e3v2-13100124-f20x64 kernel: TCP: out of memory -- consider tuning tcp_mem
> Jan 23 15:02:41 lf1003-e3v2-13100124-f20x64 kernel: TCP: out of memory -- consider tuning tcp_mem
> 
> This is a 16 GB RAM system, and I did not have to tune this on the 4.5.0-rc2+ (good) kernels
> to get the similar performance.  I was testing on ixgbe there though, possibly that is part
> of it, or maybe I just need to force tcp_mem to be larger on more recent kernels??


Since linux-4.2, the tcp_mem[0,1,2] defaults are 4.68%, 6.25% and 9.37% of
physical memory.

It used to be twice that in older kernels.

It is also possible that some change in TCP congestion control or autotuning
allows each of your TCP flows to store more data in its write queue,
if your application is pushing bulk data as fast as it can.

It is virtually impossible to change anything in the kernel with
zero impact on very pathological use cases.

tcp_wmem[2] is 4MB.

30,000 * 4MB = 120 GB

Definitely more than your physical memory.
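
If it helps, a tiny sketch that just reads those two sysctls and redoes the
arithmetic on the running box (the 30000 flow count is your test parameter,
not something the kernel exports):

/* Sketch: compare the kernel's tcp_mem hard limit with the worst-case
 * write-queue demand of N bulk senders.  tcp_mem is in pages, tcp_wmem
 * in bytes; N is the test's connection count, not a kernel value. */
#include <stdio.h>
#include <unistd.h>

static int read3(const char *path, long long v[3])
{
	FILE *f = fopen(path, "r");
	int n;

	if (!f)
		return -1;
	n = fscanf(f, "%lld %lld %lld", &v[0], &v[1], &v[2]);
	fclose(f);
	return n == 3 ? 0 : -1;
}

int main(void)
{
	long long mem[3], wmem[3];
	long long page = sysconf(_SC_PAGESIZE);
	const long long flows = 30000;

	if (read3("/proc/sys/net/ipv4/tcp_mem", mem) ||
	    read3("/proc/sys/net/ipv4/tcp_wmem", wmem))
		return 1;

	printf("tcp_mem[2] hard limit   : %lld MB\n", mem[2] * page >> 20);
	printf("worst-case write queues : %lld MB (%lld flows * tcp_wmem[2])\n",
	       (flows * wmem[2]) >> 20, flows);
	return 0;
}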

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: TCP many-connection regression (bisected to 4.5.0-rc2+)
  2018-01-23 23:21             ` Eric Dumazet
@ 2018-01-23 23:27               ` Ben Greear
  2018-01-24  0:05                 ` Ben Greear
  0 siblings, 1 reply; 15+ messages in thread
From: Ben Greear @ 2018-01-23 23:27 UTC (permalink / raw)
  To: Eric Dumazet, netdev

On 01/23/2018 03:21 PM, Eric Dumazet wrote:
> On Tue, 2018-01-23 at 15:10 -0800, Ben Greear wrote:
>> On 01/23/2018 02:29 PM, Eric Dumazet wrote:
>>> On Tue, 2018-01-23 at 14:09 -0800, Ben Greear wrote:
>>>> On 01/23/2018 02:07 PM, Eric Dumazet wrote:
>>>>> On Tue, 2018-01-23 at 13:49 -0800, Ben Greear wrote:
>>>>>> On 01/22/2018 10:16 AM, Eric Dumazet wrote:
>>>>>>> On Mon, 2018-01-22 at 09:28 -0800, Ben Greear wrote:
>>>>>>>> My test case is to have 6 processes each create 5000 TCP IPv4 connections to each other
>>>>>>>> on a system with 16GB RAM and send slow-speed data.  This works fine on a 4.7 kernel, but
>>>>>>>> will not work at all on a 4.13.  The 4.13 first complains about running out of tcp memory,
>>>>>>>> but even after forcing those values higher, the max connections we can get is around 15k.
>>>>>>>>
>>>>>>>> Both kernels have my out-of-tree patches applied, so it is possible it is my fault
>>>>>>>> at this point.
>>>>>>>>
>>>>>>>> Any suggestions as to what this might be caused by, or if it is fixed in more recent kernels?
>>>>>>>>
>>>>>>>> I will start bisecting in the meantime...
>>>>>>>>
>>>>>>>
>>>>>>> Hi Ben
>>>>>>>
>>>>>>> Unfortunately I have no idea.
>>>>>>>
>>>>>>> Are you using loopback flows, or have I misunderstood you ?
>>>>>>>
>>>>>>> How loopback connections can be slow-speed ?
>>>>>>>
>>>>>>
>>>>>> Hello Eric, looks like it is one of your commits that causes the issue
>>>>>> I see.
>>>>>>
>>>>>> Here are some more details on my specific test case I used to bisect:
>>>>>>
>>>>>> I have two ixgbe ports looped back, configured on same subnet, but with different IPs.
>>>>>> Routing table rules, SO_BINDTODEVICE, binding to specific IPs on both client and server
>>>>>> side let me send-to-self over the external looped cable.
>>>>>>
>>>>>> I have 2 mac-vlans on each physical interface.
>>>>>>
>>>>>> I created 5 server-side connections on one physical port, and two more on one of the mac-vlans.
>>>>>>
>>>>>> On the client-side, I create a process that spawns 5000 connections to the corresponding server side.
>>>>>>
>>>>>> End result is 25,000 connections on one pair of real interfaces, and 10,000 connections on the
>>>>>> mac-vlan ports.
>>>>>>
>>>>>> In the passing case, I get very close to all 5000 connections on all endpoints quickly.
>>>>>>
>>>>>> In the failing case, I get a max of around 16k connections on the two physical ports.  The two mac-vlans have 10k connections
>>>>>> across them working reliably.  It seems to be an issue with 'connect' failing.
>>>>>>
>>>>>> connect(2074, {sa_family=AF_INET, sin_port=htons(33012), sin_addr=inet_addr("10.1.1.5")}, 16) = -1 EINPROGRESS (Operation now in progress)
>>>>>> socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 2075
>>>>>> fcntl(2075, F_GETFD)                    = 0
>>>>>> fcntl(2075, F_SETFD, FD_CLOEXEC)        = 0
>>>>>> setsockopt(2075, SOL_SOCKET, SO_BINDTODEVICE, "eth4\0\0\0\0\0\0\0\0\0\0\0\0", 16) = 0
>>>>>> setsockopt(2075, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
>>>>>> bind(2075, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("10.1.1.4")}, 16) = 0
>>>>>> getsockopt(2075, SOL_SOCKET, SO_RCVBUF, [87380], [4]) = 0
>>>>>> getsockopt(2075, SOL_SOCKET, SO_SNDBUF, [16384], [4]) = 0
>>>>>> setsockopt(2075, SOL_TCP, TCP_NODELAY, [0], 4) = 0
>>>>>> fcntl(2075, F_GETFL)                    = 0x2 (flags O_RDWR)
>>>>>> fcntl(2075, F_SETFL, O_ACCMODE|O_NONBLOCK) = 0
>>>>>> connect(2075, {sa_family=AF_INET, sin_port=htons(33012), sin_addr=inet_addr("10.1.1.5")}, 16) = -1 EINPROGRESS (Operation now in progress)
>>>>>> socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 2076
>>>>>> fcntl(2076, F_GETFD)                    = 0
>>>>>> fcntl(2076, F_SETFD, FD_CLOEXEC)        = 0
>>>>>> setsockopt(2076, SOL_SOCKET, SO_BINDTODEVICE, "eth4\0\0\0\0\0\0\0\0\0\0\0\0", 16) = 0
>>>>>> setsockopt(2076, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
>>>>>> bind(2076, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("10.1.1.4")}, 16) = 0
>>>>>> getsockopt(2076, SOL_SOCKET, SO_RCVBUF, [87380], [4]) = 0
>>>>>> getsockopt(2076, SOL_SOCKET, SO_SNDBUF, [16384], [4]) = 0
>>>>>> setsockopt(2076, SOL_TCP, TCP_NODELAY, [0], 4) = 0
>>>>>> fcntl(2076, F_GETFL)                    = 0x2 (flags O_RDWR)
>>>>>> fcntl(2076, F_SETFL, O_ACCMODE|O_NONBLOCK) = 0
>>>>>> connect(2076, {sa_family=AF_INET, sin_port=htons(33012), sin_addr=inet_addr("10.1.1.5")}, 16) = -1 EADDRNOTAVAIL (Cannot assign requested address)
>>>>>> ....
>>>>>>
>>>>>>
>>>>>> ea8add2b190395408b22a9127bed2c0912aecbc8 is the first bad commit
>>>>>> commit ea8add2b190395408b22a9127bed2c0912aecbc8
>>>>>> Author: Eric Dumazet <edumazet@google.com>
>>>>>> Date:   Thu Feb 11 16:28:50 2016 -0800
>>>>>>
>>>>>>      tcp/dccp: better use of ephemeral ports in bind()
>>>>>>
>>>>>>      Implement strategy used in __inet_hash_connect() in opposite way :
>>>>>>
>>>>>>      Try to find a candidate using odd ports, then fallback to even ports.
>>>>>>
>>>>>>      We no longer disable BH for whole traversal, but one bucket at a time.
>>>>>>      We also use cond_resched() to yield cpu to other tasks if needed.
>>>>>>
>>>>>>      I removed one indentation level and tried to mirror the loop we have
>>>>>>      in __inet_hash_connect() and variable names to ease code maintenance.
>>>>>>
>>>>>>      Signed-off-by: Eric Dumazet <edumazet@google.com>
>>>>>>      Signed-off-by: David S. Miller <davem@davemloft.net>
>>>>>>
>>>>>> :040000 040000 3af4595c6eb6d331e1cba78a142d44e00f710d81 e0c014ae8b7e2867256eff60f6210821d36eacef M	net
>>>>>>
>>>>>>
>>>>>> I will be happy to test patches or try to get any other results that might help diagnose
>>>>>> this problem better.
>>>>>
>>>>> Problem is I do not see anything obvious here.
>>>>>
>>>>> Please provide /proc/sys/net/ipv4/ip_local_port_range
>>>>
>>>> [root@lf1003-e3v2-13100124-f20x64 ~]# cat /proc/sys/net/ipv4/ip_local_port_range
>>>> 10000	61001
>>>>
>>>>>
>>>>> Also you probably could use IP_BIND_ADDRESS_NO_PORT socket option
>>>>> before the bind()
>>>>
>>>> I'll read up on that to see what it does...
>>>
>>> man 7 ip
>>>
>>>        IP_BIND_ADDRESS_NO_PORT (since Linux 4.2)
>>>               Inform the kernel to not reserve an ephemeral port when
>>>               using bind(2) with a port number of 0.  The port will
>>>               later be automatically chosen at connect(2) time, in a
>>>               way that allows sharing a source port as long as the
>>>               4-tuple is unique.
>>>
>>
>> Yes, I found that.
>>
>> It appears this option works well for my case, and I see 30k connections across my pair of e1000e
>> (though the NIC is retching again, so I guess its issues are not fully resolved).
>>
>> I tested this on my 4.13.16+ kernel.
>>
>> But that said, maybe there is still some issue with the patch I bisected to, so if you have
>> other suggestions, I can back out this IP_BIND_ADDRESS_NO_PORT feature and re-test.
>>
>> Also, I had to increase /proc/sys/net/ipv4/tcp_mem to get 30k connections to work without
>> the kernel spamming:
>>
>> Jan 23 15:02:41 lf1003-e3v2-13100124-f20x64 kernel: TCP: out of memory -- consider tuning tcp_mem
>> Jan 23 15:02:41 lf1003-e3v2-13100124-f20x64 kernel: TCP: out of memory -- consider tuning tcp_mem
>>
>> This is a 16 GB RAM system, and I did not have to tune this on the 4.5.0-rc2+ (good) kernels
>> to get the similar performance.  I was testing on ixgbe there though, possibly that is part
>> of it, or maybe I just need to force tcp_mem to be larger on more recent kernels??
>
>
> Since linux-4.2 tcp_mem[0,1,2] defaults are 4.68%, 6.25%, 9.37% of
> physical memory.
>
> It used to be twice that in older kernels.
>
> It is also possible that some change in TCP congestion control or autotuning
> allows each of your TCP flows to store more data in its write queue,
> if your application is pushing bulk data as fast as it can.
>
> It is virtually impossible to change anything in the kernel with
> zero impact on very pathological use cases.

Yes, but pathological use cases can also uncover a real issue that normal
users will not often notice.  Based on the commit message, it seems you
expected no real regressions from that patch, but at least in my case I see
large ones, so something might be off with it.

>
> tcp_wmem[2] is 4MB.
>
> 30,000 * 4MB = 120 GB
>
> Definitely more than your physical memory.

I'll spend some time looking at the tcp_mem issue now that I have a work-around for
the many-connection issue...
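
For reference, here is roughly what the client-side setup looks like with that
option.  This is only a sketch: device name, addresses and port are
placeholders, error handling is trimmed, and connect() is blocking here rather
than non-blocking as in the real test.

/* Sketch of bind()+connect() with IP_BIND_ADDRESS_NO_PORT, the work-around
 * discussed above.  Device, addresses and port are placeholders; error
 * handling is trimmed and connect() is left blocking for brevity. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef IP_BIND_ADDRESS_NO_PORT
#define IP_BIND_ADDRESS_NO_PORT 24	/* linux/in.h, kernels >= 4.2 */
#endif

static int connect_from(const char *dev, const char *local_ip,
			const char *remote_ip, int remote_port)
{
	struct sockaddr_in local = { 0 }, remote = { 0 };
	int one = 1;
	int fd = socket(AF_INET, SOCK_STREAM, 0);

	if (fd < 0)
		return -1;

	/* Same per-device / per-IP binding as in the strace above. */
	setsockopt(fd, SOL_SOCKET, SO_BINDTODEVICE, dev, strlen(dev) + 1);
	setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));

	/* Do not reserve a port at bind() time; let connect() pick one
	 * that only has to make the 4-tuple unique. */
	setsockopt(fd, IPPROTO_IP, IP_BIND_ADDRESS_NO_PORT, &one, sizeof(one));

	local.sin_family = AF_INET;
	inet_pton(AF_INET, local_ip, &local.sin_addr);	/* sin_port stays 0 */
	if (bind(fd, (struct sockaddr *)&local, sizeof(local)) < 0)
		goto fail;

	remote.sin_family = AF_INET;
	remote.sin_port = htons(remote_port);
	inet_pton(AF_INET, remote_ip, &remote.sin_addr);
	if (connect(fd, (struct sockaddr *)&remote, sizeof(remote)) < 0)
		goto fail;

	return fd;
fail:
	close(fd);
	return -1;
}

Called once per connection, e.g. with the 10.1.1.4 -> 10.1.1.5:33012 pair from
the strace above.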

Thanks,
Ben


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: TCP many-connection regression (bisected to 4.5.0-rc2+)
  2018-01-23 23:27               ` Ben Greear
@ 2018-01-24  0:05                 ` Ben Greear
  0 siblings, 0 replies; 15+ messages in thread
From: Ben Greear @ 2018-01-24  0:05 UTC (permalink / raw)
  To: Eric Dumazet, netdev

On 01/23/2018 03:27 PM, Ben Greear wrote:
> On 01/23/2018 03:21 PM, Eric Dumazet wrote:
>> On Tue, 2018-01-23 at 15:10 -0800, Ben Greear wrote:
>>> On 01/23/2018 02:29 PM, Eric Dumazet wrote:
>>>> On Tue, 2018-01-23 at 14:09 -0800, Ben Greear wrote:
>>>>> On 01/23/2018 02:07 PM, Eric Dumazet wrote:
>>>>>> On Tue, 2018-01-23 at 13:49 -0800, Ben Greear wrote:
>>>>>>> On 01/22/2018 10:16 AM, Eric Dumazet wrote:
>>>>>>>> On Mon, 2018-01-22 at 09:28 -0800, Ben Greear wrote:
>>>>>>>>> My test case is to have 6 processes each create 5000 TCP IPv4 connections to each other
>>>>>>>>> on a system with 16GB RAM and send slow-speed data.  This works fine on a 4.7 kernel, but
>>>>>>>>> will not work at all on a 4.13.  The 4.13 first complains about running out of tcp memory,
>>>>>>>>> but even after forcing those values higher, the max connections we can get is around 15k.
>>>>>>>>>
>>>>>>>>> Both kernels have my out-of-tree patches applied, so it is possible it is my fault
>>>>>>>>> at this point.
>>>>>>>>>
>>>>>>>>> Any suggestions as to what this might be caused by, or if it is fixed in more recent kernels?
>>>>>>>>>
>>>>>>>>> I will start bisecting in the meantime...
>>>>>>>>>
>>>>>>>>
>>>>>>>> Hi Ben
>>>>>>>>
>>>>>>>> Unfortunately I have no idea.
>>>>>>>>
>>>>>>>> Are you using loopback flows, or have I misunderstood you ?
>>>>>>>>
>>>>>>>> How loopback connections can be slow-speed ?
>>>>>>>>
>>>>>>>
>>>>>>> Hello Eric, looks like it is one of your commits that causes the issue
>>>>>>> I see.
>>>>>>>
>>>>>>> Here are some more details on my specific test case I used to bisect:
>>>>>>>
>>>>>>> I have two ixgbe ports looped back, configured on same subnet, but with different IPs.
>>>>>>> Routing table rules, SO_BINDTODEVICE, binding to specific IPs on both client and server
>>>>>>> side let me send-to-self over the external looped cable.
>>>>>>>
>>>>>>> I have 2 mac-vlans on each physical interface.
>>>>>>>
>>>>>>> I created 5 server-side connections on one physical port, and two more on one of the mac-vlans.
>>>>>>>
>>>>>>> On the client-side, I create a process that spawns 5000 connections to the corresponding server side.
>>>>>>>
>>>>>>> End result is 25,000 connections on one pair of real interfaces, and 10,000 connections on the
>>>>>>> mac-vlan ports.
>>>>>>>
>>>>>>> In the passing case, I get very close to all 5000 connections on all endpoints quickly.
>>>>>>>
>>>>>>> In the failing case, I get a max of around 16k connections on the two physical ports.  The two mac-vlans have 10k connections
>>>>>>> across them working reliably.  It seems to be an issue with 'connect' failing.
>>>>>>>
>>>>>>> connect(2074, {sa_family=AF_INET, sin_port=htons(33012), sin_addr=inet_addr("10.1.1.5")}, 16) = -1 EINPROGRESS (Operation now in progress)
>>>>>>> socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 2075
>>>>>>> fcntl(2075, F_GETFD)                    = 0
>>>>>>> fcntl(2075, F_SETFD, FD_CLOEXEC)        = 0
>>>>>>> setsockopt(2075, SOL_SOCKET, SO_BINDTODEVICE, "eth4\0\0\0\0\0\0\0\0\0\0\0\0", 16) = 0
>>>>>>> setsockopt(2075, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
>>>>>>> bind(2075, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("10.1.1.4")}, 16) = 0
>>>>>>> getsockopt(2075, SOL_SOCKET, SO_RCVBUF, [87380], [4]) = 0
>>>>>>> getsockopt(2075, SOL_SOCKET, SO_SNDBUF, [16384], [4]) = 0
>>>>>>> setsockopt(2075, SOL_TCP, TCP_NODELAY, [0], 4) = 0
>>>>>>> fcntl(2075, F_GETFL)                    = 0x2 (flags O_RDWR)
>>>>>>> fcntl(2075, F_SETFL, O_ACCMODE|O_NONBLOCK) = 0
>>>>>>> connect(2075, {sa_family=AF_INET, sin_port=htons(33012), sin_addr=inet_addr("10.1.1.5")}, 16) = -1 EINPROGRESS (Operation now in progress)
>>>>>>> socket(PF_INET, SOCK_STREAM, IPPROTO_IP) = 2076
>>>>>>> fcntl(2076, F_GETFD)                    = 0
>>>>>>> fcntl(2076, F_SETFD, FD_CLOEXEC)        = 0
>>>>>>> setsockopt(2076, SOL_SOCKET, SO_BINDTODEVICE, "eth4\0\0\0\0\0\0\0\0\0\0\0\0", 16) = 0
>>>>>>> setsockopt(2076, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0
>>>>>>> bind(2076, {sa_family=AF_INET, sin_port=htons(0), sin_addr=inet_addr("10.1.1.4")}, 16) = 0
>>>>>>> getsockopt(2076, SOL_SOCKET, SO_RCVBUF, [87380], [4]) = 0
>>>>>>> getsockopt(2076, SOL_SOCKET, SO_SNDBUF, [16384], [4]) = 0
>>>>>>> setsockopt(2076, SOL_TCP, TCP_NODELAY, [0], 4) = 0
>>>>>>> fcntl(2076, F_GETFL)                    = 0x2 (flags O_RDWR)
>>>>>>> fcntl(2076, F_SETFL, O_ACCMODE|O_NONBLOCK) = 0
>>>>>>> connect(2076, {sa_family=AF_INET, sin_port=htons(33012), sin_addr=inet_addr("10.1.1.5")}, 16) = -1 EADDRNOTAVAIL (Cannot assign requested address)
>>>>>>> ....
>>>>>>>
>>>>>>>
>>>>>>> ea8add2b190395408b22a9127bed2c0912aecbc8 is the first bad commit
>>>>>>> commit ea8add2b190395408b22a9127bed2c0912aecbc8
>>>>>>> Author: Eric Dumazet <edumazet@google.com>
>>>>>>> Date:   Thu Feb 11 16:28:50 2016 -0800
>>>>>>>
>>>>>>>      tcp/dccp: better use of ephemeral ports in bind()
>>>>>>>
>>>>>>>      Implement strategy used in __inet_hash_connect() in opposite way :
>>>>>>>
>>>>>>>      Try to find a candidate using odd ports, then fallback to even ports.
>>>>>>>
>>>>>>>      We no longer disable BH for whole traversal, but one bucket at a time.
>>>>>>>      We also use cond_resched() to yield cpu to other tasks if needed.
>>>>>>>
>>>>>>>      I removed one indentation level and tried to mirror the loop we have
>>>>>>>      in __inet_hash_connect() and variable names to ease code maintenance.
>>>>>>>
>>>>>>>      Signed-off-by: Eric Dumazet <edumazet@google.com>
>>>>>>>      Signed-off-by: David S. Miller <davem@davemloft.net>
>>>>>>>
>>>>>>> :040000 040000 3af4595c6eb6d331e1cba78a142d44e00f710d81 e0c014ae8b7e2867256eff60f6210821d36eacef M    net
>>>>>>>
>>>>>>>
>>>>>>> I will be happy to test patches or try to get any other results that might help diagnose
>>>>>>> this problem better.
>>>>>>
>>>>>> Problem is I do not see anything obvious here.
>>>>>>
>>>>>> Please provide /proc/sys/net/ipv4/ip_local_port_range
>>>>>
>>>>> [root@lf1003-e3v2-13100124-f20x64 ~]# cat /proc/sys/net/ipv4/ip_local_port_range
>>>>> 10000    61001
>>>>>
>>>>>>
>>>>>> Also you probably could use IP_BIND_ADDRESS_NO_PORT socket option
>>>>>> before the bind()
>>>>>
>>>>> I'll read up on that to see what it does...
>>>>
>>>> man 7 ip
>>>>
>>>>        IP_BIND_ADDRESS_NO_PORT (since Linux 4.2)
>>>>               Inform the kernel to not reserve an ephemeral port when
>>>>               using bind(2) with a port number of 0.  The port will
>>>>               later be automatically chosen at connect(2) time, in a
>>>>               way that allows sharing a source port as long as the
>>>>               4-tuple is unique.
>>>>
>>>
>>> Yes, I found that.
>>>
>>> It appears this option works well for my case, and I see 30k connections across my pair of e1000e
>>> (though the NIC is retching again, so I guess its issues are not fully resolved).
>>>
>>> I tested this on my 4.13.16+ kernel.
>>>
>>> But that said, maybe there is still some issue with the patch I bisected to, so if you have
>>> other suggestions, I can back out this IP_BIND_ADDRESS_NO_PORT feature and re-test.
>>>
>>> Also, I had to increase /proc/sys/net/ipv4/tcp_mem to get 30k connections to work without
>>> the kernel spamming:
>>>
>>> Jan 23 15:02:41 lf1003-e3v2-13100124-f20x64 kernel: TCP: out of memory -- consider tuning tcp_mem
>>> Jan 23 15:02:41 lf1003-e3v2-13100124-f20x64 kernel: TCP: out of memory -- consider tuning tcp_mem
>>>
>>> This is a 16 GB RAM system, and I did not have to tune this on the 4.5.0-rc2+ (good) kernels
>>> to get the similar performance.  I was testing on ixgbe there though, possibly that is part
>>> of it, or maybe I just need to force tcp_mem to be larger on more recent kernels??
>>
>>
>> Since linux-4.2 tcp_mem[0,1,2] defaults are 4.68%, 6.25%, 9.37% of
>> physical memory.
>>
>> It used to be twice that in older kernels.
>>
>> It is also possible that some change in TCP congestion control or autotuning
>> allows each of your TCP flows to store more data in its write queue,
>> if your application is pushing bulk data as fast as it can.
>>
>> It is virtually impossible to change anything in the kernel with
>> zero impact on very pathological use cases.
>
> Yes, but also pathological use cases may uncover a real issue that normal
> users will not often notice....  Based on the commit message, it seems you
> expected no real regressions with that patch, but at least in my case, I see
> large ones, so something might be off with it.
>
>>
>> tcp_wmem[2] is 4MB.
>>
>> 30,000 * 4MB = 120 GB
>>
>> Definitely more than your physical memory.
>
> I'll spend some time looking at the tcp-mem issue now that I have a work-around for
> the many connection issues...

Looks like when I use the ixgbe, I can do 70k connections (using two physical ports, plus two mac-vlans on each
to ensure no more than 30k connections per IP pair).  It runs solid with 3GB of RAM free and no tcp_mem warnings.
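
Back-of-the-envelope on the port budget behind that split (sketch only: it
just reads ip_local_port_range, which is 10000-61001 on this box, and prints
the per-4-tuple ceiling; it does not model the multiple server ports in the
real test):

/* Sketch: the ephemeral port range caps how many connections can share one
 * (local IP, remote IP, remote port) combination even with
 * IP_BIND_ADDRESS_NO_PORT, hence spreading the load over several IP pairs. */
#include <stdio.h>

int main(void)
{
	FILE *f = fopen("/proc/sys/net/ipv4/ip_local_port_range", "r");
	int lo, hi;

	if (!f)
		return 1;
	if (fscanf(f, "%d %d", &lo, &hi) != 2) {
		fclose(f);
		return 1;
	}
	fclose(f);

	printf("ephemeral ports per (local IP, remote IP:port): %d\n",
	       hi - lo + 1);
	return 0;
}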

e1000e has lots of tx hangs, and it seems that exacerbates TCP memory pressure.

So, seems I'm good to go on this as long as I stay away from e1000e.

Thanks,
Ben


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2018-01-24  0:05 UTC | newest]

Thread overview: 15+ messages
2018-01-22 17:28 TCP many-connection regression between 4.7 and 4.13 kernels Ben Greear
2018-01-22 18:16 ` Eric Dumazet
2018-01-22 18:27   ` Willy Tarreau
2018-01-22 18:30   ` Ben Greear
2018-01-22 18:44     ` Ben Greear
2018-01-22 18:46     ` Josh Hunt
2018-01-23 22:06       ` Ben Greear
2018-01-23 21:49   ` TCP many-connection regression (bisected to 4.5.0-rc2+) Ben Greear
2018-01-23 22:07     ` Eric Dumazet
2018-01-23 22:09       ` Ben Greear
2018-01-23 22:29         ` Eric Dumazet
2018-01-23 23:10           ` Ben Greear
2018-01-23 23:21             ` Eric Dumazet
2018-01-23 23:27               ` Ben Greear
2018-01-24  0:05                 ` Ben Greear
