All of lore.kernel.org
 help / color / mirror / Atom feed
From: Josef Bacik <jbacik@fb.com>
To: Hannes Frederic Sowa <hannes@stressinduktion.org>
Cc: Eric Dumazet <eric.dumazet@gmail.com>,
	Tom Herbert <tom@herbertland.com>,
	Linux Kernel Network Developers <netdev@vger.kernel.org>,
	Josef Bacik <jbacik@fb.com>
Subject: Re: Soft lockup in inet_put_port on 4.6
Date: Mon, 12 Dec 2016 17:24:26 -0500	[thread overview]
Message-ID: <1481581466.24490.2@smtp.office365.com> (raw)
In-Reply-To: <3c022731-e703-34ac-55f1-60f5b94b6d62@stressinduktion.org>


On Mon, Dec 12, 2016 at 1:44 PM, Hannes Frederic Sowa 
<hannes@stressinduktion.org> wrote:
> On 12.12.2016 19:05, Josef Bacik wrote:
>>  On Fri, Dec 9, 2016 at 11:14 PM, Eric Dumazet 
>> <eric.dumazet@gmail.com>
>>  wrote:
>>>  On Fri, 2016-12-09 at 19:47 -0800, Eric Dumazet wrote:
>>> 
>>>> 
>>>>   Hmm... Is your ephemeral port range includes the port your load
>>>>   balancing app is using ?
>>> 
>>>  I suspect that you might have processes doing bind( port = 0) that 
>>> are
>>>  trapped into the bind_conflict() scan ?
>>> 
>>>  With 100,000 + timewaits there, this possibly hurts.
>>> 
>>>  Can you try the following loop breaker ?
>> 
>>  It doesn't appear that the app is doing bind(port = 0) during normal
>>  operation.  I tested this patch and it made no difference.  I'm 
>> going to
>>  test simply restarting the app without changing to the SO_REUSEPORT
>>  option.  Thanks,
> 
> Would it be possible to trace the time the function uses with trace? 
> If
> we don't see the number growing considerably over time we probably can
> rule out that we loop somewhere in there (I would instrument
> inet_csk_bind_conflict, __inet_hash_connect and inet_csk_get_port).
> 
> __inet_hash_connect -> __inet_check_established also takes a lock
> (inet_ehash_lockp) which can be locked from inet_diag code path during
> socket diag info dumping.
> 
> Unfortunately we couldn't reproduce it so far. :/

So I had a bcc script running to time how long we spent in 
inet_csk_bind_conflict, __inet_hash_connect and inet_csk_get_port, but 
of course I'm an idiot and didn't actually separate out the stats so I 
could tell _which_ one was taking forever.  But anyway here's a normal 
distribution on the box

     Some shit           : count     distribution
         0 -> 1          : 0        |                                   
     |
         2 -> 3          : 0        |                                   
     |
         4 -> 7          : 0        |                                   
     |
         8 -> 15         : 0        |                                   
     |
        16 -> 31         : 0        |                                   
     |
        32 -> 63         : 0        |                                   
     |
        64 -> 127        : 0        |                                   
     |
       128 -> 255        : 0        |                                   
     |
       256 -> 511        : 0        |                                   
     |
       512 -> 1023       : 0        |                                   
     |
      1024 -> 2047       : 74       |                                   
     |
      2048 -> 4095       : 10537    
|****************************************|
      4096 -> 8191       : 8497     |********************************   
     |
      8192 -> 16383      : 3745     |**************                     
     |
     16384 -> 32767      : 300      |*                                  
     |
     32768 -> 65535      : 250      |                                   
     |
     65536 -> 131071     : 180      |                                   
     |
    131072 -> 262143     : 71       |                                   
     |
    262144 -> 524287     : 18       |                                   
     |
    524288 -> 1048575    : 5        |                                   
     |

With the times in nanoseconds, and here's the distribution during the 
problem

     Some shit           : count     distribution
         0 -> 1          : 0        |                                   
     |
         2 -> 3          : 0        |                                   
     |
         4 -> 7          : 0        |                                   
     |
         8 -> 15         : 0        |                                   
     |
        16 -> 31         : 0        |                                   
     |
        32 -> 63         : 0        |                                   
     |
        64 -> 127        : 0        |                                   
     |
       128 -> 255        : 0        |                                   
     |
       256 -> 511        : 0        |                                   
     |
       512 -> 1023       : 0        |                                   
     |
      1024 -> 2047       : 21       |                                   
     |
      2048 -> 4095       : 21820    
|****************************************|
      4096 -> 8191       : 11598    |*********************              
     |
      8192 -> 16383      : 4337     |*******                            
     |
     16384 -> 32767      : 290      |                                   
     |
     32768 -> 65535      : 59       |                                   
     |
     65536 -> 131071     : 23       |                                   
     |
    131072 -> 262143     : 12       |                                   
     |
    262144 -> 524287     : 6        |                                   
     |
    524288 -> 1048575    : 19       |                                   
     |
   1048576 -> 2097151    : 1079     |*                                  
     |
   2097152 -> 4194303    : 0        |                                   
     |
   4194304 -> 8388607    : 1        |                                   
     |
   8388608 -> 16777215   : 0        |                                   
     |
  16777216 -> 33554431   : 0        |                                   
     |
  33554432 -> 67108863   : 1192     |**                                 
     |
               Some shit                     : count     distribution
                   0 -> 1                    : 0        |               
     |
                   2 -> 3                    : 0        |               
     |
                   4 -> 7                    : 0        |               
     |
                   8 -> 15                   : 0        |               
     |
                  16 -> 31                   : 0        |               
     |
                  32 -> 63                   : 0        |               
     |
                  64 -> 127                  : 0        |               
     |
                 128 -> 255                  : 0        |               
     |
                 256 -> 511                  : 0        |               
     |
                 512 -> 1023                 : 0        |               
     |
                1024 -> 2047                 : 48       |               
     |
                2048 -> 4095                 : 14714    
|********************|
                4096 -> 8191                 : 6769     |*********      
     |
                8192 -> 16383                : 2234     |***            
     |
               16384 -> 32767                : 422      |               
     |
               32768 -> 65535                : 208      |               
     |
               65536 -> 131071               : 61       |               
     |
              131072 -> 262143               : 10       |               
     |
              262144 -> 524287               : 416      |               
     |
              524288 -> 1048575              : 826      |*              
     |
             1048576 -> 2097151              : 598      |               
     |
             2097152 -> 4194303              : 10       |               
     |
             4194304 -> 8388607              : 0        |               
     |
             8388608 -> 16777215             : 1        |               
     |
            16777216 -> 33554431             : 289      |               
     |
            33554432 -> 67108863             : 921      |*              
     |
            67108864 -> 134217727            : 74       |               
     |
           134217728 -> 268435455            : 75       |               
     |
           268435456 -> 536870911            : 48       |               
     |
           536870912 -> 1073741823           : 25       |               
     |
          1073741824 -> 2147483647           : 3        |               
     |
          2147483648 -> 4294967295           : 2        |               
     |
          4294967296 -> 8589934591           : 1        |               
     |

As you can see we start getting tail latencies of up to 4-8 seconds.  
Tomorrow I'll separate out the stats so we can know which function is 
the problem child.  Sorry about not doing that first.  Thanks,

Josef

  parent reply	other threads:[~2016-12-12 22:24 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-12-06 23:06 Soft lockup in inet_put_port on 4.6 Tom Herbert
2016-12-08 21:03 ` Hannes Frederic Sowa
2016-12-08 21:36   ` Josef Bacik
2016-12-09  0:30     ` Eric Dumazet
2016-12-09  1:01       ` Josef Bacik
2016-12-10  1:59         ` Josef Bacik
2016-12-10  3:47           ` Eric Dumazet
2016-12-10  4:14             ` Eric Dumazet
2016-12-12 18:05               ` Josef Bacik
2016-12-12 18:44                 ` Hannes Frederic Sowa
2016-12-12 21:23                   ` Josef Bacik
2016-12-12 22:24                   ` Josef Bacik [this message]
2016-12-13 20:51                     ` Tom Herbert
2016-12-13 23:03                       ` Craig Gallek
2016-12-13 23:32                         ` Tom Herbert
2016-12-15 18:53                           ` Josef Bacik
2016-12-15 22:39                             ` Tom Herbert
2016-12-15 23:25                               ` Craig Gallek
2016-12-16  0:07                             ` Hannes Frederic Sowa
2016-12-16 14:54                               ` Josef Bacik
2016-12-16 15:21                                 ` Josef Bacik
2016-12-16 22:08                                   ` Josef Bacik
2016-12-16 22:18                                     ` Tom Herbert
2016-12-16 22:50                                       ` Josef Bacik
2016-12-17 11:08                                         ` Hannes Frederic Sowa
2016-12-17 13:26                                           ` Josef Bacik
2016-12-20  1:56                                             ` David Miller
2016-12-20  2:07                                               ` Tom Herbert
2016-12-20  2:41                                                 ` Eric Dumazet
2016-12-20  3:40                                                   ` Josef Bacik
2016-12-20  4:52                                                     ` Eric Dumazet
2016-12-20  4:59                                                       ` Josef Bacik

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1481581466.24490.2@smtp.office365.com \
    --to=jbacik@fb.com \
    --cc=eric.dumazet@gmail.com \
    --cc=hannes@stressinduktion.org \
    --cc=netdev@vger.kernel.org \
    --cc=tom@herbertland.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.