All of lore.kernel.org
 help / color / mirror / Atom feed
From: Eric Dumazet <eric.dumazet@gmail.com>
To: "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Cc: Christoph Lameter <cl@linux-foundation.org>,
	netdev <netdev@vger.kernel.org>, Tejun Heo <tj@kernel.org>,
	Pekka Enberg <penberg@cs.helsinki.fi>,
	alex.shi@intel.com,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"Ma, Ling" <ling.ma@intel.com>,
	"Chen, Tim C" <tim.c.chen@intel.com>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: hackbench regression due to commit 9dfc6e68bfe6e
Date: Wed, 07 Apr 2010 08:39:12 +0200	[thread overview]
Message-ID: <1270622352.2091.702.camel@edumazet-laptop> (raw)
In-Reply-To: <1270607668.2078.259.camel@ymzhang.sh.intel.com>

Le mercredi 07 avril 2010 à 10:34 +0800, Zhang, Yanmin a écrit :

> I collected retired instruction, dtlb miss and LLC miss.
> Below is data of LLC miss.
> 
> Kernel 2.6.33:
> # Samples: 11639436896 LLC-load-misses
> #
> # Overhead          Command                                           Shared Object  Symbol
> # ........  ...............  ......................................................  ......
> #
>     20.94%        hackbench  [kernel.kallsyms]                                       [k] copy_user_generic_string
>     14.56%        hackbench  [kernel.kallsyms]                                       [k] unix_stream_recvmsg
>     12.88%        hackbench  [kernel.kallsyms]                                       [k] kfree
>      7.37%        hackbench  [kernel.kallsyms]                                       [k] kmem_cache_free
>      7.18%        hackbench  [kernel.kallsyms]                                       [k] kmem_cache_alloc_node
>      6.78%        hackbench  [kernel.kallsyms]                                       [k] kfree_skb
>      6.27%        hackbench  [kernel.kallsyms]                                       [k] __kmalloc_node_track_caller
>      2.73%        hackbench  [kernel.kallsyms]                                       [k] __slab_free
>      2.21%        hackbench  [kernel.kallsyms]                                       [k] get_partial_node
>      2.01%        hackbench  [kernel.kallsyms]                                       [k] _raw_spin_lock
>      1.59%        hackbench  [kernel.kallsyms]                                       [k] schedule
>      1.27%        hackbench  hackbench                                               [.] receiver
>      0.99%        hackbench  libpthread-2.9.so                                       [.] __read
>      0.87%        hackbench  [kernel.kallsyms]                                       [k] unix_stream_sendmsg
> 
> 
> 
> 
> Kernel 2.6.34-rc3:
> # Samples: 13079611308 LLC-load-misses
> #
> # Overhead          Command                                                         Shared Object  Symbol
> # ........  ...............  ....................................................................  ......
> #
>     18.55%        hackbench  [kernel.kallsyms]                                                     [k] copy_user_generic_str
> ing
>     13.19%        hackbench  [kernel.kallsyms]                                                     [k] unix_stream_recvmsg
>     11.62%        hackbench  [kernel.kallsyms]                                                     [k] kfree
>      8.54%        hackbench  [kernel.kallsyms]                                                     [k] kmem_cache_free
>      7.88%        hackbench  [kernel.kallsyms]                                                     [k] __kmalloc_node_track_
> caller
>      6.54%        hackbench  [kernel.kallsyms]                                                     [k] kmem_cache_alloc_node
>      5.94%        hackbench  [kernel.kallsyms]                                                     [k] kfree_skb
>      3.48%        hackbench  [kernel.kallsyms]                                                     [k] __slab_free
>      2.15%        hackbench  [kernel.kallsyms]                                                     [k] _raw_spin_lock
>      1.83%        hackbench  [kernel.kallsyms]                                                     [k] schedule
>      1.82%        hackbench  [kernel.kallsyms]                                                     [k] get_partial_node
>      1.59%        hackbench  hackbench                                                             [.] receiver
>      1.37%        hackbench  libpthread-2.9.so                                                     [.] __read
> 
> 

Please check values of /proc/sys/net/core/rmem_default
and /proc/sys/net/core/wmem_default on your machines.

Their values can also change hackbench results, because increasing
wmem_default allows af_unix senders to consume much more skbs and stress
slab allocators (__slab_free), way beyond slub_min_order can tune them.

When 2000 senders are running (and 2000 receivers), we might consume
something like 2000 * 100.000 bytes of kernel memory for skbs. TLB
trashing is expected, because all these skbs can span many 2MB pages.
Maybe some node imbalance happens too.



You could try to boot your machine with less ram per node and check :

# cat /proc/buddyinfo 
Node 0, zone      DMA      2      1      2      2      1      1      1      0      1      1      3 
Node 0, zone    DMA32    219    298    143    584    145     57     44     41     31     26    517 
Node 1, zone    DMA32      4      1     17      1      0      3      2      2      2      2    123 
Node 1, zone   Normal    126    169     83      8      7      5     59     59     49     28    459 


One experiment on your Nehalem machine would be to change hackbench so
that each group (20 senders/ 20 receivers) run on a particular NUMA
node.

x86info -c ->

CPU #1
EFamily: 0 EModel: 1 Family: 6 Model: 26 Stepping: 5
CPU Model: Core i7 (Nehalem)
Processor name string: Intel(R) Xeon(R) CPU           X5570  @ 2.93GHz
Type: 0 (Original OEM)	Brand: 0 (Unsupported)
Number of cores per physical package=8
Number of logical processors per socket=16
Number of logical processors per core=2
APIC ID: 0x10	Package: 0  Core: 1   SMT ID 0
Cache info
 L1 Instruction cache: 32KB, 4-way associative. 64 byte line size.
 L1 Data cache: 32KB, 8-way associative. 64 byte line size.
 L2 (MLC): 256KB, 8-way associative. 64 byte line size.
TLB info
 Data TLB: 4KB pages, 4-way associative, 64 entries
 64 byte prefetching.
Found unknown cache descriptors: 55 5a b2 ca e4 



  reply	other threads:[~2010-04-07  6:39 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-03-25  8:40 hackbench regression due to commit 9dfc6e68bfe6e Alex Shi
2010-03-25 14:49 ` Christoph Lameter
2010-03-26  2:35   ` Alex Shi
2010-04-01  9:29     ` Zhang, Yanmin
2010-04-01 15:53       ` Christoph Lameter
2010-04-02  8:06         ` Zhang, Yanmin
2010-04-05 13:54           ` Christoph Lameter
2010-04-05 17:30             ` Pekka Enberg
2010-04-06  1:27               ` Tejun Heo
2010-04-06  8:28                 ` Zhang, Yanmin
2010-04-06 15:41                   ` Christoph Lameter
2010-04-06 20:55                     ` Christoph Lameter
2010-04-06 22:10                       ` Eric Dumazet
2010-04-07  2:34                         ` Zhang, Yanmin
2010-04-07  6:39                           ` Eric Dumazet [this message]
2010-04-07  9:07                             ` Zhang, Yanmin
2010-04-07  9:20                               ` Eric Dumazet
2010-04-07 10:47                           ` Pekka Enberg
2010-04-07 16:30                           ` Christoph Lameter
2010-04-07 16:43                           ` Christoph Lameter
2010-04-07 16:49                             ` Pekka Enberg
2010-04-07 16:52                               ` Pekka Enberg
2010-04-07 18:20                                 ` Christoph Lameter
2010-04-07 18:25                                   ` Pekka Enberg
2010-04-07 18:25                                     ` Pekka Enberg
2010-04-07 19:30                                     ` Christoph Lameter
2010-04-07 18:38                                   ` Eric Dumazet
2010-04-08  1:05                                     ` Zhang, Yanmin
2010-04-08  4:59                                       ` Eric Dumazet
2010-04-08  5:39                                         ` Eric Dumazet
2010-04-08  7:00                                           ` Eric Dumazet
2010-04-08  7:05                                             ` David Miller
2010-04-08  7:20                                               ` David Miller
2010-04-08  7:25                                               ` Eric Dumazet
2010-04-08  7:54                                             ` Zhang, Yanmin
2010-04-08  7:54                                               ` Eric Dumazet
2010-04-08  8:09                                                 ` Eric Dumazet
2010-04-08 15:34                                           ` Christoph Lameter
2010-04-08 15:52                                             ` Eric Dumazet
2010-04-07 18:18                               ` Christoph Lameter
2010-04-08  7:18                             ` Zhang, Yanmin
2010-04-07  2:20                       ` Zhang, Yanmin
2010-04-07  0:58                     ` Zhang, Yanmin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1270622352.2091.702.camel@edumazet-laptop \
    --to=eric.dumazet@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=alex.shi@intel.com \
    --cc=cl@linux-foundation.org \
    --cc=ling.ma@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    --cc=penberg@cs.helsinki.fi \
    --cc=tim.c.chen@intel.com \
    --cc=tj@kernel.org \
    --cc=yanmin_zhang@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.