From: David Rientjes <rientjes@google.com>
To: Pekka Enberg <penberg@cs.helsinki.fi>
Cc: Christoph Lameter <cl@linux-foundation.org>,
Nick Piggin <nickpiggin@yahoo.com.au>,
Martin Bligh <mbligh@google.com>,
linux-kernel@vger.kernel.org
Subject: [patch 0/3] slub partial list thrashing performance degradation
Date: Sun, 29 Mar 2009 22:43:38 -0700 (PDT)
Message-ID: <alpine.DEB.2.00.0903292241300.15813@chino.kir.corp.google.com>
SLUB causes a performance degradation in comparison to SLAB when a
workload has an object allocation and freeing pattern such that it spends
more time in partial list handling than utilizing the fastpaths.
This usually occurs when freeing to a non-cpu slab either due to remote
cpu freeing or freeing to a full or partial slab. When the cpu slab is
later replaced with the freeing slab, it can only satisfy a limited
number of allocations before becoming full and requiring additional
partial list handling.
When the slowpath to fastpath ratio becomes high, this partial list
handling causes the entire allocator to become very slow for the specific
workload.
The bash script at the end of this email (inline) illustrates the
performance degradation well. It uses the netperf TCP_RR benchmark to
measure transfer rates with various thread counts, each being multiples
of the number of cores. The transfer rates are reported as an aggregate
of the individual thread results.
CONFIG_SLUB_STATS shows that the kmalloc-256 and kmalloc-2048 caches are
performing quite poorly:

	cache		ALLOC_FASTPATH	ALLOC_SLOWPATH
	kmalloc-256	      98125871	      31585955
	kmalloc-2048	      77243698	      52347453

	cache		FREE_FASTPATH	FREE_SLOWPATH
	kmalloc-256	        173624	     129538000
	kmalloc-2048	         90520	     129500630
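[ For reference, counters like these can be read from the per-cache files
under sysfs when CONFIG_SLUB_STATS is enabled; each file prints the total
first, followed by the per-cpu breakdown, so taking the first field gives
the aggregate.  The path layout below assumes the standard
/sys/kernel/slab tree. ]

```shell
# Dump the fast/slowpath counters for the two caches of interest.
# SLAB_SYSFS may be overridden, e.g. to read from a copied tree.
SLAB_SYSFS=${SLAB_SYSFS:-/sys/kernel/slab}
for cache in kmalloc-256 kmalloc-2048; do
	for stat in alloc_fastpath alloc_slowpath \
		    free_fastpath free_slowpath; do
		f="$SLAB_SYSFS/$cache/$stat"
		# First field is the total across all cpus.
		[ -r "$f" ] && echo "$cache $stat $(cut -d' ' -f1 "$f")"
	done
done
```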
The majority of slowpath allocations were from the partial list
(30786261, or 97.5%, for kmalloc-256 and 51688159, or 98.7%, for
kmalloc-2048).
A large percentage of frees required the slab to be added back to the
partial list. For kmalloc-256, 30786630 (23.8%) of slowpath frees
required partial list handling. For kmalloc-2048, 51688697 (39.9%) of
slowpath frees required partial list handling.
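[ The percentages above follow directly from the counters; a quick awk
check, using only the numbers already reported for this workload: ]

```shell
# Percentage helper: 100 * numerator / denominator, one decimal place.
pct() { awk -v n="$1" -v d="$2" 'BEGIN { printf "%.1f", 100 * n / d }'; }

echo "kmalloc-256  partial allocs: $(pct 30786261 31585955)%"	# of slowpath allocs
echo "kmalloc-2048 partial allocs: $(pct 51688159 52347453)%"
echo "kmalloc-256  partial frees:  $(pct 30786630 129538000)%"	# of slowpath frees
echo "kmalloc-2048 partial frees:  $(pct 51688697 129500630)%"
```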
On my 16-core machines with 64G of RAM, these are the results:

	# threads	  SLAB	  SLUB	SLUB+patchset
	       16	 69892	 71592	        69505
	       32	126490	 95373	       119731
	       48	138050	113072	       125014
	       64	169240	149043	       158919
	       80	192294	172035	       179679
	       96	197779	187849	       192154
	      112	217283	204962	       209988
	      128	229848	217547	       223507
	      144	238550	232369	       234565
	      160	250333	239871	       244789
	      176	256878	242712	       248971
	      192	261611	243182	       255596
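[ To quantify the gap at the highest thread count, computed from the
192-thread row of the table above: SLUB trails SLAB by about 7%, and the
patchset recovers most of that difference. ]

```shell
# Percentage by which rate b trails rate a, one decimal place.
gap() { awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f", 100 * (a - b) / a }'; }

echo "SLUB vs SLAB at 192 threads:          $(gap 261611 243182)% slower"
echo "SLUB+patchset vs SLAB at 192 threads: $(gap 261611 255596)% slower"
```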
[ The SLUB+patchset results were attained with the latest git plus this
patchset and slab_thrash_ratio set at 20 for both the kmalloc-256 and
the kmalloc-2048 cache. ]
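[ Assuming the tunable is exposed per-cache under /sys/kernel/slab as
patch 1/3 proposes, setting slab_thrash_ratio to 20 for both caches
might look like the sketch below; the slab_thrash_ratio file only
exists with this patchset applied. ]

```shell
# Set the proposed per-cache slab_thrash_ratio tunable to 20.
# SLAB_SYSFS may be overridden for testing against a mock tree.
SLAB_SYSFS=${SLAB_SYSFS:-/sys/kernel/slab}
for cache in kmalloc-256 kmalloc-2048; do
	f="$SLAB_SYSFS/$cache/slab_thrash_ratio"
	if [ -w "$f" ]; then
		echo 20 > "$f"
	else
		echo "skipping $cache: $f not present (patchset not applied?)" >&2
	fi
done
```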
Cc: Christoph Lameter <cl@linux-foundation.org>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: David Rientjes <rientjes@google.com>
---
include/linux/slub_def.h | 4 +
mm/slub.c | 138 +++++++++++++++++++++++++++++++++++++++-------
2 files changed, 122 insertions(+), 20 deletions(-)
#!/bin/bash

TIME=60			# seconds
HOSTNAME=<hostname>	# netserver
NR_CPUS=$(grep -c ^processor /proc/cpuinfo)
echo NR_CPUS=$NR_CPUS

run_netperf() {
	for i in $(seq 1 $1); do
		netperf -H $HOSTNAME -t TCP_RR -l $TIME &
	done
}

ITERATIONS=0
while [ $ITERATIONS -lt 12 ]; do
	RATE=0
	ITERATIONS=$((ITERATIONS + 1))
	THREADS=$((NR_CPUS * ITERATIONS))
	# Command substitution waits for all backgrounded netperf
	# instances; keep only the numeric result rows, field 6 is the
	# per-thread transaction rate.
	RESULTS=$(run_netperf $THREADS | grep -v '[a-zA-Z]' | awk '{ print $6 }')
	for j in $RESULTS; do
		RATE=$((RATE + ${j/.*}))	# truncate the decimal part
	done
	echo threads=$THREADS rate=$RATE
done