linux-kernel.vger.kernel.org archive mirror
* mm/slab: reduce lock contention in alloc path
@ 2016-03-28  5:26 js1304
  2016-03-28  5:26 ` [PATCH 01/11] mm/slab: hold a slab_mutex when calling __kmem_cache_shrink() js1304
                   ` (10 more replies)
  0 siblings, 11 replies; 29+ messages in thread
From: js1304 @ 2016-03-28  5:26 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Lameter, Pekka Enberg, David Rientjes,
	Jesper Dangaard Brouer, linux-mm, linux-kernel, Joonsoo Kim

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Under concurrent allocations, SLAB can suffer heavy lock contention
because it does a lot of work while holding the node lock. This
patchset tries to shrink the critical sections to reduce that
contention. The major changes are a lockless decision to allocate
more slabs and a lockless cpu cache refill from the newly allocated slab.
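
As a rough, self-contained C sketch of the idea (all names and types
below are invented for illustration; this is not the kernel code), the
expensive work of getting a new slab and refilling the per-cpu cache
happens with no lock held, and the shared node lock is only taken for
the short step that publishes the slab:

#include <pthread.h>
#include <stdlib.h>

#define OBJS_PER_SLAB 64

struct slab { struct slab *next; void *objs[OBJS_PER_SLAB]; int active; };
struct cpu_cache { void *entry[OBJS_PER_SLAB]; int avail; };
struct node { pthread_mutex_t list_lock; struct slab *slabs_free; };

/* Get a fresh slab; no shared state is touched, so no lock is needed. */
static struct slab *grow_begin(void)
{
	struct slab *s = calloc(1, sizeof(*s));
	int i;

	if (!s)
		return NULL;
	for (i = 0; i < OBJS_PER_SLAB; i++)
		s->objs[i] = malloc(32);
	return s;
}

/* Refill the cpu cache from a slab only we can see: still no lock. */
static void refill_from_private_slab(struct cpu_cache *ac, struct slab *s,
				     int batch)
{
	while (s->active < OBJS_PER_SLAB && batch-- > 0)
		ac->entry[ac->avail++] = s->objs[s->active++];
}

/* Only the short "attach to the shared list" step takes the node lock. */
static void grow_end(struct node *n, struct slab *s)
{
	pthread_mutex_lock(&n->list_lock);
	s->next = n->slabs_free;
	n->slabs_free = s;
	pthread_mutex_unlock(&n->list_lock);
}

Before this series, the SLAB equivalents of all three steps ran under
the node lock; with the series applied, only the attach step does.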

Below are the results of the concurrent allocation/free test in the
slab allocation benchmark Christoph made a long time ago. I simplified
the output. The numbers are cycle counts for alloc/free respectively,
so lower is better.

* Before
Kmalloc N*alloc N*free(32): Average=365/806
Kmalloc N*alloc N*free(64): Average=452/690
Kmalloc N*alloc N*free(128): Average=736/886
Kmalloc N*alloc N*free(256): Average=1167/985
Kmalloc N*alloc N*free(512): Average=2088/1125
Kmalloc N*alloc N*free(1024): Average=4115/1184
Kmalloc N*alloc N*free(2048): Average=8451/1748
Kmalloc N*alloc N*free(4096): Average=16024/2048

* After
Kmalloc N*alloc N*free(32): Average=344/792
Kmalloc N*alloc N*free(64): Average=347/882
Kmalloc N*alloc N*free(128): Average=390/959
Kmalloc N*alloc N*free(256): Average=393/1067
Kmalloc N*alloc N*free(512): Average=683/1229
Kmalloc N*alloc N*free(1024): Average=1295/1325
Kmalloc N*alloc N*free(2048): Average=2513/1664
Kmalloc N*alloc N*free(4096): Average=4742/2172

It shows that performance improves greatly (roughly more than 50%)
for object classes larger than 128 bytes.

Thanks.

Joonsoo Kim (11):
  mm/slab: hold a slab_mutex when calling __kmem_cache_shrink()
  mm/slab: remove BAD_ALIEN_MAGIC again
  mm/slab: drain the free slab as much as possible
  mm/slab: factor out kmem_cache_node initialization code
  mm/slab: clean-up kmem_cache_node setup
  mm/slab: don't keep free slabs if free_objects exceeds free_limit
  mm/slab: racy access/modify the slab color
  mm/slab: make cache_grow() handle the page allocated on arbitrary node
  mm/slab: separate cache_grow() to two parts
  mm/slab: refill cpu cache through a new slab without holding a node
    lock
  mm/slab: lockless decision to grow cache

 mm/slab.c        | 495 ++++++++++++++++++++++++++++---------------------------
 mm/slab_common.c |   4 +
 2 files changed, 255 insertions(+), 244 deletions(-)

-- 
1.9.1

^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 01/11] mm/slab: hold a slab_mutex when calling __kmem_cache_shrink()
  2016-03-28  5:26 mm/slab: reduce lock contention in alloc path js1304
@ 2016-03-28  5:26 ` js1304
  2016-03-29  0:50   ` Christoph Lameter
  2016-03-31 10:53   ` Nikolay Borisov
  2016-03-28  5:26 ` [PATCH 02/11] mm/slab: remove BAD_ALIEN_MAGIC again js1304
                   ` (9 subsequent siblings)
  10 siblings, 2 replies; 29+ messages in thread
From: js1304 @ 2016-03-28  5:26 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Lameter, Pekka Enberg, David Rientjes,
	Jesper Dangaard Brouer, linux-mm, linux-kernel, Joonsoo Kim

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Major kmem_cache metadata in the slab subsystem is synchronized with
the slab_mutex. In SLAB, if some of it is changed, the node's shared
array cache is freed and re-populated. If __kmem_cache_shrink() is
called at the same time, it will call drain_array() with n->shared
without holding the node lock, so a use-after-free can happen.

We could fix this small theoretical race condition by holding the node
lock in drain_array(), but holding the slab_mutex in kmem_cache_shrink()
looks like the more appropriate solution: a stable state makes things
less error-prone, and this is not a performance-critical path.

In addition, annotate the affected SLAB functions.
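
For illustration, roughly the interleaving being closed (a sketch of
the window, not a reproduced failure):

  CPU A (slabinfo write,               CPU B (kmem_cache_shrink(),
         holds slab_mutex)                    no slab_mutex today)
  ---------------------------------    ---------------------------------
                                       drain_array(cachep, n, n->shared, ...)
  free_block() old n->shared entries
  n->shared = new_shared
  kfree(old shared array)
                                       ... keeps using the old shared
                                           array after it has been freed

With kmem_cache_shrink() also taking the slab_mutex, the two paths are
serialized and n->shared stays stable while it is being drained.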

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 mm/slab.c        | 2 ++
 mm/slab_common.c | 4 ++++
 2 files changed, 6 insertions(+)

diff --git a/mm/slab.c b/mm/slab.c
index a53a0f6..043606a 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -2218,6 +2218,7 @@ static void do_drain(void *arg)
 	ac->avail = 0;
 }
 
+/* Should be called with slab_mutex to prevent from freeing shared array */
 static void drain_cpu_caches(struct kmem_cache *cachep)
 {
 	struct kmem_cache_node *n;
@@ -3871,6 +3872,7 @@ skip_setup:
  * Drain an array if it contains any elements taking the node lock only if
  * necessary. Note that the node listlock also protects the array_cache
  * if drain_array() is used on the shared array.
+ * Should be called with slab_mutex to prevent from freeing shared array.
  */
 static void drain_array(struct kmem_cache *cachep, struct kmem_cache_node *n,
 			 struct array_cache *ac, int force, int node)
diff --git a/mm/slab_common.c b/mm/slab_common.c
index a65dad7..5bed565 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -755,7 +755,11 @@ int kmem_cache_shrink(struct kmem_cache *cachep)
 	get_online_cpus();
 	get_online_mems();
 	kasan_cache_shrink(cachep);
+
+	mutex_lock(&slab_mutex);
 	ret = __kmem_cache_shrink(cachep, false);
+	mutex_unlock(&slab_mutex);
+
 	put_online_mems();
 	put_online_cpus();
 	return ret;
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 02/11] mm/slab: remove BAD_ALIEN_MAGIC again
  2016-03-28  5:26 mm/slab: reduce lock contention in alloc path js1304
  2016-03-28  5:26 ` [PATCH 01/11] mm/slab: hold a slab_mutex when calling __kmem_cache_shrink() js1304
@ 2016-03-28  5:26 ` js1304
  2016-03-28  8:58   ` Geert Uytterhoeven
  2016-03-28 21:19   ` Andrew Morton
  2016-03-28  5:26 ` [PATCH 03/11] mm/slab: drain the free slab as much as possible js1304
                   ` (8 subsequent siblings)
  10 siblings, 2 replies; 29+ messages in thread
From: js1304 @ 2016-03-28  5:26 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Lameter, Pekka Enberg, David Rientjes,
	Jesper Dangaard Brouer, linux-mm, linux-kernel, Joonsoo Kim

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

The initial attempt to remove BAD_ALIEN_MAGIC was reverted by
commit edcad2509550 ("Revert "slab: remove BAD_ALIEN_MAGIC"")
because it caused a problem on m68k, which has many nodes but
!CONFIG_NUMA. In that case the alien cache isn't used at all, but a
garbage value, BAD_ALIEN_MAGIC, is needed to cope with some
initialization paths. Now that this patch sets use_alien_caches to 0
when !CONFIG_NUMA, there is no initialization path problem, so we
don't need BAD_ALIEN_MAGIC at all. Remove it.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 mm/slab.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index 043606a..a5a205b 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -421,8 +421,6 @@ static struct kmem_cache kmem_cache_boot = {
 	.name = "kmem_cache",
 };
 
-#define BAD_ALIEN_MAGIC 0x01020304ul
-
 static DEFINE_PER_CPU(struct delayed_work, slab_reap_work);
 
 static inline struct array_cache *cpu_cache_get(struct kmem_cache *cachep)
@@ -637,7 +635,7 @@ static int transfer_objects(struct array_cache *to,
 static inline struct alien_cache **alloc_alien_cache(int node,
 						int limit, gfp_t gfp)
 {
-	return (struct alien_cache **)BAD_ALIEN_MAGIC;
+	return NULL;
 }
 
 static inline void free_alien_cache(struct alien_cache **ac_ptr)
@@ -1205,7 +1203,7 @@ void __init kmem_cache_init(void)
 					sizeof(struct rcu_head));
 	kmem_cache = &kmem_cache_boot;
 
-	if (num_possible_nodes() == 1)
+	if (!IS_ENABLED(CONFIG_NUMA) || num_possible_nodes() == 1)
 		use_alien_caches = 0;
 
 	for (i = 0; i < NUM_INIT_LISTS; i++)
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 03/11] mm/slab: drain the free slab as much as possible
  2016-03-28  5:26 mm/slab: reduce lock contention in alloc path js1304
  2016-03-28  5:26 ` [PATCH 01/11] mm/slab: hold a slab_mutex when calling __kmem_cache_shrink() js1304
  2016-03-28  5:26 ` [PATCH 02/11] mm/slab: remove BAD_ALIEN_MAGIC again js1304
@ 2016-03-28  5:26 ` js1304
  2016-03-29  0:54   ` Christoph Lameter
  2016-03-28  5:26 ` [PATCH 04/11] mm/slab: factor out kmem_cache_node initialization code js1304
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 29+ messages in thread
From: js1304 @ 2016-03-28  5:26 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Lameter, Pekka Enberg, David Rientjes,
	Jesper Dangaard Brouer, linux-mm, linux-kernel, Joonsoo Kim

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

slabs_tofree() just computes "enough to free every free slab". We can
get the same effect by simply passing INT_MAX to drain_freelist().

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 mm/slab.c | 12 +++---------
 1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index a5a205b..ba2eacf 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -888,12 +888,6 @@ static int init_cache_node_node(int node)
 	return 0;
 }
 
-static inline int slabs_tofree(struct kmem_cache *cachep,
-						struct kmem_cache_node *n)
-{
-	return (n->free_objects + cachep->num - 1) / cachep->num;
-}
-
 static void cpuup_canceled(long cpu)
 {
 	struct kmem_cache *cachep;
@@ -958,7 +952,7 @@ free_slab:
 		n = get_node(cachep, node);
 		if (!n)
 			continue;
-		drain_freelist(cachep, n, slabs_tofree(cachep, n));
+		drain_freelist(cachep, n, INT_MAX);
 	}
 }
 
@@ -1110,7 +1104,7 @@ static int __meminit drain_cache_node_node(int node)
 		if (!n)
 			continue;
 
-		drain_freelist(cachep, n, slabs_tofree(cachep, n));
+		drain_freelist(cachep, n, INT_MAX);
 
 		if (!list_empty(&n->slabs_full) ||
 		    !list_empty(&n->slabs_partial)) {
@@ -2280,7 +2274,7 @@ int __kmem_cache_shrink(struct kmem_cache *cachep, bool deactivate)
 
 	check_irq_on();
 	for_each_kmem_cache_node(cachep, node, n) {
-		drain_freelist(cachep, n, slabs_tofree(cachep, n));
+		drain_freelist(cachep, n, INT_MAX);
 
 		ret += !list_empty(&n->slabs_full) ||
 			!list_empty(&n->slabs_partial);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 04/11] mm/slab: factor out kmem_cache_node initialization code
  2016-03-28  5:26 mm/slab: reduce lock contention in alloc path js1304
                   ` (2 preceding siblings ...)
  2016-03-28  5:26 ` [PATCH 03/11] mm/slab: drain the free slab as much as possible js1304
@ 2016-03-28  5:26 ` js1304
  2016-03-29  0:56   ` Christoph Lameter
  2016-03-28  5:26 ` [PATCH 05/11] mm/slab: clean-up kmem_cache_node setup js1304
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 29+ messages in thread
From: js1304 @ 2016-03-28  5:26 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Lameter, Pekka Enberg, David Rientjes,
	Jesper Dangaard Brouer, linux-mm, linux-kernel, Joonsoo Kim

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

The kmem_cache_node initialization code can be reused elsewhere, so
factor it out. A following patch will use it.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 mm/slab.c | 68 ++++++++++++++++++++++++++++++++++++---------------------------
 1 file changed, 39 insertions(+), 29 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index ba2eacf..569d7db 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -841,6 +841,40 @@ static inline gfp_t gfp_exact_node(gfp_t flags)
 }
 #endif
 
+static int init_cache_node(struct kmem_cache *cachep, int node, gfp_t gfp)
+{
+	struct kmem_cache_node *n;
+
+	/*
+	 * Set up the kmem_cache_node for cpu before we can
+	 * begin anything. Make sure some other cpu on this
+	 * node has not already allocated this
+	 */
+	n = get_node(cachep, node);
+	if (n)
+		return 0;
+
+	n = kmalloc_node(sizeof(struct kmem_cache_node), gfp, node);
+	if (!n)
+		return -ENOMEM;
+
+	kmem_cache_node_init(n);
+	n->next_reap = jiffies + REAPTIMEOUT_NODE +
+		    ((unsigned long)cachep) % REAPTIMEOUT_NODE;
+
+	n->free_limit =
+		(1 + nr_cpus_node(node)) * cachep->batchcount + cachep->num;
+
+	/*
+	 * The kmem_cache_nodes don't come and go as CPUs
+	 * come and go.  slab_mutex is sufficient
+	 * protection here.
+	 */
+	cachep->node[node] = n;
+
+	return 0;
+}
+
 /*
  * Allocates and initializes node for a node on each slab cache, used for
  * either memory or cpu hotplug.  If memory is being hot-added, the kmem_cache_node
@@ -852,39 +886,15 @@ static inline gfp_t gfp_exact_node(gfp_t flags)
  */
 static int init_cache_node_node(int node)
 {
+	int ret;
 	struct kmem_cache *cachep;
-	struct kmem_cache_node *n;
-	const size_t memsize = sizeof(struct kmem_cache_node);
 
 	list_for_each_entry(cachep, &slab_caches, list) {
-		/*
-		 * Set up the kmem_cache_node for cpu before we can
-		 * begin anything. Make sure some other cpu on this
-		 * node has not already allocated this
-		 */
-		n = get_node(cachep, node);
-		if (!n) {
-			n = kmalloc_node(memsize, GFP_KERNEL, node);
-			if (!n)
-				return -ENOMEM;
-			kmem_cache_node_init(n);
-			n->next_reap = jiffies + REAPTIMEOUT_NODE +
-			    ((unsigned long)cachep) % REAPTIMEOUT_NODE;
-
-			/*
-			 * The kmem_cache_nodes don't come and go as CPUs
-			 * come and go.  slab_mutex is sufficient
-			 * protection here.
-			 */
-			cachep->node[node] = n;
-		}
-
-		spin_lock_irq(&n->list_lock);
-		n->free_limit =
-			(1 + nr_cpus_node(node)) *
-			cachep->batchcount + cachep->num;
-		spin_unlock_irq(&n->list_lock);
+		ret = init_cache_node(cachep, node, GFP_KERNEL);
+		if (ret)
+			return ret;
 	}
+
 	return 0;
 }
 
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 05/11] mm/slab: clean-up kmem_cache_node setup
  2016-03-28  5:26 mm/slab: reduce lock contention in alloc path js1304
                   ` (3 preceding siblings ...)
  2016-03-28  5:26 ` [PATCH 04/11] mm/slab: factor out kmem_cache_node initialization code js1304
@ 2016-03-28  5:26 ` js1304
  2016-03-29  0:58   ` Christoph Lameter
  2016-03-28  5:26 ` [PATCH 06/11] mm/slab: don't keep free slabs if free_objects exceeds free_limit js1304
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 29+ messages in thread
From: js1304 @ 2016-03-28  5:26 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Lameter, Pekka Enberg, David Rientjes,
	Jesper Dangaard Brouer, linux-mm, linux-kernel, Joonsoo Kim

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

cpuup_prepare() and alloc_kmem_cache_node() contain mostly the same
code for setting up a kmem_cache_node. Factor it out and clean it up.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 mm/slab.c | 167 +++++++++++++++++++++++++-------------------------------------
 1 file changed, 67 insertions(+), 100 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index 569d7db..b96f381 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -898,6 +898,62 @@ static int init_cache_node_node(int node)
 	return 0;
 }
 
+static int setup_kmem_cache_node(struct kmem_cache *cachep,
+				int node, gfp_t gfp, bool force_change)
+{
+	int ret = -ENOMEM;
+	struct kmem_cache_node *n;
+	struct array_cache *old_shared = NULL;
+	struct array_cache *new_shared = NULL;
+	struct alien_cache **new_alien = NULL;
+	LIST_HEAD(list);
+
+	if (use_alien_caches) {
+		new_alien = alloc_alien_cache(node, cachep->limit, gfp);
+		if (!new_alien)
+			goto fail;
+	}
+
+	if (cachep->shared) {
+		new_shared = alloc_arraycache(node,
+			cachep->shared * cachep->batchcount, 0xbaadf00d, gfp);
+		if (!new_shared)
+			goto fail;
+	}
+
+	ret = init_cache_node(cachep, node, gfp);
+	if (ret)
+		goto fail;
+
+	n = get_node(cachep, node);
+	spin_lock_irq(&n->list_lock);
+	if (n->shared) {
+		free_block(cachep, n->shared->entry,
+				n->shared->avail, node, &list);
+	}
+
+	if (!n->shared || force_change) {
+		old_shared = n->shared;
+		n->shared = new_shared;
+		new_shared = NULL;
+	}
+
+	if (!n->alien) {
+		n->alien = new_alien;
+		new_alien = NULL;
+	}
+
+	spin_unlock_irq(&n->list_lock);
+	slabs_destroy(cachep, &list);
+
+fail:
+	kfree(old_shared);
+	kfree(new_shared);
+	free_alien_cache(new_alien);
+
+	return ret;
+}
+
 static void cpuup_canceled(long cpu)
 {
 	struct kmem_cache *cachep;
@@ -969,7 +1025,6 @@ free_slab:
 static int cpuup_prepare(long cpu)
 {
 	struct kmem_cache *cachep;
-	struct kmem_cache_node *n = NULL;
 	int node = cpu_to_mem(cpu);
 	int err;
 
@@ -988,44 +1043,9 @@ static int cpuup_prepare(long cpu)
 	 * array caches
 	 */
 	list_for_each_entry(cachep, &slab_caches, list) {
-		struct array_cache *shared = NULL;
-		struct alien_cache **alien = NULL;
-
-		if (cachep->shared) {
-			shared = alloc_arraycache(node,
-				cachep->shared * cachep->batchcount,
-				0xbaadf00d, GFP_KERNEL);
-			if (!shared)
-				goto bad;
-		}
-		if (use_alien_caches) {
-			alien = alloc_alien_cache(node, cachep->limit, GFP_KERNEL);
-			if (!alien) {
-				kfree(shared);
-				goto bad;
-			}
-		}
-		n = get_node(cachep, node);
-		BUG_ON(!n);
-
-		spin_lock_irq(&n->list_lock);
-		if (!n->shared) {
-			/*
-			 * We are serialised from CPU_DEAD or
-			 * CPU_UP_CANCELLED by the cpucontrol lock
-			 */
-			n->shared = shared;
-			shared = NULL;
-		}
-#ifdef CONFIG_NUMA
-		if (!n->alien) {
-			n->alien = alien;
-			alien = NULL;
-		}
-#endif
-		spin_unlock_irq(&n->list_lock);
-		kfree(shared);
-		free_alien_cache(alien);
+		err = setup_kmem_cache_node(cachep, node, GFP_KERNEL, false);
+		if (err)
+			goto bad;
 	}
 
 	return 0;
@@ -3652,72 +3672,19 @@ EXPORT_SYMBOL(kfree);
 /*
  * This initializes kmem_cache_node or resizes various caches for all nodes.
  */
-static int alloc_kmem_cache_node(struct kmem_cache *cachep, gfp_t gfp)
+static int setup_kmem_cache_node_node(struct kmem_cache *cachep, gfp_t gfp)
 {
+	int ret;
 	int node;
 	struct kmem_cache_node *n;
-	struct array_cache *new_shared;
-	struct alien_cache **new_alien = NULL;
 
 	for_each_online_node(node) {
-
-		if (use_alien_caches) {
-			new_alien = alloc_alien_cache(node, cachep->limit, gfp);
-			if (!new_alien)
-				goto fail;
-		}
-
-		new_shared = NULL;
-		if (cachep->shared) {
-			new_shared = alloc_arraycache(node,
-				cachep->shared*cachep->batchcount,
-					0xbaadf00d, gfp);
-			if (!new_shared) {
-				free_alien_cache(new_alien);
-				goto fail;
-			}
-		}
-
-		n = get_node(cachep, node);
-		if (n) {
-			struct array_cache *shared = n->shared;
-			LIST_HEAD(list);
-
-			spin_lock_irq(&n->list_lock);
-
-			if (shared)
-				free_block(cachep, shared->entry,
-						shared->avail, node, &list);
-
-			n->shared = new_shared;
-			if (!n->alien) {
-				n->alien = new_alien;
-				new_alien = NULL;
-			}
-			n->free_limit = (1 + nr_cpus_node(node)) *
-					cachep->batchcount + cachep->num;
-			spin_unlock_irq(&n->list_lock);
-			slabs_destroy(cachep, &list);
-			kfree(shared);
-			free_alien_cache(new_alien);
-			continue;
-		}
-		n = kmalloc_node(sizeof(struct kmem_cache_node), gfp, node);
-		if (!n) {
-			free_alien_cache(new_alien);
-			kfree(new_shared);
+		ret = setup_kmem_cache_node(cachep, node, gfp, true);
+		if (ret)
 			goto fail;
-		}
 
-		kmem_cache_node_init(n);
-		n->next_reap = jiffies + REAPTIMEOUT_NODE +
-				((unsigned long)cachep) % REAPTIMEOUT_NODE;
-		n->shared = new_shared;
-		n->alien = new_alien;
-		n->free_limit = (1 + nr_cpus_node(node)) *
-					cachep->batchcount + cachep->num;
-		cachep->node[node] = n;
 	}
+
 	return 0;
 
 fail:
@@ -3759,7 +3726,7 @@ static int __do_tune_cpucache(struct kmem_cache *cachep, int limit,
 	cachep->shared = shared;
 
 	if (!prev)
-		goto alloc_node;
+		goto setup_node;
 
 	for_each_online_cpu(cpu) {
 		LIST_HEAD(list);
@@ -3776,8 +3743,8 @@ static int __do_tune_cpucache(struct kmem_cache *cachep, int limit,
 	}
 	free_percpu(prev);
 
-alloc_node:
-	return alloc_kmem_cache_node(cachep, gfp);
+setup_node:
+	return setup_kmem_cache_node_node(cachep, gfp);
 }
 
 static int do_tune_cpucache(struct kmem_cache *cachep, int limit,
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 06/11] mm/slab: don't keep free slabs if free_objects exceeds free_limit
  2016-03-28  5:26 mm/slab: reduce lock contention in alloc path js1304
                   ` (4 preceding siblings ...)
  2016-03-28  5:26 ` [PATCH 05/11] mm/slab: clean-up kmem_cache_node setup js1304
@ 2016-03-28  5:26 ` js1304
  2016-03-29  1:03   ` Christoph Lameter
  2016-03-28  5:26 ` [PATCH 07/11] mm/slab: racy access/modify the slab color js1304
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 29+ messages in thread
From: js1304 @ 2016-03-28  5:26 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Lameter, Pekka Enberg, David Rientjes,
	Jesper Dangaard Brouer, linux-mm, linux-kernel, Joonsoo Kim

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Currently, the decision to free a slab is made each time a free object
is put back into a slab. The problem is that completely free slabs are
not freed, even when free_objects exceeds free_limit, if the slab being
processed at that moment isn't itself free. This can keep too much
memory in the slab subsystem. This patch tries to fix it by checking
the number of free objects once, after all the free work is done. If
there are completely free slabs at that point and we are over the
limit, we can release them, keeping the number of free slabs as small
as possible.
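
A minimal, self-contained sketch of the new policy (types and names
invented for illustration): instead of deciding per freed object,
decide once after the whole batch whether completely free slabs should
be released:

/* Invented stand-in for the per-node book-keeping in mm/slab.c. */
struct node_counts {
	int free_objects;	/* objects sitting in free/partial slabs */
	int free_limit;		/* how many free objects we tolerate */
	int nr_free_slabs;	/* completely free slabs on the list */
};

/*
 * Called once after a whole batch of objects has been freed; returns
 * how many completely free slabs should go back to the page allocator.
 */
static int slabs_to_release(struct node_counts *n, int objs_per_slab)
{
	int nr = 0;

	while (n->free_objects > n->free_limit && n->nr_free_slabs > 0) {
		n->free_objects -= objs_per_slab;
		n->nr_free_slabs--;
		nr++;
	}
	return nr;
}

The hunk below implements this directly inside free_block(), using the
real n->free_objects / n->free_limit counters and the slabs_free list.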

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 mm/slab.c | 23 ++++++++++++++---------
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index b96f381..df11757 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -3258,6 +3258,9 @@ static void free_block(struct kmem_cache *cachep, void **objpp,
 {
 	int i;
 	struct kmem_cache_node *n = get_node(cachep, node);
+	struct page *page;
+
+	n->free_objects += nr_objects;
 
 	for (i = 0; i < nr_objects; i++) {
 		void *objp;
@@ -3270,17 +3273,11 @@ static void free_block(struct kmem_cache *cachep, void **objpp,
 		check_spinlock_acquired_node(cachep, node);
 		slab_put_obj(cachep, page, objp);
 		STATS_DEC_ACTIVE(cachep);
-		n->free_objects++;
 
 		/* fixup slab chains */
-		if (page->active == 0) {
-			if (n->free_objects > n->free_limit) {
-				n->free_objects -= cachep->num;
-				list_add_tail(&page->lru, list);
-			} else {
-				list_add(&page->lru, &n->slabs_free);
-			}
-		} else {
+		if (page->active == 0)
+			list_add(&page->lru, &n->slabs_free);
+		else {
 			/* Unconditionally move a slab to the end of the
 			 * partial list on free - maximum time for the
 			 * other objects to be freed, too.
@@ -3288,6 +3285,14 @@ static void free_block(struct kmem_cache *cachep, void **objpp,
 			list_add_tail(&page->lru, &n->slabs_partial);
 		}
 	}
+
+	while (n->free_objects > n->free_limit && !list_empty(&n->slabs_free)) {
+		n->free_objects -= cachep->num;
+
+		page = list_last_entry(&n->slabs_free, struct page, lru);
+		list_del(&page->lru);
+		list_add(&page->lru, list);
+	}
 }
 
 static void cache_flusharray(struct kmem_cache *cachep, struct array_cache *ac)
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 07/11] mm/slab: racy access/modify the slab color
  2016-03-28  5:26 mm/slab: reduce lock contention in alloc path js1304
                   ` (5 preceding siblings ...)
  2016-03-28  5:26 ` [PATCH 06/11] mm/slab: don't keep free slabs if free_objects exceeds free_limit js1304
@ 2016-03-28  5:26 ` js1304
  2016-03-29  1:05   ` Christoph Lameter
  2016-03-28  5:26 ` [PATCH 08/11] mm/slab: make cache_grow() handle the page allocated on arbitrary node js1304
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 29+ messages in thread
From: js1304 @ 2016-03-28  5:26 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Lameter, Pekka Enberg, David Rientjes,
	Jesper Dangaard Brouer, linux-mm, linux-kernel, Joonsoo Kim

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

The slab colour doesn't need to be updated strictly; it only spreads
slabs across cache lines. Since taking the node lock just to advance
the colour causes more lock contention, this patch makes the colour
access/update racy. This is a preparation step towards a lockless
allocation path for when there are no free objects in the kmem_cache.

Below are the results of the concurrent allocation/free test in the
slab allocation benchmark Christoph made a long time ago. I simplified
the output. The numbers are cycle counts for alloc/free respectively,
so lower is better.

* Before
Kmalloc N*alloc N*free(32): Average=365/806
Kmalloc N*alloc N*free(64): Average=452/690
Kmalloc N*alloc N*free(128): Average=736/886
Kmalloc N*alloc N*free(256): Average=1167/985
Kmalloc N*alloc N*free(512): Average=2088/1125
Kmalloc N*alloc N*free(1024): Average=4115/1184
Kmalloc N*alloc N*free(2048): Average=8451/1748
Kmalloc N*alloc N*free(4096): Average=16024/2048

* After
Kmalloc N*alloc N*free(32): Average=355/750
Kmalloc N*alloc N*free(64): Average=452/812
Kmalloc N*alloc N*free(128): Average=559/1070
Kmalloc N*alloc N*free(256): Average=1176/980
Kmalloc N*alloc N*free(512): Average=1939/1189
Kmalloc N*alloc N*free(1024): Average=3521/1278
Kmalloc N*alloc N*free(2048): Average=7152/1838
Kmalloc N*alloc N*free(4096): Average=13438/2013

It shows that contention is reduced for object sizes >= 1024 bytes
and performance increases by roughly 15%.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 mm/slab.c | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index df11757..52fc5e3 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -2536,20 +2536,7 @@ static int cache_grow(struct kmem_cache *cachep,
 	}
 	local_flags = flags & (GFP_CONSTRAINT_MASK|GFP_RECLAIM_MASK);
 
-	/* Take the node list lock to change the colour_next on this node */
 	check_irq_off();
-	n = get_node(cachep, nodeid);
-	spin_lock(&n->list_lock);
-
-	/* Get colour for the slab, and cal the next value. */
-	offset = n->colour_next;
-	n->colour_next++;
-	if (n->colour_next >= cachep->colour)
-		n->colour_next = 0;
-	spin_unlock(&n->list_lock);
-
-	offset *= cachep->colour_off;
-
 	if (gfpflags_allow_blocking(local_flags))
 		local_irq_enable();
 
@@ -2570,6 +2557,19 @@ static int cache_grow(struct kmem_cache *cachep,
 	if (!page)
 		goto failed;
 
+	n = get_node(cachep, nodeid);
+
+	/* Get colour for the slab, and cal the next value. */
+	n->colour_next++;
+	if (n->colour_next >= cachep->colour)
+		n->colour_next = 0;
+
+	offset = n->colour_next;
+	if (offset >= cachep->colour)
+		offset = 0;
+
+	offset *= cachep->colour_off;
+
 	/* Get slab management. */
 	freelist = alloc_slabmgmt(cachep, page, offset,
 			local_flags & ~GFP_CONSTRAINT_MASK, nodeid);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 08/11] mm/slab: make cache_grow() handle the page allocated on arbitrary node
  2016-03-28  5:26 mm/slab: reduce lock contention in alloc path js1304
                   ` (6 preceding siblings ...)
  2016-03-28  5:26 ` [PATCH 07/11] mm/slab: racy access/modify the slab color js1304
@ 2016-03-28  5:26 ` js1304
  2016-03-28  5:26 ` [PATCH 09/11] mm/slab: separate cache_grow() to two parts js1304
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 29+ messages in thread
From: js1304 @ 2016-03-28  5:26 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Lameter, Pekka Enberg, David Rientjes,
	Jesper Dangaard Brouer, linux-mm, linux-kernel, Joonsoo Kim

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Currently, cache_grow() assumes that the allocated page's node is the
same as the nodeid parameter used for the allocation request. If we
drop this assumption, we can handle the fallback_alloc() case
gracefully. So this patch makes cache_grow() handle a page allocated
on an arbitrary node and cleans up the relevant code.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 mm/slab.c | 60 +++++++++++++++++++++---------------------------------------
 1 file changed, 21 insertions(+), 39 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index 52fc5e3..ce8ed65 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -2518,13 +2518,14 @@ static void slab_map_pages(struct kmem_cache *cache, struct page *page,
  * Grow (by 1) the number of slabs within a cache.  This is called by
  * kmem_cache_alloc() when there are no active objs left in a cache.
  */
-static int cache_grow(struct kmem_cache *cachep,
-		gfp_t flags, int nodeid, struct page *page)
+static int cache_grow(struct kmem_cache *cachep, gfp_t flags, int nodeid)
 {
 	void *freelist;
 	size_t offset;
 	gfp_t local_flags;
+	int page_node;
 	struct kmem_cache_node *n;
+	struct page *page;
 
 	/*
 	 * Be lazy and only check for valid flags here,  keeping it out of the
@@ -2552,12 +2553,12 @@ static int cache_grow(struct kmem_cache *cachep,
 	 * Get mem for the objs.  Attempt to allocate a physical page from
 	 * 'nodeid'.
 	 */
-	if (!page)
-		page = kmem_getpages(cachep, local_flags, nodeid);
+	page = kmem_getpages(cachep, local_flags, nodeid);
 	if (!page)
 		goto failed;
 
-	n = get_node(cachep, nodeid);
+	page_node = page_to_nid(page);
+	n = get_node(cachep, page_node);
 
 	/* Get colour for the slab, and cal the next value. */
 	n->colour_next++;
@@ -2572,7 +2573,7 @@ static int cache_grow(struct kmem_cache *cachep,
 
 	/* Get slab management. */
 	freelist = alloc_slabmgmt(cachep, page, offset,
-			local_flags & ~GFP_CONSTRAINT_MASK, nodeid);
+			local_flags & ~GFP_CONSTRAINT_MASK, page_node);
 	if (OFF_SLAB(cachep) && !freelist)
 		goto opps1;
 
@@ -2591,13 +2592,13 @@ static int cache_grow(struct kmem_cache *cachep,
 	STATS_INC_GROWN(cachep);
 	n->free_objects += cachep->num;
 	spin_unlock(&n->list_lock);
-	return 1;
+	return page_node;
 opps1:
 	kmem_freepages(cachep, page);
 failed:
 	if (gfpflags_allow_blocking(local_flags))
 		local_irq_disable();
-	return 0;
+	return -1;
 }
 
 #if DEBUG
@@ -2878,14 +2879,14 @@ alloc_done:
 				return obj;
 		}
 
-		x = cache_grow(cachep, gfp_exact_node(flags), node, NULL);
+		x = cache_grow(cachep, gfp_exact_node(flags), node);
 
 		/* cache_grow can reenable interrupts, then ac could change. */
 		ac = cpu_cache_get(cachep);
 		node = numa_mem_id();
 
 		/* no objects in sight? abort */
-		if (!x && ac->avail == 0)
+		if (x < 0 && ac->avail == 0)
 			return NULL;
 
 		if (!ac->avail)		/* objects refilled by interrupt? */
@@ -3014,7 +3015,6 @@ static void *alternate_node_alloc(struct kmem_cache *cachep, gfp_t flags)
 static void *fallback_alloc(struct kmem_cache *cache, gfp_t flags)
 {
 	struct zonelist *zonelist;
-	gfp_t local_flags;
 	struct zoneref *z;
 	struct zone *zone;
 	enum zone_type high_zoneidx = gfp_zone(flags);
@@ -3025,8 +3025,6 @@ static void *fallback_alloc(struct kmem_cache *cache, gfp_t flags)
 	if (flags & __GFP_THISNODE)
 		return NULL;
 
-	local_flags = flags & (GFP_CONSTRAINT_MASK|GFP_RECLAIM_MASK);
-
 retry_cpuset:
 	cpuset_mems_cookie = read_mems_allowed_begin();
 	zonelist = node_zonelist(mempolicy_slab_node(), flags);
@@ -3056,33 +3054,17 @@ retry:
 		 * We may trigger various forms of reclaim on the allowed
 		 * set and go into memory reserves if necessary.
 		 */
-		struct page *page;
+		nid = cache_grow(cache, flags, numa_mem_id());
+		if (nid >= 0) {
+			obj = ____cache_alloc_node(cache,
+				gfp_exact_node(flags), nid);
 
-		if (gfpflags_allow_blocking(local_flags))
-			local_irq_enable();
-		kmem_flagcheck(cache, flags);
-		page = kmem_getpages(cache, local_flags, numa_mem_id());
-		if (gfpflags_allow_blocking(local_flags))
-			local_irq_disable();
-		if (page) {
 			/*
-			 * Insert into the appropriate per node queues
+			 * Another processor may allocate the objects in
+			 * the slab since we are not holding any locks.
 			 */
-			nid = page_to_nid(page);
-			if (cache_grow(cache, flags, nid, page)) {
-				obj = ____cache_alloc_node(cache,
-					gfp_exact_node(flags), nid);
-				if (!obj)
-					/*
-					 * Another processor may allocate the
-					 * objects in the slab since we are
-					 * not holding any locks.
-					 */
-					goto retry;
-			} else {
-				/* cache_grow already freed obj */
-				obj = NULL;
-			}
+			if (!obj)
+				goto retry;
 		}
 	}
 
@@ -3133,8 +3115,8 @@ retry:
 
 must_grow:
 	spin_unlock(&n->list_lock);
-	x = cache_grow(cachep, gfp_exact_node(flags), nodeid, NULL);
-	if (x)
+	x = cache_grow(cachep, gfp_exact_node(flags), nodeid);
+	if (x >= 0)
 		goto retry;
 
 	return fallback_alloc(cachep, flags);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 09/11] mm/slab: separate cache_grow() to two parts
  2016-03-28  5:26 mm/slab: reduce lock contention in alloc path js1304
                   ` (7 preceding siblings ...)
  2016-03-28  5:26 ` [PATCH 08/11] mm/slab: make cache_grow() handle the page allocated on arbitrary node js1304
@ 2016-03-28  5:26 ` js1304
  2016-03-28  5:27 ` [PATCH 10/11] mm/slab: refill cpu cache through a new slab without holding a node lock js1304
  2016-03-28  5:27 ` [PATCH 11/11] mm/slab: lockless decision to grow cache js1304
  10 siblings, 0 replies; 29+ messages in thread
From: js1304 @ 2016-03-28  5:26 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Lameter, Pekka Enberg, David Rientjes,
	Jesper Dangaard Brouer, linux-mm, linux-kernel, Joonsoo Kim

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

This is a preparation step for a lockless allocation path when there
are no free objects in the kmem_cache. What we'd like to do here is
refill the cpu cache without holding the node lock. To accomplish
that, the refill must happen after the new slab is allocated but
before the slab is attached to the management list. So this patch
separates cache_grow() into two parts, allocation and attaching to
the list, in order to add some code in between them in the following
patch, as sketched below.
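
Schematically, the code the following patch slots in between the two
halves looks like this (an outline built from names this series
introduces; error handling and surrounding context omitted):

	/* ac and batchcount are cache_alloc_refill()'s local state */
	page = cache_grow_begin(cachep, gfp_exact_node(flags), node);
	if (page) {
		/*
		 * The new slab is not on any node list yet, so only this
		 * task can see it: the cpu cache can be refilled from it
		 * without taking the node lock.
		 */
		alloc_block(cachep, ac, page, batchcount);
	}
	cache_grow_end(cachep, page);	/* attach under a short node lock */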

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 mm/slab.c | 74 ++++++++++++++++++++++++++++++++++++++++++++-------------------
 1 file changed, 52 insertions(+), 22 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index ce8ed65..401e60c 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -213,6 +213,11 @@ static void slabs_destroy(struct kmem_cache *cachep, struct list_head *list);
 static int enable_cpucache(struct kmem_cache *cachep, gfp_t gfp);
 static void cache_reap(struct work_struct *unused);
 
+static inline void fixup_objfreelist_debug(struct kmem_cache *cachep,
+						void **list);
+static inline void fixup_slab_list(struct kmem_cache *cachep,
+				struct kmem_cache_node *n, struct page *page,
+				void **list);
 static int slab_early_init = 1;
 
 #define INDEX_NODE kmalloc_index(sizeof(struct kmem_cache_node))
@@ -1796,7 +1801,7 @@ static size_t calculate_slab_order(struct kmem_cache *cachep,
 
 			/*
 			 * Needed to avoid possible looping condition
-			 * in cache_grow()
+			 * in cache_grow_begin()
 			 */
 			if (OFF_SLAB(freelist_cache))
 				continue;
@@ -2518,7 +2523,8 @@ static void slab_map_pages(struct kmem_cache *cache, struct page *page,
  * Grow (by 1) the number of slabs within a cache.  This is called by
  * kmem_cache_alloc() when there are no active objs left in a cache.
  */
-static int cache_grow(struct kmem_cache *cachep, gfp_t flags, int nodeid)
+static struct page *cache_grow_begin(struct kmem_cache *cachep,
+				gfp_t flags, int nodeid)
 {
 	void *freelist;
 	size_t offset;
@@ -2584,21 +2590,40 @@ static int cache_grow(struct kmem_cache *cachep, gfp_t flags, int nodeid)
 
 	if (gfpflags_allow_blocking(local_flags))
 		local_irq_disable();
-	check_irq_off();
-	spin_lock(&n->list_lock);
 
-	/* Make slab active. */
-	list_add_tail(&page->lru, &(n->slabs_free));
-	STATS_INC_GROWN(cachep);
-	n->free_objects += cachep->num;
-	spin_unlock(&n->list_lock);
-	return page_node;
+	return page;
+
 opps1:
 	kmem_freepages(cachep, page);
 failed:
 	if (gfpflags_allow_blocking(local_flags))
 		local_irq_disable();
-	return -1;
+	return NULL;
+}
+
+static void cache_grow_end(struct kmem_cache *cachep, struct page *page)
+{
+	struct kmem_cache_node *n;
+	void *list = NULL;
+
+	check_irq_off();
+
+	if (!page)
+		return;
+
+	INIT_LIST_HEAD(&page->lru);
+	n = get_node(cachep, page_to_nid(page));
+
+	spin_lock(&n->list_lock);
+	if (!page->active)
+		list_add_tail(&page->lru, &(n->slabs_free));
+	else
+		fixup_slab_list(cachep, n, page, &list);
+	STATS_INC_GROWN(cachep);
+	n->free_objects += cachep->num - page->active;
+	spin_unlock(&n->list_lock);
+
+	fixup_objfreelist_debug(cachep, &list);
 }
 
 #if DEBUG
@@ -2809,6 +2834,7 @@ static void *cache_alloc_refill(struct kmem_cache *cachep, gfp_t flags)
 	struct array_cache *ac;
 	int node;
 	void *list = NULL;
+	struct page *page;
 
 	check_irq_off();
 	node = numa_mem_id();
@@ -2836,7 +2862,6 @@ retry:
 	}
 
 	while (batchcount > 0) {
-		struct page *page;
 		/* Get slab alloc is to come from. */
 		page = get_first_slab(n, false);
 		if (!page)
@@ -2869,8 +2894,6 @@ alloc_done:
 	fixup_objfreelist_debug(cachep, &list);
 
 	if (unlikely(!ac->avail)) {
-		int x;
-
 		/* Check if we can use obj in pfmemalloc slab */
 		if (sk_memalloc_socks()) {
 			void *obj = cache_alloc_pfmemalloc(cachep, n, flags);
@@ -2879,14 +2902,18 @@ alloc_done:
 				return obj;
 		}
 
-		x = cache_grow(cachep, gfp_exact_node(flags), node);
+		page = cache_grow_begin(cachep, gfp_exact_node(flags), node);
+		cache_grow_end(cachep, page);
 
-		/* cache_grow can reenable interrupts, then ac could change. */
+		/*
+		 * cache_grow_begin() can reenable interrupts,
+		 * then ac could change.
+		 */
 		ac = cpu_cache_get(cachep);
 		node = numa_mem_id();
 
 		/* no objects in sight? abort */
-		if (x < 0 && ac->avail == 0)
+		if (!page && ac->avail == 0)
 			return NULL;
 
 		if (!ac->avail)		/* objects refilled by interrupt? */
@@ -3019,6 +3046,7 @@ static void *fallback_alloc(struct kmem_cache *cache, gfp_t flags)
 	struct zone *zone;
 	enum zone_type high_zoneidx = gfp_zone(flags);
 	void *obj = NULL;
+	struct page *page;
 	int nid;
 	unsigned int cpuset_mems_cookie;
 
@@ -3054,8 +3082,10 @@ retry:
 		 * We may trigger various forms of reclaim on the allowed
 		 * set and go into memory reserves if necessary.
 		 */
-		nid = cache_grow(cache, flags, numa_mem_id());
-		if (nid >= 0) {
+		page = cache_grow_begin(cache, flags, numa_mem_id());
+		cache_grow_end(cache, page);
+		if (page) {
+			nid = page_to_nid(page);
 			obj = ____cache_alloc_node(cache,
 				gfp_exact_node(flags), nid);
 
@@ -3083,7 +3113,6 @@ static void *____cache_alloc_node(struct kmem_cache *cachep, gfp_t flags,
 	struct kmem_cache_node *n;
 	void *obj;
 	void *list = NULL;
-	int x;
 
 	VM_BUG_ON(nodeid < 0 || nodeid >= MAX_NUMNODES);
 	n = get_node(cachep, nodeid);
@@ -3115,8 +3144,9 @@ retry:
 
 must_grow:
 	spin_unlock(&n->list_lock);
-	x = cache_grow(cachep, gfp_exact_node(flags), nodeid);
-	if (x >= 0)
+	page = cache_grow_begin(cachep, gfp_exact_node(flags), nodeid);
+	cache_grow_end(cachep, page);
+	if (page)
 		goto retry;
 
 	return fallback_alloc(cachep, flags);
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 10/11] mm/slab: refill cpu cache through a new slab without holding a node lock
  2016-03-28  5:26 mm/slab: reduce lock contention in alloc path js1304
                   ` (8 preceding siblings ...)
  2016-03-28  5:26 ` [PATCH 09/11] mm/slab: separate cache_grow() to two parts js1304
@ 2016-03-28  5:27 ` js1304
  2016-03-28  5:27 ` [PATCH 11/11] mm/slab: lockless decision to grow cache js1304
  10 siblings, 0 replies; 29+ messages in thread
From: js1304 @ 2016-03-28  5:27 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Lameter, Pekka Enberg, David Rientjes,
	Jesper Dangaard Brouer, linux-mm, linux-kernel, Joonsoo Kim

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Until now, growing the cache puts a free slab on the node's slab list,
and only then can we allocate objects from it. That necessarily
requires holding the node lock, which is heavily contended. If we
refill the cpu cache from the new slab before attaching it to the
node's slab list, we can avoid holding the node lock almost entirely,
because the newly allocated slab is visible only to the current task.
This reduces lock contention.

Below are the results of the concurrent allocation/free test in the
slab allocation benchmark Christoph made a long time ago. I simplified
the output. The numbers are cycle counts for alloc/free respectively,
so lower is better.

* Before
Kmalloc N*alloc N*free(32): Average=355/750
Kmalloc N*alloc N*free(64): Average=452/812
Kmalloc N*alloc N*free(128): Average=559/1070
Kmalloc N*alloc N*free(256): Average=1176/980
Kmalloc N*alloc N*free(512): Average=1939/1189
Kmalloc N*alloc N*free(1024): Average=3521/1278
Kmalloc N*alloc N*free(2048): Average=7152/1838
Kmalloc N*alloc N*free(4096): Average=13438/2013

* After
Kmalloc N*alloc N*free(32): Average=248/966
Kmalloc N*alloc N*free(64): Average=261/949
Kmalloc N*alloc N*free(128): Average=314/1016
Kmalloc N*alloc N*free(256): Average=741/1061
Kmalloc N*alloc N*free(512): Average=1246/1152
Kmalloc N*alloc N*free(1024): Average=2437/1259
Kmalloc N*alloc N*free(2048): Average=4980/1800
Kmalloc N*alloc N*free(4096): Average=9000/2078

It shows that contention is reduced for all object sizes and
performance increases by 30 ~ 40%.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 mm/slab.c | 68 +++++++++++++++++++++++++++++++++------------------------------
 1 file changed, 36 insertions(+), 32 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index 401e60c..029d6b3 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -2827,6 +2827,30 @@ static noinline void *cache_alloc_pfmemalloc(struct kmem_cache *cachep,
 	return obj;
 }
 
+/*
+ * Slab list should be fixed up by fixup_slab_list() for existing slab
+ * or cache_grow_end() for new slab
+ */
+static __always_inline int alloc_block(struct kmem_cache *cachep,
+		struct array_cache *ac, struct page *page, int batchcount)
+{
+	/*
+	 * There must be at least one object available for
+	 * allocation.
+	 */
+	BUG_ON(page->active >= cachep->num);
+
+	while (page->active < cachep->num && batchcount--) {
+		STATS_INC_ALLOCED(cachep);
+		STATS_INC_ACTIVE(cachep);
+		STATS_SET_HIGH(cachep);
+
+		ac->entry[ac->avail++] = slab_get_obj(cachep, page);
+	}
+
+	return batchcount;
+}
+
 static void *cache_alloc_refill(struct kmem_cache *cachep, gfp_t flags)
 {
 	int batchcount;
@@ -2839,7 +2863,6 @@ static void *cache_alloc_refill(struct kmem_cache *cachep, gfp_t flags)
 	check_irq_off();
 	node = numa_mem_id();
 
-retry:
 	ac = cpu_cache_get(cachep);
 	batchcount = ac->batchcount;
 	if (!ac->touched && batchcount > BATCHREFILL_LIMIT) {
@@ -2869,21 +2892,7 @@ retry:
 
 		check_spinlock_acquired(cachep);
 
-		/*
-		 * The slab was either on partial or free list so
-		 * there must be at least one object available for
-		 * allocation.
-		 */
-		BUG_ON(page->active >= cachep->num);
-
-		while (page->active < cachep->num && batchcount--) {
-			STATS_INC_ALLOCED(cachep);
-			STATS_INC_ACTIVE(cachep);
-			STATS_SET_HIGH(cachep);
-
-			ac->entry[ac->avail++] = slab_get_obj(cachep, page);
-		}
-
+		batchcount = alloc_block(cachep, ac, page, batchcount);
 		fixup_slab_list(cachep, n, page, &list);
 	}
 
@@ -2903,21 +2912,18 @@ alloc_done:
 		}
 
 		page = cache_grow_begin(cachep, gfp_exact_node(flags), node);
-		cache_grow_end(cachep, page);
 
 		/*
 		 * cache_grow_begin() can reenable interrupts,
 		 * then ac could change.
 		 */
 		ac = cpu_cache_get(cachep);
-		node = numa_mem_id();
+		if (!ac->avail && page)
+			alloc_block(cachep, ac, page, batchcount);
+		cache_grow_end(cachep, page);
 
-		/* no objects in sight? abort */
-		if (!page && ac->avail == 0)
+		if (!ac->avail)
 			return NULL;
-
-		if (!ac->avail)		/* objects refilled by interrupt? */
-			goto retry;
 	}
 	ac->touched = 1;
 
@@ -3111,14 +3117,13 @@ static void *____cache_alloc_node(struct kmem_cache *cachep, gfp_t flags,
 {
 	struct page *page;
 	struct kmem_cache_node *n;
-	void *obj;
+	void *obj = NULL;
 	void *list = NULL;
 
 	VM_BUG_ON(nodeid < 0 || nodeid >= MAX_NUMNODES);
 	n = get_node(cachep, nodeid);
 	BUG_ON(!n);
 
-retry:
 	check_irq_off();
 	spin_lock(&n->list_lock);
 	page = get_first_slab(n, false);
@@ -3140,19 +3145,18 @@ retry:
 
 	spin_unlock(&n->list_lock);
 	fixup_objfreelist_debug(cachep, &list);
-	goto done;
+	return obj;
 
 must_grow:
 	spin_unlock(&n->list_lock);
 	page = cache_grow_begin(cachep, gfp_exact_node(flags), nodeid);
+	if (page) {
+		/* This slab isn't counted yet so don't update free_objects */
+		obj = slab_get_obj(cachep, page);
+	}
 	cache_grow_end(cachep, page);
-	if (page)
-		goto retry;
 
-	return fallback_alloc(cachep, flags);
-
-done:
-	return obj;
+	return obj ? obj : fallback_alloc(cachep, flags);
 }
 
 static __always_inline void *
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 11/11] mm/slab: lockless decision to grow cache
  2016-03-28  5:26 mm/slab: reduce lock contention in alloc path js1304
                   ` (9 preceding siblings ...)
  2016-03-28  5:27 ` [PATCH 10/11] mm/slab: refill cpu cache through a new slab without holding a node lock js1304
@ 2016-03-28  5:27 ` js1304
  10 siblings, 0 replies; 29+ messages in thread
From: js1304 @ 2016-03-28  5:27 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Christoph Lameter, Pekka Enberg, David Rientjes,
	Jesper Dangaard Brouer, linux-mm, linux-kernel, Joonsoo Kim

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

To check precisely whether free objects exist, we need to grab a lock.
But accuracy isn't that important here: the race window is small, and
if too many free objects pile up, the cache reaper will reap them. So
this patch makes the check for free object existence lockless. This
reduces lock contention in the heavy-allocation case.

Note that, until now, n->shared could be freed underneath us by a
write to slabinfo, but with a small trick in this patch we can safely
access it within an interrupt-disabled section.
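
A sketch of why the unlocked access stays safe (illustrative timeline
based on the comment this patch adds to setup_kmem_cache_node()):

  allocator (irqs disabled)            tuner (slabinfo write, force_change)
  ---------------------------------    -------------------------------------
  shared = READ_ONCE(n->shared);       old_shared = n->shared;
  ... may dereference shared ...       n->shared = new_shared;
                                       kick_all_cpus_sync();
                                         -> waits until every cpu has taken
                                            an IPI, which cannot happen while
                                            our irq-disabled section runs
  local_irq_enable();                  kfree(old_shared);   /* only now */

So a shared array observed inside an irq-disabled section cannot be
freed before that section ends.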

Below are the results of the concurrent allocation/free test in the
slab allocation benchmark Christoph made a long time ago. I simplified
the output. The numbers are cycle counts for alloc/free respectively,
so lower is better.

* Before
Kmalloc N*alloc N*free(32): Average=248/966
Kmalloc N*alloc N*free(64): Average=261/949
Kmalloc N*alloc N*free(128): Average=314/1016
Kmalloc N*alloc N*free(256): Average=741/1061
Kmalloc N*alloc N*free(512): Average=1246/1152
Kmalloc N*alloc N*free(1024): Average=2437/1259
Kmalloc N*alloc N*free(2048): Average=4980/1800
Kmalloc N*alloc N*free(4096): Average=9000/2078

* After
Kmalloc N*alloc N*free(32): Average=344/792
Kmalloc N*alloc N*free(64): Average=347/882
Kmalloc N*alloc N*free(128): Average=390/959
Kmalloc N*alloc N*free(256): Average=393/1067
Kmalloc N*alloc N*free(512): Average=683/1229
Kmalloc N*alloc N*free(1024): Average=1295/1325
Kmalloc N*alloc N*free(2048): Average=2513/1664
Kmalloc N*alloc N*free(4096): Average=4742/2172

It shows that allocation performance decreases for object sizes up to
128 bytes, which may be due to the extra checks in
cache_alloc_refill(). But considering the improvement in free
performance, the net result looks about the same. The results for the
other size classes look very promising: roughly a 50% performance
improvement.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 mm/slab.c | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index 029d6b3..b70aabf 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -951,6 +951,15 @@ static int setup_kmem_cache_node(struct kmem_cache *cachep,
 	spin_unlock_irq(&n->list_lock);
 	slabs_destroy(cachep, &list);
 
+	/*
+	 * To protect lockless access to n->shared during irq disabled context.
+	 * If n->shared isn't NULL in irq disabled context, accessing to it is
+	 * guaranteed to be valid until irq is re-enabled, because it will be
+	 * freed after kick_all_cpus_sync().
+	 */
+	if (force_change)
+		kick_all_cpus_sync();
+
 fail:
 	kfree(old_shared);
 	kfree(new_shared);
@@ -2855,7 +2864,7 @@ static void *cache_alloc_refill(struct kmem_cache *cachep, gfp_t flags)
 {
 	int batchcount;
 	struct kmem_cache_node *n;
-	struct array_cache *ac;
+	struct array_cache *ac, *shared;
 	int node;
 	void *list = NULL;
 	struct page *page;
@@ -2876,11 +2885,16 @@ static void *cache_alloc_refill(struct kmem_cache *cachep, gfp_t flags)
 	n = get_node(cachep, node);
 
 	BUG_ON(ac->avail > 0 || !n);
+	shared = READ_ONCE(n->shared);
+	if (!n->free_objects && (!shared || !shared->avail))
+		goto direct_grow;
+
 	spin_lock(&n->list_lock);
+	shared = READ_ONCE(n->shared);
 
 	/* See if we can refill from the shared array */
-	if (n->shared && transfer_objects(ac, n->shared, batchcount)) {
-		n->shared->touched = 1;
+	if (shared && transfer_objects(ac, shared, batchcount)) {
+		shared->touched = 1;
 		goto alloc_done;
 	}
 
@@ -2902,6 +2916,7 @@ alloc_done:
 	spin_unlock(&n->list_lock);
 	fixup_objfreelist_debug(cachep, &list);
 
+direct_grow:
 	if (unlikely(!ac->avail)) {
 		/* Check if we can use obj in pfmemalloc slab */
 		if (sk_memalloc_socks()) {
-- 
1.9.1

^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH 02/11] mm/slab: remove BAD_ALIEN_MAGIC again
  2016-03-28  5:26 ` [PATCH 02/11] mm/slab: remove BAD_ALIEN_MAGIC again js1304
@ 2016-03-28  8:58   ` Geert Uytterhoeven
  2016-03-30  8:11     ` Joonsoo Kim
  2016-03-28 21:19   ` Andrew Morton
  1 sibling, 1 reply; 29+ messages in thread
From: Geert Uytterhoeven @ 2016-03-28  8:58 UTC (permalink / raw)
  To: JoonSoo Kim
  Cc: Andrew Morton, Christoph Lameter, Pekka Enberg, David Rientjes,
	Jesper Dangaard Brouer, Linux MM, linux-kernel, Joonsoo Kim

Hi Joonsoo,

On Mon, Mar 28, 2016 at 7:26 AM,  <js1304@gmail.com> wrote:
> From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
>
> Initial attemp to remove BAD_ALIEN_MAGIC is once reverted by
> 'commit edcad2509550 ("Revert "slab: remove BAD_ALIEN_MAGIC"")'
> because it causes a problem on m68k which has many node
> but !CONFIG_NUMA. In this case, although alien cache isn't used
> at all but to cope with some initialization path, garbage value
> is used and that is BAD_ALIEN_MAGIC. Now, this patch set
> use_alien_caches to 0 when !CONFIG_NUMA, there is no initialization
> path problem so we don't need BAD_ALIEN_MAGIC at all. So remove it.
>
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>

I gave this a try on m68k/ARAnyM, and it didn't crash, unlike the previous
version that was reverted, so
Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 02/11] mm/slab: remove BAD_ALIEN_MAGIC again
  2016-03-28  5:26 ` [PATCH 02/11] mm/slab: remove BAD_ALIEN_MAGIC again js1304
  2016-03-28  8:58   ` Geert Uytterhoeven
@ 2016-03-28 21:19   ` Andrew Morton
  2016-03-29  0:53     ` Christoph Lameter
  1 sibling, 1 reply; 29+ messages in thread
From: Andrew Morton @ 2016-03-28 21:19 UTC (permalink / raw)
  To: js1304
  Cc: Christoph Lameter, Pekka Enberg, David Rientjes,
	Jesper Dangaard Brouer, linux-mm, linux-kernel, Joonsoo Kim

On Mon, 28 Mar 2016 14:26:52 +0900 js1304@gmail.com wrote:

> From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> 
> Initial attemp to remove BAD_ALIEN_MAGIC is once reverted by
> 'commit edcad2509550 ("Revert "slab: remove BAD_ALIEN_MAGIC"")'
> because it causes a problem on m68k which has many node
> but !CONFIG_NUMA.

Whaaa?  How is that even possible?  I'd have thought that everything
would break at compile time (at least) with such a setup.

> In this case, although alien cache isn't used
> at all but to cope with some initialization path, garbage value
> is used and that is BAD_ALIEN_MAGIC. Now, this patch set
> use_alien_caches to 0 when !CONFIG_NUMA, there is no initialization
> path problem so we don't need BAD_ALIEN_MAGIC at all. So remove it.
> 
> ...
>
> @@ -1205,7 +1203,7 @@ void __init kmem_cache_init(void)
>  					sizeof(struct rcu_head));
>  	kmem_cache = &kmem_cache_boot;
>  
> -	if (num_possible_nodes() == 1)
> +	if (!IS_ENABLED(CONFIG_NUMA) || num_possible_nodes() == 1)
>  		use_alien_caches = 0;
>  
>  	for (i = 0; i < NUM_INIT_LISTS; i++)

This does look screwy.  How can num_possible_nodes() possibly return
anything but "1" if CONFIG_NUMA=n.

Can we please get a code comment in here to explain things to the poor
old reader and to prevent people from trying to "fix" it?

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 01/11] mm/slab: hold a slab_mutex when calling __kmem_cache_shrink()
  2016-03-28  5:26 ` [PATCH 01/11] mm/slab: hold a slab_mutex when calling __kmem_cache_shrink() js1304
@ 2016-03-29  0:50   ` Christoph Lameter
  2016-03-30  8:11     ` Joonsoo Kim
  2016-03-31 10:53   ` Nikolay Borisov
  1 sibling, 1 reply; 29+ messages in thread
From: Christoph Lameter @ 2016-03-29  0:50 UTC (permalink / raw)
  To: js1304
  Cc: Andrew Morton, Pekka Enberg, David Rientjes,
	Jesper Dangaard Brouer, linux-mm, linux-kernel, Joonsoo Kim

On Mon, 28 Mar 2016, js1304@gmail.com wrote:

> Major kmem_cache metadata in slab subsystem is synchronized with
> the slab_mutex. In SLAB, if some of them is changed, node's shared
> array cache would be freed and re-populated. If __kmem_cache_shrink()
> is called at the same time, it will call drain_array() with n->shared
> without holding node lock so problem can happen.
>
> We can fix this small theoretical race condition by holding node lock
> in drain_array(), but, holding a slab_mutex in kmem_cache_shrink()
> looks more appropriate solution because stable state would make things
> less error-prone and this is not performance critical path.

Ummm.. The mutex taking is added to common code. So this will also affect
SLUB.  The patch needs to consider this. Do we want to force all
allocators to run shrinking only when holding the lock? SLUB does not
need to hold the mutex. And frankly the mutex is for reconfiguration of
metadata which is *not* occurring here. A shrink operation does not do
that. Can we figure out a slab specific way of handling synchronization
in the strange free/realloc cycle?

It seems that taking the node lock is the appropriate level of
synchronization since the concern is with the contents of a shared cache
at that level. There is no change of metadata which would require the
mutex.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 02/11] mm/slab: remove BAD_ALIEN_MAGIC again
  2016-03-28 21:19   ` Andrew Morton
@ 2016-03-29  0:53     ` Christoph Lameter
  0 siblings, 0 replies; 29+ messages in thread
From: Christoph Lameter @ 2016-03-29  0:53 UTC (permalink / raw)
  To: Andrew Morton
  Cc: js1304, Pekka Enberg, David Rientjes, Jesper Dangaard Brouer,
	linux-mm, linux-kernel, Joonsoo Kim

On Mon, 28 Mar 2016, Andrew Morton wrote:

> On Mon, 28 Mar 2016 14:26:52 +0900 js1304@gmail.com wrote:
>
> > From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> >
> > Initial attemp to remove BAD_ALIEN_MAGIC is once reverted by
> > 'commit edcad2509550 ("Revert "slab: remove BAD_ALIEN_MAGIC"")'
> > because it causes a problem on m68k which has many node
> > but !CONFIG_NUMA.
>
> Whaaa?  How is that even possible?  I'd have thought that everything
> would break at compile time (at least) with such a setup.

Yes we have that and the support for this caused numerous issues. Can we
stop supporting such a configuration?

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 03/11] mm/slab: drain the free slab as much as possible
  2016-03-28  5:26 ` [PATCH 03/11] mm/slab: drain the free slab as much as possible js1304
@ 2016-03-29  0:54   ` Christoph Lameter
  0 siblings, 0 replies; 29+ messages in thread
From: Christoph Lameter @ 2016-03-29  0:54 UTC (permalink / raw)
  To: js1304
  Cc: Andrew Morton, Pekka Enberg, David Rientjes,
	Jesper Dangaard Brouer, linux-mm, linux-kernel, Joonsoo Kim


Acked-by: Christoph Lameter <cl@linux.com>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 04/11] mm/slab: factor out kmem_cache_node initialization code
  2016-03-28  5:26 ` [PATCH 04/11] mm/slab: factor out kmem_cache_node initialization code js1304
@ 2016-03-29  0:56   ` Christoph Lameter
  2016-03-30  8:12     ` Joonsoo Kim
  0 siblings, 1 reply; 29+ messages in thread
From: Christoph Lameter @ 2016-03-29  0:56 UTC (permalink / raw)
  To: js1304
  Cc: Andrew Morton, Pekka Enberg, David Rientjes,
	Jesper Dangaard Brouer, linux-mm, linux-kernel, Joonsoo Kim

On Mon, 28 Mar 2016, js1304@gmail.com wrote:

> From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> -		spin_lock_irq(&n->list_lock);
> -		n->free_limit =
> -			(1 + nr_cpus_node(node)) *
> -			cachep->batchcount + cachep->num;
> -		spin_unlock_irq(&n->list_lock);
> +		ret = init_cache_node(cachep, node, GFP_KERNEL);
> +		if (ret)
> +			return ret;

Drop ret and do a

	return init_cache_node(...);

instead?
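
That is, assuming the call is the last real work in the surrounding
function (the quoted hunk does not show its tail), the hunk could shrink
to something like:

-	ret = init_cache_node(cachep, node, GFP_KERNEL);
-	if (ret)
-		return ret;
+	return init_cache_node(cachep, node, GFP_KERNEL);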

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 05/11] mm/slab: clean-up kmem_cache_node setup
  2016-03-28  5:26 ` [PATCH 05/11] mm/slab: clean-up kmem_cache_node setup js1304
@ 2016-03-29  0:58   ` Christoph Lameter
  2016-03-30  8:15     ` Joonsoo Kim
  0 siblings, 1 reply; 29+ messages in thread
From: Christoph Lameter @ 2016-03-29  0:58 UTC (permalink / raw)
  To: js1304
  Cc: Andrew Morton, Pekka Enberg, David Rientjes,
	Jesper Dangaard Brouer, linux-mm, linux-kernel, Joonsoo Kim

On Mon, 28 Mar 2016, js1304@gmail.com wrote:

>   * This initializes kmem_cache_node or resizes various caches for all nodes.
>   */
> -static int alloc_kmem_cache_node(struct kmem_cache *cachep, gfp_t gfp)
> +static int setup_kmem_cache_node_node(struct kmem_cache *cachep, gfp_t gfp)

... _node_node? Isn't there a better name for it?

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 06/11] mm/slab: don't keep free slabs if free_objects exceeds free_limit
  2016-03-28  5:26 ` [PATCH 06/11] mm/slab: don't keep free slabs if free_objects exceeds free_limit js1304
@ 2016-03-29  1:03   ` Christoph Lameter
  2016-03-30  8:25     ` Joonsoo Kim
  0 siblings, 1 reply; 29+ messages in thread
From: Christoph Lameter @ 2016-03-29  1:03 UTC (permalink / raw)
  To: js1304
  Cc: Andrew Morton, Pekka Enberg, David Rientjes,
	Jesper Dangaard Brouer, linux-mm, linux-kernel, Joonsoo Kim

On Mon, 28 Mar 2016, js1304@gmail.com wrote:

> From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
>
> Currently, the decision to free a slab is made every time a free object
> is put back into the slab. This has a problem: free slabs are not freed
> even if we have free slabs and more free_objects than free_limit

There needs to be a better explanation here since I do not get why there
is an issue with checking after free if a slab is actually free.

> when the slab being processed isn't a free slab. This can cause the
> slab subsystem to keep too much memory. This patch tries to fix it by
> checking the number of free objects after all free work is done. If
> there is a free slab at that time, we can free it, so we keep the
> number of free slabs as small as possible.

OK, so if we check after the free work is done, the number of free slabs
may be higher than the limit that was set, and then we free the
additional slabs to get back down to that limit?

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 07/11] mm/slab: racy access/modify the slab color
  2016-03-28  5:26 ` [PATCH 07/11] mm/slab: racy access/modify the slab color js1304
@ 2016-03-29  1:05   ` Christoph Lameter
  2016-03-30  8:25     ` Joonsoo Kim
  0 siblings, 1 reply; 29+ messages in thread
From: Christoph Lameter @ 2016-03-29  1:05 UTC (permalink / raw)
  To: js1304
  Cc: Andrew Morton, Pekka Enberg, David Rientjes,
	Jesper Dangaard Brouer, linux-mm, linux-kernel, Joonsoo Kim

On Mon, 28 Mar 2016, js1304@gmail.com wrote:

> From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
>
> The slab color doesn't need to be updated strictly. Locking just to
> change the slab color could cause more lock contention, so this patch
> makes access to and modification of the slab color racy. This is a
> preparation step for implementing a lockless allocation path when
> there are no free objects in the kmem_cache.

Acked-by: Christoph Lameter <cl@linux.com>

The rest of the description does not relate to this patch and does not
actually reflect the improvement of applying this patch. Remove the rest?


> Below is the result of concurrent allocation/free in slab allocation
> benchmark made by Christoph a long time ago. I make the output simpler.
> The number shows cycle count during alloc/free respectively so less
> is better.
>
> * Before
> Kmalloc N*alloc N*free(32): Average=365/806
> Kmalloc N*alloc N*free(64): Average=452/690
> Kmalloc N*alloc N*free(128): Average=736/886
> Kmalloc N*alloc N*free(256): Average=1167/985
> Kmalloc N*alloc N*free(512): Average=2088/1125
> Kmalloc N*alloc N*free(1024): Average=4115/1184
> Kmalloc N*alloc N*free(2048): Average=8451/1748
> Kmalloc N*alloc N*free(4096): Average=16024/2048
>
> * After
> Kmalloc N*alloc N*free(32): Average=355/750
> Kmalloc N*alloc N*free(64): Average=452/812
> Kmalloc N*alloc N*free(128): Average=559/1070
> Kmalloc N*alloc N*free(256): Average=1176/980
> Kmalloc N*alloc N*free(512): Average=1939/1189
> Kmalloc N*alloc N*free(1024): Average=3521/1278
> Kmalloc N*alloc N*free(2048): Average=7152/1838
> Kmalloc N*alloc N*free(4096): Average=13438/2013
>
> It shows that contention is reduced for object size >= 1024
> and performance increases by roughly 15%.
>
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> ---
>  mm/slab.c | 26 +++++++++++++-------------
>  1 file changed, 13 insertions(+), 13 deletions(-)
>
> diff --git a/mm/slab.c b/mm/slab.c
> index df11757..52fc5e3 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -2536,20 +2536,7 @@ static int cache_grow(struct kmem_cache *cachep,
>  	}
>  	local_flags = flags & (GFP_CONSTRAINT_MASK|GFP_RECLAIM_MASK);
>
> -	/* Take the node list lock to change the colour_next on this node */
>  	check_irq_off();
> -	n = get_node(cachep, nodeid);
> -	spin_lock(&n->list_lock);
> -
> -	/* Get colour for the slab, and cal the next value. */
> -	offset = n->colour_next;
> -	n->colour_next++;
> -	if (n->colour_next >= cachep->colour)
> -		n->colour_next = 0;
> -	spin_unlock(&n->list_lock);
> -
> -	offset *= cachep->colour_off;
> -
>  	if (gfpflags_allow_blocking(local_flags))
>  		local_irq_enable();
>
> @@ -2570,6 +2557,19 @@ static int cache_grow(struct kmem_cache *cachep,
>  	if (!page)
>  		goto failed;
>
> +	n = get_node(cachep, nodeid);
> +
> +	/* Get colour for the slab, and cal the next value. */
> +	n->colour_next++;
> +	if (n->colour_next >= cachep->colour)
> +		n->colour_next = 0;
> +
> +	offset = n->colour_next;
> +	if (offset >= cachep->colour)
> +		offset = 0;
> +
> +	offset *= cachep->colour_off;
> +
>  	/* Get slab management. */
>  	freelist = alloc_slabmgmt(cachep, page, offset,
>  			local_flags & ~GFP_CONSTRAINT_MASK, nodeid);
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 01/11] mm/slab: hold a slab_mutex when calling __kmem_cache_shrink()
  2016-03-29  0:50   ` Christoph Lameter
@ 2016-03-30  8:11     ` Joonsoo Kim
  0 siblings, 0 replies; 29+ messages in thread
From: Joonsoo Kim @ 2016-03-30  8:11 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Andrew Morton, Pekka Enberg, David Rientjes,
	Jesper Dangaard Brouer, linux-mm, linux-kernel

On Mon, Mar 28, 2016 at 07:50:36PM -0500, Christoph Lameter wrote:
> On Mon, 28 Mar 2016, js1304@gmail.com wrote:
> 
> > Major kmem_cache metadata in the slab subsystem is synchronized with
> > the slab_mutex. In SLAB, if any of it is changed, the node's shared
> > array cache is freed and re-populated. If __kmem_cache_shrink() is
> > called at the same time, it will call drain_array() with n->shared
> > without holding the node lock, so a problem can happen.
> >
> > We can fix this small theoretical race condition by holding the node
> > lock in drain_array(), but holding the slab_mutex in kmem_cache_shrink()
> > looks like a more appropriate solution because a stable state makes
> > things less error-prone and this is not a performance-critical path.
> 
> Ummm.. Taking the mutex is added to common code, so this will also affect
> SLUB. The patch needs to consider this. Do we want to force all
> allocators to run shrinking only when holding the lock? SLUB does not
> need to hold the mutex. And frankly, the mutex is for reconfiguration of
> metadata, which is *not* occurring here. A shrink operation does not do
> that. Can we figure out a slab-specific way of handling synchronization
> in the strange free/realloc cycle?
> 
> It seems that taking the node lock is the appropriate level of
> synchronization, since the concern is with the contents of a shared
> cache at that level. There is no change of metadata which would require
> the mutex.

Okay. I will fix it.

Thanks.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 02/11] mm/slab: remove BAD_ALIEN_MAGIC again
  2016-03-28  8:58   ` Geert Uytterhoeven
@ 2016-03-30  8:11     ` Joonsoo Kim
  0 siblings, 0 replies; 29+ messages in thread
From: Joonsoo Kim @ 2016-03-30  8:11 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Andrew Morton, Christoph Lameter, Pekka Enberg, David Rientjes,
	Jesper Dangaard Brouer, Linux MM, linux-kernel

On Mon, Mar 28, 2016 at 10:58:38AM +0200, Geert Uytterhoeven wrote:
> Hi Joonsoo,
> 
> On Mon, Mar 28, 2016 at 7:26 AM,  <js1304@gmail.com> wrote:
> > From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> >
> > The initial attempt to remove BAD_ALIEN_MAGIC was reverted by
> > commit edcad2509550 ("Revert "slab: remove BAD_ALIEN_MAGIC"")
> > because it caused a problem on m68k, which has many nodes but
> > !CONFIG_NUMA. In that case the alien cache isn't used at all, but
> > to cope with some initialization paths a garbage value is used,
> > and that value is BAD_ALIEN_MAGIC. Now this patch sets
> > use_alien_caches to 0 when !CONFIG_NUMA, so there is no
> > initialization path problem and we don't need BAD_ALIEN_MAGIC at
> > all. Remove it.
> >
> > Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> 
> I gave this a try on m68k/ARAnyM, and it didn't crash, unlike the previous
> version that was reverted, so
> Tested-by: Geert Uytterhoeven <geert@linux-m68k.org>

Thanks for testing!!!

Thanks.
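
As a rough illustration of what the quoted commit message describes, the
early-init path might end up looking like this (the exact condition is an
assumption for illustration, not quoted from the patch):

void __init kmem_cache_init(void)
{
	/*
	 * Alien caches only matter with real NUMA; with !CONFIG_NUMA (or a
	 * single possible node) disable them outright instead of parking a
	 * BAD_ALIEN_MAGIC value in the pointers during early init.
	 */
	if (!IS_ENABLED(CONFIG_NUMA) || num_possible_nodes() == 1)
		use_alien_caches = 0;

	/* ... rest of the existing initialization ... */
}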

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 04/11] mm/slab: factor out kmem_cache_node initialization code
  2016-03-29  0:56   ` Christoph Lameter
@ 2016-03-30  8:12     ` Joonsoo Kim
  0 siblings, 0 replies; 29+ messages in thread
From: Joonsoo Kim @ 2016-03-30  8:12 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Andrew Morton, Pekka Enberg, David Rientjes,
	Jesper Dangaard Brouer, linux-mm, linux-kernel

On Mon, Mar 28, 2016 at 07:56:15PM -0500, Christoph Lameter wrote:
> On Mon, 28 Mar 2016, js1304@gmail.com wrote:
> 
> > From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > -		spin_lock_irq(&n->list_lock);
> > -		n->free_limit =
> > -			(1 + nr_cpus_node(node)) *
> > -			cachep->batchcount + cachep->num;
> > -		spin_unlock_irq(&n->list_lock);
> > +		ret = init_cache_node(cachep, node, GFP_KERNEL);
> > +		if (ret)
> > +			return ret;
> 
> Drop ret and do a
> 
> 	return init_cache_node(...);
> 
> instead?

Will do it.

Thanks.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 05/11] mm/slab: clean-up kmem_cache_node setup
  2016-03-29  0:58   ` Christoph Lameter
@ 2016-03-30  8:15     ` Joonsoo Kim
  0 siblings, 0 replies; 29+ messages in thread
From: Joonsoo Kim @ 2016-03-30  8:15 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Andrew Morton, Pekka Enberg, David Rientjes,
	Jesper Dangaard Brouer, linux-mm, linux-kernel

On Mon, Mar 28, 2016 at 07:58:29PM -0500, Christoph Lameter wrote:
> On Mon, 28 Mar 2016, js1304@gmail.com wrote:
> 
> >   * This initializes kmem_cache_node or resizes various caches for all nodes.
> >   */
> > -static int alloc_kmem_cache_node(struct kmem_cache *cachep, gfp_t gfp)
> > +static int setup_kmem_cache_node_node(struct kmem_cache *cachep, gfp_t gfp)
> 
> ... _node_node? Isn't there a better name for it?

I will think about it more. The reason I used this naming is that there
is another place that uses this naming convention; I'm just mimicking
it. :) It would be much appreciated if you have a suggestion.

Thanks.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 06/11] mm/slab: don't keep free slabs if free_objects exceeds free_limit
  2016-03-29  1:03   ` Christoph Lameter
@ 2016-03-30  8:25     ` Joonsoo Kim
  0 siblings, 0 replies; 29+ messages in thread
From: Joonsoo Kim @ 2016-03-30  8:25 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Andrew Morton, Pekka Enberg, David Rientjes,
	Jesper Dangaard Brouer, linux-mm, linux-kernel

On Mon, Mar 28, 2016 at 08:03:16PM -0500, Christoph Lameter wrote:
> On Mon, 28 Mar 2016, js1304@gmail.com wrote:
> 
> > From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> >
> > Currently, the decision to free a slab is made every time a free object
> > is put back into the slab. This has a problem: free slabs are not freed
> > even if we have free slabs and more free_objects than free_limit
> 
> There needs to be a better explanation here since I do not get why there
> is an issue with checking after free if a slab is actually free.

Okay. Consider the following situation where 3 objects are freed.

free_limit = 10
nr_free = 9

free(free slab) free(free slab) free(not free slab)

If we check one by one, then when nr_free > free_limit (at the last
free), we cannot free a slab because the slab currently being processed
isn't a free slab.

But if we check once at the end, we can free 1 free slab.

I will add more explanation in the next version.
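
A rough sketch of that idea (the helper name trim_free_slabs() and its
exact shape are assumptions for illustration, not the posted patch; the
caller is assumed to hold n->list_lock and to destroy the gathered slabs
afterwards):

static void trim_free_slabs(struct kmem_cache *cachep,
			    struct kmem_cache_node *n,
			    struct list_head *list)
{
	struct page *page, *tmp;

	/* Run once after a whole batch of frees has been accounted. */
	list_for_each_entry_safe(page, tmp, &n->slabs_free, lru) {
		if (n->free_objects <= n->free_limit)
			break;

		/* Move the free slab out; the caller calls slabs_destroy(). */
		list_move(&page->lru, list);
		n->free_objects -= cachep->num;
	}
}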

> 
> > when the slab being processed isn't a free slab. This can cause the
> > slab subsystem to keep too much memory. This patch tries to fix it by
> > checking the number of free objects after all free work is done. If
> > there is a free slab at that time, we can free it, so we keep the
> > number of free slabs as small as possible.
> 
> OK, so if we check after the free work is done, the number of free slabs
> may be higher than the limit that was set, and then we free the
> additional slabs to get back down to that limit?

Yes.

Thanks.

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 07/11] mm/slab: racy access/modify the slab color
  2016-03-29  1:05   ` Christoph Lameter
@ 2016-03-30  8:25     ` Joonsoo Kim
  0 siblings, 0 replies; 29+ messages in thread
From: Joonsoo Kim @ 2016-03-30  8:25 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Andrew Morton, Pekka Enberg, David Rientjes,
	Jesper Dangaard Brouer, linux-mm, linux-kernel

On Mon, Mar 28, 2016 at 08:05:41PM -0500, Christoph Lameter wrote:
> On Mon, 28 Mar 2016, js1304@gmail.com wrote:
> 
> > From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> >
> > The slab color doesn't need to be updated strictly. Locking just to
> > change the slab color could cause more lock contention, so this patch
> > makes access to and modification of the slab color racy. This is a
> > preparation step for implementing a lockless allocation path when
> > there are no free objects in the kmem_cache.
> 
> Acked-by: Christoph Lameter <cl@linux.com>
> 
> The rest of the description does not relate to this patch and does not
> actually reflect the improvement of applying this patch. Remove the rest?

No, the improvement below is for this individual patch.

Thanks.

> 
> 
> > Below is the result of concurrent allocation/free in slab allocation
> > benchmark made by Christoph a long time ago. I make the output simpler.
> > The number shows cycle count during alloc/free respectively so less
> > is better.
> >
> > * Before
> > Kmalloc N*alloc N*free(32): Average=365/806
> > Kmalloc N*alloc N*free(64): Average=452/690
> > Kmalloc N*alloc N*free(128): Average=736/886
> > Kmalloc N*alloc N*free(256): Average=1167/985
> > Kmalloc N*alloc N*free(512): Average=2088/1125
> > Kmalloc N*alloc N*free(1024): Average=4115/1184
> > Kmalloc N*alloc N*free(2048): Average=8451/1748
> > Kmalloc N*alloc N*free(4096): Average=16024/2048
> >
> > * After
> > Kmalloc N*alloc N*free(32): Average=355/750
> > Kmalloc N*alloc N*free(64): Average=452/812
> > Kmalloc N*alloc N*free(128): Average=559/1070
> > Kmalloc N*alloc N*free(256): Average=1176/980
> > Kmalloc N*alloc N*free(512): Average=1939/1189
> > Kmalloc N*alloc N*free(1024): Average=3521/1278
> > Kmalloc N*alloc N*free(2048): Average=7152/1838
> > Kmalloc N*alloc N*free(4096): Average=13438/2013
> >
> > It shows that contention is reduced for object size >= 1024
> > and performance increases by roughly 15%.
> >
> > Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > ---
> >  mm/slab.c | 26 +++++++++++++-------------
> >  1 file changed, 13 insertions(+), 13 deletions(-)
> >
> > diff --git a/mm/slab.c b/mm/slab.c
> > index df11757..52fc5e3 100644
> > --- a/mm/slab.c
> > +++ b/mm/slab.c
> > @@ -2536,20 +2536,7 @@ static int cache_grow(struct kmem_cache *cachep,
> >  	}
> >  	local_flags = flags & (GFP_CONSTRAINT_MASK|GFP_RECLAIM_MASK);
> >
> > -	/* Take the node list lock to change the colour_next on this node */
> >  	check_irq_off();
> > -	n = get_node(cachep, nodeid);
> > -	spin_lock(&n->list_lock);
> > -
> > -	/* Get colour for the slab, and cal the next value. */
> > -	offset = n->colour_next;
> > -	n->colour_next++;
> > -	if (n->colour_next >= cachep->colour)
> > -		n->colour_next = 0;
> > -	spin_unlock(&n->list_lock);
> > -
> > -	offset *= cachep->colour_off;
> > -
> >  	if (gfpflags_allow_blocking(local_flags))
> >  		local_irq_enable();
> >
> > @@ -2570,6 +2557,19 @@ static int cache_grow(struct kmem_cache *cachep,
> >  	if (!page)
> >  		goto failed;
> >
> > +	n = get_node(cachep, nodeid);
> > +
> > +	/* Get colour for the slab, and cal the next value. */
> > +	n->colour_next++;
> > +	if (n->colour_next >= cachep->colour)
> > +		n->colour_next = 0;
> > +
> > +	offset = n->colour_next;
> > +	if (offset >= cachep->colour)
> > +		offset = 0;
> > +
> > +	offset *= cachep->colour_off;
> > +
> >  	/* Get slab management. */
> >  	freelist = alloc_slabmgmt(cachep, page, offset,
> >  			local_flags & ~GFP_CONSTRAINT_MASK, nodeid);
> >
> 

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 01/11] mm/slab: hold a slab_mutex when calling __kmem_cache_shrink()
  2016-03-28  5:26 ` [PATCH 01/11] mm/slab: hold a slab_mutex when calling __kmem_cache_shrink() js1304
  2016-03-29  0:50   ` Christoph Lameter
@ 2016-03-31 10:53   ` Nikolay Borisov
  2016-04-01  2:18     ` Joonsoo Kim
  1 sibling, 1 reply; 29+ messages in thread
From: Nikolay Borisov @ 2016-03-31 10:53 UTC (permalink / raw)
  To: js1304, Andrew Morton
  Cc: Christoph Lameter, Pekka Enberg, David Rientjes,
	Jesper Dangaard Brouer, linux-mm, linux-kernel, Joonsoo Kim



On 03/28/2016 08:26 AM, js1304@gmail.com wrote:
> From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> 
> Major kmem_cache metadata in the slab subsystem is synchronized with
> the slab_mutex. In SLAB, if any of it is changed, the node's shared
> array cache is freed and re-populated. If __kmem_cache_shrink() is
> called at the same time, it will call drain_array() with n->shared
> without holding the node lock, so a problem can happen.
> 
> We can fix this small theoretical race condition by holding the node
> lock in drain_array(), but holding the slab_mutex in kmem_cache_shrink()
> looks like a more appropriate solution because a stable state makes
> things less error-prone and this is not a performance-critical path.
> 
> In addition, annotate the SLAB functions accordingly.

Just a nit, but wouldn't it be better to use lockdep_assert_held/_once
instead of comment-style annotation? In both cases, someone who wants to
understand which locks have to be held will go and read the source, but
in my mind it's easier to miss a comment line than a lockdep_assert.
Furthermore, if lockdep is enabled, a locking violation would spew useful
info to dmesg.
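
For instance, the rule could be expressed directly in code (the helper
name here is hypothetical; lockdep_assert_held() and slab_mutex are the
real interfaces):

#include <linux/lockdep.h>

/* Hypothetical helper; the assertion is the point, not the name. */
static void shrink_shared_arrays(struct kmem_cache *cachep)
{
	lockdep_assert_held(&slab_mutex);	/* instead of a comment */

	/* ... drain and re-populate the shared array caches ... */
}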

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH 01/11] mm/slab: hold a slab_mutex when calling __kmem_cache_shrink()
  2016-03-31 10:53   ` Nikolay Borisov
@ 2016-04-01  2:18     ` Joonsoo Kim
  0 siblings, 0 replies; 29+ messages in thread
From: Joonsoo Kim @ 2016-04-01  2:18 UTC (permalink / raw)
  To: Nikolay Borisov
  Cc: Andrew Morton, Christoph Lameter, Pekka Enberg, David Rientjes,
	Jesper Dangaard Brouer, linux-mm, linux-kernel

On Thu, Mar 31, 2016 at 01:53:14PM +0300, Nikolay Borisov wrote:
> 
> 
> On 03/28/2016 08:26 AM, js1304@gmail.com wrote:
> > From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > 
> > Major kmem_cache metadata in the slab subsystem is synchronized with
> > the slab_mutex. In SLAB, if any of it is changed, the node's shared
> > array cache is freed and re-populated. If __kmem_cache_shrink() is
> > called at the same time, it will call drain_array() with n->shared
> > without holding the node lock, so a problem can happen.
> > 
> > We can fix this small theoretical race condition by holding the node
> > lock in drain_array(), but holding the slab_mutex in kmem_cache_shrink()
> > looks like a more appropriate solution because a stable state makes
> > things less error-prone and this is not a performance-critical path.
> > 
> > In addition, annotate the SLAB functions accordingly.
> 
> Just a nit, but wouldn't it be better to use lockdep_assert_held/_once
> instead of comment-style annotation? In both cases, someone who wants to
> understand which locks have to be held will go and read the source, but
> in my mind it's easier to miss a comment line than a lockdep_assert.
> Furthermore, if lockdep is enabled, a locking violation would spew useful
> info to dmesg.

Good idea. I'm not sure if lockdep_assert is the best fit, but I will add
something to check it rather than just adding a comment.

Thanks.

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2016-04-01  2:16 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-03-28  5:26 mm/slab: reduce lock contention in alloc path js1304
2016-03-28  5:26 ` [PATCH 01/11] mm/slab: hold a slab_mutex when calling __kmem_cache_shrink() js1304
2016-03-29  0:50   ` Christoph Lameter
2016-03-30  8:11     ` Joonsoo Kim
2016-03-31 10:53   ` Nikolay Borisov
2016-04-01  2:18     ` Joonsoo Kim
2016-03-28  5:26 ` [PATCH 02/11] mm/slab: remove BAD_ALIEN_MAGIC again js1304
2016-03-28  8:58   ` Geert Uytterhoeven
2016-03-30  8:11     ` Joonsoo Kim
2016-03-28 21:19   ` Andrew Morton
2016-03-29  0:53     ` Christoph Lameter
2016-03-28  5:26 ` [PATCH 03/11] mm/slab: drain the free slab as much as possible js1304
2016-03-29  0:54   ` Christoph Lameter
2016-03-28  5:26 ` [PATCH 04/11] mm/slab: factor out kmem_cache_node initialization code js1304
2016-03-29  0:56   ` Christoph Lameter
2016-03-30  8:12     ` Joonsoo Kim
2016-03-28  5:26 ` [PATCH 05/11] mm/slab: clean-up kmem_cache_node setup js1304
2016-03-29  0:58   ` Christoph Lameter
2016-03-30  8:15     ` Joonsoo Kim
2016-03-28  5:26 ` [PATCH 06/11] mm/slab: don't keep free slabs if free_objects exceeds free_limit js1304
2016-03-29  1:03   ` Christoph Lameter
2016-03-30  8:25     ` Joonsoo Kim
2016-03-28  5:26 ` [PATCH 07/11] mm/slab: racy access/modify the slab color js1304
2016-03-29  1:05   ` Christoph Lameter
2016-03-30  8:25     ` Joonsoo Kim
2016-03-28  5:26 ` [PATCH 08/11] mm/slab: make cache_grow() handle the page allocated on arbitrary node js1304
2016-03-28  5:26 ` [PATCH 09/11] mm/slab: separate cache_grow() to two parts js1304
2016-03-28  5:27 ` [PATCH 10/11] mm/slab: refill cpu cache through a new slab without holding a node lock js1304
2016-03-28  5:27 ` [PATCH 11/11] mm/slab: lockless decision to grow cache js1304

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).