linux-mm.kvack.org archive mirror
* [S+Q3 00/23] SLUB: The Unified slab allocator (V3)
@ 2010-08-04  2:45 Christoph Lameter
From: Christoph Lameter @ 2010-08-04  2:45 UTC
  To: Pekka Enberg; +Cc: linux-mm, linux-kernel, Nick Piggin, David Rientjes

The following is a first release of an allocator based on SLAB
and SLUB that integrates the best approaches from both allocators. The
per cpu queueing is the same as in the two prior releases. The NUMA
facilities are much improved over V2. Shared and alien cache support was
added to track the cache hot state of objects.

After these patches SLUB will track the cpu cache contents
like SLAB attempted to. There are a number of architectural differences:

1. SLUB accurately tracks cpu caches instead of assuming that there
   is only a single cpu cache per node or system.

2. SLUB object expiration is tied into the page reclaim logic. There
   is no periodic cache expiration.

3. SLUB caches are dynamically configurable via the sysfs filesystem.

4. There is no per slab page metadata structure to maintain (aside
   from the object bitmap that usually fits into the page struct).

5. Keeps all the other good features of SLUB as well.

SLUB+Q is a merging of SLUB with some queuing concepts from SLAB and a
new way of managing objects in the slabs using bitmaps. It uses a percpu
queue so that free operations can be properly buffered and a bitmap for
managing the free/allocated state in the slabs. It is slightly less
space efficient than SLUB (a slab with more than BITS_PER_LONG objects
needs a larger bitmap, a few words in size, placed in the slab page)
but in general does not increase space use too much.
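
For illustration, the bitmap bookkeeping described above could look
roughly like the userspace sketch below. It is not code from the
patches: the names slab_sketch, slab_init, slab_alloc_index,
slab_free_index and bitmap_words_needed, and the MAX_OBJECTS cap, are
assumptions made up for this example. It only shows how one bit per
object records the free/allocated state, and how the bitmap grows past
one word once a slab holds more than BITS_PER_LONG objects.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-ins; the kernel provides its own BITS_PER_LONG. */
#define BITS_PER_LONG   (sizeof(unsigned long) * CHAR_BIT)
#define BITMAP_WORDS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)
#define MAX_OBJECTS     256	/* hypothetical cap for this sketch */

/* Hypothetical stand-in for one slab page, not the kernel's struct page. */
struct slab_sketch {
	unsigned int objects;	/* number of objects in this slab */
	unsigned long free_map[BITMAP_WORDS(MAX_OBJECTS)]; /* bit set = free */
};

/* Number of words the bitmap needs for a slab with this many objects. */
size_t bitmap_words_needed(unsigned int objects)
{
	return BITMAP_WORDS(objects);
}

/* Mark every object in the slab as free. */
void slab_init(struct slab_sketch *s, unsigned int objects)
{
	unsigned int i;

	s->objects = objects;
	memset(s->free_map, 0, sizeof(s->free_map));
	for (i = 0; i < objects; i++)
		s->free_map[i / BITS_PER_LONG] |= 1UL << (i % BITS_PER_LONG);
}

/*
 * Find the lowest free object, mark it allocated and return its index.
 * Returns -1 if the slab is full. Only the bitmap is touched, never
 * the object itself.
 */
int slab_alloc_index(struct slab_sketch *s)
{
	unsigned int i;

	for (i = 0; i < s->objects; i++) {
		unsigned long mask = 1UL << (i % BITS_PER_LONG);
		unsigned long *word = &s->free_map[i / BITS_PER_LONG];

		if (*word & mask) {
			*word &= ~mask;
			return (int)i;
		}
	}
	return -1;
}

/* Mark the object at idx as free again. */
void slab_free_index(struct slab_sketch *s, unsigned int idx)
{
	s->free_map[idx / BITS_PER_LONG] |= 1UL << (idx % BITS_PER_LONG);
}
```

With 64-bit longs a slab of up to 64 objects needs a single word in
this sketch, while a slab of 100 objects needs two.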

The SLAB scheme of not touching the object during management is adopted.
SLUB+Q can efficiently free and allocate cache cold objects without
causing cache misses.
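
As a rough sketch of the queueing idea (again not taken from the
patches), a per cpu queue can be modelled as a small array of object
pointers used as a stack. The names cpu_queue_sketch, queue_alloc and
queue_free and the tiny QUEUE_SIZE are hypothetical. The point is that
both operations only move pointers around and never read or write the
object, so a cache cold object does not get pulled into the cpu cache
by the allocator itself.

```c
#include <stddef.h>

#define QUEUE_SIZE 4	/* hypothetical; deliberately tiny for the example */

/* Hypothetical per cpu queue: a stack of pointers to free objects. */
struct cpu_queue_sketch {
	unsigned int count;		/* number of queued objects */
	void *objects[QUEUE_SIZE];
};

/*
 * Hand out the most recently freed object (the one most likely to be
 * cache hot). Returns NULL when the queue is empty, in which case a
 * real allocator would refill the queue from a slab.
 */
void *queue_alloc(struct cpu_queue_sketch *q)
{
	if (q->count == 0)
		return NULL;
	return q->objects[--q->count];
}

/*
 * Buffer a freed object in the queue. Returns 0 on success or -1 when
 * the queue is full, in which case a real allocator would first drain
 * some objects back to their slabs. The object is not dereferenced.
 */
int queue_free(struct cpu_queue_sketch *q, void *object)
{
	if (q->count == QUEUE_SIZE)
		return -1;
	q->objects[q->count++] = object;
	return 0;
}
```

The last-in, first-out order is a design choice of this sketch: it
returns the object that was freed most recently on this cpu first.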

I have had limited time for benchmarking this release so far since I
was more focused on getting SLAB features merged in and making it
work reliably with all the usual SLUB bells and whistles. The queueing
scheme from the SLUB+Q V1/V2 releases was not changed, so the basic
SMP performance is still the same. V1 and V2 did not have NUMA-clean
queues and therefore the performance on NUMA systems was not great.

Since the basic queueing scheme was taken from SLAB we should be seeing
similar or better performance on NUMA. But I am limited to two node
systems at this point. For those systems the alien caches are allocated
at a size similar to the shared caches, meaning that more optimizations
will now be geared to small NUMA systems.



Patches against 2.6.35

1,2   Some percpu stuff that I hope will independently be merged in the
      2.6.36 cycle.

3-13  Cleanup patches for SLUB that are general improvements. Some of those
      are already in the slab tree for 2.6.36.

14-18 Minimal set that realizes per cpu queues without fancy shared or alien
      queues. This should be enough to be competitive with SLAB on SMP
      on modern hardware, as the earlier measurements show.

19    NUMA policies applied at the object level. This will cause significantly
      more processing in the allocator hotpath for the NUMA case on
      particular slabs so that individual allocations can be redirected
      to different nodes.

20    Shared caches per cache sibling group between processors.

21    Alien caches per cache sibling group. Just adds a couple of
      shared caches and uses them for foreign nodes.

22    Cache expiration

23    Expire caches from page reclaim logic in mm/vmscan.c

Thread overview: 47+ messages
2010-08-04  2:45 [S+Q3 00/23] SLUB: The Unified slab allocator (V3) Christoph Lameter
2010-08-04  2:45 ` [S+Q3 01/23] percpu: make @dyn_size always mean min dyn_size in first chunk init functions Christoph Lameter
2010-08-04  2:45 ` [S+Q3 02/23] percpu: allow limited allocation before slab is online Christoph Lameter
2010-08-04  2:45 ` [S+Q3 03/23] slub: Use a constant for a unspecified node Christoph Lameter
2010-08-04  3:34   ` David Rientjes
2010-08-04 16:15     ` Christoph Lameter
2010-08-05  7:40       ` David Rientjes
2010-08-04  2:45 ` [S+Q3 04/23] SLUB: Constants need UL Christoph Lameter
2010-08-04  2:45 ` [S+Q3 05/23] Subjec Slub: Force no inlining of debug functions Christoph Lameter
2010-08-04  2:45 ` [S+Q3 06/23] slub: Check kasprintf results in kmem_cache_init() Christoph Lameter
2010-08-04  2:45 ` [S+Q3 07/23] slub: Use kmem_cache flags to detect if slab is in debugging mode Christoph Lameter
2010-08-04  2:45 ` [S+Q3 08/23] slub: remove dynamic dma slab allocation Christoph Lameter
2010-08-04  2:45 ` [S+Q3 09/23] slub: Remove static kmem_cache_cpu array for boot Christoph Lameter
2010-08-04  2:45 ` [S+Q3 10/23] slub: Allow removal of slab caches during boot V2 Christoph Lameter
2010-08-04  2:45 ` [S+Q3 11/23] slub: Dynamically size kmalloc cache allocations Christoph Lameter
2010-08-04  2:45 ` [S+Q3 12/23] slub: Extract hooks for memory checkers from hotpaths Christoph Lameter
2010-08-04  2:45 ` [S+Q3 13/23] slub: Move gfpflag masking out of the hotpath Christoph Lameter
2010-08-04  2:45 ` [S+Q3 14/23] slub: Add SLAB style per cpu queueing Christoph Lameter
2010-08-04  2:45 ` [S+Q3 15/23] slub: Allow resizing of per cpu queues Christoph Lameter
2010-08-04  2:45 ` [S+Q3 16/23] slub: Get rid of useless function count_free() Christoph Lameter
2010-08-04  2:45 ` [S+Q3 17/23] slub: Remove MAX_OBJS limitation Christoph Lameter
2010-08-04  2:45 ` [S+Q3 18/23] slub: Drop allocator announcement Christoph Lameter
2010-08-04  2:45 ` [S+Q3 19/23] slub: Object based NUMA policies Christoph Lameter
2010-08-04  2:45 ` [S+Q3 20/23] slub: Shared cache to exploit cross cpu caching abilities Christoph Lameter
2010-08-17  5:52   ` David Rientjes
2010-08-17 17:51     ` Christoph Lameter
2010-08-17 18:42       ` David Rientjes
2010-08-17 18:50         ` Christoph Lameter
2010-08-17 19:02           ` David Rientjes
2010-08-17 19:32             ` Christoph Lameter
2010-08-18 19:32               ` Christoph Lameter
2010-08-04  2:45 ` [S+Q3 21/23] slub: Support Alien Caches Christoph Lameter
2010-08-04  2:45 ` [S+Q3 22/23] slub: Cached object expiration Christoph Lameter
2010-08-04  2:45 ` [S+Q3 23/23] vmscan: Tie slub object expiration into page reclaim Christoph Lameter
2010-08-04  4:39 ` [S+Q3 00/23] SLUB: The Unified slab allocator (V3) David Rientjes
2010-08-04 16:17   ` Christoph Lameter
2010-08-05  8:38     ` David Rientjes
2010-08-05 17:33       ` Christoph Lameter
2010-08-17  4:56         ` David Rientjes
2010-08-17  7:55           ` Tejun Heo
2010-08-17 13:56             ` Christoph Lameter
2010-08-17 17:23           ` Christoph Lameter
2010-08-17 17:29             ` Christoph Lameter
2010-08-17 18:02             ` David Rientjes
2010-08-17 18:47               ` Christoph Lameter
2010-08-17 18:54                 ` David Rientjes
2010-08-17 19:34                   ` Christoph Lameter
