* [PATCH 00/35] Cleanup and optimise the page allocator V3
From: Mel Gorman @ 2009-03-16  9:45 UTC (permalink / raw)
  To: Mel Gorman, Linux Memory Management List
  Cc: Pekka Enberg, Rik van Riel, KOSAKI Motohiro, Christoph Lameter,
	Johannes Weiner, Nick Piggin, Linux Kernel Mailing List,
	Lin Ming, Zhang Yanmin, Peter Zijlstra

Here is V3 of an attempt to clean up and optimise the page allocator; it
should be ready for general testing. The page allocator is now faster (16%
less time overall for kernbench on one machine) and it has a smaller cache
footprint (16.5% fewer L1 cache misses and 19.5% fewer L2 cache misses for
kernbench on one machine). The text footprint has unfortunately increased,
largely due to the introduction of a form of lazy buddy merging that avoids
cache misses by postponing buddy merging until a high-order allocation
needs it.
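
To make the lazy merging idea concrete, here is a minimal user-space sketch.
It is not the code from this series: the toy arena, the free_order[] array
and every function name below are assumptions made up for illustration.
Freeing only records the block; buddies are coalesced when a higher-order
request cannot otherwise be satisfied.

```c
#include <assert.h>

#define NPAGES    16   /* toy arena size in pages (assumption) */
#define MAX_ORDER 4    /* largest block is 2^4 = 16 pages */

/* free_order[pfn] == k: a free block of 2^k pages starts at pfn; -1: none. */
static int free_order[NPAGES];

static void arena_init(void)
{
	for (unsigned pfn = 0; pfn < NPAGES; pfn++)
		free_order[pfn] = -1;
}

/* Lazy free: record the block and do not touch its buddy at all. */
static void free_block_lazy(unsigned pfn, int order)
{
	assert(pfn < NPAGES && (pfn & ((1u << order) - 1)) == 0);
	free_order[pfn] = order;
}

/* Coalesce free buddy pairs, lowest order first, up to blocks of 'target'. */
static void merge_buddies(int target)
{
	for (int order = 0; order < target; order++) {
		unsigned step = 2u << order;	/* size of a merged block */

		for (unsigned pfn = 0; pfn < NPAGES; pfn += step) {
			unsigned buddy = pfn ^ (1u << order);

			if (free_order[pfn] == order && free_order[buddy] == order) {
				free_order[buddy] = -1;
				free_order[pfn] = order + 1;
			}
		}
	}
}

/* Take the first free block of at least 'order', splitting larger ones. */
static int take_block(int order)
{
	for (int o = order; o <= MAX_ORDER; o++) {
		for (unsigned pfn = 0; pfn < NPAGES; pfn++) {
			if (free_order[pfn] != o)
				continue;
			free_order[pfn] = -1;
			while (o > order) {	/* hand the upper halves back */
				o--;
				free_order[pfn + (1u << o)] = o;
			}
			return (int)pfn;
		}
	}
	return -1;
}

/* Allocation pays for merging only when a high-order request needs it. */
static int alloc_block(int order)
{
	int pfn = take_block(order);

	if (pfn < 0 && order > 0) {
		merge_buddies(order);
		pfn = take_block(order);
	}
	return pfn;
}
```

In this sketch, four order-0 frees of pages 0 to 3 stay unmerged until an
alloc_block(2) call finds no order-2 block, runs the merge pass and then
returns page 0. The real allocator keeps per-zone, per-migratetype free
lists rather than a flat array.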

I tested the patchset with kernbench, hackbench, sysbench-postgres and netperf
UDP and TCP with a variety of sizes. Many machines and loads showed improved
performance; *however*, it was not universal. On some machines, one load would
be faster and another slower (perversely, sometimes netperf-UDP would be
faster with netperf-TCP slower). On different machines, the workloads
that gained or lost would differ. I haven't fully pinned down why this is
yet, but I have observed that on at least one machine lock contention is
higher and more time is spent in functions like rb_erase(), both of which
might imply some sort of scheduling artifact. I've also noted that while the
allocator incurs fewer cache misses, cache misses overall sometimes increase
for the workload, but the increased lock contention might account for this.

In some cases, more time is spent in copy_user_generic_string()[1], which
might imply that strings are getting the same colour with the greater
effort spent giving back hot pages, but theories as to why this is not a
universal effect are welcome. I've also noted that machines with many CPUs
with different caches suffer because struct page is not cache-aligned, but
aligning it hurts other machines so I left it alone. Finally, the performance
characteristics vary depending on whether you use SLAB, SLUB or SLQB.

So, while the page allocator is faster in most cases, making all workloads
universally faster now requires looking at other areas such as the sl*b
allocator and the scheduler.

Here is the patchset as it stands and I think it's ready for wider testing
and to be considered for merging depending on the outcome of testing and
reviews.

[1] copy_user_generic_unrolled on one machine was slowed down by an extreme
amount. I did not check if there was a pattern of slowdowns versus which
version of copy_user_generic() was used.

Changes since V2
  o Remove branches by treating watermark flags as array indices
  o Remove branch by assuming __GFP_HIGH == ALLOC_HIGH
  o Do not check for compound on every page free
  o Remove branch by always ensuring the migratetype is known on free
  o Simplify buffered_rmqueue further
  o Reintroduce improved version of batched bulk free of pcp pages
  o Use allocation flags as an index to zone watermarks
  o Work out __GFP_COLD only once
  o Reduce the number of times zone stats are updated
  o Do not dump reserve pages back into the allocator. Instead treat them
    as MOVABLE so that MIGRATE_RESERVE gets used on the max-order-overlapped
    boundaries without causing trouble
  o Allow pages up to PAGE_ALLOC_COSTLY_ORDER to use the per-cpu allocator.
    order-1 allocations in particular are frequent enough to justify this
  o Rearrange inlining such that the hot-path is inlined but not in a way
    that increases the text size of the page allocator
  o Make the check for needing additional zonelist filtering due to NUMA
    or cpusets as light as possible
  o Do not destroy compound pages going to the PCP lists
  o Delay the merging of buddies until a high-order allocation needs them
    or anti-fragmentation is being forced to fall back
  o Count high-order pages as 1
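
Two of the branch removals listed above (watermark flags as array indices,
and assuming __GFP_HIGH == ALLOC_HIGH) can be sketched in user-space C as
follows. This is an illustration under stated assumptions rather than the
patches themselves: the flag values, struct zone_sketch, GFP_HIGH_SKETCH and
both function names are invented here, and the halving of the mark for
ALLOC_HIGH only stands in for whatever adjustment the real check applies.

```c
#include <assert.h>
#include <stdbool.h>

/* Watermark slots; the low bits of alloc_flags double as the array index. */
enum { WMARK_MIN, WMARK_LOW, WMARK_HIGH, NR_WMARK };

#define ALLOC_WMARK_MIN   WMARK_MIN
#define ALLOC_WMARK_LOW   WMARK_LOW
#define ALLOC_WMARK_HIGH  WMARK_HIGH
#define ALLOC_WMARK_MASK  0x03u
#define ALLOC_HIGH        0x20u

/* Stand-in for __GFP_HIGH; chosen to share its bit with ALLOC_HIGH. */
#define GFP_HIGH_SKETCH   0x20u

/* The trick is only valid while the two bits match, so check at build time. */
_Static_assert(GFP_HIGH_SKETCH == ALLOC_HIGH, "flag bits must match");

struct zone_sketch {
	unsigned long watermark[NR_WMARK];
	unsigned long free_pages;
};

/*
 * Branch-free replacement for:
 *	if (gfp_mask & GFP_HIGH_SKETCH)
 *		alloc_flags |= ALLOC_HIGH;
 */
static unsigned gfp_to_alloc_flags(unsigned gfp_mask)
{
	unsigned alloc_flags = ALLOC_WMARK_MIN;

	alloc_flags |= gfp_mask & GFP_HIGH_SKETCH;
	return alloc_flags;
}

/* Select the watermark by indexing instead of an if/else chain on flags. */
static bool zone_watermark_ok_sketch(const struct zone_sketch *z,
				     unsigned alloc_flags)
{
	unsigned idx = alloc_flags & ALLOC_WMARK_MASK;
	unsigned long mark;

	assert(idx < NR_WMARK);
	mark = z->watermark[idx];
	if (alloc_flags & ALLOC_HIGH)
		mark -= mark / 2;	/* high-priority callers may dip lower */
	return z->free_pages > mark;
}
```

With watermarks {10, 20, 30} and 15 free pages, this sketch passes the MIN
check, fails the LOW check, and passes LOW again once ALLOC_HIGH is set.
The build-time assertion is what makes the assumption safe: if either flag
value changes, compilation fails instead of silently mis-mapping the bit.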

Changes since V1
  o Remove the ifdef CONFIG_CPUSETS from inside get_page_from_freelist()
  o Use non-lock bit operations for clearing the mlock flag
  o Factor out alloc_flags calculation so it is only done once (Peter)
  o Make gfp.h a bit prettier and clear-cut (Peter)
  o Instead of deleting a debugging check, replace page_count() in the
    free path with a version that does not check for compound pages (Nick)
  o Drop the alteration for hot/cold page freeing until we know if it
    helps or not


Thread overview: 188+ messages
2009-03-16  9:45 [PATCH 00/35] Cleanup and optimise the page allocator V3 Mel Gorman
2009-03-16  9:45 ` Mel Gorman
2009-03-16  9:45 ` [PATCH 01/35] Replace __alloc_pages_internal() with __alloc_pages_nodemask() Mel Gorman
2009-03-16  9:45   ` Mel Gorman
2009-03-16 15:49   ` Christoph Lameter
2009-03-16 15:49     ` Christoph Lameter
2009-03-16  9:45 ` [PATCH 02/35] Do not sanity check order in the fast path Mel Gorman
2009-03-16  9:45   ` Mel Gorman
2009-03-16 15:52   ` Christoph Lameter
2009-03-16 15:52     ` Christoph Lameter
2009-03-16  9:45 ` [PATCH 03/35] Do not check NUMA node ID when the caller knows the node is valid Mel Gorman
2009-03-16  9:45   ` Mel Gorman
2009-03-16  9:45 ` [PATCH 04/35] Check only once if the zonelist is suitable for the allocation Mel Gorman
2009-03-16  9:45   ` Mel Gorman
2009-03-16  9:46 ` [PATCH 05/35] Break up the allocator entry point into fast and slow paths Mel Gorman
2009-03-16  9:46   ` Mel Gorman
2009-03-16  9:46 ` [PATCH 06/35] Move check for disabled anti-fragmentation out of fastpath Mel Gorman
2009-03-16  9:46   ` Mel Gorman
2009-03-16 15:54   ` Christoph Lameter
2009-03-16 15:54     ` Christoph Lameter
2009-03-16  9:46 ` [PATCH 07/35] Check in advance if the zonelist needs additional filtering Mel Gorman
2009-03-16  9:46   ` Mel Gorman
2009-03-16  9:46 ` [PATCH 08/35] Calculate the preferred zone for allocation only once Mel Gorman
2009-03-16  9:46   ` Mel Gorman
2009-03-16  9:46 ` [PATCH 09/35] Calculate the migratetype " Mel Gorman
2009-03-16  9:46   ` Mel Gorman
2009-03-16  9:46 ` [PATCH 10/35] Calculate the alloc_flags " Mel Gorman
2009-03-16  9:46   ` Mel Gorman
2009-03-16  9:46 ` [PATCH 11/35] Calculate the cold parameter " Mel Gorman
2009-03-16  9:46   ` Mel Gorman
2009-03-16  9:46 ` [PATCH 12/35] Remove a branch by assuming __GFP_HIGH == ALLOC_HIGH Mel Gorman
2009-03-16  9:46   ` Mel Gorman
2009-03-16  9:46 ` [PATCH 13/35] Inline __rmqueue_smallest() Mel Gorman
2009-03-16  9:46   ` Mel Gorman
2009-03-16  9:46 ` [PATCH 14/35] Inline buffered_rmqueue() Mel Gorman
2009-03-16  9:46   ` Mel Gorman
2009-03-16  9:46 ` [PATCH 15/35] Inline __rmqueue_fallback() Mel Gorman
2009-03-16  9:46   ` Mel Gorman
2009-03-16 15:57   ` Christoph Lameter
2009-03-16 15:57     ` Christoph Lameter
2009-03-16 16:25     ` Mel Gorman
2009-03-16 16:25       ` Mel Gorman
2009-03-16  9:46 ` [PATCH 16/35] Save text by reducing call sites of __rmqueue() Mel Gorman
2009-03-16  9:46   ` Mel Gorman
2009-03-16  9:46 ` [PATCH 17/35] Do not call get_pageblock_migratetype() more than necessary Mel Gorman
2009-03-16  9:46   ` Mel Gorman
2009-03-16 16:00   ` Christoph Lameter
2009-03-16 16:00     ` Christoph Lameter
2009-03-16  9:46 ` [PATCH 18/35] Do not disable interrupts in free_page_mlock() Mel Gorman
2009-03-16  9:46   ` Mel Gorman
2009-03-16 16:05   ` Christoph Lameter
2009-03-16 16:05     ` Christoph Lameter
2009-03-16 16:29     ` Mel Gorman
2009-03-16 16:29       ` Mel Gorman
2009-03-16  9:46 ` [PATCH 19/35] Do not setup zonelist cache when there is only one node Mel Gorman
2009-03-16  9:46   ` Mel Gorman
2009-03-16 16:06   ` Christoph Lameter
2009-03-16 16:06     ` Christoph Lameter
2009-03-16  9:46 ` [PATCH 20/35] Use a pre-calculated value for num_online_nodes() Mel Gorman
2009-03-16  9:46   ` Mel Gorman
2009-03-16 11:42   ` Nick Piggin
2009-03-16 11:42     ` Nick Piggin
2009-03-16 11:46     ` Nick Piggin
2009-03-16 11:46       ` Nick Piggin
2009-03-16 16:08   ` Christoph Lameter
2009-03-16 16:08     ` Christoph Lameter
2009-03-16 16:36     ` Mel Gorman
2009-03-16 16:36       ` Mel Gorman
2009-03-16 16:47       ` Christoph Lameter
2009-03-16 16:47         ` Christoph Lameter
2009-03-18 15:08         ` Mel Gorman
2009-03-18 15:08           ` Mel Gorman
2009-03-18 16:58           ` Christoph Lameter
2009-03-18 16:58             ` Christoph Lameter
2009-03-18 18:01             ` Mel Gorman
2009-03-18 18:01               ` Mel Gorman
2009-03-18 19:10               ` Christoph Lameter
2009-03-18 19:10                 ` Christoph Lameter
2009-03-19 20:43                 ` Christoph Lameter
2009-03-19 20:43                   ` Christoph Lameter
2009-03-19 21:29                   ` Mel Gorman
2009-03-19 21:29                     ` Mel Gorman
2009-03-19 22:22                     ` Christoph Lameter
2009-03-19 22:22                       ` Christoph Lameter
2009-03-19 22:33                       ` Mel Gorman
2009-03-19 22:33                         ` Mel Gorman
2009-03-19 22:42                         ` Christoph Lameter
2009-03-19 22:42                           ` Christoph Lameter
2009-03-19 22:52                           ` Mel Gorman
2009-03-19 22:52                             ` Mel Gorman
2009-03-19 22:06                   ` Mel Gorman
2009-03-19 22:06                     ` Mel Gorman
2009-03-19 22:39                     ` Christoph Lameter
2009-03-19 22:39                       ` Christoph Lameter
2009-03-19 22:21                   ` Mel Gorman
2009-03-19 22:21                     ` Mel Gorman
2009-03-19 22:24                     ` Christoph Lameter
2009-03-19 22:24                       ` Christoph Lameter
2009-03-19 23:04                       ` Mel Gorman
2009-03-19 23:04                         ` Mel Gorman
2009-03-16  9:46 ` [PATCH 21/35] Do not check for compound pages during the page allocator sanity checks Mel Gorman
2009-03-16  9:46   ` Mel Gorman
2009-03-16 16:09   ` Christoph Lameter
2009-03-16 16:09     ` Christoph Lameter
2009-03-16  9:46 ` [PATCH 22/35] Use allocation flags as an index to the zone watermark Mel Gorman
2009-03-16  9:46   ` Mel Gorman
2009-03-16 16:11   ` Christoph Lameter
2009-03-16 16:11     ` Christoph Lameter
2009-03-16  9:46 ` [PATCH 23/35] Update NR_FREE_PAGES only as necessary Mel Gorman
2009-03-16  9:46   ` Mel Gorman
2009-03-16 16:17   ` Christoph Lameter
2009-03-16 16:17     ` Christoph Lameter
2009-03-16 16:42     ` Mel Gorman
2009-03-16 16:42       ` Mel Gorman
2009-03-16 16:48       ` Christoph Lameter
2009-03-16 16:48         ` Christoph Lameter
2009-03-16 16:58         ` Mel Gorman
2009-03-16 16:58           ` Mel Gorman
2009-03-16  9:46 ` [PATCH 24/35] Convert gfp_zone() to use a table of precalculated values Mel Gorman
2009-03-16  9:46   ` Mel Gorman
2009-03-16 16:19   ` Christoph Lameter
2009-03-16 16:19     ` Christoph Lameter
2009-03-16 16:45     ` Mel Gorman
2009-03-16 16:45       ` Mel Gorman
2009-03-16  9:46 ` [PATCH 25/35] Re-sort GFP flags and fix whitespace alignment for easier reading Mel Gorman
2009-03-16  9:46   ` Mel Gorman
2009-03-16  9:46 ` [PATCH 26/35] Use the per-cpu allocator for orders up to PAGE_ALLOC_COSTLY_ORDER Mel Gorman
2009-03-16  9:46   ` Mel Gorman
2009-03-16 16:26   ` Christoph Lameter
2009-03-16 16:26     ` Christoph Lameter
2009-03-16 16:47     ` Mel Gorman
2009-03-16 16:47       ` Mel Gorman
2009-03-16  9:46 ` [PATCH 27/35] Split per-cpu list into one-list-per-migrate-type Mel Gorman
2009-03-16  9:46   ` Mel Gorman
2009-03-16  9:46 ` [PATCH 28/35] Batch free pages from migratetype per-cpu lists Mel Gorman
2009-03-16  9:46   ` Mel Gorman
2009-03-16  9:46 ` [PATCH 29/35] Do not store the PCP high and batch watermarks in the per-cpu structure Mel Gorman
2009-03-16  9:46   ` Mel Gorman
2009-03-16 16:30   ` Christoph Lameter
2009-03-16 16:30     ` Christoph Lameter
2009-03-16  9:46 ` [PATCH 30/35] Skip the PCP list search by counting the order and type of pages on list Mel Gorman
2009-03-16  9:46   ` Mel Gorman
2009-03-16 16:31   ` Christoph Lameter
2009-03-16 16:31     ` Christoph Lameter
2009-03-16 16:51     ` Mel Gorman
2009-03-16 16:51       ` Mel Gorman
2009-03-16  9:46 ` [PATCH 31/35] Optimistically check the first page on the PCP free list is suitable Mel Gorman
2009-03-16  9:46   ` Mel Gorman
2009-03-16 16:33   ` Christoph Lameter
2009-03-16 16:33     ` Christoph Lameter
2009-03-16 16:52     ` Mel Gorman
2009-03-16 16:52       ` Mel Gorman
2009-03-16  9:46 ` [PATCH 32/35] Inline next_zones_zonelist() of the zonelist scan in the fastpath Mel Gorman
2009-03-16  9:46   ` Mel Gorman
2009-03-16  9:46 ` [PATCH 33/35] Do not merge buddies until they are needed by a high-order allocation or anti-fragmentation Mel Gorman
2009-03-16  9:46   ` Mel Gorman
2009-03-16  9:46 ` [PATCH 34/35] Allow compound pages to be stored on the PCP lists Mel Gorman
2009-03-16  9:46   ` Mel Gorman
2009-03-16 16:47   ` Christoph Lameter
2009-03-16 16:47     ` Christoph Lameter
2009-03-16  9:46 ` [PATCH 35/35] Allow up to 4MB PCP lists due to compound pages Mel Gorman
2009-03-16  9:46   ` Mel Gorman
2009-03-16 10:40 ` [PATCH 00/35] Cleanup and optimise the page allocator V3 Nick Piggin
2009-03-16 10:40   ` Nick Piggin
2009-03-16 11:19   ` Mel Gorman
2009-03-16 11:19     ` Mel Gorman
2009-03-16 11:33     ` Nick Piggin
2009-03-16 11:33       ` Nick Piggin
2009-03-16 12:02       ` Mel Gorman
2009-03-16 12:02         ` Mel Gorman
2009-03-16 12:25         ` Nick Piggin
2009-03-16 12:25           ` Nick Piggin
2009-03-16 13:32           ` Mel Gorman
2009-03-16 13:32             ` Mel Gorman
2009-03-16 15:53             ` Nick Piggin
2009-03-16 15:53               ` Nick Piggin
2009-03-16 16:56               ` Mel Gorman
2009-03-16 16:56                 ` Mel Gorman
2009-03-16 17:05                 ` Nick Piggin
2009-03-16 17:05                   ` Nick Piggin
2009-03-18 15:07                   ` Mel Gorman
2009-03-18 15:07                     ` Mel Gorman
2009-03-16 11:45 ` Nick Piggin
2009-03-16 11:45   ` Nick Piggin
2009-03-16 12:11   ` Mel Gorman
2009-03-16 12:11     ` Mel Gorman
2009-03-16 12:28     ` Nick Piggin
2009-03-16 12:28       ` Nick Piggin
