* [RFC PATCH 0/4] Reclaim page capture v2
@ 2008-09-03 18:44 Andy Whitcroft
  2008-09-03 18:44 ` [PATCH 1/4] pull out the page pre-release and sanity check logic for reuse Andy Whitcroft
                   ` (3 more replies)
  0 siblings, 4 replies; 30+ messages in thread
From: Andy Whitcroft @ 2008-09-03 18:44 UTC (permalink / raw)
  To: linux-mm; +Cc: linux-kernel, KOSAKI Motohiro, Mel Gorman, Andy Whitcroft

For some time we have been looking at mechanisms for improving the availability
of larger allocations under load.  One of the options we have explored is
the capturing of pages freed under direct reclaim in order to increase the
chances of free pages coalescing before they are subject to reallocation
by racing allocators.

Following this email is a patch stack implementing page capture during
direct reclaim.  It consists of four patches.  The first two simply pull
out existing code into helpers for reuse.  The third makes buddy's use
of struct page explicit.  The fourth contains the meat of the changes,
and its leader contains a much fuller description of the feature.

This update represents a rebase to -mm and incorporates feedback from
KOSAKI Motohiro.  It also incorporates an accounting fix which was
preventing some captures.

I have done a lot of comparative testing with and without this patch
set and, in broad strokes, I am seeing improvements in hugepage allocation
(worst case size) success on all of my test systems.  These tests consist
of placing a constant stream of high order allocations on the system,
at varying rates.  The results for these various runs are then averaged
to give an overall improvement.

		Absolute	Effective
x86-64		2.48%		 4.58%
powerpc		5.55%		25.22%

x86-64 has a relatively small huge page size and so is always much more
effective at allocating huge pages.  Even there we get a measurable
improvement.  On powerpc the huge pages are much larger and much harder
to recover.  Here we see a full 25% increase in page recovery.

It should be noted that these are worst case tests, very aggressively
taking every possible page in the system.

Against: 2.6.27-rc1-mm1

Comments?

-apw

Changes since V1:
 - incorporates review feedback from KOSAKI Motohiro,
 - fixes up accounting when checking watermarks for captured pages,
 - rebases to 2.6.27-rc1-mm1,
 - incorporates review feedback from Mel.

Andy Whitcroft (4):
  pull out the page pre-release and sanity check logic for reuse
  pull out zone cpuset and watermark checks for reuse
  buddy: explicitly identify buddy field use in struct page
  capture pages freed during direct reclaim for allocation by the
    reclaimer

 include/linux/mm_types.h   |    4 +
 include/linux/page-flags.h |    6 +
 mm/internal.h              |    8 ++-
 mm/page_alloc.c            |  255 ++++++++++++++++++++++++++++++++++++++------
 mm/vmscan.c                |  115 ++++++++++++++++----
 5 files changed, 332 insertions(+), 56 deletions(-)


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 1/4] pull out the page pre-release and sanity check logic for reuse
  2008-09-03 18:44 [RFC PATCH 0/4] Reclaim page capture v2 Andy Whitcroft
@ 2008-09-03 18:44 ` Andy Whitcroft
  2008-09-04  1:24   ` Rik van Riel
  2008-09-05  1:52   ` KOSAKI Motohiro
  2008-09-03 18:44 ` [PATCH 2/4] pull out zone cpuset and watermark checks " Andy Whitcroft
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 30+ messages in thread
From: Andy Whitcroft @ 2008-09-03 18:44 UTC (permalink / raw)
  To: linux-mm; +Cc: linux-kernel, KOSAKI Motohiro, Mel Gorman, Andy Whitcroft

When we are about to release a page we perform a number of actions
on that page.  We clear down any anonymous mappings, confirm that
the page is safe to release, check for locks freed within it, and
unmap the page from the kernel should that be required.  Pull this
processing out into a helper function for reuse in a later patch.

Note that we do not convert the similar cleardown in free_hot_cold_page()
as the optimiser is unable to squash the loops when the helper is inlined.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
---
 mm/page_alloc.c |   43 ++++++++++++++++++++++++++++++-------------
 1 files changed, 30 insertions(+), 13 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index f52fcf1..b2a2c2b 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -489,6 +489,35 @@ static inline int free_pages_check(struct page *page)
 }
 
 /*
+ * Prepare this page for release to the buddy.  Sanity check the page.
+ * Returns 1 if the page is safe to free.
+ */
+static inline int free_page_prepare(struct page *page, int order)
+{
+	int i;
+	int reserved = 0;
+
+	if (PageAnon(page))
+		page->mapping = NULL;
+
+	for (i = 0 ; i < (1 << order) ; ++i)
+		reserved += free_pages_check(page + i);
+	if (reserved)
+		return 0;
+
+	if (!PageHighMem(page)) {
+		debug_check_no_locks_freed(page_address(page),
+							PAGE_SIZE << order);
+		debug_check_no_obj_freed(page_address(page),
+					   PAGE_SIZE << order);
+	}
+	arch_free_page(page, order);
+	kernel_map_pages(page, 1 << order, 0);
+
+	return 1;
+}
+
+/*
  * Frees a list of pages. 
  * Assumes all pages on list are in same zone, and of same order.
  * count is the number of pages to free.
@@ -529,22 +558,10 @@ static void free_one_page(struct zone *zone, struct page *page, int order)
 static void __free_pages_ok(struct page *page, unsigned int order)
 {
 	unsigned long flags;
-	int i;
-	int reserved = 0;
 
-	for (i = 0 ; i < (1 << order) ; ++i)
-		reserved += free_pages_check(page + i);
-	if (reserved)
+	if (!free_page_prepare(page, order))
 		return;
 
-	if (!PageHighMem(page)) {
-		debug_check_no_locks_freed(page_address(page),PAGE_SIZE<<order);
-		debug_check_no_obj_freed(page_address(page),
-					   PAGE_SIZE << order);
-	}
-	arch_free_page(page, order);
-	kernel_map_pages(page, 1 << order, 0);
-
 	local_irq_save(flags);
 	__count_vm_events(PGFREE, 1 << order);
 	free_one_page(page_zone(page), page, order);
-- 
1.6.0.rc1.258.g80295


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 2/4] pull out zone cpuset and watermark checks for reuse
  2008-09-03 18:44 [RFC PATCH 0/4] Reclaim page capture v2 Andy Whitcroft
  2008-09-03 18:44 ` [PATCH 1/4] pull out the page pre-release and sanity check logic for reuse Andy Whitcroft
@ 2008-09-03 18:44 ` Andy Whitcroft
  2008-09-04  1:24   ` Rik van Riel
  2008-09-05  1:52   ` KOSAKI Motohiro
  2008-09-03 18:44 ` [PATCH 3/4] buddy: explicitly identify buddy field use in struct page Andy Whitcroft
  2008-09-03 18:44 ` [PATCH 4/4] capture pages freed during direct reclaim for allocation by the reclaimer Andy Whitcroft
  3 siblings, 2 replies; 30+ messages in thread
From: Andy Whitcroft @ 2008-09-03 18:44 UTC (permalink / raw)
  To: linux-mm; +Cc: linux-kernel, KOSAKI Motohiro, Mel Gorman, Andy Whitcroft

When allocating we need to confirm that the zone we are about to allocate
from is acceptable to the CPUSET we are in, and that it does not violate
the zone watermarks.  Pull these checks out so we can reuse them in a
later patch.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
---
 mm/page_alloc.c |   62 ++++++++++++++++++++++++++++++++++++++----------------
 1 files changed, 43 insertions(+), 19 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index b2a2c2b..2c3874e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1274,6 +1274,44 @@ int zone_watermark_ok(struct zone *z, int order, unsigned long mark,
 	return 1;
 }
 
+/*
+ * Return 1 if this zone is an acceptable source given the cpuset
+ * constraints.
+ */
+static inline int zone_cpuset_permits(struct zone *zone,
+					int alloc_flags, gfp_t gfp_mask)
+{
+	if ((alloc_flags & ALLOC_CPUSET) &&
+	    !cpuset_zone_allowed_softwall(zone, gfp_mask))
+		return 0;
+	return 1;
+}
+
+/*
+ * Return 1 if this zone is within the watermarks specified by the
+ * allocation flags.
+ */
+static inline int zone_watermark_permits(struct zone *zone, int order,
+			int classzone_idx, int alloc_flags, gfp_t gfp_mask)
+{
+	if (!(alloc_flags & ALLOC_NO_WATERMARKS)) {
+		unsigned long mark;
+		if (alloc_flags & ALLOC_WMARK_MIN)
+			mark = zone->pages_min;
+		else if (alloc_flags & ALLOC_WMARK_LOW)
+			mark = zone->pages_low;
+		else
+			mark = zone->pages_high;
+		if (!zone_watermark_ok(zone, order, mark,
+			    classzone_idx, alloc_flags)) {
+			if (!zone_reclaim_mode ||
+					!zone_reclaim(zone, gfp_mask, order))
+				return 0;
+		}
+	}
+	return 1;
+}
+
 #ifdef CONFIG_NUMA
 /*
  * zlc_setup - Setup for "zonelist cache".  Uses cached zone data to
@@ -1427,25 +1465,11 @@ zonelist_scan:
 		if (NUMA_BUILD && zlc_active &&
 			!zlc_zone_worth_trying(zonelist, z, allowednodes))
 				continue;
-		if ((alloc_flags & ALLOC_CPUSET) &&
-			!cpuset_zone_allowed_softwall(zone, gfp_mask))
-				goto try_next_zone;
-
-		if (!(alloc_flags & ALLOC_NO_WATERMARKS)) {
-			unsigned long mark;
-			if (alloc_flags & ALLOC_WMARK_MIN)
-				mark = zone->pages_min;
-			else if (alloc_flags & ALLOC_WMARK_LOW)
-				mark = zone->pages_low;
-			else
-				mark = zone->pages_high;
-			if (!zone_watermark_ok(zone, order, mark,
-				    classzone_idx, alloc_flags)) {
-				if (!zone_reclaim_mode ||
-				    !zone_reclaim(zone, gfp_mask, order))
-					goto this_zone_full;
-			}
-		}
+		if (!zone_cpuset_permits(zone, alloc_flags, gfp_mask))
+			goto try_next_zone;
+		if (!zone_watermark_permits(zone, order, classzone_idx,
+							alloc_flags, gfp_mask))
+			goto this_zone_full;
 
 		page = buffered_rmqueue(preferred_zone, zone, order, gfp_mask);
 		if (page)
-- 
1.6.0.rc1.258.g80295


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 3/4] buddy: explicitly identify buddy field use in struct page
  2008-09-03 18:44 [RFC PATCH 0/4] Reclaim page capture v2 Andy Whitcroft
  2008-09-03 18:44 ` [PATCH 1/4] pull out the page pre-release and sanity check logic for reuse Andy Whitcroft
  2008-09-03 18:44 ` [PATCH 2/4] pull out zone cpuset and watermark checks " Andy Whitcroft
@ 2008-09-03 18:44 ` Andy Whitcroft
  2008-09-03 20:36   ` Christoph Lameter
                     ` (2 more replies)
  2008-09-03 18:44 ` [PATCH 4/4] capture pages freed during direct reclaim for allocation by the reclaimer Andy Whitcroft
  3 siblings, 3 replies; 30+ messages in thread
From: Andy Whitcroft @ 2008-09-03 18:44 UTC (permalink / raw)
  To: linux-mm; +Cc: linux-kernel, KOSAKI Motohiro, Mel Gorman, Andy Whitcroft

Explicitly define the struct page fields which buddy uses when it owns
pages.  Defines a new anonymous struct to allow additional fields to
be defined in a later patch.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
---
 include/linux/mm_types.h |    3 +++
 mm/internal.h            |    2 +-
 mm/page_alloc.c          |    4 ++--
 3 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 995c588..906d8e0 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -70,6 +70,9 @@ struct page {
 #endif
 	    struct kmem_cache *slab;	/* SLUB: Pointer to slab */
 	    struct page *first_page;	/* Compound tail pages */
+	    struct {
+		unsigned long buddy_order;     /* buddy: free page order */
+	    };
 	};
 	union {
 		pgoff_t index;		/* Our offset within mapping. */
diff --git a/mm/internal.h b/mm/internal.h
index c0e4859..fcedcd0 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -58,7 +58,7 @@ extern void __free_pages_bootmem(struct page *page, unsigned int order);
 static inline unsigned long page_order(struct page *page)
 {
 	VM_BUG_ON(!PageBuddy(page));
-	return page_private(page);
+	return page->buddy_order;
 }
 
 extern int mlock_vma_pages_range(struct vm_area_struct *vma,
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 2c3874e..db0dbd6 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -331,7 +331,7 @@ static inline void prep_zero_page(struct page *page, int order, gfp_t gfp_flags)
 
 static inline void set_page_order(struct page *page, int order)
 {
-	set_page_private(page, order);
+	page->buddy_order = order;
 	__SetPageBuddy(page);
 #ifdef CONFIG_PAGE_OWNER
 		page->order = -1;
@@ -341,7 +341,7 @@ static inline void set_page_order(struct page *page, int order)
 static inline void rmv_page_order(struct page *page)
 {
 	__ClearPageBuddy(page);
-	set_page_private(page, 0);
+	page->buddy_order = 0;
 }
 
 /*
-- 
1.6.0.rc1.258.g80295


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 4/4] capture pages freed during direct reclaim for allocation by the reclaimer
  2008-09-03 18:44 [RFC PATCH 0/4] Reclaim page capture v2 Andy Whitcroft
                   ` (2 preceding siblings ...)
  2008-09-03 18:44 ` [PATCH 3/4] buddy: explicitly identify buddy field use in struct page Andy Whitcroft
@ 2008-09-03 18:44 ` Andy Whitcroft
  2008-09-03 20:35   ` Christoph Lameter
  2008-09-03 20:53   ` Andy Whitcroft
  3 siblings, 2 replies; 30+ messages in thread
From: Andy Whitcroft @ 2008-09-03 18:44 UTC (permalink / raw)
  To: linux-mm; +Cc: linux-kernel, KOSAKI Motohiro, Mel Gorman, Andy Whitcroft

When a process enters direct reclaim it will expend effort identifying
and releasing pages in the hope of obtaining a page.  However, as these
pages are released asynchronously there is every possibility that the
pages will have been consumed by other allocators before the reclaimer
gets a look in.  This is particularly problematic where the reclaimer is
attempting to allocate a higher order page.  It is highly likely that
a parallel allocation will consume lower order constituent pages as we
release them, preventing them coalescing into the higher order page the
reclaimer desires.

This patch set attempts to address this for allocations above
PAGE_ALLOC_COSTLY_ORDER by temporarily collecting the pages we are
releasing onto a local free list.  Instead of freeing them to the main
buddy lists, pages are collected and coalesced on this per direct
reclaimer free list.  Pages which are freed by other processes are also
considered: where they coalesce with a page already under capture they
will be moved to the capture list.  When pressure has been applied to a
zone we then consult the capture list and if there is an appropriately
sized page available it is taken immediately and the remainder returned
to the free pool.  Capture is only enabled when the reclaimer's
allocation order exceeds PAGE_ALLOC_COSTLY_ORDER as free pages below
this order should naturally occur in large numbers following regular
reclaim.

Thanks go to Mel Gorman for numerous discussions during the development
of this patch and for his repeated reviews.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 4/4] capture pages freed during direct reclaim for allocation by the reclaimer
  2008-09-03 18:44 ` [PATCH 4/4] capture pages freed during direct reclaim for allocation by the reclaimer Andy Whitcroft
@ 2008-09-03 20:35   ` Christoph Lameter
  2008-09-03 20:53   ` Andy Whitcroft
  1 sibling, 0 replies; 30+ messages in thread
From: Christoph Lameter @ 2008-09-03 20:35 UTC (permalink / raw)
  To: Andy Whitcroft; +Cc: linux-mm, linux-kernel, KOSAKI Motohiro, Mel Gorman


> Signed-off-by: Andy Whitcroft <apw@shadowen.org>
> 
> --

You forgot to include the patch.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3/4] buddy: explicitly identify buddy field use in struct page
  2008-09-03 18:44 ` [PATCH 3/4] buddy: explicitly identify buddy field use in struct page Andy Whitcroft
@ 2008-09-03 20:36   ` Christoph Lameter
  2008-09-04  1:25   ` Rik van Riel
  2008-09-05  1:52   ` KOSAKI Motohiro
  2 siblings, 0 replies; 30+ messages in thread
From: Christoph Lameter @ 2008-09-03 20:36 UTC (permalink / raw)
  To: Andy Whitcroft; +Cc: linux-mm, linux-kernel, KOSAKI Motohiro, Mel Gorman

Andy Whitcroft wrote:
> Explicitly define the struct page fields which buddy uses when it owns
> pages.  Defines a new anonymous struct to allow additional fields to
> be defined in a later patch.

Good. I have a similar patch floating around.

Reviewed-by: Christoph Lameter <cl@linux-foundation.org>

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 4/4] capture pages freed during direct reclaim for allocation by the reclaimer
  2008-09-03 18:44 ` [PATCH 4/4] capture pages freed during direct reclaim for allocation by the reclaimer Andy Whitcroft
  2008-09-03 20:35   ` Christoph Lameter
@ 2008-09-03 20:53   ` Andy Whitcroft
  2008-09-03 21:00     ` Christoph Lameter
                       ` (2 more replies)
  1 sibling, 3 replies; 30+ messages in thread
From: Andy Whitcroft @ 2008-09-03 20:53 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, KOSAKI Motohiro, Mel Gorman, Andy Whitcroft,
	Christoph Lameter

[Doh, as pointed out by Christoph the patch was missing from this one...]

When a process enters direct reclaim it will expend effort identifying
and releasing pages in the hope of obtaining a page.  However, as these
pages are released asynchronously there is every possibility that the
pages will have been consumed by other allocators before the reclaimer
gets a look in.  This is particularly problematic where the reclaimer is
attempting to allocate a higher order page.  It is highly likely that
a parallel allocation will consume lower order constituent pages as we
release them, preventing them coalescing into the higher order page the
reclaimer desires.

This patch set attempts to address this for allocations above
PAGE_ALLOC_COSTLY_ORDER by temporarily collecting the pages we are
releasing onto a local free list.  Instead of freeing them to the main
buddy lists, pages are collected and coalesced on this per direct
reclaimer free list.  Pages which are freed by other processes are also
considered: where they coalesce with a page already under capture they
will be moved to the capture list.  When pressure has been applied to a
zone we then consult the capture list and if there is an appropriately
sized page available it is taken immediately and the remainder returned
to the free pool.  Capture is only enabled when the reclaimer's
allocation order exceeds PAGE_ALLOC_COSTLY_ORDER as free pages below
this order should naturally occur in large numbers following regular
reclaim.

Thanks go to Mel Gorman for numerous discussions during the development
of this patch and for his repeated reviews.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
---
 include/linux/mm_types.h   |    1 +
 include/linux/page-flags.h |    6 ++
 mm/internal.h              |    6 ++
 mm/page_alloc.c            |  154 +++++++++++++++++++++++++++++++++++++++++++-
 mm/vmscan.c                |  115 +++++++++++++++++++++++++++------
 5 files changed, 261 insertions(+), 21 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 906d8e0..cd2b549 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -72,6 +72,7 @@ struct page {
 	    struct page *first_page;	/* Compound tail pages */
 	    struct {
 		unsigned long buddy_order;     /* buddy: free page order */
+		struct list_head *buddy_free;  /* buddy: free list pointer */
 	    };
 	};
 	union {
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index c0ac9e0..b0d8bc7 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -117,6 +117,9 @@ enum pageflags {
 	/* SLUB */
 	PG_slub_frozen = PG_active,
 	PG_slub_debug = PG_error,
+
+	/* BUDDY overlays. */
+	PG_buddy_capture = PG_owner_priv_1,
 };
 
 #ifndef __GENERATING_BOUNDS_H
@@ -208,6 +211,9 @@ __PAGEFLAG(SlubDebug, slub_debug)
  */
 TESTPAGEFLAG(Writeback, writeback) TESTSCFLAG(Writeback, writeback)
 __PAGEFLAG(Buddy, buddy)
+PAGEFLAG(BuddyCapture, buddy_capture)	/* A buddy page, but reserved. */
+	__SETPAGEFLAG(BuddyCapture, buddy_capture)
+	__CLEARPAGEFLAG(BuddyCapture, buddy_capture)
 PAGEFLAG(MappedToDisk, mappedtodisk)
 
 /* PG_readahead is only used for file reads; PG_reclaim is only for writes */
diff --git a/mm/internal.h b/mm/internal.h
index fcedcd0..f266a35 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -251,4 +251,10 @@ int __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
 		     unsigned long start, int len, int flags,
 		     struct page **pages, struct vm_area_struct **vmas);
 
+extern struct page *capture_alloc_or_return(struct zone *, struct zone *,
+					struct list_head *, int, int, gfp_t);
+void capture_one_page(struct list_head *, struct zone *, struct page *, int);
+unsigned long try_to_free_pages_capture(struct page **, struct zonelist *,
+					nodemask_t *, int, gfp_t, int);
+
 #endif
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index db0dbd6..992f02c 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -428,6 +428,51 @@ static inline int page_is_buddy(struct page *page, struct page *buddy,
  * -- wli
  */
 
+static inline void __capture_one_page(struct list_head *capture_list,
+		struct page *page, struct zone *zone, unsigned int order)
+{
+	unsigned long page_idx;
+	unsigned long order_size = 1UL << order;
+
+	if (unlikely(PageCompound(page)))
+		destroy_compound_page(page, order);
+
+	page_idx = page_to_pfn(page) & ((1 << MAX_ORDER) - 1);
+
+	VM_BUG_ON(page_idx & (order_size - 1));
+	VM_BUG_ON(bad_range(zone, page));
+
+	while (order < MAX_ORDER-1) {
+		unsigned long combined_idx;
+		struct page *buddy;
+
+		buddy = __page_find_buddy(page, page_idx, order);
+		if (!page_is_buddy(page, buddy, order))
+			break;
+
+		/* Our buddy is free, merge with it and move up one order. */
+		list_del(&buddy->lru);
+		if (PageBuddyCapture(buddy)) {
+			buddy->buddy_free = 0;
+			__ClearPageBuddyCapture(buddy);
+		} else {
+			zone->free_area[order].nr_free--;
+			__mod_zone_page_state(zone,
+					NR_FREE_PAGES, -(1UL << order));
+		}
+		rmv_page_order(buddy);
+		combined_idx = __find_combined_index(page_idx, order);
+		page = page + (combined_idx - page_idx);
+		page_idx = combined_idx;
+		order++;
+	}
+	set_page_order(page, order);
+	__SetPageBuddyCapture(page);
+	page->buddy_free = capture_list;
+
+	list_add(&page->lru, capture_list);
+}
+
 static inline void __free_one_page(struct page *page,
 		struct zone *zone, unsigned int order)
 {
@@ -451,6 +496,12 @@ static inline void __free_one_page(struct page *page,
 		buddy = __page_find_buddy(page, page_idx, order);
 		if (!page_is_buddy(page, buddy, order))
 			break;
+		if (PageBuddyCapture(buddy)) {
+			__mod_zone_page_state(zone,
+					NR_FREE_PAGES, -(1UL << order));
+			return __capture_one_page(buddy->buddy_free,
+							page, zone, order);
+		}
 
 		/* Our buddy is free, merge with it and move up one order. */
 		list_del(&buddy->lru);
@@ -555,6 +606,19 @@ static void free_one_page(struct zone *zone, struct page *page, int order)
 	spin_unlock(&zone->lock);
 }
 
+void capture_one_page(struct list_head *free_list,
+			struct zone *zone, struct page *page, int order)
+{
+	unsigned long flags;
+
+	if (!free_page_prepare(page, order))
+		return;
+
+	spin_lock_irqsave(&zone->lock, flags);
+	__capture_one_page(free_list, page, zone, order);
+	spin_unlock_irqrestore(&zone->lock, flags);
+}
+
 static void __free_pages_ok(struct page *page, unsigned int order)
 {
 	unsigned long flags;
@@ -629,6 +693,23 @@ static inline void expand(struct zone *zone, struct page *page,
 }
 
 /*
+ * Convert the passed page from actual_order to desired_order.
+ * Given a page of actual_order release all but the desired_order sized
+ * buddy at its start.
+ */
+void __carve_off(struct page *page, unsigned long actual_order,
+					unsigned long desired_order)
+{
+	int migratetype = get_pageblock_migratetype(page);
+	struct zone *zone = page_zone(page);
+	struct free_area *area = &(zone->free_area[actual_order]);
+
+	__mod_zone_page_state(zone, NR_FREE_PAGES,
+				(1UL << actual_order) - (1UL << desired_order));
+	expand(zone, page, desired_order, actual_order, area, migratetype);
+}
+
+/*
  * This page is about to be returned from the page allocator
  */
 static int prep_new_page(struct page *page, int order, gfp_t gfp_flags)
@@ -1667,11 +1748,15 @@ nofail_alloc:
 	reclaim_state.reclaimed_slab = 0;
 	p->reclaim_state = &reclaim_state;
 
-	did_some_progress = try_to_free_pages(zonelist, order, gfp_mask);
+	did_some_progress = try_to_free_pages_capture(&page, zonelist, nodemask,
+						order, gfp_mask, alloc_flags);
 
 	p->reclaim_state = NULL;
 	p->flags &= ~PF_MEMALLOC;
 
+	if (page)
+		goto got_pg;
+
 	cond_resched();
 
 	if (order != 0)
@@ -4815,6 +4900,73 @@ out:
 	spin_unlock_irqrestore(&zone->lock, flags);
 }
 
+#define lru_to_page(_head) (list_entry((_head)->prev, struct page, lru))
+
+/*
+ * Run through the accumulated list of captured pages and the first
+ * which is big enough to satisfy the original allocation.  Free
+ * the remainder of that page and all other pages.
+ */
+struct page *capture_alloc_or_return(struct zone *zone,
+		struct zone *preferred_zone, struct list_head *capture_list,
+		int order, int alloc_flags, gfp_t gfp_mask)
+{
+	struct page *capture_page = 0;
+	unsigned long flags;
+	int classzone_idx = zone_idx(preferred_zone);
+
+	spin_lock_irqsave(&zone->lock, flags);
+
+	while (!list_empty(capture_list)) {
+		struct page *page;
+		int pg_order;
+
+		page = lru_to_page(capture_list);
+		list_del(&page->lru);
+		pg_order = page_order(page);
+
+		/*
+		 * Clear out our buddy size and list information before
+		 * releasing or allocating the page.
+		 */
+		rmv_page_order(page);
+		page->buddy_free = 0;
+		ClearPageBuddyCapture(page);
+
+		if (!capture_page && pg_order >= order) {
+			__carve_off(page, pg_order, order);
+			capture_page = page;
+		} else
+			__free_one_page(page, zone, pg_order);
+	}
+
+	/*
+	 * Ensure that this capture would not violate the watermarks.
+	 * Subtle, we actually already have the page outside the watermarks
+	 * so check if we can allocate an order 0 page.
+	 */
+	if (capture_page &&
+	    (!zone_cpuset_permits(zone, alloc_flags, gfp_mask) ||
+	     !zone_watermark_permits(zone, 0, classzone_idx,
+					     alloc_flags, gfp_mask))) {
+		__free_one_page(capture_page, zone, order);
+		capture_page = NULL;
+	}
+
+	if (capture_page)
+		__count_zone_vm_events(PGALLOC, zone, 1 << order);
+
+	zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
+	zone->pages_scanned = 0;
+
+	spin_unlock_irqrestore(&zone->lock, flags);
+
+	if (capture_page)
+		prep_new_page(capture_page, order, gfp_mask);
+
+	return capture_page;
+}
+
 #ifdef CONFIG_MEMORY_HOTREMOVE
 /*
  * All pages in the range must be isolated before calling this.
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 85ce427..1e11c12 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -56,6 +56,8 @@ struct scan_control {
 	/* This context's GFP mask */
 	gfp_t gfp_mask;
 
+	int alloc_flags;
+
 	int may_writepage;
 
 	/* Can pages be swapped as part of reclaim? */
@@ -81,6 +83,12 @@ struct scan_control {
 			unsigned long *scanned, int order, int mode,
 			struct zone *z, struct mem_cgroup *mem_cont,
 			int active, int file);
+
+	/* Captured page. */
+	struct page **capture;
+	
+	/* Nodemask for acceptable allocations. */
+	nodemask_t *nodemask;
 };
 
 #define lru_to_page(_head) (list_entry((_head)->prev, struct page, lru))
@@ -560,7 +568,8 @@ void putback_lru_page(struct page *page)
 /*
  * shrink_page_list() returns the number of reclaimed pages
  */
-static unsigned long shrink_page_list(struct list_head *page_list,
+static unsigned long shrink_page_list(struct list_head *free_list,
+					struct list_head *page_list,
 					struct scan_control *sc,
 					enum pageout_io sync_writeback)
 {
@@ -742,9 +751,13 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 		unlock_page(page);
 free_it:
 		nr_reclaimed++;
-		if (!pagevec_add(&freed_pvec, page)) {
-			__pagevec_free(&freed_pvec);
-			pagevec_reinit(&freed_pvec);
+		if (free_list) {
+			capture_one_page(free_list, page_zone(page), page, 0);
+		} else {
+			if (!pagevec_add(&freed_pvec, page)) {
+				__pagevec_free(&freed_pvec);
+				pagevec_reinit(&freed_pvec);
+			}
 		}
 		continue;
 
@@ -1024,7 +1037,8 @@ int isolate_lru_page(struct page *page)
  * shrink_inactive_list() is a helper for shrink_zone().  It returns the number
  * of reclaimed pages
  */
-static unsigned long shrink_inactive_list(unsigned long max_scan,
+static unsigned long shrink_inactive_list(struct list_head *free_list,
+			unsigned long max_scan,
 			struct zone *zone, struct scan_control *sc,
 			int priority, int file)
 {
@@ -1083,7 +1097,8 @@ static unsigned long shrink_inactive_list(unsigned long max_scan,
 		spin_unlock_irq(&zone->lru_lock);
 
 		nr_scanned += nr_scan;
-		nr_freed = shrink_page_list(&page_list, sc, PAGEOUT_IO_ASYNC);
+		nr_freed = shrink_page_list(free_list, &page_list,
+							sc, PAGEOUT_IO_ASYNC);
 
 		/*
 		 * If we are direct reclaiming for contiguous pages and we do
@@ -1102,8 +1117,8 @@ static unsigned long shrink_inactive_list(unsigned long max_scan,
 			nr_active = clear_active_flags(&page_list, count);
 			count_vm_events(PGDEACTIVATE, nr_active);
 
-			nr_freed += shrink_page_list(&page_list, sc,
-							PAGEOUT_IO_SYNC);
+			nr_freed += shrink_page_list(free_list, &page_list,
+							sc, PAGEOUT_IO_SYNC);
 		}
 
 		nr_reclaimed += nr_freed;
@@ -1337,7 +1352,8 @@ static void shrink_active_list(unsigned long nr_pages, struct zone *zone,
 	pagevec_release(&pvec);
 }
 
-static unsigned long shrink_list(enum lru_list lru, unsigned long nr_to_scan,
+static unsigned long shrink_list(struct list_head *free_list, 
+	enum lru_list lru, unsigned long nr_to_scan,
 	struct zone *zone, struct scan_control *sc, int priority)
 {
 	int file = is_file_lru(lru);
@@ -1352,7 +1368,8 @@ static unsigned long shrink_list(enum lru_list lru, unsigned long nr_to_scan,
 		shrink_active_list(nr_to_scan, zone, sc, priority, file);
 		return 0;
 	}
-	return shrink_inactive_list(nr_to_scan, zone, sc, priority, file);
+	return shrink_inactive_list(free_list, nr_to_scan, zone,
+							sc, priority, file);
 }
 
 /*
@@ -1444,7 +1461,7 @@ static void get_scan_ratio(struct zone *zone, struct scan_control * sc,
  * This is a basic per-zone page freer.  Used by both kswapd and direct reclaim.
  */
 static unsigned long shrink_zone(int priority, struct zone *zone,
-				struct scan_control *sc)
+			struct zone *preferred_zone, struct scan_control *sc)
 {
 	unsigned long nr[NR_LRU_LISTS];
 	unsigned long nr_to_scan;
@@ -1452,6 +1469,23 @@ static unsigned long shrink_zone(int priority, struct zone *zone,
 	unsigned long percent[2];	/* anon @ 0; file @ 1 */
 	enum lru_list l;
 
+	struct list_head __capture_list;
+	struct list_head *capture_list = NULL;
+	struct page *capture_page;
+
+	/*
+	 * When direct reclaimers are asking for larger orders
+	 * capture pages for them.  There is no point if we already
+	 * have an acceptable page or if this zone is not within the
+	 * nodemask.
+	 */
+	if (sc->order > PAGE_ALLOC_COSTLY_ORDER &&
+	    sc->capture && !*(sc->capture) && (sc->nodemask == NULL ||
+	    node_isset(zone_to_nid(zone), *sc->nodemask))) {
+		capture_list = &__capture_list;
+		INIT_LIST_HEAD(capture_list);
+	}
+
 	get_scan_ratio(zone, sc, percent);
 
 	for_each_evictable_lru(l) {
@@ -1481,6 +1515,8 @@ static unsigned long shrink_zone(int priority, struct zone *zone,
 		}
 	}
 
+	capture_page = NULL;
+
 	while (nr[LRU_INACTIVE_ANON] || nr[LRU_ACTIVE_FILE] ||
 					nr[LRU_INACTIVE_FILE]) {
 		for_each_evictable_lru(l) {
@@ -1489,10 +1525,18 @@ static unsigned long shrink_zone(int priority, struct zone *zone,
 					(unsigned long)sc->swap_cluster_max);
 				nr[l] -= nr_to_scan;
 
-				nr_reclaimed += shrink_list(l, nr_to_scan,
+				nr_reclaimed += shrink_list(capture_list,
+							l, nr_to_scan,
 							zone, sc, priority);
 			}
 		}
+		if (capture_list) {
+			capture_page = capture_alloc_or_return(zone,
+				preferred_zone, capture_list, sc->order,
+				sc->alloc_flags, sc->gfp_mask);
+			if (capture_page)
+				capture_list = NULL;
+		}
 	}
 
 	/*
@@ -1504,6 +1548,9 @@ static unsigned long shrink_zone(int priority, struct zone *zone,
 	else if (!scan_global_lru(sc))
 		shrink_active_list(SWAP_CLUSTER_MAX, zone, sc, priority, 0);
 
+	if (capture_page)
+		*(sc->capture) = capture_page;
+
 	throttle_vm_writeout(sc->gfp_mask);
 	return nr_reclaimed;
 }
@@ -1525,7 +1572,7 @@ static unsigned long shrink_zone(int priority, struct zone *zone,
  * scan then give up on it.
  */
 static unsigned long shrink_zones(int priority, struct zonelist *zonelist,
-					struct scan_control *sc)
+		struct zone *preferred_zone, struct scan_control *sc)
 {
 	enum zone_type high_zoneidx = gfp_zone(sc->gfp_mask);
 	unsigned long nr_reclaimed = 0;
@@ -1559,7 +1606,7 @@ static unsigned long shrink_zones(int priority, struct zonelist *zonelist,
 							priority);
 		}
 
-		nr_reclaimed += shrink_zone(priority, zone, sc);
+		nr_reclaimed += shrink_zone(priority, zone, preferred_zone, sc);
 	}
 
 	return nr_reclaimed;
@@ -1592,8 +1639,14 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
 	unsigned long lru_pages = 0;
 	struct zoneref *z;
 	struct zone *zone;
+	struct zone *preferred_zone;
 	enum zone_type high_zoneidx = gfp_zone(sc->gfp_mask);
 
+	/* This should never fail as we should be scanning a real zonelist. */
+	(void)first_zones_zonelist(zonelist, high_zoneidx, sc->nodemask,
+							&preferred_zone);
+	BUG_ON(!preferred_zone);
+
 	delayacct_freepages_start();
 
 	if (scan_global_lru(sc))
@@ -1615,7 +1668,8 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
 		sc->nr_scanned = 0;
 		if (!priority)
 			disable_swap_token();
-		nr_reclaimed += shrink_zones(priority, zonelist, sc);
+		nr_reclaimed += shrink_zones(priority, zonelist,
+							preferred_zone, sc);
 		/*
 		 * Don't shrink slabs when reclaiming memory from
 		 * over limit cgroups
@@ -1680,11 +1734,13 @@ out:
 	return ret;
 }
 
-unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
-								gfp_t gfp_mask)
+unsigned long try_to_free_pages_capture(struct page **capture_pagep,
+		struct zonelist *zonelist, nodemask_t *nodemask,
+		int order, gfp_t gfp_mask, int alloc_flags)
 {
 	struct scan_control sc = {
 		.gfp_mask = gfp_mask,
+		.alloc_flags = alloc_flags,
 		.may_writepage = !laptop_mode,
 		.swap_cluster_max = SWAP_CLUSTER_MAX,
 		.may_swap = 1,
@@ -1692,17 +1748,28 @@ unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
 		.order = order,
 		.mem_cgroup = NULL,
 		.isolate_pages = isolate_pages_global,
+		.capture = capture_pagep,
+		.nodemask = nodemask,
 	};
 
 	return do_try_to_free_pages(zonelist, &sc);
 }
 
+unsigned long try_to_free_pages(struct zonelist *zonelist,
+						int order, gfp_t gfp_mask)
+{
+	return try_to_free_pages_capture(NULL, zonelist, NULL,
+							order, gfp_mask, 0);
+}
+
 #ifdef CONFIG_CGROUP_MEM_RES_CTLR
 
 unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *mem_cont,
 						gfp_t gfp_mask)
 {
 	struct scan_control sc = {
+		.gfp_mask = gfp_mask,
+		.alloc_flags = 0,
 		.may_writepage = !laptop_mode,
 		.may_swap = 1,
 		.swap_cluster_max = SWAP_CLUSTER_MAX,
@@ -1710,6 +1777,8 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *mem_cont,
 		.order = 0,
 		.mem_cgroup = mem_cont,
 		.isolate_pages = mem_cgroup_isolate_pages,
+		.capture = NULL,
+		.nodemask = NULL,
 	};
 	struct zonelist *zonelist;
 
@@ -1751,12 +1820,15 @@ static unsigned long balance_pgdat(pg_data_t *pgdat, int order)
 	struct reclaim_state *reclaim_state = current->reclaim_state;
 	struct scan_control sc = {
 		.gfp_mask = GFP_KERNEL,
+		.alloc_flags = 0,
 		.may_swap = 1,
 		.swap_cluster_max = SWAP_CLUSTER_MAX,
 		.swappiness = vm_swappiness,
 		.order = order,
 		.mem_cgroup = NULL,
 		.isolate_pages = isolate_pages_global,
+		.capture = NULL,
+		.nodemask = NULL,
 	};
 	/*
 	 * temp_priority is used to remember the scanning priority at which
@@ -1852,7 +1924,8 @@ loop_again:
 			 */
 			if (!zone_watermark_ok(zone, order, 8*zone->pages_high,
 						end_zone, 0))
-				nr_reclaimed += shrink_zone(priority, zone, &sc);
+				nr_reclaimed += shrink_zone(priority,
+							zone, zone, &sc);
 			reclaim_state->reclaimed_slab = 0;
 			nr_slab = shrink_slab(sc.nr_scanned, GFP_KERNEL,
 						lru_pages);
@@ -2054,7 +2127,7 @@ static unsigned long shrink_all_zones(unsigned long nr_pages, int prio,
 				nr_to_scan = min(nr_pages,
 					zone_page_state(zone,
 							NR_LRU_BASE + l));
-				ret += shrink_list(l, nr_to_scan, zone,
+				ret += shrink_list(NULL, l, nr_to_scan, zone,
 								sc, prio);
 				if (ret >= nr_pages)
 					return ret;
@@ -2081,6 +2154,7 @@ unsigned long shrink_all_memory(unsigned long nr_pages)
 	struct reclaim_state reclaim_state;
 	struct scan_control sc = {
 		.gfp_mask = GFP_KERNEL,
+		.alloc_flags = 0,
 		.may_swap = 0,
 		.swap_cluster_max = nr_pages,
 		.may_writepage = 1,
@@ -2269,6 +2343,7 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
 		.swap_cluster_max = max_t(unsigned long, nr_pages,
 					SWAP_CLUSTER_MAX),
 		.gfp_mask = gfp_mask,
+		.alloc_flags = 0,
 		.swappiness = vm_swappiness,
 		.isolate_pages = isolate_pages_global,
 	};
@@ -2295,7 +2370,7 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
 		priority = ZONE_RECLAIM_PRIORITY;
 		do {
 			note_zone_scanning_priority(zone, priority);
-			nr_reclaimed += shrink_zone(priority, zone, &sc);
+			nr_reclaimed += shrink_zone(priority, zone, zone, &sc);
 			priority--;
 		} while (priority >= 0 && nr_reclaimed < nr_pages);
 	}
-- 
1.6.0.rc1.258.g80295


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH 4/4] capture pages freed during direct reclaim for allocation by the reclaimer
  2008-09-03 20:53   ` Andy Whitcroft
@ 2008-09-03 21:00     ` Christoph Lameter
  2008-09-04  6:38       ` Peter Zijlstra
                         ` (2 more replies)
  2008-09-04  7:20     ` Peter Zijlstra
  2008-09-04  7:59     ` KOSAKI Motohiro
  2 siblings, 3 replies; 30+ messages in thread
From: Christoph Lameter @ 2008-09-03 21:00 UTC (permalink / raw)
  To: Andy Whitcroft; +Cc: linux-mm, linux-kernel, KOSAKI Motohiro, Mel Gorman

Andy Whitcroft wrote:

>  
>  #ifndef __GENERATING_BOUNDS_H
> @@ -208,6 +211,9 @@ __PAGEFLAG(SlubDebug, slub_debug)
>   */
>  TESTPAGEFLAG(Writeback, writeback) TESTSCFLAG(Writeback, writeback)
>  __PAGEFLAG(Buddy, buddy)
> +PAGEFLAG(BuddyCapture, buddy_capture)	/* A buddy page, but reserved. */
> +	__SETPAGEFLAG(BuddyCapture, buddy_capture)
> +	__CLEARPAGEFLAG(BuddyCapture, buddy_capture)

Doesn't __PAGEFLAG do what you want without having to explicitly specify
__SET/__CLEAR?


How does page allocator fastpath behavior fare with this patch?

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/4] pull out the page pre-release and sanity check logic for reuse
  2008-09-03 18:44 ` [PATCH 1/4] pull out the page pre-release and sanity check logic for reuse Andy Whitcroft
@ 2008-09-04  1:24   ` Rik van Riel
  2008-09-05  1:52   ` KOSAKI Motohiro
  1 sibling, 0 replies; 30+ messages in thread
From: Rik van Riel @ 2008-09-04  1:24 UTC (permalink / raw)
  To: Andy Whitcroft
  Cc: linux-mm, linux-kernel, KOSAKI Motohiro, Mel Gorman, Andy Whitcroft

On Wed,  3 Sep 2008 19:44:09 +0100
Andy Whitcroft <apw@shadowen.org> wrote:

> Signed-off-by: Andy Whitcroft <apw@shadowen.org>

Reviewed-by: Rik van Riel <riel@redhat.com>

-- 
All rights reversed.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/4] pull out zone cpuset and watermark checks for reuse
  2008-09-03 18:44 ` [PATCH 2/4] pull out zone cpuset and watermark checks " Andy Whitcroft
@ 2008-09-04  1:24   ` Rik van Riel
  2008-09-05  1:52   ` KOSAKI Motohiro
  1 sibling, 0 replies; 30+ messages in thread
From: Rik van Riel @ 2008-09-04  1:24 UTC (permalink / raw)
  To: Andy Whitcroft
  Cc: linux-mm, linux-kernel, KOSAKI Motohiro, Mel Gorman, Andy Whitcroft

On Wed,  3 Sep 2008 19:44:10 +0100
Andy Whitcroft <apw@shadowen.org> wrote:

> When allocating we need to confirm that the zone we are about to
> allocate from is acceptable to the CPUSET we are in, and that it does
> not violate the zone watermarks.  Pull these checks out so we can
> reuse them in a later patch.
> 
> Signed-off-by: Andy Whitcroft <apw@shadowen.org>

Reviewed-by: Rik van Riel <riel@redhat.com>

-- 
All rights reversed.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3/4] buddy: explicitly identify buddy field use in struct page
  2008-09-03 18:44 ` [PATCH 3/4] buddy: explicitly identify buddy field use in struct page Andy Whitcroft
  2008-09-03 20:36   ` Christoph Lameter
@ 2008-09-04  1:25   ` Rik van Riel
  2008-09-05  1:52   ` KOSAKI Motohiro
  2 siblings, 0 replies; 30+ messages in thread
From: Rik van Riel @ 2008-09-04  1:25 UTC (permalink / raw)
  To: Andy Whitcroft
  Cc: linux-mm, linux-kernel, KOSAKI Motohiro, Mel Gorman, Andy Whitcroft

On Wed,  3 Sep 2008 19:44:11 +0100
Andy Whitcroft <apw@shadowen.org> wrote:

> Explicitly define the struct page fields which buddy uses when it owns
> pages.  Defines a new anonymous struct to allow additional fields to
> be defined in a later patch.
> 
> Signed-off-by: Andy Whitcroft <apw@shadowen.org>

Reviewed-by: Rik van Riel <riel@redhat.com>

-- 
All rights reversed.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 4/4] capture pages freed during direct reclaim for allocation by the reclaimer
  2008-09-03 21:00     ` Christoph Lameter
@ 2008-09-04  6:38       ` Peter Zijlstra
  2008-09-04 14:18         ` Christoph Lameter
  2008-09-04  8:11       ` KOSAKI Motohiro
  2008-09-04  8:58       ` Andy Whitcroft
  2 siblings, 1 reply; 30+ messages in thread
From: Peter Zijlstra @ 2008-09-04  6:38 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: Andy Whitcroft, linux-mm, linux-kernel, KOSAKI Motohiro, Mel Gorman

On Wed, 2008-09-03 at 16:00 -0500, Christoph Lameter wrote:
> Andy Whitcroft wrote:
> 
> >  
> >  #ifndef __GENERATING_BOUNDS_H
> > @@ -208,6 +211,9 @@ __PAGEFLAG(SlubDebug, slub_debug)
> >   */
> >  TESTPAGEFLAG(Writeback, writeback) TESTSCFLAG(Writeback, writeback)
> >  __PAGEFLAG(Buddy, buddy)
> > +PAGEFLAG(BuddyCapture, buddy_capture)	/* A buddy page, but reserved. */
> > +	__SETPAGEFLAG(BuddyCapture, buddy_capture)
> > +	__CLEARPAGEFLAG(BuddyCapture, buddy_capture)
> 
> > Doesn't __PAGEFLAG do what you want without having to explicitly specify
> __SET/__CLEAR?

PAGEFLAG() combined with __PAGEFLAG()

does TESTPAGEFLAG() double.
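
For reference, those macros in the 2.6.27-era include/linux/page-flags.h
expand roughly as below (paraphrased and abbreviated from the header),
which is why PAGEFLAG() together with __PAGEFLAG() would define the
Page<name> test function twice:

	#define TESTPAGEFLAG(uname, lname)				\
	static inline int Page##uname(struct page *page)		\
		{ return test_bit(PG_##lname, &page->flags); }

	/* both of these pull in TESTPAGEFLAG() above */
	#define PAGEFLAG(uname, lname) TESTPAGEFLAG(uname, lname)	\
		SETPAGEFLAG(uname, lname) CLEARPAGEFLAG(uname, lname)

	#define __PAGEFLAG(uname, lname) TESTPAGEFLAG(uname, lname)	\
		__SETPAGEFLAG(uname, lname) __CLEARPAGEFLAG(uname, lname)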


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 4/4] capture pages freed during direct reclaim for allocation by the reclaimer
  2008-09-03 20:53   ` Andy Whitcroft
  2008-09-03 21:00     ` Christoph Lameter
@ 2008-09-04  7:20     ` Peter Zijlstra
  2008-09-04 11:35       ` Andy Whitcroft
  2008-09-04  7:59     ` KOSAKI Motohiro
  2 siblings, 1 reply; 30+ messages in thread
From: Peter Zijlstra @ 2008-09-04  7:20 UTC (permalink / raw)
  To: Andy Whitcroft
  Cc: linux-mm, linux-kernel, KOSAKI Motohiro, Mel Gorman, Christoph Lameter

On Wed, 2008-09-03 at 21:53 +0100, Andy Whitcroft wrote:
> [Doh, as pointed out by Christoph the patch was missing from this one...]
> 
> When a process enters direct reclaim it will expend effort identifying
> and releasing pages in the hope of obtaining a page.  However, as these
> pages are released asynchronously there is every possibility that the
> pages will have been consumed by other allocators before the reclaimer
> gets a look in.  This is particularly problematic where the reclaimer is
> attempting to allocate a higher order page.  It is highly likely that
> a parallel allocation will consume lower order constituent pages as we
> release them, preventing them coalescing into the higher order page the
> reclaimer desires.
> 
> This patch set attempts to address this for allocations above
> PAGE_ALLOC_COSTLY_ORDER by temporarily collecting the pages we are
> releasing onto a local free list.  Instead of freeing them to the main
> buddy lists, pages are collected and coalesced on this per direct
> reclaimer free list.  Pages which are freed by other processes are also
> considered: where they coalesce with a page already under capture they
> will be moved to the capture list.  When pressure has been applied to a
> zone we then consult the capture list and if there is an appropriately
> sized page available it is taken immediately and the remainder returned
> to the free pool.  Capture is only enabled when the reclaimer's
> allocation order exceeds PAGE_ALLOC_COSTLY_ORDER as free pages below
> this order should naturally occur in large numbers following regular
> reclaim.
> 
> Thanks go to Mel Gorman for numerous discussions during the development
> of this patch and for his repeated reviews.

Whole series looks good, a few comments below.

Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>

> Signed-off-by: Andy Whitcroft <apw@shadowen.org>
> ---

> @@ -4815,6 +4900,73 @@ out:
>  	spin_unlock_irqrestore(&zone->lock, flags);
>  }
>  
> +#define lru_to_page(_head) (list_entry((_head)->prev, struct page, lru))
> +
> +/*
> + * Run through the accumulated list of captured pages and the first
> + * which is big enough to satisfy the original allocation.  Free
> + * the remainder of that page and all other pages.
> + */

That sentence looks incomplete; did you intend to write something along
the lines of:

Run through the accumulated list of captured pages and /take/ the first
which is big enough to satisfy the original allocation. Free the
remaining pages.

?

> +struct page *capture_alloc_or_return(struct zone *zone,
> +		struct zone *preferred_zone, struct list_head *capture_list,
> +		int order, int alloc_flags, gfp_t gfp_mask)
> +{
> +	struct page *capture_page = 0;
> +	unsigned long flags;
> +	int classzone_idx = zone_idx(preferred_zone);
> +
> +	spin_lock_irqsave(&zone->lock, flags);
> +
> +	while (!list_empty(capture_list)) {
> +		struct page *page;
> +		int pg_order;
> +
> +		page = lru_to_page(capture_list);
> +		list_del(&page->lru);
> +		pg_order = page_order(page);
> +
> +		/*
> +		 * Clear out our buddy size and list information before
> +		 * releasing or allocating the page.
> +		 */
> +		rmv_page_order(page);
> +		page->buddy_free = 0;
> +		ClearPageBuddyCapture(page);
> +
> +		if (!capture_page && pg_order >= order) {
> +			__carve_off(page, pg_order, order);
> +			capture_page = page;
> +		} else
> +			__free_one_page(page, zone, pg_order);
> +	}
> +
> +	/*
> +	 * Ensure that this capture would not violate the watermarks.
> +	 * Subtle, we actually already have the page outside the watermarks
> +	 * so check if we can allocate an order 0 page.
> +	 */
> +	if (capture_page &&
> +	    (!zone_cpuset_permits(zone, alloc_flags, gfp_mask) ||
> +	     !zone_watermark_permits(zone, 0, classzone_idx,
> +					     alloc_flags, gfp_mask))) {
> +		__free_one_page(capture_page, zone, order);
> +		capture_page = NULL;
> +	}

This makes me a little sad - we get a high order page and then give it
away again...

Can we start another round of direct reclaim with a lower order to try
and increase the watermarks while we hold on to this large order page?

> +	if (capture_page)
> +		__count_zone_vm_events(PGALLOC, zone, 1 << order);
> +
> +	zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
> +	zone->pages_scanned = 0;
> +
> +	spin_unlock_irqrestore(&zone->lock, flags);
> +
> +	if (capture_page)
> +		prep_new_page(capture_page, order, gfp_mask);
> +
> +	return capture_page;
> +}
> +
>  #ifdef CONFIG_MEMORY_HOTREMOVE
>  /*
>   * All pages in the range must be isolated before calling this.



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 4/4] capture pages freed during direct reclaim for allocation by the reclaimer
  2008-09-03 20:53   ` Andy Whitcroft
  2008-09-03 21:00     ` Christoph Lameter
  2008-09-04  7:20     ` Peter Zijlstra
@ 2008-09-04  7:59     ` KOSAKI Motohiro
  2008-09-04 14:44       ` Andy Whitcroft
  2 siblings, 1 reply; 30+ messages in thread
From: KOSAKI Motohiro @ 2008-09-04  7:59 UTC (permalink / raw)
  To: Andy Whitcroft
  Cc: kosaki.motohiro, linux-mm, linux-kernel, Mel Gorman, Christoph Lameter

> When a process enters direct reclaim it will expend effort identifying
> and releasing pages in the hope of obtaining a page.  However, as these
> pages are released asynchronously there is every possibility that the
> pages will have been consumed by other allocators before the reclaimer
> gets a look in.  This is particularly problematic where the reclaimer is
> attempting to allocate a higher order page.  It is highly likely that
> a parallel allocation will consume lower order constituent pages as we
> release them, preventing them coalescing into the higher order page the
> reclaimer desires.
> 
> This patch set attempts to address this for allocations above
> PAGE_ALLOC_COSTLY_ORDER by temporarily collecting the pages we are
> releasing onto a local free list.  Instead of freeing them to the main
> buddy lists, pages are collected and coalesced on this per direct
> reclaimer free list.  Pages which are freed by other processes are also
> considered: where they coalesce with a page already under capture they
> will be moved to the capture list.  When pressure has been applied to a
> zone we then consult the capture list and if there is an appropriately
> sized page available it is taken immediately and the remainder returned
> to the free pool.  Capture is only enabled when the reclaimer's
> allocation order exceeds PAGE_ALLOC_COSTLY_ORDER as free pages below
> this order should naturally occur in large numbers following regular
> reclaim.


Hi Andy,

I like almost all of this patch set.
(at least, I can ack patches 1/4 - 3/4)

Still, I worry about the OOM risk.
Can you remember the desired page size in the capture list (or some
other location)?  If possible, __capture_one_page() could then avoid
capturing unnecessary pages.

That is, once __capture_one_page() can make a page of the desired size
by buddy merging, it can free the other pages on the capture_list.

In the worst case, shrink_zone() is called by very many processes at
the same time.  Then, if each process fails to give back a few pages,
very many pages are never given back.
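
Something like this minimal sketch of the idea (illustrative only: the
capture_control structure and field names here are hypothetical, they
are not in the posted patch):

	/*
	 * Hypothetical: remember the order the reclaimer wants, so
	 * capture can stop diverting frees once a large enough page
	 * has been built, and release the rest to the buddy lists.
	 */
	struct capture_control {
		struct list_head capture_list;
		unsigned int capture_order;	/* desired order */
		int capture_done;		/* big enough page found */
	};

	static inline void capture_check_done(struct capture_control *capc,
					      struct page *page)
	{
		if (page_order(page) >= capc->capture_order)
			capc->capture_done = 1;
	}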




^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 4/4] capture pages freed during direct reclaim for allocation by the reclaimer
  2008-09-03 21:00     ` Christoph Lameter
  2008-09-04  6:38       ` Peter Zijlstra
@ 2008-09-04  8:11       ` KOSAKI Motohiro
  2008-09-04  8:58       ` Andy Whitcroft
  2 siblings, 0 replies; 30+ messages in thread
From: KOSAKI Motohiro @ 2008-09-04  8:11 UTC (permalink / raw)
  To: Christoph Lameter
  Cc: kosaki.motohiro, Andy Whitcroft, linux-mm, linux-kernel, Mel Gorman

Hi Christoph,

> How does page allocator fastpath behavior fare with this patch?

Don't worry about it, because

1. shrink_zone() isn't a fastpath, because no reclaim is a fastpath.
2. buddy combining in __free_one_page() isn't a fastpath, because
   no buddy combining is a fastpath. (*)

(*)
All modern allocators have a delayed buddy combining mechanism,
because buddy combining increases cache misses.
(please imagine address X+1 being freed while address X is cold:
 combining causes the next alloc to get address X, so the caller
 sees a cold page)

At least, the allocator's fastpath should avoid such combining, IMHO.

Unfortunately the Linux buddy allocator is limited here, because
zone->pcp only caches order-0 pages.

Thus freeing a higher order page always takes the slow path today,
but that isn't this patch's fault.
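
To illustrate (a rough sketch of the current free path, simplified from
mm/page_alloc.c; this is not code from the patch set):

	/*
	 * Order-0 frees stay on the per-cpu pageset and defer buddy
	 * merging; any higher order free takes the slow path and
	 * merges immediately in __free_one_page() under zone->lock.
	 */
	if (order == 0)
		free_hot_cold_page(page, 0);	/* pcp, no merging */
	else
		__free_pages_ok(page, order);	/* zone->lock + merge */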




^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 4/4] capture pages freed during direct reclaim for allocation by the reclaimer
  2008-09-03 21:00     ` Christoph Lameter
  2008-09-04  6:38       ` Peter Zijlstra
  2008-09-04  8:11       ` KOSAKI Motohiro
@ 2008-09-04  8:58       ` Andy Whitcroft
  2 siblings, 0 replies; 30+ messages in thread
From: Andy Whitcroft @ 2008-09-04  8:58 UTC (permalink / raw)
  To: Christoph Lameter; +Cc: linux-mm, linux-kernel, KOSAKI Motohiro, Mel Gorman

On Wed, Sep 03, 2008 at 04:00:41PM -0500, Christoph Lameter wrote:
> Andy Whitcroft wrote:
> 
> >  
> >  #ifndef __GENERATING_BOUNDS_H
> > @@ -208,6 +211,9 @@ __PAGEFLAG(SlubDebug, slub_debug)
> >   */
> >  TESTPAGEFLAG(Writeback, writeback) TESTSCFLAG(Writeback, writeback)
> >  __PAGEFLAG(Buddy, buddy)
> > +PAGEFLAG(BuddyCapture, buddy_capture)	/* A buddy page, but reserved. */
> > +	__SETPAGEFLAG(BuddyCapture, buddy_capture)
> > +	__CLEARPAGEFLAG(BuddyCapture, buddy_capture)
> 
> Doesn't __PAGEFLAG do what you want without having to explicitly specify
> __SET/__CLEAR?

I think I end up with one extra test that I don't need, but it's
probably much clearer.

> How does page allocator fastpath behavior fare with this patch?

The fastpath should be unaffected on the allocation side.  On the free
side there is an additional check for merging with a buddy under capture
as we merge buddies in __free_one_page.
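
Concretely, that is the new PageBuddyCapture() test in the merge loop
of __free_one_page() (quoted from patch 4/4):

	if (PageBuddyCapture(buddy)) {
		__mod_zone_page_state(zone,
				NR_FREE_PAGES, -(1UL << order));
		return __capture_one_page(buddy->buddy_free,
						page, zone, order);
	}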

-apw

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 4/4] capture pages freed during direct reclaim for allocation by the reclaimer
  2008-09-04  7:20     ` Peter Zijlstra
@ 2008-09-04 11:35       ` Andy Whitcroft
  0 siblings, 0 replies; 30+ messages in thread
From: Andy Whitcroft @ 2008-09-04 11:35 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: linux-mm, linux-kernel, KOSAKI Motohiro, Mel Gorman, Christoph Lameter

On Thu, Sep 04, 2008 at 09:20:18AM +0200, Peter Zijlstra wrote:
> On Wed, 2008-09-03 at 21:53 +0100, Andy Whitcroft wrote:
> > [Doh, as pointed out by Christoph the patch was missing from this one...]
> > 
> > When a process enters direct reclaim it will expend effort identifying
> > and releasing pages in the hope of obtaining a page.  However, as these
> > pages are released asynchronously there is every possibility that the
> > pages will have been consumed by other allocators before the reclaimer
> > gets a look in.  This is particularly problematic where the reclaimer is
> > attempting to allocate a higher order page.  It is highly likely that
> > a parallel allocation will consume lower order constituent pages as we
> > release them, preventing them coalescing into the higher order page the
> > reclaimer desires.
> > 
> > This patch set attempts to address this for allocations above
> > PAGE_ALLOC_COSTLY_ORDER by temporarily collecting the pages we are
> > releasing onto a local free list.  Instead of freeing them to the main
> > buddy lists, pages are collected and coalesced on this per direct
> > reclaimer free list.  Pages which are freed by other processes are also
> > considered: where they coalesce with a page already under capture they
> > will be moved to the capture list.  When pressure has been applied to a
> > zone we then consult the capture list and if there is an appropriately
> > sized page available it is taken immediately and the remainder returned
> > to the free pool.  Capture is only enabled when the reclaimer's
> > allocation order exceeds PAGE_ALLOC_COSTLY_ORDER as free pages below
> > this order should naturally occur in large numbers following regular
> > reclaim.
> > 
> > Thanks go to Mel Gorman for numerous discussions during the development
> > of this patch and for his repeated reviews.
> 
> Whole series looks good, a few comments below.
> 
> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> 
> > Signed-off-by: Andy Whitcroft <apw@shadowen.org>
> > ---
> 
> > @@ -4815,6 +4900,73 @@ out:
> >  	spin_unlock_irqrestore(&zone->lock, flags);
> >  }
> >  
> > +#define lru_to_page(_head) (list_entry((_head)->prev, struct page, lru))
> > +
> > +/*
> > + * Run through the accumulated list of captured pages and the first
> > + * which is big enough to satisfy the original allocation.  Free
> > + * the remainder of that page and all other pages.
> > + */
> 
> That sentence looks incomplete; did you intend to write something along
> the lines of:
> 
> Run through the accumulated list of captured pages and /take/ the first
> which is big enough to satisfy the original allocation. Free the
> remaining pages.
> 
> ?

Yeah that is more like it.  Updated.

> > +struct page *capture_alloc_or_return(struct zone *zone,
> > +		struct zone *preferred_zone, struct list_head *capture_list,
> > +		int order, int alloc_flags, gfp_t gfp_mask)
> > +{
> > +	struct page *capture_page = 0;
> > +	unsigned long flags;
> > +	int classzone_idx = zone_idx(preferred_zone);
> > +
> > +	spin_lock_irqsave(&zone->lock, flags);
> > +
> > +	while (!list_empty(capture_list)) {
> > +		struct page *page;
> > +		int pg_order;
> > +
> > +		page = lru_to_page(capture_list);
> > +		list_del(&page->lru);
> > +		pg_order = page_order(page);
> > +
> > +		/*
> > +		 * Clear out our buddy size and list information before
> > +		 * releasing or allocating the page.
> > +		 */
> > +		rmv_page_order(page);
> > +		page->buddy_free = 0;
> > +		ClearPageBuddyCapture(page);
> > +
> > +		if (!capture_page && pg_order >= order) {
> > +			__carve_off(page, pg_order, order);
> > +			capture_page = page;
> > +		} else
> > +			__free_one_page(page, zone, pg_order);
> > +	}
> > +
> > +	/*
> > +	 * Ensure that this capture would not violate the watermarks.
> > +	 * Subtle, we actually already have the page outside the watermarks
> > +	 * so check if we can allocate an order 0 page.
> > +	 */
> > +	if (capture_page &&
> > +	    (!zone_cpuset_permits(zone, alloc_flags, gfp_mask) ||
> > +	     !zone_watermark_permits(zone, 0, classzone_idx,
> > +					     alloc_flags, gfp_mask))) {
> > +		__free_one_page(capture_page, zone, order);
> > +		capture_page = NULL;
> > +	}
> 
> This makes me a little sad - we get a high order page and then give it
> away again...
> 
> Can we start another round of direct reclaim with a lower order to try
> and increase the watermarks while we hold on to this large order page?

Well, in theory we have already pushed a load of other pages back, the
ones we discarded during the capture selection.  This check actually
triggers very rarely in real use; without it we would occasionally OOM,
but that was rare.  Looking at some stats collected when running our
tests I have yet to see it trigger, so it's probably not worth any
additional effort there.
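
For reference, here is a minimal sketch of how a direct reclaimer is
expected to drive capture_alloc_or_return(); the surrounding helpers
(reclaim_with_capture(), shrink_zone_with_capture()) are invented for
illustration and are not part of the posted series:

	/* Illustrative caller only -- not from the patch set. */
	static struct page *reclaim_with_capture(struct zone *zone,
			struct zone *preferred_zone, int order,
			int alloc_flags, gfp_t gfp_mask)
	{
		LIST_HEAD(capture_list);

		/* Reclaim, diverting freed buddies onto our private list. */
		shrink_zone_with_capture(zone, order, &capture_list);

		/*
		 * Take the first captured page large enough for the request;
		 * the carved-off remainder and all other captured pages are
		 * returned to the buddy free lists under zone->lock.
		 */
		return capture_alloc_or_return(zone, preferred_zone,
				&capture_list, order, alloc_flags, gfp_mask);
	}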

> > +	if (capture_page)
> > +		__count_zone_vm_events(PGALLOC, zone, 1 << order);
> > +
> > +	zone_clear_flag(zone, ZONE_ALL_UNRECLAIMABLE);
> > +	zone->pages_scanned = 0;
> > +
> > +	spin_unlock_irqrestore(&zone->lock, flags);
> > +
> > +	if (capture_page)
> > +		prep_new_page(capture_page, order, gfp_mask);
> > +
> > +	return capture_page;
> > +}
> > +
> >  #ifdef CONFIG_MEMORY_HOTREMOVE
> >  /*
> >   * All pages in the range must be isolated before calling this.

-apw

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 4/4] capture pages freed during direct reclaim for allocation by the reclaimer
  2008-09-04  6:38       ` Peter Zijlstra
@ 2008-09-04 14:18         ` Christoph Lameter
  0 siblings, 0 replies; 30+ messages in thread
From: Christoph Lameter @ 2008-09-04 14:18 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Andy Whitcroft, linux-mm, linux-kernel, KOSAKI Motohiro, Mel Gorman

Peter Zijlstra wrote:
> On Wed, 2008-09-03 at 16:00 -0500, Christoph Lameter wrote:
>> Andy Whitcroft wrote:
>>
>>>  
>>>  #ifndef __GENERATING_BOUNDS_H
>>> @@ -208,6 +211,9 @@ __PAGEFLAG(SlubDebug, slub_debug)
>>>   */
>>>  TESTPAGEFLAG(Writeback, writeback) TESTSCFLAG(Writeback, writeback)
>>>  __PAGEFLAG(Buddy, buddy)
>>> +PAGEFLAG(BuddyCapture, buddy_capture)	/* A buddy page, but reserved. */
>>> +	__SETPAGEFLAG(BuddyCapture, buddy_capture)
>>> +	__CLEARPAGEFLAG(BuddyCapture, buddy_capture)
>> Doesn't __PAGEFLAG do what you want without having to explicitly specify
>> __SET/__CLEAR?
> 
> PAGEFLAG() __PAGEFLAG()
> 
> would define TESTPAGEFLAG() twice.
> 

Usually one wants either the atomic versions or the non-atomic versions.
This usage seems to be mainly non-atomic, plus one use of ClearPageBuddy()
in capture_or_return() (which raises some questions about how the bit
modifications are serialized: is there concurrency during free?)

So:

__PAGEFLAG(BuddyCapture, buddy_capture)
	CLEARPAGEFLAG(BuddyCapture, buddy_capture)

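For reference, the macro family being discussed expands roughly as
follows (paraphrased from include/linux/page-flags.h of that era;
exact definitions may differ between versions):

	#define TESTPAGEFLAG(uname, lname)				\
	static inline int Page##uname(struct page *page)		\
		{ return test_bit(PG_##lname, &page->flags); }

	#define SETPAGEFLAG(uname, lname)				\
	static inline void SetPage##uname(struct page *page)		\
		{ set_bit(PG_##lname, &page->flags); }

	#define __SETPAGEFLAG(uname, lname)				\
	static inline void __SetPage##uname(struct page *page)		\
		{ __set_bit(PG_##lname, &page->flags); }

	/* ...CLEARPAGEFLAG and __CLEARPAGEFLAG follow the same pattern. */

	/* Test accessor plus the atomic set/clear pair. */
	#define PAGEFLAG(uname, lname) TESTPAGEFLAG(uname, lname)	\
		SETPAGEFLAG(uname, lname) CLEARPAGEFLAG(uname, lname)

	/*
	 * Test accessor plus the non-atomic pair -- which is why using
	 * PAGEFLAG() alongside __PAGEFLAG() defines Page##uname twice.
	 */
	#define __PAGEFLAG(uname, lname) TESTPAGEFLAG(uname, lname)	\
		__SETPAGEFLAG(uname, lname) __CLEARPAGEFLAG(uname, lname)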

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 4/4] capture pages freed during direct reclaim for allocation by the reclaimer
  2008-09-04  7:59     ` KOSAKI Motohiro
@ 2008-09-04 14:44       ` Andy Whitcroft
  2008-09-05  1:52         ` KOSAKI Motohiro
  0 siblings, 1 reply; 30+ messages in thread
From: Andy Whitcroft @ 2008-09-04 14:44 UTC (permalink / raw)
  To: KOSAKI Motohiro; +Cc: linux-mm, linux-kernel, Mel Gorman, Christoph Lameter

On Thu, Sep 04, 2008 at 04:59:44PM +0900, KOSAKI Motohiro wrote:
> > When a process enters direct reclaim it will expend effort identifying
> > and releasing pages in the hope of obtaining a page.  However, as these
> > pages are released asynchronously there is every possibility that the
> > pages will have been consumed by other allocators before the reclaimer
> > gets a look in.  This is particularly problematic where the reclaimer is
> > attempting to allocate a higher order page.  It is highly likely that
> > a parallel allocation will consume lower order constituent pages as we
> > release them, preventing them from coalescing into the higher order page the
> > reclaimer desires.
> > 
> > This patch set attempts to address this for allocations above
> > ALLOC_COSTLY_ORDER by temporarily collecting the pages we are releasing
> > onto a local free list.  Instead of freeing them to the main buddy lists,
> > pages are collected and coalesced on this per-direct-reclaimer free list.
> > Pages which are freed by other processes are also considered: where they
> > coalesce with a page already under capture, they are moved to the
> > capture list.  When pressure has been applied to a zone, we then consult
> > the capture list and, if an appropriately sized page is available, take
> > it immediately, returning the remainder to the free pool.
> > Capture is only enabled when the reclaimer's allocation order exceeds
> > ALLOC_COSTLY_ORDER as free pages below this order should naturally occur
> > in large numbers following regular reclaim.
> 
> 
> Hi Andy,
> 
> I like almost all of your patch set.
> (at least, I can ack patches 1/4 - 3/4)
> 
> However, I worry about the OOM risk.
> Could you record the desired page size in the capture list (or some
> other location)?  If possible, __capture_on_page() could then avoid
> capturing unnecessary pages.
> 
> And once __capture_on_page() can build a page of the desired size by
> buddy merging, it could free the other pages on the capture_list.
> 
> In the worst case, shrink_zone() is called by very many processes at
> the same time.  Then, if each process gives back only a few pages, a
> great many pages are never returned.

The testing we have done pushes the system pretty damn hard, about as
hard as you can.  Without the zone watermark checks in capture we would
periodically lose a test to an OOM.  Since adding those checks I have never seen
an OOM, so I am confident we are safe.  That said, clearly some wider
testing in -mm would be very desirable to confirm that this does not
tickle OOM for some unexpected workload.

I think the idea of trying to short-circuit capture once it has a page
of the requisite order or greater is eminently sensible.  I suspect we
are going to have trouble getting the information to the right place,
but it is clearly worth investigating.  It feels like a logical step on
top of this, so I would propose to do it as a patch on top of this set.

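One possible shape of that follow-up, purely as a sketch (the capture
state structure and all field names here are invented, not from any
posted patch):

	/* Hypothetical per-reclaimer capture state. */
	struct capture_state {
		struct list_head *capture_list;
		unsigned int desired_order;
		bool satisfied;		/* a big-enough buddy now exists */
	};

	/* Asked wherever a freed or merged buddy would be diverted. */
	static bool __capture_wanted(unsigned int pg_order,
				     struct capture_state *cs)
	{
		if (cs->satisfied)
			return false;	/* stop capturing; free normally */
		if (pg_order >= cs->desired_order)
			cs->satisfied = true;
		return true;
	}
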
Thanks for your feedback.

-apw

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 3/4] buddy: explicitly identify buddy field use in struct page
  2008-09-03 18:44 ` [PATCH 3/4] buddy: explicitly identify buddy field use in struct page Andy Whitcroft
  2008-09-03 20:36   ` Christoph Lameter
  2008-09-04  1:25   ` Rik van Riel
@ 2008-09-05  1:52   ` KOSAKI Motohiro
  2 siblings, 0 replies; 30+ messages in thread
From: KOSAKI Motohiro @ 2008-09-05  1:52 UTC (permalink / raw)
  To: Andy Whitcroft; +Cc: kosaki.motohiro, linux-mm, linux-kernel, Mel Gorman

> Explicitly define the struct page fields which buddy uses when it owns
> pages.  Define a new anonymous struct to allow additional fields to
> be defined in a later patch.
> 
> Signed-off-by: Andy Whitcroft <apw@shadowen.org>

Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>




^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 4/4] capture pages freed during direct reclaim for allocation by the reclaimer
  2008-09-04 14:44       ` Andy Whitcroft
@ 2008-09-05  1:52         ` KOSAKI Motohiro
  0 siblings, 0 replies; 30+ messages in thread
From: KOSAKI Motohiro @ 2008-09-05  1:52 UTC (permalink / raw)
  To: Andy Whitcroft
  Cc: kosaki.motohiro, linux-mm, linux-kernel, Mel Gorman, Christoph Lameter

> > Hi Andy,
> > 
> > I like almost all of your patch set.
> > (at least, I can ack patches 1/4 - 3/4)
> > 
> > However, I worry about the OOM risk.
> > Could you record the desired page size in the capture list (or some
> > other location)?  If possible, __capture_on_page() could then avoid
> > capturing unnecessary pages.
> > 
> > And once __capture_on_page() can build a page of the desired size by
> > buddy merging, it could free the other pages on the capture_list.
> > 
> > In the worst case, shrink_zone() is called by very many processes at
> > the same time.  Then, if each process gives back only a few pages, a
> > great many pages are never returned.
> 
> The testing we have done pushes the system pretty damn hard, about as
> hard as you can.  Without the zone watermark checks in capture we would
> periodically lose a test to an OOM.  Since adding those checks I have never seen
> an OOM, so I am confident we are safe.  That said, clearly some wider
> testing in -mm would be very desirable to confirm that this does not
> tickle OOM for some unexpected workload.
>
> I think the idea of trying to short-circuit capture once it has a page
> of the requisite order or greater is eminently sensible.  I suspect we
> are going to have trouble getting the information to the right place,
> but it is clearly worth investigating.  It feels like a logical step on
> top of this, so I would propose to do it as a patch on top of this set.
> 
> Thanks for your feedback.

OK, that makes sense.
Thanks for the good patch.

Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>




^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 2/4] pull out zone cpuset and watermark checks for reuse
  2008-09-03 18:44 ` [PATCH 2/4] pull out zone cpuset and watermark checks " Andy Whitcroft
  2008-09-04  1:24   ` Rik van Riel
@ 2008-09-05  1:52   ` KOSAKI Motohiro
  1 sibling, 0 replies; 30+ messages in thread
From: KOSAKI Motohiro @ 2008-09-05  1:52 UTC (permalink / raw)
  To: Andy Whitcroft; +Cc: kosaki.motohiro, linux-mm, linux-kernel, Mel Gorman

> When allocating, we need to confirm that the zone we are about to allocate
> from is acceptable to the CPUSET we are in, and that it does not violate
> the zone watermarks.  Pull these checks out so we can reuse them in a
> later patch.
> 
> Signed-off-by: Andy Whitcroft <apw@shadowen.org>

Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

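For reference, a sketch of how the allocation path can use the two
pulled-out checks; zone_cpuset_permits() and zone_watermark_permits()
are the series' names, but the wrapper around them is invented for
illustration:

	/* Illustrative caller only -- the wrapper is not from the series. */
	static bool zone_allocation_permitted(struct zone *zone,
			struct zone *preferred_zone, int order,
			int alloc_flags, gfp_t gfp_mask)
	{
		int classzone_idx = zone_idx(preferred_zone);

		if (!zone_cpuset_permits(zone, alloc_flags, gfp_mask))
			return false;
		return zone_watermark_permits(zone, order, classzone_idx,
					      alloc_flags, gfp_mask);
	}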


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/4] pull out the page pre-release and sanity check logic for reuse
  2008-09-03 18:44 ` [PATCH 1/4] pull out the page pre-release and sanity check logic for reuse Andy Whitcroft
  2008-09-04  1:24   ` Rik van Riel
@ 2008-09-05  1:52   ` KOSAKI Motohiro
  1 sibling, 0 replies; 30+ messages in thread
From: KOSAKI Motohiro @ 2008-09-05  1:52 UTC (permalink / raw)
  To: Andy Whitcroft; +Cc: kosaki.motohiro, linux-mm, linux-kernel, Mel Gorman

> When we are about to release a page we perform a number of actions
> on that page.  We clear down any anonymous mappings, confirm that
> the page is safe to release, check that no locks are being freed with
> it, and unmap the page should that be required.  Pull this processing
> out into a
> helper function for reuse in a later patch.
> 
> Note that we do not convert the similar cleardown in free_hot_cold_page()
> as the optimiser is unable to squash the loops during the inline.
> 
> Signed-off-by: Andy Whitcroft <apw@shadowen.org>

Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>






^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/4] pull out the page pre-release and sanity check logic for reuse
  2008-10-01 12:30 ` [PATCH 1/4] pull out the page pre-release and sanity check logic for reuse Andy Whitcroft
@ 2008-10-02  7:05   ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 30+ messages in thread
From: KAMEZAWA Hiroyuki @ 2008-10-02  7:05 UTC (permalink / raw)
  To: Andy Whitcroft
  Cc: linux-mm, linux-kernel, KOSAKI Motohiro, Peter Zijlstra,
	Christoph Lameter, Rik van Riel, Mel Gorman, Nick Piggin,
	Andrew Morton

On Wed,  1 Oct 2008 13:30:58 +0100
Andy Whitcroft <apw@shadowen.org> wrote:

> When we are about to release a page we perform a number of actions
> on that page.  We clear down any anonymous mappings, confirm that
> the page is safe to release, check that no locks are being freed with
> it, and unmap the page should that be required.  Pull this processing
> out into a
> helper function for reuse in a later patch.
> 
> Note that we do not convert the similar cleardown in free_hot_cold_page()
> as the optimiser is unable to squash the loops during the inline.
> 
> Signed-off-by: Andy Whitcroft <apw@shadowen.org>
> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Reviewed-by: Rik van Riel <riel@redhat.com>

Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>


^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 1/4] pull out the page pre-release and sanity check logic for reuse
  2008-10-01 12:30 [PATCH 0/4] Reclaim page capture v4 Andy Whitcroft
@ 2008-10-01 12:30 ` Andy Whitcroft
  2008-10-02  7:05   ` KAMEZAWA Hiroyuki
  0 siblings, 1 reply; 30+ messages in thread
From: Andy Whitcroft @ 2008-10-01 12:30 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, KOSAKI Motohiro, Peter Zijlstra, Christoph Lameter,
	Rik van Riel, Mel Gorman, Andy Whitcroft, Nick Piggin,
	Andrew Morton

When we are about to release a page we perform a number of actions
on that page.  We clear down any anonymous mappings, confirm that
the page is safe to release, check that no locks are being freed with
it, and unmap the page should that be required.  Pull this processing
out into a
helper function for reuse in a later patch.

Note that we do not convert the similar cleardown in free_hot_cold_page()
as the optimiser is unable to squash the loops during the inline.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
---
 mm/page_alloc.c |   40 +++++++++++++++++++++++++++-------------
 1 files changed, 27 insertions(+), 13 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index f52fcf1..55d8d9b 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -489,6 +489,32 @@ static inline int free_pages_check(struct page *page)
 }
 
 /*
+ * Prepare this page for release to the buddy.  Sanity check the page.
+ * Returns 1 if the page is safe to free.
+ */
+static inline int free_page_prepare(struct page *page, int order)
+{
+	int i;
+	int reserved = 0;
+
+	for (i = 0 ; i < (1 << order) ; ++i)
+		reserved += free_pages_check(page + i);
+	if (reserved)
+		return 0;
+
+	if (!PageHighMem(page)) {
+		debug_check_no_locks_freed(page_address(page),
+							PAGE_SIZE << order);
+		debug_check_no_obj_freed(page_address(page),
+					   PAGE_SIZE << order);
+	}
+	arch_free_page(page, order);
+	kernel_map_pages(page, 1 << order, 0);
+
+	return 1;
+}
+
+/*
  * Frees a list of pages. 
  * Assumes all pages on list are in same zone, and of same order.
  * count is the number of pages to free.
@@ -529,22 +555,10 @@ static void free_one_page(struct zone *zone, struct page *page, int order)
 static void __free_pages_ok(struct page *page, unsigned int order)
 {
 	unsigned long flags;
-	int i;
-	int reserved = 0;
 
-	for (i = 0 ; i < (1 << order) ; ++i)
-		reserved += free_pages_check(page + i);
-	if (reserved)
+	if (!free_page_prepare(page, order))
 		return;
 
-	if (!PageHighMem(page)) {
-		debug_check_no_locks_freed(page_address(page),PAGE_SIZE<<order);
-		debug_check_no_obj_freed(page_address(page),
-					   PAGE_SIZE << order);
-	}
-	arch_free_page(page, order);
-	kernel_map_pages(page, 1 << order, 0);
-
 	local_irq_save(flags);
 	__count_vm_events(PGFREE, 1 << order);
 	free_one_page(page_zone(page), page, order);
-- 
1.6.0.1.451.gc8d31

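For context, the kind of reuse the changelog's "later patch" refers to
might look like the sketch below; free_page_capture() and its list
handling are invented for illustration and are not the series' actual
code:

	static void free_page_capture(struct page *page, unsigned int order,
				      struct list_head *capture_list)
	{
		/* Shared pre-release checks instead of a duplicated copy. */
		if (!free_page_prepare(page, order))
			return;

		/* Divert to the reclaimer's private list, not the buddy. */
		set_page_private(page, order);
		list_add(&page->lru, capture_list);
	}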

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/4] pull out the page pre-release and sanity check logic for reuse
  2008-09-08 14:11   ` MinChan Kim
@ 2008-09-08 15:14     ` Andy Whitcroft
  0 siblings, 0 replies; 30+ messages in thread
From: Andy Whitcroft @ 2008-09-08 15:14 UTC (permalink / raw)
  To: MinChan Kim
  Cc: linux-mm, linux-kernel, KOSAKI Motohiro, Peter Zijlstra,
	Christoph Lameter, Rik van Riel, Mel Gorman

On Mon, Sep 08, 2008 at 11:11:22PM +0900, MinChan Kim wrote:
> On Fri, Sep 5, 2008 at 7:19 PM, Andy Whitcroft <apw@shadowen.org> wrote:
> > When we are about to release a page we perform a number of actions
> > on that page.  We clear down any anonymous mappings, confirm that
> > the page is safe to release, check that no locks are being freed with
> > it, and unmap the page should that be required.  Pull this processing
> > out into a
> > helper function for reuse in a later patch.
> >
> > Note that we do not convert the similar cleardown in free_hot_cold_page()
> > as the optimiser is unable to squash the loops during the inline.
> >
> > Signed-off-by: Andy Whitcroft <apw@shadowen.org>
> > Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> > Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> > Reviewed-by: Rik van Riel <riel@redhat.com>
> > ---
> >  mm/page_alloc.c |   43 ++++++++++++++++++++++++++++++-------------
> >  1 files changed, 30 insertions(+), 13 deletions(-)
> >
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index f52fcf1..b2a2c2b 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -489,6 +489,35 @@ static inline int free_pages_check(struct page *page)
> >  }
> >
> >  /*
> > + * Prepare this page for release to the buddy.  Sanity check the page.
> > + * Returns 1 if the page is safe to free.
> > + */
> > +static inline int free_page_prepare(struct page *page, int order)
> > +{
> > +       int i;
> > +       int reserved = 0;
> > +
> > +       if (PageAnon(page))
> > +               page->mapping = NULL;
> 
> Why do you need to clear down the anonymous mapping?
> I think that since you don't convert this cleardown in
> free_hot_cold_page(), you don't need it here.
> 
> If you do it, bad_page() can't do its job.

Yeah, that slipped through from when this patch originally merged two
different instances of this code.  Good spot.  Will sort that out.
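
For context, free_pages_check() in this era treats a non-NULL
page->mapping as part of its bad-page test, roughly (paraphrased; the
exact condition set varies by version):

	if (unlikely(page_mapcount(page) |
		     (page->mapping != NULL) |
		     (page_count(page) != 0) |
		     (page->flags & PAGE_FLAGS_CHECK_AT_FREE)))
		bad_page(page);

Clearing the anonymous mapping before this check would therefore hide
exactly the state bad_page() is meant to report on this path.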

> > +       for (i = 0 ; i < (1 << order) ; ++i)
> > +               reserved += free_pages_check(page + i);
> > +       if (reserved)
> > +               return 0;
> > +
> > +       if (!PageHighMem(page)) {
> > +               debug_check_no_locks_freed(page_address(page),
> > +                                                       PAGE_SIZE << order);
> > +               debug_check_no_obj_freed(page_address(page),
> > +                                          PAGE_SIZE << order);
> > +       }
> > +       arch_free_page(page, order);
> > +       kernel_map_pages(page, 1 << order, 0);
> > +
> > +       return 1;
> > +}
> > +
> > +/*
> >  * Frees a list of pages.
> >  * Assumes all pages on list are in same zone, and of same order.
> >  * count is the number of pages to free.
> > @@ -529,22 +558,10 @@ static void free_one_page(struct zone *zone, struct page *page, int order)
> >  static void __free_pages_ok(struct page *page, unsigned int order)
> >  {
> >        unsigned long flags;
> > -       int i;
> > -       int reserved = 0;
> >
> > -       for (i = 0 ; i < (1 << order) ; ++i)
> > -               reserved += free_pages_check(page + i);
> > -       if (reserved)
> > +       if (!free_page_prepare(page, order))
> >                return;
> >
> > -       if (!PageHighMem(page)) {
> > -               debug_check_no_locks_freed(page_address(page),PAGE_SIZE<<order);
> > -               debug_check_no_obj_freed(page_address(page),
> > -                                          PAGE_SIZE << order);
> > -       }
> > -       arch_free_page(page, order);
> > -       kernel_map_pages(page, 1 << order, 0);
> > -
> >        local_irq_save(flags);
> >        __count_vm_events(PGFREE, 1 << order);
> >        free_one_page(page_zone(page), page, order);

-apw

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH 1/4] pull out the page pre-release and sanity check logic for reuse
  2008-09-05 10:19 ` [PATCH 1/4] pull out the page pre-release and sanity check logic for reuse Andy Whitcroft
@ 2008-09-08 14:11   ` MinChan Kim
  2008-09-08 15:14     ` Andy Whitcroft
  0 siblings, 1 reply; 30+ messages in thread
From: MinChan Kim @ 2008-09-08 14:11 UTC (permalink / raw)
  To: Andy Whitcroft
  Cc: linux-mm, linux-kernel, KOSAKI Motohiro, Peter Zijlstra,
	Christoph Lameter, Rik van Riel, Mel Gorman

On Fri, Sep 5, 2008 at 7:19 PM, Andy Whitcroft <apw@shadowen.org> wrote:
> When we are about to release a page we perform a number of actions
> on that page.  We clear down any anonymous mappings, confirm that
> the page is safe to release, check that no locks are being freed with
> it, and unmap the page should that be required.  Pull this processing
> out into a
> helper function for reuse in a later patch.
>
> Note that we do not convert the similar cleardown in free_hot_cold_page()
> as the optimiser is unable to squash the loops during the inline.
>
> Signed-off-by: Andy Whitcroft <apw@shadowen.org>
> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Reviewed-by: Rik van Riel <riel@redhat.com>
> ---
>  mm/page_alloc.c |   43 ++++++++++++++++++++++++++++++-------------
>  1 files changed, 30 insertions(+), 13 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index f52fcf1..b2a2c2b 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -489,6 +489,35 @@ static inline int free_pages_check(struct page *page)
>  }
>
>  /*
> + * Prepare this page for release to the buddy.  Sanity check the page.
> + * Returns 1 if the page is safe to free.
> + */
> +static inline int free_page_prepare(struct page *page, int order)
> +{
> +       int i;
> +       int reserved = 0;
> +
> +       if (PageAnon(page))
> +               page->mapping = NULL;

Why do you need to clear down the anonymous mapping?
I think that since you don't convert this cleardown in
free_hot_cold_page(), you don't need it here.

If you do it, bad_page() can't do its job.

> +       for (i = 0 ; i < (1 << order) ; ++i)
> +               reserved += free_pages_check(page + i);
> +       if (reserved)
> +               return 0;
> +
> +       if (!PageHighMem(page)) {
> +               debug_check_no_locks_freed(page_address(page),
> +                                                       PAGE_SIZE << order);
> +               debug_check_no_obj_freed(page_address(page),
> +                                          PAGE_SIZE << order);
> +       }
> +       arch_free_page(page, order);
> +       kernel_map_pages(page, 1 << order, 0);
> +
> +       return 1;
> +}
> +
> +/*
>  * Frees a list of pages.
>  * Assumes all pages on list are in same zone, and of same order.
>  * count is the number of pages to free.
> @@ -529,22 +558,10 @@ static void free_one_page(struct zone *zone, struct page *page, int order)
>  static void __free_pages_ok(struct page *page, unsigned int order)
>  {
>        unsigned long flags;
> -       int i;
> -       int reserved = 0;
>
> -       for (i = 0 ; i < (1 << order) ; ++i)
> -               reserved += free_pages_check(page + i);
> -       if (reserved)
> +       if (!free_page_prepare(page, order))
>                return;
>
> -       if (!PageHighMem(page)) {
> -               debug_check_no_locks_freed(page_address(page),PAGE_SIZE<<order);
> -               debug_check_no_obj_freed(page_address(page),
> -                                          PAGE_SIZE << order);
> -       }
> -       arch_free_page(page, order);
> -       kernel_map_pages(page, 1 << order, 0);
> -
>        local_irq_save(flags);
>        __count_vm_events(PGFREE, 1 << order);
>        free_one_page(page_zone(page), page, order);
> --
> 1.6.0.rc1.258.g80295
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
>



-- 
Thanks,
MinChan Kim

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH 1/4] pull out the page pre-release and sanity check logic for reuse
  2008-09-05 10:19 [PATCH 0/4] Reclaim page capture v3 Andy Whitcroft
@ 2008-09-05 10:19 ` Andy Whitcroft
  2008-09-08 14:11   ` MinChan Kim
  0 siblings, 1 reply; 30+ messages in thread
From: Andy Whitcroft @ 2008-09-05 10:19 UTC (permalink / raw)
  To: linux-mm
  Cc: linux-kernel, KOSAKI Motohiro, Peter Zijlstra, Christoph Lameter,
	Rik van Riel, Mel Gorman, Andy Whitcroft

When we are about to release a page we perform a number of actions
on that page.  We clear down any anonymous mappings, confirm that
the page is safe to release, check that no locks are being freed with
it, and unmap the page should that be required.  Pull this processing
out into a
helper function for reuse in a later patch.

Note that we do not convert the similar cleardown in free_hot_cold_page()
as the optimiser is unable to squash the loops during the inline.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Acked-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
---
 mm/page_alloc.c |   43 ++++++++++++++++++++++++++++++-------------
 1 files changed, 30 insertions(+), 13 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index f52fcf1..b2a2c2b 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -489,6 +489,35 @@ static inline int free_pages_check(struct page *page)
 }
 
 /*
+ * Prepare this page for release to the buddy.  Sanity check the page.
+ * Returns 1 if the page is safe to free.
+ */
+static inline int free_page_prepare(struct page *page, int order)
+{
+	int i;
+	int reserved = 0;
+
+	if (PageAnon(page))
+		page->mapping = NULL;
+
+	for (i = 0 ; i < (1 << order) ; ++i)
+		reserved += free_pages_check(page + i);
+	if (reserved)
+		return 0;
+
+	if (!PageHighMem(page)) {
+		debug_check_no_locks_freed(page_address(page),
+							PAGE_SIZE << order);
+		debug_check_no_obj_freed(page_address(page),
+					   PAGE_SIZE << order);
+	}
+	arch_free_page(page, order);
+	kernel_map_pages(page, 1 << order, 0);
+
+	return 1;
+}
+
+/*
  * Frees a list of pages. 
  * Assumes all pages on list are in same zone, and of same order.
  * count is the number of pages to free.
@@ -529,22 +558,10 @@ static void free_one_page(struct zone *zone, struct page *page, int order)
 static void __free_pages_ok(struct page *page, unsigned int order)
 {
 	unsigned long flags;
-	int i;
-	int reserved = 0;
 
-	for (i = 0 ; i < (1 << order) ; ++i)
-		reserved += free_pages_check(page + i);
-	if (reserved)
+	if (!free_page_prepare(page, order))
 		return;
 
-	if (!PageHighMem(page)) {
-		debug_check_no_locks_freed(page_address(page),PAGE_SIZE<<order);
-		debug_check_no_obj_freed(page_address(page),
-					   PAGE_SIZE << order);
-	}
-	arch_free_page(page, order);
-	kernel_map_pages(page, 1 << order, 0);
-
 	local_irq_save(flags);
 	__count_vm_events(PGFREE, 1 << order);
 	free_one_page(page_zone(page), page, order);
-- 
1.6.0.rc1.258.g80295


^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH 1/4] pull out the page pre-release and sanity check logic for reuse
  2008-07-01 17:58 [RFC PATCH 0/4] Reclaim page capture v1 Andy Whitcroft
@ 2008-07-01 17:58 ` Andy Whitcroft
  0 siblings, 0 replies; 30+ messages in thread
From: Andy Whitcroft @ 2008-07-01 17:58 UTC (permalink / raw)
  To: linux-mm; +Cc: linux-kernel, Mel Gorman, Andy Whitcroft

When we are about to release a page we perform a number of actions
on that page.  We clear down any anonymous mappings, confirm that
the page is safe to release, check that no locks are being freed with
it, and unmap the page should that be required.  Pull this processing
out into a
helper function for reuse in a later patch.

Note that we do not convert the similar cleardown in free_hot_cold_page()
as the optimiser is unable to squash the loops during the inline.

Signed-off-by: Andy Whitcroft <apw@shadowen.org>
---
 mm/page_alloc.c |   43 ++++++++++++++++++++++++++++++-------------
 1 files changed, 30 insertions(+), 13 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 8aa93f3..758ecf1 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -468,6 +468,35 @@ static inline int free_pages_check(struct page *page)
 }
 
 /*
+ * Prepare this page for release to the buddy.  Sanity check the page.
+ * Returns 1 if the page is safe to free.
+ */
+static inline int free_page_prepare(struct page *page, int order)
+{
+	int i;
+	int reserved = 0;
+
+	if (PageAnon(page))
+		page->mapping = NULL;
+
+	for (i = 0 ; i < (1 << order) ; ++i)
+		reserved += free_pages_check(page + i);
+	if (reserved)
+		return 0;
+
+	if (!PageHighMem(page)) {
+		debug_check_no_locks_freed(page_address(page),
+							PAGE_SIZE << order);
+		debug_check_no_obj_freed(page_address(page),
+					   PAGE_SIZE << order);
+	}
+	arch_free_page(page, order);
+	kernel_map_pages(page, 1 << order, 0);
+
+	return 1;
+}
+
+/*
  * Frees a list of pages. 
  * Assumes all pages on list are in same zone, and of same order.
  * count is the number of pages to free.
@@ -508,22 +537,10 @@ static void free_one_page(struct zone *zone, struct page *page, int order)
 static void __free_pages_ok(struct page *page, unsigned int order)
 {
 	unsigned long flags;
-	int i;
-	int reserved = 0;
 
-	for (i = 0 ; i < (1 << order) ; ++i)
-		reserved += free_pages_check(page + i);
-	if (reserved)
+	if (!free_page_prepare(page, order))
 		return;
 
-	if (!PageHighMem(page)) {
-		debug_check_no_locks_freed(page_address(page),PAGE_SIZE<<order);
-		debug_check_no_obj_freed(page_address(page),
-					   PAGE_SIZE << order);
-	}
-	arch_free_page(page, order);
-	kernel_map_pages(page, 1 << order, 0);
-
 	local_irq_save(flags);
 	__count_vm_events(PGFREE, 1 << order);
 	free_one_page(page_zone(page), page, order);
-- 
1.5.6.1.201.g3e7d3


^ permalink raw reply related	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2008-10-02  7:00 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2008-09-03 18:44 [RFC PATCH 0/4] Reclaim page capture v2 Andy Whitcroft
2008-09-03 18:44 ` [PATCH 1/4] pull out the page pre-release and sanity check logic for reuse Andy Whitcroft
2008-09-04  1:24   ` Rik van Riel
2008-09-05  1:52   ` KOSAKI Motohiro
2008-09-03 18:44 ` [PATCH 2/4] pull out zone cpuset and watermark checks " Andy Whitcroft
2008-09-04  1:24   ` Rik van Riel
2008-09-05  1:52   ` KOSAKI Motohiro
2008-09-03 18:44 ` [PATCH 3/4] buddy: explicitly identify buddy field use in struct page Andy Whitcroft
2008-09-03 20:36   ` Christoph Lameter
2008-09-04  1:25   ` Rik van Riel
2008-09-05  1:52   ` KOSAKI Motohiro
2008-09-03 18:44 ` [PATCH 4/4] capture pages freed during direct reclaim for allocation by the reclaimer Andy Whitcroft
2008-09-03 20:35   ` Christoph Lameter
2008-09-03 20:53   ` Andy Whitcroft
2008-09-03 21:00     ` Christoph Lameter
2008-09-04  6:38       ` Peter Zijlstra
2008-09-04 14:18         ` Christoph Lameter
2008-09-04  8:11       ` KOSAKI Motohiro
2008-09-04  8:58       ` Andy Whitcroft
2008-09-04  7:20     ` Peter Zijlstra
2008-09-04 11:35       ` Andy Whitcroft
2008-09-04  7:59     ` KOSAKI Motohiro
2008-09-04 14:44       ` Andy Whitcroft
2008-09-05  1:52         ` KOSAKI Motohiro
  -- strict thread matches above, loose matches on Subject: below --
2008-10-01 12:30 [PATCH 0/4] Reclaim page capture v4 Andy Whitcroft
2008-10-01 12:30 ` [PATCH 1/4] pull out the page pre-release and sanity check logic for reuse Andy Whitcroft
2008-10-02  7:05   ` KAMEZAWA Hiroyuki
2008-09-05 10:19 [PATCH 0/4] Reclaim page capture v3 Andy Whitcroft
2008-09-05 10:19 ` [PATCH 1/4] pull out the page pre-release and sanity check logic for reuse Andy Whitcroft
2008-09-08 14:11   ` MinChan Kim
2008-09-08 15:14     ` Andy Whitcroft
2008-07-01 17:58 [RFC PATCH 0/4] Reclaim page capture v1 Andy Whitcroft
2008-07-01 17:58 ` [PATCH 1/4] pull out the page pre-release and sanity check logic for reuse Andy Whitcroft

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).