* [RFC]pagealloc: compensate a task for direct page reclaim
@ 2010-09-16 11:26 Shaohua Li
  2010-09-16 15:00 ` Minchan Kim
  2010-09-17  5:52 ` KOSAKI Motohiro
  0 siblings, 2 replies; 6+ messages in thread
From: Shaohua Li @ 2010-09-16 11:26 UTC (permalink / raw)
  To: linux-mm; +Cc: Andrew Morton, Mel Gorman

A task enters direct page reclaim and frees some memory, but sometimes the
task can't get a free page after direct reclaim because other tasks take the
freed pages (this is quite common in multi-task workloads in my tests). This
adds extra latency to the task and is unfair. Since the task has already paid
the penalty, we should give it some compensation: if a task frees pages during
direct reclaim, cache one freed page for it, so the task gets it right away.
We only consider order-0 allocations, because caching order > 0 pages is hard.

Below is trace output where a task frees some pages in try_to_free_pages() but
get_page_from_freelist() still can't get a page during direct reclaim:

<...>-809   [004]   730.218991: __alloc_pages_nodemask: progress 147, order 0, pid 809, comm mmap_test
<...>-806   [001]   730.237969: __alloc_pages_nodemask: progress 147, order 0, pid 806, comm mmap_test
<...>-810   [005]   730.237971: __alloc_pages_nodemask: progress 147, order 0, pid 810, comm mmap_test
<...>-809   [004]   730.237972: __alloc_pages_nodemask: progress 147, order 0, pid 809, comm mmap_test
<...>-811   [006]   730.241409: __alloc_pages_nodemask: progress 147, order 0, pid 811, comm mmap_test
<...>-809   [004]   730.241412: __alloc_pages_nodemask: progress 147, order 0, pid 809, comm mmap_test
<...>-812   [007]   730.241435: __alloc_pages_nodemask: progress 147, order 0, pid 812, comm mmap_test
<...>-809   [004]   730.245036: __alloc_pages_nodemask: progress 147, order 0, pid 809, comm mmap_test
<...>-809   [004]   730.260360: __alloc_pages_nodemask: progress 147, order 0, pid 809, comm mmap_test
<...>-805   [000]   730.260362: __alloc_pages_nodemask: progress 147, order 0, pid 805, comm mmap_test
<...>-811   [006]   730.263877: __alloc_pages_nodemask: progress 147, order 0, pid 811, comm mmap_test

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
---
 include/linux/swap.h |    1 +
 mm/page_alloc.c      |   23 +++++++++++++++++++++++
 mm/vmscan.c          |   10 ++++++++++
 3 files changed, 34 insertions(+)

Index: linux/include/linux/swap.h
===================================================================
--- linux.orig/include/linux/swap.h	2010-09-16 11:01:56.000000000 +0800
+++ linux/include/linux/swap.h	2010-09-16 11:03:07.000000000 +0800
@@ -109,6 +109,7 @@ typedef struct {
  */
 struct reclaim_state {
 	unsigned long reclaimed_slab;
+	struct page **cached_page;
 };
 
 #ifdef __KERNEL__
Index: linux/mm/page_alloc.c
===================================================================
--- linux.orig/mm/page_alloc.c	2010-09-16 11:01:56.000000000 +0800
+++ linux/mm/page_alloc.c	2010-09-16 16:51:12.000000000 +0800
@@ -1837,6 +1837,21 @@ __alloc_pages_direct_compact(gfp_t gfp_m
 }
 #endif /* CONFIG_COMPACTION */
 
+static void prepare_cached_page(struct page *page, gfp_t gfp_mask)
+{
+	int wasMlocked = __TestClearPageMlocked(page);
+	unsigned long flags;
+
+	if (!free_pages_prepare(page, 0))
+		return;
+
+	local_irq_save(flags);
+	if (unlikely(wasMlocked))
+		free_page_mlock(page);
+	local_irq_restore(flags);
+	prep_new_page(page, 0, gfp_mask);
+}
+
 /* The really slow allocator path where we enter direct reclaim */
 static inline struct page *
 __alloc_pages_direct_reclaim(gfp_t gfp_mask, unsigned int order,
@@ -1856,6 +1871,10 @@ __alloc_pages_direct_reclaim(gfp_t gfp_m
 	p->flags |= PF_MEMALLOC;
 	lockdep_set_current_reclaim_state(gfp_mask);
 	reclaim_state.reclaimed_slab = 0;
+	if (order == 0)
+		reclaim_state.cached_page = &page;
+	else
+		reclaim_state.cached_page = NULL;
 	p->reclaim_state = &reclaim_state;
 
 	*did_some_progress = try_to_free_pages(zonelist, order, gfp_mask, nodemask);
@@ -1864,6 +1883,10 @@ __alloc_pages_direct_reclaim(gfp_t gfp_m
 	lockdep_clear_current_reclaim_state();
 	p->flags &= ~PF_MEMALLOC;
 
+	if (page) {
+		prepare_cached_page(page, gfp_mask);
+		return page;
+	}
 	cond_resched();
 
 	if (unlikely(!(*did_some_progress)))
Index: linux/mm/vmscan.c
===================================================================
--- linux.orig/mm/vmscan.c	2010-09-16 11:01:56.000000000 +0800
+++ linux/mm/vmscan.c	2010-09-16 11:03:07.000000000 +0800
@@ -626,9 +626,17 @@ static noinline_for_stack void free_page
 {
 	struct pagevec freed_pvec;
 	struct page *page, *tmp;
+	struct reclaim_state *reclaim_state = current->reclaim_state;
 
 	pagevec_init(&freed_pvec, 1);
 
+	if (!list_empty(free_pages) && reclaim_state &&
+			reclaim_state->cached_page) {
+		page = list_entry(free_pages->next, struct page, lru);
+		list_del(&page->lru);
+		*reclaim_state->cached_page = page;
+	}
+
 	list_for_each_entry_safe(page, tmp, free_pages, lru) {
 		list_del(&page->lru);
 		if (!pagevec_add(&freed_pvec, page)) {
@@ -2467,6 +2475,7 @@ unsigned long shrink_all_memory(unsigned
 	p->flags |= PF_MEMALLOC;
 	lockdep_set_current_reclaim_state(sc.gfp_mask);
 	reclaim_state.reclaimed_slab = 0;
+	reclaim_state.cached_page = NULL;
 	p->reclaim_state = &reclaim_state;
 
 	nr_reclaimed = do_try_to_free_pages(zonelist, &sc);
@@ -2655,6 +2664,7 @@ static int __zone_reclaim(struct zone *z
 	p->flags |= PF_MEMALLOC | PF_SWAPWRITE;
 	lockdep_set_current_reclaim_state(gfp_mask);
 	reclaim_state.reclaimed_slab = 0;
+	reclaim_state.cached_page = NULL;
 	p->reclaim_state = &reclaim_state;
 
 	if (zone_pagecache_reclaimable(zone) > zone->min_unmapped_pages) {


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: dont@kvack.org


* Re: [RFC]pagealloc: compensate a task for direct page reclaim
  2010-09-16 11:26 [RFC]pagealloc: compensate a task for direct page reclaim Shaohua Li
@ 2010-09-16 15:00 ` Minchan Kim
  2010-09-17  2:34   ` Shaohua Li
  2010-09-20  8:50   ` Mel Gorman
  2010-09-17  5:52 ` KOSAKI Motohiro
  1 sibling, 2 replies; 6+ messages in thread
From: Minchan Kim @ 2010-09-16 15:00 UTC (permalink / raw)
  To: Shaohua Li; +Cc: linux-mm, Andrew Morton, Mel Gorman

On Thu, Sep 16, 2010 at 07:26:36PM +0800, Shaohua Li wrote:
> [patch description and trace snipped]

The idea is good.

I think we need to reserve at least one page for the direct reclaimer who made
the effort, so that we can reduce the latency of the stalled process.

But I don't like this implementation.

1. It selects a random one of the reclaimed pages as the cached page. This
doesn't consider the requestor's migratetype, so it can cause fragmentation
problems in the future.

2. It skips the buddy allocator. That means we lose the chance to coalesce,
so the fragmentation problem could become more severe than before.

In addition, I think this patch needs some numbers on the latency improvement
and on fragmentation if you are going ahead with this approach.

-- 
Kind regards,
Minchan Kim


* Re: [RFC]pagealloc: compensate a task for direct page reclaim
  2010-09-16 15:00 ` Minchan Kim
@ 2010-09-17  2:34   ` Shaohua Li
  2010-09-17  4:47     ` Minchan Kim
  2010-09-20  8:50   ` Mel Gorman
  1 sibling, 1 reply; 6+ messages in thread
From: Shaohua Li @ 2010-09-17  2:34 UTC (permalink / raw)
  To: Minchan Kim; +Cc: linux-mm, Andrew Morton, Mel Gorman

On Thu, Sep 16, 2010 at 11:00:10PM +0800, Minchan Kim wrote:
> On Thu, Sep 16, 2010 at 07:26:36PM +0800, Shaohua Li wrote:
> > [patch description and trace snipped]
> 
> The idea is good.
> 
> I think we need to reserve at least one page for the direct reclaimer who made
> the effort, so that we can reduce the latency of the stalled process.
> 
> But I don't like this implementation.
> 
> 1. It selects a random one of the reclaimed pages as the cached page. This
> doesn't consider the requestor's migratetype, so it can cause fragmentation
> problems in the future.
Maybe we can limit the cached page to MIGRATE_MOVABLE, which is the most
common case.

> 2. It skips the buddy allocator. That means we lose the chance to coalesce,
> so the fragmentation problem could become more severe than before.
We only cache order-0 allocations, which don't enter lumpy reclaim, so this
doesn't sound like an issue to me.

> In addition, I think this patch needs some numbers on the latency improvement
> and on fragmentation if you are going ahead with this approach.
OK, sure.

Thanks,
Shaohua


* Re: [RFC]pagealloc: compensate a task for direct page reclaim
  2010-09-17  2:34   ` Shaohua Li
@ 2010-09-17  4:47     ` Minchan Kim
  0 siblings, 0 replies; 6+ messages in thread
From: Minchan Kim @ 2010-09-17  4:47 UTC (permalink / raw)
  To: Shaohua Li; +Cc: linux-mm, Andrew Morton, Mel Gorman

On Fri, Sep 17, 2010 at 11:34 AM, Shaohua Li <shaohua.li@intel.com> wrote:
> On Thu, Sep 16, 2010 at 11:00:10PM +0800, Minchan Kim wrote:
>> On Thu, Sep 16, 2010 at 07:26:36PM +0800, Shaohua Li wrote:
>> > [patch description and trace snipped]
>>
>> The idea is good.
>>
>> I think we need to reserve at least one page for the direct reclaimer who made
>> the effort, so that we can reduce the latency of the stalled process.
>>
>> But I don't like this implementation.
>>
>> 1. It selects a random one of the reclaimed pages as the cached page. This
>> doesn't consider the requestor's migratetype, so it can cause fragmentation
>> problems in the future.
> Maybe we can limit the cached page to MIGRATE_MOVABLE, which is the most common case.
>
>> 2. It skips the buddy allocator. That means we lose the chance to coalesce,
>> so the fragmentation problem could become more severe than before.
> We only cache order-0 allocations, which don't enter lumpy reclaim, so this
> doesn't sound like an issue to me.

I mean the following.

Old behavior:

1) An order-0 page is returned to the buddy allocator.
2) If it happens to fill the hole for order 1, the page is promoted to an
order-1 page.
3) If that fills the hole for order 2, it is promoted to an order-2 page.
4) This repeats up to some order.
5) Finally, alloc_page allocates from the buddy an order-0 page (i.e. one
that did not coalesce) out of all the pages the direct reclaimer freed.

But your patch loses that chance for the cached page.

Of course, if none of the reclaimed pages ends up on the order-0 list (i.e.
all of them coalesce), a big page has to be split back into order-0 pages.
But that's unlikely.

-- 
Kind regards,
Minchan Kim


* Re: [RFC]pagealloc: compensate a task for direct page reclaim
  2010-09-16 11:26 [RFC]pagealloc: compensate a task for direct page reclaim Shaohua Li
  2010-09-16 15:00 ` Minchan Kim
@ 2010-09-17  5:52 ` KOSAKI Motohiro
  1 sibling, 0 replies; 6+ messages in thread
From: KOSAKI Motohiro @ 2010-09-17  5:52 UTC (permalink / raw)
  To: Shaohua Li; +Cc: kosaki.motohiro, linux-mm, Andrew Morton, Mel Gorman

> [patch description and trace snipped]

As far as I remember, a similar patch was posted two years ago, but Minchan
found it caused a performance regression with kernbench:

http://archives.free.net.ph/message/20080905.101958.0f84e87d.ja.html




* Re: [RFC]pagealloc: compensate a task for direct page reclaim
  2010-09-16 15:00 ` Minchan Kim
  2010-09-17  2:34   ` Shaohua Li
@ 2010-09-20  8:50   ` Mel Gorman
  1 sibling, 0 replies; 6+ messages in thread
From: Mel Gorman @ 2010-09-20  8:50 UTC (permalink / raw)
  To: Minchan Kim; +Cc: Shaohua Li, linux-mm, Andrew Morton

On Fri, Sep 17, 2010 at 12:00:10AM +0900, Minchan Kim wrote:
> On Thu, Sep 16, 2010 at 07:26:36PM +0800, Shaohua Li wrote:
> > [patch description and trace snipped]
> 
> The idea is good.
> 
> I think we need to reserve at least one page for the direct reclaimer who made
> the effort, so that we can reduce the latency of the stalled process.
> 
> 

The latency reduction is very minimal except in the case where a direct
reclaimer has its pages stolen because the system is under heavy memory
pressure. Under such pressure, I wonder how noticeable the unfairness even
is; the system's performance has already hit the floor. I'd like to hear
more about the problem being solved here and whether there is a workload
that is really suffering.

> But I don't like this implementation. 
> 
> 1. It selects a random one of the reclaimed pages as the cached page. This
> doesn't consider the requestor's migratetype, so it can cause fragmentation
> problems in the future.
> 

Agreed.

> 2. It skips the buddy allocator. That means we lose the chance to coalesce,
> so the fragmentation problem could become more severe than before.
> 

Agreed.

Also, the cached page may no longer be hot for the current CPU, which can
cause a small performance hit in other cases.

> In addition, I think this patch needs some numbers on the latency improvement
> and on fragmentation if you are going ahead with this approach.
> 

Depending on the problem being solved, it might be better addressed by
limiting the number of direct reclaimers to the number of CPUs in the
system, so that there is less stealing from the per-cpu lists.

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

