* [PATCH v3 0/5] Sorting out sc.swap_cluster_max and kill hibernation specific reclaim logic
@ 2009-11-11 1:57 ` KOSAKI Motohiro
0 siblings, 0 replies; 12+ messages in thread
From: KOSAKI Motohiro @ 2009-11-11 1:57 UTC (permalink / raw)
To: LKML, linux-mm, Andrew Morton, Minchan Kim, Rik van Riel,
Rafael J. Wysocki
Cc: kosaki.motohiro
Currently, shrink_all_memory() and sc.swap_cluster_max have unnecessary
mess. We can cleanup it and kill some useless code.
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH v3 0/5] Sorting out sc.swap_cluster_max and kill hibernation specific reclaim logic
@ 2009-11-11 1:57 ` KOSAKI Motohiro
0 siblings, 0 replies; 12+ messages in thread
From: KOSAKI Motohiro @ 2009-11-11 1:57 UTC (permalink / raw)
To: LKML, linux-mm, Andrew Morton, Minchan Kim, Rik van Riel,
Rafael J. Wysocki
Cc: kosaki.motohiro
Currently, shrink_all_memory() and sc.swap_cluster_max have unnecessary
mess. We can cleanup it and kill some useless code.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH 1/5] vmscan: separate sc.swap_cluster_max and sc.nr_max_reclaim
2009-11-11 1:57 ` KOSAKI Motohiro
@ 2009-11-11 1:58 ` KOSAKI Motohiro
-1 siblings, 0 replies; 12+ messages in thread
From: KOSAKI Motohiro @ 2009-11-11 1:58 UTC (permalink / raw)
To: LKML
Cc: kosaki.motohiro, linux-mm, Andrew Morton, Minchan Kim,
Rik van Riel, Rafael J. Wysocki
Currently, sc.scap_cluster_max has double meanings.
1) reclaim batch size as isolate_lru_pages()'s argument
2) reclaim baling out thresolds
The two meanings pretty unrelated. Thus, Let's separate it.
this patch doesn't change any behavior.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
---
mm/vmscan.c | 25 +++++++++++++++++++------
1 files changed, 19 insertions(+), 6 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index f805958..7be2927 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -55,6 +55,9 @@ struct scan_control {
/* Number of pages freed so far during a call to shrink_zones() */
unsigned long nr_reclaimed;
+ /* How many pages shrink_list() should reclaim */
+ unsigned long nr_to_reclaim;
+
/* This context's GFP mask */
gfp_t gfp_mask;
@@ -1585,6 +1588,7 @@ static void shrink_zone(int priority, struct zone *zone,
enum lru_list l;
unsigned long nr_reclaimed = sc->nr_reclaimed;
unsigned long swap_cluster_max = sc->swap_cluster_max;
+ unsigned long nr_to_reclaim = sc->nr_to_reclaim;
struct zone_reclaim_stat *reclaim_stat = get_reclaim_stat(zone, sc);
int noswap = 0;
@@ -1634,8 +1638,7 @@ static void shrink_zone(int priority, struct zone *zone,
* with multiple processes reclaiming pages, the total
* freeing target can get unreasonably large.
*/
- if (nr_reclaimed > swap_cluster_max &&
- priority < DEF_PRIORITY && !current_is_kswapd())
+ if (nr_reclaimed > nr_to_reclaim && priority < DEF_PRIORITY)
break;
}
@@ -1733,6 +1736,7 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
struct zoneref *z;
struct zone *zone;
enum zone_type high_zoneidx = gfp_zone(sc->gfp_mask);
+ unsigned long writeback_threshold;
delayacct_freepages_start();
@@ -1768,7 +1772,7 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
}
}
total_scanned += sc->nr_scanned;
- if (sc->nr_reclaimed >= sc->swap_cluster_max) {
+ if (sc->nr_reclaimed >= sc->nr_to_reclaim) {
ret = sc->nr_reclaimed;
goto out;
}
@@ -1780,8 +1784,8 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
* that's undesirable in laptop mode, where we *want* lumpy
* writeout. So in laptop mode, write out the whole world.
*/
- if (total_scanned > sc->swap_cluster_max +
- sc->swap_cluster_max / 2) {
+ writeback_threshold = sc->nr_to_reclaim + sc->nr_to_reclaim / 2;
+ if (total_scanned > writeback_threshold) {
wakeup_flusher_threads(laptop_mode ? 0 : total_scanned);
sc->may_writepage = 1;
}
@@ -1827,6 +1831,7 @@ unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
.gfp_mask = gfp_mask,
.may_writepage = !laptop_mode,
.swap_cluster_max = SWAP_CLUSTER_MAX,
+ .nr_to_reclaim = SWAP_CLUSTER_MAX,
.may_unmap = 1,
.may_swap = 1,
.swappiness = vm_swappiness,
@@ -1885,6 +1890,7 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *mem_cont,
.may_unmap = 1,
.may_swap = !noswap,
.swap_cluster_max = SWAP_CLUSTER_MAX,
+ .nr_to_reclaim = SWAP_CLUSTER_MAX,
.swappiness = swappiness,
.order = 0,
.mem_cgroup = mem_cont,
@@ -1932,6 +1938,11 @@ static unsigned long balance_pgdat(pg_data_t *pgdat, int order)
.may_unmap = 1,
.may_swap = 1,
.swap_cluster_max = SWAP_CLUSTER_MAX,
+ /*
+ * kswapd doesn't want to be bailed out while reclaim. because
+ * we want to put equal scanning pressure on each zone.
+ */
+ .nr_to_reclaim = ULONG_MAX,
.swappiness = vm_swappiness,
.order = order,
.mem_cgroup = NULL,
@@ -2549,7 +2560,9 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
.may_unmap = !!(zone_reclaim_mode & RECLAIM_SWAP),
.may_swap = 1,
.swap_cluster_max = max_t(unsigned long, nr_pages,
- SWAP_CLUSTER_MAX),
+ SWAP_CLUSTER_MAX),
+ .nr_to_reclaim = max_t(unsigned long, nr_pages,
+ SWAP_CLUSTER_MAX),
.gfp_mask = gfp_mask,
.swappiness = vm_swappiness,
.order = order,
--
1.6.2.5
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 1/5] vmscan: separate sc.swap_cluster_max and sc.nr_max_reclaim
@ 2009-11-11 1:58 ` KOSAKI Motohiro
0 siblings, 0 replies; 12+ messages in thread
From: KOSAKI Motohiro @ 2009-11-11 1:58 UTC (permalink / raw)
To: LKML
Cc: kosaki.motohiro, linux-mm, Andrew Morton, Minchan Kim,
Rik van Riel, Rafael J. Wysocki
Currently, sc.scap_cluster_max has double meanings.
1) reclaim batch size as isolate_lru_pages()'s argument
2) reclaim baling out thresolds
The two meanings pretty unrelated. Thus, Let's separate it.
this patch doesn't change any behavior.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
---
mm/vmscan.c | 25 +++++++++++++++++++------
1 files changed, 19 insertions(+), 6 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index f805958..7be2927 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -55,6 +55,9 @@ struct scan_control {
/* Number of pages freed so far during a call to shrink_zones() */
unsigned long nr_reclaimed;
+ /* How many pages shrink_list() should reclaim */
+ unsigned long nr_to_reclaim;
+
/* This context's GFP mask */
gfp_t gfp_mask;
@@ -1585,6 +1588,7 @@ static void shrink_zone(int priority, struct zone *zone,
enum lru_list l;
unsigned long nr_reclaimed = sc->nr_reclaimed;
unsigned long swap_cluster_max = sc->swap_cluster_max;
+ unsigned long nr_to_reclaim = sc->nr_to_reclaim;
struct zone_reclaim_stat *reclaim_stat = get_reclaim_stat(zone, sc);
int noswap = 0;
@@ -1634,8 +1638,7 @@ static void shrink_zone(int priority, struct zone *zone,
* with multiple processes reclaiming pages, the total
* freeing target can get unreasonably large.
*/
- if (nr_reclaimed > swap_cluster_max &&
- priority < DEF_PRIORITY && !current_is_kswapd())
+ if (nr_reclaimed > nr_to_reclaim && priority < DEF_PRIORITY)
break;
}
@@ -1733,6 +1736,7 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
struct zoneref *z;
struct zone *zone;
enum zone_type high_zoneidx = gfp_zone(sc->gfp_mask);
+ unsigned long writeback_threshold;
delayacct_freepages_start();
@@ -1768,7 +1772,7 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
}
}
total_scanned += sc->nr_scanned;
- if (sc->nr_reclaimed >= sc->swap_cluster_max) {
+ if (sc->nr_reclaimed >= sc->nr_to_reclaim) {
ret = sc->nr_reclaimed;
goto out;
}
@@ -1780,8 +1784,8 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
* that's undesirable in laptop mode, where we *want* lumpy
* writeout. So in laptop mode, write out the whole world.
*/
- if (total_scanned > sc->swap_cluster_max +
- sc->swap_cluster_max / 2) {
+ writeback_threshold = sc->nr_to_reclaim + sc->nr_to_reclaim / 2;
+ if (total_scanned > writeback_threshold) {
wakeup_flusher_threads(laptop_mode ? 0 : total_scanned);
sc->may_writepage = 1;
}
@@ -1827,6 +1831,7 @@ unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
.gfp_mask = gfp_mask,
.may_writepage = !laptop_mode,
.swap_cluster_max = SWAP_CLUSTER_MAX,
+ .nr_to_reclaim = SWAP_CLUSTER_MAX,
.may_unmap = 1,
.may_swap = 1,
.swappiness = vm_swappiness,
@@ -1885,6 +1890,7 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *mem_cont,
.may_unmap = 1,
.may_swap = !noswap,
.swap_cluster_max = SWAP_CLUSTER_MAX,
+ .nr_to_reclaim = SWAP_CLUSTER_MAX,
.swappiness = swappiness,
.order = 0,
.mem_cgroup = mem_cont,
@@ -1932,6 +1938,11 @@ static unsigned long balance_pgdat(pg_data_t *pgdat, int order)
.may_unmap = 1,
.may_swap = 1,
.swap_cluster_max = SWAP_CLUSTER_MAX,
+ /*
+ * kswapd doesn't want to be bailed out while reclaim. because
+ * we want to put equal scanning pressure on each zone.
+ */
+ .nr_to_reclaim = ULONG_MAX,
.swappiness = vm_swappiness,
.order = order,
.mem_cgroup = NULL,
@@ -2549,7 +2560,9 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
.may_unmap = !!(zone_reclaim_mode & RECLAIM_SWAP),
.may_swap = 1,
.swap_cluster_max = max_t(unsigned long, nr_pages,
- SWAP_CLUSTER_MAX),
+ SWAP_CLUSTER_MAX),
+ .nr_to_reclaim = max_t(unsigned long, nr_pages,
+ SWAP_CLUSTER_MAX),
.gfp_mask = gfp_mask,
.swappiness = vm_swappiness,
.order = order,
--
1.6.2.5
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 2/5] vmscan: Kill hibernation specific reclaim logic and unify it
2009-11-11 1:57 ` KOSAKI Motohiro
@ 2009-11-11 2:01 ` KOSAKI Motohiro
-1 siblings, 0 replies; 12+ messages in thread
From: KOSAKI Motohiro @ 2009-11-11 2:01 UTC (permalink / raw)
To: LKML
Cc: kosaki.motohiro, linux-mm, Andrew Morton, Minchan Kim,
Rik van Riel, Rafael J. Wysocki
shrink_all_zone() was introduced by commit d6277db4ab (swsusp: rework
memory shrinker) for hibernate performance improvement. and sc.swap_cluster_max
was introduced by commit a06fe4d307 (Speed freeing memory for suspend).
commit a06fe4d307 said
Without the patch:
Freed 14600 pages in 1749 jiffies = 32.61 MB/s (Anomolous!)
Freed 88563 pages in 14719 jiffies = 23.50 MB/s
Freed 205734 pages in 32389 jiffies = 24.81 MB/s
With the patch:
Freed 68252 pages in 496 jiffies = 537.52 MB/s
Freed 116464 pages in 569 jiffies = 798.54 MB/s
Freed 209699 pages in 705 jiffies = 1161.89 MB/s
At that time, their patch was pretty worth. However, Modern Hardware
trend and recent VM improvement broke its worth. From several reason,
I think we should remove shrink_all_zones() at all.
detail:
1) Old days, shrink_zone()'s slowness was mainly caused by stupid io-throttle
at no i/o congestion.
but current shrink_zone() is sane, not slow.
2) shrink_all_zone() try to shrink all pages at a time. but it doesn't works
fine on numa system.
example)
System has 4GB memory and each node have 2GB. and hibernate need 1GB.
optimal)
steal 500MB from each node.
shrink_all_zones)
steal 1GB from node-0.
Oh, Cache balancing logic was broken. ;)
Unfortunately, Desktop system moved ahead NUMA at nowadays.
(Side note, if hibernate require 2GB, shrink_all_zones() never success
on above machine)
3) if the node has several I/O flighting pages, shrink_all_zones() makes
pretty bad result.
schenario) hibernate need 1GB
1) shrink_all_zones() try to reclaim 1GB from Node-0
2) but it only reclaimed 990MB
3) stupidly, shrink_all_zones() try to reclaim 1GB from Node-1
4) it reclaimed 990MB
Oh, well. it reclaimed twice much than required.
In the other hand, current shrink_zone() has sane baling out logic.
then, it doesn't make overkill reclaim. then, we lost shrink_zones()'s risk.
4) SplitLRU VM always keep active/inactive ratio very carefully. inactive list only
shrinking break its assumption. it makes unnecessary OOM risk. it obviously suboptimal.
Now, shrink_all_memory() is only the wrapper function of do_try_to_free_pages().
it bring good reviewability and debuggability, and solve above problems.
side note: Reclaim logic unificication makes two good side effect.
- Fix recursive reclaim bug on shrink_all_memory().
it did forgot to use PF_MEMALLOC. it mean the system be able to stuck into deadlock.
- Now, shrink_all_memory() got lockdep awareness. it bring good debuggability.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
---
mm/vmscan.c | 153 ++++++++++-------------------------------------------------
1 files changed, 26 insertions(+), 127 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 7be2927..eebd260 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -58,6 +58,8 @@ struct scan_control {
/* How many pages shrink_list() should reclaim */
unsigned long nr_to_reclaim;
+ unsigned long hibernation_mode;
+
/* This context's GFP mask */
gfp_t gfp_mask;
@@ -1791,7 +1793,8 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
}
/* Take a nap, wait for some writeback to complete */
- if (sc->nr_scanned && priority < DEF_PRIORITY - 2)
+ if (!sc->hibernation_mode && sc->nr_scanned &&
+ priority < DEF_PRIORITY - 2)
congestion_wait(BLK_RW_ASYNC, HZ/10);
}
/* top priority shrink_zones still had more to do? don't OOM, then */
@@ -2266,148 +2269,44 @@ unsigned long zone_reclaimable_pages(struct zone *zone)
#ifdef CONFIG_HIBERNATION
/*
- * Helper function for shrink_all_memory(). Tries to reclaim 'nr_pages' pages
- * from LRU lists system-wide, for given pass and priority.
- *
- * For pass > 3 we also try to shrink the LRU lists that contain a few pages
- */
-static void shrink_all_zones(unsigned long nr_pages, int prio,
- int pass, struct scan_control *sc)
-{
- struct zone *zone;
- unsigned long nr_reclaimed = 0;
- struct zone_reclaim_stat *reclaim_stat;
-
- for_each_populated_zone(zone) {
- enum lru_list l;
-
- if (zone_is_all_unreclaimable(zone) && prio != DEF_PRIORITY)
- continue;
-
- for_each_evictable_lru(l) {
- enum zone_stat_item ls = NR_LRU_BASE + l;
- unsigned long lru_pages = zone_page_state(zone, ls);
-
- /* For pass = 0, we don't shrink the active list */
- if (pass == 0 && (l == LRU_ACTIVE_ANON ||
- l == LRU_ACTIVE_FILE))
- continue;
-
- reclaim_stat = get_reclaim_stat(zone, sc);
- reclaim_stat->nr_saved_scan[l] +=
- (lru_pages >> prio) + 1;
- if (reclaim_stat->nr_saved_scan[l]
- >= nr_pages || pass > 3) {
- unsigned long nr_to_scan;
-
- reclaim_stat->nr_saved_scan[l] = 0;
- nr_to_scan = min(nr_pages, lru_pages);
- nr_reclaimed += shrink_list(l, nr_to_scan, zone,
- sc, prio);
- if (nr_reclaimed >= nr_pages) {
- sc->nr_reclaimed += nr_reclaimed;
- return;
- }
- }
- }
- }
- sc->nr_reclaimed += nr_reclaimed;
-}
-
-/*
- * Try to free `nr_pages' of memory, system-wide, and return the number of
+ * Try to free `nr_to_reclaim' of memory, system-wide, and return the number of
* freed pages.
*
* Rather than trying to age LRUs the aim is to preserve the overall
* LRU order by reclaiming preferentially
* inactive > active > active referenced > active mapped
*/
-unsigned long shrink_all_memory(unsigned long nr_pages)
+unsigned long shrink_all_memory(unsigned long nr_to_reclaim)
{
- unsigned long lru_pages, nr_slab;
- int pass;
struct reclaim_state reclaim_state;
struct scan_control sc = {
- .gfp_mask = GFP_KERNEL,
- .may_unmap = 0,
+ .gfp_mask = GFP_HIGHUSER_MOVABLE,
+ .may_swap = 1,
+ .may_unmap = 1,
.may_writepage = 1,
+ .swap_cluster_max = SWAP_CLUSTER_MAX,
+ .nr_to_reclaim = nr_to_reclaim,
+ .hibernation_mode = 1,
+ .swappiness = vm_swappiness,
+ .order = 0,
.isolate_pages = isolate_pages_global,
- .nr_reclaimed = 0,
};
+ struct zonelist * zonelist = node_zonelist(numa_node_id(), sc.gfp_mask);
+ struct task_struct *p = current;
+ unsigned long nr_reclaimed;
- current->reclaim_state = &reclaim_state;
-
- lru_pages = global_reclaimable_pages();
- nr_slab = global_page_state(NR_SLAB_RECLAIMABLE);
- /* If slab caches are huge, it's better to hit them first */
- while (nr_slab >= lru_pages) {
- reclaim_state.reclaimed_slab = 0;
- shrink_slab(nr_pages, sc.gfp_mask, lru_pages);
- if (!reclaim_state.reclaimed_slab)
- break;
-
- sc.nr_reclaimed += reclaim_state.reclaimed_slab;
- if (sc.nr_reclaimed >= nr_pages)
- goto out;
-
- nr_slab -= reclaim_state.reclaimed_slab;
- }
-
- /*
- * We try to shrink LRUs in 5 passes:
- * 0 = Reclaim from inactive_list only
- * 1 = Reclaim from active list but don't reclaim mapped
- * 2 = 2nd pass of type 1
- * 3 = Reclaim mapped (normal reclaim)
- * 4 = 2nd pass of type 3
- */
- for (pass = 0; pass < 5; pass++) {
- int prio;
-
- /* Force reclaiming mapped pages in the passes #3 and #4 */
- if (pass > 2)
- sc.may_unmap = 1;
-
- for (prio = DEF_PRIORITY; prio >= 0; prio--) {
- unsigned long nr_to_scan = nr_pages - sc.nr_reclaimed;
-
- sc.nr_scanned = 0;
- sc.swap_cluster_max = nr_to_scan;
- shrink_all_zones(nr_to_scan, prio, pass, &sc);
- if (sc.nr_reclaimed >= nr_pages)
- goto out;
-
- reclaim_state.reclaimed_slab = 0;
- shrink_slab(sc.nr_scanned, sc.gfp_mask,
- global_reclaimable_pages());
- sc.nr_reclaimed += reclaim_state.reclaimed_slab;
- if (sc.nr_reclaimed >= nr_pages)
- goto out;
-
- if (sc.nr_scanned && prio < DEF_PRIORITY - 2)
- congestion_wait(BLK_RW_ASYNC, HZ / 10);
- }
- }
-
- /*
- * If sc.nr_reclaimed = 0, we could not shrink LRUs, but there may be
- * something in slab caches
- */
- if (!sc.nr_reclaimed) {
- do {
- reclaim_state.reclaimed_slab = 0;
- shrink_slab(nr_pages, sc.gfp_mask,
- global_reclaimable_pages());
- sc.nr_reclaimed += reclaim_state.reclaimed_slab;
- } while (sc.nr_reclaimed < nr_pages &&
- reclaim_state.reclaimed_slab > 0);
- }
+ p->flags |= PF_MEMALLOC;
+ lockdep_set_current_reclaim_state(sc.gfp_mask);
+ reclaim_state.reclaimed_slab = 0;
+ p->reclaim_state = &reclaim_state;
+ nr_reclaimed = do_try_to_free_pages(zonelist, &sc);
-out:
- current->reclaim_state = NULL;
+ p->reclaim_state = NULL;
+ lockdep_clear_current_reclaim_state();
+ p->flags &= ~PF_MEMALLOC;
- return sc.nr_reclaimed;
+ return nr_reclaimed;
}
#endif /* CONFIG_HIBERNATION */
--
1.6.2.5
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 2/5] vmscan: Kill hibernation specific reclaim logic and unify it
@ 2009-11-11 2:01 ` KOSAKI Motohiro
0 siblings, 0 replies; 12+ messages in thread
From: KOSAKI Motohiro @ 2009-11-11 2:01 UTC (permalink / raw)
To: LKML
Cc: kosaki.motohiro, linux-mm, Andrew Morton, Minchan Kim,
Rik van Riel, Rafael J. Wysocki
shrink_all_zone() was introduced by commit d6277db4ab (swsusp: rework
memory shrinker) for hibernate performance improvement. and sc.swap_cluster_max
was introduced by commit a06fe4d307 (Speed freeing memory for suspend).
commit a06fe4d307 said
Without the patch:
Freed 14600 pages in 1749 jiffies = 32.61 MB/s (Anomolous!)
Freed 88563 pages in 14719 jiffies = 23.50 MB/s
Freed 205734 pages in 32389 jiffies = 24.81 MB/s
With the patch:
Freed 68252 pages in 496 jiffies = 537.52 MB/s
Freed 116464 pages in 569 jiffies = 798.54 MB/s
Freed 209699 pages in 705 jiffies = 1161.89 MB/s
At that time, their patch was pretty worth. However, Modern Hardware
trend and recent VM improvement broke its worth. From several reason,
I think we should remove shrink_all_zones() at all.
detail:
1) Old days, shrink_zone()'s slowness was mainly caused by stupid io-throttle
at no i/o congestion.
but current shrink_zone() is sane, not slow.
2) shrink_all_zone() try to shrink all pages at a time. but it doesn't works
fine on numa system.
example)
System has 4GB memory and each node have 2GB. and hibernate need 1GB.
optimal)
steal 500MB from each node.
shrink_all_zones)
steal 1GB from node-0.
Oh, Cache balancing logic was broken. ;)
Unfortunately, Desktop system moved ahead NUMA at nowadays.
(Side note, if hibernate require 2GB, shrink_all_zones() never success
on above machine)
3) if the node has several I/O flighting pages, shrink_all_zones() makes
pretty bad result.
schenario) hibernate need 1GB
1) shrink_all_zones() try to reclaim 1GB from Node-0
2) but it only reclaimed 990MB
3) stupidly, shrink_all_zones() try to reclaim 1GB from Node-1
4) it reclaimed 990MB
Oh, well. it reclaimed twice much than required.
In the other hand, current shrink_zone() has sane baling out logic.
then, it doesn't make overkill reclaim. then, we lost shrink_zones()'s risk.
4) SplitLRU VM always keep active/inactive ratio very carefully. inactive list only
shrinking break its assumption. it makes unnecessary OOM risk. it obviously suboptimal.
Now, shrink_all_memory() is only the wrapper function of do_try_to_free_pages().
it bring good reviewability and debuggability, and solve above problems.
side note: Reclaim logic unificication makes two good side effect.
- Fix recursive reclaim bug on shrink_all_memory().
it did forgot to use PF_MEMALLOC. it mean the system be able to stuck into deadlock.
- Now, shrink_all_memory() got lockdep awareness. it bring good debuggability.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Acked-by: Rafael J. Wysocki <rjw@sisk.pl>
---
mm/vmscan.c | 153 ++++++++++-------------------------------------------------
1 files changed, 26 insertions(+), 127 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 7be2927..eebd260 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -58,6 +58,8 @@ struct scan_control {
/* How many pages shrink_list() should reclaim */
unsigned long nr_to_reclaim;
+ unsigned long hibernation_mode;
+
/* This context's GFP mask */
gfp_t gfp_mask;
@@ -1791,7 +1793,8 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
}
/* Take a nap, wait for some writeback to complete */
- if (sc->nr_scanned && priority < DEF_PRIORITY - 2)
+ if (!sc->hibernation_mode && sc->nr_scanned &&
+ priority < DEF_PRIORITY - 2)
congestion_wait(BLK_RW_ASYNC, HZ/10);
}
/* top priority shrink_zones still had more to do? don't OOM, then */
@@ -2266,148 +2269,44 @@ unsigned long zone_reclaimable_pages(struct zone *zone)
#ifdef CONFIG_HIBERNATION
/*
- * Helper function for shrink_all_memory(). Tries to reclaim 'nr_pages' pages
- * from LRU lists system-wide, for given pass and priority.
- *
- * For pass > 3 we also try to shrink the LRU lists that contain a few pages
- */
-static void shrink_all_zones(unsigned long nr_pages, int prio,
- int pass, struct scan_control *sc)
-{
- struct zone *zone;
- unsigned long nr_reclaimed = 0;
- struct zone_reclaim_stat *reclaim_stat;
-
- for_each_populated_zone(zone) {
- enum lru_list l;
-
- if (zone_is_all_unreclaimable(zone) && prio != DEF_PRIORITY)
- continue;
-
- for_each_evictable_lru(l) {
- enum zone_stat_item ls = NR_LRU_BASE + l;
- unsigned long lru_pages = zone_page_state(zone, ls);
-
- /* For pass = 0, we don't shrink the active list */
- if (pass == 0 && (l == LRU_ACTIVE_ANON ||
- l == LRU_ACTIVE_FILE))
- continue;
-
- reclaim_stat = get_reclaim_stat(zone, sc);
- reclaim_stat->nr_saved_scan[l] +=
- (lru_pages >> prio) + 1;
- if (reclaim_stat->nr_saved_scan[l]
- >= nr_pages || pass > 3) {
- unsigned long nr_to_scan;
-
- reclaim_stat->nr_saved_scan[l] = 0;
- nr_to_scan = min(nr_pages, lru_pages);
- nr_reclaimed += shrink_list(l, nr_to_scan, zone,
- sc, prio);
- if (nr_reclaimed >= nr_pages) {
- sc->nr_reclaimed += nr_reclaimed;
- return;
- }
- }
- }
- }
- sc->nr_reclaimed += nr_reclaimed;
-}
-
-/*
- * Try to free `nr_pages' of memory, system-wide, and return the number of
+ * Try to free `nr_to_reclaim' of memory, system-wide, and return the number of
* freed pages.
*
* Rather than trying to age LRUs the aim is to preserve the overall
* LRU order by reclaiming preferentially
* inactive > active > active referenced > active mapped
*/
-unsigned long shrink_all_memory(unsigned long nr_pages)
+unsigned long shrink_all_memory(unsigned long nr_to_reclaim)
{
- unsigned long lru_pages, nr_slab;
- int pass;
struct reclaim_state reclaim_state;
struct scan_control sc = {
- .gfp_mask = GFP_KERNEL,
- .may_unmap = 0,
+ .gfp_mask = GFP_HIGHUSER_MOVABLE,
+ .may_swap = 1,
+ .may_unmap = 1,
.may_writepage = 1,
+ .swap_cluster_max = SWAP_CLUSTER_MAX,
+ .nr_to_reclaim = nr_to_reclaim,
+ .hibernation_mode = 1,
+ .swappiness = vm_swappiness,
+ .order = 0,
.isolate_pages = isolate_pages_global,
- .nr_reclaimed = 0,
};
+ struct zonelist * zonelist = node_zonelist(numa_node_id(), sc.gfp_mask);
+ struct task_struct *p = current;
+ unsigned long nr_reclaimed;
- current->reclaim_state = &reclaim_state;
-
- lru_pages = global_reclaimable_pages();
- nr_slab = global_page_state(NR_SLAB_RECLAIMABLE);
- /* If slab caches are huge, it's better to hit them first */
- while (nr_slab >= lru_pages) {
- reclaim_state.reclaimed_slab = 0;
- shrink_slab(nr_pages, sc.gfp_mask, lru_pages);
- if (!reclaim_state.reclaimed_slab)
- break;
-
- sc.nr_reclaimed += reclaim_state.reclaimed_slab;
- if (sc.nr_reclaimed >= nr_pages)
- goto out;
-
- nr_slab -= reclaim_state.reclaimed_slab;
- }
-
- /*
- * We try to shrink LRUs in 5 passes:
- * 0 = Reclaim from inactive_list only
- * 1 = Reclaim from active list but don't reclaim mapped
- * 2 = 2nd pass of type 1
- * 3 = Reclaim mapped (normal reclaim)
- * 4 = 2nd pass of type 3
- */
- for (pass = 0; pass < 5; pass++) {
- int prio;
-
- /* Force reclaiming mapped pages in the passes #3 and #4 */
- if (pass > 2)
- sc.may_unmap = 1;
-
- for (prio = DEF_PRIORITY; prio >= 0; prio--) {
- unsigned long nr_to_scan = nr_pages - sc.nr_reclaimed;
-
- sc.nr_scanned = 0;
- sc.swap_cluster_max = nr_to_scan;
- shrink_all_zones(nr_to_scan, prio, pass, &sc);
- if (sc.nr_reclaimed >= nr_pages)
- goto out;
-
- reclaim_state.reclaimed_slab = 0;
- shrink_slab(sc.nr_scanned, sc.gfp_mask,
- global_reclaimable_pages());
- sc.nr_reclaimed += reclaim_state.reclaimed_slab;
- if (sc.nr_reclaimed >= nr_pages)
- goto out;
-
- if (sc.nr_scanned && prio < DEF_PRIORITY - 2)
- congestion_wait(BLK_RW_ASYNC, HZ / 10);
- }
- }
-
- /*
- * If sc.nr_reclaimed = 0, we could not shrink LRUs, but there may be
- * something in slab caches
- */
- if (!sc.nr_reclaimed) {
- do {
- reclaim_state.reclaimed_slab = 0;
- shrink_slab(nr_pages, sc.gfp_mask,
- global_reclaimable_pages());
- sc.nr_reclaimed += reclaim_state.reclaimed_slab;
- } while (sc.nr_reclaimed < nr_pages &&
- reclaim_state.reclaimed_slab > 0);
- }
+ p->flags |= PF_MEMALLOC;
+ lockdep_set_current_reclaim_state(sc.gfp_mask);
+ reclaim_state.reclaimed_slab = 0;
+ p->reclaim_state = &reclaim_state;
+ nr_reclaimed = do_try_to_free_pages(zonelist, &sc);
-out:
- current->reclaim_state = NULL;
+ p->reclaim_state = NULL;
+ lockdep_clear_current_reclaim_state();
+ p->flags &= ~PF_MEMALLOC;
- return sc.nr_reclaimed;
+ return nr_reclaimed;
}
#endif /* CONFIG_HIBERNATION */
--
1.6.2.5
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 3/5] vmscan: zone_reclaim() don't use insane swap_cluster_max
2009-11-11 1:57 ` KOSAKI Motohiro
@ 2009-11-11 2:02 ` KOSAKI Motohiro
-1 siblings, 0 replies; 12+ messages in thread
From: KOSAKI Motohiro @ 2009-11-11 2:02 UTC (permalink / raw)
To: LKML
Cc: kosaki.motohiro, linux-mm, Andrew Morton, Minchan Kim,
Rik van Riel, Rafael J. Wysocki
In old days, we didn't have sc.nr_to_reclaim and it brought
sc.swap_cluster_max misuse.
huge sc.swap_cluster_max might makes unnecessary OOM risk and
no performance benefit.
Now, we can stop its insane thing.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
---
mm/vmscan.c | 3 +--
1 files changed, 1 insertions(+), 2 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index eebd260..d0422a8 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2458,8 +2458,7 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
.may_writepage = !!(zone_reclaim_mode & RECLAIM_WRITE),
.may_unmap = !!(zone_reclaim_mode & RECLAIM_SWAP),
.may_swap = 1,
- .swap_cluster_max = max_t(unsigned long, nr_pages,
- SWAP_CLUSTER_MAX),
+ .swap_cluster_max = SWAP_CLUSTER_MAX,
.nr_to_reclaim = max_t(unsigned long, nr_pages,
SWAP_CLUSTER_MAX),
.gfp_mask = gfp_mask,
--
1.6.2.5
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 3/5] vmscan: zone_reclaim() don't use insane swap_cluster_max
@ 2009-11-11 2:02 ` KOSAKI Motohiro
0 siblings, 0 replies; 12+ messages in thread
From: KOSAKI Motohiro @ 2009-11-11 2:02 UTC (permalink / raw)
To: LKML
Cc: kosaki.motohiro, linux-mm, Andrew Morton, Minchan Kim,
Rik van Riel, Rafael J. Wysocki
In old days, we didn't have sc.nr_to_reclaim and it brought
sc.swap_cluster_max misuse.
huge sc.swap_cluster_max might makes unnecessary OOM risk and
no performance benefit.
Now, we can stop its insane thing.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
---
mm/vmscan.c | 3 +--
1 files changed, 1 insertions(+), 2 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index eebd260..d0422a8 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2458,8 +2458,7 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
.may_writepage = !!(zone_reclaim_mode & RECLAIM_WRITE),
.may_unmap = !!(zone_reclaim_mode & RECLAIM_SWAP),
.may_swap = 1,
- .swap_cluster_max = max_t(unsigned long, nr_pages,
- SWAP_CLUSTER_MAX),
+ .swap_cluster_max = SWAP_CLUSTER_MAX,
.nr_to_reclaim = max_t(unsigned long, nr_pages,
SWAP_CLUSTER_MAX),
.gfp_mask = gfp_mask,
--
1.6.2.5
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 4/5] vmscan: Kill sc.swap_cluster_max
2009-11-11 1:57 ` KOSAKI Motohiro
@ 2009-11-11 2:03 ` KOSAKI Motohiro
-1 siblings, 0 replies; 12+ messages in thread
From: KOSAKI Motohiro @ 2009-11-11 2:03 UTC (permalink / raw)
To: LKML
Cc: kosaki.motohiro, linux-mm, Andrew Morton, Minchan Kim,
Rik van Riel, Rafael J. Wysocki
Now, All caller of reclaim use swap_cluster_max as SWAP_CLUSTER_MAX.
Then, we can remove it perfectly.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
---
mm/vmscan.c | 26 ++++++--------------------
1 files changed, 6 insertions(+), 20 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index d0422a8..d53a934 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -71,12 +71,6 @@ struct scan_control {
/* Can pages be swapped as part of reclaim? */
int may_swap;
- /* This context's SWAP_CLUSTER_MAX. If freeing memory for
- * suspend, we effectively ignore SWAP_CLUSTER_MAX.
- * In this context, it doesn't matter that we scan the
- * whole list at once. */
- int swap_cluster_max;
-
int swappiness;
int all_unreclaimable;
@@ -1127,7 +1121,7 @@ static unsigned long shrink_inactive_list(unsigned long max_scan,
unsigned long nr_anon;
unsigned long nr_file;
- nr_taken = sc->isolate_pages(sc->swap_cluster_max,
+ nr_taken = sc->isolate_pages(SWAP_CLUSTER_MAX,
&page_list, &nr_scan, sc->order, mode,
zone, sc->mem_cgroup, 0, file);
@@ -1562,15 +1556,14 @@ static void get_scan_ratio(struct zone *zone, struct scan_control *sc,
* until we collected @swap_cluster_max pages to scan.
*/
static unsigned long nr_scan_try_batch(unsigned long nr_to_scan,
- unsigned long *nr_saved_scan,
- unsigned long swap_cluster_max)
+ unsigned long *nr_saved_scan)
{
unsigned long nr;
*nr_saved_scan += nr_to_scan;
nr = *nr_saved_scan;
- if (nr >= swap_cluster_max)
+ if (nr >= SWAP_CLUSTER_MAX)
*nr_saved_scan = 0;
else
nr = 0;
@@ -1589,7 +1582,6 @@ static void shrink_zone(int priority, struct zone *zone,
unsigned long percent[2]; /* anon @ 0; file @ 1 */
enum lru_list l;
unsigned long nr_reclaimed = sc->nr_reclaimed;
- unsigned long swap_cluster_max = sc->swap_cluster_max;
unsigned long nr_to_reclaim = sc->nr_to_reclaim;
struct zone_reclaim_stat *reclaim_stat = get_reclaim_stat(zone, sc);
int noswap = 0;
@@ -1612,8 +1604,7 @@ static void shrink_zone(int priority, struct zone *zone,
scan = (scan * percent[file]) / 100;
}
nr[l] = nr_scan_try_batch(scan,
- &reclaim_stat->nr_saved_scan[l],
- swap_cluster_max);
+ &reclaim_stat->nr_saved_scan[l]);
}
trace_printk("sz p=%d %d:%s %lu:%lu [%lu %lu %lu %lu] \n",
@@ -1625,7 +1616,8 @@ static void shrink_zone(int priority, struct zone *zone,
nr[LRU_INACTIVE_FILE]) {
for_each_evictable_lru(l) {
if (nr[l]) {
- nr_to_scan = min(nr[l], swap_cluster_max);
+ nr_to_scan = min_t(unsigned long,
+ nr[l], SWAP_CLUSTER_MAX);
nr[l] -= nr_to_scan;
nr_reclaimed += shrink_list(l, nr_to_scan,
@@ -1833,7 +1825,6 @@ unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
struct scan_control sc = {
.gfp_mask = gfp_mask,
.may_writepage = !laptop_mode,
- .swap_cluster_max = SWAP_CLUSTER_MAX,
.nr_to_reclaim = SWAP_CLUSTER_MAX,
.may_unmap = 1,
.may_swap = 1,
@@ -1858,7 +1849,6 @@ unsigned long mem_cgroup_shrink_node_zone(struct mem_cgroup *mem,
.may_writepage = !laptop_mode,
.may_unmap = 1,
.may_swap = !noswap,
- .swap_cluster_max = SWAP_CLUSTER_MAX,
.swappiness = swappiness,
.order = 0,
.mem_cgroup = mem,
@@ -1892,7 +1882,6 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *mem_cont,
.may_writepage = !laptop_mode,
.may_unmap = 1,
.may_swap = !noswap,
- .swap_cluster_max = SWAP_CLUSTER_MAX,
.nr_to_reclaim = SWAP_CLUSTER_MAX,
.swappiness = swappiness,
.order = 0,
@@ -1940,7 +1929,6 @@ static unsigned long balance_pgdat(pg_data_t *pgdat, int order)
.gfp_mask = GFP_KERNEL,
.may_unmap = 1,
.may_swap = 1,
- .swap_cluster_max = SWAP_CLUSTER_MAX,
/*
* kswapd doesn't want to be bailed out while reclaim. because
* we want to put equal scanning pressure on each zone.
@@ -2284,7 +2272,6 @@ unsigned long shrink_all_memory(unsigned long nr_to_reclaim)
.may_swap = 1,
.may_unmap = 1,
.may_writepage = 1,
- .swap_cluster_max = SWAP_CLUSTER_MAX,
.nr_to_reclaim = nr_to_reclaim,
.hibernation_mode = 1,
.swappiness = vm_swappiness,
@@ -2458,7 +2445,6 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
.may_writepage = !!(zone_reclaim_mode & RECLAIM_WRITE),
.may_unmap = !!(zone_reclaim_mode & RECLAIM_SWAP),
.may_swap = 1,
- .swap_cluster_max = SWAP_CLUSTER_MAX,
.nr_to_reclaim = max_t(unsigned long, nr_pages,
SWAP_CLUSTER_MAX),
.gfp_mask = gfp_mask,
--
1.6.2.5
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 4/5] vmscan: Kill sc.swap_cluster_max
@ 2009-11-11 2:03 ` KOSAKI Motohiro
0 siblings, 0 replies; 12+ messages in thread
From: KOSAKI Motohiro @ 2009-11-11 2:03 UTC (permalink / raw)
To: LKML
Cc: kosaki.motohiro, linux-mm, Andrew Morton, Minchan Kim,
Rik van Riel, Rafael J. Wysocki
Now, All caller of reclaim use swap_cluster_max as SWAP_CLUSTER_MAX.
Then, we can remove it perfectly.
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
---
mm/vmscan.c | 26 ++++++--------------------
1 files changed, 6 insertions(+), 20 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index d0422a8..d53a934 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -71,12 +71,6 @@ struct scan_control {
/* Can pages be swapped as part of reclaim? */
int may_swap;
- /* This context's SWAP_CLUSTER_MAX. If freeing memory for
- * suspend, we effectively ignore SWAP_CLUSTER_MAX.
- * In this context, it doesn't matter that we scan the
- * whole list at once. */
- int swap_cluster_max;
-
int swappiness;
int all_unreclaimable;
@@ -1127,7 +1121,7 @@ static unsigned long shrink_inactive_list(unsigned long max_scan,
unsigned long nr_anon;
unsigned long nr_file;
- nr_taken = sc->isolate_pages(sc->swap_cluster_max,
+ nr_taken = sc->isolate_pages(SWAP_CLUSTER_MAX,
&page_list, &nr_scan, sc->order, mode,
zone, sc->mem_cgroup, 0, file);
@@ -1562,15 +1556,14 @@ static void get_scan_ratio(struct zone *zone, struct scan_control *sc,
* until we collected @swap_cluster_max pages to scan.
*/
static unsigned long nr_scan_try_batch(unsigned long nr_to_scan,
- unsigned long *nr_saved_scan,
- unsigned long swap_cluster_max)
+ unsigned long *nr_saved_scan)
{
unsigned long nr;
*nr_saved_scan += nr_to_scan;
nr = *nr_saved_scan;
- if (nr >= swap_cluster_max)
+ if (nr >= SWAP_CLUSTER_MAX)
*nr_saved_scan = 0;
else
nr = 0;
@@ -1589,7 +1582,6 @@ static void shrink_zone(int priority, struct zone *zone,
unsigned long percent[2]; /* anon @ 0; file @ 1 */
enum lru_list l;
unsigned long nr_reclaimed = sc->nr_reclaimed;
- unsigned long swap_cluster_max = sc->swap_cluster_max;
unsigned long nr_to_reclaim = sc->nr_to_reclaim;
struct zone_reclaim_stat *reclaim_stat = get_reclaim_stat(zone, sc);
int noswap = 0;
@@ -1612,8 +1604,7 @@ static void shrink_zone(int priority, struct zone *zone,
scan = (scan * percent[file]) / 100;
}
nr[l] = nr_scan_try_batch(scan,
- &reclaim_stat->nr_saved_scan[l],
- swap_cluster_max);
+ &reclaim_stat->nr_saved_scan[l]);
}
trace_printk("sz p=%d %d:%s %lu:%lu [%lu %lu %lu %lu] \n",
@@ -1625,7 +1616,8 @@ static void shrink_zone(int priority, struct zone *zone,
nr[LRU_INACTIVE_FILE]) {
for_each_evictable_lru(l) {
if (nr[l]) {
- nr_to_scan = min(nr[l], swap_cluster_max);
+ nr_to_scan = min_t(unsigned long,
+ nr[l], SWAP_CLUSTER_MAX);
nr[l] -= nr_to_scan;
nr_reclaimed += shrink_list(l, nr_to_scan,
@@ -1833,7 +1825,6 @@ unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
struct scan_control sc = {
.gfp_mask = gfp_mask,
.may_writepage = !laptop_mode,
- .swap_cluster_max = SWAP_CLUSTER_MAX,
.nr_to_reclaim = SWAP_CLUSTER_MAX,
.may_unmap = 1,
.may_swap = 1,
@@ -1858,7 +1849,6 @@ unsigned long mem_cgroup_shrink_node_zone(struct mem_cgroup *mem,
.may_writepage = !laptop_mode,
.may_unmap = 1,
.may_swap = !noswap,
- .swap_cluster_max = SWAP_CLUSTER_MAX,
.swappiness = swappiness,
.order = 0,
.mem_cgroup = mem,
@@ -1892,7 +1882,6 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *mem_cont,
.may_writepage = !laptop_mode,
.may_unmap = 1,
.may_swap = !noswap,
- .swap_cluster_max = SWAP_CLUSTER_MAX,
.nr_to_reclaim = SWAP_CLUSTER_MAX,
.swappiness = swappiness,
.order = 0,
@@ -1940,7 +1929,6 @@ static unsigned long balance_pgdat(pg_data_t *pgdat, int order)
.gfp_mask = GFP_KERNEL,
.may_unmap = 1,
.may_swap = 1,
- .swap_cluster_max = SWAP_CLUSTER_MAX,
/*
* kswapd doesn't want to be bailed out while reclaim. because
* we want to put equal scanning pressure on each zone.
@@ -2284,7 +2272,6 @@ unsigned long shrink_all_memory(unsigned long nr_to_reclaim)
.may_swap = 1,
.may_unmap = 1,
.may_writepage = 1,
- .swap_cluster_max = SWAP_CLUSTER_MAX,
.nr_to_reclaim = nr_to_reclaim,
.hibernation_mode = 1,
.swappiness = vm_swappiness,
@@ -2458,7 +2445,6 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
.may_writepage = !!(zone_reclaim_mode & RECLAIM_WRITE),
.may_unmap = !!(zone_reclaim_mode & RECLAIM_SWAP),
.may_swap = 1,
- .swap_cluster_max = SWAP_CLUSTER_MAX,
.nr_to_reclaim = max_t(unsigned long, nr_pages,
SWAP_CLUSTER_MAX),
.gfp_mask = gfp_mask,
--
1.6.2.5
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 5/5][nit fix] vmscan Make consistent of reclaim bale out between do_try_to_free_page and shrink_zone
2009-11-11 1:57 ` KOSAKI Motohiro
@ 2009-11-11 2:04 ` KOSAKI Motohiro
-1 siblings, 0 replies; 12+ messages in thread
From: KOSAKI Motohiro @ 2009-11-11 2:04 UTC (permalink / raw)
To: LKML
Cc: kosaki.motohiro, linux-mm, Andrew Morton, Minchan Kim,
Rik van Riel, Rafael J. Wysocki
Fix small inconsistent of ">" and ">=".
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
---
mm/vmscan.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index d53a934..0d0a948 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1632,7 +1632,7 @@ static void shrink_zone(int priority, struct zone *zone,
* with multiple processes reclaiming pages, the total
* freeing target can get unreasonably large.
*/
- if (nr_reclaimed > nr_to_reclaim && priority < DEF_PRIORITY)
+ if (nr_reclaimed >= nr_to_reclaim && priority < DEF_PRIORITY)
break;
}
--
1.6.2.5
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH 5/5][nit fix] vmscan Make consistent of reclaim bale out between do_try_to_free_page and shrink_zone
@ 2009-11-11 2:04 ` KOSAKI Motohiro
0 siblings, 0 replies; 12+ messages in thread
From: KOSAKI Motohiro @ 2009-11-11 2:04 UTC (permalink / raw)
To: LKML
Cc: kosaki.motohiro, linux-mm, Andrew Morton, Minchan Kim,
Rik van Riel, Rafael J. Wysocki
Fix small inconsistent of ">" and ">=".
Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Reviewed-by: Minchan Kim <minchan.kim@gmail.com>
---
mm/vmscan.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index d53a934..0d0a948 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1632,7 +1632,7 @@ static void shrink_zone(int priority, struct zone *zone,
* with multiple processes reclaiming pages, the total
* freeing target can get unreasonably large.
*/
- if (nr_reclaimed > nr_to_reclaim && priority < DEF_PRIORITY)
+ if (nr_reclaimed >= nr_to_reclaim && priority < DEF_PRIORITY)
break;
}
--
1.6.2.5
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 12+ messages in thread
end of thread, other threads:[~2009-11-11 2:04 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-11-11 1:57 [PATCH v3 0/5] Sorting out sc.swap_cluster_max and kill hibernation specific reclaim logic KOSAKI Motohiro
2009-11-11 1:57 ` KOSAKI Motohiro
2009-11-11 1:58 ` [PATCH 1/5] vmscan: separate sc.swap_cluster_max and sc.nr_max_reclaim KOSAKI Motohiro
2009-11-11 1:58 ` KOSAKI Motohiro
2009-11-11 2:01 ` [PATCH 2/5] vmscan: Kill hibernation specific reclaim logic and unify it KOSAKI Motohiro
2009-11-11 2:01 ` KOSAKI Motohiro
2009-11-11 2:02 ` [PATCH 3/5] vmscan: zone_reclaim() don't use insane swap_cluster_max KOSAKI Motohiro
2009-11-11 2:02 ` KOSAKI Motohiro
2009-11-11 2:03 ` [PATCH 4/5] vmscan: Kill sc.swap_cluster_max KOSAKI Motohiro
2009-11-11 2:03 ` KOSAKI Motohiro
2009-11-11 2:04 ` [PATCH 5/5][nit fix] vmscan Make consistent of reclaim bale out between do_try_to_free_page and shrink_zone KOSAKI Motohiro
2009-11-11 2:04 ` KOSAKI Motohiro
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.