* [PATCH next 00/12] mm: replace struct mem_cgroup_zone with struct lruvec
@ 2012-04-26  7:53 Konstantin Khlebnikov
  2012-04-26  7:53 ` [PATCH 01/12] mm/vmscan: store "priority" in struct scan_control Konstantin Khlebnikov
                   ` (12 more replies)
  0 siblings, 13 replies; 20+ messages in thread
From: Konstantin Khlebnikov @ 2012-04-26  7:53 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm, linux-kernel

This patchset depends on Johannes Weiner's patch
"mm: memcg: count pte references from every member of the reclaimed hierarchy".

bloat-o-meter delta for patches 2..12

add/remove: 6/6 grow/shrink: 6/14 up/down: 4414/-4625 (-211)
function                                     old     new   delta
shrink_page_list                               -    2270   +2270
shrink_lruvec                                  -    1386   +1386
update_isolated_counts                         -     376    +376
lruvec_init                                    -     195    +195
get_lruvec_size                                -      61     +61
balance_pgdat                               1856    1904     +48
mem_cgroup_shrink_node_zone                  283     302     +19
shrink_inactive_list                         985    1003     +18
mem_cgroup_get_lruvec_size                     -      18     +18
mem_cgroup_create                           1453    1468     +15
shrink_active_list                           824     830      +6
shrink_zone                                  147     149      +2
mem_control_stat_show                        750     745      -5
mem_cgroup_zone_lruvec                        72      67      -5
mem_cgroup_get_reclaim_stat_from_page        108     103      -5
mem_cgroup_nr_lru_pages                      185     179      -6
inactive_anon_is_low                         110     103      -7
test_mem_cgroup_node_reclaimable             200     192      -8
__mem_cgroup_free                            389     381      -8
putback_inactive_pages                       634     620     -14
mem_control_numa_stat_show                  1015    1001     -14
static.isolate_lru_pages                     419     403     -16
mem_cgroup_force_empty                      1694    1678     -16
get_reclaim_stat                              30       -     -30
mem_cgroup_zone_nr_lru_pages                  64       -     -64
free_area_init_node                          849     784     -65
mem_cgroup_inactive_anon_is_low              177      84     -93
mem_cgroup_inactive_file_is_low              140      31    -109
zone_nr_lru_pages                            110       -    -110
static.update_isolated_counts                376       -    -376
shrink_mem_cgroup_zone                      1381       -   -1381
static.shrink_page_list                     2293       -   -2293

---

Konstantin Khlebnikov (12):
      mm/vmscan: store "priority" in struct scan_control
      mm: add link from struct lruvec to struct zone
      mm/vmscan: push lruvec pointer into isolate_lru_pages()
      mm/vmscan: push zone pointer into shrink_page_list()
      mm/vmscan: push zone pointer into update_isolated_counts()
      mm/vmscan: push lruvec pointer into putback_inactive_pages()
      mm/vmscan: replace zone_nr_lru_pages() with get_lruvec_size()
      mm/vmscan: push lruvec pointer into inactive_list_is_low()
      mm/vmscan: push lruvec pointer into shrink_list()
      mm/vmscan: push lruvec pointer into get_scan_count()
      mm/vmscan: push lruvec pointer into should_continue_reclaim()
      mm/vmscan: kill struct mem_cgroup_zone


 include/linux/memcontrol.h |   16 +--
 include/linux/mmzone.h     |   14 ++
 mm/memcontrol.c            |   33 +++--
 mm/mmzone.c                |   14 ++
 mm/page_alloc.c            |    8 -
 mm/vmscan.c                |  277 ++++++++++++++++++++------------------------
 6 files changed, 177 insertions(+), 185 deletions(-)

-- 
Signature


* [PATCH 01/12] mm/vmscan: store "priority" in struct scan_control
  2012-04-26  7:53 [PATCH next 00/12] mm: replace struct mem_cgroup_zone with struct lruvec Konstantin Khlebnikov
@ 2012-04-26  7:53 ` Konstantin Khlebnikov
  2012-04-26  7:53 ` [PATCH 02/12] mm: add link from struct lruvec to struct zone Konstantin Khlebnikov
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 20+ messages in thread
From: Konstantin Khlebnikov @ 2012-04-26  7:53 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm, linux-kernel

In the memory reclaimer's code some functions have too many arguments, and
"priority" is one of them. It can be stored in "sc", since both are constructed
at the same level. Instead of an open-coded loop we set the initial sc.priority,
and do_try_to_free_pages() decreases it down to zero.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>

---

This patch is unrelated to the rest of this patchset, but it intersects with the
other patches by context.
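
For illustration, the core control-flow change in do_try_to_free_pages() boils
down to roughly the following (a condensed sketch with argument lists trimmed,
not the literal kernel code):

	/* before: "priority" is an extra argument threaded through the call chain */
	for (priority = DEF_PRIORITY; priority >= 0; priority--)
		shrink_zones(priority, zonelist, sc);

	/* after: callers initialize sc->priority = DEF_PRIORITY, the loop
	 * decrements it in place, and callees read sc->priority instead of
	 * taking an extra argument */
	do {
		shrink_zones(zonelist, sc);
	} while (--sc->priority >= 0);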

add/remove: 1/1 grow/shrink: 2/7 up/down: 856/-1045 (-189)
function                                     old     new   delta
shrink_active_list                             -     824    +824
try_to_free_pages                            279     295     +16
try_to_free_mem_cgroup_pages                 295     311     +16
do_try_to_free_pages                        1255    1252      -3
shrink_zone                                  152     147      -5
static.shrink_page_list                     2309    2293     -16
shrink_inactive_list                        1001     985     -16
zone_reclaim                                 639     615     -24
balance_pgdat                               1909    1856     -53
shrink_mem_cgroup_zone                      1485    1381    -104
static.shrink_active_list                    824       -    -824
---
 mm/vmscan.c |  117 +++++++++++++++++++++++++++++++----------------------------
 1 file changed, 61 insertions(+), 56 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 7bc7b8b..d81750c 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -78,6 +78,9 @@ struct scan_control {
 
 	int order;
 
+	/* Scan (total_size >> priority) pages at once */
+	int priority;
+
 	/*
 	 * The memory cgroup that hit its limit and as a result is the
 	 * primary target of this reclaim invocation.
@@ -687,7 +690,6 @@ static enum page_references page_check_references(struct page *page,
 static unsigned long shrink_page_list(struct list_head *page_list,
 				      struct mem_cgroup_zone *mz,
 				      struct scan_control *sc,
-				      int priority,
 				      unsigned long *ret_nr_dirty,
 				      unsigned long *ret_nr_writeback)
 {
@@ -790,7 +792,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 			 * unless under significant pressure.
 			 */
 			if (page_is_file_cache(page) &&
-					(!current_is_kswapd() || priority >= DEF_PRIORITY - 2)) {
+					(!current_is_kswapd() ||
+					 sc->priority >= DEF_PRIORITY - 2)) {
 				/*
 				 * Immediately reclaim when written back.
 				 * Similar in principal to deactivate_page()
@@ -1257,7 +1260,7 @@ update_isolated_counts(struct mem_cgroup_zone *mz,
  */
 static noinline_for_stack unsigned long
 shrink_inactive_list(unsigned long nr_to_scan, struct mem_cgroup_zone *mz,
-		     struct scan_control *sc, int priority, enum lru_list lru)
+		     struct scan_control *sc, enum lru_list lru)
 {
 	LIST_HEAD(page_list);
 	unsigned long nr_scanned;
@@ -1307,7 +1310,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct mem_cgroup_zone *mz,
 
 	update_isolated_counts(mz, &page_list, &nr_anon, &nr_file);
 
-	nr_reclaimed = shrink_page_list(&page_list, mz, sc, priority,
+	nr_reclaimed = shrink_page_list(&page_list, mz, sc,
 						&nr_dirty, &nr_writeback);
 
 	spin_lock_irq(&zone->lru_lock);
@@ -1351,13 +1354,14 @@ shrink_inactive_list(unsigned long nr_to_scan, struct mem_cgroup_zone *mz,
 	 * DEF_PRIORITY-6 For SWAP_CLUSTER_MAX isolated pages, throttle if any
 	 *                     isolated page is PageWriteback
 	 */
-	if (nr_writeback && nr_writeback >= (nr_taken >> (DEF_PRIORITY-priority)))
+	if (nr_writeback && nr_writeback >=
+			(nr_taken >> (DEF_PRIORITY - sc->priority)))
 		wait_iff_congested(zone, BLK_RW_ASYNC, HZ/10);
 
 	trace_mm_vmscan_lru_shrink_inactive(zone->zone_pgdat->node_id,
 		zone_idx(zone),
 		nr_scanned, nr_reclaimed,
-		priority,
+		sc->priority,
 		trace_shrink_flags(file));
 	return nr_reclaimed;
 }
@@ -1421,7 +1425,7 @@ static void move_active_pages_to_lru(struct zone *zone,
 static void shrink_active_list(unsigned long nr_to_scan,
 			       struct mem_cgroup_zone *mz,
 			       struct scan_control *sc,
-			       int priority, enum lru_list lru)
+			       enum lru_list lru)
 {
 	unsigned long nr_taken;
 	unsigned long nr_scanned;
@@ -1604,17 +1608,17 @@ static int inactive_list_is_low(struct mem_cgroup_zone *mz, int file)
 
 static unsigned long shrink_list(enum lru_list lru, unsigned long nr_to_scan,
 				 struct mem_cgroup_zone *mz,
-				 struct scan_control *sc, int priority)
+				 struct scan_control *sc)
 {
 	int file = is_file_lru(lru);
 
 	if (is_active_lru(lru)) {
 		if (inactive_list_is_low(mz, file))
-			shrink_active_list(nr_to_scan, mz, sc, priority, lru);
+			shrink_active_list(nr_to_scan, mz, sc, lru);
 		return 0;
 	}
 
-	return shrink_inactive_list(nr_to_scan, mz, sc, priority, lru);
+	return shrink_inactive_list(nr_to_scan, mz, sc, lru);
 }
 
 static int vmscan_swappiness(struct scan_control *sc)
@@ -1633,7 +1637,7 @@ static int vmscan_swappiness(struct scan_control *sc)
  * nr[0] = anon pages to scan; nr[1] = file pages to scan
  */
 static void get_scan_count(struct mem_cgroup_zone *mz, struct scan_control *sc,
-			   unsigned long *nr, int priority)
+			   unsigned long *nr)
 {
 	unsigned long anon, file, free;
 	unsigned long anon_prio, file_prio;
@@ -1735,8 +1739,8 @@ out:
 		unsigned long scan;
 
 		scan = zone_nr_lru_pages(mz, lru);
-		if (priority || noswap) {
-			scan >>= priority;
+		if (sc->priority || noswap) {
+			scan >>= sc->priority;
 			if (!scan && force_scan)
 				scan = SWAP_CLUSTER_MAX;
 			scan = div64_u64(scan * fraction[file], denominator);
@@ -1746,11 +1750,11 @@ out:
 }
 
 /* Use reclaim/compaction for costly allocs or under memory pressure */
-static bool in_reclaim_compaction(int priority, struct scan_control *sc)
+static bool in_reclaim_compaction(struct scan_control *sc)
 {
 	if (COMPACTION_BUILD && sc->order &&
 			(sc->order > PAGE_ALLOC_COSTLY_ORDER ||
-			 priority < DEF_PRIORITY - 2))
+			 sc->priority < DEF_PRIORITY - 2))
 		return true;
 
 	return false;
@@ -1766,14 +1770,13 @@ static bool in_reclaim_compaction(int priority, struct scan_control *sc)
 static inline bool should_continue_reclaim(struct mem_cgroup_zone *mz,
 					unsigned long nr_reclaimed,
 					unsigned long nr_scanned,
-					int priority,
 					struct scan_control *sc)
 {
 	unsigned long pages_for_compaction;
 	unsigned long inactive_lru_pages;
 
 	/* If not in reclaim/compaction mode, stop */
-	if (!in_reclaim_compaction(priority, sc))
+	if (!in_reclaim_compaction(sc))
 		return false;
 
 	/* Consider stopping depending on scan and reclaim activity */
@@ -1824,7 +1827,7 @@ static inline bool should_continue_reclaim(struct mem_cgroup_zone *mz,
 /*
  * This is a basic per-zone page freer.  Used by both kswapd and direct reclaim.
  */
-static void shrink_mem_cgroup_zone(int priority, struct mem_cgroup_zone *mz,
+static void shrink_mem_cgroup_zone(struct mem_cgroup_zone *mz,
 				   struct scan_control *sc)
 {
 	unsigned long nr[NR_LRU_LISTS];
@@ -1837,7 +1840,7 @@ static void shrink_mem_cgroup_zone(int priority, struct mem_cgroup_zone *mz,
 restart:
 	nr_reclaimed = 0;
 	nr_scanned = sc->nr_scanned;
-	get_scan_count(mz, sc, nr, priority);
+	get_scan_count(mz, sc, nr);
 
 	blk_start_plug(&plug);
 	while (nr[LRU_INACTIVE_ANON] || nr[LRU_ACTIVE_FILE] ||
@@ -1849,7 +1852,7 @@ restart:
 				nr[lru] -= nr_to_scan;
 
 				nr_reclaimed += shrink_list(lru, nr_to_scan,
-							    mz, sc, priority);
+							    mz, sc);
 			}
 		}
 		/*
@@ -1860,7 +1863,8 @@ restart:
 		 * with multiple processes reclaiming pages, the total
 		 * freeing target can get unreasonably large.
 		 */
-		if (nr_reclaimed >= nr_to_reclaim && priority < DEF_PRIORITY)
+		if (nr_reclaimed >= nr_to_reclaim &&
+		    sc->priority < DEF_PRIORITY)
 			break;
 	}
 	blk_finish_plug(&plug);
@@ -1872,24 +1876,22 @@ restart:
 	 */
 	if (inactive_anon_is_low(mz))
 		shrink_active_list(SWAP_CLUSTER_MAX, mz,
-				   sc, priority, LRU_ACTIVE_ANON);
+				   sc, LRU_ACTIVE_ANON);
 
 	/* reclaim/compaction might need reclaim to continue */
 	if (should_continue_reclaim(mz, nr_reclaimed,
-					sc->nr_scanned - nr_scanned,
-					priority, sc))
+				    sc->nr_scanned - nr_scanned, sc))
 		goto restart;
 
 	throttle_vm_writeout(sc->gfp_mask);
 }
 
-static void shrink_zone(int priority, struct zone *zone,
-			struct scan_control *sc)
+static void shrink_zone(struct zone *zone, struct scan_control *sc)
 {
 	struct mem_cgroup *root = sc->target_mem_cgroup;
 	struct mem_cgroup_reclaim_cookie reclaim = {
 		.zone = zone,
-		.priority = priority,
+		.priority = sc->priority,
 	};
 	struct mem_cgroup *memcg;
 
@@ -1900,7 +1902,7 @@ static void shrink_zone(int priority, struct zone *zone,
 			.zone = zone,
 		};
 
-		shrink_mem_cgroup_zone(priority, &mz, sc);
+		shrink_mem_cgroup_zone(&mz, sc);
 		/*
 		 * Limit reclaim has historically picked one memcg and
 		 * scanned it with decreasing priority levels until
@@ -1976,8 +1978,7 @@ static inline bool compaction_ready(struct zone *zone, struct scan_control *sc)
  * the caller that it should consider retrying the allocation instead of
  * further reclaim.
  */
-static bool shrink_zones(int priority, struct zonelist *zonelist,
-					struct scan_control *sc)
+static bool shrink_zones(struct zonelist *zonelist, struct scan_control *sc)
 {
 	struct zoneref *z;
 	struct zone *zone;
@@ -2004,7 +2005,8 @@ static bool shrink_zones(int priority, struct zonelist *zonelist,
 		if (global_reclaim(sc)) {
 			if (!cpuset_zone_allowed_hardwall(zone, GFP_KERNEL))
 				continue;
-			if (zone->all_unreclaimable && priority != DEF_PRIORITY)
+			if (zone->all_unreclaimable &&
+					sc->priority != DEF_PRIORITY)
 				continue;	/* Let kswapd poll it */
 			if (COMPACTION_BUILD) {
 				/*
@@ -2036,7 +2038,7 @@ static bool shrink_zones(int priority, struct zonelist *zonelist,
 			/* need some check for avoid more shrink_zone() */
 		}
 
-		shrink_zone(priority, zone, sc);
+		shrink_zone(zone, sc);
 	}
 
 	return aborted_reclaim;
@@ -2087,7 +2089,6 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
 					struct scan_control *sc,
 					struct shrink_control *shrink)
 {
-	int priority;
 	unsigned long total_scanned = 0;
 	struct reclaim_state *reclaim_state = current->reclaim_state;
 	struct zoneref *z;
@@ -2100,9 +2101,9 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
 	if (global_reclaim(sc))
 		count_vm_event(ALLOCSTALL);
 
-	for (priority = DEF_PRIORITY; priority >= 0; priority--) {
+	do {
 		sc->nr_scanned = 0;
-		aborted_reclaim = shrink_zones(priority, zonelist, sc);
+		aborted_reclaim = shrink_zones(zonelist, sc);
 
 		/*
 		 * Don't shrink slabs when reclaiming memory from
@@ -2144,7 +2145,7 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
 
 		/* Take a nap, wait for some writeback to complete */
 		if (!sc->hibernation_mode && sc->nr_scanned &&
-		    priority < DEF_PRIORITY - 2) {
+		    sc->priority < DEF_PRIORITY - 2) {
 			struct zone *preferred_zone;
 
 			first_zones_zonelist(zonelist, gfp_zone(sc->gfp_mask),
@@ -2152,7 +2153,7 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,
 						&preferred_zone);
 			wait_iff_congested(preferred_zone, BLK_RW_ASYNC, HZ/10);
 		}
-	}
+	} while (--sc->priority >= 0);
 
 out:
 	delayacct_freepages_end();
@@ -2190,6 +2191,7 @@ unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
 		.may_unmap = 1,
 		.may_swap = 1,
 		.order = order,
+		.priority = DEF_PRIORITY,
 		.target_mem_cgroup = NULL,
 		.nodemask = nodemask,
 	};
@@ -2222,6 +2224,7 @@ unsigned long mem_cgroup_shrink_node_zone(struct mem_cgroup *memcg,
 		.may_unmap = 1,
 		.may_swap = !noswap,
 		.order = 0,
+		.priority = 0,
 		.target_mem_cgroup = memcg,
 	};
 	struct mem_cgroup_zone mz = {
@@ -2232,7 +2235,7 @@ unsigned long mem_cgroup_shrink_node_zone(struct mem_cgroup *memcg,
 	sc.gfp_mask = (gfp_mask & GFP_RECLAIM_MASK) |
 			(GFP_HIGHUSER_MOVABLE & ~GFP_RECLAIM_MASK);
 
-	trace_mm_vmscan_memcg_softlimit_reclaim_begin(0,
+	trace_mm_vmscan_memcg_softlimit_reclaim_begin(sc.order,
 						      sc.may_writepage,
 						      sc.gfp_mask);
 
@@ -2243,7 +2246,7 @@ unsigned long mem_cgroup_shrink_node_zone(struct mem_cgroup *memcg,
 	 * will pick up pages from other mem cgroup's as well. We hack
 	 * the priority and make it zero.
 	 */
-	shrink_mem_cgroup_zone(0, &mz, &sc);
+	shrink_mem_cgroup_zone(&mz, &sc);
 
 	trace_mm_vmscan_memcg_softlimit_reclaim_end(sc.nr_reclaimed);
 
@@ -2264,6 +2267,7 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *memcg,
 		.may_swap = !noswap,
 		.nr_to_reclaim = SWAP_CLUSTER_MAX,
 		.order = 0,
+		.priority = DEF_PRIORITY,
 		.target_mem_cgroup = memcg,
 		.nodemask = NULL, /* we don't care the placement */
 		.gfp_mask = (gfp_mask & GFP_RECLAIM_MASK) |
@@ -2294,8 +2298,7 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *memcg,
 }
 #endif
 
-static void age_active_anon(struct zone *zone, struct scan_control *sc,
-			    int priority)
+static void age_active_anon(struct zone *zone, struct scan_control *sc)
 {
 	struct mem_cgroup *memcg;
 
@@ -2311,7 +2314,7 @@ static void age_active_anon(struct zone *zone, struct scan_control *sc,
 
 		if (inactive_anon_is_low(&mz))
 			shrink_active_list(SWAP_CLUSTER_MAX, &mz,
-					   sc, priority, LRU_ACTIVE_ANON);
+					   sc, LRU_ACTIVE_ANON);
 
 		memcg = mem_cgroup_iter(NULL, memcg, NULL);
 	} while (memcg);
@@ -2420,7 +2423,6 @@ static unsigned long balance_pgdat(pg_data_t *pgdat, int order,
 {
 	int all_zones_ok;
 	unsigned long balanced;
-	int priority;
 	int i;
 	int end_zone = 0;	/* Inclusive.  0 = ZONE_DMA */
 	unsigned long total_scanned;
@@ -2444,11 +2446,12 @@ static unsigned long balance_pgdat(pg_data_t *pgdat, int order,
 	};
 loop_again:
 	total_scanned = 0;
+	sc.priority = DEF_PRIORITY;
 	sc.nr_reclaimed = 0;
 	sc.may_writepage = !laptop_mode;
 	count_vm_event(PAGEOUTRUN);
 
-	for (priority = DEF_PRIORITY; priority >= 0; priority--) {
+	do {
 		unsigned long lru_pages = 0;
 		int has_under_min_watermark_zone = 0;
 
@@ -2465,14 +2468,15 @@ loop_again:
 			if (!populated_zone(zone))
 				continue;
 
-			if (zone->all_unreclaimable && priority != DEF_PRIORITY)
+			if (zone->all_unreclaimable &&
+			    sc.priority != DEF_PRIORITY)
 				continue;
 
 			/*
 			 * Do some background aging of the anon list, to give
 			 * pages a chance to be referenced before reclaiming.
 			 */
-			age_active_anon(zone, &sc, priority);
+			age_active_anon(zone, &sc);
 
 			/*
 			 * If the number of buffer_heads in the machine
@@ -2520,7 +2524,8 @@ loop_again:
 			if (!populated_zone(zone))
 				continue;
 
-			if (zone->all_unreclaimable && priority != DEF_PRIORITY)
+			if (zone->all_unreclaimable &&
+			    sc.priority != DEF_PRIORITY)
 				continue;
 
 			sc.nr_scanned = 0;
@@ -2564,7 +2569,7 @@ loop_again:
 				    !zone_watermark_ok_safe(zone, testorder,
 					high_wmark_pages(zone) + balance_gap,
 					end_zone, 0)) {
-				shrink_zone(priority, zone, &sc);
+				shrink_zone(zone, &sc);
 
 				reclaim_state->reclaimed_slab = 0;
 				nr_slab = shrink_slab(&shrink, sc.nr_scanned, lru_pages);
@@ -2621,7 +2626,7 @@ loop_again:
 		 * OK, kswapd is getting into trouble.  Take a nap, then take
 		 * another pass across the zones.
 		 */
-		if (total_scanned && (priority < DEF_PRIORITY - 2)) {
+		if (total_scanned && (sc.priority < DEF_PRIORITY - 2)) {
 			if (has_under_min_watermark_zone)
 				count_vm_event(KSWAPD_SKIP_CONGESTION_WAIT);
 			else
@@ -2636,7 +2641,7 @@ loop_again:
 		 */
 		if (sc.nr_reclaimed >= SWAP_CLUSTER_MAX)
 			break;
-	}
+	} while (--sc.priority >= 0);
 out:
 
 	/*
@@ -2686,7 +2691,8 @@ out:
 			if (!populated_zone(zone))
 				continue;
 
-			if (zone->all_unreclaimable && priority != DEF_PRIORITY)
+			if (zone->all_unreclaimable &&
+			    sc.priority != DEF_PRIORITY)
 				continue;
 
 			/* Would compaction fail due to lack of free memory? */
@@ -2953,6 +2959,7 @@ unsigned long shrink_all_memory(unsigned long nr_to_reclaim)
 		.nr_to_reclaim = nr_to_reclaim,
 		.hibernation_mode = 1,
 		.order = 0,
+		.priority = DEF_PRIORITY,
 	};
 	struct shrink_control shrink = {
 		.gfp_mask = sc.gfp_mask,
@@ -3130,7 +3137,6 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
 	const unsigned long nr_pages = 1 << order;
 	struct task_struct *p = current;
 	struct reclaim_state reclaim_state;
-	int priority;
 	struct scan_control sc = {
 		.may_writepage = !!(zone_reclaim_mode & RECLAIM_WRITE),
 		.may_unmap = !!(zone_reclaim_mode & RECLAIM_SWAP),
@@ -3139,6 +3145,7 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
 				       SWAP_CLUSTER_MAX),
 		.gfp_mask = gfp_mask,
 		.order = order,
+		.priority = ZONE_RECLAIM_PRIORITY,
 	};
 	struct shrink_control shrink = {
 		.gfp_mask = sc.gfp_mask,
@@ -3161,11 +3168,9 @@ static int __zone_reclaim(struct zone *zone, gfp_t gfp_mask, unsigned int order)
 		 * Free memory by calling shrink zone with increasing
 		 * priorities until we have enough memory freed.
 		 */
-		priority = ZONE_RECLAIM_PRIORITY;
 		do {
-			shrink_zone(priority, zone, &sc);
-			priority--;
-		} while (priority >= 0 && sc.nr_reclaimed < nr_pages);
+			shrink_zone(zone, &sc);
+		} while (sc.nr_reclaimed < nr_pages && --sc.priority >= 0);
 	}
 
 	nr_slab_pages0 = zone_page_state(zone, NR_SLAB_RECLAIMABLE);



* [PATCH 02/12] mm: add link from struct lruvec to struct zone
  2012-04-26  7:53 [PATCH next 00/12] mm: replace struct mem_cgroup_zone with struct lruvec Konstantin Khlebnikov
  2012-04-26  7:53 ` [PATCH 01/12] mm/vmscan: store "priority" in struct scan_control Konstantin Khlebnikov
@ 2012-04-26  7:53 ` Konstantin Khlebnikov
  2012-05-02  5:52   ` [PATCH v2 " Konstantin Khlebnikov
  2012-04-26  7:53 ` [PATCH 03/12] mm/vmscan: push lruvec pointer into isolate_lru_pages() Konstantin Khlebnikov
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 20+ messages in thread
From: Konstantin Khlebnikov @ 2012-04-26  7:53 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm, linux-kernel

This is the first stage of struct mem_cgroup_zone removal; further patches
replace struct mem_cgroup_zone with a pointer to struct lruvec.
If CONFIG_CGROUP_MEM_RES_CTLR=n, lruvec_zone() is just a container_of().
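
For illustration, the back-link lets a function that receives only a lruvec
recover its zone. A minimal, hypothetical caller (not part of this patch) could
look like this:

	/* hypothetical helper: lruvec_zone() works whether the lruvec is
	 * embedded in a struct zone (memcg disabled) or carries an explicit
	 * ->zone back-pointer (memcg enabled) */
	static void example_drain(struct lruvec *lruvec, enum lru_list lru)
	{
		struct zone *zone = lruvec_zone(lruvec);

		spin_lock_irq(&zone->lru_lock);
		/* ... walk lruvec->lists[lru] under the zone's LRU lock ... */
		spin_unlock_irq(&zone->lru_lock);
	}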

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
---
 include/linux/mmzone.h |   14 ++++++++++++++
 mm/memcontrol.c        |    4 +---
 mm/mmzone.c            |   14 ++++++++++++++
 mm/page_alloc.c        |    8 +-------
 4 files changed, 30 insertions(+), 10 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 6ee32b2..7ac3527 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -201,6 +201,9 @@ struct zone_reclaim_stat {
 struct lruvec {
 	struct list_head lists[NR_LRU_LISTS];
 	struct zone_reclaim_stat reclaim_stat;
+#ifdef CONFIG_CGROUP_MEM_RES_CTLR
+	struct zone *zone;
+#endif
 };
 
 /* Mask used at gathering information at once (see memcontrol.c) */
@@ -729,6 +732,17 @@ extern int init_currently_empty_zone(struct zone *zone, unsigned long start_pfn,
 				     unsigned long size,
 				     enum memmap_context context);
 
+extern void lruvec_init(struct lruvec *lruvec, struct zone *zone);
+
+static inline struct zone *lruvec_zone(struct lruvec *lruvec)
+{
+#ifdef CONFIG_CGROUP_MEM_RES_CTLR
+	return lruvec->zone;
+#else
+	return container_of(lruvec, struct zone, lruvec);
+#endif
+}
+
 #ifdef CONFIG_HAVE_MEMORY_PRESENT
 void memory_present(int nid, unsigned long start, unsigned long end);
 #else
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 8d9d29f..66c2f80 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4730,7 +4730,6 @@ static int alloc_mem_cgroup_per_zone_info(struct mem_cgroup *memcg, int node)
 {
 	struct mem_cgroup_per_node *pn;
 	struct mem_cgroup_per_zone *mz;
-	enum lru_list lru;
 	int zone, tmp = node;
 	/*
 	 * This routine is called against possible nodes.
@@ -4748,8 +4747,7 @@ static int alloc_mem_cgroup_per_zone_info(struct mem_cgroup *memcg, int node)
 
 	for (zone = 0; zone < MAX_NR_ZONES; zone++) {
 		mz = &pn->zoneinfo[zone];
-		for_each_lru(lru)
-			INIT_LIST_HEAD(&mz->lruvec.lists[lru]);
+		lruvec_init(&mz->lruvec, &NODE_DATA(node)->node_zones[zone]);
 		mz->usage_in_excess = 0;
 		mz->on_tree = false;
 		mz->memcg = memcg;
diff --git a/mm/mmzone.c b/mm/mmzone.c
index 7cf7b7d..6830eab 100644
--- a/mm/mmzone.c
+++ b/mm/mmzone.c
@@ -86,3 +86,17 @@ int memmap_valid_within(unsigned long pfn,
 	return 1;
 }
 #endif /* CONFIG_ARCH_HAS_HOLES_MEMORYMODEL */
+
+void lruvec_init(struct lruvec *lruvec, struct zone *zone)
+{
+	enum lru_list lru;
+
+	memset(lruvec, 0, sizeof(struct lruvec));
+
+	for_each_lru(lru)
+		INIT_LIST_HEAD(&lruvec->lists[lru]);
+
+#ifdef CONFIG_CGROUP_MEM_RES_CTLR
+	lruvec->zone = zone;
+#endif
+}
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 1be83d5..63cb3a8 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4374,7 +4374,6 @@ static void __paginginit free_area_init_core(struct pglist_data *pgdat,
 	for (j = 0; j < MAX_NR_ZONES; j++) {
 		struct zone *zone = pgdat->node_zones + j;
 		unsigned long size, realsize, memmap_pages;
-		enum lru_list lru;
 
 		size = zone_spanned_pages_in_node(nid, j, zones_size);
 		realsize = size - zone_absent_pages_in_node(nid, j,
@@ -4424,12 +4423,7 @@ static void __paginginit free_area_init_core(struct pglist_data *pgdat,
 		zone->zone_pgdat = pgdat;
 
 		zone_pcp_init(zone);
-		for_each_lru(lru)
-			INIT_LIST_HEAD(&zone->lruvec.lists[lru]);
-		zone->lruvec.reclaim_stat.recent_rotated[0] = 0;
-		zone->lruvec.reclaim_stat.recent_rotated[1] = 0;
-		zone->lruvec.reclaim_stat.recent_scanned[0] = 0;
-		zone->lruvec.reclaim_stat.recent_scanned[1] = 0;
+		lruvec_init(&zone->lruvec, zone);
 		zap_zone_vm_stats(zone);
 		zone->flags = 0;
 		if (!size)



* [PATCH 03/12] mm/vmscan: push lruvec pointer into isolate_lru_pages()
  2012-04-26  7:53 [PATCH next 00/12] mm: replace struct mem_cgroup_zone with struct lruvec Konstantin Khlebnikov
  2012-04-26  7:53 ` [PATCH 01/12] mm/vmscan: store "priority" in struct scan_control Konstantin Khlebnikov
  2012-04-26  7:53 ` [PATCH 02/12] mm: add link from struct lruvec to struct zone Konstantin Khlebnikov
@ 2012-04-26  7:53 ` Konstantin Khlebnikov
  2012-04-26  7:54 ` [PATCH 04/12] mm/vmscan: push zone pointer into shrink_page_list() Konstantin Khlebnikov
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 20+ messages in thread
From: Konstantin Khlebnikov @ 2012-04-26  7:53 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm, linux-kernel

This patch moves the mem_cgroup_zone_lruvec() call from isolate_lru_pages() into
shrink_[in]active_list(); further patches push it up to shrink_zone() step by step.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
---
 mm/vmscan.c |   16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index d81750c..49e79d5 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1027,7 +1027,7 @@ int __isolate_lru_page(struct page *page, isolate_mode_t mode)
  * Appropriate locks must be held before calling this function.
  *
  * @nr_to_scan:	The number of pages to look through on the list.
- * @mz:		The mem_cgroup_zone to pull pages from.
+ * @lruvec:	The LRU vector to pull pages from.
  * @dst:	The temp list to put pages on to.
  * @nr_scanned:	The number of pages that were scanned.
  * @sc:		The scan_control struct for this reclaim session
@@ -1037,17 +1037,15 @@ int __isolate_lru_page(struct page *page, isolate_mode_t mode)
  * returns how many pages were moved onto *@dst.
  */
 static unsigned long isolate_lru_pages(unsigned long nr_to_scan,
-		struct mem_cgroup_zone *mz, struct list_head *dst,
+		struct lruvec *lruvec, struct list_head *dst,
 		unsigned long *nr_scanned, struct scan_control *sc,
 		isolate_mode_t mode, enum lru_list lru)
 {
-	struct lruvec *lruvec;
 	struct list_head *src;
 	unsigned long nr_taken = 0;
 	unsigned long scan;
 	int file = is_file_lru(lru);
 
-	lruvec = mem_cgroup_zone_lruvec(mz->zone, mz->mem_cgroup);
 	src = &lruvec->lists[lru];
 
 	for (scan = 0; scan < nr_to_scan && !list_empty(src); scan++) {
@@ -1274,6 +1272,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct mem_cgroup_zone *mz,
 	int file = is_file_lru(lru);
 	struct zone *zone = mz->zone;
 	struct zone_reclaim_stat *reclaim_stat = get_reclaim_stat(mz);
+	struct lruvec *lruvec = mem_cgroup_zone_lruvec(zone, mz->mem_cgroup);
 
 	while (unlikely(too_many_isolated(zone, file, sc))) {
 		congestion_wait(BLK_RW_ASYNC, HZ/10);
@@ -1292,8 +1291,8 @@ shrink_inactive_list(unsigned long nr_to_scan, struct mem_cgroup_zone *mz,
 
 	spin_lock_irq(&zone->lru_lock);
 
-	nr_taken = isolate_lru_pages(nr_to_scan, mz, &page_list, &nr_scanned,
-				     sc, isolate_mode, lru);
+	nr_taken = isolate_lru_pages(nr_to_scan, lruvec, &page_list,
+				     &nr_scanned, sc, isolate_mode, lru);
 	if (global_reclaim(sc)) {
 		zone->pages_scanned += nr_scanned;
 		if (current_is_kswapd())
@@ -1439,6 +1438,7 @@ static void shrink_active_list(unsigned long nr_to_scan,
 	isolate_mode_t isolate_mode = 0;
 	int file = is_file_lru(lru);
 	struct zone *zone = mz->zone;
+	struct lruvec *lruvec = mem_cgroup_zone_lruvec(zone, mz->mem_cgroup);
 
 	lru_add_drain();
 
@@ -1449,8 +1449,8 @@ static void shrink_active_list(unsigned long nr_to_scan,
 
 	spin_lock_irq(&zone->lru_lock);
 
-	nr_taken = isolate_lru_pages(nr_to_scan, mz, &l_hold, &nr_scanned, sc,
-				     isolate_mode, lru);
+	nr_taken = isolate_lru_pages(nr_to_scan, lruvec, &l_hold,
+				     &nr_scanned, sc, isolate_mode, lru);
 	if (global_reclaim(sc))
 		zone->pages_scanned += nr_scanned;
 



* [PATCH 04/12] mm/vmscan: push zone pointer into shrink_page_list()
  2012-04-26  7:53 [PATCH next 00/12] mm: replace struct mem_cgroup_zone with struct lruvec Konstantin Khlebnikov
                   ` (2 preceding siblings ...)
  2012-04-26  7:53 ` [PATCH 03/12] mm/vmscan: push lruvec pointer into isolate_lru_pages() Konstantin Khlebnikov
@ 2012-04-26  7:54 ` Konstantin Khlebnikov
  2012-04-26  7:54 ` [PATCH 05/12] mm/vmscan: push zone pointer into update_isolated_counts() Konstantin Khlebnikov
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 20+ messages in thread
From: Konstantin Khlebnikov @ 2012-04-26  7:54 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm, linux-kernel

shrink_page_list() does not need a pointer to the cgroup; a pointer to the zone
is enough. This patch also kills the "mz" argument of page_check_references(),
which is unused after
"mm: memcg: count pte references from every member of the reclaimed hierarchy".

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
---
 mm/vmscan.c |   11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 49e79d5..44d5821 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -629,7 +629,6 @@ enum page_references {
 };
 
 static enum page_references page_check_references(struct page *page,
-						  struct mem_cgroup_zone *mz,
 						  struct scan_control *sc)
 {
 	int referenced_ptes, referenced_page;
@@ -688,7 +687,7 @@ static enum page_references page_check_references(struct page *page,
  * shrink_page_list() returns the number of reclaimed pages
  */
 static unsigned long shrink_page_list(struct list_head *page_list,
-				      struct mem_cgroup_zone *mz,
+				      struct zone *zone,
 				      struct scan_control *sc,
 				      unsigned long *ret_nr_dirty,
 				      unsigned long *ret_nr_writeback)
@@ -718,7 +717,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 			goto keep;
 
 		VM_BUG_ON(PageActive(page));
-		VM_BUG_ON(page_zone(page) != mz->zone);
+		VM_BUG_ON(page_zone(page) != zone);
 
 		sc->nr_scanned++;
 
@@ -741,7 +740,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 			goto keep;
 		}
 
-		references = page_check_references(page, mz, sc);
+		references = page_check_references(page, sc);
 		switch (references) {
 		case PAGEREF_ACTIVATE:
 			goto activate_locked;
@@ -931,7 +930,7 @@ keep:
 	 * will encounter the same problem
 	 */
 	if (nr_dirty && nr_dirty == nr_congested && global_reclaim(sc))
-		zone_set_flag(mz->zone, ZONE_CONGESTED);
+		zone_set_flag(zone, ZONE_CONGESTED);
 
 	free_hot_cold_page_list(&free_pages, 1);
 
@@ -1309,7 +1308,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct mem_cgroup_zone *mz,
 
 	update_isolated_counts(mz, &page_list, &nr_anon, &nr_file);
 
-	nr_reclaimed = shrink_page_list(&page_list, mz, sc,
+	nr_reclaimed = shrink_page_list(&page_list, zone, sc,
 						&nr_dirty, &nr_writeback);
 
 	spin_lock_irq(&zone->lru_lock);



* [PATCH 05/12] mm/vmscan: push zone pointer into update_isolated_counts()
  2012-04-26  7:53 [PATCH next 00/12] mm: replace struct mem_cgroup_zone with struct lruvec Konstantin Khlebnikov
                   ` (3 preceding siblings ...)
  2012-04-26  7:54 ` [PATCH 04/12] mm/vmscan: push zone pointer into shrink_page_list() Konstantin Khlebnikov
@ 2012-04-26  7:54 ` Konstantin Khlebnikov
  2012-04-26 13:17   ` [PATCH v2 05/12] mm/vmscan: remove update_isolated_counts() Konstantin Khlebnikov
  2012-04-26  7:54 ` [PATCH 06/12] mm/vmscan: push lruvec pointer into putback_inactive_pages() Konstantin Khlebnikov
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 20+ messages in thread
From: Konstantin Khlebnikov @ 2012-04-26  7:54 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm, linux-kernel

Ditto: update_isolated_counts() needs only the zone pointer, not the whole
mem_cgroup_zone.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
---
 mm/vmscan.c |    5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 44d5821..814948ad9 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1206,12 +1206,11 @@ putback_inactive_pages(struct mem_cgroup_zone *mz,
 }
 
 static noinline_for_stack void
-update_isolated_counts(struct mem_cgroup_zone *mz,
+update_isolated_counts(struct zone *zone,
 		       struct list_head *page_list,
 		       unsigned long *nr_anon,
 		       unsigned long *nr_file)
 {
-	struct zone *zone = mz->zone;
 	unsigned int count[NR_LRU_LISTS] = { 0, };
 	unsigned long nr_active = 0;
 	struct page *page;
@@ -1306,7 +1305,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct mem_cgroup_zone *mz,
 	if (nr_taken == 0)
 		return 0;
 
-	update_isolated_counts(mz, &page_list, &nr_anon, &nr_file);
+	update_isolated_counts(zone, &page_list, &nr_anon, &nr_file);
 
 	nr_reclaimed = shrink_page_list(&page_list, zone, sc,
 						&nr_dirty, &nr_writeback);



* [PATCH 06/12] mm/vmscan: push lruvec pointer into putback_inactive_pages()
  2012-04-26  7:53 [PATCH next 00/12] mm: replace struct mem_cgroup_zone with struct lruvec Konstantin Khlebnikov
                   ` (4 preceding siblings ...)
  2012-04-26  7:54 ` [PATCH 05/12] mm/vmscan: push zone pointer into update_isolated_counts() Konstantin Khlebnikov
@ 2012-04-26  7:54 ` Konstantin Khlebnikov
  2012-04-26  7:54 ` [PATCH 07/12] mm/vmscan: replace zone_nr_lru_pages() with get_lruvec_size() Konstantin Khlebnikov
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 20+ messages in thread
From: Konstantin Khlebnikov @ 2012-04-26  7:54 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm, linux-kernel

Now that zone_reclaim_stat is located in the lruvec, we can reach it directly.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
---
 mm/vmscan.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 814948ad9..31df071 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1155,11 +1155,11 @@ static int too_many_isolated(struct zone *zone, int file,
 }
 
 static noinline_for_stack void
-putback_inactive_pages(struct mem_cgroup_zone *mz,
+putback_inactive_pages(struct lruvec *lruvec,
 		       struct list_head *page_list)
 {
-	struct zone_reclaim_stat *reclaim_stat = get_reclaim_stat(mz);
-	struct zone *zone = mz->zone;
+	struct zone_reclaim_stat *reclaim_stat = &lruvec->reclaim_stat;
+	struct zone *zone = lruvec_zone(lruvec);
 	LIST_HEAD(pages_to_free);
 
 	/*
@@ -1319,7 +1319,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct mem_cgroup_zone *mz,
 		__count_vm_events(KSWAPD_STEAL, nr_reclaimed);
 	__count_zone_vm_events(PGSTEAL, zone, nr_reclaimed);
 
-	putback_inactive_pages(mz, &page_list);
+	putback_inactive_pages(lruvec, &page_list);
 
 	__mod_zone_page_state(zone, NR_ISOLATED_ANON, -nr_anon);
 	__mod_zone_page_state(zone, NR_ISOLATED_FILE, -nr_file);



* [PATCH 07/12] mm/vmscan: replace zone_nr_lru_pages() with get_lruvec_size()
  2012-04-26  7:53 [PATCH next 00/12] mm: replace struct mem_cgroup_zone with struct lruvec Konstantin Khlebnikov
                   ` (5 preceding siblings ...)
  2012-04-26  7:54 ` [PATCH 06/12] mm/vmscan: push lruvec pointer into putback_inactive_pages() Konstantin Khlebnikov
@ 2012-04-26  7:54 ` Konstantin Khlebnikov
  2012-04-26  7:54 ` [PATCH 08/12] mm/vmscan: push lruvec pointer into inactive_list_is_low() Konstantin Khlebnikov
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 20+ messages in thread
From: Konstantin Khlebnikov @ 2012-04-26  7:54 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm, linux-kernel

If the memory cgroup is enabled, we always use lruvecs that are embedded into
struct mem_cgroup_per_zone, so we can reach the lru_size counters via container_of().
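
For context, the container_of() works because each memcg lruvec is embedded
alongside its lru_size[] counters in the per-zone structure, roughly (an
abridged sketch; the other fields are omitted):

	/* abridged sketch of the containment that makes container_of() work */
	struct mem_cgroup_per_zone {
		struct lruvec	lruvec;
		unsigned long	lru_size[NR_LRU_LISTS];
		/* ... */
	};

The new mem_cgroup_get_lruvec_size() below simply recovers the
mem_cgroup_per_zone from the lruvec and indexes lru_size[lru].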

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
---
 include/linux/memcontrol.h |    6 ++----
 mm/memcontrol.c            |    9 +++++++++
 mm/vmscan.c                |   31 ++++++++++++++++---------------
 3 files changed, 27 insertions(+), 19 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 76f9d9b..7980187 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -122,8 +122,7 @@ int mem_cgroup_inactive_anon_is_low(struct mem_cgroup *memcg,
 int mem_cgroup_inactive_file_is_low(struct mem_cgroup *memcg,
 				    struct zone *zone);
 int mem_cgroup_select_victim_node(struct mem_cgroup *memcg);
-unsigned long mem_cgroup_zone_nr_lru_pages(struct mem_cgroup *memcg,
-					int nid, int zid, unsigned int lrumask);
+unsigned long mem_cgroup_get_lruvec_size(struct lruvec *lruvec, enum lru_list);
 struct zone_reclaim_stat*
 mem_cgroup_get_reclaim_stat_from_page(struct page *page);
 extern void mem_cgroup_print_oom_info(struct mem_cgroup *memcg,
@@ -342,8 +341,7 @@ mem_cgroup_inactive_file_is_low(struct mem_cgroup *memcg, struct zone *zone)
 }
 
 static inline unsigned long
-mem_cgroup_zone_nr_lru_pages(struct mem_cgroup *memcg, int nid, int zid,
-				unsigned int lru_mask)
+mem_cgroup_get_lruvec_size(struct lruvec *lruvec, enum lru_list lru)
 {
 	return 0;
 }
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 66c2f80..2cb6f4d 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -723,6 +723,15 @@ static void mem_cgroup_charge_statistics(struct mem_cgroup *memcg,
 }
 
 unsigned long
+mem_cgroup_get_lruvec_size(struct lruvec *lruvec, enum lru_list lru)
+{
+	struct mem_cgroup_per_zone *mz;
+
+	mz = container_of(lruvec, struct mem_cgroup_per_zone, lruvec);
+	return mz->lru_size[lru];
+}
+
+static unsigned long
 mem_cgroup_zone_nr_lru_pages(struct mem_cgroup *memcg, int nid, int zid,
 			unsigned int lru_mask)
 {
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 31df071..6d46117 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -155,19 +155,14 @@ static struct zone_reclaim_stat *get_reclaim_stat(struct mem_cgroup_zone *mz)
 	return &mem_cgroup_zone_lruvec(mz->zone, mz->mem_cgroup)->reclaim_stat;
 }
 
-static unsigned long zone_nr_lru_pages(struct mem_cgroup_zone *mz,
-				       enum lru_list lru)
+static unsigned long get_lruvec_size(struct lruvec *lruvec, enum lru_list lru)
 {
 	if (!mem_cgroup_disabled())
-		return mem_cgroup_zone_nr_lru_pages(mz->mem_cgroup,
-						    zone_to_nid(mz->zone),
-						    zone_idx(mz->zone),
-						    BIT(lru));
+		return mem_cgroup_get_lruvec_size(lruvec, lru);
 
-	return zone_page_state(mz->zone, NR_LRU_BASE + lru);
+	return zone_page_state(lruvec_zone(lruvec), NR_LRU_BASE + lru);
 }
 
-
 /*
  * Add a shrinker callback to be called from the vm
  */
@@ -1645,6 +1640,9 @@ static void get_scan_count(struct mem_cgroup_zone *mz, struct scan_control *sc,
 	enum lru_list lru;
 	int noswap = 0;
 	bool force_scan = false;
+	struct lruvec *lruvec;
+
+	lruvec = mem_cgroup_zone_lruvec(mz->zone, mz->mem_cgroup);
 
 	/*
 	 * If the zone or memcg is small, nr[l] can be 0.  This
@@ -1670,10 +1668,10 @@ static void get_scan_count(struct mem_cgroup_zone *mz, struct scan_control *sc,
 		goto out;
 	}
 
-	anon  = zone_nr_lru_pages(mz, LRU_ACTIVE_ANON) +
-		zone_nr_lru_pages(mz, LRU_INACTIVE_ANON);
-	file  = zone_nr_lru_pages(mz, LRU_ACTIVE_FILE) +
-		zone_nr_lru_pages(mz, LRU_INACTIVE_FILE);
+	anon  = get_lruvec_size(lruvec, LRU_ACTIVE_ANON) +
+		get_lruvec_size(lruvec, LRU_INACTIVE_ANON);
+	file  = get_lruvec_size(lruvec, LRU_ACTIVE_FILE) +
+		get_lruvec_size(lruvec, LRU_INACTIVE_FILE);
 
 	if (global_reclaim(sc)) {
 		free  = zone_page_state(mz->zone, NR_FREE_PAGES);
@@ -1736,7 +1734,7 @@ out:
 		int file = is_file_lru(lru);
 		unsigned long scan;
 
-		scan = zone_nr_lru_pages(mz, lru);
+		scan = get_lruvec_size(lruvec, lru);
 		if (sc->priority || noswap) {
 			scan >>= sc->priority;
 			if (!scan && force_scan)
@@ -1772,6 +1770,7 @@ static inline bool should_continue_reclaim(struct mem_cgroup_zone *mz,
 {
 	unsigned long pages_for_compaction;
 	unsigned long inactive_lru_pages;
+	struct lruvec *lruvec;
 
 	/* If not in reclaim/compaction mode, stop */
 	if (!in_reclaim_compaction(sc))
@@ -1804,10 +1803,12 @@ static inline bool should_continue_reclaim(struct mem_cgroup_zone *mz,
 	 * If we have not reclaimed enough pages for compaction and the
 	 * inactive lists are large enough, continue reclaiming
 	 */
+	lruvec = mem_cgroup_zone_lruvec(mz->zone, mz->mem_cgroup);
 	pages_for_compaction = (2UL << sc->order);
-	inactive_lru_pages = zone_nr_lru_pages(mz, LRU_INACTIVE_FILE);
+	inactive_lru_pages = get_lruvec_size(lruvec, LRU_INACTIVE_FILE);
 	if (nr_swap_pages > 0)
-		inactive_lru_pages += zone_nr_lru_pages(mz, LRU_INACTIVE_ANON);
+		inactive_lru_pages += get_lruvec_size(lruvec,
+						      LRU_INACTIVE_ANON);
 	if (sc->nr_reclaimed < pages_for_compaction &&
 			inactive_lru_pages > pages_for_compaction)
 		return true;



* [PATCH 08/12] mm/vmscan: push lruvec pointer into inactive_list_is_low()
  2012-04-26  7:53 [PATCH next 00/12] mm: replace struct mem_cgroup_zone with struct lruvec Konstantin Khlebnikov
                   ` (6 preceding siblings ...)
  2012-04-26  7:54 ` [PATCH 07/12] mm/vmscan: replace zone_nr_lru_pages() with get_lruvec_size() Konstantin Khlebnikov
@ 2012-04-26  7:54 ` Konstantin Khlebnikov
  2012-04-26  7:54 ` [PATCH 09/12] mm/vmscan: push lruvec pointer into shrink_list() Konstantin Khlebnikov
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 20+ messages in thread
From: Konstantin Khlebnikov @ 2012-04-26  7:54 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm, linux-kernel

This patch switches mem_cgroup_inactive_anon_is_low() and
mem_cgroup_inactive_file_is_low() to lruvec pointers;
mem_cgroup_get_lruvec_size() is more efficient than mem_cgroup_zone_nr_lru_pages().

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
---
 include/linux/memcontrol.h |   10 ++++------
 mm/memcontrol.c            |   20 ++++++--------------
 mm/vmscan.c                |   40 ++++++++++++++++++++++------------------
 3 files changed, 32 insertions(+), 38 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 7980187..88877a9 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -117,10 +117,8 @@ void mem_cgroup_iter_break(struct mem_cgroup *, struct mem_cgroup *);
 /*
  * For memory reclaim.
  */
-int mem_cgroup_inactive_anon_is_low(struct mem_cgroup *memcg,
-				    struct zone *zone);
-int mem_cgroup_inactive_file_is_low(struct mem_cgroup *memcg,
-				    struct zone *zone);
+int mem_cgroup_inactive_anon_is_low(struct lruvec *lruvec);
+int mem_cgroup_inactive_file_is_low(struct lruvec *lruvec);
 int mem_cgroup_select_victim_node(struct mem_cgroup *memcg);
 unsigned long mem_cgroup_get_lruvec_size(struct lruvec *lruvec, enum lru_list);
 struct zone_reclaim_stat*
@@ -329,13 +327,13 @@ static inline bool mem_cgroup_disabled(void)
 }
 
 static inline int
-mem_cgroup_inactive_anon_is_low(struct mem_cgroup *memcg, struct zone *zone)
+mem_cgroup_inactive_anon_is_low(struct lruvec *lruvec)
 {
 	return 1;
 }
 
 static inline int
-mem_cgroup_inactive_file_is_low(struct mem_cgroup *memcg, struct zone *zone)
+mem_cgroup_inactive_file_is_low(struct lruvec *lruvec)
 {
 	return 1;
 }
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 2cb6f4d..07c15dd 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1208,19 +1208,15 @@ int task_in_mem_cgroup(struct task_struct *task, const struct mem_cgroup *memcg)
 	return ret;
 }
 
-int mem_cgroup_inactive_anon_is_low(struct mem_cgroup *memcg, struct zone *zone)
+int mem_cgroup_inactive_anon_is_low(struct lruvec *lruvec)
 {
 	unsigned long inactive_ratio;
-	int nid = zone_to_nid(zone);
-	int zid = zone_idx(zone);
 	unsigned long inactive;
 	unsigned long active;
 	unsigned long gb;
 
-	inactive = mem_cgroup_zone_nr_lru_pages(memcg, nid, zid,
-						BIT(LRU_INACTIVE_ANON));
-	active = mem_cgroup_zone_nr_lru_pages(memcg, nid, zid,
-					      BIT(LRU_ACTIVE_ANON));
+	inactive = mem_cgroup_get_lruvec_size(lruvec, LRU_INACTIVE_ANON);
+	active = mem_cgroup_get_lruvec_size(lruvec, LRU_ACTIVE_ANON);
 
 	gb = (inactive + active) >> (30 - PAGE_SHIFT);
 	if (gb)
@@ -1231,17 +1227,13 @@ int mem_cgroup_inactive_anon_is_low(struct mem_cgroup *memcg, struct zone *zone)
 	return inactive * inactive_ratio < active;
 }
 
-int mem_cgroup_inactive_file_is_low(struct mem_cgroup *memcg, struct zone *zone)
+int mem_cgroup_inactive_file_is_low(struct lruvec *lruvec)
 {
 	unsigned long active;
 	unsigned long inactive;
-	int zid = zone_idx(zone);
-	int nid = zone_to_nid(zone);
 
-	inactive = mem_cgroup_zone_nr_lru_pages(memcg, nid, zid,
-						BIT(LRU_INACTIVE_FILE));
-	active = mem_cgroup_zone_nr_lru_pages(memcg, nid, zid,
-					      BIT(LRU_ACTIVE_FILE));
+	inactive = mem_cgroup_get_lruvec_size(lruvec, LRU_INACTIVE_FILE);
+	active = mem_cgroup_get_lruvec_size(lruvec, LRU_ACTIVE_FILE);
 
 	return (active > inactive);
 }
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 6d46117..c055d6e 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1530,13 +1530,12 @@ static int inactive_anon_is_low_global(struct zone *zone)
 
 /**
  * inactive_anon_is_low - check if anonymous pages need to be deactivated
- * @zone: zone to check
- * @sc:   scan control of this context
+ * @lruvec: LRU vector to check
  *
  * Returns true if the zone does not have enough inactive anon pages,
  * meaning some active anon pages need to be deactivated.
  */
-static int inactive_anon_is_low(struct mem_cgroup_zone *mz)
+static int inactive_anon_is_low(struct lruvec *lruvec)
 {
 	/*
 	 * If we don't have swap space, anonymous page deactivation
@@ -1546,13 +1545,12 @@ static int inactive_anon_is_low(struct mem_cgroup_zone *mz)
 		return 0;
 
 	if (!mem_cgroup_disabled())
-		return mem_cgroup_inactive_anon_is_low(mz->mem_cgroup,
-						       mz->zone);
+		return mem_cgroup_inactive_anon_is_low(lruvec);
 
-	return inactive_anon_is_low_global(mz->zone);
+	return inactive_anon_is_low_global(lruvec_zone(lruvec));
 }
 #else
-static inline int inactive_anon_is_low(struct mem_cgroup_zone *mz)
+static inline int inactive_anon_is_low(struct lruvec *lruvec)
 {
 	return 0;
 }
@@ -1570,7 +1568,7 @@ static int inactive_file_is_low_global(struct zone *zone)
 
 /**
  * inactive_file_is_low - check if file pages need to be deactivated
- * @mz: memory cgroup and zone to check
+ * @lruvec: LRU vector to check
  *
  * When the system is doing streaming IO, memory pressure here
  * ensures that active file pages get deactivated, until more
@@ -1582,21 +1580,20 @@ static int inactive_file_is_low_global(struct zone *zone)
  * This uses a different ratio than the anonymous pages, because
  * the page cache uses a use-once replacement algorithm.
  */
-static int inactive_file_is_low(struct mem_cgroup_zone *mz)
+static int inactive_file_is_low(struct lruvec *lruvec)
 {
 	if (!mem_cgroup_disabled())
-		return mem_cgroup_inactive_file_is_low(mz->mem_cgroup,
-						       mz->zone);
+		return mem_cgroup_inactive_file_is_low(lruvec);
 
-	return inactive_file_is_low_global(mz->zone);
+	return inactive_file_is_low_global(lruvec_zone(lruvec));
 }
 
-static int inactive_list_is_low(struct mem_cgroup_zone *mz, int file)
+static int inactive_list_is_low(struct lruvec *lruvec, int file)
 {
 	if (file)
-		return inactive_file_is_low(mz);
+		return inactive_file_is_low(lruvec);
 	else
-		return inactive_anon_is_low(mz);
+		return inactive_anon_is_low(lruvec);
 }
 
 static unsigned long shrink_list(enum lru_list lru, unsigned long nr_to_scan,
@@ -1606,7 +1603,10 @@ static unsigned long shrink_list(enum lru_list lru, unsigned long nr_to_scan,
 	int file = is_file_lru(lru);
 
 	if (is_active_lru(lru)) {
-		if (inactive_list_is_low(mz, file))
+		struct lruvec *lruvec = mem_cgroup_zone_lruvec(mz->zone,
+							       mz->mem_cgroup);
+
+		if (inactive_list_is_low(lruvec, file))
 			shrink_active_list(nr_to_scan, mz, sc, lru);
 		return 0;
 	}
@@ -1835,6 +1835,9 @@ static void shrink_mem_cgroup_zone(struct mem_cgroup_zone *mz,
 	unsigned long nr_reclaimed, nr_scanned;
 	unsigned long nr_to_reclaim = sc->nr_to_reclaim;
 	struct blk_plug plug;
+	struct lruvec *lruvec;
+
+	lruvec = mem_cgroup_zone_lruvec(mz->zone, mz->mem_cgroup);
 
 restart:
 	nr_reclaimed = 0;
@@ -1873,7 +1876,7 @@ restart:
 	 * Even if we did not try to evict anon pages at all, we want to
 	 * rebalance the anon lru active/inactive ratio.
 	 */
-	if (inactive_anon_is_low(mz))
+	if (inactive_anon_is_low(lruvec))
 		shrink_active_list(SWAP_CLUSTER_MAX, mz,
 				   sc, LRU_ACTIVE_ANON);
 
@@ -2306,12 +2309,13 @@ static void age_active_anon(struct zone *zone, struct scan_control *sc)
 
 	memcg = mem_cgroup_iter(NULL, NULL, NULL);
 	do {
+		struct lruvec *lruvec = mem_cgroup_zone_lruvec(zone, memcg);
 		struct mem_cgroup_zone mz = {
 			.mem_cgroup = memcg,
 			.zone = zone,
 		};
 
-		if (inactive_anon_is_low(&mz))
+		if (inactive_anon_is_low(lruvec))
 			shrink_active_list(SWAP_CLUSTER_MAX, &mz,
 					   sc, LRU_ACTIVE_ANON);
 



* [PATCH 09/12] mm/vmscan: push lruvec pointer into shrink_list()
  2012-04-26  7:53 [PATCH next 00/12] mm: replace struct mem_cgroup_zone with struct lruvec Konstantin Khlebnikov
                   ` (7 preceding siblings ...)
  2012-04-26  7:54 ` [PATCH 08/12] mm/vmscan: push lruvec pointer into inactive_list_is_low() Konstantin Khlebnikov
@ 2012-04-26  7:54 ` Konstantin Khlebnikov
  2012-04-26  7:54 ` [PATCH 10/12] mm/vmscan: push lruvec pointer into get_scan_count() Konstantin Khlebnikov
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 20+ messages in thread
From: Konstantin Khlebnikov @ 2012-04-26  7:54 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm, linux-kernel

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
---
 mm/vmscan.c |   34 ++++++++++++----------------------
 1 file changed, 12 insertions(+), 22 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index c055d6e..258e002 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1250,7 +1250,7 @@ update_isolated_counts(struct zone *zone,
  * of reclaimed pages
  */
 static noinline_for_stack unsigned long
-shrink_inactive_list(unsigned long nr_to_scan, struct mem_cgroup_zone *mz,
+shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
 		     struct scan_control *sc, enum lru_list lru)
 {
 	LIST_HEAD(page_list);
@@ -1263,9 +1263,8 @@ shrink_inactive_list(unsigned long nr_to_scan, struct mem_cgroup_zone *mz,
 	unsigned long nr_writeback = 0;
 	isolate_mode_t isolate_mode = 0;
 	int file = is_file_lru(lru);
-	struct zone *zone = mz->zone;
-	struct zone_reclaim_stat *reclaim_stat = get_reclaim_stat(mz);
-	struct lruvec *lruvec = mem_cgroup_zone_lruvec(zone, mz->mem_cgroup);
+	struct zone *zone = lruvec_zone(lruvec);
+	struct zone_reclaim_stat *reclaim_stat = &lruvec->reclaim_stat;
 
 	while (unlikely(too_many_isolated(zone, file, sc))) {
 		congestion_wait(BLK_RW_ASYNC, HZ/10);
@@ -1415,7 +1414,7 @@ static void move_active_pages_to_lru(struct zone *zone,
 }
 
 static void shrink_active_list(unsigned long nr_to_scan,
-			       struct mem_cgroup_zone *mz,
+			       struct lruvec *lruvec,
 			       struct scan_control *sc,
 			       enum lru_list lru)
 {
@@ -1426,12 +1425,11 @@ static void shrink_active_list(unsigned long nr_to_scan,
 	LIST_HEAD(l_active);
 	LIST_HEAD(l_inactive);
 	struct page *page;
-	struct zone_reclaim_stat *reclaim_stat = get_reclaim_stat(mz);
+	struct zone_reclaim_stat *reclaim_stat = &lruvec->reclaim_stat;
 	unsigned long nr_rotated = 0;
 	isolate_mode_t isolate_mode = 0;
 	int file = is_file_lru(lru);
-	struct zone *zone = mz->zone;
-	struct lruvec *lruvec = mem_cgroup_zone_lruvec(zone, mz->mem_cgroup);
+	struct zone *zone = lruvec_zone(lruvec);
 
 	lru_add_drain();
 
@@ -1597,21 +1595,17 @@ static int inactive_list_is_low(struct lruvec *lruvec, int file)
 }
 
 static unsigned long shrink_list(enum lru_list lru, unsigned long nr_to_scan,
-				 struct mem_cgroup_zone *mz,
-				 struct scan_control *sc)
+				 struct lruvec *lruvec, struct scan_control *sc)
 {
 	int file = is_file_lru(lru);
 
 	if (is_active_lru(lru)) {
-		struct lruvec *lruvec = mem_cgroup_zone_lruvec(mz->zone,
-							       mz->mem_cgroup);
-
 		if (inactive_list_is_low(lruvec, file))
-			shrink_active_list(nr_to_scan, mz, sc, lru);
+			shrink_active_list(nr_to_scan, lruvec, sc, lru);
 		return 0;
 	}
 
-	return shrink_inactive_list(nr_to_scan, mz, sc, lru);
+	return shrink_inactive_list(nr_to_scan, lruvec, sc, lru);
 }
 
 static int vmscan_swappiness(struct scan_control *sc)
@@ -1854,7 +1848,7 @@ restart:
 				nr[lru] -= nr_to_scan;
 
 				nr_reclaimed += shrink_list(lru, nr_to_scan,
-							    mz, sc);
+							    lruvec, sc);
 			}
 		}
 		/*
@@ -1877,7 +1871,7 @@ restart:
 	 * rebalance the anon lru active/inactive ratio.
 	 */
 	if (inactive_anon_is_low(lruvec))
-		shrink_active_list(SWAP_CLUSTER_MAX, mz,
+		shrink_active_list(SWAP_CLUSTER_MAX, lruvec,
 				   sc, LRU_ACTIVE_ANON);
 
 	/* reclaim/compaction might need reclaim to continue */
@@ -2310,13 +2304,9 @@ static void age_active_anon(struct zone *zone, struct scan_control *sc)
 	memcg = mem_cgroup_iter(NULL, NULL, NULL);
 	do {
 		struct lruvec *lruvec = mem_cgroup_zone_lruvec(zone, memcg);
-		struct mem_cgroup_zone mz = {
-			.mem_cgroup = memcg,
-			.zone = zone,
-		};
 
 		if (inactive_anon_is_low(lruvec))
-			shrink_active_list(SWAP_CLUSTER_MAX, &mz,
+			shrink_active_list(SWAP_CLUSTER_MAX, lruvec,
 					   sc, LRU_ACTIVE_ANON);
 
 		memcg = mem_cgroup_iter(NULL, memcg, NULL);



* [PATCH 10/12] mm/vmscan: push lruvec pointer into get_scan_count()
  2012-04-26  7:53 [PATCH next 00/12] mm: replace struct mem_cgroup_zone with struct lruvec Konstantin Khlebnikov
                   ` (8 preceding siblings ...)
  2012-04-26  7:54 ` [PATCH 09/12] mm/vmscan: push lruvec pointer into shrink_list() Konstantin Khlebnikov
@ 2012-04-26  7:54 ` Konstantin Khlebnikov
  2012-04-26  7:54 ` [PATCH 11/12] mm/vmscan: push lruvec pointer into should_continue_reclaim() Konstantin Khlebnikov
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 20+ messages in thread
From: Konstantin Khlebnikov @ 2012-04-26  7:54 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm, linux-kernel

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
---
 mm/vmscan.c |   25 +++++++++----------------
 1 file changed, 9 insertions(+), 16 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 258e002..1724ec6 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -150,11 +150,6 @@ static bool global_reclaim(struct scan_control *sc)
 }
 #endif
 
-static struct zone_reclaim_stat *get_reclaim_stat(struct mem_cgroup_zone *mz)
-{
-	return &mem_cgroup_zone_lruvec(mz->zone, mz->mem_cgroup)->reclaim_stat;
-}
-
 static unsigned long get_lruvec_size(struct lruvec *lruvec, enum lru_list lru)
 {
 	if (!mem_cgroup_disabled())
@@ -1623,20 +1618,18 @@ static int vmscan_swappiness(struct scan_control *sc)
  *
  * nr[0] = anon pages to scan; nr[1] = file pages to scan
  */
-static void get_scan_count(struct mem_cgroup_zone *mz, struct scan_control *sc,
+static void get_scan_count(struct lruvec *lruvec, struct scan_control *sc,
 			   unsigned long *nr)
 {
 	unsigned long anon, file, free;
 	unsigned long anon_prio, file_prio;
 	unsigned long ap, fp;
-	struct zone_reclaim_stat *reclaim_stat = get_reclaim_stat(mz);
+	struct zone_reclaim_stat *reclaim_stat = &lruvec->reclaim_stat;
 	u64 fraction[2], denominator;
 	enum lru_list lru;
 	int noswap = 0;
 	bool force_scan = false;
-	struct lruvec *lruvec;
-
-	lruvec = mem_cgroup_zone_lruvec(mz->zone, mz->mem_cgroup);
+	struct zone *zone = lruvec_zone(lruvec);
 
 	/*
 	 * If the zone or memcg is small, nr[l] can be 0.  This
@@ -1648,7 +1641,7 @@ static void get_scan_count(struct mem_cgroup_zone *mz, struct scan_control *sc,
 	 * latencies, so it's better to scan a minimum amount there as
 	 * well.
 	 */
-	if (current_is_kswapd() && mz->zone->all_unreclaimable)
+	if (current_is_kswapd() && zone->all_unreclaimable)
 		force_scan = true;
 	if (!global_reclaim(sc))
 		force_scan = true;
@@ -1668,10 +1661,10 @@ static void get_scan_count(struct mem_cgroup_zone *mz, struct scan_control *sc,
 		get_lruvec_size(lruvec, LRU_INACTIVE_FILE);
 
 	if (global_reclaim(sc)) {
-		free  = zone_page_state(mz->zone, NR_FREE_PAGES);
+		free  = zone_page_state(zone, NR_FREE_PAGES);
 		/* If we have very few page cache pages,
 		   force-scan anon pages. */
-		if (unlikely(file + free <= high_wmark_pages(mz->zone))) {
+		if (unlikely(file + free <= high_wmark_pages(zone))) {
 			fraction[0] = 1;
 			fraction[1] = 0;
 			denominator = 1;
@@ -1697,7 +1690,7 @@ static void get_scan_count(struct mem_cgroup_zone *mz, struct scan_control *sc,
 	 *
 	 * anon in [0], file in [1]
 	 */
-	spin_lock_irq(&mz->zone->lru_lock);
+	spin_lock_irq(&zone->lru_lock);
 	if (unlikely(reclaim_stat->recent_scanned[0] > anon / 4)) {
 		reclaim_stat->recent_scanned[0] /= 2;
 		reclaim_stat->recent_rotated[0] /= 2;
@@ -1718,7 +1711,7 @@ static void get_scan_count(struct mem_cgroup_zone *mz, struct scan_control *sc,
 
 	fp = (file_prio + 1) * (reclaim_stat->recent_scanned[1] + 1);
 	fp /= reclaim_stat->recent_rotated[1] + 1;
-	spin_unlock_irq(&mz->zone->lru_lock);
+	spin_unlock_irq(&zone->lru_lock);
 
 	fraction[0] = ap;
 	fraction[1] = fp;
@@ -1836,7 +1829,7 @@ static void shrink_mem_cgroup_zone(struct mem_cgroup_zone *mz,
 restart:
 	nr_reclaimed = 0;
 	nr_scanned = sc->nr_scanned;
-	get_scan_count(mz, sc, nr);
+	get_scan_count(lruvec, sc, nr);
 
 	blk_start_plug(&plug);
 	while (nr[LRU_INACTIVE_ANON] || nr[LRU_ACTIVE_FILE] ||


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 11/12] mm/vmscan: push lruvec pointer into should_continue_reclaim()
  2012-04-26  7:53 [PATCH next 00/12] mm: replace struct mem_cgroup_zone with struct lruvec Konstantin Khlebnikov
                   ` (9 preceding siblings ...)
  2012-04-26  7:54 ` [PATCH 10/12] mm/vmscan: push lruvec pointer into get_scan_count() Konstantin Khlebnikov
@ 2012-04-26  7:54 ` Konstantin Khlebnikov
  2012-04-26  7:54 ` [PATCH 12/12] mm/vmscan: kill struct mem_cgroup_zone Konstantin Khlebnikov
  2012-04-26 23:25 ` [PATCH next 00/12] mm: replace struct mem_cgroup_zone with struct lruvec Andrew Morton
  12 siblings, 0 replies; 20+ messages in thread
From: Konstantin Khlebnikov @ 2012-04-26  7:54 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm, linux-kernel

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
---
 mm/vmscan.c |    8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 1724ec6..a9114739 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1750,14 +1750,13 @@ static bool in_reclaim_compaction(struct scan_control *sc)
  * calls try_to_compact_zone() that it will have enough free pages to succeed.
  * It will give up earlier than that if there is difficulty reclaiming pages.
  */
-static inline bool should_continue_reclaim(struct mem_cgroup_zone *mz,
+static inline bool should_continue_reclaim(struct lruvec *lruvec,
 					unsigned long nr_reclaimed,
 					unsigned long nr_scanned,
 					struct scan_control *sc)
 {
 	unsigned long pages_for_compaction;
 	unsigned long inactive_lru_pages;
-	struct lruvec *lruvec;
 
 	/* If not in reclaim/compaction mode, stop */
 	if (!in_reclaim_compaction(sc))
@@ -1790,7 +1789,6 @@ static inline bool should_continue_reclaim(struct mem_cgroup_zone *mz,
 	 * If we have not reclaimed enough pages for compaction and the
 	 * inactive lists are large enough, continue reclaiming
 	 */
-	lruvec = mem_cgroup_zone_lruvec(mz->zone, mz->mem_cgroup);
 	pages_for_compaction = (2UL << sc->order);
 	inactive_lru_pages = get_lruvec_size(lruvec, LRU_INACTIVE_FILE);
 	if (nr_swap_pages > 0)
@@ -1801,7 +1799,7 @@ static inline bool should_continue_reclaim(struct mem_cgroup_zone *mz,
 		return true;
 
 	/* If compaction would go ahead or the allocation would succeed, stop */
-	switch (compaction_suitable(mz->zone, sc->order)) {
+	switch (compaction_suitable(lruvec_zone(lruvec), sc->order)) {
 	case COMPACT_PARTIAL:
 	case COMPACT_CONTINUE:
 		return false;
@@ -1868,7 +1866,7 @@ restart:
 				   sc, LRU_ACTIVE_ANON);
 
 	/* reclaim/compaction might need reclaim to continue */
-	if (should_continue_reclaim(mz, nr_reclaimed,
+	if (should_continue_reclaim(lruvec, nr_reclaimed,
 				    sc->nr_scanned - nr_scanned, sc))
 		goto restart;
 


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 12/12] mm/vmscan: kill struct mem_cgroup_zone
  2012-04-26  7:53 [PATCH next 00/12] mm: replace struct mem_cgroup_zone with struct lruvec Konstantin Khlebnikov
                   ` (10 preceding siblings ...)
  2012-04-26  7:54 ` [PATCH 11/12] mm/vmscan: push lruvec pointer into should_continue_reclaim() Konstantin Khlebnikov
@ 2012-04-26  7:54 ` Konstantin Khlebnikov
  2012-04-26 23:25 ` [PATCH next 00/12] mm: replace struct mem_cgroup_zone with struct lruvec Andrew Morton
  12 siblings, 0 replies; 20+ messages in thread
From: Konstantin Khlebnikov @ 2012-04-26  7:54 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm, linux-kernel

This patch kills struct mem_cgroup_zone and renames shrink_mem_cgroup_zone()
to shrink_lruvec(); it now always shrinks the single lruvec it takes as an argument.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
---
 mm/vmscan.c |   26 ++++++--------------------
 1 file changed, 6 insertions(+), 20 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index a9114739..34cd8a5 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -94,11 +94,6 @@ struct scan_control {
 	nodemask_t	*nodemask;
 };
 
-struct mem_cgroup_zone {
-	struct mem_cgroup *mem_cgroup;
-	struct zone *zone;
-};
-
 #define lru_to_page(_head) (list_entry((_head)->prev, struct page, lru))
 
 #ifdef ARCH_HAS_PREFETCH
@@ -1811,8 +1806,7 @@ static inline bool should_continue_reclaim(struct lruvec *lruvec,
 /*
  * This is a basic per-zone page freer.  Used by both kswapd and direct reclaim.
  */
-static void shrink_mem_cgroup_zone(struct mem_cgroup_zone *mz,
-				   struct scan_control *sc)
+static void shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc)
 {
 	unsigned long nr[NR_LRU_LISTS];
 	unsigned long nr_to_scan;
@@ -1820,9 +1814,6 @@ static void shrink_mem_cgroup_zone(struct mem_cgroup_zone *mz,
 	unsigned long nr_reclaimed, nr_scanned;
 	unsigned long nr_to_reclaim = sc->nr_to_reclaim;
 	struct blk_plug plug;
-	struct lruvec *lruvec;
-
-	lruvec = mem_cgroup_zone_lruvec(mz->zone, mz->mem_cgroup);
 
 restart:
 	nr_reclaimed = 0;
@@ -1884,12 +1875,10 @@ static void shrink_zone(struct zone *zone, struct scan_control *sc)
 
 	memcg = mem_cgroup_iter(root, NULL, &reclaim);
 	do {
-		struct mem_cgroup_zone mz = {
-			.mem_cgroup = memcg,
-			.zone = zone,
-		};
+		struct lruvec *lruvec = mem_cgroup_zone_lruvec(zone, memcg);
+
+		shrink_lruvec(lruvec, sc);
 
-		shrink_mem_cgroup_zone(&mz, sc);
 		/*
 		 * Limit reclaim has historically picked one memcg and
 		 * scanned it with decreasing priority levels until
@@ -2214,10 +2203,7 @@ unsigned long mem_cgroup_shrink_node_zone(struct mem_cgroup *memcg,
 		.priority = 0,
 		.target_mem_cgroup = memcg,
 	};
-	struct mem_cgroup_zone mz = {
-		.mem_cgroup = memcg,
-		.zone = zone,
-	};
+	struct lruvec *lruvec = mem_cgroup_zone_lruvec(zone, memcg);
 
 	sc.gfp_mask = (gfp_mask & GFP_RECLAIM_MASK) |
 			(GFP_HIGHUSER_MOVABLE & ~GFP_RECLAIM_MASK);
@@ -2233,7 +2219,7 @@ unsigned long mem_cgroup_shrink_node_zone(struct mem_cgroup *memcg,
 	 * will pick up pages from other mem cgroup's as well. We hack
 	 * the priority and make it zero.
 	 */
-	shrink_mem_cgroup_zone(&mz, &sc);
+	shrink_lruvec(lruvec, &sc);
 
 	trace_mm_vmscan_memcg_softlimit_reclaim_end(sc.nr_reclaimed);
 


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v2 05/12] mm/vmscan: remove update_isolated_counts()
  2012-04-26  7:54 ` [PATCH 05/12] mm/vmscan: push zone pointer into update_isolated_counts() Konstantin Khlebnikov
@ 2012-04-26 13:17   ` Konstantin Khlebnikov
  2012-04-27 10:28     ` Mel Gorman
  0 siblings, 1 reply; 20+ messages in thread
From: Konstantin Khlebnikov @ 2012-04-26 13:17 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm, linux-kernel, Mel Gorman

update_isolated_counts() is no longer required, because lumpy-reclaim was removed.
The insanity is over: there is only one kind of inactive page here now. Since
isolate_lru_pages() takes pages from a single LRU list, the per-list accounting
collapses into a single nr_taken adjustment in shrink_inactive_list().

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
---
 mm/vmscan.c |   60 ++++++-----------------------------------------------------
 1 file changed, 6 insertions(+), 54 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 44d5821..6f617c4 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1205,52 +1205,6 @@ putback_inactive_pages(struct mem_cgroup_zone *mz,
 	list_splice(&pages_to_free, page_list);
 }
 
-static noinline_for_stack void
-update_isolated_counts(struct mem_cgroup_zone *mz,
-		       struct list_head *page_list,
-		       unsigned long *nr_anon,
-		       unsigned long *nr_file)
-{
-	struct zone *zone = mz->zone;
-	unsigned int count[NR_LRU_LISTS] = { 0, };
-	unsigned long nr_active = 0;
-	struct page *page;
-	int lru;
-
-	/*
-	 * Count pages and clear active flags
-	 */
-	list_for_each_entry(page, page_list, lru) {
-		int numpages = hpage_nr_pages(page);
-		lru = page_lru_base_type(page);
-		if (PageActive(page)) {
-			lru += LRU_ACTIVE;
-			ClearPageActive(page);
-			nr_active += numpages;
-		}
-		count[lru] += numpages;
-	}
-
-	preempt_disable();
-	__count_vm_events(PGDEACTIVATE, nr_active);
-
-	__mod_zone_page_state(zone, NR_ACTIVE_FILE,
-			      -count[LRU_ACTIVE_FILE]);
-	__mod_zone_page_state(zone, NR_INACTIVE_FILE,
-			      -count[LRU_INACTIVE_FILE]);
-	__mod_zone_page_state(zone, NR_ACTIVE_ANON,
-			      -count[LRU_ACTIVE_ANON]);
-	__mod_zone_page_state(zone, NR_INACTIVE_ANON,
-			      -count[LRU_INACTIVE_ANON]);
-
-	*nr_anon = count[LRU_ACTIVE_ANON] + count[LRU_INACTIVE_ANON];
-	*nr_file = count[LRU_ACTIVE_FILE] + count[LRU_INACTIVE_FILE];
-
-	__mod_zone_page_state(zone, NR_ISOLATED_ANON, *nr_anon);
-	__mod_zone_page_state(zone, NR_ISOLATED_FILE, *nr_file);
-	preempt_enable();
-}
-
 /*
  * shrink_inactive_list() is a helper for shrink_zone().  It returns the number
  * of reclaimed pages
@@ -1263,8 +1217,6 @@ shrink_inactive_list(unsigned long nr_to_scan, struct mem_cgroup_zone *mz,
 	unsigned long nr_scanned;
 	unsigned long nr_reclaimed = 0;
 	unsigned long nr_taken;
-	unsigned long nr_anon;
-	unsigned long nr_file;
 	unsigned long nr_dirty = 0;
 	unsigned long nr_writeback = 0;
 	isolate_mode_t isolate_mode = 0;
@@ -1292,6 +1244,10 @@ shrink_inactive_list(unsigned long nr_to_scan, struct mem_cgroup_zone *mz,
 
 	nr_taken = isolate_lru_pages(nr_to_scan, lruvec, &page_list,
 				     &nr_scanned, sc, isolate_mode, lru);
+
+	__mod_zone_page_state(zone, NR_LRU_BASE + lru, -nr_taken);
+	__mod_zone_page_state(zone, NR_ISOLATED_ANON + file, nr_taken);
+
 	if (global_reclaim(sc)) {
 		zone->pages_scanned += nr_scanned;
 		if (current_is_kswapd())
@@ -1306,15 +1262,12 @@ shrink_inactive_list(unsigned long nr_to_scan, struct mem_cgroup_zone *mz,
 	if (nr_taken == 0)
 		return 0;
 
-	update_isolated_counts(mz, &page_list, &nr_anon, &nr_file);
-
 	nr_reclaimed = shrink_page_list(&page_list, zone, sc,
 						&nr_dirty, &nr_writeback);
 
 	spin_lock_irq(&zone->lru_lock);
 
-	reclaim_stat->recent_scanned[0] += nr_anon;
-	reclaim_stat->recent_scanned[1] += nr_file;
+	reclaim_stat->recent_scanned[file] += nr_taken;
 
 	if (current_is_kswapd())
 		__count_vm_events(KSWAPD_STEAL, nr_reclaimed);
@@ -1322,8 +1275,7 @@ shrink_inactive_list(unsigned long nr_to_scan, struct mem_cgroup_zone *mz,
 
 	putback_inactive_pages(mz, &page_list);
 
-	__mod_zone_page_state(zone, NR_ISOLATED_ANON, -nr_anon);
-	__mod_zone_page_state(zone, NR_ISOLATED_FILE, -nr_file);
+	__mod_zone_page_state(zone, NR_ISOLATED_ANON + file, -nr_taken);
 
 	spin_unlock_irq(&zone->lru_lock);
 


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH next 00/12] mm: replace struct mem_cgroup_zone with struct lruvec
  2012-04-26  7:53 [PATCH next 00/12] mm: replace struct mem_cgroup_zone with struct lruvec Konstantin Khlebnikov
                   ` (11 preceding siblings ...)
  2012-04-26  7:54 ` [PATCH 12/12] mm/vmscan: kill struct mem_cgroup_zone Konstantin Khlebnikov
@ 2012-04-26 23:25 ` Andrew Morton
  2012-04-27  7:45   ` Konstantin Khlebnikov
  12 siblings, 1 reply; 20+ messages in thread
From: Andrew Morton @ 2012-04-26 23:25 UTC (permalink / raw)
  To: Konstantin Khlebnikov; +Cc: linux-mm, linux-kernel

On Thu, 26 Apr 2012 11:53:44 +0400
Konstantin Khlebnikov <khlebnikov@openvz.org> wrote:

> This patchset depends on Johannes Weiner's patch
> "mm: memcg: count pte references from every member of the reclaimed hierarchy".
> 
> bloat-o-meter delta for patches 2..12
> 
> add/remove: 6/6 grow/shrink: 6/14 up/down: 4414/-4625 (-211)

That's the sole effect and intent of the patchset?  To save 211 bytes?

> ...
>
>  include/linux/memcontrol.h |   16 +--
>  include/linux/mmzone.h     |   14 ++
>  mm/memcontrol.c            |   33 +++--
>  mm/mmzone.c                |   14 ++
>  mm/page_alloc.c            |    8 -
>  mm/vmscan.c                |  277 ++++++++++++++++++++------------------------
>  6 files changed, 177 insertions(+), 185 deletions(-)

If so, I'm not sure that it is worth the risk and effort?

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH next 00/12] mm: replace struct mem_cgroup_zone with struct lruvec
  2012-04-26 23:25 ` [PATCH next 00/12] mm: replace struct mem_cgroup_zone with struct lruvec Andrew Morton
@ 2012-04-27  7:45   ` Konstantin Khlebnikov
  2012-05-02  4:09     ` Hugh Dickins
  0 siblings, 1 reply; 20+ messages in thread
From: Konstantin Khlebnikov @ 2012-04-27  7:45 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm, linux-kernel, Hugh Dickins

Andrew Morton wrote:
> On Thu, 26 Apr 2012 11:53:44 +0400
> Konstantin Khlebnikov<khlebnikov@openvz.org>  wrote:
>
>> This patchset depends on Johannes Weiner's patch
>> "mm: memcg: count pte references from every member of the reclaimed hierarchy".
>>
>> bloat-o-meter delta for patches 2..12
>>
>> add/remove: 6/6 grow/shrink: 6/14 up/down: 4414/-4625 (-211)
>
> That's the sole effect and intent of the patchset?  To save 211 bytes?

This is almost the last bunch of cleanups for lru_lock splitting;
the code reduction is only a nice side-effect.
Also, this patchset removes many redundant lruvec relookups.

Now almost all page-to-lruvec translations are located at the same level
as the zone->lru_lock locking, so the lru-lock splitting patchset can do something like this:

-zone = page_zone(page)
-spin_lock_irq(&zone->lru_lock)
-lruvec = mem_cgroup_page_lruvec(page)
+lruvec = lock_page_lruvec_irq(page)
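
For illustration only (a sketch, not something posted in this series): folded into
one helper, those three steps might look roughly like this, assuming a
mem_cgroup_page_lruvec(page) lookup with the one-argument form used above.

/*
 * Hypothetical helper: just the three lines above folded together.
 * mem_cgroup_page_lruvec() is an assumed future lookup, not part of
 * this patchset; the lock taken here is still the per-zone lru_lock.
 */
static struct lruvec *lock_page_lruvec_irq(struct page *page)
{
	struct zone *zone = page_zone(page);

	spin_lock_irq(&zone->lru_lock);
	return mem_cgroup_page_lruvec(page);
}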

>
>> ...
>>
>>   include/linux/memcontrol.h |   16 +--
>>   include/linux/mmzone.h     |   14 ++
>>   mm/memcontrol.c            |   33 +++--
>>   mm/mmzone.c                |   14 ++
>>   mm/page_alloc.c            |    8 -
>>   mm/vmscan.c                |  277 ++++++++++++++++++++------------------------
>>   6 files changed, 177 insertions(+), 185 deletions(-)
>
> If so, I'm not sure that it is worth the risk and effort?

After the lumpy-reclaim removal there is a lot of dead or redundant code; maybe someone
else wants to clean up this code, so I specifically sent this set early to avoid conflicts.



^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH v2 05/12] mm/vmscan: remove update_isolated_counts()
  2012-04-26 13:17   ` [PATCH v2 05/12] mm/vmscan: remove update_isolated_counts() Konstantin Khlebnikov
@ 2012-04-27 10:28     ` Mel Gorman
  0 siblings, 0 replies; 20+ messages in thread
From: Mel Gorman @ 2012-04-27 10:28 UTC (permalink / raw)
  To: Konstantin Khlebnikov; +Cc: Andrew Morton, linux-mm, linux-kernel

On Thu, Apr 26, 2012 at 05:17:43PM +0400, Konstantin Khlebnikov wrote:
> update_isolated_counts() no longer required, because lumpy-reclaim was removed.
> Insanity is over, now here only one kind of inactive pages.
> 
> Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>

I should have spotted that. Thanks.

Acked-by: Mel Gorman <mgorman@suse.de>

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH next 00/12] mm: replace struct mem_cgroup_zone with struct lruvec
  2012-04-27  7:45   ` Konstantin Khlebnikov
@ 2012-05-02  4:09     ` Hugh Dickins
  2012-05-02  6:13       ` Konstantin Khlebnikov
  0 siblings, 1 reply; 20+ messages in thread
From: Hugh Dickins @ 2012-05-02  4:09 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Konstantin Khlebnikov, linux-mm, linux-kernel

On Fri, 27 Apr 2012, Konstantin Khlebnikov wrote:
> Andrew Morton wrote:
> > On Thu, 26 Apr 2012 11:53:44 +0400
> > Konstantin Khlebnikov<khlebnikov@openvz.org>  wrote:
> > 
> > > This patchset depends on Johannes Weiner's patch
> > > "mm: memcg: count pte references from every member of the reclaimed
> > > hierarchy".
> > > 
> > > bloat-o-meter delta for patches 2..12
> > > 
> > > add/remove: 6/6 grow/shrink: 6/14 up/down: 4414/-4625 (-211)
> > 
> > That's the sole effect and intent of the patchset?  To save 211 bytes?

I am surprised it's not more: it feels like more.

> 
> This is almost last bunch of cleanups for lru_lock splitting,
> code reducing is only nice side-effect.
> Also this patchset removes many redundant lruvec relookups.
> 
> Now mostly all page-to-lruvec translations are located at the same level
> as zone->lru_lock locking. So lru-lock splitting patchset can something like
> this:
> 
> -zone = page_zone(page)
> -spin_lock_irq(&zone->lru_lock)
> -lruvec = mem_cgroup_page_lruvec(page)
> +lruvec = lock_page_lruvec_irq(page)
> 
> > 
> > > ...
> > > 
> > >   include/linux/memcontrol.h |   16 +--
> > >   include/linux/mmzone.h     |   14 ++
> > >   mm/memcontrol.c            |   33 +++--
> > >   mm/mmzone.c                |   14 ++
> > >   mm/page_alloc.c            |    8 -
> > >   mm/vmscan.c                |  277 ++++++++++++++++++++------------------------
> > >   6 files changed, 177 insertions(+), 185 deletions(-)
> > 
> > If so, I'm not sure that it is worth the risk and effort?

I'm pretty sure that it is worth the effort, and see very little risk.

It's close to my "[PATCH 3/10] mm/memcg: add zone pointer into lruvec"
posted 20 Feb (after Konstantin posted his set a few days earlier),
which Kamezawa-san Acked with "I like this cleanup".  But this goes
a little further (e.g. 01/12 saving an arg by moving priority into sc,
that's nice; and v2 05/12 removing update_isolated_counts(), great).

Konstantin and I came independently to this simplification, or
generalization, from zone to lruvec: we're confident that it is the
right direction, that it's a good basis for further work.  Certainly
neither of us have yet posted numbers to justify per-memcg per-zone
locking (and I expect split zone locking to need more justification
than it's had); but we both think these patches are a worthwhile
cleanup on their own.

I don't think it was particularly useful to split this into all of
12 pieces!  But never mind, that's a trivial detail, not worth undoing.
There's a few by-the-by bits and pieces I liked in my version that are
not here, but nothing important: if I care enough, I can always send a
little cleanup afterwards.

The only change I'd ask for is in the commit comment on 02/12: it
puzzlingly says "page_zone()" where it means to say "lruvec_zone()".
I think if I'd been doing 04/12, I'd have resented passing "zone" to
shrink_page_list(), would have deleted its VM_BUG_ON, and used a
page_zone() for ZONE_CONGESTED: but that's just me being mean.
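
As a rough sketch of that alternative (not part of the posted patches, and with the
condition reproduced from memory, so treat it as approximate): shrink_page_list()
could derive the zone from a page on the list, assuming the zone is only needed for
the congestion tagging near the end of the function.

	/* Hypothetical: pick a page from the list to find the zone. */
	struct zone *zone = page_zone(lru_to_page(page_list));

	if (nr_dirty && nr_dirty == nr_congested && global_reclaim(sc))
		zone_set_flag(zone, ZONE_CONGESTED);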

I've gone through and compared the result of these 12 against my own
tree updated to next-20120427.  We come out much the same: the only
divergence which worried me was that my mem_cgroup_zone_lruvec() says
	if (!memcg || mem_cgroup_disabled())
		return &zone->lruvec;
and although I'm sure I had a reason for adding that "!memcg || ",
I cannot now see why.  Maybe it was for some intermediate use that went
away (but I mention it in the hope that Konstantin will double check).

To each one of the 12 (with lruvec_zone in 02/12, and v2 of 05/12):
Acked-by: Hugh Dickins <hughd@google.com>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v2 02/12] mm: add link from struct lruvec to struct zone
  2012-04-26  7:53 ` [PATCH 02/12] mm: add link from struct lruvec to struct zone Konstantin Khlebnikov
@ 2012-05-02  5:52   ` Konstantin Khlebnikov
  0 siblings, 0 replies; 20+ messages in thread
From: Konstantin Khlebnikov @ 2012-05-02  5:52 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm, Hugh Dickins, linux-kernel

This is the first stage of struct mem_cgroup_zone removal; further patches replace
struct mem_cgroup_zone with a pointer to struct lruvec.

If CONFIG_CGROUP_MEM_RES_CTLR=n, lruvec_zone() is just a container_of().

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>

---

v2: fix comment
---
 include/linux/mmzone.h |   14 ++++++++++++++
 mm/memcontrol.c        |    4 +---
 mm/mmzone.c            |   14 ++++++++++++++
 mm/page_alloc.c        |    8 +-------
 4 files changed, 30 insertions(+), 10 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 5c4880b..2427706 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -201,6 +201,9 @@ struct zone_reclaim_stat {
 struct lruvec {
 	struct list_head lists[NR_LRU_LISTS];
 	struct zone_reclaim_stat reclaim_stat;
+#ifdef CONFIG_CGROUP_MEM_RES_CTLR
+	struct zone *zone;
+#endif
 };
 
 /* Mask used at gathering information at once (see memcontrol.c) */
@@ -729,6 +732,17 @@ extern int init_currently_empty_zone(struct zone *zone, unsigned long start_pfn,
 				     unsigned long size,
 				     enum memmap_context context);
 
+extern void lruvec_init(struct lruvec *lruvec, struct zone *zone);
+
+static inline struct zone *lruvec_zone(struct lruvec *lruvec)
+{
+#ifdef CONFIG_CGROUP_MEM_RES_CTLR
+	return lruvec->zone;
+#else
+	return container_of(lruvec, struct zone, lruvec);
+#endif
+}
+
 #ifdef CONFIG_HAVE_MEMORY_PRESENT
 void memory_present(int nid, unsigned long start, unsigned long end);
 #else
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index a2184a2..3e7b91c 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4934,7 +4934,6 @@ static int alloc_mem_cgroup_per_zone_info(struct mem_cgroup *memcg, int node)
 {
 	struct mem_cgroup_per_node *pn;
 	struct mem_cgroup_per_zone *mz;
-	enum lru_list lru;
 	int zone, tmp = node;
 	/*
 	 * This routine is called against possible nodes.
@@ -4952,8 +4951,7 @@ static int alloc_mem_cgroup_per_zone_info(struct mem_cgroup *memcg, int node)
 
 	for (zone = 0; zone < MAX_NR_ZONES; zone++) {
 		mz = &pn->zoneinfo[zone];
-		for_each_lru(lru)
-			INIT_LIST_HEAD(&mz->lruvec.lists[lru]);
+		lruvec_init(&mz->lruvec, &NODE_DATA(node)->node_zones[zone]);
 		mz->usage_in_excess = 0;
 		mz->on_tree = false;
 		mz->memcg = memcg;
diff --git a/mm/mmzone.c b/mm/mmzone.c
index 7cf7b7d..6830eab 100644
--- a/mm/mmzone.c
+++ b/mm/mmzone.c
@@ -86,3 +86,17 @@ int memmap_valid_within(unsigned long pfn,
 	return 1;
 }
 #endif /* CONFIG_ARCH_HAS_HOLES_MEMORYMODEL */
+
+void lruvec_init(struct lruvec *lruvec, struct zone *zone)
+{
+	enum lru_list lru;
+
+	memset(lruvec, 0, sizeof(struct lruvec));
+
+	for_each_lru(lru)
+		INIT_LIST_HEAD(&lruvec->lists[lru]);
+
+#ifdef CONFIG_CGROUP_MEM_RES_CTLR
+	lruvec->zone = zone;
+#endif
+}
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 1b951de..35478bd 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4361,7 +4361,6 @@ static void __paginginit free_area_init_core(struct pglist_data *pgdat,
 	for (j = 0; j < MAX_NR_ZONES; j++) {
 		struct zone *zone = pgdat->node_zones + j;
 		unsigned long size, realsize, memmap_pages;
-		enum lru_list lru;
 
 		size = zone_spanned_pages_in_node(nid, j, zones_size);
 		realsize = size - zone_absent_pages_in_node(nid, j,
@@ -4411,12 +4410,7 @@ static void __paginginit free_area_init_core(struct pglist_data *pgdat,
 		zone->zone_pgdat = pgdat;
 
 		zone_pcp_init(zone);
-		for_each_lru(lru)
-			INIT_LIST_HEAD(&zone->lruvec.lists[lru]);
-		zone->lruvec.reclaim_stat.recent_rotated[0] = 0;
-		zone->lruvec.reclaim_stat.recent_rotated[1] = 0;
-		zone->lruvec.reclaim_stat.recent_scanned[0] = 0;
-		zone->lruvec.reclaim_stat.recent_scanned[1] = 0;
+		lruvec_init(&zone->lruvec, zone);
 		zap_zone_vm_stats(zone);
 		zone->flags = 0;
 		if (!size)


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH next 00/12] mm: replace struct mem_cgroup_zone with struct lruvec
  2012-05-02  4:09     ` Hugh Dickins
@ 2012-05-02  6:13       ` Konstantin Khlebnikov
  0 siblings, 0 replies; 20+ messages in thread
From: Konstantin Khlebnikov @ 2012-05-02  6:13 UTC (permalink / raw)
  To: Hugh Dickins; +Cc: Andrew Morton, linux-mm, linux-kernel

Hugh Dickins wrote:
> On Fri, 27 Apr 2012, Konstantin Khlebnikov wrote:
>> Andrew Morton wrote:
>>> On Thu, 26 Apr 2012 11:53:44 +0400
>>> Konstantin Khlebnikov<khlebnikov@openvz.org>   wrote:
>>>
>>>> This patchset depends on Johannes Weiner's patch
>>>> "mm: memcg: count pte references from every member of the reclaimed
>>>> hierarchy".
>>>>
>>>> bloat-o-meter delta for patches 2..12
>>>>
>>>> add/remove: 6/6 grow/shrink: 6/14 up/down: 4414/-4625 (-211)
>>>
>>> That's the sole effect and intent of the patchset?  To save 211 bytes?
>
> I am surprised it's not more: it feels like more.
>
>>
>> This is almost last bunch of cleanups for lru_lock splitting,
>> code reducing is only nice side-effect.
>> Also this patchset removes many redundant lruvec relookups.
>>
>> Now mostly all page-to-lruvec translations are located at the same level
>> as zone->lru_lock locking. So lru-lock splitting patchset can something like
>> this:
>>
>> -zone = page_zone(page)
>> -spin_lock_irq(&zone->lru_lock)
>> -lruvec = mem_cgroup_page_lruvec(page)
>> +lruvec = lock_page_lruvec_irq(page)
>>
>>>
>>>> ...
>>>>
>>>>    include/linux/memcontrol.h |   16 +--
>>>>    include/linux/mmzone.h     |   14 ++
>>>>    mm/memcontrol.c            |   33 +++--
>>>>    mm/mmzone.c                |   14 ++
>>>>    mm/page_alloc.c            |    8 -
>>>>    mm/vmscan.c                |  277 ++++++++++++++++++++------------------------
>>>>    6 files changed, 177 insertions(+), 185 deletions(-)
>>>
>>> If so, I'm not sure that it is worth the risk and effort?
>
> I'm pretty sure that it is worth the effort, and see very little risk.
>
> It's close to my "[PATCH 3/10] mm/memcg: add zone pointer into lruvec"
> posted 20 Feb (after Konstantin posted his set a few days earlier),
> which Kamezawa-san Acked with "I like this cleanup".  But this goes
> a little further (e.g. 01/12 saving an arg by moving priority into sc,
> that's nice; and v2 05/12 removing update_isolated_counts(), great).
>
> Konstantin and I came independently to this simplification, or
> generalization, from zone to lruvec: we're confident that it is the
> right direction, that it's a good basis for further work.  Certainly
> neither of us have yet posted numbers to justify per-memcg per-zone
> locking (and I expect split zone locking to need more justification
> than it's had); but we both think these patches are a worthwhile
> cleanup on their own.
>
> I don't think it was particularly useful to split this into all of
> 12 pieces!  But never mind, that's a trivial detail, not worth undoing.
> There's a few by-the-by bits and pieces I liked in my version that are
> not here, but nothing important: if I care enough, I can always send a
> little cleanup afterwards.
>
> The only change I'd ask for is in the commit comment on 02/12: it
> puzzlingly says "page_zone()" where it means to say "lruvec_zone()".
> I think if I'd been doing 04/12, I'd have resented passing "zone" to
> shrink_page_list(), would have deleted its VM_BUG_ON, and used a
> page_zone() for ZONE_CONGESTED: but that's just me being mean.

We already know which zone we are scanning, so why would you prefer to re-look it up
via a page's reference? And which page would you choose for that? There are many of them. =)

>
> I've gone through and compared the result of these 12 against my own
> tree updated to next-20120427.  We come out much the same: the only
> divergence which worried me was that my mem_cgroup_zone_lruvec() says
> 	IF (!memcg || mem_cgroup_disabled())
> 		return&zone->lruvec;
> and although I'm sure I had a reason for adding that "!memcg || ",
> I cannot now see why.  Maybe it was for some intermediate use that went
> away (but I mention it in the hope that Konstantin will double check).

memcg can be NULL here if and only if mem_cgroup_disabled().

After this patchset mem_cgroup_zone_lruvec() is used in only a few places,
usually right after mem_cgroup_iter(), so the proof is trivial.
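
As an illustration, the call pattern this refers to (simplified from the shrink_zone()
shape in patch 12/12) is roughly:

	/* Simplified: memcg comes from mem_cgroup_iter(), so it is NULL only
	 * when mem_cgroup_disabled(), and mem_cgroup_zone_lruvec() then
	 * returns &zone->lruvec. */
	memcg = mem_cgroup_iter(root, NULL, &reclaim);
	do {
		struct lruvec *lruvec = mem_cgroup_zone_lruvec(zone, memcg);

		shrink_lruvec(lruvec, sc);
		memcg = mem_cgroup_iter(root, memcg, &reclaim);
	} while (memcg);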

>
> To each one of the 12 (with lruvec_zone in 02/12, and v2 of 05/12):
> Acked-by: Hugh Dickins<hughd@google.com>

Thanks =)

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2012-05-02  6:13 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-04-26  7:53 [PATCH next 00/12] mm: replace struct mem_cgroup_zone with struct lruvec Konstantin Khlebnikov
2012-04-26  7:53 ` [PATCH 01/12] mm/vmscan: store "priority" in struct scan_control Konstantin Khlebnikov
2012-04-26  7:53 ` [PATCH 02/12] mm: add link from struct lruvec to struct zone Konstantin Khlebnikov
2012-05-02  5:52   ` [PATCH v2 " Konstantin Khlebnikov
2012-04-26  7:53 ` [PATCH 03/12] mm/vmscan: push lruvec pointer into isolate_lru_pages() Konstantin Khlebnikov
2012-04-26  7:54 ` [PATCH 04/12] mm/vmscan: push zone pointer into shrink_page_list() Konstantin Khlebnikov
2012-04-26  7:54 ` [PATCH 05/12] mm/vmscan: push zone pointer into update_isolated_counts() Konstantin Khlebnikov
2012-04-26 13:17   ` [PATCH v2 05/12] mm/vmscan: remove update_isolated_counts() Konstantin Khlebnikov
2012-04-27 10:28     ` Mel Gorman
2012-04-26  7:54 ` [PATCH 06/12] mm/vmscan: push lruvec pointer into putback_inactive_pages() Konstantin Khlebnikov
2012-04-26  7:54 ` [PATCH 07/12] mm/vmscan: replace zone_nr_lru_pages() with get_lruvec_size() Konstantin Khlebnikov
2012-04-26  7:54 ` [PATCH 08/12] mm/vmscan: push lruvec pointer into inactive_list_is_low() Konstantin Khlebnikov
2012-04-26  7:54 ` [PATCH 09/12] mm/vmscan: push lruvec pointer into shrink_list() Konstantin Khlebnikov
2012-04-26  7:54 ` [PATCH 10/12] mm/vmscan: push lruvec pointer into get_scan_count() Konstantin Khlebnikov
2012-04-26  7:54 ` [PATCH 11/12] mm/vmscan: push lruvec pointer into should_continue_reclaim() Konstantin Khlebnikov
2012-04-26  7:54 ` [PATCH 12/12] mm/vmscan: kill struct mem_cgroup_zone Konstantin Khlebnikov
2012-04-26 23:25 ` [PATCH next 00/12] mm: replace struct mem_cgroup_zone with struct lruvec Andrew Morton
2012-04-27  7:45   ` Konstantin Khlebnikov
2012-05-02  4:09     ` Hugh Dickins
2012-05-02  6:13       ` Konstantin Khlebnikov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).