* [RFC][PATCH 0/6 v2] reducing page_cgroup size
@ 2012-03-28 10:44 ` KAMEZAWA Hiroyuki
From: KAMEZAWA Hiroyuki @ 2012-03-28 10:44 UTC (permalink / raw)
To: linux-mm
Cc: Linux Kernel, Michal Hocko, Johannes Weiner, Andrew Morton,
Han Ying, Glauber Costa, Konstantin Khlebnikov, Suleiman Souhlal
Hi, here is v2 (still RFC, because many parts have changed).
I'd like to post v3 without the 'RFC' tag after the LSF/MM Summit.
This series reduces the size of 'struct page_cgroup' to 8 bytes.
v2 contains 6 patches; it cleans up and fixes races found in v1, and
adds a trial patch that integrates struct page_cgroup into struct page.
The patches are:
1/6 ....add methods to access pc->mem_cgroup
2/6 ....add pc_set_mem_cgroup_and_flags() to set ->mem_cgroup and flags by one call.
3/6 ....add PageCgroupReset() for handling a race.
4/6 ....reduce size of struct page_cgroup by removing ->mem_cgroup
5/6 ....remove unnecessary memory barrier
6/6 ....add CONFIG_INTEGRATED_PAGE_CGROUP to place page_cgroup in struct page.
Regards,
-Kame
* [RFC][PATCH 1/6] memcg: add methods to access pc->mem_cgroup
2012-03-28 10:44 ` KAMEZAWA Hiroyuki
@ 2012-03-28 10:47 ` KAMEZAWA Hiroyuki
From: KAMEZAWA Hiroyuki @ 2012-03-28 10:47 UTC (permalink / raw)
To: linux-mm
Cc: Linux Kernel, Michal Hocko, Johannes Weiner, Andrew Morton,
Han Ying, Glauber Costa, Konstantin Khlebnikov, Suleiman Souhlal
In order to encode pc->mem_cgroup and pc->flags into a single word,
accessor functions for pc->mem_cgroup are required.
This patch replaces accesses to pc->mem_cgroup with
pc_to_mem_cgroup(pc) : pc->mem_cgroup
pc_set_mem_cgroup(pc, memcg) : pc->mem_cgroup = memcg
Changelog:
- rebased onto linux-next (Mar 27, 2012); handle THP move_page
Acked-by: Michal Hocko <mhocko@suse.cz>
Acked-by: Glauber Costa <glommer@parallels.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
include/linux/page_cgroup.h | 12 +++++++
mm/memcontrol.c | 71 ++++++++++++++++++++++--------------------
2 files changed, 49 insertions(+), 34 deletions(-)
diff --git a/include/linux/page_cgroup.h b/include/linux/page_cgroup.h
index a88cdba..92768cb 100644
--- a/include/linux/page_cgroup.h
+++ b/include/linux/page_cgroup.h
@@ -82,6 +82,18 @@ static inline void unlock_page_cgroup(struct page_cgroup *pc)
bit_spin_unlock(PCG_LOCK, &pc->flags);
}
+
+static inline struct mem_cgroup* pc_to_mem_cgroup(struct page_cgroup *pc)
+{
+ return pc->mem_cgroup;
+}
+
+static inline void
+pc_set_mem_cgroup(struct page_cgroup *pc, struct mem_cgroup *memcg)
+{
+ pc->mem_cgroup = memcg;
+}
+
#else /* CONFIG_CGROUP_MEM_RES_CTLR */
struct page_cgroup;
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index b2ee6df..8077460 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1034,9 +1034,9 @@ struct lruvec *mem_cgroup_zone_lruvec(struct zone *zone,
/*
* Following LRU functions are allowed to be used without PCG_LOCK.
* Operations are called by routine of global LRU independently from memcg.
- * What we have to take care of here is validness of pc->mem_cgroup.
+ * What we have to take care of here is validness of pc's mem_cgroup.
*
- * Changes to pc->mem_cgroup happens when
+ * Changes to pc's mem_cgroup happens when
* 1. charge
* 2. moving account
* In typical case, "charge" is done before add-to-lru. Exception is SwapCache.
@@ -1068,7 +1068,7 @@ struct lruvec *mem_cgroup_lru_add_list(struct zone *zone, struct page *page,
return &zone->lruvec;
pc = lookup_page_cgroup(page);
- memcg = pc->mem_cgroup;
+ memcg = pc_to_mem_cgroup(pc);
/*
* Surreptitiously switch any uncharged page to root:
@@ -1077,10 +1077,12 @@ struct lruvec *mem_cgroup_lru_add_list(struct zone *zone, struct page *page,
*
* Our caller holds lru_lock, and PageCgroupUsed is updated
* under page_cgroup lock: between them, they make all uses
- * of pc->mem_cgroup safe.
+ * of pc's mem_cgroup safe.
*/
- if (!PageCgroupUsed(pc) && memcg != root_mem_cgroup)
- pc->mem_cgroup = memcg = root_mem_cgroup;
+ if (!PageCgroupUsed(pc) && memcg != root_mem_cgroup) {
+ pc_set_mem_cgroup(pc, root_mem_cgroup);
+ memcg = root_mem_cgroup;
+ }
mz = page_cgroup_zoneinfo(memcg, page);
/* compound_order() is stabilized through lru_lock */
@@ -1108,7 +1110,7 @@ void mem_cgroup_lru_del_list(struct page *page, enum lru_list lru)
return;
pc = lookup_page_cgroup(page);
- memcg = pc->mem_cgroup;
+ memcg = pc_to_mem_cgroup(pc);
VM_BUG_ON(!memcg);
mz = page_cgroup_zoneinfo(memcg, page);
/* huge page split is done under lru_lock. so, we have no races. */
@@ -1255,9 +1257,9 @@ mem_cgroup_get_reclaim_stat_from_page(struct page *page)
pc = lookup_page_cgroup(page);
if (!PageCgroupUsed(pc))
return NULL;
- /* Ensure pc->mem_cgroup is visible after reading PCG_USED. */
+ /* Ensure pc's mem_cgroup is visible after reading PCG_USED. */
smp_rmb();
- mz = page_cgroup_zoneinfo(pc->mem_cgroup, page);
+ mz = page_cgroup_zoneinfo(pc_to_mem_cgroup(pc), page);
return &mz->reclaim_stat;
}
@@ -1334,7 +1336,7 @@ static void mem_cgroup_end_move(struct mem_cgroup *memcg)
*
* mem_cgroup_stolen() - checking whether a cgroup is mc.from or not. This
* is used for avoiding races in accounting. If true,
- * pc->mem_cgroup may be overwritten.
+ * pc's mem_cgroup may be overwritten.
*
* mem_cgroup_under_move() - checking a cgroup is mc.from or mc.to or
* under hierarchy of moving cgroups. This is for
@@ -1907,8 +1909,8 @@ bool mem_cgroup_handle_oom(struct mem_cgroup *memcg, gfp_t mask, int order)
* file-stat operations happen after a page is attached to radix-tree. There
* are no race with "charge".
*
- * Considering "uncharge", we know that memcg doesn't clear pc->mem_cgroup
- * at "uncharge" intentionally. So, we always see valid pc->mem_cgroup even
+ * Considering "uncharge", we know that memcg doesn't clear pc's mem_cgroup
+ * at "uncharge" intentionally. So, we always see valid pc's mem_cgroup even
* if there are race with "uncharge". Statistics itself is properly handled
* by flags.
*
@@ -1925,7 +1927,7 @@ void __mem_cgroup_begin_update_page_stat(struct page *page,
pc = lookup_page_cgroup(page);
again:
- memcg = pc->mem_cgroup;
+ memcg = pc_to_mem_cgroup(pc);
if (unlikely(!memcg || !PageCgroupUsed(pc)))
return;
/*
@@ -1938,7 +1940,7 @@ again:
return;
move_lock_mem_cgroup(memcg, flags);
- if (memcg != pc->mem_cgroup || !PageCgroupUsed(pc)) {
+ if (memcg != pc_to_mem_cgroup(pc) || !PageCgroupUsed(pc)) {
move_unlock_mem_cgroup(memcg, flags);
goto again;
}
@@ -1950,11 +1952,11 @@ void __mem_cgroup_end_update_page_stat(struct page *page, unsigned long *flags)
struct page_cgroup *pc = lookup_page_cgroup(page);
/*
- * It's guaranteed that pc->mem_cgroup never changes while
- * lock is held because a routine modifies pc->mem_cgroup
+ * It's guaranteed that pc's mem_cgroup never changes while
+ * lock is held because a routine modifies pc's mem_cgroup
* should take move_lock_page_cgroup().
*/
- move_unlock_mem_cgroup(pc->mem_cgroup, flags);
+ move_unlock_mem_cgroup(pc_to_mem_cgroup(pc), flags);
}
void mem_cgroup_update_page_stat(struct page *page,
@@ -1967,7 +1969,7 @@ void mem_cgroup_update_page_stat(struct page *page,
if (mem_cgroup_disabled())
return;
- memcg = pc->mem_cgroup;
+ memcg = pc_to_mem_cgroup(pc);
if (unlikely(!memcg || !PageCgroupUsed(pc)))
return;
@@ -2264,7 +2266,7 @@ static int mem_cgroup_do_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
* has TIF_MEMDIE, this function returns -EINTR while writing root_mem_cgroup
* to *ptr. There are two reasons for this. 1: fatal threads should quit as soon
* as possible without any hazards. 2: all pages should have a valid
- * pc->mem_cgroup. If mm is NULL and the caller doesn't pass a valid memcg
+ * pc's mem_cgroup. If mm is NULL and the caller doesn't pass a valid memcg
* pointer, that is treated as a charge to root_mem_cgroup.
*
* So __mem_cgroup_try_charge() will return
@@ -2457,7 +2459,7 @@ struct mem_cgroup *try_get_mem_cgroup_from_page(struct page *page)
pc = lookup_page_cgroup(page);
lock_page_cgroup(pc);
if (PageCgroupUsed(pc)) {
- memcg = pc->mem_cgroup;
+ memcg = pc_to_mem_cgroup(pc);
if (memcg && !css_tryget(&memcg->css))
memcg = NULL;
} else if (PageSwapCache(page)) {
@@ -2509,11 +2511,11 @@ static void __mem_cgroup_commit_charge(struct mem_cgroup *memcg,
}
}
- pc->mem_cgroup = memcg;
+ pc_set_mem_cgroup(pc, memcg);
/*
* We access a page_cgroup asynchronously without lock_page_cgroup().
- * Especially when a page_cgroup is taken from a page, pc->mem_cgroup
- * is accessed after testing USED bit. To make pc->mem_cgroup visible
+ * Especially when a page_cgroup is taken from a page, pc's mem_cgroup
+ * is accessed after testing USED bit. To make pc's mem_cgroup visible
* before USED bit, we need memory barrier here.
* See mem_cgroup_add_lru_list(), etc.
*/
@@ -2558,13 +2560,14 @@ void mem_cgroup_split_huge_fixup(struct page *head)
{
struct page_cgroup *head_pc = lookup_page_cgroup(head);
struct page_cgroup *pc;
+ struct mem_cgroup *memcg = pc_to_mem_cgroup(head_pc);
int i;
if (mem_cgroup_disabled())
return;
for (i = 1; i < HPAGE_PMD_NR; i++) {
pc = head_pc + i;
- pc->mem_cgroup = head_pc->mem_cgroup;
+ pc_set_mem_cgroup(pc, memcg);
smp_wmb();/* see __commit_charge() */
pc->flags = head_pc->flags & ~PCGF_NOCOPY_AT_SPLIT;
}
@@ -2615,7 +2618,7 @@ static int mem_cgroup_move_account(struct page *page,
lock_page_cgroup(pc);
ret = -EINVAL;
- if (!PageCgroupUsed(pc) || pc->mem_cgroup != from)
+ if (!PageCgroupUsed(pc) || pc_to_mem_cgroup(pc) != from)
goto unlock;
move_lock_mem_cgroup(from, &flags);
@@ -2633,7 +2636,7 @@ static int mem_cgroup_move_account(struct page *page,
__mem_cgroup_cancel_charge(from, nr_pages);
/* caller should have done css_get */
- pc->mem_cgroup = to;
+ pc_set_mem_cgroup(pc, to);
mem_cgroup_charge_statistics(to, anon, nr_pages);
/*
* We charges against "to" which may not have any tasks. Then, "to"
@@ -2976,7 +2979,7 @@ __mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype)
lock_page_cgroup(pc);
- memcg = pc->mem_cgroup;
+ memcg = pc_to_mem_cgroup(pc);
if (!PageCgroupUsed(pc))
goto unlock_out;
@@ -3012,7 +3015,7 @@ __mem_cgroup_uncharge_common(struct page *page, enum charge_type ctype)
ClearPageCgroupUsed(pc);
/*
- * pc->mem_cgroup is not cleared here. It will be accessed when it's
+ * pc's mem_cgroup is not cleared here. It will be accessed when it's
* freed from LRU. This is safe because uncharged page is expected not
* to be reused (freed soon). Exception is SwapCache, it's handled by
* special functions.
@@ -3234,7 +3237,7 @@ int mem_cgroup_prepare_migration(struct page *page,
pc = lookup_page_cgroup(page);
lock_page_cgroup(pc);
if (PageCgroupUsed(pc)) {
- memcg = pc->mem_cgroup;
+ memcg = pc_to_mem_cgroup(pc);
css_get(&memcg->css);
/*
* At migrating an anonymous page, its mapcount goes down
@@ -3379,7 +3382,7 @@ void mem_cgroup_replace_page_cache(struct page *oldpage,
pc = lookup_page_cgroup(oldpage);
/* fix accounting on old pages */
lock_page_cgroup(pc);
- memcg = pc->mem_cgroup;
+ memcg = pc_to_mem_cgroup(pc);
mem_cgroup_charge_statistics(memcg, false, -1);
ClearPageCgroupUsed(pc);
unlock_page_cgroup(pc);
@@ -3390,7 +3393,7 @@ void mem_cgroup_replace_page_cache(struct page *oldpage,
/*
* Even if newpage->mapping was NULL before starting replacement,
* the newpage may be on LRU(or pagevec for LRU) already. We lock
- * LRU while we overwrite pc->mem_cgroup.
+ * LRU while we overwrite pc's mem_cgroup.
*/
__mem_cgroup_commit_charge(memcg, newpage, 1, pc, type, true);
}
@@ -3426,7 +3429,7 @@ void mem_cgroup_print_bad_page(struct page *page)
pc = lookup_page_cgroup_used(page);
if (pc) {
printk(KERN_ALERT "pc:%p pc->flags:%lx pc->mem_cgroup:%p\n",
- pc, pc->flags, pc->mem_cgroup);
+ pc, pc->flags, pc_to_mem_cgroup(pc));
}
}
#endif
@@ -5238,7 +5241,7 @@ static enum mc_target_type get_mctgt_type(struct vm_area_struct *vma,
* mem_cgroup_move_account() checks the pc is valid or not under
* the lock.
*/
- if (PageCgroupUsed(pc) && pc->mem_cgroup == mc.from) {
+ if (PageCgroupUsed(pc) && pc_to_mem_cgroup(pc) == mc.from) {
ret = MC_TARGET_PAGE;
if (target)
target->page = page;
@@ -5274,7 +5277,7 @@ static enum mc_target_type get_mctgt_type_thp(struct vm_area_struct *vma,
if (!move_anon())
return ret;
pc = lookup_page_cgroup(page);
- if (PageCgroupUsed(pc) && pc->mem_cgroup == mc.from) {
+ if (PageCgroupUsed(pc) && pc_to_mem_cgroup(pc) == mc.from) {
ret = MC_TARGET_PAGE;
if (target) {
get_page(page);
--
1.7.4.1
* [RFC][PATCH 2/6] memcg: add pc_set_mem_cgroup_and_flags()
2012-03-28 10:44 ` KAMEZAWA Hiroyuki
@ 2012-03-28 10:51 ` KAMEZAWA Hiroyuki
From: KAMEZAWA Hiroyuki @ 2012-03-28 10:51 UTC (permalink / raw)
To: linux-mm
Cc: Linux Kernel, Michal Hocko, Johannes Weiner, Andrew Morton,
Han Ying, Glauber Costa, Konstantin Khlebnikov, Suleiman Souhlal
Consolidate the code that sets pc->mem_cgroup and the USED bit, which requires smp_wmb().
Also remove the PCGF_NOCOPY_AT_SPLIT macro, which no longer aids readability.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
include/linux/page_cgroup.h | 18 ++++++++++++++++++
mm/memcontrol.c | 18 ++++--------------
2 files changed, 22 insertions(+), 14 deletions(-)
diff --git a/include/linux/page_cgroup.h b/include/linux/page_cgroup.h
index 92768cb..2707809 100644
--- a/include/linux/page_cgroup.h
+++ b/include/linux/page_cgroup.h
@@ -1,6 +1,8 @@
#ifndef __LINUX_PAGE_CGROUP_H
#define __LINUX_PAGE_CGROUP_H
+#include <linux/smp.h>
+
enum {
/* flags for mem_cgroup */
PCG_LOCK, /* Lock for pc->mem_cgroup and following bits. */
@@ -94,6 +96,22 @@ pc_set_mem_cgroup(struct page_cgroup *pc, struct mem_cgroup *memcg)
pc->mem_cgroup = memcg;
}
+static inline void
+pc_set_mem_cgroup_and_flags(struct page_cgroup *pc, struct mem_cgroup *memcg,
+ unsigned long flags)
+{
+ pc->mem_cgroup = memcg;
+ /*
+ * We access a page_cgroup asynchronously without lock_page_cgroup().
+ * Especially when a page_cgroup is taken from a page, pc's mem_cgroup
+ * is accessed after testing USED bit. To make pc's mem_cgroup visible
+ * before USED bit, we need memory barrier here.
+ * See mem_cgroup_add_lru_list(), etc.
+ */
+ smp_wmb();
+ pc->flags = flags;
+}
+
#else /* CONFIG_CGROUP_MEM_RES_CTLR */
struct page_cgroup;
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 8077460..d366b60 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2511,16 +2511,7 @@ static void __mem_cgroup_commit_charge(struct mem_cgroup *memcg,
}
}
- pc_set_mem_cgroup(pc, memcg);
- /*
- * We access a page_cgroup asynchronously without lock_page_cgroup().
- * Especially when a page_cgroup is taken from a page, pc's mem_cgroup
- * is accessed after testing USED bit. To make pc's mem_cgroup visible
- * before USED bit, we need memory barrier here.
- * See mem_cgroup_add_lru_list(), etc.
- */
- smp_wmb();
- SetPageCgroupUsed(pc);
+ pc_set_mem_cgroup_and_flags(pc, memcg, BIT(PCG_USED) | BIT(PCG_LOCK));
if (lrucare) {
if (was_on_lru) {
@@ -2549,7 +2540,6 @@ static void __mem_cgroup_commit_charge(struct mem_cgroup *memcg,
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
-#define PCGF_NOCOPY_AT_SPLIT ((1 << PCG_LOCK) | (1 << PCG_MIGRATION))
/*
* Because tail pages are not marked as "used", set it. We're under
* zone->lru_lock, 'splitting on pmd' and compound_lock.
@@ -2565,11 +2555,11 @@ void mem_cgroup_split_huge_fixup(struct page *head)
if (mem_cgroup_disabled())
return;
+ if (!PageCgroupUsed(head_pc))
+ return;
for (i = 1; i < HPAGE_PMD_NR; i++) {
pc = head_pc + i;
- pc_set_mem_cgroup(pc, memcg);
- smp_wmb();/* see __commit_charge() */
- pc->flags = head_pc->flags & ~PCGF_NOCOPY_AT_SPLIT;
+ pc_set_mem_cgroup_and_flags(pc, memcg, BIT(PCG_USED));
}
}
#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
--
1.7.4.1
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [RFC][PATCH 2/6] memcg: add pc_set_mem_cgroup_and_flags()
@ 2012-03-28 10:51 ` KAMEZAWA Hiroyuki
0 siblings, 0 replies; 24+ messages in thread
From: KAMEZAWA Hiroyuki @ 2012-03-28 10:51 UTC (permalink / raw)
To: linux-mm
Cc: Linux Kernel, Michal Hocko, Johannes Weiner, Andrew Morton,
Han Ying, Glauber Costa, Konstantin Khlebnikov, Suleiman Souhlal
Consolidate a code for setting pc->mem_cgroup and USED bit which requires smp_wmb().
And remove a macro PCGF_NOCOPY_AT_SPLIT which isn't helpful to read code, now.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
include/linux/page_cgroup.h | 18 ++++++++++++++++++
mm/memcontrol.c | 18 ++++--------------
2 files changed, 22 insertions(+), 14 deletions(-)
diff --git a/include/linux/page_cgroup.h b/include/linux/page_cgroup.h
index 92768cb..2707809 100644
--- a/include/linux/page_cgroup.h
+++ b/include/linux/page_cgroup.h
@@ -1,6 +1,8 @@
#ifndef __LINUX_PAGE_CGROUP_H
#define __LINUX_PAGE_CGROUP_H
+#include <linux/smp.h>
+
enum {
/* flags for mem_cgroup */
PCG_LOCK, /* Lock for pc->mem_cgroup and following bits. */
@@ -94,6 +96,22 @@ pc_set_mem_cgroup(struct page_cgroup *pc, struct mem_cgroup *memcg)
pc->mem_cgroup = memcg;
}
+static inline void
+pc_set_mem_cgroup_and_flags(struct page_cgroup *pc, struct mem_cgroup *memcg,
+ unsigned long flags)
+{
+ pc->mem_cgroup = memcg;
+ /*
+ * We access a page_cgroup asynchronously without lock_page_cgroup().
+ * Especially when a page_cgroup is taken from a page, pc's mem_cgroup
+ * is accessed after testing USED bit. To make pc's mem_cgroup visible
+ * before USED bit, we need memory barrier here.
+ * See mem_cgroup_add_lru_list(), etc.
+ */
+ smp_wmb();
+ pc->flags = flags;
+}
+
#else /* CONFIG_CGROUP_MEM_RES_CTLR */
struct page_cgroup;
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 8077460..d366b60 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2511,16 +2511,7 @@ static void __mem_cgroup_commit_charge(struct mem_cgroup *memcg,
}
}
- pc_set_mem_cgroup(pc, memcg);
- /*
- * We access a page_cgroup asynchronously without lock_page_cgroup().
- * Especially when a page_cgroup is taken from a page, pc's mem_cgroup
- * is accessed after testing USED bit. To make pc's mem_cgroup visible
- * before USED bit, we need memory barrier here.
- * See mem_cgroup_add_lru_list(), etc.
- */
- smp_wmb();
- SetPageCgroupUsed(pc);
+ pc_set_mem_cgroup_and_flags(pc, memcg, BIT(PCG_USED) | BIT(PCG_LOCK));
if (lrucare) {
if (was_on_lru) {
@@ -2549,7 +2540,6 @@ static void __mem_cgroup_commit_charge(struct mem_cgroup *memcg,
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
-#define PCGF_NOCOPY_AT_SPLIT ((1 << PCG_LOCK) | (1 << PCG_MIGRATION))
/*
* Because tail pages are not marked as "used", set it. We're under
* zone->lru_lock, 'splitting on pmd' and compound_lock.
@@ -2565,11 +2555,11 @@ void mem_cgroup_split_huge_fixup(struct page *head)
if (mem_cgroup_disabled())
return;
+ if (!PageCgroupUsed(head_pc))
+ return;
for (i = 1; i < HPAGE_PMD_NR; i++) {
pc = head_pc + i;
- pc_set_mem_cgroup(pc, memcg);
- smp_wmb();/* see __commit_charge() */
- pc->flags = head_pc->flags & ~PCGF_NOCOPY_AT_SPLIT;
+ pc_set_mem_cgroup_and_flags(pc, memcg, BIT(PCG_USED));
}
}
#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
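[Editorial sketch] The ordering that pc_set_mem_cgroup_and_flags() consolidates is the classic publish pattern: store the payload, barrier, then set the flag that readers test. A minimal userspace sketch, using C11 release/acquire atomics as a stand-in for the kernel's smp_wmb()/smp_rmb(); all names here are hypothetical, not the kernel's:

```c
#include <stdatomic.h>
#include <stddef.h>
#include <assert.h>

/* Hypothetical stand-in for struct page_cgroup. The kernel pairs
 * smp_wmb() on the writer with smp_rmb() on the reader; C11
 * release/acquire gives the same payload-before-flag guarantee. */
struct pc_sketch {
	void *mem_cgroup;            /* payload, published first */
	_Atomic unsigned long flags; /* USED bit acts as the "ready" flag */
};

#define PCG_USED_BIT 1UL

/* Writer, mirroring pc_set_mem_cgroup_and_flags():
 * payload store -> barrier -> flag store. */
static void pc_publish(struct pc_sketch *pc, void *memcg)
{
	pc->mem_cgroup = memcg;
	atomic_store_explicit(&pc->flags, PCG_USED_BIT, memory_order_release);
}

/* Reader: test the USED bit with acquire ordering before trusting
 * mem_cgroup, as mem_cgroup_add_lru_list() does with PageCgroupUsed(). */
static void *pc_read(struct pc_sketch *pc)
{
	if (atomic_load_explicit(&pc->flags, memory_order_acquire) & PCG_USED_BIT)
		return pc->mem_cgroup;
	return NULL; /* not yet charged */
}
```

A reader that sees the flag set is then guaranteed to see the payload store as well, which is why patch 5/6 can drop the smp_rmb() once flag and payload become one word.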
--
1.7.4.1
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: email@kvack.org
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [RFC][PATCH 3/6] memcg: add PageCgroupReset()
2012-03-28 10:44 ` KAMEZAWA Hiroyuki
@ 2012-03-28 10:57 ` KAMEZAWA Hiroyuki
0 siblings, 0 replies; 24+ messages in thread
From: KAMEZAWA Hiroyuki @ 2012-03-28 10:57 UTC (permalink / raw)
To: linux-mm
Cc: Linux Kernel, Michal Hocko, Johannes Weiner, Andrew Morton,
Han Ying, Glauber Costa, Konstantin Khlebnikov, Suleiman Souhlal
The commit "memcg: simplify LRU handling by new rule" removed PCG_ACCT_LRU,
and the bug it introduced was fixed by "memcg: fix GPF when cgroup removal
races with last exit".
That work was for reducing the number of bits in pc->flags; we now have 3 bits
of flags. Unfortunately, this patch adds a new one. (Considering the alignment
of kmalloc(), we'll be able to have 5 bits.)
This patch adds PCG_RESET, which is similar to PCG_ACCT_LRU. It is set
when mem_cgroup_add_lru_list() finds we cannot trust pc's mem_cgroup.
The reason this patch adds a (renamed) flag again is to allow merging
pc->flags and pc->mem_cgroup. Assume pc's mem_cgroup is encoded as
mem_cgroup = pc->flags & ~0x7
Updating multiple bits of pc->flags without taking lock_page_cgroup()
is very dangerous, and mem_cgroup_add_lru_list() updates pc->mem_cgroup
without taking the lock. So I add a RESET bit. After this, pc_to_mem_cgroup()
is written as
if (PageCgroupReset(pc))
return root_mem_cgroup;
return pc->mem_cgroup;
This update of the Reset bit can be done atomically by set_bit(), and the
bit is cleared when the USED bit is set.
Considering kmalloc()'s alignment, having 4 bits of flags will be ok.
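[Editorial sketch] The safety argument above — one bit can be set lock-free, rewriting several bits cannot — can be sketched with an atomic OR standing in for the kernel's set_bit(). Bit positions and helper names below are illustrative, not the kernel's:

```c
#include <stdatomic.h>
#include <assert.h>

/* Why PCG_RESET works without lock_page_cgroup(): set_bit() is an
 * atomic read-modify-write of a single bit, so a concurrent update to
 * any other bit of the same word cannot be lost. By contrast, a plain
 * multi-bit store (read word, change bits, write word) could clobber
 * a racing set_bit(). */
enum { PCG_LOCK = 0, PCG_USED = 1, PCG_MIGRATION = 2, PCG_RESET = 3 };

/* lock-free single-bit update, like the kernel's set_bit() */
static void set_reset_bit(_Atomic unsigned long *flags)
{
	atomic_fetch_or_explicit(flags, 1UL << PCG_RESET, memory_order_relaxed);
}

/* reader side, as in the patch's pc_to_mem_cgroup(): a set RESET bit
 * overrides whatever memcg value the rest of the word encodes */
static int page_cgroup_reset(unsigned long flags)
{
	return (flags >> PCG_RESET) & 1;
}
```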
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
include/linux/page_cgroup.h | 15 ++++++++-------
mm/memcontrol.c | 5 +++--
2 files changed, 11 insertions(+), 9 deletions(-)
diff --git a/include/linux/page_cgroup.h b/include/linux/page_cgroup.h
index 2707809..3f3b4ff 100644
--- a/include/linux/page_cgroup.h
+++ b/include/linux/page_cgroup.h
@@ -8,6 +8,7 @@ enum {
PCG_LOCK, /* Lock for pc->mem_cgroup and following bits. */
PCG_USED, /* this object is in use. */
PCG_MIGRATION, /* under page migration */
+ PCG_RESET, /* have been reset to root_mem_cgroup */
__NR_PCG_FLAGS,
};
@@ -70,6 +71,9 @@ SETPCGFLAG(Migration, MIGRATION)
CLEARPCGFLAG(Migration, MIGRATION)
TESTPCGFLAG(Migration, MIGRATION)
+TESTPCGFLAG(Reset, RESET)
+SETPCGFLAG(Reset, RESET)
+
static inline void lock_page_cgroup(struct page_cgroup *pc)
{
/*
@@ -84,16 +88,13 @@ static inline void unlock_page_cgroup(struct page_cgroup *pc)
bit_spin_unlock(PCG_LOCK, &pc->flags);
}
+extern struct mem_cgroup* root_mem_cgroup;
static inline struct mem_cgroup* pc_to_mem_cgroup(struct page_cgroup *pc)
{
- return pc->mem_cgroup;
-}
-
-static inline void
-pc_set_mem_cgroup(struct page_cgroup *pc, struct mem_cgroup *memcg)
-{
- pc->mem_cgroup = memcg;
+ if (likely(!PageCgroupReset(pc)))
+ return pc->mem_cgroup;
+ return root_mem_cgroup;
}
static inline void
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index d366b60..622fd2e 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1080,7 +1080,8 @@ struct lruvec *mem_cgroup_lru_add_list(struct zone *zone, struct page *page,
* of pc's mem_cgroup safe.
*/
if (!PageCgroupUsed(pc) && memcg != root_mem_cgroup) {
- pc_set_mem_cgroup(pc, root_mem_cgroup);
+ /* this reset bit is cleared when the page is charged */
+ SetPageCgroupReset(pc);
memcg = root_mem_cgroup;
}
@@ -2626,7 +2627,7 @@ static int mem_cgroup_move_account(struct page *page,
__mem_cgroup_cancel_charge(from, nr_pages);
/* caller should have done css_get */
- pc_set_mem_cgroup(pc, to);
+ pc_set_mem_cgroup_and_flags(pc, to, BIT(PCG_USED) | BIT(PCG_LOCK));
mem_cgroup_charge_statistics(to, anon, nr_pages);
/*
* We charges against "to" which may not have any tasks. Then, "to"
--
1.7.4.1
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [RFC][PATCH 4/6] memcg: remove mem_cgroup pointer from page_cgroup
2012-03-28 10:44 ` KAMEZAWA Hiroyuki
@ 2012-03-28 10:59 ` KAMEZAWA Hiroyuki
0 siblings, 0 replies; 24+ messages in thread
From: KAMEZAWA Hiroyuki @ 2012-03-28 10:59 UTC (permalink / raw)
To: linux-mm
Cc: Linux Kernel, Michal Hocko, Johannes Weiner, Andrew Morton,
Han Ying, Glauber Costa, Konstantin Khlebnikov, Suleiman Souhlal
This patch removes pc->mem_cgroup and merges the pointer into flags as
63 4 0
| pointer to memcg..........|flags|
After this, memory cgroup's overhead is 8(4) bytes per page.
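[Editorial sketch] The merge is plain pointer tagging: the memcg object's alignment leaves the low bits of its address zero, so they can carry the PCG_* flags. A userspace sketch; the alignment assumption mirrors the patch's BUILD_BUG_ON, and the helper names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* Merge a mem_cgroup pointer and flag bits into one unsigned long.
 * Assumes the pointed-to object is aligned to at least
 * 1 << NR_PCG_FLAGS bytes, so the low bits of its address are free;
 * PCG_FLAGS_MASK mirrors the patch. */
#define NR_PCG_FLAGS   4
#define PCG_FLAGS_MASK ((1UL << NR_PCG_FLAGS) - 1UL)

static unsigned long pc_encode(void *memcg, unsigned long flags)
{
	/* the alignment assumption the patch checks with BUILD_BUG_ON() */
	assert(((uintptr_t)memcg & PCG_FLAGS_MASK) == 0);
	return (uintptr_t)memcg | (flags & PCG_FLAGS_MASK);
}

/* mask the low bits off to recover the pointer, as pc_to_mem_cgroup()
 * does with (pc->flags & ~PCG_FLAGS_MASK) */
static void *pc_memcg(unsigned long word)
{
	return (void *)(word & ~PCG_FLAGS_MASK);
}

static unsigned long pc_flags(unsigned long word)
{
	return word & PCG_FLAGS_MASK;
}
```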
Changelog:
- added BUILD_BUG_ON()
- update comments in Kconfig
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
include/linux/page_cgroup.h | 21 +++++++++------------
init/Kconfig | 2 +-
2 files changed, 10 insertions(+), 13 deletions(-)
diff --git a/include/linux/page_cgroup.h b/include/linux/page_cgroup.h
index 3f3b4ff..7e3a3c7 100644
--- a/include/linux/page_cgroup.h
+++ b/include/linux/page_cgroup.h
@@ -3,6 +3,10 @@
#include <linux/smp.h>
+/*
+ * These flags are encoded into low bits of unsigned long,
+ * ORed with a pointer to mem_cgroup.
+ */
enum {
/* flags for mem_cgroup */
PCG_LOCK, /* Lock for pc->mem_cgroup and following bits. */
@@ -12,6 +16,8 @@ enum {
__NR_PCG_FLAGS,
};
+#define PCG_FLAGS_MASK ((1 << __NR_PCG_FLAGS) - 1)
+
#ifndef __GENERATING_BOUNDS_H
#include <generated/bounds.h>
@@ -27,7 +33,6 @@ enum {
*/
struct page_cgroup {
unsigned long flags;
- struct mem_cgroup *mem_cgroup;
};
void __meminit pgdat_page_cgroup_init(struct pglist_data *pgdat);
@@ -93,7 +98,7 @@ extern struct mem_cgroup* root_mem_cgroup;
static inline struct mem_cgroup* pc_to_mem_cgroup(struct page_cgroup *pc)
{
if (likely(!PageCgroupReset(pc)))
- return pc->mem_cgroup;
+ return (struct mem_cgroup*)(pc->flags & ~PCG_FLAGS_MASK);
return root_mem_cgroup;
}
@@ -101,16 +106,8 @@ static inline void
pc_set_mem_cgroup_and_flags(struct page_cgroup *pc, struct mem_cgroup *memcg,
unsigned long flags)
{
- pc->mem_cgroup = memcg;
- /*
- * We access a page_cgroup asynchronously without lock_page_cgroup().
- * Especially when a page_cgroup is taken from a page, pc's mem_cgroup
- * is accessed after testing USED bit. To make pc's mem_cgroup visible
- * before USED bit, we need memory barrier here.
- * See mem_cgroup_add_lru_list(), etc.
- */
- smp_wmb();
- pc->flags = flags;
+ BUILD_BUG_ON(__NR_PCG_FLAGS > 5); /* assume 32byte alignment. */
+ pc->flags = (unsigned long)memcg | flags;
}
#else /* CONFIG_CGROUP_MEM_RES_CTLR */
diff --git a/init/Kconfig b/init/Kconfig
index 6cfd71d..e0bfe92 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -651,7 +651,7 @@ config CGROUP_MEM_RES_CTLR
Note that setting this option increases fixed memory overhead
associated with each page of memory in the system. By this,
- 20(40)bytes/PAGE_SIZE on 32(64)bit system will be occupied by memory
+ 4(8)bytes/PAGE_SIZE on 32(64)bit system will be occupied by memory
usage tracking struct at boot. Total amount of this is printed out
at boot.
--
1.7.4.1
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [RFC][PATCH 5/6] memcg: remove unnecessary memory barrier.
2012-03-28 10:44 ` KAMEZAWA Hiroyuki
@ 2012-03-28 11:00 ` KAMEZAWA Hiroyuki
0 siblings, 0 replies; 24+ messages in thread
From: KAMEZAWA Hiroyuki @ 2012-03-28 11:00 UTC (permalink / raw)
To: linux-mm
Cc: Linux Kernel, Michal Hocko, Johannes Weiner, Andrew Morton,
Han Ying, Glauber Costa, Konstantin Khlebnikov, Suleiman Souhlal
Now that the Used bit and the pointer to the memory cgroup are set at once,
the memory barrier ordering pc->mem_cgroup against the Used bit is no longer necessary.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
mm/memcontrol.c | 2 --
1 files changed, 0 insertions(+), 2 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 622fd2e..767bef3 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1258,8 +1258,6 @@ mem_cgroup_get_reclaim_stat_from_page(struct page *page)
pc = lookup_page_cgroup(page);
if (!PageCgroupUsed(pc))
return NULL;
- /* Ensure pc's mem_cgroup is visible after reading PCG_USED. */
- smp_rmb();
mz = page_cgroup_zoneinfo(pc_to_mem_cgroup(pc), page);
return &mz->reclaim_stat;
}
--
1.7.4.1
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [RFC][PATCH 6/6] memcg: config for integrate page_cgroup into memmap
2012-03-28 10:44 ` KAMEZAWA Hiroyuki
@ 2012-03-28 11:06 ` KAMEZAWA Hiroyuki
0 siblings, 0 replies; 24+ messages in thread
From: KAMEZAWA Hiroyuki @ 2012-03-28 11:06 UTC (permalink / raw)
To: linux-mm
Cc: Linux Kernel, Michal Hocko, Johannes Weiner, Andrew Morton,
Han Ying, Glauber Costa, Konstantin Khlebnikov, Suleiman Souhlal
This patch is experimental.
Considering 32bit archs, I think this should be a CONFIG option...
==
Now, struct page_cgroup is an 8-byte object allocated per page.
This patch adds a config option to allocate page_cgroup within struct page.
By this:
Pros.
- lookup_page_cgroup() is almost zero cost.
- the implementation seems very natural...
Cons.
- the size of 'struct page' is increased (to 64 bytes in the typical case).
- cgroup_disable=memory no longer lets the user avoid the 8 bytes of overhead.
Tested 'kernel make' on tmpfs.
Config=n
Performance counter stats for 'make -j 8':
1,180,857,100,495 instructions # 0.00 insns per cycle
923,084,678 cache-misses
71.346377273 seconds time elapsed
Config=y
Performance counter stats for 'make -j 8':
1,178,404,304,530 instructions # 0.00 insns per cycle
911,098,615 cache-misses
71.607477840 seconds time elapsed
Instructions and cache-misses decreased to some extent,
but there is no visible change in total execution time...
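[Editorial sketch] The near-zero lookup cost comes from embedding the word in struct page: both directions of the lookup become address arithmetic instead of a sparsemem table walk. A userspace sketch of the round trip with hypothetical stand-in structs (the kernel builds with -fno-strict-aliasing, which this overlay pattern relies on):

```c
#include <stddef.h>
#include <assert.h>

/* Hypothetical stand-ins for struct page and struct page_cgroup under
 * CONFIG_INTEGRATED_PAGE_CGROUP: the page_cgroup is just one unsigned
 * long living inside struct page. */
struct page_sketch {
	unsigned long flags;
	unsigned long page_cgroup; /* embedded page_cgroup word */
};

struct page_cgroup_sketch {
	unsigned long flags;
};

/* lookup_page_cgroup(): just the address of the embedded member */
static struct page_cgroup_sketch *lookup_pc(struct page_sketch *page)
{
	return (struct page_cgroup_sketch *)&page->page_cgroup;
}

/* lookup_cgroup_page(): container_of() — subtract the member offset
 * to recover the enclosing struct page */
static struct page_sketch *pc_to_page(struct page_cgroup_sketch *pc)
{
	return (struct page_sketch *)((char *)pc
			- offsetof(struct page_sketch, page_cgroup));
}
```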
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
include/linux/mm_types.h | 4 +++-
include/linux/page_cgroup.h | 33 ++++++++++++++++++++++++++++++++-
init/Kconfig | 14 ++++++++++++++
mm/memcontrol.c | 3 ++-
mm/page_alloc.c | 1 +
mm/page_cgroup.c | 3 ++-
6 files changed, 54 insertions(+), 4 deletions(-)
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 76bbdaf..2beda78 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -141,7 +141,9 @@ struct page {
#ifdef CONFIG_WANT_PAGE_DEBUG_FLAGS
unsigned long debug_flags; /* Use atomic bitops on this */
#endif
-
+#ifdef CONFIG_INTEGRATED_PAGE_CGROUP
+ unsigned long page_cgroup; /* see page_cgroup.h */
+#endif
#ifdef CONFIG_KMEMCHECK
/*
* kmemcheck wants to track the status of each byte in a page; this
diff --git a/include/linux/page_cgroup.h b/include/linux/page_cgroup.h
index 7e3a3c7..0e02632 100644
--- a/include/linux/page_cgroup.h
+++ b/include/linux/page_cgroup.h
@@ -35,8 +35,9 @@ struct page_cgroup {
unsigned long flags;
};
-void __meminit pgdat_page_cgroup_init(struct pglist_data *pgdat);
+#ifndef CONFIG_INTEGRATED_PAGE_CGROUP
+void __meminit pgdat_page_cgroup_init(struct pglist_data *pgdat);
#ifdef CONFIG_SPARSEMEM
static inline void __init page_cgroup_init_flatmem(void)
{
@@ -51,6 +52,36 @@ static inline void __init page_cgroup_init(void)
struct page_cgroup *lookup_page_cgroup(struct page *page);
struct page *lookup_cgroup_page(struct page_cgroup *pc);
+static inline void memmap_init_cgroup(struct page *page)
+{
+}
+#else
+static inline struct page_cgroup *lookup_page_cgroup(struct page *page)
+{
+ return (struct page_cgroup*)&page->page_cgroup;
+}
+
+static inline struct page *lookup_cgroup_page(struct page_cgroup *pc)
+{
+ return container_of((unsigned long*)pc, struct page, page_cgroup);
+}
+
+static inline void memmap_init_cgroup(struct page *page)
+{
+ page->page_cgroup = 0;
+}
+
+static inline void __init page_cgroup_init_flatmem(void)
+{
+}
+static inline void __init page_cgroup_init(void)
+{
+}
+
+static inline void pgdat_page_cgroup_init(struct pglist_data *pgdat)
+{
+}
+#endif
#define TESTPCGFLAG(uname, lname) \
static inline int PageCgroup##uname(struct page_cgroup *pc) \
diff --git a/init/Kconfig b/init/Kconfig
index e0bfe92..99514c2 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -694,6 +694,20 @@ config CGROUP_MEM_RES_CTLR_SWAP_ENABLED
For those who want to have the feature enabled by default should
select this option (if, for some reason, they need to disable it
then swapaccount=0 does the trick).
+
+config INTEGRATED_PAGE_CGROUP
+ bool "record memory cgroup information into struct page"
+ depends on CGROUP_MEM_RES_CTLR
+ default n
+ help
+ Memory Resource Controller consumes 4 (8 if 64bit) bytes per page,
+ independent of 'struct page'. If you say Y here, memory cgroup
+ information is recorded in struct page, increasing its size by
+ 4/8 bytes. With this, runtime cpu overhead is reduced, but the
+ 4/8 bytes per page overhead can no longer be avoided by boot
+ option. If unsure, say N.
+
+
config CGROUP_MEM_RES_CTLR_KMEM
bool "Memory Resource Controller Kernel Memory accounting (EXPERIMENTAL)"
depends on CGROUP_MEM_RES_CTLR && EXPERIMENTAL
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 767bef3..0c5b15c 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2557,7 +2557,8 @@ void mem_cgroup_split_huge_fixup(struct page *head)
if (!PageCgroupUsed(head_pc))
return;
for (i = 1; i < HPAGE_PMD_NR; i++) {
- pc = head_pc + i;
+ /* page struct is contiguous in hugepage. */
+ pc = lookup_page_cgroup(head + i);
pc_set_mem_cgroup_and_flags(pc, memcg, BIT(PCG_USED));
}
}
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0b37873..9be94df 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3682,6 +3682,7 @@ void __meminit memmap_init_zone(unsigned long size, int nid, unsigned long zone,
if (!is_highmem_idx(zone))
set_page_address(page, __va(pfn << PAGE_SHIFT));
#endif
+ memmap_init_cgroup(page);
}
}
diff --git a/mm/page_cgroup.c b/mm/page_cgroup.c
index 1ccbd71..036c8ea 100644
--- a/mm/page_cgroup.c
+++ b/mm/page_cgroup.c
@@ -11,6 +11,7 @@
#include <linux/swapops.h>
#include <linux/kmemleak.h>
+#ifndef CONFIG_INTEGRATED_PAGE_CGROUP
static unsigned long total_usage;
#if !defined(CONFIG_SPARSEMEM)
@@ -315,7 +316,7 @@ void __meminit pgdat_page_cgroup_init(struct pglist_data *pgdat)
}
#endif
-
+#endif
#ifdef CONFIG_CGROUP_MEM_RES_CTLR_SWAP
--
1.7.4.1
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [RFC][PATCH 2/6] memcg: add pc_set_mem_cgroup_and_flags()
2012-03-28 10:51 ` KAMEZAWA Hiroyuki
@ 2012-04-17 21:17 ` Ying Han
0 siblings, 0 replies; 24+ messages in thread
From: Ying Han @ 2012-04-17 21:17 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm, Linux Kernel, Michal Hocko, Johannes Weiner,
Andrew Morton, Glauber Costa, Konstantin Khlebnikov,
Suleiman Souhlal
On Wed, Mar 28, 2012 at 3:51 AM, KAMEZAWA Hiroyuki
<kamezawa.hiroyu@jp.fujitsu.com> wrote:
> Consolidate a code for setting pc->mem_cgroup and USED bit which requires smp_wmb().
> And remove a macro PCGF_NOCOPY_AT_SPLIT which isn't helpful to read code, now.
>
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> ---
> include/linux/page_cgroup.h | 18 ++++++++++++++++++
> mm/memcontrol.c | 18 ++++--------------
> 2 files changed, 22 insertions(+), 14 deletions(-)
>
> diff --git a/include/linux/page_cgroup.h b/include/linux/page_cgroup.h
> index 92768cb..2707809 100644
> --- a/include/linux/page_cgroup.h
> +++ b/include/linux/page_cgroup.h
> @@ -1,6 +1,8 @@
> #ifndef __LINUX_PAGE_CGROUP_H
> #define __LINUX_PAGE_CGROUP_H
>
> +#include <linux/smp.h>
> +
> enum {
> /* flags for mem_cgroup */
> PCG_LOCK, /* Lock for pc->mem_cgroup and following bits. */
> @@ -94,6 +96,22 @@ pc_set_mem_cgroup(struct page_cgroup *pc, struct mem_cgroup *memcg)
> pc->mem_cgroup = memcg;
> }
>
> +static inline void
> +pc_set_mem_cgroup_and_flags(struct page_cgroup *pc, struct mem_cgroup *memcg,
> + unsigned long flags)
> +{
> + pc->mem_cgroup = memcg;
> + /*
> + * We access a page_cgroup asynchronously without lock_page_cgroup().
> + * Especially when a page_cgroup is taken from a page, pc's mem_cgroup
> + * is accessed after testing USED bit. To make pc's mem_cgroup visible
> + * before USED bit, we need memory barrier here.
> + * See mem_cgroup_add_lru_list(), etc.
> + */
> + smp_wmb();
> + pc->flags = flags;
> +}
> +
> #else /* CONFIG_CGROUP_MEM_RES_CTLR */
> struct page_cgroup;
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 8077460..d366b60 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2511,16 +2511,7 @@ static void __mem_cgroup_commit_charge(struct mem_cgroup *memcg,
> }
> }
>
> - pc_set_mem_cgroup(pc, memcg);
> - /*
> - * We access a page_cgroup asynchronously without lock_page_cgroup().
> - * Especially when a page_cgroup is taken from a page, pc's mem_cgroup
> - * is accessed after testing USED bit. To make pc's mem_cgroup visible
> - * before USED bit, we need memory barrier here.
> - * See mem_cgroup_add_lru_list(), etc.
> - */
> - smp_wmb();
> - SetPageCgroupUsed(pc);
I might be confused. We removed this SetPageCgroupUsed() but are not
adding it back elsewhere?
--Ying
> + pc_set_mem_cgroup_and_flags(pc, memcg, BIT(PCG_USED) | BIT(PCG_LOCK));
>
> if (lrucare) {
> if (was_on_lru) {
> @@ -2549,7 +2540,6 @@ static void __mem_cgroup_commit_charge(struct mem_cgroup *memcg,
>
> #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>
> -#define PCGF_NOCOPY_AT_SPLIT ((1 << PCG_LOCK) | (1 << PCG_MIGRATION))
> /*
> * Because tail pages are not marked as "used", set it. We're under
> * zone->lru_lock, 'splitting on pmd' and compound_lock.
> @@ -2565,11 +2555,11 @@ void mem_cgroup_split_huge_fixup(struct page *head)
>
> if (mem_cgroup_disabled())
> return;
> + if (!PageCgroupUsed(head_pc))
> + return;
> for (i = 1; i < HPAGE_PMD_NR; i++) {
> pc = head_pc + i;
> - pc_set_mem_cgroup(pc, memcg);
> - smp_wmb();/* see __commit_charge() */
> - pc->flags = head_pc->flags & ~PCGF_NOCOPY_AT_SPLIT;
> + pc_set_mem_cgroup_and_flags(pc, memcg, BIT(PCG_USED));
> }
> }
> #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
> --
> 1.7.4.1
>
>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [RFC][PATCH 3/6] memcg: add PageCgroupReset()
2012-03-28 10:57 ` KAMEZAWA Hiroyuki
@ 2012-04-17 21:25 ` Ying Han
1 sibling, 0 replies; 24+ messages in thread
From: Ying Han @ 2012-04-17 21:25 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm, Linux Kernel, Michal Hocko, Johannes Weiner,
Andrew Morton, Glauber Costa, Konstantin Khlebnikov,
Suleiman Souhlal
On Wed, Mar 28, 2012 at 3:57 AM, KAMEZAWA Hiroyuki
<kamezawa.hiroyu@jp.fujitsu.com> wrote:
>
> A commit "memcg: simplify LRU handling by new rule" removed PCG_ACCT_LRU,
> and the bug it introduced was fixed by "memcg: fix GPF when cgroup removal
> races with last exit".
>
> This was for reducing flags on pc->flags....Now, we have 3 bits of flags,
> but this patch adds a new flag, I'm sorry. (Considering the alignment of
> kmalloc(), we'll be able to have 5 bits..)
>
> This patch adds PCG_RESET which is similar to PCG_ACCT_LRU.
> This is set
> when mem_cgroup_add_lru_list() finds we cannot trust the pc's mem_cgroup.
Do we still need the new flag? I assume some of the upcoming patches
will provide the guarantee of pc->mem_cgroup.
--Ying
>
> The reason why this patch adds a (renamed) flag again is for merging
> pc->flags and pc->mem_cgroup. Assume pc's mem_cgroup is encoded as
>
> mem_cgroup = pc->flags & ~0x7
>
> Updating multiple bits of pc->flags without taking lock_page_cgroup()
> is very dangerous. And mem_cgroup_add_lru_list() updates pc->mem_cgroup
> without taking lock. Then I add RESET bit. After this, pc_to_mem_cgroup()
> is written as
>
> if (PageCgroupReset(pc))
> return root_mem_cgroup;
> return pc->mem_cgroup;
>
> This update of the Reset bit can be done atomically by set_bit(), and it is
> cleared when the USED bit is set.
>
> Considering kmalloc()'s alignment, having 4bits of flags will be ok....
>
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> ---
> include/linux/page_cgroup.h | 15 ++++++++-------
> mm/memcontrol.c | 5 +++--
> 2 files changed, 11 insertions(+), 9 deletions(-)
>
> diff --git a/include/linux/page_cgroup.h b/include/linux/page_cgroup.h
> index 2707809..3f3b4ff 100644
> --- a/include/linux/page_cgroup.h
> +++ b/include/linux/page_cgroup.h
> @@ -8,6 +8,7 @@ enum {
> PCG_LOCK, /* Lock for pc->mem_cgroup and following bits. */
> PCG_USED, /* this object is in use. */
> PCG_MIGRATION, /* under page migration */
> + PCG_RESET, /* have been reset to root_mem_cgroup */
> __NR_PCG_FLAGS,
> };
>
> @@ -70,6 +71,9 @@ SETPCGFLAG(Migration, MIGRATION)
> CLEARPCGFLAG(Migration, MIGRATION)
> TESTPCGFLAG(Migration, MIGRATION)
>
> +TESTPCGFLAG(Reset, RESET)
> +SETPCGFLAG(Reset, RESET)
> +
> static inline void lock_page_cgroup(struct page_cgroup *pc)
> {
> /*
> @@ -84,16 +88,13 @@ static inline void unlock_page_cgroup(struct page_cgroup *pc)
> bit_spin_unlock(PCG_LOCK, &pc->flags);
> }
>
> +extern struct mem_cgroup* root_mem_cgroup;
>
> static inline struct mem_cgroup* pc_to_mem_cgroup(struct page_cgroup *pc)
> {
> - return pc->mem_cgroup;
> -}
> -
> -static inline void
> -pc_set_mem_cgroup(struct page_cgroup *pc, struct mem_cgroup *memcg)
> -{
> - pc->mem_cgroup = memcg;
> + if (likely(!PageCgroupReset(pc)))
> + return pc->mem_cgroup;
> + return root_mem_cgroup;
> }
>
> static inline void
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index d366b60..622fd2e 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1080,7 +1080,8 @@ struct lruvec *mem_cgroup_lru_add_list(struct zone *zone, struct page *page,
> * of pc's mem_cgroup safe.
> */
> if (!PageCgroupUsed(pc) && memcg != root_mem_cgroup) {
> - pc_set_mem_cgroup(pc, root_mem_cgroup);
> + /* this reset bit is cleared when the page is charged */
> + SetPageCgroupReset(pc);
> memcg = root_mem_cgroup;
> }
>
> @@ -2626,7 +2627,7 @@ static int mem_cgroup_move_account(struct page *page,
> __mem_cgroup_cancel_charge(from, nr_pages);
>
> /* caller should have done css_get */
> - pc_set_mem_cgroup(pc, to);
> + pc_set_mem_cgroup_and_flags(pc, to, BIT(PCG_USED) | BIT(PCG_LOCK));
> mem_cgroup_charge_statistics(to, anon, nr_pages);
> /*
> * We charges against "to" which may not have any tasks. Then, "to"
> --
> 1.7.4.1
>
>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [RFC][PATCH 2/6] memcg: add pc_set_mem_cgroup_and_flags()
2012-04-17 21:17 ` Ying Han
@ 2012-04-18 7:23 ` KAMEZAWA Hiroyuki
1 sibling, 0 replies; 24+ messages in thread
From: KAMEZAWA Hiroyuki @ 2012-04-18 7:23 UTC (permalink / raw)
To: Ying Han
Cc: linux-mm, Linux Kernel, Michal Hocko, Johannes Weiner,
Andrew Morton, Glauber Costa, Konstantin Khlebnikov,
Suleiman Souhlal
(2012/04/18 6:17), Ying Han wrote:
> On Wed, Mar 28, 2012 at 3:51 AM, KAMEZAWA Hiroyuki
> <kamezawa.hiroyu@jp.fujitsu.com> wrote:
>> Consolidate the code for setting pc->mem_cgroup and the USED bit, which requires smp_wmb().
>> And remove the macro PCGF_NOCOPY_AT_SPLIT, which isn't helpful for reading the code now.
>>
>> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
>> ---
>> include/linux/page_cgroup.h | 18 ++++++++++++++++++
>> mm/memcontrol.c | 18 ++++--------------
>> 2 files changed, 22 insertions(+), 14 deletions(-)
>>
>> diff --git a/include/linux/page_cgroup.h b/include/linux/page_cgroup.h
>> index 92768cb..2707809 100644
>> --- a/include/linux/page_cgroup.h
>> +++ b/include/linux/page_cgroup.h
>> @@ -1,6 +1,8 @@
>> #ifndef __LINUX_PAGE_CGROUP_H
>> #define __LINUX_PAGE_CGROUP_H
>>
>> +#include <linux/smp.h>
>> +
>> enum {
>> /* flags for mem_cgroup */
>> PCG_LOCK, /* Lock for pc->mem_cgroup and following bits. */
>> @@ -94,6 +96,22 @@ pc_set_mem_cgroup(struct page_cgroup *pc, struct mem_cgroup *memcg)
>> pc->mem_cgroup = memcg;
>> }
>>
>> +static inline void
>> +pc_set_mem_cgroup_and_flags(struct page_cgroup *pc, struct mem_cgroup *memcg,
>> + unsigned long flags)
>> +{
>> + pc->mem_cgroup = memcg;
>> + /*
>> + * We access a page_cgroup asynchronously without lock_page_cgroup().
>> + * Especially when a page_cgroup is taken from a page, pc's mem_cgroup
>> + * is accessed after testing USED bit. To make pc's mem_cgroup visible
>> + * before USED bit, we need memory barrier here.
>> + * See mem_cgroup_add_lru_list(), etc.
>> + */
>> + smp_wmb();
>> + pc->flags = flags;
>> +}
>> +
>> #else /* CONFIG_CGROUP_MEM_RES_CTLR */
>> struct page_cgroup;
>>
>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>> index 8077460..d366b60 100644
>> --- a/mm/memcontrol.c
>> +++ b/mm/memcontrol.c
>> @@ -2511,16 +2511,7 @@ static void __mem_cgroup_commit_charge(struct mem_cgroup *memcg,
>> }
>> }
>>
>> - pc_set_mem_cgroup(pc, memcg);
>> - /*
>> - * We access a page_cgroup asynchronously without lock_page_cgroup().
>> - * Especially when a page_cgroup is taken from a page, pc's mem_cgroup
>> - * is accessed after testing USED bit. To make pc's mem_cgroup visible
>> - * before USED bit, we need memory barrier here.
>> - * See mem_cgroup_add_lru_list(), etc.
>> - */
>> - smp_wmb();
>> - SetPageCgroupUsed(pc);
>
> I might be confused. We removed this SetPageCgroupUsed() but are not
> adding it back elsewhere?
>
> --Ying
>
>> + pc_set_mem_cgroup_and_flags(pc, memcg, BIT(PCG_USED) | BIT(PCG_LOCK));
Added here. This sets
| memcg pointer | Used | Locked |
Thanks,
-Kame
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [RFC][PATCH 3/6] memcg: add PageCgroupReset()
2012-04-17 21:25 ` Ying Han
@ 2012-04-18 7:34 ` KAMEZAWA Hiroyuki
1 sibling, 0 replies; 24+ messages in thread
From: KAMEZAWA Hiroyuki @ 2012-04-18 7:34 UTC (permalink / raw)
To: Ying Han
Cc: linux-mm, Linux Kernel, Michal Hocko, Johannes Weiner,
Andrew Morton, Glauber Costa, Konstantin Khlebnikov,
Suleiman Souhlal
(2012/04/18 6:25), Ying Han wrote:
> On Wed, Mar 28, 2012 at 3:57 AM, KAMEZAWA Hiroyuki
> <kamezawa.hiroyu@jp.fujitsu.com> wrote:
>>
>> A commit "memcg: simplify LRU handling by new rule" removed PCG_ACCT_LRU,
>> and the bug it introduced was fixed by "memcg: fix GPF when cgroup removal
>> races with last exit".
>>
>> This was for reducing flags on pc->flags....Now, we have 3 bits of flags,
>> but this patch adds a new flag, I'm sorry. (Considering the alignment of
>> kmalloc(), we'll be able to have 5 bits..)
>>
>> This patch adds PCG_RESET which is similar to PCG_ACCT_LRU.
>
>
>
>> This is set
>> when mem_cgroup_add_lru_list() finds we cannot trust the pc's mem_cgroup.
>
> Do we still need the new flag? I assume some of the upcoming patches
> will provide the guarantee of pc->mem_cgroup.
>
If per-lruvec locking can do reference-count GC, so that a memcg will never
be freed while there are pages which point to it, that allows us to remove
this whole 'move to root' logic, I agree. We'll not require a new flag.
It means that a memcg cannot be freed until there is no page(_cgroup) which
points to a dead memcg. mem_cgroup_reset_owner() is removed, now.
We need to handle 2 cases:
- newly allocated pages which are linked to a memcg before use.
- pages which are unused but were added to the LRU by some lazy logic.
And make a guarantee:
- pages added to the LRU of a dead memcg will be freed or moved to another cgroup soon.
I have no good idea for this yet.
Thanks,
-Kame
> --Ying
>>
>> The reason why this patch adds a (renamed) flag again is for merging
>> pc->flags and pc->mem_cgroup. Assume pc's mem_cgroup is encoded as
>>
>> mem_cgroup = pc->flags & ~0x7
>>
>> Updating multiple bits of pc->flags without taking lock_page_cgroup()
>> is very dangerous. And mem_cgroup_add_lru_list() updates pc->mem_cgroup
>> without taking lock. Then I add RESET bit. After this, pc_to_mem_cgroup()
>> is written as
>>
>> if (PageCgroupReset(pc))
>> return root_mem_cgroup;
>> return pc->mem_cgroup;
>>
>> This update of the Reset bit can be done atomically by set_bit(), and it is
>> cleared when the USED bit is set.
>>
>> Considering kmalloc()'s alignment, having 4bits of flags will be ok....
>>
>> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
>> ---
>> include/linux/page_cgroup.h | 15 ++++++++-------
>> mm/memcontrol.c | 5 +++--
>> 2 files changed, 11 insertions(+), 9 deletions(-)
>>
>> diff --git a/include/linux/page_cgroup.h b/include/linux/page_cgroup.h
>> index 2707809..3f3b4ff 100644
>> --- a/include/linux/page_cgroup.h
>> +++ b/include/linux/page_cgroup.h
>> @@ -8,6 +8,7 @@ enum {
>> PCG_LOCK, /* Lock for pc->mem_cgroup and following bits. */
>> PCG_USED, /* this object is in use. */
>> PCG_MIGRATION, /* under page migration */
>> + PCG_RESET, /* have been reset to root_mem_cgroup */
>> __NR_PCG_FLAGS,
>> };
>>
>> @@ -70,6 +71,9 @@ SETPCGFLAG(Migration, MIGRATION)
>> CLEARPCGFLAG(Migration, MIGRATION)
>> TESTPCGFLAG(Migration, MIGRATION)
>>
>> +TESTPCGFLAG(Reset, RESET)
>> +SETPCGFLAG(Reset, RESET)
>> +
>> static inline void lock_page_cgroup(struct page_cgroup *pc)
>> {
>> /*
>> @@ -84,16 +88,13 @@ static inline void unlock_page_cgroup(struct page_cgroup *pc)
>> bit_spin_unlock(PCG_LOCK, &pc->flags);
>> }
>>
>> +extern struct mem_cgroup* root_mem_cgroup;
>>
>> static inline struct mem_cgroup* pc_to_mem_cgroup(struct page_cgroup *pc)
>> {
>> - return pc->mem_cgroup;
>> -}
>> -
>> -static inline void
>> -pc_set_mem_cgroup(struct page_cgroup *pc, struct mem_cgroup *memcg)
>> -{
>> - pc->mem_cgroup = memcg;
>> + if (likely(!PageCgroupReset(pc)))
>> + return pc->mem_cgroup;
>> + return root_mem_cgroup;
>> }
>>
>> static inline void
>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>> index d366b60..622fd2e 100644
>> --- a/mm/memcontrol.c
>> +++ b/mm/memcontrol.c
>> @@ -1080,7 +1080,8 @@ struct lruvec *mem_cgroup_lru_add_list(struct zone *zone, struct page *page,
>> * of pc's mem_cgroup safe.
>> */
>> if (!PageCgroupUsed(pc) && memcg != root_mem_cgroup) {
>> - pc_set_mem_cgroup(pc, root_mem_cgroup);
>> + /* this reset bit is cleared when the page is charged */
>> + SetPageCgroupReset(pc);
>> memcg = root_mem_cgroup;
>> }
>>
>> @@ -2626,7 +2627,7 @@ static int mem_cgroup_move_account(struct page *page,
>> __mem_cgroup_cancel_charge(from, nr_pages);
>>
>> /* caller should have done css_get */
>> - pc_set_mem_cgroup(pc, to);
>> + pc_set_mem_cgroup_and_flags(pc, to, BIT(PCG_USED) | BIT(PCG_LOCK));
>> mem_cgroup_charge_statistics(to, anon, nr_pages);
>> /*
>> * We charges against "to" which may not have any tasks. Then, "to"
>> --
>> 1.7.4.1
>>
>>
>
>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [RFC][PATCH 2/6] memcg: add pc_set_mem_cgroup_and_flags()
2012-04-18 7:23 ` KAMEZAWA Hiroyuki
@ 2012-04-18 18:18 ` Ying Han
1 sibling, 0 replies; 24+ messages in thread
From: Ying Han @ 2012-04-18 18:18 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: linux-mm, Linux Kernel, Michal Hocko, Johannes Weiner,
Andrew Morton, Glauber Costa, Konstantin Khlebnikov,
Suleiman Souhlal
On Wed, Apr 18, 2012 at 12:23 AM, KAMEZAWA Hiroyuki
<kamezawa.hiroyu@jp.fujitsu.com> wrote:
> (2012/04/18 6:17), Ying Han wrote:
>
>> On Wed, Mar 28, 2012 at 3:51 AM, KAMEZAWA Hiroyuki
>> <kamezawa.hiroyu@jp.fujitsu.com> wrote:
>>> Consolidate the code for setting pc->mem_cgroup and the USED bit, which requires smp_wmb().
>>> And remove the macro PCGF_NOCOPY_AT_SPLIT, which isn't helpful for reading the code now.
>>>
>>> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
>>> ---
>>> include/linux/page_cgroup.h | 18 ++++++++++++++++++
>>> mm/memcontrol.c | 18 ++++--------------
>>> 2 files changed, 22 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/include/linux/page_cgroup.h b/include/linux/page_cgroup.h
>>> index 92768cb..2707809 100644
>>> --- a/include/linux/page_cgroup.h
>>> +++ b/include/linux/page_cgroup.h
>>> @@ -1,6 +1,8 @@
>>> #ifndef __LINUX_PAGE_CGROUP_H
>>> #define __LINUX_PAGE_CGROUP_H
>>>
>>> +#include <linux/smp.h>
>>> +
>>> enum {
>>> /* flags for mem_cgroup */
>>> PCG_LOCK, /* Lock for pc->mem_cgroup and following bits. */
>>> @@ -94,6 +96,22 @@ pc_set_mem_cgroup(struct page_cgroup *pc, struct mem_cgroup *memcg)
>>> pc->mem_cgroup = memcg;
>>> }
>>>
>>> +static inline void
>>> +pc_set_mem_cgroup_and_flags(struct page_cgroup *pc, struct mem_cgroup *memcg,
>>> + unsigned long flags)
>>> +{
>>> + pc->mem_cgroup = memcg;
>>> + /*
>>> + * We access a page_cgroup asynchronously without lock_page_cgroup().
>>> + * Especially when a page_cgroup is taken from a page, pc's mem_cgroup
>>> + * is accessed after testing USED bit. To make pc's mem_cgroup visible
>>> + * before USED bit, we need memory barrier here.
>>> + * See mem_cgroup_add_lru_list(), etc.
>>> + */
>>> + smp_wmb();
>>> + pc->flags = flags;
>>> +}
>>> +
>>> #else /* CONFIG_CGROUP_MEM_RES_CTLR */
>>> struct page_cgroup;
>>>
>>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>>> index 8077460..d366b60 100644
>>> --- a/mm/memcontrol.c
>>> +++ b/mm/memcontrol.c
>>> @@ -2511,16 +2511,7 @@ static void __mem_cgroup_commit_charge(struct mem_cgroup *memcg,
>>> }
>>> }
>>>
>>> - pc_set_mem_cgroup(pc, memcg);
>>> - /*
>>> - * We access a page_cgroup asynchronously without lock_page_cgroup().
>>> - * Especially when a page_cgroup is taken from a page, pc's mem_cgroup
>>> - * is accessed after testing USED bit. To make pc's mem_cgroup visible
>>> - * before USED bit, we need memory barrier here.
>>> - * See mem_cgroup_add_lru_list(), etc.
>>> - */
>>> - smp_wmb();
>>> - SetPageCgroupUsed(pc);
>>
>> I might be confused. We removed this SetPageCgroupUsed() but aren't
>> adding it back elsewhere?
>>
>> --Ying
>>
>>> + pc_set_mem_cgroup_and_flags(pc, memcg, BIT(PCG_USED) | BIT(PCG_LOCK));
>
>
> Added here. This sets
>
> | memcg pointer | Used | Locked |
Ah. I missed that part.
Thanks
--Ying
>
>
> Thanks,
> -Kame
>
>
>
>
^ permalink raw reply [flat|nested] 24+ messages in thread
end of thread, other threads:[~2012-04-18 18:18 UTC | newest]
Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-03-28 10:44 [RFC][PATCH 0/6 v2] reducing page_cgroup size KAMEZAWA Hiroyuki
2012-03-28 10:44 ` KAMEZAWA Hiroyuki
2012-03-28 10:47 ` [RFC][PATCH 1/6] memcg: add methods to access pc->mem_cgroup KAMEZAWA Hiroyuki
2012-03-28 10:47 ` KAMEZAWA Hiroyuki
2012-03-28 10:51 ` [RFC][PATCH 2/6] memcg: add pc_set_mem_cgroup_and_flags() KAMEZAWA Hiroyuki
2012-03-28 10:51 ` KAMEZAWA Hiroyuki
2012-04-17 21:17 ` Ying Han
2012-04-17 21:17 ` Ying Han
2012-04-18 7:23 ` KAMEZAWA Hiroyuki
2012-04-18 7:23 ` KAMEZAWA Hiroyuki
2012-04-18 18:18 ` Ying Han
2012-04-18 18:18 ` Ying Han
2012-03-28 10:57 ` [RFC][PATCH 3/6] memcg: add PageCgroupReset() KAMEZAWA Hiroyuki
2012-03-28 10:57 ` KAMEZAWA Hiroyuki
2012-04-17 21:25 ` Ying Han
2012-04-17 21:25 ` Ying Han
2012-04-18 7:34 ` KAMEZAWA Hiroyuki
2012-04-18 7:34 ` KAMEZAWA Hiroyuki
2012-03-28 10:59 ` [RFC][[PATCH 4/6] memcg: remove mem_cgroup pointer from page_cgroup KAMEZAWA Hiroyuki
2012-03-28 10:59 ` KAMEZAWA Hiroyuki
2012-03-28 11:00 ` [RFC][PATCH 5/6] memcg: remove unnecessary memory barrier KAMEZAWA Hiroyuki
2012-03-28 11:00 ` KAMEZAWA Hiroyuki
2012-03-28 11:06 ` [RFC][PATCH 6/6] memcg: config for integrate page_cgroup into memmap KAMEZAWA Hiroyuki
2012-03-28 11:06 ` KAMEZAWA Hiroyuki