linux-kernel.vger.kernel.org archive mirror
* [PATCH v3 1/2] thp: add HPAGE_PMD_* definitions for !CONFIG_TRANSPARENT_HUGEPAGE
@ 2012-03-02 20:13 Naoya Horiguchi
  2012-03-02 20:13 ` [PATCH v3 2/2] memcg: avoid THP split in task migration Naoya Horiguchi
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Naoya Horiguchi @ 2012-03-02 20:13 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrew Morton, Andrea Arcangeli, KAMEZAWA Hiroyuki,
	Daisuke Nishimura, Hillf Danton, linux-kernel, Naoya Horiguchi

These macros will be used in a later patch, where all usages are expected
to be optimized away without #ifdef CONFIG_TRANSPARENT_HUGEPAGE.
But to detect unexpected usages, we convert the existing BUG() to BUILD_BUG().
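
For context: defining HPAGE_PMD_ORDER before the #ifdef block is safe because the
preprocessor expands HPAGE_PMD_SHIFT at the point of use, by which time one of the
two definitions below is in effect. The BUILD_BUG() conversion then relies on
dead-code elimination in an optimizing build: as long as every use of these macros
sits under a condition the compiler can fold away, the reference to the build-bug
helper is never emitted. A minimal sketch of the pattern, with hypothetical
CONFIG_FOO/foo_enabled() names (not from this patch):

  #ifdef CONFIG_FOO
  #define FOO_ORDER 4
  static inline bool foo_enabled(void) { return true; }
  #else
  /* Any use that survives optimization becomes a build error. */
  #define FOO_ORDER ({ BUILD_BUG(); 0; })
  static inline bool foo_enabled(void) { return false; }
  #endif

  static int foo_order_or_zero(void)
  {
  	if (foo_enabled())	/* constant-folded to false when !CONFIG_FOO */
  		return FOO_ORDER;	/* eliminated, so BUILD_BUG() never fires */
  	return 0;
  }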

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Hillf Danton <dhillf@gmail.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
---
 include/linux/huge_mm.h |   11 ++++++-----
 1 files changed, 6 insertions(+), 5 deletions(-)

diff --git linux-next-20120228.orig/include/linux/huge_mm.h linux-next-20120228/include/linux/huge_mm.h
index f56cacb..c8af7a2 100644
--- linux-next-20120228.orig/include/linux/huge_mm.h
+++ linux-next-20120228/include/linux/huge_mm.h
@@ -51,6 +51,9 @@ extern pmd_t *page_check_address_pmd(struct page *page,
 				     unsigned long address,
 				     enum page_check_address_pmd_flag flag);
 
+#define HPAGE_PMD_ORDER (HPAGE_PMD_SHIFT-PAGE_SHIFT)
+#define HPAGE_PMD_NR (1<<HPAGE_PMD_ORDER)
+
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 #define HPAGE_PMD_SHIFT HPAGE_SHIFT
 #define HPAGE_PMD_MASK HPAGE_MASK
@@ -102,8 +105,6 @@ extern void __split_huge_page_pmd(struct mm_struct *mm, pmd_t *pmd);
 		BUG_ON(pmd_trans_splitting(*____pmd) ||			\
 		       pmd_trans_huge(*____pmd));			\
 	} while (0)
-#define HPAGE_PMD_ORDER (HPAGE_PMD_SHIFT-PAGE_SHIFT)
-#define HPAGE_PMD_NR (1<<HPAGE_PMD_ORDER)
 #if HPAGE_PMD_ORDER > MAX_ORDER
 #error "hugepages can't be allocated by the buddy allocator"
 #endif
@@ -158,9 +159,9 @@ static inline struct page *compound_trans_head(struct page *page)
 	return page;
 }
 #else /* CONFIG_TRANSPARENT_HUGEPAGE */
-#define HPAGE_PMD_SHIFT ({ BUG(); 0; })
-#define HPAGE_PMD_MASK ({ BUG(); 0; })
-#define HPAGE_PMD_SIZE ({ BUG(); 0; })
+#define HPAGE_PMD_SHIFT ({ BUILD_BUG(); 0; })
+#define HPAGE_PMD_MASK ({ BUILD_BUG(); 0; })
+#define HPAGE_PMD_SIZE ({ BUILD_BUG(); 0; })
 
 #define hpage_nr_pages(x) 1
 
-- 
1.7.7.6



* [PATCH v3 2/2] memcg: avoid THP split in task migration
  2012-03-02 20:13 [PATCH v3 1/2] thp: add HPAGE_PMD_* definitions for !CONFIG_TRANSPARENT_HUGEPAGE Naoya Horiguchi
@ 2012-03-02 20:13 ` Naoya Horiguchi
  2012-03-09  0:30   ` Andrew Morton
  2012-03-09  1:16   ` KAMEZAWA Hiroyuki
  2012-03-09  0:33 ` [PATCH v3 1/2] thp: add HPAGE_PMD_* definitions for !CONFIG_TRANSPARENT_HUGEPAGE KAMEZAWA Hiroyuki
  2012-03-09 23:49 ` David Rientjes
  2 siblings, 2 replies; 19+ messages in thread
From: Naoya Horiguchi @ 2012-03-02 20:13 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrew Morton, Andrea Arcangeli, KAMEZAWA Hiroyuki,
	Daisuke Nishimura, Hillf Danton, linux-kernel, Naoya Horiguchi

Currently we can't do task migration among memory cgroups without THP split,
which means processes heavily using THP experience large overhead in task
migration. This patch introduces the code for moving the charge of a THP and
makes THP more valuable.
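
(For readers following along: both page walkers touched below key off
pmd_trans_huge_lock(), introduced shortly before this series. Roughly, from
memory -- treat this as a sketch rather than the exact source -- it returns 1
with mm->page_table_lock held when *pmd is a stable, non-splitting huge pmd,
and 0 otherwise; the !CONFIG_TRANSPARENT_HUGEPAGE stub simply returns 0, which
is what lets the HPAGE_PMD_* uses from patch 1/2 compile away.)

  static inline int pmd_trans_huge_lock(pmd_t *pmd,
  				      struct vm_area_struct *vma)
  {
  	VM_BUG_ON(!rwsem_is_locked(&vma->vm_mm->mmap_sem));
  	if (pmd_trans_huge(*pmd))
  		return __pmd_trans_huge_lock(pmd, vma);	/* 1 if locked huge */
  	return 0;
  }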

Changes from v2:
- add move_anon() and mapcount check

Changes from v1:
- rename is_target_huge_pmd_for_mc() to is_target_thp_for_mc()
- remove pmd_present() check (it's buggy when pmd_trans_huge(pmd) is true)
- is_target_thp_for_mc() calls get_page() only when checks are passed
- unlock page table lock if !mc.precharge
- compare return value of is_target_thp_for_mc() explicitly to MC_TARGET_TYPE
- clean up &walk->mm->page_table_lock to &vma->vm_mm->page_table_lock
- add comment about why race with split_huge_page() does not happen

Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Acked-by: Hillf Danton <dhillf@gmail.com>
---
 mm/memcontrol.c |   89 +++++++++++++++++++++++++++++++++++++++++++++++++++----
 1 files changed, 83 insertions(+), 6 deletions(-)

diff --git linux-next-20120228.orig/mm/memcontrol.c linux-next-20120228/mm/memcontrol.c
index c83aeb5..b6d1bab 100644
--- linux-next-20120228.orig/mm/memcontrol.c
+++ linux-next-20120228/mm/memcontrol.c
@@ -5211,6 +5211,41 @@ static int is_target_pte_for_mc(struct vm_area_struct *vma,
 	return ret;
 }
 
+#ifdef CONFIG_TRANSPARENT_HUGEPAGE
+/*
+ * We don't consider swapping or file mapped pages because THP does not
+ * support them for now.
+ * Caller should make sure that pmd_trans_huge(pmd) is true.
+ */
+static int is_target_thp_for_mc(struct vm_area_struct *vma,
+		unsigned long addr, pmd_t pmd, union mc_target *target)
+{
+	struct page *page = NULL;
+	struct page_cgroup *pc;
+	int ret = 0;
+
+	page = pmd_page(pmd);
+	VM_BUG_ON(!page || !PageHead(page));
+	if (!move_anon() || page_mapcount(page) != 1)
+		return 0;
+	pc = lookup_page_cgroup(page);
+	if (PageCgroupUsed(pc) && pc->mem_cgroup == mc.from) {
+		ret = MC_TARGET_PAGE;
+		if (target) {
+			get_page(page);
+			target->page = page;
+		}
+	}
+	return ret;
+}
+#else
+static inline int is_target_thp_for_mc(struct vm_area_struct *vma,
+		unsigned long addr, pmd_t pmd, union mc_target *target)
+{
+	return 0;
+}
+#endif
+
 static int mem_cgroup_count_precharge_pte_range(pmd_t *pmd,
 					unsigned long addr, unsigned long end,
 					struct mm_walk *walk)
@@ -5219,7 +5254,14 @@ static int mem_cgroup_count_precharge_pte_range(pmd_t *pmd,
 	pte_t *pte;
 	spinlock_t *ptl;
 
-	split_huge_page_pmd(walk->mm, pmd);
+	if (pmd_trans_huge_lock(pmd, vma) == 1) {
+		if (is_target_thp_for_mc(vma, addr, *pmd, NULL)
+		    == MC_TARGET_PAGE)
+			mc.precharge += HPAGE_PMD_NR;
+		spin_unlock(&vma->vm_mm->page_table_lock);
+		cond_resched();
+		return 0;
+	}
 
 	pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
 	for (; addr != end; pte++, addr += PAGE_SIZE)
@@ -5378,16 +5420,51 @@ static int mem_cgroup_move_charge_pte_range(pmd_t *pmd,
 	struct vm_area_struct *vma = walk->private;
 	pte_t *pte;
 	spinlock_t *ptl;
+	int type;
+	union mc_target target;
+	struct page *page;
+	struct page_cgroup *pc;
+
+	/*
+	 * We don't take compound_lock() here but no race with splitting thp
+	 * happens because:
+	 *  - if pmd_trans_huge_lock() returns 1, the relevant thp is not
+	 *    under splitting, which means there's no concurrent thp split,
+	 *  - if another thread runs into split_huge_page() just after we
+	 *    entered this if-block, the thread must wait for page table lock
+	 *    to be unlocked in __split_huge_page_splitting(), where the main
+	 *    part of thp split is not executed yet.
+	 */
+	if (pmd_trans_huge_lock(pmd, vma) == 1) {
+		if (!mc.precharge) {
+			spin_unlock(&vma->vm_mm->page_table_lock);
+			cond_resched();
+			return 0;
+		}
+		type = is_target_thp_for_mc(vma, addr, *pmd, &target);
+		if (type == MC_TARGET_PAGE) {
+			page = target.page;
+			if (!isolate_lru_page(page)) {
+				pc = lookup_page_cgroup(page);
+				if (!mem_cgroup_move_account(page, HPAGE_PMD_NR,
+							     pc, mc.from, mc.to,
+							     false)) {
+					mc.precharge -= HPAGE_PMD_NR;
+					mc.moved_charge += HPAGE_PMD_NR;
+				}
+				putback_lru_page(page);
+			}
+			put_page(page);
+		}
+		spin_unlock(&vma->vm_mm->page_table_lock);
+		cond_resched();
+		return 0;
+	}
 
-	split_huge_page_pmd(walk->mm, pmd);
 retry:
 	pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
 	for (; addr != end; addr += PAGE_SIZE) {
 		pte_t ptent = *(pte++);
-		union mc_target target;
-		int type;
-		struct page *page;
-		struct page_cgroup *pc;
 		swp_entry_t ent;
 
 		if (!mc.precharge)
-- 
1.7.7.6



* Re: [PATCH v3 2/2] memcg: avoid THP split in task migration
  2012-03-02 20:13 ` [PATCH v3 2/2] memcg: avoid THP split in task migration Naoya Horiguchi
@ 2012-03-09  0:30   ` Andrew Morton
  2012-03-09  1:47     ` Naoya Horiguchi
  2012-03-09  1:16   ` KAMEZAWA Hiroyuki
  1 sibling, 1 reply; 19+ messages in thread
From: Andrew Morton @ 2012-03-09  0:30 UTC (permalink / raw)
  To: Naoya Horiguchi
  Cc: linux-mm, Andrea Arcangeli, KAMEZAWA Hiroyuki, Daisuke Nishimura,
	Hillf Danton, linux-kernel, Hugh Dickins

On Fri,  2 Mar 2012 15:13:09 -0500
Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> wrote:

> Currently we can't do task migration among memory cgroups without THP split,
> which means processes heavily using THP experience large overhead in task
> migration. This patch introduces the code for moving the charge of a THP and
> makes THP more valuable.

Some review input from Kame and Andrea would be good, please.

> diff --git linux-next-20120228.orig/mm/memcontrol.c linux-next-20120228/mm/memcontrol.c
> index c83aeb5..b6d1bab 100644
> --- linux-next-20120228.orig/mm/memcontrol.c
> +++ linux-next-20120228/mm/memcontrol.c
> @@ -5211,6 +5211,41 @@ static int is_target_pte_for_mc(struct vm_area_struct *vma,
>  	return ret;
>  }
>  
> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> +/*
> + * We don't consider swapping or file mapped pages because THP does not
> + * support them for now.
> + * Caller should make sure that pmd_trans_huge(pmd) is true.
> + */
> +static int is_target_thp_for_mc(struct vm_area_struct *vma,
> +		unsigned long addr, pmd_t pmd, union mc_target *target)
> +{
> +	struct page *page = NULL;
> +	struct page_cgroup *pc;
> +	int ret = 0;

This should be MC_TARGET_NONE.  And this function should have a return
type of "enum mc_target_type".  And local variable `ret' should have
type "enum mc_target_type" as well.

Also, the name "is_target_thp_for_mc" doesn't make sense: an "is_foo"
function should return a boolean result, but this function doesn't do
that.

> +	page = pmd_page(pmd);
> +	VM_BUG_ON(!page || !PageHead(page));
> +	if (!move_anon() || page_mapcount(page) != 1)

More page_mapcount tricks, and we just fixed a bug in the other one and
Hugh got upset.

Can we please at least document what we're doing here?  This reader
forgot, and cannot reremember.

> +		return 0;

MC_TARGET_NONE.

> +	pc = lookup_page_cgroup(page);
> +	if (PageCgroupUsed(pc) && pc->mem_cgroup == mc.from) {
> +		ret = MC_TARGET_PAGE;
> +		if (target) {
> +			get_page(page);
> +			target->page = page;
> +		}
> +	}
> +	return ret;
> +}
> +#else
> +static inline int is_target_thp_for_mc(struct vm_area_struct *vma,
> +		unsigned long addr, pmd_t pmd, union mc_target *target)
> +{
> +	return 0;

MC_TARGET_NONE.

> +}
> +#endif
> +
>  static int mem_cgroup_count_precharge_pte_range(pmd_t *pmd,
>  					unsigned long addr, unsigned long end,
>  					struct mm_walk *walk)
> @@ -5219,7 +5254,14 @@ static int mem_cgroup_count_precharge_pte_range(pmd_t *pmd,
>  	pte_t *pte;
>  	spinlock_t *ptl;
>  
> -	split_huge_page_pmd(walk->mm, pmd);
> +	if (pmd_trans_huge_lock(pmd, vma) == 1) {
> +		if (is_target_thp_for_mc(vma, addr, *pmd, NULL)
> +		    == MC_TARGET_PAGE)
> +			mc.precharge += HPAGE_PMD_NR;

That code layout is rather an eyesore :(

This:

		if (is_target_thp_for_mc(vma, addr, *pmd, NULL) == MC_TARGET_PAGE)
			mc.precharge += HPAGE_PMD_NR;

is probably better, but still an eyesore.  See if we can come up with a
shorter name than "is_target_thp_for_mc" and all will be fixed!

> +		spin_unlock(&vma->vm_mm->page_table_lock);
> +		cond_resched();
> +		return 0;
> +	}
>  
>  	pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
>  	for (; addr != end; pte++, addr += PAGE_SIZE)
> @@ -5378,16 +5420,51 @@ static int mem_cgroup_move_charge_pte_range(pmd_t *pmd,
>  	struct vm_area_struct *vma = walk->private;
>  	pte_t *pte;
>  	spinlock_t *ptl;
> +	int type;

"enum mc_target_type".  Also choose a more specific name?  Perhaps
`target_type'.

> +	union mc_target target;
> +	struct page *page;
> +	struct page_cgroup *pc;
> +
> +	/*
> +	 * We don't take compound_lock() here but no race with splitting thp
> +	 * happens because:
> +	 *  - if pmd_trans_huge_lock() returns 1, the relevant thp is not
> +	 *    under splitting, which means there's no concurrent thp split,
> +	 *  - if another thread runs into split_huge_page() just after we
> +	 *    entered this if-block, the thread must wait for page table lock
> +	 *    to be unlocked in __split_huge_page_splitting(), where the main
> +	 *    part of thp split is not executed yet.
> +	 */
> +	if (pmd_trans_huge_lock(pmd, vma) == 1) {
> +		if (!mc.precharge) {
> +			spin_unlock(&vma->vm_mm->page_table_lock);
> +			cond_resched();
> +			return 0;
> +		}
> +		type = is_target_thp_for_mc(vma, addr, *pmd, &target);
> +		if (type == MC_TARGET_PAGE) {
> +			page = target.page;
> +			if (!isolate_lru_page(page)) {
> +				pc = lookup_page_cgroup(page);
> +				if (!mem_cgroup_move_account(page, HPAGE_PMD_NR,
> +							     pc, mc.from, mc.to,
> +							     false)) {
> +					mc.precharge -= HPAGE_PMD_NR;
> +					mc.moved_charge += HPAGE_PMD_NR;
> +				}
> +				putback_lru_page(page);
> +			}
> +			put_page(page);
> +		}
> +		spin_unlock(&vma->vm_mm->page_table_lock);
> +		cond_resched();
> +		return 0;
> +	}

cond_resched() is an ugly thing.  Are we sure that it is needed here?

> -	split_huge_page_pmd(walk->mm, pmd);
>  retry:
>  	pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
>  	for (; addr != end; addr += PAGE_SIZE) {
>  		pte_t ptent = *(pte++);
> -		union mc_target target;
> -		int type;
> -		struct page *page;
> -		struct page_cgroup *pc;
>  		swp_entry_t ent;
>  
>  		if (!mc.precharge)



* Re: [PATCH v3 1/2] thp: add HPAGE_PMD_* definitions for !CONFIG_TRANSPARENT_HUGEPAGE
  2012-03-02 20:13 [PATCH v3 1/2] thp: add HPAGE_PMD_* definitions for !CONFIG_TRANSPARENT_HUGEPAGE Naoya Horiguchi
  2012-03-02 20:13 ` [PATCH v3 2/2] memcg: avoid THP split in task migration Naoya Horiguchi
@ 2012-03-09  0:33 ` KAMEZAWA Hiroyuki
  2012-03-09 23:49 ` David Rientjes
  2 siblings, 0 replies; 19+ messages in thread
From: KAMEZAWA Hiroyuki @ 2012-03-09  0:33 UTC (permalink / raw)
  To: Naoya Horiguchi
  Cc: linux-mm, Andrew Morton, Andrea Arcangeli, Daisuke Nishimura,
	Hillf Danton, linux-kernel

On Fri,  2 Mar 2012 15:13:08 -0500
Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> wrote:

> These macros will be used in a later patch, where all usages are expected
> to be optimized away without #ifdef CONFIG_TRANSPARENT_HUGEPAGE.
> But to detect unexpected usages, we convert the existing BUG() to BUILD_BUG().
> 
> Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> Acked-by: Hillf Danton <dhillf@gmail.com>
> Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>

Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>



* Re: [PATCH v3 2/2] memcg: avoid THP split in task migration
  2012-03-02 20:13 ` [PATCH v3 2/2] memcg: avoid THP split in task migration Naoya Horiguchi
  2012-03-09  0:30   ` Andrew Morton
@ 2012-03-09  1:16   ` KAMEZAWA Hiroyuki
  2012-03-09  2:33     ` Hugh Dickins
  2012-03-09  4:25     ` Naoya Horiguchi
  1 sibling, 2 replies; 19+ messages in thread
From: KAMEZAWA Hiroyuki @ 2012-03-09  1:16 UTC (permalink / raw)
  To: Naoya Horiguchi
  Cc: linux-mm, Andrew Morton, Andrea Arcangeli, Daisuke Nishimura,
	Hillf Danton, linux-kernel

On Fri,  2 Mar 2012 15:13:09 -0500
Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> wrote:

> Currently we can't do task migration among memory cgroups without THP split,
> which means processes heavily using THP experience large overhead in task
> migration. This patch introduces the code for moving the charge of a THP and
> makes THP more valuable.
> 
> Changes from v2:
> - add move_anon() and mapcount check
> 
> Changes from v1:
> - rename is_target_huge_pmd_for_mc() to is_target_thp_for_mc()
> - remove pmd_present() check (it's buggy when pmd_trans_huge(pmd) is true)
> - is_target_thp_for_mc() calls get_page() only when checks are passed
> - unlock page table lock if !mc.precharge
> - compare return value of is_target_thp_for_mc() explicitly to MC_TARGET_TYPE
> - clean up &walk->mm->page_table_lock to &vma->vm_mm->page_table_lock
> - add comment about why race with split_huge_page() does not happen
> 
> Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> Acked-by: Hillf Danton <dhillf@gmail.com>

I'm writing this after reading Andrew's reply.


> ---
>  mm/memcontrol.c |   89 +++++++++++++++++++++++++++++++++++++++++++++++++++----
>  1 files changed, 83 insertions(+), 6 deletions(-)
> 
> diff --git linux-next-20120228.orig/mm/memcontrol.c linux-next-20120228/mm/memcontrol.c
> index c83aeb5..b6d1bab 100644
> --- linux-next-20120228.orig/mm/memcontrol.c
> +++ linux-next-20120228/mm/memcontrol.c
> @@ -5211,6 +5211,41 @@ static int is_target_pte_for_mc(struct vm_area_struct *vma,
>  	return ret;
>  }
>  
> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> +/*
> + * We don't consider swapping or file mapped pages because THP does not
> + * support them for now.
> + * Caller should make sure that pmd_trans_huge(pmd) is true.
> + */
> +static int is_target_thp_for_mc(struct vm_area_struct *vma,
> +		unsigned long addr, pmd_t pmd, union mc_target *target)
> +{
> +	struct page *page = NULL;
> +	struct page_cgroup *pc;
> +	int ret = 0;

As Andrew pointed out, I agree MC_TARGET_NONE will be better.
Maybe other parts should be rewritten, too.

> +
> +	page = pmd_page(pmd);
> +	VM_BUG_ON(!page || !PageHead(page));
> +	if (!move_anon() || page_mapcount(page) != 1)
> +		return 0;

Could you add this?
==
static bool move_check_shared_map(struct page *page)
{
  /*
   * Handling of shared pages between processes is a big trouble in memcg.
   * For now, we never move shared-mapped pages between memcgs at 'task'
   * moving because we have no hint which task the page really belongs to.
   * For example, when a task does "fork() -> move the child to another
   * group -> exec()", the charges should stay in the original cgroup.
   * So, check the mapcount to determine whether we can move the page.
   */
   return page_mapcount(page) != 1;
}
==
We may be able to support madvise(MOVE_MEMCG) or fadvise(MOVE_MEMCG), if necessary.



> +	pc = lookup_page_cgroup(page);
> +	if (PageCgroupUsed(pc) && pc->mem_cgroup == mc.from) {
> +		ret = MC_TARGET_PAGE;
> +		if (target) {
> +			get_page(page);
> +			target->page = page;

Here, get_page() is used rather than get_page_unless_zero() because
__pmd_trans_huge_lock() is held?



> +		}
> +	}
> +	return ret;
> +}
> +#else
> +static inline int is_target_thp_for_mc(struct vm_area_struct *vma,
> +		unsigned long addr, pmd_t pmd, union mc_target *target)
> +{
> +	return 0;
> +}
> +#endif
> +
>  static int mem_cgroup_count_precharge_pte_range(pmd_t *pmd,
>  					unsigned long addr, unsigned long end,
>  					struct mm_walk *walk)
> @@ -5219,7 +5254,14 @@ static int mem_cgroup_count_precharge_pte_range(pmd_t *pmd,
>  	pte_t *pte;
>  	spinlock_t *ptl;
>  
> -	split_huge_page_pmd(walk->mm, pmd);
> +	if (pmd_trans_huge_lock(pmd, vma) == 1) {
> +		if (is_target_thp_for_mc(vma, addr, *pmd, NULL)
> +		    == MC_TARGET_PAGE)
> +			mc.precharge += HPAGE_PMD_NR;
> +		spin_unlock(&vma->vm_mm->page_table_lock);
> +		cond_resched();
> +		return 0;
> +	}

Maybe hard to read ;) I think is_target_thp_for_mc includes too many '_'s
and short words...

Hmm, how about renaming "is_target_thp_for_mc" to "pmd_move_target()" or something?
(Ah yes, the other handlers' names should be fixed, too.)

>  
>  	pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
>  	for (; addr != end; pte++, addr += PAGE_SIZE)
> @@ -5378,16 +5420,51 @@ static int mem_cgroup_move_charge_pte_range(pmd_t *pmd,
>  	struct vm_area_struct *vma = walk->private;
>  	pte_t *pte;
>  	spinlock_t *ptl;
> +	int type;
> +	union mc_target target;
> +	struct page *page;
> +	struct page_cgroup *pc;
> +
> +	/*
> +	 * We don't take compound_lock() here but no race with splitting thp
> +	 * happens because:
> +	 *  - if pmd_trans_huge_lock() returns 1, the relevant thp is not
> +	 *    under splitting, which means there's no concurrent thp split,
> +	 *  - if another thread runs into split_huge_page() just after we
> +	 *    entered this if-block, the thread must wait for page table lock
> +	 *    to be unlocked in __split_huge_page_splitting(), where the main
> +	 *    part of thp split is not executed yet.
> +	 */

ok.


> +	if (pmd_trans_huge_lock(pmd, vma) == 1) {
> +		if (!mc.precharge) {
> +			spin_unlock(&vma->vm_mm->page_table_lock);
> +			cond_resched();

Hm. Original code calls cond_resched() after 'scanning' the full pmd, 1024 entries.
With THP, it just handles 1 entry. cond_resched() will not be required.

> +			return 0;
> +		}
> +		type = is_target_thp_for_mc(vma, addr, *pmd, &target);
> +		if (type == MC_TARGET_PAGE) {
> +			page = target.page;
> +			if (!isolate_lru_page(page)) {
> +				pc = lookup_page_cgroup(page);
> +				if (!mem_cgroup_move_account(page, HPAGE_PMD_NR,
> +							     pc, mc.from, mc.to,
> +							     false)) {
> +					mc.precharge -= HPAGE_PMD_NR;
> +					mc.moved_charge += HPAGE_PMD_NR;
> +				}
> +				putback_lru_page(page);
> +			}
> +			put_page(page);
> +		}
> +		spin_unlock(&vma->vm_mm->page_table_lock);
> +		cond_resched();

ditto.


> +		return 0;
> +	}
>  
> -	split_huge_page_pmd(walk->mm, pmd);
>  retry:
>  	pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
>  	for (; addr != end; addr += PAGE_SIZE) {
>  		pte_t ptent = *(pte++);
> -		union mc_target target;
> -		int type;
> -		struct page *page;
> -		struct page_cgroup *pc;
>  		swp_entry_t ent;
>  
>  		if (!mc.precharge)


Thank you for your efforts!

-Kame



* Re: [PATCH v3 2/2] memcg: avoid THP split in task migration
  2012-03-09  0:30   ` Andrew Morton
@ 2012-03-09  1:47     ` Naoya Horiguchi
  0 siblings, 0 replies; 19+ messages in thread
From: Naoya Horiguchi @ 2012-03-09  1:47 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Naoya Horiguchi, linux-mm, Andrea Arcangeli, KAMEZAWA Hiroyuki,
	Daisuke Nishimura, Hillf Danton, linux-kernel, Hugh Dickins

On Thu, Mar 08, 2012 at 04:30:08PM -0800, Andrew Morton wrote:
> On Fri,  2 Mar 2012 15:13:09 -0500
> Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> wrote:
> 
> > Currently we can't do task migration among memory cgroups without THP split,
> > which means processes heavily using THP experience large overhead in task
> > migration. This patch introduces the code for moving the charge of a THP and
> > makes THP more valuable.
> 
> Some review input from Kame and Andrea would be good, please.
> 
> > diff --git linux-next-20120228.orig/mm/memcontrol.c linux-next-20120228/mm/memcontrol.c
> > index c83aeb5..b6d1bab 100644
> > --- linux-next-20120228.orig/mm/memcontrol.c
> > +++ linux-next-20120228/mm/memcontrol.c
> > @@ -5211,6 +5211,41 @@ static int is_target_pte_for_mc(struct vm_area_struct *vma,
> >  	return ret;
> >  }
> >  
> > +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> > +/*
> > + * We don't consider swapping or file mapped pages because THP does not
> > + * support them for now.
> > + * Caller should make sure that pmd_trans_huge(pmd) is true.
> > + */
> > +static int is_target_thp_for_mc(struct vm_area_struct *vma,
> > +		unsigned long addr, pmd_t pmd, union mc_target *target)
> > +{
> > +	struct page *page = NULL;
> > +	struct page_cgroup *pc;
> > +	int ret = 0;
> 
> This should be MC_TARGET_NONE.  And this function should have a return
> type of "enum mc_target_type".  And local variable `ret' should have
> type "enum mc_target_type" as well.

OK.
And I'll also refactor the similar points in is_target_pte_for_mc()
in a separate patch.

> Also, the name "is_target_thp_for_mc" doesn't make sense: an "is_foo"
> function should return a boolean result, but this function doesn't do
> that.

Agreed. I'll replace it with the descriptive and shorter "get_mctgt_type_thp".
Someone may find an abbreviation like "mctgt" odd, so I'll also add
comments describing what this function does. And I'll do the same for
is_target_pte_for_mc().
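
A sketch of how the renamed helper might end up looking (the body from this
patch plus the enum return type asked for above -- not final code):

  static enum mc_target_type get_mctgt_type_thp(struct vm_area_struct *vma,
  		unsigned long addr, pmd_t pmd, union mc_target *target)
  {
  	struct page *page;
  	struct page_cgroup *pc;
  	enum mc_target_type ret = MC_TARGET_NONE;

  	page = pmd_page(pmd);
  	VM_BUG_ON(!page || !PageHead(page));
  	if (!move_anon() || page_mapcount(page) != 1)
  		return MC_TARGET_NONE;
  	pc = lookup_page_cgroup(page);
  	if (PageCgroupUsed(pc) && pc->mem_cgroup == mc.from) {
  		ret = MC_TARGET_PAGE;
  		if (target) {
  			get_page(page);	/* safe: the thp is pinned by the pmd lock */
  			target->page = page;
  		}
  	}
  	return ret;
  }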

> > +	page = pmd_page(pmd);
> > +	VM_BUG_ON(!page || !PageHead(page));
> > +	if (!move_anon() || page_mapcount(page) != 1)
> 
> More page_mapcount tricks, and we just fixed a bug in the other one and
> Hugh got upset.
> 
> Can we please at least document what we're doing here?  This reader
> forgot, and cannot reremember.

Yes. The documentation seems to be obsolete and we are in danger of forgetting
why we have this restriction, so I'll add a comment here that says this is a
temporary one to take a simple approach to accounting for shared pages.

> 
> > +		return 0;
> 
> MC_TARGET_NONE.
> 
> > +	pc = lookup_page_cgroup(page);
> > +	if (PageCgroupUsed(pc) && pc->mem_cgroup == mc.from) {
> > +		ret = MC_TARGET_PAGE;
> > +		if (target) {
> > +			get_page(page);
> > +			target->page = page;
> > +		}
> > +	}
> > +	return ret;
> > +}
> > +#else
> > +static inline int is_target_thp_for_mc(struct vm_area_struct *vma,
> > +		unsigned long addr, pmd_t pmd, union mc_target *target)
> > +{
> > +	return 0;
> 
> MC_TARGET_NONE.
> 
> > +}
> > +#endif
> > +
> >  static int mem_cgroup_count_precharge_pte_range(pmd_t *pmd,
> >  					unsigned long addr, unsigned long end,
> >  					struct mm_walk *walk)
> > @@ -5219,7 +5254,14 @@ static int mem_cgroup_count_precharge_pte_range(pmd_t *pmd,
> >  	pte_t *pte;
> >  	spinlock_t *ptl;
> >  
> > -	split_huge_page_pmd(walk->mm, pmd);
> > +	if (pmd_trans_huge_lock(pmd, vma) == 1) {
> > +		if (is_target_thp_for_mc(vma, addr, *pmd, NULL)
> > +		    == MC_TARGET_PAGE)
> > +			mc.precharge += HPAGE_PMD_NR;
> 
> That code layout is rather an eyesore :(
> 
> This:
> 
> 		if (is_target_thp_for_mc(vma, addr, *pmd, NULL) == MC_TARGET_PAGE)
> 			mc.precharge += HPAGE_PMD_NR;
> 
> is probably better, but still an eyesore.  See if we can come up with a
> shorter name than "is_target_thp_for_mc" and all will be fixed!

The 2-letters-shorter get_mctgt_type_thp() can fix it as above :)

> > +		spin_unlock(&vma->vm_mm->page_table_lock);
> > +		cond_resched();
> > +		return 0;
> > +	}
> >  
> >  	pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
> >  	for (; addr != end; pte++, addr += PAGE_SIZE)
> > @@ -5378,16 +5420,51 @@ static int mem_cgroup_move_charge_pte_range(pmd_t *pmd,
> >  	struct vm_area_struct *vma = walk->private;
> >  	pte_t *pte;
> >  	spinlock_t *ptl;
> > +	int type;
> 
> "enum mc_target_type".  Also choose a more specific name?  Perhaps
> `target_type'.

OK. I'll take it.

> > +	union mc_target target;
> > +	struct page *page;
> > +	struct page_cgroup *pc;
> > +
> > +	/*
> > +	 * We don't take compound_lock() here but no race with splitting thp
> > +	 * happens because:
> > +	 *  - if pmd_trans_huge_lock() returns 1, the relevant thp is not
> > +	 *    under splitting, which means there's no concurrent thp split,
> > +	 *  - if another thread runs into split_huge_page() just after we
> > +	 *    entered this if-block, the thread must wait for page table lock
> > +	 *    to be unlocked in __split_huge_page_splitting(), where the main
> > +	 *    part of thp split is not executed yet.
> > +	 */
> > +	if (pmd_trans_huge_lock(pmd, vma) == 1) {
> > +		if (!mc.precharge) {
> > +			spin_unlock(&vma->vm_mm->page_table_lock);
> > +			cond_resched();
> > +			return 0;
> > +		}
> > +		type = is_target_thp_for_mc(vma, addr, *pmd, &target);
> > +		if (type == MC_TARGET_PAGE) {
> > +			page = target.page;
> > +			if (!isolate_lru_page(page)) {
> > +				pc = lookup_page_cgroup(page);
> > +				if (!mem_cgroup_move_account(page, HPAGE_PMD_NR,
> > +							     pc, mc.from, mc.to,
> > +							     false)) {
> > +					mc.precharge -= HPAGE_PMD_NR;
> > +					mc.moved_charge += HPAGE_PMD_NR;
> > +				}
> > +				putback_lru_page(page);
> > +			}
> > +			put_page(page);
> > +		}
> > +		spin_unlock(&vma->vm_mm->page_table_lock);
> > +		cond_resched();
> > +		return 0;
> > +	}
> 
> cond_resched() is an ugly thing.  Are we sure that it is needed here?

This is copied from the regular-size page path just below, but the thp path
takes much less time than the regular page path. So we can drop it.
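
So the THP branch of the precharge walker would presumably shrink to something
like this in v4 (a sketch using the new helper name, not the final code):

  	if (pmd_trans_huge_lock(pmd, vma) == 1) {
  		if (get_mctgt_type_thp(vma, addr, *pmd, NULL) == MC_TARGET_PAGE)
  			mc.precharge += HPAGE_PMD_NR;
  		spin_unlock(&vma->vm_mm->page_table_lock);
  		return 0;
  	}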

Thanks,
Naoya


* Re: [PATCH v3 2/2] memcg: avoid THP split in task migration
  2012-03-09  1:16   ` KAMEZAWA Hiroyuki
@ 2012-03-09  2:33     ` Hugh Dickins
  2012-03-09  3:24       ` KAMEZAWA Hiroyuki
  2012-03-09  4:25     ` Naoya Horiguchi
  1 sibling, 1 reply; 19+ messages in thread
From: Hugh Dickins @ 2012-03-09  2:33 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Naoya Horiguchi, linux-mm, Andrew Morton, Andrea Arcangeli,
	Daisuke Nishimura, Hillf Danton, linux-kernel

On Fri, 9 Mar 2012, KAMEZAWA Hiroyuki wrote:
> > +
> > +	page = pmd_page(pmd);
> > +	VM_BUG_ON(!page || !PageHead(page));
> > +	if (!move_anon() || page_mapcount(page) != 1)
> > +		return 0;
> 
> Could you add this?
> ==
> static bool move_check_shared_map(struct page *page)
> {
>   /*
>    * Handling of shared pages between processes is a big trouble in memcg.
>    * For now, we never move shared-mapped pages between memcgs at 'task'
>    * moving because we have no hint which task the page really belongs to.
>    * For example, when a task does "fork() -> move the child to another
>    * group -> exec()", the charges should stay in the original cgroup.
>    * So, check the mapcount to determine whether we can move the page.
>    */
>    return page_mapcount(page) != 1;
> }

That's a helpful elucidation, thank you.  However...

That is not how it has actually been behaving for the last 18 months
(because of the "> 2" bug), so in practice you are asking for a change
in behaviour there.

And it's not how it has been and continues to behave with file pages.

Isn't getting that behaviour in fork-move-exec just a good reason not
to set move_charge_at_immigrate?

I think there are other scenarios where you do want all the pages to
move if move_charge_at_immigrate: and that's certainly easier to
describe and to understand and to code.

But if you do insist on not moving the shared, then it needs to involve
something like mem_cgroup_count_swap_user() on PageSwapCache pages,
rather than just the bare page_mapcount().
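
(A sketch of the kind of check meant here, built on the
mem_cgroup_count_swap_user() helper shown in full later in this thread; the
helper returns the mapcount of the swap cache page plus the swap_map count for
the entry, and takes a reference on any page it finds. The name and threshold
below are assumptions -- the threshold mirrors the existing mc_handle_swap_pte()
check, and getting it right is exactly the sort of subtlety behind the "> 2" bug:)

  /* Hypothetical: count all users of an anon page, including swap
   * references, before treating it as unshared. */
  static bool mc_anon_page_shared(struct page *page)
  {
  	if (PageSwapCache(page)) {
  		swp_entry_t ent = { .val = page_private(page) };
  		struct page *swapcache;
  		int count = mem_cgroup_count_swap_user(ent, &swapcache);

  		if (swapcache)
  			put_page(swapcache);
  		return count > 1;
  	}
  	return page_mapcount(page) != 1;
  }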

I'd rather delete than add code here!

Hugh


* Re: [PATCH v3 2/2] memcg: avoid THP split in task migration
  2012-03-09  2:33     ` Hugh Dickins
@ 2012-03-09  3:24       ` KAMEZAWA Hiroyuki
  2012-03-09  4:00         ` KAMEZAWA Hiroyuki
  2012-03-09  6:01         ` Daisuke Nishimura
  0 siblings, 2 replies; 19+ messages in thread
From: KAMEZAWA Hiroyuki @ 2012-03-09  3:24 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Naoya Horiguchi, linux-mm, Andrew Morton, Andrea Arcangeli,
	Daisuke Nishimura, Hillf Danton, linux-kernel, nishimura

On Thu, 8 Mar 2012 18:33:14 -0800 (PST)
Hugh Dickins <hughd@google.com> wrote:

> On Fri, 9 Mar 2012, KAMEZAWA Hiroyuki wrote:
> > > +
> > > +	page = pmd_page(pmd);
> > > +	VM_BUG_ON(!page || !PageHead(page));
> > > +	if (!move_anon() || page_mapcount(page) != 1)
> > > +		return 0;
> > 
> > Could you add this?
> > ==
> > static bool move_check_shared_map(struct page *page)
> > {
> >   /*
> >    * Handling of shared pages between processes is a big trouble in memcg.
> >    * For now, we never move shared-mapped pages between memcgs at 'task'
> >    * moving because we have no hint which task the page really belongs to.
> >    * For example, when a task does "fork() -> move the child to another
> >    * group -> exec()", the charges should stay in the original cgroup.
> >    * So, check the mapcount to determine whether we can move the page.
> >    */
> >    return page_mapcount(page) != 1;
> > }
> 
> That's a helpful elucidation, thank you.  However...
> 
> That is not how it has actually been behaving for the last 18 months
> (because of the "> 2" bug), so in practice you are asking for a change
> in behaviour there.
> 
Yes.


> And it's not how it has been and continues to behave with file pages.
> 
It's ok to add something like:

	if (PageAnon(page) && !move_anon())
		return false;
	...

> Isn't getting that behaviour in fork-move-exec just a good reason not
> to set move_charge_at_immigrate?
> 
Hmm. Maybe.

> I think there are other scenarios where you do want all the pages to
> move if move_charge_at_immigrate: and that's certainly easier to
> describe and to understand and to code.
> 
> But if you do insist on not moving the shared, then it needs to involve
> something like mem_cgroup_count_swap_user() on PageSwapCache pages,
> rather than just the bare page_mapcount().
> 

This 'moving swap account' was a requirement from a user (NEC?).
But no user has said 'I want to move shared pages between cgroups at task
move!', and I don't like moving shared objects.

> I'd rather delete than add code here!
> 

As a user, for Fujitsu, I believe it's insane to move tasks between cgroups.
So, I get no benefit from this code at all.
Ok, maybe I'm not a stakeholder, here.

If users say all shared pages should be moved, ok, let's move them.
But the change of behavior should be documented and implemented in an independent
patch. CC'ed Nishimura-san, who implemented this, as a real user.

Thanks,
-Kame





* Re: [PATCH v3 2/2] memcg: avoid THP split in task migration
  2012-03-09  3:24       ` KAMEZAWA Hiroyuki
@ 2012-03-09  4:00         ` KAMEZAWA Hiroyuki
  2012-03-09  6:01         ` Daisuke Nishimura
  1 sibling, 0 replies; 19+ messages in thread
From: KAMEZAWA Hiroyuki @ 2012-03-09  4:00 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Hugh Dickins, Naoya Horiguchi, linux-mm, Andrew Morton,
	Andrea Arcangeli, Daisuke Nishimura, Hillf Danton, linux-kernel

On Fri, 9 Mar 2012 12:24:48 +0900
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:

> On Thu, 8 Mar 2012 18:33:14 -0800 (PST)
> Hugh Dickins <hughd@google.com> wrote:
> 
> > On Fri, 9 Mar 2012, KAMEZAWA Hiroyuki wrote:
> > > > +
> > > > +	page = pmd_page(pmd);
> > > > +	VM_BUG_ON(!page || !PageHead(page));
> > > > +	if (!move_anon() || page_mapcount(page) != 1)
> > > > +		return 0;
> > > 
> > > Could you add this?
> > > ==
> > > static bool move_check_shared_map(struct page *page)
> > > {
> > >   /*
> > >    * Handling of shared pages between processes is a big trouble in memcg.
> > >    * For now, we never move shared-mapped pages between memcgs at 'task'
> > >    * moving because we have no hint which task the page really belongs to.
> > >    * For example, when a task does "fork() -> move the child to another
> > >    * group -> exec()", the charges should stay in the original cgroup.
> > >    * So, check the mapcount to determine whether we can move the page.
> > >    */
> > >    return page_mapcount(page) != 1;
> > > }
> > 
> > That's a helpful elucidation, thank you.  However...
> > 
> > That is not how it has actually been behaving for the last 18 months
> > (because of the "> 2" bug), so in practice you are asking for a change
> > in behaviour there.
> > 
> Yes.
> 
> 
> > And it's not how it has been and continues to behave with file pages.
> > 
> It's ok to add something like:
> 
> 	if (PageAnon(page) && !move_anon())
> 		return false;
> 	...
> 
> > Isn't getting that behaviour in fork-move-exec just a good reason not
> > to set move_charge_at_immigrate?
> > 
> Hmm. Maybe.
> 
> > I think there are other scenarios where you do want all the pages to
> > move if move_charge_at_immigrate: and that's certainly easier to
> > describe and to understand and to code.
> > 
> > But if you do insist on not moving the shared, then it needs to involve
> > something like mem_cgroup_count_swap_user() on PageSwapCache pages,
> > rather than just the bare page_mapcount().
> > 
> 
> This 'moving swap account' was a requirement from a user (NEC?).
> But no user has said 'I want to move shared pages between cgroups at task
> move!', and I don't like moving shared objects.
> 
> > I'd rather delete than add code here!
> > 
> 
> As a user, for Fujitsu, I believe it's insane to move tasks between cgroups.
> So, I get no benefit from this code at all.
> Ok, maybe I'm not a stakeholder, here.
> 

Thinking about it again, in my personal opinion,
at fork() -> move() -> exec(), the parent's RSS charges should not be moved.

Thanks,
-Kame



* Re: [PATCH v3 2/2] memcg: avoid THP split in task migration
  2012-03-09  1:16   ` KAMEZAWA Hiroyuki
  2012-03-09  2:33     ` Hugh Dickins
@ 2012-03-09  4:25     ` Naoya Horiguchi
  2012-03-09  7:32       ` KAMEZAWA Hiroyuki
  1 sibling, 1 reply; 19+ messages in thread
From: Naoya Horiguchi @ 2012-03-09  4:25 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Naoya Horiguchi, linux-mm, Andrew Morton, Andrea Arcangeli,
	Daisuke Nishimura, Hillf Danton, linux-kernel

Hi KAMEZAWA-san,

> On Fri,  2 Mar 2012 15:13:09 -0500
> Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> wrote:
...
> > +
> > +	page = pmd_page(pmd);
> > +	VM_BUG_ON(!page || !PageHead(page));
> > +	if (!move_anon() || page_mapcount(page) != 1)
> > +		return 0;
>
> Could you add this?
> ==
> static bool move_check_shared_map(struct page *page)
> {
>   /*
>    * Handling of shared pages between processes is a big trouble in memcg.
>    * For now, we never move shared-mapped pages between memcgs at 'task'
>    * moving because we have no hint which task the page really belongs to.
>    * For example, when a task does "fork() -> move the child to another
>    * group -> exec()", the charges should stay in the original cgroup.
>    * So, check the mapcount to determine whether we can move the page.
>    */
>    return page_mapcount(page) != 1;
> }
> ==

Thank you.

We check mapcount only for anonymous pages, so we had better also describe
that viewpoint, no?  And this function returns whether the target page of
charge moving is shared or not, so a name like is_mctgt_shared() looks better
to me. Moreover, the comment explains why we have the current implementation
rather than why the return value is mapcount != 1, so I'd put it above the
function declaration, like this:

  /*
   * Handling of shared pages between processes is a big trouble in memcg.
   * Now, we never move shared anonymous pages between memcgs at 'task'
   * moving because we have no hint which task the page really belongs to.
   * For example, when a task does "fork() -> move the child to another
   * group -> exec()", the charges should stay in the original cgroup.
   * So, check whether a given page is shared to decide whether to move the charge.
   */
  static bool is_mctgt_shared(struct page *page)
  {
     return page_mapcount(page) != 1;
  }

As for the difference between anon pages and file-mapped pages, I have no idea
about the current charge moving policy. Is this explained anywhere? (sorry to
ask before researching it myself ...)


> We may be able to support madvise(MOVE_MEMCG) or fadvise(MOVE_MEMCG), if necessary.

Does this mean the charge moving policy can depend on users?
I feel that's strange because I don't think resource management should be
under users' control.

>
> > +	pc = lookup_page_cgroup(page);
> > +	if (PageCgroupUsed(pc) && pc->mem_cgroup == mc.from) {
> > +		ret = MC_TARGET_PAGE;
> > +		if (target) {
> > +			get_page(page);
> > +			target->page = page;
>
> Here, get_page() is used rather than get_page_unless_zero() because
> __pmd_trans_huge_lock() is held?

Yes, and the page should be a thp head page, so it never has a 0 refcount.
So I thought get_page(), which has VM_BUG_ON(count <= 0), is preferable.
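
(For reference, the two refcounting idioms side by side -- a sketch with
hypothetical wrapper names, not from the patch:)

  /* The caller holds a lock that pins the page (here, the page table
   * lock via __pmd_trans_huge_lock()), so the count is known to be
   * positive; get_page() VM_BUG_ON()s a non-positive count. */
  static struct page *grab_pinned(struct page *page)
  {
  	get_page(page);
  	return page;
  }

  /* No pin is held, so the page may be freed underneath us; take the
   * reference speculatively and let the caller handle failure. */
  static struct page *grab_speculative(struct page *page)
  {
  	if (!get_page_unless_zero(page))
  		return NULL;
  	return page;
  }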

>
> > +		}
> > +	}
> > +	return ret;
> > +}
> > +#else
> > +static inline int is_target_thp_for_mc(struct vm_area_struct *vma,
> > +		unsigned long addr, pmd_t pmd, union mc_target *target)
> > +{
> > +	return 0;
> > +}
> > +#endif
> > +
> >  static int mem_cgroup_count_precharge_pte_range(pmd_t *pmd,
> >  					unsigned long addr, unsigned long end,
> >  					struct mm_walk *walk)
> > @@ -5219,7 +5254,14 @@ static int mem_cgroup_count_precharge_pte_range(pmd_t *pmd,
> >  	pte_t *pte;
> >  	spinlock_t *ptl;
> >
> > -	split_huge_page_pmd(walk->mm, pmd);
> > +	if (pmd_trans_huge_lock(pmd, vma) == 1) {
> > +		if (is_target_thp_for_mc(vma, addr, *pmd, NULL)
> > +		    == MC_TARGET_PAGE)
> > +			mc.precharge += HPAGE_PMD_NR;
> > +		spin_unlock(&vma->vm_mm->page_table_lock);
> > +		cond_resched();
> > +		return 0;
> > +	}
>
> Maybe hard to read ;) I think is_target_thp_for_mc includes too many '_'s
> and short words...
>
> Hmm, how about renaming "is_target_thp_for_mc" to "pmd_move_target()" or something?
> (Ah yes, the other handlers' names should be fixed, too.)

As I wrote in the reply to Andrew, I want to go with get_mctgt_type[_thp]
if possible.

> >
> >  	pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
> >  	for (; addr != end; pte++, addr += PAGE_SIZE)
> > @@ -5378,16 +5420,51 @@ static int mem_cgroup_move_charge_pte_range(pmd_t *pmd,
> >  	struct vm_area_struct *vma = walk->private;
> >  	pte_t *pte;
> >  	spinlock_t *ptl;
> > +	int type;
> > +	union mc_target target;
> > +	struct page *page;
> > +	struct page_cgroup *pc;
> > +
> > +	/*
> > +	 * We don't take compound_lock() here but no race with splitting thp
> > +	 * happens because:
> > +	 *  - if pmd_trans_huge_lock() returns 1, the relevant thp is not
> > +	 *    under splitting, which means there's no concurrent thp split,
> > +	 *  - if another thread runs into split_huge_page() just after we
> > +	 *    entered this if-block, the thread must wait for page table lock
> > +	 *    to be unlocked in __split_huge_page_splitting(), where the main
> > +	 *    part of thp split is not executed yet.
> > +	 */
>
> ok.
>
>
> > +	if (pmd_trans_huge_lock(pmd, vma) == 1) {
> > +		if (!mc.precharge) {
> > +			spin_unlock(&vma->vm_mm->page_table_lock);
> > +			cond_resched();
>
> Hm. Original code calls cond_resched() after 'scanning' the full pmd, 1024 entries.
> With THP, it just handles 1 entry. cond_resched() will not be required.

I agree.

> > +			return 0;
> > +		}
> > +		type = is_target_thp_for_mc(vma, addr, *pmd, &target);
> > +		if (type == MC_TARGET_PAGE) {
> > +			page = target.page;
> > +			if (!isolate_lru_page(page)) {
> > +				pc = lookup_page_cgroup(page);
> > +				if (!mem_cgroup_move_account(page, HPAGE_PMD_NR,
> > +							     pc, mc.from, mc.to,
> > +							     false)) {
> > +					mc.precharge -= HPAGE_PMD_NR;
> > +					mc.moved_charge += HPAGE_PMD_NR;
> > +				}
> > +				putback_lru_page(page);
> > +			}
> > +			put_page(page);
> > +		}
> > +		spin_unlock(&vma->vm_mm->page_table_lock);
> > +		cond_resched();
>
> ditto.
>
>
> > +		return 0;
> > +	}
> >
> > -	split_huge_page_pmd(walk->mm, pmd);
> >  retry:
> >  	pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
> >  	for (; addr != end; addr += PAGE_SIZE) {
> >  		pte_t ptent = *(pte++);
> > -		union mc_target target;
> > -		int type;
> > -		struct page *page;
> > -		struct page_cgroup *pc;
> >  		swp_entry_t ent;
> >
> >  		if (!mc.precharge)
>
>
> Thank you for your efforts!

Thanks for taking time for the review.

Naoya


* Re: [PATCH v3 2/2] memcg: avoid THP split in task migration
  2012-03-09  3:24       ` KAMEZAWA Hiroyuki
  2012-03-09  4:00         ` KAMEZAWA Hiroyuki
@ 2012-03-09  6:01         ` Daisuke Nishimura
  2012-03-09  7:23           ` [PATCH] memcg: fix behavior of shared anon pages at task_move (Was " KAMEZAWA Hiroyuki
  1 sibling, 1 reply; 19+ messages in thread
From: Daisuke Nishimura @ 2012-03-09  6:01 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Daisuke Nishimura, Hugh Dickins, Naoya Horiguchi, linux-mm,
	Andrew Morton, Andrea Arcangeli, Hillf Danton, linux-kernel

On Fri, 9 Mar 2012 12:24:48 +0900
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:

> On Thu, 8 Mar 2012 18:33:14 -0800 (PST)
> Hugh Dickins <hughd@google.com> wrote:
> 
> > On Fri, 9 Mar 2012, KAMEZAWA Hiroyuki wrote:
> > > > +
> > > > +	page = pmd_page(pmd);
> > > > +	VM_BUG_ON(!page || !PageHead(page));
> > > > +	if (!move_anon() || page_mapcount(page) != 1)
> > > > +		return 0;
> > > 
> > > Could you add this?
> > > ==
> > > static bool move_check_shared_map(struct page *page)
> > > {
> > >   /*
> > >    * Handling of shared pages between processes is a big trouble in memcg.
> > >    * For now, we never move shared-mapped pages between memcgs at 'task'
> > >    * moving because we have no hint which task the page really belongs to.
> > >    * For example, when a task does "fork() -> move the child to another
> > >    * group -> exec()", the charges should stay in the original cgroup.
> > >    * So, check the mapcount to determine whether we can move the page.
> > >    */
> > >    return page_mapcount(page) != 1;
> > > }
> > 
> > That's a helpful elucidation, thank you.  However...
> > 
> > That is not how it has actually been behaving for the last 18 months
> > (because of the "> 2" bug), so in practice you are asking for a change
> > in behaviour there.
> > 
> Yes.
> 
> 
> > And it's not how it has been and continues to behave with file pages.
> > 
> > It's ok to add something like:
> 
> 	if (PageAnon(page) && !move_anon())
> 		return false;
> 	...
> 
> > Isn't getting that behaviour in fork-move-exec just a good reason not
> > to set move_charge_at_immigrate?
> > 
> Hmm. Maybe.
> 
> > I think there are other scenarios where you do want all the pages to
> > move if move_charge_at_immigrate: and that's certainly easier to
> > describe and to understand and to code.
> > 
> > But if you do insist on not moving the shared, then it needs to involve
> > something like mem_cgroup_count_swap_user() on PageSwapCache pages,
> > rather than just the bare page_mapcount().
> > 
> 
> This 'moving swap account' was a requirement from a user (NEC?).
> > But no user has said 'I want to move shared pages between cgroups at task
> > move!', and I don't like moving shared objects.
> 
> > I'd rather delete than add code here!
> > 
> 
> > As a user, for Fujitsu, I believe it's insane to move tasks between cgroups.
> > So, I get no benefit from this code at all.
> > Ok, maybe I'm not a stakeholder, here.
> 
I agree that moving tasks between cgroups is not a sane operation and
users won't do it very frequently, but I cannot prevent it.
That's why I implemented this feature.

> If users say all shared pages should be moved, ok, let's move them.
> But the change of behavior should be documented and implemented in an independent
> patch. CC'ed Nishimura-san, who implemented this, as a real user.
> 
To be honest, shared anon is not my concern. My concern is
shared memory (that's why mapcount is not checked for file pages;
I assume all processes sharing the same shared memory will be moved together).
So, it's all right with me to change the behavior for shared anon (or leave
it as it is).


Thanks,
Daisuke Nishimura.


* [PATCH] memcg: fix behavior of shared anon pages at task_move (Was Re: [PATCH v3 2/2] memcg: avoid THP split in task migration
  2012-03-09  6:01         ` Daisuke Nishimura
@ 2012-03-09  7:23           ` KAMEZAWA Hiroyuki
  2012-03-09 21:05             ` Hugh Dickins
  0 siblings, 1 reply; 19+ messages in thread
From: KAMEZAWA Hiroyuki @ 2012-03-09  7:23 UTC (permalink / raw)
  To: Daisuke Nishimura
  Cc: Hugh Dickins, Naoya Horiguchi, linux-mm, Andrew Morton,
	Andrea Arcangeli, Hillf Danton, linux-kernel

On Fri, 9 Mar 2012 15:01:09 +0900
Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp> wrote:

> On Fri, 9 Mar 2012 12:24:48 +0900
> KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> wrote:
> > > I'd rather delete than add code here!
> > > 
> > 
> > As a user, for Fujitsu, I believe it's insane to move tasks between cgroups.
> > So, I get no benefit from this code at all.
> > Ok, maybe I'm not a stakeholder, here.
> > 
> I agree that moving tasks between cgroups is not a sane operation and
> users won't do it very frequently, but I cannot prevent it.
> That's why I implemented this feature.
> 
> > If users say all shared pages should be moved, ok, let's move them.
> > But the change of behavior should be documented and implemented in an independent
> > patch. CC'ed Nishimura-san, who implemented this, as a real user.
> > 
> To be honest, shared anon is not my concern. My concern is
> shared memory (that's why mapcount is not checked for file pages;
> I assume all processes sharing the same shared memory will be moved together).
> So, it's all right with me to change the behavior for shared anon (or leave
> it as it is).
> 

Thank you for the comment. Then, here is a patch.

Other opinions?

==
From 1012e97e3b123fb80d0ec6b1f5d3dbc87a5a5139 Mon Sep 17 00:00:00 2001
From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Date: Fri, 9 Mar 2012 16:22:32 +0900
Subject: [PATCH] memcg: fix/change behavior of shared anon at moving task.

The documentation says that 'shared anon pages are not moved'.
But in the implementation, this check was used:

  if (!move_anon() || page_mapcount(page) > 2)

Ah, memcg has been moving shared anon pages for a long time.

Then, here are the discussion points about handling of shared anon pages.

 - It's complex.
 - Now, shared file caches are moved in force.
 - It adds an unclear check, page_mapcount(). To do a correct check,
   we should check swap users, etc.
 - No one noticed this implementation behavior. So, no one gets any benefit
   from the design.
 - In general, once a task is moved to a cgroup for running, it will not
   be moved again....
 - Finally, we have a control knob, memory.move_charge_at_immigrate.


Here is a patch to allow moving shared pages. This makes memcg simpler
and fixes the current broken implementation.
I added a notice about what happens with fork() -> move -> exec() usage.

Note:
 IIUC, libcgroup's cgroup daemon moves tasks after exec().
 So, it's not affected.
 libcgroup's command "cgexec" moves itself to a memcg and calls exec()
 without fork(), so it's not affected either.

Suggested-by: Hugh Dickins <hughd@google.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
---
 Documentation/cgroups/memory.txt |   10 ++++++++--
 include/linux/swap.h             |    9 ---------
 mm/memcontrol.c                  |   15 +++------------
 mm/swapfile.c                    |   31 -------------------------------
 4 files changed, 11 insertions(+), 54 deletions(-)

diff --git a/Documentation/cgroups/memory.txt b/Documentation/cgroups/memory.txt
index 4c95c00..16bc9f2 100644
--- a/Documentation/cgroups/memory.txt
+++ b/Documentation/cgroups/memory.txt
@@ -185,6 +185,9 @@ behind this approach is that a cgroup that aggressively uses a shared
 page will eventually get charged for it (once it is uncharged from
 the cgroup that brought it in -- this will happen on memory pressure).
 
+(*)See section 8.2. At task moving, you can recharge mapped pages to other
+   cgroup.
+
 Exception: If CONFIG_CGROUP_CGROUP_MEM_RES_CTLR_SWAP is not used.
 When you do swapoff and make swapped-out pages of shmem(tmpfs) to
 be backed into memory in force, charges for pages are accounted against the
@@ -623,8 +626,8 @@ memory cgroup.
   bit | what type of charges would be moved ?
  -----+------------------------------------------------------------------------
    0  | A charge of an anonymous page(or swap of it) used by the target task.
-      | Those pages and swaps must be used only by the target task. You must
-      | enable Swap Extension(see 2.4) to enable move of swap charges.
+      | Even if it's shared, it will be moved in force(*). You must enable Swap
+      | Extension(see 2.4) to enable move of swap charges.
  -----+------------------------------------------------------------------------
    1  | A charge of file pages(normal file, tmpfs file(e.g. ipc shared memory)
       | and swaps of tmpfs file) mmapped by the target task. Unlike the case of
@@ -635,6 +638,9 @@ memory cgroup.
       | page_mapcount(page) > 1). You must enable Swap Extension(see 2.4) to
       | enable move of swap charges.
 
+(*) Because of this, fork() -> move -> exec() will move all parent's page
+    to the target cgroup. Please be careful.
+
 8.3 TODO
 
 - Implement madvise(2) to let users decide the vma to be moved or not to be
diff --git a/include/linux/swap.h b/include/linux/swap.h
index f7df3ea..13c8d6f 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -390,7 +390,6 @@ static inline void deactivate_swap_token(struct mm_struct *mm, bool swap_token)
 #ifdef CONFIG_CGROUP_MEM_RES_CTLR
 extern void
 mem_cgroup_uncharge_swapcache(struct page *page, swp_entry_t ent, bool swapout);
-extern int mem_cgroup_count_swap_user(swp_entry_t ent, struct page **pagep);
 #else
 static inline void
 mem_cgroup_uncharge_swapcache(struct page *page, swp_entry_t ent, bool swapout)
@@ -535,14 +534,6 @@ mem_cgroup_uncharge_swapcache(struct page *page, swp_entry_t ent)
 {
 }
 
-#ifdef CONFIG_CGROUP_MEM_RES_CTLR
-static inline int
-mem_cgroup_count_swap_user(swp_entry_t ent, struct page **pagep)
-{
-	return 0;
-}
-#endif
-
 #endif /* CONFIG_SWAP */
 #endif /* __KERNEL__*/
 #endif /* _LINUX_SWAP_H */
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index c83aeb5..e7e4e3d 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5100,12 +5100,9 @@ static struct page *mc_handle_present_pte(struct vm_area_struct *vma,
 
 	if (!page || !page_mapped(page))
 		return NULL;
-	if (PageAnon(page)) {
-		/* we don't move shared anon */
-		if (!move_anon() || page_mapcount(page) > 2)
-			return NULL;
+	if (PageAnon(page) && !move_anon()) {
+		return NULL;
 	} else if (!move_file())
-		/* we ignore mapcount for file pages */
 		return NULL;
 	if (!get_page_unless_zero(page))
 		return NULL;
@@ -5116,18 +5113,12 @@ static struct page *mc_handle_present_pte(struct vm_area_struct *vma,
 static struct page *mc_handle_swap_pte(struct vm_area_struct *vma,
 			unsigned long addr, pte_t ptent, swp_entry_t *entry)
 {
-	int usage_count;
 	struct page *page = NULL;
 	swp_entry_t ent = pte_to_swp_entry(ptent);
 
 	if (!move_anon() || non_swap_entry(ent))
 		return NULL;
-	usage_count = mem_cgroup_count_swap_user(ent, &page);
-	if (usage_count > 1) { /* we don't move shared anon */
-		if (page)
-			put_page(page);
-		return NULL;
-	}
+	page = lookup_swap_cache(ent);
 	if (do_swap_account)
 		entry->val = ent.val;
 
diff --git a/mm/swapfile.c b/mm/swapfile.c
index fa3c519..85b4548 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -720,37 +720,6 @@ int free_swap_and_cache(swp_entry_t entry)
 	return p != NULL;
 }
 
-#ifdef CONFIG_CGROUP_MEM_RES_CTLR
-/**
- * mem_cgroup_count_swap_user - count the user of a swap entry
- * @ent: the swap entry to be checked
- * @pagep: the pointer for the swap cache page of the entry to be stored
- *
- * Returns the number of the user of the swap entry. The number is valid only
- * for swaps of anonymous pages.
- * If the entry is found on swap cache, the page is stored to pagep with
- * refcount of it being incremented.
- */
-int mem_cgroup_count_swap_user(swp_entry_t ent, struct page **pagep)
-{
-	struct page *page;
-	struct swap_info_struct *p;
-	int count = 0;
-
-	page = find_get_page(&swapper_space, ent.val);
-	if (page)
-		count += page_mapcount(page);
-	p = swap_info_get(ent);
-	if (p) {
-		count += swap_count(p->swap_map[swp_offset(ent)]);
-		spin_unlock(&swap_lock);
-	}
-
-	*pagep = page;
-	return count;
-}
-#endif
-
 #ifdef CONFIG_HIBERNATION
 /*
  * Find the swap type that corresponds to given device (if any).
-- 
1.7.4.1


* Re: [PATCH v3 2/2] memcg: avoid THP split in task migration
  2012-03-09  4:25     ` Naoya Horiguchi
@ 2012-03-09  7:32       ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 19+ messages in thread
From: KAMEZAWA Hiroyuki @ 2012-03-09  7:32 UTC (permalink / raw)
  To: Naoya Horiguchi
  Cc: linux-mm, Andrew Morton, Andrea Arcangeli, Daisuke Nishimura,
	Hillf Danton, linux-kernel

On Thu,  8 Mar 2012 23:25:28 -0500
Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> wrote:

> Hi KAMEZAWA-san,
> 
> > On Fri,  2 Mar 2012 15:13:09 -0500
> > Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> wrote:
> ...
> > > +
> > > +	page = pmd_page(pmd);
> > > +	VM_BUG_ON(!page || !PageHead(page));
> > > +	if (!move_anon() || page_mapcount(page) != 1)
> > > +		return 0;
> >
> > Could you add this?
> > ==
> > static bool move_check_shared_map(struct page *page)
> > {
> >   /*
> >    * Handling of shared pages between processes is a big trouble in memcg.
> >    * For now, we never move shared-mapped pages between memcgs at 'task'
> >    * moving because we have no hint which task the page really belongs to.
> >    * For example, when a task does "fork() -> move the child to another
> >    * group -> exec()", the charges should stay in the original cgroup.
> >    * So, check the mapcount to determine whether we can move the page.
> >    */
> >    return page_mapcount(page) != 1;
> > }
> > ==
> 
> Thank you.
> 
> We check mapcount only for anonymous pages, so we had better describe that
> viewpoint too.  And this function returns whether the target page of moving
> charge is shared or not, so a name like is_mctgt_shared() looks better to me.
> Moreover, the comment explains why we have the current implementation, rather
> than why the return value is mapcount != 1, so I put it above the function
> declaration like this:
> 
>   /*
>    * Handling of shared pages between processes is a big trouble in memcg.
>    * Now, we never move shared anonymous pages between memcg at 'task'
>    * moving, because we have no hint which task the page really belongs to.
>    * For example, when a task does "fork() -> move the child to another group
>    * -> exec()", the charges should stay in the original cgroup.
>    * So, check if a given page is shared or not to determine whether to move
>    * the charge.
>    */
>   static bool is_mctgt_shared(struct page *page)
>   {
>      return page_mapcount(page) != 1;
>   }
> 
> As for the difference between anon pages and file-mapped pages, I don't know
> the current charge moving policy. Is this explained anywhere? (Sorry to
> ask before researching it myself ...)
> 
> 

Now, I think it's okay to remove the mapcount check. I posted a patch for reference.
Please check it.
https://lkml.org/lkml/2012/3/9/40

> > We may be able to support madvise(MOVE_MEMCG) or fadvise(MOVE_MEMCG), if necessary.
> 
> Does this mean the moving charge policy can depend on users?
> I feel that's strange, because I don't think resource management should be
> under users' control.
> 
You're right.

Hm. I remember someone suggested 'how about passing a preferred memcg as a
mount option' or something. Anyway, shared page handling has been trouble
since memory cgroup was born.

Thanks,
-Kame


^ permalink raw reply	[flat|nested] 19+ messages in thread
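
A minimal sketch of the pieces discussed above, combining the quoted v3 hunk
with the helper name Naoya proposes (is_mctgt_shared() is a suggestion in this
mail only, not code in any tree):

	/*
	 * Shared pages are a big trouble in memcg: at 'task' moving we have
	 * no hint which task a shared page really belongs to, so the v3
	 * patch declines to move shared anon pages.
	 */
	static bool is_mctgt_shared(struct page *page)
	{
		return page_mapcount(page) != 1;
	}

	/* inside the THP (pmd) handler of the v3 patch, roughly: */
	page = pmd_page(pmd);
	VM_BUG_ON(!page || !PageHead(page));
	if (!move_anon() || is_mctgt_shared(page))
		return 0;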

* Re: [PATCH] memcg: fix behavior of shared anon pages at task_move (Was Re: [PATCH v3 2/2] memcg: avoid THP split in task migration
  2012-03-09  7:23           ` [PATCH] memcg: fix behavior of shared anon pages at task_move (Was " KAMEZAWA Hiroyuki
@ 2012-03-09 21:05             ` Hugh Dickins
  2012-03-09 21:37               ` [PATCH] memcg: revert fix to mapcount check for this release Hugh Dickins
  2012-03-13  5:18               ` [PATCH] memcg: fix behavior of shared anon pages at task_move (Was Re: [PATCH v3 2/2] memcg: avoid THP split in task migration KAMEZAWA Hiroyuki
  0 siblings, 2 replies; 19+ messages in thread
From: Hugh Dickins @ 2012-03-09 21:05 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Daisuke Nishimura, Naoya Horiguchi, linux-mm, Andrew Morton,
	Andrea Arcangeli, Hillf Danton, linux-kernel

On Fri, 9 Mar 2012, KAMEZAWA Hiroyuki wrote:
> From 1012e97e3b123fb80d0ec6b1f5d3dbc87a5a5139 Mon Sep 17 00:00:00 2001
> From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> Date: Fri, 9 Mar 2012 16:22:32 +0900
> Subject: [PATCH] memcg: fix/change behavior of shared anon at moving task.
> 
> In the documentation, it's said that 'shared anon pages are not moved'.
> But in the implementation, the check was:
> 
>   if (!move_anon() || page_mapcount(page) > 2)
> 
> Ah, memcg has been moving shared anon pages for a long time.
> 
> Here is a summary of the discussion about handling of shared anon pages:
> 
>  - It's complex.
>  - Shared file caches are already moved unconditionally.
>  - It adds an unclear page_mapcount() check; a correct check would also
>    have to count swap users, etc.
>  - No one noticed this implementation behavior, so no one benefits from
>    the design.
>  - In general, once a task is moved to a cgroup for running, it will not
>    be moved again.
>  - Finally, we have the control knob memory.move_charge_at_immigrate.
> 
> 
> Here is a patch to allow moving shared pages. This makes memcg simpler
> and fixes the current broken implementation.
> I added a notice about what happens with fork() -> move -> exec() usage.
> 
> Note:
>  IIUC, libcgroup's cgroup daemon moves tasks after exec(),
>  so it's not affected.
>  libcgroup's command "cgexec" moves itself to a memcg and calls exec()
>  without fork(), so it's not affected either.
> 
> Suggested-by: Hugh Dickins <hughd@google.com>
> Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

Yes, this is much like what I had in mind (well, on disk):
I'm glad we are in agreement now.

But first priority, I think, should be to revert to the silly "> 2"
test for 3.3 final, so it behaves just like 2.6.35 through 3.2:
I'll send Andrew and Linus a patch for that shortly.

> ---
>  Documentation/cgroups/memory.txt |   10 ++++++++--
>  include/linux/swap.h             |    9 ---------
>  mm/memcontrol.c                  |   15 +++------------
>  mm/swapfile.c                    |   31 -------------------------------
>  4 files changed, 11 insertions(+), 54 deletions(-)
> 
> diff --git a/Documentation/cgroups/memory.txt b/Documentation/cgroups/memory.txt
> index 4c95c00..16bc9f2 100644
> --- a/Documentation/cgroups/memory.txt
> +++ b/Documentation/cgroups/memory.txt
> @@ -185,6 +185,9 @@ behind this approach is that a cgroup that aggressively uses a shared
>  page will eventually get charged for it (once it is uncharged from
>  the cgroup that brought it in -- this will happen on memory pressure).
>  
> +(*)See section 8.2. At task moving, you can recharge mapped pages to other
> +   cgroup.
> +

Perhaps:

But see section 8.2: when moving a task to another cgroup, its pages may
be recharged to the new cgroup, if move_charge_at_immigrate has been chosen.

(I've intentionally omitted the word "mapped" there because it can move
even unmapped file pages and swapped-out anon pages.)

>  Exception: If CONFIG_CGROUP_CGROUP_MEM_RES_CTLR_SWAP is not used.
>  When you do swapoff and make swapped-out pages of shmem(tmpfs) to
>  be backed into memory in force, charges for pages are accounted against the
> @@ -623,8 +626,8 @@ memory cgroup.
>    bit | what type of charges would be moved ?
>   -----+------------------------------------------------------------------------
>     0  | A charge of an anonymous page(or swap of it) used by the target task.
> -      | Those pages and swaps must be used only by the target task. You must
> -      | enable Swap Extension(see 2.4) to enable move of swap charges.
> +      | Even if it's shared, it will be moved in force(*). You must enable Swap

What is this "force"?  I solved the Documentation problem by simply
removing one sentence, but you have tried harder.

> +      | Extension(see 2.4) to enable move of swap charges.
>   -----+------------------------------------------------------------------------
>     1  | A charge of file pages(normal file, tmpfs file(e.g. ipc shared memory)
>        | and swaps of tmpfs file) mmapped by the target task. Unlike the case of
> @@ -635,6 +638,9 @@ memory cgroup.
>        | page_mapcount(page) > 1). You must enable Swap Extension(see 2.4) to
>        | enable move of swap charges.
>  
> +(*) Because of this, fork() -> move -> exec() will move all parent's page
> +    to the target cgroup. Please be careful.
> +

It only moves those pages which were in the old cgroup.

>  8.3 TODO
>  
>  - Implement madvise(2) to let users decide the vma to be moved or not to be
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index f7df3ea..13c8d6f 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -390,7 +390,6 @@ static inline void deactivate_swap_token(struct mm_struct *mm, bool swap_token)
>  #ifdef CONFIG_CGROUP_MEM_RES_CTLR
>  extern void
>  mem_cgroup_uncharge_swapcache(struct page *page, swp_entry_t ent, bool swapout);
> -extern int mem_cgroup_count_swap_user(swp_entry_t ent, struct page **pagep);
>  #else
>  static inline void
>  mem_cgroup_uncharge_swapcache(struct page *page, swp_entry_t ent, bool swapout)
> @@ -535,14 +534,6 @@ mem_cgroup_uncharge_swapcache(struct page *page, swp_entry_t ent)
>  {
>  }
>  
> -#ifdef CONFIG_CGROUP_MEM_RES_CTLR
> -static inline int
> -mem_cgroup_count_swap_user(swp_entry_t ent, struct page **pagep)
> -{
> -	return 0;
> -}
> -#endif
> -
>  #endif /* CONFIG_SWAP */
>  #endif /* __KERNEL__*/
>  #endif /* _LINUX_SWAP_H */
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index c83aeb5..e7e4e3d 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -5100,12 +5100,9 @@ static struct page *mc_handle_present_pte(struct vm_area_struct *vma,
>  
>  	if (!page || !page_mapped(page))
>  		return NULL;
> -	if (PageAnon(page)) {
> -		/* we don't move shared anon */
> -		if (!move_anon() || page_mapcount(page) > 2)
> -			return NULL;
> +	if (PageAnon(page) && !move_anon()) {
> +		return NULL;
>  	} else if (!move_file())
> -		/* we ignore mapcount for file pages */
>  		return NULL;

Doesn't that need to be

	if (PageAnon(page)) {
		if (!move_anon())
			return NULL;
	} else if (!move_file())
		return NULL;

I think what you've written there makes treatment of anon pages
in the move_anon() case dependent on move_file(), doesn't it?
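
A minimal trace of the fall-through, assuming move_anon() is enabled and
move_file() is not:

	if (PageAnon(page) && !move_anon()) {	/* false: move_anon() is set */
		return NULL;
	} else if (!move_file())		/* taken: the anon page falls through */
		return NULL;			/* ...and is wrongly rejected here */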

>  	if (!get_page_unless_zero(page))
>  		return NULL;
> @@ -5116,18 +5113,12 @@ static struct page *mc_handle_present_pte(struct vm_area_struct *vma,
>  static struct page *mc_handle_swap_pte(struct vm_area_struct *vma,
>  			unsigned long addr, pte_t ptent, swp_entry_t *entry)
>  {
> -	int usage_count;
>  	struct page *page = NULL;
>  	swp_entry_t ent = pte_to_swp_entry(ptent);
>  
>  	if (!move_anon() || non_swap_entry(ent))
>  		return NULL;
> -	usage_count = mem_cgroup_count_swap_user(ent, &page);
> -	if (usage_count > 1) { /* we don't move shared anon */
> -		if (page)
> -			put_page(page);
> -		return NULL;
> -	}
> +	page = lookup_swap_cache(ent);

It's annoying, but there's a tiny reason why that line needs to be

#ifdef CONFIG_SWAP
	page = find_get_page(&swapper_space, ent.val);
#endif

because lookup_swap_cache() updates some antiquated swap lookup stats
(which I do sometimes look at), which a straight find_get_page() avoids,
rightly in this case; but swapper_space is only defined when CONFIG_SWAP=y.

I was about to propose that we just add an arg to lookup_swap_cache(),
for updating those stats or not; but see that mincore.c would like a
little more than that to avoid its CONFIG_SWAPs, not sure quite what.

So for now, please just use the #ifdef CONFIG_SWAP find_get_page()
that we use elsewhere, and I'll try to remove those warts later on.

>  	if (do_swap_account)
>  		entry->val = ent.val;
>  
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index fa3c519..85b4548 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -720,37 +720,6 @@ int free_swap_and_cache(swp_entry_t entry)
>  	return p != NULL;
>  }
>  
> -#ifdef CONFIG_CGROUP_MEM_RES_CTLR
> -/**
> - * mem_cgroup_count_swap_user - count the user of a swap entry
> - * @ent: the swap entry to be checked
> - * @pagep: the pointer for the swap cache page of the entry to be stored
> - *
> - * Returns the number of the user of the swap entry. The number is valid only
> - * for swaps of anonymous pages.
> - * If the entry is found on swap cache, the page is stored to pagep with
> - * refcount of it being incremented.
> - */
> -int mem_cgroup_count_swap_user(swp_entry_t ent, struct page **pagep)
> -{
> -	struct page *page;
> -	struct swap_info_struct *p;
> -	int count = 0;
> -
> -	page = find_get_page(&swapper_space, ent.val);
> -	if (page)
> -		count += page_mapcount(page);
> -	p = swap_info_get(ent);
> -	if (p) {
> -		count += swap_count(p->swap_map[swp_offset(ent)]);
> -		spin_unlock(&swap_lock);
> -	}
> -
> -	*pagep = page;
> -	return count;
> -}
> -#endif
> -

Exactly, delete delete delete!

>  #ifdef CONFIG_HIBERNATION
>  /*
>   * Find the swap type that corresponds to given device (if any).
> -- 
> 1.7.4.1

Hugh

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH] memcg: revert fix to mapcount check for this release
  2012-03-09 21:05             ` Hugh Dickins
@ 2012-03-09 21:37               ` Hugh Dickins
  2012-03-09 22:46                 ` Naoya Horiguchi
  2012-03-13  5:13                 ` KAMEZAWA Hiroyuki
  2012-03-13  5:18               ` [PATCH] memcg: fix behavior of shared anon pages at task_move (Was Re: [PATCH v3 2/2] memcg: avoid THP split in task migration KAMEZAWA Hiroyuki
  1 sibling, 2 replies; 19+ messages in thread
From: Hugh Dickins @ 2012-03-09 21:37 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Andrew Morton, KAMEZAWA Hiroyuki, Daisuke Nishimura,
	Naoya Horiguchi, Andrea Arcangeli, Hillf Danton, linux-kernel,
	linux-mm

Respectfully revert commit e6ca7b89dc76 "memcg: fix mapcount check
in move charge code for anonymous page" for the 3.3 release, so that
it behaves exactly like releases 2.6.35 through 3.2 in this respect.

Horiguchi-san's commit is correct in itself, 1 makes much more sense
than 2 in that check; but it does not go far enough - swapcount
should be considered too - if we really want such a check at all.

We appear to have reached agreement now, and expect that 3.4 will
remove the mapcount check, but had better not make 3.3 different.

Signed-off-by: Hugh Dickins <hughd@google.com>
---

 mm/memcontrol.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- 3.3-rc6+/mm/memcontrol.c	2012-03-05 22:03:45.940000832 -0800
+++ linux/mm/memcontrol.c	2012-03-09 13:06:41.716250093 -0800
@@ -5075,7 +5075,7 @@ static struct page *mc_handle_present_pt
 		return NULL;
 	if (PageAnon(page)) {
 		/* we don't move shared anon */
-		if (!move_anon() || page_mapcount(page) > 1)
+		if (!move_anon() || page_mapcount(page) > 2)
 			return NULL;
 	} else if (!move_file())
 		/* we ignore mapcount for file pages */

^ permalink raw reply	[flat|nested] 19+ messages in thread
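
For the record, the semantics being restored (an illustrative sketch; the
comments paraphrase the changelog above, they are not part of the patch):

	/*
	 * An anon page mapped only by the target task has
	 * page_mapcount(page) == 1; after fork() it is 2 in parent + child.
	 *
	 *   > 1 : also rejects the common fork()-shared case (the 3.3-rc check)
	 *   > 2 : still lets that case move, as 2.6.35 through 3.2 did
	 */
	if (!move_anon() || page_mapcount(page) > 2)
		return NULL;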

* Re: [PATCH] memcg: revert fix to mapcount check for this release
  2012-03-09 21:37               ` [PATCH] memcg: revert fix to mapcount check for this release Hugh Dickins
@ 2012-03-09 22:46                 ` Naoya Horiguchi
  2012-03-13  5:13                 ` KAMEZAWA Hiroyuki
  1 sibling, 0 replies; 19+ messages in thread
From: Naoya Horiguchi @ 2012-03-09 22:46 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Linus Torvalds, Andrew Morton, KAMEZAWA Hiroyuki,
	Daisuke Nishimura, Naoya Horiguchi, Andrea Arcangeli,
	Hillf Danton, linux-kernel, linux-mm

On Fri, Mar 09, 2012 at 01:37:32PM -0800, Hugh Dickins wrote:
> Respectfully revert commit e6ca7b89dc76 "memcg: fix mapcount check
> in move charge code for anonymous page" for the 3.3 release, so that
> it behaves exactly like releases 2.6.35 through 3.2 in this respect.
> 
> Horiguchi-san's commit is correct in itself, 1 makes much more sense
> than 2 in that check; but it does not go far enough - swapcount
> should be considered too - if we really want such a check at all.

Agreed. We should rethink the whole design rather than the details.

> We appear to have reached agreement now, and expect that 3.4 will
> remove the mapcount check, but had better not make 3.3 different.
> 
> Signed-off-by: Hugh Dickins <hughd@google.com>

Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>

> ---
> 
>  mm/memcontrol.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> --- 3.3-rc6+/mm/memcontrol.c	2012-03-05 22:03:45.940000832 -0800
> +++ linux/mm/memcontrol.c	2012-03-09 13:06:41.716250093 -0800
> @@ -5075,7 +5075,7 @@ static struct page *mc_handle_present_pt
>  		return NULL;
>  	if (PageAnon(page)) {
>  		/* we don't move shared anon */
> -		if (!move_anon() || page_mapcount(page) > 1)
> +		if (!move_anon() || page_mapcount(page) > 2)
>  			return NULL;
>  	} else if (!move_file())
>  		/* we ignore mapcount for file pages */
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH v3 1/2] thp: add HPAGE_PMD_* definitions for !CONFIG_TRANSPARENT_HUGEPAGE
  2012-03-02 20:13 [PATCH v3 1/2] thp: add HPAGE_PMD_* definitions for !CONFIG_TRANSPARENT_HUGEPAGE Naoya Horiguchi
  2012-03-02 20:13 ` [PATCH v3 2/2] memcg: avoid THP split in task migration Naoya Horiguchi
  2012-03-09  0:33 ` [PATCH v3 1/2] thp: add HPAGE_PMD_* definitions for !CONFIG_TRANSPARENT_HUGEPAGE KAMEZAWA Hiroyuki
@ 2012-03-09 23:49 ` David Rientjes
  2 siblings, 0 replies; 19+ messages in thread
From: David Rientjes @ 2012-03-09 23:49 UTC (permalink / raw)
  To: Naoya Horiguchi
  Cc: linux-mm, Andrew Morton, Andrea Arcangeli, KAMEZAWA Hiroyuki,
	Daisuke Nishimura, Hillf Danton, linux-kernel

On Fri, 2 Mar 2012, Naoya Horiguchi wrote:

> These macros will be used in later patch, where all usage are expected
> to be optimized away without #ifdef CONFIG_TRANSPARENT_HUGEPAGE.
> But to detect unexpected usages, we convert existing BUG() to BUILD_BUG().
> 
> Signed-off-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
> Acked-by: Hillf Danton <dhillf@gmail.com>
> Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>

Acked-by: David Rientjes <rientjes@google.com>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] memcg: revert fix to mapcount check for this release
  2012-03-09 21:37               ` [PATCH] memcg: revert fix to mapcount check for this release Hugh Dickins
  2012-03-09 22:46                 ` Naoya Horiguchi
@ 2012-03-13  5:13                 ` KAMEZAWA Hiroyuki
  1 sibling, 0 replies; 19+ messages in thread
From: KAMEZAWA Hiroyuki @ 2012-03-13  5:13 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Linus Torvalds, Andrew Morton, Daisuke Nishimura,
	Naoya Horiguchi, Andrea Arcangeli, Hillf Danton, linux-kernel,
	linux-mm

On Fri, 9 Mar 2012 13:37:32 -0800 (PST)
Hugh Dickins <hughd@google.com> wrote:

> Respectfully revert commit e6ca7b89dc76 "memcg: fix mapcount check
> in move charge code for anonymous page" for the 3.3 release, so that
> it behaves exactly like releases 2.6.35 through 3.2 in this respect.
> 
> Horiguchi-san's commit is correct in itself, 1 makes much more sense
> than 2 in that check; but it does not go far enough - swapcount
> should be considered too - if we really want such a check at all.
> 
> We appear to have reached agreement now, and expect that 3.4 will
> remove the mapcount check, but had better not make 3.3 different.
> 
> Signed-off-by: Hugh Dickins <hughd@google.com>

Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] memcg: fix behavior of shared anon pages at task_move (Was Re: [PATCH v3 2/2] memcg: avoid THP split in task migration
  2012-03-09 21:05             ` Hugh Dickins
  2012-03-09 21:37               ` [PATCH] memcg: revert fix to mapcount check for this release Hugh Dickins
@ 2012-03-13  5:18               ` KAMEZAWA Hiroyuki
  1 sibling, 0 replies; 19+ messages in thread
From: KAMEZAWA Hiroyuki @ 2012-03-13  5:18 UTC (permalink / raw)
  To: Hugh Dickins
  Cc: Daisuke Nishimura, Naoya Horiguchi, linux-mm, Andrew Morton,
	Andrea Arcangeli, Hillf Danton, linux-kernel

On Fri, 9 Mar 2012 13:05:05 -0800 (PST)
Hugh Dickins <hughd@google.com> wrote:

> On Fri, 9 Mar 2012, KAMEZAWA Hiroyuki wrote:
> > From 1012e97e3b123fb80d0ec6b1f5d3dbc87a5a5139 Mon Sep 17 00:00:00 2001
> > From: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

> > Suggested-by: Hugh Dickins <hughd@google.com>
> > Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> 
> Yes, this is much like what I had in mind (well, on disk):
> I'm glad we are in agreement now.
> 
> But first priority, I think, should be to revert to the silly "> 2"
> test for 3.3 final, so it behaves just like 2.6.35 through 3.2:
> I'll send Andrew and Linus a patch for that shortly.
> 
> > ---
> >  Documentation/cgroups/memory.txt |   10 ++++++++--
> >  include/linux/swap.h             |    9 ---------
> >  mm/memcontrol.c                  |   15 +++------------
> >  mm/swapfile.c                    |   31 -------------------------------
> >  4 files changed, 11 insertions(+), 54 deletions(-)
> > 
> > diff --git a/Documentation/cgroups/memory.txt b/Documentation/cgroups/memory.txt
> > index 4c95c00..16bc9f2 100644
> > --- a/Documentation/cgroups/memory.txt
> > +++ b/Documentation/cgroups/memory.txt
> > @@ -185,6 +185,9 @@ behind this approach is that a cgroup that aggressively uses a shared
> >  page will eventually get charged for it (once it is uncharged from
> >  the cgroup that brought it in -- this will happen on memory pressure).
> >  
> > +(*)See section 8.2. At task moving, you can recharge mapped pages to other
> > +   cgroup.
> > +
> 
> Perhaps:
> 
> But see section 8.2: when moving a task to another cgroup, its pages may
> be recharged to the new cgroup, if move_charge_at_immigrate has been chosen.
> 
> (I've intentionally omitted the word "mapped" there because it can move
> even unmapped file pages and swapped-out anon pages.)
> 

Sure.


> >  Exception: If CONFIG_CGROUP_CGROUP_MEM_RES_CTLR_SWAP is not used.
> >  When you do swapoff and make swapped-out pages of shmem(tmpfs) to
> >  be backed into memory in force, charges for pages are accounted against the
> > @@ -623,8 +626,8 @@ memory cgroup.
> >    bit | what type of charges would be moved ?
> >   -----+------------------------------------------------------------------------
> >     0  | A charge of an anonymous page(or swap of it) used by the target task.
> > -      | Those pages and swaps must be used only by the target task. You must
> > -      | enable Swap Extension(see 2.4) to enable move of swap charges.
> > +      | Even if it's shared, it will be moved in force(*). You must enable Swap
> 
> What is this "force"?  I solved the Documentation problem by simply
> removing one sentence, but you have tried harder.
> 

Ok, I will remove the "force" wording (sorry, I was not being logical enough).



> > +      | Extension(see 2.4) to enable move of swap charges.
> >   -----+------------------------------------------------------------------------
> >     1  | A charge of file pages(normal file, tmpfs file(e.g. ipc shared memory)
> >        | and swaps of tmpfs file) mmapped by the target task. Unlike the case of
> > @@ -635,6 +638,9 @@ memory cgroup.
> >        | page_mapcount(page) > 1). You must enable Swap Extension(see 2.4) to
> >        | enable move of swap charges.
> >  
> > +(*) Because of this, fork() -> move -> exec() will move all parent's page
> > +    to the target cgroup. Please be careful.
> > +
> 
> It only moves those pages which were in the old cgroup.
> 


You're right.

I'll consider a simpler one.



> >  8.3 TODO
> >  
> >  - Implement madvise(2) to let users decide the vma to be moved or not to be
> > diff --git a/include/linux/swap.h b/include/linux/swap.h
> > index f7df3ea..13c8d6f 100644
> > --- a/include/linux/swap.h
> > +++ b/include/linux/swap.h
> > @@ -390,7 +390,6 @@ static inline void deactivate_swap_token(struct mm_struct *mm, bool swap_token)
> >  #ifdef CONFIG_CGROUP_MEM_RES_CTLR
> >  extern void
> >  mem_cgroup_uncharge_swapcache(struct page *page, swp_entry_t ent, bool swapout);
> > -extern int mem_cgroup_count_swap_user(swp_entry_t ent, struct page **pagep);
> >  #else
> >  static inline void
> >  mem_cgroup_uncharge_swapcache(struct page *page, swp_entry_t ent, bool swapout)
> > @@ -535,14 +534,6 @@ mem_cgroup_uncharge_swapcache(struct page *page, swp_entry_t ent)
> >  {
> >  }
> >  
> > -#ifdef CONFIG_CGROUP_MEM_RES_CTLR
> > -static inline int
> > -mem_cgroup_count_swap_user(swp_entry_t ent, struct page **pagep)
> > -{
> > -	return 0;
> > -}
> > -#endif
> > -
> >  #endif /* CONFIG_SWAP */
> >  #endif /* __KERNEL__*/
> >  #endif /* _LINUX_SWAP_H */
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index c83aeb5..e7e4e3d 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -5100,12 +5100,9 @@ static struct page *mc_handle_present_pte(struct vm_area_struct *vma,
> >  
> >  	if (!page || !page_mapped(page))
> >  		return NULL;
> > -	if (PageAnon(page)) {
> > -		/* we don't move shared anon */
> > -		if (!move_anon() || page_mapcount(page) > 2)
> > -			return NULL;
> > +	if (PageAnon(page) && !move_anon()) {
> > +		return NULL;
> >  	} else if (!move_file())
> > -		/* we ignore mapcount for file pages */
> >  		return NULL;
> 
> Doesn't that need to be
> 
> 	if (PageAnon(page)) {
> 		if (!move_anon())
> 			return NULL;
> 	} else if (!move_file())
> 		return NULL;
> 
> I think what you've written there makes treatment of anon pages
> in the move_anon() case dependent on move_file(), doesn't it?
> 

will rewrite.



> >  	if (!get_page_unless_zero(page))
> >  		return NULL;
> > @@ -5116,18 +5113,12 @@ static struct page *mc_handle_present_pte(struct vm_area_struct *vma,
> >  static struct page *mc_handle_swap_pte(struct vm_area_struct *vma,
> >  			unsigned long addr, pte_t ptent, swp_entry_t *entry)
> >  {
> > -	int usage_count;
> >  	struct page *page = NULL;
> >  	swp_entry_t ent = pte_to_swp_entry(ptent);
> >  
> >  	if (!move_anon() || non_swap_entry(ent))
> >  		return NULL;
> > -	usage_count = mem_cgroup_count_swap_user(ent, &page);
> > -	if (usage_count > 1) { /* we don't move shared anon */
> > -		if (page)
> > -			put_page(page);
> > -		return NULL;
> > -	}
> > +	page = lookup_swap_cache(ent);
> 
> It's annoying, but there's a tiny reason why that line needs to be
> 
> #ifdef CONFIG_SWAP
> 	page = find_get_page(&swapper_space, ent.val);
> #endif
> 
> because lookup_swap_cache() updates some antiquated swap lookup stats
> (which I do sometimes look at), which a straight find_get_page() avoids,
> rightly in this case; but swapper_space only defined when CONFIG_SWAP=y.
> 
> I was about to propose that we just add an arg to lookup_swap_cache(),
> for updating those stats or not; but see that mincore.c would like a
> little more than that to avoid its CONFIG_SWAPs, not sure quite what.
> 
> So for now, please just use the #ifdef CONFIG_SWAP find_get_page()
> that we use elsewhere, and I'll try to remove those warts later on.
> 

Okay, I will do that.
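
(With that change, the handler would end up roughly as below; a sketch
assuming no other changes to the function:)

	static struct page *mc_handle_swap_pte(struct vm_area_struct *vma,
				unsigned long addr, pte_t ptent, swp_entry_t *entry)
	{
		struct page *page = NULL;
		swp_entry_t ent = pte_to_swp_entry(ptent);

		if (!move_anon() || non_swap_entry(ent))
			return NULL;
	#ifdef CONFIG_SWAP
		/* find_get_page() skips lookup_swap_cache()'s stats update */
		page = find_get_page(&swapper_space, ent.val);
	#endif
		if (do_swap_account)
			entry->val = ent.val;

		return page;
	}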


> >  	if (do_swap_account)
> >  		entry->val = ent.val;
> >  
> > diff --git a/mm/swapfile.c b/mm/swapfile.c
> > index fa3c519..85b4548 100644
> > --- a/mm/swapfile.c
> > +++ b/mm/swapfile.c
> > @@ -720,37 +720,6 @@ int free_swap_and_cache(swp_entry_t entry)
> >  	return p != NULL;
> >  }
> >  
> > -#ifdef CONFIG_CGROUP_MEM_RES_CTLR
> > -/**
> > - * mem_cgroup_count_swap_user - count the user of a swap entry
> > - * @ent: the swap entry to be checked
> > - * @pagep: the pointer for the swap cache page of the entry to be stored
> > - *
> > - * Returns the number of the user of the swap entry. The number is valid only
> > - * for swaps of anonymous pages.
> > - * If the entry is found on swap cache, the page is stored to pagep with
> > - * refcount of it being incremented.
> > - */
> > -int mem_cgroup_count_swap_user(swp_entry_t ent, struct page **pagep)
> > -{
> > -	struct page *page;
> > -	struct swap_info_struct *p;
> > -	int count = 0;
> > -
> > -	page = find_get_page(&swapper_space, ent.val);
> > -	if (page)
> > -		count += page_mapcount(page);
> > -	p = swap_info_get(ent);
> > -	if (p) {
> > -		count += swap_count(p->swap_map[swp_offset(ent)]);
> > -		spin_unlock(&swap_lock);
> > -	}
> > -
> > -	*pagep = page;
> > -	return count;
> > -}
> > -#endif
> > -
> 
> Exactly, delete delete delete!
> 
 :)

Thanks,
-Kame


^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2012-03-13  5:19 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-03-02 20:13 [PATCH v3 1/2] thp: add HPAGE_PMD_* definitions for !CONFIG_TRANSPARENT_HUGEPAGE Naoya Horiguchi
2012-03-02 20:13 ` [PATCH v3 2/2] memcg: avoid THP split in task migration Naoya Horiguchi
2012-03-09  0:30   ` Andrew Morton
2012-03-09  1:47     ` Naoya Horiguchi
2012-03-09  1:16   ` KAMEZAWA Hiroyuki
2012-03-09  2:33     ` Hugh Dickins
2012-03-09  3:24       ` KAMEZAWA Hiroyuki
2012-03-09  4:00         ` KAMEZAWA Hiroyuki
2012-03-09  6:01         ` Daisuke Nishimura
2012-03-09  7:23           ` [PATCH] memcg: fix behavior of shared anon pages at task_move (Was " KAMEZAWA Hiroyuki
2012-03-09 21:05             ` Hugh Dickins
2012-03-09 21:37               ` [PATCH] memcg: revert fix to mapcount check for this release Hugh Dickins
2012-03-09 22:46                 ` Naoya Horiguchi
2012-03-13  5:13                 ` KAMEZAWA Hiroyuki
2012-03-13  5:18               ` [PATCH] memcg: fix behavior of shared anon pages at task_move (Was Re: [PATCH v3 2/2] memcg: avoid THP split in task migration KAMEZAWA Hiroyuki
2012-03-09  4:25     ` Naoya Horiguchi
2012-03-09  7:32       ` KAMEZAWA Hiroyuki
2012-03-09  0:33 ` [PATCH v3 1/2] thp: add HPAGE_PMD_* definitions for !CONFIG_TRANSPARENT_HUGEPAGE KAMEZAWA Hiroyuki
2012-03-09 23:49 ` David Rientjes
