* [PATCH 0/9] mm/rmap: unify rmap traversing functions through rmap_walk
@ 2013-11-28  7:48 Joonsoo Kim
  2013-11-28  7:48 ` [PATCH 1/9] mm/rmap: recompute pgoff for huge page Joonsoo Kim
                   ` (8 more replies)
  0 siblings, 9 replies; 27+ messages in thread
From: Joonsoo Kim @ 2013-11-28  7:48 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mel Gorman, Hugh Dickins, Rik van Riel, Ingo Molnar,
	Naoya Horiguchi, Hillf Danton, linux-kernel, linux-mm,
	Joonsoo Kim, Joonsoo Kim

Rmap traversal is used in five different cases: try_to_unmap(),
try_to_munlock(), page_referenced(), page_mkclean() and
remove_migration_ptes(). Each one implements its own traversal functions
for the anon, file and ksm cases. This causes a lot of duplication and
maintenance overhead, and makes the code hard to understand and
error-prone. One example is hugepage handling: there is code to compute
the hugepage offset correctly in try_to_unmap_file(), but no such code in
rmap_walk_file(). These two are used pairwise in the migration context,
but we missed updating them pairwise.

To overcome these drawbacks, we should unify them into one function.
I chose rmap_walk() as the main function since it carries nothing
unnecessary. To control the behavior of rmap_walk(), I introduce
struct rmap_walk_control, which holds a few function pointers; these
let rmap_walk() serve each caller's specific needs.
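
For reference, a rough sketch of the intended usage (the struct and its
hooks are introduced in patches 4 and 5; the example below mirrors the
remove_migration_ptes() conversion in patch 4):

	struct rmap_walk_control rwc;

	memset(&rwc, 0, sizeof(rwc));
	rwc.main = remove_migration_pte;	/* called for each mapping vma */
	rwc.arg = old;				/* passed through to .main */
	rmap_walk(new, &rwc);			/* walks anon, file or ksm rmap */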

This patchset removes a lot of duplicated code, as you can see in the
short-stat below, and the kernel text size also decreases slightly.

   text    data     bss     dec     hex filename
  10640       1      16   10657    29a1 mm/rmap.o (before)
  10047       1      16   10064    2750 mm/rmap.o (after)

  13823     705    8288   22816    5920 mm/ksm.o (before)
  13199     705    8288   22192    56b0 mm/ksm.o (after)

Thanks.

Joonsoo Kim (9):
  mm/rmap: recompute pgoff for huge page
  mm/rmap: factor nonlinear handling out of try_to_unmap_file()
  mm/rmap: factor lock function out of rmap_walk_anon()
  mm/rmap: make rmap_walk to get the rmap_walk_control argument
  mm/rmap: extend rmap_walk_xxx() to cope with different cases
  mm/rmap: use rmap_walk() in try_to_unmap()
  mm/rmap: use rmap_walk() in try_to_munlock()
  mm/rmap: use rmap_walk() in page_referenced()
  mm/rmap: use rmap_walk() in page_mkclean()

 include/linux/ksm.h  |   15 +-
 include/linux/rmap.h |   19 +-
 mm/ksm.c             |  116 +---------
 mm/migrate.c         |    7 +-
 mm/rmap.c            |  570 ++++++++++++++++++++++----------------------------
 5 files changed, 286 insertions(+), 441 deletions(-)

-- 
1.7.9.5


^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH 1/9] mm/rmap: recompute pgoff for huge page
  2013-11-28  7:48 [PATCH 0/9] mm/rmap: unify rmap traversing functions through rmap_walk Joonsoo Kim
@ 2013-11-28  7:48 ` Joonsoo Kim
  2013-12-02 20:09   ` Naoya Horiguchi
  2013-12-02 22:44   ` Andrew Morton
  2013-11-28  7:48 ` [PATCH 2/9] mm/rmap: factor nonlinear handling out of try_to_unmap_file() Joonsoo Kim
                   ` (7 subsequent siblings)
  8 siblings, 2 replies; 27+ messages in thread
From: Joonsoo Kim @ 2013-11-28  7:48 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mel Gorman, Hugh Dickins, Rik van Riel, Ingo Molnar,
	Naoya Horiguchi, Hillf Danton, linux-kernel, linux-mm,
	Joonsoo Kim, Joonsoo Kim

We have to recompute pgoff if the given page is huge, since a result based
on HPAGE_SIZE is not appropriate for scanning the vma interval tree, as
shown by commit 36e4f20af833 ("hugetlb: do not use vma_hugecache_offset()
for vma_prio_tree_foreach") and commit 369a713e ("rmap: recompute pgoff
for unmapping huge page"). A hugetlb page's page->index is kept in
huge-page-size units, while the vma interval tree is indexed in PAGE_SIZE
units, so it has to be converted via compound_order(page) before the walk.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>

diff --git a/mm/rmap.c b/mm/rmap.c
index 55c8b8d..1214703 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1714,6 +1714,10 @@ static int rmap_walk_file(struct page *page, int (*rmap_one)(struct page *,
 
 	if (!mapping)
 		return ret;
+
+	if (PageHuge(page))
+		pgoff = page->index << compound_order(page);
+
 	mutex_lock(&mapping->i_mmap_mutex);
 	vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
 		unsigned long address = vma_address(page, vma);
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 2/9] mm/rmap: factor nonlinear handling out of try_to_unmap_file()
  2013-11-28  7:48 [PATCH 0/9] mm/rmap: unify rmap traversing functions through rmap_walk Joonsoo Kim
  2013-11-28  7:48 ` [PATCH 1/9] mm/rmap: recompute pgoff for huge page Joonsoo Kim
@ 2013-11-28  7:48 ` Joonsoo Kim
  2013-12-02 20:09   ` Naoya Horiguchi
  2013-11-28  7:48 ` [PATCH 3/9] mm/rmap: factor lock function out of rmap_walk_anon() Joonsoo Kim
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 27+ messages in thread
From: Joonsoo Kim @ 2013-11-28  7:48 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mel Gorman, Hugh Dickins, Rik van Riel, Ingo Molnar,
	Naoya Horiguchi, Hillf Danton, linux-kernel, linux-mm,
	Joonsoo Kim, Joonsoo Kim

To merge all kinds of rmap traversal functions, try_to_unmap(),
try_to_munlock(), page_referenced() and page_mkclean(), we need to
extract the common parts and separate out the non-common parts.

Nonlinear handling is done only in try_to_unmap_file(); the other
rmap traversal functions don't care about it. Therefore it is better
to factor nonlinear handling out of try_to_unmap_file() so that all
kinds of rmap traversal functions can be merged easily.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>

diff --git a/mm/rmap.c b/mm/rmap.c
index 1214703..e6d532c 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1422,6 +1422,79 @@ static int try_to_unmap_cluster(unsigned long cursor, unsigned int *mapcount,
 	return ret;
 }
 
+static int try_to_unmap_nonlinear(struct page *page,
+		struct address_space *mapping, struct vm_area_struct *vma)
+{
+	int ret = SWAP_AGAIN;
+	unsigned long cursor;
+	unsigned long max_nl_cursor = 0;
+	unsigned long max_nl_size = 0;
+	unsigned int mapcount;
+
+	list_for_each_entry(vma,
+		&mapping->i_mmap_nonlinear, shared.nonlinear) {
+
+		cursor = (unsigned long) vma->vm_private_data;
+		if (cursor > max_nl_cursor)
+			max_nl_cursor = cursor;
+		cursor = vma->vm_end - vma->vm_start;
+		if (cursor > max_nl_size)
+			max_nl_size = cursor;
+	}
+
+	if (max_nl_size == 0) {	/* all nonlinears locked or reserved ? */
+		return SWAP_FAIL;
+	}
+
+	/*
+	 * We don't try to search for this page in the nonlinear vmas,
+	 * and page_referenced wouldn't have found it anyway.  Instead
+	 * just walk the nonlinear vmas trying to age and unmap some.
+	 * The mapcount of the page we came in with is irrelevant,
+	 * but even so use it as a guide to how hard we should try?
+	 */
+	mapcount = page_mapcount(page);
+	if (!mapcount)
+		return ret;
+
+	cond_resched();
+
+	max_nl_size = (max_nl_size + CLUSTER_SIZE - 1) & CLUSTER_MASK;
+	if (max_nl_cursor == 0)
+		max_nl_cursor = CLUSTER_SIZE;
+
+	do {
+		list_for_each_entry(vma,
+			&mapping->i_mmap_nonlinear, shared.nonlinear) {
+
+			cursor = (unsigned long) vma->vm_private_data;
+			while (cursor < max_nl_cursor &&
+				cursor < vma->vm_end - vma->vm_start) {
+				if (try_to_unmap_cluster(cursor, &mapcount,
+						vma, page) == SWAP_MLOCK)
+					ret = SWAP_MLOCK;
+				cursor += CLUSTER_SIZE;
+				vma->vm_private_data = (void *) cursor;
+				if ((int)mapcount <= 0)
+					return ret;
+			}
+			vma->vm_private_data = (void *) max_nl_cursor;
+		}
+		cond_resched();
+		max_nl_cursor += CLUSTER_SIZE;
+	} while (max_nl_cursor <= max_nl_size);
+
+	/*
+	 * Don't loop forever (perhaps all the remaining pages are
+	 * in locked vmas).  Reset cursor on all unreserved nonlinear
+	 * vmas, now forgetting on which ones it had fallen behind.
+	 */
+	list_for_each_entry(vma, &mapping->i_mmap_nonlinear, shared.nonlinear)
+		vma->vm_private_data = NULL;
+
+	return ret;
+}
+
 bool is_vma_temporary_stack(struct vm_area_struct *vma)
 {
 	int maybe_stack = vma->vm_flags & (VM_GROWSDOWN | VM_GROWSUP);
@@ -1511,10 +1584,6 @@ static int try_to_unmap_file(struct page *page, enum ttu_flags flags)
 	pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
 	struct vm_area_struct *vma;
 	int ret = SWAP_AGAIN;
-	unsigned long cursor;
-	unsigned long max_nl_cursor = 0;
-	unsigned long max_nl_size = 0;
-	unsigned int mapcount;
 
 	if (PageHuge(page))
 		pgoff = page->index << compound_order(page);
@@ -1538,64 +1607,7 @@ static int try_to_unmap_file(struct page *page, enum ttu_flags flags)
 	if (TTU_ACTION(flags) == TTU_MUNLOCK)
 		goto out;
 
-	list_for_each_entry(vma, &mapping->i_mmap_nonlinear,
-							shared.nonlinear) {
-		cursor = (unsigned long) vma->vm_private_data;
-		if (cursor > max_nl_cursor)
-			max_nl_cursor = cursor;
-		cursor = vma->vm_end - vma->vm_start;
-		if (cursor > max_nl_size)
-			max_nl_size = cursor;
-	}
-
-	if (max_nl_size == 0) {	/* all nonlinears locked or reserved ? */
-		ret = SWAP_FAIL;
-		goto out;
-	}
-
-	/*
-	 * We don't try to search for this page in the nonlinear vmas,
-	 * and page_referenced wouldn't have found it anyway.  Instead
-	 * just walk the nonlinear vmas trying to age and unmap some.
-	 * The mapcount of the page we came in with is irrelevant,
-	 * but even so use it as a guide to how hard we should try?
-	 */
-	mapcount = page_mapcount(page);
-	if (!mapcount)
-		goto out;
-	cond_resched();
-
-	max_nl_size = (max_nl_size + CLUSTER_SIZE - 1) & CLUSTER_MASK;
-	if (max_nl_cursor == 0)
-		max_nl_cursor = CLUSTER_SIZE;
-
-	do {
-		list_for_each_entry(vma, &mapping->i_mmap_nonlinear,
-							shared.nonlinear) {
-			cursor = (unsigned long) vma->vm_private_data;
-			while ( cursor < max_nl_cursor &&
-				cursor < vma->vm_end - vma->vm_start) {
-				if (try_to_unmap_cluster(cursor, &mapcount,
-						vma, page) == SWAP_MLOCK)
-					ret = SWAP_MLOCK;
-				cursor += CLUSTER_SIZE;
-				vma->vm_private_data = (void *) cursor;
-				if ((int)mapcount <= 0)
-					goto out;
-			}
-			vma->vm_private_data = (void *) max_nl_cursor;
-		}
-		cond_resched();
-		max_nl_cursor += CLUSTER_SIZE;
-	} while (max_nl_cursor <= max_nl_size);
-
-	/*
-	 * Don't loop forever (perhaps all the remaining pages are
-	 * in locked vmas).  Reset cursor on all unreserved nonlinear
-	 * vmas, now forgetting on which ones it had fallen behind.
-	 */
-	list_for_each_entry(vma, &mapping->i_mmap_nonlinear, shared.nonlinear)
-		vma->vm_private_data = NULL;
+	ret = try_to_unmap_nonlinear(page, mapping, vma);
 out:
 	mutex_unlock(&mapping->i_mmap_mutex);
 	return ret;
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 3/9] mm/rmap: factor lock function out of rmap_walk_anon()
  2013-11-28  7:48 [PATCH 0/9] mm/rmap: unify rmap traversing functions through rmap_walk Joonsoo Kim
  2013-11-28  7:48 ` [PATCH 1/9] mm/rmap: recompute pgoff for huge page Joonsoo Kim
  2013-11-28  7:48 ` [PATCH 2/9] mm/rmap: factor nonlinear handling out of try_to_unmap_file() Joonsoo Kim
@ 2013-11-28  7:48 ` Joonsoo Kim
  2013-12-02 20:09   ` Naoya Horiguchi
  2013-11-28  7:48 ` [PATCH 4/9] mm/rmap: make rmap_walk to get the rmap_walk_control argument Joonsoo Kim
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 27+ messages in thread
From: Joonsoo Kim @ 2013-11-28  7:48 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mel Gorman, Hugh Dickins, Rik van Riel, Ingo Molnar,
	Naoya Horiguchi, Hillf Danton, linux-kernel, linux-mm,
	Joonsoo Kim, Joonsoo Kim

When we traverse an anon_vma, we need to take the anon_vma lock for
reading. But there are subtle differences between the callers, so we
can't take the lock in the same way in every case. Therefore, we need
to let rmap_walk_anon() use different lock functions.

This patch is the first step: it factors the anon_vma locking out of
rmap_walk_anon(). The factored-out function will be used when removing
migration entries and as the default for rmap_walk_anon().

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>

diff --git a/mm/rmap.c b/mm/rmap.c
index e6d532c..916f2ed 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1683,6 +1683,24 @@ void __put_anon_vma(struct anon_vma *anon_vma)
 }
 
 #ifdef CONFIG_MIGRATION
+static struct anon_vma *rmap_walk_anon_lock(struct page *page)
+{
+	struct anon_vma *anon_vma;
+
+	/*
+	 * Note: remove_migration_ptes() cannot use page_lock_anon_vma_read()
+	 * because that depends on page_mapped(); but not all its usages
+	 * are holding mmap_sem. Users without mmap_sem are required to
+	 * take a reference count to prevent the anon_vma disappearing
+	 */
+	anon_vma = page_anon_vma(page);
+	if (!anon_vma)
+		return NULL;
+
+	anon_vma_lock_read(anon_vma);
+	return anon_vma;
+}
+
 /*
  * rmap_walk() and its helpers rmap_walk_anon() and rmap_walk_file():
  * Called by migrate.c to remove migration ptes, but might be used more later.
@@ -1695,16 +1713,10 @@ static int rmap_walk_anon(struct page *page, int (*rmap_one)(struct page *,
 	struct anon_vma_chain *avc;
 	int ret = SWAP_AGAIN;
 
-	/*
-	 * Note: remove_migration_ptes() cannot use page_lock_anon_vma_read()
-	 * because that depends on page_mapped(); but not all its usages
-	 * are holding mmap_sem. Users without mmap_sem are required to
-	 * take a reference count to prevent the anon_vma disappearing
-	 */
-	anon_vma = page_anon_vma(page);
+	anon_vma = rmap_walk_anon_lock(page);
 	if (!anon_vma)
 		return ret;
-	anon_vma_lock_read(anon_vma);
+
 	anon_vma_interval_tree_foreach(avc, &anon_vma->rb_root, pgoff, pgoff) {
 		struct vm_area_struct *vma = avc->vma;
 		unsigned long address = vma_address(page, vma);
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 4/9] mm/rmap: make rmap_walk to get the rmap_walk_control argument
  2013-11-28  7:48 [PATCH 0/9] mm/rmap: unify rmap traversing functions through rmap_walk Joonsoo Kim
                   ` (2 preceding siblings ...)
  2013-11-28  7:48 ` [PATCH 3/9] mm/rmap: factor lock function out of rmap_walk_anon() Joonsoo Kim
@ 2013-11-28  7:48 ` Joonsoo Kim
  2013-12-02 20:09   ` Naoya Horiguchi
  2013-12-02 22:52   ` Andrew Morton
  2013-11-28  7:48 ` [PATCH 5/9] mm/rmap: extend rmap_walk_xxx() to cope with different cases Joonsoo Kim
                   ` (4 subsequent siblings)
  8 siblings, 2 replies; 27+ messages in thread
From: Joonsoo Kim @ 2013-11-28  7:48 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mel Gorman, Hugh Dickins, Rik van Riel, Ingo Molnar,
	Naoya Horiguchi, Hillf Danton, linux-kernel, linux-mm,
	Joonsoo Kim, Joonsoo Kim

Each rmap traversal case differs slightly, so we need function pointers
and arguments for them in order to handle these differences properly.

For this purpose, struct rmap_walk_control is introduced in this patch,
and will be extended in the following patch. Introducing and extending
are kept separate because it makes the changes easier to review.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>

diff --git a/include/linux/ksm.h b/include/linux/ksm.h
index 45c9b6a..0eef8cb 100644
--- a/include/linux/ksm.h
+++ b/include/linux/ksm.h
@@ -76,8 +76,7 @@ struct page *ksm_might_need_to_copy(struct page *page,
 int page_referenced_ksm(struct page *page,
 			struct mem_cgroup *memcg, unsigned long *vm_flags);
 int try_to_unmap_ksm(struct page *page, enum ttu_flags flags);
-int rmap_walk_ksm(struct page *page, int (*rmap_one)(struct page *,
-		  struct vm_area_struct *, unsigned long, void *), void *arg);
+int rmap_walk_ksm(struct page *page, struct rmap_walk_control *rwc);
 void ksm_migrate_page(struct page *newpage, struct page *oldpage);
 
 #else  /* !CONFIG_KSM */
@@ -120,8 +119,8 @@ static inline int try_to_unmap_ksm(struct page *page, enum ttu_flags flags)
 	return 0;
 }
 
-static inline int rmap_walk_ksm(struct page *page, int (*rmap_one)(struct page*,
-		struct vm_area_struct *, unsigned long, void *), void *arg)
+static inline int rmap_walk_ksm(struct page *page,
+			struct rmap_walk_control *rwc)
 {
 	return 0;
 }
diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 6dacb93..0f65686 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -235,11 +235,16 @@ struct anon_vma *page_lock_anon_vma_read(struct page *page);
 void page_unlock_anon_vma_read(struct anon_vma *anon_vma);
 int page_mapped_in_vma(struct page *page, struct vm_area_struct *vma);
 
+struct rmap_walk_control {
+	int (*main)(struct page *, struct vm_area_struct *,
+					unsigned long, void *);
+	void *arg;	/* argument to main function */
+};
+
 /*
  * Called by migrate.c to remove migration ptes, but might be used more later.
  */
-int rmap_walk(struct page *page, int (*rmap_one)(struct page *,
-		struct vm_area_struct *, unsigned long, void *), void *arg);
+int rmap_walk(struct page *page, struct rmap_walk_control *rwc);
 
 #else	/* !CONFIG_MMU */
 
diff --git a/mm/ksm.c b/mm/ksm.c
index 175fff7..c42ec30 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -1997,8 +1997,7 @@ out:
 }
 
 #ifdef CONFIG_MIGRATION
-int rmap_walk_ksm(struct page *page, int (*rmap_one)(struct page *,
-		  struct vm_area_struct *, unsigned long, void *), void *arg)
+int rmap_walk_ksm(struct page *page, struct rmap_walk_control *rwc)
 {
 	struct stable_node *stable_node;
 	struct rmap_item *rmap_item;
@@ -2033,7 +2032,8 @@ again:
 			if ((rmap_item->mm == vma->vm_mm) == search_new_forks)
 				continue;
 
-			ret = rmap_one(page, vma, rmap_item->address, arg);
+			ret = rwc->main(page, vma,
+					rmap_item->address, rwc->arg);
 			if (ret != SWAP_AGAIN) {
 				anon_vma_unlock_read(anon_vma);
 				goto out;
diff --git a/mm/migrate.c b/mm/migrate.c
index bb94004..c47425b 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -198,7 +198,12 @@ out:
  */
 static void remove_migration_ptes(struct page *old, struct page *new)
 {
-	rmap_walk(new, remove_migration_pte, old);
+	struct rmap_walk_control rwc;
+
+	memset(&rwc, 0, sizeof(rwc));
+	rwc.main = remove_migration_pte;
+	rwc.arg = old;
+	rmap_walk(new, &rwc);
 }
 
 /*
diff --git a/mm/rmap.c b/mm/rmap.c
index 916f2ed..5933488 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1705,8 +1705,7 @@ static struct anon_vma *rmap_walk_anon_lock(struct page *page)
  * rmap_walk() and its helpers rmap_walk_anon() and rmap_walk_file():
  * Called by migrate.c to remove migration ptes, but might be used more later.
  */
-static int rmap_walk_anon(struct page *page, int (*rmap_one)(struct page *,
-		struct vm_area_struct *, unsigned long, void *), void *arg)
+static int rmap_walk_anon(struct page *page, struct rmap_walk_control *rwc)
 {
 	struct anon_vma *anon_vma;
 	pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
@@ -1720,7 +1719,7 @@ static int rmap_walk_anon(struct page *page, int (*rmap_one)(struct page *,
 	anon_vma_interval_tree_foreach(avc, &anon_vma->rb_root, pgoff, pgoff) {
 		struct vm_area_struct *vma = avc->vma;
 		unsigned long address = vma_address(page, vma);
-		ret = rmap_one(page, vma, address, arg);
+		ret = rwc->main(page, vma, address, rwc->arg);
 		if (ret != SWAP_AGAIN)
 			break;
 	}
@@ -1728,8 +1727,7 @@ static int rmap_walk_anon(struct page *page, int (*rmap_one)(struct page *,
 	return ret;
 }
 
-static int rmap_walk_file(struct page *page, int (*rmap_one)(struct page *,
-		struct vm_area_struct *, unsigned long, void *), void *arg)
+static int rmap_walk_file(struct page *page, struct rmap_walk_control *rwc)
 {
 	struct address_space *mapping = page->mapping;
 	pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
@@ -1745,7 +1743,7 @@ static int rmap_walk_file(struct page *page, int (*rmap_one)(struct page *,
 	mutex_lock(&mapping->i_mmap_mutex);
 	vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
 		unsigned long address = vma_address(page, vma);
-		ret = rmap_one(page, vma, address, arg);
+		ret = rwc->main(page, vma, address, rwc->arg);
 		if (ret != SWAP_AGAIN)
 			break;
 	}
@@ -1758,17 +1756,16 @@ static int rmap_walk_file(struct page *page, int (*rmap_one)(struct page *,
 	return ret;
 }
 
-int rmap_walk(struct page *page, int (*rmap_one)(struct page *,
-		struct vm_area_struct *, unsigned long, void *), void *arg)
+int rmap_walk(struct page *page, struct rmap_walk_control *rwc)
 {
 	VM_BUG_ON(!PageLocked(page));
 
 	if (unlikely(PageKsm(page)))
-		return rmap_walk_ksm(page, rmap_one, arg);
+		return rmap_walk_ksm(page, rwc);
 	else if (PageAnon(page))
-		return rmap_walk_anon(page, rmap_one, arg);
+		return rmap_walk_anon(page, rwc);
 	else
-		return rmap_walk_file(page, rmap_one, arg);
+		return rmap_walk_file(page, rwc);
 }
 #endif /* CONFIG_MIGRATION */
 
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 5/9] mm/rmap: extend rmap_walk_xxx() to cope with different cases
  2013-11-28  7:48 [PATCH 0/9] mm/rmap: unify rmap traversing functions through rmap_walk Joonsoo Kim
                   ` (3 preceding siblings ...)
  2013-11-28  7:48 ` [PATCH 4/9] mm/rmap: make rmap_walk to get the rmap_walk_control argument Joonsoo Kim
@ 2013-11-28  7:48 ` Joonsoo Kim
  2013-12-02 20:09   ` Naoya Horiguchi
  2013-11-28  7:48 ` [PATCH 6/9] mm/rmap: use rmap_walk() in try_to_unmap() Joonsoo Kim
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 27+ messages in thread
From: Joonsoo Kim @ 2013-11-28  7:48 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mel Gorman, Hugh Dickins, Rik van Riel, Ingo Molnar,
	Naoya Horiguchi, Hillf Danton, linux-kernel, linux-mm,
	Joonsoo Kim, Joonsoo Kim

There are a lot of common parts in the traversal functions, but also a
few uncommon ones. By assigning the proper function pointers in each
rmap_walk_control, we can handle these differences correctly.

The following are the differences we should handle:

1. difference in the lock function in the anon mapping case
2. nonlinear handling in the file mapping case
3. precheck conditions:
	checking memcg in page_referenced(),
	checking VM_SHARED in page_mkclean(),
	checking temporary vmas in try_to_unmap()
4. exit condition:
	checking page_mapped() in try_to_unmap()

So, in this patch, I introduce four function pointers to handle the
above differences; a rough mapping to the new hooks is sketched below.
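
An annotated recap of the new hooks (this mirrors the include/linux/rmap.h
hunk below; the numbers refer to the differences listed above):

struct rmap_walk_control {
	int (*main)(struct page *, struct vm_area_struct *,
					unsigned long, void *);
	void *arg;	/* argument to main function */
	int (*main_done)(struct page *page);	/* 4: exit condition */
	int (*file_nonlinear)(struct page *, struct address_space *,
					struct vm_area_struct *vma);	/* 2: nonlinear */
	struct anon_vma *(*anon_lock)(struct page *);	/* 1: anon lock */
	int (*vma_skip)(struct vm_area_struct *, void *);	/* 3: precheck */
	void *skip_arg;	/* argument to vma_skip function */
};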

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 0f65686..58624b4 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -239,6 +239,12 @@ struct rmap_walk_control {
 	int (*main)(struct page *, struct vm_area_struct *,
 					unsigned long, void *);
 	void *arg;	/* argument to main function */
+	int (*main_done)(struct page *page);	/* check exit condition */
+	int (*file_nonlinear)(struct page *, struct address_space *,
+					struct vm_area_struct *vma);
+	struct anon_vma *(*anon_lock)(struct page *);
+	int (*vma_skip)(struct vm_area_struct *, void *);
+	void *skip_arg;	/* argument to vma_skip function */
 };
 
 /*
diff --git a/mm/ksm.c b/mm/ksm.c
index c42ec30..0aa6e09 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -2032,12 +2032,19 @@ again:
 			if ((rmap_item->mm == vma->vm_mm) == search_new_forks)
 				continue;
 
+			if (rwc->vma_skip && rwc->vma_skip(vma, rwc->skip_arg))
+				continue;
+
 			ret = rwc->main(page, vma,
 					rmap_item->address, rwc->arg);
 			if (ret != SWAP_AGAIN) {
 				anon_vma_unlock_read(anon_vma);
 				goto out;
 			}
+			if (rwc->main_done && rwc->main_done(page)) {
+				anon_vma_unlock_read(anon_vma);
+				goto out;
+			}
 		}
 		anon_vma_unlock_read(anon_vma);
 	}
diff --git a/mm/rmap.c b/mm/rmap.c
index 5933488..5dad5dd 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1683,10 +1683,14 @@ void __put_anon_vma(struct anon_vma *anon_vma)
 }
 
 #ifdef CONFIG_MIGRATION
-static struct anon_vma *rmap_walk_anon_lock(struct page *page)
+static struct anon_vma *rmap_walk_anon_lock(struct page *page,
+					struct rmap_walk_control *rwc)
 {
 	struct anon_vma *anon_vma;
 
+	if (rwc->anon_lock)
+		return rwc->anon_lock(page);
+
 	/*
 	 * Note: remove_migration_ptes() cannot use page_lock_anon_vma_read()
 	 * because that depends on page_mapped(); but not all its usages
@@ -1712,16 +1716,22 @@ static int rmap_walk_anon(struct page *page, struct rmap_walk_control *rwc)
 	struct anon_vma_chain *avc;
 	int ret = SWAP_AGAIN;
 
-	anon_vma = rmap_walk_anon_lock(page);
+	anon_vma = rmap_walk_anon_lock(page, rwc);
 	if (!anon_vma)
 		return ret;
 
 	anon_vma_interval_tree_foreach(avc, &anon_vma->rb_root, pgoff, pgoff) {
 		struct vm_area_struct *vma = avc->vma;
 		unsigned long address = vma_address(page, vma);
+
+		if (rwc->vma_skip && rwc->vma_skip(vma, rwc->skip_arg))
+			continue;
+
 		ret = rwc->main(page, vma, address, rwc->arg);
 		if (ret != SWAP_AGAIN)
 			break;
+		if (rwc->main_done && rwc->main_done(page))
+			break;
 	}
 	anon_vma_unlock_read(anon_vma);
 	return ret;
@@ -1743,15 +1753,26 @@ static int rmap_walk_file(struct page *page, struct rmap_walk_control *rwc)
 	mutex_lock(&mapping->i_mmap_mutex);
 	vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
 		unsigned long address = vma_address(page, vma);
+
+		if (rwc->vma_skip && rwc->vma_skip(vma, rwc->skip_arg))
+			continue;
+
 		ret = rwc->main(page, vma, address, rwc->arg);
 		if (ret != SWAP_AGAIN)
-			break;
+			goto done;
+		if (rwc->main_done && rwc->main_done(page))
+			goto done;
 	}
-	/*
-	 * No nonlinear handling: being always shared, nonlinear vmas
-	 * never contain migration ptes.  Decide what to do about this
-	 * limitation to linear when we need rmap_walk() on nonlinear.
-	 */
+
+	if (!rwc->file_nonlinear)
+		goto done;
+
+	if (list_empty(&mapping->i_mmap_nonlinear))
+		goto done;
+
+	ret = rwc->file_nonlinear(page, mapping, vma);
+
+done:
 	mutex_unlock(&mapping->i_mmap_mutex);
 	return ret;
 }
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 6/9] mm/rmap: use rmap_walk() in try_to_unmap()
  2013-11-28  7:48 [PATCH 0/9] mm/rmap: unify rmap traversing functions through rmap_walk Joonsoo Kim
                   ` (4 preceding siblings ...)
  2013-11-28  7:48 ` [PATCH 5/9] mm/rmap: extend rmap_walk_xxx() to cope with different cases Joonsoo Kim
@ 2013-11-28  7:48 ` Joonsoo Kim
  2013-12-02 20:09   ` Naoya Horiguchi
  2013-12-02 23:01   ` Andrew Morton
  2013-11-28  7:48 ` [PATCH 7/9] mm/rmap: use rmap_walk() in try_to_munlock() Joonsoo Kim
                   ` (2 subsequent siblings)
  8 siblings, 2 replies; 27+ messages in thread
From: Joonsoo Kim @ 2013-11-28  7:48 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mel Gorman, Hugh Dickins, Rik van Riel, Ingo Molnar,
	Naoya Horiguchi, Hillf Danton, linux-kernel, linux-mm,
	Joonsoo Kim, Joonsoo Kim

Now we have infrastructure in rmap_walk() to handle the differences
between the variants of the rmap traversal functions.

So, just use it in try_to_unmap().

In this patch, I change the following things:

1. enable rmap_walk() even when !CONFIG_MIGRATION.
2. mechanically change try_to_unmap() to use rmap_walk().

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 58624b4..d641f6d 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -190,7 +190,7 @@ int page_referenced_one(struct page *, struct vm_area_struct *,
 
 int try_to_unmap(struct page *, enum ttu_flags flags);
 int try_to_unmap_one(struct page *, struct vm_area_struct *,
-			unsigned long address, enum ttu_flags flags);
+			unsigned long address, void *arg);
 
 /*
  * Called from mm/filemap_xip.c to unmap empty zero page
diff --git a/mm/ksm.c b/mm/ksm.c
index 0aa6e09..e1b0198 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -1982,7 +1982,7 @@ again:
 				continue;
 
 			ret = try_to_unmap_one(page, vma,
-					rmap_item->address, flags);
+					rmap_item->address, (void *)flags);
 			if (ret != SWAP_AGAIN || !page_mapped(page)) {
 				anon_vma_unlock_read(anon_vma);
 				goto out;
@@ -1996,7 +1996,6 @@ out:
 	return ret;
 }
 
-#ifdef CONFIG_MIGRATION
 int rmap_walk_ksm(struct page *page, struct rmap_walk_control *rwc)
 {
 	struct stable_node *stable_node;
@@ -2054,6 +2053,7 @@ out:
 	return ret;
 }
 
+#ifdef CONFIG_MIGRATION
 void ksm_migrate_page(struct page *newpage, struct page *oldpage)
 {
 	struct stable_node *stable_node;
diff --git a/mm/rmap.c b/mm/rmap.c
index 5dad5dd..7407710 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1177,13 +1177,14 @@ out:
  * repeatedly from try_to_unmap_ksm, try_to_unmap_anon or try_to_unmap_file.
  */
 int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
-		     unsigned long address, enum ttu_flags flags)
+		     unsigned long address, void *arg)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	pte_t *pte;
 	pte_t pteval;
 	spinlock_t *ptl;
 	int ret = SWAP_AGAIN;
+	enum ttu_flags flags = (enum ttu_flags)arg;
 
 	pte = page_check_address(page, mm, address, &ptl, 0);
 	if (!pte)
@@ -1509,6 +1510,11 @@ bool is_vma_temporary_stack(struct vm_area_struct *vma)
 	return false;
 }
 
+static int skip_vma_temporary_stack(struct vm_area_struct *vma, void *arg)
+{
+	return (int)is_vma_temporary_stack(vma);
+}
+
 /**
  * try_to_unmap_anon - unmap or unlock anonymous page using the object-based
  * rmap method
@@ -1554,7 +1560,7 @@ static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
 			continue;
 
 		address = vma_address(page, vma);
-		ret = try_to_unmap_one(page, vma, address, flags);
+		ret = try_to_unmap_one(page, vma, address, (void *)flags);
 		if (ret != SWAP_AGAIN || !page_mapped(page))
 			break;
 	}
@@ -1591,7 +1597,7 @@ static int try_to_unmap_file(struct page *page, enum ttu_flags flags)
 	mutex_lock(&mapping->i_mmap_mutex);
 	vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
 		unsigned long address = vma_address(page, vma);
-		ret = try_to_unmap_one(page, vma, address, flags);
+		ret = try_to_unmap_one(page, vma, address, (void *)flags);
 		if (ret != SWAP_AGAIN || !page_mapped(page))
 			goto out;
 	}
@@ -1613,6 +1619,11 @@ out:
 	return ret;
 }
 
+static int page_not_mapped(struct page *page)
+{
+	return !page_mapped(page);
+};
+
 /**
  * try_to_unmap - try to remove all page table mappings to a page
  * @page: the page to get unmapped
@@ -1630,16 +1641,30 @@ out:
 int try_to_unmap(struct page *page, enum ttu_flags flags)
 {
 	int ret;
+	struct rmap_walk_control rwc;
 
-	BUG_ON(!PageLocked(page));
 	VM_BUG_ON(!PageHuge(page) && PageTransHuge(page));
 
-	if (unlikely(PageKsm(page)))
-		ret = try_to_unmap_ksm(page, flags);
-	else if (PageAnon(page))
-		ret = try_to_unmap_anon(page, flags);
-	else
-		ret = try_to_unmap_file(page, flags);
+	memset(&rwc, 0, sizeof(rwc));
+	rwc.main = try_to_unmap_one;
+	rwc.arg = (void *)flags;
+	rwc.main_done = page_not_mapped;
+	rwc.file_nonlinear = try_to_unmap_nonlinear;
+	rwc.anon_lock = page_lock_anon_vma_read;
+
+	/*
+	 * During exec, a temporary VMA is setup and later moved.
+	 * The VMA is moved under the anon_vma lock but not the
+	 * page tables leading to a race where migration cannot
+	 * find the migration ptes. Rather than increasing the
+	 * locking requirements of exec(), migration skips
+	 * temporary VMAs until after exec() completes.
+	 */
+	if (flags & TTU_MIGRATION && !PageKsm(page) && PageAnon(page))
+		rwc.vma_skip = skip_vma_temporary_stack;
+
+	ret = rmap_walk(page, &rwc);
+
 	if (ret != SWAP_MLOCK && !page_mapped(page))
 		ret = SWAP_SUCCESS;
 	return ret;
@@ -1682,7 +1707,6 @@ void __put_anon_vma(struct anon_vma *anon_vma)
 	anon_vma_free(anon_vma);
 }
 
-#ifdef CONFIG_MIGRATION
 static struct anon_vma *rmap_walk_anon_lock(struct page *page,
 					struct rmap_walk_control *rwc)
 {
@@ -1788,7 +1812,6 @@ int rmap_walk(struct page *page, struct rmap_walk_control *rwc)
 	else
 		return rmap_walk_file(page, rwc);
 }
-#endif /* CONFIG_MIGRATION */
 
 #ifdef CONFIG_HUGETLB_PAGE
 /*
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 7/9] mm/rmap: use rmap_walk() in try_to_munlock()
  2013-11-28  7:48 [PATCH 0/9] mm/rmap: unify rmap traversing functions through rmap_walk Joonsoo Kim
                   ` (5 preceding siblings ...)
  2013-11-28  7:48 ` [PATCH 6/9] mm/rmap: use rmap_walk() in try_to_unmap() Joonsoo Kim
@ 2013-11-28  7:48 ` Joonsoo Kim
  2013-12-02 20:09   ` Naoya Horiguchi
  2013-11-28  7:48 ` [PATCH 8/9] mm/rmap: use rmap_walk() in page_referenced() Joonsoo Kim
  2013-11-28  7:48 ` [PATCH 9/9] mm/rmap: use rmap_walk() in page_mkclean() Joonsoo Kim
  8 siblings, 1 reply; 27+ messages in thread
From: Joonsoo Kim @ 2013-11-28  7:48 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mel Gorman, Hugh Dickins, Rik van Riel, Ingo Molnar,
	Naoya Horiguchi, Hillf Danton, linux-kernel, linux-mm,
	Joonsoo Kim, Joonsoo Kim

Now we have infrastructure in rmap_walk() to handle the differences
between the variants of the rmap traversal functions.

So, just use it in try_to_munlock().

In this patch, I change the following things:

1. remove some variants of the rmap traversal functions:
	try_to_unmap_ksm(), try_to_unmap_anon(), try_to_unmap_file()
2. mechanically change try_to_munlock() to use rmap_walk().
3. copy and paste the comments.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>

diff --git a/include/linux/ksm.h b/include/linux/ksm.h
index 0eef8cb..91b9719 100644
--- a/include/linux/ksm.h
+++ b/include/linux/ksm.h
@@ -75,7 +75,6 @@ struct page *ksm_might_need_to_copy(struct page *page,
 
 int page_referenced_ksm(struct page *page,
 			struct mem_cgroup *memcg, unsigned long *vm_flags);
-int try_to_unmap_ksm(struct page *page, enum ttu_flags flags);
 int rmap_walk_ksm(struct page *page, struct rmap_walk_control *rwc);
 void ksm_migrate_page(struct page *newpage, struct page *oldpage);
 
@@ -114,11 +113,6 @@ static inline int page_referenced_ksm(struct page *page,
 	return 0;
 }
 
-static inline int try_to_unmap_ksm(struct page *page, enum ttu_flags flags)
-{
-	return 0;
-}
-
 static inline int rmap_walk_ksm(struct page *page,
 			struct rmap_walk_control *rwc)
 {
diff --git a/mm/ksm.c b/mm/ksm.c
index e1b0198..4f25cf7 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -1946,56 +1946,6 @@ out:
 	return referenced;
 }
 
-int try_to_unmap_ksm(struct page *page, enum ttu_flags flags)
-{
-	struct stable_node *stable_node;
-	struct rmap_item *rmap_item;
-	int ret = SWAP_AGAIN;
-	int search_new_forks = 0;
-
-	VM_BUG_ON(!PageKsm(page));
-	VM_BUG_ON(!PageLocked(page));
-
-	stable_node = page_stable_node(page);
-	if (!stable_node)
-		return SWAP_FAIL;
-again:
-	hlist_for_each_entry(rmap_item, &stable_node->hlist, hlist) {
-		struct anon_vma *anon_vma = rmap_item->anon_vma;
-		struct anon_vma_chain *vmac;
-		struct vm_area_struct *vma;
-
-		anon_vma_lock_read(anon_vma);
-		anon_vma_interval_tree_foreach(vmac, &anon_vma->rb_root,
-					       0, ULONG_MAX) {
-			vma = vmac->vma;
-			if (rmap_item->address < vma->vm_start ||
-			    rmap_item->address >= vma->vm_end)
-				continue;
-			/*
-			 * Initially we examine only the vma which covers this
-			 * rmap_item; but later, if there is still work to do,
-			 * we examine covering vmas in other mms: in case they
-			 * were forked from the original since ksmd passed.
-			 */
-			if ((rmap_item->mm == vma->vm_mm) == search_new_forks)
-				continue;
-
-			ret = try_to_unmap_one(page, vma,
-					rmap_item->address, (void *)flags);
-			if (ret != SWAP_AGAIN || !page_mapped(page)) {
-				anon_vma_unlock_read(anon_vma);
-				goto out;
-			}
-		}
-		anon_vma_unlock_read(anon_vma);
-	}
-	if (!search_new_forks++)
-		goto again;
-out:
-	return ret;
-}
-
 int rmap_walk_ksm(struct page *page, struct rmap_walk_control *rwc)
 {
 	struct stable_node *stable_node;
diff --git a/mm/rmap.c b/mm/rmap.c
index 7407710..860a393 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1515,110 +1515,6 @@ static int skip_vma_temporary_stack(struct vm_area_struct *vma, void *arg)
 	return (int)is_vma_temporary_stack(vma);
 }
 
-/**
- * try_to_unmap_anon - unmap or unlock anonymous page using the object-based
- * rmap method
- * @page: the page to unmap/unlock
- * @flags: action and flags
- *
- * Find all the mappings of a page using the mapping pointer and the vma chains
- * contained in the anon_vma struct it points to.
- *
- * This function is only called from try_to_unmap/try_to_munlock for
- * anonymous pages.
- * When called from try_to_munlock(), the mmap_sem of the mm containing the vma
- * where the page was found will be held for write.  So, we won't recheck
- * vm_flags for that VMA.  That should be OK, because that vma shouldn't be
- * 'LOCKED.
- */
-static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
-{
-	struct anon_vma *anon_vma;
-	pgoff_t pgoff;
-	struct anon_vma_chain *avc;
-	int ret = SWAP_AGAIN;
-
-	anon_vma = page_lock_anon_vma_read(page);
-	if (!anon_vma)
-		return ret;
-
-	pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
-	anon_vma_interval_tree_foreach(avc, &anon_vma->rb_root, pgoff, pgoff) {
-		struct vm_area_struct *vma = avc->vma;
-		unsigned long address;
-
-		/*
-		 * During exec, a temporary VMA is setup and later moved.
-		 * The VMA is moved under the anon_vma lock but not the
-		 * page tables leading to a race where migration cannot
-		 * find the migration ptes. Rather than increasing the
-		 * locking requirements of exec(), migration skips
-		 * temporary VMAs until after exec() completes.
-		 */
-		if (IS_ENABLED(CONFIG_MIGRATION) && (flags & TTU_MIGRATION) &&
-				is_vma_temporary_stack(vma))
-			continue;
-
-		address = vma_address(page, vma);
-		ret = try_to_unmap_one(page, vma, address, (void *)flags);
-		if (ret != SWAP_AGAIN || !page_mapped(page))
-			break;
-	}
-
-	page_unlock_anon_vma_read(anon_vma);
-	return ret;
-}
-
-/**
- * try_to_unmap_file - unmap/unlock file page using the object-based rmap method
- * @page: the page to unmap/unlock
- * @flags: action and flags
- *
- * Find all the mappings of a page using the mapping pointer and the vma chains
- * contained in the address_space struct it points to.
- *
- * This function is only called from try_to_unmap/try_to_munlock for
- * object-based pages.
- * When called from try_to_munlock(), the mmap_sem of the mm containing the vma
- * where the page was found will be held for write.  So, we won't recheck
- * vm_flags for that VMA.  That should be OK, because that vma shouldn't be
- * 'LOCKED.
- */
-static int try_to_unmap_file(struct page *page, enum ttu_flags flags)
-{
-	struct address_space *mapping = page->mapping;
-	pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
-	struct vm_area_struct *vma;
-	int ret = SWAP_AGAIN;
-
-	if (PageHuge(page))
-		pgoff = page->index << compound_order(page);
-
-	mutex_lock(&mapping->i_mmap_mutex);
-	vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
-		unsigned long address = vma_address(page, vma);
-		ret = try_to_unmap_one(page, vma, address, (void *)flags);
-		if (ret != SWAP_AGAIN || !page_mapped(page))
-			goto out;
-	}
-
-	if (list_empty(&mapping->i_mmap_nonlinear))
-		goto out;
-
-	/*
-	 * We don't bother to try to find the munlocked page in nonlinears.
-	 * It's costly. Instead, later, page reclaim logic may call
-	 * try_to_unmap(TTU_MUNLOCK) and recover PG_mlocked lazily.
-	 */
-	if (TTU_ACTION(flags) == TTU_MUNLOCK)
-		goto out;
-
-	ret = try_to_unmap_nonlinear(page, mapping, vma);
-out:
-	mutex_unlock(&mapping->i_mmap_mutex);
-	return ret;
-}
-
 static int page_not_mapped(struct page *page)
 {
 	return !page_mapped(page);
@@ -1687,14 +1583,26 @@ int try_to_unmap(struct page *page, enum ttu_flags flags)
  */
 int try_to_munlock(struct page *page)
 {
+	int ret;
+	struct rmap_walk_control rwc;
+
 	VM_BUG_ON(!PageLocked(page) || PageLRU(page));
 
-	if (unlikely(PageKsm(page)))
-		return try_to_unmap_ksm(page, TTU_MUNLOCK);
-	else if (PageAnon(page))
-		return try_to_unmap_anon(page, TTU_MUNLOCK);
-	else
-		return try_to_unmap_file(page, TTU_MUNLOCK);
+	memset(&rwc, 0, sizeof(rwc));
+	rwc.main = try_to_unmap_one;
+	rwc.arg = (void *)TTU_MUNLOCK;
+	rwc.main_done = page_not_mapped;
+
+	/*
+	 * We don't bother to try to find the munlocked page in nonlinears.
+	 * It's costly. Instead, later, page reclaim logic may call
+	 * try_to_unmap(TTU_MUNLOCK) and recover PG_mlocked lazily.
+	 */
+	rwc.file_nonlinear = NULL;
+	rwc.anon_lock = page_lock_anon_vma_read;
+
+	ret = rmap_walk(page, &rwc);
+	return ret;
 }
 
 void __put_anon_vma(struct anon_vma *anon_vma)
@@ -1730,8 +1638,18 @@ static struct anon_vma *rmap_walk_anon_lock(struct page *page,
 }
 
 /*
- * rmap_walk() and its helpers rmap_walk_anon() and rmap_walk_file():
- * Called by migrate.c to remove migration ptes, but might be used more later.
+ * rmap_walk_anon - do something to anonymous page using the object-based
+ * rmap method
+ * @page: the page to be handled
+ * @rwc: control variable according to each walk type
+ *
+ * Find all the mappings of a page using the mapping pointer and the vma chains
+ * contained in the anon_vma struct it points to.
+ *
+ * When called from try_to_munlock(), the mmap_sem of the mm containing the vma
+ * where the page was found will be held for write.  So, we won't recheck
+ * vm_flags for that VMA.  That should be OK, because that vma shouldn't be
+ * LOCKED.
  */
 static int rmap_walk_anon(struct page *page, struct rmap_walk_control *rwc)
 {
@@ -1761,6 +1679,19 @@ static int rmap_walk_anon(struct page *page, struct rmap_walk_control *rwc)
 	return ret;
 }
 
+/*
+ * rmap_walk_file - do something to file page using the object-based rmap method
+ * @page: the page to be handled
+ * @rwc: control variable according to each walk type
+ *
+ * Find all the mappings of a page using the mapping pointer and the vma chains
+ * contained in the address_space struct it points to.
+ *
+ * When called from try_to_munlock(), the mmap_sem of the mm containing the vma
+ * where the page was found will be held for write.  So, we won't recheck
+ * vm_flags for that VMA.  That should be OK, because that vma shouldn't be
+ * LOCKED.
+ */
 static int rmap_walk_file(struct page *page, struct rmap_walk_control *rwc)
 {
 	struct address_space *mapping = page->mapping;
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 8/9] mm/rmap: use rmap_walk() in page_referenced()
  2013-11-28  7:48 [PATCH 0/9] mm/rmap: unify rmap traversing functions through rmap_walk Joonsoo Kim
                   ` (6 preceding siblings ...)
  2013-11-28  7:48 ` [PATCH 7/9] mm/rmap: use rmap_walk() in try_to_munlock() Joonsoo Kim
@ 2013-11-28  7:48 ` Joonsoo Kim
  2013-12-02 20:10   ` Naoya Horiguchi
  2013-11-28  7:48 ` [PATCH 9/9] mm/rmap: use rmap_walk() in page_mkclean() Joonsoo Kim
  8 siblings, 1 reply; 27+ messages in thread
From: Joonsoo Kim @ 2013-11-28  7:48 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mel Gorman, Hugh Dickins, Rik van Riel, Ingo Molnar,
	Naoya Horiguchi, Hillf Danton, linux-kernel, linux-mm,
	Joonsoo Kim, Joonsoo Kim

Now we have infrastructure in rmap_walk() to handle the differences
between the variants of the rmap traversal functions.

So, just use it in page_referenced().

In this patch, I change the following things:

1. remove some variants of the rmap traversal functions:
	page_referenced_ksm(), page_referenced_anon(),
	page_referenced_file()
2. introduce a new struct page_referenced_arg and pass it to
page_referenced_one(), the main function of the rmap walk, in order to
count references, to store vm_flags and to check the finish condition.
3. mechanically change page_referenced() to use rmap_walk().

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>

diff --git a/include/linux/ksm.h b/include/linux/ksm.h
index 91b9719..3be6bb1 100644
--- a/include/linux/ksm.h
+++ b/include/linux/ksm.h
@@ -73,8 +73,6 @@ static inline void set_page_stable_node(struct page *page,
 struct page *ksm_might_need_to_copy(struct page *page,
 			struct vm_area_struct *vma, unsigned long address);
 
-int page_referenced_ksm(struct page *page,
-			struct mem_cgroup *memcg, unsigned long *vm_flags);
 int rmap_walk_ksm(struct page *page, struct rmap_walk_control *rwc);
 void ksm_migrate_page(struct page *newpage, struct page *oldpage);
 
diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index d641f6d..e529ba3 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -184,7 +184,7 @@ static inline void page_dup_rmap(struct page *page)
 int page_referenced(struct page *, int is_locked,
 			struct mem_cgroup *memcg, unsigned long *vm_flags);
 int page_referenced_one(struct page *, struct vm_area_struct *,
-	unsigned long address, unsigned int *mapcount, unsigned long *vm_flags);
+	unsigned long address, void *arg);
 
 #define TTU_ACTION(x) ((x) & TTU_ACTION_MASK)
 
diff --git a/mm/ksm.c b/mm/ksm.c
index 4f25cf7..4c4541b 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -1891,61 +1891,6 @@ struct page *ksm_might_need_to_copy(struct page *page,
 	return new_page;
 }
 
-int page_referenced_ksm(struct page *page, struct mem_cgroup *memcg,
-			unsigned long *vm_flags)
-{
-	struct stable_node *stable_node;
-	struct rmap_item *rmap_item;
-	unsigned int mapcount = page_mapcount(page);
-	int referenced = 0;
-	int search_new_forks = 0;
-
-	VM_BUG_ON(!PageKsm(page));
-	VM_BUG_ON(!PageLocked(page));
-
-	stable_node = page_stable_node(page);
-	if (!stable_node)
-		return 0;
-again:
-	hlist_for_each_entry(rmap_item, &stable_node->hlist, hlist) {
-		struct anon_vma *anon_vma = rmap_item->anon_vma;
-		struct anon_vma_chain *vmac;
-		struct vm_area_struct *vma;
-
-		anon_vma_lock_read(anon_vma);
-		anon_vma_interval_tree_foreach(vmac, &anon_vma->rb_root,
-					       0, ULONG_MAX) {
-			vma = vmac->vma;
-			if (rmap_item->address < vma->vm_start ||
-			    rmap_item->address >= vma->vm_end)
-				continue;
-			/*
-			 * Initially we examine only the vma which covers this
-			 * rmap_item; but later, if there is still work to do,
-			 * we examine covering vmas in other mms: in case they
-			 * were forked from the original since ksmd passed.
-			 */
-			if ((rmap_item->mm == vma->vm_mm) == search_new_forks)
-				continue;
-
-			if (memcg && !mm_match_cgroup(vma->vm_mm, memcg))
-				continue;
-
-			referenced += page_referenced_one(page, vma,
-				rmap_item->address, &mapcount, vm_flags);
-			if (!search_new_forks || !mapcount)
-				break;
-		}
-		anon_vma_unlock_read(anon_vma);
-		if (!mapcount)
-			goto out;
-	}
-	if (!search_new_forks++)
-		goto again;
-out:
-	return referenced;
-}
-
 int rmap_walk_ksm(struct page *page, struct rmap_walk_control *rwc)
 {
 	struct stable_node *stable_node;
diff --git a/mm/rmap.c b/mm/rmap.c
index 860a393..5e78d5c 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -656,17 +656,23 @@ int page_mapped_in_vma(struct page *page, struct vm_area_struct *vma)
 	return 1;
 }
 
+struct page_referenced_arg {
+	int mapcount;
+	int referenced;
+	unsigned long vm_flags;
+};
+
 /*
  * Subfunctions of page_referenced: page_referenced_one called
  * repeatedly from either page_referenced_anon or page_referenced_file.
  */
 int page_referenced_one(struct page *page, struct vm_area_struct *vma,
-			unsigned long address, unsigned int *mapcount,
-			unsigned long *vm_flags)
+			unsigned long address, void *arg)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	spinlock_t *ptl;
 	int referenced = 0;
+	struct page_referenced_arg *ra = arg;
 
 	if (unlikely(PageTransHuge(page))) {
 		pmd_t *pmd;
@@ -678,13 +684,12 @@ int page_referenced_one(struct page *page, struct vm_area_struct *vma,
 		pmd = page_check_address_pmd(page, mm, address,
 					     PAGE_CHECK_ADDRESS_PMD_FLAG, &ptl);
 		if (!pmd)
-			goto out;
+			return SWAP_AGAIN;
 
 		if (vma->vm_flags & VM_LOCKED) {
 			spin_unlock(ptl);
-			*mapcount = 0;	/* break early from loop */
-			*vm_flags |= VM_LOCKED;
-			goto out;
+			ra->vm_flags |= VM_LOCKED;
+			return SWAP_FAIL; /* To break the loop */
 		}
 
 		/* go ahead even if the pmd is pmd_trans_splitting() */
@@ -700,13 +705,12 @@ int page_referenced_one(struct page *page, struct vm_area_struct *vma,
 		 */
 		pte = page_check_address(page, mm, address, &ptl, 0);
 		if (!pte)
-			goto out;
+			return SWAP_AGAIN;
 
 		if (vma->vm_flags & VM_LOCKED) {
 			pte_unmap_unlock(pte, ptl);
-			*mapcount = 0;	/* break early from loop */
-			*vm_flags |= VM_LOCKED;
-			goto out;
+			ra->vm_flags |= VM_LOCKED;
+			return SWAP_FAIL; /* To break the loop */
 		}
 
 		if (ptep_clear_flush_young_notify(vma, address, pte)) {
@@ -723,113 +727,26 @@ int page_referenced_one(struct page *page, struct vm_area_struct *vma,
 		pte_unmap_unlock(pte, ptl);
 	}
 
-	(*mapcount)--;
-
-	if (referenced)
-		*vm_flags |= vma->vm_flags;
-out:
-	return referenced;
-}
-
-static int page_referenced_anon(struct page *page,
-				struct mem_cgroup *memcg,
-				unsigned long *vm_flags)
-{
-	unsigned int mapcount;
-	struct anon_vma *anon_vma;
-	pgoff_t pgoff;
-	struct anon_vma_chain *avc;
-	int referenced = 0;
-
-	anon_vma = page_lock_anon_vma_read(page);
-	if (!anon_vma)
-		return referenced;
-
-	mapcount = page_mapcount(page);
-	pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
-	anon_vma_interval_tree_foreach(avc, &anon_vma->rb_root, pgoff, pgoff) {
-		struct vm_area_struct *vma = avc->vma;
-		unsigned long address = vma_address(page, vma);
-		/*
-		 * If we are reclaiming on behalf of a cgroup, skip
-		 * counting on behalf of references from different
-		 * cgroups
-		 */
-		if (memcg && !mm_match_cgroup(vma->vm_mm, memcg))
-			continue;
-		referenced += page_referenced_one(page, vma, address,
-						  &mapcount, vm_flags);
-		if (!mapcount)
-			break;
+	if (referenced) {
+		ra->referenced++;
+		ra->vm_flags |= vma->vm_flags;
 	}
 
-	page_unlock_anon_vma_read(anon_vma);
-	return referenced;
+	ra->mapcount--;
+	if (!ra->mapcount)
+		return SWAP_SUCCESS; /* To break the loop */
+
+	return SWAP_AGAIN;
 }
 
-/**
- * page_referenced_file - referenced check for object-based rmap
- * @page: the page we're checking references on.
- * @memcg: target memory control group
- * @vm_flags: collect encountered vma->vm_flags who actually referenced the page
- *
- * For an object-based mapped page, find all the places it is mapped and
- * check/clear the referenced flag.  This is done by following the page->mapping
- * pointer, then walking the chain of vmas it holds.  It returns the number
- * of references it found.
- *
- * This function is only called from page_referenced for object-based pages.
- */
-static int page_referenced_file(struct page *page,
-				struct mem_cgroup *memcg,
-				unsigned long *vm_flags)
+static int skip_vma_non_matched_memcg(struct vm_area_struct *vma, void *arg)
 {
-	unsigned int mapcount;
-	struct address_space *mapping = page->mapping;
-	pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
-	struct vm_area_struct *vma;
-	int referenced = 0;
-
-	/*
-	 * The caller's checks on page->mapping and !PageAnon have made
-	 * sure that this is a file page: the check for page->mapping
-	 * excludes the case just before it gets set on an anon page.
-	 */
-	BUG_ON(PageAnon(page));
+	struct mem_cgroup *memcg = arg;
 
-	/*
-	 * The page lock not only makes sure that page->mapping cannot
-	 * suddenly be NULLified by truncation, it makes sure that the
-	 * structure at mapping cannot be freed and reused yet,
-	 * so we can safely take mapping->i_mmap_mutex.
-	 */
-	BUG_ON(!PageLocked(page));
+	if (!mm_match_cgroup(vma->vm_mm, memcg))
+		return 1;
 
-	mutex_lock(&mapping->i_mmap_mutex);
-
-	/*
-	 * i_mmap_mutex does not stabilize mapcount at all, but mapcount
-	 * is more likely to be accurate if we note it after spinning.
-	 */
-	mapcount = page_mapcount(page);
-
-	vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
-		unsigned long address = vma_address(page, vma);
-		/*
-		 * If we are reclaiming on behalf of a cgroup, skip
-		 * counting on behalf of references from different
-		 * cgroups
-		 */
-		if (memcg && !mm_match_cgroup(vma->vm_mm, memcg))
-			continue;
-		referenced += page_referenced_one(page, vma, address,
-						  &mapcount, vm_flags);
-		if (!mapcount)
-			break;
-	}
-
-	mutex_unlock(&mapping->i_mmap_mutex);
-	return referenced;
+	return 0;
 }
 
 /**
@@ -847,32 +764,48 @@ int page_referenced(struct page *page,
 		    struct mem_cgroup *memcg,
 		    unsigned long *vm_flags)
 {
-	int referenced = 0;
+	int ret;
 	int we_locked = 0;
+	struct rmap_walk_control rwc;
+	struct page_referenced_arg ra;
 
 	*vm_flags = 0;
-	if (page_mapped(page) && page_rmapping(page)) {
-		if (!is_locked && (!PageAnon(page) || PageKsm(page))) {
-			we_locked = trylock_page(page);
-			if (!we_locked) {
-				referenced++;
-				goto out;
-			}
-		}
-		if (unlikely(PageKsm(page)))
-			referenced += page_referenced_ksm(page, memcg,
-								vm_flags);
-		else if (PageAnon(page))
-			referenced += page_referenced_anon(page, memcg,
-								vm_flags);
-		else if (page->mapping)
-			referenced += page_referenced_file(page, memcg,
-								vm_flags);
-		if (we_locked)
-			unlock_page(page);
+	if (!page_mapped(page))
+		return 0;
+
+	if (!page_rmapping(page))
+		return 0;
+
+	if (!is_locked && (!PageAnon(page) || PageKsm(page))) {
+		we_locked = trylock_page(page);
+		if (!we_locked)
+			return 1;
 	}
-out:
-	return referenced;
+
+	memset(&rwc, 0, sizeof(rwc));
+	memset(&ra, 0, sizeof(ra));
+	rwc.main = page_referenced_one;
+	rwc.arg = (void *)&ra;
+	rwc.anon_lock = page_lock_anon_vma_read;
+	ra.mapcount = page_mapcount(page);
+
+	/*
+	 * If we are reclaiming on behalf of a cgroup, skip
+	 * counting on behalf of references from different
+	 * cgroups
+	 */
+	if (memcg) {
+		rwc.vma_skip = skip_vma_non_matched_memcg;
+		rwc.skip_arg = (void *)memcg;
+	}
+
+	ret = rmap_walk(page, &rwc);
+	*vm_flags = ra.vm_flags;
+
+	if (we_locked)
+		unlock_page(page);
+
+	return ra.referenced;
 }
 
 static int page_mkclean_one(struct page *page, struct vm_area_struct *vma,
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* [PATCH 9/9] mm/rmap: use rmap_walk() in page_mkclean()
  2013-11-28  7:48 [PATCH 0/9] mm/rmap: unify rmap traversing functions through rmap_walk Joonsoo Kim
                   ` (7 preceding siblings ...)
  2013-11-28  7:48 ` [PATCH 8/9] mm/rmap: use rmap_walk() in page_referenced() Joonsoo Kim
@ 2013-11-28  7:48 ` Joonsoo Kim
  2013-12-02 20:10   ` Naoya Horiguchi
  8 siblings, 1 reply; 27+ messages in thread
From: Joonsoo Kim @ 2013-11-28  7:48 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mel Gorman, Hugh Dickins, Rik van Riel, Ingo Molnar,
	Naoya Horiguchi, Hillf Danton, linux-kernel, linux-mm,
	Joonsoo Kim, Joonsoo Kim

Now we have infrastructure in rmap_walk() to handle the differences
between the variants of the rmap traversal functions.

So, just use it in page_mkclean().

In this patch, I change the following things:

1. remove some variants of the rmap traversal functions:
    page_mkclean_file()
2. mechanically change page_mkclean() to use rmap_walk().

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>

diff --git a/mm/rmap.c b/mm/rmap.c
index 5e78d5c..bbbc705 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -809,12 +809,13 @@ int page_referenced(struct page *page,
 }
 
 static int page_mkclean_one(struct page *page, struct vm_area_struct *vma,
-			    unsigned long address)
+			    unsigned long address, void *arg)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	pte_t *pte;
 	spinlock_t *ptl;
 	int ret = 0;
+	int *cleaned = arg;
 
 	pte = page_check_address(page, mm, address, &ptl, 1);
 	if (!pte)
@@ -833,44 +834,46 @@ static int page_mkclean_one(struct page *page, struct vm_area_struct *vma,
 
 	pte_unmap_unlock(pte, ptl);
 
-	if (ret)
+	if (ret) {
 		mmu_notifier_invalidate_page(mm, address);
+		(*cleaned)++;
+	}
 out:
-	return ret;
+	return SWAP_AGAIN;
 }
 
-static int page_mkclean_file(struct address_space *mapping, struct page *page)
+static int skip_vma_non_shared(struct vm_area_struct *vma, void *arg)
 {
-	pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
-	struct vm_area_struct *vma;
-	int ret = 0;
-
-	BUG_ON(PageAnon(page));
+	if (vma->vm_flags & VM_SHARED)
+		return 0;
 
-	mutex_lock(&mapping->i_mmap_mutex);
-	vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
-		if (vma->vm_flags & VM_SHARED) {
-			unsigned long address = vma_address(page, vma);
-			ret += page_mkclean_one(page, vma, address);
-		}
-	}
-	mutex_unlock(&mapping->i_mmap_mutex);
-	return ret;
+	return 1;
 }
 
 int page_mkclean(struct page *page)
 {
-	int ret = 0;
+	struct address_space *mapping;
+	struct rmap_walk_control rwc;
+	int cleaned;
 
 	BUG_ON(!PageLocked(page));
 
-	if (page_mapped(page)) {
-		struct address_space *mapping = page_mapping(page);
-		if (mapping)
-			ret = page_mkclean_file(mapping, page);
-	}
+	if (!page_mapped(page))
+		return 0;
 
-	return ret;
+	mapping = page_mapping(page);
+	if (!mapping)
+		return 0;
+
+	memset(&rwc, 0, sizeof(rwc));
+	cleaned = 0;
+	rwc.main = page_mkclean_one;
+	rwc.arg = (void *)&cleaned;
+	rwc.vma_skip = skip_vma_non_shared;
+
+	rmap_walk(page, &rwc);
+
+	return cleaned;
 }
 EXPORT_SYMBOL_GPL(page_mkclean);
 
-- 
1.7.9.5


^ permalink raw reply related	[flat|nested] 27+ messages in thread

* Re: [PATCH 1/9] mm/rmap: recompute pgoff for huge page
  2013-11-28  7:48 ` [PATCH 1/9] mm/rmap: recompute pgoff for huge page Joonsoo Kim
@ 2013-12-02 20:09   ` Naoya Horiguchi
  2013-12-02 22:44   ` Andrew Morton
  1 sibling, 0 replies; 27+ messages in thread
From: Naoya Horiguchi @ 2013-12-02 20:09 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Andrew Morton, Mel Gorman, Hugh Dickins, Rik van Riel,
	Ingo Molnar, Hillf Danton, linux-kernel, linux-mm, Joonsoo Kim

On Thu, Nov 28, 2013 at 04:48:38PM +0900, Joonsoo Kim wrote:
> We have to recompute pgoff if the given page is huge, since result based
> on HPAGE_SIZE is not approapriate for scanning the vma interval tree, as
> shown by commit 36e4f20af833 ("hugetlb: do not use vma_hugecache_offset()
> for vma_prio_tree_foreach") and commit 369a713e ("rmap: recompute pgoff
> for unmapping huge page").
> 
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 2/9] mm/rmap: factor nonlinear handling out of try_to_unmap_file()
  2013-11-28  7:48 ` [PATCH 2/9] mm/rmap: factor nonlinear handling out of try_to_unmap_file() Joonsoo Kim
@ 2013-12-02 20:09   ` Naoya Horiguchi
  0 siblings, 0 replies; 27+ messages in thread
From: Naoya Horiguchi @ 2013-12-02 20:09 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Andrew Morton, Mel Gorman, Hugh Dickins, Rik van Riel,
	Ingo Molnar, Hillf Danton, linux-kernel, linux-mm, Joonsoo Kim

On Thu, Nov 28, 2013 at 04:48:39PM +0900, Joonsoo Kim wrote:
> To merge all kinds of rmap traverse functions, try_to_unmap(),
> try_to_munlock(), page_referenced() and page_mkclean(), we need to
> extract common parts and separate out non-common parts.
> 
> Nonlinear handling is handled just in try_to_unmap_file() and other
> rmap traverse functions doesn't care of it. Therfore it is better
> to factor nonlinear handling out of try_to_unmap_file() in order to
> merge all kinds of rmap traverse functions easily.
> 
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 3/9] mm/rmap: factor lock function out of rmap_walk_anon()
  2013-11-28  7:48 ` [PATCH 3/9] mm/rmap: factor lock function out of rmap_walk_anon() Joonsoo Kim
@ 2013-12-02 20:09   ` Naoya Horiguchi
  0 siblings, 0 replies; 27+ messages in thread
From: Naoya Horiguchi @ 2013-12-02 20:09 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Andrew Morton, Mel Gorman, Hugh Dickins, Rik van Riel,
	Ingo Molnar, Hillf Danton, linux-kernel, linux-mm, Joonsoo Kim

On Thu, Nov 28, 2013 at 04:48:40PM +0900, Joonsoo Kim wrote:
> When we traverse anon_vma, we need to take a read-side anon_lock.
> But there is subtle difference in the situation so that we can't use
> same method to take a lock in each cases. Therefore, we need to make
> rmap_walk_anon() taking difference lock function.
> 
> This patch is the first step, factoring lock function for anon_lock out
> of rmap_walk_anon(). It will be used in case of removing migration entry
> and in default of rmap_walk_anon().
> 
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 4/9] mm/rmap: make rmap_walk to get the rmap_walk_control argument
  2013-11-28  7:48 ` [PATCH 4/9] mm/rmap: make rmap_walk to get the rmap_walk_control argument Joonsoo Kim
@ 2013-12-02 20:09   ` Naoya Horiguchi
  2013-12-02 22:51     ` Andrew Morton
  2013-12-02 22:52   ` Andrew Morton
  1 sibling, 1 reply; 27+ messages in thread
From: Naoya Horiguchi @ 2013-12-02 20:09 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Andrew Morton, Mel Gorman, Hugh Dickins, Rik van Riel,
	Ingo Molnar, Hillf Danton, linux-kernel, linux-mm, Joonsoo Kim

On Thu, Nov 28, 2013 at 04:48:41PM +0900, Joonsoo Kim wrote:
> In each rmap traverse case, there is some difference so that we need
> function pointers and arguments to them in order to handle these
> difference properly.
> 
> For this purpose, struct rmap_walk_control is introduced in this patch,
> and will be extended in following patch. Introducing and extending are
> separate, because it clarify changes.
> 
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>

A few comments below ...

> diff --git a/include/linux/ksm.h b/include/linux/ksm.h
> index 45c9b6a..0eef8cb 100644
> --- a/include/linux/ksm.h
> +++ b/include/linux/ksm.h
> @@ -76,8 +76,7 @@ struct page *ksm_might_need_to_copy(struct page *page,
>  int page_referenced_ksm(struct page *page,
>  			struct mem_cgroup *memcg, unsigned long *vm_flags);
>  int try_to_unmap_ksm(struct page *page, enum ttu_flags flags);
> -int rmap_walk_ksm(struct page *page, int (*rmap_one)(struct page *,
> -		  struct vm_area_struct *, unsigned long, void *), void *arg);
> +int rmap_walk_ksm(struct page *page, struct rmap_walk_control *rwc);
>  void ksm_migrate_page(struct page *newpage, struct page *oldpage);
>  
>  #else  /* !CONFIG_KSM */
> @@ -120,8 +119,8 @@ static inline int try_to_unmap_ksm(struct page *page, enum ttu_flags flags)
>  	return 0;
>  }
>  
> -static inline int rmap_walk_ksm(struct page *page, int (*rmap_one)(struct page*,
> -		struct vm_area_struct *, unsigned long, void *), void *arg)
> +static inline int rmap_walk_ksm(struct page *page,
> +			struct rmap_walk_control *rwc)
>  {
>  	return 0;
>  }
> diff --git a/include/linux/rmap.h b/include/linux/rmap.h
> index 6dacb93..0f65686 100644
> --- a/include/linux/rmap.h
> +++ b/include/linux/rmap.h
> @@ -235,11 +235,16 @@ struct anon_vma *page_lock_anon_vma_read(struct page *page);
>  void page_unlock_anon_vma_read(struct anon_vma *anon_vma);
>  int page_mapped_in_vma(struct page *page, struct vm_area_struct *vma);
>  
> +struct rmap_walk_control {
> +	int (*main)(struct page *, struct vm_area_struct *,
> +					unsigned long, void *);

Maybe you can add parameters' names to make this prototype more readable.
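
For instance (the parameter names below are inferred from callers such as
page_mkclean_one(); they are not spelled out anywhere in this hunk):

	int (*main)(struct page *page, struct vm_area_struct *vma,
				unsigned long address, void *arg);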

> +	void *arg;	/* argument to main function */
> +};
> +
>  /*
>   * Called by migrate.c to remove migration ptes, but might be used more later.
>   */

This comment also needs an update?

Thanks,
Naoya Horiguchi

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 5/9] mm/rmap: extend rmap_walk_xxx() to cope with different cases
  2013-11-28  7:48 ` [PATCH 5/9] mm/rmap: extend rmap_walk_xxx() to cope with different cases Joonsoo Kim
@ 2013-12-02 20:09   ` Naoya Horiguchi
  2013-12-03  2:05     ` Joonsoo Kim
  0 siblings, 1 reply; 27+ messages in thread
From: Naoya Horiguchi @ 2013-12-02 20:09 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Andrew Morton, Mel Gorman, Hugh Dickins, Rik van Riel,
	Ingo Molnar, Hillf Danton, linux-kernel, linux-mm, Joonsoo Kim

On Thu, Nov 28, 2013 at 04:48:42PM +0900, Joonsoo Kim wrote:
> There are a lot of common parts in traversing functions, but there are
> also a little of uncommon parts in it. By assigning proper function
> pointer on each rmap_walker_control, we can handle these difference
> correctly.
> 
> Following are differences we should handle.
> 
> 1. difference of lock function in anon mapping case
> 2. nonlinear handling in file mapping case
> 3. prechecked condition:
> 	checking memcg in page_referenced(),
> 	checking VM_SHARE in page_mkclean()
> 	checking temporary vma in try_to_unmap()
> 4. exit condition:
> 	checking page_mapped() in try_to_unmap()
> 
> So, in this patch, I introduce 4 function pointers to
> handle above differences.
> 
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> 
> diff --git a/include/linux/rmap.h b/include/linux/rmap.h
> index 0f65686..58624b4 100644
> --- a/include/linux/rmap.h
> +++ b/include/linux/rmap.h
> @@ -239,6 +239,12 @@ struct rmap_walk_control {
>  	int (*main)(struct page *, struct vm_area_struct *,
>  					unsigned long, void *);
>  	void *arg;	/* argument to main function */
> +	int (*main_done)(struct page *page);	/* check exit condition */
> +	int (*file_nonlinear)(struct page *, struct address_space *,
> +					struct vm_area_struct *vma);
> +	struct anon_vma *(*anon_lock)(struct page *);
> +	int (*vma_skip)(struct vm_area_struct *, void *);

Can you add some comments about how these callbacks work and when they
should be set, for future users?  For example, anon_lock() is used to
override the default behavior and that's not trivial.
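
A rough sketch of the kind of comments that could sit next to the new
fields (the wording is only illustrative, inferred from how this series
uses the callbacks):

	/* return true when the walk can stop early, e.g. the page is unmapped */
	int (*main_done)(struct page *page);
	/* walk the non-linear part of a file mapping; only try_to_unmap() sets it */
	int (*file_nonlinear)(struct page *page, struct address_space *mapping,
				struct vm_area_struct *vma);
	/* override how the anon_vma read lock is taken for this walk */
	struct anon_vma *(*anon_lock)(struct page *page);
	/* return true to skip this vma without calling main() on it */
	int (*vma_skip)(struct vm_area_struct *vma, void *arg);
	void *skip_arg;	/* argument to vma_skip function */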

> +	void *skip_arg;	/* argument to vma_skip function */

I think that it's better to move this field into the structure pointed
to by arg (which can be defined by each caller in its own way) and pass
arg to *vma_skip().
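
A minimal sketch of that idea with hypothetical names (neither the struct
layout nor the helper below is taken from the series): the caller bundles
whatever the skip test needs into its own arg structure, and the walker
passes rwc->arg to the skip callback too.

	struct page_referenced_arg {
		int referenced;
		unsigned long vm_flags;
		struct mem_cgroup *memcg;	/* consulted by the skip test */
	};

	static int page_referenced_skip(struct vm_area_struct *vma, void *arg)
	{
		struct page_referenced_arg *pra = arg;

		/* skip vmas whose mm does not belong to the target memcg */
		return pra->memcg && !mm_match_cgroup(vma->vm_mm, pra->memcg);
	}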

Thanks,
Naoya Horiguchi

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 6/9] mm/rmap: use rmap_walk() in try_to_unmap()
  2013-11-28  7:48 ` [PATCH 6/9] mm/rmap: use rmap_walk() in try_to_unmap() Joonsoo Kim
@ 2013-12-02 20:09   ` Naoya Horiguchi
  2013-12-02 23:01   ` Andrew Morton
  1 sibling, 0 replies; 27+ messages in thread
From: Naoya Horiguchi @ 2013-12-02 20:09 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Andrew Morton, Mel Gorman, Hugh Dickins, Rik van Riel,
	Ingo Molnar, Hillf Danton, linux-kernel, linux-mm, Joonsoo Kim

On Thu, Nov 28, 2013 at 04:48:43PM +0900, Joonsoo Kim wrote:
> Now, we have an infrastructure in rmap_walk() to handle difference
> from variants of rmap traversing functions.
> 
> So, just use it in try_to_unmap().
> 
> In this patch, I change following things.
> 
> 1. enable rmap_walk() if !CONFIG_MIGRATION.
> 2. mechanical change to use rmap_walk() in try_to_unmap().
> 
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 7/9] mm/rmap: use rmap_walk() in try_to_munlock()
  2013-11-28  7:48 ` [PATCH 7/9] mm/rmap: use rmap_walk() in try_to_munlock() Joonsoo Kim
@ 2013-12-02 20:09   ` Naoya Horiguchi
  0 siblings, 0 replies; 27+ messages in thread
From: Naoya Horiguchi @ 2013-12-02 20:09 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Andrew Morton, Mel Gorman, Hugh Dickins, Rik van Riel,
	Ingo Molnar, Hillf Danton, linux-kernel, linux-mm, Joonsoo Kim

On Thu, Nov 28, 2013 at 04:48:44PM +0900, Joonsoo Kim wrote:
> Now, we have an infrastructure in rmap_walk() to handle difference
> from variants of rmap traversing functions.
> 
> So, just use it in try_to_munlock().
> 
> In this patch, I change following things.
> 
> 1. remove some variants of rmap traversing functions.
> 	cf> try_to_unmap_ksm, try_to_unmap_anon, try_to_unmap_file
> 2. mechanical change to use rmap_walk() in try_to_munlock().
> 3. copy and paste comments.
> 
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 8/9] mm/rmap: use rmap_walk() in page_referenced()
  2013-11-28  7:48 ` [PATCH 8/9] mm/rmap: use rmap_walk() in page_referenced() Joonsoo Kim
@ 2013-12-02 20:10   ` Naoya Horiguchi
  0 siblings, 0 replies; 27+ messages in thread
From: Naoya Horiguchi @ 2013-12-02 20:10 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Andrew Morton, Mel Gorman, Hugh Dickins, Rik van Riel,
	Ingo Molnar, Hillf Danton, linux-kernel, linux-mm, Joonsoo Kim

On Thu, Nov 28, 2013 at 04:48:45PM +0900, Joonsoo Kim wrote:
> Now, we have an infrastructure in rmap_walk() to handle difference
> from variants of rmap traversing functions.
> 
> So, just use it in page_referenced().
> 
> In this patch, I change following things.
> 
> 1. remove some variants of rmap traversing functions.
> 	cf> page_referenced_ksm, page_referenced_anon,
> 	page_referenced_file
> 2. introduce new struct page_referenced_arg and pass it to
> page_referenced_one(), main function of rmap_walk, in order to
> count reference, to store vm_flags and to check finish condition.
> 3. mechanical change to use rmap_walk() in page_referenced().
> 
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 9/9] mm/rmap: use rmap_walk() in page_mkclean()
  2013-11-28  7:48 ` [PATCH 9/9] mm/rmap: use rmap_walk() in page_mkclean() Joonsoo Kim
@ 2013-12-02 20:10   ` Naoya Horiguchi
  0 siblings, 0 replies; 27+ messages in thread
From: Naoya Horiguchi @ 2013-12-02 20:10 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Andrew Morton, Mel Gorman, Hugh Dickins, Rik van Riel,
	Ingo Molnar, Hillf Danton, linux-kernel, linux-mm, Joonsoo Kim

On Thu, Nov 28, 2013 at 04:48:46PM +0900, Joonsoo Kim wrote:
> Now, we have an infrastructure in rmap_walk() to handle difference
> from variants of rmap traversing functions.
> 
> So, just use it in page_mkclean().
> 
> In this patch, I change following things.
> 
> 1. remove some variants of rmap traversing functions.
>     cf> page_mkclean_file
> 2. mechanical change to use rmap_walk() in page_mkclean().
> 
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 1/9] mm/rmap: recompute pgoff for huge page
  2013-11-28  7:48 ` [PATCH 1/9] mm/rmap: recompute pgoff for huge page Joonsoo Kim
  2013-12-02 20:09   ` Naoya Horiguchi
@ 2013-12-02 22:44   ` Andrew Morton
  2013-12-03  2:01     ` Joonsoo Kim
  1 sibling, 1 reply; 27+ messages in thread
From: Andrew Morton @ 2013-12-02 22:44 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Mel Gorman, Hugh Dickins, Rik van Riel, Ingo Molnar,
	Naoya Horiguchi, Hillf Danton, linux-kernel, linux-mm,
	Joonsoo Kim

On Thu, 28 Nov 2013 16:48:38 +0900 Joonsoo Kim <iamjoonsoo.kim@lge.com> wrote:

> We have to recompute pgoff if the given page is huge, since result based
> on HPAGE_SIZE is not approapriate for scanning the vma interval tree, as
> shown by commit 36e4f20af833 ("hugetlb: do not use vma_hugecache_offset()
> for vma_prio_tree_foreach") and commit 369a713e ("rmap: recompute pgoff
> for unmapping huge page").
> 
> ...
>
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1714,6 +1714,10 @@ static int rmap_walk_file(struct page *page, int (*rmap_one)(struct page *,
>  
>  	if (!mapping)
>  		return ret;
> +
> +	if (PageHuge(page))
> +		pgoff = page->index << compound_order(page);
> +
>  	mutex_lock(&mapping->i_mmap_mutex);
>  	vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
>  		unsigned long address = vma_address(page, vma);

a)  Can't we just do this?

--- a/mm/rmap.c~mm-rmap-recompute-pgoff-for-huge-page-fix
+++ a/mm/rmap.c
@@ -1708,16 +1708,13 @@ static int rmap_walk_file(struct page *p
 		struct vm_area_struct *, unsigned long, void *), void *arg)
 {
 	struct address_space *mapping = page->mapping;
-	pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
+	pgoff_t pgoff = page->index << compound_order(page);
 	struct vm_area_struct *vma;
 	int ret = SWAP_AGAIN;
 
 	if (!mapping)
 		return ret;
 
-	if (PageHuge(page))
-		pgoff = page->index << compound_order(page);
-
 	mutex_lock(&mapping->i_mmap_mutex);
 	vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
 		unsigned long address = vma_address(page, vma);

compound_order() does the right thing for all styles of page, yes?

b) Is that PageHuge() test you added the correct thing to use?

/*
 * PageHuge() only returns true for hugetlbfs pages, but not for normal or
 * transparent huge pages.  See the PageTransHuge() documentation for more
 * details.
 */

   Obviously we won't be encountering transparent huge pages here,
   but what's the best future-safe approach?
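
On point (a): compound_order() is 0 for a base (non-compound) page, so the
single expression should cover both cases; a sketch of the reasoning, not a
further change to the patch:

	/*
	 * base page:  compound_order(page) == 0, pgoff is simply page->index
	 * hugetlbfs:  page->index counts in huge-page units, so shift by the
	 *             compound order to get back to PAGE_SIZE units, which is
	 *             what the vma interval tree is indexed in
	 */
	pgoff_t pgoff = page->index << compound_order(page);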


I hate that PageHuge() oddity with a passion!  Maybe it would be better
if it was called PageHugetlbfs.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 4/9] mm/rmap: make rmap_walk to get the rmap_walk_control argument
  2013-12-02 20:09   ` Naoya Horiguchi
@ 2013-12-02 22:51     ` Andrew Morton
  2013-12-03  2:03       ` Joonsoo Kim
  0 siblings, 1 reply; 27+ messages in thread
From: Andrew Morton @ 2013-12-02 22:51 UTC (permalink / raw)
  To: Naoya Horiguchi
  Cc: Joonsoo Kim, Mel Gorman, Hugh Dickins, Rik van Riel, Ingo Molnar,
	Hillf Danton, linux-kernel, linux-mm, Joonsoo Kim

On Mon, 02 Dec 2013 15:09:33 -0500 Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> wrote:

> > --- a/include/linux/rmap.h
> > +++ b/include/linux/rmap.h
> > @@ -235,11 +235,16 @@ struct anon_vma *page_lock_anon_vma_read(struct page *page);
> >  void page_unlock_anon_vma_read(struct anon_vma *anon_vma);
> >  int page_mapped_in_vma(struct page *page, struct vm_area_struct *vma);
> >  
> > +struct rmap_walk_control {
> > +	int (*main)(struct page *, struct vm_area_struct *,
> > +					unsigned long, void *);
> 
> Maybe you can add parameters' names to make this prototype more readable.
> 

Yes, I find it quite maddening when the names are left out.  They're really
very useful for understanding what's going on.

The name "main" seems odd as well.  What does "main" mean?  "rmap_one"
was better, but "rmap_one_page" or "rmap_one_pte" or whatever would
be better.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 4/9] mm/rmap: make rmap_walk to get the rmap_walk_control argument
  2013-11-28  7:48 ` [PATCH 4/9] mm/rmap: make rmap_walk to get the rmap_walk_control argument Joonsoo Kim
  2013-12-02 20:09   ` Naoya Horiguchi
@ 2013-12-02 22:52   ` Andrew Morton
  1 sibling, 0 replies; 27+ messages in thread
From: Andrew Morton @ 2013-12-02 22:52 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Mel Gorman, Hugh Dickins, Rik van Riel, Ingo Molnar,
	Naoya Horiguchi, Hillf Danton, linux-kernel, linux-mm,
	Joonsoo Kim

On Thu, 28 Nov 2013 16:48:41 +0900 Joonsoo Kim <iamjoonsoo.kim@lge.com> wrote:

> In each rmap traverse case, there is some difference so that we need
> function pointers and arguments to them in order to handle these
> difference properly.
> 
> For this purpose, struct rmap_walk_control is introduced in this patch,
> and will be extended in following patch. Introducing and extending are
> separate, because it clarify changes.
> 
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -198,7 +198,12 @@ out:
>   */
>  static void remove_migration_ptes(struct page *old, struct page *new)
>  {
> -	rmap_walk(new, remove_migration_pte, old);
> +	struct rmap_walk_control rwc;
> +
> +	memset(&rwc, 0, sizeof(rwc));
> +	rwc.main = remove_migration_pte;
> +	rwc.arg = old;
> +	rmap_walk(new, &rwc);
>  }

It is much neater to do

	struct rmap_walk_control rwc = {
		.main = remove_migration_pte,
		.arg = old,
	};

which will zero out all remaining fields as well.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 6/9] mm/rmap: use rmap_walk() in try_to_unmap()
  2013-11-28  7:48 ` [PATCH 6/9] mm/rmap: use rmap_walk() in try_to_unmap() Joonsoo Kim
  2013-12-02 20:09   ` Naoya Horiguchi
@ 2013-12-02 23:01   ` Andrew Morton
  2013-12-03  2:08     ` Joonsoo Kim
  1 sibling, 1 reply; 27+ messages in thread
From: Andrew Morton @ 2013-12-02 23:01 UTC (permalink / raw)
  To: Joonsoo Kim
  Cc: Mel Gorman, Hugh Dickins, Rik van Riel, Ingo Molnar,
	Naoya Horiguchi, Hillf Danton, linux-kernel, linux-mm,
	Joonsoo Kim

On Thu, 28 Nov 2013 16:48:43 +0900 Joonsoo Kim <iamjoonsoo.kim@lge.com> wrote:

> Now, we have an infrastructure in rmap_walk() to handle difference
> from variants of rmap traversing functions.
> 
> So, just use it in try_to_unmap().
> 
> In this patch, I change following things.
> 
> 1. enable rmap_walk() if !CONFIG_MIGRATION.
> 2. mechanical change to use rmap_walk() in try_to_unmap().
> 
> ...
>
> --- a/include/linux/rmap.h
> +++ b/include/linux/rmap.h
> @@ -190,7 +190,7 @@ int page_referenced_one(struct page *, struct vm_area_struct *,
>  
>  int try_to_unmap(struct page *, enum ttu_flags flags);
>  int try_to_unmap_one(struct page *, struct vm_area_struct *,
> -			unsigned long address, enum ttu_flags flags);
> +			unsigned long address, void *arg);

This change is ugly and unchangelogged.

Also, "enum ttu_flags flags" was nice and meaningful, but "void *arg"
conveys far less information.  A suitable way to address this
shortcoming is to document `arg' at the try_to_unmap_one() definition
site.  try_to_unmap_one() doesn't actually have any documentation at
this stage - let's please fix that?
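
A sketch of the sort of kernel-doc that could go above the definition (the
wording here is illustrative, not something the series currently contains):

	/**
	 * try_to_unmap_one - unmap one page mapped by a given vma, called via rmap_walk()
	 * @page: the page being unmapped
	 * @vma: the vma that maps @page
	 * @address: the virtual address of the mapping within @vma
	 * @arg: enum ttu_flags, passed through the rmap_walk_control void pointer
	 */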

>
> ...
>
> @@ -1509,6 +1510,11 @@ bool is_vma_temporary_stack(struct vm_area_struct *vma)
>  	return false;
>  }
>  
> +static int skip_vma_temporary_stack(struct vm_area_struct *vma, void *arg)
> +{
> +	return (int)is_vma_temporary_stack(vma);
> +}

The (int) cast is unneeded - the compiler will turn a bool into an int.

Should this function (and rmap_walk_control.skip()) really be returning
a bool?

The name of this function is poor: "skip_foo" implies that the function
will skip over a foo.  But that isn't what this function does.  Please
choose something which accurately reflects the function's behavior.
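
One possible shape, purely as an illustration (the name, and the bool
return, assume rmap_walk_control's member is switched over to bool as well;
neither is taken from the patch):

	static bool invalid_migration_vma(struct vm_area_struct *vma, void *arg)
	{
		/* true means: do not call the rmap callback on this vma */
		return is_vma_temporary_stack(vma);
	}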

>  /**
>   * try_to_unmap_anon - unmap or unlock anonymous page using the object-based
>   * rmap method
> @@ -1554,7 +1560,7 @@ static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
>  			continue;
>  
>  		address = vma_address(page, vma);
> -		ret = try_to_unmap_one(page, vma, address, flags);
> +		ret = try_to_unmap_one(page, vma, address, (void *)flags);
>  		if (ret != SWAP_AGAIN || !page_mapped(page))
>  			break;
>  	}
>
> ...
>
>  /**
>   * try_to_unmap - try to remove all page table mappings to a page
>   * @page: the page to get unmapped
> @@ -1630,16 +1641,30 @@ out:
>  int try_to_unmap(struct page *page, enum ttu_flags flags)
>  {
>  	int ret;
> +	struct rmap_walk_control rwc;
>  
> -	BUG_ON(!PageLocked(page));
>  	VM_BUG_ON(!PageHuge(page) && PageTransHuge(page));
>  
> -	if (unlikely(PageKsm(page)))
> -		ret = try_to_unmap_ksm(page, flags);
> -	else if (PageAnon(page))
> -		ret = try_to_unmap_anon(page, flags);
> -	else
> -		ret = try_to_unmap_file(page, flags);
> +	memset(&rwc, 0, sizeof(rwc));
> +	rwc.main = try_to_unmap_one;
> +	rwc.arg = (void *)flags;
> +	rwc.main_done = page_not_mapped;
> +	rwc.file_nonlinear = try_to_unmap_nonlinear;
> +	rwc.anon_lock = page_lock_anon_vma_read;

	struct rmap_walk_control rwc = {
		...
	};
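
Filled in with the fields from the hunk quoted above, that would come out
roughly as follows (a sketch, not the final patch), with the conditional
rwc.vma_skip assignment kept as a separate statement after the initializer:

	struct rmap_walk_control rwc = {
		.main = try_to_unmap_one,
		.arg = (void *)flags,
		.main_done = page_not_mapped,
		.file_nonlinear = try_to_unmap_nonlinear,
		.anon_lock = page_lock_anon_vma_read,
	};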

> +	/*
> +	 * During exec, a temporary VMA is setup and later moved.
> +	 * The VMA is moved under the anon_vma lock but not the
> +	 * page tables leading to a race where migration cannot
> +	 * find the migration ptes. Rather than increasing the
> +	 * locking requirements of exec(), migration skips
> +	 * temporary VMAs until after exec() completes.
> +	 */
> +	if (flags & TTU_MIGRATION && !PageKsm(page) && PageAnon(page))
> +		rwc.vma_skip = skip_vma_temporary_stack;
> +
> +	ret = rmap_walk(page, &rwc);
> +
>  	if (ret != SWAP_MLOCK && !page_mapped(page))
>  		ret = SWAP_SUCCESS;
>  	return ret;
>
> ...
>


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 1/9] mm/rmap: recompute pgoff for huge page
  2013-12-02 22:44   ` Andrew Morton
@ 2013-12-03  2:01     ` Joonsoo Kim
  0 siblings, 0 replies; 27+ messages in thread
From: Joonsoo Kim @ 2013-12-03  2:01 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mel Gorman, Hugh Dickins, Rik van Riel, Ingo Molnar,
	Naoya Horiguchi, Hillf Danton, linux-kernel, linux-mm

On Mon, Dec 02, 2013 at 02:44:34PM -0800, Andrew Morton wrote:
> On Thu, 28 Nov 2013 16:48:38 +0900 Joonsoo Kim <iamjoonsoo.kim@lge.com> wrote:
> 
> > We have to recompute pgoff if the given page is huge, since result based
> > on HPAGE_SIZE is not approapriate for scanning the vma interval tree, as
> > shown by commit 36e4f20af833 ("hugetlb: do not use vma_hugecache_offset()
> > for vma_prio_tree_foreach") and commit 369a713e ("rmap: recompute pgoff
> > for unmapping huge page").
> > 
> > ...
> >
> > --- a/mm/rmap.c
> > +++ b/mm/rmap.c
> > @@ -1714,6 +1714,10 @@ static int rmap_walk_file(struct page *page, int (*rmap_one)(struct page *,
> >  
> >  	if (!mapping)
> >  		return ret;
> > +
> > +	if (PageHuge(page))
> > +		pgoff = page->index << compound_order(page);
> > +
> >  	mutex_lock(&mapping->i_mmap_mutex);
> >  	vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
> >  		unsigned long address = vma_address(page, vma);
> 
> a)  Can't we just do this?
> 
> --- a/mm/rmap.c~mm-rmap-recompute-pgoff-for-huge-page-fix
> +++ a/mm/rmap.c
> @@ -1708,16 +1708,13 @@ static int rmap_walk_file(struct page *p
>  		struct vm_area_struct *, unsigned long, void *), void *arg)
>  {
>  	struct address_space *mapping = page->mapping;
> -	pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
> +	pgoff_t pgoff = page->index << compound_order(page);
>  	struct vm_area_struct *vma;
>  	int ret = SWAP_AGAIN;
>  
>  	if (!mapping)
>  		return ret;
>  
> -	if (PageHuge(page))
> -		pgoff = page->index << compound_order(page);
> -
>  	mutex_lock(&mapping->i_mmap_mutex);
>  	vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
>  		unsigned long address = vma_address(page, vma);
> 
> compound_order() does the right thing for all styles of page, yes?

Yes. I will change it.

> 
> b) If that PageHuge() test you added the correct thing to use?
> 
> /*
>  * PageHuge() only returns true for hugetlbfs pages, but not for normal or
>  * transparent huge pages.  See the PageTransHuge() documentation for more
>  * details.
>  */
> 
>    Obviously we won't be encountering transparent huge pages here,
>    but what's the best future-safe approach?

compound_order() also works for transparent huge pages, so it may be the safe way.

> I hate that PageHuge() oddity with a passion!  Maybe it would be better
> if it was called PageHugetlbfs.

I also think that PageHuge() is an odd name.
It has only 50 call sites. Let's change it :)

Thanks.


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 4/9] mm/rmap: make rmap_walk to get the rmap_walk_control argument
  2013-12-02 22:51     ` Andrew Morton
@ 2013-12-03  2:03       ` Joonsoo Kim
  0 siblings, 0 replies; 27+ messages in thread
From: Joonsoo Kim @ 2013-12-03  2:03 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Naoya Horiguchi, Mel Gorman, Hugh Dickins, Rik van Riel,
	Ingo Molnar, Hillf Danton, linux-kernel, linux-mm

On Mon, Dec 02, 2013 at 02:51:05PM -0800, Andrew Morton wrote:
> On Mon, 02 Dec 2013 15:09:33 -0500 Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> wrote:
> 
> > > --- a/include/linux/rmap.h
> > > +++ b/include/linux/rmap.h
> > > @@ -235,11 +235,16 @@ struct anon_vma *page_lock_anon_vma_read(struct page *page);
> > >  void page_unlock_anon_vma_read(struct anon_vma *anon_vma);
> > >  int page_mapped_in_vma(struct page *page, struct vm_area_struct *vma);
> > >  
> > > +struct rmap_walk_control {
> > > +	int (*main)(struct page *, struct vm_area_struct *,
> > > +					unsigned long, void *);
> > 
> > Maybe you can add parameters' names to make this prototype more readable.
> > 
> 
> Yes, I find it quite maddening when the names are left out.  They're really
> very useful for understanding what's going on.

Okay. Will do.

> 
> The name "main" seems odd as well.  What does "main" mean?  "rmap_one"
> was better, but "rmap_one_page" or "rmap_one_pte" or whatever would
> be better.

Okay. I will think of a better name.

Thanks.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 5/9] mm/rmap: extend rmap_walk_xxx() to cope with different cases
  2013-12-02 20:09   ` Naoya Horiguchi
@ 2013-12-03  2:05     ` Joonsoo Kim
  0 siblings, 0 replies; 27+ messages in thread
From: Joonsoo Kim @ 2013-12-03  2:05 UTC (permalink / raw)
  To: Naoya Horiguchi
  Cc: Andrew Morton, Mel Gorman, Hugh Dickins, Rik van Riel,
	Ingo Molnar, Hillf Danton, linux-kernel, linux-mm

On Mon, Dec 02, 2013 at 03:09:42PM -0500, Naoya Horiguchi wrote:
> > diff --git a/include/linux/rmap.h b/include/linux/rmap.h
> > index 0f65686..58624b4 100644
> > --- a/include/linux/rmap.h
> > +++ b/include/linux/rmap.h
> > @@ -239,6 +239,12 @@ struct rmap_walk_control {
> >  	int (*main)(struct page *, struct vm_area_struct *,
> >  					unsigned long, void *);
> >  	void *arg;	/* argument to main function */
> > +	int (*main_done)(struct page *page);	/* check exit condition */
> > +	int (*file_nonlinear)(struct page *, struct address_space *,
> > +					struct vm_area_struct *vma);
> > +	struct anon_vma *(*anon_lock)(struct page *);
> > +	int (*vma_skip)(struct vm_area_struct *, void *);
> 
> Can you add some comments about how these callbacks work and when it
> should be set to for future users?  For example, anon_lock() are
> used to override the default behavior and it's not trivial.

Okay. I will add them.

> 
> > +	void *skip_arg;	/* argument to vma_skip function */
> 
> I think that it's better to move this field into the structure pointed
> to by arg (which can be defined by each caller in its own way) and pass
> arg to *vma_skip().

Will do.

Thanks.

> 
> Thanks,
> Naoya Horiguchi
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH 6/9] mm/rmap: use rmap_walk() in try_to_unmap()
  2013-12-02 23:01   ` Andrew Morton
@ 2013-12-03  2:08     ` Joonsoo Kim
  0 siblings, 0 replies; 27+ messages in thread
From: Joonsoo Kim @ 2013-12-03  2:08 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mel Gorman, Hugh Dickins, Rik van Riel, Ingo Molnar,
	Naoya Horiguchi, Hillf Danton, linux-kernel, linux-mm

On Mon, Dec 02, 2013 at 03:01:07PM -0800, Andrew Morton wrote:
> On Thu, 28 Nov 2013 16:48:43 +0900 Joonsoo Kim <iamjoonsoo.kim@lge.com> wrote:
> 
> > Now, we have an infrastructure in rmap_walk() to handle difference
> > from variants of rmap traversing functions.
> > 
> > So, just use it in try_to_unmap().
> > 
> > In this patch, I change following things.
> > 
> > 1. enable rmap_walk() if !CONFIG_MIGRATION.
> > 2. mechanical change to use rmap_walk() in try_to_unmap().
> > 
> > ...
> >
> > --- a/include/linux/rmap.h
> > +++ b/include/linux/rmap.h
> > @@ -190,7 +190,7 @@ int page_referenced_one(struct page *, struct vm_area_struct *,
> >  
> >  int try_to_unmap(struct page *, enum ttu_flags flags);
> >  int try_to_unmap_one(struct page *, struct vm_area_struct *,
> > -			unsigned long address, enum ttu_flags flags);
> > +			unsigned long address, void *arg);
> 
> This change is ugly and unchangelogged.
> 
> Also, "enum ttu_flags flags" was nice and meaningful, but "void *arg"
> conveys far less information.  A suitable way to address this
> shortcoming is to document `arg' at the try_to_unmap_one() definition
> site.  try_to_unmap_one() doesn't actually have any documentation at
> this stage - let's please fix that?

Okay. I will add some comments.

> >
> > ...
> >
> > @@ -1509,6 +1510,11 @@ bool is_vma_temporary_stack(struct vm_area_struct *vma)
> >  	return false;
> >  }
> >  
> > +static int skip_vma_temporary_stack(struct vm_area_struct *vma, void *arg)
> > +{
> > +	return (int)is_vma_temporary_stack(vma);
> > +}
> 
> The (int) cast is unneeded - the compiler will turn a bool into an int.
> 
> Should this function (and rmap_walk_control.skip()) really be returning
> a bool?

Okay. Will do.

> 
> The name of this function is poor: "skip_foo" implies that the function
> will skip over a foo.  But that isn't what this function does.  Please
> choose something which accurately reflects the function's behavior.

Okay.

Thanks.

^ permalink raw reply	[flat|nested] 27+ messages in thread
