* [RFC 1/4] mm: Per process reclaim
@ 2013-03-25  6:21 Minchan Kim
  2013-03-25  6:21 ` [RFC 2/4] mm: make shrink_page_list with pages from multiple zones Minchan Kim
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Minchan Kim @ 2013-03-25  6:21 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-mm, Mel Gorman, Rik van Riel,
	Johannes Weiner, Hugh Dickins, Sangseok Lee, Minchan Kim

These days many smart platforms are available in the embedded market,
and they know far more about the working set than the kernel, which has
very limited information about it. So they want to be involved in
memory management more heavily, as Android's low memory killer and
ashmem or the many recent low memory notifiers show (there have been
several trials from various parties: Nokia, Samsung, Linaro, Google
ChromeOS, Red Hat).

One simple scenario for this userspace intelligence is that the
platform can manage tasks as foreground and background, so it is
better for end-user *responsiveness* to reclaim the pages of
background tasks, even if they include frequently referenced pages.

This patch adds a new knob, /proc/<pid>/reclaim, so a task manager
can reclaim the pages of any target process at any time. It gives the
platform another way to use memory efficiently.

It can avoid killing a process just to get memory back, which is a
really terrible experience: I once lost my best-ever game score
because the game was killed while I switched to a phone call.

Writing 1 to /proc/pid/reclaim reclaims only file pages.
Writing 2 to /proc/pid/reclaim reclaims only anonymous pages.
Writing 3 to /proc/pid/reclaim reclaims all pages from target process.
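
For illustration, a userspace task manager could drive the knob
roughly as in the sketch below; the helper is hypothetical and not
part of this patch, it only shows the intended write interface:

	#include <stdio.h>
	#include <fcntl.h>
	#include <unistd.h>
	#include <errno.h>
	#include <sys/types.h>

	/* Hypothetical helper: write "3" (reclaim all pages) to
	 * /proc/<pid>/reclaim; "1" or "2" would restrict reclaim to
	 * file-backed or anonymous pages respectively. */
	static int reclaim_task(pid_t pid)
	{
		char path[64];
		int fd, ret;

		snprintf(path, sizeof(path), "/proc/%d/reclaim", (int)pid);
		fd = open(path, O_WRONLY);
		if (fd < 0)
			return -errno;
		ret = (write(fd, "3", 1) == 1) ? 0 : -errno;
		close(fd);
		return ret;
	}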

Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 fs/proc/base.c       |   3 ++
 fs/proc/internal.h   |   1 +
 fs/proc/task_mmu.c   | 115 +++++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/rmap.h |   4 ++
 mm/Kconfig           |  13 ++++++
 mm/internal.h        |   7 +---
 mm/vmscan.c          |  59 ++++++++++++++++++++++++++
 7 files changed, 196 insertions(+), 6 deletions(-)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index 9b43ff77..ed83e85 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2532,6 +2532,9 @@ static const struct pid_entry tgid_base_stuff[] = {
 	REG("mounts",     S_IRUGO, proc_mounts_operations),
 	REG("mountinfo",  S_IRUGO, proc_mountinfo_operations),
 	REG("mountstats", S_IRUSR, proc_mountstats_operations),
+#ifdef CONFIG_PROCESS_RECLAIM
+	REG("reclaim", S_IWUSR, proc_reclaim_operations),
+#endif
 #ifdef CONFIG_PROC_PAGE_MONITOR
 	REG("clear_refs", S_IWUSR, proc_clear_refs_operations),
 	REG("smaps",      S_IRUGO, proc_pid_smaps_operations),
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index 3f711d6..48ccb52 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -51,6 +51,7 @@ extern const struct file_operations proc_pagemap_operations;
 extern const struct file_operations proc_net_operations;
 extern const struct inode_operations proc_net_inode_operations;
 extern const struct inode_operations proc_pid_link_inode_operations;
+extern const struct file_operations proc_reclaim_operations;
 
 struct proc_maps_private {
 	struct pid *pid;
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index ca5ce7f..c3713a4 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -11,6 +11,7 @@
 #include <linux/rmap.h>
 #include <linux/swap.h>
 #include <linux/swapops.h>
+#include <linux/mm_inline.h>
 
 #include <asm/elf.h>
 #include <asm/uaccess.h>
@@ -1116,6 +1117,120 @@ const struct file_operations proc_pagemap_operations = {
 };
 #endif /* CONFIG_PROC_PAGE_MONITOR */
 
+#ifdef CONFIG_PROCESS_RECLAIM
+static int reclaim_pte_range(pmd_t *pmd, unsigned long addr,
+				unsigned long end, struct mm_walk *walk)
+{
+	struct vm_area_struct *vma = walk->private;
+	pte_t *pte, ptent;
+	spinlock_t *ptl;
+	struct page *page;
+	LIST_HEAD(page_list);
+	int isolated;
+
+	split_huge_page_pmd(vma, addr, pmd);
+	if (pmd_trans_unstable(pmd))
+		return 0;
+cont:
+	isolated = 0;
+	pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
+	for (; addr != end; pte++, addr += PAGE_SIZE) {
+		ptent = *pte;
+		if (!pte_present(ptent))
+			continue;
+
+		page = vm_normal_page(vma, addr, ptent);
+		if (!page)
+			continue;
+
+		if (isolate_lru_page(page))
+			continue;
+
+		list_add(&page->lru, &page_list);
+		inc_zone_page_state(page, NR_ISOLATED_ANON +
+				page_is_file_cache(page));
+		isolated++;
+		if (isolated >= SWAP_CLUSTER_MAX)
+			break;
+	}
+	pte_unmap_unlock(pte - 1, ptl);
+	reclaim_pages_from_list(&page_list);
+	if (addr != end)
+		goto cont;
+
+	cond_resched();
+	return 0;
+}
+
+#define RECLAIM_FILE (1 << 0)
+#define RECLAIM_ANON (1 << 1)
+#define RECLAIM_ALL (RECLAIM_FILE | RECLAIM_ANON)
+
+static ssize_t reclaim_write(struct file *file, const char __user *buf,
+				size_t count, loff_t *ppos)
+{
+	struct task_struct *task;
+	char buffer[PROC_NUMBUF];
+	struct mm_struct *mm;
+	struct vm_area_struct *vma;
+	int type;
+	int rv;
+
+	memset(buffer, 0, sizeof(buffer));
+	if (count > sizeof(buffer) - 1)
+		count = sizeof(buffer) - 1;
+	if (copy_from_user(buffer, buf, count))
+		return -EFAULT;
+	rv = kstrtoint(strstrip(buffer), 10, &type);
+	if (rv < 0)
+		return rv;
+	if (type < RECLAIM_FILE || type > RECLAIM_ALL)
+		return -EINVAL;
+	task = get_proc_task(file->f_path.dentry->d_inode);
+	if (!task)
+		return -ESRCH;
+	mm = get_task_mm(task);
+	if (mm) {
+		struct mm_walk reclaim_walk = {
+			.pmd_entry = reclaim_pte_range,
+			.mm = mm,
+		};
+		down_read(&mm->mmap_sem);
+		for (vma = mm->mmap; vma; vma = vma->vm_next) {
+			reclaim_walk.private = vma;
+			if (is_vm_hugetlb_page(vma))
+				continue;
+			/*
+			 * Writing 1 to /proc/pid/reclaim only affects file
+			 * mapped pages.
+			 *
+			 * Writing 2 to /proc/pid/reclaim only affects
+			 * anonymous pages.
+			 *
+			 * Writing 3 to /proc/pid/reclaim affects all pages.
+			 */
+			if (type == RECLAIM_ANON && vma->vm_file)
+				continue;
+			if (type == RECLAIM_FILE && !vma->vm_file)
+				continue;
+			walk_page_range(vma->vm_start, vma->vm_end,
+					&reclaim_walk);
+		}
+		flush_tlb_mm(mm);
+		up_read(&mm->mmap_sem);
+		mmput(mm);
+	}
+	put_task_struct(task);
+
+	return count;
+}
+
+const struct file_operations proc_reclaim_operations = {
+	.write		= reclaim_write,
+	.llseek		= noop_llseek,
+};
+#endif
+
 #ifdef CONFIG_NUMA
 
 struct numa_maps {
diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 6dacb93..a24e34e 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -10,6 +10,10 @@
 #include <linux/rwsem.h>
 #include <linux/memcontrol.h>
 
+extern int isolate_lru_page(struct page *page);
+extern void putback_lru_page(struct page *page);
+extern unsigned long reclaim_pages_from_list(struct list_head *page_list);
+
 /*
  * The anon_vma heads a list of private "related" vmas, to scan if
  * an anonymous page pointing to this anon_vma needs to be unmapped:
diff --git a/mm/Kconfig b/mm/Kconfig
index 5881e8c..a947f4b 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -467,3 +467,16 @@ config FRONTSWAP
 	  and swap data is stored as normal on the matching swap device.
 
 	  If unsure, say Y to enable frontswap.
+
+config PROCESS_RECLAIM
+	bool "Enable per process reclaim"
+	depends on PROC_FS
+	default n
+	help
+	 It allows to reclaim pages of the process by /proc/pid/reclaim.
+
+	 (echo 1 > /proc/PID/reclaim) reclaims file-backed pages only.
+	 (echo 2 > /proc/PID/reclaim) reclaims anonymous pages only.
+	 (echo 3 > /proc/PID/reclaim) reclaims all pages.
+
+	 Any other value is ignored.
diff --git a/mm/internal.h b/mm/internal.h
index 8562de0..589a29b 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -86,12 +86,6 @@ static inline void get_page_foll(struct page *page)
 extern unsigned long highest_memmap_pfn;
 
 /*
- * in mm/vmscan.c:
- */
-extern int isolate_lru_page(struct page *page);
-extern void putback_lru_page(struct page *page);
-
-/*
  * in mm/rmap.c:
  */
 extern pmd_t *mm_find_pmd(struct mm_struct *mm, unsigned long address);
@@ -360,6 +354,7 @@ extern unsigned long vm_mmap_pgoff(struct file *, unsigned long,
 extern void set_pageblock_order(void);
 unsigned long reclaim_clean_pages_from_list(struct zone *zone,
 					    struct list_head *page_list);
+
 /* The ALLOC_WMARK bits are used as an index to zone->watermark */
 #define ALLOC_WMARK_MIN		WMARK_MIN
 #define ALLOC_WMARK_LOW		WMARK_LOW
diff --git a/mm/vmscan.c b/mm/vmscan.c
index df78d17..d3dc95f 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -991,6 +991,65 @@ unsigned long reclaim_clean_pages_from_list(struct zone *zone,
 	return ret;
 }
 
+#ifdef CONFIG_PROCESS_RECLAIM
+static unsigned long shrink_page(struct page *page,
+					struct zone *zone,
+					struct scan_control *sc,
+					enum ttu_flags ttu_flags,
+					unsigned long *ret_nr_dirty,
+					unsigned long *ret_nr_writeback,
+					bool force_reclaim,
+					struct list_head *ret_pages)
+{
+	int reclaimed;
+	LIST_HEAD(page_list);
+	list_add(&page->lru, &page_list);
+
+	reclaimed = shrink_page_list(&page_list, zone, sc, ttu_flags,
+				ret_nr_dirty, ret_nr_writeback,
+				force_reclaim);
+	if (!reclaimed)
+		list_splice(&page_list, ret_pages);
+
+	return reclaimed;
+}
+
+unsigned long reclaim_pages_from_list(struct list_head *page_list)
+{
+	struct scan_control sc = {
+		.gfp_mask = GFP_KERNEL,
+		.priority = DEF_PRIORITY,
+		.may_unmap = 1,
+		.may_swap = 1,
+	};
+
+	LIST_HEAD(ret_pages);
+	struct page *page;
+	unsigned long dummy1, dummy2;
+	unsigned long nr_reclaimed = 0;
+
+	while (!list_empty(page_list)) {
+		page = lru_to_page(page_list);
+		list_del(&page->lru);
+
+		ClearPageActive(page);
+		nr_reclaimed += shrink_page(page, page_zone(page), &sc,
+				TTU_UNMAP|TTU_IGNORE_ACCESS,
+				&dummy1, &dummy2, true, &ret_pages);
+	}
+
+	while (!list_empty(&ret_pages)) {
+		page = lru_to_page(&ret_pages);
+		list_del(&page->lru);
+		dec_zone_page_state(page, NR_ISOLATED_ANON +
+				page_is_file_cache(page));
+		putback_lru_page(page);
+	}
+
+	return nr_reclaimed;
+}
+#endif
+
 /*
  * Attempt to remove the specified page from its LRU.  Only take this page
  * if it is of the appropriate PageActive status.  Pages which are being
-- 
1.8.2



* [RFC 2/4] mm: make shrink_page_list with pages from multiple zones
  2013-03-25  6:21 [RFC 1/4] mm: Per process reclaim Minchan Kim
@ 2013-03-25  6:21 ` Minchan Kim
  2013-03-25  6:21 ` [RFC 3/4] mm: Remove shrink_page Minchan Kim
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 13+ messages in thread
From: Minchan Kim @ 2013-03-25  6:21 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-mm, Mel Gorman, Rik van Riel,
	Johannes Weiner, Hugh Dickins, Sangseok Lee, Minchan Kim

Currently shrink_page_list expects that all pages on the list come
from the same zone, which is too restrictive.

This patch removes that dependency so the next patch can call
shrink_page_list with pages from multiple zones.
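
With the zone dependency gone, a caller that has gathered pages from
several zones can simply pass a NULL zone, roughly as in the sketch
below (this mirrors how the next patch in this series uses it):

	/* @page_list may now contain pages from different zones */
	nr_reclaimed = shrink_page_list(page_list, NULL, &sc,
					TTU_UNMAP|TTU_IGNORE_ACCESS,
					&dummy1, &dummy2, true);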

Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 mm/vmscan.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index d3dc95f..9434ba2 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -705,7 +705,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 			goto keep;
 
 		VM_BUG_ON(PageActive(page));
-		VM_BUG_ON(page_zone(page) != zone);
+		if (zone)
+			VM_BUG_ON(page_zone(page) != zone);
 
 		sc->nr_scanned++;
 
@@ -951,7 +952,7 @@ keep:
 	 * back off and wait for congestion to clear because further reclaim
 	 * will encounter the same problem
 	 */
-	if (nr_dirty && nr_dirty == nr_congested && global_reclaim(sc))
+	if (nr_dirty && nr_dirty == nr_congested && global_reclaim(sc) && zone)
 		zone_set_flag(zone, ZONE_CONGESTED);
 
 	free_hot_cold_page_list(&free_pages, 1);
-- 
1.8.2



* [RFC 3/4] mm: Remove shrink_page
  2013-03-25  6:21 [RFC 1/4] mm: Per process reclaim Minchan Kim
  2013-03-25  6:21 ` [RFC 2/4] mm: make shrink_page_list with pages from multiple zones Minchan Kim
@ 2013-03-25  6:21 ` Minchan Kim
  2013-03-25  6:21 ` [RFC 4/4] mm: Enhance per process reclaim Minchan Kim
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 13+ messages in thread
From: Minchan Kim @ 2013-03-25  6:21 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-mm, Mel Gorman, Rik van Riel,
	Johannes Weiner, Hugh Dickins, Sangseok Lee, Minchan Kim

With the previous patch, shrink_page_list can handle pages from
multiple zones, so let's remove shrink_page.

Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 mm/vmscan.c | 47 ++++++++++++++---------------------------------
 1 file changed, 14 insertions(+), 33 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 9434ba2..367d0f4 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -923,6 +923,13 @@ free_it:
 		 * appear not as the counts should be low
 		 */
 		list_add(&page->lru, &free_pages);
+		/*
+		 * If pagelist are from multiple zones, we should decrease
+		 * NR_ISOLATED_ANON + x on freed pages in here.
+		 */
+		if (!zone)
+			dec_zone_page_state(page, NR_ISOLATED_ANON +
+					page_is_file_cache(page));
 		continue;
 
 cull_mlocked:
@@ -993,28 +1000,6 @@ unsigned long reclaim_clean_pages_from_list(struct zone *zone,
 }
 
 #ifdef CONFIG_PROCESS_RECLAIM
-static unsigned long shrink_page(struct page *page,
-					struct zone *zone,
-					struct scan_control *sc,
-					enum ttu_flags ttu_flags,
-					unsigned long *ret_nr_dirty,
-					unsigned long *ret_nr_writeback,
-					bool force_reclaim,
-					struct list_head *ret_pages)
-{
-	int reclaimed;
-	LIST_HEAD(page_list);
-	list_add(&page->lru, &page_list);
-
-	reclaimed = shrink_page_list(&page_list, zone, sc, ttu_flags,
-				ret_nr_dirty, ret_nr_writeback,
-				force_reclaim);
-	if (!reclaimed)
-		list_splice(&page_list, ret_pages);
-
-	return reclaimed;
-}
-
 unsigned long reclaim_pages_from_list(struct list_head *page_list)
 {
 	struct scan_control sc = {
@@ -1024,23 +1009,19 @@ unsigned long reclaim_pages_from_list(struct list_head *page_list)
 		.may_swap = 1,
 	};
 
-	LIST_HEAD(ret_pages);
+	unsigned long nr_reclaimed;
 	struct page *page;
 	unsigned long dummy1, dummy2;
-	unsigned long nr_reclaimed = 0;
-
-	while (!list_empty(page_list)) {
-		page = lru_to_page(page_list);
-		list_del(&page->lru);
 
+	list_for_each_entry(page, page_list, lru)
 		ClearPageActive(page);
-		nr_reclaimed += shrink_page(page, page_zone(page), &sc,
+
+	nr_reclaimed = shrink_page_list(page_list, NULL, &sc,
 				TTU_UNMAP|TTU_IGNORE_ACCESS,
-				&dummy1, &dummy2, true, &ret_pages);
-	}
+				&dummy1, &dummy2, true);
 
-	while (!list_empty(&ret_pages)) {
-		page = lru_to_page(&ret_pages);
+	while (!list_empty(page_list)) {
+		page = lru_to_page(page_list);
 		list_del(&page->lru);
 		dec_zone_page_state(page, NR_ISOLATED_ANON +
 				page_is_file_cache(page));
-- 
1.8.2



* [RFC 4/4] mm: Enhance per process reclaim
  2013-03-25  6:21 [RFC 1/4] mm: Per process reclaim Minchan Kim
  2013-03-25  6:21 ` [RFC 2/4] mm: make shrink_page_list with pages from multiple zones Minchan Kim
  2013-03-25  6:21 ` [RFC 3/4] mm: Remove shrink_page Minchan Kim
@ 2013-03-25  6:21 ` Minchan Kim
  2013-04-02 13:25   ` Michael Kerrisk
  2013-04-03  9:17 ` [RFC 1/4] mm: Per " Michael Kerrisk
  2013-04-03 10:10 ` Michael Kerrisk
  4 siblings, 1 reply; 13+ messages in thread
From: Minchan Kim @ 2013-03-25  6:21 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-mm, Mel Gorman, Rik van Riel,
	Johannes Weiner, Hugh Dickins, Sangseok Lee, Minchan Kim

Some pages can be shared by several processes (e.g. libc).
In that case it would be a bad idea to reclaim them right away.

This patch makes the VM keep such pages in memory until the last task
mapping them tries to reclaim them, so a shared page is swapped out
only once every task that maps it has asked for it to be reclaimed.

This feature doesn't handle non-linear mappings on ramfs because
walking them is very time-consuming, doesn't guarantee reclaim, and
isn't a common case.
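
For clarity, the resulting try_to_unmap() calling convention looks
roughly like this: per-process reclaim passes the target vma, every
other caller passes NULL and keeps today's behaviour:

	/* per-process reclaim: unmap @page only from @vma */
	ret = try_to_unmap(page, ttu_flags, sc->target_vma);

	/* everyone else: unmap from all mappings as before */
	ret = try_to_unmap(page, ttu_flags, NULL);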

Signed-off-by: Sangseok Lee <sangseok.lee@lge.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 fs/proc/task_mmu.c   |  2 +-
 include/linux/ksm.h  |  6 ++++--
 include/linux/rmap.h |  8 +++++---
 mm/ksm.c             |  9 +++++++-
 mm/memory-failure.c  |  2 +-
 mm/migrate.c         |  6 ++++--
 mm/rmap.c            | 58 +++++++++++++++++++++++++++++++++++++---------------
 mm/vmscan.c          | 14 +++++++++++--
 8 files changed, 77 insertions(+), 28 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index c3713a4..7f6aaf5 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -1154,7 +1154,7 @@ cont:
 			break;
 	}
 	pte_unmap_unlock(pte - 1, ptl);
-	reclaim_pages_from_list(&page_list);
+	reclaim_pages_from_list(&page_list, vma);
 	if (addr != end)
 		goto cont;
 
diff --git a/include/linux/ksm.h b/include/linux/ksm.h
index 45c9b6a..d8e556b 100644
--- a/include/linux/ksm.h
+++ b/include/linux/ksm.h
@@ -75,7 +75,8 @@ struct page *ksm_might_need_to_copy(struct page *page,
 
 int page_referenced_ksm(struct page *page,
 			struct mem_cgroup *memcg, unsigned long *vm_flags);
-int try_to_unmap_ksm(struct page *page, enum ttu_flags flags);
+int try_to_unmap_ksm(struct page *page,
+			enum ttu_flags flags, struct vm_area_struct *vma);
 int rmap_walk_ksm(struct page *page, int (*rmap_one)(struct page *,
 		  struct vm_area_struct *, unsigned long, void *), void *arg);
 void ksm_migrate_page(struct page *newpage, struct page *oldpage);
@@ -115,7 +116,8 @@ static inline int page_referenced_ksm(struct page *page,
 	return 0;
 }
 
-static inline int try_to_unmap_ksm(struct page *page, enum ttu_flags flags)
+static inline int try_to_unmap_ksm(struct page *page,
+			enum ttu_flags flags, struct vm_area_struct *target_vma)
 {
 	return 0;
 }
diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index a24e34e..6c7d030 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -12,7 +12,8 @@
 
 extern int isolate_lru_page(struct page *page);
 extern void putback_lru_page(struct page *page);
-extern unsigned long reclaim_pages_from_list(struct list_head *page_list);
+extern unsigned long reclaim_pages_from_list(struct list_head *page_list,
+					     struct vm_area_struct *vma);
 
 /*
  * The anon_vma heads a list of private "related" vmas, to scan if
@@ -192,7 +193,8 @@ int page_referenced_one(struct page *, struct vm_area_struct *,
 
 #define TTU_ACTION(x) ((x) & TTU_ACTION_MASK)
 
-int try_to_unmap(struct page *, enum ttu_flags flags);
+int try_to_unmap(struct page *, enum ttu_flags flags,
+			struct vm_area_struct *vma);
 int try_to_unmap_one(struct page *, struct vm_area_struct *,
 			unsigned long address, enum ttu_flags flags);
 
@@ -259,7 +261,7 @@ static inline int page_referenced(struct page *page, int is_locked,
 	return 0;
 }
 
-#define try_to_unmap(page, refs) SWAP_FAIL
+#define try_to_unmap(page, refs, vma) SWAP_FAIL
 
 static inline int page_mkclean(struct page *page)
 {
diff --git a/mm/ksm.c b/mm/ksm.c
index 7f629e4..1a90d13 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -1949,7 +1949,8 @@ out:
 	return referenced;
 }
 
-int try_to_unmap_ksm(struct page *page, enum ttu_flags flags)
+int try_to_unmap_ksm(struct page *page, enum ttu_flags flags,
+			struct vm_area_struct *target_vma)
 {
 	struct stable_node *stable_node;
 	struct hlist_node *hlist;
@@ -1963,6 +1964,12 @@ int try_to_unmap_ksm(struct page *page, enum ttu_flags flags)
 	stable_node = page_stable_node(page);
 	if (!stable_node)
 		return SWAP_FAIL;
+
+	if (target_vma) {
+		unsigned long address = vma_address(page, target_vma);
+		ret = try_to_unmap_one(page, vma, address, flags);
+		goto out;
+	}
 again:
 	hlist_for_each_entry(rmap_item, hlist, &stable_node->hlist, hlist) {
 		struct anon_vma *anon_vma = rmap_item->anon_vma;
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index ceb0c7f..f3928e4 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -955,7 +955,7 @@ static int hwpoison_user_mappings(struct page *p, unsigned long pfn,
 	if (hpage != ppage)
 		lock_page(ppage);
 
-	ret = try_to_unmap(ppage, ttu);
+	ret = try_to_unmap(ppage, ttu, NULL);
 	if (ret != SWAP_SUCCESS)
 		printk(KERN_ERR "MCE %#lx: failed to unmap page (mapcount=%d)\n",
 				pfn, page_mapcount(ppage));
diff --git a/mm/migrate.c b/mm/migrate.c
index 6fa4ebc..aafbc66 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -820,7 +820,8 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
 	}
 
 	/* Establish migration ptes or remove ptes */
-	try_to_unmap(page, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
+	try_to_unmap(page, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS,
+			NULL);
 
 skip_unmap:
 	if (!page_mapped(page))
@@ -947,7 +948,8 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
 	if (PageAnon(hpage))
 		anon_vma = page_get_anon_vma(hpage);
 
-	try_to_unmap(hpage, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
+	try_to_unmap(hpage, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS,
+						NULL);
 
 	if (!page_mapped(hpage))
 		rc = move_to_new_page(new_hpage, hpage, 1, mode);
diff --git a/mm/rmap.c b/mm/rmap.c
index 6280da8..a880f24 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1435,13 +1435,16 @@ bool is_vma_temporary_stack(struct vm_area_struct *vma)
 
 /**
  * try_to_unmap_anon - unmap or unlock anonymous page using the object-based
- * rmap method
+ * rmap method if @vma is NULL
  * @page: the page to unmap/unlock
  * @flags: action and flags
+ * @target_vma: vma for unmapping a @page
  *
  * Find all the mappings of a page using the mapping pointer and the vma chains
  * contained in the anon_vma struct it points to.
  *
+ * If @target_vma isn't NULL, this function unmap a page from the vma
+ *
  * This function is only called from try_to_unmap/try_to_munlock for
  * anonymous pages.
  * When called from try_to_munlock(), the mmap_sem of the mm containing the vma
@@ -1449,12 +1452,19 @@ bool is_vma_temporary_stack(struct vm_area_struct *vma)
  * vm_flags for that VMA.  That should be OK, because that vma shouldn't be
  * 'LOCKED.
  */
-static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
+static int try_to_unmap_anon(struct page *page, enum ttu_flags flags,
+					struct vm_area_struct *target_vma)
 {
+	int ret = SWAP_AGAIN;
+	unsigned long address;
 	struct anon_vma *anon_vma;
 	pgoff_t pgoff;
 	struct anon_vma_chain *avc;
-	int ret = SWAP_AGAIN;
+
+	if (target_vma) {
+		address = vma_address(page, target_vma);
+		return try_to_unmap_one(page, target_vma, address, flags);
+	}
 
 	anon_vma = page_lock_anon_vma_read(page);
 	if (!anon_vma)
@@ -1463,7 +1473,6 @@ static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
 	pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
 	anon_vma_interval_tree_foreach(avc, &anon_vma->rb_root, pgoff, pgoff) {
 		struct vm_area_struct *vma = avc->vma;
-		unsigned long address;
 
 		/*
 		 * During exec, a temporary VMA is setup and later moved.
@@ -1491,6 +1500,7 @@ static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
  * try_to_unmap_file - unmap/unlock file page using the object-based rmap method
  * @page: the page to unmap/unlock
  * @flags: action and flags
+ * @target_vma: vma for unmapping @page
  *
  * Find all the mappings of a page using the mapping pointer and the vma chains
  * contained in the address_space struct it points to.
@@ -1502,7 +1512,8 @@ static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
  * vm_flags for that VMA.  That should be OK, because that vma shouldn't be
  * 'LOCKED.
  */
-static int try_to_unmap_file(struct page *page, enum ttu_flags flags)
+static int try_to_unmap_file(struct page *page, enum ttu_flags flags,
+				struct vm_area_struct *target_vma)
 {
 	struct address_space *mapping = page->mapping;
 	pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
@@ -1512,16 +1523,27 @@ static int try_to_unmap_file(struct page *page, enum ttu_flags flags)
 	unsigned long max_nl_cursor = 0;
 	unsigned long max_nl_size = 0;
 	unsigned int mapcount;
+	unsigned long address;
 
 	if (PageHuge(page))
 		pgoff = page->index << compound_order(page);
 
 	mutex_lock(&mapping->i_mmap_mutex);
-	vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
-		unsigned long address = vma_address(page, vma);
-		ret = try_to_unmap_one(page, vma, address, flags);
-		if (ret != SWAP_AGAIN || !page_mapped(page))
+	if (target_vma) {
+		/* We don't handle non-linear vma on ramfs */
+		if (unlikely(!list_empty(&mapping->i_mmap_nonlinear)))
 			goto out;
+
+		address = vma_address(page, target_vma);
+		ret = try_to_unmap_one(page, target_vma, address, flags);
+		goto out;
+	} else {
+		vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
+			address = vma_address(page, vma);
+			ret = try_to_unmap_one(page, vma, address, flags);
+			if (ret != SWAP_AGAIN || !page_mapped(page))
+				goto out;
+		}
 	}
 
 	if (list_empty(&mapping->i_mmap_nonlinear))
@@ -1602,9 +1624,12 @@ out:
  * try_to_unmap - try to remove all page table mappings to a page
  * @page: the page to get unmapped
  * @flags: action and flags
+ * @vma : target vma for reclaim
  *
  * Tries to remove all the page table entries which are mapping this
  * page, used in the pageout path.  Caller must hold the page lock.
+ * If @vma is not NULL, this function try to remove @page from only @vma
+ * without peeking all mapped vma for @page.
  * Return values are:
  *
  * SWAP_SUCCESS	- we succeeded in removing all mappings
@@ -1612,7 +1637,8 @@ out:
  * SWAP_FAIL	- the page is unswappable
  * SWAP_MLOCK	- page is mlocked.
  */
-int try_to_unmap(struct page *page, enum ttu_flags flags)
+int try_to_unmap(struct page *page, enum ttu_flags flags,
+				struct vm_area_struct *vma)
 {
 	int ret;
 
@@ -1620,11 +1646,11 @@ int try_to_unmap(struct page *page, enum ttu_flags flags)
 	VM_BUG_ON(!PageHuge(page) && PageTransHuge(page));
 
 	if (unlikely(PageKsm(page)))
-		ret = try_to_unmap_ksm(page, flags);
+		ret = try_to_unmap_ksm(page, flags, vma);
 	else if (PageAnon(page))
-		ret = try_to_unmap_anon(page, flags);
+		ret = try_to_unmap_anon(page, flags, vma);
 	else
-		ret = try_to_unmap_file(page, flags);
+		ret = try_to_unmap_file(page, flags, vma);
 	if (ret != SWAP_MLOCK && !page_mapped(page))
 		ret = SWAP_SUCCESS;
 	return ret;
@@ -1650,11 +1676,11 @@ int try_to_munlock(struct page *page)
 	VM_BUG_ON(!PageLocked(page) || PageLRU(page));
 
 	if (unlikely(PageKsm(page)))
-		return try_to_unmap_ksm(page, TTU_MUNLOCK);
+		return try_to_unmap_ksm(page, TTU_MUNLOCK, NULL);
 	else if (PageAnon(page))
-		return try_to_unmap_anon(page, TTU_MUNLOCK);
+		return try_to_unmap_anon(page, TTU_MUNLOCK, NULL);
 	else
-		return try_to_unmap_file(page, TTU_MUNLOCK);
+		return try_to_unmap_file(page, TTU_MUNLOCK, NULL);
 }
 
 void __put_anon_vma(struct anon_vma *anon_vma)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 367d0f4..df9c4d3 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -92,6 +92,13 @@ struct scan_control {
 	 * are scanned.
 	 */
 	nodemask_t	*nodemask;
+
+	/*
+	 * Reclaim pages from a vma. If the page is shared by other tasks
+	 * it is zapped from a vma without reclaim so it ends up remaining
+	 * on memory until last task zap it.
+	 */
+	struct vm_area_struct *target_vma;
 };
 
 #define lru_to_page(_head) (list_entry((_head)->prev, struct page, lru))
@@ -793,7 +800,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 		 * processes. Try to unmap it here.
 		 */
 		if (page_mapped(page) && mapping) {
-			switch (try_to_unmap(page, ttu_flags)) {
+			switch (try_to_unmap(page,
+					ttu_flags, sc->target_vma)) {
 			case SWAP_FAIL:
 				goto activate_locked;
 			case SWAP_AGAIN:
@@ -1000,13 +1008,15 @@ unsigned long reclaim_clean_pages_from_list(struct zone *zone,
 }
 
 #ifdef CONFIG_PROCESS_RECLAIM
-unsigned long reclaim_pages_from_list(struct list_head *page_list)
+unsigned long reclaim_pages_from_list(struct list_head *page_list,
+					struct vm_area_struct *vma)
 {
 	struct scan_control sc = {
 		.gfp_mask = GFP_KERNEL,
 		.priority = DEF_PRIORITY,
 		.may_unmap = 1,
 		.may_swap = 1,
+		.target_vma = vma,
 	};
 
 	unsigned long nr_reclaimed;
-- 
1.8.2


* Re: [RFC 4/4] mm: Enhance per process reclaim
  2013-03-25  6:21 ` [RFC 4/4] mm: Enhance per process reclaim Minchan Kim
@ 2013-04-02 13:25   ` Michael Kerrisk
  2013-04-03  0:23     ` Minchan Kim
  0 siblings, 1 reply; 13+ messages in thread
From: Michael Kerrisk @ 2013-04-02 13:25 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, Linux Kernel, linux-mm, Mel Gorman, Rik van Riel,
	Johannes Weiner, Hugh Dickins, Sangseok Lee,
	Michael Kerrisk-manpages

Minchan,

On Mon, Mar 25, 2013 at 7:21 AM, Minchan Kim <minchan@kernel.org> wrote:
>
> Some pages can be shared by several processes (e.g. libc).
> In that case it would be a bad idea to reclaim them right away.
>
> This patch makes the VM keep such pages in memory until the last task
> mapping them tries to reclaim them, so a shared page is swapped out
> only once every task that maps it has asked for it to be reclaimed.
>
> This feature doesn't handle non-linear mappings on ramfs because
> walking them is very time-consuming, doesn't guarantee reclaim, and
> isn't a common case.

Against what tree does this patch apply? I've tried various trees,
including MMOTM of 26 March, and encountered this error:

  CC      mm/ksm.o
mm/ksm.c: In function ‘try_to_unmap_ksm’:
mm/ksm.c:1970:32: error: ‘vma’ undeclared (first use in this function)
mm/ksm.c:1970:32: note: each undeclared identifier is reported only
once for each function it appears in
make[1]: *** [mm/ksm.o] Error 1
make: *** [mm] Error 2

Cheers,

Michael


> Signed-off-by: Sangseok Lee <sangseok.lee@lge.com>
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
>  fs/proc/task_mmu.c   |  2 +-
>  include/linux/ksm.h  |  6 ++++--
>  include/linux/rmap.h |  8 +++++---
>  mm/ksm.c             |  9 +++++++-
>  mm/memory-failure.c  |  2 +-
>  mm/migrate.c         |  6 ++++--
>  mm/rmap.c            | 58 +++++++++++++++++++++++++++++++++++++---------------
>  mm/vmscan.c          | 14 +++++++++++--
>  8 files changed, 77 insertions(+), 28 deletions(-)
>
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index c3713a4..7f6aaf5 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -1154,7 +1154,7 @@ cont:
>                         break;
>         }
>         pte_unmap_unlock(pte - 1, ptl);
> -       reclaim_pages_from_list(&page_list);
> +       reclaim_pages_from_list(&page_list, vma);
>         if (addr != end)
>                 goto cont;
>
> diff --git a/include/linux/ksm.h b/include/linux/ksm.h
> index 45c9b6a..d8e556b 100644
> --- a/include/linux/ksm.h
> +++ b/include/linux/ksm.h
> @@ -75,7 +75,8 @@ struct page *ksm_might_need_to_copy(struct page *page,
>
>  int page_referenced_ksm(struct page *page,
>                         struct mem_cgroup *memcg, unsigned long *vm_flags);
> -int try_to_unmap_ksm(struct page *page, enum ttu_flags flags);
> +int try_to_unmap_ksm(struct page *page,
> +                       enum ttu_flags flags, struct vm_area_struct *vma);
>  int rmap_walk_ksm(struct page *page, int (*rmap_one)(struct page *,
>                   struct vm_area_struct *, unsigned long, void *), void *arg);
>  void ksm_migrate_page(struct page *newpage, struct page *oldpage);
> @@ -115,7 +116,8 @@ static inline int page_referenced_ksm(struct page *page,
>         return 0;
>  }
>
> -static inline int try_to_unmap_ksm(struct page *page, enum ttu_flags flags)
> +static inline int try_to_unmap_ksm(struct page *page,
> +                       enum ttu_flags flags, struct vm_area_struct *target_vma)
>  {
>         return 0;
>  }
> diff --git a/include/linux/rmap.h b/include/linux/rmap.h
> index a24e34e..6c7d030 100644
> --- a/include/linux/rmap.h
> +++ b/include/linux/rmap.h
> @@ -12,7 +12,8 @@
>
>  extern int isolate_lru_page(struct page *page);
>  extern void putback_lru_page(struct page *page);
> -extern unsigned long reclaim_pages_from_list(struct list_head *page_list);
> +extern unsigned long reclaim_pages_from_list(struct list_head *page_list,
> +                                            struct vm_area_struct *vma);
>
>  /*
>   * The anon_vma heads a list of private "related" vmas, to scan if
> @@ -192,7 +193,8 @@ int page_referenced_one(struct page *, struct vm_area_struct *,
>
>  #define TTU_ACTION(x) ((x) & TTU_ACTION_MASK)
>
> -int try_to_unmap(struct page *, enum ttu_flags flags);
> +int try_to_unmap(struct page *, enum ttu_flags flags,
> +                       struct vm_area_struct *vma);
>  int try_to_unmap_one(struct page *, struct vm_area_struct *,
>                         unsigned long address, enum ttu_flags flags);
>
> @@ -259,7 +261,7 @@ static inline int page_referenced(struct page *page, int is_locked,
>         return 0;
>  }
>
> -#define try_to_unmap(page, refs) SWAP_FAIL
> +#define try_to_unmap(page, refs, vma) SWAP_FAIL
>
>  static inline int page_mkclean(struct page *page)
>  {
> diff --git a/mm/ksm.c b/mm/ksm.c
> index 7f629e4..1a90d13 100644
> --- a/mm/ksm.c
> +++ b/mm/ksm.c
> @@ -1949,7 +1949,8 @@ out:
>         return referenced;
>  }
>
> -int try_to_unmap_ksm(struct page *page, enum ttu_flags flags)
> +int try_to_unmap_ksm(struct page *page, enum ttu_flags flags,
> +                       struct vm_area_struct *target_vma)
>  {
>         struct stable_node *stable_node;
>         struct hlist_node *hlist;
> @@ -1963,6 +1964,12 @@ int try_to_unmap_ksm(struct page *page, enum ttu_flags flags)
>         stable_node = page_stable_node(page);
>         if (!stable_node)
>                 return SWAP_FAIL;
> +
> +       if (target_vma) {
> +               unsigned long address = vma_address(page, target_vma);
> +               ret = try_to_unmap_one(page, vma, address, flags);
> +               goto out;
> +       }
>  again:
>         hlist_for_each_entry(rmap_item, hlist, &stable_node->hlist, hlist) {
>                 struct anon_vma *anon_vma = rmap_item->anon_vma;
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index ceb0c7f..f3928e4 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -955,7 +955,7 @@ static int hwpoison_user_mappings(struct page *p, unsigned long pfn,
>         if (hpage != ppage)
>                 lock_page(ppage);
>
> -       ret = try_to_unmap(ppage, ttu);
> +       ret = try_to_unmap(ppage, ttu, NULL);
>         if (ret != SWAP_SUCCESS)
>                 printk(KERN_ERR "MCE %#lx: failed to unmap page (mapcount=%d)\n",
>                                 pfn, page_mapcount(ppage));
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 6fa4ebc..aafbc66 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -820,7 +820,8 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
>         }
>
>         /* Establish migration ptes or remove ptes */
> -       try_to_unmap(page, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
> +       try_to_unmap(page, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS,
> +                       NULL);
>
>  skip_unmap:
>         if (!page_mapped(page))
> @@ -947,7 +948,8 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
>         if (PageAnon(hpage))
>                 anon_vma = page_get_anon_vma(hpage);
>
> -       try_to_unmap(hpage, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
> +       try_to_unmap(hpage, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS,
> +                                               NULL);
>
>         if (!page_mapped(hpage))
>                 rc = move_to_new_page(new_hpage, hpage, 1, mode);
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 6280da8..a880f24 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1435,13 +1435,16 @@ bool is_vma_temporary_stack(struct vm_area_struct *vma)
>
>  /**
>   * try_to_unmap_anon - unmap or unlock anonymous page using the object-based
> - * rmap method
> + * rmap method if @vma is NULL
>   * @page: the page to unmap/unlock
>   * @flags: action and flags
> + * @target_vma: vma for unmapping a @page
>   *
>   * Find all the mappings of a page using the mapping pointer and the vma chains
>   * contained in the anon_vma struct it points to.
>   *
> + * If @target_vma isn't NULL, this function unmap a page from the vma
> + *
>   * This function is only called from try_to_unmap/try_to_munlock for
>   * anonymous pages.
>   * When called from try_to_munlock(), the mmap_sem of the mm containing the vma
> @@ -1449,12 +1452,19 @@ bool is_vma_temporary_stack(struct vm_area_struct *vma)
>   * vm_flags for that VMA.  That should be OK, because that vma shouldn't be
>   * 'LOCKED.
>   */
> -static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
> +static int try_to_unmap_anon(struct page *page, enum ttu_flags flags,
> +                                       struct vm_area_struct *target_vma)
>  {
> +       int ret = SWAP_AGAIN;
> +       unsigned long address;
>         struct anon_vma *anon_vma;
>         pgoff_t pgoff;
>         struct anon_vma_chain *avc;
> -       int ret = SWAP_AGAIN;
> +
> +       if (target_vma) {
> +               address = vma_address(page, target_vma);
> +               return try_to_unmap_one(page, target_vma, address, flags);
> +       }
>
>         anon_vma = page_lock_anon_vma_read(page);
>         if (!anon_vma)
> @@ -1463,7 +1473,6 @@ static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
>         pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
>         anon_vma_interval_tree_foreach(avc, &anon_vma->rb_root, pgoff, pgoff) {
>                 struct vm_area_struct *vma = avc->vma;
> -               unsigned long address;
>
>                 /*
>                  * During exec, a temporary VMA is setup and later moved.
> @@ -1491,6 +1500,7 @@ static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
>   * try_to_unmap_file - unmap/unlock file page using the object-based rmap method
>   * @page: the page to unmap/unlock
>   * @flags: action and flags
> + * @target_vma: vma for unmapping @page
>   *
>   * Find all the mappings of a page using the mapping pointer and the vma chains
>   * contained in the address_space struct it points to.
> @@ -1502,7 +1512,8 @@ static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
>   * vm_flags for that VMA.  That should be OK, because that vma shouldn't be
>   * 'LOCKED.
>   */
> -static int try_to_unmap_file(struct page *page, enum ttu_flags flags)
> +static int try_to_unmap_file(struct page *page, enum ttu_flags flags,
> +                               struct vm_area_struct *target_vma)
>  {
>         struct address_space *mapping = page->mapping;
>         pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
> @@ -1512,16 +1523,27 @@ static int try_to_unmap_file(struct page *page, enum ttu_flags flags)
>         unsigned long max_nl_cursor = 0;
>         unsigned long max_nl_size = 0;
>         unsigned int mapcount;
> +       unsigned long address;
>
>         if (PageHuge(page))
>                 pgoff = page->index << compound_order(page);
>
>         mutex_lock(&mapping->i_mmap_mutex);
> -       vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
> -               unsigned long address = vma_address(page, vma);
> -               ret = try_to_unmap_one(page, vma, address, flags);
> -               if (ret != SWAP_AGAIN || !page_mapped(page))
> +       if (target_vma) {
> +               /* We don't handle non-linear vma on ramfs */
> +               if (unlikely(!list_empty(&mapping->i_mmap_nonlinear)))
>                         goto out;
> +
> +               address = vma_address(page, target_vma);
> +               ret = try_to_unmap_one(page, target_vma, address, flags);
> +               goto out;
> +       } else {
> +               vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
> +                       address = vma_address(page, vma);
> +                       ret = try_to_unmap_one(page, vma, address, flags);
> +                       if (ret != SWAP_AGAIN || !page_mapped(page))
> +                               goto out;
> +               }
>         }
>
>         if (list_empty(&mapping->i_mmap_nonlinear))
> @@ -1602,9 +1624,12 @@ out:
>   * try_to_unmap - try to remove all page table mappings to a page
>   * @page: the page to get unmapped
>   * @flags: action and flags
> + * @vma : target vma for reclaim
>   *
>   * Tries to remove all the page table entries which are mapping this
>   * page, used in the pageout path.  Caller must hold the page lock.
> + * If @vma is not NULL, this function try to remove @page from only @vma
> + * without peeking all mapped vma for @page.
>   * Return values are:
>   *
>   * SWAP_SUCCESS        - we succeeded in removing all mappings
> @@ -1612,7 +1637,8 @@ out:
>   * SWAP_FAIL   - the page is unswappable
>   * SWAP_MLOCK  - page is mlocked.
>   */
> -int try_to_unmap(struct page *page, enum ttu_flags flags)
> +int try_to_unmap(struct page *page, enum ttu_flags flags,
> +                               struct vm_area_struct *vma)
>  {
>         int ret;
>
> @@ -1620,11 +1646,11 @@ int try_to_unmap(struct page *page, enum ttu_flags flags)
>         VM_BUG_ON(!PageHuge(page) && PageTransHuge(page));
>
>         if (unlikely(PageKsm(page)))
> -               ret = try_to_unmap_ksm(page, flags);
> +               ret = try_to_unmap_ksm(page, flags, vma);
>         else if (PageAnon(page))
> -               ret = try_to_unmap_anon(page, flags);
> +               ret = try_to_unmap_anon(page, flags, vma);
>         else
> -               ret = try_to_unmap_file(page, flags);
> +               ret = try_to_unmap_file(page, flags, vma);
>         if (ret != SWAP_MLOCK && !page_mapped(page))
>                 ret = SWAP_SUCCESS;
>         return ret;
> @@ -1650,11 +1676,11 @@ int try_to_munlock(struct page *page)
>         VM_BUG_ON(!PageLocked(page) || PageLRU(page));
>
>         if (unlikely(PageKsm(page)))
> -               return try_to_unmap_ksm(page, TTU_MUNLOCK);
> +               return try_to_unmap_ksm(page, TTU_MUNLOCK, NULL);
>         else if (PageAnon(page))
> -               return try_to_unmap_anon(page, TTU_MUNLOCK);
> +               return try_to_unmap_anon(page, TTU_MUNLOCK, NULL);
>         else
> -               return try_to_unmap_file(page, TTU_MUNLOCK);
> +               return try_to_unmap_file(page, TTU_MUNLOCK, NULL);
>  }
>
>  void __put_anon_vma(struct anon_vma *anon_vma)
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 367d0f4..df9c4d3 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -92,6 +92,13 @@ struct scan_control {
>          * are scanned.
>          */
>         nodemask_t      *nodemask;
> +
> +       /*
> +        * Reclaim pages from a vma. If the page is shared by other tasks
> +        * it is zapped from a vma without reclaim so it ends up remaining
> +        * on memory until last task zap it.
> +        */
> +       struct vm_area_struct *target_vma;
>  };
>
>  #define lru_to_page(_head) (list_entry((_head)->prev, struct page, lru))
> @@ -793,7 +800,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
>                  * processes. Try to unmap it here.
>                  */
>                 if (page_mapped(page) && mapping) {
> -                       switch (try_to_unmap(page, ttu_flags)) {
> +                       switch (try_to_unmap(page,
> +                                       ttu_flags, sc->target_vma)) {
>                         case SWAP_FAIL:
>                                 goto activate_locked;
>                         case SWAP_AGAIN:
> @@ -1000,13 +1008,15 @@ unsigned long reclaim_clean_pages_from_list(struct zone *zone,
>  }
>
>  #ifdef CONFIG_PROCESS_RECLAIM
> -unsigned long reclaim_pages_from_list(struct list_head *page_list)
> +unsigned long reclaim_pages_from_list(struct list_head *page_list,
> +                                       struct vm_area_struct *vma)
>  {
>         struct scan_control sc = {
>                 .gfp_mask = GFP_KERNEL,
>                 .priority = DEF_PRIORITY,
>                 .may_unmap = 1,
>                 .may_swap = 1,
> +               .target_vma = vma,
>         };
>
>         unsigned long nr_reclaimed;
> --
> 1.8.2
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* Re: [RFC 4/4] mm: Enhance per process reclaim
  2013-04-02 13:25   ` Michael Kerrisk
@ 2013-04-03  0:23     ` Minchan Kim
  2013-04-03  6:16       ` Michael Kerrisk (man-pages)
  0 siblings, 1 reply; 13+ messages in thread
From: Minchan Kim @ 2013-04-03  0:23 UTC (permalink / raw)
  To: Michael Kerrisk
  Cc: Andrew Morton, Linux Kernel, linux-mm, Mel Gorman, Rik van Riel,
	Johannes Weiner, Hugh Dickins, Sangseok Lee

Hey Michael,

On Tue, Apr 02, 2013 at 03:25:25PM +0200, Michael Kerrisk wrote:
> Minchan,
> 
> On Mon, Mar 25, 2013 at 7:21 AM, Minchan Kim <minchan@kernel.org> wrote:
> >
> > Some pages can be shared by several processes (e.g. libc).
> > In that case it would be a bad idea to reclaim them right away.
> >
> > This patch makes the VM keep such pages in memory until the last task
> > mapping them tries to reclaim them, so a shared page is swapped out
> > only once every task that maps it has asked for it to be reclaimed.
> >
> > This feature doesn't handle non-linear mappings on ramfs because
> > walking them is very time-consuming, doesn't guarantee reclaim, and
> > isn't a common case.
> 
> Against what tree does this patch apply? I've tried various trees,
> including MMOTM of 26 March, and encountered this error:
> 
>   CC      mm/ksm.o
> mm/ksm.c: In function ‘try_to_unmap_ksm’:
> mm/ksm.c:1970:32: error: ‘vma’ undeclared (first use in this function)
> mm/ksm.c:1970:32: note: each undeclared identifier is reported only
> once for each function it appears in
> make[1]: *** [mm/ksm.o] Error 1
> make: *** [mm] Error 2

I did this based on mmotm-2013-03-22-15-21 and you found a build problem.
Could you apply the patch below? I will fold the fix into the next spin.
Thanks for testing!

From 0934270618ccd4883d6bb05653c664a385fb9441 Mon Sep 17 00:00:00 2001
From: Minchan Kim <minchan@kernel.org>
Date: Wed, 3 Apr 2013 09:19:49 +0900
Subject: [PATCH] fix: compile error for CONFIG_KSM

Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 include/linux/rmap.h | 2 ++
 mm/ksm.c             | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 6c7d030..7bcf090 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -14,6 +14,8 @@ extern int isolate_lru_page(struct page *page);
 extern void putback_lru_page(struct page *page);
 extern unsigned long reclaim_pages_from_list(struct list_head *page_list,
 					     struct vm_area_struct *vma);
+extern unsigned long vma_address(struct page *page,
+				struct vm_area_struct *vma);
 
 /*
  * The anon_vma heads a list of private "related" vmas, to scan if
diff --git a/mm/ksm.c b/mm/ksm.c
index 1a90d13..44de936 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -1967,7 +1967,7 @@ int try_to_unmap_ksm(struct page *page, enum ttu_flags flags,
 
 	if (target_vma) {
 		unsigned long address = vma_address(page, target_vma);
-		ret = try_to_unmap_one(page, vma, address, flags);
+		ret = try_to_unmap_one(page, target_vma, address, flags);
 		goto out;
 	}
 again:
-- 
1.8.2

> 
> Cheers,
> 
> Michael
> 
> 
> > Signed-off-by: Sangseok Lee <sangseok.lee@lge.com>
> > Signed-off-by: Minchan Kim <minchan@kernel.org>
> > ---
> >  fs/proc/task_mmu.c   |  2 +-
> >  include/linux/ksm.h  |  6 ++++--
> >  include/linux/rmap.h |  8 +++++---
> >  mm/ksm.c             |  9 +++++++-
> >  mm/memory-failure.c  |  2 +-
> >  mm/migrate.c         |  6 ++++--
> >  mm/rmap.c            | 58 +++++++++++++++++++++++++++++++++++++---------------
> >  mm/vmscan.c          | 14 +++++++++++--
> >  8 files changed, 77 insertions(+), 28 deletions(-)
> >
> > diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> > index c3713a4..7f6aaf5 100644
> > --- a/fs/proc/task_mmu.c
> > +++ b/fs/proc/task_mmu.c
> > @@ -1154,7 +1154,7 @@ cont:
> >                         break;
> >         }
> >         pte_unmap_unlock(pte - 1, ptl);
> > -       reclaim_pages_from_list(&page_list);
> > +       reclaim_pages_from_list(&page_list, vma);
> >         if (addr != end)
> >                 goto cont;
> >
> > diff --git a/include/linux/ksm.h b/include/linux/ksm.h
> > index 45c9b6a..d8e556b 100644
> > --- a/include/linux/ksm.h
> > +++ b/include/linux/ksm.h
> > @@ -75,7 +75,8 @@ struct page *ksm_might_need_to_copy(struct page *page,
> >
> >  int page_referenced_ksm(struct page *page,
> >                         struct mem_cgroup *memcg, unsigned long *vm_flags);
> > -int try_to_unmap_ksm(struct page *page, enum ttu_flags flags);
> > +int try_to_unmap_ksm(struct page *page,
> > +                       enum ttu_flags flags, struct vm_area_struct *vma);
> >  int rmap_walk_ksm(struct page *page, int (*rmap_one)(struct page *,
> >                   struct vm_area_struct *, unsigned long, void *), void *arg);
> >  void ksm_migrate_page(struct page *newpage, struct page *oldpage);
> > @@ -115,7 +116,8 @@ static inline int page_referenced_ksm(struct page *page,
> >         return 0;
> >  }
> >
> > -static inline int try_to_unmap_ksm(struct page *page, enum ttu_flags flags)
> > +static inline int try_to_unmap_ksm(struct page *page,
> > +                       enum ttu_flags flags, struct vm_area_struct *target_vma)
> >  {
> >         return 0;
> >  }
> > diff --git a/include/linux/rmap.h b/include/linux/rmap.h
> > index a24e34e..6c7d030 100644
> > --- a/include/linux/rmap.h
> > +++ b/include/linux/rmap.h
> > @@ -12,7 +12,8 @@
> >
> >  extern int isolate_lru_page(struct page *page);
> >  extern void putback_lru_page(struct page *page);
> > -extern unsigned long reclaim_pages_from_list(struct list_head *page_list);
> > +extern unsigned long reclaim_pages_from_list(struct list_head *page_list,
> > +                                            struct vm_area_struct *vma);
> >
> >  /*
> >   * The anon_vma heads a list of private "related" vmas, to scan if
> > @@ -192,7 +193,8 @@ int page_referenced_one(struct page *, struct vm_area_struct *,
> >
> >  #define TTU_ACTION(x) ((x) & TTU_ACTION_MASK)
> >
> > -int try_to_unmap(struct page *, enum ttu_flags flags);
> > +int try_to_unmap(struct page *, enum ttu_flags flags,
> > +                       struct vm_area_struct *vma);
> >  int try_to_unmap_one(struct page *, struct vm_area_struct *,
> >                         unsigned long address, enum ttu_flags flags);
> >
> > @@ -259,7 +261,7 @@ static inline int page_referenced(struct page *page, int is_locked,
> >         return 0;
> >  }
> >
> > -#define try_to_unmap(page, refs) SWAP_FAIL
> > +#define try_to_unmap(page, refs, vma) SWAP_FAIL
> >
> >  static inline int page_mkclean(struct page *page)
> >  {
> > diff --git a/mm/ksm.c b/mm/ksm.c
> > index 7f629e4..1a90d13 100644
> > --- a/mm/ksm.c
> > +++ b/mm/ksm.c
> > @@ -1949,7 +1949,8 @@ out:
> >         return referenced;
> >  }
> >
> > -int try_to_unmap_ksm(struct page *page, enum ttu_flags flags)
> > +int try_to_unmap_ksm(struct page *page, enum ttu_flags flags,
> > +                       struct vm_area_struct *target_vma)
> >  {
> >         struct stable_node *stable_node;
> >         struct hlist_node *hlist;
> > @@ -1963,6 +1964,12 @@ int try_to_unmap_ksm(struct page *page, enum ttu_flags flags)
> >         stable_node = page_stable_node(page);
> >         if (!stable_node)
> >                 return SWAP_FAIL;
> > +
> > +       if (target_vma) {
> > +               unsigned long address = vma_address(page, target_vma);
> > +               ret = try_to_unmap_one(page, vma, address, flags);
> > +               goto out;
> > +       }
> >  again:
> >         hlist_for_each_entry(rmap_item, hlist, &stable_node->hlist, hlist) {
> >                 struct anon_vma *anon_vma = rmap_item->anon_vma;
> > diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> > index ceb0c7f..f3928e4 100644
> > --- a/mm/memory-failure.c
> > +++ b/mm/memory-failure.c
> > @@ -955,7 +955,7 @@ static int hwpoison_user_mappings(struct page *p, unsigned long pfn,
> >         if (hpage != ppage)
> >                 lock_page(ppage);
> >
> > -       ret = try_to_unmap(ppage, ttu);
> > +       ret = try_to_unmap(ppage, ttu, NULL);
> >         if (ret != SWAP_SUCCESS)
> >                 printk(KERN_ERR "MCE %#lx: failed to unmap page (mapcount=%d)\n",
> >                                 pfn, page_mapcount(ppage));
> > diff --git a/mm/migrate.c b/mm/migrate.c
> > index 6fa4ebc..aafbc66 100644
> > --- a/mm/migrate.c
> > +++ b/mm/migrate.c
> > @@ -820,7 +820,8 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
> >         }
> >
> >         /* Establish migration ptes or remove ptes */
> > -       try_to_unmap(page, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
> > +       try_to_unmap(page, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS,
> > +                       NULL);
> >
> >  skip_unmap:
> >         if (!page_mapped(page))
> > @@ -947,7 +948,8 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
> >         if (PageAnon(hpage))
> >                 anon_vma = page_get_anon_vma(hpage);
> >
> > -       try_to_unmap(hpage, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
> > +       try_to_unmap(hpage, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS,
> > +                                               NULL);
> >
> >         if (!page_mapped(hpage))
> >                 rc = move_to_new_page(new_hpage, hpage, 1, mode);
> > diff --git a/mm/rmap.c b/mm/rmap.c
> > index 6280da8..a880f24 100644
> > --- a/mm/rmap.c
> > +++ b/mm/rmap.c
> > @@ -1435,13 +1435,16 @@ bool is_vma_temporary_stack(struct vm_area_struct *vma)
> >
> >  /**
> >   * try_to_unmap_anon - unmap or unlock anonymous page using the object-based
> > - * rmap method
> > + * rmap method if @vma is NULL
> >   * @page: the page to unmap/unlock
> >   * @flags: action and flags
> > + * @target_vma: vma for unmapping a @page
> >   *
> >   * Find all the mappings of a page using the mapping pointer and the vma chains
> >   * contained in the anon_vma struct it points to.
> >   *
> > + * If @target_vma isn't NULL, this function unmap a page from the vma
> > + *
> >   * This function is only called from try_to_unmap/try_to_munlock for
> >   * anonymous pages.
> >   * When called from try_to_munlock(), the mmap_sem of the mm containing the vma
> > @@ -1449,12 +1452,19 @@ bool is_vma_temporary_stack(struct vm_area_struct *vma)
> >   * vm_flags for that VMA.  That should be OK, because that vma shouldn't be
> >   * 'LOCKED.
> >   */
> > -static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
> > +static int try_to_unmap_anon(struct page *page, enum ttu_flags flags,
> > +                                       struct vm_area_struct *target_vma)
> >  {
> > +       int ret = SWAP_AGAIN;
> > +       unsigned long address;
> >         struct anon_vma *anon_vma;
> >         pgoff_t pgoff;
> >         struct anon_vma_chain *avc;
> > -       int ret = SWAP_AGAIN;
> > +
> > +       if (target_vma) {
> > +               address = vma_address(page, target_vma);
> > +               return try_to_unmap_one(page, target_vma, address, flags);
> > +       }
> >
> >         anon_vma = page_lock_anon_vma_read(page);
> >         if (!anon_vma)
> > @@ -1463,7 +1473,6 @@ static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
> >         pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
> >         anon_vma_interval_tree_foreach(avc, &anon_vma->rb_root, pgoff, pgoff) {
> >                 struct vm_area_struct *vma = avc->vma;
> > -               unsigned long address;
> >
> >                 /*
> >                  * During exec, a temporary VMA is setup and later moved.
> > @@ -1491,6 +1500,7 @@ static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
> >   * try_to_unmap_file - unmap/unlock file page using the object-based rmap method
> >   * @page: the page to unmap/unlock
> >   * @flags: action and flags
> > + * @target_vma: vma for unmapping @page
> >   *
> >   * Find all the mappings of a page using the mapping pointer and the vma chains
> >   * contained in the address_space struct it points to.
> > @@ -1502,7 +1512,8 @@ static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
> >   * vm_flags for that VMA.  That should be OK, because that vma shouldn't be
> >   * 'LOCKED.
> >   */
> > -static int try_to_unmap_file(struct page *page, enum ttu_flags flags)
> > +static int try_to_unmap_file(struct page *page, enum ttu_flags flags,
> > +                               struct vm_area_struct *target_vma)
> >  {
> >         struct address_space *mapping = page->mapping;
> >         pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
> > @@ -1512,16 +1523,27 @@ static int try_to_unmap_file(struct page *page, enum ttu_flags flags)
> >         unsigned long max_nl_cursor = 0;
> >         unsigned long max_nl_size = 0;
> >         unsigned int mapcount;
> > +       unsigned long address;
> >
> >         if (PageHuge(page))
> >                 pgoff = page->index << compound_order(page);
> >
> >         mutex_lock(&mapping->i_mmap_mutex);
> > -       vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
> > -               unsigned long address = vma_address(page, vma);
> > -               ret = try_to_unmap_one(page, vma, address, flags);
> > -               if (ret != SWAP_AGAIN || !page_mapped(page))
> > +       if (target_vma) {
> > +               /* We don't handle non-linear vma on ramfs */
> > +               if (unlikely(!list_empty(&mapping->i_mmap_nonlinear)))
> >                         goto out;
> > +
> > +               address = vma_address(page, target_vma);
> > +               ret = try_to_unmap_one(page, target_vma, address, flags);
> > +               goto out;
> > +       } else {
> > +               vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
> > +                       address = vma_address(page, vma);
> > +                       ret = try_to_unmap_one(page, vma, address, flags);
> > +                       if (ret != SWAP_AGAIN || !page_mapped(page))
> > +                               goto out;
> > +               }
> >         }
> >
> >         if (list_empty(&mapping->i_mmap_nonlinear))
> > @@ -1602,9 +1624,12 @@ out:
> >   * try_to_unmap - try to remove all page table mappings to a page
> >   * @page: the page to get unmapped
> >   * @flags: action and flags
> > + * @vma : target vma for reclaim
> >   *
> >   * Tries to remove all the page table entries which are mapping this
> >   * page, used in the pageout path.  Caller must hold the page lock.
> > + * If @vma is not NULL, this function try to remove @page from only @vma
> > + * without peeking all mapped vma for @page.
> >   * Return values are:
> >   *
> >   * SWAP_SUCCESS        - we succeeded in removing all mappings
> > @@ -1612,7 +1637,8 @@ out:
> >   * SWAP_FAIL   - the page is unswappable
> >   * SWAP_MLOCK  - page is mlocked.
> >   */
> > -int try_to_unmap(struct page *page, enum ttu_flags flags)
> > +int try_to_unmap(struct page *page, enum ttu_flags flags,
> > +                               struct vm_area_struct *vma)
> >  {
> >         int ret;
> >
> > @@ -1620,11 +1646,11 @@ int try_to_unmap(struct page *page, enum ttu_flags flags)
> >         VM_BUG_ON(!PageHuge(page) && PageTransHuge(page));
> >
> >         if (unlikely(PageKsm(page)))
> > -               ret = try_to_unmap_ksm(page, flags);
> > +               ret = try_to_unmap_ksm(page, flags, vma);
> >         else if (PageAnon(page))
> > -               ret = try_to_unmap_anon(page, flags);
> > +               ret = try_to_unmap_anon(page, flags, vma);
> >         else
> > -               ret = try_to_unmap_file(page, flags);
> > +               ret = try_to_unmap_file(page, flags, vma);
> >         if (ret != SWAP_MLOCK && !page_mapped(page))
> >                 ret = SWAP_SUCCESS;
> >         return ret;
> > @@ -1650,11 +1676,11 @@ int try_to_munlock(struct page *page)
> >         VM_BUG_ON(!PageLocked(page) || PageLRU(page));
> >
> >         if (unlikely(PageKsm(page)))
> > -               return try_to_unmap_ksm(page, TTU_MUNLOCK);
> > +               return try_to_unmap_ksm(page, TTU_MUNLOCK, NULL);
> >         else if (PageAnon(page))
> > -               return try_to_unmap_anon(page, TTU_MUNLOCK);
> > +               return try_to_unmap_anon(page, TTU_MUNLOCK, NULL);
> >         else
> > -               return try_to_unmap_file(page, TTU_MUNLOCK);
> > +               return try_to_unmap_file(page, TTU_MUNLOCK, NULL);
> >  }
> >
> >  void __put_anon_vma(struct anon_vma *anon_vma)
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 367d0f4..df9c4d3 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -92,6 +92,13 @@ struct scan_control {
> >          * are scanned.
> >          */
> >         nodemask_t      *nodemask;
> > +
> > +       /*
> > +        * Reclaim pages from a vma. If the page is shared by other tasks
> > +        * it is zapped from a vma without reclaim so it ends up remaining
> > +        * on memory until last task zap it.
> > +        */
> > +       struct vm_area_struct *target_vma;
> >  };
> >
> >  #define lru_to_page(_head) (list_entry((_head)->prev, struct page, lru))
> > @@ -793,7 +800,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
> >                  * processes. Try to unmap it here.
> >                  */
> >                 if (page_mapped(page) && mapping) {
> > -                       switch (try_to_unmap(page, ttu_flags)) {
> > +                       switch (try_to_unmap(page,
> > +                                       ttu_flags, sc->target_vma)) {
> >                         case SWAP_FAIL:
> >                                 goto activate_locked;
> >                         case SWAP_AGAIN:
> > @@ -1000,13 +1008,15 @@ unsigned long reclaim_clean_pages_from_list(struct zone *zone,
> >  }
> >
> >  #ifdef CONFIG_PROCESS_RECLAIM
> > -unsigned long reclaim_pages_from_list(struct list_head *page_list)
> > +unsigned long reclaim_pages_from_list(struct list_head *page_list,
> > +                                       struct vm_area_struct *vma)
> >  {
> >         struct scan_control sc = {
> >                 .gfp_mask = GFP_KERNEL,
> >                 .priority = DEF_PRIORITY,
> >                 .may_unmap = 1,
> >                 .may_swap = 1,
> > +               .target_vma = vma,
> >         };
> >
> >         unsigned long nr_reclaimed;
> > --
> > 1.8.2
> >
> > --
> > To unsubscribe, send a message with 'unsubscribe linux-mm' in
> > the body to majordomo@kvack.org.  For more info on Linux MM,
> > see: http://www.linux-mm.org/ .
> > Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
> 
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in
> the body to majordomo@kvack.org.  For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

-- 
Kind regards,
Minchan Kim

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [RFC 4/4] mm: Enhance per process reclaim
  2013-04-03  0:23     ` Minchan Kim
@ 2013-04-03  6:16       ` Michael Kerrisk (man-pages)
  2013-04-03  6:47         ` Michael Kerrisk (man-pages)
  0 siblings, 1 reply; 13+ messages in thread
From: Michael Kerrisk (man-pages) @ 2013-04-03  6:16 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, Linux Kernel, linux-mm, Mel Gorman, Rik van Riel,
	Johannes Weiner, Hugh Dickins, Sangseok Lee

Hello Minchan

On Wed, Apr 3, 2013 at 2:23 AM, Minchan Kim <minchan@kernel.org> wrote:
> Hey Michael,
>
> On Tue, Apr 02, 2013 at 03:25:25PM +0200, Michael Kerrisk wrote:
>> Minchan,
>>
>> On Mon, Mar 25, 2013 at 7:21 AM, Minchan Kim <minchan@kernel.org> wrote:
>> >
>> > Some pages could be shared by several processes. (ex, libc)
>> > In case of that, it's too bad to reclaim them from the beginnig.
>> >
>> > This patch causes VM to keep them on memory until last task
>> > try to reclaim them so shared pages will be reclaimed only if
>> > all of task has gone swapping out.
>> >
>> > This feature doesn't handle non-linear mapping on ramfs because
>> > it's very time-consuming and doesn't make sure of reclaiming and
>> > not common.
>>
>> Against what tree does this patch apply? I've tried various trees,
>> including MMOTM of 26 March, and encountered this error:
>>
>>   CC      mm/ksm.o
>> mm/ksm.c: In function ‘try_to_unmap_ksm’:
>> mm/ksm.c:1970:32: error: ‘vma’ undeclared (first use in this function)
>> mm/ksm.c:1970:32: note: each undeclared identifier is reported only
>> once for each function it appears in
>> make[1]: *** [mm/ksm.o] Error 1
>> make: *** [mm] Error 2
>
> I did it based on mmotm-2013-03-22-15-21 and you found build problem.
> Could you apply below patch? I will fix up below in next spin.
> Thanks for the testing!

This is getting confusing. Was that a patch on top of the other 4
patches? I assume so. I applied all 5 patches on
mmotm-2013-03-22-15-21, but still get a build error:

mm/memcontrol.c: In function ‘mem_cgroup_move_parent’:
mm/memcontrol.c:3868:2: error: implicit declaration of function
‘isolate_lru_page’ [-Werror=implicit-function-declaration]
mm/memcontrol.c:3892:2: error: implicit declaration of function
‘putback_lru_page’ [-Werror=implicit-function-declaration]
cc1: some warnings being treated as errors
make[1]: *** [mm/memcontrol.o] Error 1
make[1]: *** Waiting for unfinished jobs....
make: *** [mm] Error 2
make: *** Waiting for unfinished jobs....

Cheers,

Michael



> From 0934270618ccd4883d6bb05653c664a385fb9441 Mon Sep 17 00:00:00 2001
> From: Minchan Kim <minchan@kernel.org>
> Date: Wed, 3 Apr 2013 09:19:49 +0900
> Subject: [PATCH] fix: compile error for CONFIG_KSM
>
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
>  include/linux/rmap.h | 2 ++
>  mm/ksm.c             | 2 +-
>  2 files changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/rmap.h b/include/linux/rmap.h
> index 6c7d030..7bcf090 100644
> --- a/include/linux/rmap.h
> +++ b/include/linux/rmap.h
> @@ -14,6 +14,8 @@ extern int isolate_lru_page(struct page *page);
>  extern void putback_lru_page(struct page *page);
>  extern unsigned long reclaim_pages_from_list(struct list_head *page_list,
>                                              struct vm_area_struct *vma);
> +extern unsigned long vma_address(struct page *page,
> +                               struct vm_area_struct *vma);
>
>  /*
>   * The anon_vma heads a list of private "related" vmas, to scan if
> diff --git a/mm/ksm.c b/mm/ksm.c
> index 1a90d13..44de936 100644
> --- a/mm/ksm.c
> +++ b/mm/ksm.c
> @@ -1967,7 +1967,7 @@ int try_to_unmap_ksm(struct page *page, enum ttu_flags flags,
>
>         if (target_vma) {
>                 unsigned long address = vma_address(page, target_vma);
> -               ret = try_to_unmap_one(page, vma, address, flags);
> +               ret = try_to_unmap_one(page, target_vma, address, flags);
>                 goto out;
>         }
>  again:
> --
> 1.8.2
>
>>
>> Cheers,
>>
>> Michael
>>
>>
>> > Signed-off-by: Sangseok Lee <sangseok.lee@lge.com>
>> > Signed-off-by: Minchan Kim <minchan@kernel.org>
>> > ---
>> >  fs/proc/task_mmu.c   |  2 +-
>> >  include/linux/ksm.h  |  6 ++++--
>> >  include/linux/rmap.h |  8 +++++---
>> >  mm/ksm.c             |  9 +++++++-
>> >  mm/memory-failure.c  |  2 +-
>> >  mm/migrate.c         |  6 ++++--
>> >  mm/rmap.c            | 58 +++++++++++++++++++++++++++++++++++++---------------
>> >  mm/vmscan.c          | 14 +++++++++++--
>> >  8 files changed, 77 insertions(+), 28 deletions(-)
>> >
>> > diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
>> > index c3713a4..7f6aaf5 100644
>> > --- a/fs/proc/task_mmu.c
>> > +++ b/fs/proc/task_mmu.c
>> > @@ -1154,7 +1154,7 @@ cont:
>> >                         break;
>> >         }
>> >         pte_unmap_unlock(pte - 1, ptl);
>> > -       reclaim_pages_from_list(&page_list);
>> > +       reclaim_pages_from_list(&page_list, vma);
>> >         if (addr != end)
>> >                 goto cont;
>> >
>> > diff --git a/include/linux/ksm.h b/include/linux/ksm.h
>> > index 45c9b6a..d8e556b 100644
>> > --- a/include/linux/ksm.h
>> > +++ b/include/linux/ksm.h
>> > @@ -75,7 +75,8 @@ struct page *ksm_might_need_to_copy(struct page *page,
>> >
>> >  int page_referenced_ksm(struct page *page,
>> >                         struct mem_cgroup *memcg, unsigned long *vm_flags);
>> > -int try_to_unmap_ksm(struct page *page, enum ttu_flags flags);
>> > +int try_to_unmap_ksm(struct page *page,
>> > +                       enum ttu_flags flags, struct vm_area_struct *vma);
>> >  int rmap_walk_ksm(struct page *page, int (*rmap_one)(struct page *,
>> >                   struct vm_area_struct *, unsigned long, void *), void *arg);
>> >  void ksm_migrate_page(struct page *newpage, struct page *oldpage);
>> > @@ -115,7 +116,8 @@ static inline int page_referenced_ksm(struct page *page,
>> >         return 0;
>> >  }
>> >
>> > -static inline int try_to_unmap_ksm(struct page *page, enum ttu_flags flags)
>> > +static inline int try_to_unmap_ksm(struct page *page,
>> > +                       enum ttu_flags flags, struct vm_area_struct *target_vma)
>> >  {
>> >         return 0;
>> >  }
>> > diff --git a/include/linux/rmap.h b/include/linux/rmap.h
>> > index a24e34e..6c7d030 100644
>> > --- a/include/linux/rmap.h
>> > +++ b/include/linux/rmap.h
>> > @@ -12,7 +12,8 @@
>> >
>> >  extern int isolate_lru_page(struct page *page);
>> >  extern void putback_lru_page(struct page *page);
>> > -extern unsigned long reclaim_pages_from_list(struct list_head *page_list);
>> > +extern unsigned long reclaim_pages_from_list(struct list_head *page_list,
>> > +                                            struct vm_area_struct *vma);
>> >
>> >  /*
>> >   * The anon_vma heads a list of private "related" vmas, to scan if
>> > @@ -192,7 +193,8 @@ int page_referenced_one(struct page *, struct vm_area_struct *,
>> >
>> >  #define TTU_ACTION(x) ((x) & TTU_ACTION_MASK)
>> >
>> > -int try_to_unmap(struct page *, enum ttu_flags flags);
>> > +int try_to_unmap(struct page *, enum ttu_flags flags,
>> > +                       struct vm_area_struct *vma);
>> >  int try_to_unmap_one(struct page *, struct vm_area_struct *,
>> >                         unsigned long address, enum ttu_flags flags);
>> >
>> > @@ -259,7 +261,7 @@ static inline int page_referenced(struct page *page, int is_locked,
>> >         return 0;
>> >  }
>> >
>> > -#define try_to_unmap(page, refs) SWAP_FAIL
>> > +#define try_to_unmap(page, refs, vma) SWAP_FAIL
>> >
>> >  static inline int page_mkclean(struct page *page)
>> >  {
>> > diff --git a/mm/ksm.c b/mm/ksm.c
>> > index 7f629e4..1a90d13 100644
>> > --- a/mm/ksm.c
>> > +++ b/mm/ksm.c
>> > @@ -1949,7 +1949,8 @@ out:
>> >         return referenced;
>> >  }
>> >
>> > -int try_to_unmap_ksm(struct page *page, enum ttu_flags flags)
>> > +int try_to_unmap_ksm(struct page *page, enum ttu_flags flags,
>> > +                       struct vm_area_struct *target_vma)
>> >  {
>> >         struct stable_node *stable_node;
>> >         struct hlist_node *hlist;
>> > @@ -1963,6 +1964,12 @@ int try_to_unmap_ksm(struct page *page, enum ttu_flags flags)
>> >         stable_node = page_stable_node(page);
>> >         if (!stable_node)
>> >                 return SWAP_FAIL;
>> > +
>> > +       if (target_vma) {
>> > +               unsigned long address = vma_address(page, target_vma);
>> > +               ret = try_to_unmap_one(page, vma, address, flags);
>> > +               goto out;
>> > +       }
>> >  again:
>> >         hlist_for_each_entry(rmap_item, hlist, &stable_node->hlist, hlist) {
>> >                 struct anon_vma *anon_vma = rmap_item->anon_vma;
>> > diff --git a/mm/memory-failure.c b/mm/memory-failure.c
>> > index ceb0c7f..f3928e4 100644
>> > --- a/mm/memory-failure.c
>> > +++ b/mm/memory-failure.c
>> > @@ -955,7 +955,7 @@ static int hwpoison_user_mappings(struct page *p, unsigned long pfn,
>> >         if (hpage != ppage)
>> >                 lock_page(ppage);
>> >
>> > -       ret = try_to_unmap(ppage, ttu);
>> > +       ret = try_to_unmap(ppage, ttu, NULL);
>> >         if (ret != SWAP_SUCCESS)
>> >                 printk(KERN_ERR "MCE %#lx: failed to unmap page (mapcount=%d)\n",
>> >                                 pfn, page_mapcount(ppage));
>> > diff --git a/mm/migrate.c b/mm/migrate.c
>> > index 6fa4ebc..aafbc66 100644
>> > --- a/mm/migrate.c
>> > +++ b/mm/migrate.c
>> > @@ -820,7 +820,8 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
>> >         }
>> >
>> >         /* Establish migration ptes or remove ptes */
>> > -       try_to_unmap(page, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
>> > +       try_to_unmap(page, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS,
>> > +                       NULL);
>> >
>> >  skip_unmap:
>> >         if (!page_mapped(page))
>> > @@ -947,7 +948,8 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
>> >         if (PageAnon(hpage))
>> >                 anon_vma = page_get_anon_vma(hpage);
>> >
>> > -       try_to_unmap(hpage, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
>> > +       try_to_unmap(hpage, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS,
>> > +                                               NULL);
>> >
>> >         if (!page_mapped(hpage))
>> >                 rc = move_to_new_page(new_hpage, hpage, 1, mode);
>> > diff --git a/mm/rmap.c b/mm/rmap.c
>> > index 6280da8..a880f24 100644
>> > --- a/mm/rmap.c
>> > +++ b/mm/rmap.c
>> > @@ -1435,13 +1435,16 @@ bool is_vma_temporary_stack(struct vm_area_struct *vma)
>> >
>> >  /**
>> >   * try_to_unmap_anon - unmap or unlock anonymous page using the object-based
>> > - * rmap method
>> > + * rmap method if @vma is NULL
>> >   * @page: the page to unmap/unlock
>> >   * @flags: action and flags
>> > + * @target_vma: vma for unmapping a @page
>> >   *
>> >   * Find all the mappings of a page using the mapping pointer and the vma chains
>> >   * contained in the anon_vma struct it points to.
>> >   *
>> > + * If @target_vma isn't NULL, this function unmap a page from the vma
>> > + *
>> >   * This function is only called from try_to_unmap/try_to_munlock for
>> >   * anonymous pages.
>> >   * When called from try_to_munlock(), the mmap_sem of the mm containing the vma
>> > @@ -1449,12 +1452,19 @@ bool is_vma_temporary_stack(struct vm_area_struct *vma)
>> >   * vm_flags for that VMA.  That should be OK, because that vma shouldn't be
>> >   * 'LOCKED.
>> >   */
>> > -static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
>> > +static int try_to_unmap_anon(struct page *page, enum ttu_flags flags,
>> > +                                       struct vm_area_struct *target_vma)
>> >  {
>> > +       int ret = SWAP_AGAIN;
>> > +       unsigned long address;
>> >         struct anon_vma *anon_vma;
>> >         pgoff_t pgoff;
>> >         struct anon_vma_chain *avc;
>> > -       int ret = SWAP_AGAIN;
>> > +
>> > +       if (target_vma) {
>> > +               address = vma_address(page, target_vma);
>> > +               return try_to_unmap_one(page, target_vma, address, flags);
>> > +       }
>> >
>> >         anon_vma = page_lock_anon_vma_read(page);
>> >         if (!anon_vma)
>> > @@ -1463,7 +1473,6 @@ static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
>> >         pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
>> >         anon_vma_interval_tree_foreach(avc, &anon_vma->rb_root, pgoff, pgoff) {
>> >                 struct vm_area_struct *vma = avc->vma;
>> > -               unsigned long address;
>> >
>> >                 /*
>> >                  * During exec, a temporary VMA is setup and later moved.
>> > @@ -1491,6 +1500,7 @@ static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
>> >   * try_to_unmap_file - unmap/unlock file page using the object-based rmap method
>> >   * @page: the page to unmap/unlock
>> >   * @flags: action and flags
>> > + * @target_vma: vma for unmapping @page
>> >   *
>> >   * Find all the mappings of a page using the mapping pointer and the vma chains
>> >   * contained in the address_space struct it points to.
>> > @@ -1502,7 +1512,8 @@ static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
>> >   * vm_flags for that VMA.  That should be OK, because that vma shouldn't be
>> >   * 'LOCKED.
>> >   */
>> > -static int try_to_unmap_file(struct page *page, enum ttu_flags flags)
>> > +static int try_to_unmap_file(struct page *page, enum ttu_flags flags,
>> > +                               struct vm_area_struct *target_vma)
>> >  {
>> >         struct address_space *mapping = page->mapping;
>> >         pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
>> > @@ -1512,16 +1523,27 @@ static int try_to_unmap_file(struct page *page, enum ttu_flags flags)
>> >         unsigned long max_nl_cursor = 0;
>> >         unsigned long max_nl_size = 0;
>> >         unsigned int mapcount;
>> > +       unsigned long address;
>> >
>> >         if (PageHuge(page))
>> >                 pgoff = page->index << compound_order(page);
>> >
>> >         mutex_lock(&mapping->i_mmap_mutex);
>> > -       vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
>> > -               unsigned long address = vma_address(page, vma);
>> > -               ret = try_to_unmap_one(page, vma, address, flags);
>> > -               if (ret != SWAP_AGAIN || !page_mapped(page))
>> > +       if (target_vma) {
>> > +               /* We don't handle non-linear vma on ramfs */
>> > +               if (unlikely(!list_empty(&mapping->i_mmap_nonlinear)))
>> >                         goto out;
>> > +
>> > +               address = vma_address(page, target_vma);
>> > +               ret = try_to_unmap_one(page, target_vma, address, flags);
>> > +               goto out;
>> > +       } else {
>> > +               vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
>> > +                       address = vma_address(page, vma);
>> > +                       ret = try_to_unmap_one(page, vma, address, flags);
>> > +                       if (ret != SWAP_AGAIN || !page_mapped(page))
>> > +                               goto out;
>> > +               }
>> >         }
>> >
>> >         if (list_empty(&mapping->i_mmap_nonlinear))
>> > @@ -1602,9 +1624,12 @@ out:
>> >   * try_to_unmap - try to remove all page table mappings to a page
>> >   * @page: the page to get unmapped
>> >   * @flags: action and flags
>> > + * @vma : target vma for reclaim
>> >   *
>> >   * Tries to remove all the page table entries which are mapping this
>> >   * page, used in the pageout path.  Caller must hold the page lock.
>> > + * If @vma is not NULL, this function try to remove @page from only @vma
>> > + * without peeking all mapped vma for @page.
>> >   * Return values are:
>> >   *
>> >   * SWAP_SUCCESS        - we succeeded in removing all mappings
>> > @@ -1612,7 +1637,8 @@ out:
>> >   * SWAP_FAIL   - the page is unswappable
>> >   * SWAP_MLOCK  - page is mlocked.
>> >   */
>> > -int try_to_unmap(struct page *page, enum ttu_flags flags)
>> > +int try_to_unmap(struct page *page, enum ttu_flags flags,
>> > +                               struct vm_area_struct *vma)
>> >  {
>> >         int ret;
>> >
>> > @@ -1620,11 +1646,11 @@ int try_to_unmap(struct page *page, enum ttu_flags flags)
>> >         VM_BUG_ON(!PageHuge(page) && PageTransHuge(page));
>> >
>> >         if (unlikely(PageKsm(page)))
>> > -               ret = try_to_unmap_ksm(page, flags);
>> > +               ret = try_to_unmap_ksm(page, flags, vma);
>> >         else if (PageAnon(page))
>> > -               ret = try_to_unmap_anon(page, flags);
>> > +               ret = try_to_unmap_anon(page, flags, vma);
>> >         else
>> > -               ret = try_to_unmap_file(page, flags);
>> > +               ret = try_to_unmap_file(page, flags, vma);
>> >         if (ret != SWAP_MLOCK && !page_mapped(page))
>> >                 ret = SWAP_SUCCESS;
>> >         return ret;
>> > @@ -1650,11 +1676,11 @@ int try_to_munlock(struct page *page)
>> >         VM_BUG_ON(!PageLocked(page) || PageLRU(page));
>> >
>> >         if (unlikely(PageKsm(page)))
>> > -               return try_to_unmap_ksm(page, TTU_MUNLOCK);
>> > +               return try_to_unmap_ksm(page, TTU_MUNLOCK, NULL);
>> >         else if (PageAnon(page))
>> > -               return try_to_unmap_anon(page, TTU_MUNLOCK);
>> > +               return try_to_unmap_anon(page, TTU_MUNLOCK, NULL);
>> >         else
>> > -               return try_to_unmap_file(page, TTU_MUNLOCK);
>> > +               return try_to_unmap_file(page, TTU_MUNLOCK, NULL);
>> >  }
>> >
>> >  void __put_anon_vma(struct anon_vma *anon_vma)
>> > diff --git a/mm/vmscan.c b/mm/vmscan.c
>> > index 367d0f4..df9c4d3 100644
>> > --- a/mm/vmscan.c
>> > +++ b/mm/vmscan.c
>> > @@ -92,6 +92,13 @@ struct scan_control {
>> >          * are scanned.
>> >          */
>> >         nodemask_t      *nodemask;
>> > +
>> > +       /*
>> > +        * Reclaim pages from a vma. If the page is shared by other tasks
>> > +        * it is zapped from a vma without reclaim so it ends up remaining
>> > +        * on memory until last task zap it.
>> > +        */
>> > +       struct vm_area_struct *target_vma;
>> >  };
>> >
>> >  #define lru_to_page(_head) (list_entry((_head)->prev, struct page, lru))
>> > @@ -793,7 +800,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
>> >                  * processes. Try to unmap it here.
>> >                  */
>> >                 if (page_mapped(page) && mapping) {
>> > -                       switch (try_to_unmap(page, ttu_flags)) {
>> > +                       switch (try_to_unmap(page,
>> > +                                       ttu_flags, sc->target_vma)) {
>> >                         case SWAP_FAIL:
>> >                                 goto activate_locked;
>> >                         case SWAP_AGAIN:
>> > @@ -1000,13 +1008,15 @@ unsigned long reclaim_clean_pages_from_list(struct zone *zone,
>> >  }
>> >
>> >  #ifdef CONFIG_PROCESS_RECLAIM
>> > -unsigned long reclaim_pages_from_list(struct list_head *page_list)
>> > +unsigned long reclaim_pages_from_list(struct list_head *page_list,
>> > +                                       struct vm_area_struct *vma)
>> >  {
>> >         struct scan_control sc = {
>> >                 .gfp_mask = GFP_KERNEL,
>> >                 .priority = DEF_PRIORITY,
>> >                 .may_unmap = 1,
>> >                 .may_swap = 1,
>> > +               .target_vma = vma,
>> >         };
>> >
>> >         unsigned long nr_reclaimed;
>> > --
>> > 1.8.2
>> >
>> > --
>> > To unsubscribe, send a message with 'unsubscribe linux-mm' in
>> > the body to majordomo@kvack.org.  For more info on Linux MM,
>> > see: http://www.linux-mm.org/ .
>> > Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
>>
>> --
>> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>> the body to majordomo@kvack.org.  For more info on Linux MM,
>> see: http://www.linux-mm.org/ .
>> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
>
> --
> Kind regards,
> Minchan Kim



--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Author of "The Linux Programming Interface"; http://man7.org/tlpi/

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC 4/4] mm: Enhance per process reclaim
  2013-04-03  6:16       ` Michael Kerrisk (man-pages)
@ 2013-04-03  6:47         ` Michael Kerrisk (man-pages)
  0 siblings, 0 replies; 13+ messages in thread
From: Michael Kerrisk (man-pages) @ 2013-04-03  6:47 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, Linux Kernel, linux-mm, Mel Gorman, Rik van Riel,
	Johannes Weiner, Hugh Dickins, Sangseok Lee

Hi Minchan,

On Wed, Apr 3, 2013 at 8:16 AM, Michael Kerrisk (man-pages)
<mtk.manpages@gmail.com> wrote:
> Hello Minchan
>
> On Wed, Apr 3, 2013 at 2:23 AM, Minchan Kim <minchan@kernel.org> wrote:
>> Hey Michael,
>>
>> On Tue, Apr 02, 2013 at 03:25:25PM +0200, Michael Kerrisk wrote:
>>> Minchan,
>>>
>>> On Mon, Mar 25, 2013 at 7:21 AM, Minchan Kim <minchan@kernel.org> wrote:
>>> >
>>> > Some pages could be shared by several processes. (ex, libc)
>>> > In case of that, it's too bad to reclaim them from the beginnig.
>>> >
>>> > This patch causes VM to keep them on memory until last task
>>> > try to reclaim them so shared pages will be reclaimed only if
>>> > all of task has gone swapping out.
>>> >
>>> > This feature doesn't handle non-linear mapping on ramfs because
>>> > it's very time-consuming and doesn't make sure of reclaiming and
>>> > not common.
>>>
>>> Against what tree does this patch apply? I've tried various trees,
>>> including MMOTM of 26 March, and encountered this error:
>>>
>>>   CC      mm/ksm.o
>>> mm/ksm.c: In function ‘try_to_unmap_ksm’:
>>> mm/ksm.c:1970:32: error: ‘vma’ undeclared (first use in this function)
>>> mm/ksm.c:1970:32: note: each undeclared identifier is reported only
>>> once for each function it appears in
>>> make[1]: *** [mm/ksm.o] Error 1
>>> make: *** [mm] Error 2
>>
>> I did it based on mmotm-2013-03-22-15-21 and you found build problem.
>> Could you apply below patch? I will fix up below in next spin.
>> Thanks for the testing!
>
> This is getting confusing. Was that a patch on top of the other 4
> patches? I assume so. I applied all 5 patches on
> mmotm-2013-03-22-15-21, but still get a build error:
>
> mm/memcontrol.c: In function ‘mem_cgroup_move_parent’:
> mm/memcontrol.c:3868:2: error: implicit declaration of function
> ‘isolate_lru_page’ [-Werror=implicit-function-declaration]
> mm/memcontrol.c:3892:2: error: implicit declaration of function
> ‘putback_lru_page’ [-Werror=implicit-function-declaration]
> cc1: some warnings being treated as errors
> make[1]: *** [mm/memcontrol.o] Error 1
> make[1]: *** Waiting for unfinished jobs....
> make: *** [mm] Error 2
> make: *** Waiting for unfinished jobs....

Okay -- it turns out that adding 'extern' declarations for those two
functions was enough. I have a built kernel now, and see the
/proc/PID/reclaim files.
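
(For reference, the two declarations I added locally were along these
lines -- just my build workaround, not part of Minchan's series, and
where exactly to put them is my guess; making them visible to
mm/memcontrol.c was enough:)

    /* prototypes that mm/memcontrol.c could no longer see */
    extern int isolate_lru_page(struct page *page);
    extern void putback_lru_page(struct page *page);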

Thanks,

Michael


>> From 0934270618ccd4883d6bb05653c664a385fb9441 Mon Sep 17 00:00:00 2001
>> From: Minchan Kim <minchan@kernel.org>
>> Date: Wed, 3 Apr 2013 09:19:49 +0900
>> Subject: [PATCH] fix: compile error for CONFIG_KSM
>>
>> Signed-off-by: Minchan Kim <minchan@kernel.org>
>> ---
>>  include/linux/rmap.h | 2 ++
>>  mm/ksm.c             | 2 +-
>>  2 files changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/linux/rmap.h b/include/linux/rmap.h
>> index 6c7d030..7bcf090 100644
>> --- a/include/linux/rmap.h
>> +++ b/include/linux/rmap.h
>> @@ -14,6 +14,8 @@ extern int isolate_lru_page(struct page *page);
>>  extern void putback_lru_page(struct page *page);
>>  extern unsigned long reclaim_pages_from_list(struct list_head *page_list,
>>                                              struct vm_area_struct *vma);
>> +extern unsigned long vma_address(struct page *page,
>> +                               struct vm_area_struct *vma);
>>
>>  /*
>>   * The anon_vma heads a list of private "related" vmas, to scan if
>> diff --git a/mm/ksm.c b/mm/ksm.c
>> index 1a90d13..44de936 100644
>> --- a/mm/ksm.c
>> +++ b/mm/ksm.c
>> @@ -1967,7 +1967,7 @@ int try_to_unmap_ksm(struct page *page, enum ttu_flags flags,
>>
>>         if (target_vma) {
>>                 unsigned long address = vma_address(page, target_vma);
>> -               ret = try_to_unmap_one(page, vma, address, flags);
>> +               ret = try_to_unmap_one(page, target_vma, address, flags);
>>                 goto out;
>>         }
>>  again:
>> --
>> 1.8.2
>>
>>>
>>> Cheers,
>>>
>>> Michael
>>>
>>>
>>> > Signed-off-by: Sangseok Lee <sangseok.lee@lge.com>
>>> > Signed-off-by: Minchan Kim <minchan@kernel.org>
>>> > ---
>>> >  fs/proc/task_mmu.c   |  2 +-
>>> >  include/linux/ksm.h  |  6 ++++--
>>> >  include/linux/rmap.h |  8 +++++---
>>> >  mm/ksm.c             |  9 +++++++-
>>> >  mm/memory-failure.c  |  2 +-
>>> >  mm/migrate.c         |  6 ++++--
>>> >  mm/rmap.c            | 58 +++++++++++++++++++++++++++++++++++++---------------
>>> >  mm/vmscan.c          | 14 +++++++++++--
>>> >  8 files changed, 77 insertions(+), 28 deletions(-)
>>> >
>>> > diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
>>> > index c3713a4..7f6aaf5 100644
>>> > --- a/fs/proc/task_mmu.c
>>> > +++ b/fs/proc/task_mmu.c
>>> > @@ -1154,7 +1154,7 @@ cont:
>>> >                         break;
>>> >         }
>>> >         pte_unmap_unlock(pte - 1, ptl);
>>> > -       reclaim_pages_from_list(&page_list);
>>> > +       reclaim_pages_from_list(&page_list, vma);
>>> >         if (addr != end)
>>> >                 goto cont;
>>> >
>>> > diff --git a/include/linux/ksm.h b/include/linux/ksm.h
>>> > index 45c9b6a..d8e556b 100644
>>> > --- a/include/linux/ksm.h
>>> > +++ b/include/linux/ksm.h
>>> > @@ -75,7 +75,8 @@ struct page *ksm_might_need_to_copy(struct page *page,
>>> >
>>> >  int page_referenced_ksm(struct page *page,
>>> >                         struct mem_cgroup *memcg, unsigned long *vm_flags);
>>> > -int try_to_unmap_ksm(struct page *page, enum ttu_flags flags);
>>> > +int try_to_unmap_ksm(struct page *page,
>>> > +                       enum ttu_flags flags, struct vm_area_struct *vma);
>>> >  int rmap_walk_ksm(struct page *page, int (*rmap_one)(struct page *,
>>> >                   struct vm_area_struct *, unsigned long, void *), void *arg);
>>> >  void ksm_migrate_page(struct page *newpage, struct page *oldpage);
>>> > @@ -115,7 +116,8 @@ static inline int page_referenced_ksm(struct page *page,
>>> >         return 0;
>>> >  }
>>> >
>>> > -static inline int try_to_unmap_ksm(struct page *page, enum ttu_flags flags)
>>> > +static inline int try_to_unmap_ksm(struct page *page,
>>> > +                       enum ttu_flags flags, struct vm_area_struct *target_vma)
>>> >  {
>>> >         return 0;
>>> >  }
>>> > diff --git a/include/linux/rmap.h b/include/linux/rmap.h
>>> > index a24e34e..6c7d030 100644
>>> > --- a/include/linux/rmap.h
>>> > +++ b/include/linux/rmap.h
>>> > @@ -12,7 +12,8 @@
>>> >
>>> >  extern int isolate_lru_page(struct page *page);
>>> >  extern void putback_lru_page(struct page *page);
>>> > -extern unsigned long reclaim_pages_from_list(struct list_head *page_list);
>>> > +extern unsigned long reclaim_pages_from_list(struct list_head *page_list,
>>> > +                                            struct vm_area_struct *vma);
>>> >
>>> >  /*
>>> >   * The anon_vma heads a list of private "related" vmas, to scan if
>>> > @@ -192,7 +193,8 @@ int page_referenced_one(struct page *, struct vm_area_struct *,
>>> >
>>> >  #define TTU_ACTION(x) ((x) & TTU_ACTION_MASK)
>>> >
>>> > -int try_to_unmap(struct page *, enum ttu_flags flags);
>>> > +int try_to_unmap(struct page *, enum ttu_flags flags,
>>> > +                       struct vm_area_struct *vma);
>>> >  int try_to_unmap_one(struct page *, struct vm_area_struct *,
>>> >                         unsigned long address, enum ttu_flags flags);
>>> >
>>> > @@ -259,7 +261,7 @@ static inline int page_referenced(struct page *page, int is_locked,
>>> >         return 0;
>>> >  }
>>> >
>>> > -#define try_to_unmap(page, refs) SWAP_FAIL
>>> > +#define try_to_unmap(page, refs, vma) SWAP_FAIL
>>> >
>>> >  static inline int page_mkclean(struct page *page)
>>> >  {
>>> > diff --git a/mm/ksm.c b/mm/ksm.c
>>> > index 7f629e4..1a90d13 100644
>>> > --- a/mm/ksm.c
>>> > +++ b/mm/ksm.c
>>> > @@ -1949,7 +1949,8 @@ out:
>>> >         return referenced;
>>> >  }
>>> >
>>> > -int try_to_unmap_ksm(struct page *page, enum ttu_flags flags)
>>> > +int try_to_unmap_ksm(struct page *page, enum ttu_flags flags,
>>> > +                       struct vm_area_struct *target_vma)
>>> >  {
>>> >         struct stable_node *stable_node;
>>> >         struct hlist_node *hlist;
>>> > @@ -1963,6 +1964,12 @@ int try_to_unmap_ksm(struct page *page, enum ttu_flags flags)
>>> >         stable_node = page_stable_node(page);
>>> >         if (!stable_node)
>>> >                 return SWAP_FAIL;
>>> > +
>>> > +       if (target_vma) {
>>> > +               unsigned long address = vma_address(page, target_vma);
>>> > +               ret = try_to_unmap_one(page, vma, address, flags);
>>> > +               goto out;
>>> > +       }
>>> >  again:
>>> >         hlist_for_each_entry(rmap_item, hlist, &stable_node->hlist, hlist) {
>>> >                 struct anon_vma *anon_vma = rmap_item->anon_vma;
>>> > diff --git a/mm/memory-failure.c b/mm/memory-failure.c
>>> > index ceb0c7f..f3928e4 100644
>>> > --- a/mm/memory-failure.c
>>> > +++ b/mm/memory-failure.c
>>> > @@ -955,7 +955,7 @@ static int hwpoison_user_mappings(struct page *p, unsigned long pfn,
>>> >         if (hpage != ppage)
>>> >                 lock_page(ppage);
>>> >
>>> > -       ret = try_to_unmap(ppage, ttu);
>>> > +       ret = try_to_unmap(ppage, ttu, NULL);
>>> >         if (ret != SWAP_SUCCESS)
>>> >                 printk(KERN_ERR "MCE %#lx: failed to unmap page (mapcount=%d)\n",
>>> >                                 pfn, page_mapcount(ppage));
>>> > diff --git a/mm/migrate.c b/mm/migrate.c
>>> > index 6fa4ebc..aafbc66 100644
>>> > --- a/mm/migrate.c
>>> > +++ b/mm/migrate.c
>>> > @@ -820,7 +820,8 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
>>> >         }
>>> >
>>> >         /* Establish migration ptes or remove ptes */
>>> > -       try_to_unmap(page, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
>>> > +       try_to_unmap(page, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS,
>>> > +                       NULL);
>>> >
>>> >  skip_unmap:
>>> >         if (!page_mapped(page))
>>> > @@ -947,7 +948,8 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
>>> >         if (PageAnon(hpage))
>>> >                 anon_vma = page_get_anon_vma(hpage);
>>> >
>>> > -       try_to_unmap(hpage, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
>>> > +       try_to_unmap(hpage, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS,
>>> > +                                               NULL);
>>> >
>>> >         if (!page_mapped(hpage))
>>> >                 rc = move_to_new_page(new_hpage, hpage, 1, mode);
>>> > diff --git a/mm/rmap.c b/mm/rmap.c
>>> > index 6280da8..a880f24 100644
>>> > --- a/mm/rmap.c
>>> > +++ b/mm/rmap.c
>>> > @@ -1435,13 +1435,16 @@ bool is_vma_temporary_stack(struct vm_area_struct *vma)
>>> >
>>> >  /**
>>> >   * try_to_unmap_anon - unmap or unlock anonymous page using the object-based
>>> > - * rmap method
>>> > + * rmap method if @vma is NULL
>>> >   * @page: the page to unmap/unlock
>>> >   * @flags: action and flags
>>> > + * @target_vma: vma for unmapping a @page
>>> >   *
>>> >   * Find all the mappings of a page using the mapping pointer and the vma chains
>>> >   * contained in the anon_vma struct it points to.
>>> >   *
>>> > + * If @target_vma isn't NULL, this function unmap a page from the vma
>>> > + *
>>> >   * This function is only called from try_to_unmap/try_to_munlock for
>>> >   * anonymous pages.
>>> >   * When called from try_to_munlock(), the mmap_sem of the mm containing the vma
>>> > @@ -1449,12 +1452,19 @@ bool is_vma_temporary_stack(struct vm_area_struct *vma)
>>> >   * vm_flags for that VMA.  That should be OK, because that vma shouldn't be
>>> >   * 'LOCKED.
>>> >   */
>>> > -static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
>>> > +static int try_to_unmap_anon(struct page *page, enum ttu_flags flags,
>>> > +                                       struct vm_area_struct *target_vma)
>>> >  {
>>> > +       int ret = SWAP_AGAIN;
>>> > +       unsigned long address;
>>> >         struct anon_vma *anon_vma;
>>> >         pgoff_t pgoff;
>>> >         struct anon_vma_chain *avc;
>>> > -       int ret = SWAP_AGAIN;
>>> > +
>>> > +       if (target_vma) {
>>> > +               address = vma_address(page, target_vma);
>>> > +               return try_to_unmap_one(page, target_vma, address, flags);
>>> > +       }
>>> >
>>> >         anon_vma = page_lock_anon_vma_read(page);
>>> >         if (!anon_vma)
>>> > @@ -1463,7 +1473,6 @@ static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
>>> >         pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
>>> >         anon_vma_interval_tree_foreach(avc, &anon_vma->rb_root, pgoff, pgoff) {
>>> >                 struct vm_area_struct *vma = avc->vma;
>>> > -               unsigned long address;
>>> >
>>> >                 /*
>>> >                  * During exec, a temporary VMA is setup and later moved.
>>> > @@ -1491,6 +1500,7 @@ static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
>>> >   * try_to_unmap_file - unmap/unlock file page using the object-based rmap method
>>> >   * @page: the page to unmap/unlock
>>> >   * @flags: action and flags
>>> > + * @target_vma: vma for unmapping @page
>>> >   *
>>> >   * Find all the mappings of a page using the mapping pointer and the vma chains
>>> >   * contained in the address_space struct it points to.
>>> > @@ -1502,7 +1512,8 @@ static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
>>> >   * vm_flags for that VMA.  That should be OK, because that vma shouldn't be
>>> >   * 'LOCKED.
>>> >   */
>>> > -static int try_to_unmap_file(struct page *page, enum ttu_flags flags)
>>> > +static int try_to_unmap_file(struct page *page, enum ttu_flags flags,
>>> > +                               struct vm_area_struct *target_vma)
>>> >  {
>>> >         struct address_space *mapping = page->mapping;
>>> >         pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
>>> > @@ -1512,16 +1523,27 @@ static int try_to_unmap_file(struct page *page, enum ttu_flags flags)
>>> >         unsigned long max_nl_cursor = 0;
>>> >         unsigned long max_nl_size = 0;
>>> >         unsigned int mapcount;
>>> > +       unsigned long address;
>>> >
>>> >         if (PageHuge(page))
>>> >                 pgoff = page->index << compound_order(page);
>>> >
>>> >         mutex_lock(&mapping->i_mmap_mutex);
>>> > -       vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
>>> > -               unsigned long address = vma_address(page, vma);
>>> > -               ret = try_to_unmap_one(page, vma, address, flags);
>>> > -               if (ret != SWAP_AGAIN || !page_mapped(page))
>>> > +       if (target_vma) {
>>> > +               /* We don't handle non-linear vma on ramfs */
>>> > +               if (unlikely(!list_empty(&mapping->i_mmap_nonlinear)))
>>> >                         goto out;
>>> > +
>>> > +               address = vma_address(page, target_vma);
>>> > +               ret = try_to_unmap_one(page, target_vma, address, flags);
>>> > +               goto out;
>>> > +       } else {
>>> > +               vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
>>> > +                       address = vma_address(page, vma);
>>> > +                       ret = try_to_unmap_one(page, vma, address, flags);
>>> > +                       if (ret != SWAP_AGAIN || !page_mapped(page))
>>> > +                               goto out;
>>> > +               }
>>> >         }
>>> >
>>> >         if (list_empty(&mapping->i_mmap_nonlinear))
>>> > @@ -1602,9 +1624,12 @@ out:
>>> >   * try_to_unmap - try to remove all page table mappings to a page
>>> >   * @page: the page to get unmapped
>>> >   * @flags: action and flags
>>> > + * @vma : target vma for reclaim
>>> >   *
>>> >   * Tries to remove all the page table entries which are mapping this
>>> >   * page, used in the pageout path.  Caller must hold the page lock.
>>> > + * If @vma is not NULL, this function try to remove @page from only @vma
>>> > + * without peeking all mapped vma for @page.
>>> >   * Return values are:
>>> >   *
>>> >   * SWAP_SUCCESS        - we succeeded in removing all mappings
>>> > @@ -1612,7 +1637,8 @@ out:
>>> >   * SWAP_FAIL   - the page is unswappable
>>> >   * SWAP_MLOCK  - page is mlocked.
>>> >   */
>>> > -int try_to_unmap(struct page *page, enum ttu_flags flags)
>>> > +int try_to_unmap(struct page *page, enum ttu_flags flags,
>>> > +                               struct vm_area_struct *vma)
>>> >  {
>>> >         int ret;
>>> >
>>> > @@ -1620,11 +1646,11 @@ int try_to_unmap(struct page *page, enum ttu_flags flags)
>>> >         VM_BUG_ON(!PageHuge(page) && PageTransHuge(page));
>>> >
>>> >         if (unlikely(PageKsm(page)))
>>> > -               ret = try_to_unmap_ksm(page, flags);
>>> > +               ret = try_to_unmap_ksm(page, flags, vma);
>>> >         else if (PageAnon(page))
>>> > -               ret = try_to_unmap_anon(page, flags);
>>> > +               ret = try_to_unmap_anon(page, flags, vma);
>>> >         else
>>> > -               ret = try_to_unmap_file(page, flags);
>>> > +               ret = try_to_unmap_file(page, flags, vma);
>>> >         if (ret != SWAP_MLOCK && !page_mapped(page))
>>> >                 ret = SWAP_SUCCESS;
>>> >         return ret;
>>> > @@ -1650,11 +1676,11 @@ int try_to_munlock(struct page *page)
>>> >         VM_BUG_ON(!PageLocked(page) || PageLRU(page));
>>> >
>>> >         if (unlikely(PageKsm(page)))
>>> > -               return try_to_unmap_ksm(page, TTU_MUNLOCK);
>>> > +               return try_to_unmap_ksm(page, TTU_MUNLOCK, NULL);
>>> >         else if (PageAnon(page))
>>> > -               return try_to_unmap_anon(page, TTU_MUNLOCK);
>>> > +               return try_to_unmap_anon(page, TTU_MUNLOCK, NULL);
>>> >         else
>>> > -               return try_to_unmap_file(page, TTU_MUNLOCK);
>>> > +               return try_to_unmap_file(page, TTU_MUNLOCK, NULL);
>>> >  }
>>> >
>>> >  void __put_anon_vma(struct anon_vma *anon_vma)
>>> > diff --git a/mm/vmscan.c b/mm/vmscan.c
>>> > index 367d0f4..df9c4d3 100644
>>> > --- a/mm/vmscan.c
>>> > +++ b/mm/vmscan.c
>>> > @@ -92,6 +92,13 @@ struct scan_control {
>>> >          * are scanned.
>>> >          */
>>> >         nodemask_t      *nodemask;
>>> > +
>>> > +       /*
>>> > +        * Reclaim pages from a vma. If the page is shared by other tasks
>>> > +        * it is zapped from a vma without reclaim so it ends up remaining
>>> > +        * on memory until last task zap it.
>>> > +        */
>>> > +       struct vm_area_struct *target_vma;
>>> >  };
>>> >
>>> >  #define lru_to_page(_head) (list_entry((_head)->prev, struct page, lru))
>>> > @@ -793,7 +800,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
>>> >                  * processes. Try to unmap it here.
>>> >                  */
>>> >                 if (page_mapped(page) && mapping) {
>>> > -                       switch (try_to_unmap(page, ttu_flags)) {
>>> > +                       switch (try_to_unmap(page,
>>> > +                                       ttu_flags, sc->target_vma)) {
>>> >                         case SWAP_FAIL:
>>> >                                 goto activate_locked;
>>> >                         case SWAP_AGAIN:
>>> > @@ -1000,13 +1008,15 @@ unsigned long reclaim_clean_pages_from_list(struct zone *zone,
>>> >  }
>>> >
>>> >  #ifdef CONFIG_PROCESS_RECLAIM
>>> > -unsigned long reclaim_pages_from_list(struct list_head *page_list)
>>> > +unsigned long reclaim_pages_from_list(struct list_head *page_list,
>>> > +                                       struct vm_area_struct *vma)
>>> >  {
>>> >         struct scan_control sc = {
>>> >                 .gfp_mask = GFP_KERNEL,
>>> >                 .priority = DEF_PRIORITY,
>>> >                 .may_unmap = 1,
>>> >                 .may_swap = 1,
>>> > +               .target_vma = vma,
>>> >         };
>>> >
>>> >         unsigned long nr_reclaimed;
>>> > --
>>> > 1.8.2
>>> >
>>> > --
>>> > To unsubscribe, send a message with 'unsubscribe linux-mm' in
>>> > the body to majordomo@kvack.org.  For more info on Linux MM,
>>> > see: http://www.linux-mm.org/ .
>>> > Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
>>>
>>> --
>>> To unsubscribe, send a message with 'unsubscribe linux-mm' in
>>> the body to majordomo@kvack.org.  For more info on Linux MM,
>>> see: http://www.linux-mm.org/ .
>>> Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
>>
>> --
>> Kind regards,
>> Minchan Kim
>
>
>
> --
> Michael Kerrisk
> Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
> Author of "The Linux Programming Interface"; http://man7.org/tlpi/



--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Author of "The Linux Programming Interface"; http://man7.org/tlpi/

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC 1/4] mm: Per process reclaim
  2013-03-25  6:21 [RFC 1/4] mm: Per process reclaim Minchan Kim
                   ` (2 preceding siblings ...)
  2013-03-25  6:21 ` [RFC 4/4] mm: Enhance per process reclaim Minchan Kim
@ 2013-04-03  9:17 ` Michael Kerrisk
  2013-04-03 23:31   ` Minchan Kim
  2013-04-03 10:10 ` Michael Kerrisk
  4 siblings, 1 reply; 13+ messages in thread
From: Michael Kerrisk @ 2013-04-03  9:17 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, Linux Kernel, linux-mm, Mel Gorman, Rik van Riel,
	Johannes Weiner, Hugh Dickins, Sangseok Lee

Hello Minchan,

On Mon, Mar 25, 2013 at 7:21 AM, Minchan Kim <minchan@kernel.org> wrote:
> These day, there are many platforms avaiable in the embedded market
> and they are smarter than kernel which has very limited information
> about working set so they want to involve memory management more heavily
> like android's lowmemory killer and ashmem or recent many lowmemory
> notifier(there was several trial for various company NOKIA, SAMSUNG,
> Linaro, Google ChromeOS, Redhat).
>
> One of the simple imagine scenario about userspace's intelligence is that
> platform can manage tasks as forground and backgroud so it would be
> better to reclaim background's task pages for end-user's *responsibility*
> although it has frequent referenced pages.
>
> This patch adds new knob "reclaim under proc/<pid>/" so task manager
> can reclaim any target process anytime, anywhere. It could give another
> method to platform for using memory efficiently.
>
> It can avoid process killing for getting free memory, which was really
> terrible experience because I lost my best score of game I had ever
> after I switch the phone call while I enjoyed the game.
>
> Writing 1 to /proc/pid/reclaim reclaims only file pages.
> Writing 2 to /proc/pid/reclaim reclaims only anonymous pages.
> Writing 3 to /proc/pid/reclaim reclaims all pages from target process.
>
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
>  fs/proc/base.c       |   3 ++
>  fs/proc/internal.h   |   1 +
>  fs/proc/task_mmu.c   | 115 +++++++++++++++++++++++++++++++++++++++++++++++++++
>  include/linux/rmap.h |   4 ++
>  mm/Kconfig           |  13 ++++++
>  mm/internal.h        |   7 +---
>  mm/vmscan.c          |  59 ++++++++++++++++++++++++++
>  7 files changed, 196 insertions(+), 6 deletions(-)
>
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index 9b43ff77..ed83e85 100644

[...]

> +#define RECLAIM_FILE (1 << 0)
> +#define RECLAIM_ANON (1 << 1)
> +#define RECLAIM_ALL (RECLAIM_FILE | RECLAIM_ANON)
> +
> +static ssize_t reclaim_write(struct file *file, const char __user *buf,
> +                               size_t count, loff_t *ppos)
> +{
> +       struct task_struct *task;
> +       char buffer[PROC_NUMBUF];
> +       struct mm_struct *mm;
> +       struct vm_area_struct *vma;
> +       int type;
> +       int rv;
> +
> +       memset(buffer, 0, sizeof(buffer));
> +       if (count > sizeof(buffer) - 1)
> +               count = sizeof(buffer) - 1;
> +       if (copy_from_user(buffer, buf, count))
> +               return -EFAULT;
> +       rv = kstrtoint(strstrip(buffer), 10, &type);
> +       if (rv < 0)
> +               return rv;
> +       if (type < RECLAIM_ALL || type > RECLAIM_FILE)
> +               return -EINVAL;
> +       task = get_proc_task(file->f_path.dentry->d_inode);

The check here is the wrong way round. Should be

       if (type < RECLAIM_FILE || type > RECLAIM_ALL)
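
In context, the corrected sequence would presumably read (assuming the
rest of reclaim_write() stays as posted):

       rv = kstrtoint(strstrip(buffer), 10, &type);
       if (rv < 0)
               return rv;
       if (type < RECLAIM_FILE || type > RECLAIM_ALL)
               return -EINVAL;
       task = get_proc_task(file->f_path.dentry->d_inode);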

Thanks,

Michael

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC 1/4] mm: Per process reclaim
  2013-03-25  6:21 [RFC 1/4] mm: Per process reclaim Minchan Kim
                   ` (3 preceding siblings ...)
  2013-04-03  9:17 ` [RFC 1/4] mm: Per " Michael Kerrisk
@ 2013-04-03 10:10 ` Michael Kerrisk
  2013-04-03 10:12   ` Michael Kerrisk
  2013-04-03 23:46   ` Minchan Kim
  4 siblings, 2 replies; 13+ messages in thread
From: Michael Kerrisk @ 2013-04-03 10:10 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, Linux Kernel, linux-mm, Mel Gorman, Rik van Riel,
	Johannes Weiner, Hugh Dickins, Sangseok Lee,
	Michael Kerrisk-manpages

Hello Minchan,

On Mon, Mar 25, 2013 at 7:21 AM, Minchan Kim <minchan@kernel.org> wrote:
> These day, there are many platforms avaiable in the embedded market
> and they are smarter than kernel which has very limited information
> about working set so they want to involve memory management more heavily
> like android's lowmemory killer and ashmem or recent many lowmemory
> notifier(there was several trial for various company NOKIA, SAMSUNG,
> Linaro, Google ChromeOS, Redhat).
>
> One of the simple imagine scenario about userspace's intelligence is that
> platform can manage tasks as forground and backgroud so it would be
> better to reclaim background's task pages for end-user's *responsibility*
> although it has frequent referenced pages.
>
> This patch adds new knob "reclaim under proc/<pid>/" so task manager
> can reclaim any target process anytime, anywhere. It could give another
> method to platform for using memory efficiently.
>
> It can avoid process killing for getting free memory, which was really
> terrible experience because I lost my best score of game I had ever
> after I switch the phone call while I enjoyed the game.
>
> Writing 1 to /proc/pid/reclaim reclaims only file pages.
> Writing 2 to /proc/pid/reclaim reclaims only anonymous pages.
> Writing 3 to /proc/pid/reclaim reclaims all pages from target process.

This interface seems to work as advertised, at least from some light
testing that I've done.

However, the interface is quite a blunt instrument. Would there be any
virtue in extending it so that an address range could be written to
/proc/PID/reclaim? Used in conjunction with /proc/PID/maps, a manager
process might then choose to trigger reclaim of just selected regions
of a process's address space. Thus, one might reclaim file-backed
pages in a range using:

    echo '2 start-address end-address' > /proc/PID/reclaim
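
(Purely as a sketch of what I have in mind -- hypothetical, not code
from this series -- reclaim_write() could accept the range as two
optional hex fields after the type; the buffer would need to be larger
than PROC_NUMBUF, and the page walk would then be clamped to the range:)

    int type, n;
    unsigned long start = 0, end = ULONG_MAX;

    n = sscanf(buffer, "%d %lx %lx", &type, &start, &end);
    if (n != 1 && n != 3)
            return -EINVAL;
    /* n == 1: whole address space, exactly as the patch behaves today */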

What do you think?

Thanks,

Michael

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC 1/4] mm: Per process reclaim
  2013-04-03 10:10 ` Michael Kerrisk
@ 2013-04-03 10:12   ` Michael Kerrisk
  2013-04-03 23:46   ` Minchan Kim
  1 sibling, 0 replies; 13+ messages in thread
From: Michael Kerrisk @ 2013-04-03 10:12 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, Linux Kernel, linux-mm, Mel Gorman, Rik van Riel,
	Johannes Weiner, Hugh Dickins, Sangseok Lee,
	Michael Kerrisk-manpages

> However, the interface is quite a blunt instrument. Would there be any
> virtue in extending it so that an address range could be written to

Here, I did mean to say "an *optional* address range".

Thanks,

Michael

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC 1/4] mm: Per process reclaim
  2013-04-03  9:17 ` [RFC 1/4] mm: Per " Michael Kerrisk
@ 2013-04-03 23:31   ` Minchan Kim
  0 siblings, 0 replies; 13+ messages in thread
From: Minchan Kim @ 2013-04-03 23:31 UTC (permalink / raw)
  To: Michael Kerrisk
  Cc: Andrew Morton, Linux Kernel, linux-mm, Mel Gorman, Rik van Riel,
	Johannes Weiner, Hugh Dickins, Sangseok Lee

Hi Michael,

On Wed, Apr 03, 2013 at 11:17:58AM +0200, Michael Kerrisk wrote:
> Hello Minchan,
> 
> On Mon, Mar 25, 2013 at 7:21 AM, Minchan Kim <minchan@kernel.org> wrote:
> > These day, there are many platforms avaiable in the embedded market
> > and they are smarter than kernel which has very limited information
> > about working set so they want to involve memory management more heavily
> > like android's lowmemory killer and ashmem or recent many lowmemory
> > notifier(there was several trial for various company NOKIA, SAMSUNG,
> > Linaro, Google ChromeOS, Redhat).
> >
> > One of the simple imagine scenario about userspace's intelligence is that
> > platform can manage tasks as forground and backgroud so it would be
> > better to reclaim background's task pages for end-user's *responsibility*
> > although it has frequent referenced pages.
> >
> > This patch adds new knob "reclaim under proc/<pid>/" so task manager
> > can reclaim any target process anytime, anywhere. It could give another
> > method to platform for using memory efficiently.
> >
> > It can avoid process killing for getting free memory, which was really
> > terrible experience because I lost my best score of game I had ever
> > after I switch the phone call while I enjoyed the game.
> >
> > Writing 1 to /proc/pid/reclaim reclaims only file pages.
> > Writing 2 to /proc/pid/reclaim reclaims only anonymous pages.
> > Writing 3 to /proc/pid/reclaim reclaims all pages from target process.
> >
> > Signed-off-by: Minchan Kim <minchan@kernel.org>
> > ---
> >  fs/proc/base.c       |   3 ++
> >  fs/proc/internal.h   |   1 +
> >  fs/proc/task_mmu.c   | 115 +++++++++++++++++++++++++++++++++++++++++++++++++++
> >  include/linux/rmap.h |   4 ++
> >  mm/Kconfig           |  13 ++++++
> >  mm/internal.h        |   7 +---
> >  mm/vmscan.c          |  59 ++++++++++++++++++++++++++
> >  7 files changed, 196 insertions(+), 6 deletions(-)
> >
> > diff --git a/fs/proc/base.c b/fs/proc/base.c
> > index 9b43ff77..ed83e85 100644
> 
> [...]
> 
> > +#define RECLAIM_FILE (1 << 0)
> > +#define RECLAIM_ANON (1 << 1)
> > +#define RECLAIM_ALL (RECLAIM_FILE | RECLAIM_ANON)
> > +
> > +static ssize_t reclaim_write(struct file *file, const char __user *buf,
> > +                               size_t count, loff_t *ppos)
> > +{
> > +       struct task_struct *task;
> > +       char buffer[PROC_NUMBUF];
> > +       struct mm_struct *mm;
> > +       struct vm_area_struct *vma;
> > +       int type;
> > +       int rv;
> > +
> > +       memset(buffer, 0, sizeof(buffer));
> > +       if (count > sizeof(buffer) - 1)
> > +               count = sizeof(buffer) - 1;
> > +       if (copy_from_user(buffer, buf, count))
> > +               return -EFAULT;
> > +       rv = kstrtoint(strstrip(buffer), 10, &type);
> > +       if (rv < 0)
> > +               return rv;
> > +       if (type < RECLAIM_ALL || type > RECLAIM_FILE)
> > +               return -EINVAL;
> > +
> > +       task = get_proc_task(file->f_path.dentry->d_inode);
> 
> The check here is the wrong way round. Should be
> 
>        if (type < RECLAIM_FILE || type > RECLAIM_ALL)
> 
> Thanks,

You gave me a chance to remember that "a last-minute change is really evil".
Thanks!
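
For later readers, a short note on why the original test misfires, as a
sketch using only the flag values quoted above (RECLAIM_FILE = 1,
RECLAIM_ANON = 2, RECLAIM_ALL = 3): the reversed comparison
(type < RECLAIM_ALL || type > RECLAIM_FILE) is true for every input,
including the valid values 1..3, so every write would fail with -EINVAL.
The corrected bounds check accepts exactly 1..3:

    if (type < RECLAIM_FILE || type > RECLAIM_ALL)
            return -EINVAL;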

-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC 1/4] mm: Per process reclaim
  2013-04-03 10:10 ` Michael Kerrisk
  2013-04-03 10:12   ` Michael Kerrisk
@ 2013-04-03 23:46   ` Minchan Kim
  1 sibling, 0 replies; 13+ messages in thread
From: Minchan Kim @ 2013-04-03 23:46 UTC (permalink / raw)
  To: Michael Kerrisk
  Cc: Andrew Morton, Linux Kernel, linux-mm, Mel Gorman, Rik van Riel,
	Johannes Weiner, Hugh Dickins, Sangseok Lee

On Wed, Apr 03, 2013 at 12:10:22PM +0200, Michael Kerrisk wrote:
> Hello Minchan,
> 
> On Mon, Mar 25, 2013 at 7:21 AM, Minchan Kim <minchan@kernel.org> wrote:
> > These day, there are many platforms avaiable in the embedded market
> > and they are smarter than kernel which has very limited information
> > about working set so they want to involve memory management more heavily
> > like android's lowmemory killer and ashmem or recent many lowmemory
> > notifier(there was several trial for various company NOKIA, SAMSUNG,
> > Linaro, Google ChromeOS, Redhat).
> >
> > One of the simple imagine scenario about userspace's intelligence is that
> > platform can manage tasks as forground and backgroud so it would be
> > better to reclaim background's task pages for end-user's *responsibility*
> > although it has frequent referenced pages.
> >
> > This patch adds new knob "reclaim under proc/<pid>/" so task manager
> > can reclaim any target process anytime, anywhere. It could give another
> > method to platform for using memory efficiently.
> >
> > It can avoid process killing for getting free memory, which was really
> > terrible experience because I lost my best score of game I had ever
> > after I switch the phone call while I enjoyed the game.
> >
> > Writing 1 to /proc/pid/reclaim reclaims only file pages.
> > Writing 2 to /proc/pid/reclaim reclaims only anonymous pages.
> > Writing 3 to /proc/pid/reclaim reclaims all pages from target process.
> 
> This interface seems to work as advertised, at least from some light
> testing that I've done.

Thanks for the testing!

> 
> However, the interface is quite a blunt instrument. Would there be any
> virtue in extending it so that an address range could be written to
> /proc/PID/reclaim? Used in conjunction with /proc/PID/maps, a manager
> process might then choose to trigger reclaim of just selected regions
> of a process's address space. Thus, one might reclaim file-backed
> pages in a range using:
> 
>     echo '1 start-address end-address' > /proc/PID/reclaim
> 
> What do you think?

It is a really nice idea because some platforms use a single address space
for multiple objects. Simply put, multiple applications could work in one
address space, each using a separate virtual address range within it.
In such a model, per-process reclaim isn't valid any more, so your idea
would be nice for it.

One nitpick is that I'd like to use (address, size) instead of
(start_address, end_address), because it is always confusing whether
end_address itself is inclusive in the range or not.

And I am thinking of other options like these:

RECLAIM_SOFT_[FILE|ANON]

It reclaims only pages that are not part of the working set.

RECLAIM_HARD_[FILE|ANON]

It reclaims pages in the range unconditionally, even if they are shared by
several processes.

But before that, I'd like to hear other people's opinions.
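
Purely as an illustration of what is being floated here -- none of this
exists in the patch set -- one possible encoding of the extra modes,
layered on top of the type bits already defined in the patch, might look
like:

    /* existing type bits from the patch */
    #define RECLAIM_FILE        (1 << 0)
    #define RECLAIM_ANON        (1 << 1)
    /* hypothetical modifier bits */
    #define RECLAIM_SOFT        (1 << 2)    /* skip working-set pages */
    #define RECLAIM_HARD        (1 << 3)    /* reclaim even shared pages */

    #define RECLAIM_SOFT_FILE   (RECLAIM_SOFT | RECLAIM_FILE)
    #define RECLAIM_HARD_ANON   (RECLAIM_HARD | RECLAIM_ANON)

A manager could then write, say, "5 <address> <size>" for a soft,
file-only reclaim of one range, with <size> sidestepping the question of
whether an end address is inclusive.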

> 
> Thanks,
> 
> Michael
> 

-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2013-04-03 23:46 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-03-25  6:21 [RFC 1/4] mm: Per process reclaim Minchan Kim
2013-03-25  6:21 ` [RFC 2/4] mm: make shrink_page_list with pages from multiple zones Minchan Kim
2013-03-25  6:21 ` [RFC 3/4] mm: Remove shrink_page Minchan Kim
2013-03-25  6:21 ` [RFC 4/4] mm: Enhance per process reclaim Minchan Kim
2013-04-02 13:25   ` Michael Kerrisk
2013-04-03  0:23     ` Minchan Kim
2013-04-03  6:16       ` Michael Kerrisk (man-pages)
2013-04-03  6:47         ` Michael Kerrisk (man-pages)
2013-04-03  9:17 ` [RFC 1/4] mm: Per " Michael Kerrisk
2013-04-03 23:31   ` Minchan Kim
2013-04-03 10:10 ` Michael Kerrisk
2013-04-03 10:12   ` Michael Kerrisk
2013-04-03 23:46   ` Minchan Kim
