* [PATCH v5 0/7] Per process reclaim
From: Minchan Kim @ 2013-05-09  7:21 UTC
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Rik van Riel, Michael Kerrisk,
	Dave Hansen, Namhyung Kim, Minkyung Kim, Minchan Kim

These days, many platforms are available in the embedded market and
they are smarter than the kernel, which has very limited information
about the working set, so they want to be involved in memory management
more heavily, as with Android's low memory killer and ashmem, or the
many recent low memory notifiers (there have been several attempts by
various companies: Nokia, Samsung, Linaro, Google ChromeOS, Red Hat).

One simple scenario that shows off userspace's intelligence is that
the platform can manage tasks as foreground and background, so it can
be better for end-user *responsiveness* to reclaim a background task's
pages even if they are frequently referenced.

Patch [1] prepares force_reclaim in shrink_page_list to handle
anonymous pages as well as file-backed pages.

Patch [2] adds a new knob, "reclaim", under /proc/<pid>/ so a task
manager can reclaim from any target process anytime, anywhere. It
gives the platform another method for using memory efficiently.

It can avoid killing processes to get free memory, which is a really
terrible experience: I once lost my best-ever game score because I
switched to a phone call while I was enjoying the game.

Reclaim file-backed pages only.
	echo file > /proc/PID/reclaim
Reclaim anonymous pages only.
	echo anon > /proc/PID/reclaim
Reclaim all pages.
	echo all > /proc/PID/reclaim

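As an illustration, a task manager could drive this knob with a plain
write(2). The helper below is only a sketch against the interface
described above; the function name and the error handling are my own,
not something the patch set ships.

	/* Sketch only: write "file", "anon" or "all" to /proc/<pid>/reclaim. */
	#include <stdio.h>
	#include <string.h>
	#include <fcntl.h>
	#include <unistd.h>
	#include <sys/types.h>

	static int reclaim_task(pid_t pid, const char *type)
	{
		char path[32];
		int fd, ret = 0;

		snprintf(path, sizeof(path), "/proc/%d/reclaim", (int)pid);
		fd = open(path, O_WRONLY);
		if (fd < 0)
			return -1;
		if (write(fd, type, strlen(type)) < 0)
			ret = -1;	/* an unknown type string gets -EINVAL */
		close(fd);
		return ret;
	}

For example, calling reclaim_task(pid, "file") when the platform moves
a task to the background could be a natural first step, since clean
file-backed pages can be dropped without touching swap.
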
Some pages may be shared by several processes (e.g., libc). In that
case, it's wasteful to reclaim them from the beginning. Patch [5]
makes the VM keep them in memory until the last task tries to reclaim
them, so shared pages are reclaimed only once every task that maps
them has tried to swap them out.

Another requirement is address range reclaim (from Michael Kerrisk).
WebKit1, for instance, uses a single address space for handling
multiple tabs. IOW, it uses a *one*-process model, so all tabs share
the process's address space. In such a scenario, per-process reclaim
is rather coarse-grained, so patch [6] supports more fine-grained
reclaim that can target an address range of the process. To reclaim a
target range, use the following format.

	echo [addr] [size-byte] > /proc/pid/reclaim

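A corresponding sketch for the range form, reusing the includes above.
One assumption on my part: since v2 switched the parser to memparse, a
0x-prefixed hex address should be accepted; the helper itself is
illustrative and not part of the patches.

	/* Sketch only: reclaim @size bytes starting at @addr of @pid. */
	static int reclaim_range(pid_t pid, unsigned long addr,
				 unsigned long size)
	{
		char path[32], cmd[48];
		int fd, len, ret = 0;

		snprintf(path, sizeof(path), "/proc/%d/reclaim", (int)pid);
		/* assumed format: hex address, then byte count in decimal */
		len = snprintf(cmd, sizeof(cmd), "0x%lx %lu", addr, size);
		fd = open(path, O_WRONLY);
		if (fd < 0)
			return -1;
		if (write(fd, cmd, len) < 0)
			ret = -1;
		close(fd);
		return ret;
	}
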
* Changelog from v4
  * Fix anonymous page write out - Minkyung Kim

* Changelog from v3
  * Rebased on next-20130508
  * Minor change

* Changelog from v2
* Use memparse - Namhyung Kim
  * Add Acked-by

* Changelog from v1
  * Change reclaim knob interface - Dave Hansen
  * proc.txt document change - Rob Landley

Minchan Kim (7):
  [1] mm: prevent to write out dirty page in CMA by may_writepage
  [2] mm: Per process reclaim
  [3] mm: make shrink_page_list with pages work from multiple zones
  [4] mm: Remove shrink_page
  [5] mm: Enhance per process reclaim to consider shared pages
  [6] mm: Support address range reclaim
  [7] add documentation about reclaim knob on proc.txt

 Documentation/filesystems/proc.txt |  20 +++++
 fs/proc/base.c                     |   3 +
 fs/proc/internal.h                 |   1 +
 fs/proc/task_mmu.c                 | 176 +++++++++++++++++++++++++++++++++++++
 include/linux/ksm.h                |   6 +-
 include/linux/rmap.h               |  10 ++-
 mm/Kconfig                         |  16 ++++
 mm/ksm.c                           |   9 +-
 mm/memory-failure.c                |   2 +-
 mm/migrate.c                       |   6 +-
 mm/rmap.c                          |  57 ++++++++----
 mm/vmscan.c                        |  62 ++++++++++++-
 12 files changed, 340 insertions(+), 28 deletions(-)

-- 
1.8.2.1


* [PATCH v5 1/7] mm: prevent to write out dirty page in CMA by may_writepage
From: Minchan Kim @ 2013-05-09  7:21 UTC
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Rik van Riel, Michael Kerrisk,
	Dave Hansen, Namhyung Kim, Minkyung Kim, Minchan Kim,
	Marek Szyprowski, Mel Gorman

Currently, the local variable "references" in shrink_page_list
defaults to PAGEREF_RECLAIM_CLEAN. That is meant to prevent reclaiming
dirty pages while CMA tries to migrate pages.
Strictly speaking, we don't need it because CMA already forbids
writeout via .may_writepage = 0 in reclaim_clean_pages_from_list.

Moreover, it has the problem of preventing anonymous pages from being
swapped out when we use force_reclaim = true in shrink_page_list
(e.g., per-process reclaim does that).

So this patch changes the default value of "references" to
PAGEREF_RECLAIM and declares .may_writepage = 0 in the CMA path's
scan_control to make the code clearer.

Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Mel Gorman <mgorman@suse.de>
Reported-by: Minkyung Kim <minkyung88@lge.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 mm/vmscan.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index fa6a853..c22f9c1 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -695,7 +695,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 		struct address_space *mapping;
 		struct page *page;
 		int may_enter_fs;
-		enum page_references references = PAGEREF_RECLAIM_CLEAN;
+		enum page_references references = PAGEREF_RECLAIM;
 
 		cond_resched();
 
@@ -972,6 +972,8 @@ unsigned long reclaim_clean_pages_from_list(struct zone *zone,
 		.gfp_mask = GFP_KERNEL,
 		.priority = DEF_PRIORITY,
 		.may_unmap = 1,
+		/* Don't allow writing out dirty pages */
+		.may_writepage = 0,
 	};
 	unsigned long ret, dummy1, dummy2;
 	struct page *page, *next;
-- 
1.8.2.1


* [PATCH v5 2/7] mm: Per process reclaim
From: Minchan Kim @ 2013-05-09  7:21 UTC
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Rik van Riel, Michael Kerrisk,
	Dave Hansen, Namhyung Kim, Minkyung Kim, Minchan Kim

These days, many platforms are available in the embedded market and
they are smarter than the kernel, which has very limited information
about the working set, so they want to be involved in memory management
more heavily, as with Android's low memory killer and ashmem, or the
many recent low memory notifiers (there have been several attempts by
various companies: Nokia, Samsung, Linaro, Google ChromeOS, Red Hat).

One simple scenario that shows off userspace's intelligence is that
the platform can manage tasks as foreground and background, so it can
be better for end-user *responsiveness* to reclaim a background task's
pages even if they are frequently referenced.

This patch adds a new knob, "reclaim", under /proc/<pid>/ so a task
manager can reclaim from any target process anytime, anywhere. It
gives the platform another method for using memory efficiently.

It can avoid killing processes to get free memory, which is a really
terrible experience: I once lost my best-ever game score because I
switched to a phone call while I was enjoying the game.

Reclaim file-backed pages only.
	echo file > /proc/PID/reclaim
Reclaim anonymous pages only.
	echo anon > /proc/PID/reclaim
Reclaim all pages.
	echo all > /proc/PID/reclaim

Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 fs/proc/base.c       |   3 ++
 fs/proc/internal.h   |   1 +
 fs/proc/task_mmu.c   | 121 +++++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/rmap.h |   4 ++
 mm/Kconfig           |  13 ++++++
 mm/vmscan.c          |  60 +++++++++++++++++++++++++
 6 files changed, 202 insertions(+)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index 4d3ebd6..9286b43 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2669,6 +2669,9 @@ static const struct pid_entry tgid_base_stuff[] = {
 	REG("mounts",     S_IRUGO, proc_mounts_operations),
 	REG("mountinfo",  S_IRUGO, proc_mountinfo_operations),
 	REG("mountstats", S_IRUSR, proc_mountstats_operations),
+#ifdef CONFIG_PROCESS_RECLAIM
+	REG("reclaim", S_IWUSR, proc_reclaim_operations),
+#endif
 #ifdef CONFIG_PROC_PAGE_MONITOR
 	REG("clear_refs", S_IWUSR, proc_clear_refs_operations),
 	REG("smaps",      S_IRUGO, proc_pid_smaps_operations),
diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index f417d43..0c29375 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -204,6 +204,7 @@ struct pde_opener {
 };
 
 extern const struct inode_operations proc_pid_link_inode_operations;
+extern const struct file_operations proc_reclaim_operations;
 
 extern void proc_init_inodecache(void);
 extern struct inode *proc_get_inode(struct super_block *, struct proc_dir_entry *);
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 3240a49..cd6bb70 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -11,6 +11,7 @@
 #include <linux/rmap.h>
 #include <linux/swap.h>
 #include <linux/swapops.h>
+#include <linux/mm_inline.h>
 
 #include <asm/elf.h>
 #include <asm/uaccess.h>
@@ -1182,6 +1183,126 @@ const struct file_operations proc_pagemap2_operations = {
 };
 #endif /* CONFIG_PROC_PAGE_MONITOR */
 
+#ifdef CONFIG_PROCESS_RECLAIM
+static int reclaim_pte_range(pmd_t *pmd, unsigned long addr,
+				unsigned long end, struct mm_walk *walk)
+{
+	struct vm_area_struct *vma = walk->private;
+	pte_t *pte, ptent;
+	spinlock_t *ptl;
+	struct page *page;
+	LIST_HEAD(page_list);
+	int isolated;
+
+	split_huge_page_pmd(vma, addr, pmd);
+	if (pmd_trans_unstable(pmd))
+		return 0;
+cont:
+	isolated = 0;
+	pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
+	for (; addr != end; pte++, addr += PAGE_SIZE) {
+		ptent = *pte;
+		if (!pte_present(ptent))
+			continue;
+
+		page = vm_normal_page(vma, addr, ptent);
+		if (!page)
+			continue;
+
+		if (isolate_lru_page(page))
+			continue;
+
+		list_add(&page->lru, &page_list);
+		inc_zone_page_state(page, NR_ISOLATED_ANON +
+				page_is_file_cache(page));
+		isolated++;
+		if (isolated >= SWAP_CLUSTER_MAX)
+			break;
+	}
+	pte_unmap_unlock(pte - 1, ptl);
+	reclaim_pages_from_list(&page_list);
+	if (addr != end)
+		goto cont;
+
+	cond_resched();
+	return 0;
+}
+
+enum reclaim_type {
+	RECLAIM_FILE,
+	RECLAIM_ANON,
+	RECLAIM_ALL,
+	RECLAIM_RANGE,
+};
+
+static ssize_t reclaim_write(struct file *file, const char __user *buf,
+				size_t count, loff_t *ppos)
+{
+	struct task_struct *task;
+	char buffer[PROC_NUMBUF];
+	struct mm_struct *mm;
+	struct vm_area_struct *vma;
+	enum reclaim_type type;
+	char *type_buf;
+
+	memset(buffer, 0, sizeof(buffer));
+	if (count > sizeof(buffer) - 1)
+		count = sizeof(buffer) - 1;
+
+	if (copy_from_user(buffer, buf, count))
+		return -EFAULT;
+
+	type_buf = strstrip(buffer);
+	if (!strcmp(type_buf, "file"))
+		type = RECLAIM_FILE;
+	else if (!strcmp(type_buf, "anon"))
+		type = RECLAIM_ANON;
+	else if (!strcmp(type_buf, "all"))
+		type = RECLAIM_ALL;
+	else
+		return -EINVAL;
+
+	task = get_proc_task(file->f_path.dentry->d_inode);
+	if (!task)
+		return -ESRCH;
+
+	mm = get_task_mm(task);
+	if (mm) {
+		struct mm_walk reclaim_walk = {
+			.pmd_entry = reclaim_pte_range,
+			.mm = mm,
+		};
+
+		down_read(&mm->mmap_sem);
+		for (vma = mm->mmap; vma; vma = vma->vm_next) {
+			reclaim_walk.private = vma;
+
+			if (is_vm_hugetlb_page(vma))
+				continue;
+
+			if (type == RECLAIM_ANON && vma->vm_file)
+				continue;
+			if (type == RECLAIM_FILE && !vma->vm_file)
+				continue;
+
+			walk_page_range(vma->vm_start, vma->vm_end,
+					&reclaim_walk);
+		}
+		flush_tlb_mm(mm);
+		up_read(&mm->mmap_sem);
+		mmput(mm);
+	}
+	put_task_struct(task);
+
+	return count;
+}
+
+const struct file_operations proc_reclaim_operations = {
+	.write		= reclaim_write,
+	.llseek		= noop_llseek,
+};
+#endif
+
 #ifdef CONFIG_NUMA
 
 struct numa_maps {
diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 6dacb93..a24e34e 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -10,6 +10,10 @@
 #include <linux/rwsem.h>
 #include <linux/memcontrol.h>
 
+extern int isolate_lru_page(struct page *page);
+extern void putback_lru_page(struct page *page);
+extern unsigned long reclaim_pages_from_list(struct list_head *page_list);
+
 /*
  * The anon_vma heads a list of private "related" vmas, to scan if
  * an anonymous page pointing to this anon_vma needs to be unmapped:
diff --git a/mm/Kconfig b/mm/Kconfig
index b0a7dad..d119b5b 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -489,3 +489,16 @@ config MEM_SOFT_DIRTY
 	  it can be cleared by hands.
 
 	  See Documentation/vm/soft-dirty.txt for more details.
+
+config PROCESS_RECLAIM
+	bool "Enable process reclaim"
+	depends on PROC_FS
+	default n
+	help
+	 It allows reclaiming pages of a process via /proc/pid/reclaim.
+
+	 (echo file > /proc/PID/reclaim) reclaims file-backed pages only.
+	 (echo anon > /proc/PID/reclaim) reclaims anonymous pages only.
+	 (echo all > /proc/PID/reclaim) reclaims all pages.
+
+	 Any other value is ignored.
diff --git a/mm/vmscan.c b/mm/vmscan.c
index c22f9c1..93f3b79 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -994,6 +994,66 @@ unsigned long reclaim_clean_pages_from_list(struct zone *zone,
 	return ret;
 }
 
+#ifdef CONFIG_PROCESS_RECLAIM
+static unsigned long shrink_page(struct page *page,
+					struct zone *zone,
+					struct scan_control *sc,
+					enum ttu_flags ttu_flags,
+					unsigned long *ret_nr_dirty,
+					unsigned long *ret_nr_writeback,
+					bool force_reclaim,
+					struct list_head *ret_pages)
+{
+	int reclaimed;
+	LIST_HEAD(page_list);
+	list_add(&page->lru, &page_list);
+
+	reclaimed = shrink_page_list(&page_list, zone, sc, ttu_flags,
+				ret_nr_dirty, ret_nr_writeback,
+				force_reclaim);
+	if (!reclaimed)
+		list_splice(&page_list, ret_pages);
+
+	return reclaimed;
+}
+
+unsigned long reclaim_pages_from_list(struct list_head *page_list)
+{
+	struct scan_control sc = {
+		.gfp_mask = GFP_KERNEL,
+		.priority = DEF_PRIORITY,
+		.may_writepage = 1,
+		.may_unmap = 1,
+		.may_swap = 1,
+	};
+
+	LIST_HEAD(ret_pages);
+	struct page *page;
+	unsigned long dummy1, dummy2;
+	unsigned long nr_reclaimed = 0;
+
+	while (!list_empty(page_list)) {
+		page = lru_to_page(page_list);
+		list_del(&page->lru);
+
+		ClearPageActive(page);
+		nr_reclaimed += shrink_page(page, page_zone(page), &sc,
+				TTU_UNMAP|TTU_IGNORE_ACCESS,
+				&dummy1, &dummy2, true, &ret_pages);
+	}
+
+	while (!list_empty(&ret_pages)) {
+		page = lru_to_page(&ret_pages);
+		list_del(&page->lru);
+		dec_zone_page_state(page, NR_ISOLATED_ANON +
+				page_is_file_cache(page));
+		putback_lru_page(page);
+	}
+
+	return nr_reclaimed;
+}
+#endif
+
 /*
  * Attempt to remove the specified page from its LRU.  Only take this page
  * if it is of the appropriate PageActive status.  Pages which are being
-- 
1.8.2.1


* [PATCH v5 3/7] mm: make shrink_page_list with pages work from multiple zones
From: Minchan Kim @ 2013-05-09  7:21 UTC
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Rik van Riel, Michael Kerrisk,
	Dave Hansen, Namhyung Kim, Minkyung Kim, Minchan Kim

shrink_page_list expects all pages to come from the same zone, but
that is too limiting for new users.

This patch removes the dependency so the next patch can use
shrink_page_list with pages from multiple zones.

Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 mm/vmscan.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 93f3b79..d69bfce 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -706,7 +706,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 			goto keep;
 
 		VM_BUG_ON(PageActive(page));
-		VM_BUG_ON(page_zone(page) != zone);
+		if (zone)
+			VM_BUG_ON(page_zone(page) != zone);
 
 		sc->nr_scanned++;
 
@@ -952,7 +953,7 @@ keep:
 	 * back off and wait for congestion to clear because further reclaim
 	 * will encounter the same problem
 	 */
-	if (nr_dirty && nr_dirty == nr_congested && global_reclaim(sc))
+	if (nr_dirty && nr_dirty == nr_congested && global_reclaim(sc) && zone)
 		zone_set_flag(zone, ZONE_CONGESTED);
 
 	free_hot_cold_page_list(&free_pages, 1);
-- 
1.8.2.1


* [PATCH v5 4/7] mm: Remove shrink_page
From: Minchan Kim @ 2013-05-09  7:21 UTC
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Rik van Riel, Michael Kerrisk,
	Dave Hansen, Namhyung Kim, Minkyung Kim, Minchan Kim

With the previous patch, shrink_page_list can handle pages from
multiple zones, so let's remove shrink_page.

Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 mm/vmscan.c | 47 ++++++++++++++---------------------------------
 1 file changed, 14 insertions(+), 33 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index d69bfce..3243705 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -924,6 +924,13 @@ free_it:
 		 * appear not as the counts should be low
 		 */
 		list_add(&page->lru, &free_pages);
+		/*
+		 * If the page list is from multiple zones, we should decrease
+		 * NR_ISOLATED_ANON + x for freed pages here.
+		 */
+		if (!zone)
+			dec_zone_page_state(page, NR_ISOLATED_ANON +
+					page_is_file_cache(page));
 		continue;
 
 cull_mlocked:
@@ -996,28 +1003,6 @@ unsigned long reclaim_clean_pages_from_list(struct zone *zone,
 }
 
 #ifdef CONFIG_PROCESS_RECLAIM
-static unsigned long shrink_page(struct page *page,
-					struct zone *zone,
-					struct scan_control *sc,
-					enum ttu_flags ttu_flags,
-					unsigned long *ret_nr_dirty,
-					unsigned long *ret_nr_writeback,
-					bool force_reclaim,
-					struct list_head *ret_pages)
-{
-	int reclaimed;
-	LIST_HEAD(page_list);
-	list_add(&page->lru, &page_list);
-
-	reclaimed = shrink_page_list(&page_list, zone, sc, ttu_flags,
-				ret_nr_dirty, ret_nr_writeback,
-				force_reclaim);
-	if (!reclaimed)
-		list_splice(&page_list, ret_pages);
-
-	return reclaimed;
-}
-
 unsigned long reclaim_pages_from_list(struct list_head *page_list)
 {
 	struct scan_control sc = {
@@ -1028,23 +1013,19 @@ unsigned long reclaim_pages_from_list(struct list_head *page_list)
 		.may_swap = 1,
 	};
 
-	LIST_HEAD(ret_pages);
+	unsigned long nr_reclaimed;
 	struct page *page;
 	unsigned long dummy1, dummy2;
-	unsigned long nr_reclaimed = 0;
-
-	while (!list_empty(page_list)) {
-		page = lru_to_page(page_list);
-		list_del(&page->lru);
 
+	list_for_each_entry(page, page_list, lru)
 		ClearPageActive(page);
-		nr_reclaimed += shrink_page(page, page_zone(page), &sc,
+
+	nr_reclaimed = shrink_page_list(page_list, NULL, &sc,
 				TTU_UNMAP|TTU_IGNORE_ACCESS,
-				&dummy1, &dummy2, true, &ret_pages);
-	}
+				&dummy1, &dummy2, true);
 
-	while (!list_empty(&ret_pages)) {
-		page = lru_to_page(&ret_pages);
+	while (!list_empty(page_list)) {
+		page = lru_to_page(page_list);
 		list_del(&page->lru);
 		dec_zone_page_state(page, NR_ISOLATED_ANON +
 				page_is_file_cache(page));
-- 
1.8.2.1


* [PATCH v5 5/7] mm: Enhance per process reclaim to consider shared pages
From: Minchan Kim @ 2013-05-09  7:21 UTC
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Rik van Riel, Michael Kerrisk,
	Dave Hansen, Namhyung Kim, Minkyung Kim, Minchan Kim,
	Sangseok Lee

Some pages may be shared by several processes (e.g., libc). In that
case, it's wasteful to reclaim them from the beginning.

This patch makes the VM keep them in memory until the last task tries
to reclaim them, so shared pages are reclaimed only once every task
that maps them has tried to swap them out.

This feature doesn't handle non-linear mappings on ramfs because doing
so is very time-consuming, doesn't guarantee reclaim, and isn't a
common case.

Signed-off-by: Sangseok Lee <sangseok.lee@lge.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 fs/proc/task_mmu.c   |  2 +-
 include/linux/ksm.h  |  6 ++++--
 include/linux/rmap.h |  8 +++++---
 mm/ksm.c             |  9 ++++++++-
 mm/memory-failure.c  |  2 +-
 mm/migrate.c         |  6 ++++--
 mm/rmap.c            | 57 +++++++++++++++++++++++++++++++++++++---------------
 mm/vmscan.c          | 14 +++++++++++--
 8 files changed, 76 insertions(+), 28 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index cd6bb70..ccc97b1 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -1220,7 +1220,7 @@ cont:
 			break;
 	}
 	pte_unmap_unlock(pte - 1, ptl);
-	reclaim_pages_from_list(&page_list);
+	reclaim_pages_from_list(&page_list, vma);
 	if (addr != end)
 		goto cont;
 
diff --git a/include/linux/ksm.h b/include/linux/ksm.h
index 45c9b6a..d8e556b 100644
--- a/include/linux/ksm.h
+++ b/include/linux/ksm.h
@@ -75,7 +75,8 @@ struct page *ksm_might_need_to_copy(struct page *page,
 
 int page_referenced_ksm(struct page *page,
 			struct mem_cgroup *memcg, unsigned long *vm_flags);
-int try_to_unmap_ksm(struct page *page, enum ttu_flags flags);
+int try_to_unmap_ksm(struct page *page,
+			enum ttu_flags flags, struct vm_area_struct *vma);
 int rmap_walk_ksm(struct page *page, int (*rmap_one)(struct page *,
 		  struct vm_area_struct *, unsigned long, void *), void *arg);
 void ksm_migrate_page(struct page *newpage, struct page *oldpage);
@@ -115,7 +116,8 @@ static inline int page_referenced_ksm(struct page *page,
 	return 0;
 }
 
-static inline int try_to_unmap_ksm(struct page *page, enum ttu_flags flags)
+static inline int try_to_unmap_ksm(struct page *page,
+			enum ttu_flags flags, struct vm_area_struct *target_vma)
 {
 	return 0;
 }
diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index a24e34e..6c7d030 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -12,7 +12,8 @@
 
 extern int isolate_lru_page(struct page *page);
 extern void putback_lru_page(struct page *page);
-extern unsigned long reclaim_pages_from_list(struct list_head *page_list);
+extern unsigned long reclaim_pages_from_list(struct list_head *page_list,
+					     struct vm_area_struct *vma);
 
 /*
  * The anon_vma heads a list of private "related" vmas, to scan if
@@ -192,7 +193,8 @@ int page_referenced_one(struct page *, struct vm_area_struct *,
 
 #define TTU_ACTION(x) ((x) & TTU_ACTION_MASK)
 
-int try_to_unmap(struct page *, enum ttu_flags flags);
+int try_to_unmap(struct page *, enum ttu_flags flags,
+			struct vm_area_struct *vma);
 int try_to_unmap_one(struct page *, struct vm_area_struct *,
 			unsigned long address, enum ttu_flags flags);
 
@@ -259,7 +261,7 @@ static inline int page_referenced(struct page *page, int is_locked,
 	return 0;
 }
 
-#define try_to_unmap(page, refs) SWAP_FAIL
+#define try_to_unmap(page, refs, vma) SWAP_FAIL
 
 static inline int page_mkclean(struct page *page)
 {
diff --git a/mm/ksm.c b/mm/ksm.c
index b6afe0c..2efdfd2 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -1946,7 +1946,8 @@ out:
 	return referenced;
 }
 
-int try_to_unmap_ksm(struct page *page, enum ttu_flags flags)
+int try_to_unmap_ksm(struct page *page, enum ttu_flags flags,
+			struct vm_area_struct *target_vma)
 {
 	struct stable_node *stable_node;
 	struct rmap_item *rmap_item;
@@ -1959,6 +1960,12 @@ int try_to_unmap_ksm(struct page *page, enum ttu_flags flags)
 	stable_node = page_stable_node(page);
 	if (!stable_node)
 		return SWAP_FAIL;
+
+	if (target_vma) {
+		unsigned long address = vma_address(page, target_vma);
+		ret = try_to_unmap_one(page, target_vma, address, flags);
+		goto out;
+	}
 again:
 	hlist_for_each_entry(rmap_item, &stable_node->hlist, hlist) {
 		struct anon_vma *anon_vma = rmap_item->anon_vma;
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index ceb0c7f..f3928e4 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -955,7 +955,7 @@ static int hwpoison_user_mappings(struct page *p, unsigned long pfn,
 	if (hpage != ppage)
 		lock_page(ppage);
 
-	ret = try_to_unmap(ppage, ttu);
+	ret = try_to_unmap(ppage, ttu, NULL);
 	if (ret != SWAP_SUCCESS)
 		printk(KERN_ERR "MCE %#lx: failed to unmap page (mapcount=%d)\n",
 				pfn, page_mapcount(ppage));
diff --git a/mm/migrate.c b/mm/migrate.c
index c9c5eee..aef29a0 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -820,7 +820,8 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
 	}
 
 	/* Establish migration ptes or remove ptes */
-	try_to_unmap(page, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
+	try_to_unmap(page, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS,
+			NULL);
 
 skip_unmap:
 	if (!page_mapped(page))
@@ -947,7 +948,8 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
 	if (PageAnon(hpage))
 		anon_vma = page_get_anon_vma(hpage);
 
-	try_to_unmap(hpage, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
+	try_to_unmap(hpage, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS,
+						NULL);
 
 	if (!page_mapped(hpage))
 		rc = move_to_new_page(new_hpage, hpage, 1, mode);
diff --git a/mm/rmap.c b/mm/rmap.c
index 6280da8..43718fc 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1435,13 +1435,16 @@ bool is_vma_temporary_stack(struct vm_area_struct *vma)
 
 /**
  * try_to_unmap_anon - unmap or unlock anonymous page using the object-based
- * rmap method
+ * rmap method if @vma is NULL
  * @page: the page to unmap/unlock
  * @flags: action and flags
+ * @target_vma: vma for unmapping a @page
  *
  * Find all the mappings of a page using the mapping pointer and the vma chains
  * contained in the anon_vma struct it points to.
  *
+ * If @target_vma isn't NULL, this function unmaps the page from that vma
+ *
  * This function is only called from try_to_unmap/try_to_munlock for
  * anonymous pages.
  * When called from try_to_munlock(), the mmap_sem of the mm containing the vma
@@ -1449,12 +1452,19 @@ bool is_vma_temporary_stack(struct vm_area_struct *vma)
  * vm_flags for that VMA.  That should be OK, because that vma shouldn't be
  * 'LOCKED.
  */
-static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
+static int try_to_unmap_anon(struct page *page, enum ttu_flags flags,
+					struct vm_area_struct *target_vma)
 {
+	int ret = SWAP_AGAIN;
+	unsigned long address;
 	struct anon_vma *anon_vma;
 	pgoff_t pgoff;
 	struct anon_vma_chain *avc;
-	int ret = SWAP_AGAIN;
+
+	if (target_vma) {
+		address = vma_address(page, target_vma);
+		return try_to_unmap_one(page, target_vma, address, flags);
+	}
 
 	anon_vma = page_lock_anon_vma_read(page);
 	if (!anon_vma)
@@ -1463,7 +1473,6 @@ static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
 	pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
 	anon_vma_interval_tree_foreach(avc, &anon_vma->rb_root, pgoff, pgoff) {
 		struct vm_area_struct *vma = avc->vma;
-		unsigned long address;
 
 		/*
 		 * During exec, a temporary VMA is setup and later moved.
@@ -1491,6 +1500,7 @@ static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
  * try_to_unmap_file - unmap/unlock file page using the object-based rmap method
  * @page: the page to unmap/unlock
  * @flags: action and flags
+ * @target_vma: vma for unmapping @page
  *
  * Find all the mappings of a page using the mapping pointer and the vma chains
  * contained in the address_space struct it points to.
@@ -1502,7 +1512,8 @@ static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
  * vm_flags for that VMA.  That should be OK, because that vma shouldn't be
  * 'LOCKED.
  */
-static int try_to_unmap_file(struct page *page, enum ttu_flags flags)
+static int try_to_unmap_file(struct page *page, enum ttu_flags flags,
+				struct vm_area_struct *target_vma)
 {
 	struct address_space *mapping = page->mapping;
 	pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
@@ -1512,16 +1523,26 @@ static int try_to_unmap_file(struct page *page, enum ttu_flags flags)
 	unsigned long max_nl_cursor = 0;
 	unsigned long max_nl_size = 0;
 	unsigned int mapcount;
+	unsigned long address;
 
 	if (PageHuge(page))
 		pgoff = page->index << compound_order(page);
 
 	mutex_lock(&mapping->i_mmap_mutex);
-	vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
-		unsigned long address = vma_address(page, vma);
-		ret = try_to_unmap_one(page, vma, address, flags);
-		if (ret != SWAP_AGAIN || !page_mapped(page))
+	if (target_vma) {
+		/* We don't handle non-linear vma on ramfs */
+		if (unlikely(!list_empty(&mapping->i_mmap_nonlinear)))
 			goto out;
+		address = vma_address(page, target_vma);
+		ret = try_to_unmap_one(page, target_vma, address, flags);
+		goto out;
+	} else {
+		vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
+			address = vma_address(page, vma);
+			ret = try_to_unmap_one(page, vma, address, flags);
+			if (ret != SWAP_AGAIN || !page_mapped(page))
+				goto out;
+		}
 	}
 
 	if (list_empty(&mapping->i_mmap_nonlinear))
@@ -1602,9 +1623,12 @@ out:
  * try_to_unmap - try to remove all page table mappings to a page
  * @page: the page to get unmapped
  * @flags: action and flags
+ * @vma : target vma for reclaim
  *
  * Tries to remove all the page table entries which are mapping this
  * page, used in the pageout path.  Caller must hold the page lock.
+ * If @vma is not NULL, this function tries to remove @page only from @vma
+ * without walking all the vmas that map @page.
  * Return values are:
  *
  * SWAP_SUCCESS	- we succeeded in removing all mappings
@@ -1612,7 +1636,8 @@ out:
  * SWAP_FAIL	- the page is unswappable
  * SWAP_MLOCK	- page is mlocked.
  */
-int try_to_unmap(struct page *page, enum ttu_flags flags)
+int try_to_unmap(struct page *page, enum ttu_flags flags,
+				struct vm_area_struct *vma)
 {
 	int ret;
 
@@ -1620,11 +1645,11 @@ int try_to_unmap(struct page *page, enum ttu_flags flags)
 	VM_BUG_ON(!PageHuge(page) && PageTransHuge(page));
 
 	if (unlikely(PageKsm(page)))
-		ret = try_to_unmap_ksm(page, flags);
+		ret = try_to_unmap_ksm(page, flags, vma);
 	else if (PageAnon(page))
-		ret = try_to_unmap_anon(page, flags);
+		ret = try_to_unmap_anon(page, flags, vma);
 	else
-		ret = try_to_unmap_file(page, flags);
+		ret = try_to_unmap_file(page, flags, vma);
 	if (ret != SWAP_MLOCK && !page_mapped(page))
 		ret = SWAP_SUCCESS;
 	return ret;
@@ -1650,11 +1675,11 @@ int try_to_munlock(struct page *page)
 	VM_BUG_ON(!PageLocked(page) || PageLRU(page));
 
 	if (unlikely(PageKsm(page)))
-		return try_to_unmap_ksm(page, TTU_MUNLOCK);
+		return try_to_unmap_ksm(page, TTU_MUNLOCK, NULL);
 	else if (PageAnon(page))
-		return try_to_unmap_anon(page, TTU_MUNLOCK);
+		return try_to_unmap_anon(page, TTU_MUNLOCK, NULL);
 	else
-		return try_to_unmap_file(page, TTU_MUNLOCK);
+		return try_to_unmap_file(page, TTU_MUNLOCK, NULL);
 }
 
 void __put_anon_vma(struct anon_vma *anon_vma)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 3243705..75f5b5d 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -93,6 +93,13 @@ struct scan_control {
 	 * are scanned.
 	 */
 	nodemask_t	*nodemask;
+
+	/*
+	 * Reclaim pages from a vma. If a page is shared by other tasks,
+	 * it is zapped from this vma without being reclaimed, so it
+	 * remains in memory until the last task zaps it.
+	 */
+	struct vm_area_struct *target_vma;
 };
 
 #define lru_to_page(_head) (list_entry((_head)->prev, struct page, lru))
@@ -794,7 +801,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 		 * processes. Try to unmap it here.
 		 */
 		if (page_mapped(page) && mapping) {
-			switch (try_to_unmap(page, ttu_flags)) {
+			switch (try_to_unmap(page,
+					ttu_flags, sc->target_vma)) {
 			case SWAP_FAIL:
 				goto activate_locked;
 			case SWAP_AGAIN:
@@ -1003,7 +1011,8 @@ unsigned long reclaim_clean_pages_from_list(struct zone *zone,
 }
 
 #ifdef CONFIG_PROCESS_RECLAIM
-unsigned long reclaim_pages_from_list(struct list_head *page_list)
+unsigned long reclaim_pages_from_list(struct list_head *page_list,
+					struct vm_area_struct *vma)
 {
 	struct scan_control sc = {
 		.gfp_mask = GFP_KERNEL,
@@ -1011,6 +1020,7 @@ unsigned long reclaim_pages_from_list(struct list_head *page_list)
 		.may_writepage = 1,
 		.may_unmap = 1,
 		.may_swap = 1,
+		.target_vma = vma,
 	};
 
 	unsigned long nr_reclaimed;
-- 
1.8.2.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v5 5/7] mm: Enhance per process reclaim to consider shared pages
@ 2013-05-09  7:21   ` Minchan Kim
  0 siblings, 0 replies; 24+ messages in thread
From: Minchan Kim @ 2013-05-09  7:21 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Rik van Riel, Michael Kerrisk,
	Dave Hansen, Namhyung Kim, Minkyung Kim, Minchan Kim,
	Sangseok Lee

Some pages could be shared by several processes. (ex, libc)
In case of that, it's too bad to reclaim them from the beginnig.

This patch causes VM to keep them on memory until last task
try to reclaim them so shared pages will be reclaimed only if
all of task has gone swapping out.

This feature doesn't handle non-linear mapping on ramfs because
it's very time-consuming and doesn't make sure of reclaiming and
not common.

Signed-off-by: Sangseok Lee <sangseok.lee@lge.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 fs/proc/task_mmu.c   |  2 +-
 include/linux/ksm.h  |  6 ++++--
 include/linux/rmap.h |  8 +++++---
 mm/ksm.c             |  9 ++++++++-
 mm/memory-failure.c  |  2 +-
 mm/migrate.c         |  6 ++++--
 mm/rmap.c            | 57 +++++++++++++++++++++++++++++++++++++---------------
 mm/vmscan.c          | 14 +++++++++++--
 8 files changed, 76 insertions(+), 28 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index cd6bb70..ccc97b1 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -1220,7 +1220,7 @@ cont:
 			break;
 	}
 	pte_unmap_unlock(pte - 1, ptl);
-	reclaim_pages_from_list(&page_list);
+	reclaim_pages_from_list(&page_list, vma);
 	if (addr != end)
 		goto cont;
 
diff --git a/include/linux/ksm.h b/include/linux/ksm.h
index 45c9b6a..d8e556b 100644
--- a/include/linux/ksm.h
+++ b/include/linux/ksm.h
@@ -75,7 +75,8 @@ struct page *ksm_might_need_to_copy(struct page *page,
 
 int page_referenced_ksm(struct page *page,
 			struct mem_cgroup *memcg, unsigned long *vm_flags);
-int try_to_unmap_ksm(struct page *page, enum ttu_flags flags);
+int try_to_unmap_ksm(struct page *page,
+			enum ttu_flags flags, struct vm_area_struct *vma);
 int rmap_walk_ksm(struct page *page, int (*rmap_one)(struct page *,
 		  struct vm_area_struct *, unsigned long, void *), void *arg);
 void ksm_migrate_page(struct page *newpage, struct page *oldpage);
@@ -115,7 +116,8 @@ static inline int page_referenced_ksm(struct page *page,
 	return 0;
 }
 
-static inline int try_to_unmap_ksm(struct page *page, enum ttu_flags flags)
+static inline int try_to_unmap_ksm(struct page *page,
+			enum ttu_flags flags, struct vm_area_struct *target_vma)
 {
 	return 0;
 }
diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index a24e34e..6c7d030 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -12,7 +12,8 @@
 
 extern int isolate_lru_page(struct page *page);
 extern void putback_lru_page(struct page *page);
-extern unsigned long reclaim_pages_from_list(struct list_head *page_list);
+extern unsigned long reclaim_pages_from_list(struct list_head *page_list,
+					     struct vm_area_struct *vma);
 
 /*
  * The anon_vma heads a list of private "related" vmas, to scan if
@@ -192,7 +193,8 @@ int page_referenced_one(struct page *, struct vm_area_struct *,
 
 #define TTU_ACTION(x) ((x) & TTU_ACTION_MASK)
 
-int try_to_unmap(struct page *, enum ttu_flags flags);
+int try_to_unmap(struct page *, enum ttu_flags flags,
+			struct vm_area_struct *vma);
 int try_to_unmap_one(struct page *, struct vm_area_struct *,
 			unsigned long address, enum ttu_flags flags);
 
@@ -259,7 +261,7 @@ static inline int page_referenced(struct page *page, int is_locked,
 	return 0;
 }
 
-#define try_to_unmap(page, refs) SWAP_FAIL
+#define try_to_unmap(page, refs, vma) SWAP_FAIL
 
 static inline int page_mkclean(struct page *page)
 {
diff --git a/mm/ksm.c b/mm/ksm.c
index b6afe0c..2efdfd2 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -1946,7 +1946,8 @@ out:
 	return referenced;
 }
 
-int try_to_unmap_ksm(struct page *page, enum ttu_flags flags)
+int try_to_unmap_ksm(struct page *page, enum ttu_flags flags,
+			struct vm_area_struct *target_vma)
 {
 	struct stable_node *stable_node;
 	struct rmap_item *rmap_item;
@@ -1959,6 +1960,12 @@ int try_to_unmap_ksm(struct page *page, enum ttu_flags flags)
 	stable_node = page_stable_node(page);
 	if (!stable_node)
 		return SWAP_FAIL;
+
+	if (target_vma) {
+		unsigned long address = vma_address(page, target_vma);
+		ret = try_to_unmap_one(page, target_vma, address, flags);
+		goto out;
+	}
 again:
 	hlist_for_each_entry(rmap_item, &stable_node->hlist, hlist) {
 		struct anon_vma *anon_vma = rmap_item->anon_vma;
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index ceb0c7f..f3928e4 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -955,7 +955,7 @@ static int hwpoison_user_mappings(struct page *p, unsigned long pfn,
 	if (hpage != ppage)
 		lock_page(ppage);
 
-	ret = try_to_unmap(ppage, ttu);
+	ret = try_to_unmap(ppage, ttu, NULL);
 	if (ret != SWAP_SUCCESS)
 		printk(KERN_ERR "MCE %#lx: failed to unmap page (mapcount=%d)\n",
 				pfn, page_mapcount(ppage));
diff --git a/mm/migrate.c b/mm/migrate.c
index c9c5eee..aef29a0 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -820,7 +820,8 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
 	}
 
 	/* Establish migration ptes or remove ptes */
-	try_to_unmap(page, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
+	try_to_unmap(page, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS,
+			NULL);
 
 skip_unmap:
 	if (!page_mapped(page))
@@ -947,7 +948,8 @@ static int unmap_and_move_huge_page(new_page_t get_new_page,
 	if (PageAnon(hpage))
 		anon_vma = page_get_anon_vma(hpage);
 
-	try_to_unmap(hpage, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS);
+	try_to_unmap(hpage, TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS,
+						NULL);
 
 	if (!page_mapped(hpage))
 		rc = move_to_new_page(new_hpage, hpage, 1, mode);
diff --git a/mm/rmap.c b/mm/rmap.c
index 6280da8..43718fc 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1435,13 +1435,16 @@ bool is_vma_temporary_stack(struct vm_area_struct *vma)
 
 /**
  * try_to_unmap_anon - unmap or unlock anonymous page using the object-based
- * rmap method
+ * rmap method if @vma is NULL
  * @page: the page to unmap/unlock
  * @flags: action and flags
+ * @target_vma: vma for unmapping a @page
  *
  * Find all the mappings of a page using the mapping pointer and the vma chains
  * contained in the anon_vma struct it points to.
  *
+ * If @target_vma isn't NULL, this function unmaps the page from that vma only.
+ *
  * This function is only called from try_to_unmap/try_to_munlock for
  * anonymous pages.
  * When called from try_to_munlock(), the mmap_sem of the mm containing the vma
@@ -1449,12 +1452,19 @@ bool is_vma_temporary_stack(struct vm_area_struct *vma)
  * vm_flags for that VMA.  That should be OK, because that vma shouldn't be
  * 'LOCKED.
  */
-static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
+static int try_to_unmap_anon(struct page *page, enum ttu_flags flags,
+					struct vm_area_struct *target_vma)
 {
+	int ret = SWAP_AGAIN;
+	unsigned long address;
 	struct anon_vma *anon_vma;
 	pgoff_t pgoff;
 	struct anon_vma_chain *avc;
-	int ret = SWAP_AGAIN;
+
+	if (target_vma) {
+		address = vma_address(page, target_vma);
+		return try_to_unmap_one(page, target_vma, address, flags);
+	}
 
 	anon_vma = page_lock_anon_vma_read(page);
 	if (!anon_vma)
@@ -1463,7 +1473,6 @@ static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
 	pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
 	anon_vma_interval_tree_foreach(avc, &anon_vma->rb_root, pgoff, pgoff) {
 		struct vm_area_struct *vma = avc->vma;
-		unsigned long address;
 
 		/*
 		 * During exec, a temporary VMA is setup and later moved.
@@ -1491,6 +1500,7 @@ static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
  * try_to_unmap_file - unmap/unlock file page using the object-based rmap method
  * @page: the page to unmap/unlock
  * @flags: action and flags
+ * @target_vma: vma for unmapping @page
  *
  * Find all the mappings of a page using the mapping pointer and the vma chains
  * contained in the address_space struct it points to.
@@ -1502,7 +1512,8 @@ static int try_to_unmap_anon(struct page *page, enum ttu_flags flags)
  * vm_flags for that VMA.  That should be OK, because that vma shouldn't be
  * 'LOCKED.
  */
-static int try_to_unmap_file(struct page *page, enum ttu_flags flags)
+static int try_to_unmap_file(struct page *page, enum ttu_flags flags,
+				struct vm_area_struct *target_vma)
 {
 	struct address_space *mapping = page->mapping;
 	pgoff_t pgoff = page->index << (PAGE_CACHE_SHIFT - PAGE_SHIFT);
@@ -1512,16 +1523,26 @@ static int try_to_unmap_file(struct page *page, enum ttu_flags flags)
 	unsigned long max_nl_cursor = 0;
 	unsigned long max_nl_size = 0;
 	unsigned int mapcount;
+	unsigned long address;
 
 	if (PageHuge(page))
 		pgoff = page->index << compound_order(page);
 
 	mutex_lock(&mapping->i_mmap_mutex);
-	vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
-		unsigned long address = vma_address(page, vma);
-		ret = try_to_unmap_one(page, vma, address, flags);
-		if (ret != SWAP_AGAIN || !page_mapped(page))
+	if (target_vma) {
+		/* We don't handle non-linear vma on ramfs */
+		if (unlikely(!list_empty(&mapping->i_mmap_nonlinear)))
 			goto out;
+		address = vma_address(page, target_vma);
+		ret = try_to_unmap_one(page, target_vma, address, flags);
+		goto out;
+	} else {
+		vma_interval_tree_foreach(vma, &mapping->i_mmap, pgoff, pgoff) {
+			address = vma_address(page, vma);
+			ret = try_to_unmap_one(page, vma, address, flags);
+			if (ret != SWAP_AGAIN || !page_mapped(page))
+				goto out;
+		}
 	}
 
 	if (list_empty(&mapping->i_mmap_nonlinear))
@@ -1602,9 +1623,12 @@ out:
  * try_to_unmap - try to remove all page table mappings to a page
  * @page: the page to get unmapped
  * @flags: action and flags
+ * @vma: target vma for reclaim
  *
  * Tries to remove all the page table entries which are mapping this
  * page, used in the pageout path.  Caller must hold the page lock.
+ * If @vma is not NULL, this function tries to remove @page from @vma only,
+ * without walking all of the vmas that map @page.
  * Return values are:
  *
  * SWAP_SUCCESS	- we succeeded in removing all mappings
@@ -1612,7 +1636,8 @@ out:
  * SWAP_FAIL	- the page is unswappable
  * SWAP_MLOCK	- page is mlocked.
  */
-int try_to_unmap(struct page *page, enum ttu_flags flags)
+int try_to_unmap(struct page *page, enum ttu_flags flags,
+				struct vm_area_struct *vma)
 {
 	int ret;
 
@@ -1620,11 +1645,11 @@ int try_to_unmap(struct page *page, enum ttu_flags flags)
 	VM_BUG_ON(!PageHuge(page) && PageTransHuge(page));
 
 	if (unlikely(PageKsm(page)))
-		ret = try_to_unmap_ksm(page, flags);
+		ret = try_to_unmap_ksm(page, flags, vma);
 	else if (PageAnon(page))
-		ret = try_to_unmap_anon(page, flags);
+		ret = try_to_unmap_anon(page, flags, vma);
 	else
-		ret = try_to_unmap_file(page, flags);
+		ret = try_to_unmap_file(page, flags, vma);
 	if (ret != SWAP_MLOCK && !page_mapped(page))
 		ret = SWAP_SUCCESS;
 	return ret;
@@ -1650,11 +1675,11 @@ int try_to_munlock(struct page *page)
 	VM_BUG_ON(!PageLocked(page) || PageLRU(page));
 
 	if (unlikely(PageKsm(page)))
-		return try_to_unmap_ksm(page, TTU_MUNLOCK);
+		return try_to_unmap_ksm(page, TTU_MUNLOCK, NULL);
 	else if (PageAnon(page))
-		return try_to_unmap_anon(page, TTU_MUNLOCK);
+		return try_to_unmap_anon(page, TTU_MUNLOCK, NULL);
 	else
-		return try_to_unmap_file(page, TTU_MUNLOCK);
+		return try_to_unmap_file(page, TTU_MUNLOCK, NULL);
 }
 
 void __put_anon_vma(struct anon_vma *anon_vma)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 3243705..75f5b5d 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -93,6 +93,13 @@ struct scan_control {
 	 * are scanned.
 	 */
 	nodemask_t	*nodemask;
+
+	/*
+	 * Reclaim pages from a vma. If a page is shared by other tasks,
+	 * it is zapped from this vma only without being reclaimed, so it
+	 * remains in memory until the last mapping task zaps it.
+	 */
+	struct vm_area_struct *target_vma;
 };
 
 #define lru_to_page(_head) (list_entry((_head)->prev, struct page, lru))
@@ -794,7 +801,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 		 * processes. Try to unmap it here.
 		 */
 		if (page_mapped(page) && mapping) {
-			switch (try_to_unmap(page, ttu_flags)) {
+			switch (try_to_unmap(page,
+					ttu_flags, sc->target_vma)) {
 			case SWAP_FAIL:
 				goto activate_locked;
 			case SWAP_AGAIN:
@@ -1003,7 +1011,8 @@ unsigned long reclaim_clean_pages_from_list(struct zone *zone,
 }
 
 #ifdef CONFIG_PROCESS_RECLAIM
-unsigned long reclaim_pages_from_list(struct list_head *page_list)
+unsigned long reclaim_pages_from_list(struct list_head *page_list,
+					struct vm_area_struct *vma)
 {
 	struct scan_control sc = {
 		.gfp_mask = GFP_KERNEL,
@@ -1011,6 +1020,7 @@ unsigned long reclaim_pages_from_list(struct list_head *page_list)
 		.may_writepage = 1,
 		.may_unmap = 1,
 		.may_swap = 1,
+		.target_vma = vma,
 	};
 
 	unsigned long nr_reclaimed;
-- 
1.8.2.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v5 6/7] mm: Support address range reclaim
  2013-05-09  7:21 ` Minchan Kim
@ 2013-05-09  7:21   ` Minchan Kim
  -1 siblings, 0 replies; 24+ messages in thread
From: Minchan Kim @ 2013-05-09  7:21 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Rik van Riel, Michael Kerrisk,
	Dave Hansen, Namhyung Kim, Minkyung Kim, Minchan Kim

This patch adds address range reclaim for a process.
The requirement is as follows.

WebKit1 uses a single address space for handling multiple tabs.
IOW, it uses a *one*-process model, so all tabs share the address
space of the process. In such a scenario, per-process reclaim is
rather coarse-grained, so this patch supports more fine-grained
reclaim that can target an address range of the process.
To reclaim a target range, use the following format.

	echo [addr] [size-byte] > /proc/pid/reclaim

The addr should be page-aligned.

So now the reclaim knob's interface is as follows.

echo file > /proc/pid/reclaim
	reclaim file-backed pages only

echo anon > /proc/pid/reclaim
	reclaim anonymous pages only

echo all > /proc/pid/reclaim
	reclaim all pages

echo 0x100000 8K > /proc/pid/reclaim
	reclaim pages in [0x100000, 0x102000)
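
For illustration, a task manager could drive this interface from C
roughly as below (a minimal userspace sketch; the helper name and
error handling are mine, not part of this patch):

	#include <fcntl.h>
	#include <stdio.h>
	#include <unistd.h>

	/* ask the kernel to reclaim [addr, addr + size) of process @pid */
	static int reclaim_range(pid_t pid, unsigned long addr,
				 unsigned long size)
	{
		char path[64], cmd[64];
		int fd, n, ret = 0;

		snprintf(path, sizeof(path), "/proc/%d/reclaim", (int)pid);
		n = snprintf(cmd, sizeof(cmd), "0x%lx %lu", addr, size);

		fd = open(path, O_WRONLY);
		if (fd < 0)
			return -1;
		if (write(fd, cmd, n) != n)
			ret = -1;
		close(fd);
		return ret;
	}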

Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 fs/proc/task_mmu.c | 85 ++++++++++++++++++++++++++++++++++++++++++++----------
 mm/Kconfig         |  3 ++
 2 files changed, 73 insertions(+), 15 deletions(-)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index ccc97b1..61b6bde 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -12,6 +12,7 @@
 #include <linux/swap.h>
 #include <linux/swapops.h>
 #include <linux/mm_inline.h>
+#include <linux/ctype.h>
 
 #include <asm/elf.h>
 #include <asm/uaccess.h>
@@ -1239,11 +1240,14 @@ static ssize_t reclaim_write(struct file *file, const char __user *buf,
 				size_t count, loff_t *ppos)
 {
 	struct task_struct *task;
-	char buffer[PROC_NUMBUF];
+	char buffer[200];
 	struct mm_struct *mm;
 	struct vm_area_struct *vma;
 	enum reclaim_type type;
 	char *type_buf;
+	struct mm_walk reclaim_walk = {};
+	unsigned long start = 0;
+	unsigned long end = 0;
 
 	memset(buffer, 0, sizeof(buffer));
 	if (count > sizeof(buffer) - 1)
@@ -1259,42 +1263,93 @@ static ssize_t reclaim_write(struct file *file, const char __user *buf,
 		type = RECLAIM_ANON;
 	else if (!strcmp(type_buf, "all"))
 		type = RECLAIM_ALL;
+	else if (isdigit(*type_buf))
+		type = RECLAIM_RANGE;
 	else
-		return -EINVAL;
+		goto out_err;
+
+	if (type == RECLAIM_RANGE) {
+		char *token;
+		unsigned long long len, len_in, tmp;
+		token = strsep(&type_buf, " ");
+		if (!token)
+			goto out_err;
+		tmp = memparse(token, &token);
+		if (tmp & ~PAGE_MASK || tmp > ULONG_MAX)
+			goto out_err;
+		start = tmp;
+
+		token = strsep(&type_buf, " ");
+		if (!token)
+			goto out_err;
+		len_in = memparse(token, &token);
+		len = (len_in + ~PAGE_MASK) & PAGE_MASK;
+		if (len > ULONG_MAX)
+			goto out_err;
+		/*
+		 * Check to see whether len was rounded up from small -ve
+		 * to zero.
+		 */
+		if (len_in && !len)
+			goto out_err;
+
+		end = start + len;
+		if (end < start)
+			goto out_err;
+	}
 
 	task = get_proc_task(file->f_path.dentry->d_inode);
 	if (!task)
 		return -ESRCH;
 
 	mm = get_task_mm(task);
-	if (mm) {
-		struct mm_walk reclaim_walk = {
-			.pmd_entry = reclaim_pte_range,
-			.mm = mm,
-		};
+	if (!mm)
+		goto out;
 
-		down_read(&mm->mmap_sem);
-		for (vma = mm->mmap; vma; vma = vma->vm_next) {
-			reclaim_walk.private = vma;
+	reclaim_walk.mm = mm;
+	reclaim_walk.pmd_entry = reclaim_pte_range;
 
+	down_read(&mm->mmap_sem);
+	if (type == RECLAIM_RANGE) {
+		vma = find_vma(mm, start);
+		while (vma) {
+			if (vma->vm_start > end)
+				break;
+			/* skip hugetlb vmas but still advance to the next vma */
+			if (!is_vm_hugetlb_page(vma)) {
+				reclaim_walk.private = vma;
+				walk_page_range(max(vma->vm_start, start),
+						min(vma->vm_end, end),
+						&reclaim_walk);
+			}
+			vma = vma->vm_next;
+		}
+	} else {
+		for (vma = mm->mmap; vma; vma = vma->vm_next) {
 			if (is_vm_hugetlb_page(vma))
 				continue;
 
 			if (type == RECLAIM_ANON && vma->vm_file)
 				continue;
+
 			if (type == RECLAIM_FILE && !vma->vm_file)
 				continue;
 
+			reclaim_walk.private = vma;
 			walk_page_range(vma->vm_start, vma->vm_end,
-					&reclaim_walk);
+				&reclaim_walk);
 		}
-		flush_tlb_mm(mm);
-		up_read(&mm->mmap_sem);
-		mmput(mm);
 	}
-	put_task_struct(task);
 
+	flush_tlb_mm(mm);
+	up_read(&mm->mmap_sem);
+	mmput(mm);
+out:
+	put_task_struct(task);
 	return count;
+
+out_err:
+	return -EINVAL;
 }
 
 const struct file_operations proc_reclaim_operations = {
diff --git a/mm/Kconfig b/mm/Kconfig
index d119b5b..3460da4 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -501,4 +501,7 @@ config PROCESS_RECLAIM
 	 (echo anon > /proc/PID/reclaim) reclaims anonymous pages only.
 	 (echo all > /proc/PID/reclaim) reclaims all pages.
 
+	 (echo addr size-byte > /proc/PID/reclaim) reclaims pages in
+	 [addr, addr + size-byte) of the process.
+
	 Any other value is ignored.
-- 
1.8.2.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v5 7/7] add documentation about reclaim knob on proc.txt
  2013-05-09  7:21 ` Minchan Kim
@ 2013-05-09  7:21   ` Minchan Kim
  -1 siblings, 0 replies; 24+ messages in thread
From: Minchan Kim @ 2013-05-09  7:21 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Rik van Riel, Michael Kerrisk,
	Dave Hansen, Namhyung Kim, Minkyung Kim, Minchan Kim

This patch documents the new reclaim knob in proc.txt.

Acked-by: Rob Landley <rob@landley.net>
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 Documentation/filesystems/proc.txt | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 488c094..ee4cef1 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -136,6 +136,7 @@ Table 1-1: Process specific entries in /proc
  maps		Memory maps to executables and library files	(2.4)
  mem		Memory held by this process
  root		Link to the root directory of this process
+ reclaim	Reclaim pages in this process
  stat		Process status
  statm		Process memory status information
  status		Process status in human readable form
@@ -489,6 +490,25 @@ To clear the soft-dirty bit
 
 Any other value written to /proc/PID/clear_refs will have no effect.
 
+The file /proc/PID/reclaim is used to reclaim pages in this process.
+To reclaim file-backed pages,
+    > echo file > /proc/PID/reclaim
+
+To reclaim anonymous pages,
+    > echo anon > /proc/PID/reclaim
+
+To reclaim all pages,
+    > echo all > /proc/PID/reclaim
+
+Also, you can specify an address range of the process so that only part
+of the address space is reclaimed. The format is as follows:
+    > echo addr size-byte > /proc/PID/reclaim
+
+NOTE: addr should be page-aligned.
+
+Below is an example which tries to reclaim 2M starting at 0x100000.
+    > echo 0x100000 2M > /proc/PID/reclaim
+
 The /proc/pid/pagemap gives the PFN, which can be used to find the pageflags
 using /proc/kpageflags and number of times a page is mapped using
 /proc/kpagecount. For detailed explanation, see Documentation/vm/pagemap.txt.
-- 
1.8.2.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 0/7] Per process reclaim
  2013-05-09  7:21 ` Minchan Kim
@ 2013-05-21 23:16   ` Andrew Morton
  -1 siblings, 0 replies; 24+ messages in thread
From: Andrew Morton @ 2013-05-21 23:16 UTC (permalink / raw)
  To: Minchan Kim
  Cc: linux-mm, linux-kernel, Rik van Riel, Michael Kerrisk,
	Dave Hansen, Namhyung Kim, Minkyung Kim

On Thu,  9 May 2013 16:21:22 +0900 Minchan Kim <minchan@kernel.org> wrote:

> These day, there are many platforms avaiable in the embedded market
> and they are smarter than kernel which has very limited information
> about working set so they want to involve memory management more heavily
> like android's lowmemory killer and ashmem or recent many lowmemory
> notifier(there was several trial for various company NOKIA, SAMSUNG,
> Linaro, Google ChromeOS, Redhat).
> 
> One of the simple imagine scenario about userspace's intelligence is that
> platform can manage tasks as forground and backgroud so it would be
> better to reclaim background's task pages for end-user's *responsibility*
> although it has frequent referenced pages.
> 
> The patch[1] prepares that force_reclaim in shrink_page_list can
> handle anonymous pages as well as file-backed pages.
> 
> The patch[2] adds new knob "reclaim under proc/<pid>/" so task manager
> can reclaim any target process anytime, anywhere. It could give another
> method to platform for using memory efficiently.
> 
> It can avoid process killing for getting free memory, which was really
> terrible experience because I lost my best score of game I had ever
> after I switch the phone call while I enjoyed the game.
> 
> Reclaim file-backed pages only.
> 	echo file > /proc/PID/reclaim
> Reclaim anonymous pages only.
> 	echo anon > /proc/PID/reclaim
> Reclaim all pages
> 	echo all > /proc/PID/reclaim

Oh boy.  I think I do agree with the overall intent, but there are so
many ways of doing this.

- Do we reclaim the pages altogether, or should we just give them one
  round of aging?  If the latter then you'd need to run "echo anon >
  /proc/PID/reclaim" four times to firmly whack the pages, but that's
  more flexible.

- Why do it via the pid at all?  Would it be better to instead do
  this to a memcg and require that the admin put these processes into
  memcgs?  In fact existing memcg controls could get us at least
  partway to this feature (a rough sketch follows these bullets).

- I don't understand the need for "Enhance per process reclaim to
  consider shared pages".  If "echo file > /proc/PID/reclaim" causes
  PID's mm's file-backed pte's to be unmapped (which seems to be the
  correct effect) then we get this automatically: unshared file pages
  will be freed and shared file pages will remain in core until the
  other sharing processes also unmap them.
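
For reference, that memcg route might look roughly like this from
userspace (a sketch only; the cgroup path assumes a v1 memory
controller mounted at /sys/fs/cgroup/memory with a pre-made
"background" group):

	#include <fcntl.h>
	#include <unistd.h>

	int main(void)
	{
		/* lowering the hard limit forces reclaim of the group's
		 * pages down to the new limit (64M here) */
		int fd = open("/sys/fs/cgroup/memory/background/"
			      "memory.limit_in_bytes", O_WRONLY);

		if (fd < 0)
			return 1;
		write(fd, "67108864", 8);
		close(fd);
		return 0;
	}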


Overall, I'm unsure whether/how to proceed with this.  I'd like to hear
from a lot of the potential users, and hear them say "yes, we can use
this".


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 6/7] mm: Support address range reclaim
  2013-05-09  7:21   ` Minchan Kim
@ 2013-05-22  0:33     ` Andrew Morton
  -1 siblings, 0 replies; 24+ messages in thread
From: Andrew Morton @ 2013-05-22  0:33 UTC (permalink / raw)
  To: Minchan Kim
  Cc: linux-mm, linux-kernel, Rik van Riel, Michael Kerrisk,
	Dave Hansen, Namhyung Kim, Minkyung Kim

On Thu,  9 May 2013 16:21:28 +0900 Minchan Kim <minchan@kernel.org> wrote:

> This patch adds address range reclaim of a process.
> The requirement is following as,
> 
> Like webkit1, it uses a address space for handling multi tabs.
> IOW, it uses *one* process model so all tabs shares address space
> of the process. In such scenario, per-process reclaim is rather
> coarse-grained so this patch supports more fine-grained reclaim
> for being able to reclaim target address range of the process.
> For reclaim target range, you should use following format.
> 
> 	echo [addr] [size-byte] > /proc/pid/reclaim
> 
> The addr should be page-aligned.
> 
> So now reclaim konb's interface is following as.
> 
> echo file > /proc/pid/reclaim
> 	reclaim file-backed pages only
> 
> echo anon > /proc/pid/reclaim
> 	reclaim anonymous pages only
> 
> echo all > /proc/pid/reclaim
> 	reclaim all pages
> 
> echo 0x100000 8K > /proc/pid/reclaim
> 	reclaim pages in (0x100000 - 0x102000)

This might be going a bit far.  The application itself can be modified
to use fadvise/madvise/whatever to release unused pages and that's a
better interface.

Although it's a bit of a pipe-dream, I do think we should encourage
userspace to go this path, rather than providing ways for hacky admin
tools to go poking around in /proc/pid/maps and whacking apps
externally.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 0/7] Per process reclaim
  2013-05-21 23:16   ` Andrew Morton
@ 2013-05-27  8:12     ` Minchan Kim
  -1 siblings, 0 replies; 24+ messages in thread
From: Minchan Kim @ 2013-05-27  8:12 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Rik van Riel, Michael Kerrisk,
	Dave Hansen, Namhyung Kim, Minkyung Kim

Hello Andrew,

Sorry for the late response.
It was bad timing because I was off last week.

On Tue, May 21, 2013 at 04:16:56PM -0700, Andrew Morton wrote:
> On Thu,  9 May 2013 16:21:22 +0900 Minchan Kim <minchan@kernel.org> wrote:
> 
> > These day, there are many platforms avaiable in the embedded market
> > and they are smarter than kernel which has very limited information
> > about working set so they want to involve memory management more heavily
> > like android's lowmemory killer and ashmem or recent many lowmemory
> > notifier(there was several trial for various company NOKIA, SAMSUNG,
> > Linaro, Google ChromeOS, Redhat).
> > 
> > One of the simple imagine scenario about userspace's intelligence is that
> > platform can manage tasks as forground and backgroud so it would be
> > better to reclaim background's task pages for end-user's *responsibility*
> > although it has frequent referenced pages.
> > 
> > The patch[1] prepares that force_reclaim in shrink_page_list can
> > handle anonymous pages as well as file-backed pages.
> > 
> > The patch[2] adds new knob "reclaim under proc/<pid>/" so task manager
> > can reclaim any target process anytime, anywhere. It could give another
> > method to platform for using memory efficiently.
> > 
> > It can avoid process killing for getting free memory, which was really
> > terrible experience because I lost my best score of game I had ever
> > after I switch the phone call while I enjoyed the game.
> > 
> > Reclaim file-backed pages only.
> > 	echo file > /proc/PID/reclaim
> > Reclaim anonymous pages only.
> > 	echo anon > /proc/PID/reclaim
> > Reclaim all pages
> > 	echo all > /proc/PID/reclaim
> 
> Oh boy.  I think I do agree with the overall intent, but there are so
> many ways of doing this.
> 
> - Do we reclaim the pages altogether, or should we just give them one
>   round of aging?  If the latter then you'd need to run "echo anon >
>   /proc/PID/reclaim" four times to firmly whack the pages, but that's
>   more flexible.

The concern with the idea you suggested is that it ties the knob to
the kernel's reclaim implementation (active/inactive/referenced), so
if we change that logic, users' behavior could easily break.

> 
> - Why do it via the pid at all?  Would it be better to instead do
>   this to a memcg and require that the admin put these processes into
>   memcgs?  In fact existing memcg controls could get us at least
>   partway to this feature.

Hmm, I don't see a good way to handle this with memcg.

1) Is there any way to reclaim anon pages at top priority
   rather than file pages? Many embedded systems use zram as the swap
   device, so in some cases anon reclaim could be cheaper than
   reclaiming file-backed pages on really slow storage.
   Of course, we can control it with "swappiness", but at most that
   makes the balancing ratio 1:1.
   Should we expand the value to 200? I have no idea.
   What I want is a smarter VM that understands the swap storage's
   speed so that it can balance anon/file more intelligently. But
   that's not a simple problem, I think. (A minimal zram-as-swap
   setup is sketched after this list.)
   
2) Another use case:
   We used per-process reclaim to make a smart snapshot image, which
   is like a hibernation image. As we know well, an embedded system's
   boot time is critical to launching the product, so a few systems
   have used snapshot booting. Snapshot booting's core technique is
   to minimize the snapshot image size, but the image must include
   the working set pages which will be needed soon after resuming,
   so we can profile the working set pages, discard unnecessary
   pages via per-process reclaim, then make the snapshot image.
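
For context, the zram-as-swap setup mentioned in 1) is roughly the
following (a minimal sketch; it assumes the zram module is loaded,
zram0 is idle, and 512M is just an example size):

	#include <fcntl.h>
	#include <unistd.h>

	int main(void)
	{
		/* size the compressed block device; afterwards run
		 * `mkswap /dev/zram0 && swapon -p 10 /dev/zram0` so that
		 * anon reclaim has a fast swap target */
		int fd = open("/sys/block/zram0/disksize", O_WRONLY);

		if (fd < 0)
			return 1;
		write(fd, "536870912", 9);
		close(fd);
		return 0;
	}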

 
> 
> - I don't understand the need for "Enhance per process reclaim to
>   consider shared pages".  If "echo file > /proc/PID/reclaim" causes
>   PID's mm's file-backed pte's to be unmapped (which seems to be the
>   correct effect) then we get this automatically: unshared file pages
>   will be freed and shared file pages will remain in core until the
>   other sharing process's also unmap them.

In the current implementation, even a shared page is detached from
all processes and freed.

> 
> 
> Overall, I'm unsure whether/how to proceed with this.  I'd like to hear
> from a lot of the potential users, and hear them say "yes, we can use
> this".

We have already used it for creating snapshot images and plan to use
it for reducing system latency and avoiding frequent OOM kills.


-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v5 6/7] mm: Support address range reclaim
  2013-05-22  0:33     ` Andrew Morton
@ 2013-05-27  8:24       ` Minchan Kim
  -1 siblings, 0 replies; 24+ messages in thread
From: Minchan Kim @ 2013-05-27  8:24 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Rik van Riel, Michael Kerrisk,
	Dave Hansen, Namhyung Kim, Minkyung Kim

On Tue, May 21, 2013 at 05:33:32PM -0700, Andrew Morton wrote:
> On Thu,  9 May 2013 16:21:28 +0900 Minchan Kim <minchan@kernel.org> wrote:
> 
> > This patch adds address range reclaim of a process.
> > The requirement is following as,
> > 
> > Like webkit1, it uses a address space for handling multi tabs.
> > IOW, it uses *one* process model so all tabs shares address space
> > of the process. In such scenario, per-process reclaim is rather
> > coarse-grained so this patch supports more fine-grained reclaim
> > for being able to reclaim target address range of the process.
> > For reclaim target range, you should use following format.
> > 
> > 	echo [addr] [size-byte] > /proc/pid/reclaim
> > 
> > The addr should be page-aligned.
> > 
> > So now reclaim konb's interface is following as.
> > 
> > echo file > /proc/pid/reclaim
> > 	reclaim file-backed pages only
> > 
> > echo anon > /proc/pid/reclaim
> > 	reclaim anonymous pages only
> > 
> > echo all > /proc/pid/reclaim
> > 	reclaim all pages
> > 
> > echo 0x100000 8K > /proc/pid/reclaim
> > 	reclaim pages in (0x100000 - 0x102000)
> 
> This might be going a bit far.  The application itself can be modified
> to use fadvise/madvise/whatever to release unused pages and that's a
> better interface.

I agree. WebKit should be smarter, and that work is under way AFAIK,
but let's consider another use case: the snapshot image scenario I
mentioned in my previous reply.

There, the admin should be able to discard NOT-IMPORTANT pages
without modifying the application's code.

In addition, maybe we need madvise(MADV_SWAPOUT_NOT_DONTNEED) for
anonymous pages if we don't have such a feature.
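
A rough sketch of that hypothetical call (the advice name and value
below are made up for illustration; the closest existing advice,
MADV_DONTNEED, discards anon contents instead of swapping them out):

	#include <sys/mman.h>
	#include <stddef.h>

	/* proposed advice value, not defined by any released kernel */
	#define MADV_SWAPOUT_NOT_DONTNEED 100

	int main(void)
	{
		size_t len = 16 * 4096;
		void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
				 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

		if (buf == MAP_FAILED)
			return 1;
		/* ... populate buf, then ask for swap-out instead of
		 * discard: the data would come back from swap on the
		 * next touch rather than reading back as zeroes */
		madvise(buf, len, MADV_SWAPOUT_NOT_DONTNEED);
		return 0;
	}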

> 
> Although it's a bit of a pipe-dream, I do think we should encourage
> userspace to go this path, rather than providing ways for hacky admin
> tools to go poking around in /proc/pid/maps and whacking apps
> externally.

-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2013-05-27  8:39 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-05-09  7:21 [PATCH v5 0/7] Per process reclaim Minchan Kim
2013-05-09  7:21 ` Minchan Kim
2013-05-09  7:21 ` [PATCH v5 1/7] mm: prevent to write out dirty page in CMA by may_writepage Minchan Kim
2013-05-09  7:21   ` Minchan Kim
2013-05-09  7:21 ` [PATCH v5 2/7] mm: Per process reclaim Minchan Kim
2013-05-09  7:21   ` Minchan Kim
2013-05-09  7:21 ` [PATCH v5 3/7] mm: make shrink_page_list with pages work from multiple zones Minchan Kim
2013-05-09  7:21   ` Minchan Kim
2013-05-09  7:21 ` [PATCH v5 4/7] mm: Remove shrink_page Minchan Kim
2013-05-09  7:21   ` Minchan Kim
2013-05-09  7:21 ` [PATCH v5 5/7] mm: Enhance per process reclaim to consider shared pages Minchan Kim
2013-05-09  7:21   ` Minchan Kim
2013-05-09  7:21 ` [PATCH v5 6/7] mm: Support address range reclaim Minchan Kim
2013-05-09  7:21   ` Minchan Kim
2013-05-22  0:33   ` Andrew Morton
2013-05-22  0:33     ` Andrew Morton
2013-05-27  8:24     ` Minchan Kim
2013-05-27  8:24       ` Minchan Kim
2013-05-09  7:21 ` [PATCH v5 7/7] add documentation about reclaim knob on proc.txt Minchan Kim
2013-05-09  7:21   ` Minchan Kim
2013-05-21 23:16 ` [PATCH v5 0/7] Per process reclaim Andrew Morton
2013-05-21 23:16   ` Andrew Morton
2013-05-27  8:12   ` Minchan Kim
2013-05-27  8:12     ` Minchan Kim
