* [PATCH v3 0/3] free reclaimed pages by paging out instantly
@ 2014-07-02  0:13 ` Minchan Kim
  0 siblings, 0 replies; 14+ messages in thread
From: Minchan Kim @ 2014-07-02  0:13 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-mm, Mel Gorman, Rik van Riel, Hugh Dickins,
	Johannes Weiner, Michal Hocko, Minchan Kim

Normally, pages whose reclaim I/O has completed are rotated to the
inactive LRU tail without being freed. The reason is that we cannot
free a page from atomic context (ie, end_page_writeback) because the
various locks involved are not aware of atomic context.

So to reclaim those I/O-completed pages we need one more reclaim
iteration, which causes unnecessary aging as well as CPU overhead.
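
For reference, the completion path that forces the extra reclaim
iteration looks roughly like this (simplified from the vanilla
end_page_writeback() shown in patch 3/3: the page is only rotated,
never freed):

	if (PageReclaim(page)) {
		ClearPageReclaim(page);
		rotate_reclaimable_page(page);	/* back to the inactive LRU tail */
	}

	if (!test_clear_page_writeback(page))
		BUG();
	wake_up_page(page, PG_writeback);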

A long time ago, at the first attempt, the main concern was memcg
locking, but Johannes recently put a great deal of effort into
simplifying the memcg lock, and that work was merged into mmotm,
so I based this series on the mmotm tree.
(Kudos to Johannes)

On a 1G, 12-CPU kvm guest, I built the kernel 5 times; the results were:

allocstall
vanilla: records: 5 avg: 4733.80 std: 913.55(19.30%) max: 6442.00 min: 3719.00
improve: records: 5 avg: 1514.20 std: 441.69(29.17%) max: 1974.00 min: 863.00

pgrotated
vanilla: records: 5 avg: 873313.80 std: 40999.20(4.69%) max: 954722.00 min: 845903.00
improve: records: 5 avg: 28406.40 std: 3296.02(11.60%) max: 34552.00 min: 25047.00

Most fields in vmstat do not change much, but the ones I notice are
allocstall and pgrotated. We save a great deal of allocstall (ie,
direct reclaim) and pgrotated.

Testing, review and any feedback are welcome!

* from v2 - 2014.06.20
  * Rebased on v3.16-rc2-mmotm-2014-06-25-16-44
  * Remove RFC tag

Minchan Kim (3):
  mm: Don't hide spin_lock in swap_info_get internal
  mm: Introduce atomic_remove_mapping
  mm: Free reclaimed pages independent of next reclaim

 include/linux/swap.h |  4 ++++
 mm/filemap.c         | 17 +++++++++-----
 mm/swap.c            | 21 ++++++++++++++++++
 mm/swapfile.c        | 17 ++++++++++++--
 mm/vmscan.c          | 63 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 114 insertions(+), 8 deletions(-)

-- 
2.0.0


* [PATCH v3 1/3] mm: Don't hide spin_lock in swap_info_get internal
  2014-07-02  0:13 ` Minchan Kim
@ 2014-07-02  0:13   ` Minchan Kim
  -1 siblings, 0 replies; 14+ messages in thread
From: Minchan Kim @ 2014-07-02  0:13 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-mm, Mel Gorman, Rik van Riel, Hugh Dickins,
	Johannes Weiner, Michal Hocko, Minchan Kim

Currently, swap_info_get takes the swap_info_struct lock internally
but does not release it, so the caller must release the lock.
That asymmetric pattern is not good in general, and the following
patch needs to control the lock from the caller side.
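
A minimal sketch of the resulting calling convention, using swap_free()
as an example (taken from the hunks below):

	/* before: swap_info_get() returned with p->lock already held */
	p = swap_info_get(entry);
	if (p) {
		swap_entry_free(p, entry, 1);
		spin_unlock(&p->lock);
	}

	/* after: the caller takes and releases the lock explicitly */
	p = swap_info_get(entry);
	if (p) {
		spin_lock(&p->lock);
		swap_entry_free(p, entry, 1);
		spin_unlock(&p->lock);
	}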

Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 mm/swapfile.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 8798b2e0ac59..ec2ce926ea5f 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -740,7 +740,6 @@ static struct swap_info_struct *swap_info_get(swp_entry_t entry)
 		goto bad_offset;
 	if (!p->swap_map[offset])
 		goto bad_free;
-	spin_lock(&p->lock);
 	return p;
 
 bad_free:
@@ -835,6 +834,7 @@ void swap_free(swp_entry_t entry)
 
 	p = swap_info_get(entry);
 	if (p) {
+		spin_lock(&p->lock);
 		swap_entry_free(p, entry, 1);
 		spin_unlock(&p->lock);
 	}
@@ -849,6 +849,7 @@ void swapcache_free(swp_entry_t entry)
 
 	p = swap_info_get(entry);
 	if (p) {
+		spin_lock(&p->lock);
 		swap_entry_free(p, entry, SWAP_HAS_CACHE);
 		spin_unlock(&p->lock);
 	}
@@ -868,6 +869,7 @@ int page_swapcount(struct page *page)
 	entry.val = page_private(page);
 	p = swap_info_get(entry);
 	if (p) {
+		spin_lock(&p->lock);
 		count = swap_count(p->swap_map[swp_offset(entry)]);
 		spin_unlock(&p->lock);
 	}
@@ -950,6 +952,7 @@ int free_swap_and_cache(swp_entry_t entry)
 
 	p = swap_info_get(entry);
 	if (p) {
+		spin_lock(&p->lock);
 		if (swap_entry_free(p, entry, 1) == SWAP_HAS_CACHE) {
 			page = find_get_page(swap_address_space(entry),
 						entry.val);
@@ -2763,6 +2766,7 @@ int add_swap_count_continuation(swp_entry_t entry, gfp_t gfp_mask)
 		goto outer;
 	}
 
+	spin_lock(&si->lock);
 	offset = swp_offset(entry);
 	count = si->swap_map[offset] & ~SWAP_HAS_CACHE;
 
-- 
2.0.0



* [PATCH v3 2/3] mm: Introduce atomic_remove_mapping
  2014-07-02  0:13 ` Minchan Kim
@ 2014-07-02  0:13   ` Minchan Kim
  -1 siblings, 0 replies; 14+ messages in thread
From: Minchan Kim @ 2014-07-02  0:13 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-mm, Mel Gorman, Rik van Riel, Hugh Dickins,
	Johannes Weiner, Michal Hocko, Minchan Kim, Trond Myklebust,
	linux-nfs

To release a page from atomic context (ie, softirq), the locks
involved in that work must be aware of atomic context.

There are two locks.

One is mapping->tree_lock and the other is swap_info_struct->lock.
The mapping->tree_lock is already irq-aware, so it is not a problem,
but swap_info_struct->lock is not, so atomic_remove_mapping just uses
spin_trylock; if it fails to take the lock, it falls back to moving
the page to the LRU tail and expects the page to be freed by the
next reclaim pass.

One change I am aware of is that mapping->a_ops->freepage is now
called from atomic context by this patch. The only user is
nfs_readdir_clear_array, which looks fine as far as I can tell.
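
A rough sketch of the intended caller, condensed from the
pagevec_move_tail_fn() hunk in patch 3/3: the rotation path runs with
IRQs disabled and falls back to the old rotate-to-tail behaviour
whenever a lock cannot be taken:

	if (!trylock_page(page))
		goto move_tail;			/* fallback: rotate as before */

	mapping = page_mapping(page);
	if (mapping && atomic_remove_mapping(mapping, page)) {
		unlock_page(page);		/* freed when the last ref is dropped */
		return;
	}
	unlock_page(page);
move_tail:
	list_move_tail(&page->lru, &lruvec->lists[lru]);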

Cc: Trond Myklebust <trond.myklebust@primarydata.com>
Cc: linux-nfs@vger.kernel.org
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 include/linux/swap.h |  4 ++++
 mm/swapfile.c        | 11 ++++++++-
 mm/vmscan.c          | 63 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 77 insertions(+), 1 deletion(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 94fd0b23f3f9..5df540205bda 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -336,6 +336,8 @@ extern unsigned long mem_cgroup_shrink_node_zone(struct mem_cgroup *mem,
 						unsigned long *nr_scanned);
 extern unsigned long shrink_all_memory(unsigned long nr_pages);
 extern int vm_swappiness;
+extern int atomic_remove_mapping(struct address_space *mapping,
+					struct page *page);
 extern int remove_mapping(struct address_space *mapping, struct page *page);
 extern unsigned long vm_total_pages;
 
@@ -441,6 +443,7 @@ static inline long get_nr_swap_pages(void)
 }
 
 extern void si_swapinfo(struct sysinfo *);
+extern struct swap_info_struct *swap_info_get(swp_entry_t entry);
 extern swp_entry_t get_swap_page(void);
 extern swp_entry_t get_swap_page_of_type(int);
 extern int add_swap_count_continuation(swp_entry_t, gfp_t);
@@ -449,6 +452,7 @@ extern int swap_duplicate(swp_entry_t);
 extern int swapcache_prepare(swp_entry_t);
 extern void swap_free(swp_entry_t);
 extern void swapcache_free(swp_entry_t);
+extern void __swapcache_free(swp_entry_t);
 extern int free_swap_and_cache(swp_entry_t);
 extern int swap_type_of(dev_t, sector_t, struct block_device **);
 extern unsigned int count_swap_pages(int, int);
diff --git a/mm/swapfile.c b/mm/swapfile.c
index ec2ce926ea5f..d76496a8a104 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -722,7 +722,7 @@ swp_entry_t get_swap_page_of_type(int type)
 	return (swp_entry_t) {0};
 }
 
-static struct swap_info_struct *swap_info_get(swp_entry_t entry)
+struct swap_info_struct *swap_info_get(swp_entry_t entry)
 {
 	struct swap_info_struct *p;
 	unsigned long offset, type;
@@ -855,6 +855,15 @@ void swapcache_free(swp_entry_t entry)
 	}
 }
 
+void __swapcache_free(swp_entry_t entry)
+{
+	struct swap_info_struct *p;
+
+	p = swap_info_get(entry);
+	if (p)
+		swap_entry_free(p, entry, SWAP_HAS_CACHE);
+}
+
 /*
  * How many references to page are currently swapped out?
  * This does not give an exact answer when swap count is continued,
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 6d24fd63b209..31af369eef24 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -526,6 +526,69 @@ static pageout_t pageout(struct page *page, struct address_space *mapping,
 }
 
 /*
+ * Attempt to detach a locked page from its ->mapping in atomic context.
+ * If it is dirty or if someone else has a ref on the page or couldn't
+ * get necessary locks, abort and return 0.
+ * If it was successfully detached, return 1.
+ * Assumes the caller has a single ref on this page.
+ */
+int atomic_remove_mapping(struct address_space *mapping,
+				struct page *page)
+{
+	BUG_ON(!PageLocked(page));
+	BUG_ON(mapping != page_mapping(page));
+	BUG_ON(!irqs_disabled());
+
+	spin_lock(&mapping->tree_lock);
+
+	/* Look at comment in __remove_mapping */
+	if (!page_freeze_refs(page, 2))
+		goto cannot_free;
+	/* note: atomic_cmpxchg in page_freeze_refs provides the smp_rmb */
+	if (unlikely(PageDirty(page))) {
+		page_unfreeze_refs(page, 2);
+		goto cannot_free;
+	}
+
+	if (PageSwapCache(page)) {
+		swp_entry_t swap = { .val = page_private(page) };
+		struct swap_info_struct *p = swap_info_get(swap);
+
+		if (!p || !spin_trylock(&p->lock)) {
+			page_unfreeze_refs(page, 2);
+			goto cannot_free;
+		}
+
+		mem_cgroup_swapout(page, swap);
+		__delete_from_swap_cache(page);
+		spin_unlock(&mapping->tree_lock);
+		__swapcache_free(swap);
+		spin_unlock(&p->lock);
+	} else {
+		void (*freepage)(struct page *);
+
+		freepage = mapping->a_ops->freepage;
+		__delete_from_page_cache(page, NULL);
+		spin_unlock(&mapping->tree_lock);
+
+		if (freepage != NULL)
+			freepage(page);
+	}
+
+	/*
+	 * Unfreezing the refcount with 1 rather than 2 effectively
+	 * drops the pagecache ref for us without requiring another
+	 * atomic operation.
+	 */
+	page_unfreeze_refs(page, 1);
+	return 1;
+
+cannot_free:
+	spin_unlock(&mapping->tree_lock);
+	return 0;
+}
+
+/*
  * Same as remove_mapping, but if the page is removed from the mapping, it
  * gets returned with a refcount of 0.
  */
-- 
2.0.0



* [PATCH v3 3/3] mm: Free reclaimed pages independent of next reclaim
  2014-07-02  0:13 ` Minchan Kim
@ 2014-07-02  0:13   ` Minchan Kim
  -1 siblings, 0 replies; 14+ messages in thread
From: Minchan Kim @ 2014-07-02  0:13 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-mm, Mel Gorman, Rik van Riel, Hugh Dickins,
	Johannes Weiner, Michal Hocko, Minchan Kim

Paging out dirty/writeback pages for reclaim (file and swap I/O) is
asynchronous, so when writeback of such a page completes, the page is
rotated back to the LRU tail to be freed by the next reclaim pass.

But that causes unnecessary CPU overhead and more aging, with a
higher reclaim priority than is really necessary.

This patch releases such pages immediately when their I/O completes,
without any LRU movement, so that we can reduce reclaim events.

This patch wakes up PG_writeback waiters before clearing the
PG_reclaim bit, because the page could be released while being
rotated. That creates a slight race with the readahead logic, but the
chance is small and there is no serious side effect even if it
happens, I believe.

The test results are as follows.

On a 1G, 12-CPU kvm guest, I built the kernel 5 times; the results were:

allocstall
vanilla: records: 5 avg: 4733.80 std: 913.55(19.30%) max: 6442.00 min: 3719.00
improve: records: 5 avg: 1514.20 std: 441.69(29.17%) max: 1974.00 min: 863.00

pgrotated
vanilla: records: 5 avg: 873313.80 std: 40999.20(4.69%) max: 954722.00 min: 845903.00
improve: records: 5 avg: 28406.40 std: 3296.02(11.60%) max: 34552.00 min: 25047.00

Most fields in vmstat do not change much, but the ones I notice are
allocstall and pgrotated. We save a great deal of allocstall (ie,
direct reclaim) and pgrotated.

Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 mm/filemap.c | 17 +++++++++++------
 mm/swap.c    | 21 +++++++++++++++++++++
 2 files changed, 32 insertions(+), 6 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index c2f30ed8e95f..6e09de6cf510 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -752,23 +752,28 @@ EXPORT_SYMBOL(unlock_page);
  */
 void end_page_writeback(struct page *page)
 {
+	if (!test_clear_page_writeback(page))
+		BUG();
+
+	smp_mb__after_atomic();
+	wake_up_page(page, PG_writeback);
+
 	/*
 	 * TestClearPageReclaim could be used here but it is an atomic
 	 * operation and overkill in this particular case. Failing to
 	 * shuffle a page marked for immediate reclaim is too mild to
 	 * justify taking an atomic operation penalty at the end of
 	 * ever page writeback.
+	 *
+	 * Clearing PG_reclaim after waking up waiters is slightly racy.
+	 * Readahead might see PageReclaim as the PageReadahead marker,
+	 * so the readahead logic might be broken temporarily, but it
+	 * doesn't matter enough to care.
 	 */
 	if (PageReclaim(page)) {
 		ClearPageReclaim(page);
 		rotate_reclaimable_page(page);
 	}
-
-	if (!test_clear_page_writeback(page))
-		BUG();
-
-	smp_mb__after_atomic();
-	wake_up_page(page, PG_writeback);
 }
 EXPORT_SYMBOL(end_page_writeback);
 
diff --git a/mm/swap.c b/mm/swap.c
index 3074210f245d..d61b8783ccc3 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -443,6 +443,27 @@ static void pagevec_move_tail_fn(struct page *page, struct lruvec *lruvec,
 
 	if (PageLRU(page) && !PageActive(page) && !PageUnevictable(page)) {
 		enum lru_list lru = page_lru_base_type(page);
+		struct address_space *mapping;
+
+		if (!trylock_page(page))
+			goto move_tail;
+
+		mapping = page_mapping(page);
+		if (!mapping)
+			goto unlock;
+
+		/*
+		 * If it succeeds, atomic_remove_mapping leaves
+		 * page->count at one, so the page will be released
+		 * when the caller drops its refcount.
+		 */
+		if (atomic_remove_mapping(mapping, page)) {
+			unlock_page(page);
+			return;
+		}
+unlock:
+		unlock_page(page);
+move_tail:
 		list_move_tail(&page->lru, &lruvec->lists[lru]);
 		(*pgmoved)++;
 	}
-- 
2.0.0



* Re: [PATCH v3 0/3] free reclaimed pages by paging out instantly
  2014-07-02  0:13 ` Minchan Kim
@ 2014-07-02 20:42   ` Andrew Morton
  -1 siblings, 0 replies; 14+ messages in thread
From: Andrew Morton @ 2014-07-02 20:42 UTC (permalink / raw)
  To: Minchan Kim
  Cc: linux-kernel, linux-mm, Mel Gorman, Rik van Riel, Hugh Dickins,
	Johannes Weiner, Michal Hocko

On Wed,  2 Jul 2014 09:13:46 +0900 Minchan Kim <minchan@kernel.org> wrote:

> Normally, pages whose reclaim I/O has completed are rotated to the
> inactive LRU tail without being freed. The reason is that we cannot
> free a page from atomic context (ie, end_page_writeback) because the
> various locks involved are not aware of atomic context.
> 
> So to reclaim those I/O-completed pages we need one more reclaim
> iteration, which causes unnecessary aging as well as CPU overhead.
> 
> A long time ago, at the first attempt, the main concern was memcg
> locking, but Johannes recently put a great deal of effort into
> simplifying the memcg lock, and that work was merged into mmotm,
> so I based this series on the mmotm tree.
> (Kudos to Johannes)
> 
> On a 1G, 12-CPU kvm guest, I built the kernel 5 times; the results were:
> 
> allocstall
> vanilla: records: 5 avg: 4733.80 std: 913.55(19.30%) max: 6442.00 min: 3719.00
> improve: records: 5 avg: 1514.20 std: 441.69(29.17%) max: 1974.00 min: 863.00

Well yes.  We're now doing unaccounted, impact-a-random-process work in
irq context which was previously being done in process context,
accounted to the process which was allocating the memory.  Some would
call this a regression ;)

> pgrotated
> vanilla: records: 5 avg: 873313.80 std: 40999.20(4.69%) max: 954722.00 min: 845903.00
> improve: records: 5 avg: 28406.40 std: 3296.02(11.60%) max: 34552.00 min: 25047.00

Still a surprisingly high amount of rotation going on.

> Most fields in vmstat do not change much, but the ones I notice are
> allocstall and pgrotated. We save a great deal of allocstall (ie,
> direct reclaim) and pgrotated.
> 
> Testing, review and any feedback are welcome!

Well, it will worsen IRQ latencies and it's all more code for us to
maintain.  I think I'd like to see a better story about the end-user
benefits before proceeding.



* Re: [PATCH v3 0/3] free reclaimed pages by paging out instantly
  2014-07-02 20:42   ` Andrew Morton
@ 2014-07-03  0:59     ` Minchan Kim
  -1 siblings, 0 replies; 14+ messages in thread
From: Minchan Kim @ 2014-07-03  0:59 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-mm, Mel Gorman, Rik van Riel, Hugh Dickins,
	Johannes Weiner, Michal Hocko

Hello Andrew,

On Wed, Jul 02, 2014 at 01:42:15PM -0700, Andrew Morton wrote:
> On Wed,  2 Jul 2014 09:13:46 +0900 Minchan Kim <minchan@kernel.org> wrote:
> 
> > Normally, pages whose reclaim I/O has completed are rotated to the
> > inactive LRU tail without being freed. The reason is that we cannot
> > free a page from atomic context (ie, end_page_writeback) because the
> > various locks involved are not aware of atomic context.
> > 
> > So to reclaim those I/O-completed pages we need one more reclaim
> > iteration, which causes unnecessary aging as well as CPU overhead.
> > 
> > A long time ago, at the first attempt, the main concern was memcg
> > locking, but Johannes recently put a great deal of effort into
> > simplifying the memcg lock, and that work was merged into mmotm,
> > so I based this series on the mmotm tree.
> > (Kudos to Johannes)
> > 
> > On a 1G, 12-CPU kvm guest, I built the kernel 5 times; the results were:
> > 
> > allocstall
> > vanilla: records: 5 avg: 4733.80 std: 913.55(19.30%) max: 6442.00 min: 3719.00
> > improve: records: 5 avg: 1514.20 std: 441.69(29.17%) max: 1974.00 min: 863.00
> 
> Well yes.  We're now doing unaccounted, impact-a-random-process work in
> irq context which was previously being done in process context,
> accounted to the process which was allocating the memory.  Some would
> call this a regression ;)

The logic only kicks in when someone reclaims dirty memory by paging
it out with SetPageReclaim, which means immediate reclaim, so normal
writeout sees no extra overhead.
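
The page is tagged that way by reclaim itself, for example in
pageout() in mm/vmscan.c; the sketch below is simplified from memory
(locals elided), so treat it as illustrative rather than exact:

	if (clear_page_dirty_for_io(page)) {
		/* ask for immediate reclaim when writeback completes */
		SetPageReclaim(page);
		res = mapping->a_ops->writepage(page, &wbc);
		if (!PageWriteback(page))
			/* synchronous write or broken a_ops: undo the hint */
			ClearPageReclaim(page);
	}

Writeback started by the flusher threads or fsync never sets
PG_reclaim, so it never enters the new completion path.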

I thought it was a good trade-off in an emergency situation where
reclaim should happen immediately. It saves the lock/irq overhead
that the next reclaim pass would otherwise incur without this patch,
and we already use a pagevec in that path to minimize irq overhead.

> 
> > pgrotated
> > vanilla: records: 5 avg: 873313.80 std: 40999.20(4.69%) max: 954722.00 min: 845903.00
> > improve: records: 5 avg: 28406.40 std: 3296.02(11.60%) max: 34552.00 min: 25047.00
> 
> Still a surprisingly high amount of rotation going on.
> 
> > Most fields in vmstat do not change much, but the ones I notice are
> > allocstall and pgrotated. We save a great deal of allocstall (ie,
> > direct reclaim) and pgrotated.
> > 
> > Testing, review and any feedback are welcome!
> 
> Well, it will worsen IRQ latencies and it's all more code for us to
> maintain.  I think I'd like to see a better story about the end-user
> benefits before proceeding.

The motivation came from per-process reclaim (which is still an
internal feature; I will repost it soon). It is a feature that lets
the platform manage memory so that we can avoid reclaim.

Anyway, userspace expects to see free pages increase in vmstat after
it has performed per-process reclaim, and the userspace logic decides
its next action based on the current number of free pages; but that
does not work with the existing rotation logic, especially for
anonymous swap-write pages.
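
As an illustration of that userspace logic, a platform daemon might
poll nr_free_pages from /proc/vmstat after triggering per-process
reclaim; this is a hypothetical sketch, not part of the patchset:

	#include <stdio.h>
	#include <string.h>

	/* return nr_free_pages from /proc/vmstat, or -1 on error */
	static long read_nr_free_pages(void)
	{
		char key[64];
		long val;
		FILE *f = fopen("/proc/vmstat", "r");

		if (!f)
			return -1;
		while (fscanf(f, "%63s %ld", key, &val) == 2) {
			if (!strcmp(key, "nr_free_pages")) {
				fclose(f);
				return val;
			}
		}
		fclose(f);
		return -1;
	}

With the current rotation behaviour the value read here does not grow
right after reclaim, because the written-back pages are still sitting
on the LRU waiting for the next reclaim pass.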

When I first posted this patchset, Rik was positive, and I thought
this feature would be useful for everyone, not only for per-process
reclaim, so I did not want to mix this patchset up with per-process
reclaim.

https://lkml.org/lkml/2013/5/12/174
https://lkml.org/lkml/2013/5/14/484

Could you tell me what I should do to proceed?
Should I send this patchset together with per-process reclaim, or
something else?

Thanks for the early feedback!


-- 
Kind regards,
Minchan Kim


* Re: [PATCH v3 0/3] free reclaimed pages by paging out instantly
  2014-07-03  0:59     ` Minchan Kim
@ 2014-07-07 19:13       ` Andrew Morton
  -1 siblings, 0 replies; 14+ messages in thread
From: Andrew Morton @ 2014-07-07 19:13 UTC (permalink / raw)
  To: Minchan Kim
  Cc: linux-kernel, linux-mm, Mel Gorman, Rik van Riel, Hugh Dickins,
	Johannes Weiner, Michal Hocko

On Thu, 3 Jul 2014 09:59:49 +0900 Minchan Kim <minchan@kernel.org> wrote:

> > > Most fields in vmstat do not change much, but the ones I notice are
> > > allocstall and pgrotated. We save a great deal of allocstall (ie,
> > > direct reclaim) and pgrotated.
> > > 
> > > Testing, review and any feedback are welcome!
> > 
> > Well, it will worsen IRQ latencies and it's all more code for us to
> > maintain.  I think I'd like to see a better story about the end-user
> > benefits before proceeding.
> 
> The motivation came from per-process reclaim (which is still an
> internal feature; I will repost it soon). It is a feature that lets
> the platform manage memory so that we can avoid reclaim.
> 
> Anyway, userspace expects to see free pages increase in vmstat after
> it has performed per-process reclaim, and the userspace logic decides
> its next action based on the current number of free pages; but that
> does not work with the existing rotation logic, especially for
> anonymous swap-write pages.
> 
> When I first posted this patchset, Rik was positive, and I thought
> this feature would be useful for everyone, not only for per-process
> reclaim, so I did not want to mix this patchset up with per-process
> reclaim.
> 
> https://lkml.org/lkml/2013/5/12/174
> https://lkml.org/lkml/2013/5/14/484
> 
> Could you tell me what I should do to proceed?

Quantify the gains, quantify the losses, then demonstrate that the
benefits of the gains exceed the cost of the losses plus the cost of
ongoing maintenance!


Thread overview:
2014-07-02  0:13 [PATCH v3 0/3] free reclaimed pages by paging out instantly Minchan Kim
2014-07-02  0:13 ` [PATCH v3 1/3] mm: Don't hide spin_lock in swap_info_get internal Minchan Kim
2014-07-02  0:13 ` [PATCH v3 2/3] mm: Introduce atomic_remove_mapping Minchan Kim
2014-07-02  0:13 ` [PATCH v3 3/3] mm: Free reclaimed pages independent of next reclaim Minchan Kim
2014-07-02 20:42 ` [PATCH v3 0/3] free reclaimed pages by paging out instantly Andrew Morton
2014-07-03  0:59   ` Minchan Kim
2014-07-07 19:13     ` Andrew Morton
