* [PATCH 1/4] mm: free swp_entry in madvise_free
From: Minchan Kim @ 2015-03-11  1:20 UTC
  To: Andrew Morton
  Cc: linux-kernel, linux-mm, Michal Hocko, Johannes Weiner,
	Mel Gorman, Rik van Riel, Shaohua Li, Yalin.Wang, Minchan Kim

When I run the test below with 12 processes (i.e., 512M * 12 = 6G of memory
consumed) on my machine (3G RAM + 12 CPUs + 8G swap), madvise_free is
significantly slower (about 2x) than madvise_dontneed.

loop = 5;
mmap(512M);
while (loop--) {
        memset(512M);
        madvise(MADV_FREE or MADV_DONTNEED);
}
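
For anyone who wants to reproduce the loop above, here is a minimal
userspace sketch of one test process. The helper name run_hint_loop() is
hypothetical, and the fallback definition of MADV_FREE assumes the generic
Linux UAPI value (8); on a kernel without MADV_FREE support the madvise()
call is expected to fail, which the helper reports as -1.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Assumption: older libc headers may not define MADV_FREE. 8 is the
 * generic Linux UAPI value; kernels without support reject it.
 */
#ifndef MADV_FREE
#define MADV_FREE 8
#endif

/*
 * Hypothetical helper: map `len` bytes of anonymous memory, then dirty
 * and hint the whole range `loops` times. Returns 0 on success, -1 if
 * mmap() or madvise() fails.
 */
int run_hint_loop(size_t len, int loops, int advice)
{
	int ret = 0;
	char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (buf == MAP_FAILED)
		return -1;

	while (loops-- > 0) {
		memset(buf, 1, len);	/* touch every page in the range */
		if (madvise(buf, len, advice) != 0) {
			ret = -1;
			break;
		}
	}

	munmap(buf, len);
	return ret;
}
```

Calling run_hint_loop(512UL << 20, 5, MADV_FREE) or
run_hint_loop(512UL << 20, 5, MADV_DONTNEED) from 12 concurrent processes
would approximate the workload described above.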

The reason is the large amount of swapin:

1) dontneed: 1,612 swapin
2) madvfree: 879,585 swapin
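
This message does not say how the swapin counts were collected. One
possible way, shown here purely as an assumption, is to sample the
system-wide pswpin counter from /proc/vmstat before and after a run and
take the difference. The sketch below only parses vmstat-style text; the
helper name vmstat_counter() is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: find "<name> <value>" in /proc/vmstat-style text
 * (one counter per line) and store the value in *out.
 * Returns 0 on success, -1 if the counter is missing or malformed.
 */
int vmstat_counter(const char *text, const char *name,
		   unsigned long long *out)
{
	size_t len = strlen(name);
	const char *line = text;

	while (line && *line) {
		/* Require a space after the name so prefixes do not match. */
		if (strncmp(line, name, len) == 0 && line[len] == ' ')
			return sscanf(line + len, "%llu", out) == 1 ? 0 : -1;
		line = strchr(line, '\n');
		if (line)
			line++;		/* step past the newline */
	}
	return -1;
}
```

Reading /proc/vmstat into a buffer and calling
vmstat_counter(buf, "pswpin", &value) at the start and end of a test run
gives the number of pages swapped in during that window.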

If the hinted pages have already been swapped out by the time the syscall is
called, it's pointless to keep their swap entries in the page table.
Instead, let's free the swap entries of those cold pages, because swapin is
more expensive than (page allocation + zeroing).
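
The per-pte decision described above can be modelled in userspace as a
rough sketch. This is a toy, not kernel code: the enum and function names
are hypothetical, and "clearing" an entry simply stands in for freeing the
swap slot and clearing the pte. The returned value plays the role of the
(zero or negative) delta applied to the process's swap-entry counter.

```c
#include <stddef.h>

/* Toy stand-ins for the pte states the walk has to distinguish. */
enum toy_pte {
	TOY_NONE,	/* nothing mapped */
	TOY_PRESENT,	/* page is resident */
	TOY_SWAP,	/* swap entry: page was swapped out */
	TOY_NON_SWAP,	/* non-swap special entry, e.g. migration */
};

/*
 * Walk `n` toy ptes, drop every swap entry, and return the delta to
 * apply to the swap-entry counter (zero or negative).
 */
long toy_free_swap_entries(enum toy_pte *ptes, size_t n)
{
	long nr_swap = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		switch (ptes[i]) {
		case TOY_SWAP:
			nr_swap--;		/* one fewer swap entry */
			ptes[i] = TOY_NONE;	/* models clearing the pte */
			break;
		case TOY_NONE:
		case TOY_NON_SWAP:
		case TOY_PRESENT:
			/* left alone by this part of the walk */
			break;
		}
	}
	return nr_swap;
}
```

A later access to a cleared entry then takes the cheap path (allocate a
page and zero it) instead of a swapin.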

With this patch, swapin drops from 879,585 to 1,878, and the elapsed time
improves accordingly:

1) dontneed: 6.10user 233.50system 0:50.44elapsed
2) madvfree: 6.03user 401.17system 1:30.67elapsed
3) madvfree + below patch: 6.70user 339.14system 1:04.45elapsed
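
To put the elapsed figures above side by side, a small sketch that
converts the "M:SS.ss" fields to seconds and computes ratios may help.
The helper names are hypothetical and the code only does arithmetic on
the numbers already quoted.

```c
#include <stdio.h>

/*
 * Convert an elapsed field in "M:SS.ss" form to seconds.
 * Returns a negative value on parse failure.
 */
double elapsed_seconds(const char *field)
{
	int min;
	double sec;

	if (sscanf(field, "%d:%lf", &min, &sec) != 2)
		return -1.0;
	return min * 60 + sec;
}

/* How many times longer `slow` takes than `fast`. */
double slowdown(const char *slow, const char *fast)
{
	return elapsed_seconds(slow) / elapsed_seconds(fast);
}
```

With the figures above, slowdown("1:30.67", "0:50.44") is about 1.80 and
slowdown("1:04.45", "0:50.44") is about 1.28, so the patch closes most of
the gap to madvise_dontneed without removing it entirely.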

Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 mm/madvise.c | 26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/mm/madvise.c b/mm/madvise.c
index 6d0fcb8..ebe692e 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -274,7 +274,9 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr,
 	spinlock_t *ptl;
 	pte_t *pte, ptent;
 	struct page *page;
+	swp_entry_t entry;
 	unsigned long next;
+	int nr_swap = 0;
 
 	next = pmd_addr_end(addr, end);
 	if (pmd_trans_huge(*pmd)) {
@@ -293,8 +295,22 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr,
 	for (; addr != end; pte++, addr += PAGE_SIZE) {
 		ptent = *pte;
 
-		if (!pte_present(ptent))
+		if (pte_none(ptent))
 			continue;
+		/*
+		 * If the pte holds a swp_entry, just clear the pte to
+		 * prevent a swap-in, which is more expensive than
+		 * (page allocation + zeroing).
+		 */
+		if (!pte_present(ptent)) {
+			entry = pte_to_swp_entry(ptent);
+			if (non_swap_entry(entry))
+				continue;
+			nr_swap--;
+			free_swap_and_cache(entry);
+			pte_clear_not_present_full(mm, addr, pte, tlb->fullmm);
+			continue;
+		}
 
 		page = vm_normal_page(vma, addr, ptent);
 		if (!page)
@@ -326,6 +342,14 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr,
 		set_pte_at(mm, addr, pte, ptent);
 		tlb_remove_tlb_entry(tlb, pte, addr);
 	}
+
+	if (nr_swap) {
+		if (current->mm == mm)
+			sync_mm_rss(mm);
+
+		add_mm_counter(mm, MM_SWAPENTS, nr_swap);
+	}
+
 	arch_leave_lazy_mmu_mode();
 	pte_unmap_unlock(pte - 1, ptl);
 next:
-- 
1.9.3

