* mm: used-once mapped file page detection
From: Johannes Weiner @ 2010-02-22 19:49 UTC
  To: Andrew Morton
  Cc: KOSAKI Motohiro, Minchan Kim, Rik van Riel, linux-mm, linux-kernel


Hi,

this is the second submission of the used-once mapped file page
detection patch.

It is meant to help workloads that create large amounts of briefly
used file mappings, like rtorrent hashing a file or git dealing with
loose objects (git gc on a bigger site?).

Right now, the VM activates referenced mapped file pages on first
encounter on the inactive list and it takes a full memory cycle to
reclaim them again.  When those pages dominate memory, the system
no longer has a meaningful notion of 'working set' and is required
to give up the active list to make reclaim progress.  Obviously,
this results in rather bad scanning latencies and the wrong pages
being reclaimed.

This patch makes the VM more careful about activating mapped file
pages in the first place.  The minimum granted lifetime without
another memory access becomes an inactive list cycle instead of the
full memory cycle, which is more natural given the mentioned loads.

This test resembles a hashing rtorrent process.  Sequentially, 32MB
chunks of a file are mapped into memory, hashed (sha1) and unmapped
again.  While this happens, every 5 seconds a process is launched and
its execution time measured (a sketch of the hashing loop follows the
numbers):

	python2.4 -c 'import pydoc'
	old: max=2.31s mean=1.26s (0.34)
	new: max=1.25s mean=0.32s (0.32)

	find /etc -type f
	old: max=2.52s mean=1.44s (0.43)
	new: max=1.92s mean=0.12s (0.17)

	vim -c ':quit'
	old: max=6.14s mean=4.03s (0.49)
	new: max=3.48s mean=2.41s (0.25)

	mplayer --help
	old: max=8.08s mean=5.74s (1.02)
	new: max=3.79s mean=1.32s (0.81)

	overall hash time (stdev):
	old: time=1192.30 (12.85) thruput=25.78mb/s (0.27)
	new: time=1060.27 (32.58) thruput=29.02mb/s (0.88) (-11%)
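
For illustration, the hashing loop is essentially the following
(a minimal C sketch; do_sha1() is a hypothetical stand-in for the
actual SHA1 computation, error handling omitted):

	#include <fcntl.h>
	#include <sys/mman.h>
	#include <sys/stat.h>
	#include <unistd.h>

	#define CHUNK	(32UL << 20)	/* 32MB mapping window */

	extern void do_sha1(const void *buf, size_t len);	/* hypothetical */

	static void hash_file(const char *path)
	{
		struct stat st;
		int fd = open(path, O_RDONLY);
		off_t off;

		fstat(fd, &st);
		for (off = 0; off < st.st_size; off += CHUNK) {
			size_t len = st.st_size - off;
			void *map;

			if (len > CHUNK)
				len = CHUNK;
			/* map, hash, unmap: every page is used once */
			map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, off);
			do_sha1(map, len);
			munmap(map, len);
		}
		close(fd);
	}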

I also tested kernbench with regular IO streaming in the background to
see whether the delayed activation of frequently used mapped file
pages had a negative impact on performance in the presence of pressure
on the inactive list.  The patch made no significant difference in
timing, neither for kernbench nor for the streaming IO throughput.

The first patch submission raised concerns about the cost of the extra
faults for actually activated pages on machines that have no hardware
support for young page table entries.

I created an artificial worst case scenario on an ARM machine clocked
at around 300MHz with 64MB of memory to figure out the dimensions
involved.  The test would mmap a file of 20MB, then do the following
(a sketch in code follows the list):

  1. touch all its pages to fault them in
  2. force one full scan cycle on the inactive file LRU
  -- old: mapping pages activated
  -- new: mapping pages inactive
  3. touch the mapping pages again
  -- old and new: fault exceptions to set the young bits
  4. force another full scan cycle on the inactive file LRU
  5. touch the mapping pages one last time
  -- new: fault exceptions to set the young bits
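
In code, one iteration of the above might look as follows (a sketch
only; force_inactive_scan() is a hypothetical helper that generates
enough memory pressure, e.g. by allocating and touching anonymous
memory, to push the inactive file LRU through one full scan cycle):

	#include <stddef.h>

	/* hypothetical: cycle the inactive file LRU once */
	extern void force_inactive_scan(void);

	static void touch_pages(volatile char *map, size_t len)
	{
		size_t i;

		for (i = 0; i < len; i += 4096)	/* assumes 4k pages */
			(void)map[i];
	}

	static void one_iteration(volatile char *map, size_t len)
	{
		touch_pages(map, len);		/* 1. fault everything in */
		force_inactive_scan();		/* 2. first LRU cycle */
		touch_pages(map, len);		/* 3. set young bits again */
		force_inactive_scan();		/* 4. second LRU cycle */
		touch_pages(map, len);		/* 5. faults with the patch */
	}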

The test showed an overall increase of 6% in time over 100 iterations
of the above (old: ~212sec, new: ~225sec).  13 secs total overhead /
(100 * 5k pages), ignoring the execution time of the test itself,
makes for about 25us of overhead for every page that actually gets
activated.  Note:

  1. File mapping the size of one third of main memory, _completely_
  in active use across memory pressure - i.e., most pages referenced
  within one LRU cycle.  This should be rare to non-existent,
  especially on such embedded setups.

  2. Many huge activation batches.  Those batches only occur when the
  working set fluctuates.  If it changes completely between every full
  LRU cycle, you have problematic reclaim overhead anyway.

  3. Access of activated pages at maximum speed: sequential loads from
  every single page without doing anything in between.  In reality,
  the extra faults will get distributed between actual operations on
  the data.

So even if a workload manages to get the VM into the situation of
activating a third of memory in one go on such a setup, it will take
2.2 seconds instead of 2.1 without the patch.

Comparing the numbers (and my user-experience over several months),
I think this change is an overall improvement to the VM.

Patch 1 is only refactoring to break up that ugly compound conditional
in shrink_page_list() and make it easy to document and add new checks
in a readable fashion.

Patch 2 gets rid of the obsolete page_mapping_inuse().  It's not
strictly related to #3, but it was in the original submission and is a
net simplification, so I kept it.

Patch 3 implements used-once detection of mapped file pages.

	Hannes

 include/linux/rmap.h |    2 +-
 mm/rmap.c            |    3 -
 mm/vmscan.c          |  105 ++++++++++++++++++++++++++++++++-----------------
 3 files changed, 69 insertions(+), 41 deletions(-)


* [patch 1/3] vmscan: factor out page reference checks
From: Johannes Weiner @ 2010-02-22 19:49 UTC
  To: Andrew Morton
  Cc: KOSAKI Motohiro, Minchan Kim, Rik van Riel, linux-mm, linux-kernel

Moving the big conditional into its own predicate function makes the
code a bit easier to read and allows for better commenting on the
checks one-by-one.

This is just cleaning up, no semantics should have been changed.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 mm/vmscan.c |   53 ++++++++++++++++++++++++++++++++++++++++-------------
 1 files changed, 40 insertions(+), 13 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index c26986c..c2db55b 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -579,6 +579,37 @@ redo:
 	put_page(page);		/* drop ref from isolate */
 }
 
+enum page_references {
+	PAGEREF_RECLAIM,
+	PAGEREF_RECLAIM_CLEAN,
+	PAGEREF_ACTIVATE,
+};
+
+static enum page_references page_check_references(struct page *page,
+						  struct scan_control *sc)
+{
+	unsigned long vm_flags;
+	int referenced;
+
+	referenced = page_referenced(page, 1, sc->mem_cgroup, &vm_flags);
+	if (!referenced)
+		return PAGEREF_RECLAIM;
+
+	/* Lumpy reclaim - ignore references */
+	if (sc->order > PAGE_ALLOC_COSTLY_ORDER)
+		return PAGEREF_RECLAIM;
+
+	/* Mlock lost isolation race - let try_to_unmap() handle it */
+	if (vm_flags & VM_LOCKED)
+		return PAGEREF_RECLAIM;
+
+	if (page_mapping_inuse(page))
+		return PAGEREF_ACTIVATE;
+
+	/* Reclaim if clean, defer dirty pages to writeback */
+	return PAGEREF_RECLAIM_CLEAN;
+}
+
 /*
  * shrink_page_list() returns the number of reclaimed pages
  */
@@ -590,16 +621,15 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 	struct pagevec freed_pvec;
 	int pgactivate = 0;
 	unsigned long nr_reclaimed = 0;
-	unsigned long vm_flags;
 
 	cond_resched();
 
 	pagevec_init(&freed_pvec, 1);
 	while (!list_empty(page_list)) {
+		enum page_references references;
 		struct address_space *mapping;
 		struct page *page;
 		int may_enter_fs;
-		int referenced;
 
 		cond_resched();
 
@@ -641,17 +671,14 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 				goto keep_locked;
 		}
 
-		referenced = page_referenced(page, 1,
-						sc->mem_cgroup, &vm_flags);
-		/*
-		 * In active use or really unfreeable?  Activate it.
-		 * If page which have PG_mlocked lost isoltation race,
-		 * try_to_unmap moves it to unevictable list
-		 */
-		if (sc->order <= PAGE_ALLOC_COSTLY_ORDER &&
-					referenced && page_mapping_inuse(page)
-					&& !(vm_flags & VM_LOCKED))
+		references = page_check_references(page, sc);
+		switch (references) {
+		case PAGEREF_ACTIVATE:
 			goto activate_locked;
+		case PAGEREF_RECLAIM:
+		case PAGEREF_RECLAIM_CLEAN:
+			; /* try to reclaim the page below */
+		}
 
 		/*
 		 * Anonymous process memory has backing store?
@@ -685,7 +712,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 		}
 
 		if (PageDirty(page)) {
-			if (sc->order <= PAGE_ALLOC_COSTLY_ORDER && referenced)
+			if (references == PAGEREF_RECLAIM_CLEAN)
 				goto keep_locked;
 			if (!may_enter_fs)
 				goto keep_locked;
-- 
1.6.6.1



* [patch 2/3] vmscan: drop page_mapping_inuse()
From: Johannes Weiner @ 2010-02-22 19:49 UTC
  To: Andrew Morton
  Cc: KOSAKI Motohiro, Minchan Kim, Rik van Riel, linux-mm, linux-kernel

page_mapping_inuse() is a historic predicate function for pages that
are about to be reclaimed or deactivated.

According to it, a page is in use when it is mapped into page tables
OR part of swap cache OR backing an mmapped file.

This function is used in combination with page_referenced(), which
checks for young bits in ptes and the page descriptor itself for the
PG_referenced bit.  Thus, checking for unmapped swap cache pages is
meaningless as PG_referenced is not set for anonymous pages and
unmapped pages do not have young ptes.  The test makes no difference.

Protecting file pages that are not by themselves mapped but are part
of a mapped file is also a historic leftover for short-lived things
like the exec() code in libc.  However, the VM now does reference
accounting and activation of pages at unmap time and thus the special
treatment on reclaim is obsolete.

This patch drops page_mapping_inuse() and switches the two callsites
to use page_mapped() directly.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 mm/vmscan.c |   25 ++-----------------------
 1 files changed, 2 insertions(+), 23 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index c2db55b..a8e4cbe 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -262,27 +262,6 @@ unsigned long shrink_slab(unsigned long scanned, gfp_t gfp_mask,
 	return ret;
 }
 
-/* Called without lock on whether page is mapped, so answer is unstable */
-static inline int page_mapping_inuse(struct page *page)
-{
-	struct address_space *mapping;
-
-	/* Page is in somebody's page tables. */
-	if (page_mapped(page))
-		return 1;
-
-	/* Be more reluctant to reclaim swapcache than pagecache */
-	if (PageSwapCache(page))
-		return 1;
-
-	mapping = page_mapping(page);
-	if (!mapping)
-		return 0;
-
-	/* File is mmap'd by somebody? */
-	return mapping_mapped(mapping);
-}
-
 static inline int is_page_cache_freeable(struct page *page)
 {
 	/*
@@ -603,7 +582,7 @@ static enum page_references page_check_references(struct page *page,
 	if (vm_flags & VM_LOCKED)
 		return PAGEREF_RECLAIM;
 
-	if (page_mapping_inuse(page))
+	if (page_mapped(page))
 		return PAGEREF_ACTIVATE;
 
 	/* Reclaim if clean, defer dirty pages to writeback */
@@ -1378,7 +1357,7 @@ static void shrink_active_list(unsigned long nr_pages, struct zone *zone,
 		}
 
 		/* page_referenced clears PageReferenced */
-		if (page_mapping_inuse(page) &&
+		if (page_mapped(page) &&
 		    page_referenced(page, 0, sc->mem_cgroup, &vm_flags)) {
 			nr_rotated++;
 			/*
-- 
1.6.6.1



* [patch 3/3] vmscan: detect mapped file pages used only once
From: Johannes Weiner @ 2010-02-22 19:49 UTC
  To: Andrew Morton
  Cc: KOSAKI Motohiro, Minchan Kim, Rik van Riel, linux-mm, linux-kernel

The VM currently assumes that an inactive, mapped and referenced file
page is in use and promotes it to the active list.

However, every mapped file page starts out like this and thus a problem
arises when workloads create a stream of such pages that are used only
for a short time.  By flooding the active list with those pages, the VM
quickly gets into trouble finding eligible reclaim candidates.  The
result is long allocation latencies and eviction of the wrong pages.

This patch reuses the PG_referenced page flag (used for unmapped file
pages) to implement a usage detection that scales with the speed of
LRU list cycling (i.e. memory pressure).

If the scanner encounters such a page, the flag is set and the page
cycled once more on the inactive list.  Only if it returns with
another page table reference is it activated.  Otherwise it is
reclaimed as 'not recently used cache'.

This effectively changes the minimum lifetime of a used-once mapped
file page from a full memory cycle to an inactive list cycle, which
allows it to occur in linear streams without affecting the stable
working set of the system.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 include/linux/rmap.h |    2 +-
 mm/rmap.c            |    3 ---
 mm/vmscan.c          |   45 +++++++++++++++++++++++++++++++++++----------
 3 files changed, 36 insertions(+), 14 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index b019ae6..f4accb5 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -181,7 +181,7 @@ static inline int page_referenced(struct page *page, int is_locked,
 				  unsigned long *vm_flags)
 {
 	*vm_flags = 0;
-	return TestClearPageReferenced(page);
+	return 0;
 }
 
 #define try_to_unmap(page, refs) SWAP_FAIL
diff --git a/mm/rmap.c b/mm/rmap.c
index 278cd27..5a48bda 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -511,9 +511,6 @@ int page_referenced(struct page *page,
 	int referenced = 0;
 	int we_locked = 0;
 
-	if (TestClearPageReferenced(page))
-		referenced++;
-
 	*vm_flags = 0;
 	if (page_mapped(page) && page_rmapping(page)) {
 		if (!is_locked && (!PageAnon(page) || PageKsm(page))) {
diff --git a/mm/vmscan.c b/mm/vmscan.c
index a8e4cbe..674a78b 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -561,18 +561,18 @@ redo:
 enum page_references {
 	PAGEREF_RECLAIM,
 	PAGEREF_RECLAIM_CLEAN,
+	PAGEREF_KEEP,
 	PAGEREF_ACTIVATE,
 };
 
 static enum page_references page_check_references(struct page *page,
 						  struct scan_control *sc)
 {
+	int referenced_ptes, referenced_page;
 	unsigned long vm_flags;
-	int referenced;
 
-	referenced = page_referenced(page, 1, sc->mem_cgroup, &vm_flags);
-	if (!referenced)
-		return PAGEREF_RECLAIM;
+	referenced_ptes = page_referenced(page, 1, sc->mem_cgroup, &vm_flags);
+	referenced_page = TestClearPageReferenced(page);
 
 	/* Lumpy reclaim - ignore references */
 	if (sc->order > PAGE_ALLOC_COSTLY_ORDER)
@@ -582,11 +582,36 @@ static enum page_references page_check_references(struct page *page,
 	if (vm_flags & VM_LOCKED)
 		return PAGEREF_RECLAIM;
 
-	if (page_mapped(page))
-		return PAGEREF_ACTIVATE;
+	if (referenced_ptes) {
+		if (PageAnon(page))
+			return PAGEREF_ACTIVATE;
+		/*
+		 * All mapped pages start out with page table
+		 * references from the instantiating fault, so we need
+		 * to look twice if a mapped file page is used more
+		 * than once.
+		 *
+		 * Mark it and spare it for another trip around the
+		 * inactive list.  Another page table reference will
+		 * lead to its activation.
+		 *
+		 * Note: the mark is set for activated pages as well
+		 * so that recently deactivated but used pages are
+		 * quickly recovered.
+		 */
+		SetPageReferenced(page);
+
+		if (referenced_page)
+			return PAGEREF_ACTIVATE;
+
+		return PAGEREF_KEEP;
+	}
 
 	/* Reclaim if clean, defer dirty pages to writeback */
-	return PAGEREF_RECLAIM_CLEAN;
+	if (referenced_page)
+		return PAGEREF_RECLAIM_CLEAN;
+
+	return PAGEREF_RECLAIM;
 }
 
 /*
@@ -654,6 +679,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 		switch (references) {
 		case PAGEREF_ACTIVATE:
 			goto activate_locked;
+		case PAGEREF_KEEP:
+			goto keep_locked;
 		case PAGEREF_RECLAIM:
 		case PAGEREF_RECLAIM_CLEAN:
 			; /* try to reclaim the page below */
@@ -1356,9 +1383,7 @@ static void shrink_active_list(unsigned long nr_pages, struct zone *zone,
 			continue;
 		}
 
-		/* page_referenced clears PageReferenced */
-		if (page_mapped(page) &&
-		    page_referenced(page, 0, sc->mem_cgroup, &vm_flags)) {
+		if (page_referenced(page, 0, sc->mem_cgroup, &vm_flags)) {
 			nr_rotated++;
 			/*
 			 * Identify referenced, file-backed active pages and
-- 
1.6.6.1



* Re: [patch 1/3] vmscan: factor out page reference checks
From: Rik van Riel @ 2010-02-22 20:27 UTC
  To: Johannes Weiner
  Cc: Andrew Morton, KOSAKI Motohiro, Minchan Kim, linux-mm, linux-kernel

On 02/22/2010 02:49 PM, Johannes Weiner wrote:
> Moving the big conditional into its own predicate function makes the
> code a bit easier to read and allows for better commenting on the
> checks one-by-one.
>
> This is just cleaning up, no semantics should have been changed.
>
> Signed-off-by: Johannes Weiner<hannes@cmpxchg.org>

Reviewed-by: Rik van Riel <riel@redhat.com>


* Re: [patch 2/3] vmscan: drop page_mapping_inuse()
From: Rik van Riel @ 2010-02-22 20:28 UTC
  To: Johannes Weiner
  Cc: Andrew Morton, KOSAKI Motohiro, Minchan Kim, linux-mm, linux-kernel

On 02/22/2010 02:49 PM, Johannes Weiner wrote:
> page_mapping_inuse() is a historic predicate function for pages that
> are about to be reclaimed or deactivated.
>
> According to it, a page is in use when it is mapped into page tables
> OR part of swap cache OR backing an mmapped file.
>
> This function is used in combination with page_referenced(), which
> checks for young bits in ptes and the page descriptor itself for the
> PG_referenced bit.  Thus, checking for unmapped swap cache pages is
> meaningless as PG_referenced is not set for anonymous pages and
> unmapped pages do not have young ptes.  The test makes no difference.
>
> Protecting file pages that are not by themselves mapped but are part
> of a mapped file is also a historic leftover for short-lived things
> like the exec() code in libc.  However, the VM now does reference
> accounting and activation of pages at unmap time and thus the special
> treatment on reclaim is obsolete.
>
> This patch drops page_mapping_inuse() and switches the two callsites
> to use page_mapped() directly.
>
> Signed-off-by: Johannes Weiner<hannes@cmpxchg.org>

Reviewed-by: Rik van Riel <riel@redhat.com>



* Re: [patch 3/3] vmscan: detect mapped file pages used only once
From: Rik van Riel @ 2010-02-22 20:34 UTC
  To: Johannes Weiner
  Cc: Andrew Morton, KOSAKI Motohiro, Minchan Kim, linux-mm, linux-kernel

On 02/22/2010 02:49 PM, Johannes Weiner wrote:
> The VM currently assumes that an inactive, mapped and referenced file
> page is in use and promotes it to the active list.
>
> However, every mapped file page starts out like this and thus a problem
> arises when workloads create a stream of such pages that are used only
> for a short time.  By flooding the active list with those pages, the VM
> quickly gets into trouble finding eligible reclaim canditates.  The
> result is long allocation latencies and eviction of the wrong pages.
>
> This patch reuses the PG_referenced page flag (used for unmapped file
> pages) to implement a usage detection that scales with the speed of
> LRU list cycling (i.e. memory pressure).
>
> If the scanner encounters those pages, the flag is set and the page
> cycled again on the inactive list.  Only if it returns with another
> page table reference it is activated.  Otherwise it is reclaimed as
> 'not recently used cache'.
>
> This effectively changes the minimum lifetime of a used-once mapped
> file page from a full memory cycle to an inactive list cycle, which
> allows it to occur in linear streams without affecting the stable
> working set of the system.
>
> Signed-off-by: Johannes Weiner<hannes@cmpxchg.org>

Reviewed-by: Rik van Riel <riel@redhat.com>



* Re: [patch 1/3] vmscan: factor out page reference checks
From: Minchan Kim @ 2010-02-23 13:38 UTC
  To: Johannes Weiner
  Cc: Andrew Morton, KOSAKI Motohiro, Rik van Riel, linux-mm, linux-kernel

Hi, Hannes. 

On Mon, 2010-02-22 at 20:49 +0100, Johannes Weiner wrote:
> Moving the big conditional into its own predicate function makes the
> code a bit easier to read and allows for better commenting on the
> checks one-by-one.
> 
> This is just cleaning up, no semantics should have been changed.
> 
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> ---
>  mm/vmscan.c |   53 ++++++++++++++++++++++++++++++++++++++++-------------
>  1 files changed, 40 insertions(+), 13 deletions(-)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index c26986c..c2db55b 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -579,6 +579,37 @@ redo:
>  	put_page(page);		/* drop ref from isolate */
>  }
>  
> +enum page_references {
> +	PAGEREF_RECLAIM,
> +	PAGEREF_RECLAIM_CLEAN,
> +	PAGEREF_ACTIVATE,
> +};
> +
> +static enum page_references page_check_references(struct page *page,
> +						  struct scan_control *sc)
> +{
> +	unsigned long vm_flags;
> +	int referenced;
> +
> +	referenced = page_referenced(page, 1, sc->mem_cgroup, &vm_flags);
> +	if (!referenced)
> +		return PAGEREF_RECLAIM;
> +
> +	/* Lumpy reclaim - ignore references */
> +	if (sc->order > PAGE_ALLOC_COSTLY_ORDER)
> +		return PAGEREF_RECLAIM;
> +
> +	/* Mlock lost isolation race - let try_to_unmap() handle it */

How does try_to_unmap() handle it?

/* Page with PG_mlocked that lost the isolation race - let
try_to_unmap() move the page to the unevictable list */

The point is to move the page to the unevictable list in case of the
race.  Let's write the comment more clearly.
The comment as it was before was clear, I think. :)

> +	if (vm_flags & VM_LOCKED)
> +		return PAGEREF_RECLAIM;
> +
> +	if (page_mapping_inuse(page))
> +		return PAGEREF_ACTIVATE;
> +
> +	/* Reclaim if clean, defer dirty pages to writeback */
> +	return PAGEREF_RECLAIM_CLEAN;
> +}
> +
>  /*
>   * shrink_page_list() returns the number of reclaimed pages
>   */
> @@ -590,16 +621,15 @@ static unsigned long shrink_page_list(struct list_head *page_list,
>  	struct pagevec freed_pvec;
>  	int pgactivate = 0;
>  	unsigned long nr_reclaimed = 0;
> -	unsigned long vm_flags;
>  
>  	cond_resched();
>  
>  	pagevec_init(&freed_pvec, 1);
>  	while (!list_empty(page_list)) {
> +		enum page_references references;
>  		struct address_space *mapping;
>  		struct page *page;
>  		int may_enter_fs;
> -		int referenced;
>  
>  		cond_resched();
>  
> @@ -641,17 +671,14 @@ static unsigned long shrink_page_list(struct list_head *page_list,
>  				goto keep_locked;
>  		}
>  
> -		referenced = page_referenced(page, 1,
> -						sc->mem_cgroup, &vm_flags);
> -		/*
> -		 * In active use or really unfreeable?  Activate it.
> -		 * If page which have PG_mlocked lost isoltation race,
> -		 * try_to_unmap moves it to unevictable list
> -		 */
> -		if (sc->order <= PAGE_ALLOC_COSTLY_ORDER &&
> -					referenced && page_mapping_inuse(page)
> -					&& !(vm_flags & VM_LOCKED))
> +		references = page_check_references(page, sc);
> +		switch (references) {
> +		case PAGEREF_ACTIVATE:
>  			goto activate_locked;
> +		case PAGEREF_RECLAIM:
> +		case PAGEREF_RECLAIM_CLEAN:
> +			; /* try to reclaim the page below */
> +		}
>  
>  		/*
>  		 * Anonymous process memory has backing store?
> @@ -685,7 +712,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
>  		}
>  
>  		if (PageDirty(page)) {
> -			if (sc->order <= PAGE_ALLOC_COSTLY_ORDER && referenced)
> +			if (references == PAGEREF_RECLAIM_CLEAN)

How is PAGEREF_RECLAIM_CLEAN semantically equal to
sc->order <= PAGE_ALLOC_COSTLY_ORDER && referenced?
The dirtiness test is already done in the line above by PageDirty.
So I think PAGEREF_RECLAIM_CLEAN isn't proper there.
What is your intention that I don't catch?


>  				goto keep_locked;
>  			if (!may_enter_fs)
>  				goto keep_locked;


-- 
Kind regards,
Minchan Kim

* Re: [patch 2/3] vmscan: drop page_mapping_inuse()
From: Minchan Kim @ 2010-02-23 14:03 UTC
  To: Johannes Weiner
  Cc: Andrew Morton, KOSAKI Motohiro, Rik van Riel, linux-mm, linux-kernel

On Mon, 2010-02-22 at 20:49 +0100, Johannes Weiner wrote:
> page_mapping_inuse() is a historic predicate function for pages that
> are about to be reclaimed or deactivated.
> 
> According to it, a page is in use when it is mapped into page tables
> OR part of swap cache OR backing an mmapped file.
> 
> This function is used in combination with page_referenced(), which
> checks for young bits in ptes and the page descriptor itself for the
> PG_referenced bit.  Thus, checking for unmapped swap cache pages is
> meaningless as PG_referenced is not set for anonymous pages and
> unmapped pages do not have young ptes.  The test makes no difference.

Nice catch!

> 
> Protecting file pages that are not by themselves mapped but are part
> of a mapped file is also a historic leftover for short-lived things


I have long had a question about this part.
You seem to have resolved it. :)
But I want to confirm it against some record.
Could you tell me where I can find the discussion thread or git log
from that time?

Just out of curiosity.

> like the exec() code in libc.  However, the VM now does reference
> accounting and activation of pages at unmap time and thus the special
> treatment on reclaim is obsolete.

It does make sense. 

> 
> This patch drops page_mapping_inuse() and switches the two callsites
> to use page_mapped() directly.
> 
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> ---
>  mm/vmscan.c |   25 ++-----------------------
>  1 files changed, 2 insertions(+), 23 deletions(-)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index c2db55b..a8e4cbe 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -262,27 +262,6 @@ unsigned long shrink_slab(unsigned long scanned, gfp_t gfp_mask,
>  	return ret;
>  }
>  
> -/* Called without lock on whether page is mapped, so answer is unstable */
> -static inline int page_mapping_inuse(struct page *page)
> -{
> -	struct address_space *mapping;
> -
> -	/* Page is in somebody's page tables. */
> -	if (page_mapped(page))
> -		return 1;
> -
> -	/* Be more reluctant to reclaim swapcache than pagecache */
> -	if (PageSwapCache(page))
> -		return 1;
> -
> -	mapping = page_mapping(page);
> -	if (!mapping)
> -		return 0;
> -
> -	/* File is mmap'd by somebody? */
> -	return mapping_mapped(mapping);
> -}
> -
>  static inline int is_page_cache_freeable(struct page *page)
>  {
>  	/*
> @@ -603,7 +582,7 @@ static enum page_references page_check_references(struct page *page,
>  	if (vm_flags & VM_LOCKED)
>  		return PAGEREF_RECLAIM;
>  
> -	if (page_mapping_inuse(page))
> +	if (page_mapped(page))
>  		return PAGEREF_ACTIVATE;
>  
>  	/* Reclaim if clean, defer dirty pages to writeback */
> @@ -1378,7 +1357,7 @@ static void shrink_active_list(unsigned long nr_pages, struct zone *zone,
>  		}
>  
>  		/* page_referenced clears PageReferenced */
> -		if (page_mapping_inuse(page) &&
> +		if (page_mapped(page) &&
>  		    page_referenced(page, 0, sc->mem_cgroup, &vm_flags)) {
>  			nr_rotated++;
>  			/*

It looks good to me.
But page_referenced() already checks page_mapped().
How about folding the standalone page_mapped() check into
page_referenced()?

-- 
Kind regards,
Minchan Kim

* Re: [patch 1/3] vmscan: factor out page reference checks
  2010-02-23 13:38     ` Minchan Kim
@ 2010-02-23 14:21       ` Johannes Weiner
  -1 siblings, 0 replies; 44+ messages in thread
From: Johannes Weiner @ 2010-02-23 14:21 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, KOSAKI Motohiro, Rik van Riel, linux-mm, linux-kernel

Hello Minchan,

On Tue, Feb 23, 2010 at 10:38:23PM +0900, Minchan Kim wrote:
> Hi, Hannes. 
> 
> On Mon, 2010-02-22 at 20:49 +0100, Johannes Weiner wrote:
> > Moving the big conditional into its own predicate function makes the
> > code a bit easier to read and allows for better commenting on the
> > checks one-by-one.
> > 
> > This is just cleaning up, no semantics should have been changed.
> > 
> > Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> > ---
> >  mm/vmscan.c |   53 ++++++++++++++++++++++++++++++++++++++++-------------
> >  1 files changed, 40 insertions(+), 13 deletions(-)
> > 
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index c26986c..c2db55b 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -579,6 +579,37 @@ redo:
> >  	put_page(page);		/* drop ref from isolate */
> >  }
> >  
> > +enum page_references {
> > +	PAGEREF_RECLAIM,
> > +	PAGEREF_RECLAIM_CLEAN,
> > +	PAGEREF_ACTIVATE,
> > +};
> > +
> > +static enum page_references page_check_references(struct page *page,
> > +						  struct scan_control *sc)
> > +{
> > +	unsigned long vm_flags;
> > +	int referenced;
> > +
> > +	referenced = page_referenced(page, 1, sc->mem_cgroup, &vm_flags);
> > +	if (!referenced)
> > +		return PAGEREF_RECLAIM;
> > +
> > +	/* Lumpy reclaim - ignore references */
> > +	if (sc->order > PAGE_ALLOC_COSTLY_ORDER)
> > +		return PAGEREF_RECLAIM;
> > +
> > +	/* Mlock lost isolation race - let try_to_unmap() handle it */
> 
> How does try_to_unmap() handle it?
> 
> /* Page with PG_mlocked lost the isolation race - let try_to_unmap() move
> the page to the unevictable list */
> 
> The point is to move the page to the unevictable list in case of a race.
> Let's write the comment more clearly.
> The old comment made that clear, I think. :)

Okay, I do not feel strongly about it.  I just figured it would be enough
at this point to say 'page is special, pass it on to try_to_unmap(), it
knows how to handle it.  We do not care.'  But maybe you are right.

I attached an incremental patch below.

> > +	if (vm_flags & VM_LOCKED)
> > +		return PAGEREF_RECLAIM;
> > +
> > +	if (page_mapping_inuse(page))
> > +		return PAGEREF_ACTIVATE;
> > +
> > +	/* Reclaim if clean, defer dirty pages to writeback */
> > +	return PAGEREF_RECLAIM_CLEAN;
> > +}
> > +
> >  /*
> >   * shrink_page_list() returns the number of reclaimed pages
> >   */
> > @@ -590,16 +621,15 @@ static unsigned long shrink_page_list(struct list_head *page_list,
> >  	struct pagevec freed_pvec;
> >  	int pgactivate = 0;
> >  	unsigned long nr_reclaimed = 0;
> > -	unsigned long vm_flags;
> >  
> >  	cond_resched();
> >  
> >  	pagevec_init(&freed_pvec, 1);
> >  	while (!list_empty(page_list)) {
> > +		enum page_references references;
> >  		struct address_space *mapping;
> >  		struct page *page;
> >  		int may_enter_fs;
> > -		int referenced;
> >  
> >  		cond_resched();
> >  
> > @@ -641,17 +671,14 @@ static unsigned long shrink_page_list(struct list_head *page_list,
> >  				goto keep_locked;
> >  		}
> >  
> > -		referenced = page_referenced(page, 1,
> > -						sc->mem_cgroup, &vm_flags);
> > -		/*
> > -		 * In active use or really unfreeable?  Activate it.
> > -		 * If page which have PG_mlocked lost isoltation race,
> > -		 * try_to_unmap moves it to unevictable list
> > -		 */
> > -		if (sc->order <= PAGE_ALLOC_COSTLY_ORDER &&
> > -					referenced && page_mapping_inuse(page)
> > -					&& !(vm_flags & VM_LOCKED))
> > +		references = page_check_references(page, sc);
> > +		switch (references) {
> > +		case PAGEREF_ACTIVATE:
> >  			goto activate_locked;
> > +		case PAGEREF_RECLAIM:
> > +		case PAGEREF_RECLAIM_CLEAN:
> > +			; /* try to reclaim the page below */
> > +		}
> >  
> >  		/*
> >  		 * Anonymous process memory has backing store?
> > @@ -685,7 +712,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
> >  		}
> >  
> >  		if (PageDirty(page)) {
> > -			if (sc->order <= PAGE_ALLOC_COSTLY_ORDER && referenced)
> > +			if (references == PAGEREF_RECLAIM_CLEAN)
> 
> How is PAGEREF_RECLAIM_CLEAN semantically equivalent to sc->order <=
> PAGE_ALLOC_COSTLY_ORDER && referenced?

It is encoded in page_check_references().  When
	sc->order <= PAGE_ALLOC_COSTLY_ORDER && referenced
it returns PAGEREF_RECLAIM_CLEAN.

So

	- PageDirty() && order <= COSTLY && referenced
	+ PageDirty() && references == PAGEREF_RECLAIM_CLEAN

is an equivalent transformation.  Does this answer your question?
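
Side by side, both forms taken from the hunks quoted above (surrounding
code elided):

	/* before patch 1/3, open-coded in shrink_page_list() */
	if (PageDirty(page)) {
		if (sc->order <= PAGE_ALLOC_COSTLY_ORDER && referenced)
			goto keep_locked;
		/* ... */
	}

	/* after: the same condition, encoded by page_check_references() */
	if (PageDirty(page)) {
		if (references == PAGEREF_RECLAIM_CLEAN)
			goto keep_locked;
		/* ... */
	}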

	Hannes
	
---
From: Johannes Weiner <hannes@cmpxchg.org>
Subject: vmscan: improve comment on mlocked page in reclaim

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 674a78b..819fff7 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -578,7 +578,10 @@ static enum page_references page_check_references(struct page *page,
 	if (sc->order > PAGE_ALLOC_COSTLY_ORDER)
 		return PAGEREF_RECLAIM;
 
-	/* Mlock lost isolation race - let try_to_unmap() handle it */
+	/*
+	 * Mlock lost the isolation race with us.  Let try_to_unmap()
+	 * move the page to the unevictable list.
+	 */
 	if (vm_flags & VM_LOCKED)
 		return PAGEREF_RECLAIM;
 


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [patch 1/3] vmscan: factor out page reference checks
  2010-02-23 14:21       ` Johannes Weiner
@ 2010-02-23 14:31         ` Rik van Riel
  -1 siblings, 0 replies; 44+ messages in thread
From: Rik van Riel @ 2010-02-23 14:31 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Minchan Kim, Andrew Morton, KOSAKI Motohiro, linux-mm, linux-kernel

On 02/23/2010 09:21 AM, Johannes Weiner wrote:

> From: Johannes Weiner<hannes@cmpxchg.org>
> Subject: vmscan: improve comment on mlocked page in reclaim
>
> Signed-off-by: Johannes Weiner<hannes@cmpxchg.org>
> ---
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 674a78b..819fff7 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -578,7 +578,10 @@ static enum page_references page_check_references(struct page *page,
>   	if (sc->order>  PAGE_ALLOC_COSTLY_ORDER)
>   		return PAGEREF_RECLAIM;
>
> -	/* Mlock lost isolation race - let try_to_unmap() handle it */
> +	/*
> +	 * Mlock lost the isolation race with us.  Let try_to_unmap()
> +	 * move the page to the unevictable list.
> +	 */
>   	if (vm_flags&  VM_LOCKED)
>   		return PAGEREF_RECLAIM;

Reviewed-by: Rik van Riel <riel@redhat.com>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [patch 2/3] vmscan: drop page_mapping_inuse()
  2010-02-23 14:03     ` Minchan Kim
@ 2010-02-23 14:32       ` Johannes Weiner
  -1 siblings, 0 replies; 44+ messages in thread
From: Johannes Weiner @ 2010-02-23 14:32 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, KOSAKI Motohiro, Rik van Riel, linux-mm, linux-kernel

Hello Minchan,

On Tue, Feb 23, 2010 at 11:03:20PM +0900, Minchan Kim wrote:
> On Mon, 2010-02-22 at 20:49 +0100, Johannes Weiner wrote:
> > Protecting file pages that are not by themselves mapped but are part
> > of a mapped file is also a historic leftover for short-lived things
> 
> I have had a question about this part for a long time.
> You seem to have solved it. :)
> But I want to confirm it against some record.
> Could you tell me where I can find the discussion mail thread or git log
> from that time?

I dug up this change in history.git, but unfortunately it was merged
undocumented as part of a large changeset.  So there does not seem to be
any written record of why it was introduced initially.  What I wrote is
based on what Rik told me on IRC.

> >  	/* Reclaim if clean, defer dirty pages to writeback */
> > @@ -1378,7 +1357,7 @@ static void shrink_active_list(unsigned long nr_pages, struct zone *zone,
> >  		}
> >  
> >  		/* page_referenced clears PageReferenced */
> > -		if (page_mapping_inuse(page) &&
> > +		if (page_mapped(page) &&
> >  		    page_referenced(page, 0, sc->mem_cgroup, &vm_flags)) {
> >  			nr_rotated++;
> >  			/*
> 
> It looks good to me.
> But page_referenced() already checks page_mapped() internally.
> How about folding the standalone page_mapped() check into page_referenced() itself?

The next patch essentially does that.  page_referenced() will no longer
clear PG_referenced on the page and if page_referenced() is true, it
means that young ptes were found and the page must thus be mapped.

So #3 removes the page_mapped() from this conditional.
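
Concretely, the 3/3 hunk reduces the check quoted above to just (context
elided; see the 3/3 posting for the full hunk):

	if (page_referenced(page, 0, sc->mem_cgroup, &vm_flags)) {
		nr_rotated++;
		/* ... */
	}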

	Hannes

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [patch 1/3] vmscan: factor out page reference checks
  2010-02-23 14:21       ` Johannes Weiner
@ 2010-02-23 14:44         ` Minchan Kim
  -1 siblings, 0 replies; 44+ messages in thread
From: Minchan Kim @ 2010-02-23 14:44 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, KOSAKI Motohiro, Rik van Riel, linux-mm, linux-kernel

On Tue, 2010-02-23 at 15:21 +0100, Johannes Weiner wrote:
> Hello Minchan,
> 
> On Tue, Feb 23, 2010 at 10:38:23PM +0900, Minchan Kim wrote:

<snip>

> > >  
> > >  		if (PageDirty(page)) {
> > > -			if (sc->order <= PAGE_ALLOC_COSTLY_ORDER && referenced)
> > > +			if (references == PAGEREF_RECLAIM_CLEAN)
> > 
> > How is PAGEREF_RECLAIM_CLEAN semantically equivalent to sc->order <=
> > PAGE_ALLOC_COSTLY_ORDER && referenced?
> 
> It is encoded in page_check_references().  When
> 	sc->order <= PAGE_ALLOC_COSTLY_ORDER && referenced
> it returns PAGEREF_RECLAIM_CLEAN.
> 
> So
> 
> 	- PageDirty() && order <= COSTLY && referenced
> 	+ PageDirty() && references == PAGEREF_RECLAIM_CLEAN
> 
> is an equivalent transformation.  Does this answer your question?

Hmm, I know that. My point was that the name PAGEREF_RECLAIM_CLEAN seems
a little awkward. I thought PAGEREF_RECLAIM_CLEAN meant that if the page
is clean, it can be reclaimed.

I think it would be better to rename it to something that represents
"Although the page was referenced recently, we can reclaim it if the VM
is trying to reclaim a high-order page".


> 
> 	Hannes


-- 
Kind regards,
Minchan Kim



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [patch 2/3] vmscan: drop page_mapping_inuse()
  2010-02-23 14:32       ` Johannes Weiner
@ 2010-02-23 14:48         ` Minchan Kim
  -1 siblings, 0 replies; 44+ messages in thread
From: Minchan Kim @ 2010-02-23 14:48 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, KOSAKI Motohiro, Rik van Riel, linux-mm, linux-kernel

On Tue, 2010-02-23 at 15:32 +0100, Johannes Weiner wrote:
> Hello Minchan,
> 
> On Tue, Feb 23, 2010 at 11:03:20PM +0900, Minchan Kim wrote:
> > On Mon, 2010-02-22 at 20:49 +0100, Johannes Weiner wrote:
> > > Protecting file pages that are not by themselves mapped but are part
> > > of a mapped file is also a historic leftover for short-lived things
> > 
> > I have had a question about this part for a long time.
> > You seem to have solved it. :)
> > But I want to confirm it against some record.
> > Could you tell me where I can find the discussion mail thread or git log
> > from that time?
> 
> I dug up this change in history.git, but unfortunately it was merged
> undocumented as part of a large changeset.  So there does not seem to be
> any written record of why it was introduced initially.  What I wrote is
> based on what Rik told me on IRC.
> 
> > >  	/* Reclaim if clean, defer dirty pages to writeback */
> > > @@ -1378,7 +1357,7 @@ static void shrink_active_list(unsigned long nr_pages, struct zone *zone,
> > >  		}
> > >  
> > >  		/* page_referenced clears PageReferenced */
> > > -		if (page_mapping_inuse(page) &&
> > > +		if (page_mapped(page) &&
> > >  		    page_referenced(page, 0, sc->mem_cgroup, &vm_flags)) {
> > >  			nr_rotated++;
> > >  			/*
> > 
> > It looks good to me.
> > But page_referenced() already checks page_mapped() internally.
> > How about folding the standalone page_mapped() check into page_referenced() itself?
> 
> The next patch essentially does that.  page_referenced() will no longer
> clear PG_referenced on the page and if page_referenced() is true, it
> means that young ptes were found and the page must thus be mapped.
> 
> So #3 removes the page_mapped() from this conditional.

Thanks!
I should have reviewed your [3/3] before nitpicking.

> 
> 	Hannes


-- 
Kind regards,
Minchan Kim



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [patch 3/3] vmscan: detect mapped file pages used only once
  2010-02-22 19:49   ` Johannes Weiner
@ 2010-02-23 15:03     ` Minchan Kim
  -1 siblings, 0 replies; 44+ messages in thread
From: Minchan Kim @ 2010-02-23 15:03 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, KOSAKI Motohiro, Rik van Riel, linux-mm, linux-kernel

On Mon, 2010-02-22 at 20:49 +0100, Johannes Weiner wrote:
> The VM currently assumes that an inactive, mapped and referenced file
> page is in use and promotes it to the active list.
> 
> However, every mapped file page starts out like this and thus a problem
> arises when workloads create a stream of such pages that are used only
> for a short time.  By flooding the active list with those pages, the VM
> quickly gets into trouble finding eligible reclaim canditates.  The
> result is long allocation latencies and eviction of the wrong pages.
> 
> This patch reuses the PG_referenced page flag (used for unmapped file
> pages) to implement a usage detection that scales with the speed of
> LRU list cycling (i.e. memory pressure).
> 
> If the scanner encounters those pages, the flag is set and the page
> cycled again on the inactive list.  Only if it returns with another
> page table reference it is activated.  Otherwise it is reclaimed as
> 'not recently used cache'.
> 
> This effectively changes the minimum lifetime of a used-once mapped
> file page from a full memory cycle to an inactive list cycle, which
> allows it to occur in linear streams without affecting the stable
> working set of the system.
> 
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> ---
>  include/linux/rmap.h |    2 +-
>  mm/rmap.c            |    3 ---
>  mm/vmscan.c          |   45 +++++++++++++++++++++++++++++++++++----------
>  3 files changed, 36 insertions(+), 14 deletions(-)
> 
> diff --git a/include/linux/rmap.h b/include/linux/rmap.h
> index b019ae6..f4accb5 100644
> --- a/include/linux/rmap.h
> +++ b/include/linux/rmap.h
> @@ -181,7 +181,7 @@ static inline int page_referenced(struct page *page, int is_locked,
>  				  unsigned long *vm_flags)
>  {
>  	*vm_flags = 0;
> -	return TestClearPageReferenced(page);
> +	return 0;
>  }
>  
>  #define try_to_unmap(page, refs) SWAP_FAIL
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 278cd27..5a48bda 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -511,9 +511,6 @@ int page_referenced(struct page *page,
>  	int referenced = 0;
>  	int we_locked = 0;
>  
> -	if (TestClearPageReferenced(page))
> -		referenced++;
> -

From now on, page_referenced() looks only at the page tables for
references.  So let's note that in the function description, like
"This function only checks references from ptes."

>  	*vm_flags = 0;
>  	if (page_mapped(page) && page_rmapping(page)) {
>  		if (!is_locked && (!PageAnon(page) || PageKsm(page))) {
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index a8e4cbe..674a78b 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -561,18 +561,18 @@ redo:
>  enum page_references {
>  	PAGEREF_RECLAIM,
>  	PAGEREF_RECLAIM_CLEAN,
> +	PAGEREF_KEEP,
>  	PAGEREF_ACTIVATE,
>  };
>  
>  static enum page_references page_check_references(struct page *page,
>  						  struct scan_control *sc)
>  {
> +	int referenced_ptes, referenced_page;
>  	unsigned long vm_flags;
> -	int referenced;
>  
> -	referenced = page_referenced(page, 1, sc->mem_cgroup, &vm_flags);
> -	if (!referenced)
> -		return PAGEREF_RECLAIM;
> +	referenced_ptes = page_referenced(page, 1, sc->mem_cgroup, &vm_flags);
> +	referenced_page = TestClearPageReferenced(page);
>  
>  	/* Lumpy reclaim - ignore references */
>  	if (sc->order > PAGE_ALLOC_COSTLY_ORDER)
> @@ -582,11 +582,36 @@ static enum page_references page_check_references(struct page *page,
>  	if (vm_flags & VM_LOCKED)
>  		return PAGEREF_RECLAIM;
>  
> -	if (page_mapped(page))
> -		return PAGEREF_ACTIVATE;
> +	if (referenced_ptes) {
> +		if (PageAnon(page))
> +			return PAGEREF_ACTIVATE;
> +		/*
> +		 * All mapped pages start out with page table
> +		 * references from the instantiating fault, so we need
> +		 * to look twice if a mapped file page is used more
> +		 * than once.
> +		 *
> +		 * Mark it and spare it for another trip around the
> +		 * inactive list.  Another page table reference will
> +		 * lead to its activation.
> +		 *
> +		 * Note: the mark is set for activated pages as well
> +		 * so that recently deactivated but used pages are
> +		 * quickly recovered.
> +		 */
> +		SetPageReferenced(page);
> +
> +		if (referenced_page)
> +			return PAGEREF_ACTIVATE;
> +
> +		return PAGEREF_KEEP;
> +	}
>  
>  	/* Reclaim if clean, defer dirty pages to writeback */
> -	return PAGEREF_RECLAIM_CLEAN;
> +	if (referenced_page)
> +		return PAGEREF_RECLAIM_CLEAN;
> +
> +	return PAGEREF_RECLAIM;
>  }
>  
>  /*
> @@ -654,6 +679,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
>  		switch (references) {
>  		case PAGEREF_ACTIVATE:
>  			goto activate_locked;
> +		case PAGEREF_KEEP:
> +			goto keep_locked;
>  		case PAGEREF_RECLAIM:
>  		case PAGEREF_RECLAIM_CLEAN:
>  			; /* try to reclaim the page below */
> @@ -1356,9 +1383,7 @@ static void shrink_active_list(unsigned long nr_pages, struct zone *zone,
>  			continue;
>  		}
>  
> -		/* page_referenced clears PageReferenced */
> -		if (page_mapped(page) &&
> -		    page_referenced(page, 0, sc->mem_cgroup, &vm_flags)) {
> +		if (page_referenced(page, 0, sc->mem_cgroup, &vm_flags)) {
>  			nr_rotated++;
>  			/*
>  			 * Identify referenced, file-backed active pages and
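
For reference, the decision matrix that the new page_check_references()
above implements (lumpy reclaim and VM_LOCKED override everything to
PAGEREF_RECLAIM first):

	/* referenced_ptes  old PG_referenced  result                   */
	/* 0                0                  PAGEREF_RECLAIM          */
	/* 0                1                  PAGEREF_RECLAIM_CLEAN    */
	/* 1, file          0                  mark + PAGEREF_KEEP      */
	/* 1, file          1                  mark + PAGEREF_ACTIVATE  */
	/* 1, anon          -                  PAGEREF_ACTIVATE         */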

It looks good to me except for the PAGEREF_RECLAIM_CLEAN naming.

I am glad to see this effort of yours again, Hannes. :)

-- 
Kind regards,
Minchan Kim



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [patch 1/3] vmscan: factor out page reference checks
  2010-02-23 14:44         ` Minchan Kim
@ 2010-02-23 15:40           ` Johannes Weiner
  -1 siblings, 0 replies; 44+ messages in thread
From: Johannes Weiner @ 2010-02-23 15:40 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, KOSAKI Motohiro, Rik van Riel, linux-mm, linux-kernel

Hello Minchan,

On Tue, Feb 23, 2010 at 11:44:14PM +0900, Minchan Kim wrote:
> On Tue, 2010-02-23 at 15:21 +0100, Johannes Weiner wrote:
> > Hello Minchan,
> > 
> > On Tue, Feb 23, 2010 at 10:38:23PM +0900, Minchan Kim wrote:
> 
> <snip>
> 
> > > >  
> > > >  		if (PageDirty(page)) {
> > > > -			if (sc->order <= PAGE_ALLOC_COSTLY_ORDER && referenced)
> > > > +			if (references == PAGEREF_RECLAIM_CLEAN)
> > > 
> > > How is PAGEREF_RECLAIM_CLEAN semantically equivalent to sc->order <=
> > > PAGE_ALLOC_COSTLY_ORDER && referenced?
> > 
> > It is encoded in page_check_references().  When
> > 	sc->order <= PAGE_ALLOC_COSTLY_ORDER && referenced
> > it returns PAGEREF_RECLAIM_CLEAN.
> > 
> > So
> > 
> > 	- PageDirty() && order <= COSTLY && referenced
> > 	+ PageDirty() && references == PAGEREF_RECLAIM_CLEAN
> > 
> > is an equivalent transformation.  Does this answer your question?
> 
> Hmm, I know that. My point was that the name PAGEREF_RECLAIM_CLEAN seems
> a little awkward. I thought PAGEREF_RECLAIM_CLEAN meant that if the page
> is clean, it can be reclaimed.

But you were thinking right, it is exactly what it means!  If
the state is PAGEREF_RECLAIM_CLEAN, reclaim the page if it is clean:

	if (PageDirty(page)) {
		if (references == PAGEREF_RECLAIM_CLEAN)
			goto keep_locked;	/* do not reclaim */
		...
	}

> I think it would be better to rename it to something that represents
> "Although the page was referenced recently, we can reclaim it if the VM
> is trying to reclaim a high-order page".

I changed it to PAGEREF_RECLAIM_LUMPY and PAGEREF_RECLAIM, but I felt
it made it worse.  It's awkward that we have to communicate that state
at all, maybe it would be better to do

        if (PageDirty(page) && referenced_page)
                return PAGEREF_KEEP;

in page_check_references()?  But doing PageDirty() twice is also kinda
lame.

I don't know.  Can we leave it like that for now?

        Hannes

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [patch 3/3] vmscan: detect mapped file pages used only once
  2010-02-23 15:03     ` Minchan Kim
@ 2010-02-23 15:45       ` Johannes Weiner
  -1 siblings, 0 replies; 44+ messages in thread
From: Johannes Weiner @ 2010-02-23 15:45 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, KOSAKI Motohiro, Rik van Riel, linux-mm, linux-kernel

Hi,

On Wed, Feb 24, 2010 at 12:03:13AM +0900, Minchan Kim wrote:
> On Mon, 2010-02-22 at 20:49 +0100, Johannes Weiner wrote:
> > diff --git a/mm/rmap.c b/mm/rmap.c
> > index 278cd27..5a48bda 100644
> > --- a/mm/rmap.c
> > +++ b/mm/rmap.c
> > @@ -511,9 +511,6 @@ int page_referenced(struct page *page,
> >  	int referenced = 0;
> >  	int we_locked = 0;
> >  
> > -	if (TestClearPageReferenced(page))
> > -		referenced++;
> > -
> 
> From now on, page_referenced() looks only at the page tables for
> references.  So let's note that in the function description, like
> "This function only checks references from ptes."

Hehe, the function comment already says:

	* returns the number of ptes which referenced the page.

so it is already correct.  Only the code did not match it until now.

> It looks good to me except for the PAGEREF_RECLAIM_CLEAN naming.
> 
> I am glad to see this effort of yours again, Hannes. :)

Thank you for your review,

	Hannes

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [patch 1/3] vmscan: factor out page reference checks
  2010-02-23 15:40           ` Johannes Weiner
@ 2010-02-23 16:04             ` Minchan Kim
  -1 siblings, 0 replies; 44+ messages in thread
From: Minchan Kim @ 2010-02-23 16:04 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, KOSAKI Motohiro, Rik van Riel, linux-mm, linux-kernel

On Wed, Feb 24, 2010 at 12:40 AM, Johannes Weiner <hannes@cmpxchg.org> wrote:
> Hello Minchan,
>
> On Tue, Feb 23, 2010 at 11:44:14PM +0900, Minchan Kim wrote:
>> On Tue, 2010-02-23 at 15:21 +0100, Johannes Weiner wrote:
>> > Hello Minchan,
>> >
>> > On Tue, Feb 23, 2010 at 10:38:23PM +0900, Minchan Kim wrote:
>>
>> <snip>
>>
>> > > >
>> > > >                 if (PageDirty(page)) {
>> > > > -                       if (sc->order <= PAGE_ALLOC_COSTLY_ORDER && referenced)
>> > > > +                       if (references == PAGEREF_RECLAIM_CLEAN)
>> > >
>> > > How is PAGEREF_RECLAIM_CLEAN semantically equivalent to sc->order <=
>> > > PAGE_ALLOC_COSTLY_ORDER && referenced?
>> >
>> > It is encoded in page_check_references().  When
>> >     sc->order <= PAGE_ALLOC_COSTLY_ORDER && referenced
>> > it returns PAGEREF_RECLAIM_CLEAN.
>> >
>> > So
>> >
>> >     - PageDirty() && order <= COSTLY && referenced
>> >     + PageDirty() && references == PAGEREF_RECLAIM_CLEAN
>> >
>> > is an equivalent transformation.  Does this answer your question?
>>
>> Hmm, I know that. My point was that the name PAGEREF_RECLAIM_CLEAN seems
>> a little awkward. I thought PAGEREF_RECLAIM_CLEAN meant that if the page
>> is clean, it can be reclaimed.
>
> But you were thinking right, it is exactly what it means!  If
> the state is PAGEREF_RECLAIM_CLEAN, reclaim the page if it is clean:
>
>        if (PageDirty(page)) {
>                if (references == PAGEREF_RECLAIM_CLEAN)
>                        goto keep_locked;       /* do not reclaim */
>                ...
>        }
>
>> I think it would be better to rename it to something that represents
>> "Although the page was referenced recently, we can reclaim it if the VM
>> is trying to reclaim a high-order page".
>
> I changed it to PAGEREF_RECLAIM_LUMPY and PAGEREF_RECLAIM, but I felt
> it made it worse.  It's awkward that we have to communicate that state
> at all, maybe it would be better to do
>
>        if (PageDirty(page) && referenced_page)
>                return PAGEREF_KEEP;
>
> in page_check_references()?  But doing PageDirty() twice is also kinda
> lame.
>
> I don't know.  Can we leave it like that for now?

I would keep it as it is if we don't have any better idea and you don't
feel strongly about it.  But let's listen to others' opinions; maybe they
have a good idea.

Thanks, Hannes.



-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: mm: used-once mapped file page detection
  2010-02-22 19:49 ` Johannes Weiner
@ 2010-02-24 21:39   ` Andrew Morton
  -1 siblings, 0 replies; 44+ messages in thread
From: Andrew Morton @ 2010-02-24 21:39 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: KOSAKI Motohiro, Minchan Kim, Rik van Riel, linux-mm, linux-kernel

On Mon, 22 Feb 2010 20:49:07 +0100 Johannes Weiner <hannes@cmpxchg.org> wrote:

> this is the second submission of the used-once mapped file page
> detection patch.
> 
> It is meant to help workloads with large amounts of shortly used file
> mappings, like rtorrent hashing a file or git when dealing with loose
> objects (git gc on a bigger site?).
> 
> Right now, the VM activates referenced mapped file pages on first
> encounter on the inactive list and it takes a full memory cycle to
> reclaim them again.  When those pages dominate memory, the system
> no longer has a meaningful notion of 'working set' and is required
> to give up the active list to make reclaim progress.  Obviously,
> this results in rather bad scanning latencies and the wrong pages
> being reclaimed.
> 
> This patch makes the VM be more careful about activating mapped file
> pages in the first place.  The minimum granted lifetime without
> another memory access becomes an inactive list cycle instead of the
> full memory cycle, which is more natural given the mentioned loads.

iirc from a long time ago, the insta-activation of mapped pages was
done because people were getting peeved about having their interactive
applications (X, browser, etc) getting paged out, and bumping the pages
immediately was found to help with this subjective problem.

So it was a latency issue more than a throughput issue.  I wouldn't be
surprised if we get some complaints from people for the same reasons as
a result of this patch.

I guess that during the evaluation period of this change, it would be
useful to have a /proc knob which people can toggle to revert to the
old behaviour.  So they can verify that this patchset was indeed the
cause of the deterioration, and so they can easily quantify any
deterioration?

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: mm: used-once mapped file page detection
  2010-02-24 21:39   ` Andrew Morton
@ 2010-02-26 14:32     ` Johannes Weiner
  -1 siblings, 0 replies; 44+ messages in thread
From: Johannes Weiner @ 2010-02-26 14:32 UTC (permalink / raw)
  To: Andrew Morton
  Cc: KOSAKI Motohiro, Minchan Kim, Rik van Riel, linux-mm, linux-kernel

On Wed, Feb 24, 2010 at 01:39:46PM -0800, Andrew Morton wrote:
> On Mon, 22 Feb 2010 20:49:07 +0100 Johannes Weiner <hannes@cmpxchg.org> wrote:
> 
> > This patch makes the VM be more careful about activating mapped file
> > pages in the first place.  The minimum granted lifetime without
> > another memory access becomes an inactive list cycle instead of the
> > full memory cycle, which is more natural given the mentioned loads.
> 
> iirc from a long time ago, the insta-activation of mapped pages was
> done because people were getting peeved about having their interactive
> applications (X, browser, etc) getting paged out, and bumping the pages
> immediately was found to help with this subjective problem.
> 
> So it was a latency issue more than a throughput issue.  I wouldn't be
> surprised if we get some complaints from people for the same reasons as
> a result of this patch.

Agreed.  Although we now have other things in place to protect them once
they are active (VM_EXEC protection, lazy active list scanning).
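
For reference, the VM_EXEC protection mentioned above is the check in
shrink_active_list() that gives referenced executable file pages another
trip around the active list instead of demoting them.  A rough sketch
from memory -- names and details may differ from the tree this is
against:

	if (page_referenced(page, 0, sc->mem_cgroup, &vm_flags)) {
		nr_rotated++;
		/* Referenced executable file pages stay active. */
		if ((vm_flags & VM_EXEC) && page_is_file_cache(page)) {
			list_add(&page->lru, &l_active);
			continue;
		}
	}
	/* Everything else gets demoted to the inactive list. */
	list_add(&page->lru, &l_inactive);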

> I guess that during the evaluation period of this change, it would be
> useful to have a /proc knob which people can toggle to revert to the
> old behaviour.  So they can verify that this patchset was indeed the
> cause of the deterioration, and so they can easily quantify any
> deterioration?

Sounds like a good idea.  By evaluation period, do you mean -mm?  Or
would this knob make it upstream as well?

	Hannes

From: Johannes Weiner <hannes@cmpxchg.org>
Subject: vmscan: add sysctl to revert mapped file heuristics

During the evaluation period of the used-once mapped file detection,
provide a sysctl to disable the heuristics at runtime, allowing users
to confirm or rule them out as the source of observed problems.
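
Usage during testing (assuming the sysctl lands as posted; the knob is
not explicitly initialised, so it defaults to 0, i.e. the used-once
detection stays enabled):

	echo 1 >/proc/sys/vm/rigid_filemap_protection	# old behaviour
	echo 0 >/proc/sys/vm/rigid_filemap_protection	# used-once detection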

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 include/linux/swap.h |    1 +
 kernel/sysctl.c      |    7 +++++++
 mm/vmscan.c          |    4 +++-
 3 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index a2602a8..0c1e724 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -254,6 +254,7 @@ extern unsigned long shrink_all_memory(unsigned long nr_pages);
 extern int vm_swappiness;
 extern int remove_mapping(struct address_space *mapping, struct page *page);
 extern long vm_total_pages;
+extern int vm_rigid_filemap_protection;
 
 #ifdef CONFIG_NUMA
 extern int zone_reclaim_mode;
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 8a68b24..9fa46fb 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1050,6 +1050,13 @@ static struct ctl_table vm_table[] = {
 		.extra1		= &zero,
 		.extra2		= &one_hundred,
 	},
+	{
+		.procname	= "rigid_filemap_protection",
+		.data		= &vm_rigid_filemap_protection,
+		.maxlen		= sizeof(vm_rigid_filemap_protection),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec,
+	},
 #ifdef CONFIG_HUGETLB_PAGE
 	{
 		.procname	= "nr_hugepages",
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 819fff7..d494153 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -565,6 +565,8 @@ enum page_references {
 	PAGEREF_ACTIVATE,
 };
 
+int vm_rigid_filemap_protection __read_mostly;
+
 static enum page_references page_check_references(struct page *page,
 						  struct scan_control *sc)
 {
@@ -586,7 +588,7 @@ static enum page_references page_check_references(struct page *page,
 		return PAGEREF_RECLAIM;
 
 	if (referenced_ptes) {
-		if (PageAnon(page))
+		if (PageAnon(page) || vm_rigid_filemap_protection)
 			return PAGEREF_ACTIVATE;
 		/*
 		 * All mapped pages start out with page table
-- 
1.6.6.1

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: mm: used-once mapped file page detection
  2010-02-26 14:32     ` Johannes Weiner
@ 2010-02-28 17:49       ` Rik van Riel
  -1 siblings, 0 replies; 44+ messages in thread
From: Rik van Riel @ 2010-02-28 17:49 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, KOSAKI Motohiro, Minchan Kim, linux-mm, linux-kernel

On 02/26/2010 09:32 AM, Johannes Weiner wrote:
> On Wed, Feb 24, 2010 at 01:39:46PM -0800, Andrew Morton wrote:
>> On Mon, 22 Feb 2010 20:49:07 +0100 Johannes Weiner <hannes@cmpxchg.org> wrote:
>>
>>> This patch makes the VM be more careful about activating mapped file
>>> pages in the first place.  The minimum granted lifetime without
>>> another memory access becomes an inactive list cycle instead of the
>>> full memory cycle, which is more natural given the mentioned loads.
>>
>> iirc from a long time ago, the insta-activation of mapped pages was
>> done because people were getting peeved about having their interactive
>> applications (X, browser, etc) getting paged out, and bumping the pages
>> immediately was found to help with this subjective problem.
>>
>> So it was a latency issue more than a throughput issue.  I wouldn't be
>> surprised if we get some complaints from people for the same reasons as
>> a result of this patch.
>
> Agreed.  Although we now have other things in place to protect them once
> they are active (VM_EXEC protection, lazy active list scanning).

You think we'll need VM_EXEC protection on the inactive list
after your changes?


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: mm: used-once mapped file page detection
  2010-02-28 17:49       ` Rik van Riel
@ 2010-02-28 20:36         ` Johannes Weiner
  -1 siblings, 0 replies; 44+ messages in thread
From: Johannes Weiner @ 2010-02-28 20:36 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Andrew Morton, KOSAKI Motohiro, Minchan Kim, linux-mm, linux-kernel

On Sun, Feb 28, 2010 at 12:49:14PM -0500, Rik van Riel wrote:
> On 02/26/2010 09:32 AM, Johannes Weiner wrote:
> >On Wed, Feb 24, 2010 at 01:39:46PM -0800, Andrew Morton wrote:
> >>On Mon, 22 Feb 2010 20:49:07 +0100 Johannes Weiner <hannes@cmpxchg.org> wrote:
> >>
> >>>This patch makes the VM be more careful about activating mapped file
> >>>pages in the first place.  The minimum granted lifetime without
> >>>another memory access becomes an inactive list cycle instead of the
> >>>full memory cycle, which is more natural given the mentioned loads.
> >>
> >>iirc from a long time ago, the insta-activation of mapped pages was
> >>done because people were getting peeved about having their interactive
> >>applications (X, browser, etc) getting paged out, and bumping the pages
> >>immediately was found to help with this subjective problem.
> >>
> >>So it was a latency issue more than a throughput issue.  I wouldn't be
> >>surprised if we get some complaints from people for the same reasons as
> >>a result of this patch.
> >
> >Agreed.  Although we now have other things in place to protect them once
> >they are active (VM_EXEC protection, lazy active list scanning).
> 
> You think we'll need VM_EXEC protection on the inactive list
> after your changes?

So far I personally have not seen anything that would indicate the
need for it.  But I would consider it an option if Andrew's worries
turned out to be true.
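
Purely illustrative, not something I am proposing now: since
page_referenced() already returns the vm_flags of the mapping,
page_check_references() could exempt executable file mappings from the
used-once logic with a check along these lines, mirroring the active
list's VM_EXEC protection:

	/* Hypothetical: activate referenced executable file pages
	 * immediately; vm_flags is filled in by page_referenced()
	 * above. */
	if (vm_flags & VM_EXEC)
		return PAGEREF_ACTIVATE;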

^ permalink raw reply	[flat|nested] 44+ messages in thread

end of thread, other threads:[~2010-02-28 20:36 UTC | newest]

Thread overview: 44+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-02-22 19:49 mm: used-once mapped file page detection Johannes Weiner
2010-02-22 19:49 ` [patch 1/3] vmscan: factor out page reference checks Johannes Weiner
2010-02-22 20:27   ` Rik van Riel
2010-02-23 13:38   ` Minchan Kim
2010-02-23 14:21     ` Johannes Weiner
2010-02-23 14:31       ` Rik van Riel
2010-02-23 14:44       ` Minchan Kim
2010-02-23 15:40         ` Johannes Weiner
2010-02-23 16:04           ` Minchan Kim
2010-02-22 19:49 ` [patch 2/3] vmscan: drop page_mapping_inuse() Johannes Weiner
2010-02-22 20:28   ` Rik van Riel
2010-02-23 14:03   ` Minchan Kim
2010-02-23 14:32     ` Johannes Weiner
2010-02-23 14:48       ` Minchan Kim
2010-02-22 19:49 ` [patch 3/3] vmscan: detect mapped file pages used only once Johannes Weiner
2010-02-22 20:34   ` Rik van Riel
2010-02-23 15:03   ` Minchan Kim
2010-02-23 15:45     ` Johannes Weiner
2010-02-24 21:39 ` mm: used-once mapped file page detection Andrew Morton
2010-02-26 14:32   ` Johannes Weiner
2010-02-28 17:49     ` Rik van Riel
2010-02-28 20:36       ` Johannes Weiner
