linux-kernel.vger.kernel.org archive mirror
* [PATCH 1/2] vmscan: promote shared file mapped pages
@ 2011-08-08 11:06 Konstantin Khlebnikov
  2011-08-08 11:07 ` [PATCH 2/2] vmscan: activate executable pages after first usage Konstantin Khlebnikov
                   ` (4 more replies)
  0 siblings, 5 replies; 32+ messages in thread
From: Konstantin Khlebnikov @ 2011-08-08 11:06 UTC (permalink / raw)
  To: linux-mm, Andrew Morton, linux-kernel
  Cc: Wu Fengguang, KAMEZAWA Hiroyuki, Johannes Weiner

Commit v2.6.33-5448-g6457474 (vmscan: detect mapped file pages used only once)
greatly decreases the lifetime of mapped file pages that are used only once.
Unfortunately it also decreases the lifetime of all shared mapped file pages,
because after commit v2.6.28-6130-gbf3f3bc (mm: don't mark_page_accessed in fault path)
the page-fault handler no longer marks a page active or even referenced.

Thus page_check_references() activates a file page only if it is used twice while
it sits on the inactive list, whereas it activates anon pages after the first access.
The inactive list can be small enough that the reclaimer may accidentally
throw away a widely used page simply because it was not used twice within a short period.

After this patch page_check_references() also activates a mapped file page on the first
inactive-list scan if the page is already mapped multiple times via several ptes.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
---
 mm/vmscan.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 47403c9..3cd766d 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -724,7 +724,7 @@ static enum page_references page_check_references(struct page *page,
 		 */
 		SetPageReferenced(page);
 
-		if (referenced_page)
+		if (referenced_page || referenced_ptes > 1)
 			return PAGEREF_ACTIVATE;
 
 		return PAGEREF_KEEP;


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH 2/2] vmscan: activate executable pages after first usage
  2011-08-08 11:06 [PATCH 1/2] vmscan: promote shared file mapped pages Konstantin Khlebnikov
@ 2011-08-08 11:07 ` Konstantin Khlebnikov
  2011-08-08 23:58   ` KAMEZAWA Hiroyuki
                     ` (2 more replies)
  2011-08-08 11:37 ` [PATCH 1/2] vmscan: promote shared file mapped pages Pekka Enberg
                   ` (3 subsequent siblings)
  4 siblings, 3 replies; 32+ messages in thread
From: Konstantin Khlebnikov @ 2011-08-08 11:07 UTC (permalink / raw)
  To: linux-mm, Andrew Morton, linux-kernel
  Cc: Wu Fengguang, KAMEZAWA Hiroyuki, Johannes Weiner

The logic added in commit v2.6.30-5507-g8cab475
(vmscan: make mapped executable pages the first class citizen)
was noticeably weakened by commit v2.6.33-5448-g6457474
(vmscan: detect mapped file pages used only once).

Currently these pages can become "first class citizens" only after their second usage.

After this patch page_check_references() will activate them after the first usage,
and executable code gets an even better chance to stay in memory.

TODO:
run some cool tests like in v2.6.30-5507-g8cab475 =)

Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
---
 mm/vmscan.c |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 3cd766d..29b3612 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -727,6 +727,12 @@ static enum page_references page_check_references(struct page *page,
 		if (referenced_page || referenced_ptes > 1)
 			return PAGEREF_ACTIVATE;
 
+		/*
+		 * Activate file-backed executable pages after first usage.
+		 */
+		if (vm_flags & VM_EXEC)
+			return PAGEREF_ACTIVATE;
+
 		return PAGEREF_KEEP;
 	}
 


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/2] vmscan: promote shared file mapped pages
  2011-08-08 11:06 [PATCH 1/2] vmscan: promote shared file mapped pages Konstantin Khlebnikov
  2011-08-08 11:07 ` [PATCH 2/2] vmscan: activate executable pages after first usage Konstantin Khlebnikov
@ 2011-08-08 11:37 ` Pekka Enberg
  2011-08-08 12:18   ` Konstantin Khlebnikov
  2011-08-08 23:36 ` Minchan Kim
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 32+ messages in thread
From: Pekka Enberg @ 2011-08-08 11:37 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: linux-mm, Andrew Morton, linux-kernel, Wu Fengguang,
	KAMEZAWA Hiroyuki, Johannes Weiner

Hi Konstantin,

On Mon, Aug 8, 2011 at 2:06 PM, Konstantin Khlebnikov
<khlebnikov@openvz.org> wrote:
> Commit v2.6.33-5448-g6457474 (vmscan: detect mapped file pages used only once)
> greatly decreases lifetime of single-used mapped file pages.
> Unfortunately it also decreases life time of all shared mapped file pages.
> Because after commit v2.6.28-6130-gbf3f3bc (mm: don't mark_page_accessed in fault path)
> page-fault handler does not mark page active or even referenced.
>
> Thus page_check_references() activates file page only if it was used twice while
> it stays in inactive list, meanwhile it activates anon pages after first access.
> Inactive list can be small enough, this way reclaimer can accidentally
> throw away any widely used page if it wasn't used twice in short period.
>
> After this patch page_check_references() also activate file mapped page at first
> inactive list scan if this page is already used multiple times via several ptes.
>
> Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>

Both patches seem reasonable but the changelogs don't really explain
why you're doing the changes. How did you find out about the problem?
Is there some workload that's affected? How did you test your changes?

                       Pekka

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/2] vmscan: promote shared file mapped pages
  2011-08-08 11:37 ` [PATCH 1/2] vmscan: promote shared file mapped pages Pekka Enberg
@ 2011-08-08 12:18   ` Konstantin Khlebnikov
  2011-08-08 12:40     ` Pekka Enberg
                       ` (2 more replies)
  0 siblings, 3 replies; 32+ messages in thread
From: Konstantin Khlebnikov @ 2011-08-08 12:18 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: linux-mm, Andrew Morton, linux-kernel, Wu Fengguang,
	KAMEZAWA Hiroyuki, Johannes Weiner

Pekka Enberg wrote:
> Hi Konstantin,
>
> On Mon, Aug 8, 2011 at 2:06 PM, Konstantin Khlebnikov
> <khlebnikov@openvz.org>  wrote:
>> Commit v2.6.33-5448-g6457474 (vmscan: detect mapped file pages used only once)
>> greatly decreases lifetime of single-used mapped file pages.
>> Unfortunately it also decreases life time of all shared mapped file pages.
>> Because after commit v2.6.28-6130-gbf3f3bc (mm: don't mark_page_accessed in fault path)
>> page-fault handler does not mark page active or even referenced.
>>
>> Thus page_check_references() activates file page only if it was used twice while
>> it stays in inactive list, meanwhile it activates anon pages after first access.
>> Inactive list can be small enough, this way reclaimer can accidentally
>> throw away any widely used page if it wasn't used twice in short period.
>>
>> After this patch page_check_references() also activate file mapped page at first
>> inactive list scan if this page is already used multiple times via several ptes.
>>
>> Signed-off-by: Konstantin Khlebnikov<khlebnikov@openvz.org>
>
> Both patches seem reasonable but the changelogs don't really explain
> why you're doing the changes. How did you find out about the problem?
> Is there some workload that's affected? How did you test your changes?
>

I found this while trying to fix a degradation in rhel6 (~2.6.32) relative to rhel5 (~2.6.18).
It is a complete mess with >100 web/mail/spam/ftp containers:
they share all their files, but there are a lot of anonymous pages:
~500mb of shared file-mapped memory and 15-20Gb of non-shared anonymous memory.
In this situation major page faults are very costly, because all containers share the same pages.
Under my load the kernel created disproportionate pressure on the file memory compared
with the anonymous memory; they equalized only when I raised swappiness up to 150 =)

These patches actually didn't help a lot with my problem, but I saw a noticeable
(10-20 times) reduction in the count and average time of major page faults in file-mapped areas.

Actually both patches are fixes for commit v2.6.33-5448-g6457474,
because it was aimed at one scenario (singly used pages)
but breaks the logic in other scenarios (shared and/or executable pages).

>                         Pekka


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/2] vmscan: promote shared file mapped pages
  2011-08-08 12:18   ` Konstantin Khlebnikov
@ 2011-08-08 12:40     ` Pekka Enberg
  2011-08-08 12:51       ` Konstantin Khlebnikov
  2011-08-18  9:09     ` Johannes Weiner
  2011-11-02 16:30     ` Johannes Weiner
  2 siblings, 1 reply; 32+ messages in thread
From: Pekka Enberg @ 2011-08-08 12:40 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: linux-mm, Andrew Morton, linux-kernel, Wu Fengguang,
	KAMEZAWA Hiroyuki, Johannes Weiner

On Mon, Aug 8, 2011 at 3:18 PM, Konstantin Khlebnikov
<khlebnikov@parallels.com> wrote:
> Pekka Enberg wrote:
>>
>> Hi Konstantin,
>>
>> On Mon, Aug 8, 2011 at 2:06 PM, Konstantin Khlebnikov
>> <khlebnikov@openvz.org>  wrote:
>>>
>>> Commit v2.6.33-5448-g6457474 (vmscan: detect mapped file pages used only
>>> once)
>>> greatly decreases lifetime of single-used mapped file pages.
>>> Unfortunately it also decreases life time of all shared mapped file
>>> pages.
>>> Because after commit v2.6.28-6130-gbf3f3bc (mm: don't mark_page_accessed
>>> in fault path)
>>> page-fault handler does not mark page active or even referenced.
>>>
>>> Thus page_check_references() activates file page only if it was used
>>> twice while
>>> it stays in inactive list, meanwhile it activates anon pages after first
>>> access.
>>> Inactive list can be small enough, this way reclaimer can accidentally
>>> throw away any widely used page if it wasn't used twice in short period.
>>>
>>> After this patch page_check_references() also activate file mapped page
>>> at first
>>> inactive list scan if this page is already used multiple times via
>>> several ptes.
>>>
>>> Signed-off-by: Konstantin Khlebnikov<khlebnikov@openvz.org>
>>
>> Both patches seem reasonable but the changelogs don't really explain
>> why you're doing the changes. How did you find out about the problem?
>> Is there some workload that's affected? How did you test your changes?
>>
>
> I found this while trying to fix degragation in rhel6 (~2.6.32) from rhel5
> (~2.6.18).
> There a complete mess with >100 web/mail/spam/ftp containers,
> they share all their files but there a lot of anonymous pages:
> ~500mb shared file mapped memory and 15-20Gb non-shared anonymous memory.
> In this situation major-pagefaults are very costly, because all containers
> share the same page.
> In my load kernel created a disproportionate pressure on the file memory,
> compared with the anonymous,
> they equaled only if I raise swappiness up to 150 =)
>
> These patches actually wasn't helped a lot in my problem,
> but I saw noticable (10-20 times) reduce in count and average time of
> major-pagefault in file-mapped areas.
>
> Actually both patches are fixes for commit v2.6.33-5448-g6457474,
> because it was aimed at one scenario (singly used pages),
> but it breaks the logic in other scenarios (shared and/or executable pages)

It'd be nice to have such details in the changelogs. FWIW,

Acked-by: Pekka Enberg <penberg@kernel.org>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/2] vmscan: promote shared file mapped pages
  2011-08-08 12:40     ` Pekka Enberg
@ 2011-08-08 12:51       ` Konstantin Khlebnikov
  0 siblings, 0 replies; 32+ messages in thread
From: Konstantin Khlebnikov @ 2011-08-08 12:51 UTC (permalink / raw)
  To: Pekka Enberg
  Cc: linux-mm, Andrew Morton, linux-kernel, Wu Fengguang,
	KAMEZAWA Hiroyuki, Johannes Weiner

Pekka Enberg wrote:
> On Mon, Aug 8, 2011 at 3:18 PM, Konstantin Khlebnikov
> <khlebnikov@parallels.com>  wrote:
>> Pekka Enberg wrote:
>>>
>>> Hi Konstantin,
>>>
>>> On Mon, Aug 8, 2011 at 2:06 PM, Konstantin Khlebnikov
>>> <khlebnikov@openvz.org>    wrote:
>>>>
>>>> Commit v2.6.33-5448-g6457474 (vmscan: detect mapped file pages used only
>>>> once)
>>>> greatly decreases lifetime of single-used mapped file pages.
>>>> Unfortunately it also decreases life time of all shared mapped file
>>>> pages.
>>>> Because after commit v2.6.28-6130-gbf3f3bc (mm: don't mark_page_accessed
>>>> in fault path)
>>>> page-fault handler does not mark page active or even referenced.
>>>>
>>>> Thus page_check_references() activates file page only if it was used
>>>> twice while
>>>> it stays in inactive list, meanwhile it activates anon pages after first
>>>> access.
>>>> Inactive list can be small enough, this way reclaimer can accidentally
>>>> throw away any widely used page if it wasn't used twice in short period.
>>>>
>>>> After this patch page_check_references() also activate file mapped page
>>>> at first
>>>> inactive list scan if this page is already used multiple times via
>>>> several ptes.
>>>>
>>>> Signed-off-by: Konstantin Khlebnikov<khlebnikov@openvz.org>
>>>
>>> Both patches seem reasonable but the changelogs don't really explain
>>> why you're doing the changes. How did you find out about the problem?
>>> Is there some workload that's affected? How did you test your changes?
>>>
>>
>> I found this while trying to fix degragation in rhel6 (~2.6.32) from rhel5
>> (~2.6.18).
>> There a complete mess with>100 web/mail/spam/ftp containers,
>> they share all their files but there a lot of anonymous pages:
>> ~500mb shared file mapped memory and 15-20Gb non-shared anonymous memory.
>> In this situation major-pagefaults are very costly, because all containers
>> share the same page.
>> In my load kernel created a disproportionate pressure on the file memory,
>> compared with the anonymous,
>> they equaled only if I raise swappiness up to 150 =)
>>
>> These patches actually wasn't helped a lot in my problem,
>> but I saw noticable (10-20 times) reduce in count and average time of
>> major-pagefault in file-mapped areas.
>>
>> Actually both patches are fixes for commit v2.6.33-5448-g6457474,
>> because it was aimed at one scenario (singly used pages),
>> but it breaks the logic in other scenarios (shared and/or executable pages)
>
> It'd be nice to have such details in the changelogs. FWIW,

To be quite honest, I did not do any measurements on the mainline kernel;
I only booted 3.1-rc1 with these patches on my laptop.
It would be nice to repeat the measurements from v2.6.30-5507-g8cab475,
but I do not have time for that right now.

>
> Acked-by: Pekka Enberg<penberg@kernel.org>


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/2] vmscan: promote shared file mapped pages
  2011-08-08 11:06 [PATCH 1/2] vmscan: promote shared file mapped pages Konstantin Khlebnikov
  2011-08-08 11:07 ` [PATCH 2/2] vmscan: activate executable pages after first usage Konstantin Khlebnikov
  2011-08-08 11:37 ` [PATCH 1/2] vmscan: promote shared file mapped pages Pekka Enberg
@ 2011-08-08 23:36 ` Minchan Kim
  2011-08-08 23:51 ` KAMEZAWA Hiroyuki
  2011-10-31 20:12 ` Andrew Morton
  4 siblings, 0 replies; 32+ messages in thread
From: Minchan Kim @ 2011-08-08 23:36 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: linux-mm, Andrew Morton, linux-kernel, Wu Fengguang,
	KAMEZAWA Hiroyuki, Johannes Weiner, Rik van Riel,
	KOSAKI Motohiro

On Mon, Aug 8, 2011 at 8:06 PM, Konstantin Khlebnikov
<khlebnikov@openvz.org> wrote:
> Commit v2.6.33-5448-g6457474 (vmscan: detect mapped file pages used only once)
> greatly decreases lifetime of single-used mapped file pages.
> Unfortunately it also decreases life time of all shared mapped file pages.
> Because after commit v2.6.28-6130-gbf3f3bc (mm: don't mark_page_accessed in fault path)
> page-fault handler does not mark page active or even referenced.
>
> Thus page_check_references() activates file page only if it was used twice while
> it stays in inactive list, meanwhile it activates anon pages after first access.
> Inactive list can be small enough, this way reclaimer can accidentally
> throw away any widely used page if it wasn't used twice in short period.
>
> After this patch page_check_references() also activate file mapped page at first
> inactive list scan if this page is already used multiple times via several ptes.
>
> Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>

Looks good to me.
But the issue is that with your patch we prefer shared mapped file pages
aggressively.

A shared page already has a bigger chance to be promoted than a singly
mapped page in the same time window, as many processes can touch it.

Your concern is that when the file LRU is too small or scanned
aggressively, a shared mapped page could lose its chance to be activated;
but that applies to singly mapped pages, too. And still, shared mapped
pages have a bigger chance to be activated than singly mapped pages, so
our algorithm already reflects a preference for shared mappings a bit.

The fundamental problem is that our eviction algorithm considers only
recency, not frequency. That is a very old problem and it is not easy
to fix in practice.

Anyway, this is not a matter of right or wrong but a policy decision,
and I support yours.

Acked-by: Minchan Kim <minchan.kim@gmail.com>



-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/2] vmscan: promote shared file mapped pages
  2011-08-08 11:06 [PATCH 1/2] vmscan: promote shared file mapped pages Konstantin Khlebnikov
                   ` (2 preceding siblings ...)
  2011-08-08 23:36 ` Minchan Kim
@ 2011-08-08 23:51 ` KAMEZAWA Hiroyuki
  2011-10-31 20:12 ` Andrew Morton
  4 siblings, 0 replies; 32+ messages in thread
From: KAMEZAWA Hiroyuki @ 2011-08-08 23:51 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: linux-mm, Andrew Morton, linux-kernel, Wu Fengguang, Johannes Weiner

On Mon, 8 Aug 2011 15:06:58 +0400
Konstantin Khlebnikov <khlebnikov@openvz.org> wrote:

> Commit v2.6.33-5448-g6457474 (vmscan: detect mapped file pages used only once)
> greatly decreases lifetime of single-used mapped file pages.
> Unfortunately it also decreases life time of all shared mapped file pages.
> Because after commit v2.6.28-6130-gbf3f3bc (mm: don't mark_page_accessed in fault path)
> page-fault handler does not mark page active or even referenced.
> 
> Thus page_check_references() activates file page only if it was used twice while
> it stays in inactive list, meanwhile it activates anon pages after first access.
> Inactive list can be small enough, this way reclaimer can accidentally
> throw away any widely used page if it wasn't used twice in short period.
> 
> After this patch page_check_references() also activate file mapped page at first
> inactive list scan if this page is already used multiple times via several ptes.
> 
> Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>

As other guys pointed out, it's better to show the performance score change
from this patch in the changelog.

Reviewed-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

> ---
>  mm/vmscan.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 47403c9..3cd766d 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -724,7 +724,7 @@ static enum page_references page_check_references(struct page *page,
>  		 */
>  		SetPageReferenced(page);
>  
> -		if (referenced_page)
> +		if (referenced_page || referenced_ptes > 1)
>  			return PAGEREF_ACTIVATE;
>  
>  		return PAGEREF_KEEP;
> 
> 


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 2/2] vmscan: activate executable pages after first usage
  2011-08-08 11:07 ` [PATCH 2/2] vmscan: activate executable pages after first usage Konstantin Khlebnikov
@ 2011-08-08 23:58   ` KAMEZAWA Hiroyuki
  2011-08-09  0:02   ` Minchan Kim
  2011-08-09  1:23   ` Shaohua Li
  2 siblings, 0 replies; 32+ messages in thread
From: KAMEZAWA Hiroyuki @ 2011-08-08 23:58 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: linux-mm, Andrew Morton, linux-kernel, Wu Fengguang, Johannes Weiner

On Mon, 8 Aug 2011 15:07:00 +0400
Konstantin Khlebnikov <khlebnikov@openvz.org> wrote:

> Logic added in commit v2.6.30-5507-g8cab475
> (vmscan: make mapped executable pages the first class citizen)
> was noticeably weakened in commit v2.6.33-5448-g6457474
> (vmscan: detect mapped file pages used only once)
> 
> Currently these pages can become "first class citizens" only after second usage.
> 
> After this patch page_check_references() will activate they after first usage,
> and executable code gets yet better chance to stay in memory.
> 
> TODO:
> run some cool tests like in v2.6.30-5507-g8cab475 =)
> 
> Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>

How effective was this in your tests?



> ---
>  mm/vmscan.c |    6 ++++++
>  1 files changed, 6 insertions(+), 0 deletions(-)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 3cd766d..29b3612 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -727,6 +727,12 @@ static enum page_references page_check_references(struct page *page,
>  		if (referenced_page || referenced_ptes > 1)
>  			return PAGEREF_ACTIVATE;
>  
> +		/*
> +		 * Activate file-backed executable pages after first usage.
> +		 */
> +		if (vm_flags & VM_EXEC)
> +			return PAGEREF_ACTIVATE;
> +
>  		return PAGEREF_KEEP;
>  	}
>  
> 
> 


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 2/2] vmscan: activate executable pages after first usage
  2011-08-08 11:07 ` [PATCH 2/2] vmscan: activate executable pages after first usage Konstantin Khlebnikov
  2011-08-08 23:58   ` KAMEZAWA Hiroyuki
@ 2011-08-09  0:02   ` Minchan Kim
  2011-08-09  0:04     ` KAMEZAWA Hiroyuki
  2011-08-09  1:23   ` Shaohua Li
  2 siblings, 1 reply; 32+ messages in thread
From: Minchan Kim @ 2011-08-09  0:02 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: linux-mm, Andrew Morton, linux-kernel, Wu Fengguang,
	KAMEZAWA Hiroyuki, Johannes Weiner

On Mon, Aug 8, 2011 at 8:07 PM, Konstantin Khlebnikov
<khlebnikov@openvz.org> wrote:
> Logic added in commit v2.6.30-5507-g8cab475
> (vmscan: make mapped executable pages the first class citizen)
> was noticeably weakened in commit v2.6.33-5448-g6457474
> (vmscan: detect mapped file pages used only once)
>
> Currently these pages can become "first class citizens" only after second usage.
>
> After this patch page_check_references() will activate they after first usage,
> and executable code gets yet better chance to stay in memory.
>
> TODO:
> run some cool tests like in v2.6.30-5507-g8cab475 =)
>
> Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
> ---

This might be a very controversial topic.
AFAIR, at least, it was when "vmscan: make mapped executable pages the
first class citizen" was merged. :)

You are trying to change the behavior:

Old: protect *working set* executable pages.
New: protect executable pages *unconditionally*.

At least the old logic can skip executable pages that have not been
accessed recently.

Wu ran many tests to persuade others.
As you said, we need some numbers to change the policy.

-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 2/2] vmscan: activate executable pages after first usage
  2011-08-09  0:02   ` Minchan Kim
@ 2011-08-09  0:04     ` KAMEZAWA Hiroyuki
  2011-08-09  0:26       ` Minchan Kim
  0 siblings, 1 reply; 32+ messages in thread
From: KAMEZAWA Hiroyuki @ 2011-08-09  0:04 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel,
	Wu Fengguang, Johannes Weiner

On Tue, 9 Aug 2011 09:02:28 +0900
Minchan Kim <minchan.kim@gmail.com> wrote:

> On Mon, Aug 8, 2011 at 8:07 PM, Konstantin Khlebnikov
> <khlebnikov@openvz.org> wrote:
> > Logic added in commit v2.6.30-5507-g8cab475
> > (vmscan: make mapped executable pages the first class citizen)
> > was noticeably weakened in commit v2.6.33-5448-g6457474
> > (vmscan: detect mapped file pages used only once)
> >
> > Currently these pages can become "first class citizens" only after second usage.
> >
> > After this patch page_check_references() will activate they after first usage,
> > and executable code gets yet better chance to stay in memory.
> >
> > TODO:
> > run some cool tests like in v2.6.30-5507-g8cab475 =)
> >
> > Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
> > ---
> 
> It might be a very controversial topic.
> AFAIR, at least, we did when vmscan: make mapped executable pages the
> first class citizen was merged. :)
> 
> You try to change behavior.
> 
> Old : protect *working set* executable page
> New: protect executable page *unconditionally*.
> 

Hmm? I thought:
Old: protect pages if referenced twice.
New: protect executable pages if referenced once.

IIUC, ANON is protected if it is referenced once.

So this patch puts EXECUTABLE file pages in the same class as ANON pages.

Anyway, I agree a test/measurement is required.

Thanks,
-Kame

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 2/2] vmscan: activate executable pages after first usage
  2011-08-09  0:04     ` KAMEZAWA Hiroyuki
@ 2011-08-09  0:26       ` Minchan Kim
  0 siblings, 0 replies; 32+ messages in thread
From: Minchan Kim @ 2011-08-09  0:26 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Konstantin Khlebnikov, linux-mm, Andrew Morton, linux-kernel,
	Wu Fengguang, Johannes Weiner

Hi, Kame.

On Tue, Aug 9, 2011 at 9:04 AM, KAMEZAWA Hiroyuki
<kamezawa.hiroyu@jp.fujitsu.com> wrote:
> On Tue, 9 Aug 2011 09:02:28 +0900
> Minchan Kim <minchan.kim@gmail.com> wrote:
>
>> On Mon, Aug 8, 2011 at 8:07 PM, Konstantin Khlebnikov
>> <khlebnikov@openvz.org> wrote:
>> > Logic added in commit v2.6.30-5507-g8cab475
>> > (vmscan: make mapped executable pages the first class citizen)
>> > was noticeably weakened in commit v2.6.33-5448-g6457474
>> > (vmscan: detect mapped file pages used only once)
>> >
>> > Currently these pages can become "first class citizens" only after second usage.
>> >
>> > After this patch page_check_references() will activate they after first usage,
>> > and executable code gets yet better chance to stay in memory.
>> >
>> > TODO:
>> > run some cool tests like in v2.6.30-5507-g8cab475 =)
>> >
>> > Signed-off-by: Konstantin Khlebnikov <khlebnikov@openvz.org>
>> > ---
>>
>> It might be a very controversial topic.
>> AFAIR, at least, we did when vmscan: make mapped executable pages the
>> first class citizen was merged. :)
>>
>> You try to change behavior.
>>
>> Old : protect *working set* executable page
>> New: protect executable page *unconditionally*.
>>
>
> Hmm ? I thought
> Old: protect pages if referenced twice
> New: protect executable page if referenced once.
>
> IIUC, ANON is proteced if it's referenced once.
>
> So, this patch changes EXECUTABLE file to the same class as ANON pages.

"Working set" means two reference in implementation of the moment. But
it can change in future as many as we want.

"Unconditionally" means that all of mapped page starts from referenced
pte so it would activate all of executable pages.

>
> Anyway, I agree test/measurement is required.

Absolutely.

-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 2/2] vmscan: activate executable pages after first usage
  2011-08-08 11:07 ` [PATCH 2/2] vmscan: activate executable pages after first usage Konstantin Khlebnikov
  2011-08-08 23:58   ` KAMEZAWA Hiroyuki
  2011-08-09  0:02   ` Minchan Kim
@ 2011-08-09  1:23   ` Shaohua Li
  2 siblings, 0 replies; 32+ messages in thread
From: Shaohua Li @ 2011-08-09  1:23 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: linux-mm, Andrew Morton, linux-kernel, Wu, Fengguang,
	KAMEZAWA Hiroyuki, Johannes Weiner

On Mon, 2011-08-08 at 19:07 +0800, Konstantin Khlebnikov wrote:
> Logic added in commit v2.6.30-5507-g8cab475
> (vmscan: make mapped executable pages the first class citizen)
> was noticeably weakened in commit v2.6.33-5448-g6457474
> (vmscan: detect mapped file pages used only once)
> 
> Currently these pages can become "first class citizens" only after second usage.
> 
> After this patch page_check_references() will activate they after first usage,
> and executable code gets yet better chance to stay in memory.
> 
> TODO:
> run some cool tests like in v2.6.30-5507-g8cab475 =)
I posted a similar patch a while ago:
http://marc.info/?l=linux-mm&m=128572906801887&w=2
but running Fengguang's tests didn't show an improvement. Actually, the
VM_EXEC protection in shrink_active_list() didn't show an improvement in
my runs either, and I'm wondering if we should remove it. I guess the
(vmscan: detect mapped file pages used only once) patch makes the VM_EXEC
protection lose its effect. It would be great if you can show solid data.

Thanks,
Shaohua


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/2] vmscan: promote shared file mapped pages
  2011-08-08 12:18   ` Konstantin Khlebnikov
  2011-08-08 12:40     ` Pekka Enberg
@ 2011-08-18  9:09     ` Johannes Weiner
  2011-11-02 16:30     ` Johannes Weiner
  2 siblings, 0 replies; 32+ messages in thread
From: Johannes Weiner @ 2011-08-18  9:09 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: Pekka Enberg, linux-mm, Andrew Morton, linux-kernel,
	Wu Fengguang, KAMEZAWA Hiroyuki, Rik van Riel

On Mon, Aug 08, 2011 at 04:18:11PM +0400, Konstantin Khlebnikov wrote:
> Pekka Enberg wrote:
> >Hi Konstantin,
> >
> >On Mon, Aug 8, 2011 at 2:06 PM, Konstantin Khlebnikov
> ><khlebnikov@openvz.org>  wrote:
> >>Commit v2.6.33-5448-g6457474 (vmscan: detect mapped file pages used only once)
> >>greatly decreases lifetime of single-used mapped file pages.
> >>Unfortunately it also decreases life time of all shared mapped file pages.
> >>Because after commit v2.6.28-6130-gbf3f3bc (mm: don't mark_page_accessed in fault path)
> >>page-fault handler does not mark page active or even referenced.
> >>
> >>Thus page_check_references() activates file page only if it was used twice while
> >>it stays in inactive list, meanwhile it activates anon pages after first access.
> >>Inactive list can be small enough, this way reclaimer can accidentally
> >>throw away any widely used page if it wasn't used twice in short period.
> >>
> >>After this patch page_check_references() also activate file mapped page at first
> >>inactive list scan if this page is already used multiple times via several ptes.
> >>
> >>Signed-off-by: Konstantin Khlebnikov<khlebnikov@openvz.org>
> >
> >Both patches seem reasonable but the changelogs don't really explain
> >why you're doing the changes. How did you find out about the problem?
> >Is there some workload that's affected? How did you test your changes?
> >
> 
> I found this while trying to fix degragation in rhel6 (~2.6.32) from rhel5 (~2.6.18).
> There a complete mess with >100 web/mail/spam/ftp containers,
> they share all their files but there a lot of anonymous pages:
> ~500mb shared file mapped memory and 15-20Gb non-shared anonymous memory.

How much unmapped cache do you have around in this scenario?

> In this situation major-pagefaults are very costly, because all containers share the same page.
> In my load kernel created a disproportionate pressure on the file memory, compared with the anonymous,
> they equaled only if I raise swappiness up to 150 =)
> 
> These patches actually wasn't helped a lot in my problem,
> but I saw noticable (10-20 times) reduce in count and average time of major-pagefault in file-mapped areas.

If disabling the used-once detection for shared executable pages does
not help, then the real reason for the regression you observe seems to
be a different one.

Reduced major faults in file mapped areas without other context don't
have to be a good sign per se.  Which memory was reclaimed instead?
Did swapping increase?

It would be good to find a fix that actually helps your workload.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/2] vmscan: promote shared file mapped pages
  2011-08-08 11:06 [PATCH 1/2] vmscan: promote shared file mapped pages Konstantin Khlebnikov
                   ` (3 preceding siblings ...)
  2011-08-08 23:51 ` KAMEZAWA Hiroyuki
@ 2011-10-31 20:12 ` Andrew Morton
  4 siblings, 0 replies; 32+ messages in thread
From: Andrew Morton @ 2011-10-31 20:12 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: linux-mm, linux-kernel, Wu Fengguang, KAMEZAWA Hiroyuki, Johannes Weiner

On Mon, 8 Aug 2011 15:06:58 +0400
Konstantin Khlebnikov <khlebnikov@openvz.org> wrote:

> Commit v2.6.33-5448-g6457474 (vmscan: detect mapped file pages used only once)
> greatly decreases lifetime of single-used mapped file pages.
> Unfortunately it also decreases life time of all shared mapped file pages.
> Because after commit v2.6.28-6130-gbf3f3bc (mm: don't mark_page_accessed in fault path)
> page-fault handler does not mark page active or even referenced.
> 
> Thus page_check_references() activates file page only if it was used twice while
> it stays in inactive list, meanwhile it activates anon pages after first access.
> Inactive list can be small enough, this way reclaimer can accidentally
> throw away any widely used page if it wasn't used twice in short period.
> 
> After this patch page_check_references() also activate file mapped page at first
> inactive list scan if this page is already used multiple times via several ptes.

We have quite a few acks on these two patches, but everyone wants to
see detailed performance testing.  That hasn't happened, and caution
dictates that I hold these patches out of linux-3.2, pending that
testing.

Of course, you're not the only person who can undertake that testing (hint).


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/2] vmscan: promote shared file mapped pages
  2011-08-08 12:18   ` Konstantin Khlebnikov
  2011-08-08 12:40     ` Pekka Enberg
  2011-08-18  9:09     ` Johannes Weiner
@ 2011-11-02 16:30     ` Johannes Weiner
  2011-11-02 16:31       ` [rfc 1/3] mm: vmscan: never swap under low memory pressure Johannes Weiner
                         ` (3 more replies)
  2 siblings, 4 replies; 32+ messages in thread
From: Johannes Weiner @ 2011-11-02 16:30 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: Pekka Enberg, linux-mm, Andrew Morton, linux-kernel,
	Wu Fengguang, KAMEZAWA Hiroyuki, Johannes Weiner, Rik van Riel,
	Mel Gorman, Minchan Kim, Gene Heskett

On Mon, Aug 08, 2011 at 04:18:11PM +0400, Konstantin Khlebnikov wrote:
> Pekka Enberg wrote:
> >Hi Konstantin,
> >
> >On Mon, Aug 8, 2011 at 2:06 PM, Konstantin Khlebnikov
> ><khlebnikov@openvz.org>  wrote:
> >>Commit v2.6.33-5448-g6457474 (vmscan: detect mapped file pages used only once)
> >>greatly decreases lifetime of single-used mapped file pages.
> >>Unfortunately it also decreases life time of all shared mapped file pages.
> >>Because after commit v2.6.28-6130-gbf3f3bc (mm: don't mark_page_accessed in fault path)
> >>page-fault handler does not mark page active or even referenced.
> >>
> >>Thus page_check_references() activates file page only if it was used twice while
> >>it stays in inactive list, meanwhile it activates anon pages after first access.
> >>Inactive list can be small enough, this way reclaimer can accidentally
> >>throw away any widely used page if it wasn't used twice in short period.
> >>
> >>After this patch page_check_references() also activate file mapped page at first
> >>inactive list scan if this page is already used multiple times via several ptes.
> >>
> >>Signed-off-by: Konstantin Khlebnikov<khlebnikov@openvz.org>
> >
> >Both patches seem reasonable but the changelogs don't really explain
> >why you're doing the changes. How did you find out about the problem?
> >Is there some workload that's affected? How did you test your changes?
> >
> 
> I found this while trying to fix a degradation in rhel6 (~2.6.32) relative to rhel5 (~2.6.18).
> It is a complete mess with >100 web/mail/spam/ftp containers:
> they share all their files, but there are a lot of anonymous pages:
> ~500mb of shared file-mapped memory and 15-20Gb of non-shared anonymous memory.
> In this situation major page faults are very costly, because all containers share the same pages.
> Under my load the kernel created disproportionate pressure on file memory compared with anonymous memory;
> they equalized only when I raised swappiness up to 150 =)
> 
> These patches actually didn't help a lot with my problem,
> but I saw a noticeable (10-20 times) reduction in the count and average time of major page faults in file-mapped areas.
> 
> Actually both patches are fixes for commit v2.6.33-5448-g6457474,
> because it was aimed at one scenario (singly-used pages)
> but breaks the logic in other scenarios (shared and/or executable pages).

I suspect that while saving shared/executable mapped file pages more
aggressively helps to some extent, the underlying problem is that we
tip the lru balance (comparing the recent_scanned/recent_rotated
ratios) in favor of file pages too much and in unexpected places.

For mapped file, we do:

add to lru:	recent_scanned++
cycle:		recent_scanned++
[ activate:	recent_scanned++, recent_rotated++ ]
[ deactivate:	recent_scanned++, recent_rotated++ ]
reclaim:	recent_scanned++

while for anon:

add to lru:	recent_scanned++, recent_rotated++
reactivate:	recent_scanned++, recent_rotated++
deactivate:	recent_scanned++, recent_rotated++
[ activate:	recent_scanned++, recent_rotated++ ]
[ deactivate:	recent_scanned++, recent_rotated++ ]
reclaim:	recent_scanned++

As you can see, even a long-lived file page tips the balance to the
file list twice: on creation and during the used-once detection.  A
thrashing file working set as in Konstantin's case will actually be
seen as a lucrative source of reclaimable pages.

Tipping the balance with each new file LRU page was meant to steer the
reclaim focus towards streaming IO pages and away from anonymous pages
but wouldn't it be easier to just not swap above a certain priority to
have the same effect?  With enough used-once file pages, we should not
reach that priority threshold.

Tipping the balance for inactive list rotation has been there from the
beginning, but I don't quite understand why.  It probably was not a
problem as the conditions for inactive cycling applied to both file
and anon equally, but with used-once detection for file and deferred
file writeback from direct reclaim, we tend to cycle more file pages
on the inactive list than anonymous ones.  Those rotated pages should
be a signal to favor file reclaim, though.

Here are three (currently under testing) RFC patches that 1. prevent
swapping above DEF_PRIORITY-2, 2. treat inactive list rotations to be
neutral wrt. the inter-LRU balance, and 3. revert the file list boost
on lru addition.

The result looks like this:

file:

add to lru:
[ activate:	recent_scanned++, recent_rotated++ ]
[ deactivate:	recent_scanned++, recent_rotated++ ]
reclaim:	recent_scanned++

mapped file:

add to lru:
cycle:		recent_scanned++, recent_rotated++
[ activate:	recent_scanned++, recent_rotated++ ]
[ deactivate:	recent_scanned++, recent_rotated++ ]
reclaim:	recent_scanned++

anon:
add to lru:	recent_scanned++, recent_rotated++
reactivate:	recent_scanned++, recent_rotated++
deactivate:	recent_scanned++, recent_rotated++
[ activate:	recent_scanned++, recent_rotated++ ]
[ deactivate:	recent_scanned++, recent_rotated++ ]
reclaim:	recent_scanned++

As you can see, this still behaves under the assumption that refaults
from swap are more costly than from the fs, but we keep considering
anonymous pages when the file working set is thrashing.

What do reclaim people think about this?

Konstantin, would you have the chance to try this set directly with
your affected workload if nobody spots any obvious problems?

Thanks!

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [rfc 1/3] mm: vmscan: never swap under low memory pressure
  2011-11-02 16:30     ` Johannes Weiner
@ 2011-11-02 16:31       ` Johannes Weiner
  2011-11-02 17:54         ` KOSAKI Motohiro
  2011-11-07  2:29         ` KAMEZAWA Hiroyuki
  2011-11-02 16:32       ` [rfc 2/3] mm: vmscan: treat inactive cycling as neutral Johannes Weiner
                         ` (2 subsequent siblings)
  3 siblings, 2 replies; 32+ messages in thread
From: Johannes Weiner @ 2011-11-02 16:31 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: Pekka Enberg, linux-mm, Andrew Morton, linux-kernel,
	Wu Fengguang, KAMEZAWA Hiroyuki, Johannes Weiner, Rik van Riel,
	Mel Gorman, Minchan Kim, Gene Heskett

We want to prevent floods of used-once file cache pushing us to swap
out anonymous pages.  Never swap under a certain priority level.  The
availability of used-once cache pages should prevent us from reaching
that threshold.

This is needed because subsequent patches will revert some of the
mechanisms that tried to prefer file over anon, and this should not
result in more eager swapping again.

It might also be better to keep the aging machinery going and just not
swap, rather than staying away from anonymous pages in the first place
and having less useful age information at the time of swapout.

Signed-off-by: Johannes Weiner <jweiner@redhat.com>
---
 mm/vmscan.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index a90c603..39d3da3 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -831,6 +831,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 		 * Try to allocate it some swap space here.
 		 */
 		if (PageAnon(page) && !PageSwapCache(page)) {
+			if (priority >= DEF_PRIORITY - 2)
+				goto keep_locked;
 			if (!(sc->gfp_mask & __GFP_IO))
 				goto keep_locked;
 			if (!add_to_swap(page))
-- 
1.7.6.4


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [rfc 2/3] mm: vmscan: treat inactive cycling as neutral
  2011-11-02 16:30     ` Johannes Weiner
  2011-11-02 16:31       ` [rfc 1/3] mm: vmscan: never swap under low memory pressure Johannes Weiner
@ 2011-11-02 16:32       ` Johannes Weiner
  2011-11-02 18:04         ` KOSAKI Motohiro
  2011-11-07  2:34         ` KAMEZAWA Hiroyuki
  2011-11-02 16:32       ` [rfc 3/3] mm: vmscan: revert file list boost on lru addition Johannes Weiner
  2011-11-02 16:35       ` [PATCH 1/2] vmscan: promote shared file mapped pages Johannes Weiner
  3 siblings, 2 replies; 32+ messages in thread
From: Johannes Weiner @ 2011-11-02 16:32 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: Pekka Enberg, linux-mm, Andrew Morton, linux-kernel,
	Wu Fengguang, KAMEZAWA Hiroyuki, Johannes Weiner, Rik van Riel,
	Mel Gorman, Minchan Kim, Gene Heskett

Each page that is scanned but put back to the inactive list is counted
as a successful reclaim, which tips the balance between file and anon
lists more towards the cycling list.

This does - in my opinion - not make too much sense, but at the same
time it was not much of a problem, as the conditions that lead to an
inactive list cycle were mostly temporary - locked page, concurrent
page table changes, backing device congested - or at least limited to
a single reclaimer that was not allowed to unmap or meddle with IO.
More important than being moderately rare, those conditions should
apply to both anon and mapped file pages equally and balance out in
the end.

Recently, we started cycling file pages in particular on the inactive
list much more aggressively, for used-once detection of mapped pages,
and when avoiding writeback from direct reclaim.

Those rotated pages do not exactly speak for the reclaimability of the
list they sit on and we risk putting immense pressure on file list for
no good reason.

Instead, count each page not reclaimed and put back to any list,
active or inactive, as rotated, so they are neutral with respect to
the scan/rotate ratio of the list class, as they should be.

Signed-off-by: Johannes Weiner <jweiner@redhat.com>
---
 mm/vmscan.c |    9 ++++-----
 1 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 39d3da3..6da66a7 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1360,7 +1360,9 @@ putback_lru_pages(struct zone *zone, struct scan_control *sc,
 	 */
 	spin_lock(&zone->lru_lock);
 	while (!list_empty(page_list)) {
+		int file;
 		int lru;
+
 		page = lru_to_page(page_list);
 		VM_BUG_ON(PageLRU(page));
 		list_del(&page->lru);
@@ -1373,11 +1375,8 @@ putback_lru_pages(struct zone *zone, struct scan_control *sc,
 		SetPageLRU(page);
 		lru = page_lru(page);
 		add_page_to_lru_list(zone, page, lru);
-		if (is_active_lru(lru)) {
-			int file = is_file_lru(lru);
-			int numpages = hpage_nr_pages(page);
-			reclaim_stat->recent_rotated[file] += numpages;
-		}
+		file = is_file_lru(lru);
+		reclaim_stat->recent_rotated[file] += hpage_nr_pages(page);
 		if (!pagevec_add(&pvec, page)) {
 			spin_unlock_irq(&zone->lru_lock);
 			__pagevec_release(&pvec);
-- 
1.7.6.4


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [rfc 3/3] mm: vmscan: revert file list boost on lru addition
  2011-11-02 16:30     ` Johannes Weiner
  2011-11-02 16:31       ` [rfc 1/3] mm: vmscan: never swap under low memory pressure Johannes Weiner
  2011-11-02 16:32       ` [rfc 2/3] mm: vmscan: treat inactive cycling as neutral Johannes Weiner
@ 2011-11-02 16:32       ` Johannes Weiner
  2011-11-07  2:45         ` KAMEZAWA Hiroyuki
  2011-11-02 16:35       ` [PATCH 1/2] vmscan: promote shared file mapped pages Johannes Weiner
  3 siblings, 1 reply; 32+ messages in thread
From: Johannes Weiner @ 2011-11-02 16:32 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: Pekka Enberg, linux-mm, Andrew Morton, linux-kernel,
	Wu Fengguang, KAMEZAWA Hiroyuki, Johannes Weiner, Rik van Riel,
	Mel Gorman, Minchan Kim, Gene Heskett

The idea in 9ff473b 'vmscan: evict streaming IO first' was to steer
reclaim focus onto file pages with every new file page that hits the
lru list, so that an influx of used-once file pages does not lead to
swapping of anonymous pages.

The problem is that nobody is fixing up the balance if the pages in
fact become part of the resident set.

Anonymous page creation is neutral to the inter-lru balance, so even a
comparably tiny number of heavily used file pages tip the balance in
favor of the file list.

In addition, there is no refault detection, and every refault will
bias the balance even more.  A thrashing file working set will be
mistaken for a very lucrative source of reclaimable pages.

As anonymous pages are no longer swapped above a certain priority
level, this mechanism is no longer needed.  Used-once file pages
should get reclaimed before the VM even considers swapping.

Signed-off-by: Johannes Weiner <jweiner@redhat.com>
---
 mm/swap.c |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/mm/swap.c b/mm/swap.c
index 3a442f1..33e5387 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -683,7 +683,6 @@ static void ____pagevec_lru_add_fn(struct page *page, void *arg)
 	SetPageLRU(page);
 	if (active)
 		SetPageActive(page);
-	update_page_reclaim_stat(zone, page, file, active);
 	add_page_to_lru_list(zone, page, lru);
 }
 
-- 
1.7.6.4


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/2] vmscan: promote shared file mapped pages
  2011-11-02 16:30     ` Johannes Weiner
                         ` (2 preceding siblings ...)
  2011-11-02 16:32       ` [rfc 3/3] mm: vmscan: revert file list boost on lru addition Johannes Weiner
@ 2011-11-02 16:35       ` Johannes Weiner
  3 siblings, 0 replies; 32+ messages in thread
From: Johannes Weiner @ 2011-11-02 16:35 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Konstantin Khlebnikov, Pekka Enberg, linux-mm, Andrew Morton,
	linux-kernel, Wu Fengguang, KAMEZAWA Hiroyuki, Rik van Riel,
	Mel Gorman, Minchan Kim, Gene Heskett

On Wed, Nov 02, 2011 at 05:30:56PM +0100, Johannes Weiner wrote:
> Tipping the balance for inactive list rotation has been there from the
> beginning, but I don't quite understand why.  It probably was not a
> problem as the conditions for inactive cycling applied to both file
> and anon equally, but with used-once detection for file and deferred
> file writeback from direct reclaim, we tend to cycle more file pages
> on the inactive list than anonymous ones.  Those rotated pages should
> be a signal to favor file reclaim, though.

[...] should NOT be a signal [...]

obviously.  Sorry.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [rfc 1/3] mm: vmscan: never swap under low memory pressure
  2011-11-02 16:31       ` [rfc 1/3] mm: vmscan: never swap under low memory pressure Johannes Weiner
@ 2011-11-02 17:54         ` KOSAKI Motohiro
  2011-11-03 15:51           ` Johannes Weiner
  2011-11-07  2:29         ` KAMEZAWA Hiroyuki
  1 sibling, 1 reply; 32+ messages in thread
From: KOSAKI Motohiro @ 2011-11-02 17:54 UTC (permalink / raw)
  To: jweiner
  Cc: khlebnikov, penberg, linux-mm, akpm, linux-kernel, fengguang.wu,
	kamezawa.hiroyu, hannes, riel, mel, minchan.kim, gene.heskett

> ---
>  mm/vmscan.c |    2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index a90c603..39d3da3 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -831,6 +831,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
>  		 * Try to allocate it some swap space here.
>  		 */
>  		if (PageAnon(page) && !PageSwapCache(page)) {
> +			if (priority >= DEF_PRIORITY - 2)
> +				goto keep_locked;
>  			if (!(sc->gfp_mask & __GFP_IO))
>  				goto keep_locked;
>  			if (!add_to_swap(page))

Hehe, I tried a very similar way a very long time ago. Unfortunately, it doesn't work.
"DEF_PRIORITY - 2" is a really poor indicator of reclaim pressure. For example, if the
machine has 1TB of memory, DEF_PRIORITY-2 means 1TB >> 10 = 1GB. It's too big.


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [rfc 2/3] mm: vmscan: treat inactive cycling as neutral
  2011-11-02 16:32       ` [rfc 2/3] mm: vmscan: treat inactive cycling as neutral Johannes Weiner
@ 2011-11-02 18:04         ` KOSAKI Motohiro
  2011-11-03 12:49           ` Johannes Weiner
  2011-11-07  2:34         ` KAMEZAWA Hiroyuki
  1 sibling, 1 reply; 32+ messages in thread
From: KOSAKI Motohiro @ 2011-11-02 18:04 UTC (permalink / raw)
  To: jweiner
  Cc: khlebnikov, penberg, linux-mm, akpm, linux-kernel, fengguang.wu,
	kamezawa.hiroyu, hannes, riel, mel, minchan.kim, gene.heskett

(11/2/2011 9:32 AM), Johannes Weiner wrote:
> Each page that is scanned but put back to the inactive list is counted
> as a successful reclaim, which tips the balance between file and anon
> lists more towards the cycling list.
> 
> This does - in my opinion - not make too much sense, but at the same
> time it was not much of a problem, as the conditions that lead to an
> inactive list cycle were mostly temporary - locked page, concurrent
> page table changes, backing device congested - or at least limited to
> a single reclaimer that was not allowed to unmap or meddle with IO.
> More important than being moderately rare, those conditions should
> apply to both anon and mapped file pages equally and balance out in
> the end.
> 
> Recently, we started cycling file pages in particular on the inactive
> list much more aggressively, for used-once detection of mapped pages,
> and when avoiding writeback from direct reclaim.
> 
> Those rotated pages do not exactly speak for the reclaimability of the
> list they sit on and we risk putting immense pressure on file list for
> no good reason.
> 
> Instead, count each page not reclaimed and put back to any list,
> active or inactive, as rotated, so they are neutral with respect to
> the scan/rotate ratio of the list class, as they should be.
> 
> Signed-off-by: Johannes Weiner <jweiner@redhat.com>
> ---
>  mm/vmscan.c |    9 ++++-----
>  1 files changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 39d3da3..6da66a7 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1360,7 +1360,9 @@ putback_lru_pages(struct zone *zone, struct scan_control *sc,
>  	 */
>  	spin_lock(&zone->lru_lock);
>  	while (!list_empty(page_list)) {
> +		int file;
>  		int lru;
> +
>  		page = lru_to_page(page_list);
>  		VM_BUG_ON(PageLRU(page));
>  		list_del(&page->lru);
> @@ -1373,11 +1375,8 @@ putback_lru_pages(struct zone *zone, struct scan_control *sc,
>  		SetPageLRU(page);
>  		lru = page_lru(page);
>  		add_page_to_lru_list(zone, page, lru);
> -		if (is_active_lru(lru)) {
> -			int file = is_file_lru(lru);
> -			int numpages = hpage_nr_pages(page);
> -			reclaim_stat->recent_rotated[file] += numpages;
> -		}
> +		file = is_file_lru(lru);
> +		reclaim_stat->recent_rotated[file] += hpage_nr_pages(page);
>  		if (!pagevec_add(&pvec, page)) {
>  			spin_unlock_irq(&zone->lru_lock);
>  			__pagevec_release(&pvec);

In the case of avoiding writeback from direct reclaim, I think we shouldn't increase
recent_rotated, because the VM decided "the page should be evicted, but eviction
should be delayed". I'm not sure whether it's a minor factor or not.



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [rfc 2/3] mm: vmscan: treat inactive cycling as neutral
  2011-11-02 18:04         ` KOSAKI Motohiro
@ 2011-11-03 12:49           ` Johannes Weiner
  0 siblings, 0 replies; 32+ messages in thread
From: Johannes Weiner @ 2011-11-03 12:49 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: khlebnikov, penberg, linux-mm, akpm, linux-kernel, fengguang.wu,
	kamezawa.hiroyu, hannes, riel, mel, minchan.kim, gene.heskett

On Wed, Nov 02, 2011 at 11:04:30AM -0700, KOSAKI Motohiro wrote:
> (11/2/2011 9:32 AM), Johannes Weiner wrote:
> > Each page that is scanned but put back to the inactive list is counted
> > as a successful reclaim, which tips the balance between file and anon
> > lists more towards the cycling list.
> > 
> > This does - in my opinion - not make too much sense, but at the same
> > time it was not much of a problem, as the conditions that lead to an
> > inactive list cycle were mostly temporary - locked page, concurrent
> > page table changes, backing device congested - or at least limited to
> > a single reclaimer that was not allowed to unmap or meddle with IO.
> > More important than being moderately rare, those conditions should
> > apply to both anon and mapped file pages equally and balance out in
> > the end.
> > 
> > Recently, we started cycling file pages in particular on the inactive
> > list much more aggressively, for used-once detection of mapped pages,
> > and when avoiding writeback from direct reclaim.
> > 
> > Those rotated pages do not exactly speak for the reclaimability of the
> > list they sit on and we risk putting immense pressure on file list for
> > no good reason.
> > 
> > Instead, count each page not reclaimed and put back to any list,
> > active or inactive, as rotated, so they are neutral with respect to
> > the scan/rotate ratio of the list class, as they should be.
> > 
> > Signed-off-by: Johannes Weiner <jweiner@redhat.com>
> > ---
> >  mm/vmscan.c |    9 ++++-----
> >  1 files changed, 4 insertions(+), 5 deletions(-)
> > 
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 39d3da3..6da66a7 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -1360,7 +1360,9 @@ putback_lru_pages(struct zone *zone, struct scan_control *sc,
> >  	 */
> >  	spin_lock(&zone->lru_lock);
> >  	while (!list_empty(page_list)) {
> > +		int file;
> >  		int lru;
> > +
> >  		page = lru_to_page(page_list);
> >  		VM_BUG_ON(PageLRU(page));
> >  		list_del(&page->lru);
> > @@ -1373,11 +1375,8 @@ putback_lru_pages(struct zone *zone, struct scan_control *sc,
> >  		SetPageLRU(page);
> >  		lru = page_lru(page);
> >  		add_page_to_lru_list(zone, page, lru);
> > -		if (is_active_lru(lru)) {
> > -			int file = is_file_lru(lru);
> > -			int numpages = hpage_nr_pages(page);
> > -			reclaim_stat->recent_rotated[file] += numpages;
> > -		}
> > +		file = is_file_lru(lru);
> > +		reclaim_stat->recent_rotated[file] += hpage_nr_pages(page);
> >  		if (!pagevec_add(&pvec, page)) {
> >  			spin_unlock_irq(&zone->lru_lock);
> >  			__pagevec_release(&pvec);
> 
> In the case of avoiding writeback from direct reclaim, I think we shouldn't increase
> recent_rotated, because the VM decided "the page should be evicted, but eviction
> should be delayed". I'm not sure whether it's a minor factor or not.

But we DO increase recent_scanned another time when the page is
reclaimed on the next round.

If we don't increase recent_rotated for deferred reclaims, they are
counted as success twice and so considered more valuable than
immediate reclaims.  I don't think that makes sense.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [rfc 1/3] mm: vmscan: never swap under low memory pressure
  2011-11-02 17:54         ` KOSAKI Motohiro
@ 2011-11-03 15:51           ` Johannes Weiner
  2011-11-08  0:16             ` KOSAKI Motohiro
  0 siblings, 1 reply; 32+ messages in thread
From: Johannes Weiner @ 2011-11-03 15:51 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: khlebnikov, penberg, linux-mm, akpm, linux-kernel, fengguang.wu,
	kamezawa.hiroyu, hannes, riel, mel, minchan.kim, gene.heskett

On Wed, Nov 02, 2011 at 10:54:23AM -0700, KOSAKI Motohiro wrote:
> > ---
> >  mm/vmscan.c |    2 ++
> >  1 files changed, 2 insertions(+), 0 deletions(-)
> > 
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index a90c603..39d3da3 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -831,6 +831,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
> >  		 * Try to allocate it some swap space here.
> >  		 */
> >  		if (PageAnon(page) && !PageSwapCache(page)) {
> > +			if (priority >= DEF_PRIORITY - 2)
> > +				goto keep_locked;
> >  			if (!(sc->gfp_mask & __GFP_IO))
> >  				goto keep_locked;
> >  			if (!add_to_swap(page))
> 
> Hehe, I tried a very similar way a very long time ago. Unfortunately, it doesn't work.
> "DEF_PRIORITY - 2" is a really poor indicator of reclaim pressure. For example, if the
> machine has 1TB of memory, DEF_PRIORITY-2 means 1TB >> 10 = 1GB. It's too big.

Do you remember what kind of tests you ran that demonstrated
misbehaviour?

We can not reclaim anonymous pages without swapping, so the priority
cutoff applies only to inactive file pages.  If you had 1TB of
inactive file pages, the scanner would have to go through

	((1 << (40 - 12)) >> 12) +
	((1 << (40 - 12)) >> 11) +
	((1 << (40 - 12)) >> 10) = 1792MB

without reclaiming SWAP_CLUSTER_MAX before it considers swapping.
That's a lot of scanning but how likely is it that you have a TB of
unreclaimable inactive cache pages?

Put into proportion, with a priority threshold of 10 a reclaimer will
look at 0.17% ((n >> 12) + (n >> 11) + (n >> 10)) of inactive file
pages (excluding the list balance bias) without reclaiming
SWAP_CLUSTER_MAX before it considers swapping.

Currently, the list balance biasing with each newly-added file page
has much higher resistance to scan anonymous pages initially.  But
once it shifted toward anon pages, all reclaimers will start swapping,
unlike the priority threshold that each reclaimer has to reach
individually.  Could this have been what was causing problems for you?

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [rfc 1/3] mm: vmscan: never swap under low memory pressure
  2011-11-02 16:31       ` [rfc 1/3] mm: vmscan: never swap under low memory pressure Johannes Weiner
  2011-11-02 17:54         ` KOSAKI Motohiro
@ 2011-11-07  2:29         ` KAMEZAWA Hiroyuki
  2011-11-10 15:29           ` Johannes Weiner
  1 sibling, 1 reply; 32+ messages in thread
From: KAMEZAWA Hiroyuki @ 2011-11-07  2:29 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Konstantin Khlebnikov, Pekka Enberg, linux-mm, Andrew Morton,
	linux-kernel, Wu Fengguang, Johannes Weiner, Rik van Riel,
	Mel Gorman, Minchan Kim, Gene Heskett

On Wed, 2 Nov 2011 17:31:41 +0100
Johannes Weiner <jweiner@redhat.com> wrote:

> We want to prevent floods of used-once file cache pushing us to swap
> out anonymous pages.  Never swap under a certain priority level.  The
> availability of used-once cache pages should prevent us from reaching
> that threshold.
> 
> This is needed because subsequent patches will revert some of the
> mechanisms that tried to prefer file over anon, and this should not
> result in more eager swapping again.
> 
> It might also be better to keep the aging machinery going and just not
> swap, rather than staying away from anonymous pages in the first place
> and having less useful age information at the time of swapout.
> 
> Signed-off-by: Johannes Weiner <jweiner@redhat.com>
> ---
>  mm/vmscan.c |    2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index a90c603..39d3da3 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -831,6 +831,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
>  		 * Try to allocate it some swap space here.
>  		 */
>  		if (PageAnon(page) && !PageSwapCache(page)) {
> +			if (priority >= DEF_PRIORITY - 2)
> +				goto keep_locked;
>  			if (!(sc->gfp_mask & __GFP_IO))
>  				goto keep_locked;
>  			if (!add_to_swap(page))

Hm, how about not scanning the anon LRU at all rather than checking here?
Add some bias to get_scan_count() or something similar.
If you think LRU rotation is needed, only kswapd should do it.

Thanks,
-Kame


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [rfc 2/3] mm: vmscan: treat inactive cycling as neutral
  2011-11-02 16:32       ` [rfc 2/3] mm: vmscan: treat inactive cycling as neutral Johannes Weiner
  2011-11-02 18:04         ` KOSAKI Motohiro
@ 2011-11-07  2:34         ` KAMEZAWA Hiroyuki
  2011-11-10 16:06           ` Johannes Weiner
  1 sibling, 1 reply; 32+ messages in thread
From: KAMEZAWA Hiroyuki @ 2011-11-07  2:34 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Konstantin Khlebnikov, Pekka Enberg, linux-mm, Andrew Morton,
	linux-kernel, Wu Fengguang, Johannes Weiner, Rik van Riel,
	Mel Gorman, Minchan Kim, Gene Heskett

On Wed, 2 Nov 2011 17:32:13 +0100
Johannes Weiner <jweiner@redhat.com> wrote:

> Each page that is scanned but put back to the inactive list is counted
> as a successful reclaim, which tips the balance between file and anon
> lists more towards the cycling list.
> 
> This does - in my opinion - not make too much sense, but at the same
> time it was not much of a problem, as the conditions that lead to an
> inactive list cycle were mostly temporary - locked page, concurrent
> page table changes, backing device congested - or at least limited to
> a single reclaimer that was not allowed to unmap or meddle with IO.
> More important than being moderately rare, those conditions should
> apply to both anon and mapped file pages equally and balance out in
> the end.
> 
> Recently, we started cycling file pages in particular on the inactive
> list much more aggressively, for used-once detection of mapped pages,
> and when avoiding writeback from direct reclaim.
> 
> Those rotated pages do not exactly speak for the reclaimability of the
> list they sit on and we risk putting immense pressure on file list for
> no good reason.
> 
> Instead, count each page not reclaimed and put back to any list,
> active or inactive, as rotated, so they are neutral with respect to
> the scan/rotate ratio of the list class, as they should be.
> 
> Signed-off-by: Johannes Weiner <jweiner@redhat.com>

I think this makes sense.

Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>

I wonder whether it may be better to have a victim list for written-back pages.


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [rfc 3/3] mm: vmscan: revert file list boost on lru addition
  2011-11-02 16:32       ` [rfc 3/3] mm: vmscan: revert file list boost on lru addition Johannes Weiner
@ 2011-11-07  2:45         ` KAMEZAWA Hiroyuki
  2011-11-10 16:12           ` Johannes Weiner
  0 siblings, 1 reply; 32+ messages in thread
From: KAMEZAWA Hiroyuki @ 2011-11-07  2:45 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Konstantin Khlebnikov, Pekka Enberg, linux-mm, Andrew Morton,
	linux-kernel, Wu Fengguang, Johannes Weiner, Rik van Riel,
	Mel Gorman, Minchan Kim, Gene Heskett

On Wed, 2 Nov 2011 17:32:47 +0100
Johannes Weiner <jweiner@redhat.com> wrote:

> The idea in 9ff473b 'vmscan: evict streaming IO first' was to steer
> reclaim focus onto file pages with every new file page that hits the
> lru list, so that an influx of used-once file pages does not lead to
> swapping of anonymous pages.
> 
> The problem is that nobody is fixing up the balance if the pages in
> fact become part of the resident set.
> 
> Anonymous page creation is neutral to the inter-lru balance, so even a
> comparatively tiny number of heavily used file pages tips the balance in
> favor of the file list.
> 
> In addition, there is no refault detection, and every refault will
> bias the balance even more.  A thrashing file working set will be
> mistaken for a very lucrative source of reclaimable pages.
> 
> As anonymous pages are no longer swapped above a certain priority
> level, this mechanism is no longer needed.  Used-once file pages
> should get reclaimed before the VM even considers swapping.
> 
> Signed-off-by: Johannes Weiner <jweiner@redhat.com>

Do you have some results ?

Thanks,
-Kame


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [rfc 1/3] mm: vmscan: never swap under low memory pressure
  2011-11-03 15:51           ` Johannes Weiner
@ 2011-11-08  0:16             ` KOSAKI Motohiro
  0 siblings, 0 replies; 32+ messages in thread
From: KOSAKI Motohiro @ 2011-11-08  0:16 UTC (permalink / raw)
  To: jweiner
  Cc: khlebnikov, penberg, linux-mm, akpm, linux-kernel, fengguang.wu,
	kamezawa.hiroyu, hannes, riel, mel, minchan.kim, gene.heskett

Hi,

Sorry for the delay. I was on a trip to San Jose last week.


> On Wed, Nov 02, 2011 at 10:54:23AM -0700, KOSAKI Motohiro wrote:
>>> ---
>>>  mm/vmscan.c |    2 ++
>>>  1 files changed, 2 insertions(+), 0 deletions(-)
>>>
>>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>>> index a90c603..39d3da3 100644
>>> --- a/mm/vmscan.c
>>> +++ b/mm/vmscan.c
>>> @@ -831,6 +831,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
>>>  		 * Try to allocate it some swap space here.
>>>  		 */
>>>  		if (PageAnon(page) && !PageSwapCache(page)) {
>>> +			if (priority >= DEF_PRIORITY - 2)
>>> +				goto keep_locked;
>>>  			if (!(sc->gfp_mask & __GFP_IO))
>>>  				goto keep_locked;
>>>  			if (!add_to_swap(page))
>>
>> Hehe, I tried a very similar way a long time ago. Unfortunately, it doesn't work.
>> "DEF_PRIORITY - 2" is a really poor indicator of reclaim pressure. For example, if
>> the machine has 1TB of memory, DEF_PRIORITY-2 means 1TB>>10 = 1GB. That's too big.
> 
> Do you remember what kind of tests you ran that demonstrated
> misbehaviour?
> 
> We can not reclaim anonymous pages without swapping, so the priority
> cutoff applies only to inactive file pages.  If you had 1TB of
> inactive file pages, the scanner would have to go through
> 
> 	((1 << (40 - 12)) >> 12) +
> 	((1 << (40 - 12)) >> 11) +
> 	((1 << (40 - 12)) >> 10) = 1792MB
> 
> without reclaiming SWAP_CLUSTER_MAX before it considers swapping.
> That's a lot of scanning but how likely is it that you have a TB of
> unreclaimable inactive cache pages?

I meant that the effect of this protection depends strongly on how much
memory the system has:

 - system memory is plentiful:
	the protection effectively disables swap-out completely.
 - system memory is not plentiful:
	the protection only gives a slight bonus toward avoiding swap-out.

If people buy a new machine and move their legacy workload onto it, they
might be surprised by such a large behavior change. That worries me.

That's why I dislike DEF_PRIORITY-based heuristics.


> Put into proportion, with a priority threshold of 10 a reclaimer will
> look at 0.17% ((n >> 12) + (n >> 11) + (n >> 10)) of inactive file
> pages (excluding the list balance bias) without reclaiming
> SWAP_CLUSTER_MAX before it considers swapping.

Moreover, I think we need a more precise analysis of why unnecessary
swap-out happens: which factor is dominant, and when it occurs.


> Currently, the list balance biasing with each newly-added file page
> has much higher resistance to scan anonymous pages initially.  But
> once it shifted toward anon pages, all reclaimers will start swapping,
> unlike the priority threshold that each reclaimer has to reach
> individually.  Could this have been what was causing problems for you? 

Um. Currently the number of flusher threads is controlled by the kernel,
but the number of swap-out threads isn't limited at all. So our swap-out
often works too aggressively. I think we need to fix that.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [rfc 1/3] mm: vmscan: never swap under low memory pressure
  2011-11-07  2:29         ` KAMEZAWA Hiroyuki
@ 2011-11-10 15:29           ` Johannes Weiner
  0 siblings, 0 replies; 32+ messages in thread
From: Johannes Weiner @ 2011-11-10 15:29 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Konstantin Khlebnikov, Pekka Enberg, linux-mm, Andrew Morton,
	linux-kernel, Wu Fengguang, Johannes Weiner, Rik van Riel,
	Mel Gorman, Minchan Kim, Gene Heskett

On Mon, Nov 07, 2011 at 11:29:41AM +0900, KAMEZAWA Hiroyuki wrote:
> On Wed, 2 Nov 2011 17:31:41 +0100
> Johannes Weiner <jweiner@redhat.com> wrote:
> 
> > We want to prevent floods of used-once file cache pushing us to swap
> > out anonymous pages.  Never swap under a certain priority level.  The
> > availability of used-once cache pages should prevent us from reaching
> > that threshold.
> > 
> > This is needed because subsequent patches will revert some of the
> > mechanisms that tried to prefer file over anon, and this should not
> > result in more eager swapping again.
> > 
> > It might also be better to keep the aging machinery going and just not
> > swap, rather than staying away from anonymous pages in the first place
> > and having less useful age information at the time of swapout.
> > 
> > Signed-off-by: Johannes Weiner <jweiner@redhat.com>
> > ---
> >  mm/vmscan.c |    2 ++
> >  1 files changed, 2 insertions(+), 0 deletions(-)
> > 
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index a90c603..39d3da3 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -831,6 +831,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
> >  		 * Try to allocate it some swap space here.
> >  		 */
> >  		if (PageAnon(page) && !PageSwapCache(page)) {
> > +			if (priority >= DEF_PRIORITY - 2)
> > +				goto keep_locked;
> >  			if (!(sc->gfp_mask & __GFP_IO))
> >  				goto keep_locked;
> >  			if (!add_to_swap(page))
> 
> Hm, how about not scanning LRU_ANON rather than checking here ?
> Add some bias to get_scan_count() or some..
> If you think to need rotation of LRU, only kswapd should do that..

Absolutely, it would require more tuning.  This patch was really a
'hey, how about we do something like this?  Has anyone tried that before?'

I'll keep those things in mind if I pursue this further, thanks.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [rfc 2/3] mm: vmscan: treat inactive cycling as neutral
  2011-11-07  2:34         ` KAMEZAWA Hiroyuki
@ 2011-11-10 16:06           ` Johannes Weiner
  2011-11-11  0:05             ` KAMEZAWA Hiroyuki
  0 siblings, 1 reply; 32+ messages in thread
From: Johannes Weiner @ 2011-11-10 16:06 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Konstantin Khlebnikov, Pekka Enberg, linux-mm, Andrew Morton,
	linux-kernel, Wu Fengguang, Johannes Weiner, Rik van Riel,
	Mel Gorman, Minchan Kim, Gene Heskett

On Mon, Nov 07, 2011 at 11:34:17AM +0900, KAMEZAWA Hiroyuki wrote:
> On Wed, 2 Nov 2011 17:32:13 +0100
> Johannes Weiner <jweiner@redhat.com> wrote:
> 
> > Each page that is scanned but put back to the inactive list is counted
> > as a successful reclaim, which tips the balance between file and anon
> > lists more towards the cycling list.
> > 
> > This does - in my opinion - not make too much sense, but at the same
> > time it was not much of a problem, as the conditions that lead to an
> > inactive list cycle were mostly temporary - locked page, concurrent
> > page table changes, backing device congested - or at least limited to
> > a single reclaimer that was not allowed to unmap or meddle with IO.
> > More important than being moderately rare, those conditions should
> > apply to both anon and mapped file pages equally and balance out in
> > the end.
> > 
> > Recently, we started cycling file pages in particular on the inactive
> > list much more aggressively, for used-once detection of mapped pages,
> > and when avoiding writeback from direct reclaim.
> > 
> > Those rotated pages do not exactly speak for the reclaimability of the
> > list they sit on, and we risk putting immense pressure on the file list for
> > no good reason.
> > 
> > Instead, count each page not reclaimed and put back to any list,
> > active or inactive, as rotated, so they are neutral with respect to
> > the scan/rotate ratio of the list class, as they should be.
> > 
> > Signed-off-by: Johannes Weiner <jweiner@redhat.com>
> 
> I think this makes sense.
> 
> Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> 
> I wonder whether it may be better to have a victim list for write-back pages..

Do you mean an extra LRU list that holds dirty pages?

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [rfc 3/3] mm: vmscan: revert file list boost on lru addition
  2011-11-07  2:45         ` KAMEZAWA Hiroyuki
@ 2011-11-10 16:12           ` Johannes Weiner
  0 siblings, 0 replies; 32+ messages in thread
From: Johannes Weiner @ 2011-11-10 16:12 UTC (permalink / raw)
  To: KAMEZAWA Hiroyuki
  Cc: Konstantin Khlebnikov, Pekka Enberg, linux-mm, Andrew Morton,
	linux-kernel, Wu Fengguang, Johannes Weiner, Rik van Riel,
	Mel Gorman, Minchan Kim, Gene Heskett

On Mon, Nov 07, 2011 at 11:45:20AM +0900, KAMEZAWA Hiroyuki wrote:
> On Wed, 2 Nov 2011 17:32:47 +0100
> Johannes Weiner <jweiner@redhat.com> wrote:
> 
> > The idea in 9ff473b 'vmscan: evict streaming IO first' was to steer
> > reclaim focus onto file pages with every new file page that hits the
> > lru list, so that an influx of used-once file pages does not lead to
> > swapping of anonymous pages.
> > 
> > The problem is that nobody is fixing up the balance if the pages in
> > fact become part of the resident set.
> > 
> > Anonymous page creation is neutral to the inter-lru balance, so even a
> > comparatively tiny number of heavily used file pages tips the balance in
> > favor of the file list.
> > 
> > In addition, there is no refault detection, and every refault will
> > bias the balance even more.  A thrashing file working set will be
> > mistaken for a very lucrative source of reclaimable pages.
> > 
> > As anonymous pages are no longer swapped above a certain priority
> > level, this mechanism is no longer needed.  Used-once file pages
> > should get reclaimed before the VM even considers swapping.
> > 
> > Signed-off-by: Johannes Weiner <jweiner@redhat.com>
> 
> Do you have some results ?

Not yet, sorry, I had to drop it all and do something else.

This change relies on the VM having a different mechanism to go for
one-shot file cache first, so I need to address Kosaki-san's concerns
about 1/3 before pursuing this patch.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [rfc 2/3] mm: vmscan: treat inactive cycling as neutral
  2011-11-10 16:06           ` Johannes Weiner
@ 2011-11-11  0:05             ` KAMEZAWA Hiroyuki
  0 siblings, 0 replies; 32+ messages in thread
From: KAMEZAWA Hiroyuki @ 2011-11-11  0:05 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Konstantin Khlebnikov, Pekka Enberg, linux-mm, Andrew Morton,
	linux-kernel, Wu Fengguang, Johannes Weiner, Rik van Riel,
	Mel Gorman, Minchan Kim, Gene Heskett

On Thu, 10 Nov 2011 17:06:28 +0100
Johannes Weiner <jweiner@redhat.com> wrote:

> On Mon, Nov 07, 2011 at 11:34:17AM +0900, KAMEZAWA Hiroyuki wrote:
> > On Wed, 2 Nov 2011 17:32:13 +0100
> > Johannes Weiner <jweiner@redhat.com> wrote:
> > 
> > > Each page that is scanned but put back to the inactive list is counted
> > > as a successful reclaim, which tips the balance between file and anon
> > > lists more towards the cycling list.
> > > 
> > > This does - in my opinion - not make too much sense, but at the same
> > > time it was not much of a problem, as the conditions that lead to an
> > > inactive list cycle were mostly temporary - locked page, concurrent
> > > page table changes, backing device congested - or at least limited to
> > > a single reclaimer that was not allowed to unmap or meddle with IO.
> > > More important than being moderately rare, those conditions should
> > > apply to both anon and mapped file pages equally and balance out in
> > > the end.
> > > 
> > > Recently, we started cycling file pages in particular on the inactive
> > > list much more aggressively, for used-once detection of mapped pages,
> > > and when avoiding writeback from direct reclaim.
> > > 
> > > Those rotated pages do not exactly speak for the reclaimability of the
> > > list they sit on, and we risk putting immense pressure on the file list for
> > > no good reason.
> > > 
> > > Instead, count each page not reclaimed and put back to any list,
> > > active or inactive, as rotated, so they are neutral with respect to
> > > the scan/rotate ratio of the list class, as they should be.
> > > 
> > > Signed-off-by: Johannes Weiner <jweiner@redhat.com>
> > 
> > I think this makes sense.
> > 
> > Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > 
> > I wonder whether it may be better to have a victim list for write-back pages..
> 
> Do you mean an extra LRU list that holds dirty pages?

An extra LRU for pages with PG_reclaim set?

Thanks,
-Kame


^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2011-11-11  0:06 UTC | newest]

Thread overview: 32+ messages
2011-08-08 11:06 [PATCH 1/2] vmscan: promote shared file mapped pages Konstantin Khlebnikov
2011-08-08 11:07 ` [PATCH 2/2] vmscan: activate executable pages after first usage Konstantin Khlebnikov
2011-08-08 23:58   ` KAMEZAWA Hiroyuki
2011-08-09  0:02   ` Minchan Kim
2011-08-09  0:04     ` KAMEZAWA Hiroyuki
2011-08-09  0:26       ` Minchan Kim
2011-08-09  1:23   ` Shaohua Li
2011-08-08 11:37 ` [PATCH 1/2] vmscan: promote shared file mapped pages Pekka Enberg
2011-08-08 12:18   ` Konstantin Khlebnikov
2011-08-08 12:40     ` Pekka Enberg
2011-08-08 12:51       ` Konstantin Khlebnikov
2011-08-18  9:09     ` Johannes Weiner
2011-11-02 16:30     ` Johannes Weiner
2011-11-02 16:31       ` [rfc 1/3] mm: vmscan: never swap under low memory pressure Johannes Weiner
2011-11-02 17:54         ` KOSAKI Motohiro
2011-11-03 15:51           ` Johannes Weiner
2011-11-08  0:16             ` KOSAKI Motohiro
2011-11-07  2:29         ` KAMEZAWA Hiroyuki
2011-11-10 15:29           ` Johannes Weiner
2011-11-02 16:32       ` [rfc 2/3] mm: vmscan: treat inactive cycling as neutral Johannes Weiner
2011-11-02 18:04         ` KOSAKI Motohiro
2011-11-03 12:49           ` Johannes Weiner
2011-11-07  2:34         ` KAMEZAWA Hiroyuki
2011-11-10 16:06           ` Johannes Weiner
2011-11-11  0:05             ` KAMEZAWA Hiroyuki
2011-11-02 16:32       ` [rfc 3/3] mm: vmscan: revert file list boost on lru addition Johannes Weiner
2011-11-07  2:45         ` KAMEZAWA Hiroyuki
2011-11-10 16:12           ` Johannes Weiner
2011-11-02 16:35       ` [PATCH 1/2] vmscan: promote shared file mapped pages Johannes Weiner
2011-08-08 23:36 ` Minchan Kim
2011-08-08 23:51 ` KAMEZAWA Hiroyuki
2011-10-31 20:12 ` Andrew Morton
