* [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded
@ 2009-03-11 15:30 David Howells
  2009-03-11 17:26 ` Johannes Weiner
                   ` (3 more replies)
  0 siblings, 4 replies; 101+ messages in thread
From: David Howells @ 2009-03-11 15:30 UTC (permalink / raw)
  To: torvalds, akpm, peterz; +Cc: Enrik.Berkhan, dhowells, uclinux-dev, linux-kernel

From: Enrik Berkhan <Enrik.Berkhan@ge.com>

The pages attached to a ramfs inode's pagecache by truncation from nothing - as
done by SYSV SHM for example - may get discarded under memory pressure.

The problem is that the pages are not marked dirty.  Anything that creates data
in an MMU-based ramfs will cause the set_page_dirty() aop to be called on the
pages holding that data.

For the NOMMU-based mmap, set_page_dirty() may be called by write(), but it
won't be called by page-writing faults on writable mmaps, and it isn't called
by ramfs_nommu_expand_for_mapping() when a file is being truncated from nothing
to allocate a contiguous run.

The solution is to mark the pages dirty at the point of allocation by
the truncation code.

Signed-off-by: Enrik Berkhan <Enrik.Berkhan@ge.com>
Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/ramfs/file-nommu.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)


diff --git a/fs/ramfs/file-nommu.c b/fs/ramfs/file-nommu.c
index b9b567a..90d72be 100644
--- a/fs/ramfs/file-nommu.c
+++ b/fs/ramfs/file-nommu.c
@@ -114,6 +114,9 @@ int ramfs_nommu_expand_for_mapping(struct inode *inode, size_t newsize)
 		if (!pagevec_add(&lru_pvec, page))
 			__pagevec_lru_add_file(&lru_pvec);
 
+		/* prevent the page from being discarded on memory pressure */
+		SetPageDirty(page);
+
 		unlock_page(page);
 	}
 


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded
  2009-03-11 15:30 [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded David Howells
@ 2009-03-11 17:26 ` Johannes Weiner
  2009-03-11 22:03   ` Andrew Morton
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 101+ messages in thread
From: Johannes Weiner @ 2009-03-11 17:26 UTC (permalink / raw)
  To: David Howells
  Cc: torvalds, akpm, peterz, Enrik.Berkhan, uclinux-dev, linux-kernel

On Wed, Mar 11, 2009 at 03:30:35PM +0000, David Howells wrote:
> From: Enrik Berkhan <Enrik.Berkhan@ge.com>
> 
> The pages attached to a ramfs inode's pagecache by truncation from nothing - as
> done by SYSV SHM for example - may get discarded under memory pressure.
> 
> The problem is that the pages are not marked dirty.  Anything that creates data
> in an MMU-based ramfs will cause the set_page_dirty() aop to be called on the
> pages holding that data.
> 
> For the NOMMU-based mmap, set_page_dirty() may be called by write(), but it
> won't be called by page-writing faults on writable mmaps, and it isn't called
> by ramfs_nommu_expand_for_mapping() when a file is being truncated from nothing
> to allocate a contiguous run.
> 
> The solution is to mark the pages dirty at the point of allocation by
> the truncation code.
> 
> Signed-off-by: Enrik Berkhan <Enrik.Berkhan@ge.com>
> Signed-off-by: David Howells <dhowells@redhat.com>
> ---
> 
>  fs/ramfs/file-nommu.c |    3 +++
>  1 files changed, 3 insertions(+), 0 deletions(-)
> 
> 
> diff --git a/fs/ramfs/file-nommu.c b/fs/ramfs/file-nommu.c
> index b9b567a..90d72be 100644
> --- a/fs/ramfs/file-nommu.c
> +++ b/fs/ramfs/file-nommu.c
> @@ -114,6 +114,9 @@ int ramfs_nommu_expand_for_mapping(struct inode *inode, size_t newsize)
>  		if (!pagevec_add(&lru_pvec, page))
>  			__pagevec_lru_add_file(&lru_pvec);
>  
> +		/* prevent the page from being discarded on memory pressure */
> +		SetPageDirty(page);
> +
>  		unlock_page(page);
>  	}

Reviewed-by: Johannes Weiner <hannes@cmpxchg.org>

I think the attached patch is also needed, though unrelated to the
above.

	Hannes

---
From bfa7bc5f884bbc01c5e10faba7ca17160befd61e Mon Sep 17 00:00:00 2001
From: Johannes Weiner <hannes@cmpxchg.org>
Date: Wed, 11 Mar 2009 18:13:34 +0100
Subject: [PATCH] ramfs: don't leak pages when adding to page cache fails

When a ramfs nommu mapping is expanded, contiguous pages are allocated
and added to the pagecache.  The caller's reference is then passed on
by moving whole pagevecs to the file lru list.

If the page cache adding fails, make sure that the error path also
moves the pagevec contents which might still contain up to
PAGEVEC_SIZE successfully added pages, of which we would leak
references otherwise.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 fs/ramfs/file-nommu.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/fs/ramfs/file-nommu.c b/fs/ramfs/file-nommu.c
index b9b567a..6d1624e 100644
--- a/fs/ramfs/file-nommu.c
+++ b/fs/ramfs/file-nommu.c
@@ -126,6 +126,7 @@ int ramfs_nommu_expand_for_mapping(struct inode *inode, size_t newsize)
 	return -EFBIG;
 
  add_error:
+	pagevec_lru_add_file(&lru_pvec);
 	page_cache_release(pages + loop);
 	for (loop++; loop < npages; loop++)
 		__free_page(pages + loop);
-- 
1.6.1.3
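
For illustration only, the leak described in the changelog above can be
modelled in user space.  This is a toy sketch, not kernel code: every toy_*
name is hypothetical, the batch size of 14 is an assumption, and the refcount
handling is reduced to "flushing the batch drops the caller's reference".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAGEVEC_SIZE 14	/* assumed batch size, for illustration only */

struct toy_pagevec {
	size_t nr;
	int *refs[TOY_PAGEVEC_SIZE];	/* pointers to per-page refcounts */
};

/* Queue a page; returns the space left, so 0 means "flush now". */
static size_t toy_pagevec_add(struct toy_pagevec *pvec, int *refcount)
{
	assert(pvec->nr < TOY_PAGEVEC_SIZE);
	pvec->refs[pvec->nr++] = refcount;
	return TOY_PAGEVEC_SIZE - pvec->nr;
}

/* Hand the queued pages on, dropping the caller's reference on each. */
static void toy_pagevec_flush(struct toy_pagevec *pvec)
{
	for (size_t i = 0; i < pvec->nr; i++)
		(*pvec->refs[i])--;
	pvec->nr = 0;
}

/*
 * Add npages pages; adding page number fail_at fails (pass npages for
 * "no failure").  Returns how many successfully added pages still carry
 * the caller's extra reference afterwards, i.e. how many were leaked.
 */
static size_t toy_expand(size_t npages, size_t fail_at, bool flush_on_error)
{
	int refs[64];
	struct toy_pagevec pvec = { 0 };
	size_t i, leaked = 0;

	assert(npages <= 64);
	for (i = 0; i < npages; i++) {
		if (i == fail_at) {
			/* error path: the fix is to flush the partial batch */
			if (flush_on_error)
				toy_pagevec_flush(&pvec);
			break;
		}
		refs[i] = 2;	/* one ref for the page cache, one for the caller */
		if (!toy_pagevec_add(&pvec, &refs[i]))
			toy_pagevec_flush(&pvec);
	}
	if (i == npages)
		toy_pagevec_flush(&pvec);	/* normal end-of-loop flush */

	for (size_t j = 0; j < i; j++)
		if (refs[j] > 1)
			leaked++;
	return leaked;
}
```

In this model, toy_expand(20, 17, false) leaves the three pages queued since
the last full batch with a stray reference, while toy_expand(20, 17, true)
leaves none.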


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded
@ 2009-03-11 22:03   ` Andrew Morton
  0 siblings, 0 replies; 101+ messages in thread
From: Andrew Morton @ 2009-03-11 22:03 UTC (permalink / raw)
  To: David Howells
  Cc: torvalds, peterz, Enrik.Berkhan, uclinux-dev, linux-kernel,
	linux-mm, Johannes Weiner

On Wed, 11 Mar 2009 15:30:35 +0000
David Howells <dhowells@redhat.com> wrote:

> From: Enrik Berkhan <Enrik.Berkhan@ge.com>
> 
> The pages attached to a ramfs inode's pagecache by truncation from nothing - as
> done by SYSV SHM for example - may get discarded under memory pressure.

Something has gone wrong in core VM.

> The problem is that the pages are not marked dirty.  Anything that creates data
> in an MMU-based ramfs will cause the set_page_dirty() aop to be called on the
> pages holding that data.
> 
> For the NOMMU-based mmap, set_page_dirty() may be called by write(), but it
> won't be called by page-writing faults on writable mmaps, and it isn't called
> by ramfs_nommu_expand_for_mapping() when a file is being truncated from nothing
> to allocate a contiguous run.
> 
> The solution is to mark the pages dirty at the point of allocation by
> the truncation code.

Page reclaim shouldn't be even attempting to reclaim or write back
ramfs pagecache pages - reclaim can't possibly do anything with these
pages!

Arguably those pages shouldn't be on the LRU at all, but we haven't
done that yet.

Now, my problem is that I can't 100% be sure that we _ever_ implemented
this properly.  I _think_ we did, in which case we later broke it.  If
we've always been (stupidly) trying to pageout these pages then OK, I
guess your patch is a suitable 2.6.29 stopgap.

If, however, we broke it then we've probably broken other filesystems
and we should fix the regression instead.

Running bdi_cap_writeback_dirty() in may_write_to_queue() might be the
way to fix all this.

Peter touched it last :)

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded
  2009-03-11 22:03   ` Andrew Morton
@ 2009-03-11 22:36     ` Johannes Weiner
  -1 siblings, 0 replies; 101+ messages in thread
From: Johannes Weiner @ 2009-03-11 22:36 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Howells, torvalds, peterz, Enrik.Berkhan, uclinux-dev,
	linux-kernel, linux-mm

On Wed, Mar 11, 2009 at 03:03:02PM -0700, Andrew Morton wrote:
> On Wed, 11 Mar 2009 15:30:35 +0000
> David Howells <dhowells@redhat.com> wrote:
> 
> > From: Enrik Berkhan <Enrik.Berkhan@ge.com>
> > 
> > The pages attached to a ramfs inode's pagecache by truncation from nothing - as
> > done by SYSV SHM for example - may get discarded under memory pressure.
> 
> Something has gone wrong in core VM.
> 
> > The problem is that the pages are not marked dirty.  Anything that creates data
> > in an MMU-based ramfs will cause the set_page_dirty() aop to be called on the
> > pages holding that data.
> > 
> > For the NOMMU-based mmap, set_page_dirty() may be called by write(), but it
> > won't be called by page-writing faults on writable mmaps, and it isn't called
> > by ramfs_nommu_expand_for_mapping() when a file is being truncated from nothing
> > to allocate a contiguous run.
> > 
> > The solution is to mark the pages dirty at the point of allocation by
> > the truncation code.
> 
> Page reclaim shouldn't be even attempting to reclaim or write back
> ramfs pagecache pages - reclaim can't possibly do anything with these
> pages!
> 
> Arguably those pages shouldn't be on the LRU at all, but we haven't
> done that yet.
> 
> Now, my problem is that I can't 100% be sure that we _ever_ implemented
> this properly.  I _think_ we did, in which case we later broke it.  If
> we've always been (stupidly) trying to pageout these pages then OK, I
> guess your patch is a suitable 2.6.29 stopgap.
> 
> If, however, we broke it then we've probably broken other filesystems
> and we should fix the regression instead.
> 
> Running bdi_cap_writeback_dirty() in may_write_to_queue() might be the
> way to fix all this.

The pages are not dirty, so reclaim never reaches pageout(), which is
what would say PAGE_KEEP.  It will just go through and reclaim the
clean, unmapped pages.
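
As a rough illustration of that point, here is a toy user-space model of the
decision.  It is a sketch under stated assumptions, not the real vmscan code:
the toy_* names are hypothetical, and "mapped pages are skipped" is a
simplification made only to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for kernel state; none of these are real kernel types. */
enum toy_verdict { TOY_RECLAIMED, TOY_KEPT };

struct toy_page {
	bool dirty;	/* models PG_dirty */
	bool mapped;	/* models "mapped into a process" */
};

/*
 * Models only the branch discussed above: a dirty page would be handed
 * to writeback, which cannot write a ramfs page anywhere and so keeps
 * it; a clean, unmapped page is simply freed.
 */
static enum toy_verdict toy_reclaim_one(const struct toy_page *page)
{
	assert(page != NULL);

	if (page->mapped)
		return TOY_KEPT;	/* simplification: mapped pages are skipped */
	if (page->dirty)
		return TOY_KEPT;	/* the PAGE_KEEP outcome */
	return TOY_RECLAIMED;		/* clean + unmapped: contents are lost */
}

/* Count how many pages in a run would be thrown away under pressure. */
static size_t toy_count_reclaimed(const struct toy_page *pages, size_t n)
{
	size_t lost = 0;

	for (size_t i = 0; i < n; i++)
		if (toy_reclaim_one(&pages[i]) == TOY_RECLAIMED)
			lost++;
	return lost;
}
```

With every page in the run marked dirty, as the proposed patch does at
allocation time, toy_count_reclaimed() returns 0 in this model.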

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded
  2009-03-11 22:03   ` Andrew Morton
@ 2009-03-12  0:02     ` Andrew Morton
  -1 siblings, 0 replies; 101+ messages in thread
From: Andrew Morton @ 2009-03-12  0:02 UTC (permalink / raw)
  To: dhowells, torvalds, peterz, Enrik.Berkhan, uclinux-dev,
	linux-kernel, linux-mm, hannes

On Wed, 11 Mar 2009 15:03:02 -0700
Andrew Morton <akpm@linux-foundation.org> wrote:

> > The problem is that the pages are not marked dirty.  Anything that creates data
> > in an MMU-based ramfs will cause the set_page_dirty() aop to be called on the
> > pages holding that data.
> > 
> > For the NOMMU-based mmap, set_page_dirty() may be called by write(), but it
> > won't be called by page-writing faults on writable mmaps, and it isn't called
> > by ramfs_nommu_expand_for_mapping() when a file is being truncated from nothing
> > to allocate a contiguous run.
> > 
> > The solution is to mark the pages dirty at the point of allocation by
> > the truncation code.
> 
> Page reclaim shouldn't be even attempting to reclaim or write back
> ramfs pagecache pages - reclaim can't possibly do anything with these
> pages!
> 
> Arguably those pages shouldn't be on the LRU at all, but we haven't
> done that yet.
> 
> Now, my problem is that I can't 100% be sure that we _ever_ implemented
> this properly.  I _think_ we did, in which case we later broke it.  If
> we've always been (stupidly) trying to pageout these pages then OK, I
> guess your patch is a suitable 2.6.29 stopgap.

OK, I can't find any code anywhere in which we excluded ramfs pages
from consideration by page reclaim.  How dumb.

So I guess that for now the proposed patch is suitable.  Longer-term we
should bail early in shrink_page_list(), or not add these pages to the
LRU at all.

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded
  2009-03-11 15:30 [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded David Howells
  2009-03-11 17:26 ` Johannes Weiner
  2009-03-11 22:03   ` Andrew Morton
@ 2009-03-12  0:08 ` Andrew Morton
  2009-03-12  7:12   ` Berkhan, Enrik (GE Infra, Oil & Gas)
  2009-03-12 12:25 ` David Howells
  3 siblings, 1 reply; 101+ messages in thread
From: Andrew Morton @ 2009-03-12  0:08 UTC (permalink / raw)
  To: David Howells
  Cc: torvalds, peterz, Enrik.Berkhan, dhowells, uclinux-dev, linux-kernel

On Wed, 11 Mar 2009 15:30:35 +0000
David Howells <dhowells@redhat.com> wrote:

> From: Enrik Berkhan <Enrik.Berkhan@ge.com>
> 
> The pages attached to a ramfs inode's pagecache by truncation from nothing - as
> done by SYSV SHM for example - may get discarded under memory pressure.
> 
> The problem is that the pages are not marked dirty.  Anything that creates data
> in an MMU-based ramfs will cause the set_page_dirty() aop to be called on the
> pages holding that data.
> 
> For the NOMMU-based mmap, set_page_dirty() may be called by write(), but it
> won't be called by page-writing faults on writable mmaps, and it isn't called
> by ramfs_nommu_expand_for_mapping() when a file is being truncated from nothing
> to allocate a contiguous run.
> 
> The solution is to mark the pages dirty at the point of allocation by
> the truncation code.
> 
> Signed-off-by: Enrik Berkhan <Enrik.Berkhan@ge.com>
> Signed-off-by: David Howells <dhowells@redhat.com>
> ---
> 
>  fs/ramfs/file-nommu.c |    3 +++
>  1 files changed, 3 insertions(+), 0 deletions(-)
> 
> 
> diff --git a/fs/ramfs/file-nommu.c b/fs/ramfs/file-nommu.c
> index b9b567a..90d72be 100644
> --- a/fs/ramfs/file-nommu.c
> +++ b/fs/ramfs/file-nommu.c
> @@ -114,6 +114,9 @@ int ramfs_nommu_expand_for_mapping(struct inode *inode, size_t newsize)
>  		if (!pagevec_add(&lru_pvec, page))
>  			__pagevec_lru_add_file(&lru_pvec);
>  
> +		/* prevent the page from being discarded on memory pressure */
> +		SetPageDirty(page);
> +
>  		unlock_page(page);
>  	}

Was there a specific reason for using the low-level SetPageDirty()?

On the write() path, ramfs pages will be dirtied by
simple_commit_write()'s set_page_dirty(), which calls
__set_page_dirty_no_writeback().

It just so happens that __set_page_dirty_no_writeback() is equivalent
to a simple SetPageDirty() - it bypasses all the extra things which we
do for normal permanent-storage-backed pages.

But I'd have thought that it would be cleaner and more maintainable (albeit
a bit slower) to go through the a_ops?
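
To make the two options concrete, here is a toy user-space sketch of "set the
flag directly" versus "go through the a_ops".  It is only a model: the toy_*
names are hypothetical, the ops table has a single hook, and the return
values (1 when the flag was newly set) are a convention of this sketch rather
than a statement about the kernel functions named above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_page;

/* Hypothetical miniature of an address_space_operations table. */
struct toy_aops {
	int (*set_page_dirty)(struct toy_page *page);
};

struct toy_page {
	bool dirty;
	const struct toy_aops *aops;
	unsigned int hook_calls;	/* counts trips through the hook */
};

/* Low-level flag write, in the spirit of SetPageDirty(). */
static void toy_SetPageDirty(struct toy_page *page)
{
	page->dirty = true;
}

/* No-writeback hook: sets the flag and does no other accounting. */
static int toy_set_page_dirty_no_writeback(struct toy_page *page)
{
	page->hook_calls++;
	if (page->dirty)
		return 0;
	toy_SetPageDirty(page);
	return 1;
}

/* Generic entry point: use the filesystem's hook if one is supplied. */
static int toy_set_page_dirty(struct toy_page *page)
{
	bool was_dirty;

	assert(page != NULL);

	if (page->aops && page->aops->set_page_dirty)
		return page->aops->set_page_dirty(page);

	was_dirty = page->dirty;
	toy_SetPageDirty(page);
	return !was_dirty;
}

static const struct toy_aops toy_ramfs_aops = {
	.set_page_dirty = toy_set_page_dirty_no_writeback,
};
```

Both routes end with the same flag set; the indirect one costs a function
pointer call but keeps working if the filesystem's hook later grows extra
behaviour, which is the maintainability argument made above.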


^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may  get wrongly discarded
  2009-03-12  0:02     ` Andrew Morton
@ 2009-03-12  0:35       ` Minchan Kim
  -1 siblings, 0 replies; 101+ messages in thread
From: Minchan Kim @ 2009-03-12  0:35 UTC (permalink / raw)
  To: Andrew Morton
  Cc: dhowells, torvalds, peterz, Enrik.Berkhan, uclinux-dev,
	linux-kernel, linux-mm, Johannes Weiner, Minchan Kim,
	Rik van Riel, Lee Schermerhorn, KOSAKI Motohiro

On Thu, Mar 12, 2009 at 9:02 AM, Andrew Morton
<akpm@linux-foundation.org> wrote:
> On Wed, 11 Mar 2009 15:03:02 -0700
> Andrew Morton <akpm@linux-foundation.org> wrote:
>
>> > The problem is that the pages are not marked dirty.  Anything that creates data
>> > in an MMU-based ramfs will cause the set_page_dirty() aop to be called on the
>> > pages holding that data.
>> >
>> > For the NOMMU-based mmap, set_page_dirty() may be called by write(), but it
>> > won't be called by page-writing faults on writable mmaps, and it isn't called
>> > by ramfs_nommu_expand_for_mapping() when a file is being truncated from nothing
>> > to allocate a contiguous run.
>> >
>> > The solution is to mark the pages dirty at the point of allocation by
>> > the truncation code.
>>
>> Page reclaim shouldn't be even attempting to reclaim or write back
>> ramfs pagecache pages - reclaim can't possibly do anything with these
>> pages!
>>
>> Arguably those pages shouldn't be on the LRU at all, but we haven't
>> done that yet.
>>
>> Now, my problem is that I can't 100% be sure that we _ever_ implemented
>> this properly.  I _think_ we did, in which case we later broke it.  If
>> we've always been (stupidly) trying to pageout these pages then OK, I
>> guess your patch is a suitable 2.6.29 stopgap.
>
> OK, I can't find any code anywhere in which we excluded ramfs pages
> from consideration by page reclaim.  How dumb.


ramfs handles this only in the CONFIG_UNEVICTABLE_LRU case.
In that case, ramfs_get_inode() calls mapping_set_unevictable(),
so page reclaim can exclude ramfs pages via page_evictable().
That is the problem.
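
A toy user-space model of the mechanism just described, for illustration
only.  The toy_* names are hypothetical, and the config option is modelled as
a plain boolean argument rather than a build-time setting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_mapping {
	bool unevictable;	/* models the per-mapping "unevictable" flag */
};

struct toy_page {
	const struct toy_mapping *mapping;
};

static void toy_mapping_set_unevictable(struct toy_mapping *mapping)
{
	mapping->unevictable = true;
}

/* Reclaim-side check: pages of an unevictable mapping are never candidates. */
static bool toy_page_evictable(const struct toy_page *page)
{
	assert(page != NULL);
	return !(page->mapping && page->mapping->unevictable);
}

/* Inode setup: the flag is only set when the toy config option is enabled. */
static void toy_ramfs_init_mapping(struct toy_mapping *mapping,
				   bool config_unevictable_lru)
{
	mapping->unevictable = false;
	if (config_unevictable_lru)
		toy_mapping_set_unevictable(mapping);
}
```

With the option off, nothing in this model tells reclaim to leave the pages
alone, so a clean page remains a candidate.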


> So I guess that for now the proposed patch is suitable.  Longer-term we
> should bail early in shrink_page_list(), or not add these pages to the
> LRU at all.

In the future, we have to improve this.




-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may  get wrongly discarded
  2009-03-12  0:35       ` Minchan Kim
@ 2009-03-12  1:04         ` KOSAKI Motohiro
  -1 siblings, 0 replies; 101+ messages in thread
From: KOSAKI Motohiro @ 2009-03-12  1:04 UTC (permalink / raw)
  To: Minchan Kim
  Cc: kosaki.motohiro, Andrew Morton, dhowells, torvalds, peterz,
	Enrik.Berkhan, uclinux-dev, linux-kernel, linux-mm,
	Johannes Weiner, Rik van Riel, Lee Schermerhorn

Hi

> >> Page reclaim shouldn't be even attempting to reclaim or write back
> >> ramfs pagecache pages - reclaim can't possibly do anything with these
> >> pages!
> >>
> >> Arguably those pages shouldn't be on the LRU at all, but we haven't
> >> done that yet.
> >>
> >> Now, my problem is that I can't 100% be sure that we _ever_ implemented
> >> this properly.  I _think_ we did, in which case we later broke it.  If
> >> we've always been (stupidly) trying to pageout these pages then OK, I
> >> guess your patch is a suitable 2.6.29 stopgap.
> >
> > OK, I can't find any code anywhere in which we excluded ramfs pages
> > from consideration by page reclaim.  How dumb.
> 
> ramfs handles this only in the CONFIG_UNEVICTABLE_LRU case.
> In that case, ramfs_get_inode() calls mapping_set_unevictable(),
> so page reclaim can exclude ramfs pages via page_evictable().
> That is the problem.

Currently, CONFIG_UNEVICTABLE_LRU can't be used on NOMMU machines
because none of the vmscan folks have a NOMMU machine.

Yes, it is a very stupid reason.  Testers are _very_ welcome! :)



David, could you please try the following patch if you have a NOMMU machine?
It is a straightforward port to NOMMU.


==
Subject: [PATCH] remove the MMU dependency from CONFIG_UNEVICTABLE_LRU

Logically, CONFIG_UNEVICTABLE_LRU doesn't depend on MMU,
but the current code does by mistake.  Fix it.


Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
---
 mm/Kconfig |    1 -
 mm/nommu.c |   24 ++++++++++++++++++++++++
 2 files changed, 24 insertions(+), 1 deletion(-)

Index: b/mm/Kconfig
===================================================================
--- a/mm/Kconfig	2008-12-28 20:55:23.000000000 +0900
+++ b/mm/Kconfig	2008-12-28 21:24:08.000000000 +0900
@@ -212,7 +212,6 @@ config VIRT_TO_BUS
 config UNEVICTABLE_LRU
 	bool "Add LRU list to track non-evictable pages"
 	default y
-	depends on MMU
 	help
 	  Keeps unevictable pages off of the active and inactive pageout
 	  lists, so kswapd will not waste CPU time or have its balancing
Index: b/mm/nommu.c
===================================================================
--- a/mm/nommu.c	2008-12-25 08:26:37.000000000 +0900
+++ b/mm/nommu.c	2008-12-28 21:29:36.000000000 +0900
@@ -1521,3 +1521,27 @@ int access_process_vm(struct task_struct
 	mmput(mm);
 	return len;
 }
+
+/*
+ *  LRU accounting for clear_page_mlock()
+ */
+void __clear_page_mlock(struct page *page)
+{
+	VM_BUG_ON(!PageLocked(page));
+
+	if (!page->mapping) {	/* truncated ? */
+		return;
+	}
+
+	dec_zone_page_state(page, NR_MLOCK);
+	count_vm_event(UNEVICTABLE_PGCLEARED);
+	if (!isolate_lru_page(page)) {
+		putback_lru_page(page);
+	} else {
+		/*
+		 * We lost the race.  The page has already moved to the evictable list.
+		 */
+		if (PageUnevictable(page))
+			count_vm_event(UNEVICTABLE_PGSTRANDED);
+	}
+}





^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may  get wrongly discarded
@ 2009-03-12  1:04         ` KOSAKI Motohiro
  0 siblings, 0 replies; 101+ messages in thread
From: KOSAKI Motohiro @ 2009-03-12  1:04 UTC (permalink / raw)
  To: Minchan Kim
  Cc: kosaki.motohiro, Andrew Morton, dhowells, torvalds, peterz,
	Enrik.Berkhan, uclinux-dev, linux-kernel, linux-mm,
	Johannes Weiner, Rik van Riel, Lee Schermerhorn

Hi

> >> Page reclaim shouldn't be even attempting to reclaim or write back
> >> ramfs pagecache pages - reclaim can't possibly do anything with these
> >> pages!
> >>
> >> Arguably those pages shouldn't be on the LRU at all, but we haven't
> >> done that yet.
> >>
> >> Now, my problem is that I can't 100% be sure that we _ever_ implemented
> >> this properly. ?I _think_ we did, in which case we later broke it. ?If
> >> we've always been (stupidly) trying to pageout these pages then OK, I
> >> guess your patch is a suitable 2.6.29 stopgap.
> >
> > OK, I can't find any code anywhere in which we excluded ramfs pages
> > from consideration by page reclaim. ?How dumb.
> 
> The ramfs  considers it in just CONFIG_UNEVICTABLE_LRU case
> It that case, ramfs_get_inode calls mapping_set_unevictable.
> So,  page reclaim can exclude ramfs pages by page_evictable.
> It's problem .

Currently, CONFIG_UNEVICTABLE_LRU can't use on nommu machine
because nobody of vmscan folk havbe nommu machine.

Yes, it is very stupid reason. _very_ welcome to tester! :)



David, Could you please try following patch if you have NOMMU machine?
it is straightforward porting to nommu.


==
Subject: [PATCH] remove to depend on MMU from CONFIG_UNEVICTABLE_LRU

logically, CONFIG_UNEVICTABLE_LRU doesn't depend on MMU.
but current code does by mistake. fix it.


Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
---
 mm/Kconfig |    1 -
 mm/nommu.c |   24 ++++++++++++++++++++++++
 2 files changed, 24 insertions(+), 1 deletion(-)

Index: b/mm/Kconfig
===================================================================
--- a/mm/Kconfig	2008-12-28 20:55:23.000000000 +0900
+++ b/mm/Kconfig	2008-12-28 21:24:08.000000000 +0900
@@ -212,7 +212,6 @@ config VIRT_TO_BUS
 config UNEVICTABLE_LRU
 	bool "Add LRU list to track non-evictable pages"
 	default y
-	depends on MMU
 	help
 	  Keeps unevictable pages off of the active and inactive pageout
 	  lists, so kswapd will not waste CPU time or have its balancing
Index: b/mm/nommu.c
===================================================================
--- a/mm/nommu.c	2008-12-25 08:26:37.000000000 +0900
+++ b/mm/nommu.c	2008-12-28 21:29:36.000000000 +0900
@@ -1521,3 +1521,27 @@ int access_process_vm(struct task_struct
 	mmput(mm);
 	return len;
 }
+
+/*
+ *  LRU accounting for clear_page_mlock()
+ */
+void __clear_page_mlock(struct page *page)
+{
+	VM_BUG_ON(!PageLocked(page));
+
+	if (!page->mapping) {	/* truncated ? */
+		return;
+	}
+
+	dec_zone_page_state(page, NR_MLOCK);
+	count_vm_event(UNEVICTABLE_PGCLEARED);
+	if (!isolate_lru_page(page)) {
+		putback_lru_page(page);
+	} else {
+		/*
+		 * We lost the race. the page already moved to evictable list.
+		 */
+		if (PageUnevictable(page))
+			count_vm_event(UNEVICTABLE_PGSTRANDED);
+	}
+}





^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded
  2009-03-12  1:04         ` KOSAKI Motohiro
@ 2009-03-12  1:52           ` Minchan Kim
  -1 siblings, 0 replies; 101+ messages in thread
From: Minchan Kim @ 2009-03-12  1:52 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Minchan Kim, Andrew Morton, dhowells, torvalds, peterz,
	Enrik.Berkhan, uclinux-dev, linux-kernel, linux-mm,
	Johannes Weiner, Rik van Riel, Lee Schermerhorn

Hi, Kosaki-san.

I think the unevictability of ramfs pages should not depend on CONFIG_UNEVICTABLE_LRU.
Wouldn't it be better to remove the dependency on CONFIG_UNEVICTABLE_LRU?

How about this?
It's just an RFC. It's not tested.

That's because we can't reclaim those pages regardless of whether there is an unevictable list or not.
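
As a rough illustration of the point above, here is a user-space sketch (not
kernel code) of what the config dependency does to the evictability check.
The struct layouts, the *_model names and the two page_evictable_* variants
are made up for this example; the real page_evictable() also takes a vma and
considers mlocked pages, which this sketch ignores.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical stand-ins for the kernel structures. */
#define AS_UNEVICTABLE_MODEL 0UL

struct address_space_model {
	unsigned long flags;
};

struct page_model {
	struct address_space_model *mapping;	/* NULL if the page has no mapping */
};

/* What ramfs would do for each inode's mapping. */
static void mapping_set_unevictable_model(struct address_space_model *m)
{
	m->flags |= 1UL << AS_UNEVICTABLE_MODEL;
}

static bool mapping_unevictable_model(const struct address_space_model *m)
{
	return m != NULL && (m->flags & (1UL << AS_UNEVICTABLE_MODEL)) != 0;
}

/* Mirrors the !CONFIG_UNEVICTABLE_LRU stub: every page looks evictable. */
static int page_evictable_stub(const struct page_model *page)
{
	(void)page;
	return 1;
}

/* Consults the mapping flag unconditionally, as argued for above. */
static int page_evictable_checked(const struct page_model *page)
{
	return !mapping_unevictable_model(page->mapping);
}
```

With the stub variant, a page whose mapping was marked unevictable is still
reported as evictable; with the checked variant it is not.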

From 487ce9577ea9c43b04ff340a1ba8c4030873e875 Mon Sep 17 00:00:00 2001
From: MinChan Kim <minchan.kim@gmail.com>
Date: Thu, 12 Mar 2009 10:35:37 +0900
Subject: [PATCH] test

Signed-off-by: MinChan Kim <minchan.kim@gmail.com>

---
 include/linux/pagemap.h |    9 ---------
 include/linux/swap.h    |    9 ++-------
 2 files changed, 2 insertions(+), 16 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 4d27bf8..0cf024c 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -32,7 +32,6 @@ static inline void mapping_set_error(struct address_space *mapping, int error)
 	}
 }
 
-#ifdef CONFIG_UNEVICTABLE_LRU
 #define AS_UNEVICTABLE	(__GFP_BITS_SHIFT + 2)	/* e.g., ramdisk, SHM_LOCK */
 
 static inline void mapping_set_unevictable(struct address_space *mapping)
@@ -51,14 +50,6 @@ static inline int mapping_unevictable(struct address_space *mapping)
 		return test_bit(AS_UNEVICTABLE, &mapping->flags);
 	return !!mapping;
 }
-#else
-static inline void mapping_set_unevictable(struct address_space *mapping) { }
-static inline void mapping_clear_unevictable(struct address_space *mapping) { }
-static inline int mapping_unevictable(struct address_space *mapping)
-{
-	return 0;
-}
-#endif
 
 static inline gfp_t mapping_gfp_mask(struct address_space * mapping)
 {
diff --git a/include/linux/swap.h b/include/linux/swap.h
index a3af95b..18c639b 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -233,8 +233,9 @@ static inline int zone_reclaim(struct zone *z, gfp_t mask, unsigned int order)
 }
 #endif
 
-#ifdef CONFIG_UNEVICTABLE_LRU
 extern int page_evictable(struct page *page, struct vm_area_struct *vma);
+
+#ifdef CONFIG_UNEVICTABLE_LRU
 extern void scan_mapping_unevictable_pages(struct address_space *);
 
 extern unsigned long scan_unevictable_pages;
@@ -243,12 +244,6 @@ extern int scan_unevictable_handler(struct ctl_table *, int, struct file *,
 extern int scan_unevictable_register_node(struct node *node);
 extern void scan_unevictable_unregister_node(struct node *node);
 #else
-static inline int page_evictable(struct page *page,
-						struct vm_area_struct *vma)
-{
-	return 1;
-}
-
 static inline void scan_mapping_unevictable_pages(struct address_space *mapping)
 {
 }
-- 
1.5.4.3





-- 
Kind regards,
Minchan Kim

^ permalink raw reply related	[flat|nested] 101+ messages in thread


* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded
  2009-03-12  1:52           ` Minchan Kim
@ 2009-03-12  1:56             ` Minchan Kim
  -1 siblings, 0 replies; 101+ messages in thread
From: Minchan Kim @ 2009-03-12  1:56 UTC (permalink / raw)
  To: Minchan Kim
  Cc: KOSAKI Motohiro, Andrew Morton, dhowells, torvalds, peterz,
	Enrik.Berkhan, uclinux-dev, linux-kernel, linux-mm,
	Johannes Weiner, Rik van Riel, Lee Schermerhorn

I sent the previous email by mistake while I was still writing it.
Sorry about that.
Please excuse the wrong patch title and changelog.
I think you can understand the patch well enough even though I haven't fixed them.

So I won't resend it until the discussion is finished. :)




-- 
Thanks,
Minchan Kim

^ permalink raw reply	[flat|nested] 101+ messages in thread


* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded
  2009-03-12  1:52           ` Minchan Kim
@ 2009-03-12  2:00             ` KOSAKI Motohiro
  -1 siblings, 0 replies; 101+ messages in thread
From: KOSAKI Motohiro @ 2009-03-12  2:00 UTC (permalink / raw)
  To: Minchan Kim
  Cc: kosaki.motohiro, Andrew Morton, dhowells, torvalds, peterz,
	Enrik.Berkhan, uclinux-dev, linux-kernel, linux-mm,
	Johannes Weiner, Rik van Riel, Lee Schermerhorn

> Hi, Kosaki-san.
>
> I think the unevictability of ramfs pages should not depend on CONFIG_UNEVICTABLE_LRU.
> Wouldn't it be better to remove the dependency on CONFIG_UNEVICTABLE_LRU?
>
> How about this?
> It's just an RFC. It's not tested.
>
> That's because we can't reclaim those pages regardless of whether there is an unevictable list or not.

Maybe your patch works.

But we can remove the CONFIG_UNEVICTABLE_LRU build option itself completely
once the NOMMU folks have confirmed that CONFIG_UNEVICTABLE_LRU works well on their machines.

That is cleaner, IMHO.
What do you think?




^ permalink raw reply	[flat|nested] 101+ messages in thread


* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded
  2009-03-12  2:00             ` KOSAKI Motohiro
@ 2009-03-12  2:11               ` Minchan Kim
  -1 siblings, 0 replies; 101+ messages in thread
From: Minchan Kim @ 2009-03-12  2:11 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Andrew Morton, dhowells, torvalds, peterz, Enrik.Berkhan,
	uclinux-dev, linux-kernel, linux-mm, Johannes Weiner,
	Rik van Riel, Lee Schermerhorn

On Thu, Mar 12, 2009 at 11:00 AM, KOSAKI Motohiro
<kosaki.motohiro@jp.fujitsu.com> wrote:
>> Hi, Kosaki-san.
>>
>> I think the unevictability of ramfs pages should not depend on CONFIG_UNEVICTABLE_LRU.
>> Wouldn't it be better to remove the dependency on CONFIG_UNEVICTABLE_LRU?
>>
>> How about this?
>> It's just an RFC. It's not tested.
>>
>> That's because we can't reclaim those pages regardless of whether there is an unevictable list or not.
>
> Maybe your patch works.
>
> But we can remove the CONFIG_UNEVICTABLE_LRU build option itself completely
> once the NOMMU folks have confirmed that CONFIG_UNEVICTABLE_LRU works well on their machines.
>
> That is cleaner, IMHO.
> What do you think?
>
>

I totally agree with your opinion.
Let's wait for the NOMMU folks' comments.


-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 101+ messages in thread


* RE: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded
  2009-03-12  0:08 ` Andrew Morton
@ 2009-03-12  7:12   ` Berkhan, Enrik (GE Infra, Oil & Gas)
  2009-03-12 11:29     ` [uClinux-dev] " Jamie Lokier
  0 siblings, 1 reply; 101+ messages in thread
From: Berkhan, Enrik (GE Infra, Oil & Gas) @ 2009-03-12  7:12 UTC (permalink / raw)
  To: Andrew Morton, David Howells
  Cc: torvalds, peterz, dhowells, uclinux-dev, linux-kernel

Andrew Morton wrote:
> On Wed, 11 Mar 2009 15:30:35 +0000
> David Howells <dhowells@redhat.com> wrote:
>> From: Enrik Berkhan <Enrik.Berkhan@ge.com>
>> 
>> The solution is to mark the pages dirty at the point of allocation by
>> the truncation code.
> 
> Was there a specific reason for using the low-level SetPageDirty()?

No, no specific reason. It was just my first try of a fix after spotting 
the problem. After a short discussion with David, we decided to wait for 
others' comments on using the low-/high-level approach.

Enrik

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [uClinux-dev] RE: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded
  2009-03-12  7:12   ` Berkhan, Enrik (GE Infra, Oil & Gas)
@ 2009-03-12 11:29     ` Jamie Lokier
  2009-03-12 11:50       ` Peter Zijlstra
  0 siblings, 1 reply; 101+ messages in thread
From: Jamie Lokier @ 2009-03-12 11:29 UTC (permalink / raw)
  To: uClinux development list
  Cc: Andrew Morton, David Howells, peterz, torvalds, linux-kernel

Berkhan, Enrik (GE Infra, Oil & Gas) wrote:
> Andrew Morton wrote:
> > On Wed, 11 Mar 2009 15:30:35 +0000
> > David Howells <dhowells@redhat.com> wrote:
> >> From: Enrik Berkhan <Enrik.Berkhan@ge.com>
> >> 
> >> The solution is to mark the pages dirty at the point of allocation by
> >> the truncation code.
> > 
> > Was there a specific reason for using the low-level SetPageDirty()?
> 
> No, no specific reason. It was just my first try of a fix after spotting 
> the problem. After a short discussion with David, we decided to wait for 
> others' comments on using the low-/high-level approach.

Tangentially related...

Does the vm pageout logic include or skip these "dirty" pages looking
for candidates to flush to storage?  What about with MMU?

-- Jamie

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [uClinux-dev] RE: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded
  2009-03-12 11:29     ` [uClinux-dev] " Jamie Lokier
@ 2009-03-12 11:50       ` Peter Zijlstra
  2009-03-12 23:20         ` Minchan Kim
  0 siblings, 1 reply; 101+ messages in thread
From: Peter Zijlstra @ 2009-03-12 11:50 UTC (permalink / raw)
  To: Jamie Lokier
  Cc: uClinux development list, Andrew Morton, David Howells, torvalds,
	linux-kernel

On Thu, 2009-03-12 at 11:29 +0000, Jamie Lokier wrote:
> Berkhan, Enrik (GE Infra, Oil & Gas) wrote:
> > Andrew Morton wrote:
> > > On Wed, 11 Mar 2009 15:30:35 +0000
> > > David Howells <dhowells@redhat.com> wrote:
> > >> From: Enrik Berkhan <Enrik.Berkhan@ge.com>
> > >> 
> > >> The solution is to mark the pages dirty at the point of allocation by
> > >> the truncation code.
> > > 
> > > Was there a specific reason for using the low-level SetPageDirty()?
> > 
> > No, no specific reason. It was just my first try of a fix after spotting 
> > the problem. After a short discussion with David, we decided to wait for 
> > others' comments on using the low-/high-level approach.
> 
> Tangentially related...
> 
> Does the vm pageout logic include or skip these "dirty" pages looking
> for candidates to flush to storage?  What about with MMU?

Includes them, regular pageout will try to do the writeout to clean them
and then discard them. 

The ramfs stuff is rather icky in that it adds the pages to the aging
list, marks them dirty, but does not provide a writeout method. 

This will make the paging code scan over them (continuously) trying to
clean them, failing that (lack of writeout method) and putting them back
on the list.
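
A minimal userspace sketch of the behaviour described above, assuming a
much-simplified model of the aging list.  None of the names below
(model_page, model_mapping, model_scan) are kernel APIs; they are
hypothetical stand-ins.  A clean page is simply discarded, which is the
original bug for ramfs, while a dirty page whose mapping has no writeout
method can never be cleaned and so is put back on the list on every pass.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel structures; not the real definitions. */
struct model_page;
typedef int (*model_writepage_fn)(struct model_page *page);

struct model_mapping {
	model_writepage_fn writepage;	/* NULL models "no writeout method" */
};

struct model_page {
	struct model_mapping *mapping;
	bool dirty;
	bool reclaimed;
};

/*
 * One pass over the aging list.  Returns the number of pages reclaimed;
 * *rotated counts the dirty pages that could not be cleaned and stay put.
 */
size_t model_scan(struct model_page *lru, size_t n, size_t *rotated)
{
	size_t freed = 0;

	assert(lru != NULL && rotated != NULL);
	*rotated = 0;

	for (size_t i = 0; i < n; i++) {
		struct model_page *p = &lru[i];

		if (p->reclaimed)
			continue;
		if (p->dirty) {
			/* Try to clean; without a writeout method this fails. */
			if (!p->mapping->writepage ||
			    p->mapping->writepage(p) != 0) {
				(*rotated)++;	/* back on the list */
				continue;
			}
			p->dirty = false;
		}
		p->reclaimed = true;	/* clean page: discard it */
		freed++;
	}
	return freed;
}
```

In this model a dirty ramfs-like page is counted in *rotated on every
call, which is the repeated scanning cost complained about above.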

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may  get wrongly discarded
  2009-03-12  1:04         ` KOSAKI Motohiro
@ 2009-03-12 12:19           ` Robin Getz
  -1 siblings, 0 replies; 101+ messages in thread
From: Robin Getz @ 2009-03-12 12:19 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Minchan Kim, Andrew Morton, dhowells, torvalds, peterz,
	Enrik.Berkhan, uclinux-dev, linux-kernel, linux-mm,
	Johannes Weiner, Rik van Riel, Lee Schermerhorn

On Wed 11 Mar 2009 21:04, KOSAKI Motohiro pondered:
> Hi
> 
> > >> Page reclaim shouldn't be even attempting to reclaim or write back
> > >> ramfs pagecache pages - reclaim can't possibly do anything with 
> > >> these pages!
> > >>
> > >> Arguably those pages shouldn't be on the LRU at all, but we haven't
> > >> done that yet.
> > >>
> > >> Now, my problem is that I can't 100% be sure that we _ever_
> > >> implemented this properly.  I _think_ we did, in which case
> > >> we later broke it.  If we've always been (stupidly) trying
> > >> to pageout these pages then OK, I guess your patch is a
> > >> suitable 2.6.29 stopgap.
> > >
> > > OK, I can't find any code anywhere in which we excluded ramfs pages
> > > from consideration by page reclaim.  How dumb.
> > 
> > ramfs handles this only in the CONFIG_UNEVICTABLE_LRU case.
> > In that case, ramfs_get_inode() calls mapping_set_unevictable(),
> > so page reclaim can exclude ramfs pages via page_evictable().
> > It's a problem.
> 
> Currently, CONFIG_UNEVICTABLE_LRU can't be used on nommu machines
> because none of the vmscan folk have a nommu machine.
> 
> Yes, it is a very stupid reason.  Testers are _very_ welcome! :)

As always - if you (or any kernel developer) would like a noMMU machine to 
test on - please send me a private email.

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded
  2009-03-11 15:30 [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded David Howells
                   ` (2 preceding siblings ...)
  2009-03-12  0:08 ` Andrew Morton
@ 2009-03-12 12:25 ` David Howells
  2009-03-12 19:43   ` Andrew Morton
  3 siblings, 1 reply; 101+ messages in thread
From: David Howells @ 2009-03-12 12:25 UTC (permalink / raw)
  To: Andrew Morton
  Cc: dhowells, torvalds, peterz, Enrik.Berkhan, uclinux-dev, linux-kernel

Andrew Morton <akpm@linux-foundation.org> wrote:

> Was there a specific reason for using the low-level SetPageDirty()?
> 
> On the write() path, ramfs pages will be dirtied by
> simple_commit_write()'s set_page_dirty(), which calls
> __set_page_dirty_no_writeback().
>
> It just so happens that __set_page_dirty_no_writeback() is equivalent
> to a simple SetPageDirty() - it bypasses all the extra things which we
> do for normal permanent-storage-backed pages.
> 
> But I'd have thought that it would be cleaner and more maintainable (albeit
> a bit slower) to go through the a_ops?

It basically boils down to SetPageDirty() with extra overhead, which you
pointed out.  We're manually manipulating the pagecache for this inode anyway,
so does it matter?
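
The equivalence being discussed can be sketched in userspace as follows.
This is only a model under stated assumptions: the model_* names are
hypothetical, the structures are not the kernel's, and the return value
of the "no writeback" op is modelled as always 0.  It shows the
high-level route dispatching through an operations table and ending up
doing nothing more than the low-level flag set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_page;

/* Hypothetical operations table, standing in for the a_ops idea. */
struct model_aops {
	int (*set_page_dirty)(struct model_page *page);
};

struct model_page {
	const struct model_aops *a_ops;
	bool dirty;
};

/* Low-level route: flip the flag directly. */
void model_SetPageDirty(struct model_page *page)
{
	page->dirty = true;
}

/* Models a "no writeback" dirtying op: sets the flag and nothing else. */
int model_set_page_dirty_no_writeback(struct model_page *page)
{
	if (!page->dirty)
		model_SetPageDirty(page);
	return 0;
}

/* High-level route: dispatch through the operations table if present. */
int model_set_page_dirty(struct model_page *page)
{
	assert(page != NULL && page->a_ops != NULL);
	if (page->a_ops->set_page_dirty)
		return page->a_ops->set_page_dirty(page);
	model_SetPageDirty(page);
	return 0;
}

/* A ramfs-like table that selects the "no writeback" op. */
const struct model_aops model_ramfs_aops = {
	.set_page_dirty = model_set_page_dirty_no_writeback,
};
```

Either route leaves the page in the same state in this model; the only
difference is the extra indirection on the high-level path.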

The main thing I think I'd rather get rid of is:

		if (!pagevec_add(&lru_pvec, page))
			__pagevec_lru_add_file(&lru_pvec);
	...
	pagevec_lru_add_file(&lru_pvec);

Which as Peter points out:

	The ramfs stuff is rather icky in that it adds the pages to the aging
	list, marks them dirty, but does not provide a writeout method. 

	This will make the paging code scan over them (continuously) trying to
	clean them, failing that (lack of writeout method) and putting them back
	on the list.

Not requiring the pages to be added to the LRU would be a really good idea.
They are not discardable, be it in MMU or NOMMU mode, except when the inode
itself is discarded.

Furthermore, does it really make sense for ramfs to use do_sync_read/write()
and generic_file_aio_read/write(), at least for NOMMU-mode?  These add a lot
of overhead, and ramfs doesn't really do either direct I/O or AIO.

The main point in favour of using these routines is commonality; but they do
add a lot of layers of overhead.  Does ramfs read/write performance matter
that much, I wonder.

David

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [uClinux-dev] Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded
  2009-03-12 12:19           ` Robin Getz
@ 2009-03-12 17:55             ` Jamie Lokier
  -1 siblings, 0 replies; 101+ messages in thread
From: Jamie Lokier @ 2009-03-12 17:55 UTC (permalink / raw)
  To: uClinux development list
  Cc: KOSAKI Motohiro, Lee Schermerhorn, Rik van Riel, peterz,
	linux-kernel, linux-mm, Minchan Kim, Johannes Weiner,
	Andrew Morton, torvalds

Robin Getz wrote:
> > Currently, CONFIG_UNEVICTABLE_LRU can't be used on nommu machines
> > because none of the vmscan folk have a nommu machine.
> > 
> > Yes, it is a very stupid reason.  Testers are _very_ welcome! :)
> 
> As always - if you (or any kernel developer) would like a noMMU machine to 
> test on - please send me a private email.

Well, that explains why vmscan has historically performed a little
dubiously on small nommu machines!

By the way, this is just a random side thought... nommu kernels work
just fine in emulators :-)

-- Jamie

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded
  2009-03-12 12:25 ` David Howells
@ 2009-03-12 19:43   ` Andrew Morton
  2009-03-13  2:03     ` KOSAKI Motohiro
  0 siblings, 1 reply; 101+ messages in thread
From: Andrew Morton @ 2009-03-12 19:43 UTC (permalink / raw)
  To: David Howells
  Cc: dhowells, torvalds, peterz, Enrik.Berkhan, uclinux-dev, linux-kernel

On Thu, 12 Mar 2009 12:25:24 +0000
David Howells <dhowells@redhat.com> wrote:

> Andrew Morton <akpm@linux-foundation.org> wrote:
> 
> > Was there a specific reason for using the low-level SetPageDirty()?
> > 
> > On the write() path, ramfs pages will be dirtied by
> > simple_commit_write()'s set_page_dirty(), which calls
> > __set_page_dirty_no_writeback().
> >
> > It just so happens that __set_page_dirty_no_writeback() is equivalent
> > to a simple SetPageDirty() - it bypasses all the extra things which we
> > do for normal permanent-storage-backed pages.
> > 
> > But I'd have thought that it would be cleaner and more maintainable (albeit
> > a bit slower) to go through the a_ops?
> 
> It basically boils down to SetPageDirty() with extra overhead, which you
> pointed out.  We're manually manipulating the pagecache for this inode anyway,
> so does it matter?

Not much.  It just seems a bit more consistent.

> The main thing I think I'd rather get rid of is:
> 
> 		if (!pagevec_add(&lru_pvec, page))
> 			__pagevec_lru_add_file(&lru_pvec);
> 	...
> 	pagevec_lru_add_file(&lru_pvec);
> 
> Which as Peter points out:
> 
> 	The ramfs stuff is rather icky in that it adds the pages to the aging
> 	list, marks them dirty, but does not provide a writeout method. 
> 
> 	This will make the paging code scan over them (continuously) trying to
> 	clean them, failing that (lack of writeout method) and putting them back
> 	on the list.
> 
> Not requiring the pages to be added to the LRU would be a really good idea.
> They are not discardable, be it in MMU or NOMMU mode, except when the inode
> itself is discarded.

Yep, these pages shouldn't be on the LRU at all.  I guess that will
require some tweaks to core filemap.c code.

> Furthermore, does it really make sense for ramfs to use do_sync_read/write()
> and generic_file_aio_read/write(), at least for NOMMU-mode?  These add a lot
> of overhead, and ramfs doesn't really do either direct I/O or AIO.
> 
> The main point in favour of using these routines is commonality; but they do
> add a lot of layers of overhead.

Yes, that code is very general hence always has overhead for each
specific client.

>  Does ramfs read/write performance matter
> that much, I wonder.

I doubt it.

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [uClinux-dev] RE: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded
  2009-03-12 11:50       ` Peter Zijlstra
@ 2009-03-12 23:20         ` Minchan Kim
  2009-03-13  7:56           ` Peter Zijlstra
  0 siblings, 1 reply; 101+ messages in thread
From: Minchan Kim @ 2009-03-12 23:20 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jamie Lokier, uClinux development list, Andrew Morton,
	David Howells, torvalds, linux-kernel

Hi, Peter.

On Thu, 12 Mar 2009 12:50:08 +0100
Peter Zijlstra <peterz@infradead.org> wrote:

> On Thu, 2009-03-12 at 11:29 +0000, Jamie Lokier wrote:
> > Berkhan, Enrik (GE Infra, Oil & Gas) wrote:
> > > Andrew Morton wrote:
> > > > On Wed, 11 Mar 2009 15:30:35 +0000
> > > > David Howells <dhowells@redhat.com> wrote:
> > > >> From: Enrik Berkhan <Enrik.Berkhan@ge.com>
> > > >> 
> > > >> The solution is to mark the pages dirty at the point of allocation by
> > > >> the truncation code.
> > > > 
> > > > Was there a specific reason for using the low-level SetPageDirty()?
> > > 
> > > No, no specific reason. It was just my first try of a fix after spotting 
> > > the problem. After a short discussion with David, we decided to wait for 
> > > others' comments on using the low-/high-level approach.
> > 
> > Tangentially related...
> > 
> > Does the vm pageout logic include or skip these "dirty" pages looking
> > for candidates to flush to storage?  What about with MMU?
> 
> Includes them, regular pageout will try to do the writeout to clean them
> and then discard them. 
> 
> The ramfs stuff is rather icky in that it adds the pages to the aging
> list, marks them dirty, but does not provide a writeout method. 
> 
> This will make the paging code scan over them (continuously) trying to
> clean them, failing that (lack of writeout method) and putting them back
> on the list.

It isn't true any more.
UNEVICTABLE_LRU will move ramfs pages from the LRU to the unevictable list.
Couldn't we solve this problem if NOMMU could support CONFIG_UNEVICTABLE_LRU?



-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded
  2009-03-12 19:43   ` Andrew Morton
@ 2009-03-13  2:03     ` KOSAKI Motohiro
  2009-03-13  7:57       ` Peter Zijlstra
  0 siblings, 1 reply; 101+ messages in thread
From: KOSAKI Motohiro @ 2009-03-13  2:03 UTC (permalink / raw)
  To: Andrew Morton
  Cc: kosaki.motohiro, David Howells, torvalds, peterz, Enrik.Berkhan,
	uclinux-dev, linux-kernel

Hi

> > Which as Peter points out:
> > 
> > 	The ramfs stuff is rather icky in that it adds the pages to the aging
> > 	list, marks them dirty, but does not provide a writeout method. 
> > 
> > 	This will make the paging code scan over them (continuously) trying to
> > 	clean them, failing that (lack of writeout method) and putting them back
> > 	on the list.
> > 
> > Not requiring the pages to be added to the LRU would be a really good idea.
> > They are not discardable, be it in MMU or NOMMU mode, except when the inode
> > itself is discarded.
> 
> Yep, these pages shouldn't be on the LRU at all.  I guess that will
> require some tweaks to core filemap.c code.

IMHO, UNEVICTABLE_LRU already does LRU isolation.
The only remaining problem is getting rid of the "depends on MMU" line in
mm/Kconfig.

Am I missing anything?





^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [uClinux-dev] RE: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded
  2009-03-12 23:20         ` Minchan Kim
@ 2009-03-13  7:56           ` Peter Zijlstra
  2009-03-13  9:17             ` Minchan Kim
  0 siblings, 1 reply; 101+ messages in thread
From: Peter Zijlstra @ 2009-03-13  7:56 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Jamie Lokier, uClinux development list, Andrew Morton,
	David Howells, torvalds, linux-kernel

On Fri, 2009-03-13 at 08:20 +0900, Minchan Kim wrote:

> > > Does the vm pageout logic include or skip these "dirty" pages looking
> > > for candidates to flush to storage?  What about with MMU?
> > 
> > Includes them, regular pageout will try to do the writeout to clean them
> > and then discard them. 
> > 
> > The ramfs stuff is rather icky in that it adds the pages to the aging
> > list, marks them dirty, but does not provide a writeout method. 
> > 
> > This will make the paging code scan over them (continuously) trying to
> > clean them, failing that (lack of writeout method) and putting them back
> > on the list.
> 
> It isn't true any more.
> UNEVICTABLE_LRU will move ramfs pages from the LRU to the unevictable list.
> Couldn't we solve this problem if NOMMU could support CONFIG_UNEVICTABLE_LRU?

That's more of a band-aid than a solution, no? They should never have
been on the list to begin with.


^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded
  2009-03-13  2:03     ` KOSAKI Motohiro
@ 2009-03-13  7:57       ` Peter Zijlstra
  2009-03-13  8:15         ` KOSAKI Motohiro
  0 siblings, 1 reply; 101+ messages in thread
From: Peter Zijlstra @ 2009-03-13  7:57 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Andrew Morton, David Howells, torvalds, Enrik.Berkhan,
	uclinux-dev, linux-kernel

On Fri, 2009-03-13 at 11:03 +0900, KOSAKI Motohiro wrote:
> Hi
> 
> > > Which as Peter points out:
> > > 
> > > 	The ramfs stuff is rather icky in that it adds the pages to the aging
> > > 	list, marks them dirty, but does not provide a writeout method. 
> > > 
> > > 	This will make the paging code scan over them (continuously) trying to
> > > 	clean them, failing that (lack of writeout method) and putting them back
> > > 	on the list.
> > > 
> > > Not requiring the pages to be added to the LRU would be a really good idea.
> > > They are not discardable, be it in MMU or NOMMU mode, except when the inode
> > > itself is discarded.
> > 
> > Yep, these pages shouldn't be on the LRU at all.  I guess that will
> > require some tweaks to core filemap.c code.
> 
> IMHO, UNEVICTABLE_LRU already does LRU isolation.
> The only remaining problem is getting rid of the "depends on MMU" line in
> mm/Kconfig.
> 
> Am I missing anything?

Yes, the need to take something off that shouldn't be there to begin
with.


^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded
  2009-03-13  7:57       ` Peter Zijlstra
@ 2009-03-13  8:15         ` KOSAKI Motohiro
  2009-03-13  9:19           ` Minchan Kim
  2009-03-13 10:44           ` Johannes Weiner
  0 siblings, 2 replies; 101+ messages in thread
From: KOSAKI Motohiro @ 2009-03-13  8:15 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: kosaki.motohiro, Andrew Morton, David Howells, torvalds,
	Enrik.Berkhan, uclinux-dev, linux-kernel

> > > > 	The ramfs stuff is rather icky in that it adds the pages to the aging
> > > > 	list, marks them dirty, but does not provide a writeout method. 
> > > > 
> > > > 	This will make the paging code scan over them (continuously) trying to
> > > > 	clean them, failing that (lack of writeout method) and putting them back
> > > > 	on the list.
> > > > 
> > > > Not requiring the pages to be added to the LRU would be a really good idea.
> > > > They are not discardable, be it in MMU or NOMMU mode, except when the inode
> > > > itself is discarded.
> > > 
> > > Yep, these pages shouldn't be on the LRU at all.  I guess that will
> > > require some tweaks to core filemap.c code.
> > 
> > IMHO, UNEVICTABLE_LRU already does LRU isolation.
> > The only remaining problem is getting rid of the "depends on MMU" line in
> > mm/Kconfig.
> > 
> > Am I missing anything?
> 
> Yes, the need to take something off that shouldn't be there to begin
> with.

In the past unevictable LRU discussion, we discussed the same thing.
At that time, we found two reasons why the unevictable LRU is better than
taking the pages off the LRU completely:

(1) The page migration code depends on pages staying on the LRU.
(2) "Taking off at reclaim time" avoids adding a lock to the fastpath.
    Completely removing pages from the LRU would need some kind of lock,
    and we disliked that at the time.

So, I think it is still true.
Of course, a better, cooler solution is always welcome :)
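
Point (2) can be sketched in userspace as follows, assuming a toy model
in which every page carries a flag saying whether its mapping is
unevictable.  The model_* names are hypothetical and no locking is
modelled.  Pages are added to the normal list without any check, and it
is the reclaim pass itself that moves unevictable ones aside, so later
passes no longer look at them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum model_list { MODEL_LRU, MODEL_UNEVICTABLE, MODEL_FREED };

struct model_page {
	bool mapping_unevictable;	/* models a mapping flagged unevictable */
	enum model_list list;		/* which list the page currently sits on */
};

/* Scan the normal list once; returns how many pages the pass looked at. */
size_t model_reclaim_pass(struct model_page *pages, size_t n)
{
	size_t scanned = 0;

	assert(pages != NULL);
	for (size_t i = 0; i < n; i++) {
		if (pages[i].list != MODEL_LRU)
			continue;	/* not on the scanned list */
		scanned++;
		if (pages[i].mapping_unevictable)
			pages[i].list = MODEL_UNEVICTABLE; /* cull at reclaim time */
		else
			pages[i].list = MODEL_FREED;	/* evictable: reclaim */
	}
	return scanned;
}
```

The first pass pays for looking at every page once; after that the
unevictable pages cost nothing, and the path that adds pages to the
list never had to test or lock anything extra.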





^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [uClinux-dev] RE: [PATCH] NOMMU: Pages allocated to a ramfs  inode's pagecache may get wrongly discarded
  2009-03-13  7:56           ` Peter Zijlstra
@ 2009-03-13  9:17             ` Minchan Kim
  0 siblings, 0 replies; 101+ messages in thread
From: Minchan Kim @ 2009-03-13  9:17 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Jamie Lokier, uClinux development list, Andrew Morton,
	David Howells, torvalds, linux-kernel, KOSAKI Motohiro

On Fri, Mar 13, 2009 at 4:56 PM, Peter Zijlstra <peterz@infradead.org> wrote:
> On Fri, 2009-03-13 at 08:20 +0900, Minchan Kim wrote:
>
>> > > Does the vm pageout logic include or skip these "dirty" pages looking
>> > > for candidates to flush to storage?  What about with MMU?
>> >
>> > Includes them, regular pageout will try to do the writeout to clean them
>> > and then discard them.
>> >
>> > The ramfs stuff is rather icky in that it adds the pages to the aging
>> > list, marks them dirty, but does not provide a writeout method.
>> >
>> > This will make the paging code scan over them (continuously) trying to
>> > clean them, failing that (lack of writeout method) and putting them back
>> > on the list.
>>
>> It isn't true any more.
>> UNEVICTABLE_LRU will move ramfs pages from the LRU to the unevictable list.
>> Couldn't we solve this problem if NOMMU could support CONFIG_UNEVICTABLE_LRU?
>
> That's more of a band-aid than a solution, no? They should never have
> been on the list to begin with.
>

I agree, as Andrew pointed out.
It may be a workaround, but it can be a good solution for the current
situation.  After that, I think we will have to improve it so that ramfs
pages are kept off the LRU list in future.

-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may  get wrongly discarded
  2009-03-13  8:15         ` KOSAKI Motohiro
@ 2009-03-13  9:19           ` Minchan Kim
  2009-03-13 10:44           ` Johannes Weiner
  1 sibling, 0 replies; 101+ messages in thread
From: Minchan Kim @ 2009-03-13  9:19 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Peter Zijlstra, Andrew Morton, David Howells, torvalds,
	Enrik.Berkhan, uclinux-dev, linux-kernel

On Fri, Mar 13, 2009 at 5:15 PM, KOSAKI Motohiro
<kosaki.motohiro@jp.fujitsu.com> wrote:
>> > > >         The ramfs stuff is rather icky in that it adds the pages to the aging
>> > > >         list, marks them dirty, but does not provide a writeout method.
>> > > >
>> > > >         This will make the paging code scan over them (continuously) trying to
>> > > >         clean them, failing that (lack of writeout method) and putting them back
>> > > >         on the list.
>> > > >
>> > > > Not requiring the pages to be added to the LRU would be a really good idea.
>> > > > They are not discardable, be it in MMU or NOMMU mode, except when the inode
>> > > > itself is discarded.
>> > >
>> > > Yep, these pages shouldn't be on the LRU at all.  I guess that will
>> > > require some tweaks to core filemap.c code.
>> >
>> > IMHO, UNEVICTABLE_LRU already does LRU isolation.
>> > The only remaining problem is getting rid of the "depends on MMU" line in
>> > mm/Kconfig.
>> >
>> > Am I missing anything?
>>
>> Yes, the need to take something off that shouldn't be there to begin
>> with.
>
> In the past unevictable LRU discussion, we discussed the same thing.
> At that time, we found two reasons why the unevictable LRU is better than
> taking the pages off the LRU completely:
>
> (1) The page migration code depends on pages staying on the LRU.
> (2) "Taking off at reclaim time" avoids adding a lock to the fastpath.
>    Completely removing pages from the LRU would need some kind of lock,
>    and we disliked that at the time.

Could you explain this issue in more detail when it is convenient, please?

> So, I think it is still true.
> Of course, a better, cooler solution is always welcome :)


-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded
  2009-03-13  8:15         ` KOSAKI Motohiro
  2009-03-13  9:19           ` Minchan Kim
@ 2009-03-13 10:44           ` Johannes Weiner
  2009-03-14 14:29             ` KOSAKI Motohiro
  1 sibling, 1 reply; 101+ messages in thread
From: Johannes Weiner @ 2009-03-13 10:44 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Peter Zijlstra, Andrew Morton, David Howells, torvalds,
	Enrik.Berkhan, uclinux-dev, linux-kernel

On Fri, Mar 13, 2009 at 05:15:44PM +0900, KOSAKI Motohiro wrote:
> > > > > 	The ramfs stuff is rather icky in that it adds the pages to the aging
> > > > > 	list, marks them dirty, but does not provide a writeout method. 
> > > > > 
> > > > > 	This will make the paging code scan over them (continuously) trying to
> > > > > 	clean them, failing that (lack of writeout method) and putting them back
> > > > > 	on the list.
> > > > > 
> > > > > Not requiring the pages to be added to the LRU would be a really good idea.
> > > > > They are not discardable, be it in MMU or NOMMU mode, except when the inode
> > > > > itself is discarded.
> > > > 
> > > > Yep, these pages shouldn't be on the LRU at all.  I guess that will
> > > > require some tweaks to core filemap.c code.
> > > 
> > > IMHO, UNEVICTABLE_LRU already does LRU isolation.
> > > The only remaining problem is getting rid of the "depends on MMU" line in
> > > mm/Kconfig.
> > > 
> > > Am I missing anything?
> > 
> > Yes, the need to take something off that shouldn't be there to begin
> > with.
> 
> In the past unevictable LRU discussion, we discussed the same thing.
> At that time, we found two reasons why the unevictable LRU is better than
> taking the pages off the LRU completely:
> 
> (1) The page migration code depends on pages staying on the LRU.
> (2) "Taking off at reclaim time" avoids adding a lock to the fastpath.
>     Completely removing pages from the LRU would need some kind of lock,
>     and we disliked that at the time.

Agreed with (1).  NOMMU can't support migration, though.  But keeping
them off the LRU on NOMMU needs adjustment of the page cache
read/write code in mm/filemap.c.

I'm not quite sure I understand (2).  But never adding these pages on
the LRU means we never have to remove them anywhere, no?

	Hannes

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded
  2009-03-12  0:35       ` Minchan Kim
@ 2009-03-13 11:53         ` David Howells
  -1 siblings, 0 replies; 101+ messages in thread
From: David Howells @ 2009-03-13 11:53 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: dhowells, Minchan Kim, Andrew Morton, torvalds, peterz,
	Enrik.Berkhan, uclinux-dev, linux-kernel, linux-mm,
	Johannes Weiner, Rik van Riel, Lee Schermerhorn

KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:

> David, Could you please try following patch if you have NOMMU machine?
> it is straightforward porting to nommu.

Is this patch actually sufficient, though?  Surely it requires an alteration
to ramfs to mark the page as being unevictable?

David

^ permalink raw reply	[flat|nested] 101+ messages in thread

* [PATCH 0/2] Make the Unevictable LRU available on NOMMU
  2009-03-12  1:04         ` KOSAKI Motohiro
@ 2009-03-13 17:33           ` David Howells
  -1 siblings, 0 replies; 101+ messages in thread
From: David Howells @ 2009-03-13 17:33 UTC (permalink / raw)
  To: kosaki.motohiro, minchan.kim
  Cc: dhowells, akpm, torvalds, peterz, nrik.Berkhan, uclinux-dev,
	linux-kernel, linux-mm, hannes, riel, lee.schermerhorn


The first patch causes the mlock() bits added by CONFIG_UNEVICTABLE_LRU to be
unavailable in NOMMU mode.

The second patch makes CONFIG_UNEVICTABLE_LRU available in NOMMU mode.

David

^ permalink raw reply	[flat|nested] 101+ messages in thread

* [PATCH 1/2] NOMMU: There is no mlock() for NOMMU, so don't provide the bits
  2009-03-13 17:33           ` David Howells
@ 2009-03-13 17:33             ` David Howells
  -1 siblings, 0 replies; 101+ messages in thread
From: David Howells @ 2009-03-13 17:33 UTC (permalink / raw)
  To: kosaki.motohiro, minchan.kim
  Cc: dhowells, akpm, torvalds, peterz, nrik.Berkhan, uclinux-dev,
	linux-kernel, linux-mm, hannes, riel, lee.schermerhorn

The mlock() facility does not exist for NOMMU since all mappings are
effectively locked anyway, so we don't make the bits available when they're
not useful.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 include/linux/page-flags.h |   20 +++++++++++++-------
 mm/Kconfig                 |    8 ++++++++
 mm/internal.h              |    8 +++++---
 3 files changed, 26 insertions(+), 10 deletions(-)


diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 219a523..61df177 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -96,6 +96,8 @@ enum pageflags {
 	PG_swapbacked,		/* Page is backed by RAM/swap */
 #ifdef CONFIG_UNEVICTABLE_LRU
 	PG_unevictable,		/* Page is "unevictable"  */
+#endif
+#ifdef CONFIG_HAVE_MLOCKED_PAGE_BIT
 	PG_mlocked,		/* Page is vma mlocked */
 #endif
 #ifdef CONFIG_IA64_UNCACHED_ALLOCATOR
@@ -234,20 +236,20 @@ PAGEFLAG_FALSE(SwapCache)
 #ifdef CONFIG_UNEVICTABLE_LRU
 PAGEFLAG(Unevictable, unevictable) __CLEARPAGEFLAG(Unevictable, unevictable)
 	TESTCLEARFLAG(Unevictable, unevictable)
+#else
+PAGEFLAG_FALSE(Unevictable) TESTCLEARFLAG_FALSE(Unevictable)
+	SETPAGEFLAG_NOOP(Unevictable) CLEARPAGEFLAG_NOOP(Unevictable)
+	__CLEARPAGEFLAG_NOOP(Unevictable)
+#endif
 
+#ifdef CONFIG_HAVE_MLOCKED_PAGE_BIT
 #define MLOCK_PAGES 1
 PAGEFLAG(Mlocked, mlocked) __CLEARPAGEFLAG(Mlocked, mlocked)
 	TESTSCFLAG(Mlocked, mlocked)
-
 #else
-
 #define MLOCK_PAGES 0
 PAGEFLAG_FALSE(Mlocked)
 	SETPAGEFLAG_NOOP(Mlocked) TESTCLEARFLAG_FALSE(Mlocked)
-
-PAGEFLAG_FALSE(Unevictable) TESTCLEARFLAG_FALSE(Unevictable)
-	SETPAGEFLAG_NOOP(Unevictable) CLEARPAGEFLAG_NOOP(Unevictable)
-	__CLEARPAGEFLAG_NOOP(Unevictable)
 #endif
 
 #ifdef CONFIG_IA64_UNCACHED_ALLOCATOR
@@ -367,9 +369,13 @@ static inline void __ClearPageTail(struct page *page)
 
 #ifdef CONFIG_UNEVICTABLE_LRU
 #define __PG_UNEVICTABLE	(1 << PG_unevictable)
-#define __PG_MLOCKED		(1 << PG_mlocked)
 #else
 #define __PG_UNEVICTABLE	0
+#endif
+
+#ifdef CONFIG_HAVE_MLOCKED_PAGE_BIT
+#define __PG_MLOCKED		(1 << PG_mlocked)
+#else
 #define __PG_MLOCKED		0
 #endif
 
diff --git a/mm/Kconfig b/mm/Kconfig
index a5b7781..8c89597 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -214,5 +214,13 @@ config UNEVICTABLE_LRU
 	  will use one page flag and increase the code size a little,
 	  say Y unless you know what you are doing.
 
+config HAVE_MLOCK
+	bool
+	default y if MMU=y
+
+config HAVE_MLOCKED_PAGE_BIT
+	bool
+	default y if HAVE_MLOCK=y && UNEVICTABLE_LRU=y
+
 config MMU_NOTIFIER
 	bool
diff --git a/mm/internal.h b/mm/internal.h
index 478223b..987bb03 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -63,6 +63,7 @@ static inline unsigned long page_order(struct page *page)
 	return page_private(page);
 }
 
+#ifdef CONFIG_HAVE_MLOCK
 extern long mlock_vma_pages_range(struct vm_area_struct *vma,
 			unsigned long start, unsigned long end);
 extern void munlock_vma_pages_range(struct vm_area_struct *vma,
@@ -71,6 +72,7 @@ static inline void munlock_vma_pages_all(struct vm_area_struct *vma)
 {
 	munlock_vma_pages_range(vma, vma->vm_start, vma->vm_end);
 }
+#endif
 
 #ifdef CONFIG_UNEVICTABLE_LRU
 /*
@@ -90,7 +92,7 @@ static inline void unevictable_migrate_page(struct page *new, struct page *old)
 }
 #endif
 
-#ifdef CONFIG_UNEVICTABLE_LRU
+#ifdef CONFIG_HAVE_MLOCKED_PAGE_BIT
 /*
  * Called only in fault path via page_evictable() for a new page
  * to determine if it's being mapped into a LOCKED vma.
@@ -165,7 +167,7 @@ static inline void free_page_mlock(struct page *page)
 	}
 }
 
-#else /* CONFIG_UNEVICTABLE_LRU */
+#else /* CONFIG_HAVE_MLOCKED_PAGE_BIT */
 static inline int is_mlocked_vma(struct vm_area_struct *v, struct page *p)
 {
 	return 0;
@@ -175,7 +177,7 @@ static inline void mlock_vma_page(struct page *page) { }
 static inline void mlock_migrate_page(struct page *new, struct page *old) { }
 static inline void free_page_mlock(struct page *page) { }
 
-#endif /* CONFIG_UNEVICTABLE_LRU */
+#endif /* CONFIG_HAVE_MLOCKED_PAGE_BIT */
 
 /*
  * Return the mem_map entry representing the 'offset' subpage within


^ permalink raw reply related	[flat|nested] 101+ messages in thread
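The flag handling in the patch above can be illustrated outside the kernel. What follows is a minimal userspace sketch, not the kernel's page-flags.h macros: every `toy_`/`TOY_` name is hypothetical, and `TOY_HAVE_MLOCKED_PAGE_BIT` merely stands in for the `CONFIG_HAVE_MLOCKED_PAGE_BIT` symbol the patch introduces. It shows the idea that, when the mlocked bit is compiled out, the enum does not reserve a bit for it and the accessors collapse to no-ops, so common code can call them unconditionally.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for CONFIG_HAVE_MLOCKED_PAGE_BIT.  The default of 0
 * models a NOMMU build; compile with -DTOY_HAVE_MLOCKED_PAGE_BIT=1 to model
 * an MMU build with the unevictable LRU enabled.
 */
#ifndef TOY_HAVE_MLOCKED_PAGE_BIT
#define TOY_HAVE_MLOCKED_PAGE_BIT 0
#endif

struct toy_page {
	unsigned long flags;
};

enum toy_pageflags {
	TOY_PG_unevictable,
#if TOY_HAVE_MLOCKED_PAGE_BIT
	TOY_PG_mlocked,		/* only occupies a bit when mlock exists */
#endif
	TOY_NR_PAGEFLAGS
};

/* The unevictable bit is always present in this model. */
static inline void toy_set_unevictable(struct toy_page *page)
{
	page->flags |= 1UL << TOY_PG_unevictable;
}

static inline bool toy_test_unevictable(const struct toy_page *page)
{
	return (page->flags & (1UL << TOY_PG_unevictable)) != 0;
}

#if TOY_HAVE_MLOCKED_PAGE_BIT
static inline void toy_set_mlocked(struct toy_page *page)
{
	page->flags |= 1UL << TOY_PG_mlocked;
}

static inline bool toy_test_mlocked(const struct toy_page *page)
{
	return (page->flags & (1UL << TOY_PG_mlocked)) != 0;
}
#else
/* No bit to touch: setting is a no-op and testing is always false. */
static inline void toy_set_mlocked(struct toy_page *page)
{
	(void)page;
}

static inline bool toy_test_mlocked(const struct toy_page *page)
{
	(void)page;
	return false;
}
#endif

/* Exercise the accessors the way code common to both builds might. */
static inline unsigned long toy_flags_after_lock_attempt(void)
{
	struct toy_page page = { 0 };

	toy_set_mlocked(&page);
	assert(toy_test_mlocked(&page) == (TOY_HAVE_MLOCKED_PAGE_BIT != 0));
	return page.flags;
}
```

With the default setting the lock attempt leaves the flags word untouched, while the unevictable bit keeps working, which is the split the patch makes between the two config symbols.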

* [PATCH 2/2] NOMMU: Make CONFIG_UNEVICTABLE_LRU available when CONFIG_MMU=n
  2009-03-13 17:33           ` David Howells
@ 2009-03-13 17:33             ` David Howells
  -1 siblings, 0 replies; 101+ messages in thread
From: David Howells @ 2009-03-13 17:33 UTC (permalink / raw)
  To: kosaki.motohiro, minchan.kim
  Cc: dhowells, akpm, torvalds, peterz, nrik.Berkhan, uclinux-dev,
	linux-kernel, linux-mm, hannes, riel, lee.schermerhorn

Make CONFIG_UNEVICTABLE_LRU available when CONFIG_MMU=n.  There's no logical
reason it shouldn't be available, and it can be used for ramfs.

Signed-off-by: David Howells <dhowells@redhat.com>
---

 mm/Kconfig |    1 -
 1 files changed, 0 insertions(+), 1 deletions(-)


diff --git a/mm/Kconfig b/mm/Kconfig
index 8c89597..b53427a 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -206,7 +206,6 @@ config VIRT_TO_BUS
 config UNEVICTABLE_LRU
 	bool "Add LRU list to track non-evictable pages"
 	default y
-	depends on MMU
 	help
 	  Keeps unevictable pages off of the active and inactive pageout
 	  lists, so kswapd will not waste CPU time or have its balancing


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded
  2009-03-13 11:53         ` David Howells
@ 2009-03-13 22:49           ` Johannes Weiner
  -1 siblings, 0 replies; 101+ messages in thread
From: Johannes Weiner @ 2009-03-13 22:49 UTC (permalink / raw)
  To: David Howells
  Cc: KOSAKI Motohiro, Minchan Kim, Andrew Morton, torvalds, peterz,
	Enrik.Berkhan, uclinux-dev, linux-kernel, linux-mm, Rik van Riel,
	Lee Schermerhorn

On Fri, Mar 13, 2009 at 11:53:02AM +0000, David Howells wrote:
> KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
> 
> > David, Could you please try following patch if you have NOMMU machine?
> > it is straightforward porting to nommu.
> 
> Is this patch actually sufficient, though?  Surely it requires an alteration
> to ramfs to mark the page as being unevictable?

ramfs already marks the whole address space of each inode as
unevictable, see ramfs_get_inode().

The reclaim code will honor this when CONFIG_UNEVICTABLE_LRU is enabled.

	Hannes
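
The point above, that evictability is decided per address space rather than per page, can be sketched in plain C. This is a toy userspace model under stated assumptions, not the kernel's code: all `toy_` names are hypothetical, and the real check also takes other conditions into account that are left out here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy address space: one flag covers every page cached for the inode. */
struct toy_mapping {
	bool unevictable;
};

/* Toy page: only remembers which mapping it belongs to (may be NULL). */
struct toy_page {
	struct toy_mapping *mapping;
};

/* What a ramfs-like filesystem would do once, when it sets up an inode. */
static void toy_mapping_set_unevictable(struct toy_mapping *mapping)
{
	mapping->unevictable = true;
}

/* Reclaim asks this per page; the answer comes from the owning mapping. */
static bool toy_page_evictable(const struct toy_page *page)
{
	if (page->mapping != NULL && page->mapping->unevictable)
		return false;
	return true;
}

/* Count the pages a reclaim pass over this array would be allowed to free. */
static size_t toy_count_reclaimable(const struct toy_page *pages, size_t n)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++)
		if (toy_page_evictable(&pages[i]))
			count++;
	return count;
}
```

No individual page needs marking in this model: a page added to the mapping later is covered automatically, which is why no ramfs change is needed once reclaim consults the mapping.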

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH 0/2] Make the Unevictable LRU available on NOMMU
  2009-03-13 17:33           ` David Howells
@ 2009-03-14  0:27             ` Minchan Kim
  -1 siblings, 0 replies; 101+ messages in thread
From: Minchan Kim @ 2009-03-14  0:27 UTC (permalink / raw)
  To: David Howells, Andrew Morton
  Cc: kosaki.motohiro, torvalds, peterz, nrik.Berkhan, uclinux-dev,
	linux-kernel, linux-mm, hannes, riel, lee.schermerhorn,
	Minchan Kim

Hi, David.

It seems your patch is better than mine.  Thanks. :)
But my concern is that, as Peter pointed out, the unevictable lru
is not a fundamental solution.

He wants to keep ramfs pages off the lru list to begin with.
I guess Andrew thought the same thing as Peter.

I think that is the fundamental solution, but it may be a long-term one.
This patch can solve the NOMMU problem as things currently stand.

Andrew, what do you think about it?

On Sat, Mar 14, 2009 at 2:33 AM, David Howells <dhowells@redhat.com> wrote:
>
> The first patch causes the mlock() bits added by CONFIG_UNEVICTABLE_LRU to be
> unavailable in NOMMU mode.
>
> The second patch makes CONFIG_UNEVICTABLE_LRU available in NOMMU mode.
>
> David
>



-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH 1/2] NOMMU: There is no mlock() for NOMMU, so don't provide the bits
  2009-03-13 17:33             ` David Howells
@ 2009-03-14 11:17               ` KOSAKI Motohiro
  -1 siblings, 0 replies; 101+ messages in thread
From: KOSAKI Motohiro @ 2009-03-14 11:17 UTC (permalink / raw)
  To: David Howells
  Cc: minchan.kim, akpm, torvalds, peterz, nrik.Berkhan, uclinux-dev,
	linux-kernel, linux-mm, hannes, riel, lee.schermerhorn

2009/3/14 David Howells <dhowells@redhat.com>:
> The mlock() facility does not exist for NOMMU since all mappings are
> effectively locked anyway, so we don't make the bits available when they're
> not useful.
>
> Signed-off-by: David Howells <dhowells@redhat.com>

Oh, your patch is a cleaner way.
Thanks!
   Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH 2/2] NOMMU: Make CONFIG_UNEVICTABLE_LRU available when CONFIG_MMU=n
  2009-03-13 17:33             ` David Howells
@ 2009-03-14 11:17               ` KOSAKI Motohiro
  -1 siblings, 0 replies; 101+ messages in thread
From: KOSAKI Motohiro @ 2009-03-14 11:17 UTC (permalink / raw)
  To: David Howells
  Cc: minchan.kim, akpm, torvalds, peterz, nrik.Berkhan, uclinux-dev,
	linux-kernel, linux-mm, hannes, riel, lee.schermerhorn

> diff --git a/mm/Kconfig b/mm/Kconfig
> index 8c89597..b53427a 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -206,7 +206,6 @@ config VIRT_TO_BUS
>  config UNEVICTABLE_LRU
>        bool "Add LRU list to track non-evictable pages"
>        default y
> -       depends on MMU
>        help
>          Keeps unevictable pages off of the active and inactive pageout
>          lists, so kswapd will not waste CPU time or have its balancing

   Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded
  2009-03-13 10:44           ` Johannes Weiner
@ 2009-03-14 14:29             ` KOSAKI Motohiro
  0 siblings, 0 replies; 101+ messages in thread
From: KOSAKI Motohiro @ 2009-03-14 14:29 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: kosaki.motohiro, Peter Zijlstra, Andrew Morton, David Howells,
	torvalds, Enrik.Berkhan, uclinux-dev, linux-kernel

> > (1) page migration code depend on the page stay on lru.
> > (2) "taking off at reclaim time" can avoid adding lock to fastpath.
> >     anyway, complely removing from lru need something lock.
> >     we disliked it at that time.
> 
> Agreed with (1).  NOMMU can't support migration, though.  But keeping
> them off the LRU on NOMMU needs adjustment of the page cache
> read/write code in mm/filemap.c.

yes.

> I'm not quite sure I understand (2).  But never adding these pages on
> the LRU means we never have to remove them anywhere, no?

Yeah, you are right.
I was confusing it with the munlock/shm_unlock case.




^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH 0/2] Make the Unevictable LRU available on NOMMU
  2009-03-14  0:27             ` Minchan Kim
@ 2009-03-20 16:08               ` Lee Schermerhorn
  -1 siblings, 0 replies; 101+ messages in thread
From: Lee Schermerhorn @ 2009-03-20 16:08 UTC (permalink / raw)
  To: Minchan Kim
  Cc: David Howells, Andrew Morton, kosaki.motohiro, torvalds, peterz,
	nrik.Berkhan, uclinux-dev, linux-kernel, linux-mm, hannes, riel

On Sat, 2009-03-14 at 09:27 +0900, Minchan Kim wrote:
> Hi, David.
> 
> It seems your patch is better than mine.  Thanks. :)
> But my concern is that, as Peter pointed out, the unevictable lru
> is not a fundamental solution.
> 
> He wants to keep ramfs pages off the lru list to begin with.
> I guess Andrew thought the same thing as Peter.
> 
> I think that is the fundamental solution, but it may be a long-term one.
> This patch can solve the NOMMU problem as things currently stand.
> 
> Andrew, what do you think about it?

[been meaning to respond to this...]

I just want to point out [again :)] that removing the ramfs pages from
the lru will prevent them from being migrated--e.g., for mem hot unplug,
defrag or such.  We currently have this situation with the new ram disk
driver [brd.c] which, unlike the old rd driver, doesn't place its pages
on the LRU.

Migration uses isolation of pages from lru to arbitrate between tasks
trying to migrate or reclaim the same page.  If migration doesn't find
the page on the lru, it assumes that it lost the race and skips the
page.  This is one of the reasons we chose to keep unevictable pages on
an lru-like list known to isolate_lru_page().

Something to keep in mind if/when this comes up again.  Maybe we don't
care?  Maybe ram disk/fs pages should come only from non-movable zone?
Or maybe migration can be reworked not to require the page be
"isolatable" from the lru [haven't thought about how one might do this].

Lee
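
The arbitration described above can be sketched as a toy model in C. The `toy_` names are hypothetical and the sketch is single-threaded; the real kernel performs the equivalent test-and-clear under a lock, which is omitted here. The point it illustrates is that whoever takes the page off the lru first owns it, and that a page which is never on the lru looks exactly like a lost race to the migration path.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy page: the only state modelled is lru membership. */
struct toy_page {
	bool on_lru;
};

/*
 * Take the page off the lru.  Returns 0 if this caller won the page,
 * -EBUSY if it was not on the lru (someone else has it, or it never was).
 */
static int toy_isolate_lru_page(struct toy_page *page)
{
	if (!page->on_lru)
		return -EBUSY;
	page->on_lru = false;
	return 0;
}

/* Hand the page back once the caller is done with it. */
static void toy_putback_lru_page(struct toy_page *page)
{
	page->on_lru = true;
}

/* Migration skips any page it cannot isolate. */
static bool toy_try_migrate(struct toy_page *page)
{
	if (toy_isolate_lru_page(page) != 0)
		return false;	/* lost the race, or page is not lru-managed */
	/* ... the copy to a new location would happen here ... */
	toy_putback_lru_page(page);
	return true;
}
```

In this model a page that sits on an lru-like list, even an unevictable one, stays migratable, whereas a page kept off the lru entirely is always skipped.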


^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH 0/2] Make the Unevictable LRU available on NOMMU
  2009-03-14  0:27             ` Minchan Kim
@ 2009-03-20 16:24               ` David Howells
  -1 siblings, 0 replies; 101+ messages in thread
From: David Howells @ 2009-03-20 16:24 UTC (permalink / raw)
  To: Lee Schermerhorn
  Cc: dhowells, Minchan Kim, Andrew Morton, kosaki.motohiro, torvalds,
	peterz, nrik.Berkhan, uclinux-dev, linux-kernel, linux-mm,
	hannes, riel

Lee Schermerhorn <Lee.Schermerhorn@hp.com> wrote:

> I just want to point out [again :)] that removing the ramfs pages from
> the lru will prevent them from being migrated

This is less of an issue for NOMMU kernels, since you can't migrate pages that
are mapped.

David

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH 0/2] Make the Unevictable LRU available on NOMMU
  2009-03-20 16:24               ` David Howells
@ 2009-03-20 18:30                 ` Lee Schermerhorn
  -1 siblings, 0 replies; 101+ messages in thread
From: Lee Schermerhorn @ 2009-03-20 18:30 UTC (permalink / raw)
  To: David Howells
  Cc: Minchan Kim, Andrew Morton, kosaki.motohiro, torvalds, peterz,
	nrik.Berkhan, uclinux-dev, linux-kernel, linux-mm, hannes, riel

On Fri, 2009-03-20 at 16:24 +0000, David Howells wrote:
> Lee Schermerhorn <Lee.Schermerhorn@hp.com> wrote:
> 
> > I just want to point out [again :)] that removing the ramfs pages from
> > the lru will prevent them from being migrated
> 
> This is less of an issue for NOMMU kernels, since you can't migrate pages that
> are mapped.


Agreed.  So, you could eliminate them [ramfs pages] from the lru for
just the nommu kernels, if you wanted to go that route.

Lee


^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH 0/2] Make the Unevictable LRU available on NOMMU
  2009-03-20 18:30                 ` Lee Schermerhorn
@ 2009-03-21 10:20                   ` Johannes Weiner
  -1 siblings, 0 replies; 101+ messages in thread
From: Johannes Weiner @ 2009-03-21 10:20 UTC (permalink / raw)
  To: Lee Schermerhorn
  Cc: David Howells, Minchan Kim, Andrew Morton, kosaki.motohiro,
	torvalds, peterz, nrik.Berkhan, uclinux-dev, linux-kernel,
	linux-mm, riel

On Fri, Mar 20, 2009 at 02:30:15PM -0400, Lee Schermerhorn wrote:
> On Fri, 2009-03-20 at 16:24 +0000, David Howells wrote:
> > Lee Schermerhorn <Lee.Schermerhorn@hp.com> wrote:
> > 
> > > I just want to point out [again :)] that removing the ramfs pages from
> > > the lru will prevent them from being migrated
> > 
> > This is less of an issue for NOMMU kernels, since you can't migrate pages that
> > are mapped.
> 
> 
> Agreed.  So, you could eliminate them [ramfs pages] from the lru for
> just the nommu kernels, if you wanted to go that route.

These pages don't come with much overhead anymore when they sit on the
unevictable list, right?  So I don't see much point in special casing
them all over the place.

I have a patchset that decouples the unevictable lru feature from
mlock, enables the latter on nommu and then makes sure ramfs pages go
immediately to the unevictable list so they don't need the scanner to
move them.  This is just wiring up of features we already have.

I will send this mondayish; I need to test it more, especially on a
NOMMU setup.

	Hannes

^ permalink raw reply	[flat|nested] 101+ messages in thread

* [patch 1/3] mm: decouple unevictable lru from mmu
  2009-03-21 10:20                   ` Johannes Weiner
@ 2009-03-22 20:13                     ` Johannes Weiner
  -1 siblings, 0 replies; 101+ messages in thread
From: Johannes Weiner @ 2009-03-22 20:13 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-mm, David Howells, Nick Piggin,
	KOSAKI Motohiro, Rik van Riel, Peter Zijlstra, MinChan Kim,
	Lee Schermerhorn

Mlock is only one source of unevictable pages but with the unevictable
lru enabled, mlock code is referenced unconditionally.

Decouple the two so that the unevictable lru can work without mlock
and thus on nommu setups where we still have unevictable pages from
e.g. ramfs.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.com>
Cc: MinChan Kim <minchan.kim@gmail.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
---
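A minimal user-space sketch of the gating this patch is after, not kernel
code: the mlock helper only does real work when both options are set and
otherwise collapses to a stub.  MODEL_UNEVICTABLE_LRU, MODEL_MMU,
struct model_vma and model_is_mlocked_vma() are hypothetical stand-ins for
CONFIG_UNEVICTABLE_LRU, CONFIG_MMU and is_mlocked_vma(); the real helper
does more than test a flag.

```c
#include <assert.h>

/* Assumed configuration for this sketch: unevictable LRU on, no MMU. */
#define MODEL_UNEVICTABLE_LRU 1
/* MODEL_MMU is intentionally left undefined, as on a nommu build. */

/* Hypothetical stand-in for struct vm_area_struct. */
struct model_vma {
	int locked;	/* models VM_LOCKED being set on the vma */
};

#if defined(MODEL_UNEVICTABLE_LRU) && defined(MODEL_MMU)
/* mlock-aware variant: only built when both options are enabled */
static inline int model_is_mlocked_vma(const struct model_vma *vma)
{
	return vma->locked;
}
#else
/* stub variant: without an MMU there is no mlock to consult */
static inline int model_is_mlocked_vma(const struct model_vma *vma)
{
	(void)vma;
	return 0;
}
#endif
```

With MODEL_MMU undefined the stub is selected, so the unevictable LRU can
be built without pulling in any mlock code.
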
 mm/Kconfig    |    1 -
 mm/internal.h |    4 ++--
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/mm/Kconfig b/mm/Kconfig
index a5b7781..fbb190e 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -206,7 +206,6 @@ config VIRT_TO_BUS
 config UNEVICTABLE_LRU
 	bool "Add LRU list to track non-evictable pages"
 	default y
-	depends on MMU
 	help
 	  Keeps unevictable pages off of the active and inactive pageout
 	  lists, so kswapd will not waste CPU time or have its balancing
diff --git a/mm/internal.h b/mm/internal.h
index 478223b..ceaa629 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -90,7 +90,7 @@ static inline void unevictable_migrate_page(struct page *new, struct page *old)
 }
 #endif
 
-#ifdef CONFIG_UNEVICTABLE_LRU
+#if defined(CONFIG_UNEVICTABLE_LRU) && defined(CONFIG_MMU)
 /*
  * Called only in fault path via page_evictable() for a new page
  * to determine if it's being mapped into a LOCKED vma.
@@ -165,7 +165,7 @@ static inline void free_page_mlock(struct page *page)
 	}
 }
 
-#else /* CONFIG_UNEVICTABLE_LRU */
+#else /* CONFIG_UNEVICTABLE_LRU && CONFIG_MMU */
 static inline int is_mlocked_vma(struct vm_area_struct *v, struct page *p)
 {
 	return 0;
-- 
1.6.2.1.135.gde769


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [patch 2/3] ramfs-nommu: use generic lru cache
  2009-03-21 10:20                   ` Johannes Weiner
@ 2009-03-22 20:13                     ` Johannes Weiner
  -1 siblings, 0 replies; 101+ messages in thread
From: Johannes Weiner @ 2009-03-22 20:13 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-mm, David Howells, Nick Piggin,
	KOSAKI Motohiro, Rik van Riel, Peter Zijlstra, MinChan Kim,
	Lee Schermerhorn

Instead of open-coding the lru-list-add pagevec batching when
expanding a file mapping from zero, defer to the appropriate page
cache function that also takes care of adding the page to the lru
list.

This is cleaner, saves code and reduces the stack footprint by 16
words worth of pagevec.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.com>
Cc: MinChan Kim <minchan.kim@gmail.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
---
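A rough user-space model of the simplified error path, under stated
assumptions: model_add() is a hypothetical stand-in for
add_to_page_cache_lru() that fails at a chosen index, and the page states
are invented for illustration.  It shows that every page from the failing
index onwards is freed, while pages already attached stay with the
mapping.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical page states for this model; not kernel definitions. */
enum page_state { PAGE_ALLOCATED, PAGE_IN_CACHE, PAGE_FREED };

/* Stand-in for add_to_page_cache_lru(): fails when index == fail_at. */
static int model_add(enum page_state *page, size_t index, size_t fail_at)
{
	if (index == fail_at)
		return -1;
	*page = PAGE_IN_CACHE;
	return 0;
}

/* Mirrors the loop and the add_error unwinding in the patch. */
static int model_expand(enum page_state *pages, size_t npages, size_t fail_at)
{
	size_t loop;
	int ret = 0;

	for (loop = 0; loop < npages; loop++) {
		ret = model_add(&pages[loop], loop, fail_at);
		if (ret < 0)
			goto add_error;
	}
	return 0;

add_error:
	/* free the page that failed and everything after it */
	while (loop < npages)
		pages[loop++] = PAGE_FREED;
	return ret;
}
```

For example, with four pages and a failure at index 2, pages 0 and 1
remain in the cache and pages 2 and 3 are freed.
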
 fs/ramfs/file-nommu.c |   15 ++++-----------
 1 files changed, 4 insertions(+), 11 deletions(-)

diff --git a/fs/ramfs/file-nommu.c b/fs/ramfs/file-nommu.c
index 5d7c7ec..351192a 100644
--- a/fs/ramfs/file-nommu.c
+++ b/fs/ramfs/file-nommu.c
@@ -60,7 +60,6 @@ const struct inode_operations ramfs_file_inode_operations = {
  */
 int ramfs_nommu_expand_for_mapping(struct inode *inode, size_t newsize)
 {
-	struct pagevec lru_pvec;
 	unsigned long npages, xpages, loop, limit;
 	struct page *pages;
 	unsigned order;
@@ -103,24 +102,20 @@ int ramfs_nommu_expand_for_mapping(struct inode *inode, size_t newsize)
 	memset(data, 0, newsize);
 
 	/* attach all the pages to the inode's address space */
-	pagevec_init(&lru_pvec, 0);
 	for (loop = 0; loop < npages; loop++) {
 		struct page *page = pages + loop;
 
-		ret = add_to_page_cache(page, inode->i_mapping, loop, GFP_KERNEL);
+		ret = add_to_page_cache_lru(page, inode->i_mapping, loop,
+					GFP_KERNEL);
 		if (ret < 0)
 			goto add_error;
 
-		if (!pagevec_add(&lru_pvec, page))
-			__pagevec_lru_add_file(&lru_pvec);
-
 		/* prevent the page from being discarded on memory pressure */
 		SetPageDirty(page);
 
 		unlock_page(page);
 	}
 
-	pagevec_lru_add_file(&lru_pvec);
 	return 0;
 
  fsize_exceeded:
@@ -129,10 +124,8 @@ int ramfs_nommu_expand_for_mapping(struct inode *inode, size_t newsize)
 	return -EFBIG;
 
  add_error:
-	pagevec_lru_add_file(&lru_pvec);
-	page_cache_release(pages + loop);
-	for (loop++; loop < npages; loop++)
-		__free_page(pages + loop);
+	while (loop < npages)
+		__free_page(pages + loop++);
 	return ret;
 }
 
-- 
1.6.2.1.135.gde769


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* [patch 3/3] mm: keep pages from unevictable mappings off the LRU lists
  2009-03-21 10:20                   ` Johannes Weiner
@ 2009-03-22 20:13                     ` Johannes Weiner
  -1 siblings, 0 replies; 101+ messages in thread
From: Johannes Weiner @ 2009-03-22 20:13 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-mm, David Howells, Nick Piggin,
	KOSAKI Motohiro, Rik van Riel, Peter Zijlstra, MinChan Kim,
	Lee Schermerhorn

Check if the mapping is evictable when initially adding page cache
pages to the LRU lists.  If that is not the case, add them to the
unevictable list immediately instead of leaving it up to the reclaim
code to move them there.

This is useful for ramfs and locked shmem which mark whole mappings as
unevictable and we know at fault time already that it is useless to
try reclaiming these pages.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.com>
Cc: MinChan Kim <minchan.kim@gmail.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
---
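The resulting decision order can be sketched in user space as follows.
This is only a model of the three-way branch in the hunk below:
struct model_mapping, struct model_page, enum lru_target and pick_lru()
are hypothetical names, standing in for mapping_unevictable(),
page_is_file_cache() and the three list-add calls.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical list identifiers; not the kernel's enum lru_list. */
enum lru_target { TARGET_UNEVICTABLE, TARGET_FILE, TARGET_ACTIVE_ANON };

/* Models a mapping with the unevictable flag set or clear. */
struct model_mapping {
	bool unevictable;
};

/* Models whether a page counts as file cache. */
struct model_page {
	bool file_cache;
};

/* The mapping check comes first, so it overrides the file/anon split. */
static enum lru_target pick_lru(const struct model_mapping *mapping,
				const struct model_page *page)
{
	if (mapping->unevictable)
		return TARGET_UNEVICTABLE;
	if (page->file_cache)
		return TARGET_FILE;
	return TARGET_ACTIVE_ANON;
}
```

A page of an unevictable mapping goes to the unevictable list whether or
not it counts as file cache; all other pages are placed as before.
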
 mm/filemap.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 23acefe..8574530 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -506,7 +506,9 @@ int add_to_page_cache_lru(struct page *page, struct address_space *mapping,
 
 	ret = add_to_page_cache(page, mapping, offset, gfp_mask);
 	if (ret == 0) {
-		if (page_is_file_cache(page))
+		if (mapping_unevictable(mapping))
+			add_page_to_unevictable_list(page);
+		else if (page_is_file_cache(page))
 			lru_cache_add_file(page);
 		else
 			lru_cache_add_active_anon(page);
-- 
1.6.2.1.135.gde769


^ permalink raw reply related	[flat|nested] 101+ messages in thread

* Re: [patch 1/3] mm: decouple unevictable lru from mmu
  2009-03-22 20:13                     ` Johannes Weiner
@ 2009-03-22 23:46                       ` KOSAKI Motohiro
  -1 siblings, 0 replies; 101+ messages in thread
From: KOSAKI Motohiro @ 2009-03-22 23:46 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: kosaki.motohiro, Andrew Morton, linux-kernel, linux-mm,
	David Howells, Nick Piggin, Rik van Riel, Peter Zijlstra,
	MinChan Kim, Lee Schermerhorn

> @@ -206,7 +206,6 @@ config VIRT_TO_BUS
>  config UNEVICTABLE_LRU
>  	bool "Add LRU list to track non-evictable pages"
>  	default y
> -	depends on MMU
>  	help
>  	  Keeps unevictable pages off of the active and inactive pageout
>  	  lists, so kswapd will not waste CPU time or have its balancing
> diff --git a/mm/internal.h b/mm/internal.h
> index 478223b..ceaa629 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h

David already made this portion and it is already merged in mmotm.
Don't you work on mmotm?





^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [patch 1/3] mm: decouple unevictable lru from mmu
  2009-03-22 23:46                       ` KOSAKI Motohiro
@ 2009-03-23  0:14                         ` Johannes Weiner
  -1 siblings, 0 replies; 101+ messages in thread
From: Johannes Weiner @ 2009-03-23  0:14 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Andrew Morton, linux-kernel, linux-mm, David Howells,
	Nick Piggin, Rik van Riel, Peter Zijlstra, MinChan Kim,
	Lee Schermerhorn

On Mon, Mar 23, 2009 at 08:46:06AM +0900, KOSAKI Motohiro wrote:
> > @@ -206,7 +206,6 @@ config VIRT_TO_BUS
> >  config UNEVICTABLE_LRU
> >  	bool "Add LRU list to track non-evictable pages"
> >  	default y
> > -	depends on MMU
> >  	help
> >  	  Keeps unevictable pages off of the active and inactive pageout
> >  	  lists, so kswapd will not waste CPU time or have its balancing
> > diff --git a/mm/internal.h b/mm/internal.h
> > index 478223b..ceaa629 100644
> > --- a/mm/internal.h
> > +++ b/mm/internal.h
> 
> David already made this portion and it is already merged in mmotm.
> Don't you work on mmotm?

Ah, stupid me.  I was even on the Cc for David's patches.  I missed
them, sorry.

David, why do we need two Kconfig symbols for mlock and the mlock page
bit?  Don't we always provide mlock on mmu and never on nommu?
Anyway, that is just out of curiosity.  Good that the change is
already done, so please ignore this patch.

	Hannes

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [patch 3/3] mm: keep pages from unevictable mappings off the LRU  lists
  2009-03-22 20:13                     ` Johannes Weiner
@ 2009-03-23  0:44                       ` Minchan Kim
  -1 siblings, 0 replies; 101+ messages in thread
From: Minchan Kim @ 2009-03-23  0:44 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, linux-kernel, linux-mm, David Howells,
	Nick Piggin, KOSAKI Motohiro, Rik van Riel, Peter Zijlstra,
	Lee Schermerhorn

Hmm,

This patch is a different matter from the earlier patches in the series.
At first glance, it looked good to me.

I think add_to_page_cache_lru() has to be a fast path.
But how often would the ramfs and shmem functions be called?

I am concerned that this patch adds another burden to that path,
so we need some numbers to weigh the pros and cons.

Any thoughts?

On Mon, Mar 23, 2009 at 5:13 AM, Johannes Weiner <hannes@cmpxchg.org> wrote:
> Check if the mapping is evictable when initially adding page cache
> pages to the LRU lists.  If that is not the case, add them to the
> unevictable list immediately instead of leaving it up to the reclaim
> code to move them there.
>
> This is useful for ramfs and locked shmem which mark whole mappings as
> unevictable and we know at fault time already that it is useless to
> try reclaiming these pages.
>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> Cc: David Howells <dhowells@redhat.com>
> Cc: Nick Piggin <npiggin@suse.de>
> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Cc: Rik van Riel <riel@redhat.com>
> Cc: Peter Zijlstra <peterz@infradead.com>
> Cc: MinChan Kim <minchan.kim@gmail.com>
> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
> ---
>  mm/filemap.c |    4 +++-
>  1 files changed, 3 insertions(+), 1 deletions(-)
>
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 23acefe..8574530 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -506,7 +506,9 @@ int add_to_page_cache_lru(struct page *page, struct address_space *mapping,
>
>        ret = add_to_page_cache(page, mapping, offset, gfp_mask);
>        if (ret == 0) {
> -               if (page_is_file_cache(page))
> +               if (mapping_unevictable(mapping))
> +                       add_page_to_unevictable_list(page);
> +               else if (page_is_file_cache(page))
>                        lru_cache_add_file(page);
>                else
>                        lru_cache_add_active_anon(page);
> --
> 1.6.2.1.135.gde769



-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [patch 3/3] mm: keep pages from unevictable mappings off the LRU  lists
  2009-03-23  0:44                       ` Minchan Kim
@ 2009-03-23  2:21                         ` KOSAKI Motohiro
  -1 siblings, 0 replies; 101+ messages in thread
From: KOSAKI Motohiro @ 2009-03-23  2:21 UTC (permalink / raw)
  To: Minchan Kim
  Cc: kosaki.motohiro, Johannes Weiner, Andrew Morton, linux-kernel,
	linux-mm, David Howells, Nick Piggin, Rik van Riel,
	Peter Zijlstra, Lee Schermerhorn

> Hmm,
> 
> This patch is a different matter from the earlier patches in the series.
> At first glance, it looked good to me.
> 
> I think add_to_page_cache_lru() has to be a fast path.
> But how often would the ramfs and shmem functions be called?
> 
> I am concerned that this patch adds another burden to that path,
> so we need some numbers to weigh the pros and cons.
> 
> Any thoughts?

This is exactly the reason why the current code doesn't call
add_page_to_unevictable_list() here.  add_page_to_unevictable_list()
doesn't use a pagevec; that is needed to avoid a race.

So if the readahead path (i.e. add_to_page_cache_lru()) uses
add_page_to_unevictable_list(), it can cause a zone->lru_lock
contention storm.

So unless somebody has good performance results, I won't ack this patch.
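
The batching concern can be made concrete with a rough user-space model,
not kernel code.  It counts how often a lock standing in for
zone->lru_lock would be taken when pages are added one at a time versus
in pagevec-style batches.  struct model_lru, add_unbatched() and
add_batched() are hypothetical names, and the batch size is a parameter
chosen by the caller for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of an LRU list guarded by a single lock. */
struct model_lru {
	size_t lock_acquisitions;	/* times the lock was taken */
	size_t pages_added;		/* pages placed on the list */
};

/* One lock round trip per page, like a direct per-page list add. */
static void add_unbatched(struct model_lru *lru, size_t npages)
{
	size_t i;

	for (i = 0; i < npages; i++) {
		lru->lock_acquisitions++;
		lru->pages_added++;
	}
}

/* Collect up to 'batch' pages, then take the lock once per batch. */
static void add_batched(struct model_lru *lru, size_t npages, size_t batch)
{
	size_t pending = 0;
	size_t i;

	for (i = 0; i < npages; i++) {
		if (++pending == batch) {
			lru->lock_acquisitions++;
			lru->pages_added += pending;
			pending = 0;
		}
	}
	/* drain whatever is left in the partial batch */
	if (pending) {
		lru->lock_acquisitions++;
		lru->pages_added += pending;
	}
}
```

For 32 pages and an assumed batch of 14, the batched path takes the lock
3 times and the unbatched path 32 times.  The model says nothing about
how much of that difference shows up in practice, which is what the
requested numbers would have to settle.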


> On Mon, Mar 23, 2009 at 5:13 AM, Johannes Weiner <hannes@cmpxchg.org> wrote:
> > Check if the mapping is evictable when initially adding page cache
> > pages to the LRU lists.  If that is not the case, add them to the
> > unevictable list immediately instead of leaving it up to the reclaim
> > code to move them there.
> >
> > This is useful for ramfs and locked shmem which mark whole mappings as
> > unevictable and we know at fault time already that it is useless to
> > try reclaiming these pages.



^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [patch 2/3] ramfs-nommu: use generic lru cache
  2009-03-22 20:13                     ` Johannes Weiner
@ 2009-03-23  2:22                       ` KOSAKI Motohiro
  -1 siblings, 0 replies; 101+ messages in thread
From: KOSAKI Motohiro @ 2009-03-23  2:22 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: kosaki.motohiro, Andrew Morton, linux-kernel, linux-mm,
	David Howells, Nick Piggin, Rik van Riel, Peter Zijlstra,
	MinChan Kim, Lee Schermerhorn

> Instead of open-coding the lru-list-add pagevec batching when
> expanding a file mapping from zero, defer to the appropriate page
> cache function that also takes care of adding the page to the lru
> list.
> 
> This is cleaner, saves code and reduces the stack footprint by 16
> words worth of pagevec.

Looks good to me.  Thanks, good patch.



^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [patch 3/3] mm: keep pages from unevictable mappings off the LRU  lists
  2009-03-23  2:21                         ` KOSAKI Motohiro
@ 2009-03-23  8:42                           ` Johannes Weiner
  -1 siblings, 0 replies; 101+ messages in thread
From: Johannes Weiner @ 2009-03-23  8:42 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Minchan Kim, Andrew Morton, linux-kernel, linux-mm,
	David Howells, Nick Piggin, Rik van Riel, Peter Zijlstra,
	Lee Schermerhorn

On Mon, Mar 23, 2009 at 11:21:36AM +0900, KOSAKI Motohiro wrote:
> > Hmm,
> > 
> > This patch is a different matter from the earlier patches in the series.
> > At first glance, it looked good to me.
> > 
> > I think add_to_page_cache_lru() has to be a fast path.
> > But how often would the ramfs and shmem functions be called?
> > 
> > I am concerned that this patch adds another burden to that path,
> > so we need some numbers to weigh the pros and cons.
> > 
> > Any thoughts?
> 
> This is exactly the reason why the current code doesn't call
> add_page_to_unevictable_list() here.  add_page_to_unevictable_list()
> doesn't use a pagevec; that is needed to avoid a race.
> 
> So if the readahead path (i.e. add_to_page_cache_lru()) uses
> add_page_to_unevictable_list(), it can cause a zone->lru_lock
> contention storm.

How is it different from shrink_page_list()?  If readahead puts a
contiguous chunk of unevictable pages on the file lru, then
shrink_page_list() will call add_page_to_unevictable_list() in a loop
as well.

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [patch 3/3] mm: keep pages from unevictable mappings off the LRU  lists
  2009-03-23  8:42                           ` Johannes Weiner
@ 2009-03-23  9:01                             ` KOSAKI Motohiro
  -1 siblings, 0 replies; 101+ messages in thread
From: KOSAKI Motohiro @ 2009-03-23  9:01 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: kosaki.motohiro, Minchan Kim, Andrew Morton, linux-kernel,
	linux-mm, David Howells, Nick Piggin, Rik van Riel,
	Peter Zijlstra, Lee Schermerhorn

> On Mon, Mar 23, 2009 at 11:21:36AM +0900, KOSAKI Motohiro wrote:
> > > Hmm,,
> > > 
> > > This patch is another thing unlike previous series patches.
> > > Firstly, It looked good to me.
> > > 
> > > I think add_to_page_cache_lru have to become a fast path.
> > > But, how often would ramfs and shmem function be called ?
> > > 
> > > I have a concern for this patch to add another burden.
> > > so, we need any numbers for getting pros and cons.
> > > 
> > > Any thoughts ?
> > 
> > this is the just reason why current code don't call add_page_to_unevictable_list().
> > add_page_to_unevictable_list() don't use pagevec. it is needed for avoiding race.
> > 
> > then, if readahead path (i.e. add_to_page_cache_lru()) use add_page_to_unevictable_list(),
> > it can cause zone->lru_lock contention storm.
> 
> How is it different then shrink_page_list()?  If readahead put a
> contiguous chunk of unevictable pages to the file lru, then
> shrink_page_list() will as well call add_page_to_unevictable_list() in
> a loop.

It's a matter of probability.

readahead: we need to worry about
	(1) readahead vs readahead
	(2) readahead vs reclaim

vmscan: we need to worry about
	(3) background reclaim vs foreground reclaim

So, (3) is a rarer event than (1) and (2).
Am I missing anything?
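
To put rough numbers on the batching side of this: a minimal userspace
sketch, not kernel code, that counts how often a single lock standing in
for zone->lru_lock would be taken when pages are added one at a time
versus in pagevec-sized batches.  The value 14 for PAGEVEC_SIZE and both
function names are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed batch size; not taken from the patch under discussion. */
#define PAGEVEC_SIZE 14

/* Unbatched path: one lock round trip per page. */
static size_t lock_acquisitions_unbatched(size_t nr_pages)
{
	return nr_pages;
}

/* Batched path: the lock is taken once per full or partial batch. */
static size_t lock_acquisitions_batched(size_t nr_pages, size_t batch)
{
	assert(batch > 0);
	return (nr_pages + batch - 1) / batch;
}
```

For a 32-page chunk this gives 32 acquisitions unbatched against 3 with
a batch of 14.  It only shows the per-caller cost; it says nothing about
how often two callers actually collide on the lock.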




^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [patch 3/3] mm: keep pages from unevictable mappings off the LRU  lists
  2009-03-23  9:01                             ` KOSAKI Motohiro
@ 2009-03-23  9:23                               ` KOSAKI Motohiro
  -1 siblings, 0 replies; 101+ messages in thread
From: KOSAKI Motohiro @ 2009-03-23  9:23 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: kosaki.motohiro, Johannes Weiner, Minchan Kim, Andrew Morton,
	linux-kernel, linux-mm, David Howells, Nick Piggin, Rik van Riel,
	Peter Zijlstra, Lee Schermerhorn

> > > this is the just reason why current code don't call add_page_to_unevictable_list().
> > > add_page_to_unevictable_list() don't use pagevec. it is needed for avoiding race.
> > > 
> > > then, if readahead path (i.e. add_to_page_cache_lru()) use add_page_to_unevictable_list(),
> > > it can cause zone->lru_lock contention storm.
> > 
> > How is it different then shrink_page_list()?  If readahead put a
> > contiguous chunk of unevictable pages to the file lru, then
> > shrink_page_list() will as well call add_page_to_unevictable_list() in
> > a loop.
> 
> it's probability issue.
> 
> readahead: we need to concern
> 	(1) readahead vs readahead
> 	(2) readahead vs reclaim
> 
> vmscan: we need to concern
> 	(3) background reclaim vs foreground reclaim
> 
> So, (3) is rarely event than (1) and (2).
> Am I missing anything?

The explanation in my last mail was too poor, sorry.
I don't dislike the concept of this patch, but it seems a bit naive about contention.
If we can decrease the contention risk, I can ack it with pleasure.




^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [patch 2/3] ramfs-nommu: use generic lru cache
  2009-03-21 10:20                   ` Johannes Weiner
@ 2009-03-23 10:40                     ` David Howells
  -1 siblings, 0 replies; 101+ messages in thread
From: David Howells @ 2009-03-23 10:40 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: dhowells, Andrew Morton, linux-kernel, linux-mm, Nick Piggin,
	KOSAKI Motohiro, Rik van Riel, Peter Zijlstra, MinChan Kim,
	Lee Schermerhorn

Johannes Weiner <hannes@cmpxchg.org> wrote:

> Instead of open-coding the lru-list-add pagevec batching when
> expanding a file mapping from zero, defer to the appropriate page
> cache function that also takes care of adding the page to the lru
> list.
> 
> This is cleaner, saves code and reduces the stack footprint by 16
> words worth of pagevec.

Acked-by: David Howells <dhowells@redhat.com>
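
For reference, the "16 words" figure can be checked with a small
userspace sketch.  The struct layout below is an assumption modelled on
a pagevec with two bookkeeping fields plus PAGEVEC_SIZE page pointers,
with PAGEVEC_SIZE taken to be 14; it is not copied from the patch.

```c
#include <assert.h>
#include <stddef.h>

#define PAGEVEC_SIZE 14		/* assumed value, see above */

struct page;			/* opaque here; only pointers are stored */

/* Assumed layout: two bookkeeping words followed by the page pointers. */
struct pagevec {
	unsigned long nr;
	unsigned long cold;
	struct page *pages[PAGEVEC_SIZE];
};

/* Stack footprint of one pagevec, in pointer-sized words. */
static size_t pagevec_words(void)
{
	/* The count below assumes a long is as wide as a pointer. */
	assert(sizeof(unsigned long) == sizeof(void *));
	return sizeof(struct pagevec) / sizeof(void *);
}
```

Under those assumptions pagevec_words() comes out as 2 + 14 = 16, which
is what an on-stack pagevec in the expand path would cost.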

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [patch 1/3] mm: decouple unevictable lru from mmu
  2009-03-22 23:46                       ` KOSAKI Motohiro
@ 2009-03-23 10:48                         ` David Howells
  -1 siblings, 0 replies; 101+ messages in thread
From: David Howells @ 2009-03-23 10:48 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: dhowells, KOSAKI Motohiro, Andrew Morton, linux-kernel, linux-mm,
	Nick Piggin, Rik van Riel, Peter Zijlstra, MinChan Kim,
	Lee Schermerhorn

Johannes Weiner <hannes@cmpxchg.org> wrote:

> David, why do we need two Kconfig symbols for mlock and the mlock page
> bit?  Don't we always provide mlock on mmu and never on nommu?

Because whilst PG_mlocked doesn't exist if we don't have mlock() because
we're in NOMMU mode, that does not imply that it _does_ exist if we _do_ have
mlock(), as it's also contingent on having the unevictable LRU.

Not only that, CONFIG_HAVE_MLOCK is used in mm/internal.h to switch some stuff
out based on whether we have mlock() available or not - which is not the same
as whether we have PG_mlocked or not.

Mainly I thought it made the train of logic easier.

Note that neither symbol is actually manually adjustable.
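
A minimal sketch of that dependency chain, written as plain C predicates
rather than Kconfig.  CONFIG_HAVE_MLOCK is the symbol named above; the
page-bit predicate and both function names are assumptions made for
illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* mlock() is only provided when there is an MMU. */
static bool have_mlock(bool mmu)
{
	return mmu;
}

/* PG_mlocked additionally needs the unevictable LRU to be built in. */
static bool have_mlocked_page_bit(bool mmu, bool unevictable_lru)
{
	bool bit = have_mlock(mmu) && unevictable_lru;

	/* The page bit never exists without mlock() itself. */
	assert(!bit || have_mlock(mmu));
	return bit;
}
```

The case with an MMU but no unevictable LRU is the one where the two
answers differ: mlock() is available but PG_mlocked is not.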

David

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [patch 3/3] mm: keep pages from unevictable mappings off the LRU lists
  2009-03-21 10:20                   ` Johannes Weiner
@ 2009-03-23 10:53                     ` David Howells
  -1 siblings, 0 replies; 101+ messages in thread
From: David Howells @ 2009-03-23 10:53 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: dhowells, Andrew Morton, linux-kernel, linux-mm, Nick Piggin,
	KOSAKI Motohiro, Rik van Riel, Peter Zijlstra, MinChan Kim,
	Lee Schermerhorn

Johannes Weiner <hannes@cmpxchg.org> wrote:

> -		if (page_is_file_cache(page))
> +		if (mapping_unevictable(mapping))
> +			add_page_to_unevictable_list(page);
> +		else if (page_is_file_cache(page))

It would be nice to avoid adding an extra test and branch in here.  This
function is used a lot, and quite often we know the answer to the first test
before we even get here.
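
The shape of the check under discussion, as a self-contained sketch.
The struct and enum definitions are stand-ins invented for illustration
(including the anon fallback); only the order of the two tests mirrors
the quoted hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum lru_target { LRU_UNEVICTABLE, LRU_FILE, LRU_ANON };

/* Hypothetical stand-ins for struct address_space and struct page. */
struct mapping { bool unevictable; };
struct page { bool file_cache; };

/* Pick the list a freshly cached page goes to, in the patched order. */
static enum lru_target pick_lru(const struct mapping *mapping,
				const struct page *page)
{
	assert(page != NULL);
	if (mapping && mapping->unevictable)	/* the added test and branch */
		return LRU_UNEVICTABLE;
	else if (page->file_cache)
		return LRU_FILE;
	return LRU_ANON;
}
```

Every caller pays for the first comparison, including callers that
already know their mapping is evictable, which is the cost being
questioned here.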

David

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [PATCH 0/2] Make the Unevictable LRU available on NOMMU
  2009-03-21 10:20                   ` Johannes Weiner
@ 2009-03-23 20:07                     ` Lee Schermerhorn
  -1 siblings, 0 replies; 101+ messages in thread
From: Lee Schermerhorn @ 2009-03-23 20:07 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: David Howells, Minchan Kim, Andrew Morton, kosaki.motohiro,
	torvalds, peterz, nrik.Berkhan, uclinux-dev, linux-kernel,
	linux-mm, riel

On Sat, 2009-03-21 at 11:20 +0100, Johannes Weiner wrote:
> On Fri, Mar 20, 2009 at 02:30:15PM -0400, Lee Schermerhorn wrote:
> > On Fri, 2009-03-20 at 16:24 +0000, David Howells wrote:
> > > Lee Schermerhorn <Lee.Schermerhorn@hp.com> wrote:
> > > 
> > > > I just want to point out [again :)] that removing the ramfs pages from
> > > > the lru will prevent them from being migrated
> > > 
> > > This is less of an issue for NOMMU kernels, since you can't migrate pages that
> > > are mapped.
> > 
> > 
> > Agreed.  So, you could eliminate them [ramfs pages] from the lru for
> > just the nommu kernels, if you wanted to go that route.
> 
> These pages don't come with much overhead anymore when they sit on the
> unevictable list, right?  So I don't see much point in special casing
> them all over the place.

I agree:  not much overhead; no NEED to special case.  I was only
agreeing with David, that it would be OK to keep them off the LRU for
NOMMU kernels.  
> 
> I have a patchset that decouples the unevictable lru feature from
> mlock, enables the latter on nommu and then makes sure ramfs pages go
> immediately to the unevictable list so they don't need the scanner to
> move them.  This is just wiring up of features we already have.

Yeah.  I didn't do it that way, because I didn't see any benefit in
doing that for ram disk pages.  If one doesn't run vmscan, having the
pages on the normal lru doesn't hurt.  If you do need to run vmscan,
moving them to the unevictable list from there seems the least of your
problems :).  And doing it in the pagevec flush function adds overhead to
the fault path.  Granted, it's amortized over PAGEVEC_SIZE pages.  It would
probably be worth measuring the performance cost, and any code size
increase: NOMMU kernel users might care about that.

> 
> I will sent this mondayish, need to test it more especially on a NOMMU
> setup.

Saw them.  Will take a look...

Lee


^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [patch 3/3] mm: keep pages from unevictable mappings off the LRU lists
  2009-03-23 10:53                     ` David Howells
@ 2009-03-26  0:01                       ` Johannes Weiner
  -1 siblings, 0 replies; 101+ messages in thread
From: Johannes Weiner @ 2009-03-26  0:01 UTC (permalink / raw)
  To: David Howells
  Cc: Andrew Morton, linux-kernel, linux-mm, Nick Piggin,
	KOSAKI Motohiro, Rik van Riel, Peter Zijlstra, MinChan Kim,
	Lee Schermerhorn

On Mon, Mar 23, 2009 at 10:53:27AM +0000, David Howells wrote:
> Johannes Weiner <hannes@cmpxchg.org> wrote:
> 
> > -		if (page_is_file_cache(page))
> > +		if (mapping_unevictable(mapping))
> > +			add_page_to_unevictable_list(page);
> > +		else if (page_is_file_cache(page))
> 
> It would be nice to avoid adding an extra test and branch in here.  This
> function is used a lot, and quite often we know the answer to the first test
> before we even get here.

Yes, I thought about that too.  So I mounted a tmpfs and dd'd
/dev/zero to a file on it until it ran out of space (around 900M,
without swapping), then deleted the file again.  I did this in a tight
loop and profiled it.

I couldn't think of a way that would exercise add_to_page_cache_lru()
more.  I hope I didn't overlook anything; please correct me if I am wrong.

If I am not, then the extra check for unevictable mappings doesn't
make a measurable difference.  The function on the vanilla kernel had
a share of 0.2033%, on the patched kernel 0.1953%.
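
The fill-and-delete step could look roughly like this in C instead of
dd.  The function name, the 4 KiB chunk size and the max_bytes cap are
assumptions added so the sketch can be tried safely; the test described
above simply ran until the tmpfs was full.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Write zeroes to path until the filesystem is full or max_bytes is
 * reached, then delete the file.  Returns the number of bytes written,
 * or -1 if the file could not be created.
 */
static long long fill_and_unlink(const char *path, long long max_bytes)
{
	char buf[4096];
	long long total = 0;
	int fd;

	assert(path != NULL && max_bytes >= 0);
	memset(buf, 0, sizeof(buf));

	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;

	while (total < max_bytes) {
		size_t want = sizeof(buf);
		ssize_t n;

		if ((long long)want > max_bytes - total)
			want = (size_t)(max_bytes - total);
		n = write(fd, buf, want);
		if (n <= 0)	/* ENOSPC or another error: stop filling */
			break;
		total += n;
	}

	close(fd);
	unlink(path);
	return total;
}
```

Calling this in a loop on a path inside the tmpfs mount, with the cap
set above the mount size, approximates the tight loop that was profiled.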

	Hannes

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [patch 3/3] mm: keep pages from unevictable mappings off the LRU  lists
  2009-03-23  9:23                               ` KOSAKI Motohiro
@ 2009-03-26  0:48                                 ` Johannes Weiner
  -1 siblings, 0 replies; 101+ messages in thread
From: Johannes Weiner @ 2009-03-26  0:48 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Minchan Kim, Andrew Morton, linux-kernel, linux-mm,
	David Howells, Nick Piggin, Rik van Riel, Peter Zijlstra,
	Lee Schermerhorn

On Mon, Mar 23, 2009 at 06:23:36PM +0900, KOSAKI Motohiro wrote:
> > > > this is the just reason why current code don't call add_page_to_unevictable_list().
> > > > add_page_to_unevictable_list() don't use pagevec. it is needed for avoiding race.
> > > > 
> > > > then, if readahead path (i.e. add_to_page_cache_lru()) use add_page_to_unevictable_list(),
> > > > it can cause zone->lru_lock contention storm.
> > > 
> > > How is it different then shrink_page_list()?  If readahead put a
> > > contiguous chunk of unevictable pages to the file lru, then
> > > shrink_page_list() will as well call add_page_to_unevictable_list() in
> > > a loop.
> > 
> > it's probability issue.
> > 
> > readahead: we need to concern
> > 	(1) readahead vs readahead
> > 	(2) readahead vs reclaim
> > 
> > vmscan: we need to concern
> > 	(3) background reclaim vs foreground reclaim
> > 
> > So, (3) is rarely event than (1) and (2).
> > Am I missing anything?
> 
> my last mail explanation is too poor. sorry.
> I don't dislike this patch concept. but it seems a bit naive against contention.
> if we can decrease contention risk, I can ack with presure.

My understanding is that when the mapping is truncated before the
pages are scanned for reclaim, we get a net increase in the risk of
the contention storm you describe.

Otherwise, we have only moved the contention from the reclaim path to
the fault path.

I don't know how likely readahead is.  It only happens when the
mapping was blown up with truncate; otherwise only writes add to the
cache in the ramfs case.

I will further look into this.

	Hannes

^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [patch 3/3] mm: keep pages from unevictable mappings off the LRU lists
  2009-03-26  0:01                       ` Johannes Weiner
@ 2009-03-26  8:56                         ` KOSAKI Motohiro
  -1 siblings, 0 replies; 101+ messages in thread
From: KOSAKI Motohiro @ 2009-03-26  8:56 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: kosaki.motohiro, David Howells, Andrew Morton, linux-kernel,
	linux-mm, Nick Piggin, Rik van Riel, Peter Zijlstra, MinChan Kim,
	Lee Schermerhorn

> On Mon, Mar 23, 2009 at 10:53:27AM +0000, David Howells wrote:
> > Johannes Weiner <hannes@cmpxchg.org> wrote:
> > 
> > > -		if (page_is_file_cache(page))
> > > +		if (mapping_unevictable(mapping))
> > > +			add_page_to_unevictable_list(page);
> > > +		else if (page_is_file_cache(page))
> > 
> > It would be nice to avoid adding an extra test and branch in here.  This
> > function is used a lot, and quite often we know the answer to the first test
> > before we even get here.
> 
> Yes, I thought about that too.  So I mounted a tmpfs and dd'd
> /dev/zero to a file on it until it ran out of space (around 900M,
> without swapping), deleted the file again.  I did this in a tight loop
> and profiled it.
> 
> I couldn't think of a way that would excercise add_to_page_cache_lru()
> more, I hope I didn't overlook anything, please correct if I am wrong.
> 
> If I was not, than the extra checking for unevictable mappings doesn't
> make a measurable difference.  The function on the vanilla kernel had
> a share of 0.2033%, on the patched kernel 0.1953%.

May I ask how many CPUs your test box has?
In general, the likelihood of lock contention depends on the number of CPUs.

That is why Lee and I were mainly talking about large boxes.




^ permalink raw reply	[flat|nested] 101+ messages in thread

* Re: [patch 3/3] mm: keep pages from unevictable mappings off the LRU lists
  2009-03-26  8:56                         ` KOSAKI Motohiro
@ 2009-03-26 10:36                           ` Johannes Weiner
  -1 siblings, 0 replies; 101+ messages in thread
From: Johannes Weiner @ 2009-03-26 10:36 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: David Howells, Andrew Morton, linux-kernel, linux-mm,
	Nick Piggin, Rik van Riel, Peter Zijlstra, MinChan Kim,
	Lee Schermerhorn

On Thu, Mar 26, 2009 at 05:56:52PM +0900, KOSAKI Motohiro wrote:
> > On Mon, Mar 23, 2009 at 10:53:27AM +0000, David Howells wrote:
> > > Johannes Weiner <hannes@cmpxchg.org> wrote:
> > > 
> > > > -		if (page_is_file_cache(page))
> > > > +		if (mapping_unevictable(mapping))
> > > > +			add_page_to_unevictable_list(page);
> > > > +		else if (page_is_file_cache(page))
> > > 
> > > It would be nice to avoid adding an extra test and branch in here.  This
> > > function is used a lot, and quite often we know the answer to the first test
> > > before we even get here.
> > 
> > Yes, I thought about that too.  So I mounted a tmpfs and dd'd
> > /dev/zero to a file on it until it ran out of space (around 900M,
> > without swapping), deleted the file again.  I did this in a tight loop
> > and profiled it.
> > 
> > I couldn't think of a way that would exercise add_to_page_cache_lru()
> > more, I hope I didn't overlook anything, please correct if I am wrong.
> > 
> > If I was not, then the extra checking for unevictable mappings doesn't
> > make a measurable difference.  The function on the vanilla kernel had
> > a share of 0.2033%, on the patched kernel 0.1953%.
> 
> May I ask how many CPUs your test box has?
> In general, the likelihood of lock contention depends on the number of CPUs.

Yes, sure.  In this test, though, I was trying to find out how much
difference the extra branch makes on the common (untaken) path.
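
For reference, the order of tests in the quoted hunk can be modelled in
user space roughly as below.  This is only a sketch under stated
assumptions, not the kernel code: struct mapping_model, struct page_model,
enum lru_target and pick_lru_target() are hypothetical names made up for
illustration, and the two boolean fields merely stand in for the results
of mapping_unevictable() and page_is_file_cache().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; only what the branch looks at is modelled. */
struct mapping_model {
	bool unevictable;	/* stands in for mapping_unevictable(mapping) */
};

struct page_model {
	bool file_cache;	/* stands in for page_is_file_cache(page) */
};

enum lru_target {
	LRU_TARGET_UNEVICTABLE,	/* the newly added case */
	LRU_TARGET_FILE,	/* the pre-existing file cache case */
	LRU_TARGET_OTHER,	/* whatever the existing else branch does */
};

/*
 * Same test order as the quoted hunk: the unevictable check comes first,
 * so every caller pays for one extra test even when it is not taken.
 */
enum lru_target pick_lru_target(const struct mapping_model *mapping,
				const struct page_model *page)
{
	assert(mapping != NULL && page != NULL);

	if (mapping->unevictable)
		return LRU_TARGET_UNEVICTABLE;
	else if (page->file_cache)
		return LRU_TARGET_FILE;
	return LRU_TARGET_OTHER;
}
```

In the common case the first test is false and falls through to the
existing check, which is the path the profile above was meant to cover.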

I have not tried to instrument the lock contention yet.  When I do, it
will be on a quad-core system.

> So Lee and I were mainly talking about large boxes.

Yeah, I don't have such a thing ;)

	Hannes
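
The fill-and-delete workload quoted above (dd from /dev/zero into a file
on tmpfs until it runs out of space, then delete the file) could be
reproduced with something like the following.  It is a minimal sketch,
not the script that was actually used: fill_and_delete() and its limit
argument are assumptions added here so the loop can also be tried on a
filesystem that should not be filled completely.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write zeros to `path` until the filesystem is full or `limit` bytes
 * have been written, then unlink the file again.
 * Returns the number of bytes written, or -1 on an unexpected error.
 * `limit` is an assumption of this sketch, not part of the original test.
 */
long long fill_and_delete(const char *path, long long limit)
{
	static char zeros[1 << 16];	/* static storage is zero-filled */
	long long total = 0;
	int fd;

	assert(path != NULL && limit >= 0);

	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -1;

	while (total < limit) {
		size_t want = sizeof(zeros);
		ssize_t n;

		if ((long long)want > limit - total)
			want = (size_t)(limit - total);

		n = write(fd, zeros, want);
		if (n < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			if (errno == ENOSPC)
				break;		/* out of space: expected end */
			close(fd);
			unlink(path);
			return -1;
		}
		if (n == 0)
			break;			/* no progress, give up */
		total += n;
	}

	close(fd);
	unlink(path);	/* "deleted the file again" */
	return total;
}
```

Calling this repeatedly on a path inside a tmpfs mount, with a limit
larger than the mount, gives the tight loop described; each page written
goes through the page cache insertion path being profiled.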

^ permalink raw reply	[flat|nested] 101+ messages in thread
