* [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded
From: David Howells @ 2009-03-11 15:30 UTC
To: torvalds, akpm, peterz
Cc: Enrik.Berkhan, dhowells, uclinux-dev, linux-kernel

From: Enrik Berkhan <Enrik.Berkhan@ge.com>

The pages attached to a ramfs inode's pagecache by truncation from nothing - as
done by SYSV SHM for example - may get discarded under memory pressure.

The problem is that the pages are not marked dirty.  Anything that creates data
in an MMU-based ramfs will cause the set_page_dirty() aop to be called on the
pages holding that data.

For the NOMMU-based mmap, set_page_dirty() may be called by write(), but it
won't be called by page-writing faults on writable mmaps, and it isn't called
by ramfs_nommu_expand_for_mapping() when a file is being truncated from nothing
to allocate a contiguous run.

The solution is to mark the pages dirty at the point of allocation by
the truncation code.

Signed-off-by: Enrik Berkhan <Enrik.Berkhan@ge.com>
Signed-off-by: David Howells <dhowells@redhat.com>
---

 fs/ramfs/file-nommu.c |    3 +++
 1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/fs/ramfs/file-nommu.c b/fs/ramfs/file-nommu.c
index b9b567a..90d72be 100644
--- a/fs/ramfs/file-nommu.c
+++ b/fs/ramfs/file-nommu.c
@@ -114,6 +114,9 @@ int ramfs_nommu_expand_for_mapping(struct inode *inode, size_t newsize)
 		if (!pagevec_add(&lru_pvec, page))
 			__pagevec_lru_add_file(&lru_pvec);
 
+		/* prevent the page from being discarded on memory pressure */
+		SetPageDirty(page);
+
 		unlock_page(page);
 	}
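The effect of the one-line fix can be illustrated with a small userspace sketch. It assumes a toy reclaim pass that discards every clean, unmapped page, which is how the thread describes the failure; it is not kernel code, and every toy_* name is hypothetical and exists only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_PAGES 16	/* arbitrary limit for this sketch */

/* Hypothetical stand-in for struct page: only the state the sketch needs. */
struct toy_page {
	bool present;	/* still attached to the toy pagecache */
	bool dirty;	/* models PG_dirty */
	bool mapped;	/* models "mapped through page tables"; never set here */
};

/* Toy version of expanding a file from nothing to n pages. */
static void toy_expand(struct toy_page *pages, size_t n, bool mark_dirty)
{
	for (size_t i = 0; i < n; i++) {
		pages[i].present = true;
		pages[i].mapped = false;
		/* With mark_dirty set, this models the SetPageDirty() call. */
		pages[i].dirty = mark_dirty;
	}
}

/* Toy reclaim pass: drop every clean, unmapped page; return how many. */
static size_t toy_reclaim(struct toy_page *pages, size_t n)
{
	size_t freed = 0;

	for (size_t i = 0; i < n; i++) {
		if (!pages[i].present || pages[i].dirty || pages[i].mapped)
			continue;
		pages[i].present = false;
		freed++;
	}
	return freed;
}

/* Expand to n pages, run one reclaim pass, report how many pages were lost. */
size_t toy_pages_lost(size_t n, bool mark_dirty)
{
	struct toy_page pages[TOY_MAX_PAGES];

	assert(n <= TOY_MAX_PAGES);
	toy_expand(pages, n, mark_dirty);
	return toy_reclaim(pages, n);
}
```

Calling toy_pages_lost(4, false) models the unpatched behaviour, where all four pages are dropped; toy_pages_lost(4, true) models the patched allocation path, where none are.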
* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded
From: Johannes Weiner @ 2009-03-11 17:26 UTC
To: David Howells
Cc: torvalds, akpm, peterz, Enrik.Berkhan, uclinux-dev, linux-kernel

On Wed, Mar 11, 2009 at 03:30:35PM +0000, David Howells wrote:
> From: Enrik Berkhan <Enrik.Berkhan@ge.com>
>
> The pages attached to a ramfs inode's pagecache by truncation from nothing - as
> done by SYSV SHM for example - may get discarded under memory pressure.
>
> The problem is that the pages are not marked dirty.  Anything that creates data
> in an MMU-based ramfs will cause the set_page_dirty() aop to be called on the
> pages holding that data.
>
> For the NOMMU-based mmap, set_page_dirty() may be called by write(), but it
> won't be called by page-writing faults on writable mmaps, and it isn't called
> by ramfs_nommu_expand_for_mapping() when a file is being truncated from nothing
> to allocate a contiguous run.
>
> The solution is to mark the pages dirty at the point of allocation by
> the truncation code.
>
> Signed-off-by: Enrik Berkhan <Enrik.Berkhan@ge.com>
> Signed-off-by: David Howells <dhowells@redhat.com>
> ---
>
>  fs/ramfs/file-nommu.c |    3 +++
>  1 files changed, 3 insertions(+), 0 deletions(-)
>
>
> diff --git a/fs/ramfs/file-nommu.c b/fs/ramfs/file-nommu.c
> index b9b567a..90d72be 100644
> --- a/fs/ramfs/file-nommu.c
> +++ b/fs/ramfs/file-nommu.c
> @@ -114,6 +114,9 @@ int ramfs_nommu_expand_for_mapping(struct inode *inode, size_t newsize)
>  		if (!pagevec_add(&lru_pvec, page))
>  			__pagevec_lru_add_file(&lru_pvec);
>
> +		/* prevent the page from being discarded on memory pressure */
> +		SetPageDirty(page);
> +
>  		unlock_page(page);
>  	}

Reviewed-by: Johannes Weiner <hannes@cmpxchg.org>

I think the attached patch is also needed, though unrelated to the above.

	Hannes

---
From bfa7bc5f884bbc01c5e10faba7ca17160befd61e Mon Sep 17 00:00:00 2001
From: Johannes Weiner <hannes@cmpxchg.org>
Date: Wed, 11 Mar 2009 18:13:34 +0100
Subject: [PATCH] ramfs: don't leak pages when adding to page cache fails

When a ramfs nommu mapping is expanded, contiguous pages are allocated
and added to the pagecache.  The caller's reference is then passed on
by moving whole pagevecs to the file lru list.

If the page cache adding fails, make sure that the error path also
moves the pagevec contents which might still contain up to
PAGEVEC_SIZE successfully added pages, of which we would leak
references otherwise.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 fs/ramfs/file-nommu.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/fs/ramfs/file-nommu.c b/fs/ramfs/file-nommu.c
index b9b567a..6d1624e 100644
--- a/fs/ramfs/file-nommu.c
+++ b/fs/ramfs/file-nommu.c
@@ -126,6 +126,7 @@ int ramfs_nommu_expand_for_mapping(struct inode *inode, size_t newsize)
 	return -EFBIG;
 
  add_error:
+	pagevec_lru_add_file(&lru_pvec);
 	page_cache_release(pages + loop);
 	for (loop++; loop < npages; loop++)
 		__free_page(pages + loop);
-- 
1.6.1.3
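The leak described in the patch above can be sketched in userspace. The model below assumes that each successfully added page holds two references (the caller's and the pagecache's) and that flushing a batch is what hands the caller's reference on. The batch size, the refcount values and all toy_* names are assumptions made for this illustration, not kernel definitions.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_PAGEVEC_SIZE 14	/* assumed batch size for this sketch */
#define TOY_MAX_PAGES    64	/* arbitrary limit for this sketch */

/* Hypothetical batch of pending pages, identified by their refcounts. */
struct toy_pagevec {
	size_t nr;
	int *refs[TOY_PAGEVEC_SIZE];
};

/* Queue one page; returns the space left, so 0 means "batch is full". */
static size_t toy_pagevec_add(struct toy_pagevec *pv, int *refcount)
{
	pv->refs[pv->nr++] = refcount;
	return TOY_PAGEVEC_SIZE - pv->nr;
}

/* Flush the batch: drop the caller's reference on every queued page. */
static void toy_pagevec_flush(struct toy_pagevec *pv)
{
	for (size_t i = 0; i < pv->nr; i++)
		(*pv->refs[i])--;
	pv->nr = 0;
}

/*
 * Add npages pages, failing at index fail_at (use fail_at >= npages for
 * success).  Returns how many pages still hold the caller's reference
 * afterwards, i.e. how many references were leaked.
 */
size_t toy_leaked_refs(size_t npages, size_t fail_at, int flush_on_error)
{
	int refcount[TOY_MAX_PAGES] = { 0 };
	struct toy_pagevec pv = { 0 };
	size_t i, leaked = 0;

	assert(npages <= TOY_MAX_PAGES);

	for (i = 0; i < npages; i++) {
		if (i == fail_at)
			break;		/* simulated failure to add the page */
		refcount[i] = 2;	/* caller's ref + pagecache's ref */
		if (!toy_pagevec_add(&pv, &refcount[i]))
			toy_pagevec_flush(&pv);
	}

	/* The success path always flushes; the error path only if asked to. */
	if (i == npages || flush_on_error)
		toy_pagevec_flush(&pv);

	for (i = 0; i < npages; i++)
		if (refcount[i] == 2)
			leaked++;
	return leaked;
}
```

With 20 pages and a failure at index 17, the first 14 pages are flushed when the batch fills, leaving three pages queued. Without the flush on the error path those three references are never dropped; with it, none leak.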
* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded
From: Andrew Morton @ 2009-03-11 22:03 UTC
To: David Howells
Cc: torvalds, peterz, Enrik.Berkhan, dhowells, uclinux-dev, linux-kernel, linux-mm, Johannes Weiner

On Wed, 11 Mar 2009 15:30:35 +0000
David Howells <dhowells@redhat.com> wrote:

> From: Enrik Berkhan <Enrik.Berkhan@ge.com>
>
> The pages attached to a ramfs inode's pagecache by truncation from nothing - as
> done by SYSV SHM for example - may get discarded under memory pressure.

Something has gone wrong in core VM.

> The problem is that the pages are not marked dirty.  Anything that creates data
> in an MMU-based ramfs will cause the set_page_dirty() aop to be called on the
> pages holding that data.
>
> For the NOMMU-based mmap, set_page_dirty() may be called by write(), but it
> won't be called by page-writing faults on writable mmaps, and it isn't called
> by ramfs_nommu_expand_for_mapping() when a file is being truncated from nothing
> to allocate a contiguous run.
>
> The solution is to mark the pages dirty at the point of allocation by
> the truncation code.

Page reclaim shouldn't be even attempting to reclaim or write back
ramfs pagecache pages - reclaim can't possibly do anything with these
pages!

Arguably those pages shouldn't be on the LRU at all, but we haven't
done that yet.

Now, my problem is that I can't 100% be sure that we _ever_ implemented
this properly.  I _think_ we did, in which case we later broke it.  If
we've always been (stupidly) trying to pageout these pages then OK, I
guess your patch is a suitable 2.6.29 stopgap.

If, however, we broke it then we've probably broken other filesystems
and we should fix the regression instead.

Running bdi_cap_writeback_dirty() in may_write_to_queue() might be the
way to fix all this.

Peter touched it last :)
* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded
From: Johannes Weiner @ 2009-03-11 22:36 UTC
To: Andrew Morton
Cc: David Howells, torvalds, peterz, Enrik.Berkhan, uclinux-dev, linux-kernel, linux-mm

On Wed, Mar 11, 2009 at 03:03:02PM -0700, Andrew Morton wrote:
> On Wed, 11 Mar 2009 15:30:35 +0000
> David Howells <dhowells@redhat.com> wrote:
>
> > From: Enrik Berkhan <Enrik.Berkhan@ge.com>
> >
> > The pages attached to a ramfs inode's pagecache by truncation from nothing - as
> > done by SYSV SHM for example - may get discarded under memory pressure.
>
> Something has gone wrong in core VM.
>
> > The problem is that the pages are not marked dirty.  Anything that creates data
> > in an MMU-based ramfs will cause the set_page_dirty() aop to be called on the
> > pages holding that data.
> >
> > For the NOMMU-based mmap, set_page_dirty() may be called by write(), but it
> > won't be called by page-writing faults on writable mmaps, and it isn't called
> > by ramfs_nommu_expand_for_mapping() when a file is being truncated from nothing
> > to allocate a contiguous run.
> >
> > The solution is to mark the pages dirty at the point of allocation by
> > the truncation code.
>
> Page reclaim shouldn't be even attempting to reclaim or write back
> ramfs pagecache pages - reclaim can't possibly do anything with these
> pages!
>
> Arguably those pages shouldn't be on the LRU at all, but we haven't
> done that yet.
>
> Now, my problem is that I can't 100% be sure that we _ever_ implemented
> this properly.  I _think_ we did, in which case we later broke it.  If
> we've always been (stupidly) trying to pageout these pages then OK, I
> guess your patch is a suitable 2.6.29 stopgap.
>
> If, however, we broke it then we've probably broken other filesystems
> and we should fix the regression instead.
>
> Running bdi_cap_writeback_dirty() in may_write_to_queue() might be the
> way to fix all this.

The pages are not dirty, so there is no pageout() call that could
return PAGE_KEEP.  Reclaim will just go through and reclaim the clean,
unmapped pages.
* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded
From: Andrew Morton @ 2009-03-12 0:02 UTC
To: dhowells, torvalds, peterz, Enrik.Berkhan, uclinux-dev, linux-kernel, linux-mm, hannes

On Wed, 11 Mar 2009 15:03:02 -0700
Andrew Morton <akpm@linux-foundation.org> wrote:

> > The problem is that the pages are not marked dirty.  Anything that creates data
> > in an MMU-based ramfs will cause the set_page_dirty() aop to be called on the
> > pages holding that data.
> >
> > For the NOMMU-based mmap, set_page_dirty() may be called by write(), but it
> > won't be called by page-writing faults on writable mmaps, and it isn't called
> > by ramfs_nommu_expand_for_mapping() when a file is being truncated from nothing
> > to allocate a contiguous run.
> >
> > The solution is to mark the pages dirty at the point of allocation by
> > the truncation code.
>
> Page reclaim shouldn't be even attempting to reclaim or write back
> ramfs pagecache pages - reclaim can't possibly do anything with these
> pages!
>
> Arguably those pages shouldn't be on the LRU at all, but we haven't
> done that yet.
>
> Now, my problem is that I can't 100% be sure that we _ever_ implemented
> this properly.  I _think_ we did, in which case we later broke it.  If
> we've always been (stupidly) trying to pageout these pages then OK, I
> guess your patch is a suitable 2.6.29 stopgap.

OK, I can't find any code anywhere in which we excluded ramfs pages
from consideration by page reclaim.  How dumb.

So I guess that for now the proposed patch is suitable.  Longer-term we
should bale early in shrink_page_list(), or not add these pages to the
LRU at all.
* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded
From: Minchan Kim @ 2009-03-12 0:35 UTC
To: Andrew Morton
Cc: dhowells, torvalds, peterz, Enrik.Berkhan, uclinux-dev, linux-kernel, linux-mm, Johannes Weiner, Minchan Kim, Rik van Riel, Lee Schermerhorn, KOSAKI Motohiro

On Thu, Mar 12, 2009 at 9:02 AM, Andrew Morton <akpm@linux-foundation.org> wrote:
> On Wed, 11 Mar 2009 15:03:02 -0700
> Andrew Morton <akpm@linux-foundation.org> wrote:
>
>> > The problem is that the pages are not marked dirty.  Anything that creates data
>> > in an MMU-based ramfs will cause the set_page_dirty() aop to be called on the
>> > pages holding that data.
>> >
>> > For the NOMMU-based mmap, set_page_dirty() may be called by write(), but it
>> > won't be called by page-writing faults on writable mmaps, and it isn't called
>> > by ramfs_nommu_expand_for_mapping() when a file is being truncated from nothing
>> > to allocate a contiguous run.
>> >
>> > The solution is to mark the pages dirty at the point of allocation by
>> > the truncation code.
>>
>> Page reclaim shouldn't be even attempting to reclaim or write back
>> ramfs pagecache pages - reclaim can't possibly do anything with these
>> pages!
>>
>> Arguably those pages shouldn't be on the LRU at all, but we haven't
>> done that yet.
>>
>> Now, my problem is that I can't 100% be sure that we _ever_ implemented
>> this properly.  I _think_ we did, in which case we later broke it.  If
>> we've always been (stupidly) trying to pageout these pages then OK, I
>> guess your patch is a suitable 2.6.29 stopgap.
>
> OK, I can't find any code anywhere in which we excluded ramfs pages
> from consideration by page reclaim.  How dumb.

ramfs handles this only in the CONFIG_UNEVICTABLE_LRU case.
In that case, ramfs_get_inode calls mapping_set_unevictable,
so page reclaim can exclude ramfs pages via page_evictable.
That dependency is the problem.

> So I guess that for now the proposed patch is suitable.  Longer-term we
> should bale early in shrink_page_list(), or not add these pages to the
> LRU at all.

We will have to improve this in the future.

-- 
Kind regards,
Minchan Kim
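The mechanism Minchan describes can be sketched as a per-mapping flag that an evictability check consults. The userspace model below is only an illustration under stated assumptions: the toy_* names, the bit number, and the boolean that stands in for CONFIG_UNEVICTABLE_LRU are all hypothetical and are not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_AS_UNEVICTABLE 0	/* assumed bit number for this sketch */

/* Hypothetical stand-ins for struct address_space and struct page. */
struct toy_mapping {
	unsigned long flags;
};

struct toy_cached_page {
	struct toy_mapping *mapping;
};

/*
 * Mark a mapping unevictable.  When the (modelled) config option is off,
 * this is a no-op, mirroring the empty stub used in that configuration.
 */
static void toy_mapping_set_unevictable(struct toy_mapping *m,
					bool config_unevictable_lru)
{
	if (config_unevictable_lru)
		m->flags |= 1UL << TOY_AS_UNEVICTABLE;
}

static bool toy_mapping_unevictable(const struct toy_mapping *m)
{
	return m != NULL && (m->flags & (1UL << TOY_AS_UNEVICTABLE)) != 0;
}

/* Reclaim may only consider pages whose mapping is not flagged. */
static bool toy_page_evictable(const struct toy_cached_page *page)
{
	return !toy_mapping_unevictable(page->mapping);
}

/* Model a ramfs inode being set up, then ask whether its page is evictable. */
bool toy_ramfs_page_evictable(bool config_unevictable_lru)
{
	struct toy_mapping mapping = { 0 };
	struct toy_cached_page page = { &mapping };

	toy_mapping_set_unevictable(&mapping, config_unevictable_lru);
	return toy_page_evictable(&page);
}
```

In this model the page is protected only when the option is enabled, which is the dependency the message above calls the problem.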
* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded
From: KOSAKI Motohiro @ 2009-03-12 1:04 UTC
To: Minchan Kim
Cc: kosaki.motohiro, Andrew Morton, dhowells, torvalds, peterz, Enrik.Berkhan, uclinux-dev, linux-kernel, linux-mm, Johannes Weiner, Rik van Riel, Lee Schermerhorn

Hi

> >> Page reclaim shouldn't be even attempting to reclaim or write back
> >> ramfs pagecache pages - reclaim can't possibly do anything with these
> >> pages!
> >>
> >> Arguably those pages shouldn't be on the LRU at all, but we haven't
> >> done that yet.
> >>
> >> Now, my problem is that I can't 100% be sure that we _ever_ implemented
> >> this properly.  I _think_ we did, in which case we later broke it.  If
> >> we've always been (stupidly) trying to pageout these pages then OK, I
> >> guess your patch is a suitable 2.6.29 stopgap.
> >
> > OK, I can't find any code anywhere in which we excluded ramfs pages
> > from consideration by page reclaim.  How dumb.
>
> The ramfs considers it in just CONFIG_UNEVICTABLE_LRU case
> It that case, ramfs_get_inode calls mapping_set_unevictable.
> So, page reclaim can exclude ramfs pages by page_evictable.
> It's problem .

Currently, CONFIG_UNEVICTABLE_LRU can't be used on NOMMU machines
because none of the vmscan folks have a NOMMU machine.

Yes, it is a very stupid reason. Testers are _very_ welcome! :)


David, could you please try the following patch if you have a NOMMU machine?
It is a straightforward port to nommu.


==
Subject: [PATCH] remove to depend on MMU from CONFIG_UNEVICTABLE_LRU

Logically, CONFIG_UNEVICTABLE_LRU doesn't depend on MMU,
but the current code does by mistake. Fix it.


Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
---
 mm/Kconfig |    1 -
 mm/nommu.c |   24 ++++++++++++++++++++++++
 2 files changed, 24 insertions(+), 1 deletion(-)

Index: b/mm/Kconfig
===================================================================
--- a/mm/Kconfig	2008-12-28 20:55:23.000000000 +0900
+++ b/mm/Kconfig	2008-12-28 21:24:08.000000000 +0900
@@ -212,7 +212,6 @@ config VIRT_TO_BUS
 config UNEVICTABLE_LRU
 	bool "Add LRU list to track non-evictable pages"
 	default y
-	depends on MMU
 	help
 	  Keeps unevictable pages off of the active and inactive pageout
 	  lists, so kswapd will not waste CPU time or have its balancing
Index: b/mm/nommu.c
===================================================================
--- a/mm/nommu.c	2008-12-25 08:26:37.000000000 +0900
+++ b/mm/nommu.c	2008-12-28 21:29:36.000000000 +0900
@@ -1521,3 +1521,27 @@ int access_process_vm(struct task_struct
 	mmput(mm);
 	return len;
 }
+
+/*
+ *  LRU accounting for clear_page_mlock()
+ */
+void __clear_page_mlock(struct page *page)
+{
+	VM_BUG_ON(!PageLocked(page));
+
+	if (!page->mapping) {	/* truncated ? */
+		return;
+	}
+
+	dec_zone_page_state(page, NR_MLOCK);
+	count_vm_event(UNEVICTABLE_PGCLEARED);
+	if (!isolate_lru_page(page)) {
+		putback_lru_page(page);
+	} else {
+		/*
+		 * We lost the race. the page already moved to evictable list.
+		 */
+		if (PageUnevictable(page))
+			count_vm_event(UNEVICTABLE_PGSTRANDED);
+	}
+}
* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded
  2009-03-12 1:04 ` KOSAKI Motohiro
@ 2009-03-12 1:52 ` Minchan Kim
  -1 siblings, 0 replies; 101+ messages in thread
From: Minchan Kim @ 2009-03-12 1:52 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Minchan Kim, Andrew Morton, dhowells, torvalds, peterz,
      Enrik.Berkhan, uclinux-dev, linux-kernel, linux-mm,
      Johannes Weiner, Rik van Riel, Lee Schermerhorn

Hi, Kosaki-san.

I think the unevictability of ramfs pages should not depend on
CONFIG_UNEVICTABLE_LRU.
Wouldn't it be better to remove the dependency on CONFIG_UNEVICTABLE_LRU?

How about this?
It's just an RFC. It's not tested.

That's because we can't reclaim those pages regardless of whether there is
an unevictable list or not.

From 487ce9577ea9c43b04ff340a1ba8c4030873e875 Mon Sep 17 00:00:00 2001
From: MinChan Kim <minchan.kim@gmail.com>
Date: Thu, 12 Mar 2009 10:35:37 +0900
Subject: [PATCH] test

Signed-off-by: MinChan Kim <minchan.kim@gmail.com>
---
 include/linux/pagemap.h |    9 ---------
 include/linux/swap.h    |    9 ++-------
 2 files changed, 2 insertions(+), 16 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 4d27bf8..0cf024c 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -32,7 +32,6 @@ static inline void mapping_set_error(struct address_space *mapping, int error)
 	}
 }
 
-#ifdef CONFIG_UNEVICTABLE_LRU
 #define AS_UNEVICTABLE	(__GFP_BITS_SHIFT + 2)	/* e.g., ramdisk, SHM_LOCK */
 
 static inline void mapping_set_unevictable(struct address_space *mapping)
@@ -51,14 +50,6 @@ static inline int mapping_unevictable(struct address_space *mapping)
 		return test_bit(AS_UNEVICTABLE, &mapping->flags);
 	return !!mapping;
 }
-#else
-static inline void mapping_set_unevictable(struct address_space *mapping) { }
-static inline void mapping_clear_unevictable(struct address_space *mapping) { }
-static inline int mapping_unevictable(struct address_space *mapping)
-{
-	return 0;
-}
-#endif
 
 static inline gfp_t mapping_gfp_mask(struct address_space * mapping)
 {
diff --git a/include/linux/swap.h b/include/linux/swap.h
index a3af95b..18c639b 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -233,8 +233,9 @@ static inline int zone_reclaim(struct zone *z, gfp_t mask, unsigned int order)
 }
 #endif
 
-#ifdef CONFIG_UNEVICTABLE_LRU
 extern int page_evictable(struct page *page, struct vm_area_struct *vma);
+
+#ifdef CONFIG_UNEVICTABLE_LRU
 extern void scan_mapping_unevictable_pages(struct address_space *);
 
 extern unsigned long scan_unevictable_pages;
@@ -243,12 +244,6 @@ extern int scan_unevictable_handler(struct ctl_table *, int, struct file *,
 extern int scan_unevictable_register_node(struct node *node);
 extern void scan_unevictable_unregister_node(struct node *node);
 #else
-static inline int page_evictable(struct page *page,
-						struct vm_area_struct *vma)
-{
-	return 1;
-}
-
 static inline void scan_mapping_unevictable_pages(struct address_space *mapping)
 {
 }
--
1.5.4.3

> On Thu, 12 Mar 2009 10:04:41 +0900 (JST)
> KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
>
> Hi
>
> > >> Page reclaim shouldn't be even attempting to reclaim or write back
> > >> ramfs pagecache pages - reclaim can't possibly do anything with these
> > >> pages!
> > >>
> > >> Arguably those pages shouldn't be on the LRU at all, but we haven't
> > >> done that yet.
> > >>
> > >> Now, my problem is that I can't 100% be sure that we _ever_ implemented
> > >> this properly. I _think_ we did, in which case we later broke it. If
> > >> we've always been (stupidly) trying to pageout these pages then OK, I
> > >> guess your patch is a suitable 2.6.29 stopgap.
> > >
> > > OK, I can't find any code anywhere in which we excluded ramfs pages
> > > from consideration by page reclaim. How dumb.
> >
> > The ramfs considers it in just CONFIG_UNEVICTABLE_LRU case
> > It that case, ramfs_get_inode calls mapping_set_unevictable.
> > So, page reclaim can exclude ramfs pages by page_evictable.
> > It's problem .
>
> Currently, CONFIG_UNEVICTABLE_LRU can't use on nommu machine
> because nobody of vmscan folk havbe nommu machine.
>
> Yes, it is very stupid reason. _very_ welcome to tester! :)
>
>
>
> David, Could you please try following patch if you have NOMMU machine?
> it is straightforward porting to nommu.
>
>
> ==
> Subject: [PATCH] remove to depend on MMU from CONFIG_UNEVICTABLE_LRU
>
> logically, CONFIG_UNEVICTABLE_LRU doesn't depend on MMU.
> but current code does by mistake. fix it.
>
>
> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> ---
>  mm/Kconfig |    1 -
>  mm/nommu.c |   24 ++++++++++++++++++++++++
>  2 files changed, 24 insertions(+), 1 deletion(-)
>
> Index: b/mm/Kconfig
> ===================================================================
> --- a/mm/Kconfig	2008-12-28 20:55:23.000000000 +0900
> +++ b/mm/Kconfig	2008-12-28 21:24:08.000000000 +0900
> @@ -212,7 +212,6 @@ config VIRT_TO_BUS
>  config UNEVICTABLE_LRU
>  	bool "Add LRU list to track non-evictable pages"
>  	default y
> -	depends on MMU
>  	help
>  	  Keeps unevictable pages off of the active and inactive pageout
>  	  lists, so kswapd will not waste CPU time or have its balancing
> Index: b/mm/nommu.c
> ===================================================================
> --- a/mm/nommu.c	2008-12-25 08:26:37.000000000 +0900
> +++ b/mm/nommu.c	2008-12-28 21:29:36.000000000 +0900
> @@ -1521,3 +1521,27 @@ int access_process_vm(struct task_struct
>  	mmput(mm);
>  	return len;
>  }
> +
> +/*
> + * LRU accounting for clear_page_mlock()
> + */
> +void __clear_page_mlock(struct page *page)
> +{
> +	VM_BUG_ON(!PageLocked(page));
> +
> +	if (!page->mapping) {	/* truncated ? */
> +		return;
> +	}
> +
> +	dec_zone_page_state(page, NR_MLOCK);
> +	count_vm_event(UNEVICTABLE_PGCLEARED);
> +	if (!isolate_lru_page(page)) {
> +		putback_lru_page(page);
> +	} else {
> +		/*
> +		 * We lost the race. the page already moved to evictable list.
> +		 */
> +		if (PageUnevictable(page))
> +			count_vm_event(UNEVICTABLE_PGSTRANDED);
> +	}
> +}

--
Kind regards,
Minchan Kim

^ permalink raw reply related [flat|nested] 101+ messages in thread
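As an illustration of the RFC above: it makes mapping_set_unevictable(), mapping_unevictable() and the page_evictable() declaration unconditional, so that a mapping flagged as unevictable is skipped by reclaim whether or not the unevictable LRU list is configured. The following is only a minimal userspace sketch of that idea in C. Every model_ name and the bit number are hypothetical stand-ins rather than kernel definitions, and the real page_evictable() also takes mlocked pages and VMAs into account, which this sketch leaves out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit number, for illustration only; not the kernel's value. */
#define MODEL_AS_UNEVICTABLE 2

/* Stand-in for struct address_space: only the flags word matters here. */
struct model_mapping {
	unsigned long flags;
};

/* Stand-in for struct page: only the owning mapping matters here. */
struct model_page {
	struct model_mapping *mapping;	/* NULL for a page with no mapping */
};

/* Mark every page of this mapping as not reclaimable (what ramfs wants). */
static inline void model_mapping_set_unevictable(struct model_mapping *m)
{
	m->flags |= 1UL << MODEL_AS_UNEVICTABLE;
}

/* A NULL mapping is treated as "not flagged", as in the quoted helper. */
static inline bool model_mapping_unevictable(const struct model_mapping *m)
{
	return m != NULL && (m->flags & (1UL << MODEL_AS_UNEVICTABLE)) != 0;
}

/* Simplified: consults only the mapping flag, with no config dependency. */
static inline bool model_page_evictable(const struct model_page *page)
{
	return !model_mapping_unevictable(page->mapping);
}
```

In this model, a page whose mapping has been passed to model_mapping_set_unevictable() is reported as not evictable, while a page with no mapping or an unflagged mapping remains evictable.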
* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded
  2009-03-12 1:52 ` Minchan Kim
@ 2009-03-12 1:56 ` Minchan Kim
  -1 siblings, 0 replies; 101+ messages in thread
From: Minchan Kim @ 2009-03-12 1:56 UTC (permalink / raw)
  To: Minchan Kim
  Cc: KOSAKI Motohiro, Andrew Morton, dhowells, torvalds, peterz,
      Enrik.Berkhan, uclinux-dev, linux-kernel, linux-mm,
      Johannes Weiner, Rik van Riel, Lee Schermerhorn

I sent the email by mistake while I was still writing it.
Sorry for that.

Please excuse the wrong patch title and changelog.
I think you can understand the patch even though I haven't fixed them.
So I won't resend it until the discussion is finished. :)

On Thu, Mar 12, 2009 at 10:52 AM, Minchan Kim <minchan.kim@gmail.com> wrote:
> Hi, Kosaki-san.
>
> I think the unevictability of ramfs pages should not depend on
> CONFIG_UNEVICTABLE_LRU.
> Wouldn't it be better to remove the dependency on CONFIG_UNEVICTABLE_LRU?
>
>
> How about this?
> It's just an RFC. It's not tested.
>
> That's because we can't reclaim those pages regardless of whether there is
> an unevictable list or not.
>
> From 487ce9577ea9c43b04ff340a1ba8c4030873e875 Mon Sep 17 00:00:00 2001
> From: MinChan Kim <minchan.kim@gmail.com>
> Date: Thu, 12 Mar 2009 10:35:37 +0900
> Subject: [PATCH] test
> Signed-off-by: MinChan Kim <minchan.kim@gmail.com>
>
> ---
>  include/linux/pagemap.h |    9 ---------
>  include/linux/swap.h    |    9 ++-------
>  2 files changed, 2 insertions(+), 16 deletions(-)
>
> diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
> index 4d27bf8..0cf024c 100644
> --- a/include/linux/pagemap.h
> +++ b/include/linux/pagemap.h
> @@ -32,7 +32,6 @@ static inline void mapping_set_error(struct address_space *mapping, int error)
>  	}
>  }
>
> -#ifdef CONFIG_UNEVICTABLE_LRU
>  #define AS_UNEVICTABLE	(__GFP_BITS_SHIFT + 2)	/* e.g., ramdisk, SHM_LOCK */
>
>  static inline void mapping_set_unevictable(struct address_space *mapping)
> @@ -51,14 +50,6 @@ static inline int mapping_unevictable(struct address_space *mapping)
>  		return test_bit(AS_UNEVICTABLE, &mapping->flags);
>  	return !!mapping;
>  }
> -#else
> -static inline void mapping_set_unevictable(struct address_space *mapping) { }
> -static inline void mapping_clear_unevictable(struct address_space *mapping) { }
> -static inline int mapping_unevictable(struct address_space *mapping)
> -{
> -	return 0;
> -}
> -#endif
>
>  static inline gfp_t mapping_gfp_mask(struct address_space * mapping)
>  {
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index a3af95b..18c639b 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -233,8 +233,9 @@ static inline int zone_reclaim(struct zone *z, gfp_t mask, unsigned int order)
>  }
>  #endif
>
> -#ifdef CONFIG_UNEVICTABLE_LRU
>  extern int page_evictable(struct page *page, struct vm_area_struct *vma);
> +
> +#ifdef CONFIG_UNEVICTABLE_LRU
>  extern void scan_mapping_unevictable_pages(struct address_space *);
>
>  extern unsigned long scan_unevictable_pages;
> @@ -243,12 +244,6 @@ extern int scan_unevictable_handler(struct ctl_table *, int, struct file *,
>  extern int scan_unevictable_register_node(struct node *node);
>  extern void scan_unevictable_unregister_node(struct node *node);
>  #else
> -static inline int page_evictable(struct page *page,
> -						struct vm_area_struct *vma)
> -{
> -	return 1;
> -}
> -
>  static inline void scan_mapping_unevictable_pages(struct address_space *mapping)
>  {
>  }
> --
> 1.5.4.3
>
>
>
>> On Thu, 12 Mar 2009 10:04:41 +0900 (JST)
>> KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote:
>>
>> Hi
>>
>> > >> Page reclaim shouldn't be even attempting to reclaim or write back
>> > >> ramfs pagecache pages - reclaim can't possibly do anything with these
>> > >> pages!
>> > >>
>> > >> Arguably those pages shouldn't be on the LRU at all, but we haven't
>> > >> done that yet.
>> > >>
>> > >> Now, my problem is that I can't 100% be sure that we _ever_ implemented
>> > >> this properly. I _think_ we did, in which case we later broke it. If
>> > >> we've always been (stupidly) trying to pageout these pages then OK, I
>> > >> guess your patch is a suitable 2.6.29 stopgap.
>> > >
>> > > OK, I can't find any code anywhere in which we excluded ramfs pages
>> > > from consideration by page reclaim. How dumb.
>> >
>> > The ramfs considers it in just CONFIG_UNEVICTABLE_LRU case
>> > It that case, ramfs_get_inode calls mapping_set_unevictable.
>> > So, page reclaim can exclude ramfs pages by page_evictable.
>> > It's problem .
>>
>> Currently, CONFIG_UNEVICTABLE_LRU can't use on nommu machine
>> because nobody of vmscan folk havbe nommu machine.
>>
>> Yes, it is very stupid reason. _very_ welcome to tester! :)
>>
>>
>>
>> David, Could you please try following patch if you have NOMMU machine?
>> it is straightforward porting to nommu.
>>
>>
>> ==
>> Subject: [PATCH] remove to depend on MMU from CONFIG_UNEVICTABLE_LRU
>>
>> logically, CONFIG_UNEVICTABLE_LRU doesn't depend on MMU.
>> but current code does by mistake. fix it.
>>
>>
>> Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
>> ---
>>  mm/Kconfig |    1 -
>>  mm/nommu.c |   24 ++++++++++++++++++++++++
>>  2 files changed, 24 insertions(+), 1 deletion(-)
>>
>> Index: b/mm/Kconfig
>> ===================================================================
>> --- a/mm/Kconfig	2008-12-28 20:55:23.000000000 +0900
>> +++ b/mm/Kconfig	2008-12-28 21:24:08.000000000 +0900
>> @@ -212,7 +212,6 @@ config VIRT_TO_BUS
>>  config UNEVICTABLE_LRU
>>  	bool "Add LRU list to track non-evictable pages"
>>  	default y
>> -	depends on MMU
>>  	help
>>  	  Keeps unevictable pages off of the active and inactive pageout
>>  	  lists, so kswapd will not waste CPU time or have its balancing
>> Index: b/mm/nommu.c
>> ===================================================================
>> --- a/mm/nommu.c	2008-12-25 08:26:37.000000000 +0900
>> +++ b/mm/nommu.c	2008-12-28 21:29:36.000000000 +0900
>> @@ -1521,3 +1521,27 @@ int access_process_vm(struct task_struct
>>  	mmput(mm);
>>  	return len;
>>  }
>> +
>> +/*
>> + * LRU accounting for clear_page_mlock()
>> + */
>> +void __clear_page_mlock(struct page *page)
>> +{
>> +	VM_BUG_ON(!PageLocked(page));
>> +
>> +	if (!page->mapping) {	/* truncated ? */
>> +		return;
>> +	}
>> +
>> +	dec_zone_page_state(page, NR_MLOCK);
>> +	count_vm_event(UNEVICTABLE_PGCLEARED);
>> +	if (!isolate_lru_page(page)) {
>> +		putback_lru_page(page);
>> +	} else {
>> +		/*
>> +		 * We lost the race. the page already moved to evictable list.
>> +		 */
>> +		if (PageUnevictable(page))
>> +			count_vm_event(UNEVICTABLE_PGSTRANDED);
>> +	}
>> +}
>>
>
> --
> Kind regards,
> Minchan Kim

--
Thanks,
Minchan Kim

^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded
  2009-03-12 1:52 ` Minchan Kim
@ 2009-03-12 2:00 ` KOSAKI Motohiro
  -1 siblings, 0 replies; 101+ messages in thread
From: KOSAKI Motohiro @ 2009-03-12 2:00 UTC (permalink / raw)
  To: Minchan Kim
  Cc: kosaki.motohiro, Andrew Morton, dhowells, torvalds, peterz,
      Enrik.Berkhan, uclinux-dev, linux-kernel, linux-mm,
      Johannes Weiner, Rik van Riel, Lee Schermerhorn

> Hi, Kosaki-san.
>
> I think the unevictability of ramfs pages should not depend on
> CONFIG_UNEVICTABLE_LRU.
> Wouldn't it be better to remove the dependency on CONFIG_UNEVICTABLE_LRU?
>
> How about this?
> It's just an RFC. It's not tested.
>
> That's because we can't reclaim those pages regardless of whether there is
> an unevictable list or not.

Maybe your patch works.

But we could remove the CONFIG_UNEVICTABLE_LRU build option itself completely
once the nommu folks have confirmed that CONFIG_UNEVICTABLE_LRU works well on
their machines.

That is cleaner, IMHO.
What do you think?

^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded
  2009-03-12 2:00 ` KOSAKI Motohiro
@ 2009-03-12 2:11 ` Minchan Kim
  -1 siblings, 0 replies; 101+ messages in thread
From: Minchan Kim @ 2009-03-12 2:11 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Andrew Morton, dhowells, torvalds, peterz, Enrik.Berkhan,
      uclinux-dev, linux-kernel, linux-mm, Johannes Weiner,
      Rik van Riel, Lee Schermerhorn

On Thu, Mar 12, 2009 at 11:00 AM, KOSAKI Motohiro
<kosaki.motohiro@jp.fujitsu.com> wrote:
>> Hi, Kosaki-san.
>>
>> I think the unevictability of ramfs pages should not depend on
>> CONFIG_UNEVICTABLE_LRU.
>> Wouldn't it be better to remove the dependency on CONFIG_UNEVICTABLE_LRU?
>>
>> How about this?
>> It's just an RFC. It's not tested.
>>
>> That's because we can't reclaim those pages regardless of whether there is
>> an unevictable list or not.
>
> Maybe your patch works.
>
> But we could remove the CONFIG_UNEVICTABLE_LRU build option itself completely
> once the nommu folks have confirmed that CONFIG_UNEVICTABLE_LRU works well on
> their machines.
>
> That is cleaner, IMHO.
> What do you think?

I totally agree with you.
Let's wait for comments from the nommu folks.

--
Kind regards,
Minchan Kim

^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded
  2009-03-12 1:04 ` KOSAKI Motohiro
@ 2009-03-12 12:19 ` Robin Getz
  -1 siblings, 0 replies; 101+ messages in thread
From: Robin Getz @ 2009-03-12 12:19 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Minchan Kim, Andrew Morton, dhowells, torvalds, peterz,
      Enrik.Berkhan, uclinux-dev, linux-kernel, linux-mm,
      Johannes Weiner, Rik van Riel, Lee Schermerhorn

On Wed 11 Mar 2009 21:04, KOSAKI Motohiro pondered:
> Hi
>
> > >> Page reclaim shouldn't be even attempting to reclaim or write back
> > >> ramfs pagecache pages - reclaim can't possibly do anything with
> > >> these pages!
> > >>
> > >> Arguably those pages shouldn't be on the LRU at all, but we haven't
> > >> done that yet.
> > >>
> > >> Now, my problem is that I can't 100% be sure that we _ever_
> > >> implemented this properly. I _think_ we did, in which case
> > >> we later broke it. If we've always been (stupidly) trying
> > >> to pageout these pages then OK, I guess your patch is a
> > >> suitable 2.6.29 stopgap.
> > >
> > > OK, I can't find any code anywhere in which we excluded ramfs pages
> > > from consideration by page reclaim. How dumb.
> >
> > The ramfs considers it in just CONFIG_UNEVICTABLE_LRU case
> > It that case, ramfs_get_inode calls mapping_set_unevictable.
> > So, page reclaim can exclude ramfs pages by page_evictable.
> > It's problem .
>
> Currently, CONFIG_UNEVICTABLE_LRU can't use on nommu machine
> because nobody of vmscan folk havbe nommu machine.
>
> Yes, it is very stupid reason. _very_ welcome to tester! :)

As always - if you (or any kernel developer) would like a noMMU machine to
test on - please send me a private email.

^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: [uClinux-dev] Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded
  2009-03-12 12:19 ` Robin Getz
@ 2009-03-12 17:55 ` Jamie Lokier
  -1 siblings, 0 replies; 101+ messages in thread
From: Jamie Lokier @ 2009-03-12 17:55 UTC (permalink / raw)
  To: uClinux development list
  Cc: KOSAKI Motohiro, Lee Schermerhorn, Rik van Riel, peterz,
      linux-kernel, linux-mm, Minchan Kim, Johannes Weiner,
      Andrew Morton, torvalds

Robin Getz wrote:
> > Currently, CONFIG_UNEVICTABLE_LRU can't use on nommu machine
> > because nobody of vmscan folk havbe nommu machine.
> >
> > Yes, it is very stupid reason. _very_ welcome to tester! :)
>
> As always - if you (or any kernel developer) would like a noMMU machine to
> test on - please send me a private email.

Well, that explains why vmscan has historically performed a little
dubiously on small nommu machines!

By the way, this is just a random side thought... nommu kernels work
just fine in emulators :-)

-- Jamie

^ permalink raw reply [flat|nested] 101+ messages in thread
* [PATCH 0/2] Make the Unevictable LRU available on NOMMU
  2009-03-12 1:04 ` KOSAKI Motohiro
@ 2009-03-13 17:33 ` David Howells
  -1 siblings, 0 replies; 101+ messages in thread
From: David Howells @ 2009-03-13 17:33 UTC (permalink / raw)
  To: kosaki.motohiro, minchan.kim
  Cc: dhowells, akpm, torvalds, peterz, nrik.Berkhan, uclinux-dev,
      linux-kernel, linux-mm, hannes, riel, lee.schermerhorn

The first patch causes the mlock() bits added by CONFIG_UNEVICTABLE_LRU to be
unavailable in NOMMU mode.

The second patch makes CONFIG_UNEVICTABLE_LRU available in NOMMU mode.

David

^ permalink raw reply [flat|nested] 101+ messages in thread
* [PATCH 1/2] NOMMU: There is no mlock() for NOMMU, so don't provide the bits 2009-03-13 17:33 ` David Howells @ 2009-03-13 17:33 ` David Howells -1 siblings, 0 replies; 101+ messages in thread From: David Howells @ 2009-03-13 17:33 UTC (permalink / raw) To: kosaki.motohiro, minchan.kim Cc: dhowells, akpm, torvalds, peterz, nrik.Berkhan, uclinux-dev, linux-kernel, linux-mm, hannes, riel, lee.schermerhorn The mlock() facility does not exist for NOMMU since all mappings are effectively locked anyway, so we don't make the bits available when they're not useful. Signed-off-by: David Howells <dhowells@redhat.com> --- include/linux/page-flags.h | 20 +++++++++++++------- mm/Kconfig | 8 ++++++++ mm/internal.h | 8 +++++--- 3 files changed, 26 insertions(+), 10 deletions(-) diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h index 219a523..61df177 100644 --- a/include/linux/page-flags.h +++ b/include/linux/page-flags.h @@ -96,6 +96,8 @@ enum pageflags { PG_swapbacked, /* Page is backed by RAM/swap */ #ifdef CONFIG_UNEVICTABLE_LRU PG_unevictable, /* Page is "unevictable" */ +#endif +#ifdef CONFIG_HAVE_MLOCKED_PAGE_BIT PG_mlocked, /* Page is vma mlocked */ #endif #ifdef CONFIG_IA64_UNCACHED_ALLOCATOR @@ -234,20 +236,20 @@ PAGEFLAG_FALSE(SwapCache) #ifdef CONFIG_UNEVICTABLE_LRU PAGEFLAG(Unevictable, unevictable) __CLEARPAGEFLAG(Unevictable, unevictable) TESTCLEARFLAG(Unevictable, unevictable) +#else +PAGEFLAG_FALSE(Unevictable) TESTCLEARFLAG_FALSE(Unevictable) + SETPAGEFLAG_NOOP(Unevictable) CLEARPAGEFLAG_NOOP(Unevictable) + __CLEARPAGEFLAG_NOOP(Unevictable) +#endif +#ifdef CONFIG_HAVE_MLOCKED_PAGE_BIT #define MLOCK_PAGES 1 PAGEFLAG(Mlocked, mlocked) __CLEARPAGEFLAG(Mlocked, mlocked) TESTSCFLAG(Mlocked, mlocked) - #else - #define MLOCK_PAGES 0 PAGEFLAG_FALSE(Mlocked) SETPAGEFLAG_NOOP(Mlocked) TESTCLEARFLAG_FALSE(Mlocked) - -PAGEFLAG_FALSE(Unevictable) TESTCLEARFLAG_FALSE(Unevictable) - SETPAGEFLAG_NOOP(Unevictable) CLEARPAGEFLAG_NOOP(Unevictable) - 
__CLEARPAGEFLAG_NOOP(Unevictable) #endif #ifdef CONFIG_IA64_UNCACHED_ALLOCATOR @@ -367,9 +369,13 @@ static inline void __ClearPageTail(struct page *page) #ifdef CONFIG_UNEVICTABLE_LRU #define __PG_UNEVICTABLE (1 << PG_unevictable) -#define __PG_MLOCKED (1 << PG_mlocked) #else #define __PG_UNEVICTABLE 0 +#endif + +#ifdef CONFIG_HAVE_MLOCKED_PAGE_BIT +#define __PG_MLOCKED (1 << PG_mlocked) +#else #define __PG_MLOCKED 0 #endif diff --git a/mm/Kconfig b/mm/Kconfig index a5b7781..8c89597 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -214,5 +214,13 @@ config UNEVICTABLE_LRU will use one page flag and increase the code size a little, say Y unless you know what you are doing. +config HAVE_MLOCK + bool + default y if MMU=y + +config HAVE_MLOCKED_PAGE_BIT + bool + default y if HAVE_MLOCK=y && UNEVICTABLE_LRU=y + config MMU_NOTIFIER bool diff --git a/mm/internal.h b/mm/internal.h index 478223b..987bb03 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -63,6 +63,7 @@ static inline unsigned long page_order(struct page *page) return page_private(page); } +#ifdef CONFIG_HAVE_MLOCK extern long mlock_vma_pages_range(struct vm_area_struct *vma, unsigned long start, unsigned long end); extern void munlock_vma_pages_range(struct vm_area_struct *vma, @@ -71,6 +72,7 @@ static inline void munlock_vma_pages_all(struct vm_area_struct *vma) { munlock_vma_pages_range(vma, vma->vm_start, vma->vm_end); } +#endif #ifdef CONFIG_UNEVICTABLE_LRU /* @@ -90,7 +92,7 @@ static inline void unevictable_migrate_page(struct page *new, struct page *old) } #endif -#ifdef CONFIG_UNEVICTABLE_LRU +#ifdef CONFIG_HAVE_MLOCKED_PAGE_BIT /* * Called only in fault path via page_evictable() for a new page * to determine if it's being mapped into a LOCKED vma. 
@@ -165,7 +167,7 @@ static inline void free_page_mlock(struct page *page) } } -#else /* CONFIG_UNEVICTABLE_LRU */ +#else /* CONFIG_HAVE_MLOCKED_PAGE_BIT */ static inline int is_mlocked_vma(struct vm_area_struct *v, struct page *p) { return 0; @@ -175,7 +177,7 @@ static inline void mlock_vma_page(struct page *page) { } static inline void mlock_migrate_page(struct page *new, struct page *old) { } static inline void free_page_mlock(struct page *page) { } -#endif /* CONFIG_UNEVICTABLE_LRU */ +#endif /* CONFIG_HAVE_MLOCKED_PAGE_BIT */ /* * Return the mem_map entry representing the 'offset' subpage within ^ permalink raw reply related [flat|nested] 101+ messages in thread
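The gating that this patch applies can be sketched in plain user-space C. This is only an illustration of the pattern, not the kernel code: the two CONFIG_ macros are defined by hand here (in a kernel build they come from Kconfig), the struct page is a one-field stand-in, and page_mlocked() / set_page_mlocked() are hypothetical helper names standing in for the accessors that the PAGEFLAG macros generate. With the mlocked bit configured out, as it would be on NOMMU, the flag takes no slot in the enum, its mask is 0, and the accessors compile to no-ops.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: hand-written stand-ins for Kconfig output. */
#define CONFIG_UNEVICTABLE_LRU 1
/* CONFIG_HAVE_MLOCKED_PAGE_BIT is deliberately left undefined (NOMMU case). */

enum pageflags {
	PG_dirty,
#ifdef CONFIG_UNEVICTABLE_LRU
	PG_unevictable,		/* page is "unevictable" */
#endif
#ifdef CONFIG_HAVE_MLOCKED_PAGE_BIT
	PG_mlocked,		/* page is vma mlocked */
#endif
	NR_PAGEFLAGS
};

/* Minimal stand-in for the kernel's struct page. */
struct page {
	unsigned long flags;
};

#ifdef CONFIG_HAVE_MLOCKED_PAGE_BIT
#define __PG_MLOCKED (1UL << PG_mlocked)

/* Real accessors: test and set the bit in page->flags. */
static inline bool page_mlocked(const struct page *page)
{
	assert(page != NULL);
	return (page->flags & __PG_MLOCKED) != 0;
}

static inline void set_page_mlocked(struct page *page)
{
	assert(page != NULL);
	page->flags |= __PG_MLOCKED;
}
#else
#define __PG_MLOCKED 0UL

/* Stub accessors: the bit does not exist, so reads are false and writes are no-ops. */
static inline bool page_mlocked(const struct page *page)
{
	(void)page;
	return false;
}

static inline void set_page_mlocked(struct page *page)
{
	(void)page;
}
#endif
```

Callers can then use set_page_mlocked() and page_mlocked() unconditionally; only the header needs to know whether the bit exists, which is the same reason the patch keeps the #ifdef in page-flags.h and mm/internal.h rather than at each call site.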
* Re: [PATCH 1/2] NOMMU: There is no mlock() for NOMMU, so don't provide the bits 2009-03-13 17:33 ` David Howells @ 2009-03-14 11:17 ` KOSAKI Motohiro -1 siblings, 0 replies; 101+ messages in thread From: KOSAKI Motohiro @ 2009-03-14 11:17 UTC (permalink / raw) To: David Howells Cc: minchan.kim, akpm, torvalds, peterz, nrik.Berkhan, uclinux-dev, linux-kernel, linux-mm, hannes, riel, lee.schermerhorn 2009/3/14 David Howells <dhowells@redhat.com>: > The mlock() facility does not exist for NOMMU since all mappings are > effectively locked anyway, so we don't make the bits available when they're > not useful. > > Signed-off-by: David Howells <dhowells@redhat.com> Oh, your patch is a cleaner way. Thanks! Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> ^ permalink raw reply [flat|nested] 101+ messages in thread
* [PATCH 2/2] NOMMU: Make CONFIG_UNEVICTABLE_LRU available when CONFIG_MMU=n 2009-03-13 17:33 ` David Howells @ 2009-03-13 17:33 ` David Howells -1 siblings, 0 replies; 101+ messages in thread From: David Howells @ 2009-03-13 17:33 UTC (permalink / raw) To: kosaki.motohiro, minchan.kim Cc: dhowells, akpm, torvalds, peterz, nrik.Berkhan, uclinux-dev, linux-kernel, linux-mm, hannes, riel, lee.schermerhorn Make CONFIG_UNEVICTABLE_LRU available when CONFIG_MMU=n. There's no logical reason it shouldn't be available, and it can be used for ramfs. Signed-off-by: David Howells <dhowells@redhat.com> --- mm/Kconfig | 1 - 1 files changed, 0 insertions(+), 1 deletions(-) diff --git a/mm/Kconfig b/mm/Kconfig index 8c89597..b53427a 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -206,7 +206,6 @@ config VIRT_TO_BUS config UNEVICTABLE_LRU bool "Add LRU list to track non-evictable pages" default y - depends on MMU help Keeps unevictable pages off of the active and inactive pageout lists, so kswapd will not waste CPU time or have its balancing ^ permalink raw reply related [flat|nested] 101+ messages in thread
* Re: [PATCH 2/2] NOMMU: Make CONFIG_UNEVICTABLE_LRU available when CONFIG_MMU=n 2009-03-13 17:33 ` David Howells @ 2009-03-14 11:17 ` KOSAKI Motohiro -1 siblings, 0 replies; 101+ messages in thread From: KOSAKI Motohiro @ 2009-03-14 11:17 UTC (permalink / raw) To: David Howells Cc: minchan.kim, akpm, torvalds, peterz, nrik.Berkhan, uclinux-dev, linux-kernel, linux-mm, hannes, riel, lee.schermerhorn > diff --git a/mm/Kconfig b/mm/Kconfig > index 8c89597..b53427a 100644 > --- a/mm/Kconfig > +++ b/mm/Kconfig > @@ -206,7 +206,6 @@ config VIRT_TO_BUS > config UNEVICTABLE_LRU > bool "Add LRU list to track non-evictable pages" > default y > - depends on MMU > help > Keeps unevictable pages off of the active and inactive pageout > lists, so kswapd will not waste CPU time or have its balancing Reviewed-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> ^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: [PATCH 0/2] Make the Unevictable LRU available on NOMMU 2009-03-13 17:33 ` David Howells @ 2009-03-14 0:27 ` Minchan Kim -1 siblings, 0 replies; 101+ messages in thread From: Minchan Kim @ 2009-03-14 0:27 UTC (permalink / raw) To: David Howells, Andrew Morton Cc: kosaki.motohiro, torvalds, peterz, nrik.Berkhan, uclinux-dev, linux-kernel, linux-mm, hannes, riel, lee.schermerhorn, Minchan Kim Hi, David. It seems your patch is better than mine. Thanks. :) But my concern is that, as Peter pointed out, the unevictable LRU solution is not a fundamental one. He wants to remove ramfs pages from the LRU list to begin with. I guess Andrew thought the same as Peter. I think that is the fundamental solution, but it may be a long-term one. This patch can solve the NOMMU problem as things stand now. Andrew, what do you think about it? On Sat, Mar 14, 2009 at 2:33 AM, David Howells <dhowells@redhat.com> wrote: > > The first patch causes the mlock() bits added by CONFIG_UNEVICTABLE_LRU to be > unavailable in NOMMU mode. > > The second patch makes CONFIG_UNEVICTABLE_LRU available in NOMMU mode. > > David > -- Kind regards, Minchan Kim ^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: [PATCH 0/2] Make the Unevictable LRU available on NOMMU 2009-03-14 0:27 ` Minchan Kim @ 2009-03-20 16:08 ` Lee Schermerhorn -1 siblings, 0 replies; 101+ messages in thread From: Lee Schermerhorn @ 2009-03-20 16:08 UTC (permalink / raw) To: Minchan Kim Cc: David Howells, Andrew Morton, kosaki.motohiro, torvalds, peterz, nrik.Berkhan, uclinux-dev, linux-kernel, linux-mm, hannes, riel On Sat, 2009-03-14 at 09:27 +0900, Minchan Kim wrote: > Hi, David. > > It seems your patch is better than mine. Thanks. :) > But my concern is that as Peter pointed out, unevictable lru's > solution is not fundamental one. > > He want to remove ramfs page from lru list to begin with. > I guess Andrew also thought same thing with Peter. > > I think it's a fundamental solution. but it may be long term solution. > This patch can solve NOMMU problem in current status. > > Andrew, What do you think about it ? [been meaning to respond to this...] I just want to point out [again :)] that removing the ramfs pages from the lru will prevent them from being migrated--e.g., for mem hot unplug, defrag or such. We currently have this situation with the new ram disk driver [brd.c] which, unlike the old rd driver, doesn't place its pages on the LRU. Migration uses isolation of pages from lru to arbitrate between tasks trying to migrate or reclaim the same page. If migration doesn't find the page on the lru, it assumes that it lost the race and skips the page. This is one of the reasons we chose to keep unevictable pages on an lru-like list known to isolate_lru_page(). Something to keep in mind if/when this comes up again. Maybe we don't care? Maybe ram disk/fs pages should come only from non-movable zone? Or maybe migration can be reworked not to require the page be "isolatable" from the lru [haven't thought about how one might do this]. Lee ^ permalink raw reply [flat|nested] 101+ messages in thread
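The arbitration Lee describes can be sketched as a small user-space model. It is an assumption-laden illustration, not the kernel's implementation: struct page here has only an on_lru flag and a node number, isolate_page(), putback_page() and migrate_pages_sketch() are hypothetical names, and the locking that makes the real isolation step atomic is left out. The point it shows is that a page which is not on the LRU can never be claimed, so migration skips it exactly as it would skip a page it lost a race for.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for a page: LRU membership plus its current node. */
struct page {
	bool on_lru;
	int node;
};

/*
 * Try to claim a page by taking it off the LRU. Only a caller that finds
 * the page on the LRU wins; everyone else gets false. (The real kernel
 * does this under a lock; this sketch is single-threaded.)
 */
static bool isolate_page(struct page *page)
{
	assert(page != NULL);
	if (!page->on_lru)
		return false;
	page->on_lru = false;
	return true;
}

/* Return a previously isolated page to the LRU. */
static void putback_page(struct page *page)
{
	assert(page != NULL);
	page->on_lru = true;
}

/*
 * Move every page that can be isolated to target_node and return how many
 * were moved. Pages that cannot be isolated are skipped: the caller cannot
 * tell "lost the race" apart from "was never on the LRU".
 */
static size_t migrate_pages_sketch(struct page *pages, size_t n, int target_node)
{
	size_t moved = 0;

	for (size_t i = 0; i < n; i++) {
		if (!isolate_page(&pages[i]))
			continue;
		pages[i].node = target_node;
		putback_page(&pages[i]);
		moved++;
	}
	return moved;
}
```

Given three pages of which the middle one was never put on the LRU, migrate_pages_sketch() moves the first and third and leaves the middle one on its original node, which is the behaviour Lee warns about for ramfs or brd pages that are kept off the LRU entirely.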
* Re: [PATCH 0/2] Make the Unevictable LRU available on NOMMU 2009-03-14 0:27 ` Minchan Kim @ 2009-03-20 16:24 ` David Howells -1 siblings, 0 replies; 101+ messages in thread From: David Howells @ 2009-03-20 16:24 UTC (permalink / raw) To: Lee Schermerhorn Cc: dhowells, Minchan Kim, Andrew Morton, kosaki.motohiro, torvalds, peterz, nrik.Berkhan, uclinux-dev, linux-kernel, linux-mm, hannes, riel Lee Schermerhorn <Lee.Schermerhorn@hp.com> wrote: > I just want to point out [again :)] that removing the ramfs pages from > the lru will prevent them from being migrated This is less of an issue for NOMMU kernels, since you can't migrate pages that are mapped. David ^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: [PATCH 0/2] Make the Unevictable LRU available on NOMMU 2009-03-20 16:24 ` David Howells @ 2009-03-20 18:30 ` Lee Schermerhorn -1 siblings, 0 replies; 101+ messages in thread From: Lee Schermerhorn @ 2009-03-20 18:30 UTC (permalink / raw) To: David Howells Cc: Minchan Kim, Andrew Morton, kosaki.motohiro, torvalds, peterz, nrik.Berkhan, uclinux-dev, linux-kernel, linux-mm, hannes, riel On Fri, 2009-03-20 at 16:24 +0000, David Howells wrote: > Lee Schermerhorn <Lee.Schermerhorn@hp.com> wrote: > > > I just want to point out [again :)] that removing the ramfs pages from > > the lru will prevent them from being migrated > > This is less of an issue for NOMMU kernels, since you can't migrate pages that > are mapped. Agreed. So, you could eliminate them [ramfs pages] from the lru for just the nommu kernels, if you wanted to go that route. Lee ^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: [PATCH 0/2] Make the Unevictable LRU available on NOMMU 2009-03-20 18:30 ` Lee Schermerhorn @ 2009-03-21 10:20 ` Johannes Weiner -1 siblings, 0 replies; 101+ messages in thread From: Johannes Weiner @ 2009-03-21 10:20 UTC (permalink / raw) To: Lee Schermerhorn Cc: David Howells, Minchan Kim, Andrew Morton, kosaki.motohiro, torvalds, peterz, nrik.Berkhan, uclinux-dev, linux-kernel, linux-mm, riel On Fri, Mar 20, 2009 at 02:30:15PM -0400, Lee Schermerhorn wrote: > On Fri, 2009-03-20 at 16:24 +0000, David Howells wrote: > > Lee Schermerhorn <Lee.Schermerhorn@hp.com> wrote: > > > > > I just want to point out [again :)] that removing the ramfs pages from > > > the lru will prevent them from being migrated > > > > This is less of an issue for NOMMU kernels, since you can't migrate pages that > > are mapped. > > > Agreed. So, you could eliminate them [ramfs pages] from the lru for > just the nommu kernels, if you wanted to go that route. These pages don't come with much overhead anymore when they sit on the unevictable list, right? So I don't see much point in special-casing them all over the place. I have a patchset that decouples the unevictable lru feature from mlock, enables the former on nommu and then makes sure ramfs pages go immediately to the unevictable list so they don't need the scanner to move them. This is just wiring up features we already have. I will send this Monday-ish; I need to test it more, especially on a NOMMU setup. Hannes ^ permalink raw reply [flat|nested] 101+ messages in thread
* [patch 1/3] mm: decouple unevictable lru from mmu 2009-03-21 10:20 ` Johannes Weiner @ 2009-03-22 20:13 ` Johannes Weiner -1 siblings, 0 replies; 101+ messages in thread From: Johannes Weiner @ 2009-03-22 20:13 UTC (permalink / raw) To: Andrew Morton Cc: linux-kernel, linux-mm, David Howells, Nick Piggin, KOSAKI Motohiro, Rik van Riel, Peter Zijlstra, MinChan Kim, Lee Schermerhorn Mlock is only one source of unevictable pages but with the unevictable lru enabled, mlock code is referenced unconditionally. Decouple the two so that the unevictable lru can work without mlock and thus on nommu setups where we still have unevictable pages from e.g. ramfs. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Cc: David Howells <dhowells@redhat.com> Cc: Nick Piggin <npiggin@suse.de> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Cc: Rik van Riel <riel@redhat.com> Cc: Peter Zijlstra <peterz@infradead.com> Cc: MinChan Kim <minchan.kim@gmail.com> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com> --- mm/Kconfig | 1 - mm/internal.h | 4 ++-- 2 files changed, 2 insertions(+), 3 deletions(-) diff --git a/mm/Kconfig b/mm/Kconfig index a5b7781..fbb190e 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -206,7 +206,6 @@ config VIRT_TO_BUS config UNEVICTABLE_LRU bool "Add LRU list to track non-evictable pages" default y - depends on MMU help Keeps unevictable pages off of the active and inactive pageout lists, so kswapd will not waste CPU time or have its balancing diff --git a/mm/internal.h b/mm/internal.h index 478223b..ceaa629 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -90,7 +90,7 @@ static inline void unevictable_migrate_page(struct page *new, struct page *old) } #endif -#ifdef CONFIG_UNEVICTABLE_LRU +#if defined(CONFIG_UNEVICTABLE_LRU) && defined(CONFIG_MMU) /* * Called only in fault path via page_evictable() for a new page * to determine if it's being mapped into a LOCKED vma. 
@@ -165,7 +165,7 @@ static inline void free_page_mlock(struct page *page) } } -#else /* CONFIG_UNEVICTABLE_LRU */ +#else /* CONFIG_UNEVICTABLE_LRU && CONFIG_MMU */ static inline int is_mlocked_vma(struct vm_area_struct *v, struct page *p) { return 0; -- 1.6.2.1.135.gde769 ^ permalink raw reply related [flat|nested] 101+ messages in thread
* Re: [patch 1/3] mm: decouple unevictable lru from mmu 2009-03-22 20:13 ` Johannes Weiner @ 2009-03-22 23:46 ` KOSAKI Motohiro -1 siblings, 0 replies; 101+ messages in thread From: KOSAKI Motohiro @ 2009-03-22 23:46 UTC (permalink / raw) To: Johannes Weiner Cc: kosaki.motohiro, Andrew Morton, linux-kernel, linux-mm, David Howells, Nick Piggin, Rik van Riel, Peter Zijlstra, MinChan Kim, Lee Schermerhorn > @@ -206,7 +206,6 @@ config VIRT_TO_BUS > config UNEVICTABLE_LRU > bool "Add LRU list to track non-evictable pages" > default y > - depends on MMU > help > Keeps unevictable pages off of the active and inactive pageout > lists, so kswapd will not waste CPU time or have its balancing > diff --git a/mm/internal.h b/mm/internal.h > index 478223b..ceaa629 100644 > --- a/mm/internal.h > +++ b/mm/internal.h David already made this portion and it is already merged in mmotm. Don't you work on mmotm? ^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: [patch 1/3] mm: decouple unevictable lru from mmu
  2009-03-22 23:46 ` KOSAKI Motohiro
@ 2009-03-23  0:14 ` Johannes Weiner
  -1 siblings, 0 replies; 101+ messages in thread
From: Johannes Weiner @ 2009-03-23 0:14 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Andrew Morton, linux-kernel, linux-mm, David Howells, Nick Piggin,
	Rik van Riel, Peter Zijlstra, MinChan Kim, Lee Schermerhorn

On Mon, Mar 23, 2009 at 08:46:06AM +0900, KOSAKI Motohiro wrote:
> > @@ -206,7 +206,6 @@ config VIRT_TO_BUS
> >  config UNEVICTABLE_LRU
> >  	bool "Add LRU list to track non-evictable pages"
> >  	default y
> > -	depends on MMU
> >  	help
> >  	  Keeps unevictable pages off of the active and inactive pageout
> >  	  lists, so kswapd will not waste CPU time or have its balancing
> > diff --git a/mm/internal.h b/mm/internal.h
> > index 478223b..ceaa629 100644
> > --- a/mm/internal.h
> > +++ b/mm/internal.h
>
> David already made this part and it is already merged in mmotm.
> Don't you work on mmotm?

Ah, stupid me.  I was even on the Cc for David's patches.  I missed
them, sorry.

David, why do we need two Kconfig symbols for mlock and the mlock page
bit?  Don't we always provide mlock on mmu and never on nommu?

Anyway, that is just out of curiosity.  Good that the change is
already done, so please ignore this patch.

	Hannes

^ permalink raw reply	[flat|nested] 101+ messages in thread
* Re: [patch 1/3] mm: decouple unevictable lru from mmu
  2009-03-22 23:46 ` KOSAKI Motohiro
@ 2009-03-23 10:48 ` David Howells
  -1 siblings, 0 replies; 101+ messages in thread
From: David Howells @ 2009-03-23 10:48 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: dhowells, KOSAKI Motohiro, Andrew Morton, linux-kernel, linux-mm,
	Nick Piggin, Rik van Riel, Peter Zijlstra, MinChan Kim,
	Lee Schermerhorn

Johannes Weiner <hannes@cmpxchg.org> wrote:

> David, why do we need two Kconfig symbols for mlock and the mlock page
> bit?  Don't we always provide mlock on mmu and never on nommu?

Because whilst PG_mlocked doesn't exist if we don't have mlock() because
we're in NOMMU mode, that does not imply that it _does_ exist if we _do_
have mlock(), as it's also contingent on having the unevictable LRU.

Not only that, CONFIG_HAVE_MLOCK is used in mm/internal.h to switch some
stuff out based on whether we have mlock() available or not - which is not
the same as whether we have PG_mlocked or not.

Mainly I thought it made the train of logic easier.  Note that neither
symbol is actually manually adjustable.

David

^ permalink raw reply	[flat|nested] 101+ messages in thread
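The dependency David describes can be sketched as two plain C predicates. This is only an illustration under stated assumptions: the function names are hypothetical, mlock() availability is assumed to follow the MMU setting, and the PG_mlocked bit is assumed to need the unevictable LRU on top of that. It is not the actual Kconfig or header logic.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: mlock() is available only on MMU builds. */
bool sketch_have_mlock(bool mmu)
{
	return mmu;
}

/*
 * Hypothetical model: the PG_mlocked page bit exists only when mlock()
 * is available AND the unevictable LRU is configured in.
 */
bool sketch_have_mlocked_page_bit(bool mmu, bool unevictable_lru)
{
	return sketch_have_mlock(mmu) && unevictable_lru;
}

/* Walks the three interesting configurations from the explanation. */
void sketch_mlock_symbols_selftest(void)
{
	/* MMU with unevictable LRU: both mlock() and the page bit. */
	assert(sketch_have_mlock(true));
	assert(sketch_have_mlocked_page_bit(true, true));

	/* MMU without unevictable LRU: mlock() but no page bit. */
	assert(sketch_have_mlock(true));
	assert(!sketch_have_mlocked_page_bit(true, false));

	/* NOMMU: neither, even with the unevictable LRU enabled. */
	assert(!sketch_have_mlock(false));
	assert(!sketch_have_mlocked_page_bit(false, true));
}
```

The middle case is the one that motivates two symbols: mlock() is present while the page bit is not, so a single symbol could not describe both.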
* [patch 2/3] ramfs-nommu: use generic lru cache
  2009-03-21 10:20 ` Johannes Weiner
@ 2009-03-22 20:13 ` Johannes Weiner
  -1 siblings, 0 replies; 101+ messages in thread
From: Johannes Weiner @ 2009-03-22 20:13 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-mm, David Howells, Nick Piggin,
	KOSAKI Motohiro, Rik van Riel, Peter Zijlstra, MinChan Kim,
	Lee Schermerhorn

Instead of open-coding the lru-list-add pagevec batching when
expanding a file mapping from zero, defer to the appropriate page
cache function that also takes care of adding the page to the lru
list.

This is cleaner, saves code and reduces the stack footprint by 16
words worth of pagevec.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.com>
Cc: MinChan Kim <minchan.kim@gmail.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
---
 fs/ramfs/file-nommu.c |   15 ++++-----------
 1 files changed, 4 insertions(+), 11 deletions(-)

diff --git a/fs/ramfs/file-nommu.c b/fs/ramfs/file-nommu.c
index 5d7c7ec..351192a 100644
--- a/fs/ramfs/file-nommu.c
+++ b/fs/ramfs/file-nommu.c
@@ -60,7 +60,6 @@ const struct inode_operations ramfs_file_inode_operations = {
  */
 int ramfs_nommu_expand_for_mapping(struct inode *inode, size_t newsize)
 {
-	struct pagevec lru_pvec;
 	unsigned long npages, xpages, loop, limit;
 	struct page *pages;
 	unsigned order;
@@ -103,24 +102,20 @@ int ramfs_nommu_expand_for_mapping(struct inode *inode, size_t newsize)
 	memset(data, 0, newsize);
 
 	/* attach all the pages to the inode's address space */
-	pagevec_init(&lru_pvec, 0);
 	for (loop = 0; loop < npages; loop++) {
 		struct page *page = pages + loop;
 
-		ret = add_to_page_cache(page, inode->i_mapping, loop, GFP_KERNEL);
+		ret = add_to_page_cache_lru(page, inode->i_mapping, loop,
+					GFP_KERNEL);
 		if (ret < 0)
 			goto add_error;
 
-		if (!pagevec_add(&lru_pvec, page))
-			__pagevec_lru_add_file(&lru_pvec);
-
 		/* prevent the page from being discarded on memory pressure */
 		SetPageDirty(page);
 
 		unlock_page(page);
 	}
 
-	pagevec_lru_add_file(&lru_pvec);
 	return 0;
 
 fsize_exceeded:
@@ -129,10 +124,8 @@ int ramfs_nommu_expand_for_mapping(struct inode *inode, size_t newsize)
 	return -EFBIG;
 
 add_error:
-	pagevec_lru_add_file(&lru_pvec);
-	page_cache_release(pages + loop);
-	for (loop++; loop < npages; loop++)
-		__free_page(pages + loop);
+	while (loop < npages)
+		__free_page(pages + loop++);
 	return ret;
 }
 
-- 
1.6.2.1.135.gde769

^ permalink raw reply related	[flat|nested] 101+ messages in thread
* Re: [patch 2/3] ramfs-nommu: use generic lru cache
  2009-03-22 20:13 ` Johannes Weiner
@ 2009-03-23  2:22 ` KOSAKI Motohiro
  -1 siblings, 0 replies; 101+ messages in thread
From: KOSAKI Motohiro @ 2009-03-23 2:22 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: kosaki.motohiro, Andrew Morton, linux-kernel, linux-mm,
	David Howells, Nick Piggin, Rik van Riel, Peter Zijlstra,
	MinChan Kim, Lee Schermerhorn

> Instead of open-coding the lru-list-add pagevec batching when
> expanding a file mapping from zero, defer to the appropriate page
> cache function that also takes care of adding the page to the lru
> list.
>
> This is cleaner, saves code and reduces the stack footprint by 16
> words worth of pagevec.

Looks good to me. Thanks, good patch.

^ permalink raw reply	[flat|nested] 101+ messages in thread
* [patch 3/3] mm: keep pages from unevictable mappings off the LRU lists
  2009-03-21 10:20 ` Johannes Weiner
@ 2009-03-22 20:13 ` Johannes Weiner
  -1 siblings, 0 replies; 101+ messages in thread
From: Johannes Weiner @ 2009-03-22 20:13 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-mm, David Howells, Nick Piggin,
	KOSAKI Motohiro, Rik van Riel, Peter Zijlstra, MinChan Kim,
	Lee Schermerhorn

Check if the mapping is evictable when initially adding page cache
pages to the LRU lists.  If that is not the case, add them to the
unevictable list immediately instead of leaving it up to the reclaim
code to move them there.

This is useful for ramfs and locked shmem which mark whole mappings as
unevictable and we know at fault time already that it is useless to
try reclaiming these pages.

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Nick Piggin <npiggin@suse.de>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.com>
Cc: MinChan Kim <minchan.kim@gmail.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
---
 mm/filemap.c |    4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 23acefe..8574530 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -506,7 +506,9 @@ int add_to_page_cache_lru(struct page *page, struct address_space *mapping,
 
 	ret = add_to_page_cache(page, mapping, offset, gfp_mask);
 	if (ret == 0) {
-		if (page_is_file_cache(page))
+		if (mapping_unevictable(mapping))
+			add_page_to_unevictable_list(page);
+		else if (page_is_file_cache(page))
 			lru_cache_add_file(page);
 		else
 			lru_cache_add_active_anon(page);
-- 
1.6.2.1.135.gde769

^ permalink raw reply related	[flat|nested] 101+ messages in thread
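The order of the checks in the patched branch can be modelled in user space roughly as follows. This is a minimal sketch, not kernel code: the enum and function names are hypothetical, and the two booleans stand in for what mapping_unevictable() and page_is_file_cache() would report.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the three destinations in the patched branch. */
enum sketch_lru_list {
	SKETCH_LRU_UNEVICTABLE,
	SKETCH_LRU_FILE,
	SKETCH_LRU_ACTIVE_ANON,
};

/*
 * Mirrors the order of the checks in the patch: the mapping-wide
 * unevictable flag is tested first and overrides the file/anon split.
 */
enum sketch_lru_list sketch_pick_list(bool mapping_unevictable, bool file_cache)
{
	if (mapping_unevictable)
		return SKETCH_LRU_UNEVICTABLE;
	if (file_cache)
		return SKETCH_LRU_FILE;
	return SKETCH_LRU_ACTIVE_ANON;
}

/* Exercises every combination of the two inputs. */
void sketch_pick_list_selftest(void)
{
	assert(sketch_pick_list(true, true) == SKETCH_LRU_UNEVICTABLE);
	assert(sketch_pick_list(true, false) == SKETCH_LRU_UNEVICTABLE);
	assert(sketch_pick_list(false, true) == SKETCH_LRU_FILE);
	assert(sketch_pick_list(false, false) == SKETCH_LRU_ACTIVE_ANON);
}
```

Before the patch only the last two outcomes were possible at this point; pages of an unevictable mapping reached the unevictable list later, when reclaim noticed them.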
* Re: [patch 3/3] mm: keep pages from unevictable mappings off the LRU lists
  2009-03-22 20:13 ` Johannes Weiner
@ 2009-03-23  0:44 ` Minchan Kim
  -1 siblings, 0 replies; 101+ messages in thread
From: Minchan Kim @ 2009-03-23 0:44 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, linux-kernel, linux-mm, David Howells, Nick Piggin,
	KOSAKI Motohiro, Rik van Riel, Peter Zijlstra, Lee Schermerhorn

Hmm,

This patch is a different matter from the previous patches in the series.
At first glance, it looked good to me.

I think add_to_page_cache_lru() has to be a fast path.
But how often would the ramfs and shmem functions be called?

I am concerned that this patch adds another burden to it,
so we need some numbers to weigh the pros and cons.

Any thoughts?

On Mon, Mar 23, 2009 at 5:13 AM, Johannes Weiner <hannes@cmpxchg.org> wrote:
> Check if the mapping is evictable when initially adding page cache
> pages to the LRU lists.  If that is not the case, add them to the
> unevictable list immediately instead of leaving it up to the reclaim
> code to move them there.
>
> This is useful for ramfs and locked shmem which mark whole mappings as
> unevictable and we know at fault time already that it is useless to
> try reclaiming these pages.
>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> Cc: David Howells <dhowells@redhat.com>
> Cc: Nick Piggin <npiggin@suse.de>
> Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
> Cc: Rik van Riel <riel@redhat.com>
> Cc: Peter Zijlstra <peterz@infradead.com>
> Cc: MinChan Kim <minchan.kim@gmail.com>
> Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
> ---
>  mm/filemap.c |    4 +++-
>  1 files changed, 3 insertions(+), 1 deletions(-)
>
> diff --git a/mm/filemap.c b/mm/filemap.c
> index 23acefe..8574530 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -506,7 +506,9 @@ int add_to_page_cache_lru(struct page *page, struct address_space *mapping,
>
>  	ret = add_to_page_cache(page, mapping, offset, gfp_mask);
>  	if (ret == 0) {
> -		if (page_is_file_cache(page))
> +		if (mapping_unevictable(mapping))
> +			add_page_to_unevictable_list(page);
> +		else if (page_is_file_cache(page))
>  			lru_cache_add_file(page);
>  		else
>  			lru_cache_add_active_anon(page);
> --
> 1.6.2.1.135.gde769
>

-- 
Kind regards,
Minchan Kim

^ permalink raw reply	[flat|nested] 101+ messages in thread
* Re: [patch 3/3] mm: keep pages from unevictable mappings off the LRU lists
  2009-03-23  0:44 ` Minchan Kim
@ 2009-03-23  2:21 ` KOSAKI Motohiro
  -1 siblings, 0 replies; 101+ messages in thread
From: KOSAKI Motohiro @ 2009-03-23 2:21 UTC (permalink / raw)
  To: Minchan Kim
  Cc: kosaki.motohiro, Johannes Weiner, Andrew Morton, linux-kernel,
	linux-mm, David Howells, Nick Piggin, Rik van Riel,
	Peter Zijlstra, Lee Schermerhorn

> Hmm,
>
> This patch is a different matter from the previous patches in the series.
> At first glance, it looked good to me.
>
> I think add_to_page_cache_lru() has to be a fast path.
> But how often would the ramfs and shmem functions be called?
>
> I am concerned that this patch adds another burden to it,
> so we need some numbers to weigh the pros and cons.
>
> Any thoughts?

This is exactly why the current code doesn't call add_page_to_unevictable_list().
add_page_to_unevictable_list() doesn't use a pagevec; that is needed to avoid a race.

So if the readahead path (i.e. add_to_page_cache_lru()) uses add_page_to_unevictable_list(),
it can cause a zone->lru_lock contention storm.

So unless somebody has good performance results, I won't ack this patch.

> On Mon, Mar 23, 2009 at 5:13 AM, Johannes Weiner <hannes@cmpxchg.org> wrote:
> > Check if the mapping is evictable when initially adding page cache
> > pages to the LRU lists.  If that is not the case, add them to the
> > unevictable list immediately instead of leaving it up to the reclaim
> > code to move them there.
> >
> > This is useful for ramfs and locked shmem which mark whole mappings as
> > unevictable and we know at fault time already that it is useless to
> > try reclaiming these pages.

^ permalink raw reply	[flat|nested] 101+ messages in thread
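The contention concern can be illustrated with a user-space model that only counts how often a lock would be taken. Everything in it is a hypothetical stand-in: the struct and function names are made up for this sketch, the counter replaces real zone->lru_lock round trips, and the batch size of 14 is an assumption chosen to fit the "16 words worth of pagevec" mentioned in patch 2/3. The race that keeps the unevictable list off the pagevec path is not modelled.

```c
#include <assert.h>

#define SKETCH_PAGEVEC_SIZE 14	/* assumed batch size, illustration only */

/* Hypothetical LRU: counts lock round trips instead of really locking. */
struct sketch_lru {
	unsigned long lock_acquisitions;
	unsigned long pages;
};

/* Hypothetical pagevec: only tracks how many pages are pending. */
struct sketch_pagevec {
	unsigned long nr;
};

/* Unbatched add: one lock round trip per page. */
void sketch_add_one(struct sketch_lru *lru)
{
	lru->lock_acquisitions++;	/* lock; add page; unlock */
	lru->pages++;
}

/* Drain all pending pages under a single lock round trip. */
void sketch_pagevec_flush(struct sketch_lru *lru, struct sketch_pagevec *pvec)
{
	if (pvec->nr == 0)
		return;
	lru->lock_acquisitions++;	/* lock; add pvec->nr pages; unlock */
	lru->pages += pvec->nr;
	pvec->nr = 0;
}

/* Batched add: queue the page and flush only when the batch is full. */
void sketch_pagevec_add(struct sketch_lru *lru, struct sketch_pagevec *pvec)
{
	assert(pvec->nr < SKETCH_PAGEVEC_SIZE);
	pvec->nr++;
	if (pvec->nr == SKETCH_PAGEVEC_SIZE)
		sketch_pagevec_flush(lru, pvec);
}

/* Adds two full batches both ways and compares the lock counts. */
void sketch_batching_selftest(void)
{
	struct sketch_lru direct = { 0, 0 };
	struct sketch_lru batched = { 0, 0 };
	struct sketch_pagevec pvec = { 0 };
	unsigned long i;

	for (i = 0; i < 2 * SKETCH_PAGEVEC_SIZE; i++) {
		sketch_add_one(&direct);
		sketch_pagevec_add(&batched, &pvec);
	}
	sketch_pagevec_flush(&batched, &pvec);	/* nothing pending: no-op */

	assert(direct.pages == batched.pages);
	assert(direct.lock_acquisitions == 2 * SKETCH_PAGEVEC_SIZE);
	assert(batched.lock_acquisitions == 2);
}
```

In this model the unbatched path takes the lock once per page, while the batched path takes it once per full batch. That ratio is the extra lock traffic a per-page list add would bring to a readahead loop, whatever the real batch size is.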
* Re: [patch 3/3] mm: keep pages from unevictable mappings off the LRU lists
  2009-03-23  2:21 ` KOSAKI Motohiro
@ 2009-03-23  8:42 ` Johannes Weiner
  -1 siblings, 0 replies; 101+ messages in thread
From: Johannes Weiner @ 2009-03-23 8:42 UTC (permalink / raw)
  To: KOSAKI Motohiro
  Cc: Minchan Kim, Andrew Morton, linux-kernel, linux-mm, David Howells,
	Nick Piggin, Rik van Riel, Peter Zijlstra, Lee Schermerhorn

On Mon, Mar 23, 2009 at 11:21:36AM +0900, KOSAKI Motohiro wrote:
> > Hmm,
> >
> > This patch is a different matter from the previous patches in the series.
> > At first glance, it looked good to me.
> >
> > I think add_to_page_cache_lru() has to be a fast path.
> > But how often would the ramfs and shmem functions be called?
> >
> > I am concerned that this patch adds another burden to it,
> > so we need some numbers to weigh the pros and cons.
> >
> > Any thoughts?
>
> This is exactly why the current code doesn't call add_page_to_unevictable_list().
> add_page_to_unevictable_list() doesn't use a pagevec; that is needed to avoid a race.
>
> So if the readahead path (i.e. add_to_page_cache_lru()) uses add_page_to_unevictable_list(),
> it can cause a zone->lru_lock contention storm.

How is it different from shrink_page_list()?  If readahead puts a
contiguous chunk of unevictable pages on the file lru, then
shrink_page_list() will likewise call add_page_to_unevictable_list() in
a loop.

^ permalink raw reply	[flat|nested] 101+ messages in thread
* Re: [patch 3/3] mm: keep pages from unevictable mappings off the LRU lists
  2009-03-23  8:42 ` Johannes Weiner
@ 2009-03-23  9:01 ` KOSAKI Motohiro
  -1 siblings, 0 replies; 101+ messages in thread
From: KOSAKI Motohiro @ 2009-03-23 9:01 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: kosaki.motohiro, Minchan Kim, Andrew Morton, linux-kernel,
	linux-mm, David Howells, Nick Piggin, Rik van Riel,
	Peter Zijlstra, Lee Schermerhorn

> On Mon, Mar 23, 2009 at 11:21:36AM +0900, KOSAKI Motohiro wrote:
> > > Hmm,
> > >
> > > This patch is a different matter from the previous patches in the series.
> > > At first glance, it looked good to me.
> > >
> > > I think add_to_page_cache_lru() has to be a fast path.
> > > But how often would the ramfs and shmem functions be called?
> > >
> > > I am concerned that this patch adds another burden to it,
> > > so we need some numbers to weigh the pros and cons.
> > >
> > > Any thoughts?
> >
> > This is exactly why the current code doesn't call add_page_to_unevictable_list().
> > add_page_to_unevictable_list() doesn't use a pagevec; that is needed to avoid a race.
> >
> > So if the readahead path (i.e. add_to_page_cache_lru()) uses add_page_to_unevictable_list(),
> > it can cause a zone->lru_lock contention storm.
>
> How is it different from shrink_page_list()?  If readahead puts a
> contiguous chunk of unevictable pages on the file lru, then
> shrink_page_list() will likewise call add_page_to_unevictable_list() in
> a loop.

It's a matter of probability.

readahead: we need to worry about
  (1) readahead vs readahead
  (2) readahead vs reclaim

vmscan: we need to worry about
  (3) background reclaim vs foreground reclaim

So (3) is a rarer event than (1) and (2).
Am I missing anything?

^ permalink raw reply	[flat|nested] 101+ messages in thread
* Re: [patch 3/3] mm: keep pages from unevictable mappings off the LRU lists 2009-03-23 9:01 ` KOSAKI Motohiro @ 2009-03-23 9:23 ` KOSAKI Motohiro -1 siblings, 0 replies; 101+ messages in thread From: KOSAKI Motohiro @ 2009-03-23 9:23 UTC (permalink / raw) To: KOSAKI Motohiro Cc: kosaki.motohiro, Johannes Weiner, Minchan Kim, Andrew Morton, linux-kernel, linux-mm, David Howells, Nick Piggin, Rik van Riel, Peter Zijlstra, Lee Schermerhorn > > > this is the just reason why current code don't call add_page_to_unevictable_list(). > > > add_page_to_unevictable_list() don't use pagevec. it is needed for avoiding race. > > > > > > then, if readahead path (i.e. add_to_page_cache_lru()) use add_page_to_unevictable_list(), > > > it can cause zone->lru_lock contention storm. > > > > How is it different then shrink_page_list()? If readahead put a > > contiguous chunk of unevictable pages to the file lru, then > > shrink_page_list() will as well call add_page_to_unevictable_list() in > > a loop. > > it's probability issue. > > readahead: we need to concern > (1) readahead vs readahead > (2) readahead vs reclaim > > vmscan: we need to concern > (3) background reclaim vs foreground reclaim > > So, (3) is rarely event than (1) and (2). > Am I missing anything? The explanation in my last mail was too poor, sorry. I don't dislike the concept of this patch, but it seems a bit naive about contention. If we can decrease the contention risk, I can ack it with pleasure. ^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: [patch 3/3] mm: keep pages from unevictable mappings off the LRU lists 2009-03-23 9:23 ` KOSAKI Motohiro @ 2009-03-26 0:48 ` Johannes Weiner -1 siblings, 0 replies; 101+ messages in thread From: Johannes Weiner @ 2009-03-26 0:48 UTC (permalink / raw) To: KOSAKI Motohiro Cc: Minchan Kim, Andrew Morton, linux-kernel, linux-mm, David Howells, Nick Piggin, Rik van Riel, Peter Zijlstra, Lee Schermerhorn On Mon, Mar 23, 2009 at 06:23:36PM +0900, KOSAKI Motohiro wrote: > > > > this is the just reason why current code don't call add_page_to_unevictable_list(). > > > > add_page_to_unevictable_list() don't use pagevec. it is needed for avoiding race. > > > > > > > > then, if readahead path (i.e. add_to_page_cache_lru()) use add_page_to_unevictable_list(), > > > > it can cause zone->lru_lock contention storm. > > > > > > How is it different then shrink_page_list()? If readahead put a > > > contiguous chunk of unevictable pages to the file lru, then > > > shrink_page_list() will as well call add_page_to_unevictable_list() in > > > a loop. > > > > it's probability issue. > > > > readahead: we need to concern > > (1) readahead vs readahead > > (2) readahead vs reclaim > > > > vmscan: we need to concern > > (3) background reclaim vs foreground reclaim > > > > So, (3) is rarely event than (1) and (2). > > Am I missing anything? > > my last mail explanation is too poor. sorry. > I don't dislike this patch concept. but it seems a bit naive against contention. > if we can decrease contention risk, I can ack with presure. My understanding is that when the mapping is truncated before the pages are scanned for reclaim, then we have a net increase of risk for the contention storm you describe. Otherwise, we moved the contention from the reclaim path to the fault path. I don't know how likely readahead is. It only happens when the mapping was blown up with truncate, otherwise only writes add to the cache in the ramfs case. I will further look into this. 
Hannes ^ permalink raw reply [flat|nested] 101+ messages in thread
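The contention argument above comes down to how often zone->lru_lock is taken per page added. As a rough userspace sketch only (not kernel code), the model below counts lock acquisitions for a per-page path, as the thread describes for add_page_to_unevictable_list(), against a pagevec-style batched path. The names lru_lock_model, add_pages_unbatched and add_pages_batched are hypothetical, and the batch size of 14 is an assumption about PAGEVEC_SIZE in kernels of this era.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 14 matches PAGEVEC_SIZE in kernels of this era. */
#define PAGEVEC_SIZE 14

/* Hypothetical stand-in for zone->lru_lock: it only counts acquisitions. */
struct lru_lock_model {
    unsigned long acquisitions;
};

/* Per-page path: every page added costs one lock round trip. */
static unsigned long add_pages_unbatched(struct lru_lock_model *lock,
                                         size_t nr_pages)
{
    assert(lock != NULL);
    for (size_t i = 0; i < nr_pages; i++)
        lock->acquisitions++;
    return lock->acquisitions;
}

/*
 * Batched path: pages are queued locally and the lock is taken once
 * per full batch, plus once more to flush a partial batch at the end.
 */
static unsigned long add_pages_batched(struct lru_lock_model *lock,
                                       size_t nr_pages)
{
    size_t queued = 0;

    assert(lock != NULL);
    for (size_t i = 0; i < nr_pages; i++) {
        if (++queued == PAGEVEC_SIZE) {
            lock->acquisitions++;   /* flush a full batch */
            queued = 0;
        }
    }
    if (queued)
        lock->acquisitions++;       /* flush the remainder */
    return lock->acquisitions;
}
```

In this model, adding 32 pages costs 32 acquisitions on the per-page path and 3 on the batched path. How much that matters in practice depends on how many CPUs run these paths concurrently, which the model does not capture.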
* Re: [patch 2/3] ramfs-nommu: use generic lru cache 2009-03-21 10:20 ` Johannes Weiner @ 2009-03-23 10:40 ` David Howells -1 siblings, 0 replies; 101+ messages in thread From: David Howells @ 2009-03-23 10:40 UTC (permalink / raw) To: Johannes Weiner Cc: dhowells, Andrew Morton, linux-kernel, linux-mm, Nick Piggin, KOSAKI Motohiro, Rik van Riel, Peter Zijlstra, MinChan Kim, Lee Schermerhorn Johannes Weiner <hannes@cmpxchg.org> wrote: > Instead of open-coding the lru-list-add pagevec batching when > expanding a file mapping from zero, defer to the appropriate page > cache function that also takes care of adding the page to the lru > list. > > This is cleaner, saves code and reduces the stack footprint by 16 > words worth of pagevec. Acked-by: David Howells <dhowells@redhat.com> ^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: [patch 3/3] mm: keep pages from unevictable mappings off the LRU lists 2009-03-21 10:20 ` Johannes Weiner @ 2009-03-23 10:53 ` David Howells -1 siblings, 0 replies; 101+ messages in thread From: David Howells @ 2009-03-23 10:53 UTC (permalink / raw) To: Johannes Weiner Cc: dhowells, Andrew Morton, linux-kernel, linux-mm, Nick Piggin, KOSAKI Motohiro, Rik van Riel, Peter Zijlstra, MinChan Kim, Lee Schermerhorn Johannes Weiner <hannes@cmpxchg.org> wrote: > - if (page_is_file_cache(page)) > + if (mapping_unevictable(mapping)) > + add_page_to_unevictable_list(page); > + else if (page_is_file_cache(page)) It would be nice to avoid adding an extra test and branch in here. This function is used a lot, and quite often we know the answer to the first test before we even get here. David ^ permalink raw reply [flat|nested] 101+ messages in thread
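To make the objection concrete, here is a small userspace model of the list selection before and after the quoted hunk. It is a sketch under stated assumptions, not the kernel function: mapping_model, page_model, lru_target, pick_lru_vanilla and pick_lru_patched are hypothetical names, and the anon fallback in the else branch is assumed because it is not shown in the quoted diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model types; the real code works on struct page and
 * struct address_space. */
enum lru_target {
    LRU_TARGET_ANON,
    LRU_TARGET_FILE,
    LRU_TARGET_UNEVICTABLE
};

struct mapping_model {
    bool unevictable;   /* models what mapping_unevictable() reports */
};

struct page_model {
    const struct mapping_model *mapping;
    bool file_backed;   /* models what page_is_file_cache() reports */
};

/* Before the patch: a single test picks file or anon. */
static enum lru_target pick_lru_vanilla(const struct page_model *page)
{
    assert(page != NULL);
    return page->file_backed ? LRU_TARGET_FILE : LRU_TARGET_ANON;
}

/*
 * After the patch: one extra test and branch in front. For the common
 * case of an evictable mapping that branch is not taken.
 */
static enum lru_target pick_lru_patched(const struct page_model *page)
{
    assert(page != NULL && page->mapping != NULL);
    if (page->mapping->unevictable)
        return LRU_TARGET_UNEVICTABLE;
    return pick_lru_vanilla(page);
}
```

For pages of evictable mappings both variants pick the same list, so the only cost on the common path is the additional test itself.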
* Re: [patch 3/3] mm: keep pages from unevictable mappings off the LRU lists 2009-03-23 10:53 ` David Howells @ 2009-03-26 0:01 ` Johannes Weiner -1 siblings, 0 replies; 101+ messages in thread From: Johannes Weiner @ 2009-03-26 0:01 UTC (permalink / raw) To: David Howells Cc: Andrew Morton, linux-kernel, linux-mm, Nick Piggin, KOSAKI Motohiro, Rik van Riel, Peter Zijlstra, MinChan Kim, Lee Schermerhorn On Mon, Mar 23, 2009 at 10:53:27AM +0000, David Howells wrote: > Johannes Weiner <hannes@cmpxchg.org> wrote: > > > - if (page_is_file_cache(page)) > > + if (mapping_unevictable(mapping)) > > + add_page_to_unevictable_list(page); > > + else if (page_is_file_cache(page)) > > It would be nice to avoid adding an extra test and branch in here. This > function is used a lot, and quite often we know the answer to the first test > before we even get here. Yes, I thought about that too. So I mounted a tmpfs and dd'd /dev/zero to a file on it until it ran out of space (around 900M, without swapping), then deleted the file again. I did this in a tight loop and profiled it. I couldn't think of a way that would exercise add_to_page_cache_lru() more. I hope I didn't overlook anything; please correct me if I am wrong. If I did not, then the extra check for unevictable mappings doesn't make a measurable difference. The function had a share of 0.2033% on the vanilla kernel and 0.1953% on the patched kernel. Hannes ^ permalink raw reply [flat|nested] 101+ messages in thread
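The fill-and-delete workload described above could be approximated by a small C helper like the one below. This is a minimal sketch, not the commands actually used in the test: fill_with_zeros and fill_and_delete_loop are hypothetical names, the 4096-byte write size is arbitrary, and the max_bytes cap is an addition so the loop can also be tried on a filesystem that never fills up.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Write zeros to 'path' until the filesystem reports an error such as
 * ENOSPC or until max_bytes have been written. Returns the number of
 * bytes written, or -1 if the file could not be opened.
 */
static long long fill_with_zeros(const char *path, long long max_bytes)
{
    static const char zeros[4096];  /* zero-initialized buffer */
    long long total = 0;
    int fd;

    assert(path != NULL);
    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    while (total < max_bytes) {
        size_t want = sizeof(zeros);
        ssize_t n;

        if ((long long)want > max_bytes - total)
            want = (size_t)(max_bytes - total);
        n = write(fd, zeros, want);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            break;                  /* e.g. ENOSPC: the fs is full */
        }
        if (n == 0)
            break;
        total += n;
    }
    close(fd);
    return total;
}

/* Repeat fill-then-delete; returns total bytes written or -1 on error. */
static long long fill_and_delete_loop(const char *path, long long max_bytes,
                                      int iterations)
{
    long long grand_total = 0;

    for (int i = 0; i < iterations; i++) {
        long long written = fill_with_zeros(path, max_bytes);

        if (written < 0)
            return -1;
        grand_total += written;
        if (unlink(path) != 0)
            return -1;
    }
    return grand_total;
}
```

Pointed at a file on a tmpfs mount with a large max_bytes, each iteration adds page cache pages until the mount is full and then drops them again, which is the pattern the profile above was meant to stress.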
* Re: [patch 3/3] mm: keep pages from unevictable mappings off the LRU lists 2009-03-26 0:01 ` Johannes Weiner @ 2009-03-26 8:56 ` KOSAKI Motohiro -1 siblings, 0 replies; 101+ messages in thread From: KOSAKI Motohiro @ 2009-03-26 8:56 UTC (permalink / raw) To: Johannes Weiner Cc: kosaki.motohiro, David Howells, Andrew Morton, linux-kernel, linux-mm, Nick Piggin, Rik van Riel, Peter Zijlstra, MinChan Kim, Lee Schermerhorn > On Mon, Mar 23, 2009 at 10:53:27AM +0000, David Howells wrote: > > Johannes Weiner <hannes@cmpxchg.org> wrote: > > > > > - if (page_is_file_cache(page)) > > > + if (mapping_unevictable(mapping)) > > > + add_page_to_unevictable_list(page); > > > + else if (page_is_file_cache(page)) > > > > It would be nice to avoid adding an extra test and branch in here. This > > function is used a lot, and quite often we know the answer to the first test > > before we even get here. > > Yes, I thought about that too. So I mounted a tmpfs and dd'd > /dev/zero to a file on it until it ran out of space (around 900M, > without swapping), deleted the file again. I did this in a tight loop > and profiled it. > > I couldn't think of a way that would excercise add_to_page_cache_lru() > more, I hope I didn't overlook anything, please correct if I am wrong. > > If I was not, than the extra checking for unevictable mappings doesn't > make a measurable difference. The function on the vanilla kernel had > a share of 0.2033%, on the patched kernel 0.1953%. May I ask how many CPUs your test box has? In general, the likelihood of lock contention depends on the number of CPUs, so Lee and I were mainly talking about large boxes. ^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: [patch 3/3] mm: keep pages from unevictable mappings off the LRU lists 2009-03-26 8:56 ` KOSAKI Motohiro @ 2009-03-26 10:36 ` Johannes Weiner -1 siblings, 0 replies; 101+ messages in thread From: Johannes Weiner @ 2009-03-26 10:36 UTC (permalink / raw) To: KOSAKI Motohiro Cc: David Howells, Andrew Morton, linux-kernel, linux-mm, Nick Piggin, Rik van Riel, Peter Zijlstra, MinChan Kim, Lee Schermerhorn On Thu, Mar 26, 2009 at 05:56:52PM +0900, KOSAKI Motohiro wrote: > > On Mon, Mar 23, 2009 at 10:53:27AM +0000, David Howells wrote: > > > Johannes Weiner <hannes@cmpxchg.org> wrote: > > > > > > > - if (page_is_file_cache(page)) > > > > + if (mapping_unevictable(mapping)) > > > > + add_page_to_unevictable_list(page); > > > > + else if (page_is_file_cache(page)) > > > > > > It would be nice to avoid adding an extra test and branch in here. This > > > function is used a lot, and quite often we know the answer to the first test > > > before we even get here. > > > > Yes, I thought about that too. So I mounted a tmpfs and dd'd > > /dev/zero to a file on it until it ran out of space (around 900M, > > without swapping), deleted the file again. I did this in a tight loop > > and profiled it. > > > > I couldn't think of a way that would excercise add_to_page_cache_lru() > > more, I hope I didn't overlook anything, please correct if I am wrong. > > > > If I was not, than the extra checking for unevictable mappings doesn't > > make a measurable difference. The function on the vanilla kernel had > > a share of 0.2033%, on the patched kernel 0.1953%. > > May I ask the number of the cpu of your test box. > In general, lock contention possibility depend on #ofCPUs. Yes, sure. In this test I tried to find out how much this extra branch makes a difference for the common path (untaken), though. I have not tried to instrument the lock contention. But this will be done with a quadcore system. > So, I and lee mainly talked about large box. 
Yeah, I don't have such a thing ;) Hannes ^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: [PATCH 0/2] Make the Unevictable LRU available on NOMMU 2009-03-21 10:20 ` Johannes Weiner @ 2009-03-23 20:07 ` Lee Schermerhorn -1 siblings, 0 replies; 101+ messages in thread From: Lee Schermerhorn @ 2009-03-23 20:07 UTC (permalink / raw) To: Johannes Weiner Cc: David Howells, Minchan Kim, Andrew Morton, kosaki.motohiro, torvalds, peterz, Enrik.Berkhan, uclinux-dev, linux-kernel, linux-mm, riel On Sat, 2009-03-21 at 11:20 +0100, Johannes Weiner wrote: > On Fri, Mar 20, 2009 at 02:30:15PM -0400, Lee Schermerhorn wrote: > > On Fri, 2009-03-20 at 16:24 +0000, David Howells wrote: > > > Lee Schermerhorn <Lee.Schermerhorn@hp.com> wrote: > > > > > > > I just want to point out [again :)] that removing the ramfs pages from > > > > the lru will prevent them from being migrated > > > > > > This is less of an issue for NOMMU kernels, since you can't migrate pages that > > > are mapped. > > > > > > Agreed. So, you could eliminate them [ramfs pages] from the lru for > > just the nommu kernels, if you wanted to go that route. > > These pages don't come with much overhead anymore when they sit on the > unevictable list, right? So I don't see much point in special casing > them all over the place. I agree: not much overhead; no NEED to special case. I was only agreeing with David, that it would be OK to keep them off the LRU for NOMMU kernels. > > I have a patchset that decouples the unevictable lru feature from > mlock, enables the latter on nommu and then makes sure ramfs pages go > immediately to the unevictable list so they don't need the scanner to > move them. This is just wiring up of features we already have. Yeah. I didn't do it that way, because I didn't see any benefit in doing that for ram disk pages. If one doesn't run vmscan, having the pages on the normal lru doesn't hurt. If you do need to run vmscan, moving them to the unevictable list from there seems the least of your problems :). And doing it in the pagevec flush function adds overhead to the fault path. 
Granted, it's amortized over PAGEVEC_SIZE pages. It would probably be worth measuring the performance cost. And any code size increase--NOMMU kernel users might care about that. > > I will sent this mondayish, need to test it more especially on a NOMMU > setup. Saw them. Will take a look... Lee ^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded 2009-03-12 0:35 ` Minchan Kim @ 2009-03-13 11:53 ` David Howells -1 siblings, 0 replies; 101+ messages in thread From: David Howells @ 2009-03-13 11:53 UTC (permalink / raw) To: KOSAKI Motohiro Cc: dhowells, Minchan Kim, Andrew Morton, torvalds, peterz, Enrik.Berkhan, uclinux-dev, linux-kernel, linux-mm, Johannes Weiner, Rik van Riel, Lee Schermerhorn KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote: > David, Could you please try following patch if you have NOMMU machine? > it is straightforward porting to nommu. Is this patch actually sufficient, though? Surely it requires an alteration to ramfs to mark the page as being unevictable? David ^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded 2009-03-13 11:53 ` David Howells @ 2009-03-13 22:49 ` Johannes Weiner -1 siblings, 0 replies; 101+ messages in thread From: Johannes Weiner @ 2009-03-13 22:49 UTC (permalink / raw) To: David Howells Cc: KOSAKI Motohiro, Minchan Kim, Andrew Morton, torvalds, peterz, Enrik.Berkhan, uclinux-dev, linux-kernel, linux-mm, Rik van Riel, Lee Schermerhorn On Fri, Mar 13, 2009 at 11:53:02AM +0000, David Howells wrote: > KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote: > > > David, Could you please try following patch if you have NOMMU machine? > > it is straightforward porting to nommu. > > Is this patch actually sufficient, though? Surely it requires an alteration > to ramfs to mark the page as being unevictable? ramfs already marks the whole address space of each inode as unevictable, see ramfs_get_inode(). The reclaim code will regard this when the config option is enabled. Hannes ^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded 2009-03-11 15:30 [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded David Howells 2009-03-11 17:26 ` Johannes Weiner 2009-03-11 22:03 ` Andrew Morton @ 2009-03-12 0:08 ` Andrew Morton 2009-03-12 7:12 ` Berkhan, Enrik (GE Infra, Oil & Gas) 2009-03-12 12:25 ` David Howells 3 siblings, 1 reply; 101+ messages in thread From: Andrew Morton @ 2009-03-12 0:08 UTC (permalink / raw) To: David Howells Cc: torvalds, peterz, Enrik.Berkhan, dhowells, uclinux-dev, linux-kernel On Wed, 11 Mar 2009 15:30:35 +0000 David Howells <dhowells@redhat.com> wrote: > From: Enrik Berkhan <Enrik.Berkhan@ge.com> > > The pages attached to a ramfs inode's pagecache by truncation from nothing - as > done by SYSV SHM for example - may get discarded under memory pressure. > > The problem is that the pages are not marked dirty. Anything that creates data > in an MMU-based ramfs will cause the pages holding that data will cause the > set_page_dirty() aop to be called. > > For the NOMMU-based mmap, set_page_dirty() may be called by write(), but it > won't be called by page-writing faults on writable mmaps, and it isn't called > by ramfs_nommu_expand_for_mapping() when a file is being truncated from nothing > to allocate a contiguous run. > > The solution is to mark the pages dirty at the point of allocation by > the truncation code. 
> > Signed-off-by: Enrik Berkhan <Enrik.Berkhan@ge.com> > Signed-off-by: David Howells <dhowells@redhat.com> > --- > > fs/ramfs/file-nommu.c | 3 +++ > 1 files changed, 3 insertions(+), 0 deletions(-) > > > diff --git a/fs/ramfs/file-nommu.c b/fs/ramfs/file-nommu.c > index b9b567a..90d72be 100644 > --- a/fs/ramfs/file-nommu.c > +++ b/fs/ramfs/file-nommu.c > @@ -114,6 +114,9 @@ int ramfs_nommu_expand_for_mapping(struct inode *inode, size_t newsize) > if (!pagevec_add(&lru_pvec, page)) > __pagevec_lru_add_file(&lru_pvec); > > + /* prevent the page from being discarded on memory pressure */ > + SetPageDirty(page); > + > unlock_page(page); > } Was there a specific reason for using the low-level SetPageDirty()? On the write() path, ramfs pages will be dirtied by simple_commit_write()'s set_page_dirty(), which calls __set_page_dirty_no_writeback(). It just so happens that __set_page_dirty_no_writeback() is equivalent to a simple SetPageDirty() - it bypasses all the extra things which we do for normal permanent-storage-backed pages. But I'd have thought that it would be cleaner and more maintainable (albeit a bit slower) to go through the a_ops? ^ permalink raw reply [flat|nested] 101+ messages in thread
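A minimal userspace sketch of the two dirtying paths compared above, assuming simplified types: the structures imitate the kernel's but are not its definitions, and every model_ name is hypothetical. It only illustrates why, for a mapping whose set_page_dirty hook does no writeback accounting, going through the a_ops ends in the same page state as setting the flag directly.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: these types imitate, but are not, the kernel's. */
struct page;

struct address_space_operations {
	/* Returns 1 if the page was newly dirtied, 0 if already dirty. */
	int (*set_page_dirty)(struct page *page);
};

struct address_space {
	const struct address_space_operations *a_ops;
};

struct page {
	bool dirty;
	struct address_space *mapping;
};

/* Low-level path: set the flag and nothing else. */
static void model_SetPageDirty(struct page *page)
{
	page->dirty = true;
}

/* Models the ramfs hook: no writeback accounting, just the flag. */
static int model_set_page_dirty_no_writeback(struct page *page)
{
	if (page->dirty)
		return 0;
	model_SetPageDirty(page);
	return 1;
}

/* High-level path: dispatch through the mapping's a_ops when one is set. */
static int model_set_page_dirty(struct page *page)
{
	struct address_space *mapping = page->mapping;

	if (mapping && mapping->a_ops && mapping->a_ops->set_page_dirty)
		return mapping->a_ops->set_page_dirty(page);
	if (page->dirty)
		return 0;
	model_SetPageDirty(page);
	return 1;
}

static const struct address_space_operations model_ramfs_aops = {
	.set_page_dirty = model_set_page_dirty_no_writeback,
};

/* Demo: true when both paths leave a page in the same (dirty) state. */
static bool model_paths_agree(void)
{
	struct address_space mapping = { .a_ops = &model_ramfs_aops };
	struct page low = { .dirty = false, .mapping = &mapping };
	struct page high = { .dirty = false, .mapping = &mapping };

	model_SetPageDirty(&low);
	model_set_page_dirty(&high);
	return low.dirty && high.dirty;
}
```

The indirect path costs a pointer chase and a call, but it keeps working if the mapping's hook ever gains extra behaviour, which is the maintainability argument made in the message.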
* RE: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded 2009-03-12 0:08 ` Andrew Morton @ 2009-03-12 7:12 ` Berkhan, Enrik (GE Infra, Oil & Gas) 2009-03-12 11:29 ` [uClinux-dev] " Jamie Lokier 0 siblings, 1 reply; 101+ messages in thread From: Berkhan, Enrik (GE Infra, Oil & Gas) @ 2009-03-12 7:12 UTC (permalink / raw) To: Andrew Morton, David Howells Cc: torvalds, peterz, dhowells, uclinux-dev, linux-kernel Andrew Morton wrote: > On Wed, 11 Mar 2009 15:30:35 +0000 > David Howells <dhowells@redhat.com> wrote: >> From: Enrik Berkhan <Enrik.Berkhan@ge.com> >> >> The solution is to mark the pages dirty at the point of allocation by >> the truncation code. > > Was there a specific reason for using the low-level SetPageDirty()? No, no specific reason. It was just my first try of a fix after spotting the problem. After a short discussion with David, we decided to wait for others' comments on using the low-/high-level approach. Enrik ^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: [uClinux-dev] RE: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded 2009-03-12 7:12 ` Berkhan, Enrik (GE Infra, Oil & Gas) @ 2009-03-12 11:29 ` Jamie Lokier 2009-03-12 11:50 ` Peter Zijlstra 0 siblings, 1 reply; 101+ messages in thread From: Jamie Lokier @ 2009-03-12 11:29 UTC (permalink / raw) To: uClinux development list Cc: Andrew Morton, David Howells, peterz, torvalds, linux-kernel Berkhan, Enrik (GE Infra, Oil & Gas) wrote: > Andrew Morton wrote: > > On Wed, 11 Mar 2009 15:30:35 +0000 > > David Howells <dhowells@redhat.com> wrote: > >> From: Enrik Berkhan <Enrik.Berkhan@ge.com> > >> > >> The solution is to mark the pages dirty at the point of allocation by > >> the truncation code. > > > > Was there a specific reason for using the low-level SetPageDirty()? > > No, no specific reason. It was just my first try of a fix after spotting > the problem. After a short discussion with David, we decided to wait for > others' comments on using the low-/high-level approach. Tangentially related... Does the vm pageout logic include or skip these "dirty" pages looking for candidates to flush to storage? What about with MMU? -- Jamie ^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: [uClinux-dev] RE: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded 2009-03-12 11:29 ` [uClinux-dev] " Jamie Lokier @ 2009-03-12 11:50 ` Peter Zijlstra 2009-03-12 23:20 ` Minchan Kim 0 siblings, 1 reply; 101+ messages in thread From: Peter Zijlstra @ 2009-03-12 11:50 UTC (permalink / raw) To: Jamie Lokier Cc: uClinux development list, Andrew Morton, David Howells, torvalds, linux-kernel On Thu, 2009-03-12 at 11:29 +0000, Jamie Lokier wrote: > Berkhan, Enrik (GE Infra, Oil & Gas) wrote: > > Andrew Morton wrote: > > > On Wed, 11 Mar 2009 15:30:35 +0000 > > > David Howells <dhowells@redhat.com> wrote: > > >> From: Enrik Berkhan <Enrik.Berkhan@ge.com> > > >> > > >> The solution is to mark the pages dirty at the point of allocation by > > >> the truncation code. > > > > > > Was there a specific reason for using the low-level SetPageDirty()? > > > > No, no specific reason. It was just my first try of a fix after spotting > > the problem. After a short discussion with David, we decided to wait for > > others' comments on using the low-/high-level approach. > > Tangentially related... > > Does the vm pageout logic include or skip these "dirty" pages looking > for candidates to flush to storage? What about with MMU? Includes them, regular pageout will try to do the writeout to clean them and then discard them. The ramfs stuff is rather icky in that it adds the pages to the aging list, marks them dirty, but does not provide a writeout method. This will make the paging code scan over them (continuously) trying to clean them, failing that (lack of writeout method) and putting them back on the list. ^ permalink raw reply [flat|nested] 101+ messages in thread
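The rescanning behaviour described above can be modelled in a few lines of userspace C. This is a sketch under stated assumptions, not the kernel's reclaim code: the list is a plain array, the names are hypothetical, and a NULL writepage pointer stands in for a mapping without a writeout method.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; names are hypothetical stand-ins for kernel ones. */
struct model_page {
	bool dirty;
	bool on_lru;
	/* NULL models a mapping with no writeout method, as with ramfs. */
	bool (*writepage)(struct model_page *page);
};

struct scan_result {
	size_t scanned;
	size_t reclaimed;
};

/* One pass over the list: clean pages are dropped, dirty ones need writeout. */
static struct scan_result model_scan_lru(struct model_page *pages, size_t n)
{
	struct scan_result res = { 0, 0 };

	for (size_t i = 0; i < n; i++) {
		struct model_page *page = &pages[i];

		if (!page->on_lru)
			continue;
		res.scanned++;

		if (page->dirty) {
			/* No writeout method, or writeout failed: put it back. */
			if (!page->writepage || !page->writepage(page))
				continue;
			page->dirty = false;
		}

		page->on_lru = false;	/* discarded */
		res.reclaimed++;
	}
	return res;
}

/* Demo: totals over several passes on four pages without a writeout method. */
static struct scan_result model_demo(bool dirty, unsigned int passes)
{
	struct model_page pages[4];
	struct scan_result total = { 0, 0 };

	for (size_t i = 0; i < 4; i++)
		pages[i] = (struct model_page){ .dirty = dirty, .on_lru = true,
						.writepage = NULL };

	for (unsigned int p = 0; p < passes; p++) {
		struct scan_result r = model_scan_lru(pages, 4);

		total.scanned += r.scanned;
		total.reclaimed += r.reclaimed;
	}
	return total;
}
```

With dirty pages, every pass rescans all four pages and reclaims none, which is the wasted work complained about. With clean pages, the first pass discards all four, which corresponds to the data loss the original patch addresses.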
* Re: [uClinux-dev] RE: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded 2009-03-12 11:50 ` Peter Zijlstra @ 2009-03-12 23:20 ` Minchan Kim 2009-03-13 7:56 ` Peter Zijlstra 0 siblings, 1 reply; 101+ messages in thread From: Minchan Kim @ 2009-03-12 23:20 UTC (permalink / raw) To: Peter Zijlstra Cc: Jamie Lokier, uClinux development list, Andrew Morton, David Howells, torvalds, linux-kernel Hi, Peter. On Thu, 12 Mar 2009 12:50:08 +0100 Peter Zijlstra <peterz@infradead.org> wrote: > On Thu, 2009-03-12 at 11:29 +0000, Jamie Lokier wrote: > > Berkhan, Enrik (GE Infra, Oil & Gas) wrote: > > > Andrew Morton wrote: > > > > On Wed, 11 Mar 2009 15:30:35 +0000 > > > > David Howells <dhowells@redhat.com> wrote: > > > >> From: Enrik Berkhan <Enrik.Berkhan@ge.com> > > > >> > > > >> The solution is to mark the pages dirty at the point of allocation by > > > >> the truncation code. > > > > > > > > Was there a specific reason for using the low-level SetPageDirty()? > > > > > > No, no specific reason. It was just my first try of a fix after spotting > > > the problem. After a short discussion with David, we decided to wait for > > > others' comments on using the low-/high-level approach. > > > > Tangentially related... > > > > Does the vm pageout logic include or skip these "dirty" pages looking > > for candidates to flush to storage? What about with MMU? > > Includes them, regular pageout will try to do the writeout to clean them > and then discard them. > > The ramfs stuff is rather icky in that it adds the pages to the aging > list, marks them dirty, but does not provide a writeout method. > > This will make the paging code scan over them (continuously) trying to > clean them, failing that (lack of writeout method) and putting them back > on the list. It isn't true any more. UNEVICTABLE_LRU will move ramfs's pages from the LRU to the unevictable list. Couldn't we solve this problem if NOMMU supported CONFIG_UNEVICTABLE_LRU?
-- Kinds Regards Minchan Kim ^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: [uClinux-dev] RE: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded 2009-03-12 23:20 ` Minchan Kim @ 2009-03-13 7:56 ` Peter Zijlstra 2009-03-13 9:17 ` Minchan Kim 0 siblings, 1 reply; 101+ messages in thread From: Peter Zijlstra @ 2009-03-13 7:56 UTC (permalink / raw) To: Minchan Kim Cc: Jamie Lokier, uClinux development list, Andrew Morton, David Howells, torvalds, linux-kernel On Fri, 2009-03-13 at 08:20 +0900, Minchan Kim wrote: > > > Does the vm pageout logic include or skip these "dirty" pages looking > > > for candidates to flush to storage? What about with MMU? > > > > Includes them, regular pageout will try to do the writeout to clean them > > and then discard them. > > > > The ramfs stuff is rather icky in that it adds the pages to the aging > > list, marks them dirty, but does not provide a writeout method. > > > > This will make the paging code scan over them (continuously) trying to > > clean them, failing that (lack of writeout method) and putting them back > > on the list. > > It ins't true any more. > UNEVICTABLE_LRU will move ramfs's page from LRU to unevictable list. > Couldn't we solve this problem if NOMMU can support CONFIG_UNEVICTABLE_LRU ? That's more of a band-aid than a solution, no? They should never have been on the list to begin with. ^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: [uClinux-dev] RE: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded 2009-03-13 7:56 ` Peter Zijlstra @ 2009-03-13 9:17 ` Minchan Kim 0 siblings, 0 replies; 101+ messages in thread From: Minchan Kim @ 2009-03-13 9:17 UTC (permalink / raw) To: Peter Zijlstra Cc: Jamie Lokier, uClinux development list, Andrew Morton, David Howells, torvalds, linux-kernel, KOSAKI Motohiro On Fri, Mar 13, 2009 at 4:56 PM, Peter Zijlstra <peterz@infradead.org> wrote: > On Fri, 2009-03-13 at 08:20 +0900, Minchan Kim wrote: > >> > > Does the vm pageout logic include or skip these "dirty" pages looking >> > > for candidates to flush to storage? What about with MMU? >> > >> > Includes them, regular pageout will try to do the writeout to clean them >> > and then discard them. >> > >> > The ramfs stuff is rather icky in that it adds the pages to the aging >> > list, marks them dirty, but does not provide a writeout method. >> > >> > This will make the paging code scan over them (continuously) trying to >> > clean them, failing that (lack of writeout method) and putting them back >> > on the list. >> >> It ins't true any more. >> UNEVICTABLE_LRU will move ramfs's page from LRU to unevictable list. >> Couldn't we solve this problem if NOMMU can support CONFIG_UNEVICTABLE_LRU ? > > That's more of a band-aid than a solution, no? They should never have > been on the list to begin with. > I agree, as Andrew pointed out. It may be a workaround, but it can be a good solution for now. After that, I think we have to improve it in future so that ramfs pages are removed from the LRU list. -- Kinds regards, Minchan Kim ^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded 2009-03-11 15:30 [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded David Howells ` (2 preceding siblings ...) 2009-03-12 0:08 ` Andrew Morton @ 2009-03-12 12:25 ` David Howells 2009-03-12 19:43 ` Andrew Morton 3 siblings, 1 reply; 101+ messages in thread From: David Howells @ 2009-03-12 12:25 UTC (permalink / raw) To: Andrew Morton Cc: dhowells, torvalds, peterz, Enrik.Berkhan, uclinux-dev, linux-kernel Andrew Morton <akpm@linux-foundation.org> wrote: > Was there a specific reason for using the low-level SetPageDirty()? > > On the write() path, ramfs pages will be dirtied by > simple_commit_write()'s set_page_dirty(), which calls > __set_page_dirty_no_writeback(). > > It just so happens that __set_page_dirty_no_writeback() is equivalent > to a simple SetPageDirty() - it bypasses all the extra things which we > do for normal permanent-storage-backed pages. > > But I'd have thought that it would be cleaner and more maintainable (albeit > a bit slower) to go through the a_ops? It basically boils down to SetPageDirty() with extra overhead, which you pointed out. We're manually manipulating the pagecache for this inode anyway, so does it matter? The main thing I think I'd rather get rid of is: if (!pagevec_add(&lru_pvec, page)) __pagevec_lru_add_file(&lru_pvec); ... pagevec_lru_add_file(&lru_pvec); Which as Peter points out: The ramfs stuff is rather icky in that it adds the pages to the aging list, marks them dirty, but does not provide a writeout method. This will make the paging code scan over them (continuously) trying to clean them, failing that (lack of writeout method) and putting them back on the list. Not requiring the pages to be added to the LRU would be a really good idea. They are not discardable, be it in MMU or NOMMU mode, except when the inode itself is discarded. 
Furthermore, does it really make sense for ramfs to use do_sync_read/write() and generic_file_aio_read/write(), at least for NOMMU-mode? These add a lot of overhead, and ramfs doesn't really do either direct I/O or AIO. The main point in favour of using these routines is commonality; but they do add a lot of layers of overhead. Does ramfs read/write performance matter that much, I wonder. David ^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded 2009-03-12 12:25 ` David Howells @ 2009-03-12 19:43 ` Andrew Morton 2009-03-13 2:03 ` KOSAKI Motohiro 0 siblings, 1 reply; 101+ messages in thread From: Andrew Morton @ 2009-03-12 19:43 UTC (permalink / raw) To: David Howells Cc: dhowells, torvalds, peterz, Enrik.Berkhan, uclinux-dev, linux-kernel On Thu, 12 Mar 2009 12:25:24 +0000 David Howells <dhowells@redhat.com> wrote: > Andrew Morton <akpm@linux-foundation.org> wrote: > > > Was there a specific reason for using the low-level SetPageDirty()? > > > > On the write() path, ramfs pages will be dirtied by > > simple_commit_write()'s set_page_dirty(), which calls > > __set_page_dirty_no_writeback(). > > > > It just so happens that __set_page_dirty_no_writeback() is equivalent > > to a simple SetPageDirty() - it bypasses all the extra things which we > > do for normal permanent-storage-backed pages. > > > > But I'd have thought that it would be cleaner and more maintainable (albeit > > a bit slower) to go through the a_ops? > > It basically boils down to SetPageDirty() with extra overhead, which you > pointed out. We're manually manipulating the pagecache for this inode anyway, > so does it matter? Not much. It just seems a bit more consistent. > The main thing I think I'd rather get rid of is: > > if (!pagevec_add(&lru_pvec, page)) > __pagevec_lru_add_file(&lru_pvec); > ... > pagevec_lru_add_file(&lru_pvec); > > Which as Peter points out: > > The ramfs stuff is rather icky in that it adds the pages to the aging > list, marks them dirty, but does not provide a writeout method. > > This will make the paging code scan over them (continuously) trying to > clean them, failing that (lack of writeout method) and putting them back > on the list. > > Not requiring the pages to be added to the LRU would be a really good idea. > They are not discardable, be it in MMU or NOMMU mode, except when the inode > itself is discarded. 
Yep, these pages shouldn't be on the LRU at all. I guess that will require some tweaks to core filemap.c code. > Furthermore, does it really make sense for ramfs to use do_sync_read/write() > and generic_file_aio_read/write(), at least for NOMMU-mode? These add a lot > of overhead, and ramfs doesn't really do either direct I/O or AIO. > > The main point in favour of using these routines is commonality; but they do > add a lot of layers of overhead. Yes, that code is very general hence always has overhead for each specific client. > Does ramfs read/write performance matter > than much, I wonder. I doubt it. ^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded 2009-03-12 19:43 ` Andrew Morton @ 2009-03-13 2:03 ` KOSAKI Motohiro 2009-03-13 7:57 ` Peter Zijlstra 0 siblings, 1 reply; 101+ messages in thread From: KOSAKI Motohiro @ 2009-03-13 2:03 UTC (permalink / raw) To: Andrew Morton Cc: kosaki.motohiro, David Howells, torvalds, peterz, Enrik.Berkhan, uclinux-dev, linux-kernel Hi > > Which as Peter points out: > > > > The ramfs stuff is rather icky in that it adds the pages to the aging > > list, marks them dirty, but does not provide a writeout method. > > > > This will make the paging code scan over them (continuously) trying to > > clean them, failing that (lack of writeout method) and putting them back > > on the list. > > > > Not requiring the pages to be added to the LRU would be a really good idea. > > They are not discardable, be it in MMU or NOMMU mode, except when the inode > > itself is discarded. > > Yep, these pages shouldn't be on the LRU at all. I guess that will > require some tweaks to core filemap.c code. IMHO, UNEVICTABLE_LRU already does LRU isolation. The only remaining problem is getting rid of the "depends on MMU" line in mm/Kconfig. Am I missing anything? ^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded 2009-03-13 2:03 ` KOSAKI Motohiro @ 2009-03-13 7:57 ` Peter Zijlstra 2009-03-13 8:15 ` KOSAKI Motohiro 0 siblings, 1 reply; 101+ messages in thread From: Peter Zijlstra @ 2009-03-13 7:57 UTC (permalink / raw) To: KOSAKI Motohiro Cc: Andrew Morton, David Howells, torvalds, Enrik.Berkhan, uclinux-dev, linux-kernel On Fri, 2009-03-13 at 11:03 +0900, KOSAKI Motohiro wrote: > Hi > > > > Which as Peter points out: > > > > > > The ramfs stuff is rather icky in that it adds the pages to the aging > > > list, marks them dirty, but does not provide a writeout method. > > > > > > This will make the paging code scan over them (continuously) trying to > > > clean them, failing that (lack of writeout method) and putting them back > > > on the list. > > > > > > Not requiring the pages to be added to the LRU would be a really good idea. > > > They are not discardable, be it in MMU or NOMMU mode, except when the inode > > > itself is discarded. > > > > Yep, these pages shouldn't be on the LRU at all. I guess that will > > require some tweaks to core filemap.c code. > > IMHO, UNEVICTABLE_LRU already does lru isolation. > only rest prblem is, getting rid of "depends on MMU" line in mm/Kconfig. > > Am I missing anything? Yes, the need to take something off that shouldn't be there to begin with. ^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded 2009-03-13 7:57 ` Peter Zijlstra @ 2009-03-13 8:15 ` KOSAKI Motohiro 2009-03-13 9:19 ` Minchan Kim 2009-03-13 10:44 ` Johannes Weiner 0 siblings, 2 replies; 101+ messages in thread From: KOSAKI Motohiro @ 2009-03-13 8:15 UTC (permalink / raw) To: Peter Zijlstra Cc: kosaki.motohiro, Andrew Morton, David Howells, torvalds, Enrik.Berkhan, uclinux-dev, linux-kernel > > > > The ramfs stuff is rather icky in that it adds the pages to the aging > > > > list, marks them dirty, but does not provide a writeout method. > > > > > > > > This will make the paging code scan over them (continuously) trying to > > > > clean them, failing that (lack of writeout method) and putting them back > > > > on the list. > > > > > > > > Not requiring the pages to be added to the LRU would be a really good idea. > > > > They are not discardable, be it in MMU or NOMMU mode, except when the inode > > > > itself is discarded. > > > > > > Yep, these pages shouldn't be on the LRU at all. I guess that will > > > require some tweaks to core filemap.c code. > > > > IMHO, UNEVICTABLE_LRU already does lru isolation. > > only rest prblem is, getting rid of "depends on MMU" line in mm/Kconfig. > > > > Am I missing anything? > > Yes, the need to take something off that shouldn't be there to begin > with. We discussed the same thing in the past unevictable LRU discussion. At that time, we found two reasons why the unevictable LRU is better than taking the pages off the LRU completely. (1) The page migration code depends on the page staying on the LRU. (2) "Taking off at reclaim time" avoids adding a lock to the fast path. Completely removing pages from the LRU needs some kind of lock anyway, and we disliked that at the time. So, I think it is still true. Of course, a better solution is always welcome :) ^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded 2009-03-13 8:15 ` KOSAKI Motohiro @ 2009-03-13 9:19 ` Minchan Kim 2009-03-13 10:44 ` Johannes Weiner 1 sibling, 0 replies; 101+ messages in thread From: Minchan Kim @ 2009-03-13 9:19 UTC (permalink / raw) To: KOSAKI Motohiro Cc: Peter Zijlstra, Andrew Morton, David Howells, torvalds, Enrik.Berkhan, uclinux-dev, linux-kernel On Fri, Mar 13, 2009 at 5:15 PM, KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> wrote: >> > > > The ramfs stuff is rather icky in that it adds the pages to the aging >> > > > list, marks them dirty, but does not provide a writeout method. >> > > > >> > > > This will make the paging code scan over them (continuously) trying to >> > > > clean them, failing that (lack of writeout method) and putting them back >> > > > on the list. >> > > > >> > > > Not requiring the pages to be added to the LRU would be a really good idea. >> > > > They are not discardable, be it in MMU or NOMMU mode, except when the inode >> > > > itself is discarded. >> > > >> > > Yep, these pages shouldn't be on the LRU at all. I guess that will >> > > require some tweaks to core filemap.c code. >> > >> > IMHO, UNEVICTABLE_LRU already does lru isolation. >> > only rest prblem is, getting rid of "depends on MMU" line in mm/Kconfig. >> > >> > Am I missing anything? >> >> Yes, the need to take something off that shouldn't be there to begin >> with. > > In past unevictable lru discussion, we discuss the same thing. > at that time, we found two reason of unevictable lru is better than > completely taking off. > > (1) page migration code depend on the page stay on lru. > (2) "taking off at reclaim time" can avoid adding lock to fastpath. > anyway, complely removing from lru need something lock. > we disliked it at that time Could you explain this issue in more detail when it is convenient, please? > So, I think it is still true.
> Of cource, better cool solution is always welcome :) -- Kinds regards, Minchan Kim ^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded 2009-03-13 8:15 ` KOSAKI Motohiro 2009-03-13 9:19 ` Minchan Kim @ 2009-03-13 10:44 ` Johannes Weiner 2009-03-14 14:29 ` KOSAKI Motohiro 1 sibling, 1 reply; 101+ messages in thread From: Johannes Weiner @ 2009-03-13 10:44 UTC (permalink / raw) To: KOSAKI Motohiro Cc: Peter Zijlstra, Andrew Morton, David Howells, torvalds, Enrik.Berkhan, uclinux-dev, linux-kernel On Fri, Mar 13, 2009 at 05:15:44PM +0900, KOSAKI Motohiro wrote: > > > > > The ramfs stuff is rather icky in that it adds the pages to the aging > > > > > list, marks them dirty, but does not provide a writeout method. > > > > > > > > > > This will make the paging code scan over them (continuously) trying to > > > > > clean them, failing that (lack of writeout method) and putting them back > > > > > on the list. > > > > > > > > > > Not requiring the pages to be added to the LRU would be a really good idea. > > > > > They are not discardable, be it in MMU or NOMMU mode, except when the inode > > > > > itself is discarded. > > > > > > > > Yep, these pages shouldn't be on the LRU at all. I guess that will > > > > require some tweaks to core filemap.c code. > > > > > > IMHO, UNEVICTABLE_LRU already does lru isolation. > > > only rest prblem is, getting rid of "depends on MMU" line in mm/Kconfig. > > > > > > Am I missing anything? > > > > Yes, the need to take something off that shouldn't be there to begin > > with. > > In past unevictable lru discussion, we discuss the same thing. > at that time, we found two reason of unevictable lru is better than > completely taking off. > > (1) page migration code depend on the page stay on lru. > (2) "taking off at reclaim time" can avoid adding lock to fastpath. > anyway, complely removing from lru need something lock. > we disliked it at that time. Agreed with (1). NOMMU can't support migration, though. 
But keeping them off the LRU on NOMMU needs adjustment of the page cache read/write code in mm/filemap.c. I'm not quite sure I understand (2). But never adding these pages on the LRU means we never have to remove them anywhere, no? Hannes ^ permalink raw reply [flat|nested] 101+ messages in thread
* Re: [PATCH] NOMMU: Pages allocated to a ramfs inode's pagecache may get wrongly discarded 2009-03-13 10:44 ` Johannes Weiner @ 2009-03-14 14:29 ` KOSAKI Motohiro 0 siblings, 0 replies; 101+ messages in thread From: KOSAKI Motohiro @ 2009-03-14 14:29 UTC (permalink / raw) To: Johannes Weiner Cc: kosaki.motohiro, Peter Zijlstra, Andrew Morton, David Howells, torvalds, Enrik.Berkhan, uclinux-dev, linux-kernel > > (1) page migration code depend on the page stay on lru. > > (2) "taking off at reclaim time" can avoid adding lock to fastpath. > > anyway, complely removing from lru need something lock. > > we disliked it at that time. > > Agreed with (1). NOMMU can't support migration, though. But keeping > them off the LRU on NOMMU needs adjustment of the page cache > read/write code in mm/filemap.c. Yes. > I'm not quite sure I understand (2). But never adding these pages on > the LRU means we never have to remove them anywhere, no? Yeah, you are right. I was confusing this with the munlock/shm_unlock case. ^ permalink raw reply [flat|nested] 101+ messages in thread