All of lore.kernel.org
 help / color / mirror / Atom feed
From: Shaohua Li <shli@fb.com>
To: Minchan Kim <minchan@kernel.org>
Cc: <linux-kernel@vger.kernel.org>, <linux-mm@kvack.org>,
	<Kernel-team@fb.com>, <danielmicay@gmail.com>, <mhocko@suse.com>,
	<hughd@google.com>, <hannes@cmpxchg.org>, <riel@redhat.com>,
	<mgorman@techsingularity.net>, <akpm@linux-foundation.org>
Subject: Re: [PATCH V2 3/7] mm: reclaim MADV_FREE pages
Date: Fri, 10 Feb 2017 09:43:07 -0800	[thread overview]
Message-ID: <20170210174307.GC86050@shli-mbp.local> (raw)
In-Reply-To: <20170210065839.GD25078@bbox>

On Fri, Feb 10, 2017 at 03:58:39PM +0900, Minchan Kim wrote:
> On Fri, Feb 03, 2017 at 03:33:19PM -0800, Shaohua Li wrote:
> > When memory pressure is high, we free MADV_FREE pages. If the pages are
> > not dirty in pte, the pages could be freed immediately. Otherwise we
> > can't reclaim them. We put the pages back to anonumous LRU list (by
> > setting SwapBacked flag) and the pages will be reclaimed in normal
> > swapout way.
> > 
> > We use normal page reclaim policy. Since MADV_FREE pages are put into
> > inactive file list, such pages and inactive file pages are reclaimed
> > according to their age. This is expected, because we don't want to
> > reclaim too many MADV_FREE pages before used once pages.
> > 
> > Cc: Michal Hocko <mhocko@suse.com>
> > Cc: Minchan Kim <minchan@kernel.org>
> > Cc: Hugh Dickins <hughd@google.com>
> > Cc: Johannes Weiner <hannes@cmpxchg.org>
> > Cc: Rik van Riel <riel@redhat.com>
> > Cc: Mel Gorman <mgorman@techsingularity.net>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Signed-off-by: Shaohua Li <shli@fb.com>
> > ---
> >  mm/rmap.c   |  4 ++++
> >  mm/vmscan.c | 43 +++++++++++++++++++++++++++++++------------
> >  2 files changed, 35 insertions(+), 12 deletions(-)
> > 
> > diff --git a/mm/rmap.c b/mm/rmap.c
> > index c8d6204..5f05926 100644
> > --- a/mm/rmap.c
> > +++ b/mm/rmap.c
> > @@ -1554,6 +1554,10 @@ static int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
> >  			dec_mm_counter(mm, MM_ANONPAGES);
> >  			rp->lazyfreed++;
> >  			goto discard;
> > +		} else if (flags & TTU_LZFREE) {
> > +			set_pte_at(mm, address, pte, pteval);
> > +			ret = SWAP_FAIL;
> > +			goto out_unmap;
> 
> trivial:
> 
> How about this?
> 
> if (flags && TTU_LZFREE) {
> 	if (PageDirty(page)) {
> 		set_pte_at(XXX);
> 		ret = SWAP_FAIL;
> 		goto out_unmap;
> 	} else {
> 		dec_mm_counter(mm, MM_ANONPAGES);
> 		rp->lazyfreed++;
> 		goto discard;
> 	}
> }
ok
 
> >  		}
> >  
> >  		if (swap_duplicate(entry) < 0) {
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 947ab6f..b304a84 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -864,7 +864,7 @@ static enum page_references page_check_references(struct page *page,
> >  		return PAGEREF_RECLAIM;
> >  
> >  	if (referenced_ptes) {
> > -		if (PageSwapBacked(page))
> > +		if (PageSwapBacked(page) || PageAnon(page))
> 
> If anyone accesses MADV_FREEed range with load op, not store,
> why shouldn't we discard that pages?

Don't have strong opinion about this, userspace probably shouldn't do this. I'm
ok to delete it if you insist.

> >  			return PAGEREF_ACTIVATE;
> >  		/*
> >  		 * All mapped pages start out with page table
> > @@ -903,7 +903,7 @@ static enum page_references page_check_references(struct page *page,
> >  
> >  /* Check if a page is dirty or under writeback */
> >  static void page_check_dirty_writeback(struct page *page,
> > -				       bool *dirty, bool *writeback)
> > +			bool *dirty, bool *writeback, bool lazyfree)
> >  {
> >  	struct address_space *mapping;
> >  
> > @@ -911,7 +911,7 @@ static void page_check_dirty_writeback(struct page *page,
> >  	 * Anonymous pages are not handled by flushers and must be written
> >  	 * from reclaim context. Do not stall reclaim based on them
> >  	 */
> > -	if (!page_is_file_cache(page)) {
> > +	if (!page_is_file_cache(page) || lazyfree) {
> 
> tivial:
> 
> We can check it with PageLazyFree in here rather than passing lazyfree
> argument. It's consistent like page_is_file_cache in here.

ok 
> >  		*dirty = false;
> >  		*writeback = false;
> >  		return;
> > @@ -971,7 +971,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
> >  		int may_enter_fs;
> >  		enum page_references references = PAGEREF_RECLAIM_CLEAN;
> >  		bool dirty, writeback;
> > -		bool lazyfree = false;
> > +		bool lazyfree;
> >  		int ret = SWAP_SUCCESS;
> >  
> >  		cond_resched();
> > @@ -986,6 +986,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
> >  
> >  		sc->nr_scanned++;
> >  
> > +		lazyfree = page_is_lazyfree(page);
> > +
> >  		if (unlikely(!page_evictable(page)))
> >  			goto cull_mlocked;
> >  
> > @@ -993,7 +995,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
> >  			goto keep_locked;
> >  
> >  		/* Double the slab pressure for mapped and swapcache pages */
> > -		if (page_mapped(page) || PageSwapCache(page))
> > +		if ((page_mapped(page) || PageSwapCache(page)) && !lazyfree)
> >  			sc->nr_scanned++;
> 
> In this phase, we cannot know whether lazyfree marked page is discarable
> or not. If it is freeable and mapped, this logic makes sense. However,
> if the page is dirty?

I think this doesn't matter. If the page is dirty, it will go to reclaim in
next round and swap out. At that time, we will add nr_scanned there.

> >  
> >  		may_enter_fs = (sc->gfp_mask & __GFP_FS) ||
> > @@ -1005,7 +1007,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
> >  		 * will stall and start writing pages if the tail of the LRU
> >  		 * is all dirty unqueued pages.
> >  		 */
> > -		page_check_dirty_writeback(page, &dirty, &writeback);
> > +		page_check_dirty_writeback(page, &dirty, &writeback, lazyfree);
> >  		if (dirty || writeback)
> >  			nr_dirty++;
> >  
> > @@ -1107,6 +1109,14 @@ static unsigned long shrink_page_list(struct list_head *page_list,
> >  			; /* try to reclaim the page below */
> >  		}
> >  
> > +		/* lazyfree page could be freed directly */
> > +		if (lazyfree) {
> > +			if (unlikely(PageTransHuge(page)) &&
> > +			    split_huge_page_to_list(page, page_list))
> > +				goto keep_locked;
> > +			goto unmap_page;
> > +		}
> > +
> 
> Maybe, we can remove this hunk. Instead add lazyfree check in here.
> 
> 		if (PageAnon(page) && !PageSwapCache(page) && !lazyfree) {
> 			if (!(sc->gfp_mask & __GFP_IO))
ok

Thanks,
Shaohua 

WARNING: multiple messages have this Message-ID (diff)
From: Shaohua Li <shli@fb.com>
To: Minchan Kim <minchan@kernel.org>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Kernel-team@fb.com, danielmicay@gmail.com, mhocko@suse.com,
	hughd@google.com, hannes@cmpxchg.org, riel@redhat.com,
	mgorman@techsingularity.net, akpm@linux-foundation.org
Subject: Re: [PATCH V2 3/7] mm: reclaim MADV_FREE pages
Date: Fri, 10 Feb 2017 09:43:07 -0800	[thread overview]
Message-ID: <20170210174307.GC86050@shli-mbp.local> (raw)
In-Reply-To: <20170210065839.GD25078@bbox>

On Fri, Feb 10, 2017 at 03:58:39PM +0900, Minchan Kim wrote:
> On Fri, Feb 03, 2017 at 03:33:19PM -0800, Shaohua Li wrote:
> > When memory pressure is high, we free MADV_FREE pages. If the pages are
> > not dirty in pte, the pages could be freed immediately. Otherwise we
> > can't reclaim them. We put the pages back to anonumous LRU list (by
> > setting SwapBacked flag) and the pages will be reclaimed in normal
> > swapout way.
> > 
> > We use normal page reclaim policy. Since MADV_FREE pages are put into
> > inactive file list, such pages and inactive file pages are reclaimed
> > according to their age. This is expected, because we don't want to
> > reclaim too many MADV_FREE pages before used once pages.
> > 
> > Cc: Michal Hocko <mhocko@suse.com>
> > Cc: Minchan Kim <minchan@kernel.org>
> > Cc: Hugh Dickins <hughd@google.com>
> > Cc: Johannes Weiner <hannes@cmpxchg.org>
> > Cc: Rik van Riel <riel@redhat.com>
> > Cc: Mel Gorman <mgorman@techsingularity.net>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Signed-off-by: Shaohua Li <shli@fb.com>
> > ---
> >  mm/rmap.c   |  4 ++++
> >  mm/vmscan.c | 43 +++++++++++++++++++++++++++++++------------
> >  2 files changed, 35 insertions(+), 12 deletions(-)
> > 
> > diff --git a/mm/rmap.c b/mm/rmap.c
> > index c8d6204..5f05926 100644
> > --- a/mm/rmap.c
> > +++ b/mm/rmap.c
> > @@ -1554,6 +1554,10 @@ static int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
> >  			dec_mm_counter(mm, MM_ANONPAGES);
> >  			rp->lazyfreed++;
> >  			goto discard;
> > +		} else if (flags & TTU_LZFREE) {
> > +			set_pte_at(mm, address, pte, pteval);
> > +			ret = SWAP_FAIL;
> > +			goto out_unmap;
> 
> trivial:
> 
> How about this?
> 
> if (flags && TTU_LZFREE) {
> 	if (PageDirty(page)) {
> 		set_pte_at(XXX);
> 		ret = SWAP_FAIL;
> 		goto out_unmap;
> 	} else {
> 		dec_mm_counter(mm, MM_ANONPAGES);
> 		rp->lazyfreed++;
> 		goto discard;
> 	}
> }
ok
 
> >  		}
> >  
> >  		if (swap_duplicate(entry) < 0) {
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 947ab6f..b304a84 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -864,7 +864,7 @@ static enum page_references page_check_references(struct page *page,
> >  		return PAGEREF_RECLAIM;
> >  
> >  	if (referenced_ptes) {
> > -		if (PageSwapBacked(page))
> > +		if (PageSwapBacked(page) || PageAnon(page))
> 
> If anyone accesses MADV_FREEed range with load op, not store,
> why shouldn't we discard that pages?

Don't have strong opinion about this, userspace probably shouldn't do this. I'm
ok to delete it if you insist.

> >  			return PAGEREF_ACTIVATE;
> >  		/*
> >  		 * All mapped pages start out with page table
> > @@ -903,7 +903,7 @@ static enum page_references page_check_references(struct page *page,
> >  
> >  /* Check if a page is dirty or under writeback */
> >  static void page_check_dirty_writeback(struct page *page,
> > -				       bool *dirty, bool *writeback)
> > +			bool *dirty, bool *writeback, bool lazyfree)
> >  {
> >  	struct address_space *mapping;
> >  
> > @@ -911,7 +911,7 @@ static void page_check_dirty_writeback(struct page *page,
> >  	 * Anonymous pages are not handled by flushers and must be written
> >  	 * from reclaim context. Do not stall reclaim based on them
> >  	 */
> > -	if (!page_is_file_cache(page)) {
> > +	if (!page_is_file_cache(page) || lazyfree) {
> 
> tivial:
> 
> We can check it with PageLazyFree in here rather than passing lazyfree
> argument. It's consistent like page_is_file_cache in here.

ok 
> >  		*dirty = false;
> >  		*writeback = false;
> >  		return;
> > @@ -971,7 +971,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
> >  		int may_enter_fs;
> >  		enum page_references references = PAGEREF_RECLAIM_CLEAN;
> >  		bool dirty, writeback;
> > -		bool lazyfree = false;
> > +		bool lazyfree;
> >  		int ret = SWAP_SUCCESS;
> >  
> >  		cond_resched();
> > @@ -986,6 +986,8 @@ static unsigned long shrink_page_list(struct list_head *page_list,
> >  
> >  		sc->nr_scanned++;
> >  
> > +		lazyfree = page_is_lazyfree(page);
> > +
> >  		if (unlikely(!page_evictable(page)))
> >  			goto cull_mlocked;
> >  
> > @@ -993,7 +995,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
> >  			goto keep_locked;
> >  
> >  		/* Double the slab pressure for mapped and swapcache pages */
> > -		if (page_mapped(page) || PageSwapCache(page))
> > +		if ((page_mapped(page) || PageSwapCache(page)) && !lazyfree)
> >  			sc->nr_scanned++;
> 
> In this phase, we cannot know whether lazyfree marked page is discarable
> or not. If it is freeable and mapped, this logic makes sense. However,
> if the page is dirty?

I think this doesn't matter. If the page is dirty, it will go to reclaim in
next round and swap out. At that time, we will add nr_scanned there.

> >  
> >  		may_enter_fs = (sc->gfp_mask & __GFP_FS) ||
> > @@ -1005,7 +1007,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
> >  		 * will stall and start writing pages if the tail of the LRU
> >  		 * is all dirty unqueued pages.
> >  		 */
> > -		page_check_dirty_writeback(page, &dirty, &writeback);
> > +		page_check_dirty_writeback(page, &dirty, &writeback, lazyfree);
> >  		if (dirty || writeback)
> >  			nr_dirty++;
> >  
> > @@ -1107,6 +1109,14 @@ static unsigned long shrink_page_list(struct list_head *page_list,
> >  			; /* try to reclaim the page below */
> >  		}
> >  
> > +		/* lazyfree page could be freed directly */
> > +		if (lazyfree) {
> > +			if (unlikely(PageTransHuge(page)) &&
> > +			    split_huge_page_to_list(page, page_list))
> > +				goto keep_locked;
> > +			goto unmap_page;
> > +		}
> > +
> 
> Maybe, we can remove this hunk. Instead add lazyfree check in here.
> 
> 		if (PageAnon(page) && !PageSwapCache(page) && !lazyfree) {
> 			if (!(sc->gfp_mask & __GFP_IO))
ok

Thanks,
Shaohua 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2017-02-10 18:25 UTC|newest]

Thread overview: 62+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-02-03 23:33 [PATCH V2 0/7] mm: fix some MADV_FREE issues Shaohua Li
2017-02-03 23:33 ` Shaohua Li
2017-02-03 23:33 ` [PATCH V2 1/7] mm: don't assume anonymous pages have SwapBacked flag Shaohua Li
2017-02-03 23:33   ` Shaohua Li
2017-02-03 23:33 ` [PATCH V2 2/7] mm: move MADV_FREE pages into LRU_INACTIVE_FILE list Shaohua Li
2017-02-03 23:33   ` Shaohua Li
2017-02-04  6:38   ` Hillf Danton
2017-02-04  6:38     ` Hillf Danton
2017-02-09  6:33     ` Hillf Danton
2017-02-09  6:33       ` Hillf Danton
2017-02-10  6:50   ` Minchan Kim
2017-02-10  6:50     ` Minchan Kim
2017-02-10 17:30     ` Shaohua Li
2017-02-10 17:30       ` Shaohua Li
2017-02-13  4:57       ` Minchan Kim
2017-02-13  4:57         ` Minchan Kim
2017-02-10 13:02   ` Michal Hocko
2017-02-10 13:02     ` Michal Hocko
2017-02-10 17:33     ` Shaohua Li
2017-02-10 17:33       ` Shaohua Li
2017-02-03 23:33 ` [PATCH V2 3/7] mm: reclaim MADV_FREE pages Shaohua Li
2017-02-03 23:33   ` Shaohua Li
2017-02-10  6:58   ` Minchan Kim
2017-02-10  6:58     ` Minchan Kim
2017-02-10 17:43     ` Shaohua Li [this message]
2017-02-10 17:43       ` Shaohua Li
2017-02-13  5:06       ` Minchan Kim
2017-02-13  5:06         ` Minchan Kim
2017-02-10 13:23   ` Michal Hocko
2017-02-10 13:23     ` Michal Hocko
2017-02-03 23:33 ` [PATCH V2 4/7] mm: enable MADV_FREE for swapless system Shaohua Li
2017-02-03 23:33   ` Shaohua Li
2017-02-03 23:33 ` [PATCH V2 5/7] mm: add vmstat account for MADV_FREE pages Shaohua Li
2017-02-03 23:33   ` Shaohua Li
2017-02-10 13:27   ` Michal Hocko
2017-02-10 13:27     ` Michal Hocko
2017-02-10 17:50     ` Shaohua Li
2017-02-10 17:50       ` Shaohua Li
2017-02-21  9:43       ` Michal Hocko
2017-02-21  9:43         ` Michal Hocko
2017-02-03 23:33 ` [PATCH V2 6/7] proc: show MADV_FREE pages info in smaps Shaohua Li
2017-02-03 23:33   ` Shaohua Li
2017-02-10 13:30   ` Michal Hocko
2017-02-10 13:30     ` Michal Hocko
2017-02-10 17:52     ` Shaohua Li
2017-02-10 17:52       ` Shaohua Li
2017-02-22  2:47   ` Minchan Kim
2017-02-22  2:47     ` Minchan Kim
2017-02-22  4:11     ` Shaohua Li
2017-02-22  4:11       ` Shaohua Li
2017-02-03 23:33 ` [PATCH V2 7/7] mm: add a separate RSS for MADV_FREE pages Shaohua Li
2017-02-03 23:33   ` Shaohua Li
2017-02-10 13:35   ` Michal Hocko
2017-02-10 13:35     ` Michal Hocko
2017-02-10 18:01     ` Shaohua Li
2017-02-10 18:01       ` Shaohua Li
2017-02-21  9:45       ` Michal Hocko
2017-02-21  9:45         ` Michal Hocko
2017-02-22  0:46   ` Minchan Kim
2017-02-22  0:46     ` Minchan Kim
2017-02-22  1:27     ` Shaohua Li
2017-02-22  1:27       ` Shaohua Li

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170210174307.GC86050@shli-mbp.local \
    --to=shli@fb.com \
    --cc=Kernel-team@fb.com \
    --cc=akpm@linux-foundation.org \
    --cc=danielmicay@gmail.com \
    --cc=hannes@cmpxchg.org \
    --cc=hughd@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@techsingularity.net \
    --cc=mhocko@suse.com \
    --cc=minchan@kernel.org \
    --cc=riel@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.