linux-kernel.vger.kernel.org archive mirror
* [PATCH for v5.8 0/3] fix for "mm: balance LRU lists based on relative thrashing" patchset
@ 2020-06-16  6:16 js1304
  2020-06-16  6:16 ` [PATCH for v5.8 1/3] mm: workingset: age nonresident information alongside anonymous pages js1304
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: js1304 @ 2020-06-16  6:16 UTC
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Johannes Weiner, Rik van Riel,
	Minchan Kim, Michal Hocko, kernel-team, Joonsoo Kim

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

This patchset fixes some problems in the patchset
"mm: balance LRU lists based on relative thrashing", which is now merged
into mainline.

Patch "mm: workingset: let cache workingset challenge anon fix" is
the result of a discussion with Johannes. See the following link.

http://lkml.kernel.org/r/20200520232525.798933-6-hannes@cmpxchg.org

The other two are minor issues that I found while rebasing
my patchset.

Johannes Weiner (1):
  mm: workingset: age nonresident information alongside anonymous pages

Joonsoo Kim (2):
  mm/swap: fix for "mm: workingset: age nonresident information
    alongside anonymous pages"
  mm/memory: fix IO cost for anonymous page

 include/linux/mmzone.h |  4 ++--
 include/linux/swap.h   |  1 +
 mm/memory.c            |  8 ++++++++
 mm/swap.c              |  3 +--
 mm/vmscan.c            |  3 +++
 mm/workingset.c        | 46 +++++++++++++++++++++++++++-------------------
 6 files changed, 42 insertions(+), 23 deletions(-)

-- 
2.7.4



* [PATCH for v5.8 1/3] mm: workingset: age nonresident information alongside anonymous pages
  2020-06-16  6:16 [PATCH for v5.8 0/3] fix for "mm: balance LRU lists based on relative thrashing" patchset js1304
@ 2020-06-16  6:16 ` js1304
  2020-06-29 10:10   ` Vlastimil Babka
  2020-06-16  6:16 ` [PATCH for v5.8 2/3] mm/swap: fix for "mm: workingset: age nonresident information alongside anonymous pages" js1304
  2020-06-16  6:16 ` [PATCH for v5.8 3/3] mm/memory: fix IO cost for anonymous page js1304
  2 siblings, 1 reply; 11+ messages in thread
From: js1304 @ 2020-06-16  6:16 UTC
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Johannes Weiner, Rik van Riel,
	Minchan Kim, Michal Hocko, kernel-team, Joonsoo Kim

From: Johannes Weiner <hannes@cmpxchg.org>

After ("mm: workingset: let cache workingset challenge anon fix"), we
compare refault distances to active_file + anon. But the age of the
non-resident information is only driven by the file LRU. As a result,
we may overestimate the recency of any incoming refaults and activate
them too eagerly, causing unnecessary LRU churn in certain situations.

Make anon aging drive nonresident age as well to address that.

Reported-by: Joonsoo Kim <js1304@gmail.com>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 include/linux/mmzone.h |  4 ++--
 include/linux/swap.h   |  1 +
 mm/vmscan.c            |  3 +++
 mm/workingset.c        | 46 +++++++++++++++++++++++++++-------------------
 4 files changed, 33 insertions(+), 21 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index c4c37fd..f6f8849 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -257,8 +257,8 @@ struct lruvec {
 	 */
 	unsigned long			anon_cost;
 	unsigned long			file_cost;
-	/* Evictions & activations on the inactive file list */
-	atomic_long_t			inactive_age;
+	/* Non-resident age, driven by LRU movement */
+	atomic_long_t			nonresident_age;
 	/* Refaults at the time of last reclaim cycle */
 	unsigned long			refaults;
 	/* Various lruvec state flags (enum lruvec_flags) */
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 4c5974b..5b3216b 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -313,6 +313,7 @@ struct vma_swap_readahead {
 };
 
 /* linux/mm/workingset.c */
+void workingset_age_nonresident(struct lruvec *lruvec, unsigned long nr_pages);
 void *workingset_eviction(struct page *page, struct mem_cgroup *target_memcg);
 void workingset_refault(struct page *page, void *shadow);
 void workingset_activation(struct page *page);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index b6d8432..749d239 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -904,6 +904,7 @@ static int __remove_mapping(struct address_space *mapping, struct page *page,
 		__delete_from_swap_cache(page, swap);
 		xa_unlock_irqrestore(&mapping->i_pages, flags);
 		put_swap_page(page, swap);
+		workingset_eviction(page, target_memcg);
 	} else {
 		void (*freepage)(struct page *);
 		void *shadow = NULL;
@@ -1884,6 +1885,8 @@ static unsigned noinline_for_stack move_pages_to_lru(struct lruvec *lruvec,
 				list_add(&page->lru, &pages_to_free);
 		} else {
 			nr_moved += nr_pages;
+			if (PageActive(page))
+				workingset_age_nonresident(lruvec, nr_pages);
 		}
 	}
 
diff --git a/mm/workingset.c b/mm/workingset.c
index d481ea4..50b7937 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -156,8 +156,8 @@
  *
  *		Implementation
  *
- * For each node's file LRU lists, a counter for inactive evictions
- * and activations is maintained (node->inactive_age).
+ * For each node's LRU lists, a counter for inactive evictions and
+ * activations is maintained (node->nonresident_age).
  *
  * On eviction, a snapshot of this counter (along with some bits to
  * identify the node) is stored in the now empty page cache
@@ -213,7 +213,17 @@ static void unpack_shadow(void *shadow, int *memcgidp, pg_data_t **pgdat,
 	*workingsetp = workingset;
 }
 
-static void advance_inactive_age(struct mem_cgroup *memcg, pg_data_t *pgdat)
+/**
+ * workingset_age_nonresident - age non-resident entries as LRU ages
+ * @lruvec: the lruvec that was aged
+ * @nr_pages: the number of pages to count
+ *
+ * As in-memory pages are aged, non-resident pages need to be aged as
+ * well, in order for the refault distances later on to be comparable
+ * to the in-memory dimensions. This function allows reclaim and LRU
+ * operations to drive the non-resident aging along in parallel.
+ */
+void workingset_age_nonresident(struct lruvec *lruvec, unsigned long nr_pages)
 {
 	/*
 	 * Reclaiming a cgroup means reclaiming all its children in a
@@ -227,11 +237,8 @@ static void advance_inactive_age(struct mem_cgroup *memcg, pg_data_t *pgdat)
 	 * the root cgroup's, age as well.
 	 */
 	do {
-		struct lruvec *lruvec;
-
-		lruvec = mem_cgroup_lruvec(memcg, pgdat);
-		atomic_long_inc(&lruvec->inactive_age);
-	} while (memcg && (memcg = parent_mem_cgroup(memcg)));
+		atomic_long_add(nr_pages, &lruvec->nonresident_age);
+	} while ((lruvec = parent_lruvec(lruvec)));
 }
 
 /**
@@ -254,12 +261,11 @@ void *workingset_eviction(struct page *page, struct mem_cgroup *target_memcg)
 	VM_BUG_ON_PAGE(page_count(page), page);
 	VM_BUG_ON_PAGE(!PageLocked(page), page);
 
-	advance_inactive_age(page_memcg(page), pgdat);
-
 	lruvec = mem_cgroup_lruvec(target_memcg, pgdat);
+	workingset_age_nonresident(lruvec, hpage_nr_pages(page));
 	/* XXX: target_memcg can be NULL, go through lruvec */
 	memcgid = mem_cgroup_id(lruvec_memcg(lruvec));
-	eviction = atomic_long_read(&lruvec->inactive_age);
+	eviction = atomic_long_read(&lruvec->nonresident_age);
 	return pack_shadow(memcgid, pgdat, eviction, PageWorkingset(page));
 }
 
@@ -309,20 +315,20 @@ void workingset_refault(struct page *page, void *shadow)
 	if (!mem_cgroup_disabled() && !eviction_memcg)
 		goto out;
 	eviction_lruvec = mem_cgroup_lruvec(eviction_memcg, pgdat);
-	refault = atomic_long_read(&eviction_lruvec->inactive_age);
+	refault = atomic_long_read(&eviction_lruvec->nonresident_age);
 
 	/*
 	 * Calculate the refault distance
 	 *
 	 * The unsigned subtraction here gives an accurate distance
-	 * across inactive_age overflows in most cases. There is a
+	 * across nonresident_age overflows in most cases. There is a
 	 * special case: usually, shadow entries have a short lifetime
 	 * and are either refaulted or reclaimed along with the inode
 	 * before they get too old.  But it is not impossible for the
-	 * inactive_age to lap a shadow entry in the field, which can
-	 * then result in a false small refault distance, leading to a
-	 * false activation should this old entry actually refault
-	 * again.  However, earlier kernels used to deactivate
+	 * nonresident_age to lap a shadow entry in the field, which
+	 * can then result in a false small refault distance, leading
+	 * to a false activation should this old entry actually
+	 * refault again.  However, earlier kernels used to deactivate
 	 * unconditionally with *every* reclaim invocation for the
 	 * longest time, so the occasional inappropriate activation
 	 * leading to pressure on the active list is not a problem.
@@ -359,7 +365,7 @@ void workingset_refault(struct page *page, void *shadow)
 		goto out;
 
 	SetPageActive(page);
-	advance_inactive_age(memcg, pgdat);
+	workingset_age_nonresident(lruvec, hpage_nr_pages(page));
 	inc_lruvec_state(lruvec, WORKINGSET_ACTIVATE);
 
 	/* Page was active prior to eviction */
@@ -382,6 +388,7 @@ void workingset_refault(struct page *page, void *shadow)
 void workingset_activation(struct page *page)
 {
 	struct mem_cgroup *memcg;
+	struct lruvec *lruvec;
 
 	rcu_read_lock();
 	/*
@@ -394,7 +401,8 @@ void workingset_activation(struct page *page)
 	memcg = page_memcg_rcu(page);
 	if (!mem_cgroup_disabled() && !memcg)
 		goto out;
-	advance_inactive_age(memcg, page_pgdat(page));
+	lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page));
+	workingset_age_nonresident(lruvec, hpage_nr_pages(page));
 out:
 	rcu_read_unlock();
 }
-- 
2.7.4
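
The aging scheme described in this patch can be sketched as a small
userspace model. This is only an illustration under stated assumptions:
struct toy_lruvec and the toy_* helpers are hypothetical names, the
counter is a plain unsigned long rather than an atomic_long_t, and the
memcg lookup and shadow-entry packing are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct lruvec: only the fields this sketch needs. */
struct toy_lruvec {
	struct toy_lruvec *parent;	/* NULL for the root cgroup */
	unsigned long nonresident_age;	/* the kernel uses atomic_long_t */
};

/* Age this lruvec and every ancestor, like the do/while loop in the patch. */
static void toy_age_nonresident(struct toy_lruvec *lruvec,
				unsigned long nr_pages)
{
	do {
		lruvec->nonresident_age += nr_pages;
	} while ((lruvec = lruvec->parent));
}

/* On eviction: age first, then snapshot the counter for the shadow entry. */
static unsigned long toy_eviction(struct toy_lruvec *lruvec,
				  unsigned long nr_pages)
{
	toy_age_nonresident(lruvec, nr_pages);
	return lruvec->nonresident_age;
}

/* On refault: the distance is how far the counter moved since eviction. */
static unsigned long toy_refault_distance(const struct toy_lruvec *lruvec,
					  unsigned long eviction)
{
	return lruvec->nonresident_age - eviction;
}

/* One eviction followed by anon LRU movement in the same cgroup. */
static unsigned long toy_demo(void)
{
	struct toy_lruvec root = { NULL, 0 };
	struct toy_lruvec child = { &root, 0 };
	unsigned long shadow;

	shadow = toy_eviction(&child, 1);	/* a file page is evicted */
	toy_age_nonresident(&child, 512);	/* a 512-page huge page is activated */
	toy_age_nonresident(&child, 3);		/* three more base pages move */

	assert(root.nonresident_age == 516);	/* ancestors age as well */
	return toy_refault_distance(&child, shadow);
}
```

In this model the refault distance comes out as 515 pages. If anon
movement did not advance the counter, the same refault would appear to
have a distance of 0 and would look far more recent than it really is.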



* [PATCH for v5.8 2/3] mm/swap: fix for "mm: workingset: age nonresident information alongside anonymous pages"
  2020-06-16  6:16 [PATCH for v5.8 0/3] fix for "mm: balance LRU lists based on relative thrashing" patchset js1304
  2020-06-16  6:16 ` [PATCH for v5.8 1/3] mm: workingset: age nonresident information alongside anonymous pages js1304
@ 2020-06-16  6:16 ` js1304
  2020-06-16 14:48   ` Johannes Weiner
                     ` (2 more replies)
  2020-06-16  6:16 ` [PATCH for v5.8 3/3] mm/memory: fix IO cost for anonymous page js1304
  2 siblings, 3 replies; 11+ messages in thread
From: js1304 @ 2020-06-16  6:16 UTC
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Johannes Weiner, Rik van Riel,
	Minchan Kim, Michal Hocko, kernel-team, Joonsoo Kim

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Non-file-LRU pages can also be activated in mark_page_accessed(),
and we need to count these activations in nonresident_age.

Note that it's better for this patch to be squashed into the patch
"mm: workingset: age nonresident information alongside anonymous pages".

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 mm/swap.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/mm/swap.c b/mm/swap.c
index 667133d..c5d5114 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -443,8 +443,7 @@ void mark_page_accessed(struct page *page)
 		else
 			__lru_cache_activate_page(page);
 		ClearPageReferenced(page);
-		if (page_is_file_lru(page))
-			workingset_activation(page);
+		workingset_activation(page);
 	}
 	if (page_is_idle(page))
 		clear_page_idle(page);
-- 
2.7.4
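
The effect of dropping the page_is_file_lru() check can be sketched
with a toy model of the two-touch activation path. Everything here is
hypothetical (struct toy_page, toy_mark_page_accessed() and the global
counter); the unevictable case, pagevecs and the lruvec hierarchy are
ignored.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy page state: just enough to model the two-touch activation rule. */
struct toy_page {
	bool referenced;
	bool active;
	bool file_lru;		/* false for anon and shmem pages */
	unsigned long nr_pages;	/* 1 for a base page */
};

static unsigned long toy_nonresident_age;

/*
 * only_count_file == true models the code before this patch,
 * only_count_file == false models the code after it.
 */
static void toy_mark_page_accessed(struct toy_page *page, bool only_count_file)
{
	if (!page->referenced) {
		page->referenced = true;	/* first touch: remember it */
	} else if (!page->active) {
		page->active = true;		/* second touch: activate */
		page->referenced = false;
		if (!only_count_file || page->file_lru)
			toy_nonresident_age += page->nr_pages;
	}
}

/* Touch a shmem-like page twice and report how far the age advanced. */
static unsigned long toy_activate_shmem(bool only_count_file)
{
	struct toy_page page = { false, false, false, 1 };

	toy_nonresident_age = 0;
	toy_mark_page_accessed(&page, only_count_file);
	toy_mark_page_accessed(&page, only_count_file);
	assert(page.active);
	return toy_nonresident_age;
}
```

With the old check the activation of the shmem-like page leaves the
counter at 0; without the check it advances by one page, which is the
behaviour this patch asks for.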



* [PATCH for v5.8 3/3] mm/memory: fix IO cost for anonymous page
  2020-06-16  6:16 [PATCH for v5.8 0/3] fix for "mm: balance LRU lists based on relative thrashing" patchset js1304
  2020-06-16  6:16 ` [PATCH for v5.8 1/3] mm: workingset: age nonresident information alongside anonymous pages js1304
  2020-06-16  6:16 ` [PATCH for v5.8 2/3] mm/swap: fix for "mm: workingset: age nonresident information alongside anonymous pages" js1304
@ 2020-06-16  6:16 ` js1304
  2020-06-16 14:50   ` Johannes Weiner
  2020-06-29 10:27   ` Vlastimil Babka
  2 siblings, 2 replies; 11+ messages in thread
From: js1304 @ 2020-06-16  6:16 UTC
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Johannes Weiner, Rik van Riel,
	Minchan Kim, Michal Hocko, kernel-team, Joonsoo Kim

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

With a synchronous IO swap device, swap-in is handled directly in the
fault code. Since the IO cost is not noted there, LRU balancing can be
wrongly biased with such a device. Fix this by noting the cost in the
fault code.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 mm/memory.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/mm/memory.c b/mm/memory.c
index bc6a471..3359057 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3143,6 +3143,14 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 				if (err)
 					goto out_page;
 
+				/*
+				 * XXX: Move to lru_cache_add() when it
+				 * supports new vs putback
+				 */
+				spin_lock_irq(&page_pgdat(page)->lru_lock);
+				lru_note_cost_page(page);
+				spin_unlock_irq(&page_pgdat(page)->lru_lock);
+
 				lru_cache_add(page);
 				swap_readpage(page, true);
 			}
-- 
2.7.4
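
Why a missing cost note biases the balance can be sketched with two
counters. This is a simplified model, not the kernel's accounting:
the toy_* names are hypothetical, locking and any decay of the costs
are ignored, and toy_anon_cost_percent() is only an illustrative
metric, not the formula reclaim uses.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the two cost counters kept in struct lruvec. */
struct toy_lruvec {
	unsigned long anon_cost;
	unsigned long file_cost;
};

/* Charge the IO cost of bringing pages back in to the matching list. */
static void toy_note_cost(struct toy_lruvec *lruvec, bool file,
			  unsigned long nr_pages)
{
	if (file)
		lruvec->file_cost += nr_pages;
	else
		lruvec->anon_cost += nr_pages;
}

/* Swap-in on the synchronous path; note_cost == false models the bug. */
static void toy_sync_swapin(struct toy_lruvec *lruvec, bool note_cost)
{
	if (note_cost)
		toy_note_cost(lruvec, false, 1);
	/* The page would be added to the LRU and read in here. */
}

/* Share of the recorded cost that is attributed to anon, in percent. */
static unsigned long toy_anon_cost_percent(const struct toy_lruvec *lruvec)
{
	unsigned long total = lruvec->anon_cost + lruvec->file_cost;

	return total ? 100 * lruvec->anon_cost / total : 0;
}

/* 100 file refaults already recorded, then 100 synchronous swap-ins. */
static unsigned long toy_run(bool note_cost)
{
	struct toy_lruvec lruvec = { 0, 100 };
	int i;

	for (i = 0; i < 100; i++)
		toy_sync_swapin(&lruvec, note_cost);
	assert(lruvec.file_cost == 100);
	return toy_anon_cost_percent(&lruvec);
}
```

Without the note, anon appears to cost nothing (0 percent) even though
it caused as much IO as file pages; with the note the recorded cost is
split evenly (50 percent).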



* Re: [PATCH for v5.8 2/3] mm/swap: fix for "mm: workingset: age nonresident information alongside anonymous pages"
  2020-06-16  6:16 ` [PATCH for v5.8 2/3] mm/swap: fix for "mm: workingset: age nonresident information alongside anonymous pages" js1304
@ 2020-06-16 14:48   ` Johannes Weiner
  2020-06-16 18:36   ` Andrew Morton
  2020-06-29 10:25   ` Vlastimil Babka
  2 siblings, 0 replies; 11+ messages in thread
From: Johannes Weiner @ 2020-06-16 14:48 UTC
  To: js1304
  Cc: Andrew Morton, linux-mm, linux-kernel, Rik van Riel, Minchan Kim,
	Michal Hocko, kernel-team, Joonsoo Kim

On Tue, Jun 16, 2020 at 03:16:43PM +0900, js1304@gmail.com wrote:
> From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> 
> Non-file-lru page could also be activated in mark_page_accessed()
> and we need to count this activation for nonresident_age.

Good catch. Shmem pages use mark_page_accessed().

> Note that it's better for this patch to be squashed into the patch
> "mm: workingset: age nonresident information alongside anonymous pages".
> 
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>


* Re: [PATCH for v5.8 3/3] mm/memory: fix IO cost for anonymous page
  2020-06-16  6:16 ` [PATCH for v5.8 3/3] mm/memory: fix IO cost for anonymous page js1304
@ 2020-06-16 14:50   ` Johannes Weiner
  2020-06-29 10:27   ` Vlastimil Babka
  1 sibling, 0 replies; 11+ messages in thread
From: Johannes Weiner @ 2020-06-16 14:50 UTC
  To: js1304
  Cc: Andrew Morton, linux-mm, linux-kernel, Rik van Riel, Minchan Kim,
	Michal Hocko, kernel-team, Joonsoo Kim

On Tue, Jun 16, 2020 at 03:16:44PM +0900, js1304@gmail.com wrote:
> From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> 
> With synchronous IO swap device, swap-in is directly handled in fault
> code. Since IO cost notation isn't added there, with synchronous IO swap
> device, LRU balancing could be wrongly biased. Fix it to count it
> in fault code.
> 
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>


* Re: [PATCH for v5.8 2/3] mm/swap: fix for "mm: workingset: age nonresident information alongside anonymous pages"
  2020-06-16  6:16 ` [PATCH for v5.8 2/3] mm/swap: fix for "mm: workingset: age nonresident information alongside anonymous pages" js1304
  2020-06-16 14:48   ` Johannes Weiner
@ 2020-06-16 18:36   ` Andrew Morton
  2020-06-17  5:08     ` Joonsoo Kim
  2020-06-29 10:25   ` Vlastimil Babka
  2 siblings, 1 reply; 11+ messages in thread
From: Andrew Morton @ 2020-06-16 18:36 UTC
  To: js1304
  Cc: linux-mm, linux-kernel, Johannes Weiner, Rik van Riel,
	Minchan Kim, Michal Hocko, kernel-team, Joonsoo Kim

On Tue, 16 Jun 2020 15:16:43 +0900 js1304@gmail.com wrote:

> Subject: [PATCH for v5.8 2/3] mm/swap: fix for "mm: workingset: age nonresident information alongside anonymous pages"

I'm having trouble locating such a patch.

> Non-file-lru page could also be activated in mark_page_accessed()
> and we need to count this activation for nonresident_age.
> 
> Note that it's better for this patch to be squashed into the patch
> "mm: workingset: age nonresident information alongside anonymous pages".
> 
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> ---
>  mm/swap.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/mm/swap.c b/mm/swap.c
> index 667133d..c5d5114 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -443,8 +443,7 @@ void mark_page_accessed(struct page *page)
>  		else
>  			__lru_cache_activate_page(page);
>  		ClearPageReferenced(page);
> -		if (page_is_file_lru(page))
> -			workingset_activation(page);
> +		workingset_activation(page);
>  	}
>  	if (page_is_idle(page))
>  		clear_page_idle(page);

AFAICT this patch Fixes: a528910e12ec7ee ("mm: thrash detection-based file
cache sizing")?



* Re: [PATCH for v5.8 2/3] mm/swap: fix for "mm: workingset: age nonresident information alongside anonymous pages"
  2020-06-16 18:36   ` Andrew Morton
@ 2020-06-17  5:08     ` Joonsoo Kim
  0 siblings, 0 replies; 11+ messages in thread
From: Joonsoo Kim @ 2020-06-17  5:08 UTC
  To: Andrew Morton
  Cc: Linux Memory Management List, LKML, Johannes Weiner,
	Rik van Riel, Minchan Kim, Michal Hocko, kernel-team,
	Joonsoo Kim

On Wed, Jun 17, 2020 at 3:36 AM Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Tue, 16 Jun 2020 15:16:43 +0900 js1304@gmail.com wrote:
>
> > Subject: [PATCH for v5.8 2/3] mm/swap: fix for "mm: workingset: age nonresident information alongside anonymous pages"
>
> I'm having trouble locating such a patch.
>
> > Non-file-lru page could also be activated in mark_page_accessed()
> > and we need to count this activation for nonresident_age.
> >
> > Note that it's better for this patch to be squashed into the patch
> > "mm: workingset: age nonresident information alongside anonymous pages".
> >
> > Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > ---
> >  mm/swap.c | 3 +--
> >  1 file changed, 1 insertion(+), 2 deletions(-)
> >
> > diff --git a/mm/swap.c b/mm/swap.c
> > index 667133d..c5d5114 100644
> > --- a/mm/swap.c
> > +++ b/mm/swap.c
> > @@ -443,8 +443,7 @@ void mark_page_accessed(struct page *page)
> >               else
> >                       __lru_cache_activate_page(page);
> >               ClearPageReferenced(page);
> > -             if (page_is_file_lru(page))
> > -                     workingset_activation(page);
> > +             workingset_activation(page);
> >       }
> >       if (page_is_idle(page))
> >               clear_page_idle(page);
>
> AFAICT this patch Fixes: a528910e12ec7ee ("mm: thrash detection-based file
> cache sizing")?

No.

This patch could be squashed into the previous patch in this patchset,
"mm: workingset: age nonresident information alongside anonymous
pages".
I intentionally did not squash them myself since review is required.

Thanks.


* Re: [PATCH for v5.8 1/3] mm: workingset: age nonresident information alongside anonymous pages
  2020-06-16  6:16 ` [PATCH for v5.8 1/3] mm: workingset: age nonresident information alongside anonymous pages js1304
@ 2020-06-29 10:10   ` Vlastimil Babka
  0 siblings, 0 replies; 11+ messages in thread
From: Vlastimil Babka @ 2020-06-29 10:10 UTC
  To: js1304, Andrew Morton
  Cc: linux-mm, linux-kernel, Johannes Weiner, Rik van Riel,
	Minchan Kim, Michal Hocko, kernel-team, Joonsoo Kim

On 6/16/20 8:16 AM, js1304@gmail.com wrote:
> From: Johannes Weiner <hannes@cmpxchg.org>
> 
> After ("mm: workingset: let cache workingset challenge anon fix"), we

This could now be updated to:
After commit 34e58cac6d8f ("mm: workingset: let cache workingset challenge
anon"), we ...

> compare refault distances to active_file + anon. But age of the
> non-resident information is only driven by the file LRU. As a result,
> we may overestimate the recency of any incoming refaults and activate
> them too eagerly, causing unnecessary LRU churn in certain situations.
> 
> Make anon aging drive nonresident age as well to address that.

Fixes: 34e58cac6d8f ("mm: workingset: let cache workingset challenge anon")

> Reported-by: Joonsoo Kim <js1304@gmail.com>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

> ---
>  include/linux/mmzone.h |  4 ++--
>  include/linux/swap.h   |  1 +
>  mm/vmscan.c            |  3 +++
>  mm/workingset.c        | 46 +++++++++++++++++++++++++++-------------------
>  4 files changed, 33 insertions(+), 21 deletions(-)
> 
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index c4c37fd..f6f8849 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -257,8 +257,8 @@ struct lruvec {
>  	 */
>  	unsigned long			anon_cost;
>  	unsigned long			file_cost;
> -	/* Evictions & activations on the inactive file list */
> -	atomic_long_t			inactive_age;
> +	/* Non-resident age, driven by LRU movement */
> +	atomic_long_t			nonresident_age;
>  	/* Refaults at the time of last reclaim cycle */
>  	unsigned long			refaults;
>  	/* Various lruvec state flags (enum lruvec_flags) */
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index 4c5974b..5b3216b 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -313,6 +313,7 @@ struct vma_swap_readahead {
>  };
>  
>  /* linux/mm/workingset.c */
> +void workingset_age_nonresident(struct lruvec *lruvec, unsigned long nr_pages);
>  void *workingset_eviction(struct page *page, struct mem_cgroup *target_memcg);
>  void workingset_refault(struct page *page, void *shadow);
>  void workingset_activation(struct page *page);
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index b6d8432..749d239 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -904,6 +904,7 @@ static int __remove_mapping(struct address_space *mapping, struct page *page,
>  		__delete_from_swap_cache(page, swap);
>  		xa_unlock_irqrestore(&mapping->i_pages, flags);
>  		put_swap_page(page, swap);
> +		workingset_eviction(page, target_memcg);
>  	} else {
>  		void (*freepage)(struct page *);
>  		void *shadow = NULL;
> @@ -1884,6 +1885,8 @@ static unsigned noinline_for_stack move_pages_to_lru(struct lruvec *lruvec,
>  				list_add(&page->lru, &pages_to_free);
>  		} else {
>  			nr_moved += nr_pages;
> +			if (PageActive(page))
> +				workingset_age_nonresident(lruvec, nr_pages);
>  		}
>  	}
>  
> diff --git a/mm/workingset.c b/mm/workingset.c
> index d481ea4..50b7937 100644
> --- a/mm/workingset.c
> +++ b/mm/workingset.c
> @@ -156,8 +156,8 @@
>   *
>   *		Implementation
>   *
> - * For each node's file LRU lists, a counter for inactive evictions
> - * and activations is maintained (node->inactive_age).
> + * For each node's LRU lists, a counter for inactive evictions and
> + * activations is maintained (node->nonresident_age).
>   *
>   * On eviction, a snapshot of this counter (along with some bits to
>   * identify the node) is stored in the now empty page cache
> @@ -213,7 +213,17 @@ static void unpack_shadow(void *shadow, int *memcgidp, pg_data_t **pgdat,
>  	*workingsetp = workingset;
>  }
>  
> -static void advance_inactive_age(struct mem_cgroup *memcg, pg_data_t *pgdat)
> +/**
> + * workingset_age_nonresident - age non-resident entries as LRU ages
> + * @lruvec: the lruvec that was aged
> + * @nr_pages: the number of pages to count
> + *
> + * As in-memory pages are aged, non-resident pages need to be aged as
> + * well, in order for the refault distances later on to be comparable
> + * to the in-memory dimensions. This function allows reclaim and LRU
> + * operations to drive the non-resident aging along in parallel.
> + */
> +void workingset_age_nonresident(struct lruvec *lruvec, unsigned long nr_pages)
>  {
>  	/*
>  	 * Reclaiming a cgroup means reclaiming all its children in a
> @@ -227,11 +237,8 @@ static void advance_inactive_age(struct mem_cgroup *memcg, pg_data_t *pgdat)
>  	 * the root cgroup's, age as well.
>  	 */
>  	do {
> -		struct lruvec *lruvec;
> -
> -		lruvec = mem_cgroup_lruvec(memcg, pgdat);
> -		atomic_long_inc(&lruvec->inactive_age);
> -	} while (memcg && (memcg = parent_mem_cgroup(memcg)));
> +		atomic_long_add(nr_pages, &lruvec->nonresident_age);
> +	} while ((lruvec = parent_lruvec(lruvec)));
>  }
>  
>  /**
> @@ -254,12 +261,11 @@ void *workingset_eviction(struct page *page, struct mem_cgroup *target_memcg)
>  	VM_BUG_ON_PAGE(page_count(page), page);
>  	VM_BUG_ON_PAGE(!PageLocked(page), page);
>  
> -	advance_inactive_age(page_memcg(page), pgdat);
> -
>  	lruvec = mem_cgroup_lruvec(target_memcg, pgdat);
> +	workingset_age_nonresident(lruvec, hpage_nr_pages(page));
>  	/* XXX: target_memcg can be NULL, go through lruvec */
>  	memcgid = mem_cgroup_id(lruvec_memcg(lruvec));
> -	eviction = atomic_long_read(&lruvec->inactive_age);
> +	eviction = atomic_long_read(&lruvec->nonresident_age);
>  	return pack_shadow(memcgid, pgdat, eviction, PageWorkingset(page));
>  }
>  
> @@ -309,20 +315,20 @@ void workingset_refault(struct page *page, void *shadow)
>  	if (!mem_cgroup_disabled() && !eviction_memcg)
>  		goto out;
>  	eviction_lruvec = mem_cgroup_lruvec(eviction_memcg, pgdat);
> -	refault = atomic_long_read(&eviction_lruvec->inactive_age);
> +	refault = atomic_long_read(&eviction_lruvec->nonresident_age);
>  
>  	/*
>  	 * Calculate the refault distance
>  	 *
>  	 * The unsigned subtraction here gives an accurate distance
> -	 * across inactive_age overflows in most cases. There is a
> +	 * across nonresident_age overflows in most cases. There is a
>  	 * special case: usually, shadow entries have a short lifetime
>  	 * and are either refaulted or reclaimed along with the inode
>  	 * before they get too old.  But it is not impossible for the
> -	 * inactive_age to lap a shadow entry in the field, which can
> -	 * then result in a false small refault distance, leading to a
> -	 * false activation should this old entry actually refault
> -	 * again.  However, earlier kernels used to deactivate
> +	 * nonresident_age to lap a shadow entry in the field, which
> +	 * can then result in a false small refault distance, leading
> +	 * to a false activation should this old entry actually
> +	 * refault again.  However, earlier kernels used to deactivate
>  	 * unconditionally with *every* reclaim invocation for the
>  	 * longest time, so the occasional inappropriate activation
>  	 * leading to pressure on the active list is not a problem.
> @@ -359,7 +365,7 @@ void workingset_refault(struct page *page, void *shadow)
>  		goto out;
>  
>  	SetPageActive(page);
> -	advance_inactive_age(memcg, pgdat);
> +	workingset_age_nonresident(lruvec, hpage_nr_pages(page));
>  	inc_lruvec_state(lruvec, WORKINGSET_ACTIVATE);
>  
>  	/* Page was active prior to eviction */
> @@ -382,6 +388,7 @@ void workingset_refault(struct page *page, void *shadow)
>  void workingset_activation(struct page *page)
>  {
>  	struct mem_cgroup *memcg;
> +	struct lruvec *lruvec;
>  
>  	rcu_read_lock();
>  	/*
> @@ -394,7 +401,8 @@ void workingset_activation(struct page *page)
>  	memcg = page_memcg_rcu(page);
>  	if (!mem_cgroup_disabled() && !memcg)
>  		goto out;
> -	advance_inactive_age(memcg, page_pgdat(page));
> +	lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page));
> +	workingset_age_nonresident(lruvec, hpage_nr_pages(page));
>  out:
>  	rcu_read_unlock();
>  }
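
The overflow comment in workingset_refault() quoted above can be
illustrated with a narrow counter snapshot. This is a sketch under
assumptions: the 8-bit TOY_EVICTION_BITS width and the toy_* helpers
are made up so that lapping is easy to trigger, and they do not
reflect the widths the kernel actually uses.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical snapshot width, chosen small so that lapping is easy to show. */
#define TOY_EVICTION_BITS	8
#define TOY_EVICTION_MASK	((1UL << TOY_EVICTION_BITS) - 1)

/* Keep only the low bits of the age, as a space-limited shadow entry would. */
static unsigned long toy_pack(unsigned long age)
{
	return age & TOY_EVICTION_MASK;
}

/* Unsigned subtraction, then masking, gives the distance modulo 2^bits. */
static unsigned long toy_distance(unsigned long packed_eviction,
				  unsigned long age_now)
{
	return (age_now - packed_eviction) & TOY_EVICTION_MASK;
}

static void toy_overflow_demo(void)
{
	/* Normal case: the counter advanced by 10 since eviction. */
	assert(toy_distance(toy_pack(250UL), 260UL) == 10);

	/* The full-width counter wrapped from ULONG_MAX - 2 around to 7. */
	assert(toy_distance(toy_pack(ULONG_MAX - 2), 7UL) == 10);

	/* Lapped: the true distance is 261, but it is reported as 5. */
	assert(toy_distance(toy_pack(250UL), 511UL) == 5);
}
```

The first two cases show why the subtraction stays accurate across a
counter wrap; the last one is the "false small refault distance" the
comment accepts as an occasional inappropriate activation.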
> 



* Re: [PATCH for v5.8 2/3] mm/swap: fix for "mm: workingset: age nonresident information alongside anonymous pages"
  2020-06-16  6:16 ` [PATCH for v5.8 2/3] mm/swap: fix for "mm: workingset: age nonresident information alongside anonymous pages" js1304
  2020-06-16 14:48   ` Johannes Weiner
  2020-06-16 18:36   ` Andrew Morton
@ 2020-06-29 10:25   ` Vlastimil Babka
  2 siblings, 0 replies; 11+ messages in thread
From: Vlastimil Babka @ 2020-06-29 10:25 UTC
  To: js1304, Andrew Morton
  Cc: linux-mm, linux-kernel, Johannes Weiner, Rik van Riel,
	Minchan Kim, Michal Hocko, kernel-team, Joonsoo Kim

On 6/16/20 8:16 AM, js1304@gmail.com wrote:
> From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> 
> Non-file-lru page could also be activated in mark_page_accessed()
> and we need to count this activation for nonresident_age.
> 
> Note that it's better for this patch to be squashed into the patch
> "mm: workingset: age nonresident information alongside anonymous pages".

Agreed.

> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

> ---
>  mm/swap.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/mm/swap.c b/mm/swap.c
> index 667133d..c5d5114 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -443,8 +443,7 @@ void mark_page_accessed(struct page *page)
>  		else
>  			__lru_cache_activate_page(page);
>  		ClearPageReferenced(page);
> -		if (page_is_file_lru(page))
> -			workingset_activation(page);
> +		workingset_activation(page);
>  	}
>  	if (page_is_idle(page))
>  		clear_page_idle(page);
> 



* Re: [PATCH for v5.8 3/3] mm/memory: fix IO cost for anonymous page
  2020-06-16  6:16 ` [PATCH for v5.8 3/3] mm/memory: fix IO cost for anonymous page js1304
  2020-06-16 14:50   ` Johannes Weiner
@ 2020-06-29 10:27   ` Vlastimil Babka
  1 sibling, 0 replies; 11+ messages in thread
From: Vlastimil Babka @ 2020-06-29 10:27 UTC
  To: js1304, Andrew Morton
  Cc: linux-mm, linux-kernel, Johannes Weiner, Rik van Riel,
	Minchan Kim, Michal Hocko, kernel-team, Joonsoo Kim

On 6/16/20 8:16 AM, js1304@gmail.com wrote:
> From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> 
> With synchronous IO swap device, swap-in is directly handled in fault
> code. Since IO cost notation isn't added there, with synchronous IO swap
> device, LRU balancing could be wrongly biased. Fix it to count it
> in fault code.
> 
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

> ---
>  mm/memory.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/mm/memory.c b/mm/memory.c
> index bc6a471..3359057 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -3143,6 +3143,14 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
>  				if (err)
>  					goto out_page;
>  
> +				/*
> +				 * XXX: Move to lru_cache_add() when it
> +				 * supports new vs putback
> +				 */
> +				spin_lock_irq(&page_pgdat(page)->lru_lock);
> +				lru_note_cost_page(page);
> +				spin_unlock_irq(&page_pgdat(page)->lru_lock);
> +
>  				lru_cache_add(page);
>  				swap_readpage(page, true);
>  			}
> 

