From: Andrew Morton <akpm@linux-foundation.org>
To: akpm@linux-foundation.org, hannes@cmpxchg.org,
	iamjoonsoo.kim@lge.com, js1304@gmail.com, mhocko@suse.com,
	minchan.kim@gmail.com, mm-commits@vger.kernel.org,
	riel@surriel.com, torvalds@linux-foundation.org
Subject: [patch 25/32] mm: workingset: age nonresident information alongside anonymous pages
Date: Thu, 25 Jun 2020 20:30:31 -0700
Message-ID: <20200626033031.WA4Y5iL9D%akpm@linux-foundation.org>
In-Reply-To: <20200625202807.b630829d6fa55388148bee7d@linux-foundation.org>

From: Johannes Weiner <hannes@cmpxchg.org>
Subject: mm: workingset: age nonresident information alongside anonymous pages

Patch series "fix for "mm: balance LRU lists based on relative thrashing" patchset"

This patchset fixes some problems in the "mm: balance LRU lists based on
relative thrashing" patchset, which is now merged in mainline.

Patch "mm: workingset: let cache workingset challenge anon fix" is the
result of a discussion with Johannes.  See the following link.

http://lkml.kernel.org/r/20200520232525.798933-6-hannes@cmpxchg.org

The other two are minor issues that were found while rebasing my
patchset.


This patch (of 3):
After ("mm: workingset: let cache workingset challenge anon fix"), we
compare refault distances to active_file + anon.  But the age of the
non-resident information is only driven by the file LRU.  As a result, we
may overestimate the recency of any incoming refaults and activate them
too eagerly, causing unnecessary LRU churn in certain situations.

Make anon aging drive nonresident age as well to address that.

Link: http://lkml.kernel.org/r/1592288204-27734-1-git-send-email-iamjoonsoo.kim@lge.com
Link: http://lkml.kernel.org/r/1592288204-27734-2-git-send-email-iamjoonsoo.kim@lge.com
Fixes: 34e58cac6d8f2a ("mm: workingset: let cache workingset challenge anon")
Reported-by: Joonsoo Kim <js1304@gmail.com>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Minchan Kim <minchan.kim@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 include/linux/mmzone.h |    4 +--
 include/linux/swap.h   |    1 
 mm/vmscan.c            |    3 ++
 mm/workingset.c        |   46 ++++++++++++++++++++++-----------------
 4 files changed, 33 insertions(+), 21 deletions(-)

--- a/include/linux/mmzone.h~mm-workingset-age-nonresident-information-alongside-anonymous-pages
+++ a/include/linux/mmzone.h
@@ -257,8 +257,8 @@ struct lruvec {
 	 */
 	unsigned long			anon_cost;
 	unsigned long			file_cost;
-	/* Evictions & activations on the inactive file list */
-	atomic_long_t			inactive_age;
+	/* Non-resident age, driven by LRU movement */
+	atomic_long_t			nonresident_age;
 	/* Refaults at the time of last reclaim cycle */
 	unsigned long			refaults;
 	/* Various lruvec state flags (enum lruvec_flags) */
--- a/include/linux/swap.h~mm-workingset-age-nonresident-information-alongside-anonymous-pages
+++ a/include/linux/swap.h
@@ -313,6 +313,7 @@ struct vma_swap_readahead {
 };
 
 /* linux/mm/workingset.c */
+void workingset_age_nonresident(struct lruvec *lruvec, unsigned long nr_pages);
 void *workingset_eviction(struct page *page, struct mem_cgroup *target_memcg);
 void workingset_refault(struct page *page, void *shadow);
 void workingset_activation(struct page *page);
--- a/mm/vmscan.c~mm-workingset-age-nonresident-information-alongside-anonymous-pages
+++ a/mm/vmscan.c
@@ -904,6 +904,7 @@ static int __remove_mapping(struct addre
 		__delete_from_swap_cache(page, swap);
 		xa_unlock_irqrestore(&mapping->i_pages, flags);
 		put_swap_page(page, swap);
+		workingset_eviction(page, target_memcg);
 	} else {
 		void (*freepage)(struct page *);
 		void *shadow = NULL;
@@ -1884,6 +1885,8 @@ static unsigned noinline_for_stack move_
 				list_add(&page->lru, &pages_to_free);
 		} else {
 			nr_moved += nr_pages;
+			if (PageActive(page))
+				workingset_age_nonresident(lruvec, nr_pages);
 		}
 	}
 
--- a/mm/workingset.c~mm-workingset-age-nonresident-information-alongside-anonymous-pages
+++ a/mm/workingset.c
@@ -156,8 +156,8 @@
  *
  *		Implementation
  *
- * For each node's file LRU lists, a counter for inactive evictions
- * and activations is maintained (node->inactive_age).
+ * For each node's LRU lists, a counter for inactive evictions and
+ * activations is maintained (node->nonresident_age).
  *
  * On eviction, a snapshot of this counter (along with some bits to
  * identify the node) is stored in the now empty page cache
@@ -213,7 +213,17 @@ static void unpack_shadow(void *shadow,
 	*workingsetp = workingset;
 }
 
-static void advance_inactive_age(struct mem_cgroup *memcg, pg_data_t *pgdat)
+/**
+ * workingset_age_nonresident - age non-resident entries as LRU ages
+ * @lruvec: the lruvec that was aged
+ * @nr_pages: the number of pages to count
+ *
+ * As in-memory pages are aged, non-resident pages need to be aged as
+ * well, in order for the refault distances later on to be comparable
+ * to the in-memory dimensions. This function allows reclaim and LRU
+ * operations to drive the non-resident aging along in parallel.
+ */
+void workingset_age_nonresident(struct lruvec *lruvec, unsigned long nr_pages)
 {
 	/*
 	 * Reclaiming a cgroup means reclaiming all its children in a
@@ -227,11 +237,8 @@ static void advance_inactive_age(struct
 	 * the root cgroup's, age as well.
 	 */
 	do {
-		struct lruvec *lruvec;
-
-		lruvec = mem_cgroup_lruvec(memcg, pgdat);
-		atomic_long_inc(&lruvec->inactive_age);
-	} while (memcg && (memcg = parent_mem_cgroup(memcg)));
+		atomic_long_add(nr_pages, &lruvec->nonresident_age);
+	} while ((lruvec = parent_lruvec(lruvec)));
 }
 
 /**
@@ -254,12 +261,11 @@ void *workingset_eviction(struct page *p
 	VM_BUG_ON_PAGE(page_count(page), page);
 	VM_BUG_ON_PAGE(!PageLocked(page), page);
 
-	advance_inactive_age(page_memcg(page), pgdat);
-
 	lruvec = mem_cgroup_lruvec(target_memcg, pgdat);
+	workingset_age_nonresident(lruvec, hpage_nr_pages(page));
 	/* XXX: target_memcg can be NULL, go through lruvec */
 	memcgid = mem_cgroup_id(lruvec_memcg(lruvec));
-	eviction = atomic_long_read(&lruvec->inactive_age);
+	eviction = atomic_long_read(&lruvec->nonresident_age);
 	return pack_shadow(memcgid, pgdat, eviction, PageWorkingset(page));
 }
 
@@ -309,20 +315,20 @@ void workingset_refault(struct page *pag
 	if (!mem_cgroup_disabled() && !eviction_memcg)
 		goto out;
 	eviction_lruvec = mem_cgroup_lruvec(eviction_memcg, pgdat);
-	refault = atomic_long_read(&eviction_lruvec->inactive_age);
+	refault = atomic_long_read(&eviction_lruvec->nonresident_age);
 
 	/*
 	 * Calculate the refault distance
 	 *
 	 * The unsigned subtraction here gives an accurate distance
-	 * across inactive_age overflows in most cases. There is a
+	 * across nonresident_age overflows in most cases. There is a
 	 * special case: usually, shadow entries have a short lifetime
 	 * and are either refaulted or reclaimed along with the inode
 	 * before they get too old.  But it is not impossible for the
-	 * inactive_age to lap a shadow entry in the field, which can
-	 * then result in a false small refault distance, leading to a
-	 * false activation should this old entry actually refault
-	 * again.  However, earlier kernels used to deactivate
+	 * nonresident_age to lap a shadow entry in the field, which
+	 * can then result in a false small refault distance, leading
+	 * to a false activation should this old entry actually
+	 * refault again.  However, earlier kernels used to deactivate
 	 * unconditionally with *every* reclaim invocation for the
 	 * longest time, so the occasional inappropriate activation
 	 * leading to pressure on the active list is not a problem.
@@ -359,7 +365,7 @@ void workingset_refault(struct page *pag
 		goto out;
 
 	SetPageActive(page);
-	advance_inactive_age(memcg, pgdat);
+	workingset_age_nonresident(lruvec, hpage_nr_pages(page));
 	inc_lruvec_state(lruvec, WORKINGSET_ACTIVATE);
 
 	/* Page was active prior to eviction */
@@ -382,6 +388,7 @@ out:
 void workingset_activation(struct page *page)
 {
 	struct mem_cgroup *memcg;
+	struct lruvec *lruvec;
 
 	rcu_read_lock();
 	/*
@@ -394,7 +401,8 @@ void workingset_activation(struct page *
 	memcg = page_memcg_rcu(page);
 	if (!mem_cgroup_disabled() && !memcg)
 		goto out;
-	advance_inactive_age(memcg, page_pgdat(page));
+	lruvec = mem_cgroup_page_lruvec(page, page_pgdat(page));
+	workingset_age_nonresident(lruvec, hpage_nr_pages(page));
 out:
 	rcu_read_unlock();
 }
_
