linux-kernel.vger.kernel.org archive mirror
* [PATCH 1/5] mm/workingset: remove unused @mapping argument in workingset_eviction()
@ 2019-02-22 17:43 Andrey Ryabinin
  2019-02-22 17:43 ` [PATCH 2/5] mm: remove zone_lru_lock() function access ->lru_lock directly Andrey Ryabinin
                   ` (6 more replies)
  0 siblings, 7 replies; 17+ messages in thread
From: Andrey Ryabinin @ 2019-02-22 17:43 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Andrey Ryabinin, Johannes Weiner,
	Michal Hocko, Vlastimil Babka, Rik van Riel, Mel Gorman

workingset_eviction() doesn't use and never did use the @mapping argument.
Remove it.
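
As an illustration, the call site this patch touches, __remove_mapping() in
mm/vmscan.c, already has the page at hand, so the mapping argument adds
nothing; a condensed view of the call after this patch:

	shadow = workingset_eviction(page);
	__delete_from_page_cache(page, shadow);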

Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Rik van Riel <riel@surriel.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
---
 include/linux/swap.h | 2 +-
 mm/vmscan.c          | 2 +-
 mm/workingset.c      | 3 +--
 3 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 649529be91f2..fc50e21b3b88 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -307,7 +307,7 @@ struct vma_swap_readahead {
 };
 
 /* linux/mm/workingset.c */
-void *workingset_eviction(struct address_space *mapping, struct page *page);
+void *workingset_eviction(struct page *page);
 void workingset_refault(struct page *page, void *shadow);
 void workingset_activation(struct page *page);
 
diff --git a/mm/vmscan.c b/mm/vmscan.c
index ac4806f0f332..a9852ed7b97f 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -952,7 +952,7 @@ static int __remove_mapping(struct address_space *mapping, struct page *page,
 		 */
 		if (reclaimed && page_is_file_cache(page) &&
 		    !mapping_exiting(mapping) && !dax_mapping(mapping))
-			shadow = workingset_eviction(mapping, page);
+			shadow = workingset_eviction(page);
 		__delete_from_page_cache(page, shadow);
 		xa_unlock_irqrestore(&mapping->i_pages, flags);
 
diff --git a/mm/workingset.c b/mm/workingset.c
index dcb994f2acc2..0906137760c5 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -215,13 +215,12 @@ static void unpack_shadow(void *shadow, int *memcgidp, pg_data_t **pgdat,
 
 /**
  * workingset_eviction - note the eviction of a page from memory
- * @mapping: address space the page was backing
  * @page: the page being evicted
  *
  * Returns a shadow entry to be stored in @mapping->i_pages in place
  * of the evicted @page so that a later refault can be detected.
  */
-void *workingset_eviction(struct address_space *mapping, struct page *page)
+void *workingset_eviction(struct page *page)
 {
 	struct pglist_data *pgdat = page_pgdat(page);
 	struct mem_cgroup *memcg = page_memcg(page);
-- 
2.19.2



* [PATCH 2/5] mm: remove zone_lru_lock() function access ->lru_lock directly
  2019-02-22 17:43 [PATCH 1/5] mm/workingset: remove unused @mapping argument in workingset_eviction() Andrey Ryabinin
@ 2019-02-22 17:43 ` Andrey Ryabinin
  2019-02-25 12:10   ` Vlastimil Babka
  2019-02-22 17:43 ` [PATCH 3/5] mm/compaction: pass pgdat to too_many_isolated() instead of zone Andrey Ryabinin
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Andrey Ryabinin @ 2019-02-22 17:43 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Andrey Ryabinin, Johannes Weiner,
	Michal Hocko, Vlastimil Babka, Rik van Riel, Mel Gorman

We have a common pattern for accessing the lru_lock from a page pointer:
	zone_lru_lock(page_zone(page))

Which is silly, because it unfolds to this:
	&NODE_DATA(page_to_nid(page))->node_zones[page_zonenum(page)].zone_pgdat->lru_lock
while we can simply do
	&NODE_DATA(page_to_nid(page))->lru_lock

Remove the zone_lru_lock() function, since it only complicates things.
Use the 'page_pgdat(page)->lru_lock' pattern instead.
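
As a side-by-side illustration (not part of the patch, just the two ways of
reaching the same lock from a page):

	/* old: bounce through the zone to reach the node's lock */
	spin_lock_irq(zone_lru_lock(page_zone(page)));

	/* new: go straight from the page to its node */
	spin_lock_irq(&page_pgdat(page)->lru_lock);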

Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Rik van Riel <riel@surriel.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
---
 Documentation/cgroup-v1/memcg_test.txt |  4 ++--
 Documentation/cgroup-v1/memory.txt     |  4 ++--
 include/linux/mm_types.h               |  2 +-
 include/linux/mmzone.h                 |  4 ----
 mm/compaction.c                        | 15 ++++++++-------
 mm/filemap.c                           |  4 ++--
 mm/huge_memory.c                       |  6 +++---
 mm/memcontrol.c                        | 14 +++++++-------
 mm/mlock.c                             | 14 +++++++-------
 mm/page_idle.c                         |  8 ++++----
 mm/rmap.c                              |  2 +-
 mm/swap.c                              | 16 ++++++++--------
 mm/vmscan.c                            | 16 ++++++++--------
 13 files changed, 53 insertions(+), 56 deletions(-)

diff --git a/Documentation/cgroup-v1/memcg_test.txt b/Documentation/cgroup-v1/memcg_test.txt
index 5c7f310f32bb..621e29ffb358 100644
--- a/Documentation/cgroup-v1/memcg_test.txt
+++ b/Documentation/cgroup-v1/memcg_test.txt
@@ -107,9 +107,9 @@ Under below explanation, we assume CONFIG_MEM_RES_CTRL_SWAP=y.
 
 8. LRU
         Each memcg has its own private LRU. Now, its handling is under global
-	VM's control (means that it's handled under global zone_lru_lock).
+	VM's control (means that it's handled under global pgdat->lru_lock).
 	Almost all routines around memcg's LRU is called by global LRU's
-	list management functions under zone_lru_lock().
+	list management functions under pgdat->lru_lock.
 
 	A special function is mem_cgroup_isolate_pages(). This scans
 	memcg's private LRU and call __isolate_lru_page() to extract a page
diff --git a/Documentation/cgroup-v1/memory.txt b/Documentation/cgroup-v1/memory.txt
index 8e2cb1dabeb0..a33cedf85427 100644
--- a/Documentation/cgroup-v1/memory.txt
+++ b/Documentation/cgroup-v1/memory.txt
@@ -267,11 +267,11 @@ When oom event notifier is registered, event will be delivered.
    Other lock order is following:
    PG_locked.
    mm->page_table_lock
-       zone_lru_lock
+       pgdat->lru_lock
 	  lock_page_cgroup.
   In many cases, just lock_page_cgroup() is called.
   per-zone-per-cgroup LRU (cgroup's private LRU) is just guarded by
-  zone_lru_lock, it has no lock of its own.
+  pgdat->lru_lock, it has no lock of its own.
 
 2.7 Kernel Memory Extension (CONFIG_MEMCG_KMEM)
 
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index fe0f672f53ce..9b9dd8350e26 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -79,7 +79,7 @@ struct page {
 		struct {	/* Page cache and anonymous pages */
 			/**
 			 * @lru: Pageout list, eg. active_list protected by
-			 * zone_lru_lock.  Sometimes used as a generic list
+			 * pgdat->lru_lock.  Sometimes used as a generic list
 			 * by the page owner.
 			 */
 			struct list_head lru;
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 2fd4247262e9..22423763c0bd 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -788,10 +788,6 @@ typedef struct pglist_data {
 
 #define node_start_pfn(nid)	(NODE_DATA(nid)->node_start_pfn)
 #define node_end_pfn(nid) pgdat_end_pfn(NODE_DATA(nid))
-static inline spinlock_t *zone_lru_lock(struct zone *zone)
-{
-	return &zone->zone_pgdat->lru_lock;
-}
 
 static inline struct lruvec *node_lruvec(struct pglist_data *pgdat)
 {
diff --git a/mm/compaction.c b/mm/compaction.c
index 98f99f41dfdc..a3305f13a138 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -775,6 +775,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 			unsigned long end_pfn, isolate_mode_t isolate_mode)
 {
 	struct zone *zone = cc->zone;
+	pg_data_t *pgdat = zone->zone_pgdat;
 	unsigned long nr_scanned = 0, nr_isolated = 0;
 	struct lruvec *lruvec;
 	unsigned long flags = 0;
@@ -839,8 +840,8 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 		 * if contended.
 		 */
 		if (!(low_pfn % SWAP_CLUSTER_MAX)
-		    && compact_unlock_should_abort(zone_lru_lock(zone), flags,
-								&locked, cc))
+		    && compact_unlock_should_abort(&pgdat->lru_lock,
+					    flags, &locked, cc))
 			break;
 
 		if (!pfn_valid_within(low_pfn))
@@ -910,7 +911,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 			if (unlikely(__PageMovable(page)) &&
 					!PageIsolated(page)) {
 				if (locked) {
-					spin_unlock_irqrestore(zone_lru_lock(zone),
+					spin_unlock_irqrestore(&pgdat->lru_lock,
 									flags);
 					locked = false;
 				}
@@ -940,7 +941,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 
 		/* If we already hold the lock, we can skip some rechecking */
 		if (!locked) {
-			locked = compact_lock_irqsave(zone_lru_lock(zone),
+			locked = compact_lock_irqsave(&pgdat->lru_lock,
 								&flags, cc);
 
 			/* Try get exclusive access under lock */
@@ -965,7 +966,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 			}
 		}
 
-		lruvec = mem_cgroup_page_lruvec(page, zone->zone_pgdat);
+		lruvec = mem_cgroup_page_lruvec(page, pgdat);
 
 		/* Try isolate the page */
 		if (__isolate_lru_page(page, isolate_mode) != 0)
@@ -1007,7 +1008,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 		 */
 		if (nr_isolated) {
 			if (locked) {
-				spin_unlock_irqrestore(zone_lru_lock(zone), flags);
+				spin_unlock_irqrestore(&pgdat->lru_lock, flags);
 				locked = false;
 			}
 			putback_movable_pages(&cc->migratepages);
@@ -1034,7 +1035,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 
 isolate_abort:
 	if (locked)
-		spin_unlock_irqrestore(zone_lru_lock(zone), flags);
+		spin_unlock_irqrestore(&pgdat->lru_lock, flags);
 
 	/*
 	 * Updated the cached scanner pfn once the pageblock has been scanned
diff --git a/mm/filemap.c b/mm/filemap.c
index 663f3b84990d..cace3eb8069f 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -98,8 +98,8 @@
  *    ->swap_lock		(try_to_unmap_one)
  *    ->private_lock		(try_to_unmap_one)
  *    ->i_pages lock		(try_to_unmap_one)
- *    ->zone_lru_lock(zone)	(follow_page->mark_page_accessed)
- *    ->zone_lru_lock(zone)	(check_pte_range->isolate_lru_page)
+ *    ->pgdat->lru_lock		(follow_page->mark_page_accessed)
+ *    ->pgdat->lru_lock		(check_pte_range->isolate_lru_page)
  *    ->private_lock		(page_remove_rmap->set_page_dirty)
  *    ->i_pages lock		(page_remove_rmap->set_page_dirty)
  *    bdi.wb->list_lock		(page_remove_rmap->set_page_dirty)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index d4847026d4b1..4ccac6b32d49 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2475,7 +2475,7 @@ static void __split_huge_page(struct page *page, struct list_head *list,
 		xa_unlock(&head->mapping->i_pages);
 	}
 
-	spin_unlock_irqrestore(zone_lru_lock(page_zone(head)), flags);
+	spin_unlock_irqrestore(&page_pgdat(head)->lru_lock, flags);
 
 	remap_page(head);
 
@@ -2686,7 +2686,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 		lru_add_drain();
 
 	/* prevent PageLRU to go away from under us, and freeze lru stats */
-	spin_lock_irqsave(zone_lru_lock(page_zone(head)), flags);
+	spin_lock_irqsave(&pgdata->lru_lock, flags);
 
 	if (mapping) {
 		XA_STATE(xas, &mapping->i_pages, page_index(head));
@@ -2731,7 +2731,7 @@ int split_huge_page_to_list(struct page *page, struct list_head *list)
 		spin_unlock(&pgdata->split_queue_lock);
 fail:		if (mapping)
 			xa_unlock(&mapping->i_pages);
-		spin_unlock_irqrestore(zone_lru_lock(page_zone(head)), flags);
+		spin_unlock_irqrestore(&pgdata->lru_lock, flags);
 		remap_page(head);
 		ret = -EBUSY;
 	}
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 5fc2e1a7d4d2..17859721a263 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2377,13 +2377,13 @@ static void cancel_charge(struct mem_cgroup *memcg, unsigned int nr_pages)
 
 static void lock_page_lru(struct page *page, int *isolated)
 {
-	struct zone *zone = page_zone(page);
+	pg_data_t *pgdat = page_pgdat(page);
 
-	spin_lock_irq(zone_lru_lock(zone));
+	spin_lock_irq(&pgdat->lru_lock);
 	if (PageLRU(page)) {
 		struct lruvec *lruvec;
 
-		lruvec = mem_cgroup_page_lruvec(page, zone->zone_pgdat);
+		lruvec = mem_cgroup_page_lruvec(page, pgdat);
 		ClearPageLRU(page);
 		del_page_from_lru_list(page, lruvec, page_lru(page));
 		*isolated = 1;
@@ -2393,17 +2393,17 @@ static void lock_page_lru(struct page *page, int *isolated)
 
 static void unlock_page_lru(struct page *page, int isolated)
 {
-	struct zone *zone = page_zone(page);
+	pg_data_t *pgdat = page_pgdat(page);
 
 	if (isolated) {
 		struct lruvec *lruvec;
 
-		lruvec = mem_cgroup_page_lruvec(page, zone->zone_pgdat);
+		lruvec = mem_cgroup_page_lruvec(page, pgdat);
 		VM_BUG_ON_PAGE(PageLRU(page), page);
 		SetPageLRU(page);
 		add_page_to_lru_list(page, lruvec, page_lru(page));
 	}
-	spin_unlock_irq(zone_lru_lock(zone));
+	spin_unlock_irq(&pgdat->lru_lock);
 }
 
 static void commit_charge(struct page *page, struct mem_cgroup *memcg,
@@ -2689,7 +2689,7 @@ void __memcg_kmem_uncharge(struct page *page, int order)
 
 /*
  * Because tail pages are not marked as "used", set it. We're under
- * zone_lru_lock and migration entries setup in all page mappings.
+ * pgdat->lru_lock and migration entries setup in all page mappings.
  */
 void mem_cgroup_split_huge_fixup(struct page *head)
 {
diff --git a/mm/mlock.c b/mm/mlock.c
index 41cc47e28ad6..080f3b36415b 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -182,7 +182,7 @@ static void __munlock_isolation_failed(struct page *page)
 unsigned int munlock_vma_page(struct page *page)
 {
 	int nr_pages;
-	struct zone *zone = page_zone(page);
+	pg_data_t *pgdat = page_pgdat(page);
 
 	/* For try_to_munlock() and to serialize with page migration */
 	BUG_ON(!PageLocked(page));
@@ -194,7 +194,7 @@ unsigned int munlock_vma_page(struct page *page)
 	 * might otherwise copy PageMlocked to part of the tail pages before
 	 * we clear it in the head page. It also stabilizes hpage_nr_pages().
 	 */
-	spin_lock_irq(zone_lru_lock(zone));
+	spin_lock_irq(&pgdat->lru_lock);
 
 	if (!TestClearPageMlocked(page)) {
 		/* Potentially, PTE-mapped THP: do not skip the rest PTEs */
@@ -203,17 +203,17 @@ unsigned int munlock_vma_page(struct page *page)
 	}
 
 	nr_pages = hpage_nr_pages(page);
-	__mod_zone_page_state(zone, NR_MLOCK, -nr_pages);
+	__mod_zone_page_state(page_zone(page), NR_MLOCK, -nr_pages);
 
 	if (__munlock_isolate_lru_page(page, true)) {
-		spin_unlock_irq(zone_lru_lock(zone));
+		spin_unlock_irq(&pgdat->lru_lock);
 		__munlock_isolated_page(page);
 		goto out;
 	}
 	__munlock_isolation_failed(page);
 
 unlock_out:
-	spin_unlock_irq(zone_lru_lock(zone));
+	spin_unlock_irq(&pgdat->lru_lock);
 
 out:
 	return nr_pages - 1;
@@ -298,7 +298,7 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone)
 	pagevec_init(&pvec_putback);
 
 	/* Phase 1: page isolation */
-	spin_lock_irq(zone_lru_lock(zone));
+	spin_lock_irq(&zone->zone_pgdat->lru_lock);
 	for (i = 0; i < nr; i++) {
 		struct page *page = pvec->pages[i];
 
@@ -325,7 +325,7 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone)
 		pvec->pages[i] = NULL;
 	}
 	__mod_zone_page_state(zone, NR_MLOCK, delta_munlocked);
-	spin_unlock_irq(zone_lru_lock(zone));
+	spin_unlock_irq(&zone->zone_pgdat->lru_lock);
 
 	/* Now we can release pins of pages that we are not munlocking */
 	pagevec_release(&pvec_putback);
diff --git a/mm/page_idle.c b/mm/page_idle.c
index b9e4b42b33ab..0b39ec0c945c 100644
--- a/mm/page_idle.c
+++ b/mm/page_idle.c
@@ -31,7 +31,7 @@
 static struct page *page_idle_get_page(unsigned long pfn)
 {
 	struct page *page;
-	struct zone *zone;
+	pg_data_t *pgdat;
 
 	if (!pfn_valid(pfn))
 		return NULL;
@@ -41,13 +41,13 @@ static struct page *page_idle_get_page(unsigned long pfn)
 	    !get_page_unless_zero(page))
 		return NULL;
 
-	zone = page_zone(page);
-	spin_lock_irq(zone_lru_lock(zone));
+	pgdat = page_pgdat(page);
+	spin_lock_irq(&pgdat->lru_lock);
 	if (unlikely(!PageLRU(page))) {
 		put_page(page);
 		page = NULL;
 	}
-	spin_unlock_irq(zone_lru_lock(zone));
+	spin_unlock_irq(&pgdat->lru_lock);
 	return page;
 }
 
diff --git a/mm/rmap.c b/mm/rmap.c
index 0454ecc29537..b30c7c71d1d9 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -27,7 +27,7 @@
  *         mapping->i_mmap_rwsem
  *           anon_vma->rwsem
  *             mm->page_table_lock or pte_lock
- *               zone_lru_lock (in mark_page_accessed, isolate_lru_page)
+ *               pgdat->lru_lock (in mark_page_accessed, isolate_lru_page)
  *               swap_lock (in swap_duplicate, swap_info_get)
  *                 mmlist_lock (in mmput, drain_mmlist and others)
  *                 mapping->private_lock (in __set_page_dirty_buffers)
diff --git a/mm/swap.c b/mm/swap.c
index 4d7d37eb3c40..301ed4e04320 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -58,16 +58,16 @@ static DEFINE_PER_CPU(struct pagevec, activate_page_pvecs);
 static void __page_cache_release(struct page *page)
 {
 	if (PageLRU(page)) {
-		struct zone *zone = page_zone(page);
+		pg_data_t *pgdat = page_pgdat(page);
 		struct lruvec *lruvec;
 		unsigned long flags;
 
-		spin_lock_irqsave(zone_lru_lock(zone), flags);
-		lruvec = mem_cgroup_page_lruvec(page, zone->zone_pgdat);
+		spin_lock_irqsave(&pgdat->lru_lock, flags);
+		lruvec = mem_cgroup_page_lruvec(page, pgdat);
 		VM_BUG_ON_PAGE(!PageLRU(page), page);
 		__ClearPageLRU(page);
 		del_page_from_lru_list(page, lruvec, page_off_lru(page));
-		spin_unlock_irqrestore(zone_lru_lock(zone), flags);
+		spin_unlock_irqrestore(&pgdat->lru_lock, flags);
 	}
 	__ClearPageWaiters(page);
 	mem_cgroup_uncharge(page);
@@ -322,12 +322,12 @@ static inline void activate_page_drain(int cpu)
 
 void activate_page(struct page *page)
 {
-	struct zone *zone = page_zone(page);
+	pg_data_t *pgdat = page_pgdat(page);
 
 	page = compound_head(page);
-	spin_lock_irq(zone_lru_lock(zone));
-	__activate_page(page, mem_cgroup_page_lruvec(page, zone->zone_pgdat), NULL);
-	spin_unlock_irq(zone_lru_lock(zone));
+	spin_lock_irq(&pgdat->lru_lock);
+	__activate_page(page, mem_cgroup_page_lruvec(page, pgdat), NULL);
+	spin_unlock_irq(&pgdat->lru_lock);
 }
 #endif
 
diff --git a/mm/vmscan.c b/mm/vmscan.c
index a9852ed7b97f..2d081a32c6a8 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1614,8 +1614,8 @@ static __always_inline void update_lru_sizes(struct lruvec *lruvec,
 
 }
 
-/*
- * zone_lru_lock is heavily contended.  Some of the functions that
+/**
+ * pgdat->lru_lock is heavily contended.  Some of the functions that
  * shrink the lists perform better by taking out a batch of pages
  * and working on them outside the LRU lock.
  *
@@ -1750,11 +1750,11 @@ int isolate_lru_page(struct page *page)
 	WARN_RATELIMIT(PageTail(page), "trying to isolate tail page");
 
 	if (PageLRU(page)) {
-		struct zone *zone = page_zone(page);
+		pg_data_t *pgdat = page_pgdat(page);
 		struct lruvec *lruvec;
 
-		spin_lock_irq(zone_lru_lock(zone));
-		lruvec = mem_cgroup_page_lruvec(page, zone->zone_pgdat);
+		spin_lock_irq(&pgdat->lru_lock);
+		lruvec = mem_cgroup_page_lruvec(page, pgdat);
 		if (PageLRU(page)) {
 			int lru = page_lru(page);
 			get_page(page);
@@ -1762,7 +1762,7 @@ int isolate_lru_page(struct page *page)
 			del_page_from_lru_list(page, lruvec, lru);
 			ret = 0;
 		}
-		spin_unlock_irq(zone_lru_lock(zone));
+		spin_unlock_irq(&pgdat->lru_lock);
 	}
 	return ret;
 }
@@ -1990,9 +1990,9 @@ shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
  * processes, from rmap.
  *
  * If the pages are mostly unmapped, the processing is fast and it is
- * appropriate to hold zone_lru_lock across the whole operation.  But if
+ * appropriate to hold pgdat->lru_lock across the whole operation.  But if
  * the pages are mapped, the processing is slow (page_referenced()) so we
- * should drop zone_lru_lock around each page.  It's impossible to balance
+ * should drop pgdat->lru_lock around each page.  It's impossible to balance
  * this, so instead we remove the pages from the LRU while processing them.
  * It is safe to rely on PG_active against the non-LRU pages in here because
  * nobody will play with that bit on a non-LRU page.
-- 
2.19.2



* [PATCH 3/5] mm/compaction: pass pgdat to too_many_isolated() instead of zone
  2019-02-22 17:43 [PATCH 1/5] mm/workingset: remove unused @mapping argument in workingset_eviction() Andrey Ryabinin
  2019-02-22 17:43 ` [PATCH 2/5] mm: remove zone_lru_lock() function access ->lru_lock directly Andrey Ryabinin
@ 2019-02-22 17:43 ` Andrey Ryabinin
  2019-02-22 19:01   ` Rik van Riel
  2019-02-25 12:13   ` Vlastimil Babka
  2019-02-22 17:43 ` [PATCH 4/5] mm/vmscan: remove unused lru_pages argument Andrey Ryabinin
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 17+ messages in thread
From: Andrey Ryabinin @ 2019-02-22 17:43 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Andrey Ryabinin, Johannes Weiner,
	Michal Hocko, Vlastimil Babka, Rik van Riel, Mel Gorman

too_many_isolated() in mm/compaction.c looks only at node state,
so it makes more sense to change its argument to pgdat instead of zone.
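
For context, everything the check reads is a node-wide counter; the
heuristic distilled into a hypothetical helper (the helper name and the
numbers in the comment are made up for illustration):

	/*
	 * Throttle compaction once isolated pages exceed half of the
	 * node's LRU pages: e.g. with 400k inactive + 100k active LRU
	 * pages, going past 250k isolated pages trips the check.
	 */
	static bool lru_mostly_isolated(unsigned long inactive,
					unsigned long active,
					unsigned long isolated)
	{
		return isolated > (inactive + active) / 2;
	}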

Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Rik van Riel <riel@surriel.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
---
 mm/compaction.c | 19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/mm/compaction.c b/mm/compaction.c
index a3305f13a138..b2d02aba41d8 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -738,16 +738,16 @@ isolate_freepages_range(struct compact_control *cc,
 }
 
 /* Similar to reclaim, but different enough that they don't share logic */
-static bool too_many_isolated(struct zone *zone)
+static bool too_many_isolated(pg_data_t *pgdat)
 {
 	unsigned long active, inactive, isolated;
 
-	inactive = node_page_state(zone->zone_pgdat, NR_INACTIVE_FILE) +
-			node_page_state(zone->zone_pgdat, NR_INACTIVE_ANON);
-	active = node_page_state(zone->zone_pgdat, NR_ACTIVE_FILE) +
-			node_page_state(zone->zone_pgdat, NR_ACTIVE_ANON);
-	isolated = node_page_state(zone->zone_pgdat, NR_ISOLATED_FILE) +
-			node_page_state(zone->zone_pgdat, NR_ISOLATED_ANON);
+	inactive = node_page_state(pgdat, NR_INACTIVE_FILE) +
+			node_page_state(pgdat, NR_INACTIVE_ANON);
+	active = node_page_state(pgdat, NR_ACTIVE_FILE) +
+			node_page_state(pgdat, NR_ACTIVE_ANON);
+	isolated = node_page_state(pgdat, NR_ISOLATED_FILE) +
+			node_page_state(pgdat, NR_ISOLATED_ANON);
 
 	return isolated > (inactive + active) / 2;
 }
@@ -774,8 +774,7 @@ static unsigned long
 isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 			unsigned long end_pfn, isolate_mode_t isolate_mode)
 {
-	struct zone *zone = cc->zone;
-	pg_data_t *pgdat = zone->zone_pgdat;
+	pg_data_t *pgdat = cc->zone->zone_pgdat;
 	unsigned long nr_scanned = 0, nr_isolated = 0;
 	struct lruvec *lruvec;
 	unsigned long flags = 0;
@@ -791,7 +790,7 @@ isolate_migratepages_block(struct compact_control *cc, unsigned long low_pfn,
 	 * list by either parallel reclaimers or compaction. If there are,
 	 * delay for some time until fewer pages are isolated
 	 */
-	while (unlikely(too_many_isolated(zone))) {
+	while (unlikely(too_many_isolated(pgdat))) {
 		/* async migration should just abort */
 		if (cc->mode == MIGRATE_ASYNC)
 			return 0;
-- 
2.19.2



* [PATCH 4/5] mm/vmscan: remove unused lru_pages argument
  2019-02-22 17:43 [PATCH 1/5] mm/workingset: remove unused @mapping argument in workingset_eviction() Andrey Ryabinin
  2019-02-22 17:43 ` [PATCH 2/5] mm: remove zone_lru_lock() function access ->lru_lock directly Andrey Ryabinin
  2019-02-22 17:43 ` [PATCH 3/5] mm/compaction: pass pgdat to too_many_isolated() instead of zone Andrey Ryabinin
@ 2019-02-22 17:43 ` Andrey Ryabinin
  2019-02-22 18:02   ` Johannes Weiner
  2019-02-25 12:17   ` Vlastimil Babka
  2019-02-22 17:43 ` [PATCH 5/5] mm/vmscan: don't forcibly shrink active anon lru list Andrey Ryabinin
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 17+ messages in thread
From: Andrey Ryabinin @ 2019-02-22 17:43 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Andrey Ryabinin, Johannes Weiner,
	Michal Hocko, Vlastimil Babka, Rik van Riel, Mel Gorman

The argument 'unsigned long *lru_pages' is passed around with no purpose;
remove it.

Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Rik van Riel <riel@surriel.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
---
 mm/vmscan.c | 17 +++++------------
 1 file changed, 5 insertions(+), 12 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 2d081a32c6a8..07f74e9507b6 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2257,8 +2257,7 @@ enum scan_balance {
  * nr[2] = file inactive pages to scan; nr[3] = file active pages to scan
  */
 static void get_scan_count(struct lruvec *lruvec, struct mem_cgroup *memcg,
-			   struct scan_control *sc, unsigned long *nr,
-			   unsigned long *lru_pages)
+			   struct scan_control *sc, unsigned long *nr)
 {
 	int swappiness = mem_cgroup_swappiness(memcg);
 	struct zone_reclaim_stat *reclaim_stat = &lruvec->reclaim_stat;
@@ -2409,7 +2408,6 @@ static void get_scan_count(struct lruvec *lruvec, struct mem_cgroup *memcg,
 	fraction[1] = fp;
 	denominator = ap + fp + 1;
 out:
-	*lru_pages = 0;
 	for_each_evictable_lru(lru) {
 		int file = is_file_lru(lru);
 		unsigned long lruvec_size;
@@ -2525,7 +2523,6 @@ static void get_scan_count(struct lruvec *lruvec, struct mem_cgroup *memcg,
 			BUG();
 		}
 
-		*lru_pages += lruvec_size;
 		nr[lru] = scan;
 	}
 }
@@ -2534,7 +2531,7 @@ static void get_scan_count(struct lruvec *lruvec, struct mem_cgroup *memcg,
  * This is a basic per-node page freer.  Used by both kswapd and direct reclaim.
  */
 static void shrink_node_memcg(struct pglist_data *pgdat, struct mem_cgroup *memcg,
-			      struct scan_control *sc, unsigned long *lru_pages)
+			      struct scan_control *sc)
 {
 	struct lruvec *lruvec = mem_cgroup_lruvec(pgdat, memcg);
 	unsigned long nr[NR_LRU_LISTS];
@@ -2546,7 +2543,7 @@ static void shrink_node_memcg(struct pglist_data *pgdat, struct mem_cgroup *memc
 	struct blk_plug plug;
 	bool scan_adjusted;
 
-	get_scan_count(lruvec, memcg, sc, nr, lru_pages);
+	get_scan_count(lruvec, memcg, sc, nr);
 
 	/* Record the original scan target for proportional adjustments later */
 	memcpy(targets, nr, sizeof(nr));
@@ -2751,7 +2748,6 @@ static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc)
 			.pgdat = pgdat,
 			.priority = sc->priority,
 		};
-		unsigned long node_lru_pages = 0;
 		struct mem_cgroup *memcg;
 
 		memset(&sc->nr, 0, sizeof(sc->nr));
@@ -2761,7 +2757,6 @@ static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc)
 
 		memcg = mem_cgroup_iter(root, NULL, &reclaim);
 		do {
-			unsigned long lru_pages;
 			unsigned long reclaimed;
 			unsigned long scanned;
 
@@ -2798,8 +2793,7 @@ static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc)
 
 			reclaimed = sc->nr_reclaimed;
 			scanned = sc->nr_scanned;
-			shrink_node_memcg(pgdat, memcg, sc, &lru_pages);
-			node_lru_pages += lru_pages;
+			shrink_node_memcg(pgdat, memcg, sc);
 
 			if (sc->may_shrinkslab) {
 				shrink_slab(sc->gfp_mask, pgdat->node_id,
@@ -3332,7 +3326,6 @@ unsigned long mem_cgroup_shrink_node(struct mem_cgroup *memcg,
 		.may_swap = !noswap,
 		.may_shrinkslab = 1,
 	};
-	unsigned long lru_pages;
 
 	sc.gfp_mask = (gfp_mask & GFP_RECLAIM_MASK) |
 			(GFP_HIGHUSER_MOVABLE & ~GFP_RECLAIM_MASK);
@@ -3349,7 +3342,7 @@ unsigned long mem_cgroup_shrink_node(struct mem_cgroup *memcg,
 	 * will pick up pages from other mem cgroup's as well. We hack
 	 * the priority and make it zero.
 	 */
-	shrink_node_memcg(pgdat, memcg, &sc, &lru_pages);
+	shrink_node_memcg(pgdat, memcg, &sc);
 
 	trace_mm_vmscan_memcg_softlimit_reclaim_end(sc.nr_reclaimed);
 
-- 
2.19.2



* [PATCH 5/5] mm/vmscan: don't forcibly shrink active anon lru list
  2019-02-22 17:43 [PATCH 1/5] mm/workingset: remove unused @mapping argument in workingset_eviction() Andrey Ryabinin
                   ` (2 preceding siblings ...)
  2019-02-22 17:43 ` [PATCH 4/5] mm/vmscan: remove unused lru_pages argument Andrey Ryabinin
@ 2019-02-22 17:43 ` Andrey Ryabinin
  2019-02-22 18:22   ` Johannes Weiner
  2019-02-22 17:58 ` [PATCH 1/5] mm/workingset: remove unused @mapping argument in workingset_eviction() Johannes Weiner
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Andrey Ryabinin @ 2019-02-22 17:43 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Andrey Ryabinin, Johannes Weiner,
	Michal Hocko, Vlastimil Babka, Rik van Riel, Mel Gorman

shrink_node_memcg() always forcibly shrinks the active anon list.
This doesn't seem like correct behavior. If the system/memcg has no swap, it's
absolutely pointless to rebalance the anon lru lists.
And in case we did scan the active anon list above, it's unclear why we would
need this additional forced scan. If there are cases where we want a more
aggressive scan of the anon lru, we should just change the scan target
in get_scan_count() (and better explain such cases in the comments).

Remove this forced shrink and let get_scan_count() decide how
much of the active anon list we want to shrink.

Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Rik van Riel <riel@surriel.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
---
 mm/vmscan.c | 12 ++----------
 1 file changed, 2 insertions(+), 10 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 07f74e9507b6..efd10d6b9510 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2563,8 +2563,8 @@ static void shrink_node_memcg(struct pglist_data *pgdat, struct mem_cgroup *memc
 			 sc->priority == DEF_PRIORITY);
 
 	blk_start_plug(&plug);
-	while (nr[LRU_INACTIVE_ANON] || nr[LRU_ACTIVE_FILE] ||
-					nr[LRU_INACTIVE_FILE]) {
+	while (nr[LRU_ACTIVE_ANON] || nr[LRU_INACTIVE_ANON] ||
+		nr[LRU_ACTIVE_FILE] || nr[LRU_INACTIVE_FILE]) {
 		unsigned long nr_anon, nr_file, percentage;
 		unsigned long nr_scanned;
 
@@ -2636,14 +2636,6 @@ static void shrink_node_memcg(struct pglist_data *pgdat, struct mem_cgroup *memc
 	}
 	blk_finish_plug(&plug);
 	sc->nr_reclaimed += nr_reclaimed;
-
-	/*
-	 * Even if we did not try to evict anon pages at all, we want to
-	 * rebalance the anon lru active/inactive ratio.
-	 */
-	if (inactive_list_is_low(lruvec, false, memcg, sc, true))
-		shrink_active_list(SWAP_CLUSTER_MAX, lruvec,
-				   sc, LRU_ACTIVE_ANON);
 }
 
 /* Use reclaim/compaction for costly allocs or under memory pressure */
-- 
2.19.2



* Re: [PATCH 1/5] mm/workingset: remove unused @mapping argument in workingset_eviction()
  2019-02-22 17:43 [PATCH 1/5] mm/workingset: remove unused @mapping argument in workingset_eviction() Andrey Ryabinin
                   ` (3 preceding siblings ...)
  2019-02-22 17:43 ` [PATCH 5/5] mm/vmscan: don't forcibly shrink active anon lru list Andrey Ryabinin
@ 2019-02-22 17:58 ` Johannes Weiner
  2019-02-22 19:02 ` Rik van Riel
  2019-02-25 12:01 ` Vlastimil Babka
  6 siblings, 0 replies; 17+ messages in thread
From: Johannes Weiner @ 2019-02-22 17:58 UTC (permalink / raw)
  To: Andrey Ryabinin
  Cc: Andrew Morton, linux-mm, linux-kernel, Michal Hocko,
	Vlastimil Babka, Rik van Riel, Mel Gorman

On Fri, Feb 22, 2019 at 08:43:33PM +0300, Andrey Ryabinin wrote:
> workingset_eviction() doesn't use and never did use the @mapping argument.
> Remove it.
> 
> Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Rik van Riel <riel@surriel.com>
> Cc: Mel Gorman <mgorman@techsingularity.net>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>


* Re: [PATCH 4/5] mm/vmscan: remove unused lru_pages argument
  2019-02-22 17:43 ` [PATCH 4/5] mm/vmscan: remove unused lru_pages argument Andrey Ryabinin
@ 2019-02-22 18:02   ` Johannes Weiner
  2019-02-25 12:17   ` Vlastimil Babka
  1 sibling, 0 replies; 17+ messages in thread
From: Johannes Weiner @ 2019-02-22 18:02 UTC (permalink / raw)
  To: Andrey Ryabinin
  Cc: Andrew Morton, linux-mm, linux-kernel, Michal Hocko,
	Vlastimil Babka, Rik van Riel, Mel Gorman

On Fri, Feb 22, 2019 at 08:43:36PM +0300, Andrey Ryabinin wrote:
> The argument 'unsigned long *lru_pages' is passed around with no purpose;
> remove it.

The reference to this was removed recently in 9092c71bb724 ("mm: use
sc->priority for slab shrink targets"); might be worth pointing that out
for context.

> Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Rik van Riel <riel@surriel.com>
> Cc: Mel Gorman <mgorman@techsingularity.net>

Acked-by: Johannes Weiner <hannes@cmpxchg.org>


* Re: [PATCH 5/5] mm/vmscan: don't forcibly shrink active anon lru list
  2019-02-22 17:43 ` [PATCH 5/5] mm/vmscan: don't forcibly shrink active anon lru list Andrey Ryabinin
@ 2019-02-22 18:22   ` Johannes Weiner
  2019-02-26 12:04     ` Andrey Ryabinin
  0 siblings, 1 reply; 17+ messages in thread
From: Johannes Weiner @ 2019-02-22 18:22 UTC (permalink / raw)
  To: Andrey Ryabinin
  Cc: Andrew Morton, linux-mm, linux-kernel, Michal Hocko,
	Vlastimil Babka, Rik van Riel, Mel Gorman

On Fri, Feb 22, 2019 at 08:43:37PM +0300, Andrey Ryabinin wrote:
> shrink_node_memcg() always forcibly shrinks the active anon list.
> This doesn't seem like correct behavior. If the system/memcg has no swap, it's
> absolutely pointless to rebalance the anon lru lists.
> And in case we did scan the active anon list above, it's unclear why we would
> need this additional forced scan. If there are cases where we want a more
> aggressive scan of the anon lru, we should just change the scan target
> in get_scan_count() (and better explain such cases in the comments).
> 
> Remove this forced shrink and let get_scan_count() decide how
> much of the active anon list we want to shrink.

This change breaks the anon pre-aging.

The idea behind this is that the VM maintains a small batch of anon
reclaim candidates with recent access information. On every reclaim,
whether we just trim cache (the most common reclaim mode) or we just
swapped out some pages and shrunk the inactive anon list, at the end
we make sure that the list of potential anon candidates is refilled
for the next reclaim cycle.

The comments for this are above inactive_list_is_low() and the
age_active_anon() call from kswapd.

Re: no swap, you are correct. We should gate that rebalancing on
total_swap_pages, just like age_active_anon() does.
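
A minimal sketch of that gating, reusing the hunk this patch removes from
shrink_node_memcg() (names as used elsewhere in this thread; untested):

	/*
	 * Even if we did not try to evict anon pages at all, we want to
	 * rebalance the anon lru active/inactive ratio, but only when
	 * swap exists to give the aged pages somewhere to go.
	 */
	if (total_swap_pages &&
	    inactive_list_is_low(lruvec, false, memcg, sc, true))
		shrink_active_list(SWAP_CLUSTER_MAX, lruvec,
				   sc, LRU_ACTIVE_ANON);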


* Re: [PATCH 3/5] mm/compaction: pass pgdat to too_many_isolated() instead of zone
  2019-02-22 17:43 ` [PATCH 3/5] mm/compaction: pass pgdat to too_many_isolated() instead of zone Andrey Ryabinin
@ 2019-02-22 19:01   ` Rik van Riel
  2019-02-25 12:13   ` Vlastimil Babka
  1 sibling, 0 replies; 17+ messages in thread
From: Rik van Riel @ 2019-02-22 19:01 UTC (permalink / raw)
  To: Andrey Ryabinin, Andrew Morton
  Cc: linux-mm, linux-kernel, Johannes Weiner, Michal Hocko,
	Vlastimil Babka, Mel Gorman


On Fri, 2019-02-22 at 20:43 +0300, Andrey Ryabinin wrote:
> too_many_isolated() in mm/compaction.c looks only at node state,
> so it makes more sense to change its argument to pgdat instead of zone.
> 
> Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Rik van Riel <riel@surriel.com>
> Cc: Mel Gorman <mgorman@techsingularity.net>

Acked-by: Rik van Riel <riel@surriel.com>

-- 
All Rights Reversed.



* Re: [PATCH 1/5] mm/workingset: remove unused @mapping argument in workingset_eviction()
  2019-02-22 17:43 [PATCH 1/5] mm/workingset: remove unused @mapping argument in workingset_eviction() Andrey Ryabinin
                   ` (4 preceding siblings ...)
  2019-02-22 17:58 ` [PATCH 1/5] mm/workingset: remove unused @mapping argument in workingset_eviction() Johannes Weiner
@ 2019-02-22 19:02 ` Rik van Riel
  2019-02-25 12:01 ` Vlastimil Babka
  6 siblings, 0 replies; 17+ messages in thread
From: Rik van Riel @ 2019-02-22 19:02 UTC (permalink / raw)
  To: Andrey Ryabinin, Andrew Morton
  Cc: linux-mm, linux-kernel, Johannes Weiner, Michal Hocko,
	Vlastimil Babka, Mel Gorman


On Fri, 2019-02-22 at 20:43 +0300, Andrey Ryabinin wrote:
> workingset_eviction() doesn't use and never did use the @mapping
> argument.
> Remove it.
> 
> Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Rik van Riel <riel@surriel.com>
> Cc: Mel Gorman <mgorman@techsingularity.net>

Acked-by: Rik van Riel <riel@surriel.com>

-- 
All Rights Reversed.



* Re: [PATCH 1/5] mm/workingset: remove unused @mapping argument in workingset_eviction()
  2019-02-22 17:43 [PATCH 1/5] mm/workingset: remove unused @mapping argument in workingset_eviction() Andrey Ryabinin
                   ` (5 preceding siblings ...)
  2019-02-22 19:02 ` Rik van Riel
@ 2019-02-25 12:01 ` Vlastimil Babka
  2019-02-26 12:07   ` Andrey Ryabinin
  6 siblings, 1 reply; 17+ messages in thread
From: Vlastimil Babka @ 2019-02-25 12:01 UTC (permalink / raw)
  To: Andrey Ryabinin, Andrew Morton
  Cc: linux-mm, linux-kernel, Johannes Weiner, Michal Hocko,
	Rik van Riel, Mel Gorman

On 2/22/19 6:43 PM, Andrey Ryabinin wrote:
> workingset_eviction() doesn't use and never did use the @mapping argument.
> Remove it.
> 
> Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Rik van Riel <riel@surriel.com>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> ---
>  include/linux/swap.h | 2 +-
>  mm/vmscan.c          | 2 +-
>  mm/workingset.c      | 3 +--
>  3 files changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index 649529be91f2..fc50e21b3b88 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -307,7 +307,7 @@ struct vma_swap_readahead {
>  };
>  
>  /* linux/mm/workingset.c */
> -void *workingset_eviction(struct address_space *mapping, struct page *page);
> +void *workingset_eviction(struct page *page);
>  void workingset_refault(struct page *page, void *shadow);
>  void workingset_activation(struct page *page);
>  
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index ac4806f0f332..a9852ed7b97f 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -952,7 +952,7 @@ static int __remove_mapping(struct address_space *mapping, struct page *page,
>  		 */
>  		if (reclaimed && page_is_file_cache(page) &&
>  		    !mapping_exiting(mapping) && !dax_mapping(mapping))
> -			shadow = workingset_eviction(mapping, page);
> +			shadow = workingset_eviction(page);
>  		__delete_from_page_cache(page, shadow);
>  		xa_unlock_irqrestore(&mapping->i_pages, flags);
>  
> diff --git a/mm/workingset.c b/mm/workingset.c
> index dcb994f2acc2..0906137760c5 100644
> --- a/mm/workingset.c
> +++ b/mm/workingset.c
> @@ -215,13 +215,12 @@ static void unpack_shadow(void *shadow, int *memcgidp, pg_data_t **pgdat,
>  
>  /**
>   * workingset_eviction - note the eviction of a page from memory
> - * @mapping: address space the page was backing
>   * @page: the page being evicted
>   *
>   * Returns a shadow entry to be stored in @mapping->i_pages in place

The line above still references @mapping, I guess kerneldoc build will
complain?

>   * of the evicted @page so that a later refault can be detected.
>   */
> -void *workingset_eviction(struct address_space *mapping, struct page *page)
> +void *workingset_eviction(struct page *page)
>  {
>  	struct pglist_data *pgdat = page_pgdat(page);
>  	struct mem_cgroup *memcg = page_memcg(page);
> 



* Re: [PATCH 2/5] mm: remove zone_lru_lock() function access ->lru_lock directly
  2019-02-22 17:43 ` [PATCH 2/5] mm: remove zone_lru_lock() function access ->lru_lock directly Andrey Ryabinin
@ 2019-02-25 12:10   ` Vlastimil Babka
  0 siblings, 0 replies; 17+ messages in thread
From: Vlastimil Babka @ 2019-02-25 12:10 UTC (permalink / raw)
  To: Andrey Ryabinin, Andrew Morton
  Cc: linux-mm, linux-kernel, Johannes Weiner, Michal Hocko,
	Rik van Riel, Mel Gorman

s/function/and/ on the subject?

On 2/22/19 6:43 PM, Andrey Ryabinin wrote:
> We have a common pattern for accessing the lru_lock from a page pointer:
> 	zone_lru_lock(page_zone(page))
> 
> Which is silly, because it unfolds to this:
> 	&NODE_DATA(page_to_nid(page))->node_zones[page_zonenum(page)].zone_pgdat->lru_lock
> while we can simply do
> 	&NODE_DATA(page_to_nid(page))->lru_lock

Heh, well spotted.

> 
> Remove the zone_lru_lock() function, since it only complicates things.
> Use the 'page_pgdat(page)->lru_lock' pattern instead.
> 
> Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

Thanks.


* Re: [PATCH 3/5] mm/compaction: pass pgdat to too_many_isolated() instead of zone
  2019-02-22 17:43 ` [PATCH 3/5] mm/compaction: pass pgdat to too_many_isolated() instead of zone Andrey Ryabinin
  2019-02-22 19:01   ` Rik van Riel
@ 2019-02-25 12:13   ` Vlastimil Babka
  1 sibling, 0 replies; 17+ messages in thread
From: Vlastimil Babka @ 2019-02-25 12:13 UTC (permalink / raw)
  To: Andrey Ryabinin, Andrew Morton
  Cc: linux-mm, linux-kernel, Johannes Weiner, Michal Hocko,
	Rik van Riel, Mel Gorman

On 2/22/19 6:43 PM, Andrey Ryabinin wrote:
> too_many_isolated() in mm/compaction.c looks only at node state,
> so it makes more sense to change its argument to pgdat instead of zone.
> 
> Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>

Acked-by: Vlastimil Babka <vbabka@suse.cz>

> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Rik van Riel <riel@surriel.com>
> Cc: Mel Gorman <mgorman@techsingularity.net>


* Re: [PATCH 4/5] mm/vmscan: remove unused lru_pages argument
  2019-02-22 17:43 ` [PATCH 4/5] mm/vmscan: remove unused lru_pages argument Andrey Ryabinin
  2019-02-22 18:02   ` Johannes Weiner
@ 2019-02-25 12:17   ` Vlastimil Babka
  1 sibling, 0 replies; 17+ messages in thread
From: Vlastimil Babka @ 2019-02-25 12:17 UTC (permalink / raw)
  To: Andrey Ryabinin, Andrew Morton
  Cc: linux-mm, linux-kernel, Johannes Weiner, Michal Hocko,
	Rik van Riel, Mel Gorman

On 2/22/19 6:43 PM, Andrey Ryabinin wrote:
> The argument 'unsigned long *lru_pages' is passed around with no purpose;
> remove it.
> 
> Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>

With what Johannes has said,

Acked-by: Vlastimil Babka <vbabka@suse.cz>

> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Rik van Riel <riel@surriel.com>
> Cc: Mel Gorman <mgorman@techsingularity.net>


* Re: [PATCH 5/5] mm/vmscan: don't forcibly shrink active anon lru list
  2019-02-22 18:22   ` Johannes Weiner
@ 2019-02-26 12:04     ` Andrey Ryabinin
  2019-02-26 14:42       ` Rik van Riel
  0 siblings, 1 reply; 17+ messages in thread
From: Andrey Ryabinin @ 2019-02-26 12:04 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, linux-mm, linux-kernel, Michal Hocko,
	Vlastimil Babka, Rik van Riel, Mel Gorman

On 2/22/19 9:22 PM, Johannes Weiner wrote:
> On Fri, Feb 22, 2019 at 08:43:37PM +0300, Andrey Ryabinin wrote:
>> shrink_node_memcg() always forcibly shrinks the active anon list.
>> This doesn't seem like correct behavior. If the system/memcg has no swap, it's
>> absolutely pointless to rebalance the anon lru lists.
>> And in case we did scan the active anon list above, it's unclear why we would
>> need this additional forced scan. If there are cases where we want a more
>> aggressive scan of the anon lru, we should just change the scan target
>> in get_scan_count() (and better explain such cases in the comments).
>>
>> Remove this forced shrink and let get_scan_count() decide how
>> much of the active anon list we want to shrink.
> 
> This change breaks the anon pre-aging.
> 
> The idea behind this is that the VM maintains a small batch of anon
> reclaim candidates with recent access information. On every reclaim,
> whether we just trim cache (the most common reclaim mode) or we just
> swapped out some pages and shrunk the inactive anon list, at the end
> we make sure that the list of potential anon candidates is refilled
> for the next reclaim cycle.
> 
> The comments for this are above inactive_list_is_low() and the
> age_active_anon() call from kswapd.
> 
> Re: no swap, you are correct. We should gate that rebalancing on
> total_swap_pages, just like age_active_anon() does.
> 


I think we should leave anon aging only for !SCAN_FILE cases.
At least, aging was definitely invented for the SCAN_FRACT mode, which was the
main mode at the time it was added by this commit:

	556adecba110bf5f1db6c6b56416cfab5bcab698
	Author: Rik van Riel <riel@redhat.com>
	Date:   Sat Oct 18 20:26:34 2008 -0700

	    vmscan: second chance replacement for anonymous pages


Later we got more SCAN_FILE mode usage, with this commit:

e9868505987a03a26a3979f27b82911ccc003752
Author: Rik van Riel <riel@redhat.com>
Date:   Tue Dec 11 16:01:10 2012 -0800

    mm,vmscan: only evict file pages when we have plenty


and I think it would be reasonable to avoid anon aging in the SCAN_FILE case.
Because if the workload generates enough inactive file pages we never switch to SCAN_FRACT,
so aging is just as useless as in the no-swap case.

So, how about something like below on top of the patch?

---
 mm/vmscan.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index efd10d6b9510..6c63adfee37b 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2525,6 +2525,15 @@ static void get_scan_count(struct lruvec *lruvec, struct mem_cgroup *memcg,
 
 		nr[lru] = scan;
 	}
+
+	/*
+	 * Even if we did not try to evict anon pages at all, we want to
+	 * rebalance the anon lru active/inactive ratio to maintain
+	 * enough reclaim candidates for the next reclaim cycle.
+	 */
+	if (scan_balance != SCAN_FILE && inactive_list_is_low(lruvec,
+						false, memcg, sc, false))
+		nr[LRU_ACTIVE_ANON] += SWAP_CLUSTER_MAX;
 }
 
 /*
-- 
2.19.2






* Re: [PATCH 1/5] mm/workingset: remove unused @mapping argument in workingset_eviction()
  2019-02-25 12:01 ` Vlastimil Babka
@ 2019-02-26 12:07   ` Andrey Ryabinin
  0 siblings, 0 replies; 17+ messages in thread
From: Andrey Ryabinin @ 2019-02-26 12:07 UTC (permalink / raw)
  To: Vlastimil Babka, Andrew Morton
  Cc: linux-mm, linux-kernel, Johannes Weiner, Michal Hocko,
	Rik van Riel, Mel Gorman



On 2/25/19 3:01 PM, Vlastimil Babka wrote:
> On 2/22/19 6:43 PM, Andrey Ryabinin wrote:
>> workingset_eviction() doesn't use and never did use the @mapping argument.
>> Remove it.
>>
>> Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
>> Cc: Johannes Weiner <hannes@cmpxchg.org>
>> Cc: Michal Hocko <mhocko@kernel.org>
>> Cc: Vlastimil Babka <vbabka@suse.cz>
>> Cc: Rik van Riel <riel@surriel.com>
>> Cc: Mel Gorman <mgorman@techsingularity.net>
>> ---
>>  include/linux/swap.h | 2 +-
>>  mm/vmscan.c          | 2 +-
>>  mm/workingset.c      | 3 +--
>>  3 files changed, 3 insertions(+), 4 deletions(-)
>>
>> diff --git a/include/linux/swap.h b/include/linux/swap.h
>> index 649529be91f2..fc50e21b3b88 100644
>> --- a/include/linux/swap.h
>> +++ b/include/linux/swap.h
>> @@ -307,7 +307,7 @@ struct vma_swap_readahead {
>>  };
>>  
>>  /* linux/mm/workingset.c */
>> -void *workingset_eviction(struct address_space *mapping, struct page *page);
>> +void *workingset_eviction(struct page *page);
>>  void workingset_refault(struct page *page, void *shadow);
>>  void workingset_activation(struct page *page);
>>  
>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>> index ac4806f0f332..a9852ed7b97f 100644
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -952,7 +952,7 @@ static int __remove_mapping(struct address_space *mapping, struct page *page,
>>  		 */
>>  		if (reclaimed && page_is_file_cache(page) &&
>>  		    !mapping_exiting(mapping) && !dax_mapping(mapping))
>> -			shadow = workingset_eviction(mapping, page);
>> +			shadow = workingset_eviction(page);
>>  		__delete_from_page_cache(page, shadow);
>>  		xa_unlock_irqrestore(&mapping->i_pages, flags);
>>  
>> diff --git a/mm/workingset.c b/mm/workingset.c
>> index dcb994f2acc2..0906137760c5 100644
>> --- a/mm/workingset.c
>> +++ b/mm/workingset.c
>> @@ -215,13 +215,12 @@ static void unpack_shadow(void *shadow, int *memcgidp, pg_data_t **pgdat,
>>  
>>  /**
>>   * workingset_eviction - note the eviction of a page from memory
>> - * @mapping: address space the page was backing
>>   * @page: the page being evicted
>>   *
>>   * Returns a shadow entry to be stored in @mapping->i_pages in place
> 
> The line above still references @mapping, I guess kerneldoc build will
> complain?
> 

Maybe. Will replace it with @page->mapping->i_pages
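
A sketch of how the kerneldoc would then read (wording assumed from the
reply above, not from a posted v2):

	/**
	 * workingset_eviction - note the eviction of a page from memory
	 * @page: the page being evicted
	 *
	 * Returns a shadow entry to be stored in @page->mapping->i_pages
	 * in place of the evicted @page so that a later refault can be
	 * detected.
	 */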


* Re: [PATCH 5/5] mm/vmscan: don't forcibly shrink active anon lru list
  2019-02-26 12:04     ` Andrey Ryabinin
@ 2019-02-26 14:42       ` Rik van Riel
  0 siblings, 0 replies; 17+ messages in thread
From: Rik van Riel @ 2019-02-26 14:42 UTC (permalink / raw)
  To: Andrey Ryabinin, Johannes Weiner
  Cc: Andrew Morton, linux-mm, linux-kernel, Michal Hocko,
	Vlastimil Babka, Mel Gorman


On Tue, 2019-02-26 at 15:04 +0300, Andrey Ryabinin wrote:

> I think we should leave anon aging only for !SCAN_FILE cases.
> At least, aging was definitely invented for the SCAN_FRACT mode, which
> was the main mode at the time it was added by this commit:

> and I think it would be reasonable to avoid anon aging in the
> SCAN_FILE case.
> Because if the workload generates enough inactive file pages we never
> switch to SCAN_FRACT,
> so aging is just as useless as in the no-swap case.

There are a few different cases here.

If you NEVER end up scanning or evicting anonymous
pages, scanning them is indeed a waste of time.

However, if you occasionally end up pushing something
into swap, it is very useful to know that the pages
that did get pushed to swap had been sitting on the
inactive list for a very long time, and had not been
used in that time.

To limit the amount of wasted work, only SWAP_CLUSTER_MAX
pages are moved from the active_anon list to the inactive_anon
list at a time.

I suppose that could be gated behind a check whether or
not the system has swap space configured, so no anon
pages are ever scanned if the system has no swap space.

-- 
All Rights Reversed.


