linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 0/9] A few cleanup and fixup patches for vmscan
@ 2022-04-09  9:34 Miaohe Lin
  2022-04-09  9:34 ` [PATCH v2 1/9] mm/vmscan: add a comment about MADV_FREE pages check in folio_check_dirty_writeback Miaohe Lin
                   ` (8 more replies)
  0 siblings, 9 replies; 33+ messages in thread
From: Miaohe Lin @ 2022-04-09  9:34 UTC (permalink / raw)
  To: akpm
  Cc: ying.huang, songmuchun, hch, willy, linux-mm, linux-kernel, linmiaohe

Hi everyone,
This series contains a few patches to remove an obsolete comment,
introduce a helper to remove duplicated code, and so on. Also, all base
pages of a THP are now taken into account in a rare race condition. More
details can be found in the respective changelogs. Thanks!

---
v2:
  patch 1/9: drop code change and add a comment about MADV_FREE
  patch 2/9: simplify the code further and change to goto keep_locked
  patch 3/9: use folio, remove unneeded inline, and break overly long lines
  patch 5/9: activate swap-backed executable folios after first usage too
  patch 9/9: new cleanup patch split from 5/9
  Many thanks to Huang Ying, Matthew Wilcox, Christoph Hellwig and
  Muchun Song for the review!
---
Miaohe Lin (9):
  mm/vmscan: add a comment about MADV_FREE pages check in
    folio_check_dirty_writeback
  mm/vmscan: remove unneeded can_split_huge_page check
  mm/vmscan: introduce helper function reclaim_page_list()
  mm/vmscan: save a bit of stack space in shrink_lruvec
  mm/vmscan: activate swap-backed executable folios after first usage
  mm/vmscan: take all base pages of THP into account when race with
    speculative reference
  mm/vmscan: take min_slab_pages into account when try to call
    shrink_node
  mm/vmscan: remove obsolete comment in kswapd_run
  mm/vmscan: use helper folio_is_file_lru()

 mm/vmscan.c | 92 ++++++++++++++++++++++++++---------------------------
 1 file changed, 45 insertions(+), 47 deletions(-)

-- 
2.23.0



* [PATCH v2 1/9] mm/vmscan: add a comment about MADV_FREE pages check in folio_check_dirty_writeback
  2022-04-09  9:34 [PATCH v2 0/9] A few cleanup and fixup patches for vmscan Miaohe Lin
@ 2022-04-09  9:34 ` Miaohe Lin
  2022-04-11 11:50   ` ying.huang
                     ` (2 more replies)
  2022-04-09  9:34 ` [PATCH v2 2/9] mm/vmscan: remove unneeded can_split_huge_page check Miaohe Lin
                   ` (7 subsequent siblings)
  8 siblings, 3 replies; 33+ messages in thread
From: Miaohe Lin @ 2022-04-09  9:34 UTC (permalink / raw)
  To: akpm
  Cc: ying.huang, songmuchun, hch, willy, linux-mm, linux-kernel, linmiaohe

The MADV_FREE pages check in folio_check_dirty_writeback is a bit hard to
follow. Add a comment to make the code clear.

Suggested-by: Huang, Ying <ying.huang@intel.com>
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
---
 mm/vmscan.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index c77d5052f230..4a76be47bed1 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1436,6 +1436,9 @@ static void folio_check_dirty_writeback(struct folio *folio,
 	/*
 	 * Anonymous pages are not handled by flushers and must be written
 	 * from reclaim context. Do not stall reclaim based on them
+	 * MADV_FREE anonymous pages are put into inactive file list too.
+	 * They could be mistakenly treated as file lru. So further anon
+	 * test is needed.
 	 */
 	if (!folio_is_file_lru(folio) ||
 	    (folio_test_anon(folio) && !folio_test_swapbacked(folio))) {
-- 
2.23.0



* [PATCH v2 2/9] mm/vmscan: remove unneeded can_split_huge_page check
  2022-04-09  9:34 [PATCH v2 0/9] A few cleanup and fixup patches for vmscan Miaohe Lin
  2022-04-09  9:34 ` [PATCH v2 1/9] mm/vmscan: add a comment about MADV_FREE pages check in folio_check_dirty_writeback Miaohe Lin
@ 2022-04-09  9:34 ` Miaohe Lin
  2022-04-11 14:18   ` Christoph Hellwig
                     ` (2 more replies)
  2022-04-09  9:34 ` [PATCH v2 3/9] mm/vmscan: introduce helper function reclaim_page_list() Miaohe Lin
                   ` (6 subsequent siblings)
  8 siblings, 3 replies; 33+ messages in thread
From: Miaohe Lin @ 2022-04-09  9:34 UTC (permalink / raw)
  To: akpm
  Cc: ying.huang, songmuchun, hch, willy, linux-mm, linux-kernel, linmiaohe

We don't need to check can_split_folio() because folio_maybe_dma_pinned()
is checked beforehand, which prevents long-term pinned pages from being
swapped out. And we can live with short-term pinned pages. Without the
can_split_folio() check the code can be simplified. Also, activate_locked
can be changed to keep_locked as it's just short-term pinning.

Suggested-by: Huang, Ying <ying.huang@intel.com>
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
---
 mm/vmscan.c | 22 ++++++++--------------
 1 file changed, 8 insertions(+), 14 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 4a76be47bed1..01f5db75a507 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1711,20 +1711,14 @@ static unsigned int shrink_page_list(struct list_head *page_list,
 					goto keep_locked;
 				if (folio_maybe_dma_pinned(folio))
 					goto keep_locked;
-				if (PageTransHuge(page)) {
-					/* cannot split THP, skip it */
-					if (!can_split_folio(folio, NULL))
-						goto activate_locked;
-					/*
-					 * Split pages without a PMD map right
-					 * away. Chances are some or all of the
-					 * tail pages can be freed without IO.
-					 */
-					if (!folio_entire_mapcount(folio) &&
-					    split_folio_to_list(folio,
-								page_list))
-						goto activate_locked;
-				}
+				/*
+				 * Split pages without a PMD map right
+				 * away. Chances are some or all of the
+				 * tail pages can be freed without IO.
+				 */
+				if (PageTransHuge(page) && !folio_entire_mapcount(folio) &&
+				    split_folio_to_list(folio, page_list))
+					goto keep_locked;
 				if (!add_to_swap(page)) {
 					if (!PageTransHuge(page))
 						goto activate_locked_split;
-- 
2.23.0



* [PATCH v2 3/9] mm/vmscan: introduce helper function reclaim_page_list()
  2022-04-09  9:34 [PATCH v2 0/9] A few cleanup and fixup patches for vmscan Miaohe Lin
  2022-04-09  9:34 ` [PATCH v2 1/9] mm/vmscan: add a comment about MADV_FREE pages check in folio_check_dirty_writeback Miaohe Lin
  2022-04-09  9:34 ` [PATCH v2 2/9] mm/vmscan: remove unneeded can_split_huge_page check Miaohe Lin
@ 2022-04-09  9:34 ` Miaohe Lin
  2022-04-09 13:44   ` Matthew Wilcox
  2022-04-09  9:34 ` [PATCH v2 4/9] mm/vmscan: save a bit of stack space in shrink_lruvec Miaohe Lin
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 33+ messages in thread
From: Miaohe Lin @ 2022-04-09  9:34 UTC (permalink / raw)
  To: akpm
  Cc: ying.huang, songmuchun, hch, willy, linux-mm, linux-kernel, linmiaohe

Introduce a helper function, reclaim_page_list(), to eliminate the
duplicated code around shrink_page_list() and putback_lru_page(). This
also separates node reclaim from the node page list operation. No
functional change intended.

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
---
 mm/vmscan.c | 50 +++++++++++++++++++++++++-------------------------
 1 file changed, 25 insertions(+), 25 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 01f5db75a507..59b96320f481 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2531,14 +2531,12 @@ static void shrink_active_list(unsigned long nr_to_scan,
 			nr_deactivate, nr_rotated, sc->priority, file);
 }
 
-unsigned long reclaim_pages(struct list_head *page_list)
+static unsigned int reclaim_page_list(struct list_head *page_list,
+				      struct pglist_data *pgdat)
 {
-	int nid = NUMA_NO_NODE;
-	unsigned int nr_reclaimed = 0;
-	LIST_HEAD(node_page_list);
 	struct reclaim_stat dummy_stat;
-	struct page *page;
-	unsigned int noreclaim_flag;
+	unsigned int nr_reclaimed;
+	struct folio *folio;
 	struct scan_control sc = {
 		.gfp_mask = GFP_KERNEL,
 		.may_writepage = 1,
@@ -2547,6 +2545,24 @@ unsigned long reclaim_pages(struct list_head *page_list)
 		.no_demotion = 1,
 	};
 
+	nr_reclaimed = shrink_page_list(page_list, pgdat, &sc, &dummy_stat, false);
+	while (!list_empty(page_list)) {
+		folio = lru_to_folio(page_list);
+		list_del(&folio->lru);
+		putback_lru_page(&folio->page);
+	}
+
+	return nr_reclaimed;
+}
+
+unsigned long reclaim_pages(struct list_head *page_list)
+{
+	int nid = NUMA_NO_NODE;
+	unsigned int nr_reclaimed = 0;
+	LIST_HEAD(node_page_list);
+	struct page *page;
+	unsigned int noreclaim_flag;
+
 	noreclaim_flag = memalloc_noreclaim_save();
 
 	while (!list_empty(page_list)) {
@@ -2562,28 +2578,12 @@ unsigned long reclaim_pages(struct list_head *page_list)
 			continue;
 		}
 
-		nr_reclaimed += shrink_page_list(&node_page_list,
-						NODE_DATA(nid),
-						&sc, &dummy_stat, false);
-		while (!list_empty(&node_page_list)) {
-			page = lru_to_page(&node_page_list);
-			list_del(&page->lru);
-			putback_lru_page(page);
-		}
-
+		nr_reclaimed += reclaim_page_list(&node_page_list, NODE_DATA(nid));
 		nid = NUMA_NO_NODE;
 	}
 
-	if (!list_empty(&node_page_list)) {
-		nr_reclaimed += shrink_page_list(&node_page_list,
-						NODE_DATA(nid),
-						&sc, &dummy_stat, false);
-		while (!list_empty(&node_page_list)) {
-			page = lru_to_page(&node_page_list);
-			list_del(&page->lru);
-			putback_lru_page(page);
-		}
-	}
+	if (!list_empty(&node_page_list))
+		nr_reclaimed += reclaim_page_list(&node_page_list, NODE_DATA(nid));
 
 	memalloc_noreclaim_restore(noreclaim_flag);
 
-- 
2.23.0



* [PATCH v2 4/9] mm/vmscan: save a bit of stack space in shrink_lruvec
  2022-04-09  9:34 [PATCH v2 0/9] A few cleanup and fixup patches for vmscan Miaohe Lin
                   ` (2 preceding siblings ...)
  2022-04-09  9:34 ` [PATCH v2 3/9] mm/vmscan: introduce helper function reclaim_page_list() Miaohe Lin
@ 2022-04-09  9:34 ` Miaohe Lin
  2022-04-12  0:57   ` ying.huang
  2022-04-09  9:34 ` [PATCH v2 5/9] mm/vmscan: activate swap-backed executable folios after first usage Miaohe Lin
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 33+ messages in thread
From: Miaohe Lin @ 2022-04-09  9:34 UTC (permalink / raw)
  To: akpm
  Cc: ying.huang, songmuchun, hch, willy, linux-mm, linux-kernel, linmiaohe

LRU_UNEVICTABLE is not taken into account when shrinking the lruvec, so
we can save a bit of stack space by shrinking the array size of nr and
targets to NR_LRU_LISTS - 1. No functional change intended.

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
---
 mm/vmscan.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 59b96320f481..0e5818970998 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2881,8 +2881,9 @@ static bool can_age_anon_pages(struct pglist_data *pgdat,
 
 static void shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc)
 {
-	unsigned long nr[NR_LRU_LISTS];
-	unsigned long targets[NR_LRU_LISTS];
+	/* LRU_UNEVICTABLE is not taken into account. */
+	unsigned long nr[NR_LRU_LISTS - 1];
+	unsigned long targets[NR_LRU_LISTS - 1];
 	unsigned long nr_to_scan;
 	enum lru_list lru;
 	unsigned long nr_reclaimed = 0;
-- 
2.23.0



* [PATCH v2 5/9] mm/vmscan: activate swap-backed executable folios after first usage
  2022-04-09  9:34 [PATCH v2 0/9] A few cleanup and fixup patches for vmscan Miaohe Lin
                   ` (3 preceding siblings ...)
  2022-04-09  9:34 ` [PATCH v2 4/9] mm/vmscan: save a bit of stack space in shrink_lruvec Miaohe Lin
@ 2022-04-09  9:34 ` Miaohe Lin
  2022-04-12  0:59   ` ying.huang
  2022-04-09  9:34 ` [PATCH v2 6/9] mm/vmscan: take all base pages of THP into account when race with speculative reference Miaohe Lin
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 33+ messages in thread
From: Miaohe Lin @ 2022-04-09  9:34 UTC (permalink / raw)
  To: akpm
  Cc: ying.huang, songmuchun, hch, willy, linux-mm, linux-kernel, linmiaohe

We should activate swap-backed executable folios (e.g. on tmpfs) after
their first use so that executable code gets a better chance to stay in
memory.

Suggested-by: Huang, Ying <ying.huang@intel.com>
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 mm/vmscan.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 0e5818970998..cc1193e320c2 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1412,9 +1412,9 @@ static enum page_references folio_check_references(struct folio *folio,
 			return PAGEREF_ACTIVATE;
 
 		/*
-		 * Activate file-backed executable folios after first usage.
+		 * Activate executable folios after first usage.
 		 */
-		if ((vm_flags & VM_EXEC) && !folio_test_swapbacked(folio))
+		if (vm_flags & VM_EXEC)
 			return PAGEREF_ACTIVATE;
 
 		return PAGEREF_KEEP;
-- 
2.23.0



* [PATCH v2 6/9] mm/vmscan: take all base pages of THP into account when race with speculative reference
  2022-04-09  9:34 [PATCH v2 0/9] A few cleanup and fixup patches for vmscan Miaohe Lin
                   ` (4 preceding siblings ...)
  2022-04-09  9:34 ` [PATCH v2 5/9] mm/vmscan: activate swap-backed executable folios after first usage Miaohe Lin
@ 2022-04-09  9:34 ` Miaohe Lin
  2022-04-09  9:34 ` [PATCH v2 7/9] mm/vmscan: take min_slab_pages into account when try to call shrink_node Miaohe Lin
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 33+ messages in thread
From: Miaohe Lin @ 2022-04-09  9:34 UTC (permalink / raw)
  To: akpm
  Cc: ying.huang, songmuchun, hch, willy, linux-mm, linux-kernel, linmiaohe

If the page has buffers, shrink_page_list() will try to free the buffer
mappings associated with the page and then try to free the page as well.
In the rare race with a speculative reference, the page will shortly be
freed by that speculative reference. But nr_reclaimed is not incremented
correctly when we come across a THP: all of its base pages need to be
accounted for in this case.

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
---
 mm/vmscan.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index cc1193e320c2..53f1d0755b34 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1878,7 +1878,7 @@ static unsigned int shrink_page_list(struct list_head *page_list,
 					 * increment nr_reclaimed here (and
 					 * leave it off the LRU).
 					 */
-					nr_reclaimed++;
+					nr_reclaimed += nr_pages;
 					continue;
 				}
 			}
-- 
2.23.0



* [PATCH v2 7/9] mm/vmscan: take min_slab_pages into account when try to call shrink_node
  2022-04-09  9:34 [PATCH v2 0/9] A few cleanup and fixup patches for vmscan Miaohe Lin
                   ` (5 preceding siblings ...)
  2022-04-09  9:34 ` [PATCH v2 6/9] mm/vmscan: take all base pages of THP into account when race with speculative reference Miaohe Lin
@ 2022-04-09  9:34 ` Miaohe Lin
  2022-04-12  1:36   ` ying.huang
  2022-04-09  9:34 ` [PATCH v2 8/9] mm/vmscan: remove obsolete comment in kswapd_run Miaohe Lin
  2022-04-09  9:35 ` [PATCH v2 9/9] mm/vmscan: use helper folio_is_file_lru() Miaohe Lin
  8 siblings, 1 reply; 33+ messages in thread
From: Miaohe Lin @ 2022-04-09  9:34 UTC (permalink / raw)
  To: akpm
  Cc: ying.huang, songmuchun, hch, willy, linux-mm, linux-kernel, linmiaohe

Since commit 6b4f7799c6a5 ("mm: vmscan: invoke slab shrinkers from
shrink_zone()"), slab reclaim and LRU page reclaim are done together in
shrink_node(). So we should take min_slab_pages into account when
deciding whether to call shrink_node().

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
---
 mm/vmscan.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 53f1d0755b34..ba83d8f3e53e 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4714,7 +4714,8 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
 	noreclaim_flag = memalloc_noreclaim_save();
 	set_task_reclaim_state(p, &sc.reclaim_state);
 
-	if (node_pagecache_reclaimable(pgdat) > pgdat->min_unmapped_pages) {
+	if (node_pagecache_reclaimable(pgdat) > pgdat->min_unmapped_pages ||
+	    node_page_state_pages(pgdat, NR_SLAB_RECLAIMABLE_B) > pgdat->min_slab_pages) {
 		/*
 		 * Free memory by calling shrink node with increasing
 		 * priorities until we have enough memory freed.
-- 
2.23.0



* [PATCH v2 8/9] mm/vmscan: remove obsolete comment in kswapd_run
  2022-04-09  9:34 [PATCH v2 0/9] A few cleanup and fixup patches for vmscan Miaohe Lin
                   ` (6 preceding siblings ...)
  2022-04-09  9:34 ` [PATCH v2 7/9] mm/vmscan: take min_slab_pages into account when try to call shrink_node Miaohe Lin
@ 2022-04-09  9:34 ` Miaohe Lin
  2022-04-09  9:35 ` [PATCH v2 9/9] mm/vmscan: use helper folio_is_file_lru() Miaohe Lin
  8 siblings, 0 replies; 33+ messages in thread
From: Miaohe Lin @ 2022-04-09  9:34 UTC (permalink / raw)
  To: akpm
  Cc: ying.huang, songmuchun, hch, willy, linux-mm, linux-kernel, linmiaohe

Since commit 6b700b5b3c59 ("mm/vmscan.c: remove cpu online notification
for now"), the CPU online notification has been removed, so kswapd won't
be moved to the proper CPUs when CPUs are hot-added. Remove this
obsolete comment.

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
---
 mm/vmscan.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index ba83d8f3e53e..79310b3cbbe3 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4567,7 +4567,6 @@ unsigned long shrink_all_memory(unsigned long nr_to_reclaim)
 
 /*
  * This kswapd start function will be called by init and node-hot-add.
- * On node-hot-add, kswapd will moved to proper cpus if cpus are hot-added.
  */
 void kswapd_run(int nid)
 {
-- 
2.23.0



* [PATCH v2 9/9] mm/vmscan: use helper folio_is_file_lru()
  2022-04-09  9:34 [PATCH v2 0/9] A few cleanup and fixup patches for vmscan Miaohe Lin
                   ` (7 preceding siblings ...)
  2022-04-09  9:34 ` [PATCH v2 8/9] mm/vmscan: remove obsolete comment in kswapd_run Miaohe Lin
@ 2022-04-09  9:35 ` Miaohe Lin
  8 siblings, 0 replies; 33+ messages in thread
From: Miaohe Lin @ 2022-04-09  9:35 UTC (permalink / raw)
  To: akpm
  Cc: ying.huang, songmuchun, hch, willy, linux-mm, linux-kernel, linmiaohe

Use the helper folio_is_file_lru() to check whether a folio is on the
file LRU. Minor readability improvement.

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
---
 mm/vmscan.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 79310b3cbbe3..208ce417f1a4 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1421,7 +1421,7 @@ static enum page_references folio_check_references(struct folio *folio,
 	}
 
 	/* Reclaim if clean, defer dirty folios to writeback */
-	if (referenced_folio && !folio_test_swapbacked(folio))
+	if (referenced_folio && folio_is_file_lru(folio))
 		return PAGEREF_RECLAIM_CLEAN;
 
 	return PAGEREF_RECLAIM;
-- 
2.23.0



* Re: [PATCH v2 3/9] mm/vmscan: introduce helper function reclaim_page_list()
  2022-04-09  9:34 ` [PATCH v2 3/9] mm/vmscan: introduce helper function reclaim_page_list() Miaohe Lin
@ 2022-04-09 13:44   ` Matthew Wilcox
  2022-04-11  1:53     ` Miaohe Lin
  0 siblings, 1 reply; 33+ messages in thread
From: Matthew Wilcox @ 2022-04-09 13:44 UTC (permalink / raw)
  To: Miaohe Lin; +Cc: akpm, ying.huang, songmuchun, hch, linux-mm, linux-kernel

On Sat, Apr 09, 2022 at 05:34:54PM +0800, Miaohe Lin wrote:
> +	nr_reclaimed = shrink_page_list(page_list, pgdat, &sc, &dummy_stat, false);
> +	while (!list_empty(page_list)) {
> +		folio = lru_to_folio(page_list);
> +		list_del(&folio->lru);
> +		putback_lru_page(&folio->page);

folio_putback_lru()



* Re: [PATCH v2 3/9] mm/vmscan: introduce helper function reclaim_page_list()
  2022-04-09 13:44   ` Matthew Wilcox
@ 2022-04-11  1:53     ` Miaohe Lin
  2022-04-11  3:17       ` Matthew Wilcox
  0 siblings, 1 reply; 33+ messages in thread
From: Miaohe Lin @ 2022-04-11  1:53 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: akpm, ying.huang, songmuchun, hch, linux-mm, linux-kernel

On 2022/4/9 21:44, Matthew Wilcox wrote:
> On Sat, Apr 09, 2022 at 05:34:54PM +0800, Miaohe Lin wrote:
>> +	nr_reclaimed = shrink_page_list(page_list, pgdat, &sc, &dummy_stat, false);
>> +	while (!list_empty(page_list)) {
>> +		folio = lru_to_folio(page_list);
>> +		list_del(&folio->lru);
>> +		putback_lru_page(&folio->page);
> 
> folio_putback_lru()

I thought folio_putback_lru() was deliberately left unused because there is no caller of it at the moment.
But it seems I was wrong. Will do it in the next version.

Thanks a lot!

> 
> .
> 



* Re: [PATCH v2 3/9] mm/vmscan: introduce helper function reclaim_page_list()
  2022-04-11  1:53     ` Miaohe Lin
@ 2022-04-11  3:17       ` Matthew Wilcox
  2022-04-11  3:26         ` Miaohe Lin
  0 siblings, 1 reply; 33+ messages in thread
From: Matthew Wilcox @ 2022-04-11  3:17 UTC (permalink / raw)
  To: Miaohe Lin; +Cc: akpm, ying.huang, songmuchun, hch, linux-mm, linux-kernel

On Mon, Apr 11, 2022 at 09:53:15AM +0800, Miaohe Lin wrote:
> On 2022/4/9 21:44, Matthew Wilcox wrote:
> > On Sat, Apr 09, 2022 at 05:34:54PM +0800, Miaohe Lin wrote:
> >> +	nr_reclaimed = shrink_page_list(page_list, pgdat, &sc, &dummy_stat, false);
> >> +	while (!list_empty(page_list)) {
> >> +		folio = lru_to_folio(page_list);
> >> +		list_del(&folio->lru);
> >> +		putback_lru_page(&folio->page);
> > 
> > folio_putback_lru()
> 
> I thought folio_putback_lru() was deliberately left unused because there is no caller of it at the moment.
> But it seems I was wrong. Will do it in the next version.

Looks like all of the uses of it that I mooted during the last merge
window ended up going away.

https://lore.kernel.org/all/20220204195852.1751729-47-willy@infradead.org/
was obsoleted by commit b109b87050df

https://lore.kernel.org/all/20220204195852.1751729-48-willy@infradead.org/
and
https://lore.kernel.org/all/20220204195852.1751729-50-willy@infradead.org/
were also obsoleted by Hugh's mlock changes

I also sent
https://lore.kernel.org/all/YjJJIrENYb1qFHzl@casper.infradead.org/

but never quite got it up to submittable quality.


* Re: [PATCH v2 3/9] mm/vmscan: introduce helper function reclaim_page_list()
  2022-04-11  3:17       ` Matthew Wilcox
@ 2022-04-11  3:26         ` Miaohe Lin
  0 siblings, 0 replies; 33+ messages in thread
From: Miaohe Lin @ 2022-04-11  3:26 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: akpm, ying.huang, songmuchun, hch, linux-mm, linux-kernel

On 2022/4/11 11:17, Matthew Wilcox wrote:
> On Mon, Apr 11, 2022 at 09:53:15AM +0800, Miaohe Lin wrote:
>> On 2022/4/9 21:44, Matthew Wilcox wrote:
>>> On Sat, Apr 09, 2022 at 05:34:54PM +0800, Miaohe Lin wrote:
>>>> +	nr_reclaimed = shrink_page_list(page_list, pgdat, &sc, &dummy_stat, false);
>>>> +	while (!list_empty(page_list)) {
>>>> +		folio = lru_to_folio(page_list);
>>>> +		list_del(&folio->lru);
>>>> +		putback_lru_page(&folio->page);
>>>
>>> folio_putback_lru()
>>
>> I thought folio_putback_lru() was deliberately left unused because there is no caller of it at the moment.
>> But it seems I was wrong. Will do it in the next version.
> 
> Looks like all of the uses of it that I mooted during the last merge
> window ended up going away.
> 
> https://lore.kernel.org/all/20220204195852.1751729-47-willy@infradead.org/
> was obsoleted by commit b109b87050df
> 
> https://lore.kernel.org/all/20220204195852.1751729-48-willy@infradead.org/
> and
> https://lore.kernel.org/all/20220204195852.1751729-50-willy@infradead.org/
> were also obsoleted by Hugh's mlock changes
> 
> I also sent
> https://lore.kernel.org/all/YjJJIrENYb1qFHzl@casper.infradead.org/
> 
> but never quite got it up to submittable quality.

I see. This is really a pity. Many thanks for clarifying. :)

> 
> .
> 



* Re: [PATCH v2 1/9] mm/vmscan: add a comment about MADV_FREE pages check in folio_check_dirty_writeback
  2022-04-09  9:34 ` [PATCH v2 1/9] mm/vmscan: add a comment about MADV_FREE pages check in folio_check_dirty_writeback Miaohe Lin
@ 2022-04-11 11:50   ` ying.huang
  2022-04-12  3:07     ` Miaohe Lin
  2022-04-11 14:16   ` Christoph Hellwig
  2022-04-12  8:44   ` Oscar Salvador
  2 siblings, 1 reply; 33+ messages in thread
From: ying.huang @ 2022-04-11 11:50 UTC (permalink / raw)
  To: Miaohe Lin, akpm; +Cc: songmuchun, hch, willy, linux-mm, linux-kernel

On Sat, 2022-04-09 at 17:34 +0800, Miaohe Lin wrote:
> The MADV_FREE pages check in folio_check_dirty_writeback is a bit hard to
> follow. Add a comment to make the code clear.
> 
> Suggested-by: Huang, Ying <ying.huang@intel.com>
> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
> ---
>  mm/vmscan.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index c77d5052f230..4a76be47bed1 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1436,6 +1436,9 @@ static void folio_check_dirty_writeback(struct folio *folio,
>         /*
>          * Anonymous pages are not handled by flushers and must be written
>          * from reclaim context. Do not stall reclaim based on them
> +        * MADV_FREE anonymous pages are put into inactive file list too.
> +        * They could be mistakenly treated as file lru. So further anon
> +        * test is needed.
>          */

How about something as follows,

	/*
	 * Anonymous pages (including MADV_FREE ones) are not handled by
	 * flushers and must be written from reclaim context. Do not stall
	 * reclaim based on them.
	 */

Best Regards,
Huang, Ying

>         if (!folio_is_file_lru(folio) ||
>             (folio_test_anon(folio) && !folio_test_swapbacked(folio))) {




* Re: [PATCH v2 1/9] mm/vmscan: add a comment about MADV_FREE pages check in folio_check_dirty_writeback
  2022-04-09  9:34 ` [PATCH v2 1/9] mm/vmscan: add a comment about MADV_FREE pages check in folio_check_dirty_writeback Miaohe Lin
  2022-04-11 11:50   ` ying.huang
@ 2022-04-11 14:16   ` Christoph Hellwig
  2022-04-12  3:15     ` Miaohe Lin
  2022-04-12  8:44   ` Oscar Salvador
  2 siblings, 1 reply; 33+ messages in thread
From: Christoph Hellwig @ 2022-04-11 14:16 UTC (permalink / raw)
  To: Miaohe Lin
  Cc: akpm, ying.huang, songmuchun, hch, willy, linux-mm, linux-kernel

On Sat, Apr 09, 2022 at 05:34:52PM +0800, Miaohe Lin wrote:
> The MADV_FREE pages check in folio_check_dirty_writeback is a bit hard to
> follow. Add a comment to make the code clear.
> 
> Suggested-by: Huang, Ying <ying.huang@intel.com>
> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
> ---
>  mm/vmscan.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index c77d5052f230..4a76be47bed1 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1436,6 +1436,9 @@ static void folio_check_dirty_writeback(struct folio *folio,
>  	/*
>  	 * Anonymous pages are not handled by flushers and must be written
>  	 * from reclaim context. Do not stall reclaim based on them

While you touch this please add a period at the end of the above
sentence.

Otherwise looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH v2 2/9] mm/vmscan: remove unneeded can_split_huge_page check
  2022-04-09  9:34 ` [PATCH v2 2/9] mm/vmscan: remove unneeded can_split_huge_page check Miaohe Lin
@ 2022-04-11 14:18   ` Christoph Hellwig
  2022-04-12  3:14     ` Miaohe Lin
  2022-04-12  0:52   ` ying.huang
  2022-04-12  8:59   ` Oscar Salvador
  2 siblings, 1 reply; 33+ messages in thread
From: Christoph Hellwig @ 2022-04-11 14:18 UTC (permalink / raw)
  To: Miaohe Lin
  Cc: akpm, ying.huang, songmuchun, hch, willy, linux-mm, linux-kernel

On Sat, Apr 09, 2022 at 05:34:53PM +0800, Miaohe Lin wrote:
> We don't need to check can_split_folio() because folio_maybe_dma_pinned()
> is checked beforehand, which prevents long-term pinned pages from being
> swapped out. And we can live with short-term pinned pages. Without the
> can_split_folio() check the code can be simplified. Also, activate_locked
> can be changed to keep_locked as it's just short-term pinning.
> 
> Suggested-by: Huang, Ying <ying.huang@intel.com>
> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
> ---
>  mm/vmscan.c | 22 ++++++++--------------
>  1 file changed, 8 insertions(+), 14 deletions(-)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 4a76be47bed1..01f5db75a507 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1711,20 +1711,14 @@ static unsigned int shrink_page_list(struct list_head *page_list,
>  					goto keep_locked;
>  				if (folio_maybe_dma_pinned(folio))
>  					goto keep_locked;
> -				if (PageTransHuge(page)) {
> -					/* cannot split THP, skip it */
> -					if (!can_split_folio(folio, NULL))
> -						goto activate_locked;
> -					/*
> -					 * Split pages without a PMD map right
> -					 * away. Chances are some or all of the
> -					 * tail pages can be freed without IO.
> -					 */
> -					if (!folio_entire_mapcount(folio) &&
> -					    split_folio_to_list(folio,
> -								page_list))
> -						goto activate_locked;
> -				}
> +				/*
> +				 * Split pages without a PMD map right
> +				 * away. Chances are some or all of the
> +				 * tail pages can be freed without IO.
> +				 */

This could use more of the line length and be more readable:

				/*
				 * Split pages without a PMD map right away.
				 * Chances are some or all of the tail pages
				 * can be freed without IO.
				 */

> +				if (PageTransHuge(page) && !folio_entire_mapcount(folio) &&

Please put the folio_entire_mapcount() check on a separate line to make
this a bit more readable.


* Re: [PATCH v2 2/9] mm/vmscan: remove unneeded can_split_huge_page check
  2022-04-09  9:34 ` [PATCH v2 2/9] mm/vmscan: remove unneeded can_split_huge_page check Miaohe Lin
  2022-04-11 14:18   ` Christoph Hellwig
@ 2022-04-12  0:52   ` ying.huang
  2022-04-12  8:59   ` Oscar Salvador
  2 siblings, 0 replies; 33+ messages in thread
From: ying.huang @ 2022-04-12  0:52 UTC (permalink / raw)
  To: Miaohe Lin, akpm; +Cc: songmuchun, hch, willy, linux-mm, linux-kernel

On Sat, 2022-04-09 at 17:34 +0800, Miaohe Lin wrote:
> We don't need to check can_split_folio() because folio_maybe_dma_pinned()
> is checked beforehand, which prevents long-term pinned pages from being
> swapped out. And we can live with short-term pinned pages. Without the
> can_split_folio() check the code can be simplified. Also, activate_locked
> can be changed to keep_locked as it's just short-term pinning.
> 
> Suggested-by: Huang, Ying <ying.huang@intel.com>
> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>

Looks good to me.  Thanks!

Reviewed-by: Huang, Ying <ying.huang@intel.com>

Best Regards,
Huang, Ying

> ---
>  mm/vmscan.c | 22 ++++++++--------------
>  1 file changed, 8 insertions(+), 14 deletions(-)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 4a76be47bed1..01f5db75a507 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1711,20 +1711,14 @@ static unsigned int shrink_page_list(struct list_head *page_list,
>  					goto keep_locked;
>  				if (folio_maybe_dma_pinned(folio))
>  					goto keep_locked;
> -				if (PageTransHuge(page)) {
> -					/* cannot split THP, skip it */
> -					if (!can_split_folio(folio, NULL))
> -						goto activate_locked;
> -					/*
> -					 * Split pages without a PMD map right
> -					 * away. Chances are some or all of the
> -					 * tail pages can be freed without IO.
> -					 */
> -					if (!folio_entire_mapcount(folio) &&
> -					    split_folio_to_list(folio,
> -								page_list))
> -						goto activate_locked;
> -				}
> +				/*
> +				 * Split pages without a PMD map right
> +				 * away. Chances are some or all of the
> +				 * tail pages can be freed without IO.
> +				 */
> +				if (PageTransHuge(page) && !folio_entire_mapcount(folio) &&
> +				    split_folio_to_list(folio, page_list))
> +					goto keep_locked;
>  				if (!add_to_swap(page)) {
>  					if (!PageTransHuge(page))
>  						goto activate_locked_split;



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 4/9] mm/vmscan: save a bit of stack space in shrink_lruvec
  2022-04-09  9:34 ` [PATCH v2 4/9] mm/vmscan: save a bit of stack space in shrink_lruvec Miaohe Lin
@ 2022-04-12  0:57   ` ying.huang
  2022-04-12  3:13     ` Miaohe Lin
  0 siblings, 1 reply; 33+ messages in thread
From: ying.huang @ 2022-04-12  0:57 UTC (permalink / raw)
  To: Miaohe Lin, akpm; +Cc: songmuchun, hch, willy, linux-mm, linux-kernel

On Sat, 2022-04-09 at 17:34 +0800, Miaohe Lin wrote:
> LRU_UNEVICTABLE is not taken into account when shrinking the lruvec. So we can
> save a bit of stack space by shrinking the array size of nr and targets
> to NR_LRU_LISTS - 1. No functional change intended.
> 
> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
> ---
>  mm/vmscan.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 59b96320f481..0e5818970998 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2881,8 +2881,9 @@ static bool can_age_anon_pages(struct pglist_data *pgdat,
>  
> 
>  static void shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc)
>  {
> -	unsigned long nr[NR_LRU_LISTS];
> -	unsigned long targets[NR_LRU_LISTS];
> +	/* LRU_UNEVICTABLE is not taken into account. */
> +	unsigned long nr[NR_LRU_LISTS - 1];
> +	unsigned long targets[NR_LRU_LISTS - 1];
>  	unsigned long nr_to_scan;
>  	enum lru_list lru;
>  	unsigned long nr_reclaimed = 0;

As Christoph pointed out, this is hacky without much benefit.  Please
drop this patch.

Best Regards,
Huang, Ying


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 5/9] mm/vmscan: activate swap-backed executable folios after first usage
  2022-04-09  9:34 ` [PATCH v2 5/9] mm/vmscan: activate swap-backed executable folios after first usage Miaohe Lin
@ 2022-04-12  0:59   ` ying.huang
  0 siblings, 0 replies; 33+ messages in thread
From: ying.huang @ 2022-04-12  0:59 UTC (permalink / raw)
  To: Miaohe Lin, akpm; +Cc: songmuchun, hch, willy, linux-mm, linux-kernel

On Sat, 2022-04-09 at 17:34 +0800, Miaohe Lin wrote:
> We should activate swap-backed executable folios (e.g. tmpfs) after first
> usage so that executable code gets an even better chance to stay in memory.
> 
> Suggested-by: Huang, Ying <ying.huang@intel.com>
> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
> Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Reviewed-by: Huang, Ying <ying.huang@intel.com>

Best Regards,
Huang, Ying

> ---
>  mm/vmscan.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 0e5818970998..cc1193e320c2 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1412,9 +1412,9 @@ static enum page_references folio_check_references(struct folio *folio,
>  			return PAGEREF_ACTIVATE;
>  
> 
> 
> 
>  		/*
> -		 * Activate file-backed executable folios after first usage.
> +		 * Activate executable folios after first usage.
>  		 */
> -		if ((vm_flags & VM_EXEC) && !folio_test_swapbacked(folio))
> +		if (vm_flags & VM_EXEC)
>  			return PAGEREF_ACTIVATE;
>  
> 
> 
> 
>  		return PAGEREF_KEEP;



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 7/9] mm/vmscan: take min_slab_pages into account when try to call shrink_node
  2022-04-09  9:34 ` [PATCH v2 7/9] mm/vmscan: take min_slab_pages into account when try to call shrink_node Miaohe Lin
@ 2022-04-12  1:36   ` ying.huang
  0 siblings, 0 replies; 33+ messages in thread
From: ying.huang @ 2022-04-12  1:36 UTC (permalink / raw)
  To: Miaohe Lin, akpm, Christoph Hellwig
  Cc: songmuchun, hch, willy, linux-mm, linux-kernel, Johannes Weiner

On Sat, 2022-04-09 at 17:34 +0800, Miaohe Lin wrote:
> Since commit 6b4f7799c6a5 ("mm: vmscan: invoke slab shrinkers from
> shrink_zone()"), slab reclaim and lru page reclaim are done together
> in the shrink_node. So we should take min_slab_pages into account
> when try to call shrink_node.
> 

Looks reasonable to me, copying Johannes.

Hi, Christoph,

Should we check min_unmapped_pages and min_slab_pages in shrink_node()
to avoid LRU reclaim when only slab should be shrunk, and vice versa?
This may be done via another two scan_control flags.

Best Regards,
Huang, Ying

> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
> ---
>  mm/vmscan.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 53f1d0755b34..ba83d8f3e53e 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -4714,7 +4714,8 @@ static int __node_reclaim(struct pglist_data *pgdat, gfp_t gfp_mask, unsigned in
>  	noreclaim_flag = memalloc_noreclaim_save();
>  	set_task_reclaim_state(p, &sc.reclaim_state);
>  
> 
> 
> 
> -	if (node_pagecache_reclaimable(pgdat) > pgdat->min_unmapped_pages) {
> +	if (node_pagecache_reclaimable(pgdat) > pgdat->min_unmapped_pages ||
> +	    node_page_state_pages(pgdat, NR_SLAB_RECLAIMABLE_B) > pgdat->min_slab_pages) {
>  		/*
>  		 * Free memory by calling shrink node with increasing
>  		 * priorities until we have enough memory freed.



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 1/9] mm/vmscan: add a comment about MADV_FREE pages check in folio_check_dirty_writeback
  2022-04-11 11:50   ` ying.huang
@ 2022-04-12  3:07     ` Miaohe Lin
  0 siblings, 0 replies; 33+ messages in thread
From: Miaohe Lin @ 2022-04-12  3:07 UTC (permalink / raw)
  To: ying.huang; +Cc: songmuchun, hch, willy, linux-mm, linux-kernel, Andrew Morton

On 2022/4/11 19:50, ying.huang@intel.com wrote:
> On Sat, 2022-04-09 at 17:34 +0800, Miaohe Lin wrote:
>> The MADV_FREE pages check in folio_check_dirty_writeback is a bit hard to
>> follow. Add a comment to make the code clear.
>>
>> Suggested-by: Huang, Ying <ying.huang@intel.com>
>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
>> ---
>>  mm/vmscan.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>> index c77d5052f230..4a76be47bed1 100644
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -1436,6 +1436,9 @@ static void folio_check_dirty_writeback(struct folio
>> *folio,
>>         /*
>>          * Anonymous pages are not handled by flushers and must be written
>>          * from reclaim context. Do not stall reclaim based on them
>> +        * MADV_FREE anonymous pages are put into inactive file list too.
>> +        * They could be mistakenly treated as file lru. So further anon
>> +        * test is needed.
>>          */
> 
> How about something as follows,
> 
> 	/*
>          * Anonymous pages (including MADV_FREE ones) are not handled
> 	 * by flushers and must be written from reclaim context. Do not
> 	 * stall reclaim based on them.
> 	 */
> 

This comment looks good, but it doesn't seem to explain the MADV_FREE check logic.
Is it already sufficient? If so, I will change to use this version. Thanks!

> Best Regards,
> Huang, Ying
> 
>>         if (!folio_is_file_lru(folio) ||
>>             (folio_test_anon(folio) && !folio_test_swapbacked(folio))) {
> 
> 
> .
> 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 4/9] mm/vmscan: save a bit of stack space in shrink_lruvec
  2022-04-12  0:57   ` ying.huang
@ 2022-04-12  3:13     ` Miaohe Lin
  0 siblings, 0 replies; 33+ messages in thread
From: Miaohe Lin @ 2022-04-12  3:13 UTC (permalink / raw)
  To: ying.huang, akpm; +Cc: songmuchun, hch, willy, linux-mm, linux-kernel

On 2022/4/12 8:57, ying.huang@intel.com wrote:
> On Sat, 2022-04-09 at 17:34 +0800, Miaohe Lin wrote:
>> LRU_UNEVICTABLE is not taken into account when shrinking the lruvec. So we can
>> save a bit of stack space by shrinking the array size of nr and targets
>> to NR_LRU_LISTS - 1. No functional change intended.
>>
>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
>> ---
>>  mm/vmscan.c | 5 +++--
>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>> index 59b96320f481..0e5818970998 100644
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -2881,8 +2881,9 @@ static bool can_age_anon_pages(struct pglist_data *pgdat,
>>  
>>
>>  static void shrink_lruvec(struct lruvec *lruvec, struct scan_control *sc)
>>  {
>> -	unsigned long nr[NR_LRU_LISTS];
>> -	unsigned long targets[NR_LRU_LISTS];
>> +	/* LRU_UNEVICTABLE is not taken into account. */
>> +	unsigned long nr[NR_LRU_LISTS - 1];
>> +	unsigned long targets[NR_LRU_LISTS - 1];
>>  	unsigned long nr_to_scan;
>>  	enum lru_list lru;
>>  	unsigned long nr_reclaimed = 0;
> 
> As Christoph pointed out, this is hacky without much benefit.  Please
> drop this patch.

Will drop this patch. Thanks.

> 
> Best Regards,
> Huang, Ying
> 
> .
> 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 2/9] mm/vmscan: remove unneeded can_split_huge_page check
  2022-04-11 14:18   ` Christoph Hellwig
@ 2022-04-12  3:14     ` Miaohe Lin
  0 siblings, 0 replies; 33+ messages in thread
From: Miaohe Lin @ 2022-04-12  3:14 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: akpm, ying.huang, songmuchun, willy, linux-mm, linux-kernel

On 2022/4/11 22:18, Christoph Hellwig wrote:
> On Sat, Apr 09, 2022 at 05:34:53PM +0800, Miaohe Lin wrote:
>> We don't need to check can_split_folio() because folio_maybe_dma_pinned()
>> is checked before. It will avoid the long term pinned pages to be swapped
>> out. And we can live with short term pinned pages. Without can_split_folio
>> checking we can simplify the code. Also activate_locked can be changed to
>> keep_locked as it's just short term pinning.
>>
>> Suggested-by: Huang, Ying <ying.huang@intel.com>
>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
>> ---
>>  mm/vmscan.c | 22 ++++++++--------------
>>  1 file changed, 8 insertions(+), 14 deletions(-)
>>
>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>> index 4a76be47bed1..01f5db75a507 100644
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -1711,20 +1711,14 @@ static unsigned int shrink_page_list(struct list_head *page_list,
>>  					goto keep_locked;
>>  				if (folio_maybe_dma_pinned(folio))
>>  					goto keep_locked;
>> -				if (PageTransHuge(page)) {
>> -					/* cannot split THP, skip it */
>> -					if (!can_split_folio(folio, NULL))
>> -						goto activate_locked;
>> -					/*
>> -					 * Split pages without a PMD map right
>> -					 * away. Chances are some or all of the
>> -					 * tail pages can be freed without IO.
>> -					 */
>> -					if (!folio_entire_mapcount(folio) &&
>> -					    split_folio_to_list(folio,
>> -								page_list))
>> -						goto activate_locked;
>> -				}
>> +				/*
>> +				 * Split pages without a PMD map right
>> +				 * away. Chances are some or all of the
>> +				 * tail pages can be freed without IO.
>> +				 */
> 
> This could use more of the line length and be more readable:
> 
> 				/*
> 				 * Split pages without a PMD map right away.
> 				 * Chances are some or all of the tail pages
> 				 * can be freed without IO.
> 				 */
> 

Looks better.

>> +				if (PageTransHuge(page) && !folio_entire_mapcount(folio) &&
> 
> Please put the folio_entire_mapcount on a separate line to make this a
> bit more readable.

Will do in the next version. Many thanks for your comment!

> .
> 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 1/9] mm/vmscan: add a comment about MADV_FREE pages check in folio_check_dirty_writeback
  2022-04-11 14:16   ` Christoph Hellwig
@ 2022-04-12  3:15     ` Miaohe Lin
  0 siblings, 0 replies; 33+ messages in thread
From: Miaohe Lin @ 2022-04-12  3:15 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: akpm, ying.huang, songmuchun, willy, linux-mm, linux-kernel

On 2022/4/11 22:16, Christoph Hellwig wrote:
> On Sat, Apr 09, 2022 at 05:34:52PM +0800, Miaohe Lin wrote:
>> The MADV_FREE pages check in folio_check_dirty_writeback is a bit hard to
>> follow. Add a comment to make the code clear.
>>
>> Suggested-by: Huang, Ying <ying.huang@intel.com>
>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
>> ---
>>  mm/vmscan.c | 3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>> index c77d5052f230..4a76be47bed1 100644
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -1436,6 +1436,9 @@ static void folio_check_dirty_writeback(struct folio *folio,
>>  	/*
>>  	 * Anonymous pages are not handled by flushers and must be written
>>  	 * from reclaim context. Do not stall reclaim based on them
> 
> While you touch this please add a period at the end of the above
> sentence.

Will do.

> 
> Otherwise looks good:
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>

Many thanks for review!

> .
> 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 1/9] mm/vmscan: add a comment about MADV_FREE pages check in folio_check_dirty_writeback
  2022-04-09  9:34 ` [PATCH v2 1/9] mm/vmscan: add a comment about MADV_FREE pages check in folio_check_dirty_writeback Miaohe Lin
  2022-04-11 11:50   ` ying.huang
  2022-04-11 14:16   ` Christoph Hellwig
@ 2022-04-12  8:44   ` Oscar Salvador
  2 siblings, 0 replies; 33+ messages in thread
From: Oscar Salvador @ 2022-04-12  8:44 UTC (permalink / raw)
  To: Miaohe Lin
  Cc: akpm, ying.huang, songmuchun, hch, willy, linux-mm, linux-kernel

On Sat, Apr 09, 2022 at 05:34:52PM +0800, Miaohe Lin wrote:
> The MADV_FREE pages check in folio_check_dirty_writeback is a bit hard to
> follow. Add a comment to make the code clear.
> 
> Suggested-by: Huang, Ying <ying.huang@intel.com>
> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>

Reviewed-by: Oscar Salvador <osalvador@suse.de>


-- 
Oscar Salvador
SUSE Labs

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 2/9] mm/vmscan: remove unneeded can_split_huge_page check
  2022-04-09  9:34 ` [PATCH v2 2/9] mm/vmscan: remove unneeded can_split_huge_page check Miaohe Lin
  2022-04-11 14:18   ` Christoph Hellwig
  2022-04-12  0:52   ` ying.huang
@ 2022-04-12  8:59   ` Oscar Salvador
  2022-04-12 13:42     ` Miaohe Lin
  2 siblings, 1 reply; 33+ messages in thread
From: Oscar Salvador @ 2022-04-12  8:59 UTC (permalink / raw)
  To: Miaohe Lin
  Cc: akpm, ying.huang, songmuchun, hch, willy, linux-mm, linux-kernel

On Sat, Apr 09, 2022 at 05:34:53PM +0800, Miaohe Lin wrote:
> We don't need to check can_split_folio() because folio_maybe_dma_pinned()
> is checked before. It will avoid the long term pinned pages to be swapped
> out. And we can live with short term pinned pages. Without can_split_folio
> checking we can simplify the code. Also activate_locked can be changed to
> keep_locked as it's just short term pinning.

What do you mean by "we can live with short term pinned pages"?
Does it mean that it was not pinned when we check
folio_maybe_dma_pinned() but now it is?

To me it looks like the pinning is fluctuating and we rely on
split_folio_to_list() to see whether we succeed or not, and if not
we give it another spin in the next round?

> Suggested-by: Huang, Ying <ying.huang@intel.com>
> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
> ---
>  mm/vmscan.c | 22 ++++++++--------------
>  1 file changed, 8 insertions(+), 14 deletions(-)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 4a76be47bed1..01f5db75a507 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1711,20 +1711,14 @@ static unsigned int shrink_page_list(struct list_head *page_list,
>  					goto keep_locked;
>  				if (folio_maybe_dma_pinned(folio))
>  					goto keep_locked;
> -				if (PageTransHuge(page)) {
> -					/* cannot split THP, skip it */
> -					if (!can_split_folio(folio, NULL))
> -						goto activate_locked;
> -					/*
> -					 * Split pages without a PMD map right
> -					 * away. Chances are some or all of the
> -					 * tail pages can be freed without IO.
> -					 */
> -					if (!folio_entire_mapcount(folio) &&
> -					    split_folio_to_list(folio,
> -								page_list))
> -						goto activate_locked;
> -				}
> +				/*
> +				 * Split pages without a PMD map right
> +				 * away. Chances are some or all of the
> +				 * tail pages can be freed without IO.
> +				 */
> +				if (PageTransHuge(page) && !folio_entire_mapcount(folio) &&
> +				    split_folio_to_list(folio, page_list))
> +					goto keep_locked;
>  				if (!add_to_swap(page)) {
>  					if (!PageTransHuge(page))
>  						goto activate_locked_split;
> -- 
> 2.23.0
> 
> 

-- 
Oscar Salvador
SUSE Labs

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 2/9] mm/vmscan: remove unneeded can_split_huge_page check
  2022-04-12  8:59   ` Oscar Salvador
@ 2022-04-12 13:42     ` Miaohe Lin
  2022-04-12 14:59       ` David Hildenbrand
  0 siblings, 1 reply; 33+ messages in thread
From: Miaohe Lin @ 2022-04-12 13:42 UTC (permalink / raw)
  To: Oscar Salvador
  Cc: akpm, ying.huang, songmuchun, hch, willy, linux-mm, linux-kernel

On 2022/4/12 16:59, Oscar Salvador wrote:
> On Sat, Apr 09, 2022 at 05:34:53PM +0800, Miaohe Lin wrote:
>> We don't need to check can_split_folio() because folio_maybe_dma_pinned()
>> is checked before. It will avoid the long term pinned pages to be swapped
>> out. And we can live with short term pinned pages. Without can_split_folio
>> checking we can simplify the code. Also activate_locked can be changed to
>> keep_locked as it's just short term pinning.
> 
> What do you mean by "we can live with short term pinned pages"?
> Does it mean that it was not pinned when we check
> folio_maybe_dma_pinned() but now it is?
> 
> To me it looks like the pinning is fluctuating and we rely on
> split_folio_to_list() to see whether we succeed or not, and if not
> we give it another spin in the next round?

Yes. "Short term" pinned pages are short term relative to long-term pinned pages:
they won't stay pinned for a noticeable time. So the folio is expected to split
successfully in the next round, as the pinning is really fluctuating. Or am I
missing something?

Many thanks for your comment and reply!

> 
>> Suggested-by: Huang, Ying <ying.huang@intel.com>
>> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
>> ---
>>  mm/vmscan.c | 22 ++++++++--------------
>>  1 file changed, 8 insertions(+), 14 deletions(-)
>>
>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>> index 4a76be47bed1..01f5db75a507 100644
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -1711,20 +1711,14 @@ static unsigned int shrink_page_list(struct list_head *page_list,
>>  					goto keep_locked;
>>  				if (folio_maybe_dma_pinned(folio))
>>  					goto keep_locked;
>> -				if (PageTransHuge(page)) {
>> -					/* cannot split THP, skip it */
>> -					if (!can_split_folio(folio, NULL))
>> -						goto activate_locked;
>> -					/*
>> -					 * Split pages without a PMD map right
>> -					 * away. Chances are some or all of the
>> -					 * tail pages can be freed without IO.
>> -					 */
>> -					if (!folio_entire_mapcount(folio) &&
>> -					    split_folio_to_list(folio,
>> -								page_list))
>> -						goto activate_locked;
>> -				}
>> +				/*
>> +				 * Split pages without a PMD map right
>> +				 * away. Chances are some or all of the
>> +				 * tail pages can be freed without IO.
>> +				 */
>> +				if (PageTransHuge(page) && !folio_entire_mapcount(folio) &&
>> +				    split_folio_to_list(folio, page_list))
>> +					goto keep_locked;
>>  				if (!add_to_swap(page)) {
>>  					if (!PageTransHuge(page))
>>  						goto activate_locked_split;
>> -- 
>> 2.23.0
>>
>>
> 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 2/9] mm/vmscan: remove unneeded can_split_huge_page check
  2022-04-12 13:42     ` Miaohe Lin
@ 2022-04-12 14:59       ` David Hildenbrand
  2022-04-13  1:26         ` ying.huang
  0 siblings, 1 reply; 33+ messages in thread
From: David Hildenbrand @ 2022-04-12 14:59 UTC (permalink / raw)
  To: Miaohe Lin, Oscar Salvador
  Cc: akpm, ying.huang, songmuchun, hch, willy, linux-mm, linux-kernel

On 12.04.22 15:42, Miaohe Lin wrote:
> On 2022/4/12 16:59, Oscar Salvador wrote:
>> On Sat, Apr 09, 2022 at 05:34:53PM +0800, Miaohe Lin wrote:
>>> We don't need to check can_split_folio() because folio_maybe_dma_pinned()
>>> is checked before. It will avoid the long term pinned pages to be swapped
>>> out. And we can live with short term pinned pages. Without can_split_folio
>>> checking we can simplify the code. Also activate_locked can be changed to
>>> keep_locked as it's just short term pinning.
>>
>> What do you mean by "we can live with short term pinned pages"?
>> Does it mean that it was not pinned when we check
>> folio_maybe_dma_pinned() but now it is?
>>
>> To me it looks like the pinning is fluctuating and we rely on
>> split_folio_to_list() to see whether we succeed or not, and if not
>> we give it another spin in the next round?
> 
> Yes. Short term pinned pages is relative to long term pinned pages and these pages won't be
> pinned for a noticeable time. So it's expected to split the folio successfully in the next
> round as the pinning is really fluctuating. Or am I miss something?
> 

Just so we're on the same page: folio_maybe_dma_pinned() only captures
FOLL_PIN, but not FOLL_GET. You can have a long-term FOLL_GET right now
via vmsplice().

can_split_folio() is more precise than folio_maybe_dma_pinned(), but
both are racy as long as the page is still mapped.


-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 2/9] mm/vmscan: remove unneeded can_split_huge_page check
  2022-04-12 14:59       ` David Hildenbrand
@ 2022-04-13  1:26         ` ying.huang
  2022-04-13  2:17           ` Miaohe Lin
  2022-04-15  3:07           ` ying.huang
  0 siblings, 2 replies; 33+ messages in thread
From: ying.huang @ 2022-04-13  1:26 UTC (permalink / raw)
  To: David Hildenbrand, Miaohe Lin, Oscar Salvador
  Cc: akpm, songmuchun, hch, willy, linux-mm, linux-kernel,
	Pavel Tatashin, John Hubbard, Linus Torvalds, Vlastimil Babka,
	Yu Zhao

On Tue, 2022-04-12 at 16:59 +0200, David Hildenbrand wrote:
> On 12.04.22 15:42, Miaohe Lin wrote:
> > On 2022/4/12 16:59, Oscar Salvador wrote:
> > > On Sat, Apr 09, 2022 at 05:34:53PM +0800, Miaohe Lin wrote:
> > > > We don't need to check can_split_folio() because folio_maybe_dma_pinned()
> > > > is checked before. It will avoid the long term pinned pages to be swapped
> > > > out. And we can live with short term pinned pages. Without can_split_folio
> > > > checking we can simplify the code. Also activate_locked can be changed to
> > > > keep_locked as it's just short term pinning.
> > > 
> > > What do you mean by "we can live with short term pinned pages"?
> > > Does it mean that it was not pinned when we check
> > > folio_maybe_dma_pinned() but now it is?
> > > 
> > > To me it looks like the pinning is fluctuating and we rely on
> > > split_folio_to_list() to see whether we succeed or not, and if not
> > > we give it another spin in the next round?
> > 
> > Yes. Short term pinned pages is relative to long term pinned pages and these pages won't be
> > pinned for a noticeable time. So it's expected to split the folio successfully in the next
> > round as the pinning is really fluctuating. Or am I miss something?
> > 
> 
> Just so we're on the same page. folio_maybe_dma_pinned() only capture
> FOLL_PIN, but not FOLL_GET. You can have long-term FOLL_GET right now
> via vmsplice().

Per my original understanding, folio_maybe_dma_pinned() can be used to
detect long-term pinned pages.  And it seems reasonable to skip the
long-term pinned pages and try short-term pinned pages during page
reclaiming.  But as you pointed out, vmsplice() doesn't use FOLL_PIN. 
So if vmsplice() is expected to pin pages for a long time, and we have no
way to detect it, then we should keep can_split_folio() in the original
code.

Copying more people who have worked on long-term pinning for comments.

Best Regards,
Huang, Ying 

> can_split_folio() is more precise then folio_maybe_dma_pinned(), but
> both are racy as long as the page is still mapped.
> 
> 



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 2/9] mm/vmscan: remove unneeded can_split_huge_page check
  2022-04-13  1:26         ` ying.huang
@ 2022-04-13  2:17           ` Miaohe Lin
  2022-04-15  3:07           ` ying.huang
  1 sibling, 0 replies; 33+ messages in thread
From: Miaohe Lin @ 2022-04-13  2:17 UTC (permalink / raw)
  To: ying.huang, David Hildenbrand, Oscar Salvador
  Cc: akpm, songmuchun, hch, willy, linux-mm, linux-kernel,
	Pavel Tatashin, John Hubbard, Linus Torvalds, Vlastimil Babka,
	Yu Zhao

On 2022/4/13 9:26, ying.huang@intel.com wrote:
> On Tue, 2022-04-12 at 16:59 +0200, David Hildenbrand wrote:
>> On 12.04.22 15:42, Miaohe Lin wrote:
>>> On 2022/4/12 16:59, Oscar Salvador wrote:
>>>> On Sat, Apr 09, 2022 at 05:34:53PM +0800, Miaohe Lin wrote:
>>>>> We don't need to check can_split_folio() because folio_maybe_dma_pinned()
>>>>> is checked before. It will avoid the long term pinned pages to be swapped
>>>>> out. And we can live with short term pinned pages. Without can_split_folio
>>>>> checking we can simplify the code. Also activate_locked can be changed to
>>>>> keep_locked as it's just short term pinning.
>>>>
>>>> What do you mean by "we can live with short term pinned pages"?
>>>> Does it mean that it was not pinned when we check
>>>> folio_maybe_dma_pinned() but now it is?
>>>>
>>>> To me it looks like the pinning is fluctuating and we rely on
>>>> split_folio_to_list() to see whether we succeed or not, and if not
>>>> we give it another spin in the next round?
>>>
>>> Yes. Short term pinned pages is relative to long term pinned pages and these pages won't be
>>> pinned for a noticeable time. So it's expected to split the folio successfully in the next
>>> round as the pinning is really fluctuating. Or am I miss something?
>>>
>>
>> Just so we're on the same page. folio_maybe_dma_pinned() only capture
>> FOLL_PIN, but not FOLL_GET. You can have long-term FOLL_GET right now
>> via vmsplice().
> 
> Per my original understanding, folio_maybe_dma_pinned() can be used to
> detect long-term pinned pages.  And it seems reasonable to skip the
> long-term pinned pages and try short-term pinned pages during page
> reclaiming.  But as you pointed out, vmsplice() doesn't use FOLL_PIN. 
> So if vmsplice() is expected to pin pages for long time, and we have no
> way to detect it, then we should keep can_split_folio() in the original
> code.

IIUC, even if we have no way to detect it, can_split_folio should be removed
due to:

"""
can_split_huge_page is introduced via commit b8f593cd0896 ("mm, THP, swap:
check whether THP can be split firstly") to avoid deleting the THP from
the swap cache and freeing the swap cluster when the THP cannot be split.
But since commit bd4c82c22c36 ("mm, THP, swap: delay splitting THP after
swapped out"), splitting THP is delayed until THP is swapped out. There's
no need to delete the THP from the swap cache and free the swap cluster
anymore. Thus we can remove this unneeded can_split_huge_page check now to
simplify the code logic.
"""

The THP might not need to be split and could be freed directly after swapout.
So the can_split_huge_page check here is unneeded. Or am I missing something?

Thanks!

> 
> Copying more people who have worked on long-term pinning for comments.
> 
> Best Regards,
> Huang, Ying 
> 
>> can_split_folio() is more precise then folio_maybe_dma_pinned(), but
>> both are racy as long as the page is still mapped.
>>
>>
> 
> 
> .
> 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 2/9] mm/vmscan: remove unneeded can_split_huge_page check
  2022-04-13  1:26         ` ying.huang
  2022-04-13  2:17           ` Miaohe Lin
@ 2022-04-15  3:07           ` ying.huang
  2022-04-16  2:34             ` Miaohe Lin
  1 sibling, 1 reply; 33+ messages in thread
From: ying.huang @ 2022-04-15  3:07 UTC (permalink / raw)
  To: David Hildenbrand, Miaohe Lin, Oscar Salvador
  Cc: akpm, songmuchun, hch, willy, linux-mm, linux-kernel,
	Pavel Tatashin, John Hubbard, Linus Torvalds, Vlastimil Babka,
	Yu Zhao

On Wed, 2022-04-13 at 09:26 +0800, ying.huang@intel.com wrote:
> On Tue, 2022-04-12 at 16:59 +0200, David Hildenbrand wrote:
> > On 12.04.22 15:42, Miaohe Lin wrote:
> > > On 2022/4/12 16:59, Oscar Salvador wrote:
> > > > On Sat, Apr 09, 2022 at 05:34:53PM +0800, Miaohe Lin wrote:
> > > > > We don't need to check can_split_folio() because folio_maybe_dma_pinned()
> > > > > is checked before. It will avoid the long term pinned pages to be swapped
> > > > > out. And we can live with short term pinned pages. Without can_split_folio
> > > > > checking we can simplify the code. Also activate_locked can be changed to
> > > > > keep_locked as it's just short term pinning.
> > > > 
> > > > What do you mean by "we can live with short term pinned pages"?
> > > > Does it mean that it was not pinned when we check
> > > > folio_maybe_dma_pinned() but now it is?
> > > > 
> > > > To me it looks like the pinning is fluctuating and we rely on
> > > > split_folio_to_list() to see whether we succeed or not, and if not
> > > > we give it another spin in the next round?
> > > 
> > > Yes. Short term pinned pages is relative to long term pinned pages and these pages won't be
> > > pinned for a noticeable time. So it's expected to split the folio successfully in the next
> > > round as the pinning is really fluctuating. Or am I miss something?
> > > 
> > 
> > Just so we're on the same page. folio_maybe_dma_pinned() only capture
> > FOLL_PIN, but not FOLL_GET. You can have long-term FOLL_GET right now
> > via vmsplice().
> 
> Per my original understanding, folio_maybe_dma_pinned() can be used to
> detect long-term pinned pages.  And it seems reasonable to skip the
> long-term pinned pages and try short-term pinned pages during page
> reclaiming.  But as you pointed out, vmsplice() doesn't use FOLL_PIN. 
> So if vmsplice() is expected to pin pages for long time, and we have no
> way to detect it, then we should keep can_split_folio() in the original
> code.
> 
> Copying more people who have worked on long-term pinning for comments.

Checked the discussion in the following thread,

https://lore.kernel.org/lkml/CA+CK2bBffHBxjmb9jmSKacm0fJMinyt3Nhk8Nx6iudcQSj80_w@mail.gmail.com/

It seems that, from a practical point of view, folio_maybe_dma_pinned()
can identify most long-term pinned pages that may block memory
hot-remove or CMA allocation, although, as David pointed out, some pages
may still be GUPed for a long time (e.g. via vmsplice()) even if
!folio_maybe_dma_pinned().

But from another point of view, can_split_huge_page() is cheap while THP
swapout is expensive (swap space, disk I/O, and hard to recover from), so
it may be better to keep can_split_huge_page() in shrink_page_list().

Best Regards,
Huang, Ying

> 
> > can_split_folio() is more precise then folio_maybe_dma_pinned(), but
> > both are racy as long as the page is still mapped.
> > 
> > 
> 



^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 2/9] mm/vmscan: remove unneeded can_split_huge_page check
  2022-04-15  3:07           ` ying.huang
@ 2022-04-16  2:34             ` Miaohe Lin
  0 siblings, 0 replies; 33+ messages in thread
From: Miaohe Lin @ 2022-04-16  2:34 UTC (permalink / raw)
  To: ying.huang, David Hildenbrand, Oscar Salvador
  Cc: akpm, songmuchun, hch, willy, linux-mm, linux-kernel,
	Pavel Tatashin, John Hubbard, Linus Torvalds, Vlastimil Babka,
	Yu Zhao

On 2022/4/15 11:07, ying.huang@intel.com wrote:
> On Wed, 2022-04-13 at 09:26 +0800, ying.huang@intel.com wrote:
>> On Tue, 2022-04-12 at 16:59 +0200, David Hildenbrand wrote:
>>> On 12.04.22 15:42, Miaohe Lin wrote:
>>>> On 2022/4/12 16:59, Oscar Salvador wrote:
>>>>> On Sat, Apr 09, 2022 at 05:34:53PM +0800, Miaohe Lin wrote:
>>>>>> We don't need to check can_split_folio() because folio_maybe_dma_pinned()
>>>>>> is checked before. It will avoid the long term pinned pages to be swapped
>>>>>> out. And we can live with short term pinned pages. Without can_split_folio
>>>>>> checking we can simplify the code. Also activate_locked can be changed to
>>>>>> keep_locked as it's just short term pinning.
>>>>>
>>>>> What do you mean by "we can live with short term pinned pages"?
>>>>> Does it mean that it was not pinned when we check
>>>>> folio_maybe_dma_pinned() but now it is?
>>>>>
>>>>> To me it looks like the pinning is fluctuating and we rely on
>>>>> split_folio_to_list() to see whether we succeed or not, and if not
>>>>> we give it another spin in the next round?
>>>>
>>>> Yes. Short-term pinned pages are pinned only briefly, in contrast to long-term pinned
>>>> pages. So the folio is expected to split successfully in the next round, as the pinning
>>>> is really fluctuating. Or am I missing something?
>>>>
>>>
>>> Just so we're on the same page. folio_maybe_dma_pinned() only captures
>>> FOLL_PIN, but not FOLL_GET. You can have long-term FOLL_GET right now
>>> via vmsplice().
>>
>> Per my original understanding, folio_maybe_dma_pinned() can be used to
>> detect long-term pinned pages.  And it seems reasonable to skip the
>> long-term pinned pages and try short-term pinned pages during page
>> reclaiming.  But as you pointed out, vmsplice() doesn't use FOLL_PIN. 
>> So if vmsplice() is expected to pin pages for long time, and we have no
>> way to detect it, then we should keep can_split_folio() in the original
>> code.
>>
>> Copying more people who have worked on long-term pinning for comments.
> 
> Checked the discussion in the following thread,
> 
> https://lore.kernel.org/lkml/CA+CK2bBffHBxjmb9jmSKacm0fJMinyt3Nhk8Nx6iudcQSj80_w@mail.gmail.com/
> 
> It seems that from the practical point of view, folio_maybe_dma_pinned()
> can identify most long-term pinned pages that may block memory hot-
> remove or CMA allocation.  Although as David pointed out, some pages may
> still be GUPed for a long time (e.g. via vmsplice()) even if
> !folio_maybe_dma_pinned().
> 
> But from another point of view, can_split_huge_page() is cheap and THP
> swapout is expensive (swap space, disk IO, and hard to recover from), so
> it may be better to keep can_split_huge_page() in shrink_page_list().

Many thanks for your explanation. That sounds convincing to me. Is it worth adding a comment
about the above in the code? Anyway, I will drop this patch. Thanks!
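As a concrete illustration of why keeping the split check is worthwhile, here is a hedged user-space sketch, not kernel code. In the kernel, can_split_folio() effectively compares the folio refcount against the mappings plus the caller's expected references; the model below simplifies that to "splittable only when nothing beyond the mappings and the caller holds the folio". The point it demonstrates: a single FOLL_GET-style extra reference, invisible to the bias-based pin heuristic, still makes the split refuse, so the expensive THP swapout path is avoided.

```c
#include <stdbool.h>

/* Simplified user-space model; the real logic lives in
 * can_split_folio() and folio_maybe_dma_pinned() in the kernel. */
#define GUP_PIN_COUNTING_BIAS 1024U

struct folio_model {
	unsigned int refcount;	/* all references, incl. mappings */
	unsigned int mapcount;	/* page-table mappings */
};

/* Bias-based pin heuristic: blind to single FOLL_GET references. */
static bool model_maybe_dma_pinned(const struct folio_model *f)
{
	return f->refcount >= f->mapcount + GUP_PIN_COUNTING_BIAS;
}

/* can_split-style check: a split succeeds only when nothing beyond
 * the mappings and the caller's own reference holds the folio. */
static bool model_can_split(const struct folio_model *f)
{
	return f->refcount == f->mapcount + 1;	/* +1 for the caller */
}
```

A folio holding one vmsplice()-style extra reference reports model_maybe_dma_pinned() == false yet model_can_split() == false too, so the cheap split check stops the swapout that the pin check alone would have allowed.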

> 
> Best Regards,
> Huang, Ying
> 
>>
>>> can_split_folio() is more precise than folio_maybe_dma_pinned(), but
>>> both are racy as long as the page is still mapped.
>>>
>>>
>>
> 
> 
> 



end of thread, other threads:[~2022-04-16  2:34 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-09  9:34 [PATCH v2 0/9] A few cleanup and fixup patches for vmscan Miaohe Lin
2022-04-09  9:34 ` [PATCH v2 1/9] mm/vmscan: add a comment about MADV_FREE pages check in folio_check_dirty_writeback Miaohe Lin
2022-04-11 11:50   ` ying.huang
2022-04-12  3:07     ` Miaohe Lin
2022-04-11 14:16   ` Christoph Hellwig
2022-04-12  3:15     ` Miaohe Lin
2022-04-12  8:44   ` Oscar Salvador
2022-04-09  9:34 ` [PATCH v2 2/9] mm/vmscan: remove unneeded can_split_huge_page check Miaohe Lin
2022-04-11 14:18   ` Christoph Hellwig
2022-04-12  3:14     ` Miaohe Lin
2022-04-12  0:52   ` ying.huang
2022-04-12  8:59   ` Oscar Salvador
2022-04-12 13:42     ` Miaohe Lin
2022-04-12 14:59       ` David Hildenbrand
2022-04-13  1:26         ` ying.huang
2022-04-13  2:17           ` Miaohe Lin
2022-04-15  3:07           ` ying.huang
2022-04-16  2:34             ` Miaohe Lin
2022-04-09  9:34 ` [PATCH v2 3/9] mm/vmscan: introduce helper function reclaim_page_list() Miaohe Lin
2022-04-09 13:44   ` Matthew Wilcox
2022-04-11  1:53     ` Miaohe Lin
2022-04-11  3:17       ` Matthew Wilcox
2022-04-11  3:26         ` Miaohe Lin
2022-04-09  9:34 ` [PATCH v2 4/9] mm/vmscan: save a bit of stack space in shrink_lruvec Miaohe Lin
2022-04-12  0:57   ` ying.huang
2022-04-12  3:13     ` Miaohe Lin
2022-04-09  9:34 ` [PATCH v2 5/9] mm/vmscan: activate swap-backed executable folios after first usage Miaohe Lin
2022-04-12  0:59   ` ying.huang
2022-04-09  9:34 ` [PATCH v2 6/9] mm/vmscan: take all base pages of THP into account when race with speculative reference Miaohe Lin
2022-04-09  9:34 ` [PATCH v2 7/9] mm/vmscan: take min_slab_pages into account when try to call shrink_node Miaohe Lin
2022-04-12  1:36   ` ying.huang
2022-04-09  9:34 ` [PATCH v2 8/9] mm/vmscan: remove obsolete comment in kswapd_run Miaohe Lin
2022-04-09  9:35 ` [PATCH v2 9/9] mm/vmscan: use helper folio_is_file_lru() Miaohe Lin
