Linux-mm Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH 0/9] workingset protection/detection on the anonymous LRU list
@ 2020-02-11  6:19 js1304
  2020-02-11  6:19 ` [PATCH 1/9] mm/vmscan: make active/inactive ratio as 1:1 for anon lru js1304
                   ` (10 more replies)
  0 siblings, 11 replies; 14+ messages in thread
From: js1304 @ 2020-02-11  6:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Johannes Weiner, Michal Hocko,
	Hugh Dickins, Minchan Kim, Vlastimil Babka, Mel Gorman,
	kernel-team, Joonsoo Kim

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Hello,

This patchset implements workingset protection and detection on
the anonymous LRU list.

* SUBJECT
workingset protection

* PROBLEM
In current implementation, newly created or swap-in anonymous page is
started on the active list. Growing the active list results in rebalancing
active/inactive list so old pages on the active list are demoted to the
inactive list. Hence, hot page on the active list isn't protected at all.

Following is an example of this situation.

Assume that 50 hot pages on active list and system can contain total
100 pages. Numbers denote the number of pages on active/inactive
list (active | inactive). (h) stands for hot pages and (uo) stands for
used-once pages.

1. 50 hot pages on active list
50(h) | 0

2. workload: 50 newly created (used-once) pages
50(uo) | 50(h)

3. workload: another 50 newly created (used-once) pages
50(uo) | 50(uo), swap-out 50(h)

As we can see, hot pages are swapped-out and it would cause swap-in later.

* SOLUTION
Since this is what we want to avoid, this patchset implements workingset
protection. Like as the file LRU list, newly created or swap-in anonymous
page is started on the inactive list. Also, like as the file LRU list,
if enough reference happens, the page will be promoted. This simple
modification changes the above example as following.

1. 50 hot pages on active list
50(h) | 0

2. workload: 50 newly created (used-once) pages
50(h) | 50(uo)

3. workload: another 50 newly created (used-once) pages
50(h) | 50(uo), swap-out 50(uo)

hot pages remains in the active list. :)

* EXPERIMENT
I tested this scenario on my test bed and confirmed that this problem
happens on current implementation. I also checked that it is fixed by
this patchset.

I did another test to show the performance effect of this patchset.

- ebizzy (with modified random function)
ebizzy is the test program that main thread allocates lots of memory and
child threads access them randomly during the given times. Swap-in/out
will happen if allocated memory is larger than the system memory.

The random function that represents the zipf distribution is used to
make hot/cold memory. Hot/cold ratio is controlled by the parameter. If
the parameter is high, hot memory is accessed much larger than cold one.
If the parameter is low, the number of access on each memory would be
similar. I uses various parameters in order to show the effect of
patchset on various hot/cold ratio workload.

My test setup is a virtual machine with 8 cpus and 1024MB memory.

Result format is as following.

Parameter 0.1 ... 1.3
Allocated memory size
Throughput for base (larger is better)
Throughput for patchset (larger is better)
Improvement (larger is better)


* single thread

     0.1      0.3      0.5      0.7      0.9      1.1      1.3
<512M>
  7009.0   7372.0   7774.0   8523.0   9569.0  10724.0  11936.0
  6973.0   7342.0   7745.0   8576.0   9441.0  10730.0  12033.0
   -0.01     -0.0     -0.0     0.01    -0.01      0.0     0.01
<768M>
   915.0   1039.0   1275.0   1687.0   2328.0   3486.0   5445.0
   920.0   1037.0   1238.0   1689.0   2384.0   3638.0   5381.0
    0.01     -0.0    -0.03      0.0     0.02     0.04    -0.01
<1024M>
   425.0    471.0    539.0    753.0   1183.0   2130.0   3839.0
   414.0    468.0    553.0    770.0   1242.0   2187.0   3932.0
   -0.03    -0.01     0.03     0.02     0.05     0.03     0.02
<1280M>
   320.0    346.0    410.0    556.0    871.0   1654.0   3298.0
   316.0    346.0    411.0    550.0    892.0   1652.0   3293.0
   -0.01      0.0      0.0    -0.01     0.02     -0.0     -0.0
<1536M>
   273.0    290.0    341.0    458.0    733.0   1381.0   2925.0
   271.0    293.0    344.0    462.0    740.0   1398.0   2969.0
   -0.01     0.01     0.01     0.01     0.01     0.01     0.02
<2048M>
    77.0     79.0     95.0    147.0    276.0    690.0   1816.0
    91.0     94.0    115.0    170.0    321.0    770.0   2018.0
    0.18     0.19     0.21     0.16     0.16     0.12     0.11


* multi thread (8)

     0.1      0.3      0.5      0.7      0.9      1.1      1.3
<512M>
 29083.0  29648.0  30145.0  31668.0  33964.0  38414.0  43707.0
 29238.0  29701.0  30301.0  31328.0  33809.0  37991.0  43667.0
    0.01      0.0     0.01    -0.01     -0.0    -0.01     -0.0
<768M>
  3332.0   3699.0   4673.0   5830.0   8307.0  12969.0  17665.0
  3579.0   3992.0   4432.0   6111.0   8699.0  12604.0  18061.0
    0.07     0.08    -0.05     0.05     0.05    -0.03     0.02
<1024M>
  1921.0   2141.0   2484.0   3296.0   5391.0   8227.0  14574.0
  1989.0   2155.0   2609.0   3565.0   5463.0   8170.0  15642.0
    0.04     0.01     0.05     0.08     0.01    -0.01     0.07
<1280M>
  1524.0   1625.0   1931.0   2581.0   4155.0   6959.0  12443.0
  1560.0   1707.0   2016.0   2714.0   4262.0   7518.0  13910.0
    0.02     0.05     0.04     0.05     0.03     0.08     0.12
<1536M>
  1303.0   1399.0   1550.0   2137.0   3469.0   6712.0  12944.0
  1356.0   1465.0   1701.0   2237.0   3583.0   6830.0  13580.0
    0.04     0.05      0.1     0.05     0.03     0.02     0.05
<2048M>
   172.0    184.0    215.0    289.0    514.0   1318.0   4153.0
   175.0    190.0    225.0    329.0    606.0   1585.0   5170.0
    0.02     0.03     0.05     0.14     0.18      0.2     0.24

As we can see, as allocated memory grows, patched kernel get the better
result. Maximum improvement is 21% for the single thread test and
24% for the multi thread test.


* SUBJECT
workingset detection

* PROBLEM
Later part of the patchset implements the workingset detection for
the anonymous LRU list. There is a corner case that workingset protection
could cause thrashing. If we can avoid thrashing by workingset detection,
we can get the better performance.

Following is an example of thrashing due to the workingset protection.

1. 50 hot pages on active list
50(h) | 0

2. workload: 50 newly created (will be hot) pages
50(h) | 50(wh)

3. workload: another 50 newly created (used-once) pages
50(h) | 50(uo), swap-out 50(wh)

4. workload: 50 (will be hot) pages
50(h) | 50(wh), swap-in 50(wh)

5. workload: another 50 newly created (used-once) pages
50(h) | 50(uo), swap-out 50(wh)

6. repeat 4, 5

Without workingset detection, this kind of workload cannot be promoted
and thrashing happens forever.

* SOLUTION
Therefore, this patchset implements workingset detection.
All the infrastructure for workingset detecion is already implemented,
so there is not much work to do. First, extend workingset detection
code to deal with the anonymous LRU list. Then, make swap cache handles
the exceptional value for the shadow entry. Lastly, install/retrieve
the shadow value into/from the swap cache and check the refault distance.

* EXPERIMENT
I made a test program to imitates above scenario and confirmed that
problem exists. Then, I checked that this patchset fixes it.

My test setup is a virtual machine with 8 cpus and 6100MB memory. But,
the amount of the memory that the test program can use is about 280 MB.
This is because the system uses large ram-backed swap and large ramdisk
to capture the trace.

Test scenario is like as below.

1. allocate cold memory (512MB)
2. allocate hot-1 memory (96MB)
3. activate hot-1 memory (96MB)
4. allocate another hot-2 memory (96MB)
5. access cold memory (128MB)
6. access hot-2 memory (96MB)
7. repeat 5, 6

Since hot-1 memory (96MB) is on the active list, the inactive list can
contains roughly 190MB pages. hot-2 memory's re-access interval
(96+128 MB) is more 190MB, so it cannot be promoted without workingset
detection and swap-in/out happens repeatedly. With this patchset,
workingset detection works and promotion happens. Therefore, swap-in/out
occurs less.

Here is the result. (average of 5 runs)

type swap-in swap-out
base 863240 989945
patch 681565 809273

As we can see, patched kernel do less swap-in/out.

Note that, this result is gotten from v5.1. Although workingset detection
works on v5.1, it doesn't work well on v5.5. It looks like that recent
code change on workingset.c is the reason of this problem. I will
track it soon.

Patchset is based on v5.5.
Patchset can also be available at

https://github.com/JoonsooKim/linux/tree/improve-anonymous-lru-management-v1.00-v5.5

Enjoy it.

Thanks.

Joonsoo Kim (9):
  mm/vmscan: make active/inactive ratio as 1:1 for anon lru
  mm/vmscan: protect the workingset on anonymous LRU
  mm/workingset: extend the workingset detection for anon LRU
  mm/swapcache: support to handle the value in swapcache
  mm/workingset: use the node counter if memcg is the root memcg
  mm/workingset: handle the page without memcg
  mm/swap: implement workingset detection for anonymous LRU
  mm/vmscan: restore active/inactive ratio for anonymous LRU
  mm/swap: count a new anonymous page as a reclaim_state's rotate

 include/linux/mmzone.h  | 14 ++++++++-----
 include/linux/swap.h    | 18 ++++++++++++-----
 kernel/events/uprobes.c |  2 +-
 mm/huge_memory.c        |  6 +++---
 mm/khugepaged.c         |  2 +-
 mm/memcontrol.c         | 12 ++++++++----
 mm/memory.c             | 16 +++++++++------
 mm/migrate.c            |  2 +-
 mm/shmem.c              |  3 ++-
 mm/swap.c               | 40 ++++++++++++++++++++++++++++++-------
 mm/swap_state.c         | 52 ++++++++++++++++++++++++++++++++++++++++++-------
 mm/swapfile.c           |  2 +-
 mm/userfaultfd.c        |  2 +-
 mm/vmscan.c             | 43 +++++++++++++++++++++++++++++++---------
 mm/vmstat.c             |  6 ++++--
 mm/workingset.c         | 47 +++++++++++++++++++++++++++++++-------------
 16 files changed, 199 insertions(+), 68 deletions(-)

-- 
2.7.4



^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/9] mm/vmscan: make active/inactive ratio as 1:1 for anon lru
  2020-02-11  6:19 [PATCH 0/9] workingset protection/detection on the anonymous LRU list js1304
@ 2020-02-11  6:19 ` js1304
  2020-02-11  6:19 ` [PATCH 2/9] mm/vmscan: protect the workingset on anonymous LRU js1304
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: js1304 @ 2020-02-11  6:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Johannes Weiner, Michal Hocko,
	Hugh Dickins, Minchan Kim, Vlastimil Babka, Mel Gorman,
	kernel-team, Joonsoo Kim

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Current implementation of LRU management for anonymous page has some
problems. Most important one is that it doesn't protect the workingset,
that is, pages on the active LRU list. Although, this problem will be
fixed in the following patchset, the preparation is required and
this patch does it.

What following patchset does is to restore workingset protection. In this
case, newly created or swap-in pages are started their lifetime on the
inactive list. If inactive list is too small, there is not enough chance
to be referenced and the page cannot become the workingset.

In order to provide enough chance to the newly anonymous pages, this patch
makes active/inactive LRU ratio as 1:1.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 mm/vmscan.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 572fb17..e772f3f 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2217,7 +2217,7 @@ static bool inactive_is_low(struct lruvec *lruvec, enum lru_list inactive_lru)
 	active = lruvec_page_state(lruvec, NR_LRU_BASE + active_lru);
 
 	gb = (inactive + active) >> (30 - PAGE_SHIFT);
-	if (gb)
+	if (gb && is_file_lru(inactive_lru))
 		inactive_ratio = int_sqrt(10 * gb);
 	else
 		inactive_ratio = 1;
-- 
2.7.4



^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 2/9] mm/vmscan: protect the workingset on anonymous LRU
  2020-02-11  6:19 [PATCH 0/9] workingset protection/detection on the anonymous LRU list js1304
  2020-02-11  6:19 ` [PATCH 1/9] mm/vmscan: make active/inactive ratio as 1:1 for anon lru js1304
@ 2020-02-11  6:19 ` js1304
  2020-02-11  6:19 ` [PATCH 3/9] mm/workingset: extend the workingset detection for anon LRU js1304
                   ` (8 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: js1304 @ 2020-02-11  6:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Johannes Weiner, Michal Hocko,
	Hugh Dickins, Minchan Kim, Vlastimil Babka, Mel Gorman,
	kernel-team, Joonsoo Kim

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

In current implementation, newly created or swap-in anonymous page
is started on active list. Growing active list results in rebalancing
active/inactive list so old pages on active list are demoted to inactive
list. Hence, the page on active list isn't protected at all.

Following is an example of this situation.

Assume that 50 hot pages on active list. Numbers denote the number of
pages on active/inactive list (active | inactive).

1. 50 hot pages on active list
50(h) | 0

2. workload: 50 newly created (used-once) pages
50(uo) | 50(h)

3. workload: another 50 newly created (used-once) pages
50(uo) | 50(uo), swap-out 50(h)

This patch tries to fix this issue.
Like as file LRU, newly created or swap-in anonymous pages will be
inserted to the inactive list. They are promoted to active list if
enough reference happens. This simple modification changes the above
example as following.

1. 50 hot pages on active list
50(h) | 0

2. workload: 50 newly created (used-once) pages
50(h) | 50(uo)

3. workload: another 50 newly created (used-once) pages
50(h) | 50(uo), swap-out 50(uo)

As you can see, hot pages on active list would be protected.

Note that, this implementation has a drawback that the page cannot
be promoted and will be swapped-out if re-access interval is greater than
the size of inactive list but less than the size of total(active+inactive).
To solve this potential issue, following patch will apply workingset
detection that is applied to file LRU some day before.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 include/linux/swap.h    |  2 +-
 kernel/events/uprobes.c |  2 +-
 mm/huge_memory.c        |  6 +++---
 mm/khugepaged.c         |  2 +-
 mm/memory.c             |  9 ++++-----
 mm/migrate.c            |  2 +-
 mm/swap.c               | 13 +++++++------
 mm/swapfile.c           |  2 +-
 mm/userfaultfd.c        |  2 +-
 mm/vmscan.c             | 21 +++++++++++++++++++--
 10 files changed, 39 insertions(+), 22 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 1e99f7a..954e13e 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -344,7 +344,7 @@ extern void deactivate_page(struct page *page);
 extern void mark_page_lazyfree(struct page *page);
 extern void swap_setup(void);
 
-extern void lru_cache_add_active_or_unevictable(struct page *page,
+extern void lru_cache_add_inactive_or_unevictable(struct page *page,
 						struct vm_area_struct *vma);
 
 /* linux/mm/vmscan.c */
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index ece7e13..14156fc 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -190,7 +190,7 @@ static int __replace_page(struct vm_area_struct *vma, unsigned long addr,
 		get_page(new_page);
 		page_add_new_anon_rmap(new_page, vma, addr, false);
 		mem_cgroup_commit_charge(new_page, memcg, false, false);
-		lru_cache_add_active_or_unevictable(new_page, vma);
+		lru_cache_add_inactive_or_unevictable(new_page, vma);
 	} else
 		/* no new page, just dec_mm_counter for old_page */
 		dec_mm_counter(mm, MM_ANONPAGES);
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index a880932..6356dfd 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -638,7 +638,7 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf,
 		entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma);
 		page_add_new_anon_rmap(page, vma, haddr, true);
 		mem_cgroup_commit_charge(page, memcg, false, true);
-		lru_cache_add_active_or_unevictable(page, vma);
+		lru_cache_add_inactive_or_unevictable(page, vma);
 		pgtable_trans_huge_deposit(vma->vm_mm, vmf->pmd, pgtable);
 		set_pmd_at(vma->vm_mm, haddr, vmf->pmd, entry);
 		add_mm_counter(vma->vm_mm, MM_ANONPAGES, HPAGE_PMD_NR);
@@ -1282,7 +1282,7 @@ static vm_fault_t do_huge_pmd_wp_page_fallback(struct vm_fault *vmf,
 		set_page_private(pages[i], 0);
 		page_add_new_anon_rmap(pages[i], vmf->vma, haddr, false);
 		mem_cgroup_commit_charge(pages[i], memcg, false, false);
-		lru_cache_add_active_or_unevictable(pages[i], vma);
+		lru_cache_add_inactive_or_unevictable(pages[i], vma);
 		vmf->pte = pte_offset_map(&_pmd, haddr);
 		VM_BUG_ON(!pte_none(*vmf->pte));
 		set_pte_at(vma->vm_mm, haddr, vmf->pte, entry);
@@ -1435,7 +1435,7 @@ vm_fault_t do_huge_pmd_wp_page(struct vm_fault *vmf, pmd_t orig_pmd)
 		pmdp_huge_clear_flush_notify(vma, haddr, vmf->pmd);
 		page_add_new_anon_rmap(new_page, vma, haddr, true);
 		mem_cgroup_commit_charge(new_page, memcg, false, true);
-		lru_cache_add_active_or_unevictable(new_page, vma);
+		lru_cache_add_inactive_or_unevictable(new_page, vma);
 		set_pmd_at(vma->vm_mm, haddr, vmf->pmd, entry);
 		update_mmu_cache_pmd(vma, vmf->address, vmf->pmd);
 		if (!page) {
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index b679908..246c155 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1092,7 +1092,7 @@ static void collapse_huge_page(struct mm_struct *mm,
 	page_add_new_anon_rmap(new_page, vma, address, true);
 	mem_cgroup_commit_charge(new_page, memcg, false, true);
 	count_memcg_events(memcg, THP_COLLAPSE_ALLOC, 1);
-	lru_cache_add_active_or_unevictable(new_page, vma);
+	lru_cache_add_inactive_or_unevictable(new_page, vma);
 	pgtable_trans_huge_deposit(mm, pmd, pgtable);
 	set_pmd_at(mm, address, pmd, _pmd);
 	update_mmu_cache_pmd(vma, address, pmd);
diff --git a/mm/memory.c b/mm/memory.c
index 45442d9..5f7813a 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2513,7 +2513,7 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf)
 		ptep_clear_flush_notify(vma, vmf->address, vmf->pte);
 		page_add_new_anon_rmap(new_page, vma, vmf->address, false);
 		mem_cgroup_commit_charge(new_page, memcg, false, false);
-		lru_cache_add_active_or_unevictable(new_page, vma);
+		lru_cache_add_inactive_or_unevictable(new_page, vma);
 		/*
 		 * We call the notify macro here because, when using secondary
 		 * mmu page tables (such as kvm shadow page tables), we want the
@@ -3038,11 +3038,10 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	if (unlikely(page != swapcache && swapcache)) {
 		page_add_new_anon_rmap(page, vma, vmf->address, false);
 		mem_cgroup_commit_charge(page, memcg, false, false);
-		lru_cache_add_active_or_unevictable(page, vma);
+		lru_cache_add_inactive_or_unevictable(page, vma);
 	} else {
 		do_page_add_anon_rmap(page, vma, vmf->address, exclusive);
 		mem_cgroup_commit_charge(page, memcg, true, false);
-		activate_page(page);
 	}
 
 	swap_free(entry);
@@ -3186,7 +3185,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
 	inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
 	page_add_new_anon_rmap(page, vma, vmf->address, false);
 	mem_cgroup_commit_charge(page, memcg, false, false);
-	lru_cache_add_active_or_unevictable(page, vma);
+	lru_cache_add_inactive_or_unevictable(page, vma);
 setpte:
 	set_pte_at(vma->vm_mm, vmf->address, vmf->pte, entry);
 
@@ -3449,7 +3448,7 @@ vm_fault_t alloc_set_pte(struct vm_fault *vmf, struct mem_cgroup *memcg,
 		inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES);
 		page_add_new_anon_rmap(page, vma, vmf->address, false);
 		mem_cgroup_commit_charge(page, memcg, false, false);
-		lru_cache_add_active_or_unevictable(page, vma);
+		lru_cache_add_inactive_or_unevictable(page, vma);
 	} else {
 		inc_mm_counter_fast(vma->vm_mm, mm_counter_file(page));
 		page_add_file_rmap(page, false);
diff --git a/mm/migrate.c b/mm/migrate.c
index 86873b6..ef034c0 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2784,7 +2784,7 @@ static void migrate_vma_insert_page(struct migrate_vma *migrate,
 	page_add_new_anon_rmap(page, vma, addr, false);
 	mem_cgroup_commit_charge(page, memcg, false, false);
 	if (!is_zone_device_page(page))
-		lru_cache_add_active_or_unevictable(page, vma);
+		lru_cache_add_inactive_or_unevictable(page, vma);
 	get_page(page);
 
 	if (flush) {
diff --git a/mm/swap.c b/mm/swap.c
index 5341ae9..18b2735 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -448,23 +448,24 @@ void lru_cache_add(struct page *page)
 }
 
 /**
- * lru_cache_add_active_or_unevictable
+ * lru_cache_add_inactive_or_unevictable
  * @page:  the page to be added to LRU
  * @vma:   vma in which page is mapped for determining reclaimability
  *
- * Place @page on the active or unevictable LRU list, depending on its
+ * Place @page on the inactive or unevictable LRU list, depending on its
  * evictability.  Note that if the page is not evictable, it goes
  * directly back onto it's zone's unevictable list, it does NOT use a
  * per cpu pagevec.
  */
-void lru_cache_add_active_or_unevictable(struct page *page,
+void lru_cache_add_inactive_or_unevictable(struct page *page,
 					 struct vm_area_struct *vma)
 {
+	bool evictable;
+
 	VM_BUG_ON_PAGE(PageLRU(page), page);
 
-	if (likely((vma->vm_flags & (VM_LOCKED | VM_SPECIAL)) != VM_LOCKED))
-		SetPageActive(page);
-	else if (!TestSetPageMlocked(page)) {
+	evictable = (vma->vm_flags & (VM_LOCKED | VM_SPECIAL)) != VM_LOCKED;
+	if (!evictable && !TestSetPageMlocked(page)) {
 		/*
 		 * We use the irq-unsafe __mod_zone_page_stat because this
 		 * counter is not modified from interrupt context, and the pte
diff --git a/mm/swapfile.c b/mm/swapfile.c
index bb3261d..6bdcbf9 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1888,7 +1888,7 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
 	} else { /* ksm created a completely new copy */
 		page_add_new_anon_rmap(page, vma, addr, false);
 		mem_cgroup_commit_charge(page, memcg, false, false);
-		lru_cache_add_active_or_unevictable(page, vma);
+		lru_cache_add_inactive_or_unevictable(page, vma);
 	}
 	swap_free(entry);
 	/*
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 1b0d7ab..875e329 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -120,7 +120,7 @@ static int mcopy_atomic_pte(struct mm_struct *dst_mm,
 	inc_mm_counter(dst_mm, MM_ANONPAGES);
 	page_add_new_anon_rmap(page, dst_vma, dst_addr, false);
 	mem_cgroup_commit_charge(page, memcg, false, false);
-	lru_cache_add_active_or_unevictable(page, dst_vma);
+	lru_cache_add_inactive_or_unevictable(page, dst_vma);
 
 	set_pte_at(dst_mm, dst_addr, dst_pte, _dst_pte);
 
diff --git a/mm/vmscan.c b/mm/vmscan.c
index e772f3f..4122a84 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1010,8 +1010,15 @@ static enum page_references page_check_references(struct page *page,
 		return PAGEREF_RECLAIM;
 
 	if (referenced_ptes) {
-		if (PageSwapBacked(page))
-			return PAGEREF_ACTIVATE;
+		if (PageSwapBacked(page)) {
+			if (referenced_page) {
+				ClearPageReferenced(page);
+				return PAGEREF_ACTIVATE;
+			}
+
+			SetPageReferenced(page);
+			return PAGEREF_KEEP;
+		}
 		/*
 		 * All mapped pages start out with page table
 		 * references from the instantiating fault, so we need
@@ -2056,6 +2063,15 @@ static void shrink_active_list(unsigned long nr_to_scan,
 			}
 		}
 
+		/*
+		 * Now, newly created anonymous page isn't appened to the
+		 * active list. We don't need to clear the reference bit here.
+		 */
+		if (PageSwapBacked(page)) {
+			ClearPageReferenced(page);
+			goto deactivate;
+		}
+
 		if (page_referenced(page, 0, sc->target_mem_cgroup,
 				    &vm_flags)) {
 			nr_rotated += hpage_nr_pages(page);
@@ -2074,6 +2090,7 @@ static void shrink_active_list(unsigned long nr_to_scan,
 			}
 		}
 
+deactivate:
 		ClearPageActive(page);	/* we are de-activating */
 		SetPageWorkingset(page);
 		list_add(&page->lru, &l_inactive);
-- 
2.7.4



^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 3/9] mm/workingset: extend the workingset detection for anon LRU
  2020-02-11  6:19 [PATCH 0/9] workingset protection/detection on the anonymous LRU list js1304
  2020-02-11  6:19 ` [PATCH 1/9] mm/vmscan: make active/inactive ratio as 1:1 for anon lru js1304
  2020-02-11  6:19 ` [PATCH 2/9] mm/vmscan: protect the workingset on anonymous LRU js1304
@ 2020-02-11  6:19 ` js1304
  2020-02-14  4:07   ` Joonsoo Kim
  2020-02-11  6:19 ` [PATCH 4/9] mm/swapcache: support to handle the value in swapcache js1304
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 14+ messages in thread
From: js1304 @ 2020-02-11  6:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Johannes Weiner, Michal Hocko,
	Hugh Dickins, Minchan Kim, Vlastimil Babka, Mel Gorman,
	kernel-team, Joonsoo Kim

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

In the following patch, workingset detection will be applied to
anonymous LRU. To prepare it, this patch adds some code to
distinguish/handle the both LRUs.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 include/linux/mmzone.h | 14 +++++++++-----
 mm/memcontrol.c        | 12 ++++++++----
 mm/vmscan.c            | 15 ++++++++++-----
 mm/vmstat.c            |  6 ++++--
 mm/workingset.c        | 35 ++++++++++++++++++++++-------------
 5 files changed, 53 insertions(+), 29 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 5334ad8..b78fd8c 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -220,8 +220,12 @@ enum node_stat_item {
 	NR_ISOLATED_ANON,	/* Temporary isolated pages from anon lru */
 	NR_ISOLATED_FILE,	/* Temporary isolated pages from file lru */
 	WORKINGSET_NODES,
-	WORKINGSET_REFAULT,
-	WORKINGSET_ACTIVATE,
+	WORKINGSET_REFAULT_BASE,
+	WORKINGSET_REFAULT_ANON = WORKINGSET_REFAULT_BASE,
+	WORKINGSET_REFAULT_FILE,
+	WORKINGSET_ACTIVATE_BASE,
+	WORKINGSET_ACTIVATE_ANON = WORKINGSET_ACTIVATE_BASE,
+	WORKINGSET_ACTIVATE_FILE,
 	WORKINGSET_RESTORE,
 	WORKINGSET_NODERECLAIM,
 	NR_ANON_MAPPED,	/* Mapped anonymous pages */
@@ -304,10 +308,10 @@ enum lruvec_flags {
 struct lruvec {
 	struct list_head		lists[NR_LRU_LISTS];
 	struct zone_reclaim_stat	reclaim_stat;
-	/* Evictions & activations on the inactive file list */
-	atomic_long_t			inactive_age;
+	/* Evictions & activations on the inactive list */
+	atomic_long_t			inactive_age[2];
 	/* Refaults at the time of last reclaim cycle */
-	unsigned long			refaults;
+	unsigned long			refaults[2];
 	/* Various lruvec state flags (enum lruvec_flags) */
 	unsigned long			flags;
 #ifdef CONFIG_MEMCG
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 6c83cf4..8f4473d 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1431,10 +1431,14 @@ static char *memory_stat_format(struct mem_cgroup *memcg)
 	seq_buf_printf(&s, "%s %lu\n", vm_event_name(PGMAJFAULT),
 		       memcg_events(memcg, PGMAJFAULT));
 
-	seq_buf_printf(&s, "workingset_refault %lu\n",
-		       memcg_page_state(memcg, WORKINGSET_REFAULT));
-	seq_buf_printf(&s, "workingset_activate %lu\n",
-		       memcg_page_state(memcg, WORKINGSET_ACTIVATE));
+	seq_buf_printf(&s, "workingset_refault_anon %lu\n",
+		       memcg_page_state(memcg, WORKINGSET_REFAULT_ANON));
+	seq_buf_printf(&s, "workingset_refault_file %lu\n",
+		       memcg_page_state(memcg, WORKINGSET_REFAULT_FILE));
+	seq_buf_printf(&s, "workingset_activate_anon %lu\n",
+		       memcg_page_state(memcg, WORKINGSET_ACTIVATE_ANON));
+	seq_buf_printf(&s, "workingset_activate_file %lu\n",
+		       memcg_page_state(memcg, WORKINGSET_ACTIVATE_FILE));
 	seq_buf_printf(&s, "workingset_nodereclaim %lu\n",
 		       memcg_page_state(memcg, WORKINGSET_NODERECLAIM));
 
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 4122a84..74c3ade 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2735,7 +2735,10 @@ static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc)
 	if (!sc->force_deactivate) {
 		unsigned long refaults;
 
-		if (inactive_is_low(target_lruvec, LRU_INACTIVE_ANON))
+		refaults = lruvec_page_state(target_lruvec,
+				WORKINGSET_ACTIVATE_ANON);
+		if (refaults != target_lruvec->refaults[0] ||
+			inactive_is_low(target_lruvec, LRU_INACTIVE_ANON))
 			sc->may_deactivate |= DEACTIVATE_ANON;
 		else
 			sc->may_deactivate &= ~DEACTIVATE_ANON;
@@ -2746,8 +2749,8 @@ static bool shrink_node(pg_data_t *pgdat, struct scan_control *sc)
 		 * rid of any stale active pages quickly.
 		 */
 		refaults = lruvec_page_state(target_lruvec,
-					     WORKINGSET_ACTIVATE);
-		if (refaults != target_lruvec->refaults ||
+				WORKINGSET_ACTIVATE_FILE);
+		if (refaults != target_lruvec->refaults[1] ||
 		    inactive_is_low(target_lruvec, LRU_INACTIVE_FILE))
 			sc->may_deactivate |= DEACTIVATE_FILE;
 		else
@@ -3026,8 +3029,10 @@ static void snapshot_refaults(struct mem_cgroup *target_memcg, pg_data_t *pgdat)
 	unsigned long refaults;
 
 	target_lruvec = mem_cgroup_lruvec(target_memcg, pgdat);
-	refaults = lruvec_page_state(target_lruvec, WORKINGSET_ACTIVATE);
-	target_lruvec->refaults = refaults;
+	refaults = lruvec_page_state(target_lruvec, WORKINGSET_ACTIVATE_ANON);
+	target_lruvec->refaults[0] = refaults;
+	refaults = lruvec_page_state(target_lruvec, WORKINGSET_ACTIVATE_FILE);
+	target_lruvec->refaults[1] = refaults;
 }
 
 /*
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 78d5337..3cdf8e9 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1146,8 +1146,10 @@ const char * const vmstat_text[] = {
 	"nr_isolated_anon",
 	"nr_isolated_file",
 	"workingset_nodes",
-	"workingset_refault",
-	"workingset_activate",
+	"workingset_refault_anon",
+	"workingset_refault_file",
+	"workingset_activate_anon",
+	"workingset_activate_file",
 	"workingset_restore",
 	"workingset_nodereclaim",
 	"nr_anon_pages",
diff --git a/mm/workingset.c b/mm/workingset.c
index 474186b..d04f70a 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -15,6 +15,7 @@
 #include <linux/dax.h>
 #include <linux/fs.h>
 #include <linux/mm.h>
+#include <linux/mm_inline.h>
 
 /*
  *		Double CLOCK lists
@@ -156,7 +157,7 @@
  *
  *		Implementation
  *
- * For each node's file LRU lists, a counter for inactive evictions
+ * For each node's anon/file LRU lists, a counter for inactive evictions
  * and activations is maintained (node->inactive_age).
  *
  * On eviction, a snapshot of this counter (along with some bits to
@@ -213,7 +214,8 @@ static void unpack_shadow(void *shadow, int *memcgidp, pg_data_t **pgdat,
 	*workingsetp = workingset;
 }
 
-static void advance_inactive_age(struct mem_cgroup *memcg, pg_data_t *pgdat)
+static void advance_inactive_age(struct mem_cgroup *memcg, pg_data_t *pgdat,
+				int is_file)
 {
 	/*
 	 * Reclaiming a cgroup means reclaiming all its children in a
@@ -230,7 +232,7 @@ static void advance_inactive_age(struct mem_cgroup *memcg, pg_data_t *pgdat)
 		struct lruvec *lruvec;
 
 		lruvec = mem_cgroup_lruvec(memcg, pgdat);
-		atomic_long_inc(&lruvec->inactive_age);
+		atomic_long_inc(&lruvec->inactive_age[is_file]);
 	} while (memcg && (memcg = parent_mem_cgroup(memcg)));
 }
 
@@ -248,18 +250,19 @@ void *workingset_eviction(struct page *page, struct mem_cgroup *target_memcg)
 	unsigned long eviction;
 	struct lruvec *lruvec;
 	int memcgid;
+	int is_file = page_is_file_cache(page);
 
 	/* Page is fully exclusive and pins page->mem_cgroup */
 	VM_BUG_ON_PAGE(PageLRU(page), page);
 	VM_BUG_ON_PAGE(page_count(page), page);
 	VM_BUG_ON_PAGE(!PageLocked(page), page);
 
-	advance_inactive_age(page_memcg(page), pgdat);
+	advance_inactive_age(page_memcg(page), pgdat, is_file);
 
 	lruvec = mem_cgroup_lruvec(target_memcg, pgdat);
 	/* XXX: target_memcg can be NULL, go through lruvec */
 	memcgid = mem_cgroup_id(lruvec_memcg(lruvec));
-	eviction = atomic_long_read(&lruvec->inactive_age);
+	eviction = atomic_long_read(&lruvec->inactive_age[is_file]);
 	return pack_shadow(memcgid, pgdat, eviction, PageWorkingset(page));
 }
 
@@ -278,13 +281,16 @@ void workingset_refault(struct page *page, void *shadow)
 	struct lruvec *eviction_lruvec;
 	unsigned long refault_distance;
 	struct pglist_data *pgdat;
-	unsigned long active_file;
+	unsigned long active;
 	struct mem_cgroup *memcg;
 	unsigned long eviction;
 	struct lruvec *lruvec;
 	unsigned long refault;
 	bool workingset;
 	int memcgid;
+	int is_file = page_is_file_cache(page);
+	enum lru_list active_lru = page_lru_base_type(page) + LRU_ACTIVE_FILE;
+	enum node_stat_item workingset_stat;
 
 	unpack_shadow(shadow, &memcgid, &pgdat, &eviction, &workingset);
 
@@ -309,8 +315,8 @@ void workingset_refault(struct page *page, void *shadow)
 	if (!mem_cgroup_disabled() && !eviction_memcg)
 		goto out;
 	eviction_lruvec = mem_cgroup_lruvec(eviction_memcg, pgdat);
-	refault = atomic_long_read(&eviction_lruvec->inactive_age);
-	active_file = lruvec_page_state(eviction_lruvec, NR_ACTIVE_FILE);
+	refault = atomic_long_read(&eviction_lruvec->inactive_age[is_file]);
+	active = lruvec_page_state(eviction_lruvec, active_lru);
 
 	/*
 	 * Calculate the refault distance
@@ -341,19 +347,21 @@ void workingset_refault(struct page *page, void *shadow)
 	memcg = page_memcg(page);
 	lruvec = mem_cgroup_lruvec(memcg, pgdat);
 
-	inc_lruvec_state(lruvec, WORKINGSET_REFAULT);
+	workingset_stat = WORKINGSET_REFAULT_BASE + is_file;
+	inc_lruvec_state(lruvec, workingset_stat);
 
 	/*
 	 * Compare the distance to the existing workingset size. We
 	 * don't act on pages that couldn't stay resident even if all
 	 * the memory was available to the page cache.
 	 */
-	if (refault_distance > active_file)
+	if (refault_distance > active)
 		goto out;
 
 	SetPageActive(page);
-	advance_inactive_age(memcg, pgdat);
-	inc_lruvec_state(lruvec, WORKINGSET_ACTIVATE);
+	advance_inactive_age(memcg, pgdat, is_file);
+	workingset_stat = WORKINGSET_ACTIVATE_BASE + is_file;
+	inc_lruvec_state(lruvec, workingset_stat);
 
 	/* Page was active prior to eviction */
 	if (workingset) {
@@ -371,6 +379,7 @@ void workingset_refault(struct page *page, void *shadow)
 void workingset_activation(struct page *page)
 {
 	struct mem_cgroup *memcg;
+	int is_file = page_is_file_cache(page);
 
 	rcu_read_lock();
 	/*
@@ -383,7 +392,7 @@ void workingset_activation(struct page *page)
 	memcg = page_memcg_rcu(page);
 	if (!mem_cgroup_disabled() && !memcg)
 		goto out;
-	advance_inactive_age(memcg, page_pgdat(page));
+	advance_inactive_age(memcg, page_pgdat(page), is_file);
 out:
 	rcu_read_unlock();
 }
-- 
2.7.4



^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 4/9] mm/swapcache: support to handle the value in swapcache
  2020-02-11  6:19 [PATCH 0/9] workingset protection/detection on the anonymous LRU list js1304
                   ` (2 preceding siblings ...)
  2020-02-11  6:19 ` [PATCH 3/9] mm/workingset: extend the workingset detection for anon LRU js1304
@ 2020-02-11  6:19 ` js1304
  2020-02-11  6:19 ` [PATCH 5/9] mm/workingset: use the node counter if memcg is the root memcg js1304
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: js1304 @ 2020-02-11  6:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Johannes Weiner, Michal Hocko,
	Hugh Dickins, Minchan Kim, Vlastimil Babka, Mel Gorman,
	kernel-team, Joonsoo Kim

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Swapcache doesn't handle the value since there is no case using the value.
In the following patch, workingset detection for anonymous page will be
implemented and it stores the value into the swapcache. So, we need to
handle it and this patch implement handling.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 include/linux/swap.h |  5 +++--
 mm/swap_state.c      | 23 ++++++++++++++++++++---
 mm/vmscan.c          |  2 +-
 3 files changed, 24 insertions(+), 6 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 954e13e..0df8b3f 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -410,7 +410,8 @@ extern void show_swap_cache_info(void);
 extern int add_to_swap(struct page *page);
 extern int add_to_swap_cache(struct page *, swp_entry_t, gfp_t);
 extern int __add_to_swap_cache(struct page *page, swp_entry_t entry);
-extern void __delete_from_swap_cache(struct page *, swp_entry_t entry);
+extern void __delete_from_swap_cache(struct page *page,
+			swp_entry_t entry, void *shadow);
 extern void delete_from_swap_cache(struct page *);
 extern void free_page_and_swap_cache(struct page *);
 extern void free_pages_and_swap_cache(struct page **, int);
@@ -571,7 +572,7 @@ static inline int add_to_swap_cache(struct page *page, swp_entry_t entry,
 }
 
 static inline void __delete_from_swap_cache(struct page *page,
-							swp_entry_t entry)
+					swp_entry_t entry, void *shadow)
 {
 }
 
diff --git a/mm/swap_state.c b/mm/swap_state.c
index 8e7ce9a..3fbbe45 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -117,6 +117,10 @@ int add_to_swap_cache(struct page *page, swp_entry_t entry, gfp_t gfp)
 	pgoff_t idx = swp_offset(entry);
 	XA_STATE_ORDER(xas, &address_space->i_pages, idx, compound_order(page));
 	unsigned long i, nr = compound_nr(page);
+	unsigned long nrexceptional = 0;
+	void *old;
+
+	xas_set_update(&xas, workingset_update_node);
 
 	VM_BUG_ON_PAGE(!PageLocked(page), page);
 	VM_BUG_ON_PAGE(PageSwapCache(page), page);
@@ -132,10 +136,14 @@ int add_to_swap_cache(struct page *page, swp_entry_t entry, gfp_t gfp)
 			goto unlock;
 		for (i = 0; i < nr; i++) {
 			VM_BUG_ON_PAGE(xas.xa_index != idx + i, page);
+			old = xas_load(&xas);
+			if (xa_is_value(old))
+				nrexceptional++;
 			set_page_private(page + i, entry.val + i);
 			xas_store(&xas, page);
 			xas_next(&xas);
 		}
+		address_space->nrexceptional -= nrexceptional;
 		address_space->nrpages += nr;
 		__mod_node_page_state(page_pgdat(page), NR_FILE_PAGES, nr);
 		ADD_CACHE_INFO(add_total, nr);
@@ -155,24 +163,33 @@ int add_to_swap_cache(struct page *page, swp_entry_t entry, gfp_t gfp)
  * This must be called only on pages that have
  * been verified to be in the swap cache.
  */
-void __delete_from_swap_cache(struct page *page, swp_entry_t entry)
+void __delete_from_swap_cache(struct page *page,
+			swp_entry_t entry, void *shadow)
 {
 	struct address_space *address_space = swap_address_space(entry);
 	int i, nr = hpage_nr_pages(page);
 	pgoff_t idx = swp_offset(entry);
 	XA_STATE(xas, &address_space->i_pages, idx);
 
+	/* Do not apply workingset detection for the hugh page */
+	if (nr > 1)
+		shadow = NULL;
+
+	xas_set_update(&xas, workingset_update_node);
+
 	VM_BUG_ON_PAGE(!PageLocked(page), page);
 	VM_BUG_ON_PAGE(!PageSwapCache(page), page);
 	VM_BUG_ON_PAGE(PageWriteback(page), page);
 
 	for (i = 0; i < nr; i++) {
-		void *entry = xas_store(&xas, NULL);
+		void *entry = xas_store(&xas, shadow);
 		VM_BUG_ON_PAGE(entry != page, entry);
 		set_page_private(page + i, 0);
 		xas_next(&xas);
 	}
 	ClearPageSwapCache(page);
+	if (shadow)
+		address_space->nrexceptional += nr;
 	address_space->nrpages -= nr;
 	__mod_node_page_state(page_pgdat(page), NR_FILE_PAGES, -nr);
 	ADD_CACHE_INFO(del_total, nr);
@@ -247,7 +264,7 @@ void delete_from_swap_cache(struct page *page)
 	struct address_space *address_space = swap_address_space(entry);
 
 	xa_lock_irq(&address_space->i_pages);
-	__delete_from_swap_cache(page, entry);
+	__delete_from_swap_cache(page, entry, NULL);
 	xa_unlock_irq(&address_space->i_pages);
 
 	put_swap_page(page, entry);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 74c3ade..99588ba 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -909,7 +909,7 @@ static int __remove_mapping(struct address_space *mapping, struct page *page,
 	if (PageSwapCache(page)) {
 		swp_entry_t swap = { .val = page_private(page) };
 		mem_cgroup_swapout(page, swap);
-		__delete_from_swap_cache(page, swap);
+		__delete_from_swap_cache(page, swap, NULL);
 		xa_unlock_irqrestore(&mapping->i_pages, flags);
 		put_swap_page(page, swap);
 	} else {
-- 
2.7.4



^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 5/9] mm/workingset: use the node counter if memcg is the root memcg
  2020-02-11  6:19 [PATCH 0/9] workingset protection/detection on the anonymous LRU list js1304
                   ` (3 preceding siblings ...)
  2020-02-11  6:19 ` [PATCH 4/9] mm/swapcache: support to handle the value in swapcache js1304
@ 2020-02-11  6:19 ` js1304
  2020-02-11  6:19 ` [PATCH 6/9] mm/workingset: handle the page without memcg js1304
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: js1304 @ 2020-02-11  6:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Johannes Weiner, Michal Hocko,
	Hugh Dickins, Minchan Kim, Vlastimil Babka, Mel Gorman,
	kernel-team, Joonsoo Kim

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

In the following patch, workingset detection is implemented for the
swap cache. Swap cache's node is usually allocated by kswapd and it
isn't charged by kmemcg since it is from the kernel thread. So the swap
cache's shadow node is managed by the node list of the list_lru rather
than the memcg specific one.

If counting the shadow node on the root memcg happens to reclaim the slab
object, the shadow node count returns the number of the shadow node on
the node list of the list_lru since root memcg has the kmem_cache_id, -1.

However, the size of pages on the LRU is calculated by using the specific
memcg, so mismatch happens. This causes the number of shadow node not to
be increased to the enough size and, therefore, workingset detection
cannot work correctly. This patch fixes this bug by checking if the memcg
is the root memcg or not. If it is the root memcg, instead of using
the memcg-specific LRU, the system-wide LRU is used to calculate proper
size of the shadow node so that the number of the shadow node can grow
as expected.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 mm/workingset.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/mm/workingset.c b/mm/workingset.c
index d04f70a..636aafc 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -468,7 +468,13 @@ static unsigned long count_shadow_nodes(struct shrinker *shrinker,
 	 * PAGE_SIZE / xa_nodes / node_entries * 8 / PAGE_SIZE
 	 */
 #ifdef CONFIG_MEMCG
-	if (sc->memcg) {
+	/*
+	 * Kernel allocation on root memcg isn't regarded as allocation of
+	 * specific memcg. So, if sc->memcg is the root memcg, we need to
+	 * use the count for the node rather than one for the specific
+	 * memcg.
+	 */
+	if (sc->memcg && !mem_cgroup_is_root(sc->memcg)) {
 		struct lruvec *lruvec;
 		int i;
 
-- 
2.7.4



^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 6/9] mm/workingset: handle the page without memcg
  2020-02-11  6:19 [PATCH 0/9] workingset protection/detection on the anonymous LRU list js1304
                   ` (4 preceding siblings ...)
  2020-02-11  6:19 ` [PATCH 5/9] mm/workingset: use the node counter if memcg is the root memcg js1304
@ 2020-02-11  6:19 ` js1304
  2020-02-11  6:19 ` [PATCH 7/9] mm/swap: implement workingset detection for anonymous LRU js1304
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: js1304 @ 2020-02-11  6:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Johannes Weiner, Michal Hocko,
	Hugh Dickins, Minchan Kim, Vlastimil Babka, Mel Gorman,
	kernel-team, Joonsoo Kim

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

When implementing workingset detection for anonymous page, I found
some swapcache pages with NULL memcg. From the code reading, I found
two reasons.

One is the case that swap-in readahead happens. The other is the
corner case related to the shmem cache. These two problems should be
fixed, but, it's not straight-forward to fix. For example, when swap-off,
all swapped-out pages are read into swapcache. In this case, who's the
owner of the swapcache page?

Since this problem doesn't look trivial, I decide to leave the issue and
handles this corner case on the place where the error occurs.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 mm/workingset.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/mm/workingset.c b/mm/workingset.c
index 636aafc..20286d6 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -257,6 +257,10 @@ void *workingset_eviction(struct page *page, struct mem_cgroup *target_memcg)
 	VM_BUG_ON_PAGE(page_count(page), page);
 	VM_BUG_ON_PAGE(!PageLocked(page), page);
 
+	/* page_memcg() can be NULL if swap-in readahead happens */
+	if (!page_memcg(page))
+		return NULL;
+
 	advance_inactive_age(page_memcg(page), pgdat, is_file);
 
 	lruvec = mem_cgroup_lruvec(target_memcg, pgdat);
-- 
2.7.4



^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 7/9] mm/swap: implement workingset detection for anonymous LRU
  2020-02-11  6:19 [PATCH 0/9] workingset protection/detection on the anonymous LRU list js1304
                   ` (5 preceding siblings ...)
  2020-02-11  6:19 ` [PATCH 6/9] mm/workingset: handle the page without memcg js1304
@ 2020-02-11  6:19 ` js1304
  2020-02-11  6:19 ` [PATCH 8/9] mm/vmscan: restore active/inactive ratio " js1304
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: js1304 @ 2020-02-11  6:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Johannes Weiner, Michal Hocko,
	Hugh Dickins, Minchan Kim, Vlastimil Babka, Mel Gorman,
	kernel-team, Joonsoo Kim

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

This patch implements workingset detection for anonymous LRU.
All the infrastructure is implemented by the previous patches so this patch
just activates the workingset detection by installing/retrieving
the shadow entry.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 include/linux/swap.h | 11 +++++++++--
 mm/memory.c          |  7 ++++++-
 mm/shmem.c           |  3 ++-
 mm/swap_state.c      | 31 ++++++++++++++++++++++++++-----
 mm/vmscan.c          |  7 +++++--
 5 files changed, 48 insertions(+), 11 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 0df8b3f..fb4772e 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -408,7 +408,9 @@ extern struct address_space *swapper_spaces[];
 extern unsigned long total_swapcache_pages(void);
 extern void show_swap_cache_info(void);
 extern int add_to_swap(struct page *page);
-extern int add_to_swap_cache(struct page *, swp_entry_t, gfp_t);
+extern void *get_shadow_from_swap_cache(swp_entry_t entry);
+extern int add_to_swap_cache(struct page *page, swp_entry_t entry,
+			gfp_t gfp, void **shadowp);
 extern int __add_to_swap_cache(struct page *page, swp_entry_t entry);
 extern void __delete_from_swap_cache(struct page *page,
 			swp_entry_t entry, void *shadow);
@@ -565,8 +567,13 @@ static inline int add_to_swap(struct page *page)
 	return 0;
 }
 
+static inline void *get_shadow_from_swap_cache(swp_entry_t entry)
+{
+	return NULL;
+}
+
 static inline int add_to_swap_cache(struct page *page, swp_entry_t entry,
-							gfp_t gfp_mask)
+					gfp_t gfp_mask, void **shadowp)
 {
 	return -1;
 }
diff --git a/mm/memory.c b/mm/memory.c
index 5f7813a..91a2097 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2925,10 +2925,15 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 			page = alloc_page_vma(GFP_HIGHUSER_MOVABLE, vma,
 							vmf->address);
 			if (page) {
+				void *shadow;
+
 				__SetPageLocked(page);
 				__SetPageSwapBacked(page);
 				set_page_private(page, entry.val);
-				lru_cache_add_anon(page);
+				shadow = get_shadow_from_swap_cache(entry);
+				if (shadow)
+					workingset_refault(page, shadow);
+				lru_cache_add(page);
 				swap_readpage(page, true);
 			}
 		} else {
diff --git a/mm/shmem.c b/mm/shmem.c
index 8793e8c..c6663ad 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1370,7 +1370,8 @@ static int shmem_writepage(struct page *page, struct writeback_control *wbc)
 		list_add(&info->swaplist, &shmem_swaplist);
 
 	if (add_to_swap_cache(page, swap,
-			__GFP_HIGH | __GFP_NOMEMALLOC | __GFP_NOWARN) == 0) {
+			__GFP_HIGH | __GFP_NOMEMALLOC | __GFP_NOWARN,
+			NULL) == 0) {
 		spin_lock_irq(&info->lock);
 		shmem_recalc_inode(inode);
 		info->swapped++;
diff --git a/mm/swap_state.c b/mm/swap_state.c
index 3fbbe45..7f7cb19 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -107,11 +107,24 @@ void show_swap_cache_info(void)
 	printk("Total swap = %lukB\n", total_swap_pages << (PAGE_SHIFT - 10));
 }
 
+void *get_shadow_from_swap_cache(swp_entry_t entry)
+{
+	struct address_space *address_space = swap_address_space(entry);
+	pgoff_t idx = swp_offset(entry);
+	struct page *page;
+
+	page = find_get_entry(address_space, idx);
+	if (xa_is_value(page))
+		return page;
+	return NULL;
+}
+
 /*
  * add_to_swap_cache resembles add_to_page_cache_locked on swapper_space,
  * but sets SwapCache flag and private instead of mapping and index.
  */
-int add_to_swap_cache(struct page *page, swp_entry_t entry, gfp_t gfp)
+int add_to_swap_cache(struct page *page, swp_entry_t entry,
+			gfp_t gfp, void **shadowp)
 {
 	struct address_space *address_space = swap_address_space(entry);
 	pgoff_t idx = swp_offset(entry);
@@ -137,8 +150,11 @@ int add_to_swap_cache(struct page *page, swp_entry_t entry, gfp_t gfp)
 		for (i = 0; i < nr; i++) {
 			VM_BUG_ON_PAGE(xas.xa_index != idx + i, page);
 			old = xas_load(&xas);
-			if (xa_is_value(old))
+			if (xa_is_value(old)) {
 				nrexceptional++;
+				if (shadowp)
+					*shadowp = old;
+			}
 			set_page_private(page + i, entry.val + i);
 			xas_store(&xas, page);
 			xas_next(&xas);
@@ -226,7 +242,7 @@ int add_to_swap(struct page *page)
 	 * Add it to the swap cache.
 	 */
 	err = add_to_swap_cache(page, entry,
-			__GFP_HIGH|__GFP_NOMEMALLOC|__GFP_NOWARN);
+			__GFP_HIGH|__GFP_NOMEMALLOC|__GFP_NOWARN, NULL);
 	if (err)
 		/*
 		 * add_to_swap_cache() doesn't return -EEXIST, so we can safely
@@ -380,6 +396,7 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
 	struct page *found_page = NULL, *new_page = NULL;
 	struct swap_info_struct *si;
 	int err;
+	void *shadow;
 	*new_page_allocated = false;
 
 	do {
@@ -435,11 +452,15 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
 		/* May fail (-ENOMEM) if XArray node allocation failed. */
 		__SetPageLocked(new_page);
 		__SetPageSwapBacked(new_page);
-		err = add_to_swap_cache(new_page, entry, gfp_mask & GFP_KERNEL);
+		shadow = NULL;
+		err = add_to_swap_cache(new_page, entry,
+				gfp_mask & GFP_KERNEL, &shadow);
 		if (likely(!err)) {
 			/* Initiate read into locked page */
 			SetPageWorkingset(new_page);
-			lru_cache_add_anon(new_page);
+			if (shadow)
+				workingset_refault(new_page, shadow);
+			lru_cache_add(new_page);
 			*new_page_allocated = true;
 			return new_page;
 		}
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 99588ba..a1892e7 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -867,6 +867,7 @@ static int __remove_mapping(struct address_space *mapping, struct page *page,
 {
 	unsigned long flags;
 	int refcount;
+	void *shadow = NULL;
 
 	BUG_ON(!PageLocked(page));
 	BUG_ON(mapping != page_mapping(page));
@@ -909,12 +910,13 @@ static int __remove_mapping(struct address_space *mapping, struct page *page,
 	if (PageSwapCache(page)) {
 		swp_entry_t swap = { .val = page_private(page) };
 		mem_cgroup_swapout(page, swap);
-		__delete_from_swap_cache(page, swap, NULL);
+		if (reclaimed && !mapping_exiting(mapping))
+			shadow = workingset_eviction(page, target_memcg);
+		__delete_from_swap_cache(page, swap, shadow);
 		xa_unlock_irqrestore(&mapping->i_pages, flags);
 		put_swap_page(page, swap);
 	} else {
 		void (*freepage)(struct page *);
-		void *shadow = NULL;
 
 		freepage = mapping->a_ops->freepage;
 		/*
@@ -1485,6 +1487,7 @@ static unsigned long shrink_page_list(struct list_head *page_list,
 			SetPageActive(page);
 			stat->nr_activate[type] += nr_pages;
 			count_memcg_page_event(page, PGACTIVATE);
+			workingset_activation(page);
 		}
 keep_locked:
 		unlock_page(page);
-- 
2.7.4



^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 8/9] mm/vmscan: restore active/inactive ratio for anonymous LRU
  2020-02-11  6:19 [PATCH 0/9] workingset protection/detection on the anonymous LRU list js1304
                   ` (6 preceding siblings ...)
  2020-02-11  6:19 ` [PATCH 7/9] mm/swap: implement workingset detection for anonymous LRU js1304
@ 2020-02-11  6:19 ` " js1304
  2020-02-11  6:19 ` [PATCH 9/9] mm/swap: count a new anonymous page as a reclaim_state's rotate js1304
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 14+ messages in thread
From: js1304 @ 2020-02-11  6:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Johannes Weiner, Michal Hocko,
	Hugh Dickins, Minchan Kim, Vlastimil Babka, Mel Gorman,
	kernel-team, Joonsoo Kim

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

Now, workingset detection is implemented for anonymous LRU.
We don't have to worry about the misfound for workingset due to
the ratio of active/inactive. Let's restore the ratio.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 mm/vmscan.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index a1892e7..81ff725 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2237,7 +2237,7 @@ static bool inactive_is_low(struct lruvec *lruvec, enum lru_list inactive_lru)
 	active = lruvec_page_state(lruvec, NR_LRU_BASE + active_lru);
 
 	gb = (inactive + active) >> (30 - PAGE_SHIFT);
-	if (gb && is_file_lru(inactive_lru))
+	if (gb)
 		inactive_ratio = int_sqrt(10 * gb);
 	else
 		inactive_ratio = 1;
-- 
2.7.4



^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 9/9] mm/swap: count a new anonymous page as a reclaim_state's rotate
  2020-02-11  6:19 [PATCH 0/9] workingset protection/detection on the anonymous LRU list js1304
                   ` (7 preceding siblings ...)
  2020-02-11  6:19 ` [PATCH 8/9] mm/vmscan: restore active/inactive ratio " js1304
@ 2020-02-11  6:19 ` js1304
  2020-02-12  3:35 ` Hillf Danton
  2020-02-14  4:12 ` [PATCH 0/9] workingset protection/detection on the anonymous LRU list Joonsoo Kim
  10 siblings, 0 replies; 14+ messages in thread
From: js1304 @ 2020-02-11  6:19 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Johannes Weiner, Michal Hocko,
	Hugh Dickins, Minchan Kim, Vlastimil Babka, Mel Gorman,
	kernel-team, Joonsoo Kim

From: Joonsoo Kim <iamjoonsoo.kim@lge.com>

reclaim_stat's rotate is used for controlling the ratio of scanning page
between file and anonymous LRU. All new anonymous pages are counted
for rotate before the patch, protecting anonymous pages on active LRU, and,
it makes that reclaim on anonymous LRU is less happened than file LRU.

Now, situation is changed. all new anonymous pages are not added
to the active LRU so rotate would be far less than before. It will cause
that reclaim on anonymous LRU happens more and it would result in bad
effect on some system that is optimized for previous setting.

Therefore, this patch counts a new anonymous page as a reclaim_state's
rotate. Although it is non-logical to add this count to
the reclaim_state's rotate in current algorithm, reducing the regression
would be more important.

I found this regression on kernel-build test and it is roughly 2~5%
performance degradation. With this workaround, performance is completely
restored.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 mm/swap.c | 27 ++++++++++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/mm/swap.c b/mm/swap.c
index 18b2735..c3584af 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -187,6 +187,9 @@ int get_kernel_page(unsigned long start, int write, struct page **pages)
 }
 EXPORT_SYMBOL_GPL(get_kernel_page);
 
+static void __pagevec_lru_add_fn(struct page *page, struct lruvec *lruvec,
+				 void *arg);
+
 static void pagevec_lru_move_fn(struct pagevec *pvec,
 	void (*move_fn)(struct page *page, struct lruvec *lruvec, void *arg),
 	void *arg)
@@ -207,6 +210,19 @@ static void pagevec_lru_move_fn(struct pagevec *pvec,
 			spin_lock_irqsave(&pgdat->lru_lock, flags);
 		}
 
+		if (move_fn == __pagevec_lru_add_fn) {
+			struct list_head *entry = &page->lru;
+			unsigned long next = (unsigned long)entry->next;
+			unsigned long rotate = next & 2;
+
+			if (rotate) {
+				VM_BUG_ON(arg);
+
+				next = next & ~2;
+				entry->next = (struct list_head *)next;
+				arg = (void *)rotate;
+			}
+		}
 		lruvec = mem_cgroup_page_lruvec(page, pgdat);
 		(*move_fn)(page, lruvec, arg);
 	}
@@ -475,6 +491,14 @@ void lru_cache_add_inactive_or_unevictable(struct page *page,
 				    hpage_nr_pages(page));
 		count_vm_event(UNEVICTABLE_PGMLOCKED);
 	}
+
+	if (PageSwapBacked(page) && evictable) {
+		struct list_head *entry = &page->lru;
+		unsigned long next = (unsigned long)entry->next;
+
+		next = next | 2;
+		entry->next = (struct list_head *)next;
+	}
 	lru_cache_add(page);
 }
 
@@ -927,6 +951,7 @@ static void __pagevec_lru_add_fn(struct page *page, struct lruvec *lruvec,
 {
 	enum lru_list lru;
 	int was_unevictable = TestClearPageUnevictable(page);
+	unsigned long rotate = (unsigned long)arg;
 
 	VM_BUG_ON_PAGE(PageLRU(page), page);
 
@@ -962,7 +987,7 @@ static void __pagevec_lru_add_fn(struct page *page, struct lruvec *lruvec,
 	if (page_evictable(page)) {
 		lru = page_lru(page);
 		update_page_reclaim_stat(lruvec, page_is_file_cache(page),
-					 PageActive(page));
+					 PageActive(page) | rotate);
 		if (was_unevictable)
 			count_vm_event(UNEVICTABLE_PGRESCUED);
 	} else {
-- 
2.7.4



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 9/9] mm/swap: count a new anonymous page as a reclaim_state's rotate
  2020-02-11  6:19 [PATCH 0/9] workingset protection/detection on the anonymous LRU list js1304
                   ` (8 preceding siblings ...)
  2020-02-11  6:19 ` [PATCH 9/9] mm/swap: count a new anonymous page as a reclaim_state's rotate js1304
@ 2020-02-12  3:35 ` Hillf Danton
  2020-02-12 11:00   ` Joonsoo Kim
  2020-02-14  4:12 ` [PATCH 0/9] workingset protection/detection on the anonymous LRU list Joonsoo Kim
  10 siblings, 1 reply; 14+ messages in thread
From: Hillf Danton @ 2020-02-12  3:35 UTC (permalink / raw)
  To: js1304
  Cc: Andrew Morton, linux-mm, linux-kernel, Johannes Weiner,
	Michal Hocko, Hugh Dickins, Minchan Kim, Vlastimil Babka,
	Mel Gorman, kernel-team, Joonsoo Kim


On Mon, 10 Feb 2020 22:20:37 -0800 (PST)
> From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> 
> reclaim_stat's rotate is used for controlling the ratio of scanning page
> between file and anonymous LRU. All new anonymous pages are counted
> for rotate before the patch, protecting anonymous pages on active LRU, and,
> it makes that reclaim on anonymous LRU is less happened than file LRU.
> 
> Now, situation is changed. all new anonymous pages are not added
> to the active LRU so rotate would be far less than before. It will cause
> that reclaim on anonymous LRU happens more and it would result in bad
> effect on some system that is optimized for previous setting.
> 
> Therefore, this patch counts a new anonymous page as a reclaim_state's
> rotate. Although it is non-logical to add this count to
> the reclaim_state's rotate in current algorithm, reducing the regression
> would be more important.
> 
> I found this regression on kernel-build test and it is roughly 2~5%
> performance degradation. With this workaround, performance is completely
> restored.
> 
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> ---
>  mm/swap.c | 27 ++++++++++++++++++++++++++-
>  1 file changed, 26 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/swap.c b/mm/swap.c
> index 18b2735..c3584af 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -187,6 +187,9 @@ int get_kernel_page(unsigned long start, int write, struct page **pages)
>  }
>  EXPORT_SYMBOL_GPL(get_kernel_page);
>  
> +static void __pagevec_lru_add_fn(struct page *page, struct lruvec *lruvec,
> +				 void *arg);
> +
>  static void pagevec_lru_move_fn(struct pagevec *pvec,
>  	void (*move_fn)(struct page *page, struct lruvec *lruvec, void *arg),
>  	void *arg)
> @@ -207,6 +210,19 @@ static void pagevec_lru_move_fn(struct pagevec *pvec,
>  			spin_lock_irqsave(&pgdat->lru_lock, flags);
>  		}
>  
> +		if (move_fn == __pagevec_lru_add_fn) {
> +			struct list_head *entry = &page->lru;
> +			unsigned long next = (unsigned long)entry->next;
> +			unsigned long rotate = next & 2;
> +
> +			if (rotate) {
> +				VM_BUG_ON(arg);
> +
> +				next = next & ~2;
> +				entry->next = (struct list_head *)next;
> +				arg = (void *)rotate;
> +			}
> +		}
>  		lruvec = mem_cgroup_page_lruvec(page, pgdat);
>  		(*move_fn)(page, lruvec, arg);
>  	}
> @@ -475,6 +491,14 @@ void lru_cache_add_inactive_or_unevictable(struct page *page,
>  				    hpage_nr_pages(page));
>  		count_vm_event(UNEVICTABLE_PGMLOCKED);
>  	}
> +
> +	if (PageSwapBacked(page) && evictable) {
> +		struct list_head *entry = &page->lru;
> +		unsigned long next = (unsigned long)entry->next;
> +
> +		next = next | 2;
> +		entry->next = (struct list_head *)next;
> +	}
>  	lru_cache_add(page);
>  }
>  
> @@ -927,6 +951,7 @@ static void __pagevec_lru_add_fn(struct page *page, struct lruvec *lruvec,
>  {
>  	enum lru_list lru;
>  	int was_unevictable = TestClearPageUnevictable(page);
> +	unsigned long rotate = (unsigned long)arg;
>  
>  	VM_BUG_ON_PAGE(PageLRU(page), page);
>  
> @@ -962,7 +987,7 @@ static void __pagevec_lru_add_fn(struct page *page, struct lruvec *lruvec,
>  	if (page_evictable(page)) {
>  		lru = page_lru(page);
>  		update_page_reclaim_stat(lruvec, page_is_file_cache(page),
> -					 PageActive(page));
> +					 PageActive(page) | rotate);


Is it likely to rotate a page if we know it's not active?

 		update_page_reclaim_stat(lruvec, page_is_file_cache(page),
-					 PageActive(page));
+					 PageActive(page) ||
+					 !page_is_file_cache(page));


>  		if (was_unevictable)
>  			count_vm_event(UNEVICTABLE_PGRESCUED);
>  	} else {
> -- 
> 2.7.4



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 9/9] mm/swap: count a new anonymous page as a reclaim_state's rotate
  2020-02-12  3:35 ` Hillf Danton
@ 2020-02-12 11:00   ` Joonsoo Kim
  0 siblings, 0 replies; 14+ messages in thread
From: Joonsoo Kim @ 2020-02-12 11:00 UTC (permalink / raw)
  To: Hillf Danton
  Cc: Andrew Morton, Linux Memory Management List, LKML,
	Johannes Weiner, Michal Hocko, Hugh Dickins, Minchan Kim,
	Vlastimil Babka, Mel Gorman, kernel-team, Joonsoo Kim

Hello,

2020년 2월 12일 (수) 오후 12:35, Hillf Danton <hdanton@sina.com>님이 작성:
>
>
> On Mon, 10 Feb 2020 22:20:37 -0800 (PST)
> > From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> >
> > reclaim_stat's rotate is used for controlling the ratio of scanning page
> > between file and anonymous LRU. All new anonymous pages are counted
> > for rotate before the patch, protecting anonymous pages on active LRU, and,
> > it makes that reclaim on anonymous LRU is less happened than file LRU.
> >
> > Now, situation is changed. all new anonymous pages are not added
> > to the active LRU so rotate would be far less than before. It will cause
> > that reclaim on anonymous LRU happens more and it would result in bad
> > effect on some system that is optimized for previous setting.
> >
> > Therefore, this patch counts a new anonymous page as a reclaim_state's
> > rotate. Although it is non-logical to add this count to
> > the reclaim_state's rotate in current algorithm, reducing the regression
> > would be more important.
> >
> > I found this regression on kernel-build test and it is roughly 2~5%
> > performance degradation. With this workaround, performance is completely
> > restored.
> >
> > Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> > ---
> >  mm/swap.c | 27 ++++++++++++++++++++++++++-
> >  1 file changed, 26 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/swap.c b/mm/swap.c
> > index 18b2735..c3584af 100644
> > --- a/mm/swap.c
> > +++ b/mm/swap.c
> > @@ -187,6 +187,9 @@ int get_kernel_page(unsigned long start, int write, struct page **pages)
> >  }
> >  EXPORT_SYMBOL_GPL(get_kernel_page);
> >
> > +static void __pagevec_lru_add_fn(struct page *page, struct lruvec *lruvec,
> > +                              void *arg);
> > +
> >  static void pagevec_lru_move_fn(struct pagevec *pvec,
> >       void (*move_fn)(struct page *page, struct lruvec *lruvec, void *arg),
> >       void *arg)
> > @@ -207,6 +210,19 @@ static void pagevec_lru_move_fn(struct pagevec *pvec,
> >                       spin_lock_irqsave(&pgdat->lru_lock, flags);
> >               }
> >
> > +             if (move_fn == __pagevec_lru_add_fn) {
> > +                     struct list_head *entry = &page->lru;
> > +                     unsigned long next = (unsigned long)entry->next;
> > +                     unsigned long rotate = next & 2;
> > +
> > +                     if (rotate) {
> > +                             VM_BUG_ON(arg);
> > +
> > +                             next = next & ~2;
> > +                             entry->next = (struct list_head *)next;
> > +                             arg = (void *)rotate;
> > +                     }
> > +             }
> >               lruvec = mem_cgroup_page_lruvec(page, pgdat);
> >               (*move_fn)(page, lruvec, arg);
> >       }
> > @@ -475,6 +491,14 @@ void lru_cache_add_inactive_or_unevictable(struct page *page,
> >                                   hpage_nr_pages(page));
> >               count_vm_event(UNEVICTABLE_PGMLOCKED);
> >       }
> > +
> > +     if (PageSwapBacked(page) && evictable) {
> > +             struct list_head *entry = &page->lru;
> > +             unsigned long next = (unsigned long)entry->next;
> > +
> > +             next = next | 2;
> > +             entry->next = (struct list_head *)next;
> > +     }
> >       lru_cache_add(page);
> >  }
> >
> > @@ -927,6 +951,7 @@ static void __pagevec_lru_add_fn(struct page *page, struct lruvec *lruvec,
> >  {
> >       enum lru_list lru;
> >       int was_unevictable = TestClearPageUnevictable(page);
> > +     unsigned long rotate = (unsigned long)arg;
> >
> >       VM_BUG_ON_PAGE(PageLRU(page), page);
> >
> > @@ -962,7 +987,7 @@ static void __pagevec_lru_add_fn(struct page *page, struct lruvec *lruvec,
> >       if (page_evictable(page)) {
> >               lru = page_lru(page);
> >               update_page_reclaim_stat(lruvec, page_is_file_cache(page),
> > -                                      PageActive(page));
> > +                                      PageActive(page) | rotate);
>
>
> Is it likely to rotate a page if we know it's not active?
>
>                 update_page_reclaim_stat(lruvec, page_is_file_cache(page),
> -                                        PageActive(page));
> +                                        PageActive(page) ||
> +                                        !page_is_file_cache(page));
>

My intention is that only newly created anonymous pages contributes
the rotate count.
With your code suggestion, other case for anonymous pages could also contributes
the rotate count since  __pagevec_lru_add_fn() is used else where.

Thanks.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/9] mm/workingset: extend the workingset detection for anon LRU
  2020-02-11  6:19 ` [PATCH 3/9] mm/workingset: extend the workingset detection for anon LRU js1304
@ 2020-02-14  4:07   ` Joonsoo Kim
  0 siblings, 0 replies; 14+ messages in thread
From: Joonsoo Kim @ 2020-02-14  4:07 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Linux Memory Management List, LKML, Johannes Weiner,
	Michal Hocko, Hugh Dickins, Minchan Kim, Vlastimil Babka,
	Mel Gorman, kernel-team, Joonsoo Kim

2020년 2월 11일 (화) 오후 3:20, <js1304@gmail.com>님이 작성:
>
> From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
>
> In the following patch, workingset detection will be applied to
> anonymous LRU. To prepare it, this patch adds some code to
> distinguish/handle the both LRUs.
>
> Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
> ---
>  include/linux/mmzone.h | 14 +++++++++-----
>  mm/memcontrol.c        | 12 ++++++++----
>  mm/vmscan.c            | 15 ++++++++++-----
>  mm/vmstat.c            |  6 ++++--
>  mm/workingset.c        | 35 ++++++++++++++++++++++-------------
>  5 files changed, 53 insertions(+), 29 deletions(-)

This patch should be changed as following.

-       enum lru_list active_lru = page_lru_base_type(page) + LRU_ACTIVE_FILE;
+       enum lru_list active_lru = page_lru_base_type(page) + LRU_ACTIVE;

Whole fixed patch is as following.

--------------------->8----------------------
From 2b0691140d11c4e9a0f1500dda831b70697b2a00 Mon Sep 17 00:00:00 2001
From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Date: Fri, 15 Nov 2019 09:40:22 +0900
Subject: [PATCH] mm/workingset: extend the workingset detection for anon LRU

In the following patch, workingset detection will be applied to
anonymous LRU. To prepare it, this patch adds some code to
distinguish/handle the both LRUs.

Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
---
 include/linux/mmzone.h | 14 +++++++++-----
 mm/memcontrol.c        | 12 ++++++++----
 mm/vmscan.c            | 15 ++++++++++-----
 mm/vmstat.c            |  6 ++++--
 mm/workingset.c        | 35 ++++++++++++++++++++++-------------
 5 files changed, 53 insertions(+), 29 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 5334ad8fc7bd..b78fd8c7284b 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -220,8 +220,12 @@ enum node_stat_item {
  NR_ISOLATED_ANON, /* Temporary isolated pages from anon lru */
  NR_ISOLATED_FILE, /* Temporary isolated pages from file lru */
  WORKINGSET_NODES,
- WORKINGSET_REFAULT,
- WORKINGSET_ACTIVATE,
+ WORKINGSET_REFAULT_BASE,
+ WORKINGSET_REFAULT_ANON = WORKINGSET_REFAULT_BASE,
+ WORKINGSET_REFAULT_FILE,
+ WORKINGSET_ACTIVATE_BASE,
+ WORKINGSET_ACTIVATE_ANON = WORKINGSET_ACTIVATE_BASE,
+ WORKINGSET_ACTIVATE_FILE,
  WORKINGSET_RESTORE,
  WORKINGSET_NODERECLAIM,
  NR_ANON_MAPPED, /* Mapped anonymous pages */
@@ -304,10 +308,10 @@ enum lruvec_flags {
 struct lruvec {
  struct list_head lists[NR_LRU_LISTS];
  struct zone_reclaim_stat reclaim_stat;
- /* Evictions & activations on the inactive file list */
- atomic_long_t inactive_age;
+ /* Evictions & activations on the inactive list */
+ atomic_long_t inactive_age[2];
  /* Refaults at the time of last reclaim cycle */
- unsigned long refaults;
+ unsigned long refaults[2];
  /* Various lruvec state flags (enum lruvec_flags) */
  unsigned long flags;
 #ifdef CONFIG_MEMCG
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 6c83cf4ed970..8f4473d6ff9c 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1431,10 +1431,14 @@ static char *memory_stat_format(struct
mem_cgroup *memcg)
  seq_buf_printf(&s, "%s %lu\n", vm_event_name(PGMAJFAULT),
         memcg_events(memcg, PGMAJFAULT));

- seq_buf_printf(&s, "workingset_refault %lu\n",
-        memcg_page_state(memcg, WORKINGSET_REFAULT));
- seq_buf_printf(&s, "workingset_activate %lu\n",
-        memcg_page_state(memcg, WORKINGSET_ACTIVATE));
+ seq_buf_printf(&s, "workingset_refault_anon %lu\n",
+        memcg_page_state(memcg, WORKINGSET_REFAULT_ANON));
+ seq_buf_printf(&s, "workingset_refault_file %lu\n",
+        memcg_page_state(memcg, WORKINGSET_REFAULT_FILE));
+ seq_buf_printf(&s, "workingset_activate_anon %lu\n",
+        memcg_page_state(memcg, WORKINGSET_ACTIVATE_ANON));
+ seq_buf_printf(&s, "workingset_activate_file %lu\n",
+        memcg_page_state(memcg, WORKINGSET_ACTIVATE_FILE));
  seq_buf_printf(&s, "workingset_nodereclaim %lu\n",
         memcg_page_state(memcg, WORKINGSET_NODERECLAIM));

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 4122a841dfce..74c3adefc933 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2735,7 +2735,10 @@ static bool shrink_node(pg_data_t *pgdat,
struct scan_control *sc)
  if (!sc->force_deactivate) {
  unsigned long refaults;

- if (inactive_is_low(target_lruvec, LRU_INACTIVE_ANON))
+ refaults = lruvec_page_state(target_lruvec,
+ WORKINGSET_ACTIVATE_ANON);
+ if (refaults != target_lruvec->refaults[0] ||
+ inactive_is_low(target_lruvec, LRU_INACTIVE_ANON))
  sc->may_deactivate |= DEACTIVATE_ANON;
  else
  sc->may_deactivate &= ~DEACTIVATE_ANON;
@@ -2746,8 +2749,8 @@ static bool shrink_node(pg_data_t *pgdat, struct
scan_control *sc)
  * rid of any stale active pages quickly.
  */
  refaults = lruvec_page_state(target_lruvec,
-      WORKINGSET_ACTIVATE);
- if (refaults != target_lruvec->refaults ||
+ WORKINGSET_ACTIVATE_FILE);
+ if (refaults != target_lruvec->refaults[1] ||
      inactive_is_low(target_lruvec, LRU_INACTIVE_FILE))
  sc->may_deactivate |= DEACTIVATE_FILE;
  else
@@ -3026,8 +3029,10 @@ static void snapshot_refaults(struct mem_cgroup
*target_memcg, pg_data_t *pgdat)
  unsigned long refaults;

  target_lruvec = mem_cgroup_lruvec(target_memcg, pgdat);
- refaults = lruvec_page_state(target_lruvec, WORKINGSET_ACTIVATE);
- target_lruvec->refaults = refaults;
+ refaults = lruvec_page_state(target_lruvec, WORKINGSET_ACTIVATE_ANON);
+ target_lruvec->refaults[0] = refaults;
+ refaults = lruvec_page_state(target_lruvec, WORKINGSET_ACTIVATE_FILE);
+ target_lruvec->refaults[1] = refaults;
 }

 /*
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 78d53378db99..3cdf8e9b0ba2 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1146,8 +1146,10 @@ const char * const vmstat_text[] = {
  "nr_isolated_anon",
  "nr_isolated_file",
  "workingset_nodes",
- "workingset_refault",
- "workingset_activate",
+ "workingset_refault_anon",
+ "workingset_refault_file",
+ "workingset_activate_anon",
+ "workingset_activate_file",
  "workingset_restore",
  "workingset_nodereclaim",
  "nr_anon_pages",
diff --git a/mm/workingset.c b/mm/workingset.c
index 474186b76ced..5fb8f85d1fec 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -15,6 +15,7 @@
 #include <linux/dax.h>
 #include <linux/fs.h>
 #include <linux/mm.h>
+#include <linux/mm_inline.h>

 /*
  * Double CLOCK lists
@@ -156,7 +157,7 @@
  *
  * Implementation
  *
- * For each node's file LRU lists, a counter for inactive evictions
+ * For each node's anon/file LRU lists, a counter for inactive evictions
  * and activations is maintained (node->inactive_age).
  *
  * On eviction, a snapshot of this counter (along with some bits to
@@ -213,7 +214,8 @@ static void unpack_shadow(void *shadow, int
*memcgidp, pg_data_t **pgdat,
  *workingsetp = workingset;
 }

-static void advance_inactive_age(struct mem_cgroup *memcg, pg_data_t *pgdat)
+static void advance_inactive_age(struct mem_cgroup *memcg, pg_data_t *pgdat,
+ int is_file)
 {
  /*
  * Reclaiming a cgroup means reclaiming all its children in a
@@ -230,7 +232,7 @@ static void advance_inactive_age(struct mem_cgroup
*memcg, pg_data_t *pgdat)
  struct lruvec *lruvec;

  lruvec = mem_cgroup_lruvec(memcg, pgdat);
- atomic_long_inc(&lruvec->inactive_age);
+ atomic_long_inc(&lruvec->inactive_age[is_file]);
  } while (memcg && (memcg = parent_mem_cgroup(memcg)));
 }

@@ -248,18 +250,19 @@ void *workingset_eviction(struct page *page,
struct mem_cgroup *target_memcg)
  unsigned long eviction;
  struct lruvec *lruvec;
  int memcgid;
+ int is_file = page_is_file_cache(page);

  /* Page is fully exclusive and pins page->mem_cgroup */
  VM_BUG_ON_PAGE(PageLRU(page), page);
  VM_BUG_ON_PAGE(page_count(page), page);
  VM_BUG_ON_PAGE(!PageLocked(page), page);

- advance_inactive_age(page_memcg(page), pgdat);
+ advance_inactive_age(page_memcg(page), pgdat, is_file);

  lruvec = mem_cgroup_lruvec(target_memcg, pgdat);
  /* XXX: target_memcg can be NULL, go through lruvec */
  memcgid = mem_cgroup_id(lruvec_memcg(lruvec));
- eviction = atomic_long_read(&lruvec->inactive_age);
+ eviction = atomic_long_read(&lruvec->inactive_age[is_file]);
  return pack_shadow(memcgid, pgdat, eviction, PageWorkingset(page));
 }

@@ -278,13 +281,16 @@ void workingset_refault(struct page *page, void *shadow)
  struct lruvec *eviction_lruvec;
  unsigned long refault_distance;
  struct pglist_data *pgdat;
- unsigned long active_file;
+ unsigned long active;
  struct mem_cgroup *memcg;
  unsigned long eviction;
  struct lruvec *lruvec;
  unsigned long refault;
  bool workingset;
  int memcgid;
+ int is_file = page_is_file_cache(page);
+ enum lru_list active_lru = page_lru_base_type(page) + LRU_ACTIVE;
+ enum node_stat_item workingset_stat;

  unpack_shadow(shadow, &memcgid, &pgdat, &eviction, &workingset);

@@ -309,8 +315,8 @@ void workingset_refault(struct page *page, void *shadow)
  if (!mem_cgroup_disabled() && !eviction_memcg)
  goto out;
  eviction_lruvec = mem_cgroup_lruvec(eviction_memcg, pgdat);
- refault = atomic_long_read(&eviction_lruvec->inactive_age);
- active_file = lruvec_page_state(eviction_lruvec, NR_ACTIVE_FILE);
+ refault = atomic_long_read(&eviction_lruvec->inactive_age[is_file]);
+ active = lruvec_page_state(eviction_lruvec, active_lru);

  /*
  * Calculate the refault distance
@@ -341,19 +347,21 @@ void workingset_refault(struct page *page, void *shadow)
  memcg = page_memcg(page);
  lruvec = mem_cgroup_lruvec(memcg, pgdat);

- inc_lruvec_state(lruvec, WORKINGSET_REFAULT);
+ workingset_stat = WORKINGSET_REFAULT_BASE + is_file;
+ inc_lruvec_state(lruvec, workingset_stat);

  /*
  * Compare the distance to the existing workingset size. We
  * don't act on pages that couldn't stay resident even if all
  * the memory was available to the page cache.
  */
- if (refault_distance > active_file)
+ if (refault_distance > active)
  goto out;

  SetPageActive(page);
- advance_inactive_age(memcg, pgdat);
- inc_lruvec_state(lruvec, WORKINGSET_ACTIVATE);
+ advance_inactive_age(memcg, pgdat, is_file);
+ workingset_stat = WORKINGSET_ACTIVATE_BASE + is_file;
+ inc_lruvec_state(lruvec, workingset_stat);

  /* Page was active prior to eviction */
  if (workingset) {
@@ -371,6 +379,7 @@ void workingset_refault(struct page *page, void *shadow)
 void workingset_activation(struct page *page)
 {
  struct mem_cgroup *memcg;
+ int is_file = page_is_file_cache(page);

  rcu_read_lock();
  /*
@@ -383,7 +392,7 @@ void workingset_activation(struct page *page)
  memcg = page_memcg_rcu(page);
  if (!mem_cgroup_disabled() && !memcg)
  goto out;
- advance_inactive_age(memcg, page_pgdat(page));
+ advance_inactive_age(memcg, page_pgdat(page), is_file);
 out:
  rcu_read_unlock();
 }
-- 
2.17.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/9] workingset protection/detection on the anonymous LRU list
  2020-02-11  6:19 [PATCH 0/9] workingset protection/detection on the anonymous LRU list js1304
                   ` (9 preceding siblings ...)
  2020-02-12  3:35 ` Hillf Danton
@ 2020-02-14  4:12 ` Joonsoo Kim
  10 siblings, 0 replies; 14+ messages in thread
From: Joonsoo Kim @ 2020-02-14  4:12 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Linux Memory Management List, LKML, Johannes Weiner,
	Michal Hocko, Hugh Dickins, Minchan Kim, Vlastimil Babka,
	Mel Gorman, kernel-team, Joonsoo Kim

2020년 2월 11일 (화) 오후 3:20, <js1304@gmail.com>님이 작성:
>
> From: Joonsoo Kim <iamjoonsoo.kim@lge.com>
>
> Hello,
>
> This patchset implements workingset protection and detection on
> the anonymous LRU list.
>
snip...
>
> Note that, this result is gotten from v5.1. Although workingset detection
> works on v5.1, it doesn't work well on v5.5. It looks like that recent
> code change on workingset.c is the reason of this problem. I will
> track it soon.

Problem is fixed now.
It's due to my mistake on the rebase.
Replace "mm/workingset: extend the workingset detection for anon LRU"
with the patch on the following link.

https://lkml.kernel.org/r/CAAmzW4NGwvTiE_unACAcSZUH9V3tO0qR=ZPxi=q9s=zDi53DeQ@mail.gmail.com

Thanks.


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, back to index

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-11  6:19 [PATCH 0/9] workingset protection/detection on the anonymous LRU list js1304
2020-02-11  6:19 ` [PATCH 1/9] mm/vmscan: make active/inactive ratio as 1:1 for anon lru js1304
2020-02-11  6:19 ` [PATCH 2/9] mm/vmscan: protect the workingset on anonymous LRU js1304
2020-02-11  6:19 ` [PATCH 3/9] mm/workingset: extend the workingset detection for anon LRU js1304
2020-02-14  4:07   ` Joonsoo Kim
2020-02-11  6:19 ` [PATCH 4/9] mm/swapcache: support to handle the value in swapcache js1304
2020-02-11  6:19 ` [PATCH 5/9] mm/workingset: use the node counter if memcg is the root memcg js1304
2020-02-11  6:19 ` [PATCH 6/9] mm/workingset: handle the page without memcg js1304
2020-02-11  6:19 ` [PATCH 7/9] mm/swap: implement workingset detection for anonymous LRU js1304
2020-02-11  6:19 ` [PATCH 8/9] mm/vmscan: restore active/inactive ratio " js1304
2020-02-11  6:19 ` [PATCH 9/9] mm/swap: count a new anonymous page as a reclaim_state's rotate js1304
2020-02-12  3:35 ` Hillf Danton
2020-02-12 11:00   ` Joonsoo Kim
2020-02-14  4:12 ` [PATCH 0/9] workingset protection/detection on the anonymous LRU list Joonsoo Kim

Linux-mm Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-mm/0 linux-mm/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-mm linux-mm/ https://lore.kernel.org/linux-mm \
		linux-mm@kvack.org
	public-inbox-index linux-mm

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kvack.linux-mm


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git