linux-kernel.vger.kernel.org archive mirror
* [PATCH linux-next v3] swap_state: update shadow_nodes for anonymous page
From: yang.yang29 @ 2023-01-13  9:36 UTC
  To: akpm, hannes, willy, bagasdotme
  Cc: linux-fsdevel, linux-kernel, linux-mm, iamjoonsoo.kim, ran.xiaokai

From: Yang Yang <yang.yang29@zte.com.cn>

shadow_nodes is used by workingset handling to reclaim shadow nodes.
It is updated when page cache entries are added or deleted, because
workingset originally supported only the page cache. When workingset
detection was extended to anonymous pages, the shadow_nodes update
was missed for them. As a result, shadow nodes of anonymous pages are
never reclaimed by scan_shadow_nodes(), even when they use a lot of
memory and the system is under memory pressure.

So update shadow_nodes for anonymous pages when swap cache entries
are added or deleted, by calling xas_set_update(..workingset_update_node).

Fixes: aae466b0052e ("mm/swap: implement workingset detection for anonymous LRU")
Signed-off-by: Yang Yang <yang.yang29@zte.com.cn>
Reviewed-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
---
Changes in v3:
- Reword the commit message in imperative mood. Thanks to Bagas Sanjaya.
Changes in v2:
- Include a description of the user-visible effect. Add Fixes tag. Modify
comments. Also call workingset_update_node() in
clear_shadow_from_swap_cache(). Thanks to Matthew Wilcox.
---
 include/linux/xarray.h | 3 ++-
 mm/swap_state.c        | 6 ++++++
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/include/linux/xarray.h b/include/linux/xarray.h
index 44dd6d6e01bc..5cc1f718fec9 100644
--- a/include/linux/xarray.h
+++ b/include/linux/xarray.h
@@ -1643,7 +1643,8 @@ static inline void xas_set_order(struct xa_state *xas, unsigned long index,
  * @update: Function to call when updating a node.
  *
  * The XArray can notify a caller after it has updated an xa_node.
- * This is advanced functionality and is only needed by the page cache.
+ * This is advanced functionality and is only needed by the page cache
+ * and swap cache.
  */
 static inline void xas_set_update(struct xa_state *xas, xa_update_node_t update)
 {
diff --git a/mm/swap_state.c b/mm/swap_state.c
index cb9aaa00951d..7a003d8abb37 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -94,6 +94,8 @@ int add_to_swap_cache(struct folio *folio, swp_entry_t entry,
 	unsigned long i, nr = folio_nr_pages(folio);
 	void *old;

+	xas_set_update(&xas, workingset_update_node);
+
 	VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
 	VM_BUG_ON_FOLIO(folio_test_swapcache(folio), folio);
 	VM_BUG_ON_FOLIO(!folio_test_swapbacked(folio), folio);
@@ -145,6 +147,8 @@ void __delete_from_swap_cache(struct folio *folio,
 	pgoff_t idx = swp_offset(entry);
 	XA_STATE(xas, &address_space->i_pages, idx);

+	xas_set_update(&xas, workingset_update_node);
+
 	VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
 	VM_BUG_ON_FOLIO(!folio_test_swapcache(folio), folio);
 	VM_BUG_ON_FOLIO(folio_test_writeback(folio), folio);
@@ -252,6 +256,8 @@ void clear_shadow_from_swap_cache(int type, unsigned long begin,
 		struct address_space *address_space = swap_address_space(entry);
 		XA_STATE(xas, &address_space->i_pages, curr);

+		xas_set_update(&xas, workingset_update_node);
+
 		xa_lock_irq(&address_space->i_pages);
 		xas_for_each(&xas, old, end) {
 			if (!xa_is_value(old))
-- 
2.15.2

* Re: [PATCH linux-next v3] swap_state: update shadow_nodes for anonymous page
From: Matthew Wilcox @ 2023-01-16 19:51 UTC
  To: yang.yang29
  Cc: akpm, hannes, bagasdotme, linux-fsdevel, linux-kernel, linux-mm,
	iamjoonsoo.kim, ran.xiaokai

On Fri, Jan 13, 2023 at 05:36:45PM +0800, yang.yang29@zte.com.cn wrote:
> From: Yang Yang <yang.yang29@zte.com.cn>
> 
> shadow_nodes is used by workingset handling to reclaim shadow nodes.
> It is updated when page cache entries are added or deleted, because
> workingset originally supported only the page cache. When workingset
> detection was extended to anonymous pages, the shadow_nodes update
> was missed for them. As a result, shadow nodes of anonymous pages are
> never reclaimed by scan_shadow_nodes(), even when they use a lot of
> memory and the system is under memory pressure.
> 
> So update shadow_nodes for anonymous pages when swap cache entries
> are added or deleted, by calling xas_set_update(..workingset_update_node).

What testing did you do of this?  I have this crash in today's testing:

04304 BUG: kernel NULL pointer dereference, address: 0000000000000080
04304 #PF: supervisor read access in kernel mode
04304 #PF: error_code(0x0000) - not-present page
04304 PGD 0 P4D 0
04304 Oops: 0000 [#1] PREEMPT SMP NOPTI
04304 CPU: 4 PID: 3219629 Comm: sh Kdump: loaded Not tainted 6.2.0-rc4-next-20230116-00016-gd289d3de8ce5-dirty #69
04304 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.0-debian-1.16.0-4 04/01/2014
04304 RIP: 0010:_raw_spin_trylock+0x12/0x50
04304 Code: e0 41 5c 5d c3 89 c6 48 89 df e8 89 06 00 00 4c 89 e0 5b 41 5c 5d c3 90 55 48 89 e5 53 48 89 fb bf 01 00 00 00 e8 be 5b 71 ff <8b> 03 85 c0 75 16 ba 01 00 00 00 f0 0f b1 13 b8 01 00 00 00 75 06
04304 RSP: 0018:ffff888059afbbb8 EFLAGS: 00010093
04304 RAX: 0000000000000003 RBX: 0000000000000080 RCX: 0000000000000000
04304 RDX: 0000000000000000 RSI: ffff8880033e24c8 RDI: 0000000000000001
04304 RBP: ffff888059afbbc0 R08: 0000000000000000 R09: ffff888059afbd68
04304 R10: ffff88807d9db868 R11: 0000000000000000 R12: ffff8880033e24c0
04304 R13: ffff88800a1d8008 R14: ffff8880033e24c8 R15: ffff8880033e24c0
04304 FS:  00007feeeabc6740(0000) GS:ffff88807d900000(0000) knlGS:0000000000000000
04304 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
04304 CR2: 0000000000000080 CR3: 0000000059830003 CR4: 0000000000770ea0
04304 PKRU: 55555554
04304 Call Trace:
04304  <TASK>
04304  shadow_lru_isolate+0x3a/0x120
04304  __list_lru_walk_one+0xa3/0x190
04304  ? memcg_list_lru_alloc+0x330/0x330
04304  ? memcg_list_lru_alloc+0x330/0x330
04304  list_lru_walk_one_irq+0x59/0x80
04304  scan_shadow_nodes+0x27/0x30
04304  do_shrink_slab+0x13b/0x2e0
04304  shrink_slab+0x92/0x250
04304  drop_slab+0x41/0x90
04304  drop_caches_sysctl_handler+0x70/0x80
04304  proc_sys_call_handler+0x162/0x210
04304  proc_sys_write+0xe/0x10
04304  vfs_write+0x1c7/0x3a0
04304  ksys_write+0x57/0xd0
04304  __x64_sys_write+0x14/0x20
04304  do_syscall_64+0x34/0x80
04304  entry_SYSCALL_64_after_hwframe+0x63/0xcd
04304 RIP: 0033:0x7feeeacc1190

Decoding it, shadow_lru_isolate+0x3a/0x120 maps back to this line:

        if (!spin_trylock(&mapping->host->i_lock)) {

i_lock is at offset 128 of struct inode, so that matches the dump.
I believe that swapper_spaces never have ->host set, so I don't
believe you've tested this patch since 51b8c1fe250d went in
back in 2021.
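
The address arithmetic behind that diagnosis can be illustrated with a
small userspace sketch. Everything here is a hypothetical stand-in, not
kernel code: struct mock_inode only models the one layout fact quoted
above (i_lock at offset 128), and mock_trylock_host() shows one possible
way to guard the dereference. It is an assumption for illustration and
not necessarily how a later version of the patch handles it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins: only the offset of i_lock (128) is modelled. */
struct mock_inode {
	char pad[128];
	int i_lock;		/* 0 = unlocked, 1 = held (single-threaded mock) */
};

struct mock_address_space {
	struct mock_inode *host;	/* NULL for a swap address space */
};

/*
 * Address that &mapping->host->i_lock would evaluate to. Computed with
 * integer arithmetic so a NULL host is never dereferenced here.
 */
uintptr_t i_lock_address(const struct mock_address_space *mapping)
{
	return (uintptr_t)mapping->host + offsetof(struct mock_inode, i_lock);
}

/*
 * Assumed guard: skip the inode lock when there is no inode.
 * Returns 1 if the caller may proceed, 0 if the lock is already held.
 */
int mock_trylock_host(struct mock_address_space *mapping)
{
	if (!mapping->host)
		return 1;	/* no inode, so no i_lock to take */
	if (mapping->host->i_lock)
		return 0;	/* contended */
	mapping->host->i_lock = 1;
	return 1;
}
```

With host left NULL, i_lock_address() yields 0x80, which is the faulting
address in CR2 and the value in RBX in the dump above.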

* Re: [PATCH linux-next v3] swap_state: update shadow_nodes for anonymous page
From: yang.yang29 @ 2023-01-17  1:27 UTC
  To: willy
  Cc: akpm, hannes, bagasdotme, linux-fsdevel, linux-kernel, linux-mm,
	iamjoonsoo.kim, ran.xiaokai

> What testing did you do of this?  I have this crash in today's testing:

My test was:
1. Configure zram as swap.
2. Run programs that malloc and access a large amount of memory, enough
to cause swapping.
3. Add printk() to count_shadow_nodes() and shadow_lru_isolate() to
confirm that shadow_nodes is really being shrunk.
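
A memory hog of the kind described in step 2 could look roughly like
the sketch below. The function name touch_anon_memory() and its
interface are assumptions made for illustration; the actual test
program is not shown in this thread.

```c
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Allocate `bytes` of anonymous memory and write one byte per page so
 * that every page is really faulted in. Returns the number of pages
 * touched, or 0 on failure. The buffer is handed back through `out`
 * so the caller decides how long to keep the memory pressure up.
 */
size_t touch_anon_memory(size_t bytes, char **out)
{
	long page = sysconf(_SC_PAGESIZE);
	size_t touched = 0;
	char *buf;

	*out = NULL;
	if (page <= 0 || bytes == 0)
		return 0;

	buf = malloc(bytes);
	if (!buf)
		return 0;

	for (size_t off = 0; off < bytes; off += (size_t)page) {
		buf[off] = (char)(off / (size_t)page);	/* dirty the page */
		touched++;
	}

	*out = buf;
	return touched;
}
```

Calling this with a size larger than free RAM, and holding the buffer
for a while before free(), should push anonymous pages out to the zram
swap device. How much memory that takes depends on the machine.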

Sorry for the inadequate testing. I will run more tests, including
drop_caches via sysctl.

* Re: [PATCH linux-next v3] swap_state: update shadow_nodes for anonymous page
From: yang.yang29 @ 2023-01-18 12:17 UTC
  To: willy
  Cc: akpm, hannes, bagasdotme, linux-fsdevel, linux-kernel, linux-mm,
	iamjoonsoo.kim, ran.xiaokai

> i_lock is at offset 128 of struct inode, so that matches the dump.
> I believe that swapper_spaces never have ->host set, so I don't
> believe you've tested this patch since 51b8c1fe250d went in
> back in 2021.

You are totally right. I reproduced the panic on linux-next and fixed
it in patch v4. I should have been more careful: I tested the patch on
Linux 5.14, which was a mistake.

My apologies for the time wasted.

Thanks.
