linux-mm.kvack.org archive mirror
* [RFC PATCH] mm/swap: fix system stuck due to infinite loop
@ 2021-04-02  7:03 Stillinux
  2021-04-03  0:44 ` Andrew Morton
       [not found] ` <20210406065944.08d8aa76@mail.inbox.lv>
  0 siblings, 2 replies; 6+ messages in thread
From: Stillinux @ 2021-04-02  7:03 UTC
  To: akpm; +Cc: linux-mm, linux-kernel, liuzhengyuan, liuyun01

Under high memory pressure and heavy load, we ran the LTP tests and
found that the system hung. All tasks in direct memory reclaim were
stuck in io_schedule, waiting on requests that were held in the
blk_plug of a single process. That process had fallen into an infinite
loop and never flushed its plugged requests.

The process was in swap_cluster_readahead, which brackets its reads
with blk_start_plug/blk_finish_plug. The call path is
swap_cluster_readahead->__read_swap_cache_async->swapcache_prepare.
When swapcache_prepare returns -EEXIST, the caller retries in a loop
that never ends. cond_resched is called in that loop, but on the
schedule path sched_submit_work acts on tsk->state and does not flush
the blk_plug requests of a task that is still runnable. The plugged
I/O is never submitted, so I/O hangs and the whole system hangs with it.
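
The plug behaviour described above can be illustrated with a small
userspace model. This is only a sketch, not kernel code: every toy_*
name is hypothetical, and the model assumes nothing beyond what this
message states, namely that the schedule path leaves the plug alone for
a runnable task while io_schedule flushes it.

```c
/* Toy userspace model; none of these are the kernel's definitions. */
enum toy_state { TOY_RUNNING = 0, TOY_UNINTERRUPTIBLE = 2 };

struct toy_task {
	enum toy_state state;
	int plugged;	/* requests held back in the task's plug */
	int submitted;	/* requests handed to the device */
};

static void toy_flush_plug(struct toy_task *t)
{
	t->submitted += t->plugged;
	t->plugged = 0;
}

/* Assumed model of the schedule path: a runnable task keeps its plug. */
static void toy_schedule(struct toy_task *t)
{
	if (t->state == TOY_RUNNING)
		return;
	toy_flush_plug(t);
}

/* Assumed model of cond_resched: the task state stays TOY_RUNNING. */
static void toy_cond_resched(struct toy_task *t)
{
	toy_schedule(t);
}

/* Assumed model of io_schedule: flush the plug, then sleep. */
static void toy_io_schedule(struct toy_task *t)
{
	toy_flush_plug(t);
	t->state = TOY_UNINTERRUPTIBLE;
	toy_schedule(t);
	t->state = TOY_RUNNING;
}

/* Stand-in for the -EEXIST retry loop, bounded so that it terminates. */
int toy_pending_after_retry_loop(int retries)
{
	struct toy_task t = { TOY_RUNNING, 3, 0 };

	for (int i = 0; i < retries; i++)
		toy_cond_resched(&t);
	return t.plugged;
}

int toy_pending_after_io_schedule(void)
{
	struct toy_task t = { TOY_RUNNING, 3, 0 };

	toy_io_schedule(&t);
	return t.plugged;
}
```

In this model the retry loop leaves all three requests plugged however
many times it yields, while a single toy_io_schedule call submits them.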

This is our first time working on the swap code, and we have not found
a good way to fix the root cause. As a practical workaround, we make
swap_cluster_readahead notice the memory pressure as early as possible
and call io_schedule to flush the blk_plug requests. To that end, the
allocation flag in swap_readpage is changed to GFP_NOIO, so that the
allocation no longer enters memory reclaim that waits for I/O to be
flushed. With this change the system runs normally, but it is not a
fundamental fix.

Signed-off-by: huangjinhui <huangjinhui@kylinos.cn>
---
 mm/page_io.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/page_io.c b/mm/page_io.c
index c493ce9ebcf5..87392ffabb12 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -403,7 +403,7 @@ int swap_readpage(struct page *page, bool synchronous)
 	}

 	ret = 0;
-	bio = bio_alloc(GFP_KERNEL, 1);
+	bio = bio_alloc(GFP_NOIO, 1);
 	bio_set_dev(bio, sis->bdev);
 	bio->bi_opf = REQ_OP_READ;
 	bio->bi_iter.bi_sector = swap_page_sector(page);
-- 
2.25.1


Thread overview: 6+ messages
2021-04-02  7:03 [RFC PATCH] mm/swap: fix system stuck due to infinite loop Stillinux
2021-04-03  0:44 ` Andrew Morton
2021-04-04  9:26   ` Stillinux
     [not found] ` <20210406065944.08d8aa76@mail.inbox.lv>
2021-04-06  0:15   ` [PATCH] mm/vmscan: add sysctl knobs for protecting the specified kernel test robot
2021-04-06  1:16   ` kernel test robot
2021-04-06 22:49   ` [RFC PATCH] mm/swap: fix system stuck due to infinite loop Stillinux
