stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, Guo Ziliang <guo.ziliang@zte.com.cn>,
	Zeal Robot <zealci@zte.com.cn>,
	Ran Xiaokai <ran.xiaokai@zte.com.cn>,
	Jiang Xuexin <jiang.xuexin@zte.com.cn>,
	Yang Yang <yang.yang29@zte.com.cn>,
	Hugh Dickins <hughd@google.com>,
	Naoya Horiguchi <naoya.horiguchi@nec.com>,
	Michal Hocko <mhocko@kernel.org>,
	Minchan Kim <minchan@kernel.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Roger Quadros <rogerq@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: [PATCH 5.16 03/37] mm: swap: get rid of livelock in swapin readahead
Date: Mon, 21 Mar 2022 14:52:45 +0100	[thread overview]
Message-ID: <20220321133221.392695021@linuxfoundation.org> (raw)
In-Reply-To: <20220321133221.290173884@linuxfoundation.org>

From: Guo Ziliang <guo.ziliang@zte.com.cn>

commit 029c4628b2eb2ca969e9bf979b05dc18d8d5575e upstream.

In our testing, a livelock task was found.  Through sysrq printing, same
stack was found every time, as follows:

  __swap_duplicate+0x58/0x1a0
  swapcache_prepare+0x24/0x30
  __read_swap_cache_async+0xac/0x220
  read_swap_cache_async+0x58/0xa0
  swapin_readahead+0x24c/0x628
  do_swap_page+0x374/0x8a0
  __handle_mm_fault+0x598/0xd60
  handle_mm_fault+0x114/0x200
  do_page_fault+0x148/0x4d0
  do_translation_fault+0xb0/0xd4
  do_mem_abort+0x50/0xb0

The reason for the livelock is that swapcache_prepare() always returns
EEXIST, indicating that SWAP_HAS_CACHE has not been cleared, so that it
cannot jump out of the loop.  We suspect that the task that clears the
SWAP_HAS_CACHE flag never gets a chance to run.  We try to lower the
priority of the task stuck in a livelock so that the task that clears
the SWAP_HAS_CACHE flag will run.  The results show that the system
returns to normal after the priority is lowered.

In our testing, multiple real-time tasks are bound to the same core, and
the task in the livelock is the highest priority task of the core, so
the livelocked task cannot be preempted.

Although cond_resched() is used by __read_swap_cache_async, it is an
empty function in the preemptive system and cannot achieve the purpose
of releasing the CPU.  A high-priority task cannot release the CPU
unless preempted by a higher-priority task.  But when this task is
already the highest priority task on this core, other tasks will not be
able to be scheduled.  So we think we should replace cond_resched() with
schedule_timeout_uninterruptible(1), schedule_timeout_interruptible will
call set_current_state first to set the task state, so the task will be
removed from the running queue, so as to achieve the purpose of giving
up the CPU and prevent it from running in kernel mode for too long.

(akpm: ugly hack becomes uglier.  But it fixes the issue in a
backportable-to-stable fashion while we hopefully work on something
better)

Link: https://lkml.kernel.org/r/20220221111749.1928222-1-cgel.zte@gmail.com
Signed-off-by: Guo Ziliang <guo.ziliang@zte.com.cn>
Reported-by: Zeal Robot <zealci@zte.com.cn>
Reviewed-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
Reviewed-by: Jiang Xuexin <jiang.xuexin@zte.com.cn>
Reviewed-by: Yang Yang <yang.yang29@zte.com.cn>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Roger Quadros <rogerq@kernel.org>
Cc: Ziliang Guo <guo.ziliang@zte.com.cn>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 mm/swap_state.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -478,7 +478,7 @@ struct page *__read_swap_cache_async(swp
 		 * __read_swap_cache_async(), which has set SWAP_HAS_CACHE
 		 * in swap_map, but not yet added its page to swap cache.
 		 */
-		cond_resched();
+		schedule_timeout_uninterruptible(1);
 	}
 
 	/*



  parent reply	other threads:[~2022-03-21 14:05 UTC|newest]

Thread overview: 48+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-21 13:52 [PATCH 5.16 00/37] 5.16.17-rc1 review Greg Kroah-Hartman
2022-03-21 13:52 ` [PATCH 5.16 01/37] crypto: qcom-rng - ensure buffer for generate is completely filled Greg Kroah-Hartman
2022-03-21 13:52 ` [PATCH 5.16 02/37] ocfs2: fix crash when initialize filecheck kobj fails Greg Kroah-Hartman
2022-03-21 13:52 ` Greg Kroah-Hartman [this message]
2022-03-21 13:52 ` [PATCH 5.16 04/37] block: release rq qos structures for queue without disk Greg Kroah-Hartman
2022-03-21 13:52 ` [PATCH 5.16 05/37] drm/mgag200: Fix PLL setup for g200wb and g200ew Greg Kroah-Hartman
2022-03-21 13:52 ` [PATCH 5.16 06/37] efi: fix return value of __setup handlers Greg Kroah-Hartman
2022-03-21 13:52 ` [PATCH 5.16 07/37] alx: acquire mutex for alx_reinit in alx_change_mtu Greg Kroah-Hartman
2022-03-21 13:52 ` [PATCH 5.16 08/37] vsock: each transport cycles only on its own sockets Greg Kroah-Hartman
2022-03-21 13:52 ` [PATCH 5.16 09/37] esp6: fix check on ipv6_skip_exthdrs return value Greg Kroah-Hartman
2022-03-21 13:52 ` [PATCH 5.16 10/37] net: phy: marvell: Fix invalid comparison in the resume and suspend functions Greg Kroah-Hartman
2022-03-21 13:52 ` [PATCH 5.16 11/37] net/packet: fix slab-out-of-bounds access in packet_recvmsg() Greg Kroah-Hartman
2022-03-21 13:52 ` [PATCH 5.16 12/37] nvmet: revert "nvmet: make discovery NQN configurable" Greg Kroah-Hartman
2022-03-21 13:52 ` [PATCH 5.16 13/37] atm: eni: Add check for dma_map_single Greg Kroah-Hartman
2022-03-21 13:52 ` [PATCH 5.16 14/37] ice: fix NULL pointer dereference in ice_update_vsi_tx_ring_stats() Greg Kroah-Hartman
2022-03-21 13:52 ` [PATCH 5.16 15/37] iavf: Fix double free in iavf_reset_task Greg Kroah-Hartman
2022-03-21 13:52 ` [PATCH 5.16 16/37] hv_netvsc: Add check for kvmalloc_array Greg Kroah-Hartman
2022-03-21 13:52 ` [PATCH 5.16 17/37] drm/imx: parallel-display: Remove bus flags check in imx_pd_bridge_atomic_check() Greg Kroah-Hartman
2022-03-21 13:53 ` [PATCH 5.16 18/37] drm/panel: simple: Fix Innolux G070Y2-L01 BPP settings Greg Kroah-Hartman
2022-03-21 13:53 ` [PATCH 5.16 19/37] net: handle ARPHRD_PIMREG in dev_is_mac_header_xmit() Greg Kroah-Hartman
2022-03-21 13:53 ` [PATCH 5.16 20/37] drm: Dont make DRM_PANEL_BRIDGE dependent on DRM_KMS_HELPERS Greg Kroah-Hartman
2022-03-21 13:53 ` [PATCH 5.16 21/37] net: dsa: Add missing of_node_put() in dsa_port_parse_of Greg Kroah-Hartman
2022-03-21 13:53 ` [PATCH 5.16 22/37] net: phy: mscc: Add MODULE_FIRMWARE macros Greg Kroah-Hartman
2022-03-21 13:53 ` [PATCH 5.16 23/37] bnx2x: fix built-in kernel driver load failure Greg Kroah-Hartman
2022-03-21 13:53 ` [PATCH 5.16 24/37] net: bcmgenet: skip invalid partial checksums Greg Kroah-Hartman
2022-03-21 13:53 ` [PATCH 5.16 25/37] net: mscc: ocelot: fix backwards compatibility with single-chain tc-flower offload Greg Kroah-Hartman
2022-03-21 13:53 ` [PATCH 5.16 26/37] iavf: Fix hang during reboot/shutdown Greg Kroah-Hartman
2022-03-21 13:53 ` [PATCH 5.16 27/37] arm64: fix clang warning about TRAMP_VALIAS Greg Kroah-Hartman
2022-03-21 13:53 ` [PATCH 5.16 28/37] usb: gadget: rndis: prevent integer overflow in rndis_set_response() Greg Kroah-Hartman
2022-03-21 13:53 ` [PATCH 5.16 29/37] usb: gadget: Fix use-after-free bug by not setting udc->dev.driver Greg Kroah-Hartman
2022-03-21 13:53 ` [PATCH 5.16 30/37] usb: usbtmc: Fix bug in pipe direction for control transfers Greg Kroah-Hartman
2022-03-21 13:53 ` [PATCH 5.16 31/37] scsi: mpt3sas: Page fault in reply q processing Greg Kroah-Hartman
2022-03-21 13:53 ` [PATCH 5.16 32/37] Input: aiptek - properly check endpoint type Greg Kroah-Hartman
2022-03-21 13:53 ` [PATCH 5.16 33/37] arm64: errata: avoid duplicate field initializer Greg Kroah-Hartman
2022-03-21 13:53 ` [PATCH 5.16 34/37] perf symbols: Fix symbol size calculation condition Greg Kroah-Hartman
2022-03-21 13:53 ` [PATCH 5.16 35/37] Revert "arm64: dts: freescale: Fix interrupt-map parent address cells" Greg Kroah-Hartman
2022-03-21 13:53 ` [PATCH 5.16 36/37] Revert "ath10k: drop beacon and probe response which leak from other channel" Greg Kroah-Hartman
2022-03-21 13:53 ` [PATCH 5.16 37/37] btrfs: skip reserved bytes warning on unmount after log cleanup failure Greg Kroah-Hartman
2022-03-21 18:22 ` [PATCH 5.16 00/37] 5.16.17-rc1 review Florian Fainelli
2022-03-21 19:16 ` Jon Hunter
2022-03-21 19:51 ` Jeffrin Thalakkottoor
2022-03-21 23:21 ` Shuah Khan
2022-03-21 23:28 ` Fox Chen
2022-03-22  1:53 ` Zan Aziz
2022-03-22  2:01 ` Guenter Roeck
2022-03-22  8:31 ` Ron Economos
2022-03-22  8:52 ` Naresh Kamboju
2022-03-22 11:23 ` Bagas Sanjaya

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220321133221.392695021@linuxfoundation.org \
    --to=gregkh@linuxfoundation.org \
    --cc=akpm@linux-foundation.org \
    --cc=guo.ziliang@zte.com.cn \
    --cc=hannes@cmpxchg.org \
    --cc=hughd@google.com \
    --cc=jiang.xuexin@zte.com.cn \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mhocko@kernel.org \
    --cc=minchan@kernel.org \
    --cc=naoya.horiguchi@nec.com \
    --cc=ran.xiaokai@zte.com.cn \
    --cc=rogerq@kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=yang.yang29@zte.com.cn \
    --cc=zealci@zte.com.cn \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).