From mboxrd@z Thu Jan 1 00:00:00 1970
From: Hillf Danton <hdanton@sina.com>
To: Doug Anderson
Cc: Andrew Morton, Mel Gorman, Alexander Viro, Christian Brauner,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, Matthew Wilcox, Yu Zhao
Subject: Re: [PATCH v3] migrate_pages: Avoid blocking for IO in MIGRATE_SYNC_LIGHT
Date: Wed, 3 May 2023 09:45:00 +0800
Message-Id: <20230503014500.3692-1-hdanton@sina.com>
References: <20230428135414.v3.1.Ia86ccac02a303154a0b8bc60567e7a95d34c96d3@changeid>
 <20230430085300.3173-1-hdanton@sina.com>
On 2 May 2023 14:20:54 -0700 Douglas Anderson wrote:
> On Sun, Apr 30, 2023 at 1:53 AM Hillf Danton <hdanton@sina.com> wrote:
> > On 28 Apr 2023 13:54:38 -0700 Douglas Anderson wrote:
> > > The MIGRATE_SYNC_LIGHT mode is intended to block for things that will
> > > finish quickly but not for things that will take a long time. Exactly
> > > how long is too long is not well defined, but waits of tens of
> > > milliseconds are likely non-ideal.
> > >
> > > When putting a Chromebook under memory pressure (opening over 90 tabs
> > > on a 4GB machine) it was fairly easy to see delays waiting for some
> > > locks in the kcompactd code path of > 100 ms. While the laptop wasn't
> > > amazingly usable in this state, it was still limping along, and this
> > > state isn't something artificial. Sometimes we simply end up with a
> > > lot of memory pressure.
> >
> > Given a stall longer than 100ms, this cannot be a correct fix if the
> > hardware fails to do more than ten IOs a second.
> >
> > OTOH given some pages reclaimed for compaction to make forward progress
> > before kswapd wakes kcompactd up, this cannot be a fix without spotting
> > the cause of the stall.
>
> Right that the system is in pretty bad shape when this happens and
> it's not very effective at doing IO or much of anything because it's
> under bad memory pressure.

Based on the info in another reply [1]:

| I put some more traces in and reproduced it again. I saw something
| that looked like this:
|
| 1. balance_pgdat() called wakeup_kcompactd() with order=10 and that
|    caused us to get all the way to the end and wake kcompactd up (there
|    were previous calls to wakeup_kcompactd() that returned early).
|
| 2. kcompactd started and completed kcompactd_do_work() without blocking.
|
| 3. kcompactd called proactive_compact_node() and there blocked for
|    ~92ms in one case, ~120ms in another case, ~131ms in another case.

I see fragmentation, given order=10 and proactive_compact_node().
Can you specify the evidence of bad memory pressure?

[1] https://lore.kernel.org/lkml/CAD=FV=V8m-mpJsFntCciqtq7xnvhmnvPdTvxNuBGBT3-cDdabQ@mail.gmail.com/

> I guess my first thought is that, when this happens, a process
> holding the lock gets preempted and doesn't get scheduled back in for
> a while. That _should_ be possible, right? In the case where I'm
> reproducing this then all the CPUs would be super busy madly trying to
> compress / decompress zram, so it doesn't surprise me that a process
> could get context switched out for a while.

Could such a switchout turn the I/O wait below upside down?

	/*
	 * In "light" mode, we can wait for transient locks (eg
	 * inserting a page into the page table), but it's not
	 * worth waiting for I/O.
	 */
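For the record, the decision that comment describes can be written down as a
small stand-alone sketch. This is a toy model, not the kernel code:
should_wait_for_lock() and the folio_uptodate flag are stand-ins for the
struct folio / enum migrate_mode logic in mm/migrate.c. The idea is that
once folio_trylock() has failed, SYNC_LIGHT sleeps in folio_lock() only for
an uptodate folio, on the theory that a !uptodate locked folio is most
likely locked for in-flight I/O and could keep us waiting tens of ms.

```c
#include <stdbool.h>

/* Stand-ins for the kernel's enum migrate_mode (toy model). */
enum migrate_mode { MIGRATE_ASYNC, MIGRATE_SYNC_LIGHT, MIGRATE_SYNC };

/*
 * Given that folio_trylock() failed (someone else holds the folio lock),
 * should this migration mode sleep in folio_lock()?
 *
 * ASYNC never sleeps; SYNC_LIGHT sleeps only for a transient lock
 * (folio already uptodate, e.g. held while inserting the page into a
 * page table), not for a folio likely locked for I/O; SYNC always sleeps.
 */
static bool should_wait_for_lock(enum migrate_mode mode, bool folio_uptodate)
{
	if (mode == MIGRATE_ASYNC)
		return false;
	if (mode == MIGRATE_SYNC_LIGHT && !folio_uptodate)
		return false;
	return true;
}
```

Under this model the switchout question above becomes: a preempted holder of
a "transient" lock makes SYNC_LIGHT wait anyway, even though no I/O is
involved.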