From mboxrd@z Thu Jan 1 00:00:00 1970
From: Hillf Danton <hdanton@sina.com>
To: Doug Anderson
Cc: Andrew Morton, Mel Gorman, Alexander Viro, Christian Brauner,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, Matthew Wilcox, Yu Zhao
Subject: Re: [PATCH v3] migrate_pages: Avoid blocking for IO in MIGRATE_SYNC_LIGHT
Date: Wed, 3 May 2023 09:45:00 +0800
Message-Id: <20230503014500.3692-1-hdanton@sina.com>
References: <20230428135414.v3.1.Ia86ccac02a303154a0b8bc60567e7a95d34c96d3@changeid>
 <20230430085300.3173-1-hdanton@sina.com>
On 2 May 2023 14:20:54 -0700 Douglas Anderson wrote:
> On Sun, Apr 30, 2023 at 1:53 AM Hillf Danton <hdanton@sina.com> wrote:
> > On 28 Apr 2023 13:54:38 -0700 Douglas Anderson wrote:
> > > The MIGRATE_SYNC_LIGHT mode is intended to block for things that will
> > > finish quickly but not for things that will take a long time. Exactly
> > > how long is too long is not well defined, but waits of tens of
> > > milliseconds are likely non-ideal.
> > >
> > > When putting a Chromebook under memory pressure (opening over 90 tabs
> > > on a 4GB machine) it was fairly easy to see delays waiting for some
> > > locks in the kcompactd code path of > 100 ms. While the laptop wasn't
> > > amazingly usable in this state, it was still limping along, and this
> > > state isn't something artificial. Sometimes we simply end up with a
> > > lot of memory pressure.
> >
> > Given a stall longer than 100ms, this cannot be a correct fix if the
> > hardware fails to do more than ten IOs a second.
> >
> > OTOH given some pages reclaimed for compaction to make forward progress
> > before kswapd wakes kcompactd up, this cannot be a fix without spotting
> > the cause of the stall.
>
> Right that the system is in pretty bad shape when this happens and
> it's not very effective at doing IO or much of anything because it's
> under bad memory pressure.

Based on the info in another reply [1]:

| I put some more traces in and reproduced it again. I saw something
| that looked like this:
|
| 1. balance_pgdat() called wakeup_kcompactd() with order=10 and that
|    caused us to get all the way to the end and wake kcompactd up (there
|    were previous calls to wakeup_kcompactd() that returned early).
|
| 2. kcompactd started and completed kcompactd_do_work() without blocking.
|
| 3. kcompactd called proactive_compact_node() and there blocked for
|    ~92ms in one case, ~120ms in another case, ~131ms in another case.

I see fragmentation, given order=10 and proactive_compact_node().
Can you specify the evidence of bad memory pressure?

[1] https://lore.kernel.org/lkml/CAD=FV=V8m-mpJsFntCciqtq7xnvhmnvPdTvxNuBGBT3-cDdabQ@mail.gmail.com/

> I guess my first thought is that, when this happens, a process
> holding the lock gets preempted and doesn't get scheduled back in for
> a while. That _should_ be possible, right? In the case where I'm
> reproducing this then all the CPUs would be super busy madly trying to
> compress / decompress zram, so it doesn't surprise me that a process
> could get context switched out for a while.

Could such a switchout turn the I/O wait below upside down?

	/*
	 * In "light" mode, we can wait for transient locks (eg
	 * inserting a page into the page table), but it's not
	 * worth waiting for I/O.
	 */
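For the record, the decision that comment describes can be written down as a
small stand-alone sketch. This is a toy model, not the kernel code:
should_wait_for_lock() and the folio_uptodate flag are stand-ins for the
struct folio / enum migrate_mode logic in mm/migrate.c. The idea is that
once folio_trylock() has failed, SYNC_LIGHT sleeps in folio_lock() only for
an uptodate folio, on the theory that a !uptodate locked folio is most
likely locked for in-flight I/O and could keep us waiting tens of ms.

```c
#include <stdbool.h>

/* Stand-ins for the kernel's enum migrate_mode (toy model). */
enum migrate_mode { MIGRATE_ASYNC, MIGRATE_SYNC_LIGHT, MIGRATE_SYNC };

/*
 * Given that folio_trylock() failed (someone else holds the folio lock),
 * should this migration mode sleep in folio_lock()?
 *
 * ASYNC never sleeps; SYNC_LIGHT sleeps only for a transient lock
 * (folio already uptodate, e.g. held while inserting the page into a
 * page table), not for a folio likely locked for I/O; SYNC always sleeps.
 */
static bool should_wait_for_lock(enum migrate_mode mode, bool folio_uptodate)
{
	if (mode == MIGRATE_ASYNC)
		return false;
	if (mode == MIGRATE_SYNC_LIGHT && !folio_uptodate)
		return false;
	return true;
}
```

Under this model the switchout question above becomes: a preempted holder of
a "transient" lock makes SYNC_LIGHT wait anyway, even though no I/O is
involved.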