From: Hillf Danton <hdanton@sina.com>
To: Doug Anderson <dianders@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Mel Gorman <mgorman@techsingularity.net>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Christian Brauner <brauner@kernel.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Matthew Wilcox <willy@infradead.org>, Yu Zhao <yuzhao@google.com>
Subject: Re: [PATCH v3] migrate_pages: Avoid blocking for IO in MIGRATE_SYNC_LIGHT
Date: Wed,  3 May 2023 09:45:00 +0800
Message-ID: <20230503014500.3692-1-hdanton@sina.com>
In-Reply-To: <CAD=FV=WLHZfNN5cGMUEnvv17obVK-MLmWHJHx=MV55Q1YxczOA@mail.gmail.com>

On 2 May 2023 14:20:54 -0700 Douglas Anderson <dianders@chromium.org>
> On Sun, Apr 30, 2023 at 1:53 AM Hillf Danton <hdanton@sina.com> wrote:
> > On 28 Apr 2023 13:54:38 -0700 Douglas Anderson <dianders@chromium.org>
> > > The MIGRATE_SYNC_LIGHT mode is intended to block for things that will
> > > finish quickly but not for things that will take a long time. Exactly
> > > how long is too long is not well defined, but waits of tens of
> > > milliseconds is likely non-ideal.
> > >
> > > When putting a Chromebook under memory pressure (opening over 90 tabs
> > > on a 4GB machine) it was fairly easy to see delays waiting for some
> > > locks in the kcompactd code path of > 100 ms. While the laptop wasn't
> > > amazingly usable in this state, it was still limping along and this
> > > state isn't something artificial. Sometimes we simply end up with a
> > > lot of memory pressure.
> >
> > Given longer than 100ms stall, this can not be a correct fix if the
> > hardware fails to do more than ten IOs a second.
> >
> > OTOH given some pages reclaimed for compaction to make forward progress
> > before kswapd wakes kcompactd up, this can not be a fix without spotting
> > the cause of the stall.
> 
> Right that the system is in pretty bad shape when this happens and
> it's not very effective at doing IO or much of anything because it's
> under bad memory pressure.

Based on the info in another reply [1]:

   | I put some more traces in and reproduced it again. I saw something
   | that looked like this:
   | 
   | 1. balance_pgdat() called wakeup_kcompactd() with order=10 and that
   | caused us to get all the way to the end and wakeup kcompactd (there
   | were previous calls to wakeup_kcompactd() that returned early).
   | 
   | 2. kcompactd started and completed kcompactd_do_work() without blocking.
   | 
   | 3. kcompactd called proactive_compact_node() and there blocked for
   | ~92ms in one case, ~120ms in another case, ~131ms in another case.

I see fragmentation, given order=10 and proactive_compact_node(). Can you
point to the evidence of bad memory pressure?

[1] https://lore.kernel.org/lkml/CAD=FV=V8m-mpJsFntCciqtq7xnvhmnvPdTvxNuBGBT3-cDdabQ@mail.gmail.com/
> 
> I guess my first thought is that, when this happens then a process
> holding the lock gets preempted and doesn't get scheduled back in for
> a while. That _should_ be possible, right? In the case where I'm
> reproducing this then all the CPUs would be super busy madly trying to
> compress / decompress zram, so it doesn't surprise me that a process
> could get context switched out for a while.

Could a switchout turn the I/O rationale below upside down?
		/*
		 * In "light" mode, we can wait for transient locks (eg
		 * inserting a page into the page table), but it's not
		 * worth waiting for I/O.
		 */
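
To make the policy in that comment concrete, here is a minimal userspace
sketch of the three migrate modes. It is not the kernel code: the type
and function names are hypothetical, and the io_in_flight flag is an
assumed stand-in for however the real code tells that a lock is held
across I/O.

```c
#include <stdbool.h>

/* Mirrors the names of the kernel's migrate modes; values are illustrative. */
enum migrate_mode_sketch {
	SKETCH_MIGRATE_ASYNC,		/* never block */
	SKETCH_MIGRATE_SYNC_LIGHT,	/* block briefly, but not for I/O */
	SKETCH_MIGRATE_SYNC,		/* block as long as needed */
};

/* Hypothetical stand-in for the state of a folio whose lock is held. */
struct folio_sketch {
	bool io_in_flight;	/* assumption: the lock holder is waiting on I/O */
};

/* Return true if the caller may sleep waiting for the folio lock. */
static bool may_wait_for_lock(enum migrate_mode_sketch mode,
			      const struct folio_sketch *folio)
{
	switch (mode) {
	case SKETCH_MIGRATE_ASYNC:
		return false;
	case SKETCH_MIGRATE_SYNC_LIGHT:
		/* Transient locks are fine; a lock held across I/O is not. */
		return !folio->io_in_flight;
	case SKETCH_MIGRATE_SYNC:
		return true;
	}
	return false;
}
```

Note that the sketch only looks at why the lock is held. It says nothing
about how long a holder of a "transient" lock stays off the CPU, which is
the case the question above is about.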

