linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Mel Gorman <mgorman@suse.de>
To: Michal Hocko <mhocko@kernel.org>
Cc: linux-mm@kvack.org, Johannes Weiner <hannes@cmpxchg.org>,
	Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [RFC PATCH 1/2] mm, vmscan: account the number of isolated pages per zone
Date: Thu, 19 Jan 2017 13:11:43 +0000	[thread overview]
Message-ID: <20170119131143.2ze5l5fwheoqdpne@suse.de> (raw)
In-Reply-To: <20170119112336.GN30786@dhcp22.suse.cz>

On Thu, Jan 19, 2017 at 12:23:36PM +0100, Michal Hocko wrote:
> On Thu 19-01-17 10:07:55, Mel Gorman wrote:
> [...]
> > mm, vmscan: Wait on a waitqueue when too many pages are isolated
> > 
> > When too many pages are isolated, direct reclaim waits on congestion to clear
> > for up to a tenth of a second. There is no reason to believe that too many
> > pages are isolated due to dirty pages, reclaim efficiency or congestion.
> > It may simply be because an extremely large number of processes have entered
> > direct reclaim at the same time. However, it is possible for the situation
> > to persist forever and never reach OOM.
> > 
> > This patch queues processes a waitqueue when too many pages are isolated.
> > When parallel reclaimers finish shrink_page_list, they wake the waiters
> > to recheck whether too many pages are isolated.
> > 
> > The wait on the queue has a timeout as not all sites that isolate pages
> > will do the wakeup. Depending on every isolation of LRU pages to be perfect
> > forever is potentially fragile. The specific wakeups occur for page reclaim
> > and compaction. If too many pages are isolated due to memory failure,
> > hotplug or directly calling migration from a syscall then the waiting
> > processes may wait the full timeout.
> > 
> > Note that the timeout allows the use of waitqueue_active() on the basis
> > that a race will cause the full timeout to be reached due to a missed
> > wakeup. This is relatively harmless and still a massive improvement over
> > unconditionally calling congestion_wait.
> > 
> > Direct reclaimers that cannot isolate pages within the timeout will consider
> > return to the caller. This is somewhat clunky as it won't return immediately
> > and make go through the other priorities and slab shrinking. Eventually,
> > it'll go through a few iterations of should_reclaim_retry and reach the
> > MAX_RECLAIM_RETRIES limit and consider going OOM.
> 
> I cannot really say I would like this. It's just much more complex than
> necessary.

I guess it's a difference in opinion. Miximg per-zone and per-node
information for me is complex. I liked the workqueue because it was an
example of waiting on a specific event instead of relying completely on
time.

> I definitely agree that congestion_wait while waiting for
> too_many_isolated is a crude hack. This patch doesn't really resolve
> my biggest worry, though, that we go OOM with too many pages isolated
> as your patch doesn't alter zone_reclaimable_pages to reflect those
> numbers.
> 

Indeed, but such cases are also caught by the no_progress_loop logic to
avoid a premature OOM.

> Anyway, I think both of us are probably overcomplicating things a bit.
> Your waitqueue approach is definitely better semantically than the
> congestion_wait because we are waiting for a different event than the
> API is intended for. On the other hand a mere
> schedule_timeout_interruptible might work equally well in the real life.
> On the other side I might really over emphasise the role of NR_ISOLATED*
> counts. It might really turn out that we can safely ignore them and it
> won't be the end of the world. So what do you think about the following
> as a starting point. If we ever see oom reports with high number of
> NR_ISOLATED* which are part of the oom report then we know we have to do
> something about that. Those changes would at least be driven by a real
> usecase rather than theoretical scenarios.
> 
> So what do you think about the following? Tetsuo, would you be willing
> to run this patch through your torture testing please?

I'm fine with treating this as a starting point.

Thanks.

-- 
Mel Gorman
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2017-01-19 13:11 UTC|newest]

Thread overview: 55+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-01-18 13:44 [RFC PATCH 0/2] fix unbounded too_many_isolated Michal Hocko
2017-01-18 13:44 ` [RFC PATCH 1/2] mm, vmscan: account the number of isolated pages per zone Michal Hocko
2017-01-18 14:46   ` Mel Gorman
2017-01-18 15:15     ` Michal Hocko
2017-01-18 15:54       ` Mel Gorman
2017-01-18 16:17         ` Michal Hocko
2017-01-18 17:00           ` Mel Gorman
2017-01-18 17:29             ` Michal Hocko
2017-01-19 10:07               ` Mel Gorman
2017-01-19 11:23                 ` Michal Hocko
2017-01-19 13:11                   ` Mel Gorman [this message]
2017-01-20 13:27                     ` Tetsuo Handa
2017-01-21  7:42                       ` Tetsuo Handa
2017-01-25 10:15                         ` Michal Hocko
2017-01-25 10:19                           ` Christoph Hellwig
2017-01-25 10:46                             ` Michal Hocko
2017-01-25 11:09                               ` Tetsuo Handa
2017-01-25 13:00                                 ` Michal Hocko
2017-01-27 14:49                                   ` Michal Hocko
2017-01-28 15:27                                     ` Tetsuo Handa
2017-01-30  8:55                                       ` Michal Hocko
2017-02-02 10:14                                         ` Michal Hocko
2017-02-03 10:57                                           ` Tetsuo Handa
2017-02-03 14:41                                             ` Michal Hocko
2017-02-03 14:50                                             ` Michal Hocko
2017-02-03 17:24                                               ` Brian Foster
2017-02-06  6:29                                                 ` Tetsuo Handa
2017-02-06 14:35                                                   ` Brian Foster
2017-02-06 14:42                                                     ` Michal Hocko
2017-02-06 15:47                                                       ` Brian Foster
2017-02-07 10:30                                                     ` Tetsuo Handa
2017-02-07 16:54                                                       ` Brian Foster
2017-02-03 14:55                                             ` Michal Hocko
2017-02-05 10:43                                               ` Tetsuo Handa
2017-02-06 10:34                                                 ` Michal Hocko
2017-02-06 10:39                                                 ` Michal Hocko
2017-02-07 21:12                                                   ` Michal Hocko
2017-02-08  9:24                                                     ` Peter Zijlstra
2017-02-21  9:40                                             ` Michal Hocko
2017-02-21 14:35                                               ` Tetsuo Handa
2017-02-21 15:53                                                 ` Michal Hocko
2017-02-22  2:02                                                   ` Tetsuo Handa
2017-02-22  7:54                                                     ` Michal Hocko
2017-02-26  6:30                                                       ` Tetsuo Handa
2017-01-31 11:58                                   ` Michal Hocko
2017-01-31 12:51                                     ` Christoph Hellwig
2017-01-31 13:21                                       ` Michal Hocko
2017-01-25 10:33                           ` [RFC PATCH 1/2] mm, vmscan: account the number of isolated pagesper zone Tetsuo Handa
2017-01-25 12:34                             ` Michal Hocko
2017-01-25 13:13                               ` [RFC PATCH 1/2] mm, vmscan: account the number of isolated pages per zone Tetsuo Handa
2017-01-25  9:53                       ` Michal Hocko
2017-01-20  6:42                 ` Hillf Danton
2017-01-20  9:25                   ` Mel Gorman
2017-01-18 13:44 ` [RFC PATCH 2/2] mm, vmscan: do not loop on too_many_isolated for ever Michal Hocko
2017-01-18 14:50   ` Mel Gorman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170119131143.2ze5l5fwheoqdpne@suse.de \
    --to=mgorman@suse.de \
    --cc=hannes@cmpxchg.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=penguin-kernel@I-love.SAKURA.ne.jp \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).