linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@kernel.org>
To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: akpm@linux-foundation.org, linux-mm@kvack.org,
	rientjes@google.com, hannes@cmpxchg.org, guro@fb.com,
	tj@kernel.org, vdavydov.dev@gmail.com,
	torvalds@linux-foundation.org
Subject: Re: [PATCH] mm,oom: Don't call schedule_timeout_killable() with oom_lock held.
Date: Tue, 23 Jan 2018 13:42:45 +0100	[thread overview]
Message-ID: <20180123124245.GK1526@dhcp22.suse.cz> (raw)
In-Reply-To: <201801232107.HJB48975.OHJFFOOLFQMVSt@I-love.SAKURA.ne.jp>

On Tue 23-01-18 21:07:03, Tetsuo Handa wrote:
> Michal Hocko wrote:
> > On Mon 22-01-18 22:46:22, Tetsuo Handa wrote:
> > > When I was examining a bug which occurs under CPU + memory pressure, I
> > > observed that a thread which called out_of_memory() can sleep for minutes
> > > at schedule_timeout_killable(1) with oom_lock held when many threads are
> > > doing direct reclaim.
> > > 
> > > --------------------
> > > [  163.357628] b.out invoked oom-killer: gfp_mask=0x14280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO), nodemask=(null), order=0, oom_score_adj=0
> > > [  163.360946] CPU: 0 PID: 554 Comm: b.out Not tainted 4.15.0-rc8+ #216
> > > (...snipped...)
> > > [  163.470193] Out of memory: Kill process 548 (b.out) score 6 or sacrifice child
> > > [  163.471813] Killed process 1191 (b.out) total-vm:2108kB, anon-rss:60kB, file-rss:4kB, shmem-rss:0kB
> > > (...snipped...)
> > > [  248.016033] sysrq: SysRq : Show State
> > > (...snipped...)
> > > [  249.625720] b.out           R  running task        0   554    538 0x00000004
> > > [  249.627778] Call Trace:
> > > [  249.628513]  __schedule+0x142/0x4b2
> > > [  249.629394]  schedule+0x27/0x70
> > > [  249.630114]  schedule_timeout+0xd1/0x160
> > > [  249.631029]  ? oom_kill_process+0x396/0x400
> > > [  249.632039]  ? __next_timer_interrupt+0xc0/0xc0
> > > [  249.633087]  schedule_timeout_killable+0x15/0x20
> > > [  249.634097]  out_of_memory+0xea/0x270
> > > [  249.634901]  __alloc_pages_nodemask+0x715/0x880
> > > [  249.635920]  handle_mm_fault+0x538/0xe40
> > > [  249.636888]  ? __enqueue_entity+0x63/0x70
> > > [  249.637787]  ? set_next_entity+0x4b/0x80
> > > [  249.638687]  __do_page_fault+0x199/0x430
> > > [  249.639535]  ? vmalloc_sync_all+0x180/0x180
> > > [  249.640452]  do_page_fault+0x1a/0x1e
> > > [  249.641283]  common_exception+0x82/0x8a
> > > (...snipped...)
> > > [  462.676366] oom_reaper: reaped process 1191 (b.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
> > > --------------------
> > > 
> > > --------------------
> > > [  269.985819] b.out invoked oom-killer: gfp_mask=0x14200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null), order=0, oom_score_adj=0
> > > [  269.988570] CPU: 0 PID: 9079 Comm: b.out Not tainted 4.15.0-rc8+ #217
> > > (...snipped...)
> > > [  270.073050] Out of memory: Kill process 914 (b.out) score 9 or sacrifice child
> > > [  270.074660] Killed process 2208 (b.out) total-vm:2108kB, anon-rss:64kB, file-rss:4kB, shmem-rss:0kB
> > > [  297.562824] sysrq: SysRq : Show State
> > > (...snipped...)
> > > [  471.716610] b.out           R  running task        0  9079   7400 0x00000000
> > > [  471.718203] Call Trace:
> > > [  471.718784]  __schedule+0x142/0x4b2
> > > [  471.719577]  schedule+0x27/0x70
> > > [  471.720294]  schedule_timeout+0xd1/0x160
> > > [  471.721207]  ? oom_kill_process+0x396/0x400
> > > [  471.722151]  ? __next_timer_interrupt+0xc0/0xc0
> > > [  471.723215]  schedule_timeout_killable+0x15/0x20
> > > [  471.724350]  out_of_memory+0xea/0x270
> > > [  471.725201]  __alloc_pages_nodemask+0x715/0x880
> > > [  471.726238]  ? radix_tree_lookup_slot+0x1f/0x50
> > > [  471.727253]  filemap_fault+0x346/0x510
> > > [  471.728120]  ? filemap_map_pages+0x245/0x2d0
> > > [  471.729105]  ? unlock_page+0x30/0x30
> > > [  471.729987]  __xfs_filemap_fault.isra.18+0x2d/0xb0
> > > [  471.731488]  ? unlock_page+0x30/0x30
> > > [  471.732364]  xfs_filemap_fault+0xa/0x10
> > > [  471.733260]  __do_fault+0x11/0x30
> > > [  471.734033]  handle_mm_fault+0x8e8/0xe40
> > > [  471.735200]  __do_page_fault+0x199/0x430
> > > [  471.736163]  ? common_exception+0x82/0x8a
> > > [  471.737102]  ? vmalloc_sync_all+0x180/0x180
> > > [  471.738061]  do_page_fault+0x1a/0x1e
> > > [  471.738881]  common_exception+0x82/0x8a
> > > (...snipped...)
> > > [  566.969400] oom_reaper: reaped process 2208 (b.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
> > > --------------------
> > > 
> > > Allowing the OOM reaper to start reclaiming memory without waiting for
> > > the oom_lock is not sufficient if the OOM reaper did not reclaim enough
> > > memory. We need to make sure that the thread which called out_of_memory()
> > > will release oom_lock shortly. Thus, this patch brings the short sleep
> > > to outside of the OOM killer.
> > 
> > And why does sleeping outside of the lock make any sense? The whole
> > point of the sleep is to give the victim some time to exit and if we
> > sleep outside of the lock then contending allocating paths hit the oom
> > path early.
> 
> Why does subsequent threads call out_of_memory() again without giving
> the OOM victim some time to exit matter? MMF_OOM_SKIP test will prevent
> subsequent out_of_memory() calls from selecting next OOM victims.

Yes, it is possible that the _current_ code doesn't need this as I've
written below.

> > To be completely host, I am not in love with this
> > schedule_timeout_uninterruptible(1). It is an ugly hack. It used to be
> > much more important in the past when the oom victim test was too
> > fragile. I strongly suspect that it is not needed this days so rather
> > than moving the sleep around I would try to remove it altogether.
> 
> But this schedule_timeout_uninterruptible(1) serves as a guaranteed
> sleep for PF_WQ_WORKER threads
> ( http://lkml.kernel.org/r/20170830064019.mfihbeu3mm5ygcrb@dhcp22.suse.cz ).
> 
>     > If we are under memory pressure, __zone_watermark_ok() can return false.
>     > If __zone_watermark_ok() == false, when is schedule_timeout_*() called explicitly?
> 
>     If all zones fail with the watermark check then we should hit the oom
>     path and sleep there. We do not do so for all cases though.
> 
> Thus, you cannot simply remove it.

Then I would rather make should_reclaim_retry more robust.

> > Also, your changelog silently skips over some important details. The
> > system must be really overloaded when a short sleep can take minutes.
> 
> Yes, the system was really overloaded, for I was testing below reproducer
> on a x86_32 kernel.
[...]
> > I would trongly suspect that such an overloaded system doesn't need
> > a short sleep to hold the oom lock for too long. All you need is to be
> > preempted. So this patch doesn't really solve any _real_ problem.
> 
> Preemption makes the OOM situation much worse. The only way to avoid all OOM
> lockups caused by lack of CPU resource is to replace mutex_trylock(&oom_lock)
> in __alloc_pages_may_oom() with mutex_lock(&oom_lock) (or similar) in order to
> guarantee that all threads waiting for the OOM killer/reaper to make forward
> progress shall give enough CPU resource.

And how exactly does that help when the page allocator gets preempted by
somebody not doing any allocation?
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2018-01-23 12:42 UTC|newest]

Thread overview: 70+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-01-22 13:46 [PATCH] mm,oom: Don't call schedule_timeout_killable() with oom_lock held Tetsuo Handa
2018-01-23  8:38 ` Michal Hocko
2018-01-23 12:07   ` Tetsuo Handa
2018-01-23 12:42     ` Michal Hocko [this message]
2018-01-24 13:28       ` Tetsuo Handa
2018-02-13 11:58         ` Tetsuo Handa
2018-02-20 13:32           ` [PATCH] mm,page_alloc: wait for oom_lock than back off Tetsuo Handa
2018-02-20 13:40             ` Matthew Wilcox
2018-02-20 14:12               ` Tetsuo Handa
2018-02-20 14:49             ` Michal Hocko
2018-02-21 14:27               ` Tetsuo Handa
2018-02-22 13:06                 ` Michal Hocko
2018-02-24  8:00                   ` [PATCH v2] " Tetsuo Handa
2018-02-26  9:27                     ` Michal Hocko
2018-02-26 10:58                       ` Tetsuo Handa
2018-02-26 12:19                         ` Michal Hocko
2018-02-26 13:16                           ` Tetsuo Handa
2018-03-02 11:10                             ` [PATCH v3] " Tetsuo Handa
2018-03-02 14:10                               ` Michal Hocko
2018-03-03  3:15                                 ` Tetsuo Handa
2018-03-21 10:39                                   ` Tetsuo Handa
2018-03-21 11:21                                     ` Michal Hocko
2018-03-21 11:35                                       ` Tetsuo Handa
2018-03-21 12:00                                         ` Michal Hocko
2018-03-21 12:20                                           ` Tetsuo Handa
2018-03-21 12:31                                             ` Michal Hocko
2018-03-22 10:51 [PATCH] mm,oom: Don't call schedule_timeout_killable() with oom_lock held Tetsuo Handa
2018-03-22 11:45 ` Michal Hocko
2018-03-22 13:16   ` Tetsuo Handa
2018-05-12 14:18 Tetsuo Handa
2018-05-15  9:16 ` Michal Hocko
2018-05-18 10:14   ` Tetsuo Handa
2018-05-18 12:20     ` Michal Hocko
2018-05-20 15:56       ` Tetsuo Handa
2018-05-22  6:18         ` Michal Hocko
2018-05-23 10:24           ` Tetsuo Handa
2018-05-23 11:57             ` Michal Hocko
2018-05-23 13:45               ` Tetsuo Handa
2018-05-23 14:56                 ` Michal Hocko
2018-05-24 10:51                   ` Tetsuo Handa
2018-05-24 11:50                     ` Michal Hocko
2018-05-25  1:17                       ` Tetsuo Handa
2018-05-25  8:31                         ` Michal Hocko
2018-05-25 10:57                           ` Tetsuo Handa
2018-05-25 11:42                             ` Michal Hocko
2018-05-25 11:46                               ` Tetsuo Handa
2018-05-28 12:43                                 ` Michal Hocko
2018-05-28 20:57                                   ` Tetsuo Handa
2018-05-29  7:17                                     ` Michal Hocko
2018-05-29 23:07                                       ` Andrew Morton
2018-05-31 10:10                                         ` Tetsuo Handa
2018-05-31 10:44                                           ` Michal Hocko
2018-05-31 15:23                                             ` Tetsuo Handa
2018-05-31 18:47                                               ` Michal Hocko
2018-06-01  1:21                                                 ` Tetsuo Handa
2018-06-01  8:04                                                   ` Michal Hocko
2018-06-01 15:28                                         ` Michal Hocko
2018-06-01 21:11                                           ` Andrew Morton
2018-06-04  7:04                                             ` Michal Hocko
2018-06-04 10:41                                               ` Tetsuo Handa
2018-06-04 11:22                                                 ` Michal Hocko
2018-06-04 11:30                                                   ` Tetsuo Handa
2018-06-06  9:02                                                 ` David Rientjes
2018-06-06 13:37                                                   ` Tetsuo Handa
2018-06-06 18:44                                                     ` David Rientjes
2018-05-29  7:17             ` Michal Hocko
2018-05-29  8:16               ` Michal Hocko
2018-05-29 14:33                 ` Tetsuo Handa
2018-05-29 17:18                   ` Michal Hocko
2018-05-29 17:28                     ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180123124245.GK1526@dhcp22.suse.cz \
    --to=mhocko@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=guro@fb.com \
    --cc=hannes@cmpxchg.org \
    --cc=linux-mm@kvack.org \
    --cc=penguin-kernel@I-love.SAKURA.ne.jp \
    --cc=rientjes@google.com \
    --cc=tj@kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=vdavydov.dev@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).