From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wr0-f198.google.com (mail-wr0-f198.google.com [209.85.128.198]) by kanga.kvack.org (Postfix) with ESMTP id A4BB16B0006 for ; Fri, 25 May 2018 07:42:18 -0400 (EDT) Received: by mail-wr0-f198.google.com with SMTP id i11-v6so3959475wre.16 for ; Fri, 25 May 2018 04:42:18 -0700 (PDT) Received: from mx2.suse.de (mx2.suse.de. [195.135.220.15]) by mx.google.com with ESMTPS id a24-v6si2162609edc.97.2018.05.25.04.42.16 for (version=TLS1 cipher=AES128-SHA bits=128/128); Fri, 25 May 2018 04:42:16 -0700 (PDT) Date: Fri, 25 May 2018 13:42:13 +0200 From: Michal Hocko Subject: Re: [PATCH] mm,oom: Don't call schedule_timeout_killable() with oom_lock held. Message-ID: <20180525114213.GJ11881@dhcp22.suse.cz> References: <201805241951.IFF48475.FMOSOJFQHLVtFO@I-love.SAKURA.ne.jp> <20180524115017.GE20441@dhcp22.suse.cz> <201805250117.w4P1HgdG039943@www262.sakura.ne.jp> <20180525083118.GI11881@dhcp22.suse.cz> <201805251957.EJJ09809.LFJHFFVOOSQOtM@I-love.SAKURA.ne.jp> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <201805251957.EJJ09809.LFJHFFVOOSQOtM@I-love.SAKURA.ne.jp> Sender: owner-linux-mm@kvack.org List-ID: To: Tetsuo Handa Cc: guro@fb.com, rientjes@google.com, hannes@cmpxchg.org, vdavydov.dev@gmail.com, tj@kernel.org, linux-mm@kvack.org, akpm@linux-foundation.org, torvalds@linux-foundation.org On Fri 25-05-18 19:57:32, Tetsuo Handa wrote: > Michal Hocko wrote: > > On Fri 25-05-18 10:17:42, Tetsuo Handa wrote: > > > Then, please show me (by writing a patch yourself) how to tell whether > > > we should sleep there. What I can come up is shown below. > > > > > > -@@ -4241,6 +4240,12 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask) > > > - /* Retry as long as the OOM killer is making progress */ > > > - if (did_some_progress) { > > > - no_progress_loops = 0; > > > -+ /* > > > -+ * This schedule_timeout_*() serves as a guaranteed sleep for > > > -+ * PF_WQ_WORKER threads when __zone_watermark_ok() == false. > > > -+ */ > > > -+ if (!tsk_is_oom_victim(current)) > > > -+ schedule_timeout_uninterruptible(1); > > > - goto retry; > > > - } > > > +@@ -3927,6 +3926,14 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask) > > > + (*no_progress_loops)++; > > > > > > + /* > > > ++ * We do a short sleep here if the OOM killer/reaper/victims are > > > ++ * holding oom_lock, in order to try to give them some CPU resources > > > ++ * for releasing memory. > > > ++ */ > > > ++ if (mutex_is_locked(&oom_lock) && !tsk_is_oom_victim(current)) > > > ++ schedule_timeout_uninterruptible(1); > > > ++ > > > ++ /* > > > + * Make sure we converge to OOM if we cannot make any progress > > > + * several times in the row. > > > + */ > > > > > > As far as I know, whether a domain which the current thread belongs to is > > > already OOM is not known as of should_reclaim_retry(). Therefore, sleeping > > > there can become a pointless delay if the domain which the current thread > > > belongs to and the domain which the owner of oom_lock (it can be a random > > > thread inside out_of_memory() or exit_mmap()) belongs to differs. > > > > > > But you insist sleeping there means that you don't care about such > > > pointless delay? > > > > What is wrong with the folliwing? should_reclaim_retry should be a > > natural reschedule point. PF_WQ_WORKER is a special case which needs a > > stronger rescheduling policy. Doing that unconditionally seems more > > straightforward than depending on a zone being a good candidate for a > > further reclaim. > > Where is schedule_timeout_uninterruptible(1) for !PF_KTHREAD threads? Re-read what I've said. -- Michal Hocko SUSE Labs