From: Yafang Shao
Date: Sat, 11 Jul 2020 13:37:12 +0800
Subject: Re: [PATCH] mm, oom: don't invoke oom killer if current has been reapered
To: Michal Hocko, David Rientjes, Andrew Morton
Cc: Linux MM
References: <1594437481-11144-1-git-send-email-laoar.shao@gmail.com>
In-Reply-To: <1594437481-11144-1-git-send-email-laoar.shao@gmail.com>

On Sat, Jul 11, 2020 at 11:18 AM Yafang Shao wrote:
>
> If current's MMF_OOM_SKIP is set, current is exiting or dying and is
> likely to release its address space, so we don't need to invoke the
> oom killer again. Otherwise that may cause some unexpected issues; for
> example, below is an issue found in our production environment.
>
> Many threads of a multi-threaded task were running in parallel in a
> container on many CPUs. Then many of the threads triggered OOM at the
> same time:
>
>     CPU-1              CPU-2          ...    CPU-n
>     thread-1           thread-2       ...    thread-n
>
>     wait oom_lock      wait oom_lock  ...    hold oom_lock
>
>                                              (sigkill received)
>
>                                              select current as victim
>                                              and wakeup oom reaper
>
>                                              release oom_lock
>
>                                              (MMF_OOM_SKIP set by oom reaper)
>
>                                              (lots of pages are freed)
>     hold oom_lock
>
>     because MMF_OOM_SKIP
>     is set, kill others
>
> The thread running on CPU-n received sigkill, so it selects current as
> the victim and wakes up the oom reaper. The oom reaper then reaps its
> rss and frees lots of pages; as a result, there will be many free pages.
> Although the multi-threaded task is exiting, the other threads will
> continue to kill others because of the check of MMF_OOM_SKIP in
> task_will_free_mem().
>
> Signed-off-by: Yafang Shao
> ---
>  mm/oom_kill.c | 14 ++++++--------
>  1 file changed, 6 insertions(+), 8 deletions(-)
>
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 6e94962..a8a155a 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -825,13 +825,6 @@ static bool task_will_free_mem(struct task_struct *task)
>  	if (!__task_will_free_mem(task))
>  		return false;
>
> -	/*
> -	 * This task has already been drained by the oom reaper so there are
> -	 * only small chances it will free some more
> -	 */
> -	if (test_bit(MMF_OOM_SKIP, &mm->flags))
> -		return false;
> -
>  	if (atomic_read(&mm->mm_users) <= 1)
>  		return true;
>
> @@ -963,7 +956,8 @@ static void oom_kill_process(struct oom_control *oc, const char *message)
>  	 * so it can die quickly
>  	 */
>  	task_lock(victim);
> -	if (task_will_free_mem(victim)) {
> +	if (!test_bit(MMF_OOM_SKIP, &victim->mm->flags) &&
> +	    task_will_free_mem(victim)) {
>  		mark_oom_victim(victim);
>  		wake_oom_reaper(victim);
>  		task_unlock(victim);
> @@ -1056,6 +1050,10 @@ bool out_of_memory(struct oom_control *oc)
>  		return true;
>  	}
>
> +	/* current has been already reapered */
> +	if (test_bit(MMF_OOM_SKIP, &current->mm->flags))
> +		return true;
> +

Oops. This should check whether mm is NULL first:

	if (mm && test_bit(MMF_OOM_SKIP, &mm->flags))

>  	/*
>  	 * If current has a pending SIGKILL or is exiting, then automatically
>  	 * select it. The goal is to allow it to allocate so that it may
> --
> 1.8.3.1
>

-- 
Thanks
Yafang
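
P.S. For anyone who wants to try the corrected condition outside the kernel, here is a minimal userspace sketch of the NULL-safe check. Everything in it (struct mm_model, struct task_model, test_bit_model(), current_already_reaped(), and the bit number picked for MMF_OOM_SKIP_MODEL) is a hypothetical stand-in for illustration only, not the kernel's definitions, and it says nothing about locking or about how the final patch should look.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit index, chosen only for this sketch. */
#define MMF_OOM_SKIP_MODEL 21

/* Userspace stand-in for the part of mm_struct the check looks at. */
struct mm_model {
	unsigned long flags;
};

/* Userspace stand-in for a task; mm is NULL once the task dropped it. */
struct task_model {
	struct mm_model *mm;
};

/* Plain model of a bit test: is bit nr set in *addr? */
static bool test_bit_model(int nr, const unsigned long *addr)
{
	return (*addr >> nr) & 1UL;
}

/*
 * Model of the corrected early-return condition: report "already
 * reaped" only if the task still has an mm and the skip bit is set.
 * The NULL test comes first so mm is never dereferenced when absent.
 */
static bool current_already_reaped(const struct task_model *task)
{
	const struct mm_model *mm = task->mm;

	return mm && test_bit_model(MMF_OOM_SKIP_MODEL, &mm->flags);
}
```

In this model a task whose mm is NULL simply falls through to the rest of the OOM path instead of faulting, which is the point of the correction above.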