Date: Mon, 13 Jul 2020 08:01:54 +0200
From: Michal Hocko
To: Yafang Shao
Cc: rientjes@google.com, akpm@linux-foundation.org, linux-mm@kvack.org
Subject: Re: [PATCH] mm, oom: don't invoke oom killer if current has been reapered
Message-ID: <20200713060154.GA16783@dhcp22.suse.cz>
References: <1594437481-11144-1-git-send-email-laoar.shao@gmail.com>
In-Reply-To: <1594437481-11144-1-git-send-email-laoar.shao@gmail.com>

On Fri 10-07-20 23:18:01, Yafang Shao wrote:
> If the current's MMF_OOM_SKIP is set, it means that the current is exiting
> or dying and likely to release its address space.

That is not actually true. The primary reason for this flag is to tell
that the task is no longer relevant for the oom victim selection because
most of its memory has been released. But the task might be stuck at
many places, and waiting for it to terminate might easily lock up the
system. The design of the oom reaper is to guarantee forward progress
even when the victim cannot make forward progress on its own. For that
to work, the oom killer cannot really rely on the victim's state or on
the victim ever finishing. If you remove this fundamental assumption
then the oom killer can lock up again.

> So we don't need to
> invoke the oom killer again. Otherwise that may cause some unexpected
> issues, for example, below is the issue found in our production
> environment.

Please see the above.
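
The role of the flag described above can be illustrated with a toy
user-space model. This is only a sketch under stated assumptions, not
kernel code: `toy_task`, `toy_select_victim` and `toy_reap` are
hypothetical names, the `oom_skip` field merely stands in for
MMF_OOM_SKIP, and "largest task wins" is a deliberately simplified
stand-in for the real victim scoring.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a task; not a kernel structure. */
struct toy_task {
    const char *name;
    unsigned long rss_pages; /* memory still held by the task */
    bool oom_skip;           /* models MMF_OOM_SKIP */
};

/*
 * Pick the largest task that is still relevant for victim selection.
 * Tasks marked oom_skip are ignored. Returns NULL if nothing is eligible.
 */
struct toy_task *toy_select_victim(struct toy_task *tasks, size_t n)
{
    struct toy_task *best = NULL;

    for (size_t i = 0; i < n; i++) {
        if (tasks[i].oom_skip)
            continue;
        if (best == NULL || tasks[i].rss_pages > best->rss_pages)
            best = &tasks[i];
    }
    return best;
}

/*
 * Models the reaper: release most of the victim's memory and mark it as
 * no longer relevant. Nothing here waits for the task to terminate.
 */
void toy_reap(struct toy_task *t)
{
    assert(t != NULL);
    t->rss_pages /= 10; /* arbitrary "most memory released" factor */
    t->oom_skip = true;
}
```

In this model, reaping the first victim makes the next selection pass
move on to a different task, whether or not the first victim ever
exits. Treating the flag as "current is about to exit, so skip the oom
killer" would instead make progress depend on the victim finishing,
which is the dependency the reaper is meant to remove.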

> There're many threads of a multi-threaded task parallel running in a
> container on many cpus. Then many threads triggered OOM at the same time,
>
> CPU-1               CPU-2         ...        CPU-n
> thread-1            thread-2      ...        thread-n
>
> wait oom_lock       wait oom_lock ...        hold oom_lock
>
>                                              (sigkill received)
>
>                                              select current as victim
>                                              and wakeup oom reaper
>
>                                              release oom_lock
>
>                                              (MMF_OOM_SKIP set by oom reaper)
>
>                                              (lots of pages are freed)
> hold oom_lock

Could you be more specific please? The page allocator never waits for
the oom_lock and keeps retrying instead. Also __alloc_pages_may_oom
tries to allocate with the lock held. Could you provide oom reports
please?
-- 
Michal Hocko
SUSE Labs