Date: Thu, 16 Jul 2020 00:06:27 -0700 (PDT)
From: David Rientjes
To: Michal Hocko
cc: Tetsuo Handa, Yafang Shao, Andrew Morton, Johannes Weiner, Linux MM
Subject: Re: [PATCH v2] memcg, oom: check memcg margin for parallel oom
In-Reply-To: <20200716061156.GB31089@dhcp22.suse.cz>
References: <1594735034-19190-1-git-send-email-laoar.shao@gmail.com> <20200716061156.GB31089@dhcp22.suse.cz>

On Thu, 16 Jul 2020, Michal Hocko wrote:

> > > But regardless of whether we present previous data to the user in the
> > > kernel log or not, we've determined that oom killing a process is a
> > > serious matter and go to any lengths possible to avoid having to do it.
> > > For us, that means waiting until the "point of no return" to either go
> > > ahead with oom killing a process or aborting and retrying the charge.
> > >
> > > I don't think moving the mem_cgroup_margin() check to out_of_memory()
> > > right before printing the oom info and killing the process is a very
> > > invasive patch. Any strong preference against doing it that way? I think
> > > moving the check as late as possible to save a process from being killed
> > > when racing with an exiter or killed process (including perhaps current)
> > > has a pretty clear motivation.
> > >
> >
> > How about ignoring MMF_OOM_SKIP for once?
> > I think this has almost same
> > effect as moving the mem_cgroup_margin() check to out_of_memory()
> > right before printing the oom info and killing the process.
>
> How would that help with races when a task is exiting while the oom
> selects a victim? We are not talking about races with the oom_reaper
> IIUC. Btw. if races with the oom_reaper are a concern then I would much
> rather delay the wake up than complicate the existing protocol even
> further.

Right, this isn't a concern about racing with oom reaping or when finding
processes that have already been selected as the oom victim. This is about
(potentially significant) amounts of memory that have been uncharged from the
memcg hierarchy between the failure of reclaim to uncharge memory and the
actual killing of a user process.
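
To make that window concrete, here is a toy userspace sketch in C. It is not kernel code and not the proposed patch: `struct toy_memcg`, `toy_margin()`, `toy_uncharge()` and `toy_oom_point_of_no_return()` are hypothetical names invented for illustration, and the margin arithmetic is only an assumption about what a mem_cgroup_margin()-style check would report. It shows the decision being made as late as possible: if enough memory was uncharged after reclaim failed, the charge is retried instead of a process being killed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a memcg page counter; not the kernel's struct. */
struct toy_memcg {
    size_t usage; /* pages currently charged */
    size_t limit; /* hard limit in pages */
};

/* Pages that can still be charged before hitting the limit (assumed model). */
size_t toy_margin(const struct toy_memcg *cg)
{
    return cg->usage < cg->limit ? cg->limit - cg->usage : 0;
}

/* Models an exiting or already-killed task giving its memory back. */
void toy_uncharge(struct toy_memcg *cg, size_t pages)
{
    assert(cg != NULL);
    cg->usage = pages < cg->usage ? cg->usage - pages : 0;
}

enum toy_oom_result {
    TOY_OOM_RETRY_CHARGE, /* enough room appeared: abort the kill */
    TOY_OOM_KILL          /* still no room: go ahead with the kill */
};

/*
 * The "point of no return": run after reclaim has failed and a victim has
 * been chosen, but before anything is printed or killed.
 */
enum toy_oom_result toy_oom_point_of_no_return(const struct toy_memcg *cg,
                                               size_t nr_pages)
{
    assert(cg != NULL);
    if (toy_margin(cg) >= nr_pages)
        return TOY_OOM_RETRY_CHARGE;
    return TOY_OOM_KILL;
}
```

With usage at the limit the helper reports TOY_OOM_KILL; after a concurrent toy_uncharge() frees at least nr_pages, the same call reports TOY_OOM_RETRY_CHARGE. The later the check runs, the more of those concurrent uncharges it can observe.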