From: Suren Baghdasaryan
Date: Fri, 21 Aug 2020 09:06:14 -0700
Subject: Re: [PATCH 1/1] mm, oom_adj: don't loop through tasks in __set_oom_adj when not necessary
To: Oleg Nesterov
Cc: "Eric W. Biederman", Michal Hocko, Tetsuo Handa, Christian Brauner,
    Tim Murray, mingo@kernel.org, Peter Zijlstra, Thomas Gleixner,
    esyr@redhat.com, christian@kellner.me, areber@redhat.com, Shakeel Butt,
    cyphar@cyphar.com, adobriyan@gmail.com, Andrew Morton,
    gladkov.alexey@gmail.com, Michel Lespinasse, daniel.m.jordan@oracle.com,
    avagin@gmail.com, bernd.edlinger@hotmail.de, John Johansen,
    laoar.shao@gmail.com, Minchan Kim, kernel-team, LKML,
    linux-fsdevel@vger.kernel.org, linux-mm

On Fri, Aug 21, 2020 at 8:28 AM Suren Baghdasaryan wrote:
>
> On Fri, Aug 21, 2020 at 4:16 AM Oleg Nesterov wrote:
> >
> > On 08/20, Eric W. Biederman wrote:
> > >
> > > That said if we are going for a small change why not:
> > >
> > >         /*
> > >          * Make sure we will check other processes sharing the mm if this is
> > >          * not vfrok which wants its own oom_score_adj.
> > >          * pin the mm so it doesn't go away and get reused after task_unlock
> > >          */
> > >         if (!task->vfork_done) {
> > >                 struct task_struct *p = find_lock_task_mm(task);
> > >
> > >                 if (p) {
> > > -                       if (atomic_read(&p->mm->mm_users) > 1) {
> > > +                       if (atomic_read(&p->mm->mm_users) > p->signal->nr_threads) {
> >
> > In theory this needs a barrier to avoid the race with do_exit(). And I'd
> > suggest to use signal->live, I think signal->nr_threads should die...
> > Something like
> >
> >         bool probably_has_other_mm_users(tsk)
> >         {
> >                 return  atomic_read_acquire(&tsk->mm->mm_users) >
> >                         atomic_read(&tsk->signal->live);
> >         }
> >
> > The barrier implied by _acquire ensures that if we race with the exiting
> > task and see the result of exit_mm()->mmput(mm), then we must also see
> > the result of atomic_dec_and_test(signal->live).
> >
> > Either way, if we want to fix the race with clone(CLONE_VM) we need other
> > changes.
>
> The way I understand this condition in __set_oom_adj() sync logic is
> that we would be ok with false positives (when we loop unnecessarily)
> but we can't tolerate false negatives (when oom_score_adj gets out of
> sync). With the clone(CLONE_VM) race not addressed we are allowing
> false negatives and IMHO that's not acceptable because it creates a
> possibility for userspace to get an inconsistent picture. When
> developing the patch I did think about using (p->mm->mm_users >
> p->signal->nr_threads) condition and had to reject it due to that
> reason.
>

Actually, reviewing again and considering where list_add_tail_rcu is
happening, maybe the race with clone(CLONE_VM) does not introduce
false negatives. However, I think a false negative will happen when a
task shares its mm with another task and also has an additional thread.
The shared mm will increment mm_users without adding to signal->live,
and the additional thread will advance signal->live without adding to
mm_users.
As a result, these increments will balance each other out and the
(mm->mm_users > signal->live) condition will yield a false negative.

> >
> > Oleg.
> >
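
To make the counting argument above concrete, here is a toy userspace
model of the two counters. It is only a sketch, not kernel code: the
struct and every helper name below are made up for illustration, and
the accounting simply follows the description in this message. Whether
a new thread also bumps mm_users is left as a parameter
(thread_mm_delta) because the sketch does not try to settle that
question; 0 matches the description above, 1 is the alternative.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the two counters; NOT the kernel's structures. */
struct counters {
        int mm_users;     /* models mm->mm_users */
        int signal_live;  /* models signal->live of the checked task */
};

/* A single-threaded process that owns its mm: one user, one live thread. */
static struct counters counters_init(void)
{
        struct counters c = { .mm_users = 1, .signal_live = 1 };
        return c;
}

/* Another process (a different thread group) starts sharing the mm. */
static void add_mm_sharer(struct counters *c)
{
        c->mm_users += 1;
}

/*
 * The checked process gains a thread. thread_mm_delta is how much
 * mm_users grows per new thread: 0 follows the description in this
 * message, 1 is the alternative assumption.
 */
static void add_thread(struct counters *c, int thread_mm_delta)
{
        assert(thread_mm_delta == 0 || thread_mm_delta == 1);
        c->signal_live += 1;
        c->mm_users += thread_mm_delta;
}

/* Same comparison as the proposed check, applied to the toy counters. */
static bool probably_has_other_mm_users(const struct counters *c)
{
        return c->mm_users > c->signal_live;
}

/* Scenario from the message: one outside sharer plus one extra thread. */
static bool check_sharer_plus_thread(int thread_mm_delta)
{
        struct counters c = counters_init();

        add_mm_sharer(&c);
        add_thread(&c, thread_mm_delta);
        return probably_has_other_mm_users(&c);
}
```

With thread_mm_delta == 0 the model ends at mm_users == 2 and
signal_live == 2, so check_sharer_plus_thread(0) misses the outside
sharer, which is the false negative described above. With
thread_mm_delta == 1 the same scenario ends at 3 vs. 2 and the sharer
is detected, so the conclusion depends entirely on that assumption.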