From: Shakeel Butt
Date: Fri, 23 Jul 2021 10:00:26 -0700
Subject: Re: [PATCH v3 1/2] mm: introduce process_mrelease system call
To: Suren Baghdasaryan
References: <20210723011436.60960-1-surenb@google.com>
Cc: Michal Hocko, Andrew Morton, David Rientjes, Matthew Wilcox,
 Johannes Weiner, Roman Gushchin, Rik van Riel, Minchan Kim, Christian Brauner,
 Christoph Hellwig, Oleg Nesterov, David Hildenbrand, Jann Horn, Andy Lutomirski,
 Christian Brauner, Florian Weimer, Jan Engelhardt, Tim Murray, Linux API,
 Linux MM, LKML, kernel-team

On Fri, Jul 23, 2021 at 9:09 AM Suren Baghdasaryan wrote:
>
> On Fri, Jul 23, 2021 at 6:46 AM Shakeel Butt wrote:
> >
> > On Fri, Jul 23, 2021 at 1:53 AM Michal Hocko wrote:
> > >
> > [...]
> > > > However
> > > > retrying means issuing another syscall, so additional overhead...
> > > > I guess such "best effort" approach would be unusual for a syscall, so
> > > > maybe we can keep it as it is now and if such "do not block" mode is needed
> > > > we can use flags to implement it later?
> > >
> > > Yeah, an explicit opt-in via flags would be an option if that turns out
> > > to be really necessary.
> > >
> >
> > I am fine with keeping it as it is but we do need the non-blocking
> > option (via flags) to enable userspace to act more aggressively.
>
> I think you want to check memory conditions shortly after issuing
> kill/reap requests irrespective of mmap_sem contention. The reason is
> that even when memory release is not blocked, allocations from other
> processes might consume memory faster than we release it. For example,
> in Android we issue kill and start waiting on pidfd for its death
> notification.
> As soon as the process is dead we reassess the situation
> and possibly kill again. If the process is not dead within a
> configurable timeout we check conditions again and might issue more
> kill requests (IOW our wait for the process to die has a timeout). If
> process_mrelease() is blocked on mmap_sem, we might timeout like this.
> I imagine that a non-blocking option for process_mrelease() would not
> really change this logic.

On a containerized system, killing a job requires killing multiple
processes and then calling process_mrelease() on each of them. There is
now cgroup.kill to kill all the processes in a cgroup tree, but we would
still need to process_mrelease() all the processes in that tree. There
is a chance that we get stuck reaping an early process. Making
process_mrelease() non-blocking would let userspace move on to the other
processes in the list. An alternative would be a cgroup-specific
interface for reaping, similar to cgroup.kill.

> Adding such an option is trivial but I would like to make sure it's
> indeed useful. Maybe after the syscall is in place you can experiment
> with it and see if such an option would really change the way you use
> it?

SGTM.
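
For illustration only, a rough sketch of the userspace loop that the
containerized case above has in mind. It is a simulation, not code from
the patch: try_reap() is a hypothetical stand-in for a non-blocking
process_mrelease() call (no such flag exists in the posted patch), and
reap_all() and max_passes are names made up for this example. The point
is only that a "would block" result lets the caller go on to the next
process in the list and come back to the stuck one later.

```python
from collections import deque
from typing import Callable, Iterable, List


def reap_all(pids: Iterable[int],
             try_reap: Callable[[int], bool],
             max_passes: int = 3) -> List[int]:
    """Try to reap every pid, skipping any that would block.

    try_reap(pid) is HYPOTHETICAL: it models a non-blocking
    process_mrelease() that returns True once memory was released and
    False when the call would have had to wait (e.g. on mmap_sem).
    Returns the pids still unreaped after max_passes passes.
    """
    pending = deque(pids)
    for _ in range(max_passes):
        if not pending:
            break
        retry = deque()
        while pending:
            pid = pending.popleft()
            if not try_reap(pid):
                # Would block: move on to the next process, retry later.
                retry.append(pid)
        pending = retry
    return list(pending)


if __name__ == "__main__":
    # Simulated contention: pid 101 is busy for its first two attempts.
    busy = {101: 2}
    reaped_order = []

    def fake_try_reap(pid: int) -> bool:
        if busy.get(pid, 0) > 0:
            busy[pid] -= 1
            return False
        reaped_order.append(pid)
        return True

    leftover = reap_all([101, 102, 103], fake_try_reap)
    print("reaped in order:", reaped_order, "leftover:", leftover)
```

With a blocking call, the first process in the list would hold up the
other two; here they are reaped first and the contended one is retried on
a later pass.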