From: Andrea Righi
To: Michal Hocko, Vladimir Davydov
Cc: Li Zefan, Tejun Heo, Johannes Weiner, Andrew Morton, Luigi Semenzato,
    "Rafael J . Wysocki", cgroups@vger.kernel.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org
Subject: [PATCH RFC v2 2/2] mm: memcontrol: introduce opportunistic memory reclaim
Date: Mon, 5 Oct 2020 10:13:13 +0200
Message-Id: <20201005081313.732745-3-andrea.righi@canonical.com>
In-Reply-To: <20201005081313.732745-1-andrea.righi@canonical.com>
References: <20201005081313.732745-1-andrea.righi@canonical.com>

Opportunistic memory reclaim allows user space to trigger an artificial
memory pressure condition and force the system to reclaim memory (drop
caches, swap out anonymous memory, etc.).

This feature is provided by adding a new file to each memcg:
memory.swap.reclaim.

Writing a number to this file forces a memcg to reclaim memory up to
that number of bytes ("max" means as much memory as possible). Reading
from this file returns the number of bytes reclaimed in the last
opportunistic memory reclaim attempt.

Memory reclaim can be interrupted by sending a signal to the process
that is writing to memory.swap.reclaim (e.g., to set a timeout for the
whole memory reclaim run).

Signed-off-by: Andrea Righi
---
 Documentation/admin-guide/cgroup-v2.rst | 18 ++++++++
 include/linux/memcontrol.h              |  4 ++
 mm/memcontrol.c                         | 59 +++++++++++++++++++++++++
 3 files changed, 81 insertions(+)

diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index baa07b30845e..2850a5cb4b1e 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -1409,6 +1409,24 @@ PAGE_SIZE multiple when read back.
 	Swap usage hard limit. If a cgroup's swap usage reaches this
 	limit, anonymous memory of the cgroup will not be swapped out.
 
+  memory.swap.reclaim
+	A read-write single value file that can be used to trigger
+	opportunistic memory reclaim.
+
+	The string written to this file is the amount of memory to reclaim, in
+	bytes (the special value "max" means "as much memory as possible").
+
+	When opportunistic memory reclaim is started, the system is put
+	into an artificial memory pressure condition and memory is
+	reclaimed by dropping clean page cache pages, swapping out anonymous
+	pages, etc.
+
+	NOTE: it is possible to interrupt the memory reclaim by sending a
+	signal to the writer of this file.
+
+	Reading from memory.swap.reclaim returns the number of bytes reclaimed
+	in the last attempt.
+
   memory.swap.events
 	A read-only flat-keyed file which exists on non-root cgroups.
 	The following entries are defined. Unless specified
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index d0b036123c6a..0c90d989bdc1 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -306,6 +306,10 @@ struct mem_cgroup {
 	bool tcpmem_active;
 	int tcpmem_pressure;
 
+#ifdef CONFIG_MEMCG_SWAP
+	unsigned long nr_swap_reclaimed;
+#endif
+
 #ifdef CONFIG_MEMCG_KMEM
 	/* Index in the kmem_cache->memcg_params.memcg_caches array */
 	int kmemcg_id;
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 6877c765b8d0..b98e9bbd61b0 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -7346,6 +7346,60 @@ static int swap_events_show(struct seq_file *m, void *v)
 	return 0;
 }
 
+/*
+ * Try to reclaim some memory in the system, stop when one of the following
+ * conditions occurs:
+ *  - at least "nr_pages" have been reclaimed
+ *  - no more pages can be reclaimed
+ *  - current task explicitly interrupted by a signal (e.g., user space
+ *    timeout)
+ *
+ * @nr_pages - number of pages to be reclaimed (PAGE_COUNTER_MAX, as parsed
+ *	       from "max", means "as many pages as possible"; 0 is a no-op).
+ */
+static unsigned long
+do_mm_reclaim(struct mem_cgroup *memcg, unsigned long nr_pages)
+{
+	unsigned long nr_reclaimed = 0;
+
+	while (nr_pages > 0) {
+		unsigned long reclaimed;
+
+		if (signal_pending(current))
+			break;
+		reclaimed = __shrink_all_memory(nr_pages, memcg);
+		if (!reclaimed)
+			break;
+		nr_reclaimed += reclaimed;
+		nr_pages -= min_t(unsigned long, reclaimed, nr_pages);
+	}
+	return nr_reclaimed;
+}
+
+static ssize_t swap_reclaim_write(struct kernfs_open_file *of,
+				  char *buf, size_t nbytes, loff_t off)
+{
+	struct mem_cgroup *memcg = mem_cgroup_from_css(of_css(of));
+	unsigned long nr_to_reclaim;
+	int err;
+
+	buf = strstrip(buf);
+	err = page_counter_memparse(buf, "max", &nr_to_reclaim);
+	if (err)
+		return err;
+	memcg->nr_swap_reclaimed = do_mm_reclaim(memcg, nr_to_reclaim);
+
+	return nbytes;
+}
+
+static u64 swap_reclaim_read(struct cgroup_subsys_state *css,
+			     struct cftype *cft)
+{
+	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
+
+	return memcg->nr_swap_reclaimed << PAGE_SHIFT;
+}
+
 static struct cftype swap_files[] = {
 	{
 		.name = "swap.current",
@@ -7370,6 +7424,11 @@ static struct cftype swap_files[] = {
 		.file_offset = offsetof(struct mem_cgroup, swap_events_file),
 		.seq_show = swap_events_show,
 	},
+	{
+		.name = "swap.reclaim",
+		.write = swap_reclaim_write,
+		.read_u64 = swap_reclaim_read,
+	},
 	{ }	/* terminate */
 };
 
-- 
2.27.0
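
The termination conditions of do_mm_reclaim() can be checked in user
space without booting a kernel. What follows is a minimal sketch and not
part of the patch: struct fake_memcg, fake_shrink(), sim_reclaim() and
SIM_RECLAIM_MAX are hypothetical names that stand in for the memcg,
__shrink_all_memory(), do_mm_reclaim() and the page count parsed from
"max". The per-pass batch limit is an assumption made only so that the
loop needs more than one iteration.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the page count that "max" is parsed into. */
#define SIM_RECLAIM_MAX ULONG_MAX

/* Hypothetical model of a memcg: a pool of reclaimable pages. */
struct fake_memcg {
	unsigned long reclaimable;	/* pages that can still be reclaimed */
	unsigned long batch;		/* most pages a single pass can free */
	bool signal_pending;		/* models signal_pending(current) */
};

/* Stand-in for __shrink_all_memory(): frees at most one batch per call. */
unsigned long fake_shrink(struct fake_memcg *cg, unsigned long nr_pages)
{
	unsigned long n = nr_pages;

	if (n > cg->batch)
		n = cg->batch;
	if (n > cg->reclaimable)
		n = cg->reclaimable;
	cg->reclaimable -= n;
	return n;
}

/* Same control flow as do_mm_reclaim(), applied to the model above. */
unsigned long sim_reclaim(struct fake_memcg *cg, unsigned long nr_pages)
{
	unsigned long nr_reclaimed = 0;

	assert(cg != NULL);
	while (nr_pages > 0) {
		unsigned long reclaimed;

		if (cg->signal_pending)		/* interrupted by the user */
			break;
		reclaimed = fake_shrink(cg, nr_pages);
		if (!reclaimed)			/* nothing left to reclaim */
			break;
		nr_reclaimed += reclaimed;
		/* Never underflow if a pass frees more than requested. */
		nr_pages -= reclaimed < nr_pages ? reclaimed : nr_pages;
	}
	return nr_reclaimed;
}
```

In this model, with 100 reclaimable pages and a batch of 32,
sim_reclaim(&cg, 50) returns 50 after two passes. A request of 0 pages
returns 0 without calling the shrinker, so a request for "as much as
possible" has to be expressed as SIM_RECLAIM_MAX rather than 0.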