From: Hillf Danton <hdanton@sina.com>
To: "Paul E. McKenney"
Cc: linux-kernel@vger.kernel.org, rcu@vger.kernel.org,
 akpm@linux-foundation.org, Hillf Danton <hdanton@sina.com>,
 linux-mm <linux-mm@kvack.org>
Subject: Re: [PATCH RFC tip/core/rcu] Add shrinker to shift to fast/inefficient GP mode
Date: Thu, 7 May 2020 17:36:47 +0800
Message-Id: <20200507093647.11932-1-hdanton@sina.com>
In-Reply-To: <20200507004240.GA9156@paulmck-ThinkPad-P72>

Hello Paul

On Wed, 6 May 2020 17:42:40 Paul E. McKenney wrote:
>
> This commit adds a shrinker so as to inform RCU when memory is scarce.

A simpler hook is added in the kswapd logic so that interested parties
can subscribe to the news that memory pressure is high, and on top of
it RCU is made a subscriber by borrowing your shrinker code; I hope it
makes sense to you. Not yet included is making the hook per node, which
would help convince every reviewer that memory is really becoming
tight, and could be done without the cost of making subscribers node
aware.
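To illustrate the intended use (a hypothetical sketch, not part of the
patch below: the foo_* names are invented, and since the patch does not
EXPORT_SYMBOL the two functions, a loadable module would additionally
need that; a built-in caller would not), a third subsystem could
subscribe to the signal like this:

#include <linux/module.h>
#include <linux/mm.h>

static void foo_mph_info(void *data)
{
	/*
	 * Invoked from kswapd context with mph_lock held, so keep
	 * the work short and avoid reclaim-dependent allocations,
	 * e.g. set a flag or kick a workqueue.
	 */
	pr_info("foo: kswapd says memory pressure is high\n");
}

static struct mph_subscriber foo_mph_subscriber = {
	.info = foo_mph_info,	/* .data stays NULL; passed back to info() */
};

static int __init foo_init(void)
{
	return mph_subscribe(&foo_mph_subscriber);
}

static void __exit foo_exit(void)
{
	mph_unsubscribe(&foo_mph_subscriber);
}

module_init(foo_init);
module_exit(foo_exit);
MODULE_LICENSE("GPL");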
Hillf

--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -49,6 +49,16 @@ static inline void set_max_mapnr(unsigne
 static inline void set_max_mapnr(unsigned long limit) { }
 #endif
 
+/* subscriber of kswapd's memory_pressure_high signal */
+struct mph_subscriber {
+	struct list_head node;
+	void (*info) (void *data);
+	void *data;
+};
+
+int mph_subscribe(struct mph_subscriber *ms);
+void mph_unsubscribe(struct mph_subscriber *ms);
+
 extern atomic_long_t _totalram_pages;
 static inline unsigned long totalram_pages(void)
 {
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3536,6 +3536,40 @@ static bool kswapd_shrink_node(pg_data_t
 }
 
 /*
+ * subscribers of kswapd's signal that memory pressure is high
+ */
+static LIST_HEAD(mph_subs);
+static DEFINE_MUTEX(mph_lock);
+
+int mph_subscribe(struct mph_subscriber *ms)
+{
+	if (!ms->info)
+		return -EAGAIN;
+
+	mutex_lock(&mph_lock);
+	list_add_tail(&ms->node, &mph_subs);
+	mutex_unlock(&mph_lock);
+	return 0;
+}
+
+void mph_unsubscribe(struct mph_subscriber *ms)
+{
+	mutex_lock(&mph_lock);
+	list_del(&ms->node);
+	mutex_unlock(&mph_lock);
+}
+
+static void kswapd_bbc_mph(void)
+{
+	struct mph_subscriber *ms;
+
+	mutex_lock(&mph_lock);
+	list_for_each_entry(ms, &mph_subs, node)
+		ms->info(ms->data);
+	mutex_unlock(&mph_lock);
+}
+
+/*
  * For kswapd, balance_pgdat() will reclaim pages across a node from zones
  * that are eligible for use by the caller until at least one zone is
  * balanced.
@@ -3663,8 +3697,11 @@ restart:
	 * If we're getting trouble reclaiming, start doing writepage
	 * even in laptop mode.
	 */
-	if (sc.priority < DEF_PRIORITY - 2)
+	if (sc.priority < DEF_PRIORITY - 2) {
 		sc.may_writepage = 1;
+		if (sc.priority == DEF_PRIORITY - 3)
+			kswapd_bbc_mph();
+	}
 
	/* Call soft limit reclaim before calling shrink_node. */
 	sc.nr_scanned = 0;
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -325,6 +325,8 @@ struct rcu_state {
 	int ncpus_snap;				/* # CPUs seen last time. */
 	u8 cbovld;				/* Callback overload now? */
 	u8 cbovldnext;				/* ^        ^  next time? */
+	u8 mph;					/* mm pressure high signal from kswapd */
+	unsigned long mph_end;			/* time stamp in jiffies */
 
 	unsigned long jiffies_force_qs;		/* Time at which to invoke */
 						/*  force_quiescent_state(). */
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -52,6 +52,7 @@
 #include
 #include
 #include
+#include <linux/mm.h>
 #include
 #include
 #include
@@ -2314,8 +2315,15 @@ static void force_qs_rnp(int (*f)(struct
 	struct rcu_data *rdp;
 	struct rcu_node *rnp;
 
-	rcu_state.cbovld = rcu_state.cbovldnext;
+	rcu_state.cbovld = smp_load_acquire(&rcu_state.mph) ||
+			   rcu_state.cbovldnext;
 	rcu_state.cbovldnext = false;
+
+	if (READ_ONCE(rcu_state.mph) &&
+	    time_after(jiffies, READ_ONCE(rcu_state.mph_end))) {
+		WRITE_ONCE(rcu_state.mph, false);
+		pr_info("%s: Ending OOM-mode grace periods.\n", __func__);
+	}
 	rcu_for_each_leaf_node(rnp) {
 		cond_resched_tasks_rcu_qs();
 		mask = 0;
@@ -2643,6 +2651,20 @@ static void check_cb_ovld(struct rcu_dat
 	raw_spin_unlock_rcu_node(rnp);
 }
 
+static void rcu_mph_info(void *data)
+{
+	struct rcu_state *state = data;
+
+	WRITE_ONCE(state->mph_end, jiffies + HZ / 10);
+	smp_store_release(&state->mph, true);
+	rcu_force_quiescent_state();
+}
+
+static struct mph_subscriber rcu_mph_subscriber = {
+	.info = rcu_mph_info,
+	.data = &rcu_state,
+};
+
 /* Helper function for call_rcu() and friends.
  */
 static void __call_rcu(struct rcu_head *head, rcu_callback_t func)
@@ -4036,6 +4058,8 @@ void __init rcu_init(void)
 		qovld_calc = DEFAULT_RCU_QOVLD_MULT * qhimark;
 	else
 		qovld_calc = qovld;
+
+	mph_subscribe(&rcu_mph_subscriber);
 }
 
 #include "tree_stall.h"
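A note on the numbers, if I read the patch right: with DEF_PRIORITY
being 12, kswapd broadcasts when sc.priority has fallen to 9, the same
point at which it escalates to writepage, so roughly once per descent
of the priority loop in balance_pgdat(). Each rcu_mph_info() call
pushes rcu_state.mph_end about 100ms (HZ / 10 jiffies) into the future
and kicks a forced quiescent state, and force_qs_rnp() keeps reporting
callback overload (cbovld) until that deadline passes without a fresh
signal. RCU never unsubscribes, which looks fine given rcu_state lives
for the lifetime of the system.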