From: Roman Gushchin
Date: Sat, 2 Apr 2022 10:54:36 -0700
Subject: Re: [RFC] mm/vmscan: add periodic slab shrinker
To: Hillf Danton
Cc: MM, Matthew Wilcox, Dave Chinner, Mel Gorman, Stephen Brennan, Yu Zhao, David Hildenbrand, LKML
In-Reply-To: <20220402072103.5140-1-hdanton@sina.com>
References: <20220402072103.5140-1-hdanton@sina.com>

Hello Hillf!

Thank you for sharing it, really interesting! I'm actually working on the
same problem. No code to share yet, but here are some of my thoughts:

1) If there is "natural" memory pressure, no additional slab scanning is
needed.

2) From a power perspective it's better to scan more at once, but less often.

3) Maybe we need a feedback loop with the slab allocator: e.g. if slabs are
almost full, it makes more sense to do proactive scanning and free up some
memory, otherwise we'll end up allocating more slabs. But it's tricky.

4) If the scanning does not result in any memory reclaim, maybe we should
(temporarily) exclude the corresponding shrinker from the scanning (a rough
sketch of what I mean is below).

Thanks!
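Just to illustrate (4), a minimal, untested sketch of the kind of per-shrinker
backoff I have in mind; all names and thresholds below are made up, nothing
like this exists in the tree:

#define SHRINKER_IDLE_LIMIT	3	/* no-progress scans before backing off */
#define SHRINKER_BACKOFF_PASSES	4	/* periodic passes to skip afterwards */

struct shrinker_backoff {
	unsigned int nr_idle_scans;	/* consecutive scans that freed nothing */
	unsigned int nr_skip;		/* remaining periodic passes to skip */
};

/* Called by the periodic pass before do_shrink_slab(). */
static bool shrinker_should_skip(struct shrinker_backoff *b)
{
	if (b->nr_skip) {
		b->nr_skip--;
		return true;
	}
	return false;
}

/* Called by the periodic pass with the number of objects it freed. */
static void shrinker_account_scan(struct shrinker_backoff *b,
				  unsigned long freed)
{
	if (freed) {
		b->nr_idle_scans = 0;
		return;
	}
	if (++b->nr_idle_scans >= SHRINKER_IDLE_LIMIT) {
		b->nr_idle_scans = 0;
		b->nr_skip = SHRINKER_BACKOFF_PASSES;
	}
}

Direct reclaim and kswapd would keep scanning every shrinker as they do today;
only the periodic pass would consult the backoff state.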
> On Apr 2, 2022, at 12:21 AM, Hillf Danton wrote:
> 
> To mitigate the pain of having "several millions" of negative dentries in
> a single directory [1] for example, add a periodic slab shrinker that runs
> independently of the direct and background reclaimers, in a bid to recycle
> the slab objects that have been cold for more than 30 seconds.
> 
> Q, Why is it needed?
> A, Kswapd may take a nap as long as 30 minutes.
> 
> Add a periodic flag to shrink_control to let cache owners know this is the
> periodic shrinker, which is equivalent to the regular one running at the
> lowest reclaim priority, so they are free to take no action unless one-off
> objects are piling up.
> 
> Only for thoughts now.
> 
> Hillf
> 
> [1] https://lore.kernel.org/linux-fsdevel/20220209231406.187668-1-stephen.s.brennan@oracle.com/
> 
> --- x/include/linux/shrinker.h
> +++ y/include/linux/shrinker.h
> @@ -14,6 +14,7 @@ struct shrink_control {
> 
>  	/* current node being shrunk (for NUMA aware shrinkers) */
>  	int nid;
> +	int periodic;
> 
>  	/*
>  	 * How many objects scan_objects should scan and try to reclaim.
> --- x/mm/vmscan.c
> +++ y/mm/vmscan.c
> @@ -781,6 +781,8 @@ static unsigned long do_shrink_slab(stru
>  		scanned += shrinkctl->nr_scanned;
> 
>  		cond_resched();
> +		if (shrinkctl->periodic)
> +			break;
>  	}
> 
>  	/*
> @@ -906,7 +908,8 @@ static unsigned long shrink_slab_memcg(g
>   */
>  static unsigned long shrink_slab(gfp_t gfp_mask, int nid,
>  				 struct mem_cgroup *memcg,
> -				 int priority)
> +				 int priority,
> +				 int periodic)
>  {
>  	unsigned long ret, freed = 0;
>  	struct shrinker *shrinker;
> @@ -929,6 +932,7 @@ static unsigned long shrink_slab(gfp_t g
>  			.gfp_mask = gfp_mask,
>  			.nid = nid,
>  			.memcg = memcg,
> +			.periodic = periodic,
>  		};
> 
>  		ret = do_shrink_slab(&sc, shrinker, priority);
> @@ -952,7 +956,7 @@ out:
>  	return freed;
>  }
> 
> -static void drop_slab_node(int nid)
> +static void drop_slab_node(int nid, int periodic)
>  {
>  	unsigned long freed;
>  	int shift = 0;
> @@ -966,19 +970,31 @@ static void drop_slab_node(int nid)
>  		freed = 0;
>  		memcg = mem_cgroup_iter(NULL, NULL, NULL);
>  		do {
> -			freed += shrink_slab(GFP_KERNEL, nid, memcg, 0);
> +			freed += shrink_slab(GFP_KERNEL, nid, memcg, 0, periodic);
>  		} while ((memcg = mem_cgroup_iter(NULL, memcg, NULL)) != NULL);
>  	} while ((freed >> shift++) > 1);
>  }
> 
> -void drop_slab(void)
> +static void __drop_slab(int periodic)
>  {
>  	int nid;
> 
>  	for_each_online_node(nid)
> -		drop_slab_node(nid);
> +		drop_slab_node(nid, periodic);
> +}
> +
> +void drop_slab(void)
> +{
> +	__drop_slab(0);
>  }
> 
> +static void periodic_slab_shrinker_workfn(struct work_struct *work)
> +{
> +	__drop_slab(1);
> +	queue_delayed_work(system_unbound_wq, to_delayed_work(work), 30*HZ);
> +}
> +static DECLARE_DELAYED_WORK(periodic_slab_shrinker, periodic_slab_shrinker_workfn);
> +
>  static inline int is_page_cache_freeable(struct folio *folio)
>  {
>  	/*
> @@ -3098,7 +3114,7 @@ static void shrink_node_memcgs(pg_data_t
>  		shrink_lruvec(lruvec, sc);
> 
>  		shrink_slab(sc->gfp_mask, pgdat->node_id, memcg,
> -			    sc->priority);
> +			    sc->priority, 0);
> 
>  		/* Record the group's reclaim efficiency */
>  		vmpressure(sc->gfp_mask, memcg, false,
> @@ -4354,8 +4370,11 @@ static void kswapd_try_to_sleep(pg_data_
>  	 */
>  	set_pgdat_percpu_threshold(pgdat, calculate_normal_threshold);
> 
> -	if (!kthread_should_stop())
> +	if (!kthread_should_stop()) {
> +		queue_delayed_work(system_unbound_wq,
> +				   &periodic_slab_shrinker, 60*HZ);
>  		schedule();
> +	}
> 
>  	set_pgdat_percpu_threshold(pgdat, calculate_pressure_threshold);
>  } else {
> --
> 
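On the consumer side, I would expect a cache owner's ->scan_objects() callback
to use the new flag roughly like this (illustration only; my_cache_cold_objects(),
my_cache_reclaim() and the threshold are made-up names):

static unsigned long my_cache_scan(struct shrinker *shrink,
				   struct shrink_control *sc)
{
	/*
	 * On the periodic pass, only bother when a meaningful number of
	 * cold, one-off objects has piled up; otherwise return SHRINK_STOP
	 * and wait for real memory pressure.
	 */
	if (sc->periodic && my_cache_cold_objects(sc->nid) < 128)
		return SHRINK_STOP;

	return my_cache_reclaim(sc->nr_to_scan, sc->nid);
}

That also ties back into (4): a shrinker whose periodic scans keep returning
SHRINK_STOP is a natural candidate for backing off.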