Subject: Re: [v6 PATCH 03/11] mm: vmscan: use shrinker_rwsem to protect shrinker_maps allocation
From: Kirill Tkhai
To: Yang Shi, guro@fb.com, vbabka@suse.cz, shakeelb@google.com, david@fromorbit.com, hannes@cmpxchg.org, mhocko@suse.com, akpm@linux-foundation.org
Cc: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Date: Thu, 4 Feb 2021 10:24:40 +0300
In-Reply-To: <20210203172042.800474-4-shy828301@gmail.com>

On 03.02.2021 20:20, Yang Shi wrote:
> Since memcg_shrinker_map_size can only be changed while shrinker_rwsem is held
> exclusively, the read side can be protected by holding the read lock, which
> makes a dedicated mutex superfluous.
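
The contract described above is the classic rwsem pattern: the writer is
exclusive, readers are shared, so a value that only changes under the write
side is stable for anyone holding the read side. A minimal user-space sketch
of just that contract (pthreads instead of the kernel rwsem API, hypothetical
names, not the patched code):

#include <pthread.h>
#include <stdio.h>

static pthread_rwlock_t rwsem = PTHREAD_RWLOCK_INITIALIZER;
static int map_size;	/* stands in for memcg_shrinker_map_size */

static void expand_size(int new_size)	/* writer side: exclusive */
{
	pthread_rwlock_wrlock(&rwsem);	/* like down_write(&shrinker_rwsem) */
	if (new_size > map_size)
		map_size = new_size;
	pthread_rwlock_unlock(&rwsem);
}

static int read_size(void)		/* reader side: shared */
{
	int size;

	pthread_rwlock_rdlock(&rwsem);	/* like down_read(&shrinker_rwsem) */
	size = map_size;
	pthread_rwlock_unlock(&rwsem);
	return size;
}

int main(void)
{
	expand_size(64);
	printf("map size: %d\n", read_size());
	return 0;
}

In the kernel the same two roles are played by down_write()/up_write() and
down_read()/up_read() on shrinker_rwsem.
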
> 
> Kirill Tkhai suggested using the write lock, since:
> 
>   * We want the assignment to shrinker_maps to be visible to shrink_slab_memcg().
>   * shrink_slab_memcg() dereferences the map via rcu_dereference_protected(), but
>     if we used the READ lock in alloc_shrinker_maps(), the dereferencing would
>     not actually be protected.
>   * The READ lock would make alloc_shrinker_info() racy against memory allocation
>     failure: alloc_shrinker_info()->free_shrinker_info() may free the memory right
>     after shrink_slab_memcg() dereferenced it. You may say that
>     shrink_slab_memcg()->mem_cgroup_online() protects us from it? Yes, sure,
>     but this is not something we want to have to remember in the future, since
>     it erodes modularity.
> 
> And a test with a heavy paging workload didn't show that the write lock makes
> things worse.
> 
> Acked-by: Vlastimil Babka
> Signed-off-by: Yang Shi

Acked-by: Kirill Tkhai

> ---
>  mm/vmscan.c | 16 ++++++----------
>  1 file changed, 6 insertions(+), 10 deletions(-)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 96b08c79f18d..e4ddaaaeffe2 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -187,7 +187,6 @@ static DECLARE_RWSEM(shrinker_rwsem);
>  #ifdef CONFIG_MEMCG
>  
>  static int memcg_shrinker_map_size;
> -static DEFINE_MUTEX(memcg_shrinker_map_mutex);
>  
>  static void free_shrinker_map_rcu(struct rcu_head *head)
>  {
> @@ -200,8 +199,6 @@ static int expand_one_shrinker_map(struct mem_cgroup *memcg,
>  	struct memcg_shrinker_map *new, *old;
>  	int nid;
>  
> -	lockdep_assert_held(&memcg_shrinker_map_mutex);
> -
>  	for_each_node(nid) {
>  		old = rcu_dereference_protected(
>  			mem_cgroup_nodeinfo(memcg, nid)->shrinker_map, true);
> @@ -249,7 +246,7 @@ int alloc_shrinker_maps(struct mem_cgroup *memcg)
>  	if (mem_cgroup_is_root(memcg))
>  		return 0;
>  
> -	mutex_lock(&memcg_shrinker_map_mutex);
> +	down_write(&shrinker_rwsem);
>  	size = memcg_shrinker_map_size;
>  	for_each_node(nid) {
>  		map = kvzalloc_node(sizeof(*map) + size, GFP_KERNEL, nid);
> @@ -260,7 +257,7 @@ int alloc_shrinker_maps(struct mem_cgroup *memcg)
>  		}
>  		rcu_assign_pointer(memcg->nodeinfo[nid]->shrinker_map, map);
>  	}
> -	mutex_unlock(&memcg_shrinker_map_mutex);
> +	up_write(&shrinker_rwsem);
>  
>  	return ret;
>  }
> @@ -275,9 +272,8 @@ static int expand_shrinker_maps(int new_id)
>  	if (size <= old_size)
>  		return 0;
>  
> -	mutex_lock(&memcg_shrinker_map_mutex);
>  	if (!root_mem_cgroup)
> -		goto unlock;
> +		goto out;
>  
>  	memcg = mem_cgroup_iter(NULL, NULL, NULL);
>  	do {
> @@ -286,13 +282,13 @@ static int expand_shrinker_maps(int new_id)
>  		ret = expand_one_shrinker_map(memcg, size, old_size);
>  		if (ret) {
>  			mem_cgroup_iter_break(NULL, memcg);
> -			goto unlock;
> +			goto out;
>  		}
>  	} while ((memcg = mem_cgroup_iter(NULL, memcg, NULL)) != NULL);
> -unlock:
> +out:
>  	if (!ret)
>  		memcg_shrinker_map_size = size;
> -	mutex_unlock(&memcg_shrinker_map_mutex);
> +
>  	return ret;
>  }
> 
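
To make the allocation-failure bullet above concrete, here is a hedged
user-space sketch of the two sides (pthreads and hypothetical names, not the
kernel code, which uses shrinker_rwsem plus RCU). With only a read lock on the
allocation side, the error-path cleanup could run concurrently with a reader;
taking the write side makes the allocator and readers mutually exclusive:

#include <pthread.h>
#include <stdlib.h>

static pthread_rwlock_t rwsem = PTHREAD_RWLOCK_INITIALIZER;
static unsigned long *map;	/* stands in for a memcg's shrinker_map */

/* analogue of alloc_shrinker_maps() with this patch applied */
static int alloc_map(size_t words)
{
	int ret = 0;

	pthread_rwlock_wrlock(&rwsem);	/* an rdlock here would be the race */
	map = calloc(words, sizeof(*map));
	if (!map)
		ret = -1;	/* error path: if we only held the read lock,
				 * freeing partially set-up maps here could
				 * race with a concurrent scan_map() */
	pthread_rwlock_unlock(&rwsem);
	return ret;
}

/* analogue of shrink_slab_memcg(): runs under the shared side */
static int scan_map(size_t idx)
{
	int bit = 0;

	pthread_rwlock_rdlock(&rwsem);
	if (map)
		bit = !!(map[idx / (8 * sizeof(*map))] &
			 (1UL << (idx % (8 * sizeof(*map)))));
	pthread_rwlock_unlock(&rwsem);
	return bit;
}

int main(void)
{
	if (alloc_map(4))
		return 1;
	return scan_map(0);	/* freshly zeroed map: returns 0 */
}

Because alloc_map() holds the write side, scan_map() can never be in the
middle of using a map that the allocator's error path is about to free, which
is the property the changelog argues for.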