From: Dan Schatzberg
Cc: Johannes Weiner, Shakeel Butt, Roman Gushchin, Jens Axboe,
	Alexander Viro, Jan Kara, Amir Goldstein, Tejun Heo, Li Zefan,
	Michal Hocko, Vladimir Davydov, Andrew Morton, Hugh Dickins,
	Chris Down, Yang Shi, Ingo Molnar, "Peter Zijlstra (Intel)",
	Mathieu Desnoyers, Andrea Arcangeli, Dan Schatzberg,
	Thomas Gleixner,
	linux-block@vger.kernel.org (open list:BLOCK LAYER),
	linux-kernel@vger.kernel.org (open list),
	linux-fsdevel@vger.kernel.org (open list:FILESYSTEMS (VFS and infrastructure)),
	cgroups@vger.kernel.org (open list:CONTROL GROUP (CGROUP)),
	linux-mm@kvack.org (open list:CONTROL GROUP - MEMORY RESOURCE CONTROLLER (MEMCG))
Subject: [PATCH v5 2/4] mm: support nesting memalloc_use_memcg()
Date: Tue, 28 Apr 2020 12:13:48 -0400
Message-Id: <20200428161355.6377-3-schatzberg.dan@gmail.com>
In-Reply-To: <20200428161355.6377-1-schatzberg.dan@gmail.com>
References: <20200428161355.6377-1-schatzberg.dan@gmail.com>

From: Johannes Weiner

The memalloc_use_memcg() function to override the default memcg
accounting context currently doesn't nest. But the patches to make the
loop driver cgroup-aware will end up nesting:

[ 98.137605]  alloc_page_buffers+0x210/0x288
[ 98.141799]  __getblk_gfp+0x1d4/0x400
[ 98.145475]  ext4_read_block_bitmap_nowait+0x148/0xbc8
[ 98.150628]  ext4_mb_init_cache+0x25c/0x9b0
[ 98.154821]  ext4_mb_init_group+0x270/0x390
[ 98.159014]  ext4_mb_good_group+0x264/0x270
[ 98.163208]  ext4_mb_regular_allocator+0x480/0x798
[ 98.168011]  ext4_mb_new_blocks+0x958/0x10f8
[ 98.172294]  ext4_ext_map_blocks+0xec8/0x1618
[ 98.176660]  ext4_map_blocks+0x1b8/0x8a0
[ 98.180592]  ext4_writepages+0x830/0xf10
[ 98.184523]  do_writepages+0xb4/0x198
[ 98.188195]  __filemap_fdatawrite_range+0x170/0x1c8
[ 98.193086]  filemap_write_and_wait_range+0x40/0xb0
[ 98.197974]  ext4_punch_hole+0x4a4/0x660
[ 98.201907]  ext4_fallocate+0x294/0x1190
[ 98.205839]  loop_process_work+0x690/0x1100
[ 98.210032]  loop_workfn+0x2c/0x110
[ 98.213529]  process_one_work+0x3e0/0x648
[ 98.217546]  worker_thread+0x70/0x670
[ 98.221217]  kthread+0x1b8/0x1c0
[ 98.224452]  ret_from_fork+0x10/0x18

where loop_process_work() sets the memcg override to the memcg that
submitted the IO request, and alloc_page_buffers() sets the override to
the memcg that instantiated the cache page, which may differ.

Make memalloc_use_memcg() return the old memcg and convert existing
users to a stacking model. Delete the unused memalloc_unuse_memcg().
Signed-off-by: Johannes Weiner
Reviewed-by: Shakeel Butt
Acked-by: Roman Gushchin
---
 fs/buffer.c                          |  6 +++---
 fs/notify/fanotify/fanotify.c        |  5 +++--
 fs/notify/inotify/inotify_fsnotify.c |  5 +++--
 include/linux/sched/mm.h             | 28 +++++++++-------------------
 4 files changed, 18 insertions(+), 26 deletions(-)

diff --git a/fs/buffer.c b/fs/buffer.c
index 599a0bf7257b..b4e99c6b52ec 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -851,13 +851,13 @@ struct buffer_head *alloc_page_buffers(struct page *page, unsigned long size,
 	struct buffer_head *bh, *head;
 	gfp_t gfp = GFP_NOFS | __GFP_ACCOUNT;
 	long offset;
-	struct mem_cgroup *memcg;
+	struct mem_cgroup *memcg, *old_memcg;
 
 	if (retry)
 		gfp |= __GFP_NOFAIL;
 
 	memcg = get_mem_cgroup_from_page(page);
-	memalloc_use_memcg(memcg);
+	old_memcg = memalloc_use_memcg(memcg);
 
 	head = NULL;
 	offset = PAGE_SIZE;
@@ -876,7 +876,7 @@ struct buffer_head *alloc_page_buffers(struct page *page, unsigned long size,
 		set_bh_page(bh, page, offset);
 	}
 out:
-	memalloc_unuse_memcg();
+	memalloc_use_memcg(old_memcg);
 	mem_cgroup_put(memcg);
 	return head;
 /*
diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index 5435a40f82be..54c787cd6efb 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -353,6 +353,7 @@ struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
 	gfp_t gfp = GFP_KERNEL_ACCOUNT;
 	struct inode *id = fanotify_fid_inode(inode, mask, data, data_type);
 	const struct path *path = fsnotify_data_path(data, data_type);
+	struct mem_cgroup *oldmemcg;
 
 	/*
 	 * For queues with unlimited length lost events are not expected and
@@ -366,7 +367,7 @@ struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
 		gfp |= __GFP_RETRY_MAYFAIL;
 
 	/* Whoever is interested in the event, pays for the allocation. */
-	memalloc_use_memcg(group->memcg);
+	oldmemcg = memalloc_use_memcg(group->memcg);
 
 	if (fanotify_is_perm_event(mask)) {
 		struct fanotify_perm_event *pevent;
@@ -451,7 +452,7 @@ struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
 		}
 	}
 out:
-	memalloc_unuse_memcg();
+	memalloc_use_memcg(oldmemcg);
 	return event;
 }
 
diff --git a/fs/notify/inotify/inotify_fsnotify.c b/fs/notify/inotify/inotify_fsnotify.c
index 2ebc89047153..d27c6e83cea6 100644
--- a/fs/notify/inotify/inotify_fsnotify.c
+++ b/fs/notify/inotify/inotify_fsnotify.c
@@ -69,6 +69,7 @@ int inotify_handle_event(struct fsnotify_group *group,
 	int ret;
 	int len = 0;
 	int alloc_len = sizeof(struct inotify_event_info);
+	struct mem_cgroup *oldmemcg;
 
 	if (WARN_ON(fsnotify_iter_vfsmount_mark(iter_info)))
 		return 0;
@@ -93,9 +94,9 @@ int inotify_handle_event(struct fsnotify_group *group,
 	 * trigger OOM killer in the target monitoring memcg as it may have
 	 * security repercussion.
 	 */
-	memalloc_use_memcg(group->memcg);
+	oldmemcg = memalloc_use_memcg(group->memcg);
 	event = kmalloc(alloc_len, GFP_KERNEL_ACCOUNT | __GFP_RETRY_MAYFAIL);
-	memalloc_unuse_memcg();
+	memalloc_use_memcg(oldmemcg);
 
 	if (unlikely(!event)) {
 		/*
diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
index c49257a3b510..95e8bfb0cab1 100644
--- a/include/linux/sched/mm.h
+++ b/include/linux/sched/mm.h
@@ -316,31 +316,21 @@ static inline void memalloc_nocma_restore(unsigned int flags)
  * __GFP_ACCOUNT allocations till the end of the scope will be charged to the
  * given memcg.
  *
- * NOTE: This function is not nesting safe.
+ * NOTE: This function can nest. Users must save the return value and
+ * reset the previous value after their own charging scope is over
  */
-static inline void memalloc_use_memcg(struct mem_cgroup *memcg)
+static inline struct mem_cgroup *
+memalloc_use_memcg(struct mem_cgroup *memcg)
 {
-	WARN_ON_ONCE(current->active_memcg);
+	struct mem_cgroup *old = current->active_memcg;
 	current->active_memcg = memcg;
-}
-
-/**
- * memalloc_unuse_memcg - Ends the remote memcg charging scope.
- *
- * This function marks the end of the remote memcg charging scope started by
- * memalloc_use_memcg().
- */
-static inline void memalloc_unuse_memcg(void)
-{
-	current->active_memcg = NULL;
+	return old;
 }
 #else
-static inline void memalloc_use_memcg(struct mem_cgroup *memcg)
-{
-}
-
-static inline void memalloc_unuse_memcg(void)
+static inline struct mem_cgroup *
+memalloc_use_memcg(struct mem_cgroup *memcg)
 {
+	return NULL;
 }
 #endif
 
-- 
2.24.1
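
For illustration, a minimal sketch (not taken from the patch) of the
stacking pattern the new return value enables: an outer scope charges to
the cgroup that submitted the IO while an inner scope, as in
alloc_page_buffers() above, temporarily charges to the page's memcg.
The loop_issue_io() wrapper and its arguments are assumed for the
example; get_mem_cgroup_from_page() and mem_cgroup_put() are used as in
the fs/buffer.c hunk.

/*
 * Illustrative sketch only -- not part of the patch. loop_issue_io()
 * and its arguments are hypothetical.
 */
#include <linux/memcontrol.h>
#include <linux/sched/mm.h>

static void loop_issue_io(struct mem_cgroup *io_memcg, struct page *page)
{
	struct mem_cgroup *old_outer, *old_inner, *page_memcg;

	/* Outer scope: charge to the cgroup that submitted the IO. */
	old_outer = memalloc_use_memcg(io_memcg);

	/* Inner scope, as in alloc_page_buffers(): charge to the page's memcg. */
	page_memcg = get_mem_cgroup_from_page(page);
	old_inner = memalloc_use_memcg(page_memcg);
	/* ... __GFP_ACCOUNT allocations here are charged to page_memcg ... */
	memalloc_use_memcg(old_inner);	/* restores io_memcg */
	mem_cgroup_put(page_memcg);

	/* ... __GFP_ACCOUNT allocations here are charged to io_memcg ... */
	memalloc_use_memcg(old_outer);	/* restores whatever was active before */
}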