From: Roman Gushchin <guro@fb.com>
To: Dan Schatzberg <schatzberg.dan@gmail.com>
Cc: no To-header on input <";"@kvack.org>,
Jens Axboe <axboe@kernel.dk>,
Alexander Viro <viro@zeniv.linux.org.uk>, Jan Kara <jack@suse.cz>,
Amir Goldstein <amir73il@gmail.com>, Tejun Heo <tj@kernel.org>,
Li Zefan <lizefan@huawei.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Michal Hocko <mhocko@kernel.org>,
Vladimir Davydov <vdavydov.dev@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
Hugh Dickins <hughd@google.com>,
Shakeel Butt <shakeelb@google.com>,
Chris Down <chris@chrisdown.name>,
Yang Shi <yang.shi@linux.alibaba.com>,
Ingo Molnar <mingo@kernel.org>,
"Peter Zijlstra (Intel)" <peterz@infradead.org>,
Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Andrea Arcangeli <aarcange@redhat.com>,
Thomas Gleixner <tglx@linutronix.de>,
"open list:BLOCK LAYER" <linux-block@vger.kernel.org>,
open list <linux-kernel@vger.kernel.org>,
"open list:FILESYSTEMS (VFS and infrastructure)"
<linux-fsdevel@vger.kernel.org>,
"open list:CONTROL GROUP (CGROUP)" <cgroups@vger.kernel.org>,
"open list:CONTROL GROUP - MEMORY RESOURCE CONTROLLER (MEMCG)"
<linux-mm@kvack.org>
Subject: Re: [PATCH 2/4] mm: support nesting memalloc_use_memcg()
Date: Tue, 21 Apr 2020 18:13:07 -0700
Message-ID: <20200422011307.GA47525@carbon.DHCP.thefacebook.com>
In-Reply-To: <20200420223936.6773-3-schatzberg.dan@gmail.com>
On Mon, Apr 20, 2020 at 06:39:30PM -0400, Dan Schatzberg wrote:
> The memalloc_use_memcg() function to override the default memcg
> accounting context currently doesn't nest. But the patches to make the
> loop driver cgroup-aware will end up nesting:
>
> [ 98.137605] alloc_page_buffers+0x210/0x288
> [ 98.141799] __getblk_gfp+0x1d4/0x400
> [ 98.145475] ext4_read_block_bitmap_nowait+0x148/0xbc8
> [ 98.150628] ext4_mb_init_cache+0x25c/0x9b0
> [ 98.154821] ext4_mb_init_group+0x270/0x390
> [ 98.159014] ext4_mb_good_group+0x264/0x270
> [ 98.163208] ext4_mb_regular_allocator+0x480/0x798
> [ 98.168011] ext4_mb_new_blocks+0x958/0x10f8
> [ 98.172294] ext4_ext_map_blocks+0xec8/0x1618
> [ 98.176660] ext4_map_blocks+0x1b8/0x8a0
> [ 98.180592] ext4_writepages+0x830/0xf10
> [ 98.184523] do_writepages+0xb4/0x198
> [ 98.188195] __filemap_fdatawrite_range+0x170/0x1c8
> [ 98.193086] filemap_write_and_wait_range+0x40/0xb0
> [ 98.197974] ext4_punch_hole+0x4a4/0x660
> [ 98.201907] ext4_fallocate+0x294/0x1190
> [ 98.205839] loop_process_work+0x690/0x1100
> [ 98.210032] loop_workfn+0x2c/0x110
> [ 98.213529] process_one_work+0x3e0/0x648
> [ 98.217546] worker_thread+0x70/0x670
> [ 98.221217] kthread+0x1b8/0x1c0
> [ 98.224452] ret_from_fork+0x10/0x18
>
> where loop_process_work() sets the memcg override to the memcg that
> submitted the IO request, and alloc_page_buffers() sets the override
> to the memcg that instantiated the cache page, which may differ.
>
> Make memalloc_use_memcg() return the old memcg and convert existing
> users to a stacking model. Delete the unused memalloc_unuse_memcg().
>
> Signed-off-by: Dan Schatzberg <schatzberg.dan@gmail.com>
Acked-by: Roman Gushchin <guro@fb.com>
One small nit below.
Thanks!
> ---
> fs/buffer.c | 6 +++---
> fs/notify/fanotify/fanotify.c | 5 +++--
> fs/notify/inotify/inotify_fsnotify.c | 5 +++--
> include/linux/sched/mm.h | 28 +++++++++-------------------
> 4 files changed, 18 insertions(+), 26 deletions(-)
>
> diff --git a/fs/buffer.c b/fs/buffer.c
> index 599a0bf7257b..e39e05985323 100644
> --- a/fs/buffer.c
> +++ b/fs/buffer.c
> @@ -851,13 +851,13 @@ struct buffer_head *alloc_page_buffers(struct page *page, unsigned long size,
> struct buffer_head *bh, *head;
> gfp_t gfp = GFP_NOFS | __GFP_ACCOUNT;
> long offset;
> - struct mem_cgroup *memcg;
> + struct mem_cgroup *memcg, *oldmemcg;
I'd rename it to old_memcg, here and in the fanotify and inotify hunks below.
>
> if (retry)
> gfp |= __GFP_NOFAIL;
>
> memcg = get_mem_cgroup_from_page(page);
> - memalloc_use_memcg(memcg);
> + oldmemcg = memalloc_use_memcg(memcg);
>
> head = NULL;
> offset = PAGE_SIZE;
> @@ -876,7 +876,7 @@ struct buffer_head *alloc_page_buffers(struct page *page, unsigned long size,
> set_bh_page(bh, page, offset);
> }
> out:
> - memalloc_unuse_memcg();
> + memalloc_use_memcg(oldmemcg);
> mem_cgroup_put(memcg);
> return head;
> /*
> diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
> index 5435a40f82be..54c787cd6efb 100644
> --- a/fs/notify/fanotify/fanotify.c
> +++ b/fs/notify/fanotify/fanotify.c
> @@ -353,6 +353,7 @@ struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
> gfp_t gfp = GFP_KERNEL_ACCOUNT;
> struct inode *id = fanotify_fid_inode(inode, mask, data, data_type);
> const struct path *path = fsnotify_data_path(data, data_type);
> + struct mem_cgroup *oldmemcg;
>
> /*
> * For queues with unlimited length lost events are not expected and
> @@ -366,7 +367,7 @@ struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
> gfp |= __GFP_RETRY_MAYFAIL;
>
> /* Whoever is interested in the event, pays for the allocation. */
> - memalloc_use_memcg(group->memcg);
> + oldmemcg = memalloc_use_memcg(group->memcg);
>
> if (fanotify_is_perm_event(mask)) {
> struct fanotify_perm_event *pevent;
> @@ -451,7 +452,7 @@ struct fanotify_event *fanotify_alloc_event(struct fsnotify_group *group,
> }
> }
> out:
> - memalloc_unuse_memcg();
> + memalloc_use_memcg(oldmemcg);
> return event;
> }
>
> diff --git a/fs/notify/inotify/inotify_fsnotify.c b/fs/notify/inotify/inotify_fsnotify.c
> index 2ebc89047153..d27c6e83cea6 100644
> --- a/fs/notify/inotify/inotify_fsnotify.c
> +++ b/fs/notify/inotify/inotify_fsnotify.c
> @@ -69,6 +69,7 @@ int inotify_handle_event(struct fsnotify_group *group,
> int ret;
> int len = 0;
> int alloc_len = sizeof(struct inotify_event_info);
> + struct mem_cgroup *oldmemcg;
>
> if (WARN_ON(fsnotify_iter_vfsmount_mark(iter_info)))
> return 0;
> @@ -93,9 +94,9 @@ int inotify_handle_event(struct fsnotify_group *group,
> * trigger OOM killer in the target monitoring memcg as it may have
> * security repercussion.
> */
> - memalloc_use_memcg(group->memcg);
> + oldmemcg = memalloc_use_memcg(group->memcg);
> event = kmalloc(alloc_len, GFP_KERNEL_ACCOUNT | __GFP_RETRY_MAYFAIL);
> - memalloc_unuse_memcg();
> + memalloc_use_memcg(oldmemcg);
>
> if (unlikely(!event)) {
> /*
> diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
> index c49257a3b510..95e8bfb0cab1 100644
> --- a/include/linux/sched/mm.h
> +++ b/include/linux/sched/mm.h
> @@ -316,31 +316,21 @@ static inline void memalloc_nocma_restore(unsigned int flags)
> * __GFP_ACCOUNT allocations till the end of the scope will be charged to the
> * given memcg.
> *
> - * NOTE: This function is not nesting safe.
> + * NOTE: This function can nest. Users must save the return value and
> + * reset the previous value after their own charging scope is over
> */
> -static inline void memalloc_use_memcg(struct mem_cgroup *memcg)
> +static inline struct mem_cgroup *
> +memalloc_use_memcg(struct mem_cgroup *memcg)
> {
> - WARN_ON_ONCE(current->active_memcg);
> + struct mem_cgroup *old = current->active_memcg;
> current->active_memcg = memcg;
> -}
> -
> -/**
> - * memalloc_unuse_memcg - Ends the remote memcg charging scope.
> - *
> - * This function marks the end of the remote memcg charging scope started by
> - * memalloc_use_memcg().
> - */
> -static inline void memalloc_unuse_memcg(void)
> -{
> - current->active_memcg = NULL;
> + return old;
> }
> #else
> -static inline void memalloc_use_memcg(struct mem_cgroup *memcg)
> -{
> -}
> -
> -static inline void memalloc_unuse_memcg(void)
> +static inline struct mem_cgroup *
> +memalloc_use_memcg(struct mem_cgroup *memcg)
> {
> + return NULL;
> }
> #endif
>
> --
> 2.24.1
>
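
For readers following the thread, here is a minimal userspace sketch of the
stacking model the patch describes. It is not kernel code: struct task,
current_task, use_memcg(), unuse_memcg_old() and active_after_inner() are
hypothetical stand-ins for current, memalloc_use_memcg() and the removed
memalloc_unuse_memcg(), and the WARN_ON_ONCE() from the old helper is left
out. It only illustrates why clearing the override to NULL loses the outer
scope, while save-and-restore keeps it.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the kernel types; only what the sketch needs. */
struct mem_cgroup { const char *name; };
struct task { struct mem_cgroup *active_memcg; };

static struct task current_task;	/* models `current` */

/* New-style helper: install memcg and return the previous override. */
static struct mem_cgroup *use_memcg(struct mem_cgroup *memcg)
{
	struct mem_cgroup *old = current_task.active_memcg;

	current_task.active_memcg = memcg;
	return old;
}

/* Old-style scope end: unconditionally clears the override. */
static void unuse_memcg_old(void)
{
	current_task.active_memcg = NULL;
}

/* Inner scope written against the old API (think alloc_page_buffers()). */
static void inner_old(struct mem_cgroup *page_memcg)
{
	use_memcg(page_memcg);
	/* ... allocations here are charged to page_memcg ... */
	unuse_memcg_old();		/* also wipes the caller's override */
}

/* Inner scope converted to the stacking model. */
static void inner_stacked(struct mem_cgroup *page_memcg)
{
	struct mem_cgroup *old_memcg = use_memcg(page_memcg);

	assert(current_task.active_memcg == page_memcg);
	/* ... allocations here are charged to page_memcg ... */
	use_memcg(old_memcg);		/* put the caller's override back */
}

/*
 * Outer scope (think loop_process_work()): set the submitter's memcg,
 * run an inner scope, and report which memcg is active afterwards.
 */
static struct mem_cgroup *active_after_inner(struct mem_cgroup *submitter,
					     struct mem_cgroup *page_memcg,
					     int stacked)
{
	struct mem_cgroup *old_memcg = use_memcg(submitter);
	struct mem_cgroup *seen;

	if (stacked)
		inner_stacked(page_memcg);
	else
		inner_old(page_memcg);

	seen = current_task.active_memcg;
	use_memcg(old_memcg);
	return seen;
}
```

With the stacked inner scope, active_after_inner() returns the submitter's
memcg. With the old-style inner scope it returns NULL, so any later
__GFP_ACCOUNT allocation in the outer scope would no longer be charged to
the submitter.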