mm-commits Archive on lore.kernel.org
From: Andrew Morton <akpm@linux-foundation.org>
To: akpm@linux-foundation.org, dschatzberg@fb.com, guro@fb.com,
	hannes@cmpxchg.org, linux-mm@kvack.org,
	mm-commits@vger.kernel.org, shakeelb@google.com,
	torvalds@linux-foundation.org
Subject: [patch 02/40] mm, memcg: rework remote charging API to support nesting
Date: Sat, 17 Oct 2020 16:13:40 -0700
Message-ID: <20201017231340.JBcWsleuj%akpm@linux-foundation.org>
In-Reply-To: <20201017161314.88890b87fae7446ccc13c902@linux-foundation.org>

From: Roman Gushchin <guro@fb.com>
Subject: mm, memcg: rework remote charging API to support nesting

Currently the remote memcg charging API consists of two functions:
memalloc_use_memcg() and memalloc_unuse_memcg(), which set and clear the
active memcg value that overrides the memcg of the current task.

  memalloc_use_memcg(target_memcg);
  <...>
  memalloc_unuse_memcg();

This works for allocations performed from a normal context.  However,
calling it from an interrupt context, or simply nesting two remote
charging blocks, leads to incorrect accounting: on exit from the inner
block the active memcg is cleared instead of being restored.

  memalloc_use_memcg(target_memcg);

  memalloc_use_memcg(target_memcg_2);
    <...>
    memalloc_unuse_memcg();

    Error: allocations here are charged to the memcg of the current
    process instead of target_memcg.

  memalloc_unuse_memcg();
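
The failure mode above can be reproduced in a small userspace sketch.
This is an illustration only, not kernel code: struct task, current_task
and old_api_after_inner_block() are hypothetical stand-ins, and the two
API functions mirror the pre-patch bodies minus the WARN_ON_ONCE().

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types (assumption, not the real ones). */
struct mem_cgroup { const char *name; };
struct task { struct mem_cgroup *active_memcg; };

/* Models `current`; hypothetical name. */
struct task current_task;

/* Old API: set the active memcg unconditionally. */
void memalloc_use_memcg(struct mem_cgroup *memcg)
{
	current_task.active_memcg = memcg;
}

/* Old API: always clear, with no memory of the previous value. */
void memalloc_unuse_memcg(void)
{
	current_task.active_memcg = NULL;
}

/*
 * Nest two charging blocks and return the memcg that is active after
 * the inner block exits but before the outer one does.
 */
struct mem_cgroup *old_api_after_inner_block(struct mem_cgroup *outer,
					     struct mem_cgroup *inner)
{
	struct mem_cgroup *seen;

	assert(outer && inner);

	memalloc_use_memcg(outer);
	memalloc_use_memcg(inner);
	memalloc_unuse_memcg();		/* clears instead of restoring outer */
	seen = current_task.active_memcg;
	memalloc_unuse_memcg();

	return seen;
}
```

With two distinct memcgs the helper returns NULL rather than the outer
memcg, which is the case where allocations fall back to the memcg of
the current process.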

This patch reworks the remote charging API by replacing both functions
with a single one: struct mem_cgroup *set_active_memcg(struct mem_cgroup
*memcg), which sets the new value and returns the old one.  So a remote
charging block will look like:

  old_memcg = set_active_memcg(target_memcg);
  <...>
  set_active_memcg(old_memcg);
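
To show why the save/restore pattern nests, here is a minimal userspace
sketch.  It is an illustration only: struct task, current_task and
new_api_after_inner_block() are hypothetical stand-ins for the kernel's
`current` and its callers; set_active_memcg() follows the body added to
include/linux/sched/mm.h by this patch.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel types (assumption, not the real ones). */
struct mem_cgroup { const char *name; };
struct task { struct mem_cgroup *active_memcg; };

/* Models `current`; hypothetical name. */
struct task current_task;

/* New API: install memcg and hand the previous value back to the caller. */
struct mem_cgroup *set_active_memcg(struct mem_cgroup *memcg)
{
	struct mem_cgroup *old = current_task.active_memcg;

	current_task.active_memcg = memcg;
	return old;
}

/*
 * Nest two charging blocks and return the memcg that is active after
 * the inner block exits but before the outer one does.
 */
struct mem_cgroup *new_api_after_inner_block(struct mem_cgroup *outer,
					     struct mem_cgroup *inner)
{
	struct mem_cgroup *old_outer, *old_inner, *seen;

	assert(outer && inner);

	old_outer = set_active_memcg(outer);
	old_inner = set_active_memcg(inner);	/* old_inner == outer */
	set_active_memcg(old_inner);		/* restores outer */
	seen = current_task.active_memcg;
	set_active_memcg(old_outer);		/* restores the initial value */

	return seen;
}
```

Here the helper returns the outer memcg, and the task ends up with
whatever value was active before the outer block started.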

This patch is heavily based on the patch by Johannes Weiner, which can be
found here: https://lkml.org/lkml/2020/5/28/806 .

Link: https://lkml.kernel.org/r/20200821212056.3769116-1-guro@fb.com
Signed-off-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Dan Schatzberg <dschatzberg@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/buffer.c                          |    6 ++---
 fs/notify/fanotify/fanotify.c        |    5 ++--
 fs/notify/inotify/inotify_fsnotify.c |    5 ++--
 include/linux/sched/mm.h             |   30 ++++++++-----------------
 mm/memcontrol.c                      |    6 ++---
 5 files changed, 22 insertions(+), 30 deletions(-)

--- a/fs/buffer.c~mm-rework-remote-memcg-charging-api-to-support-nesting
+++ a/fs/buffer.c
@@ -842,13 +842,13 @@ struct buffer_head *alloc_page_buffers(s
 	struct buffer_head *bh, *head;
 	gfp_t gfp = GFP_NOFS | __GFP_ACCOUNT;
 	long offset;
-	struct mem_cgroup *memcg;
+	struct mem_cgroup *memcg, *old_memcg;
 
 	if (retry)
 		gfp |= __GFP_NOFAIL;
 
 	memcg = get_mem_cgroup_from_page(page);
-	memalloc_use_memcg(memcg);
+	old_memcg = set_active_memcg(memcg);
 
 	head = NULL;
 	offset = PAGE_SIZE;
@@ -867,7 +867,7 @@ struct buffer_head *alloc_page_buffers(s
 		set_bh_page(bh, page, offset);
 	}
 out:
-	memalloc_unuse_memcg();
+	set_active_memcg(old_memcg);
 	mem_cgroup_put(memcg);
 	return head;
 /*
--- a/fs/notify/fanotify/fanotify.c~mm-rework-remote-memcg-charging-api-to-support-nesting
+++ a/fs/notify/fanotify/fanotify.c
@@ -531,6 +531,7 @@ static struct fanotify_event *fanotify_a
 	struct inode *dirid = fanotify_dfid_inode(mask, data, data_type, dir);
 	const struct path *path = fsnotify_data_path(data, data_type);
 	unsigned int fid_mode = FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS);
+	struct mem_cgroup *old_memcg;
 	struct inode *child = NULL;
 	bool name_event = false;
 
@@ -580,7 +581,7 @@ static struct fanotify_event *fanotify_a
 		gfp |= __GFP_RETRY_MAYFAIL;
 
 	/* Whoever is interested in the event, pays for the allocation. */
-	memalloc_use_memcg(group->memcg);
+	old_memcg = set_active_memcg(group->memcg);
 
 	if (fanotify_is_perm_event(mask)) {
 		event = fanotify_alloc_perm_event(path, gfp);
@@ -608,7 +609,7 @@ static struct fanotify_event *fanotify_a
 		event->pid = get_pid(task_tgid(current));
 
 out:
-	memalloc_unuse_memcg();
+	set_active_memcg(old_memcg);
 	return event;
 }
 
--- a/fs/notify/inotify/inotify_fsnotify.c~mm-rework-remote-memcg-charging-api-to-support-nesting
+++ a/fs/notify/inotify/inotify_fsnotify.c
@@ -66,6 +66,7 @@ static int inotify_one_event(struct fsno
 	int ret;
 	int len = 0;
 	int alloc_len = sizeof(struct inotify_event_info);
+	struct mem_cgroup *old_memcg;
 
 	if ((inode_mark->mask & FS_EXCL_UNLINK) &&
 	    path && d_unlinked(path->dentry))
@@ -87,9 +88,9 @@ static int inotify_one_event(struct fsno
 	 * trigger OOM killer in the target monitoring memcg as it may have
 	 * security repercussion.
 	 */
-	memalloc_use_memcg(group->memcg);
+	old_memcg = set_active_memcg(group->memcg);
 	event = kmalloc(alloc_len, GFP_KERNEL_ACCOUNT | __GFP_RETRY_MAYFAIL);
-	memalloc_unuse_memcg();
+	set_active_memcg(old_memcg);
 
 	if (unlikely(!event)) {
 		/*
--- a/include/linux/sched/mm.h~mm-rework-remote-memcg-charging-api-to-support-nesting
+++ a/include/linux/sched/mm.h
@@ -280,38 +280,28 @@ static inline void memalloc_nocma_restor
 
 #ifdef CONFIG_MEMCG
 /**
- * memalloc_use_memcg - Starts the remote memcg charging scope.
+ * set_active_memcg - Starts the remote memcg charging scope.
  * @memcg: memcg to charge.
  *
  * This function marks the beginning of the remote memcg charging scope. All the
  * __GFP_ACCOUNT allocations till the end of the scope will be charged to the
  * given memcg.
  *
- * NOTE: This function is not nesting safe.
+ * NOTE: This function can nest. Users must save the return value and
+ * reset the previous value after their own charging scope is over.
  */
-static inline void memalloc_use_memcg(struct mem_cgroup *memcg)
+static inline struct mem_cgroup *
+set_active_memcg(struct mem_cgroup *memcg)
 {
-	WARN_ON_ONCE(current->active_memcg);
+	struct mem_cgroup *old = current->active_memcg;
 	current->active_memcg = memcg;
-}
-
-/**
- * memalloc_unuse_memcg - Ends the remote memcg charging scope.
- *
- * This function marks the end of the remote memcg charging scope started by
- * memalloc_use_memcg().
- */
-static inline void memalloc_unuse_memcg(void)
-{
-	current->active_memcg = NULL;
+	return old;
 }
 #else
-static inline void memalloc_use_memcg(struct mem_cgroup *memcg)
-{
-}
-
-static inline void memalloc_unuse_memcg(void)
+static inline struct mem_cgroup *
+set_active_memcg(struct mem_cgroup *memcg)
 {
+	return NULL;
 }
 #endif
 
--- a/mm/memcontrol.c~mm-rework-remote-memcg-charging-api-to-support-nesting
+++ a/mm/memcontrol.c
@@ -5290,12 +5290,12 @@ static struct cgroup_subsys_state * __re
 mem_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
 {
 	struct mem_cgroup *parent = mem_cgroup_from_css(parent_css);
-	struct mem_cgroup *memcg;
+	struct mem_cgroup *memcg, *old_memcg;
 	long error = -ENOMEM;
 
-	memalloc_use_memcg(parent);
+	old_memcg = set_active_memcg(parent);
 	memcg = mem_cgroup_alloc();
-	memalloc_unuse_memcg();
+	set_active_memcg(old_memcg);
 	if (IS_ERR(memcg))
 		return ERR_CAST(memcg);
 
_

