linux-mm.kvack.org archive mirror
* [PATCH RFC 0/4] mm: kmem: kernel memory accounting in an interrupt context
@ 2020-08-27 17:52 Roman Gushchin
  2020-08-27 17:52 ` [PATCH RFC 1/4] mm: kmem: move memcg_kmem_bypass() calls to get_mem/obj_cgroup_from_current() Roman Gushchin
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Roman Gushchin @ 2020-08-27 17:52 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrew Morton, Shakeel Butt, Johannes Weiner, Michal Hocko,
	kernel-team, linux-kernel, Roman Gushchin

This patchset implements memcg-based memory accounting of
allocations made from an interrupt context.

Historically, such allocations went unaccounted, mostly because
charging the memory cgroup of the current process wasn't an option;
performance concerns likely played a role as well.

The remote charging API makes it possible to temporarily override
the currently active memory cgroup, so that all memory allocations
are accounted to a specified memory cgroup instead of the memory
cgroup of the current process.

This patchset extends the remote charging API so that it can be
used from an interrupt context. It then removes the check that
prevented the accounting of allocations made from an interrupt
context. It also contains a couple of optimizations/code
refactorings.

This patchset doesn't directly enable accounting for any specific
allocations, but prepares the code base for it. The bpf memory
accounting will likely be the first user: a typical example is a
bpf program parsing an incoming network packet and allocating an
entry in a hashmap to store some information.


Roman Gushchin (4):
  mm: kmem: move memcg_kmem_bypass() calls to
    get_mem/obj_cgroup_from_current()
  mm: kmem: remove redundant checks from get_obj_cgroup_from_current()
  mm: kmem: prepare remote memcg charging infra for interrupt contexts
  mm: kmem: enable kernel memcg accounting from interrupt contexts

 include/linux/memcontrol.h | 12 -------
 include/linux/sched/mm.h   | 13 +++++--
 mm/memcontrol.c            | 69 ++++++++++++++++++++++++++++----------
 mm/percpu.c                |  3 +-
 mm/slab.h                  |  3 --
 5 files changed, 63 insertions(+), 37 deletions(-)

-- 
2.26.2



^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH RFC 1/4] mm: kmem: move memcg_kmem_bypass() calls to get_mem/obj_cgroup_from_current()
  2020-08-27 17:52 [PATCH RFC 0/4] mm: kmem: kernel memory accounting in an interrupt context Roman Gushchin
@ 2020-08-27 17:52 ` Roman Gushchin
  2020-08-27 21:10   ` Shakeel Butt
  2020-08-27 17:52 ` [PATCH RFC 2/4] mm: kmem: remove redundant checks from get_obj_cgroup_from_current() Roman Gushchin
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 10+ messages in thread
From: Roman Gushchin @ 2020-08-27 17:52 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrew Morton, Shakeel Butt, Johannes Weiner, Michal Hocko,
	kernel-team, linux-kernel, Roman Gushchin

Currently memcg_kmem_bypass() is called before obtaining the current
memory/obj cgroup using get_mem/obj_cgroup_from_current(). Moving
memcg_kmem_bypass() into get_mem/obj_cgroup_from_current() reduces
the number of call sites and allows further code simplifications.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 mm/memcontrol.c | 13 ++++++++-----
 mm/percpu.c     |  3 +--
 mm/slab.h       |  3 ---
 3 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index dc892a3c4b17..9c08d8d14bc0 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1074,6 +1074,9 @@ EXPORT_SYMBOL(get_mem_cgroup_from_page);
  */
 static __always_inline struct mem_cgroup *get_mem_cgroup_from_current(void)
 {
+	if (memcg_kmem_bypass())
+		return NULL;
+
 	if (unlikely(current->active_memcg)) {
 		struct mem_cgroup *memcg;
 
@@ -2913,6 +2916,9 @@ __always_inline struct obj_cgroup *get_obj_cgroup_from_current(void)
 	struct obj_cgroup *objcg = NULL;
 	struct mem_cgroup *memcg;
 
+	if (memcg_kmem_bypass())
+		return NULL;
+
 	if (unlikely(!current->mm && !current->active_memcg))
 		return NULL;
 
@@ -3039,19 +3045,16 @@ int __memcg_kmem_charge_page(struct page *page, gfp_t gfp, int order)
 	struct mem_cgroup *memcg;
 	int ret = 0;
 
-	if (memcg_kmem_bypass())
-		return 0;
-
 	memcg = get_mem_cgroup_from_current();
-	if (!mem_cgroup_is_root(memcg)) {
+	if (memcg && !mem_cgroup_is_root(memcg)) {
 		ret = __memcg_kmem_charge(memcg, gfp, 1 << order);
 		if (!ret) {
 			page->mem_cgroup = memcg;
 			__SetPageKmemcg(page);
 			return 0;
 		}
+		css_put(&memcg->css);
 	}
-	css_put(&memcg->css);
 	return ret;
 }
 
diff --git a/mm/percpu.c b/mm/percpu.c
index f4709629e6de..9b07bd5bc45f 100644
--- a/mm/percpu.c
+++ b/mm/percpu.c
@@ -1584,8 +1584,7 @@ static enum pcpu_chunk_type pcpu_memcg_pre_alloc_hook(size_t size, gfp_t gfp,
 {
 	struct obj_cgroup *objcg;
 
-	if (!memcg_kmem_enabled() || !(gfp & __GFP_ACCOUNT) ||
-	    memcg_kmem_bypass())
+	if (!memcg_kmem_enabled() || !(gfp & __GFP_ACCOUNT))
 		return PCPU_CHUNK_ROOT;
 
 	objcg = get_obj_cgroup_from_current();
diff --git a/mm/slab.h b/mm/slab.h
index 95e5cc1bb2a3..4a24e1702923 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -280,9 +280,6 @@ static inline struct obj_cgroup *memcg_slab_pre_alloc_hook(struct kmem_cache *s,
 {
 	struct obj_cgroup *objcg;
 
-	if (memcg_kmem_bypass())
-		return NULL;
-
 	objcg = get_obj_cgroup_from_current();
 	if (!objcg)
 		return NULL;
-- 
2.26.2




* [PATCH RFC 2/4] mm: kmem: remove redundant checks from get_obj_cgroup_from_current()
  2020-08-27 17:52 [PATCH RFC 0/4] mm: kmem: kernel memory accounting in an interrupt context Roman Gushchin
  2020-08-27 17:52 ` [PATCH RFC 1/4] mm: kmem: move memcg_kmem_bypass() calls to get_mem/obj_cgroup_from_current() Roman Gushchin
@ 2020-08-27 17:52 ` Roman Gushchin
  2020-08-27 21:10   ` Shakeel Butt
  2020-08-27 17:52 ` [PATCH RFC 3/4] mm: kmem: prepare remote memcg charging infra for interrupt contexts Roman Gushchin
  2020-08-27 17:52 ` [PATCH RFC 4/4] mm: kmem: enable kernel memcg accounting from " Roman Gushchin
  3 siblings, 1 reply; 10+ messages in thread
From: Roman Gushchin @ 2020-08-27 17:52 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrew Morton, Shakeel Butt, Johannes Weiner, Michal Hocko,
	kernel-team, linux-kernel, Roman Gushchin

There are checks for current->mm and current->active_memcg
in get_obj_cgroup_from_current(), but these checks are redundant:
memcg_kmem_bypass(), called just above, performs the same checks.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 mm/memcontrol.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 9c08d8d14bc0..5d847257a639 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2919,9 +2919,6 @@ __always_inline struct obj_cgroup *get_obj_cgroup_from_current(void)
 	if (memcg_kmem_bypass())
 		return NULL;
 
-	if (unlikely(!current->mm && !current->active_memcg))
-		return NULL;
-
 	rcu_read_lock();
 	if (unlikely(current->active_memcg))
 		memcg = rcu_dereference(current->active_memcg);
-- 
2.26.2




* [PATCH RFC 3/4] mm: kmem: prepare remote memcg charging infra for interrupt contexts
  2020-08-27 17:52 [PATCH RFC 0/4] mm: kmem: kernel memory accounting in an interrupt context Roman Gushchin
  2020-08-27 17:52 ` [PATCH RFC 1/4] mm: kmem: move memcg_kmem_bypass() calls to get_mem/obj_cgroup_from_current() Roman Gushchin
  2020-08-27 17:52 ` [PATCH RFC 2/4] mm: kmem: remove redundant checks from get_obj_cgroup_from_current() Roman Gushchin
@ 2020-08-27 17:52 ` Roman Gushchin
  2020-08-27 21:58   ` Shakeel Butt
  2020-08-27 17:52 ` [PATCH RFC 4/4] mm: kmem: enable kernel memcg accounting from " Roman Gushchin
  3 siblings, 1 reply; 10+ messages in thread
From: Roman Gushchin @ 2020-08-27 17:52 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrew Morton, Shakeel Butt, Johannes Weiner, Michal Hocko,
	kernel-team, linux-kernel, Roman Gushchin

The remote memcg charging API uses current->active_memcg to store the
currently active memory cgroup, which overrides the memory cgroup
of the current process. It works well for normal contexts, but doesn't
work for interrupt contexts: if an interrupt occurs during the
execution of a section with an active memcg set, all allocations
inside the interrupt will be charged to that active memcg (given
that we'll enable accounting for allocations from an interrupt
context). But because the interrupt might have no relation to the
active memcg set outside, this is obviously wrong from the accounting
perspective.

To resolve this problem, let's add a global percpu int_active_memcg
variable, which will be used to store the active memory cgroup
for interrupt contexts. set_active_memcg() will
transparently use current->active_memcg or int_active_memcg depending
on the context.

To make the read side simple and transparent for the caller, let's
introduce two new functions:
  - struct mem_cgroup *active_memcg(void),
  - struct mem_cgroup *get_active_memcg(void).

They return the active memcg if it's set, hiding the implementation
detail of where it is stored depending on the current context.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 include/linux/sched/mm.h | 13 +++++++++--
 mm/memcontrol.c          | 48 ++++++++++++++++++++++++++++------------
 2 files changed, 45 insertions(+), 16 deletions(-)

diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
index 4c69a4349ac1..030a1cf77b8a 100644
--- a/include/linux/sched/mm.h
+++ b/include/linux/sched/mm.h
@@ -304,6 +304,7 @@ static inline void memalloc_nocma_restore(unsigned int flags)
 #endif
 
 #ifdef CONFIG_MEMCG
+DECLARE_PER_CPU(struct mem_cgroup *, int_active_memcg);
 /**
  * set_active_memcg - Starts the remote memcg charging scope.
  * @memcg: memcg to charge.
@@ -318,8 +319,16 @@ static inline void memalloc_nocma_restore(unsigned int flags)
 static inline struct mem_cgroup *
 set_active_memcg(struct mem_cgroup *memcg)
 {
-	struct mem_cgroup *old = current->active_memcg;
-	current->active_memcg = memcg;
+	struct mem_cgroup *old;
+
+	if (in_interrupt()) {
+		old = this_cpu_read(int_active_memcg);
+		this_cpu_write(int_active_memcg, memcg);
+	} else {
+		old = current->active_memcg;
+		current->active_memcg = memcg;
+	}
+
 	return old;
 }
 #else
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 5d847257a639..a51a6066079e 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -73,6 +73,9 @@ EXPORT_SYMBOL(memory_cgrp_subsys);
 
 struct mem_cgroup *root_mem_cgroup __read_mostly;
 
+/* Active memory cgroup to use from an interrupt context */
+DEFINE_PER_CPU(struct mem_cgroup *, int_active_memcg);
+
 /* Socket memory accounting disabled? */
 static bool cgroup_memory_nosocket;
 
@@ -1069,26 +1072,43 @@ struct mem_cgroup *get_mem_cgroup_from_page(struct page *page)
 }
 EXPORT_SYMBOL(get_mem_cgroup_from_page);
 
-/**
- * If current->active_memcg is non-NULL, do not fallback to current->mm->memcg.
- */
-static __always_inline struct mem_cgroup *get_mem_cgroup_from_current(void)
+static __always_inline struct mem_cgroup *active_memcg(void)
 {
-	if (memcg_kmem_bypass())
-		return NULL;
+	if (in_interrupt())
+		return this_cpu_read(int_active_memcg);
+	else
+		return current->active_memcg;
+}
 
-	if (unlikely(current->active_memcg)) {
-		struct mem_cgroup *memcg;
+static __always_inline struct mem_cgroup *get_active_memcg(void)
+{
+	struct mem_cgroup *memcg;
 
-		rcu_read_lock();
+	rcu_read_lock();
+	memcg = active_memcg();
+	if (memcg) {
 		/* current->active_memcg must hold a ref. */
-		if (WARN_ON_ONCE(!css_tryget(&current->active_memcg->css)))
+		if (WARN_ON_ONCE(!css_tryget(&memcg->css)))
 			memcg = root_mem_cgroup;
-		else
-			memcg = current->active_memcg;
-		rcu_read_unlock();
-		return memcg;
 	}
+	rcu_read_unlock();
+
+	return memcg;
+}
+
+/**
+ * If active memcg is set, do not fallback to current->mm->memcg.
+ */
+static __always_inline struct mem_cgroup *get_mem_cgroup_from_current(void)
+{
+	if (memcg_kmem_bypass())
+		return NULL;
+
+	if (unlikely(active_memcg()))
+		return get_active_memcg();
+
 	return get_mem_cgroup_from_mm(current->mm);
 }
 
@@ -2920,8 +2940,8 @@ __always_inline struct obj_cgroup *get_obj_cgroup_from_current(void)
 		return NULL;
 
 	rcu_read_lock();
-	if (unlikely(current->active_memcg))
-		memcg = rcu_dereference(current->active_memcg);
+	if (unlikely(active_memcg()))
+		memcg = active_memcg();
 	else
 		memcg = mem_cgroup_from_task(current);
 
-- 
2.26.2




* [PATCH RFC 4/4] mm: kmem: enable kernel memcg accounting from interrupt contexts
  2020-08-27 17:52 [PATCH RFC 0/4] mm: kmem: kernel memory accounting in an interrupt context Roman Gushchin
                   ` (2 preceding siblings ...)
  2020-08-27 17:52 ` [PATCH RFC 3/4] mm: kmem: prepare remote memcg charging infra for interrupt contexts Roman Gushchin
@ 2020-08-27 17:52 ` Roman Gushchin
  2020-08-27 22:02   ` Shakeel Butt
  3 siblings, 1 reply; 10+ messages in thread
From: Roman Gushchin @ 2020-08-27 17:52 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrew Morton, Shakeel Butt, Johannes Weiner, Michal Hocko,
	kernel-team, linux-kernel, Roman Gushchin

If a memcg to charge can be determined (using the remote charging
API), there is no reason to exclude allocations made from an
interrupt context from the accounting.

Such allocations will succeed even if the resulting memcg size
exceeds the hard limit, but they will contribute to the memcg's
memory pressure, and a persistent inability to bring the workload
under the limit will eventually trigger the OOM killer.

To use active_memcg() helper, memcg_kmem_bypass() is moved back
to memcontrol.c.

Signed-off-by: Roman Gushchin <guro@fb.com>
---
 include/linux/memcontrol.h | 12 ------------
 mm/memcontrol.c            | 13 +++++++++++++
 2 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index d0b036123c6a..924177502479 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -1528,18 +1528,6 @@ static inline bool memcg_kmem_enabled(void)
 	return static_branch_likely(&memcg_kmem_enabled_key);
 }
 
-static inline bool memcg_kmem_bypass(void)
-{
-	if (in_interrupt())
-		return true;
-
-	/* Allow remote memcg charging in kthread contexts. */
-	if ((!current->mm || (current->flags & PF_KTHREAD)) &&
-	     !current->active_memcg)
-		return true;
-	return false;
-}
-
 static inline int memcg_kmem_charge_page(struct page *page, gfp_t gfp,
 					 int order)
 {
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index a51a6066079e..75cd1a1e66c8 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1098,6 +1098,19 @@ static __always_inline struct mem_cgroup *get_active_memcg(void)
 	return memcg;
 }
 
+static __always_inline bool memcg_kmem_bypass(void)
+{
+	/* Allow remote memcg charging from any context. */
+	if (unlikely(active_memcg()))
+		return false;
+
+	/* Memcg to charge can't be determined. */
+	if (in_interrupt() || !current->mm || (current->flags & PF_KTHREAD))
+		return true;
+
+	return false;
+}
+
 /**
  * If active memcg is set, do not fallback to current->mm->memcg.
  */
-- 
2.26.2




* Re: [PATCH RFC 1/4] mm: kmem: move memcg_kmem_bypass() calls to get_mem/obj_cgroup_from_current()
  2020-08-27 17:52 ` [PATCH RFC 1/4] mm: kmem: move memcg_kmem_bypass() calls to get_mem/obj_cgroup_from_current() Roman Gushchin
@ 2020-08-27 21:10   ` Shakeel Butt
  0 siblings, 0 replies; 10+ messages in thread
From: Shakeel Butt @ 2020-08-27 21:10 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: Linux MM, Andrew Morton, Johannes Weiner, Michal Hocko,
	Kernel Team, LKML

On Thu, Aug 27, 2020 at 10:52 AM Roman Gushchin <guro@fb.com> wrote:
>
> Currently memcg_kmem_bypass() is called before obtaining the current
> memory/obj cgroup using get_mem/obj_cgroup_from_current(). Moving
> memcg_kmem_bypass() into get_mem/obj_cgroup_from_current() reduces
> the number of call sites and allows further code simplifications.
>
> Signed-off-by: Roman Gushchin <guro@fb.com>

Reviewed-by: Shakeel Butt <shakeelb@google.com>



* Re: [PATCH RFC 2/4] mm: kmem: remove redundant checks from get_obj_cgroup_from_current()
  2020-08-27 17:52 ` [PATCH RFC 2/4] mm: kmem: remove redundant checks from get_obj_cgroup_from_current() Roman Gushchin
@ 2020-08-27 21:10   ` Shakeel Butt
  0 siblings, 0 replies; 10+ messages in thread
From: Shakeel Butt @ 2020-08-27 21:10 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: Linux MM, Andrew Morton, Johannes Weiner, Michal Hocko,
	Kernel Team, LKML

On Thu, Aug 27, 2020 at 10:52 AM Roman Gushchin <guro@fb.com> wrote:
>
> There are checks for current->mm and current->active_memcg
> in get_obj_cgroup_from_current(), but these checks are redundant:
> memcg_kmem_bypass() called just above performs same checks.
>
> Signed-off-by: Roman Gushchin <guro@fb.com>

Reviewed-by: Shakeel Butt <shakeelb@google.com>



* Re: [PATCH RFC 3/4] mm: kmem: prepare remote memcg charging infra for interrupt contexts
  2020-08-27 17:52 ` [PATCH RFC 3/4] mm: kmem: prepare remote memcg charging infra for interrupt contexts Roman Gushchin
@ 2020-08-27 21:58   ` Shakeel Butt
  2020-08-27 22:37     ` Roman Gushchin
  0 siblings, 1 reply; 10+ messages in thread
From: Shakeel Butt @ 2020-08-27 21:58 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: Linux MM, Andrew Morton, Johannes Weiner, Michal Hocko,
	Kernel Team, LKML

On Thu, Aug 27, 2020 at 10:52 AM Roman Gushchin <guro@fb.com> wrote:
>
> Remote memcg charging API uses current->active_memcg to store the
> currently active memory cgroup, which overwrites the memory cgroup
> of the current process. It works well for normal contexts, but doesn't
> work for interrupt contexts: indeed, if an interrupt occurs during
> the execution of a section with an active memcg set, all allocations
> inside the interrupt will be charged to the active memcg set (given
> that we'll enable accounting for allocations from an interrupt
> context). But because the interrupt might have no relation to the
> active memcg set outside, it's obviously wrong from the accounting
> prospective.
>
> To resolve this problem, let's add a global percpu int_active_memcg
> variable, which will be used to store an active memory cgroup which
> will be sued from interrupt contexts. set_active_memcg() will

*used

> transparently use current->active_memcg or int_active_memcg depending
> on the context.
>
> To make the read part simple and transparent for the caller, let's
> introduce two new functions:
>   - struct mem_cgroup *active_memcg(void),
>   - struct mem_cgroup *get_active_memcg(void).
>
> They are returning the active memcg if it's set, hiding all
> implementation details: where to get it depending on the current context.
>
> Signed-off-by: Roman Gushchin <guro@fb.com>

I like this patch. Internally we have a similar patch which, instead
of a per-cpu int_active_memcg, has current->active_memcg_irq. Our
use case was radix tree node allocations, where we use the root
node's memcg to charge all the nodes of the tree. The reason was
that we observed a lot of zombie memcgs which were stuck due to radix
tree node charges while the actual pages pointed to by those
nodes/entries were in use by active jobs (shared file system, and
the kernel is older than the kmem reparenting).

Reviewed-by: Shakeel Butt <shakeelb@google.com>



* Re: [PATCH RFC 4/4] mm: kmem: enable kernel memcg accounting from interrupt contexts
  2020-08-27 17:52 ` [PATCH RFC 4/4] mm: kmem: enable kernel memcg accounting from " Roman Gushchin
@ 2020-08-27 22:02   ` Shakeel Butt
  0 siblings, 0 replies; 10+ messages in thread
From: Shakeel Butt @ 2020-08-27 22:02 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: Linux MM, Andrew Morton, Johannes Weiner, Michal Hocko,
	Kernel Team, LKML

On Thu, Aug 27, 2020 at 10:52 AM Roman Gushchin <guro@fb.com> wrote:
>
> If a memcg to charge can be determined (using remote charging API),
> there are no reasons to exclude allocations made from an interrupt
> context from the accounting.
>
> Such allocations will pass even if the resulting memcg size will
> exceed the hard limit, but it will affect the application of the
> memory pressure and an inability to put the workload under the limit
> will eventually trigger the OOM.
>
> To use active_memcg() helper, memcg_kmem_bypass() is moved back
> to memcontrol.c.
>
> Signed-off-by: Roman Gushchin <guro@fb.com>

Reviewed-by: Shakeel Butt <shakeelb@google.com>



* Re: [PATCH RFC 3/4] mm: kmem: prepare remote memcg charging infra for interrupt contexts
  2020-08-27 21:58   ` Shakeel Butt
@ 2020-08-27 22:37     ` Roman Gushchin
  0 siblings, 0 replies; 10+ messages in thread
From: Roman Gushchin @ 2020-08-27 22:37 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Linux MM, Andrew Morton, Johannes Weiner, Michal Hocko,
	Kernel Team, LKML

On Thu, Aug 27, 2020 at 02:58:50PM -0700, Shakeel Butt wrote:
> On Thu, Aug 27, 2020 at 10:52 AM Roman Gushchin <guro@fb.com> wrote:
> >
> > Remote memcg charging API uses current->active_memcg to store the
> > currently active memory cgroup, which overwrites the memory cgroup
> > of the current process. It works well for normal contexts, but doesn't
> > work for interrupt contexts: indeed, if an interrupt occurs during
> > the execution of a section with an active memcg set, all allocations
> > inside the interrupt will be charged to the active memcg set (given
> > that we'll enable accounting for allocations from an interrupt
> > context). But because the interrupt might have no relation to the
> > active memcg set outside, it's obviously wrong from the accounting
> > prospective.
> >
> > To resolve this problem, let's add a global percpu int_active_memcg
> > variable, which will be used to store an active memory cgroup which
> > will be sued from interrupt contexts. set_active_memcg() will
> 
> *used
> 
> > transparently use current->active_memcg or int_active_memcg depending
> > on the context.
> >
> > To make the read part simple and transparent for the caller, let's
> > introduce two new functions:
> >   - struct mem_cgroup *active_memcg(void),
> >   - struct mem_cgroup *get_active_memcg(void).
> >
> > They are returning the active memcg if it's set, hiding all
> > implementation details: where to get it depending on the current context.
> >
> > Signed-off-by: Roman Gushchin <guro@fb.com>
> 
> I like this patch. Internally we have a similar patch which instead of
> per-cpu int_active_memcg have current->active_memcg_irq. Our use-case
> was radix tree node allocations where we use the root node's memcg to
> charge all the nodes of the tree and the reason behind was that we
> observed a lot of zombies which were stuck due to radix tree nodes
> charges while the actual pages pointed by the those nodes/entries were
> in used by active jobs (shared file system and the kernel is older
> than the kmem reparenting).
> 
> Reviewed-by: Shakeel Butt <shakeelb@google.com>

Thank you for reviews, Shakeel!

I'll fix the typo, add your acks, and resend it as v1.

Thanks!


