mm: memcontrol: fix NR_WRITEBACK leak in memcg and system stats
diff mbox series

Message ID 20180203082353.17284-1-hannes@cmpxchg.org
State New
Series
  • mm: memcontrol: fix NR_WRITEBACK leak in memcg and system stats

Commit Message

Johannes Weiner Feb. 3, 2018, 8:23 a.m. UTC
After a983b5ebee57 ("mm: memcontrol: fix excessive complexity in
memory.stat reporting"), we observed NR_WRITEBACK counts slowly
creeping upward over the course of several days, both in the
per-memcg stats and in the system counter in e.g. /proc/meminfo.

The conversion from full per-cpu stat counts to per-cpu cached atomic
stat counts introduced an irq-unsafe RMW operation into the updates.

Most stat updates come from process context, but one notable exception
is the NR_WRITEBACK counter. While writebacks are issued from process
context, they are retired from (soft)irq context.

When writeback completions interrupt the RMW counter updates of new
writebacks being issued, the decrements from the completions are lost.

Since the global updates are routed through the joint lruvec API, both
the memcg counters as well as the system counters are affected.

This patch makes the joint stat and event API irq safe.

Fixes: a983b5ebee57 ("mm: memcontrol: fix excessive complexity in memory.stat reporting")
Debugged-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 include/linux/memcontrol.h | 24 ++++++++++++++++--------
 1 file changed, 16 insertions(+), 8 deletions(-)

Comments

Rik van Riel Feb. 3, 2018, 6:23 p.m. UTC | #1
On Sat, 2018-02-03 at 03:23 -0500, Johannes Weiner wrote:
> 
> This patch makes the joint stat and event API irq safe.
> 
> Fixes: a983b5ebee57 ("mm: memcontrol: fix excessive complexity in
> memory.stat reporting")
> Debugged-by: Tejun Heo <tj@kernel.org>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> 
Reviewed-by: Rik van Riel <riel@surriel.com>
Shakeel Butt Feb. 7, 2018, 3:44 p.m. UTC | #2
On Sat, Feb 3, 2018 at 12:23 AM, Johannes Weiner <hannes@cmpxchg.org> wrote:
> After a983b5ebee57 ("mm: memcontrol: fix excessive complexity in
> memory.stat reporting"), we observed NR_WRITEBACK counts slowly
> creeping upward over the course of several days, both in the
> per-memcg stats and in the system counter in e.g. /proc/meminfo.
>
> The conversion from full per-cpu stat counts to per-cpu cached atomic
> stat counts introduced an irq-unsafe RMW operation into the updates.
>
> Most stat updates come from process context, but one notable exception
> is the NR_WRITEBACK counter. While writebacks are issued from process
> context, they are retired from (soft)irq context.
>
> When writeback completions interrupt the RMW counter updates of new
> writebacks being issued, the decrements from the completions are lost.
>
> Since the global updates are routed through the joint lruvec API, both
> the memcg counters as well as the system counters are affected.
>
> This patch makes the joint stat and event API irq safe.
>
> Fixes: a983b5ebee57 ("mm: memcontrol: fix excessive complexity in memory.stat reporting")
> Debugged-by: Tejun Heo <tj@kernel.org>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>

Should this be considered for stable?
Johannes Weiner Feb. 7, 2018, 4:55 p.m. UTC | #3
Hi Shakeel,

On Wed, Feb 07, 2018 at 07:44:08AM -0800, Shakeel Butt wrote:
> On Sat, Feb 3, 2018 at 12:23 AM, Johannes Weiner <hannes@cmpxchg.org> wrote:
> > After a983b5ebee57 ("mm: memcontrol: fix excessive complexity in
> > memory.stat reporting"), we observed NR_WRITEBACK counts slowly
> > creeping upward over the course of several days, both in the
> > per-memcg stats and in the system counter in e.g. /proc/meminfo.
> >
> > The conversion from full per-cpu stat counts to per-cpu cached atomic
> > stat counts introduced an irq-unsafe RMW operation into the updates.
> >
> > Most stat updates come from process context, but one notable exception
> > is the NR_WRITEBACK counter. While writebacks are issued from process
> > context, they are retired from (soft)irq context.
> >
> > When writeback completions interrupt the RMW counter updates of new
> > writebacks being issued, the decrements from the completions are lost.
> >
> > Since the global updates are routed through the joint lruvec API, both
> > the memcg counters as well as the system counters are affected.
> >
> > This patch makes the joint stat and event API irq safe.
> >
> > Fixes: a983b5ebee57 ("mm: memcontrol: fix excessive complexity in memory.stat reporting")
> > Debugged-by: Tejun Heo <tj@kernel.org>
> > Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> 
> Should this be considered for stable?

The stable tree is only for fixes to already released kernels, but
there is no release containing the faulty patch:

$ git describe --tags a983b5ebee57
v4.15-3322-ga983b5ebee57

nor was the faulty patch itself marked for stable.

So as long as this fix makes it into 4.16 we should be good.
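
The `v4.15-3322-ga983b5ebee57` form above is git's way of saying the
commit sits 3322 commits past the v4.15 tag, i.e. merge-window material
rather than part of any release. A throwaway repo (paths and identity
invented for the demo) shows the format:

```shell
# Demonstrate the describe format: <tag>-<count>-g<hash> means
# "<count> commits past <tag>".
set -e
dir=$(mktemp -d)
cd "$dir"
git init -q
git -c user.name=demo -c user.email=demo@example.com \
	commit -q --allow-empty -m base
git tag v1.0
git -c user.name=demo -c user.email=demo@example.com \
	commit -q --allow-empty -m next
git describe --tags	# prints v1.0-1-g<abbrev>: one commit past v1.0
```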

Patch

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 882046863581..c46016bb25eb 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -523,9 +523,11 @@  static inline void __mod_memcg_state(struct mem_cgroup *memcg,
 static inline void mod_memcg_state(struct mem_cgroup *memcg,
 				   int idx, int val)
 {
-	preempt_disable();
+	unsigned long flags;
+
+	local_irq_save(flags);
 	__mod_memcg_state(memcg, idx, val);
-	preempt_enable();
+	local_irq_restore(flags);
 }
 
 /**
@@ -606,9 +608,11 @@  static inline void __mod_lruvec_state(struct lruvec *lruvec,
 static inline void mod_lruvec_state(struct lruvec *lruvec,
 				    enum node_stat_item idx, int val)
 {
-	preempt_disable();
+	unsigned long flags;
+
+	local_irq_save(flags);
 	__mod_lruvec_state(lruvec, idx, val);
-	preempt_enable();
+	local_irq_restore(flags);
 }
 
 static inline void __mod_lruvec_page_state(struct page *page,
@@ -630,9 +634,11 @@  static inline void __mod_lruvec_page_state(struct page *page,
 static inline void mod_lruvec_page_state(struct page *page,
 					 enum node_stat_item idx, int val)
 {
-	preempt_disable();
+	unsigned long flags;
+
+	local_irq_save(flags);
 	__mod_lruvec_page_state(page, idx, val);
-	preempt_enable();
+	local_irq_restore(flags);
 }
 
 unsigned long mem_cgroup_soft_limit_reclaim(pg_data_t *pgdat, int order,
@@ -659,9 +665,11 @@  static inline void __count_memcg_events(struct mem_cgroup *memcg,
 static inline void count_memcg_events(struct mem_cgroup *memcg,
 				      int idx, unsigned long count)
 {
-	preempt_disable();
+	unsigned long flags;
+
+	local_irq_save(flags);
 	__count_memcg_events(memcg, idx, count);
-	preempt_enable();
+	local_irq_restore(flags);
 }
 
 /* idx can be of type enum memcg_event_item or vm_event_item */