linux-kernel.vger.kernel.org archive mirror
* [PATCH] mm: memcontrol: fix blocking rstat function called from atomic cgroup1 thresholding code
@ 2021-07-26 15:00 Johannes Weiner
  2021-07-26 15:08 ` Chris Down
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Johannes Weiner @ 2021-07-26 15:00 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Dan Carpenter, linux-mm, cgroups, linux-kernel, kernel-team

Dan Carpenter reports:

    The patch 2d146aa3aa84: "mm: memcontrol: switch to rstat" from Apr
    29, 2021, leads to the following static checker warning:

	    kernel/cgroup/rstat.c:200 cgroup_rstat_flush()
	    warn: sleeping in atomic context

    mm/memcontrol.c
      3572  static unsigned long mem_cgroup_usage(struct mem_cgroup *memcg, bool swap)
      3573  {
      3574          unsigned long val;
      3575
      3576          if (mem_cgroup_is_root(memcg)) {
      3577                  cgroup_rstat_flush(memcg->css.cgroup);
			    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

    This is from static analysis and potentially a false positive.  The
    problem is that mem_cgroup_usage() is called from __mem_cgroup_threshold()
    which holds an rcu_read_lock().  And the cgroup_rstat_flush() function
    can sleep.

      3578                  val = memcg_page_state(memcg, NR_FILE_PAGES) +
      3579                          memcg_page_state(memcg, NR_ANON_MAPPED);
      3580                  if (swap)
      3581                          val += memcg_page_state(memcg, MEMCG_SWAP);
      3582          } else {
      3583                  if (!swap)
      3584                          val = page_counter_read(&memcg->memory);
      3585                  else
      3586                          val = page_counter_read(&memcg->memsw);
      3587          }
      3588          return val;
      3589  }

__mem_cgroup_threshold() indeed holds the rcu lock. In addition, the
thresholding code is invoked during stat changes, and those contexts
have irqs disabled as well. If the lock breaking occurs inside the
flush function, it will result in a sleep from an atomic context.

Use the irqsafe flushing variant in mem_cgroup_usage() to fix this.

Fixes: 2d146aa3aa84 ("mm: memcontrol: switch to rstat")
Cc: <stable@vger.kernel.org>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
---
 mm/memcontrol.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index ae1f5d0cb581..eb8e87c4833f 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3574,7 +3574,8 @@ static unsigned long mem_cgroup_usage(struct mem_cgroup *memcg, bool swap)
 	unsigned long val;
 
 	if (mem_cgroup_is_root(memcg)) {
-		cgroup_rstat_flush(memcg->css.cgroup);
+		/* mem_cgroup_threshold() calls here from irqsafe context */
+		cgroup_rstat_flush_irqsafe(memcg->css.cgroup);
 		val = memcg_page_state(memcg, NR_FILE_PAGES) +
 			memcg_page_state(memcg, NR_ANON_MAPPED);
 		if (swap)
-- 
2.32.0
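
The difference between the two flush variants can be illustrated with a small userspace model. This is only a sketch: every name below (model_ctx, model_cond_resched, model_flush) is hypothetical and none of it is kernel code. It assumes, as a simplification, that the sleepable variant reschedules after every CPU, whereas the real flush only breaks the lock when a reschedule is actually needed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_ctx {
	bool atomic;          /* true while "irqs are off" or an RCU read section is held */
	int sleeps_in_atomic; /* how many times we would have slept in atomic context */
};

/* Stand-in for a reschedule point: sleeping is a bug if the caller is atomic. */
static void model_cond_resched(struct model_ctx *ctx)
{
	if (ctx->atomic)
		ctx->sleeps_in_atomic++;
}

/*
 * Fold per-CPU deltas into one total. With may_sleep set, the loop
 * "breaks the lock" and reschedules between CPUs; without it, the
 * whole walk runs in one go and never sleeps.
 */
static long model_flush(struct model_ctx *ctx, long *percpu, int ncpus,
			bool may_sleep)
{
	long total = 0;

	for (int cpu = 0; cpu < ncpus; cpu++) {
		total += percpu[cpu];
		percpu[cpu] = 0;
		if (may_sleep)
			model_cond_resched(ctx);
	}
	return total;
}
```

In this model, calling model_flush() with may_sleep set from a context marked atomic increments sleeps_in_atomic once per CPU, which corresponds to the checker warning above. Passing may_sleep as false yields the same total without any sleep, at the cost of running the whole walk without a break.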



* Re: [PATCH] mm: memcontrol: fix blocking rstat function called from atomic cgroup1 thresholding code
  2021-07-26 15:00 [PATCH] mm: memcontrol: fix blocking rstat function called from atomic cgroup1 thresholding code Johannes Weiner
@ 2021-07-26 15:08 ` Chris Down
  2021-07-26 15:16 ` Rik van Riel
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: Chris Down @ 2021-07-26 15:08 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, Dan Carpenter, linux-mm, cgroups, linux-kernel,
	kernel-team

Johannes Weiner writes:
>Dan Carpenter reports:
>
>    The patch 2d146aa3aa84: "mm: memcontrol: switch to rstat" from Apr
>    29, 2021, leads to the following static checker warning:
>
>	    kernel/cgroup/rstat.c:200 cgroup_rstat_flush()
>	    warn: sleeping in atomic context
>
>    mm/memcontrol.c
>      3572  static unsigned long mem_cgroup_usage(struct mem_cgroup *memcg, bool swap)
>      3573  {
>      3574          unsigned long val;
>      3575
>      3576          if (mem_cgroup_is_root(memcg)) {
>      3577                  cgroup_rstat_flush(memcg->css.cgroup);
>			    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
>    This is from static analysis and potentially a false positive.  The
>    problem is that mem_cgroup_usage() is called from __mem_cgroup_threshold()
>    which holds an rcu_read_lock().  And the cgroup_rstat_flush() function
>    can sleep.
>
>      3578                  val = memcg_page_state(memcg, NR_FILE_PAGES) +
>      3579                          memcg_page_state(memcg, NR_ANON_MAPPED);
>      3580                  if (swap)
>      3581                          val += memcg_page_state(memcg, MEMCG_SWAP);
>      3582          } else {
>      3583                  if (!swap)
>      3584                          val = page_counter_read(&memcg->memory);
>      3585                  else
>      3586                          val = page_counter_read(&memcg->memsw);
>      3587          }
>      3588          return val;
>      3589  }
>
>__mem_cgroup_threshold() indeed holds the rcu lock. In addition, the
>thresholding code is invoked during stat changes, and those contexts
>have irqs disabled as well. If the lock breaking occurs inside the
>flush function, it will result in a sleep from an atomic context.
>
>Use the irqsafe flushing variant in mem_cgroup_usage() to fix this.
>
>Fixes: 2d146aa3aa84 ("mm: memcontrol: switch to rstat")
>Cc: <stable@vger.kernel.org>
>Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
>Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>

Thanks, looks good.

Acked-by: Chris Down <chris@chrisdown.name>

>---
> mm/memcontrol.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
>diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>index ae1f5d0cb581..eb8e87c4833f 100644
>--- a/mm/memcontrol.c
>+++ b/mm/memcontrol.c
>@@ -3574,7 +3574,8 @@ static unsigned long mem_cgroup_usage(struct mem_cgroup *memcg, bool swap)
> 	unsigned long val;
>
> 	if (mem_cgroup_is_root(memcg)) {
>-		cgroup_rstat_flush(memcg->css.cgroup);
>+		/* mem_cgroup_threshold() calls here from irqsafe context */
>+		cgroup_rstat_flush_irqsafe(memcg->css.cgroup);
> 		val = memcg_page_state(memcg, NR_FILE_PAGES) +
> 			memcg_page_state(memcg, NR_ANON_MAPPED);
> 		if (swap)
>-- 
>2.32.0
>
>


* Re: [PATCH] mm: memcontrol: fix blocking rstat function called from atomic cgroup1 thresholding code
  2021-07-26 15:00 [PATCH] mm: memcontrol: fix blocking rstat function called from atomic cgroup1 thresholding code Johannes Weiner
  2021-07-26 15:08 ` Chris Down
@ 2021-07-26 15:16 ` Rik van Riel
  2021-07-27 16:51   ` Shakeel Butt
  2021-07-26 20:32 ` Michal Hocko
  2021-07-27 16:59 ` Shakeel Butt
  3 siblings, 1 reply; 7+ messages in thread
From: Rik van Riel @ 2021-07-26 15:16 UTC (permalink / raw)
  To: hannes, akpm; +Cc: Kernel Team, dan.carpenter, linux-mm, linux-kernel, cgroups

On Mon, 2021-07-26 at 11:00 -0400, Johannes Weiner wrote:
> 
> __mem_cgroup_threshold() indeed holds the rcu lock. In addition, the
> thresholding code is invoked during stat changes, and those contexts
> have irqs disabled as well. If the lock breaking occurs inside the
> flush function, it will result in a sleep from an atomic context.
> 
> Use the irqsafe flushing variant in mem_cgroup_usage() to fix this

While this fix is necessary, in the long term I think we may
want some sort of redesign here, to make sure the irq safe
version does not spin for a long time trying to get the statistics
off some other CPU.

I have seen a number of soft (IIRC) lockups deep inside the
bowels of cgroup_rstat_flush_irqsafe, with the function taking
multiple seconds to complete.

Reviewed-by: Rik van Riel <riel@surriel.com>


* Re: [PATCH] mm: memcontrol: fix blocking rstat function called from atomic cgroup1 thresholding code
  2021-07-26 15:00 [PATCH] mm: memcontrol: fix blocking rstat function called from atomic cgroup1 thresholding code Johannes Weiner
  2021-07-26 15:08 ` Chris Down
  2021-07-26 15:16 ` Rik van Riel
@ 2021-07-26 20:32 ` Michal Hocko
  2021-07-27 16:59 ` Shakeel Butt
  3 siblings, 0 replies; 7+ messages in thread
From: Michal Hocko @ 2021-07-26 20:32 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, Dan Carpenter, linux-mm, cgroups, linux-kernel,
	kernel-team

On Mon 26-07-21 11:00:19, Johannes Weiner wrote:
> Dan Carpenter reports:
> 
>     The patch 2d146aa3aa84: "mm: memcontrol: switch to rstat" from Apr
>     29, 2021, leads to the following static checker warning:
> 
> 	    kernel/cgroup/rstat.c:200 cgroup_rstat_flush()
> 	    warn: sleeping in atomic context
> 
>     mm/memcontrol.c
>       3572  static unsigned long mem_cgroup_usage(struct mem_cgroup *memcg, bool swap)
>       3573  {
>       3574          unsigned long val;
>       3575
>       3576          if (mem_cgroup_is_root(memcg)) {
>       3577                  cgroup_rstat_flush(memcg->css.cgroup);
> 			    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
>     This is from static analysis and potentially a false positive.  The
>     problem is that mem_cgroup_usage() is called from __mem_cgroup_threshold()
>     which holds an rcu_read_lock().  And the cgroup_rstat_flush() function
>     can sleep.
> 
>       3578                  val = memcg_page_state(memcg, NR_FILE_PAGES) +
>       3579                          memcg_page_state(memcg, NR_ANON_MAPPED);
>       3580                  if (swap)
>       3581                          val += memcg_page_state(memcg, MEMCG_SWAP);
>       3582          } else {
>       3583                  if (!swap)
>       3584                          val = page_counter_read(&memcg->memory);
>       3585                  else
>       3586                          val = page_counter_read(&memcg->memsw);
>       3587          }
>       3588          return val;
>       3589  }
> 
> __mem_cgroup_threshold() indeed holds the rcu lock. In addition, the
> thresholding code is invoked during stat changes, and those contexts
> have irqs disabled as well. If the lock breaking occurs inside the
> flush function, it will result in a sleep from an atomic context.
> 
> Use the irqsafe flushing variant in mem_cgroup_usage() to fix this.
> 
> Fixes: 2d146aa3aa84 ("mm: memcontrol: switch to rstat")
> Cc: <stable@vger.kernel.org>
> Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>

Acked-by: Michal Hocko <mhocko@suse.com>

Thanks!

> ---
>  mm/memcontrol.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index ae1f5d0cb581..eb8e87c4833f 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -3574,7 +3574,8 @@ static unsigned long mem_cgroup_usage(struct mem_cgroup *memcg, bool swap)
>  	unsigned long val;
>  
>  	if (mem_cgroup_is_root(memcg)) {
> -		cgroup_rstat_flush(memcg->css.cgroup);
> +		/* mem_cgroup_threshold() calls here from irqsafe context */
> +		cgroup_rstat_flush_irqsafe(memcg->css.cgroup);
>  		val = memcg_page_state(memcg, NR_FILE_PAGES) +
>  			memcg_page_state(memcg, NR_ANON_MAPPED);
>  		if (swap)
> -- 
> 2.32.0

-- 
Michal Hocko
SUSE Labs


* Re: [PATCH] mm: memcontrol: fix blocking rstat function called from atomic cgroup1 thresholding code
  2021-07-26 15:16 ` Rik van Riel
@ 2021-07-27 16:51   ` Shakeel Butt
  2021-08-03 14:34     ` Rik van Riel
  0 siblings, 1 reply; 7+ messages in thread
From: Shakeel Butt @ 2021-07-27 16:51 UTC (permalink / raw)
  To: Rik van Riel
  Cc: hannes, akpm, Kernel Team, dan.carpenter, linux-mm, linux-kernel,
	cgroups

On Mon, Jul 26, 2021 at 8:19 AM Rik van Riel <riel@fb.com> wrote:
>
> On Mon, 2021-07-26 at 11:00 -0400, Johannes Weiner wrote:
> >
> > __mem_cgroup_threshold() indeed holds the rcu lock. In addition, the
> > thresholding code is invoked during stat changes, and those contexts
> > have irqs disabled as well. If the lock breaking occurs inside the
> > flush function, it will result in a sleep from an atomic context.
> >
> > Use the irqsafe flushing variant in mem_cgroup_usage() to fix this
>
> While this fix is necessary, in the long term I think we may
> want some sort of redesign here, to make sure the irq safe
> version does not spin for a long time trying to get the statistics
> off some other CPU.
>
> I have seen a number of soft (IIRC) lockups deep inside the
> bowels of cgroup_rstat_flush_irqsafe, with the function taking
> multiple seconds to complete.

Can you please share a bit more detail on this lockup? I am wondering
if this was due to the flush not happening more often and thus the
update tree is large or if there are too many concurrent flushes
happening.

>
> Reviewed-by: Rik van Riel <riel@surriel.com>


* Re: [PATCH] mm: memcontrol: fix blocking rstat function called from atomic cgroup1 thresholding code
  2021-07-26 15:00 [PATCH] mm: memcontrol: fix blocking rstat function called from atomic cgroup1 thresholding code Johannes Weiner
                   ` (2 preceding siblings ...)
  2021-07-26 20:32 ` Michal Hocko
@ 2021-07-27 16:59 ` Shakeel Butt
  3 siblings, 0 replies; 7+ messages in thread
From: Shakeel Butt @ 2021-07-27 16:59 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, Dan Carpenter, Linux MM, Cgroups, LKML, Kernel Team

On Mon, Jul 26, 2021 at 8:01 AM Johannes Weiner <hannes@cmpxchg.org> wrote:
>
> Dan Carpenter reports:
>
>     The patch 2d146aa3aa84: "mm: memcontrol: switch to rstat" from Apr
>     29, 2021, leads to the following static checker warning:
>
>             kernel/cgroup/rstat.c:200 cgroup_rstat_flush()
>             warn: sleeping in atomic context
>
>     mm/memcontrol.c
>       3572  static unsigned long mem_cgroup_usage(struct mem_cgroup *memcg, bool swap)
>       3573  {
>       3574          unsigned long val;
>       3575
>       3576          if (mem_cgroup_is_root(memcg)) {
>       3577                  cgroup_rstat_flush(memcg->css.cgroup);
>                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
>     This is from static analysis and potentially a false positive.  The
>     problem is that mem_cgroup_usage() is called from __mem_cgroup_threshold()
>     which holds an rcu_read_lock().  And the cgroup_rstat_flush() function
>     can sleep.
>
>       3578                  val = memcg_page_state(memcg, NR_FILE_PAGES) +
>       3579                          memcg_page_state(memcg, NR_ANON_MAPPED);
>       3580                  if (swap)
>       3581                          val += memcg_page_state(memcg, MEMCG_SWAP);
>       3582          } else {
>       3583                  if (!swap)
>       3584                          val = page_counter_read(&memcg->memory);
>       3585                  else
>       3586                          val = page_counter_read(&memcg->memsw);
>       3587          }
>       3588          return val;
>       3589  }
>
> __mem_cgroup_threshold() indeed holds the rcu lock. In addition, the
> thresholding code is invoked during stat changes, and those contexts
> have irqs disabled as well. If the lock breaking occurs inside the
> flush function, it will result in a sleep from an atomic context.
>
> Use the irqsafe flushing variant in mem_cgroup_usage() to fix this.
>
> Fixes: 2d146aa3aa84 ("mm: memcontrol: switch to rstat")
> Cc: <stable@vger.kernel.org>
> Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>

Reviewed-by: Shakeel Butt <shakeelb@google.com>

BTW what do you think of removing stat flushes from the read side
(kernel and userspace) completely after periodic flushing and async
flushing from update side? Basically with "memcg: infrastructure to
flush memcg stats".
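
A rough userspace sketch of that idea, for illustration only: updates accumulate per-CPU deltas and request a flush once enough changes are pending, a worker (periodic or triggered) does the aggregation, and readers return the last aggregated value without flushing. All names and the threshold value are hypothetical; this is not the code from "memcg: infrastructure to flush memcg stats".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define MODEL_NCPUS 4
#define MODEL_FLUSH_THRESHOLD 64 /* hypothetical cutoff for requesting a flush */

/* Hypothetical model of the stats state; not a kernel structure. */
struct model_stats {
	long percpu_delta[MODEL_NCPUS]; /* unflushed per-CPU changes */
	long aggregated;                /* value visible to readers */
	long pending;                   /* magnitude of unflushed changes */
	bool flush_requested;           /* set by updaters, cleared by the worker */
};

/* Update side: record the delta and ask for a flush once enough is pending. */
static void model_update(struct model_stats *s, int cpu, long delta)
{
	s->percpu_delta[cpu] += delta;
	s->pending += labs(delta);
	if (s->pending > MODEL_FLUSH_THRESHOLD)
		s->flush_requested = true;
}

/* Worker side: would run periodically or when a flush was requested. */
static void model_flush_work(struct model_stats *s)
{
	for (int cpu = 0; cpu < MODEL_NCPUS; cpu++) {
		s->aggregated += s->percpu_delta[cpu];
		s->percpu_delta[cpu] = 0;
	}
	s->pending = 0;
	s->flush_requested = false;
}

/* Read side: no flush at all, so the result may lag behind recent updates. */
static long model_read(const struct model_stats *s)
{
	return s->aggregated;
}
```

The trade-off in this sketch is that model_read() never does flush work, so it is safe from any context, but the value it returns can be stale by up to the threshold plus whatever accumulated since the last worker run.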


* Re: [PATCH] mm: memcontrol: fix blocking rstat function called from atomic cgroup1 thresholding code
  2021-07-27 16:51   ` Shakeel Butt
@ 2021-08-03 14:34     ` Rik van Riel
  0 siblings, 0 replies; 7+ messages in thread
From: Rik van Riel @ 2021-08-03 14:34 UTC (permalink / raw)
  To: shakeelb
  Cc: Kernel Team, dan.carpenter, hannes, cgroups, linux-mm,
	linux-kernel, akpm

On Tue, 2021-07-27 at 09:51 -0700, Shakeel Butt wrote:
> On Mon, Jul 26, 2021 at 8:19 AM Rik van Riel <riel@fb.com> wrote:
> > 
> > On Mon, 2021-07-26 at 11:00 -0400, Johannes Weiner wrote:
> > > 
> > > __mem_cgroup_threshold() indeed holds the rcu lock. In addition,
> > > the
> > > thresholding code is invoked during stat changes, and those
> > > contexts
> > > have irqs disabled as well. If the lock breaking occurs inside
> > > the
> > > flush function, it will result in a sleep from an atomic context.
> > > 
> > > Use the irqsafe flushing variant in mem_cgroup_usage() to fix this
> > 
> > While this fix is necessary, in the long term I think we may
> > want some sort of redesign here, to make sure the irq safe
> > version does not spin for a long time trying to get the statistics
> > off some other CPU.
> > 
> > I have seen a number of soft (IIRC) lockups deep inside the
> > bowels of cgroup_rstat_flush_irqsafe, with the function taking
> > multiple seconds to complete.
> 
> Can you please share a bit more detail on this lockup? I am wondering
> if this was due to the flush not happening more often and thus the
> update tree is large or if there are too many concurrent flushes
> happening.

I was not logged into any system while it happened, but
only found it later in the logs.

I suspect your explanation is the reason why it happened,
though.


