* [PATCH] mm: Do not stall register_shrinker
@ 2017-11-24  0:04 ` Minchan Kim
  0 siblings, 0 replies; 8+ messages in thread
From: Minchan Kim @ 2017-11-24  0:04 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, kernel-team, Minchan Kim, Michal Hocko,
	Tetsuo Handa, Shakeel Butt, Johannes Weiner

Shakeel Butt reported that, on a production system, the job loader
gets stuck for tens of seconds while doing a mount operation. It
turned out that it was stuck in register_shrinker() while some
unrelated job was under memory pressure and spending time in
shrink_slab(). Machines have a lot of shrinkers registered, and jobs
under memory pressure have to traverse all of the memcg-aware
shrinkers, which affects unrelated jobs that want to register their
own shrinkers.

To solve the issue, this patch simply bails out of slab shrinking
once it finds that someone wants to register a shrinker in parallel.
A downside is that it could cause unfair shrinking between shrinkers.
However, that should be rare, and we can add more complicated logic
if this turns out not to be enough.

Link: http://lkml.kernel.org/r/20171115005602.GB23810@bbox
Cc: Michal Hocko <mhocko@suse.com>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-and-tested-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 mm/vmscan.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 6a5a72baccd5..6698001787bd 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -486,6 +486,14 @@ static unsigned long shrink_slab(gfp_t gfp_mask, int nid,
 			sc.nid = 0;
 
 		freed += do_shrink_slab(&sc, shrinker, priority);
+		/*
+		 * bail out if someone want to register a new shrinker to
+		 * prevent long time stall by parallel ongoing shrinking.
+		 */
+		if (rwsem_is_contended(&shrinker_rwsem)) {
+			freed = freed ? : 1;
+			break;
+		}
 	}
 
 	up_read(&shrinker_rwsem);
-- 
2.7.4
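
A minimal userspace sketch of the bail-out pattern in the hunk above. It
is a toy model, not kernel code: struct toy_shrinker, toy_writers_waiting
and toy_shrink_slab() are hypothetical names, and the atomic counter only
stands in for rwsem_is_contended(&shrinker_rwsem).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a registered shrinker: one scan callback. */
struct toy_shrinker {
	unsigned long (*scan)(void *arg);	/* returns objects freed */
	void *arg;
};

/*
 * Hypothetical stand-in for rwsem_is_contended(&shrinker_rwsem): a
 * would-be writer bumps this counter before it blocks on the lock.
 */
atomic_int toy_writers_waiting;

/* Walk the shrinkers, but stop early once a writer is waiting. */
unsigned long toy_shrink_slab(struct toy_shrinker *shrinkers, size_t n)
{
	unsigned long freed = 0;
	size_t i;

	assert(shrinkers != NULL || n == 0);

	for (i = 0; i < n; i++) {
		freed += shrinkers[i].scan(shrinkers[i].arg);

		if (atomic_load(&toy_writers_waiting) > 0) {
			/* Same result as the GNU C shorthand "freed ? : 1". */
			freed = freed ? freed : 1;
			break;
		}
	}
	return freed;
}

/* Example callback: counts the call and frees nothing. */
unsigned long toy_scan_count(void *arg)
{
	int *calls = arg;

	(*calls)++;
	return 0;
}

/* Example callback: counts the call, then simulates a waiting writer. */
unsigned long toy_scan_then_contend(void *arg)
{
	int *calls = arg;

	(*calls)++;
	atomic_store(&toy_writers_waiting, 1);
	return 0;
}
```

With three shrinkers where the second one simulates a waiting writer, the
walk stops after two callbacks and reports 1 rather than 0, so a caller
does not mistake the early exit for "nothing left to shrink".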

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] mm: Do not stall register_shrinker
  2017-11-24  0:04 ` Minchan Kim
@ 2017-11-24  7:50   ` Michal Hocko
  -1 siblings, 0 replies; 8+ messages in thread
From: Michal Hocko @ 2017-11-24  7:50 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Andrew Morton, linux-mm, linux-kernel, kernel-team, Tetsuo Handa,
	Shakeel Butt, Johannes Weiner

On Fri 24-11-17 09:04:59, Minchan Kim wrote:
> Shakeel Butt reported that, on a production system, the job loader
> gets stuck for tens of seconds while doing a mount operation. It
> turned out that it was stuck in register_shrinker() while some
> unrelated job was under memory pressure and spending time in
> shrink_slab(). Machines have a lot of shrinkers registered, and jobs
> under memory pressure have to traverse all of the memcg-aware
> shrinkers, which affects unrelated jobs that want to register their
> own shrinkers.
> 
> To solve the issue, this patch simply bails out of slab shrinking
> once it finds that someone wants to register a shrinker in parallel.
> A downside is that it could cause unfair shrinking between shrinkers.
> However, that should be rare, and we can add more complicated logic
> if this turns out not to be enough.
> 
> Link: http://lkml.kernel.org/r/20171115005602.GB23810@bbox
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
> Reported-and-tested-by: Shakeel Butt <shakeelb@google.com>
> Signed-off-by: Shakeel Butt <shakeelb@google.com>
> Signed-off-by: Minchan Kim <minchan@kernel.org>

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  mm/vmscan.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 6a5a72baccd5..6698001787bd 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -486,6 +486,14 @@ static unsigned long shrink_slab(gfp_t gfp_mask, int nid,
>  			sc.nid = 0;
>  
>  		freed += do_shrink_slab(&sc, shrinker, priority);
> +		/*
> +		 * bail out if someone want to register a new shrinker to
> +		 * prevent long time stall by parallel ongoing shrinking.
> +		 */
> +		if (rwsem_is_contended(&shrinker_rwsem)) {
> +			freed = freed ? : 1;
> +			break;
> +		}
>  	}
>  
>  	up_read(&shrinker_rwsem);
> -- 
> 2.7.4
> 

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] mm: Do not stall register_shrinker
  2017-11-24  0:04 ` Minchan Kim
@ 2017-11-27  5:46   ` Anshuman Khandual
  -1 siblings, 0 replies; 8+ messages in thread
From: Anshuman Khandual @ 2017-11-27  5:46 UTC (permalink / raw)
  To: Minchan Kim, Andrew Morton
  Cc: linux-mm, linux-kernel, kernel-team, Michal Hocko, Tetsuo Handa,
	Shakeel Butt, Johannes Weiner

On 11/24/2017 05:34 AM, Minchan Kim wrote:
> Shakeel Butt reported that, on a production system, the job loader
> gets stuck for tens of seconds while doing a mount operation. It
> turned out that it was stuck in register_shrinker() while some
> unrelated job was under memory pressure and spending time in
> shrink_slab(). Machines have a lot of shrinkers registered, and jobs
> under memory pressure have to traverse all of the memcg-aware
> shrinkers, which affects unrelated jobs that want to register their
> own shrinkers.
> 
> To solve the issue, this patch simply bails out of slab shrinking
> once it finds that someone wants to register a shrinker in parallel.
> A downside is that it could cause unfair shrinking between shrinkers.
> However, that should be rare, and we can add more complicated logic
> if this turns out not to be enough.
> 
> Link: http://lkml.kernel.org/r/20171115005602.GB23810@bbox
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
> Reported-and-tested-by: Shakeel Butt <shakeelb@google.com>
> Signed-off-by: Shakeel Butt <shakeelb@google.com>
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
>  mm/vmscan.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 6a5a72baccd5..6698001787bd 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -486,6 +486,14 @@ static unsigned long shrink_slab(gfp_t gfp_mask, int nid,
>  			sc.nid = 0;
>  
>  		freed += do_shrink_slab(&sc, shrinker, priority);
> +		/*
> +		 * bail out if someone want to register a new shrinker to
> +		 * prevent long time stall by parallel ongoing shrinking.
> +		 */
> +		if (rwsem_is_contended(&shrinker_rwsem)) {
> +			freed = freed ? : 1;
> +			break;
> +		}

This is similar to how it aborts when it cannot grab shrinker_rwsem
at the beginning:

if (!down_read_trylock(&shrinker_rwsem)) {
	/*
	 * If we would return 0, our callers would understand that we
	 * have nothing else to shrink and give up trying. By returning
	 * 1 we keep it going and assume we'll be able to shrink next
	 * time.
	 */
	freed = 1;
	goto out;
}

Right now, shrink_slab() is called from three places: twice in
shrink_node() and once in drop_slab_node(). But the return value of
shrink_slab() is checked only inside drop_slab_node(), which uses a
heuristic to decide whether to keep scanning the registered memcg
shrinkers.

The question is whether aborting here still guarantees forward progress
for all the contexts that might be attempting to allocate memory and
have eventually invoked shrink_slab(). The memory allocation request
may have higher priority than a context that is delayed a bit while
stuck waiting on shrinker_rwsem.
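
To make the return-value concern concrete, here is a small sketch of a
caller that treats a return of 0 as "nothing left to shrink". It is only
an illustration under stated assumptions: struct toy_reclaim_state,
toy_shrink_pass() and toy_reclaim() are hypothetical and do not reproduce
the logic of shrink_node() or drop_slab_node().

```c
#include <stddef.h>

/* Hypothetical model of a reclaim target that is sometimes contended. */
struct toy_reclaim_state {
	int contended_passes;		/* passes that bail out early */
	unsigned long backlog;		/* objects still reclaimable */
	unsigned long bail_value;	/* value reported on bail-out */
};

/* One pass: either bail out because of contention, or free the backlog. */
unsigned long toy_shrink_pass(struct toy_reclaim_state *st)
{
	unsigned long freed;

	if (st->contended_passes > 0) {
		st->contended_passes--;
		return st->bail_value;
	}

	freed = st->backlog;
	st->backlog = 0;
	return freed;
}

/*
 * Hypothetical caller: keep retrying while passes report progress,
 * and give up as soon as one pass reports 0.
 */
unsigned long toy_reclaim(struct toy_reclaim_state *st, int max_passes)
{
	unsigned long total = 0;
	int pass;

	for (pass = 0; pass < max_passes; pass++) {
		unsigned long freed = toy_shrink_pass(st);

		if (freed == 0)
			break;	/* caller believes nothing is left */
		total += freed;
	}
	return total;
}
```

In this model, a bail-out that reports 1 lets the caller come back and
drain the backlog on a later pass, while a bail-out that reports 0 makes
the caller stop with the backlog untouched.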

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] mm: Do not stall register_shrinker
  2017-11-27  5:46   ` Anshuman Khandual
@ 2017-11-27  6:31     ` Minchan Kim
  -1 siblings, 0 replies; 8+ messages in thread
From: Minchan Kim @ 2017-11-27  6:31 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: Andrew Morton, linux-mm, linux-kernel, kernel-team, Michal Hocko,
	Tetsuo Handa, Shakeel Butt, Johannes Weiner

On Mon, Nov 27, 2017 at 11:16:46AM +0530, Anshuman Khandual wrote:
> On 11/24/2017 05:34 AM, Minchan Kim wrote:
> > Shakeel Butt reported that, on a production system, the job loader
> > gets stuck for tens of seconds while doing a mount operation. It
> > turned out that it was stuck in register_shrinker() while some
> > unrelated job was under memory pressure and spending time in
> > shrink_slab(). Machines have a lot of shrinkers registered, and jobs
> > under memory pressure have to traverse all of the memcg-aware
> > shrinkers, which affects unrelated jobs that want to register their
> > own shrinkers.
> > 
> > To solve the issue, this patch simply bails out of slab shrinking
> > once it finds that someone wants to register a shrinker in parallel.
> > A downside is that it could cause unfair shrinking between shrinkers.
> > However, that should be rare, and we can add more complicated logic
> > if this turns out not to be enough.
> > 
> > Link: http://lkml.kernel.org/r/20171115005602.GB23810@bbox
> > Cc: Michal Hocko <mhocko@suse.com>
> > Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> > Acked-by: Johannes Weiner <hannes@cmpxchg.org>
> > Reported-and-tested-by: Shakeel Butt <shakeelb@google.com>
> > Signed-off-by: Shakeel Butt <shakeelb@google.com>
> > Signed-off-by: Minchan Kim <minchan@kernel.org>
> > ---
> >  mm/vmscan.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> > 
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 6a5a72baccd5..6698001787bd 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -486,6 +486,14 @@ static unsigned long shrink_slab(gfp_t gfp_mask, int nid,
> >  			sc.nid = 0;
> >  
> >  		freed += do_shrink_slab(&sc, shrinker, priority);
> > +		/*
> > +		 * bail out if someone want to register a new shrinker to
> > +		 * prevent long time stall by parallel ongoing shrinking.
> > +		 */
> > +		if (rwsem_is_contended(&shrinker_rwsem)) {
> > +			freed = freed ? : 1;
> > +			break;
> > +		}
> 
> This is similar to how it aborts when it cannot grab shrinker_rwsem
> at the beginning:
> 
> if (!down_read_trylock(&shrinker_rwsem)) {
> 	/*
> 	 * If we would return 0, our callers would understand that we
> 	 * have nothing else to shrink and give up trying. By returning
> 	 * 1 we keep it going and assume we'll be able to shrink next
> 	 * time.
> 	 */
> 	freed = 1;
> 	goto out;
> }
> 
> Right now, shrink_slab() is called from three places: twice in
> shrink_node() and once in drop_slab_node(). But the return value of
> shrink_slab() is checked only inside drop_slab_node(), which uses a
> heuristic to decide whether to keep scanning the registered memcg
> shrinkers.
> 
> The question is whether aborting here still guarantees forward progress
> for all the contexts that might be attempting to allocate memory and
> have eventually invoked shrink_slab(). The memory allocation request
> may have higher priority than a context that is delayed a bit while
> stuck waiting on shrinker_rwsem.

Some routines used to rely on the return value of an individual
shrink_slab() call to make decisions about reclaim progress, so an early
bail-out could have affected the overall progress of shrinking at that
time. However, we have removed that heuristic and unified it with the
forward-progress check across MAX_RECLAIM_RETRIES attempts, so I don't
think it makes a big difference.

Thanks.
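
A rough sketch of what a bounded "no progress" retry check can look like.
The names toy_should_retry() and TOY_MAX_RECLAIM_RETRIES are hypothetical,
the limit of 16 is an assumption made for this illustration, and the
kernel's real check takes more inputs than this toy does.

```c
#include <stdbool.h>

/* Assumed retry budget for this sketch only. */
#define TOY_MAX_RECLAIM_RETRIES 16

/*
 * Decide whether the caller should try reclaim again. Any progress
 * resets the counter; consecutive attempts without progress use up
 * the budget, after which the caller should stop retrying.
 */
bool toy_should_retry(unsigned long progress, int *no_progress_loops)
{
	if (progress)
		*no_progress_loops = 0;
	else
		(*no_progress_loops)++;

	return *no_progress_loops <= TOY_MAX_RECLAIM_RETRIES;
}
```

Under this kind of scheme, a single shrink_slab() call that bails out
early costs at most one attempt from the budget instead of ending reclaim
outright.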

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2017-11-27  6:31 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
2017-11-24  0:04 [PATCH] mm: Do not stall register_shrinker Minchan Kim
2017-11-24  0:04 ` Minchan Kim
2017-11-24  7:50 ` Michal Hocko
2017-11-24  7:50   ` Michal Hocko
2017-11-27  5:46 ` Anshuman Khandual
2017-11-27  5:46   ` Anshuman Khandual
2017-11-27  6:31   ` Minchan Kim
2017-11-27  6:31     ` Minchan Kim
