* [PATCH v2] mm, shrinker: make shrinker_list lockless
@ 2017-11-08 17:37 Shakeel Butt
  2017-11-08 17:58 ` Greg Thelen
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Shakeel Butt @ 2017-11-08 17:37 UTC (permalink / raw)
  To: Minchan Kim, Huang Ying, Mel Gorman, Vladimir Davydov,
	Michal Hocko, Greg Thelen, Johannes Weiner, Andrew Morton
  Cc: linux-mm, linux-kernel, Shakeel Butt

In our production we have observed that the job loader gets stuck for
tens of seconds while doing a mount operation. It turns out that it was
stuck in register_shrinker() while some unrelated job was under memory
pressure and spending time in shrink_slab(). Our machines have a lot
of shrinkers registered, and jobs under memory pressure have to traverse
all of those memcg-aware shrinkers, which in turn affects unrelated
jobs that want to register their own shrinkers.

This patch makes the shrinker_list traversal lockless so that shrinker
registration remains fast. For shrinker unregistration, an atomic
counter has been introduced to avoid synchronize_rcu(). The fields of
struct shrinker have been rearranged to make sure that its size does
not increase on x86_64.

The shrinker functions are allowed to reschedule and thus cannot be
called under the RCU read lock. One way to resolve that is to use an
SRCU read lock, but then ifdefs have to be used as SRCU is behind
CONFIG_SRCU. Another way is to just release the RCU read lock before
calling the shrinker and reacquire it on return. The atomic counter
makes sure that the shrinker entry will not be freed under us.

Signed-off-by: Shakeel Butt <shakeelb@google.com>
---
Changelog since v1:
- release and reacquire rcu lock across shrinker call.

 include/linux/shrinker.h |  4 +++-
 mm/vmscan.c              | 54 ++++++++++++++++++++++++++++++------------------
 2 files changed, 37 insertions(+), 21 deletions(-)

diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h
index 388ff2936a87..434b76ef9367 100644
--- a/include/linux/shrinker.h
+++ b/include/linux/shrinker.h
@@ -60,14 +60,16 @@ struct shrinker {
 	unsigned long (*scan_objects)(struct shrinker *,
 				      struct shrink_control *sc);
 
+	unsigned int flags;
 	int seeks;	/* seeks to recreate an obj */
 	long batch;	/* reclaim batch size, 0 = default */
-	unsigned long flags;
 
 	/* These are for internal use */
 	struct list_head list;
 	/* objs pending delete, per node */
 	atomic_long_t *nr_deferred;
+	/* Number of active do_shrink_slab calls to this shrinker */
+	atomic_t nr_active;
 };
 #define DEFAULT_SEEKS 2 /* A good number if you don't know better. */
 
diff --git a/mm/vmscan.c b/mm/vmscan.c
index eb2f0315b8c0..6cec46ac6d95 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -157,7 +157,7 @@ int vm_swappiness = 60;
 unsigned long vm_total_pages;
 
 static LIST_HEAD(shrinker_list);
-static DECLARE_RWSEM(shrinker_rwsem);
+static DEFINE_SPINLOCK(shrinker_lock);
 
 #ifdef CONFIG_MEMCG
 static bool global_reclaim(struct scan_control *sc)
@@ -285,21 +285,42 @@ int register_shrinker(struct shrinker *shrinker)
 	if (!shrinker->nr_deferred)
 		return -ENOMEM;
 
-	down_write(&shrinker_rwsem);
-	list_add_tail(&shrinker->list, &shrinker_list);
-	up_write(&shrinker_rwsem);
+	atomic_set(&shrinker->nr_active, 0);
+	spin_lock(&shrinker_lock);
+	list_add_tail_rcu(&shrinker->list, &shrinker_list);
+	spin_unlock(&shrinker_lock);
 	return 0;
 }
 EXPORT_SYMBOL(register_shrinker);
 
+static void get_shrinker(struct shrinker *shrinker)
+{
+	atomic_inc(&shrinker->nr_active);
+	rcu_read_unlock();
+}
+
+static void put_shrinker(struct shrinker *shrinker)
+{
+	rcu_read_lock();
+	if (!atomic_dec_return(&shrinker->nr_active))
+		wake_up_atomic_t(&shrinker->nr_active);
+}
+
+static int shrinker_wait_atomic_t(atomic_t *p)
+{
+	schedule();
+	return 0;
+}
 /*
  * Remove one
  */
 void unregister_shrinker(struct shrinker *shrinker)
 {
-	down_write(&shrinker_rwsem);
-	list_del(&shrinker->list);
-	up_write(&shrinker_rwsem);
+	spin_lock(&shrinker_lock);
+	list_del_rcu(&shrinker->list);
+	spin_unlock(&shrinker_lock);
+	wait_on_atomic_t(&shrinker->nr_active, shrinker_wait_atomic_t,
+			 TASK_UNINTERRUPTIBLE);
 	kfree(shrinker->nr_deferred);
 }
 EXPORT_SYMBOL(unregister_shrinker);
@@ -468,18 +489,9 @@ static unsigned long shrink_slab(gfp_t gfp_mask, int nid,
 	if (nr_scanned == 0)
 		nr_scanned = SWAP_CLUSTER_MAX;
 
-	if (!down_read_trylock(&shrinker_rwsem)) {
-		/*
-		 * If we would return 0, our callers would understand that we
-		 * have nothing else to shrink and give up trying. By returning
-		 * 1 we keep it going and assume we'll be able to shrink next
-		 * time.
-		 */
-		freed = 1;
-		goto out;
-	}
+	rcu_read_lock();
 
-	list_for_each_entry(shrinker, &shrinker_list, list) {
+	list_for_each_entry_rcu(shrinker, &shrinker_list, list) {
 		struct shrink_control sc = {
 			.gfp_mask = gfp_mask,
 			.nid = nid,
@@ -498,11 +510,13 @@ static unsigned long shrink_slab(gfp_t gfp_mask, int nid,
 		if (!(shrinker->flags & SHRINKER_NUMA_AWARE))
 			sc.nid = 0;
 
+		get_shrinker(shrinker);
 		freed += do_shrink_slab(&sc, shrinker, nr_scanned, nr_eligible);
+		put_shrinker(shrinker);
 	}
 
-	up_read(&shrinker_rwsem);
-out:
+	rcu_read_unlock();
+
 	cond_resched();
 	return freed;
 }
-- 
2.15.0.403.gc27cc4dac6-goog

* Re: [PATCH v2] mm, shrinker: make shrinker_list lockless
  2017-11-08 17:37 [PATCH v2] mm, shrinker: make shrinker_list lockless Shakeel Butt
@ 2017-11-08 17:58 ` Greg Thelen
  2017-11-09  0:07 ` Minchan Kim
  2017-11-09 10:26 ` Tetsuo Handa
  2 siblings, 0 replies; 9+ messages in thread
From: Greg Thelen @ 2017-11-08 17:58 UTC (permalink / raw)
  To: Shakeel Butt; +Cc: linux-mm, linux-kernel, Shakeel Butt

(off list)

Shakeel Butt <shakeelb@google.com> wrote:

> In our production we have observed that the job loader gets stuck for
> tens of seconds while doing a mount operation. It turns out that it was
> stuck in register_shrinker() while some unrelated job was under memory
> pressure and spending time in shrink_slab(). Our machines have a lot
> of shrinkers registered, and jobs under memory pressure have to traverse
> all of those memcg-aware shrinkers, which in turn affects unrelated
> jobs that want to register their own shrinkers.
>
> This patch makes the shrinker_list traversal lockless so that shrinker
> registration remains fast. For shrinker unregistration, an atomic
> counter has been introduced to avoid synchronize_rcu(). The fields of
> struct shrinker have been rearranged to make sure that its size does
> not increase on x86_64.
>
> The shrinker functions are allowed to reschedule and thus cannot be
> called under the RCU read lock. One way to resolve that is to use an
> SRCU read lock, but then ifdefs have to be used as SRCU is behind
> CONFIG_SRCU. Another way is to just release the RCU read lock before
> calling the shrinker and reacquire it on return. The atomic counter
> makes sure that the shrinker entry will not be freed under us.
>
> Signed-off-by: Shakeel Butt <shakeelb@google.com>
> ---
> Changelog since v1:
> - release and reacquire rcu lock across shrinker call.
>
>  include/linux/shrinker.h |  4 +++-
>  mm/vmscan.c              | 54 ++++++++++++++++++++++++++++++------------------
>  2 files changed, 37 insertions(+), 21 deletions(-)
>
> diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h
> index 388ff2936a87..434b76ef9367 100644
> --- a/include/linux/shrinker.h
> +++ b/include/linux/shrinker.h
> @@ -60,14 +60,16 @@ struct shrinker {
>  	unsigned long (*scan_objects)(struct shrinker *,
>  				      struct shrink_control *sc);
>  
> +	unsigned int flags;
>  	int seeks;	/* seeks to recreate an obj */
>  	long batch;	/* reclaim batch size, 0 = default */
> -	unsigned long flags;
>  
>  	/* These are for internal use */
>  	struct list_head list;
>  	/* objs pending delete, per node */
>  	atomic_long_t *nr_deferred;
> +	/* Number of active do_shrink_slab calls to this shrinker */
> +	atomic_t nr_active;
>  };
>  #define DEFAULT_SEEKS 2 /* A good number if you don't know better. */
>  
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index eb2f0315b8c0..6cec46ac6d95 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -157,7 +157,7 @@ int vm_swappiness = 60;
>  unsigned long vm_total_pages;
>  
>  static LIST_HEAD(shrinker_list);
> -static DECLARE_RWSEM(shrinker_rwsem);
> +static DEFINE_SPINLOCK(shrinker_lock);
>  
>  #ifdef CONFIG_MEMCG
>  static bool global_reclaim(struct scan_control *sc)
> @@ -285,21 +285,42 @@ int register_shrinker(struct shrinker *shrinker)
>  	if (!shrinker->nr_deferred)
>  		return -ENOMEM;
>  
> -	down_write(&shrinker_rwsem);
> -	list_add_tail(&shrinker->list, &shrinker_list);
> -	up_write(&shrinker_rwsem);
> +	atomic_set(&shrinker->nr_active, 0);
> +	spin_lock(&shrinker_lock);
> +	list_add_tail_rcu(&shrinker->list, &shrinker_list);
> +	spin_unlock(&shrinker_lock);
>  	return 0;
>  }
>  EXPORT_SYMBOL(register_shrinker);
>  
> +static void get_shrinker(struct shrinker *shrinker)
> +{
> +	atomic_inc(&shrinker->nr_active);
> +	rcu_read_unlock();
> +}
> +
> +static void put_shrinker(struct shrinker *shrinker)
> +{
> +	rcu_read_lock();
> +	if (!atomic_dec_return(&shrinker->nr_active))
> +		wake_up_atomic_t(&shrinker->nr_active);
> +}
> +
> +static int shrinker_wait_atomic_t(atomic_t *p)
> +{
> +	schedule();
> +	return 0;
> +}
>  /*
>   * Remove one
>   */
>  void unregister_shrinker(struct shrinker *shrinker)
>  {
> -	down_write(&shrinker_rwsem);
> -	list_del(&shrinker->list);
> -	up_write(&shrinker_rwsem);
> +	spin_lock(&shrinker_lock);
> +	list_del_rcu(&shrinker->list);
> +	spin_unlock(&shrinker_lock);
> +	wait_on_atomic_t(&shrinker->nr_active, shrinker_wait_atomic_t,
> +			 TASK_UNINTERRUPTIBLE);

What keeps us from returning to the caller which could kfree the
shrinker before shrink_slab() uses it for list iteration?
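
Roughly, the interleaving I'm worried about looks something like this
(just a sketch against the code above, to illustrate the question):

  shrink_slab()                         unregister_shrinker()
  -------------                         ---------------------
  rcu_read_lock()
  finds shrinker A via
    list_for_each_entry_rcu()
  get_shrinker(A)   /* nr_active = 1 */
  do_shrink_slab(A)
  put_shrinker(A)   /* nr_active = 0 */
                                        list_del_rcu(&A->list)
                                        wait_on_atomic_t() sees 0, returns
                                        kfree(A->nr_deferred)
                                        caller kfrees A itself
  iterator advances via A->list.next    /* use-after-free */
  rcu_read_unlock()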

>  	kfree(shrinker->nr_deferred);
>  }
>  EXPORT_SYMBOL(unregister_shrinker);
> @@ -468,18 +489,9 @@ static unsigned long shrink_slab(gfp_t gfp_mask, int nid,
>  	if (nr_scanned == 0)
>  		nr_scanned = SWAP_CLUSTER_MAX;
>  
> -	if (!down_read_trylock(&shrinker_rwsem)) {
> -		/*
> -		 * If we would return 0, our callers would understand that we
> -		 * have nothing else to shrink and give up trying. By returning
> -		 * 1 we keep it going and assume we'll be able to shrink next
> -		 * time.
> -		 */
> -		freed = 1;
> -		goto out;
> -	}
> +	rcu_read_lock();
>  
> -	list_for_each_entry(shrinker, &shrinker_list, list) {
> +	list_for_each_entry_rcu(shrinker, &shrinker_list, list) {
>  		struct shrink_control sc = {
>  			.gfp_mask = gfp_mask,
>  			.nid = nid,
> @@ -498,11 +510,13 @@ static unsigned long shrink_slab(gfp_t gfp_mask, int nid,
>  		if (!(shrinker->flags & SHRINKER_NUMA_AWARE))
>  			sc.nid = 0;
>  
> +		get_shrinker(shrinker);
>  		freed += do_shrink_slab(&sc, shrinker, nr_scanned, nr_eligible);
> +		put_shrinker(shrinker);
>  	}
>  
> -	up_read(&shrinker_rwsem);
> -out:
> +	rcu_read_unlock();
> +
>  	cond_resched();
>  	return freed;
>  }
> -- 
> 2.15.0.403.gc27cc4dac6-goog

* Re: [PATCH v2] mm, shrinker: make shrinker_list lockless
  2017-11-08 17:37 [PATCH v2] mm, shrinker: make shrinker_list lockless Shakeel Butt
  2017-11-08 17:58 ` Greg Thelen
@ 2017-11-09  0:07 ` Minchan Kim
  2017-11-09  1:07   ` Shakeel Butt
  2017-11-09 10:26 ` Tetsuo Handa
  2 siblings, 1 reply; 9+ messages in thread
From: Minchan Kim @ 2017-11-09  0:07 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Huang Ying, Mel Gorman, Vladimir Davydov, Michal Hocko,
	Greg Thelen, Johannes Weiner, Andrew Morton, linux-mm,
	linux-kernel

Hi,

On Wed, Nov 08, 2017 at 09:37:40AM -0800, Shakeel Butt wrote:
> In our production we have observed that the job loader gets stuck for
> tens of seconds while doing a mount operation. It turns out that it was
> stuck in register_shrinker() while some unrelated job was under memory
> pressure and spending time in shrink_slab(). Our machines have a lot
> of shrinkers registered, and jobs under memory pressure have to traverse
> all of those memcg-aware shrinkers, which in turn affects unrelated
> jobs that want to register their own shrinkers.
>
> This patch makes the shrinker_list traversal lockless so that shrinker
> registration remains fast. For shrinker unregistration, an atomic
> counter has been introduced to avoid synchronize_rcu(). The fields of

So, do you want to improve the unregister path as well as the register path?

> struct shrinker have been rearranged to make sure that its size does
> not increase on x86_64.
> 
> The shrinker functions are allowed to reschedule and thus cannot be
> called under the RCU read lock. One way to resolve that is to use an
> SRCU read lock, but then ifdefs have to be used as SRCU is behind
> CONFIG_SRCU. Another way is to just release the RCU read lock before
> calling the shrinker and reacquire it on return. The atomic counter
> makes sure that the shrinker entry will not be freed under us.

Instead of adding a new lock, could we simply release the shrinker_rwsem
read-side lock periodically during the list traversal, to give a chance
to take the write-side lock?
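
Something like this is what I have in mind (a rough, untested sketch;
how to safely resume the traversal after dropping the lock is
hand-waved here):

	down_read(&shrinker_rwsem);
	list_for_each_entry(shrinker, &shrinker_list, list) {
		...
		freed += do_shrink_slab(&sc, shrinker, nr_scanned, nr_eligible);

		/* periodically let a writer (register_shrinker) in */
		up_read(&shrinker_rwsem);
		cond_resched();
		down_read(&shrinker_rwsem);
		/*
		 * 'shrinker' may have been unregistered while the lock
		 * was dropped, so the loop would have to restart or
		 * otherwise revalidate its position here.
		 */
	}
	up_read(&shrinker_rwsem);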

> 
> Signed-off-by: Shakeel Butt <shakeelb@google.com>
> ---
> Changelog since v1:
> - release and reacquire rcu lock across shrinker call.
> 
>  include/linux/shrinker.h |  4 +++-
>  mm/vmscan.c              | 54 ++++++++++++++++++++++++++++++------------------
>  2 files changed, 37 insertions(+), 21 deletions(-)
> 
> diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h
> index 388ff2936a87..434b76ef9367 100644
> --- a/include/linux/shrinker.h
> +++ b/include/linux/shrinker.h
> @@ -60,14 +60,16 @@ struct shrinker {
>  	unsigned long (*scan_objects)(struct shrinker *,
>  				      struct shrink_control *sc);
>  
> +	unsigned int flags;
>  	int seeks;	/* seeks to recreate an obj */
>  	long batch;	/* reclaim batch size, 0 = default */
> -	unsigned long flags;
>  
>  	/* These are for internal use */
>  	struct list_head list;
>  	/* objs pending delete, per node */
>  	atomic_long_t *nr_deferred;
> +	/* Number of active do_shrink_slab calls to this shrinker */
> +	atomic_t nr_active;
>  };
>  #define DEFAULT_SEEKS 2 /* A good number if you don't know better. */
>  
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index eb2f0315b8c0..6cec46ac6d95 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -157,7 +157,7 @@ int vm_swappiness = 60;
>  unsigned long vm_total_pages;
>  
>  static LIST_HEAD(shrinker_list);
> -static DECLARE_RWSEM(shrinker_rwsem);
> +static DEFINE_SPINLOCK(shrinker_lock);
>  
>  #ifdef CONFIG_MEMCG
>  static bool global_reclaim(struct scan_control *sc)
> @@ -285,21 +285,42 @@ int register_shrinker(struct shrinker *shrinker)
>  	if (!shrinker->nr_deferred)
>  		return -ENOMEM;
>  
> -	down_write(&shrinker_rwsem);
> -	list_add_tail(&shrinker->list, &shrinker_list);
> -	up_write(&shrinker_rwsem);
> +	atomic_set(&shrinker->nr_active, 0);
> +	spin_lock(&shrinker_lock);
> +	list_add_tail_rcu(&shrinker->list, &shrinker_list);
> +	spin_unlock(&shrinker_lock);
>  	return 0;
>  }
>  EXPORT_SYMBOL(register_shrinker);
>  
> +static void get_shrinker(struct shrinker *shrinker)
> +{
> +	atomic_inc(&shrinker->nr_active);
> +	rcu_read_unlock();
> +}
> +
> +static void put_shrinker(struct shrinker *shrinker)
> +{
> +	rcu_read_lock();
> +	if (!atomic_dec_return(&shrinker->nr_active))
> +		wake_up_atomic_t(&shrinker->nr_active);
> +}
> +
> +static int shrinker_wait_atomic_t(atomic_t *p)
> +{
> +	schedule();
> +	return 0;
> +}
>  /*
>   * Remove one
>   */
>  void unregister_shrinker(struct shrinker *shrinker)
>  {
> -	down_write(&shrinker_rwsem);
> -	list_del(&shrinker->list);
> -	up_write(&shrinker_rwsem);
> +	spin_lock(&shrinker_lock);
> +	list_del_rcu(&shrinker->list);
> +	spin_unlock(&shrinker_lock);
> +	wait_on_atomic_t(&shrinker->nr_active, shrinker_wait_atomic_t,
> +			 TASK_UNINTERRUPTIBLE);
>  	kfree(shrinker->nr_deferred);
>  }
>  EXPORT_SYMBOL(unregister_shrinker);
> @@ -468,18 +489,9 @@ static unsigned long shrink_slab(gfp_t gfp_mask, int nid,
>  	if (nr_scanned == 0)
>  		nr_scanned = SWAP_CLUSTER_MAX;
>  
> -	if (!down_read_trylock(&shrinker_rwsem)) {
> -		/*
> -		 * If we would return 0, our callers would understand that we
> -		 * have nothing else to shrink and give up trying. By returning
> -		 * 1 we keep it going and assume we'll be able to shrink next
> -		 * time.
> -		 */
> -		freed = 1;
> -		goto out;
> -	}
> +	rcu_read_lock();
>  
> -	list_for_each_entry(shrinker, &shrinker_list, list) {
> +	list_for_each_entry_rcu(shrinker, &shrinker_list, list) {
>  		struct shrink_control sc = {
>  			.gfp_mask = gfp_mask,
>  			.nid = nid,
> @@ -498,11 +510,13 @@ static unsigned long shrink_slab(gfp_t gfp_mask, int nid,
>  		if (!(shrinker->flags & SHRINKER_NUMA_AWARE))
>  			sc.nid = 0;
>  
> +		get_shrinker(shrinker);
>  		freed += do_shrink_slab(&sc, shrinker, nr_scanned, nr_eligible);
> +		put_shrinker(shrinker);
>  	}
>  
> -	up_read(&shrinker_rwsem);
> -out:
> +	rcu_read_unlock();
> +
>  	cond_resched();
>  	return freed;
>  }
> -- 
> 2.15.0.403.gc27cc4dac6-goog
> 

* Re: [PATCH v2] mm, shrinker: make shrinker_list lockless
  2017-11-09  0:07 ` Minchan Kim
@ 2017-11-09  1:07   ` Shakeel Butt
  2017-11-09  1:40     ` Minchan Kim
  0 siblings, 1 reply; 9+ messages in thread
From: Shakeel Butt @ 2017-11-09  1:07 UTC (permalink / raw)
  To: Minchan Kim
  Cc: Huang Ying, Mel Gorman, Vladimir Davydov, Michal Hocko,
	Greg Thelen, Johannes Weiner, Andrew Morton, Linux MM, LKML

On Wed, Nov 8, 2017 at 4:07 PM, Minchan Kim <minchan@kernel.org> wrote:
> Hi,
>
> On Wed, Nov 08, 2017 at 09:37:40AM -0800, Shakeel Butt wrote:
>> In our production we have observed that the job loader gets stuck for
>> tens of seconds while doing a mount operation. It turns out that it was
>> stuck in register_shrinker() while some unrelated job was under memory
>> pressure and spending time in shrink_slab(). Our machines have a lot
>> of shrinkers registered, and jobs under memory pressure have to traverse
>> all of those memcg-aware shrinkers, which in turn affects unrelated
>> jobs that want to register their own shrinkers.
>>
>> This patch makes the shrinker_list traversal lockless so that shrinker
>> registration remains fast. For shrinker unregistration, an atomic
>> counter has been introduced to avoid synchronize_rcu(). The fields of
>
> So, do you want to improve the unregister path as well as the register path?
>

Yes, I don't want to add delay to unregister_shrinker() for the normal
case where there aren't any readers (i.e. an unconditional
synchronize_rcu()).
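
For reference, the variant I want to avoid would be roughly the
following (a sketch; it also assumes the walker keeps rcu_read_lock()
held across do_shrink_slab(), which sleeping shrinkers do not allow):

void unregister_shrinker(struct shrinker *shrinker)
{
	spin_lock(&shrinker_lock);
	list_del_rcu(&shrinker->list);
	spin_unlock(&shrinker_lock);
	/* pay a full grace period even when nobody is reclaiming */
	synchronize_rcu();
	kfree(shrinker->nr_deferred);
}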

>> struct shrinker have been rearranged to make sure that its size does
>> not increase on x86_64.
>>
>> The shrinker functions are allowed to reschedule and thus cannot be
>> called under the RCU read lock. One way to resolve that is to use an
>> SRCU read lock, but then ifdefs have to be used as SRCU is behind
>> CONFIG_SRCU. Another way is to just release the RCU read lock before
>> calling the shrinker and reacquire it on return. The atomic counter
>> makes sure that the shrinker entry will not be freed under us.
>
> Instead of adding a new lock, could we simply release the shrinker_rwsem
> read-side lock periodically during the list traversal, to give a chance
> to take the write-side lock?
>

Greg has already pointed out that this patch is still not right/safe,
and now I am coming to the opinion that, without changing the shrinker
API, it might not be possible to do a lockless shrinker traversal and
unregister a shrinker without synchronize_rcu().

Regarding your suggestion, do you mean to periodically release the lock
and reacquire it using down_read_trylock() or down_read()?

* Re: [PATCH v2] mm, shrinker: make shrinker_list lockless
  2017-11-09  1:07   ` Shakeel Butt
@ 2017-11-09  1:40     ` Minchan Kim
  0 siblings, 0 replies; 9+ messages in thread
From: Minchan Kim @ 2017-11-09  1:40 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Huang Ying, Mel Gorman, Vladimir Davydov, Michal Hocko,
	Greg Thelen, Johannes Weiner, Andrew Morton, Linux MM, LKML

On Wed, Nov 08, 2017 at 05:07:08PM -0800, Shakeel Butt wrote:
> On Wed, Nov 8, 2017 at 4:07 PM, Minchan Kim <minchan@kernel.org> wrote:
> > Hi,
> >
> > On Wed, Nov 08, 2017 at 09:37:40AM -0800, Shakeel Butt wrote:
> >> In our production we have observed that the job loader gets stuck for
> >> tens of seconds while doing a mount operation. It turns out that it was
> >> stuck in register_shrinker() while some unrelated job was under memory
> >> pressure and spending time in shrink_slab(). Our machines have a lot
> >> of shrinkers registered, and jobs under memory pressure have to traverse
> >> all of those memcg-aware shrinkers, which in turn affects unrelated
> >> jobs that want to register their own shrinkers.
> >>
> >> This patch makes the shrinker_list traversal lockless so that shrinker
> >> registration remains fast. For shrinker unregistration, an atomic
> >> counter has been introduced to avoid synchronize_rcu(). The fields of
> >
> > So, do you want to improve the unregister path as well as the register path?
> >
> 
> Yes, I don't want to add delay to unregister_shrinker() for the normal
> case where there aren't any readers (i.e. an unconditional
> synchronize_rcu()).

I am not sure how bad that is.
It would be better to explain in the description why the unregister path is
important and how synchronize_rcu() might make it slow for usual cases.

> 
> >> struct shrinker have been rearranged to make sure that its size does
> >> not increase on x86_64.
> >>
> >> The shrinker functions are allowed to reschedule and thus cannot be
> >> called under the RCU read lock. One way to resolve that is to use an
> >> SRCU read lock, but then ifdefs have to be used as SRCU is behind
> >> CONFIG_SRCU. Another way is to just release the RCU read lock before
> >> calling the shrinker and reacquire it on return. The atomic counter
> >> makes sure that the shrinker entry will not be freed under us.
> >
> > Instead of adding a new lock, could we simply release the shrinker_rwsem
> > read-side lock periodically during the list traversal, to give a chance
> > to take the write-side lock?
> >
> 
> Greg has already pointed out that this patch is still not right/safe,
> and now I am coming to the opinion that, without changing the shrinker
> API, it might not be possible to do a lockless shrinker traversal and
> unregister a shrinker without synchronize_rcu().
>
> Regarding your suggestion, do you mean to periodically release the lock
> and reacquire it using down_read_trylock() or down_read()?

Yup, with down_read(). Actually, I do not see the point of down_read_trylock()
considering that the write-lock path in register_shrinker() is very short.

The problem with the suggested approach is that we would have to traverse
the list from the beginning again after reacquiring, which breaks fairness
between shrinkers.

Maybe we can introduce an rwsem_is_contended() helper which checks the
wait_list and returns true if it is not empty.
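
For example, something like this in the shrink_slab() loop (untested
sketch; if such a helper does not already exist it would just test the
rwsem's wait_list):

	list_for_each_entry(shrinker, &shrinker_list, list) {
		...
		freed += do_shrink_slab(&sc, shrinker, nr_scanned, nr_eligible);

		if (rwsem_is_contended(&shrinker_rwsem)) {
			/* someone wants to register/unregister a shrinker */
			up_read(&shrinker_rwsem);
			cond_resched();
			down_read(&shrinker_rwsem);
			/*
			 * Restarting from the head of shrinker_list here
			 * is what hurts fairness between shrinkers.
			 */
		}
	}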

Thanks.




> 

* Re: [PATCH v2] mm, shrinker: make shrinker_list lockless
  2017-11-08 17:37 [PATCH v2] mm, shrinker: make shrinker_list lockless Shakeel Butt
  2017-11-08 17:58 ` Greg Thelen
  2017-11-09  0:07 ` Minchan Kim
@ 2017-11-09 10:26 ` Tetsuo Handa
  2017-11-09 15:34   ` Shakeel Butt
  2 siblings, 1 reply; 9+ messages in thread
From: Tetsuo Handa @ 2017-11-09 10:26 UTC (permalink / raw)
  To: Shakeel Butt, Minchan Kim, Huang Ying, Mel Gorman,
	Vladimir Davydov, Michal Hocko, Greg Thelen, Johannes Weiner,
	Andrew Morton
  Cc: linux-mm, linux-kernel

On 2017/11/09 2:37, Shakeel Butt wrote:
> In our production we have observed that the job loader gets stuck for
> tens of seconds while doing a mount operation. It turns out that it was
> stuck in register_shrinker() while some unrelated job was under memory
> pressure and spending time in shrink_slab(). Our machines have a lot
> of shrinkers registered, and jobs under memory pressure have to traverse
> all of those memcg-aware shrinkers, which in turn affects unrelated
> jobs that want to register their own shrinkers.
>
> This patch makes the shrinker_list traversal lockless so that shrinker
> registration remains fast. For shrinker unregistration, an atomic
> counter has been introduced to avoid synchronize_rcu(). The fields of
> struct shrinker have been rearranged to make sure that its size does
> not increase on x86_64.
>
> The shrinker functions are allowed to reschedule and thus cannot be
> called under the RCU read lock. One way to resolve that is to use an
> SRCU read lock, but then ifdefs have to be used as SRCU is behind
> CONFIG_SRCU. Another way is to just release the RCU read lock before
> calling the shrinker and reacquire it on return. The atomic counter
> makes sure that the shrinker entry will not be freed under us.
> 
> Signed-off-by: Shakeel Butt <shakeelb@google.com>
> ---
> Changelog since v1:
> - release and reacquire rcu lock across shrinker call.
> 
>  include/linux/shrinker.h |  4 +++-
>  mm/vmscan.c              | 54 ++++++++++++++++++++++++++++++------------------
>  2 files changed, 37 insertions(+), 21 deletions(-)
> 

If you can accept serialized register_shrinker()/unregister_shrinker(),
I think that something like shown below can do it.

----------
diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h
index 388ff29..e2272dd 100644
--- a/include/linux/shrinker.h
+++ b/include/linux/shrinker.h
@@ -62,9 +62,10 @@ struct shrinker {
 
 	int seeks;	/* seeks to recreate an obj */
 	long batch;	/* reclaim batch size, 0 = default */
-	unsigned long flags;
+	unsigned int flags;
 
 	/* These are for internal use */
+	atomic_t nr_active; /* Counted only if !SHRINKER_PERMANENT */
 	struct list_head list;
 	/* objs pending delete, per node */
 	atomic_long_t *nr_deferred;
@@ -74,6 +75,7 @@ struct shrinker {
 /* Flags */
 #define SHRINKER_NUMA_AWARE	(1 << 0)
 #define SHRINKER_MEMCG_AWARE	(1 << 1)
+#define SHRINKER_PERMANENT	(1 << 2)
 
 extern int register_shrinker(struct shrinker *);
 extern void unregister_shrinker(struct shrinker *);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 1c1bc95..e963359 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -157,7 +157,7 @@ struct scan_control {
 unsigned long vm_total_pages;
 
 static LIST_HEAD(shrinker_list);
-static DECLARE_RWSEM(shrinker_rwsem);
+static DEFINE_MUTEX(shrinker_lock);
 
 #ifdef CONFIG_MEMCG
 static bool global_reclaim(struct scan_control *sc)
@@ -285,9 +285,10 @@ int register_shrinker(struct shrinker *shrinker)
 	if (!shrinker->nr_deferred)
 		return -ENOMEM;
 
-	down_write(&shrinker_rwsem);
-	list_add_tail(&shrinker->list, &shrinker_list);
-	up_write(&shrinker_rwsem);
+	atomic_set(&shrinker->nr_active, 0);
+	mutex_lock(&shrinker_lock);
+	list_add_tail_rcu(&shrinker->list, &shrinker_list);
+	mutex_unlock(&shrinker_lock);
 	return 0;
 }
 EXPORT_SYMBOL(register_shrinker);
@@ -297,9 +298,14 @@ int register_shrinker(struct shrinker *shrinker)
  */
 void unregister_shrinker(struct shrinker *shrinker)
 {
-	down_write(&shrinker_rwsem);
-	list_del(&shrinker->list);
-	up_write(&shrinker_rwsem);
+	BUG_ON(shrinker->flags & SHRINKER_PERMANENT);
+	mutex_lock(&shrinker_lock);
+	list_del_rcu(&shrinker->list);
+	synchronize_rcu();
+	while (atomic_read(&shrinker->nr_active))
+		msleep(1);
+	synchronize_rcu();
+	mutex_unlock(&shrinker_lock);
 	kfree(shrinker->nr_deferred);
 }
 EXPORT_SYMBOL(unregister_shrinker);
@@ -468,18 +474,9 @@ static unsigned long shrink_slab(gfp_t gfp_mask, int nid,
 	if (nr_scanned == 0)
 		nr_scanned = SWAP_CLUSTER_MAX;
 
-	if (!down_read_trylock(&shrinker_rwsem)) {
-		/*
-		 * If we would return 0, our callers would understand that we
-		 * have nothing else to shrink and give up trying. By returning
-		 * 1 we keep it going and assume we'll be able to shrink next
-		 * time.
-		 */
-		freed = 1;
-		goto out;
-	}
-
-	list_for_each_entry(shrinker, &shrinker_list, list) {
+	rcu_read_lock();
+	list_for_each_entry_rcu(shrinker, &shrinker_list, list) {
+		bool permanent;
 		struct shrink_control sc = {
 			.gfp_mask = gfp_mask,
 			.nid = nid,
@@ -498,11 +495,16 @@ static unsigned long shrink_slab(gfp_t gfp_mask, int nid,
 		if (!(shrinker->flags & SHRINKER_NUMA_AWARE))
 			sc.nid = 0;
 
+		permanent = (shrinker->flags & SHRINKER_PERMANENT);
+		if (!permanent)
+			atomic_inc(&shrinker->nr_active);
+		rcu_read_unlock();
 		freed += do_shrink_slab(&sc, shrinker, nr_scanned, nr_eligible);
+		rcu_read_lock();
+		if (!permanent)
+			atomic_dec(&shrinker->nr_active);
 	}
-
-	up_read(&shrinker_rwsem);
-out:
+	rcu_read_unlock();
 	cond_resched();
 	return freed;
 }
----------

If you want parallel register_shrinker()/unregister_shrinker(), something like
shown below on top of shown above will do it.

----------
diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h
index e2272dd..471b2f6 100644
--- a/include/linux/shrinker.h
+++ b/include/linux/shrinker.h
@@ -67,6 +67,7 @@ struct shrinker {
 	/* These are for internal use */
 	atomic_t nr_active; /* Counted only if !SHRINKER_PERMANENT */
 	struct list_head list;
+	struct list_head gc_list;
 	/* objs pending delete, per node */
 	atomic_long_t *nr_deferred;
 };
diff --git a/mm/vmscan.c b/mm/vmscan.c
index e963359..a216dc5 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -157,7 +157,7 @@ struct scan_control {
 unsigned long vm_total_pages;
 
 static LIST_HEAD(shrinker_list);
-static DEFINE_MUTEX(shrinker_lock);
+static DEFINE_SPINLOCK(shrinker_lock);
 
 #ifdef CONFIG_MEMCG
 static bool global_reclaim(struct scan_control *sc)
@@ -286,9 +286,9 @@ int register_shrinker(struct shrinker *shrinker)
 		return -ENOMEM;
 
 	atomic_set(&shrinker->nr_active, 0);
-	mutex_lock(&shrinker_lock);
+	spin_lock(&shrinker_lock);
 	list_add_tail_rcu(&shrinker->list, &shrinker_list);
-	mutex_unlock(&shrinker_lock);
+	spin_unlock(&shrinker_lock);
 	return 0;
 }
 EXPORT_SYMBOL(register_shrinker);
@@ -298,15 +298,30 @@ int register_shrinker(struct shrinker *shrinker)
  */
 void unregister_shrinker(struct shrinker *shrinker)
 {
+	static LIST_HEAD(shrinker_gc_list);
+	struct shrinker *gc;
+
 	BUG_ON(shrinker->flags & SHRINKER_PERMANENT);
-	mutex_lock(&shrinker_lock);
+	spin_lock(&shrinker_lock);
 	list_del_rcu(&shrinker->list);
+	/*
+	 * Need to update ->list.next if concurrently unregistering shrinkers
+	 * can find this shrinker, for this shrinker's unregistration might
+	 * complete before their unregistrations complete.
+	 */
+	list_for_each_entry(gc, &shrinker_gc_list, gc_list) {
+		if (gc->list.next == &shrinker->list)
+			rcu_assign_pointer(gc->list.next, shrinker->list.next);
+	}
+	list_add_tail(&shrinker->gc_list, &shrinker_gc_list);
+	spin_unlock(&shrinker_lock);
 	synchronize_rcu();
 	while (atomic_read(&shrinker->nr_active))
 		msleep(1);
 	synchronize_rcu();
-	mutex_unlock(&shrinker_lock);
+	spin_lock(&shrinker_lock);
+	list_del(&shrinker->gc_list);
+	spin_unlock(&shrinker_lock);
 	kfree(shrinker->nr_deferred);
 }
 EXPORT_SYMBOL(unregister_shrinker);
----------

F.Y.I. When I posted above change at
http://lkml.kernel.org/r/201411231350.DHI12456.OLOFFJSFtQVMHO@I-love.SAKURA.ne.jp ,
Michal Hocko commented like below.

  I thought that {un}register_shrinker are really unlikely
  paths called during initialization and tear down which usually do not
  happen during OOM conditions.

  I cannot judge the patch itself as this is out of my area but is the
  complexity worth it?

* Re: [PATCH v2] mm, shrinker: make shrinker_list lockless
  2017-11-09 10:26 ` Tetsuo Handa
@ 2017-11-09 15:34   ` Shakeel Butt
  2017-11-09 21:46     ` Tetsuo Handa
  0 siblings, 1 reply; 9+ messages in thread
From: Shakeel Butt @ 2017-11-09 15:34 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Minchan Kim, Huang Ying, Mel Gorman, Vladimir Davydov,
	Michal Hocko, Greg Thelen, Johannes Weiner, Andrew Morton,
	Linux MM, LKML

>
> If you can accept serialized register_shrinker()/unregister_shrinker(),
> I think that something like shown below can do it.
>

Thanks.

> ----------
> diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h
> index 388ff29..e2272dd 100644
> --- a/include/linux/shrinker.h
> +++ b/include/linux/shrinker.h
> @@ -62,9 +62,10 @@ struct shrinker {
>
>         int seeks;      /* seeks to recreate an obj */
>         long batch;     /* reclaim batch size, 0 = default */
> -       unsigned long flags;
> +       unsigned int flags;
>
>         /* These are for internal use */
> +       atomic_t nr_active; /* Counted only if !SHRINKER_PERMANENT */
>         struct list_head list;
>         /* objs pending delete, per node */
>         atomic_long_t *nr_deferred;
> @@ -74,6 +75,7 @@ struct shrinker {
>  /* Flags */
>  #define SHRINKER_NUMA_AWARE    (1 << 0)
>  #define SHRINKER_MEMCG_AWARE   (1 << 1)
> +#define SHRINKER_PERMANENT     (1 << 2)
>
>  extern int register_shrinker(struct shrinker *);
>  extern void unregister_shrinker(struct shrinker *);
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 1c1bc95..e963359 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -157,7 +157,7 @@ struct scan_control {
>  unsigned long vm_total_pages;
>
>  static LIST_HEAD(shrinker_list);
> -static DECLARE_RWSEM(shrinker_rwsem);
> +static DEFINE_MUTEX(shrinker_lock);
>
>  #ifdef CONFIG_MEMCG
>  static bool global_reclaim(struct scan_control *sc)
> @@ -285,9 +285,10 @@ int register_shrinker(struct shrinker *shrinker)
>         if (!shrinker->nr_deferred)
>                 return -ENOMEM;
>
> -       down_write(&shrinker_rwsem);
> -       list_add_tail(&shrinker->list, &shrinker_list);
> -       up_write(&shrinker_rwsem);
> +       atomic_set(&shrinker->nr_active, 0);
> +       mutex_lock(&shrinker_lock);
> +       list_add_tail_rcu(&shrinker->list, &shrinker_list);
> +       mutex_unlock(&shrinker_lock);
>         return 0;
>  }
>  EXPORT_SYMBOL(register_shrinker);
> @@ -297,9 +298,14 @@ int register_shrinker(struct shrinker *shrinker)
>   */
>  void unregister_shrinker(struct shrinker *shrinker)
>  {
> -       down_write(&shrinker_rwsem);
> -       list_del(&shrinker->list);
> -       up_write(&shrinker_rwsem);
> +       BUG_ON(shrinker->flags & SHRINKER_PERMANENT);
> +       mutex_lock(&shrinker_lock);
> +       list_del_rcu(&shrinker->list);
> +       synchronize_rcu();
> +       while (atomic_read(&shrinker->nr_active))
> +               msleep(1);

If we assume that we will never do register_shrinker and
unregister_shrinker on the same object in parallel then do we still
need to do msleep & synchronize_rcu() within mutex?
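
I.e. under that assumption, could the unregister path presumably shrink
to something like this (just a sketch of what I mean)?

	void unregister_shrinker(struct shrinker *shrinker)
	{
		mutex_lock(&shrinker_lock);
		list_del_rcu(&shrinker->list);
		mutex_unlock(&shrinker_lock);
		synchronize_rcu();
		while (atomic_read(&shrinker->nr_active))
			msleep(1);
		kfree(shrinker->nr_deferred);
	}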

> +       synchronize_rcu();

I was hoping to not put any delay for the normal case (no memory
pressure and no reclaimers).

> +       mutex_unlock(&shrinker_lock);
>         kfree(shrinker->nr_deferred);
>  }
>  EXPORT_SYMBOL(unregister_shrinker);
> @@ -468,18 +474,9 @@ static unsigned long shrink_slab(gfp_t gfp_mask, int nid,
>         if (nr_scanned == 0)
>                 nr_scanned = SWAP_CLUSTER_MAX;
>
> -       if (!down_read_trylock(&shrinker_rwsem)) {
> -               /*
> -                * If we would return 0, our callers would understand that we
> -                * have nothing else to shrink and give up trying. By returning
> -                * 1 we keep it going and assume we'll be able to shrink next
> -                * time.
> -                */
> -               freed = 1;
> -               goto out;
> -       }
> -
> -       list_for_each_entry(shrinker, &shrinker_list, list) {
> +       rcu_read_lock();
> +       list_for_each_entry_rcu(shrinker, &shrinker_list, list) {
> +               bool permanent;
>                 struct shrink_control sc = {
>                         .gfp_mask = gfp_mask,
>                         .nid = nid,
> @@ -498,11 +495,16 @@ static unsigned long shrink_slab(gfp_t gfp_mask, int nid,
>                 if (!(shrinker->flags & SHRINKER_NUMA_AWARE))
>                         sc.nid = 0;
>
> +               permanent = (shrinker->flags & SHRINKER_PERMANENT);
> +               if (!permanent)
> +                       atomic_inc(&shrinker->nr_active);
> +               rcu_read_unlock();
>                 freed += do_shrink_slab(&sc, shrinker, nr_scanned, nr_eligible);
> +               rcu_read_lock();
> +               if (!permanent)
> +                       atomic_dec(&shrinker->nr_active);
>         }
> -
> -       up_read(&shrinker_rwsem);
> -out:
> +       rcu_read_unlock();
>         cond_resched();
>         return freed;
>  }
> ----------
>
> If you want parallel register_shrinker()/unregister_shrinker(), something like
> shown below on top of shown above will do it.
>
> ----------
> diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h
> index e2272dd..471b2f6 100644
> --- a/include/linux/shrinker.h
> +++ b/include/linux/shrinker.h
> @@ -67,6 +67,7 @@ struct shrinker {
>         /* These are for internal use */
>         atomic_t nr_active; /* Counted only if !SHRINKER_PERMANENT */
>         struct list_head list;
> +       struct list_head gc_list;
>         /* objs pending delete, per node */
>         atomic_long_t *nr_deferred;
>  };
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index e963359..a216dc5 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -157,7 +157,7 @@ struct scan_control {
>  unsigned long vm_total_pages;
>
>  static LIST_HEAD(shrinker_list);
> -static DEFINE_MUTEX(shrinker_lock);
> +static DEFINE_SPINLOCK(shrinker_lock);
>
>  #ifdef CONFIG_MEMCG
>  static bool global_reclaim(struct scan_control *sc)
> @@ -286,9 +286,9 @@ int register_shrinker(struct shrinker *shrinker)
>                 return -ENOMEM;
>
>         atomic_set(&shrinker->nr_active, 0);
> -       mutex_lock(&shrinker_lock);
> +       spin_lock(&shrinker_lock);
>         list_add_tail_rcu(&shrinker->list, &shrinker_list);
> -       mutex_unlock(&shrinker_lock);
> +       spin_unlock(&shrinker_lock);
>         return 0;
>  }
>  EXPORT_SYMBOL(register_shrinker);
> @@ -298,15 +298,30 @@ int register_shrinker(struct shrinker *shrinker)
>   */
>  void unregister_shrinker(struct shrinker *shrinker)
>  {
> +       static LIST_HEAD(shrinker_gc_list);
> +       struct shrinker *gc;
> +
>         BUG_ON(shrinker->flags & SHRINKER_PERMANENT);
> -       mutex_lock(&shrinker_lock);
> +       spin_lock(&shrinker_lock);
>         list_del_rcu(&shrinker->list);
> +       /*
> +        * Need to update ->list.next if concurrently unregistering shrinkers
> +        * can find this shrinker, for this shrinker's unregistration might
> +        * complete before their unregistrations complete.
> +        */
> +       list_for_each_entry(gc, &shrinker_gc_list, gc_list) {
> +               if (gc->list.next == &shrinker->list)
> +                       rcu_assign_pointer(gc->list.next, shrinker->list.next);
> +       }
> +       list_add_tail(&shrinker->gc_list, &shrinker_gc_list);
> +       spin_unlock(&shrinker_lock);
>         synchronize_rcu();
>         while (atomic_read(&shrinker->nr_active))
>                 msleep(1);
>         synchronize_rcu();
> -       mutex_unlock(&shrinker_lock);
> +       spin_lock(&shrinker_lock);
> +       list_del(&shrinker->gc_list);
> +       spin_unlock(&shrinker_lock);
>         kfree(shrinker->nr_deferred);
>  }
>  EXPORT_SYMBOL(unregister_shrinker);
> ----------
>
> F.Y.I. When I posted above change at
> http://lkml.kernel.org/r/201411231350.DHI12456.OLOFFJSFtQVMHO@I-love.SAKURA.ne.jp ,
> Michal Hocko commented like below.
>
>   I thought that {un}register_shrinker are really unlikely
>   paths called during initialization and tear down which usually do not
>   happen during OOM conditions.
>
>   I cannot judge the patch itself as this is out of my area but is the
>   complexity worth it?

* Re: [PATCH v2] mm, shrinker: make shrinker_list lockless
  2017-11-09 15:34   ` Shakeel Butt
@ 2017-11-09 21:46     ` Tetsuo Handa
  2017-11-10 18:16       ` Shakeel Butt
  0 siblings, 1 reply; 9+ messages in thread
From: Tetsuo Handa @ 2017-11-09 21:46 UTC (permalink / raw)
  To: shakeelb
  Cc: minchan, ying.huang, mgorman, vdavydov.dev, mhocko, gthelen,
	hannes, akpm, linux-mm, linux-kernel

Shakeel Butt wrote:
> > If you can accept serialized register_shrinker()/unregister_shrinker(),
> > I think that something like shown below can do it.
> 
> If we assume that we will never do register_shrinker and
> unregister_shrinker on the same object in parallel then do we still
> need to do msleep & synchronize_rcu() within mutex?

Doing register_shrinker() and unregister_shrinker() on the same object
in parallel is wrong. This mutex is there to ensure that we do not need
to worry about the ->list.next field. synchronize_rcu() should not be
slow. If you want to avoid msleep() with the mutex held, you can also
apply the

> > If you want parallel register_shrinker()/unregister_shrinker(), something like
> > shown below on top of shown above will do it.

change.

* Re: [PATCH v2] mm, shrinker: make shrinker_list lockless
  2017-11-09 21:46     ` Tetsuo Handa
@ 2017-11-10 18:16       ` Shakeel Butt
  0 siblings, 0 replies; 9+ messages in thread
From: Shakeel Butt @ 2017-11-10 18:16 UTC (permalink / raw)
  To: Tetsuo Handa
  Cc: Minchan Kim, Huang Ying, Mel Gorman, Vladimir Davydov,
	Michal Hocko, Greg Thelen, Johannes Weiner, Andrew Morton,
	Linux MM, LKML

On Thu, Nov 9, 2017 at 1:46 PM, Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
> Shakeel Butt wrote:
>> > If you can accept serialized register_shrinker()/unregister_shrinker(),
>> > I think that something like shown below can do it.
>>
>> If we assume that we will never do register_shrinker and
>> unregister_shrinker on the same object in parallel then do we still
>> need to do msleep & synchronize_rcu() within mutex?
>
> Doing register_shrinker() and unregister_shrinker() on the same object
> in parallel is wrong. This mutex is to ensure that we do not need to
> worry about ->list.next field. synchronize_rcu() should not be slow.
> If you want to avoid msleep() with mutex held, you can also apply
>
>> > If you want parallel register_shrinker()/unregister_shrinker(), something like
>> > shown below on top of shown above will do it.
>
> change.

Thanks for the explanation. Can you post the patch for others to
review, without parallel register/unregister and SHRINKER_PERMANENT (we
can add those when we need them)? You can use the motivation I
described in my patch instead.
