From: Shakeel Butt <shakeelb@google.com>
To: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: Minchan Kim <minchan@kernel.org>,
	Huang Ying <ying.huang@intel.com>,
	Mel Gorman <mgorman@techsingularity.net>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	Michal Hocko <mhocko@kernel.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Greg Thelen <gthelen@google.com>,
	Linux MM <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 1/2] mm,vmscan: Kill global shrinker lock.
Date: Mon, 13 Nov 2017 14:05:34 -0800
Message-ID: <CALvZod5RgJrWgsy=14qBpDRLOad3WD8_fPrG5nFjkTQ4rL3rsQ@mail.gmail.com>
In-Reply-To: <1510609063-3327-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp>

On Mon, Nov 13, 2017 at 1:37 PM, Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
> When shrinker_rwsem was introduced, it was assumed that
> register_shrinker()/unregister_shrinker() are really unlikely paths
> which are called during initialization and tear down. But nowadays,
> register_shrinker()/unregister_shrinker() might be called regularly.
> This patch prepares for allowing parallel registration/unregistration
> of shrinkers.
>
> Since do_shrink_slab() can reschedule, we cannot protect shrinker_list
> using one RCU section. But using atomic_inc()/atomic_dec() for each
> do_shrink_slab() call will not impact so much.
>
> This patch uses polling loop with short sleep for unregister_shrinker()
> rather than wait_on_atomic_t(), for we can save reader's cost (plain
> atomic_dec() compared to atomic_dec_and_test()), we can expect that
> do_shrink_slab() of unregistering shrinker likely returns shortly, and
> we can avoid khungtaskd warnings when do_shrink_slab() of unregistering
> shrinker unexpectedly took so long.
>
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>

Reviewed-and-tested-by: Shakeel Butt <shakeelb@google.com>

> ---
>  include/linux/shrinker.h |  3 ++-
>  mm/vmscan.c              | 41 +++++++++++++++++++----------------------
>  2 files changed, 21 insertions(+), 23 deletions(-)
>
> diff --git a/include/linux/shrinker.h b/include/linux/shrinker.h
> index 388ff29..333a1d0 100644
> --- a/include/linux/shrinker.h
> +++ b/include/linux/shrinker.h
> @@ -62,9 +62,10 @@ struct shrinker {
>
>  	int seeks;	/* seeks to recreate an obj */
>  	long batch;	/* reclaim batch size, 0 = default */
> -	unsigned long flags;
> +	unsigned int flags;
>
>  	/* These are for internal use */
> +	atomic_t nr_active;
>  	struct list_head list;
>  	/* objs pending delete, per node */
>  	atomic_long_t *nr_deferred;
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 1c1bc95..c8996e8 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -157,7 +157,7 @@ struct scan_control {
>  unsigned long vm_total_pages;
>
>  static LIST_HEAD(shrinker_list);
> -static DECLARE_RWSEM(shrinker_rwsem);
> +static DEFINE_MUTEX(shrinker_lock);
>
>  #ifdef CONFIG_MEMCG
>  static bool global_reclaim(struct scan_control *sc)
> @@ -285,9 +285,10 @@ int register_shrinker(struct shrinker *shrinker)
>  	if (!shrinker->nr_deferred)
>  		return -ENOMEM;
>
> -	down_write(&shrinker_rwsem);
> -	list_add_tail(&shrinker->list, &shrinker_list);
> -	up_write(&shrinker_rwsem);
> +	atomic_set(&shrinker->nr_active, 0);
> +	mutex_lock(&shrinker_lock);
> +	list_add_tail_rcu(&shrinker->list, &shrinker_list);
> +	mutex_unlock(&shrinker_lock);
>  	return 0;
>  }
>  EXPORT_SYMBOL(register_shrinker);
> @@ -297,9 +298,13 @@ int register_shrinker(struct shrinker *shrinker)
>   */
>  void unregister_shrinker(struct shrinker *shrinker)
>  {
> -	down_write(&shrinker_rwsem);
> -	list_del(&shrinker->list);
> -	up_write(&shrinker_rwsem);
> +	mutex_lock(&shrinker_lock);
> +	list_del_rcu(&shrinker->list);
> +	synchronize_rcu();
> +	while (atomic_read(&shrinker->nr_active))
> +		schedule_timeout_uninterruptible(1);
> +	synchronize_rcu();
> +	mutex_unlock(&shrinker_lock);
>  	kfree(shrinker->nr_deferred);
>  }
>  EXPORT_SYMBOL(unregister_shrinker);
> @@ -468,18 +473,8 @@ static unsigned long shrink_slab(gfp_t gfp_mask, int nid,
>  	if (nr_scanned == 0)
>  		nr_scanned = SWAP_CLUSTER_MAX;
>
> -	if (!down_read_trylock(&shrinker_rwsem)) {
> -		/*
> -		 * If we would return 0, our callers would understand that we
> -		 * have nothing else to shrink and give up trying. By returning
> -		 * 1 we keep it going and assume we'll be able to shrink next
> -		 * time.
> -		 */
> -		freed = 1;
> -		goto out;
> -	}
> -
> -	list_for_each_entry(shrinker, &shrinker_list, list) {
> +	rcu_read_lock();
> +	list_for_each_entry_rcu(shrinker, &shrinker_list, list) {
>  		struct shrink_control sc = {
>  			.gfp_mask = gfp_mask,
>  			.nid = nid,
> @@ -498,11 +493,13 @@ static unsigned long shrink_slab(gfp_t gfp_mask, int nid,
>  		if (!(shrinker->flags & SHRINKER_NUMA_AWARE))
>  			sc.nid = 0;
>
> +		atomic_inc(&shrinker->nr_active);
> +		rcu_read_unlock();
>  		freed += do_shrink_slab(&sc, shrinker, nr_scanned, nr_eligible);
> +		rcu_read_lock();
> +		atomic_dec(&shrinker->nr_active);
>  	}
> -
> -	up_read(&shrinker_rwsem);
> -out:
> +	rcu_read_unlock();
>  	cond_resched();
>  	return freed;
>  }
> --
> 1.8.3.1
>
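For readers following the nr_active handshake in the quoted patch, here is a minimal userspace sketch of the same idea. It is not the kernel code. Assumptions: a pthread mutex and a fixed-size table stand in for the RCU read section and shrinker_list (plain userspace C has no built-in RCU), nanosleep() stands in for schedule_timeout_uninterruptible(1), and every demo_* name, slow_scan, reclaimer and MAX_SHRINKERS is hypothetical. A caller pins the entry with an atomic increment before dropping the lock around the slow callback, and the unregister side unpublishes the entry and then polls until the pin count drains.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

#define MAX_SHRINKERS 8			/* hypothetical table size */

struct demo_shrinker {
	unsigned long (*scan)(struct demo_shrinker *s);
	atomic_int nr_active;		/* callers currently inside ->scan() */
};

static struct demo_shrinker *table[MAX_SHRINKERS];
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_bool gone, stop;		/* demo bookkeeping only */
static atomic_int late_calls;		/* scans still running after unregister */

static void sleep_ms(long ms)
{
	struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };
	nanosleep(&ts, NULL);
}

static int demo_register(struct demo_shrinker *s)
{
	int ret = -1;

	atomic_store(&s->nr_active, 0);
	pthread_mutex_lock(&table_lock);
	for (int i = 0; i < MAX_SHRINKERS && ret; i++)
		if (!table[i]) {
			table[i] = s;
			ret = 0;
		}
	pthread_mutex_unlock(&table_lock);
	return ret;
}

static void demo_unregister(struct demo_shrinker *s)
{
	pthread_mutex_lock(&table_lock);
	for (int i = 0; i < MAX_SHRINKERS; i++)
		if (table[i] == s)
			table[i] = NULL;	/* unpublish: no new pins */
	pthread_mutex_unlock(&table_lock);
	while (atomic_load(&s->nr_active))	/* poll with a short sleep */
		sleep_ms(1);
	assert(atomic_load(&s->nr_active) == 0);	/* s may be freed now */
}

static unsigned long demo_shrink_all(void)
{
	unsigned long freed = 0;

	for (int i = 0; i < MAX_SHRINKERS; i++) {
		pthread_mutex_lock(&table_lock);
		struct demo_shrinker *s = table[i];
		if (s)
			atomic_fetch_add(&s->nr_active, 1);	/* pin */
		pthread_mutex_unlock(&table_lock);
		if (!s)
			continue;
		freed += s->scan(s);	/* may sleep; lock is not held */
		atomic_fetch_sub(&s->nr_active, 1);	/* unpin */
	}
	return freed;
}

static unsigned long slow_scan(struct demo_shrinker *s)
{
	(void)s;
	sleep_ms(2);
	if (atomic_load(&gone))		/* would mean unregister returned early */
		atomic_fetch_add(&late_calls, 1);
	return 1;
}

static void *reclaimer(void *arg)
{
	(void)arg;
	while (!atomic_load(&stop))
		demo_shrink_all();
	return NULL;
}

/* Returns the number of scans that outlived unregister. */
int demo_run(void)
{
	struct demo_shrinker s = { .scan = slow_scan };
	pthread_t t;

	atomic_store(&gone, false);
	atomic_store(&stop, false);
	atomic_store(&late_calls, 0);
	demo_register(&s);
	pthread_create(&t, NULL, reclaimer, NULL);
	sleep_ms(20);
	demo_unregister(&s);
	atomic_store(&gone, true);
	sleep_ms(10);
	atomic_store(&stop, true);
	pthread_join(t, NULL);
	return atomic_load(&late_calls);
}
```

Calling demo_run() from a main() built with -pthread exercises one register/scan/unregister cycle. Because the pin is taken while the table lock is held, clearing the slot under that same lock guarantees the polling loop sees every in-flight caller. In the patch, the first synchronize_rcu() gives the equivalent guarantee for readers that were inside an RCU section when list_del_rcu() ran.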