From: Vladimir Davydov <vdavydov.dev@gmail.com> To: Sahitya Tummala <stummala@codeaurora.org> Cc: Alexander Polakov <apolyakov@beget.ru>, Andrew Morton <akpm@linux-foundation.org>, Jan Kara <jack@suse.cz>, viro@zeniv.linux.org.uk, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org Subject: Re: [PATCH v2] fs/dcache.c: fix spin lockup issue on nlru->lock Date: Wed, 21 Jun 2017 19:31:34 +0300 [thread overview] Message-ID: <20170621163134.GA3273@esperanza> (raw) In-Reply-To: <1498027155-4456-1-git-send-email-stummala@codeaurora.org> On Wed, Jun 21, 2017 at 12:09:15PM +0530, Sahitya Tummala wrote: > __list_lru_walk_one() acquires nlru spin lock (nlru->lock) for > longer duration if there are more number of items in the lru list. > As per the current code, it can hold the spin lock for upto maximum > UINT_MAX entries at a time. So if there are more number of items in > the lru list, then "BUG: spinlock lockup suspected" is observed in > the below path - > > [<ffffff8eca0fb0bc>] spin_bug+0x90 > [<ffffff8eca0fb220>] do_raw_spin_lock+0xfc > [<ffffff8ecafb7798>] _raw_spin_lock+0x28 > [<ffffff8eca1ae884>] list_lru_add+0x28 > [<ffffff8eca1f5dac>] dput+0x1c8 > [<ffffff8eca1eb46c>] path_put+0x20 > [<ffffff8eca1eb73c>] terminate_walk+0x3c > [<ffffff8eca1eee58>] path_lookupat+0x100 > [<ffffff8eca1f00fc>] filename_lookup+0x6c > [<ffffff8eca1f0264>] user_path_at_empty+0x54 > [<ffffff8eca1e066c>] SyS_faccessat+0xd0 > [<ffffff8eca084e30>] el0_svc_naked+0x24 > > This nlru->lock is acquired by another CPU in this path - > > [<ffffff8eca1f5fd0>] d_lru_shrink_move+0x34 > [<ffffff8eca1f6180>] dentry_lru_isolate_shrink+0x48 > [<ffffff8eca1aeafc>] __list_lru_walk_one.isra.10+0x94 > [<ffffff8eca1aec34>] list_lru_walk_node+0x40 > [<ffffff8eca1f6620>] shrink_dcache_sb+0x60 > [<ffffff8eca1e56a8>] do_remount_sb+0xbc > [<ffffff8eca1e583c>] do_emergency_remount+0xb0 > [<ffffff8eca0ba510>] process_one_work+0x228 > [<ffffff8eca0bb158>] worker_thread+0x2e0 > [<ffffff8eca0c040c>] kthread+0xf4 > [<ffffff8eca084dd0>] ret_from_fork+0x10 > > Fix this lockup by reducing the number of entries to be shrinked > from the lru list to 1024 at once. Also, add cond_resched() before > processing the lru list again. > > Link: http://marc.info/?t=149722864900001&r=1&w=2 > Fix-suggested-by: Jan kara <jack@suse.cz> > Fix-suggested-by: Vladimir Davydov <vdavydov.dev@gmail.com> > Signed-off-by: Sahitya Tummala <stummala@codeaurora.org> > --- > v2: patch shrink_dcache_sb() instead of list_lru_walk() > --- > fs/dcache.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/fs/dcache.c b/fs/dcache.c > index cddf397..c8ca150 100644 > --- a/fs/dcache.c > +++ b/fs/dcache.c > @@ -1133,10 +1133,11 @@ void shrink_dcache_sb(struct super_block *sb) > LIST_HEAD(dispose); > > freed = list_lru_walk(&sb->s_dentry_lru, > - dentry_lru_isolate_shrink, &dispose, UINT_MAX); > + dentry_lru_isolate_shrink, &dispose, 1024); > > this_cpu_sub(nr_dentry_unused, freed); > shrink_dentry_list(&dispose); > + cond_resched(); > } while (freed > 0); In an extreme case, a single invocation of list_lru_walk() can skip all 1024 dentries, in which case 'freed' will be 0 forcing us to break the loop prematurely. I think we should loop until there's at least one dentry left on the LRU, i.e. while (list_lru_count(&sb->s_dentry_lru) > 0) However, even that wouldn't be quite correct, because list_lru_count() iterates over all memory cgroups to sum list_lru_one->nr_items, which can race with memcg offlining code migrating dentries off a dead cgroup (see memcg_drain_all_list_lrus()). So it looks like to make this check race-free, we need to account the number of entries on the LRU not only per memcg, but also per node, i.e. add list_lru_node->nr_items. Fortunately, list_lru entries can't be migrated between NUMA nodes. > } > EXPORT_SYMBOL(shrink_dcache_sb);
WARNING: multiple messages have this Message-ID (diff)
From: Vladimir Davydov <vdavydov.dev@gmail.com> To: Sahitya Tummala <stummala@codeaurora.org> Cc: Alexander Polakov <apolyakov@beget.ru>, Andrew Morton <akpm@linux-foundation.org>, Jan Kara <jack@suse.cz>, viro@zeniv.linux.org.uk, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org Subject: Re: [PATCH v2] fs/dcache.c: fix spin lockup issue on nlru->lock Date: Wed, 21 Jun 2017 19:31:34 +0300 [thread overview] Message-ID: <20170621163134.GA3273@esperanza> (raw) In-Reply-To: <1498027155-4456-1-git-send-email-stummala@codeaurora.org> On Wed, Jun 21, 2017 at 12:09:15PM +0530, Sahitya Tummala wrote: > __list_lru_walk_one() acquires nlru spin lock (nlru->lock) for > longer duration if there are more number of items in the lru list. > As per the current code, it can hold the spin lock for upto maximum > UINT_MAX entries at a time. So if there are more number of items in > the lru list, then "BUG: spinlock lockup suspected" is observed in > the below path - > > [<ffffff8eca0fb0bc>] spin_bug+0x90 > [<ffffff8eca0fb220>] do_raw_spin_lock+0xfc > [<ffffff8ecafb7798>] _raw_spin_lock+0x28 > [<ffffff8eca1ae884>] list_lru_add+0x28 > [<ffffff8eca1f5dac>] dput+0x1c8 > [<ffffff8eca1eb46c>] path_put+0x20 > [<ffffff8eca1eb73c>] terminate_walk+0x3c > [<ffffff8eca1eee58>] path_lookupat+0x100 > [<ffffff8eca1f00fc>] filename_lookup+0x6c > [<ffffff8eca1f0264>] user_path_at_empty+0x54 > [<ffffff8eca1e066c>] SyS_faccessat+0xd0 > [<ffffff8eca084e30>] el0_svc_naked+0x24 > > This nlru->lock is acquired by another CPU in this path - > > [<ffffff8eca1f5fd0>] d_lru_shrink_move+0x34 > [<ffffff8eca1f6180>] dentry_lru_isolate_shrink+0x48 > [<ffffff8eca1aeafc>] __list_lru_walk_one.isra.10+0x94 > [<ffffff8eca1aec34>] list_lru_walk_node+0x40 > [<ffffff8eca1f6620>] shrink_dcache_sb+0x60 > [<ffffff8eca1e56a8>] do_remount_sb+0xbc > [<ffffff8eca1e583c>] do_emergency_remount+0xb0 > [<ffffff8eca0ba510>] process_one_work+0x228 > [<ffffff8eca0bb158>] worker_thread+0x2e0 > [<ffffff8eca0c040c>] kthread+0xf4 > [<ffffff8eca084dd0>] ret_from_fork+0x10 > > Fix this lockup by reducing the number of entries to be shrinked > from the lru list to 1024 at once. Also, add cond_resched() before > processing the lru list again. > > Link: http://marc.info/?t=149722864900001&r=1&w=2 > Fix-suggested-by: Jan kara <jack@suse.cz> > Fix-suggested-by: Vladimir Davydov <vdavydov.dev@gmail.com> > Signed-off-by: Sahitya Tummala <stummala@codeaurora.org> > --- > v2: patch shrink_dcache_sb() instead of list_lru_walk() > --- > fs/dcache.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/fs/dcache.c b/fs/dcache.c > index cddf397..c8ca150 100644 > --- a/fs/dcache.c > +++ b/fs/dcache.c > @@ -1133,10 +1133,11 @@ void shrink_dcache_sb(struct super_block *sb) > LIST_HEAD(dispose); > > freed = list_lru_walk(&sb->s_dentry_lru, > - dentry_lru_isolate_shrink, &dispose, UINT_MAX); > + dentry_lru_isolate_shrink, &dispose, 1024); > > this_cpu_sub(nr_dentry_unused, freed); > shrink_dentry_list(&dispose); > + cond_resched(); > } while (freed > 0); In an extreme case, a single invocation of list_lru_walk() can skip all 1024 dentries, in which case 'freed' will be 0 forcing us to break the loop prematurely. I think we should loop until there's at least one dentry left on the LRU, i.e. while (list_lru_count(&sb->s_dentry_lru) > 0) However, even that wouldn't be quite correct, because list_lru_count() iterates over all memory cgroups to sum list_lru_one->nr_items, which can race with memcg offlining code migrating dentries off a dead cgroup (see memcg_drain_all_list_lrus()). So it looks like to make this check race-free, we need to account the number of entries on the LRU not only per memcg, but also per node, i.e. add list_lru_node->nr_items. Fortunately, list_lru entries can't be migrated between NUMA nodes. > } > EXPORT_SYMBOL(shrink_dcache_sb); -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2017-06-21 16:31 UTC|newest] Thread overview: 38+ messages / expand[flat|nested] mbox.gz Atom feed top 2017-06-12 0:47 [PATCH] mm/list_lru.c: use cond_resched_lock() for nlru->lock Sahitya Tummala 2017-06-12 0:47 ` Sahitya Tummala 2017-06-12 13:11 ` Jan Kara 2017-06-12 13:11 ` Jan Kara 2017-06-15 21:05 ` Andrew Morton 2017-06-15 21:05 ` Andrew Morton 2017-06-16 14:44 ` Sahitya Tummala 2017-06-16 14:44 ` Sahitya Tummala 2017-06-17 11:14 ` Vladimir Davydov 2017-06-17 11:14 ` Vladimir Davydov 2017-06-20 2:52 ` Sahitya Tummala 2017-06-20 2:52 ` Sahitya Tummala 2017-06-21 6:39 ` [PATCH v2] fs/dcache.c: fix spin lockup issue on nlru->lock Sahitya Tummala 2017-06-21 6:39 ` Sahitya Tummala 2017-06-21 16:31 ` Vladimir Davydov [this message] 2017-06-21 16:31 ` Vladimir Davydov 2017-06-22 16:31 ` Sahitya Tummala 2017-06-22 16:31 ` Sahitya Tummala 2017-06-22 17:49 ` Vladimir Davydov 2017-06-22 17:49 ` Vladimir Davydov 2017-06-28 6:07 ` [PATCH v3 1/2] mm/list_lru.c: fix list_lru_count_node() to be race free Sahitya Tummala 2017-06-28 6:07 ` Sahitya Tummala 2017-06-28 6:07 ` [PATCH v3 2/2] fs/dcache.c: fix spin lockup issue on nlru->lock Sahitya Tummala 2017-06-28 6:07 ` Sahitya Tummala 2017-06-28 17:18 ` [PATCH v3 1/2] mm/list_lru.c: fix list_lru_count_node() to be race free Vladimir Davydov 2017-06-28 17:18 ` Vladimir Davydov 2017-06-29 3:39 ` [PATCH v4 " Sahitya Tummala 2017-06-29 3:39 ` Sahitya Tummala 2017-07-01 16:28 ` Vladimir Davydov 2017-07-01 16:28 ` Vladimir Davydov 2017-06-29 3:39 ` [PATCH v4 2/2] fs/dcache.c: fix spin lockup issue on nlru->lock Sahitya Tummala 2017-06-29 3:39 ` Sahitya Tummala 2017-06-29 22:48 ` Andrew Morton 2017-06-29 22:48 ` Andrew Morton 2017-06-30 3:16 ` Sahitya Tummala 2017-06-30 3:16 ` Sahitya Tummala 2017-07-01 16:28 ` Vladimir Davydov 2017-07-01 16:28 ` Vladimir Davydov
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20170621163134.GA3273@esperanza \ --to=vdavydov.dev@gmail.com \ --cc=akpm@linux-foundation.org \ --cc=apolyakov@beget.ru \ --cc=jack@suse.cz \ --cc=linux-fsdevel@vger.kernel.org \ --cc=linux-kernel@vger.kernel.org \ --cc=linux-mm@kvack.org \ --cc=stummala@codeaurora.org \ --cc=viro@zeniv.linux.org.uk \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.