linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] vfs: Avoid softlockups in drop_pagecache_sb()
@ 2018-12-06 16:11 Jan Kara
  2019-01-08 15:16 ` Michal Hocko
  0 siblings, 1 reply; 5+ messages in thread
From: Jan Kara @ 2018-12-06 16:11 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-fsdevel, mhocko, Jan Kara

When superblock has lots of inodes without any pagecache (like is the case for
/proc), drop_pagecache_sb() will iterate through all of them without dropping
sb->s_inode_list_lock which can lead to softlockups.

Fix the problem by going to the slow path and doing cond_resched() in case the
process needs rescheduling.

Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/drop_caches.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/fs/drop_caches.c b/fs/drop_caches.c
index 82377017130f..d31b6c72b476 100644
--- a/fs/drop_caches.c
+++ b/fs/drop_caches.c
@@ -21,8 +21,13 @@ static void drop_pagecache_sb(struct super_block *sb, void *unused)
 	spin_lock(&sb->s_inode_list_lock);
 	list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
 		spin_lock(&inode->i_lock);
+		/*
+		 * We must skip inodes in unusual state. We may also skip
+		 * inodes without pages but we deliberately won't in case
+		 * we need to reschedule to avoid softlockups.
+		 */
 		if ((inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW)) ||
-		    (inode->i_mapping->nrpages == 0)) {
+		    (inode->i_mapping->nrpages == 0 && !need_resched())) {
 			spin_unlock(&inode->i_lock);
 			continue;
 		}
@@ -30,6 +35,7 @@ static void drop_pagecache_sb(struct super_block *sb, void *unused)
 		spin_unlock(&inode->i_lock);
 		spin_unlock(&sb->s_inode_list_lock);
 
+		cond_resched();
 		invalidate_mapping_pages(inode->i_mapping, 0, -1);
 		iput(toput_inode);
 		toput_inode = inode;
-- 
2.16.4

^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] vfs: Avoid softlockups in drop_pagecache_sb()
  2018-12-06 16:11 [PATCH] vfs: Avoid softlockups in drop_pagecache_sb() Jan Kara
@ 2019-01-08 15:16 ` Michal Hocko
  0 siblings, 0 replies; 5+ messages in thread
From: Michal Hocko @ 2019-01-08 15:16 UTC (permalink / raw)
  To: Jan Kara; +Cc: Al Viro, linux-fsdevel

Is this staged in any existing tree?

On Thu 06-12-18 17:11:52, Jan Kara wrote:
> When superblock has lots of inodes without any pagecache (like is the case for
> /proc), drop_pagecache_sb() will iterate through all of them without dropping
> sb->s_inode_list_lock which can lead to softlockups.
> 
> Fix the problem by going to the slow path and doing cond_resched() in case the
> process needs rescheduling.

FWIW the patch looks good to me.

> Signed-off-by: Jan Kara <jack@suse.cz>

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  fs/drop_caches.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/drop_caches.c b/fs/drop_caches.c
> index 82377017130f..d31b6c72b476 100644
> --- a/fs/drop_caches.c
> +++ b/fs/drop_caches.c
> @@ -21,8 +21,13 @@ static void drop_pagecache_sb(struct super_block *sb, void *unused)
>  	spin_lock(&sb->s_inode_list_lock);
>  	list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
>  		spin_lock(&inode->i_lock);
> +		/*
> +		 * We must skip inodes in unusual state. We may also skip
> +		 * inodes without pages but we deliberately won't in case
> +		 * we need to reschedule to avoid softlockups.
> +		 */
>  		if ((inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW)) ||
> -		    (inode->i_mapping->nrpages == 0)) {
> +		    (inode->i_mapping->nrpages == 0 && !need_resched())) {
>  			spin_unlock(&inode->i_lock);
>  			continue;
>  		}
> @@ -30,6 +35,7 @@ static void drop_pagecache_sb(struct super_block *sb, void *unused)
>  		spin_unlock(&inode->i_lock);
>  		spin_unlock(&sb->s_inode_list_lock);
>  
> +		cond_resched();
>  		invalidate_mapping_pages(inode->i_mapping, 0, -1);
>  		iput(toput_inode);
>  		toput_inode = inode;
> -- 
> 2.16.4

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] vfs: Avoid softlockups in drop_pagecache_sb()
  2019-01-30  0:56 ` Andrew Morton
@ 2019-01-30 10:07   ` Jan Kara
  0 siblings, 0 replies; 5+ messages in thread
From: Jan Kara @ 2019-01-30 10:07 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Jan Kara, Al Viro, linux-fsdevel, linux-mm

On Tue 29-01-19 16:56:36, Andrew Morton wrote:
> On Mon, 14 Jan 2019 09:53:43 +0100 Jan Kara <jack@suse.cz> wrote:
> 
> > When superblock has lots of inodes without any pagecache (like is the
> > case for /proc), drop_pagecache_sb() will iterate through all of them
> > without dropping sb->s_inode_list_lock which can lead to softlockups
> > (one of our customers hit this).
> > 
> > Fix the problem by going to the slow path and doing cond_resched() in
> > case the process needs rescheduling.
> > 
> > ...
> >
> > --- a/fs/drop_caches.c
> > +++ b/fs/drop_caches.c
> > @@ -21,8 +21,13 @@ static void drop_pagecache_sb(struct super_block *sb, void *unused)
> >  	spin_lock(&sb->s_inode_list_lock);
> >  	list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
> >  		spin_lock(&inode->i_lock);
> > +		/*
> > +		 * We must skip inodes in unusual state. We may also skip
> > +		 * inodes without pages but we deliberately won't in case
> > +		 * we need to reschedule to avoid softlockups.
> > +		 */
> >  		if ((inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW)) ||
> > -		    (inode->i_mapping->nrpages == 0)) {
> > +		    (inode->i_mapping->nrpages == 0 && !need_resched())) {
> >  			spin_unlock(&inode->i_lock);
> >  			continue;
> >  		}
> > @@ -30,6 +35,7 @@ static void drop_pagecache_sb(struct super_block *sb, void *unused)
> >  		spin_unlock(&inode->i_lock);
> >  		spin_unlock(&sb->s_inode_list_lock);
> >  
> > +		cond_resched();
> >  		invalidate_mapping_pages(inode->i_mapping, 0, -1);
> >  		iput(toput_inode);
> >  		toput_inode = inode;
> 
> Are we sure there's no situation in which a large number of inodes can
> be in the "unusual state", leading to the same issue?

No, we cannot be really sure that there aren't many such inodes (although
I'd be surprised if there were). But the problem with "unusual state"
inodes is that you cannot just __iget() them (well, you could but it's
breaking the rules and would lead to use-after-free issues ;) and so
there's no easy way to drop the spinlock and then continue the iteration
after cond_resched(). So overall it's too complex to deal with this until
someone actually hits it.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] vfs: Avoid softlockups in drop_pagecache_sb()
  2019-01-14  8:53 Jan Kara
@ 2019-01-30  0:56 ` Andrew Morton
  2019-01-30 10:07   ` Jan Kara
  0 siblings, 1 reply; 5+ messages in thread
From: Andrew Morton @ 2019-01-30  0:56 UTC (permalink / raw)
  To: Jan Kara; +Cc: Al Viro, linux-fsdevel, linux-mm

On Mon, 14 Jan 2019 09:53:43 +0100 Jan Kara <jack@suse.cz> wrote:

> When superblock has lots of inodes without any pagecache (like is the
> case for /proc), drop_pagecache_sb() will iterate through all of them
> without dropping sb->s_inode_list_lock which can lead to softlockups
> (one of our customers hit this).
> 
> Fix the problem by going to the slow path and doing cond_resched() in
> case the process needs rescheduling.
> 
> ...
>
> --- a/fs/drop_caches.c
> +++ b/fs/drop_caches.c
> @@ -21,8 +21,13 @@ static void drop_pagecache_sb(struct super_block *sb, void *unused)
>  	spin_lock(&sb->s_inode_list_lock);
>  	list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
>  		spin_lock(&inode->i_lock);
> +		/*
> +		 * We must skip inodes in unusual state. We may also skip
> +		 * inodes without pages but we deliberately won't in case
> +		 * we need to reschedule to avoid softlockups.
> +		 */
>  		if ((inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW)) ||
> -		    (inode->i_mapping->nrpages == 0)) {
> +		    (inode->i_mapping->nrpages == 0 && !need_resched())) {
>  			spin_unlock(&inode->i_lock);
>  			continue;
>  		}
> @@ -30,6 +35,7 @@ static void drop_pagecache_sb(struct super_block *sb, void *unused)
>  		spin_unlock(&inode->i_lock);
>  		spin_unlock(&sb->s_inode_list_lock);
>  
> +		cond_resched();
>  		invalidate_mapping_pages(inode->i_mapping, 0, -1);
>  		iput(toput_inode);
>  		toput_inode = inode;

Are we sure there's no situation in which a large number of inodes can
be in the "unusual state", leading to the same issue?


^ permalink raw reply	[flat|nested] 5+ messages in thread

* [PATCH] vfs: Avoid softlockups in drop_pagecache_sb()
@ 2019-01-14  8:53 Jan Kara
  2019-01-30  0:56 ` Andrew Morton
  0 siblings, 1 reply; 5+ messages in thread
From: Jan Kara @ 2019-01-14  8:53 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Al Viro, linux-fsdevel, linux-mm, Jan Kara

When superblock has lots of inodes without any pagecache (like is the
case for /proc), drop_pagecache_sb() will iterate through all of them
without dropping sb->s_inode_list_lock which can lead to softlockups
(one of our customers hit this).

Fix the problem by going to the slow path and doing cond_resched() in
case the process needs rescheduling.

Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Jan Kara <jack@suse.cz>
---
 fs/drop_caches.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

Andrew, can you please merge this patch? Thanks!

diff --git a/fs/drop_caches.c b/fs/drop_caches.c
index 82377017130f..d31b6c72b476 100644
--- a/fs/drop_caches.c
+++ b/fs/drop_caches.c
@@ -21,8 +21,13 @@ static void drop_pagecache_sb(struct super_block *sb, void *unused)
 	spin_lock(&sb->s_inode_list_lock);
 	list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
 		spin_lock(&inode->i_lock);
+		/*
+		 * We must skip inodes in unusual state. We may also skip
+		 * inodes without pages but we deliberately won't in case
+		 * we need to reschedule to avoid softlockups.
+		 */
 		if ((inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW)) ||
-		    (inode->i_mapping->nrpages == 0)) {
+		    (inode->i_mapping->nrpages == 0 && !need_resched())) {
 			spin_unlock(&inode->i_lock);
 			continue;
 		}
@@ -30,6 +35,7 @@ static void drop_pagecache_sb(struct super_block *sb, void *unused)
 		spin_unlock(&inode->i_lock);
 		spin_unlock(&sb->s_inode_list_lock);
 
+		cond_resched();
 		invalidate_mapping_pages(inode->i_mapping, 0, -1);
 		iput(toput_inode);
 		toput_inode = inode;
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-01-30 10:07 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-06 16:11 [PATCH] vfs: Avoid softlockups in drop_pagecache_sb() Jan Kara
2019-01-08 15:16 ` Michal Hocko
2019-01-14  8:53 Jan Kara
2019-01-30  0:56 ` Andrew Morton
2019-01-30 10:07   ` Jan Kara

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).