* [PATCH V2] fs: avoid softlockups in s_inodes iterators
@ 2019-10-14 21:30 Eric Sandeen
2019-10-14 21:36 ` Eric Sandeen
` (2 more replies)
0 siblings, 3 replies; 12+ messages in thread
From: Eric Sandeen @ 2019-10-14 21:30 UTC (permalink / raw)
To: fsdevel
Anything that walks all inodes on sb->s_inodes list without rescheduling
risks softlockups.
Previous efforts were made in 2 functions, see:
c27d82f fs/drop_caches.c: avoid softlockups in drop_pagecache_sb()
ac05fbb inode: don't softlockup when evicting inodes
but there hasn't been an audit of all walkers, so do that now. This
also consistently moves the cond_resched() calls to the bottom of each
loop in cases where one already exists.
One loop remains untouched: remove_dquot_ref(), because I'm not quite sure
how to deal with that one w/o taking the i_lock.
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
---
V2: Drop unrelated iput cleanups in fsnotify
diff --git a/fs/drop_caches.c b/fs/drop_caches.c
index d31b6c72b476..dc1a1d5d825b 100644
--- a/fs/drop_caches.c
+++ b/fs/drop_caches.c
@@ -35,11 +35,11 @@ static void drop_pagecache_sb(struct super_block *sb, void *unused)
spin_unlock(&inode->i_lock);
spin_unlock(&sb->s_inode_list_lock);
- cond_resched();
invalidate_mapping_pages(inode->i_mapping, 0, -1);
iput(toput_inode);
toput_inode = inode;
+ cond_resched();
spin_lock(&sb->s_inode_list_lock);
}
spin_unlock(&sb->s_inode_list_lock);
diff --git a/fs/inode.c b/fs/inode.c
index fef457a42882..b0c789bb3dba 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -676,6 +676,7 @@ int invalidate_inodes(struct super_block *sb, bool kill_dirty)
struct inode *inode, *next;
LIST_HEAD(dispose);
+again:
spin_lock(&sb->s_inode_list_lock);
list_for_each_entry_safe(inode, next, &sb->s_inodes, i_sb_list) {
spin_lock(&inode->i_lock);
@@ -698,6 +699,13 @@ int invalidate_inodes(struct super_block *sb, bool kill_dirty)
inode_lru_list_del(inode);
spin_unlock(&inode->i_lock);
list_add(&inode->i_lru, &dispose);
+
+ if (need_resched()) {
+ spin_unlock(&sb->s_inode_list_lock);
+ cond_resched();
+ dispose_list(&dispose);
+ goto again;
+ }
}
spin_unlock(&sb->s_inode_list_lock);
diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
index 2ecef6155fc0..ac9eb273e28c 100644
--- a/fs/notify/fsnotify.c
+++ b/fs/notify/fsnotify.c
@@ -77,6 +77,7 @@ static void fsnotify_unmount_inodes(struct super_block *sb)
iput_inode = inode;
+ cond_resched();
spin_lock(&sb->s_inode_list_lock);
}
spin_unlock(&sb->s_inode_list_lock);
diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index 6e826b454082..4a085b3c7cac 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -985,6 +985,7 @@ static int add_dquot_ref(struct super_block *sb, int type)
* later.
*/
old_inode = inode;
+ cond_resched();
spin_lock(&sb->s_inode_list_lock);
}
spin_unlock(&sb->s_inode_list_lock);
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH V2] fs: avoid softlockups in s_inodes iterators
2019-10-14 21:30 [PATCH V2] fs: avoid softlockups in s_inodes iterators Eric Sandeen
@ 2019-10-14 21:36 ` Eric Sandeen
2019-10-15 7:37 ` Jan Kara
2019-10-16 17:11 ` [PATCH 2/1] fs: call fsnotify_sb_delete after evict_inodes Eric Sandeen
From: Eric Sandeen @ 2019-10-14 21:36 UTC (permalink / raw)
To: fsdevel; +Cc: Jan Kara, Josef Bacik
On 10/14/19 4:30 PM, Eric Sandeen wrote:
> Anything that walks all inodes on sb->s_inodes list without rescheduling
> risks softlockups.
>
> Previous efforts were made in 2 functions, see:
>
> c27d82f fs/drop_caches.c: avoid softlockups in drop_pagecache_sb()
> ac05fbb inode: don't softlockup when evicting inodes
>
> but there hasn't been an audit of all walkers, so do that now. This
> also consistently moves the cond_resched() calls to the bottom of each
> loop in cases where it already exists.
>
> One loop remains: remove_dquot_ref(), because I'm not quite sure how
> to deal with that one w/o taking the i_lock.
>
> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Whoops, cc: Jan & Josef as on original, sorry.
* Re: [PATCH V2] fs: avoid softlockups in s_inodes iterators
2019-10-14 21:30 [PATCH V2] fs: avoid softlockups in s_inodes iterators Eric Sandeen
2019-10-14 21:36 ` Eric Sandeen
@ 2019-10-15 7:37 ` Jan Kara
2019-10-16 2:36 ` Eric Sandeen
2019-10-16 17:11 ` [PATCH 2/1] fs: call fsnotify_sb_delete after evict_inodes Eric Sandeen
From: Jan Kara @ 2019-10-15 7:37 UTC (permalink / raw)
To: Eric Sandeen; +Cc: fsdevel
On Mon 14-10-19 16:30:24, Eric Sandeen wrote:
> Anything that walks all inodes on sb->s_inodes list without rescheduling
> risks softlockups.
>
> Previous efforts were made in 2 functions, see:
>
> c27d82f fs/drop_caches.c: avoid softlockups in drop_pagecache_sb()
> ac05fbb inode: don't softlockup when evicting inodes
>
> but there hasn't been an audit of all walkers, so do that now. This
> also consistently moves the cond_resched() calls to the bottom of each
> loop in cases where it already exists.
>
> One loop remains: remove_dquot_ref(), because I'm not quite sure how
> to deal with that one w/o taking the i_lock.
>
> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Thanks Eric. The patch looks good to me. You can add:
Reviewed-by: Jan Kara <jack@suse.cz>
BTW, I suppose you need to add Al to pick up the patch?
Honza
> ---
>
> V2: Drop unrelated iput cleanups in fsnotify
>
> diff --git a/fs/drop_caches.c b/fs/drop_caches.c
> index d31b6c72b476..dc1a1d5d825b 100644
> --- a/fs/drop_caches.c
> +++ b/fs/drop_caches.c
> @@ -35,11 +35,11 @@ static void drop_pagecache_sb(struct super_block *sb, void *unused)
> spin_unlock(&inode->i_lock);
> spin_unlock(&sb->s_inode_list_lock);
> - cond_resched();
> invalidate_mapping_pages(inode->i_mapping, 0, -1);
> iput(toput_inode);
> toput_inode = inode;
> + cond_resched();
> spin_lock(&sb->s_inode_list_lock);
> }
> spin_unlock(&sb->s_inode_list_lock);
> diff --git a/fs/inode.c b/fs/inode.c
> index fef457a42882..b0c789bb3dba 100644
> --- a/fs/inode.c
> +++ b/fs/inode.c
> @@ -676,6 +676,7 @@ int invalidate_inodes(struct super_block *sb, bool kill_dirty)
> struct inode *inode, *next;
> LIST_HEAD(dispose);
> +again:
> spin_lock(&sb->s_inode_list_lock);
> list_for_each_entry_safe(inode, next, &sb->s_inodes, i_sb_list) {
> spin_lock(&inode->i_lock);
> @@ -698,6 +699,13 @@ int invalidate_inodes(struct super_block *sb, bool kill_dirty)
> inode_lru_list_del(inode);
> spin_unlock(&inode->i_lock);
> list_add(&inode->i_lru, &dispose);
> +
> + if (need_resched()) {
> + spin_unlock(&sb->s_inode_list_lock);
> + cond_resched();
> + dispose_list(&dispose);
> + goto again;
> + }
> }
> spin_unlock(&sb->s_inode_list_lock);
> diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
> index 2ecef6155fc0..ac9eb273e28c 100644
> --- a/fs/notify/fsnotify.c
> +++ b/fs/notify/fsnotify.c
> @@ -77,6 +77,7 @@ static void fsnotify_unmount_inodes(struct super_block *sb)
> iput_inode = inode;
> + cond_resched();
> spin_lock(&sb->s_inode_list_lock);
> }
> spin_unlock(&sb->s_inode_list_lock);
> diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
> index 6e826b454082..4a085b3c7cac 100644
> --- a/fs/quota/dquot.c
> +++ b/fs/quota/dquot.c
> @@ -985,6 +985,7 @@ static int add_dquot_ref(struct super_block *sb, int type)
> * later.
> */
> old_inode = inode;
> + cond_resched();
> spin_lock(&sb->s_inode_list_lock);
> }
> spin_unlock(&sb->s_inode_list_lock);
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
* Re: [PATCH V2] fs: avoid softlockups in s_inodes iterators
2019-10-15 7:37 ` Jan Kara
@ 2019-10-16 2:36 ` Eric Sandeen
2019-10-16 9:42 ` Jan Kara
From: Eric Sandeen @ 2019-10-16 2:36 UTC (permalink / raw)
To: Jan Kara, Eric Sandeen; +Cc: fsdevel, Al Viro
On 10/15/19 2:37 AM, Jan Kara wrote:
> On Mon 14-10-19 16:30:24, Eric Sandeen wrote:
>> Anything that walks all inodes on sb->s_inodes list without rescheduling
>> risks softlockups.
>>
>> Previous efforts were made in 2 functions, see:
>>
>> c27d82f fs/drop_caches.c: avoid softlockups in drop_pagecache_sb()
>> ac05fbb inode: don't softlockup when evicting inodes
>>
>> but there hasn't been an audit of all walkers, so do that now. This
>> also consistently moves the cond_resched() calls to the bottom of each
>> loop in cases where it already exists.
>>
>> One loop remains: remove_dquot_ref(), because I'm not quite sure how
>> to deal with that one w/o taking the i_lock.
>>
>> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
>
> Thanks Eric. The patch looks good to me. You can add:
>
> Reviewed-by: Jan Kara <jack@suse.cz>
thanks
> BTW, I suppose you need to add Al to pickup the patch?
Yeah (cc'd now)
But it was just pointed out to me that if/when the majority of inodes
at umount time have i_count == 0, we'll never hit the resched in
fsnotify_unmount_inodes() and may still have an issue ...
-Eric
* Re: [PATCH V2] fs: avoid softlockups in s_inodes iterators
2019-10-16 2:36 ` Eric Sandeen
@ 2019-10-16 9:42 ` Jan Kara
2019-10-16 13:23 ` Eric Sandeen
From: Jan Kara @ 2019-10-16 9:42 UTC (permalink / raw)
To: Eric Sandeen; +Cc: Jan Kara, Eric Sandeen, fsdevel, Al Viro
On Tue 15-10-19 21:36:08, Eric Sandeen wrote:
> On 10/15/19 2:37 AM, Jan Kara wrote:
> > On Mon 14-10-19 16:30:24, Eric Sandeen wrote:
> >> Anything that walks all inodes on sb->s_inodes list without rescheduling
> >> risks softlockups.
> >>
> >> Previous efforts were made in 2 functions, see:
> >>
> >> c27d82f fs/drop_caches.c: avoid softlockups in drop_pagecache_sb()
> >> ac05fbb inode: don't softlockup when evicting inodes
> >>
> >> but there hasn't been an audit of all walkers, so do that now. This
> >> also consistently moves the cond_resched() calls to the bottom of each
> >> loop in cases where it already exists.
> >>
> >> One loop remains: remove_dquot_ref(), because I'm not quite sure how
> >> to deal with that one w/o taking the i_lock.
> >>
> >> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
> >
> > Thanks Eric. The patch looks good to me. You can add:
> >
> > Reviewed-by: Jan Kara <jack@suse.cz>
>
> thanks
>
> > BTW, I suppose you need to add Al to pickup the patch?
>
> Yeah (cc'd now)
>
> But it was just pointed out to me that if/when the majority of inodes
> at umount time have i_count == 0, we'll never hit the resched in
> fsnotify_unmount_inodes() and may still have an issue ...
Yeah, that's a good point. So that loop will need some further tweaking
(like doing an iget-iput dance in the need_resched() case, as in some other
places).
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
* Re: [PATCH V2] fs: avoid softlockups in s_inodes iterators
2019-10-16 9:42 ` Jan Kara
@ 2019-10-16 13:23 ` Eric Sandeen
2019-10-16 13:49 ` Jan Kara
From: Eric Sandeen @ 2019-10-16 13:23 UTC (permalink / raw)
To: Jan Kara; +Cc: Eric Sandeen, fsdevel, Al Viro
On 10/16/19 4:42 AM, Jan Kara wrote:
> On Tue 15-10-19 21:36:08, Eric Sandeen wrote:
>> On 10/15/19 2:37 AM, Jan Kara wrote:
>>> On Mon 14-10-19 16:30:24, Eric Sandeen wrote:
>>>> Anything that walks all inodes on sb->s_inodes list without rescheduling
>>>> risks softlockups.
>>>>
>>>> Previous efforts were made in 2 functions, see:
>>>>
>>>> c27d82f fs/drop_caches.c: avoid softlockups in drop_pagecache_sb()
>>>> ac05fbb inode: don't softlockup when evicting inodes
>>>>
>>>> but there hasn't been an audit of all walkers, so do that now. This
>>>> also consistently moves the cond_resched() calls to the bottom of each
>>>> loop in cases where it already exists.
>>>>
>>>> One loop remains: remove_dquot_ref(), because I'm not quite sure how
>>>> to deal with that one w/o taking the i_lock.
>>>>
>>>> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
>>>
>>> Thanks Eric. The patch looks good to me. You can add:
>>>
>>> Reviewed-by: Jan Kara <jack@suse.cz>
>>
>> thanks
>>
>>> BTW, I suppose you need to add Al to pickup the patch?
>>
>> Yeah (cc'd now)
>>
>> But it was just pointed out to me that if/when the majority of inodes
>> at umount time have i_count == 0, we'll never hit the resched in
>> fsnotify_unmount_inodes() and may still have an issue ...
>
> Yeah, that's a good point. So that loop will need some further tweaking
> (like doing iget-iput dance in need_resched() case like in some other
> places).
Well, it's already got an iget/iput for anything with i_count > 0. But
as the comment says (and I think it's right...) doing an iget/iput on
i_count == 0 inodes at this point would happen without SB_ACTIVE, and the
final iput here would actually start evicting inodes in /this/ loop, right?
I think we could (ab)use the lru list to construct a "dispose" list for
fsnotify processing as was done in evict_inodes...
or maybe the two should be merged, and fsnotify watches could be handled
directly in evict_inodes. But that doesn't feel quite right.
-Eric
* Re: [PATCH V2] fs: avoid softlockups in s_inodes iterators
2019-10-16 13:23 ` Eric Sandeen
@ 2019-10-16 13:49 ` Jan Kara
2019-10-16 14:39 ` Eric Sandeen
From: Jan Kara @ 2019-10-16 13:49 UTC (permalink / raw)
To: Eric Sandeen; +Cc: Jan Kara, Eric Sandeen, fsdevel, Al Viro
On Wed 16-10-19 08:23:51, Eric Sandeen wrote:
> On 10/16/19 4:42 AM, Jan Kara wrote:
> > On Tue 15-10-19 21:36:08, Eric Sandeen wrote:
> >> On 10/15/19 2:37 AM, Jan Kara wrote:
> >>> On Mon 14-10-19 16:30:24, Eric Sandeen wrote:
> >>>> Anything that walks all inodes on sb->s_inodes list without rescheduling
> >>>> risks softlockups.
> >>>>
> >>>> Previous efforts were made in 2 functions, see:
> >>>>
> >>>> c27d82f fs/drop_caches.c: avoid softlockups in drop_pagecache_sb()
> >>>> ac05fbb inode: don't softlockup when evicting inodes
> >>>>
> >>>> but there hasn't been an audit of all walkers, so do that now. This
> >>>> also consistently moves the cond_resched() calls to the bottom of each
> >>>> loop in cases where it already exists.
> >>>>
> >>>> One loop remains: remove_dquot_ref(), because I'm not quite sure how
> >>>> to deal with that one w/o taking the i_lock.
> >>>>
> >>>> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
> >>>
> >>> Thanks Eric. The patch looks good to me. You can add:
> >>>
> >>> Reviewed-by: Jan Kara <jack@suse.cz>
> >>
> >> thanks
> >>
> >>> BTW, I suppose you need to add Al to pickup the patch?
> >>
> >> Yeah (cc'd now)
> >>
> >> But it was just pointed out to me that if/when the majority of inodes
> >> at umount time have i_count == 0, we'll never hit the resched in
> >> fsnotify_unmount_inodes() and may still have an issue ...
> >
> > Yeah, that's a good point. So that loop will need some further tweaking
> > (like doing iget-iput dance in need_resched() case like in some other
> > places).
>
> Well, it's already got an iget/iput for anything with i_count > 0. But
> as the comment says (and I think it's right...) doing an iget/iput
> on i_count == 0 inodes at this point would be without SB_ACTIVE and the final
> iput here would actually start evicting inodes in /this/ loop, right?
Yes, it would, but since this is just before calling evict_inodes(), I
currently have a hard time remembering why evicting inodes like that would
be an issue.
> I think we could (ab)use the lru list to construct a "dispose" list for
> fsnotify processing as was done in evict_inodes...
>
> or maybe the two should be merged, and fsnotify watches could be handled
> directly in evict_inodes. But that doesn't feel quite right.
Merging the two would be possible (and faster!) as well but I agree it
feels a bit dirty :)
Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
* Re: [PATCH V2] fs: avoid softlockups in s_inodes iterators
2019-10-16 13:49 ` Jan Kara
@ 2019-10-16 14:39 ` Eric Sandeen
2019-10-16 15:26 ` Eric Sandeen
From: Eric Sandeen @ 2019-10-16 14:39 UTC (permalink / raw)
To: Jan Kara; +Cc: Eric Sandeen, fsdevel, Al Viro
On 10/16/19 8:49 AM, Jan Kara wrote:
> On Wed 16-10-19 08:23:51, Eric Sandeen wrote:
>> On 10/16/19 4:42 AM, Jan Kara wrote:
>>> On Tue 15-10-19 21:36:08, Eric Sandeen wrote:
>>>> On 10/15/19 2:37 AM, Jan Kara wrote:
>>>>> On Mon 14-10-19 16:30:24, Eric Sandeen wrote:
>>>>>> Anything that walks all inodes on sb->s_inodes list without rescheduling
>>>>>> risks softlockups.
>>>>>>
>>>>>> Previous efforts were made in 2 functions, see:
>>>>>>
>>>>>> c27d82f fs/drop_caches.c: avoid softlockups in drop_pagecache_sb()
>>>>>> ac05fbb inode: don't softlockup when evicting inodes
>>>>>>
>>>>>> but there hasn't been an audit of all walkers, so do that now. This
>>>>>> also consistently moves the cond_resched() calls to the bottom of each
>>>>>> loop in cases where it already exists.
>>>>>>
>>>>>> One loop remains: remove_dquot_ref(), because I'm not quite sure how
>>>>>> to deal with that one w/o taking the i_lock.
>>>>>>
>>>>>> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
>>>>>
>>>>> Thanks Eric. The patch looks good to me. You can add:
>>>>>
>>>>> Reviewed-by: Jan Kara <jack@suse.cz>
>>>>
>>>> thanks
>>>>
>>>>> BTW, I suppose you need to add Al to pickup the patch?
>>>>
>>>> Yeah (cc'd now)
>>>>
>>>> But it was just pointed out to me that if/when the majority of inodes
>>>> at umount time have i_count == 0, we'll never hit the resched in
>>>> fsnotify_unmount_inodes() and may still have an issue ...
>>>
>>> Yeah, that's a good point. So that loop will need some further tweaking
>>> (like doing iget-iput dance in need_resched() case like in some other
>>> places).
>>
>> Well, it's already got an iget/iput for anything with i_count > 0. But
>> as the comment says (and I think it's right...) doing an iget/iput
>> on i_count == 0 inodes at this point would be without SB_ACTIVE and the final
>> iput here would actually start evicting inodes in /this/ loop, right?
>
> Yes, it would but since this is just before calling evict_inodes(), I have
> currently hard time remembering why evicting inodes like that would be an
> issue.
Probably just weird to effectively evict all inodes prior to evict_inodes() ;)
>> I think we could (ab)use the lru list to construct a "dispose" list for
>> fsnotify processing as was done in evict_inodes...
[narrator: Eric's idea here is dumb and it won't work]
>> or maybe the two should be merged, and fsnotify watches could be handled
>> directly in evict_inodes. But that doesn't feel quite right.
>
> Merging the two would be possible (and faster!) as well but I agree it
> feels a bit dirty :)
It's starting to look like maybe the only option...
I'll see if Al is willing to merge this patch as is for the simple "schedule
the big loops" and see about a 2nd patch on top to do more surgery for this
case.
Thanks,
-Eric
> Honza
>
* Re: [PATCH V2] fs: avoid softlockups in s_inodes iterators
2019-10-16 14:39 ` Eric Sandeen
@ 2019-10-16 15:26 ` Eric Sandeen
2019-10-16 15:35 ` Jan Kara
From: Eric Sandeen @ 2019-10-16 15:26 UTC (permalink / raw)
To: Eric Sandeen, Jan Kara; +Cc: fsdevel, Al Viro
On 10/16/19 9:39 AM, Eric Sandeen wrote:
> On 10/16/19 8:49 AM, Jan Kara wrote:
>> On Wed 16-10-19 08:23:51, Eric Sandeen wrote:
>>> On 10/16/19 4:42 AM, Jan Kara wrote:
>>>> On Tue 15-10-19 21:36:08, Eric Sandeen wrote:
>>>>> On 10/15/19 2:37 AM, Jan Kara wrote:
>>>>>> On Mon 14-10-19 16:30:24, Eric Sandeen wrote:
>>>>>>> Anything that walks all inodes on sb->s_inodes list without rescheduling
>>>>>>> risks softlockups.
>>>>>>>
>>>>>>> Previous efforts were made in 2 functions, see:
>>>>>>>
>>>>>>> c27d82f fs/drop_caches.c: avoid softlockups in drop_pagecache_sb()
>>>>>>> ac05fbb inode: don't softlockup when evicting inodes
>>>>>>>
>>>>>>> but there hasn't been an audit of all walkers, so do that now. This
>>>>>>> also consistently moves the cond_resched() calls to the bottom of each
>>>>>>> loop in cases where it already exists.
>>>>>>>
>>>>>>> One loop remains: remove_dquot_ref(), because I'm not quite sure how
>>>>>>> to deal with that one w/o taking the i_lock.
>>>>>>>
>>>>>>> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
>>>>>>
>>>>>> Thanks Eric. The patch looks good to me. You can add:
>>>>>>
>>>>>> Reviewed-by: Jan Kara <jack@suse.cz>
>>>>>
>>>>> thanks
>>>>>
>>>>>> BTW, I suppose you need to add Al to pickup the patch?
>>>>>
>>>>> Yeah (cc'd now)
>>>>>
>>>>> But it was just pointed out to me that if/when the majority of inodes
>>>>> at umount time have i_count == 0, we'll never hit the resched in
>>>>> fsnotify_unmount_inodes() and may still have an issue ...
>>>>
>>>> Yeah, that's a good point. So that loop will need some further tweaking
>>>> (like doing iget-iput dance in need_resched() case like in some other
>>>> places).
>>>
>>> Well, it's already got an iget/iput for anything with i_count > 0. But
>>> as the comment says (and I think it's right...) doing an iget/iput
>>> on i_count == 0 inodes at this point would be without SB_ACTIVE and the final
>>> iput here would actually start evicting inodes in /this/ loop, right?
>>
>> Yes, it would but since this is just before calling evict_inodes(), I have
>> currently hard time remembering why evicting inodes like that would be an
>> issue.
>
> Probably just weird to effectively evict all inodes prior to evict_inodes() ;)
>
>>> I think we could (ab)use the lru list to construct a "dispose" list for
>>> fsnotify processing as was done in evict_inodes...
>
> [narrator: Eric's idea here is dumb and it won't work]
>
>>> or maybe the two should be merged, and fsnotify watches could be handled
>>> directly in evict_inodes. But that doesn't feel quite right.
>>
>> Merging the two would be possible (and faster!) as well but I agree it
>> feels a bit dirty :)
>
> It's starting to look like maybe the only option...
>
> I'll see if Al is willing to merge this patch as is for the simple "schedule
> the big loops" and see about a 2nd patch on top to do more surgery for this
> case.
Sorry for thinking out loud in public but I'm not too familiar with fsnotify, so
I'm being timid. However, since fsnotify_sb_delete() and evict_inodes() are working
on orthogonal sets of inodes (fsnotify_sb_delete only cares about nonzero refcount,
and evict_inodes only cares about zero refcount), I think we can just swap the order
of the calls. The fsnotify call will then have a much smaller list to walk
(any refcounted inodes) as well.
I'll try to give this a test.
diff --git a/fs/super.c b/fs/super.c
index cfadab2cbf35..cd352530eca9 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -448,10 +448,12 @@ void generic_shutdown_super(struct super_block *sb)
sync_filesystem(sb);
sb->s_flags &= ~SB_ACTIVE;
- fsnotify_sb_delete(sb);
cgroup_writeback_umount();
+ /* evict all inodes with zero refcount */
evict_inodes(sb);
+ /* only nonzero refcount inodes can have marks */
+ fsnotify_sb_delete(sb);
if (sb->s_dio_done_wq) {
destroy_workqueue(sb->s_dio_done_wq);
* Re: [PATCH V2] fs: avoid softlockups in s_inodes iterators
2019-10-16 15:26 ` Eric Sandeen
@ 2019-10-16 15:35 ` Jan Kara
From: Jan Kara @ 2019-10-16 15:35 UTC (permalink / raw)
To: Eric Sandeen; +Cc: Eric Sandeen, Jan Kara, fsdevel, Al Viro
On Wed 16-10-19 10:26:16, Eric Sandeen wrote:
> On 10/16/19 9:39 AM, Eric Sandeen wrote:
> > On 10/16/19 8:49 AM, Jan Kara wrote:
> >> On Wed 16-10-19 08:23:51, Eric Sandeen wrote:
> >>> On 10/16/19 4:42 AM, Jan Kara wrote:
> >>>> On Tue 15-10-19 21:36:08, Eric Sandeen wrote:
> >>>>> On 10/15/19 2:37 AM, Jan Kara wrote:
> >>>>>> On Mon 14-10-19 16:30:24, Eric Sandeen wrote:
> >>>>>>> Anything that walks all inodes on sb->s_inodes list without rescheduling
> >>>>>>> risks softlockups.
> >>>>>>>
> >>>>>>> Previous efforts were made in 2 functions, see:
> >>>>>>>
> >>>>>>> c27d82f fs/drop_caches.c: avoid softlockups in drop_pagecache_sb()
> >>>>>>> ac05fbb inode: don't softlockup when evicting inodes
> >>>>>>>
> >>>>>>> but there hasn't been an audit of all walkers, so do that now. This
> >>>>>>> also consistently moves the cond_resched() calls to the bottom of each
> >>>>>>> loop in cases where it already exists.
> >>>>>>>
> >>>>>>> One loop remains: remove_dquot_ref(), because I'm not quite sure how
> >>>>>>> to deal with that one w/o taking the i_lock.
> >>>>>>>
> >>>>>>> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
> >>>>>>
> >>>>>> Thanks Eric. The patch looks good to me. You can add:
> >>>>>>
> >>>>>> Reviewed-by: Jan Kara <jack@suse.cz>
> >>>>>
> >>>>> thanks
> >>>>>
> >>>>>> BTW, I suppose you need to add Al to pickup the patch?
> >>>>>
> >>>>> Yeah (cc'd now)
> >>>>>
> >>>>> But it was just pointed out to me that if/when the majority of inodes
> >>>>> at umount time have i_count == 0, we'll never hit the resched in
> >>>>> fsnotify_unmount_inodes() and may still have an issue ...
> >>>>
> >>>> Yeah, that's a good point. So that loop will need some further tweaking
> >>>> (like doing iget-iput dance in need_resched() case like in some other
> >>>> places).
> >>>
> >>> Well, it's already got an iget/iput for anything with i_count > 0. But
> >>> as the comment says (and I think it's right...) doing an iget/iput
> >>> on i_count == 0 inodes at this point would be without SB_ACTIVE and the final
> >>> iput here would actually start evicting inodes in /this/ loop, right?
> >>
> >> Yes, it would but since this is just before calling evict_inodes(), I have
> >> currently hard time remembering why evicting inodes like that would be an
> >> issue.
> >
> > Probably just weird to effectively evict all inodes prior to evict_inodes() ;)
> >
> >>> I think we could (ab)use the lru list to construct a "dispose" list for
> >>> fsnotify processing as was done in evict_inodes...
> >
> > [narrator: Eric's idea here is dumb and it won't work]
> >
> >>> or maybe the two should be merged, and fsnotify watches could be handled
> >>> directly in evict_inodes. But that doesn't feel quite right.
> >>
> >> Merging the two would be possible (and faster!) as well but I agree it
> >> feels a bit dirty :)
> >
> > It's starting to look like maybe the only option...
> >
> > I'll see if Al is willing to merge this patch as is for the simple "schedule
> > the big loops" and see about a 2nd patch on top to do more surgery for this
> > case.
>
> Sorry for thinking out loud in public but I'm not too familiar with fsnotify, so
> I'm being timid. However, since fsnotify_sb_delete() and evict_inodes() are working
> on orthogonal sets of inodes (fsnotify_sb_delete only cares about nonzero refcount,
> and evict_inodes only cares about zero refcount), I think we can just swap the order
> of the calls. The fsnotify call will then have a much smaller list to walk
> (any refcounted inodes) as well.
>
> I'll try to give this a test.
Yes, this should make the softlockup impossible to trigger in practice. So
agreed.
Honza
>
> diff --git a/fs/super.c b/fs/super.c
> index cfadab2cbf35..cd352530eca9 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -448,10 +448,12 @@ void generic_shutdown_super(struct super_block *sb)
> sync_filesystem(sb);
> sb->s_flags &= ~SB_ACTIVE;
>
> - fsnotify_sb_delete(sb);
> cgroup_writeback_umount();
>
> + /* evict all inodes with zero refcount */
> evict_inodes(sb);
> + /* only nonzero refcount inodes can have marks */
> + fsnotify_sb_delete(sb);
>
> if (sb->s_dio_done_wq) {
> destroy_workqueue(sb->s_dio_done_wq);
>
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
* [PATCH 2/1] fs: call fsnotify_sb_delete after evict_inodes
2019-10-14 21:30 [PATCH V2] fs: avoid softlockups in s_inodes iterators Eric Sandeen
2019-10-14 21:36 ` Eric Sandeen
2019-10-15 7:37 ` Jan Kara
@ 2019-10-16 17:11 ` Eric Sandeen
2019-10-17 8:39 ` Jan Kara
From: Eric Sandeen @ 2019-10-16 17:11 UTC (permalink / raw)
To: fsdevel; +Cc: Jan Kara, Al Viro, Eric Paris
When a filesystem is unmounted, we currently call fsnotify_sb_delete()
before evict_inodes(), which means that fsnotify_unmount_inodes()
must iterate over all inodes on the superblock, even though it will
only act on inodes with a refcount. This is inefficient and can lead
to livelocks as it iterates over many unrefcounted inodes.
However, since fsnotify_sb_delete() and evict_inodes() are working
on orthogonal sets of inodes (fsnotify_sb_delete() only cares about nonzero
refcount, and evict_inodes() only cares about zero refcount), we can swap
the order of the calls. The fsnotify call will then have a much smaller
list to walk (any refcounted inodes).
This should speed things up overall, and avoid livelocks in
fsnotify_unmount_inodes().
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
---
I just did basic sanity testing here, but AFAIK there is no *notify
test suite, so I'm not sure how to really give this a workout.
diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
index ac9eb273e28c..426f03b6e660 100644
--- a/fs/notify/fsnotify.c
+++ b/fs/notify/fsnotify.c
@@ -57,6 +57,10 @@ static void fsnotify_unmount_inodes(struct super_block *sb)
* doing an __iget/iput with SB_ACTIVE clear would actually
* evict all inodes with zero i_count from icache which is
* unnecessarily violent and may in fact be illegal to do.
+ *
+ * However, we should have been called /after/ evict_inodes
+ * removed all zero refcount inodes, in any case. Test to
+ * be sure.
*/
if (!atomic_read(&inode->i_count)) {
spin_unlock(&inode->i_lock);
diff --git a/fs/super.c b/fs/super.c
index cfadab2cbf35..cd352530eca9 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -448,10 +448,12 @@ void generic_shutdown_super(struct super_block *sb)
sync_filesystem(sb);
sb->s_flags &= ~SB_ACTIVE;
- fsnotify_sb_delete(sb);
cgroup_writeback_umount();
+ /* evict all inodes with zero refcount */
evict_inodes(sb);
+ /* only nonzero refcount inodes can have marks */
+ fsnotify_sb_delete(sb);
if (sb->s_dio_done_wq) {
destroy_workqueue(sb->s_dio_done_wq);
* Re: [PATCH 2/1] fs: call fsnotify_sb_delete after evict_inodes
2019-10-16 17:11 ` [PATCH 2/1] fs: call fsnotify_sb_delete after evict_inodes Eric Sandeen
@ 2019-10-17 8:39 ` Jan Kara
From: Jan Kara @ 2019-10-17 8:39 UTC (permalink / raw)
To: Eric Sandeen; +Cc: fsdevel, Jan Kara, Al Viro, Eric Paris
On Wed 16-10-19 12:11:24, Eric Sandeen wrote:
> When a filesystem is unmounted, we currently call fsnotify_sb_delete()
> before evict_inodes(), which means that fsnotify_unmount_inodes()
> must iterate over all inodes on the superblock, even though it will
> only act on inodes with a refcount. This is inefficient and can lead
> to livelocks as it iterates over many unrefcounted inodes.
>
> However, since fsnotify_sb_delete() and evict_inodes() are working
> on orthogonal sets of inodes (fsnotify_sb_delete() only cares about nonzero
> refcount, and evict_inodes() only cares about zero refcount), we can swap
> the order of the calls. The fsnotify call will then have a much smaller
> list to walk (any refcounted inodes).
>
> This should speed things up overall, and avoid livelocks in
> fsnotify_unmount_inodes().
>
> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Thanks for the patch. It looks good to me. Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
> ---
>
> I just did basic sanity testing here, but AFAIK there is no *notify
> test suite, so I'm not sure how to really give this a workout.
LTP has quite a few tests for inotify & fanotify.
Honza
> diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
> index ac9eb273e28c..426f03b6e660 100644
> --- a/fs/notify/fsnotify.c
> +++ b/fs/notify/fsnotify.c
> @@ -57,6 +57,10 @@ static void fsnotify_unmount_inodes(struct super_block *sb)
> * doing an __iget/iput with SB_ACTIVE clear would actually
> * evict all inodes with zero i_count from icache which is
> * unnecessarily violent and may in fact be illegal to do.
> + *
> + * However, we should have been called /after/ evict_inodes
> + * removed all zero refcount inodes, in any case. Test to
> + * be sure.
> */
> if (!atomic_read(&inode->i_count)) {
> spin_unlock(&inode->i_lock);
> diff --git a/fs/super.c b/fs/super.c
> index cfadab2cbf35..cd352530eca9 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -448,10 +448,12 @@ void generic_shutdown_super(struct super_block *sb)
> sync_filesystem(sb);
> sb->s_flags &= ~SB_ACTIVE;
>
> - fsnotify_sb_delete(sb);
> cgroup_writeback_umount();
>
> + /* evict all inodes with zero refcount */
> evict_inodes(sb);
> + /* only nonzero refcount inodes can have marks */
> + fsnotify_sb_delete(sb);
>
> if (sb->s_dio_done_wq) {
> destroy_workqueue(sb->s_dio_done_wq);
>
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR