* [PATCH 0/2 V2] avoid softlockups in various s_inodes iterators

From: Eric Sandeen @ 2019-11-07 20:52 UTC
To: linux-fsdevel; +Cc: viro

2 patches to make sure we either schedule in an s_inodes walking
loop, or do our best to limit the size of the walk, to avoid soft
lockups.

V2 to fix inexplicable patch mangling

Thanks,
-Eric
* [PATCH 1/2] fs: avoid softlockups in s_inodes iterators

From: Eric Sandeen @ 2019-11-07 20:52 UTC
To: linux-fsdevel; +Cc: viro

Anything that walks all inodes on sb->s_inodes list without rescheduling
risks softlockups.

Previous efforts were made in 2 functions, see:

c27d82f fs/drop_caches.c: avoid softlockups in drop_pagecache_sb()
ac05fbb inode: don't softlockup when evicting inodes

but there hasn't been an audit of all walkers, so do that now.  This
also consistently moves the cond_resched() calls to the bottom of each
loop in cases where it already exists.

One loop remains: remove_dquot_ref(), because I'm not quite sure how to
deal with that one w/o taking the i_lock.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
---
 fs/drop_caches.c     | 2 +-
 fs/inode.c           | 7 +++++++
 fs/notify/fsnotify.c | 1 +
 fs/quota/dquot.c     | 1 +
 4 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/fs/drop_caches.c b/fs/drop_caches.c
index d31b6c7..dc1a1d5 100644
--- a/fs/drop_caches.c
+++ b/fs/drop_caches.c
@@ -35,11 +35,11 @@ static void drop_pagecache_sb(struct super_block *sb, void *unused)
 		spin_unlock(&inode->i_lock);
 		spin_unlock(&sb->s_inode_list_lock);
-		cond_resched();
 		invalidate_mapping_pages(inode->i_mapping, 0, -1);
 		iput(toput_inode);
 		toput_inode = inode;
+		cond_resched();
 		spin_lock(&sb->s_inode_list_lock);
 	}
 	spin_unlock(&sb->s_inode_list_lock);
diff --git a/fs/inode.c b/fs/inode.c
index fef457a..96d62d9 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -676,6 +676,7 @@ int invalidate_inodes(struct super_block *sb, bool kill_dirty)
 	struct inode *inode, *next;
 	LIST_HEAD(dispose);
+again:
 	spin_lock(&sb->s_inode_list_lock);
 	list_for_each_entry_safe(inode, next, &sb->s_inodes, i_sb_list) {
 		spin_lock(&inode->i_lock);
@@ -698,6 +699,12 @@ int invalidate_inodes(struct super_block *sb, bool kill_dirty)
 		inode_lru_list_del(inode);
 		spin_unlock(&inode->i_lock);
 		list_add(&inode->i_lru, &dispose);
+		if (need_resched()) {
+			spin_unlock(&sb->s_inode_list_lock);
+			cond_resched();
+			dispose_list(&dispose);
+			goto again;
+		}
 	}
 	spin_unlock(&sb->s_inode_list_lock);
diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
index 2ecef61..ac9eb27 100644
--- a/fs/notify/fsnotify.c
+++ b/fs/notify/fsnotify.c
@@ -77,6 +77,7 @@ static void fsnotify_unmount_inodes(struct super_block *sb)
 		iput_inode = inode;
+		cond_resched();
 		spin_lock(&sb->s_inode_list_lock);
 	}
 	spin_unlock(&sb->s_inode_list_lock);
diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index 6e826b4..4a085b3 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -985,6 +985,7 @@ static int add_dquot_ref(struct super_block *sb, int type)
 		 * later.
 		 */
 		old_inode = inode;
+		cond_resched();
 		spin_lock(&sb->s_inode_list_lock);
 	}
 	spin_unlock(&sb->s_inode_list_lock);
--
1.8.3.1
* [PATCH 2/2] fs: call fsnotify_sb_delete after evict_inodes

From: Eric Sandeen @ 2019-11-07 20:52 UTC
To: linux-fsdevel; +Cc: viro

When a filesystem is unmounted, we currently call fsnotify_sb_delete()
before evict_inodes(), which means that fsnotify_unmount_inodes() must
iterate over all inodes on the superblock, even though it will only act
on inodes with a refcount.  This is inefficient and can lead to
livelocks as it iterates over many unrefcounted inodes.

However, since fsnotify_sb_delete() and evict_inodes() are working on
orthogonal sets of inodes (fsnotify_sb_delete() only cares about
nonzero refcount, and evict_inodes() only cares about zero refcount),
we can swap the order of the calls.  The fsnotify call will then have a
much smaller list to walk (any refcounted inodes).

This should speed things up overall, and avoid livelocks in
fsnotify_unmount_inodes().

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
---
 fs/notify/fsnotify.c | 3 +++
 fs/super.c           | 4 +++-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
index ac9eb27..f44e39c 100644
--- a/fs/notify/fsnotify.c
+++ b/fs/notify/fsnotify.c
@@ -57,6 +57,9 @@ static void fsnotify_unmount_inodes(struct super_block *sb)
 		 * doing an __iget/iput with SB_ACTIVE clear would actually
 		 * evict all inodes with zero i_count from icache which is
 		 * unnecessarily violent and may in fact be illegal to do.
+		 * However, we should have been called /after/ evict_inodes
+		 * removed all zero refcount inodes, in any case.  Test to
+		 * be sure.
 		 */
 		if (!atomic_read(&inode->i_count)) {
 			spin_unlock(&inode->i_lock);
diff --git a/fs/super.c b/fs/super.c
index cfadab2..cd35253 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -448,10 +448,12 @@ void generic_shutdown_super(struct super_block *sb)
 		sync_filesystem(sb);
 		sb->s_flags &= ~SB_ACTIVE;
-		fsnotify_sb_delete(sb);
 		cgroup_writeback_umount();
+		/* evict all inodes with zero refcount */
 		evict_inodes(sb);
+		/* only nonzero refcount inodes can have marks */
+		fsnotify_sb_delete(sb);
 		if (sb->s_dio_done_wq) {
 			destroy_workqueue(sb->s_dio_done_wq);
--
1.8.3.1
* Re: [PATCH 2/2] fs: call fsnotify_sb_delete after evict_inodes

From: Al Viro @ 2019-11-13 23:26 UTC
To: Eric Sandeen; +Cc: linux-fsdevel

On Thu, Nov 07, 2019 at 02:52:34PM -0600, Eric Sandeen wrote:
> When a filesystem is unmounted, we currently call fsnotify_sb_delete()
> before evict_inodes(), which means that fsnotify_unmount_inodes()
> must iterate over all inodes on the superblock, even though it will
> only act on inodes with a refcount.  This is inefficient and can lead
> to livelocks as it iterates over many unrefcounted inodes.
>
> However, since fsnotify_sb_delete() and evict_inodes() are working
> on orthogonal sets of inodes (fsnotify_sb_delete() only cares about
> nonzero refcount, and evict_inodes() only cares about zero refcount),
> we can swap the order of the calls.  The fsnotify call will then have
> a much smaller list to walk (any refcounted inodes).
>
> This should speed things up overall, and avoid livelocks in
> fsnotify_unmount_inodes().

Umm...  The critical part you've omitted here is that at this stage any
final iput() done by fsnotify_sb_delete() (or anybody else, really) will
forcibly evict the sucker there and then.  So it's not as if any inodes
were *added* to the evictable set by fsnotify_sb_delete() to be picked
by evict_inodes() - any candidate is immediately disposed of.  The
crucial point is that SB_ACTIVE is already cleared by that stage -
without that the patch would've been badly broken.

That aside, both patches look sane.  Could you update the commit message
and resend the second one?
* [PATCH 2/2 V2] fs: call fsnotify_sb_delete after evict_inodes

From: Eric Sandeen @ 2019-11-15 13:47 UTC
To: linux-fsdevel; +Cc: viro

When a filesystem is unmounted, we currently call fsnotify_sb_delete()
before evict_inodes(), which means that fsnotify_unmount_inodes() must
iterate over all inodes on the superblock looking for any inodes with
watches.  This is inefficient and can lead to livelocks as it iterates
over many unwatched inodes.

At this point, SB_ACTIVE is gone and dropping the refcount to zero
kicks the inode out immediately, so anything processed by
fsnotify_sb_delete / fsnotify_unmount_inodes gets evicted in that loop.

After that, the call to evict_inodes will evict everything else with a
zero refcount.

This should speed things up overall, and avoid livelocks in
fsnotify_unmount_inodes().

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
---
V2: Update changelog

 fs/notify/fsnotify.c | 3 +++
 fs/super.c           | 4 +++-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
index ac9eb27..f44e39c 100644
--- a/fs/notify/fsnotify.c
+++ b/fs/notify/fsnotify.c
@@ -57,6 +57,9 @@ static void fsnotify_unmount_inodes(struct super_block *sb)
 		 * doing an __iget/iput with SB_ACTIVE clear would actually
 		 * evict all inodes with zero i_count from icache which is
 		 * unnecessarily violent and may in fact be illegal to do.
+		 * However, we should have been called /after/ evict_inodes
+		 * removed all zero refcount inodes, in any case.  Test to
+		 * be sure.
 		 */
 		if (!atomic_read(&inode->i_count)) {
 			spin_unlock(&inode->i_lock);
diff --git a/fs/super.c b/fs/super.c
index cfadab2..cd35253 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -448,10 +448,12 @@ void generic_shutdown_super(struct super_block *sb)
 		sync_filesystem(sb);
 		sb->s_flags &= ~SB_ACTIVE;
-		fsnotify_sb_delete(sb);
 		cgroup_writeback_umount();
+		/* evict all inodes with zero refcount */
 		evict_inodes(sb);
+		/* only nonzero refcount inodes can have marks */
+		fsnotify_sb_delete(sb);
 		if (sb->s_dio_done_wq) {
 			destroy_workqueue(sb->s_dio_done_wq);
--
1.8.3.1
* Re: [PATCH 0/2 V2] avoid softlockups in various s_inodes iterators

From: Eric Sandeen @ 2019-11-13 22:07 UTC
To: linux-fsdevel; +Cc: viro

On 11/7/19 2:52 PM, Eric Sandeen wrote:
> 2 patches to make sure we either schedule in an s_inodes walking
> loop, or do our best to limit the size of the walk, to avoid soft
> lockups.
>
> V2 to fix inexplicable patch mangling
>
> Thanks,
> -Eric

Al, ping on this?

Thanks,
-Eric
* [PATCH 0/2] avoid softlockups in various s_inodes iterators

From: Eric Sandeen @ 2019-11-07 19:42 UTC
To: fsdevel; +Cc: Al Viro

2 patches to make sure we either schedule in an s_inodes walking loop,
or do our best to limit the size of the walk, to avoid soft lockups.

-Eric
* [PATCH 1/2] fs: avoid softlockups in s_inodes iterators

From: Eric Sandeen @ 2019-11-07 19:47 UTC
To: fsdevel; +Cc: Al Viro

Anything that walks all inodes on sb->s_inodes list without rescheduling
risks softlockups.

Previous efforts were made in 2 functions, see:

c27d82f fs/drop_caches.c: avoid softlockups in drop_pagecache_sb()
ac05fbb inode: don't softlockup when evicting inodes

but there hasn't been an audit of all walkers, so do that now.  This
also consistently moves the cond_resched() calls to the bottom of each
loop in cases where it already exists.

One loop remains: remove_dquot_ref(), because I'm not quite sure how to
deal with that one w/o taking the i_lock.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
---
V2: Drop unrelated iput cleanups in fsnotify

diff --git a/fs/drop_caches.c b/fs/drop_caches.c
index d31b6c72b476..dc1a1d5d825b 100644
--- a/fs/drop_caches.c
+++ b/fs/drop_caches.c
@@ -35,11 +35,11 @@ static void drop_pagecache_sb(struct super_block *sb, void *unused)
 		spin_unlock(&inode->i_lock);
 		spin_unlock(&sb->s_inode_list_lock);
-		cond_resched();
 		invalidate_mapping_pages(inode->i_mapping, 0, -1);
 		iput(toput_inode);
 		toput_inode = inode;
+		cond_resched();
 		spin_lock(&sb->s_inode_list_lock);
 	}
 	spin_unlock(&sb->s_inode_list_lock);
diff --git a/fs/inode.c b/fs/inode.c
index fef457a42882..b0c789bb3dba 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -676,6 +676,7 @@ int invalidate_inodes(struct super_block *sb, bool kill_dirty)
 	struct inode *inode, *next;
 	LIST_HEAD(dispose);
+again:
 	spin_lock(&sb->s_inode_list_lock);
 	list_for_each_entry_safe(inode, next, &sb->s_inodes, i_sb_list) {
 		spin_lock(&inode->i_lock);
@@ -698,6 +699,13 @@ int invalidate_inodes(struct super_block *sb, bool kill_dirty)
 		inode_lru_list_del(inode);
 		spin_unlock(&inode->i_lock);
 		list_add(&inode->i_lru, &dispose);
+
+		if (need_resched()) {
+			spin_unlock(&sb->s_inode_list_lock);
+			cond_resched();
+			dispose_list(&dispose);
+			goto again;
+		}
 	}
 	spin_unlock(&sb->s_inode_list_lock);
diff --git a/fs/notify/fsnotify.c b/fs/notify/fsnotify.c
index 2ecef6155fc0..ac9eb273e28c 100644
--- a/fs/notify/fsnotify.c
+++ b/fs/notify/fsnotify.c
@@ -77,6 +77,7 @@ static void fsnotify_unmount_inodes(struct super_block *sb)
 		iput_inode = inode;
+		cond_resched();
 		spin_lock(&sb->s_inode_list_lock);
 	}
 	spin_unlock(&sb->s_inode_list_lock);
diff --git a/fs/quota/dquot.c b/fs/quota/dquot.c
index 6e826b454082..4a085b3c7cac 100644
--- a/fs/quota/dquot.c
+++ b/fs/quota/dquot.c
@@ -985,6 +985,7 @@ static int add_dquot_ref(struct super_block *sb, int type)
 		 * later.
 		 */
 		old_inode = inode;
+		cond_resched();
 		spin_lock(&sb->s_inode_list_lock);
 	}
 	spin_unlock(&sb->s_inode_list_lock);
Thread overview: 7+ messages

2019-11-07 20:52 [PATCH 0/2 V2] avoid softlockups in various s_inodes iterators  Eric Sandeen
2019-11-07 20:52 ` [PATCH 1/2] fs: avoid softlockups in s_inodes iterators  Eric Sandeen
2019-11-07 20:52 ` [PATCH 2/2] fs: call fsnotify_sb_delete after evict_inodes  Eric Sandeen
2019-11-13 23:26   ` Re: [PATCH 2/2]  Al Viro
2019-11-15 13:47   ` [PATCH 2/2 V2] fs: call fsnotify_sb_delete after evict_inodes  Eric Sandeen
2019-11-13 22:07 ` Re: [PATCH 0/2 V2]  Eric Sandeen

Earlier posting (strict thread matches above, loose matches below):

2019-11-07 19:42 [PATCH 0/2] avoid softlockups in various s_inodes iterators  Eric Sandeen
2019-11-07 19:47 ` [PATCH 1/2] fs: avoid softlockups in s_inodes iterators  Eric Sandeen