* [PATCH] vfs: stop shrinker while fs is freezed @ 2019-12-13 22:24 Junxiao Bi 2019-12-14 0:40 ` Darrick J. Wong 2019-12-16 4:09 ` Dave Chinner 0 siblings, 2 replies; 6+ messages in thread From: Junxiao Bi @ 2019-12-13 22:24 UTC (permalink / raw) To: linux-fsdevel, linux-kernel; +Cc: viro The shrinker can be blocked by a freeze while dropping the last reference to an inode that has been removed. Because the shrinker acquired the "s_umount" lock before it blocked, the thaw hangs on this lock, which causes a deadlock. crash7latest> set 132 PID: 132 COMMAND: "kswapd0:0" TASK: ffff9cdc9dfb5f00 [THREAD_INFO: ffff9cdc9dfb5f00] CPU: 6 STATE: TASK_UNINTERRUPTIBLE crash7latest> bt PID: 132 TASK: ffff9cdc9dfb5f00 CPU: 6 COMMAND: "kswapd0:0" #0 [ffffaa5d075bf900] __schedule at ffffffff8186487c #1 [ffffaa5d075bf998] schedule at ffffffff81864e96 #2 [ffffaa5d075bf9b0] rwsem_down_read_failed at ffffffff818689ee #3 [ffffaa5d075bfa40] call_rwsem_down_read_failed at ffffffff81859308 #4 [ffffaa5d075bfa90] __percpu_down_read at ffffffff810ebd38 #5 [ffffaa5d075bfab0] __sb_start_write at ffffffff812859ef #6 [ffffaa5d075bfad0] xfs_trans_alloc at ffffffffc07ebe9c [xfs] #7 [ffffaa5d075bfb18] xfs_free_eofblocks at ffffffffc07c39d1 [xfs] #8 [ffffaa5d075bfb80] xfs_inactive at ffffffffc07de878 [xfs] #9 [ffffaa5d075bfba0] __dta_xfs_fs_destroy_inode_3543 at ffffffffc07e885e [xfs] #10 [ffffaa5d075bfbd0] destroy_inode at ffffffff812a25de #11 [ffffaa5d075bfbe8] evict at ffffffff812a2b73 #12 [ffffaa5d075bfc10] dispose_list at ffffffff812a2c1d #13 [ffffaa5d075bfc38] prune_icache_sb at ffffffff812a421a #14 [ffffaa5d075bfc70] super_cache_scan at ffffffff812870a1 #15 [ffffaa5d075bfcc8] shrink_slab at ffffffff811eebb3 #16 [ffffaa5d075bfdb0] shrink_node at ffffffff811f4788 #17 [ffffaa5d075bfe38] kswapd at ffffffff811f58c3 #18 [ffffaa5d075bff08] kthread at ffffffff810b75d5 #19 [ffffaa5d075bff50] ret_from_fork at ffffffff81a0035e crash7latest> set 31060 PID: 31060 COMMAND: "safefreeze" TASK: ffff9cd292868000 
[THREAD_INFO: ffff9cd292868000] CPU: 2 STATE: TASK_UNINTERRUPTIBLE crash7latest> bt PID: 31060 TASK: ffff9cd292868000 CPU: 2 COMMAND: "safefreeze" #0 [ffffaa5d10047c90] __schedule at ffffffff8186487c #1 [ffffaa5d10047d28] schedule at ffffffff81864e96 #2 [ffffaa5d10047d40] rwsem_down_write_failed at ffffffff81868f18 #3 [ffffaa5d10047dd8] call_rwsem_down_write_failed at ffffffff81859367 #4 [ffffaa5d10047e20] down_write at ffffffff81867cfd #5 [ffffaa5d10047e38] thaw_super at ffffffff81285d2d #6 [ffffaa5d10047e60] do_vfs_ioctl at ffffffff81299566 #7 [ffffaa5d10047ee8] sys_ioctl at ffffffff81299709 #8 [ffffaa5d10047f28] do_syscall_64 at ffffffff81003949 #9 [ffffaa5d10047f50] entry_SYSCALL_64_after_hwframe at ffffffff81a001ad RIP: 0000000000453d67 RSP: 00007ffff9c1ce78 RFLAGS: 00000206 RAX: ffffffffffffffda RBX: 0000000001cbe92c RCX: 0000000000453d67 RDX: 0000000000000000 RSI: 00000000c0045878 RDI: 0000000000000014 RBP: 00007ffff9c1cf80 R8: 0000000000000000 R9: 0000000000000012 R10: 0000000000000008 R11: 0000000000000206 R12: 0000000000401fb0 R13: 0000000000402040 R14: 0000000000000000 R15: 0000000000000000 ORIG_RAX: 0000000000000010 CS: 0033 SS: 002b Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> --- fs/super.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/fs/super.c b/fs/super.c index cfadab2cbf35..adc18652302b 100644 --- a/fs/super.c +++ b/fs/super.c @@ -80,6 +80,11 @@ static unsigned long super_cache_scan(struct shrinker *shrink, if (!trylock_super(sb)) return SHRINK_STOP; + if (sb->s_writers.frozen != SB_UNFROZEN) { + up_read(&sb->s_umount); + return SHRINK_STOP; + } + if (sb->s_op->nr_cached_objects) fs_objects = sb->s_op->nr_cached_objects(sb, sc); -- 2.17.1 ^ permalink raw reply related [flat|nested] 6+ messages in thread
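The two backtraces above describe a wait-for cycle: kswapd holds "s_umount" for read (taken by trylock_super()) and sleeps in __sb_start_write() until the filesystem is thawed, while thaw_super() sleeps in down_write() on the same "s_umount". A minimal userspace sketch of that cycle follows. It is only a model under stated assumptions: struct model_sb, MODEL_FROZEN, and the helper names are hypothetical and are not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model types; nothing here is a kernel interface. */
enum model_frozen { MODEL_UNFROZEN, MODEL_FROZEN };

struct model_sb {
	int umount_readers;        /* tasks holding s_umount for read */
	enum model_frozen frozen;  /* stands in for sb->s_writers.frozen */
};

/* Models __sb_start_write(): a new write has to wait while frozen. */
static bool write_would_block(const struct model_sb *sb)
{
	return sb->frozen != MODEL_UNFROZEN;
}

/* Models down_write(&sb->s_umount) in thaw_super(): waits for readers. */
static bool thaw_would_block(const struct model_sb *sb)
{
	return sb->umount_readers > 0;
}

/*
 * Models one shrinker pass. Returns true when the pass can never finish:
 * the shrinker holds s_umount for read and waits for a thaw, while the
 * thaw waits for s_umount to be released.
 */
static bool shrinker_deadlocks(struct model_sb *sb, bool eviction_needs_transaction)
{
	bool stuck;

	assert(sb->umount_readers >= 0);
	sb->umount_readers++;  /* trylock_super() succeeded */

	stuck = eviction_needs_transaction &&
		write_would_block(sb) && thaw_would_block(sb);
	if (!stuck)
		sb->umount_readers--;  /* scan finished, lock released */
	return stuck;
}
```

In this model the cycle only closes when all three conditions hold at once: the filesystem is frozen, the shrinker already holds the read lock, and evicting the inode needs a transaction. Removing any one of them breaks the cycle.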
* Re: [PATCH] vfs: stop shrinker while fs is freezed 2019-12-13 22:24 [PATCH] vfs: stop shrinker while fs is freezed Junxiao Bi @ 2019-12-14 0:40 ` Darrick J. Wong 2019-12-16 4:11 ` Dave Chinner 2019-12-16 4:09 ` Dave Chinner 1 sibling, 1 reply; 6+ messages in thread From: Darrick J. Wong @ 2019-12-14 0:40 UTC (permalink / raw) To: Junxiao Bi; +Cc: linux-fsdevel, linux-kernel, viro, linux-mm [adding mm to cc] On Fri, Dec 13, 2019 at 02:24:40PM -0800, Junxiao Bi wrote: > Shrinker could be blocked by freeze while dropping the last reference of > some inode that had been removed. As "s_umount" lock was acquired by the > Shrinker before blocked, the thaw will hung by this lock. This caused a > deadlock. > > crash7latest> set 132 > PID: 132 > COMMAND: "kswapd0:0" > TASK: ffff9cdc9dfb5f00 [THREAD_INFO: ffff9cdc9dfb5f00] > CPU: 6 > STATE: TASK_UNINTERRUPTIBLE > crash7latest> bt > PID: 132 TASK: ffff9cdc9dfb5f00 CPU: 6 COMMAND: "kswapd0:0" > #0 [ffffaa5d075bf900] __schedule at ffffffff8186487c > #1 [ffffaa5d075bf998] schedule at ffffffff81864e96 > #2 [ffffaa5d075bf9b0] rwsem_down_read_failed at ffffffff818689ee > #3 [ffffaa5d075bfa40] call_rwsem_down_read_failed at ffffffff81859308 > #4 [ffffaa5d075bfa90] __percpu_down_read at ffffffff810ebd38 > #5 [ffffaa5d075bfab0] __sb_start_write at ffffffff812859ef > #6 [ffffaa5d075bfad0] xfs_trans_alloc at ffffffffc07ebe9c [xfs] > #7 [ffffaa5d075bfb18] xfs_free_eofblocks at ffffffffc07c39d1 [xfs] > #8 [ffffaa5d075bfb80] xfs_inactive at ffffffffc07de878 [xfs] > #9 [ffffaa5d075bfba0] __dta_xfs_fs_destroy_inode_3543 at ffffffffc07e885e [xfs] > #10 [ffffaa5d075bfbd0] destroy_inode at ffffffff812a25de > #11 [ffffaa5d075bfbe8] evict at ffffffff812a2b73 > #12 [ffffaa5d075bfc10] dispose_list at ffffffff812a2c1d > #13 [ffffaa5d075bfc38] prune_icache_sb at ffffffff812a421a > #14 [ffffaa5d075bfc70] super_cache_scan at ffffffff812870a1 > #15 [ffffaa5d075bfcc8] shrink_slab at ffffffff811eebb3 > #16 [ffffaa5d075bfdb0] shrink_node at 
ffffffff811f4788 > #17 [ffffaa5d075bfe38] kswapd at ffffffff811f58c3 > #18 [ffffaa5d075bff08] kthread at ffffffff810b75d5 > #19 [ffffaa5d075bff50] ret_from_fork at ffffffff81a0035e > crash7latest> set 31060 > PID: 31060 > COMMAND: "safefreeze" > TASK: ffff9cd292868000 [THREAD_INFO: ffff9cd292868000] > CPU: 2 > STATE: TASK_UNINTERRUPTIBLE > crash7latest> bt > PID: 31060 TASK: ffff9cd292868000 CPU: 2 COMMAND: "safefreeze" > #0 [ffffaa5d10047c90] __schedule at ffffffff8186487c > #1 [ffffaa5d10047d28] schedule at ffffffff81864e96 > #2 [ffffaa5d10047d40] rwsem_down_write_failed at ffffffff81868f18 > #3 [ffffaa5d10047dd8] call_rwsem_down_write_failed at ffffffff81859367 > #4 [ffffaa5d10047e20] down_write at ffffffff81867cfd > #5 [ffffaa5d10047e38] thaw_super at ffffffff81285d2d > #6 [ffffaa5d10047e60] do_vfs_ioctl at ffffffff81299566 > #7 [ffffaa5d10047ee8] sys_ioctl at ffffffff81299709 > #8 [ffffaa5d10047f28] do_syscall_64 at ffffffff81003949 > #9 [ffffaa5d10047f50] entry_SYSCALL_64_after_hwframe at ffffffff81a001ad > RIP: 0000000000453d67 RSP: 00007ffff9c1ce78 RFLAGS: 00000206 > RAX: ffffffffffffffda RBX: 0000000001cbe92c RCX: 0000000000453d67 > RDX: 0000000000000000 RSI: 00000000c0045878 RDI: 0000000000000014 > RBP: 00007ffff9c1cf80 R8: 0000000000000000 R9: 0000000000000012 > R10: 0000000000000008 R11: 0000000000000206 R12: 0000000000401fb0 > R13: 0000000000402040 R14: 0000000000000000 R15: 0000000000000000 > ORIG_RAX: 0000000000000010 CS: 0033 SS: 002b > > Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> > --- > fs/super.c | 5 +++++ > 1 file changed, 5 insertions(+) > > diff --git a/fs/super.c b/fs/super.c > index cfadab2cbf35..adc18652302b 100644 > --- a/fs/super.c > +++ b/fs/super.c > @@ -80,6 +80,11 @@ static unsigned long super_cache_scan(struct shrinker *shrink, > if (!trylock_super(sb)) > return SHRINK_STOP; > > + if (sb->s_writers.frozen != SB_UNFROZEN) { > + up_read(&sb->s_umount); > + return SHRINK_STOP; > + } Whatever happened to "let's just fsfreeze the 
filesystems shortly before freezing the system"? Did someone find a reason why that wouldn't work? Also, uh, doesn't this disable memory reclaim for frozen filesystems? Maybe we all need to go review the xfs io-less inode reclaim series so we can stop running transactions in reclaim... I can't merge any of it until the mm changes go upstream. --D > + > if (sb->s_op->nr_cached_objects) > fs_objects = sb->s_op->nr_cached_objects(sb, sc); > > -- > 2.17.1 >
* Re: [PATCH] vfs: stop shrinker while fs is freezed 2019-12-14 0:40 ` Darrick J. Wong @ 2019-12-16 4:11 ` Dave Chinner 2019-12-16 18:56 ` Junxiao Bi 0 siblings, 1 reply; 6+ messages in thread From: Dave Chinner @ 2019-12-16 4:11 UTC (permalink / raw) To: Darrick J. Wong; +Cc: Junxiao Bi, linux-fsdevel, linux-kernel, viro, linux-mm On Fri, Dec 13, 2019 at 04:40:12PM -0800, Darrick J. Wong wrote: > [adding mm to cc] > > On Fri, Dec 13, 2019 at 02:24:40PM -0800, Junxiao Bi wrote: > > Shrinker could be blocked by freeze while dropping the last reference of > > some inode that had been removed. As "s_umount" lock was acquired by the > > Shrinker before blocked, the thaw will hung by this lock. This caused a > > deadlock. > > > > crash7latest> set 132 > > PID: 132 > > COMMAND: "kswapd0:0" > > TASK: ffff9cdc9dfb5f00 [THREAD_INFO: ffff9cdc9dfb5f00] > > CPU: 6 > > STATE: TASK_UNINTERRUPTIBLE > > crash7latest> bt > > PID: 132 TASK: ffff9cdc9dfb5f00 CPU: 6 COMMAND: "kswapd0:0" > > #0 [ffffaa5d075bf900] __schedule at ffffffff8186487c > > #1 [ffffaa5d075bf998] schedule at ffffffff81864e96 > > #2 [ffffaa5d075bf9b0] rwsem_down_read_failed at ffffffff818689ee > > #3 [ffffaa5d075bfa40] call_rwsem_down_read_failed at ffffffff81859308 > > #4 [ffffaa5d075bfa90] __percpu_down_read at ffffffff810ebd38 > > #5 [ffffaa5d075bfab0] __sb_start_write at ffffffff812859ef > > #6 [ffffaa5d075bfad0] xfs_trans_alloc at ffffffffc07ebe9c [xfs] > > #7 [ffffaa5d075bfb18] xfs_free_eofblocks at ffffffffc07c39d1 [xfs] > > #8 [ffffaa5d075bfb80] xfs_inactive at ffffffffc07de878 [xfs] > > #9 [ffffaa5d075bfba0] __dta_xfs_fs_destroy_inode_3543 at ffffffffc07e885e [xfs] > > #10 [ffffaa5d075bfbd0] destroy_inode at ffffffff812a25de > > #11 [ffffaa5d075bfbe8] evict at ffffffff812a2b73 > > #12 [ffffaa5d075bfc10] dispose_list at ffffffff812a2c1d > > #13 [ffffaa5d075bfc38] prune_icache_sb at ffffffff812a421a > > #14 [ffffaa5d075bfc70] super_cache_scan at ffffffff812870a1 > > #15 [ffffaa5d075bfcc8] shrink_slab at 
ffffffff811eebb3 > > #16 [ffffaa5d075bfdb0] shrink_node at ffffffff811f4788 > > #17 [ffffaa5d075bfe38] kswapd at ffffffff811f58c3 > > #18 [ffffaa5d075bff08] kthread at ffffffff810b75d5 > > #19 [ffffaa5d075bff50] ret_from_fork at ffffffff81a0035e > > crash7latest> set 31060 > > PID: 31060 > > COMMAND: "safefreeze" > > TASK: ffff9cd292868000 [THREAD_INFO: ffff9cd292868000] > > CPU: 2 > > STATE: TASK_UNINTERRUPTIBLE > > crash7latest> bt > > PID: 31060 TASK: ffff9cd292868000 CPU: 2 COMMAND: "safefreeze" > > #0 [ffffaa5d10047c90] __schedule at ffffffff8186487c > > #1 [ffffaa5d10047d28] schedule at ffffffff81864e96 > > #2 [ffffaa5d10047d40] rwsem_down_write_failed at ffffffff81868f18 > > #3 [ffffaa5d10047dd8] call_rwsem_down_write_failed at ffffffff81859367 > > #4 [ffffaa5d10047e20] down_write at ffffffff81867cfd > > #5 [ffffaa5d10047e38] thaw_super at ffffffff81285d2d > > #6 [ffffaa5d10047e60] do_vfs_ioctl at ffffffff81299566 > > #7 [ffffaa5d10047ee8] sys_ioctl at ffffffff81299709 > > #8 [ffffaa5d10047f28] do_syscall_64 at ffffffff81003949 > > #9 [ffffaa5d10047f50] entry_SYSCALL_64_after_hwframe at ffffffff81a001ad > > RIP: 0000000000453d67 RSP: 00007ffff9c1ce78 RFLAGS: 00000206 > > RAX: ffffffffffffffda RBX: 0000000001cbe92c RCX: 0000000000453d67 > > RDX: 0000000000000000 RSI: 00000000c0045878 RDI: 0000000000000014 > > RBP: 00007ffff9c1cf80 R8: 0000000000000000 R9: 0000000000000012 > > R10: 0000000000000008 R11: 0000000000000206 R12: 0000000000401fb0 > > R13: 0000000000402040 R14: 0000000000000000 R15: 0000000000000000 > > ORIG_RAX: 0000000000000010 CS: 0033 SS: 002b > > > > Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> > > --- > > fs/super.c | 5 +++++ > > 1 file changed, 5 insertions(+) > > > > diff --git a/fs/super.c b/fs/super.c > > index cfadab2cbf35..adc18652302b 100644 > > --- a/fs/super.c > > +++ b/fs/super.c > > @@ -80,6 +80,11 @@ static unsigned long super_cache_scan(struct shrinker *shrink, > > if (!trylock_super(sb)) > > return SHRINK_STOP; > > > > + if 
(sb->s_writers.frozen != SB_UNFROZEN) { > > + up_read(&sb->s_umount); > > + return SHRINK_STOP; > > + } > > Whatever happened to "let's just fsfreeze the filesystems shortly before > freezing the system? Did someone find a reason why that wouldn't work? > > Also, uh, doesn't this disable memory reclaim for frozen filesystems? > > Maybe we all need to go review the xfs io-less inode reclaim series so > we can stop running transactions in reclaim... I can't merge any of it > until the mm changes go upstream. IO-less reclaim doesn't prevent ->destroy_inode from having to run transactions. e.g. this is the path through which unlink does inode freeing. Background inode inactivation is the patchset that addresses this problem... :) Cheers, Dave. -- Dave Chinner david@fromorbit.com ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] vfs: stop shrinker while fs is freezed
  2019-12-16 4:11 ` Dave Chinner
@ 2019-12-16 18:56 ` Junxiao Bi
  0 siblings, 0 replies; 6+ messages in thread
From: Junxiao Bi @ 2019-12-16 18:56 UTC
To: Dave Chinner, Darrick J. Wong; +Cc: linux-fsdevel, linux-kernel, viro, linux-mm

On 12/15/19 8:11 PM, Dave Chinner wrote:
>>> diff --git a/fs/super.c b/fs/super.c
>>> index cfadab2cbf35..adc18652302b 100644
>>> --- a/fs/super.c
>>> +++ b/fs/super.c
>>> @@ -80,6 +80,11 @@ static unsigned long super_cache_scan(struct shrinker *shrink,
>>>  	if (!trylock_super(sb))
>>>  		return SHRINK_STOP;
>>>
>>> +	if (sb->s_writers.frozen != SB_UNFROZEN) {
>>> +		up_read(&sb->s_umount);
>>> +		return SHRINK_STOP;
>>> +	}
>> Whatever happened to "let's just fsfreeze the filesystems shortly before
>> freezing the system? Did someone find a reason why that wouldn't work?
>>
>> Also, uh, doesn't this disable memory reclaim for frozen filesystems?
>>
>> Maybe we all need to go review the xfs io-less inode reclaim series so
>> we can stop running transactions in reclaim... I can't merge any of it
>> until the mm changes go upstream.
> IO-less reclaim doesn't prevent ->destroy_inode from having to run
> transactions. e.g. this is the path through which unlink does inode
> freeing. Background inode inactivation is the patchset that
> addresses this problem...:)

It sounds like background inode inactivation is only for xfs? I suppose this is a generic issue, and other filesystems will also suffer from it?

Thanks,
Junxiao.

>
> Cheers,
>
> Dave.
* Re: [PATCH] vfs: stop shrinker while fs is freezed 2019-12-13 22:24 [PATCH] vfs: stop shrinker while fs is freezed Junxiao Bi 2019-12-14 0:40 ` Darrick J. Wong @ 2019-12-16 4:09 ` Dave Chinner 2019-12-16 18:53 ` Junxiao Bi 1 sibling, 1 reply; 6+ messages in thread From: Dave Chinner @ 2019-12-16 4:09 UTC (permalink / raw) To: Junxiao Bi; +Cc: linux-fsdevel, linux-kernel, viro On Fri, Dec 13, 2019 at 02:24:40PM -0800, Junxiao Bi wrote: > Shrinker could be blocked by freeze while dropping the last reference of > some inode that had been removed. As "s_umount" lock was acquired by the > Shrinker before blocked, the thaw will hung by this lock. This caused a > deadlock. > > crash7latest> set 132 > PID: 132 > COMMAND: "kswapd0:0" > TASK: ffff9cdc9dfb5f00 [THREAD_INFO: ffff9cdc9dfb5f00] > CPU: 6 > STATE: TASK_UNINTERRUPTIBLE > crash7latest> bt > PID: 132 TASK: ffff9cdc9dfb5f00 CPU: 6 COMMAND: "kswapd0:0" > #0 [ffffaa5d075bf900] __schedule at ffffffff8186487c > #1 [ffffaa5d075bf998] schedule at ffffffff81864e96 > #2 [ffffaa5d075bf9b0] rwsem_down_read_failed at ffffffff818689ee > #3 [ffffaa5d075bfa40] call_rwsem_down_read_failed at ffffffff81859308 > #4 [ffffaa5d075bfa90] __percpu_down_read at ffffffff810ebd38 > #5 [ffffaa5d075bfab0] __sb_start_write at ffffffff812859ef > #6 [ffffaa5d075bfad0] xfs_trans_alloc at ffffffffc07ebe9c [xfs] > #7 [ffffaa5d075bfb18] xfs_free_eofblocks at ffffffffc07c39d1 [xfs] > #8 [ffffaa5d075bfb80] xfs_inactive at ffffffffc07de878 [xfs] > #9 [ffffaa5d075bfba0] __dta_xfs_fs_destroy_inode_3543 at ffffffffc07e885e [xfs] > #10 [ffffaa5d075bfbd0] destroy_inode at ffffffff812a25de > #11 [ffffaa5d075bfbe8] evict at ffffffff812a2b73 > #12 [ffffaa5d075bfc10] dispose_list at ffffffff812a2c1d > #13 [ffffaa5d075bfc38] prune_icache_sb at ffffffff812a421a > #14 [ffffaa5d075bfc70] super_cache_scan at ffffffff812870a1 > #15 [ffffaa5d075bfcc8] shrink_slab at ffffffff811eebb3 > #16 [ffffaa5d075bfdb0] shrink_node at ffffffff811f4788 > #17 [ffffaa5d075bfe38] 
kswapd at ffffffff811f58c3 > #18 [ffffaa5d075bff08] kthread at ffffffff810b75d5 > #19 [ffffaa5d075bff50] ret_from_fork at ffffffff81a0035e How did you get a file that needed EOF block trimming to be disposed of when the filesystem is frozen? Part of freezing the filesystem is tossing all the reclaimable inodes out of the cache, which means all the inodes that might require EOF block trimming should have already been removed from the cache before the freeze goes into effect.... > crash7latest> set 31060 > PID: 31060 > COMMAND: "safefreeze" > TASK: ffff9cd292868000 [THREAD_INFO: ffff9cd292868000] > CPU: 2 > STATE: TASK_UNINTERRUPTIBLE > crash7latest> bt > PID: 31060 TASK: ffff9cd292868000 CPU: 2 COMMAND: "safefreeze" > #0 [ffffaa5d10047c90] __schedule at ffffffff8186487c > #1 [ffffaa5d10047d28] schedule at ffffffff81864e96 > #2 [ffffaa5d10047d40] rwsem_down_write_failed at ffffffff81868f18 > #3 [ffffaa5d10047dd8] call_rwsem_down_write_failed at ffffffff81859367 > #4 [ffffaa5d10047e20] down_write at ffffffff81867cfd > #5 [ffffaa5d10047e38] thaw_super at ffffffff81285d2d > #6 [ffffaa5d10047e60] do_vfs_ioctl at ffffffff81299566 > #7 [ffffaa5d10047ee8] sys_ioctl at ffffffff81299709 > #8 [ffffaa5d10047f28] do_syscall_64 at ffffffff81003949 > #9 [ffffaa5d10047f50] entry_SYSCALL_64_after_hwframe at ffffffff81a001ad > RIP: 0000000000453d67 RSP: 00007ffff9c1ce78 RFLAGS: 00000206 > RAX: ffffffffffffffda RBX: 0000000001cbe92c RCX: 0000000000453d67 > RDX: 0000000000000000 RSI: 00000000c0045878 RDI: 0000000000000014 > RBP: 00007ffff9c1cf80 R8: 0000000000000000 R9: 0000000000000012 > R10: 0000000000000008 R11: 0000000000000206 R12: 0000000000401fb0 > R13: 0000000000402040 R14: 0000000000000000 R15: 0000000000000000 > ORIG_RAX: 0000000000000010 CS: 0033 SS: 002b > > Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> > --- > fs/super.c | 5 +++++ > 1 file changed, 5 insertions(+) > > diff --git a/fs/super.c b/fs/super.c > index cfadab2cbf35..adc18652302b 100644 > --- a/fs/super.c > 
+++ b/fs/super.c > @@ -80,6 +80,11 @@ static unsigned long super_cache_scan(struct shrinker *shrink, > if (!trylock_super(sb)) > return SHRINK_STOP; > > + if (sb->s_writers.frozen != SB_UNFROZEN) { > + up_read(&sb->s_umount); > + return SHRINK_STOP; > + } Ah, no. Now go run a filesystem traversal over a filesystem with a few tens of millions of files in it while the filesystem is frozen, and watch the dentry and inode caches grow and grow until you run out of memory.... The shrinker *needs* to run while the filesystem is frozen, but it should not be tripping over files that need modification on eviction. Working out how we got a file that required truncation during eviction is the first thing to do here so we can then determine if a) we should have caught it at freeze time, b) whether it can be caught at freeze time, c) whether it can safely be skipped during a freeze, or d) something else.... Cheers, Dave. -- Dave Chinner david@fromorbit.com
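The objection above can be made concrete with a small model that contrasts the posted patch (stop scanning entirely while frozen) with option c) from the list (skip only the inodes that would need modification on eviction). This is a sketch under assumptions, not a proposed fix: struct model_inode, the policy names, and model_scan() are hypothetical, and whether such inodes can be skipped safely is exactly the open question raised in the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cached object: does evicting it need a transaction? */
struct model_inode {
	bool needs_modification;
};

enum model_policy {
	MODEL_STOP_WHEN_FROZEN,  /* behaviour of the posted patch */
	MODEL_SKIP_WHEN_FROZEN   /* option c): skip only problem inodes */
};

/* Returns how many cached inodes one scan could free under a policy. */
static size_t model_scan(const struct model_inode *inodes, size_t n,
			 bool frozen, enum model_policy policy)
{
	size_t freed = 0;

	assert(inodes != NULL || n == 0);

	/* The posted patch returns SHRINK_STOP before looking at anything. */
	if (frozen && policy == MODEL_STOP_WHEN_FROZEN)
		return 0;

	for (size_t i = 0; i < n; i++) {
		/* While frozen, leave inodes that would start a transaction. */
		if (frozen && inodes[i].needs_modification)
			continue;
		freed++;
	}
	return freed;
}
```

With MODEL_STOP_WHEN_FROZEN nothing is ever freed on a frozen filesystem, which is the unbounded cache growth described above. With MODEL_SKIP_WHEN_FROZEN the clean inodes are still reclaimable and only the ones needing modification stay cached until the thaw.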
* Re: [PATCH] vfs: stop shrinker while fs is freezed 2019-12-16 4:09 ` Dave Chinner @ 2019-12-16 18:53 ` Junxiao Bi 0 siblings, 0 replies; 6+ messages in thread From: Junxiao Bi @ 2019-12-16 18:53 UTC (permalink / raw) To: Dave Chinner; +Cc: linux-fsdevel, linux-kernel, viro Hi Dave, On 12/15/19 8:09 PM, Dave Chinner wrote: > On Fri, Dec 13, 2019 at 02:24:40PM -0800, Junxiao Bi wrote: >> Shrinker could be blocked by freeze while dropping the last reference of >> some inode that had been removed. As "s_umount" lock was acquired by the >> Shrinker before blocked, the thaw will hung by this lock. This caused a >> deadlock. >> >> crash7latest> set 132 >> PID: 132 >> COMMAND: "kswapd0:0" >> TASK: ffff9cdc9dfb5f00 [THREAD_INFO: ffff9cdc9dfb5f00] >> CPU: 6 >> STATE: TASK_UNINTERRUPTIBLE >> crash7latest> bt >> PID: 132 TASK: ffff9cdc9dfb5f00 CPU: 6 COMMAND: "kswapd0:0" >> #0 [ffffaa5d075bf900] __schedule at ffffffff8186487c >> #1 [ffffaa5d075bf998] schedule at ffffffff81864e96 >> #2 [ffffaa5d075bf9b0] rwsem_down_read_failed at ffffffff818689ee >> #3 [ffffaa5d075bfa40] call_rwsem_down_read_failed at ffffffff81859308 >> #4 [ffffaa5d075bfa90] __percpu_down_read at ffffffff810ebd38 >> #5 [ffffaa5d075bfab0] __sb_start_write at ffffffff812859ef >> #6 [ffffaa5d075bfad0] xfs_trans_alloc at ffffffffc07ebe9c [xfs] >> #7 [ffffaa5d075bfb18] xfs_free_eofblocks at ffffffffc07c39d1 [xfs] >> #8 [ffffaa5d075bfb80] xfs_inactive at ffffffffc07de878 [xfs] >> #9 [ffffaa5d075bfba0] __dta_xfs_fs_destroy_inode_3543 at ffffffffc07e885e [xfs] >> #10 [ffffaa5d075bfbd0] destroy_inode at ffffffff812a25de >> #11 [ffffaa5d075bfbe8] evict at ffffffff812a2b73 >> #12 [ffffaa5d075bfc10] dispose_list at ffffffff812a2c1d >> #13 [ffffaa5d075bfc38] prune_icache_sb at ffffffff812a421a >> #14 [ffffaa5d075bfc70] super_cache_scan at ffffffff812870a1 >> #15 [ffffaa5d075bfcc8] shrink_slab at ffffffff811eebb3 >> #16 [ffffaa5d075bfdb0] shrink_node at ffffffff811f4788 >> #17 [ffffaa5d075bfe38] kswapd at ffffffff811f58c3 
>> #18 [ffffaa5d075bff08] kthread at ffffffff810b75d5 >> #19 [ffffaa5d075bff50] ret_from_fork at ffffffff81a0035e > How did you get a file that needed EOF block trimming to be disposed > of when the filesystem is frozen? > > Part of freezing the filesystem is tossing all the reclaimable > inodes out of the cache, which means all the inodes that might > require EOF block trimming should have already been removed from the > cache before the freeze goes into effect.... This issue happened only once, and I don't know how it was triggered. I am not an xfs developer, so I don't know what an EOF block means. But this seems to be a generic issue: will destroying an inode also run a transaction for non-EOF blocks? > >> crash7latest> set 31060 >> PID: 31060 >> COMMAND: "safefreeze" >> TASK: ffff9cd292868000 [THREAD_INFO: ffff9cd292868000] >> CPU: 2 >> STATE: TASK_UNINTERRUPTIBLE >> crash7latest> bt >> PID: 31060 TASK: ffff9cd292868000 CPU: 2 COMMAND: "safefreeze" >> #0 [ffffaa5d10047c90] __schedule at ffffffff8186487c >> #1 [ffffaa5d10047d28] schedule at ffffffff81864e96 >> #2 [ffffaa5d10047d40] rwsem_down_write_failed at ffffffff81868f18 >> #3 [ffffaa5d10047dd8] call_rwsem_down_write_failed at ffffffff81859367 >> #4 [ffffaa5d10047e20] down_write at ffffffff81867cfd >> #5 [ffffaa5d10047e38] thaw_super at ffffffff81285d2d >> #6 [ffffaa5d10047e60] do_vfs_ioctl at ffffffff81299566 >> #7 [ffffaa5d10047ee8] sys_ioctl at ffffffff81299709 >> #8 [ffffaa5d10047f28] do_syscall_64 at ffffffff81003949 >> #9 [ffffaa5d10047f50] entry_SYSCALL_64_after_hwframe at ffffffff81a001ad >> RIP: 0000000000453d67 RSP: 00007ffff9c1ce78 RFLAGS: 00000206 >> RAX: ffffffffffffffda RBX: 0000000001cbe92c RCX: 0000000000453d67 >> RDX: 0000000000000000 RSI: 00000000c0045878 RDI: 0000000000000014 >> RBP: 00007ffff9c1cf80 R8: 0000000000000000 R9: 0000000000000012 >> R10: 0000000000000008 R11: 0000000000000206 R12: 0000000000401fb0 >> R13: 0000000000402040 R14: 0000000000000000 R15: 0000000000000000 >> ORIG_RAX: 0000000000000010 CS: 0033 
SS: 002b >> >> Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com> >> --- >> fs/super.c | 5 +++++ >> 1 file changed, 5 insertions(+) >> >> diff --git a/fs/super.c b/fs/super.c >> index cfadab2cbf35..adc18652302b 100644 >> --- a/fs/super.c >> +++ b/fs/super.c >> @@ -80,6 +80,11 @@ static unsigned long super_cache_scan(struct shrinker *shrink, >> if (!trylock_super(sb)) >> return SHRINK_STOP; >> >> + if (sb->s_writers.frozen != SB_UNFROZEN) { >> + up_read(&sb->s_umount); >> + return SHRINK_STOP; >> + } > Ah, no. Now go run a filesystem traversal over a filesystem with a > few tens of millions of files in it while the filesystem is frozen, and > watch the dentry and inode caches grow and grow until you run out of > memory.... Yes, this is an issue. > > The shrinker *needs* to run while the filesystem is frozen, but it > should not be tripping over files that need modification on > eviction. Working out how we got a file that required truncation > during eviction is the first thing to do here so we can then > determine if a) we should have caught it at freeze time, b) whether > it can be caught at freeze time, c) whether it can safely be skipped > during a freeze, or d) something else.... I will think about it, thank you. Thanks, Junxiao. > > Cheers, > > Dave.