From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Junxiao Bi <junxiao.bi@oracle.com>
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	viro@zeniv.linux.org.uk, linux-mm@kvack.org
Subject: Re: [PATCH] vfs: stop shrinker while fs is freezed
Date: Fri, 13 Dec 2019 16:40:12 -0800
Message-ID: <20191214004012.GC99868@magnolia>
In-Reply-To: <20191213222440.11519-1-junxiao.bi@oracle.com>

[adding mm to cc]

On Fri, Dec 13, 2019 at 02:24:40PM -0800, Junxiao Bi wrote:
> The shrinker can be blocked by a freeze while dropping the last reference
> to an inode that has been removed. Because the shrinker acquired the
> "s_umount" lock before it blocked, the thaw then hangs on that lock. This
> causes a deadlock.
> 
>  crash7latest> set 132
>      PID: 132
>  COMMAND: "kswapd0:0"
>     TASK: ffff9cdc9dfb5f00  [THREAD_INFO: ffff9cdc9dfb5f00]
>      CPU: 6
>    STATE: TASK_UNINTERRUPTIBLE
>  crash7latest> bt
>  PID: 132    TASK: ffff9cdc9dfb5f00  CPU: 6   COMMAND: "kswapd0:0"
>   #0 [ffffaa5d075bf900] __schedule at ffffffff8186487c
>   #1 [ffffaa5d075bf998] schedule at ffffffff81864e96
>   #2 [ffffaa5d075bf9b0] rwsem_down_read_failed at ffffffff818689ee
>   #3 [ffffaa5d075bfa40] call_rwsem_down_read_failed at ffffffff81859308
>   #4 [ffffaa5d075bfa90] __percpu_down_read at ffffffff810ebd38
>   #5 [ffffaa5d075bfab0] __sb_start_write at ffffffff812859ef
>   #6 [ffffaa5d075bfad0] xfs_trans_alloc at ffffffffc07ebe9c [xfs]
>   #7 [ffffaa5d075bfb18] xfs_free_eofblocks at ffffffffc07c39d1 [xfs]
>   #8 [ffffaa5d075bfb80] xfs_inactive at ffffffffc07de878 [xfs]
>   #9 [ffffaa5d075bfba0] __dta_xfs_fs_destroy_inode_3543 at ffffffffc07e885e [xfs]
>  #10 [ffffaa5d075bfbd0] destroy_inode at ffffffff812a25de
>  #11 [ffffaa5d075bfbe8] evict at ffffffff812a2b73
>  #12 [ffffaa5d075bfc10] dispose_list at ffffffff812a2c1d
>  #13 [ffffaa5d075bfc38] prune_icache_sb at ffffffff812a421a
>  #14 [ffffaa5d075bfc70] super_cache_scan at ffffffff812870a1
>  #15 [ffffaa5d075bfcc8] shrink_slab at ffffffff811eebb3
>  #16 [ffffaa5d075bfdb0] shrink_node at ffffffff811f4788
>  #17 [ffffaa5d075bfe38] kswapd at ffffffff811f58c3
>  #18 [ffffaa5d075bff08] kthread at ffffffff810b75d5
>  #19 [ffffaa5d075bff50] ret_from_fork at ffffffff81a0035e
>  crash7latest> set 31060
>      PID: 31060
>  COMMAND: "safefreeze"
>     TASK: ffff9cd292868000  [THREAD_INFO: ffff9cd292868000]
>      CPU: 2
>    STATE: TASK_UNINTERRUPTIBLE
>  crash7latest> bt
>  PID: 31060  TASK: ffff9cd292868000  CPU: 2   COMMAND: "safefreeze"
>   #0 [ffffaa5d10047c90] __schedule at ffffffff8186487c
>   #1 [ffffaa5d10047d28] schedule at ffffffff81864e96
>   #2 [ffffaa5d10047d40] rwsem_down_write_failed at ffffffff81868f18
>   #3 [ffffaa5d10047dd8] call_rwsem_down_write_failed at ffffffff81859367
>   #4 [ffffaa5d10047e20] down_write at ffffffff81867cfd
>   #5 [ffffaa5d10047e38] thaw_super at ffffffff81285d2d
>   #6 [ffffaa5d10047e60] do_vfs_ioctl at ffffffff81299566
>   #7 [ffffaa5d10047ee8] sys_ioctl at ffffffff81299709
>   #8 [ffffaa5d10047f28] do_syscall_64 at ffffffff81003949
>   #9 [ffffaa5d10047f50] entry_SYSCALL_64_after_hwframe at ffffffff81a001ad
>      RIP: 0000000000453d67  RSP: 00007ffff9c1ce78  RFLAGS: 00000206
>      RAX: ffffffffffffffda  RBX: 0000000001cbe92c  RCX: 0000000000453d67
>      RDX: 0000000000000000  RSI: 00000000c0045878  RDI: 0000000000000014
>      RBP: 00007ffff9c1cf80   R8: 0000000000000000   R9: 0000000000000012
>      R10: 0000000000000008  R11: 0000000000000206  R12: 0000000000401fb0
>      R13: 0000000000402040  R14: 0000000000000000  R15: 0000000000000000
>      ORIG_RAX: 0000000000000010  CS: 0033  SS: 002b
> 
> Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
> ---
>  fs/super.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/fs/super.c b/fs/super.c
> index cfadab2cbf35..adc18652302b 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -80,6 +80,11 @@ static unsigned long super_cache_scan(struct shrinker *shrink,
>  	if (!trylock_super(sb))
>  		return SHRINK_STOP;
>  
> +	if (sb->s_writers.frozen != SB_UNFROZEN) {
> +		up_read(&sb->s_umount);
> +		return SHRINK_STOP;
> +	}

Whatever happened to "let's just fsfreeze the filesystems shortly before
freezing the system"?  Did someone find a reason why that wouldn't work?

Also, uh, doesn't this disable memory reclaim for frozen filesystems?
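
To make that concern concrete, here is a minimal userspace sketch of the
control flow the hunk adds. It is a toy model, not kernel code: struct
model_sb, model_cache_scan() and the MODEL_* names are invented for
illustration, and a plain counter stands in for the s_umount read lock.

```c
#include <assert.h>

/* Toy stand-ins; these are assumptions, not the kernel's definitions. */
#define MODEL_SHRINK_STOP (~0UL)

enum model_frozen { MODEL_UNFROZEN = 0, MODEL_FROZEN = 1 };

struct model_sb {
	enum model_frozen frozen;	/* stands in for sb->s_writers.frozen */
	unsigned long cached;		/* reclaimable objects on this fs */
	int umount_readers;		/* stands in for read holds on s_umount */
};

/*
 * Same shape as the patched scan path: take the lock, then give up
 * without reclaiming anything if the filesystem is not unfrozen.
 */
static unsigned long model_cache_scan(struct model_sb *sb,
				      unsigned long nr_to_scan)
{
	unsigned long freed;

	assert(sb);
	sb->umount_readers++;		/* models trylock_super() succeeding */

	if (sb->frozen != MODEL_UNFROZEN) {
		sb->umount_readers--;	/* models up_read(&sb->s_umount) */
		return MODEL_SHRINK_STOP;
	}

	/* Unfrozen: reclaim up to nr_to_scan cached objects. */
	freed = nr_to_scan < sb->cached ? nr_to_scan : sb->cached;
	sb->cached -= freed;
	sb->umount_readers--;
	return freed;
}
```

In this model, while frozen is set every call returns the stop value and
cached never shrinks, however many times the scan is retried.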

Maybe we all need to go review the xfs io-less inode reclaim series so
we can stop running transactions in reclaim... I can't merge any of it
until the mm changes go upstream.

--D

> +
>  	if (sb->s_op->nr_cached_objects)
>  		fs_objects = sb->s_op->nr_cached_objects(sb, sc);
>  
> -- 
> 2.17.1
> 

Thread overview: 6+ messages
2019-12-13 22:24 [PATCH] vfs: stop shrinker while fs is freezed Junxiao Bi
2019-12-14  0:40 ` Darrick J. Wong [this message]
2019-12-16  4:11   ` Dave Chinner
2019-12-16 18:56     ` Junxiao Bi
2019-12-16  4:09 ` Dave Chinner
2019-12-16 18:53   ` Junxiao Bi
