On 10/09/2017 01:52 PM, NeilBrown wrote: > On Mon, Oct 09 2017, Xiao Ni wrote: > >> On 10/09/2017 12:57 PM, NeilBrown wrote: >>> It would if you had applied >>> [PATCH 3/4] md: use mddev_suspend/resume instead of ->quiesce() >>> >>> Did you apply all 4 patches? >> Sorry, it's my mistake. I insmod the wrong module. I'll apply the four >> patches >> and do test again. >>> Thanks. I looks suspend_lo_store() is calling raid5_quiesce() directly >>> as you say - so a patch is missing. >> Yes, thanks for pointing about this. Hi Neil I applied the four patches and one patch "md: fix deadlock error in recent patch." There is a new stuck. It's stuck at suspend_hi_store this time. I add the calltrace as an attachment. I added some printk to print some information. [12695.993329] mddev suspend : 1 [12695.996270] mddev ro : 0 [12695.998790] mddev insync : 0 [12696.001641] mddev active io: 1 mddev->flags doesn't have MD_SB_CHANGE_PENDING. root 8653 0.0 0.0 9688 4752 ? DLs 00:52 0:00 /usr/sbin/mdadm --grow --continue /dev/md0 [root@dell-pr1700-02 md]# cat /proc/8653/stack [] mddev_suspend+0x12c/0x160 [md_mod] [] suspend_hi_store+0x7c/0xe0 [md_mod] [] md_attr_store+0x80/0xc0 [md_mod] [] sysfs_kf_write+0x3a/0x50 [] kernfs_fop_write+0xff/0x180 [] __vfs_write+0x37/0x170 [] vfs_write+0xb2/0x1b0 [] SyS_write+0x55/0xc0 [] do_syscall_64+0x67/0x150 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff root 1234 0.0 0.0 106008 7280 ? Ss Oct09 0:00 /usr/sbin/sshd -D root 8655 0.1 0.0 108996 2752 pts/0 D+ 00:52 0:04 | \_ dd if=/dev/urandom of=/mnt/md_test/testfile bs=1M count=1000 [root@dell-pr1700-02 md]# cat /proc/8655/stack [] wait_transaction_locked+0x8a/0xd0 [jbd2] [] add_transaction_credits+0x1c4/0x2a0 [jbd2] [] start_this_handle+0x197/0x400 [jbd2] [] jbd2__journal_start+0xeb/0x1f0 [jbd2] [] __ext4_journal_start_sb+0x6d/0xf0 [ext4] [] ext4_da_write_begin+0x140/0x410 [ext4] [] generic_perform_write+0xbe/0x1b0 [] __generic_file_write_iter+0x19b/0x1e0 [] ext4_file_write_iter+0x28f/0x3f0 [ext4] [] __vfs_write+0xf3/0x170 [] vfs_write+0xb2/0x1b0 [] SyS_write+0x55/0xc0 [] do_syscall_64+0x67/0x150 [] entry_SYSCALL64_slow_path+0x25/0x25 [] 0xffffffffffffffff root 8143 0.0 0.0 0 0 ? D 00:40 0:00 \_ [kworker/u8:4] [root@dell-pr1700-02 md]# cat /proc/8143/stack [] md_make_request+0xb1/0x260 [md_mod] [] generic_make_request+0x117/0x2f0 [] submit_bio+0x75/0x150 [] ext4_io_submit+0x4c/0x60 [ext4] [] ext4_bio_write_page+0x1a4/0x3b0 [ext4] [] mpage_submit_page+0x57/0x70 [ext4] [] mpage_map_and_submit_buffers+0x168/0x290 [ext4] [] ext4_writepages+0x852/0xe80 [ext4] [] do_writepages+0x1c/0x70 [] __writeback_single_inode+0x45/0x320 [] writeback_sb_inodes+0x280/0x570 [] __writeback_inodes_wb+0x8c/0xc0 [] wb_writeback+0x276/0x310 [] wb_workfn+0x19c/0x3b0 [] process_one_work+0x149/0x360 [] worker_thread+0x4d/0x3c0 [] kthread+0x109/0x140 [] ret_from_fork+0x25/0x30 [] 0xffffffffffffffff >> Hmm, so it's the first question. Why can't call md_check_recovery when >> MD_SB_CHANGE_PENDING >> is set. It needs to update the superblock too when MD_SB_CHANGE_PENDING >> is set. I can't >> understand this part. >> >> Can it be: >> --- a/drivers/md/raid5.c >> +++ b/drivers/md/raid5.c >> @@ -6299,7 +6299,7 @@ static void raid5d(struct md_thread *thread) >> break; >> handled += batch_size; >> >> - if (mddev->sb_flags & ~(1 << MD_SB_CHANGE_PENDING)) { >> + if (mddev->sb_flags) { >> > Maybe it could, but there is a test in md_check_recovery() > > if ( ! ( > (mddev->sb_flags & ~ (1< > and it makes sense to match that. There is no point dropping the > spinlock and reclaiming it if md_check_recovery() isn't going to do > anything useful. It's introduced by: commit 126925c090155f13e90b9e7e8c4010e96027c00a Author: NeilBrown Date: Tue Sep 7 17:02:47 2010 +1000 md: call md_update_sb even for 'external' metadata arrays. For external metadata we don't want to take mddev lock if only MD_SB_CHANGE_PENDING is set. But it also reduce the opportunity to update superblock for internal metadata if only MD_SB_CHANGE_PENDING is set. Can it be: diff --git a/drivers/md/md.c b/drivers/md/md.c index b6b7a28..55e9280 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -7777,7 +7777,7 @@ void md_check_recovery(struct mddev *mddev) if (mddev->ro && !test_bit(MD_RECOVERY_NEEDED, &mddev->recovery)) return; if ( ! ( - (mddev->flags & ~ (1<flags & (mddev->external == 1 && ~ (1<recovery) || test_bit(MD_RECOVERY_DONE, &mddev->recovery) || (mddev->external == 0 && mddev->safemode == 1) || Best Regards Xiao