All of lore.kernel.org
 help / color / mirror / Atom feed
From: Xiao Ni <xni@redhat.com>
To: NeilBrown <neilb@suse.com>
Cc: linux-raid@vger.kernel.org
Subject: Re: [PATCH 0/4] RFC: attempt to remove md deadlocks with metadata without
Date: Mon, 9 Oct 2017 13:32:16 +0800	[thread overview]
Message-ID: <441ae9fe-fd73-2aac-8bb1-c64da28cda27@redhat.com> (raw)
In-Reply-To: <87a810zznc.fsf@notabene.neil.brown.name>



On 10/09/2017 12:57 PM, NeilBrown wrote:
> On Sun, Oct 08 2017, Xiao Ni wrote:
>
>> ----- Original Message -----
>>> From: "NeilBrown" <neilb@suse.com>
>>> To: "Xiao Ni" <xni@redhat.com>
>>> Cc: linux-raid@vger.kernel.org
>>> Sent: Friday, October 6, 2017 12:32:19 PM
>>> Subject: Re: [PATCH 0/4] RFC: attempt to remove md deadlocks with metadata without
>>>
>>> On Fri, Oct 06 2017, Xiao Ni wrote:
>>>
>>>> On 10/05/2017 01:17 PM, NeilBrown wrote:
>>>>> On Thu, Sep 14 2017, Xiao Ni wrote:
>>>>>
>>>>>>> What do
>>>>>>>    cat /proc/8987/stack
>>>>>>>    cat /proc/8983/stack
>>>>>>>    cat /proc/8966/stack
>>>>>>>    cat /proc/8381/stack
>>>>>>>
>>>>>>> show??
>>>>> ...
>>>>>
>>>>>> /usr/sbin/mdadm --grow --continue /dev/md0. Is it the reason to add
>>>>>> lockdep_assert_held(&mddev->reconfig_mutex)?
>>>>>> [root@dell-pr1700-02 ~]# cat /proc/8983/stack
>>>>>> [<ffffffffa0a3464c>] mddev_suspend+0x12c/0x160 [md_mod]
>>>>>> [<ffffffffa0a379ec>] suspend_lo_store+0x7c/0xe0 [md_mod]
>>>>>> [<ffffffffa0a3b7d0>] md_attr_store+0x80/0xc0 [md_mod]
>>>>>> [<ffffffff812ec8da>] sysfs_kf_write+0x3a/0x50
>>>>>> [<ffffffff812ec39f>] kernfs_fop_write+0xff/0x180
>>>>>> [<ffffffff81260457>] __vfs_write+0x37/0x170
>>>>>> [<ffffffff812619e2>] vfs_write+0xb2/0x1b0
>>>>>> [<ffffffff81263015>] SyS_write+0x55/0xc0
>>>>>> [<ffffffff810037c7>] do_syscall_64+0x67/0x150
>>>>>> [<ffffffff81777527>] entry_SYSCALL64_slow_path+0x25/0x25
>>>>>> [<ffffffffffffffff>] 0xffffffffffffffff
>>>>>>
>>>>>> [jbd2/md0-8]
>>>>>> [root@dell-pr1700-02 ~]# cat /proc/8966/stack
>>>>>> [<ffffffffa0a39b20>] md_write_start+0xf0/0x220 [md_mod]
>>>>>> [<ffffffffa0972b49>] raid5_make_request+0x89/0x8b0 [raid456]
>>>>>> [<ffffffffa0a34175>] md_make_request+0xf5/0x260 [md_mod]
>>>>>> [<ffffffff81376427>] generic_make_request+0x117/0x2f0
>>>>>> [<ffffffff81376675>] submit_bio+0x75/0x150
>>>>>> [<ffffffff8129e0b0>] submit_bh_wbc+0x140/0x170
>>>>>> [<ffffffff8129e683>] submit_bh+0x13/0x20
>>>>>> [<ffffffffa0957e29>] jbd2_write_superblock+0x109/0x230 [jbd2]
>>>>>> [<ffffffffa0957f8b>] jbd2_journal_update_sb_log_tail+0x3b/0x80 [jbd2]
>>>>>> [<ffffffffa09517ff>] jbd2_journal_commit_transaction+0x16ef/0x19e0 [jbd2]
>>>>>> [<ffffffffa0955d02>] kjournald2+0xd2/0x260 [jbd2]
>>>>>> [<ffffffff810c73f9>] kthread+0x109/0x140
>>>>>> [<ffffffff817776c5>] ret_from_fork+0x25/0x30
>>>>>> [<ffffffffffffffff>] 0xffffffffffffffff
>>>>> Thanks for this (and sorry it took so long to get to it).
>>>>> It looks like
>>>>>
>>>>> Commit: cc27b0c78c79 ("md: fix deadlock between mddev_suspend() and
>>>>> md_write_start()")
>>>>>
>>>>> is badly broken.  I wonder how it ever passed testing.
>>>>>
>>>>> In write_start() is change the wait_event() call to
>>>>>
>>>>> 	wait_event(mddev->sb_wait,
>>>>> 		   !test_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags) &&
>>>>> 		   !mddev->suspended);
>>>>>
>>>>>
>>>>> That should be
>>>>>
>>>>> 	wait_event(mddev->sb_wait,
>>>>> 		   !test_bit(MD_SB_CHANGE_PENDING, &mddev->sb_flags) ||
>>>>> 		   mddev->suspended);
>>>> Hi Neil
>>>>
>>>> Do we want write bio can be handled when mddev->suspended is 1? After
>>>> changing to this,
>>>> write bio can be handled when mddev->suspended is 1.
>>> This is OK.
>>> New write bios will not get past md_handle_request().
>>> A write bios that did get past md_handle_request() is still allowed
>>> through md_write_start().  The mddev_suspend() call won't complete until
>>> that write bio has finished.
>> Hi Neil
>>
>> Thanks for the explanation. I took some time to read the emails about the
>> patch cc27b0c78 which introduced this. It's similar with this problem I
>> countered. But there is a call of function mddev_suspend in level_store.
>> So add the check of mddev->suspended in md_write_start can fix the problem
>> "reshape raid5 -> raid6 atop bcache deadlocks at start on md_attr_store /
>> raid5_make_request".
>>
>> In function suspend_lo_store it doesn't call mddev_suspend under mddev->reconfig_mutex.
> It would if you had applied
>     [PATCH 3/4] md: use mddev_suspend/resume instead of ->quiesce()
>
> Did you apply all 4 patches?

Sorry, it's my mistake. I insmod the wrong module. I'll apply the four 
patches
and do test again.
> Thanks.  I looks suspend_lo_store() is calling raid5_quiesce() directly
> as you say - so a patch is missing.

Yes, thanks for pointing about this.

>>>> Hmm, I have a question. Why can't call md_check_recovery when
>>>> MD_SB_CHANGE_PENDING
>>>> is set in raid5d?
>>> When MD_SB_CHANGE_PENDING is not set, there is no need to call
>>> md_check_recovery().  I wouldn't hurt except that it would be a waste of
>>> time.
>> I'm confused. If we want to call md_check_recovery when MD_SB_CHANGE_PENDING
>> is set, it should be
> Sorry, I described the condition wrongly.
> If any bit is set in ->sb_flags (except MD_SB_CHANGE_PENDING), then
> we need to call md_check_recovery().  If none of those other bits
> are set, there is no need.

Hmm, so it's the first question. Why can't call md_check_recovery when 
MD_SB_CHANGE_PENDING
is set. It needs to update the superblock too when MD_SB_CHANGE_PENDING 
is set. I can't
understand this part.

Can it be:
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -6299,7 +6299,7 @@ static void raid5d(struct md_thread *thread)
                         break;
                 handled += batch_size;

-               if (mddev->sb_flags & ~(1 << MD_SB_CHANGE_PENDING)) {
+               if (mddev->sb_flags) {


Best Regards
Xiao

>
> NeilBrown


  reply	other threads:[~2017-10-09  5:32 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-09-12  1:49 [PATCH 0/4] RFC: attempt to remove md deadlocks with metadata without NeilBrown
2017-09-12  1:49 ` [PATCH 3/4] md: use mddev_suspend/resume instead of ->quiesce() NeilBrown
2017-09-12  1:49 ` [PATCH 1/4] md: always hold reconfig_mutex when calling mddev_suspend() NeilBrown
2017-09-12  1:49 ` [PATCH 4/4] md: allow metadata update while suspending NeilBrown
2017-09-12  1:49 ` [PATCH 2/4] md: don't call bitmap_create() while array is quiesced NeilBrown
2017-09-12  2:51 ` [PATCH 0/4] RFC: attempt to remove md deadlocks with metadata without Xiao Ni
2017-09-13  2:11 ` Xiao Ni
2017-09-13 15:09   ` Xiao Ni
2017-09-13 23:05     ` NeilBrown
2017-09-14  4:55       ` Xiao Ni
2017-09-14  5:32         ` NeilBrown
2017-09-14  7:57           ` Xiao Ni
2017-09-16 13:15             ` Xiao Ni
2017-10-05  5:17             ` NeilBrown
2017-10-06  3:53               ` Xiao Ni
2017-10-06  4:32                 ` NeilBrown
2017-10-09  1:21                   ` Xiao Ni
2017-10-09  4:57                     ` NeilBrown
2017-10-09  5:32                       ` Xiao Ni [this message]
2017-10-09  5:52                         ` NeilBrown
2017-10-10  6:05                           ` Xiao Ni
2017-10-10 21:20                             ` NeilBrown
     [not found]                               ` <960568852.19225619.1507689864371.JavaMail.zimbra@redhat.com>
2017-10-13  3:48                                 ` NeilBrown
2017-10-16  4:43                                   ` Xiao Ni
2017-09-30  9:46 ` Xiao Ni
2017-10-05  5:03   ` NeilBrown
2017-10-06  3:40     ` Xiao Ni

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=441ae9fe-fd73-2aac-8bb1-c64da28cda27@redhat.com \
    --to=xni@redhat.com \
    --cc=linux-raid@vger.kernel.org \
    --cc=neilb@suse.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.