linux-raid.vger.kernel.org archive mirror
From: Guoqing Jiang <guoqing.jiang@cloud.ionos.com>
To: Donald Buczek <buczek@molgen.mpg.de>, Song Liu <song@kernel.org>,
	linux-raid@vger.kernel.org,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	it+raid@molgen.mpg.de
Subject: Re: md_raid: mdX_raid6 looping after sync_action "check" to "idle" transition
Date: Tue, 2 Feb 2021 16:42:04 +0100
Message-ID: <0bb7c8d8-6b96-ce70-c5ee-ba414de10561@cloud.ionos.com>
In-Reply-To: <93d8d623-8aec-ad91-490c-a414c4926fb2@molgen.mpg.de>

Hi Donald,

On 1/26/21 17:05, Donald Buczek wrote:
> Dear Guoqing,
> 
> On 26.01.21 15:06, Guoqing Jiang wrote:
>>
>>
>> On 1/26/21 13:58, Donald Buczek wrote:
>>>
>>>
>>>> Hmm, how about wake the waiter up in the while loop of raid5d?
>>>>
>>>> @@ -6520,6 +6532,11 @@ static void raid5d(struct md_thread *thread)
>>>>                          md_check_recovery(mddev);
>>>>                          spin_lock_irq(&conf->device_lock);
>>>>                  }
>>>> +
>>>> +               if ((atomic_read(&conf->active_stripes)
>>>> +                    < (conf->max_nr_stripes * 3 / 4) ||
>>>> +                    (test_bit(MD_RECOVERY_INTR, &mddev->recovery))))
>>>> +                       wake_up(&conf->wait_for_stripe);
>>>>          }
>>>>          pr_debug("%d stripes handled\n", handled);
>>>
>>> Hmm... With this patch on top of your other one, we still have the 
>>> basic symptoms (md3_raid6 busy looping), but the sync thread is now 
>>> hanging at
>>>
>>>      root@sloth:~# cat /proc/$(pgrep md3_resync)/stack
>>>      [<0>] md_do_sync.cold+0x8ec/0x97c
>>>      [<0>] md_thread+0xab/0x160
>>>      [<0>] kthread+0x11b/0x140
>>>      [<0>] ret_from_fork+0x22/0x30
>>>
>>> instead, which is 
>>> https://elixir.bootlin.com/linux/latest/source/drivers/md/md.c#L8963
>>
>> Not sure why recovery_active is not zero, because it is set 0 before 
>> blk_start_plug, and raid5_sync_request returns 0 and skipped is also 
>> set to 1. Perhaps handle_stripe calls md_done_sync.
>>
>> Could you double check the value of recovery_active? Or just don't 
>> wait if resync thread is interrupted.
>>
>> wait_event(mddev->recovery_wait,
>>         test_bit(MD_RECOVERY_INTR,&mddev->recovery) ||
>>         !atomic_read(&mddev->recovery_active));
> 
> With that added, md3_resync goes into a loop, too. Not 100% busy, though.
> 
> root@sloth:~# cat /proc/$(pgrep md3_resync)/stack
> 
> [<0>] raid5_get_active_stripe+0x1e7/0x6b0  # 
> https://elixir.bootlin.com/linux/v5.11-rc5/source/drivers/md/raid5.c#L735
> [<0>] raid5_sync_request+0x2a7/0x3d0       # 
> https://elixir.bootlin.com/linux/v5.11-rc5/source/drivers/md/raid5.c#L6274
> [<0>] md_do_sync.cold+0x3ee/0x97c          # 
> https://elixir.bootlin.com/linux/v5.11-rc5/source/drivers/md/md.c#L8883
> [<0>] md_thread+0xab/0x160
> [<0>] kthread+0x11b/0x140
> [<0>] ret_from_fork+0x22/0x30
> 
> Sometimes the top of the stack is raid5_get_active_stripe+0x1ef/0x6b0 
> instead of raid5_get_active_stripe+0x1e7/0x6b0, so I guess it sleeps, 
> it's woken, but the conditions don't match so it sleeps again.

I don't know why the condition was still not true after the change, 
since MD_RECOVERY_INTR is set and the caller is raid5_sync_request.

wait_event_lock_irq(conf->wait_for_stripe,
     (test_bit(MD_RECOVERY_INTR, &mddev->recovery) && sync_req) ||
      /* the previous condition */,
     *(conf->hash_locks + hash));
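
To make the intended predicate easier to reason about, here is a small 
userspace sketch of it. It is not kernel code: struct stripe_wait_state 
and may_stop_waiting are hypothetical names, and stripe_available simply 
stands in for "the previous condition" above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of the wait condition; not kernel code. */
struct stripe_wait_state {
	bool recovery_intr;    /* models test_bit(MD_RECOVERY_INTR, ...) */
	bool sync_req;         /* caller is the resync path */
	bool stripe_available; /* stands in for "the previous condition" */
};

/* True when the waiter would be allowed to leave the wait. */
static bool may_stop_waiting(const struct stripe_wait_state *s)
{
	assert(s != NULL);
	return (s->recovery_intr && s->sync_req) || s->stripe_available;
}
```

In this model an interrupted resync caller leaves the wait even when no 
stripe is available, while a normal write caller still waits. The sketch 
says nothing about what the caller does after leaving the wait; that 
part has to be handled separately.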


BTW, I think there are some possible ways forward:

1. let "echo idle" give up the reconfig_mutex if there is only a limited 
number of active stripes, but I feel it is ugly to check the stripe (sh) 
count from action_store (kind of a layering violation).

2. make raid5_sync_request -> raid5_get_active_stripe able to bail out 
of the current situation (this is what we tried, though it doesn't work 
so far).

3. set MD_ALLOW_SB_UPDATE as you said, though I am not sure it is safe 
(but maybe I am wrong).

4. write IO keeps coming from the upper layer, which decreases the 
available stripes. Maybe we need to call grow_one_stripe at the 
beginning of raid5_make_request for this case, then call drop_one_stripe 
at the end of make_request.

5. maybe don't hold reconfig_mutex when trying to unregister 
sync_thread, like this.

         /* resync has finished, collect result */
         mddev_unlock(mddev);
         md_unregister_thread(&mddev->sync_thread);
         mddev_lock(mddev);
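
The unlock/join/relock pattern of option 5 can be tried out in userspace 
with pthreads. This is only a toy model under loud assumptions: struct 
model_dev, sync_worker, stop_sync_thread and run_model are hypothetical 
names, and the worker here takes the mutex directly, whereas in the md 
case the dependency on reconfig_mutex is indirect.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace model; not kernel code. */
struct model_dev {
	pthread_mutex_t reconfig_mutex;
	pthread_t sync_thread;
	int result; /* written by the worker under reconfig_mutex */
};

/* The worker cannot finish until it has taken the mutex once. */
static void *sync_worker(void *arg)
{
	struct model_dev *dev = arg;

	pthread_mutex_lock(&dev->reconfig_mutex);
	dev->result = 1;
	pthread_mutex_unlock(&dev->reconfig_mutex);
	return NULL;
}

/* Called with reconfig_mutex held; returns with it held again. */
static int stop_sync_thread(struct model_dev *dev)
{
	int err;

	assert(dev != NULL);
	pthread_mutex_unlock(&dev->reconfig_mutex); /* like mddev_unlock() */
	err = pthread_join(dev->sync_thread, NULL); /* like unregistering */
	pthread_mutex_lock(&dev->reconfig_mutex);   /* like mddev_lock() */
	return err;
}

/* Returns the worker's result, or -1 on error. */
static int run_model(void)
{
	struct model_dev dev = { .result = 0 };
	int err, result;

	if (pthread_mutex_init(&dev.reconfig_mutex, NULL) != 0)
		return -1;
	pthread_mutex_lock(&dev.reconfig_mutex);
	if (pthread_create(&dev.sync_thread, NULL, sync_worker, &dev) != 0) {
		pthread_mutex_unlock(&dev.reconfig_mutex);
		pthread_mutex_destroy(&dev.reconfig_mutex);
		return -1;
	}
	err = stop_sync_thread(&dev);
	result = dev.result;
	pthread_mutex_unlock(&dev.reconfig_mutex);
	pthread_mutex_destroy(&dev.reconfig_mutex);
	return err ? -1 : result;
}
```

If stop_sync_thread joined without dropping the mutex first, this model 
would hang forever. The price of the pattern is that state protected by 
the mutex may change while it is released, so anything read before the 
unlock would need to be re-checked afterwards.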



My suggestion would be to try 2 + 4 together, since the reproducer 
triggers both sync IO and write IO. Or try 5. Perhaps there is a better 
way which I just can't find.
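
For option 4, the bookkeeping idea can be sketched as a toy counter 
model. Everything here is hypothetical (struct stripe_pool, 
free_stripes, begin_write, end_write) and only mimics the effect of 
calling grow_one_stripe and drop_one_stripe around a request; it ignores 
that growing the real stripe cache needs memory and may fail.

```c
#include <assert.h>

/* Hypothetical userspace model of a stripe cache; not kernel code. */
struct stripe_pool {
	int total;  /* stripes currently allocated */
	int in_use; /* stripes currently active */
};

static int free_stripes(const struct stripe_pool *p)
{
	return p->total - p->in_use;
}

/* Start of a write: grow the pool by one, then take one stripe. */
static void begin_write(struct stripe_pool *p)
{
	p->total++;                  /* models grow_one_stripe() */
	assert(free_stripes(p) > 0); /* the new stripe is available */
	p->in_use++;
}

/* End of a write: release the stripe, then shrink the pool again. */
static void end_write(struct stripe_pool *p)
{
	assert(p->in_use > 0);
	p->in_use--;
	p->total--;                  /* models drop_one_stripe() */
}
```

In this model the number of free stripes is the same before, during and 
after a write, so incoming write IO would no longer eat into the stripes 
the sync path is waiting for.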

Thanks,
Guoqing

