linux-raid.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Tkaczyk, Mariusz" <mariusz.tkaczyk@linux.intel.com>
To: Song Liu <song@kernel.org>
Cc: linux-raid <linux-raid@vger.kernel.org>
Subject: Re: [PATCH 1/2] md, raid1, raid10: Set MD_BROKEN for RAID1 and RAID10
Date: Mon, 27 Sep 2021 16:54:27 +0200	[thread overview]
Message-ID: <83f06776-891f-dffa-7449-4128c419ada9@linux.intel.com> (raw)
In-Reply-To: <CAPhsuW4HZFaPgKx68mDeWE0F7SAjpnXmHL0TzN1SuZD6_Kds-w@mail.gmail.com>

Hi Song,
Thank for review.

On 24.09.2021 23:15, Song Liu wrote:
>>> diff --git a/drivers/md/md.c b/drivers/md/md.c
>>> index c322841d4edc..ac20eb2ddff7 100644
>>> --- a/drivers/md/md.c
>>> +++ b/drivers/md/md.c
>>> @@ -926,8 +926,9 @@ static void super_written(struct bio *bio)
>>>                  pr_err("md: %s gets error=%d\n", __func__,
>>>                         blk_status_to_errno(bio->bi_status));
>>>                  md_error(mddev, rdev);
>>> -               if (!test_bit(Faulty, &rdev->flags)
>>> -                   && (bio->bi_opf & MD_FAILFAST)) {
>>> +               if (!test_bit(Faulty, &rdev->flags) &&
>>> +                    !test_bit(MD_BROKEN, &mddev->flags) &&
>>> +                    (bio->bi_opf & MD_FAILFAST)) {
>>
>> So with MD_BROKEN, we will not try to update the SB?
Array is dead, is there added value in writing failed state?

For external arrays failed state is not written, because drive is
not removed from MD device and metadata manager cannot detect array
failure. This is how it was originally implemented (expect raid5 but I
aligned it around two years ago[1]). I tried to make it consistent for
everyone but I failed. Second patch restores possibility to remove
drive by kernel for raid5 so failed state will be detected (for external)
again.
So, maybe I should drop this change for native. What do you think?

>>
>>>                          set_bit(MD_SB_NEED_REWRITE, &mddev->sb_flags);
>>>                          set_bit(LastDev, &rdev->flags);
>>>                  }
>>> @@ -2979,7 +2980,8 @@ state_store(struct md_rdev *rdev, const char *buf, size_t len)
>>>          int err = -EINVAL;
>>>          if (cmd_match(buf, "faulty") && rdev->mddev->pers) {
>>>                  md_error(rdev->mddev, rdev);
>>> -               if (test_bit(Faulty, &rdev->flags))
>>> +
>>> +               if (!test_bit(MD_BROKEN, &rdev->mddev->flags))
>>
>> I don't think this makes much sense. EBUSY for already failed array
>> sounds weird.
>> Also, shall we also set MD_BROKEN here?
>>
> Actually, we just called md_error above, so we don't need to set MD_BROKEN here.
> But we shouldn't return EBUSY in such cases, right?
>
About EBUSY:
This is how it is implemented in mdadm, we are expecting it in
case of failure. See my fix[2].
I agree that it can be confusing, but this is how it is working.
Do you want to change it across mdadm and md?
This will break compatibility.

About MD_BROKEN:
As you see we are determining failure by checking rdev state, if "Faulty"
in flags after md_error() is not set, then it assumes that array is
failed and EBUSY is returned to userspace.

I changed verification to checking MD_BROKEN flag instead. It is
potentially harmful for raid1 and raid10 as old way is working well but
main target was to unify it across levels. Current verification method
is not flexible enough for raid5 (second patch).
Small benefit is that new IO will be failed faster, in md_submit_io().

Thanks,
Mariusz

[1] https://git.kernel.org/pub/scm/linux/kernel/git/song/md.git/commit/?id=fb73b3
[2] https://git.kernel.org/pub/scm/utils/mdadm/mdadm.git/commit/?id=cb8f5371

  reply	other threads:[~2021-09-27 14:54 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-17 15:34 [PATCH 0/2] Use MD_BROKEN for redundant arrays Mariusz Tkaczyk
2021-09-17 15:34 ` [PATCH 1/2] md, raid1, raid10: Set MD_BROKEN for RAID1 and RAID10 Mariusz Tkaczyk
2021-09-24 21:09   ` Song Liu
2021-09-24 21:15     ` Song Liu
2021-09-27 14:54       ` Tkaczyk, Mariusz [this message]
2021-09-27 22:59         ` Song Liu
2021-09-28  7:33           ` Tkaczyk, Mariusz
2021-09-28  7:55             ` Tkaczyk, Mariusz
2021-09-29  1:29               ` Song Liu
2021-09-30 11:23                 ` Tkaczyk, Mariusz
2021-09-30 15:56                   ` Song Liu
2021-09-17 15:34 ` [PATCH 2/2] raid5: introduce MD_BROKEN Mariusz Tkaczyk
2021-09-24 21:18   ` Song Liu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=83f06776-891f-dffa-7449-4128c419ada9@linux.intel.com \
    --to=mariusz.tkaczyk@linux.intel.com \
    --cc=linux-raid@vger.kernel.org \
    --cc=song@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).