linux-raid.vger.kernel.org archive mirror
From: Yu Kuai <yukuai1@huaweicloud.com>
To: Peter Neuwirth <reddunur@online.de>,
	linux-raid@vger.kernel.org, "yukuai (C)" <yukuai3@huawei.com>
Subject: Re: linux mdadm assembly error: md: cannot handle concurrent replacement and reshape. (reboot while reshaping)
Date: Thu, 4 May 2023 16:16:50 +0800
Message-ID: <34d38acf-64aa-d9c1-e603-a4551612b8ac@huaweicloud.com>
In-Reply-To: <e2f96772-bfbc-f43b-6da1-f520e5164536@online.de>

Hi,

On 2023/04/28 5:09, Peter Neuwirth wrote:
> Hello linux-raid group.
> 
> I have an issue with my linux raid setup and I hope somebody here
> could help me get my raid active again without data loss.
> 
> I have a Debian 11 system with one RAID array (6x 1TB HDDs, RAID level 5)
> that was running until today, when I added two more 1TB HDDs
> and also changed the RAID level to 6.
> 
> Note, for completeness:
> 
> My raid setup month ago was
> 
> mdadm --create --verbose /dev/md0 -c 256K --level=5 --raid-devices=6  
> /dev/sdd /dev/sdc /dev/sdb /dev/sda /dev/sdg /dev/sdf
> 
> mkfs.xfs -d su=254k,sw=6 -l version=2,su=256k -s size=4k /dev/md0
> 
> mdadm --detail --scan | tee -a /etc/mdadm/mdadm.conf
> 
> update-initramfs -u
> 
> echo '/dev/md0 /mnt/data ext4 defaults,nofail,discard 0 0' | sudo tee -a 
> /etc/fstab
> 
> 
> Today I did:
> 
> mdadm --add /dev/md0 /dev/sdg /dev/sdh
> 
> sudo mdadm --grow /dev/md0 --level=6
> 
> 
> This started a grow process, which I could observe with
> watch -n 1 cat /proc/mdstat
> and md0 remained usable all day.
> Because I needed faster file access, I paused the grow and insertion
> process today at about 50% by issuing
> 
> echo "frozen" > /sys/block/md0/md/sync_action
> 
> 
> After the file access was done, I restarted the
> process with
> 
> echo reshape > /sys/block/md0/md/sync_action
>
After looking into this, I found that this is how the problem
(corrupted data) was triggered in the first place. The kernel log
message "md: cannot handle concurrent replacement and reshape"
is not fatal.

"echo reshape" restarts the whole process from the beginning, whereas the
recorded reshape position should be used. This is a serious kernel bug;
I'll try to fix it soon.

By the way, "echo idle" should avoid this problem.
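
As a rough sketch of the pause/resume sequence discussed here (not a tested
recovery procedure): the helper names and the MD_SYSFS variable below are
made up for illustration, and md0 is simply the array from this thread.
The only sysfs writes used are "frozen" and "idle" to sync_action.

```shell
# Hypothetical helpers. MD_SYSFS defaults to the array from this thread.
MD_SYSFS="${MD_SYSFS:-/sys/block/md0/md}"

pause_reshape() {
    # "frozen" stops the running sync/reshape and keeps md from restarting it.
    echo frozen > "$MD_SYSFS/sync_action"
}

resume_reshape() {
    # Write "idle", not "reshape": on affected kernels "echo reshape"
    # restarts from the beginning instead of the recorded position.
    echo idle > "$MD_SYSFS/sync_action"
}

show_sync_action() {
    # Print whatever action md currently reports for the array.
    cat "$MD_SYSFS/sync_action"
}
```

Usage would be pause_reshape, do the file access, then resume_reshape, with
/proc/mdstat watched afterwards to confirm that progress does not start over.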

Thanks,
Kuai
> 
> but I saw in mdstat that it started from scratch.
> After about 5 min I noticed that the /dev/md0 mount was gone with
> an input/output error in syslog, and I rebooted the computer to see whether the
> kernel would reassemble md0 correctly. Maybe this was a problem,
> because md0 was still reshaping; I do not know.

