* Need help with degraded raid 5
From: William Morgan @ 2020-03-05 0:31 UTC
To: linux-raid
Hello,
I'm working with a 4 disk raid 5. In the past I have experienced a
problem that resulted in the array being set to "inactive", but with
some guidance from the list, I was able to rebuild with no loss of
data. Recently I have a slightly different situation where one disk
was "removed" and marked as "spare", so the array is still active, but
degraded.
I've been monitoring the array, and I got a "Fail event" notification
right after a power blip which showed this mdstat:
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5]
[raid4] [raid10]
md0 : active raid5 sdm1[4](F) sdj1[0] sdk1[1] sdl1[2]
23441679360 blocks super 1.2 level 5, 64k chunk, algorithm 2 [4/3] [UUU_]
bitmap: 0/59 pages [0KB], 65536KB chunk
unused devices: <none>
A little while later I got a "DegradedArray event" notification with
the following mdstat:
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0]
[raid1] [raid10]
md0 : active raid5 sdl1[4] sdj1[1] sdk1[2] sdi1[0]
23441679360 blocks super 1.2 level 5, 64k chunk, algorithm 2 [4/3] [UUU_]
[>....................] recovery = 0.0% (12600/7813893120)
finish=113621.8min speed=1145K/sec
bitmap: 2/59 pages [8KB], 65536KB chunk
unused devices: <none>
which seemed to imply that sdl was being rebuilt, but then I got
another "DegradedArray event" notification with this:
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0]
[raid1] [raid10]
md0 : active raid5 sdl1[4](S) sdj1[1] sdk1[2] sdi1[0]
23441679360 blocks super 1.2 level 5, 64k chunk, algorithm 2 [4/3] [UUU_]
bitmap: 2/59 pages [8KB], 65536KB chunk
unused devices: <none>
I don't think anything is really wrong with the removed disk however.
So assuming I've got backups, what do I need to do to reinsert the
disk and get the array back to a normal state? Or does that disk's
data need to be completely rebuilt? And how do I initiate that?
I'm using the latest mdadm and a very recent kernel. Currently I get this:
bill@bill-desk:~$ sudo mdadm --detail /dev/md0
/dev/md0:
Version : 1.2
Creation Time : Sat Sep 22 19:10:10 2018
Raid Level : raid5
Array Size : 23441679360 (22355.73 GiB 24004.28 GB)
Used Dev Size : 7813893120 (7451.91 GiB 8001.43 GB)
Raid Devices : 4
Total Devices : 4
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Mon Mar 2 17:41:32 2020
State : clean, degraded
Active Devices : 3
Working Devices : 4
Failed Devices : 0
Spare Devices : 1
Layout : left-symmetric
Chunk Size : 64K
Consistency Policy : bitmap
Name : bill-desk:0 (local to host bill-desk)
UUID : 06ad8de5:3a7a15ad:88116f44:fcdee150
Events : 10407
Number Major Minor RaidDevice State
0 8 129 0 active sync /dev/sdi1
1 8 145 1 active sync /dev/sdj1
2 8 161 2 active sync /dev/sdk1
- 0 0 3 removed
4 8 177 - spare /dev/sdl1
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: Need help with degraded raid 5
From: Jinpu Wang @ 2020-03-05 14:53 UTC
To: William Morgan; +Cc: linux-raid
William Morgan <therealbrewer@gmail.com> wrote on Thu, Mar 5, 2020 at 1:33 AM:
>
> Hello,
>
> I'm working with a 4 disk raid 5. In the past I have experienced a
> problem that resulted in the array being set to "inactive", but with
> some guidance from the list, I was able to rebuild with no loss of
> data. Recently I have a slightly different situation where one disk
> was "removed" and marked as "spare", so the array is still active, but
> degraded.
>
> I've been monitoring the array, and I got a "Fail event" notification
> right after a power blip which showed this mdstat:
>
> Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5]
> [raid4] [raid10]
> md0 : active raid5 sdm1[4](F) sdj1[0] sdk1[1] sdl1[2]
> 23441679360 blocks super 1.2 level 5, 64k chunk, algorithm 2 [4/3] [UUU_]
> bitmap: 0/59 pages [0KB], 65536KB chunk
>
> unused devices: <none>
>
> A little while later I got a "DegradedArray event" notification with
> the following mdstat:
>
> Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0]
> [raid1] [raid10]
> md0 : active raid5 sdl1[4] sdj1[1] sdk1[2] sdi1[0]
> 23441679360 blocks super 1.2 level 5, 64k chunk, algorithm 2 [4/3] [UUU_]
> [>....................] recovery = 0.0% (12600/7813893120)
> finish=113621.8min speed=1145K/sec
> bitmap: 2/59 pages [8KB], 65536KB chunk
>
> unused devices: <none>
>
> which seemed to imply that sdl was being rebuilt, but then I got
> another "DegradedArray event" notification with this:
>
> Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0]
> [raid1] [raid10]
> md0 : active raid5 sdl1[4](S) sdj1[1] sdk1[2] sdi1[0]
> 23441679360 blocks super 1.2 level 5, 64k chunk, algorithm 2 [4/3] [UUU_]
> bitmap: 2/59 pages [8KB], 65536KB chunk
>
> unused devices: <none>
>
>
> I don't think anything is really wrong with the removed disk however.
> So assuming I've got backups, what do I need to do to reinsert the
> disk and get the array back to a normal state? Or does that disk's
> data need to be completely rebuilt? And how do I initiate that?
>
> I'm using the latest mdadm and a very recent kernel. Currently I get this:
>
> bill@bill-desk:~$ sudo mdadm --detail /dev/md0
> /dev/md0:
> Version : 1.2
> Creation Time : Sat Sep 22 19:10:10 2018
> Raid Level : raid5
> Array Size : 23441679360 (22355.73 GiB 24004.28 GB)
> Used Dev Size : 7813893120 (7451.91 GiB 8001.43 GB)
> Raid Devices : 4
> Total Devices : 4
> Persistence : Superblock is persistent
>
> Intent Bitmap : Internal
>
> Update Time : Mon Mar 2 17:41:32 2020
> State : clean, degraded
> Active Devices : 3
> Working Devices : 4
> Failed Devices : 0
> Spare Devices : 1
>
> Layout : left-symmetric
> Chunk Size : 64K
>
> Consistency Policy : bitmap
>
> Name : bill-desk:0 (local to host bill-desk)
> UUID : 06ad8de5:3a7a15ad:88116f44:fcdee150
> Events : 10407
>
> Number Major Minor RaidDevice State
> 0 8 129 0 active sync /dev/sdi1
> 1 8 145 1 active sync /dev/sdj1
> 2 8 161 2 active sync /dev/sdk1
> - 0 0 3 removed
>
> 4 8 177 - spare /dev/sdl1
"mdadm /dev/md0 -a /dev/sdl1" should work for you to add the disk
back to array, maybe you can check first with "mdadm -E /dev/sdl1" to
make sure.
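A minimal sketch of that sanity check, comparing the Array UUID from
"mdadm -E" on the member against the array's UUID from "mdadm --detail".
The UUID literal is copied from the outputs quoted elsewhere in this
thread, and the printf stands in for the real mdadm output, so the
sketch runs anywhere:

```shell
# Sketch: the member's Array UUID must match the array's UUID before
# adding it back. The printf below stands in for `mdadm -E /dev/sdl1`.
detail_uuid="06ad8de5:3a7a15ad:88116f44:fcdee150"   # from `mdadm --detail /dev/md0`
member_uuid=$(printf '%s\n' "Array UUID : 06ad8de5:3a7a15ad:88116f44:fcdee150" \
  | awk -F' : ' '/Array UUID/ {print $2}')
if [ "$member_uuid" = "$detail_uuid" ]; then
  echo "UUIDs match; safe to add the disk back"
fi
```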
Regards,
Jack Wang
* Re: Need help with degraded raid 5
From: Wols Lists @ 2020-03-05 17:22 UTC
To: Jinpu Wang, William Morgan; +Cc: linux-raid
On 05/03/20 14:53, Jinpu Wang wrote:
> "mdadm /dev/md0 -a /dev/sdl1" should work for you to add the disk
> back to the array; maybe you can check first with "mdadm -E /dev/sdl1"
> to make sure.
Or better, --re-add or whatever the option is. If it can find the
relevant data in the superblock, like bitmap or journal or whatever, it
will just bring the disk up-to-date. If it can't, it functions just like
add, so you've lost nothing but might gain a lot.
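Whether --re-add can do a quick bitmap-based catch-up rather than a
full rebuild can be judged by comparing event counters. A small sketch,
with the counts hard-coded from the mdadm outputs quoted in this thread
(10407 from --detail on the array, 10749 from -E on the member) instead
of parsed live:

```shell
# Sketch: compare event counters before choosing --re-add vs plain -a.
array_events=10407    # "Events :" from `mdadm --detail /dev/md0`
member_events=10749   # "Events :" from `mdadm -E /dev/sdl1`
if [ "$member_events" -ge "$array_events" ]; then
  echo "member is not behind the array"
else
  echo "member is behind by $((array_events - member_events)) events"
fi
```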
Cheers,
Wol
* Re: Need help with degraded raid 5
From: William Morgan @ 2020-03-06 21:33 UTC
To: Wols Lists; +Cc: Jinpu Wang, linux-raid
On Thu, Mar 5, 2020 at 11:22 AM Wols Lists <antlists@youngman.org.uk> wrote:
>
> On 05/03/20 14:53, Jinpu Wang wrote:
> > "mdadm /dev/md0 -a /dev/sdl1" should work for you to add the disk
> > back to the array; maybe you can check first with "mdadm -E /dev/sdl1"
> > to make sure.
>
> Or better, --re-add or whatever the option is. If it can find the
> relevant data in the superblock, like bitmap or journal or whatever, it
> will just bring the disk up-to-date. If it can't, it functions just like
> add, so you've lost nothing but might gain a lot.
>
> Cheers,
> Wol
I tried re-add and I get the following error:
bill@bill-desk:~$ sudo mdadm /dev/md0 --re-add /dev/sdl1
mdadm: Cannot open /dev/sdl1: Device or resource busy
sdl is not mounted, and it doesn't seem to be a device mapper issue:
bill@bill-desk:~$ sudo dmsetup table
No devices found
Here is the current state of sdl:
bill@bill-desk:~$ sudo mdadm -E /dev/sdl1
/dev/sdl1:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x9
Array UUID : 06ad8de5:3a7a15ad:88116f44:fcdee150
Name : bill-desk:0 (local to host bill-desk)
Creation Time : Sat Sep 22 19:10:10 2018
Raid Level : raid5
Raid Devices : 4
Avail Dev Size : 15627786240 (7451.91 GiB 8001.43 GB)
Array Size : 23441679360 (22355.73 GiB 24004.28 GB)
Data Offset : 264192 sectors
Super Offset : 8 sectors
Unused Space : before=264112 sectors, after=0 sectors
State : clean
Device UUID : 8c628aed:802a5dc8:9d8a8910:9794ec02
Internal Bitmap : 8 sectors from superblock
Update Time : Mon Mar 2 17:41:32 2020
Bad Block Log : 512 entries available at offset 40 sectors - bad
blocks present.
Checksum : 7b89f1e6 - correct
Events : 10749
Layout : left-symmetric
Chunk Size : 64K
Device Role : spare
Array State : AAA. ('A' == active, '.' == missing, 'R' == replacing)
What am I overlooking?
* Re: Need help with degraded raid 5
From: David C. Rankin @ 2020-03-06 22:55 UTC
To: mdraid
On 03/06/2020 03:33 PM, William Morgan wrote:
> I tried re-add and I get the following error:
>
> bill@bill-desk:~$ sudo mdadm /dev/md0 --re-add /dev/sdl1
> mdadm: Cannot open /dev/sdl1: Device or resource busy
>
> sdl is not mounted, and it doesn't seem to be a device mapper issue:
cat /proc/mdstat and/or cat /proc/partitions
and see if the disk was brought up as an array of its own (something like
/dev/md127, etc.). If so, simply
sudo mdadm --stop /dev/md127
Then try your re-add again. I recently had that occur when I put in a
replacement disk for a raid1 array. Even though I had just cut the plastic
anti-static bag off the brand-new drive, when I booted the system it came
up as an array (of what, I don't know). I got the same "device busy" error
and simply used --stop on the obviously not-an-array array. The --re-add
worked just fine afterwards.
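That check can be sketched as a short pipeline. The mdstat text below
is a hypothetical example of a stray auto-assembled /dev/md127 holding
sdl1, inlined so the sketch is self-contained; on a real system you
would read /proc/mdstat directly:

```shell
# Sketch: find an array other than md0 that claims sdl1 (the here-string
# below is hypothetical sample text standing in for /proc/mdstat).
mdstat='md127 : inactive sdl1[4](S)
md0 : active raid5 sdj1[1] sdk1[2] sdi1[0]'
stray=$(printf '%s\n' "$mdstat" | awk '/^md/ && /sdl1/ && !/^md0 / {print $1}')
if [ -n "$stray" ]; then
  echo "stop /dev/$stray before re-adding"
fi
```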
--
David C. Rankin, J.D.,P.E.
* Re: Need help with degraded raid 5
From: Jack Wang @ 2020-03-09 8:39 UTC
To: William Morgan; +Cc: Wols Lists, Jinpu Wang, linux-raid
William Morgan <therealbrewer@gmail.com> wrote on Fri, Mar 6, 2020 at 10:35 PM:
>
> On Thu, Mar 5, 2020 at 11:22 AM Wols Lists <antlists@youngman.org.uk> wrote:
> >
> > On 05/03/20 14:53, Jinpu Wang wrote:
> > > "mdadm /dev/md0 -a /dev/sdl1" should work for you to add the disk
> > > back to the array; maybe you can check first with "mdadm -E /dev/sdl1"
> > > to make sure.
> >
> > Or better, --re-add or whatever the option is. If it can find the
> > relevant data in the superblock, like bitmap or journal or whatever, it
> > will just bring the disk up-to-date. If it can't, it functions just like
> > add, so you've lost nothing but might gain a lot.
> >
> > Cheers,
> > Wol
>
> I tried re-add and I get the following error:
>
> bill@bill-desk:~$ sudo mdadm /dev/md0 --re-add /dev/sdl1
> mdadm: Cannot open /dev/sdl1: Device or resource busy
>
> sdl is not mounted, and it doesn't seem to be a device mapper issue:
>
> bill@bill-desk:~$ sudo dmsetup table
> No devices found
This is strange.
Have you checked whether any other process is using sdl1?
"sudo lsof /dev/sdl1"
>
> Here is the current state of sdl:
>
> bill@bill-desk:~$ sudo mdadm -E /dev/sdl1
> /dev/sdl1:
> Magic : a92b4efc
> Version : 1.2
> Feature Map : 0x9
> Array UUID : 06ad8de5:3a7a15ad:88116f44:fcdee150
> Name : bill-desk:0 (local to host bill-desk)
> Creation Time : Sat Sep 22 19:10:10 2018
> Raid Level : raid5
> Raid Devices : 4
>
> Avail Dev Size : 15627786240 (7451.91 GiB 8001.43 GB)
> Array Size : 23441679360 (22355.73 GiB 24004.28 GB)
> Data Offset : 264192 sectors
> Super Offset : 8 sectors
> Unused Space : before=264112 sectors, after=0 sectors
> State : clean
> Device UUID : 8c628aed:802a5dc8:9d8a8910:9794ec02
>
> Internal Bitmap : 8 sectors from superblock
> Update Time : Mon Mar 2 17:41:32 2020
> Bad Block Log : 512 entries available at offset 40 sectors - bad
> blocks present.
> Checksum : 7b89f1e6 - correct
> Events : 10749
>
> Layout : left-symmetric
> Chunk Size : 64K
>
> Device Role : spare
> Array State : AAA. ('A' == active, '.' == missing, 'R' == replacing)
>
The metadata looks fine.
If nothing holds the disk, a last resort would be to zero out the
metadata and add the disk back as a fresh device. But first try David's
suggestion: stop the stray array, then try the re-add again.
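A guarded sketch of that last-resort sequence. The DRY_RUN wrapper is an
illustrative safety device of this sketch, not an mdadm feature: it only
prints the commands until DRY_RUN is deliberately set to 0, since
--zero-superblock is destructive and forces a full rebuild:

```shell
# Sketch (destructive only when DRY_RUN=0): wipe stale metadata and add
# the disk back as a brand-new member. Keep DRY_RUN=1 until certain.
DRY_RUN=1
run() { if [ "$DRY_RUN" = 1 ]; then echo "would run: $*"; else "$@"; fi; }
run lsof /dev/sdl1                       # confirm nothing holds the device
run mdadm --zero-superblock /dev/sdl1    # destroy the old member metadata
run mdadm /dev/md0 -a /dev/sdl1          # add as a fresh disk; full rebuild follows
```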
end of thread
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
2020-03-05 0:31 Need help with degraded raid 5 William Morgan
2020-03-05 14:53 ` Jinpu Wang
2020-03-05 17:22 ` Wols Lists
2020-03-06 21:33 ` William Morgan
2020-03-06 22:55 ` David C. Rankin
2020-03-09 8:39 ` Jack Wang