* Requesting help recovering my array
       [not found] <432300551.863689.1705953121879.ref@mail.yahoo.com>
@ 2024-01-22 19:52 ` RJ Marquette
  2024-01-22 21:39   ` Reindl Harald
  0 siblings, 1 reply; 40+ messages in thread
From: RJ Marquette @ 2024-01-22 19:52 UTC (permalink / raw)
  To: linux-raid

Hi, all.  I have a Raid5 array with 5 disks in use and a 6th in reserve that I built using 3TB drives in 2019.  It has been running fine since, not even a single drive failure.  The system also has a 7th hard drive for OS, home directory, etc.  The motherboard had four SATA ports, so I added an adapter card that has 4 more ports, with three drives connected to it.  The server runs Debian that I keep relatively current.

Yesterday, I swapped a newer motherboard into the computer (upgraded my desktop and moved the guts to my server).  I never disconnected the cables from the adapter card (whew, I think), so I know which four drives were connected to the motherboard.  Unfortunately I didn't really note how they were hooked to the motherboard (SATA1-4 ports).  Didn't even think it would be an issue.  I'm reasonably confident the array drives on the motherboard were sda-sdc, but I'm not certain.

Now I can't get the array to come up.  I'm reasonably certain I haven't done anything to write to the drives - but mdadm will not assemble the drives (I have not tried to force it).  I'm not entirely sure what's up and would really appreciate any help. 

I've tried various incantations of mdadm --assemble --scan, with no luck.  I've seen the posts about certain motherboards that can mess up the drives, and I'm hoping I'm not in that boat.  The "new" motherboard is an Asus Z96-K/CSM.

I assume using --force is in my future...I see various pages that say use --force then check it, but will that damage it if I'm wrong?  If not, how will I know it's correct?  Is the order of drives important with --force?  I see conflicting info on that.
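
The pattern I keep seeing described is roughly the following - just a sketch of what those pages seem to suggest, assuming the filesystem sits directly on /dev/md0, and I have not run any of it yet:

mdadm --assemble --force --readonly /dev/md0 /dev/sda /dev/sdb /dev/sdc /dev/sde /dev/sdf /dev/sdg
cat /proc/mdstat
fsck -n /dev/md0     # read-only check, assuming the filesystem is ext4 or similar
mount -o ro /dev/md0 /mnt

Is that the right general idea?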

I'm no expert but it looks like each drive has the mdadm superblock...so I'm not sure why it won't assemble.  Please help!

Thanks in advance.
--RJ

root@jackie:~# uname -a 
Linux jackie 5.10.0-27-amd64 #1 SMP Debian 5.10.205-2 (2023-12-31) x86_64 GNU/Linux 

root@jackie:~# mdadm --version 
mdadm - v4.1 - 2018-10-01

root@jackie:~# mdadm --examine /dev/sda 
/dev/sda:   MBR Magic : aa55 
Partition[0] :   4294967295 sectors at            1 (type ee) 

root@jackie:~# mdadm --examine /dev/sda1 
mdadm: No md superblock detected on /dev/sda1. 

root@jackie:~# mdadm --examine /dev/sdb 
/dev/sdb:   MBR Magic : aa55 
Partition[0] :   4294967295 sectors at            1 (type ee) 

root@jackie:~# mdadm --examine /dev/sdb1 
mdadm: No md superblock detected on /dev/sdb1. 

root@jackie:~# mdadm --examine /dev/sdc 
/dev/sdc:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : 74a11272:9b233a5b:2506f763:27693ccc
           Name : jackie:0  (local to host jackie)
  Creation Time : Sat Dec  8 19:32:07 2018
     Raid Level : raid5
   Raid Devices : 5

 Avail Dev Size : 5860271024 (2794.39 GiB 3000.46 GB)
     Array Size : 11720540160 (11177.58 GiB 12001.83 GB)
  Used Dev Size : 5860270080 (2794.39 GiB 3000.46 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
   Unused Space : before=261864 sectors, after=944 sectors
          State : clean
    Device UUID : a2b677bb:4004d8fb:a298a923:bab4df8a

    Update Time : Fri Jan 19 15:25:37 2024
  Bad Block Log : 512 entries available at offset 264 sectors
       Checksum : 2487f053 - correct
         Events : 5958

         Layout : left-symmetric
     Chunk Size : 512K

   Device Role : spare
   Array State : AAAAA ('A' == active, '.' == missing, 'R' == replacing)

root@jackie:~# mdadm --examine /dev/sdc1 
mdadm: cannot open /dev/sdc1: No such file or directory 

root@jackie:~# mdadm --examine /dev/sde 
/dev/sde:   MBR Magic : aa55 
Partition[0] :   4294967295 sectors at            1 (type ee) 

root@jackie:~# mdadm --examine /dev/sde1 
mdadm: No md superblock detected on /dev/sde1. 

root@jackie:~# mdadm --examine /dev/sdf 
/dev/sdf:   MBR Magic : aa55 
Partition[0] :   4294967295 sectors at            1 (type ee) 

root@jackie:~# mdadm --examine /dev/sdf1 
mdadm: No md superblock detected on /dev/sdf1. 

root@jackie:~# mdadm --examine /dev/sdg 
/dev/sdg:   MBR Magic : aa55 
Partition[0] :   4294967295 sectors at            1 (type ee) 

root@jackie:~# mdadm --examine /dev/sdg1 
mdadm: No md superblock detected on /dev/sdg1.

root@jackie:~# lsdrv  
PCI [ahci] 00:1f.2 SATA controller: Intel Corporation 9 Series Chipset Family SATA Controller [AHCI Mode] 
├scsi 0:0:0:0 ATA      ST3000VN007-2E41 {Z7317D1A} 
│└sda 2.73t [8:0] Partitioned (gpt) 
│ └sda1 2.73t [8:1] Empty/Unknown 
├scsi 1:0:0:0 ATA      Hitachi HUS72403 {P8GSA1WR} 
│└sdb 2.73t [8:16] Partitioned (gpt) 
│ └sdb1 2.73t [8:17] Empty/Unknown 
├scsi 2:0:0:0 ATA      Hitachi HUA72303 {MK0371YVGSZ9RA} 
│└sdc 2.73t [8:32] MD raid5 (5) inactive 'jackie:0' {74a11272-9b23-3a5b-2506-f76327693ccc} 
└scsi 3:0:0:0 ATA      ST32000542AS     {5XW110LY} 
└sdd 1.82t [8:48] Partitioned (dos)  
├sdd1 23.28g [8:49] Partitioned (dos) {d94cc2c8-037a-49c5-8a1e-01bb47d78624}  
│└Mounted as /dev/sdd1 @ /  
├sdd2 1.00k [8:50] Partitioned (dos)  
├sdd5 9.31g [8:53] ext4 {6eb3b4d0-8c7f-4b06-a431-4c292d5bda86}  
│└Mounted as /dev/sdd5 @ /var  
├sdd6 3.96g [8:54] swap {901cd56d-ef11-4866-824b-d9ec4ae6fe6e}  
├sdd7 1.86g [8:55] ext4 {69ba0889-322b-4fc8-b9d3-a2d133c97e5e}  
│└Mounted as /dev/sdd7 @ /tmp  
└sdd8 1.78t [8:56] ext4 {4ed408d4-6b22-46e0-baed-2e0589ff41fb}   
└Mounted as /dev/sdd8 @ /home

PCI [ahci] 06:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9215 PCIe 2.0 x1 4-port SATA 6 Gb/s Controller (rev 11) 
├scsi 6:0:0:0 ATA      Hitachi HUS72403 {P8G84LEP} 
│└sde 2.73t [8:64] Partitioned (gpt) 
│ └sde1 2.73t [8:65] Empty/Unknown 
├scsi 7:0:0:0 ATA      ST3000VN007-2E41 {Z7317D46} 
│└sdf 2.73t [8:80] Partitioned (gpt) 
│ └sdf1 2.73t [8:81] Empty/Unknown 
└scsi 8:0:0:0 ATA      ST3000VN007-2E41 {Z7317JTX} 
└sdg 2.73t [8:96] Partitioned (gpt)
└sdg1 2.73t [8:97] Empty/Unknown

root@jackie:~# cat /etc/mdadm/mdadm.conf  
 # This configuration was auto-generated on Wed, 27 Nov 2019 15:53:23 -0500 by mkconf 
ARRAY /dev/md0 metadata=1.2 spares=1 name=jackie:0 UUID=74a11272:9b233a5b:2506f763:27693ccc

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Requesting help recovering my array
  2024-01-22 19:52 ` Requesting help recovering my array RJ Marquette
@ 2024-01-22 21:39   ` Reindl Harald
  2024-01-22 22:13     ` RJ Marquette
  0 siblings, 1 reply; 40+ messages in thread
From: Reindl Harald @ 2024-01-22 21:39 UTC (permalink / raw)
  To: RJ Marquette, linux-raid

a ton of "mdadm --examine" outputs but i can't see a "cat /proc/mdstat"

/dev/sdX is completely irrelevant when it comes to raid - you can even 
connect a random disk via USB adapter without a change from the view of 
the array
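
the superblock uuid is what identifies a member, not the device name - 
something like this (just a sketch) lists the members no matter which 
port they sit on:

mdadm --examine --scan
blkid -t TYPE=linux_raid_member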

On 22.01.24 at 20:52, RJ Marquette wrote:
> Hi, all.  I have a Raid5 array with 5 disks in use and a 6th in reserve that I built using 3TB drives in 2019.  It has been running fine since, not even a single drive failure.  The system also has a 7th hard drive for OS, home directory, etc.  The motherboard had four SATA ports, so I added an adapter card that has 4 more ports, with three drives connected to it.  The server runs Debian that I keep relatively current.
> 
> Yesterday, I swapped a newer motherboard into the computer (upgraded my desktop and moved the guts to my server).  I never disconnected the cables from the adapter card (whew, I think), so I know which four drives were connected to the motherboard.  Unfortunately I didn't really note how they were hooked to the motherboard (SATA1-4 ports).  Didn't even think it would be an issue.  I'm reasonably confident the array drives on the motherboard were sda-sdc, but I'm not certain.
> 
> Now I can't get the array to come up.  I'm reasonably certain I haven't done anything to write to the drives - but mdadm will not assemble the drives (I have not tried to force it).  I'm not entirely sure what's up and would really appreciate any help.
> 
> I've tried various incantations of mdadm --assemble --scan, with no luck.  I've seen the posts about certain motherboards that can mess up the drives, and I'm hoping I'm not in that boat.  The "new" motherboard is a Asus Z96-K/CSM.
> 
> I assume using --force is in my future...I see various pages that say use --force then check it, but will that damage it if I'm wrong?  If not, how will I know it's correct?  Is the order of drives important with --force?  I see conflicting info on that.
> 
> I'm no expert but it looks like each drive has the mdadm superblock...so I'm not sure why it won't assemble.  Please help!
> 
> Thanks in advance.
> --RJ
> 
> root@jackie:~# uname -a
> Linux jackie 5.10.0-27-amd64 #1 SMP Debian 5.10.205-2 (2023-12-31) x86_64 GNU/Linux
> 
> root@jackie:~# mdadm --version
> mdadm - v4.1 - 2018-10-01
> 
> root@jackie:~# mdadm --examine /dev/sda
> /dev/sda:   MBR Magic : aa55
> Partition[0] :   4294967295 sectors at            1 (type ee)
> 
> root@jackie:~# mdadm --examine /dev/sda1
> mdadm: No md superblock detected on /dev/sda1.
> 
> root@jackie:~# mdadm --examine /dev/sdb
> /dev/sdb:   MBR Magic : aa55
> Partition[0] :   4294967295 sectors at            1 (type ee)
> 
> root@jackie:~# mdadm --examine /dev/sdb1
> mdadm: No md superblock detected on /dev/sdb1.
> 
> root@jackie:~# mdadm --examine /dev/sdc
> /dev/sdc:          Magic : a92b4efc        Version : 1.2
> Feature Map : 0x0
> Array UUID : 74a11272:9b233a5b:2506f763:27693ccc
> Name : jackie:0  (local to host jackie)
> Creation Time : Sat Dec  8 19:32:07 2018
> Raid Level : raid5
> Raid Devices : 5 Avail
> Dev Size : 5860271024 (2794.39 GiB 3000.46 GB)
> Array Size : 11720540160 (11177.58 GiB 12001.83 GB)
> Used Dev Size : 5860270080 (2794.39 GiB 3000.46 GB)
> Data Offset : 262144 sectors
> Super Offset : 8 sectors
> Unused Space : before=261864 sectors, after=944 sectors
> State : clean
> Device UUID : a2b677bb:4004d8fb:a298a923:bab4df8a
> Update Time : Fri Jan 19 15:25:37 2024
> Bad Block Log : 512 entries available at offset 264 sectors
> Checksum : 2487f053 - correct
> Events : 5958
> Layout : left-symmetric
> Chunk Size : 512K
> Device Role : spare
> Array State : AAAAA ('A' == active, '.' == missing, 'R' == replacing)
> 
> root@jackie:~# mdadm --examine /dev/sdc1
> mdadm: cannot open /dev/sdc1: No such file or directory
> 
> root@jackie:~# mdadm --examine /dev/sde
> /dev/sde:   MBR Magic : aa55
> Partition[0] :   4294967295 sectors at            1 (type ee)
> 
> root@jackie:~# mdadm --examine /dev/sde1
> mdadm: No md superblock detected on /dev/sde1.
> 
> root@jackie:~# mdadm --examine /dev/sdf
> /dev/sdf:   MBR Magic : aa55
> Partition[0] :   4294967295 sectors at            1 (type ee)
> 
> root@jackie:~# mdadm --examine /dev/sdf1
> mdadm: No md superblock detected on /dev/sdf1.
> 
> root@jackie:~# mdadm --examine /dev/sdg
> /dev/sdg:   MBR Magic : aa55
> Partition[0] :   4294967295 sectors at            1 (type ee)
> 
> root@jackie:~# mdadm --examine /dev/sdg1
> mdadm: No md superblock detected on /dev/sdg1.
> 
> root@jackie:~# lsdrv
> PCI [ahci] 00:1f.2 SATA controller: Intel Corporation 9 Series Chipset Family SATA Controller [AHCI Mode]
> ├scsi 0:0:0:0 ATA      ST3000VN007-2E41 {Z7317D1A}
> │└sda 2.73t [8:0] Partitioned (gpt)
> │ └sda1 2.73t [8:1] Empty/Unknown
> ├scsi 1:0:0:0 ATA      Hitachi HUS72403 {P8GSA1WR}
> │└sdb 2.73t [8:16] Partitioned (gpt)
> │ └sdb1 2.73t [8:17] Empty/Unknown
> ├scsi 2:0:0:0 ATA      Hitachi HUA72303 {MK0371YVGSZ9RA}
> │└sdc 2.73t [8:32] MD raid5 (5) inactive 'jackie:0' {74a11272-9b23-3a5b-2506-f76327693ccc}
> └scsi 3:0:0:0 ATA      ST32000542AS     {5XW110LY}
> └sdd 1.82t [8:48] Partitioned (dos)
> ├sdd1 23.28g [8:49] Partitioned (dos) {d94cc2c8-037a-49c5-8a1e-01bb47d78624}
> │└Mounted as /dev/sdd1 @ /
> ├sdd2 1.00k [8:50] Partitioned (dos)
> ├sdd5 9.31g [8:53] ext4 {6eb3b4d0-8c7f-4b06-a431-4c292d5bda86}
> │└Mounted as /dev/sdd5 @ /var
> ├sdd6 3.96g [8:54] swap {901cd56d-ef11-4866-824b-d9ec4ae6fe6e}
> ├sdd7 1.86g [8:55] ext4 {69ba0889-322b-4fc8-b9d3-a2d133c97e5e}
> │└Mounted as /dev/sdd7 @ /tmp
> └sdd8 1.78t [8:56] ext4 {4ed408d4-6b22-46e0-baed-2e0589ff41fb}
> └Mounted as /dev/sdd8 @ /home PCI [ahci]
> 
> 06:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9215 PCIe 2.0 x1 4-port SATA 6 Gb/s Controller (rev 11)
> ├scsi 6:0:0:0 ATA      Hitachi HUS72403 {P8G84LEP}
> │└sde 2.73t [8:64] Partitioned (gpt)
> │ └sde1 2.73t [8:65] Empty/Unknown
> ├scsi 7:0:0:0 ATA      ST3000VN007-2E41 {Z7317D46}
> │└sdf 2.73t [8:80] Partitioned (gpt)
> │ └sdf1 2.73t [8:81] Empty/Unknown
> └scsi 8:0:0:0 ATA      ST3000VN007-2E41 {Z7317JTX}
> └sdg 2.73t [8:96] Partitioned (gpt)
> └sdg1 2.73t [8:97] Empty/Unknown
> 
> root@jackie:~# cat /etc/mdadm/mdadm.conf
>   # This configuration was auto-generated on Wed, 27 Nov 2019 15:53:23 -0500 by mkconf
> ARRAY /dev/md0 metadata=1.2 spares=1 name=jackie:0 UUID=74a11272:9b233a5b:2506f763:27693cccr


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Requesting help recovering my array
  2024-01-22 21:39   ` Reindl Harald
@ 2024-01-22 22:13     ` RJ Marquette
  2024-01-22 23:49       ` Reindl Harald
  0 siblings, 1 reply; 40+ messages in thread
From: RJ Marquette @ 2024-01-22 22:13 UTC (permalink / raw)
  To: linux-raid, Reindl Harald

Sorry! 

rj@jackie:~$ cat /proc/mdstat 
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] 
unused devices: <none>


Thanks.
--RJ

On Monday, January 22, 2024 at 04:55:50 PM EST, Reindl Harald <h.reindl@thelounge.net> wrote: 

a ton of "mdadm --examine" outputs but i can't see a "cat /proc/mdstat"

/dev/sdX is completly irrelevant when it comes to raid - you can even 
connect a random disk via USB adapter without a change from the view of 
the array

Am 22.01.24 um 20:52 schrieb RJ Marquette:
> Hi, all.  I have a Raid5 array with 5 disks in use and a 6th in reserve that I built using 3TB drives in 2019.  It has been running fine since, not even a single drive failure.  The system also has a 7th hard drive for OS, home directory, etc.  The motherboard had four SATA ports, so I added an adapter card that has 4 more ports, with three drives connected to it.  The server runs Debian that I keep relatively current.
> 
> Yesterday, I swapped a newer motherboard into the computer (upgraded my desktop and moved the guts to my server).  I never disconnected the cables from the adapter card (whew, I think), so I know which four drives were connected to the motherboard.  Unfortunately I didn't really note how they were hooked to the motherboard (SATA1-4 ports).  Didn't even think it would be an issue.  I'm reasonably confident the array drives on the motherboard were sda-sdc, but I'm not certain.
> 
> Now I can't get the array to come up.  I'm reasonably certain I haven't done anything to write to the drives - but mdadm will not assemble the drives (I have not tried to force it).  I'm not entirely sure what's up and would really appreciate any help.
> 
> I've tried various incantations of mdadm --assemble --scan, with no luck.  I've seen the posts about certain motherboards that can mess up the drives, and I'm hoping I'm not in that boat.  The "new" motherboard is a Asus Z96-K/CSM.
> 
> I assume using --force is in my future...I see various pages that say use --force then check it, but will that damage it if I'm wrong?  If not, how will I know it's correct?  Is the order of drives important with --force?  I see conflicting info on that.
> 
> I'm no expert but it looks like each drive has the mdadm superblock...so I'm not sure why it won't assemble.  Please help!
> 
> Thanks in advance.
> --RJ
> 
> root@jackie:~# uname -a
> Linux jackie 5.10.0-27-amd64 #1 SMP Debian 5.10.205-2 (2023-12-31) x86_64 GNU/Linux
> 
> root@jackie:~# mdadm --version
> mdadm - v4.1 - 2018-10-01
> 
> root@jackie:~# mdadm --examine /dev/sda
> /dev/sda:   MBR Magic : aa55
> Partition[0] :   4294967295 sectors at            1 (type ee)
> 
> root@jackie:~# mdadm --examine /dev/sda1
> mdadm: No md superblock detected on /dev/sda1.
> 
> root@jackie:~# mdadm --examine /dev/sdb
> /dev/sdb:   MBR Magic : aa55
> Partition[0] :   4294967295 sectors at            1 (type ee)
> 
> root@jackie:~# mdadm --examine /dev/sdb1
> mdadm: No md superblock detected on /dev/sdb1.
> 
> root@jackie:~# mdadm --examine /dev/sdc
> /dev/sdc:          Magic : a92b4efc        Version : 1.2
> Feature Map : 0x0
> Array UUID : 74a11272:9b233a5b:2506f763:27693ccc
> Name : jackie:0  (local to host jackie)
> Creation Time : Sat Dec  8 19:32:07 2018
> Raid Level : raid5
> Raid Devices : 5 Avail
> Dev Size : 5860271024 (2794.39 GiB 3000.46 GB)
> Array Size : 11720540160 (11177.58 GiB 12001.83 GB)
> Used Dev Size : 5860270080 (2794.39 GiB 3000.46 GB)
> Data Offset : 262144 sectors
> Super Offset : 8 sectors
> Unused Space : before=261864 sectors, after=944 sectors
> State : clean
> Device UUID : a2b677bb:4004d8fb:a298a923:bab4df8a
> Update Time : Fri Jan 19 15:25:37 2024
> Bad Block Log : 512 entries available at offset 264 sectors
> Checksum : 2487f053 - correct
> Events : 5958
> Layout : left-symmetric
> Chunk Size : 512K
> Device Role : spare
> Array State : AAAAA ('A' == active, '.' == missing, 'R' == replacing)
> 
> root@jackie:~# mdadm --examine /dev/sdc1
> mdadm: cannot open /dev/sdc1: No such file or directory
> 
> root@jackie:~# mdadm --examine /dev/sde
> /dev/sde:   MBR Magic : aa55
> Partition[0] :   4294967295 sectors at            1 (type ee)
> 
> root@jackie:~# mdadm --examine /dev/sde1
> mdadm: No md superblock detected on /dev/sde1.
> 
> root@jackie:~# mdadm --examine /dev/sdf
> /dev/sdf:   MBR Magic : aa55
> Partition[0] :   4294967295 sectors at            1 (type ee)
> 
> root@jackie:~# mdadm --examine /dev/sdf1
> mdadm: No md superblock detected on /dev/sdf1.
> 
> root@jackie:~# mdadm --examine /dev/sdg
> /dev/sdg:   MBR Magic : aa55
> Partition[0] :   4294967295 sectors at            1 (type ee)
> 
> root@jackie:~# mdadm --examine /dev/sdg1
> mdadm: No md superblock detected on /dev/sdg1.
> 
> root@jackie:~# lsdrv
> PCI [ahci] 00:1f.2 SATA controller: Intel Corporation 9 Series Chipset Family SATA Controller [AHCI Mode]
> ├scsi 0:0:0:0 ATA      ST3000VN007-2E41 {Z7317D1A}
> │└sda 2.73t [8:0] Partitioned (gpt)
> │ └sda1 2.73t [8:1] Empty/Unknown
> ├scsi 1:0:0:0 ATA      Hitachi HUS72403 {P8GSA1WR}
> │└sdb 2.73t [8:16] Partitioned (gpt)
> │ └sdb1 2.73t [8:17] Empty/Unknown
> ├scsi 2:0:0:0 ATA      Hitachi HUA72303 {MK0371YVGSZ9RA}
> │└sdc 2.73t [8:32] MD raid5 (5) inactive 'jackie:0' {74a11272-9b23-3a5b-2506-f76327693ccc}
> └scsi 3:0:0:0 ATA      ST32000542AS     {5XW110LY}
> └sdd 1.82t [8:48] Partitioned (dos)
> ├sdd1 23.28g [8:49] Partitioned (dos) {d94cc2c8-037a-49c5-8a1e-01bb47d78624}
> │└Mounted as /dev/sdd1 @ /
> ├sdd2 1.00k [8:50] Partitioned (dos)
> ├sdd5 9.31g [8:53] ext4 {6eb3b4d0-8c7f-4b06-a431-4c292d5bda86}
> │└Mounted as /dev/sdd5 @ /var
> ├sdd6 3.96g [8:54] swap {901cd56d-ef11-4866-824b-d9ec4ae6fe6e}
> ├sdd7 1.86g [8:55] ext4 {69ba0889-322b-4fc8-b9d3-a2d133c97e5e}
> │└Mounted as /dev/sdd7 @ /tmp
> └sdd8 1.78t [8:56] ext4 {4ed408d4-6b22-46e0-baed-2e0589ff41fb}
> └Mounted as /dev/sdd8 @ /home PCI [ahci]
> 
> 06:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9215 PCIe 2.0 x1 4-port SATA 6 Gb/s Controller (rev 11)
> ├scsi 6:0:0:0 ATA      Hitachi HUS72403 {P8G84LEP}
> │└sde 2.73t [8:64] Partitioned (gpt)
> │ └sde1 2.73t [8:65] Empty/Unknown
> ├scsi 7:0:0:0 ATA      ST3000VN007-2E41 {Z7317D46}
> │└sdf 2.73t [8:80] Partitioned (gpt)
> │ └sdf1 2.73t [8:81] Empty/Unknown
> └scsi 8:0:0:0 ATA      ST3000VN007-2E41 {Z7317JTX}
> └sdg 2.73t [8:96] Partitioned (gpt)
> └sdg1 2.73t [8:97] Empty/Unknown
> 
> root@jackie:~# cat /etc/mdadm/mdadm.conf
>   # This configuration was auto-generated on Wed, 27 Nov 2019 15:53:23 -0500 by mkconf
> ARRAY /dev/md0 metadata=1.2 spares=1 name=jackie:0 UUID=74a11272:9b233a5b:2506f763:27693cccr



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Requesting help recovering my array
  2024-01-22 22:13     ` RJ Marquette
@ 2024-01-22 23:49       ` Reindl Harald
  2024-01-23  0:09         ` RJ Marquette
  0 siblings, 1 reply; 40+ messages in thread
From: Reindl Harald @ 2024-01-22 23:49 UTC (permalink / raw)
  To: RJ Marquette, linux-raid



On 22.01.24 at 23:13, RJ Marquette wrote:
> Sorry!
> 
> rj@jackie:~$ cat /proc/mdstat
> Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
> unused devices: <none>

that's all and where is the ton of raid-types coming from with no single 
array shown?

[root@srv-rhsoft:~]$ cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdb2[2] sda2[0]
       30740480 blocks super 1.2 [2/2] [UU]
       bitmap: 0/1 pages [0KB], 65536KB chunk

md1 : active raid1 sda3[0] sdb3[2]
       3875717120 blocks super 1.2 [2/2] [UU]
       bitmap: 5/29 pages [20KB], 65536KB chunk

unused devices: <none>
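
for comparison: the Personalities line only reflects which raid modules 
the kernel has loaded - whether anything is actually assembled is a 
separate question, roughly:

lsmod | grep -E 'raid|md_mod'
mdadm --detail --scan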

> On Monday, January 22, 2024 at 04:55:50 PM EST, Reindl Harald <h.reindl@thelounge.net> wrote:
> 
> a ton of "mdadm --examine" outputs but i can't see a "cat /proc/mdstat"
> 
> /dev/sdX is completly irrelevant when it comes to raid - you can even
> connect a random disk via USB adapter without a change from the view of
> the array
> 
> Am 22.01.24 um 20:52 schrieb RJ Marquette:
>> Hi, all.  I have a Raid5 array with 5 disks in use and a 6th in reserve that I built using 3TB drives in 2019.  It has been running fine since, not even a single drive failure.  The system also has a 7th hard drive for OS, home directory, etc.  The motherboard had four SATA ports, so I added an adapter card that has 4 more ports, with three drives connected to it.  The server runs Debian that I keep relatively current.
>>
>> Yesterday, I swapped a newer motherboard into the computer (upgraded my desktop and moved the guts to my server).  I never disconnected the cables from the adapter card (whew, I think), so I know which four drives were connected to the motherboard.  Unfortunately I didn't really note how they were hooked to the motherboard (SATA1-4 ports).  Didn't even think it would be an issue.  I'm reasonably confident the array drives on the motherboard were sda-sdc, but I'm not certain.
>>
>> Now I can't get the array to come up.  I'm reasonably certain I haven't done anything to write to the drives - but mdadm will not assemble the drives (I have not tried to force it).  I'm not entirely sure what's up and would really appreciate any help.
>>
>> I've tried various incantations of mdadm --assemble --scan, with no luck.  I've seen the posts about certain motherboards that can mess up the drives, and I'm hoping I'm not in that boat.  The "new" motherboard is a Asus Z96-K/CSM.
>>
>> I assume using --force is in my future...I see various pages that say use --force then check it, but will that damage it if I'm wrong?  If not, how will I know it's correct?  Is the order of drives important with --force?  I see conflicting info on that.
>>
>> I'm no expert but it looks like each drive has the mdadm superblock...so I'm not sure why it won't assemble.  Please help!
>>
>> Thanks in advance.
>> --RJ
>>
>> root@jackie:~# uname -a
>> Linux jackie 5.10.0-27-amd64 #1 SMP Debian 5.10.205-2 (2023-12-31) x86_64 GNU/Linux
>>
>> root@jackie:~# mdadm --version
>> mdadm - v4.1 - 2018-10-01
>>
>> root@jackie:~# mdadm --examine /dev/sda
>> /dev/sda:   MBR Magic : aa55
>> Partition[0] :   4294967295 sectors at            1 (type ee)
>>
>> root@jackie:~# mdadm --examine /dev/sda1
>> mdadm: No md superblock detected on /dev/sda1.
>>
>> root@jackie:~# mdadm --examine /dev/sdb
>> /dev/sdb:   MBR Magic : aa55
>> Partition[0] :   4294967295 sectors at            1 (type ee)
>>
>> root@jackie:~# mdadm --examine /dev/sdb1
>> mdadm: No md superblock detected on /dev/sdb1.
>>
>> root@jackie:~# mdadm --examine /dev/sdc
>> /dev/sdc:          Magic : a92b4efc        Version : 1.2
>> Feature Map : 0x0
>> Array UUID : 74a11272:9b233a5b:2506f763:27693ccc
>> Name : jackie:0  (local to host jackie)
>> Creation Time : Sat Dec  8 19:32:07 2018
>> Raid Level : raid5
>> Raid Devices : 5 Avail
>> Dev Size : 5860271024 (2794.39 GiB 3000.46 GB)
>> Array Size : 11720540160 (11177.58 GiB 12001.83 GB)
>> Used Dev Size : 5860270080 (2794.39 GiB 3000.46 GB)
>> Data Offset : 262144 sectors
>> Super Offset : 8 sectors
>> Unused Space : before=261864 sectors, after=944 sectors
>> State : clean
>> Device UUID : a2b677bb:4004d8fb:a298a923:bab4df8a
>> Update Time : Fri Jan 19 15:25:37 2024
>> Bad Block Log : 512 entries available at offset 264 sectors
>> Checksum : 2487f053 - correct
>> Events : 5958
>> Layout : left-symmetric
>> Chunk Size : 512K
>> Device Role : spare
>> Array State : AAAAA ('A' == active, '.' == missing, 'R' == replacing)
>>
>> root@jackie:~# mdadm --examine /dev/sdc1
>> mdadm: cannot open /dev/sdc1: No such file or directory
>>
>> root@jackie:~# mdadm --examine /dev/sde
>> /dev/sde:   MBR Magic : aa55
>> Partition[0] :   4294967295 sectors at            1 (type ee)
>>
>> root@jackie:~# mdadm --examine /dev/sde1
>> mdadm: No md superblock detected on /dev/sde1.
>>
>> root@jackie:~# mdadm --examine /dev/sdf
>> /dev/sdf:   MBR Magic : aa55
>> Partition[0] :   4294967295 sectors at            1 (type ee)
>>
>> root@jackie:~# mdadm --examine /dev/sdf1
>> mdadm: No md superblock detected on /dev/sdf1.
>>
>> root@jackie:~# mdadm --examine /dev/sdg
>> /dev/sdg:   MBR Magic : aa55
>> Partition[0] :   4294967295 sectors at            1 (type ee)
>>
>> root@jackie:~# mdadm --examine /dev/sdg1
>> mdadm: No md superblock detected on /dev/sdg1.
>>
>> root@jackie:~# lsdrv
>> PCI [ahci] 00:1f.2 SATA controller: Intel Corporation 9 Series Chipset Family SATA Controller [AHCI Mode]
>> ├scsi 0:0:0:0 ATA      ST3000VN007-2E41 {Z7317D1A}
>> │└sda 2.73t [8:0] Partitioned (gpt)
>> │ └sda1 2.73t [8:1] Empty/Unknown
>> ├scsi 1:0:0:0 ATA      Hitachi HUS72403 {P8GSA1WR}
>> │└sdb 2.73t [8:16] Partitioned (gpt)
>> │ └sdb1 2.73t [8:17] Empty/Unknown
>> ├scsi 2:0:0:0 ATA      Hitachi HUA72303 {MK0371YVGSZ9RA}
>> │└sdc 2.73t [8:32] MD raid5 (5) inactive 'jackie:0' {74a11272-9b23-3a5b-2506-f76327693ccc}
>> └scsi 3:0:0:0 ATA      ST32000542AS     {5XW110LY}
>> └sdd 1.82t [8:48] Partitioned (dos)
>> ├sdd1 23.28g [8:49] Partitioned (dos) {d94cc2c8-037a-49c5-8a1e-01bb47d78624}
>> │└Mounted as /dev/sdd1 @ /
>> ├sdd2 1.00k [8:50] Partitioned (dos)
>> ├sdd5 9.31g [8:53] ext4 {6eb3b4d0-8c7f-4b06-a431-4c292d5bda86}
>> │└Mounted as /dev/sdd5 @ /var
>> ├sdd6 3.96g [8:54] swap {901cd56d-ef11-4866-824b-d9ec4ae6fe6e}
>> ├sdd7 1.86g [8:55] ext4 {69ba0889-322b-4fc8-b9d3-a2d133c97e5e}
>> │└Mounted as /dev/sdd7 @ /tmp
>> └sdd8 1.78t [8:56] ext4 {4ed408d4-6b22-46e0-baed-2e0589ff41fb}
>> └Mounted as /dev/sdd8 @ /home PCI [ahci]
>>
>> 06:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9215 PCIe 2.0 x1 4-port SATA 6 Gb/s Controller (rev 11)
>> ├scsi 6:0:0:0 ATA      Hitachi HUS72403 {P8G84LEP}
>> │└sde 2.73t [8:64] Partitioned (gpt)
>> │ └sde1 2.73t [8:65] Empty/Unknown
>> ├scsi 7:0:0:0 ATA      ST3000VN007-2E41 {Z7317D46}
>> │└sdf 2.73t [8:80] Partitioned (gpt)
>> │ └sdf1 2.73t [8:81] Empty/Unknown
>> └scsi 8:0:0:0 ATA      ST3000VN007-2E41 {Z7317JTX}
>> └sdg 2.73t [8:96] Partitioned (gpt)
>> └sdg1 2.73t [8:97] Empty/Unknown
>>
>> root@jackie:~# cat /etc/mdadm/mdadm.conf
>>     # This configuration was auto-generated on Wed, 27 Nov 2019 15:53:23 -0500 by mkconf
>> ARRAY /dev/md0 metadata=1.2 spares=1 name=jackie:0 UUID=74a11272:9b233a5b:2506f763:27693cccr

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Requesting help recovering my array
  2024-01-22 23:49       ` Reindl Harald
@ 2024-01-23  0:09         ` RJ Marquette
  2024-01-23  1:52           ` RJ Marquette
  0 siblings, 1 reply; 40+ messages in thread
From: RJ Marquette @ 2024-01-23  0:09 UTC (permalink / raw)
  To: linux-raid, Reindl Harald

That's all.  

If I run:

root@jackie:~# mdadm --assemble --scan
mdadm: /dev/md0 assembled from 0 drives and 1 spare - not enough to start the array.

root@jackie:~# cat /proc/mdstat  
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]  
unused devices: <none>

root@jackie:~# ls -l /dev/md*
ls: cannot access '/dev/md*': No such file or directory

It seems to be recognizing the spare drive, but not the 5 that actually have data, for some reason.
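
Based on the --examine output in my first message, only /dev/sdc still reports an md superblock, and it calls itself a spare.  A quick way to see that at a glance (just a sketch - it comes back empty for the other five):

for d in /dev/sda /dev/sdb /dev/sdc /dev/sde /dev/sdf /dev/sdg; do
    echo "== $d"
    mdadm --examine "$d" | grep -E 'Array UUID|Device Role|Events'
done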

Thanks.
--RJ

On Monday, January 22, 2024 at 06:49:50 PM EST, Reindl Harald <h.reindl@thelounge.net> wrote: 

Am 22.01.24 um 23:13 schrieb RJ Marquette:
> Sorry!
> 
> rj@jackie:~$ cat /proc/mdstat
> Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
> unused devices: <none>

that's all and where is the ton of raid-types coming from with no single 
array shown?

[root@srv-rhsoft:~]$ cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdb2[2] sda2[0]
      30740480 blocks super 1.2 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

md1 : active raid1 sda3[0] sdb3[2]
      3875717120 blocks super 1.2 [2/2] [UU]
      bitmap: 5/29 pages [20KB], 65536KB chunk


unused devices: <none>

> On Monday, January 22, 2024 at 04:55:50 PM EST, Reindl Harald <h.reindl@thelounge.net> wrote:
> 
> a ton of "mdadm --examine" outputs but i can't see a "cat /proc/mdstat"
> 
> /dev/sdX is completly irrelevant when it comes to raid - you can even
> connect a random disk via USB adapter without a change from the view of
> the array
> 
> Am 22.01.24 um 20:52 schrieb RJ Marquette:
>> Hi, all.  I have a Raid5 array with 5 disks in use and a 6th in reserve that I built using 3TB drives in 2019.  It has been running fine since, not even a single drive failure.  The system also has a 7th hard drive for OS, home directory, etc.  The motherboard had four SATA ports, so I added an adapter card that has 4 more ports, with three drives connected to it.  The server runs Debian that I keep relatively current.
>>
>> Yesterday, I swapped a newer motherboard into the computer (upgraded my desktop and moved the guts to my server).  I never disconnected the cables from the adapter card (whew, I think), so I know which four drives were connected to the motherboard.  Unfortunately I didn't really note how they were hooked to the motherboard (SATA1-4 ports).  Didn't even think it would be an issue.  I'm reasonably confident the array drives on the motherboard were sda-sdc, but I'm not certain.
>>
>> Now I can't get the array to come up.  I'm reasonably certain I haven't done anything to write to the drives - but mdadm will not assemble the drives (I have not tried to force it).  I'm not entirely sure what's up and would really appreciate any help.
>>
>> I've tried various incantations of mdadm --assemble --scan, with no luck.  I've seen the posts about certain motherboards that can mess up the drives, and I'm hoping I'm not in that boat.  The "new" motherboard is a Asus Z96-K/CSM.
>>
>> I assume using --force is in my future...I see various pages that say use --force then check it, but will that damage it if I'm wrong?  If not, how will I know it's correct?  Is the order of drives important with --force?  I see conflicting info on that.
>>
>> I'm no expert but it looks like each drive has the mdadm superblock...so I'm not sure why it won't assemble.  Please help!
>>
>> Thanks in advance.
>> --RJ
>>
>> root@jackie:~# uname -a
>> Linux jackie 5.10.0-27-amd64 #1 SMP Debian 5.10.205-2 (2023-12-31) x86_64 GNU/Linux
>>
>> root@jackie:~# mdadm --version
>> mdadm - v4.1 - 2018-10-01
>>
>> root@jackie:~# mdadm --examine /dev/sda
>> /dev/sda:   MBR Magic : aa55
>> Partition[0] :   4294967295 sectors at            1 (type ee)
>>
>> root@jackie:~# mdadm --examine /dev/sda1
>> mdadm: No md superblock detected on /dev/sda1.
>>
>> root@jackie:~# mdadm --examine /dev/sdb
>> /dev/sdb:   MBR Magic : aa55
>> Partition[0] :   4294967295 sectors at            1 (type ee)
>>
>> root@jackie:~# mdadm --examine /dev/sdb1
>> mdadm: No md superblock detected on /dev/sdb1.
>>
>> root@jackie:~# mdadm --examine /dev/sdc
>> /dev/sdc:          Magic : a92b4efc        Version : 1.2
>> Feature Map : 0x0
>> Array UUID : 74a11272:9b233a5b:2506f763:27693ccc
>> Name : jackie:0  (local to host jackie)
>> Creation Time : Sat Dec  8 19:32:07 2018
>> Raid Level : raid5
>> Raid Devices : 5 Avail
>> Dev Size : 5860271024 (2794.39 GiB 3000.46 GB)
>> Array Size : 11720540160 (11177.58 GiB 12001.83 GB)
>> Used Dev Size : 5860270080 (2794.39 GiB 3000.46 GB)
>> Data Offset : 262144 sectors
>> Super Offset : 8 sectors
>> Unused Space : before=261864 sectors, after=944 sectors
>> State : clean
>> Device UUID : a2b677bb:4004d8fb:a298a923:bab4df8a
>> Update Time : Fri Jan 19 15:25:37 2024
>> Bad Block Log : 512 entries available at offset 264 sectors
>> Checksum : 2487f053 - correct
>> Events : 5958
>> Layout : left-symmetric
>> Chunk Size : 512K
>> Device Role : spare
>> Array State : AAAAA ('A' == active, '.' == missing, 'R' == replacing)
>>
>> root@jackie:~# mdadm --examine /dev/sdc1
>> mdadm: cannot open /dev/sdc1: No such file or directory
>>
>> root@jackie:~# mdadm --examine /dev/sde
>> /dev/sde:   MBR Magic : aa55
>> Partition[0] :   4294967295 sectors at            1 (type ee)
>>
>> root@jackie:~# mdadm --examine /dev/sde1
>> mdadm: No md superblock detected on /dev/sde1.
>>
>> root@jackie:~# mdadm --examine /dev/sdf
>> /dev/sdf:   MBR Magic : aa55
>> Partition[0] :   4294967295 sectors at            1 (type ee)
>>
>> root@jackie:~# mdadm --examine /dev/sdf1
>> mdadm: No md superblock detected on /dev/sdf1.
>>
>> root@jackie:~# mdadm --examine /dev/sdg
>> /dev/sdg:   MBR Magic : aa55
>> Partition[0] :   4294967295 sectors at            1 (type ee)
>>
>> root@jackie:~# mdadm --examine /dev/sdg1
>> mdadm: No md superblock detected on /dev/sdg1.
>>
>> root@jackie:~# lsdrv
>> PCI [ahci] 00:1f.2 SATA controller: Intel Corporation 9 Series Chipset Family SATA Controller [AHCI Mode]
>> ├scsi 0:0:0:0 ATA      ST3000VN007-2E41 {Z7317D1A}
>> │└sda 2.73t [8:0] Partitioned (gpt)
>> │ └sda1 2.73t [8:1] Empty/Unknown
>> ├scsi 1:0:0:0 ATA      Hitachi HUS72403 {P8GSA1WR}
>> │└sdb 2.73t [8:16] Partitioned (gpt)
>> │ └sdb1 2.73t [8:17] Empty/Unknown
>> ├scsi 2:0:0:0 ATA      Hitachi HUA72303 {MK0371YVGSZ9RA}
>> │└sdc 2.73t [8:32] MD raid5 (5) inactive 'jackie:0' {74a11272-9b23-3a5b-2506-f76327693ccc}
>> └scsi 3:0:0:0 ATA      ST32000542AS     {5XW110LY}
>> └sdd 1.82t [8:48] Partitioned (dos)
>> ├sdd1 23.28g [8:49] Partitioned (dos) {d94cc2c8-037a-49c5-8a1e-01bb47d78624}
>> │└Mounted as /dev/sdd1 @ /
>> ├sdd2 1.00k [8:50] Partitioned (dos)
>> ├sdd5 9.31g [8:53] ext4 {6eb3b4d0-8c7f-4b06-a431-4c292d5bda86}
>> │└Mounted as /dev/sdd5 @ /var
>> ├sdd6 3.96g [8:54] swap {901cd56d-ef11-4866-824b-d9ec4ae6fe6e}
>> ├sdd7 1.86g [8:55] ext4 {69ba0889-322b-4fc8-b9d3-a2d133c97e5e}
>> │└Mounted as /dev/sdd7 @ /tmp
>> └sdd8 1.78t [8:56] ext4 {4ed408d4-6b22-46e0-baed-2e0589ff41fb}
>> └Mounted as /dev/sdd8 @ /home PCI [ahci]
>>
>> 06:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9215 PCIe 2.0 x1 4-port SATA 6 Gb/s Controller (rev 11)
>> ├scsi 6:0:0:0 ATA      Hitachi HUS72403 {P8G84LEP}
>> │└sde 2.73t [8:64] Partitioned (gpt)
>> │ └sde1 2.73t [8:65] Empty/Unknown
>> ├scsi 7:0:0:0 ATA      ST3000VN007-2E41 {Z7317D46}
>> │└sdf 2.73t [8:80] Partitioned (gpt)
>> │ └sdf1 2.73t [8:81] Empty/Unknown
>> └scsi 8:0:0:0 ATA      ST3000VN007-2E41 {Z7317JTX}
>> └sdg 2.73t [8:96] Partitioned (gpt)
>> └sdg1 2.73t [8:97] Empty/Unknown
>>
>> root@jackie:~# cat /etc/mdadm/mdadm.conf
>>     # This configuration was auto-generated on Wed, 27 Nov 2019 15:53:23 -0500 by mkconf
>> ARRAY /dev/md0 metadata=1.2 spares=1 name=jackie:0 UUID=74a11272:9b233a5b:2506f763:27693cccr

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Requesting help recovering my array
  2024-01-23  0:09         ` RJ Marquette
@ 2024-01-23  1:52           ` RJ Marquette
  2024-01-23 16:06             ` David Niklas
  0 siblings, 1 reply; 40+ messages in thread
From: RJ Marquette @ 2024-01-23  1:52 UTC (permalink / raw)
  To: linux-raid, Reindl Harald

I meant to add that my /proc/mdstat looked much more like yours on the old system.  But nothing is showing on this one. 

I may try swapping back to the old motherboard.  Another possibility that might be a factor - UEFI vs Legacy BIOS.

Thanks.
--RJ

On Monday, January 22, 2024 at 07:45:29 PM EST, RJ Marquette <rjm1@yahoo.com> wrote: 

That's all.  

If I run:

root@jackie:~# mdadm --assemble --scan
mdadm: /dev/md0 assembled from 0 drives and 1 spare - not enough to start the array.

root@jackie:~# cat /proc/mdstat  
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]  
unused devices: <none>

root@jackie:~# ls -l /dev/md*
ls: cannot access '/dev/md*': No such file or directory

It seems to be recognizing the spare drive, but not the 5 that actually have data, for some reason.

Thanks.
--RJ

On Monday, January 22, 2024 at 06:49:50 PM EST, Reindl Harald <h.reindl@thelounge.net> wrote: 

Am 22.01.24 um 23:13 schrieb RJ Marquette:
> Sorry!
> 
> rj@jackie:~$ cat /proc/mdstat
> Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
> unused devices: <none>

that's all and where is the ton of raid-types coming from with no single 
array shown?

[root@srv-rhsoft:~]$ cat /proc/mdstat
Personalities : [raid1]
md0 : active raid1 sdb2[2] sda2[0]
      30740480 blocks super 1.2 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

md1 : active raid1 sda3[0] sdb3[2]
      3875717120 blocks super 1.2 [2/2] [UU]
      bitmap: 5/29 pages [20KB], 65536KB chunk


unused devices: <none>

> On Monday, January 22, 2024 at 04:55:50 PM EST, Reindl Harald <h.reindl@thelounge.net> wrote:
> 
> a ton of "mdadm --examine" outputs but i can't see a "cat /proc/mdstat"
> 
> /dev/sdX is completly irrelevant when it comes to raid - you can even
> connect a random disk via USB adapter without a change from the view of
> the array
> 
> Am 22.01.24 um 20:52 schrieb RJ Marquette:
>> Hi, all.  I have a Raid5 array with 5 disks in use and a 6th in reserve that I built using 3TB drives in 2019.  It has been running fine since, not even a single drive failure.  The system also has a 7th hard drive for OS, home directory, etc.  The motherboard had four SATA ports, so I added an adapter card that has 4 more ports, with three drives connected to it.  The server runs Debian that I keep relatively current.
>>
>> Yesterday, I swapped a newer motherboard into the computer (upgraded my desktop and moved the guts to my server).  I never disconnected the cables from the adapter card (whew, I think), so I know which four drives were connected to the motherboard.  Unfortunately I didn't really note how they were hooked to the motherboard (SATA1-4 ports).  Didn't even think it would be an issue.  I'm reasonably confident the array drives on the motherboard were sda-sdc, but I'm not certain.
>>
>> Now I can't get the array to come up.  I'm reasonably certain I haven't done anything to write to the drives - but mdadm will not assemble the drives (I have not tried to force it).  I'm not entirely sure what's up and would really appreciate any help.
>>
>> I've tried various incantations of mdadm --assemble --scan, with no luck.  I've seen the posts about certain motherboards that can mess up the drives, and I'm hoping I'm not in that boat.  The "new" motherboard is a Asus Z96-K/CSM.
>>
>> I assume using --force is in my future...I see various pages that say use --force then check it, but will that damage it if I'm wrong?  If not, how will I know it's correct?  Is the order of drives important with --force?  I see conflicting info on that.
>>
>> I'm no expert but it looks like each drive has the mdadm superblock...so I'm not sure why it won't assemble.  Please help!
>>
>> Thanks in advance.
>> --RJ
>>
>> root@jackie:~# uname -a
>> Linux jackie 5.10.0-27-amd64 #1 SMP Debian 5.10.205-2 (2023-12-31) x86_64 GNU/Linux
>>
>> root@jackie:~# mdadm --version
>> mdadm - v4.1 - 2018-10-01
>>
>> root@jackie:~# mdadm --examine /dev/sda
>> /dev/sda:   MBR Magic : aa55
>> Partition[0] :   4294967295 sectors at            1 (type ee)
>>
>> root@jackie:~# mdadm --examine /dev/sda1
>> mdadm: No md superblock detected on /dev/sda1.
>>
>> root@jackie:~# mdadm --examine /dev/sdb
>> /dev/sdb:   MBR Magic : aa55
>> Partition[0] :   4294967295 sectors at            1 (type ee)
>>
>> root@jackie:~# mdadm --examine /dev/sdb1
>> mdadm: No md superblock detected on /dev/sdb1.
>>
>> root@jackie:~# mdadm --examine /dev/sdc
>> /dev/sdc:          Magic : a92b4efc        Version : 1.2
>> Feature Map : 0x0
>> Array UUID : 74a11272:9b233a5b:2506f763:27693ccc
>> Name : jackie:0  (local to host jackie)
>> Creation Time : Sat Dec  8 19:32:07 2018
>> Raid Level : raid5
>> Raid Devices : 5 Avail
>> Dev Size : 5860271024 (2794.39 GiB 3000.46 GB)
>> Array Size : 11720540160 (11177.58 GiB 12001.83 GB)
>> Used Dev Size : 5860270080 (2794.39 GiB 3000.46 GB)
>> Data Offset : 262144 sectors
>> Super Offset : 8 sectors
>> Unused Space : before=261864 sectors, after=944 sectors
>> State : clean
>> Device UUID : a2b677bb:4004d8fb:a298a923:bab4df8a
>> Update Time : Fri Jan 19 15:25:37 2024
>> Bad Block Log : 512 entries available at offset 264 sectors
>> Checksum : 2487f053 - correct
>> Events : 5958
>> Layout : left-symmetric
>> Chunk Size : 512K
>> Device Role : spare
>> Array State : AAAAA ('A' == active, '.' == missing, 'R' == replacing)
>>
>> root@jackie:~# mdadm --examine /dev/sdc1
>> mdadm: cannot open /dev/sdc1: No such file or directory
>>
>> root@jackie:~# mdadm --examine /dev/sde
>> /dev/sde:   MBR Magic : aa55
>> Partition[0] :   4294967295 sectors at            1 (type ee)
>>
>> root@jackie:~# mdadm --examine /dev/sde1
>> mdadm: No md superblock detected on /dev/sde1.
>>
>> root@jackie:~# mdadm --examine /dev/sdf
>> /dev/sdf:   MBR Magic : aa55
>> Partition[0] :   4294967295 sectors at            1 (type ee)
>>
>> root@jackie:~# mdadm --examine /dev/sdf1
>> mdadm: No md superblock detected on /dev/sdf1.
>>
>> root@jackie:~# mdadm --examine /dev/sdg
>> /dev/sdg:   MBR Magic : aa55
>> Partition[0] :   4294967295 sectors at            1 (type ee)
>>
>> root@jackie:~# mdadm --examine /dev/sdg1
>> mdadm: No md superblock detected on /dev/sdg1.
>>
>> root@jackie:~# lsdrv
>> PCI [ahci] 00:1f.2 SATA controller: Intel Corporation 9 Series Chipset Family SATA Controller [AHCI Mode]
>> ├scsi 0:0:0:0 ATA      ST3000VN007-2E41 {Z7317D1A}
>> │└sda 2.73t [8:0] Partitioned (gpt)
>> │ └sda1 2.73t [8:1] Empty/Unknown
>> ├scsi 1:0:0:0 ATA      Hitachi HUS72403 {P8GSA1WR}
>> │└sdb 2.73t [8:16] Partitioned (gpt)
>> │ └sdb1 2.73t [8:17] Empty/Unknown
>> ├scsi 2:0:0:0 ATA      Hitachi HUA72303 {MK0371YVGSZ9RA}
>> │└sdc 2.73t [8:32] MD raid5 (5) inactive 'jackie:0' {74a11272-9b23-3a5b-2506-f76327693ccc}
>> └scsi 3:0:0:0 ATA      ST32000542AS     {5XW110LY}
>> └sdd 1.82t [8:48] Partitioned (dos)
>> ├sdd1 23.28g [8:49] Partitioned (dos) {d94cc2c8-037a-49c5-8a1e-01bb47d78624}
>> │└Mounted as /dev/sdd1 @ /
>> ├sdd2 1.00k [8:50] Partitioned (dos)
>> ├sdd5 9.31g [8:53] ext4 {6eb3b4d0-8c7f-4b06-a431-4c292d5bda86}
>> │└Mounted as /dev/sdd5 @ /var
>> ├sdd6 3.96g [8:54] swap {901cd56d-ef11-4866-824b-d9ec4ae6fe6e}
>> ├sdd7 1.86g [8:55] ext4 {69ba0889-322b-4fc8-b9d3-a2d133c97e5e}
>> │└Mounted as /dev/sdd7 @ /tmp
>> └sdd8 1.78t [8:56] ext4 {4ed408d4-6b22-46e0-baed-2e0589ff41fb}
>> └Mounted as /dev/sdd8 @ /home PCI [ahci]
>>
>> 06:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9215 PCIe 2.0 x1 4-port SATA 6 Gb/s Controller (rev 11)
>> ├scsi 6:0:0:0 ATA      Hitachi HUS72403 {P8G84LEP}
>> │└sde 2.73t [8:64] Partitioned (gpt)
>> │ └sde1 2.73t [8:65] Empty/Unknown
>> ├scsi 7:0:0:0 ATA      ST3000VN007-2E41 {Z7317D46}
>> │└sdf 2.73t [8:80] Partitioned (gpt)
>> │ └sdf1 2.73t [8:81] Empty/Unknown
>> └scsi 8:0:0:0 ATA      ST3000VN007-2E41 {Z7317JTX}
>> └sdg 2.73t [8:96] Partitioned (gpt)
>> └sdg1 2.73t [8:97] Empty/Unknown
>>
>> root@jackie:~# cat /etc/mdadm/mdadm.conf
>>     # This configuration was auto-generated on Wed, 27 Nov 2019 15:53:23 -0500 by mkconf
>> ARRAY /dev/md0 metadata=1.2 spares=1 name=jackie:0 UUID=74a11272:9b233a5b:2506f763:27693cccr

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Requesting help recovering my array
  2024-01-23  1:52           ` RJ Marquette
@ 2024-01-23 16:06             ` David Niklas
  2024-01-23 16:09               ` RJ Marquette
  2024-01-23 16:16               ` RJ Marquette
  0 siblings, 2 replies; 40+ messages in thread
From: David Niklas @ 2024-01-23 16:06 UTC (permalink / raw)
  To: linux-raid; +Cc: RJ Marquette

Hello,

As someone who's a bit more experienced in RAID array failures, I'd like
to suggest the following:

# Check that all drives are being detected.
ls /dev/sd*

# Verify what exactly is being scanned.
grep DEVICE /etc/mdadm/mdadm.conf

Assuming both of these give satisfactory results*, your next step would
be to try assembling them out of order and see what happens. For example:

-> mdadm --assemble /dev/md0 /dev/sda /dev/sdb
Mdadm: Error Not part of array /dev/sdb
-> mdadm --assemble /dev/md0 /dev/sda /dev/sdc
Mdadm: Error too few drives to start array /dev/md0

Please note that I made up what mdadm is saying there. But it still tells
you what's going on.
* for the ls command you should see all the drives you have. For the grep
command you should get a listing like "/dev/sda /dev/sdb"... Obviously,
all the drives that might have a RAID array on them should be listed.
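
If the explicit assemble still refuses, adding --verbose makes mdadm say
which devices it looked at and why it rejected each one, which narrows
things down quickly:

-> mdadm --assemble --verbose /dev/md0 /dev/sda /dev/sdb /dev/sdc /dev/sde /dev/sdf /dev/sdg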


Sincerely,
David

On Tue, 23 Jan 2024 01:52:31 +0000 (UTC)
RJ Marquette <rjm1@yahoo.com> wrote:
> I meant to add that my /proc/mdstat looked much more like yours on the
> old system.  But nothing is showing on this one. 
> 
> I may try swapping back to the old motherboard.  Another possibility
> that might be factor - UEFI vs Legacy BIOS.
> 
> Thanks.
> --RJ
> 
> 
> On Monday, January 22, 2024 at 07:45:29 PM EST, RJ Marquette
> <rjm1@yahoo.com> wrote: 
> 
> 
> 
> 
> 
> That's all.  
> 
> If I run:
> 
> root@jackie:~# mdadm --assemble --scan
> mdadm: /dev/md0 assembled from 0 drives and 1 spare - not enough to
> start the array.
> 
> root@jackie:~# cat /proc/mdstat  
> Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5]
> [raid4] [raid10] unused devices: <none>
> 
> root@jackie:~# ls -l /dev/md*
> ls: cannot access '/dev/md*': No such file or directory
> 
> It seems to be recognizing the spare drive, but not the 5 that actually
> have data, for some reason.
> 
> Thanks.
> --RJ
> 
> 
> 
> 
> 
> 
> 
> 
> On Monday, January 22, 2024 at 06:49:50 PM EST, Reindl Harald
> <h.reindl@thelounge.net> wrote: 
> 
> 
> 
> 
> 
> 
> 
> Am 22.01.24 um 23:13 schrieb RJ Marquette:
> > Sorry!
> > 
> > rj@jackie:~$ cat /proc/mdstat
> > Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5]
> > [raid4] [raid10] unused devices: <none>  
> 
> that's all and where is the ton of raid-types coming from with no
> single array shown?
> 
> [root@srv-rhsoft:~]$ cat /proc/mdstat
> Personalities : [raid1]
> md0 : active raid1 sdb2[2] sda2[0]
>       30740480 blocks super 1.2 [2/2] [UU]
>       bitmap: 0/1 pages [0KB], 65536KB chunk
> 
> md1 : active raid1 sda3[0] sdb3[2]
>       3875717120 blocks super 1.2 [2/2] [UU]
>       bitmap: 5/29 pages [20KB], 65536KB chunk
> 
> 
> unused devices: <none>
> 
> > On Monday, January 22, 2024 at 04:55:50 PM EST, Reindl Harald
> > <h.reindl@thelounge.net> wrote:
> > 
> > a ton of "mdadm --examine" outputs but i can't see a
> > "cat /proc/mdstat"
> > 
> > /dev/sdX is completly irrelevant when it comes to raid - you can even
> > connect a random disk via USB adapter without a change from the view
> > of the array
> > 
> > Am 22.01.24 um 20:52 schrieb RJ Marquette:  
> >> Hi, all.  I have a Raid5 array with 5 disks in use and a 6th in
> >> reserve that I built using 3TB drives in 2019.  It has been running
> >> fine since, not even a single drive failure.  The system also has a
> >> 7th hard drive for OS, home directory, etc.  The motherboard had
> >> four SATA ports, so I added an adapter card that has 4 more ports,
> >> with three drives connected to it.  The server runs Debian that I
> >> keep relatively current.
> >>
> >> Yesterday, I swapped a newer motherboard into the computer (upgraded
> >> my desktop and moved the guts to my server).  I never disconnected
> >> the cables from the adapter card (whew, I think), so I know which
> >> four drives were connected to the motherboard.  Unfortunately I
> >> didn't really note how they were hooked to the motherboard (SATA1-4
> >> ports).  Didn't even think it would be an issue.  I'm reasonably
> >> confident the array drives on the motherboard were sda-sdc, but I'm
> >> not certain.
> >>
> >> Now I can't get the array to come up.  I'm reasonably certain I
> >> haven't done anything to write to the drives - but mdadm will not
> >> assemble the drives (I have not tried to force it).  I'm not
> >> entirely sure what's up and would really appreciate any help.
> >>
> >> I've tried various incantations of mdadm --assemble --scan, with no
> >> luck.  I've seen the posts about certain motherboards that can mess
> >> up the drives, and I'm hoping I'm not in that boat.  The "new"
> >> motherboard is a Asus Z96-K/CSM.
> >>
> >> I assume using --force is in my future...I see various pages that
> >> say use --force then check it, but will that damage it if I'm
> >> wrong?  If not, how will I know it's correct?  Is the order of
> >> drives important with --force?  I see conflicting info on that.
> >>
> >> I'm no expert but it looks like each drive has the mdadm
> >> superblock...so I'm not sure why it won't assemble.  Please help!
> >>
> >> Thanks in advance.
> >> --RJ
> >>
> >> root@jackie:~# uname -a
> >> Linux jackie 5.10.0-27-amd64 #1 SMP Debian 5.10.205-2 (2023-12-31)
> >> x86_64 GNU/Linux
> >>
> >> root@jackie:~# mdadm --version
> >> mdadm - v4.1 - 2018-10-01
> >>
> >> root@jackie:~# mdadm --examine /dev/sda
> >> /dev/sda:   MBR Magic : aa55
> >> Partition[0] :   4294967295 sectors at            1 (type ee)
> >>
> >> root@jackie:~# mdadm --examine /dev/sda1
> >> mdadm: No md superblock detected on /dev/sda1.
> >>
> >> root@jackie:~# mdadm --examine /dev/sdb
> >> /dev/sdb:   MBR Magic : aa55
> >> Partition[0] :   4294967295 sectors at            1 (type ee)
> >>
> >> root@jackie:~# mdadm --examine /dev/sdb1
> >> mdadm: No md superblock detected on /dev/sdb1.
> >>
> >> root@jackie:~# mdadm --examine /dev/sdc
> >> /dev/sdc:          Magic : a92b4efc        Version : 1.2
> >> Feature Map : 0x0
> >> Array UUID : 74a11272:9b233a5b:2506f763:27693ccc
> >> Name : jackie:0  (local to host jackie)
> >> Creation Time : Sat Dec  8 19:32:07 2018
> >> Raid Level : raid5
> >> Raid Devices : 5 Avail
> >> Dev Size : 5860271024 (2794.39 GiB 3000.46 GB)
> >> Array Size : 11720540160 (11177.58 GiB 12001.83 GB)
> >> Used Dev Size : 5860270080 (2794.39 GiB 3000.46 GB)
> >> Data Offset : 262144 sectors
> >> Super Offset : 8 sectors
> >> Unused Space : before=261864 sectors, after=944 sectors
> >> State : clean
> >> Device UUID : a2b677bb:4004d8fb:a298a923:bab4df8a
> >> Update Time : Fri Jan 19 15:25:37 2024
> >> Bad Block Log : 512 entries available at offset 264 sectors
> >> Checksum : 2487f053 - correct
> >> Events : 5958
> >> Layout : left-symmetric
> >> Chunk Size : 512K
> >> Device Role : spare
> >> Array State : AAAAA ('A' == active, '.' == missing, 'R' == replacing)
> >>
> >> root@jackie:~# mdadm --examine /dev/sdc1
> >> mdadm: cannot open /dev/sdc1: No such file or directory
> >>
> >> root@jackie:~# mdadm --examine /dev/sde
> >> /dev/sde:   MBR Magic : aa55
> >> Partition[0] :   4294967295 sectors at            1 (type ee)
> >>
> >> root@jackie:~# mdadm --examine /dev/sde1
> >> mdadm: No md superblock detected on /dev/sde1.
> >>
> >> root@jackie:~# mdadm --examine /dev/sdf
> >> /dev/sdf:   MBR Magic : aa55
> >> Partition[0] :   4294967295 sectors at            1 (type ee)
> >>
> >> root@jackie:~# mdadm --examine /dev/sdf1
> >> mdadm: No md superblock detected on /dev/sdf1.
> >>
> >> root@jackie:~# mdadm --examine /dev/sdg
> >> /dev/sdg:   MBR Magic : aa55
> >> Partition[0] :   4294967295 sectors at            1 (type ee)
> >>
> >> root@jackie:~# mdadm --examine /dev/sdg1
> >> mdadm: No md superblock detected on /dev/sdg1.
> >>
> >> root@jackie:~# lsdrv
> >> PCI [ahci] 00:1f.2 SATA controller: Intel Corporation 9 Series
> >> Chipset Family SATA Controller [AHCI Mode] ├scsi 0:0:0:0 ATA
> >>      ST3000VN007-2E41 {Z7317D1A} │└sda 2.73t [8:0] Partitioned (gpt)
> >> │ └sda1 2.73t [8:1] Empty/Unknown
> >> ├scsi 1:0:0:0 ATA      Hitachi HUS72403 {P8GSA1WR}
> >> │└sdb 2.73t [8:16] Partitioned (gpt)
> >> │ └sdb1 2.73t [8:17] Empty/Unknown
> >> ├scsi 2:0:0:0 ATA      Hitachi HUA72303 {MK0371YVGSZ9RA}
> >> │└sdc 2.73t [8:32] MD raid5 (5) inactive
> >> 'jackie:0' {74a11272-9b23-3a5b-2506-f76327693ccc} └scsi 3:0:0:0 ATA
> >>      ST32000542AS     {5XW110LY} └sdd 1.82t [8:48] Partitioned (dos)
> >> ├sdd1 23.28g [8:49] Partitioned (dos)
> >> {d94cc2c8-037a-49c5-8a1e-01bb47d78624} │└Mounted as /dev/sdd1 @ /
> >> ├sdd2 1.00k [8:50] Partitioned (dos)
> >> ├sdd5 9.31g [8:53] ext4 {6eb3b4d0-8c7f-4b06-a431-4c292d5bda86}
> >> │└Mounted as /dev/sdd5 @ /var
> >> ├sdd6 3.96g [8:54] swap {901cd56d-ef11-4866-824b-d9ec4ae6fe6e}
> >> ├sdd7 1.86g [8:55] ext4 {69ba0889-322b-4fc8-b9d3-a2d133c97e5e}
> >> │└Mounted as /dev/sdd7 @ /tmp
> >> └sdd8 1.78t [8:56] ext4 {4ed408d4-6b22-46e0-baed-2e0589ff41fb}
> >> └Mounted as /dev/sdd8 @ /home PCI [ahci]
> >>
> >> 06:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9215 PCIe
> >> 2.0 x1 4-port SATA 6 Gb/s Controller (rev 11) ├scsi 6:0:0:0 ATA
> >>      Hitachi HUS72403 {P8G84LEP} │└sde 2.73t [8:64] Partitioned (gpt)
> >> │ └sde1 2.73t [8:65] Empty/Unknown
> >> ├scsi 7:0:0:0 ATA      ST3000VN007-2E41 {Z7317D46}
> >> │└sdf 2.73t [8:80] Partitioned (gpt)
> >> │ └sdf1 2.73t [8:81] Empty/Unknown
> >> └scsi 8:0:0:0 ATA      ST3000VN007-2E41 {Z7317JTX}
> >> └sdg 2.73t [8:96] Partitioned (gpt)
> >> └sdg1 2.73t [8:97] Empty/Unknown
> >>
> >> root@jackie:~# cat /etc/mdadm/mdadm.conf
> >>     # This configuration was auto-generated on Wed, 27 Nov 2019
> >>15:53:23 -0500 by mkconf
> >> ARRAY /dev/md0 metadata=1.2 spares=1 name=jackie:0
> >> UUID=74a11272:9b233a5b:2506f763:27693cccr  
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Requesting help recovering my array
  2024-01-23 16:06             ` David Niklas
@ 2024-01-23 16:09               ` RJ Marquette
  2024-01-23 16:16               ` RJ Marquette
  1 sibling, 0 replies; 40+ messages in thread
From: RJ Marquette @ 2024-01-23 16:09 UTC (permalink / raw)
  To: linux-raid, David Niklas

Thanks.  All drives in the system are being detected (/dev/sdd is my system drive - the rest all belong to the array):

rj@jackie:~$ ls -l /dev/sd*
brw-rw---- 1 root disk 8,  0 Jan 21 19:08 /dev/sda
brw-rw---- 1 root disk 8,  1 Jan 21 19:08 /dev/sda1
brw-rw---- 1 root disk 8, 16 Jan 21 19:08 /dev/sdb
brw-rw---- 1 root disk 8, 17 Jan 21 19:08 /dev/sdb1
brw-rw---- 1 root disk 8, 32 Jan 21 19:08 /dev/sdc
brw-rw---- 1 root disk 8, 48 Jan 21 19:08 /dev/sdd
brw-rw---- 1 root disk 8, 49 Jan 21 19:08 /dev/sdd1
brw-rw---- 1 root disk 8, 50 Jan 21 19:08 /dev/sdd2
brw-rw---- 1 root disk 8, 53 Jan 21 19:08 /dev/sdd5
brw-rw---- 1 root disk 8, 54 Jan 21 19:08 /dev/sdd6
brw-rw---- 1 root disk 8, 55 Jan 21 19:08 /dev/sdd7
brw-rw---- 1 root disk 8, 56 Jan 21 19:08 /dev/sdd8
brw-rw---- 1 root disk 8, 64 Jan 21 19:08 /dev/sde
brw-rw---- 1 root disk 8, 65 Jan 21 19:08 /dev/sde1
brw-rw---- 1 root disk 8, 80 Jan 21 19:08 /dev/sdf
brw-rw---- 1 root disk 8, 81 Jan 21 19:08 /dev/sdf1
brw-rw---- 1 root disk 8, 96 Jan 21 19:08 /dev/sdg
brw-rw---- 1 root disk 8, 97 Jan 21 19:08 /dev/sdg1


The devices are not listed in the mdadm.conf, nor were they ever.  Here's everything that's not commented out in that file:









On Tuesday, January 23, 2024 at 11:06:30 AM EST, David Niklas <simd@vfemail.net> wrote: 





Hello,

As someone who's a bit more experienced in RAID array failures, I'd like
to suggest the following:

# Check that all drives are being detected.
ls /dev/sd*

# Verify what exactly is being scanned.
grep DEVICE /etc/mdadm/mdadm.conf

Assuming both of these give satisfactory results*, your next step would
be to try assembling them out of order and see what happens. For example:

-> mdadm --assemble /dev/md0 /dev/sda /dev/sdb
Mdadm: Error Not part of array /dev/sdb
-> mdadm --assemble /dev/md0 /dev/sda /dev/sdc
Mdadm: Error too few drives to start array /dev/md0

Please note that I made up what mdadm is saying there. But it still tells
you what's going on.
* for the ls command you should see all the drives you have. For the grep
command you should get a listing like "/dev/sda /dev/sdb"... Obviously,
all the drives that might have a RAID array on them should be listed.


Sincerely,
David





On Tue, 23 Jan 2024 01:52:31 +0000 (UTC)
RJ Marquette <rjm1@yahoo.com> wrote:
> I meant to add that my /proc/mdstat looked much more like yours on the
> old system.  But nothing is showing on this one. 
> 
> I may try swapping back to the old motherboard.  Another possibility
> that might be factor - UEFI vs Legacy BIOS.
> 
> Thanks.
> --RJ
> 
> 
> On Monday, January 22, 2024 at 07:45:29 PM EST, RJ Marquette
> <rjm1@yahoo.com> wrote: 
> 
> 
> 
> 
> 
> That's all.  
> 
> If I run:
> 
> root@jackie:~# mdadm --assemble --scan
> mdadm: /dev/md0 assembled from 0 drives and 1 spare - not enough to
> start the array.
> 
> root@jackie:~# cat /proc/mdstat  
> Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5]
> [raid4] [raid10] unused devices: <none>
> 
> root@jackie:~# ls -l /dev/md*
> ls: cannot access '/dev/md*': No such file or directory
> 
> It seems to be recognizing the spare drive, but not the 5 that actually
> have data, for some reason.
> 
> Thanks.
> --RJ
> 
> 
> 
> 
> 
> 
> 
> 
> On Monday, January 22, 2024 at 06:49:50 PM EST, Reindl Harald
> <h.reindl@thelounge.net> wrote: 
> 
> 
> 
> 
> 
> 
> 
> Am 22.01.24 um 23:13 schrieb RJ Marquette:
> > Sorry!
> > 
> > rj@jackie:~$ cat /proc/mdstat
> > Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5]
> > [raid4] [raid10] unused devices: <none>  
> 
> that's all and where is the ton of raid-types coming from with no
> single array shown?
> 
> [root@srv-rhsoft:~]$ cat /proc/mdstat
> Personalities : [raid1]
> md0 : active raid1 sdb2[2] sda2[0]
>       30740480 blocks super 1.2 [2/2] [UU]
>       bitmap: 0/1 pages [0KB], 65536KB chunk
> 
> md1 : active raid1 sda3[0] sdb3[2]
>       3875717120 blocks super 1.2 [2/2] [UU]
>       bitmap: 5/29 pages [20KB], 65536KB chunk
> 
> 
> unused devices: <none>
> 
> > On Monday, January 22, 2024 at 04:55:50 PM EST, Reindl Harald
> > <h.reindl@thelounge.net> wrote:
> > 
> > a ton of "mdadm --examine" outputs but i can't see a
> > "cat /proc/mdstat"
> > 
> > /dev/sdX is completly irrelevant when it comes to raid - you can even
> > connect a random disk via USB adapter without a change from the view
> > of the array
> > 
> > Am 22.01.24 um 20:52 schrieb RJ Marquette:  
> >> Hi, all.  I have a Raid5 array with 5 disks in use and a 6th in
> >> reserve that I built using 3TB drives in 2019.  It has been running
> >> fine since, not even a single drive failure.  The system also has a
> >> 7th hard drive for OS, home directory, etc.  The motherboard had
> >> four SATA ports, so I added an adapter card that has 4 more ports,
> >> with three drives connected to it.  The server runs Debian that I
> >> keep relatively current.
> >>
> >> Yesterday, I swapped a newer motherboard into the computer (upgraded
> >> my desktop and moved the guts to my server).  I never disconnected
> >> the cables from the adapter card (whew, I think), so I know which
> >> four drives were connected to the motherboard.  Unfortunately I
> >> didn't really note how they were hooked to the motherboard (SATA1-4
> >> ports).  Didn't even think it would be an issue.  I'm reasonably
> >> confident the array drives on the motherboard were sda-sdc, but I'm
> >> not certain.
> >>
> >> Now I can't get the array to come up.  I'm reasonably certain I
> >> haven't done anything to write to the drives - but mdadm will not
> >> assemble the drives (I have not tried to force it).  I'm not
> >> entirely sure what's up and would really appreciate any help.
> >>
> >> I've tried various incantations of mdadm --assemble --scan, with no
> >> luck.  I've seen the posts about certain motherboards that can mess
> >> up the drives, and I'm hoping I'm not in that boat.  The "new"
> >> motherboard is a Asus Z96-K/CSM.
> >>
> >> I assume using --force is in my future...I see various pages that
> >> say use --force then check it, but will that damage it if I'm
> >> wrong?  If not, how will I know it's correct?  Is the order of
> >> drives important with --force?  I see conflicting info on that.
> >>
> >> I'm no expert but it looks like each drive has the mdadm
> >> superblock...so I'm not sure why it won't assemble.  Please help!
> >>
> >> Thanks in advance.
> >> --RJ
> >>
> >> root@jackie:~# uname -a
> >> Linux jackie 5.10.0-27-amd64 #1 SMP Debian 5.10.205-2 (2023-12-31)
> >> x86_64 GNU/Linux
> >>
> >> root@jackie:~# mdadm --version
> >> mdadm - v4.1 - 2018-10-01
> >>
> >> root@jackie:~# mdadm --examine /dev/sda
> >> /dev/sda:   MBR Magic : aa55
> >> Partition[0] :   4294967295 sectors at            1 (type ee)
> >>
> >> root@jackie:~# mdadm --examine /dev/sda1
> >> mdadm: No md superblock detected on /dev/sda1.
> >>
> >> root@jackie:~# mdadm --examine /dev/sdb
> >> /dev/sdb:   MBR Magic : aa55
> >> Partition[0] :   4294967295 sectors at            1 (type ee)
> >>
> >> root@jackie:~# mdadm --examine /dev/sdb1
> >> mdadm: No md superblock detected on /dev/sdb1.
> >>
> >> root@jackie:~# mdadm --examine /dev/sdc
> >> /dev/sdc:          Magic : a92b4efc        Version : 1.2
> >> Feature Map : 0x0
> >> Array UUID : 74a11272:9b233a5b:2506f763:27693ccc
> >> Name : jackie:0  (local to host jackie)
> >> Creation Time : Sat Dec  8 19:32:07 2018
> >> Raid Level : raid5
> >> Raid Devices : 5 Avail
> >> Dev Size : 5860271024 (2794.39 GiB 3000.46 GB)
> >> Array Size : 11720540160 (11177.58 GiB 12001.83 GB)
> >> Used Dev Size : 5860270080 (2794.39 GiB 3000.46 GB)
> >> Data Offset : 262144 sectors
> >> Super Offset : 8 sectors
> >> Unused Space : before=261864 sectors, after=944 sectors
> >> State : clean
> >> Device UUID : a2b677bb:4004d8fb:a298a923:bab4df8a
> >> Update Time : Fri Jan 19 15:25:37 2024
> >> Bad Block Log : 512 entries available at offset 264 sectors
> >> Checksum : 2487f053 - correct
> >> Events : 5958
> >> Layout : left-symmetric
> >> Chunk Size : 512K
> >> Device Role : spare
> >> Array State : AAAAA ('A' == active, '.' == missing, 'R' == replacing)
> >>
> >> root@jackie:~# mdadm --examine /dev/sdc1
> >> mdadm: cannot open /dev/sdc1: No such file or directory
> >>
> >> root@jackie:~# mdadm --examine /dev/sde
> >> /dev/sde:   MBR Magic : aa55
> >> Partition[0] :   4294967295 sectors at            1 (type ee)
> >>
> >> root@jackie:~# mdadm --examine /dev/sde1
> >> mdadm: No md superblock detected on /dev/sde1.
> >>
> >> root@jackie:~# mdadm --examine /dev/sdf
> >> /dev/sdf:   MBR Magic : aa55
> >> Partition[0] :   4294967295 sectors at            1 (type ee)
> >>
> >> root@jackie:~# mdadm --examine /dev/sdf1
> >> mdadm: No md superblock detected on /dev/sdf1.
> >>
> >> root@jackie:~# mdadm --examine /dev/sdg
> >> /dev/sdg:   MBR Magic : aa55
> >> Partition[0] :   4294967295 sectors at            1 (type ee)
> >>
> >> root@jackie:~# mdadm --examine /dev/sdg1
> >> mdadm: No md superblock detected on /dev/sdg1.
> >>
> >> root@jackie:~# lsdrv
> >> PCI [ahci] 00:1f.2 SATA controller: Intel Corporation 9 Series
> >> Chipset Family SATA Controller [AHCI Mode] ├scsi 0:0:0:0 ATA
> >>      ST3000VN007-2E41 {Z7317D1A} │└sda 2.73t [8:0] Partitioned (gpt)
> >> │ └sda1 2.73t [8:1] Empty/Unknown
> >> ├scsi 1:0:0:0 ATA      Hitachi HUS72403 {P8GSA1WR}
> >> │└sdb 2.73t [8:16] Partitioned (gpt)
> >> │ └sdb1 2.73t [8:17] Empty/Unknown
> >> ├scsi 2:0:0:0 ATA      Hitachi HUA72303 {MK0371YVGSZ9RA}
> >> │└sdc 2.73t [8:32] MD raid5 (5) inactive
> >> 'jackie:0' {74a11272-9b23-3a5b-2506-f76327693ccc} └scsi 3:0:0:0 ATA
> >>      ST32000542AS     {5XW110LY} └sdd 1.82t [8:48] Partitioned (dos)
> >> ├sdd1 23.28g [8:49] Partitioned (dos)
> >> {d94cc2c8-037a-49c5-8a1e-01bb47d78624} │└Mounted as /dev/sdd1 @ /
> >> ├sdd2 1.00k [8:50] Partitioned (dos)
> >> ├sdd5 9.31g [8:53] ext4 {6eb3b4d0-8c7f-4b06-a431-4c292d5bda86}
> >> │└Mounted as /dev/sdd5 @ /var
> >> ├sdd6 3.96g [8:54] swap {901cd56d-ef11-4866-824b-d9ec4ae6fe6e}
> >> ├sdd7 1.86g [8:55] ext4 {69ba0889-322b-4fc8-b9d3-a2d133c97e5e}
> >> │└Mounted as /dev/sdd7 @ /tmp
> >> └sdd8 1.78t [8:56] ext4 {4ed408d4-6b22-46e0-baed-2e0589ff41fb}
> >> └Mounted as /dev/sdd8 @ /home PCI [ahci]
> >>
> >> 06:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9215 PCIe
> >> 2.0 x1 4-port SATA 6 Gb/s Controller (rev 11) ├scsi 6:0:0:0 ATA
> >>      Hitachi HUS72403 {P8G84LEP} │└sde 2.73t [8:64] Partitioned (gpt)
> >> │ └sde1 2.73t [8:65] Empty/Unknown
> >> ├scsi 7:0:0:0 ATA      ST3000VN007-2E41 {Z7317D46}
> >> │└sdf 2.73t [8:80] Partitioned (gpt)
> >> │ └sdf1 2.73t [8:81] Empty/Unknown
> >> └scsi 8:0:0:0 ATA      ST3000VN007-2E41 {Z7317JTX}
> >> └sdg 2.73t [8:96] Partitioned (gpt)
> >> └sdg1 2.73t [8:97] Empty/Unknown
> >>
> >> root@jackie:~# cat /etc/mdadm/mdadm.conf
> >>     # This configuration was auto-generated on Wed, 27 Nov 2019
> >>15:53:23 -0500 by mkconf
> >> ARRAY /dev/md0 metadata=1.2 spares=1 name=jackie:0
> >> UUID=74a11272:9b233a5b:2506f763:27693ccc
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Requesting help recovering my array
  2024-01-23 16:06             ` David Niklas
  2024-01-23 16:09               ` RJ Marquette
@ 2024-01-23 16:16               ` RJ Marquette
  2024-01-23 22:50                 ` Sandro
  2024-01-24  3:19                 ` David Niklas
  1 sibling, 2 replies; 40+ messages in thread
From: RJ Marquette @ 2024-01-23 16:16 UTC (permalink / raw)
  To: linux-raid, David Niklas

(Sorry if this came through twice without the mdadm.conf contents, somehow I accidentally hit send when I was trying to paste in.)

Thanks.  All drives in the system are being detected (/dev/sdd is my system drive - the rest are all of the array):

rj@jackie:~$ ls -l /dev/sd*
brw-rw---- 1 root disk 8,  0 Jan 21 19:08 /dev/sda
brw-rw---- 1 root disk 8,  1 Jan 21 19:08 /dev/sda1
brw-rw---- 1 root disk 8, 16 Jan 21 19:08 /dev/sdb
brw-rw---- 1 root disk 8, 17 Jan 21 19:08 /dev/sdb1
brw-rw---- 1 root disk 8, 32 Jan 21 19:08 /dev/sdc
brw-rw---- 1 root disk 8, 48 Jan 21 19:08 /dev/sdd
brw-rw---- 1 root disk 8, 49 Jan 21 19:08 /dev/sdd1
brw-rw---- 1 root disk 8, 50 Jan 21 19:08 /dev/sdd2
brw-rw---- 1 root disk 8, 53 Jan 21 19:08 /dev/sdd5
brw-rw---- 1 root disk 8, 54 Jan 21 19:08 /dev/sdd6
brw-rw---- 1 root disk 8, 55 Jan 21 19:08 /dev/sdd7
brw-rw---- 1 root disk 8, 56 Jan 21 19:08 /dev/sdd8
brw-rw---- 1 root disk 8, 64 Jan 21 19:08 /dev/sde
brw-rw---- 1 root disk 8, 65 Jan 21 19:08 /dev/sde1
brw-rw---- 1 root disk 8, 80 Jan 21 19:08 /dev/sdf
brw-rw---- 1 root disk 8, 81 Jan 21 19:08 /dev/sdf1
brw-rw---- 1 root disk 8, 96 Jan 21 19:08 /dev/sdg
brw-rw---- 1 root disk 8, 97 Jan 21 19:08 /dev/sdg1


The devices are not listed in the mdadm.conf, nor were they ever.  Here's everything (except the initial header comments about updating initramfs and all) from that file:

# by default (built-in), scan all partitions (/proc/partitions) and all
# containers for MD superblocks. alternatively, specify devices to scan, using
# wildcards if desired.
#DEVICE partitions containers

# automatically tag new arrays as belonging to the local system
HOMEHOST <system>

# instruct the monitoring daemon where to send mail alerts
MAILADDR rj

# definitions of existing MD arrays
#ARRAY /dev/md/0  metadata=1.2 UUID=74a11272:9b233a5b:2506f763:27693ccc name=jackie:0

# This configuration was auto-generated on Wed, 27 Nov 2019 15:53:23 -0500 by mkconf
ARRAY /dev/md0 metadata=1.2 spares=1 name=jackie:0 UUID=74a11272:9b233a5b:2506f763:27693ccc


I assume that last line was added when I added the spare drive.  Should I add the drives to the mdadm.conf then run the assemble command you suggested?
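If so, I assume the relevant part of the file would end up looking roughly like this (using my drive letters; I haven't actually changed anything yet, so please correct me if this isn't what you meant):

DEVICE /dev/sda /dev/sdb /dev/sdc /dev/sde /dev/sdf /dev/sdg
ARRAY /dev/md0 metadata=1.2 spares=1 name=jackie:0 UUID=74a11272:9b233a5b:2506f763:27693ccc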

It's like mdadm was assembling them automatically upon bootup, but that stopped working with the new motherboard for some reason.

Thanks.
--RJ






On Tuesday, January 23, 2024 at 11:06:30 AM EST, David Niklas <simd@vfemail.net> wrote: 





Hello,

As someone who's a bit more experienced in RAID array failures, I'd like
to suggest the following:

# Check that all drives are being detected.
ls /dev/sd*

# Verify what exactly is being scanned.
grep DEVICE /etc/mdadm/mdadm.conf

Assuming both of these give satisfactory results*, your next step would
be to try assembling them out of order and see what happens. For example:

-> mdadm --assemble /dev/md0 /dev/sda /dev/sdb
Mdadm: Error Not part of array /dev/sdb
-> mdadm --assemble /dev/md0 /dev/sda /dev/sdc
Mdadm: Error too few drives to start array /dev/md0

Please note that I made up what mdadm is saying there. But it still tells
you what's going on.
* for the ls command you should see all the drives you have. For the grep
command you should get a listing like "/dev/sda /dev/sdb"... Obviously,
all the drives that might have a RAID array on them should be listed.


Sincerely,
David





On Tue, 23 Jan 2024 01:52:31 +0000 (UTC)
RJ Marquette <rjm1@yahoo.com> wrote:
> I meant to add that my /proc/mdstat looked much more like yours on the
> old system.  But nothing is showing on this one. 
> 
> I may try swapping back to the old motherboard.  Another possibility
> that might be factor - UEFI vs Legacy BIOS.
> 
> Thanks.
> --RJ
> 
> 
> On Monday, January 22, 2024 at 07:45:29 PM EST, RJ Marquette
> <rjm1@yahoo.com> wrote: 
> 
> 
> 
> 
> 
> That's all.  
> 
> If I run:
> 
> root@jackie:~# mdadm --assemble --scan
> mdadm: /dev/md0 assembled from 0 drives and 1 spare - not enough to
> start the array.
> 
> root@jackie:~# cat /proc/mdstat  
> Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5]
> [raid4] [raid10] unused devices: <none>
> 
> root@jackie:~# ls -l /dev/md*
> ls: cannot access '/dev/md*': No such file or directory
> 
> It seems to be recognizing the spare drive, but not the 5 that actually
> have data, for some reason.
> 
> Thanks.
> --RJ
> 
> 
> 
> 
> 
> 
> 
> 
> On Monday, January 22, 2024 at 06:49:50 PM EST, Reindl Harald
> <h.reindl@thelounge.net> wrote: 
> 
> 
> 
> 
> 
> 
> 
> Am 22.01.24 um 23:13 schrieb RJ Marquette:
> > Sorry!
> > 
> > rj@jackie:~$ cat /proc/mdstat
> > Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5]
> > [raid4] [raid10] unused devices: <none>  
> 
> that's all and where is the ton of raid-types coming from with no
> single array shown?
> 
> [root@srv-rhsoft:~]$ cat /proc/mdstat
> Personalities : [raid1]
> md0 : active raid1 sdb2[2] sda2[0]
>       30740480 blocks super 1.2 [2/2] [UU]
>       bitmap: 0/1 pages [0KB], 65536KB chunk
> 
> md1 : active raid1 sda3[0] sdb3[2]
>       3875717120 blocks super 1.2 [2/2] [UU]
>       bitmap: 5/29 pages [20KB], 65536KB chunk
> 
> 
> unused devices: <none>
> 
> > On Monday, January 22, 2024 at 04:55:50 PM EST, Reindl Harald
> > <h.reindl@thelounge.net> wrote:
> > 
> > a ton of "mdadm --examine" outputs but i can't see a
> > "cat /proc/mdstat"
> > 
> > /dev/sdX is completly irrelevant when it comes to raid - you can even
> > connect a random disk via USB adapter without a change from the view
> > of the array
> > 
> > Am 22.01.24 um 20:52 schrieb RJ Marquette:  
> >> Hi, all.  I have a Raid5 array with 5 disks in use and a 6th in
> >> reserve that I built using 3TB drives in 2019.  It has been running
> >> fine since, not even a single drive failure.  The system also has a
> >> 7th hard drive for OS, home directory, etc.  The motherboard had
> >> four SATA ports, so I added an adapter card that has 4 more ports,
> >> with three drives connected to it.  The server runs Debian that I
> >> keep relatively current.
> >>
> >> Yesterday, I swapped a newer motherboard into the computer (upgraded
> >> my desktop and moved the guts to my server).  I never disconnected
> >> the cables from the adapter card (whew, I think), so I know which
> >> four drives were connected to the motherboard.  Unfortunately I
> >> didn't really note how they were hooked to the motherboard (SATA1-4
> >> ports).  Didn't even think it would be an issue.  I'm reasonably
> >> confident the array drives on the motherboard were sda-sdc, but I'm
> >> not certain.
> >>
> >> Now I can't get the array to come up.  I'm reasonably certain I
> >> haven't done anything to write to the drives - but mdadm will not
> >> assemble the drives (I have not tried to force it).  I'm not
> >> entirely sure what's up and would really appreciate any help.
> >>
> >> I've tried various incantations of mdadm --assemble --scan, with no
> >> luck.  I've seen the posts about certain motherboards that can mess
> >> up the drives, and I'm hoping I'm not in that boat.  The "new"
> >> motherboard is a Asus Z96-K/CSM.
> >>
> >> I assume using --force is in my future...I see various pages that
> >> say use --force then check it, but will that damage it if I'm
> >> wrong?  If not, how will I know it's correct?  Is the order of
> >> drives important with --force?  I see conflicting info on that.
> >>
> >> I'm no expert but it looks like each drive has the mdadm
> >> superblock...so I'm not sure why it won't assemble.  Please help!
> >>
> >> Thanks in advance.
> >> --RJ
> >>
> >> root@jackie:~# uname -a
> >> Linux jackie 5.10.0-27-amd64 #1 SMP Debian 5.10.205-2 (2023-12-31)
> >> x86_64 GNU/Linux
> >>
> >> root@jackie:~# mdadm --version
> >> mdadm - v4.1 - 2018-10-01
> >>
> >> root@jackie:~# mdadm --examine /dev/sda
> >> /dev/sda:   MBR Magic : aa55
> >> Partition[0] :   4294967295 sectors at            1 (type ee)
> >>
> >> root@jackie:~# mdadm --examine /dev/sda1
> >> mdadm: No md superblock detected on /dev/sda1.
> >>
> >> root@jackie:~# mdadm --examine /dev/sdb
> >> /dev/sdb:   MBR Magic : aa55
> >> Partition[0] :   4294967295 sectors at            1 (type ee)
> >>
> >> root@jackie:~# mdadm --examine /dev/sdb1
> >> mdadm: No md superblock detected on /dev/sdb1.
> >>
> >> root@jackie:~# mdadm --examine /dev/sdc
> >> /dev/sdc:          Magic : a92b4efc        Version : 1.2
> >> Feature Map : 0x0
> >> Array UUID : 74a11272:9b233a5b:2506f763:27693ccc
> >> Name : jackie:0  (local to host jackie)
> >> Creation Time : Sat Dec  8 19:32:07 2018
> >> Raid Level : raid5
> >> Raid Devices : 5 Avail
> >> Dev Size : 5860271024 (2794.39 GiB 3000.46 GB)
> >> Array Size : 11720540160 (11177.58 GiB 12001.83 GB)
> >> Used Dev Size : 5860270080 (2794.39 GiB 3000.46 GB)
> >> Data Offset : 262144 sectors
> >> Super Offset : 8 sectors
> >> Unused Space : before=261864 sectors, after=944 sectors
> >> State : clean
> >> Device UUID : a2b677bb:4004d8fb:a298a923:bab4df8a
> >> Update Time : Fri Jan 19 15:25:37 2024
> >> Bad Block Log : 512 entries available at offset 264 sectors
> >> Checksum : 2487f053 - correct
> >> Events : 5958
> >> Layout : left-symmetric
> >> Chunk Size : 512K
> >> Device Role : spare
> >> Array State : AAAAA ('A' == active, '.' == missing, 'R' == replacing)
> >>
> >> root@jackie:~# mdadm --examine /dev/sdc1
> >> mdadm: cannot open /dev/sdc1: No such file or directory
> >>
> >> root@jackie:~# mdadm --examine /dev/sde
> >> /dev/sde:   MBR Magic : aa55
> >> Partition[0] :   4294967295 sectors at            1 (type ee)
> >>
> >> root@jackie:~# mdadm --examine /dev/sde1
> >> mdadm: No md superblock detected on /dev/sde1.
> >>
> >> root@jackie:~# mdadm --examine /dev/sdf
> >> /dev/sdf:   MBR Magic : aa55
> >> Partition[0] :   4294967295 sectors at            1 (type ee)
> >>
> >> root@jackie:~# mdadm --examine /dev/sdf1
> >> mdadm: No md superblock detected on /dev/sdf1.
> >>
> >> root@jackie:~# mdadm --examine /dev/sdg
> >> /dev/sdg:   MBR Magic : aa55
> >> Partition[0] :   4294967295 sectors at            1 (type ee)
> >>
> >> root@jackie:~# mdadm --examine /dev/sdg1
> >> mdadm: No md superblock detected on /dev/sdg1.
> >>
> >> root@jackie:~# lsdrv
> >> PCI [ahci] 00:1f.2 SATA controller: Intel Corporation 9 Series
> >> Chipset Family SATA Controller [AHCI Mode] ├scsi 0:0:0:0 ATA
> >>      ST3000VN007-2E41 {Z7317D1A} │└sda 2.73t [8:0] Partitioned (gpt)
> >> │ └sda1 2.73t [8:1] Empty/Unknown
> >> ├scsi 1:0:0:0 ATA      Hitachi HUS72403 {P8GSA1WR}
> >> │└sdb 2.73t [8:16] Partitioned (gpt)
> >> │ └sdb1 2.73t [8:17] Empty/Unknown
> >> ├scsi 2:0:0:0 ATA      Hitachi HUA72303 {MK0371YVGSZ9RA}
> >> │└sdc 2.73t [8:32] MD raid5 (5) inactive
> >> 'jackie:0' {74a11272-9b23-3a5b-2506-f76327693ccc} └scsi 3:0:0:0 ATA
> >>      ST32000542AS     {5XW110LY} └sdd 1.82t [8:48] Partitioned (dos)
> >> ├sdd1 23.28g [8:49] Partitioned (dos)
> >> {d94cc2c8-037a-49c5-8a1e-01bb47d78624} │└Mounted as /dev/sdd1 @ /
> >> ├sdd2 1.00k [8:50] Partitioned (dos)
> >> ├sdd5 9.31g [8:53] ext4 {6eb3b4d0-8c7f-4b06-a431-4c292d5bda86}
> >> │└Mounted as /dev/sdd5 @ /var
> >> ├sdd6 3.96g [8:54] swap {901cd56d-ef11-4866-824b-d9ec4ae6fe6e}
> >> ├sdd7 1.86g [8:55] ext4 {69ba0889-322b-4fc8-b9d3-a2d133c97e5e}
> >> │└Mounted as /dev/sdd7 @ /tmp
> >> └sdd8 1.78t [8:56] ext4 {4ed408d4-6b22-46e0-baed-2e0589ff41fb}
> >> └Mounted as /dev/sdd8 @ /home PCI [ahci]
> >>
> >> 06:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9215 PCIe
> >> 2.0 x1 4-port SATA 6 Gb/s Controller (rev 11) ├scsi 6:0:0:0 ATA
> >>      Hitachi HUS72403 {P8G84LEP} │└sde 2.73t [8:64] Partitioned (gpt)
> >> │ └sde1 2.73t [8:65] Empty/Unknown
> >> ├scsi 7:0:0:0 ATA      ST3000VN007-2E41 {Z7317D46}
> >> │└sdf 2.73t [8:80] Partitioned (gpt)
> >> │ └sdf1 2.73t [8:81] Empty/Unknown
> >> └scsi 8:0:0:0 ATA      ST3000VN007-2E41 {Z7317JTX}
> >> └sdg 2.73t [8:96] Partitioned (gpt)
> >> └sdg1 2.73t [8:97] Empty/Unknown
> >>
> >> root@jackie:~# cat /etc/mdadm/mdadm.conf
> >>     # This configuration was auto-generated on Wed, 27 Nov 2019
> >>15:53:23 -0500 by mkconf
> >> ARRAY /dev/md0 metadata=1.2 spares=1 name=jackie:0
> >> UUID=74a11272:9b233a5b:2506f763:27693ccc
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Requesting help recovering my array
  2024-01-23 16:16               ` RJ Marquette
@ 2024-01-23 22:50                 ` Sandro
  2024-01-24  0:59                   ` RJ Marquette
       [not found]                   ` <d051abe3-af97-47a4-a087-432c91beb57e@yahoo.com>
  2024-01-24  3:19                 ` David Niklas
  1 sibling, 2 replies; 40+ messages in thread
From: Sandro @ 2024-01-23 22:50 UTC (permalink / raw)
  To: RJ Marquette, linux-raid, David Niklas

On 23-01-2024 17:16, RJ Marquette wrote:
> It's like mdadm was assembling them automatically upon bootup, but that
> stopped working with the new motherboard for some reason.

Just a hunch, since you wrote that you updated your system as well:

https://bugzilla.redhat.com/show_bug.cgi?id=2249392

If that's affecting you, `blkid` will be missing some information 
required for RAID assembly during boot. On the other hand, I was able to 
assemble my RAID devices manually.
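
One read-only way to check whether that is what's happening: probe a couple of the member disks directly, e.g.

blkid -p /dev/sda
blkid -p /dev/sdc

(the device names are just taken from your earlier listing). If the probe doesn't report a linux_raid_member signature where `mdadm --examine` does, that points at the blkid/udev side rather than at the disks themselves.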

-- Sandro


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Requesting help recovering my array
  2024-01-23 22:50                 ` Sandro
@ 2024-01-24  0:59                   ` RJ Marquette
       [not found]                   ` <d051abe3-af97-47a4-a087-432c91beb57e@yahoo.com>
  1 sibling, 0 replies; 40+ messages in thread
From: RJ Marquette @ 2024-01-24  0:59 UTC (permalink / raw)
  To: linux-raid


That's an interesting theory. I did update Debian a few days ago and hadn't rebooted before the hardware upgrade. So it might not have anything to do with the hardware upgrade. That would make a lot more sense.


When you say manually, is that adding the devices to the conf file then running mdadm --assemble?


Thanks.

--RJ






On Tuesday, January 23, 2024 at 06:00:32 PM EST, Sandro <lists@penguinpee.nl> wrote: 





On 23-01-2024 17:16, RJ Marquette wrote:

> It's like mdadm was assembling them automatically upon bootup, but that
> stopped working with the new motherboard for some reason.


Just a hunch, since you wrote that you updated your system as well:

https://bugzilla.redhat.com/show_bug.cgi?id=2249392

If that's affecting you, `blkid` will be missing some information 
required for RAID assembly during boot. On the other hand, I was able to 
assemble my RAID devices manually.

-- Sandro



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Requesting help recovering my array
  2024-01-23 16:16               ` RJ Marquette
  2024-01-23 22:50                 ` Sandro
@ 2024-01-24  3:19                 ` David Niklas
  2024-01-24 12:17                   ` RJ Marquette
  1 sibling, 1 reply; 40+ messages in thread
From: David Niklas @ 2024-01-24  3:19 UTC (permalink / raw)
  To: linux-raid

Hello,
I, personally, just use the device and array lines. You're welcome to keep
tracking down why UUID detection doesn't work for you if you so choose.

Example:
DEVICE /dev/sda1 /dev/sdb1 ...
ARRAY /dev/md0 metadata=1.2 spares=1 name=jackie:0 /dev/sda1 /dev/sdb1 ...

If you just want the array to work (for now), then:
mdadm --assemble /dev/md0 /dev/sd{a,b,e,f,g}1
should do the trick.

One question though, why does sdc not have a partition table? I mean, it
doesn't really matter if you use a RAID array without one, but it stands
out from the rest of the info as erroneous. Sort of like either you
goofed (hopefully), or the drive isn't being detected/working properly.
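
If you want to dig into that, a harmless way to compare is to print the partition tables side by side, e.g.:

fdisk -l /dev/sda /dev/sdc

That should show at a glance whether sdc really has no table, as opposed to the GPT layout on the other disks.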

Sincerely,
David


PS: I'm subscribed to the list. No need to CC me.




On Tue, 23 Jan 2024 16:16:12 +0000 (UTC)
RJ Marquette <rjm1@yahoo.com> wrote:
> (Sorry if this came through twice without the mdadm.conf contents,
> somehow I accidentally hit send when I was trying to paste in.)
> 
> Thanks.  All drives in the system are being detected (/dev/sdd is my
> system drive - the rest are all of the array):
> 
> rj@jackie:~$ ls -l /dev/sd*
> brw-rw---- 1 root disk 8,  0 Jan 21 19:08 /dev/sda
> brw-rw---- 1 root disk 8,  1 Jan 21 19:08 /dev/sda1
> brw-rw---- 1 root disk 8, 16 Jan 21 19:08 /dev/sdb
> brw-rw---- 1 root disk 8, 17 Jan 21 19:08 /dev/sdb1
> brw-rw---- 1 root disk 8, 32 Jan 21 19:08 /dev/sdc
> brw-rw---- 1 root disk 8, 48 Jan 21 19:08 /dev/sdd
> brw-rw---- 1 root disk 8, 49 Jan 21 19:08 /dev/sdd1
> brw-rw---- 1 root disk 8, 50 Jan 21 19:08 /dev/sdd2
> brw-rw---- 1 root disk 8, 53 Jan 21 19:08 /dev/sdd5
> brw-rw---- 1 root disk 8, 54 Jan 21 19:08 /dev/sdd6
> brw-rw---- 1 root disk 8, 55 Jan 21 19:08 /dev/sdd7
> brw-rw---- 1 root disk 8, 56 Jan 21 19:08 /dev/sdd8
> brw-rw---- 1 root disk 8, 64 Jan 21 19:08 /dev/sde
> brw-rw---- 1 root disk 8, 65 Jan 21 19:08 /dev/sde1
> brw-rw---- 1 root disk 8, 80 Jan 21 19:08 /dev/sdf
> brw-rw---- 1 root disk 8, 81 Jan 21 19:08 /dev/sdf1
> brw-rw---- 1 root disk 8, 96 Jan 21 19:08 /dev/sdg
> brw-rw---- 1 root disk 8, 97 Jan 21 19:08 /dev/sdg1
> 
> 
> The devices are not listed in the mdadm.conf, nor were they ever.
> Here's everything (except the initial header comments about updating
> initramfs and all) from that file:
> 
> # by default (built-in), scan all partitions (/proc/partitions) and all
> # containers for MD superblocks. alternatively, specify devices to
> scan, using # wildcards if desired.
> #DEVICE partitions containers
> 
> # automatically tag new arrays as belonging to the local system
> HOMEHOST <system>
> 
> # instruct the monitoring daemon where to send mail alerts
> MAILADDR rj
> 
> # definitions of existing MD arrays
> #ARRAY /dev/md/0  metadata=1.2 UUID=74a11272:9b233a5b:2506f763:27693ccc
> name=jackie:0
> 
> # This configuration was auto-generated on Wed, 27 Nov 2019 15:53:23
> -0500 by mkconf 
> UUID=74a11272:9b233a5b:2506f763:27693ccc
> 
> 
> I assume that last line was added when I added the spare drive.  Should
> I add the drives to the mdadm.conf then run the assemble command you
> suggested?
> 
> It's like mdadm was assembling them automatically upon bootup, but that
> stopped working with the new motherboard for some reason.
> 
> Thanks.
> --RJ
> 
> 
> 
> 
> 
> 
> On Tuesday, January 23, 2024 at 11:06:30 AM EST, David Niklas
> <simd@vfemail.net> wrote: 
> 
> 
> 
> 
> 
> Hello,
> 
> As someone who's a bit more experienced in RAID array failures, I'd like
> to suggest the following:
> 
> # Check that all drives are being detected.
> ls /dev/sd*
> 
> # Verify what exactly is being scanned.
> grep DEVICE /etc/mdadm/mdadm.conf
> 
> Assuming both of these give satisfactory results*, your next step would
> be to try assembling them out of order and see what happens. For
> example:
> 
> -> mdadm --assemble /dev/md0 /dev/sda /dev/sdb  
> Mdadm: Error Not part of array /dev/sdb
> -> mdadm --assemble /dev/md0 /dev/sda /dev/sdc  
> Mdadm: Error too few drives to start array /dev/md0
> 
> Please note that I made up what mdadm is saying there. But it still
> tells you what's going on.
> * for the ls command you should see all the drives you have. For the
> grep command you should get a listing like "/dev/sda /dev/sdb"...
> Obviously, all the drives that might have a RAID array on them should
> be listed.
> 
> 
> Sincerely,
> David
> 
> 
> 
> 
> 
> On Tue, 23 Jan 2024 01:52:31 +0000 (UTC)
> RJ Marquette <rjm1@yahoo.com> wrote:
> > I meant to add that my /proc/mdstat looked much more like yours on the
> > old system.  But nothing is showing on this one. 
> > 
> > I may try swapping back to the old motherboard.  Another possibility
> > that might be factor - UEFI vs Legacy BIOS.
> > 
> > Thanks.
> > --RJ
> > 
> > 
> > On Monday, January 22, 2024 at 07:45:29 PM EST, RJ Marquette
> > <rjm1@yahoo.com> wrote: 
> > 
> > 
> > 
> > 
> > 
> > That's all.  
> > 
> > If I run:
> > 
> > root@jackie:~# mdadm --assemble --scan
> > mdadm: /dev/md0 assembled from 0 drives and 1 spare - not enough to
> > start the array.
> > 
> > root@jackie:~# cat /proc/mdstat  
> > Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5]
> > [raid4] [raid10] unused devices: <none>
> > 
> > root@jackie:~# ls -l /dev/md*
> > ls: cannot access '/dev/md*': No such file or directory
> > 
> > It seems to be recognizing the spare drive, but not the 5 that
> > actually have data, for some reason.
> > 
> > Thanks.
> > --RJ
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > On Monday, January 22, 2024 at 06:49:50 PM EST, Reindl Harald
> > <h.reindl@thelounge.net> wrote: 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > Am 22.01.24 um 23:13 schrieb RJ Marquette:  
> > > Sorry!
> > > 
> > > rj@jackie:~$ cat /proc/mdstat
> > > Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5]
> > > [raid4] [raid10] unused devices: <none>    
> > 
> > that's all and where is the ton of raid-types coming from with no
> > single array shown?
> > 
> > [root@srv-rhsoft:~]$ cat /proc/mdstat
> > Personalities : [raid1]
> > md0 : active raid1 sdb2[2] sda2[0]
> >       30740480 blocks super 1.2 [2/2] [UU]
> >       bitmap: 0/1 pages [0KB], 65536KB chunk
> > 
> > md1 : active raid1 sda3[0] sdb3[2]
> >       3875717120 blocks super 1.2 [2/2] [UU]
> >       bitmap: 5/29 pages [20KB], 65536KB chunk
> > 
> > 
> > unused devices: <none>
> >   
> > > On Monday, January 22, 2024 at 04:55:50 PM EST, Reindl Harald
> > > <h.reindl@thelounge.net> wrote:
> > > 
> > > a ton of "mdadm --examine" outputs but i can't see a
> > > "cat /proc/mdstat"
> > > 
> > > /dev/sdX is completly irrelevant when it comes to raid - you can
> > > even connect a random disk via USB adapter without a change from
> > > the view of the array
> > > 
> > > Am 22.01.24 um 20:52 schrieb RJ Marquette:    
> > >> Hi, all.  I have a Raid5 array with 5 disks in use and a 6th in
> > >> reserve that I built using 3TB drives in 2019.  It has been running
> > >> fine since, not even a single drive failure.  The system also has a
> > >> 7th hard drive for OS, home directory, etc.  The motherboard had
> > >> four SATA ports, so I added an adapter card that has 4 more ports,
> > >> with three drives connected to it.  The server runs Debian that I
> > >> keep relatively current.
> > >>
> > >> Yesterday, I swapped a newer motherboard into the computer
> > >> (upgraded my desktop and moved the guts to my server).  I never
> > >> disconnected the cables from the adapter card (whew, I think), so
> > >> I know which four drives were connected to the motherboard.
> > >> Unfortunately I didn't really note how they were hooked to the
> > >> motherboard (SATA1-4 ports).  Didn't even think it would be an
> > >> issue.  I'm reasonably confident the array drives on the
> > >> motherboard were sda-sdc, but I'm not certain.
> > >>
> > >> Now I can't get the array to come up.  I'm reasonably certain I
> > >> haven't done anything to write to the drives - but mdadm will not
> > >> assemble the drives (I have not tried to force it).  I'm not
> > >> entirely sure what's up and would really appreciate any help.
> > >>
> > >> I've tried various incantations of mdadm --assemble --scan, with no
> > >> luck.  I've seen the posts about certain motherboards that can mess
> > >> up the drives, and I'm hoping I'm not in that boat.  The "new"
> > >> motherboard is a Asus Z96-K/CSM.
> > >>
> > >> I assume using --force is in my future...I see various pages that
> > >> say use --force then check it, but will that damage it if I'm
> > >> wrong?  If not, how will I know it's correct?  Is the order of
> > >> drives important with --force?  I see conflicting info on that.
> > >>
> > >> I'm no expert but it looks like each drive has the mdadm
> > >> superblock...so I'm not sure why it won't assemble.  Please help!
> > >>
> > >> Thanks in advance.
> > >> --RJ
> > >>
> > >> root@jackie:~# uname -a
> > >> Linux jackie 5.10.0-27-amd64 #1 SMP Debian 5.10.205-2 (2023-12-31)
> > >> x86_64 GNU/Linux
> > >>
> > >> root@jackie:~# mdadm --version
> > >> mdadm - v4.1 - 2018-10-01
> > >>
> > >> root@jackie:~# mdadm --examine /dev/sda
> > >> /dev/sda:   MBR Magic : aa55
> > >> Partition[0] :   4294967295 sectors at            1 (type ee)
> > >>
> > >> root@jackie:~# mdadm --examine /dev/sda1
> > >> mdadm: No md superblock detected on /dev/sda1.
> > >>
> > >> root@jackie:~# mdadm --examine /dev/sdb
> > >> /dev/sdb:   MBR Magic : aa55
> > >> Partition[0] :   4294967295 sectors at            1 (type ee)
> > >>
> > >> root@jackie:~# mdadm --examine /dev/sdb1
> > >> mdadm: No md superblock detected on /dev/sdb1.
> > >>
> > >> root@jackie:~# mdadm --examine /dev/sdc
> > >> /dev/sdc:          Magic : a92b4efc        Version : 1.2
> > >> Feature Map : 0x0
> > >> Array UUID : 74a11272:9b233a5b:2506f763:27693ccc
> > >> Name : jackie:0  (local to host jackie)
> > >> Creation Time : Sat Dec  8 19:32:07 2018
> > >> Raid Level : raid5
> > >> Raid Devices : 5 Avail
> > >> Dev Size : 5860271024 (2794.39 GiB 3000.46 GB)
> > >> Array Size : 11720540160 (11177.58 GiB 12001.83 GB)
> > >> Used Dev Size : 5860270080 (2794.39 GiB 3000.46 GB)
> > >> Data Offset : 262144 sectors
> > >> Super Offset : 8 sectors
> > >> Unused Space : before=261864 sectors, after=944 sectors
> > >> State : clean
> > >> Device UUID : a2b677bb:4004d8fb:a298a923:bab4df8a
> > >> Update Time : Fri Jan 19 15:25:37 2024
> > >> Bad Block Log : 512 entries available at offset 264 sectors
> > >> Checksum : 2487f053 - correct
> > >> Events : 5958
> > >> Layout : left-symmetric
> > >> Chunk Size : 512K
> > >> Device Role : spare
> > >> Array State : AAAAA ('A' == active, '.' == missing, 'R' ==
> > >> replacing)
> > >>
> > >> root@jackie:~# mdadm --examine /dev/sdc1
> > >> mdadm: cannot open /dev/sdc1: No such file or directory
> > >>
> > >> root@jackie:~# mdadm --examine /dev/sde
> > >> /dev/sde:   MBR Magic : aa55
> > >> Partition[0] :   4294967295 sectors at            1 (type ee)
> > >>
> > >> root@jackie:~# mdadm --examine /dev/sde1
> > >> mdadm: No md superblock detected on /dev/sde1.
> > >>
> > >> root@jackie:~# mdadm --examine /dev/sdf
> > >> /dev/sdf:   MBR Magic : aa55
> > >> Partition[0] :   4294967295 sectors at            1 (type ee)
> > >>
> > >> root@jackie:~# mdadm --examine /dev/sdf1
> > >> mdadm: No md superblock detected on /dev/sdf1.
> > >>
> > >> root@jackie:~# mdadm --examine /dev/sdg
> > >> /dev/sdg:   MBR Magic : aa55
> > >> Partition[0] :   4294967295 sectors at            1 (type ee)
> > >>
> > >> root@jackie:~# mdadm --examine /dev/sdg1
> > >> mdadm: No md superblock detected on /dev/sdg1.
> > >>
> > >> root@jackie:~# lsdrv
> > >> PCI [ahci] 00:1f.2 SATA controller: Intel Corporation 9 Series
> > >> Chipset Family SATA Controller [AHCI Mode] ├scsi 0:0:0:0 ATA
> > >>      ST3000VN007-2E41 {Z7317D1A} │└sda 2.73t [8:0] Partitioned
> > >> (gpt) │ └sda1 2.73t [8:1] Empty/Unknown
> > >> ├scsi 1:0:0:0 ATA      Hitachi HUS72403 {P8GSA1WR}
> > >> │└sdb 2.73t [8:16] Partitioned (gpt)
> > >> │ └sdb1 2.73t [8:17] Empty/Unknown
> > >> ├scsi 2:0:0:0 ATA      Hitachi HUA72303 {MK0371YVGSZ9RA}
> > >> │└sdc 2.73t [8:32] MD raid5 (5) inactive
> > >> 'jackie:0' {74a11272-9b23-3a5b-2506-f76327693ccc} └scsi 3:0:0:0 ATA
> > >>      ST32000542AS     {5XW110LY} └sdd 1.82t [8:48] Partitioned
> > >> (dos) ├sdd1 23.28g [8:49] Partitioned (dos)
> > >> {d94cc2c8-037a-49c5-8a1e-01bb47d78624} │└Mounted as /dev/sdd1 @ /
> > >> ├sdd2 1.00k [8:50] Partitioned (dos)
> > >> ├sdd5 9.31g [8:53] ext4 {6eb3b4d0-8c7f-4b06-a431-4c292d5bda86}
> > >> │└Mounted as /dev/sdd5 @ /var
> > >> ├sdd6 3.96g [8:54] swap {901cd56d-ef11-4866-824b-d9ec4ae6fe6e}
> > >> ├sdd7 1.86g [8:55] ext4 {69ba0889-322b-4fc8-b9d3-a2d133c97e5e}
> > >> │└Mounted as /dev/sdd7 @ /tmp
> > >> └sdd8 1.78t [8:56] ext4 {4ed408d4-6b22-46e0-baed-2e0589ff41fb}
> > >> └Mounted as /dev/sdd8 @ /home PCI [ahci]
> > >>
> > >> 06:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9215
> > >> PCIe 2.0 x1 4-port SATA 6 Gb/s Controller (rev 11) ├scsi 6:0:0:0
> > >> ATA Hitachi HUS72403 {P8G84LEP} │└sde 2.73t [8:64] Partitioned
> > >> (gpt) │ └sde1 2.73t [8:65] Empty/Unknown
> > >> ├scsi 7:0:0:0 ATA      ST3000VN007-2E41 {Z7317D46}
> > >> │└sdf 2.73t [8:80] Partitioned (gpt)
> > >> │ └sdf1 2.73t [8:81] Empty/Unknown
> > >> └scsi 8:0:0:0 ATA      ST3000VN007-2E41 {Z7317JTX}
> > >> └sdg 2.73t [8:96] Partitioned (gpt)
> > >> └sdg1 2.73t [8:97] Empty/Unknown
> > >>
> > >> root@jackie:~# cat /etc/mdadm/mdadm.conf
> > >>     # This configuration was auto-generated on Wed, 27 Nov 2019
> > >>15:53:23 -0500 by mkconf
> > >> ARRAY /dev/md0 metadata=1.2 spares=1 name=jackie:0
> > >> UUID=74a11272:9b233a5b:2506f763:27693ccc
> >   
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Requesting help recovering my array
       [not found]                   ` <d051abe3-af97-47a4-a087-432c91beb57e@yahoo.com>
@ 2024-01-24  9:11                     ` Sandro
  0 siblings, 0 replies; 40+ messages in thread
From: Sandro @ 2024-01-24  9:11 UTC (permalink / raw)
  To: RJ Marquette; +Cc: linux-raid, David Niklas

On 24-01-2024 01:55, RJ Marquette wrote:
> When you say manually, it's that adding the devices to the conf file then running assemble?

My conf file already contained entries regarding the raids. By manually 
I meant running `mdadm --assemble`, either with `--scan [--no-degraded]` 
or, for more control, specifying the devices making up the array, as 
others have already suggested.
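
Purely as an illustration, with --verbose added so mdadm explains what it does with each device it looks at:

mdadm --assemble --verbose --scan --no-degraded

The per-device messages (wrong uuid, no superblock, and so on) are usually more telling than the one-line summary you got.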

-- Sandro


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Requesting help recovering my array
  2024-01-24  3:19                 ` David Niklas
@ 2024-01-24 12:17                   ` RJ Marquette
  2024-01-24 17:06                     ` Sandro
  0 siblings, 1 reply; 40+ messages in thread
From: RJ Marquette @ 2024-01-24 12:17 UTC (permalink / raw)
  To: linux-raid

Sigh.  My goal is to get the RAID working again; the blkid issue was mentioned as a potential source of the problem, so I looked into it.  However, my libblkid files haven't been updated since January 20, 2022, so I'm guessing that isn't the source of the issue.

When I try the command you suggested below, I get:
root@jackie:/etc/mdadm# mdadm --assemble /dev/md0 /dev/sd{a,b,e,f,g}1
mdadm: no recogniseable superblock on /dev/sda1
mdadm: /dev/sda1 has no superblock - assembly aborted


I'm not sure why sdc is set up differently.  I don't remember if I set that up at the same time as the rest of the array, or if I did it a few days later and perhaps did it differently.  Oddly enough it's the only one that actually seems to be reporting correctly...but based on the data it reports, it's the spare drive.

Thanks.
--RJ


On Tuesday, January 23, 2024 at 10:19:49 PM EST, David Niklas <simd@vfemail.net> wrote: 


Hello,
I, personally, just use the device and array lines. You're welcome to keep
tracking down why UUID detection doesn't work for you if you so choose.

Example:
DEVICE /dev/sda1 /dev/sdb1 ...
ARRAY /dev/md0 metadata=1.2 spares=1 name=jackie:0 /dev/sda1 /dev/sdb1 ...

If you just want the array to work (for now), then:
mdadm --assemble /dev/md0 /dev/sd{a,b,e,f,g}1
should do the trick.

One question though, why does sdc not have a partition table? I mean, it
doesn't really matter if you use a RAID array without one, but it stands
out from the rest of the info as erroneous. Sort of like either you
goofed (hopefully), or the drive isn't being detected/working properly.

Sincerely,
David


PS: I'm subscribed to the list. No need to CC me.




On Tue, 23 Jan 2024 16:16:12 +0000 (UTC)
RJ Marquette <rjm1@yahoo.com> wrote:
> (Sorry if this came through twice without the mdadm.conf contents,
> somehow I accidentally hit send when I was trying to paste in.)
> 
> Thanks.  All drives in the system are being detected (/dev/sdd is my
> system drive - the rest are all of the array):
> 
> rj@jackie:~$ ls -l /dev/sd*
> brw-rw---- 1 root disk 8,  0 Jan 21 19:08 /dev/sda
> brw-rw---- 1 root disk 8,  1 Jan 21 19:08 /dev/sda1
> brw-rw---- 1 root disk 8, 16 Jan 21 19:08 /dev/sdb
> brw-rw---- 1 root disk 8, 17 Jan 21 19:08 /dev/sdb1
> brw-rw---- 1 root disk 8, 32 Jan 21 19:08 /dev/sdc
> brw-rw---- 1 root disk 8, 48 Jan 21 19:08 /dev/sdd
> brw-rw---- 1 root disk 8, 49 Jan 21 19:08 /dev/sdd1
> brw-rw---- 1 root disk 8, 50 Jan 21 19:08 /dev/sdd2
> brw-rw---- 1 root disk 8, 53 Jan 21 19:08 /dev/sdd5
> brw-rw---- 1 root disk 8, 54 Jan 21 19:08 /dev/sdd6
> brw-rw---- 1 root disk 8, 55 Jan 21 19:08 /dev/sdd7
> brw-rw---- 1 root disk 8, 56 Jan 21 19:08 /dev/sdd8
> brw-rw---- 1 root disk 8, 64 Jan 21 19:08 /dev/sde
> brw-rw---- 1 root disk 8, 65 Jan 21 19:08 /dev/sde1
> brw-rw---- 1 root disk 8, 80 Jan 21 19:08 /dev/sdf
> brw-rw---- 1 root disk 8, 81 Jan 21 19:08 /dev/sdf1
> brw-rw---- 1 root disk 8, 96 Jan 21 19:08 /dev/sdg
> brw-rw---- 1 root disk 8, 97 Jan 21 19:08 /dev/sdg1
> 
> 
> The devices are not listed in the mdadm.conf, nor were they ever.
> Here's everything (except the initial header comments about updating
> initramfs and all) from that file:
> 
> # by default (built-in), scan all partitions (/proc/partitions) and all
> # containers for MD superblocks. alternatively, specify devices to
> scan, using # wildcards if desired.
> #DEVICE partitions containers
> 
> # automatically tag new arrays as belonging to the local system
> HOMEHOST <system>
> 
> # instruct the monitoring daemon where to send mail alerts
> MAILADDR rj
> 
> # definitions of existing MD arrays
> #ARRAY /dev/md/0  metadata=1.2 UUID=74a11272:9b233a5b:2506f763:27693ccc
> name=jackie:0
> 
> # This configuration was auto-generated on Wed, 27 Nov 2019 15:53:23
> -0500 by mkconf 
> UUID=74a11272:9b233a5b:2506f763:27693ccc
> 
> 
> I assume that last line was added when I added the spare drive.  Should
> I add the drives to the mdadm.conf then run the assemble command you
> suggested?
> 
> It's like mdadm was assembling them automatically upon bootup, but that
> stopped working with the new motherboard for some reason.
> 
> Thanks.
> --RJ
> 
> 
> 
> 
> 
> 
> On Tuesday, January 23, 2024 at 11:06:30 AM EST, David Niklas
> <simd@vfemail.net> wrote: 
> 
> 
> 
> 
> 
> Hello,
> 
> As someone who's a bit more experienced in RAID array failures, I'd like
> to suggest the following:
> 
> # Check that all drives are being detected.
> ls /dev/sd*
> 
> # Verify what exactly is being scanned.
> grep DEVICE /etc/mdadm/mdadm.conf
> 
> Assuming both of these give satisfactory results*, your next step would
> be to try assembling them out of order and see what happens. For
> example:
> 
> -> mdadm --assemble /dev/md0 /dev/sda /dev/sdb  
> Mdadm: Error Not part of array /dev/sdb
> -> mdadm --assemble /dev/md0 /dev/sda /dev/sdc  
> Mdadm: Error too few drives to start array /dev/md0
> 
> Please note that I made up what mdadm is saying there. But it still
> tells you what's going on.
> * for the ls command you should see all the drives you have. For the
> grep command you should get a listing like "/dev/sda /dev/sdb"...
> Obviously, all the drives that might have a RAID array on them should
> be listed.
> 
> 
> Sincerely,
> David
> 
> 
> 
> 
> 
> On Tue, 23 Jan 2024 01:52:31 +0000 (UTC)
> RJ Marquette <rjm1@yahoo.com> wrote:
> > I meant to add that my /proc/mdstat looked much more like yours on the
> > old system.  But nothing is showing on this one. 
> > 
> > I may try swapping back to the old motherboard.  Another possibility
> > that might be factor - UEFI vs Legacy BIOS.
> > 
> > Thanks.
> > --RJ
> > 
> > 
> > On Monday, January 22, 2024 at 07:45:29 PM EST, RJ Marquette
> > <rjm1@yahoo.com> wrote: 
> > 
> > 
> > 
> > 
> > 
> > That's all.  
> > 
> > If I run:
> > 
> > root@jackie:~# mdadm --assemble --scan
> > mdadm: /dev/md0 assembled from 0 drives and 1 spare - not enough to
> > start the array.
> > 
> > root@jackie:~# cat /proc/mdstat  
> > Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5]
> > [raid4] [raid10] unused devices: <none>
> > 
> > root@jackie:~# ls -l /dev/md*
> > ls: cannot access '/dev/md*': No such file or directory
> > 
> > It seems to be recognizing the spare drive, but not the 5 that
> > actually have data, for some reason.
> > 
> > Thanks.
> > --RJ
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > On Monday, January 22, 2024 at 06:49:50 PM EST, Reindl Harald
> > <h.reindl@thelounge.net> wrote: 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > Am 22.01.24 um 23:13 schrieb RJ Marquette:  
> > > Sorry!
> > > 
> > > rj@jackie:~$ cat /proc/mdstat
> > > Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5]
> > > [raid4] [raid10] unused devices: <none>   
> > 
> > that's all and where is the ton of raid-types coming from with no
> > single array shown?
> > 
> > [root@srv-rhsoft:~]$ cat /proc/mdstat
> > Personalities : [raid1]
> > md0 : active raid1 sdb2[2] sda2[0]
> >       30740480 blocks super 1.2 [2/2] [UU]
> >       bitmap: 0/1 pages [0KB], 65536KB chunk
> > 
> > md1 : active raid1 sda3[0] sdb3[2]
> >       3875717120 blocks super 1.2 [2/2] [UU]
> >       bitmap: 5/29 pages [20KB], 65536KB chunk
> > 
> > 
> > unused devices: <none>
> >  
> > > On Monday, January 22, 2024 at 04:55:50 PM EST, Reindl Harald
> > > <h.reindl@thelounge.net> wrote:
> > > 
> > > a ton of "mdadm --examine" outputs but i can't see a
> > > "cat /proc/mdstat"
> > > 
> > > /dev/sdX is completly irrelevant when it comes to raid - you can
> > > even connect a random disk via USB adapter without a change from
> > > the view of the array
> > > 
> > > Am 22.01.24 um 20:52 schrieb RJ Marquette:   
> > >> Hi, all.  I have a Raid5 array with 5 disks in use and a 6th in
> > >> reserve that I built using 3TB drives in 2019.  It has been running
> > >> fine since, not even a single drive failure.  The system also has a
> > >> 7th hard drive for OS, home directory, etc.  The motherboard had
> > >> four SATA ports, so I added an adapter card that has 4 more ports,
> > >> with three drives connected to it.  The server runs Debian that I
> > >> keep relatively current.
> > >>
> > >> Yesterday, I swapped a newer motherboard into the computer
> > >> (upgraded my desktop and moved the guts to my server).  I never
> > >> disconnected the cables from the adapter card (whew, I think), so
> > >> I know which four drives were connected to the motherboard.
> > >> Unfortunately I didn't really note how they were hooked to the
> > >> motherboard (SATA1-4 ports).  Didn't even think it would be an
> > >> issue.  I'm reasonably confident the array drives on the
> > >> motherboard were sda-sdc, but I'm not certain.
> > >>
> > >> Now I can't get the array to come up.  I'm reasonably certain I
> > >> haven't done anything to write to the drives - but mdadm will not
> > >> assemble the drives (I have not tried to force it).  I'm not
> > >> entirely sure what's up and would really appreciate any help.
> > >>
> > >> I've tried various incantations of mdadm --assemble --scan, with no
> > >> luck.  I've seen the posts about certain motherboards that can mess
> > >> up the drives, and I'm hoping I'm not in that boat.  The "new"
> > >> motherboard is a Asus Z96-K/CSM.
> > >>
> > >> I assume using --force is in my future...I see various pages that
> > >> say use --force then check it, but will that damage it if I'm
> > >> wrong?  If not, how will I know it's correct?  Is the order of
> > >> drives important with --force?  I see conflicting info on that.
> > >>
> > >> I'm no expert but it looks like each drive has the mdadm
> > >> superblock...so I'm not sure why it won't assemble.  Please help!
> > >>
> > >> Thanks in advance.
> > >> --RJ
> > >>
> > >> root@jackie:~# uname -a
> > >> Linux jackie 5.10.0-27-amd64 #1 SMP Debian 5.10.205-2 (2023-12-31)
> > >> x86_64 GNU/Linux
> > >>
> > >> root@jackie:~# mdadm --version
> > >> mdadm - v4.1 - 2018-10-01
> > >>
> > >> root@jackie:~# mdadm --examine /dev/sda
> > >> /dev/sda:   MBR Magic : aa55
> > >> Partition[0] :   4294967295 sectors at            1 (type ee)
> > >>
> > >> root@jackie:~# mdadm --examine /dev/sda1
> > >> mdadm: No md superblock detected on /dev/sda1.
> > >>
> > >> root@jackie:~# mdadm --examine /dev/sdb
> > >> /dev/sdb:   MBR Magic : aa55
> > >> Partition[0] :   4294967295 sectors at            1 (type ee)
> > >>
> > >> root@jackie:~# mdadm --examine /dev/sdb1
> > >> mdadm: No md superblock detected on /dev/sdb1.
> > >>
> > >> root@jackie:~# mdadm --examine /dev/sdc
> > >> /dev/sdc:          Magic : a92b4efc        Version : 1.2
> > >> Feature Map : 0x0
> > >> Array UUID : 74a11272:9b233a5b:2506f763:27693ccc
> > >> Name : jackie:0  (local to host jackie)
> > >> Creation Time : Sat Dec  8 19:32:07 2018
> > >> Raid Level : raid5
> > >> Raid Devices : 5 Avail
> > >> Dev Size : 5860271024 (2794.39 GiB 3000.46 GB)
> > >> Array Size : 11720540160 (11177.58 GiB 12001.83 GB)
> > >> Used Dev Size : 5860270080 (2794.39 GiB 3000.46 GB)
> > >> Data Offset : 262144 sectors
> > >> Super Offset : 8 sectors
> > >> Unused Space : before=261864 sectors, after=944 sectors
> > >> State : clean
> > >> Device UUID : a2b677bb:4004d8fb:a298a923:bab4df8a
> > >> Update Time : Fri Jan 19 15:25:37 2024
> > >> Bad Block Log : 512 entries available at offset 264 sectors
> > >> Checksum : 2487f053 - correct
> > >> Events : 5958
> > >> Layout : left-symmetric
> > >> Chunk Size : 512K
> > >> Device Role : spare
> > >> Array State : AAAAA ('A' == active, '.' == missing, 'R' ==
> > >> replacing)
> > >>
> > >> root@jackie:~# mdadm --examine /dev/sdc1
> > >> mdadm: cannot open /dev/sdc1: No such file or directory
> > >>
> > >> root@jackie:~# mdadm --examine /dev/sde
> > >> /dev/sde:   MBR Magic : aa55
> > >> Partition[0] :   4294967295 sectors at            1 (type ee)
> > >>
> > >> root@jackie:~# mdadm --examine /dev/sde1
> > >> mdadm: No md superblock detected on /dev/sde1.
> > >>
> > >> root@jackie:~# mdadm --examine /dev/sdf
> > >> /dev/sdf:   MBR Magic : aa55
> > >> Partition[0] :   4294967295 sectors at            1 (type ee)
> > >>
> > >> root@jackie:~# mdadm --examine /dev/sdf1
> > >> mdadm: No md superblock detected on /dev/sdf1.
> > >>
> > >> root@jackie:~# mdadm --examine /dev/sdg
> > >> /dev/sdg:   MBR Magic : aa55
> > >> Partition[0] :   4294967295 sectors at            1 (type ee)
> > >>
> > >> root@jackie:~# mdadm --examine /dev/sdg1
> > >> mdadm: No md superblock detected on /dev/sdg1.
> > >>
> > >> root@jackie:~# lsdrv
> > >> PCI [ahci] 00:1f.2 SATA controller: Intel Corporation 9 Series
> > >> Chipset Family SATA Controller [AHCI Mode] ├scsi 0:0:0:0 ATA
> > >>      ST3000VN007-2E41 {Z7317D1A} │└sda 2.73t [8:0] Partitioned
> > >> (gpt) │ └sda1 2.73t [8:1] Empty/Unknown
> > >> ├scsi 1:0:0:0 ATA      Hitachi HUS72403 {P8GSA1WR}
> > >> │└sdb 2.73t [8:16] Partitioned (gpt)
> > >> │ └sdb1 2.73t [8:17] Empty/Unknown
> > >> ├scsi 2:0:0:0 ATA      Hitachi HUA72303 {MK0371YVGSZ9RA}
> > >> │└sdc 2.73t [8:32] MD raid5 (5) inactive
> > >> 'jackie:0' {74a11272-9b23-3a5b-2506-f76327693ccc} └scsi 3:0:0:0 ATA
> > >>      ST32000542AS     {5XW110LY} └sdd 1.82t [8:48] Partitioned
> > >> (dos) ├sdd1 23.28g [8:49] Partitioned (dos)
> > >> {d94cc2c8-037a-49c5-8a1e-01bb47d78624} │└Mounted as /dev/sdd1 @ /
> > >> ├sdd2 1.00k [8:50] Partitioned (dos)
> > >> ├sdd5 9.31g [8:53] ext4 {6eb3b4d0-8c7f-4b06-a431-4c292d5bda86}
> > >> │└Mounted as /dev/sdd5 @ /var
> > >> ├sdd6 3.96g [8:54] swap {901cd56d-ef11-4866-824b-d9ec4ae6fe6e}
> > >> ├sdd7 1.86g [8:55] ext4 {69ba0889-322b-4fc8-b9d3-a2d133c97e5e}
> > >> │└Mounted as /dev/sdd7 @ /tmp
> > >> └sdd8 1.78t [8:56] ext4 {4ed408d4-6b22-46e0-baed-2e0589ff41fb}
> > >> └Mounted as /dev/sdd8 @ /home PCI [ahci]
> > >>
> > >> 06:00.0 SATA controller: Marvell Technology Group Ltd. 88SE9215
> > >> PCIe 2.0 x1 4-port SATA 6 Gb/s Controller (rev 11) ├scsi 6:0:0:0
> > >> ATA Hitachi HUS72403 {P8G84LEP} │└sde 2.73t [8:64] Partitioned
> > >> (gpt) │ └sde1 2.73t [8:65] Empty/Unknown
> > >> ├scsi 7:0:0:0 ATA      ST3000VN007-2E41 {Z7317D46}
> > >> │└sdf 2.73t [8:80] Partitioned (gpt)
> > >> │ └sdf1 2.73t [8:81] Empty/Unknown
> > >> └scsi 8:0:0:0 ATA      ST3000VN007-2E41 {Z7317JTX}
> > >> └sdg 2.73t [8:96] Partitioned (gpt)
> > >> └sdg1 2.73t [8:97] Empty/Unknown
> > >>
> > >> root@jackie:~# cat /etc/mdadm/mdadm.conf
> > >>     # This configuration was auto-generated on Wed, 27 Nov 2019
> > >>15:53:23 -0500 by mkconf
> > >> ARRAY /dev/md0 metadata=1.2 spares=1 name=jackie:0
> > >> UUID=74a11272:9b233a5b:2506f763:27693ccc
> >  
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Requesting help recovering my array
  2024-01-24 12:17                   ` RJ Marquette
@ 2024-01-24 17:06                     ` Sandro
  2024-01-24 18:06                       ` RJ Marquette
  0 siblings, 1 reply; 40+ messages in thread
From: Sandro @ 2024-01-24 17:06 UTC (permalink / raw)
  To: RJ Marquette, linux-raid

On 24-01-2024 13:17, RJ Marquette wrote:
> When I try the command you suggested below, I get:
> root@jackie:/etc/mdadm# mdadm --assemble /dev/md0 /dev/sd{a,b,e,f,g}1
> mdadm: no recogniseable superblock on /dev/sda1
> mdadm: /dev/sda1 has no superblock - assembly aborted

Try `mdadm --examine` on every partition / drive that is giving you 
trouble. Maybe you are remembering things wrong and the raid device is 
/dev/sda and not /dev/sda1.

You can also go through the entire list (/dev/sd*), you posted earlier. 
There's no harm in running the command. It will look for the superblock 
and tell you what has been found. This could provide the information you 
need to assemble the array.
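
Something like the following would sweep the whole list in one go (mdadm --examine only reads metadata, so it is safe to run):

for d in /dev/sd*; do echo "== $d =="; mdadm --examine "$d"; done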

Alternatively, leave sda1 out of the assembly and see if mdadm will be 
able to partially assemble the array.

-- Sandro


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Requesting help recovering my array
  2024-01-24 17:06                     ` Sandro
@ 2024-01-24 18:06                       ` RJ Marquette
  2024-01-24 21:20                         ` Roger Heflin
  0 siblings, 1 reply; 40+ messages in thread
From: RJ Marquette @ 2024-01-24 18:06 UTC (permalink / raw)
  To: linux-raid

Other than sdc (as you noted), the other array drives come back like this:

root@jackie:/etc/mdadm# mdadm --examine /dev/sda
/dev/sda:
  MBR Magic : aa55
Partition[0] :   4294967295 sectors at            1 (type ee)

root@jackie:/etc/mdadm# mdadm --examine /dev/sda1
mdadm: No md superblock detected on /dev/sda1.


Trying your other suggestion:
root@jackie:/etc/mdadm# mdadm --assemble /dev/md0 /dev/sdb1 /dev/sde1 /dev/sdf1 /dev/sdg1
mdadm: no recogniseable superblock on /dev/sdb1
mdadm: /dev/sdb1 has no superblock - assembly aborted

root@jackie:/etc/mdadm# mdadm --assemble /dev/md0 /dev/sdb /dev/sde /dev/sdf /dev/sdg
mdadm: Cannot assemble mbr metadata on /dev/sdb
mdadm: /dev/sdb has no superblock - assembly aborted


Basically I've tried everything here:  https://raid.wiki.kernel.org/index.php/Linux_Raid

The impression I'm getting here is that we aren't really sure what the issue is.  I think tonight I'll play with some of the BIOS settings and see if there's something in there.  If not I'll swap back to the old motherboard and see what happens.

Thanks.
--RJ





On Wednesday, January 24, 2024 at 12:06:26 PM EST, Sandro <lists@penguinpee.nl> wrote: 





On 24-01-2024 13:17, RJ Marquette wrote:

> When I try the command you suggested below, I get:
> root@jackie:/etc/mdadm# mdadm --assemble /dev/md0 /dev/sd{a,b,e,f,g}1
> mdadm: no recogniseable superblock on /dev/sda1
> mdadm: /dev/sda1 has no superblock - assembly aborted


Try `mdadm --examine` on every partition / drive that is giving you 
trouble. Maybe you are remembering things wrong and the raid device is 
/dev/sda and not /dev/sda1.

You can also go through the entire list (/dev/sd*), you posted earlier. 
There's no harm in running the command. It will look for the superblock 
and tell you what has been found. This could provide the information you 
need to assemble the array.

Alternatively, leave sda1 out of the assembly and see if mdadm will be 
able to partially assemble the array.

-- Sandro

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Requesting help recovering my array
  2024-01-24 18:06                       ` RJ Marquette
@ 2024-01-24 21:20                         ` Roger Heflin
  2024-01-24 21:31                           ` RJ Marquette
  0 siblings, 1 reply; 40+ messages in thread
From: Roger Heflin @ 2024-01-24 21:20 UTC (permalink / raw)
  To: RJ Marquette; +Cc: linux-raid

Are you sure you did not partition devices that did not previously
have partition tables?

Partition tables will typically cause the underlying device (sda) to be
ignored by all of the tools, since a device with a partition table should
never have anything else (except the partition table itself) directly on it.

I have had to remove incorrectly added partition tables/blocks to make
lvm and other tools see the data again.  Otherwise the tools ignore
it.
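
For what it's worth, you can list whatever signatures are sitting on a
disk without writing anything (sdb here is just an example):

wipefs /dev/sdb     # with no options wipefs only lists signatures, it does not erase
blkid -p /dev/sdb   # low-level probe; PTTYPE="gpt" means a partition table is present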

On Wed, Jan 24, 2024 at 12:06 PM RJ Marquette <rjm1@yahoo.com> wrote:
>
> Other than sdc (as you noted), the other array drives come back like this:
>
> root@jackie:/etc/mdadm# mdadm --examine /dev/sda
> /dev/sda:
>   MBR Magic : aa55
> Partition[0] :   4294967295 sectors at            1 (type ee)
>
> root@jackie:/etc/mdadm# mdadm --examine /dev/sda1
> mdadm: No md superblock detected on /dev/sda1.
>
>
> Trying your other suggestion:
> root@jackie:/etc/mdadm# mdadm --assemble /dev/md0 /dev/sdb1 /dev/sde1 /dev/sdf1 /dev/sdg1
> mdadm: no recogniseable superblock on /dev/sdb1
> mdadm: /dev/sdb1 has no superblock - assembly aborted
>
> root@jackie:/etc/mdadm# mdadm --assemble /dev/md0 /dev/sdb /dev/sde /dev/sdf /dev/sdg
> mdadm: Cannot assemble mbr metadata on /dev/sdb
> mdadm: /dev/sdb has no superblock - assembly aborted
>
>
> Basically I've tried everything here:  https://raid.wiki.kernel.org/index.php/Linux_Raid
>
> The impression I'm getting here is that we aren't really sure what the issue is.  I think tonight I'll play with some of the BIOS settings and see if there's something in there.  If not I'll swap back to the old motherboard and see what happens.
>
> Thanks.
> --RJ
>
>
>
>
>
> On Wednesday, January 24, 2024 at 12:06:26 PM EST, Sandro <lists@penguinpee.nl> wrote:
>
>
>
>
>
> On 24-01-2024 13:17, RJ Marquette wrote:
>
> > When I try the command you suggested below, I get:
> > root@jackie:/etc/mdadm# mdadm --assemble /dev/md0 /dev/sd{a,b,e,f,g}1
> > mdadm: no recogniseable superblock on /dev/sda1
> > mdadm: /dev/sda1 has no superblock - assembly aborted
>
>
> Try `mdadm --examine` on every partition / drive that is giving you
> trouble. Maybe you are remembering things wrong and the raid device is
> /dev/sda and not /dev/sda1.
>
> You can also go through the entire list (/dev/sd*), you posted earlier.
> There's no harm in running the command. It will look for the superblock
> and tell you what has been found. This could provide the information you
> need to assemble the array.
>
> Alternatively, leave sda1 out of the assembly and see if mdadm will be
> able to partially assemble the array.
>
> -- Sandro
>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Requesting help recovering my array
  2024-01-24 21:20                         ` Roger Heflin
@ 2024-01-24 21:31                           ` RJ Marquette
  2024-01-24 21:44                             ` Roger Heflin
  0 siblings, 1 reply; 40+ messages in thread
From: RJ Marquette @ 2024-01-24 21:31 UTC (permalink / raw)
  To: linux-raid

I didn't touch the drives.  I shut down the computer with everything working fine, swapped motherboards, booted the new board, and discovered this problem immediately when the computer failed to boot because the array wasn't up and running.  I definitely haven't run fdisk or other disk partitioning programs on them.

Other than the modifications to the mdadm.conf to describe the drives and partitions (none of which have made any difference), I modified my fstab to comment out the raid array so the computer would boot normally.  I've been trying to figure out what is going on ever since.  I've tried to avoid doing anything that might write to the drives.  

I thought this upgrade would take an hour or two to swap hardware, not days of troubleshooting.  That was the advantage of software RAID, I thought.

Thanks.
--RJ




On Wednesday, January 24, 2024 at 04:20:51 PM EST, Roger Heflin <rogerheflin@gmail.com> wrote: 





Are you sure you did not partition devices that did not previously
have partition tables?

Partition tables will typically cause the under device (sda) to be
ignored by all of tools since it should never having something else
(except the partition table) on it.

I have had to remove incorrectly added partition tables/blocks to make
lvm and other tools again see the data.  Otherwise the tools ignore
it.

On Wed, Jan 24, 2024 at 12:06 PM RJ Marquette <rjm1@yahoo.com> wrote:
>
> Other than sdc (as you noted), the other array drives come back like this:
>
> root@jackie:/etc/mdadm# mdadm --examine /dev/sda
> /dev/sda:
>  MBR Magic : aa55
> Partition[0] :  4294967295 sectors at            1 (type ee)
>
> root@jackie:/etc/mdadm# mdadm --examine /dev/sda1
> mdadm: No md superblock detected on /dev/sda1.
>
>
> Trying your other suggestion:
> root@jackie:/etc/mdadm# mdadm --assemble /dev/md0 /dev/sdb1 /dev/sde1 /dev/sdf1 /dev/sdg1
> mdadm: no recogniseable superblock on /dev/sdb1
> mdadm: /dev/sdb1 has no superblock - assembly aborted
>
> root@jackie:/etc/mdadm# mdadm --assemble /dev/md0 /dev/sdb /dev/sde /dev/sdf /dev/sdg
> mdadm: Cannot assemble mbr metadata on /dev/sdb
> mdadm: /dev/sdb has no superblock - assembly aborted
>
>
> Basically I've tried everything here:  https://raid.wiki.kernel.org/index.php/Linux_Raid
>
> The impression I'm getting here is that we aren't really sure what the issue is.  I think tonight I'll play with some of the BIOS settings and see if there's something in there.  If not I'll swap back to the old motherboard and see what happens.
>
> Thanks.
> --RJ
>
>
>
>
>
> On Wednesday, January 24, 2024 at 12:06:26 PM EST, Sandro <lists@penguinpee.nl> wrote:
>
>
>
>
>
> On 24-01-2024 13:17, RJ Marquette wrote:
>
> > When I try the command you suggested below, I get:
> > root@jackie:/etc/mdadm# mdadm --assemble /dev/md0 /dev/sd{a,b,e,f,g}1
> > mdadm: no recogniseable superblock on /dev/sda1
> > mdadm: /dev/sda1 has no superblock - assembly aborted
>
>
> Try `mdadm --examine` on every partition / drive that is giving you
> trouble. Maybe you are remembering things wrong and the raid device is
> /dev/sda and not /dev/sda1.
>
> You can also go through the entire list (/dev/sd*), you posted earlier.
> There's no harm in running the command. It will look for the superblock
> and tell you what has been found. This could provide the information you
> need to assemble the array.
>
> Alternatively, leave sda1 out of the assembly and see if mdadm will be
> able to partially assemble the array.
>
> -- Sandro
>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Requesting help recovering my array
  2024-01-24 21:31                           ` RJ Marquette
@ 2024-01-24 21:44                             ` Roger Heflin
  2024-01-24 22:21                               ` Robin Hill
  2024-01-25  1:13                               ` RJ Marquette
  0 siblings, 2 replies; 40+ messages in thread
From: Roger Heflin @ 2024-01-24 21:44 UTC (permalink / raw)
  To: RJ Marquette; +Cc: linux-raid

Well, if you have a /dev/sdb1 device and you think the mdadm device is
/dev/sdb (not sdb1), then SOMEONE added a partition table at some point
in time, or you are confused about which your mdadm device is.  If sdb is
an mdadm device and it has a partition table, then mdadm --examine may
see the partition table, report that, and STOP reporting anything else.

And note that the partition table could have been added at any point
since the prior reboot.  I have found (and fixed) ones that were added
years earlier and only showed up on the next reboot, a year or two later,
in a situation much like this one.  On my own machines, before a
hardware/motherboard upgrade I do a reboot first to make sure the system
comes back cleanly, since all sorts of things can go wrong (e.g.
initramfs/kernel changes causing a general failure to boot).
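
A quick way to see at a glance which disks now carry a partition table
and which still look like raw md members (device range assumed):

lsblk -o NAME,SIZE,FSTYPE,PTTYPE /dev/sd[a-g]
# FSTYPE shows linux_raid_member where a superblock is still visible;
# PTTYPE shows gpt or dos where a partition table has appeared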

On Wed, Jan 24, 2024 at 3:31 PM RJ Marquette <rjm1@yahoo.com> wrote:
>
> I didn't touch the drives.  I shut down the computer with everything working fine, swapped motherboards, booted the new board, and discovered this problem immediately when the computer failed to boot because the array wasn't up and running.  I definitely haven't run fdisk or other disk partitioning programs on them.
>
> Other than the modifications to the mdadm.conf to describe the drives and partitions (none of which have made any difference), I modified my fstab to comment out the raid array so the computer would boot normally.  I've been trying to figure out what is going on ever since.  I've tried to avoid doing anything that might write to the drives.
>
> I thought this upgrade would take an hour or two to swap hardware, not days of troubleshooting.  That was the advantage of software RAID, I thought.
>
> Thanks.
> --RJ
>
>
>
>
> On Wednesday, January 24, 2024 at 04:20:51 PM EST, Roger Heflin <rogerheflin@gmail.com> wrote:
>
>
>
>
>
> Are you sure you did not partition devices that did not previously
> have partition tables?
>
> Partition tables will typically cause the under device (sda) to be
> ignored by all of tools since it should never having something else
> (except the partition table) on it.
>
> I have had to remove incorrectly added partition tables/blocks to make
> lvm and other tools again see the data.  Otherwise the tools ignore
> it.
>
> On Wed, Jan 24, 2024 at 12:06 PM RJ Marquette <rjm1@yahoo.com> wrote:
> >
> > Other than sdc (as you noted), the other array drives come back like this:
> >
> > root@jackie:/etc/mdadm# mdadm --examine /dev/sda
> > /dev/sda:
> >  MBR Magic : aa55
> > Partition[0] :  4294967295 sectors at            1 (type ee)
> >
> > root@jackie:/etc/mdadm# mdadm --examine /dev/sda1
> > mdadm: No md superblock detected on /dev/sda1.
> >
> >
> > Trying your other suggestion:
> > root@jackie:/etc/mdadm# mdadm --assemble /dev/md0 /dev/sdb1 /dev/sde1 /dev/sdf1 /dev/sdg1
> > mdadm: no recogniseable superblock on /dev/sdb1
> > mdadm: /dev/sdb1 has no superblock - assembly aborted
> >
> > root@jackie:/etc/mdadm# mdadm --assemble /dev/md0 /dev/sdb /dev/sde /dev/sdf /dev/sdg
> > mdadm: Cannot assemble mbr metadata on /dev/sdb
> > mdadm: /dev/sdb has no superblock - assembly aborted
> >
> >
> > Basically I've tried everything here:  https://raid.wiki.kernel.org/index.php/Linux_Raid
> >
> > The impression I'm getting here is that we aren't really sure what the issue is.  I think tonight I'll play with some of the BIOS settings and see if there's something in there.  If not I'll swap back to the old motherboard and see what happens.
> >
> > Thanks.
> > --RJ
> >
> >
> >
> >
> >
> > On Wednesday, January 24, 2024 at 12:06:26 PM EST, Sandro <lists@penguinpee.nl> wrote:
> >
> >
> >
> >
> >
> > On 24-01-2024 13:17, RJ Marquette wrote:
> >
> > > When I try the command you suggested below, I get:
> > > root@jackie:/etc/mdadm# mdadm --assemble /dev/md0 /dev/sd{a,b,e,f,g}1
> > > mdadm: no recogniseable superblock on /dev/sda1
> > > mdadm: /dev/sda1 has no superblock - assembly aborted
> >
> >
> > Try `mdadm --examine` on every partition / drive that is giving you
> > trouble. Maybe you are remembering things wrong and the raid device is
> > /dev/sda and not /dev/sda1.
> >
> > You can also go through the entire list (/dev/sd*), you posted earlier.
> > There's no harm in running the command. It will look for the superblock
> > and tell you what has been found. This could provide the information you
> > need to assemble the array.
> >
> > Alternatively, leave sda1 out of the assembly and see if mdadm will be
> > able to partially assemble the array.
> >
> > -- Sandro
> >
>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Requesting help recovering my array
  2024-01-24 21:44                             ` Roger Heflin
@ 2024-01-24 22:21                               ` Robin Hill
  2024-01-24 22:37                                 ` Roger Heflin
  2024-01-25  1:13                               ` RJ Marquette
  1 sibling, 1 reply; 40+ messages in thread
From: Robin Hill @ 2024-01-24 22:21 UTC (permalink / raw)
  To: Roger Heflin; +Cc: RJ Marquette, linux-raid

There have been reported cases of BIOSes auto-creating partitions on
disks, so this is certainly a possibility. I used to use bare disks but
have switched to partitions instead, just to prevent this sort of thing
from happening.
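
For future arrays, something along these lines sets up a properly typed
RAID partition first (sdX is a placeholder for a new, blank disk only;
this writes a partition table):

sgdisk --new=1:0:0 --typecode=1:fd00 /dev/sdX   # one whole-disk partition, type "Linux RAID"
# the md member is then /dev/sdX1 rather than the bare disk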

Cheers,
    Robin

On Wed Jan 24, 2024 at 03:44:55PM -0600, Roger Heflin wrote:

> Well, if you have a /dev/sdb1 device and you think the mdadm device is
> /dev/sdb (not sdb1) then SOMEONE added a partition table at some point
> in time or you are confused what you mdadm device is.   if sdb is a
> mdadm device and it has a partition table then mdadm --examine may see
> the partition table and report that and STOP reporting anything else.
> 
> And note that that partition table could have been added at any point
> in time since the prior reboot.  I have found (and fixed) ones that
> were added years earlier and found on the next reboot for something
> similar a year or 2 later.  On my own stuff before a hardware/mb
> upgrade i will do a reboot to make sure that it reboots cleanly as all
> sorts of stuff can happen (ie like initramfs/kernel  changes causing a
> general failure to boot).
> 
> On Wed, Jan 24, 2024 at 3:31 PM RJ Marquette <rjm1@yahoo.com> wrote:
> >
> > I didn't touch the drives.  I shut down the computer with everything working fine, swapped motherboards, booted the new board, and discovered this problem immediately when the computer failed to boot because the array wasn't up and running.  I definitely haven't run fdisk or other disk partitioning programs on them.
> >
> > Other than the modifications to the mdadm.conf to describe the drives and partitions (none of which have made any difference), I modified my fstab to comment out the raid array so the computer would boot normally.  I've been trying to figure out what is going on ever since.  I've tried to avoid doing anything that might write to the drives.
> >
> > I thought this upgrade would take an hour or two to swap hardware, not days of troubleshooting.  That was the advantage of software RAID, I thought.
> >
> > Thanks.
> > --RJ
> >
> >
> >
> >
> > On Wednesday, January 24, 2024 at 04:20:51 PM EST, Roger Heflin <rogerheflin@gmail.com> wrote:
> >
> >
> >
> >
> >
> > Are you sure you did not partition devices that did not previously
> > have partition tables?
> >
> > Partition tables will typically cause the under device (sda) to be
> > ignored by all of tools since it should never having something else
> > (except the partition table) on it.
> >
> > I have had to remove incorrectly added partition tables/blocks to make
> > lvm and other tools again see the data.  Otherwise the tools ignore
> > it.
> >
> > On Wed, Jan 24, 2024 at 12:06 PM RJ Marquette <rjm1@yahoo.com> wrote:
> > >
> > > Other than sdc (as you noted), the other array drives come back like this:
> > >
> > > root@jackie:/etc/mdadm# mdadm --examine /dev/sda
> > > /dev/sda:
> > >  MBR Magic : aa55
> > > Partition[0] :  4294967295 sectors at            1 (type ee)
> > >
> > > root@jackie:/etc/mdadm# mdadm --examine /dev/sda1
> > > mdadm: No md superblock detected on /dev/sda1.
> > >
> > >
> > > Trying your other suggestion:
> > > root@jackie:/etc/mdadm# mdadm --assemble /dev/md0 /dev/sdb1 /dev/sde1 /dev/sdf1 /dev/sdg1
> > > mdadm: no recogniseable superblock on /dev/sdb1
> > > mdadm: /dev/sdb1 has no superblock - assembly aborted
> > >
> > > root@jackie:/etc/mdadm# mdadm --assemble /dev/md0 /dev/sdb /dev/sde /dev/sdf /dev/sdg
> > > mdadm: Cannot assemble mbr metadata on /dev/sdb
> > > mdadm: /dev/sdb has no superblock - assembly aborted
> > >
> > >
> > > Basically I've tried everything here:  https://raid.wiki.kernel.org/index.php/Linux_Raid
> > >
> > > The impression I'm getting here is that we aren't really sure what the issue is.  I think tonight I'll play with some of the BIOS settings and see if there's something in there.  If not I'll swap back to the old motherboard and see what happens.
> > >
> > > Thanks.
> > > --RJ
> > >
> > >
> > >
> > >
> > >
> > > On Wednesday, January 24, 2024 at 12:06:26 PM EST, Sandro <lists@penguinpee.nl> wrote:
> > >
> > >
> > >
> > >
> > >
> > > On 24-01-2024 13:17, RJ Marquette wrote:
> > >
> > > > When I try the command you suggested below, I get:
> > > > root@jackie:/etc/mdadm# mdadm --assemble /dev/md0 /dev/sd{a,b,e,f,g}1
> > > > mdadm: no recogniseable superblock on /dev/sda1
> > > > mdadm: /dev/sda1 has no superblock - assembly aborted
> > >
> > >
> > > Try `mdadm --examine` on every partition / drive that is giving you
> > > trouble. Maybe you are remembering things wrong and the raid device is
> > > /dev/sda and not /dev/sda1.
> > >
> > > You can also go through the entire list (/dev/sd*), you posted earlier.
> > > There's no harm in running the command. It will look for the superblock
> > > and tell you what has been found. This could provide the information you
> > > need to assemble the array.
> > >
> > > Alternatively, leave sda1 out of the assembly and see if mdadm will be
> > > able to partially assemble the array.
> > >
> > > -- Sandro
> > >
> >
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Requesting help recovering my array
  2024-01-24 22:21                               ` Robin Hill
@ 2024-01-24 22:37                                 ` Roger Heflin
  0 siblings, 0 replies; 40+ messages in thread
From: Roger Heflin @ 2024-01-24 22:37 UTC (permalink / raw)
  To: Roger Heflin, RJ Marquette, linux-raid

you might run fdisk -l /dev/sdb and see what the partition table looks
like and what type of table it is.

And run this:
 dd if=/dev/sdb bs=1M count=2 | xxd -a  | more
my first block (0000) is all 00 (this is where the  partition table
goes) and my md header looks like this:

00001000: fc4e 2ba9 0100 0000 0100 0000 0000 0000  .N+.............
00001010: ea9b 97b7 9301 0e79 ef9a ac45 a0b8 4276  .......y...E..Bv
00001020: 6c6f 6361 6c68 6f73 742e 6c6f 6361 6c64  localhost.locald
00001030: 6f6d 6169 6e3a 3133 0000 0000 0000 0000  omain:13........


On Wed, Jan 24, 2024 at 4:22 PM Robin Hill <robin@robinhill.me.uk> wrote:
>
> There have been reported cases of BIOSes auto-creating partitions on
> disks, so this is certainly a possibility. I used to use bare disks but
> have switched to partitions instead, just to prevent this sort of thing
> from happening.
>
> Cheers,
>     Robin
>
> On Wed Jan 24, 2024 at 03:44:55PM -0600, Roger Heflin wrote:
>
> > Well, if you have a /dev/sdb1 device and you think the mdadm device is
> > /dev/sdb (not sdb1) then SOMEONE added a partition table at some point
> > in time or you are confused what you mdadm device is.   if sdb is a
> > mdadm device and it has a partition table then mdadm --examine may see
> > the partition table and report that and STOP reporting anything else.
> >
> > And note that that partition table could have been added at any point
> > in time since the prior reboot.  I have found (and fixed) ones that
> > were added years earlier and found on the next reboot for something
> > similar a year or 2 later.  On my own stuff before a hardware/mb
> > upgrade i will do a reboot to make sure that it reboots cleanly as all
> > sorts of stuff can happen (ie like initramfs/kernel  changes causing a
> > general failure to boot).
> >
> > On Wed, Jan 24, 2024 at 3:31 PM RJ Marquette <rjm1@yahoo.com> wrote:
> > >
> > > I didn't touch the drives.  I shut down the computer with everything working fine, swapped motherboards, booted the new board, and discovered this problem immediately when the computer failed to boot because the array wasn't up and running.  I definitely haven't run fdisk or other disk partitioning programs on them.
> > >
> > > Other than the modifications to the mdadm.conf to describe the drives and partitions (none of which have made any difference), I modified my fstab to comment out the raid array so the computer would boot normally.  I've been trying to figure out what is going on ever since.  I've tried to avoid doing anything that might write to the drives.
> > >
> > > I thought this upgrade would take an hour or two to swap hardware, not days of troubleshooting.  That was the advantage of software RAID, I thought.
> > >
> > > Thanks.
> > > --RJ
> > >
> > >
> > >
> > >
> > > On Wednesday, January 24, 2024 at 04:20:51 PM EST, Roger Heflin <rogerheflin@gmail.com> wrote:
> > >
> > >
> > >
> > >
> > >
> > > Are you sure you did not partition devices that did not previously
> > > have partition tables?
> > >
> > > Partition tables will typically cause the under device (sda) to be
> > > ignored by all of tools since it should never having something else
> > > (except the partition table) on it.
> > >
> > > I have had to remove incorrectly added partition tables/blocks to make
> > > lvm and other tools again see the data.  Otherwise the tools ignore
> > > it.
> > >
> > > On Wed, Jan 24, 2024 at 12:06 PM RJ Marquette <rjm1@yahoo.com> wrote:
> > > >
> > > > Other than sdc (as you noted), the other array drives come back like this:
> > > >
> > > > root@jackie:/etc/mdadm# mdadm --examine /dev/sda
> > > > /dev/sda:
> > > >  MBR Magic : aa55
> > > > Partition[0] :  4294967295 sectors at            1 (type ee)
> > > >
> > > > root@jackie:/etc/mdadm# mdadm --examine /dev/sda1
> > > > mdadm: No md superblock detected on /dev/sda1.
> > > >
> > > >
> > > > Trying your other suggestion:
> > > > root@jackie:/etc/mdadm# mdadm --assemble /dev/md0 /dev/sdb1 /dev/sde1 /dev/sdf1 /dev/sdg1
> > > > mdadm: no recogniseable superblock on /dev/sdb1
> > > > mdadm: /dev/sdb1 has no superblock - assembly aborted
> > > >
> > > > root@jackie:/etc/mdadm# mdadm --assemble /dev/md0 /dev/sdb /dev/sde /dev/sdf /dev/sdg
> > > > mdadm: Cannot assemble mbr metadata on /dev/sdb
> > > > mdadm: /dev/sdb has no superblock - assembly aborted
> > > >
> > > >
> > > > Basically I've tried everything here:  https://raid.wiki.kernel.org/index.php/Linux_Raid
> > > >
> > > > The impression I'm getting here is that we aren't really sure what the issue is.  I think tonight I'll play with some of the BIOS settings and see if there's something in there.  If not I'll swap back to the old motherboard and see what happens.
> > > >
> > > > Thanks.
> > > > --RJ
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On Wednesday, January 24, 2024 at 12:06:26 PM EST, Sandro <lists@penguinpee.nl> wrote:
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On 24-01-2024 13:17, RJ Marquette wrote:
> > > >
> > > > > When I try the command you suggested below, I get:
> > > > > root@jackie:/etc/mdadm# mdadm --assemble /dev/md0 /dev/sd{a,b,e,f,g}1
> > > > > mdadm: no recogniseable superblock on /dev/sda1
> > > > > mdadm: /dev/sda1 has no superblock - assembly aborted
> > > >
> > > >
> > > > Try `mdadm --examine` on every partition / drive that is giving you
> > > > trouble. Maybe you are remembering things wrong and the raid device is
> > > > /dev/sda and not /dev/sda1.
> > > >
> > > > You can also go through the entire list (/dev/sd*), you posted earlier.
> > > > There's no harm in running the command. It will look for the superblock
> > > > and tell you what has been found. This could provide the information you
> > > > need to assemble the array.
> > > >
> > > > Alternatively, leave sda1 out of the assembly and see if mdadm will be
> > > > able to partially assemble the array.
> > > >
> > > > -- Sandro
> > > >
> > >
> >

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Requesting help recovering my array
  2024-01-24 21:44                             ` Roger Heflin
  2024-01-24 22:21                               ` Robin Hill
@ 2024-01-25  1:13                               ` RJ Marquette
  2024-01-25  1:57                                 ` Roger Heflin
  2024-01-25 17:06                                 ` Reindl Harald
  1 sibling, 2 replies; 40+ messages in thread
From: RJ Marquette @ 2024-01-25  1:13 UTC (permalink / raw)
  To: linux-raid

It looks like this is what happened after all.  I searched for "MBR Magic aa55" and found someone else with the same issue long ago:  https://serverfault.com/questions/580761/is-mdadm-raid-toast  Looks like his was caused by a RAID configuration option in BIOS.  I recall seeing that on mine; I must have activated it by accident when setting the boot drive or something. 

I swapped the old motherboard back in, no improvement, so I'm back to the new one.  I'm now running testdisk to see if I can repair the partition table.

Thanks.
--RJ



On Wednesday, January 24, 2024 at 04:45:19 PM EST, Roger Heflin <rogerheflin@gmail.com> wrote: 





Well, if you have a /dev/sdb1 device and you think the mdadm device is
/dev/sdb (not sdb1) then SOMEONE added a partition table at some point
in time or you are confused what you mdadm device is.  if sdb is a
mdadm device and it has a partition table then mdadm --examine may see
the partition table and report that and STOP reporting anything else.

And note that that partition table could have been added at any point
in time since the prior reboot.  I have found (and fixed) ones that
were added years earlier and found on the next reboot for something
similar a year or 2 later.  On my own stuff before a hardware/mb
upgrade i will do a reboot to make sure that it reboots cleanly as all
sorts of stuff can happen (ie like initramfs/kernel  changes causing a
general failure to boot).

On Wed, Jan 24, 2024 at 3:31 PM RJ Marquette <rjm1@yahoo.com> wrote:
>
> I didn't touch the drives.  I shut down the computer with everything working fine, swapped motherboards, booted the new board, and discovered this problem immediately when the computer failed to boot because the array wasn't up and running.  I definitely haven't run fdisk or other disk partitioning programs on them.
>
> Other than the modifications to the mdadm.conf to describe the drives and partitions (none of which have made any difference), I modified my fstab to comment out the raid array so the computer would boot normally.  I've been trying to figure out what is going on ever since.  I've tried to avoid doing anything that might write to the drives.
>
> I thought this upgrade would take an hour or two to swap hardware, not days of troubleshooting.  That was the advantage of software RAID, I thought.
>
> Thanks.
> --RJ
>
>
>
>
> On Wednesday, January 24, 2024 at 04:20:51 PM EST, Roger Heflin <rogerheflin@gmail.com> wrote:
>
>
>
>
>
> Are you sure you did not partition devices that did not previously
> have partition tables?
>
> Partition tables will typically cause the under device (sda) to be
> ignored by all of tools since it should never having something else
> (except the partition table) on it.
>
> I have had to remove incorrectly added partition tables/blocks to make
> lvm and other tools again see the data.  Otherwise the tools ignore
> it.
>
> On Wed, Jan 24, 2024 at 12:06 PM RJ Marquette <rjm1@yahoo.com> wrote:
> >
> > Other than sdc (as you noted), the other array drives come back like this:
> >
> > root@jackie:/etc/mdadm# mdadm --examine /dev/sda
> > /dev/sda:
> >  MBR Magic : aa55
> > Partition[0] :  4294967295 sectors at            1 (type ee)
> >
> > root@jackie:/etc/mdadm# mdadm --examine /dev/sda1
> > mdadm: No md superblock detected on /dev/sda1.
> >
> >
> > Trying your other suggestion:
> > root@jackie:/etc/mdadm# mdadm --assemble /dev/md0 /dev/sdb1 /dev/sde1 /dev/sdf1 /dev/sdg1
> > mdadm: no recogniseable superblock on /dev/sdb1
> > mdadm: /dev/sdb1 has no superblock - assembly aborted
> >
> > root@jackie:/etc/mdadm# mdadm --assemble /dev/md0 /dev/sdb /dev/sde /dev/sdf /dev/sdg
> > mdadm: Cannot assemble mbr metadata on /dev/sdb
> > mdadm: /dev/sdb has no superblock - assembly aborted
> >
> >
> > Basically I've tried everything here:  https://raid.wiki.kernel.org/index.php/Linux_Raid
> >
> > The impression I'm getting here is that we aren't really sure what the issue is.  I think tonight I'll play with some of the BIOS settings and see if there's something in there.  If not I'll swap back to the old motherboard and see what happens.
> >
> > Thanks.
> > --RJ
> >
> >
> >
> >
> >
> > On Wednesday, January 24, 2024 at 12:06:26 PM EST, Sandro <lists@penguinpee.nl> wrote:
> >
> >
> >
> >
> >
> > On 24-01-2024 13:17, RJ Marquette wrote:
> >
> > > When I try the command you suggested below, I get:
> > > root@jackie:/etc/mdadm# mdadm --assemble /dev/md0 /dev/sd{a,b,e,f,g}1
> > > mdadm: no recogniseable superblock on /dev/sda1
> > > mdadm: /dev/sda1 has no superblock - assembly aborted
> >
> >
> > Try `mdadm --examine` on every partition / drive that is giving you
> > trouble. Maybe you are remembering things wrong and the raid device is
> > /dev/sda and not /dev/sda1.
> >
> > You can also go through the entire list (/dev/sd*), you posted earlier.
> > There's no harm in running the command. It will look for the superblock
> > and tell you what has been found. This could provide the information you
> > need to assemble the array.
> >
> > Alternatively, leave sda1 out of the assembly and see if mdadm will be
> > able to partially assemble the array.
> >
> > -- Sandro
> >
>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Requesting help recovering my array
  2024-01-25  1:13                               ` RJ Marquette
@ 2024-01-25  1:57                                 ` Roger Heflin
  2024-01-25  9:49                                   ` Pascal Hambourg
  2024-01-25 17:06                                 ` Reindl Harald
  1 sibling, 1 reply; 40+ messages in thread
From: Roger Heflin @ 2024-01-25  1:57 UTC (permalink / raw)
  To: RJ Marquette; +Cc: linux-raid

dd if=/dev/sdb of=/root/sdb.512byte.block.save bs=512 count=1
(repeat on each damaged drive with its name).
then
dd if=/dev/zero of=/dev/sdb bs=512 count=1     (for each disk).

the first command makes a copy of that first block just in case, and
the second command clears the first 512 bytes, which contain the MBR
partition table.
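
If you do save anything, a loop over the affected disks keeps it to one
pass (disk names assumed from the earlier mails; grabbing a bit more
than one sector also captures the GPT header and entries):

for d in sda sdb sde sdf sdg; do
    dd if=/dev/$d of=/root/$d.first64k.save bs=4096 count=16   # read-only copy of the first 64 KiB
done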

On Wed, Jan 24, 2024 at 7:13 PM RJ Marquette <rjm1@yahoo.com> wrote:
>
> It looks like this is what happened after all.  I searched for "MBR Magic aa55" and found someone else with the same issue long ago:  https://serverfault.com/questions/580761/is-mdadm-raid-toast  Looks like his was caused by a RAID configuration option in BIOS.  I recall seeing that on mine; I must have activated it by accident when setting the boot drive or something.
>
> I swapped the old motherboard back in, no improvement, so I'm back to the new one.  I'm now running testdisk to see if I can repair the partition table.
>
> Thanks.
> --RJ
>
>
>
> On Wednesday, January 24, 2024 at 04:45:19 PM EST, Roger Heflin <rogerheflin@gmail.com> wrote:
>
>
>
>
>
> Well, if you have a /dev/sdb1 device and you think the mdadm device is
> /dev/sdb (not sdb1) then SOMEONE added a partition table at some point
> in time or you are confused what you mdadm device is.  if sdb is a
> mdadm device and it has a partition table then mdadm --examine may see
> the partition table and report that and STOP reporting anything else.
>
> And note that that partition table could have been added at any point
> in time since the prior reboot.  I have found (and fixed) ones that
> were added years earlier and found on the next reboot for something
> similar a year or 2 later.  On my own stuff before a hardware/mb
> upgrade i will do a reboot to make sure that it reboots cleanly as all
> sorts of stuff can happen (ie like initramfs/kernel  changes causing a
> general failure to boot).
>
> On Wed, Jan 24, 2024 at 3:31 PM RJ Marquette <rjm1@yahoo.com> wrote:
> >
> > I didn't touch the drives.  I shut down the computer with everything working fine, swapped motherboards, booted the new board, and discovered this problem immediately when the computer failed to boot because the array wasn't up and running.  I definitely haven't run fdisk or other disk partitioning programs on them.
> >
> > Other than the modifications to the mdadm.conf to describe the drives and partitions (none of which have made any difference), I modified my fstab to comment out the raid array so the computer would boot normally.  I've been trying to figure out what is going on ever since.  I've tried to avoid doing anything that might write to the drives.
> >
> > I thought this upgrade would take an hour or two to swap hardware, not days of troubleshooting.  That was the advantage of software RAID, I thought.
> >
> > Thanks.
> > --RJ
> >
> >
> >
> >
> > On Wednesday, January 24, 2024 at 04:20:51 PM EST, Roger Heflin <rogerheflin@gmail.com> wrote:
> >
> >
> >
> >
> >
> > Are you sure you did not partition devices that did not previously
> > have partition tables?
> >
> > Partition tables will typically cause the under device (sda) to be
> > ignored by all of tools since it should never having something else
> > (except the partition table) on it.
> >
> > I have had to remove incorrectly added partition tables/blocks to make
> > lvm and other tools again see the data.  Otherwise the tools ignore
> > it.
> >
> > On Wed, Jan 24, 2024 at 12:06 PM RJ Marquette <rjm1@yahoo.com> wrote:
> > >
> > > Other than sdc (as you noted), the other array drives come back like this:
> > >
> > > root@jackie:/etc/mdadm# mdadm --examine /dev/sda
> > > /dev/sda:
> > >  MBR Magic : aa55
> > > Partition[0] :  4294967295 sectors at            1 (type ee)
> > >
> > > root@jackie:/etc/mdadm# mdadm --examine /dev/sda1
> > > mdadm: No md superblock detected on /dev/sda1.
> > >
> > >
> > > Trying your other suggestion:
> > > root@jackie:/etc/mdadm# mdadm --assemble /dev/md0 /dev/sdb1 /dev/sde1 /dev/sdf1 /dev/sdg1
> > > mdadm: no recogniseable superblock on /dev/sdb1
> > > mdadm: /dev/sdb1 has no superblock - assembly aborted
> > >
> > > root@jackie:/etc/mdadm# mdadm --assemble /dev/md0 /dev/sdb /dev/sde /dev/sdf /dev/sdg
> > > mdadm: Cannot assemble mbr metadata on /dev/sdb
> > > mdadm: /dev/sdb has no superblock - assembly aborted
> > >
> > >
> > > Basically I've tried everything here:  https://raid.wiki.kernel.org/index.php/Linux_Raid
> > >
> > > The impression I'm getting here is that we aren't really sure what the issue is.  I think tonight I'll play with some of the BIOS settings and see if there's something in there.  If not I'll swap back to the old motherboard and see what happens.
> > >
> > > Thanks.
> > > --RJ
> > >
> > >
> > >
> > >
> > >
> > > On Wednesday, January 24, 2024 at 12:06:26 PM EST, Sandro <lists@penguinpee.nl> wrote:
> > >
> > >
> > >
> > >
> > >
> > > On 24-01-2024 13:17, RJ Marquette wrote:
> > >
> > > > When I try the command you suggested below, I get:
> > > > root@jackie:/etc/mdadm# mdadm --assemble /dev/md0 /dev/sd{a,b,e,f,g}1
> > > > mdadm: no recogniseable superblock on /dev/sda1
> > > > mdadm: /dev/sda1 has no superblock - assembly aborted
> > >
> > >
> > > Try `mdadm --examine` on every partition / drive that is giving you
> > > trouble. Maybe you are remembering things wrong and the raid device is
> > > /dev/sda and not /dev/sda1.
> > >
> > > You can also go through the entire list (/dev/sd*), you posted earlier.
> > > There's no harm in running the command. It will look for the superblock
> > > and tell you what has been found. This could provide the information you
> > > need to assemble the array.
> > >
> > > Alternatively, leave sda1 out of the assembly and see if mdadm will be
> > > able to partially assemble the array.
> > >
> > > -- Sandro
> > >
> >
>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Requesting help recovering my array
  2024-01-25  1:57                                 ` Roger Heflin
@ 2024-01-25  9:49                                   ` Pascal Hambourg
  2024-01-25 11:49                                     ` RJ Marquette
  0 siblings, 1 reply; 40+ messages in thread
From: Pascal Hambourg @ 2024-01-25  9:49 UTC (permalink / raw)
  To: RJ Marquette; +Cc: linux-raid

On 25/01/2024 at 02:57, Roger Heflin wrote:
> 
> dd if=/dev/zero of=/dev/sdb bs=512 count=1     (for each disk).

I'm afraid it won't help.
As far as I can see, having an MBR signature in the first sector does
not prevent blkid or mdadm from detecting the RAID superblock.
Also, previous mails from the OP show that the disks have GPT partition
tables (as expected with 3 TB disks), and a GPT usually spans ~16 KiB,
i.e. beyond the start of the 1.2 RAID superblock at 4 KiB, so I suspect
that the RAID superblock was overwritten.
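
One way to check whether that 4 KiB superblock survived on a given disk
(sdb as an example; the v1.2 magic a92b4efc is stored little-endian, so
it shows up as "fc4e 2ba9"):

dd if=/dev/sdb bs=4096 skip=1 count=1 2>/dev/null | xxd | head -n 4
# an intact whole-disk member starts this block with "fc4e 2ba9";
# all zeros here means the superblock really has been overwritten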

A tiny hope is that the RAID member was actually in a partition but the 
geometry in the partition table is wrong or the kernel does not read it 
properly (I have seen this once). You can check the partition table and 
how the kernel sees the partition with

fdisk -l /dev/sdb
cat /sys/block/sdb/sdb1/start
cat /sys/block/sdb/sdb1/size

> On Wed, Jan 24, 2024 at 7:13 PM RJ Marquette <rjm1@yahoo.com> wrote:
>>
>> It looks like this is what happened after all.  I searched for "MBR Magic aa55" and found someone else with the same issue long ago:  https://serverfault.com/questions/580761/is-mdadm-raid-toast  Looks like his was caused by a RAID configuration option in BIOS.  I recall seeing that on mine; I must have activated it by accident when setting the boot drive or something.
>>
>> I swapped the old motherboard back in, no improvement, so I'm back to the new one.  I'm now running testdisk to see if I can repair the partition table.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Requesting help recovering my array
  2024-01-25  9:49                                   ` Pascal Hambourg
@ 2024-01-25 11:49                                     ` RJ Marquette
  2024-01-25 14:57                                       ` Pascal Hambourg
  0 siblings, 1 reply; 40+ messages in thread
From: RJ Marquette @ 2024-01-25 11:49 UTC (permalink / raw)
  To: linux-raid

root@jackie:/home/rj# /sbin/fdisk -l /dev/sdb
Disk /dev/sdb: 2.73 TiB, 3000592982016 bytes, 5860533168 sectors
Disk model: Hitachi HUS72403
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: AF5DC5DE-1404-4F4F-85AF-B5574CD9C627

Device     Start        End    Sectors  Size Type
/dev/sdb1   2048 5860532223 5860530176  2.7T Microsoft basic data

root@jackie:/home/rj# cat /sys/block/sdb/sdb1/start  
2048
root@jackie:/home/rj# cat /sys/block/sdb/sdb1/size
5860530176

(I haven't checked all of the drives in the array, just this one.)

Thanks.
--RJ





On Thursday, January 25, 2024 at 06:07:27 AM EST, Pascal Hambourg <pascal@plouf.fr.eu.org> wrote: 





On 25/01/2024 at 02:57, Roger Heflin wrote:
> 
> dd if=/dev/zero of=/dev/sdb bs=512 count=1    (for each disk).

I'm afraid it won't help.
As far as I can see, having an MBR signature in the first sector does 
not prevent blkid or mdadm from detecting the RAID superblock.
Also, previous mail from the OP show that the disks have GPT partition 
tables (as expected with 3 TiB) which usually span (~16 KiB) beyond the 
beginning of the 1.2 RAID superblock (4 KiB) so I suspect that the RAID 
superblock was overwritten.

A tiny hope is that the RAID member was actually in a partition but the 
geometry in the partition table is wrong or the kernel does not read it 
properly (I have seen this once). You can check the partition table and 
how the kernel sees the partition with

fdisk -l /dev/sdb
cat /sys/block/sdb/sdb1/start
cat /sys/block/sdb/sdb1/size

> On Wed, Jan 24, 2024 at 7:13 PM RJ Marquette <rjm1@yahoo.com> wrote:

>>
>> It looks like this is what happened after all.  I searched for "MBR Magic aa55" and found someone else with the same issue long ago:  https://serverfault.com/questions/580761/is-mdadm-raid-toast  Looks like his was caused by a RAID configuration option in BIOS.  I recall seeing that on mine; I must have activated it by accident when setting the boot drive or something.
>>
>> I swapped the old motherboard back in, no improvement, so I'm back to the new one.  I'm now running testdisk to see if I can repair the partition table.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Requesting help recovering my array
  2024-01-25 11:49                                     ` RJ Marquette
@ 2024-01-25 14:57                                       ` Pascal Hambourg
  2024-01-25 15:08                                         ` RJ Marquette
  0 siblings, 1 reply; 40+ messages in thread
From: Pascal Hambourg @ 2024-01-25 14:57 UTC (permalink / raw)
  To: linux-raid

On 25/01/2024 at 12:49, RJ Marquette wrote:
> root@jackie:/home/rj# /sbin/fdisk -l /dev/sdb
> Disk /dev/sdb: 2.73 TiB, 3000592982016 bytes, 5860533168 sectors
> Disk model: Hitachi HUS72403
> Units: sectors of 1 * 512 = 512 bytes
> Sector size (logical/physical): 512 bytes / 4096 bytes
> I/O size (minimum/optimal): 4096 bytes / 4096 bytes
> Disklabel type: gpt
> Disk identifier: AF5DC5DE-1404-4F4F-85AF-B5574CD9C627
> 
> Device     Start        End    Sectors  Size Type
> /dev/sdb1   2048 5860532223 5860530176  2.7T Microsoft basic data
> 
> root@jackie:/home/rj# cat /sys/block/sdb/sdb1/start
> 2048
> root@jackie:/home/rj# cat /sys/block/sdb/sdb1/size
> 5860530176

The partition geometry looks correct, with standard alignment.
And the kernel view of the partition matches the partition table.
The partition type "Microsoft basic data" is neither "Linux RAID" nor
the default type "Linux filesystem" set by the usual GNU/Linux
partitioning tools such as fdisk, parted and gdisk, so it seems unlikely
that the partition was created with one of these tools.
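
If you want to see the raw type GUID that was written (sdb as an example):

sgdisk -i 1 /dev/sdb   # "Microsoft basic data" is EBD0A0A2-B9E5-4433-87C0-68B6B72699C7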

>>> It looks like this is what happened after all.  I searched for "MBR
>>> Magic aa55" and found someone else with the same issue long ago:
>>> https://serverfault.com/questions/580761/is-mdadm-raid-toast  Looks like
>>> his was caused by a RAID configuration option in BIOS.  I recall seeing
>>> that on mine; I must have activated it by accident when setting the boot
>>> drive or something.

I am a bit suspicious about this cause for two reasons:
- sde, sdf and sdg are affected even though they are connected to the 
add-on Marvell SATA controller card which is supposed to be outside the 
motherboard RAID scope;
- sdc is not affected even though it is connected to the onboard Intel 
SATA controller.

What was the content type of the RAID array? LVM, LUKS, plain filesystem?

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Requesting help recovering my array
  2024-01-25 14:57                                       ` Pascal Hambourg
@ 2024-01-25 15:08                                         ` RJ Marquette
  2024-01-25 17:43                                           ` Roger Heflin
  0 siblings, 1 reply; 40+ messages in thread
From: RJ Marquette @ 2024-01-25 15:08 UTC (permalink / raw)
  To: linux-raid

It's an ext4 RAID5 array.  No LVM, LUKS, etc.

You make a good point about the BIOS explanation - it seems to have affected only the 5 RAID drives that had data on them, not the spare, nor the other system drive (and the latter two are both connected to the motherboard).  How would it have decided to grab exactly those 5?

Thanks.
--RJ


On Thursday, January 25, 2024 at 10:01:40 AM EST, Pascal Hambourg <pascal@plouf.fr.eu.org> wrote: 





On 25/01/2024 at 12:49, RJ Marquette wrote:
> root@jackie:/home/rj# /sbin/fdisk -l /dev/sdb
> Disk /dev/sdb: 2.73 TiB, 3000592982016 bytes, 5860533168 sectors
> Disk model: Hitachi HUS72403
> Units: sectors of 1 * 512 = 512 bytes
> Sector size (logical/physical): 512 bytes / 4096 bytes
> I/O size (minimum/optimal): 4096 bytes / 4096 bytes
> Disklabel type: gpt
> Disk identifier: AF5DC5DE-1404-4F4F-85AF-B5574CD9C627
> 
> Device     Start        End    Sectors  Size Type
> /dev/sdb1   2048 5860532223 5860530176  2.7T Microsoft basic data
> 
> root@jackie:/home/rj# cat /sys/block/sdb/sdb1/start
> 2048
> root@jackie:/home/rj# cat /sys/block/sdb/sdb1/size
> 5860530176

The partition geometry looks correct, with standard alignment.
And the kernel view of the partition matches the partition table.
The partition type "Microsoft basic data" is neither "Linux RAID" nor 
the default type "Linux flesystem" set by usual GNU/Linux partitioning 
tools such as fdisk, parted and gdisk so it seems unlikely that the 
partition was created with one of these tools.


>>> It looks like this is what happened after all.  I searched for "MBR
>>> Magic aa55" and found someone else with the same issue long ago:
>>> https://serverfault.com/questions/580761/is-mdadm-raid-toast  Looks like
>>> his was caused by a RAID configuration option in BIOS.  I recall seeing
>>> that on mine; I must have activated it by accident when setting the boot
>>> drive or something.


I am a bit suspicious about this cause for two reasons:
- sde, sdf and sdg are affected even though they are connected to the 
add-on Marvell SATA controller card which is supposed to be outside the 
motherboard RAID scope;
- sdc is not affected even though it is connected to the onboard Intel 
SATA controller.

What was contents type of the RAID array ? LVM, LUKS, plain filesystem ?

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Requesting help recovering my array
  2024-01-25  1:13                               ` RJ Marquette
  2024-01-25  1:57                                 ` Roger Heflin
@ 2024-01-25 17:06                                 ` Reindl Harald
  1 sibling, 0 replies; 40+ messages in thread
From: Reindl Harald @ 2024-01-25 17:06 UTC (permalink / raw)
  To: RJ Marquette, linux-raid



Am 25.01.24 um 02:13 schrieb RJ Marquette:
> It looks like this is what happened after all.  I searched for "MBR Magic aa55" and found someone else with the same issue long ago:  https://serverfault.com/questions/580761/is-mdadm-raid-toast  Looks like his was caused by a RAID configuration option in BIOS.  I recall seeing that on mine; I must have activated it by accident when setting the boot drive or something.
> 
> I swapped the old motherboard back in, no improvement, so I'm back to the new one.  I'm now running testdisk to see if I can repair the partition table.

you learned the hard way to *never* add unpartitioned drives to an array

if only to leave 50 MB unused in case some clown builds a disk which is
slightly smaller in usable space

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Requesting help recovering my array
  2024-01-25 15:08                                         ` RJ Marquette
@ 2024-01-25 17:43                                           ` Roger Heflin
  2024-01-25 18:33                                             ` RJ Marquette
  0 siblings, 1 reply; 40+ messages in thread
From: Roger Heflin @ 2024-01-25 17:43 UTC (permalink / raw)
  To: RJ Marquette; +Cc: linux-raid

You never booted windows or any other non-linux boot image that might
have decided to "fix" the disk's missing partition tables?

And when messing with the install, did you move the disks around so
that some of the disks could have been on the Intel controller, with
RAID enabled, at different times?

That specific model of Marvell controller does not list RAID support,
but other models in the same family do, so it may also have an option
in the BIOS that "fixes" the partition table.

Any number of id10ts writing tools may have wrongly decided that a
disk without a partition table needs to be fixed.  I know Windows
disk management used to (and may still) complain about no partitions
and prompt to "fix" it.

I always run partitions on everything.  I have had the partition save
me when two different vendors' hardware raid controllers lost their
config (one after a random crash, one that freaked out on a firmware
upgrade) and, when the config was recreated, seemed to "helpfully"
clear a few KB at the front of the disk.  A rescue boot, repartitioning,
mounting the OS LV, and reinstalling grub fixed those.
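
For reference, the rescue sequence in those cases was roughly this (all
device and LV names below are examples, not from this system):

# from a rescue/live boot, after recreating the partition table:
mount /dev/vg0/root /mnt                                  # example root LV
mount /dev/sda1 /mnt/boot                                 # example boot partition
for f in dev proc sys; do mount --bind /$f /mnt/$f; done
chroot /mnt grub-install /dev/sda && chroot /mnt update-grub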

On Thu, Jan 25, 2024 at 9:17 AM RJ Marquette <rjm1@yahoo.com> wrote:
>
> It's an ext4 RAID5 array.  No LVM, LUKS, etc.
>
> You make a good point about the BIOS explanation - it seems to have affected only the 5 RAID drives that had data on them, not the spare, nor the other system drive (and the latter two are both connected to the motherboard).  How would it have decided to grab exactly those 5?
>
> Thanks.
> --RJ
>
>
> On Thursday, January 25, 2024 at 10:01:40 AM EST, Pascal Hambourg <pascal@plouf.fr.eu.org> wrote:
>
>
>
>
>
> On 25/01/2024 at 12:49, RJ Marquette wrote:
> > root@jackie:/home/rj# /sbin/fdisk -l /dev/sdb
> > Disk /dev/sdb: 2.73 TiB, 3000592982016 bytes, 5860533168 sectors
> > Disk model: Hitachi HUS72403
> > Units: sectors of 1 * 512 = 512 bytes
> > Sector size (logical/physical): 512 bytes / 4096 bytes
> > I/O size (minimum/optimal): 4096 bytes / 4096 bytes
> > Disklabel type: gpt
> > Disk identifier: AF5DC5DE-1404-4F4F-85AF-B5574CD9C627
> >
> > Device     Start        End    Sectors  Size Type
> > /dev/sdb1   2048 5860532223 5860530176  2.7T Microsoft basic data
> >
> > root@jackie:/home/rj# cat /sys/block/sdb/sdb1/start
> > 2048
> > root@jackie:/home/rj# cat /sys/block/sdb/sdb1/size
> > 5860530176
>
> The partition geometry looks correct, with standard alignment.
> And the kernel view of the partition matches the partition table.
> The partition type "Microsoft basic data" is neither "Linux RAID" nor
> the default type "Linux flesystem" set by usual GNU/Linux partitioning
> tools such as fdisk, parted and gdisk so it seems unlikely that the
> partition was created with one of these tools.
>
>
> >>> It looks like this is what happened after all.  I searched for "MBR
> >>> Magic aa55" and found someone else with the same issue long ago:
> >>> https://serverfault.com/questions/580761/is-mdadm-raid-toast  Looks like
> >>> his was caused by a RAID configuration option in BIOS.  I recall seeing
> >>> that on mine; I must have activated it by accident when setting the boot
> >>> drive or something.
>
>
> I am a bit suspicious about this cause for two reasons:
> - sde, sdf and sdg are affected even though they are connected to the
> add-on Marvell SATA controller card which is supposed to be outside the
> motherboard RAID scope;
> - sdc is not affected even though it is connected to the onboard Intel
> SATA controller.
>
> What was contents type of the RAID array ? LVM, LUKS, plain filesystem ?
>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Requesting help recovering my array
  2024-01-25 17:43                                           ` Roger Heflin
@ 2024-01-25 18:33                                             ` RJ Marquette
  2024-01-25 22:37                                               ` Roger Heflin
  0 siblings, 1 reply; 40+ messages in thread
From: RJ Marquette @ 2024-01-25 18:33 UTC (permalink / raw)
  To: linux-raid

No, this system does not have any other OS's installed on it, Debian Linux only, as it's my server.

No, the three drives remained connected to the extra controller card and were never removed from that card - I just pulled the card out of the case with the connections intact, and swung it off to the side.  In fact they still haven't been removed.

I don't understand the partitions comment, as 5 of the 6 drives do appear to have separate partitions for the data, and the one that doesn't is the only one that seems to be responding normally.  I guess the theory is that whatever damaged the partition tables wrote a single primary partition to each drive in the process?

I do not know what caused this problem.  I've had no reason to run fdisk or any similar utility on that computer in years.  I know we want to figure out why this happened, but I'd also like to recover my RAID, if possible.

What are my options at this point?  Should I try something like this?  (This is for someone's RAID1 setup, obviously the level and drives would change for me.):

mdadm --create --assume-clean --level=1 --raid-devices=2 /dev/md0 /dev/sda /dev/sdb

That's from this page:  https://askubuntu.com/questions/1254561/md-raid-superblock-gets-deleted

I'm currently running testdisk on one of the affected drives to see what that turns up.

Thanks.
--RJ

On Thursday, January 25, 2024 at 12:43:58 PM EST, Roger Heflin <rogerheflin@gmail.com> wrote: 





You never booted windows or any other non-linux boot image that might
have decided to "fix" the disk's missing partition tables?

And when messing with the install did you move the disks around so
that some of the disk could have been on the intel controller with
raid set at different times?

That specific model of marvell controller does not list support raid,
but other models in the same family do, so it may also have an option
in the bios that "fixes" the partition table.

Any number of id10t's writing tools may have wrongly decided that a
disk without a partition table needs to be fixed.  I know the windows
disk management used to (may still) complain about no partitions and
prompt to "fix" it.

I always run partitions on everything.  I have had the partition save
me when 2 different vendors hardware raid controllers lost their
config (random crash, freaked on on fw upgrade) and when the config
was recreated seem to "helpfully" clear a few kb at the front of the
disk.  Rescue boot, repartition, mount os lv, and reinstall grub
fixed those.

On Thu, Jan 25, 2024 at 9:17 AM RJ Marquette <rjm1@yahoo.com> wrote:
>
> It's an ext4 RAID5 array.  No LVM, LUKS, etc.
>
> You make a good point about the BIOS explanation - it seems to have affected only the 5 RAID drives that had data on them, not the spare, nor the other system drive (and the latter two are both connected to the motherboard).  How would it have decided to grab exactly those 5?
>
> Thanks.
> --RJ
>
>
> On Thursday, January 25, 2024 at 10:01:40 AM EST, Pascal Hambourg <pascal@plouf.fr.eu.org> wrote:
>
>
>
>
>
> On 25/01/2024 at 12:49, RJ Marquette wrote:
> > root@jackie:/home/rj# /sbin/fdisk -l /dev/sdb
> > Disk /dev/sdb: 2.73 TiB, 3000592982016 bytes, 5860533168 sectors
> > Disk model: Hitachi HUS72403
> > Units: sectors of 1 * 512 = 512 bytes
> > Sector size (logical/physical): 512 bytes / 4096 bytes
> > I/O size (minimum/optimal): 4096 bytes / 4096 bytes
> > Disklabel type: gpt
> > Disk identifier: AF5DC5DE-1404-4F4F-85AF-B5574CD9C627
> >
> > Device    Start        End    Sectors  Size Type
> > /dev/sdb1  2048 5860532223 5860530176  2.7T Microsoft basic data
> >
> > root@jackie:/home/rj# cat /sys/block/sdb/sdb1/start
> > 2048
> > root@jackie:/home/rj# cat /sys/block/sdb/sdb1/size
> > 5860530176
>
> The partition geometry looks correct, with standard alignment.
> And the kernel view of the partition matches the partition table.
> The partition type "Microsoft basic data" is neither "Linux RAID" nor
> the default type "Linux flesystem" set by usual GNU/Linux partitioning
> tools such as fdisk, parted and gdisk so it seems unlikely that the
> partition was created with one of these tools.
>
>
> >>> It looks like this is what happened after all.  I searched for "MBR
> >>> Magic aa55" and found someone else with the same issue long ago:
> >>> https://serverfault.com/questions/580761/is-mdadm-raid-toast  Looks like
> >>> his was caused by a RAID configuration option in BIOS.  I recall seeing
> >>> that on mine; I must have activated it by accident when setting the boot
> >>> drive or something.
>
>
> I am a bit suspicious about this cause for two reasons:
> - sde, sdf and sdg are affected even though they are connected to the
> add-on Marvell SATA controller card which is supposed to be outside the
> motherboard RAID scope;
> - sdc is not affected even though it is connected to the onboard Intel
> SATA controller.
>
> What was contents type of the RAID array ? LVM, LUKS, plain filesystem ?
>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Requesting help recovering my array
  2024-01-25 18:33                                             ` RJ Marquette
@ 2024-01-25 22:37                                               ` Roger Heflin
  2024-01-25 22:53                                                 ` Roger Heflin
  0 siblings, 1 reply; 40+ messages in thread
From: Roger Heflin @ 2024-01-25 22:37 UTC (permalink / raw)
  To: RJ Marquette; +Cc: linux-raid

If the one that is working right does not have a partition, then
somehow the partitioning got added to the broken disks.  The
partition is a gpt partition, which another reply indicated is about 16k,
which would mean it overwrote the md header at 4k, and that header
being overwritten would cause the disks to no longer be raid members.

The create must have the disks in the correct order and with the
correct parameters.  Doing a random create with a random order is
unlikely to work, and may well make things unrecoverable.

I believe there are instructions on some page about md repairing that
talks about using overlays.   Using the overlays lets you create an
overlay such that the underlying devices aren't written to such that
you can test a number of different orders and parameters to find the
one that works.

I think this is the stuff about recovering and overlays that you want to follow.

https://raid.wiki.kernel.org/index.php/Irreversible_mdadm_failure_recovery
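
For a rough idea of what the overlay setup amounts to, it is a
device-mapper snapshot per disk, along these lines (file size, loop
device and names are only an example; the overlay file has to be big
enough to hold whatever gets written during testing):

truncate -s 1G /tmp/overlay-sdb                  # sparse copy-on-write file
loop=$(losetup -f --show /tmp/overlay-sdb)
size=$(blockdev --getsz /dev/sdb)                # size of the real disk in 512-byte sectors
dmsetup create sdb-overlay --table "0 $size snapshot /dev/sdb $loop N 8"
# experiment on /dev/mapper/sdb-overlay; the real /dev/sdb is never written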


On Thu, Jan 25, 2024 at 12:41 PM RJ Marquette <rjm1@yahoo.com> wrote:
>
> No, this system does not have any other OS's installed on it, Debian Linux only, as it's my server.
>
> No, the three drives remained connected to the extra controller card and were never removed from that card - I just pulled the card out of the case with the connections intact, and swung it off to the side.  In fact they still haven't been removed.
>
> I don't understand the partitions comment, as 5 of the 6 drives do appear to have separate partitions for the data, and the one that doesn't is the only that seems to be responding normally.  I guess the theory is that whatever damaged the partition tables wrote a single primary partition to the drive in the process?
>
> I do not know what caused this problem.  I've had no reason to run fdisk or any similar utility on that computer in years.  I know we want to figure out why this happened, but I'd also like to recover my RAID, if possible.
>
> What are my options at this point?  Should I try something like this?  (This is for someone's RAID1 setup, obviously the level and drives would change for me.):
>
> mdadm --create --assume-clean --level=1 --raid-devices=2 /dev/md0 /dev/sda /dev/sdb
>
> That's from this page:  https://askubuntu.com/questions/1254561/md-raid-superblock-gets-deleted
>
> I'm currently running testdisk on one of the affected drives to see what that turns up.
>
> Thanks.
> --RJ
>
> On Thursday, January 25, 2024 at 12:43:58 PM EST, Roger Heflin <rogerheflin@gmail.com> wrote:
>
>
>
>
>
> You never booted windows or any other non-linux boot image that might
> have decided to "fix" the disk's missing partition tables?
>
> And when messing with the install did you move the disks around so
> that some of the disk could have been on the intel controller with
> raid set at different times?
>
> That specific model of marvell controller does not list raid support,
> but other models in the same family do, so it may also have an option
> in the bios that "fixes" the partition table.
>
> Any number of id10t-written tools may have wrongly decided that a
> disk without a partition table needs to be fixed.  I know the windows
> disk management used to (and may still) complain about no partitions and
> prompt to "fix" it.
>
> I always run partitions on everything.  I have had the partition save
> me when 2 different vendors' hardware raid controllers lost their
> config (random crash, freaked out on a fw upgrade) and, when the config
> was recreated, seemed to "helpfully" clear a few kb at the front of the
> disk.  A rescue boot, repartition, mount of the os lv, and grub reinstall
> fixed those.
>
> On Thu, Jan 25, 2024 at 9:17 AM RJ Marquette <rjm1@yahoo.com> wrote:
> >
> > It's an ext4 RAID5 array.  No LVM, LUKS, etc.
> >
> > You make a good point about the BIOS explanation - it seems to have affected only the 5 RAID drives that had data on them, not the spare, nor the other system drive (and the latter two are both connected to the motherboard).  How would it have decided to grab exactly those 5?
> >
> > Thanks.
> > --RJ
> >
> >
> > On Thursday, January 25, 2024 at 10:01:40 AM EST, Pascal Hambourg <pascal@plouf.fr.eu.org> wrote:
> >
> >
> >
> >
> >
> > On 25/01/2024 at 12:49, RJ Marquette wrote:
> > > root@jackie:/home/rj# /sbin/fdisk -l /dev/sdb
> > > Disk /dev/sdb: 2.73 TiB, 3000592982016 bytes, 5860533168 sectors
> > > Disk model: Hitachi HUS72403
> > > Units: sectors of 1 * 512 = 512 bytes
> > > Sector size (logical/physical): 512 bytes / 4096 bytes
> > > I/O size (minimum/optimal): 4096 bytes / 4096 bytes
> > > Disklabel type: gpt
> > > Disk identifier: AF5DC5DE-1404-4F4F-85AF-B5574CD9C627
> > >
> > > Device    Start        End    Sectors  Size Type
> > > /dev/sdb1  2048 5860532223 5860530176  2.7T Microsoft basic data
> > >
> > > root@jackie:/home/rj# cat /sys/block/sdb/sdb1/start
> > > 2048
> > > root@jackie:/home/rj# cat /sys/block/sdb/sdb1/size
> > > 5860530176
> >
> > The partition geometry looks correct, with standard alignment.
> > And the kernel view of the partition matches the partition table.
> > The partition type "Microsoft basic data" is neither "Linux RAID" nor
> > the default type "Linux filesystem" set by usual GNU/Linux partitioning
> > tools such as fdisk, parted and gdisk, so it seems unlikely that the
> > partition was created with one of these tools.
> >
> >
> > >>> It looks like this is what happened after all.  I searched for "MBR
> > >>> Magic aa55" and found someone else with the same issue long ago:
> > >>> https://serverfault.com/questions/580761/is-mdadm-raid-toast  Looks like
> > >>> his was caused by a RAID configuration option in BIOS.  I recall seeing
> > >>> that on mine; I must have activated it by accident when setting the boot
> > >>> drive or something.
> >
> >
> > I am a bit suspicious about this cause for two reasons:
> > - sde, sdf and sdg are affected even though they are connected to the
> > add-on Marvell SATA controller card which is supposed to be outside the
> > motherboard RAID scope;
> > - sdc is not affected even though it is connected to the onboard Intel
> > SATA controller.
> >
> > What was the content type of the RAID array?  LVM, LUKS, plain filesystem?
> >
>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Requesting help recovering my array
  2024-01-25 22:37                                               ` Roger Heflin
@ 2024-01-25 22:53                                                 ` Roger Heflin
  2024-01-25 23:00                                                   ` Roger Heflin
  0 siblings, 1 reply; 40+ messages in thread
From: Roger Heflin @ 2024-01-25 22:53 UTC (permalink / raw)
  To: RJ Marquette; +Cc: linux-raid

And given that the partition table probably eliminated the disk
superblocks, step #1 would be to overlay sdX (not sdX1), remove the
partition from the overlay, and begin testing.

The first test may be simply dd if=/dev/zero of=overlaydisk bs=256
count=8, as looking at the gpt partitions I have suggests that will
eliminate the header; then try an --examine and see if that finds
anything.  If that works you won't need to go into the assume-clean
stuff, which simplifies everything.

My checking says the gpt partition table seems to start 256 bytes in
and stop before about 2k in, before hex 1000 (4k) where the md header
seems to be located on mine.
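
Against an overlay device (never the real disk) that test would look
something like this -- the /dev/mapper name is just whatever the overlay
was called:

dd if=/dev/zero of=/dev/mapper/sdb-overlay bs=256 count=8   # wipe the first 2k of the overlay only
mdadm --examine /dev/mapper/sdb-overlay                     # see whether a superblock turns up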



On Thu, Jan 25, 2024 at 4:37 PM Roger Heflin <rogerheflin@gmail.com> wrote:
>
> If the one that is working right does not have a partition, then
> somehow the partitioning got added to the broken disks.  The
> partition is a gpt partition, which another reply indicated is about 16k,
> which would mean it overwrote the md header at 4k, and that header
> being overwritten would cause the disks to no longer be raid members.
>
> The create must have the disks in the correct order and with the
> correct parameters.  Doing a random create with a random order is
> unlikely to work, and may well make things unrecoverable.
>
> I believe there are instructions on some page about md repairing that
> talks about using overlays.   Using the overlays lets you create an
> overlay such that the underlying devices aren't written to such that
> you can test a number of different orders and parameters to find the
> one that works.
>
> I think this is the stuff about recovering and overlays that you want to follow.
>
> https://raid.wiki.kernel.org/index.php/Irreversible_mdadm_failure_recovery
>
>
> On Thu, Jan 25, 2024 at 12:41 PM RJ Marquette <rjm1@yahoo.com> wrote:
> >
> > No, this system does not have any other OS's installed on it, Debian Linux only, as it's my server.
> >
> > No, the three drives remained connected to the extra controller card and were never removed from that card - I just pulled the card out of the case with the connections intact, and swung it off to the side.  In fact they still haven't been removed.
> >
> > I don't understand the partitions comment, as 5 of the 6 drives do appear to have separate partitions for the data, and the one that doesn't is the only that seems to be responding normally.  I guess the theory is that whatever damaged the partition tables wrote a single primary partition to the drive in the process?
> >
> > I do not know what caused this problem.  I've had no reason to run fdisk or any similar utility on that computer in years.  I know we want to figure out why this happened, but I'd also like to recover my RAID, if possible.
> >
> > What are my options at this point?  Should I try something like this?  (This is for someone's RAID1 setup, obviously the level and drives would change for me.):
> >
> > mdadm --create --assume-clean --level=1 --raid-devices=2 /dev/md0 /dev/sda /dev/sdb
> >
> > That's from this page:  https://askubuntu.com/questions/1254561/md-raid-superblock-gets-deleted
> >
> > I'm currently running testdisk on one of the affected drives to see what that turns up.
> >
> > Thanks.
> > --RJ
> >
> > On Thursday, January 25, 2024 at 12:43:58 PM EST, Roger Heflin <rogerheflin@gmail.com> wrote:
> >
> >
> >
> >
> >
> > You never booted windows or any other non-linux boot image that might
> > have decided to "fix" the disk's missing partition tables?
> >
> > And when messing with the install did you move the disks around so
> > that some of the disk could have been on the intel controller with
> > raid set at different times?
> >
> > That specific model of marvell controller does not list raid support,
> > but other models in the same family do, so it may also have an option
> > in the bios that "fixes" the partition table.
> >
> > Any number of id10t-written tools may have wrongly decided that a
> > disk without a partition table needs to be fixed.  I know the windows
> > disk management used to (and may still) complain about no partitions and
> > prompt to "fix" it.
> >
> > I always run partitions on everything.  I have had the partition save
> > me when 2 different vendors' hardware raid controllers lost their
> > config (random crash, freaked out on a fw upgrade) and, when the config
> > was recreated, seemed to "helpfully" clear a few kb at the front of the
> > disk.  A rescue boot, repartition, mount of the os lv, and grub reinstall
> > fixed those.
> >
> > On Thu, Jan 25, 2024 at 9:17 AM RJ Marquette <rjm1@yahoo.com> wrote:
> > >
> > > It's an ext4 RAID5 array.  No LVM, LUKS, etc.
> > >
> > > You make a good point about the BIOS explanation - it seems to have affected only the 5 RAID drives that had data on them, not the spare, nor the other system drive (and the latter two are both connected to the motherboard).  How would it have decided to grab exactly those 5?
> > >
> > > Thanks.
> > > --RJ
> > >
> > >
> > > On Thursday, January 25, 2024 at 10:01:40 AM EST, Pascal Hambourg <pascal@plouf.fr.eu.org> wrote:
> > >
> > >
> > >
> > >
> > >
> > > On 25/01/2024 at 12:49, RJ Marquette wrote:
> > > > root@jackie:/home/rj# /sbin/fdisk -l /dev/sdb
> > > > Disk /dev/sdb: 2.73 TiB, 3000592982016 bytes, 5860533168 sectors
> > > > Disk model: Hitachi HUS72403
> > > > Units: sectors of 1 * 512 = 512 bytes
> > > > Sector size (logical/physical): 512 bytes / 4096 bytes
> > > > I/O size (minimum/optimal): 4096 bytes / 4096 bytes
> > > > Disklabel type: gpt
> > > > Disk identifier: AF5DC5DE-1404-4F4F-85AF-B5574CD9C627
> > > >
> > > > Device    Start        End    Sectors  Size Type
> > > > /dev/sdb1  2048 5860532223 5860530176  2.7T Microsoft basic data
> > > >
> > > > root@jackie:/home/rj# cat /sys/block/sdb/sdb1/start
> > > > 2048
> > > > root@jackie:/home/rj# cat /sys/block/sdb/sdb1/size
> > > > 5860530176
> > >
> > > The partition geometry looks correct, with standard alignment.
> > > And the kernel view of the partition matches the partition table.
> > > The partition type "Microsoft basic data" is neither "Linux RAID" nor
> > > the default type "Linux filesystem" set by usual GNU/Linux partitioning
> > > tools such as fdisk, parted and gdisk, so it seems unlikely that the
> > > partition was created with one of these tools.
> > >
> > >
> > > >>> It looks like this is what happened after all.  I searched for "MBR
> > > >>> Magic aa55" and found someone else with the same issue long ago:
> > > >>> https://serverfault.com/questions/580761/is-mdadm-raid-toast  Looks like
> > > >>> his was caused by a RAID configuration option in BIOS.  I recall seeing
> > > >>> that on mine; I must have activated it by accident when setting the boot
> > > >>> drive or something.
> > >
> > >
> > > I am a bit suspicious about this cause for two reasons:
> > > - sde, sdf and sdg are affected even though they are connected to the
> > > add-on Marvell SATA controller card which is supposed to be outside the
> > > motherboard RAID scope;
> > > - sdc is not affected even though it is connected to the onboard Intel
> > > SATA controller.
> > >
> > > What was the content type of the RAID array?  LVM, LUKS, plain filesystem?
> > >
> >

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Requesting help recovering my array
  2024-01-25 22:53                                                 ` Roger Heflin
@ 2024-01-25 23:00                                                   ` Roger Heflin
  2024-01-26 15:15                                                     ` RJ Marquette
  0 siblings, 1 reply; 40+ messages in thread
From: Roger Heflin @ 2024-01-25 23:00 UTC (permalink / raw)
  To: RJ Marquette; +Cc: linux-raid

Looking further, gpt may simply clear everything it could use.

So you may need to look at the first 16k, and clear that 16k on the
overlay before the test.
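
Something along these lines, again only against the overlay -- sdb-overlay
is just whatever name the overlay got:

dd if=/dev/mapper/sdb-overlay bs=512 count=32 2>/dev/null | hexdump -C   # look at the first 16k
dd if=/dev/zero of=/dev/mapper/sdb-overlay bs=512 count=32               # then clear that 16k
mdadm --examine /dev/mapper/sdb-overlay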

On Thu, Jan 25, 2024 at 4:53 PM Roger Heflin <rogerheflin@gmail.com> wrote:
>
> And given that the partition table probably eliminated the disk
> superblocks, step #1 would be to overlay sdX (not sdX1), remove the
> partition from the overlay, and begin testing.
>
> The first test may be simply dd if=/dev/zero of=overlaydisk bs=256
> count=8, as looking at the gpt partitions I have suggests that will
> eliminate the header; then try an --examine and see if that finds
> anything.  If that works you won't need to go into the assume-clean
> stuff, which simplifies everything.
>
> My checking says the gpt partition table seems to start 256 bytes in
> and stop before about 2k in, before hex 1000 (4k) where the md header
> seems to be located on mine.
>
>
>
> On Thu, Jan 25, 2024 at 4:37 PM Roger Heflin <rogerheflin@gmail.com> wrote:
> >
> > If the one that is working right does not have a partition, then
> > somehow the partitioning got added to the broken disks.  The
> > partition is a gpt partition, which another reply indicated is about 16k,
> > which would mean it overwrote the md header at 4k, and that header
> > being overwritten would cause the disks to no longer be raid members.
> >
> > The create must have the disks in the correct order and with the
> > correct parameters.  Doing a random create with a random order is
> > unlikely to work, and may well make things unrecoverable.
> >
> > I believe there are instructions on some page about md repairing that
> > talks about using overlays.   Using the overlays lets you create an
> > overlay such that the underlying devices aren't written to such that
> > you can test a number of different orders and parameters to find the
> > one that works.
> >
> > I think this is the stuff about recovering and overlays that you want to follow.
> >
> > https://raid.wiki.kernel.org/index.php/Irreversible_mdadm_failure_recovery
> >
> >
> > On Thu, Jan 25, 2024 at 12:41 PM RJ Marquette <rjm1@yahoo.com> wrote:
> > >
> > > No, this system does not have any other OS's installed on it, Debian Linux only, as it's my server.
> > >
> > > No, the three drives remained connected to the extra controller card and were never removed from that card - I just pulled the card out of the case with the connections intact, and swung it off to the side.  In fact they still haven't been removed.
> > >
> > > I don't understand the partitions comment, as 5 of the 6 drives do appear to have separate partitions for the data, and the one that doesn't is the only that seems to be responding normally.  I guess the theory is that whatever damaged the partition tables wrote a single primary partition to the drive in the process?
> > >
> > > I do not know what caused this problem.  I've had no reason to run fdisk or any similar utility on that computer in years.  I know we want to figure out why this happened, but I'd also like to recover my RAID, if possible.
> > >
> > > What are my options at this point?  Should I try something like this?  (This is for someone's RAID1 setup, obviously the level and drives would change for me.):
> > >
> > > mdadm --create --assume-clean --level=1 --raid-devices=2 /dev/md0 /dev/sda /dev/sdb
> > >
> > > That's from this page:  https://askubuntu.com/questions/1254561/md-raid-superblock-gets-deleted
> > >
> > > I'm currently running testdisk on one of the affected drives to see what that turns up.
> > >
> > > Thanks.
> > > --RJ
> > >
> > > On Thursday, January 25, 2024 at 12:43:58 PM EST, Roger Heflin <rogerheflin@gmail.com> wrote:
> > >
> > >
> > >
> > >
> > >
> > > You never booted windows or any other non-linux boot image that might
> > > have decided to "fix" the disk's missing partition tables?
> > >
> > > And when messing with the install did you move the disks around so
> > > that some of the disk could have been on the intel controller with
> > > raid set at different times?
> > >
> > > That specific model of marvell controller does not list raid support,
> > > but other models in the same family do, so it may also have an option
> > > in the bios that "fixes" the partition table.
> > >
> > > Any number of id10t-written tools may have wrongly decided that a
> > > disk without a partition table needs to be fixed.  I know the windows
> > > disk management used to (and may still) complain about no partitions and
> > > prompt to "fix" it.
> > >
> > > I always run partitions on everything.  I have had the partition save
> > > me when 2 different vendors' hardware raid controllers lost their
> > > config (random crash, freaked out on a fw upgrade) and, when the config
> > > was recreated, seemed to "helpfully" clear a few kb at the front of the
> > > disk.  A rescue boot, repartition, mount of the os lv, and grub reinstall
> > > fixed those.
> > >
> > > On Thu, Jan 25, 2024 at 9:17 AM RJ Marquette <rjm1@yahoo.com> wrote:
> > > >
> > > > It's an ext4 RAID5 array.  No LVM, LUKS, etc.
> > > >
> > > > You make a good point about the BIOS explanation - it seems to have affected only the 5 RAID drives that had data on them, not the spare, nor the other system drive (and the latter two are both connected to the motherboard).  How would it have decided to grab exactly those 5?
> > > >
> > > > Thanks.
> > > > --RJ
> > > >
> > > >
> > > > On Thursday, January 25, 2024 at 10:01:40 AM EST, Pascal Hambourg <pascal@plouf.fr.eu.org> wrote:
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On 25/01/2024 at 12:49, RJ Marquette wrote:
> > > > > root@jackie:/home/rj# /sbin/fdisk -l /dev/sdb
> > > > > Disk /dev/sdb: 2.73 TiB, 3000592982016 bytes, 5860533168 sectors
> > > > > Disk model: Hitachi HUS72403
> > > > > Units: sectors of 1 * 512 = 512 bytes
> > > > > Sector size (logical/physical): 512 bytes / 4096 bytes
> > > > > I/O size (minimum/optimal): 4096 bytes / 4096 bytes
> > > > > Disklabel type: gpt
> > > > > Disk identifier: AF5DC5DE-1404-4F4F-85AF-B5574CD9C627
> > > > >
> > > > > Device    Start        End    Sectors  Size Type
> > > > > /dev/sdb1  2048 5860532223 5860530176  2.7T Microsoft basic data
> > > > >
> > > > > root@jackie:/home/rj# cat /sys/block/sdb/sdb1/start
> > > > > 2048
> > > > > root@jackie:/home/rj# cat /sys/block/sdb/sdb1/size
> > > > > 5860530176
> > > >
> > > > The partition geometry looks correct, with standard alignment.
> > > > And the kernel view of the partition matches the partition table.
> > > > The partition type "Microsoft basic data" is neither "Linux RAID" nor
> > > > the default type "Linux filesystem" set by usual GNU/Linux partitioning
> > > > tools such as fdisk, parted and gdisk, so it seems unlikely that the
> > > > partition was created with one of these tools.
> > > >
> > > >
> > > > >>> It looks like this is what happened after all.  I searched for "MBR
> > > > >>> Magic aa55" and found someone else with the same issue long ago:
> > > > >>> https://serverfault.com/questions/580761/is-mdadm-raid-toast  Looks like
> > > > >>> his was caused by a RAID configuration option in BIOS.  I recall seeing
> > > > >>> that on mine; I must have activated it by accident when setting the boot
> > > > >>> drive or something.
> > > >
> > > >
> > > > I am a bit suspicious about this cause for two reasons:
> > > > - sde, sdf and sdg are affected even though they are connected to the
> > > > add-on Marvell SATA controller card which is supposed to be outside the
> > > > motherboard RAID scope;
> > > > - sdc is not affected even though it is connected to the onboard Intel
> > > > SATA controller.
> > > >
> > > > What was the content type of the RAID array?  LVM, LUKS, plain filesystem?
> > > >
> > >

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Requesting help recovering my array
  2024-01-25 23:00                                                   ` Roger Heflin
@ 2024-01-26 15:15                                                     ` RJ Marquette
  2024-01-26 15:25                                                       ` Reindl Harald
  2024-01-26 16:03                                                       ` RJ Marquette
  0 siblings, 2 replies; 40+ messages in thread
From: RJ Marquette @ 2024-01-26 15:15 UTC (permalink / raw)
  To: linux-raid

Thanks.

I'm not following all of what you're saying, and I suspect I'm beyond the point where it's going to help.  I tried the instructions on the page you linked, but I haven't hit the right combination of parameters and drive order to get a valid ext4 partition.  Sigh.  I will admit I couldn't get the overlay working, and figuring I was already likely not going to be able to recreate it, I went for it directly on the drives - still using readonly and assume clean, but of course it overwrites what was left of the partition tables - which I figured was no big loss at this point.  It didn't sound like my chances of recovery were very high anyway.

Basically I'm trying variations on this command - removing offset, using sd[x] instead of sd[x]1, changing the order of a and b, etc.  Would it be critical the spare drive be included?  I can't see why it would, so I've been ignoring it during these tests.  (The chunk and offset came from the info on the spare drive.)

mdadm --create /dev/md0 --level=5 --chunk=512K --data-offset=262144s --raid-devices=5 /dev/sda1 /dev/sdb1 /dev/sde1 /dev/sdf1 /dev/sdg1 --assume-clean --readonly
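
(Per attempt the test cycle looks roughly like this -- /mnt/test is an
arbitrary mount point, and the fsck.ext4 -n is an extra read-only sanity
check rather than something I've relied on:)

mount -o ro /dev/md0 /mnt/test    # "bad superblock" here means the guess is wrong
fsck.ext4 -n /dev/md0             # read-only check; should come back clean if the order is right
umount /mnt/test 2>/dev/null
mdadm --stop /dev/md0             # tear it down before trying the next order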

I really do wish I knew what happened to these drives (and I'm sure this group would, too).  I'm pretty sure it was something in the BIOS that caused it; I've seen a few reports of motherboards from this era (~2015) with bugs in UEFI in other brands that caused issues like this, but I couldn't find anything specific to ASUS.  Someone mentioned the possibility of something happening while it was running that would have cropped up when I rebooted (even on the old board), but it hadn't been that long since my previous reboot (~30 days IIRC), and as I mentioned I've had no cause to modify anything related to the system, aside from updating the software using apt.

After I recreate the array, I'll throw some data on it, and then reboot the computer to see what happens.  It'll probably be fine, but if it's munged again...well...that will certainly be an interesting outcome, to say the least.

Thanks for the suggestions, everyone.  I'm not planning to overwrite the array right this moment, so if you have other suggestions on things to try, I'm open to them.

Other thoughts as I process this:

I dunno.  At one point a few weeks ago, my /var partition (which isn't on the array) filled up, so that caused some weird issues with various things until I figured out what was going on.  I doubt that was a factor here, and the array was working just fine after the cleanup.

I do have backups of most of the stuff that was on the array - I will lose a bunch of our ripped DVDs and Blu-Rays, which is a headache to recreate, but not truly lost.  I believe I have copies of all of our recent pictures on my laptop or desktop machine; older stuff is stored on Amazon Glacier, if needed, but I think I have a local copy of most of it.  

I see a few groups of pictures I may have lost completely.  They were too new for being uploaded to Glacier, but too old to still be on my desktop or laptop.  I don't seem to have posted them online, either.  (I'm checking my Glacier inventory now to see if I did upload them at some point, but it's unlikely.)

I did lose some data for a hobby project, which is not at all critical or important in the grand scheme of things, but it is a bummer.  I've even been thinking about how I should back up that data.  Fortunately, I always enter the most important information into a database when new data came in, so I at least have that.

I used to have a script that backed up the array to a 3TB drive in my desktop machine, but as 7TB (used space in the array) is greater than 3TB, there was an obvious issue developing, so I stopped it a few years back, instead of paring down the stuff copied over.  Dang.

Thanks.
--RJ





On Thursday, January 25, 2024 at 06:01:18 PM EST, Roger Heflin <rogerheflin@gmail.com> wrote: 





Looking further, gpt may simply clear everything it could use.

So you may need to look at the first 16k, and clear that 16k on the
overlay before the test.

On Thu, Jan 25, 2024 at 4:53 PM Roger Heflin <rogerheflin@gmail.com> wrote:
>
> And given that the partition table probably eliminated the disk
> superblocks, step #1 would be to overlay sdX (not sdX1), remove the
> partition from the overlay, and begin testing.
>
> The first test may be simply dd if=/dev/zero of=overlaydisk bs=256
> count=8, as looking at the gpt partitions I have suggests that will
> eliminate the header; then try an --examine and see if that finds
> anything.  If that works you won't need to go into the assume-clean
> stuff, which simplifies everything.
>
> My checking says the gpt partition table seems to start 256 bytes in
> and stop before about 2k in, before hex 1000 (4k) where the md header
> seems to be located on mine.
>
>
>
> On Thu, Jan 25, 2024 at 4:37 PM Roger Heflin <rogerheflin@gmail.com> wrote:
> >
> > If the one that is working right does not have a partition, then
> > somehow the partitioning got added to the broken disks.  The
> > partition is a gpt partition, which another reply indicated is about 16k,
> > which would mean it overwrote the md header at 4k, and that header
> > being overwritten would cause the disks to no longer be raid members.
> >
> > The create must have the disks in the correct order and with the
> > correct parameters.  Doing a random create with a random order is
> > unlikely to work, and may well make things unrecoverable.
> >
> > I believe there are instructions on some page about md repairing that
> > talks about using overlays.  Using the overlays lets you create an
> > overlay such that the underlying devices aren't written to such that
> > you can test a number of different orders and parameters to find the
> > one that works.
> >
> > I think this is the stuff about recovering and overlays that you want to follow.
> >
> > https://raid.wiki.kernel.org/index.php/Irreversible_mdadm_failure_recovery
> >
> >
> > On Thu, Jan 25, 2024 at 12:41 PM RJ Marquette <rjm1@yahoo.com> wrote:
> > >
> > > No, this system does not have any other OS's installed on it, Debian Linux only, as it's my server.
> > >
> > > No, the three drives remained connected to the extra controller card and were never removed from that card - I just pulled the card out of the case with the connections intact, and swung it off to the side.  In fact they still haven't been removed.
> > >
> > > I don't understand the partitions comment, as 5 of the 6 drives do appear to have separate partitions for the data, and the one that doesn't is the only that seems to be responding normally.  I guess the theory is that whatever damaged the partition tables wrote a single primary partition to the drive in the process?
> > >
> > > I do not know what caused this problem.  I've had no reason to run fdisk or any similar utility on that computer in years.  I know we want to figure out why this happened, but I'd also like to recover my RAID, if possible.
> > >
> > > What are my options at this point?  Should I try something like this?  (This is for someone's RAID1 setup, obviously the level and drives would change for me.):
> > >
> > > mdadm --create --assume-clean --level=1 --raid-devices=2 /dev/md0 /dev/sda /dev/sdb
> > >
> > > That's from this page:  https://askubuntu.com/questions/1254561/md-raid-superblock-gets-deleted
> > >
> > > I'm currently running testdisk on one of the affected drives to see what that turns up.
> > >
> > > Thanks.
> > > --RJ
> > >
> > > On Thursday, January 25, 2024 at 12:43:58 PM EST, Roger Heflin <rogerheflin@gmail.com> wrote:
> > >
> > >
> > >
> > >
> > >
> > > You never booted windows or any other non-linux boot image that might
> > > have decided to "fix" the disk's missing partition tables?
> > >
> > > And when messing with the install did you move the disks around so
> > > that some of the disk could have been on the intel controller with
> > > raid set at different times?
> > >
> > > That specific model of marvell controller does not list raid support,
> > > but other models in the same family do, so it may also have an option
> > > in the bios that "fixes" the partition table.
> > >
> > > Any number of id10t-written tools may have wrongly decided that a
> > > disk without a partition table needs to be fixed.  I know the windows
> > > disk management used to (and may still) complain about no partitions and
> > > prompt to "fix" it.
> > >
> > > I always run partitions on everything.  I have had the partition save
> > > me when 2 different vendors' hardware raid controllers lost their
> > > config (random crash, freaked out on a fw upgrade) and, when the config
> > > was recreated, seemed to "helpfully" clear a few kb at the front of the
> > > disk.  A rescue boot, repartition, mount of the os lv, and grub reinstall
> > > fixed those.
> > >
> > > On Thu, Jan 25, 2024 at 9:17 AM RJ Marquette <rjm1@yahoo.com> wrote:
> > > >
> > > > It's an ext4 RAID5 array.  No LVM, LUKS, etc.
> > > >
> > > > You make a good point about the BIOS explanation - it seems to have affected only the 5 RAID drives that had data on them, not the spare, nor the other system drive (and the latter two are both connected to the motherboard).  How would it have decided to grab exactly those 5?
> > > >
> > > > Thanks.
> > > > --RJ
> > > >
> > > >
> > > > On Thursday, January 25, 2024 at 10:01:40 AM EST, Pascal Hambourg <pascal@plouf.fr.eu.org> wrote:
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On 25/01/2024 at 12:49, RJ Marquette wrote:
> > > > > root@jackie:/home/rj# /sbin/fdisk -l /dev/sdb
> > > > > Disk /dev/sdb: 2.73 TiB, 3000592982016 bytes, 5860533168 sectors
> > > > > Disk model: Hitachi HUS72403
> > > > > Units: sectors of 1 * 512 = 512 bytes
> > > > > Sector size (logical/physical): 512 bytes / 4096 bytes
> > > > > I/O size (minimum/optimal): 4096 bytes / 4096 bytes
> > > > > Disklabel type: gpt
> > > > > Disk identifier: AF5DC5DE-1404-4F4F-85AF-B5574CD9C627
> > > > >
> > > > > Device    Start        End    Sectors  Size Type
> > > > > /dev/sdb1  2048 5860532223 5860530176  2.7T Microsoft basic data
> > > > >
> > > > > root@jackie:/home/rj# cat /sys/block/sdb/sdb1/start
> > > > > 2048
> > > > > root@jackie:/home/rj# cat /sys/block/sdb/sdb1/size
> > > > > 5860530176
> > > >
> > > > The partition geometry looks correct, with standard alignment.
> > > > And the kernel view of the partition matches the partition table.
> > > > The partition type "Microsoft basic data" is neither "Linux RAID" nor
> > > > the default type "Linux filesystem" set by usual GNU/Linux partitioning
> > > > tools such as fdisk, parted and gdisk, so it seems unlikely that the
> > > > partition was created with one of these tools.
> > > >
> > > >
> > > > >>> It looks like this is what happened after all.  I searched for "MBR
> > > > >>> Magic aa55" and found someone else with the same issue long ago:
> > > > >>> https://serverfault.com/questions/580761/is-mdadm-raid-toast  Looks like
> > > > >>> his was caused by a RAID configuration option in BIOS.  I recall seeing
> > > > >>> that on mine; I must have activated it by accident when setting the boot
> > > > >>> drive or something.
> > > >
> > > >
> > > > I am a bit suspicious about this cause for two reasons:
> > > > - sde, sdf and sdg are affected even though they are connected to the
> > > > add-on Marvell SATA controller card which is supposed to be outside the
> > > > motherboard RAID scope;
> > > > - sdc is not affected even though it is connected to the onboard Intel
> > > > SATA controller.
> > > >
> > > > What was the content type of the RAID array?  LVM, LUKS, plain filesystem?
> > > >
> > >

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Requesting help recovering my array
  2024-01-26 15:15                                                     ` RJ Marquette
@ 2024-01-26 15:25                                                       ` Reindl Harald
  2024-01-26 16:03                                                       ` RJ Marquette
  1 sibling, 0 replies; 40+ messages in thread
From: Reindl Harald @ 2024-01-26 15:25 UTC (permalink / raw)
  To: RJ Marquette, linux-raid



Am 26.01.24 um 16:15 schrieb RJ Marquette:
> I do have backups of most of the stuff that was on the array - I will lose a bunch of our ripped DVDs and Blu-Rays, which is a headache to recreate, but not truly lost.  I believe I have copies of all of our recent pictures on my laptop or desktop machine; older stuff is stored on Amazon Glacier, if needed, but I think I have a local copy of most of it.
> 
> I see a few groups of pictures I may have lost completely.  They were too new for being uploaded to Glacier, but too old to still be on my desktop or laptop.  I don't seem to have posted them online, either.  (I'm checking my Glacier inventory now to see if I did upload them at some point, but it's unlikely.)

so before "Yesterday, I swapped a newer motherboard into the computer" 
you didn't do a recent backup of your data and "I may try swapping back 
to the old motherboard" implies the repalcement was planned and not 
caused by a hardware failure?

a) never use unpartitioned drives anywhere
b) always make recent backups before touch ardware

two lessons where the first would have preveneted the problem at all and 
the second makes sure that whatever happens you have a backup of everything

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Requesting help recovering my array
  2024-01-26 15:15                                                     ` RJ Marquette
  2024-01-26 15:25                                                       ` Reindl Harald
@ 2024-01-26 16:03                                                       ` RJ Marquette
  2024-01-26 23:45                                                         ` RJ Marquette
  1 sibling, 1 reply; 40+ messages in thread
From: RJ Marquette @ 2024-01-26 16:03 UTC (permalink / raw)
  To: linux-raid

HOLY...  I GOT IT!  WOW.

I guess all I needed to do was abandon hope.

The magic spell that worked (sorry, I know it's not magic to the developers on this list, but it definitely feels that way to me right now):

mdadm --create /dev/md0 --level=5 --chunk=512K --data-offset=262144s --raid-devices=5 /dev/sde /dev/sdf /dev/sdg /dev/sda /dev/sdb --assume-clean --readonly

I'm copying the stuff I thought I lost now, while it's still accessible in RO mode!
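
(For anyone following along, the copy is something like this -- the mount
point and destination paths are placeholders, and rsync is just one way to
do it:)

mount -o ro /dev/md0 /mnt/raid
rsync -a --info=progress2 /mnt/raid/pictures/ /mnt/backup/pictures/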

The previous command I tried swapped sdb and sda, and I got a different error when I tried to mount it.  The previous attempts all gave me the "bad superblock" error upon mount attempt, but that attempt said something like filesystem errors.  So I made a note of it, then tried the one above, and it mounted cleanly.  No concerns at all.

(Sorry, I'm sure this is normally a very stoic list, but, I'm sure you understand my excitement here.)

I assume, once I'm comfortable with my backups, I can unmount it, stop it, then use mdadm --assemble to load it again with read/write access. I've already updated the mdadm.conf with the new uuid, so I think it should work automatically then.
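
(Something like the following, assuming Debian's /etc/mdadm/mdadm.conf and
an arbitrary mount point; the update-initramfs step is my guess at making
sure the initramfs copy of the config picks up the new UUID:)

umount /mnt/raid
mdadm --stop /dev/md0
mdadm --assemble /dev/md0 /dev/sde /dev/sdf /dev/sdg /dev/sda /dev/sdb
mdadm --detail --scan      # compare against the ARRAY line in /etc/mdadm/mdadm.conf
update-initramfs -u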

Yes, I will review my backup plan and make improvements.  The plan was good, in that I wasn't going to lose most of the critical stuff, but not great, in that I was going to lose some.

(And, yes, I still wonder why it happened in the first place, so I will keep that in mind.  My guess is that if the array survives the next reboot, it'll probably be fine for a long time.  If it doesn't, then it's clearly something in the motherboard trashing it.  Not great, but at least now I'll know how to resurrect it when I swap in the previous motherboard.)

THANK YOU EVERYONE!

--RJ




On Friday, January 26, 2024 at 10:15:16 AM EST, RJ Marquette <rjm1@yahoo.com> wrote: 





Thanks.

I'm not following all of what you're saying, and I suspect I'm beyond the point where it's going to help.  I tried the instructions on the page you linked, but I haven't hit the right combination of parameters and drive order to get a valid ext4 partition.  Sigh.  I will admit I couldn't get the overlay working, and figuring I was already likely not going to be able to recreate it, I went for it directly on the drives - still using readonly and assume clean, but of course it overwrites what was left of the partition tables - which I figured was no big loss at this point.  It didn't sound like my chances of recovery were very high anyway.

Basically I'm trying variations on this command - removing offset, using sd[x] instead of sd[x]1, changing the order of a and b, etc.  Would it be critical the spare drive be included?  I can't see why it would, so I've been ignoring it during these tests.  (The chunk and offset came from the info on the spare drive.)

mdadm --create /dev/md0 --level=5 --chunk=512K --data-offset=262144s --raid-devices=5 /dev/sda1 /dev/sdb1 /dev/sde1 /dev/sdf1 /dev/sdg1 --assume-clean --readonly

I really do wish I knew what happened to these drives (and I'm sure this group would, too).  I'm pretty sure it was something in the BIOS that caused it; I've seen a few reports of motherboards from this era (~2015) with bugs in UEFI in other brands that caused issues like this, but I couldn't find anything specific to ASUS.  Someone mentioned the possibility of something happening while it was running that would have cropped up when I rebooted (even on the old board), but it hadn't been that long since my previous reboot (~30 days IIRC), and as I mentioned I've had no cause to modify anything related to the system, aside from updating the software using apt.

After I recreate the array, I'll throw some data on it, and then reboot the computer to see what happens.  It'll probably be fine, but if it's munged again...well...that will certainly be an interesting outcome, to say the least.

Thanks for the suggestions, everyone.  I'm not planning to overwrite the array right this moment, so if you have other suggestions on things to try, I'm open to them.

Other thoughts as I process this:

I dunno.  At one point a few weeks ago, my /var partition (which isn't on the array) filled up, so that caused some weird issues with various things until I figured out what was going on.  I doubt that was a factor here, and the array was working just fine after the cleanup.

I do have backups of most of the stuff that was on the array - I will lose a bunch of our ripped DVDs and Blu-Rays, which is a headache to recreate, but not truly lost.  I believe I have copies of all of our recent pictures on my laptop or desktop machine; older stuff is stored on Amazon Glacier, if needed, but I think I have a local copy of most of it.  

I see a few groups of pictures I may have lost completely.  They were too new for being uploaded to Glacier, but too old to still be on my desktop or laptop.  I don't seem to have posted them online, either.  (I'm checking my Glacier inventory now to see if I did upload them at some point, but it's unlikely.)

I did lose some data for a hobby project, which is not at all critical or important in the grand scheme of things, but it is a bummer.  I've even been thinking about how I should back up that data.  Fortunately, I always enter the most important information into a database when new data came in, so I at least have that.

I used to have a script that backed up the array to a 3TB drive in my desktop machine, but as 7TB (used space in the array) is greater than 3TB, there was an obvious issue developing, so I stopped it a few years back, instead of paring down the stuff copied over.  Dang.

Thanks.
--RJ





On Thursday, January 25, 2024 at 06:01:18 PM EST, Roger Heflin <rogerheflin@gmail.com> wrote: 





Looking further, gpt may simply clear everything it could use.

So you may need to look at the first 16k, and clear that 16k on the
overlay before the test.

On Thu, Jan 25, 2024 at 4:53 PM Roger Heflin <rogerheflin@gmail.com> wrote:
>
> And given that the partition table probably eliminated the disk
> superblocks, step #1 would be to overlay sdX (not sdX1), remove the
> partition from the overlay, and begin testing.
>
> The first test may be simply dd if=/dev/zero of=overlaydisk bs=256
> count=8, as looking at the gpt partitions I have suggests that will
> eliminate the header; then try an --examine and see if that finds
> anything.  If that works you won't need to go into the assume-clean
> stuff, which simplifies everything.
>
> My checking says the gpt partition table seems to start 256 bytes in
> and stop before about 2k in, before hex 1000 (4k) where the md header
> seems to be located on mine.
>
>
>
> On Thu, Jan 25, 2024 at 4:37 PM Roger Heflin <rogerheflin@gmail.com> wrote:
> >
> > If the one that is working right does not have a partition, then
> > somehow the partitioning got added to the broken disks.  The
> > partition is a gpt partition, which another reply indicated is about 16k,
> > which would mean it overwrote the md header at 4k, and that header
> > being overwritten would cause the disks to no longer be raid members.
> >
> > The create must have the disks in the correct order and with the
> > correct parameters.  Doing a random create with a random order is
> > unlikely to work, and may well make things unrecoverable.
> >
> > I believe there are instructions on some page about md repairing that
> > talks about using overlays.  Using the overlays lets you create an
> > overlay such that the underlying devices aren't written to such that
> > you can test a number of different orders and parameters to find the
> > one that works.
> >
> > I think this is the stuff about recovering and overlays that you want to follow.
> >
> > https://raid.wiki.kernel.org/index.php/Irreversible_mdadm_failure_recovery
> >
> >
> > On Thu, Jan 25, 2024 at 12:41 PM RJ Marquette <rjm1@yahoo.com> wrote:
> > >
> > > No, this system does not have any other OS's installed on it, Debian Linux only, as it's my server.
> > >
> > > No, the three drives remained connected to the extra controller card and were never removed from that card - I just pulled the card out of the case with the connections intact, and swung it off to the side.  In fact they still haven't been removed.
> > >
> > > I don't understand the partitions comment, as 5 of the 6 drives do appear to have separate partitions for the data, and the one that doesn't is the only that seems to be responding normally.  I guess the theory is that whatever damaged the partition tables wrote a single primary partition to the drive in the process?
> > >
> > > I do not know what caused this problem.  I've had no reason to run fdisk or any similar utility on that computer in years.  I know we want to figure out why this happened, but I'd also like to recover my RAID, if possible.
> > >
> > > What are my options at this point?  Should I try something like this?  (This is for someone's RAID1 setup, obviously the level and drives would change for me.):
> > >
> > > mdadm --create --assume-clean --level=1 --raid-devices=2 /dev/md0 /dev/sda /dev/sdb
> > >
> > > That's from this page:  https://askubuntu.com/questions/1254561/md-raid-superblock-gets-deleted
> > >
> > > I'm currently running testdisk on one of the affected drives to see what that turns up.
> > >
> > > Thanks.
> > > --RJ
> > >
> > > On Thursday, January 25, 2024 at 12:43:58 PM EST, Roger Heflin <rogerheflin@gmail.com> wrote:
> > >
> > >
> > >
> > >
> > >
> > > You never booted windows or any other non-linux boot image that might
> > > have decided to "fix" the disk's missing partition tables?
> > >
> > > And when messing with the install did you move the disks around so
> > > that some of the disk could have been on the intel controller with
> > > raid set at different times?
> > >
> > > That specific model of marvell controller does not list raid support,
> > > but other models in the same family do, so it may also have an option
> > > in the bios that "fixes" the partition table.
> > >
> > > Any number of id10t-written tools may have wrongly decided that a
> > > disk without a partition table needs to be fixed.  I know the windows
> > > disk management used to (and may still) complain about no partitions and
> > > prompt to "fix" it.
> > >
> > > I always run partitions on everything.  I have had the partition save
> > > me when 2 different vendors' hardware raid controllers lost their
> > > config (random crash, freaked out on a fw upgrade) and, when the config
> > > was recreated, seemed to "helpfully" clear a few kb at the front of the
> > > disk.  A rescue boot, repartition, mount of the os lv, and grub reinstall
> > > fixed those.
> > >
> > > On Thu, Jan 25, 2024 at 9:17 AM RJ Marquette <rjm1@yahoo.com> wrote:
> > > >
> > > > It's an ext4 RAID5 array.  No LVM, LUKS, etc.
> > > >
> > > > You make a good point about the BIOS explanation - it seems to have affected only the 5 RAID drives that had data on them, not the spare, nor the other system drive (and the latter two are both connected to the motherboard).  How would it have decided to grab exactly those 5?
> > > >
> > > > Thanks.
> > > > --RJ
> > > >
> > > >
> > > > On Thursday, January 25, 2024 at 10:01:40 AM EST, Pascal Hambourg <pascal@plouf.fr.eu.org> wrote:
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On 25/01/2024 at 12:49, RJ Marquette wrote:
> > > > > root@jackie:/home/rj# /sbin/fdisk -l /dev/sdb
> > > > > Disk /dev/sdb: 2.73 TiB, 3000592982016 bytes, 5860533168 sectors
> > > > > Disk model: Hitachi HUS72403
> > > > > Units: sectors of 1 * 512 = 512 bytes
> > > > > Sector size (logical/physical): 512 bytes / 4096 bytes
> > > > > I/O size (minimum/optimal): 4096 bytes / 4096 bytes
> > > > > Disklabel type: gpt
> > > > > Disk identifier: AF5DC5DE-1404-4F4F-85AF-B5574CD9C627
> > > > >
> > > > > Device    Start        End    Sectors  Size Type
> > > > > /dev/sdb1  2048 5860532223 5860530176  2.7T Microsoft basic data
> > > > >
> > > > > root@jackie:/home/rj# cat /sys/block/sdb/sdb1/start
> > > > > 2048
> > > > > root@jackie:/home/rj# cat /sys/block/sdb/sdb1/size
> > > > > 5860530176
> > > >
> > > > The partition geometry looks correct, with standard alignment.
> > > > And the kernel view of the partition matches the partition table.
> > > > The partition type "Microsoft basic data" is neither "Linux RAID" nor
> > > > the default type "Linux filesystem" set by usual GNU/Linux partitioning
> > > > tools such as fdisk, parted and gdisk, so it seems unlikely that the
> > > > partition was created with one of these tools.
> > > >
> > > >
> > > > >>> It looks like this is what happened after all.  I searched for "MBR
> > > > >>> Magic aa55" and found someone else with the same issue long ago:
> > > > >>> https://serverfault.com/questions/580761/is-mdadm-raid-toast  Looks like
> > > > >>> his was caused by a RAID configuration option in BIOS.  I recall seeing
> > > > >>> that on mine; I must have activated it by accident when setting the boot
> > > > >>> drive or something.
> > > >
> > > >
> > > > I am a bit suspicious about this cause for two reasons:
> > > > - sde, sdf and sdg are affected even though they are connected to the
> > > > add-on Marvell SATA controller card which is supposed to be outside the
> > > > motherboard RAID scope;
> > > > - sdc is not affected even though it is connected to the onboard Intel
> > > > SATA controller.
> > > >
> > > > What was the content type of the RAID array?  LVM, LUKS, plain filesystem?
> > > >
> > >

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Requesting help recovering my array
  2024-01-26 16:03                                                       ` RJ Marquette
@ 2024-01-26 23:45                                                         ` RJ Marquette
  2024-01-27  8:41                                                           ` Pascal Hambourg
  2024-02-19 20:48                                                           ` Pascal Hambourg
  0 siblings, 2 replies; 40+ messages in thread
From: RJ Marquette @ 2024-01-26 23:45 UTC (permalink / raw)
  To: linux-raid

Quick follow up:  When I rebooted, the partition tables got munged again.  Definitely a BIOS issue.  I have a 10TB drive on order, so I'll copy everything off, then rebuild the array in the recommended format, with partitions (though one wonders if I even need an array when a single drive can hold everything...), and see what happens then.
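
(Roughly the plan for the rebuild -- FD00 is the "Linux RAID" partition
type in sgdisk, and the drive letters are only examples since they tend
to move around:)

for d in sda sdb sde sdf sdg; do
    sgdisk --zap-all /dev/$d                     # wipe the old GPT/MBR structures
    sgdisk -n 1:0:0 -t 1:FD00 -c 1:raid /dev/$d  # one whole-disk partition, type Linux RAID
done
mdadm --create /dev/md0 --level=5 --raid-devices=5 /dev/sd[abefg]1
mkfs.ext4 /dev/md0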

Thanks.
--RJ


On Friday, January 26, 2024 at 11:03:38 AM EST, RJ Marquette <rjm1@yahoo.com> wrote: 





HOLY...  I GOT IT!  WOW.

I guess all I needed to do was abandon hope.

The magic spell that worked (sorry, I know it's not magic to the developers on this list, but it definitely feels that way to me right now):

mdadm --create /dev/md0 --level=5 --chunk=512K --data-offset=262144s --raid-devices=5 /dev/sde /dev/sdf /dev/sdg /dev/sda /dev/sdb --assume-clean --readonly

I'm copying the stuff I thought I lost now, while it's still accessible in RO mode!

The previous command I tried swapped sdb and sda, and I got a different error when I tried to mount it.  The previous attempts all gave me the "bad superblock" error upon mount attempt, but that attempt said something like filesystem errors.  So I made a note of it, then tried the one above, and it mounted cleanly.  No concerns at all.

(Sorry, I'm sure this is normally a very stoic list, but, I'm sure you understand my excitement here.)

I assume, once I'm comfortable with my backups, I can unmount it, stop it, then use mdadm --assemble to load it again with read/write access. I've already updated the mdadm.conf with the new uuid, so I think it should work automatically then.

Yes, I will review my backup plan and make improvements.  The plan was good, in that I wasn't going to lose most of the critical stuff, but not great, in that I was going to lose some.

(And, yes, I still wonder why it happened in the first place, so I will keep that in mind.  My guess is that if the array survives the next reboot, it'll probably be fine for a long time.  If it doesn't, then it's clearly something in the motherboard trashing it.  Not great, but at least now I'll know how to resurrect it when I swap in the previous motherboard.)

THANK YOU EVERYONE!

--RJ




On Friday, January 26, 2024 at 10:15:16 AM EST, RJ Marquette <rjm1@yahoo.com> wrote: 





Thanks.

I'm not following all of what you're saying, and I suspect I'm beyond the point where it's going to help.  I tried the instructions on the page you linked, but I haven't hit the right combination of parameters and drive order to get a valid ext4 partition.  Sigh.  I will admit I couldn't get the overlay working, and figuring I was already likely not going to be able to recreate it, I went for it directly on the drives - still using readonly and assume clean, but of course it overwrites what was left of the partition tables - which I figured was no big loss at this point.  It didn't sound like my chances of recovery were very high anyway.

Basically I'm trying variations on this command - removing offset, using sd[x] instead of sd[x]1, changing the order of a and b, etc.  Would it be critical the spare drive be included?  I can't see why it would, so I've been ignoring it during these tests.  (The chunk and offset came from the info on the spare drive.)

mdadm --create /dev/md0 --level=5 --chunk=512K --data-offset=262144s --raid-devices=5 /dev/sda1 /dev/sdb1 /dev/sde1 /dev/sdf1 /dev/sdg1 --assume-clean --readonly

I really do wish I knew what happened to these drives (and I'm sure this group would, too).  I'm pretty sure it was something in the BIOS that caused it; I've seen a few reports of motherboards from this era (~2015) with bugs in UEFI in other brands that caused issues like this, but I couldn't find anything specific to ASUS.  Someone mentioned the possibility of something happening while it was running that would have cropped up when I rebooted (even on the old board), but it hadn't been that long since my previous reboot (~30 days IIRC), and as I mentioned I've had no cause to modify anything related to the system, aside from updating the software using apt.

After I recreate the array, I'll throw some data on it, and then reboot the computer to see what happens.  It'll probably be fine, but if it's munged again...well...that will certainly be an interesting outcome, to say the least.

Thanks for the suggestions, everyone.  I'm not planning to overwrite the array right this moment, so if you have other suggestions on things to try, I'm open to them.

Other thoughts as I process this:

I dunno.  At one point a few weeks ago, my /var partition (which isn't on the array) filled up, so that caused some weird issues with various things until I figured out what was going on.  I doubt that was a factor here, and the array was working just fine after the cleanup.

I do have backups of most of the stuff that was on the array - I will lose a bunch of our ripped DVDs and Blu-Rays, which is a headache to recreate, but not truly lost.  I believe I have copies of all of our recent pictures on my laptop or desktop machine; older stuff is stored on Amazon Glacier, if needed, but I think I have a local copy of most of it.  

I see a few groups of pictures I may have lost completely.  They were too new for being uploaded to Glacier, but too old to still be on my desktop or laptop.  I don't seem to have posted them online, either.  (I'm checking my Glacier inventory now to see if I did upload them at some point, but it's unlikely.)

I did lose some data for a hobby project, which is not at all critical or important in the grand scheme of things, but it is a bummer.  I've even been thinking about how I should back up that data.  Fortunately, I always enter the most important information into a database when new data came in, so I at least have that.

I used to have a script that backed up the array to a 3TB drive in my desktop machine, but as 7TB (used space in the array) is greater than 3TB, there was an obvious issue developing, so I stopped it a few years back, instead of paring down the stuff copied over.  Dang.

Thanks.
--RJ





On Thursday, January 25, 2024 at 06:01:18 PM EST, Roger Heflin <rogerheflin@gmail.com> wrote: 





Looking further, gpt may simply clear everything it could use.

So you may need to look at the first 16k, and clear that 16k on the
overlay before the test.

On Thu, Jan 25, 2024 at 4:53 PM Roger Heflin <rogerheflin@gmail.com> wrote:
>
> And given that the partition table probably eliminated the disk superblocks,
> step #1 would be to overlay sdX (not sdX1), remove the partition
> from the overlay, and begin testing.
>
> The first test may simply be dd if=/dev/zero of=overlaydisk bs=256
> count=8.  Looking at the gpt partitions I have, that should eliminate
> the gpt header; then try an --examine and see if that finds anything.
> If that works you won't need to go into the assume-clean stuff, which
> simplifies everything.
>
> My checking says the gpt partition table seems to start 256 bytes in
> and stop before about 2k, i.e. before hex 1000 (4k) where the md
> header seems to be located on mine.
>
>
>
> On Thu, Jan 25, 2024 at 4:37 PM Roger Heflin <rogerheflin@gmail.com> wrote:
> >
> > If the one that is working right does not have a partition, then
> > somehow the partitioning got added to the broken disks.  The
> > partition is a gpt partition, which another poster indicated is about
> > 16k, which would mean it overwrote the md header at 4k, and that
> > header being overwritten would cause the disks to no longer be raid
> > members.
> >
> > The create must have the disks in the correct order and with the
> > correct parameters.  Doing a random create with a random order is
> > unlikely to work, and may well make things unrecoverable.
> >
> > I believe there are instructions on a page about md repair that
> > talks about using overlays.  Using overlays lets you run the tests
> > against an overlay device so the underlying disks aren't written to,
> > which means you can try a number of different orders and parameters
> > to find the combination that works.
> >
> > I think this is the stuff about recovering and overlays that you want to follow.
> >
> > https://raid.wiki.kernel.org/index.php/Irreversible_mdadm_failure_recovery
> >
> >
> > On Thu, Jan 25, 2024 at 12:41 PM RJ Marquette <rjm1@yahoo.com> wrote:
> > >
> > > No, this system does not have any other OS's installed on it, Debian Linux only, as it's my server.
> > >
> > > No, the three drives remained connected to the extra controller card and were never removed from that card - I just pulled the card out of the case with the connections intact, and swung it off to the side.  In fact they still haven't been removed.
> > >
> > > I don't understand the partitions comment, as 5 of the 6 drives do appear to have separate partitions for the data, and the one that doesn't is the only one that seems to be responding normally.  I guess the theory is that whatever damaged the partition tables wrote a single primary partition to each drive in the process?
> > >
> > > I do not know what caused this problem.  I've had no reason to run fdisk or any similar utility on that computer in years.  I know we want to figure out why this happened, but I'd also like to recover my RAID, if possible.
> > >
> > > What are my options at this point?  Should I try something like this?  (This is from someone's RAID1 setup; obviously the level and drives would change for me.):
> > >
> > > mdadm --create --assume-clean --level=1 --raid-devices=2 /dev/md0 /dev/sda /dev/sdb
> > >
> > > That's from this page:  https://askubuntu.com/questions/1254561/md-raid-superblock-gets-deleted
> > >
> > > I'm currently running testdisk on one of the affected drives to see what that turns up.
> > >
> > > Thanks.
> > > --RJ
> > >
> > > On Thursday, January 25, 2024 at 12:43:58 PM EST, Roger Heflin <rogerheflin@gmail.com> wrote:
> > >
> > >
> > >
> > >
> > >
> > > You never booted Windows or any other non-Linux boot image that might
> > > have decided to "fix" the disks' missing partition tables?
> > >
> > > And when messing with the install, did you move the disks around so
> > > that some of them could have been on the Intel controller, with RAID
> > > enabled, at different times?
> > >
> > > That specific model of Marvell controller does not list RAID support,
> > > but other models in the same family do, so it may also have an option
> > > in the BIOS that "fixes" the partition table.
> > >
> > > Any number of id10ts writing tools may have wrongly decided that a
> > > disk without a partition table needs to be "fixed".  I know the Windows
> > > disk management used to (and may still) complain about the lack of
> > > partitions and prompt to "fix" it.
> > >
> > > I always run partitions on everything.  Having partitions saved me
> > > when two different vendors' hardware raid controllers lost their
> > > config (one after a random crash, one freaked out on a fw upgrade)
> > > and, when the config was recreated, seemed to "helpfully" clear a few
> > > kb at the front of the disk.  A rescue boot, repartition, mount of
> > > the os lv, and grub reinstall fixed those.
> > >
> > > On Thu, Jan 25, 2024 at 9:17 AM RJ Marquette <rjm1@yahoo.com> wrote:
> > > >
> > > > It's an ext4 RAID5 array.  No LVM, LUKS, etc.
> > > >
> > > > You make a good point about the BIOS explanation - it seems to have affected only the 5 RAID drives that had data on them, not the spare, nor the other system drive (and the latter two are both connected to the motherboard).  How would it have decided to grab exactly those 5?
> > > >
> > > > Thanks.
> > > > --RJ
> > > >
> > > >
> > > > On Thursday, January 25, 2024 at 10:01:40 AM EST, Pascal Hambourg <pascal@plouf.fr.eu.org> wrote:
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On 25/01/2024 at 12:49, RJ Marquette wrote:
> > > > > root@jackie:/home/rj# /sbin/fdisk -l /dev/sdb
> > > > > Disk /dev/sdb: 2.73 TiB, 3000592982016 bytes, 5860533168 sectors
> > > > > Disk model: Hitachi HUS72403
> > > > > Units: sectors of 1 * 512 = 512 bytes
> > > > > Sector size (logical/physical): 512 bytes / 4096 bytes
> > > > > I/O size (minimum/optimal): 4096 bytes / 4096 bytes
> > > > > Disklabel type: gpt
> > > > > Disk identifier: AF5DC5DE-1404-4F4F-85AF-B5574CD9C627
> > > > >
> > > > > Device    Start        End    Sectors  Size Type
> > > > > /dev/sdb1  2048 5860532223 5860530176  2.7T Microsoft basic data
> > > > >
> > > > > root@jackie:/home/rj# cat /sys/block/sdb/sdb1/start
> > > > > 2048
> > > > > root@jackie:/home/rj# cat /sys/block/sdb/sdb1/size
> > > > > 5860530176
> > > >
> > > > The partition geometry looks correct, with standard alignment.
> > > > And the kernel view of the partition matches the partition table.
> > > > The partition type "Microsoft basic data" is neither "Linux RAID" nor
> > > > the default type "Linux filesystem" set by the usual GNU/Linux partitioning
> > > > tools such as fdisk, parted and gdisk, so it seems unlikely that the
> > > > partition was created with one of these tools.
> > > >
> > > >
> > > > >>> It looks like this is what happened after all.  I searched for "MBR
> > > > >>> Magic aa55" and found someone else with the same issue long ago:
> > > > >>> https://serverfault.com/questions/580761/is-mdadm-raid-toast  Looks like
> > > > >>> his was caused by a RAID configuration option in BIOS.  I recall seeing
> > > > >>> that on mine; I must have activated it by accident when setting the boot
> > > > >>> drive or something.
> > > >
> > > >
> > > > I am a bit suspicious about this cause for two reasons:
> > > > - sde, sdf and sdg are affected even though they are connected to the
> > > > add-on Marvell SATA controller card which is supposed to be outside the
> > > > motherboard RAID scope;
> > > > - sdc is not affected even though it is connected to the onboard Intel
> > > > SATA controller.
> > > >
> > > > What was the content type of the RAID array? LVM, LUKS, plain filesystem?
> > > >
> > >

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Requesting help recovering my array
  2024-01-26 23:45                                                         ` RJ Marquette
@ 2024-01-27  8:41                                                           ` Pascal Hambourg
  2024-01-27 12:30                                                             ` RJ Marquette
  2024-02-19 20:48                                                           ` Pascal Hambourg
  1 sibling, 1 reply; 40+ messages in thread
From: Pascal Hambourg @ 2024-01-27  8:41 UTC (permalink / raw)
  To: linux-raid

On 27/01/2024 at 00:45, RJ Marquette wrote:
> Quick follow up:  When I rebooted, the partition tables got munged
> again.  Definitely a BIOS issue.  I have a 10TB drive on order, so I'll
> copy everything off, then rebuild the array in the recommended format
> with partitions, and see what happens then.

You should be able to rebuild the array on top of the partitions by 
subtracting the partition offset from the data offset. If the partitions 
all begin at sector 2048:

--data-offset=$((262144-2048))s
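For concreteness, a sketch of the full command under those assumptions
(RAID5, 512K chunk, metadata 1.2, the same five members now addressed as
partitions; the device names below are examples and must be given in the
original order):

mdadm --create /dev/md0 --level=5 --chunk=512K --raid-devices=5 \
      --data-offset=$((262144-2048))s --assume-clean --readonly \
      /dev/sda1 /dev/sdb1 /dev/sde1 /dev/sdf1 /dev/sdg1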

Beware that /dev/sd* names are not always persistent across reboots, so 
check that the disks are in the same order as during the previous boot.
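One way to sidestep the sd* ordering question, assuming the usual
udev-generated links are present (the exact link names vary per drive):

ls -l /dev/disk/by-id/ | grep -v part   # stable per-drive names (ata-*, wwn-*)
# Use those links (or their -part1 suffixes) in the mdadm command instead
# of the /dev/sdX names.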

> (though one wonders if I even need an array when a single drive can
> hold everything...).

RAID5 provides disk fault tolerance. If you only need disk aggregation, 
you could use RAID0 or LVM instead.
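A minimal LVM linear-aggregation sketch, if that route were ever taken;
the device, VG and LV names below are only examples:

pvcreate /dev/sdX1 /dev/sdY1
vgcreate media_vg /dev/sdX1 /dev/sdY1
lvcreate -l 100%FREE -n media media_vg
mkfs.ext4 /dev/media_vg/media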

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Requesting help recovering my array
  2024-01-27  8:41                                                           ` Pascal Hambourg
@ 2024-01-27 12:30                                                             ` RJ Marquette
  0 siblings, 0 replies; 40+ messages in thread
From: RJ Marquette @ 2024-01-27 12:30 UTC (permalink / raw)
  To: linux-raid

On Saturday, January 27, 2024 at 04:08:21 AM EST, Pascal Hambourg <pascal@plouf.fr.eu.org> wrote:

> You should be able to rebuild the array on top of the partitions by
> subtracting the partition offset from the data offset. If the partitions
> all begin at sector 2048:

> --data-offset=$((262144-2048))s

Thanks.  I might try that after I get the new drive... and back everything up.

> Beware that /dev/sd* names are not always persistent across reboots, so
> check that the disks are in the same order as during the previous boot.

Yeah, so far I haven't had that problem, but I do know it's possible. 

> RAID5 provides disk fault tolerance. If you only need disk aggregation,
> you could use RAID0 or LVM instead.

Well, the original reason for this array when I built it was that huge drives, like 12 TB (the effective capacity of the array), weren't affordable, if they were available at all.  That's the point I was trying to make - the need for an array to get 12 TB of storage has gone by the wayside; I can just buy a 12 TB drive.  I have no plan to move away from the array for now, though; I figure I'll keep using it until drives start dying or I truly fill it up, and then decide.

Thanks.
--RJ



^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: Requesting help recovering my array
  2024-01-26 23:45                                                         ` RJ Marquette
  2024-01-27  8:41                                                           ` Pascal Hambourg
@ 2024-02-19 20:48                                                           ` Pascal Hambourg
  1 sibling, 0 replies; 40+ messages in thread
From: Pascal Hambourg @ 2024-02-19 20:48 UTC (permalink / raw)
  To: RJ Marquette, linux-raid

Hello RJ and others,

On 27/01/2024 at 00:45, RJ Marquette wrote:
> Quick follow up:  When I rebooted, the partition tables got munged
> again.  Definitely a BIOS issue.


Today I came across a similar issue with a RAID1 array of two 
unpartitioned drives where the RAID superblock on one drive was 
overwritten with a GPT partition table at every boot, but the superblock 
on the other drive was not affected.

It turns out that the affected drive had remnants of a GPT partition 
table (protective MBR at the beginning of the drive and secondary GPT 
header and table at the end of the drive), whereas the unaffected drive 
did not.

After deleting the secondary GPT header and protective MBR with wipefs, 
the RAID superblock was not overwritten any more. So it seems that 
something, presumably the BIOS/UEFI firmware, found the secondary GPT 
table on the drive and decided to "restore" the missing primary GPT table.
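For anyone hitting the same symptom, a sketch of that cleanup using
wipefs from util-linux; the device name is an example, and the signature
names are whatever wipefs --no-act reports on the drive.  Preview first,
since --all without a type filter would also erase the RAID signature:

wipefs --no-act /dev/sdX                  # list signatures and their offsets
wipefs --all --types gpt,PMBR /dev/sdX    # erase only the GPT/protective-MBR signatures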

^ permalink raw reply	[flat|nested] 40+ messages in thread

end of thread, other threads:[~2024-02-19 20:48 UTC | newest]

Thread overview: 40+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <432300551.863689.1705953121879.ref@mail.yahoo.com>
2024-01-22 19:52 ` Requesting help recovering my array RJ Marquette
2024-01-22 21:39   ` Reindl Harald
2024-01-22 22:13     ` RJ Marquette
2024-01-22 23:49       ` Reindl Harald
2024-01-23  0:09         ` RJ Marquette
2024-01-23  1:52           ` RJ Marquette
2024-01-23 16:06             ` David Niklas
2024-01-23 16:09               ` RJ Marquette
2024-01-23 16:16               ` RJ Marquette
2024-01-23 22:50                 ` Sandro
2024-01-24  0:59                   ` RJ Marquette
     [not found]                   ` <d051abe3-af97-47a4-a087-432c91beb57e@yahoo.com>
2024-01-24  9:11                     ` Sandro
2024-01-24  3:19                 ` David Niklas
2024-01-24 12:17                   ` RJ Marquette
2024-01-24 17:06                     ` Sandro
2024-01-24 18:06                       ` RJ Marquette
2024-01-24 21:20                         ` Roger Heflin
2024-01-24 21:31                           ` RJ Marquette
2024-01-24 21:44                             ` Roger Heflin
2024-01-24 22:21                               ` Robin Hill
2024-01-24 22:37                                 ` Roger Heflin
2024-01-25  1:13                               ` RJ Marquette
2024-01-25  1:57                                 ` Roger Heflin
2024-01-25  9:49                                   ` Pascal Hambourg
2024-01-25 11:49                                     ` RJ Marquette
2024-01-25 14:57                                       ` Pascal Hambourg
2024-01-25 15:08                                         ` RJ Marquette
2024-01-25 17:43                                           ` Roger Heflin
2024-01-25 18:33                                             ` RJ Marquette
2024-01-25 22:37                                               ` Roger Heflin
2024-01-25 22:53                                                 ` Roger Heflin
2024-01-25 23:00                                                   ` Roger Heflin
2024-01-26 15:15                                                     ` RJ Marquette
2024-01-26 15:25                                                       ` Reindl Harald
2024-01-26 16:03                                                       ` RJ Marquette
2024-01-26 23:45                                                         ` RJ Marquette
2024-01-27  8:41                                                           ` Pascal Hambourg
2024-01-27 12:30                                                             ` RJ Marquette
2024-02-19 20:48                                                           ` Pascal Hambourg
2024-01-25 17:06                                 ` Reindl Harald
