* Re: Two raid5 arrays are inactive and have changed UUIDs
       [not found] ` <959ca414-0c97-2e8d-7715-a7cb75790fcd@youngman.org.uk>
@ 2020-01-10  1:19   ` William Morgan
  2020-01-10  1:55     ` Wols Lists
  0 siblings, 1 reply; 15+ messages in thread
From: William Morgan @ 2020-01-10  1:19 UTC (permalink / raw)
  To: Wol's lists; +Cc: linux-raid

Thank you for the encouraging response. I think I would like to
attempt to rescue the smaller array first as the data there is
somewhat less crucial and I may learn something before working with
the more important larger array.

> > md1 consists of 4x 4TB drives:
> >
> > role drive events state
> >   0    sdj    5948  AAAA
> >   1    sdk   38643  .AAA
> >   2    sdl   38643  .AAA
> >   3    sdm   38643  .AAA
>
> This array *should* be easy to recover. Again, use overlays, and
> force-assemble sdk, sdl, and sdm. DO NOT include sdj - this was ejected
> from the array a long time ago, and including it will seriously mess up
> your array. This means you've actually been running a 3-disk raid-0 for
> quite a while, so provided nothing more goes wrong, you'll have a
> perfect recovery, but any trouble and your data is toast. Is there any
> way you can ddrescue these three drives before attempting a recovery?

I do have plenty of additional disk space. If I try ddrescue first,
will that just give me a backup of the array in case something goes
wrong with the force-assemble with overlays? Can you give me some
guidance on what to do with ddrescue?

Thanks,
Bill

* Re: Two raid5 arrays are inactive and have changed UUIDs
  2020-01-10  1:19   ` Two raid5 arrays are inactive and have changed UUIDs William Morgan
@ 2020-01-10  1:55     ` Wols Lists
  2020-01-13 22:40       ` William Morgan
  0 siblings, 1 reply; 15+ messages in thread
From: Wols Lists @ 2020-01-10  1:55 UTC (permalink / raw)
  To: William Morgan; +Cc: linux-raid

On 10/01/20 01:19, William Morgan wrote:
> Thank you for the encouraging response. I think I would like to
> attempt to rescue the smaller array first as the data there is
> somewhat less crucial and I may learn something before working with
> the more important larger array.
> 
>>> md1 consists of 4x 4TB drives:
>>>
>>> role drive events state
>>>   0    sdj    5948  AAAA
>>>   1    sdk   38643  .AAA
>>>   2    sdl   38643  .AAA
>>>   3    sdm   38643  .AAA
>>
>> This array *should* be easy to recover. Again, use overlays, and
>> force-assemble sdk, sdl, and sdm. DO NOT include sdj - this was ejected
>> from the array a long time ago, and including it will seriously mess up
>> your array. This means you've actually been running a 3-disk raid-0 for
>> quite a while, so provided nothing more goes wrong, you'll have a
>> perfect recovery, but any trouble and your data is toast. Is there any
>> way you can ddrescue these three drives before attempting a recovery?
> 
> I do have plenty of additional disk space. If I try ddrescue first,
> will that just give me a backup of the array in case something goes
> wrong with the force-assemble with overlays? Can you give me some
> guidance on what to do with ddrescue?
> 
Firstly, the whole point of overlays is to avoid damaging the arrays -
done properly, any and all changes made to the array are diverted to
files elsewhere, so when you shut down all the changes are lost and you
get the unaltered disks back. So the idea is that you assemble the array
with overlays, inspect the data, check the filesystem with fsck etc.,
and if it all looks good you know you can safely assemble the array
without overlays and recover everything.
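
(For reference, the wiki's overlay trick boils down to a device-mapper
snapshot per member disk. A rough sketch - example device names, not a
script to run blindly - looks something like:

  # sparse file to catch every write aimed at /dev/sdk1
  truncate -s $(blockdev --getsize64 /dev/sdk1) /tmp/overlay-sdk1
  loop=$(losetup -f --show /tmp/overlay-sdk1)
  # expose /dev/mapper/sdk1: reads come from the real disk, writes land in the overlay
  dmsetup create sdk1 --table "0 $(blockdev --getsz /dev/sdk1) snapshot /dev/sdk1 $loop P 8"

Repeat that for each member, assemble from /dev/mapper/sdk1 and friends,
and tear it all down afterwards with "dmsetup remove" and "losetup -d" -
the real disks never get written to.)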

Of course, things can always go wrong ... so ddrescue gives you a backup.
Depending on how you want to do it, just use ddrescue much as you would
use dd. You can copy disk to disk, eg "dd if=/dev/sdm of=/dev/sdn" -
just MAKE SURE you get the arguments the right way round to avoid
trashing your original disk by mistake - or you can copy the disk or
partition to a file, eg "dd if=/dev/sdj of=/home/rescue/copy_of_sdj".
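
(Note that GNU ddrescue itself takes positional arguments plus an
optional mapfile rather than dd's if=/of= syntax, so - as a rough sketch
with example names - the equivalent copies look more like:

  # -f is needed before ddrescue will write to an existing block device
  ddrescue -f /dev/sdm /dev/sdn /home/rescue/sdm.map

  # or copy the whole disk to an image file instead
  ddrescue /dev/sdj /home/rescue/copy_of_sdj.img /home/rescue/sdj.map

The mapfile records what has been copied so far, so an interrupted run
can be resumed.)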

If you're not happy using overlays, having ddrescue'd the disks you
could always assemble the array directly from the copies and make sure
everything's okay there, before trying it with the original disks.

Note that, as far as the *user* is concerned, there is little practical
difference between dd and ddrescue. Under the hood there's a lot of
difference - ddrescue is targeted at failing hardware, so while it tries
to do a straightforward copy it doesn't give up on read errors and has a
large repertoire of tricks to try to get the data off. It doesn't always
succeed, though ... :-)

Cheers,
Wol

* Re: Two raid5 arrays are inactive and have changed UUIDs
  2020-01-10  1:55     ` Wols Lists
@ 2020-01-13 22:40       ` William Morgan
  2020-01-14 14:47         ` William Morgan
  0 siblings, 1 reply; 15+ messages in thread
From: William Morgan @ 2020-01-13 22:40 UTC (permalink / raw)
  To: Wols Lists; +Cc: linux-raid

> >>> md1 consists of 4x 4TB drives:
> >>>
> >>> role drive events state
> >>>   0    sdj    5948  AAAA
> >>>   1    sdk   38643  .AAA
> >>>   2    sdl   38643  .AAA
> >>>   3    sdm   38643  .AAA

> If you're not happy using overlays, having ddrescue'd the disks you
> could always assemble the array directly from the copies and make sure
> everything's okay there, before trying it with the original disks.

I successfully ddrescued all four drives /dev/sd[j,k,l,m], each to a new
disk, with no errors reported. I have the copies on /dev/sd[n,o,p,q].

Now if I want to force assemble the three copies with event counts
that agree [o,p,q], should I just do:

mdadm --assemble --force /dev/md1 /dev/sdo1 /dev/sdp1 /dev/sdq1

Or should I assemble the copies into a new array, say /dev/md2?

I'm worried that assembling the copies might merge them into the
existing (inactive) array. Is that what's supposed to happen? I'm
unclear here.

Thanks for your help,
Bill

* Re: Two raid5 arrays are inactive and have changed UUIDs
  2020-01-13 22:40       ` William Morgan
@ 2020-01-14 14:47         ` William Morgan
  2020-01-14 15:23           ` Wols Lists
  0 siblings, 1 reply; 15+ messages in thread
From: William Morgan @ 2020-01-14 14:47 UTC (permalink / raw)
  To: Wols Lists; +Cc: linux-raid

Well, I went ahead and tried the forced assembly:

bill@bill-desk:~$ sudo mdadm --assemble --force /dev/md1 /dev/sdg1
/dev/sdh1 /dev/sdi1
[sudo] password for bill:
mdadm: Merging with already-assembled /dev/md/1
mdadm: Marking array /dev/md/1 as 'clean'
mdadm: failed to RUN_ARRAY /dev/md/1: Input/output error

(The drive letters have changed because I removed a bunch of other
drives. The original drives are now on sd[b,c,d,e] and the copies are
on sd[f,g,h,i] with sdf being a copy of the presumably bad sdb with
the event count which doesn't agree with the other 3 disks.)

So, it failed. dmesg shows:

[152144.483755] md: array md1 already has disks!
[152144.483772] md: kicking non-fresh sdb1 from array!
[152144.520313] md/raid:md1: not clean -- starting background reconstruction
[152144.520345] md/raid:md1: device sdd1 operational as raid disk 2
[152144.520346] md/raid:md1: device sde1 operational as raid disk 1
[152144.520348] md/raid:md1: device sdc1 operational as raid disk 3
[152144.522219] md/raid:md1: cannot start dirty degraded array.
[152144.566782] md/raid:md1: failed to run raid set.
[152144.566785] md: pers->run() failed ...
[152144.568169] md1: ADD_NEW_DISK not supported
[152144.569894] md1: ADD_NEW_DISK not supported
[152144.571498] md1: ADD_NEW_DISK not supported
[152144.573964] md1: ADD_NEW_DISK not supported

mdstat shows sdb no longer part of the array:

bill@bill-desk:~$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5]
[raid4] [raid10]
md1 : inactive sdd1[2] sde1[1] sdc1[4]
      11720653824 blocks super 1.2

details of the array:

bill@bill-desk:~$ sudo mdadm -D /dev/md1
/dev/md1:
           Version : 1.2
     Creation Time : Tue Sep 25 23:31:31 2018
        Raid Level : raid5
     Used Dev Size : 18446744073709551615
      Raid Devices : 4
     Total Devices : 3
       Persistence : Superblock is persistent

       Update Time : Sat Jan  4 16:52:59 2020
             State : active, FAILED, Not Started
    Active Devices : 3
   Working Devices : 3
    Failed Devices : 0
     Spare Devices : 0

            Layout : left-symmetric
        Chunk Size : 64K

Consistency Policy : unknown

              Name : bill-desk:1  (local to host bill-desk)
              UUID : 723f939b:62b73a3e:e86e1fe1:e37131dc
            Events : 38643

    Number   Major   Minor   RaidDevice State
       -       0        0        0      removed
       -       0        0        1      removed
       -       0        0        2      removed
       -       0        0        3      removed

       -       8       65        1      sync   /dev/sde1
       -       8       49        2      sync   /dev/sdd1
       -       8       33        3      sync   /dev/sdc1

Now if I try the forced assemble again I get:

bill@bill-desk:~$ sudo mdadm --assemble --force /dev/md1 /dev/sdg1
/dev/sdh1 /dev/sdi1
mdadm: Found some drive for an array that is already active: /dev/md/1
mdadm: giving up.

I'm lost now. Not sure what to do anymore. Do I need to edit
mdadm.conf? Do I need to remove the original drives? Any ideas Wols?

Bill

* Re: Two raid5 arrays are inactive and have changed UUIDs
  2020-01-14 14:47         ` William Morgan
@ 2020-01-14 15:23           ` Wols Lists
  2020-01-15 22:12             ` William Morgan
  0 siblings, 1 reply; 15+ messages in thread
From: Wols Lists @ 2020-01-14 15:23 UTC (permalink / raw)
  To: William Morgan; +Cc: linux-raid

On 14/01/20 14:47, William Morgan wrote:
> Well, I went ahead and tried the forced assembly:
> 
> bill@bill-desk:~$ sudo mdadm --assemble --force /dev/md1 /dev/sdg1
> /dev/sdh1 /dev/sdi1
> [sudo] password for bill:
> mdadm: Merging with already-assembled /dev/md/1

This looks like your problem ... it seems you have a failed assembly
still active. Did you do an "mdadm --stop /dev/md1"? You should always
do that between every attempt.

> mdadm: Marking array /dev/md/1 as 'clean'
> mdadm: failed to RUN_ARRAY /dev/md/1: Input/output error
> 
I hope that's a good sign - if mdadm found itself with an array that
didn't make sense, everything should have stopped at that point and the
--force won't have done any damage.

> (The drive letters have changed because I removed a bunch of other
> drives. The original drives are now on sd[b,c,d,e] and the copies are
> on sd[f,g,h,i] with sdf being a copy of the presumably bad sdb with
> the event count which doesn't agree with the other 3 disks.)

Make sure md1 doesn't appear to exist at all (--stop), and then try
again ...
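
In other words, something along the lines of:

  mdadm --stop /dev/md1
  cat /proc/mdstat          # md1 should no longer be listed
  mdadm --assemble --force /dev/md1 /dev/sdg1 /dev/sdh1 /dev/sdi1

(substituting whatever your copies are called at that point, of course).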

Cheers,
Wol

* Re: Two raid5 arrays are inactive and have changed UUIDs
  2020-01-14 15:23           ` Wols Lists
@ 2020-01-15 22:12             ` William Morgan
  2020-01-15 23:44               ` Wols Lists
  0 siblings, 1 reply; 15+ messages in thread
From: William Morgan @ 2020-01-15 22:12 UTC (permalink / raw)
  To: Wols Lists; +Cc: linux-raid

> Did you do an "mdadm --stop /dev/md1"? You should
> always do that between every attempt.
>
Yes, that was exactly the problem - I forgot to stop it. After that,
the array assembled perfectly, and fsck showed no problems with the
array. I was able to salvage all the data. Wouldn't have been able to
do it without your help!

Now on to the array of larger disks:

md0 consists of 4x 8TB drives:

role drive events state
 0    sdb   10080  A.AA (bad blocks reported on this drive)
 1    sdc   10070  AAAA
 2    sdd   10070  AAAA
 3    sde   10080  A.AA (bad blocks reported on this drive)

I used the same approach, ddrescue-ing sd[b-e] to sd[f-i]. No errors
during the copy. Then I stopped md0 and force-assembled the array from
the copies. That seems to have gone well:

bill@bill-desk:~$ sudo mdadm --assemble --force /dev/md0 /dev/sd[f-i]1
mdadm: forcing event count in /dev/sdg1(1) from 10070 upto 10080
mdadm: forcing event count in /dev/sdh1(2) from 10070 upto 10080
mdadm: clearing FAULTY flag for device 1 in /dev/md0 for /dev/sdg1
mdadm: Marking array /dev/md0 as 'clean'
mdadm: /dev/md0 has been started with 4 drives.

checked the status of md0:

bill@bill-desk:~$ sudo mdadm -D /dev/md0
/dev/md0:
           Version : 1.2
     Creation Time : Sat Sep 22 19:10:10 2018
        Raid Level : raid5
        Array Size : 23441679360 (22355.73 GiB 24004.28 GB)
     Used Dev Size : 7813893120 (7451.91 GiB 8001.43 GB)
      Raid Devices : 4
     Total Devices : 4
       Persistence : Superblock is persistent

     Intent Bitmap : Internal

       Update Time : Sat Jan  4 16:58:47 2020
             State : clean
    Active Devices : 4
   Working Devices : 4
    Failed Devices : 0
     Spare Devices : 0

            Layout : left-symmetric
        Chunk Size : 64K

Consistency Policy : bitmap

              Name : bill-desk:0  (local to host bill-desk)
              UUID : 06ad8de5:3a7a15ad:88116f44:fcdee150
            Events : 10080

    Number   Major   Minor   RaidDevice State
       0       8       81        0      active sync   /dev/sdf1
       1       8       97        1      active sync   /dev/sdg1
       2       8      113        2      active sync   /dev/sdh1
       4       8      129        3      active sync   /dev/sdi1

Then checked with fsck:

bill@bill-desk:~$ sudo fsck /dev/md0
fsck from util-linux 2.34
e2fsck 1.45.3 (14-Jul-2019)
/dev/md0: recovering journal
/dev/md0: clean, 11274/366276608 files, 5285497435/5860419840 blocks

Everything seems ok. Then I examined each disk:

bill@bill-desk:~$ sudo mdadm --examine /dev/sd[f-i]
/dev/sdf:
   MBR Magic : aa55
Partition[0] :   4294967295 sectors at            1 (type ee)
/dev/sdg:
   MBR Magic : aa55
Partition[0] :   4294967295 sectors at            1 (type ee)
/dev/sdh:
   MBR Magic : aa55
Partition[0] :   4294967295 sectors at            1 (type ee)
/dev/sdi:
   MBR Magic : aa55
Partition[0] :   4294967295 sectors at            1 (type ee)
bill@bill-desk:~$ sudo mdadm --examine /dev/sd[f-i]1
/dev/sdf1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x9
     Array UUID : 06ad8de5:3a7a15ad:88116f44:fcdee150
           Name : bill-desk:0  (local to host bill-desk)
  Creation Time : Sat Sep 22 19:10:10 2018
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 15627786240 (7451.91 GiB 8001.43 GB)
     Array Size : 23441679360 (22355.73 GiB 24004.28 GB)
    Data Offset : 264192 sectors
   Super Offset : 8 sectors
   Unused Space : before=264112 sectors, after=0 sectors
          State : clean
    Device UUID : ab1323e0:9c0426cf:3e168733:b73e9c5c

Internal Bitmap : 8 sectors from superblock
    Update Time : Wed Jan 15 15:35:37 2020
  Bad Block Log : 512 entries available at offset 40 sectors - bad
blocks present.
       Checksum : f1599c42 - correct
         Events : 10080

         Layout : left-symmetric
     Chunk Size : 64K

   Device Role : Active device 0
   Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdg1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 06ad8de5:3a7a15ad:88116f44:fcdee150
           Name : bill-desk:0  (local to host bill-desk)
  Creation Time : Sat Sep 22 19:10:10 2018
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 15627786240 (7451.91 GiB 8001.43 GB)
     Array Size : 23441679360 (22355.73 GiB 24004.28 GB)
    Data Offset : 264192 sectors
   Super Offset : 8 sectors
   Unused Space : before=264112 sectors, after=0 sectors
          State : clean
    Device UUID : c875f246:ce25d947:a413e198:4100082e

Internal Bitmap : 8 sectors from superblock
    Update Time : Wed Jan 15 15:35:37 2020
  Bad Block Log : 512 entries available at offset 40 sectors
       Checksum : 7a1de7a - correct
         Events : 10080

         Layout : left-symmetric
     Chunk Size : 64K

   Device Role : Active device 1
   Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdh1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 06ad8de5:3a7a15ad:88116f44:fcdee150
           Name : bill-desk:0  (local to host bill-desk)
  Creation Time : Sat Sep 22 19:10:10 2018
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 15627786240 (7451.91 GiB 8001.43 GB)
     Array Size : 23441679360 (22355.73 GiB 24004.28 GB)
    Data Offset : 264192 sectors
   Super Offset : 8 sectors
   Unused Space : before=264112 sectors, after=0 sectors
          State : clean
    Device UUID : fd0634e6:6943f723:0e30260e:e253b1f4

Internal Bitmap : 8 sectors from superblock
    Update Time : Wed Jan 15 15:35:37 2020
  Bad Block Log : 512 entries available at offset 40 sectors
       Checksum : beeffd56 - correct
         Events : 10080

         Layout : left-symmetric
     Chunk Size : 64K

   Device Role : Active device 2
   Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdi1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x9
     Array UUID : 06ad8de5:3a7a15ad:88116f44:fcdee150
           Name : bill-desk:0  (local to host bill-desk)
  Creation Time : Sat Sep 22 19:10:10 2018
     Raid Level : raid5
   Raid Devices : 4

 Avail Dev Size : 15627786240 (7451.91 GiB 8001.43 GB)
     Array Size : 23441679360 (22355.73 GiB 24004.28 GB)
    Data Offset : 264192 sectors
   Super Offset : 8 sectors
   Unused Space : before=264112 sectors, after=0 sectors
          State : clean
    Device UUID : 8c628aed:802a5dc8:9d8a8910:9794ec02

Internal Bitmap : 8 sectors from superblock
    Update Time : Wed Jan 15 15:35:37 2020
  Bad Block Log : 512 entries available at offset 40 sectors - bad
blocks present.
       Checksum : 7b4adb4a - correct
         Events : 10080

         Layout : left-symmetric
     Chunk Size : 64K

   Device Role : Active device 3
   Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)

All 4 drives have the same event count and all four show the same
state of AAAA, but the first and last drive still show bad blocks
present. Is that because ddrescue copied literally everything from the
original drives, including the list of bad blocks? How should I go
about clearing those bad blocks? Is there something more I should do
to verify the integrity of the data?

Thanks so much for your help!
Bill

* Re: Two raid5 arrays are inactive and have changed UUIDs
  2020-01-15 22:12             ` William Morgan
@ 2020-01-15 23:44               ` Wols Lists
  2020-01-19 17:02                 ` William Morgan
  0 siblings, 1 reply; 15+ messages in thread
From: Wols Lists @ 2020-01-15 23:44 UTC (permalink / raw)
  To: William Morgan; +Cc: linux-raid

On 15/01/20 22:12, William Morgan wrote:
> All 4 drives have the same event count and all four show the same
> state of AAAA, but the first and last drive still show bad blocks
> present. Is that because ddrescue copied literally everything from the
> original drives, including the list of bad blocks? How should I go
> about clearing those bad blocks? Is there something more I should do
> to verify the integrity of the data?

Read the wiki - the section on badblocks will be - enlightening - shall
we say.

https://raid.wiki.kernel.org/index.php/The_Badblocks_controversy

Yes, the bad block log is implemented within md, so it got copied across
along with everything else. So your array should be perfectly fine
despite the bad blocks allegedly being there ...
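
(If you ever do want the copied-over bad block log gone, newer mdadm
versions are supposed to accept --update=no-bbl / --update=force-no-bbl
during --assemble, which simply discard the list from the superblocks
without re-checking those sectors - check your man page before relying
on it. As an untested sketch, with the array stopped first:

  mdadm --stop /dev/md0
  mdadm --assemble --update=force-no-bbl /dev/md0 /dev/sd[f-i]1

That said, leaving the list alone does no harm here.)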

Cheers,
Wol

* Re: Two raid5 arrays are inactive and have changed UUIDs
  2020-01-15 23:44               ` Wols Lists
@ 2020-01-19 17:02                 ` William Morgan
  2020-01-19 17:07                   ` William Morgan
                                     ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: William Morgan @ 2020-01-19 17:02 UTC (permalink / raw)
  To: Wols Lists; +Cc: linux-raid

> Read the wiki - the section on badblocks will be - enlightening - shall
> we say.
>
> https://raid.wiki.kernel.org/index.php/The_Badblocks_controversy

Yeah, I had read that already. Seems like a strange limbo state to be
in with no clear way of fixing bad blocks....

Anyway, I think the bad blocks may be related to the problem I'm
having now. At first everything looks good:

bill@bill-desk:~$ cat /proc/mdstat
Personalities : [raid6] [raid5] [raid4] [linear] [multipath] [raid0]
[raid1] [raid10]
md0 : active raid5 sdk1[1] sdm1[4] sdj1[0] sdl1[2]
      23441679360 blocks super 1.2 level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
      bitmap: 0/59 pages [0KB], 65536KB chunk
unused devices: <none>

bill@bill-desk:~$ sudo fsck /dev/md0
fsck from util-linux 2.34
e2fsck 1.45.3 (14-Jul-2019)
/dev/md0: clean, 11274/366276608 files, 5285497435/5860419840 blocks

bill@bill-desk:~$ sudo lsblk -f
NAME    FSTYPE            LABEL       UUID
    FSAVAIL FSUSE% MOUNTPOINT
.
.
.
sdj
└─sdj1  linux_raid_member bill-desk:0
06ad8de5-3a7a-15ad-8811-6f44fcdee150
  └─md0 ext4
ceef50e9-afdd-4903-899d-1ad05a0780e0
sdk
└─sdk1  linux_raid_member bill-desk:0
06ad8de5-3a7a-15ad-8811-6f44fcdee150
  └─md0 ext4
ceef50e9-afdd-4903-899d-1ad05a0780e0
sdl
└─sdl1  linux_raid_member bill-desk:0
06ad8de5-3a7a-15ad-8811-6f44fcdee150
  └─md0 ext4
ceef50e9-afdd-4903-899d-1ad05a0780e0
sdm
└─sdm1  linux_raid_member bill-desk:0
06ad8de5-3a7a-15ad-8811-6f44fcdee150
  └─md0 ext4                          ceef50e9-afdd-4903-899d-1ad05a0780e0

But then I get confused about the UUIDs. I'm trying to automount the
array using fstab (no unusual settings in there, just defaults), but
I'm not sure which of the two UUIDs above to use. So I look at mdadm
for help:

bill@bill-desk:~$ sudo mdadm --examine --scan
ARRAY /dev/md/0  metadata=1.2 UUID=06ad8de5:3a7a15ad:88116f44:fcdee150
name=bill-desk:0

However, if I use this UUID starting with "06ad", then I get an error:

bill@bill-desk:~$ sudo mount -all
mount: /media/bill/STUFF: mount(2) system call failed: Structure needs cleaning.

But I don't know how to clean it if fsck says it's OK.

On the other hand, if I use the UUID above starting with "ceef", then
it mounts and everything seems OK.

Basically, I don't understand why lsblk lists two UUIDs for the array,
and why mdadm gives the wrong one in terms of mounting. This is where
I was confused before about the UUID changing. Any insight here?

Cheers,
Bill

* Re: Two raid5 arrays are inactive and have changed UUIDs
  2020-01-19 17:02                 ` William Morgan
@ 2020-01-19 17:07                   ` William Morgan
  2020-01-19 17:41                   ` Wols Lists
  2020-01-20  8:49                   ` Robin Hill
  2 siblings, 0 replies; 15+ messages in thread
From: William Morgan @ 2020-01-19 17:07 UTC (permalink / raw)
  To: Wols Lists; +Cc: linux-raid

Here is a better formatted output from lsblk:

bill@bill-desk:~$ sudo lsblk -fi
NAME    FSTYPE            LABEL       UUID
    FSAVAIL FSUSE% MOUNTPOINT
.
.
.
sdj
`-sdj1  linux_raid_member bill-desk:0
06ad8de5-3a7a-15ad-8811-6f44fcdee150
  `-md0 ext4
ceef50e9-afdd-4903-899d-1ad05a0780e0
sdk
`-sdk1  linux_raid_member bill-desk:0
06ad8de5-3a7a-15ad-8811-6f44fcdee150
  `-md0 ext4
ceef50e9-afdd-4903-899d-1ad05a0780e0
sdl
`-sdl1  linux_raid_member bill-desk:0
06ad8de5-3a7a-15ad-8811-6f44fcdee150
  `-md0 ext4
ceef50e9-afdd-4903-899d-1ad05a0780e0
sdm
`-sdm1  linux_raid_member bill-desk:0
06ad8de5-3a7a-15ad-8811-6f44fcdee150
  `-md0 ext4                          ceef50e9-afdd-4903-899d-1ad05a0780e0

* Re: Two raid5 arrays are inactive and have changed UUIDs
  2020-01-19 17:02                 ` William Morgan
  2020-01-19 17:07                   ` William Morgan
@ 2020-01-19 17:41                   ` Wols Lists
  2020-01-19 20:12                     ` William Morgan
  2020-01-20  8:49                   ` Robin Hill
  2 siblings, 1 reply; 15+ messages in thread
From: Wols Lists @ 2020-01-19 17:41 UTC (permalink / raw)
  To: William Morgan; +Cc: linux-raid

On 19/01/20 17:02, William Morgan wrote:
> But then I get confused about the UUIDs. I'm trying to automount the
> array using fstab (no unusual settings in there, just defaults), but
> I'm not sure which of the two UUIDs above to use. So I look at mdadm
> for help:
> 
> bill@bill-desk:~$ sudo mdadm --examine --scan
> ARRAY /dev/md/0  metadata=1.2 UUID=06ad8de5:3a7a15ad:88116f44:fcdee150
> name=bill-desk:0
> 
> However, if I use this UUID starting with "06ad", then I get an error:

Looking at your output (and no I'm not that good at reading traces) it
looks to me like this 06ad8de5 uuid should be that of ONE of your
partitions. But it looks like it's been cloned to ALL partitions.

You didn't do anything daft like partitioning one disk, and then just
dd'ing or cp'ing the partition table across? Never a good idea.
> 
> bill@bill-desk:~$ sudo mount -all
> mount: /media/bill/STUFF: mount(2) system call failed: Structure needs cleaning.
> 
> But I don't know how to clean it if fsck says it's OK.
> 
> On the other hand, if I use the UUID above starting with "ceef", then
> it mounts and everything seems OK.

Yup. That looks like the correct UUID for the array.
> 
> Basically, I don't understand why lsblk lists two UUIDs for the array,
> and why mdadm gives the wrong one in terms of mounting. This is where
> I was confused before about the UUID changing. Any insight here?
> 
Well it LOOKS to me like something has changed all the partition UUIDs
to the array UUID, and then the array UUID has changed to avoid a
collision.

I dunno - let's hope someone else has some ideas ...

Cheers,
Wol

* Re: Two raid5 arrays are inactive and have changed UUIDs
  2020-01-19 17:41                   ` Wols Lists
@ 2020-01-19 20:12                     ` William Morgan
  2020-01-19 21:10                       ` Wol's lists
  0 siblings, 1 reply; 15+ messages in thread
From: William Morgan @ 2020-01-19 20:12 UTC (permalink / raw)
  To: Wols Lists; +Cc: linux-raid

> You didn't do anything daft like partitioning one disk, and then just
> dd'ing or cp'ing the partition table across? Never a good idea.

No, I didn't. Here are the partition UUIDs for disks j,k,l,m:

bill@bill-desk:~$ for i in {j,k,l,m} ; do udevadm info --query=all
--name=/dev/sd"$i" | grep -i UUID ; done
E: ID_PART_TABLE_UUID=bbccfe6f-739d-42de-9b84-99ca821f3291
E: ID_PART_TABLE_UUID=f42fd04d-96e3-4b0c-b20c-5286898b104f
E: ID_PART_TABLE_UUID=f467adc9-dba4-4ee9-95cc-f1a93e4a7d98
E: ID_PART_TABLE_UUID=4805ef03-5bd1-46d1-a0e1-5dde52b096ec

> Well it LOOKS to me like something has changed all the partition UUIDs
> to the array UUID, and then the array UUID has changed to avoid a
> collision.
>
> I dunno - let's hope someone else has some ideas ...

Well, here is some further info that shows 2 UUIDs for the single array md0:

bill@bill-desk:~$ udevadm info --query=all --name=/dev/md0 | grep -i UUID
S: disk/by-uuid/ceef50e9-afdd-4903-899d-1ad05a0780e0
S: disk/by-id/md-uuid-06ad8de5:3a7a15ad:88116f44:fcdee150
E: MD_UUID=06ad8de5:3a7a15ad:88116f44:fcdee150
E: ID_FS_UUID=ceef50e9-afdd-4903-899d-1ad05a0780e0
E: ID_FS_UUID_ENC=ceef50e9-afdd-4903-899d-1ad05a0780e0
E: UDISKS_MD_UUID=06ad8de5:3a7a15ad:88116f44:fcdee150
E: DEVLINKS=/dev/md/0 /dev/disk/by-id/md-name-bill-desk:0
/dev/disk/by-uuid/ceef50e9-afdd-4903-899d-1ad05a0780e0
/dev/disk/by-id/md-uuid-06ad8de5:3a7a15ad:88116f44:fcdee150

Still not sure why /dev/disk/by-id/ and /dev/disk/by-uuid/ would have
differing UUIDs, but maybe that's normal.

Maybe someone can shed some light here?

Cheers,
Bill

* Re: Two raid5 arrays are inactive and have changed UUIDs
  2020-01-19 20:12                     ` William Morgan
@ 2020-01-19 21:10                       ` Wol's lists
  2020-01-24 14:10                         ` Nix
  0 siblings, 1 reply; 15+ messages in thread
From: Wol's lists @ 2020-01-19 21:10 UTC (permalink / raw)
  To: William Morgan; +Cc: linux-raid

On 19/01/2020 20:12, William Morgan wrote:
> Still not sure why /dev/disk/by-id/ and /dev/disk/by-uuid/ would have
> differing UUIDs, but maybe that's normal.

Given that one is an ID, and the other is a UUID, I would have thought 
it normal that they're different (not that I know the difference between 
them! :-)

It would be nice if we had some decent docu on what all these assorted
id's and uuid's and all that were, but I haven't managed to find any.
Probably somewhere in the info pages for grub - given that I really
don't like hypertext, the fact that the grub docu is very much hypertext
is rather off-putting ...

Cheers,
Wol

* Re: Two raid5 arrays are inactive and have changed UUIDs
  2020-01-19 17:02                 ` William Morgan
  2020-01-19 17:07                   ` William Morgan
  2020-01-19 17:41                   ` Wols Lists
@ 2020-01-20  8:49                   ` Robin Hill
  2020-01-20 15:00                     ` William Morgan
  2 siblings, 1 reply; 15+ messages in thread
From: Robin Hill @ 2020-01-20  8:49 UTC (permalink / raw)
  To: William Morgan; +Cc: Wols Lists, linux-raid

On Sun Jan 19, 2020 at 11:02:54AM -0600, William Morgan wrote:

> bill@bill-desk:~$ sudo lsblk -f
> NAME    FSTYPE            LABEL       UUID
>     FSAVAIL FSUSE% MOUNTPOINT
> .
> .
> .
> sdj
> └─sdj1  linux_raid_member bill-desk:0
> 06ad8de5-3a7a-15ad-8811-6f44fcdee150
>   └─md0 ext4
> ceef50e9-afdd-4903-899d-1ad05a0780e0
> sdk
> └─sdk1  linux_raid_member bill-desk:0
> 06ad8de5-3a7a-15ad-8811-6f44fcdee150
>   └─md0 ext4
> ceef50e9-afdd-4903-899d-1ad05a0780e0
> sdl
> └─sdl1  linux_raid_member bill-desk:0
> 06ad8de5-3a7a-15ad-8811-6f44fcdee150
>   └─md0 ext4
> ceef50e9-afdd-4903-899d-1ad05a0780e0
> sdm
> └─sdm1  linux_raid_member bill-desk:0
> 06ad8de5-3a7a-15ad-8811-6f44fcdee150
>   └─md0 ext4                          ceef50e9-afdd-4903-899d-1ad05a0780e0
> 
> But then I get confused about the UUIDs. I'm trying to automount the
> array using fstab (no unusual settings in there, just defaults), but
> I'm not sure which of the two UUIDs above to use. So I look at mdadm
> for help:
> 
> bill@bill-desk:~$ sudo mdadm --examine --scan
> ARRAY /dev/md/0  metadata=1.2 UUID=06ad8de5:3a7a15ad:88116f44:fcdee150
> name=bill-desk:0
> 
> However, if I use this UUID starting with "06ad", then I get an error:
> 
> bill@bill-desk:~$ sudo mount -all
> mount: /media/bill/STUFF: mount(2) system call failed: Structure needs cleaning.
> 
> But I don't know how to clean it if fsck says it's OK.
> 
> On the other hand, if I use the UUID above starting with "ceef", then
> it mounts and everything seems OK.
> 
> Basically, I don't understand why lsblk lists two UUIDs for the array,
> and why mdadm gives the wrong one in terms of mounting. This is where
> I was confused before about the UUID changing. Any insight here?
> 

One is the UUID for the array (starting with "06ad") - this is what you
use in /etc/mdadm.conf. The second is the UUID for the filesystem you
have on the array (starting with "ceef") - that's used in /etc/fstab. If
you recreate the filesystem then you'll get a different filesystem UUID
but keep the same array UUID.
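
Concretely, using the values from your output (and assuming the mount
point from your earlier error message - adjust to taste):

  # /etc/mdadm.conf (or /etc/mdadm/mdadm.conf) - array UUID
  ARRAY /dev/md0 metadata=1.2 UUID=06ad8de5:3a7a15ad:88116f44:fcdee150 name=bill-desk:0

  # /etc/fstab - filesystem UUID
  UUID=ceef50e9-afdd-4903-899d-1ad05a0780e0  /media/bill/STUFF  ext4  defaults  0  2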

Cheers,
    Robin
-- 
     ___        
    ( ' }     |       Robin Hill        <robin@robinhill.me.uk> |
   / / )      | Little Jim says ....                            |
  // !!       |      "He fallen in de water !!"                 |

* Re: Two raid5 arrays are inactive and have changed UUIDs
  2020-01-20  8:49                   ` Robin Hill
@ 2020-01-20 15:00                     ` William Morgan
  0 siblings, 0 replies; 15+ messages in thread
From: William Morgan @ 2020-01-20 15:00 UTC (permalink / raw)
  To: William Morgan, Wols Lists, linux-raid

> One is the UUID for the array (starting with "06ad") - this is what you
> use in /etc/mdadm.conf. The second is the UUID for the filesystem you
> have on the array (starting with "ceef") - that's used in /etc/fstab. If
> you recreate the filesystem then you'll get a different filesystem UUID
> but keep the same array UUID.
>
> Cheers,
>     Robin

Thank you Robin, this is helpful. Can you recommend some place to read
up on UUIDs and raid arrays, or some general UUID documentation?

Cheers,
Bill

* Re: Two raid5 arrays are inactive and have changed UUIDs
  2020-01-19 21:10                       ` Wol's lists
@ 2020-01-24 14:10                         ` Nix
  0 siblings, 0 replies; 15+ messages in thread
From: Nix @ 2020-01-24 14:10 UTC (permalink / raw)
  To: Wol's lists; +Cc: William Morgan, linux-raid

On 19 Jan 2020, Wol's lists verbalised:
> It would be nice if we had some decent docu on what all these assorted id's and uuid's and all that were, but I haven't managed to
> find any. Probably somewhere in the info pages for grub - given that I really don't like hypertext the fact that grub docu is very
> much hypertext is rather off-putting ...

It's just perfectly ordinary Texinfo. There are lots of non-hypertext
output formats, e.g. plain text:
<https://www.gnu.org/software/grub/manual/grub/grub.txt.gz> or DVI:
<https://www.gnu.org/software/grub/manual/grub/grub.dvi.gz>. (Though the
DVI, like the PDF, contains specials that can implement hyperlinks if
your viewer can render them, I'm fairly sure the plain text version
doesn't.)

-- 
NULL && (void)
