* Big-endian RAID5 recovery problem
@ 2017-05-01 21:39 Adam Thompson
  2017-05-01 21:59 ` Anthony Youngman
                   ` (3 more replies)
  0 siblings, 4 replies; 11+ messages in thread
From: Adam Thompson @ 2017-05-01 21:39 UTC (permalink / raw)
  To: MUUG Roundtable, linux-raid

So I've got 4 IDE HDDs, each with 3 RAID partitions on them, that were 
part of a RAID array in a now-very-dead NAS.

Of course, I need to get data off them that wasn't backed up anywhere 
else.

I've got a 4-port USB3 PCIe card, and 4 IDE/SATA USB adapters, and all 
the hardware seems to work.  So far, so good.

The problem is that the disks use the v0.90 metadata format, and they 
came from a big-endian system, not a little-endian system.  MD 
superblocks *since* v0.90 are endian-agnostic, but back in v0.90, the 
superblock was byte-order specific.

mdadm(8) on an Intel processor refuses to acknowledge the existence of 
the superblock.  Testdisk detects it and correctly identifies it as a 
Big-endian v0.90 superblock.
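
(Concretely, the checks amounted to something like this; the device 
names here are just placeholders:)

    # on the little-endian host, mdadm can't see the byte-swapped superblock
    mdadm --examine /dev/sdb1   # "No md superblock detected", or words to that effect
    # testdisk, pointed at the same disk, does report a big-endian v0.90 superblock
    testdisk /dev/sdb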

I'm reluctant to blindly do a forced --create on the four disks, because 
I'm not 100% certain of the RAID topology; there are at least two RAID 
devices, one of which was hidden from the user, so I have no a-priori 
knowledge of its RAID level or layout.

The filesystems on the md(4) devices are, AFAIK, all XFS, and so should 
(hopefully) not have any endianness issues.

I can't find any modern big-endian Linux systems... looks like all the 
ARM distros run in little-endian mode.

Any suggestions on the best way to move forward?

Thanks,
-Adam


* Re: Big-endian RAID5 recovery problem
  2017-05-01 21:39 Big-endian RAID5 recovery problem Adam Thompson
@ 2017-05-01 21:59 ` Anthony Youngman
  2017-05-01 22:33   ` Adam Thompson
  2017-05-02  7:29 ` [RndTbl] " Trevor Cordes
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 11+ messages in thread
From: Anthony Youngman @ 2017-05-01 21:59 UTC (permalink / raw)
  To: Adam Thompson, MUUG Roundtable, linux-raid

On 01/05/17 22:39, Adam Thompson wrote:

>
> I'm reluctant to blindly do a forced --create on the four disks, because
> I'm not 100% certain of the RAID topology; there are at least two RAID
> devices, one of which was hidden from the user, so I have no a-priori
> knowledge of its RAID level or layout.
>
Get hold of lsdrv, and see what that tells you. (Look at the raid wiki 
for details.) I don't know if it will have endian issues, but if it 
doesn't an expert will probably be able to chime straight in and tell 
you the create command.
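
If memory serves, lsdrv is Phil Turmel's script; getting it is roughly 
(URL from memory, so double-check against the wiki):

    git clone https://github.com/pturmel/lsdrv
    cd lsdrv && sudo ./lsdrv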

The other thing is, read up on overlays because, if you overlay those 
disks, you will be able to "create" without actually writing to the 
disks. That way you can test - and even do a complete backup and 
recovery - without ever actually writing to, and altering, the original 
disks.
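
For a single disk, the wiki recipe boils down to roughly this (device 
name and overlay size are only examples):

    dev=/dev/sdb
    size=$(blockdev --getsz "$dev")              # size in 512-byte sectors
    truncate -s 4G /tmp/overlay-sdb              # sparse file to soak up writes
    loop=$(losetup -f --show /tmp/overlay-sdb)
    echo "0 $size snapshot $dev $loop P 8" | dmsetup create overlay-sdb
    # now experiment on /dev/mapper/overlay-sdb; the real disk is never written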

Cheers,
Wol


* Re: Big-endian RAID5 recovery problem
  2017-05-01 21:59 ` Anthony Youngman
@ 2017-05-01 22:33   ` Adam Thompson
  2017-05-02  2:47     ` Phil Turmel
  0 siblings, 1 reply; 11+ messages in thread
From: Adam Thompson @ 2017-05-01 22:33 UTC (permalink / raw)
  To: Anthony Youngman; +Cc: MUUG Roundtable, linux-raid

On 2017-05-01 16:59, Anthony Youngman wrote:
> Get hold of lsdrv, and see what that tells you. (Look at the raid wiki
> for details.) I don't know if it will have endian issues, but if it
> doesn't an expert will probably be able to chime straight in and tell
> you the create command.

Ah!  And that took me straight to the "asking for help" page.

The raw data is here: 
https://gist.github.com/anonymous/321b6db3160c259c4a4dd549817a3d07

To summarize:
* smartctl either fails to run or shows nothing wrong (depending on the 
vintage of drive, maybe?);
* mdadm --examine fails to read the superblock because of the endianness 
issue (see 
https://raid.wiki.kernel.org/index.php/RAID_superblock_formats#The_version-0.90_Superblock_Format)
* lsdrv fails to report any useful MD topology information I could see 
(other than confirming that each md device had four members, one 
partition on each drive)

I also see 3 "FD" type partitions on each disk, but lsdrv only 
identifies *2* of them as belonging to an MD array.  Not sure what's up 
with that.


> The other thing is, read up on overlays because, if you overlay those
> disks, you will be able to "create" without actually writing to the
> disks. That way you can test - and even do a complete backup and
> recovery - without ever actually writing to, and altering, the
> original disks.

Currently reading, thanks.  Didn't know overlays could be used for block 
devices.


Spinning up a QEMU instance of Linux-PPC or Linux-MIPS with the disks in 
pass-through mode has also been mentioned, but... ugh.  Anecdotal 
reports from the web suggest that doing so would just be opening up a 
second rabbit hole in addition to the one I'm already headed down.

-Adam


* Re: Big-endian RAID5 recovery problem
  2017-05-01 22:33   ` Adam Thompson
@ 2017-05-02  2:47     ` Phil Turmel
  0 siblings, 0 replies; 11+ messages in thread
From: Phil Turmel @ 2017-05-02  2:47 UTC (permalink / raw)
  To: Adam Thompson, Anthony Youngman; +Cc: MUUG Roundtable, linux-raid

On 05/01/2017 06:33 PM, Adam Thompson wrote:

> Spinning up a QEMU instance of Linux-PPC or Linux-MIPS with the disks in
> pass-through mode has also been mentioned, but... ugh.  Anecdotal
> reports from the web suggest that doing so would just be opening up a
> second rabbit hole in addition to the one I'm already headed down.

Actually, this is a very good idea.  If you can successfully assemble
within one of these VMs, you could use the --update=metadata option to
convert to v1.0 superblocks.  Cleanly shut down, and then the arrays
would be usable in the host.
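
Roughly, inside the big-endian VM (device names are placeholders; check 
the man page for the exact constraints on that update):

    mdadm --assemble /dev/md0 --update=metadata /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1
    mdadm --stop /dev/md0    # clean shutdown before moving back to the host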

Phil



* Re: [RndTbl] Big-endian RAID5 recovery problem
  2017-05-01 21:39 Big-endian RAID5 recovery problem Adam Thompson
  2017-05-01 21:59 ` Anthony Youngman
@ 2017-05-02  7:29 ` Trevor Cordes
  2017-05-02  8:59 ` Roman Mamedov
  2017-05-06  5:57 ` NeilBrown
  3 siblings, 0 replies; 11+ messages in thread
From: Trevor Cordes @ 2017-05-02  7:29 UTC (permalink / raw)
  To: Adam Thompson; +Cc: Continuation of Round Table discussion, linux-raid

On 2017-05-01 Adam Thompson wrote:
> The problem is that the disks use the v0.90 metadata format, and they 
> came from a big-endian system, not a little-endian system.  MD 
> superblocks *since* v0.90 are endian-agnostic, but back in v0.90, the 
> superblock was byte-order specific.

This may sound crazy, but conceivably you could (using the md source
code) find where the superblock lives (just the first copy should be
enough), hex-edit the superblock and (again using the source) swap the
bytes around to change the endianness.  Then make sure any future
mdadm commands only use that first superblock (which I think is the
default).
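
If I remember the 0.90 layout right, the superblock sits in the last
64-128KB of each member, 64KB-aligned, so dumping it for a look would
be something like (device name is an example):

    sz=$(blockdev --getsz /dev/sdb1)    # member size in 512-byte sectors
    off=$(( (sz & ~127) - 128 ))        # 64KB-aligned, at least 64KB from the end
    dd if=/dev/sdb1 bs=512 skip=$off count=8 2>/dev/null | xxd | head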

If there's some kind of checksum involved, then the endianness may
affect it (??), and fudging that may be quite difficult (compared to
finding a big-endian box out there to borrow instead!).

No matter what you do, I would immediately dd each real disk into files
on your own big fs and work on the copies.  There's no reason to ever
write to the originals; keep them as backups.  I *think* you can use
mdadm on plain files (via loop devices?); otherwise I guess you could
sub-partition a big disk.  Yeah, it will be cheesy to run RAID5 all off
of one disk, but in this case performance won't be critical.
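
Something like this would do it (paths are just examples; losetup -P
exposes the partitions inside each image as /dev/loopXpN):

    dd if=/dev/sdb of=/bigfs/disk1.img bs=1M conv=noerror,sync status=progress
    losetup -fP --show /bigfs/disk1.img    # prints e.g. /dev/loop0
    # repeat for the other three disks, then point mdadm at the loop partitions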

Lastly, I would make sure to start the array in such a way that it
definitely won't resync: either get every member slot exactly right, or
leave one member out as missing, or use the option (I think there is
one) that refuses to start the array if it would come up dirty.
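
For instance, assembling read-only should keep it from touching anything
(assuming your mdadm accepts --readonly together with --assemble; device
names are made up):

    mdadm --assemble --readonly /dev/md0 /dev/loop0p1 /dev/loop1p1 /dev/loop2p1 /dev/loop3p1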


* Re: Big-endian RAID5 recovery problem
  2017-05-01 21:39 Big-endian RAID5 recovery problem Adam Thompson
  2017-05-01 21:59 ` Anthony Youngman
  2017-05-02  7:29 ` [RndTbl] " Trevor Cordes
@ 2017-05-02  8:59 ` Roman Mamedov
  2017-05-05 19:22   ` [RndTbl] " Adam Thompson
  2017-05-06  5:57 ` NeilBrown
  3 siblings, 1 reply; 11+ messages in thread
From: Roman Mamedov @ 2017-05-02  8:59 UTC (permalink / raw)
  To: Adam Thompson; +Cc: MUUG Roundtable, linux-raid

On Mon, 01 May 2017 16:39:07 -0500
Adam Thompson <athompso@athompso.net> wrote:

> I can't find any modern big-endian Linux systems... looks like all the 
> ARM distros run in little-endian mode.

Here are QEMU images for debian-mips (should be big-endian, as opposed to
debian-mipsel): https://people.debian.org/~aurel32/qemu/mips/

Of course it will run purely in software, but most likely more than fast enough
to copy away the data.

Not entirely sure that particular emulated MIPS system supports more than 4
drives, but it appears that a starting point could be (man qemu-system):

           Instead of -hda, -hdb, -hdc, -hdd, you can use:

                   qemu-system-i386 -drive file=file,index=0,media=disk
                   qemu-system-i386 -drive file=file,index=1,media=disk
                   qemu-system-i386 -drive file=file,index=2,media=disk
                   qemu-system-i386 -drive file=file,index=3,media=disk

with indexes 0..5, as you need the boot disk, all 4 drives, and one more as
the backup destination.
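
The boot command for those images is documented on that page; a sketch
with a couple of raw-disk passthroughs bolted on might look like this
(the exact kernel/image filenames are whatever the page ships, and
indexes past 3 may need a different bus if the emulated IDE controller
tops out at 4 devices):

    qemu-system-mips -M malta -m 512 \
        -kernel vmlinux-3.2.0-4-4kc-malta \
        -hda debian_wheezy_mips_standard.qcow2 \
        -append "root=/dev/sda1 console=ttyS0" -nographic \
        -drive file=/dev/sdb,format=raw,index=1,media=disk \
        -drive file=/dev/sdc,format=raw,index=2,media=disk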

May or may not be the best way, but IMO beats trying to hex-edit the
superblock right away.

-- 
With respect,
Roman


* Re: [RndTbl] Big-endian RAID5 recovery problem
  2017-05-02  8:59 ` Roman Mamedov
@ 2017-05-05 19:22   ` Adam Thompson
  0 siblings, 0 replies; 11+ messages in thread
From: Adam Thompson @ 2017-05-05 19:22 UTC (permalink / raw)
  To: Continuation of Round Table discussion; +Cc: linux-raid

On 2017-05-02 03:59, Roman Mamedov wrote:
> On Mon, 01 May 2017 16:39:07 -0500
> Adam Thompson <athompso@athompso.net> wrote:
> 
>> I can't find any modern big-endian Linux systems... looks like all the
>> ARM distros run in little-endian mode.
> 
> Here are QEMU images for debian-mips (should be big-endian, as opposed
> to debian-mipsel): https://people.debian.org/~aurel32/qemu/mips/
> 
> Of course it will run purely in software, but most likely more than
> fast enough to copy away the data.
> 
> Not entirely sure that particular emulated MIPS system supports more
> than 4 drives, but it appears that a starting point could be (man
> qemu-system):
> 
>            Instead of -hda, -hdb, -hdc, -hdd, you can use:
> 
>                    qemu-system-i386 -drive file=file,index=0,media=disk
>                    qemu-system-i386 -drive file=file,index=1,media=disk
>                    qemu-system-i386 -drive file=file,index=2,media=disk
>                    qemu-system-i386 -drive file=file,index=3,media=disk
> 
> with indexes 0..5, as you need the boot disk, all 4 drives, and one
> more as the backup destination.
> 
> May or may not be the best way, but IMO beats trying to hex-edit the
> superblock right away.


So I now have:
  4 x old IDE hard drives,
  plugged into 4 x USB3-to-IDE adapters,
  plugged into a 4-port USB3 PCIe x1 adapter card,
  plugged into an Ubuntu desktop i5 system @ 3.3GHz,
  passed through into QEMU as virtual SCSI devices,
  connected to a virtual AMD am53c974 SCSI adapter,
  connected to a virtual Malta-board MIPS64 system,
  emulated by QEMU.

Yikes!

The good news is that as soon as the Debian kernel booted, it 
auto-detected the RAID arrays and started re-silvering them.

The bad news is: 1) I thought I had attached the devices in read-only 
mode (oops); and 2) it's re-syncing the MD array at ~5000 KB/sec.
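
(For anyone following along, the resync progress is just the usual
/proc/mdstat inside the VM, and the usual knobs apply, e.g.:

    cat /proc/mdstat
    echo 50000 > /proc/sys/dev/raid/speed_limit_min   # per-device floor, in KB/s

though with an emulated CPU I'm not expecting miracles.)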

I'll leave it to sync over the weekend (praying for no power outages), 
but I sure hope I can upgrade the metadata block instead of doing this 
all through a QEMU (non-accelerated) VM!

Thanks for the suggestions so far,
-Adam


* Re: Big-endian RAID5 recovery problem
  2017-05-01 21:39 Big-endian RAID5 recovery problem Adam Thompson
                   ` (2 preceding siblings ...)
  2017-05-02  8:59 ` Roman Mamedov
@ 2017-05-06  5:57 ` NeilBrown
  2017-05-06  6:41   ` [RndTbl] " Trevor Cordes
  3 siblings, 1 reply; 11+ messages in thread
From: NeilBrown @ 2017-05-06  5:57 UTC (permalink / raw)
  To: Adam Thompson, MUUG Roundtable, linux-raid


On Mon, May 01 2017, Adam Thompson wrote:

> So I've got 4 IDE HDDs, each with 3 RAID partitions on them, that were 
> part of a RAID array in a now-very-dead NAS.
>
> Of course, I need to get data off them that wasn't backed up anywhere 
> else.
>
> I've got a 4-port USB3 PCIe card, and 4 IDE/SATA USB adapters, and all 
> the hardware seems to work.  So far, so good.
>
> The problem is that the disks use the v0.90 metadata format, and they 
> came from a big-endian system, not a little-endian system.  MD 
> superblocks *since* v0.90 are endian-agnostic, but back in v0.90, the 
> superblock was byte-order specific.
>
> mdadm(8) on an Intel processor refuses to acknowledge the existence of 
> the superblock.  Testdisk detects it and correctly identifies it as a 
> Big-endian v0.90 superblock.
>
> I'm reluctant to blindly do a forced --create on the four disks, because 
> I'm not 100% certain of the RAID topology; there are at least two RAID 
> devices, one of which was hidden from the user, so I have no a-priori 
> knowledge of its RAID level or layout.
>
> The filesystems on the md(4) devices are, AFAIK, all XFS, and so should 
> (hopefully) not have any endianness issues.
>
> I can't find any modern big-endian Linux systems... looks like all the 
> ARM distros run in little-endian mode.
>
> Any suggestions on the best way to move forward?
>

Look for "--update=byteorder" in the mdadm man page.
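
i.e. something along the lines of (fill in your own array and member
device names):

    mdadm --assemble /dev/md0 --update=byteorder /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1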

NeilBrown



* Re: [RndTbl] Big-endian RAID5 recovery problem
  2017-05-06  5:57 ` NeilBrown
@ 2017-05-06  6:41   ` Trevor Cordes
  2017-05-07 23:40     ` [mdadm PATCH] Mention "endian" in documentation for --update=byte-order NeilBrown
  0 siblings, 1 reply; 11+ messages in thread
From: Trevor Cordes @ 2017-05-06  6:41 UTC (permalink / raw)
  To: NeilBrown
  Cc: Continuation of Round Table discussion, Adam Thompson, linux-raid

On 2017-05-06 NeilBrown wrote:
> On Mon, May 01 2017, Adam Thompson wrote:
> 
> > The problem is that the disks use the v0.90 metadata format, and
> > they came from a big-endian system, not a little-endian system.  MD 
> > superblocks *since* v0.90 are endian-agnostic, but back in v0.90,
> > the superblock was byte-order specific.
> 
> Look for "--update=byteorder" in the mdadm man page.

Doh!  Make that double-doh!  So easy, yet so hidden.  I'll request one
little tweak/feature: can the man page be updated so that that option
has the word "endian" somewhere in its description?  I think the first
thing most of us did when trying to help was "man mdadm" and search for
"endian".  When that failed, no one thought to search for "byteorder"!
Even Google didn't fix that one, at least not for my web searches.

Thanks linux-RAID guys (especially Neil), once again you make the
difficult easy.


* [mdadm PATCH] Mention "endian" in documentation for --update=byte-order
  2017-05-06  6:41   ` [RndTbl] " Trevor Cordes
@ 2017-05-07 23:40     ` NeilBrown
  2017-05-08 17:42       ` Jes Sorensen
  0 siblings, 1 reply; 11+ messages in thread
From: NeilBrown @ 2017-05-07 23:40 UTC (permalink / raw)
  To: Jes Sorensen, Trevor Cordes
  Cc: Continuation of Round Table discussion, Adam Thompson, linux-raid



This makes it easier to find as "endian" is a commonly used term.

Reported-by: Trevor Cordes <trevor@tecnopolis.ca>
Signed-off-by: NeilBrown <neilb@suse.com>
---
 mdadm.8.in | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mdadm.8.in b/mdadm.8.in
index fb99a5cd9159..388e0edbf89a 100644
--- a/mdadm.8.in
+++ b/mdadm.8.in
@@ -1264,7 +1264,8 @@ is correct.
 The
 .B byteorder
 option allows arrays to be moved between machines with different
-byte-order.
+byte-order, such as from a big-endian machine like a Sparc or some
+MIPS machines, to a little-endian x86_64 machine.
 When assembling such an array for the first time after a move, giving
 .B "\-\-update=byteorder"
 will cause
-- 
2.12.2




* Re: [mdadm PATCH] Mention "endian" in documentation for --update=byte-order
  2017-05-07 23:40     ` [mdadm PATCH] Mention "endian" in documentation for --update=byte-order NeilBrown
@ 2017-05-08 17:42       ` Jes Sorensen
  0 siblings, 0 replies; 11+ messages in thread
From: Jes Sorensen @ 2017-05-08 17:42 UTC (permalink / raw)
  To: NeilBrown, Trevor Cordes
  Cc: Continuation of Round Table discussion, Adam Thompson, linux-raid

On 05/07/2017 07:40 PM, NeilBrown wrote:
>
> This makes it easier to find as "endian" is a commonly used term.
>
> Reported-by: Trevor Cordes <trevor@tecnopolis.ca>
> Signed-off-by: NeilBrown <neilb@suse.com>

Applied!

Thanks,
Jes




