* bootsect replicated in p1, RAID enclosure suggestions?
@ 2016-08-23  5:09 travis+ml-linux-raid
  2016-08-24  2:14 ` travis+ml-linux-raid
                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: travis+ml-linux-raid @ 2016-08-23  5:09 UTC (permalink / raw)
  To: linux-raid

Hello all,

So I have an Intel NUC (for low power Linux) plugged via USB into a 4
bay enclosure doing linear (yeah I know; it's the backup server, the
primary is raid10).

And every once in a while, this happens (*see end).  Partition 1,
which would normally contain an MD slice, ends up being a replica of
the boot sector.  I can't tell if it's the mdraid linear
implementation, the kernel doing something weird, the USB drivers,
the enclosure firmware, or what.

Anyway, this happens every 6 months or so.  Today it happened while I
was in the middle of restoring a Windows machine whose 1TB SSD root
drive suddenly took a nosedive and totally nuked the data on C:.

The last low-power option I tried was an OpenRD Ultimate based around
ARMv5TE which was basically unsupported by debian by the time I got
it, and subsequently became ultra-flaky due to what seemed to be RAM
problems - it was crashing every 3 days with kernel panics, and every
once in a while would do something worse.

Any recommendations for low-power hardware with a well-supported
distro that matches up well with a real backplane and SATA connections
instead of USB?  The only caveats are that I want to encrypt raw disks
and it can't be very noisy - so no rackmount gear with 65dB 1" dog
whistle fans.  Obviously, whatever backplane must be well supported by
the distro.

Also, does anyone have experience with cryptsetup on multiple
partitions?  I can do that, but I get prompted multiple times, and I
was wondering if anyone knew an easy way to fix the boot-time scripts
to avoid that, prompting only once per unique passphrase in crypttab.
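
Something like this crypttab sketch is what I have in mind (untested,
names invented; this uses Debian/Ubuntu's decrypt_keyctl keyscript,
which as I understand it caches the passphrase under a shared key name
so you're only prompted once):

# /etc/crypttab - hypothetical entries; 'bu_key' is just a shared
# identifier that lets decrypt_keyctl reuse the cached passphrase
crypt_sdb1  /dev/sdb1  bu_key  luks,keyscript=decrypt_keyctl
crypt_sdc1  /dev/sdc1  bu_key  luks,keyscript=decrypt_keyctl
crypt_sdd1  /dev/sdd1  bu_key  luks,keyscript=decrypt_keyctl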

And finally, I have a story about buggy drive firmware that you
might enjoy, especially if you were doing this sort of stuff in
the 90s as well.  Cheers:

http://www.subspacefield.org/security/hard_drives_of_doom/


[*]

# parted /dev/sde
GNU Parted 2.3
Using /dev/sde
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) p                                                                
Model: WDC WD40 EFRX-68WT0N0 (scsi)
Disk /dev/sde: 4001GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number  Start   End     Size    File system  Name        Flags
 1      1049kB  4001GB  4001GB               Linux RAID  raid

(parted) q                                                                
# parted /dev/sdd1
GNU Parted 2.3
Using /dev/sdd1
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) p                                                                
Model: Unknown (unknown)
Disk /dev/sdd1: 4001GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number  Start   End     Size    File system  Name        Flags
 1      1049kB  4001GB  4001GB               Linux RAID  raid

-- 
http://www.subspacefield.org/~travis/ | if spammer then john@subspacefield.org
"Computer crime, the glamor crime of the 1970s, will become in the
1980s one of the greatest sources of preventable business loss."
John M. Carroll, "Computer Security", first edition cover flap, 1977


* Re: bootsect replicated in p1, RAID enclosure suggestions?
  2016-08-23  5:09 bootsect replicated in p1, RAID enclosure suggestions? travis+ml-linux-raid
@ 2016-08-24  2:14 ` travis+ml-linux-raid
  2016-08-24 17:15 ` Chris Murphy
  2016-09-01 17:22 ` Wols Lists
  2 siblings, 0 replies; 17+ messages in thread
From: travis+ml-linux-raid @ 2016-08-24  2:14 UTC (permalink / raw)
  To: linux-raid

$ mdadm --version
mdadm - v3.2.5 - 18th May 2012
$ uname -a
Linux hostname 3.2.0-107-generic #148-Ubuntu SMP Mon Jul 18 20:22:08 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
$ cat /etc/lsb-release 
DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=12.04
DISTRIB_CODENAME=precise
DISTRIB_DESCRIPTION="Ubuntu 12.04.5 LTS"

And I think there must be a bug in referencing the beginning of a
partition vs the beginning of the disk which leads to this.  Back when
I was using raw disk devices I had corruption in the first cylinders
which also held the mdlabel and I thought the lack of a partition
table was the problem... obviously not.

Could very well be a bug in USB enclosure firmware too.  Hard to know
how to proceed.

On Mon, Aug 22, 2016 at 10:09:47PM -0700, travis+ml-linux-raid@subspacefield.org wrote:
> Hello all,
> 
> So I have an Intel NUC (for low power Linux) plugged via USB into a 4
> bay enclosure doing linear (yeah I know; it's the backup server, the
> primary is raid10).
> 
> And every once in a while, this happens (*see end).  Partition 1,
> which would normally contain an MD slice, ends up being a replica of
> the boot sector.  I can't tell if it's the mdraid linear
> implementation, the kernel doing something weird, the USB drivers,
> the enclosure firmware, or what.
> 
> Anyway, this happens every 6 months or so.  Today it happened while
> I was in the middle of restoring a Windows machine whose 1TB SSD root
> drive suddenly took a nosedive and totally nuked the data on C:.
> 
> The last low-power option I tried was an OpenRD Ultimate based around
> ARMv5TE which was basically unsupported by debian by the time I got
> it, and subsequently became ultra-flaky due to what seemed to be RAM
> problems - it was crashing every 3 days with kernel panics, and every
> once in a while would do something worse.
> 
> Any recommendations for low-power hardware with a well-supported
> distro that matches up well with a real backplane and SATA connections
> instead of USB?  The only caveats are that I want to encrypt raw disks
> and it can't be very noisy - so no rackmount gear with 65dB 1" dog
> whistle fans.  Obviously, whatever backplane must be well supported by
> the distro.
> 
> Also, does anyone have experience with cryptsetup on multiple
> partitions?  I can do that, but I get prompted multiple times, and I
> was wondering if anyone knew an easy way to fix the boot-time scripts
> to avoid that, prompting only once per unique passphrase in crypttab.
> 
> And finally, I have a story about buggy drive firmware that you
> might enjoy, especially if you were doing this sort of stuff in
> the 90s as well.  Cheers:
> 
> http://www.subspacefield.org/security/hard_drives_of_doom/
> 
> 
> [*]
> 
> # parted /dev/sde
> GNU Parted 2.3
> Using /dev/sde
> Welcome to GNU Parted! Type 'help' to view a list of commands.
> (parted) p                                                                
> Model: WDC WD40 EFRX-68WT0N0 (scsi)
> Disk /dev/sde: 4001GB
> Sector size (logical/physical): 512B/512B
> Partition Table: gpt
> 
> Number  Start   End     Size    File system  Name        Flags
>  1      1049kB  4001GB  4001GB               Linux RAID  raid
> 
> (parted) q                                                                
> # parted /dev/sdd1
> GNU Parted 2.3
> Using /dev/sdd1
> Welcome to GNU Parted! Type 'help' to view a list of commands.
> (parted) p                                                                
> Model: Unknown (unknown)
> Disk /dev/sdd1: 4001GB
> Sector size (logical/physical): 512B/512B
> Partition Table: gpt
> 
> Number  Start   End     Size    File system  Name        Flags
>  1      1049kB  4001GB  4001GB               Linux RAID  raid
> 
> -- 
> http://www.subspacefield.org/~travis/ | if spammer then john@subspacefield.org
> "Computer crime, the glamor crime of the 1970s, will become in the
> 1980s one of the greatest sources of preventable business loss."
> John M. Carroll, "Computer Security", first edition cover flap, 1977

-- 
http://www.subspacefield.org/~travis/ | if spammer then john@subspacefield.org
"Computer crime, the glamor crime of the 1970s, will become in the
1980s one of the greatest sources of preventable business loss."
John M. Carroll, "Computer Security", first edition cover flap, 1977


* Re: bootsect replicated in p1, RAID enclosure suggestions?
  2016-08-23  5:09 bootsect replicated in p1, RAID enclosure suggestions? travis+ml-linux-raid
  2016-08-24  2:14 ` travis+ml-linux-raid
@ 2016-08-24 17:15 ` Chris Murphy
  2016-08-25  6:25   ` travis+ml-linux-raid
  2016-09-01 17:22 ` Wols Lists
  2 siblings, 1 reply; 17+ messages in thread
From: Chris Murphy @ 2016-08-24 17:15 UTC (permalink / raw)
  To: Linux-RAID

On Mon, Aug 22, 2016 at 11:09 PM,
<travis+ml-linux-raid@subspacefield.org> wrote:
> Hello all,
>
> So I have an Intel NUC (for low power Linux) plugged via USB into a 4
> bay enclosure doing linear (yeah I know; it's the backup server, the
> primary is raid10).
>
> And every once in a while, this happens (*see end).  Partition 1,
> which would normally contain an MD slice, ends up being a replica of
> the boot sector.  I can't tell if it's the mdraid linear
> implementation, the kernel doing something weird, the USB drivers,
> the enclosure firmware, or what.

OK, well, you don't tell us what the mdadm create command was; there's
no information on the metadata version, no mdadm -E or -D output, etc.
There's really nothing to go on here, so we can't tell what the
problem is either, or what your question is.

>
> Anyway, this happens every 6 months or so.  Today it happened while
> I was in the middle of restoring a Windows machine whose 1TB SSD root
> drive suddenly took a nosedive and totally nuked the data on C:.

OK? I don't follow this at all, how it relates to the NUC, how it
relates to the USB drives connected to the NUC.


>
> The last low-power option I tried was an OpenRD Ultimate based around
> ARMv5TE which was basically unsupported by debian by the time I got
> it, and subsequently became ultra-flaky due to what seemed to be RAM
> problems - it was crashing every 3 days with kernel panics, and every
> once in a while would do something worse.

This is definitely superfluous information that just clutters the thread...



> Any recommendations for low-power hardware with a well-supported
> distro that matches up well with a real backplane and SATA connections
> instead of USB?  The only caveats are that I want to encrypt raw disks
> and it can't be very noisy - so no rackmount gear with 65dB 1" dog
> whistle fans.  Obviously, whatever backplane must be well supported by
> the distro.

OK so you just want to give up on the existing setup and you want
advice on a whole new setup? From my perspective you're basically on
three separate threads at this point.


>
> Also, does anyone have experience with cryptsetup on multiple
> partitions?  I can do that, but I get prompted multiple times, and I
> was wondering if anyone knew an easy way to fix the boot-time scripts
> to avoid that, prompting only once per unique passphrase in crypttab.

And now you're on your fourth subject for an entirely new thread that
also has nothing to do with this list.  This is probably a
distribution question.  On the distribution I use, the thing that
prompts for a passphrase tries that passphrase on all LUKS devices, so
in the event they share the same passphrase, they're all opened just
by entering the passphrase one time.  If the passphrase is entered
incorrectly, I'm stuck and have to enter the passphrase per LUKS
instance.


>
> And finally, I have a story about buggy drive firmware that you
> might enjoy, especially if you were doing this sort of stuff in
> the 90s as well.  Cheers:

OK...fifth subject and thread.



> # parted /dev/sde
> GNU Parted 2.3

I would start out by using a non-ancient version of parted. This is 6 years old.


> Using /dev/sde
> Welcome to GNU Parted! Type 'help' to view a list of commands.
> (parted) p
> Model: WDC WD40 EFRX-68WT0N0 (scsi)
> Disk /dev/sde: 4001GB
> Sector size (logical/physical): 512B/512B

It's a WDC Red with a physical sector size of 4096B, so it looks
like the USB enclosure is doing the typical thing of masking the true
physical sector size from the kernel.  This is better than the
opposite, where the enclosure reports the drive as 4096B/4096B
logical/physical even though the drive itself has 512B logical
sectors, as that will cause problems if the drive is ever removed
from that enclosure, or put into one that doesn't report 4096B
logical sectors.
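
You can sanity-check what the kernel was told with sysfs, e.g. (sde
assumed to be the enclosure-attached drive):

cat /sys/block/sde/queue/logical_block_size
cat /sys/block/sde/queue/physical_block_size

and compare against what smartctl -i reports with the drive attached
via native SATA, which for a WD Red should be 512 bytes logical, 4096
bytes physical.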


> Partition Table: gpt
>
> Number  Start   End     Size    File system  Name        Flags
>  1      1049kB  4001GB  4001GB               Linux RAID  raid
>
> (parted) q
> # parted /dev/sdd1
> GNU Parted 2.3
> Using /dev/sdd1
> Welcome to GNU Parted! Type 'help' to view a list of commands.
> (parted) p
> Model: Unknown (unknown)
> Disk /dev/sdd1: 4001GB
> Sector size (logical/physical): 512B/512B
> Partition Table: gpt
>
> Number  Start   End     Size    File system  Name        Flags
>  1      1049kB  4001GB  4001GB               Linux RAID  raid


It's purely speculation, but it sounds to me like, somewhere in the
history of one or more drives, the previous signatures weren't removed
before the drive was retasked for its new purpose.  That's the folly
of not wiping signatures in the reverse order they were created, and
of just expecting that starting over will wipe those old signatures.

But I think there is a legitimate gripe that parted probably should
not operate on partitions like this. It's not valid to have nested
GPTs like this. And I have no idea if parted is showing you valid or
bogus information. You'd need to do something like:

dd if=/dev/sdd1 count=2 2>/dev/null | hexdump -C

And then we can see if there really is a PMBR and GPT in that first
sector that parted is picking up. But where it could be coming from in
an mdadm linear layout? No idea.

The other thing to check is the end of the partition, because GPT has
a primary and backup. So the 2nd to last sector of sdd1 may have a
backup GPT on it, and possibly something is wrongly restoring it
sometimes.
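
Untested, but something along these lines would show it (blockdev
reports the size in 512-byte sectors):

SECTORS=$(blockdev --getsz /dev/sdd1)
dd if=/dev/sdd1 skip=$((SECTORS - 2)) count=2 2>/dev/null | hexdump -C

A backup GPT header would show up as the "EFI PART" signature in one
of those trailing sectors.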

In any case I would still look at using something much newer than
parted 2.3 (it's basically Pleistocene old), and the version of mdadm
is likewise old.  But this is what happens with LTS releases: ancient
software whose state and history no one except its maintainers
remembers.


-- 
Chris Murphy


* Re: bootsect replicated in p1, RAID enclosure suggestions?
  2016-08-24 17:15 ` Chris Murphy
@ 2016-08-25  6:25   ` travis+ml-linux-raid
  2016-08-25 21:06     ` Wols Lists
  2016-08-25 22:32     ` Chris Murphy
  0 siblings, 2 replies; 17+ messages in thread
From: travis+ml-linux-raid @ 2016-08-25  6:25 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Linux-RAID

On Wed, Aug 24, 2016 at 11:15:58AM -0600, Chris Murphy wrote:
> OK, well, you don't tell us what the mdadm create command was;
> there's no information on the metadata version, no mdadm -E or -D
> output, etc.  There's really nothing to go on here, so we can't tell
> what the problem is either, or what your question is.

Thanks for the response, I learned some interesting things!

Here is one of the non-nuked drives:

$ sudo mdadm -E /dev/sdd1
/dev/sdd1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : <elided>
           Name : <elided>
  Creation Time : Wed Aug 10 11:33:41 2016
     Raid Level : raid0
   Raid Devices : 4

 Avail Dev Size : 7814035071 (3726.02 GiB 4000.79 GB)
    Data Offset : 16 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : <elided>

    Update Time : Wed Aug 10 11:33:41 2016
       Checksum : 490b562f - correct
         Events : 0

     Chunk Size : 512K

   Device Role : Active device 0
   Array State : AAAA ('A' == active, '.' == missing)

Here is what should look the same, only for device 2 in the array
(device 3 is similar or identical):

$ sudo mdadm -E /dev/sdf1
/dev/sdf1:
   MBR Magic : aa55
Partition[0] :   4294967295 sectors at            1 (type ee)
$ sudo mdadm -D /dev/sdf1
mdadm: /dev/sdf1 does not appear to be an md device

Sadly, I can't do an mdadm -D because I can't assemble the RAID.
$ sudo mdadm -E /dev/md127
$

The command history is gone, but I would imagine that the RAID was
created with something like this:

mdadm --create /dev/md/bu --level=0 --raid-devices=4 /dev/sd{b,c,d,e}1

Although it could have been level=linear.

To summarize my email:
"Is this is a known problem? If not, here is a bug report"

> > Any recommendations for low-power hardware with a well-supported
> > distro that matches up well with a real backplane and SATA connections
> > instead of USB?  The only caveats are that I want to encrypt raw disks
> > and it can't be very noisy - so no rackmount gear with 65dB 1" dog
> > whistle fans.  Obviously, whatever backplane must be well supported by
> > the distro.
> 
> OK so you just want to give up on the existing setup and you want
> advice on a whole new setup? From my perspective you're basically on
> three separate threads at this point.

Depends on the circumstances.  I'm prepared to if there are no obvious
fixes.  My intuition tells me the issue may be in the 4-bay switched
SATA enclosure, or the USB connection, or the driver thereof, and not
mdraid itself.  I'm happy to be wrong on that.

BTW, in case this rings any bells as being buggy, here is the enclosure:
https://www.amazon.com/Mediasonic-ProBox-HF2-SU3S2-SATA-Enclosure/dp/B003X26VV4/

> It's a WDC Red with a physical sector size of 4096B, so it looks
> like the USB enclosure is doing the typical thing of masking the true
> physical sector size from the kernel.  This is better than the
> opposite, where the enclosure reports the drive as 4096B/4096B
> logical/physical even though the drive itself has 512B logical
> sectors, as that will cause problems if the drive is ever removed
> from that enclosure, or put into one that doesn't report 4096B
> logical sectors.

Oooh, that's meaty information, thank you.  I hadn't kept up with
things since the great 2TB changeover.  That could explain some crap I
see with larger drives and USB enclosures.  The problems you describe,
I saw back in the great 2GB switchover. Seagate had some boot sector
magic that would make things work by changing the cylinder sizes,
until it didn't....

> > # parted /dev/sdd1
> > GNU Parted 2.3
> > Using /dev/sdd1
> > Welcome to GNU Parted! Type 'help' to view a list of commands.
> > (parted) p
> > Model: Unknown (unknown)
> > Disk /dev/sdd1: 4001GB
> > Sector size (logical/physical): 512B/512B
> > Partition Table: gpt
> >
> > Number  Start   End     Size    File system  Name        Flags
> >  1      1049kB  4001GB  4001GB               Linux RAID  raid
> 
> It's purely speculation, but it sounds to me like, somewhere in the
> history of one or more drives, the previous signatures weren't
> removed before the drive was retasked for its new purpose.  That's
> the folly of not wiping signatures in the reverse order they were
> created, and of just expecting that starting over will wipe those old
> signatures.

It's possible, but why would you ever end up with a GPT in a partition?

I've certainly encountered this "GPT outside cylinder 0" on these two
drives before, but it goes away with a forcible reassemble or recreate
(which I did last time), because the mdlabel blows it away. Unless
it's something this list knows about, I suspect it is a firmware
glitch in the USB enclosure.

> But I think there is a legitimate gripe that parted probably should
> not operate on partitions like this. It's not valid to have nested
> GPTs like this. And I have no idea if parted is showing you valid or
> bogus information. You'd need to do something like:
> 
> dd if=/dev/sdd1 count=2 2>/dev/null | hexdump -C

## Good disk (for comparison):
$ sudo dd if=/dev/sdd1 count=2 2> /dev/null | file -
/dev/stdin: data
$ sudo dd if=/dev/sdd1 count=2 2> /dev/null | hexdump -C | head -20 
00000000  ff 02 19 2e 03 ee fa d8  6d d7 24 78 e1 d4 04 3d  |........m.$x...=|
00000010  c9 92 33 97 17 7a 10 d3  05 bd 39 36 b4 a9 7c 14  |..3..z....96..|.|
00000020  a7 de 66 b6 cd d9 ff ef  45 27 74 6e 94 0a 03 49  |..f.....E'tn...I|
00000030  d4 43 26 2d 45 39 d1 93  8a 35 91 91 ff c9 a4 8e  |.C&-E9...5......|
00000040  bd 9a 06 6d cc f2 89 65  c0 91 87 1c 1b f0 da 2f  |...m...e......./|
00000050  83 c2 12 eb 80 3c c2 4c  68 cc 65 40 26 13 e0 77  |.....<.Lh.e@&..w|
00000060  38 15 ed 78 27 76 4c 91  71 99 3e 9f 99 f1 3f 51  |8..x'vL.q.>...?Q|
00000070  19 db 12 a3 ac b6 61 12  ff d9 37 87 31 1f 8b dd  |......a...7.1...|
00000080  88 82 de fb db f2 a5 31  10 2a d2 03 be 12 be bd  |.......1.*......|
00000090  19 46 9f c1 3b ea a1 37  81 d2 4d 00 54 e7 b4 55  |.F..;..7..M.T..U|
000000a0  b7 65 6c 3f 95 40 b0 f4  28 ff 90 62 22 cb 22 fd  |.el?.@..(..b".".|
000000b0  6b 4d 90 56 32 4b c6 22  35 b1 62 76 e1 fd 82 d5  |kM.V2K."5.bv....|
000000c0  03 40 c0 85 4b ac 5a 44  9e 6a 25 97 d3 7f bd fe  |.@..K.ZD.j%.....|
000000d0  0c 2d a8 bb 33 f4 00 df  7a 05 ae 6d b3 3e f3 7d  |.-..3...z..m.>.}|
000000e0  34 9e 0e 57 14 de d8 e0  28 63 82 a6 2a 8a 1f fc  |4..W....(c..*...|
000000f0  fe 2f b0 69 67 ac 0a e9  c2 53 a7 d8 36 1a 18 5a  |./.ig....S..6..Z|
00000100  d6 d4 e6 ce df f7 fc 67  13 eb 25 08 45 50 10 7b  |.......g..%.EP.{|
00000110  c6 23 1e 59 dc 2d c2 65  53 90 ca ec 21 e7 28 74  |.#.Y.-.eS...!.(t|
00000120  41 7f 3e 58 72 08 75 c1  d5 ca d0 91 55 5f 43 6a  |A.>Xr.u.....U_Cj|
00000130  4e 84 d5 7f aa f2 b5 27  e4 86 5d 28 ae 6c 29 a1  |N......'..](.l).|

## Bad disk:
$ sudo dd if=/dev/sdf1 count=2 2> /dev/null | file -
/dev/stdin: x86 boot sector; partition 1: ID=0xee, starthead 0, startsector 1, 4294967295 sectors, code offset 0x6f
$ sudo dd if=/dev/sdf1 count=2 2> /dev/null | hexdump -C 
00000000  38 6f 96 52 ea 9c 31 cd  10 a2 84 58 a2 f0 f5 43  |8o.R..1....X...C|
00000010  0f f2 5a 9b c7 ff 82 b2  d8 59 86 60 15 bc 31 65  |..Z......Y.`..1e|
00000020  bc d7 77 f9 31 6a c8 16  3f 13 90 24 b7 57 ff 6b  |..w.1j..?..$.W.k|
00000030  64 7e e2 99 2a 99 f7 32  69 be aa 56 36 31 f7 db  |d~..*..2i..V61..|
00000040  8c 4c 4c 12 68 19 77 0f  f6 3b 92 bf 18 92 c2 45  |.LL.h.w..;.....E|
00000050  73 d5 b7 93 cc ae 6b b9  b0 bd 0c 85 a9 c3 19 f7  |s.....k.........|
00000060  87 34 b8 be 0a 95 cd 03  03 d5 01 49 b5 b0 86 fe  |.4.........I....|
00000070  71 1c d2 f6 42 ed ce b0  eb c3 5f 4c 07 34 30 c7  |q...B....._L.40.|
00000080  8a 1f 91 c4 8b 28 b9 07  8e da ae 7d 7d c5 24 2b  |.....(.....}}.$+|
00000090  6d f9 ea a3 6a 83 9d b8  6a 1f 6d db 3a 01 22 c7  |m...j...j.m.:.".|
000000a0  56 fc 2a 46 f8 b2 84 31  d1 8b 58 55 b6 5a 36 7b  |V.*F...1..XU.Z6{|
000000b0  48 5d 98 2a 3f f0 ae 80  2b f8 6b b2 7f 1e 27 c2  |H].*?...+.k...'.|
000000c0  59 65 d0 bf c7 f0 5b 18  dc 59 8e 68 46 03 b6 ca  |Ye....[..Y.hF...|
000000d0  42 06 7a 52 7a 49 36 03  0d d5 9b 67 a2 03 3b 13  |B.zRzI6....g..;.|
000000e0  40 23 19 f5 1a a6 bd fb  c8 d5 5b 26 f5 6a 86 ab  |@#........[&.j..|
000000f0  89 77 98 d8 09 cb b7 59  80 03 81 48 ba c6 ce 77  |.w.....Y...H...w|
00000100  3c 6c d2 ba a0 71 c3 20  18 fd 77 db ca a8 8a e3  |<l...q. ..w.....|
00000110  8d 6c 1f 17 d5 9f e5 81  bf 50 62 c3 bc f8 6c 5d  |.l.......Pb...l]|
00000120  f7 3f a6 37 6b a9 53 2b  88 15 5d 6e 1e 48 4f b4  |.?.7k.S+..]n.HO.|
00000130  db af b4 f7 f5 7b 4d f3  3f 60 44 60 6e a2 c4 6d  |.....{M.?`D`n..m|
00000140  b9 6c 88 04 e8 66 d1 7c  a0 09 10 66 32 de 70 e1  |.l...f.|...f2.p.|
00000150  98 40 54 5e 1d f2 af b8  2e d1 75 0d 3c 46 1f f8  |.@T^......u.<F..|
00000160  85 72 49 87 ad 92 59 28  fd 9d 22 8e 1b 9f 2c 00  |.rI...Y(.."...,.|
00000170  87 58 74 01 63 a5 94 13  e3 9c ea ec 3f 21 22 41  |.Xt.c.......?!"A|
00000180  05 13 78 f3 a8 46 b3 02  9e 23 cb 9d 21 db a6 ae  |..x..F...#..!...|
00000190  08 a8 70 48 18 6c e2 38  e4 ac 03 6e 06 74 17 7c  |..pH.l.8...n.t.||
000001a0  90 ca 9f 5e 2e 2b 84 ef  52 2c 08 9a 48 98 f9 46  |...^.+..R,..H..F|
000001b0  f4 9f 00 cd ec a0 11 d7  00 00 00 00 00 00 00 00  |................|
000001c0  02 00 ee ff ff ff 01 00  00 00 ff ff ff ff 00 00  |................|
000001d0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000001f0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 55 aa  |..............U.|
00000200  45 46 49 20 50 41 52 54  00 00 01 00 5c 00 00 00  |EFI PART....\...|
00000210  3a dc 43 c4 00 00 00 00  01 00 00 00 00 00 00 00  |:.C.............|
00000220  8e b6 c0 d1 01 00 00 00  22 00 00 00 00 00 00 00  |........".......|
00000230  6d b6 c0 d1 01 00 00 00  a5 4f bd 75 f6 c8 4f 43  |m........O.u..OC|
00000240  92 31 ab b6 a9 59 aa 04  02 00 00 00 00 00 00 00  |.1...Y..........|
00000250  80 00 00 00 80 00 00 00  59 04 3d 4a 00 00 00 00  |........Y.=J....|
00000260  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|

## is that the same as the boot sector itself?  Interesting q.
# dd if=/dev/sdd count=2 of=/tmp/foo && dd if=/dev/sdd1 count=2 of=/tmp/bar && cmp /tmp/foo /tmp/bar
## Nope, how do they differ?  Well that's a bit unpleasant to do manually but here...
# dd if=/dev/sdd count=2 2> /dev/null | hexdump -C
00000000  10 06 27 48 33 df bb 55  8b 28 fe 60 5e 18 6d 38  |..'H3..U.(.`^.m8|
00000010  fc b3 17 36 55 de fd 83  d0 52 72 19 d0 76 12 f0  |...6U....Rr..v..|
00000020  1e 23 bc 4d c5 4d c2 d6  5a d4 2b cd 16 78 c9 28  |.#.M.M..Z.+..x.(|
00000030  77 21 c4 9f c4 b7 48 ad  e0 7b 08 d6 f5 8e 92 a7  |w!....H..{......|
00000040  bc 88 35 02 e7 f8 b8 3b  05 97 db a3 ad e7 96 4b  |..5....;.......K|
00000050  84 d9 e2 a4 3a 5a 07 ac  fc a2 78 58 d7 c8 5a 19  |....:Z....xX..Z.|
00000060  88 9c f6 f2 c0 ec 99 55  d9 5d 00 87 3a 86 52 01  |.......U.]..:.R.|
00000070  92 58 25 82 99 50 8e 28  0f 42 07 71 9a a3 db 82  |.X%..P.(.B.q....|
00000080  00 d9 b8 28 9d d8 97 85  9d c6 fb 5e 4d 94 3a 6e  |...(.......^M.:n|
00000090  19 3c a6 ce 57 6b a0 52  d6 72 0c 41 2e cd cb a2  |.<..Wk.R.r.A....|
000000a0  15 c8 d4 c8 8c 90 34 5f  15 ab 69 96 af 3d 7e 30  |......4_..i..=~0|
000000b0  25 e1 72 35 d6 c4 b2 5e  78 72 0b 3f 9a 96 40 7e  |%.r5...^xr.?..@~|
000000c0  c6 aa 0e 5a da 99 ae fe  a3 93 8b 5b c4 bf 91 64  |...Z.......[...d|
000000d0  d5 62 12 ea 70 15 a9 05  81 8d e4 fb 36 15 c9 63  |.b..p.......6..c|
000000e0  ba f9 d2 5c f6 df 28 71  d8 d5 82 95 2b 83 40 db  |...\..(q....+.@.|
000000f0  9b fe e2 a7 9b 38 5e 5f  51 a6 6e e6 7b 4e bf 02  |.....8^_Q.n.{N..|
00000100  d2 fb aa f9 2c 7a 5b f5  47 ad ac 7e d1 1c f3 1b  |....,z[.G..~....|
00000110  a3 8e 54 9f a4 8d 1a 02  3f cc 81 f0 ca e9 28 1e  |..T.....?.....(.|
00000120  33 9e d8 71 dd f2 aa b7  d4 06 96 cb 0c 8e f1 6a  |3..q...........j|
00000130  88 1d 2a 8a a3 33 00 8c  ef d4 d8 39 3e 70 18 34  |..*..3.....9>p.4|
00000140  e6 3a cd e7 0b d6 82 a8  a4 aa ff bd b3 69 0a cc  |.:...........i..|
00000150  32 9e e3 26 34 bb cc 0e  b0 69 5f 9a c5 f3 57 7d  |2..&4....i_...W}|
00000160  47 82 bc 66 44 55 c4 de  3c 2c 14 d0 9a 73 6a da  |G..fDU..<,...sj.|
00000170  3c 5e f8 99 26 5b f4 8a  13 a1 f1 c8 a9 20 4c 3a  |<^..&[....... L:|
00000180  bd 03 4e e9 83 25 46 32  3f 80 3e 42 58 e7 18 27  |..N..%F2?.>BX..'|
00000190  8a c8 7c 8c 74 99 96 61  d4 e2 58 c2 27 71 8c 3b  |..|.t..a..X.'q.;|
000001a0  da 33 f8 7f b5 c1 a7 a0  c2 7b 54 29 0d 47 b4 b5  |.3.......{T).G..|
000001b0  4c 62 5b f8 e9 6f bc 29  00 00 00 00 00 00 00 00  |Lb[..o.)........|
000001c0  02 00 ee ff ff ff 01 00  00 00 ff ff ff ff 00 00  |................|
000001d0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
000001f0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 55 aa  |..............U.|
00000200  45 46 49 20 50 41 52 54  00 00 01 00 5c 00 00 00  |EFI PART....\...|
00000210  62 01 85 1f 00 00 00 00  01 00 00 00 00 00 00 00  |b...............|
00000220  af be c0 d1 01 00 00 00  22 00 00 00 00 00 00 00  |........".......|
00000230  8e be c0 d1 01 00 00 00  e2 89 58 78 77 63 52 44  |..........XxwcRD|
00000240  93 9e 4a 93 16 06 86 6b  02 00 00 00 00 00 00 00  |..J....k........|
00000250  80 00 00 00 80 00 00 00  5d ff 7e 02 00 00 00 00  |........].~.....|
00000260  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|

> And then we can see if there really is a PMBR and GPT in that first
> sector that parted is picking up. But where it could be coming from in
> an mdadm linear layout? No idea.
> 
> The other thing to check is the end of the partition, because GPT has
> a primary and backup. So the 2nd to last sector of sdd1 may have a
> backup GPT on it, and possibly something is wrongly restoring it
> sometimes.
> 
> In any case I would still look at using something much newer than
> parted 2.3 (it's basically Pleistocene old), and the version of mdadm
> is likewise old.  But this is what happens with LTS releases: ancient
> software whose state and history no one except its maintainers
> remembers.

I understand and can probably acquire the most recent stable and
compile from source, if you think that would prove useful enough to
justify the effort.  TBH once GPT came out I lost track of which
partitioning tool was appropriate to use, it seemed like (IIRC)
cfdisk, sfdisk, parted were all vying for my attention... is parted
now the standard?

At the current moment I am backing up the drives so that I can try a
forcible reassemble.  I think that last time this happened, that
effectively relabeled the mdraid partitions and fixed the problem.
The underlying mdraid has an LVM on LUKS, but last time this happened
I managed to fsck and get 99% of the data back, with only a few things
ending up in lost+found.  Presumably there might have been some data
corruption, but since it's a backup server only I consider it
tolerable, modulo the failed Windows system which needs to restore
from it.
-- 
http://www.subspacefield.org/~travis/ | if spammer then john@subspacefield.org
"Computer crime, the glamor crime of the 1970s, will become in the
1980s one of the greatest sources of preventable business loss."
John M. Carroll, "Computer Security", first edition cover flap, 1977


* Re: bootsect replicated in p1, RAID enclosure suggestions?
  2016-08-25  6:25   ` travis+ml-linux-raid
@ 2016-08-25 21:06     ` Wols Lists
  2016-08-25 22:32     ` Chris Murphy
  1 sibling, 0 replies; 17+ messages in thread
From: Wols Lists @ 2016-08-25 21:06 UTC (permalink / raw)
  To: Linux-RAID, travis+ml-linux-raid

On 25/08/16 07:25, travis+ml-linux-raid@subspacefield.org wrote:
> I understand and can probably acquire the most recent stable and
> compile from source, if you think that would prove useful enough to
> justify the effort.  TBH once GPT came out I lost track of which
> partitioning tool was appropriate to use, it seemed like (IIRC)
> cfdisk, sfdisk, parted were all vying for my attention... is parted
> now the standard?

To add to the fun, I use gdisk (or is it gfdisk?).

Like so many things gnu, when I looked at parted I ran away screaming
from the feature overkill ... :-)

Cheers,
Wol


* Re: bootsect replicated in p1, RAID enclosure suggestions?
  2016-08-25  6:25   ` travis+ml-linux-raid
  2016-08-25 21:06     ` Wols Lists
@ 2016-08-25 22:32     ` Chris Murphy
  2016-08-26  2:33       ` Phil Turmel
  2016-08-26  2:50       ` travis
  1 sibling, 2 replies; 17+ messages in thread
From: Chris Murphy @ 2016-08-25 22:32 UTC (permalink / raw)
  To: Chris Murphy, Linux-RAID

On Thu, Aug 25, 2016 at 12:25 AM,
<travis+ml-linux-raid@subspacefield.org> wrote:

> $ sudo mdadm -E /dev/sdd1
> /dev/sdd1:
>           Magic : a92b4efc
>         Version : 1.2
>     Feature Map : 0x0
>      Array UUID : <elided>
>            Name : <elided>
>   Creation Time : Wed Aug 10 11:33:41 2016
>      Raid Level : raid0
>    Raid Devices : 4
>
>  Avail Dev Size : 7814035071 (3726.02 GiB 4000.79 GB)
>     Data Offset : 16 sectors
>    Super Offset : 8 sectors
>           State : clean
>     Device UUID : <elided>
>
>     Update Time : Wed Aug 10 11:33:41 2016
>        Checksum : 490b562f - correct
>          Events : 0
>
>      Chunk Size : 512K
>
>    Device Role : Active device 0
>    Array State : AAAA ('A' == active, '.' == missing)

I'm confused by Events: 0, even though I see the same thing with
raid0 and linear arrays.  As writes happen and the array is stopped
and started, this Events count does not increase.  A parity-raid-only
thing, I guess?

Anyway, sdd1 has both an mdadm superblock on it, as shown above, and
it also has a GPT on it as shown in your first message and below -
that's not good, but not unfixable.  The mdadm superblock starts at
LBA 8, 4096 bytes from the start of that partition, so it's safe to
zero the first 4096 bytes.  The GPT is mainly in the first three
sectors, so you could just write zeros for a count of 3, although it
is more complete to zero with count=8 - for the partition, not the
whole device.
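
For example (untested - double-check the device name before running):

# zero sectors 0-7 of the member partition; the v1.2 superblock at
# sector 8 (offset 0x1000) is left untouched
dd if=/dev/zero of=/dev/sdd1 bs=512 count=8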


>
> Here is what should be the same, only device 2 in the array
> (device 3 is similar or identical):
>
> $ sudo mdadm -E /dev/sdf1
> /dev/sdf1:
>    MBR Magic : aa55
> Partition[0] :   4294967295 sectors at            1 (type ee)

Looks like the mdadm super block might have been stepped on by
something. You'd need to look for some evidence of it using something
like

dd if=/dev/sdf1 count=9 2>/dev/null | hexdump -C

If it's intact it should be at offset 0x1000, and then it's again
just a matter of wiping the first 8 sectors - of the partition, not
the whole device.
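
A quick way to check just that sector (the v1.2 magic a92b4efc is
stored little-endian, so hexdump shows it as fc 4e 2b a9):

dd if=/dev/sdf1 skip=8 count=1 2>/dev/null | hexdump -C | head -1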






> $ sudo mdadm -D /dev/sdf1
> mdadm: /dev/sdf1 does not appear to be an md device

You're getting the commands confused. -E applies to /dev/sdXY member
devices, and -D applies to /dev/mdX arrays.


>
> Sadly, I can't do an mdadm -D because I can't assemble the RAID.
> $ sudo mdadm -E /dev/md127

Again, wrong command, you should use -D for this.


> $
>
> The command history is gone, but I would imagine that the RAID was
> created with something like this:
>
> mdadm --create /dev/md/bu --level=0 --raid-devices=4 /dev/sd{b,c,d,e}1
>
> Although it could have been level=linear.
>
> To summarize my email:
> "Is this is a known problem? If not, here is a bug report"

This is not a bug report.  There are no reproduction steps and no
evidence of a bug.  I'm not experiencing random replacement of mdadm
superblock data with MBR and GPT signatures.  That's not really what
I'd expect of drive or enclosure firmware, which by design should be
partition agnostic, as there's more than one or two valid kinds of
partitioning.  Plus, it'd be scary even if it picked the right one,
since it could clobber a legitimate existing one.

So I'd say it's something else.


>> It's purely speculation, but it sounds to me like, somewhere in the
>> history of one or more drives, the previous signatures weren't
>> removed before the drive was retasked for its new purpose.  That's
>> the folly of not wiping signatures in the reverse order they were
>> created, and of just expecting that starting over will wipe those old
>> signatures.
>
> It's possible, but why would you ever end up with a GPT in a partition?

In every case I've seen, it was user error. I haven't heard of things
putting GPTs in partitions, and in a sense I'd say it's a bug if any
utility lets a user do that.  Nesting GPTs in partitions is a bad idea,
although it *should* be innocuous because it shouldn't be seen/honored
by anything that doesn't go looking for it because it doesn't belong
there.



>
> I've certainly encountered this "GPT outside cylinder 0" on these two
> drives before,

Keep in mind cylinders are gone, they don't exist anymore. Drives all
speak in LBAs now. *shrug* The GPT typically involves LBAs 0, 1 and 2
at least, more if there are more than 4 partitions.

> but it goes away with a forcible reassemble or recreate
> (which I did last time), because the mdlabel blows it away.

Umm, I think that only happens with -U, --update.


>Unless
> it's something this list knows about, I suspect it is a firmware
> glitch in the USB enclosure.

Doubtful.




>
>> But I think there is a legitimate gripe that parted probably should
>> not operate on partitions like this. It's not valid to have nested
>> GPTs like this. And I have no idea if parted is showing you valid or
>> bogus information. You'd need to do something like:
>>
>> dd if=/dev/sdd1 count=2 2>/dev/null | hexdump -C
>
> ## Good disk (for comparison):
> $ sudo dd if=/dev/sdd1 count=2 2> /dev/null | file -
> /dev/stdin: data
> $ sudo dd if=/dev/sdd1 count=2 2> /dev/null | hexdump -C | head -20
> 00000000  ff 02 19 2e 03 ee fa d8  6d d7 24 78 e1 d4 04 3d  |........m.$x...=|
> 00000010  c9 92 33 97 17 7a 10 d3  05 bd 39 36 b4 a9 7c 14  |..3..z....96..|.|
> 00000020  a7 de 66 b6 cd d9 ff ef  45 27 74 6e 94 0a 03 49  |..f.....E'tn...I|
> 00000030  d4 43 26 2d 45 39 d1 93  8a 35 91 91 ff c9 a4 8e  |.C&-E9...5......|
> 00000040  bd 9a 06 6d cc f2 89 65  c0 91 87 1c 1b f0 da 2f  |...m...e......./|
> 00000050  83 c2 12 eb 80 3c c2 4c  68 cc 65 40 26 13 e0 77  |.....<.Lh.e@&..w|
> 00000060  38 15 ed 78 27 76 4c 91  71 99 3e 9f 99 f1 3f 51  |8..x'vL.q.>...?Q|
> 00000070  19 db 12 a3 ac b6 61 12  ff d9 37 87 31 1f 8b dd  |......a...7.1...|
> 00000080  88 82 de fb db f2 a5 31  10 2a d2 03 be 12 be bd  |.......1.*......|
> 00000090  19 46 9f c1 3b ea a1 37  81 d2 4d 00 54 e7 b4 55  |.F..;..7..M.T..U|
> 000000a0  b7 65 6c 3f 95 40 b0 f4  28 ff 90 62 22 cb 22 fd  |.el?.@..(..b".".|
> 000000b0  6b 4d 90 56 32 4b c6 22  35 b1 62 76 e1 fd 82 d5  |kM.V2K."5.bv....|
> 000000c0  03 40 c0 85 4b ac 5a 44  9e 6a 25 97 d3 7f bd fe  |.@..K.ZD.j%.....|
> 000000d0  0c 2d a8 bb 33 f4 00 df  7a 05 ae 6d b3 3e f3 7d  |.-..3...z..m.>.}|
> 000000e0  34 9e 0e 57 14 de d8 e0  28 63 82 a6 2a 8a 1f fc  |4..W....(c..*...|
> 000000f0  fe 2f b0 69 67 ac 0a e9  c2 53 a7 d8 36 1a 18 5a  |./.ig....S..6..Z|
> 00000100  d6 d4 e6 ce df f7 fc 67  13 eb 25 08 45 50 10 7b  |.......g..%.EP.{|
> 00000110  c6 23 1e 59 dc 2d c2 65  53 90 ca ec 21 e7 28 74  |.#.Y.-.eS...!.(t|
> 00000120  41 7f 3e 58 72 08 75 c1  d5 ca d0 91 55 5f 43 6a  |A.>Xr.u.....U_Cj|
> 00000130  4e 84 d5 7f aa f2 b5 27  e4 86 5d 28 ae 6c 29 a1  |N......'..](.l).|

OK, I don't know why you used head - I needed to see past offset
0x130.  Offset lines 0x1f0 and 0x200 have the MBR and GPT signatures,
so the above doesn't really tell me anything.

I don't recognize the above stuff, so I'm not sure what it is. I'd
usually expect it to be zeros if it's not a boot drive.

>
> ## Bad disk:
> $ sudo dd if=/dev/sdf1 count=2 2> /dev/null | file -
> /dev/stdin: x86 boot sector; partition 1: ID=0xee, starthead 0, startsector 1, 4294967295 sectors, code offset 0x6f
> $ sudo dd if=/dev/sdf1 count=2 2> /dev/null | hexdump -C
> 00000000  38 6f 96 52 ea 9c 31 cd  10 a2 84 58 a2 f0 f5 43  |8o.R..1....X...C|
> 00000010  0f f2 5a 9b c7 ff 82 b2  d8 59 86 60 15 bc 31 65  |..Z......Y.`..1e|
> 00000020  bc d7 77 f9 31 6a c8 16  3f 13 90 24 b7 57 ff 6b  |..w.1j..?..$.W.k|
> 00000030  64 7e e2 99 2a 99 f7 32  69 be aa 56 36 31 f7 db  |d~..*..2i..V61..|
> 00000040  8c 4c 4c 12 68 19 77 0f  f6 3b 92 bf 18 92 c2 45  |.LL.h.w..;.....E|
> 00000050  73 d5 b7 93 cc ae 6b b9  b0 bd 0c 85 a9 c3 19 f7  |s.....k.........|
> 00000060  87 34 b8 be 0a 95 cd 03  03 d5 01 49 b5 b0 86 fe  |.4.........I....|
> 00000070  71 1c d2 f6 42 ed ce b0  eb c3 5f 4c 07 34 30 c7  |q...B....._L.40.|
> 00000080  8a 1f 91 c4 8b 28 b9 07  8e da ae 7d 7d c5 24 2b  |.....(.....}}.$+|
> 00000090  6d f9 ea a3 6a 83 9d b8  6a 1f 6d db 3a 01 22 c7  |m...j...j.m.:.".|
> 000000a0  56 fc 2a 46 f8 b2 84 31  d1 8b 58 55 b6 5a 36 7b  |V.*F...1..XU.Z6{|
> 000000b0  48 5d 98 2a 3f f0 ae 80  2b f8 6b b2 7f 1e 27 c2  |H].*?...+.k...'.|
> 000000c0  59 65 d0 bf c7 f0 5b 18  dc 59 8e 68 46 03 b6 ca  |Ye....[..Y.hF...|
> 000000d0  42 06 7a 52 7a 49 36 03  0d d5 9b 67 a2 03 3b 13  |B.zRzI6....g..;.|
> 000000e0  40 23 19 f5 1a a6 bd fb  c8 d5 5b 26 f5 6a 86 ab  |@#........[&.j..|
> 000000f0  89 77 98 d8 09 cb b7 59  80 03 81 48 ba c6 ce 77  |.w.....Y...H...w|
> 00000100  3c 6c d2 ba a0 71 c3 20  18 fd 77 db ca a8 8a e3  |<l...q. ..w.....|
> 00000110  8d 6c 1f 17 d5 9f e5 81  bf 50 62 c3 bc f8 6c 5d  |.l.......Pb...l]|
> 00000120  f7 3f a6 37 6b a9 53 2b  88 15 5d 6e 1e 48 4f b4  |.?.7k.S+..]n.HO.|
> 00000130  db af b4 f7 f5 7b 4d f3  3f 60 44 60 6e a2 c4 6d  |.....{M.?`D`n..m|
> 00000140  b9 6c 88 04 e8 66 d1 7c  a0 09 10 66 32 de 70 e1  |.l...f.|...f2.p.|
> 00000150  98 40 54 5e 1d f2 af b8  2e d1 75 0d 3c 46 1f f8  |.@T^......u.<F..|
> 00000160  85 72 49 87 ad 92 59 28  fd 9d 22 8e 1b 9f 2c 00  |.rI...Y(.."...,.|
> 00000170  87 58 74 01 63 a5 94 13  e3 9c ea ec 3f 21 22 41  |.Xt.c.......?!"A|
> 00000180  05 13 78 f3 a8 46 b3 02  9e 23 cb 9d 21 db a6 ae  |..x..F...#..!...|
> 00000190  08 a8 70 48 18 6c e2 38  e4 ac 03 6e 06 74 17 7c  |..pH.l.8...n.t.||
> 000001a0  90 ca 9f 5e 2e 2b 84 ef  52 2c 08 9a 48 98 f9 46  |...^.+..R,..H..F|
> 000001b0  f4 9f 00 cd ec a0 11 d7  00 00 00 00 00 00 00 00  |................|
> 000001c0  02 00 ee ff ff ff 01 00  00 00 ff ff ff ff 00 00  |................|
> 000001d0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> *
> 000001f0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 55 aa  |..............U.|
> 00000200  45 46 49 20 50 41 52 54  00 00 01 00 5c 00 00 00  |EFI PART....\...|
> 00000210  3a dc 43 c4 00 00 00 00  01 00 00 00 00 00 00 00  |:.C.............|
> 00000220  8e b6 c0 d1 01 00 00 00  22 00 00 00 00 00 00 00  |........".......|
> 00000230  6d b6 c0 d1 01 00 00 00  a5 4f bd 75 f6 c8 4f 43  |m........O.u..OC|
> 00000240  92 31 ab b6 a9 59 aa 04  02 00 00 00 00 00 00 00  |.1...Y..........|
> 00000250  80 00 00 00 80 00 00 00  59 04 3d 4a 00 00 00 00  |........Y.=J....|
> 00000260  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|


OK it does in fact have a PMBR and GPT in the 1st and 2nd sector of
this partition. Pretty weird how it got there. There is a UUID
starting at offset 0x238 so you can look around and see if anything
else has that UUID or if that UUID ever changed or comes back after
you fix this. If it's not the same UUID, something is creating it with
a random UUID each time, which would mean it's not just being copied
from somewhere.
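
For example, something like this (untested) pulls out just those 16
GUID bytes for later comparison:

dd if=/dev/sdf1 bs=1 skip=$((0x238)) count=16 2>/dev/null | hexdump -C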


>
> ## is that the same as the boot sector itself?  Interesting q.
> # dd if=/dev/sdd count=2 of=/tmp/foo && dd if=/dev/sdd1 count=2 of=/tmp/bar && cmp /tmp/foo /tmp/bar
> ## Nope, how do they differ?  Well that's a bit unpleasant to do manually but here...
> # dd if=/dev/sdd count=2 2> /dev/null | hexdump -C
> 00000000  10 06 27 48 33 df bb 55  8b 28 fe 60 5e 18 6d 38  |..'H3..U.(.`^.m8|
> 00000010  fc b3 17 36 55 de fd 83  d0 52 72 19 d0 76 12 f0  |...6U....Rr..v..|
> 00000020  1e 23 bc 4d c5 4d c2 d6  5a d4 2b cd 16 78 c9 28  |.#.M.M..Z.+..x.(|
> 00000030  77 21 c4 9f c4 b7 48 ad  e0 7b 08 d6 f5 8e 92 a7  |w!....H..{......|
> 00000040  bc 88 35 02 e7 f8 b8 3b  05 97 db a3 ad e7 96 4b  |..5....;.......K|
> 00000050  84 d9 e2 a4 3a 5a 07 ac  fc a2 78 58 d7 c8 5a 19  |....:Z....xX..Z.|
> 00000060  88 9c f6 f2 c0 ec 99 55  d9 5d 00 87 3a 86 52 01  |.......U.]..:.R.|
> 00000070  92 58 25 82 99 50 8e 28  0f 42 07 71 9a a3 db 82  |.X%..P.(.B.q....|
> 00000080  00 d9 b8 28 9d d8 97 85  9d c6 fb 5e 4d 94 3a 6e  |...(.......^M.:n|
> 00000090  19 3c a6 ce 57 6b a0 52  d6 72 0c 41 2e cd cb a2  |.<..Wk.R.r.A....|
> 000000a0  15 c8 d4 c8 8c 90 34 5f  15 ab 69 96 af 3d 7e 30  |......4_..i..=~0|
> 000000b0  25 e1 72 35 d6 c4 b2 5e  78 72 0b 3f 9a 96 40 7e  |%.r5...^xr.?..@~|
> 000000c0  c6 aa 0e 5a da 99 ae fe  a3 93 8b 5b c4 bf 91 64  |...Z.......[...d|
> 000000d0  d5 62 12 ea 70 15 a9 05  81 8d e4 fb 36 15 c9 63  |.b..p.......6..c|
> 000000e0  ba f9 d2 5c f6 df 28 71  d8 d5 82 95 2b 83 40 db  |...\..(q....+.@.|
> 000000f0  9b fe e2 a7 9b 38 5e 5f  51 a6 6e e6 7b 4e bf 02  |.....8^_Q.n.{N..|
> 00000100  d2 fb aa f9 2c 7a 5b f5  47 ad ac 7e d1 1c f3 1b  |....,z[.G..~....|
> 00000110  a3 8e 54 9f a4 8d 1a 02  3f cc 81 f0 ca e9 28 1e  |..T.....?.....(.|
> 00000120  33 9e d8 71 dd f2 aa b7  d4 06 96 cb 0c 8e f1 6a  |3..q...........j|
> 00000130  88 1d 2a 8a a3 33 00 8c  ef d4 d8 39 3e 70 18 34  |..*..3.....9>p.4|
> 00000140  e6 3a cd e7 0b d6 82 a8  a4 aa ff bd b3 69 0a cc  |.:...........i..|
> 00000150  32 9e e3 26 34 bb cc 0e  b0 69 5f 9a c5 f3 57 7d  |2..&4....i_...W}|
> 00000160  47 82 bc 66 44 55 c4 de  3c 2c 14 d0 9a 73 6a da  |G..fDU..<,...sj.|
> 00000170  3c 5e f8 99 26 5b f4 8a  13 a1 f1 c8 a9 20 4c 3a  |<^..&[....... L:|
> 00000180  bd 03 4e e9 83 25 46 32  3f 80 3e 42 58 e7 18 27  |..N..%F2?.>BX..'|
> 00000190  8a c8 7c 8c 74 99 96 61  d4 e2 58 c2 27 71 8c 3b  |..|.t..a..X.'q.;|
> 000001a0  da 33 f8 7f b5 c1 a7 a0  c2 7b 54 29 0d 47 b4 b5  |.3.......{T).G..|
> 000001b0  4c 62 5b f8 e9 6f bc 29  00 00 00 00 00 00 00 00  |Lb[..o.)........|
> 000001c0  02 00 ee ff ff ff 01 00  00 00 ff ff ff ff 00 00  |................|
> 000001d0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
> *
> 000001f0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 55 aa  |..............U.|
> 00000200  45 46 49 20 50 41 52 54  00 00 01 00 5c 00 00 00  |EFI PART....\...|
> 00000210  62 01 85 1f 00 00 00 00  01 00 00 00 00 00 00 00  |b...............|
> 00000220  af be c0 d1 01 00 00 00  22 00 00 00 00 00 00 00  |........".......|
> 00000230  8e be c0 d1 01 00 00 00  e2 89 58 78 77 63 52 44  |..........XxwcRD|
> 00000240  93 9e 4a 93 16 06 86 6b  02 00 00 00 00 00 00 00  |..J....k........|
> 00000250  80 00 00 00 80 00 00 00  5d ff 7e 02 00 00 00 00  |........].~.....|
> 00000260  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|

We kinda expect sdd to have a valid PMBR and GPT though... so that's
sane. I just don't know what to make of the stuff in LBA 0 before the
PMBR.


> I understand and can probably acquire the most recent stable and
> compile from source, if you think that would prove useful enough to
> justify the effort.  TBH once GPT came out I lost track of which
> partitioning tool was appropriate to use, it seemed like (IIRC)
> cfdisk, sfdisk, parted were all vying for my attention... is parted
> now the standard?

It is common. I prefer gdisk, which has a nomenclature similar to
fdisk. The nomenclature of parted is confusing.


>
> At the current moment I am backing up the drives so that I can try a
> forcible reassemble.  I think that last time this happened, that
> effectively relabeled the mdraid partitions and fixed the problem.
> The underlying mdraid has an LVM on LUKS, but last time this happened
> I managed to fsck and get 99% of the data back, with only a few things
> ending up in lost+found.  Presumably there might have been some data
> corruption, but since it's a backup server only I consider it
> tolerable, modulo the failed Windows system which needs to restore
> from it.

FWIW, if you want either linear or raid0, a much simpler layout is to
blow away all four drives with hdparm and an ATA security erase to get
rid of all signatures; then make all of them into LVM physical volumes
without any partitioning, make a logical volume (linear/concat by
default, or raid0 - this is a per-logical-volume characteristic),
encrypt the LV, and then format the LUKS volume.  There's no advantage
to adding either partitions or mdadm RAID if you're going to use LVM
anyway and this is a Linux-only storage enclosure.
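
A rough sketch of that stack (untested; device and volume names
invented):

# per drive: set a throwaway password, then ATA security erase
hdparm --user-master u --security-set-pass tmppass /dev/sdb
hdparm --user-master u --security-erase tmppass /dev/sdb

pvcreate /dev/sd[b-e]
vgcreate bu /dev/sd[b-e]
lvcreate -l 100%FREE -n budata bu      # linear/concat by default
cryptsetup luksFormat /dev/bu/budata
cryptsetup luksOpen /dev/bu/budata budata
mkfs.ext4 /dev/mapper/budata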


-- 
Chris Murphy


* Re: bootsect replicated in p1, RAID enclosure suggestions?
  2016-08-25 22:32     ` Chris Murphy
@ 2016-08-26  2:33       ` Phil Turmel
  2016-08-26  2:48         ` Chris Murphy
  2016-08-26  2:50       ` travis
  1 sibling, 1 reply; 17+ messages in thread
From: Phil Turmel @ 2016-08-26  2:33 UTC (permalink / raw)
  To: Chris Murphy, Linux-RAID

On 08/25/2016 06:32 PM, Chris Murphy wrote:

>> It's possible, but why would you ever end up with a GPT in a partition?
> 
> In every case I've seen, it was user error. I haven't heard of things
> putting GPTs in partitions, and in a sense I'd say it's a bug if any
> utility lets a user do that.  Nesting GPTs in partitions is a bad idea,
> although it *should* be innocuous because it shouldn't be seen/honored
> by anything that doesn't go looking for it because it doesn't belong
> there.

It is possible to run gdisk or parted on /dev/sdX1 accidentally instead
of /dev/sdX.  Pretty simple user error.

It is also possible and appropriate if using v0.90 or v1.0 metadata on
an array and you partition the array itself.  Then it'll show up on
member 0, any mirror of member 0, and possibly on a parity disk (if
intervening blocks are zero).

Phil


* Re: bootsect replicated in p1, RAID enclosure suggestions?
  2016-08-26  2:33       ` Phil Turmel
@ 2016-08-26  2:48         ` Chris Murphy
  2016-08-26  3:11           ` travis+ml-linux-raid
  0 siblings, 1 reply; 17+ messages in thread
From: Chris Murphy @ 2016-08-26  2:48 UTC (permalink / raw)
  To: Phil Turmel; +Cc: Chris Murphy, Linux-RAID

On Thu, Aug 25, 2016 at 8:33 PM, Phil Turmel <philip@turmel.org> wrote:
> On 08/25/2016 06:32 PM, Chris Murphy wrote:
>
>>> It's possible, but why would you ever end up with a GPT in a partition?
>>
>> In every case I've seen, it was user error. I haven't heard of things
>> putting GPTs in partitions, and in a sense I'd say it's a bug if any
>> utility lets a user do that.  Nesting GPTs in partitions is a bad idea,
>> although it *should* be innocuous because it shouldn't be seen/honored
>> by anything that doesn't go looking for it because it doesn't belong
>> there.
>
> It is possible to run gdisk or parted on /dev/sdX1 accidentally instead
> of /dev/sdX.  Pretty simple user error.
>
> It is also possible and appropriate if using v0.90 or v1.0 metadata on
> an array and you partition the array itself.  Then it'll show up on
> member 0, any mirror of member 0, and possibly on a parity disk (if
> intervening blocks are zero).

Right, so something like GPT on /dev/sda and /dev/sdb to create sda1
and sdb1, then mdadm -C /dev/md0 --metadata=1.0 ... /dev/sda1
/dev/sdb1, and then create a GPT on /dev/md0.  The result is that
/dev/md0, /dev/sda1, and /dev/sdb1 will all appear to have the same
GPT on them.
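
Spelled out (untested, and not recommended):

parted -s /dev/sda mklabel gpt mkpart primary 1MiB 100%
parted -s /dev/sdb mklabel gpt mkpart primary 1MiB 100%
mdadm -C /dev/md0 --metadata=1.0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1
parted -s /dev/md0 mklabel gpt   # also lands at the head of sda1/sdb1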

I would say that's probably a bad idea; I know some tools allow it,
but it creates an ambiguity.  It could be argued to be inconsistent
with the UEFI spec: the only nesting it describes is MBR on a GPT
partition, not GPT nested in a GPT partition.  This is probably also
better done using LVM.  Otherwise we get nutty things...


-- 
Chris Murphy


* Re: bootsect replicated in p1, RAID enclosure suggestions?
  2016-08-25 22:32     ` Chris Murphy
  2016-08-26  2:33       ` Phil Turmel
@ 2016-08-26  2:50       ` travis
  2016-08-26  3:21         ` Chris Murphy
  1 sibling, 1 reply; 17+ messages in thread
From: travis @ 2016-08-26  2:50 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Linux-RAID

On Thu, Aug 25, 2016 at 04:32:12PM -0600, Chris Murphy wrote:
> that's not good, but not unfixable. The mdadm super block starts at
> LBA 8, 4096 bytes from the start of that partition, so it's safe to
> zero the first 4096 bytes. The GPT is mainly in the first three
> sectors so you could just write zeros for a count of 3, although it is
> more complete to zero with a count=8, for the partition, not the whole
> device.

Useful info, thanks.

> Looks like the mdadm super block might have been stepped on by
> something. You'd need to look for some evidence of it using something
> like
> 
> dd if=/dev/sdf1 count=9 2>/dev/null | hexdump -C
> 
> If it's intact it should be at offset x1000 and again just a matter of
> wiping the first 8 sectors, again of the partition, not the whole
> device.

> > Sadly, I can't do a mdadm -D because I can't assemble the RAID.
> > $ sudo mdadm -E /dev/md127
> 
> Again, wrong command, you should use -D for this.

# mdadm -D /dev/md127 
mdadm: md device /dev/md127 does not appear to be active.

> This is not a bug report.  There are no reproduction steps and no
> evidence of a bug.  I'm not experiencing random replacement of mdadm
> superblock data with MBR and GPT signatures.

I realize it's not terribly actionable.  But enough circumstantial
evidence from enough people and one starts looking for things which
can exhibit that behavior.

> That's not really what I'd expect of drive or enclosure firmware,
> which by design should be partition agnostic, as there's more than
> one or two valid kinds of partitioning.  Plus, it'd be scary even if
> it picked the right one, since it could clobber a legitimate existing
> one.

I've had some weird shit happen, but you're right that it's odd that
it'd write a partition table out to /dev/sdd1 instead of /dev/sdd;
that almost sounds like something that would require the OS to get
involved to get that offset confused.

> So I'd say it's something else.

Do you have any idea what that could be?  I haven't logged into this
box in months, and nobody else has either.  If it's not USB or drive
firmware, I'm fresh out of ideas.  Repartitioning disks isn't exactly
something most stuff does automatically and without prompting, as it's
pretty dangerous.

> In every case I've seen, it was user error. I haven't heard of things
> putting GPTs in partitions, and in a sense I'd say it's a bug if any
> utility lets a user do that.  Nesting GPTs in partitions is a bad idea,
> although it *should* be innocuous because it shouldn't be seen/honored
> by anything that doesn't go looking for it because it doesn't belong
> there.

That's entirely possible.  When I had this problem the _first_ few times
I assumed it was the fact I was using raw disks and not partitioned disks.
I had a very similar problem, where something would wipe out the mdlabel,
but only on the last two drives of the array.

In fact, I decided to grep around for /dev/sdd1 and /dev/sde1 which seem
to get trounced (but not /dev/sd[bc]1) and what do you know:

# grep -R /dev/sde1 /etc/
/etc/lvm/cache/.cache:          "/dev/sde1",

That certainly looks promising.  I wonder if you just solved my
problem without a hardware upgrade.

> > I've certainly encountered this "GPT outside cylinder 0" on these two
> > drives before,
> 
> Keep in mind cylinders are gone, they don't exist anymore. Drives all
> speak in LBAs now. *shrug* The GPT typically involves LBAs 0, 1 and 2
> at least, more if there are more than 4 partitions.

Shorthand for "before partition 1".

> I don't recognize the above stuff, so I'm not sure what it is. I'd
> usually expect it to be zeros if it's not a boot drive.

It was used as a raw disk in an encrypted RAID before.

> OK it does in fact have a PMBR and GPT in the 1st and 2nd sector of
> this partition. Pretty weird how it got there. There is a UUID
> starting at offset 0x238 so you can look around and see if anything
> else has that UUID or if that UUID ever changed or comes back after
> you fix this. If it's not the same UUID, something is creating it with
> a random UUID each time, which would mean it's not just being copied
> from somewhere.

Got it.  Good idea.

> We kinda expect sdd to have a valid PMBR and GPT though... so that's
> sane. I just don't know what to make of the stuff in LBA 0 before the
> PMBR.

It's just random fill from a previous incarnation.

> It is common. I prefer gdisk, which has a nomenclature similar to
> fdisk. The nomenclature of parted is confusing.

I think somewhere in learning parted and repartitioning all the disks,
I managed to type /dev/sdX1 instead of /dev/sdX when creating the
partitions.
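
Presumably the mistake looked something like this (a reconstructed
guess, not from actual command history):

parted /dev/sdd1 mklabel gpt mkpart primary 1MiB 100%   # oops
parted /dev/sdd mklabel gpt mkpart primary 1MiB 100%    # intended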

> FWIW, if you want either linear or raid0, a much simpler layout is to
> blow away all four drives with hdparm and an ATA security erase to get
> rid of all signatures; then make all of them into LVM physical volumes
> without any partitioning, make a logical volume (linear/concat by
> default, or raid0 - this is a per-logical-volume characteristic),
> encrypt the LV, and then format the LUKS volume.  There's no advantage
> to adding either partitions or mdadm RAID if you're going to use LVM
> anyway and this is a Linux-only storage enclosure.

Good call - it reduces the diversity of layers in the stack, too.  Thanks.
-- 
http://www.subspacefield.org/~travis/ | if spammer then john@subspacefield.org
"Computer crime, the glamor crime of the 1970s, will become in the
1980s one of the greatest sources of preventable business loss."
John M. Carroll, "Computer Security", first edition cover flap, 1977

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: bootsect replicated in p1, RAID enclosure suggestions?
  2016-08-26  2:48         ` Chris Murphy
@ 2016-08-26  3:11           ` travis+ml-linux-raid
  0 siblings, 0 replies; 17+ messages in thread
From: travis+ml-linux-raid @ 2016-08-26  3:11 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Phil Turmel, Linux-RAID

On Thu, Aug 25, 2016 at 08:48:24PM -0600, Chris Murphy wrote:
> Right, so something like GPT on /dev/sda and /dev/sdb to create sda1
> and sdb1, then mdadm -C /dev/md0 --metadata=1.0 ... /dev/sda1
> /dev/sdb1, and then create a GPT on /dev/md0. The result is /dev/md0,
> /dev/sda1, and /dev/sdb1 will all appear to have the same GPT on them.
> 
> I would say that's probably a bad idea; I know some tools allow it,
> but it creates an ambiguity. It could be argued to be inconsistent
> with the UEFI spec. The only nesting it describes is MBR on a GPT
> partition, not GPT nested in a GPT partition. This is probably also
> better done using LVM. Otherwise we get nutty things...

We had similar ambiguities in MBR-land: if you set the active flag on
more than one partition, or if you had more than one extended
partition.

Since the behavior in those cases is undefined, it seems wise to avoid
creating them.

It's better if the specification avoids these situations entirely - so
that no combination of bits can create an ambiguous interpretation -
but in the occasional cases where that can't be done, you're best off
not creating those situations yourself.
-- 
http://www.subspacefield.org/~travis/ | if spammer then john@subspacefield.org
"Computer crime, the glamor crime of the 1970s, will become in the
1980s one of the greatest sources of preventable business loss."
John M. Carroll, "Computer Security", first edition cover flap, 1977

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: bootsect replicated in p1, RAID enclosure suggestions?
  2016-08-26  2:50       ` travis
@ 2016-08-26  3:21         ` Chris Murphy
  2016-08-26  3:58           ` travis
  0 siblings, 1 reply; 17+ messages in thread
From: Chris Murphy @ 2016-08-26  3:21 UTC (permalink / raw)
  To: Chris Murphy, Linux-RAID

On Thu, Aug 25, 2016 at 8:50 PM,  <travis@subspacefield.org> wrote:

>
>> So I'd say it's something else.
>
> Do you have any idea what that could be?

User error; you even suspect it yourself later...



> In fact, I decided to grep around for /dev/sdd1 and /dev/sde1 which seem
> to get trounced (but not /dev/sd[bc]1) and what do you know:
>
> # grep -R /dev/sde1 /etc/
> /etc/lvm/cache/.cache:          "/dev/sde1",
>
> That certainly looks promising.  I wonder if you just solved my problem
> without hardware upgrade.

That just contains a listing of LVM devices; I don't think it's
related to this problem.




>> > I've certainly encountered this "GPT outside cylinder 0" on these two
>> > drives before,
>>
>> Keep in mind cylinders are gone, they don't exist anymore. Drives all
>> speak in LBAs now. *shrug* The GPT typically involves LBAs 0, 1 and 2
>> at least, more if there are more than 4 partitions.
>
> Shorthand for "before partition 1".

Unreliable. By convention, most tools used to start it at LBA 63,
which *was* based on CHS, but that's the Pleistocene (again). It's
been many years, maybe nearing a decade, since a tool would default to
that. First, 62 sectors isn't big enough to embed a bootloader these
days. Second, it's not 4096-byte aligned for 4K-sector drives, which
pretty much every hard drive now is, though some higher-end SCSI/SAS
drives still come with the option of 512-byte physical sectors - and
those are quickly vanishing. Macs typically start the first partition
at LBA 40, and on Windows and Linux these days it's usually LBA 2048
(a 1 MiB gap to the first partition).
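
E.g. 2048 * 512 = 1048576 bytes = 1 MiB, which is also a multiple of
4096, so it's aligned either way. parted can confirm it; something
like this should work (device name is just an example):

# parted /dev/sde align-check optimal 1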



>
>> I don't recognize the above stuff, so I'm not sure what it is. I'd
>> usually expect it to be zeros if it's not a boot drive.
>
> It was used as a raw disk in an encrypted RAID before.

OK



>> It is common. I prefer gdisk, which has a nomenclature similar to
>> fdisk. The nomenclature of parted is confusing.
>
> I think somewhere in learning parted and repartitioning all the disks,
> I managed to type /dev/sdX1 instead of /dev/sdX when creating the
> partitions.

Bingo. That would do it.

The thing to get in the habit of when retasking anything, be it a
drive, a partition, or logical volume:

1. Tear down with wipefs -a, from the most recent structure created
(the file system) back to the first; or
2. Full disk encryption. If you merely luksFormat, you've effectively
obliterated all the previous signatures, so there's no need for a tear
down; or
3. hdparm ATA secure erase; or
4. write zeros with something like badblocks.
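
Roughly, matching the options above - an untested sketch where
/dev/sdX is a placeholder and "pw" is a throwaway password:

# wipefs -a /dev/sdX1    # signatures on the partition first...
# wipefs -a /dev/sdX     # ...then the partition table itself
# cryptsetup luksFormat /dev/sdX
# hdparm --user-master u --security-set-pass pw /dev/sdX
# hdparm --user-master u --security-erase pw /dev/sdX
# badblocks -w -t 0 /dev/sdX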


-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: bootsect replicated in p1, RAID enclosure suggestions?
  2016-08-26  3:21         ` Chris Murphy
@ 2016-08-26  3:58           ` travis
  2016-08-26  4:06             ` travis+ml-linux-raid
  0 siblings, 1 reply; 17+ messages in thread
From: travis @ 2016-08-26  3:58 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Linux-RAID

On Thu, Aug 25, 2016 at 09:21:30PM -0600, Chris Murphy wrote:
> On Thu, Aug 25, 2016 at 8:50 PM,  <travis@subspacefield.org> wrote:
> 
> >
> >> So I'd say it's something else.
> >
> > Do you have any idea what that could be?
> 
> User error, you even suspect it yourself later...

Yeah, *when I created this disk layout* I might have created a GPT in
partition 1.

That was probably a year or more ago.

That has nothing to do with this crash, which is perhaps the fourth of
its kind.

I hadn't touched the box in the weeks before this happened; I was away
on vacation.  Although it could have lurked for some time and only
been uncovered by a crash or kpanic.

> > Shorthand for "before partition 1".
> 
> Unreliable. By convention, most tools used to start it at LBA 63,
> which *was* based on CHS, but that's the Pleistocene (again). It's
> been many years, maybe nearing a decade, since a tool would default to
> that. First, 62 sectors isn't big enough to embed a bootloader these
> days. Second, it's not 4096-byte aligned for 4K-sector drives, which
> pretty much every hard drive now is, though some higher-end SCSI/SAS
> drives still come with the option of 512-byte physical sectors - and
> those are quickly vanishing. Macs typically start the first partition
> at LBA 40, and on Windows and Linux these days it's usually LBA 2048
> (a 1 MiB gap to the first partition).

Relax, it's just to save time.  I'm aware we don't use CHS addressing
any more.  I'm also aware that Kleenex is a brand name, even though I
use it to mean tissue.  GNU/Linux, not Linux.  I get it.  Let's move on.
-- 
http://www.subspacefield.org/~travis/ | if spammer then john@subspacefield.org
"Computer crime, the glamor crime of the 1970s, will become in the
1980s one of the greatest sources of preventable business loss."
John M. Carroll, "Computer Security", first edition cover flap, 1977

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: bootsect replicated in p1, RAID enclosure suggestions?
  2016-08-26  3:58           ` travis
@ 2016-08-26  4:06             ` travis+ml-linux-raid
  2016-08-26  4:25               ` Chris Murphy
  0 siblings, 1 reply; 17+ messages in thread
From: travis+ml-linux-raid @ 2016-08-26  4:06 UTC (permalink / raw)
  To: Chris Murphy, Linux-RAID

On Thu, Aug 25, 2016 at 08:58:50PM -0700, travis@subspacefield.org wrote:
> Yeah, *when I created this disk layout* I might have created a GPT in
> partition 1.
> 
> That was probably a year or more ago.
> 
> That has nothing to do with this crash, which is perhaps the fourth of
> its kind.
> 
> I hadn't touched the box in the weeks before this happened; I was
> away on vacation.  Although it could have lurked for some time and
> only been uncovered by a crash or kpanic.

I certainly have not repartitioned the disks multiple times.  There's
no need for that.  It leads to these sorts of problems.

The kernel does panic from time to time.

To repeat, this box was *completely unattended* for several weeks
before the crash.  No administration at all.  I simply rsync'd things
off it as necessary.  As a non-root user.

I am curious about the fact that /dev/sdd1 and /dev/sde1 were listed
together in this lvm cache, and those are the two disks that normally
get blasted every 6 months or so.  That's an odd coincidence, and my
best lead yet.
-- 
http://www.subspacefield.org/~travis/ | if spammer then john@subspacefield.org
"Computer crime, the glamor crime of the 1970s, will become in the
1980s one of the greatest sources of preventable business loss."
John M. Carroll, "Computer Security", first edition cover flap, 1977

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: bootsect replicated in p1, RAID enclosure suggestions?
  2016-08-26  4:06             ` travis+ml-linux-raid
@ 2016-08-26  4:25               ` Chris Murphy
  2016-09-02  2:18                 ` travis+ml-linux-raid
  0 siblings, 1 reply; 17+ messages in thread
From: Chris Murphy @ 2016-08-26  4:25 UTC (permalink / raw)
  To: Chris Murphy, Linux-RAID

On Thu, Aug 25, 2016 at 10:06 PM,
<travis+ml-linux-raid@subspacefield.org> wrote:
> On Thu, Aug 25, 2016 at 08:58:50PM -0700, travis@subspacefield.org wrote:
>> Yeah, *when I created this disk layout* I might have created a GPT in
>> partition 1.
>>
>> That was probably a year or more ago.
>>
>> That has nothing to do with this crash, which is perhaps the fourth of
>> its kind.
>>
>> I hadn't touched the box in the weeks before this happened; I was
>> away on vacation.  Although it could have lurked for some time and
>> only been uncovered by a crash or kpanic.
>
> I certainly have not repartitioned the disks multiple times.  There's
> no need for that.  It leads to these sorts of problems.
>
> The kernel does panic from time to time.
>
> To repeat, this box was *completely unattended* for several weeks
> before the crash.  No administration at all.  I simply rsync'd things
> off it as necessary.  As a non-root user.
>
> I am curious about the fact that /dev/sdd1 and /dev/sde1 were listed
> together in this lvm cache, and those are the two disks that normally
> get blasted every 6 months or so.  That's an odd coincidence, and my
> best lead yet.

Well that file does seem stale, because those partitions aren't
actually part of LVM. They're members of an mdadm array. I don't know
where LVM comes into this because we don't have the complete layout.



-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: bootsect replicated in p1, RAID enclosure suggestions?
  2016-08-23  5:09 bootsect replicated in p1, RAID enclosure suggestions? travis+ml-linux-raid
  2016-08-24  2:14 ` travis+ml-linux-raid
  2016-08-24 17:15 ` Chris Murphy
@ 2016-09-01 17:22 ` Wols Lists
  2016-09-01 23:10   ` Chris Murphy
  2 siblings, 1 reply; 17+ messages in thread
From: Wols Lists @ 2016-09-01 17:22 UTC (permalink / raw)
  To: linux-raid

On 23/08/16 06:09, travis+ml-linux-raid@subspacefield.org wrote:
> Hello all,
> 
> So I have an Intel NUC (for low power Linux) plugged via USB into a 4
> bay enclosure doing linear (yeah I know; it's the backup server, the
> primary is raid10).
> 
> And every once in a while, this happens (*see end).  The partition 1
> that would normally contain a MD slice ends up being a replica of the
> boot cylinder.  I can't tell if it's the mdraid linear impl, the
> kernel doing something weird, the USB drivers, the enclosure firmware,
> or what.

Interesting snippet from LWN ...

The Btrfs CRC checking means that a read from a corrupted sector will
cause an I/O error rather than return garbage. Facebook had some storage
devices that would appear to store data correctly in a set of logical
block addresses (LBAs) until the next reboot, at which point reads to
those blocks would return GUID partition table (GPT) data instead. He
did not name the device maker because it turned out to actually be a
BIOS problem. In any case, the CRCs allowed the Facebook team to quickly
figure out that the problem was not in Btrfs when it affected thousands
of machines as they were rebooted for a kernel upgrade.

The article itself is

https://lwn.net/Articles/698090/

Cheers,
Wol

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: bootsect replicated in p1, RAID enclosure suggestions?
  2016-09-01 17:22 ` Wols Lists
@ 2016-09-01 23:10   ` Chris Murphy
  0 siblings, 0 replies; 17+ messages in thread
From: Chris Murphy @ 2016-09-01 23:10 UTC (permalink / raw)
  To: Wols Lists; +Cc: Linux-RAID

On Thu, Sep 1, 2016 at 11:22 AM, Wols Lists <antlists@youngman.org.uk> wrote:
> On 23/08/16 06:09, travis+ml-linux-raid@subspacefield.org wrote:
>> Hello all,
>>
>> So I have an Intel NUC (for low power Linux) plugged via USB into a 4
>> bay enclosure doing linear (yeah I know; it's the backup server, the
>> primary is raid10).
>>
>> And every once in a while, this happens (*see end).  The partition 1
>> that would normally contain a MD slice ends up being a replica of the
>> boot cylinder.  I can't tell if it's the mdraid linear impl, the
>> kernel doing something weird, the USB drivers, the enclosure firmware,
>> or what.
>
> Interesting snippet from LWN ...
>
> The Btrfs CRC checking means that a read from a corrupted sector will
> cause an I/O error rather than return garbage. Facebook had some storage
> devices that would appear to store data correctly in a set of logical
> block addresses (LBAs) until the next reboot, at which point reads to
> those blocks would return GUID partition table (GPT) data instead.

Wow, that's right in between bizarre and hilarious. Maybe Travis should
check for firmware updates (for the computer, the enclosure if it
offers such a thing, and maybe even the drives).


> He
> did not name the device maker because it turned out to actually be a
> BIOS problem. In any case, the CRCs allowed the Facebook team to quickly
> figure out that the problem was not in Btrfs when it affected thousands
> of machines as they were rebooted for a kernel upgrade.

Yeah, even in a recent case on linux-btrfs where two drives with bad
sectors were causing grief, the volume (somewhat surprisingly) mounted
ro,degraded and appears to be mostly recoverable. But the main thing
is that even in that case, other than nocow files, anything that
copies over (cp, rsync, btrfs send) is expected not to be corrupt. If
the data were corrupt even after reconstruction from parity (even bad
parity), Btrfs would give an I/O error rather than submit the data to
user space.


-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: bootsect replicated in p1, RAID enclosure suggestions?
  2016-08-26  4:25               ` Chris Murphy
@ 2016-09-02  2:18                 ` travis+ml-linux-raid
  0 siblings, 0 replies; 17+ messages in thread
From: travis+ml-linux-raid @ 2016-09-02  2:18 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Linux-RAID

On Thu, Aug 25, 2016 at 10:25:35PM -0600, Chris Murphy wrote:
> Well that file does seem stale, because those partitions aren't
> actually part of LVM. They're members of an mdadm array. I don't know
> where LVM comes into this because we don't have the complete layout.

md127 = /dev/sd{b,c,d,e}1
LUKS on that
PV/VG/LV on that.

/dev/sda5 is also a LUKS partition with LVM on it for root.
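
So bringing the backup stack up by hand goes roughly like this (the
mapper name is just a placeholder):

# mdadm --assemble /dev/md127 /dev/sd[bcde]1
# cryptsetup luksOpen /dev/md127 md127_crypt
# vgchange -ay V_hostname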

I wonder if it's possible that whatever restored the GPT also restored
an LVM header, and LVM's scan somehow picked it up?

Anyway, after taking bitwise backups of the disks, I did a create with
--assume-clean and --level=raid0, and the thing seems fine.
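
That is, something along these lines, from memory - with the caveat
that the metadata version, chunk size, and device order all have to
match the originals exactly:

# mdadm --create /dev/md127 --assume-clean --level=raid0 \
      --raid-devices=4 /dev/sd[bcde]1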

# fsck /dev/V_hostname/L_bu
fsck from util-linux 2.20.1
e2fsck 1.42 (29-Nov-2011)
/dev/mapper/V_hostname-L_bu contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 3A: Optimizing directories
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Free blocks count wrong for group #67095 (65535, counted=0).
Fix<y>? yes

Free blocks count wrong (496207637, counted=496207638).
Fix<y>? yes

And that was pretty much it.
-- 
http://www.subspacefield.org/~travis/ | if spammer then john@subspacefield.org
"Computer crime, the glamor crime of the 1970s, will become in the
1980s one of the greatest sources of preventable business loss."
John M. Carroll, "Computer Security", first edition cover flap, 1977

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread

Thread overview: 17+ messages
2016-08-23  5:09 bootsect replicated in p1, RAID enclosure suggestions? travis+ml-linux-raid
2016-08-24  2:14 ` travis+ml-linux-raid
2016-08-24 17:15 ` Chris Murphy
2016-08-25  6:25   ` travis+ml-linux-raid
2016-08-25 21:06     ` Wols Lists
2016-08-25 22:32     ` Chris Murphy
2016-08-26  2:33       ` Phil Turmel
2016-08-26  2:48         ` Chris Murphy
2016-08-26  3:11           ` travis+ml-linux-raid
2016-08-26  2:50       ` travis
2016-08-26  3:21         ` Chris Murphy
2016-08-26  3:58           ` travis
2016-08-26  4:06             ` travis+ml-linux-raid
2016-08-26  4:25               ` Chris Murphy
2016-09-02  2:18                 ` travis+ml-linux-raid
2016-09-01 17:22 ` Wols Lists
2016-09-01 23:10   ` Chris Murphy
