* Issue with bad file system
@ 2012-11-19  4:46 Drew Reusser
  2012-11-19 15:18 ` Eric Sandeen
  0 siblings, 1 reply; 17+ messages in thread
From: Drew Reusser @ 2012-11-19  4:46 UTC (permalink / raw)
  To: linux-ext4

Hello all,

I am having an issue with an ext4 file system which I cannot get to
come back up.  There were no changes that I know of, but I rebooted,
and when the system came back up it was not able to read from the hard
drives.  I have it set up in a RAID 10 configuration, and can mount the
RAID array with no issues.  However, when I try to mount the file
system, I get the following error in the log files: "EXT4-fs (md0):
VFS: Can't find ext4 filesystem".

Is there anyone who can help?  I have to get this data back, and I
would prefer not to have to go to a professional and spend the money
to recover it.

-Drew

Basic info:  4x1TB disks in RAID 10, 2TB total.  Each of the four disks
has just one giant partition starting at the 2MB mark so I could boot,
put mdadm in the boot sector, and get into Linux.

Some basic commands below:


mint mnt # mount -t ext4 /dev/md0 /mnt/raid
mount: wrong fs type, bad option, bad superblock on /dev/md0,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail  or so




sdc is the pen drive I am currently booting from, and sdf is the 2TB
disk I was using to back up to.

mint dev # cat /proc/partitions
major minor  #blocks  name

   7        0     939820 loop0
   8        0  976762584 sda
   8        1  976760832 sda1
   8       16  976762584 sdb
   8       17  976237568 sdb1
   8       32    1985024 sdc
   8       33    1984960 sdc1
   8       48  976762584 sdd
   8       49  976760832 sdd1
   8       64  976762584 sde
   8       65  976237568 sde1
  11        0    1048575 sr0
   8       80 1953514584 sdf
   8       81 1953512448 sdf1

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Issue with bad file system
  2012-11-19  4:46 Issue with bad file system Drew Reusser
@ 2012-11-19 15:18 ` Eric Sandeen
  0 siblings, 0 replies; 17+ messages in thread
From: Eric Sandeen @ 2012-11-19 15:18 UTC (permalink / raw)
  To: Drew Reusser; +Cc: linux-ext4

On 11/18/12 10:46 PM, Drew Reusser wrote:
> Hello all,
> 
> I am having an issue with an ext4 file system which I cannot get to
> come back up.  There were no changes that I know of, but I rebooted,
> and when the system came back up it was not able to read from the hard
> drives.  I have it set up in a RAID 10 configuration, and can mount the
> RAID array with no issues.  However, when I try to mount the file
> system, I get the following error in the log files: "EXT4-fs (md0):
> VFS: Can't find ext4 filesystem".
> 
> Is there anyone who can help?  I have to get this data back, and I
> would prefer not to have to go to a professional and spend the money
> to recover it.
> 
> -Drew
> 
> Basic info:  4x1TB disks in RAID 10, 2TB total.  Each of the four disks
> has just one giant partition starting at the 2MB mark so I could boot,
> put mdadm in the boot sector, and get into Linux.
> 
> Some basic commands below:
> 
> 
> mint mnt # mount -t ext4 /dev/md0 /mnt/raid
> mount: wrong fs type, bad option, bad superblock on /dev/md0,
>        missing codepage or helper program, or other error
>        In some cases useful info is found in syslog - try
>        dmesg | tail  or so
> 

Ok, at this point you need to do what it says:

# dmesg | tail

so we can see for ourselves why the kernel rejected it.  (Is there
anything other than "EXT4-fs (md0): VFS: Can't find ext4
filesystem"?)

Does /proc/mdstat look right?  This is more likely an md config
issue than an ext4 problem.  What does file -s /dev/md0 or
blkid /dev/md0 say?
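
For reference, those read-only checks gathered in one place (a sketch;
it assumes the array is assembled as /dev/md0, as in your report):

dmesg | tail -n 20      # the kernel's reason for rejecting the mount
cat /proc/mdstat        # is the array assembled, with all four members?
file -s /dev/md0        # what does the start of the device look like?
blkid /dev/md0          # does blkid see any filesystem signature?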

-Eric

> 
> 
> sdc is the pen drive I am currently booting from, and sdf is the 2TB
> disk I was using to back up to.
> 
> mint dev # cat /proc/partitions
> major minor  #blocks  name
> 
>    7        0     939820 loop0
>    8        0  976762584 sda
>    8        1  976760832 sda1
>    8       16  976762584 sdb
>    8       17  976237568 sdb1
>    8       32    1985024 sdc
>    8       33    1984960 sdc1
>    8       48  976762584 sdd
>    8       49  976760832 sdd1
>    8       64  976762584 sde
>    8       65  976237568 sde1
>   11        0    1048575 sr0
>    8       80 1953514584 sdf
>    8       81 1953512448 sdf1
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Issue with bad file system
  2012-11-19 21:15           ` George Spelvin
@ 2012-11-19 21:30             ` Eric Sandeen
  0 siblings, 0 replies; 17+ messages in thread
From: Eric Sandeen @ 2012-11-19 21:30 UTC (permalink / raw)
  To: George Spelvin; +Cc: dreusser, linux-ext4

On 11/19/12 3:15 PM, George Spelvin wrote:
>> There is no encryption to my knowledge (not an expert in mdadm).
> 
> linux md doesn't do encryption.  But there are other things, like
> dm-crypt, that it can be combined with to do encryption.
> 
> That doesn't look like a superblock, that looks like random bits,
> as produced by encryption or good compression.  (Could be jpeg,
> mp3, or compressed video, it's hard to tell.)

does blkid /dev/md0 and/or file -s /dev/md0 think it looks like anything other than data?

-Eric


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Issue with bad file system
  2012-11-19 19:54         ` Drew Reusser
@ 2012-11-19 21:15           ` George Spelvin
  2012-11-19 21:30             ` Eric Sandeen
  0 siblings, 1 reply; 17+ messages in thread
From: George Spelvin @ 2012-11-19 21:15 UTC (permalink / raw)
  To: dreusser, sandeen; +Cc: linux-ext4, linux

> There is no encryption to my knowledge (not an expert in mdadm).

linux md doesn't do encryption.  But there are other things, like
dm-crypt, that it can be combined with to do encryption.

That doesn't look like a superblock, that looks like random bits,
as produced by encryption or good compression.  (Could be jpeg,
mp3, or compressed video, it's hard to tell.)

*And* the first few backup superblocks appear to be trashed
as well.

This means either at least a gigabyte of data got overwritten, or some
kind of underlying transformation got changed, so the superblocks
aren't visible in the right places any more.

The obvious possibilities are:
1) Data scrambling, such as from encryption (a quick check for this
   is sketched below).
2) Address scrambling, such as by moving the components in a RAID
   around or assembling with a different stripe size.
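
A quick, read-only check for possibility 1 (a sketch only; note that
plain dm-crypt writes no on-disk header, so this can only rule LUKS
in or out, not encryption in general):

blkid -p /dev/sd[abde]1   # a LUKS container would show TYPE="crypto_LUKS"
cryptsetup isLuks /dev/md0 && echo "LUKS header found" || echo "no LUKS header"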

If you can find a sector anywhere that looks like a superblock, as
described at
https://ext4.wiki.kernel.org/index.php/Ext4_Disk_Layout#The_Super_Block

we can try to figure out what's going on.

They're supposed to be located at blocks 32768 * 3**k, 32768 * 5**k,
and 32768 * 7**k, for k = 0, 1, 2, ....  That is, 32768 times 1, 3, 9,
27, 81, 243, ...; 32768 times 1, 5, 25, 125, 625, ...; and 32768 times 1,
7, 49, 343, ...

(Those are 4K blocks; multiply by 8 for 512-byte sector numbers.)

While the primary superblock contains file system statistics and flags,
the backups are generally never written after file system creation time,
so it's *really* hard to understand how they could have been overwritten
by any sort of normal file system activity.

Let's see... you have 1952211968K in your RAID, meaning 488052992 4K
blocks, and 14895 block groups.

The *last* backup superblocks should be at 3^8 = 6561, 5^5 = 3125,
and 7^4 = 2401.

Try the following:
dumpe2fs -h -o superblock=$((32768*3**8)) /dev/md0
dumpe2fs -h -o superblock=$((32768*5**5)) /dev/md0
dumpe2fs -h -o superblock=$((32768*7**4)) /dev/md0
dumpe2fs -h -o superblock=$((32768*3**7)) /dev/md0

If none of those work, either something overwrote *most* of your drive,
or something has happened to give that illusion.
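
The whole 3/5/7 series can also be walked with a small read-only loop
rather than one offset at a time (a sketch, assuming the 4K block size
and the 488052992-block count computed above):

for p in 3 5 7; do
    b=32768
    while [ "$b" -lt 488052992 ]; do
        echo "== trying backup superblock at 4K block $b =="
        dumpe2fs -h -o superblock=$b -o blocksize=4096 /dev/md0 2>&1 | head -n 3
        b=$((b * p))
    done
done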

As Ted says, I'd research option 2 very carefully.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Issue with bad file system
  2012-11-19 19:53           ` Drew Reusser
@ 2012-11-19 20:24             ` Theodore Ts'o
  0 siblings, 0 replies; 17+ messages in thread
From: Theodore Ts'o @ 2012-11-19 20:24 UTC (permalink / raw)
  To: Drew Reusser; +Cc: Eric Sandeen, George Spelvin, linux-ext4

On Mon, Nov 19, 2012 at 07:53:13PM +0000, Drew Reusser wrote:
> 
> mint mnt # debugfs -s 32768 -b 4096 /dev/md0
> debugfs 1.42.5 (29-Jul-2012)
> /dev/md0: Bad magic number in super-block while opening filesystem
> debugfs:
> debugfs:  ls
> ls: Filesystem not open
> debugfs:

Sorry, I didn't realize that you weren't able to find *any* backup
super blocks.  In that case, either you got really unlucky and you had
some failure which has taken out a very large number of blocks, spread
out across the file system --- or I'd have to agree with Eric, I'd
want to be really, really sure that the Raid array hadn't gotten
assembled incorrectly somehow....

							- Ted

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Issue with bad file system
  2012-11-19 17:14       ` Eric Sandeen
  2012-11-19 18:41         ` Theodore Ts'o
@ 2012-11-19 19:54         ` Drew Reusser
  2012-11-19 21:15           ` George Spelvin
  1 sibling, 1 reply; 17+ messages in thread
From: Drew Reusser @ 2012-11-19 19:54 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: George Spelvin, linux-ext4

On Mon, Nov 19, 2012 at 5:14 PM, Eric Sandeen <sandeen@redhat.com> wrote:
> On 11/19/12 10:57 AM, Drew Reusser wrote:
>> On Mon, Nov 19, 2012 at 8:32 AM, George Spelvin <linux@horizon.com> wrote:
>
> ...
>
>
>>> Another thing that would be useful is "dd if=/dev/md0 skip=2 count=2 | xxd"
>>> (or od -x if you don't have xxd).  That will give a hex dump of the
>>> primary superblock, which might show the extent of the damage.
>>>
>>>
>>
>>
>> mint mnt # dd if=/dev/md0 skip=2 count=2 | xxd
>> 0000000: 3a3a 6b4e fd9b a66c 6467 f7a6 9fb0 9183  ::kN...ldg......
>> 2+0 records in
>> 2+0 records out
>
> I'd actually start looking at block 0 too.  Can you do that again,
> and drop the "skip"?
>
> as an aside - any chance there was supposed to be an encryption
> layer in there somewhere?  The below looks pretty random.
>
> -Eric
>
>> 0000010: 3fc6 11aa 0cec 0441 6bc8 6b55 ff4d f2ba  ?......Ak.kU.M..
>> 0000020: a475 15d9 ea5d 6833 5608 df95 60cd 4f76  .u...]h3V...`.Ov
>> 0000030: 5490 5c22 e564 c91b 1913 a519 807a 8986  T.\".d.......z..
>> 0000040: f3ba 3c8a 6c11 f78a bc94 5947 b55d d83f  ..<.l.....YG.].?
>> 0000050: 2eac 0f3c ab20 d88d 1820 d5bc 0f97 aaf6  ...<. ... ......
>> 0000060: 81f5 63cc 6eff 8e2e 54ad 50fe 291f 17f8  ..c.n...T.P.)...
>> 0000070: be01 67ad b9ec 49f7 fa60 953b 7348 9730  ..g...I..`.;sH.0
>> 0000080: 6105 8f4d da9c 3ef3 e90e c190 6471 a766  a..M..>.....dq.f
>> 0000090: c90a 77f6 1196 3b74 9121 e89c 19bd 29f4  ..w...;t.!....).
>> 00000a0: 808c 3342 16e7 0c80 e8c3 3a6a 5560 78eb  ..3B......:jU`x.
>> 00000b0: c0d9 c6d5 b386 a3a9 5275 7f5a f572 218b  ........Ru.Z.r!.
>> 00000c0: 63d5 28f8 71aa 4f6f e716 060a 4a50 70cb  c.(.q.Oo....JPp.
>> 00000d0: a740 8c5f e2df 2e65 11cd a88f c4ed c9bb  .@._...e........
>> 00000e0: d444 2da5 7e5d 7bb4 38f6 5fd0 60a8 d2cf  .D-.~]{.8._.`...
>> 00000f0: 813f 9afd 26b8 8cc0 6e6a 59a2 e1f6 32ce  .?..&...njY...2.
>> 0000100: 8c01 5928 5661 9687 fb9c b07d b412 4a57  ..Y(Va.....}..JW
>> 0000110: 2626 7099 c350 a893 3f76 1953 c34b 7ddf  &&p..P..?v.S.K}.
>> 0000120: d73d 5e7f 9d3c f4fe dac9 e2d7 8eaf da94  .=^..<..........
>> 0000130: 86fb cbfe 7866 45fa 72c9 a687 ea83 3b71  ....xfE.r.....;q
>> 0000140: 80cc 3320 59c3 c653 977d b6e3 79c4 5c3b  ..3 Y..S.}..y.\;
>> 0000150: b36b c978 6824 d4b9 9252 04dd 39de 4c05  .k.xh$...R..9.L.
>> 0000160: d331 951b 3dff 8974 4ac1 1950 f6bd 5586  .1..=..tJ..P..U.
>> 0000170: 1095 f62c e7b3 d9d6 fbee 8cfb 5bd0 959e  ...,........[...
>> 0000180: f6ae f551 1b41 a8cf ae40 435a ff05 c38e  ...Q.A...@CZ....
>> 0000190: 903d 2258 689e b64c f49d b7b3 593d f55c  .="Xh..L....Y=.\
>> 00001a0: fd4f f395 4ecd 6653 e46d 66b6 a046 a9cb  .O..N.fS.mf..F..
>> 00001b0: ee90 58d8 e876 5f63 5014 cbfe b0e1 cd53  ..X..v_cP......S
>> 00001c0: 12ac 43c5 2996 4a15 014e 3c7d a5c2 5842  ..C.).J..N<}..XB
>> 00001d0: 2e00 f2ea 3e12 fbe6 1403 e240 0e01 80d0  ....>......@....
>> 00001e0: 1a03 c0c0 37fb 3cae df51 1818 987f e03b  ....7.<..Q.....;
>> 00001f0: 718f b339 63e7 8495 56b4 045a 0091 7382  q..9c...V..Z..s.
>> 0000200: 9828 187e 5eab e288 a4c6 fd95 3950 f868  .(.~^.......9P.h
>> 0000210: 3ee5 5fe0 4943 8e1d 2315 41a7 0989 c97c  >._.IC..#.A....|
>> 0000220: 992a c47b 9ccd e08a cfea 4603 2f51 e5e3  .*.{......F./Q..
>> 0000230: 04c3 3224 bfaa f0fa 79ae 13db d774 8c87  ..2$....y....t..
>> 0000240: fad8 93b1 ddc6 ce8c 90f3 e754 c6a4 ece3  ...........T....
>> 0000250: 13ab 59e8 b5cc f5d1 c9ab 297f ba63 84a7  ..Y.......)..c..
>> 0000260: c8ed bff6 9e55 a191 cfef c79e 6cd0 8ccd  .....U......l...
>> 0000270: 83b1 de5a 3c26 1f81 e1fa b2ed 503a 2445  ...Z<&......P:$E
>> 0000280: 7212 2b2e 1242 18fb cac7 c3e5 73de fb2a  r.+..B......s..*
>> 0000290: c0ed 2318 01ba 1f04 22f1 fb3d f356 c0a0  ..#....."..=.V..
>> 00002a0: 258f 0184 6653 7814 e3ff b4ab 3276 7b9d  %...fSx.....2v{.
>> 00002b0: 6c68 0b00 8024 68f0 47e6 2aad e447 674b  lh...$h.G.*..GgK
>> 00002c0: 1a0c 0f85 e11d 5275 6d58 9940 e738 a3a8  ......RumX.@.8..
>> 00002d0: 490f cdbc e710 9099 9dbd a688 00d8 a530  I..............0
>> 00002e0: 843c 6665 a912 abdb 6c95 9e96 70dc f409  .<fe....l...p...
>> 00002f0: f27a 3c12 15b0 c168 29a7 c190 f9ac 90c0  .z<....h).......
>> 0000300: cd58 20ff 0461 ddcf 6617 9764 c352 a0de  .X ..a..f..d.R..
>> 0000310: 3818 e5e0 a168 49aa 8b98 2e6d 92f9 b575  8....hI....m...u
>> 0000320: d7dd 4651 1c54 c5e9 f96a d0f0 14c0 240a  ..FQ.T...j....$.
>> 0000330: 3193 00e1 f895 6aba 0780 37c1 0f3a 3b3e  1.....j...7..:;>
>> 0000340: a8bb c25f 8148 0140 1825 7814 0a68 8e5e  ..._.H.@.%x..h.^
>> 0000350: 237b db47 9e5f 573c acbd 5d54 a1ae ce9c  #{.G._W<..]T....
>> 0000360: b498 c8cd e0dd 4c34 ee8a bf32 b7cf 0cca  ......L4...2....
>> 0000370: ba69 e0a7 e9ef 09c6 7c20 5007 7662 9c36  .i......| P.vb.6
>> 0000380: f053 fda0 41f6 e560 1f2b ffbc 5344 407c  .S..A..`.+..SD@|
>> 0000390: 9801 ee74 a1ef 236f 6b6c 50c0 2acf 8ebf  ...t..#oklP.*...
>> 00003a0: f6ec 5049 7633 d215 1b35 af46 44e0 a7db  ..PIv3...5.FD...
>> 00003b0: fd01 43ef c03d 2d44 7c21 d12c 75a3 f1f2  ..C..=-D|!.,u...
>> 00003c0: a459 e196 ce0f 6de0 19f1 d086 2504 2f09  .Y....m.....%./.
>> 00003d0: 6abe 5f04 4277 86c9 6b20 a054 bf82 0b5d  j._.Bw..k .T...]
>> 00003e0: 3525 32b4 051a af5e 34af 1b29 0083 4987  5%2....^4..)..I.
>> 00003f0: c071 ab38 2567 0dff 54bc 2f8e 130e 33e2  .q.8%g..T./...3.
>>
>>



There is no encryption to my knowledge (not an expert in mdadm).

mint mnt # dd if=/dev/md0 count=2 | xxd
0000000: 00a0 a303 000c 8e0e 664d ba00 adcb 520e  ........fM....R.
2+0 records in
2+0 records out
0000010: f59f a303 0000 0000 0200 0000 0200 0000  ................
0000020: 0080 0000 0080 0000 0020 0000 0000 0000  ......... ......
0000030: a635 a450 0000 ffff 53ef 0200 0100 0000  .5.P....S.......
0000040: e32f 4750 0000 0000 0000 0000 0100 0000  ./GP............
0000050: 0000 0000 0b00 0000 0001 0100 3800 0000  ............8...
0000060: 4202 0000 7b00 0000 9308 d4b5 f9dc 4a4d  B...{.........JM
0000070: 8e5a 39e0 7291 333c 0000 0000 0000 0000  .Z9.r.3<........
0000080: 0000 0000 0000 0000 0000 0000 0000 0000  ................
1024 bytes (1.0 kB) copied, 0.0164673 s, 62.2 kB/s
0000090: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000a0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000b0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000c0: 0000 0000 0000 0000 0000 0000 0000 c503  ................
00000d0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00000e0: 0000 0000 0000 0000 0000 0000 a9f4 8585  ................
00000f0: 4956 4fe9 8e63 326a 07bc bafb 0101 0000  IVO..c2j........
0000100: 0c00 0000 0000 0000 e32f 4750 0af3 0200  ........./GP....
0000110: 0400 0000 0000 0000 0000 0000 ff7f 0000  ................
0000120: 0080 4007 ff7f 0000 0100 0000 ffff 4007  ..@...........@.
0000130: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000140: 0000 0000 0000 0000 0000 0000 0000 0008  ................
0000150: 0000 0000 0000 0000 0000 0000 1c00 1c00  ................
0000160: 0100 0000 0000 0000 0000 0000 0000 0000  ................
0000170: 0000 0000 0400 0000 8817 0200 0000 0000  ................
0000180: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000190: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00001a0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00001b0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00001c0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00001d0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00001e0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00001f0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000200: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000210: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000220: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000230: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000240: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000250: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000260: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000270: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000280: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000290: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00002a0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00002b0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00002c0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00002d0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00002e0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00002f0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000300: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000310: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000320: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000330: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000340: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000350: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000360: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000370: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000380: 0000 0000 0000 0000 0000 0000 0000 0000  ................
0000390: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00003a0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00003b0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00003c0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00003d0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00003e0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
00003f0: 0000 0000 0000 0000 0000 0000 0000 0000  ................

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Issue with bad file system
  2012-11-19 18:41         ` Theodore Ts'o
  2012-11-19 19:15           ` George Spelvin
@ 2012-11-19 19:53           ` Drew Reusser
  2012-11-19 20:24             ` Theodore Ts'o
  1 sibling, 1 reply; 17+ messages in thread
From: Drew Reusser @ 2012-11-19 19:53 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: Eric Sandeen, George Spelvin, linux-ext4

On Mon, Nov 19, 2012 at 6:41 PM, Theodore Ts'o <tytso@mit.edu> wrote:
> One of the things you could do to verify that the RAID array is in
> fact sane is to run the following command:
>
> debugfs -s 32768 -b 4096 /dev/md0
>
> Then you can examine the file system via the debugfs commands "cd",
> "ls", "cat", and "dump" (or even "rdump", although that's more for
> involved recovery operations).  I would suggest looking at a number of
> directories to make sure they look as you expect, and trying to dump
> out a few files to make sure they are uncorrupted.
>
> If the majority of the files you look at look sane, then it should be
> safe to let e2fsck recover the file system from the backup superblock.
>
> In the future, we'll be able to use the metadata checksum feature to
> automate this process (as well as being able to more gracefully and
> automatically handle inode table blocks written to the wrong location
> on disk, overwriting other inode table blocks) --- but a bit more
> testing is needed before I'd recommend it for regular users.  (In
> particular, I want to make sure that random journal corruptions are
> handled correctly when the metadata checksum feature is enabled ---
> before we start having more enthusiastic users try out bleeding edge
> features on production file systems....)
>
> Regards,
>
>                                                 - Ted

Sorry Ted, but that is not working.

mint mnt # debugfs -s 32768 -b 4096 /dev/md0
debugfs 1.42.5 (29-Jul-2012)
/dev/md0: Bad magic number in super-block while opening filesystem
debugfs:
debugfs:  ls
ls: Filesystem not open
debugfs:

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Issue with bad file system
  2012-11-19 19:15           ` George Spelvin
@ 2012-11-19 19:36             ` Theodore Ts'o
  0 siblings, 0 replies; 17+ messages in thread
From: Theodore Ts'o @ 2012-11-19 19:36 UTC (permalink / raw)
  To: George Spelvin; +Cc: sandeen, dreusser, linux-ext4

On Mon, Nov 19, 2012 at 02:15:05PM -0500, George Spelvin wrote:
> 
> Oh, more enthusiastic users are *already* trying out metadata_csum
> on production file systems.  And finding lots of bugs doing so!
> 
> My latest is a file system where e2fsck can find the problem, but
> can't fix it.

Is this a file system suffering from the aftermath of the off-line
resizing with the 64-bit file system feature?  Or is this some other
unrelated problem?

Thanks,

						- Ted



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Issue with bad file system
  2012-11-19 18:41         ` Theodore Ts'o
@ 2012-11-19 19:15           ` George Spelvin
  2012-11-19 19:36             ` Theodore Ts'o
  2012-11-19 19:53           ` Drew Reusser
  1 sibling, 1 reply; 17+ messages in thread
From: George Spelvin @ 2012-11-19 19:15 UTC (permalink / raw)
  To: sandeen, tytso; +Cc: dreusser, linux-ext4, linux

> (In particular, I want to make sure that random journal corruptions are
> handled correctly when the metadata checksum feature is enabled ---
> before we start having more enthusiastic users try out bleeding edge
> features on production file systems....)

Oh, more enthusiastic users are *already* trying out metadata_csum
on production file systems.  And finding lots of bugs doing so!

My latest is a file system where e2fsck can find the problem, but
can't fix it.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Issue with bad file system
  2012-11-19 17:14       ` Eric Sandeen
@ 2012-11-19 18:41         ` Theodore Ts'o
  2012-11-19 19:15           ` George Spelvin
  2012-11-19 19:53           ` Drew Reusser
  2012-11-19 19:54         ` Drew Reusser
  1 sibling, 2 replies; 17+ messages in thread
From: Theodore Ts'o @ 2012-11-19 18:41 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: Drew Reusser, George Spelvin, linux-ext4

One of the things you could do to verify that the RAID array is in
fact sane is to run the following command:

debugfs -s 32768 -b 4096 /dev/md0

Then you can examine the file system via the debugfs commands "cd",
"ls", "cat", and "dump" (or even "rdump", although that's more for
involved recovery operations).  I would suggest looking at a number of
directories to make sure they look as you expect, and trying to dump
out a few files to make sure they are uncorrupted.

If the majority of the files you look at look sane, then it should be
safe to let e2fsck recover the file system from the backup superblock.
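
A non-interactive way to do that kind of spot check (a sketch; the
paths below are only placeholders for directories and files you know
should exist):

debugfs -s 32768 -b 4096 -R "ls -l /" /dev/md0
debugfs -s 32768 -b 4096 -R "ls -l /some/known/dir" /dev/md0
debugfs -s 32768 -b 4096 -R "dump /some/known/file /tmp/check" /dev/md0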

In the future, we'll be able to use the metadata checksum feature to
automate this process (as well as being able to more gracefully and
automatically handle inode table blocks written to the wrong location
on disk, overwriting other inode table blocks) --- but a bit more
testing is needed before I'd recommend it for regular users.  (In
particular, I want to make sure that random journal corruptions are
handled correctly when the metadata checksum feature is enabled ---
before we start having more enthusiastic users try out bleeding edge
features on production file systems....)

Regards,

						- Ted

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Issue with bad file system
  2012-11-19 16:57     ` Drew Reusser
@ 2012-11-19 17:14       ` Eric Sandeen
  2012-11-19 18:41         ` Theodore Ts'o
  2012-11-19 19:54         ` Drew Reusser
  0 siblings, 2 replies; 17+ messages in thread
From: Eric Sandeen @ 2012-11-19 17:14 UTC (permalink / raw)
  To: Drew Reusser; +Cc: George Spelvin, linux-ext4

On 11/19/12 10:57 AM, Drew Reusser wrote:
> On Mon, Nov 19, 2012 at 8:32 AM, George Spelvin <linux@horizon.com> wrote:

...


>> Another thing that would be useful is "dd if=/dev/md0 skip=2 count=2 | xxd"
>> (or od -x if you don't have xxd).  That will give a hex dump of the
>> primary superblock, which might show the extent of the damage.
>>
>>
> 
> 
> mint mnt # dd if=/dev/md0 skip=2 count=2 | xxd
> 0000000: 3a3a 6b4e fd9b a66c 6467 f7a6 9fb0 9183  ::kN...ldg......
> 2+0 records in
> 2+0 records out

I'd actually start looking at block 0 too.  Can you do that again,
and drop the "skip"?

as an aside - any chance there was supposed to be an encryption
layer in there somewhere?  The below looks pretty random.

-Eric

> 0000010: 3fc6 11aa 0cec 0441 6bc8 6b55 ff4d f2ba  ?......Ak.kU.M..
> 0000020: a475 15d9 ea5d 6833 5608 df95 60cd 4f76  .u...]h3V...`.Ov
> 0000030: 5490 5c22 e564 c91b 1913 a519 807a 8986  T.\".d.......z..
> 0000040: f3ba 3c8a 6c11 f78a bc94 5947 b55d d83f  ..<.l.....YG.].?
> 0000050: 2eac 0f3c ab20 d88d 1820 d5bc 0f97 aaf6  ...<. ... ......
> 0000060: 81f5 63cc 6eff 8e2e 54ad 50fe 291f 17f8  ..c.n...T.P.)...
> 0000070: be01 67ad b9ec 49f7 fa60 953b 7348 9730  ..g...I..`.;sH.0
> 0000080: 6105 8f4d da9c 3ef3 e90e c190 6471 a766  a..M..>.....dq.f
> 0000090: c90a 77f6 1196 3b74 9121 e89c 19bd 29f4  ..w...;t.!....).
> 00000a0: 808c 3342 16e7 0c80 e8c3 3a6a 5560 78eb  ..3B......:jU`x.
> 00000b0: c0d9 c6d5 b386 a3a9 5275 7f5a f572 218b  ........Ru.Z.r!.
> 00000c0: 63d5 28f8 71aa 4f6f e716 060a 4a50 70cb  c.(.q.Oo....JPp.
> 00000d0: a740 8c5f e2df 2e65 11cd a88f c4ed c9bb  .@._...e........
> 00000e0: d444 2da5 7e5d 7bb4 38f6 5fd0 60a8 d2cf  .D-.~]{.8._.`...
> 00000f0: 813f 9afd 26b8 8cc0 6e6a 59a2 e1f6 32ce  .?..&...njY...2.
> 0000100: 8c01 5928 5661 9687 fb9c b07d b412 4a57  ..Y(Va.....}..JW
> 0000110: 2626 7099 c350 a893 3f76 1953 c34b 7ddf  &&p..P..?v.S.K}.
> 0000120: d73d 5e7f 9d3c f4fe dac9 e2d7 8eaf da94  .=^..<..........
> 0000130: 86fb cbfe 7866 45fa 72c9 a687 ea83 3b71  ....xfE.r.....;q
> 0000140: 80cc 3320 59c3 c653 977d b6e3 79c4 5c3b  ..3 Y..S.}..y.\;
> 0000150: b36b c978 6824 d4b9 9252 04dd 39de 4c05  .k.xh$...R..9.L.
> 0000160: d331 951b 3dff 8974 4ac1 1950 f6bd 5586  .1..=..tJ..P..U.
> 0000170: 1095 f62c e7b3 d9d6 fbee 8cfb 5bd0 959e  ...,........[...
> 0000180: f6ae f551 1b41 a8cf ae40 435a ff05 c38e  ...Q.A...@CZ....
> 0000190: 903d 2258 689e b64c f49d b7b3 593d f55c  .="Xh..L....Y=.\
> 00001a0: fd4f f395 4ecd 6653 e46d 66b6 a046 a9cb  .O..N.fS.mf..F..
> 00001b0: ee90 58d8 e876 5f63 5014 cbfe b0e1 cd53  ..X..v_cP......S
> 00001c0: 12ac 43c5 2996 4a15 014e 3c7d a5c2 5842  ..C.).J..N<}..XB
> 00001d0: 2e00 f2ea 3e12 fbe6 1403 e240 0e01 80d0  ....>......@....
> 00001e0: 1a03 c0c0 37fb 3cae df51 1818 987f e03b  ....7.<..Q.....;
> 00001f0: 718f b339 63e7 8495 56b4 045a 0091 7382  q..9c...V..Z..s.
> 0000200: 9828 187e 5eab e288 a4c6 fd95 3950 f868  .(.~^.......9P.h
> 0000210: 3ee5 5fe0 4943 8e1d 2315 41a7 0989 c97c  >._.IC..#.A....|
> 0000220: 992a c47b 9ccd e08a cfea 4603 2f51 e5e3  .*.{......F./Q..
> 0000230: 04c3 3224 bfaa f0fa 79ae 13db d774 8c87  ..2$....y....t..
> 0000240: fad8 93b1 ddc6 ce8c 90f3 e754 c6a4 ece3  ...........T....
> 0000250: 13ab 59e8 b5cc f5d1 c9ab 297f ba63 84a7  ..Y.......)..c..
> 0000260: c8ed bff6 9e55 a191 cfef c79e 6cd0 8ccd  .....U......l...
> 0000270: 83b1 de5a 3c26 1f81 e1fa b2ed 503a 2445  ...Z<&......P:$E
> 0000280: 7212 2b2e 1242 18fb cac7 c3e5 73de fb2a  r.+..B......s..*
> 0000290: c0ed 2318 01ba 1f04 22f1 fb3d f356 c0a0  ..#....."..=.V..
> 00002a0: 258f 0184 6653 7814 e3ff b4ab 3276 7b9d  %...fSx.....2v{.
> 00002b0: 6c68 0b00 8024 68f0 47e6 2aad e447 674b  lh...$h.G.*..GgK
> 00002c0: 1a0c 0f85 e11d 5275 6d58 9940 e738 a3a8  ......RumX.@.8..
> 00002d0: 490f cdbc e710 9099 9dbd a688 00d8 a530  I..............0
> 00002e0: 843c 6665 a912 abdb 6c95 9e96 70dc f409  .<fe....l...p...
> 00002f0: f27a 3c12 15b0 c168 29a7 c190 f9ac 90c0  .z<....h).......
> 0000300: cd58 20ff 0461 ddcf 6617 9764 c352 a0de  .X ..a..f..d.R..
> 0000310: 3818 e5e0 a168 49aa 8b98 2e6d 92f9 b575  8....hI....m...u
> 0000320: d7dd 4651 1c54 c5e9 f96a d0f0 14c0 240a  ..FQ.T...j....$.
> 0000330: 3193 00e1 f895 6aba 0780 37c1 0f3a 3b3e  1.....j...7..:;>
> 0000340: a8bb c25f 8148 0140 1825 7814 0a68 8e5e  ..._.H.@.%x..h.^
> 0000350: 237b db47 9e5f 573c acbd 5d54 a1ae ce9c  #{.G._W<..]T....
> 0000360: b498 c8cd e0dd 4c34 ee8a bf32 b7cf 0cca  ......L4...2....
> 0000370: ba69 e0a7 e9ef 09c6 7c20 5007 7662 9c36  .i......| P.vb.6
> 0000380: f053 fda0 41f6 e560 1f2b ffbc 5344 407c  .S..A..`.+..SD@|
> 0000390: 9801 ee74 a1ef 236f 6b6c 50c0 2acf 8ebf  ...t..#oklP.*...
> 00003a0: f6ec 5049 7633 d215 1b35 af46 44e0 a7db  ..PIv3...5.FD...
> 00003b0: fd01 43ef c03d 2d44 7c21 d12c 75a3 f1f2  ..C..=-D|!.,u...
> 00003c0: a459 e196 ce0f 6de0 19f1 d086 2504 2f09  .Y....m.....%./.
> 00003d0: 6abe 5f04 4277 86c9 6b20 a054 bf82 0b5d  j._.Bw..k .T...]
> 00003e0: 3525 32b4 051a af5e 34af 1b29 0083 4987  5%2....^4..)..I.
> 00003f0: c071 ab38 2567 0dff 54bc 2f8e 130e 33e2  .q.8%g..T./...3.
> 
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Issue with bad file system
  2012-11-19 15:29     ` Eric Sandeen
@ 2012-11-19 17:00       ` Drew Reusser
  0 siblings, 0 replies; 17+ messages in thread
From: Drew Reusser @ 2012-11-19 17:00 UTC (permalink / raw)
  To: Eric Sandeen; +Cc: George Spelvin, linux-ext4

On Mon, Nov 19, 2012 at 3:29 PM, Eric Sandeen <sandeen@redhat.com> wrote:
> On 11/19/12 2:32 AM, George Spelvin wrote:
> ...
>
>> "e2fsck -n" will only print errors and not change anything.  It's
>> always safe.
>>
>> Try "e2fsck -n -v /dev/md0" (given the dumpe2fs failure, I expect that
>> will not work) and then try "e2fsck -n -v -b 32768 /dev/md0".
>>
>> I don't know what happened to your superblock, but if that's all that
>> got trashed, recovery is actually quite straightforward and there's no
>> risk of data loss.  e2fsck will just print a huge number of "free blocks
>> count wrong" messages as it fixes them.
>>
>> (However, that's a pretty big "if".)
>>
>>
>> Another thing that would be useful is "dd if=/dev/md0 skip=2 count=2 | xxd"
>> (or od -x if you don't have xxd).  That will give a hex dump of the
>> primary superblock, which might show the extent of the damage.
>>
>>
>> If "e2fsck -n -b 32768" works, the way to repair it is to run it again
>> without the "-n", but the -n output will say how bad it is.
>
> Whoops, I replied without seeing these other replies; somehow threading
> was broken w/ George's first reply.
>
> Anyway - I would not go to e2fsck yet.  I think your raid is mis-assembled.
> I'd investigate that first.  I'll look at the other output a bit more, but
> for now, I'd stay away from fsck - just wanted to get that out there quick.
>
> -Eric

Can you give me more details as to why you think the raid is misassembled?

mint ~ # mdadm --examine /dev/sd[abde]1
/dev/sda1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : db9e3115:556a49db:27c42d30:02657472
           Name : mint:0  (local to host mint)
  Creation Time : Thu Nov 15 11:08:02 2012
     Raid Level : raid10
   Raid Devices : 4

 Avail Dev Size : 1953259520 (931.39 GiB 1000.07 GB)
     Array Size : 1952211968 (1861.77 GiB 1999.07 GB)
  Used Dev Size : 1952211968 (930.89 GiB 999.53 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 933ec5c0:d6819e33:adb0e6c8:90e337bd

    Update Time : Thu Nov 15 15:08:55 2012
       Checksum : b516984f - correct
         Events : 17

         Layout : near=2
     Chunk Size : 512K

   Device Role : Active device 0
   Array State : AAAA ('A' == active, '.' == missing)
/dev/sdb1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : db9e3115:556a49db:27c42d30:02657472
           Name : mint:0  (local to host mint)
  Creation Time : Thu Nov 15 11:08:02 2012
     Raid Level : raid10
   Raid Devices : 4

 Avail Dev Size : 1952212992 (930.89 GiB 999.53 GB)
     Array Size : 1952211968 (1861.77 GiB 1999.07 GB)
  Used Dev Size : 1952211968 (930.89 GiB 999.53 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 9d6df7c7:ce401405:4ea18763:a528ecc5

    Update Time : Thu Nov 15 15:08:55 2012
       Checksum : 3103c408 - correct
         Events : 17

         Layout : near=2
     Chunk Size : 512K

   Device Role : Active device 1
   Array State : AAAA ('A' == active, '.' == missing)
/dev/sdd1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : db9e3115:556a49db:27c42d30:02657472
           Name : mint:0  (local to host mint)
  Creation Time : Thu Nov 15 11:08:02 2012
     Raid Level : raid10
   Raid Devices : 4

 Avail Dev Size : 1953259520 (931.39 GiB 1000.07 GB)
     Array Size : 1952211968 (1861.77 GiB 1999.07 GB)
  Used Dev Size : 1952211968 (930.89 GiB 999.53 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : fa1a1b82:989e933a:95e4d249:5cee901d

    Update Time : Thu Nov 15 15:08:55 2012
       Checksum : 5ea6d02d - correct
         Events : 17

         Layout : near=2
     Chunk Size : 512K

   Device Role : Active device 2
   Array State : AAAA ('A' == active, '.' == missing)
/dev/sde1:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x0
     Array UUID : db9e3115:556a49db:27c42d30:02657472
           Name : mint:0  (local to host mint)
  Creation Time : Thu Nov 15 11:08:02 2012
     Raid Level : raid10
   Raid Devices : 4

 Avail Dev Size : 1952212992 (930.89 GiB 999.53 GB)
     Array Size : 1952211968 (1861.77 GiB 1999.07 GB)
  Used Dev Size : 1952211968 (930.89 GiB 999.53 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : clean
    Device UUID : 594ed481:471ef11a:027f1c24:6f9d057d

    Update Time : Thu Nov 15 15:08:55 2012
       Checksum : 786bd4bc - correct
         Events : 17

         Layout : near=2
     Chunk Size : 512K

   Device Role : Active device 3
   Array State : AAAA ('A' == active, '.' == missing)

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Issue with bad file system
  2012-11-19  8:32   ` George Spelvin
  2012-11-19 15:29     ` Eric Sandeen
@ 2012-11-19 16:57     ` Drew Reusser
  2012-11-19 17:14       ` Eric Sandeen
  1 sibling, 1 reply; 17+ messages in thread
From: Drew Reusser @ 2012-11-19 16:57 UTC (permalink / raw)
  To: George Spelvin; +Cc: linux-ext4

On Mon, Nov 19, 2012 at 8:32 AM, George Spelvin <linux@horizon.com> wrote:
>> I am running this on Linux Mint 12 .. I don't know the kernel version
>> as I cannot boot so I am booting off whatever I downloaded from Mint's
>> website (13 rc I think) off a pen drive to try and save my data.
>>
>> mint mnt # uname -a
>> Linux mint 3.5.0-17-generic #28-Ubuntu SMP Tue Oct 9 19:31:23 UTC 2012
>> x86_64 x86_64 x86_64 GNU/Linux
>
> For the benefit of other ext4 hackers, Mint 12 is based on Ubuntu 11.10
> and runs a 3.0 kernel.
>
>> I ran the cat of /proc/partitions and copied the data from previous
>> emails to the linux-raid DL (which forwarded me onto this one).  Must
>> have gotten it before the raid was working.  Here is an updated one.
>
> Okay, no problem.  The failure to show up just conflicted with your
> statement that the RAID worked fine, and I was wondering if there was a
> problem there.  It appears that's not the issue.
>
>> as for the last command you asked .. can you give me more info on it?
>> if you meant dumpe2fs ... here is the output.
>
> I did; my fingers got confused; sorry about the typo.  Doubly sorry
> because it is a plausible command name.
>
>> mint mnt # dumpe2fs -h /dev/md0
>> dumpe2fs 1.42.5 (29-Jul-2012)
>> dumpe2fs: Bad magic number in super-block while trying to open /dev/md0
>> Couldn't find valid filesystem superblock.
>
> Well, that explains the immediate problem.
>
> Does "dumpe2fs -h -o superblock=32768" produce anything more useful?
> (That checks the first backup superblock.  There are additional backups at
> 98304, 163840, 229376, 294912, ...)
>


mint mnt # dumpe2fs -h -o superblock=32768 /dev/md0
dumpe2fs 1.42.5 (29-Jul-2012)
dumpe2fs: Filesystem has unexpected block size while trying to open /dev/md0
Couldn't find valid filesystem superblock.
mint mnt # dumpe2fs -h -o superblock=98304 /dev/md0
dumpe2fs 1.42.5 (29-Jul-2012)
dumpe2fs: Filesystem has unexpected block size while trying to open /dev/md0
Couldn't find valid filesystem superblock.
mint mnt # dumpe2fs -h -o superblock=163840 /dev/md0
dumpe2fs 1.42.5 (29-Jul-2012)
dumpe2fs: Bad magic number in super-block while trying to open /dev/md0
Couldn't find valid filesystem superblock.
mint mnt # dumpe2fs -h -o superblock=229376 /dev/md0
dumpe2fs 1.42.5 (29-Jul-2012)
dumpe2fs: Bad magic number in super-block while trying to open /dev/md0
Couldn't find valid filesystem superblock.
mint mnt # dumpe2fs -h -o superblock=294912 /dev/md0
dumpe2fs 1.42.5 (29-Jul-2012)
dumpe2fs: Bad magic number in super-block while trying to open /dev/md0
Couldn't find valid filesystem superblock.




>> Can you give me specific e2fsck commands to run which will not ruin my
>> disks and data?  I have seen people online recommending re-writing the
>> super blocks but I am not sure I want to write anything until I know
>> it will not damage something and erase my data.
>
> "e2fsck -n" will only print errors and not change anything.  It's
> always safe.
>


mint mnt # e2fsck -n -v /dev/md0
e2fsck 1.42.5 (29-Jul-2012)
ext2fs_open2: Bad magic number in super-block
e2fsck: Superblock invalid, trying backup blocks...
e2fsck: Bad magic number in super-block while trying to open /dev/md0

The superblock could not be read or does not describe a correct ext2
filesystem.  If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
    e2fsck -b 8193 <device>




> Try "e2fsck -n -v /dev/md0" (given the dumpe2fs failure, I expect that
> will not work) and then try "e2fsck -n -v -b 32768 /dev/md0".
>

mint mnt # e2fsck -n -v -b 32768 /dev/md0
e2fsck 1.42.5 (29-Jul-2012)
e2fsck: Filesystem has unexpected block size while trying to open /dev/md0

The superblock could not be read or does not describe a correct ext2
filesystem.  If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
    e2fsck -b 8193 <device>



> I don't know what happened to your superblock, but if that's all that
> got trashed, recovery is actually quite straightforward and there's no
> risk of data loss.  e2fsck will just print a huge number of "free blocks
> count wrong" messages as it fixes them.
>
> (However, that's a pretty big "if".)
>
>
> Another thing that would be useful is "dd if=/dev/md0 skip=2 count=2 | xxd"
> (or od -x if you don't have xxd).  That will give a hex dump of the
> primary superblock, which might show the extent of the damage.
>
>


mint mnt # dd if=/dev/md0 skip=2 count=2 | xxd
0000000: 3a3a 6b4e fd9b a66c 6467 f7a6 9fb0 9183  ::kN...ldg......
2+0 records in
2+0 records out
0000010: 3fc6 11aa 0cec 0441 6bc8 6b55 ff4d f2ba  ?......Ak.kU.M..
0000020: a475 15d9 ea5d 6833 5608 df95 60cd 4f76  .u...]h3V...`.Ov
0000030: 5490 5c22 e564 c91b 1913 a519 807a 8986  T.\".d.......z..
0000040: f3ba 3c8a 6c11 f78a bc94 5947 b55d d83f  ..<.l.....YG.].?
0000050: 2eac 0f3c ab20 d88d 1820 d5bc 0f97 aaf6  ...<. ... ......
0000060: 81f5 63cc 6eff 8e2e 54ad 50fe 291f 17f8  ..c.n...T.P.)...
0000070: be01 67ad b9ec 49f7 fa60 953b 7348 9730  ..g...I..`.;sH.0
1024 bytes (1.0 kB) copied, 0.0122559 s, 83.6 kB/s
0000080: 6105 8f4d da9c 3ef3 e90e c190 6471 a766  a..M..>.....dq.f
0000090: c90a 77f6 1196 3b74 9121 e89c 19bd 29f4  ..w...;t.!....).
00000a0: 808c 3342 16e7 0c80 e8c3 3a6a 5560 78eb  ..3B......:jU`x.
00000b0: c0d9 c6d5 b386 a3a9 5275 7f5a f572 218b  ........Ru.Z.r!.
00000c0: 63d5 28f8 71aa 4f6f e716 060a 4a50 70cb  c.(.q.Oo....JPp.
00000d0: a740 8c5f e2df 2e65 11cd a88f c4ed c9bb  .@._...e........
00000e0: d444 2da5 7e5d 7bb4 38f6 5fd0 60a8 d2cf  .D-.~]{.8._.`...
00000f0: 813f 9afd 26b8 8cc0 6e6a 59a2 e1f6 32ce  .?..&...njY...2.
0000100: 8c01 5928 5661 9687 fb9c b07d b412 4a57  ..Y(Va.....}..JW
0000110: 2626 7099 c350 a893 3f76 1953 c34b 7ddf  &&p..P..?v.S.K}.
0000120: d73d 5e7f 9d3c f4fe dac9 e2d7 8eaf da94  .=^..<..........
0000130: 86fb cbfe 7866 45fa 72c9 a687 ea83 3b71  ....xfE.r.....;q
0000140: 80cc 3320 59c3 c653 977d b6e3 79c4 5c3b  ..3 Y..S.}..y.\;
0000150: b36b c978 6824 d4b9 9252 04dd 39de 4c05  .k.xh$...R..9.L.
0000160: d331 951b 3dff 8974 4ac1 1950 f6bd 5586  .1..=..tJ..P..U.
0000170: 1095 f62c e7b3 d9d6 fbee 8cfb 5bd0 959e  ...,........[...
0000180: f6ae f551 1b41 a8cf ae40 435a ff05 c38e  ...Q.A...@CZ....
0000190: 903d 2258 689e b64c f49d b7b3 593d f55c  .="Xh..L....Y=.\
00001a0: fd4f f395 4ecd 6653 e46d 66b6 a046 a9cb  .O..N.fS.mf..F..
00001b0: ee90 58d8 e876 5f63 5014 cbfe b0e1 cd53  ..X..v_cP......S
00001c0: 12ac 43c5 2996 4a15 014e 3c7d a5c2 5842  ..C.).J..N<}..XB
00001d0: 2e00 f2ea 3e12 fbe6 1403 e240 0e01 80d0  ....>......@....
00001e0: 1a03 c0c0 37fb 3cae df51 1818 987f e03b  ....7.<..Q.....;
00001f0: 718f b339 63e7 8495 56b4 045a 0091 7382  q..9c...V..Z..s.
0000200: 9828 187e 5eab e288 a4c6 fd95 3950 f868  .(.~^.......9P.h
0000210: 3ee5 5fe0 4943 8e1d 2315 41a7 0989 c97c  >._.IC..#.A....|
0000220: 992a c47b 9ccd e08a cfea 4603 2f51 e5e3  .*.{......F./Q..
0000230: 04c3 3224 bfaa f0fa 79ae 13db d774 8c87  ..2$....y....t..
0000240: fad8 93b1 ddc6 ce8c 90f3 e754 c6a4 ece3  ...........T....
0000250: 13ab 59e8 b5cc f5d1 c9ab 297f ba63 84a7  ..Y.......)..c..
0000260: c8ed bff6 9e55 a191 cfef c79e 6cd0 8ccd  .....U......l...
0000270: 83b1 de5a 3c26 1f81 e1fa b2ed 503a 2445  ...Z<&......P:$E
0000280: 7212 2b2e 1242 18fb cac7 c3e5 73de fb2a  r.+..B......s..*
0000290: c0ed 2318 01ba 1f04 22f1 fb3d f356 c0a0  ..#....."..=.V..
00002a0: 258f 0184 6653 7814 e3ff b4ab 3276 7b9d  %...fSx.....2v{.
00002b0: 6c68 0b00 8024 68f0 47e6 2aad e447 674b  lh...$h.G.*..GgK
00002c0: 1a0c 0f85 e11d 5275 6d58 9940 e738 a3a8  ......RumX.@.8..
00002d0: 490f cdbc e710 9099 9dbd a688 00d8 a530  I..............0
00002e0: 843c 6665 a912 abdb 6c95 9e96 70dc f409  .<fe....l...p...
00002f0: f27a 3c12 15b0 c168 29a7 c190 f9ac 90c0  .z<....h).......
0000300: cd58 20ff 0461 ddcf 6617 9764 c352 a0de  .X ..a..f..d.R..
0000310: 3818 e5e0 a168 49aa 8b98 2e6d 92f9 b575  8....hI....m...u
0000320: d7dd 4651 1c54 c5e9 f96a d0f0 14c0 240a  ..FQ.T...j....$.
0000330: 3193 00e1 f895 6aba 0780 37c1 0f3a 3b3e  1.....j...7..:;>
0000340: a8bb c25f 8148 0140 1825 7814 0a68 8e5e  ..._.H.@.%x..h.^
0000350: 237b db47 9e5f 573c acbd 5d54 a1ae ce9c  #{.G._W<..]T....
0000360: b498 c8cd e0dd 4c34 ee8a bf32 b7cf 0cca  ......L4...2....
0000370: ba69 e0a7 e9ef 09c6 7c20 5007 7662 9c36  .i......| P.vb.6
0000380: f053 fda0 41f6 e560 1f2b ffbc 5344 407c  .S..A..`.+..SD@|
0000390: 9801 ee74 a1ef 236f 6b6c 50c0 2acf 8ebf  ...t..#oklP.*...
00003a0: f6ec 5049 7633 d215 1b35 af46 44e0 a7db  ..PIv3...5.FD...
00003b0: fd01 43ef c03d 2d44 7c21 d12c 75a3 f1f2  ..C..=-D|!.,u...
00003c0: a459 e196 ce0f 6de0 19f1 d086 2504 2f09  .Y....m.....%./.
00003d0: 6abe 5f04 4277 86c9 6b20 a054 bf82 0b5d  j._.Bw..k .T...]
00003e0: 3525 32b4 051a af5e 34af 1b29 0083 4987  5%2....^4..)..I.
00003f0: c071 ab38 2567 0dff 54bc 2f8e 130e 33e2  .q.8%g..T./...3.



> If "e2fsck -n -b 32768" works, the way to repair it is to run it again
> without the "-n", but the -n output will say how bad it is.
>
> As a general rule of thumb, the more phases e2fsck gets through before
> complaining, the less the damage.  Errors in the bitmaps found in phase 5
> are not a serious problem; they only indicate things that would rapidly
> *become* serious problems if you mounted the file system and tried to
> write to it.
>
> One thing I strongly recommend when e2fsck is fixing a lot of problems is
> to save a log (I usually use the "script" program, but you can also use
> "<command> 2>&1 | tee output") of the e2fsck run so you can refer back
> to it later.


Okay, so basically I just have to find the right superblock and get
that back.  Is there a way to find the list of backup superblocks on
the raid array so I can test each one and see which one will work?

-Drew

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Issue with bad file system
  2012-11-19  8:32   ` George Spelvin
@ 2012-11-19 15:29     ` Eric Sandeen
  2012-11-19 17:00       ` Drew Reusser
  2012-11-19 16:57     ` Drew Reusser
  1 sibling, 1 reply; 17+ messages in thread
From: Eric Sandeen @ 2012-11-19 15:29 UTC (permalink / raw)
  To: George Spelvin; +Cc: dreusser, linux-ext4

On 11/19/12 2:32 AM, George Spelvin wrote:
...

> "e2fsck -n" will only print errors and not change anything.  It's
> always safe.
> 
> Try "e2fsck -n -v /dev/md0" (given the dumpe2fs failure, I expect that
> will not work) and then try "e2fsck -n -v -b 32768 /dev/md0".
> 
> I don't know what happened to your superblock, but if that's all that
> got trashed, recovery is actually quite straightforward and there's no
> risk of data loss.  e2fsck will just print a huge number of "free blocks
> count wrong" messages as it fixes them.
> 
> (However, that's a pretty big "if".)
> 
> 
> Another thing that would be useful is "dd if=/dev/md0 skip=2 count=2 | xxd"
> (or od -x if you don't have xxd).  That will give a hex dump of the
> primary superblock, which might show the extent of the damage.
> 
> 
> If "e2fsck -n -b 32768" works, the way to repair it is to run it again
> without the "-n", but the -n output will say how bad it is.

Whoops, I replied without seeing these other replies; somehow threading
was broken w/ George's first reply.

Anyway - I would not go to e2fsck yet.  I think your raid is mis-assembled.
I'd investigate that first.  I'll look at the other output a bit more, but
for now, I'd stay away from fsck - just wanted to get that out there quick.

-Eric

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Issue with bad file system
  2012-11-19  7:23 ` Drew Reusser
@ 2012-11-19  8:32   ` George Spelvin
  2012-11-19 15:29     ` Eric Sandeen
  2012-11-19 16:57     ` Drew Reusser
  0 siblings, 2 replies; 17+ messages in thread
From: George Spelvin @ 2012-11-19  8:32 UTC (permalink / raw)
  To: dreusser, linux; +Cc: linux-ext4

> I am running this on Linux Mint 12 .. I don't know the kernel version
> as I cannot boot so I am booting off whatever I downloaded from Mint's
> website (13 rc I think) off a pen drive to try and save my data.
> 
> mint mnt # uname -a
> Linux mint 3.5.0-17-generic #28-Ubuntu SMP Tue Oct 9 19:31:23 UTC 2012
> x86_64 x86_64 x86_64 GNU/Linux

For the benefit of other ext4 hackers, Mint 12 is based on Ubuntu 11.10
and runs a 3.0 kernel.

> I ran the cat of /proc/partitions and copied the data from previous
> emails to the linux-raid DL (which forwarded me onto this one).  Must
> have gotten it before the raid was working.  Here is an updated one.

Okay, no problem.  The failure to show up just conflicted with your
statement that the RAID worked fine, and I was wondering if there was a
problem there.  It appears that's not the issue.

> as for the last command you asked .. can you give me more info on it?
> if you meant dumpe2fs ... here is the output.

I did; my fingers got confused; sorry about the typo.  Doubly sorry
because it is a plausible command name.

> mint mnt # dumpe2fs -h /dev/md0
> dumpe2fs 1.42.5 (29-Jul-2012)
> dumpe2fs: Bad magic number in super-block while trying to open /dev/md0
> Couldn't find valid filesystem superblock.

Well, that explains the immediate problem.

Does "dumpe2fs -h -o superblock=32768" produce anything more useful?
(That checks the first backup superblock.  There are additional backups at
98304, 163840, 229376, 294912, ...)
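
For convenience, those first few can be probed in one go, read-only
(just a sketch):

for sb in 32768 98304 163840 229376 294912; do
    echo "== superblock $sb =="
    dumpe2fs -h -o superblock=$sb /dev/md0 2>&1 | head -n 2
done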

> Can you give me specific e2fsck commands to run which will not ruin my
> disks and data?  I have seen people online recommending re-writing the
> super blocks but I am not sure I want to write anything until I know
> it will not damage something and erase my data.

"e2fsck -n" will only print errors and not change anything.  It's
always safe.

Try "e2fsck -n -v /dev/md0" (given the dumpe2fs failure, I expect that
will not work) and then try "e2fsck -n -v -b 32768 /dev/md0".

I don't know what happened to your superblock, but if that's all that
got trashed, recovery is actually quite straightforward and there's no
risk of data loss.  e2fsck will just print a huge number of "free blocks
count wrong" messages as it fixes them.

(However, that's a pretty big "if".)


Another thing that would be useful is "dd if=/dev/md0 skip=2 count=2 | xxd"
(or od -x if you don't have xxd).  That will give a hex dump of the
primary superblock, which might show the extent of the damage.


If "e2fsck -n -b 32768" works, the way to repair it is to run it again
without the "-n", but the -n output will say how bad it is.

As a general rule of thumb, the more phases e2fsck gets through before
complaining, the less the damage.  Errors in the bitmaps found in phase 5
are not a serious problem; they only indicate things that would rapidly
*become* serious problems if you mounted the file system and tried to
write to it.

One thing I strongly recommend when e2fsck is fixing a lot of problems is
to save a log (I usually use the "script" program, but you can also use
"<command> 2>&1 | tee output") of the e2fsck run so you can refer back
to it later.
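
Putting that together, a conservative read-only pass that keeps a log
might look like this (a sketch; it only wraps logging around the
commands suggested above):

script -c "e2fsck -n -v /dev/md0" e2fsck-primary.log
script -c "e2fsck -n -v -b 32768 /dev/md0" e2fsck-backup32768.log
# or, without script(1):
e2fsck -n -v -b 32768 /dev/md0 2>&1 | tee e2fsck-backup32768.log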

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Issue with bad file system
  2012-11-19  6:30 George Spelvin
@ 2012-11-19  7:23 ` Drew Reusser
  2012-11-19  8:32   ` George Spelvin
  0 siblings, 1 reply; 17+ messages in thread
From: Drew Reusser @ 2012-11-19  7:23 UTC (permalink / raw)
  To: George Spelvin; +Cc: linux-ext4

On Mon, Nov 19, 2012 at 6:30 AM, George Spelvin <linux@horizon.com> wrote:
> I'm sorry about your problems, but... a little more information
> please?
>
> Mount is very explicitly telling you that there's probably more
> information in the kernel logs.  It even gives the command to see it.
> WHY DIDN'T YOU INCLUDE IT?  <snark>Was it perhaps too subtle a hint?</snark>
>
> What kernel version are you running?  If it's a distribution kernel,
> what kernel?  What architecture?
>
> What does e2fsck say about the file system?
> What does dumpe4fs -h /dev/md0 show?
>
>
> The one odd thing I can see is that md devices *are* usually listed in
> /proc/partitions, but it's not showing up in yours.
>
> The output of "cat /proc/mdstat" would also be useful.


I did give you the error from the kernel log in the original email; it
was in quotes (the very last line).  Here is more, along with some
superfluous logs.  This was a stop and a start of the raid array, and
two attempted mounts of the file system:

[264788.678322] md0: detected capacity change from 1999065055232 to 0
[264788.678330] md: md0 stopped.
[264788.678339] md: unbind<sda1>
[264788.680063] md: export_rdev(sda1)
[264788.680108] md: unbind<sde1>
[264788.684033] md: export_rdev(sde1)
[264788.684055] md: unbind<sdd1>
[264788.688534] md: export_rdev(sdd1)
[264788.688556] md: unbind<sdb1>
[264788.696025] md: export_rdev(sdb1)
[264800.331630] md: md0 stopped.
[264800.333015] md: bind<sdb1>
[264800.334507] md: bind<sdd1>
[264800.334790] md: bind<sde1>
[264800.334977] md: bind<sda1>
[264800.338578] bio: create slab <bio-1> at 1
[264800.341261] md/raid10:md0: active with 4 out of 4 devices
[264800.341293] md0: detected capacity change from 0 to 1999065055232
[264800.343496]  md0: unknown partition table
[264906.870361] EXT4-fs (md0): VFS: Can't find ext4 filesystem
[270145.600580] i2c i2c-0: >sendbytes: NAK bailout.
[270155.620473] i2c i2c-0: >sendbytes: NAK bailout.
[270155.621372] i2c i2c-0: >sendbytes: NAK bailout.
[270155.621948] i2c i2c-0: >sendbytes: NAK bailout.
[270155.622523] i2c i2c-0: >sendbytes: NAK bailout.
[270155.623099] i2c i2c-0: >sendbytes: NAK bailout.
[270155.623138] [drm:radeon_vga_detect] *ERROR* VGA-1: probed a
monitor but no|invalid EDID
[270155.817233] i2c i2c-0: >sendbytes: NAK bailout.
[270155.829516] i2c i2c-0: >sendbytes: NAK bailout.
[270155.841186] i2c i2c-0: >sendbytes: NAK bailout.
[270155.841766] i2c i2c-0: >sendbytes: NAK bailout.
[270155.842342] i2c i2c-0: >sendbytes: NAK bailout.
[270155.842918] i2c i2c-0: >sendbytes: NAK bailout.
[270155.843493] i2c i2c-0: >sendbytes: NAK bailout.
[270155.977230] i2c i2c-0: >sendbytes: NAK bailout.
[270155.989185] i2c i2c-0: >sendbytes: NAK bailout.
[270155.989762] i2c i2c-0: >sendbytes: NAK bailout.
[270155.991786] i2c i2c-0: >sendbytes: NAK bailout.
[270155.992386] i2c i2c-0: >sendbytes: NAK bailout.
[270155.993288] i2c i2c-0: >sendbytes: NAK bailout.
[270155.993864] i2c i2c-0: >sendbytes: NAK bailout.
[270155.994440] i2c i2c-0: >sendbytes: NAK bailout.
[270303.640240] EXT4-fs (md0): VFS: Can't find ext4 filesystem


I am running this on Linux Mint 12 .. I don't know the kernel version
as I cannot boot so I am booting off whatever I downloaded from Mint's
website (13 rc I think) off a pen drive to try and save my data.

mint mnt # uname -a
Linux mint 3.5.0-17-generic #28-Ubuntu SMP Tue Oct 9 19:31:23 UTC 2012
x86_64 x86_64 x86_64 GNU/Linux


I ran the cat of /proc/partitions and copied the data from previous
emails to the linux-raid DL (which forwarded me onto this one).  Must
have gotten it before the raid was working.  Here is an updated one.

mint mnt # cat /proc/partitions
major minor  #blocks  name

   7        0     939820 loop0
   8        0  976762584 sda
   8        1  976760832 sda1
   8       16  976762584 sdb
   8       17  976237568 sdb1
   8       32    1985024 sdc
   8       33    1984960 sdc1
   8       48  976762584 sdd
   8       49  976760832 sdd1
   8       64  976762584 sde
   8       65  976237568 sde1
  11        0    1048575 sr0
   8       80 1953514584 sdf
   8       81 1953512448 sdf1
   9        0 1952211968 md0


mint mnt # cat /proc/mdstat
Personalities : [raid10]
md0 : active raid10 sda1[0] sde1[3] sdd1[2] sdb1[1]
      1952211968 blocks super 1.2 512K chunks 2 near-copies [4/4] [UUUU]

unused devices: <none>


Can you give me specific e2fsck commands to run which will not ruin my
disks and data?  I have seen people online recommending re-writing the
super blocks but I am not sure I want to write anything until I know
it will not damage something and erase my data.

as for the last command you asked .. can you give me more info on it?

mint mnt # dumpe4fs -h /dev/md0
No command 'dumpe4fs' found, did you mean:
 Command 'dumpe2fs' from package 'e2fsprogs' (main)
dumpe4fs: command not found
mint mnt # apt-get install dume4fs
Reading package lists... Done
Building dependency tree
Reading state information... Done
E: Unable to locate package dume4fs


if you meant dumpe2fs ... here is the output.

mint mnt # dumpe2fs -h /dev/md0
dumpe2fs 1.42.5 (29-Jul-2012)
dumpe2fs: Bad magic number in super-block while trying to open /dev/md0
Couldn't find valid filesystem superblock.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: Issue with bad file system
@ 2012-11-19  6:30 George Spelvin
  2012-11-19  7:23 ` Drew Reusser
  0 siblings, 1 reply; 17+ messages in thread
From: George Spelvin @ 2012-11-19  6:30 UTC (permalink / raw)
  To: dreusser; +Cc: linux, linux-ext4

I'm sorry about your problems, but... a little more information
please?

Mount is very explicitly telling you that there's probably more
information in the kernel logs.  It even gives the command to see it.
WHY DIDN'T YOU INCLUDE IT?  <snark>Was it perhaps too subtle a hint?</snark>

What kernel version are you running?  If it's a distribution kernel,
what kernel?  What architecture?

What does e2fsck say about the file system?
What does dumpe4fs -h /dev/md0 show?


The one odd thing I can see is that md devices *are* usually listed in
/proc/partitions, but it's not showing up in yours.

The output of "cat /proc/mdstat" would also be useful.

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread

Thread overview: 17+ messages
2012-11-19  4:46 Issue with bad file system Drew Reusser
2012-11-19 15:18 ` Eric Sandeen
2012-11-19  6:30 George Spelvin
2012-11-19  7:23 ` Drew Reusser
2012-11-19  8:32   ` George Spelvin
2012-11-19 15:29     ` Eric Sandeen
2012-11-19 17:00       ` Drew Reusser
2012-11-19 16:57     ` Drew Reusser
2012-11-19 17:14       ` Eric Sandeen
2012-11-19 18:41         ` Theodore Ts'o
2012-11-19 19:15           ` George Spelvin
2012-11-19 19:36             ` Theodore Ts'o
2012-11-19 19:53           ` Drew Reusser
2012-11-19 20:24             ` Theodore Ts'o
2012-11-19 19:54         ` Drew Reusser
2012-11-19 21:15           ` George Spelvin
2012-11-19 21:30             ` Eric Sandeen
