From: "Theodore Ts'o" <tytso@mit.edu>
To: Pavel Machek <pavel@ucw.cz>
Cc: kernel list <linux-kernel@vger.kernel.org>,
	adilger.kernel@dilger.ca, linux-ext4@vger.kernel.org
Subject: Re: ext4: media error but where?
Date: Mon, 7 Jul 2014 19:21:10 -0400
Message-ID: <20140707232110.GE8254@thunk.org>
In-Reply-To: <20140707185543.GA26056@amd.pavel.ucw.cz>

On Mon, Jul 07, 2014 at 08:55:43PM +0200, Pavel Machek wrote:
> If I wanted to recover the data... remount-r would be the way to
> go. Then back it up using dd_rescue. ... But that way I'd turn bad
> sectors into silent data corruption.
> 
> If I wanted to recover data from that partition, fsck -c (or
> badblocks, but that's trickier) and then dd_rescue would be the way to go.

Ah, if that's what you're worried about, just do the following:

badblocks -b 4096 -o /tmp/badblocks.sdXX /dev/sdXX
debugfs -R "icheck $(cat /tmp/badblocks.sdXX)" /dev/sdXX > /tmp/bad-inodes
debugfs -R "ncheck $(sed -e 1d /tmp/bad-inodes | awk '{print $2}' | sort -nu)" /dev/sdXX > /tmp/bad-files

This will give you a list of the files that contain blocks that had
I/O errors.  So now you know which files have contents which have
probably been corrupted.  No more silent data corruption.  :-)
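
One wrinkle in the inode-extraction step: bad blocks that are not part
of any file show up in the icheck output with "<block not found>" in
the inode column, as far as debugfs's output format goes.  A minimal
sketch of a filter that keeps only numeric inode numbers follows; the
function name bad_inodes is made up for illustration, and the assumed
layout is one header line followed by "block <tab> inode" rows.

```shell
# Hypothetical helper: read "debugfs -R icheck ..." output on stdin and
# print the unique inode numbers, one per line, in numeric order.
# Assumes one header line, then rows of "<block>\t<inode>", where the
# inode column may instead read "<block not found>".
bad_inodes() {
    # Skip the header (NR > 1) and keep only rows whose second field
    # is purely numeric, which drops the "<block not found>" rows.
    awk 'NR > 1 && $2 ~ /^[0-9]+$/ { print $2 }' | sort -nu
}
```

It would be used as "bad_inodes < /tmp/bad-inodes" in place of the
sed/awk/sort pipeline inside the ncheck command.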

> Actually -- tool to do relocations would be nice. It is not exactly
> easy to do it right by hand.

It's not *that* hard.  All you really need to do is:

for i in $(cat /tmp/badblocks.sdXX) ; do
    dd if=/dev/zero of=/dev/sdXX bs=4k seek=$i count=1
done
e2fsck -f /dev/sdXX

For bonus points, you could write a C program which tries to read the
block one final time before doing the forced write of all zeros.
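
Short of a C program, the same read-then-zero idea can be roughed out
in shell.  This is only a sketch: the function name zero_if_unreadable
is invented here, and a plain dd read can be satisfied from the page
cache, so on a real device adding iflag=direct to the read (GNU dd)
is probably what you want.

```shell
# Hypothetical helper: zero_if_unreadable DEVICE BLOCK [BLOCKSIZE]
# Try to read the block one last time; overwrite it with zeros only if
# that read fails.  BLOCKSIZE defaults to 4096 bytes.
zero_if_unreadable() {
    dev=$1 blk=$2 bs=${3:-4096}
    if dd if="$dev" of=/dev/null bs="$bs" skip="$blk" count=1 2>/dev/null; then
        echo "block $blk: readable, left alone"
    else
        echo "block $blk: unreadable, writing zeros"
        # conv=notrunc only matters when DEVICE is really an image file;
        # it keeps dd from truncating the file after the written block.
        dd if=/dev/zero of="$dev" bs="$bs" seek="$blk" count=1 conv=notrunc 2>/dev/null
    fi
}
```

Inside the loop above, the dd line would become
"zero_if_unreadable /dev/sdXX $i", with the e2fsck -f run afterwards
as before.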

It's a bit harder if you are trying to interpret the device-driver
dependent error messages, and translate the absolute sector number
into a partition-relative block number.  (Except sometimes, depending
on the block device, the number which is given is either a relative
sector number, or a relative block number.)
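
For the common case where the message really does carry an absolute
sector number, the translation is plain arithmetic.  The sketch below
assumes both sector values are in 512-byte units and that the
partition's starting sector is known (on Linux it can usually be read
from /sys/class/block/sdXX/start); the function name and the numbers
in the example are illustrative only.

```shell
# Hypothetical helper: sector_to_fsblock ABS_SECTOR PART_START [FS_BLOCKSIZE]
# Convert an absolute 512-byte sector number on the whole disk into a
# file system block number relative to the start of the partition.
# FS_BLOCKSIZE defaults to 4096 bytes.
sector_to_fsblock() {
    abs=$1 start=$2 bs=${3:-4096}
    # Offset into the partition in bytes, divided by the fs block size
    # (integer division rounds down to the containing block).
    echo $(( (abs - start) * 512 / bs ))
}

# Example: sector 2128 on a disk whose partition starts at sector 2048
# lies 80 sectors (40960 bytes) into the partition, i.e. in block 10.
sector_to_fsblock 2128 2048
```

If the driver already reports a partition-relative number, pass 0 as
the partition start, or skip the conversion entirely when it reports
a block number.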


For disks that do bad block remapping, an even simpler thing to do is
to just delete the corrupted files.  When the blocks get reallocated
for some other purpose, the HDD should automatically remap the block
on write, and if the write fails, such that you are getting an I/O
error on the write, it's time to replace the disk.

> Forcing reallocation is hard & tricky. You may want to simply mark it
> bad and lose a tiny bit of disk space... And even if you want to force
> reallocation, you want to do fsck -c, first, and restore affected
> files from backup.

Trying to force reallocation isn't that hard, so long as you have
resigned yourself that you've lost the data in the blocks in question.
And if it doesn't work, for whatever reason, I would simply not trust
the disk any longer.

For me at least, it's all about the value of the disk versus the value
of my time and the data on the disk.  When I take my hourly rate into
account ($annual comp divided by 2000), the value of trying to save a
particular hard drive almost never works out in my favor.  So these
days, my bias is to do what I can to save the data, but to not fool
around with trying to play fancy games with e2fsck -c.  I'll just want
to save what I can, and hopefully, with regular backups, that won't
require heroic measures, and then trash and replace the HDD.

Cheers,

					- Ted

P.S.  I'm not sure why you consider running badblocks to be tricky.
The only thing you need to be careful about is passing the file system
blocksize to badblocks.  And since the block size is almost always 4k
for any non-trivial file system, all you really need to do is
"badblocks -b 4096".  Or, if you really like:

	   badblocks -b $(dumpe2fs -h /dev/sdXX | awk -F: '/^Block size: / {print $2}') /dev/sdXX

See?  Easy peasy!  :-)

