* Marking bad blocks
@ 2003-07-20  3:30 Whit Blauvelt
  2003-07-20  9:17 ` Rudy L. Zijlstra
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: Whit Blauvelt @ 2003-07-20  3:30 UTC (permalink / raw)
  To: reiserfs-list

Hi,

I'm trying to use the add-bad-blocks.c script provided on the site, but with
both gcc 2.95.3 and 3.2.2 I get:

"undefined reference to `concat'"

when I try to do "gcc add-bad-blocks.c".

Is there something else I need to do to compile it successfully? Or is it
just bad code? Why isn't whatever the necessary step is spelled out on the
site?

And, the instruction page only mentions patches to old versions of the
kernel. Is the patch to support this now standard in the Reiserfs of kernel
2.4.21? 

I like Reiserfs a lot, and use it extensively, but for the second time I'm
having serious trouble because of its lack of bad block handling. The
Website is full of posturing about "If you have a problem, it's probably
your hardware, not a real bug." Fine, this is hardware. The first time
around a drive problem totally trashed a system. Now it looks like the
Reiser code is better (in 2.4.21 it doesn't crash the system as it does on
the same drive with 2.4.19 - looks like something was improved either in
Reiser or general file system support in the kernel). But an out of date
page on handling bad blocks with a utility that doesn't compile doesn't
evidence thorough care.

Thanks for any advice on how to actually take care of this problem short of
throwing out a drive (or going to another file system) just because a couple
of blocks are bad.

Whit


* Re: Marking bad blocks
  2003-07-20  3:30 Marking bad blocks Whit Blauvelt
@ 2003-07-20  9:17 ` Rudy L. Zijlstra
  2003-07-20 22:59   ` Szakacsits Szabolcs
  2003-07-20 10:39 ` Vitaly Fertman
  2003-07-20 14:23 ` Whit Blauvelt
  2 siblings, 1 reply; 5+ messages in thread
From: Rudy L. Zijlstra @ 2003-07-20  9:17 UTC (permalink / raw)
  To: Whit Blauvelt; +Cc: reiserfs-list

Whit Blauvelt wrote:

>Thanks for any advice on how to actually take care of this problem short of
>throwing out a drive (or going to another file system) just because a couple
>of blocks are bad.
>  
>
In my experience, you are courting a great amount of data loss.

Some HardDisk history (although imprecise on dates):

Early 80s: HDs were small and difficult to produce, and most had some 
bad blocks on them from day 1. Manufacturers would actually tell you 
which blocks were bad on delivery of the disk. The on-disk state was not 
likely to change quickly, and the bad-block list tended to stay stable 
for long periods of time.

Now we'll skip some time, and keep in mind that with HDs as with other 
things, functionality has moved from the OS into the disk.

current state of HDs:
- HDs are many times bigger and faster
- HDs incorporate on-disk memory caching
- HDs have an extra cylinder (sometimes more than 1) which they use to 
re-map bad blocks. In other words, the HD contains a sub-system that 
checks whether a block has gone bad and, if so, no longer uses it but 
instead uses one from the "hidden" cylinders.

It's this last feature that is important in this discussion. From early 
disks to the current day, a few bad blocks on a disk are no problem at 
all. It is when they start growing in number that it is time to replace 
the disk. Usually, when the number of bad blocks starts growing, it 
grows exponentially: slowly at first, then faster and faster. I've 
recently had occasion to observe it again on a disk. Luckily not one of 
mine ;-)
This means that with a modern disk (from 5 years ago and later), once 
you start seeing bad blocks you already have at least a complete 
cylinder full of bad blocks, and it is no longer "some bad blocks". It 
also usually means the disk is rather far advanced on the exponential 
curve.... Of course, there are a number of "usually"s in this. Based on 
personal sad disk history (yes, I have had several disks develop bad 
blocks, and the pattern is rather consistent), I will exchange a disk 
with bad blocks for a new one as soon as I detect bad blocks. If it is 
still within warranty it goes back to the manufacturer posthaste.
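
The remapping behaviour described above can be illustrated with a toy 
model. This is only a sketch: the class name, the method name and the 
spare count are made up for illustration, and real drive firmware is far 
more involved. It shows just the one point being argued, namely that the 
OS sees no bad blocks at all until the drive's spare pool is used up.

```python
class ToyDrive:
    """Toy model of a drive that hides failed blocks behind spare blocks.

    Hypothetical illustration only; not how any particular firmware works.
    """

    def __init__(self, spare_blocks):
        self.spares_left = spare_blocks
        self.remapped = set()     # failed blocks silently redirected to spares
        self.visible_bad = set()  # failed blocks the OS actually gets to see

    def block_fails(self, block):
        """Record that a block has gone bad."""
        if block in self.remapped or block in self.visible_bad:
            return  # this failure is already accounted for
        if self.spares_left > 0:
            # A spare is still available: remap, and the OS never notices.
            self.spares_left -= 1
            self.remapped.add(block)
        else:
            # Spare pool exhausted: the bad block becomes visible to the OS.
            self.visible_bad.add(block)


if __name__ == "__main__":
    drive = ToyDrive(spare_blocks=3)  # assumed, unrealistically small pool
    for failed in [10, 11, 12, 13]:
        drive.block_fails(failed)
    print("remapped by the drive:", sorted(drive.remapped))
    print("visible to the OS:", sorted(drive.visible_bad))
```

In this model the first bad block the OS sees is already the fourth 
failure, which is the reason for treating a visible bad block as a late 
warning rather than an early one.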

Now of course you are free to change to another file system, for 
example ext2, and maybe you are lucky and the disk stays stable for 
another year. After that it suddenly develops more bad blocks, perhaps 
block 1 on cylinder 1, and then you will likely complain on the ext2 
list that you lost your data. Up to you, but don't expect commiseration 
from them.

No filesystem will stay consistent, and not lose data, in the face of 
imperfect hardware.

good luck,

Rudy



* Re: Marking bad blocks
  2003-07-20  3:30 Marking bad blocks Whit Blauvelt
  2003-07-20  9:17 ` Rudy L. Zijlstra
@ 2003-07-20 10:39 ` Vitaly Fertman
  2003-07-20 14:23 ` Whit Blauvelt
  2 siblings, 0 replies; 5+ messages in thread
From: Vitaly Fertman @ 2003-07-20 10:39 UTC (permalink / raw)
  To: Whit Blauvelt, reiserfs-list


Hi, 

On Sunday 20 July 2003 07:30, Whit Blauvelt wrote:
> Hi,
>
> I'm trying to use the add-bad-blocks.c script provided on the site, but
> with both gcc 2.95.3 and 3.2.2 I get:

This is strange -- I have no problem with gcc 3.3 and 2.96.

> "undefined reference to `concat'"

It seems to be a problem with your gcc.

> when I try to do "gcc add-bad-blocks.c".
>
> Is there something else I need to do to compile it successfully? Or is it
> just bad code? Why isn't whatever the necessary step is spelled out on the
> site?
>
> And, the instruction page only mentions patches to old versions of the
> kernel. Is the patch to support this now standard in the Reiserfs of kernel
> 2.4.21?

Only those patches are available. Bad block handling support is going 
to be included in reiserfsprogs soon (likely in the next release).

> I like Reiserfs a lot, and use it extensively, but for the second time I'm
> having serious trouble because of its lack of bad block handling. The
> Website is full of posturing about "If you have a problem, it's probably
> your hardware, not a real bug." Fine, this is hardware. The first time
> around a drive problem totally trashed a system. Now it looks like the
> Reiser code is better (in 2.4.21 it doesn't crash the system as it does on
> the same drive with 2.4.19 - looks like something was improved either in
> Reiser or general file system support in the kernel). But an out of date
> page on handling bad blocks with a utility that doesn't compile doesn't
> evidence thorough care.
>
> Thanks for any advice on how to actually take care of this problem short of
> throwing out a drive (or going to another file system) just because a
> couple of blocks are bad.
>
> Whit

Have you tried to dd into these bad blocks? Writing to them can make the 
hard drive remap them, i.e. move them into its internal bad block list. 
If that fails, it means the bad block list is full and you have a lot of 
bad blocks on your hard drive, and it is strongly recommended to replace 
the hard drive ASAP - you will get more of them soon.
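
As a concrete starting point, here is a minimal sketch that turns a list 
of block numbers (one per line, as badblocks prints them) into dd command 
lines. It only prints the commands and runs nothing. The assumptions are 
mine, not from the reiserfs site: /dev/hdX is a placeholder device, the 
sample block numbers are invented, and the block size must be the same 
one badblocks was run with (1024 bytes unless -b was given). Note that 
writing zeros destroys whatever was stored in those blocks, so the file 
system will most likely need a check afterwards.

```python
import shlex


def dd_commands(badblocks_output, device, block_size=1024):
    """Return one dd command line per block number in badblocks_output.

    block_size must match the block size badblocks used, otherwise the
    seek offset points at the wrong place on the device.
    """
    commands = []
    for line in badblocks_output.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        block = int(line)
        # seek= is counted in units of bs=, so seek=<block> lands exactly
        # on the reported block; count=1 overwrites only that one block.
        commands.append(
            "dd if=/dev/zero of={dev} bs={bs} seek={blk} count=1".format(
                dev=shlex.quote(device), bs=block_size, blk=block
            )
        )
    return commands


if __name__ == "__main__":
    sample = "1052\n1053\n20480\n"  # invented block numbers
    for cmd in dd_commands(sample, "/dev/hdX"):  # placeholder device
        print(cmd)
```

Review the printed commands before running any of them by hand, and 
double-check the device name: a wrong of= target overwrites good data.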

-- 
Thanks,
Vitaly Fertman


* Re: Marking bad blocks
  2003-07-20  3:30 Marking bad blocks Whit Blauvelt
  2003-07-20  9:17 ` Rudy L. Zijlstra
  2003-07-20 10:39 ` Vitaly Fertman
@ 2003-07-20 14:23 ` Whit Blauvelt
  2 siblings, 0 replies; 5+ messages in thread
From: Whit Blauvelt @ 2003-07-20 14:23 UTC (permalink / raw)
  To: reiserfs-list

Thanks for the various responses. The drive in question is a few years old,
so its own bad-block handling capabilities may not be the latest. 

The suggestion to "dd into the blocks" is intriguing. Does anyone have an
example of how to do that based on the block numbers that "badblocks"
outputs? dd's man and info pages are terse, and I've only ever used dd for
writing images to floppies.

As for the suggestion that there's a problem with my gcc since I can't
compile add-bad-blocks.c - well, this is trying it with gcc on two different
systems: the first being an originally Red Hat 6.0 system where gcc has been
upgraded and compiled from the GNU sources, the second on a very current
Gentoo system where gcc is compiled from the Gentoo sources - so
what are the odds that two different versions of gcc on two different
systems and flavors of Linux both have the same problem, and it's gcc's
fault?

As for whether it's wise to replace hard drives at the first sign of
trouble. Yeah, it is. But there are instances like today where it's Sunday,
I'm in a rural location far from anyplace selling hard drives, I don't have
an equivalent drive on hand (wouldn't it be wonderful if we could all always
have a total supply of backup parts for our systems?). The important thing
for software is to keep the system going, with messages to the system
operator about anything consequential. Allowing a failure of a bad block on
a hard drive to be a cause of system failure - when programming a file
system to gracefully handle bad blocks is old hat - isn't consistent with
the belt-and-suspenders philosophy that's the heart of good systems
administration. (Of course, not having a complete set of spare parts isn't
consistent either - but software features are cheaper to distribute than is
hardware).

It's encouraging that some sort of bad block handling is planned for the
next version of Reiser utilities.

Whit


* Re: Marking bad blocks
  2003-07-20  9:17 ` Rudy L. Zijlstra
@ 2003-07-20 22:59   ` Szakacsits Szabolcs
  0 siblings, 0 replies; 5+ messages in thread
From: Szakacsits Szabolcs @ 2003-07-20 22:59 UTC (permalink / raw)
  To: Rudy L. Zijlstra; +Cc: Whit Blauvelt, reiserfs-list


On Sun, 20 Jul 2003, Rudy L. Zijlstra wrote:

> In my experience, you are courting a great amount of data loss.

Considering Whit's disk is old and the bad sectors are new, quite probably.

   [... nice harddisk history summary ...]

I believed the same about the current state, until I wrote ntfsresize and
it started its life one year ago. I marked 'bad block handling' as the
lowest priority and never really planned to implement it; you explained
why. NTFS keeps a list of bad sectors, and ntfsresize only checks for its
presence and refuses to make the modifications if the disk is "dying".

The problem is that a decent number of disks appear to be leaving the
factories nowadays already having bad sectors (in short, no hidden
cylinders, or they are full). Users claim the disk remains stable, with no
new bad sectors over time, and most of them aren't even willing to replace
it. Just like in the early 80s. So in the end, I "had to" write the code.

	Szaka


