linux-kernel.vger.kernel.org archive mirror
* AARGH! Please help. IDE controller fsckup
@ 2002-10-02 13:16 Roy Sigurd Karlsbakk
  2002-10-03  9:53 ` Jakob Oestergaard
  0 siblings, 1 reply; 8+ messages in thread
From: Roy Sigurd Karlsbakk @ 2002-10-02 13:16 UTC (permalink / raw)
  To: Kernel mailing list

hi all

I have this cute little server with some 16 120gig IDE drives, and I've got 
some serious problems with it.

Controllers:
One onboard IDE controller (2 channels).
Two promise ATA100 (2 channels each).
One CMD649 (2 channels).

something seriously bad about the CMD649 makes Linux believe it's the first
controller with hd[abcd]. On these, there are two RAID-1s (/ and /var). Due
to the fact that the box has some 1.6TB of disk space, we haven't got any backup
solution (we have an identical box in order to mirror them).

so - now - the CMD649 has suddenly begun to fail - losing contact with one or 
two drives, and I _really_ need to get what's on /data (RAID-5 on 
hd[efghijklmnop]) out. Problem is - the replacement controller I've got from 
the vendor works fine (turns up as controller 3 serving hd[mnop]). How can I 
revert this most easily to be able to boot again?

I hope this is not too off topic... Please excuse that.

roy

-- 
Roy Sigurd Karlsbakk, Datavaktmester
ProntoTV AS - http://www.pronto.tv/
Tel: +47 9801 3356

Computers are like air conditioners.
They stop working when you open Windows.



* Re: AARGH! Please help. IDE controller fsckup
  2002-10-02 13:16 AARGH! Please help. IDE controller fsckup Roy Sigurd Karlsbakk
@ 2002-10-03  9:53 ` Jakob Oestergaard
  2002-10-03 10:25   ` Roy Sigurd Karlsbakk
  0 siblings, 1 reply; 8+ messages in thread
From: Jakob Oestergaard @ 2002-10-03  9:53 UTC (permalink / raw)
  To: Roy Sigurd Karlsbakk; +Cc: Kernel mailing list

On Wed, Oct 02, 2002 at 03:16:46PM +0200, Roy Sigurd Karlsbakk wrote:
> hi all
> 
> I have this cute little server with some 16 120gig IDE drives, and I've got 
> some serious problems with it.
> 
> Controllers:
> One onboard IDE controller (2 channels).
> Two promise ATA100 (2 channels each).
> One CMD649 (2 channels).
> 
> something seriously bad about the CMD649 makes Linux believe it's the first
> controller with hd[abcd]. On these, there are two RAID-1s (/ and /var). Due
> to the fact that the box has some 1.6TB of disk space, we haven't got any backup
> solution (we have an identical box in order to mirror them).
> 
> so - now - the CMD649 has suddenly begun to fail - losing contact with one or 
> two drives, and I _really_ need to get what's on /data (RAID-5 on 
> hd[efghijklmnop]) out. Problem is - the replacement controller I've got from 
> the vendor works fine (turns up as controller 3 serving hd[mnop]). How can I 
> revert this most easily to be able to boot again?

Hindsight:  had you used persistent superblocks, this would not have
been a problem.  The kernel would know the correct ordering from the
superblocks, not the device names.

Solution 1: Write to the RAID mailing list and have one of the mdadm
gurus give you a one-liner to initialize the array with the proper
ordering.

Solution 2: Edit your /etc/raidtab to reflect the new device naming and
run raidstart.

If you start up the array with a bad ordering, no amount of magic is
going to bring back your data (after parity has been "reconstructed" on
various chunks of your existing data).


> 
> I hope this is not too off topic... Please excuse that.
> 

linux-raid is a better place.


Cheers,

-- 
................................................................
:   jakob@unthought.net   : And I see the elder races,         :
:.........................: putrid forms of man                :
:   Jakob Østergaard      : See him rise and claim the earth,  :
:        OZ9ABN           : his downfall is at hand.           :
:.........................:............{Konkhra}...............:


* Re: AARGH! Please help. IDE controller fsckup
  2002-10-03  9:53 ` Jakob Oestergaard
@ 2002-10-03 10:25   ` Roy Sigurd Karlsbakk
  2002-10-03 11:40     ` Jakob Oestergaard
  0 siblings, 1 reply; 8+ messages in thread
From: Roy Sigurd Karlsbakk @ 2002-10-03 10:25 UTC (permalink / raw)
  To: Jakob Oestergaard; +Cc: Kernel mailing list

> > so - now - the CMD649 has suddenly begun to fail - losing contact with
> > one or two drives, and I _really_ need to get what's on /data (RAID-5 on
> > hd[efghijklmnop]) out. Problem is - the replacement controller I've got
> > from the vendor works fine (turns up as controller 3 serving hd[mnop]).
> > How can I revert this most easily to be able to boot again?
>
> Hindsight:  had you used persistent superblocks, this would not have
> been a problem.  The kernel would know the correct ordering from the
> superblocks, not the device names.

I have used persistent superblocks, but md0,1,2,3 will be differently ordered
if I change the disk order... At least I think so. It surely didn't work.

> Solution 1: Write to the RAID mailing list and have one of the mdadm
> gurus give you a one-liner to initialize the array with the proper
> ordering.
>
> Solution 2: Edit your /etc/raidtab to reflect the new device naming and
> run raidstart.

ok. but this won't be necessary with persistent superblocks? right?

> If you start up the array with a bad ordering, no amount of magic is
> going to bring back your data (after parity has been "reconstructed" on
> various chunks of your existing data).

But ... with persistent superblock - is it possible to fsckup the raid?

> linux-raid is a better place.

I'll mail them. Thanks anyway

roy



* Re: AARGH! Please help. IDE controller fsckup
  2002-10-03 10:25   ` Roy Sigurd Karlsbakk
@ 2002-10-03 11:40     ` Jakob Oestergaard
  2002-10-03 13:13       ` Roy Sigurd Karlsbakk
  0 siblings, 1 reply; 8+ messages in thread
From: Jakob Oestergaard @ 2002-10-03 11:40 UTC (permalink / raw)
  To: Roy Sigurd Karlsbakk; +Cc: Kernel mailing list

On Thu, Oct 03, 2002 at 12:25:11PM +0200, Roy Sigurd Karlsbakk wrote:
> > > so - now - the CMD649 has suddenly begun to fail - losing contact with
> > > one or two drives, and I _really_ need to get what's on /data (RAID-5 on
> > > hd[efghijklmnop]) out. Problem is - the replacement controller I've got
> > > from the vendor works fine (turns up as controller 3 serving hd[mnop]).
> > > How can I revert this most easily to be able to boot again?
> >
> > Hindsight:  had you used persistent superblocks, this would not have
> > been a problem.  The kernel would know the correct ordering from the
> > superblocks, not the device names.
> 
> I have used persistent superblocks, but md0,1,2,3 will be differently ordered
> if I change the disk order... At least I think so. It surely didn't work.

No. md0 would stay md0.  This is another effect of using superblocks,
and in fact this is also (ironically) more or less the only argument
*against* using them   :)

(Imagine inserting a disk which knows that it is disk 0 of md0 into some
machine that already has a perfectly fine md0 running)

> 
> > Solution 1: Write to the RAID mailing list and have one of the mdadm
> > gurus give you a one-liner to initialize the array with the proper
> > ordering.
> >
> > Solution 2: Edit your /etc/raidtab to reflect the new device naming and
> > run raidstart.
> 
> ok. but this won't be necessary with persistent superblocks? right?

right

> 
> > If you start up the array with a bad ordering, no amount of magic is
> > going to bring back your data (after parity has been "reconstructed" on
> > various chunks of your existing data).
> 
> But ... with persistent superblock - is it possible to fsckup the raid?

You're root, it is indeed possible  :)

But you would not need to perform any of the special operations that you
need to now.

Persistent superblocks save you from a number of "bad" situations you
can encounter with normal production systems (such as replacing a
controller or moving disks around).

One should be careful when moving disks with persistent superblocks
between systems though. You don't want the kernel to autodetect the
"wrong" md0 on boot   :)    I consider this problem nonexistent in the
production environment that I administer, but I know that some people
feel differently about it.  You should consider these pros and cons in
relation to your environment and make a decision based on that.

Cheers,



* Re: AARGH! Please help. IDE controller fsckup
  2002-10-03 11:40     ` Jakob Oestergaard
@ 2002-10-03 13:13       ` Roy Sigurd Karlsbakk
  2002-10-03 13:23         ` Jakob Oestergaard
  0 siblings, 1 reply; 8+ messages in thread
From: Roy Sigurd Karlsbakk @ 2002-10-03 13:13 UTC (permalink / raw)
  To: Jakob Oestergaard; +Cc: Kernel mailing list, linux-raid

> > I have used persistent superblocks, but md0,1,2,3 will be differently
> > ordered if I change the disk order... At least I think so. It surely
> > didn't work.
>
> No. md0 would stay md0.  This is another effect of using superblocks,
> and in fact this is also (ironically) more or less the only argument
> *against* using them   :)
>
> (Imagine inserting a disk which knows that it is disk 0 of md0 into some
> machine that already has a perfectly fine md0 running)

ok. so. theoretically - as long as the system finds all 16 drives, I should be 
able to shuffle them around and attach them to whichever controller there is? 
right?

ok.

now, I've replaced the faulty controller, and am booting up. the new controller
is also (like the old one) a CMD649...

hæ?

it works. but it surely didn't work last time...

thanks

> > But ... with persistent superblock - is it possible to fsckup the raid?
>
> You're root, it is indeed possible  :)

er - yes. I more meant like 'automagically'

> But you would not need to perform any of the special operations that you
> need to now.
>
> Persistent superblocks save you from a number of "bad" situations you
> can encounter with normal production systems (such as replacing a
> controller or moving disks around).
>
> One should be careful when moving disks with persistent superblocks
> between systems though. You don't want the kernel to autodetect the
> "wrong" md0 on boot   :)    I consider this problem nonexistent in the
> production environment that I administer, but I know that some people
> feel differently about it.  You should consider these pros and cons in
> relation to your environment and make a decision based on that.





* Re: AARGH! Please help. IDE controller fsckup
  2002-10-03 13:13       ` Roy Sigurd Karlsbakk
@ 2002-10-03 13:23         ` Jakob Oestergaard
  2002-10-03 20:05           ` Andre Hedrick
  2002-10-05 15:42           ` Roy Sigurd Karlsbakk
  0 siblings, 2 replies; 8+ messages in thread
From: Jakob Oestergaard @ 2002-10-03 13:23 UTC (permalink / raw)
  To: Roy Sigurd Karlsbakk; +Cc: Kernel mailing list, linux-raid

On Thu, Oct 03, 2002 at 03:13:28PM +0200, Roy Sigurd Karlsbakk wrote:
> > > I have used persistent superblocks, but md0,1,2,3 will be differently
> > > ordered if I change the disk order... At least I think so. It surely
> > > didn't work.
> >
> > No. md0 would stay md0.  This is another effect of using superblocks,
> > and in fact this is also (ironically) more or less the only argument
> > *against* using them   :)
> >
> > (Imagine inserting a disk which knows that it is disk 0 of md0 into some
> > machine that already has a perfectly fine md0 running)
> 
> ok. so. theoretically - as long as the system finds all 16 drives, I should be 
> able to shuffle them around and attach them to whichever controller there is? 
> right?

It will not reattach your disks (you need to move cables to do that),
but it will know "First disk of md0" from "Second disk of md0"
regardless of whether those disks are /dev/hda or /dev/sdg.

You can shuffle your disks around as much as you please.  When the RAID
code looks at your disks, it will read their superblocks and correctly
make the first disk of md0 the first disk of md0, and so forth,
regardless of the actual device name of the disk.
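
The idea in the two paragraphs above can be sketched as a toy model in Python: each disk carries a record of which array it belongs to and which slot it fills, and assembly sorts by that record rather than by device name. The `Superblock` fields and the `assemble` helper are simplified assumptions for illustration and do not reflect the real md on-disk format.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Superblock:
    array_id: str   # stands in for the array's identity
    role: int       # slot of this disk within the array

def assemble(found):
    """Order member devices by the role in their superblock.

    found maps a device name to the Superblock read from it.
    """
    members = sorted(found.items(), key=lambda item: item[1].role)
    return [name for name, _ in members]

before = {
    "/dev/hde": Superblock("data", 0),
    "/dev/hdf": Superblock("data", 1),
    "/dev/hdg": Superblock("data", 2),
}
# Same disks after a controller swap: names change, superblocks do not.
after = {
    "/dev/hdm": Superblock("data", 0),
    "/dev/hda": Superblock("data", 2),
    "/dev/hdc": Superblock("data", 1),
}

print(assemble(before))
print(assemble(after))
```

In both cases the disk recorded as slot 0 comes first, whatever name the kernel happened to give it.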

> 
> ok.
> 
> now, I've replaced the faulty controller, and booting up. the new controller 
> is also (like the old one) a CMD649...
> 

RAID doesn't care about controllers.

RAID without persistent superblocks cares about disk device names.

RAID with persistent superblocks doesn't care about disk device names.

> hæ?

Øh?

> 
> it works. but it surely didn't work last time...
> 

Good for you  :)

> thanks
> 
> > > But ... with persistent superblock - is it possible to fsckup the raid?
> >
> > You're root, it is indeed possible  :)
> 
> er - yes. I more meant like 'automagically'

It will only automagically screw up your arrays if you shuffle disks
between machines (mix several RAID arrays from other systems in one
system)  (you can of course move all your disks to one new machine, if
it has none of its original RAIDed disks left).

Just don't mix disks with persistent superblocks from multiple machines
into one single machine.  Unless you know exactly what you're doing.
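
The mixing hazard can be illustrated with a small Python sketch that flags md numbers claimed by more than one distinct array. This is a toy model under stated assumptions: the `(array_id, md_number)` tuples and the `find_conflicts` helper are hypothetical stand-ins for whatever a persistent superblock records, not the kernel's autodetection logic.

```python
from collections import defaultdict

def find_conflicts(found):
    """Return md numbers claimed by more than one distinct array.

    found maps device name -> (array_id, md_number), a toy stand-in
    for what a persistent superblock records.
    """
    claims = defaultdict(set)
    for array_id, md_number in found.values():
        claims[md_number].add(array_id)
    return {md: sorted(ids) for md, ids in claims.items() if len(ids) > 1}

disks = {
    "/dev/hda": ("host-a-root", 0),
    "/dev/hdb": ("host-a-root", 0),
    "/dev/hdc": ("host-b-root", 0),  # disk moved in from another box
    "/dev/hde": ("host-a-data", 2),
}
print(find_conflicts(disks))
```

Running a check along these lines before letting arrays start would at least show that two different arrays both want to be md0.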



* Re: AARGH! Please help. IDE controller fsckup
  2002-10-03 13:23         ` Jakob Oestergaard
@ 2002-10-03 20:05           ` Andre Hedrick
  2002-10-05 15:42           ` Roy Sigurd Karlsbakk
  1 sibling, 0 replies; 8+ messages in thread
From: Andre Hedrick @ 2002-10-03 20:05 UTC (permalink / raw)
  To: Jakob Oestergaard; +Cc: Roy Sigurd Karlsbakk, Kernel mailing list, linux-raid


One of the observed issues with raid-tools is that it does not look at all
the devices' superblocks.  Doing so would allow for out-of-order
initialization: treating the devices like dominoes, stuffing them back in
random order, and having it still work.

If I am wrong here, great.  Somebody please make the correction.

Cheers,

On Thu, 3 Oct 2002, Jakob Oestergaard wrote:

> On Thu, Oct 03, 2002 at 03:13:28PM +0200, Roy Sigurd Karlsbakk wrote:
> > > > I have used persistent superblocks, but md0,1,2,3 will be differently
> > > > ordered if I change the disk order... At least I think so. It surely
> > > > didn't work.
> > >
> > > No. md0 would stay md0.  This is another effect of using superblocks,
> > > and in fact this is also (ironically) more or less the only argument
> > > *against* using them   :)
> > >
> > > (Imagine inserting a disk which knows that it is disk 0 of md0 into some
> > > machine that already has a perfectly fine md0 running)
> > 
> > ok. so. theoretically - as long as the system finds all 16 drives, I should be 
> > able to shuffle them around and attach them to whichever controller there is? 
> > right?
> 
> It will not reattach your disks (you need to move cables to do that),
> but it will know "First disk of md0" from "Second disk of md0"
> regardless of whether those disks are /dev/hda or /dev/sdg.
> 
> You can shuffle your disks around as much as you please.  When the RAID
> code looks at your disks, it will read their superblocks and correctly
> make the first disk of md0 the first disk of md0, and so forth,
> regardless of the actual device name of the disk.
> 
> > 
> > ok.
> > 
> > now, I've replaced the faulty controller, and booting up. the new controller 
> > is also (like the old one) a CMD649...
> > 
> 
> RAID doesn't care about controllers.
> 
> RAID without persistent superblocks cares about disk device names.
> 
> RAID with persistent superblocks doesn't care about disk device names.
> 
> > hæ?
> 
> Øh?
> 
> > 
> > it works. but it surely didn't work last time...
> > 
> 
> Good for you  :)
> 
> > thanks
> > 
> > > > But ... with persistent superblock - is it possible to fsckup the raid?
> > >
> > > You're root, it is indeed possible  :)
> > 
> > er - yes. I more meant like 'automagically'
> 
> It will only automagically screw up your arrays if you shuffle disks
> between machines (mix several RAID arrays from other systems in one
> system)  (you can of course move all your disks to one new machine, if
> it has none of its original RAIDed disks left).
> 
> Just don't mix disks with persistent superblocks from multiple machines
> into one single machine.  Unless you know exactly what you're doing.
> 

Andre Hedrick
LAD Storage Consulting Group



* Re: AARGH! Please help. IDE controller fsckup
  2002-10-03 13:23         ` Jakob Oestergaard
  2002-10-03 20:05           ` Andre Hedrick
@ 2002-10-05 15:42           ` Roy Sigurd Karlsbakk
  1 sibling, 0 replies; 8+ messages in thread
From: Roy Sigurd Karlsbakk @ 2002-10-05 15:42 UTC (permalink / raw)
  To: Jakob Oestergaard; +Cc: Kernel mailing list, linux-raid

Jakob Oestergaard wrote:

> > > > But ... with persistent superblock - is it possible to fsckup the raid?
> > >
> > > You're root, it is indeed possible  :)
> >
> > er - yes. I more meant like 'automagically'
>
> It will only automagically screw up your arrays if you shuffle disks
> between machines (mix several RAID arrays from other systems in one
> system)  (you can of course move all your disks to one new machine, if
> it has none of its original RAIDed disks left).
>
> Just don't mix disks with persistent superblocks from multiple machines
> into one single machine.  Unless you know exactly what you're doing.

Would it be an idea to 'sign' the disks with some hash of the hostname
and IP or something?
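
A minimal Python sketch of what such a signature might look like, purely as an illustration of the suggestion: the hashing scheme, the 16-character tag length, and the helper names `host_signature` and `belongs_here` are all assumptions, and nothing here describes an existing md feature.

```python
import hashlib

def host_signature(hostname, ip):
    """Hash hostname and IP into a short hex tag (hypothetical scheme)."""
    digest = hashlib.sha256(f"{hostname}|{ip}".encode("utf-8"))
    return digest.hexdigest()[:16]

def belongs_here(disk_tag, hostname, ip):
    """True if a tag stored on a disk matches this host's signature."""
    return disk_tag == host_signature(hostname, ip)

# Example values are made up for illustration.
tag = host_signature("server1", "192.0.2.10")
print(belongs_here(tag, "server1", "192.0.2.10"))
print(belongs_here(tag, "server2", "192.0.2.11"))
```

One obvious trade-off of tying the tag to the IP is that renumbering the host would make its own disks look foreign, so a real scheme would probably need a way to re-sign them.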

roy



