* Is It Hopeless?
@ 2010-12-26 18:19 Carl Cook
  2010-12-26 20:11 ` Neil Brown
  0 siblings, 1 reply; 22+ messages in thread
From: Carl Cook @ 2010-12-26 18:19 UTC (permalink / raw)
  To: linux-raid

I went in to turn on my home theater system today, and found a blank screen.  I rebooted and it would not mount /home, which is a 4TB RAID10 array with every movie and show I've recorded over the past two years.  I try to mount it manually, and "wrong fs or bad superblock".  The array is getting set up fine, but the filesystem seems to be destroyed.  

Unbelievable.  This isn't supposed to happen.  It happened once before when I wasn't using RAID, but that was the BTRFS filesystem and I blamed it for being pre-release.  But now it's RAID10 with JFS.

The only sign of trouble:
Dec 25 16:14:56 cygnus shutdown[2180]: shutting down for system reboot
Dec 25 16:14:58 cygnus kernel: [16607.840197] md: md2 stopped.
Dec 25 16:14:58 cygnus kernel: [16607.840210] md: unbind<sdb3>
Dec 25 16:14:58 cygnus kernel: [16607.852029] md: export_rdev(sdb3)
Dec 25 16:14:58 cygnus kernel: [16607.852083] md: unbind<sdc3>
Dec 25 16:14:58 cygnus kernel: [16607.864031] md: export_rdev(sdc3)
Dec 25 16:14:58 cygnus kernel: [16607.864092] md2: detected capacity change from 1913403736064 to 0
Dec 25 16:15:00 cygnus kernel: Kernel logging (proc) stopped.

Reboot:
Dec 25 16:15:48 cygnus kernel: [    1.156657] Uniform CD-ROM driver Revision: 3.20
Dec 25 16:15:48 cygnus kernel: [    1.464298] md: raid10 personality registered for level 10
Dec 25 16:15:48 cygnus kernel: [    1.469307] md: md2 stopped.
Dec 25 16:15:48 cygnus kernel: [    1.470540] md: bind<sdc3>
Dec 25 16:15:48 cygnus kernel: [    1.470642] md: bind<sdb3>
Dec 25 16:15:48 cygnus kernel: [    1.471381] raid10: raid set md2 active with 2 out of 2 devices
Dec 25 16:15:48 cygnus kernel: [    1.476048] md2: bitmap initialized from disk: read 14/14 pags, set 0 bits
Dec 25 16:15:48 cygnus kernel: [    1.476050] created bitmap (223 pages) for device md2
Dec 25 16:15:48 cygnus kernel: [    1.488465] md2: detected capacity change from 0 to 1913403736064
Dec 25 16:15:48 cygnus kernel: [    1.488942]  md2: unknown partition table
Dec 25 16:15:48 cygnus kernel: [    1.597375] PM: Starting manual resume from disk
Dec 25 16:15:48 cygnus kernel: [    1.650832] kjournald starting.  Commit interval 5 seconds


* Re: Is It Hopeless?
  2010-12-26 18:19 Is It Hopeless? Carl Cook
@ 2010-12-26 20:11 ` Neil Brown
  2010-12-26 20:19   ` Carl Cook
  0 siblings, 1 reply; 22+ messages in thread
From: Neil Brown @ 2010-12-26 20:11 UTC (permalink / raw)
  To: Carl Cook; +Cc: linux-raid

On Sun, 26 Dec 2010 10:19:55 -0800 Carl Cook <CACook@quantum-sci.com> wrote:

> I went in to turn on my home theater system today, and found a blank screen.  I rebooted and it would not mount /home, which is a 4TB RAID10 array with every movie and show I've recorded over the past two years.  I try to mount it manually, and "wrong fs or bad superblock".  The array is getting set up fine, but the filesystem seems to be destroyed.  
> 
> Unbelievable.  This isn't supposed to happen.  It happened once before when I wasn't using RAID, but that was the BTRFS filesystem and I blamed it for being pre-release.  But now it's RAID10 with JFS.

None of your logs show anything about jfs....

What does
  fsck.jfs /dev/md2
report?
What about
  mount -t jfs /dev/md2 /home

??

NeilBrown


> 
> The only sign of trouble:
> Dec 25 16:14:56 cygnus shutdown[2180]: shutting down for system reboot
> Dec 25 16:14:58 cygnus kernel: [16607.840197] md: md2 stopped.
> Dec 25 16:14:58 cygnus kernel: [16607.840210] md: unbind<sdb3>
> Dec 25 16:14:58 cygnus kernel: [16607.852029] md: export_rdev(sdb3)
> Dec 25 16:14:58 cygnus kernel: [16607.852083] md: unbind<sdc3>
> Dec 25 16:14:58 cygnus kernel: [16607.864031] md: export_rdev(sdc3)
> Dec 25 16:14:58 cygnus kernel: [16607.864092] md2: detected capacity change from 1913403736064 to 0
> Dec 25 16:15:00 cygnus kernel: Kernel logging (proc) stopped.
> 
> Reboot:
> Dec 25 16:15:48 cygnus kernel: [    1.156657] Uniform CD-ROM driver Revision: 3.20
> Dec 25 16:15:48 cygnus kernel: [    1.464298] md: raid10 personality registered for level 10
> Dec 25 16:15:48 cygnus kernel: [    1.469307] md: md2 stopped.
> Dec 25 16:15:48 cygnus kernel: [    1.470540] md: bind<sdc3>
> Dec 25 16:15:48 cygnus kernel: [    1.470642] md: bind<sdb3>
> Dec 25 16:15:48 cygnus kernel: [    1.471381] raid10: raid set md2 active with 2 out of 2 devices
> Dec 25 16:15:48 cygnus kernel: [    1.476048] md2: bitmap initialized from disk: read 14/14 pags, set 0 bits
> Dec 25 16:15:48 cygnus kernel: [    1.476050] created bitmap (223 pages) for device md2
> Dec 25 16:15:48 cygnus kernel: [    1.488465] md2: detected capacity change from 0 to 1913403736064
> Dec 25 16:15:48 cygnus kernel: [    1.488942]  md2: unknown partition table
> Dec 25 16:15:48 cygnus kernel: [    1.597375] PM: Starting manual resume from disk
> Dec 25 16:15:48 cygnus kernel: [    1.650832] kjournald starting.  Commit interval 5 seconds



* Re: Is It Hopeless?
  2010-12-26 20:11 ` Neil Brown
@ 2010-12-26 20:19   ` Carl Cook
  2010-12-26 20:19     ` CoolCold
  2010-12-26 20:33     ` Neil Brown
  0 siblings, 2 replies; 22+ messages in thread
From: Carl Cook @ 2010-12-26 20:19 UTC (permalink / raw)
  To: linux-raid


My God, it didn't have the command fsck.jfs, so I reinstalled jfsutils.  Now the array mounts.

I don't understand it.  I thought the JFS driver is in the kernel?


On Sun 26 December 2010 12:11:56 Neil Brown wrote:
> On Sun, 26 Dec 2010 10:19:55 -0800 Carl Cook <CACook@quantum-sci.com> wrote:
> 
> > I went in to turn on my home theater system today, and found a blank screen.  I rebooted and it would not mount /home, which is a 4TB RAID10 array with every movie and show I've recorded over the past two years.  I try to mount it manually, and "wrong fs or bad superblock".  The array is getting set up fine, but the filesystem seems to be destroyed.  
> > 
> > Unbelievable.  This isn't supposed to happen.  It happened once before when I wasn't using RAID, but that was the BTRFS filesystem and I blamed it for being pre-release.  But now it's RAID10 with JFS.
> 
> None of your logs show anything about jfs....
> 
> What does
>   fsck.jfs /dev/md2
> report?
> What about
>   mount -t jfs /dev/md2 /home
> 
> ??
> 
> NeilBrown
> 
> 
> > 
> > The only sign of trouble:
> > Dec 25 16:14:56 cygnus shutdown[2180]: shutting down for system reboot
> > Dec 25 16:14:58 cygnus kernel: [16607.840197] md: md2 stopped.
> > Dec 25 16:14:58 cygnus kernel: [16607.840210] md: unbind<sdb3>
> > Dec 25 16:14:58 cygnus kernel: [16607.852029] md: export_rdev(sdb3)
> > Dec 25 16:14:58 cygnus kernel: [16607.852083] md: unbind<sdc3>
> > Dec 25 16:14:58 cygnus kernel: [16607.864031] md: export_rdev(sdc3)
> > Dec 25 16:14:58 cygnus kernel: [16607.864092] md2: detected capacity change from 1913403736064 to 0
> > Dec 25 16:15:00 cygnus kernel: Kernel logging (proc) stopped.
> > 
> > Reboot:
> > Dec 25 16:15:48 cygnus kernel: [    1.156657] Uniform CD-ROM driver Revision: 3.20
> > Dec 25 16:15:48 cygnus kernel: [    1.464298] md: raid10 personality registered for level 10
> > Dec 25 16:15:48 cygnus kernel: [    1.469307] md: md2 stopped.
> > Dec 25 16:15:48 cygnus kernel: [    1.470540] md: bind<sdc3>
> > Dec 25 16:15:48 cygnus kernel: [    1.470642] md: bind<sdb3>
> > Dec 25 16:15:48 cygnus kernel: [    1.471381] raid10: raid set md2 active with 2 out of 2 devices
> > Dec 25 16:15:48 cygnus kernel: [    1.476048] md2: bitmap initialized from disk: read 14/14 pags, set 0 bits
> > Dec 25 16:15:48 cygnus kernel: [    1.476050] created bitmap (223 pages) for device md2
> > Dec 25 16:15:48 cygnus kernel: [    1.488465] md2: detected capacity change from 0 to 1913403736064
> > Dec 25 16:15:48 cygnus kernel: [    1.488942]  md2: unknown partition table
> > Dec 25 16:15:48 cygnus kernel: [    1.597375] PM: Starting manual resume from disk
> > Dec 25 16:15:48 cygnus kernel: [    1.650832] kjournald starting.  Commit interval 5 seconds
> 
> 


* Re: Is It Hopeless?
  2010-12-26 20:19   ` Carl Cook
@ 2010-12-26 20:19     ` CoolCold
  2010-12-26 20:33     ` Neil Brown
  1 sibling, 0 replies; 22+ messages in thread
From: CoolCold @ 2010-12-26 20:19 UTC (permalink / raw)
  To: Carl Cook; +Cc: linux-raid

On Sun, Dec 26, 2010 at 11:19 PM, Carl Cook <CACook@quantum-sci.com> wrote:
>
> My God, it didn't have the command fsck.jfs, so I reinstalled jfsutils.  Now the array mounts.
>
> I don't understand it.  I thought the JFS driver is in the kernel?
The driver is, but the userland tools (fsck.jfs) are not.
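
For example (assuming a Debian-style box here; the package name and the -n
flag are from memory, so double-check on your distro):

  apt-get install jfsutils    # provides fsck.jfs, mkfs.jfs, etc.
  fsck.jfs -n /dev/md2        # -n: check only, report but change nothing
  mount -t jfs /dev/md2 /home
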
>
>
> On Sun 26 December 2010 12:11:56 Neil Brown wrote:
>> On Sun, 26 Dec 2010 10:19:55 -0800 Carl Cook <CACook@quantum-sci.com> wrote:
>>
>> > I went in to turn on my home theater system today, and found a blank screen.  I rebooted and it would not mount /home, which is a 4TB RAID10 array with every movie and show I've recorded over the past two years.  I try to mount it manually, and "wrong fs or bad superblock".  The array is getting set up fine, but the filesystem seems to be destroyed.
>> >
>> > Unbelievable.  This isn't supposed to happen.  It happened once before when I wasn't using RAID, but that was the BTRFS filesystem and I blamed it for being pre-release.  But now it's RAID10 with JFS.
>>
>> None of your logs show anything about jfs....
>>
>> What does
>>   fsck.jfs /dev/md2
>> report?
>> What about
>>   mount -t jfs /dev/md2 /home
>>
>> ??
>>
>> NeilBrown
>>
>>
>> >
>> > The only sign of trouble:
>> > Dec 25 16:14:56 cygnus shutdown[2180]: shutting down for system reboot
>> > Dec 25 16:14:58 cygnus kernel: [16607.840197] md: md2 stopped.
>> > Dec 25 16:14:58 cygnus kernel: [16607.840210] md: unbind<sdb3>
>> > Dec 25 16:14:58 cygnus kernel: [16607.852029] md: export_rdev(sdb3)
>> > Dec 25 16:14:58 cygnus kernel: [16607.852083] md: unbind<sdc3>
>> > Dec 25 16:14:58 cygnus kernel: [16607.864031] md: export_rdev(sdc3)
>> > Dec 25 16:14:58 cygnus kernel: [16607.864092] md2: detected capacity change from 1913403736064 to 0
>> > Dec 25 16:15:00 cygnus kernel: Kernel logging (proc) stopped.
>> >
>> > Reboot:
>> > Dec 25 16:15:48 cygnus kernel: [    1.156657] Uniform CD-ROM driver Revision: 3.20
>> > Dec 25 16:15:48 cygnus kernel: [    1.464298] md: raid10 personality registered for level 10
>> > Dec 25 16:15:48 cygnus kernel: [    1.469307] md: md2 stopped.
>> > Dec 25 16:15:48 cygnus kernel: [    1.470540] md: bind<sdc3>
>> > Dec 25 16:15:48 cygnus kernel: [    1.470642] md: bind<sdb3>
>> > Dec 25 16:15:48 cygnus kernel: [    1.471381] raid10: raid set md2 active with 2 out of 2 devices
>> > Dec 25 16:15:48 cygnus kernel: [    1.476048] md2: bitmap initialized from disk: read 14/14 pags, set 0 bits
>> > Dec 25 16:15:48 cygnus kernel: [    1.476050] created bitmap (223 pages) for device md2
>> > Dec 25 16:15:48 cygnus kernel: [    1.488465] md2: detected capacity change from 0 to 1913403736064
>> > Dec 25 16:15:48 cygnus kernel: [    1.488942]  md2: unknown partition table
>> > Dec 25 16:15:48 cygnus kernel: [    1.597375] PM: Starting manual resume from disk
>> > Dec 25 16:15:48 cygnus kernel: [    1.650832] kjournald starting.  Commit interval 5 seconds
>>
>>
>



-- 
Best regards,
[COOLCOLD-RIPN]


* Re: Is It Hopeless?
  2010-12-26 20:19   ` Carl Cook
  2010-12-26 20:19     ` CoolCold
@ 2010-12-26 20:33     ` Neil Brown
  2010-12-26 21:14       ` Berkey B Walker
  2010-12-27  0:06       ` Carl Cook
  1 sibling, 2 replies; 22+ messages in thread
From: Neil Brown @ 2010-12-26 20:33 UTC (permalink / raw)
  To: Carl Cook; +Cc: linux-raid

On Sun, 26 Dec 2010 12:19:41 -0800 Carl Cook <CACook@quantum-sci.com> wrote:

> 
> My God, it didn't have the command fsck.jfs, so I reinstalled jfsutils.  Now the array mounts.
> 
> I don't understand it.  I thought the JFS driver is in the kernel?

Like many parts of Linux, most of JFS is in the kernel, but some support
tools are separate.   Most filesystems have a separate mkfs.$FSTYPE and
fsck.$FSTYPE.  ALSA (the sound subsystem) has alsamixer etc.  md/RAID has
mdadm, nfs has nfs-utils, and so on.  Each of these is primarily a kernel
subsystem, but needs user-space tools to configure and manage it.
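
A quick way to see both halves for any given subsystem (a sketch only;
jfs is just the example, and the tool name assumes jfsutils is packaged
under that name on your distro):

  grep -w jfs /proc/filesystems || modprobe jfs   # kernel driver present?
  command -v fsck.jfs                             # userspace tool installed?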

But the important thing is that you have your data back, preparing you for a
Happy New Year!

NeilBrown


> 
> 
> On Sun 26 December 2010 12:11:56 Neil Brown wrote:
> > On Sun, 26 Dec 2010 10:19:55 -0800 Carl Cook <CACook@quantum-sci.com> wrote:
> > 
> > > I went in to turn on my home theater system today, and found a blank screen.  I rebooted and it would not mount /home, which is a 4TB RAID10 array with every movie and show I've recorded over the past two years.  I try to mount it manually, and "wrong fs or bad superblock".  The array is getting set up fine, but the filesystem seems to be destroyed.  
> > > 
> > > Unbelievable.  This isn't supposed to happen.  It happened once before when I wasn't using RAID, but that was the BTRFS filesystem and I blamed it for being pre-release.  But now it's RAID10 with JFS.
> > 
> > None of your logs show anything about jfs....
> > 
> > What does
> >   fsck.jfs /dev/md2
> > report?
> > What about
> >   mount -t jfs /dev/md2 /home
> > 
> > ??
> > 
> > NeilBrown
> > 
> > 
> > > 
> > > The only sign of trouble:
> > > Dec 25 16:14:56 cygnus shutdown[2180]: shutting down for system reboot
> > > Dec 25 16:14:58 cygnus kernel: [16607.840197] md: md2 stopped.
> > > Dec 25 16:14:58 cygnus kernel: [16607.840210] md: unbind<sdb3>
> > > Dec 25 16:14:58 cygnus kernel: [16607.852029] md: export_rdev(sdb3)
> > > Dec 25 16:14:58 cygnus kernel: [16607.852083] md: unbind<sdc3>
> > > Dec 25 16:14:58 cygnus kernel: [16607.864031] md: export_rdev(sdc3)
> > > Dec 25 16:14:58 cygnus kernel: [16607.864092] md2: detected capacity change from 1913403736064 to 0
> > > Dec 25 16:15:00 cygnus kernel: Kernel logging (proc) stopped.
> > > 
> > > Reboot:
> > > Dec 25 16:15:48 cygnus kernel: [    1.156657] Uniform CD-ROM driver Revision: 3.20
> > > Dec 25 16:15:48 cygnus kernel: [    1.464298] md: raid10 personality registered for level 10
> > > Dec 25 16:15:48 cygnus kernel: [    1.469307] md: md2 stopped.
> > > Dec 25 16:15:48 cygnus kernel: [    1.470540] md: bind<sdc3>
> > > Dec 25 16:15:48 cygnus kernel: [    1.470642] md: bind<sdb3>
> > > Dec 25 16:15:48 cygnus kernel: [    1.471381] raid10: raid set md2 active with 2 out of 2 devices
> > > Dec 25 16:15:48 cygnus kernel: [    1.476048] md2: bitmap initialized from disk: read 14/14 pags, set 0 bits
> > > Dec 25 16:15:48 cygnus kernel: [    1.476050] created bitmap (223 pages) for device md2
> > > Dec 25 16:15:48 cygnus kernel: [    1.488465] md2: detected capacity change from 0 to 1913403736064
> > > Dec 25 16:15:48 cygnus kernel: [    1.488942]  md2: unknown partition table
> > > Dec 25 16:15:48 cygnus kernel: [    1.597375] PM: Starting manual resume from disk
> > > Dec 25 16:15:48 cygnus kernel: [    1.650832] kjournald starting.  Commit interval 5 seconds
> > 
> > 



* Re: Is It Hopeless?
  2010-12-26 20:33     ` Neil Brown
@ 2010-12-26 21:14       ` Berkey B Walker
  2010-12-27  0:06       ` Carl Cook
  1 sibling, 0 replies; 22+ messages in thread
From: Berkey B Walker @ 2010-12-26 21:14 UTC (permalink / raw)
  To: Neil Brown; +Cc: Carl Cook, linux-raid

Excellent save!!!  The OP might want to continue the "Giving Season" by 
giving himself a brand new system backup.

Neil Brown wrote:
> On Sun, 26 Dec 2010 12:19:41 -0800 Carl Cook<CACook@quantum-sci.com>  wrote:
>
>    
>> My God, it didn't have the command fsck.jfs, so I reinstalled jfsutils.  Now the array mounts.
>>
>> I don't understand it.  I thought the JFS driver is in the kernel?
>>      
> Like many parts of Linux, most of JFS is in the kernel, but some support
> tools are separate.   Most filesystems have a separate mkfs.$FSTYPE and
> fsck.$FSTYPE.  ALSA (the sound subsystem) has alsamixer etc.  md/RAID has
> mdadm, nfs has nfs-utils, and so on.  Each of these is primarily a kernel
> subsystem, but needs user-space tools to configure and manage it.
>
> But the important thing is that you have your data back, preparing you for a
> Happy New Year!
>
> NeilBrown
>
>
>    
>>
>> On Sun 26 December 2010 12:11:56 Neil Brown wrote:
>>      
>>> On Sun, 26 Dec 2010 10:19:55 -0800 Carl Cook<CACook@quantum-sci.com>  wrote:
>>>
>>>        
>>>> I went in to turn on my home theater system today, and found a blank screen.  I rebooted and it would not mount /home, which is a 4TB RAID10 array with every movie and show I've recorded over the past two years.  I try to mount it manually, and "wrong fs or bad superblock".  The array is getting set up fine, but the filesystem seems to be destroyed.
>>>>
>>>> Unbelievable.  This isn't supposed to happen.  It happened once before when I wasn't using RAID, but that was the BTRFS filesystem and I blamed it for being pre-release.  But now it's RAID10 with JFS.
>>>>          
>>> None of your logs show anything about jfs....
>>>
>>> What does
>>>    fsck.jfs /dev/md2
>>> report?
>>> What about
>>>    mount -t jfs /dev/md2 /home
>>>
>>> ??
>>>
>>> NeilBrown
>>>
>>>
>>>        
>>>> The only sign of trouble:
>>>> Dec 25 16:14:56 cygnus shutdown[2180]: shutting down for system reboot
>>>> Dec 25 16:14:58 cygnus kernel: [16607.840197] md: md2 stopped.
>>>> Dec 25 16:14:58 cygnus kernel: [16607.840210] md: unbind<sdb3>
>>>> Dec 25 16:14:58 cygnus kernel: [16607.852029] md: export_rdev(sdb3)
>>>> Dec 25 16:14:58 cygnus kernel: [16607.852083] md: unbind<sdc3>
>>>> Dec 25 16:14:58 cygnus kernel: [16607.864031] md: export_rdev(sdc3)
>>>> Dec 25 16:14:58 cygnus kernel: [16607.864092] md2: detected capacity change from 1913403736064 to 0
>>>> Dec 25 16:15:00 cygnus kernel: Kernel logging (proc) stopped.
>>>>
>>>> Reboot:
>>>> Dec 25 16:15:48 cygnus kernel: [    1.156657] Uniform CD-ROM driver Revision: 3.20
>>>> Dec 25 16:15:48 cygnus kernel: [    1.464298] md: raid10 personality registered for level 10
>>>> Dec 25 16:15:48 cygnus kernel: [    1.469307] md: md2 stopped.
>>>> Dec 25 16:15:48 cygnus kernel: [    1.470540] md: bind<sdc3>
>>>> Dec 25 16:15:48 cygnus kernel: [    1.470642] md: bind<sdb3>
>>>> Dec 25 16:15:48 cygnus kernel: [    1.471381] raid10: raid set md2 active with 2 out of 2 devices
>>>> Dec 25 16:15:48 cygnus kernel: [    1.476048] md2: bitmap initialized from disk: read 14/14 pags, set 0 bits
>>>> Dec 25 16:15:48 cygnus kernel: [    1.476050] created bitmap (223 pages) for device md2
>>>> Dec 25 16:15:48 cygnus kernel: [    1.488465] md2: detected capacity change from 0 to 1913403736064
>>>> Dec 25 16:15:48 cygnus kernel: [    1.488942]  md2: unknown partition table
>>>> Dec 25 16:15:48 cygnus kernel: [    1.597375] PM: Starting manual resume from disk
>>>> Dec 25 16:15:48 cygnus kernel: [    1.650832] kjournald starting.  Commit interval 5 seconds
>>>>          
>>>
>>>        
>>      
>
>    


* Re: Is It Hopeless?
  2010-12-26 20:33     ` Neil Brown
  2010-12-26 21:14       ` Berkey B Walker
@ 2010-12-27  0:06       ` Carl Cook
  2010-12-27  4:45         ` Stan Hoeppner
  2010-12-27  5:35         ` Phil Turmel
  1 sibling, 2 replies; 22+ messages in thread
From: Carl Cook @ 2010-12-27  0:06 UTC (permalink / raw)
  To: linux-raid

On Sun 26 December 2010 12:33:30 Neil Brown wrote:
> But the important thing is that you have your data back, preparing you for a
> Happy New Year!
> NeilBrown

Indeed.  Thank you Neil, you saved me (along with my not touching anything).


On Sun 26 December 2010 13:14:32 you wrote:
> Excellent save!!!  The OP might want to continue the "Giving Season" by 
> giving himself a brand new system backup.

Actually I have much too much data to back anything up, so I've been stalling for over a year on building a backup server.  It's fixin' to happen now, you betcha.  A cube case with ITX and a BTRFS array the same size as my RAID10.  NoMachine NX over SSH for admin.  It'll be in a far corner of the garage down low, so if there's a fire or theft I don't lose my data.  I'll sync it with the main over GB Ethernet, at some interval, in some way (suggestions?), but keep it offline most of the time.  All my systems are named after constellations, so this one will be called "Gemini" (the twin).




* Re: Is It Hopeless?
  2010-12-27  0:06       ` Carl Cook
@ 2010-12-27  4:45         ` Stan Hoeppner
  2010-12-27  5:35         ` Phil Turmel
  1 sibling, 0 replies; 22+ messages in thread
From: Stan Hoeppner @ 2010-12-27  4:45 UTC (permalink / raw)
  To: linux-raid

Carl Cook put forth on 12/26/2010 6:06 PM:
> On Sun 26 December 2010 12:33:30 Neil Brown wrote:
>> But the important thing is that you have your data back, preparing you for a
>> Happy New Year!
>> NeilBrown
> 
> Indeed.  Thank you Neil, you saved me (along with my not touching anything).
> 
> 
> On Sun 26 December 2010 13:14:32 you wrote:
>> Excellent save!!!  The OP might want to continue the "Giving Season" by 
>> giving himself a brand new system backup.
> 
> Actually I have much too much data to back anything up

 <snip>

Every time I read/hear this I cringe.  If that is the case your data is
worthless to begin with so just delete it all right now.  You are
literally saying the same thing with your statement.  The difference is
that I _KNOW_ your drives will fail, or you'll lose an array due to
corruption, etc.  Your HTPC storage system will fail.  It's not an _IF_
but a _WHEN_ issue.  The question then becomes, what is the best
backup/restore strategy to fit HTPC needs.  Build another system with
similar technology and you have the same failure modes and risks as the
first.

Spinning Rust Disks (SRDs) are not a suitable long term backup/restore
solution.  What happens when your disk-to-disk backup server solution
drops an MD array for no reason such as just happened, _during_ a
restore operation?  Or you suffer a disk failure on the backup server
during a restore operation (which is very common today)?  Will your
backup server contain a 4 x 2 TB disk RAID 10 set?

I'd suggest tape as a better solution than D2D in the HTPC case,
primarily based on cost and availability of library and media, and the
fact that the disaster recovery procedure is much easier and much more
straightforward:

8-slot LTO-2 autoloader

http://www.msrcglobal.com/p-216-af203a.aspx?gclid=CI789Mm_i6YCFQTrKgodIg0pnA&
http://h18000.www1.hp.com/products/quickspecs/11841_div/11841_div.HTML
Ultrium 448 drive
3.2 TB compressed max per library
U160 LVD/SE SCSI interface
172 GB/hour transfer rate
"desktop" model
$650 USD

http://www.newegg.com/Product/Product.aspx?Item=N82E16816118057
$90 USD

http://www.newegg.com/Product/Product.aspx?Item=N82E16840999118
8 x $22 = $176 USD

Total = $916 + shipping

Using the correct backup strategy this should easily meet your needs.
The 8 tapes in the library will handle 75% of your RAW level 10 mdraid
device capacity.  Once a filesystem is laid down, and you take overhead
concerns into account, you won't be putting more than 3.2 TB of data on
it anyway (or, at least, you SHOULDN'T be doing so).

The total cost should be similar to a disk based backup server but the
reliability and ease of restore should be better.  There's no beating a
tape library as a long term backup solution, especially for data that
doesn't change often, such as HTPC files.  10 tapes for 4 TB of space is
$220.  How much does a 4 TB SATA drive cost?  Or 2 x 2TB drives.  Quite
a bit more.  You can store spare tapes more easily than spare drives,
and the tapes take almost zero configuration when adding capacity to the
backup system.  Adding drives, especially if you don't buy a large
chassis with hot swap bays up front, is much more a PITA.

Keep in mind that using tape allows essentially unlimited backup
expansion, whereas you are severely limited with a backup server for
your HTPC unless you buy a large box upfront, which nobody doing HTPC
wants.  Each time you add larger/more disks to the HTPC, you have to do
the same for the backup server.  With tape, you simply swap tape
inventory and create a modified or new backup schedule.

Many people outside the corporate/government data center completely
ignore tape today as a backup solution.  Many times at their peril.

-- 
Stan


* Re: Is It Hopeless?
  2010-12-27  0:06       ` Carl Cook
  2010-12-27  4:45         ` Stan Hoeppner
@ 2010-12-27  5:35         ` Phil Turmel
  2010-12-27 13:10           ` Carl Cook
  1 sibling, 1 reply; 22+ messages in thread
From: Phil Turmel @ 2010-12-27  5:35 UTC (permalink / raw)
  To: Carl Cook; +Cc: linux-raid

On 12/26/2010 07:06 PM, Carl Cook wrote:
> On Sun 26 December 2010 12:33:30 Neil Brown wrote:
>> But the important thing is that you have your data back, preparing
>> you for a Happy New Year!
>> NeilBrown
> Indeed.  Thank you Neil, you saved me (along with my not touching
> anything).

> On Sun 26 December 2010 13:14:32 you wrote:
>> Excellent save!!!  The OP might want to continue the "Giving
>> Season" by giving himself a brand new system backup.

> Actually I have much too much data to back anything up, so I've been
> stalling for over a year on building a backup server.  It's fixin' to
> happen now, you betcha.  A cube case with ITX and a BTRFS array the
> same size as my RAID10.  NoMachine NX over SSH for admin.  It'll be
> in a far corner of the garage down low, so if there's a fire or theft
> I don't lose my data.  I'll sync it with the main over GB Ethernet, at
> some interval, in some way (suggestions?), but keep it offline most of
> the time.  All my systems are named after constellations, so this one
> will be called "Gemini" (the twin).

(Heh.  I name my systems after constellations, too.)

I, too, use an alternate server to back up to, but my daily changes are small enough to rsync over the net.  To reduce the chance of double failures in my arrays, a cron job kicks off a "resync" weekly.  I also use a super-cheap SATA hot-swap bay (carriage-less) to support weekly and monthly rotations to a fire safe at a third site.  Although Stan is right that tape is the best choice for really large backups, I find the commodity 1.5T drives hard to beat for convenience.
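
(A minimal sketch of that kind of rsync-over-the-net job, with a made-up
host name and paths:

  #!/bin/sh
  # nightly push of /home to the backup box; -a preserves attributes,
  # -H keeps hard links, --delete mirrors deletions on the far side
  rsync -aH --delete /home/ backupbox:/backup/home/
)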

Just my $0.02.

HTH,

Phil


* Re: Is It Hopeless?
  2010-12-27  5:35         ` Phil Turmel
@ 2010-12-27 13:10           ` Carl Cook
  2010-12-27 15:04             ` Phil Turmel
  2010-12-27 16:37             ` Stan Hoeppner
  0 siblings, 2 replies; 22+ messages in thread
From: Carl Cook @ 2010-12-27 13:10 UTC (permalink / raw)
  To: linux-raid

> Every time I read/hear this I cringe.  If that is the case your data is worthless to begin with so just delete it all right now.  You are literally saying the same thing with your statement. 

No, I'm saying that the MTBF of disk drives is astronomical, and the likelihood of a fail during backup is miniscule.  MTBF of tape is hundreds of times sooner.  Not to mention that tape would take forever, and require constant tending.  This is why it's not used anymore.  My storage is 2TB now, but my library is growing all the time.  Backing to off-line disk storage is the only practical way now, given the extremely low cost and high capacity and speed.  Each WD 2TB drive is $99 from Newegg!  Astounding.  Thanks for the input though.


On Sun 26 December 2010 21:35:03 Phil Turmel wrote:
> I, too, use an alternate server to back up to, but my daily changes are small enough to rsync over the net.  To reduce the chance of double failures in my arrays, a cron job kicks off a "resync" weekly.

Can you please give some detail on your sync scripts?  I've never done this and am not a programmer, but I'm a pretty good shade-tree admin.




* Re: Is It Hopeless?
  2010-12-27 13:10           ` Carl Cook
@ 2010-12-27 15:04             ` Phil Turmel
  2010-12-27 21:34               ` Brad Campbell
  2010-12-27 16:37             ` Stan Hoeppner
  1 sibling, 1 reply; 22+ messages in thread
From: Phil Turmel @ 2010-12-27 15:04 UTC (permalink / raw)
  To: Carl Cook; +Cc: linux-raid


On 12/27/2010 08:10 AM, Carl Cook wrote:
>> Every time I read/hear this I cringe.  If that is the case your data is worthless to begin with so just delete it all right now.  You are literally saying the same thing with your statement. 
> 
> No, I'm saying that the MTBF of disk drives is astronomical, and the likelihood of a fail during backup is miniscule.  MTBF of tape is hundreds of times sooner.  Not to mention that tape would take forever, and require constant tending.  This is why it's not used anymore.  My storage is 2TB now, but my library is growing all the time.  Backing to off-line disk storage is the only practical way now, given the extremely low cost and high capacity and speed.  Each WD 2TB drive is $99 from Newegg!  Astounding.  Thanks for the input though.
> 
> 
> On Sun 26 December 2010 21:35:03 Phil Turmel wrote:
>> I, too, use an alternate server to back up to, but my daily changes are small enough to rsync over the net.  To reduce the chance of double failures in my arrays, a cron job kicks off a "resync" weekly.
> 
> Can you please give some detail on your sync scripts?  I've never done this and am not a programmer, but I'm a pretty good shade-tree admin.

Sure.  Attached.  Note that the script doesn't set the sysctls for speed limits...  The defaults are fine for me.

HTH,

Phil

[-- Attachment #2: raidrepair --]
[-- Type: text/plain, Size: 180 bytes --]

#!/bin/bash
#
# Weekly Cron Job to initiate RAID scan/repair cycles

for x in /sys/block/md*/md/sync_action ; do
	echo repair >$x
done

# Process occurs in background kernel tasks
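
# The speed-limit sysctls mentioned in the message above are the md
# resync throttles, left at their kernel defaults here; for example
# (values are arbitrary, in KB/s):
#   sysctl -w dev.raid.speed_limit_min=10000
#   sysctl -w dev.raid.speed_limit_max=200000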


* Re: Is It Hopeless?
  2010-12-27 13:10           ` Carl Cook
  2010-12-27 15:04             ` Phil Turmel
@ 2010-12-27 16:37             ` Stan Hoeppner
  2010-12-28  1:36               ` Berkey B Walker
                                 ` (4 more replies)
  1 sibling, 5 replies; 22+ messages in thread
From: Stan Hoeppner @ 2010-12-27 16:37 UTC (permalink / raw)
  To: linux-raid

Carl Cook put forth on 12/27/2010 7:10 AM:
>> Every time I read/hear this I cringe.  If that is the case your data is worthless to begin with so just delete it all right now.  You are literally saying the same thing with your statement. 
> 
> No, I'm saying that the MTBF of disk drives is astronomical, 

Interesting statement.  You're arguing the reliability of these modern
giant drives, yet use RAID 10 instead of simple spanning or striping.  I
wonder why that is...

> and the likelihood of a fail during backup is miniscule.  

Failure during backup is of less concern.  Failure of the source
media/system during _restore_ is.  Read this thread and the previous
thread from Eli a few months prior.  It will open your eyes to the peril
of using a D2D backup system with 2TB WD Green drives.  His choice of
such, along with other poor choices based on acquisition cost, almost
cost him his job.  Everything is cheap until it costs you dearly.

http://comments.gmane.org/gmane.comp.file-systems.xfs.general/35555

Too many (especially younger) IT people _only_ consider up front
acquisition cost of systems and not long term support of such systems.
Total system cost _must_ include a reliable DRS (Disaster Recovery
System).  If you can't afford the DRS to go with a new system, then you
can't afford that system, and must downsize it or reduce its costs in
some way to allow inclusion of DRS.

 There is no free lunch.  Eli nearly lost his job over poor acquisition
and architecture choices.  In that thread he makes the same excuses you
do regarding his total storage size needs and his "budget for backup".
There is no such thing as "budget for backup".  DRS _must_ be included
in all acquisition costs.  If not, someone will pay dear consequences at
some point in time if the lost data has real value.  In Eli's case the
lost data was Ph.D. student research data.  If one student lost all his
data, he may likely have to redo an entire year of school.  Who pays for
that?  Who pays for his year of lost earnings since he can't (re)enter
the workforce at Ph.D. pay scale?  This snafu may cost a single Ph.D.
student, the university, or both, $200K or more depending on career field.

If they'd had a decent tape silo he'd have lost no data.

> MTBF of tape is hundreds of times sooner.  

Really?  Eli had those WD20EARS online in his D2D backup system for less
than 5 months.  LTO tape reliability is less than 5 months?  Show data
to back that argument up please.

Tape isn't perfect either, but based on my experience and reading that
of many many others, it's still better than D2D in many cases.  Also
note that tapes don't wholesale fail as disks do.  Bad spots on tape
cause some lost files, not _all_ the files, as is the case when a D2D
system fails during restore.

If a tape drive fails during restore, you don't lose all the backup
data.  You simply replace the drive and run the tapes through the new
drive.  If you have a multi-drive silo or library, you simply get a log
message of the drive failure, and your restore may simply be slower.
This depends on how you've setup parallelism in your silo.  Outside of
the supercomputing centers where large files are backed up in parallel
streams to multiple tapes/drives simultaneously ("tape RAID" or tape
striping) most organizations don't stripe this way.  They simply
schedule simultaneous backups of various servers each hitting a
different drive in the silo.  In this case if all other silo drives are
busy, then your restore will have to wait.  But, you'll get your system
restored.

> Not to mention that tape would take forever, and require constant tending.  

Eli made similar statements as well, and they're bogus.  Modern high cap
drives/tapes are quite speedy, especially if striped using the proper
library/silo management software and planning/architecture.  Some silos
can absorb streaming backups at rates much higher than midrange SAN
arrays, in the multiple GB/s range.  They're not cheap, but then, good
DRS solutions aren't. :)

The D2D vendors use this scare tactic often also.  Would you care to
explain this "constant tending"?

> This is why it's not used anymore.

Would you care to back this up with actual evidence?  Tape unit shipment
numbers are down and declining as more folks (making informed decisions,
or otherwise) move to D2D and cloud services, but tape isn't dead by any
stretch of the imagination.  The D2D vendors sure want you to think so.
 Helps them sell more units.  This is simply FUD spreading.

> My storage is 2TB now, but my library is growing all the time.  Backing to off-line disk storage is the only practical way now, given the extremely low cost and high capacity and speed.  Each WD 2TB drive is $99 from Newegg!  Astounding.  Thanks for the input though.

No, it's not the only practical methodology.  Are you not familiar with
"differential copying"?  It's the feature that makes rsync so useful, as
well as tape.  Once you have your first complete backup of that 2TB of
media files, you're only writing to tape anything that's changed.
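
A sketch of what that looks like with GNU tar (tape device name and paths
are assumptions, not a recommendation of any particular layout):

  # first run writes everything; later runs write only what changed
  tar --listed-incremental=/var/backups/media.snar -cf /dev/nst0 /home/media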

At $99 you'll have $396 of drives in your backup server.  Add the cost
of a case ($50), PSU ($30), mobo ($80), CPU ($100), DIMMs ($30), optical
drive ($20), did I omit anything?  You're now at around $700.

You now have a second system requiring "constant tending".  You also
have 9 components that could fail during restore.  With a tape drive you
have one.  Calculate the total MTBF of those 9 components using the
inverse probability rule and compare that to the MTBF of a single HP
LTO-2 drive?
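
(For illustration only, assuming a nominal 300,000-hour MTBF for each of
those parts: nine of them in series gives roughly 1 / (9 x 1/300,000),
i.e. about 33,000 hours, because the individual failure rates add.)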

Again, you're like a deer in the headlights mesmerized by initial
acquisition cost.  The tape solution I mentioned has a ~$200 greater
acquisition cost, yet its reliability is greater, and it is purpose
built for the task at hand.   Your DIY D2D server is not.

Please keep in mind Carl I'm not necessarily speaking directly to you,
or singling you out on this issue.  This list has a wider audience.
Many sites archive this list, and those Googling the subject need good
information on this subject.  The prevailing wind is D2D, but that
doesn't make it God's gift to DRS.  As I noted earlier, many folks are
being bitten badly by the mindset you've demonstrated in this thread.

D2D and tape both have their place, and both can do some jobs equally
well at the same or varying costs.  D2D is better for some scenarios in
some environments.  Tape is the _ONLY_ solution for others, and
especially so for some government and business scenarios that require
WORM capability for legal compliance.  There are few, if any, disk based
solutions that can guarantee WORM archiving.

-- 
Stan


* Re: Is It Hopeless?
  2010-12-27 15:04             ` Phil Turmel
@ 2010-12-27 21:34               ` Brad Campbell
  0 siblings, 0 replies; 22+ messages in thread
From: Brad Campbell @ 2010-12-27 21:34 UTC (permalink / raw)
  To: Phil Turmel; +Cc: Carl Cook, linux-raid

On 27/12/10 23:04, Phil Turmel wrote:
> On 12/27/2010 08:10 AM, Carl Cook wrote:
>>> Every time I read/hear this I cringe.  If that is the case your data is worthless to begin with so just delete it all right now.  You are literally saying the same thing with your statement.
>>
>> No, I'm saying that the MTBF of disk drives is astronomical, and the likelihood of a fail during backup is miniscule.  MTBF of tape is hundreds of times sooner.  Not to mention that tape would take forever, and require constant tending.  This is why it's not used anymore.  My storage is 2TB now, but my library is growing all the time.  Backing to off-line disk storage is the only practical way now, given the extremely low cost and high capacity and speed.  Each WD 2TB drive is $99 from Newegg!  Astounding.  Thanks for the input though.
>>
>>
>> On Sun 26 December 2010 21:35:03 Phil Turmel wrote:
>>> I, too, use an alternate server to back up to, but my daily changes are small enough to rsync over the net.  To reduce the chance of double failures in my arrays, a cron job kicks off a "resync" weekly.
>>
>> Can you please give some detail on your sync scripts?  I've never done this and am not a programmer, but I'm a pretty good shade-tree admin.
>
> Sure.  Attached.  Note that the script doesn't set the sysctls for speed limits...  The defaults are fine for me.

Hrm... I used to do this too, until a silent corruption issue with a controller trashed 8TB of data.
Now, if I'd done a "check" instead of a repair, and been alerted to the fact that the controller was
corrupting data rather than blindly overwriting the parity blocks, I'd have had a chance of saving
the array.

Checksum and test regularly!
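
(A sketch of the "check" variant of that cron job, which only counts
mismatches instead of rewriting anything:

  for x in /sys/block/md*/md/sync_action ; do
      echo check >$x
  done
  # once the pass finishes, a non-zero count means the copies disagree:
  cat /sys/block/md*/md/mismatch_cnt
)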

I do have a backup regime, however I do a full rotation every 3 months and it was approximately 3 
months and 5 days before I noticed I really had a problem.

Brad
-- 
Dolphins are so intelligent that within a few weeks they can
train Americans to stand at the edge of the pool and throw them
fish.


* Re: Is It Hopeless?
  2010-12-27 16:37             ` Stan Hoeppner
@ 2010-12-28  1:36               ` Berkey B Walker
  2010-12-28  4:16               ` Carl Cook
                                 ` (3 subsequent siblings)
  4 siblings, 0 replies; 22+ messages in thread
From: Berkey B Walker @ 2010-12-28  1:36 UTC (permalink / raw)
  To: Stan Hoeppner; +Cc: linux-raid

Whereas a tape and drive do not suffer from the "everything is a single
point of failure" problem of a permanently _sealed_ storage disc/drive,
tape is definitely NOT "fool proof", nor proof against the lazy, everyday
operator.  The operator and system requirements for use and maintenance
are still, basically, what they were half a century ago.  Imagine if the
operator were required to retension his/her RAID disks and replace them
after so many uses.  Money drives it all, pushing street-level IT techs
and SysAdmins toward the commodity level.  This is not intended to point
at the Original Poster, but at the market and marketing of today.
berk

Stan Hoeppner wrote:
> Carl Cook put forth on 12/27/2010 7:10 AM:
>    
>>> Every time I read/hear this I cringe.  If that is the case your data is worthless to begin with so just delete it all right now.  You are literally saying the same thing with your statement.
>>>        
>> No, I'm saying that the MTBF of disk drives is astronomical,
>>      
> Interesting statement.  You're arguing the reliability of these modern
> giant drives, yet use RAID 10 instead of simple spanning or striping.  I
> wonder why that is...
>
>    
>> and the likelihood of a fail during backup is miniscule.
>>      
> Failure during backup is of less concern.  Failure of the source
> media/system during _restore_ is.  Read this thread and the previous
> thread from Eli a few months prior.  It will open your eyes to the peril
> of using a D2D backup system with 2TB WD Green drives.  His choice of
> such, along with other poor choices based on acquisition cost, almost
> cost him his job.  Everything is cheap until it costs you dearly.
>
> http://comments.gmane.org/gmane.comp.file-systems.xfs.general/35555
>
> Too many (especially younger) IT people _only_ consider up front
> acquisition cost of systems and not long term support of such systems.
> Total system cost _must_ include a reliable DRS (Disaster Recovery
> System).  If you can't afford the DRS to go with a new system, then you
> can't afford that system, and must downsize it or reduce its costs in
> some way to allow inclusion of DRS.
>
>   There is no free lunch.  Eli nearly lost his job over poor acquisition
> and architecture choices.  In that thread he makes the same excuses you
> do regarding his total storage size needs and his "budget for backup".
> There is no such thing as "budget for backup".  DRS _must_ be included
> in all acquisition costs.  If not, someone will pay dear consequences at
> some point in time if the lost data has real value.  In Eli's case the
> lost data was Ph.D. student research data.  If one student lost all his
> data, he may likely have to redo an entire year of school.  Who pays for
> that?  Who pays for his year of lost earnings since he can't (re)enter
> the workforce at Ph.D. pay scale?  This snafu may cost a single Ph.D.
> student, the university, or both, $200K or more depending on career field.
>
> If they'd had a decent tape silo he'd have lost no data.
>
>    
>> MTBF of tape is hundreds of times sooner.
>>      
> Really?  Eli had those WD20EARS online in his D2D backup system for less
> than 5 months.  LTO tape reliability is less than 5 months?  Show data
> to back that argument up please.
>
> Tape isn't perfect either, but based on my experience and reading that
> of many many others, it's still better than D2D in many cases.  Also
> note that tapes don't wholesale fail as disks do.  Bad spots on tape
> cause some lost files, not _all_ the files, as is the case when a D2D
> system fails during restore.
>
> If a tape drive fails during restore, you don't lose all the backup
> data.  You simply replace the drive and run the tapes through the new
> drive.  If you have a multi-drive silo or library, you simply get a log
> message of the drive failure, and your restore may simply be slower.
> This depends on how you've setup parallelism in your silo.  Outside of
> the supercomputing centers where large files are backed up in parallel
> streams to multiple tapes/drives simultaneously ("tape RAID" or tape
> striping) most organizations don't stripe this way.  They simply
> schedule simultaneous backups of various servers each hitting a
> different drive in the silo.  In this case if all other silo drives are
> busy, then your restore will have to wait.  But, you'll get your system
> restored.
>
>    
>> Not to mention that tape would take forever, and require constant tending.
>>      
> Eli made similar statements as well, and they're bogus.  Modern high cap
> drives/tapes are quite speedy, especially if striped using the proper
> library/silo management software and planning/architecture.  Some silos
> can absorb streaming backups at rates much higher than midrange SAN
> arrays, in the multiple GB/s range.  They're not cheap, but then, good
> DRS solutions aren't. :)
>
> The D2D vendors use this scare tactic often also.  Would you care to
> explain this "constant tending"?
>
>    
>> This is why it's not used anymore.
>>      
> Would you care to back this up with actual evidence?  Tape unit shipment
> numbers are down and declining as more folks (making informed decisions,
> or otherwise) move to D2D and cloud services, but tape isn't dead by any
> stretch of the imagination.  The D2D vendors sure want you to think so.
>   Helps them sell more units.  This is simply FUD spreading.
>
>    
>> My storage is 2TB now, but my library is growing all the time.  Backing to off-line disk storage is the only practical way now, given the extremely low cost and high capacity and speed.  Each WD 2TB drive is $99 from Newegg!  Astounding.  Thanks for the input though.
>>      
> No, it's not the only practical methodology.  Are you not familiar with
> "differential copying"?  It's the feature that makes rsync so useful, as
> well as tape.  Once you have your first complete backup of that 2TB of
> media files, you're only writing to tape anything that's changed.
>
> At $99 you'll have $396 of drives in your backup server.  Add the cost
> of a case ($50), PSU ($30), mobo ($80), CPU ($100), DIMMs ($30), optical
> drive ($20), did I omit anything?  You're now at around $700.
>
> You now have a second system requiring "constant tending".  You also
> have 9 components that could fail during restore.  With a tape drive you
> have one.  Calculate the total MTBF of those 9 components using the
> inverse probability rule and compare that to the MTBF of a single HP
> LTO-2 drive?
>
> Again, you're like a deer in the headlights mesmerized by initial
> acquisition cost.  The tape solution I mentioned has a ~$200 greater
> acquisition cost, yet its reliability is greater, and it is purpose
> built for the task at hand.   Your DIY D2D server is not.
>
> Please keep in mind Carl I'm not necessarily speaking directly to you,
> or singling you out on this issue.  This list has a wider audience.
> Many sites archive this list, and those Googling the subject need good
> information on this subject.  The prevailing wind is D2D, but that
> doesn't make it God's gift to DRS.  As I noted earlier, many folks are
> being bitten badly by the mindset you've demonstrated in this thread.
>
> D2D and tape both have their place, and both can do some jobs equally
> well at the same or varying costs.  D2D is better for some scenarios in
> some environments.  Tape is the _ONLY_ solution for others, and
> especially so for some government and business scenarios that require
> WORM capability for legal compliance.  There are few, if any, disk based
> solutions that can guarantee WORM archiving.
>
>    


* Re: Is It Hopeless?
  2010-12-27 16:37             ` Stan Hoeppner
  2010-12-28  1:36               ` Berkey B Walker
@ 2010-12-28  4:16               ` Carl Cook
  2010-12-29  3:04                 ` Stan Hoeppner
  2011-01-04 20:03               ` Phillip Susi
                                 ` (2 subsequent siblings)
  4 siblings, 1 reply; 22+ messages in thread
From: Carl Cook @ 2010-12-28  4:16 UTC (permalink / raw)
  To: linux-raid

On Mon 27 December 2010 08:37:25 you wrote:
> Too many (especially younger) IT people _only_ consider up front
> acquisition cost of systems and not long term support of such systems.
> Total system cost _must_ include a reliable DRS (Disaster Recovery
> System).  If you can't afford the DRS to go with a new system, then you
> can't afford that system, and must downsize it or reduce its costs in
> some way to allow inclusion of DRS.
> 
>  There is no free lunch.  Eli nearly lost his job over poor acquisition
> and architecture choices.  In that thread he makes the same excuses you
> do regarding his total storage size needs and his "budget for backup".
> There is no such thing as "budget for backup".  DRS _must_ be included
> in all acquisition costs.  If not, someone will pay dear consequences at
> some point in time if the lost data has real value.  In Eli's case the
> lost data was Ph.D. student research data.  If one student lost all his
> data, he may likely have to redo an entire year of school.  Who pays for
> that?  Who pays for his year of lost earnings since he can't (re)enter
> the workforce at Ph.D. pay scale?  This snafu may cost a single Ph.D.
> student, the university, or both, $200K or more depending on career field.

All righty then, we are slipping close to hysteria.

Please try not to worry about me...  I am switching to BTRFS and a backup server for my frickin' movies, and will be fine.




* Re: Is It Hopeless?
  2010-12-28  4:16               ` Carl Cook
@ 2010-12-29  3:04                 ` Stan Hoeppner
  2010-12-29  5:34                   ` Roman Mamedov
  0 siblings, 1 reply; 22+ messages in thread
From: Stan Hoeppner @ 2010-12-29  3:04 UTC (permalink / raw)
  To: linux-raid

Carl Cook put forth on 12/27/2010 10:16 PM:

> Please try not to worry about me...  I am switching to BTRFS and a backup server for my frickin' movies, and will be fine.

BTRFS?  You must be pulling our chains.  It's an experimental filesystem
for Pete's sake.  A month ago it still didn't have a check/repair tool.
 Does it yet?

-- 
Stan


* Re: Is It Hopeless?
  2010-12-29  3:04                 ` Stan Hoeppner
@ 2010-12-29  5:34                   ` Roman Mamedov
  0 siblings, 0 replies; 22+ messages in thread
From: Roman Mamedov @ 2010-12-29  5:34 UTC (permalink / raw)
  To: Stan Hoeppner; +Cc: linux-raid


On Tue, 28 Dec 2010 21:04:10 -0600
Stan Hoeppner <stan@hardwarefreak.com> wrote:

> Carl Cook put forth on 12/27/2010 10:16 PM:
> 
> > Please try not to worry about me...  I am switching to BTRFS and a backup
> > server for my frickin' movies, and will be fine.
> 
> BTRFS?  You must be pulling our chains.  It's an experimental filesystem
> for Pete's sake.  A month ago it still didn't have a check/repair tool.
>  Does it yet?

AFAIK it does have a check tool, but not a repair tool yet :)

To be fair, it can be argued that if you have a backup [server], this doesn't
matter too much.

But personally I find btrfs to be good not for primary storage yet, but
precisely for the backup storage mentioned here: its snapshot feature allows
(fast) incremental backups, while at the same time keeping the full state saved
at each backup step instantly accessible with no special software required
(snapshots of older states just sit there looking like plain directories in the
filesystem).
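
A minimal sketch of that pattern (host name, mount point and subvolume
layout are just an example):

  # /backup is a btrfs mount, /backup/current a subvolume rsync keeps updated
  rsync -a --delete mainbox:/home/ /backup/current/
  btrfs subvolume snapshot /backup/current /backup/snap-$(date +%F)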

-- 
With respect,
Roman



* Re: Is It Hopeless?
  2010-12-27 16:37             ` Stan Hoeppner
  2010-12-28  1:36               ` Berkey B Walker
  2010-12-28  4:16               ` Carl Cook
@ 2011-01-04 20:03               ` Phillip Susi
  2011-01-05 21:19                 ` Leslie Rhorer
  2011-01-05 14:45               ` Hank Barta
  2011-01-05 23:07               ` Leslie Rhorer
  4 siblings, 1 reply; 22+ messages in thread
From: Phillip Susi @ 2011-01-04 20:03 UTC (permalink / raw)
  To: Stan Hoeppner; +Cc: linux-raid

On 12/27/2010 11:37 AM, Stan Hoeppner wrote:
> If they'd had a decent tape silo he'd have lost no data.

Unless the tape fails, which they often do.

>> MTBF of tape is hundreds of times sooner.  
> 
> Really?  Eli had those WD20EARS online in his D2D backup system for less
> than 5 months.  LTO tape reliability is less than 5 months?  Show data
> to back that argument up please.

Those particular drives seem to have a rather high infant mortality
rate.  That does not change the fact that modern drives are rated with
MTBF of 300,000+ hours, which is a heck of a lot more than a tape.

> Tape isn't perfect either, but based on my experience and reading that
> of many many others, it's still better than D2D is many cases.  Also
> note that tapes don't wholesale fail as disks do.  Bad spots on tape
> cause some lost files, not _all_ the files, as is the case when a D2D
> system fails during restore.

Not necessarily.  Both systems can fail partially or totally, though
total failure is probably more likely with disks.

> If a tape drive fails during restore, you don't lose all the backup
> data.  You simply replace the drive and run the tapes through the new
> drive.  If you have a multi-drive silo or library, you simply get a log

It isn't the drive that is the problem; it's the tape.

> At $99 you'll have $396 of drives in your backup server.  Add the cost
> of a case ($50), PSU ($30), mobo ($80), CPU ($100), DIMMs ($30), optical
> drive ($20), did I omit anything?  You're now at around $700.

Or you can just spend $30 on an eSATA drive dock instead of building a
dedicated backup server.  Then you are looking at $430 to back up 4 TB of
data.  An LTO Ultrium 3 tape drive looks like it's nearly two grand, and
only holds 400 GB per tape at $30 a pop, so you're spending nearly $2500
on the drive and 20 tapes.  It doesn't make sense to spend 5x as much on
the backup solution as the primary storage solution.

> You now have a second system requiring "constant tending".  You also
> have 9 components that could fail during restore.  With a tape drive you
> have one.  Calculate the total MTBF of those 9 components using the
> inverse probability rule and compare that to the MTBF of a single HP
> LTO-2 drive?

This is a disingenuous argument since only one failure ( the drive )
results in data loss.  If the power supply fails, you just plug in
another one.  Also again, it is the tape that matters, not the tape
drive, so what is the MTBF of those tapes?

I know it isn't a significant sample size, but in 20 years of computing
I have only personally ever had one hard drive outright fail on me, and
that was a WD15EARS ( died in under 24 hours ), but I have had tapes fail
a few times, often within 24 hours of verifying fine; then you go to
restore from them and they are unreadable.  That, combined with their
absurd cost, is why I don't use tapes any more.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Is It Hopeless?
  2010-12-27 16:37             ` Stan Hoeppner
                                 ` (2 preceding siblings ...)
  2011-01-04 20:03               ` Phillip Susi
@ 2011-01-05 14:45               ` Hank Barta
  2011-01-05 23:07               ` Leslie Rhorer
  4 siblings, 0 replies; 22+ messages in thread
From: Hank Barta @ 2011-01-05 14:45 UTC (permalink / raw)
  To: linux-raid

On Mon, Dec 27, 2010 at 10:37 AM, Stan Hoeppner <stan@hardwarefreak.com> wrote:
>
> At $99 you'll have $396 of drives in your backup server.  Add the cost
> of a case ($50), PSU ($30), mobo ($80), CPU ($100), DIMMs ($30), optical
> drive ($20), did I omit anything?  You're now at around $700.
>

FWIW, I put together a system using a bare-bones Atom-based box with
two 2TB drives (one a WD20EARS and the other a Seagate LP drive) for a
total cost of $372. It has no optical drive since I used that space
for the second disk. (Not sure why I'd need an optical drive anyway.)
The drives are cheaper now and the system could probably be built for
a bit more than $300. The only shortcoming is that there is no space
for additional drives. As a bonus, the system supports wake-on-LAN
well, so I can bring it up, back up and then shut it down. In the long
run I'm not sure that's better for the drives than continuous
operation, but this is colocated at my son's place (for off-site
backup) and part of the deal was to minimize impact on his electric
bill. When running and active, it uses 35W measured at the plug. When
inactive, that drops to about 20W.
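
A sketch of how such a wake/backup/sleep cycle can be driven from the
primary box (the MAC address, host name and paths below are placeholders,
and wakeonlan plus key-based ssh are assumed to be set up):

  wakeonlan 00:11:22:33:44:55            # wake the backup box
  sleep 120                              # give it time to boot
  rsync -a /srv/media/ backup-box:/backup/media/
  ssh backup-box 'sudo poweroff'         # and power it back down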

best,
hank

-- 
'03 BMW F650CS - hers
'98 Dakar K12RS - "BABY K" grew up.
'93 R100R w/ Velorex 700 (MBD starts...)
'95 Miata - "OUR LC"
polish visor: apply squashed bugs, rinse, repeat
Beautiful Sunny Winfield, Illinois
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: Is It Hopeless?
  2011-01-04 20:03               ` Phillip Susi
@ 2011-01-05 21:19                 ` Leslie Rhorer
  0 siblings, 0 replies; 22+ messages in thread
From: Leslie Rhorer @ 2011-01-05 21:19 UTC (permalink / raw)
  To: 'Phillip Susi', 'Stan Hoeppner'; +Cc: linux-raid

> -----Original Message-----
> From: linux-raid-owner@vger.kernel.org [mailto:linux-raid-
> owner@vger.kernel.org] On Behalf Of Phillip Susi
> Sent: Tuesday, January 04, 2011 2:04 PM
> To: Stan Hoeppner
> Cc: linux-raid@vger.kernel.org
> Subject: Re: Is It Hopeless?
> 
> [...]
>
> I know it isn't a significant sample size, but in 20 years of computing
> I have only personally ever had one hard drive outright fail on me, and
> that was a WD15EARS ( died in under 24 hours ), but I have had tapes fail
> a few times, often within 24 hours of verifying fine; then you go to
> restore from them and they are unreadable.  That, combined with their
> absurd cost, is why I don't use tapes any more.

	No backup solution is perfect.  That's why I employ a backup server
PLUS offline storage PLUS multiple backup locations on multiple systems for
my critical data.  My banking data, for example, has full multi-generation
backups on multiple internal drives of different workstations as well as
being on the server, the backup server, and on offline storage.  Tape has
advantages that usually only begin to make sense for large enterprise-level
systems which may span many dozens of TB, and for which acquisition time
for the backup is less important than WORM capability.  For very small,
especially private, systems, tape's advantages are mostly moot, and its
relative cost rises rapidly as the size of the system falls.  Backing up 4TB
of data reliably can easily be done with $400 worth of hard drives.  Backing
up 400TB of data with hard drives is, well, nightmarish.  BTW, for small,
fairly static data repositories, DVDs or Blu-ray discs can provide a very
economical, if labor-intensive, WORM backup solution.  In the case of the
OP, it sounds as if his is a personal system containing mostly movies whose
content will never change.  DVD or Blu-ray might be a reasonable backup
medium for him.
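
As a rough illustration, with dvd+rw-tools a batch of finished recordings
can be written straight to a blank disc with something like the following
(device path and directory are assumptions):

  # burn one batch as a new session; -R -J give sane long filenames
  growisofs -Z /dev/dvd -R -J /srv/media/finished/batch-001/
  # later batches can go onto another disc, or be appended with -M instead of -Z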

	One item that is rarely discussed, and yet is the single most
important reason for a backup, is human error.  People go on endlessly
about drive failures and tape failures, yet the fact is most data loss is
due to user error.  A WORM solution can go a long way toward alleviating
such mistakes, while an online backup solution may inherently encourage
them.  At the same time, when a user accidentally overwrites a file, he
usually wants it recovered instantly.  I know I have been very glad on more
than one occasion to have an on-line backup system from which I could
recover a file I had accidentally overwritten.  That's why I run an rsync
every night, and why I don't delete any files during the rsync that have
been removed from the main system.  Of course that means the backup system
has to be larger than the main system, and that I have to go through and
delete old, temporary files on the backup from time to time.
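
The nightly job amounts to little more than the following (host and path
names here are placeholders), the important detail being the absence of
--delete:

  rsync -a /srv/media/ backupserver:/srv/media/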

	Another item that is often glossed over is the importance of the
data being targeted.  On many systems, some of the data is not very
important, at all, and a loss of some of that data may be of little
consequence.  It's not a black / white dichotomy between important vs.
unimportant data, either.  Not only does the importance of the data vary
over a significant range, the importance also scales with volume.  My
aforementioned critical banking data, for example, is rather small in
extent, so there's no significant monetary or administrative impact in
storing copies of it all over the galaxy, as it were.  OTOH, like the OP,
the bulk of the data on my home servers is video.  The loss of a single
video, while not wonderful, is hardly a tragedy.  Through one issue or
another I have indeed lost a small handful of videos over time.  The cost
of ensuring I never lost any of these files would have been prohibitively
high; that level of backup was simply not practical.  The thought,
however, of losing all 11 TB of video data is daunting, to say the least.
That's why I do have an online backup system and offline storage as well for
the bulk of the files on the server.  At work, the systems I administrate
are actually quite small in comparison to my home server, but the data on
some of them is at least as critical as my personal banking data.  Those
systems have multiple backups to multiple tapes, multiple hard drives, and
multiple solid state storage systems spread across the entire nation.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: Is It Hopeless?
  2010-12-27 16:37             ` Stan Hoeppner
                                 ` (3 preceding siblings ...)
  2011-01-05 14:45               ` Hank Barta
@ 2011-01-05 23:07               ` Leslie Rhorer
  2011-01-06 23:02                 ` Berkey B Walker
  4 siblings, 1 reply; 22+ messages in thread
From: Leslie Rhorer @ 2011-01-05 23:07 UTC (permalink / raw)
  To: 'Stan Hoeppner', linux-raid

> Too many (especially younger) IT people _only_ consider up front
> acquisition cost of systems and not long term support of such systems.

	Perhaps so, but the reality of the situation is that any venture is
a risk.  Mitigating any risk costs money, and at some point one simply must
deploy the system and hope that the failure modes not mitigated by DR don't
happen.  Even locating multiple massively redundant DR centers across the
entire globe cannot completely mitigate all possible disasters, but few, if
any, enterprises can afford to launch satellites to put their critical data
in orbit.  Eventually, one must roll the dice.

> Total system cost _must_ include a reliable DRS (Disaster Recovery
> System).  If you can't afford the DRS to go with a new system, then you
> can't afford that system, and must downsize it or reduce its costs in
> some way to allow inclusion of DRS.

	Well, not necessarily.  Again, not all data is critical.  Below a
certain level of criticality, any DR at all is a waste of money.  There are
literally millions of people out there whose computing needs don't call for
any great level of DR, and some of them even have RAID systems.

>  There is no free lunch.  Eli nearly lost his job over poor acquisition
> and architecture choices.

	Maybe.  It seems to me a lot of recovery avenues were left
unexplored.

> In that thread he makes the same excuses you
> do regarding his total storage size needs and his "budget for backup".
> There is no such thing as "budget for backup".  DRS _must_ be included
> in all acquisition costs.  If not, someone will pay dear consequences at
> some point in time if the lost data has real value.

	Pay now or pay later.  In some cases, paying later is the only
alternative.  What's more, in some cases the later payment is less
burdensome.  I'm not saying this is always the case, by a long shot, but
blindly assuming one must never take any risks is not the best approach,
either.  Risk is a part of life, sometimes even the risk of death.  One must
approach each endeavor with a risk / benefit analysis.

> In Eli's case the
> lost data was Ph.D. student research data.  If one student lost all his
> data, he may likely have to redo an entire year of school.  Who pays for
> that?  Who pays for his year of lost earnings since he can't (re)enter
> the workforce at Ph.D. pay scale?  This snafu may cost a single Ph.D.
> student, the university, or both, $200K or more depending on career field.

	I didn't spot where any attempt was made to recover the data off the
failed drives.  If it was that important, then the cost of recovery from
failed drives should have been well worth it.  I'm not saying recovering
data from failed hard drives is a good DR plan, but in the event of such an
unmitigated failure, it becomes a worthwhile solution.  It's certainly not a
six-figure proposal.

 
> If they'd had a decent tape silo he'd have lost no data.
> 
> > MTBF of tape is hundreds of times shorter.
> 
> Really?  Eli had those WD20EARS online in his D2D backup system for less
> than 5 months.  LTO tape reliability is less than 5 months?  Show data
> to back that argument up please.

	I've had tapes fail after their first write.  Indeed, I recently had
to recover a system from tape where the last 2 tapes were bad.  The 3rd tape
was off-site, so it took a couple of extra days to get the system back
online.  Of course, it wasn't a critical system, or else we would have had
more immediate recovery alternatives.  See my point above.

> Tape isn't perfect either, but based on my experience and reading that
> of many, many others, it's still better than D2D in many cases.  Also
> note that tapes don't wholesale fail as disks do.  Bad spots on tape
> cause some lost files, not _all_ the files, as is the case when a D2D
> system fails during restore.

	Tapes certainly do fail completely.  Both tapes I mention above were
completely unreadable.  Note, though, that neither hard drives nor tape
media usually fail completely.  Had it been necessary, we could have had the tapes
scanned and recovered much of the data.  The same, however, is true of a
hard drive.  There are data recovery services available for both types of
failed media.  They aren't cheap, but if the lost data is truly valuable...

	Frankly, many people, including IT people who should know better,
panic whenever data is thought to be lost, and wind up making things worse.
A few years ago, an acquaintance of mine - a real bubble-head - related a
story to me.  She was given the chore of backing up one of the systems in
her office to tape.  When the tape utility reached the end of the tape, of
course, it prompted her for additional tapes.  Why this puzzled her, I have
no idea, but it did, so she asked the IT guy who had tasked her with the
backups what she should do.  Believe it or not, he told her to ignore the
prompt and just quit the backup app.  Apparently he is an even bigger
bubble-head than she is.  Some months down the road, of course, the system
failed and they were stuck with a partial backup.  (Yeah, I know, but it
gets worse.)  So what was the failure?  Someone accidentally re-formatted
the hard drive.  When it was discovered the tape backup was incomplete, they
had my acquaintance completely re-install the OS, and manually re-create the
data, which took weeks.  I couldn't believe it.  It's very likely nearly all
the data could have been recovered from the re-formatted hard drive with far
less trouble and cost.  Note also the (at the time) very expensive tape
drive was rendered virtually useless by incompetent individuals.  Indeed,
the entire failure was due to incompetence.

	My brother designed and built a boat, and during this exercise he
read books published by a number of marine engineers.  One of them had a
favorite saying - "Don't do just something, stand there!".  A lot of people,
including IT people, need to take that to heart.  In a failure scenario,
don't do anything unless you have thoroughly thought it through and are
quite certain it won't make things worse.

> If a tape drive fails during restore, you don't lose all the backup
> data.  You simply replace the drive and run the tapes through the new
> drive.  If you have a multi-drive silo or library, you simply get a log
> message of the drive failure, and your restore may simply be slower.
> This depends on how you've set up parallelism in your silo.  Outside of
> the supercomputing centers where large files are backed up in parallel
> streams to multiple tapes/drives simultaneously ("tape RAID" or tape
> striping) most organizations don't stripe this way.  They simply
> schedule simultaneous backups of various servers each hitting a
> different drive in the silo.  In this case if all other silo drives are
> busy, then your restore will have to wait.  But, you'll get your system
> restored.

	None of that is relevant to a guy with a couple of TB of videos in
his home system, though.

> > Not to mention that tape would take forever, and require constant
> tending.
> 
> Eli made similar statements as well, and they're bogus.  Modern high cap
> drives/tapes are quite speedy, especially if striped using the proper
> library/silo management software and planning/architecture.

	Yes, but Eli was tending a much larger system than the OP, and for a
well-endowed public system, not a $700 home computer.

> can absorb streaming backups at rates much higher than midrange SAN
> arrays, in the multiple GB/s range.  They're not cheap, but then, good
> DRS solutions aren't. :)

	A "good" DRS solution is only good if it is not so expensive as to
make the enterprise unprofitable.  It doesn't help for a company's data to
be safe if the company can't make money and goes bankrupt.

> The D2D vendors use this scare tactic often also.  Would you care to
> explain this "constant tending"?

	In a small system, the tapes will have to be swapped frequently.
The same is true of DVD or Blu-Ray backups.  An online backup can handle its
own management.

> > My storage is 2TB now, but my library is growing all the time.  Backing
> > up to off-line disk storage is the only practical way now, given the
> > extremely low cost and high capacity and speed.  Each WD 2TB drive is $99
> > from Newegg!  Astounding.  Thanks for the input though.
> 
> No, it's not the only practical methodology.  Are you not familiar with
> "differential copying"?  It's the feature that makes rsync so useful, as
> well as tape.  Once you have your first complete backup of that 2TB of
> media files, you're only writing to tape anything that's changed.

	And hoping the original backup set doesn't fail.  Again, every
solution is a compromise, and when it comes to backups, speed and
efficiency are always balanced against cost and robustness.
Multi-generation full backups take much longer and are more expensive, but
they don't rely on a single set of backup data that could turn out to be bad
when it is needed.  I'm not saying one should not take advantage of
differential or incremental backups, merely that they represent a compromise
whose implications must be considered.
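
One middle ground is rsync's --link-dest (sketched below with made-up
paths): each run produces a directory tree that is complete on its own,
unlike a chain of incrementals, while unchanged files are merely hard links
into the previous generation, so the time and space cost stays close to
that of an incremental:

  rsync -a --delete \
      --link-dest=/backup/2011-01-04 \
      /srv/media/ /backup/2011-01-05/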

> At $99 you'll have $396 of drives in your backup server.  Add the cost
> of a case ($50), PSU ($30), mobo ($80), CPU ($100), DIMMs ($30), optical
> drive ($20), did I omit anything?  You're now at around $700.
> 
> You now have a second system requiring "constant tending".  You also
> have 9 components that could fail during restore.

	Yes, but the failure of any one of those components won't destroy
the backup.  Indeed, the failure of any of the common components will stop
the restore, but won't impact the data system, at all.  With RAID10, RAID4,
RAID5, or RAID6, the failure of a drive during restore should not even stop
the restore.

> With a tape drive you have one.

	No, at a minimum, three: the drive, the tape, and the controller.
As with the backup system, the failure of either of the two shared
components will stop the restore.  Failure of the tape will fail the
restore.  What's
more, replacing a failed tape drive costs one hell of a lot more than
replacing a PSU or a single hard drive.  I recently had four simultaneous
drive failures in my backup system (with no loss of data, BTW), but
replacing all four was still a lot cheaper than replacing a tape drive.
Indeed, I would not have been able to afford a replacement tape drive, at
all, so had I chosen a tape drive as my sole means of backup, its failure
would have left me without a backup solution.

> Calculate the total MTBF of those 9 components using the
> inverse probability rule and compare that to the MTBF of a single HP
> LTO-2 drive?

	For a large data center or a company with many locations and
systems, such a computation is straightforward.  For a single system, it is
virtually meaningless.  One cannot apply statistics to a singular entity.
(The fact the average person lives to be about 80 doesn't mean my
grandmother did not live to be 102, or that her husband did not die when he
was 45.)
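
(For reference, the rule being invoked treats the parts as a series system
with independent failures, so the failure rates simply add:

  1/MTBF_total = 1/MTBF_1 + 1/MTBF_2 + ... + 1/MTBF_9

Purely for illustration, nine parts each rated at 300,000 hours combine to
roughly 300,000/9, or about 33,000 hours - a fleet-average figure, which is
exactly why it says so little about any one particular box.)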

> Again, you're like a deer in the headlights mesmerized by initial
> acquisition cost.  The tape solution I mentioned has a ~$200 greater
> acquisition cost

	You did not include the cost of tapes, especially if he employs a
WORM strategy.  Failing to do so turns the tape solution into a single point
of failure for the backup strategy.  In a RAID backup system, the loss of a
single drive does not compromise the backup data.  With a single
differential or incremental tape strategy, the loss of a tape or any part of
one may compromise the backup set.

> yet its reliability is greater

	Not really.  Again, one cannot rely upon statistical analysis to
determine the relative reliability of the two strategies.  One must instead
analyze the single points of failure and their impact on the backup
strategy, and then determine how much one may afford.

> and it is purpose
> built for the task at hand.   Your DIY D2D server is not.

	How is that relevant?

> Please keep in mind Carl I'm not necessarily speaking directly to you,
> or singling you out on this issue.  This list has a wider audience.
> Many sites archive this list, and those Googling the subject need good
> information on this subject.  The prevailing wind is D2D, but that
> doesn't make it God's gift to DRS.

	True.  The fact is, if the data is truly important, then not only is
a single backup system - no matter how expensive - insufficient, but so is
any single backup strategy.

> D2D and tape both have their place, and both can do some jobs equally
> well at the same or varying costs.  D2D is better for some scenarios in
> some environments.

	While for others, the advantages of one solution over the other make
more sense in the situation at hand.

>  Tape is the _ONLY_ solution for others, and
> especially so for some government and business scenarios that require
> WORM capability for legal compliance.  There are few, if any, disk based
> solutions that can guarantee WORM archiving.

	Yet a set of solutions that employs both tape and disk will meet
almost any demand - at a greatly increased cost.  If the application calls
for instant, random access to backup data along with the ability to recover
data from the distant past, then only a combination of tape and online disk
backups may suffice.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: Is It Hopeless?
  2011-01-05 23:07               ` Leslie Rhorer
@ 2011-01-06 23:02                 ` Berkey B Walker
  0 siblings, 0 replies; 22+ messages in thread
From: Berkey B Walker @ 2011-01-06 23:02 UTC (permalink / raw)
  To: lrhorer; +Cc: 'Stan Hoeppner', linux-raid



Leslie Rhorer wrote:
> [...]
> 	I've had tapes fail after their first write.  Indeed, I recently had
> to recover a system from tape where the last 2 tapes were bad.  The 3rd tape
> was off-site, so it took a couple of extra days to get the system back
> online.
> [...]
> 	Tapes certainly do fail completely.  Both tapes I mention above were
> completely unreadable.
> [...]
>
This is all very nice but, if memory serves (and it seems no one cares),
the OP does not show interest in serious system maintenance - and I would
judge from your posts that you all care little about mag tape storage
integrity.  How do you know that the tape read errors mentioned were not
caused by lack of maintenance?

I'll go away now, as this is my second post [my limit] and off-topic for
RAID.
Regards
b-



^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2011-01-06 23:02 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-12-26 18:19 Is It Hopeless? Carl Cook
2010-12-26 20:11 ` Neil Brown
2010-12-26 20:19   ` Carl Cook
2010-12-26 20:19     ` CoolCold
2010-12-26 20:33     ` Neil Brown
2010-12-26 21:14       ` Berkey B Walker
2010-12-27  0:06       ` Carl Cook
2010-12-27  4:45         ` Stan Hoeppner
2010-12-27  5:35         ` Phil Turmel
2010-12-27 13:10           ` Carl Cook
2010-12-27 15:04             ` Phil Turmel
2010-12-27 21:34               ` Brad Campbell
2010-12-27 16:37             ` Stan Hoeppner
2010-12-28  1:36               ` Berkey B Walker
2010-12-28  4:16               ` Carl Cook
2010-12-29  3:04                 ` Stan Hoeppner
2010-12-29  5:34                   ` Roman Mamedov
2011-01-04 20:03               ` Phillip Susi
2011-01-05 21:19                 ` Leslie Rhorer
2011-01-05 14:45               ` Hank Barta
2011-01-05 23:07               ` Leslie Rhorer
2011-01-06 23:02                 ` Berkey B Walker
