zero out blocks of freed user data for operation a virtual machine environment

* zero out blocks of freed user data for operation a virtual machine environment
@ 2009-05-24 17:00 Thomas Glanzmann
  2009-05-24 17:15 ` Arjan van de Ven
                   ` (4 more replies)
  0 siblings, 5 replies; 22+ messages in thread
From: Thomas Glanzmann @ 2009-05-24 17:00 UTC (permalink / raw)
  To: tytso; +Cc: LKML, linux-ext4

Hello Ted,
I would like to know if there is already a mount option or feature in
ext3/ext4 that automatically overwrites freed blocks with zeros. If this
is not the case, would you consider a patch for upstream? I'm asking
because I am currently doing research on data deduplication in virtual
machine environments and the corresponding backups. Such a feature would
be a huge space saver because today's and tomorrow's backup tools for
virtual machine environments work on the block layer (VMware
Consolidated Backup, VMware Data Recovery, and NetApp Snapshots). This
is true not only for backup tools but also for running virtual machines.
The case that this feature addresses is the following: a huge file is
downloaded and later deleted. Backup and data deduplication tools that
operate on the block level can't identify the freed blocks as unused.
As a result, they back up all the data that was previously allocated by
the file, which introduces a performance overhead. If you're interested
in real-life data, I can provide it.
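
To make the deduplication argument concrete, here is a minimal sketch
that counts the distinct fixed-size blocks in an image file. It is not
how any of the backup products named above work. The helper name
count_unique_blocks and the 4 KiB default block size are assumptions
chosen for illustration, and the sketch relies on the --filter option of
GNU split. A run of zeroed blocks collapses to a single hash, while
stale data left behind by a deleted file contributes roughly one hash
per block.

```shell
#!/bin/bash
# count_unique_blocks is a hypothetical helper, not part of any backup tool.
# It prints how many distinct fixed-size blocks a file or disk image contains.
count_unique_blocks() {
    local image=$1 block_size=${2:-4096}
    # GNU split pipes every block through sha256sum; identical blocks yield
    # identical hash lines, which sort -u collapses into a single line.
    split -b "$block_size" --filter='sha256sum' "$image" | sort -u | wc -l
}
```

Called as `count_unique_blocks disk.img`, it gives a rough idea of how
much a block-level deduplicator could save once freed blocks are zeroed.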

If you don't intend to have such an optional feature in ext3/ext4 I
would like to know if you know a tool that makes it possible to zero out
unused blocks?

The only reference that I found for such a tool for Linux is the
following:

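#!/bin/bash
# Mount points of all mounted ext2/ext3/ext4 file systems.
FileSystem=$(awk '$3 ~ /^ext/ { print $2 }' /etc/mtab)

for i in $FileSystem
do
       # Free space in 512-byte blocks; NR==2 skips the df header line.
       number=$(df -P -B 512 "$i" | awk 'NR==2 { print $4 }')
       # Fill only 95% of the free space so the disk does not run full.
       percent=$(( number * 95 / 100 ))
       # dd uses its default 512-byte block size, matching the df units.
       dd count="$percent" if=/dev/zero of="$i/zf"
       rm -f "$i/zf"
done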

Source: http://blog.core-it.com.au/?p=298

Even if it certainly does the job, I would hardly recommend it to anyone,
for several obvious reasons: it causes a lot of I/O overhead that could be
avoided, and if it is scheduled at a bad moment it could lead to a
full-disk situation. Also, the block size is left at the default and as
such is way too low.
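
As a sketch of what a larger block size could look like, the following
variant writes 1 MiB blocks instead of dd's 512-byte default. The
function names fill_count_mib and zero_fill_free are hypothetical, the
95% margin is carried over from the script above, and bs=1M and
conv=fsync assume GNU dd. It addresses only the block size; the I/O
overhead and the risk of filling the disk at a bad moment remain.

```shell
#!/bin/bash
# fill_count_mib and zero_fill_free are hypothetical helper names.

# Number of 1 MiB blocks to write, given the free space in KiB and the
# percentage of that free space which may be filled.
fill_count_mib() {
    local avail_kib=$1 percent=$2
    echo $(( avail_kib * percent / 100 / 1024 ))
}

# Fill part of the free space under a mount point with zeros, then release it.
zero_fill_free() {
    local mnt=$1 percent=${2:-95} avail_kib count
    # df -P prints one header line and one data line; column 4 is free space.
    avail_kib=$(df -P -k "$mnt" | awk 'NR==2 { print $4 }')
    count=$(fill_count_mib "$avail_kib" "$percent")
    # Nothing to do when less than 1 MiB would be written.
    [ "$count" -gt 0 ] || return 0
    # 1 MiB writes need far fewer system calls than 512-byte writes.
    dd if=/dev/zero of="$mnt/zf" bs=1M count="$count" conv=fsync
    rm -f "$mnt/zf"
}
```

It would be invoked per mount point, for example `zero_fill_free /home 90`.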

Just to be complete: For Microsoft Windows there is a tool called
sdelete which can be used to zero out unused disk blocks. It has the
same problems as the above script but is hopefully safer to run.

Source: http://technet.microsoft.com/en-us/sysinternals/bb897443.aspx

        Thomas

* Re: zero out blocks of freed user data for operation a virtual machine environment
  2009-05-24 17:00 zero out blocks of freed user data for operation a virtual machine environment Thomas Glanzmann
@ 2009-05-24 17:15 ` Arjan van de Ven
  2009-05-24 17:39   ` Thomas Glanzmann
  2009-05-25  3:29 ` David Newall
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 22+ messages in thread
From: Arjan van de Ven @ 2009-05-24 17:15 UTC (permalink / raw)
  To: Thomas Glanzmann; +Cc: tytso, LKML, linux-ext4

On Sun, 24 May 2009 19:00:45 +0200
Thomas Glanzmann <thomas@glanzmann.de> wrote:

> Hello Ted,
> I would like to know if there is already a mount option or feature in
> ext3/ext4 that automatically overwrites freed blocks with zeros? If
> this is not the case I would like to know if you would consider a
> patch for upstream? I'm asking this because I currently do some
> research work on data deduplication in virtual machine environments
> and corresponding backups. It would be a huge space saver if there is
> such a feature because todays and tomorrows backup tools for virtual
> machine environments work on the block layer (VMware Consolidated
> Backup, VMware Data Recovery, and NetApp Snapshots). This is not only
> true for backup tools but also for running Virtual machines. The case
> that this future addresses is the following: A huge file is
> downloaded and later delted. The backup and datadeduplication that is
> operating on the block level can't identify the block as unused. This
> results in backing up the amount of the data that was previously
> allocated by the file and as such introduces an performance overhead.
> If you're interested in real live data, I'm able to provide them.
> 
> If you don't intend to have such an optional feature in ext3/ext4 I
> would like to know if you know a tool that makes it possible to zero
> out unused blocks?
> 

wouldn't it be better if the VM's would just support the TRIM command?


-- 
Arjan van de Ven 	Intel Open Source Technology Centre
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org

* Re: zero out blocks of freed user data for operation a virtual machine environment
  2009-05-24 17:15 ` Arjan van de Ven
@ 2009-05-24 17:39   ` Thomas Glanzmann
  2009-05-25 12:03     ` Theodore Tso
  0 siblings, 1 reply; 22+ messages in thread
From: Thomas Glanzmann @ 2009-05-24 17:39 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: tytso, LKML, linux-ext4

Hello,

> > tunable feature that zeroes our free data

> wouldn't it be better if the VM's would just support the TRIM command?

The resources available to me indicate that TRIM is a not yet
standardized command targeted at SSDs to indicate free disk
space. Does ext3/4 trigger a block device layer call that could result
in a TRIM command?

        Thomas

* Re: zero out blocks of freed user data for operation a virtual machine environment
  2009-05-24 17:39   ` Thomas Glanzmann
@ 2009-05-25 12:03     ` Theodore Tso
  2009-05-25 12:34       ` Thomas Glanzmann
  0 siblings, 1 reply; 22+ messages in thread
From: Theodore Tso @ 2009-05-25 12:03 UTC (permalink / raw)
  To: Thomas Glanzmann, Arjan van de Ven, tytso, LKML, linux-ext4

On Sun, May 24, 2009 at 07:39:33PM +0200, Thomas Glanzmann wrote:
> Hello,
> 
> > > tunable feature that zeroes our free data
> 
> > wouldn't it be better if the VM's would just support the TRIM command?
> 
> the resources available to me indicate that the TRIM command is a not
> yet standarized command targeted at SSD disks to indicate free disk
> space. Does ext3/4 trigger a block device layer call that could result
> in a TRIM command?

Yes, it does, sb_issue_discard().  So if you wanted to hook into this
routine with a function which issued calls to zero out blocks, it
would be easy to create a private patch.

					- Ted

* Re: zero out blocks of freed user data for operation a virtual machine environment
  2009-05-25 12:03     ` Theodore Tso
@ 2009-05-25 12:34       ` Thomas Glanzmann
  2009-05-25 13:14         ` Goswin von Brederlow
  0 siblings, 1 reply; 22+ messages in thread
From: Thomas Glanzmann @ 2009-05-25 12:34 UTC (permalink / raw)
  To: Theodore Tso, Arjan van de Ven, tytso, LKML, linux-ext4

Hello Ted,

> Yes, it does, sb_issue_discard().  So if you wanted to hook into this
> routine with a function which issued calls to zero out blocks, it
> would be easy to create a private patch.

that sounds good because it wouldn't only target the most used
filesystem but every other filesystem that uses the interface as well.
Do you think that a tunable or configurable patch has a chance to hit
upstream as well?

        Thomas

* Re: zero out blocks of freed user data for operation a virtual machine environment
  2009-05-25 12:34       ` Thomas Glanzmann
@ 2009-05-25 13:14         ` Goswin von Brederlow
  2009-05-25 14:01           ` Thomas Glanzmann
       [not found]           ` <f3177b9e0905251023n762b815akace1ae34e643458e@mail.gmail.com>
  0 siblings, 2 replies; 22+ messages in thread
From: Goswin von Brederlow @ 2009-05-25 13:14 UTC (permalink / raw)
  To: Thomas Glanzmann; +Cc: Theodore Tso, Arjan van de Ven, tytso, LKML, linux-ext4

Thomas Glanzmann <thomas@glanzmann.de> writes:

> Hello Ted,
>
>> Yes, it does, sb_issue_discard().  So if you wanted to hook into this
>> routine with a function which issued calls to zero out blocks, it
>> would be easy to create a private patch.
>
> that sounds good because it wouldn't only target the most used
> filesystem but every other filesystem that uses the interface as well.
> Do you think that a tunable or configurable patch has a chance to hit
> upstream as well?
>
>         Thomas

I could imagine a device mapper target that eats TRIM commands and
writes out zeroes instead. That should be easy to maintain outside or
inside the upstream kernel source.

MfG
        Goswin

* Re: zero out blocks of freed user data for operation a virtual machine environment
  2009-05-25 13:14         ` Goswin von Brederlow
@ 2009-05-25 14:01           ` Thomas Glanzmann
       [not found]           ` <f3177b9e0905251023n762b815akace1ae34e643458e@mail.gmail.com>
  1 sibling, 0 replies; 22+ messages in thread
From: Thomas Glanzmann @ 2009-05-25 14:01 UTC (permalink / raw)
  To: Goswin von Brederlow
  Cc: Theodore Tso, Arjan van de Ven, tytso, LKML, linux-ext4

Hello Goswin,

> I could imagine a device mapper target that eats TRIM commands and
> writes out zeroes instead. That should be easy to maintain outside or
> inside the upstream kernel source.

Again an interesting option, and for sure easy to handle. However, what
I'm really looking for is an option that gets upstream and will be
incorporated in major distributions, so that it is available on
every Linux distribution shipping in two years. If this won't be
the case, I'm going to consider writing a device mapper target.

        Thomas

[parent not found: <f3177b9e0905251023n762b815akace1ae34e643458e@mail.gmail.com>]

* RE: zero out blocks of freed user data for operation a virtual  machine environment
       [not found]           ` <f3177b9e0905251023n762b815akace1ae34e643458e@mail.gmail.com>
@ 2009-05-25 17:26               ` Chris Worley
  2009-05-26 10:22               ` Goswin von Brederlow
  1 sibling, 0 replies; 22+ messages in thread
From: Chris Worley @ 2009-05-25 17:26 UTC (permalink / raw)
  To: LKML, linux-ext4

On Mon, May 25, 2009 at 7:14 AM, Goswin von Brederlow <goswin-v-b@web.de> wrote:
> Thomas Glanzmann <thomas@glanzmann.de> writes:
> > Hello Ted,
> >
> >> Yes, it does, sb_issue_discard().  So if you wanted to hook into this
> >> routine with a function which issued calls to zero out blocks, it
> >> would be easy to create a private patch.
> >
> > that sounds good because it wouldn't only target the most used
> > filesystem but every other filesystem that uses the interface as well.
> > Do you think that a tunable or configurable patch has a chance to hit
> > upstream as well?
> >
> >         Thomas
>
> I could imagine a device mapper target that eats TRIM commands and
> writes out zeroes instead. That should be easy to maintain outside or
> inside the upstream kernel source.

Why bother with a time-consuming performance-draining operation?
There are devices that already support TRIM/discard commands today,
and once you discard a block, it's completely irretrievable (you'll
just get back zeros if you try to read that block w/o writing it after
the discard).
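
Whether a particular block device advertises discard support can be
checked from user space on kernels that expose the
queue/discard_max_bytes attribute in sysfs. This is an assumption about
the running kernel; the attribute may be absent on kernels from the time
of this thread. The helper name discard_supported is hypothetical. A
minimal sketch:

```shell
#!/bin/bash
# discard_supported is a hypothetical helper. It takes a queue directory
# such as /sys/block/sda/queue and succeeds if the device advertises discard.
discard_supported() {
    local queue_dir=$1 max_bytes
    # A missing attribute (older kernel) is treated as "no discard support".
    [ -r "$queue_dir/discard_max_bytes" ] || return 1
    max_bytes=$(cat "$queue_dir/discard_max_bytes")
    # The kernel reports 0 when the device cannot discard blocks.
    [ "$max_bytes" -gt 0 ]
}

for queue in /sys/block/*/queue; do
    # Skip the unexpanded pattern on systems without /sys/block.
    [ -d "$queue" ] || continue
    dev=${queue%/queue}
    dev=${dev##*/}
    if discard_supported "$queue"; then
        echo "$dev: discard supported"
    else
        echo "$dev: no discard, zeroing would be the fallback"
    fi
done
```

The check only says that discard requests are accepted; it says nothing
about what a later read of a discarded block returns.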

Chris
>
>
> MfG
>        Goswin

* Re: zero out blocks of freed user data for operation a virtual  machine environment
       [not found]           ` <f3177b9e0905251023n762b815akace1ae34e643458e@mail.gmail.com>
@ 2009-05-26 10:22               ` Goswin von Brederlow
  2009-05-26 10:22               ` Goswin von Brederlow
  1 sibling, 0 replies; 22+ messages in thread
From: Goswin von Brederlow @ 2009-05-26 10:22 UTC (permalink / raw)
  To: Chris Worley; +Cc: Goswin von Brederlow, LKML, linux-ext4

Chris Worley <worleys@gmail.com> writes:

> On Mon, May 25, 2009 at 7:14 AM, Goswin von Brederlow <goswin-v-b@web.de>
> wrote:
>
>
>                Thomas Glanzmann <thomas@glanzmann.de> writes:
>      
>      > Hello Ted,
>      >
>      >> Yes, it does, sb_issue_discard().  So if you wanted to hook into
>      this
>      >> routine with a function which issued calls to zero out blocks, it
>      >> would be easy to create a private patch.
>      >
>      > that sounds good because it wouldn't only target the most used
>      > filesystem but every other filesystem that uses the interface as
>      well.
>      > Do you think that a tunable or configurable patch has a chance to
>      hit
>      > upstream as well?
>      >
>      >         Thomas
>      
>      
>
>
>      I could imagine a device mapper target that eats TRIM commands and
>      writes out zeroes instead. That should be easy to maintain outside
>      or
>      inside the upstream kernel source.
>
>
> Why bother with a time-consuming performance-draining operation?  There are
> devices that already support TRIM/discard commands today, and once you discard
> a block, it's completely irretrievable (you'll just get back zeros if you try
> to read that block w/o writing it after the discard).
> Chris 

Because you have one of the billions of devices that don't.

Because, iirc, the specs say nothing about getting back zeros.

Because someone could read the raw data from disk and recover your
state secrets.

Because loopback doesn't support TRIM and compression of the image file
is much better with zeroes.

Because on an encrypted device TRIM would show how much of the device is
in use while zeroing out (before encrypting) would result in random
data.

Because it is fun?

So many reasons.
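
The compression point is easy to check with a small experiment. The
sketch below compares how well gzip compresses 1 MiB of stale data
(random bytes stand in for it here, which is an assumption about what
freed blocks contain) against 1 MiB of zeros. The helper name
compressed_size is hypothetical.

```shell
#!/bin/bash
# compressed_size is a hypothetical helper: bytes gzip needs for a file.
compressed_size() {
    gzip -c "$1" | wc -c
}

work=$(mktemp -d)
# Two 1 MiB stand-ins for the freed region of an image file:
# one still holding stale data (random bytes here), one zeroed out.
head -c 1048576 /dev/urandom > "$work/stale.img"
head -c 1048576 /dev/zero > "$work/zeroed.img"

echo "stale:  $(compressed_size "$work/stale.img") bytes after gzip"
echo "zeroed: $(compressed_size "$work/zeroed.img") bytes after gzip"
rm -rf "$work"
```

The stale region stays at roughly its original size, while the zeroed
region shrinks to a tiny fraction of it.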

MfG
        Goswin

* Re: zero out blocks of freed user data for operation a virtual  machine environment
  2009-05-26 10:22               ` Goswin von Brederlow
@ 2009-05-26 16:52                 ` Chris Worley
  -1 siblings, 0 replies; 22+ messages in thread
From: Chris Worley @ 2009-05-26 16:52 UTC (permalink / raw)
  To: Goswin von Brederlow; +Cc: LKML, linux-ext4

On Tue, May 26, 2009 at 4:22 AM, Goswin von Brederlow <goswin-v-b@web.de> wrote:
> Chris Worley <worleys@gmail.com> writes:
>
>> On Mon, May 25, 2009 at 7:14 AM, Goswin von Brederlow <goswin-v-b@web.de>
>> wrote:
>>
>>
>>                Thomas Glanzmann <thomas@glanzmann.de> writes:
>>
>>      > Hello Ted,
>>      >
>>      >> Yes, it does, sb_issue_discard().  So if you wanted to hook into
>>      this
>>      >> routine with a function which issued calls to zero out blocks, it
>>      >> would be easy to create a private patch.
>>      >
>>      > that sounds good because it wouldn't only target the most used
>>      > filesystem but every other filesystem that uses the interface as
>>      well.
>>      > Do you think that a tunable or configurable patch has a chance to
>>      hit
>>      > upstream as well?
>>      >
>>      >         Thomas
>>
>>
>>
>>
>>      I could imagine a device mapper target that eats TRIM commands and
>>      writes out zeroes instead. That should be easy to maintain outside
>>      or
>>      inside the upstream kernel source.
>>
>>
>> Why bother with a time-consuming performance-draining operation?  There are
>> devices that already support TRIM/discard commands today, and once you discard
>> a block, it's completely irretrievable (you'll just get back zeros if you try
>> to read that block w/o writing it after the discard).
>> Chris
>

I do enjoy a good argument... and don't mean this as a flame (I'm told
I obliviously write curtly)...

Old man's observation: I've found that the people you would think
would readily embrace a new technology are as terrified of change as a
Windows user, and always find so many excuses for "why change won't
work for them" ;)

> Because you have one of the billions of devices that don't.

You have devices that _do_ work now, that should be your selection if
you want both this functionality and high performance.  If you don't
want performance, write zeros to rotating media.

The time frame given in this thread is two years.  In 2-5 years,
rotating media will be history.  The tip of the Linux kernel should
not be focused on defunct technology.

>
> Because, iirc, the specs say nothing about getting back zeros.
>

But all a Solid State Storage controller can do is give you garbage
when asked for an unwritten or discarded block; it doesn't know where
the data is, which is all that is needed for the functionality desired
(there's no need to specify exactly what a controller should return
when asked to read a block it knows nothing about).  Once the
controller is no longer managing a block, there is no way for it to
retrieve that block.  That's what TRIM is all about: get greatest
performance by allowing the SSS controller to manage as few blocks as
absolutely necessary.  Not being able to retrieve valid data for an
unwritten or discarded block is a side-effect of TRIM, that fits well
for this desired functionality.

From drives I've tested so far, the de-facto standard is "zero" when
reading unmanaged blocks.

> Because someone could read the raw data from disk and recover your
> state secrets.

Water-boarding won't help... the controller simply doesn't know the
information you demand.

This isn't your grandfather's rotating media...

You would have to read at the Erase Block level, and know the specific
vendor implementation's EB layout and block-level
mapping/metadata/wear-leveling strategy (i.e. very tightly held IP).
Controllers don't provide the functionality to request raw EB's; there
is no way to read raw EB's.  No spec exists for
reading EB's from a SCSI/SAS/SATA/block device.  Your only recourse
would be to pull the NAND chips physically off the drive and weld them
to another piece of hardware specifically designed to blindly read all
the erase blocks, then try to infer the manufacturer's chip
organization as well as block-level metadata, and then you'd only
know all the active blocks (which you would have known
anyway, before you pulled the chips off) and would have to come up
with some strategy for trying to figure out the original LBA's for all
the inactive data... so there _is_ a very small chance of recovery,
lacking physical security... there are worse issues too, when physical
security is not available on site (i.e. all your active data would be
vulnerable as with any mechanical drive).

Of concern to those handling state secrets: there is no guarantee in
SSS that writing whatever pattern over and over again will physically
overwrite the targeted LBA.  New methods of "declassifying" SSS drives
will be necessary (i.e. a Secure Erase where the controller is told to
erase all EB's... so your NAND EB reading device will read all ones no
matter what EB is read).  These methods are simple enough to develop,
but those who care about this should be aware that the old rotating
media methods no longer apply.

> Because loopback don't support TRIM and compression of the image file
> is much better with zeroes.

Wouldn't it be best if the block is not in existence after the
discard?  Then there would be nothing to compress, which I believe
"nothing" compresses very compactly.
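
For an image file, a block that "is not in existence" corresponds to a
hole in a sparse file. As an illustration only (the thread does not say
whether the loop driver can punch such holes), the sketch below compares
the apparent and the allocated size of a file that is one large hole
with a file of explicitly written zeros. The helper names are
hypothetical, and stat -c and truncate assume GNU coreutils.

```shell
#!/bin/bash
# Hypothetical helpers built on GNU stat.
apparent_bytes() { stat -c %s "$1"; }
# %b is the number of allocated blocks, %B the size in bytes of each block.
allocated_bytes() { echo $(( $(stat -c %b "$1") * $(stat -c %B "$1") )); }

work=$(mktemp -d)
# A 1 MiB file that consists of a single hole: no data was ever written.
truncate -s 1M "$work/hole.img"
# A 1 MiB file of explicitly written zeros.
head -c 1048576 /dev/zero > "$work/zeros.img"

for f in "$work/hole.img" "$work/zeros.img"; do
    echo "${f##*/}: apparent $(apparent_bytes "$f"), allocated $(allocated_bytes "$f")"
done
rm -rf "$work"
```

Both files read back as zeros, but on most file systems only the second
one occupies disk space.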

>
> Because on a crypted device TRIM would show how much of the device is
> in used while zeroing out (before crypting) would result in random
> data.

TRIM doesn't tell you how much of the drive is used?

>
> Because it is fun?

You've got me there.  To each his own.

>
> So many reasons.

...to switch from the old rotating media to SSS ;)

Chris
>
> MfG
>        Goswin
>

* Re: zero out blocks of freed user data for operation a virtual  machine environment
  2009-05-26 16:52                 ` Chris Worley
@ 2009-05-28 19:27                   ` Goswin von Brederlow
  -1 siblings, 0 replies; 22+ messages in thread
From: Goswin von Brederlow @ 2009-05-28 19:27 UTC (permalink / raw)
  To: Chris Worley; +Cc: Goswin von Brederlow, LKML, linux-ext4

Chris Worley <worleys@gmail.com> writes:

> On Tue, May 26, 2009 at 4:22 AM, Goswin von Brederlow <goswin-v-b@web.de> wrote:
>> Chris Worley <worleys@gmail.com> writes:
>>
>>> On Mon, May 25, 2009 at 7:14 AM, Goswin von Brederlow <goswin-v-b@web.de>
>>> wrote:
>>>
>>>
>>>                Thomas Glanzmann <thomas@glanzmann.de> writes:
>>>
>>>      > Hello Ted,
>>>      >
>>>      >> Yes, it does, sb_issue_discard().  So if you wanted to hook into
>>>      this
>>>      >> routine with a function which issued calls to zero out blocks, it
>>>      >> would be easy to create a private patch.
>>>      >
>>>      > that sounds good because it wouldn't only target the most used
>>>      > filesystem but every other filesystem that uses the interface as
>>>      well.
>>>      > Do you think that a tunable or configurable patch has a chance to
>>>      hit
>>>      > upstream as well?
>>>      >
>>>      >         Thomas
>>>
>>>
>>>
>>>
>>>      I could imagine a device mapper target that eats TRIM commands and
>>>      writes out zeroes instead. That should be easy to maintain outside
>>>      or
>>>      inside the upstream kernel source.
>>>
>>>
>>> Why bother with a time-consuming performance-draining operation?  There are
>>> devices that already support TRIM/discard commands today, and once you discard
>>> a block, it's completely irretrievable (you'll just get back zeros if you try
>>> to read that block w/o writing it after the discard).
>>> Chris
>>
>
> I do enjoy a good argument... and don't mean this as a flame (I'm told
> I obliviously write curtly)...
>
> Old man's observation: I've found that the people you would think
> would readily embrace a new technology are as terrified of change as a
> Windows user, and always find so many excuses for "why change won't
> work for them" ;)
>
>> Because you have one of the billions of devices that don't.
>
> You have devices that _do_ work now, that should be your selection if
> you want both this functionality and high performance.  If you don't
> want performance, write zeros to rotating media.
>
> The time frame given in this thread is two years.  In 2-5 years,
> rotating media will be history.  The tip of the Linux kernel should
> not be focused on defunct technology.

I certainly have disks in use that are a lot older than that. And for
sure Thomas also has disks that do not natively support TRIM or he
wouldn't want to zero fill blocks instead. So the fact that someone
else might have a "working" disk is of no help.

>> Because, iirc, the specs say nothing about getting back zeros.
>>
>
> But all a Solid State Storage controller can do is give you garbage
> when asked for an unwritten or discarded block; it doesn't know where
> the data is, which is all that is needed for the functionality desired
> (there's no need to specify exactly what a controller should return
> when asked to read a block it knows nothing about).  Once the
> controller is no longer managing a block, there is no way for it to
> retrieve that block.  That's what TRIM is all about: get greatest
> performance by allowing the SSS controller to manage as few blocks as
> absolutely necessary.  Not being able to retrieve valid data for an
> unwritten or discarded block is a side-effect of TRIM, that fits well
> for this desired functionality.

Are you sure? From what other people have said, some disks don't seem to
forget where the data is. They just don't preserve it anymore. So as
long as the block has not been overwritten by the wear leveling, you do
get the original data back. Security-wise, that is not acceptable.

> From drives I've tested so far, the de-facto standard is "zero" when
> reading unmanaged blocks.
>
>> Because someone could read the raw data from disk and recover your
>> state secrets.
>
> Water-boarding won't help... the controller simply doesn't know the
> information you demand.

You assume that you have a controller that works right.

> This isn't your grandfathers rotating media...

It is for me.

> You would have to read at the Erase Block level, and know the specific
> vendor implementation's EB layout and block-level
> mapping/metadata/wear-leveling strategy (i.e. very tightly held IP).
> Controllers don't provide the functionality to request raw EB's; there
> is no way to read raw EB's.  There is no spec for it in existence for
> reading EB's from a SCSI/SAS/SATA/block device.  Your only recourse
> would be to pull the NAND chips physically off the drive and weld them
> to another piece of hardware specifically designed to blindly read all
> the erase blocks, then try to infer the manufacturers chip
> organization as well as block-level metatdata, and then you'd only
> know all the active blocks (which you would have known those blocks
> anyway, before you pulled the chips off) and would have to come up
> with some strategy for trying to figure out the original LBA's for all
> the inactive data... so there _is_ a very small chance of recovery,
> lacking physical security... there are worse issues too, when physical
> security is not available on site (i.e. all your active data would be
> vulnerable as with any mechanical drive).
>
> Of concern to those handling state secrets: there is no guarantee in
> SSS that writing whatever pattern over and over again will physically
> overwrite the targeted LBA.  New methods of "declassifying" SSS drives
> will be necessary (i.e. a Secure Erase where the controller is told to
> erase all EB's... so your NAND EB reading device will read all ones no
> matter what EB is read).  These methods are simple enough to develop,
> but those who care about this should be aware that the old rotating
> media methods no longer apply.

Again you assume you have an SSD. Think what happens on your average
rotating disk.

>> Because loopback don't support TRIM and compression of the image file
>> is much better with zeroes.
>
> Wouldn't it be best if the block is not in existence after the
> discard?  Then there would be nothing to compress, which I believe
> "nothing" compresses very compactly.

That would require erasing blocks from the middle of files, something
not yet possible in the VFS layer nor supported by any filesystem.
It certainly would be great if discarding a block on a loop-mounted
filesystem image freed up the space in the underlying file. But
it doesn't work that way yet.

>> Because on a crypted device TRIM would show how much of the device is
>> in used while zeroing out (before crypting) would result in random
>> data.
>
> TRIM doesn't tell you how much of the drive is used?

Read the drive without decrypting. Any block that is all zeroes (you
claim above that TRIMed blocks return zeroes) is unused. On the other
hand, if you catch the TRIM commands above the crypt layer and write
zeroes, those zeroes get encrypted into random-looking bits.

>> Because it is fun?
>
> You've got me there.  To each his own.
>
>>
>> So many reasons.
>
> ...to switch from the old rotating media to SSS ;)

Sure, if I had an SSD with TRIM support I certainly would not want
to circumvent it by zeroing blocks and so decrease its lifetime. The
use for this would be for the other cases.

> Chris
>>
>> MfG
>>        Goswin
>>

MfG
        Goswin

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: zero out blocks of freed user data for operation a virtual machine environment
  2009-05-24 17:00 zero out blocks of freed user data for operation a virtual machine environment Thomas Glanzmann
  2009-05-24 17:15 ` Arjan van de Ven
@ 2009-05-25  3:29 ` David Newall
  2009-05-25  5:26   ` Thomas Glanzmann
  2009-05-25  7:48 ` Ron Yorston
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 22+ messages in thread
From: David Newall @ 2009-05-25  3:29 UTC (permalink / raw)
  To: Thomas Glanzmann, tytso, LKML, linux-ext4

Thomas Glanzmann wrote:
> If you don't intend to have such an optional feature in ext3/ext4 I
> would like to know if you know a tool that makes it possible to zero out
> unused blocks?
>
> The only reference that I found for such a tool for Linux is the
> following:
>   
Astounding use of backquote. I'm not sure about the "percent" bit. I
think it's some confusion over only 95% of total blocks being available
for allocation.

I, too, would not recommend it, but it becomes safer by repeatedly
allocating only half the remaining disk, stopping when there's only a
few blocks free, to leave some for all of the other processes.
Presumably it won't be a problem having (potentially) a few free blocks
that don't de-duplicate.

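#!/bin/bash
# Mount points of all mounted ext2/ext3/ext4 filesystems.
FileSystem=$(awk '$3 ~ /^ext[234]$/ { print $2 }' /etc/mtab)

for i in $FileSystem
do
       # Each pass writes zeros over half of the remaining free space;
       # stop once fewer than 10 blocks of 512 bytes are left.
       while number=$(df -P -B 512 "$i" | awk 'NR == 2 { if ($4 < 10) exit 1; print int($4 / 2) }')
       do
              dd count="$number" if=/dev/zero || break
       done > "$i/zf"
       rm -f "$i/zf"
done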



Are you proposing to de-duplicate a live filesystem?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: zero out blocks of freed user data for operation a virtual machine environment
  2009-05-25  3:29 ` David Newall
@ 2009-05-25  5:26   ` Thomas Glanzmann
  0 siblings, 0 replies; 22+ messages in thread
From: Thomas Glanzmann @ 2009-05-25  5:26 UTC (permalink / raw)
  To: David Newall, tytso; +Cc: LKML, linux-ext4

Hello David,

                        [ RESEND: CC forgotten ]

> Are you proposing to de-duplicate a live filesystem?

I do, but on the storage appliance / NFS server and not inside the VM.
But inside the VM a filesystem could make the deduplication effort much
easier if it reported unused blocks to the outside world by overwriting
them with zeros. I have two scenarios in mind at the moment:

        - btrfs already has checksums. I'm currently evaluating whether
          the crc32 is good enough to find candidates for deduplication
          or whether a stronger checksum is required. After that, one
          patch needs to be adapted and an ioctl needs to be implemented
          in btrfs which then double-checks that the blocks really are
          duplicates of each other and deduplicates them.

        - btrfs will at some point be able to generate a list of
          blocks that have changed between two transactions. This list
          can be used to create an off-site backup.
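
The candidate-then-verify idea in the first scenario can be sketched in
userspace. This is only an illustration, not the btrfs patch or ioctl:
the function name find_dup_blocks is made up, POSIX cksum stands in for
the btrfs crc32, and it works on a plain image file with an assumed
default block size of 4096 bytes. It needs bash 4 or later for
associative arrays.

```bash
#!/bin/bash
# find_dup_blocks IMAGE [BLOCKSIZE]  (hypothetical helper)
# Prints "<block> <earlier identical block>" for every duplicate found.
find_dup_blocks() {
    local img=$1 bs=${2:-4096}
    local size nblocks n sum prev
    local -A first_seen                    # checksum -> first block number

    size=$(wc -c < "$img")
    nblocks=$(( (size + bs - 1) / bs ))

    for (( n = 0; n < nblocks; n++ )); do
        # Cheap step: the checksum only nominates a candidate.
        sum=$(dd if="$img" bs="$bs" skip="$n" count=1 2>/dev/null |
              cksum | cut -d' ' -f1)
        prev=${first_seen[$sum]:-}
        if [ -z "$prev" ]; then
            first_seen[$sum]=$n
        # Expensive step: confirm the candidate byte by byte.
        elif cmp -s <(dd if="$img" bs="$bs" skip="$prev" count=1 2>/dev/null) \
                    <(dd if="$img" bs="$bs" skip="$n" count=1 2>/dev/null); then
            echo "$n $prev"
        fi
    done
}
```

A block is only compared with the first block that produced the same
checksum, so this sketch can miss a duplicate that hides behind a
checksum collision; a real implementation would keep a list per
checksum.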

See also: http://thread.gmane.org/gmane.comp.file-systems.btrfs/2922

                Thomas

PS: It seems that NetApp already has the above in a product. They
have the ability to dedup blocks on WAFL and they also have a feature
that allows keeping an off-site duplicate of the filesystem.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: zero out blocks of freed user data for operation a virtual machine environment
  2009-05-24 17:00 zero out blocks of freed user data for operation a virtual machine environment Thomas Glanzmann
  2009-05-24 17:15 ` Arjan van de Ven
  2009-05-25  3:29 ` David Newall
@ 2009-05-25  7:48 ` Ron Yorston
  2009-05-25 10:50   ` Thomas Glanzmann
  2009-05-25 12:06 ` Theodore Tso
  2009-05-25 21:19 ` Bill Davidsen
  4 siblings, 1 reply; 22+ messages in thread
From: Ron Yorston @ 2009-05-25  7:48 UTC (permalink / raw)
  To: tytso, thomas; +Cc: linux-kernel, linux-ext4

I've written a tool to zero freed blocks in ext2/ext3 filesystems, as well
as a (half-baked) kernel patch.  Details here:

   http://intgat.tigress.co.uk/rmy/uml/sparsify.html

Ron

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: zero out blocks of freed user data for operation a virtual machine environment
  2009-05-25  7:48 ` Ron Yorston
@ 2009-05-25 10:50   ` Thomas Glanzmann
  0 siblings, 0 replies; 22+ messages in thread
From: Thomas Glanzmann @ 2009-05-25 10:50 UTC (permalink / raw)
  To: Ron Yorston; +Cc: tytso, linux-kernel, linux-ext4

Hello Ron,

* Ron Yorston <rmy@tigress.co.uk> [090525 09:49]:
> I've written a tool to zero freed blocks in ext2/ext3 filesystems, as well
> as a (half-baked) kernel patch.  Details here:

>    http://intgat.tigress.co.uk/rmy/uml/sparsify.html

nice work! While we are talking about sparse files: do you know if there
is an option for qcow2 to reclaim zeroed-out blocks (like a sparse in
userland)? I hope that this functionality hits upstream. It could also
be used to provide secure file deletion.

        Thomas

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: zero out blocks of freed user data for operation a virtual machine environment
  2009-05-24 17:00 zero out blocks of freed user data for operation a virtual machine environment Thomas Glanzmann
                   ` (2 preceding siblings ...)
  2009-05-25  7:48 ` Ron Yorston
@ 2009-05-25 12:06 ` Theodore Tso
  2009-05-25 21:19 ` Bill Davidsen
  4 siblings, 0 replies; 22+ messages in thread
From: Theodore Tso @ 2009-05-25 12:06 UTC (permalink / raw)
  To: Thomas Glanzmann, tytso, LKML, linux-ext4

On Sun, May 24, 2009 at 07:00:45PM +0200, Thomas Glanzmann wrote:
> Hello Ted,
> I would like to know if there is already a mount option or feature in
> ext3/ext4 that automatically overwrites freed blocks with zeros? If this
> is not the case I would like to know if you would consider a patch for
> upstream? I'm asking this because I currently do some research work on
> data deduplication in virtual machine environments and corresponding
> backups. It would be a huge space saver if there is such a feature
> because todays and tomorrows backup tools for virtual machine
> environments work on the block layer (VMware Consolidated Backup, VMware
> Data Recovery, and NetApp Snapshots). This is not only true for backup
> tools but also for running Virtual machines. The case that this future
> addresses is the following: A huge file is downloaded and later delted.
> The backup and datadeduplication that is operating on the block level
> can't identify the block as unused. This results in backing up the
> amount of the data that was previously allocated by the file and as such
> introduces an performance overhead. If you're interested in real live
> data, I'm able to provide them.

If you are planning to use this on production systems, forcing the
filesystem to zero out blocks to determine whether or not they are in
use is a terrible idea.  The performance hit it would impose would
probably not be tolerated by most users.  

It would be much better to design a system interface which allowed a
userspace program to be given a list of blocks that are in use given a
certain block range.  That way said userspace program could easily
determine whether or not a particular block is in use or not.
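
For an ext2/ext3/ext4 filesystem, something close to this can already
be approximated from userspace, because dumpe2fs prints the free block
ranges of every block group. The following is a rough sketch, not the
interface proposed above: free_block_ranges is a made-up helper, and it
assumes that the per-group lines are indented and look like
"  Free blocks: 1234-5678, 6000", which may differ between e2fsprogs
versions.

```bash
#!/bin/bash
# free_block_ranges (hypothetical helper): read dumpe2fs output on stdin
# and print one "first last" pair per free extent.
free_block_ranges() {
    awk '/^ +Free blocks: / {
        sub(/^ +Free blocks: */, "")        # keep only the range list
        n = split($0, ranges, /, */)
        for (i = 1; i <= n; i++) {
            if (ranges[i] == "") continue   # group without free blocks
            m = split(ranges[i], ends, "-")
            # A single free block is printed as a one-block range.
            print ends[1], (m > 1 ? ends[2] : ends[1])
        }
    }'
}
```

It would be fed with something like "dumpe2fs /dev/sdXN 2>/dev/null |
free_block_ranges", where /dev/sdXN is a placeholder. On a mounted
filesystem the on-disk bitmaps can be stale, so the result is only
trustworthy for an unmounted filesystem or a snapshot.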

	  	     	   	      	    - Ted

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: zero out blocks of freed user data for operation a virtual machine environment
  2009-05-24 17:00 zero out blocks of freed user data for operation a virtual machine environment Thomas Glanzmann
                   ` (3 preceding siblings ...)
  2009-05-25 12:06 ` Theodore Tso
@ 2009-05-25 21:19 ` Bill Davidsen
  2009-05-26  4:45   ` Thomas Glanzmann
  4 siblings, 1 reply; 22+ messages in thread
From: Bill Davidsen @ 2009-05-25 21:19 UTC (permalink / raw)
  To: Thomas Glanzmann, tytso, LKML, linux-ext4

Thomas Glanzmann wrote:
> Hello Ted,
> I would like to know if there is already a mount option or feature in
> ext3/ext4 that automatically overwrites freed blocks with zeros? If this
> is not the case I would like to know if you would consider a patch for
> upstream? I'm asking this because I currently do some research work on
> data deduplication in virtual machine environments and corresponding
> backups. It would be a huge space saver if there is such a feature
> because todays and tomorrows backup tools for virtual machine
> environments work on the block layer (VMware Consolidated Backup, VMware
> Data Recovery, and NetApp Snapshots). This is not only true for backup
> tools but also for running Virtual machines. The case that this future
> addresses is the following: A huge file is downloaded and later delted.
> The backup and datadeduplication that is operating on the block level
> can't identify the block as unused. This results in backing up the
> amount of the data that was previously allocated by the file and as such
> introduces an performance overhead. If you're interested in real live
> data, I'm able to provide them.
> 
> If you don't intend to have such an optional feature in ext3/ext4 I
> would like to know if you know a tool that makes it possible to zero out
> unused blocks?
> 
Treating blocks as unused because of their content seems a bad idea. If you
want them to be unused, look for references to TRIM; if you want this for
security, look at shred. And if you are interested in backing up sparse files,
I believe that the tar "-S" option will do what you want or provide code you
can use to start writing what you want.

I don't think this is a good solution to the problem that unused space is not 
accounted as you wish it would be. Most filesystems have a bitmap to track this 
already, a handle on that would be more generally useful.

Deleting files is slow enough, identifying unused storage by content is 1950s 
thinking, and also ignores the fact that new drives often don't come zeroed, and 
would behave badly unless you manually zeroed the unused portions.

I doubt this is the optimal solution, since you would have to read the zeros to 
see if they were present, making backup smaller but no faster than just doing a 
copy.

-- 
Bill Davidsen <davidsen@tmr.com>
   "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: zero out blocks of freed user data for operation a virtual machine environment
  2009-05-25 21:19 ` Bill Davidsen
@ 2009-05-26  4:45   ` Thomas Glanzmann
  0 siblings, 0 replies; 22+ messages in thread
From: Thomas Glanzmann @ 2009-05-26  4:45 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: tytso, LKML, linux-ext4

Bill,
I think you didn't read what I wrote, so here it is again: My
applications are VMs. Every disk that you give to a VM is zeroed out, one
way or the other: one way is to use dd or something that has the same
effect, the other is to use a sparse file. That is guaranteed. As soon
as you start working in this VM, it is no longer guaranteed, because for
real-life applications it makes limited sense to zero out freed blocks
(except maybe if you have a SAN LUN exported from a storage device that
supports data deduplication, or if you want deleted files to disappear
from your block device). Today's data deduplication and backup solutions
for VMs depend on the property that unused data blocks are zeroed out.
And actually I can't think of an easier interface. As I proposed earlier,
if you don't like it for performance reasons, that's fine, but if you
have to back up 5.6 terabytes instead of 17 terabytes, then this is a
huge space saver even with the performance overhead involved.
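
How much a zero-aware backup could skip on a given disk image can be
estimated with a small script. This is only a sketch under assumptions:
count_zero_blocks is a made-up name, the image is a plain file, and the
block size defaults to an assumed 4096 bytes.

```bash
#!/bin/bash
# count_zero_blocks IMAGE [BLOCKSIZE]  (hypothetical helper)
# Prints "<all-zero blocks> <total blocks>" for the whole blocks of IMAGE.
count_zero_blocks() {
    local img=$1 bs=${2:-4096}
    local size nblocks n zero=0

    size=$(wc -c < "$img")
    nblocks=$(( size / bs ))               # a trailing partial block is ignored

    for (( n = 0; n < nblocks; n++ )); do
        # Compare block n with the same number of bytes from /dev/zero.
        if dd if="$img" bs="$bs" skip="$n" count=1 2>/dev/null |
               cmp -s - <(head -c "$bs" /dev/zero); then
            zero=$(( zero + 1 ))
        fi
    done
    echo "$zero $nblocks"
}
```

Running it on an image before and after zeroing the freed blocks shows
the difference that a block-level backup or deduplication tool would
see. It reads every block once, so it is meant for measurements, not
for production use.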

        Thomas

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2009-05-28 19:27 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-05-24 17:00 zero out blocks of freed user data for operation a virtual machine environment Thomas Glanzmann
2009-05-24 17:15 ` Arjan van de Ven
2009-05-24 17:39   ` Thomas Glanzmann
2009-05-25 12:03     ` Theodore Tso
2009-05-25 12:34       ` Thomas Glanzmann
2009-05-25 13:14         ` Goswin von Brederlow
2009-05-25 14:01           ` Thomas Glanzmann
     [not found]           ` <f3177b9e0905251023n762b815akace1ae34e643458e@mail.gmail.com>
2009-05-25 17:26             ` Chris Worley
2009-05-25 17:26               ` Chris Worley
2009-05-26 10:22             ` Goswin von Brederlow
2009-05-26 10:22               ` Goswin von Brederlow
2009-05-26 16:52               ` Chris Worley
2009-05-26 16:52                 ` Chris Worley
2009-05-28 19:27                 ` Goswin von Brederlow
2009-05-28 19:27                   ` Goswin von Brederlow
2009-05-25  3:29 ` David Newall
2009-05-25  5:26   ` Thomas Glanzmann
2009-05-25  7:48 ` Ron Yorston
2009-05-25 10:50   ` Thomas Glanzmann
2009-05-25 12:06 ` Theodore Tso
2009-05-25 21:19 ` Bill Davidsen
2009-05-26  4:45   ` Thomas Glanzmann
