From: Chris Worley <worleys@gmail.com>
To: Goswin von Brederlow <goswin-v-b@web.de>
Cc: LKML <linux-kernel@vger.kernel.org>, linux-ext4@vger.kernel.org
Subject: Re: zero out blocks of freed user data for operation a virtual  machine environment
Date: Tue, 26 May 2009 10:52:21 -0600
Message-ID: <f3177b9e0905260952x2e382d9ana02dcd10a5bcc63d@mail.gmail.com>
In-Reply-To: <87ab50p3ip.fsf@frosties.localdomain>

On Tue, May 26, 2009 at 4:22 AM, Goswin von Brederlow <goswin-v-b@web.de> wrote:
> Chris Worley <worleys@gmail.com> writes:
>
>> On Mon, May 25, 2009 at 7:14 AM, Goswin von Brederlow <goswin-v-b@web.de>
>> wrote:
>>
>>> Thomas Glanzmann <thomas@glanzmann.de> writes:
>>>
>>>> Hello Ted,
>>>>
>>>>> Yes, it does, sb_issue_discard().  So if you wanted to hook into this
>>>>> routine with a function which issued calls to zero out blocks, it
>>>>> would be easy to create a private patch.
>>>>
>>>> that sounds good because it wouldn't only target the most used
>>>> filesystem but every other filesystem that uses the interface as well.
>>>> Do you think that a tunable or configurable patch has a chance to hit
>>>> upstream as well?
>>>>
>>>>         Thomas
>>>
>>> I could imagine a device mapper target that eats TRIM commands and
>>> writes out zeroes instead. That should be easy to maintain outside or
>>> inside the upstream kernel source.
>>
>> Why bother with a time-consuming performance-draining operation?  There are
>> devices that already support TRIM/discard commands today, and once you discard
>> a block, it's completely irretrievable (you'll just get back zeros if you try
>> to read that block w/o writing it after the discard).
>> Chris
>

I do enjoy a good argument... and don't mean this as a flame (I'm told
I obliviously write curtly)...

Old man's observation: I've found that the people you would think
would readily embrace a new technology are as terrified of change as a
Windows user, and always find so many excuses for "why change won't
work for them" ;)

> Because you have one of the billions of devices that don't.

There are devices that _do_ work now; those should be your selection if
you want both this functionality and high performance.  If you don't
want performance, write zeros to rotating media.
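
To make the zero-writing alternative concrete, here is a minimal
user-space sketch of a layer that swallows discard requests and writes
zeros instead.  It is a Python toy over an in-memory buffer, not a
device-mapper target or a kernel patch; ZeroOnDiscard, BLOCK_SIZE and
the method names are invented for the illustration.

```python
import io

BLOCK_SIZE = 4096  # assumed block size for this sketch


class ZeroOnDiscard:
    """Wraps a seekable binary file and turns discards into zero writes."""

    def __init__(self, backing):
        self.backing = backing

    def write(self, block_no, data):
        self.backing.seek(block_no * BLOCK_SIZE)
        self.backing.write(data)

    def read(self, block_no, count=1):
        self.backing.seek(block_no * BLOCK_SIZE)
        return self.backing.read(count * BLOCK_SIZE)

    def discard(self, block_no, count):
        # Instead of passing the discard down, overwrite the range with zeros.
        self.backing.seek(block_no * BLOCK_SIZE)
        self.backing.write(bytes(count * BLOCK_SIZE))


# An 8-block "device" held in memory.
dev = ZeroOnDiscard(io.BytesIO(bytes(8 * BLOCK_SIZE)))
dev.write(2, b"\xaa" * BLOCK_SIZE)
dev.discard(2, 1)
zeroed = dev.read(2) == bytes(BLOCK_SIZE)
```

The cost argued about in this thread is visible in discard(): every
freed block becomes a full write to the backing store.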

The time frame given in this thread is two years.  In 2-5 years,
rotating media will be history.  The tip of the Linux kernel should
not be focused on defunct technology.

>
> Because, iirc, the specs say nothing about getting back zeros.
>

But all a Solid State Storage controller can do is give you garbage
when asked for an unwritten or discarded block; it doesn't know where
the data is, which is all that is needed for the functionality desired
(there's no need to specify exactly what a controller should return
when asked to read a block it knows nothing about).  Once the
controller is no longer managing a block, there is no way for it to
retrieve that block.  That's what TRIM is all about: get the greatest
performance by allowing the SSS controller to manage as few blocks as
absolutely necessary.  Not being able to retrieve valid data for an
unwritten or discarded block is a side effect of TRIM that fits this
desired functionality well.

From drives I've tested so far, the de-facto standard is "zero" when
reading unmanaged blocks.
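
The behaviour described above can be pictured with a toy model of a
controller's mapping table.  This is only a sketch under assumptions:
ToyFTL, PAGE_SIZE and the "unmapped reads as zeros" rule are
illustrative choices, not any vendor's implementation and not
something the specs promise.

```python
PAGE_SIZE = 4096  # assumed page size


class ToyFTL:
    """Toy flash translation layer: LBA -> physical page mapping."""

    def __init__(self, num_pages):
        self.flash = [None] * num_pages  # physical pages; None = never written
        self.map = {}                    # LBA -> physical page index
        self.next_free = 0

    def write(self, lba, data):
        # Flash is not overwritten in place: always take a fresh page.
        page = self.next_free
        self.next_free += 1
        self.flash[page] = data
        self.map[lba] = page

    def trim(self, lba):
        # Forget the mapping; the stale page waits for garbage collection.
        self.map.pop(lba, None)

    def read(self, lba):
        page = self.map.get(lba)
        if page is None:
            return bytes(PAGE_SIZE)      # assumption: unmanaged blocks read as zeros
        return self.flash[page]


ftl = ToyFTL(num_pages=16)
ftl.write(7, b"secret".ljust(PAGE_SIZE, b"\0"))
ftl.trim(7)
after_trim = ftl.read(7)
```

After trim() the model has no way to find the old page through
read(), even though the bytes still sit in ftl.flash.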

> Because someone could read the raw data from disk and recover your
> state secrets.

Water-boarding won't help... the controller simply doesn't know the
information you demand.

This isn't your grandfather's rotating media...

You would have to read at the Erase Block level, and know the specific
vendor implementation's EB layout and block-level
mapping/metadata/wear-leveling strategy (i.e. very tightly held IP).
Controllers don't provide the functionality to request raw EB's; there
is no way to read raw EB's.  There is no spec in existence for
reading EB's from a SCSI/SAS/SATA/block device.  Your only recourse
would be to pull the NAND chips physically off the drive and weld them
to another piece of hardware specifically designed to blindly read all
the erase blocks, then try to infer the manufacturer's chip
organization as well as the block-level metadata.  Even then you'd only
know all the active blocks (which you would have known anyway, before
you pulled the chips off) and would have to come up with some strategy
for figuring out the original LBA's for all the inactive data... so
there _is_ a very small chance of recovery, lacking physical
security... there are worse issues too when physical security is not
available on site (i.e. all your active data would be vulnerable, as
with any mechanical drive).

Of concern to those handling state secrets: there is no guarantee in
SSS that writing whatever pattern over and over again will physically
overwrite the targeted LBA.  New methods of "declassifying" SSS drives
will be necessary (i.e. a Secure Erase where the controller is told to
erase all EB's... so your NAND EB reading device will read all ones no
matter what EB is read).  These methods are simple enough to develop,
but those who care about this should be aware that the old rotating
media methods no longer apply.
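
A small sketch of why repeated overwrites are not enough, and what an
erase-everything operation changes.  The model is deliberately crude
and entirely hypothetical: one byte string per page, a write always
lands in the next erased page, and overwrite_lba() and secure_erase()
are names made up here, not controller commands.

```python
ERASED = b"\xff"  # erased NAND cells read back as all ones


def overwrite_lba(flash, mapping, lba, data):
    """Write data for lba to the next erased page, as a log-style FTL might."""
    page = flash.index(ERASED)
    flash[page] = data
    mapping[lba] = page


def secure_erase(flash, mapping):
    """Erase every page and drop all mappings."""
    for i in range(len(flash)):
        flash[i] = ERASED
    mapping.clear()


flash = [ERASED] * 8
mapping = {}
overwrite_lba(flash, mapping, 0, b"secret")
for _ in range(3):
    overwrite_lba(flash, mapping, 0, b"zeros")  # "wipe" passes on the same LBA

# The wipe passes went to new pages; the first copy is still on the chips.
stale_copy_survives = b"secret" in flash

secure_erase(flash, mapping)
all_erased = all(page == ERASED for page in flash)
```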

> Because loopback don't support TRIM and compression of the image file
> is much better with zeroes.

Wouldn't it be best if the block is not in existence after the
discard?  Then there would be nothing to compress, and I believe
"nothing" compresses very compactly.
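
A rough way to compare the three cases for an image file: freed blocks
that still hold old data, freed blocks that were zeroed, and freed
blocks that are simply not stored.  This is a sketch with made-up
sizes; random bytes stand in for real file data, and zlib stands in
for whatever compressor is actually used on the image.

```python
import os
import zlib

BLOCK = 4096                      # assumed block size
live = os.urandom(BLOCK)          # stands in for blocks still in use
stale = os.urandom(16 * BLOCK)    # freed blocks that still hold old data
zeroed = bytes(16 * BLOCK)        # the same freed blocks after zeroing

size_stale = len(zlib.compress(live + stale))
size_zeroed = len(zlib.compress(live + zeroed))
size_dropped = len(zlib.compress(live))  # freed blocks not stored at all
```

Both the zeroed and the dropped image come out far smaller than the
one carrying stale data; the dropped one also avoids writing the zeros
in the first place.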

>
> Because on a crypted device TRIM would show how much of the device is
> in used while zeroing out (before crypting) would result in random
> data.

TRIM doesn't tell you how much of the drive is used?
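
The concern in the quoted text can be illustrated by looking at what
lands on the backing device in each case.  This is a sketch only:
toy_encrypt() is a hypothetical stand-in built from a SHA-256
keystream, not dm-crypt and not real cryptography, and it assumes the
drive returns zeros for discarded blocks.

```python
import hashlib

BLOCK = 4096  # assumed block size


def toy_encrypt(key, block_no, plaintext):
    """XOR with a SHA-256 counter keystream. Illustration only, not real crypto."""
    stream = b""
    counter = 0
    while len(stream) < len(plaintext):
        stream += hashlib.sha256(
            key + block_no.to_bytes(8, "little") + counter.to_bytes(8, "little")
        ).digest()
        counter += 1
    return bytes(p ^ s for p, s in zip(plaintext, stream))


key = b"example key"

# Freed block 5 zeroed above the encryption layer: the backing device
# holds ciphertext that looks like any other used block.
zeroed_then_encrypted = toy_encrypt(key, 5, bytes(BLOCK))

# Freed block 5 discarded and passed through: the backing device reads zeros.
discarded = bytes(BLOCK)

looks_used = zeroed_then_encrypted != bytes(BLOCK)
reveals_free = discarded == bytes(BLOCK)
```

So the information leaked is not a usage counter from TRIM itself but
the pattern of zero-reading blocks someone could see on the raw device.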

>
> Because it is fun?

You've got me there.  To each his own.

>
> So many reasons.

...to switch from the old rotating media to SSS ;)

Chris
>
> MfG
>        Goswin
>
