Re: [patch] Add design document for UBIFS secure deletion

From: Artem Bityutskiy <dedekind1@gmail.com>
To: Joel Reardon <joel@clambassador.com>
Cc: linux-mtd@lists.infradead.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [patch] Add design document for UBIFS secure deletion
Date: Fri, 23 Mar 2012 17:38:52 +0200
Message-ID: <1332517132.18717.102.camel@sauron.fi.intel.com>
In-Reply-To: <alpine.DEB.2.00.1203231448520.22944@eristoteles.iwoars.net>

I've pushed this patch to the joel branch, but still have comments. Feel
free to send incremental changes - I'll just squash them in.

On Fri, 2012-03-23 at 14:50 +0100, Joel Reardon wrote:
> +Introduction
> +============
> +UBIFSec provides efficient secure deletion for the flash file system UBIFS.
> +Trivial secure deletion by overwriting the deleted data does not work for
> +UBI-accessed flash memory, as there is a large difference between the size of
> +the I/O unit (page) and the erasure unit (logical erase block or LEB).
> +UBIFSec encrypts each data node with a distinct key and stores the keys
> +colocated in a key storage area (KSA).  Secure deletion is achieved by
> +updating the (small) set of LEBs that constitute the KSA to remove keys
> +corresponding to deleted data, thereby deleting the data nodes they encrypted.
> +The additional wear due to flash erasure is small, only the LEBs containing
> +the keys, and the operation of removing old keys---called purging---is done
> +periodically so the actual increase in wear is controlled by the user.
> +Moreover, the use of UBI's logical interface means that the additional wear is
> +evenly spread over the flash memory and the new version of a LEB containing
> +the keys can be written using UBI's atomic update procedure to ensure no keys
> +are lost during an update.

How about: s/during an update/in case of a power cut/

> +Key Storage Area (KSA)
> +======================
> +UBIFSec uses a small set of LEBs to store all the data node's keys---this set
> +is called the Key Storage Area (KSA). The KSA is managed separately from the
> +rest of the file system.

> In particular, it does not behave like a
> +log-structured file system: when a KSA LEB is updated, its contents are
> +written to a new physical location on the flash memory, UBI's logical map is
> +then updated to this new physical address and the previous version of the KSA
> +LEB is then erased.  

Am I right that you basically wanted to say that when you update a KSA
LEB, you make sure that the physical flash does not contain the old
contents of this LEB? Would be nice to re-phrase.

Side question - how do you do this?

> Thus, except while updating the KSA, only one copy of the
> +data in the KSA is available on the storage medium.  When the file system is
> +created, cryptographically-suitable random data is written from random_bytes()
> +to each of the KSA's LEBs and all the keys are marked as unused. Purging
> +writes new versions of the KSA LEBs using UBI's atomic update feature.

Just a general question about this: I guess you have to call ubi_sync()
afterwards to make sure the old version is actually erased, right?
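
To make the question concrete, here is a minimal user-space model of
the update sequence described above: write the new contents to a free
block, switch the logical map, then erase the old block. It is only a
sketch and not UBI code; struct toy_flash, toy_ksa_update() and the
sizes are made up for illustration, and the erase is done synchronously
here, which is exactly the property I am asking about.

```c
#include <assert.h>
#include <string.h>

#define NUM_PEBS 4
#define LEB_SIZE 16
#define ERASED   0xFF	/* erased flash reads back as all ones */

/* Toy flash: a few physical erase blocks and a one-entry logical map. */
struct toy_flash {
	unsigned char peb[NUM_PEBS][LEB_SIZE];
	int ksa_pnum;	/* physical block currently backing the KSA LEB */
};

static void toy_flash_init(struct toy_flash *f)
{
	memset(f->peb, ERASED, sizeof(f->peb));
	f->ksa_pnum = 0;
}

/* Returns 1 if every byte of the physical block is in the erased state. */
static int toy_peb_is_erased(const struct toy_flash *f, int pnum)
{
	for (int i = 0; i < LEB_SIZE; i++)
		if (f->peb[pnum][i] != ERASED)
			return 0;
	return 1;
}

/*
 * Update the KSA LEB. The old copy stays intact until the new one is
 * fully written, so a power cut cannot lose keys. The old block is
 * erased before returning, so only one copy remains afterwards.
 * Returns 0 on success, -1 if no free block is available.
 */
static int toy_ksa_update(struct toy_flash *f, const unsigned char *buf)
{
	int old = f->ksa_pnum;
	int dst = -1;

	assert(f != NULL && buf != NULL);

	for (int i = 0; i < NUM_PEBS; i++) {
		if (i != old && toy_peb_is_erased(f, i)) {
			dst = i;
			break;
		}
	}
	if (dst < 0)
		return -1;

	memcpy(f->peb[dst], buf, LEB_SIZE);	/* 1. write new version */
	f->ksa_pnum = dst;			/* 2. switch the mapping */
	memset(f->peb[old], ERASED, LEB_SIZE);	/* 3. erase old version */
	return 0;
}
```

After two calls to toy_ksa_update() with different buffers, only the
second buffer should be found anywhere in f.peb. If step 3 were
deferred to a background thread, the old keys would stay readable
until that thread ran.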

> +Each data node's header stores the logical KSA position that contains its
> +decryption key. The LEBs in the KSA are periodically erased to securely delete
> +any keys that decrypt deleted data. When the file system no longer needs a
> +data node---i.e., it is removed or updated---we mark the data node's
> +corresponding key in the KSA as deleted.  This is independent of the notion of
> +files; keys are marked as deleted whenever a data node is discarded.  A key
> +remains marked as deleted until it is removed from the storage medium and its
> +location is replaced with fresh, unused random data, which is then marked as
> +unused.

Read this far, and have 2 big questions:

1. How are keys marked as deleted (I guess in some in-memory data
structure)?
2. When are deleted keys removed from the medium (probably on commit)?

I guess I'll find the answers below.

> +When a new data node is written to the storage medium, an unused key is
> +selected from the KSA and its position is written to the data node's header.

The question arises - how is the key selected?

> +The keys are in a protected area of the file system, so only users with root
> +access to the storage medium are capable of reading the keys that encrypt
> +data.

Hmm, what is the protected area? What prevents anyone from reading the
keys by finding them in /dev/ubiX_Y or /dev/mtdZ?

> +Purging
> +=======
> +Purging is a periodic procedure that securely deletes keys from the KSA.

I guess you mean deleted keys?

> +Purging proceeds iteratively over each of the KSA's LEBs: a new version of the
> +LEB is prepared where the used keys remain in the same position and all other
> +keys (i.e., unused and deleted keys) are replaced with fresh, unused,
> +cryptographically-appropriate random data from a source of hardware
> +randomness.

Hmm, why is it necessary to re-initialize unused keys?

>  This fresh random data is then assigned to new keys as needed. We
> +keep used keys logically-fixed because their corresponding data node has
> +already written its logical position. The new version of the LEB is then
> +written to an arbitrary empty LEB on the storage medium.  After completion,
> +the LEB containing the old version is erased, thus securely deleting the
> +unused and deleted keys along with the data nodes they encrypt.
> +
> +If a KSA LEB becomes a bad block while erasing it, it is possible that its
> +contents will remain readable on the storage medium without the ability to
> +remove them. In this case, it is necessary to re-encrypt any data node whose
> +encryption key remains available and force the garbage collection of those
> +LEBs on which the data nodes reside.

Good point. UBI will always try to erase and re-write a PEB several
times before marking it as bad, so hopefully the keys will disappear,
but there is no guarantee.
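
For reference, this is how I read the per-LEB purging step, as a small
user-space sketch. It is not taken from the patch: KEY_SIZE,
KEYS_PER_LEB, purge_ksa_leb() and the callback type are assumptions,
and toy_random() is only a placeholder built on rand(), which is not
cryptographically suitable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define KEY_SIZE     16	/* assumed key size in bytes */
#define KEYS_PER_LEB 8	/* assumed, tiny on purpose */

enum key_state { KEY_UNUSED, KEY_USED, KEY_DELETED };

/* Source of fresh key material, passed in by the caller. */
typedef void (*rng_fn)(unsigned char *buf, size_t len);

/* Placeholder only: rand() is NOT a cryptographic generator. */
static void toy_random(unsigned char *buf, size_t len)
{
	for (size_t i = 0; i < len; i++)
		buf[i] = (unsigned char)(rand() & 0xFF);
}

/*
 * Build the new version of one KSA LEB in new_leb. Used keys keep
 * their position, because data node headers refer to it. Unused and
 * deleted keys are replaced with fresh random data and become unused.
 * Writing new_leb out and erasing the old copy is left to the caller.
 */
static void purge_ksa_leb(const unsigned char *old_leb,
			  unsigned char *new_leb,
			  enum key_state *state, rng_fn fill_random)
{
	assert(old_leb && new_leb && state && fill_random);

	for (size_t i = 0; i < KEYS_PER_LEB; i++) {
		unsigned char *dst = new_leb + i * KEY_SIZE;

		if (state[i] == KEY_USED) {
			memcpy(dst, old_leb + i * KEY_SIZE, KEY_SIZE);
		} else {
			fill_random(dst, KEY_SIZE);
			state[i] = KEY_UNUSED;
		}
	}
}
```

A caller would invoke purge_ksa_leb(old, new, state, toy_random) for
each KSA LEB and then write the new buffer with an atomic LEB update.
The bad block case discussed above is not modelled here.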

> +Key State Map
> +=============
> +The key state map is an in-memory map that maps key positions to key states
> +{unused, used, deleted}. 

Is it 2 bits per key?

> Unused keys can be assigned and then marked used.
> +Used keys are keys that encrypt some valid data node, so they must be
> +preserved to ensure availability of the file system's data. Deleted keys are
> +keys used to encrypt deleted data---i.e., data nodes that are no longer
> +referenced by the index---and should be purged from the system to achieve
> +secure deletion.
> +
> +A correct key state map is one that has the following three properties:
> +1. every unused key must not decrypt any data node---either valid or invalid
> +2. every used key must have exactly one data node it can decrypt and this data
> +node must be valid according to the index
> +3. every deleted key must not decrypt any data node that is valid according to
> +the index.

I guess you do not enforce these rules, you just rely on the randomness?
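
If it is indeed 2 bits per key, I would imagine the in-memory map
looking roughly like the sketch below. This is my guess at a layout,
not what the patch does: struct key_state_map and the ksm_*() helpers
are hypothetical names, and ksm_alloc() simply takes the first unused
position, which may or may not match your key selection policy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

enum key_state { KEY_UNUSED = 0, KEY_USED = 1, KEY_DELETED = 2 };

/* Packed map: four key states per byte, two bits each. */
struct key_state_map {
	uint8_t *bits;
	size_t nkeys;
};

/* All keys start out unused (calloc zeroes the map). */
static int ksm_init(struct key_state_map *m, size_t nkeys)
{
	m->bits = calloc((nkeys + 3) / 4, 1);
	m->nkeys = nkeys;
	return m->bits ? 0 : -1;
}

static enum key_state ksm_get(const struct key_state_map *m, size_t pos)
{
	assert(pos < m->nkeys);
	return (enum key_state)((m->bits[pos / 4] >> ((pos % 4) * 2)) & 0x3);
}

static void ksm_set(struct key_state_map *m, size_t pos, enum key_state s)
{
	unsigned int shift = (unsigned int)(pos % 4) * 2;
	unsigned int byte;

	assert(pos < m->nkeys);
	byte = m->bits[pos / 4];
	byte &= ~(0x3u << shift);		/* clear the old state */
	byte |= (unsigned int)s << shift;	/* store the new state */
	m->bits[pos / 4] = (uint8_t)byte;
}

/*
 * Pick the first unused key and mark it used. Returns the key
 * position, or -1 if the KSA has no unused key left.
 */
static long ksm_alloc(struct key_state_map *m)
{
	for (size_t pos = 0; pos < m->nkeys; pos++) {
		if (ksm_get(m, pos) == KEY_UNUSED) {
			ksm_set(m, pos, KEY_USED);
			return (long)pos;
		}
	}
	return -1;
}
```

Writing a data node would call ksm_alloc() and store the returned
position in the node header; discarding a node would call
ksm_set(m, pos, KEY_DELETED). Note that nothing here checks the three
properties above, the map only records what the caller tells it.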

> +
> +The operation of purging performed on a correct key state map guarantees
> +soundness: purging securely deletes any key in the KSA marked as
> +deleted---afterwards, every key either decrypts one valid data node or nothing
> +at all and every valid data node can be decrypted.  A correct key state map
> +also guarantees the integrity of our data during purging, because no key that
> +is used to decrypt valid data will be removed.
> +
> +The key state map is stored, used, and updated in volatile memory. Initially,
> +the key state map of a freshly-formatted UBIFSec file system is correct as it
> +consists of no data nodes, and every key is fresh random data that is marked
> +as unused. While mounted, UBIFSec performs appropriate key management to
> +ensure that the key state map is always correct when new data is written,
> +deleted, etc. We now show that we can always create a correct key state map
> +when mounting an arbitrary UBIFSec file system.

You'd also need to teach mkfs.ubifs to write a correct KSA.

> +The key state map is built from a periodic checkpoint combined with a replay
> +of the most recent changes while mounting.  We checkpoint the current key
> +state map to the storage medium whenever the KSA is purged.

, which happens when UBIFS commits?

>  After a purge,
> +every key is either unused or used, and so a checkpoint of this map can be
> +stored using one bit per key---less than 1% of the KSA's size---which is then
> +compressed.  A special LEB is used to store checkpoints, where each new
> +checkpoint is appended; when the LEB is full then the next checkpoint is
> +written at the beginning using an atomic update.

Hmm, it is interesting how you manage checkpoints. I guess the easiest
would be to have a special inode (not visible to users) and store
checkpoints there?
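
As a sanity check of the "one bit per key" claim, here is a sketch of
how I picture the checkpoint being packed. The function names are
hypothetical and the compression step is left out. If keys are 16
bytes (my assumption, the document does not say), one bit per key is
1/128 of the KSA, about 0.78%, which matches "less than 1%".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum key_state { KEY_UNUSED, KEY_USED, KEY_DELETED };

/* Bytes needed for an uncompressed checkpoint of nkeys keys. */
static size_t checkpoint_size(size_t nkeys)
{
	return (nkeys + 7) / 8;
}

/*
 * Pack the key state map into one bit per key (1 = used, 0 = unused).
 * This only works right after a purge, when no key is in the deleted
 * state, so a deleted key makes the function fail with -1.
 */
static int checkpoint_pack(const enum key_state *state, size_t nkeys,
			   uint8_t *out)
{
	assert(state && out);

	memset(out, 0, checkpoint_size(nkeys));
	for (size_t i = 0; i < nkeys; i++) {
		if (state[i] == KEY_DELETED)
			return -1;
		if (state[i] == KEY_USED)
			out[i / 8] |= (uint8_t)(1u << (i % 8));
	}
	return 0;
}

/* Rebuild the key state map from a checkpoint, e.g. while mounting. */
static void checkpoint_unpack(const uint8_t *in, size_t nkeys,
			      enum key_state *state)
{
	assert(in && state);

	for (size_t i = 0; i < nkeys; i++)
		state[i] = (in[i / 8] >> (i % 8)) & 1 ? KEY_USED : KEY_UNUSED;
}
```

At mount time the map would be rebuilt with checkpoint_unpack() and
then brought up to date by the replay. Where the packed buffer is
stored, a special LEB or a hidden inode, does not change this part.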

-- 
Best Regards,
Artem Bityutskiy
