linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* fscrypt: Howto resolve hash collisions?
@ 2016-10-05 14:00 Richard Weinberger
  2016-10-05 15:45 ` Theodore Ts'o
  0 siblings, 1 reply; 4+ messages in thread
From: Richard Weinberger @ 2016-10-05 14:00 UTC (permalink / raw)
  To: linux-fsdevel, linux-ext4; +Cc: Theodore Ts'o, Eric Biggers, David Gstir

Hi!

UBIFS uses the r5 hash algorithm for filenames and is able to resolve hash collisions.
Unless I miss something it is not possible to resolve hash collisions for bignames
in fscrypto.

UBIFS does in readdir():
fscrypt_fname_disk_to_usr(dir, key_hash_flash(c, &dent->key), 0, &nm.disk_name, &fstr);

Hence, it feeds its filename hash to fscrypto and when no key is present fscrypto
encodes that hash into a bigname starting with "_".
minor_hash is not set because UBIFS's hash has only 32bits.

Upon lookup UBIFS does:
fscrypt_setup_filename(dir, &dentry->d_name, 1, &nm);

For small names nm will contain the decoded name, UBIFS will compute the r5 hash,
does a lookup and compares whether the found directory entry matches the name.
It not, it will resolve the collision.
On the other hand, with bignames, nm will only contain hash and minor_hash.
UBIFS can do a lookup based on the hash value but it has no way to detect nor resolve
the collision since no name is present.

What do I miss? Are ext4 and f2fs not able to resolve hash collisions and therefore
nobody noticed?

Thanks,
//richard

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: fscrypt: Howto resolve hash collisions?
  2016-10-05 14:00 fscrypt: Howto resolve hash collisions? Richard Weinberger
@ 2016-10-05 15:45 ` Theodore Ts'o
  2016-10-08 14:33   ` Richard Weinberger
  0 siblings, 1 reply; 4+ messages in thread
From: Theodore Ts'o @ 2016-10-05 15:45 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: linux-fsdevel, linux-ext4, Eric Biggers, David Gstir

On Wed, Oct 05, 2016 at 04:00:36PM +0200, Richard Weinberger wrote:
> 
> UBIFS uses the r5 hash algorithm for filenames and is able to resolve hash collisions.
> Unless I miss something it is not possible to resolve hash collisions for bignames
> in fscrypto....
>
> What do I miss? Are ext4 and f2fs not able to resolve hash collisions and therefore
> nobody noticed?


"Resolve hash collisions" is a fairly generic term.  There are many
ways to resolve hash collisions, including linear probing --- which is
what ext4 uses.

Let's take a step back and understand what we're doing here.  The
requirement which we're trying to support here is the ability for root
to be able to delete encrypted files without having access to the key.
For example, imagine a ChromeOS device which is shared between
multiple users, and where each user's files are protected by a
different key.  This means that while user A is logged in, even if
there is a security exposure which allows a bad guy to execute code as
root, the files of user B and C are not at risk.  However, if user A
needs more disk space, one of the things things which a privileged
service can do is to delete files in user B's and user C's cache
directory, since cache files are safe to delete in order to make
space.

So in order to do this, we need to be able to allow root to be able to
run readdir on an encrypted directory, and get handles on directory
entries which can be then sent back into the kernel so that system
calls such as unlink(2) and opendir(2) can work.

For small files, we can just base-64 encode the encrypted directory,
and then send that back.  However, base-64 encoding expands the size
of the filename by (plus or minus) by 33%; and we can't send back a
filename longer than 255 bytes through the syscall interface.

To get around this, there is an alternate mechanism where the file
system can encode an encrypted filename, and that's to send back a
64-bit cookie to userspace via readdir(), and then when the user space
sends the base-64 encoded filename to unlink(2), fscrypt will either
provide the file system with the encrypted filename if it is short, or
with the 64-bit cookie.  Because the fscrypto code was originally
factored out of ext4, we use two 32-bit longs and name them "hash" and
"minor_hash", but they don't actually have to be a hash.

For example, POSIX requires that file systems supports telldir()
return a 64-bit cookie which will be valid in the face of intervening
additions and deletions, and this cookie must be able to be fed back
to seekdir() and resume the readdir() session from the location of the
last telldir().  So you might be able to use whatever scheme you use
to support telldir() and seekdir().  In retrospect, it probably would
have been better if, when we refactored the code, to have made the
interface be a 8 byte cookie.

It sounds like the main problem here is that Ubifs is using a 32-bit
hash, and so the chances of collisions are much higher --- 1 in 65536,
to be precise.  So the technique of using linear probing probably
wouldn't work as well for Ubifs.  However, feel free to use the minor
hash any which way you want.  The key is that you're starting with the
encrypted filename in readdir, so you also know where the file is located.

I'm not sure which mechanism of "collision resolution" you're using,
so at this point I can only make some conjections.  For example, are
you using the mechanism of hash(name || counter), where you keep
incrementing the counter until you find an empty slot in the your hash
tree?  If so, what you could do is to figure out what counter value
resulted in the location of the encrypted filename to that location in
the hash tree, and you could store that counter value in the
minor_hash field.

Finally, if it would help, we could change the fs/crypt interface so
that instead of using a 2x4 byte "cookie", we could make it be (for
example) a 128 bit cookie.  However, given that ubifs must have some
solution to the 64-bit telldir/seekdir cookie, maybe you can use that
mechanism for handling the encrypted readdir() functionality.

Hope this helps,

						- Ted

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: fscrypt: Howto resolve hash collisions?
  2016-10-05 15:45 ` Theodore Ts'o
@ 2016-10-08 14:33   ` Richard Weinberger
  2016-10-08 20:46     ` Theodore Ts'o
  0 siblings, 1 reply; 4+ messages in thread
From: Richard Weinberger @ 2016-10-08 14:33 UTC (permalink / raw)
  To: Theodore Ts'o; +Cc: linux-fsdevel, linux-ext4, Eric Biggers, David Gstir

Ted,

On 05.10.2016 17:45, Theodore Ts'o wrote:
> On Wed, Oct 05, 2016 at 04:00:36PM +0200, Richard Weinberger wrote:
>>
>> UBIFS uses the r5 hash algorithm for filenames and is able to resolve hash collisions.
>> Unless I miss something it is not possible to resolve hash collisions for bignames
>> in fscrypto....
>>
>> What do I miss? Are ext4 and f2fs not able to resolve hash collisions and therefore
>> nobody noticed?
> 
> 
> "Resolve hash collisions" is a fairly generic term.  There are many
> ways to resolve hash collisions, including linear probing --- which is
> what ext4 uses.

But hash collisions can happen and ext4 aborts then, right?
Some time ago I saw this:
http://blog.merovius.de/2013/10/20/ext4-mysterious-no-space-left-on.html

Which indicates that.

[...]

> For example, POSIX requires that file systems supports telldir()
> return a 64-bit cookie which will be valid in the face of intervening
> additions and deletions, and this cookie must be able to be fed back
> to seekdir() and resume the readdir() session from the location of the
> last telldir().  So you might be able to use whatever scheme you use
> to support telldir() and seekdir().  In retrospect, it probably would
> have been better if, when we refactored the code, to have made the
> interface be a 8 byte cookie.

Here starts the trouble. UBIFS does not support really telldir().
...and therefore also no NFS exports.

> It sounds like the main problem here is that Ubifs is using a 32-bit
> hash, and so the chances of collisions are much higher --- 1 in 65536,
> to be precise.  So the technique of using linear probing probably
> wouldn't work as well for Ubifs.  However, feel free to use the minor
> hash any which way you want.  The key is that you're starting with the
> encrypted filename in readdir, so you also know where the file is located.
> 
> I'm not sure which mechanism of "collision resolution" you're using,
> so at this point I can only make some conjections.  For example, are
> you using the mechanism of hash(name || counter), where you keep
> incrementing the counter until you find an empty slot in the your hash
> tree?  If so, what you could do is to figure out what counter value
> resulted in the location of the encrypted filename to that location in
> the hash tree, and you could store that counter value in the
> minor_hash field.

UBIFS accepts the fact that multiple directory entries can have the same
hash (also on flash). Upon lookup it computes hash(name), finds the first
directory entry with that hash _and_ compares the name. If the name
does not match we're facing a hash collision and lookup the next entry
with the same hash.

> Finally, if it would help, we could change the fs/crypt interface so
> that instead of using a 2x4 byte "cookie", we could make it be (for
> example) a 128 bit cookie.  However, given that ubifs must have some
> solution to the 64-bit telldir/seekdir cookie, maybe you can use that
> mechanism for handling the encrypted readdir() functionality.

Long story short, UBIFS has no solution to offer a telldir/seekdir cookie.
And I fear this time, for fscrypto, it really hurts.

Maybe it is acceptable to support only 200 char long filenames with enabled
encryption and always use base64 encoding.

Thanks,
//richard

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: fscrypt: Howto resolve hash collisions?
  2016-10-08 14:33   ` Richard Weinberger
@ 2016-10-08 20:46     ` Theodore Ts'o
  0 siblings, 0 replies; 4+ messages in thread
From: Theodore Ts'o @ 2016-10-08 20:46 UTC (permalink / raw)
  To: Richard Weinberger; +Cc: linux-fsdevel, linux-ext4, Eric Biggers, David Gstir

On Sat, Oct 08, 2016 at 04:33:29PM +0200, Richard Weinberger wrote:
> 
> But hash collisions can happen and ext4 aborts then, right?

No, in case of hash collisions we use linear probing, as I said.

I tested this a while back by using a hash function which always
returns 42, and it works just fine.

> Some time ago I saw this:
> http://blog.merovius.de/2013/10/20/ext4-mysterious-no-space-left-on.html

The blog author is confused.  I'm guessing what happened is that he's
using a file system with a 1k block size, and htree currently has a
limitation that the tree depth can be no more than two deep.  (Yes,
it's a hack.  The original htree implementation was done by Daniel
Phillips, and we never got around to fix it.)  In practice, for file
systems with a 4k block size, the fanout is so wide that it's not an
issue.

> UBIFS accepts the fact that multiple directory entries can have the same
> hash (also on flash). Upon lookup it computes hash(name), finds the first
> directory entry with that hash _and_ compares the name. If the name
> does not match we're facing a hash collision and lookup the next entry
> with the same hash.

OK, so that's simple.  It's what ext4 does as well.

The difference is that we use a 64-bit hash, and we normally only
store 32-bit on disk for lookup purposes.  On 32-bit systems, we use
the 32-bit hash for the telldir cookie, and we accept the fact that
there can be hash collisions that will cause telldir/seekdir may not
return the directory pointer back to the exact location.  We do the
same thing for NFSv2, which also only supports a 32-bit telldir
cookie.

But what we do is we use the *other* 32-bits of the 64-bit hash as the
"minor hash", and on 64-bit architectures, the telldir cookie uses the
high 32-bits to store the "major" hash, and use the low 32-bits to
store the "minor" hash.  The readdir(2) system call returns the
directory entries in 64-bit hash order, and when we get the telldir
cookie, we use the "minor" hash as part of the virtual 64-bit hash.

> Long story short, UBIFS has no solution to offer a telldir/seekdir cookie.
> And I fear this time, for fscrypto, it really hurts.

So you could do the same thing.  Just use another 32-bit hash and use
that as the "minor" hash.  Now when you do the lookup, if there are
multiple files that have the same 32-bit hash you can use the "minor"
hash to disambiguate.  The chances of a collision is at that point is
1 in 4,294,967,296.

The reason why I haven't bothered to do more is that in practice, for
our use case if you do an rm -rf, you might just end up deleting the
files in the opposite order, but it's not a problem in practice ---
and even if you did care, one in 4 billion are pretty good odds.  (For
context: your chances of getting killed by a falling coconut is one in
250 million; your chances of getting killed by shark attack is one in
300 million; your chances of getting killed by food poisioning is one
in 3 million; and your chances of getting killed by a terrorist is one
in 25 million.  Funny thing that US citizens tend to be more freaked
out about getting killed by a terrorist as opposed to food poisoning,
but no one ever said civilians were rational.)

If for some reason you want to do better than one in 4 billion, what
we could do as chose yet another 64-bit hash, and then store that
alongside the major and minor hashes, and then compare against the
64-bit hash plus the minor hash.  At that point the chances of failure
is one in 18,446,744,073,709,551,616.  This could be done without
making a on-disk format change, so if we ever wanted to make this
change, we could do it.


Cheers,

						- Ted

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2016-10-08 23:17 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-10-05 14:00 fscrypt: Howto resolve hash collisions? Richard Weinberger
2016-10-05 15:45 ` Theodore Ts'o
2016-10-08 14:33   ` Richard Weinberger
2016-10-08 20:46     ` Theodore Ts'o

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).