From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-pa0-f51.google.com ([209.85.220.51]:33904 "EHLO
    mail-pa0-f51.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
    with ESMTP id S1753816AbcITDFt (ORCPT );
    Mon, 19 Sep 2016 23:05:49 -0400
Received: by mail-pa0-f51.google.com with SMTP id wk8so1897624pab.1
    for ; Mon, 19 Sep 2016 20:05:49 -0700 (PDT)
From: Alex Elsayed
To: "Theodore Ts'o"
Cc: Chris Mason , linux-btrfs@vger.kernel.org
Subject: Re: Experimental btrfs encryption
Date: Mon, 19 Sep 2016 20:05:37 -0700
Message-ID: <2482412.ISOtnf003W@arkadios>
In-Reply-To: <20160920025041.mzeljxxzclikktxn@thunk.org>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Sender: linux-btrfs-owner@vger.kernel.org
List-ID:
References: <20160919151518.i2aon4axmfzt54rn@thunk.org>
    <20160920025041.mzeljxxzclikktxn@thunk.org>

Taking a stab at a different way of replying, to try and keep Ted in the
loop.

On Monday, 19 September 2016 22:50:41 PDT Theodore Ts'o wrote:
> On Mon, Sep 19, 2016 at 08:32:34PM -0400, Chris Mason wrote:
> > > > That key is used to protect the contents of the data file, and to
> > > > encrypt filenames and symlink targets --- since filenames can leak
> > > > significant information about what the user is doing.  (For example,
> > > > in the downloads directory of their web browser, leaking filenames is
> > > > just as good as leaking part of their browsing history.)
> >
> > One of the things that makes per-subvolume encryption attractive to me is
> > that we're able to enforce the idea that an entire directory tree is
> > encrypted by one key.  It can't be snapshotted again without the key, and
> > it just fits with the rest of the btrfs management code.  I do want to
> > support the existing vfs interfaces as well too though.
>
> One of the main reasons for doing fs-level encryption is so you can
> allow multiple users to have different keys.
> In some cases you can
> assume that different users will be in different distinct subvolumes
> (e.g., each user has their own home directory), but that's not always
> going to be possible.

Mm, that's definitely something to keep in mind.

> One of the other things that was in the original design, but which got
> dropped in our initial implementation, was the concept of having the
> per-inode key wrapped by multiple user keys.  This would allow a file
> to be accessible by more than one user.  So something to consider is
> that there may very well be situations where you *want* to have more
> than one key associated with a directory hierarchy.

Makes sense.

> > > The issue, here, is that inodes are fundamentally not a safe scope to
> > > attach that information to in btrfs. As extents can be shared between
> > > inodes (and thus both will need to decrypt them), and inodes can be
> > > duplicated unmodified (snapshots), attaching keys and nonces to inodes
> > > opens up a whole host of (possibly insoluble) issues, including
> > > catastrophic nonce reuse via writable snapshots.
> >
> > I'm going to have to read harder about nonce reuse.  In btrfs an inode is
> > really a pair [ root id, inode number ], so strictly speaking two writable
> > snapshots won't have the same inode in memory and when a snapshot is
> > modified we'd end up with a different nonce for the new modifications.
>
> Nonce reuse is not necessarily catastrophic.  It all depends on the
> context.  In the case of Counter or GCM mode, nonce (or IV) reuse is
> absolutely catastrophic.  It must *never* be done or you completely
> lose all security.  As the Soviets discovered the hard way courtesy of
> the Venona project (well, they didn't discover it until after they
> lost the cold war, but...) one time pads are completely secure.
> Two-time pads are most emphatically _not_.
> :-)

Aaaand now I wish I'd seen this before I sent my Big Ol' Mail Full of
References to Chris, so I could have tried this and kept you on CC.

> In the case of the nonces used in fscrypt's key derivation, reuse of
> the nonce basically means that two files share the same key.  Assuming
> you're using a competently designed block cipher (e.g., AES), reuse of
> the key is not necessarily a problem.  What it would mean is that two
> files which are reflinked would share the same key.  And if you
> have writable snapshots, that's definitely not a problem, since with
> AES we use a fixed key and a fixed IV given a logical block
> number, and we can do block overwrites without having to guarantee
> unique nonces (which you *do* need to worry about if you use counter
> mode or some other stream cipher such as ChaCha20 --- Kent Overstreet
> had some clever tricks to avoid IV reuse since he used a stream cipher
> in his proposed bcachefs encryption).

Er, not quite on the "safe" bit - part of the problem is that without going
AEAD, you lose out on a good bit of security relative to GCM used without
nonce reuse.

The reason (say) EME or CMC are safe for block-overwrite is actually _not_
that they're block ciphers - it's that they implement a security notion
called SPRP, Strong Pseudorandom Permutation, which has a direct equivalence
with misuse-resistant AEAD. XTS _not_ meeting that is in fact exactly why
it's not as strong.

If you take an SPRP, reserve `p` bits at the end for zeroes, fill the rest
with your message, and encrypt it, the result is _exactly_ a
misuse-resistant AEAD with `p`-bit integrity.

Modern misuse-resistant AEADs differ from EME and CMC only in 1.) efficiency
and 2.) supporting variable-length messages.

> The main issue is if you want to reflink a file and then have the two
> files have different permissions / ownerships.
> In that case, you
> really want to use different keys for user A and for user B --- but if
> you are assuming a single key per subvolume, you can't support
> different keys for different users anyway, so you're kind of toast for
> that use case in any case.

Mm.

> So in any case, assuming you're using block encryption (which is what
> fscrypt uses) there really isn't a problem with nonce reuse, although
> in some cases if you really do want to reflink a file and have it be
> protected by different user keys, this would have to force copy of the
> duplicated blocks at that point.  But arguably, that is a feature, not
> a bug.  If the two users are mutually suspicious, you don't _want_ to
> leak information about how much of a particular file had been changed
> by a particular user.  So you would want to break the reflink and have
> separate copies for both users anyway.

Agreed.

> One final thought --- something which is really going to be a factor
> in many use cases is going to be hardware accelerated encryption.  For
> example, Qualcomm is already shipping an SOC where the encryption can
> be done in the data path between the CPU and the eMMC storage device.
> If you want to support certain applications that try to read megabytes
> and megabytes of data before painting a single pixel, in-line hardware
> crypto at line speeds is going to be critical if you don't want to
> sacrifice performance, and keep users from being cranky because it
> took extra seconds before they could start reading their news feed (or
> saving bird eggs from voracious porcine predators, or whatever).

I heavily recommend reading the AES-GCM-SIV paper from my response to
Chris - it uses exactly the same hardware acceleration as GCM, but achieves
nonce-misuse-resistance. Less than one cycle per byte, too.

> This may very well be an issue in the future not just for mobile
> devices, but I could imagine this potentially being an issue for other
> form factors as well.
> Yes, Skylake can encrypt multiple bytes per
> clock cycle using the miracles of hardware acceleration and
> pipelining.  But in-line encryption will still have the advantage of
> avoiding the memory bandwidth costs.  So while it is fun to talk about
> exotic encryption modes, it would be wise for file system
> encryption architectures to have modes which are compatible with
> hardware in-line encryption schemes.

Considering AES-GCM-SIV is being heavily considered for use in TLS 1.3, that
may well be viable.

> This is also why I'm not all that excited by Kent's work trying to
> implement fast encryption using a stream cipher such as Chacha20.
> Technically, it's interesting, sure.  But on most modern systems, you
> will either have really really good AES acceleration (any recent x86
> system), or you will probably have at your disposal a hardware in-line
> cryptographic engine (ICE) that is going to be way faster than Chacha20
> implemented in software, and it means you don't have to go to extreme
> lengths to avoid ever reusing a nonce or risk losing all security
> guarantees.  Block ciphers are much safer, and with hardware support,
> any speed advantage of using a stream cipher disappears; indeed, a
> stream cipher in software will be slower than a hardware accelerated
> block cipher.

I agree regarding ChaCha20 (The cases it's good for - devices without AES -
are already fading, with deep embedded using AES-CTR-CCM and mobile gaining
AES-GCM accel), but I really think that nonce-misuse-resistant AEAD is going
to be incredibly important to keep in mind.

In crypto, it's far more often a subtle tool misapplied that causes problems
than anything else - and both non-AEAD (due to CCA2) and nonce-dependent AEAD
(due to nonce misuse catastrophes) are subtle tools indeed.
Nonce-misuse-resistant AEAD is a much less subtle tool.
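
To make the "nonce misuse catastrophe" concrete, here is a rough Python
sketch of the two-time-pad problem discussed above. It assumes a toy
keystream built by hashing key, nonce and a counter with SHA-256; that is
only a stand-in for counter mode or a stream cipher, not AES-CTR, ChaCha20,
or anything fscrypt or btrfs actually does. The helpers `keystream`, `xor`
and `encrypt` are hypothetical names made up for the illustration.

```python
import hashlib


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256(key || nonce || counter), block after block.

    Illustration only; a stand-in for any counter-style stream cipher.
    """
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    """Stream-cipher style encryption: plaintext XOR keystream."""
    return xor(plaintext, keystream(key, nonce, len(plaintext)))


key = b"k" * 32
nonce = b"n" * 12

# Two versions of the same block, both encrypted under the SAME key and
# nonce, as could happen if a block were overwritten in place.
p1 = b"original block contents"
p2 = b"modified block contents"
c1 = encrypt(key, nonce, p1)
c2 = encrypt(key, nonce, p2)

# The keystream cancels: the attacker learns p1 XOR p2 without the key.
leak = xor(c1, c2)
assert leak == xor(p1, p2)

# Anyone who knows (or guesses) one plaintext recovers the other outright.
recovered_p2 = xor(leak, p1)
assert recovered_p2 == p2

# With a fresh nonce the keystreams differ and the cancellation fails.
c2_fresh = encrypt(key, b"m" * 12, p2)
assert xor(c1, c2_fresh) != xor(p1, p2)
```

The point of the sketch is only that the key never enters the attack: two
ciphertexts under a repeated (key, nonce) pair are enough. A
nonce-misuse-resistant AEAD is designed so that a repeated nonce degrades
far more gracefully than this.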