From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-fsdevel-owner@vger.kernel.org>
Received: from imap.thunk.org ([74.207.234.97]:36658 "EHLO imap.thunk.org"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1752978AbeA1WB0 (ORCPT <rfc822;linux-fsdevel@vger.kernel.org>);
        Sun, 28 Jan 2018 17:01:26 -0500
Date: Sun, 28 Jan 2018 16:49:25 -0500
From: Theodore Ts'o <tytso@mit.edu>
To: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Andreas Dilger <adilger@dilger.ca>,
        linux-fsdevel <linux-fsdevel@vger.kernel.org>,
        lsf-pc@lists.linux-foundation.org
Subject: Re: [Lsf-pc] [LSF/MM TOPIC] fs-verity: file system-level integrity
 protection
Message-ID: <20180128214925.GA13621@thunk.org>
References: <1516927666.4082.25.camel@HansenPartnership.com>
 <20180126023054.GC31091@thunk.org>
 <1516942235.4082.52.camel@HansenPartnership.com>
 <20180126145856.GA2841@thunk.org>
 <1516985067.4000.10.camel@HansenPartnership.com>
 <20180126215540.GA23308@thunk.org>
 <275E5E86-635E-4D79-9AC9-3D24318EDDDF@dilger.ca>
 <1517069959.3012.13.camel@HansenPartnership.com>
 <20180128024604.GA12320@thunk.org>
 <1517162590.3082.55.camel@HansenPartnership.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <1517162590.3082.55.camel@HansenPartnership.com>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID: <linux-fsdevel.vger.kernel.org>

On Sun, Jan 28, 2018 at 10:03:10AM -0800, James Bottomley wrote:
> 
> OK, so we agree then that what IMA provides: a hash and potential
> signature over the hash is sufficient input for what you want and does
> provide existing tooling to achieve it.

Well, that's not all IMA provides.  Remember, the 'M' in IMA is
*measurement*.  IMA was originally about checksumming every single file
that is opened and writing it to a log.  You could suppress it by
using SELinux to tag files (using xattrs which are *not* signed) and
then writing an IMA policy which tells IMA to ignore files that belong
to a certain SELinux tag.  This is something that I *don't* want, and
today, enabling CONFIG_IMA drags all of this (including doing text
parsing of the IMA policy in the kernel, SELinux, etc.) into the
kernel.

One of my other complaints about IMA is that it's an integrated
solution, with a huge amount of complexity.  If it was only about file
signing, that would be one thing, but that's actually not the case
today.  Hence my comment that I don't want fs-verity to have a
dependency on IMA, such that we are forced to drag in all of IMA and
SELinux for anyone who wants to use fs-verity.

This is fine; I don't want to have to dictate changes to IMA; I'd much
prefer to avoid the complexity instead of trying to reform it, since
I'm sure the IMA folks will be happy to explain that there are all
sorts of reasons why things have to be done the way they are done.
For example, the assertion that the latency hit at open(2) *must* be
there in order to kowtow to Microsoft because of its Trusted Boot
policies; fair enough, but *I* don't care about Trusted Boot, and it's
not fair to impose penalties on all scenarios because of a desire to
keep Microsoft happy just for one particular use case.

>    1. Could the signature piece of this be done in the way IMA currently
>       does file signatures.  We all agree "yes" on this, since a signed
>       merkle hash head is the same size as an existing IMA signature and
>       therefore does fit into xattrs.

Well.... not exactly.

It is fair to say that there are two parts to the integrity metadata,
the signed root hash and the Merkle tree itself, and that the Merkle
tree can be reconstructed at file install time.

>    2. Could IMA use a merkle tree for hash verification a page at a time
>       as part of its implementation?  I think the answer to this is yes,
>       except the hook has to be somewhere in the page fault mechanism, so
>       it would need some exploration and prototyping.

The way you are formulating things presumes that all of IMA has to be
dragged into any file integrity solution.  That's begging the
question.  As I've mentioned above, IMA has all sorts of complexity
which is currently mandatory, and I'm not volunteering to disentangle
the mess to make it be sane.  (And if you don't like that word, how
about, "designed with good taste?")

>    3. Could the merkle tree be cached somehow in the filesystem (probably
>       as part of the filesystem implementation)?  You've already assumed
>       "yes" for this since it's how fs-verity is proposed to work.

I don't consider the Merkle tree to be a *cache*, however, because if
you don't mind a massive latency at file open time, you can just use
the existing IMA mechanism.  So a core part of the design is that the
Merkle tree is stored permanently (*not* as a cache) alongside the
file.  And if the file is renamed, the Merkle tree should come along
for the ride.
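
To make the latency point concrete, here is a minimal sketch of
checking a single data block against a trusted root hash using only
the sibling hashes along its path, so that nothing else in the file
has to be read.  The function name verify_block, the use of SHA-256,
and the binary tree layout are all assumptions made for illustration;
this is not the on-disk format of dm-verity or of the proposed
fs-verity.

```python
import hashlib


def verify_block(block: bytes, index: int, siblings: list[bytes],
                 trusted_root: bytes) -> bool:
    """Check one data block against a trusted Merkle root (hypothetical helper).

    `index` is the block's position among the leaves.  `siblings` holds
    the sibling digest at each tree level, leaf level first.  Assumes a
    binary tree whose parents are SHA-256(left || right).
    """
    digest = hashlib.sha256(block).digest()
    for sibling in siblings:
        if index % 2 == 0:
            # This node is a left child: the sibling goes on the right.
            digest = hashlib.sha256(digest + sibling).digest()
        else:
            # This node is a right child: the sibling goes on the left.
            digest = hashlib.sha256(sibling + digest).digest()
        index //= 2
    return digest == trusted_root
```

The work per block is proportional to the depth of the tree rather
than to the size of the file, which is what lets the check happen when
a page is read instead of at open(2).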

Whether the Merkle tree is reconstructed as part of the file / package
installation process, or whether the Merkle tree is stored as part of the
package or streamed from the app store, etc., is an implementation
detail, and I don't think we need to prescribe one way of doing
things.  I *do* think though we should allow for the possibility that
limitations on local CPU power are such that it would be preferable
for the Merkle tree to be supplied from a remote server instead of
generated on the local system.
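
As a rough sketch of what generating the tree outside the kernel could
look like, whether on the local system at install time or on a build
server, the following builds the tree levels from the file contents.
The names build_merkle_tree and root_hash, the 4096-byte block size,
SHA-256, and the binary fanout are assumptions for illustration only;
a real format would likely pack many hashes into each hash block, as
dm-verity does.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed data block size, for illustration only


def build_merkle_tree(data: bytes,
                      block_size: int = BLOCK_SIZE) -> list[list[bytes]]:
    """Return the tree levels, leaf hashes first; the last level is the root."""
    # Split the file into fixed-size blocks (an empty file gets one empty block).
    blocks = [data[i:i + block_size]
              for i in range(0, len(data), block_size)] or [b""]
    level = [hashlib.sha256(b).digest() for b in blocks]
    levels = [level]
    while len(level) > 1:
        parents = []
        for i in range(0, len(level), 2):
            left = level[i]
            # An unpaired last node is paired with itself (a simplification).
            right = level[i + 1] if i + 1 < len(level) else left
            parents.append(hashlib.sha256(left + right).digest())
        level = parents
        levels.append(level)
    return levels


def root_hash(levels: list[list[bytes]]) -> bytes:
    """The single digest at the top of the tree; this is what gets signed."""
    return levels[-1][0]
```

Only the root hash needs to be signed.  The rest of the tree can be
shipped with the package or regenerated, since any tampering with it
shows up as a mismatch against the signed root.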

The problem is you have a specific use case in mind, involving the
Docker client, where you want to store the signature in an xattr, and
then not require any local changes to the Docker client --- and that's
not in my requirements set, and it was *your* statement that the
Docker client MUST NOT be modified which forced a design where (a) the Merkle
tree must be reconstructed in the kernel, and (b) it must be triggered
by setting the xattr.

My goal is to keep things simple, which means

* No parsing of the IMA policy as a text input in the kernel
* No Merkle tree construction in the kernel (which is also true of dm-verity)
* No magic xattr triggering

Speaking as a kernel developer, it makes more sense to keep things
in the kernel simple, and do as much in userspace as possible --- and if that
means that the Docker client (or the package manager, etc.) needs a
minor change to call the userspace library, that's infinitely
preferable to keeping huge amounts of complexity in non-swappable
kernel memory --- which increases the attack surface of the kernel,
and so on.

So in my opinion, clean design of the kernel trumps the requirement of
"not one change, not one jot, in the Docker client".

It could be that the requirements of "keep the kernel changes simple"
and "no massive latency at file open time" mean that the requirement sets
of fs-verity and IMA are irreconcilable.  Which is fine as far as I'm
concerned.  Maybe IMA and fs-verity should be considered orthogonal
solutions.

						- Ted