On Sat, 2018-01-27 at 00:58 -0700, Andreas Dilger wrote:
> On Jan 26, 2018, at 2:55 PM, Theodore Ts'o wrote:
> >
> > On Fri, Jan 26, 2018 at 08:44:27AM -0800, James Bottomley wrote:
> > >
> > > On Fri, 2018-01-26 at 09:58 -0500, Theodore Ts'o wrote:
> > > >
> > > > Docker save was going to have to be altered to use IMA, anyway.
> > >
> > > Actually, no, that's not entirely true[1].  Docker save produces
> > > a tar file.  Once the tar on your platform picks up xattrs,
> > > docker save just works for container images with IMA hashes and
> > > signatures (and selinux labels, which was actually the driver for
> > > the change).  The point at which the ecosystem changed to "just
> > > work" was the point at which tar understood xattrs.  That's why I
> > > was poking on how do we get tar to understand this format,
> > > following on the way IMA and selinux did it.  There may be
> > > another way of getting this change into the ecosystem, but
> > > ecosystem adoption has to be part of the considerations for this.
> >
> > Oh, I see.  You are saying that you want to be able to use tar to
> > back up integrity-protected files, and then restore them later.
> >
> > Yes, that's different from what I was assuming, which is a model
> > where the integrity-protected file would be written by some package
> > manager (e.g., rpm, dpkg, the code that downloads the apk, etc.),
> > and that we would *not* be trying to back up the file with the
> > integrity data, and then restore it later via some kind of untar
> > operation.
> >
> > The problem here is that a Merkle tree simply won't fit inside an
> > xattr for any non-trivial file.  And there may be use cases where
> > blocking the open until the integrity is verified on the entire
> > file is acceptable.
> > However, there are use cases where a significant increase in the
> > open latency can't be tolerated, and where the file might have
> > large portions of data which will never be read, and thus don't
> > need to have their integrity verified.  (Example: an APK might
> > have megabytes and megabytes of translation resources for N
> > languages, only one of which will normally be used by a particular
> > user on a particular phone.  Or as another example, an ELF binary
> > that has huge portions of symbol table and debugging information
> > that is normally not used.)
> >
> > So the requirement that you must be able to back up an integrity
> > protected file, and then restore it again, without modifying the
> > tool which does the backup and restore, does certainly push you
> > towards using xattrs.  But xattrs force the huge open latency, and
> > while Docker is big in some circles, there are lots of use cases
> > where the unmodified backup/restore requirement is simply not
> > applicable.
> >
> > So perhaps there is room for both solutions.
>
> I think this is relatively straightforward to handle.  The package
> (tarball, whatever) itself only needs to store the top-level
> checksum, since this validates the whole Merkle tree, and in turn the
> integrity of the whole file.  This is exactly what BitTorrent does
> for files.

Well, not quite: BitTorrent doesn't reconstruct the hash from the file;
it downloads the hash a piece at a time and uses that to verify the
piece of the file it's obtained.  However, I accept that's only because
the leechers don't have the whole file from which to reconstruct the
hash; seed creation certainly does this.

> When the package is extracted, the Merkle tree can be regenerated and
> written with the file for random IO access using fs-verity.  When the
> Merkle tree is written to disk, the top-level checksum is verified
> against the checksum stored in the package to ensure it was written
> correctly.
> This means only a small checksum needs to be stored in the archive
> (32 bytes), but an integrated system will have end-to-end data
> verification.

I certainly buy this approach, and it fits well with the limited data
size there is in xattrs, but Ted said in the initial proposal the entire
tree would be present in the file.  I can't see a need for supplying the
entire tree rather than reconstructing it, but maybe there's an Android
use case I'm not seeing (like not wanting to waste limited CPU power).

Just so I understand the mechanics: the xattr would contain the head
node.  When this is written, the tree would be reconstructed from the
file and verified.  If it verifies, it must be stored in the filesystem
data somehow (or at least the lowest layer), so all subsequent uses of
the file can proceed from the per-page hash even after unmount and
remount?  Then I certainly think it suits both cases.

James
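
For illustration, the mechanics asked about above (keep only the 32-byte
head node in the archive, rebuild the tree on extraction, compare the
heads, then check individual blocks on read) could be sketched roughly
as below.  This is only a sketch under stated assumptions, not
fs-verity's actual on-disk format or hashing scheme: the 4096-byte block
size, the use of SHA-256, the duplication of an odd trailing hash, and
all function names are hypothetical choices made for the example.

```python
import hashlib
from typing import List

BLOCK_SIZE = 4096  # assumed block size, for illustration only


def _h(data: bytes) -> bytes:
    """SHA-256 digest (32 bytes); the hash choice is an assumption."""
    return hashlib.sha256(data).digest()


def build_tree(data: bytes, block_size: int = BLOCK_SIZE) -> List[List[bytes]]:
    """Return every level of the tree: levels[0] holds the per-block
    hashes and levels[-1] holds the single head node."""
    blocks = [data[i:i + block_size]
              for i in range(0, len(data), block_size)] or [b""]
    level = [_h(b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            # Assumed padding rule: duplicate the last hash so every
            # node has a sibling.  This also pads the stored level.
            level.append(level[-1])
        level = [_h(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def install(data: bytes, expected_root: bytes) -> List[List[bytes]]:
    """Rebuild the tree at extraction time and compare its head node
    with the one carried in the archive (e.g. in an xattr)."""
    levels = build_tree(data)
    if levels[-1][0] != expected_root:
        raise ValueError("head node mismatch: file or checksum corrupted")
    return levels  # a real system would persist this next to the file


def verify_block(block: bytes, index: int,
                 levels: List[List[bytes]], root: bytes) -> bool:
    """Check one block against the trusted head node using only the
    stored sibling hashes, without reading the rest of the file."""
    node = _h(block)
    for level in levels[:-1]:
        sibling = level[index ^ 1]
        node = _h(node + sibling) if index % 2 == 0 else _h(sibling + node)
        index //= 2
    return node == root
```

With the levels stored alongside the file, a later read of block i only
needs `verify_block` and the trusted head node, which is what would let
open() return without hashing the whole file first.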