linux-fscrypt.vger.kernel.org archive mirror
* IMA metadata format to support fs-verity
@ 2020-08-26 17:13 Chuck Lever
  2020-08-26 18:31 ` Eric Biggers
  0 siblings, 1 reply; 10+ messages in thread
From: Chuck Lever @ 2020-08-26 17:13 UTC (permalink / raw)
  To: Eric Biggers; +Cc: linux-fscrypt, linux-integrity, Linux NFS Mailing List

Hi Eric-

I'm trying to construct a viable IMA metadata format (ie, what
goes into security.ima) to support Merkle trees.

Rather than storing an entire Merkle tree per file, Mimi would
like to have a metadata format that can store the root hash of
a Merkle tree. Instead of reading the whole tree, an NFS client
(for example) would generate the parts of the file's fs-verity
Merkle tree on-demand. The tree itself would not be exposed or
transported by the NFS protocol.

Following up with the recent thread on linux-integrity, starting
here:

  https://lore.kernel.org/linux-integrity/1597079586.3966.34.camel@HansenPartnership.com/t/#u

I think the following will be needed.

1. The parameters for (re)constructing the Merkle tree:
- The name of the digest algorithm
- The unit size represented by each leaf in the tree
- The depth of the finished tree
- The size of the file
- Perhaps a salt value
- Perhaps the file's mtime at the time the hash was computed
- The root hash

2. A fingerprint of the signer:
- The name of the digest algorithm
- The digest of the signer's certificate

3. The signature
- The name of the signature algorithm
- The signature, computed over 1.

Does this seem right to you?
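
As a rough illustration of the three-part layout listed above, the record could be modeled as below. This is only a sketch: the class names, field widths, and byte encoding are hypothetical and are not an existing IMA format. The signature step (part 3) is left out because the Python standard library has no public-key signing; it would be computed over the output of `to_bytes()`.

```python
import hashlib
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class MerkleParams:
    """Part 1: parameters for (re)constructing the tree (hypothetical layout)."""
    digest_algo: str   # name of the digest algorithm, e.g. "sha256"
    leaf_size: int     # bytes of file data represented by each leaf
    depth: int         # depth of the finished tree
    file_size: int     # size of the file in bytes
    salt: bytes        # optional salt, empty if unused
    mtime: int         # optional mtime (seconds since the epoch), 0 if unused
    root_hash: bytes   # the Merkle root hash

    def to_bytes(self) -> bytes:
        """Length-prefixed encoding; these are the bytes part 3 would sign."""
        algo = self.digest_algo.encode("ascii")
        return b"".join([
            struct.pack("<H", len(algo)), algo,
            struct.pack("<IIQQ", self.leaf_size, self.depth,
                        self.file_size, self.mtime),
            struct.pack("<H", len(self.salt)), self.salt,
            struct.pack("<H", len(self.root_hash)), self.root_hash,
        ])


@dataclass(frozen=True)
class SignerFingerprint:
    """Part 2: digest algorithm name plus digest of the signer's certificate."""
    digest_algo: str
    cert_digest: bytes


def fingerprint(cert_der: bytes, algo: str = "sha256") -> SignerFingerprint:
    # cert_der stands in for a DER-encoded certificate; any bytes work here.
    return SignerFingerprint(algo, hashlib.new(algo, cert_der).digest())


params = MerkleParams("sha256", 4096, 2, 1 << 20, b"", 0, bytes(32))
signer = fingerprint(b"placeholder certificate bytes")
```

Length-prefixing every variable-size field keeps the encoding unambiguous, so two different parameter sets cannot serialize to the same signed bytes.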

There has been some controversy about whether to allow the
metadata to be unsigned. It can't ever be unsigned for NFS files,
but some feel that on a physically secure local-only setup,
signatures could be unnecessary overhead. I'm not convinced, and
believe the metadata should always be signed: that's the only
way to guarantee end-to-end integrity, which includes protection
of the content's provenance, no matter how it is stored.

--
Chuck Lever





* Re: IMA metadata format to support fs-verity
  2020-08-26 17:13 IMA metadata format to support fs-verity Chuck Lever
@ 2020-08-26 18:31 ` Eric Biggers
  2020-08-26 18:56   ` Chuck Lever
  0 siblings, 1 reply; 10+ messages in thread
From: Eric Biggers @ 2020-08-26 18:31 UTC (permalink / raw)
  To: Chuck Lever; +Cc: linux-fscrypt, linux-integrity, Linux NFS Mailing List

On Wed, Aug 26, 2020 at 10:13:43AM -0700, Chuck Lever wrote:
> Hi Eric-
> 
> I'm trying to construct a viable IMA metadata format (ie, what
> goes into security.ima) to support Merkle trees.
> 
> Rather than storing an entire Merkle tree per file, Mimi would
> like to have a metadata format that can store the root hash of
> a Merkle tree. Instead of reading the whole tree, an NFS client
> (for example) would generate the parts of the file's fs-verity
> Merkle tree on-demand. The tree itself would not be exposed or
> transported by the NFS protocol.

This won't work because you'd need to reconstruct the whole Merkle tree when
reading the first byte from the file.  Check the fs-verity FAQ
(https://www.kernel.org/doc/html/latest/filesystems/fsverity.html#faq) where I
explained this in more detail (fourth question).
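
The point can be seen in a small sketch of a Merkle root computation. It is simplified and makes assumptions: 4096-byte blocks, unsalted SHA-256, zero-padded final blocks, and an all-zero root for an empty file. It is not guaranteed to reproduce the kernel's output. The root depends on every data block, so a client holding only the root has to hash the whole file before it can trust any one block.

```python
import hashlib

BLOCK_SIZE = 4096                     # assumed data and tree block size
HASH_SIZE = 32                        # SHA-256 digest size


def _blocks(data: bytes):
    """Yield fixed-size blocks, zero-padding the last one."""
    for off in range(0, len(data), BLOCK_SIZE):
        yield data[off:off + BLOCK_SIZE].ljust(BLOCK_SIZE, b"\0")


def merkle_root(data: bytes) -> bytes:
    """Hash every data block, then hash blocks of hashes until one remains."""
    if not data:
        return bytes(HASH_SIZE)
    level = [hashlib.sha256(b).digest() for b in _blocks(data)]
    while len(level) > 1:
        packed = b"".join(level)      # this level's hashes, laid end to end
        level = [hashlib.sha256(b).digest() for b in _blocks(packed)]
    return level[0]


data = bytes(300 * BLOCK_SIZE)        # a 300-block file of zeros
root = merkle_root(data)

# Flipping one byte in the very last block changes the root, so the
# first read cannot be verified without having hashed the last block too.
changed = data[:-1] + b"\x01"
root_changed = merkle_root(changed)
```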

> Following up with the recent thread on linux-integrity, starting
> here:
> 
>   https://lore.kernel.org/linux-integrity/1597079586.3966.34.camel@HansenPartnership.com/t/#u
> 
> I think the following will be needed.
> 
> 1. The parameters for (re)constructing the Merkle tree:
> - The name of the digest algorithm
> - The unit size represented by each leaf in the tree
> - The depth of the finished tree
> - The size of the file
> - Perhaps a salt value
> - Perhaps the file's mtime at the time the hash was computed
> - The root hash

Well, the xattr would need to contain the same information as
struct fsverity_enable_arg, the argument to FS_IOC_ENABLE_VERITY.
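
For reference, a sketch of that structure packed from Python. The field order and the 128-byte size follow the fs-verity UAPI header as I understand it and should be checked against include/uapi/linux/fsverity.h. The helper name is hypothetical, a little-endian host is assumed, and no ioctl is issued here.

```python
import struct

# Assumed layout of struct fsverity_enable_arg:
#   __u32 version; __u32 hash_algorithm; __u32 block_size; __u32 salt_size;
#   __u64 salt_ptr; __u32 sig_size; __u32 __reserved1; __u64 sig_ptr;
#   __u64 __reserved2[11];
FSVERITY_ENABLE_ARG = struct.Struct("<IIIIQIIQ11Q")
FS_VERITY_HASH_ALG_SHA256 = 1


def pack_enable_arg(block_size: int = 4096, salt_size: int = 0,
                    salt_ptr: int = 0, sig_size: int = 0,
                    sig_ptr: int = 0) -> bytes:
    """Build the argument buffer only; pointers are plain integers here."""
    return FSVERITY_ENABLE_ARG.pack(
        1,                              # version
        FS_VERITY_HASH_ALG_SHA256,
        block_size,
        salt_size, salt_ptr,
        sig_size, 0, sig_ptr,           # 0 is __reserved1
        *([0] * 11),                    # __reserved2
    )


arg = pack_enable_arg()
```

Under this layout the structure carries the hash algorithm, block size, salt, and an optional signature. File size, tree depth, mtime, and root hash are not fields of it.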

> 2. A fingerprint of the signer:
> - The name of the digest algorithm
> - The digest of the signer's certificate
> 
> 3. The signature
> - The name of the signature algorithm
> - The signature, computed over 1.

I thought there was a desire to just use the existing "integrity.ima"
signature format.

> Does this seem right to you?
> 
> There has been some controversy about whether to allow the
> metadata to be unsigned. It can't ever be unsigned for NFS files,
> but some feel that on a physically secure local-only setup,
> signatures could be unnecessary overhead. I'm not convinced, and
> believe the metadata should always be signed: that's the only
> way to guarantee end-to-end integrity, which includes protection
> of the content's provenance, no matter how it is stored.

Are you looking for integrity-only protection (protection against accidental
modification), or also for authenticity protection (protection against
malicious modifications)?  For authenticity, you have to verify the file's hash
against something you trust.  A signature is the usual way to do that.
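
The distinction can be sketched as follows. A bare digest catches accidental corruption, but anyone able to rewrite the file can rewrite the stored digest as well. Here an HMAC from the standard library stands in for a public-key signature, purely so the example is runnable; the key and function names are hypothetical.

```python
import hashlib
import hmac

TRUSTED_KEY = b"hypothetical key held only by the signer"


def bare_digest(content: bytes) -> bytes:
    return hashlib.sha256(content).digest()


def keyed_tag(key: bytes, content: bytes) -> bytes:
    # Stand-in for "sign the file's hash with a key the verifier trusts".
    return hmac.new(key, bare_digest(content), hashlib.sha256).digest()


def verify(content: bytes, tag: bytes) -> bool:
    return hmac.compare_digest(keyed_tag(TRUSTED_KEY, content), tag)


original = b"#!/bin/sh\necho hello\n"
stored_digest = bare_digest(original)
stored_tag = keyed_tag(TRUSTED_KEY, original)

# Integrity: an accidental one-byte change no longer matches the digest.
corrupted = original[:-1] + b"X"
accident_detected = bare_digest(corrupted) != stored_digest

# Authenticity: an attacker replaces the content AND the stored digest,
# so a digest-only check passes. Without the key, the tag cannot be forged.
tampered = b"#!/bin/sh\necho tampered\n"
forged_digest = bare_digest(tampered)
digest_check_fooled = bare_digest(tampered) == forged_digest
forged_tag = keyed_tag(b"attacker's guess at the key", tampered)
tag_check_fooled = verify(tampered, forged_tag)
```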

- Eric


* Re: IMA metadata format to support fs-verity
  2020-08-26 18:31 ` Eric Biggers
@ 2020-08-26 18:56   ` Chuck Lever
  2020-08-26 19:24     ` Eric Biggers
  0 siblings, 1 reply; 10+ messages in thread
From: Chuck Lever @ 2020-08-26 18:56 UTC (permalink / raw)
  To: Eric Biggers; +Cc: linux-fscrypt, linux-integrity, Linux NFS Mailing List



> On Aug 26, 2020, at 2:31 PM, Eric Biggers <ebiggers@kernel.org> wrote:
> 
> On Wed, Aug 26, 2020 at 10:13:43AM -0700, Chuck Lever wrote:
>> Hi Eric-
>> 
>> I'm trying to construct a viable IMA metadata format (ie, what
>> goes into security.ima) to support Merkle trees.
>> 
>> Rather than storing an entire Merkle tree per file, Mimi would
>> like to have a metadata format that can store the root hash of
>> a Merkle tree. Instead of reading the whole tree, an NFS client
>> (for example) would generate the parts of the file's fs-verity
>> Merkle tree on-demand. The tree itself would not be exposed or
>> transported by the NFS protocol.
> 
> This won't work because you'd need to reconstruct the whole Merkle tree when
> reading the first byte from the file.  Check the fs-verity FAQ
> (https://www.kernel.org/doc/html/latest/filesystems/fsverity.html#faq) where I
> explained this in more detail (fourth question).

We agree there are inefficiencies with the proposed scheme. The
Merkle tree would be rehydrated at measurement time, and used at
read time to verify the results of each subsequent NFS READ.

We assume that parts of the tree and parts of the file content
can be evicted from the client's memory at any time. So verifying
READ results may require rehydration of some or all of the Merkle
tree. If we're careful, eviction might avoid the higher levels of
the tree to prevent the need to read the whole file again.

So, maybe we want to store the first level or two of the tree as
well? Obviously there is a limit to how much can be stored in an
extended attribute.
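
To put numbers on that limit, the arithmetic below lists how many bytes of hashes each tree level holds, assuming 4096-byte blocks and 32-byte SHA-256 digests (so 128 hashes per tree block). The function name is hypothetical.

```python
def level_sizes(file_size: int, block_size: int = 4096,
                hash_size: int = 32) -> list:
    """Bytes of hashes per level, from the leaf level up to the root hash."""
    sizes = []
    n = -(-file_size // block_size)          # number of data blocks (ceiling)
    while True:
        sizes.append(n * hash_size)          # one hash per block below
        if n <= 1:
            break
        n = -(-(n * hash_size) // block_size)  # tree blocks at this level
    return sizes


small = level_sizes(32 * 2**20)   # a 32 MiB executable
large = level_sizes(2**30)        # a 1 GiB file

for name, sizes in (("32 MiB", small), ("1 GiB", large)):
    print(name, sizes)
```

Under these assumptions a 32 MiB file needs 2048 bytes for the level just below the root, while a 1 GiB file needs 512 bytes for that level and 65536 bytes for the next one down. The second of those already sits at what I understand to be Linux's generic 64 KiB ceiling for a single extended attribute value.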


>> Following up with the recent thread on linux-integrity, starting
>> here:
>> 
>>  https://lore.kernel.org/linux-integrity/1597079586.3966.34.camel@HansenPartnership.com/t/#u
>> 
>> I think the following will be needed.
>> 
>> 1. The parameters for (re)constructing the Merkle tree:
>> - The name of the digest algorithm
>> - The unit size represented by each leaf in the tree
>> - The depth of the finished tree
>> - The size of the file
>> - Perhaps a salt value
>> - Perhaps the file's mtime at the time the hash was computed
>> - The root hash
> 
> Well, the xattr would need to contain the same information as
> struct fsverity_enable_arg, the argument to FS_IOC_ENABLE_VERITY.
> 
>> 2. A fingerprint of the signer:
>> - The name of the digest algorithm
>> - The digest of the signer's certificate
>> 
>> 3. The signature
>> - The name of the signature algorithm
>> - The signature, computed over 1.
> 
> I thought there was a desire to just use the existing "integrity.ima"
> signature format.

I am very interested in using EVM_IMA_DIGSIG. However, there appears
to be a consensus that for cases like NFS, every readpage result needs
to be verified, just as fs-verity does it.

I suppose measurement for an NFS file could involve verifying a
saved linear hash while at the same time constructing a Merkle tree
on the client?


>> There has been some controversy about whether to allow the
>> metadata to be unsigned. It can't ever be unsigned for NFS files,
>> but some feel that on a physically secure local-only setup,
>> signatures could be unnecessary overhead. I'm not convinced, and
>> believe the metadata should always be signed: that's the only
>> way to guarantee end-to-end integrity, which includes protection
>> of the content's provenance, no matter how it is stored.
> 
> Are you looking for integrity-only protection (protection against accidental
> modification), or also for authenticity protection (protection against
> malicious modifications)?  For authenticity, you have to verify the file's hash
> against something you trust.  A signature is the usual way to do that.

My interest is content provenance (authenticity), where both a
digest and its signature are required. I can't speak for how
others want to use this metadata.


--
Chuck Lever





* Re: IMA metadata format to support fs-verity
  2020-08-26 18:56   ` Chuck Lever
@ 2020-08-26 19:24     ` Eric Biggers
  2020-08-26 19:51       ` Chuck Lever
  2020-08-27  0:50       ` Mimi Zohar
  0 siblings, 2 replies; 10+ messages in thread
From: Eric Biggers @ 2020-08-26 19:24 UTC (permalink / raw)
  To: Chuck Lever; +Cc: linux-fscrypt, linux-integrity, Linux NFS Mailing List

On Wed, Aug 26, 2020 at 02:56:45PM -0400, Chuck Lever wrote:
> 
> > On Aug 26, 2020, at 2:31 PM, Eric Biggers <ebiggers@kernel.org> wrote:
> > 
> > On Wed, Aug 26, 2020 at 10:13:43AM -0700, Chuck Lever wrote:
> >> Hi Eric-
> >> 
> >> I'm trying to construct a viable IMA metadata format (ie, what
> >> goes into security.ima) to support Merkle trees.
> >> 
> >> Rather than storing an entire Merkle tree per file, Mimi would
> >> like to have a metadata format that can store the root hash of
> >> a Merkle tree. Instead of reading the whole tree, an NFS client
> >> (for example) would generate the parts of the file's fs-verity
> >> Merkle tree on-demand. The tree itself would not be exposed or
> >> transported by the NFS protocol.
> > 
> > This won't work because you'd need to reconstruct the whole Merkle tree when
> > reading the first byte from the file.  Check the fs-verity FAQ
> > (https://www.kernel.org/doc/html/latest/filesystems/fsverity.html#faq) where I
> > explained this in more detail (fourth question).
> 
> We agree there are inefficiencies with the proposed scheme. The
> Merkle tree would be rehydrated at measurement time, and used at
> read time to verify the results of each subsequent NFS READ.
> 
> We assume that parts of the tree and parts of the file content
> can be evicted from the client's memory at any time. So verifying
> READ results may require rehydration of some or all of the Merkle
> tree. If we're careful, eviction might avoid the higher levels of
> the tree to prevent the need to read the whole file again.
> 
> So, maybe we want to store the first level or two of the tree as
> well? Obviously there is a limit to how much can be stored in an
> extended attribute.

That's going to be very inefficient, and the caching, preferential eviction,
and constant tree rebuilding would be difficult to handle.

IMO, the only model that really makes sense is one where the full tree is stored
persistently.  Have you considered options for how that could be done in NFS?
What NFS protocol modifications (if any) are in scope?

> >> Following up with the recent thread on linux-integrity, starting
> >> here:
> >> 
> >>  https://lore.kernel.org/linux-integrity/1597079586.3966.34.camel@HansenPartnership.com/t/#u
> >> 
> >> I think the following will be needed.
> >> 
> >> 1. The parameters for (re)constructing the Merkle tree:
> >> - The name of the digest algorithm
> >> - The unit size represented by each leaf in the tree
> >> - The depth of the finished tree
> >> - The size of the file
> >> - Perhaps a salt value
> >> - Perhaps the file's mtime at the time the hash was computed
> >> - The root hash
> > 
> > Well, the xattr would need to contain the same information as
> > struct fsverity_enable_arg, the argument to FS_IOC_ENABLE_VERITY.
> > 
> >> 2. A fingerprint of the signer:
> >> - The name of the digest algorithm
> >> - The digest of the signer's certificate
> >> 
> >> 3. The signature
> >> - The name of the signature algorithm
> >> - The signature, computed over 1.
> > 
> > I thought there was a desire to just use the existing "integrity.ima"
> > signature format.
> 
> I am very interested in using EVM_IMA_DIGSIG. However, there appears
> to be a consensus that for cases like NFS, every readpage result needs
> to be verified, just as fs-verity does it.
> 
> I suppose measurement for an NFS file could involve verifying a
> saved linear hash while at the same time constructing a Merkle tree
> on the client?

fs-verity is mostly just a way of hashing a file.  Can't IMA just continue to do
its signatures in the same way, and just swap out the traditional full file hash
with the fs-verity file hash (when it's enabled)?

fs-verity does support its own signature mechanism, because people wanted a
simple knob to set that makes the kernel verify and enforce signatures for all
fs-verity files.  But it's not mandatory to use that.

- Eric


* Re: IMA metadata format to support fs-verity
  2020-08-26 19:24     ` Eric Biggers
@ 2020-08-26 19:51       ` Chuck Lever
  2020-08-26 20:51         ` Eric Biggers
  2020-08-27  0:50       ` Mimi Zohar
  1 sibling, 1 reply; 10+ messages in thread
From: Chuck Lever @ 2020-08-26 19:51 UTC (permalink / raw)
  To: Eric Biggers; +Cc: linux-fscrypt, linux-integrity, Linux NFS Mailing List



> On Aug 26, 2020, at 3:24 PM, Eric Biggers <ebiggers@kernel.org> wrote:
> 
> On Wed, Aug 26, 2020 at 02:56:45PM -0400, Chuck Lever wrote:
>> 
>>> On Aug 26, 2020, at 2:31 PM, Eric Biggers <ebiggers@kernel.org> wrote:
>>> 
>>> On Wed, Aug 26, 2020 at 10:13:43AM -0700, Chuck Lever wrote:
>>>> Hi Eric-
>>>> 
>>>> I'm trying to construct a viable IMA metadata format (ie, what
>>>> goes into security.ima) to support Merkle trees.
>>>> 
>>>> Rather than storing an entire Merkle tree per file, Mimi would
>>>> like to have a metadata format that can store the root hash of
>>>> a Merkle tree. Instead of reading the whole tree, an NFS client
>>>> (for example) would generate the parts of the file's fs-verity
>>>> Merkle tree on-demand. The tree itself would not be exposed or
>>>> transported by the NFS protocol.
>>> 
>>> This won't work because you'd need to reconstruct the whole Merkle tree when
>>> reading the first byte from the file.  Check the fs-verity FAQ
>>> (https://www.kernel.org/doc/html/latest/filesystems/fsverity.html#faq) where I
>>> explained this in more detail (fourth question).
>> 
>> We agree there are inefficiencies with the proposed scheme. The
>> Merkle tree would be rehydrated at measurement time, and used at
>> read time to verify the results of each subsequent NFS READ.
>> 
>> We assume that parts of the tree and parts of the file content
>> can be evicted from the client's memory at any time. So verifying
>> READ results may require rehydration of some or all of the Merkle
>> tree. If we're careful, eviction might avoid the higher levels of
>> the tree to prevent the need to read the whole file again.
>> 
>> So, maybe we want to store the first level or two of the tree as
>> well? Obviously there is a limit to how much can be stored in an
>> extended attribute.
> 
> That's going to be very inefficient, and the caching, preferential eviction,
> and constant tree rebuilding would be difficult to handle.

My focus is code signing. I'm expecting individual executables to
be under a few dozen megabytes in size, on average, and to change
infrequently or never (immutable). Configuration files, shell
scripts, and symlinks will be even smaller on average.

Thus I anticipate that the frequency of eviction should be pretty
small, and the client should be able to read the files in their
entirety quickly. Efficiency comes from reading each file as few
times as possible to maintain its Merkle tree. The cost of
measuring the file is amortized well if the file is used
frequently enough to keep its tree in the client's memory.

The inefficient case is a file that is large and used infrequently,
IIUC.


> IMO, the only model that really makes sense is one where the full tree is stored
> persistently.

Can you say more about why you believe that?


> Have you considered options for how that could be done in NFS?

We have.


> What NFS protocol modifications (if any) are in scope?

There are two ways to pull data via NFS. One is READ, which assumes
an arbitrarily large byte stream and the ability to seek in it. The
byte stream content is read in sections no larger than "rsize"
(typically 1MB or less). The client has various mechanisms to
detect when the file content has changed on the server, and can use
them to cache the file's content aggressively.

The other is attribute data, which is pulled in a single operation and
is therefore limited in size. There is no cache consistency scheme
for this type of data, so clients typically read it every time there
is an application request for it.

- We could define a named attribute that is a secondary byte stream
associated with a filehandle. It can be arbitrarily large and is
read piecemeal via NFS READ.

- We could define a pNFS layout that enables the storage of the tree
to be on some other storage service. It can be arbitrarily large and is
read piecemeal via NFS READ or some other operation (SCSI, NVMe, etc).

- We could define a new fattr4 attribute that stores metadata (that's
what I've been doing in prototype to store IMA metadata). It is read
and written in its entirety in a single operation.


>>>> Following up with the recent thread on linux-integrity, starting
>>>> here:
>>>> 
>>>> https://lore.kernel.org/linux-integrity/1597079586.3966.34.camel@HansenPartnership.com/t/#u
>>>> 
>>>> I think the following will be needed.
>>>> 
>>>> 1. The parameters for (re)constructing the Merkle tree:
>>>> - The name of the digest algorithm
>>>> - The unit size represented by each leaf in the tree
>>>> - The depth of the finished tree
>>>> - The size of the file
>>>> - Perhaps a salt value
>>>> - Perhaps the file's mtime at the time the hash was computed
>>>> - The root hash
>>> 
>>> Well, the xattr would need to contain the same information as
>>> struct fsverity_enable_arg, the argument to FS_IOC_ENABLE_VERITY.
>>> 
>>>> 2. A fingerprint of the signer:
>>>> - The name of the digest algorithm
>>>> - The digest of the signer's certificate
>>>> 
>>>> 3. The signature
>>>> - The name of the signature algorithm
>>>> - The signature, computed over 1.
>>> 
>>> I thought there was a desire to just use the existing "integrity.ima"
>>> signature format.
>> 
>> I am very interested in using EVM_IMA_DIGSIG. However, there appears
>> to be a consensus that for cases like NFS, every readpage result needs
>> to be verified, just as fs-verity does it.
>> 
>> I suppose measurement for an NFS file could involve verifying a
>> saved linear hash while at the same time constructing a Merkle tree
>> on the client?
> 
> fs-verity is mostly just a way of hashing a file.  Can't IMA just continue to do
> its signatures in the same way, and just swap out the traditional full file hash
> with the fs-verity file hash (when it's enabled)?

Essentially that's what we're doing: inventing a new IMA metadata
format that stores a Merkle root hash instead of a linear hash.

The current IMA formats take a single parameter: which hash algo
to use. Merkle tree construction requires a larger set of parameters,
which is why we think a new metadata format is necessary.


> fs-verity does support its own signature mechanism, because people wanted a
> simple knob to set that makes the kernel verify and enforce signatures for all
> fs-verity files.  But it's not mandatory to use that.


--
Chuck Lever





* Re: IMA metadata format to support fs-verity
  2020-08-26 19:51       ` Chuck Lever
@ 2020-08-26 20:51         ` Eric Biggers
  2020-08-27  0:53           ` Mimi Zohar
  0 siblings, 1 reply; 10+ messages in thread
From: Eric Biggers @ 2020-08-26 20:51 UTC (permalink / raw)
  To: Chuck Lever; +Cc: linux-fscrypt, linux-integrity, Linux NFS Mailing List

On Wed, Aug 26, 2020 at 03:51:48PM -0400, Chuck Lever wrote:
> 
> 
> > On Aug 26, 2020, at 3:24 PM, Eric Biggers <ebiggers@kernel.org> wrote:
> > 
> > On Wed, Aug 26, 2020 at 02:56:45PM -0400, Chuck Lever wrote:
> >> 
> >>> On Aug 26, 2020, at 2:31 PM, Eric Biggers <ebiggers@kernel.org> wrote:
> >>> 
> >>> On Wed, Aug 26, 2020 at 10:13:43AM -0700, Chuck Lever wrote:
> >>>> Hi Eric-
> >>>> 
> >>>> I'm trying to construct a viable IMA metadata format (ie, what
> >>>> goes into security.ima) to support Merkle trees.
> >>>> 
> >>>> Rather than storing an entire Merkle tree per file, Mimi would
> >>>> like to have a metadata format that can store the root hash of
> >>>> a Merkle tree. Instead of reading the whole tree, an NFS client
> >>>> (for example) would generate the parts of the file's fs-verity
> >>>> Merkle tree on-demand. The tree itself would not be exposed or
> >>>> transported by the NFS protocol.
> >>> 
> >>> This won't work because you'd need to reconstruct the whole Merkle tree when
> >>> reading the first byte from the file.  Check the fs-verity FAQ
> >>> (https://www.kernel.org/doc/html/latest/filesystems/fsverity.html#faq) where I
> >>> explained this in more detail (fourth question).
> >> 
> >> We agree there are inefficiencies with the proposed scheme. The
> >> Merkle tree would be rehydrated at measurement time, and used at
> >> read time to verify the results of each subsequent NFS READ.
> >> 
> >> We assume that parts of the tree and parts of the file content
> >> can be evicted from the client's memory at any time. So verifying
> >> READ results may require rehydration of some or all of the Merkle
> >> tree. If we're careful, eviction might avoid the higher levels of
> >> the tree to prevent the need to read the whole file again.
> >> 
> >> So, maybe we want to store the first level or two of the tree as
> >> well? Obviously there is a limit to how much can be stored in an
> >> extended attribute.
> > 
> > That's going to be very inefficient, and the caching, preferential eviction,
> > and constant tree rebuilding would be difficult to handle.
> 
> My focus is code signing. I'm expecting individual executables to
> be under a few dozen megabytes in size, on average, and to change
> infrequently or never (immutable). Configuration files, shell
> scripts, and symlinks will be even smaller on average.
> 
> Thus I anticipate that the frequency of eviction should be pretty
> small, and the client should be able to read the files in their
> entirety quickly. Efficiency comes from reading each file as few
> times as possible to maintain its Merkle tree. The cost of
> measuring the file is amortized well if the file is used
> frequently enough to keep its tree in the client's memory.
> 
> The inefficient case is a file that is large and used infrequently,
> IIUC.
> 
> 
> > IMO, the only model that really makes sense is one where the full tree is stored
> > persistently.
> 
> Can you say more about why you believe that?

Because if the full tree isn't stored, it defeats most of the point of doing a
Merkle tree based hash.  The filesystem would have to read and hash the full
file up-front, which by itself defeats the main benefit.  Then later, reads from
the file can result in having to rebuild large parts of the tree --- unless the
entire tree is kept pinned in memory, which isn't feasible for large files.
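
A sketch of what the stored tree buys: with every level available, checking one data block touches a single tree block per level instead of the rest of the file. The parameters are assumptions (4096-byte blocks, unsalted SHA-256, 128 hashes per tree block) and the function names are hypothetical. The stored levels are treated as untrusted; only the root is trusted.

```python
import hashlib

BLOCK = 4096
FANOUT = BLOCK // 32                  # SHA-256 hashes per tree block


def h(block: bytes) -> bytes:
    """Hash one zero-padded block."""
    return hashlib.sha256(block.ljust(BLOCK, b"\0")).digest()


def build_levels(data: bytes) -> list:
    """levels[0] holds the leaf hashes; levels[-1] holds only the root."""
    level = [h(data[o:o + BLOCK]) for o in range(0, len(data), BLOCK)]
    levels = [level]
    while len(level) > 1:
        level = [h(b"".join(level[i:i + FANOUT]))
                 for i in range(0, len(level), FANOUT)]
        levels.append(level)
    return levels


def verify_block(block: bytes, index: int, levels: list,
                 trusted_root: bytes) -> bool:
    """Verify one data block by walking from its leaf hash up to the root."""
    digest = h(block)
    for level in levels[:-1]:
        group = index // FANOUT
        tree_block = level[group * FANOUT:(group + 1) * FANOUT]
        if tree_block[index % FANOUT] != digest:
            return False              # stored hash does not match the data
        digest = h(b"".join(tree_block))
        index = group
    return digest == trusted_root


# 300 distinct blocks: block i is the 4-byte value i repeated 1024 times.
data = b"".join(i.to_bytes(4, "little") * 1024 for i in range(300))
levels = build_levels(data)
root = levels[-1][0]

good = verify_block(data[200 * BLOCK:201 * BLOCK], 200, levels, root)
bad = verify_block(b"\xff" * BLOCK, 200, levels, root)
```

For this 300-block file the check reads two tree blocks. Without the stored levels, the same check would first need all 300 data blocks hashed again.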

> > Have you considered options for how that could be done in NFS?
> 
> We have.
> 
> 
> > What NFS protocol modifications (if any) are in scope?
> 
> There are two ways to pull data via NFS. One is READ, which assumes
> an arbitrarily large byte stream and the ability to seek in it. The
> byte stream content is read in sections no larger than "rsize"
> (typically 1MB or less). The client has various mechanisms to
> detect when the file content has changed on the server, and can use
> them to cache the file's content aggressively.
> 
> The other is attribute data, which is pulled in a single operation and
> is therefore limited in size. There is no cache consistency scheme
> for this type of data, so clients typically read it every time there
> is an application request for it.
> 
> - We could define a named attribute that is a secondary byte stream
> associated with a filehandle. It can be arbitrarily large and is
> read piecemeal via NFS READ.

If it's possible, a secondary byte stream associated with the file would be a
good option.  The fs-verity implementation in ext4 and f2fs has been criticized
because the Merkle tree is stored past the end of the file rather than in a
separate file stream, which in theory would be a cleaner solution.

> > fs-verity is mostly just a way of hashing a file.  Can't IMA just continue to do
> > its signatures in the same way, and just swap out the traditional full file hash
> > with the fs-verity file hash (when it's enabled)?
> 
> Essentially that's what we're doing: inventing a new IMA metadata
> format that stores a Merkle root hash instead of a linear hash.
> 
> The current IMA formats take a single parameter: which hash algo
> to use. Merkle tree construction requires a larger set of parameters,
> which is why we think a new metadata format is necessary.

Well, you'll need to store the fsverity_descriptor or something equivalent.  Not
because it would be signed directly (it would be hashed first, as per
https://www.kernel.org/doc/html/latest/filesystems/fsverity.html#file-measurement-computation),
but because it's needed to understand the Merkle tree.
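
A sketch of that extra hashing step, assuming the 256-byte fsverity_descriptor layout given in the fs-verity documentation linked above (worth verifying there), an unsalted tree, and SHA-256. The function name is hypothetical.

```python
import hashlib
import struct

FS_VERITY_HASH_ALG_SHA256 = 1

# Assumed layout of struct fsverity_descriptor:
#   u8 version; u8 hash_algorithm; u8 log_blocksize; u8 salt_size;
#   le32 reserved (zero); le64 data_size;
#   u8 root_hash[64]; u8 salt[32]; u8 reserved[144];
DESCRIPTOR = struct.Struct("<BBBBIQ64s32s144s")


def file_digest(root_hash: bytes, data_size: int,
                log_blocksize: int = 12, salt: bytes = b"") -> bytes:
    """Hash the descriptor; this digest, not the bare root, is what gets signed."""
    descriptor = DESCRIPTOR.pack(
        1,                           # version
        FS_VERITY_HASH_ALG_SHA256,
        log_blocksize,               # 12 means 4096-byte blocks
        len(salt),
        0,                           # reserved
        data_size,
        root_hash,                   # "64s" zero-pads shorter values
        salt,                        # "32s" zero-pads shorter values
        b"",                         # reserved, all zeros
    )
    return hashlib.sha256(descriptor).digest()


root = bytes(32)                     # placeholder root hash
digest_4k = file_digest(root, 4096)
digest_8k = file_digest(root, 8192)
```

Because the file size and block size are inside the hashed descriptor, two files with the same root hash but different sizes get different digests.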

Of course, the bytes that are actually signed need to include not just the hash
itself, but also the type of hash algorithm that was used.  Else it's ambiguous
what the signer intended to sign.

Unfortunately, currently EVM appears to sign a raw hash, which means it is
broken, as the hash algorithm is not authenticated.  I.e. if the bytes
e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 are signed,
there's no way to prove that the signer meant to sign a SHA-256 hash, as opposed
to, say, a Streebog hash.  So that will need to be fixed anyway.  While doing
so, you should reserve some fields so that there's also a flag available to
indicate whether the hash is a traditional full file hash or a fs-verity hash.
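
A minimal sketch of such a signed payload. The header layout, the algorithm IDs, and the type flag values are all hypothetical, and SHA3-256 stands in for Streebog because the Python standard library does not guarantee a Streebog implementation.

```python
import hashlib
import struct

ALGO_IDS = {"sha256": 1, "sha3_256": 2}      # hypothetical numbering
DIGEST_TYPE_FULL_FILE = 0                    # traditional full-file hash
DIGEST_TYPE_FSVERITY = 1                     # fs-verity file hash


def signed_payload(algo: str, digest_type: int, digest: bytes) -> bytes:
    """Bytes to sign: algorithm ID, hash type flag, digest length, digest."""
    if len(digest) != hashlib.new(algo).digest_size:
        raise ValueError("digest length does not match algorithm")
    header = struct.pack("<BBH", ALGO_IDS[algo], digest_type, len(digest))
    return header + digest


# The example bytes from the message are the SHA-256 hash of empty input.
raw = hashlib.sha256(b"").digest()
matches_example = raw.hex() == (
    "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855")

# Signing `raw` alone leaves the algorithm open to interpretation. With the
# header, the same 32 bytes produce a distinct payload per algorithm and type.
as_sha256 = signed_payload("sha256", DIGEST_TYPE_FULL_FILE, raw)
as_other = signed_payload("sha3_256", DIGEST_TYPE_FULL_FILE, raw)
as_verity = signed_payload("sha256", DIGEST_TYPE_FSVERITY, raw)
```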

- Eric


* Re: IMA metadata format to support fs-verity
  2020-08-26 19:24     ` Eric Biggers
  2020-08-26 19:51       ` Chuck Lever
@ 2020-08-27  0:50       ` Mimi Zohar
  1 sibling, 0 replies; 10+ messages in thread
From: Mimi Zohar @ 2020-08-27  0:50 UTC (permalink / raw)
  To: Eric Biggers, Chuck Lever
  Cc: linux-fscrypt, linux-integrity, Linux NFS Mailing List

On Wed, 2020-08-26 at 12:24 -0700, Eric Biggers wrote:

> fs-verity is mostly just a way of hashing a file.  Can't IMA just continue to do
> its signatures in the same way, and just swap out the traditional full file hash
> with the fs-verity file hash (when it's enabled)?

Yes, as previously discussed with you and Ted.

Mimi
> 
> fs-verity does support its own signature mechanism, because people wanted a
> simple knob to set that makes the kernel verify and enforce signatures for all
> fs-verity files.  But it's not mandatory to use that.



* Re: IMA metadata format to support fs-verity
  2020-08-26 20:51         ` Eric Biggers
@ 2020-08-27  0:53           ` Mimi Zohar
  2020-08-27  1:00             ` Eric Biggers
  0 siblings, 1 reply; 10+ messages in thread
From: Mimi Zohar @ 2020-08-27  0:53 UTC (permalink / raw)
  To: Eric Biggers, Chuck Lever
  Cc: linux-fscrypt, linux-integrity, Linux NFS Mailing List

On Wed, 2020-08-26 at 13:51 -0700, Eric Biggers wrote:
> Of course, the bytes that are actually signed need to include not just the hash
> itself, but also the type of hash algorithm that was used.  Else it's ambiguous
> what the signer intended to sign.
> 
> Unfortunately, currently EVM appears to sign a raw hash, which means it is
> broken, as the hash algorithm is not authenticated.  I.e. if the bytes
> e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 are signed,
> there's no way to prove that the signer meant to sign a SHA-256 hash, as opposed
> to, say, a Streebog hash.  So that will need to be fixed anyway.  While doing
> so, you should reserve some fields so that there's also a flag available to
> indicate whether the hash is a traditional full file hash or a fs-verity hash.

The original EVM HMAC is still sha1, but the newer portable & immutable
EVM signature supports different hash algorithms.

Mimi



* Re: IMA metadata format to support fs-verity
  2020-08-27  0:53           ` Mimi Zohar
@ 2020-08-27  1:00             ` Eric Biggers
  2020-08-27 13:10               ` Mimi Zohar
  0 siblings, 1 reply; 10+ messages in thread
From: Eric Biggers @ 2020-08-27  1:00 UTC (permalink / raw)
  To: Mimi Zohar
  Cc: Chuck Lever, linux-fscrypt, linux-integrity, Linux NFS Mailing List

On Wed, Aug 26, 2020 at 08:53:33PM -0400, Mimi Zohar wrote:
> On Wed, 2020-08-26 at 13:51 -0700, Eric Biggers wrote:
> > Of course, the bytes that are actually signed need to include not just the hash
> > itself, but also the type of hash algorithm that was used.  Else it's ambiguous
> > what the signer intended to sign.
> > 
> > Unfortunately, currently EVM appears to sign a raw hash, which means it is
> > broken, as the hash algorithm is not authenticated.  I.e. if the bytes
> > e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 are signed,
> > there's no way to prove that the signer meant to sign a SHA-256 hash, as opposed
> > to, say, a Streebog hash.  So that will need to be fixed anyway.  While doing
> > so, you should reserve some fields so that there's also a flag available to
> > indicate whether the hash is a traditional full file hash or a fs-verity hash.
> 
> The original EVM HMAC is still sha1, but the newer portable & immutable
> EVM signature supports different hash algorithms.
> 

Read what I wrote again.  I'm talking about the bytes that are actually signed.

- Eric


* Re: IMA metadata format to support fs-verity
  2020-08-27  1:00             ` Eric Biggers
@ 2020-08-27 13:10               ` Mimi Zohar
  0 siblings, 0 replies; 10+ messages in thread
From: Mimi Zohar @ 2020-08-27 13:10 UTC (permalink / raw)
  To: Eric Biggers
  Cc: Chuck Lever, linux-fscrypt, linux-integrity,
	Linux NFS Mailing List, Matthew Garrett

On Wed, 2020-08-26 at 18:00 -0700, Eric Biggers wrote:
> On Wed, Aug 26, 2020 at 08:53:33PM -0400, Mimi Zohar wrote:
> > On Wed, 2020-08-26 at 13:51 -0700, Eric Biggers wrote:
> > > Of course, the bytes that are actually signed need to include not just the hash
> > > itself, but also the type of hash algorithm that was used.  Else it's ambiguous
> > > what the signer intended to sign.
> > > 
> > > Unfortunately, currently EVM appears to sign a raw hash, which means it is
> > > broken, as the hash algorithm is not authenticated.  I.e. if the bytes
> > > e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 are signed,
> > > there's no way to prove that the signer meant to sign a SHA-256 hash, as opposed
> > > to, say, a Streebog hash.  So that will need to be fixed anyway.  While doing
> > > so, you should reserve some fields so that there's also a flag available to
> > > indicate whether the hash is a traditional full file hash or a fs-verity hash.
> > 
> > The original EVM HMAC is still sha1, but the newer portable & immutable
> > EVM signature supports different hash algorithms.
> > 
> 
> Read what I wrote again.  I'm talking about the bytes that are actually signed.

I agree that including the hash algorithm in the digest would be
preferable, but it isn't per se broken.  The file signature and the
file metadata hash algorithms must be the same; otherwise signature
verification fails[1].  The same tool calculates the file metadata
digest and then signs the digest, using the same hash algorithm.  The
HMAC is (still) limited to SHA1.

Mimi

[1] commit 5feeb61183dd ("evm: Allow non-SHA1 digital signatures")



Thread overview: 10+ messages
2020-08-26 17:13 IMA metadata format to support fs-verity Chuck Lever
2020-08-26 18:31 ` Eric Biggers
2020-08-26 18:56   ` Chuck Lever
2020-08-26 19:24     ` Eric Biggers
2020-08-26 19:51       ` Chuck Lever
2020-08-26 20:51         ` Eric Biggers
2020-08-27  0:53           ` Mimi Zohar
2020-08-27  1:00             ` Eric Biggers
2020-08-27 13:10               ` Mimi Zohar
2020-08-27  0:50       ` Mimi Zohar
