All of lore.kernel.org
 help / color / mirror / Atom feed
* ghash
@ 2019-07-19 14:05 Pascal Van Leeuwen
  2019-07-19 16:16 ` ghash Eric Biggers
  0 siblings, 1 reply; 9+ messages in thread
From: Pascal Van Leeuwen @ 2019-07-19 14:05 UTC (permalink / raw)
  To: linux-crypto; +Cc: Herbert Xu, davem

Hi,

While implementing GHASH support for the inside-secure driver and wondering why I couldn't get 
the test vectors to pass I have come to the conclusion that ghash-generic.c actually does *not*
implement GHASH at all. It merely implements the underlying chained GF multiplication, which,
I understand, is convenient as a building block for e.g. aes-gcm but is is NOT the full GHASH.
Most importantly, it does NOT actually close the hash, so you can trivially add more data to the
authenticated block (i.e. the resulting output cannot be used directly without external closing)

GHASH is defined as GHASH(H,A,C) whereby you do this chained GF multiply on a block of AAD
data padded to 16 byte alignment with zeroes, followed by a block of ciphertext padded to 16
byte alignment with zeroes, followed by a block that contains both AAD and cipher length.

See also https://en.wikipedia.org/wiki/Galois/Counter_Mode

Regards,
Pascal van Leeuwen
Silicon IP Architect, Multi-Protocol Engines @ Verimatrix
www.insidesecure.com


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: ghash
  2019-07-19 14:05 ghash Pascal Van Leeuwen
@ 2019-07-19 16:16 ` Eric Biggers
  2019-07-19 19:26   ` ghash Pascal Van Leeuwen
  0 siblings, 1 reply; 9+ messages in thread
From: Eric Biggers @ 2019-07-19 16:16 UTC (permalink / raw)
  To: Pascal Van Leeuwen; +Cc: linux-crypto, Herbert Xu, davem

On Fri, Jul 19, 2019 at 02:05:01PM +0000, Pascal Van Leeuwen wrote:
> Hi,
> 
> While implementing GHASH support for the inside-secure driver and wondering why I couldn't get 
> the test vectors to pass I have come to the conclusion that ghash-generic.c actually does *not*
> implement GHASH at all. It merely implements the underlying chained GF multiplication, which,
> I understand, is convenient as a building block for e.g. aes-gcm but is is NOT the full GHASH.
> Most importantly, it does NOT actually close the hash, so you can trivially add more data to the
> authenticated block (i.e. the resulting output cannot be used directly without external closing)
> 
> GHASH is defined as GHASH(H,A,C) whereby you do this chained GF multiply on a block of AAD
> data padded to 16 byte alignment with zeroes, followed by a block of ciphertext padded to 16
> byte alignment with zeroes, followed by a block that contains both AAD and cipher length.
> 
> See also https://en.wikipedia.org/wiki/Galois/Counter_Mode
> 
> Regards,
> Pascal van Leeuwen
> Silicon IP Architect, Multi-Protocol Engines @ Verimatrix
> www.insidesecure.com
> 

Yes that's correct.  The hash APIs don't support multi-argument hashes, so
there's no natural way for it to be "full GHASH".  So it relies on the caller to
format the AAD and ciphertext into a single stream.  IMO it really should be
called something like "ghash_core".

Do you have some question or suggestion, or was this just an observation?

- Eric

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: ghash
  2019-07-19 16:16 ` ghash Eric Biggers
@ 2019-07-19 19:26   ` Pascal Van Leeuwen
  2019-07-19 19:56     ` ghash Eric Biggers
  0 siblings, 1 reply; 9+ messages in thread
From: Pascal Van Leeuwen @ 2019-07-19 19:26 UTC (permalink / raw)
  To: Eric Biggers; +Cc: linux-crypto, Herbert Xu, davem

> -----Original Message-----
> From: linux-crypto-owner@vger.kernel.org <linux-crypto-owner@vger.kernel.org> On Behalf Of Eric Biggers
> Sent: Friday, July 19, 2019 6:16 PM
> To: Pascal Van Leeuwen <pvanleeuwen@verimatrix.com>
> Cc: linux-crypto@vger.kernel.org; Herbert Xu <herbert@gondor.apana.org.au>; davem@davemloft.net
> Subject: Re: ghash
> 
> On Fri, Jul 19, 2019 at 02:05:01PM +0000, Pascal Van Leeuwen wrote:
> > Hi,
> >
> > While implementing GHASH support for the inside-secure driver and wondering why I couldn't get
> > the test vectors to pass I have come to the conclusion that ghash-generic.c actually does *not*
> > implement GHASH at all. It merely implements the underlying chained GF multiplication, which,
> > I understand, is convenient as a building block for e.g. aes-gcm but is is NOT the full GHASH.
> > Most importantly, it does NOT actually close the hash, so you can trivially add more data to the
> > authenticated block (i.e. the resulting output cannot be used directly without external closing)
> >
> > GHASH is defined as GHASH(H,A,C) whereby you do this chained GF multiply on a block of AAD
> > data padded to 16 byte alignment with zeroes, followed by a block of ciphertext padded to 16
> > byte alignment with zeroes, followed by a block that contains both AAD and cipher length.
> >
> > See also https://en.wikipedia.org/wiki/Galois/Counter_Mode
> >
> > Regards,
> > Pascal van Leeuwen
> > Silicon IP Architect, Multi-Protocol Engines @ Verimatrix
> > www.insidesecure.com
> >
> 
> Yes that's correct.  The hash APIs don't support multi-argument hashes, so
> there's no natural way for it to be "full GHASH".  So it relies on the caller to
> format the AAD and ciphertext into a single stream.  IMO it really should be
> called something like "ghash_core".
> 
> Do you have some question or suggestion, or was this just an observation?
> 
Well, considering it's pretending to be GHASH I was more less considering this a bug report ...

There's the inherent danger that someone not aware of the actual implementation tries to
use it as some efficient (e.g. due to instruction set support) secure authentication function.
Which, without proper external data formatting, it's surely not in its current form. This is 
not something you will actually notice when just using it locally for something (until
someone actually breaks it).

And then there was the issue of wanting the offload it to hardware, but that's kind of hard
if the software implementation does not follow the spec where the hardware does ...

I think more care should be taken with the algorithm naming - if it has a certain name, 
you expect it to follow the matching specification (fully). I have already identified 2 cases 
now (xts and ghash) where that is not actually the case.

Regards,
Pascal van Leeuwen
Silicon IP Architect, Multi-Protocol Engines @ Verimatrix
www.insidesecure.com

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: ghash
  2019-07-19 19:26   ` ghash Pascal Van Leeuwen
@ 2019-07-19 19:56     ` Eric Biggers
  2019-07-19 20:49       ` ghash Pascal Van Leeuwen
  0 siblings, 1 reply; 9+ messages in thread
From: Eric Biggers @ 2019-07-19 19:56 UTC (permalink / raw)
  To: Pascal Van Leeuwen; +Cc: linux-crypto, Herbert Xu, davem

Hi Pascal,

On Fri, Jul 19, 2019 at 07:26:02PM +0000, Pascal Van Leeuwen wrote:
> > -----Original Message-----
> > From: linux-crypto-owner@vger.kernel.org <linux-crypto-owner@vger.kernel.org> On Behalf Of Eric Biggers
> > Sent: Friday, July 19, 2019 6:16 PM
> > To: Pascal Van Leeuwen <pvanleeuwen@verimatrix.com>
> > Cc: linux-crypto@vger.kernel.org; Herbert Xu <herbert@gondor.apana.org.au>; davem@davemloft.net
> > Subject: Re: ghash
> > 
> > On Fri, Jul 19, 2019 at 02:05:01PM +0000, Pascal Van Leeuwen wrote:
> > > Hi,
> > >
> > > While implementing GHASH support for the inside-secure driver and wondering why I couldn't get
> > > the test vectors to pass I have come to the conclusion that ghash-generic.c actually does *not*
> > > implement GHASH at all. It merely implements the underlying chained GF multiplication, which,
> > > I understand, is convenient as a building block for e.g. aes-gcm but is is NOT the full GHASH.
> > > Most importantly, it does NOT actually close the hash, so you can trivially add more data to the
> > > authenticated block (i.e. the resulting output cannot be used directly without external closing)
> > >
> > > GHASH is defined as GHASH(H,A,C) whereby you do this chained GF multiply on a block of AAD
> > > data padded to 16 byte alignment with zeroes, followed by a block of ciphertext padded to 16
> > > byte alignment with zeroes, followed by a block that contains both AAD and cipher length.
> > >
> > > See also https://en.wikipedia.org/wiki/Galois/Counter_Mode
> > >
> > > Regards,
> > > Pascal van Leeuwen
> > > Silicon IP Architect, Multi-Protocol Engines @ Verimatrix
> > > www.insidesecure.com
> > >
> > 
> > Yes that's correct.  The hash APIs don't support multi-argument hashes, so
> > there's no natural way for it to be "full GHASH".  So it relies on the caller to
> > format the AAD and ciphertext into a single stream.  IMO it really should be
> > called something like "ghash_core".
> > 
> > Do you have some question or suggestion, or was this just an observation?
> > 
> Well, considering it's pretending to be GHASH I was more less considering this a bug report ...
> 
> There's the inherent danger that someone not aware of the actual implementation tries to
> use it as some efficient (e.g. due to instruction set support) secure authentication function.
> Which, without proper external data formatting, it's surely not in its current form. This is 
> not something you will actually notice when just using it locally for something (until
> someone actually breaks it).

You do understand that GHASH is not a MAC, right?  It's only a universal
function.  Specifically, "almost-epsilon-XOR-universal".  So even if there was a
more natural API to access GHASH, it's still incorrect to use it outside of a
properly reviewed crypto mode of operation.  IOW, anyone using GHASH directly as
a MAC is screwed anyway no matter which API they are using, or misusing.

> 
> And then there was the issue of wanting the offload it to hardware, but that's kind of hard
> if the software implementation does not follow the spec where the hardware does ...
> 
> I think more care should be taken with the algorithm naming - if it has a certain name, 
> you expect it to follow the matching specification (fully). I have already identified 2 cases 
> now (xts and ghash) where that is not actually the case.
> 

If you take an (AD, Ctext) pair and format it into a single data stream the way
the API requires, then you will get the result defined by the specification.  So
it does follow the specification as best it can given the API which takes a
single data stream.  As I said though, I think "ghash_core" would be a better
name.  Note that it was added 10 years ago; I'm not sure it can actually be
renamed, but there may be a chance since no one should be using it directly.

So are you proposing that it be renamed?  Or are you proposing that a multi
argument hashing API be added?  Or are you proposing that universal functions
not be exposed through the crypto API?  What specifically is your constructive
suggestion to improve things?

- Eric

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: ghash
  2019-07-19 19:56     ` ghash Eric Biggers
@ 2019-07-19 20:49       ` Pascal Van Leeuwen
  2019-07-19 21:48         ` ghash Eric Biggers
  0 siblings, 1 reply; 9+ messages in thread
From: Pascal Van Leeuwen @ 2019-07-19 20:49 UTC (permalink / raw)
  To: Eric Biggers; +Cc: linux-crypto, Herbert Xu, davem

Hi Eric,

> -----Original Message-----
> From: linux-crypto-owner@vger.kernel.org <linux-crypto-owner@vger.kernel.org> On Behalf Of Eric Biggers
> Sent: Friday, July 19, 2019 9:57 PM
> To: Pascal Van Leeuwen <pvanleeuwen@verimatrix.com>
> Cc: linux-crypto@vger.kernel.org; Herbert Xu <herbert@gondor.apana.org.au>; davem@davemloft.net
> Subject: Re: ghash
> 
> Hi Pascal,
> 
> On Fri, Jul 19, 2019 at 07:26:02PM +0000, Pascal Van Leeuwen wrote:
> > > -----Original Message-----
> > > From: linux-crypto-owner@vger.kernel.org <linux-crypto-owner@vger.kernel.org> On Behalf Of Eric Biggers
> > > Sent: Friday, July 19, 2019 6:16 PM
> > > To: Pascal Van Leeuwen <pvanleeuwen@verimatrix.com>
> > > Cc: linux-crypto@vger.kernel.org; Herbert Xu <herbert@gondor.apana.org.au>; davem@davemloft.net
> > > Subject: Re: ghash
> > >
> > > On Fri, Jul 19, 2019 at 02:05:01PM +0000, Pascal Van Leeuwen wrote:
> > > > Hi,
> > > >
> > > > While implementing GHASH support for the inside-secure driver and wondering why I couldn't get
> > > > the test vectors to pass I have come to the conclusion that ghash-generic.c actually does *not*
> > > > implement GHASH at all. It merely implements the underlying chained GF multiplication, which,
> > > > I understand, is convenient as a building block for e.g. aes-gcm but is is NOT the full GHASH.
> > > > Most importantly, it does NOT actually close the hash, so you can trivially add more data to the
> > > > authenticated block (i.e. the resulting output cannot be used directly without external closing)
> > > >
> > > > GHASH is defined as GHASH(H,A,C) whereby you do this chained GF multiply on a block of AAD
> > > > data padded to 16 byte alignment with zeroes, followed by a block of ciphertext padded to 16
> > > > byte alignment with zeroes, followed by a block that contains both AAD and cipher length.
> > > >
> > > > See also https://en.wikipedia.org/wiki/Galois/Counter_Mode
> > > >
> > > > Regards,
> > > > Pascal van Leeuwen
> > > > Silicon IP Architect, Multi-Protocol Engines @ Verimatrix
> > > > www.insidesecure.com
> > > >
> > >
> > > Yes that's correct.  The hash APIs don't support multi-argument hashes, so
> > > there's no natural way for it to be "full GHASH".  So it relies on the caller to
> > > format the AAD and ciphertext into a single stream.  IMO it really should be
> > > called something like "ghash_core".
> > >
> > > Do you have some question or suggestion, or was this just an observation?
> > >
> > Well, considering it's pretending to be GHASH I was more less considering this a bug report ...
> >
> > There's the inherent danger that someone not aware of the actual implementation tries to
> > use it as some efficient (e.g. due to instruction set support) secure authentication function.
> > Which, without proper external data formatting, it's surely not in its current form. This is
> > not something you will actually notice when just using it locally for something (until
> > someone actually breaks it).
> 
> You do understand that GHASH is not a MAC, right?  It's only a universal
> function.  
>
It's a universal keyed hash. Which you could use as a MAC, although, admittedly,
it would be rather weak, which is why the tag is usually additionally encrypted.
(which you could do externally, knowing that that's needed with GHASH)
In any case, the crypto API's ghash does not do what you would expect of a GHASH.

> Specifically, "almost-epsilon-XOR-universal".  So even if there was a
> more natural API to access GHASH, it's still incorrect to use it outside of a
> properly reviewed crypto mode of operation.  IOW, anyone using GHASH directly as
> a MAC is screwed anyway no matter which API they are using, or misusing.
> 
"Anyone" may actually know what they're doing. But rely on ghash being GHASH ...
So this is a rather lame argument.

> >
> > And then there was the issue of wanting the offload it to hardware, but that's kind of hard
> > if the software implementation does not follow the spec where the hardware does ...
> >
> > I think more care should be taken with the algorithm naming - if it has a certain name,
> > you expect it to follow the matching specification (fully). I have already identified 2 cases
> > now (xts and ghash) where that is not actually the case.
> >
> 
> If you take an (AD, Ctext) pair and format it into a single data stream the way
> the API requires, then you will get the result defined by the specification.  So
> it does follow the specification as best it can given the API which takes a
> single data stream.  As I said though, I think "ghash_core" would be a better
> name.  Note that it was added 10 years ago; I'm not sure it can actually be
> renamed, but there may be a chance since no one should be using it directly.
> 
Ok, I understand that it's legacy so maybe we should keep it as is.
It's terribly confusing though - I spent days trying to get my acceleration to
work (i.e. pass testmgr) only to realise after reverse engineering the generic 
implementation that it's not really ghash at all.

I guess the real problem is that this information can currently only be 
obtained by fully reverse engineering the implementation. And I'm a firm
believer in the natural order of things: programmers write code, compilers 
read code, not the other way around ...

> So are you proposing that it be renamed?  Or are you proposing that a multi
> argument hashing API be added?  Or are you proposing that universal functions
> not be exposed through the crypto API?  What specifically is your constructive
> suggestion to improve things?
> 
I guess my constructive suggestion *for the future* would be to be more careful
with the naming. Don't give something a "known" name if it does not comply with
the matching specification. Renaming stuff now is probably counter-productive,
but putting some remarks somewhere (near the actual test vectors may work?)
about implementation x not actually being known entity X would be nice.
(Or even just some reference on where the test vectors came from!)

Regards,
Pascal van Leeuwen
Silicon IP Architect, Multi-Protocol Engines @ Verimatrix
www.insidesecure.com


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: ghash
  2019-07-19 20:49       ` ghash Pascal Van Leeuwen
@ 2019-07-19 21:48         ` Eric Biggers
  2019-07-19 22:35           ` ghash Eric Biggers
  2019-07-19 23:09           ` ghash Pascal Van Leeuwen
  0 siblings, 2 replies; 9+ messages in thread
From: Eric Biggers @ 2019-07-19 21:48 UTC (permalink / raw)
  To: Pascal Van Leeuwen; +Cc: linux-crypto, Herbert Xu, davem

On Fri, Jul 19, 2019 at 08:49:02PM +0000, Pascal Van Leeuwen wrote:
> Hi Eric,
> 
> > -----Original Message-----
> > From: linux-crypto-owner@vger.kernel.org <linux-crypto-owner@vger.kernel.org> On Behalf Of Eric Biggers
> > Sent: Friday, July 19, 2019 9:57 PM
> > To: Pascal Van Leeuwen <pvanleeuwen@verimatrix.com>
> > Cc: linux-crypto@vger.kernel.org; Herbert Xu <herbert@gondor.apana.org.au>; davem@davemloft.net
> > Subject: Re: ghash
> > 
> > Hi Pascal,
> > 
> > On Fri, Jul 19, 2019 at 07:26:02PM +0000, Pascal Van Leeuwen wrote:
> > > > -----Original Message-----
> > > > From: linux-crypto-owner@vger.kernel.org <linux-crypto-owner@vger.kernel.org> On Behalf Of Eric Biggers
> > > > Sent: Friday, July 19, 2019 6:16 PM
> > > > To: Pascal Van Leeuwen <pvanleeuwen@verimatrix.com>
> > > > Cc: linux-crypto@vger.kernel.org; Herbert Xu <herbert@gondor.apana.org.au>; davem@davemloft.net
> > > > Subject: Re: ghash
> > > >
> > > > On Fri, Jul 19, 2019 at 02:05:01PM +0000, Pascal Van Leeuwen wrote:
> > > > > Hi,
> > > > >
> > > > > While implementing GHASH support for the inside-secure driver and wondering why I couldn't get
> > > > > the test vectors to pass I have come to the conclusion that ghash-generic.c actually does *not*
> > > > > implement GHASH at all. It merely implements the underlying chained GF multiplication, which,
> > > > > I understand, is convenient as a building block for e.g. aes-gcm but is is NOT the full GHASH.
> > > > > Most importantly, it does NOT actually close the hash, so you can trivially add more data to the
> > > > > authenticated block (i.e. the resulting output cannot be used directly without external closing)
> > > > >
> > > > > GHASH is defined as GHASH(H,A,C) whereby you do this chained GF multiply on a block of AAD
> > > > > data padded to 16 byte alignment with zeroes, followed by a block of ciphertext padded to 16
> > > > > byte alignment with zeroes, followed by a block that contains both AAD and cipher length.
> > > > >
> > > > > See also https://en.wikipedia.org/wiki/Galois/Counter_Mode
> > > > >
> > > > > Regards,
> > > > > Pascal van Leeuwen
> > > > > Silicon IP Architect, Multi-Protocol Engines @ Verimatrix
> > > > > www.insidesecure.com
> > > > >
> > > >
> > > > Yes that's correct.  The hash APIs don't support multi-argument hashes, so
> > > > there's no natural way for it to be "full GHASH".  So it relies on the caller to
> > > > format the AAD and ciphertext into a single stream.  IMO it really should be
> > > > called something like "ghash_core".
> > > >
> > > > Do you have some question or suggestion, or was this just an observation?
> > > >
> > > Well, considering it's pretending to be GHASH I was more less considering this a bug report ...
> > >
> > > There's the inherent danger that someone not aware of the actual implementation tries to
> > > use it as some efficient (e.g. due to instruction set support) secure authentication function.
> > > Which, without proper external data formatting, it's surely not in its current form. This is
> > > not something you will actually notice when just using it locally for something (until
> > > someone actually breaks it).
> > 
> > You do understand that GHASH is not a MAC, right?  It's only a universal
> > function.  
> >
> It's a universal keyed hash. Which you could use as a MAC, although, admittedly,
> it would be rather weak, which is why the tag is usually additionally encrypted.
> (which you could do externally, knowing that that's needed with GHASH)
> In any case, the crypto API's ghash does not do what you would expect of a GHASH.

Well you could also use CRC-32 "as a MAC".  That doesn't actually make it a MAC.

> 
> > Specifically, "almost-epsilon-XOR-universal".  So even if there was a
> > more natural API to access GHASH, it's still incorrect to use it outside of a
> > properly reviewed crypto mode of operation.  IOW, anyone using GHASH directly as
> > a MAC is screwed anyway no matter which API they are using, or misusing.
> > 
> "Anyone" may actually know what they're doing. But rely on ghash being GHASH ...
> So this is a rather lame argument.
> 
> > >
> > > And then there was the issue of wanting the offload it to hardware, but that's kind of hard
> > > if the software implementation does not follow the spec where the hardware does ...
> > >
> > > I think more care should be taken with the algorithm naming - if it has a certain name,
> > > you expect it to follow the matching specification (fully). I have already identified 2 cases
> > > now (xts and ghash) where that is not actually the case.
> > >
> > 
> > If you take an (AD, Ctext) pair and format it into a single data stream the way
> > the API requires, then you will get the result defined by the specification.  So
> > it does follow the specification as best it can given the API which takes a
> > single data stream.  As I said though, I think "ghash_core" would be a better
> > name.  Note that it was added 10 years ago; I'm not sure it can actually be
> > renamed, but there may be a chance since no one should be using it directly.
> > 
> Ok, I understand that it's legacy so maybe we should keep it as is.
> It's terribly confusing though - I spent days trying to get my acceleration to
> work (i.e. pass testmgr) only to realise after reverse engineering the generic 
> implementation that it's not really ghash at all.
> 
> I guess the real problem is that this information can currently only be 
> obtained by fully reverse engineering the implementation. And I'm a firm
> believer in the natural order of things: programmers write code, compilers 
> read code, not the other way around ...
> 
> > So are you proposing that it be renamed?  Or are you proposing that a multi
> > argument hashing API be added?  Or are you proposing that universal functions
> > not be exposed through the crypto API?  What specifically is your constructive
> > suggestion to improve things?
> > 
> I guess my constructive suggestion *for the future* would be to be more careful
> with the naming. Don't give something a "known" name if it does not comply with
> the matching specification. Renaming stuff now is probably counter-productive,
> but putting some remarks somewhere (near the actual test vectors may work?)
> about implementation x not actually being known entity X would be nice.
> (Or even just some reference on where the test vectors came from!)
> 

I think a comment at the top of ghash-generic.c would be helpful, similar to the
one I wrote in nhpoly1305.c explaining that particular algorithm.

I'm surprised that you spent "days" debugging this, though.  Since the API gives
you a single data stream, surely you would have had to check at some point how
the two formal arguments (AAD, ciphertext) were encoded into it?  Were you just
passing the whole thing as the AAD or something?  Also to reiterate, it actually
does implement the GHASH algorithm correctly; the two formal parameters just
have to be encoded into the single data stream in a particular way.

- Eric

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: ghash
  2019-07-19 21:48         ` ghash Eric Biggers
@ 2019-07-19 22:35           ` Eric Biggers
  2019-07-19 23:25             ` ghash Pascal Van Leeuwen
  2019-07-19 23:09           ` ghash Pascal Van Leeuwen
  1 sibling, 1 reply; 9+ messages in thread
From: Eric Biggers @ 2019-07-19 22:35 UTC (permalink / raw)
  To: Pascal Van Leeuwen, linux-crypto, Herbert Xu, davem

On Fri, Jul 19, 2019 at 02:48:11PM -0700, Eric Biggers wrote:
> > 
> > > So are you proposing that it be renamed?  Or are you proposing that a multi
> > > argument hashing API be added?  Or are you proposing that universal functions
> > > not be exposed through the crypto API?  What specifically is your constructive
> > > suggestion to improve things?
> > > 
> > I guess my constructive suggestion *for the future* would be to be more careful
> > with the naming. Don't give something a "known" name if it does not comply with
> > the matching specification. Renaming stuff now is probably counter-productive,
> > but putting some remarks somewhere (near the actual test vectors may work?)
> > about implementation x not actually being known entity X would be nice.
> > (Or even just some reference on where the test vectors came from!)
> > 
> 
> I think a comment at the top of ghash-generic.c would be helpful, similar to the
> one I wrote in nhpoly1305.c explaining that particular algorithm.
> 
> I'm surprised that you spent "days" debugging this, though.  Since the API gives
> you a single data stream, surely you would have had to check at some point how
> the two formal arguments (AAD, ciphertext) were encoded into it?  Were you just
> passing the whole thing as the AAD or something?  Also to reiterate, it actually
> does implement the GHASH algorithm correctly; the two formal parameters just
> have to be encoded into the single data stream in a particular way.
> 

Hmm, NIST SP 800-38D actually defines GHASH to take one argument, same as the
Linux version.  So even outside Linux, there is no consensus on whether "GHASH"
refers to the one argument or two argument versions.

I.e. even if we had an API where the AAD and ciphertext were passed in
separately, which is apparently what you'd prefer, people would complain that it
doesn't match the NIST standard version.

Of course, in the end the actually important thing is that everyone agrees on
GCM, not that they agree on GHASH.

- Eric

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: ghash
  2019-07-19 21:48         ` ghash Eric Biggers
  2019-07-19 22:35           ` ghash Eric Biggers
@ 2019-07-19 23:09           ` Pascal Van Leeuwen
  1 sibling, 0 replies; 9+ messages in thread
From: Pascal Van Leeuwen @ 2019-07-19 23:09 UTC (permalink / raw)
  To: Eric Biggers; +Cc: linux-crypto, Herbert Xu, davem

> > It's a universal keyed hash. Which you could use as a MAC, although, admittedly,
> > it would be rather weak, which is why the tag is usually additionally encrypted.
> > (which you could do externally, knowing that that's needed with GHASH)
> > In any case, the crypto API's ghash does not do what you would expect of a GHASH.
> 
> Well you could also use CRC-32 "as a MAC".  That doesn't actually make it a MAC.
> 
For one thing, CRC-32 is not keyed so I don't see how you would use it as a MAC?
It's also not a cryptographic hash function as the output is biased.

And it's true GHASH by itself is not a good MAC as you can recover the key, but
that is easily solved by encrypting the tag with a one-time pad. Which, admittedly,
is not included in the GHASH definition itself, which makes it rather oddball as it
is also not a cryptographic hash function if the key is known.

Then again, does it have to be? You just mentioned CRC-32 being supported by
the crypto API, which is also not a cryptographic construct ...

> > I guess my constructive suggestion *for the future* would be to be more careful
> > with the naming. Don't give something a "known" name if it does not comply with
> > the matching specification. Renaming stuff now is probably counter-productive,
> > but putting some remarks somewhere (near the actual test vectors may work?)
> > about implementation x not actually being known entity X would be nice.
> > (Or even just some reference on where the test vectors came from!)
> >
> 
> I think a comment at the top of ghash-generic.c would be helpful, similar to the
> one I wrote in nhpoly1305.c explaining that particular algorithm.
> 
That sounds good as well. Although that would be the very last place I would
(and did) look.

> I'm surprised that you spent "days" debugging this, though.  Since the API gives
> you a single data stream, surely you would have had to check at some point how
> the two formal arguments (AAD, ciphertext) were encoded into it?  Were you just
> passing the whole thing as the AAD or something?  
>
Well, I had 2 possibilities: either everything was AAD or everything was ciphertext
and neither got matching results ... and then you go off into trying various byte 
orders (9 our of 10 times it's just some endianess issue) and investigating potential
bugs or limitations with the hardware itself. Of course this being crypto (and worse: 
only getting a tag out of the hardware at the very end)  there's not a whole lot of 
information to work with. Basically only pass or fail. So yes, you quickly end up 
wasting a lot of time.
I was NOT expecting ghash not to be GHASH as I know it, so it really was the last 
place to look.

> Also to reiterate, it actually
> does implement the GHASH algorithm correctly; the two formal parameters just
> have to be encoded into the single data stream in a particular way.
> 
I was about to make a remark stating that that's like saying a simple block XOR 
actually does implement AES counter mode (or ARC4 or Chacha20 or other random 
stream cipher) correctly as long as you generate the key stream data block in a 
particular way ...

And then I came across a 2007 NIST specification that defines GHASH exactly as it's 
implemented here, namely with all the formatting left out, just GHASH(K, X).
I've always known GHASH the way Wikipedia defines it (which comes from the
original 2005 spec by McGrew and Viega), and that's also how our hardware 
implements it (i.e. it cannot do GHASH without the padding and  the appended 
length word), but apparently there are different opinions out  there ... so I stand 
corrected?!

Regards,
Pascal van Leeuwen
Silicon IP Architect, Multi-Protocol Engines @ Verimatrix
www.insidesecure.com

^ permalink raw reply	[flat|nested] 9+ messages in thread

* RE: ghash
  2019-07-19 22:35           ` ghash Eric Biggers
@ 2019-07-19 23:25             ` Pascal Van Leeuwen
  0 siblings, 0 replies; 9+ messages in thread
From: Pascal Van Leeuwen @ 2019-07-19 23:25 UTC (permalink / raw)
  To: Eric Biggers, linux-crypto, Herbert Xu, davem

> 
> Hmm, NIST SP 800-38D actually defines GHASH to take one argument, same as the
> Linux version.  So even outside Linux, there is no consensus on whether "GHASH"
> refers to the one argument or two argument versions.
> 
Funny, I just stumbled upon that 2007 NIST specification myself minutes ago, before I
saw this mail. So yes, it looks like they redefined GHASH from the original 2005 
McGrew/Viega definition there, although I don't have a clue as to why ...
Must've been after our initial hardware implementation though, since all of our 
hardware strictly implements it according to the original McGrew/Viega definition
(which is also still surviving on Wikipedia)

> I.e. even if we had an API where the AAD and ciphertext were passed in
> separately, which is apparently what you'd prefer, people would complain that it
> doesn't match the NIST standard version.
> 
Prefer is a big word ... it's just that our hardware cannot do unformatted GHASH
that way as that was not the specification at the time we implemented it.
And that redefinition went totally unnoticed.

> Of course, in the end the actually important thing is that everyone agrees on
> GCM, not that they agree on GHASH.
> 
As long as GHASH is not being used for much else, probably. And in either case,
the crypto API implementation is flexible as it just pushes all the formatting out.
Our hardware, on the other hand :-(

Regards,
Pascal van Leeuwen
Silicon IP Architect, Multi-Protocol Engines @ Verimatrix
www.insidesecure.com

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2019-07-19 23:25 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-19 14:05 ghash Pascal Van Leeuwen
2019-07-19 16:16 ` ghash Eric Biggers
2019-07-19 19:26   ` ghash Pascal Van Leeuwen
2019-07-19 19:56     ` ghash Eric Biggers
2019-07-19 20:49       ` ghash Pascal Van Leeuwen
2019-07-19 21:48         ` ghash Eric Biggers
2019-07-19 22:35           ` ghash Eric Biggers
2019-07-19 23:25             ` ghash Pascal Van Leeuwen
2019-07-19 23:09           ` ghash Pascal Van Leeuwen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.