All of lore.kernel.org
 help / color / mirror / Atom feed
From: Pascal Van Leeuwen <pvanleeuwen@verimatrix.com>
To: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: "linux-crypto@vger.kernel.org" <linux-crypto@vger.kernel.org>,
	"herbert@gondor.apana.org.au" <herbert@gondor.apana.org.au>
Subject: RE: Key endianness?
Date: Mon, 21 Oct 2019 21:40:05 +0000	[thread overview]
Message-ID: <CH2PR20MB29682A9EA7020D3156883F5ACA690@CH2PR20MB2968.namprd20.prod.outlook.com> (raw)
In-Reply-To: <CAKv+Gu_uUGu0vhKTf49kpiUQFSt-p9mrx36wHbdeOFexdJR=mw@mail.gmail.com>

> -----Original Message-----
> From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> Sent: Monday, October 21, 2019 9:47 PM
> To: Pascal Van Leeuwen <pvanleeuwen@verimatrix.com>
> Cc: linux-crypto@vger.kernel.org; herbert@gondor.apana.org.au
> Subject: Re: Key endianness?
> 
> On Mon, 21 Oct 2019 at 21:14, Pascal Van Leeuwen
> <pvanleeuwen@verimatrix.com> wrote:
> >
> > > -----Original Message-----
> > > From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> > > Sent: Monday, October 21, 2019 6:15 PM
> > > To: Pascal Van Leeuwen <pvanleeuwen@verimatrix.com>
> > > Cc: linux-crypto@vger.kernel.org; herbert@gondor.apana.org.au
> > > Subject: Re: Key endianness?
> > >
> > > On Mon, 21 Oct 2019 at 17:55, Pascal Van Leeuwen
> > > <pvanleeuwen@verimatrix.com> wrote:
> > > >
> > > > > -----Original Message-----
> > > > > From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> > > > > Sent: Monday, October 21, 2019 5:32 PM
> > > > > To: Pascal Van Leeuwen <pvanleeuwen@verimatrix.com>
> > > > > Cc: linux-crypto@vger.kernel.org; herbert@gondor.apana.org.au
> > > > > Subject: Re: Key endianness?
> > > > >
> > > > > p[
> > > > >
> > > > > On Mon, 21 Oct 2019 at 17:23, Pascal Van Leeuwen
> > > > > <pvanleeuwen@verimatrix.com> wrote:
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> > > > > > > Sent: Monday, October 21, 2019 2:54 PM
> > > > > > > To: Pascal Van Leeuwen <pvanleeuwen@verimatrix.com>
> > > > > > > Cc: linux-crypto@vger.kernel.org; herbert@gondor.apana.org.au
> > > > > > > Subject: Re: Key endianness?
> > > > > > >
> > > > > > > On Mon, 21 Oct 2019 at 14:40, Pascal Van Leeuwen
> > > > > > > <pvanleeuwen@verimatrix.com> wrote:
> > > > > > > >
> > > > > > > > > -----Original Message-----
> > > > > > > > > From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> > > > > > > > > Sent: Monday, October 21, 2019 1:59 PM
> > > > > > > > > To: Pascal Van Leeuwen <pvanleeuwen@verimatrix.com>
> > > > > > > > > Cc: linux-crypto@vger.kernel.org; herbert@gondor.apana.org.au
> > > > > > > > > Subject: Re: Key endianness?
> > > > > > > > >
> > > > > > > > > On Mon, 21 Oct 2019 at 12:56, Pascal Van Leeuwen
> > > > > > > > > <pvanleeuwen@verimatrix.com> wrote:
> > > > > > > > > >
> > > > > > > > > > Another endianness question:
> > > > > > > > > >
> > > > > > > > > > I have some data structure that can be either little or big endian,
> > > > > > > > > > depending on the exact use case. Currently, I have it defined as u32.
> > > > > > > > > > This causes sparse errors when accessing it using cpu_to_Xe32() and
> > > > > > > > > > Xe32_to_cpu().
> > > > > > > > > >
> > > > > > > > > > Now, for the big endian case, I could use htonl()/ntohl() instead,
> > > > > > > > > > but this is inconsistent with all other endian conversions in the
> > > > > > > > > > driver ... and there's no little endian alternative I'm aware of.
> > > > > > > > > > So I don't really like that approach.
> > > > > > > > > >
> > > > > > > > > > Alternatively, I could define a union of both a big and little
> > > > > > > > > > endian version of the data but that would require touching a lot
> > > > > > > > > > of legacy code (unless I use a C11 anonymous union ... not sure
> > > > > > > > > > if that would be allowed?) and IMHO is a bit silly.
> > > > > > > > > >
> > > > > > > > > > Is there some way of telling sparse to _not_ check for "correct"
> > > > > > > > > > use of these functions for a certain variable?
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > In this case, just use (__force __Xe32*) to cast it to the correct
> > > > > > > > > type. This annotates the cast as being intentionally endian-unclean,
> > > > > > > > > and shuts up Sparse.
> > > > > > > > >
> > > > > > > > Thanks for trying to help out, but that just gives me an
> > > > > > > > "error: not an lvalue" from both sparse and GCC.
> > > > > > > > But I'm probably doing it wrong somehow ...
> > > > > > > >
> > > > > > >
> > > > > > > It depends on what you are casting. But doing something like
> > > > > > >
> > > > > > > u32 l = ...
> > > > > > > __le32 ll = (__force __le32)l
> > > > > > >
> > > > > > > should not trigger a sparse warning.
> > > > > > >
> > > > > > I was actually casting the left side, not the right side,
> > > > > > as that's where my sparse issue was. Must be my poor grasp
> > > > > > of the C language hurting me here as I don't understand why
> > > > > > I'm not allowed to cast an array element to a different type
> > > > > > of the _same size_ ...
> > > > > >
> > > > > > i.e. why can't I do (__be32)some_u32_array[3] = cpu_to_be32(some_value)?
> > > > > >
> > > > >
> > > > > Because you can only change the type of an expression by casting, and
> > > > > an lvalue is not an expression. A variable has a type already, and you
> > > > > cannot cast that away - what would that mean, exactly? Would all
> > > > > occurrences of some_u32_array[] suddenly have a different type? Or
> > > > > only element [3]?
> > > > >
> > > > I think it would be perfectly logical to do such a cast and I'm really
> > > > surprised that it is not legal. Obviously, it would only apply to the
> > > > actual assignment it is used with. It's a cast, not a redefinition.
> > > > After all, a variable or an array item is just some storage area in
> > > > memory. Why shouldn't I be able to write to it _as if_ it is some
> > > > different type (if I know what I'm doing and especially if it is the
> > > > exact same size in memory)?
> > > >
> > > > >
> > > > > > I managed to work around it by doing *(__be32 *)&some_u32_array[3] =
> > > > > > but that's pretty ugly ... a better approach is still welcome.
> > > > > >
> > > > >
> > > > > You need to cast the right hand side, not the left hand side. If
> > > > > some_u32_array is u32[], force cast it to (__force u32)
> > > > >
> > > >
> > > > Sure, you can do the casting on the right hand side, but that may not
> > > > convey what you _really_ want to do, particularly in this case.
> > > > As I _really_ want to write a big endian word there. I don't want to
> > > > pretend I loose the endianness somewhere along the way. That written
> > > > word is still very much big endian.
> > > > (I know practically it makes no difference, but casting the left side
> > > > would just be so much clearer IMHO)
> > > >
> > >
> > >
> > > No, it really isn't, and I am tired of having another endless debate about this.
> > >
> > I was not intending to start a debate - I was just being a newby C programmer
> > being _honestly surprised_ by this limitation. You (or I, anyway) would just expect
> > it to work. The rest is personal taste, which is is not debatable (it just is).
> >
> 
> But it is not a limitation. You are arguing that it is natural and
> obvious to cast the expression, but it is not. Really.
> 
> (be32)lvalue = <expression of type be32>
> 
> would actually mean that the operation applied to the expression is
> the opposite, given the lvalue is *not* a be32 to begin with, and so
> you expect that lvalue is made to be a be32 for the purpose of the
> assignment, after which it is converted back to what it was before? In
> this particular case, this comes down to doing the inverse cast on the
> expression. But what happens is there is no inverse cast? What would
> it mean, for instance, to do
> 
> (u16)lvalue = <expression of type s16>
> 
> when lvalue is of type s16 itself? Does the sign bit become a data bit
> now? Do we have to keep track of this throughout the code?
> 
I thought you were the one not wanting to debate this ...

What I would expect such a cast to do is to simply treat the lefthand
side _as if_ it were defined as being the type of the cast, no more,
no less. So it would do whatever would happen if lvalue was defined
as a u16 in the first place. Which is probably not a whole lot for 
this fairly uninteresting example, as, as far as I know, C does not 
do anything  special if you assign a signed value to an unsigned 
variable.

It's difficult to come up with good examples that aren't contrived.
But say I have this variable that is usually a function pointer,
but I have this special case where it stores some integer value.
(because there I don't need the ptr and don't want to waste space)

Then (int)func_ptr = <integer value> is _more_ clear about
my intentions than func_ptr = (function ptr cast)<integer value>

Of course the same could be achieved with a union of a function ptr
and an int, but that may not be convenient if there's a lot of
legacy code using that function pointer already and C11 (and thus
anonymous unions) is not an option. 

> The bottom line is that the cast operation is only defined for values,
> not for variables. A variable's type is fixed - you cannot change it
> for the duration of a cast operation and expect the compiler to infer
> what this might mean in the general case, even if you think your
> endianness conversion example is an obvious case.
> 
I've understood by now that the C language specification does not 
allow it (although ... *(int *)&function_ptr = <integer value> ;-)

Which means this whole discussion is purely theoretical and
significantly off-topic for this mailing list, for which I apologize.
> 
> > > C permits casting of expressions, not of lvalues.
> > > ...
> > > > > > >
> > > > > > > > > If the
> > > > > > > > > hardware chooses to reorder those bytes for some reason, it is the
> > > > > > > > > responsibility of the driver to take care of that from the CPU side.
> > > > > > > > >
> > > > > > > > Which still requires you to know the byte order as used by the API.
> > > > > > > >
> > > > > > >
> > > > > > > Only if API means the AES or ChaCha specific helper routines that we
> > > > > > > have in the kernel. If you are using the AES helpers, then yes, you
> > > > > > > need to ensure that you use the same convention. But the algorithms
> > > > > > > themselves are fully defined by their specification, and so what other
> > > > > > > implementations in the kernel do is not really relevant.
> > > > > > >
> > > > > > What is relevant is what the API expects
> > > > >
> > > > > But *which* API? The skcipher API uses u8[] for in/output and keys,
> > > > > and how these byte arrays are interpreted is not (and cannot) be
> > > > > defined at this level of abstraction.
> > > > >
> > > > Yes, skcipher API. Obviously. As that's what we're talking about.
> > > > And _of course_ it has to be defined at that level of abstraction.
> > > > (which doesn't preclude inheriting it from some other specification)
> > > > Otherwise you would not able to e.g. exchange keys between different
> > > > platforms.
> > > >
> > >
> > > So what exactly are you suggesting? That the skcipher API should
> > > specify that the key is u8[], unless the algo in question operates on
> > > 32-bit words, in which case it is le32[], unless the algo in question
> > > operates on 64-bit words, in which case it is le64[] etc etc? Do you
> > > seriously think that at the skcipher API level we should mandate all
> > > of that? That is insane.
> > >
> > You think being _clear_ on the actual byte order is _insane_? Seriously?
> 
> That is not what I said.
> 
> I said that an abstract API that reasons about unspecified algorithms
> taking inputs and output of an unspecified nature, using keys of an
> unspecified nature should not be expected to define how such
> unspecified quantities are organized if they happen to be interpreted
> as multi-byte words by some of its implementations.
> 
If said API wants to achieve some kind of interoperability - i.e. allow 
for keys, nonces and IV's to be distributed to other platforms - it must
necessarily define the byte order thereof, either directly or by 
implication. I honestly can't believe you're trying to argue the contrary.

> > Like I said, inheriting that from the algorithm spec is fine but not all
> > algorithm specs are clear and unambiguous w.r.t. byte order.
> >
> 
> That doesn't make it the job of the abstraction that is layered on top
> to define it.
> 
Of course it does. Because if it doesn't, you wouldn't be able to exchange
key material with other implementations/platforms. Or extract it from/
insert it into some protocol data stream.

> > I know this is not crypto perse, but what would be the logical byte order
> > for a CRC32 "key" considering it is a 32 bit word? The CRC32 definition
> > sure isn't going to help you there, it is specified on 32 bit words only.
> > So in such cases it must be clear how the byte stream maps onto the word.
> >
> 
> Yes, so for the shash implementations of "crc32" and "crc32c", it is
> defined by the implementation. But that doesn't mean shash should
> specify that this is the case for all algorithms.
>
It should never be "defined by the implementation", for reasons I already
mentioned above. In fact, we have these crypto self-tests to ensure all
implementations behave _exactly_ the same. Which means there must be some
higher level specification they must all comply with. There has to be.
 
> > > > >
> > > > > > ... and from 20 years of
> > > > > > experience I would say many algorithm specifications are not exactly
> > > > > > very clear on the byte order at all, often assuming this to be
> > > > > > "obvious". (and if it's not little-endian, it's not obvious to me ;-)
> > > > > >
> > > > >
> > > > > I agree that not all specs are crystal clear on this. But it is still
> > > > > the algorithm that needs to define this.
> > > > >
> > > > In an ideal world, probably. In the real world, it is entirely possible
> > > > for an implementation to expect the key bytes in a different order.
> > > > Would not be the first time I run into that.
> > > >
> > >
> > > Of course. But that is not the point. The skcipher API cannot possibly
> > > reason about byte orders of all current and future algorithms that it
> > > may ever encapsulate.
> > >
> > Well, so far I got the impression that at least the intention is
> > to follow the cipher specification. That's a start. You could add
> > some generic rules on top of that, like "if the specification is
> > word based and does not mandate a byte order, than these words
> > shall be stored in little-endian byte order". It's not that hard.
> > You really don't need to know "future algorithm" details for that.
> >
> 
> I am not saying it is hard. I am saying it is wrong. The skcipher
> API's job is not to fill holes in the algorithm's specification.
>
It's _at least_ the API's job to ensure there is consistency across 
different implementations of the same algorithm, which implies there
_must_ be consensus on the byte order. So "wrong" my ass. QED.

Regards,
Pascal van Leeuwen
Silicon IP Architect, Multi-Protocol Engines @ Verimatrix
www.insidesecure.com

  reply	other threads:[~2019-10-21 21:40 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-21 10:55 Key endianness? Pascal Van Leeuwen
2019-10-21 11:59 ` Ard Biesheuvel
2019-10-21 12:39   ` Pascal Van Leeuwen
2019-10-21 12:53     ` Ard Biesheuvel
2019-10-21 15:23       ` Pascal Van Leeuwen
2019-10-21 15:31         ` Ard Biesheuvel
2019-10-21 15:55           ` Pascal Van Leeuwen
2019-10-21 16:14             ` Ard Biesheuvel
2019-10-21 19:14               ` Pascal Van Leeuwen
2019-10-21 19:47                 ` Ard Biesheuvel
2019-10-21 21:40                   ` Pascal Van Leeuwen [this message]
2019-10-21 12:07 ` Pascal Van Leeuwen
2019-10-21 12:11   ` Ard Biesheuvel
2019-10-21 12:20     ` Pascal Van Leeuwen
  -- strict thread matches above, loose matches on Subject: below --
2019-10-21  9:03 Pascal Van Leeuwen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CH2PR20MB29682A9EA7020D3156883F5ACA690@CH2PR20MB2968.namprd20.prod.outlook.com \
    --to=pvanleeuwen@verimatrix.com \
    --cc=ard.biesheuvel@linaro.org \
    --cc=herbert@gondor.apana.org.au \
    --cc=linux-crypto@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.