linux-crypto.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Jason A. Donenfeld" <Jason@zx2c4.com>
To: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Linux Crypto Mailing List <linux-crypto@vger.kernel.org>,
	linux-arm-kernel <linux-arm-kernel@lists.infradead.org>,
	Herbert Xu <herbert@gondor.apana.org.au>,
	David Miller <davem@davemloft.net>,
	Greg KH <gregkh@linuxfoundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Samuel Neves <sneves@dei.uc.pt>,
	Dan Carpenter <dan.carpenter@oracle.com>,
	Arnd Bergmann <arnd@arndb.de>, Eric Biggers <ebiggers@google.com>,
	Andy Lutomirski <luto@kernel.org>, Will Deacon <will@kernel.org>,
	Marc Zyngier <maz@kernel.org>,
	Catalin Marinas <catalin.marinas@arm.com>
Subject: Re: [RFC PATCH 00/18] crypto: wireguard using the existing crypto API
Date: Thu, 26 Sep 2019 22:47:12 +0200	[thread overview]
Message-ID: <CAHmME9rKFUvsQ6hhsKjxxVSnyNQsTaqBKGABoHibCiCBmfxCOA@mail.gmail.com> (raw)
In-Reply-To: <CAKv+Gu-RLRhwDahgvfvr2J9R+3GPM6vh4mjO73VcekusdzbuMA@mail.gmail.com>

Hi Ard,

On Thu, Sep 26, 2019 at 2:07 PM Ard Biesheuvel
<ard.biesheuvel@linaro.org> wrote:
> attitude goes counter to that, and this is why we have made so little
> progress over the past year.

I also just haven't submitted much in the past year, taking a bit of a
break to see how things would settle. Seemed like rushing things
wasn't prudent, so I slowed down.

> But I am happy with your willingness to collaborate and find common
> ground, which was also my motivation for spending a considerable
> amount of time to prepare this patch set.

Super.

> > If the process of doing that together will be fraught with difficulty,
> > I’m still open to the “7 patch series” with the ugly cryptoapi.c
> > approach, as described at the top.
>
> If your aim is to write ugly code and use that as a munition

No, this is not a matter of munition at all. Please take my words
seriously; I am entirely genuine here. Three people I greatly respect
made a very compelling argument to me, prompting the decision in [1].
The argument was that trying to fix the crypto layer AND trying to get
WireGuard merged at the same time was ambitious and crazy. Maybe
instead, they argued, I should just use the old crypto API, get
WireGuard in, and then begin the Zinc process after. I think
procedurally, that's probably good advice, and the people I was
talking to seemed to have a really firm grasp on what works and what
doesn't in the mainlining process. Now it's possible their judgement
is wrong, but I really am open, in earnest, to following it. And the
way that would look would be not trying to fix the crypto API now,
getting WireGuard in based on whatever we can cobble together based on
the current foundations with some intermediate file (cryptoapi.c in my
previous email) to prevent it from infecting WireGuard. This isn't
"munition"; it's a serious proposal.

The funny thing, though, is that all the while I was under the
impression somebody had figured out a great way to do this, it turns
out you were busy with basically Zinc-but-not. So we're back to square
one: you and I both want the crypto API to change, and now we have to
figure out a way forward together on how to do this, prompting my last
email to you, indicating that I was open to all sorts of compromises.
However, I still remain fully open to following the prior suggestion,
of not doing that at all right now, and instead basing this on the
existing crypto API as-is.

[1] https://lore.kernel.org/wireguard/CAHmME9pmfZAp5zd9BDLFc2fWUhtzZcjYZc2atTPTyNFFmEdHLg@mail.gmail.com/

> > reference, here’s what that kind of thing looks like: [2].
>
> This is one of the issues in the 'fix it for everyone else as well'
> category. If we can improve the crypto API to be less susceptible to
> these issues (e.g., using static calls), everybody benefits. I'd be
> happy to collaborate on that.

Good. I'm happy to learn that you're all for having fast
implementations that underlie the simple function calls.

> > Taken together, we simply can’t skimp on the implementations available
> > on the handshake layer, so we’ll need to add some form of
> > implementation selection, whether it’s the method Zinc uses ([2]), or
> > something else we cook up together.
>
> So are you saying that the handshake timing constraints in the
> WireGuard protocol are so stringent that we can't run it securely on,
> e.g., an ARM CPU that lacks a NEON unit? Or given that you are not
> providing accelerated implementations of blake2s or Curve25519 for
> arm64, we can't run it securely on arm64 at all?

Deployed at scale, the handshake must have a certain performance to
not be DoS'd. I've spent a long time benching these and attacking my
own code.  I won't be comfortable with this going in without the fast
implementations for the handshake. And down the line, too, we can look
into how to even improve the DoS resistance. I think there's room for
improvement, and I hope at some point you'll join us in discussions on
WireGuard internals. But the bottom line is that we need those fast
primitives.

> Typically, I would prefer to only introduce different versions of the
> same algorithm if there is a clear performance benefit for an actual
> use case.

As I was saying, this is indeed the case.

> Framing this as a security issue rather than a performance issue is
> slightly disingenuous, since people are less likely to challenge it.

I'm not being disingenuous. DoS resistance is a real issue with
WireGuard. You might argue that FourQ and Siphash would have made
better choices, and that's an interesting discussion, but it is what
it is. The thing needs fast implementations. And we're going to have
to implement that code anyway for other things, so might as well get
it working well now.

> But the security of any VPN protocol worth its salt

You're not required to use WireGuard.

> Parsing the string and connecting the function pointers happens only
> once, and only when the transform needs to be instantiated from its
> constituent parts. Subsequent invocations will just grab the existing
> object.

That's good to know. It doesn't fully address the issue, though.

> My preference would be to address this by permitting per-request keys
> in the AEAD layer. That way, we can instantiate the transform only
> once, and just invoke it with the appropriate key on the hot path (and
> avoid any per-keypair allocations)

That'd be a major improvement to the async interface, yes.

> > So given the above, how would you like to proceed? My personal
> > preference would be to see you start with the Zinc patchset and rename
> > things and change the infrastructure to something that fits your
> > preferences, and we can see what that looks like. Less appealing would
> > be to do several iterations of you reworking Zinc from scratch and
> > going through the exercises all over again, but if you prefer that I
> > guess I could cope. Alternatively, maybe this is a lot to chew on, and
> > we should just throw caution into the wind, implement cryptoapi.c for
> > WireGuard (as described at the top), and add C functions to the crypto
> > API sometime later? This is what I had envisioned in [1].

> It all depends on whether we are interested in supporting async
> accelerators or not, and it is clear what my position is on this
> point.

For a first version of WireGuard, no, I'm really not interested in
that. Adding it in there is more ambitious than it looks to get it
right. Async means more buffers, which means the queuing system for
WireGuard needs to be changed. There's already ongoing research into
this, and I'm happy to consider that research with a light toward
maybe having async stuff in the future. But sticking into the code now
as-is simply does not work from a buffering/queueing perspective. So
again, let's take an iterative approach here: first we do stuff with
the simple synchronous API. After the dust has settled, hardware is
available for testing, Van Jacobson has been taken off the bookshelf
for a fresh reading, and we've all sat down for a few interesting
conversations at netdev on queueing and bufferbloat, then let's start
working this in. In otherwords, just because technically you can glue
those APIs together, sort of, doesn't mean that approach makes sense
for the system as a whole.

> I am not convinced that we need accelerated implementations of blake2s
> and curve25519, but if we do, I'd like those to be implemented as
> individual modules under arch/*/crypto, with some moduleloader fu for
> weak symbols or static calls thrown in if we have to. Exposing them as
> shashes seems unnecessary to me at this point.

We need the accelerated implementations. And we'll need it for chapoly
too, obviously. So let's work out a good way to hook that all into the
Zinc-style interface. [2] does it in a very effective way that's
overall quite good for performance and easy to follow. The
chacha20-x86_64-glue.c code itself gets called via the static symbol
chacha20_arch. This is implemented for each platform with a fall back
to one that returns false, so that the generic code is called. The
Zinc stuff here is obvious, simple, and I'm pretty sure you know
what's up with it.

I prefer each of these glue implementations to live in
lib/zinc/chacha20/chacha20-${ARCH}-glue.c. You don't like that and
want things in arch/${ARCH}/crypto/chacha20-glue.c. Okay, sure, fine,
let's do all the naming and organization and political stuff how you
like, and I'll leave aside my arguments about why I disagree. Let's
take stock of where that leaves us, in terms of files:

- lib/crypto/chacha20.c: this has a generic implementation, but at the
top of the generic implementation, it has some code like "if
(chacha20_arch(..., ..., ...)) return;"
- arch/crypto/x86_64/chacha20-glue.c: this has the chacha20_arch()
implementation, which branches out to the various SIMD implementations
depending on some booleans calculated at module load time.
- arch/crypto/arm/chacha20-glue.c: this has the chacha20_arch()
implementation, which branches out to the various SIMD implementations
depending on some booleans calculated at module load time.
- arch/crypto/mips/chacha20-glue.c: this has the chacha20_arch()
implementation, which contains an assembly version that's always run
unconditionally.

Our goals are that chacha20_arch() from each of these arch glues gets
included in the lib/crypto/chacha20.c compilation unit. The reason why
we want it in its own unit is so that the inliner can get rid of
unreached code and more tightly integrate the branches. For the MIPS
case, the advantage is clear. Here's how Zinc handles it: [3]. Some
simple ifdefs and includes. Easy and straightforward. Some people
might object, though, and it sounds like you might. So let's talk
about some alternative mechanisms with their pros and cons:

- The zinc method: straightforward, but not everybody likes ifdefs.
- Stick the filename to be included into a Kconfig variable and let
the configuration system do the logic: also straightforward. Not sure
it's kosher, but it would work.
- Weak symbols: we don't get inlining or the dead code elimination.
- Function pointers: ditto, plus spectre.
- Other ideas you might have? I'm open to suggestions here.

[2] https://git.kernel.org/pub/scm/linux/kernel/git/zx2c4/linux.git/tree/lib/zinc/chacha20/chacha20-x86_64-glue.c?h=jd/wireguard#n54
[3] https://git.kernel.org/pub/scm/linux/kernel/git/zx2c4/linux.git/tree/lib/zinc/chacha20/chacha20.c?h=jd/wireguard#n19

> What I *don't* want is to merge WireGuard with its own library based
> crypto now, and extend that later for async accelerators once people
> realize that we really do need that as well.

I wouldn't worry so much about that. Zinc/library-based-crypto is just
trying to fulfill the boring synchronous pure-code part of crypto
implementations. For the async stuff, we can work together on
improving the existing crypto API to be more appealing, in tandem with
some interesting research into packet queuing systems. From the other
thread, you might have seen that already Toke has cool ideas that I
hope we can all sit down and talk about. I'm certainly not interested
in "bolting" anything on to Zinc/library-based-crypto, and I'm happy
to work to improve the asynchronous API for doing asynchronous crypto.

Jason

  parent reply	other threads:[~2019-09-26 20:47 UTC|newest]

Thread overview: 66+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-25 16:12 [RFC PATCH 00/18] crypto: wireguard using the existing crypto API Ard Biesheuvel
2019-09-25 16:12 ` [RFC PATCH 01/18] crypto: shash - add plumbing for operating on scatterlists Ard Biesheuvel
2019-09-25 16:12 ` [RFC PATCH 02/18] crypto: x86/poly1305 - implement .update_from_sg method Ard Biesheuvel
2019-09-25 16:12 ` [RFC PATCH 03/18] crypto: arm/poly1305 - incorporate OpenSSL/CRYPTOGAMS NEON implementation Ard Biesheuvel
2019-09-25 16:12 ` [RFC PATCH 04/18] crypto: arm64/poly1305 " Ard Biesheuvel
2019-09-25 16:12 ` [RFC PATCH 05/18] crypto: chacha - move existing library code into lib/crypto Ard Biesheuvel
2019-09-25 16:12 ` [RFC PATCH 06/18] crypto: rfc7539 - switch to shash for Poly1305 Ard Biesheuvel
2019-09-25 16:12 ` [RFC PATCH 07/18] crypto: rfc7539 - use zero reqsize for sync instantiations without alignmask Ard Biesheuvel
2019-09-25 16:12 ` [RFC PATCH 08/18] crypto: testmgr - add a chacha20poly1305 test case Ard Biesheuvel
2019-09-25 16:12 ` [RFC PATCH 09/18] crypto: poly1305 - move core algorithm into lib/crypto Ard Biesheuvel
2019-09-25 16:12 ` [RFC PATCH 10/18] crypto: poly1305 - add init/update/final library routines Ard Biesheuvel
2019-09-25 16:12 ` [RFC PATCH 11/18] int128: move __uint128_t compiler test to Kconfig Ard Biesheuvel
2019-09-25 21:01   ` Linus Torvalds
2019-09-25 21:19     ` Ard Biesheuvel
2019-09-25 16:12 ` [RFC PATCH 12/18] crypto: BLAKE2s - generic C library implementation and selftest Ard Biesheuvel
2019-09-25 16:12 ` [RFC PATCH 13/18] crypto: Curve25519 - generic C library implementations " Ard Biesheuvel
2019-09-25 16:12 ` [RFC PATCH 14/18] crypto: chacha20poly1305 - import construction and selftest from Zinc Ard Biesheuvel
2019-09-25 16:12 ` [RFC PATCH 15/18] net: WireGuard secure network tunnel Ard Biesheuvel
2019-09-25 16:12 ` [RFC PATCH 16/18] netlink: use new strict length types in policy for 5.2 Ard Biesheuvel
2019-09-25 16:12 ` [RFC PATCH 17/18] wg switch to lib/crypto algos Ard Biesheuvel
2019-09-25 16:12 ` [RFC PATCH 18/18] net: wireguard - switch to crypto API for packet encryption Ard Biesheuvel
2019-09-25 22:15   ` Linus Torvalds
2019-09-25 22:22     ` Linus Torvalds
2019-09-26  9:40     ` Pascal Van Leeuwen
2019-09-26 16:35       ` Linus Torvalds
2019-09-27  0:15         ` Pascal Van Leeuwen
2019-09-27  1:30           ` Linus Torvalds
2019-09-27  2:54             ` Linus Torvalds
2019-09-27  3:53               ` Herbert Xu
2019-09-27  4:37                 ` Andy Lutomirski
2019-09-27  4:59                   ` Herbert Xu
2019-09-27  4:01               ` Herbert Xu
2019-09-27  4:13                 ` Linus Torvalds
2019-09-27 10:44               ` Pascal Van Leeuwen
2019-09-27 11:08                 ` Pascal Van Leeuwen
2019-09-27  4:36             ` Andy Lutomirski
2019-09-27  9:58             ` Pascal Van Leeuwen
2019-09-27 10:11               ` Herbert Xu
2019-09-27 16:23               ` Linus Torvalds
2019-09-30 11:14                 ` France didn't want GSM encryption Marc Gonzalez
2019-09-30 21:37                   ` Linus Torvalds
2019-09-30 20:44                 ` [RFC PATCH 18/18] net: wireguard - switch to crypto API for packet encryption Pascal Van Leeuwen
2019-09-27  2:06           ` Linus Torvalds
2019-09-27 10:11             ` Pascal Van Leeuwen
2019-09-26 11:06     ` Ard Biesheuvel
2019-09-26 12:34       ` Ard Biesheuvel
2019-09-26  8:59 ` [RFC PATCH 00/18] crypto: wireguard using the existing crypto API Jason A. Donenfeld
2019-09-26 10:19   ` Pascal Van Leeuwen
2019-09-26 10:59     ` Jason A. Donenfeld
2019-09-26 11:06     ` chapoly acceleration hardware [Was: Re: [RFC PATCH 00/18] crypto: wireguard using the existing crypto API] Jason A. Donenfeld
2019-09-26 11:38       ` Toke Høiland-Jørgensen
2019-09-26 13:52       ` Pascal Van Leeuwen
2019-09-26 23:13         ` Dave Taht
2019-09-27 12:18           ` Pascal Van Leeuwen
2019-09-26 22:47       ` Jakub Kicinski
2019-09-26 12:07   ` [RFC PATCH 00/18] crypto: wireguard using the existing crypto API Ard Biesheuvel
2019-09-26 13:06     ` Pascal Van Leeuwen
2019-09-26 13:15       ` Ard Biesheuvel
2019-09-26 14:03         ` Pascal Van Leeuwen
2019-09-26 14:52           ` Ard Biesheuvel
2019-09-26 15:04             ` Pascal Van Leeuwen
2019-09-26 20:47     ` Jason A. Donenfeld [this message]
2019-09-26 21:22       ` Andrew Lunn
2019-09-26 21:36       ` Andy Lutomirski
2019-09-27  7:20         ` Jason A. Donenfeld
2019-10-01  8:56           ` Ard Biesheuvel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAHmME9rKFUvsQ6hhsKjxxVSnyNQsTaqBKGABoHibCiCBmfxCOA@mail.gmail.com \
    --to=jason@zx2c4.com \
    --cc=ard.biesheuvel@linaro.org \
    --cc=arnd@arndb.de \
    --cc=catalin.marinas@arm.com \
    --cc=dan.carpenter@oracle.com \
    --cc=davem@davemloft.net \
    --cc=ebiggers@google.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=herbert@gondor.apana.org.au \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-crypto@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=maz@kernel.org \
    --cc=sneves@dei.uc.pt \
    --cc=torvalds@linux-foundation.org \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).