LKML Archive on
 help / color / Atom feed
From: "Jason A. Donenfeld" <>
To: Andy Lutomirski <>
Cc: Eric Biggers <>,
	Linux Crypto Mailing List <>,
	LKML <>,
	Netdev <>,
	David Miller <>,
	Andrew Lutomirski <>,
	Greg Kroah-Hartman <>,
	Samuel Neves <>,
	"Daniel J . Bernstein" <>,
	Tanja Lange <>,
	Jean-Philippe Aumasson <>,
	Karthikeyan Bhargavan <>
Subject: Re: [PATCH v1 2/3] zinc: Introduce minimal cryptography library
Date: Fri, 3 Aug 2018 04:48:36 +0200
Message-ID: <> (raw)
In-Reply-To: <>

Hey Andy,

Thanks too for the feedback. Responses below:

On Wed, Aug 1, 2018 at 7:09 PM Andy Lutomirski <> wrote:
> > I think the above changes would also naturally lead to a much saner
> > patch series where each algorithm is added by its own patch, rather than
> > one monster patch that adds many algorithms and 24000 lines of code.
> >
> Yes, please.

Ack, will be in v2.

> I like this a *lot*.  (But why are you passing have_simd?  Shouldn't
> that check live in chacha20_arch?  If there's some init code needed,
> then chacha20_arch() should just return false before the init code
> runs.  Once the arch does whatever feature detection it needs, it can
> make chacha20_arch() start returning true.)

The have_simd stuff is so that the FPU state can be amortized across
several calls to the crypto functions. Here's a snippet from
WireGuard's send.c:

void packet_encrypt_worker(struct work_struct *work)
    struct crypt_queue *queue = container_of(work, struct
multicore_worker, work)->ptr;
    struct sk_buff *first, *skb, *next;
    bool have_simd = simd_get();

    while ((first = ptr_ring_consume_bh(&queue->ring)) != NULL) {
        enum packet_state state = PACKET_STATE_CRYPTED;

        skb_walk_null_queue_safe(first, skb, next) {
            if (likely(skb_encrypt(skb, PACKET_CB(first)->keypair, have_simd)))
            else {
                state = PACKET_STATE_DEAD;
        queue_enqueue_per_peer(&PACKET_PEER(first)->tx_queue, first, state);

        have_simd = simd_relax(have_simd);

simd_get() and simd_put() do the usual irq_fpu_usable/kernel_fpu_begin
dance and return/take a boolean accordingly. simd_relax(on) is:

static inline bool simd_relax(bool was_on)
    if (was_on && need_resched()) {
        return simd_get();
    return was_on;

With this, we most of the time get the FPU amortization, while still
doing the right thing for the preemption case (since kernel_fpu_begin
disables preemption). This is a quite important performance
optimization. However, I'd prefer the lazy FPU restoration proposal
discussed a few months ago, but it looks like that hasn't progressed,
hence the need for FPU call amortization:

> As I see it, there there are two truly new thing in the zinc patchset:
> the direct (in the direct call sense) arch dispatch, and the fact that
> the functions can be called directly, without allocating contexts,
> using function pointers, etc.
> In fact, I had a previous patch set that added such an interface for SHA256.

Seems like SHA256 will be a natural next candidate for Zinc, given the demand.

> > Your patch description is also missing any mention of crypto accelerator
> > hardware.  Quite a bit of the complexity in the crypto API, such as
> > scatterlist support and asynchronous execution, exists because it
> > supports crypto accelerators.  AFAICS your new APIs cannot support
> > crypto accelerators, as your APIs are synchronous and operate on virtual
> > addresses.  I assume your justification is that "djb algorithms" like
> > ChaCha and Poly1305 don't need crypto accelerators as they are fast in
> > software.  But you never explicitly stated this and discussed the
> > tradeoffs.  Since this is basically the foundation for the design you've
> > chosen, it really needs to be addressed.
> I see this as an advantage, not a disadvantage.  A very large majority
> of in-kernel crypto users (by number of call sites under a *very*
> brief survey, not by number of CPU cycles) just want to do some
> synchronous crypto on a buffer that is addressed by a regular pointer.
> Most of these users would be slowed down if they used any form of
> async crypto, since the CPU can complete the whole operation faster
> than it could plausibly initiate and complete anything asynchronous.
> And, right now, they suffer the full overhead of allocating a context
> (often with alloca!), looking up (or caching) some crypto API data
> structures, dispatching the operation, and cleaning up.
> So I think the right way to do it is to have directly callable
> functions like zinc uses and to have the fancy crypto API layer on top
> of them.  So if you actually want async accelerated crypto with
> scatterlists or whatever, you can call into the fancy API, and the
> fancy API can dispatch to hardware or it can dispatch to the normal
> static API.

Yes, exactly this.


  reply index

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-08-01  7:22 Eric Biggers
2018-08-01 17:02 ` Andy Lutomirski
2018-08-03  2:48   ` Jason A. Donenfeld [this message]
2018-08-03 21:29     ` Andy Lutomirski
2018-08-03 22:10       ` Jason A. Donenfeld
2018-08-07 18:54         ` Jason A. Donenfeld
2018-08-07 19:43           ` Andy Lutomirski
2018-08-07 23:48             ` Jason A. Donenfeld
2018-08-08  1:48               ` Andy Lutomirski
2018-08-08  1:51                 ` Jason A. Donenfeld
2018-08-09 18:08                   ` Andy Lutomirski
2018-08-03  2:33 ` Jason A. Donenfeld
2018-08-14 21:12   ` Eric Biggers
2018-08-15 16:28     ` D. J. Bernstein
2018-08-15 19:57       ` Eric Biggers
2018-08-16  4:24         ` D. J. Bernstein
2018-08-16 19:46           ` Eric Biggers
2018-08-17  7:31             ` D. J. Bernstein
2018-08-18  8:13               ` Ard Biesheuvel
2018-08-16  6:31     ` Jason A. Donenfeld

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

LKML Archive on

Archives are clonable:
	git clone --mirror lkml/git/0.git
	git clone --mirror lkml/git/1.git
	git clone --mirror lkml/git/2.git
	git clone --mirror lkml/git/3.git
	git clone --mirror lkml/git/4.git
	git clone --mirror lkml/git/5.git
	git clone --mirror lkml/git/6.git
	git clone --mirror lkml/git/7.git
	git clone --mirror lkml/git/8.git
	git clone --mirror lkml/git/9.git
	git clone --mirror lkml/git/10.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ \
	public-inbox-index lkml

Example config snippet for mirrors

Newsgroup available over NNTP:

AGPL code for this site: git clone