linux-crypto.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ard Biesheuvel <ard.biesheuvel@linaro.org>
To: Andy Lutomirski <luto@amacapital.net>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>,
	"open list:HARDWARE RANDOM NUMBER GENERATOR CORE" 
	<linux-crypto@vger.kernel.org>,
	Herbert Xu <herbert@gondor.apana.org.au>,
	David Miller <davem@davemloft.net>,
	Greg KH <gregkh@linuxfoundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Samuel Neves <sneves@dei.uc.pt>,
	Dan Carpenter <dan.carpenter@oracle.com>,
	Arnd Bergmann <arnd@arndb.de>, Eric Biggers <ebiggers@google.com>,
	Andy Lutomirski <luto@kernel.org>, Will Deacon <will@kernel.org>,
	Marc Zyngier <maz@kernel.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Martin Willi <martin@strongswan.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Josh Poimboeuf <jpoimboe@redhat.com>
Subject: Re: [PATCH v2 00/20] crypto: crypto API library interfaces for WireGuard
Date: Sat, 5 Oct 2019 09:24:33 +0200	[thread overview]
Message-ID: <CAKv+Gu-VqfFsW+nrG+-2g1-eu6S+ZuD7qaN9aTchwD=Bcj_giw@mail.gmail.com> (raw)
In-Reply-To: <CAKv+Gu87Co3BobUeC_x2TLE9vy-sDHzj3aiK=LFetwC2jz3aig@mail.gmail.com>

On Fri, 4 Oct 2019 at 16:56, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
>
> On Fri, 4 Oct 2019 at 16:53, Andy Lutomirski <luto@amacapital.net> wrote:
> >
> >
> >
> > > On Oct 4, 2019, at 6:52 AM, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> > >
> > > On Fri, 4 Oct 2019 at 15:42, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> > >>
> > >>> On Thu, Oct 03, 2019 at 10:43:29AM +0200, Ard Biesheuvel wrote:
> > >>> On Wed, 2 Oct 2019 at 16:17, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> > >>>>
> > >>> ...
> > >>>>
> > >>>> In the future, I would like to extend these interfaces to use static calls,
> > >>>> so that the accelerated implementations can be [un]plugged at runtime. For
> > >>>> the time being, we rely on weak aliases and conditional exports so that the
> > >>>> users of the library interfaces link directly to the accelerated versions,
> > >>>> but without the ability to unplug them.
> > >>>>
> > >>>
> > >>> As it turns out, we don't actually need static calls for this.
> > >>> Instead, we can simply permit weak symbol references to go unresolved
> > >>> between modules (as we already do in the kernel itself, due to the
> > >>> fact that ELF permits it), and have the accelerated code live in
> > >>> separate modules that may not be loadable on certain systems, or be
> > >>> blacklisted by the user.
> > >>
> > >> You're saying that at module insertion time, the kernel will override
> > >> weak symbols with those provided by the module itself? At runtime?
> > >>
> > >
> > > Yes.
> > >
> > >> Do you know offhand how this patching works? Is there a PLT that gets
> > >> patched, and so the calls all go through a layer of function pointer
> > >> indirection? Or are all call sites fixed up at insertion time and the
> > >> call instructions rewritten with some runtime patching magic?
> > >>
> > >
> > > No magic. Take curve25519 for example, when built for ARM:
> > >
> > > 00000000 <curve25519>:
> > >   0:   f240 0300       movw    r3, #0
> > >                        0: R_ARM_THM_MOVW_ABS_NC        curve25519_arch
> > >   4:   f2c0 0300       movt    r3, #0
> > >                        4: R_ARM_THM_MOVT_ABS   curve25519_arch
> > >   8:   b570            push    {r4, r5, r6, lr}
> > >   a:   4604            mov     r4, r0
> > >   c:   460d            mov     r5, r1
> > >   e:   4616            mov     r6, r2
> > >  10:   b173            cbz     r3, 30 <curve25519+0x30>
> > >  12:   f7ff fffe       bl      0 <curve25519_arch>
> > >                        12: R_ARM_THM_CALL      curve25519_arch
> > >  16:   b158            cbz     r0, 30 <curve25519+0x30>
> > >  18:   4620            mov     r0, r4
> > >  1a:   2220            movs    r2, #32
> > >  1c:   f240 0100       movw    r1, #0
> > >                        1c: R_ARM_THM_MOVW_ABS_NC       .LANCHOR0
> > >  20:   f2c0 0100       movt    r1, #0
> > >                        20: R_ARM_THM_MOVT_ABS  .LANCHOR0
> > >  24:   f7ff fffe       bl      0 <__crypto_memneq>
> > >                        24: R_ARM_THM_CALL      __crypto_memneq
> > >  28:   3000            adds    r0, #0
> > >  2a:   bf18            it      ne
> > >  2c:   2001            movne   r0, #1
> > >  2e:   bd70            pop     {r4, r5, r6, pc}
> > >  30:   4632            mov     r2, r6
> > >  32:   4629            mov     r1, r5
> > >  34:   4620            mov     r0, r4
> > >  36:   f7ff fffe       bl      0 <curve25519_generic>
> > >                        36: R_ARM_THM_CALL      curve25519_generic
> > >  3a:   e7ed            b.n     18 <curve25519+0x18>
> > >
> > > curve25519_arch is a weak reference. It either gets satisfied at
> > > module load time, or it doesn't.
> > >
> > > If it does get satisfied, the relocations covering the movw/movt pair
> > > and the one covering the bl instruction get updated so that they point
> > > to the arch routine.
> > >
> > > If it does not get satisfied, the relocations are disregarded, in
> > > which case the cbz instruction at offset 0x10 jumps over the bl call.
> > >
> > > Note that this does not involve any memory accesses. It does involve
> > > some code patching, but only of the kind the module loader already
> > > does.
> >
> > Won’t this have the counterintuitive property that, if you load the modules in the opposite order, the reference won’t be re-resolved and performance will silently regress?
> >
>
> Indeed, the arch module needs to be loaded first
>

Actually, this can be addressed by retaining the module dependencies
as before, but permitting the arch module to be omitted at load time.

> > I think it might be better to allow two different modules to export the same symbol but only allow one of them to be loaded.
>
> That is what I am doing for chacha and poly
>
> > Or use static calls.

Given that static calls don't actually exist yet, I propose to proceed
with the approach above, and switch to static calls once all
architectures where it matters have an implementation that does not
use function pointers (which is how static calls will be implemented
generically)

  reply	other threads:[~2019-10-05  7:24 UTC|newest]

Thread overview: 67+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-02 14:16 [PATCH v2 00/20] crypto: crypto API library interfaces for WireGuard Ard Biesheuvel
2019-10-02 14:16 ` [PATCH v2 01/20] crypto: chacha - move existing library code into lib/crypto Ard Biesheuvel
2019-10-02 14:30   ` Greg KH
2019-10-04 13:21   ` Jason A. Donenfeld
2019-10-02 14:16 ` [PATCH v2 02/20] crypto: x86/chacha - expose SIMD ChaCha routine as library function Ard Biesheuvel
2019-10-02 14:31   ` Greg KH
2019-10-04 13:36   ` Jason A. Donenfeld
2019-10-04 13:54     ` Ard Biesheuvel
2019-10-02 14:16 ` [PATCH v2 03/20] crypto: arm64/chacha - expose arm64 " Ard Biesheuvel
2019-10-02 14:31   ` Greg KH
2019-10-02 14:16 ` [PATCH v2 04/20] crypto: arm/chacha - expose ARM " Ard Biesheuvel
2019-10-04 13:52   ` Jason A. Donenfeld
2019-10-04 14:23     ` Ard Biesheuvel
2019-10-04 14:28       ` Jason A. Donenfeld
2019-10-04 14:29       ` Jason A. Donenfeld
2019-10-04 15:43         ` Eric Biggers
2019-10-04 15:24       ` Arnd Bergmann
2019-10-04 15:35         ` Ard Biesheuvel
2019-10-04 15:38           ` Jason A. Donenfeld
2019-10-02 14:16 ` [PATCH v2 05/20] crypto: mips/chacha - import accelerated 32r2 code from Zinc Ard Biesheuvel
2019-10-04 13:46   ` Jason A. Donenfeld
2019-10-04 14:38     ` Ard Biesheuvel
2019-10-04 14:38       ` Ard Biesheuvel
2019-10-04 14:59       ` Jason A. Donenfeld
2019-10-04 15:05         ` Ard Biesheuvel
2019-10-04 15:15         ` René van Dorst
2019-10-04 15:23           ` Ard Biesheuvel
2019-10-05  9:05             ` René van Dorst
2019-10-06 19:12             ` René van Dorst
2019-10-02 14:16 ` [PATCH v2 06/20] crypto: poly1305 - move into lib/crypto and refactor into library Ard Biesheuvel
2019-10-02 14:17 ` [PATCH v2 07/20] crypto: x86/poly1305 - expose existing driver as poly1305 library Ard Biesheuvel
2019-10-02 14:17 ` [PATCH v2 08/20] crypto: arm64/poly1305 - incorporate OpenSSL/CRYPTOGAMS NEON implementation Ard Biesheuvel
2019-10-02 14:17 ` [PATCH v2 09/20] crypto: arm/poly1305 " Ard Biesheuvel
2019-10-02 14:17 ` [PATCH v2 10/20] crypto: mips/poly1305 - import accelerated 32r2 code from Zinc Ard Biesheuvel
2019-10-04 13:48   ` Jason A. Donenfeld
2019-10-02 14:17 ` [PATCH v2 11/20] int128: move __uint128_t compiler test to Kconfig Ard Biesheuvel
2019-10-02 14:17 ` [PATCH v2 12/20] crypto: BLAKE2s - generic C library implementation and selftest Ard Biesheuvel
2019-10-02 14:17 ` [PATCH v2 13/20] crypto: BLAKE2s - x86_64 library implementation Ard Biesheuvel
2019-10-02 14:17 ` [PATCH v2 14/20] crypto: Curve25519 - generic C library implementations and selftest Ard Biesheuvel
2019-10-04 13:57   ` Jason A. Donenfeld
2019-10-04 14:03     ` Ard Biesheuvel
2019-10-02 14:17 ` [PATCH v2 15/20] crypto: lib/curve25519 - work around Clang stack spilling issue Ard Biesheuvel
2019-10-02 14:17 ` [PATCH v2 16/20] crypto: Curve25519 - x86_64 library implementation Ard Biesheuvel
2019-10-02 14:17 ` [PATCH v2 17/20] crypto: arm - import Bernstein and Schwabe's Curve25519 ARM implementation Ard Biesheuvel
2019-10-02 14:17 ` [PATCH v2 18/20] crypto: arm/Curve25519 - wire up NEON implementation Ard Biesheuvel
2019-10-04 14:00   ` Jason A. Donenfeld
2019-10-04 14:11     ` Ard Biesheuvel
2019-10-02 14:17 ` [PATCH v2 19/20] crypto: chacha20poly1305 - import construction and selftest from Zinc Ard Biesheuvel
2019-10-02 14:17 ` [PATCH v2 20/20] crypto: lib/chacha20poly1305 - reimplement crypt_from_sg() routine Ard Biesheuvel
2019-10-04 14:03   ` Jason A. Donenfeld
2019-10-04 14:07     ` Ard Biesheuvel
2019-10-03  8:43 ` [PATCH v2 00/20] crypto: crypto API library interfaces for WireGuard Ard Biesheuvel
2019-10-04 13:42   ` Jason A. Donenfeld
2019-10-04 13:52     ` Ard Biesheuvel
2019-10-04 14:53       ` Andy Lutomirski
2019-10-04 14:55         ` Jason A. Donenfeld
2019-10-04 14:59           ` Ard Biesheuvel
2019-10-04 14:56         ` Ard Biesheuvel
2019-10-05  7:24           ` Ard Biesheuvel [this message]
2019-10-07  4:44             ` Andy Lutomirski
2019-10-07  5:23               ` Ard Biesheuvel
2019-10-07 15:01                 ` Andy Lutomirski
2019-10-07 15:12                   ` Ard Biesheuvel
2019-10-07 16:05                     ` Andy Lutomirski
2019-10-04 14:50     ` Andy Lutomirski
2019-10-04 13:16 ` Jason A. Donenfeld
2019-10-04 14:12 ` Jason A. Donenfeld

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAKv+Gu-VqfFsW+nrG+-2g1-eu6S+ZuD7qaN9aTchwD=Bcj_giw@mail.gmail.com' \
    --to=ard.biesheuvel@linaro.org \
    --cc=Jason@zx2c4.com \
    --cc=arnd@arndb.de \
    --cc=catalin.marinas@arm.com \
    --cc=dan.carpenter@oracle.com \
    --cc=davem@davemloft.net \
    --cc=ebiggers@google.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=herbert@gondor.apana.org.au \
    --cc=jpoimboe@redhat.com \
    --cc=linux-crypto@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=luto@kernel.org \
    --cc=martin@strongswan.org \
    --cc=maz@kernel.org \
    --cc=peterz@infradead.org \
    --cc=sneves@dei.uc.pt \
    --cc=torvalds@linux-foundation.org \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).