linux-crypto.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Andy Lutomirski <luto@amacapital.net>
To: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>,
	"open list:HARDWARE RANDOM NUMBER GENERATOR CORE" 
	<linux-crypto@vger.kernel.org>,
	Herbert Xu <herbert@gondor.apana.org.au>,
	David Miller <davem@davemloft.net>,
	Greg KH <gregkh@linuxfoundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Samuel Neves <sneves@dei.uc.pt>,
	Dan Carpenter <dan.carpenter@oracle.com>,
	Arnd Bergmann <arnd@arndb.de>, Eric Biggers <ebiggers@google.com>,
	Andy Lutomirski <luto@kernel.org>, Will Deacon <will@kernel.org>,
	Marc Zyngier <maz@kernel.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Martin Willi <martin@strongswan.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Josh Poimboeuf <jpoimboe@redhat.com>
Subject: Re: [PATCH v2 00/20] crypto: crypto API library interfaces for WireGuard
Date: Sun, 6 Oct 2019 21:44:15 -0700	[thread overview]
Message-ID: <04D32F59-34D4-4EBF-80E3-69088D14C5D8@amacapital.net> (raw)
In-Reply-To: <CAKv+Gu-VqfFsW+nrG+-2g1-eu6S+ZuD7qaN9aTchwD=Bcj_giw@mail.gmail.com>



> On Oct 5, 2019, at 12:24 AM, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
> 
> On Fri, 4 Oct 2019 at 16:56, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
>> 
>>> On Fri, 4 Oct 2019 at 16:53, Andy Lutomirski <luto@amacapital.net> wrote:
>>> 
>>> 
>>> 
>>>> On Oct 4, 2019, at 6:52 AM, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
>>>> 
>>>> On Fri, 4 Oct 2019 at 15:42, Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>>>>> 
>>>>>> On Thu, Oct 03, 2019 at 10:43:29AM +0200, Ard Biesheuvel wrote:
>>>>>> On Wed, 2 Oct 2019 at 16:17, Ard Biesheuvel <ard.biesheuvel@linaro.org> wrote:
>>>>>>> 
>>>>>> ...
>>>>>>> 
>>>>>>> In the future, I would like to extend these interfaces to use static calls,
>>>>>>> so that the accelerated implementations can be [un]plugged at runtime. For
>>>>>>> the time being, we rely on weak aliases and conditional exports so that the
>>>>>>> users of the library interfaces link directly to the accelerated versions,
>>>>>>> but without the ability to unplug them.
>>>>>>> 
>>>>>> 
>>>>>> As it turns out, we don't actually need static calls for this.
>>>>>> Instead, we can simply permit weak symbol references to go unresolved
>>>>>> between modules (as we already do in the kernel itself, due to the
>>>>>> fact that ELF permits it), and have the accelerated code live in
>>>>>> separate modules that may not be loadable on certain systems, or be
>>>>>> blacklisted by the user.
>>>>> 
>>>>> You're saying that at module insertion time, the kernel will override
>>>>> weak symbols with those provided by the module itself? At runtime?
>>>>> 
>>>> 
>>>> Yes.
>>>> 
>>>>> Do you know offhand how this patching works? Is there a PLT that gets
>>>>> patched, and so the calls all go through a layer of function pointer
>>>>> indirection? Or are all call sites fixed up at insertion time and the
>>>>> call instructions rewritten with some runtime patching magic?
>>>>> 
>>>> 
>>>> No magic. Take curve25519 for example, when built for ARM:
>>>> 
>>>> 00000000 <curve25519>:
>>>>  0:   f240 0300       movw    r3, #0
>>>>                       0: R_ARM_THM_MOVW_ABS_NC        curve25519_arch
>>>>  4:   f2c0 0300       movt    r3, #0
>>>>                       4: R_ARM_THM_MOVT_ABS   curve25519_arch
>>>>  8:   b570            push    {r4, r5, r6, lr}
>>>>  a:   4604            mov     r4, r0
>>>>  c:   460d            mov     r5, r1
>>>>  e:   4616            mov     r6, r2
>>>> 10:   b173            cbz     r3, 30 <curve25519+0x30>
>>>> 12:   f7ff fffe       bl      0 <curve25519_arch>
>>>>                       12: R_ARM_THM_CALL      curve25519_arch
>>>> 16:   b158            cbz     r0, 30 <curve25519+0x30>
>>>> 18:   4620            mov     r0, r4
>>>> 1a:   2220            movs    r2, #32
>>>> 1c:   f240 0100       movw    r1, #0
>>>>                       1c: R_ARM_THM_MOVW_ABS_NC       .LANCHOR0
>>>> 20:   f2c0 0100       movt    r1, #0
>>>>                       20: R_ARM_THM_MOVT_ABS  .LANCHOR0
>>>> 24:   f7ff fffe       bl      0 <__crypto_memneq>
>>>>                       24: R_ARM_THM_CALL      __crypto_memneq
>>>> 28:   3000            adds    r0, #0
>>>> 2a:   bf18            it      ne
>>>> 2c:   2001            movne   r0, #1
>>>> 2e:   bd70            pop     {r4, r5, r6, pc}
>>>> 30:   4632            mov     r2, r6
>>>> 32:   4629            mov     r1, r5
>>>> 34:   4620            mov     r0, r4
>>>> 36:   f7ff fffe       bl      0 <curve25519_generic>
>>>>                       36: R_ARM_THM_CALL      curve25519_generic
>>>> 3a:   e7ed            b.n     18 <curve25519+0x18>
>>>> 
>>>> curve25519_arch is a weak reference. It either gets satisfied at
>>>> module load time, or it doesn't.
>>>> 
>>>> If it does get satisfied, the relocations covering the movw/movt pair
>>>> and the one covering the bl instruction get updated so that they point
>>>> to the arch routine.
>>>> 
>>>> If it does not get satisfied, the relocations are disregarded, in
>>>> which case the cbz instruction at offset 0x10 jumps over the bl call.
>>>> 
>>>> Note that this does not involve any memory accesses. It does involve
>>>> some code patching, but only of the kind the module loader already
>>>> does.
>>> 
>>> Won’t this have the counterintuitive property that, if you load the modules in the opposite order, the reference won’t be re-resolved and performance will silently regress?
>>> 
>> 
>> Indeed, the arch module needs to be loaded first
>> 
> 
> Actually, this can be addressed by retaining the module dependencies
> as before, but permitting the arch module to be omitted at load time.

I think that, to avoid surprises, you should refuse to load the arch module if the generic module is loaded, too.

> 
>>> I think it might be better to allow two different modules to export the same symbol but only allow one of them to be loaded.
>> 
>> That is what I am doing for chacha and poly
>> 
>>> Or use static calls.
> 
> Given that static calls don't actually exist yet, I propose to proceed
> with the approach above, and switch to static calls once all
> architectures where it matters have an implementation that does not
> use function pointers (which is how static calls will be implemented
> generically)

  reply	other threads:[~2019-10-07  4:44 UTC|newest]

Thread overview: 67+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-02 14:16 [PATCH v2 00/20] crypto: crypto API library interfaces for WireGuard Ard Biesheuvel
2019-10-02 14:16 ` [PATCH v2 01/20] crypto: chacha - move existing library code into lib/crypto Ard Biesheuvel
2019-10-02 14:30   ` Greg KH
2019-10-04 13:21   ` Jason A. Donenfeld
2019-10-02 14:16 ` [PATCH v2 02/20] crypto: x86/chacha - expose SIMD ChaCha routine as library function Ard Biesheuvel
2019-10-02 14:31   ` Greg KH
2019-10-04 13:36   ` Jason A. Donenfeld
2019-10-04 13:54     ` Ard Biesheuvel
2019-10-02 14:16 ` [PATCH v2 03/20] crypto: arm64/chacha - expose arm64 " Ard Biesheuvel
2019-10-02 14:31   ` Greg KH
2019-10-02 14:16 ` [PATCH v2 04/20] crypto: arm/chacha - expose ARM " Ard Biesheuvel
2019-10-04 13:52   ` Jason A. Donenfeld
2019-10-04 14:23     ` Ard Biesheuvel
2019-10-04 14:28       ` Jason A. Donenfeld
2019-10-04 14:29       ` Jason A. Donenfeld
2019-10-04 15:43         ` Eric Biggers
2019-10-04 15:24       ` Arnd Bergmann
2019-10-04 15:35         ` Ard Biesheuvel
2019-10-04 15:38           ` Jason A. Donenfeld
2019-10-02 14:16 ` [PATCH v2 05/20] crypto: mips/chacha - import accelerated 32r2 code from Zinc Ard Biesheuvel
2019-10-04 13:46   ` Jason A. Donenfeld
2019-10-04 14:38     ` Ard Biesheuvel
2019-10-04 14:38       ` Ard Biesheuvel
2019-10-04 14:59       ` Jason A. Donenfeld
2019-10-04 15:05         ` Ard Biesheuvel
2019-10-04 15:15         ` René van Dorst
2019-10-04 15:23           ` Ard Biesheuvel
2019-10-05  9:05             ` René van Dorst
2019-10-06 19:12             ` René van Dorst
2019-10-02 14:16 ` [PATCH v2 06/20] crypto: poly1305 - move into lib/crypto and refactor into library Ard Biesheuvel
2019-10-02 14:17 ` [PATCH v2 07/20] crypto: x86/poly1305 - expose existing driver as poly1305 library Ard Biesheuvel
2019-10-02 14:17 ` [PATCH v2 08/20] crypto: arm64/poly1305 - incorporate OpenSSL/CRYPTOGAMS NEON implementation Ard Biesheuvel
2019-10-02 14:17 ` [PATCH v2 09/20] crypto: arm/poly1305 " Ard Biesheuvel
2019-10-02 14:17 ` [PATCH v2 10/20] crypto: mips/poly1305 - import accelerated 32r2 code from Zinc Ard Biesheuvel
2019-10-04 13:48   ` Jason A. Donenfeld
2019-10-02 14:17 ` [PATCH v2 11/20] int128: move __uint128_t compiler test to Kconfig Ard Biesheuvel
2019-10-02 14:17 ` [PATCH v2 12/20] crypto: BLAKE2s - generic C library implementation and selftest Ard Biesheuvel
2019-10-02 14:17 ` [PATCH v2 13/20] crypto: BLAKE2s - x86_64 library implementation Ard Biesheuvel
2019-10-02 14:17 ` [PATCH v2 14/20] crypto: Curve25519 - generic C library implementations and selftest Ard Biesheuvel
2019-10-04 13:57   ` Jason A. Donenfeld
2019-10-04 14:03     ` Ard Biesheuvel
2019-10-02 14:17 ` [PATCH v2 15/20] crypto: lib/curve25519 - work around Clang stack spilling issue Ard Biesheuvel
2019-10-02 14:17 ` [PATCH v2 16/20] crypto: Curve25519 - x86_64 library implementation Ard Biesheuvel
2019-10-02 14:17 ` [PATCH v2 17/20] crypto: arm - import Bernstein and Schwabe's Curve25519 ARM implementation Ard Biesheuvel
2019-10-02 14:17 ` [PATCH v2 18/20] crypto: arm/Curve25519 - wire up NEON implementation Ard Biesheuvel
2019-10-04 14:00   ` Jason A. Donenfeld
2019-10-04 14:11     ` Ard Biesheuvel
2019-10-02 14:17 ` [PATCH v2 19/20] crypto: chacha20poly1305 - import construction and selftest from Zinc Ard Biesheuvel
2019-10-02 14:17 ` [PATCH v2 20/20] crypto: lib/chacha20poly1305 - reimplement crypt_from_sg() routine Ard Biesheuvel
2019-10-04 14:03   ` Jason A. Donenfeld
2019-10-04 14:07     ` Ard Biesheuvel
2019-10-03  8:43 ` [PATCH v2 00/20] crypto: crypto API library interfaces for WireGuard Ard Biesheuvel
2019-10-04 13:42   ` Jason A. Donenfeld
2019-10-04 13:52     ` Ard Biesheuvel
2019-10-04 14:53       ` Andy Lutomirski
2019-10-04 14:55         ` Jason A. Donenfeld
2019-10-04 14:59           ` Ard Biesheuvel
2019-10-04 14:56         ` Ard Biesheuvel
2019-10-05  7:24           ` Ard Biesheuvel
2019-10-07  4:44             ` Andy Lutomirski [this message]
2019-10-07  5:23               ` Ard Biesheuvel
2019-10-07 15:01                 ` Andy Lutomirski
2019-10-07 15:12                   ` Ard Biesheuvel
2019-10-07 16:05                     ` Andy Lutomirski
2019-10-04 14:50     ` Andy Lutomirski
2019-10-04 13:16 ` Jason A. Donenfeld
2019-10-04 14:12 ` Jason A. Donenfeld

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=04D32F59-34D4-4EBF-80E3-69088D14C5D8@amacapital.net \
    --to=luto@amacapital.net \
    --cc=Jason@zx2c4.com \
    --cc=ard.biesheuvel@linaro.org \
    --cc=arnd@arndb.de \
    --cc=catalin.marinas@arm.com \
    --cc=dan.carpenter@oracle.com \
    --cc=davem@davemloft.net \
    --cc=ebiggers@google.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=herbert@gondor.apana.org.au \
    --cc=jpoimboe@redhat.com \
    --cc=linux-crypto@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=martin@strongswan.org \
    --cc=maz@kernel.org \
    --cc=peterz@infradead.org \
    --cc=sneves@dei.uc.pt \
    --cc=torvalds@linux-foundation.org \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).