linux-crypto.vger.kernel.org archive mirror
From: Uros Bizjak <ubizjak@gmail.com>
To: "Jason A. Donenfeld" <Jason@zx2c4.com>
Cc: Karthik Bhargavan <karthikeyan.bhargavan@inria.fr>,
	Chris.Hawblitzel@microsoft.com,
	Jonathan Protzenko <protz@microsoft.com>,
	Aymeric Fromherz <fromherz@cmu.edu>,
	Linux Crypto Mailing List <linux-crypto@vger.kernel.org>,
	X86 ML <x86@kernel.org>, Herbert Xu <herbert@gondor.apana.org.au>,
	"David S. Miller" <davem@davemloft.net>,
	Ard Biesheuvel <ardb@kernel.org>
Subject: Re: [PATCH] crypto/x86: Use XORL r32,32 in curve25519-x86_64.c
Date: Wed, 2 Sep 2020 07:50:36 +0200
Message-ID: <CAFULd4ZH3s=9nsvNE8Sxf=r-KZJX5NKxFehNo7YU2=2ExwbsQQ@mail.gmail.com>
In-Reply-To: <CAHmME9oemtY5PG9WjbOOtd_xxbMRPb1t5mPoo2rR-y3umYKd5Q@mail.gmail.com>

On Tue, Sep 1, 2020 at 9:12 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
>
> On Tue, Sep 1, 2020 at 8:13 PM Jason A. Donenfeld <Jason@zx2c4.com> wrote:
> > operands are the same. Also, have you seen any measurable differences
> > when benching this? I can stick it into kbench9000 to see if you
> > haven't looked yet.
>
> On a Skylake server (Xeon Gold 5120), I'm unable to see any measurable
> difference with this, at all, no matter how much I median or mean or
> reduce noise by disabling interrupts.
>
> One thing that sticks out is that all the replacements of r8-r15 by
> their %r8d-r15d counterparts still have the REX prefix, as is
> necessary to access those registers. The only ones worth changing,
> then, are the legacy registers, and on a whole, this amounts to only
> 48 bytes of difference.

The patch implements one of the x86 target-specific optimizations that
gcc performs. The optimization saves one byte of code size per
instruction when a legacy register is used, because the REX prefix can
be omitted, but it should otherwise have no measurable runtime effect.
Since gcc applies this optimization universally to all integer
registers, I took the same approach and applied the same change to both
legacy and REX registers. As measured above, 48 bytes saved is a good
result for such a trivial optimization.
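
To make the size argument concrete, here is a minimal sketch of the
encodings involved. The byte sequences are hand-assembled from the
x86-64 instruction format and are not taken from the patch; the array
names and the helper xorl_bytes_saved() are hypothetical, chosen only
for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hand-assembled encodings of four register-zeroing instructions. */
static const unsigned char xorq_rax[] = { 0x48, 0x31, 0xc0 }; /* xorq %rax, %rax  (REX.W)   */
static const unsigned char xorl_eax[] = { 0x31, 0xc0 };       /* xorl %eax, %eax  (no REX)  */
static const unsigned char xorq_r8[]  = { 0x4d, 0x31, 0xc0 }; /* xorq %r8, %r8    (REX.WRB) */
static const unsigned char xorl_r8d[] = { 0x45, 0x31, 0xc0 }; /* xorl %r8d, %r8d  (REX.RB)  */

/* Legacy register: the REX prefix disappears, so one byte is saved. */
static_assert(sizeof(xorq_rax) - sizeof(xorl_eax) == 1, "legacy register saves 1 byte");
/* r8-r15: a REX prefix is still needed to select the register, so nothing is saved. */
static_assert(sizeof(xorq_r8) == sizeof(xorl_r8d), "REX register saves nothing");

/*
 * Hypothetical helper: bytes saved by replacing one XORQ with XORL.
 * is_rex_reg is nonzero for r8-r15 and zero for the legacy registers.
 */
static size_t xorl_bytes_saved(int is_rex_reg)
{
    if (is_rex_reg)
        return sizeof(xorq_r8) - sizeof(xorl_r8d);
    return sizeof(xorq_rax) - sizeof(xorl_eax);
}
```

Both forms clear the full 64-bit register, since a write to a 32-bit
register zero-extends into the upper half, so the replacement changes
only the encoding and not the result.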

Uros.
