linux-kernel.vger.kernel.org archive mirror
From: "Jason A. Donenfeld" <Jason@zx2c4.com>
To: David Miller <davem@davemloft.net>
Cc: "Herbert Xu" <herbert@gondor.apana.org.au>,
	linux-crypto@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>,
	"Martin Willi" <martin@strongswan.org>,
	"WireGuard mailing list" <wireguard@lists.zx2c4.com>,
	"René van Dorst" <opensource@vdorst.com>
Subject: Re: [PATCH] poly1305: generic C can be faster on chips with slow unaligned access
Date: Thu, 3 Nov 2016 23:20:08 +0100
Message-ID: <CAHmME9pm4DHuBsE+hoFxnm1B5OWAZ+OyKXzeKDxHtisZpw4ebg@mail.gmail.com>
In-Reply-To: <20161103.130852.1456848512897088071.davem@davemloft.net>

Hi David,

On Thu, Nov 3, 2016 at 6:08 PM, David Miller <davem@davemloft.net> wrote:
> In any event no piece of code should be doing 32-bit word reads from
> addresses like "x + 3" without, at a very minimum, going through the
> kernel unaligned access handlers.

Excellent point. In other words,

    ctx->r[0] = (le32_to_cpuvp(key +  0) >> 0) & 0x3ffffff;
    ctx->r[1] = (le32_to_cpuvp(key +  3) >> 2) & 0x3ffff03;
    ctx->r[2] = (le32_to_cpuvp(key +  6) >> 4) & 0x3ffc0ff;
    ctx->r[3] = (le32_to_cpuvp(key +  9) >> 6) & 0x3f03fff;
    ctx->r[4] = (le32_to_cpuvp(key + 12) >> 8) & 0x00fffff;

should change to:

    ctx->r[0] = (le32_to_cpuvp(key +  0) >> 0) & 0x3ffffff;
    ctx->r[1] = (get_unaligned_le32(key +  3) >> 2) & 0x3ffff03;
    ctx->r[2] = (get_unaligned_le32(key +  6) >> 4) & 0x3ffc0ff;
    ctx->r[3] = (get_unaligned_le32(key +  9) >> 6) & 0x3f03fff;
    ctx->r[4] = (le32_to_cpuvp(key + 12) >> 8) & 0x00fffff;
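
For anyone following along outside the kernel tree, here is a rough
userspace sketch of what the helper-based version amounts to. It assumes
the helper is implemented as a byte-by-byte load, which is one common
strategy on chips without efficient unaligned access; load_le32() and
poly1305_load_r() are hypothetical names for illustration, not kernel
functions.

```c
#include <stdint.h>

/*
 * Userspace stand-in for an unaligned little-endian 32-bit read
 * (assumption: byte-wise assembly, so no 32-bit load is ever issued
 * at an odd address such as key + 3).
 */
static uint32_t load_le32(const uint8_t *p)
{
	return (uint32_t)p[0] |
	       (uint32_t)p[1] << 8 |
	       (uint32_t)p[2] << 16 |
	       (uint32_t)p[3] << 24;
}

/*
 * Hypothetical helper: split the 16-byte r half of the key into five
 * clamped 26-bit limbs, using the same offsets, shifts and masks as
 * the snippet above.
 */
static void poly1305_load_r(uint32_t r[5], const uint8_t key[16])
{
	r[0] = (load_le32(key +  0) >> 0) & 0x3ffffff;
	r[1] = (load_le32(key +  3) >> 2) & 0x3ffff03;
	r[2] = (load_le32(key +  6) >> 4) & 0x3ffc0ff;
	r[3] = (load_le32(key +  9) >> 6) & 0x3f03fff;
	r[4] = (load_le32(key + 12) >> 8) & 0x00fffff;
}
```

Because every access is a single byte, the result is the same on any
alignment and any host endianness; the cost is four loads plus shifts
per word.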

> We know explicitly that these offsets will not be 32-bit aligned, so
> it is required that we use the helpers, or alternatively do things to
> avoid these unaligned accesses such as using temporary storage when
> the HAVE_EFFICIENT_UNALIGNED_ACCESS kconfig value is not set.

So the question is: is the clever avoidance of unaligned accesses of
the original patch faster or slower than changing the unaligned
accesses to use the helper function?
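
To make the comparison concrete, below is a hedged userspace sketch of
the "avoid the unaligned accesses" style: read four words at offsets 0,
4, 8 and 12 only, then build each 26-bit limb by shifting bits across
neighbouring words. Whether this matches the original patch line for
line is an assumption; the function names are hypothetical, and the
byte-wise load_word_le32() merely stands in for an aligned
little-endian word load.

```c
#include <stdint.h>

/* Stand-in for an aligned little-endian word load (hypothetical name). */
static uint32_t load_word_le32(const uint8_t *p)
{
	return (uint32_t)p[0] |
	       (uint32_t)p[1] << 8 |
	       (uint32_t)p[2] << 16 |
	       (uint32_t)p[3] << 24;
}

/*
 * Hypothetical helper: derive the five clamped 26-bit limbs of r from
 * four words read at multiples of 4 bytes. No read starts at key + 3,
 * key + 6 or key + 9.
 */
static void poly1305_load_r_shifted(uint32_t r[5], const uint8_t key[16])
{
	uint32_t t0 = load_word_le32(key +  0);
	uint32_t t1 = load_word_le32(key +  4);
	uint32_t t2 = load_word_le32(key +  8);
	uint32_t t3 = load_word_le32(key + 12);

	r[0] = t0 & 0x3ffffff;                        /* bits   0..25  */
	r[1] = ((t0 >> 26) | (t1 <<  6)) & 0x3ffff03; /* bits  26..51  */
	r[2] = ((t1 >> 20) | (t2 << 12)) & 0x3ffc0ff; /* bits  52..77  */
	r[3] = ((t2 >> 14) | (t3 << 18)) & 0x3f03fff; /* bits  78..103 */
	r[4] = (t3 >> 8) & 0x00fffff;                 /* bits 104..127 */
}
```

This trades the three odd-offset reads for a handful of extra shifts
and ORs, which is the trade-off the benchmark is meant to measure.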

I've put a little test harness together for playing with this:

    $ git clone git://git.zx2c4.com/polybench
    $ cd polybench
    $ make run

To test the first method, build and run it as is. To test the other,
remove "#define USE_FIRST_METHOD" from the source code and run it again.

@René: do you think you could retest on your MIPS32r2 hardware and
report back which is faster?

And if anybody else has other hardware and would like to try, this
could be nice.

Regards,
Jason
