linux-kernel.vger.kernel.org archive mirror
From: Ingo Molnar <mingo@kernel.org>
To: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Eric Biggers <ebiggers3@gmail.com>,
	x86@kernel.org, linux-kernel@vger.kernel.org,
	Tim Chen <tim.c.chen@linux.intel.com>,
	Mathias Krause <minipli@googlemail.com>,
	Chandramouli Narayanan <mouli@linux.intel.com>,
	Jussi Kivilinna <jussi.kivilinna@iki.fi>,
	Peter Zijlstra <peterz@infradead.org>,
	Herbert Xu <herbert@gondor.apana.org.au>,
	"David S. Miller" <davem@davemloft.net>,
	linux-crypto@vger.kernel.org, Eric Biggers <ebiggers@google.com>,
	Andy Lutomirski <luto@kernel.org>, Jiri Slaby <jslaby@suse.cz>
Subject: Re: [PATCH 00/12] x86/crypto: Fix RBP usage in several crypto .S files
Date: Thu, 14 Sep 2017 11:28:57 +0200
Message-ID: <20170914092857.mvarp7iok6jf43sn@gmail.com>
In-Reply-To: <20170914091612.ck33coyubzevru2i@gmail.com>


* Ingo Molnar <mingo@kernel.org> wrote:

> 1)
> 
> Note how R12 is used immediately, right in the next instruction:
> 
>         vpaddq  (TBL), Y_0, XFER
> 
> I.e. the RBP fixes lengthen the program order data dependencies - that's a new 
> constraint and a few extra cycles per loop iteration if the workload is limited 
> by address-generation bandwidth at that point.
> 
> A simple way to ease that constraint would be to move the 'TBL' load up into the 
> loop body, to the point where 'T1' is used for the last time - which is:
> 
> 
>         mov     a, T1           # T1 = a                                # MAJB
>         and     c, T1           # T1 = a&c                              # MAJB
> 
>         add     y0, y2          # y2 = S1 + CH                          # --
>         or      T1, y3          # y3 = MAJ = ((a|c)&b)|(a&c)            # MAJ
> 
> +       mov frame_TBL(%rsp), TBL
> 
>         add     y1, h           # h = k + w + h + S0                    # --
> 
>         add     y2, d           # d = k + w + h + d + S1 + CH = d + t1  # --
> 
>         add     y2, h           # h = k + w + h + S0 + S1 + CH = t1 + S0# --
>         add     y3, h           # h = t1 + S0 + MAJ                     # --
> 
> Note how this moves up the 'TBL' reload by 4 instructions.

Note that in this case 'TBL' would have to be initialized before the 1st 
iteration, via something like:

        movq    $4, frame_SRND(%rsp)

+	mov frame_TBL(%rsp), TBL

.align 16
loop1:
        vpaddq  (TBL), Y_0, XFER
        vmovdqa XFER, frame_XFER(%rsp)
        FOUR_ROUNDS_AND_SCHED
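
To make the control-flow change easier to see, here is a rough C model of the
same loop rotation. It is only a sketch, not the kernel code: 'struct frame',
'sum_reload_at_top()' and 'sum_reload_hoisted()' are hypothetical names, the
'tbl' field merely stands in for frame_TBL(%rsp), and the round computation is
reduced to a single add. The first function reloads the pointer right before
its use; the second primes it before the loop and reloads it at the end of the
previous iteration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the stack frame; 'tbl' models frame_TBL(%rsp). */
struct frame {
    const uint64_t *tbl;
};

/* Reload at the top of the loop: the loaded pointer is used immediately. */
uint64_t sum_reload_at_top(struct frame *f, size_t rounds)
{
    uint64_t acc = 0;

    for (size_t i = 0; i < rounds; i++) {
        const uint64_t *tbl = f->tbl;   /* reload, then use right away */
        acc += tbl[0];                  /* stands in for vpaddq (TBL), ... */
        f->tbl = tbl + 1;               /* advance and store back to the frame */
    }
    return acc;
}

/* Rotated loop: prime before the 1st iteration, reload early for the next. */
uint64_t sum_reload_hoisted(struct frame *f, size_t rounds)
{
    uint64_t acc = 0;
    const uint64_t *tbl = f->tbl;       /* priming load before the loop */

    for (size_t i = 0; i < rounds; i++) {
        acc += tbl[0];                  /* uses a pointer loaded earlier */
        f->tbl = tbl + 1;               /* advance and store back to the frame */
        tbl = f->tbl;                   /* early reload for the next iteration */
    }
    return acc;
}
```

Both variants visit the same table entries and leave the frame slot in the same
state; the only difference is how far the reload sits from its first use.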

Thanks,

	Ingo
