All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ard Biesheuvel <ardb@kernel.org>
To: Herbert Xu <herbert@gondor.apana.org.au>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>,
	Robert Elliott <elliott@hpe.com>,
	davem@davemloft.net, tim.c.chen@linux.intel.com,
	ap420073@gmail.com, David.Laight@aculab.com, ebiggers@kernel.org,
	linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 10/24] crypto: x86/poly - limit FPU preemption
Date: Fri, 25 Nov 2022 09:59:17 +0100	[thread overview]
Message-ID: <CAMj1kXHHm+L=qE5opDXhjoWZt+1eKXFeGVS=OdvyF0VNFZivCA@mail.gmail.com> (raw)
In-Reply-To: <Y4B/kjS0lgzdUJHG@gondor.apana.org.au>

On Fri, 25 Nov 2022 at 09:41, Herbert Xu <herbert@gondor.apana.org.au> wrote:
>
> On Wed, Nov 16, 2022 at 12:13:51PM +0100, Jason A. Donenfeld wrote:
> > On Tue, Nov 15, 2022 at 10:13:28PM -0600, Robert Elliott wrote:
> > > +/* avoid kernel_fpu_begin/end scheduler/rcu stalls */
> > > +static const unsigned int bytes_per_fpu = 337 * 1024;
> > > +
> >
> > Use an enum for constants like this:
> >
> >     enum { BYTES_PER_FPU = ... };
> >
> > You can even make it function-local, so it's near the code that uses it,
> > which will better justify its existence.
> >
> > Also, where did you get this number? Seems kind of weird.
>
> These numbers are highly dependent on hardware and I think having
> them hard-coded is wrong.
>
> Perhaps we should try a different approach.  How about just limiting
> the size to 4K, and then depending on need_resched we break out of
> the loop? Something like:
>
>         if (!len)
>                 return 0;
>
>         kernel_fpu_begin();
>         for (;;) {
>                 unsigned int chunk = min(len, 4096);
>
>                 sha1_base_do_update(desc, data, chunk, sha1_xform);
>
>                 len -= chunk;
>                 data += chunk;
>
>                 if (!len)
>                         break;
>
>                 if (need_resched()) {
>                         kernel_fpu_end();
>                         cond_resched();
>                         kernel_fpu_begin();
>                 }
>         }
>         kernel_fpu_end();
>

On arm64, this is implemented in an assembler macro 'cond_yield' so we
don't need to preserve/restore the SIMD state state at all if the
yield is not going to result in a call to schedule(). For example, the
SHA3 code keeps 400 bytes of state in registers, which we don't want
to save and reload unless needed. (5f6cb2e617681 'crypto:
arm64/sha512-ce - simplify NEON yield')

So the asm routines will call cond_yield, and return early if a yield
is required, with the number of blocks or bytes left to process as the
return value. The C wrapper just calls the asm routine in a loop until
the return value becomes 0.

That way, we don't need magic values at all, and the yield will occur
as soon as the asm inner loop observes the yield condition so the
latency should be much lower as well.

Note that it is only used in shash implementations, given that they
are the only ones that may receive unbounded inputs.

  reply	other threads:[~2022-11-25  8:59 UTC|newest]

Thread overview: 127+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-10-06 22:31 [RFC PATCH 0/7] crypto: x86 - fix RCU stalls Robert Elliott
2022-10-06 22:31 ` [RFC PATCH 1/7] rcu: correct CONFIG_EXT_RCU_CPU_STALL_TIMEOUT descriptions Robert Elliott
2022-10-06 22:31 ` [RFC PATCH 2/7] crypto: x86/sha - limit FPU preemption Robert Elliott
2022-10-06 22:31 ` [RFC PATCH 3/7] crypto: x86/crc " Robert Elliott
2022-10-06 22:31 ` [RFC PATCH 4/7] crypto: x86/sm3 " Robert Elliott
2022-10-06 22:31 ` [RFC PATCH 5/7] crypto: x86/ghash - restructure FPU context saving Robert Elliott
2022-10-06 22:31 ` [RFC PATCH 6/7] crypto: x86/ghash - limit FPU preemption Robert Elliott
2022-10-06 22:31 ` [RFC PATCH 7/7] crypto: x86 - use common macro for FPU limit Robert Elliott
2022-10-12 21:59 ` [PATCH v2 00/19] crypto: x86 - fix RCU stalls Robert Elliott
2022-10-12 21:59   ` [PATCH v2 01/19] crypto: tcrypt - test crc32 Robert Elliott
2022-10-12 21:59   ` [PATCH v2 02/19] crypto: tcrypt - test nhpoly1305 Robert Elliott
2022-10-12 21:59   ` [PATCH v2 03/19] crypto: tcrypt - reschedule during cycles speed tests Robert Elliott
2022-10-12 21:59   ` [PATCH v2 04/19] crypto: x86/sha - limit FPU preemption Robert Elliott
2022-10-13  0:41     ` Jason A. Donenfeld
2022-10-13 21:50       ` Elliott, Robert (Servers)
2022-10-14 11:01       ` David Laight
2022-10-13  5:57     ` Eric Biggers
2022-10-13  6:04       ` Herbert Xu
2022-10-13  6:08         ` Eric Biggers
2022-10-13  7:50           ` Herbert Xu
2022-10-13 22:41       ` :Re: " Elliott, Robert (Servers)
2022-10-12 21:59   ` [PATCH v2 05/19] crypto: x86/crc " Robert Elliott
2022-10-13  2:00     ` Herbert Xu
2022-10-13 22:34       ` Elliott, Robert (Servers)
2022-10-14  4:02     ` David Laight
2022-10-24  2:03     ` kernel test robot
2022-10-24  2:03       ` [LTP] " kernel test robot
2022-10-12 21:59   ` [PATCH v2 06/19] crypto: x86/sm3 " Robert Elliott
2022-10-12 21:59   ` [PATCH v2 07/19] crypto: x86/ghash - restructure FPU context saving Robert Elliott
2022-10-12 21:59   ` [PATCH v2 08/19] crypto: x86/ghash - limit FPU preemption Robert Elliott
2022-10-13  6:03     ` Eric Biggers
2022-10-13 22:52       ` Elliott, Robert (Servers)
2022-10-12 21:59   ` [PATCH v2 09/19] crypto: x86 - use common macro for FPU limit Robert Elliott
2022-10-13  0:35     ` Jason A. Donenfeld
2022-10-13 21:48       ` Elliott, Robert (Servers)
2022-10-14  1:26         ` Jason A. Donenfeld
2022-10-18  0:06           ` Elliott, Robert (Servers)
2022-10-12 21:59   ` [PATCH v2 10/19] crypto: x86/sha1, sha256 - load based on CPU features Robert Elliott
2022-10-12 21:59   ` [PATCH v2 11/19] crypto: x86/crc " Robert Elliott
2022-10-12 21:59   ` [PATCH v2 12/19] crypto: x86/sm3 " Robert Elliott
2022-10-12 21:59   ` [PATCH v2 13/19] crypto: x86/ghash " Robert Elliott
2022-10-12 21:59   ` [PATCH v2 14/19] crypto: x86 " Robert Elliott
2022-10-14 14:26     ` Elliott, Robert (Servers)
2022-10-12 21:59   ` [PATCH v2 15/19] crypto: x86 - add pr_fmt to all modules Robert Elliott
2022-10-12 21:59   ` [PATCH v2 16/19] crypto: x86 - print CPU optimized loaded messages Robert Elliott
2022-10-13  0:40     ` Jason A. Donenfeld
2022-10-13 13:47     ` kernel test robot
2022-10-13 13:48     ` kernel test robot
2022-10-12 21:59   ` [PATCH v2 17/19] crypto: x86 - standardize suboptimal prints Robert Elliott
2022-10-13  0:38     ` Jason A. Donenfeld
2022-10-12 21:59   ` [PATCH v2 18/19] crypto: x86 - standardize not loaded prints Robert Elliott
2022-10-13  0:42     ` Jason A. Donenfeld
2022-10-13 22:20       ` Elliott, Robert (Servers)
2022-11-10 22:06         ` Elliott, Robert (Servers)
2022-10-12 21:59   ` [PATCH v2 19/19] crypto: x86/sha - register only the best function Robert Elliott
2022-10-13  6:07     ` Eric Biggers
2022-10-13  7:52       ` Herbert Xu
2022-10-13 22:59         ` Elliott, Robert (Servers)
2022-10-14  8:22           ` Herbert Xu
2022-11-01 21:34   ` [PATCH v2 00/19] crypto: x86 - fix RCU stalls Elliott, Robert (Servers)
2022-11-03  4:27   ` [PATCH v3 00/17] crypt: " Robert Elliott
2022-11-03  4:27     ` [PATCH v3 01/17] crypto: tcrypt - test crc32 Robert Elliott
2022-11-03  4:27     ` [PATCH v3 02/17] crypto: tcrypt - test nhpoly1305 Robert Elliott
2022-11-03  4:27     ` [PATCH v3 03/17] crypto: tcrypt - reschedule during cycles speed tests Robert Elliott
2022-11-03  4:27     ` [PATCH v3 04/17] crypto: x86/sha - limit FPU preemption Robert Elliott
2022-11-03  4:27     ` [PATCH v3 05/17] crypto: x86/crc " Robert Elliott
2022-11-03  4:27     ` [PATCH v3 06/17] crypto: x86/sm3 " Robert Elliott
2022-11-03  4:27     ` [PATCH v3 07/17] crypto: x86/ghash - use u8 rather than char Robert Elliott
2022-11-03  4:27     ` [PATCH v3 08/17] crypto: x86/ghash - restructure FPU context saving Robert Elliott
2022-11-03  4:27     ` [PATCH v3 09/17] crypto: x86/ghash - limit FPU preemption Robert Elliott
2022-11-03  4:27     ` [PATCH v3 10/17] crypto: x86/*poly* " Robert Elliott
2022-11-03  4:27     ` [PATCH v3 11/17] crypto: x86/sha - register all variations Robert Elliott
2022-11-03  9:26       ` kernel test robot
2022-11-03  4:27     ` [PATCH v3 12/17] crypto: x86/sha - minimize time in FPU context Robert Elliott
2022-11-03  4:27     ` [PATCH v3 13/17] crypto: x86/sha1, sha256 - load based on CPU features Robert Elliott
2022-11-03  4:27     ` [PATCH v3 14/17] crypto: x86/crc " Robert Elliott
2022-11-03  4:27     ` [PATCH v3 15/17] crypto: x86/sm3 " Robert Elliott
2022-11-03  4:27     ` [PATCH v3 16/17] crypto: x86/ghash,polyval " Robert Elliott
2022-11-03  4:27     ` [PATCH v3 17/17] crypto: x86/nhpoly1305, poly1305 " Robert Elliott
2022-11-16  4:13     ` [PATCH v4 00/24] crypto: fix RCU stalls Robert Elliott
2022-11-16  4:13       ` [PATCH v4 01/24] crypto: tcrypt - test crc32 Robert Elliott
2022-11-16  4:13       ` [PATCH v4 02/24] crypto: tcrypt - test nhpoly1305 Robert Elliott
2022-11-16  4:13       ` [PATCH v4 03/24] crypto: tcrypt - reschedule during cycles speed tests Robert Elliott
2022-11-16  4:13       ` [PATCH v4 04/24] crypto: x86/sha - limit FPU preemption Robert Elliott
2022-11-16  4:13       ` [PATCH v4 05/24] crypto: x86/crc " Robert Elliott
2022-11-16  4:13       ` [PATCH v4 06/24] crypto: x86/sm3 " Robert Elliott
2022-11-16  4:13       ` [PATCH v4 07/24] crypto: x86/ghash - use u8 rather than char Robert Elliott
2022-11-16  4:13       ` [PATCH v4 08/24] crypto: x86/ghash - restructure FPU context saving Robert Elliott
2022-11-16  4:13       ` [PATCH v4 09/24] crypto: x86/ghash - limit FPU preemption Robert Elliott
2022-11-16  4:13       ` [PATCH v4 10/24] crypto: x86/poly " Robert Elliott
2022-11-16 11:13         ` Jason A. Donenfeld
2022-11-22  5:06           ` Elliott, Robert (Servers)
2022-11-22  9:07             ` David Laight
2022-11-25  8:40           ` Herbert Xu
2022-11-25  8:59             ` Ard Biesheuvel [this message]
2022-11-25  9:03               ` Herbert Xu
2022-11-28 16:57                 ` Elliott, Robert (Servers)
2022-11-28 18:48                   ` Elliott, Robert (Servers)
2022-12-02  6:21             ` Elliott, Robert (Servers)
2022-12-02  9:25               ` Herbert Xu
2022-12-02 16:15                 ` Elliott, Robert (Servers)
2022-12-06  4:27                   ` Herbert Xu
2022-12-06 14:03                     ` Peter Lafreniere
2022-12-06 14:44                       ` David Laight
2022-12-06 23:06               ` Peter Lafreniere
2022-12-10  0:34                 ` Elliott, Robert (Servers)
2022-12-16 22:12                   ` Elliott, Robert (Servers)
2022-11-16  4:13       ` [PATCH v4 11/24] crypto: x86/aegis " Robert Elliott
2022-11-16  4:13       ` [PATCH v4 12/24] crypto: x86/sha - register all variations Robert Elliott
2022-11-16  4:13       ` [PATCH v4 13/24] crypto: x86/sha - minimize time in FPU context Robert Elliott
2022-11-16  4:13       ` [PATCH v4 14/24] crypto: x86/sha - load based on CPU features Robert Elliott
2022-11-16  4:13       ` [PATCH v4 15/24] crypto: x86/crc " Robert Elliott
2022-11-16  4:13       ` [PATCH v4 16/24] crypto: x86/sm3 " Robert Elliott
2022-11-16  4:13       ` [PATCH v4 17/24] crypto: x86/poly " Robert Elliott
2022-11-16 11:19         ` Jason A. Donenfeld
2022-11-16  4:13       ` [PATCH v4 18/24] crypto: x86/ghash " Robert Elliott
2022-11-16  4:13       ` [PATCH v4 19/24] crypto: x86/aesni - avoid type conversions Robert Elliott
2022-11-16  4:13       ` [PATCH v4 20/24] crypto: x86/ciphers - load based on CPU features Robert Elliott
2022-11-16 11:30         ` Jason A. Donenfeld
2022-11-16  4:13       ` [PATCH v4 21/24] crypto: x86 - report used CPU features via module parameters Robert Elliott
2022-11-16 11:26         ` Jason A. Donenfeld
2022-11-16  4:13       ` [PATCH v4 22/24] crypto: x86 - report missing " Robert Elliott
2022-11-16  4:13       ` [PATCH v4 23/24] crypto: x86 - report suboptimal CPUs " Robert Elliott
2022-11-16  4:13       ` [PATCH v4 24/24] crypto: x86 - standarize module descriptions Robert Elliott
2022-11-17  3:58       ` [PATCH v4 00/24] crypto: fix RCU stalls Herbert Xu
2022-11-17 15:13         ` Elliott, Robert (Servers)
2022-11-17 15:15           ` Jason A. Donenfeld

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAMj1kXHHm+L=qE5opDXhjoWZt+1eKXFeGVS=OdvyF0VNFZivCA@mail.gmail.com' \
    --to=ardb@kernel.org \
    --cc=David.Laight@aculab.com \
    --cc=Jason@zx2c4.com \
    --cc=ap420073@gmail.com \
    --cc=davem@davemloft.net \
    --cc=ebiggers@kernel.org \
    --cc=elliott@hpe.com \
    --cc=herbert@gondor.apana.org.au \
    --cc=linux-crypto@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tim.c.chen@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.