From: Ben Greear <greearb@candelatech.com>
To: Ard Biesheuvel <ardb@kernel.org>
Cc: Linux Crypto Mailing List <linux-crypto@vger.kernel.org>,
Herbert Xu <herbert@gondor.apana.org.au>,
Eric Biggers <ebiggers@kernel.org>
Subject: Re: [PATCH] crypto: x86/aesni - implement accelerated CBCMAC, CMAC and XCBC shashes
Date: Tue, 4 Aug 2020 06:22:39 -0700
Message-ID: <b13c953c-45ea-d3fb-e17b-9a313af6d19b@candelatech.com>
In-Reply-To: <CAMj1kXFnfYKj1JE4NLsxXtaeKuAOKyBYDbayLr-mHDUYqnV1bA@mail.gmail.com>

On 8/4/20 6:08 AM, Ard Biesheuvel wrote:
> On Tue, 4 Aug 2020 at 15:01, Ben Greear <greearb@candelatech.com> wrote:
>>
>> On 8/4/20 5:55 AM, Ard Biesheuvel wrote:
>>> On Mon, 3 Aug 2020 at 21:11, Ben Greear <greearb@candelatech.com> wrote:
>>>>
>>>> Hello,
>>>>
>>>> This helps a bit...now download sw-crypt performance is about 150Mbps,
>>>> but still not as good as with my patch on 5.4 kernel, and fpu is still
>>>> high in perf top:
>>>>
>>>> 13.89% libc-2.29.so [.] __memset_sse2_unaligned_erms
>>>> 6.62% [kernel] [k] kernel_fpu_begin
>>>> 4.14% [kernel] [k] _aesni_enc1
>>>> 2.06% [kernel] [k] __crypto_xor
>>>> 1.95% [kernel] [k] copy_user_generic_string
>>>> 1.93% libjvm.so [.] SpinPause
>>>> 1.01% [kernel] [k] aesni_encrypt
>>>> 0.98% [kernel] [k] crypto_ctr_crypt
>>>> 0.93% [kernel] [k] udp_sendmsg
>>>> 0.78% [kernel] [k] crypto_inc
>>>> 0.74% [kernel] [k] __ip_append_data.isra.53
>>>> 0.65% [kernel] [k] aesni_cbc_enc
>>>> 0.64% [kernel] [k] __dev_queue_xmit
>>>> 0.62% [kernel] [k] ipt_do_table
>>>> 0.62% [kernel] [k] igb_xmit_frame_ring
>>>> 0.59% [kernel] [k] ip_route_output_key_hash_rcu
>>>> 0.57% [kernel] [k] memcpy
>>>> 0.57% libjvm.so [.] InstanceKlass::oop_follow_contents
>>>> 0.56% [kernel] [k] irq_fpu_usable
>>>> 0.56% [kernel] [k] mac_do_update
>>>>
>>>> If you'd like help setting up a test rig and have an ath10k pcie NIC or ath9k pcie NIC,
>>>> then I can help. Possibly hwsim would also be a good test case, but I have not tried
>>>> that.
>>>>
>>>
>>> I don't think this is likely to be reproducible on other
>>> micro-architectures, so setting up a test rig is unlikely to help.
>>>
>>> I'll send out a v2 which implements an ahash instead of a shash (and
>>> implements some other tweaks) so that kernel_fpu_begin() is only
>>> called twice for each packet on the cbcmac path.
>>>
>>> Do you have any numbers for the old kernel without your patch? This
>>> pathological FPU preserve/restore behavior could be caused by the
>>> optimizations, or by other changes that landed in the meantime, so I
>>> would like to know if kernel_fpu_begin() is as prominent in those
>>> traces as well.
>>>
>>
>> This same patch makes i7 mobile processors able to handle 1Gbps+ software
>> decrypt rates, where without the patch, the rate was badly constrained and CPU
>> load was much higher, so it is definitely noticeable on other processors too.
>
> OK
>
>> The weak processor on the current test rig is convenient because the problem
>> is so noticeable even at slower wifi speeds.
>>
>> We can do some tests on 5.4 with our patch reverted.
>>
>
> The issue with your CCM patch is that it keeps the FPU enabled for the
> entire input, which also means that preemption is disabled, which
> makes the -rt people grumpy. (Of course, it also uses APIs that no
> longer exist, but that should be easy to fix.)

So, if there is no other way to get back the performance, can it be a compile
or runtime option (disabled by default for -RT type folks) to re-enable the feature
that helps our CPU usage?

Or, can you do an add-on patch that keeps the FPU enabled so that I can test
how that affects our performance?

>
> Do you happen to have any ballpark figures for the packet sizes and
> the time spent doing encryption?

This test was using MTU-sized UDP frames, I think, and mostly it is just sending
and receiving frames. The perf top output gives you as much detail as I have about
what the kernel is spending time doing.
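
For a rough sense of scale, the sketch below counts how many
kernel_fpu_begin()/kernel_fpu_end() sections a single frame would need under
two models. It is a userspace back-of-the-envelope calculation, not kernel
code. The per-block model (one section per 16-byte AES block) is an assumption
about the slow path, the per-packet figure of two comes from the v2 goal quoted
above, and the function names are made up for illustration.

```python
AES_BLOCK_SIZE = 16  # bytes; fixed by the AES standard


def fpu_sections_per_block(payload_len: int) -> int:
    """Assumed slow path: one FPU begin/end section per AES block."""
    # Integer ceiling division: a partial trailing block still costs a section.
    return (payload_len + AES_BLOCK_SIZE - 1) // AES_BLOCK_SIZE


def fpu_sections_per_packet(payload_len: int, sections: int = 2) -> int:
    """Batched path: a fixed number of sections per packet (2 per the v2 goal)."""
    return sections if payload_len > 0 else 0


if __name__ == "__main__":
    # Payload sizes are illustrative; 1500 approximates an MTU-sized frame.
    for n in (64, 512, 1500):
        print(n, fpu_sections_per_block(n), fpu_sections_per_packet(n))
```

Under these assumptions a 1500-byte payload needs 94 sections in the per-block
model against 2 in the per-packet model, which would be consistent with
kernel_fpu_begin showing up so prominently in the trace.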

Thanks,
Ben
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com