Linux-Crypto Archive on lore.kernel.org
From: Ben Greear <greearb@candelatech.com>
To: Ard Biesheuvel <ardb@kernel.org>
Cc: Linux Crypto Mailing List <linux-crypto@vger.kernel.org>,
	Herbert Xu <herbert@gondor.apana.org.au>,
	Eric Biggers <ebiggers@kernel.org>
Subject: Re: [PATCH] crypto: x86/aesni - implement accelerated CBCMAC, CMAC and XCBC shashes
Date: Tue, 4 Aug 2020 06:22:39 -0700
Message-ID: <b13c953c-45ea-d3fb-e17b-9a313af6d19b@candelatech.com> (raw)
In-Reply-To: <CAMj1kXFnfYKj1JE4NLsxXtaeKuAOKyBYDbayLr-mHDUYqnV1bA@mail.gmail.com>

On 8/4/20 6:08 AM, Ard Biesheuvel wrote:
> On Tue, 4 Aug 2020 at 15:01, Ben Greear <greearb@candelatech.com> wrote:
>>
>> On 8/4/20 5:55 AM, Ard Biesheuvel wrote:
>>> On Mon, 3 Aug 2020 at 21:11, Ben Greear <greearb@candelatech.com> wrote:
>>>>
>>>> Hello,
>>>>
>>>> This helps a bit...now download sw-crypt performance is about 150Mbps,
>>>> but still not as good as with my patch on 5.4 kernel, and fpu is still
>>>> high in perf top:
>>>>
>>>>        13.89%  libc-2.29.so   [.] __memset_sse2_unaligned_erms
>>>>         6.62%  [kernel]       [k] kernel_fpu_begin
>>>>         4.14%  [kernel]       [k] _aesni_enc1
>>>>         2.06%  [kernel]       [k] __crypto_xor
>>>>         1.95%  [kernel]       [k] copy_user_generic_string
>>>>         1.93%  libjvm.so      [.] SpinPause
>>>>         1.01%  [kernel]       [k] aesni_encrypt
>>>>         0.98%  [kernel]       [k] crypto_ctr_crypt
>>>>         0.93%  [kernel]       [k] udp_sendmsg
>>>>         0.78%  [kernel]       [k] crypto_inc
>>>>         0.74%  [kernel]       [k] __ip_append_data.isra.53
>>>>         0.65%  [kernel]       [k] aesni_cbc_enc
>>>>         0.64%  [kernel]       [k] __dev_queue_xmit
>>>>         0.62%  [kernel]       [k] ipt_do_table
>>>>         0.62%  [kernel]       [k] igb_xmit_frame_ring
>>>>         0.59%  [kernel]       [k] ip_route_output_key_hash_rcu
>>>>         0.57%  [kernel]       [k] memcpy
>>>>         0.57%  libjvm.so      [.] InstanceKlass::oop_follow_contents
>>>>         0.56%  [kernel]       [k] irq_fpu_usable
>>>>         0.56%  [kernel]       [k] mac_do_update
>>>>
>>>> If you'd like help setting up a test rig and have an ath10k pcie NIC or ath9k pcie NIC,
>>>> then I can help.  Possibly hwsim would also be a good test case, but I have not tried
>>>> that.
>>>>
>>>
>>> I don't think this is likely to be reproducible on other
>>> micro-architectures, so setting up a test rig is unlikely to help.
>>>
>>> I'll send out a v2 which implements an ahash instead of a shash (and
>>> implements some other tweaks) so that kernel_fpu_begin() is only
>>> called twice for each packet on the cbcmac path.
>>>
>>> Do you have any numbers for the old kernel without your patch? This
>>> pathological FPU preserve/restore behavior could be caused by the
>>> optimizations, or by other changes that landed in the meantime, so I
>>> would like to know if kernel_fpu_begin() is as prominent in those
>>> traces as well.
>>>
>>
>> This same patch makes i7 mobile processors able to handle 1Gbps+ software
>> decrypt rates, where without the patch, the rate was badly constrained and CPU
>> load was much higher, so it is definitely noticeable on other processors too.
> 
> OK
> 
>> The weak processor on the current test rig is convenient because the problem
>> is so noticeable even at slower wifi speeds.
>>
>> We can do some tests on 5.4 with our patch reverted.
>>
> 
> The issue with your CCM patch is that it keeps the FPU enabled for the
> entire input, which also means that preemption is disabled, which
> makes the -rt people grumpy. (Of course, it also uses APIs that no
> longer exist, but that should be easy to fix.)

So, if there is no other way to get the performance back, could keeping the
FPU enabled be made a compile-time or runtime option (disabled by default for
the -rt folks), so that we can re-enable the behavior that helps our CPU usage?

Or, could you do an add-on patch that keeps the FPU enabled, so that I can test
how that affects our performance?
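
To make the trade-off concrete, here is a rough userspace sketch, not kernel
code. The *_stub functions are hypothetical stand-ins for kernel_fpu_begin(),
kernel_fpu_end() and the per-block cipher call, and they only count how often
an FPU section is entered. It compares entering the FPU section once per
16-byte block against once per bounded chunk, assuming a chunk limit of the
kind a compile-time or runtime option might set.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_SIZE 16u /* AES block size in bytes */

/* Hypothetical stand-ins for kernel_fpu_begin()/kernel_fpu_end():
 * they do no real work and only count FPU section entries. */
static unsigned long fpu_sections;
static void fpu_begin_stub(void) { fpu_sections++; }
static void fpu_end_stub(void) { }

/* Placeholder for the per-block cipher work (assumption, does nothing). */
static void process_block_stub(const unsigned char *p) { (void)p; }

/* One FPU section per full block. Returns the number of sections entered. */
static unsigned long mac_per_block(const unsigned char *data, size_t len)
{
	fpu_sections = 0;
	for (size_t off = 0; off + BLOCK_SIZE <= len; off += BLOCK_SIZE) {
		fpu_begin_stub();
		process_block_stub(data + off);
		fpu_end_stub();
	}
	return fpu_sections;
}

/* One FPU section per chunk of at most max_chunk bytes, so the time spent
 * with preemption disabled is bounded by the chunk size.
 * Returns the number of sections entered. */
static unsigned long mac_per_chunk(const unsigned char *data, size_t len,
				   size_t max_chunk)
{
	size_t off = 0;

	assert(max_chunk >= BLOCK_SIZE); /* otherwise no forward progress */
	fpu_sections = 0;
	while (off + BLOCK_SIZE <= len) {
		size_t end = off + max_chunk;

		if (end > len)
			end = len;
		fpu_begin_stub();
		for (; off + BLOCK_SIZE <= end; off += BLOCK_SIZE)
			process_block_stub(data + off);
		fpu_end_stub();
	}
	return fpu_sections;
}
```

For a 1504-byte buffer, the per-block variant enters the FPU section once per
block, while a chunk limit at least as large as the buffer needs a single
section. Smaller limits sit in between, which is the knob I am asking about.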

> 
> Do you happen to have any ballpark figures for the packet sizes and
> the time spent doing encryption?

This test was using MTU-sized UDP frames, I think, and mostly it is just sending
and receiving frames.  The perf top output gives you as much detail as I have about
what the kernel is spending its time doing.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com
