From: Christian Lamparter <chunkeey@gmail.com>
To: Ard Biesheuvel <ardb@kernel.org>,
	Thara Gopinath <thara.gopinath@linaro.org>,
	Eric Biggers <ebiggers@kernel.org>
Cc: Linux Crypto Mailing List <linux-crypto@vger.kernel.org>
Subject: Re: Qualcomm Crypto Engine performance numbers on mainline kernel
Date: Sun, 6 Jun 2021 12:07:10 +0200
Message-ID: <0bd651ea-a062-3883-77ee-6ac275d66741@gmail.com>
In-Reply-To: <CAMj1kXGRb=_tozRAMA+ZFbAHU4P7ocLbWq+B3s0ngoRoo82V6g@mail.gmail.com>

On 05/06/2021 17:32, Ard Biesheuvel wrote:
> Hello Thara,
> 
> On Fri, 4 Jun 2021 at 18:49, Thara Gopinath <thara.gopinath@linaro.org> wrote:
>>
>>
>> Hi All,
>>
>> Below are the performance numbers from running "cryptsetup benchmark" on
>> CE algorithms in the mainline kernel. All numbers are in MiB/s. The
>> platform used is RB3 for sdm845 and MTPs for the rest of them.
>>
>>
>>                          SDM845     SM8150     SM8250     SM8350
>> AES-CBC (128)
>> Encrypt / Decrypt        114/106    36/48      120/188    133/197
>>
>> AES-XTS (256)
>> Encrypt / Decrypt        100/102    49/48      186/187    n/a
>>
> 
> The CPU instruction based ones are apparently an order of magnitude
> faster, and are synchronous so their latency should be lower.
> 
> So, as Eric already pointed out IIRC, there doesn't seem to be much
> value in enabling this IP in Linux - it should not be the default
> choice/highest priority, and it is not obvious to me whether/when you
> would prefer this implementation over the CPU based one. Do you have
> any idea how many queues it has, or how much data it can process in
> parallel? Are there other features that stand out?

While I can't say much about qce-crypto, I do know that "cryptsetup
benchmark" isn't always the best tool for pitting hardware-accelerated
crypto against the CPU.

In my case (crypto4xx; the CPU is an 800 MHz PowerPC 464 and the
hardware is a Western Digital My Book Live NAS), the "benchmark"
results look exceptionally poor:
#     Algorithm |       Key |      Encryption |      Decryption
         aes-cbc        128b         8.0 MiB/s         8.7 MiB/s
         aes-cbc        256b         8.7 MiB/s         8.7 MiB/s
         aes-xts        256b         5.3 MiB/s         7.9 MiB/s
         aes-xts        512b         7.9 MiB/s         7.9 MiB/s
(The hardware doesn't support cts/xts, only aes-cbc, aes-ctr and aes-gcm.)

(For comparison, these are the numbers produced by the 800 MHz PowerPC
CPU alone:)
         aes-cbc        128b        15.8 MiB/s        16.3 MiB/s
         aes-cbc        256b        12.3 MiB/s        12.8 MiB/s
         aes-xts        256b        12.5 MiB/s        15.1 MiB/s
         aes-xts        512b        11.9 MiB/s        12.0 MiB/s


and OpenSSL in software (openssl speed -evp aes-128-cbc --elapsed -seconds 3)
manages similar numbers:

type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes  16384 bytes
aes-128-cbc      12646.42k    16806.66k    18349.31k    18762.07k    18896.21k    18879.83k
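
To compare this with the cryptsetup figures, the units need converting:
openssl speed reports in 1000s of bytes per second, while cryptsetup
benchmark reports MiB/s. A minimal sketch of the conversion (the helper
name to_mib is made up for illustration):

```shell
#!/bin/sh
# Convert an "openssl speed" figure (1000s of bytes per second)
# into MiB/s, the unit used by "cryptsetup benchmark".
# to_mib is a hypothetical helper name, not part of any tool.
to_mib() {
    awk -v k="$1" 'BEGIN { printf "%.1f\n", k * 1000 / (1024 * 1024) }'
}

echo "16-byte blocks:    $(to_mib 12646.42) MiB/s"
echo "16384-byte blocks: $(to_mib 18879.83) MiB/s"
```

So the openssl result lands between roughly 12 and 18 MiB/s, in the same
range as the CPU-only aes-cbc 128b row above.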

However, when I format a partition on the NAS HDD with
cryptsetup + crypto4xx and measure it with hdparm -t / dd, I get:

# hdparm -t /dev/mapper/aes-cbc-hw-test

/dev/mapper/aes-cbc-hw-test:
  Timing buffered disk reads:  96 MB in  3.05 seconds =  31.46 MB/sec

# dd if=/dev/mapper/aes-cbc-hw-test of=/dev/null bs=8M status=progress
5318377472 bytes (5.3 GB, 5.0 GiB) copied, 143 s, 37.2 MB/s^C
639+0 records in
638+0 records out
5351931904 bytes (5.4 GB, 5.0 GiB) copied, 144.246 s, 37.1 MB/s

whereas without crypto4xx:

# hdparm -t /dev/mapper/aes-cbc-hw-test

/dev/mapper/aes-cbc-hw-test:
  Timing buffered disk reads:  34 MB in  3.14 seconds =  10.82 MB/sec

# dd if=/dev/mapper/aes-cbc-hw-test of=/dev/null bs=8M status=progress
46+0 records in
45+0 records out
377487360 bytes (377 MB, 360 MiB) copied, 33.1952 s, 11.4 MB/s

So with crypto4xx the throughput is about 3 times what the CPU alone
could do.
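
As a quick check, the ratio can be worked out from the hdparm and dd
figures above. A minimal sketch (the helper name speedup is made up for
illustration; the inputs are the MB/s values from the two runs):

```shell
#!/bin/sh
# Ratio of throughput with crypto4xx to throughput with the CPU only.
# speedup is a hypothetical helper name, not part of any tool.
speedup() {
    # $1 = MB/s with crypto4xx, $2 = MB/s without crypto4xx
    awk -v hw="$1" -v sw="$2" 'BEGIN { printf "%.2f\n", hw / sw }'
}

echo "hdparm -t: $(speedup 31.46 10.82)x"
echo "dd:        $(speedup 37.1 11.4)x"
```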

@Thara, do you have a USB 3.0 port and a fast USB 3.0 stick? If so, try
formatting a partition on it with cryptsetup and run the test there.
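
Roughly, such a test could look like the sketch below. The device path
/dev/sdX1 is a placeholder, and the cipher spec and key size are my
assumptions picked to match the aes-cbc 128 case, not necessarily what
was used for the numbers above. luksFormat wipes the partition, so the
script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only. DEV and NAME are placeholders; the cipher spec and key
# size are assumptions. luksFormat DESTROYS all data on DEV.
DEV="${DEV:-/dev/sdX1}"
NAME="${NAME:-aes-cbc-hw-test}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

bench_steps() {
    # Show the registered cbc(aes) implementations and their priorities.
    run grep -A3 '^name *: cbc(aes)' /proc/crypto
    # Create and open the encrypted mapping.
    run cryptsetup luksFormat --cipher aes-cbc-essiv:sha256 --key-size 128 "$DEV"
    run cryptsetup open "$DEV" "$NAME"
    # Read through the mapping, as in the runs above.
    run hdparm -t "/dev/mapper/$NAME"
    run dd if="/dev/mapper/$NAME" of=/dev/null bs=8M status=progress
    run cryptsetup close "$NAME"
}

bench_steps
```

Running the same steps once with the hardware driver active and once
without it gives the two sets of numbers to compare.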

Cheers,
Christian
