Linux-Crypto Archive on lore.kernel.org
* Help getting aesni crypto patch upstream
@ 2020-07-28 22:03 Ben Greear
  2020-07-29  6:06 ` Ard Biesheuvel
  0 siblings, 1 reply; 9+ messages in thread
From: Ben Greear @ 2020-07-28 22:03 UTC (permalink / raw)
  To: linux-crypto

Hello,

As part of my wifi test tool, I need to decrypt AES on the CPU, and the only way this
performs well is to use aesni.  I've been using a patch for years that does this, but
recently, somewhere between 5.4 and 5.7, the API I've been using has been removed.

Would anyone be interested in getting this support upstream?  I'd be happy to pay for
the effort.

Here is the patch in question:

https://github.com/greearb/linux-ct-5.7/blob/master/wip/0001-crypto-aesni-add-ccm-aes-algorithm-implementation.patch

Please keep me in CC, I'm not subscribed to this list.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


* Re: Help getting aesni crypto patch upstream
  2020-07-28 22:03 Help getting aesni crypto patch upstream Ben Greear
@ 2020-07-29  6:06 ` Ard Biesheuvel
  2020-07-29 12:27   ` Ben Greear
  0 siblings, 1 reply; 9+ messages in thread
From: Ard Biesheuvel @ 2020-07-29  6:06 UTC (permalink / raw)
  To: Ben Greear; +Cc: Linux Crypto Mailing List

On Wed, 29 Jul 2020 at 01:03, Ben Greear <greearb@candelatech.com> wrote:
>
> Hello,
>
> As part of my wifi test tool, I need to decrypt AES on the CPU, and the only way this
> performs well is to use aesni.  I've been using a patch for years that does this, but
> recently, somewhere between 5.4 and 5.7, the API I've been using has been removed.
>
> Would anyone be interested in getting this support upstream?  I'd be happy to pay for
> the effort.
>
> Here is the patch in question:
>
> https://github.com/greearb/linux-ct-5.7/blob/master/wip/0001-crypto-aesni-add-ccm-aes-algorithm-implementation.patch
>
> Please keep me in CC, I'm not subscribed to this list.
>

Hi Ben,

Recently, the x86 FPU handling was improved to remove the overhead of
preserving/restoring of the register state, so the issue that this
patch fixes may no longer exist. Did you try?

In any case, according to the commit log on that patch, the problem is
in the MAC generation, so it might be better to add a cbcmac(aes)
implementation only, and not duplicate all the CCM boilerplate.
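
For readers unfamiliar with the construction, CBC-MAC can be sketched in
userspace C as below. This is only an illustration under stated assumptions:
`toy_encrypt_block` is a hypothetical stand-in for AES and is not secure,
`block_enc_fn` and `cbcmac` are names made up for this sketch, and none of it
is the kernel crypto API or the code from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLK 16  /* AES block size in bytes */

/* Hypothetical single-block cipher callback; stands in for AES. */
typedef void (*block_enc_fn)(const uint8_t key[BLK], uint8_t blk[BLK]);

/* Toy stand-in cipher, NOT secure: XOR with the key, then rotate left by one byte. */
static void toy_encrypt_block(const uint8_t key[BLK], uint8_t blk[BLK])
{
    uint8_t first;
    size_t i;

    for (i = 0; i < BLK; i++)
        blk[i] ^= key[i];
    first = blk[0];
    memmove(blk, blk + 1, BLK - 1);
    blk[BLK - 1] = first;
}

/* CBC-MAC over whole blocks: mac = E(mac XOR block), starting from all zeroes. */
static void cbcmac(block_enc_fn enc, const uint8_t key[BLK],
                   const uint8_t *msg, size_t len, uint8_t mac[BLK])
{
    size_t off, i;

    assert(len % BLK == 0);      /* padding is left out of this sketch */
    memset(mac, 0, BLK);
    for (off = 0; off < len; off += BLK) {
        for (i = 0; i < BLK; i++)
            mac[i] ^= msg[off + i];
        enc(key, mac);           /* one cipher call per 16-byte block */
    }
}
```

The loop makes one cipher call per 16-byte block, so the fixed cost of each
individual call matters a great deal for the MAC.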


* Re: Help getting aesni crypto patch upstream
  2020-07-29  6:06 ` Ard Biesheuvel
@ 2020-07-29 12:27   ` Ben Greear
  2020-07-29 19:09     ` Ard Biesheuvel
  0 siblings, 1 reply; 9+ messages in thread
From: Ben Greear @ 2020-07-29 12:27 UTC (permalink / raw)
  To: Ard Biesheuvel; +Cc: Linux Crypto Mailing List

On 7/28/20 11:06 PM, Ard Biesheuvel wrote:
> 
> Hi Ben,
> 
> Recently, the x86 FPU handling was improved to remove the overhead of
> preserving/restoring of the register state, so the issue that this
> patch fixes may no longer exist. Did you try?
> 
> In any case, according to the commit log on that patch, the problem is
> in the MAC generation, so it might be better to add a cbcmac(aes)
> implementation only, and not duplicate all the CCM boilerplate.
> 

Hello,

I don't know all of the details, and do not understand the crypto subsystem,
but I am pretty sure that I need at least some of this patch.

If you can suggest a patch to try I'll be happy to test it to see how it
performs.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


* Re: Help getting aesni crypto patch upstream
  2020-07-29 12:27   ` Ben Greear
@ 2020-07-29 19:09     ` Ard Biesheuvel
  2020-07-29 19:29       ` Ben Greear
  0 siblings, 1 reply; 9+ messages in thread
From: Ard Biesheuvel @ 2020-07-29 19:09 UTC (permalink / raw)
  To: Ben Greear; +Cc: Linux Crypto Mailing List

On Wed, 29 Jul 2020 at 15:27, Ben Greear <greearb@candelatech.com> wrote:
>
> Hello,
>
> I don't know all of the details, and do not understand the crypto subsystem,
> but I am pretty sure that I need at least some of this patch.
>

Whether this is true is what I am trying to get clarified.

Your patch works around a performance bottleneck related to the use of
AES-NI instructions in the kernel, which has been addressed recently.
If the issue still exists, we can attempt to devise a fix for it,
which may or may not be based on this patch.

> If you can suggest a patch to try I'll be happy to test it to see how it
> performs.
>

Please share performance numbers of an old kernel with this patch
applied, and a recent one without. If that shows there is in fact an
issue, we will do something about it.


* Re: Help getting aesni crypto patch upstream
  2020-07-29 19:09     ` Ard Biesheuvel
@ 2020-07-29 19:29       ` Ben Greear
  2020-07-29 20:06         ` Ard Biesheuvel
  0 siblings, 1 reply; 9+ messages in thread
From: Ben Greear @ 2020-07-29 19:29 UTC (permalink / raw)
  To: Ard Biesheuvel; +Cc: Linux Crypto Mailing List

On 7/29/20 12:09 PM, Ard Biesheuvel wrote:
>> I don't know all of the details, and do not understand the crypto subsystem,
>> but I am pretty sure that I need at least some of this patch.
>>
> 
> Whether this is true is what I am trying to get clarified.
> 
> Your patch works around a performance bottleneck related to the use of
> AES-NI instructions in the kernel, which has been addressed recently.
> If the issue still exists, we can attempt to devise a fix for it,
> which may or may not be based on this patch.

Ok, I can do the testing.  Do you expect 5.7-stable has all the needed
performance improvements?

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


* Re: Help getting aesni crypto patch upstream
  2020-07-29 19:29       ` Ben Greear
@ 2020-07-29 20:06         ` Ard Biesheuvel
  2020-07-30 22:56           ` Ben Greear
  0 siblings, 1 reply; 9+ messages in thread
From: Ard Biesheuvel @ 2020-07-29 20:06 UTC (permalink / raw)
  To: Ben Greear; +Cc: Linux Crypto Mailing List

On Wed, 29 Jul 2020 at 22:29, Ben Greear <greearb@candelatech.com> wrote:
>
> Ok, I can do the testing.  Do you expect 5.7-stable has all the needed
> performance improvements?
>

Yes.


* Re: Help getting aesni crypto patch upstream
  2020-07-29 20:06         ` Ard Biesheuvel
@ 2020-07-30 22:56           ` Ben Greear
  2020-07-31 10:00             ` Ard Biesheuvel
  0 siblings, 1 reply; 9+ messages in thread
From: Ben Greear @ 2020-07-30 22:56 UTC (permalink / raw)
  To: Ard Biesheuvel; +Cc: Linux Crypto Mailing List

On 7/29/20 1:06 PM, Ard Biesheuvel wrote:
> On Wed, 29 Jul 2020 at 22:29, Ben Greear <greearb@candelatech.com> wrote:
>>
>> Ok, I can do the testing.  Do you expect 5.7-stable has all the needed
>> performance improvements?
>>
> 
> Yes.

It does not, as far as we can tell.

We did a download test on an apu2 (small embedded AMD CPU, but with
aesni support).  A WiFi station is in software-decrypt mode (ath10k-ct driver/firmware,
but ath9k would be valid to reproduce the issue as well.)

On our 5.4 kernel with the aesni patch applied, we get about 220 Mbps WPA2
download throughput.  With open (unencrypted) WiFi, we get about 260 Mbps.

On 5.7, without any aesni patch, we see about 116 Mbps WPA2 download throughput
and about 265 Mbps open download throughput.
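
To make the comparison explicit, the figures above can be expressed as encrypted
throughput relative to open throughput on the same kernel. The helper below is
only a sketch of that arithmetic; `pct_of_open` is a name made up for this
example and the inputs are the approximate numbers quoted above.

```c
#include <assert.h>

/* Encrypted (WPA2) throughput as a percentage of open throughput. */
static double pct_of_open(double wpa2_mbps, double open_mbps)
{
    assert(open_mbps > 0.0);   /* avoid dividing by zero */
    return 100.0 * wpa2_mbps / open_mbps;
}
```

With 220 and 260 Mbps (5.4 plus the aesni patch) this gives roughly 84.6%;
with 116 and 265 Mbps (5.7 without the patch) it gives roughly 43.8%.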


perf-top on 5.4 during download test with our aesni patch looks like this:

    11.73%  libc-2.29.so   [.] __memset_sse2_unaligned_erms
     4.79%  [kernel]       [k] _aesni_enc1
     1.71%  [kernel]       [k] ___bpf_prog_run
     1.66%  [kernel]       [k] memcpy
     1.25%  [kernel]       [k] copy_user_generic_string
     1.18%  libjvm.so      [.] InstanceKlass::oop_follow_contents
     1.07%  [kernel]       [k] _aesni_enc4
     0.98%  [kernel]       [k] csum_partial_copy_generic
     0.96%  libjvm.so      [.] SpinPause
     0.84%  [kernel]       [k] get_data_to_compute
     0.81%  libjvm.so      [.] ParMarkBitMap::mark_obj
     0.64%  [kernel]       [k] udp_sendmsg
     0.62%  [kernel]       [k] __ip_append_data.isra.53
     0.58%  [kernel]       [k] ipt_do_table
     0.56%  [kernel]       [k] _aesni_inc
     0.56%  [kernel]       [k] fib_table_lookup
     0.55%  [kernel]       [k] __rcu_read_unlock
     0.52%  libc-2.29.so   [.] __GI___strcmp_ssse3
     0.50%  [kernel]       [k] igb_xmit_frame_ring


on 5.7, we see this:

    11.36%  libc-2.29.so   [.] __memset_sse2_unaligned_erms
     9.03%  [kernel]       [k] kernel_fpu_begin
     4.75%  libjvm.so      [.] SpinPause
     2.89%  [kernel]       [k] __crypto_xor
     2.35%  [kernel]       [k] _aesni_enc1
     1.94%  [kernel]       [k] copy_user_generic_string
     1.29%  [kernel]       [k] aesni_encrypt
     0.85%  [kernel]       [k] udp_sendmsg
     0.85%  [kernel]       [k] crypto_cipher_encrypt_one
     0.71%  [kernel]       [k] crypto_cbcmac_digest_update
     0.69%  [kernel]       [k] __ip_append_data.isra.53
     0.69%  [kernel]       [k] memcpy
     0.68%  [kernel]       [k] crypto_ctr_crypt
     0.61%  [kernel]       [k] irq_fpu_usable
     0.58%  [kernel]       [k] ipt_do_table
     0.55%  [kernel]       [k] __dev_queue_xmit
     0.54%  [kernel]       [k] crypto_inc
     0.49%  libc-2.29.so   [.] __GI___strcmp_ssse3
     0.45%  libjvm.so      [.] InstanceKlass::oop_follow_contents
     0.45%  [kernel]       [k] ip_route_output_key_hash_rcu



So, I think there is still some good improvement possible, likely with something like
the aesni patch I showed, but re-worked to function in 5.7+ kernels.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


* Re: Help getting aesni crypto patch upstream
  2020-07-30 22:56           ` Ben Greear
@ 2020-07-31 10:00             ` Ard Biesheuvel
  2020-07-31 14:02               ` Ben Greear
  0 siblings, 1 reply; 9+ messages in thread
From: Ard Biesheuvel @ 2020-07-31 10:00 UTC (permalink / raw)
  To: Ben Greear; +Cc: Linux Crypto Mailing List

On Fri, 31 Jul 2020 at 01:57, Ben Greear <greearb@candelatech.com> wrote:
>
> >> Ok, I can do the testing.  Do you expect 5.7-stable has all the needed
> >> performance improvements?
> >>
> >
> > Yes.
>
> It does not, as far as we can tell.
>
> We did a download test on an apu2 (small embedded AMD CPU, but with
> aesni support).  A WiFi station is in software-decrypt mode (ath10k-ct driver/firmware,
> but ath9k would be valid to reproduce the issue as well.)
>
> On our 5.4 kernel with the aesni patch applied, we get about 220 Mbps WPA2
> download throughput.  With open (unencrypted) WiFi, we get about 260 Mbps.
>
> On 5.7, without any aesni patch, we see about 116 Mbps WPA2 download throughput
> and about 265 Mbps open download throughput.
>

Thanks for the excellent data. Apparently, FPU preserve/restore is
still prohibitively expensive on these cores.
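
A rough model of why this hurts, offered as an assumption rather than a
measurement from this thread: if the MAC is computed through single-block
cipher calls (crypto_cipher_encrypt_one in the 5.7 profile) and each call
opens and closes its own FPU section, then kernel_fpu_begin runs once per
16 bytes of payload. The helper names below are made up for this sketch, and
the 1500-byte frame used in the note that follows is just an illustrative size.

```c
#include <stddef.h>

#define AES_BLOCK_SIZE 16

/* FPU begin/end pairs if every 16-byte block is its own FPU section. */
static size_t fpu_sections_per_block(size_t len)
{
    return (len + AES_BLOCK_SIZE - 1) / AES_BLOCK_SIZE;
}

/* FPU begin/end pairs if a single section covers the whole buffer. */
static size_t fpu_sections_batched(size_t len)
{
    return len ? 1 : 0;
}
```

For a 1500-byte frame the per-block model gives 94 FPU sections, against one
for a MAC implementation that processes the whole buffer inside a single
section.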

I'll have a stab at implementing cbcmac(aesni) early next week: as I
pointed out before, we don't need all the CCM boilerplate if the CTR
and MAC processing are still done in separate passes anyway.



* Re: Help getting aesni crypto patch upstream
  2020-07-31 10:00             ` Ard Biesheuvel
@ 2020-07-31 14:02               ` Ben Greear
  0 siblings, 0 replies; 9+ messages in thread
From: Ben Greear @ 2020-07-31 14:02 UTC (permalink / raw)
  To: Ard Biesheuvel; +Cc: Linux Crypto Mailing List

On 7/31/20 3:00 AM, Ard Biesheuvel wrote:
> 
> Thanks for the excellent data. Apparently, FPU preserve/restore is
> still prohibitively expensive on these cores.
> 
> I'll have a stab at implementing cbcmac(aesni) early next week: as I
> pointed out before, we don't need all the CCM boilerplate if the CTR
> and MAC processing are still done in separate passes anyway.

That will be very welcome.  We'll be happy to test.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com
