From: Robin Murphy <robin.murphy@arm.com>
To: Sunil Kovvuri <sunil.kovvuri@gmail.com>,
	Oliver Swede <oli.swede@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
	will@kernel.org, linux-arm-kernel@lists.infradead.org,
	LKML <linux-kernel@vger.kernel.org>,
	Sunil Goutham <sgoutham@marvell.com>,
	George Cherian <gcherian@marvell.com>
Subject: Re: [PATCH v5 08/14] arm64: Import latest optimization of memcpy
Date: Tue, 1 Jun 2021 13:06:32 +0100
Message-ID: <5156db7f-09a7-b0fa-d246-b024e40775fc@arm.com>
In-Reply-To: <CA+sq2CfGu+JPJ9h3Q-nCdmM2erqWbe=p-v_eip-J-R6F2EoXbw@mail.gmail.com>

On 2021-06-01 11:03, Sunil Kovvuri wrote:
> On Mon, Sep 14, 2020 at 8:44 PM Oliver Swede <oli.swede@arm.com> wrote:
>>
>> From: Sam Tebbs <sam.tebbs@arm.com>
>>
>> Import the latest memcpy implementation into memcpy,
>> copy_{from, to and in}_user.
>> The implementation of the user routines is separated into two forms:
>> one for when UAO is enabled and one for when UAO is disabled, with
>> the two being chosen between with a runtime patch.
>> This avoids executing the many NOPs emitted when UAO is disabled.
>>
>> The project containing optimized implementations for various library
>> functions has now been renamed from 'cortex-strings' to
>> 'optimized-routines', and the new upstream source is
>> string/aarch64/memcpy.S as of commit 4c175c8be12 in
>> https://github.com/ARM-software/optimized-routines.
>>
>> Signed-off-by: Sam Tebbs <sam.tebbs@arm.com>
>> [ rm: add UAO fixups, streamline copy_exit paths, expand commit message ]
>> Signed-off-by: Robin Murphy <robin.murphy@arm.com>
>> [ os: import newer memcpy algorithm, update commit message ]
>> Signed-off-by: Oliver Swede <oli.swede@arm.com>
>> ---
>>   arch/arm64/include/asm/alternative.h |  36 ---
>>   arch/arm64/lib/copy_from_user.S      | 113 ++++++--
>>   arch/arm64/lib/copy_in_user.S        | 129 +++++++--
>>   arch/arm64/lib/copy_template.S       | 375 +++++++++++++++------------
>>   arch/arm64/lib/copy_template_user.S  |  24 ++
>>   arch/arm64/lib/copy_to_user.S        | 112 ++++++--
>>   arch/arm64/lib/copy_user_fixup.S     |  14 +
>>   arch/arm64/lib/memcpy.S              |  47 ++--
>>   8 files changed, 557 insertions(+), 293 deletions(-)
>>   create mode 100644 arch/arm64/lib/copy_template_user.S
>>   create mode 100644 arch/arm64/lib/copy_user_fixup.S
> 
> Do you have any performance data with this patch ?
> I see these patches are still not pushed to mainline, any reasons ?

Funny you should pick up on the 6-month-old thread days after I've been 
posting new versions of the relevant parts[1] :)

I think this series mostly stalled on the complexity of the usercopy 
parts, which then turned into even more of a moving target anyway, hence 
why I decided to split it up.

> Also curious to know why 128bit registers are not considered, similar to
> https://android.googlesource.com/platform/bionic.git/+/a71b4c3f144a516826e8ac5b262099b920c49ce0/libc/arch-arm64/generic-neon/bionic/memcpy.S

The overhead of kernel_neon_begin() etc. is significant, and usually 
only worth it in places like the crypto routines where there's enough 
benefit from actual ASIMD computation to outweigh the save/restore cost. 
On smaller cores where the L1 interface is only 128 bits wide anyway 
there is no possible gain in memcpy() throughput to ever offset that 
cost, and even for wider microarchitectures it's only likely to start 
breaking even at relatively large copy sizes. Plus we can't necessarily 
assume the ASIMD registers are even present (apparently the lack of a 
soft-float ABI hasn't stopped people from wanting to run Linux on such 
systems...)

Robin.

[1] 
https://lore.kernel.org/linux-arm-kernel/cover.1622128527.git.robin.murphy@arm.com/
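
The break-even argument in the reply above (a fixed save/restore cost around
kernel_neon_begin() that has to be paid back by a lower per-byte copy cost)
can be illustrated with a small cost model. This is only a sketch: the struct,
the function name and all figures are hypothetical and are not taken from the
patch series or from any measurement.

```c
/*
 * Hypothetical cost model for "GPR copy vs. ASIMD copy"; not kernel code.
 * All figures passed in are illustrative assumptions, not measurements.
 */
struct copy_cost {
	double fixed_ns;    /* one-off cost, e.g. saving/restoring SIMD state */
	double ns_per_byte; /* steady-state cost of copying one byte */
};

/*
 * Smallest copy size in bytes at which the SIMD path is no slower than
 * the GPR path, or -1 if it can never catch up.
 */
static long break_even_bytes(struct copy_cost gpr, struct copy_cost simd)
{
	double extra = simd.fixed_ns - gpr.fixed_ns;      /* overhead to repay */
	double gain = gpr.ns_per_byte - simd.ns_per_byte; /* saving per byte */
	double n;
	long bytes;

	if (extra <= 0)
		return 0;  /* no extra overhead: never worse */
	if (gain <= 0)
		return -1; /* no per-byte saving: overhead is never repaid */

	n = extra / gain;
	bytes = (long)n;
	if ((double)bytes < n)
		bytes++;   /* round up to a whole byte */
	return bytes;
}
```

With an assumed 1000 ns of save/restore overhead and a per-byte cost that
halves from 0.5 ns to 0.25 ns, break_even_bytes() returns 4000. If the
per-byte cost is the same on both paths, as on a core whose L1 interface is
only 128 bits wide, it returns -1: the overhead is never recovered.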
