From: Oliver Swede <oli.swede@arm.com> To: Will Deacon <will@kernel.org>, Catalin Marinas <catalin.marinas@arm.com> Cc: Robin Murphy <robin.murphy@arm.com>, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org Subject: [PATCH v3 00/13] arm64: Optimise and update memcpy, user copy and string routines Date: Thu, 14 May 2020 14:32:14 +0000 [thread overview] Message-ID: <20200514143227.605-1-oli.swede@arm.com> (raw) Hi, I have been working on a fix for this branch. The initial version of this patchset imported new optimizations from Linaro's cortex-strings project, and converted the instructions to the macro equivalents in the usercopy template, in order to allow each of the helper usercopy functions to use expansions that insert exception table entries for exclusively the instructions that could potentially fault during a copy. The 'cortex-strings' repository has since been renamed to 'optimized-routines' (this is public on GitHub at https://github.com/ARM-software/optimized-routines), and has been updated with further optimizations of various library functions, and this includes a newer memcpy implementation. The address of page faults is exposed to the faulting instructions' corresponding fixup code, and in v2 a fixup routine to correspond to the previous copy algorithm was also implemented in a way that utilized the fault address to try and efficiently calculate the number of bytes that failed to copy. It returned the distance between the fault and the end of the buffer, however this fixup in [PATCH v2 2/14] had some issues due to the out-of-order nature of the copy algorithm, and was flagged by the LTP testcase of the preadv2() syscall (which makes multiple calls to copy_to_user). This was due to preadv2() reporting SUCCESS for an invalid destination address, NULL, where it expected EFAULT, because a nonzero return value was calculated (indicating that some bytes were copied) due to the fault not occurring at the start of the buffer. In this version I have imported the very latest optimized memcpy routine, and re-written the fixup to use multiple routines that encapsulate various properties of the algorithm (this is explained in more detail in patches 11/13, 12/13, 13/13). The aim is to return the exact number of bytes that haven't copied when a fault occurs in copy_{to, in, from}_user, and to enable the fixups to be modular so that they could be re-written without too much trouble in the future if the copy algorithm was to be updated again. Initial testing indicates that the backtracking performed in the fixup routines is accurate, and I am working on a separate patchset containing more concise selftests that indirectly call the usercopy functions via read()/write() - this will help to ease the verification of expected behaviour. I am going to post updated benchmark results, as the ~27% increase in speed measured by Sam with the previous 'cortex-strings' memcpy is no longer applicable due to the more recent replacement from 'optimized-routines', which should hopefully be even more efficient and improve this further. v1: https://lore.kernel.org/linux-arm-kernel/cover.1571073960.git.robin.murphy@arm.com/ v2: https://lore.kernel.org/linux-arm-kernel/cover.1571421836.git.robin.murphy@arm.com/ Changes since v2: * Adds Robin's separate patch that fixes a compilation issue with KProbes fixup [1] * Imports the most recent memcpy implementation by updating Sam's patch (and moves this patch to occur after the cortex-strings importing so that it's closer to the patches containing its corresponding fixups) * Uses the stack to preserve the initial parameters * Replaces the usercopy fixup routine in v2 with multiple longer fixups that each make use of the fault address to return the exact number of bytes that haven't yet copied. [1] https://lore.kernel.org/linux-arm-kernel/e70f7b9de7e601b9e4a6fedad8eaf64d304b1637.1571326276.git.robin.murphy@arm.com/ Many thanks, Oliver Oliver Swede (4): arm64: Store the arguments to copy_*_user on the stack arm64: Use additional memcpy macros and fixups arm64: Add fixup routines for usercopy load exceptions arm64: Add fixup routines for usercopy store exceptions Robin Murphy (2): arm64: Tidy up _asm_extable_faultaddr usage arm64: kprobes: Drop open-coded exception fixup Sam Tebbs (7): arm64: Allow passing fault address to fixup handlers arm64: Import latest optimization of memcpy arm64: Import latest version of Cortex Strings' memcmp arm64: Import latest version of Cortex Strings' memmove arm64: Import latest version of Cortex Strings' strcmp arm64: Import latest version of Cortex Strings' strlen arm64: Import latest version of Cortex Strings' strncmp arch/arm64/include/asm/alternative.h | 36 --- arch/arm64/include/asm/assembler.h | 13 + arch/arm64/include/asm/extable.h | 10 +- arch/arm64/kernel/probes/kprobes.c | 7 - arch/arm64/lib/copy_from_user.S | 272 +++++++++++++++++-- arch/arm64/lib/copy_in_user.S | 287 ++++++++++++++++++-- arch/arm64/lib/copy_template.S | 377 +++++++++++++++------------ arch/arm64/lib/copy_template_user.S | 50 ++++ arch/arm64/lib/copy_to_user.S | 273 +++++++++++++++++-- arch/arm64/lib/copy_user_fixup.S | 277 ++++++++++++++++++++ arch/arm64/lib/memcmp.S | 333 +++++++++-------------- arch/arm64/lib/memcpy.S | 128 +++++++-- arch/arm64/lib/memmove.S | 232 ++++++----------- arch/arm64/lib/strcmp.S | 272 ++++++++----------- arch/arm64/lib/strlen.S | 247 ++++++++++++------ arch/arm64/lib/strncmp.S | 363 ++++++++++++-------------- arch/arm64/mm/extable.c | 13 +- arch/arm64/mm/fault.c | 2 +- 18 files changed, 2073 insertions(+), 1119 deletions(-) create mode 100644 arch/arm64/lib/copy_template_user.S create mode 100644 arch/arm64/lib/copy_user_fixup.S -- 2.17.1
WARNING: multiple messages have this Message-ID (diff)
From: Oliver Swede <oli.swede@arm.com> To: Will Deacon <will@kernel.org>, Catalin Marinas <catalin.marinas@arm.com> Cc: Robin Murphy <robin.murphy@arm.com>, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org Subject: [PATCH v3 00/13] arm64: Optimise and update memcpy, user copy and string routines Date: Thu, 14 May 2020 14:32:14 +0000 [thread overview] Message-ID: <20200514143227.605-1-oli.swede@arm.com> (raw) Hi, I have been working on a fix for this branch. The initial version of this patchset imported new optimizations from Linaro's cortex-strings project, and converted the instructions to the macro equivalents in the usercopy template, in order to allow each of the helper usercopy functions to use expansions that insert exception table entries for exclusively the instructions that could potentially fault during a copy. The 'cortex-strings' repository has since been renamed to 'optimized-routines' (this is public on GitHub at https://github.com/ARM-software/optimized-routines), and has been updated with further optimizations of various library functions, and this includes a newer memcpy implementation. The address of page faults is exposed to the faulting instructions' corresponding fixup code, and in v2 a fixup routine to correspond to the previous copy algorithm was also implemented in a way that utilized the fault address to try and efficiently calculate the number of bytes that failed to copy. It returned the distance between the fault and the end of the buffer, however this fixup in [PATCH v2 2/14] had some issues due to the out-of-order nature of the copy algorithm, and was flagged by the LTP testcase of the preadv2() syscall (which makes multiple calls to copy_to_user). This was due to preadv2() reporting SUCCESS for an invalid destination address, NULL, where it expected EFAULT, because a nonzero return value was calculated (indicating that some bytes were copied) due to the fault not occurring at the start of the buffer. In this version I have imported the very latest optimized memcpy routine, and re-written the fixup to use multiple routines that encapsulate various properties of the algorithm (this is explained in more detail in patches 11/13, 12/13, 13/13). The aim is to return the exact number of bytes that haven't copied when a fault occurs in copy_{to, in, from}_user, and to enable the fixups to be modular so that they could be re-written without too much trouble in the future if the copy algorithm was to be updated again. Initial testing indicates that the backtracking performed in the fixup routines is accurate, and I am working on a separate patchset containing more concise selftests that indirectly call the usercopy functions via read()/write() - this will help to ease the verification of expected behaviour. I am going to post updated benchmark results, as the ~27% increase in speed measured by Sam with the previous 'cortex-strings' memcpy is no longer applicable due to the more recent replacement from 'optimized-routines', which should hopefully be even more efficient and improve this further. v1: https://lore.kernel.org/linux-arm-kernel/cover.1571073960.git.robin.murphy@arm.com/ v2: https://lore.kernel.org/linux-arm-kernel/cover.1571421836.git.robin.murphy@arm.com/ Changes since v2: * Adds Robin's separate patch that fixes a compilation issue with KProbes fixup [1] * Imports the most recent memcpy implementation by updating Sam's patch (and moves this patch to occur after the cortex-strings importing so that it's closer to the patches containing its corresponding fixups) * Uses the stack to preserve the initial parameters * Replaces the usercopy fixup routine in v2 with multiple longer fixups that each make use of the fault address to return the exact number of bytes that haven't yet copied. [1] https://lore.kernel.org/linux-arm-kernel/e70f7b9de7e601b9e4a6fedad8eaf64d304b1637.1571326276.git.robin.murphy@arm.com/ Many thanks, Oliver Oliver Swede (4): arm64: Store the arguments to copy_*_user on the stack arm64: Use additional memcpy macros and fixups arm64: Add fixup routines for usercopy load exceptions arm64: Add fixup routines for usercopy store exceptions Robin Murphy (2): arm64: Tidy up _asm_extable_faultaddr usage arm64: kprobes: Drop open-coded exception fixup Sam Tebbs (7): arm64: Allow passing fault address to fixup handlers arm64: Import latest optimization of memcpy arm64: Import latest version of Cortex Strings' memcmp arm64: Import latest version of Cortex Strings' memmove arm64: Import latest version of Cortex Strings' strcmp arm64: Import latest version of Cortex Strings' strlen arm64: Import latest version of Cortex Strings' strncmp arch/arm64/include/asm/alternative.h | 36 --- arch/arm64/include/asm/assembler.h | 13 + arch/arm64/include/asm/extable.h | 10 +- arch/arm64/kernel/probes/kprobes.c | 7 - arch/arm64/lib/copy_from_user.S | 272 +++++++++++++++++-- arch/arm64/lib/copy_in_user.S | 287 ++++++++++++++++++-- arch/arm64/lib/copy_template.S | 377 +++++++++++++++------------ arch/arm64/lib/copy_template_user.S | 50 ++++ arch/arm64/lib/copy_to_user.S | 273 +++++++++++++++++-- arch/arm64/lib/copy_user_fixup.S | 277 ++++++++++++++++++++ arch/arm64/lib/memcmp.S | 333 +++++++++-------------- arch/arm64/lib/memcpy.S | 128 +++++++-- arch/arm64/lib/memmove.S | 232 ++++++----------- arch/arm64/lib/strcmp.S | 272 ++++++++----------- arch/arm64/lib/strlen.S | 247 ++++++++++++------ arch/arm64/lib/strncmp.S | 363 ++++++++++++-------------- arch/arm64/mm/extable.c | 13 +- arch/arm64/mm/fault.c | 2 +- 18 files changed, 2073 insertions(+), 1119 deletions(-) create mode 100644 arch/arm64/lib/copy_template_user.S create mode 100644 arch/arm64/lib/copy_user_fixup.S -- 2.17.1 _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
next reply other threads:[~2020-05-14 14:32 UTC|newest] Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top 2020-05-14 14:32 Oliver Swede [this message] 2020-05-14 14:32 ` [PATCH v3 00/13] arm64: Optimise and update memcpy, user copy and string routines Oliver Swede 2020-05-14 14:32 ` [PATCH v3 01/13] arm64: Allow passing fault address to fixup handlers Oliver Swede 2020-05-14 14:32 ` Oliver Swede 2020-05-14 14:32 ` [PATCH v3 02/13] arm64: kprobes: Drop open-coded exception fixup Oliver Swede 2020-05-14 14:32 ` Oliver Swede 2020-05-14 14:32 ` [PATCH v3 03/13] arm64: Import latest version of Cortex Strings' memcmp Oliver Swede 2020-05-14 14:32 ` Oliver Swede 2020-05-14 14:32 ` [PATCH v3 04/13] arm64: Import latest version of Cortex Strings' memmove Oliver Swede 2020-05-14 14:32 ` Oliver Swede 2020-05-14 14:32 ` [PATCH v3 05/13] arm64: Import latest version of Cortex Strings' strcmp Oliver Swede 2020-05-14 14:32 ` Oliver Swede 2020-05-14 14:32 ` [PATCH v3 06/13] arm64: Import latest version of Cortex Strings' strlen Oliver Swede 2020-05-14 14:32 ` Oliver Swede 2020-05-14 14:32 ` [PATCH v3 07/13] arm64: Import latest version of Cortex Strings' strncmp Oliver Swede 2020-05-14 14:32 ` Oliver Swede 2020-05-14 14:32 ` [PATCH v3 08/13] arm64: Import latest optimization of memcpy Oliver Swede 2020-05-14 14:32 ` Oliver Swede 2020-05-14 14:32 ` [PATCH v3 09/13] arm64: Tidy up _asm_extable_faultaddr usage Oliver Swede 2020-05-14 14:32 ` Oliver Swede 2020-05-14 14:32 ` [PATCH v3 10/13] arm64: Store the arguments to copy_*_user on the stack Oliver Swede 2020-05-14 14:32 ` Oliver Swede 2020-05-14 14:32 ` [PATCH v3 11/13] arm64: Use additional memcpy macros and fixups Oliver Swede 2020-05-14 14:32 ` Oliver Swede 2020-05-14 14:32 ` [PATCH v3 12/13] arm64: Add fixup routines for usercopy load exceptions Oliver Swede 2020-05-14 14:32 ` Oliver Swede 2020-05-14 14:32 ` [PATCH v3 13/13] arm64: Add fixup routines for usercopy store exceptions Oliver Swede 2020-05-14 14:32 ` Oliver Swede
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20200514143227.605-1-oli.swede@arm.com \ --to=oli.swede@arm.com \ --cc=catalin.marinas@arm.com \ --cc=linux-arm-kernel@lists.infradead.org \ --cc=linux-kernel@vger.kernel.org \ --cc=robin.murphy@arm.com \ --cc=will@kernel.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.