Linux-ARM-Kernel Archive on lore.kernel.org
 help / color / Atom feed
From: Oliver Swede <oli.swede@arm.com>
To: Will Deacon <will@kernel.org>, Catalin Marinas <catalin.marinas@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>,
	linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org
Subject: [PATCH v4 12/14] arm64: Add fixup routines for usercopy load exceptions
Date: Tue, 30 Jun 2020 19:48:20 +0000
Message-ID: <20200630194822.1082-13-oli.swede@arm.com> (raw)
In-Reply-To: <20200630194822.1082-1-oli.swede@arm.com>

This adds the fixup routines for exceptions that occur on load
operations while copying, by providing the calling code with a more
accurate value for the number of bytes that failed to copy.

The three routines for load exceptions work together to analyse
the position of the fault relative to the start or the end of the
buffer, and backtrack from the optimized memcpy algorithm to
determine if some number of bytes has already been successfully
copied.

The new template uses out-of-order copying, and this fixup routine is
specific to the latest memcpy implementation. It is assumed there is
no requirement to follow through with the copying of data that may
reside in temporary registers on a fault, as this would greatly
increase the fixup's complexity.

Signed-off-by: Oliver Swede <oli.swede@arm.com>
---
 arch/arm64/lib/copy_user_fixup.S | 172 ++++++++++++++++++++++++++++++-
 1 file changed, 168 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/lib/copy_user_fixup.S b/arch/arm64/lib/copy_user_fixup.S
index f878c8831b14..ae94b492129d 100644
--- a/arch/arm64/lib/copy_user_fixup.S
+++ b/arch/arm64/lib/copy_user_fixup.S
@@ -3,19 +3,183 @@
 addr	.req	x15
 .section .fixup,"ax"
 .align	2
+
+/*
+ * These fixup routines assume no requirements to follow through
+ * with copying any intermediate data in registers.
+ *
+ * The fixup subroutine that is directed to depends on the
+ * characteristics of the target address in the faulting instruction:
+ * if it is an address specified relative to the start or end of the
+ * buffer, and (in the case of copy sizes >128B) if the target address
+ * is 16-byte aligned.
+ */
+
+/*
+ * The following three routines are directed to from faults on load
+ * instructions. For 9993, 9994 and 9995, in each case nothing has
+ * been copied if either:
+ *	a) the copy size is less than 128 bytes - the algorithm does
+ *	   not interleave load/store instructions and starts with the
+ *	   load instructions, so for these sizes nothing has been
+ *	   copied on a load fault and the full buffer width srcend-src
+ *	   can be returned, or
+ *	b) the copy size is greater than or equal to 128 bytes and the
+ *	   src and dst buffers overlap - this would result in a
+ *	   backwards copy, however the calling code expects the return
+ *	   value, the numbwe of bytes not copied, to be from an in-order
+ *	   perspective.
+ */
 9993:
+	/*
+	 * This is reached from load instructions that are specified
+	 * relative to the start of a user space memory buffer, and
+	 * provide target addresses that are 16-byte aligned.
+	 */
+
+	/* Retrieve original params from stack */
+	ldp	dst, src, [sp], #16	// dst: x3, src: x1
+	ldr	count, [sp], #16	// count: x2
+	add	srcend, src, count
+	add	dstend, dst, count
+
+	/*
+	 * If the fault occurred with a copy size larger than 128 bytes,
+	 * and if the buffers do not overlap (where it would start copying
+	 * from the end), it still has not copied anything at this point
+	 * as the fault occurred at the start of the algorithm's subroutine
+	 * for long forward copies, and no bytes have been stored at this
+	 * point.
+	 */
+	b	L(none_copied)
 9994:
+	/*
+	 * This is reached from load instructions that are specified
+	 * relative to the end of a user space memory buffer, and
+	 * where the store targets are not guaranteed to be 16-byte
+	 * aligned.
+	 */
+
+	/* Save the current (adjusted) dst for later restoration */
+	mov	tmp1, dst
+
+	/* Retrieve original params from stack */
+	ldp	dst, src, [sp], #16	// dst: x3, src: x1
+	ldr	count, [sp], #16	// count: x2
+	add	srcend, src, count
+	add	dstend, dst, count
+
+	/* Copy size <= 128 bytes */
+	cmp	count, 128
+	b.le	L(none_copied)
+
+	/*
+	 * Overlapping buffers:
+	 * (src <= dst && dst < srcend) || (dst <= src && src < dstend)
+	 */
+	cmp	src, dst
+	ccmp	dst, srcend, #0, le
+	b.lt	L(none_copied)
+	cmp	dst, src
+	ccmp	src, dstend, #0, le
+	b.lt	L(none_copied)
+
+	/*
+	 * In a long copy over 128B, in the case of a fault at an
+	 * address relative to the end of the buffer, this will
+	 * occur in copy64_from_end() and thus the last 64 bytes.
+	 *
+	 * The algorithm first copies 16 bytes, and then aligns dst
+	 * to 16 bytes for the rest of the copy.
+	 * At the end of each loop64(), dst is incremented by 64.
+	 *
+	 * This takes the minimum of (16, max((current_dst + 16 - original_dst), 16) 
+	 */
+
+	/* already copied max((tmp1 + 16 - dst), 16) */
+	add	tmp1, tmp1, 16
+	sub	tmp1, tmp1, dst
+	mov	x0, 16
+	cmp	x0, tmp1
+	csel	tmp1, x0, tmp1, ge // threshold it has copied up to
+	sub	x0, srcend, 64 // base of final 64 bytes
+	sub	x0, addr, x0 // fault offset in final 64 bytes
+	bic	x0, x0, 15 // discard bytes between target and fault addr)
+	add	x0, x0, tmp1 // copied up to here
+	sub	x0, count, x0
+	b	L(end_fixup)
+
 9995:
+	/*
+	 * This is reached from load instructions that are specified
+	 * relative to the start of a user space memory buffer, and
+	 * where the store targets are guaranteed to be 16-byte aligned.
+	 */
+	
+	/* Save the current (adjusted) dst for later restoration */
+	mov	tmp1, dst
+
+	/* Retrieve original params from stack */
+	ldp	dst, src, [sp], #16	// dst: x3, src: x1
+	ldr	count, [sp], #16	// count: x2
+	add	srcend, src, count
+	add	dstend, dst, count
+
+	/* Copy size <= 128 bytes */
+	cmp	count, 128
+	b.le	L(none_copied)
+
+	/*
+	 * Overlapping buffers:
+	 * (src <= dst && dst < srcend) || (dst <= src && src < dstend)
+	 */
+	cmp	src, dst
+	ccmp	dst, srcend, #0, le
+	b.lt	L(none_copied)
+	cmp	dst, src
+	ccmp	src, dstend, #0, le
+	b.lt	L(none_copied)
+
+	/*
+	 * These stores provide target addresses that are 16-byte
+	 * aligned, and this allows for a mapping:
+	 */
+	and	x0, dst, 15
+	sub	x0, src, x0 // aligned src: (src - (dst & 15))
+	sub	x0, addr, src // offset of fault relative to aligned src
+	bic	x0, x0, 15 // round down to base of faulting 16B chunk
+	mov	tmp1, 16
+	cmp	x0, tmp1
+	/* fault_offset < 32 -> count */
+	b.le	L(none_copied)
+	mov	tmp1, 64
+	cmp	x0, tmp1
+	mov	tmp1, x0
+	sub	x0, count, 16
+	/* 32 <= fault_offset < 80 -> count-16 */
+	b.le	L(end_fixup)
+	mov	x0, tmp1
+	sub	x0, x0, 48 /* faulted in a loop64 subroutine */
+	and	tmp1, dst, 15
+	sub	x0, x0, tmp1
+	sub	x0, count, x0
+	/*
+	 * fault_offset >= 80 ->
+	 * count-((fault_offset&~15)-48)-(dst & 15))
+	 */
+	b	L(end_fixup)
+	
 9996:
 9997:
 9998:
-	/* Retrieve useful information & free the stack area */
-	ldr	count, [sp, #16]	// x2
+	/* Retrieve info from stack */
+	ldr	count, [sp], #16	// count: x2
 	add	sp, sp, 32
+L(none_copied):
 	/*
-	 * Return the initial count as the (under-estimated) number
+	 * Return the initial count as the number
 	 * of bytes that failed to copy
 	 */
 	mov	x0, count
+L(end_fixup):
 	ret
-
-- 
2.17.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

  parent reply index

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-30 19:48 [PATCH v4 00/14] arm64: Optimise and update memcpy, user copy and string routines Oliver Swede
2020-06-30 19:48 ` [PATCH v4 01/14] arm64: Allow passing fault address to fixup handlers Oliver Swede
2020-06-30 19:48 ` [PATCH v4 02/14] arm64: kprobes: Drop open-coded exception fixup Oliver Swede
2020-06-30 19:48 ` [PATCH v4 03/14] arm64: Import latest version of Cortex Strings' memcmp Oliver Swede
2020-06-30 19:48 ` [PATCH v4 04/14] arm64: Import latest version of Cortex Strings' memmove Oliver Swede
2020-06-30 19:48 ` [PATCH v4 05/14] arm64: Import latest version of Cortex Strings' strcmp Oliver Swede
2020-06-30 19:48 ` [PATCH v4 06/14] arm64: Import latest version of Cortex Strings' strlen Oliver Swede
2020-06-30 19:48 ` [PATCH v4 07/14] arm64: Import latest version of Cortex Strings' strncmp Oliver Swede
2020-06-30 19:48 ` [PATCH v4 08/14] arm64: Import latest optimization of memcpy Oliver Swede
2020-06-30 19:48 ` [PATCH v4 09/14] arm64: Tidy up _asm_extable_faultaddr usage Oliver Swede
2020-06-30 19:48 ` [PATCH v4 10/14] arm64: Store the arguments to copy_*_user on the stack Oliver Swede
2020-06-30 19:48 ` [PATCH v4 11/14] arm64: Use additional memcpy macros and fixups Oliver Swede
2020-06-30 19:48 ` Oliver Swede [this message]
2020-06-30 19:48 ` [PATCH v4 13/14] arm64: Add fixup routines for usercopy store exceptions Oliver Swede
2020-06-30 19:48 ` [PATCH v4 14/14] arm64: Improve accuracy of fixup for UAO cases Oliver Swede
2020-07-01  8:12 ` [PATCH v4 00/14] arm64: Optimise and update memcpy, user copy and string routines Oli Swede

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200630194822.1082-13-oli.swede@arm.com \
    --to=oli.swede@arm.com \
    --cc=catalin.marinas@arm.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=robin.murphy@arm.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-ARM-Kernel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-arm-kernel/0 linux-arm-kernel/git/0.git
	git clone --mirror https://lore.kernel.org/linux-arm-kernel/1 linux-arm-kernel/git/1.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-arm-kernel linux-arm-kernel/ https://lore.kernel.org/linux-arm-kernel \
		linux-arm-kernel@lists.infradead.org
	public-inbox-index linux-arm-kernel

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.infradead.lists.linux-arm-kernel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git