All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/6] arm64: gcc asm flag outputs
@ 2020-03-11 18:04 Richard Henderson
  2020-03-11 18:04 ` [PATCH 1/6] arm64: Add asm/ccset.h header Richard Henderson
                   ` (5 more replies)
  0 siblings, 6 replies; 18+ messages in thread
From: Richard Henderson @ 2020-03-11 18:04 UTC (permalink / raw)
  To: linux-arm-kernel

With gcc 10, arm64 includes support for flag outputs, much like
x86 has done for years.

The most effective use of this within the kernel is in uaccess,
where we can branch directly on the Carry flag output, e.g.

	adds    x3, x2, #0x2
	csel    x1, xzr, x1, hi
	csinv   x3, x3, xzr, cc
	sbcs    xzr, x3, x1
	b.ls    ffff800010084b5c <aarch32_break_handler+0xec>

Changing __arm64_rndr has very few uses, and so the effect is
very minor, but is also changed for consistency.

Finally, a mostly unrelated fix to uaccess.h, which I noticed
while poking around the header and looking at the assembly.
Has no effect on generic kernels that include ARMv8.3 support,
but eliminates a branch if one starts to play with the knobs.


r~


Richard Henderson (6):
  arm64: Add asm/ccset.h header
  arm64: uaccess: Use named asm operands for __in_range
  arm64: uaccess: Untie the input address in __range_ok
  arm64: uaccess: Use asm/ccset.h macros in __range_ok
  arm64: archrandom: Use asm/ccset.h macros in __arm64_rndr
  arm64: Hoist CONFIG option out of ALTERNATIVE in uaccess.h

 arch/arm64/include/asm/archrandom.h |  7 +++--
 arch/arm64/include/asm/ccset.h      | 19 ++++++++++++
 arch/arm64/include/asm/uaccess.h    | 46 ++++++++++++++++-------------
 3 files changed, 48 insertions(+), 24 deletions(-)
 create mode 100644 arch/arm64/include/asm/ccset.h

-- 
2.20.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* [PATCH 1/6] arm64: Add asm/ccset.h header
  2020-03-11 18:04 [PATCH 0/6] arm64: gcc asm flag outputs Richard Henderson
@ 2020-03-11 18:04 ` Richard Henderson
  2020-03-13 10:54   ` Mark Rutland
  2020-03-11 18:04 ` [PATCH 2/6] arm64: uaccess: Use named asm operands for __in_range Richard Henderson
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: Richard Henderson @ 2020-03-11 18:04 UTC (permalink / raw)
  To: linux-arm-kernel

With gcc 10, arm64 includes support for flag outputs, much like
x86 has done for years.  Mirror the macros that x86 places in
asm/asm.h, with the necessary addition of CC_CLOBBER.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 arch/arm64/include/asm/ccset.h | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)
 create mode 100644 arch/arm64/include/asm/ccset.h

diff --git a/arch/arm64/include/asm/ccset.h b/arch/arm64/include/asm/ccset.h
new file mode 100644
index 000000000000..e733d383f515
--- /dev/null
+++ b/arch/arm64/include/asm/ccset.h
@@ -0,0 +1,19 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_CCSET_H
+#define __ASM_CCSET_H
+
+/*
+ * Macros to generate condition code outputs from inline assembly.
+ * The output operand must be integral but type "bool" preferred.
+ */
+#ifdef __GCC_ASM_FLAG_OUTPUTS__
+# define CC_SET(c) "\n\t/* output condition code " #c "*/\n"
+# define CC_OUT(c) "=@cc" #c
+# define CC_CLOBBER
+#else
+# define CC_SET(c) "\n\tcset %[_cc_" #c "], " #c "\n"
+# define CC_OUT(c) [_cc_ ## c] "=r"
+# define CC_CLOBBER "cc"
+#endif
+
+#endif /* __ASM_CCSET_H */
-- 
2.20.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 2/6] arm64: uaccess: Use named asm operands for __in_range
  2020-03-11 18:04 [PATCH 0/6] arm64: gcc asm flag outputs Richard Henderson
  2020-03-11 18:04 ` [PATCH 1/6] arm64: Add asm/ccset.h header Richard Henderson
@ 2020-03-11 18:04 ` Richard Henderson
  2020-03-11 19:08   ` Robin Murphy
  2020-03-11 18:04 ` [PATCH 3/6] arm64: uaccess: Untie the input address in __range_ok Richard Henderson
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: Richard Henderson @ 2020-03-11 18:04 UTC (permalink / raw)
  To: linux-arm-kernel

With zero change of behavior, use %[] syntax for the asm
operands instead of numbered operands.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 arch/arm64/include/asm/uaccess.h | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
index 32fc8061aa76..ceb1d79eab09 100644
--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -75,19 +75,21 @@ static inline unsigned long __range_ok(const void __user *addr, unsigned long si
 	asm volatile(
 	// A + B <= C + 1 for all A,B,C, in four easy steps:
 	// 1: X = A + B; X' = X % 2^64
-	"	adds	%0, %3, %2\n"
+	"	adds	%[addr], %[addr], %[size]\n"
 	// 2: Set C = 0 if X > 2^64, to guarantee X' > C in step 4
-	"	csel	%1, xzr, %1, hi\n"
+	"	csel	%[limit], xzr, %[limit], hi\n"
 	// 3: Set X' = ~0 if X >= 2^64. For X == 2^64, this decrements X'
 	//    to compensate for the carry flag being set in step 4. For
 	//    X > 2^64, X' merely has to remain nonzero, which it does.
-	"	csinv	%0, %0, xzr, cc\n"
+	"	csinv	%[addr], %[addr], xzr, cc\n"
 	// 4: For X < 2^64, this gives us X' - C - 1 <= 0, where the -1
 	//    comes from the carry in being clear. Otherwise, we are
 	//    testing X' - C == 0, subject to the previous adjustments.
-	"	sbcs	xzr, %0, %1\n"
-	"	cset	%0, ls\n"
-	: "=&r" (ret), "+r" (limit) : "Ir" (size), "0" (addr) : "cc");
+	"	sbcs	xzr, %[addr], %[limit]\n"
+	"       cset    %[addr], ls\n"
+	: [addr] "=&r" (ret), [limit] "+r" (limit)
+	: [size] "Ir" (size), "0" (addr)
+	: "cc");
 
 	return ret;
 }
-- 
2.20.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 3/6] arm64: uaccess: Untie the input address in __range_ok
  2020-03-11 18:04 [PATCH 0/6] arm64: gcc asm flag outputs Richard Henderson
  2020-03-11 18:04 ` [PATCH 1/6] arm64: Add asm/ccset.h header Richard Henderson
  2020-03-11 18:04 ` [PATCH 2/6] arm64: uaccess: Use named asm operands for __in_range Richard Henderson
@ 2020-03-11 18:04 ` Richard Henderson
  2020-03-11 19:08   ` Robin Murphy
  2020-03-11 18:04 ` [PATCH 4/6] arm64: uaccess: Use asm/ccset.h macros " Richard Henderson
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 18+ messages in thread
From: Richard Henderson @ 2020-03-11 18:04 UTC (permalink / raw)
  To: linux-arm-kernel

There's no reason for the input address to match the output
register.  Give the register allocator a bit more freedom.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 arch/arm64/include/asm/uaccess.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
index ceb1d79eab09..fe3dd70e901e 100644
--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -75,7 +75,7 @@ static inline unsigned long __range_ok(const void __user *addr, unsigned long si
 	asm volatile(
 	// A + B <= C + 1 for all A,B,C, in four easy steps:
 	// 1: X = A + B; X' = X % 2^64
-	"	adds	%[addr], %[addr], %[size]\n"
+	"	adds	%[addr], %[addr_in], %[size]\n"
 	// 2: Set C = 0 if X > 2^64, to guarantee X' > C in step 4
 	"	csel	%[limit], xzr, %[limit], hi\n"
 	// 3: Set X' = ~0 if X >= 2^64. For X == 2^64, this decrements X'
@@ -88,7 +88,7 @@ static inline unsigned long __range_ok(const void __user *addr, unsigned long si
 	"	sbcs	xzr, %[addr], %[limit]\n"
 	"       cset    %[addr], ls\n"
 	: [addr] "=&r" (ret), [limit] "+r" (limit)
-	: [size] "Ir" (size), "0" (addr)
+	: [size] "Ir" (size), [addr_in] "r" (addr)
 	: "cc");
 
 	return ret;
-- 
2.20.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 4/6] arm64: uaccess: Use asm/ccset.h macros in __range_ok
  2020-03-11 18:04 [PATCH 0/6] arm64: gcc asm flag outputs Richard Henderson
                   ` (2 preceding siblings ...)
  2020-03-11 18:04 ` [PATCH 3/6] arm64: uaccess: Untie the input address in __range_ok Richard Henderson
@ 2020-03-11 18:04 ` Richard Henderson
  2020-03-12 11:48   ` Robin Murphy
  2020-03-13 11:04   ` Mark Rutland
  2020-03-11 18:04 ` [PATCH 5/6] arm64: archrandom: Use asm/ccset.h macros in __arm64_rndr Richard Henderson
  2020-03-11 18:04 ` [PATCH 6/6] arm64: Hoist CONFIG option out of ALTERNATIVE in uaccess.h Richard Henderson
  5 siblings, 2 replies; 18+ messages in thread
From: Richard Henderson @ 2020-03-11 18:04 UTC (permalink / raw)
  To: linux-arm-kernel

Uses of __range_ok almost always feed a branch.
This allows the compiler to use flags directly.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 arch/arm64/include/asm/uaccess.h | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
index fe3dd70e901e..ca1acd7b95c3 100644
--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -22,6 +22,7 @@
 #include <asm/ptrace.h>
 #include <asm/memory.h>
 #include <asm/extable.h>
+#include <asm/ccset.h>
 
 #define get_fs()	(current_thread_info()->addr_limit)
 
@@ -86,10 +87,10 @@ static inline unsigned long __range_ok(const void __user *addr, unsigned long si
 	//    comes from the carry in being clear. Otherwise, we are
 	//    testing X' - C == 0, subject to the previous adjustments.
 	"	sbcs	xzr, %[addr], %[limit]\n"
-	"       cset    %[addr], ls\n"
-	: [addr] "=&r" (ret), [limit] "+r" (limit)
+		CC_SET(ls)
+	: [addr] "=&r" (addr), [limit] "+r" (limit), CC_OUT(ls) (ret)
 	: [size] "Ir" (size), [addr_in] "r" (addr)
-	: "cc");
+	: CC_CLOBBER);
 
 	return ret;
 }
-- 
2.20.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 5/6] arm64: archrandom: Use asm/ccset.h macros in __arm64_rndr
  2020-03-11 18:04 [PATCH 0/6] arm64: gcc asm flag outputs Richard Henderson
                   ` (3 preceding siblings ...)
  2020-03-11 18:04 ` [PATCH 4/6] arm64: uaccess: Use asm/ccset.h macros " Richard Henderson
@ 2020-03-11 18:04 ` Richard Henderson
  2020-03-11 18:04 ` [PATCH 6/6] arm64: Hoist CONFIG option out of ALTERNATIVE in uaccess.h Richard Henderson
  5 siblings, 0 replies; 18+ messages in thread
From: Richard Henderson @ 2020-03-11 18:04 UTC (permalink / raw)
  To: linux-arm-kernel

Uses of __arm64_rndr always (indirectly) feed a branch.
This allows the compiler to use flags directly.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 arch/arm64/include/asm/archrandom.h | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/archrandom.h b/arch/arm64/include/asm/archrandom.h
index 3fe02da70004..f65df47283a6 100644
--- a/arch/arm64/include/asm/archrandom.h
+++ b/arch/arm64/include/asm/archrandom.h
@@ -6,6 +6,7 @@
 
 #include <linux/random.h>
 #include <asm/cpufeature.h>
+#include <asm/ccset.h>
 
 static inline bool __arm64_rndr(unsigned long *v)
 {
@@ -17,10 +18,10 @@ static inline bool __arm64_rndr(unsigned long *v)
 	 */
 	asm volatile(
 		__mrs_s("%0", SYS_RNDR_EL0) "\n"
-	"	cset %w1, ne\n"
-	: "=r" (*v), "=r" (ok)
+		CC_SET(ne)
+	: "=r" (*v), CC_OUT(ne) (ok)
 	:
-	: "cc");
+	: CC_CLOBBER);
 
 	return ok;
 }
-- 
2.20.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* [PATCH 6/6] arm64: Hoist CONFIG option out of ALTERNATIVE in uaccess.h
  2020-03-11 18:04 [PATCH 0/6] arm64: gcc asm flag outputs Richard Henderson
                   ` (4 preceding siblings ...)
  2020-03-11 18:04 ` [PATCH 5/6] arm64: archrandom: Use asm/ccset.h macros in __arm64_rndr Richard Henderson
@ 2020-03-11 18:04 ` Richard Henderson
  2020-03-13 10:46   ` Mark Rutland
  5 siblings, 1 reply; 18+ messages in thread
From: Richard Henderson @ 2020-03-11 18:04 UTC (permalink / raw)
  To: linux-arm-kernel; +Cc: Richard Henderson

From: Richard Henderson <rth@twiddle.net>

The placement of the CONFIG check, within the asm, is less than
ideal within uaccess.h.  When we have

	if (cond)
		asm("something")

and "something" turns out to be empty, the if cannot be removed
by the compiler.

Signed-off-by: Richard Henderson <rth@twiddle.net>
---
 arch/arm64/include/asm/uaccess.h | 31 ++++++++++++++++---------------
 1 file changed, 16 insertions(+), 15 deletions(-)

diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
index ca1acd7b95c3..90be003101f4 100644
--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -43,11 +43,14 @@ static inline void set_fs(mm_segment_t fs)
 	 * Enable/disable UAO so that copy_to_user() etc can access
 	 * kernel memory with the unprivileged instructions.
 	 */
-	if (IS_ENABLED(CONFIG_ARM64_UAO) && fs == KERNEL_DS)
-		asm(ALTERNATIVE("nop", SET_PSTATE_UAO(1), ARM64_HAS_UAO));
-	else
-		asm(ALTERNATIVE("nop", SET_PSTATE_UAO(0), ARM64_HAS_UAO,
-				CONFIG_ARM64_UAO));
+	if (IS_ENABLED(CONFIG_ARM64_UAO)) {
+		if (fs == KERNEL_DS)
+			asm(ALTERNATIVE("nop", SET_PSTATE_UAO(1),
+					ARM64_HAS_UAO));
+		else
+			asm(ALTERNATIVE("nop", SET_PSTATE_UAO(0),
+					ARM64_HAS_UAO));
+	}
 }
 
 #define segment_eq(a, b)	((a) == (b))
@@ -178,28 +181,26 @@ static inline bool uaccess_ttbr0_enable(void)
 
 static inline void __uaccess_disable_hw_pan(void)
 {
-	asm(ALTERNATIVE("nop", SET_PSTATE_PAN(0), ARM64_HAS_PAN,
-			CONFIG_ARM64_PAN));
+	if (IS_ENABLED(CONFIG_ARM64_PAN))
+		asm(ALTERNATIVE("nop", SET_PSTATE_PAN(0), ARM64_HAS_PAN));
 }
 
 static inline void __uaccess_enable_hw_pan(void)
 {
-	asm(ALTERNATIVE("nop", SET_PSTATE_PAN(1), ARM64_HAS_PAN,
-			CONFIG_ARM64_PAN));
+	if (IS_ENABLED(CONFIG_ARM64_PAN))
+		asm(ALTERNATIVE("nop", SET_PSTATE_PAN(1), ARM64_HAS_PAN));
 }
 
 #define __uaccess_disable(alt)						\
 do {									\
-	if (!uaccess_ttbr0_disable())					\
-		asm(ALTERNATIVE("nop", SET_PSTATE_PAN(1), alt,		\
-				CONFIG_ARM64_PAN));			\
+	if (IS_ENABLED(CONFIG_ARM64_PAN) && !uaccess_ttbr0_disable())	\
+		asm(ALTERNATIVE("nop", SET_PSTATE_PAN(1), alt));	\
 } while (0)
 
 #define __uaccess_enable(alt)						\
 do {									\
-	if (!uaccess_ttbr0_enable())					\
-		asm(ALTERNATIVE("nop", SET_PSTATE_PAN(0), alt,		\
-				CONFIG_ARM64_PAN));			\
+	if (IS_ENABLED(CONFIG_ARM64_PAN) && !uaccess_ttbr0_enable())	\
+		asm(ALTERNATIVE("nop", SET_PSTATE_PAN(0), alt));	\
 } while (0)
 
 static inline void uaccess_disable(void)
-- 
2.20.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/6] arm64: uaccess: Use named asm operands for __in_range
  2020-03-11 18:04 ` [PATCH 2/6] arm64: uaccess: Use named asm operands for __in_range Richard Henderson
@ 2020-03-11 19:08   ` Robin Murphy
  2020-03-11 21:48     ` Richard Henderson
  0 siblings, 1 reply; 18+ messages in thread
From: Robin Murphy @ 2020-03-11 19:08 UTC (permalink / raw)
  To: Richard Henderson, linux-arm-kernel

On 11/03/2020 6:04 pm, Richard Henderson wrote:
> With zero change of behavior, use %[] syntax for the asm
> operands instead of numbered operands.

For any particular reason? It's very deliberate that a mere 4 
instructions have over twice as many lines of comments here, and IMO 
making the code more verbose only serves to distract from the 
explanation of what's actually happening. In particular, the value 
represented by %0 (the conceptual X') never corresponds to the variable 
"addr", so naming it "addr" provides the opposite of clarity.

Robin.

> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   arch/arm64/include/asm/uaccess.h | 14 ++++++++------
>   1 file changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
> index 32fc8061aa76..ceb1d79eab09 100644
> --- a/arch/arm64/include/asm/uaccess.h
> +++ b/arch/arm64/include/asm/uaccess.h
> @@ -75,19 +75,21 @@ static inline unsigned long __range_ok(const void __user *addr, unsigned long si
>   	asm volatile(
>   	// A + B <= C + 1 for all A,B,C, in four easy steps:
>   	// 1: X = A + B; X' = X % 2^64
> -	"	adds	%0, %3, %2\n"
> +	"	adds	%[addr], %[addr], %[size]\n"
>   	// 2: Set C = 0 if X > 2^64, to guarantee X' > C in step 4
> -	"	csel	%1, xzr, %1, hi\n"
> +	"	csel	%[limit], xzr, %[limit], hi\n"
>   	// 3: Set X' = ~0 if X >= 2^64. For X == 2^64, this decrements X'
>   	//    to compensate for the carry flag being set in step 4. For
>   	//    X > 2^64, X' merely has to remain nonzero, which it does.
> -	"	csinv	%0, %0, xzr, cc\n"
> +	"	csinv	%[addr], %[addr], xzr, cc\n"
>   	// 4: For X < 2^64, this gives us X' - C - 1 <= 0, where the -1
>   	//    comes from the carry in being clear. Otherwise, we are
>   	//    testing X' - C == 0, subject to the previous adjustments.
> -	"	sbcs	xzr, %0, %1\n"
> -	"	cset	%0, ls\n"
> -	: "=&r" (ret), "+r" (limit) : "Ir" (size), "0" (addr) : "cc");
> +	"	sbcs	xzr, %[addr], %[limit]\n"
> +	"       cset    %[addr], ls\n"
> +	: [addr] "=&r" (ret), [limit] "+r" (limit)
> +	: [size] "Ir" (size), "0" (addr)
> +	: "cc");
>   
>   	return ret;
>   }
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 3/6] arm64: uaccess: Untie the input address in __range_ok
  2020-03-11 18:04 ` [PATCH 3/6] arm64: uaccess: Untie the input address in __range_ok Richard Henderson
@ 2020-03-11 19:08   ` Robin Murphy
  0 siblings, 0 replies; 18+ messages in thread
From: Robin Murphy @ 2020-03-11 19:08 UTC (permalink / raw)
  To: Richard Henderson, linux-arm-kernel

On 11/03/2020 6:04 pm, Richard Henderson wrote:
> There's no reason for the input address to match the output
> register.  Give the register allocator a bit more freedom.

See commit 9085b34d0e83 - as originally written they *were* the same 
operand, and this was just some impedance-matching for types. However, I 
suppose this might save a mov to initialise "ret" in the 
!CONFIG_ARM64_TAGGED_ADDR_ABI case where the original value of "addr" is 
still live for subsequent use, so it's probably reasonable. On the other 
hand, the naming only reinforces my previous complaint - now we have 
"addr_in" corresponding to "addr", and "addr" bearing no relation to 
either of those :(

Robin.

> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   arch/arm64/include/asm/uaccess.h | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
> index ceb1d79eab09..fe3dd70e901e 100644
> --- a/arch/arm64/include/asm/uaccess.h
> +++ b/arch/arm64/include/asm/uaccess.h
> @@ -75,7 +75,7 @@ static inline unsigned long __range_ok(const void __user *addr, unsigned long si
>   	asm volatile(
>   	// A + B <= C + 1 for all A,B,C, in four easy steps:
>   	// 1: X = A + B; X' = X % 2^64
> -	"	adds	%[addr], %[addr], %[size]\n"
> +	"	adds	%[addr], %[addr_in], %[size]\n"
>   	// 2: Set C = 0 if X > 2^64, to guarantee X' > C in step 4
>   	"	csel	%[limit], xzr, %[limit], hi\n"
>   	// 3: Set X' = ~0 if X >= 2^64. For X == 2^64, this decrements X'
> @@ -88,7 +88,7 @@ static inline unsigned long __range_ok(const void __user *addr, unsigned long si
>   	"	sbcs	xzr, %[addr], %[limit]\n"
>   	"       cset    %[addr], ls\n"
>   	: [addr] "=&r" (ret), [limit] "+r" (limit)
> -	: [size] "Ir" (size), "0" (addr)
> +	: [size] "Ir" (size), [addr_in] "r" (addr)
>   	: "cc");
>   
>   	return ret;
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/6] arm64: uaccess: Use named asm operands for __in_range
  2020-03-11 19:08   ` Robin Murphy
@ 2020-03-11 21:48     ` Richard Henderson
  2020-03-13 16:14       ` Robin Murphy
  0 siblings, 1 reply; 18+ messages in thread
From: Richard Henderson @ 2020-03-11 21:48 UTC (permalink / raw)
  To: Robin Murphy, linux-arm-kernel

On 3/11/20 12:08 PM, Robin Murphy wrote:
> On 11/03/2020 6:04 pm, Richard Henderson wrote:
>> With zero change of behavior, use %[] syntax for the asm
>> operands instead of numbered operands.
> 
> For any particular reason?

When we get to the third patch and add CC_SET(le), all of the operand numbers
will change, and I found that more confusing than not.

> In
> particular, the value represented by %0 (the conceptual X') never corresponds
> to the variable "addr", so naming it "addr" provides the opposite of clarity.

Would you simply prefer a different name for the operand?


r~


>> diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
>> index 32fc8061aa76..ceb1d79eab09 100644
>> --- a/arch/arm64/include/asm/uaccess.h
>> +++ b/arch/arm64/include/asm/uaccess.h
>> @@ -75,19 +75,21 @@ static inline unsigned long __range_ok(const void __user
>> *addr, unsigned long si
>>       asm volatile(
>>       // A + B <= C + 1 for all A,B,C, in four easy steps:
>>       // 1: X = A + B; X' = X % 2^64
>> -    "    adds    %0, %3, %2\n"
>> +    "    adds    %[addr], %[addr], %[size]\n"
>>       // 2: Set C = 0 if X > 2^64, to guarantee X' > C in step 4
>> -    "    csel    %1, xzr, %1, hi\n"
>> +    "    csel    %[limit], xzr, %[limit], hi\n"
>>       // 3: Set X' = ~0 if X >= 2^64. For X == 2^64, this decrements X'
>>       //    to compensate for the carry flag being set in step 4. For
>>       //    X > 2^64, X' merely has to remain nonzero, which it does.
>> -    "    csinv    %0, %0, xzr, cc\n"
>> +    "    csinv    %[addr], %[addr], xzr, cc\n"
>>       // 4: For X < 2^64, this gives us X' - C - 1 <= 0, where the -1
>>       //    comes from the carry in being clear. Otherwise, we are
>>       //    testing X' - C == 0, subject to the previous adjustments.
>> -    "    sbcs    xzr, %0, %1\n"
>> -    "    cset    %0, ls\n"
>> -    : "=&r" (ret), "+r" (limit) : "Ir" (size), "0" (addr) : "cc");
>> +    "    sbcs    xzr, %[addr], %[limit]\n"
>> +    "       cset    %[addr], ls\n"
>> +    : [addr] "=&r" (ret), [limit] "+r" (limit)
>> +    : [size] "Ir" (size), "0" (addr)
>> +    : "cc");
>>         return ret;
>>   }

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/6] arm64: uaccess: Use asm/ccset.h macros in __range_ok
  2020-03-11 18:04 ` [PATCH 4/6] arm64: uaccess: Use asm/ccset.h macros " Richard Henderson
@ 2020-03-12 11:48   ` Robin Murphy
  2020-03-13 11:04   ` Mark Rutland
  1 sibling, 0 replies; 18+ messages in thread
From: Robin Murphy @ 2020-03-12 11:48 UTC (permalink / raw)
  To: Richard Henderson, linux-arm-kernel

On 2020-03-11 6:04 pm, Richard Henderson wrote:
> Uses of __range_ok almost always feed a branch.
> This allows the compiler to use flags directly.
> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>   arch/arm64/include/asm/uaccess.h | 7 ++++---
>   1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
> index fe3dd70e901e..ca1acd7b95c3 100644
> --- a/arch/arm64/include/asm/uaccess.h
> +++ b/arch/arm64/include/asm/uaccess.h
> @@ -22,6 +22,7 @@
>   #include <asm/ptrace.h>
>   #include <asm/memory.h>
>   #include <asm/extable.h>
> +#include <asm/ccset.h>
>   
>   #define get_fs()	(current_thread_info()->addr_limit)
>   
> @@ -86,10 +87,10 @@ static inline unsigned long __range_ok(const void __user *addr, unsigned long si
>   	//    comes from the carry in being clear. Otherwise, we are
>   	//    testing X' - C == 0, subject to the previous adjustments.
>   	"	sbcs	xzr, %[addr], %[limit]\n"
> -	"       cset    %[addr], ls\n"
> -	: [addr] "=&r" (ret), [limit] "+r" (limit)
> +		CC_SET(ls)
> +	: [addr] "=&r" (addr), [limit] "+r" (limit), CC_OUT(ls) (ret)

...and we've immediately undone any benefit of the previous patch by 
effectively recoupling %0 with addr again :/

I don't entirely follow why, not least because this CC_SET/CC_OUT 
business is virtually unreadable. At the very least the macro should at 
least take an operand identifier as an explicit argument rather than 
secretly generating one, so that we're not all scratching our heads 
wondering how it can possibly work at all. Furthermore, if the 
associated C variable is a 32-bit or smaller type, then won't it provoke 
warnings from Clang due to the operand lacking the "w" modifier?

Robin.

>   	: [size] "Ir" (size), [addr_in] "r" (addr)
> -	: "cc");
> +	: CC_CLOBBER);
>   
>   	return ret;
>   }
> 

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 6/6] arm64: Hoist CONFIG option out of ALTERNATIVE in uaccess.h
  2020-03-11 18:04 ` [PATCH 6/6] arm64: Hoist CONFIG option out of ALTERNATIVE in uaccess.h Richard Henderson
@ 2020-03-13 10:46   ` Mark Rutland
  0 siblings, 0 replies; 18+ messages in thread
From: Mark Rutland @ 2020-03-13 10:46 UTC (permalink / raw)
  To: Richard Henderson
  Cc: maz, will, catalin.marinas, linux-arm-kernel, Richard Henderson

Hi Richard,

On Wed, Mar 11, 2020 at 11:04:16AM -0700, Richard Henderson wrote:
> From: Richard Henderson <rth@twiddle.net>
> 
> The placement of the CONFIG check, within the asm, is less than
> ideal within uaccess.h.  When we have
> 
> 	if (cond)
> 		asm("something")
> 
> and "something" turns out to be empty, the if cannot be removed
> by the compiler.

Given the config argument to ALTERNATIVE() is unfortunate for codegen,
and IMO hinder clarity, I think it'd be worth removing that completely.

Regardless of that, and regardless of the rest of the series, this patch
looks good to me, so FWIW:

Acked-by: Mark Rutland <mark.rutland@arm.com>

Mark.

> 
> Signed-off-by: Richard Henderson <rth@twiddle.net>
> ---
>  arch/arm64/include/asm/uaccess.h | 31 ++++++++++++++++---------------
>  1 file changed, 16 insertions(+), 15 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
> index ca1acd7b95c3..90be003101f4 100644
> --- a/arch/arm64/include/asm/uaccess.h
> +++ b/arch/arm64/include/asm/uaccess.h
> @@ -43,11 +43,14 @@ static inline void set_fs(mm_segment_t fs)
>  	 * Enable/disable UAO so that copy_to_user() etc can access
>  	 * kernel memory with the unprivileged instructions.
>  	 */
> -	if (IS_ENABLED(CONFIG_ARM64_UAO) && fs == KERNEL_DS)
> -		asm(ALTERNATIVE("nop", SET_PSTATE_UAO(1), ARM64_HAS_UAO));
> -	else
> -		asm(ALTERNATIVE("nop", SET_PSTATE_UAO(0), ARM64_HAS_UAO,
> -				CONFIG_ARM64_UAO));
> +	if (IS_ENABLED(CONFIG_ARM64_UAO)) {
> +		if (fs == KERNEL_DS)
> +			asm(ALTERNATIVE("nop", SET_PSTATE_UAO(1),
> +					ARM64_HAS_UAO));
> +		else
> +			asm(ALTERNATIVE("nop", SET_PSTATE_UAO(0),
> +					ARM64_HAS_UAO));
> +	}
>  }
>  
>  #define segment_eq(a, b)	((a) == (b))
> @@ -178,28 +181,26 @@ static inline bool uaccess_ttbr0_enable(void)
>  
>  static inline void __uaccess_disable_hw_pan(void)
>  {
> -	asm(ALTERNATIVE("nop", SET_PSTATE_PAN(0), ARM64_HAS_PAN,
> -			CONFIG_ARM64_PAN));
> +	if (IS_ENABLED(CONFIG_ARM64_PAN))
> +		asm(ALTERNATIVE("nop", SET_PSTATE_PAN(0), ARM64_HAS_PAN));
>  }
>  
>  static inline void __uaccess_enable_hw_pan(void)
>  {
> -	asm(ALTERNATIVE("nop", SET_PSTATE_PAN(1), ARM64_HAS_PAN,
> -			CONFIG_ARM64_PAN));
> +	if (IS_ENABLED(CONFIG_ARM64_PAN))
> +		asm(ALTERNATIVE("nop", SET_PSTATE_PAN(1), ARM64_HAS_PAN));
>  }
>  
>  #define __uaccess_disable(alt)						\
>  do {									\
> -	if (!uaccess_ttbr0_disable())					\
> -		asm(ALTERNATIVE("nop", SET_PSTATE_PAN(1), alt,		\
> -				CONFIG_ARM64_PAN));			\
> +	if (IS_ENABLED(CONFIG_ARM64_PAN) && !uaccess_ttbr0_disable())	\
> +		asm(ALTERNATIVE("nop", SET_PSTATE_PAN(1), alt));	\
>  } while (0)
>  
>  #define __uaccess_enable(alt)						\
>  do {									\
> -	if (!uaccess_ttbr0_enable())					\
> -		asm(ALTERNATIVE("nop", SET_PSTATE_PAN(0), alt,		\
> -				CONFIG_ARM64_PAN));			\
> +	if (IS_ENABLED(CONFIG_ARM64_PAN) && !uaccess_ttbr0_enable())	\
> +		asm(ALTERNATIVE("nop", SET_PSTATE_PAN(0), alt));	\
>  } while (0)
>  
>  static inline void uaccess_disable(void)
> -- 
> 2.20.1
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/6] arm64: Add asm/ccset.h header
  2020-03-11 18:04 ` [PATCH 1/6] arm64: Add asm/ccset.h header Richard Henderson
@ 2020-03-13 10:54   ` Mark Rutland
  2020-03-13 16:29     ` Richard Henderson
  0 siblings, 1 reply; 18+ messages in thread
From: Mark Rutland @ 2020-03-13 10:54 UTC (permalink / raw)
  To: Richard Henderson; +Cc: linux-arm-kernel

Hi Richard,

On Wed, Mar 11, 2020 at 11:04:11AM -0700, Richard Henderson wrote:
> With gcc 10, arm64 includes support for flag outputs, much like
> x86 has done for years.  Mirror the macros that x86 places in
> asm/asm.h, with the necessary addition of CC_CLOBBER.

This sounds like a neat feature on the compiler side, and I can see that
this is potentially beneficial for hot paths.

I am concerned as Robin suggests for specific patches, that that macros
are going to be very easy to misuse, and make the assembly somewhat
opaque. So if there's a substantial benefit, it may be worth dealing
with that, but otherwise I'd prefer to keep things more legible so that
code is easier to maintain.

> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  arch/arm64/include/asm/ccset.h | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
>  create mode 100644 arch/arm64/include/asm/ccset.h
> 
> diff --git a/arch/arm64/include/asm/ccset.h b/arch/arm64/include/asm/ccset.h
> new file mode 100644
> index 000000000000..e733d383f515
> --- /dev/null
> +++ b/arch/arm64/include/asm/ccset.h
> @@ -0,0 +1,19 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef __ASM_CCSET_H
> +#define __ASM_CCSET_H
> +
> +/*
> + * Macros to generate condition code outputs from inline assembly.
> + * The output operand must be integral but type "bool" preferred.
> + */

Is there any documentation on this? What value does the output operand
nominally have when the flags are clear / set?

Thanks,
Mark.

> +#ifdef __GCC_ASM_FLAG_OUTPUTS__
> +# define CC_SET(c) "\n\t/* output condition code " #c "*/\n"
> +# define CC_OUT(c) "=@cc" #c
> +# define CC_CLOBBER
> +#else
> +# define CC_SET(c) "\n\tcset %[_cc_" #c "], " #c "\n"
> +# define CC_OUT(c) [_cc_ ## c] "=r"
> +# define CC_CLOBBER "cc"
> +#endif
> +
> +#endif /* __ASM_CCSET_H */
> -- 
> 2.20.1
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/6] arm64: uaccess: Use asm/ccset.h macros in __range_ok
  2020-03-11 18:04 ` [PATCH 4/6] arm64: uaccess: Use asm/ccset.h macros " Richard Henderson
  2020-03-12 11:48   ` Robin Murphy
@ 2020-03-13 11:04   ` Mark Rutland
  2020-03-13 16:51     ` Robin Murphy
  1 sibling, 1 reply; 18+ messages in thread
From: Mark Rutland @ 2020-03-13 11:04 UTC (permalink / raw)
  To: Richard Henderson, robin.murphy; +Cc: linux-arm-kernel

On Wed, Mar 11, 2020 at 11:04:14AM -0700, Richard Henderson wrote:
> Uses of __range_ok almost always feed a branch.
> This allows the compiler to use flags directly.

If we want to give hte compiler the most freedom, the best thing would
be to write this in C. IIUC this code is written in assembly largely for
historical reasons, and the comment above says:

| This is equivalent to the following test:
| (u65)addr + (u65)size <= (u65)current->addr_limit + 1

... which e.g. we could write as:

	(__uint128_t)addr + (__uint128_t)size <= (__uint128_t)limit + 1;

... which would be much clearer than the assembly.

Is there any pattern like that for which the compiler generates
reasonable looking code, and if not, is that something that could be
improved compiler side?

Thanks,
Mark.

> 
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  arch/arm64/include/asm/uaccess.h | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
> index fe3dd70e901e..ca1acd7b95c3 100644
> --- a/arch/arm64/include/asm/uaccess.h
> +++ b/arch/arm64/include/asm/uaccess.h
> @@ -22,6 +22,7 @@
>  #include <asm/ptrace.h>
>  #include <asm/memory.h>
>  #include <asm/extable.h>
> +#include <asm/ccset.h>
>  
>  #define get_fs()	(current_thread_info()->addr_limit)
>  
> @@ -86,10 +87,10 @@ static inline unsigned long __range_ok(const void __user *addr, unsigned long si
>  	//    comes from the carry in being clear. Otherwise, we are
>  	//    testing X' - C == 0, subject to the previous adjustments.
>  	"	sbcs	xzr, %[addr], %[limit]\n"
> -	"       cset    %[addr], ls\n"
> -	: [addr] "=&r" (ret), [limit] "+r" (limit)
> +		CC_SET(ls)
> +	: [addr] "=&r" (addr), [limit] "+r" (limit), CC_OUT(ls) (ret)
>  	: [size] "Ir" (size), [addr_in] "r" (addr)
> -	: "cc");
> +	: CC_CLOBBER);
>  
>  	return ret;
>  }
> -- 
> 2.20.1
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 2/6] arm64: uaccess: Use named asm operands for __in_range
  2020-03-11 21:48     ` Richard Henderson
@ 2020-03-13 16:14       ` Robin Murphy
  0 siblings, 0 replies; 18+ messages in thread
From: Robin Murphy @ 2020-03-13 16:14 UTC (permalink / raw)
  To: Richard Henderson, linux-arm-kernel

On 2020-03-11 9:48 pm, Richard Henderson wrote:
> On 3/11/20 12:08 PM, Robin Murphy wrote:
>> On 11/03/2020 6:04 pm, Richard Henderson wrote:
>>> With zero change of behavior, use %[] syntax for the asm
>>> operands instead of numbered operands.
>>
>> For any particular reason?
> 
> When we get to the third patch and add CC_SET(le), all of the operand numbers
> will change, and I found that more confusing than not.
> 
>> In
>> particular, the value represented by %0 (the conceptual X') never corresponds
>> to the variable "addr", so naming it "addr" provides the opposite of clarity.
> 
> Would you simply prefer a different name for the operand?

If we were to go down this route, I think it might actually make sense 
to split it up further to separate the "private scratch register" and 
"return value" concerns.

Robin.

> 
> 
> r~
> 
> 
>>> diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
>>> index 32fc8061aa76..ceb1d79eab09 100644
>>> --- a/arch/arm64/include/asm/uaccess.h
>>> +++ b/arch/arm64/include/asm/uaccess.h
>>> @@ -75,19 +75,21 @@ static inline unsigned long __range_ok(const void __user
>>> *addr, unsigned long si
>>>        asm volatile(
>>>        // A + B <= C + 1 for all A,B,C, in four easy steps:
>>>        // 1: X = A + B; X' = X % 2^64
>>> -    "    adds    %0, %3, %2\n"
>>> +    "    adds    %[addr], %[addr], %[size]\n"
>>>        // 2: Set C = 0 if X > 2^64, to guarantee X' > C in step 4
>>> -    "    csel    %1, xzr, %1, hi\n"
>>> +    "    csel    %[limit], xzr, %[limit], hi\n"
>>>        // 3: Set X' = ~0 if X >= 2^64. For X == 2^64, this decrements X'
>>>        //    to compensate for the carry flag being set in step 4. For
>>>        //    X > 2^64, X' merely has to remain nonzero, which it does.
>>> -    "    csinv    %0, %0, xzr, cc\n"
>>> +    "    csinv    %[addr], %[addr], xzr, cc\n"
>>>        // 4: For X < 2^64, this gives us X' - C - 1 <= 0, where the -1
>>>        //    comes from the carry in being clear. Otherwise, we are
>>>        //    testing X' - C == 0, subject to the previous adjustments.
>>> -    "    sbcs    xzr, %0, %1\n"
>>> -    "    cset    %0, ls\n"
>>> -    : "=&r" (ret), "+r" (limit) : "Ir" (size), "0" (addr) : "cc");
>>> +    "    sbcs    xzr, %[addr], %[limit]\n"
>>> +    "       cset    %[addr], ls\n"
>>> +    : [addr] "=&r" (ret), [limit] "+r" (limit)
>>> +    : [size] "Ir" (size), "0" (addr)
>>> +    : "cc");
>>>          return ret;
>>>    }

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 1/6] arm64: Add asm/ccset.h header
  2020-03-13 10:54   ` Mark Rutland
@ 2020-03-13 16:29     ` Richard Henderson
  0 siblings, 0 replies; 18+ messages in thread
From: Richard Henderson @ 2020-03-13 16:29 UTC (permalink / raw)
  To: Mark Rutland; +Cc: linux-arm-kernel

On 3/13/20 3:54 AM, Mark Rutland wrote:
> Is there any documentation on this? What value does the output operand
> nominally have when the flags are clear / set?

Yes.

https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Flag-Output-Operands

The value is explicitly boolean, so {0,1}, just like type bool.


r~

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/6] arm64: uaccess: Use asm/ccset.h macros in __range_ok
  2020-03-13 11:04   ` Mark Rutland
@ 2020-03-13 16:51     ` Robin Murphy
  2020-03-13 17:14       ` Mark Rutland
  0 siblings, 1 reply; 18+ messages in thread
From: Robin Murphy @ 2020-03-13 16:51 UTC (permalink / raw)
  To: Mark Rutland, Richard Henderson; +Cc: linux-arm-kernel

On 2020-03-13 11:04 am, Mark Rutland wrote:
> On Wed, Mar 11, 2020 at 11:04:14AM -0700, Richard Henderson wrote:
>> Uses of __range_ok almost always feed a branch.
>> This allows the compiler to use flags directly.
> 
> If we want to give hte compiler the most freedom, the best thing would
> be to write this in C. IIUC this code is written in assembly largely for
> historical reasons, and the comment above says:
> 
> | This is equivalent to the following test:
> | (u65)addr + (u65)size <= (u65)current->addr_limit + 1
> 
> ... which e.g. we could write as:
> 
> 	(__uint128_t)addr + (__uint128_t)size <= (__uint128_t)limit + 1;
> 
> ... which would be much clearer than the assembly.
> 
> Is there any pattern like that for which the compiler generates
> reasonable looking code, and if not, is that something that could be
> improved compiler side?

Hmm, in fact this:

	__uint128_t tmp = (__uint128_t)(unsigned long)addr + size;
	return !tmp || tmp - 1 <= limit;

generates code that looks like utter crap in isolation[1], but once 
inlined actually leads to a modest overall reduction (-0.09%) across the 
whole of vmlinux according to bloat-o-meter (presumably most of those 
branches roll up into the overall "if(access_ok())..." control flow for 
the typical case, and I'm sure size and limit get constant-folded a lot).

IIRC at the time there were so many uncertainties flying around that 
sticking with asm to take compiler unknowns out of the picture felt 
desirable, but perhaps the time might be nigh to retire my baby after 
all... I'll investigate a bit further.

Robin.


[1]:
0000000000000000 <range_ok>:
    0:   ab010000        adds    x0, x0, x1
    4:   9a9f37e3        cset    x3, cs  // cs = hs, nlast
    8:   aa030001        orr     x1, x0, x3
    c:   b4000161        cbz     x1, 38 <range_ok+0x38>
   10:   f1000401        subs    x1, x0, #0x1
   14:   d2800020        mov     x0, #0x1                        // #1
   18:   da1f0063        sbc     x3, x3, xzr
   1c:   b4000063        cbz     x3, 28 <range_ok+0x28>
   20:   d2800000        mov     x0, #0x0                        // #0
   24:   d65f03c0        ret
   28:   eb02003f        cmp     x1, x2
   2c:   54ffffc9        b.ls    24 <range_ok+0x24>  // b.plast
   30:   d2800000        mov     x0, #0x0                        // #0
   34:   17fffffc        b       24 <range_ok+0x24>
   38:   d2800020        mov     x0, #0x1                        // #1
   3c:   d65f03c0        ret

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH 4/6] arm64: uaccess: Use asm/ccset.h macros in __range_ok
  2020-03-13 16:51     ` Robin Murphy
@ 2020-03-13 17:14       ` Mark Rutland
  0 siblings, 0 replies; 18+ messages in thread
From: Mark Rutland @ 2020-03-13 17:14 UTC (permalink / raw)
  To: Robin Murphy; +Cc: Richard Henderson, linux-arm-kernel

On Fri, Mar 13, 2020 at 04:51:28PM +0000, Robin Murphy wrote:
> On 2020-03-13 11:04 am, Mark Rutland wrote:
> > On Wed, Mar 11, 2020 at 11:04:14AM -0700, Richard Henderson wrote:
> > > Uses of __range_ok almost always feed a branch.
> > > This allows the compiler to use flags directly.
> > 
> > If we want to give hte compiler the most freedom, the best thing would
> > be to write this in C. IIUC this code is written in assembly largely for
> > historical reasons, and the comment above says:
> > 
> > | This is equivalent to the following test:
> > | (u65)addr + (u65)size <= (u65)current->addr_limit + 1
> > 
> > ... which e.g. we could write as:
> > 
> > 	(__uint128_t)addr + (__uint128_t)size <= (__uint128_t)limit + 1;
> > 
> > ... which would be much clearer than the assembly.
> > 
> > Is there any pattern like that for which the compiler generates
> > reasonable looking code, and if not, is that something that could be
> > improved compiler side?
> 
> Hmm, in fact this:
> 
> 	__uint128_t tmp = (__uint128_t)(unsigned long)addr + size;
> 	return !tmp || tmp - 1 <= limit;
> 
> generates code that looks like utter crap in isolation[1], but once inlined
> actually leads to a modest overall reduction (-0.09%) across the whole of
> vmlinux according to bloat-o-meter (presumably most of those branches roll
> up into the overall "if(access_ok())..." control flow for the typical case,
> and I'm sure size and limit get constant-folded a lot).

Neat.

IIUC for a non-zero size the !tmp check can be elided, and for a
constant size the subtraction can be folded in at compile time, so for a
{get,put}_user(), the compiler only needs to do the addition and a check
against limit.

> IIRC at the time there were so many uncertainties flying around that
> sticking with asm to take compiler unknowns out of the picture felt
> desirable, but perhaps the time might be nigh to retire my baby after all...

I guess we might have thought we'd need to pass the result into some
masking logic, but I think uaccess_mask_ptr() turned out to be good
enough in practice.

> I'll investigate a bit further.

Great!

Thanks,
Mark.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2020-03-13 17:14 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-11 18:04 [PATCH 0/6] arm64: gcc asm flag outputs Richard Henderson
2020-03-11 18:04 ` [PATCH 1/6] arm64: Add asm/ccset.h header Richard Henderson
2020-03-13 10:54   ` Mark Rutland
2020-03-13 16:29     ` Richard Henderson
2020-03-11 18:04 ` [PATCH 2/6] arm64: uaccess: Use named asm operands for __in_range Richard Henderson
2020-03-11 19:08   ` Robin Murphy
2020-03-11 21:48     ` Richard Henderson
2020-03-13 16:14       ` Robin Murphy
2020-03-11 18:04 ` [PATCH 3/6] arm64: uaccess: Untie the input address in __range_ok Richard Henderson
2020-03-11 19:08   ` Robin Murphy
2020-03-11 18:04 ` [PATCH 4/6] arm64: uaccess: Use asm/ccset.h macros " Richard Henderson
2020-03-12 11:48   ` Robin Murphy
2020-03-13 11:04   ` Mark Rutland
2020-03-13 16:51     ` Robin Murphy
2020-03-13 17:14       ` Mark Rutland
2020-03-11 18:04 ` [PATCH 5/6] arm64: archrandom: Use asm/ccset.h macros in __arm64_rndr Richard Henderson
2020-03-11 18:04 ` [PATCH 6/6] arm64: Hoist CONFIG option out of ALTERNATIVE in uaccess.h Richard Henderson
2020-03-13 10:46   ` Mark Rutland

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.