* [PATCH 0/7] MIPS: Standard calling convention usercopy & memcpy
@ 2016-11-07 11:17 ` Paul Burton
  0 siblings, 0 replies; 20+ messages in thread
From: Paul Burton @ 2016-11-07 11:17 UTC (permalink / raw)
  To: linux-mips; +Cc: Ralf Baechle, Paul Burton

This series makes the usercopy & memcpy functions follow the standard
calling convention, allowing us to clean up calls to them from
copy_{to,from}_user & variants such that they're just standard function
calls rather than inline assembly wrappers. This frees us from needing
to worry about performing long calls in modules, declaring the right
registers clobbered by the inline asm, retrieving results from
non-standard registers, and so on.
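
As a rough sketch of the end state the series aims for (the prototype and
wrapper below are illustrative assumptions based on this description, not
code taken from the patches): once __copy_user() follows the standard
calling convention, the uaccess wrappers can reduce to ordinary C calls:

  #include <stddef.h>

  /* Assumed prototype: returns the number of bytes left uncopied. */
  extern size_t __copy_user(void *to, const void *from, size_t n);

  static inline size_t example_copy_from_user(void *to, const void *from,
                                              size_t n)
  {
          /* A plain call: no inline asm wrapper, no clobber list,
           * no manual long-call handling for modules. */
          return __copy_user(to, from, n);
  }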

This series applies atop v4.9-rc4 with my "MIPS: Cleanup EXPORT_SYMBOL
usage" series applied first.


Paul Burton (7):
  MIPS: lib: Split lib-y to a line per file
  MIPS: lib: Implement memmove in C
  MIPS: memcpy: Split __copy_user & memcpy
  MIPS: memcpy: Return uncopied bytes from __copy_user*() in v0
  MIPS: memcpy: Use ta* instead of manually defining t4-t7
  MIPS: memcpy: Use a3/$7 for source end address
  MIPS: uaccess: Use standard __user_copy* function calls

 arch/mips/cavium-octeon/octeon-memcpy.S | 225 +++++++--------
 arch/mips/include/asm/uaccess.h         | 480 ++++++++------------------------
 arch/mips/lib/Makefile                  |  14 +-
 arch/mips/lib/memcpy.S                  | 198 +++++--------
 arch/mips/lib/memmove.c                 |  39 +++
 5 files changed, 324 insertions(+), 632 deletions(-)
 create mode 100644 arch/mips/lib/memmove.c

-- 
2.10.2

* [PATCH 1/7] MIPS: lib: Split lib-y to a line per file
@ 2016-11-07 11:17   ` Paul Burton
  0 siblings, 0 replies; 20+ messages in thread
From: Paul Burton @ 2016-11-07 11:17 UTC (permalink / raw)
  To: linux-mips; +Cc: Ralf Baechle, Paul Burton

Split the lib-y assignment in arch/mips/lib/Makefile to a line per file
such that we can modify the list more easily, with clearer diffs.

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
---

 arch/mips/lib/Makefile | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/arch/mips/lib/Makefile b/arch/mips/lib/Makefile
index 0344e57..c0f0d1d 100644
--- a/arch/mips/lib/Makefile
+++ b/arch/mips/lib/Makefile
@@ -2,9 +2,16 @@
 # Makefile for MIPS-specific library files..
 #
 
-lib-y	+= bitops.o csum_partial.o delay.o memcpy.o memset.o \
-	   mips-atomic.o strlen_user.o strncpy_user.o \
-	   strnlen_user.o uncached.o
+lib-y += bitops.o
+lib-y += csum_partial.o
+lib-y += delay.o
+lib-y += memcpy.o
+lib-y += memset.o
+lib-y += mips-atomic.o
+lib-y += strlen_user.o
+lib-y += strncpy_user.o
+lib-y += strnlen_user.o
+lib-y += uncached.o
 
 obj-y			+= iomap.o
 obj-$(CONFIG_PCI)	+= iomap-pci.o
-- 
2.10.2

* [PATCH 2/7] MIPS: lib: Implement memmove in C
@ 2016-11-07 11:17   ` Paul Burton
  0 siblings, 0 replies; 20+ messages in thread
From: Paul Burton @ 2016-11-07 11:17 UTC (permalink / raw)
  To: linux-mips; +Cc: Ralf Baechle, Paul Burton

Implement memmove in C, dropping the asm implementation, which performs no
particular optimisation & was needlessly duplicated for octeon.
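
For reference, a small userspace demonstration of why the copy direction
matters once the regions overlap (this just uses the C library's memmove;
the behaviour matches the implementation added below):

  #include <stdio.h>
  #include <string.h>

  int main(void)
  {
          char buf[16] = "abcdef";

          /* Shift right by 2: dst overlaps src, so a plain forward copy
           * would clobber source bytes before reading them - exactly the
           * case the decrementing loop handles. */
          memmove(buf + 2, buf, 6);
          buf[8] = '\0';
          printf("%s\n", buf);    /* prints "ababcdef" */
          return 0;
  }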

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
---

 arch/mips/cavium-octeon/octeon-memcpy.S | 50 +----------------------------
 arch/mips/lib/Makefile                  |  1 +
 arch/mips/lib/memcpy.S                  | 56 +--------------------------------
 arch/mips/lib/memmove.c                 | 39 +++++++++++++++++++++++
 4 files changed, 42 insertions(+), 104 deletions(-)
 create mode 100644 arch/mips/lib/memmove.c

diff --git a/arch/mips/cavium-octeon/octeon-memcpy.S b/arch/mips/cavium-octeon/octeon-memcpy.S
index 7d96d9c..4336316 100644
--- a/arch/mips/cavium-octeon/octeon-memcpy.S
+++ b/arch/mips/cavium-octeon/octeon-memcpy.S
@@ -3,7 +3,7 @@
  * License.  See the file "COPYING" in the main directory of this archive
  * for more details.
  *
- * Unified implementation of memcpy, memmove and the __copy_user backend.
+ * Unified implementation of memcpy and the __copy_user backend.
  *
  * Copyright (C) 1998, 99, 2000, 01, 2002 Ralf Baechle (ralf@gnu.org)
  * Copyright (C) 1999, 2000, 01, 2002 Silicon Graphics, Inc.
@@ -66,9 +66,6 @@
  *
  * The exception handlers for stores adjust len (if necessary) and return.
  * These handlers do not need to overwrite any data.
- *
- * For __rmemcpy and memmove an exception is always a kernel bug, therefore
- * they're not protected.
  */
 
 #define EXC(inst_reg,addr,handler)		\
@@ -460,48 +457,3 @@ s_exc_p1:
 s_exc:
 	jr	ra
 	 nop
-
-	.align	5
-LEAF(memmove)
-EXPORT_SYMBOL(memmove)
-	ADD	t0, a0, a2
-	ADD	t1, a1, a2
-	sltu	t0, a1, t0			# dst + len <= src -> memcpy
-	sltu	t1, a0, t1			# dst >= src + len -> memcpy
-	and	t0, t1
-	beqz	t0, __memcpy
-	 move	v0, a0				/* return value */
-	beqz	a2, r_out
-	END(memmove)
-
-	/* fall through to __rmemcpy */
-LEAF(__rmemcpy)					/* a0=dst a1=src a2=len */
-	 sltu	t0, a1, a0
-	beqz	t0, r_end_bytes_up		# src >= dst
-	 nop
-	ADD	a0, a2				# dst = dst + len
-	ADD	a1, a2				# src = src + len
-
-r_end_bytes:
-	lb	t0, -1(a1)
-	SUB	a2, a2, 0x1
-	sb	t0, -1(a0)
-	SUB	a1, a1, 0x1
-	bnez	a2, r_end_bytes
-	 SUB	a0, a0, 0x1
-
-r_out:
-	jr	ra
-	 move	a2, zero
-
-r_end_bytes_up:
-	lb	t0, (a1)
-	SUB	a2, a2, 0x1
-	sb	t0, (a0)
-	ADD	a1, a1, 0x1
-	bnez	a2, r_end_bytes_up
-	 ADD	a0, a0, 0x1
-
-	jr	ra
-	 move	a2, zero
-	END(__rmemcpy)
diff --git a/arch/mips/lib/Makefile b/arch/mips/lib/Makefile
index c0f0d1d..0040bad 100644
--- a/arch/mips/lib/Makefile
+++ b/arch/mips/lib/Makefile
@@ -6,6 +6,7 @@ lib-y += bitops.o
 lib-y += csum_partial.o
 lib-y += delay.o
 lib-y += memcpy.o
+lib-y += memmove.o
 lib-y += memset.o
 lib-y += mips-atomic.o
 lib-y += strlen_user.o
diff --git a/arch/mips/lib/memcpy.S b/arch/mips/lib/memcpy.S
index c3031f1..b8d34d9 100644
--- a/arch/mips/lib/memcpy.S
+++ b/arch/mips/lib/memcpy.S
@@ -3,7 +3,7 @@
  * License.  See the file "COPYING" in the main directory of this archive
  * for more details.
  *
- * Unified implementation of memcpy, memmove and the __copy_user backend.
+ * Unified implementation of memcpy and the __copy_user backend.
  *
  * Copyright (C) 1998, 99, 2000, 01, 2002 Ralf Baechle (ralf@gnu.org)
  * Copyright (C) 1999, 2000, 01, 2002 Silicon Graphics, Inc.
@@ -82,9 +82,6 @@
  *
  * The exception handlers for stores adjust len (if necessary) and return.
  * These handlers do not need to overwrite any data.
- *
- * For __rmemcpy and memmove an exception is always a kernel bug, therefore
- * they're not protected.
  */
 
 /* Instruction type */
@@ -621,57 +618,6 @@ SEXC(1)
 	 nop
 	.endm
 
-	.align	5
-LEAF(memmove)
-EXPORT_SYMBOL(memmove)
-	ADD	t0, a0, a2
-	ADD	t1, a1, a2
-	sltu	t0, a1, t0			# dst + len <= src -> memcpy
-	sltu	t1, a0, t1			# dst >= src + len -> memcpy
-	and	t0, t1
-	beqz	t0, .L__memcpy
-	 move	v0, a0				/* return value */
-	beqz	a2, .Lr_out
-	END(memmove)
-
-	/* fall through to __rmemcpy */
-LEAF(__rmemcpy)					/* a0=dst a1=src a2=len */
-	 sltu	t0, a1, a0
-	beqz	t0, .Lr_end_bytes_up		# src >= dst
-	 nop
-	ADD	a0, a2				# dst = dst + len
-	ADD	a1, a2				# src = src + len
-
-.Lr_end_bytes:
-	R10KCBARRIER(0(ra))
-	lb	t0, -1(a1)
-	SUB	a2, a2, 0x1
-	sb	t0, -1(a0)
-	SUB	a1, a1, 0x1
-	.set	reorder				/* DADDI_WAR */
-	SUB	a0, a0, 0x1
-	bnez	a2, .Lr_end_bytes
-	.set	noreorder
-
-.Lr_out:
-	jr	ra
-	 move	a2, zero
-
-.Lr_end_bytes_up:
-	R10KCBARRIER(0(ra))
-	lb	t0, (a1)
-	SUB	a2, a2, 0x1
-	sb	t0, (a0)
-	ADD	a1, a1, 0x1
-	.set	reorder				/* DADDI_WAR */
-	ADD	a0, a0, 0x1
-	bnez	a2, .Lr_end_bytes_up
-	.set	noreorder
-
-	jr	ra
-	 move	a2, zero
-	END(__rmemcpy)
-
 /*
  * t6 is used as a flag to note inatomic mode.
  */
diff --git a/arch/mips/lib/memmove.c b/arch/mips/lib/memmove.c
new file mode 100644
index 0000000..4e902c6
--- /dev/null
+++ b/arch/mips/lib/memmove.c
@@ -0,0 +1,39 @@
+/*
+ * Copyright (C) 2016 Imagination Technologies
+ * Author: Paul Burton <paul.burton@imgtec.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation;  either version 2 of the  License, or (at your
+ * option) any later version.
+ */
+
+#include <linux/export.h>
+#include <linux/string.h>
+
+extern void *memmove(void *dest, const void *src, size_t count)
+{
+	const char *s = src;
+	const char *s_end = s + count;
+	char *d = dest;
+	char *d_end = dest + count;
+
+	/* Use optimised memcpy when there's no overlap */
+	if ((d_end <= s) || (s_end <= d))
+		return memcpy(dest, src, count);
+
+	if (d <= s) {
+		/* Incrementing copy loop */
+		while (count--)
+			*d++ = *s++;
+	} else {
+		/* Decrementing copy loop */
+		d = d_end;
+		s = s_end;
+		while (count--)
+			*--d = *--s;
+	}
+
+	return dest;
+}
+EXPORT_SYMBOL(memmove);
-- 
2.10.2

* [PATCH 3/7] MIPS: memcpy: Split __copy_user & memcpy
@ 2016-11-07 11:17   ` Paul Burton
  0 siblings, 0 replies; 20+ messages in thread
From: Paul Burton @ 2016-11-07 11:17 UTC (permalink / raw)
  To: linux-mips; +Cc: Ralf Baechle, Paul Burton

Up until now we have shared the same code for __copy_user() & memcpy(),
but this has the drawback that we use a non-standard ABI for __copy_user
and thus need to call it via inline assembly rather than a simple
function call. In order to allow for further patches to change this,
split the __copy_user() & memcpy() functions.

The resulting implementations of __copy_user() & memcpy() should differ
only in their existing return values, and in that memcpy() doesn't
generate entries in the exception table or include exception fixup code.

For octeon this involves introducing the __BUILD_COPY_USER macro &
renaming labels to remain unique, making the code match the non-octeon
memcpy implementation more closely.
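
A toy C analogue may help picture the approach (purely illustrative: the
real change is the __BUILD_COPY_USER assembler macro in the diff below,
only the fault-handling difference is modelled here rather than the
differing return values, and "faults" are simulated by a readable-byte
limit):

  #include <stdio.h>
  #include <stddef.h>

  /* Expand one copy-loop template twice; compile the recovery path in
   * only for the user-copy flavour. */
  #define BUILD_COPY(name, want_fixup)                                  \
  static size_t name(char *dst, const char *src, size_t len,            \
                     size_t readable)                                   \
  {                                                                     \
          size_t i;                                                     \
          for (i = 0; i < len; i++) {                                   \
                  if ((want_fixup) && i >= readable)                    \
                          return len - i; /* uncopied bytes */          \
                  dst[i] = src[i];                                      \
          }                                                             \
          return 0;                                                     \
  }

  BUILD_COPY(toy_memcpy, 0)     /* no fault handling, like memcpy */
  BUILD_COPY(toy_copy_user, 1)  /* reports leftover, like __copy_user */

  int main(void)
  {
          char buf[8] = "";

          printf("%zu\n", toy_copy_user(buf, "abcdef", 6, 4)); /* 2 */
          printf("%zu\n", toy_memcpy(buf, "abcdef", 6, 6));    /* 0 */
          return 0;
  }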

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
---

 arch/mips/cavium-octeon/octeon-memcpy.S | 141 +++++++++++++++++++-------------
 arch/mips/lib/memcpy.S                  |  74 ++++++++++-------
 2 files changed, 131 insertions(+), 84 deletions(-)

diff --git a/arch/mips/cavium-octeon/octeon-memcpy.S b/arch/mips/cavium-octeon/octeon-memcpy.S
index 4336316..944f8f5 100644
--- a/arch/mips/cavium-octeon/octeon-memcpy.S
+++ b/arch/mips/cavium-octeon/octeon-memcpy.S
@@ -18,6 +18,9 @@
 #include <asm/export.h>
 #include <asm/regdef.h>
 
+#define MEMCPY_MODE	1
+#define USER_COPY_MODE	2
+
 #define dst a0
 #define src a1
 #define len a2
@@ -70,9 +73,11 @@
 
 #define EXC(inst_reg,addr,handler)		\
 9:	inst_reg, addr;				\
-	.section __ex_table,"a";		\
-	PTR	9b, handler;			\
-	.previous
+	.if	\mode != MEMCPY_MODE;		\
+		.section __ex_table,"a";	\
+		PTR	9b, handler;		\
+		.previous;			\
+	.endif
 
 /*
  * Only on the 64-bit kernel we can made use of 64-bit registers.
@@ -136,30 +141,7 @@
 	.set	noreorder
 	.set	noat
 
-/*
- * t7 is used as a flag to note inatomic mode.
- */
-LEAF(__copy_user_inatomic)
-EXPORT_SYMBOL(__copy_user_inatomic)
-	b	__copy_user_common
-	 li	t7, 1
-	END(__copy_user_inatomic)
-
-/*
- * A combined memcpy/__copy_user
- * __copy_user sets len to 0 for success; else to an upper bound of
- * the number of uncopied bytes.
- * memcpy sets v0 to dst.
- */
-	.align	5
-LEAF(memcpy)					/* a0=dst a1=src a2=len */
-EXPORT_SYMBOL(memcpy)
-	move	v0, dst				/* return value */
-__memcpy:
-FEXPORT(__copy_user)
-EXPORT_SYMBOL(__copy_user)
-	li	t7, 0				/* not inatomic */
-__copy_user_common:
+	.macro __BUILD_COPY_USER mode
 	/*
 	 * Note: dst & src may be unaligned, len may be 0
 	 * Temps
@@ -170,15 +152,15 @@ __copy_user_common:
 	#
 	pref	0, 0(src)
 	sltu	t0, len, NBYTES		# Check if < 1 word
-	bnez	t0, copy_bytes_checklen
+	bnez	t0, .Lcopy_bytes_checklen\@
 	 and	t0, src, ADDRMASK	# Check if src unaligned
-	bnez	t0, src_unaligned
+	bnez	t0, .Lsrc_unaligned\@
 	 sltu	t0, len, 4*NBYTES	# Check if < 4 words
-	bnez	t0, less_than_4units
+	bnez	t0, .Lless_than_4units\@
 	 sltu	t0, len, 8*NBYTES	# Check if < 8 words
-	bnez	t0, less_than_8units
+	bnez	t0, .Lless_than_8units\@
 	 sltu	t0, len, 16*NBYTES	# Check if < 16 words
-	bnez	t0, cleanup_both_aligned
+	bnez	t0, .Lcleanup_both_aligned\@
 	 sltu	t0, len, 128+1		# Check if len < 129
 	bnez	t0, 1f			# Skip prefetch if len is too short
 	 sltu	t0, len, 256+1		# Check if len < 257
@@ -233,10 +215,10 @@ EXC(	STORE	t3, UNIT(-1)(dst),	s_exc_p1u)
 	#
 	# Jump here if there are less than 16*NBYTES left.
 	#
-cleanup_both_aligned:
-	beqz	len, done
+.Lcleanup_both_aligned\@:
+	beqz	len, .Ldone\@
 	 sltu	t0, len, 8*NBYTES
-	bnez	t0, less_than_8units
+	bnez	t0, .Lless_than_8units\@
 	 nop
 EXC(	LOAD	t0, UNIT(0)(src),	l_exc)
 EXC(	LOAD	t1, UNIT(1)(src),	l_exc_copy)
@@ -256,14 +238,14 @@ EXC(	STORE	t1, UNIT(5)(dst),	s_exc_p3u)
 EXC(	STORE	t2, UNIT(6)(dst),	s_exc_p2u)
 EXC(	STORE	t3, UNIT(7)(dst),	s_exc_p1u)
 	ADD	src, src, 8*NBYTES
-	beqz	len, done
+	beqz	len, .Ldone\@
 	 ADD	dst, dst, 8*NBYTES
 	#
 	# Jump here if there are less than 8*NBYTES left.
 	#
-less_than_8units:
+.Lless_than_8units\@:
 	sltu	t0, len, 4*NBYTES
-	bnez	t0, less_than_4units
+	bnez	t0, .Lless_than_4units\@
 	 nop
 EXC(	LOAD	t0, UNIT(0)(src),	l_exc)
 EXC(	LOAD	t1, UNIT(1)(src),	l_exc_copy)
@@ -275,15 +257,15 @@ EXC(	STORE	t1, UNIT(1)(dst),	s_exc_p3u)
 EXC(	STORE	t2, UNIT(2)(dst),	s_exc_p2u)
 EXC(	STORE	t3, UNIT(3)(dst),	s_exc_p1u)
 	ADD	src, src, 4*NBYTES
-	beqz	len, done
+	beqz	len, .Ldone\@
 	 ADD	dst, dst, 4*NBYTES
 	#
 	# Jump here if there are less than 4*NBYTES left. This means
 	# we may need to copy up to 3 NBYTES words.
 	#
-less_than_4units:
+.Lless_than_4units\@:
 	sltu	t0, len, 1*NBYTES
-	bnez	t0, copy_bytes_checklen
+	bnez	t0, .Lcopy_bytes_checklen\@
 	 nop
 	#
 	# 1) Copy NBYTES, then check length again
@@ -293,7 +275,7 @@ EXC(	LOAD	t0, 0(src),		l_exc)
 	sltu	t1, len, 8
 EXC(	STORE	t0, 0(dst),		s_exc_p1u)
 	ADD	src, src, NBYTES
-	bnez	t1, copy_bytes_checklen
+	bnez	t1, .Lcopy_bytes_checklen\@
 	 ADD	dst, dst, NBYTES
 	#
 	# 2) Copy NBYTES, then check length again
@@ -303,7 +285,7 @@ EXC(	LOAD	t0, 0(src),		l_exc)
 	sltu	t1, len, 8
 EXC(	STORE	t0, 0(dst),		s_exc_p1u)
 	ADD	src, src, NBYTES
-	bnez	t1, copy_bytes_checklen
+	bnez	t1, .Lcopy_bytes_checklen\@
 	 ADD	dst, dst, NBYTES
 	#
 	# 3) Copy NBYTES, then check length again
@@ -312,13 +294,13 @@ EXC(	LOAD	t0, 0(src),		l_exc)
 	SUB	len, len, NBYTES
 	ADD	src, src, NBYTES
 	ADD	dst, dst, NBYTES
-	b copy_bytes_checklen
+	b .Lcopy_bytes_checklen\@
 EXC(	 STORE	t0, -8(dst),		s_exc_p1u)
 
-src_unaligned:
+.Lsrc_unaligned\@:
 #define rem t8
 	SRL	t0, len, LOG_NBYTES+2	 # +2 for 4 units/iter
-	beqz	t0, cleanup_src_unaligned
+	beqz	t0, .Lcleanup_src_unaligned\@
 	 and	rem, len, (4*NBYTES-1)	 # rem = len % 4*NBYTES
 1:
 /*
@@ -344,10 +326,10 @@ EXC(	STORE	t3, UNIT(3)(dst),	s_exc_p1u)
 	bne	len, rem, 1b
 	 ADD	dst, dst, 4*NBYTES
 
-cleanup_src_unaligned:
-	beqz	len, done
+.Lcleanup_src_unaligned\@:
+	beqz	len, .Ldone\@
 	 and	rem, len, NBYTES-1  # rem = len % NBYTES
-	beq	rem, len, copy_bytes
+	beq	rem, len, .Lcopy_bytes\@
 	 nop
 1:
 EXC(	LDFIRST t0, FIRST(0)(src),	l_exc)
@@ -358,15 +340,15 @@ EXC(	STORE	t0, 0(dst),		s_exc_p1u)
 	bne	len, rem, 1b
 	 ADD	dst, dst, NBYTES
 
-copy_bytes_checklen:
-	beqz	len, done
+.Lcopy_bytes_checklen\@:
+	beqz	len, .Ldone\@
 	 nop
-copy_bytes:
+.Lcopy_bytes\@:
 	/* 0 < len < NBYTES  */
 #define COPY_BYTE(N)			\
 EXC(	lb	t0, N(src), l_exc);	\
 	SUB	len, len, 1;		\
-	beqz	len, done;		\
+	beqz	len, .Ldone\@;		\
 EXC(	 sb	t0, N(dst), s_exc_p1)
 
 	COPY_BYTE(0)
@@ -379,10 +361,12 @@ EXC(	lb	t0, NBYTES-2(src), l_exc)
 	SUB	len, len, 1
 	jr	ra
 EXC(	 sb	t0, NBYTES-2(dst), s_exc_p1)
-done:
+.Ldone\@:
 	jr	ra
 	 nop
-	END(memcpy)
+
+	/* memcpy shouldn't generate exceptions */
+	.if \mode != MEMCPY_MODE
 
 l_exc_copy:
 	/*
@@ -419,7 +403,7 @@ l_exc:
 	 * Clear len bytes starting at dst.  Can't call __bzero because it
 	 * might modify len.  An inefficient loop for these rare times...
 	 */
-	beqz	len, done
+	beqz	len, .Ldone\@
 	 SUB	src, len, 1
 1:	sb	zero, 0(dst)
 	ADD	dst, dst, 1
@@ -457,3 +441,48 @@ s_exc_p1:
 s_exc:
 	jr	ra
 	 nop
+	.endif	/* \mode != MEMCPY_MODE */
+	.endm
+
+/*
+ * memcpy() - Copy memory
+ * @a0 - destination
+ * @a1 - source
+ * @a2 - length
+ *
+ * Copy @a2 bytes of memory from @a1 to @a0.
+ *
+ * Returns: the destination pointer
+ */
+	.align	5
+LEAF(memcpy)					/* a0=dst a1=src a2=len */
+EXPORT_SYMBOL(memcpy)
+	move	v0, dst				/* return value */
+	__BUILD_COPY_USER MEMCPY_MODE
+	END(memcpy)
+
+/*
+ * __copy_user() - Copy memory
+ * @a0 - destination
+ * @a1 - source
+ * @a2 - length
+ *
+ * Copy @a2 bytes of memory from @a1 to @a0.
+ *
+ * Returns: the number of uncopied bytes in @a2
+ */
+LEAF(__copy_user)
+EXPORT_SYMBOL(__copy_user)
+	li	t7, 0				/* not inatomic */
+__copy_user_common:
+	__BUILD_COPY_USER COPY_USER_MODE
+	END(__copy_user)
+
+/*
+ * t7 is used as a flag to note inatomic mode.
+ */
+LEAF(__copy_user_inatomic)
+EXPORT_SYMBOL(__copy_user_inatomic)
+	b	__copy_user_common
+	 li	t7, 1
+	END(__copy_user_inatomic)
diff --git a/arch/mips/lib/memcpy.S b/arch/mips/lib/memcpy.S
index b8d34d9..bfbe23c 100644
--- a/arch/mips/lib/memcpy.S
+++ b/arch/mips/lib/memcpy.S
@@ -92,6 +92,7 @@
 #define DST_PREFETCH 2
 #define LEGACY_MODE 1
 #define EVA_MODE    2
+#define MEMCPY_MODE 3
 #define USEROP   1
 #define KERNELOP 2
 
@@ -107,7 +108,9 @@
  */
 
 #define EXC(insn, type, reg, addr, handler)			\
-	.if \mode == LEGACY_MODE;				\
+	.if \mode == MEMCPY_MODE;				\
+		insn reg, addr;					\
+	.elseif \mode == LEGACY_MODE;				\
 9:		insn reg, addr;					\
 		.section __ex_table,"a";			\
 		PTR	9b, handler;				\
@@ -199,7 +202,7 @@
 #define STOREB(reg, addr, handler)	EXC(sb, ST_INSN, reg, addr, handler)
 
 #define _PREF(hint, addr, type)						\
-	.if \mode == LEGACY_MODE;					\
+	.if \mode != EVA_MODE;						\
 		PREF(hint, addr);					\
 	.else;								\
 		.if ((\from == USEROP) && (type == SRC_PREFETCH)) ||	\
@@ -255,18 +258,12 @@
 	/*
 	 * Macro to build the __copy_user common code
 	 * Arguments:
-	 * mode : LEGACY_MODE or EVA_MODE
+	 * mode : LEGACY_MODE, EVA_MODE or MEMCPY_MODE
 	 * from : Source operand. USEROP or KERNELOP
 	 * to   : Destination operand. USEROP or KERNELOP
 	 */
 	.macro __BUILD_COPY_USER mode, from, to
 
-	/* initialize __memcpy if this the first time we execute this macro */
-	.ifnotdef __memcpy
-	.set __memcpy, 1
-	.hidden __memcpy /* make sure it does not leak */
-	.endif
-
 	/*
 	 * Note: dst & src may be unaligned, len may be 0
 	 * Temps
@@ -525,11 +522,9 @@
 	b	1b
 	 ADD	dst, dst, 8
 #endif /* CONFIG_CPU_MIPSR6 */
-	.if __memcpy == 1
-	END(memcpy)
-	.set __memcpy, 0
-	.hidden __memcpy
-	.endif
+
+	/* memcpy shouldn't generate exceptions */
+	.if	\mode != MEMCPY_MODE
 
 .Ll_exc_copy\@:
 	/*
@@ -616,34 +611,57 @@ SEXC(1)
 .Ls_exc\@:
 	jr	ra
 	 nop
-	.endm
 
-/*
- * t6 is used as a flag to note inatomic mode.
- */
-LEAF(__copy_user_inatomic)
-EXPORT_SYMBOL(__copy_user_inatomic)
-	b	__copy_user_common
-	li	t6, 1
-	END(__copy_user_inatomic)
+	.endif	/* \mode != MEMCPY_MODE */
+	.endm
 
 /*
- * A combined memcpy/__copy_user
- * __copy_user sets len to 0 for success; else to an upper bound of
- * the number of uncopied bytes.
- * memcpy sets v0 to dst.
+ * memcpy() - Copy memory
+ * @a0 - destination
+ * @a1 - source
+ * @a2 - length
+ *
+ * Copy @a2 bytes of memory from @a1 to @a0.
+ *
+ * Returns: the destination pointer
  */
 	.align	5
 LEAF(memcpy)					/* a0=dst a1=src a2=len */
 EXPORT_SYMBOL(memcpy)
 	move	v0, dst				/* return value */
 .L__memcpy:
-FEXPORT(__copy_user)
+	li	t6, 0	/* not inatomic */
+	/* Legacy Mode, user <-> user */
+	__BUILD_COPY_USER MEMCPY_MODE USEROP USEROP
+	END(memcpy)
+
+/*
+ * __copy_user() - Copy memory
+ * @a0 - destination
+ * @a1 - source
+ * @a2 - length
+ *
+ * Copy @a2 bytes of memory from @a1 to @a0.
+ *
+ * Returns: the number of uncopied bytes in @a2
+ */
+	.align	5
+LEAF(__copy_user)
 EXPORT_SYMBOL(__copy_user)
 	li	t6, 0	/* not inatomic */
 __copy_user_common:
 	/* Legacy Mode, user <-> user */
 	__BUILD_COPY_USER LEGACY_MODE USEROP USEROP
+	END(__copy_user)
+
+/*
+ * t6 is used as a flag to note inatomic mode.
+ */
+LEAF(__copy_user_inatomic)
+EXPORT_SYMBOL(__copy_user_inatomic)
+	b	__copy_user_common
+	li	t6, 1
+	END(__copy_user_inatomic)
 
 #ifdef CONFIG_EVA
 
-- 
2.10.2

* [PATCH 4/7] MIPS: memcpy: Return uncopied bytes from __copy_user*() in v0
@ 2016-11-07 11:17   ` Paul Burton
  0 siblings, 0 replies; 20+ messages in thread
From: Paul Burton @ 2016-11-07 11:17 UTC (permalink / raw)
  To: linux-mips; +Cc: Ralf Baechle, Paul Burton

The __copy_user*() functions have thus far returned the number of
uncopied bytes in the $a2 register, which is also the argument providing the
length of the memory region to be copied. As part of moving to use the
standard calling convention, return the number of uncopied bytes in v0
instead.
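
The uaccess.h hunks below arrange this on the caller side with GCC's
local register variables, binding an asm output to $2 (v0). A minimal
standalone illustration of that binding (MIPS-only and illustrative; the
real macros additionally declare the full clobber list and jump to the
actual __copy_user):

  static inline long example_read_v0(long x)
  {
          register long ret __asm__("$2");        /* v0 */
          register long in __asm__("$4");         /* a0 */

          in = x;
          /* Stand-in for the __MODULE_JAL(__copy_user) call in the real
           * macros: just move a0 into v0 so there is a result to collect
           * through the "=r" output. */
          __asm__ __volatile__("move %0, %1" : "=r"(ret) : "r"(in));
          return ret;
  }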

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
---

 arch/mips/cavium-octeon/octeon-memcpy.S | 18 +++++++++---------
 arch/mips/include/asm/uaccess.h         | 30 ++++++++++++++++++++----------
 arch/mips/lib/memcpy.S                  | 26 +++++++++++++-------------
 3 files changed, 42 insertions(+), 32 deletions(-)

diff --git a/arch/mips/cavium-octeon/octeon-memcpy.S b/arch/mips/cavium-octeon/octeon-memcpy.S
index 944f8f5..6f312a2 100644
--- a/arch/mips/cavium-octeon/octeon-memcpy.S
+++ b/arch/mips/cavium-octeon/octeon-memcpy.S
@@ -141,7 +141,7 @@
 	.set	noreorder
 	.set	noat
 
-	.macro __BUILD_COPY_USER mode
+	.macro __BUILD_COPY_USER mode, uncopied
 	/*
 	 * Note: dst & src may be unaligned, len may be 0
 	 * Temps
@@ -358,12 +358,12 @@ EXC(	 sb	t0, N(dst), s_exc_p1)
 	COPY_BYTE(4)
 	COPY_BYTE(5)
 EXC(	lb	t0, NBYTES-2(src), l_exc)
-	SUB	len, len, 1
+	SUB	\uncopied, len, 1
 	jr	ra
 EXC(	 sb	t0, NBYTES-2(dst), s_exc_p1)
 .Ldone\@:
 	jr	ra
-	 nop
+	 move	\uncopied, len
 
 	/* memcpy shouldn't generate exceptions */
 	.if \mode != MEMCPY_MODE
@@ -410,13 +410,13 @@ l_exc:
 	bnez	src, 1b
 	 SUB	src, src, 1
 2:	jr	ra
-	 nop
+	 move	\uncopied, len
 
 
 #define SEXC(n)				\
 s_exc_p ## n ## u:			\
 	jr	ra;			\
-	 ADD	len, len, n*NBYTES
+	 ADD	\uncopied, len, n*NBYTES
 
 SEXC(16)
 SEXC(15)
@@ -437,10 +437,10 @@ SEXC(1)
 
 s_exc_p1:
 	jr	ra
-	 ADD	len, len, 1
+	 ADD	\uncopied, len, 1
 s_exc:
 	jr	ra
-	 nop
+	 move	\uncopied, len
 	.endif	/* \mode != MEMCPY_MODE */
 	.endm
 
@@ -458,7 +458,7 @@ s_exc:
 LEAF(memcpy)					/* a0=dst a1=src a2=len */
 EXPORT_SYMBOL(memcpy)
 	move	v0, dst				/* return value */
-	__BUILD_COPY_USER MEMCPY_MODE
+	__BUILD_COPY_USER MEMCPY_MODE len
 	END(memcpy)
 
 /*
@@ -475,7 +475,7 @@ LEAF(__copy_user)
 EXPORT_SYMBOL(__copy_user)
 	li	t7, 0				/* not inatomic */
 __copy_user_common:
-	__BUILD_COPY_USER COPY_USER_MODE
+	__BUILD_COPY_USER COPY_USER_MODE v0
 	END(__copy_user)
 
 /*
diff --git a/arch/mips/include/asm/uaccess.h b/arch/mips/include/asm/uaccess.h
index 89fa5c0b..81d632f 100644
--- a/arch/mips/include/asm/uaccess.h
+++ b/arch/mips/include/asm/uaccess.h
@@ -814,6 +814,7 @@ extern size_t __copy_user(void *__to, const void *__from, size_t __n);
 #ifndef CONFIG_EVA
 #define __invoke_copy_to_user(to, from, n)				\
 ({									\
+	register long __cu_ret_r __asm__("$2");				\
 	register void __user *__cu_to_r __asm__("$4");			\
 	register const void *__cu_from_r __asm__("$5");			\
 	register long __cu_len_r __asm__("$6");				\
@@ -823,11 +824,12 @@ extern size_t __copy_user(void *__to, const void *__from, size_t __n);
 	__cu_len_r = (n);						\
 	__asm__ __volatile__(						\
 	__MODULE_JAL(__copy_user)					\
-	: "+r" (__cu_to_r), "+r" (__cu_from_r), "+r" (__cu_len_r)	\
+	: "=r"(__cu_ret_r), "+r" (__cu_to_r),				\
+	  "+r" (__cu_from_r), "+r" (__cu_len_r)				\
 	:								\
 	: "$8", "$9", "$10", "$11", "$12", "$14", "$15", "$24", "$31",	\
 	  DADDI_SCRATCH, "memory");					\
-	__cu_len_r;							\
+	__cu_ret_r;							\
 })
 
 #define __invoke_copy_to_kernel(to, from, n)				\
@@ -963,6 +965,7 @@ extern size_t __copy_user_inatomic(void *__to, const void *__from, size_t __n);
 
 #define __invoke_copy_from_user(to, from, n)				\
 ({									\
+	register long __cu_ret_r __asm__("$2");				\
 	register void *__cu_to_r __asm__("$4");				\
 	register const void __user *__cu_from_r __asm__("$5");		\
 	register long __cu_len_r __asm__("$6");				\
@@ -977,11 +980,12 @@ extern size_t __copy_user_inatomic(void *__to, const void *__from, size_t __n);
 	__UA_ADDU "\t$1, %1, %2\n\t"					\
 	".set\tat\n\t"							\
 	".set\treorder"							\
-	: "+r" (__cu_to_r), "+r" (__cu_from_r), "+r" (__cu_len_r)	\
+	: "=r"(__cu_ret_r), "+r" (__cu_to_r),				\
+	  "+r" (__cu_from_r), "+r" (__cu_len_r)				\
 	:								\
 	: "$8", "$9", "$10", "$11", "$12", "$14", "$15", "$24", "$31",	\
 	  DADDI_SCRATCH, "memory");					\
-	__cu_len_r;							\
+	__cu_ret_r;							\
 })
 
 #define __invoke_copy_from_kernel(to, from, n)				\
@@ -997,6 +1001,7 @@ extern size_t __copy_user_inatomic(void *__to, const void *__from, size_t __n);
 
 #define __invoke_copy_from_user_inatomic(to, from, n)			\
 ({									\
+	register long __cu_ret_r __asm__("$2");				\
 	register void *__cu_to_r __asm__("$4");				\
 	register const void __user *__cu_from_r __asm__("$5");		\
 	register long __cu_len_r __asm__("$6");				\
@@ -1011,11 +1016,12 @@ extern size_t __copy_user_inatomic(void *__to, const void *__from, size_t __n);
 	__UA_ADDU "\t$1, %1, %2\n\t"					\
 	".set\tat\n\t"							\
 	".set\treorder"							\
-	: "+r" (__cu_to_r), "+r" (__cu_from_r), "+r" (__cu_len_r)	\
+	: "=r"(__cu_ret_r), "+r" (__cu_to_r),				\
+	  "+r" (__cu_from_r), "+r" (__cu_len_r)				\
 	:								\
 	: "$8", "$9", "$10", "$11", "$12", "$14", "$15", "$24", "$31",	\
 	  DADDI_SCRATCH, "memory");					\
-	__cu_len_r;							\
+	__cu_ret_r;							\
 })
 
 #define __invoke_copy_from_kernel_inatomic(to, from, n)			\
@@ -1035,6 +1041,7 @@ extern size_t __copy_in_user_eva(void *__to, const void *__from, size_t __n);
 
 #define __invoke_copy_from_user_eva_generic(to, from, n, func_ptr)	\
 ({									\
+	register long __cu_ret_r __asm__("$2");				\
 	register void *__cu_to_r __asm__("$4");				\
 	register const void __user *__cu_from_r __asm__("$5");		\
 	register long __cu_len_r __asm__("$6");				\
@@ -1049,15 +1056,17 @@ extern size_t __copy_in_user_eva(void *__to, const void *__from, size_t __n);
 	__UA_ADDU "\t$1, %1, %2\n\t"					\
 	".set\tat\n\t"							\
 	".set\treorder"							\
-	: "+r" (__cu_to_r), "+r" (__cu_from_r), "+r" (__cu_len_r)	\
+	: "=r"(__cu_ret_r), "+r" (__cu_to_r),				\
+	  "+r" (__cu_from_r), "+r" (__cu_len_r)				\
 	:								\
 	: "$8", "$9", "$10", "$11", "$12", "$14", "$15", "$24", "$31",	\
 	  DADDI_SCRATCH, "memory");					\
-	__cu_len_r;							\
+	__cu_ret_r;							\
 })
 
 #define __invoke_copy_to_user_eva_generic(to, from, n, func_ptr)	\
 ({									\
+	register long __cu_ret_r __asm__("$2");				\
 	register void *__cu_to_r __asm__("$4");				\
 	register const void __user *__cu_from_r __asm__("$5");		\
 	register long __cu_len_r __asm__("$6");				\
@@ -1067,11 +1076,12 @@ extern size_t __copy_in_user_eva(void *__to, const void *__from, size_t __n);
 	__cu_len_r = (n);						\
 	__asm__ __volatile__(						\
 	__MODULE_JAL(func_ptr)						\
-	: "+r" (__cu_to_r), "+r" (__cu_from_r), "+r" (__cu_len_r)	\
+	: "=r"(__cu_ret_r), "+r" (__cu_to_r),				\
+	  "+r" (__cu_from_r), "+r" (__cu_len_r)				\
 	:								\
 	: "$8", "$9", "$10", "$11", "$12", "$14", "$15", "$24", "$31",	\
 	  DADDI_SCRATCH, "memory");					\
-	__cu_len_r;							\
+	__cu_ret_r;							\
 })
 
 /*
diff --git a/arch/mips/lib/memcpy.S b/arch/mips/lib/memcpy.S
index bfbe23c..052f7a1 100644
--- a/arch/mips/lib/memcpy.S
+++ b/arch/mips/lib/memcpy.S
@@ -262,7 +262,7 @@
 	 * from : Source operand. USEROP or KERNELOP
 	 * to   : Destination operand. USEROP or KERNELOP
 	 */
-	.macro __BUILD_COPY_USER mode, from, to
+	.macro __BUILD_COPY_USER mode, from, to, uncopied
 
 	/*
 	 * Note: dst & src may be unaligned, len may be 0
@@ -398,7 +398,7 @@
 	SHIFT_DISCARD t0, t0, bits
 	STREST(t0, -1(t1), .Ls_exc\@)
 	jr	ra
-	 move	len, zero
+	 move	\uncopied, zero
 .Ldst_unaligned\@:
 	/*
 	 * dst is unaligned
@@ -500,12 +500,12 @@
 	COPY_BYTE(5)
 #endif
 	LOADB(t0, NBYTES-2(src), .Ll_exc\@)
-	SUB	len, len, 1
+	SUB	\uncopied, len, 1
 	jr	ra
 	STOREB(t0, NBYTES-2(dst), .Ls_exc_p1\@)
 .Ldone\@:
 	jr	ra
-	 nop
+	 move	\uncopied, len
 
 #ifdef CONFIG_CPU_MIPSR6
 .Lcopy_unaligned_bytes\@:
@@ -584,13 +584,13 @@
 	.set	pop
 #endif
 	jr	ra
-	 nop
+	 move	\uncopied, len
 
 
 #define SEXC(n)							\
 	.set	reorder;			/* DADDI_WAR */ \
 .Ls_exc_p ## n ## u\@:						\
-	ADD	len, len, n*NBYTES;				\
+	ADD	\uncopied, len, n*NBYTES;			\
 	jr	ra;						\
 	.set	noreorder
 
@@ -605,12 +605,12 @@ SEXC(1)
 
 .Ls_exc_p1\@:
 	.set	reorder				/* DADDI_WAR */
-	ADD	len, len, 1
+	ADD	\uncopied, len, 1
 	jr	ra
 	.set	noreorder
 .Ls_exc\@:
 	jr	ra
-	 nop
+	 move	\uncopied, len
 
 	.endif	/* \mode != MEMCPY_MODE */
 	.endm
@@ -632,7 +632,7 @@ EXPORT_SYMBOL(memcpy)
 .L__memcpy:
 	li	t6, 0	/* not inatomic */
 	/* Legacy Mode, user <-> user */
-	__BUILD_COPY_USER MEMCPY_MODE USEROP USEROP
+	__BUILD_COPY_USER MEMCPY_MODE USEROP USEROP len
 	END(memcpy)
 
 /*
@@ -651,7 +651,7 @@ EXPORT_SYMBOL(__copy_user)
 	li	t6, 0	/* not inatomic */
 __copy_user_common:
 	/* Legacy Mode, user <-> user */
-	__BUILD_COPY_USER LEGACY_MODE USEROP USEROP
+	__BUILD_COPY_USER LEGACY_MODE USEROP USEROP v0
 	END(__copy_user)
 
 /*
@@ -686,7 +686,7 @@ LEAF(__copy_from_user_eva)
 EXPORT_SYMBOL(__copy_from_user_eva)
 	li	t6, 0	/* not inatomic */
 __copy_from_user_common:
-	__BUILD_COPY_USER EVA_MODE USEROP KERNELOP
+	__BUILD_COPY_USER EVA_MODE USEROP KERNELOP v0
 END(__copy_from_user_eva)
 
 
@@ -697,7 +697,7 @@ END(__copy_from_user_eva)
 
 LEAF(__copy_to_user_eva)
 EXPORT_SYMBOL(__copy_to_user_eva)
-__BUILD_COPY_USER EVA_MODE KERNELOP USEROP
+__BUILD_COPY_USER EVA_MODE KERNELOP USEROP v0
 END(__copy_to_user_eva)
 
 /*
@@ -706,7 +706,7 @@ END(__copy_to_user_eva)
 
 LEAF(__copy_in_user_eva)
 EXPORT_SYMBOL(__copy_in_user_eva)
-__BUILD_COPY_USER EVA_MODE USEROP USEROP
+__BUILD_COPY_USER EVA_MODE USEROP USEROP v0
 END(__copy_in_user_eva)
 
 #endif
-- 
2.10.2

 /*
@@ -706,7 +706,7 @@ END(__copy_to_user_eva)
 
 LEAF(__copy_in_user_eva)
 EXPORT_SYMBOL(__copy_in_user_eva)
-__BUILD_COPY_USER EVA_MODE USEROP USEROP
+__BUILD_COPY_USER EVA_MODE USEROP USEROP v0
 END(__copy_in_user_eva)
 
 #endif
-- 
2.10.2

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 5/7] MIPS: memcpy: Use ta* instead of manually defining t4-t7
@ 2016-11-07 11:18   ` Paul Burton
  0 siblings, 0 replies; 20+ messages in thread
From: Paul Burton @ 2016-11-07 11:18 UTC (permalink / raw)
  To: linux-mips; +Cc: Ralf Baechle, Paul Burton

Manually defining registers t4-t7 to match the o32 ABI in 64-bit
kernels is at best a dirty hack. Use the generic ta* definitions
instead, which further prepares us for using a standard calling
convention.
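
For reference, here is a minimal sketch (mine, not part of this patch or
of the kernel's asm/regdef.h) of what the generic names resolve to under
o32, where they alias the registers otherwise known as t4-t7; 64-bit
builds simply take whatever regdef.h assigns to ta0-ta3 in place of the
local redefinitions removed below:

  /* hedged illustration of the o32 aliases only, not the real header */
  #define ta0	$12	/* the register also named t4 under o32 */
  #define ta1	$13	/* also t5 */
  #define ta2	$14	/* also t6 */
  #define ta3	$15	/* also t7 */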

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
---

 arch/mips/cavium-octeon/octeon-memcpy.S | 12 ++++--------
 arch/mips/lib/memcpy.S                  | 26 +++++++++++---------------
 2 files changed, 15 insertions(+), 23 deletions(-)

diff --git a/arch/mips/cavium-octeon/octeon-memcpy.S b/arch/mips/cavium-octeon/octeon-memcpy.S
index 6f312a2..db49fca 100644
--- a/arch/mips/cavium-octeon/octeon-memcpy.S
+++ b/arch/mips/cavium-octeon/octeon-memcpy.S
@@ -112,10 +112,6 @@
 #define t1	$9
 #define t2	$10
 #define t3	$11
-#define t4	$12
-#define t5	$13
-#define t6	$14
-#define t7	$15
 
 #ifdef CONFIG_CPU_LITTLE_ENDIAN
 #define LDFIRST LOADR
@@ -391,7 +387,7 @@ l_exc:
 	LOAD	t0, TI_TASK($28)
 	LOAD	t0, THREAD_BUADDR(t0)	# t0 is just past last good address
 	SUB	len, AT, t0		# len number of uncopied bytes
-	bnez	t7, 2f		/* Skip the zeroing out part if inatomic */
+	bnez	ta0, 2f		/* Skip the zeroing out part if inatomic */
 	/*
 	 * Here's where we rely on src and dst being incremented in tandem,
 	 *   See (3) above.
@@ -473,16 +469,16 @@ EXPORT_SYMBOL(memcpy)
  */
 LEAF(__copy_user)
 EXPORT_SYMBOL(__copy_user)
-	li	t7, 0				/* not inatomic */
+	li	ta0, 0				/* not inatomic */
 __copy_user_common:
 	__BUILD_COPY_USER COPY_USER_MODE v0
 	END(__copy_user)
 
 /*
- * t7 is used as a flag to note inatomic mode.
+ * ta0 is used as a flag to note inatomic mode.
  */
 LEAF(__copy_user_inatomic)
 EXPORT_SYMBOL(__copy_user_inatomic)
 	b	__copy_user_common
-	 li	t7, 1
+	 li	ta0, 1
 	END(__copy_user_inatomic)
diff --git a/arch/mips/lib/memcpy.S b/arch/mips/lib/memcpy.S
index 052f7a1..48684c4 100644
--- a/arch/mips/lib/memcpy.S
+++ b/arch/mips/lib/memcpy.S
@@ -172,10 +172,6 @@
 #define t1	$9
 #define t2	$10
 #define t3	$11
-#define t4	$12
-#define t5	$13
-#define t6	$14
-#define t7	$15
 
 #else
 
@@ -314,8 +310,8 @@
 	LOAD(t2, UNIT(2)(src), .Ll_exc_copy\@)
 	LOAD(t3, UNIT(3)(src), .Ll_exc_copy\@)
 	SUB	len, len, 8*NBYTES
-	LOAD(t4, UNIT(4)(src), .Ll_exc_copy\@)
-	LOAD(t7, UNIT(5)(src), .Ll_exc_copy\@)
+	LOAD(ta0, UNIT(4)(src), .Ll_exc_copy\@)
+	LOAD(ta1, UNIT(5)(src), .Ll_exc_copy\@)
 	STORE(t0, UNIT(0)(dst),	.Ls_exc_p8u\@)
 	STORE(t1, UNIT(1)(dst),	.Ls_exc_p7u\@)
 	LOAD(t0, UNIT(6)(src), .Ll_exc_copy\@)
@@ -324,8 +320,8 @@
 	ADD	dst, dst, 8*NBYTES
 	STORE(t2, UNIT(-6)(dst), .Ls_exc_p6u\@)
 	STORE(t3, UNIT(-5)(dst), .Ls_exc_p5u\@)
-	STORE(t4, UNIT(-4)(dst), .Ls_exc_p4u\@)
-	STORE(t7, UNIT(-3)(dst), .Ls_exc_p3u\@)
+	STORE(ta0, UNIT(-4)(dst), .Ls_exc_p4u\@)
+	STORE(ta1, UNIT(-3)(dst), .Ls_exc_p3u\@)
 	STORE(t0, UNIT(-2)(dst), .Ls_exc_p2u\@)
 	STORE(t1, UNIT(-1)(dst), .Ls_exc_p1u\@)
 	PREFS(	0, 8*32(src) )
@@ -554,7 +550,7 @@
 	LOADK	t0, THREAD_BUADDR(t0)	# t0 is just past last good address
 	 nop
 	SUB	len, AT, t0		# len number of uncopied bytes
-	bnez	t6, .Ldone\@	/* Skip the zeroing part if inatomic */
+	bnez	ta2, .Ldone\@	/* Skip the zeroing part if inatomic */
 	/*
 	 * Here's where we rely on src and dst being incremented in tandem,
 	 *   See (3) above.
@@ -630,7 +626,7 @@ LEAF(memcpy)					/* a0=dst a1=src a2=len */
 EXPORT_SYMBOL(memcpy)
 	move	v0, dst				/* return value */
 .L__memcpy:
-	li	t6, 0	/* not inatomic */
+	li	ta2, 0	/* not inatomic */
 	/* Legacy Mode, user <-> user */
 	__BUILD_COPY_USER MEMCPY_MODE USEROP USEROP len
 	END(memcpy)
@@ -648,19 +644,19 @@ EXPORT_SYMBOL(memcpy)
 	.align	5
 LEAF(__copy_user)
 EXPORT_SYMBOL(__copy_user)
-	li	t6, 0	/* not inatomic */
+	li	ta2, 0	/* not inatomic */
 __copy_user_common:
 	/* Legacy Mode, user <-> user */
 	__BUILD_COPY_USER LEGACY_MODE USEROP USEROP v0
 	END(__copy_user)
 
 /*
- * t6 is used as a flag to note inatomic mode.
+ * ta2 is used as a flag to note inatomic mode.
  */
 LEAF(__copy_user_inatomic)
 EXPORT_SYMBOL(__copy_user_inatomic)
 	b	__copy_user_common
-	li	t6, 1
+	li	ta2, 1
 	END(__copy_user_inatomic)
 
 #ifdef CONFIG_EVA
@@ -675,7 +671,7 @@ EXPORT_SYMBOL(__copy_user_inatomic)
 LEAF(__copy_user_inatomic_eva)
 EXPORT_SYMBOL(__copy_user_inatomic_eva)
 	b       __copy_from_user_common
-	li	t6, 1
+	li	ta2, 1
 	END(__copy_user_inatomic_eva)
 
 /*
@@ -684,7 +680,7 @@ EXPORT_SYMBOL(__copy_user_inatomic_eva)
 
 LEAF(__copy_from_user_eva)
 EXPORT_SYMBOL(__copy_from_user_eva)
-	li	t6, 0	/* not inatomic */
+	li	ta2, 0	/* not inatomic */
 __copy_from_user_common:
 	__BUILD_COPY_USER EVA_MODE USEROP KERNELOP v0
 END(__copy_from_user_eva)
-- 
2.10.2

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 6/7] MIPS: memcpy: Use a3/$7 for source end address
@ 2016-11-07 11:18   ` Paul Burton
  0 siblings, 0 replies; 20+ messages in thread
From: Paul Burton @ 2016-11-07 11:18 UTC (permalink / raw)
  To: linux-mips; +Cc: Ralf Baechle, Paul Burton

Instead of using the at/$1 register (which does not form part of the
typical calling convention) to provide the end of the source region to
__copy_user* functions, use the a3/$7 register. This prepares us for
being able to call __copy_user* with a standard function call.
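
As a sketch of the resulting call shape (mine, not code from this patch;
the final patch in the series makes calls of exactly this form), the new
fourth argument is simply the address one byte past the end of the
source region:

  /* illustration only; dst, src and n are assumed to come from the caller */
  static size_t copy_user_example(void *dst, const void *src, size_t n)
  {
  	/* returns the number of bytes left uncopied, 0 on success */
  	return __copy_user(dst, src, n, src + n);
  }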

Signed-off-by: Paul Burton <paul.burton@imgtec.com>
---

 arch/mips/cavium-octeon/octeon-memcpy.S |  8 ++++----
 arch/mips/include/asm/uaccess.h         | 21 ++++++++++++---------
 arch/mips/lib/memcpy.S                  |  8 ++++----
 3 files changed, 20 insertions(+), 17 deletions(-)

diff --git a/arch/mips/cavium-octeon/octeon-memcpy.S b/arch/mips/cavium-octeon/octeon-memcpy.S
index db49fca..9316ab1 100644
--- a/arch/mips/cavium-octeon/octeon-memcpy.S
+++ b/arch/mips/cavium-octeon/octeon-memcpy.S
@@ -57,13 +57,13 @@
 
 /*
  * The exception handler for loads requires that:
- *  1- AT contain the address of the byte just past the end of the source
+ *  1- a3 contain the address of the byte just past the end of the source
  *     of the copy,
- *  2- src_entry <= src < AT, and
+ *  2- src_entry <= src < a3, and
  *  3- (dst - src) == (dst_entry - src_entry),
  * The _entry suffix denotes values when __copy_user was called.
  *
- * (1) is set up up by uaccess.h and maintained by not writing AT in copy_user
+ * (1) is set up up by uaccess.h and maintained by not writing a3 in copy_user
  * (2) is met by incrementing src by the number of bytes copied
  * (3) is met by not doing loads between a pair of increments of dst and src
  *
@@ -386,7 +386,7 @@ EXC(	lb	t1, 0(src),	l_exc)
 l_exc:
 	LOAD	t0, TI_TASK($28)
 	LOAD	t0, THREAD_BUADDR(t0)	# t0 is just past last good address
-	SUB	len, AT, t0		# len number of uncopied bytes
+	SUB	len, a3, t0		# len number of uncopied bytes
 	bnez	ta0, 2f		/* Skip the zeroing out part if inatomic */
 	/*
 	 * Here's where we rely on src and dst being incremented in tandem,
diff --git a/arch/mips/include/asm/uaccess.h b/arch/mips/include/asm/uaccess.h
index 81d632f..562ad49 100644
--- a/arch/mips/include/asm/uaccess.h
+++ b/arch/mips/include/asm/uaccess.h
@@ -809,7 +809,8 @@ extern void __put_user_unaligned_unknown(void);
 #define DADDI_SCRATCH "$0"
 #endif
 
-extern size_t __copy_user(void *__to, const void *__from, size_t __n);
+extern size_t __copy_user(void *__to, const void *__from, size_t __n,
+			  const void *__from_end);
 
 #ifndef CONFIG_EVA
 #define __invoke_copy_to_user(to, from, n)				\
@@ -874,7 +875,8 @@ extern size_t __copy_user(void *__to, const void *__from, size_t __n);
 	__cu_len;							\
 })
 
-extern size_t __copy_user_inatomic(void *__to, const void *__from, size_t __n);
+extern size_t __copy_user_inatomic(void *__to, const void *__from, size_t __n,
+				   const void *__from_end);
 
 #define __copy_to_user_inatomic(to, from, n)				\
 ({									\
@@ -977,7 +979,7 @@ extern size_t __copy_user_inatomic(void *__to, const void *__from, size_t __n);
 	".set\tnoreorder\n\t"						\
 	__MODULE_JAL(__copy_user)					\
 	".set\tnoat\n\t"						\
-	__UA_ADDU "\t$1, %1, %2\n\t"					\
+	__UA_ADDU "\t$7, %1, %2\n\t"					\
 	".set\tat\n\t"							\
 	".set\treorder"							\
 	: "=r"(__cu_ret_r), "+r" (__cu_to_r),				\
@@ -1013,7 +1015,7 @@ extern size_t __copy_user_inatomic(void *__to, const void *__from, size_t __n);
 	".set\tnoreorder\n\t"						\
 	__MODULE_JAL(__copy_user_inatomic)				\
 	".set\tnoat\n\t"						\
-	__UA_ADDU "\t$1, %1, %2\n\t"					\
+	__UA_ADDU "\t$7, %1, %2\n\t"					\
 	".set\tat\n\t"							\
 	".set\treorder"							\
 	: "=r"(__cu_ret_r), "+r" (__cu_to_r),				\
@@ -1032,12 +1034,13 @@ extern size_t __copy_user_inatomic(void *__to, const void *__from, size_t __n);
 /* EVA specific functions */
 
 extern size_t __copy_user_inatomic_eva(void *__to, const void *__from,
-				       size_t __n);
+				       size_t __n, const void *__from_end);
 extern size_t __copy_from_user_eva(void *__to, const void *__from,
-				   size_t __n);
+				   size_t __n, const void *__from_end);
 extern size_t __copy_to_user_eva(void *__to, const void *__from,
-				 size_t __n);
-extern size_t __copy_in_user_eva(void *__to, const void *__from, size_t __n);
+				 size_t __n, const void *__from_end);
+extern size_t __copy_in_user_eva(void *__to, const void *__from, size_t __n,
+				 const void *__from_end);
 
 #define __invoke_copy_from_user_eva_generic(to, from, n, func_ptr)	\
 ({									\
@@ -1053,7 +1056,7 @@ extern size_t __copy_in_user_eva(void *__to, const void *__from, size_t __n);
 	".set\tnoreorder\n\t"						\
 	__MODULE_JAL(func_ptr)						\
 	".set\tnoat\n\t"						\
-	__UA_ADDU "\t$1, %1, %2\n\t"					\
+	__UA_ADDU "\t$7, %1, %2\n\t"					\
 	".set\tat\n\t"							\
 	".set\treorder"							\
 	: "=r"(__cu_ret_r), "+r" (__cu_to_r),				\
diff --git a/arch/mips/lib/memcpy.S b/arch/mips/lib/memcpy.S
index 48684c4..5af9f03 100644
--- a/arch/mips/lib/memcpy.S
+++ b/arch/mips/lib/memcpy.S
@@ -70,13 +70,13 @@
 
 /*
  * The exception handler for loads requires that:
- *  1- AT contain the address of the byte just past the end of the source
+ *  1- a3 contain the address of the byte just past the end of the source
  *     of the copy,
- *  2- src_entry <= src < AT, and
+ *  2- src_entry <= src < a3, and
  *  3- (dst - src) == (dst_entry - src_entry),
  * The _entry suffix denotes values when __copy_user was called.
  *
- * (1) is set up up by uaccess.h and maintained by not writing AT in copy_user
+ * (1) is set up up by uaccess.h and maintained by not writing a3 in copy_user
  * (2) is met by incrementing src by the number of bytes copied
  * (3) is met by not doing loads between a pair of increments of dst and src
  *
@@ -549,7 +549,7 @@
 	 nop
 	LOADK	t0, THREAD_BUADDR(t0)	# t0 is just past last good address
 	 nop
-	SUB	len, AT, t0		# len number of uncopied bytes
+	SUB	len, a3, t0		# len number of uncopied bytes
 	bnez	ta2, .Ldone\@	/* Skip the zeroing part if inatomic */
 	/*
 	 * Here's where we rely on src and dst being incremented in tandem,
-- 
2.10.2

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 7/7] MIPS: uaccess: Use standard __user_copy* function calls
@ 2016-11-07 11:18   ` Paul Burton
  0 siblings, 0 replies; 20+ messages in thread
From: Paul Burton @ 2016-11-07 11:18 UTC (permalink / raw)
  To: linux-mips; +Cc: Ralf Baechle, Paul Burton

The __copy_user* functions now almost follow the standard calling
convention, the only exception being the manually redefined argument
registers. Remove those redefinitions so that we use the argument
registers matching the standard calling convention for the kernel
build, and call __copy_user* with standard C function calls. This
allows us to remove the assembly invoke macros & their manual lists of
clobbered registers, simplifying & tidying up the code in
asm/uaccess.h significantly.
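
To illustrate the end result (an example of mine, not taken from the
patch), nothing changes for callers of the uaccess API; a copy is still
an ordinary call returning the number of bytes left uncopied, it is only
the implementation underneath that becomes a plain C function call:

  #include <linux/types.h>
  #include <linux/uaccess.h>

  /* sketch: returns 0 on success, or the number of bytes not copied */
  static unsigned long read_user_u32(u32 *val, const u32 __user *uptr)
  {
  	return copy_from_user(val, uptr, sizeof(*val));
  }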

This does have a cost in that the compiler will now have to presume that
all registers that are call-clobbered in the standard calling convention
are clobbered by calls to __copy_user*. In practice this doesn't seem
to matter & this patch shaves ~850 bytes of code from a 64r6el generic
kernel:

  $ ./scripts/bloat-o-meter vmlinux-pre vmlinux-post
  add/remove: 7/7 grow/shrink: 161/161 up/down: 6420/-7270 (-850)
  function                                     old     new   delta
  ethtool_get_strings                            -     692    +692
  ethtool_get_dump_data                          -     568    +568
  ...
  ethtool_self_test                            540       -    -540
  ethtool_get_phy_stats.isra                   564       -    -564
  Total: Before=7006717, After=7005867, chg -0.01%

Signed-off-by: Paul Burton <paul.burton@imgtec.com>

---

 arch/mips/cavium-octeon/octeon-memcpy.S |  16 --
 arch/mips/include/asm/uaccess.h         | 493 +++++++-------------------------
 arch/mips/lib/memcpy.S                  |  16 --
 3 files changed, 110 insertions(+), 415 deletions(-)

diff --git a/arch/mips/cavium-octeon/octeon-memcpy.S b/arch/mips/cavium-octeon/octeon-memcpy.S
index 9316ab1..e3f6de1 100644
--- a/arch/mips/cavium-octeon/octeon-memcpy.S
+++ b/arch/mips/cavium-octeon/octeon-memcpy.S
@@ -43,8 +43,6 @@
  *     - src is readable  (no exceptions when reading src)
  *   copy_from_user
  *     - dst is writable  (no exceptions when writing dst)
- * __copy_user uses a non-standard calling convention; see
- * arch/mips/include/asm/uaccess.h
  *
  * When an exception happens on a load, the handler must
  # ensure that all of the destination buffer is overwritten to prevent
@@ -99,20 +97,6 @@
 #define NBYTES 8
 #define LOG_NBYTES 3
 
-/*
- * As we are sharing code base with the mips32 tree (which use the o32 ABI
- * register definitions). We need to redefine the register definitions from
- * the n64 ABI register naming to the o32 ABI register naming.
- */
-#undef t0
-#undef t1
-#undef t2
-#undef t3
-#define t0	$8
-#define t1	$9
-#define t2	$10
-#define t3	$11
-
 #ifdef CONFIG_CPU_LITTLE_ENDIAN
 #define LDFIRST LOADR
 #define LDREST	LOADL
diff --git a/arch/mips/include/asm/uaccess.h b/arch/mips/include/asm/uaccess.h
index 562ad49..2e13c19 100644
--- a/arch/mips/include/asm/uaccess.h
+++ b/arch/mips/include/asm/uaccess.h
@@ -96,6 +96,20 @@ static inline bool eva_kernel_access(void)
 	return segment_eq(get_fs(), get_ds());
 }
 
+/**
+ * eva_user_access() - determine whether access should use EVA instructions
+ *
+ * Determines whether memory accesses should be performed using EVA memory
+ * access instructions - that is, whether to access the user address space on
+ * an EVA system.
+ *
+ * Return: true if user memory access on an EVA system, else false
+ */
+static inline bool eva_user_access(void)
+{
+	return IS_ENABLED(CONFIG_EVA) && !eva_kernel_access();
+}
+
 /*
  * Is a address valid? This does a straightforward calculation rather
  * than tests.
@@ -802,41 +816,18 @@ extern void __put_user_unaligned_unknown(void);
 	"jal\t" #destination "\n\t"
 #endif
 
-#if defined(CONFIG_CPU_DADDI_WORKAROUNDS) || (defined(CONFIG_EVA) &&	\
-					      defined(CONFIG_CPU_HAS_PREFETCH))
-#define DADDI_SCRATCH "$3"
-#else
-#define DADDI_SCRATCH "$0"
-#endif
-
-extern size_t __copy_user(void *__to, const void *__from, size_t __n,
-			  const void *__from_end);
-
-#ifndef CONFIG_EVA
-#define __invoke_copy_to_user(to, from, n)				\
-({									\
-	register long __cu_ret_r __asm__("$2");				\
-	register void __user *__cu_to_r __asm__("$4");			\
-	register const void *__cu_from_r __asm__("$5");			\
-	register long __cu_len_r __asm__("$6");				\
-									\
-	__cu_to_r = (to);						\
-	__cu_from_r = (from);						\
-	__cu_len_r = (n);						\
-	__asm__ __volatile__(						\
-	__MODULE_JAL(__copy_user)					\
-	: "=r"(__cu_ret_r), "+r" (__cu_to_r),				\
-	  "+r" (__cu_from_r), "+r" (__cu_len_r)				\
-	:								\
-	: "$8", "$9", "$10", "$11", "$12", "$14", "$15", "$24", "$31",	\
-	  DADDI_SCRATCH, "memory");					\
-	__cu_ret_r;							\
-})
-
-#define __invoke_copy_to_kernel(to, from, n)				\
-	__invoke_copy_to_user(to, from, n)
-
-#endif
+extern size_t __copy_user(void *to, const void *from, size_t n,
+			  const void *from_end);
+extern size_t __copy_user_inatomic(void *to, const void *from, size_t n,
+				   const void *from_end);
+extern size_t __copy_to_user_eva(void *to, const void *from, size_t n,
+				 const void *from_end);
+extern size_t __copy_from_user_eva(void *to, const void *from, size_t n,
+				   const void *from_end);
+extern size_t __copy_user_inatomic_eva(void *to, const void *from, size_t n,
+				       const void *from_end);
+extern size_t __copy_in_user_eva(void *to, const void *from, size_t n,
+				 const void *from_end);
 
 /*
  * __copy_to_user: - Copy a block of data into user space, with less checking.
@@ -853,316 +844,92 @@ extern size_t __copy_user(void *__to, const void *__from, size_t __n,
  * Returns number of bytes that could not be copied.
  * On success, this will be zero.
  */
-#define __copy_to_user(to, from, n)					\
-({									\
-	void __user *__cu_to;						\
-	const void *__cu_from;						\
-	long __cu_len;							\
-									\
-	__cu_to = (to);							\
-	__cu_from = (from);						\
-	__cu_len = (n);							\
-									\
-	check_object_size(__cu_from, __cu_len, true);			\
-	might_fault();							\
-									\
-	if (eva_kernel_access())					\
-		__cu_len = __invoke_copy_to_kernel(__cu_to, __cu_from,	\
-						   __cu_len);		\
-	else								\
-		__cu_len = __invoke_copy_to_user(__cu_to, __cu_from,	\
-						 __cu_len);		\
-	__cu_len;							\
-})
-
-extern size_t __copy_user_inatomic(void *__to, const void *__from, size_t __n,
-				   const void *__from_end);
+static inline unsigned long __must_check
+__copy_to_user(void __user *to, const void *from, unsigned long n)
+{
+	check_object_size(from, n, true);
+	might_fault();
 
-#define __copy_to_user_inatomic(to, from, n)				\
-({									\
-	void __user *__cu_to;						\
-	const void *__cu_from;						\
-	long __cu_len;							\
-									\
-	__cu_to = (to);							\
-	__cu_from = (from);						\
-	__cu_len = (n);							\
-									\
-	check_object_size(__cu_from, __cu_len, true);			\
-									\
-	if (eva_kernel_access())					\
-		__cu_len = __invoke_copy_to_kernel(__cu_to, __cu_from,	\
-						   __cu_len);		\
-	else								\
-		__cu_len = __invoke_copy_to_user(__cu_to, __cu_from,	\
-						 __cu_len);		\
-	__cu_len;							\
-})
+	if (eva_user_access())
+		return __copy_to_user_eva(to, from, n, from + n);
 
-#define __copy_from_user_inatomic(to, from, n)				\
-({									\
-	void *__cu_to;							\
-	const void __user *__cu_from;					\
-	long __cu_len;							\
-									\
-	__cu_to = (to);							\
-	__cu_from = (from);						\
-	__cu_len = (n);							\
-									\
-	check_object_size(__cu_to, __cu_len, false);			\
-									\
-	if (eva_kernel_access())					\
-		__cu_len = __invoke_copy_from_kernel_inatomic(__cu_to,	\
-							      __cu_from,\
-							      __cu_len);\
-	else								\
-		__cu_len = __invoke_copy_from_user_inatomic(__cu_to,	\
-							    __cu_from,	\
-							    __cu_len);	\
-	__cu_len;							\
-})
+	return __copy_user(to, from, n, from + n);
+}
 
 /*
- * copy_to_user: - Copy a block of data into user space.
- * @to:	  Destination address, in user space.
- * @from: Source address, in kernel space.
+ * __copy_from_user: - Copy a block of data from user space, with less checking.
+ * @to:	  Destination address, in kernel space.
+ * @from: Source address, in user space.
  * @n:	  Number of bytes to copy.
  *
  * Context: User context only. This function may sleep if pagefaults are
  *          enabled.
  *
- * Copy data from kernel space to user space.
+ * Copy data from user space to kernel space.  Caller must check
+ * the specified block with access_ok() before calling this function.
  *
  * Returns number of bytes that could not be copied.
  * On success, this will be zero.
+ *
+ * If some data could not be copied, this function will pad the copied
+ * data to the requested size using zero bytes.
  */
-#define copy_to_user(to, from, n)					\
-({									\
-	void __user *__cu_to;						\
-	const void *__cu_from;						\
-	long __cu_len;							\
-									\
-	__cu_to = (to);							\
-	__cu_from = (from);						\
-	__cu_len = (n);							\
-									\
-	check_object_size(__cu_from, __cu_len, true);			\
-									\
-	if (eva_kernel_access()) {					\
-		__cu_len = __invoke_copy_to_kernel(__cu_to,		\
-						   __cu_from,		\
-						   __cu_len);		\
-	} else {							\
-		if (access_ok(VERIFY_WRITE, __cu_to, __cu_len)) {       \
-			might_fault();                                  \
-			__cu_len = __invoke_copy_to_user(__cu_to,	\
-							 __cu_from,	\
-							 __cu_len);     \
-		}							\
-	}								\
-	__cu_len;							\
-})
-
-#ifndef CONFIG_EVA
-
-#define __invoke_copy_from_user(to, from, n)				\
-({									\
-	register long __cu_ret_r __asm__("$2");				\
-	register void *__cu_to_r __asm__("$4");				\
-	register const void __user *__cu_from_r __asm__("$5");		\
-	register long __cu_len_r __asm__("$6");				\
-									\
-	__cu_to_r = (to);						\
-	__cu_from_r = (from);						\
-	__cu_len_r = (n);						\
-	__asm__ __volatile__(						\
-	".set\tnoreorder\n\t"						\
-	__MODULE_JAL(__copy_user)					\
-	".set\tnoat\n\t"						\
-	__UA_ADDU "\t$7, %1, %2\n\t"					\
-	".set\tat\n\t"							\
-	".set\treorder"							\
-	: "=r"(__cu_ret_r), "+r" (__cu_to_r),				\
-	  "+r" (__cu_from_r), "+r" (__cu_len_r)				\
-	:								\
-	: "$8", "$9", "$10", "$11", "$12", "$14", "$15", "$24", "$31",	\
-	  DADDI_SCRATCH, "memory");					\
-	__cu_ret_r;							\
-})
-
-#define __invoke_copy_from_kernel(to, from, n)				\
-	__invoke_copy_from_user(to, from, n)
-
-/* For userland <-> userland operations */
-#define ___invoke_copy_in_user(to, from, n)				\
-	__invoke_copy_from_user(to, from, n)
-
-/* For kernel <-> kernel operations */
-#define ___invoke_copy_in_kernel(to, from, n)				\
-	__invoke_copy_from_user(to, from, n)
-
-#define __invoke_copy_from_user_inatomic(to, from, n)			\
-({									\
-	register long __cu_ret_r __asm__("$2");				\
-	register void *__cu_to_r __asm__("$4");				\
-	register const void __user *__cu_from_r __asm__("$5");		\
-	register long __cu_len_r __asm__("$6");				\
-									\
-	__cu_to_r = (to);						\
-	__cu_from_r = (from);						\
-	__cu_len_r = (n);						\
-	__asm__ __volatile__(						\
-	".set\tnoreorder\n\t"						\
-	__MODULE_JAL(__copy_user_inatomic)				\
-	".set\tnoat\n\t"						\
-	__UA_ADDU "\t$7, %1, %2\n\t"					\
-	".set\tat\n\t"							\
-	".set\treorder"							\
-	: "=r"(__cu_ret_r), "+r" (__cu_to_r),				\
-	  "+r" (__cu_from_r), "+r" (__cu_len_r)				\
-	:								\
-	: "$8", "$9", "$10", "$11", "$12", "$14", "$15", "$24", "$31",	\
-	  DADDI_SCRATCH, "memory");					\
-	__cu_ret_r;							\
-})
-
-#define __invoke_copy_from_kernel_inatomic(to, from, n)			\
-	__invoke_copy_from_user_inatomic(to, from, n)			\
-
-#else
-
-/* EVA specific functions */
-
-extern size_t __copy_user_inatomic_eva(void *__to, const void *__from,
-				       size_t __n, const void *__from_end);
-extern size_t __copy_from_user_eva(void *__to, const void *__from,
-				   size_t __n, const void *__from_end);
-extern size_t __copy_to_user_eva(void *__to, const void *__from,
-				 size_t __n, const void *__from_end);
-extern size_t __copy_in_user_eva(void *__to, const void *__from, size_t __n,
-				 const void *__from_end);
-
-#define __invoke_copy_from_user_eva_generic(to, from, n, func_ptr)	\
-({									\
-	register long __cu_ret_r __asm__("$2");				\
-	register void *__cu_to_r __asm__("$4");				\
-	register const void __user *__cu_from_r __asm__("$5");		\
-	register long __cu_len_r __asm__("$6");				\
-									\
-	__cu_to_r = (to);						\
-	__cu_from_r = (from);						\
-	__cu_len_r = (n);						\
-	__asm__ __volatile__(						\
-	".set\tnoreorder\n\t"						\
-	__MODULE_JAL(func_ptr)						\
-	".set\tnoat\n\t"						\
-	__UA_ADDU "\t$7, %1, %2\n\t"					\
-	".set\tat\n\t"							\
-	".set\treorder"							\
-	: "=r"(__cu_ret_r), "+r" (__cu_to_r),				\
-	  "+r" (__cu_from_r), "+r" (__cu_len_r)				\
-	:								\
-	: "$8", "$9", "$10", "$11", "$12", "$14", "$15", "$24", "$31",	\
-	  DADDI_SCRATCH, "memory");					\
-	__cu_ret_r;							\
-})
-
-#define __invoke_copy_to_user_eva_generic(to, from, n, func_ptr)	\
-({									\
-	register long __cu_ret_r __asm__("$2");				\
-	register void *__cu_to_r __asm__("$4");				\
-	register const void __user *__cu_from_r __asm__("$5");		\
-	register long __cu_len_r __asm__("$6");				\
-									\
-	__cu_to_r = (to);						\
-	__cu_from_r = (from);						\
-	__cu_len_r = (n);						\
-	__asm__ __volatile__(						\
-	__MODULE_JAL(func_ptr)						\
-	: "=r"(__cu_ret_r), "+r" (__cu_to_r),				\
-	  "+r" (__cu_from_r), "+r" (__cu_len_r)				\
-	:								\
-	: "$8", "$9", "$10", "$11", "$12", "$14", "$15", "$24", "$31",	\
-	  DADDI_SCRATCH, "memory");					\
-	__cu_ret_r;							\
-})
-
-/*
- * Source or destination address is in userland. We need to go through
- * the TLB
- */
-#define __invoke_copy_from_user(to, from, n)				\
-	__invoke_copy_from_user_eva_generic(to, from, n, __copy_from_user_eva)
+static inline unsigned long __must_check
+__copy_from_user(void *to, const void __user *from, unsigned long n)
+{
+	check_object_size(to, n, false);
+	might_fault();
 
-#define __invoke_copy_from_user_inatomic(to, from, n)			\
-	__invoke_copy_from_user_eva_generic(to, from, n,		\
-					    __copy_user_inatomic_eva)
+	if (eva_user_access())
+		return __copy_from_user_eva(to, from, n, from + n);
 
-#define __invoke_copy_to_user(to, from, n)				\
-	__invoke_copy_to_user_eva_generic(to, from, n, __copy_to_user_eva)
+	return __copy_user(to, from, n, from + n);
+}
 
-#define ___invoke_copy_in_user(to, from, n)				\
-	__invoke_copy_from_user_eva_generic(to, from, n, __copy_in_user_eva)
+static inline unsigned long __must_check
+__copy_to_user_inatomic(void __user *to, const void *from, unsigned long n)
+{
+	check_object_size(from, n, true);
 
-/*
- * Source or destination address in the kernel. We are not going through
- * the TLB
- */
-#define __invoke_copy_from_kernel(to, from, n)				\
-	__invoke_copy_from_user_eva_generic(to, from, n, __copy_user)
+	if (eva_user_access())
+		return __copy_to_user_eva(to, from, n, from + n);
 
-#define __invoke_copy_from_kernel_inatomic(to, from, n)			\
-	__invoke_copy_from_user_eva_generic(to, from, n, __copy_user_inatomic)
+	return __copy_user(to, from, n, from + n);
+}
 
-#define __invoke_copy_to_kernel(to, from, n)				\
-	__invoke_copy_to_user_eva_generic(to, from, n, __copy_user)
+static inline unsigned long __must_check
+__copy_from_user_inatomic(void *to, const void __user *from, unsigned long n)
+{
+	check_object_size(to, n, false);
 
-#define ___invoke_copy_in_kernel(to, from, n)				\
-	__invoke_copy_from_user_eva_generic(to, from, n, __copy_user)
+	if (eva_user_access())
+		return __copy_user_inatomic_eva(to, from, n, from + n);
 
-#endif /* CONFIG_EVA */
+	return __copy_user_inatomic(to, from, n, from + n);
+}
 
 /*
- * __copy_from_user: - Copy a block of data from user space, with less checking.
- * @to:	  Destination address, in kernel space.
- * @from: Source address, in user space.
+ * copy_to_user: - Copy a block of data into user space.
+ * @to:	  Destination address, in user space.
+ * @from: Source address, in kernel space.
  * @n:	  Number of bytes to copy.
  *
  * Context: User context only. This function may sleep if pagefaults are
  *          enabled.
  *
- * Copy data from user space to kernel space.  Caller must check
- * the specified block with access_ok() before calling this function.
+ * Copy data from kernel space to user space.
  *
  * Returns number of bytes that could not be copied.
  * On success, this will be zero.
- *
- * If some data could not be copied, this function will pad the copied
- * data to the requested size using zero bytes.
  */
-#define __copy_from_user(to, from, n)					\
-({									\
-	void *__cu_to;							\
-	const void __user *__cu_from;					\
-	long __cu_len;							\
-									\
-	__cu_to = (to);							\
-	__cu_from = (from);						\
-	__cu_len = (n);							\
-									\
-	check_object_size(__cu_to, __cu_len, false);			\
-									\
-	if (eva_kernel_access()) {					\
-		__cu_len = __invoke_copy_from_kernel(__cu_to,		\
-						     __cu_from,		\
-						     __cu_len);		\
-	} else {							\
-		might_fault();						\
-		__cu_len = __invoke_copy_from_user(__cu_to, __cu_from,	\
-						   __cu_len);		\
-	}								\
-	__cu_len;							\
-})
+static inline unsigned long __must_check
+copy_to_user(void __user *to, const void *from, unsigned long n)
+{
+	if (!access_ok(VERIFY_WRITE, to, n))
+		return n;
+
+	return __copy_to_user(to, from, n);
+}
 
 /*
  * copy_from_user: - Copy a block of data from user space.
@@ -1181,78 +948,38 @@ extern size_t __copy_in_user_eva(void *__to, const void *__from, size_t __n,
  * If some data could not be copied, this function will pad the copied
  * data to the requested size using zero bytes.
  */
-#define copy_from_user(to, from, n)					\
-({									\
-	void *__cu_to;							\
-	const void __user *__cu_from;					\
-	long __cu_len;							\
-									\
-	__cu_to = (to);							\
-	__cu_from = (from);						\
-	__cu_len = (n);							\
-									\
-	check_object_size(__cu_to, __cu_len, false);			\
-									\
-	if (eva_kernel_access()) {					\
-		__cu_len = __invoke_copy_from_kernel(__cu_to,		\
-						     __cu_from,		\
-						     __cu_len);		\
-	} else {							\
-		if (access_ok(VERIFY_READ, __cu_from, __cu_len)) {	\
-			might_fault();                                  \
-			__cu_len = __invoke_copy_from_user(__cu_to,	\
-							   __cu_from,	\
-							   __cu_len);   \
-		} else {						\
-			memset(__cu_to, 0, __cu_len);			\
-		}							\
-	}								\
-	__cu_len;							\
-})
+static inline unsigned long __must_check
+copy_from_user(void *to, const void __user *from, unsigned long n)
+{
+	if (!access_ok(VERIFY_READ, from, n)) {
+		memset(to, 0, n);
+		return n;
+	}
 
-#define __copy_in_user(to, from, n)					\
-({									\
-	void __user *__cu_to;						\
-	const void __user *__cu_from;					\
-	long __cu_len;							\
-									\
-	__cu_to = (to);							\
-	__cu_from = (from);						\
-	__cu_len = (n);							\
-	if (eva_kernel_access()) {					\
-		__cu_len = ___invoke_copy_in_kernel(__cu_to, __cu_from,	\
-						    __cu_len);		\
-	} else {							\
-		might_fault();						\
-		__cu_len = ___invoke_copy_in_user(__cu_to, __cu_from,	\
-						  __cu_len);		\
-	}								\
-	__cu_len;							\
-})
+	return __copy_from_user(to, from, n);
+}
 
-#define copy_in_user(to, from, n)					\
-({									\
-	void __user *__cu_to;						\
-	const void __user *__cu_from;					\
-	long __cu_len;							\
-									\
-	__cu_to = (to);							\
-	__cu_from = (from);						\
-	__cu_len = (n);							\
-	if (eva_kernel_access()) {					\
-		__cu_len = ___invoke_copy_in_kernel(__cu_to,__cu_from,	\
-						    __cu_len);		\
-	} else {							\
-		if (likely(access_ok(VERIFY_READ, __cu_from, __cu_len) &&\
-			   access_ok(VERIFY_WRITE, __cu_to, __cu_len))) {\
-			might_fault();					\
-			__cu_len = ___invoke_copy_in_user(__cu_to,	\
-							  __cu_from,	\
-							  __cu_len);	\
-		}							\
-	}								\
-	__cu_len;							\
-})
+static inline unsigned long __must_check
+__copy_in_user(void __user *to, const void __user *from, unsigned long n)
+{
+	might_fault();
+
+	if (eva_user_access())
+		return __copy_in_user_eva(to, from, n, from + n);
+
+	return __copy_user(to, from, n, from + n);
+}
+
+static inline unsigned long __must_check
+copy_in_user(void __user *to, const void __user *from, unsigned long n)
+{
+	if (unlikely(!access_ok(VERIFY_READ, from, n)))
+		return n;
+	if (unlikely(!access_ok(VERIFY_WRITE, to, n)))
+		return n;
+
+	return __copy_in_user(to, from, n);
+}
 
 /*
  * __clear_user: - Zero a block of memory in user space, with less checking.
diff --git a/arch/mips/lib/memcpy.S b/arch/mips/lib/memcpy.S
index 5af9f03..dbd7013 100644
--- a/arch/mips/lib/memcpy.S
+++ b/arch/mips/lib/memcpy.S
@@ -56,8 +56,6 @@
  *     - src is readable  (no exceptions when reading src)
  *   copy_from_user
  *     - dst is writable  (no exceptions when writing dst)
- * __copy_user uses a non-standard calling convention; see
- * include/asm-mips/uaccess.h
  *
  * When an exception happens on a load, the handler must
  # ensure that all of the destination buffer is overwritten to prevent
@@ -159,20 +157,6 @@
 #define NBYTES 8
 #define LOG_NBYTES 3
 
-/*
- * As we are sharing code base with the mips32 tree (which use the o32 ABI
- * register definitions). We need to redefine the register definitions from
- * the n64 ABI register naming to the o32 ABI register naming.
- */
-#undef t0
-#undef t1
-#undef t2
-#undef t3
-#define t0	$8
-#define t1	$9
-#define t2	$10
-#define t3	$11
-
 #else
 
 #define LOADK lw /* No exception */
-- 
2.10.2

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH 7/7] MIPS: uaccess: Use standard __user_copy* function calls
@ 2016-11-07 11:18   ` Paul Burton
  0 siblings, 0 replies; 20+ messages in thread
From: Paul Burton @ 2016-11-07 11:18 UTC (permalink / raw)
  To: linux-mips; +Cc: Ralf Baechle, Paul Burton

The __copy_user* functions now almost follow the standard calling
convention, the only exception being the manually redefined argument
registers. Remove those redefinitions so that we use the argument
registers matching the standard calling convention for the kernel
build, and call __copy_user* with standard C function calls. This
allows us to remove the assembly invoke macros & their manual lists of
clobbered registers, simplifying & tidying up the code in
asm/uaccess.h significantly.

This does have a cost in that the compiler will now have to presume that
all registers that are call-clobbered in the standard calling convention
are clobbered by calls to __copy_user*. In practice this doesn't seem
to matter & this patch shaves ~850 bytes of code from a 64r6el generic
kernel:

  $ ./scripts/bloat-o-meter vmlinux-pre vmlinux-post
  add/remove: 7/7 grow/shrink: 161/161 up/down: 6420/-7270 (-850)
  function                                     old     new   delta
  ethtool_get_strings                            -     692    +692
  ethtool_get_dump_data                          -     568    +568
  ...
  ethtool_self_test                            540       -    -540
  ethtool_get_phy_stats.isra                   564       -    -564
  Total: Before=7006717, After=7005867, chg -0.01%

Signed-off-by: Paul Burton <paul.burton@imgtec.com>

---

 arch/mips/cavium-octeon/octeon-memcpy.S |  16 --
 arch/mips/include/asm/uaccess.h         | 493 +++++++-------------------------
 arch/mips/lib/memcpy.S                  |  16 --
 3 files changed, 110 insertions(+), 415 deletions(-)

diff --git a/arch/mips/cavium-octeon/octeon-memcpy.S b/arch/mips/cavium-octeon/octeon-memcpy.S
index 9316ab1..e3f6de1 100644
--- a/arch/mips/cavium-octeon/octeon-memcpy.S
+++ b/arch/mips/cavium-octeon/octeon-memcpy.S
@@ -43,8 +43,6 @@
  *     - src is readable  (no exceptions when reading src)
  *   copy_from_user
  *     - dst is writable  (no exceptions when writing dst)
- * __copy_user uses a non-standard calling convention; see
- * arch/mips/include/asm/uaccess.h
  *
  * When an exception happens on a load, the handler must
  # ensure that all of the destination buffer is overwritten to prevent
@@ -99,20 +97,6 @@
 #define NBYTES 8
 #define LOG_NBYTES 3
 
-/*
- * As we are sharing code base with the mips32 tree (which use the o32 ABI
- * register definitions). We need to redefine the register definitions from
- * the n64 ABI register naming to the o32 ABI register naming.
- */
-#undef t0
-#undef t1
-#undef t2
-#undef t3
-#define t0	$8
-#define t1	$9
-#define t2	$10
-#define t3	$11
-
 #ifdef CONFIG_CPU_LITTLE_ENDIAN
 #define LDFIRST LOADR
 #define LDREST	LOADL
diff --git a/arch/mips/include/asm/uaccess.h b/arch/mips/include/asm/uaccess.h
index 562ad49..2e13c19 100644
--- a/arch/mips/include/asm/uaccess.h
+++ b/arch/mips/include/asm/uaccess.h
@@ -96,6 +96,20 @@ static inline bool eva_kernel_access(void)
 	return segment_eq(get_fs(), get_ds());
 }
 
+/**
+ * eva_user_access() - determine whether access should use EVA instructions
+ *
+ * Determines whether memory accesses should be performed using EVA memory
+ * access instructions - that is, whether to access the user address space on
+ * an EVA system.
+ *
+ * Return: true if user memory access on an EVA system, else false
+ */
+static inline bool eva_user_access(void)
+{
+	return IS_ENABLED(CONFIG_EVA) && !eva_kernel_access();
+}
+
 /*
  * Is a address valid? This does a straightforward calculation rather
  * than tests.
@@ -802,41 +816,18 @@ extern void __put_user_unaligned_unknown(void);
 	"jal\t" #destination "\n\t"
 #endif
 
-#if defined(CONFIG_CPU_DADDI_WORKAROUNDS) || (defined(CONFIG_EVA) &&	\
-					      defined(CONFIG_CPU_HAS_PREFETCH))
-#define DADDI_SCRATCH "$3"
-#else
-#define DADDI_SCRATCH "$0"
-#endif
-
-extern size_t __copy_user(void *__to, const void *__from, size_t __n,
-			  const void *__from_end);
-
-#ifndef CONFIG_EVA
-#define __invoke_copy_to_user(to, from, n)				\
-({									\
-	register long __cu_ret_r __asm__("$2");				\
-	register void __user *__cu_to_r __asm__("$4");			\
-	register const void *__cu_from_r __asm__("$5");			\
-	register long __cu_len_r __asm__("$6");				\
-									\
-	__cu_to_r = (to);						\
-	__cu_from_r = (from);						\
-	__cu_len_r = (n);						\
-	__asm__ __volatile__(						\
-	__MODULE_JAL(__copy_user)					\
-	: "=r"(__cu_ret_r), "+r" (__cu_to_r),				\
-	  "+r" (__cu_from_r), "+r" (__cu_len_r)				\
-	:								\
-	: "$8", "$9", "$10", "$11", "$12", "$14", "$15", "$24", "$31",	\
-	  DADDI_SCRATCH, "memory");					\
-	__cu_ret_r;							\
-})
-
-#define __invoke_copy_to_kernel(to, from, n)				\
-	__invoke_copy_to_user(to, from, n)
-
-#endif
+extern size_t __copy_user(void *to, const void *from, size_t n,
+			  const void *from_end);
+extern size_t __copy_user_inatomic(void *to, const void *from, size_t n,
+				   const void *from_end);
+extern size_t __copy_to_user_eva(void *to, const void *from, size_t n,
+				 const void *from_end);
+extern size_t __copy_from_user_eva(void *to, const void *from, size_t n,
+				   const void *from_end);
+extern size_t __copy_user_inatomic_eva(void *to, const void *from, size_t n,
+				       const void *from_end);
+extern size_t __copy_in_user_eva(void *to, const void *from, size_t n,
+				 const void *from_end);
 
 /*
  * __copy_to_user: - Copy a block of data into user space, with less checking.
@@ -853,316 +844,92 @@ extern size_t __copy_user(void *__to, const void *__from, size_t __n,
  * Returns number of bytes that could not be copied.
  * On success, this will be zero.
  */
-#define __copy_to_user(to, from, n)					\
-({									\
-	void __user *__cu_to;						\
-	const void *__cu_from;						\
-	long __cu_len;							\
-									\
-	__cu_to = (to);							\
-	__cu_from = (from);						\
-	__cu_len = (n);							\
-									\
-	check_object_size(__cu_from, __cu_len, true);			\
-	might_fault();							\
-									\
-	if (eva_kernel_access())					\
-		__cu_len = __invoke_copy_to_kernel(__cu_to, __cu_from,	\
-						   __cu_len);		\
-	else								\
-		__cu_len = __invoke_copy_to_user(__cu_to, __cu_from,	\
-						 __cu_len);		\
-	__cu_len;							\
-})
-
-extern size_t __copy_user_inatomic(void *__to, const void *__from, size_t __n,
-				   const void *__from_end);
+static inline unsigned long __must_check
+__copy_to_user(void __user *to, const void *from, unsigned long n)
+{
+	check_object_size(from, n, true);
+	might_fault();
 
-#define __copy_to_user_inatomic(to, from, n)				\
-({									\
-	void __user *__cu_to;						\
-	const void *__cu_from;						\
-	long __cu_len;							\
-									\
-	__cu_to = (to);							\
-	__cu_from = (from);						\
-	__cu_len = (n);							\
-									\
-	check_object_size(__cu_from, __cu_len, true);			\
-									\
-	if (eva_kernel_access())					\
-		__cu_len = __invoke_copy_to_kernel(__cu_to, __cu_from,	\
-						   __cu_len);		\
-	else								\
-		__cu_len = __invoke_copy_to_user(__cu_to, __cu_from,	\
-						 __cu_len);		\
-	__cu_len;							\
-})
+	if (eva_user_access())
+		return __copy_to_user_eva(to, from, n, from + n);
 
-#define __copy_from_user_inatomic(to, from, n)				\
-({									\
-	void *__cu_to;							\
-	const void __user *__cu_from;					\
-	long __cu_len;							\
-									\
-	__cu_to = (to);							\
-	__cu_from = (from);						\
-	__cu_len = (n);							\
-									\
-	check_object_size(__cu_to, __cu_len, false);			\
-									\
-	if (eva_kernel_access())					\
-		__cu_len = __invoke_copy_from_kernel_inatomic(__cu_to,	\
-							      __cu_from,\
-							      __cu_len);\
-	else								\
-		__cu_len = __invoke_copy_from_user_inatomic(__cu_to,	\
-							    __cu_from,	\
-							    __cu_len);	\
-	__cu_len;							\
-})
+	return __copy_user(to, from, n, from + n);
+}
 
 /*
- * copy_to_user: - Copy a block of data into user space.
- * @to:	  Destination address, in user space.
- * @from: Source address, in kernel space.
+ * __copy_from_user: - Copy a block of data from user space, with less checking.
+ * @to:	  Destination address, in kernel space.
+ * @from: Source address, in user space.
  * @n:	  Number of bytes to copy.
  *
  * Context: User context only. This function may sleep if pagefaults are
  *          enabled.
  *
- * Copy data from kernel space to user space.
+ * Copy data from user space to kernel space.  Caller must check
+ * the specified block with access_ok() before calling this function.
  *
  * Returns number of bytes that could not be copied.
  * On success, this will be zero.
+ *
+ * If some data could not be copied, this function will pad the copied
+ * data to the requested size using zero bytes.
  */
-#define copy_to_user(to, from, n)					\
-({									\
-	void __user *__cu_to;						\
-	const void *__cu_from;						\
-	long __cu_len;							\
-									\
-	__cu_to = (to);							\
-	__cu_from = (from);						\
-	__cu_len = (n);							\
-									\
-	check_object_size(__cu_from, __cu_len, true);			\
-									\
-	if (eva_kernel_access()) {					\
-		__cu_len = __invoke_copy_to_kernel(__cu_to,		\
-						   __cu_from,		\
-						   __cu_len);		\
-	} else {							\
-		if (access_ok(VERIFY_WRITE, __cu_to, __cu_len)) {       \
-			might_fault();                                  \
-			__cu_len = __invoke_copy_to_user(__cu_to,	\
-							 __cu_from,	\
-							 __cu_len);     \
-		}							\
-	}								\
-	__cu_len;							\
-})
-
-#ifndef CONFIG_EVA
-
-#define __invoke_copy_from_user(to, from, n)				\
-({									\
-	register long __cu_ret_r __asm__("$2");				\
-	register void *__cu_to_r __asm__("$4");				\
-	register const void __user *__cu_from_r __asm__("$5");		\
-	register long __cu_len_r __asm__("$6");				\
-									\
-	__cu_to_r = (to);						\
-	__cu_from_r = (from);						\
-	__cu_len_r = (n);						\
-	__asm__ __volatile__(						\
-	".set\tnoreorder\n\t"						\
-	__MODULE_JAL(__copy_user)					\
-	".set\tnoat\n\t"						\
-	__UA_ADDU "\t$7, %1, %2\n\t"					\
-	".set\tat\n\t"							\
-	".set\treorder"							\
-	: "=r"(__cu_ret_r), "+r" (__cu_to_r),				\
-	  "+r" (__cu_from_r), "+r" (__cu_len_r)				\
-	:								\
-	: "$8", "$9", "$10", "$11", "$12", "$14", "$15", "$24", "$31",	\
-	  DADDI_SCRATCH, "memory");					\
-	__cu_ret_r;							\
-})
-
-#define __invoke_copy_from_kernel(to, from, n)				\
-	__invoke_copy_from_user(to, from, n)
-
-/* For userland <-> userland operations */
-#define ___invoke_copy_in_user(to, from, n)				\
-	__invoke_copy_from_user(to, from, n)
-
-/* For kernel <-> kernel operations */
-#define ___invoke_copy_in_kernel(to, from, n)				\
-	__invoke_copy_from_user(to, from, n)
-
-#define __invoke_copy_from_user_inatomic(to, from, n)			\
-({									\
-	register long __cu_ret_r __asm__("$2");				\
-	register void *__cu_to_r __asm__("$4");				\
-	register const void __user *__cu_from_r __asm__("$5");		\
-	register long __cu_len_r __asm__("$6");				\
-									\
-	__cu_to_r = (to);						\
-	__cu_from_r = (from);						\
-	__cu_len_r = (n);						\
-	__asm__ __volatile__(						\
-	".set\tnoreorder\n\t"						\
-	__MODULE_JAL(__copy_user_inatomic)				\
-	".set\tnoat\n\t"						\
-	__UA_ADDU "\t$7, %1, %2\n\t"					\
-	".set\tat\n\t"							\
-	".set\treorder"							\
-	: "=r"(__cu_ret_r), "+r" (__cu_to_r),				\
-	  "+r" (__cu_from_r), "+r" (__cu_len_r)				\
-	:								\
-	: "$8", "$9", "$10", "$11", "$12", "$14", "$15", "$24", "$31",	\
-	  DADDI_SCRATCH, "memory");					\
-	__cu_ret_r;							\
-})
-
-#define __invoke_copy_from_kernel_inatomic(to, from, n)			\
-	__invoke_copy_from_user_inatomic(to, from, n)			\
-
-#else
-
-/* EVA specific functions */
-
-extern size_t __copy_user_inatomic_eva(void *__to, const void *__from,
-				       size_t __n, const void *__from_end);
-extern size_t __copy_from_user_eva(void *__to, const void *__from,
-				   size_t __n, const void *__from_end);
-extern size_t __copy_to_user_eva(void *__to, const void *__from,
-				 size_t __n, const void *__from_end);
-extern size_t __copy_in_user_eva(void *__to, const void *__from, size_t __n,
-				 const void *__from_end);
-
-#define __invoke_copy_from_user_eva_generic(to, from, n, func_ptr)	\
-({									\
-	register long __cu_ret_r __asm__("$2");				\
-	register void *__cu_to_r __asm__("$4");				\
-	register const void __user *__cu_from_r __asm__("$5");		\
-	register long __cu_len_r __asm__("$6");				\
-									\
-	__cu_to_r = (to);						\
-	__cu_from_r = (from);						\
-	__cu_len_r = (n);						\
-	__asm__ __volatile__(						\
-	".set\tnoreorder\n\t"						\
-	__MODULE_JAL(func_ptr)						\
-	".set\tnoat\n\t"						\
-	__UA_ADDU "\t$7, %1, %2\n\t"					\
-	".set\tat\n\t"							\
-	".set\treorder"							\
-	: "=r"(__cu_ret_r), "+r" (__cu_to_r),				\
-	  "+r" (__cu_from_r), "+r" (__cu_len_r)				\
-	:								\
-	: "$8", "$9", "$10", "$11", "$12", "$14", "$15", "$24", "$31",	\
-	  DADDI_SCRATCH, "memory");					\
-	__cu_ret_r;							\
-})
-
-#define __invoke_copy_to_user_eva_generic(to, from, n, func_ptr)	\
-({									\
-	register long __cu_ret_r __asm__("$2");				\
-	register void *__cu_to_r __asm__("$4");				\
-	register const void __user *__cu_from_r __asm__("$5");		\
-	register long __cu_len_r __asm__("$6");				\
-									\
-	__cu_to_r = (to);						\
-	__cu_from_r = (from);						\
-	__cu_len_r = (n);						\
-	__asm__ __volatile__(						\
-	__MODULE_JAL(func_ptr)						\
-	: "=r"(__cu_ret_r), "+r" (__cu_to_r),				\
-	  "+r" (__cu_from_r), "+r" (__cu_len_r)				\
-	:								\
-	: "$8", "$9", "$10", "$11", "$12", "$14", "$15", "$24", "$31",	\
-	  DADDI_SCRATCH, "memory");					\
-	__cu_ret_r;							\
-})
-
-/*
- * Source or destination address is in userland. We need to go through
- * the TLB
- */
-#define __invoke_copy_from_user(to, from, n)				\
-	__invoke_copy_from_user_eva_generic(to, from, n, __copy_from_user_eva)
+static inline unsigned long __must_check
+__copy_from_user(void *to, const void __user *from, unsigned long n)
+{
+	check_object_size(to, n, false);
+	might_fault();
 
-#define __invoke_copy_from_user_inatomic(to, from, n)			\
-	__invoke_copy_from_user_eva_generic(to, from, n,		\
-					    __copy_user_inatomic_eva)
+	if (eva_user_access())
+		return __copy_from_user_eva(to, from, n, from + n);
 
-#define __invoke_copy_to_user(to, from, n)				\
-	__invoke_copy_to_user_eva_generic(to, from, n, __copy_to_user_eva)
+	return __copy_user(to, from, n, from + n);
+}
 
-#define ___invoke_copy_in_user(to, from, n)				\
-	__invoke_copy_from_user_eva_generic(to, from, n, __copy_in_user_eva)
+static inline unsigned long __must_check
+__copy_to_user_inatomic(void __user *to, const void *from, unsigned long n)
+{
+	check_object_size(from, n, true);
 
-/*
- * Source or destination address in the kernel. We are not going through
- * the TLB
- */
-#define __invoke_copy_from_kernel(to, from, n)				\
-	__invoke_copy_from_user_eva_generic(to, from, n, __copy_user)
+	if (eva_user_access())
+		return __copy_to_user_eva(to, from, n, from + n);
 
-#define __invoke_copy_from_kernel_inatomic(to, from, n)			\
-	__invoke_copy_from_user_eva_generic(to, from, n, __copy_user_inatomic)
+	return __copy_user(to, from, n, from + n);
+}
 
-#define __invoke_copy_to_kernel(to, from, n)				\
-	__invoke_copy_to_user_eva_generic(to, from, n, __copy_user)
+static inline unsigned long __must_check
+__copy_from_user_inatomic(void *to, const void __user *from, unsigned long n)
+{
+	check_object_size(to, n, false);
 
-#define ___invoke_copy_in_kernel(to, from, n)				\
-	__invoke_copy_from_user_eva_generic(to, from, n, __copy_user)
+	if (eva_user_access())
+		return __copy_user_inatomic_eva(to, from, n, from + n);
 
-#endif /* CONFIG_EVA */
+	return __copy_user_inatomic(to, from, n, from + n);
+}
 
 /*
- * __copy_from_user: - Copy a block of data from user space, with less checking.
- * @to:	  Destination address, in kernel space.
- * @from: Source address, in user space.
+ * copy_to_user: - Copy a block of data into user space.
+ * @to:	  Destination address, in user space.
+ * @from: Source address, in kernel space.
  * @n:	  Number of bytes to copy.
  *
  * Context: User context only. This function may sleep if pagefaults are
  *          enabled.
  *
- * Copy data from user space to kernel space.  Caller must check
- * the specified block with access_ok() before calling this function.
+ * Copy data from kernel space to user space.
  *
  * Returns number of bytes that could not be copied.
  * On success, this will be zero.
- *
- * If some data could not be copied, this function will pad the copied
- * data to the requested size using zero bytes.
  */
-#define __copy_from_user(to, from, n)					\
-({									\
-	void *__cu_to;							\
-	const void __user *__cu_from;					\
-	long __cu_len;							\
-									\
-	__cu_to = (to);							\
-	__cu_from = (from);						\
-	__cu_len = (n);							\
-									\
-	check_object_size(__cu_to, __cu_len, false);			\
-									\
-	if (eva_kernel_access()) {					\
-		__cu_len = __invoke_copy_from_kernel(__cu_to,		\
-						     __cu_from,		\
-						     __cu_len);		\
-	} else {							\
-		might_fault();						\
-		__cu_len = __invoke_copy_from_user(__cu_to, __cu_from,	\
-						   __cu_len);		\
-	}								\
-	__cu_len;							\
-})
+static inline unsigned long __must_check
+copy_to_user(void __user *to, const void *from, unsigned long n)
+{
+	if (!access_ok(VERIFY_WRITE, to, n))
+		return n;
+
+	return __copy_to_user(to, from, n);
+}
 
 /*
  * copy_from_user: - Copy a block of data from user space.
@@ -1181,78 +948,38 @@ extern size_t __copy_in_user_eva(void *__to, const void *__from, size_t __n,
  * If some data could not be copied, this function will pad the copied
  * data to the requested size using zero bytes.
  */
-#define copy_from_user(to, from, n)					\
-({									\
-	void *__cu_to;							\
-	const void __user *__cu_from;					\
-	long __cu_len;							\
-									\
-	__cu_to = (to);							\
-	__cu_from = (from);						\
-	__cu_len = (n);							\
-									\
-	check_object_size(__cu_to, __cu_len, false);			\
-									\
-	if (eva_kernel_access()) {					\
-		__cu_len = __invoke_copy_from_kernel(__cu_to,		\
-						     __cu_from,		\
-						     __cu_len);		\
-	} else {							\
-		if (access_ok(VERIFY_READ, __cu_from, __cu_len)) {	\
-			might_fault();                                  \
-			__cu_len = __invoke_copy_from_user(__cu_to,	\
-							   __cu_from,	\
-							   __cu_len);   \
-		} else {						\
-			memset(__cu_to, 0, __cu_len);			\
-		}							\
-	}								\
-	__cu_len;							\
-})
+static inline unsigned long __must_check
+copy_from_user(void *to, const void __user *from, unsigned long n)
+{
+	if (!access_ok(VERIFY_READ, from, n)) {
+		memset(to, 0, n);
+		return n;
+	}
 
-#define __copy_in_user(to, from, n)					\
-({									\
-	void __user *__cu_to;						\
-	const void __user *__cu_from;					\
-	long __cu_len;							\
-									\
-	__cu_to = (to);							\
-	__cu_from = (from);						\
-	__cu_len = (n);							\
-	if (eva_kernel_access()) {					\
-		__cu_len = ___invoke_copy_in_kernel(__cu_to, __cu_from,	\
-						    __cu_len);		\
-	} else {							\
-		might_fault();						\
-		__cu_len = ___invoke_copy_in_user(__cu_to, __cu_from,	\
-						  __cu_len);		\
-	}								\
-	__cu_len;							\
-})
+	return __copy_from_user(to, from, n);
+}
 
-#define copy_in_user(to, from, n)					\
-({									\
-	void __user *__cu_to;						\
-	const void __user *__cu_from;					\
-	long __cu_len;							\
-									\
-	__cu_to = (to);							\
-	__cu_from = (from);						\
-	__cu_len = (n);							\
-	if (eva_kernel_access()) {					\
-		__cu_len = ___invoke_copy_in_kernel(__cu_to,__cu_from,	\
-						    __cu_len);		\
-	} else {							\
-		if (likely(access_ok(VERIFY_READ, __cu_from, __cu_len) &&\
-			   access_ok(VERIFY_WRITE, __cu_to, __cu_len))) {\
-			might_fault();					\
-			__cu_len = ___invoke_copy_in_user(__cu_to,	\
-							  __cu_from,	\
-							  __cu_len);	\
-		}							\
-	}								\
-	__cu_len;							\
-})
+static inline unsigned long __must_check
+__copy_in_user(void __user *to, const void __user *from, unsigned long n)
+{
+	might_fault();
+
+	if (eva_user_access())
+		return __copy_in_user_eva(to, from, n, from + n);
+
+	return __copy_user(to, from, n, from + n);
+}
+
+static inline unsigned long __must_check
+copy_in_user(void __user *to, const void __user *from, unsigned long n)
+{
+	if (unlikely(!access_ok(VERIFY_READ, from, n)))
+		return n;
+	if (unlikely(!access_ok(VERIFY_WRITE, to, n)))
+		return n;
+
+	return __copy_in_user(to, from, n);
+}
 
 /*
  * __clear_user: - Zero a block of memory in user space, with less checking.
diff --git a/arch/mips/lib/memcpy.S b/arch/mips/lib/memcpy.S
index 5af9f03..dbd7013 100644
--- a/arch/mips/lib/memcpy.S
+++ b/arch/mips/lib/memcpy.S
@@ -56,8 +56,6 @@
  *     - src is readable  (no exceptions when reading src)
  *   copy_from_user
  *     - dst is writable  (no exceptions when writing dst)
- * __copy_user uses a non-standard calling convention; see
- * include/asm-mips/uaccess.h
  *
  * When an exception happens on a load, the handler must
  # ensure that all of the destination buffer is overwritten to prevent
@@ -159,20 +157,6 @@
 #define NBYTES 8
 #define LOG_NBYTES 3
 
-/*
- * As we are sharing code base with the mips32 tree (which use the o32 ABI
- * register definitions). We need to redefine the register definitions from
- * the n64 ABI register naming to the o32 ABI register naming.
- */
-#undef t0
-#undef t1
-#undef t2
-#undef t3
-#define t0	$8
-#define t1	$9
-#define t2	$10
-#define t3	$11
-
 #else
 
 #define LOADK lw /* No exception */
-- 
2.10.2

^ permalink raw reply related	[flat|nested] 20+ messages in thread
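
The kernel-doc blocks retained above define the contract: each routine
returns the number of bytes it could *not* copy, and copy_from_user()
additionally zero-fills whatever it failed to copy into the kernel buffer.
A minimal sketch of a caller, with the function and buffer names invented
purely for illustration:

	#include <linux/uaccess.h>
	#include <linux/errno.h>

	/* Hypothetical helper: fetch a fixed-size record from user space. */
	static int example_fetch_record(void *kbuf, const void __user *ubuf,
					size_t len)
	{
		/*
		 * copy_from_user() performs the access_ok() check itself and
		 * returns the number of bytes left uncopied; on a partial
		 * copy the tail of kbuf has already been zeroed.
		 */
		if (copy_from_user(kbuf, ubuf, len))
			return -EFAULT;

		return 0;
	}

With this patch applied the wrapper is an ordinary static inline that ends
in a plain call to __copy_user() or __copy_from_user_eva(), rather than
expanding a register-pinned inline asm block at every call site.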

* Re: [PATCH 6/7] MIPS: memcpy: Use a3/$7 for source end address
@ 2016-11-14 14:47     ` Maciej W. Rozycki
  0 siblings, 0 replies; 20+ messages in thread
From: Maciej W. Rozycki @ 2016-11-14 14:47 UTC (permalink / raw)
  To: Paul Burton; +Cc: linux-mips, Ralf Baechle

On Mon, 7 Nov 2016, Paul Burton wrote:

> Instead of using the at/$1 register (which does not form part of the
> typical calling convention) to provide the end of the source region to
> __copy_user* functions, use the a3/$7 register. This prepares us for
> being able to call __copy_user* with a standard function call.
> 
> Signed-off-by: Paul Burton <paul.burton@imgtec.com>
> ---
> 
>  arch/mips/cavium-octeon/octeon-memcpy.S |  8 ++++----
>  arch/mips/include/asm/uaccess.h         | 21 ++++++++++++---------
>  arch/mips/lib/memcpy.S                  |  8 ++++----
>  3 files changed, 20 insertions(+), 17 deletions(-)
> 
[...]
> diff --git a/arch/mips/lib/memcpy.S b/arch/mips/lib/memcpy.S
> index 48684c4..5af9f03 100644
> --- a/arch/mips/lib/memcpy.S
> +++ b/arch/mips/lib/memcpy.S
> @@ -70,13 +70,13 @@
>  
>  /*
>   * The exception handler for loads requires that:
> - *  1- AT contain the address of the byte just past the end of the source
> + *  1- a3 contain the address of the byte just past the end of the source
>   *     of the copy,
> - *  2- src_entry <= src < AT, and
> + *  2- src_entry <= src < a3, and
>   *  3- (dst - src) == (dst_entry - src_entry),
>   * The _entry suffix denotes values when __copy_user was called.
>   *
> - * (1) is set up up by uaccess.h and maintained by not writing AT in copy_user
> + * (1) is set up up by uaccess.h and maintained by not writing a3 in copy_user
>   * (2) is met by incrementing src by the number of bytes copied
>   * (3) is met by not doing loads between a pair of increments of dst and src
>   *
> @@ -549,7 +549,7 @@
>  	 nop
>  	LOADK	t0, THREAD_BUADDR(t0)	# t0 is just past last good address
>  	 nop
> -	SUB	len, AT, t0		# len number of uncopied bytes
> +	SUB	len, a3, t0		# len number of uncopied bytes
>  	bnez	ta2, .Ldone\@	/* Skip the zeroing part if inatomic */
>  	/*
>  	 * Here's where we rely on src and dst being incremented in tandem,

 With the lone explicit use of $at gone from this code I think you can 
remove `.set noat/at=v1' pseudo-ops across this source file as well.

 I think it would actually be good to do both changes in a single patch, 
as that will ensure that whoever comes across them in a future look 
through our repo history knows immediately that one is a direct 
consequence of the other (i.e. that we only have those `.set noat/at=v1' 
pseudo-ops because of the special use of $at in this code).

 Thanks for doing these clean-ups; I have actually found this use of $at 
here particularly irritating.

  Maciej

^ permalink raw reply	[flat|nested] 20+ messages in thread
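
For readers unfamiliar with the pseudo-ops Maciej refers to: `.set noat'
stops the assembler from using $at/$1 as its own temporary when expanding
macro instructions (and silences the warning for touching $at explicitly),
so an explicit consumer such as the old "SUB len, AT, t0" above cannot be
clobbered behind its back; a later `.set at=v1' re-enables assembler
temporaries using $v1 instead. A rough sketch of the pattern, not the
literal lines in arch/mips/lib/memcpy.S:

	# illustrative only -- not the exact memcpy.S sequence
	.set	noat			# assembler must leave $at alone
	SUB	len, AT, t0		# the lone explicit consumer of $at
	.set	at=v1			# macros may again use a temporary, now $v1

Once the explicit use of $at is replaced by a3, these bracketing
directives no longer protect anything and can be dropped along with it.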

* Re: [PATCH 7/7] MIPS: uaccess: Use standard __user_copy* function calls
@ 2017-06-27 22:33     ` James Hogan
  0 siblings, 0 replies; 20+ messages in thread
From: James Hogan @ 2017-06-27 22:33 UTC (permalink / raw)
  To: Paul Burton; +Cc: linux-mips, Ralf Baechle

Hi Paul,

On Mon, Nov 07, 2016 at 11:18:02AM +0000, Paul Burton wrote:
> -#define __invoke_copy_from_user(to, from, n)				\
> -({									\
> -	register long __cu_ret_r __asm__("$2");				\
> -	register void *__cu_to_r __asm__("$4");				\
> -	register const void __user *__cu_from_r __asm__("$5");		\
> -	register long __cu_len_r __asm__("$6");				\
> -									\
> -	__cu_to_r = (to);						\
> -	__cu_from_r = (from);						\
> -	__cu_len_r = (n);						\
> -	__asm__ __volatile__(						\
> -	".set\tnoreorder\n\t"						\
> -	__MODULE_JAL(__copy_user)					\
> -	".set\tnoat\n\t"						\
> -	__UA_ADDU "\t$7, %1, %2\n\t"					\

I believe __UA_ADDU is no longer used, so it could now be removed from the
top of uaccess.h.

Cheers
James

^ permalink raw reply	[flat|nested] 20+ messages in thread
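
For context, __UA_ADDU is one of a small family of string macros near the
top of uaccess.h that exist only to be pasted into inline asm blocks like
the one quoted above; it is roughly of this form (a sketch -- the exact
definitions in uaccess.h may differ in detail):

	/* rough sketch of the existing definitions, not verbatim */
	#ifdef CONFIG_32BIT
	#define __UA_ADDU	"addu"
	#else /* CONFIG_64BIT */
	#define __UA_ADDU	"daddu"
	#endif

If, as James notes, nothing expands the macro once these asm wrappers are
gone, the definition is simply dead and can be removed with them.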

end of thread, other threads:[~2017-06-27 22:34 UTC | newest]

Thread overview: 20+ messages
2016-11-07 11:17 [PATCH 0/7] MIPS: Standard calling convention usercopy & memcpy Paul Burton
2016-11-07 11:17 ` Paul Burton
2016-11-07 11:17 ` [PATCH 1/7] MIPS: lib: Split lib-y to a line per file Paul Burton
2016-11-07 11:17   ` Paul Burton
2016-11-07 11:17 ` [PATCH 2/7] MIPS: lib: Implement memmove in C Paul Burton
2016-11-07 11:17   ` Paul Burton
2016-11-07 11:17 ` [PATCH 3/7] MIPS: memcpy: Split __copy_user & memcpy Paul Burton
2016-11-07 11:17   ` Paul Burton
2016-11-07 11:17 ` [PATCH 4/7] MIPS: memcpy: Return uncopied bytes from __copy_user*() in v0 Paul Burton
2016-11-07 11:17   ` Paul Burton
2016-11-07 11:18 ` [PATCH 5/7] MIPS: memcpy: Use ta* instead of manually defining t4-t7 Paul Burton
2016-11-07 11:18   ` Paul Burton
2016-11-07 11:18 ` [PATCH 6/7] MIPS: memcpy: Use a3/$7 for source end address Paul Burton
2016-11-07 11:18   ` Paul Burton
2016-11-14 14:47   ` Maciej W. Rozycki
2016-11-14 14:47     ` Maciej W. Rozycki
2016-11-07 11:18 ` [PATCH 7/7] MIPS: uaccess: Use standard __user_copy* function calls Paul Burton
2016-11-07 11:18   ` Paul Burton
2017-06-27 22:33   ` James Hogan
2017-06-27 22:33     ` James Hogan
